Trivial Lifehacker Profile Monitoring Script

Posted by JD 12/16/2009 at 09:25

Lifehacker is a site that many of us watch daily for tips. If you join the community, you may find that the LH site can sometimes become … er … slow. There could be many causes, but personally, I think they’ve gone overboard with all the JavaScript.

Someone mentioned he had written a real-time notification script to tell him about replies to his posts, and he asked others what they would like from the script. I started thinking and determined it would be fairly trivial to write a script that did what I wanted – email a list of replies in simple HTML.

Like all scripts, this one is never really done and will be prone to tweaks for the next year. I think the next tweak will be to try the feed.xml that LH provides. Perhaps it will be smaller, faster and easier to parse?

I attempted to attach the script to this article using two tools built into the blogging system. Both failed.

  1. Upload – so RSS feed readers see an attachment link – no joy
  2. Excerpt – I’ve used this in previous versions of the blog successfully. No more. The issue is probably due to my theme.

Anyway, here’s the code


    #!/usr/bin/perl
    ######################################################
    # Display Lifehacker Profile data
    #   Recent Replies (first pg only)
    #   Followers
    #   Friends
    #
    # Known Linux Dependencies:
    #   - perl and LWP and Getopt modules
    #   - sudo cpan -i LWP::Simple
    #
    # Installation
    #   1) Change the Your-LH-Profile-name-Here below to yours
    #   2) chmod +x lh_profile_monitor.cgi
    #   3a) Either run as is $0 > output.html
    #       Or
    #   3b) Setup as a CGI on your web server
    #       Or
    #   3c) Setup as a crontab entry which will automatically email the results
    #
    # $Id: lh_profile_monitor.cgi,v 1.5 2009/12/16 14:32:55 jdfsdp Exp jdfsdp $
    #
    ######################################################
    use strict;
    use warnings;
    use LWP::Simple;
    use Getopt::Long;
    ######################################################
    sub life_hacker_data;
    ######################################################
    my $profile_name = "Your-LH-Profile-name-Here";
    my $doHelp = 0;
    my $download = 1;
    my $date = `date +%r`;
    chomp $date;
    GetOptions("help|?"     => \$doHelp,
               "profile=s"  => \$profile_name,
               "download=i" => \$download);
    # Build the URL after option parsing so that -profile takes effect
    my $profile_page = "http://lifehacker.com/people/$profile_name/";
    # my $profile_page = "http://lifehacker.com/people/$profile_name/feed.xml";

    # NOTE: the HTML tags inside the strings below were stripped when this
    # post was published. The markup shown here is a reconstruction.
    my $html_header = "Content-type: text/html\n\n
    <html>
    <head>
    <META HTTP-EQUIV=\"refresh\" CONTENT=\"600\">
    <title>$profile_name-Recent LifeHacker</title>
    </head>
    <body>
    ";
    my $html_footer = "</body></html>";

    #########################################################
    sub Usage
    {
        print "\nUsage:
        $0 [-download 0/1] -profile LifeHacker_Profile\n";
        exit 1;
    }
    ######################################################
    # main()
    # The output is a filtered html file to stdout
    #
    # Grab the new page
    Usage() if ($doHelp);
    # `/usr/local/bin/curl $profile_page > $profile_tmp_file` if ($download);
    my $content = "";
    if ($download) {
        $content = get($profile_page);
        die "Could not fetch $profile_page\n" unless defined $content;
    }
    my @lh_content = split(/\n/, $content);
    print "$html_header\n";
    print "
    <h2>List of Recent Replied to Messages</h2>
    <p>Pulled from <a href=\"$profile_page\">$profile_page</a> at $date</p>
    <ol>
    ";
    my $ret = life_hacker_data();
    print "$ret\n";
    print "</ol>\n$html_footer\n\n";
    ######################################################
    sub life_hacker_data
    {
        # Display the total count of
        #   Friends
        #   Followers
        #   Msgs with replies
        my $ret = "";
        foreach (@lh_content) {
            if (m#/friends/\Q$profile_name\E|/followers/\Q$profile_name\E|replied#) {
                # Assumption: one list item per matching line. The original
                # substitutions that added the list markup lost their tag
                # patterns when the post was published.
                $ret .= "<li>$_</li>\n";
            }
        }
        # Turn the site-relative links into absolute ones
        $ret =~ s#href="/friends#href="http://lifehacker.com/friends#g;
        $ret =~ s#href="/followers#href="http://lifehacker.com/followers#g;
        $ret =~ s#^[.]Click here to view all[.]$##ig; # Remove excess lines
        $ret =~ s#view all##ig; # Remove excess lines
        $ret =~ s#»##ig; # Remove excess lines
        return $ret;
    }
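
Option 3c in the header comment (cron plus email) isn't shown in the script itself. Here is a rough shell sketch of one way it could be wired up. The helper name build_mail, the subject line, the addresses, the paths and the use of sendmail -t are all my assumptions for illustration, not part of the original script.

```shell
#!/bin/sh
# Sketch only: turn the script's CGI-style output into an HTML email.
# Assumptions (not from the original script): the helper name build_mail,
# the subject line, and delivery through "sendmail -t".

build_mail() {
    # $1 = recipient address; the script's output is read from stdin
    printf 'To: %s\n' "$1"
    printf 'Subject: Lifehacker profile replies\n'
    printf 'MIME-Version: 1.0\n'
    printf 'Content-Type: text/html\n'
    printf '\n'
    # Drop the CGI "Content-type:" line; the mail headers above replace it
    sed '/^Content-type:/d'
}

# Possible crontab entry, run daily at 07:00 (paths are placeholders):
#   0 7 * * * . /path/to/lh_mail.sh; /path/to/lh_profile_monitor.cgi | build_mail you@example.com | /usr/sbin/sendmail -t
```

Plain cron will also mail whatever a job prints to the MAILTO address, but that usually arrives as plain text, which is why the sketch adds its own headers.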

    WPA Passphrase Cracking for Sale $40

    Posted by JD 12/16/2009 at 09:01

    Saw an article today that someone has decided to sell a WPA passphrase cracking service for about $40. It takes up to about 40 minutes, but on average just 20 minutes. Seems he has a Beowulf compute cluster with idle time.

    If this is what some guy can do, imagine what different governments can do.

    Once again, consumer-grade WiFi is deemed non-secure. Go wired if you care at all. There is no secure wifi/radio networking, none.

    Big Server OS Installs Is a Problem

    Posted by JD 12/15/2009 at 08:27

    Many companies don’t really consider the bloating of server operating systems as a real problem to be addressed. This is wrong because as soon as you write any data to disk, you’ve just signed up your company to safeguard that data multiple times (3-90) for the next 3-5 years, if not longer.

    How did I come up with this?

    Assumptions – hopefully realistic for your situation

    • Windows 2008 Server – 20GB installation for the OS only (MS says 32GB of disk is the min)
    • Data is stored on a SAN, so we will ignore it. The size of data isn’t the issue in this article.
    • Compressed and incremental backups are performed with 30 days retained.
    • At least 1 copy is maintained off site for DR

    Breakdown of backup disk use

    • Install image – 20GB of storage
    • OS Backup – 20GB of storage
    • Off site Backup – 20GB of storage
    • 2 extra copies of backup – 40GB of storage

    Total is 100GB of storage media for a single Windows 2008 Server install. Not all that bad, really. Then consider that even small businesses probably have 5 computer servers, that becomes 500GB of storage. Still not so bad. Heck, your DR plan is just to copy the last backup to an external drive and take it home every Friday. Good enough.

    Now imagine you have 50 or 100 or 1000 or 20,000 installations. Now it gets tougher to deal with. At 100GB each, those simple backups become 5TB, 10TB, 100TB and 2PB of storage and you haven’t got anything but the OS backed up – no data.
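
The arithmetic is easy to redo for your own server count. A minimal sketch, assuming the 100GB-per-install figure from the breakdown above and decimal units (1TB = 1000GB); total_storage_gb is just a name I made up:

```shell
#!/bin/sh
# Back-of-the-envelope OS backup storage, using the 100GB-per-install
# figure from the breakdown above. total_storage_gb is a made-up helper name.

GB_PER_INSTALL=100

total_storage_gb() {
    # $1 = number of server OS installs
    echo $(( $1 * GB_PER_INSTALL ))
}

for installs in 5 50 100 1000 20000; do
    gb=$(total_storage_gb "$installs")
    # Decimal units: 1TB = 1000GB
    tb=$(awk -v gb="$gb" 'BEGIN { printf "%.1f", gb / 1000 }')
    echo "$installs installs: ${gb}GB (${tb}TB)"
done
```

Change GB_PER_INSTALL if your retention policy keeps more or fewer copies than the four counted here.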

    Alternatives?

    1. Data deduplication on archive storage frames
    2. Fixed OS images – if they are all the same, you only need 1 backup
    3. Use a smaller OS image

    Data Deduplication

    Data deduplication has been an expensive option that small companies with normal data requirements wouldn’t deploy due to cost, complexity and lack of skills. This is about to change with the newest Sun ZFS that should be out in early 2010. It is already available in OpenSolaris, if you want to get started with trials. I’ve seen demonstrations with 90% OS deduplication. That means for every added server OS install, you only add 10% more to be backed up. Obviously, this will increase as new OS and patch deployments occur over weeks and months, but this solution is compelling and will easily pay for itself with any non-trivial server infrastructure.
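
To put a number on that 90% figure, here is a rough estimate in shell. It assumes the first install is stored in full and every later install only adds its non-deduplicated share. Real ratios will drift as patch levels diverge, so treat it as a sketch. The function name is hypothetical.

```shell
#!/bin/sh
# Rough estimate of OS backup storage with deduplication.
# Assumption: the first install is stored in full and every further install
# only adds its non-deduplicated share. dedup_storage_gb is a made-up name.

dedup_storage_gb() {
    # $1 = number of installs, $2 = image size in GB, $3 = percent deduplicated
    installs=$1
    size_gb=$2
    dedup_pct=$3
    if [ "$installs" -lt 1 ]; then
        echo 0
        return
    fi
    echo $(( size_gb + (installs - 1) * size_gb * (100 - dedup_pct) / 100 ))
}

# 100 installs of a 20GB image: without dedup, then with 90% dedup
echo "no dedup:  $(dedup_storage_gb 100 20 0)GB"
echo "90% dedup: $(dedup_storage_gb 100 20 90)GB"
```

For 100 installs of a 20GB image that works out to 2000GB without dedup and 218GB with it, per copy retained.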

    Fixed OS Images

    This is always a good idea, but with the way that MS-Windows performs installations, files are written all over the place and registry entries are best applied only by installation tools. Configuration methods on Windows tend to be point and click, which can’t be scripted effectively.

    On UNIX-like operating systems, base images can be installed, application installation scripted and overall configuration settings scripted too. There are a number of tools that make this easy, like Puppet. This is FOSS.

    Use a Smaller OS

    A Xen Ubuntu Linux 8.04.x server running a complete enterprise messaging system with over a year’s worth of data is under 8GB including 30 days of incremental backups. Other single-purpose servers need less disk, much less. This blog server is 2.6GB with 30 days of incremental backups. That’s almost 8x smaller than the 20GB MS-Windows server install. Virtualization helps too. JeOS is a smaller Ubuntu OS install meant for virtual servers.

    No Single Answer

    There is no single answer to this problem. I doubt any company can run completely on Linux systems only. Data deduplication is becoming more and more possible for backups, but it isn’t ready for transactional, live systems. Using fixed OS images is a best practice, but many software systems demand specialized installation and settings which make this solution exponentially complex.

    A hybrid solution will likely be the best for the next few years, but as customers, we need to voice our concerns over this issue with every operating system provider.

    Cold Backup for Alfresco

    Posted by JD 12/13/2009 at 20:16

    The script below was created as part of an Alfresco upgrade process and is meant to be run manually. This is a fairly trivial cold backup script for Alfresco 2.9b, which is a dead release tree from our friends at Alfresco. It hasn’t been tested with any other version and only backs up locally, but it could easily back up remotely with sshfs or nfs mounts, or even with rdiff-backup commands swapped in.

    For nightly backup of our production servers, we actually perform rdiff-backups of shutdown virtual machines, which take about 3 minutes each. That little amount of downtime to have a differential backup of the entire VM is worth it to us.

    #!/bin/sh
    # ###############################################################
    # This script should not be run from cron. It will wait for the mysql
    # DB password to be entered.
    # 
    #  Created by JDPFU 10/2009
    # 
    # ###############################################################
    # Alfresco Backup Script - tested with Alfresco v2.9b
    #   Gets the following files
    #    - alf_data/
    #    - Alfresco MySQL DB
    #    - Alf - Extensions
    #    - Alf - Global Settings
    # ###############################################################
    export TOP_DIR=/opt/Alfresco2.9b
    DB_NAME=alfresco_2010_8392
    export EXT_DIR=$TOP_DIR/tomcat/shared/classes/alfresco/extension
    export BACK_DIR=/backup/ALFRESCO
    export BACKX_DIR=$BACK_DIR/extension
    
    # Shutdown Alfresco
    /etc/init.d/alfresco.sh stop
    
    # Backup the DB and important files.
    # dir.root setting will change in the next version
    /usr/bin/mkdir  -p $BACK_DIR
    cd  $BACK_DIR/; 
    /usr/bin/rsync  -vv -u -a --delete --recursive --stats --progress $TOP_DIR/alf_data $BACK_DIR/
    
    echo "
      Reading root MySQL password from file
    "
    /usr/bin/mysqldump -u root \
        -p`cat ~root/bin/$DB_NAME.passwd.root` $DB_NAME | \
        /bin/gzip > $BACK_DIR/${DB_NAME}_`date +%Y%m%d`.gz
    # Remove DB dumps last modified more than 60 days ago
    /usr/bin/find  $BACK_DIR -type f -name "${DB_NAME}_*.gz" -mtime +60 -delete
    
    /usr/bin/cp  $TOP_DIR/*sh $BACK_DIR
    /usr/bin/mkdir  -p $BACKX_DIR
    /usr/bin/rsync  -vv -u -a --delete --recursive --stats --progress  $EXT_DIR/* $BACKX_DIR/
    
    # Start Alfresco
    /etc/init.d/alfresco.sh start
    

    Why a cold backup? Unless you have a really large DB, being down a few minutes isn’t really a big deal. If you can’t afford to be down, you would already be mirroring databases and automatically fail over anyway. Right?

    We use a few extensions for Alfresco, that’s why we bother with the extensions/ directory.
    There are many ways to make this script better. It was meant as a trivial example or starting point to show simple scripting methods while still being useful.
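
One of the easier improvements is the retention step. A small sketch of pruning old dumps as a function, so it can be tried on a scratch directory first. The function name and its arguments are mine, and -delete assumes GNU or BSD find.

```shell
#!/bin/sh
# Sketch: delete compressed DB dumps older than a given number of days.
# prune_old_dumps and its argument order are illustrative, not part of
# the backup script above. -delete needs GNU or BSD find.

prune_old_dumps() {
    # $1 = backup directory, $2 = DB name prefix, $3 = days to keep
    find "$1" -type f -name "${2}_*.gz" -mtime +"$3" -print -delete
}

# Example (prints each file it removes):
#   prune_old_dumps /backup/ALFRESCO alfresco_2010_8392 60
```

Using -mtime with a leading + means "older than", whereas a bare number matches only files that are exactly that many days old.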

    Customer Loyalty Communications

    Posted by JD 12/13/2009 at 09:13

    The last few years, companies have added customer loyalty programs to their marketing. Most of these fail for a number of reasons.

    Which companies have the highest customer loyalty and why? Which have failed, at least for me?

    Successes

    Coke – People like to drink Coke everywhere in the world. When Coke changed their flavoring based on taste testing, the world cried out to put back the old flavor almost like an addict would. Flavored sugar water doesn’t mean much to me.

    Apple – Apple fans go crazy about their products and will tell EVERYONE how great each is. Apple products cost between 20% and 100% more than similar products that aren’t as easy to use. People are willing to pay more for that. I’m not a fan of Apple – mostly because they charge more and their fans are obnoxious.
    I did get a phone call from Apple last year because someone was trying to use a credit card with my name on it to buy an iPhone and iTunes stuff. This call was from Apple, not my credit card company. I became hostile toward the nice man on the phone immediately, before I gave him a chance to explain the issue. He never wavered and was always polite and professional – without any accent in his speech. While this hasn’t changed my negative opinion of Apple product pricing, it hasn’t added any more negative thoughts either.
    Apple, when will your customers be able to multi-task on an iPhone? When will they be allowed to change the battery? When will they be allowed to select from any application that can run on the device?

    Google – Google does most things they do VERY WELL and don’t ask me directly for anything in return. They make their money by correlating all my web data together, building a profile about me and selling ads around that data. Most of us don’t really know what this means and we don’t care. I avoid using Google without first filtering out personal connection, usage and computer data. Further, I avoid sending email to gmail addresses.

    Airlines – Delta and United FF programs. They aren’t really that useful to me anymore. I’ve used Continental and AA FF programs in the past but never used an award ticket from them. Which FF program works best for you depends on where you live and where you travel. I have turned in some Delta points for a $1400 international ticket, which made it completely worthwhile. My United miles expired before I could use them, so I transferred them to a charity.

    McDonald’s – Kids, advertising, convenience. I don’t get it at all. I haven’t eaten at McD’s in perhaps 2.5 years. It was an emergency the last time I did because I needed something to eat, quick, on the way to a once in a lifetime event. The closest restaurant to my home is a McDonald’s. I could walk there. I have never been to that store.

    Twitter – You love it or you don’t care. I don’t care. Why didn’t AIM or gTalk or MSN set up interfaces with SMS texts? Maybe they did, but I just didn’t know about it?

    What’s missing?

    Customer loyalty needs to feel like a friend telling another friend about something great that they know is likely to be relevant to them, not just something good. My friends know the types of things I’m interested in based on prior communications. They contact me when they see something really interesting to me. When was the last time you got any great insight from a customer loyalty communication? Seriously? Most of these communications are a list of 50 things on sale and none are of interest. None. It is the same old marketing as newspaper inserts. It needs to be targeted and on point for my needs.

    Acura – I’ve owned two Acura vehicles and I’m mostly pleased. My interactions with most Acura dealers have been pleasant enough too. When I purchased my last Acura, my last name was misspelled on all the documents and on the title. Boo. A single attempt to correct that through Acura failed, so I gave up. The first time my annual registration came due, I tried to correct it, but that failed too. My name gets misspelled a lot, so this isn’t a big deal. At least the Acura misspelling result isn’t offensive. Every quarter, an Acura magazine arrives with stories, lifestyle articles, travel hints and offers – Free Augusta National Golf tickets and the like. I don’t golf, but the offer is appreciated. Some of the other deals are interesting and generally leave a favorable impression of Acura.

    My next vehicle will probably be another Acura in a few years. The last purchase occurred without visiting the dealership. The papers were signed on my kitchen table on the day the vehicle was delivered to my home. That impression is hard to beat even with the misspelled name.

    TiVo – These guys are similar to Apple, except I like them. Their product works better than any alternative, but it costs more than any alternative. I dislike that a monthly plan is even offered and I wish the lifetime plans weren’t so expensive. I’ve been a TiVo owner since 2003. That same device is still working. I swapped the disk drive a few years ago to get more storage. It is about time to swap the drive again to further increase the lifetime. I don’t use any of the paid add-on options, but I do have it download free internet content like the Tekzilla and hak5 weekly shows. Convenience rules.

    Failures

    Hilton Hotels – I signed up for a Hilton awards program a few years ago due to conference attendance. I tied my room reservation to it, then attended. After my visit, I checked whether it was recorded to my HH program. It wasn’t, so I sent the information about my stay to the feedback link on the program site. A few days later, I started receiving emails from the hotel manager asking how my stay was. I provided good feedback and explained that the program hadn’t connected my stay with the frequent stay program ID. I attempted to connect it once more. No joy. It has been a year and it still isn’t connected. I get monthly emails from Hilton which remind me they don’t follow through. Attempts to leave their email marketing list have failed too, which frustrates me even more, every month. I’m at the point where I avoid staying at Hilton Hotels or any of their 10 other names. FAIL.

    Microsoft – The two most common communications I get from Microsoft are “patch your PC” and “your antivirus is out of date.” Is that really the message they want to send weekly? Microsoft has lost my trust. Every time they create something new, I immediately wonder how it will prevent me from using anyone else’s stuff or how much it will cost me. The exFAT file system is their latest push for memory cards to support large media files. I don’t understand why all the memory manufacturers don’t just use the FOSS ext2 file system instead. Oh – because Microsoft doesn’t (and won’t) support ext2. OTOH, WinXP and earlier OSes don’t support exFAT either.

    Linux / Ubuntu – This isn’t really fair. Linux isn’t a company and has no advertising budget. Ubuntu doesn’t seem to have much advertising budget either, at least for the masses. What can Linux do better? Well, they can show 30 second clips of people using the software to solve a real problem with FOSS. It would be best if the problem highlighted something that Windows or Macs don’t do well at all. #1 – every clip should show the price, followed by the system maintenance and upgrade processes (click the red triangle in the corner). Currently, failing. Yes, I know that Linux is just the kernel and that no users actually use it directly. We all use some higher level tool created by GNU or Ubuntu or Red Hat or SuSE or Mandrake or some developer in his basement.

    Amazon – I shop on Amazon for price and convenience. I maintain a wish list of things to make gifts easier and as reminders for things to purchase later. I don’t think I’ve ever purchased anything recommended for me from Amazon. They know the types of things I buy with over 200 purchases. If I bought a router, I probably don’t need another. 3 months later, I don’t need to see CAT5e cables or a switch either. I’ve had a few issues with Amazon product shipments over the years, but Amazon has always made me whole again, always. Their customer service does a good job. Their product suggestions, not so much.

    Travelocity – They know where I’ve traveled, how long I’ve stayed and when I tend to go. They also know my searches for destinations. Yet, they don’t send deals for those destinations or worse, keep sending them when I’m already back home. I want international travel deals. I doubt I’ll ever take a vacation to Las Vegas or fly to Asheville, NC. STOP OFFERING THOSE DEALS, Travelocity. Offering a flight from Atlanta to Savannah is a waste of your time too. I’d end up spending more time dealing with airport garbage than a simple drive there. I’m not going to fly commercially to Savannah, ever. I’ve routinely searched for flights to Bali, Singapore, New Zealand, Australia, London, Europe, Chile, and Peru. Get the hint and target those deals, please?

    My Senators – About once a year, I get an email from my senators claiming to have stopped some bill that is bad for the country. I wrote to them a few years ago about some of my concerns which they responded to by a carefully copy/pasted paragraph about each of my concerns. Most recently, it was about the health care bill, which I’ve never written to them about. Nice. Fail.

    Grocery Stores – They give small discounts for the cost of you letting them see what you purchase. I’ve never had a grocery store loyalty card. My privacy is worth more than $100/yr. When my local Kroger started pushing them, I spoke with the store manager about my displeasure. He wasn’t helpful, so I stopped shopping at Kroger. Publix is a local competitor where I started shopping. They also had a discount card, but if I didn’t have one, the cashier always scanned hers so I got the discount. Kroger – FAIL, Publix – Success. I suppose manufacturers would be snail-mailing coupons to me if I had a card? That local Kroger went out of business. I doubt I had anything to do with that, but the store manager definitely did. Good Bye.

    Customer Loyalty Programs

    Which programs work for you and which have failed? Why?

    What's Wrong with New Linux Users?

    Posted by JD 12/09/2009 at 08:54

    Simple. They aren’t willing to spend the same amount of time they’ve spent learning some other operating system to learn Linux.

    I’m happy to help them learn Linux in general (not a specific distribution), provided they display a sincere interest and a burning desire to learn.

    That doesn’t mean I’ll spoon feed answers for every question they have, that is impossible, but I will help them learn how to find answers to their questions and teach them things that UNIX-like operating systems can do out of the box that most Windows-based systems cannot.

    Before heading down the UNIX OS path, be aware that months of effort will probably be needed. Do you have the stomach for that commitment?

    Any takers?

    December OpenSolaris Meetup

    Posted by JD 12/09/2009 at 07:46

    I attended the Atlanta area OpenSolaris Meetup last night even though we were getting some major rain in the area which made the 30 minute drive challenging. Why would I bother? Swag? Scott D presenting? Being around other nerds that like Solaris? No, although those are all valid reasons too.

    Even with the nasty weather, the room was packed and we had to bring in some more chairs so everyone could sit. About 20 people attended.

    New stuff in ZFS

    Yep, the entire meeting was about fairly new features added to ZFS on OpenSolaris. Things like data deduplication and how well it works in normal and extreme situations. The main things I took away from the talk were:

    1. ZFS is stable
    2. Data Deduplication, dedup for short, should only be used on backup areas, not on live production data, until you become comfortable with it and the performance in your environment
    3. Dedup happens at the block level of a zpool, anything above that level still works as designed
    4. Only use builds after 129 of OpenSolaris if you plan to use dedup. Earlier versions had data loss issues with the code.
    5. Solaris doesn’t have the dedup code yet. It is not currently scheduled for any specific release either.
    6. DeDup is only available in real-time now, there is no dedup thread that can be scheduled to run later. This could have unknown performance impacts (good or bad).
    7. ZFS supports both read and write cache devices. This means we can specify that SSDs, cheap or expensive, be used for either cache and deploy cheaper, larger SATA disks for the actual disk storage. Some cost/performance examples were shown with 10,000rpm SAS drives compared to SSD cache with 4200rpm SATA drives. The price was about the same, 4x more storage was available and performance was 2x better for read and about the same for write. Nice.
    8. ZFS has added a way to check for disk size changes – suppose your storage is external to the server and really just a logical allocation. On the storage server, you can expand the LUN that the server sees. ZFS can be configured to manually or automatically refresh disk device sizes.
    9. Device removal – currently there is no direct method to remove a disk from a ZFS pool. There are workarounds, however. They are planning to release a method to remove a disk from a zpool in OpenSolaris ZFS this year.

    To really get the demo, you need to accept the other great things about ZFS as a basis, then add the new capabilities on top. One of the demonstrations was how IT shops can charge back for data storage to multiple users since they are using the data, even when 20 other departments are also using the same data blocks. Basically, dedup gives you more disk storage without buying more disk.

    ACLs are managed at the file system level, not the disk block level, so the dedup’ed data still can only be accessed appropriately.
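
Item 3 in the list says dedup works at the block level. As a toy illustration of that idea, and explicitly not how ZFS implements it, the sketch below chops a file into fixed-size blocks and counts how many distinct blocks there are by checksum. The function name is made up, and sha256sum assumes GNU coreutils.

```shell
#!/bin/sh
# Toy illustration of block-level dedup: count total vs. unique blocks.
# This is NOT how ZFS does it internally; it only shows the idea that
# identical blocks need to be stored once. count_blocks is a made-up name
# and sha256sum assumes GNU coreutils.

count_blocks() {
    # $1 = file, $2 = block size in bytes
    # prints "<total blocks> <unique blocks>"
    tmp=$(mktemp -d)
    split -b "$2" "$1" "$tmp/blk."
    total=$(( $(ls "$tmp" | wc -l) ))
    unique=$(( $(sha256sum "$tmp"/blk.* | awk '{ print $1 }' | sort -u | wc -l) ))
    rm -rf "$tmp"
    echo "$total $unique"
}

# Example: count_blocks some-disk-image.img 131072
```

The gap between the two numbers is the space a block-level dedup could in principle save on that one file. Across many near-identical OS images the gap gets much larger.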

    Why OpenSolaris?

    OpenSolaris is an open source version of Sun Microsystems’ Solaris operating system that runs on lots of hardware you may already own. It also runs inside most virtual machines as a client or guest. Since it looks and feels like Solaris, you can become familiar with it on your PC at home for just the cost of disk storage – about 20GB. Sun also uses OpenSolaris to trial new features prior to placing them into the real Solaris releases. I run OpenSolaris in a virtual machine under Windows 7 using the free version of Sun’s VirtualBox hypervisor. I know others who run it directly on hardware, under Xen and under VMware hypervisors too. Just give it enough virtual disk storage and go. I think 10GB is enough to load it, but a little more, say 20GB, will let you play with it and applications more.

    If you are in the market for NetApp storage, you really need to take a look at Sun’s storage servers running ZFS. The entry price is significantly less and you get all the flexibility of Solaris without giving up CIFS, iSCSI, NFS, and, in the future, fibre channel storage. Good sales job Sun.

    Swag

    No meetup is a success without some swag. Water bottles, t-shirts, hats, and books, were all available. We were encouraged to take some after the iPod Nano raffle was won (not by me). Pizza and sodas were also provided by the sponsors.

    Simple Security for Email Clicks

    Posted by JD 12/06/2009 at 08:18

    Scenario

    We all get emails asking us to do something. Sometimes the email includes a link to a specific web site to help you complete the task. Unless you only view plain text email messages, no RTF, no HTML, you can’t trust that the URL you click is really where you are being taken. If there are any misspellings or simple grammar issues, ignore the email.

    Simple Solution

    Don’t click links provided in emails. Rather, manually go to the website. Use your password manager, like KeePass, to open the correct page and enter your login credentials. Or you could type the known URL, just don’t click the URL in the email. Simple enough?

    You know this stuff. You know not to click. But it takes just 1 small mistake to be p’wned and you may not realize it for a few days, if ever.

    Examples

    Hak5.org forums were hacked in 2009. When the maintainers realized it, they sent an email warning everyone and suggesting that you never use the same password on multiple websites. Good advice. If you use a password manager, like KeePass, you never need to worry about reusing the same password. Just use the Generate button to create a strong password for each website you visit.
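
If you aren't near your password manager, something similar can be done from a shell. A minimal sketch, assuming /dev/urandom and the base64 utility are available. The function name and the default length are my own choices, and KeePass's built-in generator remains the better option.

```shell
#!/bin/sh
# Sketch: one random password per site from the command line.
# Assumptions: /dev/urandom and the base64 utility exist. generate_password
# and the default length of 24 are illustrative choices.

generate_password() {
    # $1 = length (default 24)
    len=${1:-24}
    # base64 of random bytes, with the characters some sites reject removed
    head -c 256 /dev/urandom | base64 | tr -d '+/=\n' | cut -c1-"$len"
}

# Example: generate_password 32
```

The result is letters and digits only, so add length rather than relying on special characters.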

    Web site administrators are being targeted now to gain access to the servers by people who want more bot controllers in their bot network. I’ve used cPanel before at a hosting provider and it would be easy for that page to be cloned, yet still appear to work. The cloner can grab the login credentials and pass them on to the real cPanel page. When you run a web site, the management of that website often bounces between lots of unrelated web servers that you aren’t used to seeing, adding to the confusion. Even Yahoo hosting was targeted, so I can’t believe that some very popular, yet cheap, providers aren’t also.

    I get spam emails that are usually sent from small business servers all the time. These servers could be misconfigured to allow email relaying or compromised by some other method. Regardless, sending an email to the administrator never seems to help.

    During holidays, we all get Holiday eCards asking us to click on a URL. I’ve gotten 3 called ICQ Greeting Cards this week. The link in all three of those emails was to some nasty software, for sure. Don’t click. They even had safe links included in the email to the real ICQ site to earn my trust.

    Be careful out there, especially if you are an administrator for others or public facing internet services. I expect to be hacked at some point. I have been hacked – over 10 years ago. Hopefully, being hacked again won’t happen for some time.

    Shame on Pidgin-Plain Text Passwords

    Posted by JD 12/01/2009 at 18:02

    Today I was going through my list of files to backup on my Linux laptop and removing temporary and cache files when I came across a directory that I didn’t recognize. The files were listed as changed within the last 3 days.

    changed .purple
    changed .purple/accels
    changed .purple/accounts.xml
    changed .purple/blist.xml
    changed .purple/prefs.xml
    changed .purple/status.xml
    

    It turns out they are for pidgin, the extremely popular Instant Messaging software. Ok, I use that – fine. But my interest got the best of me and I looked at the accounts.xml file. Obviously it is an XML file, but I was shocked to discover the following (modified for my protection):



    <account>
      <protocol>prpl-jabber</protocol>
      <name>admin-userid@example.com/Admin</name>
      <password>some-really-complex-password-with-lots-of-special-characters-in-clear-text</password>
      <alias>admin</alias>
    </account>

    The password isn’t encrypted. Not at all!

    This is unacceptable.

    There is an encryption plugin for pidgin but it is for IMs, not the stupid passwords. This is just crazy. Heck, there are ROT13 methods and trivial 2-way password encrypt/decrypt methods which could be used if necessary.

    The pidgin wiki has this to say. I have to admit, they do have a point, but I still disagree with it. At least they do set the directory permissions to 700 and file permissions to 600 (user only), but this doesn’t help with my backups placed on another system, does it?
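
Since the permissions are the only protection, it is worth checking them, on the laptop and on whatever copy the backup makes. A small sketch that reads the mode string from ls -l. The function name is made up.

```shell
#!/bin/sh
# Sketch: report whether a file is accessible by its owner only.
# is_private is a made-up helper; it reads the mode string from "ls -ld".

is_private() {
    # Characters 5-10 of the mode string are the group and other bits
    perms=$(ls -ld "$1" 2>/dev/null | cut -c5-10)
    [ "$perms" = "------" ]
}

# Example:
#   is_private "$HOME/.purple/accounts.xml" || echo "accounts.xml is readable by others"
```

On the backup side, most tools can simply skip the file. With rsync, for example, that is `--exclude='.purple/accounts.xml'`.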

    Subtitle Script for I to l Conversions

    Posted by JD 11/30/2009 at 18:45

    A quick script to change a capital I (eye) in the middle of a word into a lowercase l (el). If you like Asian films, you understand why I wrote this script. I had an itch. It needed to be scratched. This is useful for .srt files used in movie subtitles.

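    #!/usr/bin/perl
    # Perl script to change a capital 'I' in the middle of a word into an 'l'
    # input is stdin and output is to stdout; redirection is your friend
    use strict;
    use warnings;
    while (my $line = <>) {
        chomp $line;
        # An I after a lowercase letter, or between a capital and a lowercase
        # letter, is treated as a misread l. A leading I ("I am") and
        # all-caps words are left alone. Repeat until nothing changes so
        # that runs such as "wiII" are fully fixed.
        1 while $line =~ s/(?<=[a-z])I|(?<=[A-Z])I(?=[a-z])/l/;
        print "$line\n";
    }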
    

    It is very common for subtitle files, SRT format, to have a capital I in the middle of words since bitmap patterns are used to create the files. For native speakers of English, this is HIGHLY distracting – to the point that the subtitles must be fixed before a movie can be enjoyed.

    I tried a few other methods, before determining this simple character translation was needed.

    1. ispell – There were too many words that were not in the dictionary and spacing of words often groups them in strange ways.
    2. replacement dictionary – I created a hundred word dictionary replacement sed script. There were always new words that needed to be added for every SRT file.
    3. Manual editing – yep, I spent a few hours manually editing files. This wasn’t very efficient and ruined the movie plot since I’d already read it before viewing it.

    Some combination of methods will probably be necessary. I intend to merge them into a single perl script and perform them in the most efficient order. It will begin with the I—>l translation.
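
For a quick try without Perl, the mid-word rule can also be sketched with sed. This treats an I after a lowercase letter, or between a capital and a lowercase letter, as a misread l, and loops so that runs like "wiII" are fully fixed. The function name is mine, and names such as "McIntyre" will be mangled, so it is a starting point only.

```shell
#!/bin/sh
# Sketch: fix a capital I that was misread for a lowercase l inside words.
# fix_subtitle_line is a made-up name. Known limitation: real mid-word
# capitals such as "McIntyre" are changed as well.

fix_subtitle_line() {
    # Reads subtitle text on stdin, writes the corrected text to stdout
    LC_ALL=C sed -e ':again' \
        -e 's/\([a-z]\)I/\1l/g' \
        -e 's/\([A-Z]\)I\([a-z]\)/\1l\2/g' \
        -e 't again'
}

# Example: fix_subtitle_line < movie.srt > movie.fixed.srt
```

The pronoun "I", words that start with I, and all-caps text such as "THIS" pass through unchanged.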