Pondering File Transfer Speeds
I move files around my network a lot. Most of the time, these transfers are between wired GigE-connected systems and are limited by disk performance, not the network. That is good and fast; multi-gigabyte files transfer in seconds.
However, there are some tools that only work on Windows, and my only Windows machine is a WiFi (802.11g) connected laptop. Yes, I could wire it into the GigE network and see huge transfer speeds limited only by the laptop disk drive, but that is usually not how I do it. I use WiFi for convenience.
So why is one tool 2x faster than the others?
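When I want to see whether the network or the disk is the bottleneck, a quick check is to measure raw TCP throughput separately from the file copy. A minimal sketch using iperf, assuming it is installed on both ends (the host name is made up):

# On the receiving machine
iperf -s

# On the sending laptop, run a 30 second throughput test
iperf -c fileserver.lan -t 30

# Then compare against an actual file copy over the same link
rsync --progress bigfile.iso user@fileserver.lan:/tmp/

If iperf shows the link running near its rated speed but the copy is much slower, the tool or the disk is the limiter, not the WiFi.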
Computer As an Alarm or Timer - XFCE4
What
You’d like to use your Linux computer as an alarm clock, egg timer, or just to remind you of things beyond what a calendar program, cron, or at supports.
How
The best solution depends on your OS and window manager. I am currently using xfce4 as my window manager, so I looked for a solution that docks with the Xfce panel. The program is the xfce4 timer plugin (xfce4-timer-plugin). I suspect GNOME, KDE, and LXDE have their own alarm/timer programs.
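If you just want a one-off reminder without a panel plugin, the at command plus a desktop notification works almost anywhere. A minimal sketch, assuming libnotify's notify-send is installed and the atd service is running:

# Pop up a reminder in 15 minutes
echo 'DISPLAY=:0 notify-send "Tea is ready"' | at now + 15 minutes

# Or install the Xfce panel timer plugin on Debian/Ubuntu
sudo apt-get install xfce4-timer-plugin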
Solved: Clock Time Loss Under Windows 7 and Vista
How to solve this
There are many ways to solve this issue; this is just the one I used, based on my experience and expertise. I didn’t jump to this complex solution initially; it was only after all the other solutions I attempted failed, badly. My Windows Vista and Windows 7 computers were losing about 2 minutes a day. After the first attempt to correct it with a daily time sync, the clock was still losing about a minute, which was impacting some scheduled events. Being 1 minute off matters when someone else sets the start and end schedule.
Stolen Laptop, What Now?
I saw a headline about stolen laptops here and thought I’d mention my methods before reading the other article.
Before Stolen Laptop
The most important stuff happens before your laptop is stolen, but you need to do it. It isn’t automatic.
New VISA Credit Cash Back Scam Email
Well, I’ve arrived. Seems besides winning the European, Spanish, Hong Kong, Singapore, and world wide lotteries, I’ve also won 10% cash back from VISA every month. I just need to enroll by clicking on a web link. Sweet!
WOW! That’s a deal!
Except it is a scam.
Why Some Hardware in Your Computer Doesn't Work With Linux
I read a comment on a popular blog site today where people were complaining that Ubuntu didn’t work with their computer. They’d tried a few different versions and it still didn’t work. Of course, they blamed Ubuntu, not the hardware provider.
Some complained about sound or video or wireless cards not working. I’ve had issues with RAID cards not working beyond a basic level; JBOD only, no RAID support. In the old days, the complaints were with modems (win-modems) not working.
In their minds, Ubuntu wants them to switch from their current operating system and needs to do whatever it takes to support their hardware. Clearly, they are confused. Ubuntu has very little say in which hardware is supported. Very little.
Virtualization Survey, an Overview
Sadly, there is no easy answer to which virtualization platform is best for Linux. Many different factors go into the answer. While I cannot answer the question for you, since your needs and mine are different, I can provide a little background on what I chose and why. We won’t discuss why you should be running virtualization or which specific OSes to run; you already know why.
Key things that go into my answer
- I’m not new to UNIX. I’ve been using UNIX since 1992.
- I don’t need a GUI. Actually, I don’t want a GUI and the overhead that it demands.
- I would prefer to pay for support, when I need it, but not be forced to pay to do things we all need to accomplish – backups for example.
- My client OSes won’t be Windows. They will probably be the same OS as the hypervisor hosting them. There are some efficiencies in doing this like reduced virtualization overhead.
- I try to avoid Microsoft solutions. They often come with additional requirements that, in turn, come with more requirements. Soon, you’re running MS-ActiveDirectory, MS-Sharepoint, MS-SQL, and lots of MS-Windows Servers. With that come the MS-CALs. No thanks.
- We’re running servers, not desktops. Virtualization for desktops implies some other needs (sound, graphics acceleration, USB).
- Finally, we’ll be using Intel Core 2 Duo or better CPUs with VT-x support enabled and 8GB+ of RAM. AMD makes fine CPUs too, but during our recent upgrade cycle, Intel had the better price/performance ratio. (A quick VT check is shown below.)
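As an aside, a quick way to confirm the VT-x (or AMD-V) flag is actually visible to the OS is to check /proc/cpuinfo; zero matches usually means the virtualization extensions are disabled in the BIOS. A minimal sketch:

# Count CPU virtualization flags (vmx = Intel VT-x, svm = AMD-V)
egrep -c '(vmx|svm)' /proc/cpuinfo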
Major Virtualization Choices
- VMware ESXi 4 (don’t bother with 3.x at this point)
- Sun VirtualBox
- KVM as provided by Red Hat or Ubuntu
- Xen as provided by Ubuntu
I currently run all of these except KVM, so I think I can say which I prefer and which is proven.
ESXi 4.x
I run this on a test server just to gain knowledge. I’ve considered becoming VMware certified and may still do it, which is odd, since I don’t believe most mainstream certifications mean much; the exceptions are CISSP, VMware, Oracle DBA, and Cisco. I dislike that VMware has disabled things that used to work in prior versions, such as backups at the hypervisor level, to encourage full ESX deployments over the free ESXi. I’ve been using some version of VMware for about 5 years.
A negative: VMware can be picky about which hardware it will support, so always check the approved hardware list. Almost no desktop motherboard has a supported onboard network card, and VMware may not like the disk controller either, so plan on spending another $30-$200 on networking. A quick way to see what you have is shown below.
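Before shopping, it helps to know exactly which network and storage chips are on the board so you can compare them against the hardware list. A minimal sketch (the devices reported will obviously differ on your hardware):

# List the network and storage controllers the kernel sees
lspci | egrep -i 'ethernet|network|raid|sata|scsi'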
ESXi is rock solid. No crashes, ever. There are many very large customers running thousands of VMware ESX server hosts.
Sun VirtualBox
I run this on my laptop because it is the easiest hypervisor to use. Since it targets desktops, it also includes USB pass-through capability, which is a good thing. The downside: it is also the least stable hypervisor I use. That system locks up about once a month for no apparent reason, which is unacceptable for a server under any conditions. The host OS is Windows 7 x64, so that could be the stability issue. I do not play around on this Windows 7 machine; the host OS is used almost exclusively as a platform for running VirtualBox and very little else.
Until VirtualBox gains stability, it isn’t suitable for use on servers, IMHO.
Xen (Ubuntu patches)
I run this on 2 servers, each hosting about 6 Linux client systems. During system updates, another 6 systems can be spawned as part of the backout plan or for testing new versions of software. I built the systems over the last few years using carefully selected name-brand parts. I don’t use HVM mode; the paravirtualized VMs run at roughly 97% of native hardware performance because they run the same kernel as the hypervisor.
There are downsides to Xen.
- Whenever the Xen kernel gets updated, it is a big deal: the hypervisor has to be rebooted. In fact, I’ve had to reboot the hypervisor 3 times after a single kernel update before all of the client VMs came back up cleanly. Now I plan for that.
- Kernel modules have to be manually copied into each VM. That isn’t a big deal, but it does have to be done (see the sketch after this list).
- I don’t use a GUI; that’s my preference. If you aren’t experienced with UNIX, you’ll want to find a GUI to help create, configure, and manage the Xen infrastructure. I get by with a few scripts – vm_create, kernel_update, and lots of chained backup scripts.
- You’ll need to roll your own backup method. There are many, many options, and if you’re having trouble deciding which hypervisor to use, you don’t have a chance of picking the best backup method. I’ve discussed backup options extensively on this blog.
- No USB pass-through, that I’m aware of. Do you know differently?
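For the module-copy chore mentioned above, the idea is simply to push the dom0 kernel's module tree into each PV guest after a kernel update. A minimal sketch, assuming the guests are reachable over ssh and named vm1, vm2, and so on (hypothetical names):

#!/bin/sh
# Copy the running Xen kernel's modules into each PV guest
KVER=`uname -r`
for vm in vm1 vm2 vm3; do
    rsync -a /lib/modules/$KVER/ root@$vm:/lib/modules/$KVER/
done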
I’ve only had 1 crash after a kernel update with Xen and that was over 8 months ago. I can’t rule out cockpit error.
Xen is what Amazon EC2 uses. They have millions of VMs. Now, that’s what I call scalability. This knowledge weighed heavily on my decision.
KVM
I don’t know much about KVM. I do know that both Red Hat and Ubuntu are migrating to KVM as the default virtualization hypervisor on their servers, since the KVM code was added to the Linux kernel. Canonical’s 10.04 LTS release will also include an API that is 100% compatible with Amazon’s EC2 API, binary-compatible VM images, and VM cluster management. If I were deploying new servers today, I’d at least try the 9.10 Server beta and these capabilities. Since we run production servers on Xen, I don’t see us migrating until KVM and the specific version of Ubuntu required are supported by those apps.
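If you want to kick the tires on KVM under Ubuntu, the basic setup is a couple of packages plus libvirt for management. A minimal sketch, assuming a 9.10/10.04-era server (package names may differ on other releases):

# Install the hypervisor and management tools
sudo apt-get install qemu-kvm libvirt-bin

# Confirm the kvm kernel modules loaded
lsmod | grep kvm

# List defined guests through libvirt
virsh -c qemu:///system list --all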
Did I miss any important concerns?
It is unlikely that your key things match mine. Let me know in the comments.
Big Server OS Installs Are a Problem
Many companies don’t really consider the bloat of server operating systems a real problem to be addressed. That is a mistake, because as soon as you write any data to disk, you’ve signed your company up to safeguard that data multiple times over (3 to 90 copies) for the next 3-5 years, if not longer.
How did I come up with this?
Assumptions – hopefully realistic for your situation
- Windows 2008 Server – 20GB installation for the OS only (MS says 32GB of disk is the min)
- Data is stored on a SAN, so we will ignore it. The size of data isn’t the issue in this article.
- Compressed and incremental backups are performed with 30 days retained.
- At least 1 copy is maintained off site for DR
Breakdown of backup disk use
- Install image – 20GB of storage
- OS Backup – 20GB of storage
- Off site Backup – 20GB of storage
- 2 extra copies of backup – 40GB of storage
The total is 100GB of storage media for a single Windows 2008 Server install. Not all that bad, really. Then consider that even a small business probably has 5 servers, and that becomes 500GB of storage. Still not so bad. Heck, your DR plan is just to copy the last backup to an external drive and take it home every Friday. Good enough.
Now imagine you have 50 or 100 or 1,000 or 20,000 installations. Now it gets tougher to deal with. At 100GB each, those simple backups become 5TB, 10TB, 100TB, and 2PB of storage, and you haven’t backed up anything but the OS – no data.
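The math is simple enough to script if you want to plug in your own OS size and install counts. A trivial sketch; the 100GB-per-install figure comes from the breakdown above:

#!/bin/sh
# Rough backup storage estimate: GB per install x number of installs
PER_INSTALL_GB=100
for installs in 5 50 100 1000 20000; do
    echo "$installs installs -> `expr $installs \* $PER_INSTALL_GB` GB"
done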
Alternatives?
- Data deduplication on archive storage frames
- Fixed OS images – if they are all the same, you only need 1 backup
- Use a smaller OS image
Data Deduplication
Data deduplication has been an expensive option that small companies with normal data requirements wouldn’t deploy due to cost, complexity, and a lack of skills. This is about to change with the newest Sun ZFS, which should be out in early 2010; it is already available in OpenSolaris if you want to start trials now. I’ve seen demonstrations with 90% OS deduplication, which means every added server OS install only adds about 10% more data to be backed up. Obviously, the ratio will slip whenever a new OS release or patch set rolls out over weeks and months, but this solution is compelling and will easily pay for itself in any non-trivial server infrastructure.
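On OpenSolaris builds that have the feature, turning dedup on for a backup dataset is a one-liner, and the pool reports the achieved ratio. A minimal sketch (the pool and dataset names are made up):

# Enable dedup on a dataset used only for backups
zfs set dedup=on backuppool/os-backups

# Later, see how much duplicate data was eliminated
zpool get dedupratio backuppool
zpool list backuppool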
Fixed OS Images
This is always a good idea, but with the way MS-Windows performs installations, files are written all over the place and registry entries are best applied only by installation tools. Configuration on Windows tends to be point-and-click, which can’t be scripted effectively.
On UNIX-like operating systems, a base image can be installed, application installation can be scripted, and overall configuration settings can be scripted too. There are a number of tools that make this easy, like Puppet, which is FOSS.
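Even without a full configuration-management tool, a plain post-install shell script on top of a fixed base image gets you most of the way. A minimal sketch of the idea; the package list and config edit are placeholders, not a recommendation:

#!/bin/sh
# Post-install script run once on top of a fixed base image
set -e

# Install the applications this server role needs
apt-get update
apt-get install -y openssh-server ntp postfix

# Apply a scripted configuration change instead of clicking through a GUI
sed -i 's/^PermitRootLogin .*/PermitRootLogin no/' /etc/ssh/sshd_config
/etc/init.d/ssh restart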
Use a Smaller OS
A Xen Ubuntu Linux 8.04.x VM running a complete enterprise messaging system with over a year’s worth of data is under 8GB, including 30 days of incremental backups. Other single-purpose server disk requirements are smaller, much smaller. This blog server is 2.6GB with 30 days of incremental backups. That’s almost 10x smaller than the MS-Windows Server figure above. Virtualization helps too: JeOS is a smaller Ubuntu install meant for virtual servers.
No Single Answer
There is no single answer to this problem. I doubt any company can run completely on Linux systems alone. Data deduplication is becoming more practical for backups, but it isn’t ready for transactional, live systems. Using fixed OS images is a best practice, but many software systems demand specialized installation and settings, which makes this solution increasingly complex.
A hybrid solution will likely be the best for the next few years, but as customers, we need to voice our concerns over this issue with every operating system provider.
Cold Backup for Alfresco
The script below was created as part of an Alfresco upgrade process and is meant to be run manually. It is a fairly trivial cold-backup script for Alfresco 2.9b, which is a dead release tree from our friends at Alfresco. It hasn’t been tested with any other version and only backs up locally, but it could easily back up to a remote host with sshfs or NFS mounts, or with rdiff-backup commands swapped in.
For nightly backups of our production servers, we actually perform rdiff-backups of shut-down virtual machines, which take about 3 minutes each. That little amount of downtime to get a differential backup of the entire VM is worth it to us.
#!/bin/sh
###############################################################
# This script should not be run from cron. It will wait for the
# MySQL DB password to be entered.
#
# Created by JDPFU 10/2009
###############################################################
# Alfresco Backup Script - tested with Alfresco v2.9b
# Gets the following files:
#  - alf_data/
#  - Alfresco MySQL DB
#  - Alf - Extensions
#  - Alf - Global Settings
###############################################################
export TOP_DIR=/opt/Alfresco2.9b
DB_NAME=alfresco_2010_8392
export EXT_DIR=$TOP_DIR/tomcat/shared/classes/alfresco/extension
export BACK_DIR=/backup/ALFRESCO
export BACKX_DIR=$BACK_DIR/extension

# Shutdown Alfresco
/etc/init.d/alfresco.sh stop

# Backup the DB and important files.
# dir.root setting will change in the next version
/usr/bin/mkdir -p $BACK_DIR
cd $BACK_DIR/
/usr/bin/rsync -vv -u -a --delete --recursive --stats --progress \
    $TOP_DIR/alf_data $BACK_DIR/

echo " Reading root MySQL password from file "
/usr/bin/mysqldump -u root \
    -p`cat ~root/bin/$DB_NAME.passwd.root` $DB_NAME | \
    /bin/gzip > $BACK_DIR/${DB_NAME}_`date +%Y%m%d`.gz

# Remove DB dumps older than 60 days
/usr/bin/find $BACK_DIR -type f -name "${DB_NAME}_*" -mtime +60 -delete

/usr/bin/cp $TOP_DIR/*sh $BACK_DIR
/usr/bin/mkdir -p $BACKX_DIR
/usr/bin/rsync -vv -u -a --delete --recursive --stats --progress \
    $EXT_DIR/* $BACKX_DIR/

# Start Alfresco
/etc/init.d/alfresco.sh start
Why a cold backup? Unless you have a really large DB, being down a few minutes isn’t really a big deal. If you can’t afford to be down, you would already be mirroring databases and automatically fail over anyway. Right?
We use a few extensions for Alfresco; that’s why we bother with the extension/ directory.
There are many ways to make this script better. It was meant as a trivial example or starting point to show simple scripting methods while still being useful.
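For example, if you want incrementals instead of a plain mirror, rdiff-backup can be swapped in for the rsync lines above. A minimal sketch; the paths and retention are examples, not what we run in production:

# Keep a mirror plus reverse increments of alf_data
rdiff-backup /opt/Alfresco2.9b/alf_data /backup/ALFRESCO/alf_data

# Prune increments older than 30 days
rdiff-backup --remove-older-than 30D /backup/ALFRESCO/alf_data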
December OpenSolaris Meetup
I attended the Atlanta-area OpenSolaris Meetup last night, even though major rain in the area made the 30-minute drive challenging. Why would I bother? Swag? Scott D presenting? Being around other nerds who like Solaris? No, although those are all valid reasons too.
Even with the nasty weather, the room was packed and we had to bring in some more chairs so everyone could sit. About 20 people attended.
New stuff in ZFS
Yep, the entire meeting was about fairly new features added to ZFS on OpenSolaris. Things like data deduplication and how well it works in normal and extreme situations. The main things I took away from the talk were:
- ZFS is stable
- Data Deduplication, dedup for short, should only be used on backup areas, not on live production data, until you become comfortable with it and the performance in your environment
- Dedup happens at the block level of a zpool, anything above that level still works as designed
- Only use builds after 129 of OpenSolaris if you plan to use dedup. Earlier versions had data loss issues with the code.
- Solaris doesn’t have the dedup code yet. It is not currently scheduled for any specific release either.
- DeDup is only available in real-time now, there is no dedup thread that can be scheduled to run later. This could have unknown performance impacts (good or bad).
- ZFS supports both read and write cache devices. This means we can put SSDs in front as cache and deploy cheaper, larger SATA disks for the actual storage. Cost/performance examples compared 10,000rpm SAS drives against SSD cache in front of slower SATA drives: the price was about the same, 4x more storage was available, read performance was about 2x better, and write performance was about the same. Nice. (A sketch of adding cache devices follows this list.)
- ZFS has added a way to check for disk size changes. Suppose your storage is external to the server and really just a logical allocation; on the storage server, you can expand the LUN that the server sees, and ZFS can be configured to refresh disk device sizes manually or automatically.
- Device removal: currently there is no direct method to remove a disk from a ZFS pool, although there are workarounds. They are planning to release a supported way to remove a disk from a zpool in OpenSolaris ZFS this year.
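For the cache-device item above, this is roughly what adding SSDs to an existing pool looks like; "cache" is the read cache (L2ARC) and "log" is the separate intent log used for synchronous writes. A minimal sketch with made-up pool and device names:

# Add an SSD as a read cache (L2ARC) to an existing pool
zpool add tank cache c2t0d0

# Add an SSD as a separate intent log (slog) for synchronous writes
zpool add tank log c2t1d0

# Check the pool layout
zpool status tank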
To really get the demo, you need to accept the other great things about ZFS as a basis, then add the new capabilities on top. One of the demonstrations showed how IT shops can charge storage back to multiple users, since each is using the data, even when 20 other departments share the same data blocks. Basically, dedup gives you more disk storage without buying more disk.
ACLs are managed at the file system level, not the disk block level, so the dedup’ed data still can only be accessed appropriately.
Why OpenSolaris?
OpenSolaris is an open-source version of Sun Microsystems’ Solaris operating system that runs on lots of hardware you may already own. It also runs inside most virtual machines as a client or guest. Since it looks and feels like Solaris, you can become familiar with it at home for zero cost beyond the disk storage – about 20GB. Sun also uses OpenSolaris to trial new features before placing them into the real Solaris releases. I run OpenSolaris in a virtual machine under Windows 7 using the free version of Sun’s VirtualBox hypervisor. I know others who run it directly on hardware, under Xen, and under VMware hypervisors too. Just give it enough virtual disk storage and go: 10GB is enough to load it, but a little more, say 20GB, will let you play with it and applications more.
If you are in the market for NetApp storage, you really need to take a look at Sun’s storage servers running ZFS. The entry price is significantly less, and you get all the flexibility of Solaris without giving up CIFS, iSCSI, NFS, and, in the future, fibre channel storage. Good sales job, Sun.
Swag
No meetup is a success without some swag. Water bottles, t-shirts, hats, and books were all available, and we were encouraged to take some after the iPod Nano raffle was won (not by me). Pizza and sodas were also provided by the sponsors.