Big Server OS Installs Are a Problem
Many companies don’t consider the bloat of server operating systems a real problem worth addressing. That is a mistake: as soon as you write any data to disk, you have signed your company up to safeguard that data in multiple copies (anywhere from 3 to 90) for the next 3-5 years, if not longer.
How did I come up with this?
Assumptions – hopefully realistic for your situation
- Windows Server 2008 – 20GB installation for the OS alone (Microsoft says 32GB of disk is the minimum)
- Data is stored on a SAN, so we will ignore it. The size of data isn’t the issue in this article.
- Compressed and incremental backups are performed with 30 days retained.
- At least 1 copy is maintained off site for DR
Breakdown of backup disk use
- Install image – 20GB of storage
- OS Backup – 20GB of storage
- Off site Backup – 20GB of storage
- 2 extra copies of backup – 40GB of storage
The total is 100GB of storage media for a single Windows Server 2008 install. Not all that bad, really. Then consider that even a small business probably has 5 servers, and that becomes 500GB of storage. Still not so bad. Heck, your DR plan is just to copy the last backup to an external drive and take it home every Friday. Good enough.
Now imagine you have 50 or 100 or 1,000 or 20,000 installations. Now it gets tougher to deal with. Those simple backups become 5TB, 10TB, 100TB and 2PB of storage, and you haven’t backed up anything but the OS – no data at all.
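If you want to sanity-check that math against your own environment, here is a quick back-of-the-envelope script. The 20GB per copy and the five copies per server are just the assumptions listed above; swap in your own numbers.

```sh
#!/bin/sh
# Rough storage math for OS-only backups, using the assumptions above:
# 20GB per copy, 5 copies per server (install image, OS backup,
# off-site backup, and 2 extra backup copies).
per_copy=20
copies=5
per_server=$((per_copy * copies))    # 100GB per server

for servers in 5 50 100 1000 20000; do
    echo "$servers servers: $((servers * per_server)) GB of OS backup storage"
done
```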
Alternatives?
- Data deduplication on archive storage frames
- Fixed OS images – if they are all the same, you only need 1 backup
- Use a smaller OS image
Data Deduplication
Data deduplication has been an expensive option that small companies with normal data requirements wouldn’t deploy due to cost, complexity and a lack of skills. This is about to change with the newest Sun ZFS, which should be out in early 2010. It is already available in OpenSolaris if you want to start trialing it. I’ve seen demonstrations with 90% OS deduplication. That means every added server OS install only adds 10% more to be backed up. Obviously, that figure will climb whenever a new OS is introduced or patches are rolled out over weeks and months, but this solution is compelling and will easily pay for itself with any non-trivial server infrastructure.
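If you want to try it on OpenSolaris, turning dedup on is a one-line property change per dataset. This is only a sketch; the pool name tank and the dataset tank/backup are placeholders for whatever you actually use.

```sh
# Enable deduplication on a backup dataset (OpenSolaris ZFS, build 128 or later).
# "tank" and "tank/backup" are placeholder names for your own pool and dataset.
zfs create tank/backup               # dataset that will hold the OS backups
zfs set dedup=on tank/backup         # dedup applies to blocks written from now on
zfs set compression=on tank/backup   # compression stacks with dedup

# After writing a few OS backups, check how much you are saving:
zpool get dedupratio tank
```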
Fixed OS Images
This is always a good idea, but with the way that MS-Windows performs installations, files are written all over the place and registry entries are best applied only by installation tools. Configuration on Windows tends to be point-and-click, which can’t be scripted effectively.
On UNIX-like operating systems, a base image can be installed, application installation can be scripted, and the overall configuration settings can be scripted too. A number of tools make this easy, such as Puppet, which is FOSS.
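As a trivial illustration of the idea – not any particular site's setup – a fixed base image plus a short post-install script can take a UNIX-like server from bare image to configured service. The package names and config path below are invented for the example.

```sh
#!/bin/sh
# Hypothetical post-install script layered on a fixed Ubuntu base image.
# Package names and configuration paths are examples only.
set -e

apt-get update
apt-get install -y postfix dovecot-imapd     # scripted application install

# Scripted configuration instead of point-and-click
cp /srv/golden-config/postfix-main.cf /etc/postfix/main.cf
/etc/init.d/postfix restart
```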
Use a Smaller OS
A Xen Ubuntu Linux 8.04.x guest running a complete enterprise messaging system with over a year’s worth of data is under 8GB, including 30 days of incremental backups. Other single-purpose server disk requirements are smaller, much smaller. This blog server is 2.6GB with 30 days of incremental backups. That’s almost a factor of 10 smaller than an MS-Windows server. Virtualization helps too. JeOS is a smaller Ubuntu OS install meant for virtual servers.
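Those small footprints depend on keeping the 30 days of incremental backups cheap. One common way to do that on a Linux guest – a sketch, with made-up paths – is rsync with hard links, so each daily snapshot only stores the files that actually changed.

```sh
#!/bin/sh
# Hard-link based incremental backups: unchanged files cost no extra space.
# Paths are illustrative; point them at your own source and backup disk.
SRC=/
DEST=/backup
TODAY=$(date +%Y-%m-%d)

rsync -a --delete \
      --exclude=/proc --exclude=/sys --exclude=/backup \
      --link-dest="$DEST/latest" \
      "$SRC" "$DEST/$TODAY"

ln -sfn "$DEST/$TODAY" "$DEST/latest"   # move the "latest" pointer forward
```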
No Single Answer
There is no single answer to this problem. I doubt any company can run on Linux systems alone. Data deduplication is becoming more and more practical for backups, but it isn’t ready for transactional, live systems. Using fixed OS images is a best practice, but many software systems demand specialized installations and settings, which makes this solution increasingly complex.
A hybrid solution will likely be the best for the next few years, but as customers, we need to voice our concerns over this issue with every operating system provider.
December OpenSolaris Meetup
I attended the Atlanta area OpenSolaris Meetup last night even though major rain in the area made the 30-minute drive challenging. Why would I bother? Swag? Scott D presenting? Being around other nerds who like Solaris? No, although those are all valid reasons too.
Even with the nasty weather, the room was packed and we had to bring in some more chairs so everyone could sit. About 20 people attended.
New stuff in ZFS
Yep, the entire meeting was about fairly new features added to ZFS on OpenSolaris – things like data deduplication and how well it works in normal and extreme situations. The main things I took away from the talk were:
- ZFS is stable
- Data deduplication, dedup for short, should only be used on backup areas, not on live production data, until you become comfortable with it and with its performance in your environment
- Dedup happens at the block level of a zpool; anything above that level still works as designed
- Only use OpenSolaris builds after 129 if you plan to use dedup. Earlier builds had data-loss issues in the dedup code.
- Solaris doesn’t have the dedup code yet. It is not currently scheduled for any specific release either.
- Dedup only happens in real time for now; there is no dedup thread that can be scheduled to run later. This could have unknown performance impacts (good or bad).
- ZFS supports both read and write cache devices. This means we can use SSDs for either cache and deploy cheaper, larger SATA disks for the actual disk storage. Some cost/performance examples compared 10,000rpm SAS drives against SSD caches in front of 4,200rpm SATA drives: the price was about the same, 4x more storage was available, and performance was 2x better for reads and about the same for writes. Nice. (See the sketch after this list.)
- ZFS has added a way to check for disk size changes – suppose your storage is external to the server and really just a logical allocation. On the storage server, you can expand the LUN that the server sees, and ZFS can be configured to refresh disk device sizes either manually or automatically (also shown in the sketch after this list).
- Device removal – currently there is no direct method to remove a disk from a ZFS pool. There are workarounds, however, and a supported way to remove a disk from a zpool is planned for OpenSolaris ZFS this year.
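For reference, the hybrid-storage and LUN-growth items above map onto ordinary zpool commands. This is a sketch only; the pool name tank and device names like c1t0d0 are placeholders for your own hardware.

```sh
# Hybrid storage pool: big, cheap SATA disks hold the data,
# with SSDs added as a read cache (L2ARC) and a separate write log (ZIL).
# Pool and device names are placeholders.
zpool create tank mirror c1t0d0 c1t1d0
zpool add tank cache c2t0d0          # SSD read cache
zpool add tank log c2t1d0            # SSD write log

# Pick up LUN size changes from the storage array automatically
zpool set autoexpand=on tank
```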
To really get the demo, you need to accept the other great things about ZFS as a basis, then add the new capabilities on top. One of the demonstrations showed how IT shops can charge data storage back to each of the users consuming it, even when 20 other departments are using the very same data blocks. Basically, dedup gives you more disk storage without buying more disk.
ACLs are managed at the file system level, not the disk block level, so deduplicated data can still only be accessed by those with the appropriate permissions.
Why OpenSolaris?
OpenSolaris is an open source version of Sun Microsystems’ Solaris operating system that runs on lots of hardware you may already own. It also runs inside most virtual machines as a client or guest. Since it looks and feels like Solaris, you can become familiar with it at home on your PC for nothing more than the cost of disk storage – about 20GB. Sun also uses OpenSolaris to trial new features before placing them into the real Solaris releases. I run OpenSolaris in a virtual machine under Windows 7 using the free version of Sun’s VirtualBox hypervisor. I know others who run it directly on hardware, under Xen, and under VMware hypervisors too. Just give it enough virtual disk storage and go. I think 10GB is enough to load it, but a little more, say 20GB, will let you play with it and applications more.
If you are in the market for NetApp storage, you really need to take a look at Sun’s storage servers running ZFS. The entry price is significantly less, and you get all the flexibility of Solaris without giving up CIFS, iSCSI, NFS, and, in the future, fibre channel storage. Good sales job, Sun.
Swag
No meetup is a success without some swag. Water bottles, t-shirts, hats, and books were all available. We were encouraged to take some after the iPod Nano raffle was won (not by me). Pizza and sodas were also provided by the sponsors.