Readers Ask About ... LVM+JFS+RAID 1

Posted by JD 08/08/2011 at 05:00

Below is the first of 6 questions from a reader. I definitely don’t have all the answers, but I’m not short on opinion. ;)

Part 1 – LVM+JFS+RAID | Part 2 – Service Virtualization | Part 3 – Virtualizing Media Storage | Part 4 – Hosting Email

duijf asks:

I have a total of 5 quiet 5400RPM 1TB drives configured in a RAID5+1 array. I installed Ubuntu Server 10.04 onto LVM; inside the LVs, JFS is used as the file system. Is this good practice?

Summary

  • Ubuntu Server 10.04
  • LVM
  • JFS
  • 5×1TB 5400 rpm drives

Good practice is a matter of opinion. I like:

  • Ubuntu 10.04
  • LVM
  • JFS

I’m not crazy about 5400 RPM drives, especially in a RAID5 configuration.

Ubuntu 10.04

When it comes to picking an OS, I tend towards stability as the single most important thing. Ubuntu 10.04 provides stability. Use it and keep it patched. I use 10.04 on production machines too. If you are still running Ubuntu 8.04, as we do here, that’s fine, but you need to be planning an upgrade path. Debian Stable works, as do RHEL and RHEL-based distros like CentOS. Neither Fedora nor a non-LTS Ubuntu release is recommended for production servers. Lab use, yes. Production, no.

For ease of maintenance, I prefer APT-based distros. Ubuntu and Debian fit.
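
On an APT-based box, keeping patched is a two-command habit. Something like this (run with sudo or as root) is all the routine maintenance usually takes:

    # refresh the package lists, then apply every pending update
    sudo apt-get update
    sudo apt-get upgrade
    # use dist-upgrade instead when a kernel or dependency change is expected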

LVM – Logical Volume Manager

Using some sort of volume manager is a best practice too, but LVM isn’t the only way to get those capabilities. ZFS is another option, though LVM is much more popular in the Linux world.

I will admit that I do not use LVM due to an unfortunate mistake made many years ago which led to massive data loss. Still, the key to using any volume manager is to get comfortable with the tool: practice recovery steps before you need them, verify that you can restore from snapshots, and work out a bare-metal restore plan. LVM snapshots do not replace backups. However, I would definitely take a snapshot prior to every backup AND prior to every system update. Consider snapshots as seat belts before getting into a race car. Highly recommended.
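
As a sketch of the seat-belt idea, assuming a volume group named vg0 with a root LV and some unallocated extents left in the VG (names here are examples, not a prescription), a pre-backup snapshot looks roughly like this:

    # take a 5GB copy-on-write snapshot of the root LV
    sudo lvcreate --size 5G --snapshot --name root-snap /dev/vg0/root

    # back up from the frozen snapshot instead of the live file system
    sudo mkdir -p /mnt/snap
    sudo mount -o ro /dev/vg0/root-snap /mnt/snap
    # ... run the backup against /mnt/snap ...

    # release it afterwards so the copy-on-write space is freed
    sudo umount /mnt/snap
    sudo lvremove /dev/vg0/root-snap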

JFS

JFS isn’t new.
JFS isn’t sexy.
JFS isn’t the fastest file system available.

JFS has been working for many, many years since IBM donated the code.
JFS is proven.
JFS is bootable.
JFS is the Journaled File System.

I use JFS for my RAID storage and have for over a decade. In fact, I started using JFS before I should have and had some real problems early on. This was before JFS was supported in stock kernels, and the problems hit while I was attempting a system upgrade. When you are considering which file system to deploy, consider these things:

  • Can a rescue CDROM boot it?
  • Can a rescue CDROM access it?
  • Is it journaled to avoid long FSCK times?
  • Has it been proven by millions of users across many thousands, if not hundreds of thousands, of systems?
  • Will your data be safe?
  • Can you easily backup the data AND restore it?
  • Is the read AND write performance average or better?
  • Is the file system supported by your distribution without rebuilding the kernel or adding special modules?

If you choose a file system that doesn’t meet these conditions, you are asking for trouble later.

JFS passes each of these tests.
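
The last bullet is easy to check on a Debian or Ubuntu system, and putting JFS on a fresh logical volume takes only a couple more commands (the LV and mount point below are examples):

    # is JFS known to the running kernel?
    grep jfs /proc/filesystems || sudo modprobe jfs

    # the user-space tools live in the jfsutils package
    sudo apt-get install jfsutils

    # format and mount an example logical volume
    sudo mkfs.jfs /dev/vg0/data
    sudo mkdir -p /srv/data
    sudo mount -t jfs /dev/vg0/data /srv/data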

I’m watching newer file systems eagerly. File systems like ZFS and BTRFS are interesting for their added capabilities. I’ve started playing with EXT4 as a replacement for JFS, but haven’t taken any production systems there. In another 6 months, EXT4 will probably become my default file system, unless something major happens with ZFS or BTRFS. I’ve read that Linus runs BTRFS on his system.

RAID5

RAID5 is a way to merge 3 or more disks into a single pool of storage protected by parity. 5×1TB disks in a 4+1 configuration (four active drives plus a hot spare) means about 3TB of usable storage is available, with the capacity of the remaining two drives used for parity and as a failover disk.
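
For reference, that 4+1 layout is roughly what the following mdadm command builds (device names are examples; substitute your own):

    # 4 active members in RAID5 plus 1 hot spare: ~3TB usable from 5x1TB drives
    sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 \
        --spare-devices=1 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

    # watch the initial sync, and any later rebuild, here
    cat /proc/mdstat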

Google did a study about disk failures. Basically, if any error was seen in the SMART data, the drive had a 60% chance of failing prematurely.
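
Checking for those SMART errors takes a minute with smartmontools; anything reporting reallocated or pending sectors is a drive I would plan to replace (the device name is an example):

    sudo apt-get install smartmontools

    # overall health verdict, then the attributes that matter most
    sudo smartctl -H /dev/sda
    sudo smartctl -a /dev/sda | grep -Ei 'reallocated|pending|uncorrect'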

A few years ago, someone looked at the MTBF and error rates for hard disks and raised a flag about RAID5 use on large disks based on that math. The numbers led the researchers to believe that after 1 disk fails in a 7-drive array, while the new drive is rebuilding, another disk is 50% likely to fail. More drives increase the likelihood of a failure during rebuild. I don’t believe we have actually seen the predicted failure rates, but I’ll let someone else validate that with THEIR DATA, not mine. ;)
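
For the curious, here is a back-of-the-envelope sketch of that kind of math, applied to the reader’s own 4+1 array rather than the original article’s numbers, and assuming a typical consumer-drive unrecoverable-read-error spec of 1 per 10^14 bits:

    # a rebuild has to read the 3 surviving terabytes
    bits=$(echo '3 * 10^12 * 8' | bc)                 # 2.4e13 bits
    # expected unrecoverable read errors at 1 per 10^14 bits
    expected=$(echo "scale=6; $bits / 10^14" | bc)    # about 0.24
    # chance of hitting at least one error before the rebuild finishes
    echo "scale=4; 1 - e(-$expected)" | bc -l         # about 0.21, i.e. ~21%

Bigger or more numerous drives push that number up quickly, which is the heart of the argument.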

If I had 5 HDDs, I’d deploy RAID6, not RAID5. Actually, I’d look to deploy ZFS with RAIDz2 on a storage system. Further, I’d be certain to limit the partition sizes on the RAID to something easily backed up to another disk or NAS on my network. In this example, about 3TB of usable storage is available. I would split that up into 3×1TB partitions so 3×1.5TB NAS disks (no RAID) could easily be used to incrementally backup the RAID data. If that isn’t clear, please ask questions in the comments. If you can’t backup the data, you don’t want to have it at all.
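
If the ZFS route appeals, the raidz2 version of that 5-drive layout is a one-liner, and the 1TB split can be done with quotas instead of partitions. Pool, dataset, and device names below are examples, and on Linux this assumes the out-of-tree ZFS-on-Linux or zfs-fuse packages, since ZFS is not in the stock kernel:

    # 5x1TB in raidz2: ~3TB usable, and any 2 drives may fail
    sudo zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

    # carve it into ~1TB chunks that map neatly onto the backup disks
    sudo zfs create -o quota=1T tank/store1
    sudo zfs create -o quota=1T tank/store2
    sudo zfs create -o quota=1T tank/store3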

Slow Drives – 5400 rpm

The Green hard drives are usually cheaper and meant for home use. Due to cheaper components, their reliability is also lower. Good marketing around lower power use has won out over how-important-is-your-data concerns. Green drives have much higher failure rates when used 24/7. I don’t think any drive with less than a 3-year warranty is worth my time, period. I buy drives with 5-year warranties.

Some green drives are suitable for backups and for storing completely unimportant data like TV shows. I would never deploy green drives in a RAID. RAID is used when your data is important. Using a green drive says, “my data is not important.” Just an opinion.

Disk drives in arrays that automatically stop spinning on their own seem like a bad idea to me. I haven’t seen any published data around those failure rates. Only a large user of disk drives would have enough data points to build a statistically valid sample and draw conclusions from it.

Green Drive Issues – definitely eye opening.

Backups

I’m a little concerned that there wasn’t any mention of backing up the data in any of the questions. If you can’t afford to backup the data, then you shouldn’t hold the data anywhere. The temptation is too great to believe that RAID will protect you, somehow. That simply is not the case.

Backups need to be 5 things – RAID only provides one of these.

  1. Automatic (RAID is automatic)
  2. Incremental/Versioned
  3. Efficient on Storage and Network
  4. Recovery Validated
  5. Offsite for highly critical data

Anything that isn’t automatic is quickly forgotten and not performed. That’s just human nature.

Versioned means you can restore data from yesterday, last week or last month. This protects against viruses or system changes that don’t get noticed for a few weeks. If you are mirroring your data, then the mirrored files have the same issue as the source and become useless.

I like my backups to be reverse incremental. Huh? That means the most recent backup looks like a mirror, so restoring a file from yesterday is a simple copy; no backup/recovery tool is needed to dig into an archive to find a single file. Backups from 2-500 days ago take a little more work, but the data is there: compressed, efficiently stored, and available.

Efficient on storage and the network means backups shouldn’t take hours and hours to complete, nor should they flood your network and prevent other important traffic from flowing. The best way to deal with this is to use backup tools that use librsync (or similar) and only transfer changed data.

As an example, I used to backup my systems with rsync. It was taking 45 minutes for every virtual machine, which meant many, many hours were required to backup just a single physical machine. That was unacceptable. I switched our backups to rdiff-backup and was able to backup the same data in under 4 minutes per virtual machine, less than 30 minutes total. Rdiff-backup stores older backup versions compressed as change-only data; if the data didn’t change, it isn’t stored again in that version. I’m able to retain 90 days of daily backups for just 10% more storage than the 1-day-old mirror. Pretty cool. There are a few other articles here about rdiff-backup.
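
For anyone who wants to try it, the day-to-day rdiff-backup commands are short. Paths here are placeholders:

    # nightly job: the destination always looks like a plain mirror,
    # with the reverse increments tucked away under rdiff-backup-data/
    rdiff-backup /home /backups/home

    # keep roughly 90 days of history, as described above
    rdiff-backup --remove-older-than 90D --force /backups/home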

A backup without recovery verification is worthless. In the military they say, “hope is not a plan.” I don’t want to HOPE my data will be restored. I want a plan, one that has been validated, to follow at 4am when I need the data restored. I’ve tested our recovery plans under fire. They work. They have saved me from my own stupidity more often than I’d like to admit.
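
That validation can be scripted into the same nightly job. With rdiff-backup (same placeholder paths as above), a periodic test restore is just:

    # checksum-verify the most recent backup
    rdiff-backup --verify /backups/home

    # pull one file back as it existed 7 days ago and compare it to the live copy
    # (differences are expected if the file changed since then)
    rdiff-backup --restore-as-of 7D /backups/home/someuser/notes.txt /tmp/notes.txt
    diff /home/someuser/notes.txt /tmp/notes.txt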

Come back later for more user questions. The upcoming questions concern deploying:

  • virtualization
  • Media and SMB
  • Email hosting
  • Reverse proxies and bridges
  • VPNs

That seems like a nice set of questions. Thanks to duijf for letting me use these questions.

  1. JD 01/07/2012 at 20:40

    Here is the Part 4 – Hosting Email article.