How Long Do You Use Hard Drives?
This morning I was thinking about how long hard drives should be used. Seems that the disks spinning 24/7/365 in an array here were purchased in 2006, just under 6 yrs ago. The drives themselves have never caused any issues, though a loose SATA cable was problematic the first 12 months or so. Since then, that array has been working perfectly.
The OS boot HDD was bought a few months before the disks for the array.
Holy CRAP! Almost 6 yrs old! I’m afraid, very afraid.
How Long Is Too Long?
We all know that every HDD will fail. It is like death and taxes. Nobody gets out of here alive. These hard drives will fail. The key is not to be using them when that happens.
The warranty on all the drives was 5 years. They are all Seagates from before Seagate lied. I haven’t purchased another Seagate product since. There are 4 Seagate model ST3320620AS drives in the array and the boot HDD is a model ST3300620AS. I just used the other 300GB drive of that model in a new XBMC box.
History of the Array
- Array HDDs were purchased from NewEgg in December 2006.
- OS HDD was bought 2-3 months earlier.
- The RAID5 disk array was built around February 2007.
- The array was migrated to a new server around February 2010.
Current Array Status
Grabbed this a few minutes ago.
$ more /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sde1[2] sdc1[0] sdf1[1] sdd1[3]
      937705728 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
All is well. The array is not full, but well used:
/dev/md0 937644140 774953616 162690524 83% /r5
The OS HDD is older by a few months. It has partitions for /home (approx. 260G) and the OS (20G). I’m getting afraid. That box (Core i5) is running Ubuntu Server 10.04 x64. I’ve been planning an upgrade to 12.04 – actually a fresh install – but looking at the age of the HDDs, perhaps it is time to buy (5) new 1.5TB-2TB HDDs and rebuild everything? The external disks will be RAID1 this time, not RAID5. The RAID5 write hole makes me a little nervous, though I’ve never had any issue with it. About 1TB of storage has been cleared off a USB3 HDD so all the data from the RAID can be dumped over to it. Only data is stored there, not programs, though the virtual machine running Windows7 Media Center does sit there. Running a VM off USB3 storage makes me nervous; I’ve seen queuing issues over both USB2 and USB3 storage.
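When I do rebuild, creating the mirror is the easy part. A rough sketch with mdadm, where the md1 name and device names are only placeholders for whatever the new disks end up being:

$ sudo mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
$ cat /proc/mdstat                                    # watch the initial resync
$ sudo mkfs.ext4 /dev/md1                             # or mkfs.jfs, depending on mood
$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf   # persist across reboots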
Having some extra room to play with btrfs on unimportant data will be nice too. I just started using EXT4 around here; most storage is on the proven JFS file system. JFS can be booted from, which is very important. If XFS were bootable, I’d use that instead. Storage inside Linux VMs is EXT2, but the host OS is either EXT4 or JFS for journaling. EXT2 has lower overhead, and with the host OS doing the journaling, I’m not worried about data loss.
Questions for You
- How Long Do You Use Hard Drives?
- When do you migrate old HDDs to some less important purpose? Perhaps as archive media or purely for backups?
- Do you just stop using them completely and sell/give them away?
Please provide your wisdom in the comments.
I usually replace hard drives after 3-4 years of 24/7/365 spinning. Partly because of assumed attrition, although I admit I can’t really support this timespan with hard data; it just seems right to me. The well-known Google study “Failure Trends in a Large Disk Drive Population” is so old now that it may not say much about contemporary hard drives. Another reason is that after 3-4 years the speed improvements of new hard drives usually seem worthwhile to me.
Apart from this, I check the SMART values 05/C5/C6 weekly. If there is more than an infrequent slight increase in one of these values, I replace the drive. In one case so far this has reliably predicted an impending drive failure.
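For anyone curious, pulling those three attributes is a one-liner with smartctl from smartmontools; the device name below is only an example:

$ sudo smartctl -A /dev/sda | egrep 'Reallocated_Sector|Current_Pending|Offline_Uncorrectable'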
My rules may not be cost-optimal, but I find the cost of new hard drives negligible compared with how much I value my data (independent of the fact that I’m a backup freak). I have never lost significant data since the days when audio cassettes were used for data storage.
The first thing I do after sorting a drive out is erasing it. Currently I use a sevenfold overwrite, ending with zeroing everything (thwarting any idea that it may contain encrypted data). The current consensus seems to be that a single overwrite is sufficient for modern drives, but a more thorough method is not much more effort, and feels better ;-)
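With GNU shred that whole wipe is a single command; something like the following, where /dev/sdX is a placeholder for the drive being retired (triple-check it first):

$ sudo shred -v -n 7 -z /dev/sdX      # 7 random-data passes, then a final pass of zeros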
I usually keep about two “old” hard drives for experimental purposes; the rest I give away or destroy. Selling doesn’t seem worth the effort to me.
I too like replacing drives every three years. My next scheduled decommission is in December of this year. Hard drive prices have been declining nicely lately, so I’m hoping by then they’ll be back to where they were a year ago.
When I replace a drive one of four things happens to it (in order of likelihood):
1. It becomes an encrypted archive of the data that was on it at the time of decommissioning. When it dies, it dies; not a big deal.
2. It gets repurposed for use in something unimportant (test lab, seed box, etc).
3. Nuked and disassembled so I can play with the powerful magnets in there and admire the platters (until I decide to shatter them). My inner child can’t pass up the chance of playing with shiny things and powerful magnets :P
4. Nuked and given to a family member for backup purposes (and I do tell them it’s already 3 years old — only done once so far)
Matt Simmons over at Standalone Sysadmin did a good writeup on RAID5 and Unrecoverable Read Errors recently. For modern desktop drives the spec is about 1 URE for every 12 TB of data read, which given today’s drive sizes, isn’t much. This really bites you in the butt when rebuilding a RAID5 array of large disks: you’re reading a large amount of data, so the chance of a URE during the rebuild is quite high, causing everything to be lost.
http://www.standalone-sysadmin.com/blog/2012/08/i-come-not-to-praise-raid-5/
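The back-of-the-envelope math is easy to reproduce. Assuming the usual 1-per-10^14-bits desktop URE spec and a rebuild that has to read the three surviving 2TB disks of a 4-disk RAID5, a rough estimate (illustrative numbers only):

$ awk 'BEGIN { bits = 3 * 2e12 * 8; ure = 1e14;
               p = 1 - exp(-bits/ure);                 # chance of at least one URE
               printf "P(URE during rebuild) ~= %.0f%%\n", p * 100 }'
P(URE during rebuild) ~= 38%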
The drive in my constantly recording Tivo (purchased in 2003) lasted until just last month. And it was only spin-rited once, in 2008.
Ok, so I just ordered a couple of 2TB replacement HDDs. I didn’t get the cheapest ones with 3Gb/s SATA; I spent $10 more to get 6Gb/s SATA and 2x the cache. I’m unhappy with the warranty, only 1 yr. To get 3 yrs, each drive would be $200. Still, I’m worried about unplanned downtime.
Having 2x the storage available will be nice too.
@Barb: I swapped out the 80GB HDD in my TiVo S2 about 5 yrs ago for a 300GB drive. I think that was the last PATA HDD ever bought here. I stopped using the TiVo about a year ago when Comcast forced all TV, even local channels, to be QAM. Comcast is slowly obsoleting all our home video equipment.
I’m slow to move to new connection/bus technologies. I would prefer to use DDR2 over DDR3 if I could; the 2% higher performance doesn’t matter to me. I was still using AGP video cards for a few years after PCIe and PCI-X were popular, too.
Anyway, these 2 disks will allow testing of compatibility between larger HDDs and the external array. It currently has 320GB disks inside, so the jump to 2TB is a little steep. Cross your fingers they work. I will. These will be RAID1 storage.
Some interesting data (maybe information) on Seagate HDD P/Ns: the P/N can correspond to 25% less performance within the same drive model. It seems Seagate did a slight redesign in March-May 2012.
I’m hoping that the $100 vs $110 model HDDs at Newegg reflect the 25% performance difference. Newegg and Staples are selling these STBD2000101 for $100, but the data from the websites does not provide enough detail. The ST2000DM001 is $110 on Newegg.
Doing more research, I found threads on the Seagate Forums. Basically, everyone there considers the ST2000DM001 models completely broken and likely to fail. There’s a 500-600k duty cycle expected and an audible chirp issue being reported about every 10 minutes as APM parks the heads. Seagate is asking for the S/N of drives with the issue. It seems to the forum posters that every HDD in that line has the issue – every one. I’d bet they put all those S/Ns into a spreadsheet, not a DB.
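If these drives turn out to chirp too, the usual workaround for aggressive head parking is to back off the drive’s APM setting with hdparm. A sketch, noting that it isn’t persistent across reboots and not every drive honors it (the device name is a placeholder):

$ sudo hdparm -B /dev/sdX         # show the current APM level
$ sudo hdparm -B 254 /dev/sdX     # least-aggressive setting; 255 disables APM where supported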
Perhaps Seagate is trying to cause failures just after the warranty period ends in a reliable way to increase HDD purchases? This is a little too much conspiracy theory even for me.
I’ve been avoiding Seagate HDDs since 2006-2007, when they knew about a major issue and lied about it for over a year to everyone, as if it didn’t exist. It has only been an hour since I ordered and I’m already having buyer’s remorse.
Seagate was a customer at a previous employer many, many years ago, and I saw a little bit of their internal methods. What I saw was a company doing business 20 yrs behind current methods.
The replacement drives arrived last week and I finally got around to testing them after a few false starts.
The latest firmware from Seagate was installed on each drive. No odd noises heard from the spinning disks either before or after the new firmware.
Tried eSATA and USB on 2 different machines. Only the USB connections saw both disks; neither eSATA connection saw any.
USB2 is really slow, so I wanted desperately to use eSATA. When only one disk is seen, that is because the eSATA port is not MP (port multiplier) capable. I’m positive that the laptop eSATA port is MP capable – I’ve used 2 HDDs simultaneously on that machine. I don’t believe any other eSATA ports here are MP compatible, which is something lacking. I couldn’t figure out the issue with the laptop eSATA port; either a port failure or a cable issue is my guess.
In an attempt to get ready for the disk swap into the array, I connected the dock via USB2 to the laptop, used USB pass-thru to a VM running Ubuntu, and got to work.
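The copy job itself is nothing fancy; roughly the following, with the destination mount point standing in for the real path:

$ rsync -avH --progress /r5/ /mnt/newraid/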
The rsync has been running for over 2 days. About 8% of the new partition (136G) is used after all this time. USB2 is really slow. I need to do better.
Just touched the disk drives. Both are extremely warm, hot; 3 seconds hot, definitely not 4 seconds.
Killed the rsync just a few minutes ago. It isn’t going to finish in a reasonable time.
I need a calculator to figure out how long that is … 22 days for the sync between the 2 partitions in the RAID1 set to finish. I hope nothing fails in the meantime. The data mirroring was nowhere near complete, with only 290G transferred of the 920G total.
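For what it’s worth, md reports its own estimate in /proc/mdstat (the finish= field, quoted in minutes), and the resync rate can be nudged with the usual sysctls; the values below are just the stock defaults:

$ cat /proc/mdstat                 # look for the "finish=XXXXmin speed=XXXK/sec" line
$ sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
dev.raid.speed_limit_min = 1000
dev.raid.speed_limit_max = 200000
$ sudo sysctl -w dev.raid.speed_limit_min=50000    # ask md to resync more aggressively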
USB2 is a terrible idea for a RAID set.
OTOH, the drives are getting a workout.
So I finally decided to connect the new disks to the final server where they will connect, just not through the infiniband connection that the array uses.
Reassembling the array was easy. That machine already has an md0. After looking up which specific partitions to use … I had to use parted, because the fdisk in 10.04 couldn’t handle the GPT format … life was good.
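For the record, the steps were roughly the following; the md1 name and sdX/sdY devices are placeholders, since that machine already owns md0:

$ sudo parted /dev/sdX print                    # fdisk in 10.04 chokes on GPT; parted does not
$ sudo mdadm --assemble /dev/md1 /dev/sdX1 /dev/sdY1
$ cat /proc/mdstat                              # confirm the mirror is up and syncing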
And the sync is 10x faster. To clarify, the sync is where the 2 disks in the RAID1 mirror catch up on mirroring the data; it is still catching up from the rsync job 5+ days ago. This time I connected the disks to a USB3 interface, even though the dock connector is only USB2. Hopefully, this faster bus will let the sync finish much more quickly. Checking …
Yep, it looks much better: about 1.3 days total until the data sync is finished. It picked up just where it left off on the other machine. It is still far below the USB2 theoretical limits, but better than the USB pass-thru performance under VirtualBox.
As a point of reference, the chunk size for this mirror is 256K, double that of the other array. According to the reading that I’ve done, both are reasonable sizes. This RAID is running EXT4, not my trusty JFS; that is new to me. I’ve been testing EXT4 on other systems for the last year and never lost any data, so I hope it is good enough.
Testing successful. The array failed a drive this morning.
A reboot didn’t automatically bring it back up either. At least both disks are seen. My feeble attempts to reassemble the disks into an array have failed so far.
At least there wasn’t any primary data on it.
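The usual things I’ve been trying, for anyone following along; device names are placeholders, and --force is very much a last resort:

$ sudo mdadm --examine /dev/sdX1 /dev/sdY1                     # compare superblocks and event counts
$ sudo mdadm --assemble /dev/md1 /dev/sdX1 /dev/sdY1
$ sudo mdadm --assemble --force /dev/md1 /dev/sdX1 /dev/sdY1   # only if the event counts differ
$ sudo mdadm --detail /dev/md1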
Another RAID failure for the two new 2TB Seagate HDDs.
Once again I’d hoped to swap in the new HDDs and was in the process of running an rsync when the RAID1 set dropped a drive and went into read-only mode. The data was still available, but no writes were possible. I’m hoping this is just the USB connection causing issues.
The Plan
I’d hoped to rsync the data this morning and swap the new drives into the disk array, then bring the RAID set up using the InfiniBand connection. That should be much more stable than USB, right? Then I’d run for a few weeks without much stress, keeping careful backups, to see how it went with the RAID1 configuration.
I’ve moved all critical data off the old RAID, so if there is a RAID failure during the next few weeks, it doesn’t matter too much.
If that failed to work well, I’d planned to drop the RAID config completely and go with continuous replication from a primary to a secondary partition, and still have daily, versioned, automatic backups to another HDD. I thought this would provide the best redundancy possible, with only a short risk window from the first write until the data is replicated to a different physical HDD. I know that would work. Heck, Mom’s PC has that working with hourly snapshots.
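That fallback is nothing exotic; run from cron, it would look roughly like this, with every path a placeholder and rsync’s --link-dest providing the cheap hard-linked versions:

$ SNAP=/backup/$(date +%Y%m%d-%H%M)
$ rsync -aH --delete /primary/ /secondary/                  # replication to the second partition
$ rsync -aH --link-dest=/backup/last /primary/ "$SNAP"      # versioned, hard-linked backup snapshot
$ ln -nsf "$SNAP" /backup/last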
This shouldn’t be so hard.
Thanks for sharing this article! Anyway, how will I know whether my hard drive needs replacement already? Hope you could share more tips. Thanks!
@Jerry – there is no scientific answer to your question. You should replace every HDD before it fails.
How you determine a HDD is about to fail can be a mix of SMART data, the drive’s age relative to its warranty, and any new noises it starts making. None of those is reliable on its own.
In short, you should expect an HDD to fail when you least want it to happen. Be prepared. Always.
I do pay attention to warranty lengths when purchasing new HDDs. If the maker doesn’t think a HDD will last 5 yrs – as shown by their warranty – that makes a clear statement. Would you purchase a TV knowing it would fail just after the warranty was up? No, we expect a TV to last 10 yrs beyond the warranty. I had a CRT TV that lasted over 20 yrs. THAT is my expectation. Why is it different for HDDs?
There are tools that claim to help people know when a disk is about to fail, but SMART data is known to be a poor predictor of failures. Google did a study on HDD failures about 7 yrs ago and published it; it is an interesting read.
Every HDD will fail. All of them.