What makes a hard drive reliable?
Sep 15, 2011 at 4:55 PM Post #61 of 80
It depends on the model. If you get a 1TB drive, Seagates are very reliable; if you get a 2TB or 3TB, be prepared for a dead drive. That said, one of the most reliable manufacturers of late has been Samsung. Their 2TB drives were the only ones worth buying for a while and still outperform much of the competition. Seagate recently bought Samsung's HDD division, so we'll see what that means in the long run for both brands.
 
Until that's sorted, Western Digital is going to be my go-to brand.
 
Of course, I've always heard that by far the best solution is to buy multiple drives with similar stats from multiple vendors; that way, when they're used in a redundant system, it's extremely unlikely that multiple drives will fail at the same time.
 
Sep 15, 2011 at 5:25 PM Post #62 of 80
Of course, I've always heard that by far the best solution is to buy multiple drives with similar stats from multiple vendors; that way, when they're used in a redundant system, it's extremely unlikely that multiple drives will fail at the same time.


You're thinking of a RAID setup, and while that's one route to take, it does have its downside: slower speed. With identical drives, they're always performing to the max. But in a RAID, if you had six hard drives that did 150MB/s reads with a 10ms seek time, and the seventh drive did only 120MB/s with a 15ms seek time, then the entire array is limited as if every drive were the same as that one.

Unfortunately, sequential read and seek times aren't the only specs that matter. There are also the algorithms that the drives use in their firmware to handle NCQ and other things. Each algorithm is better in certain scenarios than others, but with multiple types of drives the array will always be limited to the slowest algorithm in each case.

Most of the time the performance hit is bigger than you think.
 
Sep 15, 2011 at 5:31 PM Post #63 of 80
Not really. Where you actually see slower speeds is when one drive is significantly slower than the others. If all the drives are 7,200RPM with the same cache size, the speed hit will be negligible. We're talking an MB/s or two off the top end, nothing so dramatic.
 
This tactic is actually what Google does in its own data centers. 
 
Sep 15, 2011 at 5:34 PM Post #64 of 80
Quick answer. Hard drives are incredibly unreliable.
 
I have three hard drives right now.
 
Two are brand-new WD drives.
 
The Toshiba HD in my Toshiba laptop is near the end of its life.
 
And one of the two WDs is making clicking noises.
 
When the hell are SSDs going to get cheaper? I'm sick of this BS.
 
Sep 15, 2011 at 6:17 PM Post #65 of 80
Not really. Where you actually see slower speeds is when one drive is significantly slower than the others. If all the drives are 7,200RPM with the same cache size, the speed hit will be negligible. We're talking an MB/s or two off the top end, nothing so dramatic.
 
This tactic is actually what Google does in its own data centers. 


For Google, it's cheaper to make the servers more reliable than to make them faster, because the extra maintenance and reliability checks cost more than adding a few extra servers to make up for the lost speed.

For your home, you gain more reliability by going with RAID 6, 0+1, 10, 50, or 60 than by worrying about which brands of drives you're using.

And it's not the MB/s that makes a difference, it's the IOPS. But even going by MB/s for a second here: if you have a RAID5 with seven drives that read at 150MB/s and one at 120MB/s, the seven fast drives are each throttled by 30MB/s, so you're losing 7 x 30MB/s, or 210MB/s. That's a lot more than "an MB/s or two off the top" :wink:. In fact, you gain MORE reliability and MORE speed by removing that eighth hard drive than by keeping it just because it's a different brand, because there'll be one less possible point of failure.
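
To make the arithmetic concrete, here's a toy Python sketch of that throttling effect (the 150/120MB/s figures are just the hypothetical numbers above, and real RAID5 reads are a bit more complicated than a pure stripe):

```python
# Toy model: in a striped array the members are read in lockstep,
# so aggregate throughput is (number of drives) x (slowest drive).
def striped_read_throughput(drive_speeds_mb_s):
    return len(drive_speeds_mb_s) * min(drive_speeds_mb_s)

matched = [150] * 7          # seven matched drives
mixed   = [150] * 7 + [120]  # add one slower eighth drive

print(striped_read_throughput(matched))  # 7 x 150 = 1050 MB/s
print(striped_read_throughput(mixed))    # 8 x 120 =  960 MB/s
# The seven fast drives are throttled from 150 to 120 MB/s each,
# i.e. the 7 x 30 = 210 MB/s loss described above. Note the mixed
# 8-drive array comes out slower overall than the 7 matched drives.
```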
 
Sep 15, 2011 at 6:37 PM Post #67 of 80
If reliability is your goal, then RAID is not the answer, given all the extra issues it adds on top of the drives' inherent instability. RAID is about performance and creating large pools of data, not about creating higher reliability. ZFS and other similar systems are less about performance and more about reliability, and they're the proper choice if that is your intended goal.
 
Minus the fact that unless you're using a high-bandwidth interface like USB 3.0 or Light Peak/Thunderbolt, there's no way you'd ever get anywhere near speeds where you'd lose that much performance. And no, I was talking about an MB/s here or there off the entire array, not off the individual drives. Most drives are very close in terms of performance, and the biggest real hit you'd notice is in the response time of the array.
 
Cheaper SSDs aren't that far away, but they aren't without issues. They have limits on how many times you can write to individual sectors. Granted, they're usually rated for a million writes, but at the price point of an SSD I'd imagine myself keeping a drive through multiple machines.
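
For a sense of scale, here's a back-of-the-envelope lifetime sketch in Python; the capacity, cycle count, and daily-write figures are illustrative assumptions, not any real drive's rating:

```python
# Rough SSD endurance estimate: total writable bytes is roughly
# capacity x write cycles per cell (ignoring write amplification).
def ssd_lifetime_years(capacity_gb, write_cycles, gb_written_per_day):
    total_writable_gb = capacity_gb * write_cycles
    return total_writable_gb / gb_written_per_day / 365

# Assumed: a 120GB drive, 5,000 cycles per cell, 20GB written per day.
print(round(ssd_lifetime_years(120, 5_000, 20)))  # ~82 years of cell wear
# Even with far more conservative cycle ratings than "a million writes",
# wear-out takes a very long time at desktop workloads.
```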
 
That being said, I'm about three years away from my next system, and by that point SSDs will be the drives I get. Most likely, since I use desktops, it'll be a medium-sized SSD (480GB or so) plus a multi-TB HDD for low-read/write file storage.
 
Sep 15, 2011 at 8:15 PM Post #69 of 80
If reliability is your goal, then RAID is not the answer, given all the extra issues it adds on top of the drives' inherent instability. RAID is about performance and creating large pools of data, not about creating higher reliability. ZFS and other similar systems are less about performance and more about reliability, and they're the proper choice if that is your intended goal.
 
Minus the fact that unless you're using a high-bandwidth interface like USB 3.0 or Light Peak/Thunderbolt, there's no way you'd ever get anywhere near speeds where you'd lose that much performance. And no, I was talking about an MB/s here or there off the entire array, not off the individual drives. Most drives are very close in terms of performance, and the biggest real hit you'd notice is in the response time of the array.


First of all, yes, ZFS is great, but we're on Head-Fi right now. How many people reading this thread do you think have heard of it, let alone know how to implement it? Someone who's just looking for a safe way to store music, movies, and photos is going to have a lot less hassle with an external RAID enclosure than with setting up a Solaris file server with a ZFS filesystem, LDAP authentication, and Samba integration.

Second, no, RAID is first and foremost about reliability; the increased speed is just a nice bonus. If you're talking about RAID0, then yeah, that's just speed, but every other RAID level is about redundancy.

Furthermore, ZFS uses the exact same methods a RAID card uses to stripe data across an array of drives, letting you pick no redundancy (RAID0), mirroring (RAID1), single parity (RAID5), or double parity (RAID6). The only difference between ZFS and a hardware RAID card, as far as redundancy is concerned, is that ZFS uses the CPU to calculate parity while hardware cards have their own processor that offloads the work. ZFS is definitely the best overall filesystem, but this aspect of it is no different from any other software RAID except in performance and logistics.
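
For anyone wondering what "single parity" actually means, here's a minimal Python sketch of the XOR trick that both RAID5 and RAIDZ1 rest on (real implementations add checksums, variable stripe widths, and so on; this is just the core idea):

```python
from functools import reduce

def parity(blocks):
    """XOR equal-sized blocks byte by byte to produce one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread across three data drives:
stripe = [b"AAAA", b"BBBB", b"CCCC"]
p = parity(stripe)

# Simulate losing drive 1: XOR the survivors with the parity block
# and the missing data falls right back out.
recovered = parity([stripe[0], stripe[2], p])
assert recovered == stripe[1]  # b"BBBB"
```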

Lastly, "response time", or IOPS, is the #1 deciding factor in storage performance. It's why SSDs are so much faster than HDDs. Using different brands (or even different models) of hard drives will decrease it enough to have a performance impact. It's a tradeoff between speed and reliability that you have to make. In some cases, yes, using different brands is good. In other cases you're better off buying more of the same drive and increasing your parity level instead.
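
As a rough illustration of why IOPS dominate response time, here's a back-of-the-envelope sketch (the seek times and RPMs are generic figures for drives of this era, not any specific model):

```python
# Rough random-IOPS estimate for a spinning disk: one random I/O
# costs about one average seek plus half a platter rotation.
def hdd_iops(avg_seek_ms, rpm):
    rotational_latency_ms = 0.5 * 60_000 / rpm  # half a revolution
    return 1000 / (avg_seek_ms + rotational_latency_ms)

print(round(hdd_iops(8.5, 7200)))   # ~79 IOPS for a typical 7200RPM drive
print(round(hdd_iops(3.5, 15000)))  # ~182 IOPS even for a 15k SAS drive
# SSDs have no seek or rotation, so they manage thousands of IOPS.
# That gap, not sequential MB/s, is what you feel in response time.
```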
 
Sep 16, 2011 at 4:26 AM Post #70 of 80
RAID doesn't do copy-on-write, filesystem-level checksums... There are a dozen different features built into ZFS that make it SIGNIFICANTLY more stable for files than RAID.
 
Also, you can do it with FreeNAS automatically. It's very simple, and it's what I plan on doing when I can raise the capital: 8TB of usable storage for $900.
 
And no, RAID0 has NO reliability. It's drives striped together to behave as one drive; it is all about performance, and the original RAID standard was all about performance. If you don't back up a RAID0 array, you are going to lose all of your data, because if one drive in a RAID0 array goes down, the entire array is dead. Other, later RAID levels offer parity, but they don't account for a number of common issues that happen with any hard drive, not just RAID. ZFS is a filesystem-level solution that fixes every single one of those problems.
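
To put a number on "if one drive goes down the entire array is dead," here's a quick Python sketch, assuming (purely for illustration) a 5% annual failure rate per drive:

```python
# RAID0 survives only if EVERY member survives, so reliability
# shrinks as you add drives instead of growing.
def raid0_survival(per_drive_survival, n_drives):
    return per_drive_survival ** n_drives

p = 0.95  # assumed 95% chance each drive survives the year
for n in (1, 2, 4, 8):
    print(n, f"{raid0_survival(p, n):.1%}")
# 1: 95.0%, 2: 90.2%, 4: 81.5%, 8: 66.3%. An 8-drive stripe is
# roughly seven times as likely to lose your data as a single drive.
```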
 
However, the issues with RAID go further. If you use a hardware controller with certain RAID levels and the card dies, you lose all access to the array until you can replace it with the exact same model; this is one of the major drawbacks of RAID5, as it's a non-standard implementation in a lot of situations. Any ZFS-compatible system can read any basic ZFS array; the only question is which RAIDZ levels it supports. If your RAID is software-based (which is slower than ZFS, by the way, and far more resource-intensive) and you have a system failure, you will lose everything regardless of your hardware. Not to mention the myriad read/write errors commonly associated with software RAID.
 
So no, ZFS is in no way like a software RAID. No corruption on the OS side of things can kill a ZFS zpool. Software RAIDs are only used because they're dirt cheap, not because they're fast or stable.
 
Again, I'll keep saying this because you keep ignoring it: if you have two drives from two manufacturers and they have identical, or at least nearly identical, stats, then the performance hit is negligible. In 100-disk arrays it can slowly add up to a noticeable number, but in those exact situations stability is just as important. On a much smaller scale (the home user you like to mention) of fewer than 10 drives, the reduction is a small fraction. There's a reason I've never heard a single sysadmin say, "Buy all your drives at once from one company." They always tell you to buy multiple pressings of the same drive from different vendors and keep the stats as identical as possible.
 
Besides, the idea that Google cares more about reliability than speed is insane. Google is obsessed with speed. If you really want, I can cite a dozen different examples of where they've gone out of their way to increase the speed of the net simply so they can shave a few milliseconds off page-load times. They even use invalid code on their pages specifically because, even though it creates compatibility issues in certain circumstances, it's just that tiny bit faster.
 
Sep 16, 2011 at 8:06 AM Post #71 of 80
Yes, there are logistical benefits to using ZFS pools instead of RAID, but that only has to do with ease of maintenance. Like you said, there's the problem of migrating arrays between different hardware controllers (not impossible, by the way; it just takes time for the new controller to re-configure the array without destroying the data), which you don't have with ZFS, and ZFS is cheaper and easier to work with since you can use any SAS/SATA port anywhere in the system instead of multiple cards that have to be the same model. But other than that, they're the same thing.

You want a good benefit of hardware RAID, though? If you've got only one or two servers and not an entire enterprise SAN, hardware RAID offloads the parity calculations, leaving the CPU free for other work. With ZFS you're basically dedicating 5-10% of your CPU just to storage.

Also, please reread my last post, because now you're putting words in my mouth. I never said RAID0 was redundant, and I never said that ZFS was as slow as other software solutions.
 
Sep 16, 2011 at 12:10 PM Post #72 of 80
I know you never said it was, but you have mentioned straight RAID as a redundancy solution before and I was just echoing that RAID0 isn't. It's one of the earliest implementations of RAID.
 
And no, as I continue to say, RAID is fundamentally different from ZFS. I'll keep hammering away at this until you get it. There are no checksums in RAID. At all. On any level. That means a file that's copied can get corrupted. It's rare, but rare does not mean never; this does happen. The other major issue that all RAID arrays have is what happens if your array goes down in the middle of a massive write. You're pretty much assured of losing that data, and there's also a fantastic chance you've corrupted the index. That just lost you every last file on the array, and while it's recoverable, there's a good chance the limited files you do recover will have some level of corruption (and heavy defragmentation raises the risk of this happening in the first place, oh the irony).
 
ZFS has built-in checksums. That means that on every read and every write, a quick checksum is computed for the data to make sure it hasn't been corrupted somewhere along the way by basic software errors. ZFS also uses copy-on-write, which means that if you were to lose power mid-write, the worst that could happen is that the data you were writing is lost; any data that was on the zpool before that is perfectly safe, with a 0% chance of corruption.
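
Here's a minimal Python sketch of the checksum-on-read idea (ZFS actually stores fletcher or SHA-256 checksums in its block pointers; this toy version just pairs each block with its hash):

```python
import hashlib

class ChecksummedStore:
    """Toy block store: every write records a checksum, every read verifies it."""
    def __init__(self):
        self.blocks = {}

    def write(self, key, data):
        self.blocks[key] = (hashlib.sha256(data).digest(), data)

    def read(self, key):
        checksum, data = self.blocks[key]
        if hashlib.sha256(data).digest() != checksum:
            # ZFS would try to self-heal here, reading a good copy
            # from a mirror or rebuilding the block from parity.
            raise IOError(f"silent corruption detected in {key!r}")
        return data

store = ChecksummedStore()
store.write("song.flac", b"some audio data")
# Simulate bit rot on the platter (data changes, checksum doesn't):
store.blocks["song.flac"] = (store.blocks["song.flac"][0], b"some audiX data")
try:
    store.read("song.flac")
except IOError as err:
    print(err)  # corruption is reported instead of silently returned
```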
 
As for performance, it doesn't actually take a lot of processing power to run a ZFS RAIDZ zpool. The big requirement is RAM, which is why a lot of people just skip the extra RAM and use SSDs as cache disks. If you do that, a RAIDZ1 zpool is actually pretty close to a RAID5 array in terms of speed, though not quite as fast as RAID10 or RAID0.
 
So while some of the things they do for you are the same, when it comes to reliability ZFS has a near-0% chance of failure over its entire lifetime, whereas a RAID system will die at some point. It's why, if you're doing it seriously, you always have at least two independent RAID servers, so that when one array dies (and it will) you have the other as the fallback. ZFS eliminates the need for that second array as a redundancy solution, at least as far as system-level issues go.
 
Neither system touches human-level stupidity. Oh, wait, ZFS does. It has versioning built into the filesystem, similar to Time Machine and rsync, that allows it to take snapshots of any changes at the block level automatically and roll them back as necessary. Which, oddly, saves you from stupid user errors as well.
 
Which RAID doesn't do under any circumstances, at any RAID level.
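
The snapshot trick falls straight out of copy-on-write: old blocks are never overwritten, so a snapshot is just a saved reference to the old block map. A toy Python sketch of the idea (nothing like ZFS's real on-disk format):

```python
class CowFilesystem:
    """Toy copy-on-write FS: writes replace references, never mutate blocks."""
    def __init__(self):
        self.live = {}       # filename -> immutable data block
        self.snapshots = {}

    def write(self, name, data):
        self.live[name] = bytes(data)  # new block; the old one is untouched

    def snapshot(self, label):
        # Cheap: just remember the current block map. Real ZFS copies
        # nothing here either; it only pins the referenced blocks.
        self.snapshots[label] = dict(self.live)

    def rollback(self, label):
        self.live = dict(self.snapshots[label])

fs = CowFilesystem()
fs.write("notes.txt", b"version 1")
fs.snapshot("before-the-oops")
fs.write("notes.txt", b"oops, clobbered it")
fs.rollback("before-the-oops")
print(fs.live["notes.txt"])  # b'version 1'
```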
 
So, the benefits of RAID over ZFS: RAID is a little faster and less hardware-intensive. The benefits of ZFS over RAID: ZFS has none of the inherent stability issues and is significantly less likely to suffer any level of data loss.
 
By the way, to everyone else: I know it seems like this is horribly off-topic, but hardware isn't the only thing that affects reliability. The circumstances in which a drive is used make a world of difference.
 
Sep 16, 2011 at 1:12 PM Post #73 of 80
Not OT at all.  Keep going! 

 
Sep 16, 2011 at 1:13 PM Post #74 of 80
ZFS is primarily a filesystem with a "built-in" LVM with RAID-like features (the zpool). You can't compare its versioning and checksum features to RAID, because you can still install it on top of one or more RAID arrays and use it. It's not an either-or thing here. Those features should be compared with NTFS, HFS, ext4, and so on. They are completely different things.

I've only been talking about zpools vs RAID, which is the only part of ZFS that is directly comparable.

To put it in audio terms: RAID = preamp, ZFS = amp. You can go ahead and compare the sound quality of the two all you want, but it's not going to mean much. In this case the amp (ZFS) has its own preamp built in (striping and parity), so the only valid comparison is between those features and the preamp (RAID5 vs RAIDZ, etc.). Get it now?
 
Sep 16, 2011 at 1:54 PM Post #75 of 80
A zpool is not RAID. It's more like SPAN, and it doesn't start to act like RAID until you use some level of RAIDZ (1-3) or mirroring.
 
Oh, I also forgot to mention one of the greatest advantages of ZFS over RAID: with RAID it's very hard to mix levels beyond mirroring and striping, while ZFS can use any combination of RAIDZ and mirroring it chooses. You could, for example, have two RAIDZ3 groups of 10 disks each (7 usable disks with 3-drive failure protection) that are mirrored, so if one RAIDZ group goes down, the other is still running; you just get an error saying, "This disk is dead. You might want to replace it so you don't lose any data." It's not nearly as simple to do that with RAID.
 
There is some level of this that you can do in some of the more advanced RAID implementations, but ZFS allows any configuration you can think of. Should you wish, you could have ten 10-drive RAIDZ1 groups all acting as one big RAIDZ3 pool. It'd be a ridiculous amount of data redundancy, but you could do it.
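
As a rough sketch of the capacity math for those two hypothetical layouts (treat this as arithmetic only; stock ZFS doesn't literally let you nest RAIDZ groups inside each other like this):

```python
# Usable-capacity arithmetic for the layouts described above.
def raidz_usable(disks, parity):
    return disks - parity

# Two mirrored 10-disk RAIDZ3 groups: 7 data disks per group, and
# mirroring means you only keep one copy's worth of space.
print(raidz_usable(10, 3), "of 20 disks usable")  # 7 of 20 (35%)

# Ten 10-disk RAIDZ1 groups with triple parity across the groups:
# 9 data disks per group, and 7 of the 10 groups hold data.
print(raidz_usable(10, 1) * raidz_usable(10, 3), "of 100 disks usable")  # 63 of 100
```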
 
And no, RAID isn't the preamp with ZFS being the amp. That would imply that you'd mix the two, which you wouldn't do. You'd lose half the benefits of ZFS and keep only the basic checksum and versioning support; you'd lose all the data-integrity benefits. That's why you never hear someone say, "Buy this RAID box to use with your ZFS implementation." They always say, "Find a box that can do JBOD and let ZFS handle everything."
 
A better comparison would be that RAID5 + rsync + some level of system-based checksum monitoring is roughly equivalent to RAIDZ1.
 
And no, I wouldn't compare it to NTFS, HFS+, or ext4. Why? Because you don't compare things based on the most basic level of shared functionality (they're all filesystems); you compare them based on what they're actually used for, which is so much more. I wouldn't even compare HFS+ to NTFS, because even with all the issues HFS+ (or ext4, for that matter) has, it's still miles beyond NTFS; none of them are particularly modern, though. Now, if you really wanted to compare a filesystem to ZFS, you'd need to compare it to btrfs, which has a lot of the same features. It is, however, very early days for btrfs; it's still little more than an experiment and isn't ready for prime time.
 
Now, if you REALLY want a pro for RAID over ZFS, it's quite simple: you can't run a ZFS zpool attached to a Windows OS. There is no support, on any level. You can get support on Linux and OS X (not to mention Solaris, obviously), but that's really it; no one has taken the time to port it to Windows. Then again, ZFS is mostly used on servers, and in that context it only matters what the host runs. Be it FreeNAS (full native support) or Ubuntu (kernel-module support), you can use it just fine and then set up sharing with any other OS. With FreeNAS it's as simple as flipping a switch.
 
So I would say my getting "it" was not the problem, no.
 
