16 Jan 2015

Performance of a three-disk software RAID-5 array

I have been using a three-disk software RAID-5 array for my bulk media storage for the past couple of years, using Seagate Barracuda ST500DM002 drives (500GB each, 1TB storage space with 1 drive fault-tolerance).


I am using Linux mdadm for this, for a number of reasons:

  • Software RAID doesn’t require an expensive RAID card
  • I can monitor the RAID using simple command-line tools (see the example after this list)
  • Performance can actually be greater than that of hardware RAID cards, as the main CPU generally has a lot more grunt than the processor on many RAID cards
  • Software RAID doesn’t care what drive controller you are using – this makes it great if I want to move the drives into a new system
  • Just like any other component in a system, hardware RAID cards can fail (then again so can drive controllers on motherboards)
  • In the future, I can add extra drives to expand the RAID array – not many RAID cards can do this, especially not cheap ones!
  • The cables are plugged directly into the motherboard instead of an add-in card – much tidier!
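
For reference, creating and checking an array like this with mdadm looks roughly like the following (the device names /dev/sd[abc]1 and /dev/md0 are examples, not necessarily what my system uses):

# Create a three-disk RAID-5 array from three partitions:
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1

# Simple monitoring – current state of all arrays, then full detail:
cat /proc/mdstat
sudo mdadm --detail /dev/md0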

RAID-5 needs at least three drives, and the data is striped across all of them to increase performance. You also lose one drive’s worth of capacity – one third, in a three-disk array – to “parity” data, which is used to reconstruct the contents of a drive if it goes missing. This means a hard drive can die and I wouldn’t lose any data unless a second one dies before the rebuild completes. I would just throw in a new drive to replace the old one and the RAID would be reconstructed back to perfect health!
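
In general, for N equal-sized drives, RAID-5 gives you:

usable capacity = (N - 1) × drive size = (3 - 1) × 500GB = 1TB in my case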

One disadvantage is that Windows cannot read mdadm arrays – not that I really need it to anyway. The Storage Spaces feature introduced in Windows 8 offers similar functionality to mdadm.

The performance of my array is excellent:

[Screenshot: read benchmark results for the array]

I can reach about 300MB/s read from the array, though access time hasn’t really improved:

sysbench --test=fileio --file-total-size=6G prepare
sysbench --test=fileio --file-total-size=6G --file-test-mode=rndrw --init-rng=on --max-time=300 --max-requests=0 run

Using the above commands (the prepare step creates the test files first), I got about 762.55KB/s, which pales in comparison to the 20-30+ MB/s I get from SSDs.
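
For the sequential side, a quick sanity check from the command line is hdparm’s timed read test (assuming the array is /dev/md0):

# Buffered sequential read speed of the whole array:
sudo hdparm -t /dev/md0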

As for monitoring, the Disk Utility in Ubuntu is very useful:

[Screenshots: Disk Utility showing the RAID-5 array and its member drives]

Here I can see the array sitting happily, with the three 500GB drives listed above (which can all be checked for SMART errors very easily).

[Screenshot: SMART data for one of the drives]
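
The same SMART checks can also be run from a terminal with smartmontools (the drive name is an example):

# Quick overall health verdict, then the full attribute table:
sudo smartctl -H /dev/sda
sudo smartctl -a /dev/sda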

Periodically, though, I run a “scrub” on the RAID-5 array to check whether any “bit-rot” (random corruption) has occurred. This sequentially reads all the data, compares it against the “parity” data, and corrects any errors – it takes about an hour on this 1TB array:

[Screenshot: scrub in progress]
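
Under the hood, a scrub is started by writing to the array’s sync_action file (again assuming md0), and progress appears in /proc/mdstat:

# Kick off a check of the whole array:
echo check | sudo tee /sys/block/md0/md/sync_action
# Watch the progress:
cat /proc/mdstat

On Debian/Ubuntu the mdadm package also ships a cron job (/usr/share/mdadm/checkarray) that runs this check automatically each month.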

It can also be checked from a remote location using Webmin!

[Screenshot: RAID status in Webmin]
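
mdadm can also keep watch by itself and send an email when a drive drops out (the address below is a placeholder):

# Monitor all arrays in the background and mail on failure events:
sudo mdadm --monitor --scan --daemonise --mail=you@example.com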

So those are the details of my RAID-5 array! If you use Linux a lot or want to set up a home server using RAID-5, or even RAID-6, then consider mdadm!
