I had originally planned to post a guide on setting up WordPress on SBS 2008 using Web Platform Installer, but last Thursday I tried to grow my Openfiler RAID 6 software array, and soon afterward something went horribly wrong. The drive I added apparently developed bad sectors during the grow process. When the reshape reached that area of the drive, instead of failing out the single drive, the logs indicate it failed all the drives on that controller (an Areca 1220). Also, my area had a power failure that outlasted the battery on my UPS. The logs indicate that apcupsd shut down the box before the UPS ran out of power, but when my Openfiler box came back up, mdadm reported the following error: “mdadm: Failed to restore critical section for reshape, sorry.” I tried various solutions, including updating mdadm, but the closest I could get was a hard system hang when mdadm brought the array up.
Now flukes happen, but I’ve been using this system for about 3 years, and this is the third time that adding drives or having drives fail has caused the Linux software RAID to fail in a way that required manual intervention. In my opinion, this should almost never happen with a RAID level that is supposed to survive multiple drive failures. While this was the first time I had to resort to restoring my data from backups, I just can’t recommend anyone use it for anything they care about. It’s too bad, because when it worked it was great. With Openfiler, I was able to set up iSCSI targets that could sustain 80+ MB/s, which outperforms many expensive commercial systems I’ve worked with. However, for redundant storage, its lack of graceful error handling excludes it from consideration in my opinion.
I’m looking at some combination of hardware RAID and unRAID. Even though unRAID is essentially Linux software RAID, in the worst case I would only lose the data on whichever non-parity disks failed. Of course, that comes at the cost of significantly limited performance compared to what I was seeing with Openfiler, but for my applications (SageTV, shared file storage, and NFS storage for non-critical XenServer VMs) it should be more than adequate.
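To make that worst-case reasoning concrete, here is a minimal Python sketch of single-parity protection over independent data disks, which is the general idea behind a dedicated-parity layout like unRAID's. It is a toy model under stated assumptions, not unRAID's actual code: the disk names, their contents, and the `xor_blocks` and `rebuild` helpers are all made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_disks, parity):
    """Reconstruct one missing data disk from the survivors plus the parity disk."""
    return xor_blocks(list(surviving_disks.values()) + [parity])


# Three hypothetical data disks. Each holds its own complete contents;
# nothing is striped across disks. All contents are the same length.
data_disks = {
    "disk1": b"recorded TV.",
    "disk2": b"shared files",
    "disk3": b"VM images...",
}

# The dedicated parity disk is the XOR of every data disk.
parity = xor_blocks(list(data_disks.values()))

# One data disk fails: its contents can be rebuilt from the rest plus parity.
surviving = {name: d for name, d in data_disks.items() if name != "disk2"}
assert rebuild(surviving, parity) == data_disks["disk2"]

# Two data disks fail: single parity cannot rebuild either of them,
# but the remaining disk is still readable on its own.
failed = {"disk1", "disk2"}
still_readable = {name: d for name, d in data_disks.items() if name not in failed}
print(sorted(still_readable))
```

Because each data disk stands alone in this model, losing more disks than parity can cover costs only the contents of the failed disks. In a striped array such as RAID 6, by contrast, every file depends on most or all of the members.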