Tuesday, September 4, 2007

Deploying a USB-based ZFS storage pool at home

I recently bought a Sun Ultra 20 M2 workstation which I mainly use as a Java development platform. Nevertheless, I wanted to take advantage of Solaris 10 in my home network too and, after much reading about ZFS, the first thing I wanted to implement was a personal file server.

I started digging into the official documentation at OpenSolaris.org and at Sun to discover the full range of ZFS features and determine the best setup for a small 3-user network. The minimum system requirements were met, so the first decision I had to make was: which devices should I populate my zpool with? The requirements I had were very simple:
  • I needed enough storage to back up 3 machines weekly; a rough estimate was 30 GB per machine, so a total of 100 GB would be sufficient for the moment (for the sake of simplicity, backups are going to be done with rsync);
  • 100 GB would be required to host and share my CD collection between all the clients I have at home;
  • all the extra storage would be welcome and used as "scratch" space;
  • a replication scheme should be implemented. Priority is given to storage rather than to performance.
So, which kind of disks was I going to use?

This wonderful machine, unfortunately, natively supports only two internal SATA(/SAS) disks and the available I/O ports were:
  • 6 USB 2.0 (2 in the front and 4 in the back)
  • 2 FireWire 400 (IEEE 1394a)
There was not much to play with. One option I considered was buying an additional SATA controller to drive some additional SATA disks. Solaris Express Developer Edition 05/07 includes the following drivers:
  • marvell88sx (Marvell 88SX SATA controller)
  • si3124 (SiliconImage 3124/3132 SATA controller)
  • ahci (Intel ICH6 and VIA vt8251 SATA controllers)
This option was particularly interesting because of the superior performance of a SATA disk compared to a USB 2.0 high-speed device. Many controllers I've seen use either Marvell or SiliconImage chips, and eventually a server-class SuperMicro controller caught my attention. My workstation, however, is not equipped with PCI-X slots and, even though that controller is PCI compatible, I didn't want to run an underpowered solution, so this option was finally discarded. Had I known this earlier, I would have seriously considered buying a Sun Ultra 40, which supports up to 8 internal SATA drives.

At this point I was left to choose between FireWire and USB. The FireWire ports on the Ultra 20 M2 are IEEE 1394a, whose data rate is limited to 400 Mbit/s, slightly lower than the 480 Mbit/s of USB 2.0. Nevertheless, FireWire's peer-to-peer architecture and its support for memory-mapped devices allow a more efficient approach to storage that consumes less CPU, so FireWire appeared to be an attractive solution. Unfortunately, two FireWire ports are not enough for me, because they leave no room for any replication scheme other than a two-way mirror, which I would rather avoid since I want to maximize usable storage.
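To put the two nominal bus speeds in more familiar units, here is a minimal sketch that converts them to MB/s and estimates how long 100 GB would take at the theoretical peak. The function names are my own, and real-world throughput on either bus is noticeably lower than these ceilings.

```shell
#!/bin/sh
# Theoretical peak throughput in MB/s for a bus rated in Mbit/s (8 bits per byte).
peak_mb_per_s() {
    echo $(( $1 / 8 ))
}

# Seconds needed to move SIZE_GB (decimal GB) at the bus's theoretical peak.
# Integer arithmetic, so the result is rounded down.
transfer_seconds() {
    size_gb=$1
    mbit=$2
    echo $(( size_gb * 1000 * 8 / mbit ))
}

peak_mb_per_s 400          # FireWire 400 -> 50
peak_mb_per_s 480          # USB 2.0      -> 60
transfer_seconds 100 400   # -> 2000
transfer_seconds 100 480   # -> 1666
```

On paper the difference is small, which is why the CPU overhead and the number of available ports mattered more to me than the raw figures.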

ZFS, indeed (as of Solaris Express Developer Edition 05/07), provides data redundancy in two flavors: mirrored or RAID-Z. An n-way mirror is a set of n disks, each holding a full copy of the data (so n-1 redundant copies are made during writing). Such a mirror can survive the failure of up to n-1 disks and provides (roughly) parallel read access to the n copies of the data. RAID-Z is available with single or double parity. Paraphrasing the official documentation, a RAID-Z configuration with n disks of size x and p parity disks can hold approximately (n-p)*x bytes and can withstand p devices failing before data integrity is compromised.
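The trade-off between the two flavors is easier to see side by side. The following is a rough sketch (the function name and output format are my own) that prints, for n equal disks of size x, the approximate usable capacity and the number of device failures each layout tolerates. It ignores metadata overhead.

```shell
#!/bin/sh
# Compare redundancy layouts for N equal disks of size X (any unit).
# Each output line: layout, approximate usable capacity, tolerated failures.
compare_layouts() {
    n=$1
    x=$2
    # An n-way mirror stores one disk's worth of data and survives n-1 failures.
    echo "mirror $x $(( n - 1 ))"
    # RAID-Z with p parity disks stores about (n - p) * x and survives p failures.
    echo "raidz1 $(( (n - 1) * x )) 1"
    echo "raidz2 $(( (n - 2) * x )) 2"
}

compare_layouts 3 350
# mirror 350 2
# raidz1 700 1
# raidz2 350 2
```

With three disks, single-parity RAID-Z gives twice the space of a three-way mirror at the price of tolerating only one failure, which matches my priority of storage over everything else.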

In the end, I bought three 350 GB LaCie USB 2.0 desktop disks and created my first zpool with them. I also added an old 200 GB Iomega disk as a cache device, since it made no sense to use it as a pool device. As the usable size of a RAID-Z group is dictated by its smallest disk, a (350-350-350-200) GB RAID-Z1 configuration would be equivalent to a (200-200-200-200) GB one, which turns out to be roughly 600 GB. A (350-350-350) GB RAID-Z1 configuration gives roughly 700 GB, which is 100 GB more!
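The arithmetic behind that decision can be checked with a few lines of shell. This is only a sketch of the rule of thumb used above (smallest disk times the number of non-parity disks); the function name is hypothetical and the result ignores filesystem overhead.

```shell
#!/bin/sh
# Approximate usable capacity of a RAID-Z group with disks of different sizes.
# Usage: raidz_capacity PARITY SIZE [SIZE...]   (sizes in any common unit)
raidz_capacity() {
    parity=$1
    shift
    count=$#
    smallest=$1
    # Every disk contributes only as much as the smallest one.
    for size in "$@"; do
        if [ "$size" -lt "$smallest" ]; then
            smallest=$size
        fi
    done
    echo $(( (count - parity) * smallest ))
}

raidz_capacity 1 350 350 350 200   # four disks, limited by the 200 GB one -> 600
raidz_capacity 1 350 350 350       # three equal disks -> 700
```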

The zpool was easily created with just one command and a few seconds wait:

# zpool create tank raidz c2t0d0 c3t0d0 c4t0d0 cache c6t0d0
# zpool status
  pool: tank
 state: ONLINE
 scrub: scrub completed with 0 errors on Sat Apr 19 00:58:31 2008

        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
          c6t0d0    ONLINE       0     0     0

errors: No known data errors
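Since the disks hang off a USB bus, it seemed worth checking the pool regularly. Below is a minimal sketch of such a check built on zpool status -x. The function name and the ZPOOL override variable are my own, and the sketch assumes that the message printed for a healthy pool contains the words "is healthy".

```shell
#!/bin/sh
# Command used to query the pool; overridable for experiments (hypothetical knob).
ZPOOL=${ZPOOL:-zpool}

# Print a one-line verdict for POOL and return non-zero if it needs attention.
check_pool() {
    pool=$1
    # "status -x" reports only pools with problems; a healthy pool yields
    # a short message (assumed here to contain "is healthy").
    status=$("$ZPOOL" status -x "$pool" 2>&1)
    case $status in
        *"is healthy"*)
            echo "OK: $pool"
            return 0
            ;;
        *)
            echo "ATTENTION: $pool"
            echo "$status"
            return 1
            ;;
    esac
}

# Example (run on the file server):
# check_pool tank
```

A periodic zpool scrub tank followed by this check would be one way to notice a failing disk before a second one gives up.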

Now the pool is ready to host filesystems. I created one filesystem for each user's home directory and set a custom quota on each of them. The home filesystems are automounted at user login with just a one-liner (assuming the home filesystems are created in the /tank/zones/ssh-zone/home subtree):

# cat /etc/auto_home

* -fstype=lofs :/tank/zones/ssh-zone/home/&
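For completeness, creating a home filesystem and setting its quota boils down to two zfs commands per user. The sketch below wraps them in a small function. It assumes the parent dataset tank/zones/ssh-zone/home already exists with default mount points; the user names and quota sizes are purely illustrative, and the ZFS override variable is my own addition.

```shell
#!/bin/sh
# Command used to manage datasets; overridable for a dry run (hypothetical knob).
ZFS=${ZFS:-zfs}

# Assumed parent dataset, mounted at /tank/zones/ssh-zone/home.
HOME_PARENT=tank/zones/ssh-zone/home

# Create one filesystem for USER and cap it at QUOTA (e.g. 20G).
create_home() {
    user=$1
    quota=$2
    "$ZFS" create "$HOME_PARENT/$user" &&
    "$ZFS" set quota="$quota" "$HOME_PARENT/$user"
}

# Example (hypothetical users, run as root on the file server):
# create_home alice 50G
# create_home bob 20G
```

Running it with ZFS=echo prints the commands instead of executing them, which is handy for checking the dataset names first.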

I also set gzip compression for every filesystem with

# zfs set compression=gzip [filesystem-name]
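As a variation, the property can be set once on a parent dataset and then verified for the whole subtree. In this sketch the parent dataset name is an assumption based on the path used above, and the function name and ZFS override variable are hypothetical. Children inherit the setting unless they define their own, and compression applies only to data written after the property is set.

```shell
#!/bin/sh
# Command used to manage datasets; overridable for a dry run (hypothetical knob).
ZFS=${ZFS:-zfs}

# Enable gzip compression on PARENT, then list the effective setting and the
# achieved compression ratio for PARENT and every dataset below it.
compress_tree() {
    parent=$1
    "$ZFS" set compression=gzip "$parent" &&
    "$ZFS" get -r compression,compressratio "$parent"
}

# Example (assumed dataset name, run as root on the file server):
# compress_tree tank/zones/ssh-zone/home
```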

The whole process took me less than half an hour (most of which I spent reading the zpool and zfs man pages) and I now have a single-parity RAID-Z pool hosting a quota-based, gzip-compressed filesystem for every user, all set up in no time thanks to ZFS.
