Aug 252007
 

If you’re hoping to read about Linux finally getting ZFS (except as a FUSE module) then you are going to be disappointed … this is merely a rant about the foolishness shown by the open-source world. It seems that the reason we won’t see ZFS in the Linux kernel is not because of technical issues but because of licensing issues … the two open-source licenses (GPL and CDDL) are allegedly incompatible!

Now some may wonder why ZFS is so great given that most of the features are available in other storage/filesystem solutions. Well as an old Unix systems administrator, I have seen many different storage and filesystem solutions over time … Veritas, Solaris Volume Manager, the AIX logical volume manager, Linux software RAID, Linux LVM, …, and none come as close to perfection as ZFS. In particular ZFS is insanely simple to manage, and those who have never managed a server with hundreds of disks may not appreciate just how desireable this simplicity is.

Lets take a relatively common example from Linux; we have two disks and no RAID controller so it makes sense to use Linux software RAID to create a virtual disk that is a mirror of the two physical disks. Not a difficult task. Now we want to split that disk up into seperate virtual disks to put filesystems on; we don’t know how large the different filesystems will become so we need to have some facility to grow and shrink those virtual disks. So we use LVM and make that software RAID virtual disk into an LVM “physical volume”, add the “physical volume” to a volume group, and finally create “logical volumes” for each filesystem we want. Then of course we need to put a filesystem on each “logical volume”. None of these steps are particularly difficult, but there are 5 seperate steps, and the separate software components are isolated from each other … which imposes some limitations.

Now imagine doing the same thing with ZFS … we create a storage pool consisting of two mirrored physical disks with a single command. This storage pool is automatically mounted as a filesystem ready for immediate use. If we need separate filesystems, we can create each with a single command. Now we come to the advantages … filesystem ‘snapshots’ are almost instantaneous and do not consume additional disk space until changes are made to the original filesystem at which point the increase in size is directly proportional to the changes made. Each ZFS filesystem shares the storage pool with the size being totally dynamic (by default) so that you do not have a set size reserved for each filesystem … essentially the free space on every single filesystem is available to all filesystems.

So what is the reason for not having ZFS under Linux ? It is open-source so it is technically possible to add to the Linux kernel. It has already been added to the FreeBSD kernel (in “-CURRENT”) and will shortly be added to the released version of OSX. Allegedly because the license is incompatible. The ZFS code from Sun is licensed under the CDDL license and the Linux kernel is licensed under the GPL license. I’m not sure how they are incompatible because frankly I have better things to do with my time than read license small-print and try to determine the effects.

But Linux (reluctantly admittedly) allows binary kernel modules to be loaded into the kernel and the license on those certainly isn’t the GPL! So why is not possible to allow GPLed code and CDDLed code to co-exist peacefully ? After all it seems that if ZFS were compiled as a kernel module and released as a binary blob, it could then be used … which is insane!

The suspicion I have is that there is a certain amount of “not invented here” going on.