Feb 11 2013

One of the obvious things to do with a ZFS storage pool is to increase the size of the disks in it – after all, disks get bigger and cheaper over time. It is not a very difficult thing to do, but it is always worth doing a quick search to find out what others have done before setting forth. And if nobody blogs their own experience, there’s nothing for anybody to find!

I started off with four 2Tbyte drives configured as two vdevs, each of which was a mirror, and I had two 3Tbyte disks to swap in. So I was going to be replacing both of the 2Tbyte drives in one of the vdevs with the 3Tbyte drives.

In the details below, I have a storage pool called zroot and the two disks being replaced are gpt/disk3 and gpt/disk2. As you will notice, I am growing the storage pool I boot off; however, the disks I am using do not contain a boot partition with the boot code.

The first job was to swap out one of the 2Tbyte drives. This was done by :-

  1. Take the disk to be swapped out offline: zpool offline zroot gpt/disk3
  2. Shut down the server and take the selected drive out. Swap the disk caddy over onto a new 3Tbyte drive, and slot that back in.
  3. Power on the server.
  4. Create a GPT partition table: gpart create -s gpt ada3
  5. Optionally create a swap partition: gpart add -t freebsd-swap -s 4G -l swap3 ada3
  6. Create a ZFS partition: gpart add -t freebsd-zfs -l disk3 ada3
  7. Replace the device: zpool replace zroot gpt/disk3
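
As a quick recap, the whole swap for the first drive boils down to the following (assuming, as above, that the new drive appears as ada3 and is labelled disk3; adjust the device and label names to suit your own system) :-

zpool offline zroot gpt/disk3
# shut down, physically swap the drive, power back on
gpart create -s gpt ada3
gpart add -t freebsd-swap -s 4G -l swap3 ada3
gpart add -t freebsd-zfs -l disk3 ada3
zpool replace zroot gpt/disk3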

Now is the time to wait for the resilvering process to complete. Once that has finished, the steps above can be repeated for the other drive in the vdev. Once the resilvering for that replacement has finished, you may want to check the size of the pool.

If the size has not increased, you may need to do: zpool online -e zroot gpt/disk2 gpt/disk3. This is needed when the pool’s autoexpand property is off (which is the default); with autoexpand=on the extra space is picked up automatically.
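
Checking is simple enough :-

zpool list zroot
zpool status zroot

The first shows the pool size (before and after), and the second shows whether the resilvering has finished.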

Oct 27 2009

Whether you are using UFS filesystems or ZFS storage pools, Solaris has a rather nifty way of migrating storage from one SAN to another with no (or little) downtime – or indeed moving from one disk to another for various other reasons. The key advantage of the following method is that it reduces or eliminates downtime. Even if your users can take the hit, not having to slowly watch a multi-terabyte filesystem copying from one disk to another is reason enough to use this technique.

Basically it works by using mirroring. Using mirroring to copy a disk might seem a little odd to begin with, but once you’ve seen it work you’ll be a fan.

For UFS (and SVM) Filesystems

This section assumes that the source disk device (cXXXXX) is set in the variable ${sourcedisk} and the destination is in ${destdisk}.

For UFS filesystems, the first step (which does require an outage) is to :-

  1. Stop the application that uses the filesystem being migrated.
  2. Unmount the filesystem.
  3. Encapsulate the existing filesystem device into an SVM metadevice: metainit d1001 1 1 ${sourcedisk}
  4. Create a mirror device with the new metadevice as a submirror: metainit d1000 -m d1001
  5. Change the references in /etc/vfstab from the old device name (${sourcedisk}) to the new mirror (not sub-mirror!) device – d1000
  6. Remount the filesystem and restart the application.
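
To make that concrete, here is the same sequence with a made-up source device of c1t0d0s0 and a filesystem mounted at /export/data (both names are purely illustrative) :-

sourcedisk=c1t0d0s0
# stop the application, then unmount the filesystem
umount /export/data
metainit d1001 1 1 ${sourcedisk}
metainit d1000 -m d1001
# in /etc/vfstab, change /dev/dsk/c1t0d0s0 and /dev/rdsk/c1t0d0s0
# to /dev/md/dsk/d1000 and /dev/md/rdsk/d1000
mount /export/data
# restart the application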

This should take no more than 10 minutes and is the only outage involved. There are two remaining sets of steps; the first is :-

  1. Create a new metadevice using the new disk: metainit d1002 1 1 ${destdisk}
  2. Attach the new metadevice to the mirror as an additional sub-mirror: metattach d1000 d1002

At this point, the mirror will start resilvering. It may take some time to complete, but the time it takes to do so does not really matter. In particular, the resilvering process should not cause a performance problem for your application – the application I/O takes priority.
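
You can keep an eye on how far it has got with :-

metastat d1000

which reports a resync-in-progress percentage against the new sub-mirror until it has finished.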

When the resilvering is complete :-

  1. Detach the sub-mirror containing the old SAN disk: metadetach d1000 d1001
  2. Clear the old metadevice, which is no longer required: metaclear d1001
  3. Attach “nothing” to the mirror metadevice (this is to ensure that the mirror grows to the size of the new submirror): metattach d1000
  4. Finally, ignore the warning on the manual page (which is outdated) and grow the filesystem: growfs -M /mount/point /dev/md/rdsk/d1000
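
Carrying on with the illustrative /export/data example from earlier, that final tidy-up would look like :-

metadetach d1000 d1001
metaclear d1001
metattach d1000
growfs -M /export/data /dev/md/rdsk/d1000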

You will see that I have used the metadevice names d1000 (for the mirror), d1001 (for the old sub-mirror), and d1002 (for the new submirror). Whatever device names you use, it is worth trying to be consistent – it helps a lot when you have dozens of filesystems to process.

ZFS Storage Pools

This is even simpler. If you have a storage pool called ${pool} which contains a single device called ${sourcedisk}, you simply :-

  1. Attach the new device: zpool attach ${pool} ${sourcedisk} ${destdisk}
  2. Wait for the resilvering to finish.
  3. Detach the old device: zpool detach ${pool} ${sourcedisk}
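
Step 2 is easy enough to watch: zpool status ${pool} will show the resilver in progress and tell you when it has completed.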

Of course, be wary of anything you read on the Internet! I have not actually tested the above; I’m merely regurgitating memory that has recently been exercised – I’m doing a SAN migration at work right now.

Oct 03 2009

Yesterday I went through the process of creating a ZFS storage pool with a single device :-

zpool create zt1 cXXXXX

Next I attached an additional device to mirror the first :-

zpool attach zt1 cXXXXX cYYYYY

I watched it resilver, and then detached the first replica, reducing the number of replicas to one :-

zpool detach zt1 cXXXXX

This is one of the nicest ways possible to migrate a large dataset from one set of devices to another (say replacing a SAN). However the documentation on Sun’s manual page for zpool is just a little vague in the relevant area and does not explicitly say that a single replica is a perfectly valid configuration.

This might all seem a little obvious, but removing a replica to reduce a storage pool to a pool without a mirror (no redundancy) is something that some volume managers don’t allow.

Feb 25 2009

Traditionally I have always mounted just the filesystems I needed in single user mode whilst tinkering in Solaris. Turns out this is a dumb method for ZFS filesystems.

What happens is that the zfs mount command will create any directories necessary to mount the filesystem. Later, once the tinkering is finished, those leftover directories can stop other ZFS filesystems from mounting, because ZFS refuses to mount a filesystem on top of a non-empty directory. This could be an argument for not creating hierarchies of filesystems, but that’s rather extreme.
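
As a hypothetical illustration (the dataset names are made up), suppose that in single user mode you mount just a child dataset :-

zfs mount rpool/export/home

That quietly creates /export and /export/home as ordinary directories on the root filesystem. After the next boot, the leftover /export/home directory means that :-

zfs mount rpool/export

fails with a “directory is not empty” complaint.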

The better solution is to mount all the ZFS filesystems in one go with :-

zfs mount -a
Jan 08 2009

If you think that you can use a ZFS volume as a Solaris LVM metadevice database, you will find out that you are wrong. Whilst it works initially, the LVM subsystem is initialised before ZFS at boot, so it cannot find the databases. Whilst this may seem to be a perverse configuration, at least one administrator has tried it – being me!
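
For the record, the sort of configuration that falls into this trap looks something like the following (the volume name is made up, and this is precisely what not to do); it appears to work until the next reboot, when the metadevice databases cannot be found :-

zfs create -V 64m rpool/metadb
metadb -a -f /dev/zvol/dsk/rpool/metadb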