Dec 28 2025
 

If you happen to have tried upgrading Ubuntu 24.10 (probably – I didn’t check before getting this done) to Ubuntu 25.04 with ZFS, you will have realised that the upgrade is blocked because of known issues. Specifically (without having seen the issue personally), the upgrade stalls at the point where the userland ZFS tools have been upgraded but the old kernel (and its ZFS module) is still running.

Fair enough, but why hasn’t it been fixed? Or even a suggested work-around?

One suggestion I came across was to remove the ZFS storage pool(s), upgrade, and add them back in. For those not familiar with ZFS, this is done by simply importing the previously exported (or not) pool without loss of data.

Although backups are as always a good idea!
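
For anyone who hasn’t done it before, “removing” and “adding back” a pool is just an export followed (later) by an import – something along these lines, with “pool1” standing in for whatever your pool is actually called :-

zpool export pool1

zpool import
zpool import pool1

(zpool import with no arguments simply lists the pools available for import)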

But there’s more to the suggestion than that, so here are my working notes … the ones written down long-hand with a pen on paper (something I rarely do these days) :-

  1. Shut down the virtual machines.
  2. Shut down the gooey (the GUI) – as in shut down the applications and return to the login screen.
  3. Switch to a text console (most of the work is done here).
  4. Shut down the containers.
  5. Unmount the ZFS filesystems :-
    • zfs unmount -a
    • Which failed, so I killed off various running processes with pkill -u ${USER}
    • A second zfs unmount -a also failed, and I had to kill off various other processes until it worked.
  6. Exported the pool – which failed, even on a second, forced attempt.
  7. Removed the ZFS packages :-
    • dpkg --remove zfs-zed zfsutils-linux zfs-dkms
  8. Rebooted, as the kernel still thought ZFS was enabled (the module was still loaded).
  9. Upgrade started in text mode.
    • Skips past the ZFS block and completes normally.
  10. Added the ZFS packages back (zfs-zed, zfsutils-linux, zfs-dkms) and imported the pool (roughly as sketched below). This did issue a dire warning about potential data loss with ZFS and this version of the Linux kernel. With any luck this is an outdated warning, and perhaps more to do with ZFS root.
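
For completeness, that last step amounts to something like the following – again with “pool1” standing in for the real pool name :-

apt install zfs-dkms zfsutils-linux zfs-zed
zpool import
zpool import pool1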

But that dire warning is probably reason enough to avoid the upgrade.

[Photo: The Lighthouse – a model lighthouse in a lake.]
Jul 18 2021
 

This is a procedure to replace one working drive in a fully functional mirror vdev; if you are replacing a failed disk there is no advantage in following this procedure. Although if you have a somewhat functional disk it may be worth trying.

So why not simply yank out the working disk you want to replace? Well, you can of course and that would work, but there is nothing Murphy likes more than a mirrored vdev temporarily down to a single disk – the stress of resilvering onto a new disk raises the chance that the remaining, previously working disk will fail (I have actually seen this happen).

So I’m going to describe how to make a three-way mirror with three disks and then detach the disk you wanted to replace.

To do this there are some prerequisites :-

  1. You will need space to install an additional disk into your system; perhaps temporarily in an “unsuitable” location.
  2. You will need a spare SATA controller port to plug the new disk into – if necessary by adding a PCIe SATA controller (which sounds expensive, but safety is worth the cost).
  3. You will need a SATA data cable and a SATA power cable.

The first step is to make very careful note of what devices you are going to “swap over” – ideally using their WWNs. If you don’t use WWNs, sorting out which disk is which is going to be a bit trickier.
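
On Linux the WWNs can be read straight out of /dev/disk/by-id, and zpool status will show you which devices the pool is currently built from – “pool1” again being a stand-in for your pool name :-

ls -l /dev/disk/by-id/ | grep wwn-
zpool status pool1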

The second step is to practice the steps involved using a ‘fake’ storage pool backed by tiny disk files :-

# cd /pool1/temp
# for w in one two three
do
  dd if=/dev/zero of=test-disk-${w}.img bs=1M count=1000
done
# zpool create test mirror /pool1/temp/test-disk-one.img /pool1/temp/test-disk-two.img
# zpool attach test /pool1/temp/test-disk-one.img /pool1/temp/test-disk-three.img
# zpool detach test /pool1/temp/test-disk-one.img

That’s pretty much it in a nutshell.
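
Once you are happy with the dry run, the practice pool can be thrown away – something like :-

zpool destroy test
rm /pool1/temp/test-disk-*.img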

The real process is a bit more disturbing of course, and most of the work is physical. The first difference from the practice run is that when you attach the new disk to one or other of the existing devices within the mirror, you will have to wait until the resilvering process completes.
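
In other words the real attach and detach look much the same as the practice run, just with real device names – a sketch only, with made-up WWNs and “pool1” as the pool name :-

zpool attach pool1 /dev/disk/by-id/wwn-0x5000c500aaaaaaaa /dev/disk/by-id/wwn-0x5000c500cccccccc

zpool status pool1
(wait for the resilver to finish before going any further)

zpool detach pool1 /dev/disk/by-id/wwn-0x5000c500aaaaaaaa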

Whilst zpool status will give you an estimate of how long that will take, the estimate that you get :-

  scan: resilver in progress since Sun Jul 18 08:20:54 2021
	8.25T scanned at 1.09G/s, 7.28T issued at 981M/s, 8.25T total
	995G resilvered, 88.23% done, 0 days 00:17:16 to go

(Only showing the relevant part as the full output from my system is confusing and deceptive)

Is wildly inaccurate – partly because the resilvering process takes second place to any ordinary file system activity. My own rule of thumb (1 hour per Tbyte) is probably also wildly inaccurate; basically it is done when it is done.

Detaching the old device is fast – you won’t need to sit down to wait for it.

Jun 21 2020
 

If you are just running Ubuntu with ZFS without poking into the details, you may not be aware of the scrubber running. For background information, and for the benefit of those who prefer to go their own way, this is all about that little scrubber.

A pool scrub operation is where the kernel reads through all of the data in a pool, verifies it against its checksums, and makes any necessary repairs. Whilst ZFS does check the integrity of data (using checksums) whenever it is read, a regular scrub catches and repairs problems in advance – including in data that might otherwise not be read for months.

It need only be run weekly for larger systems or monthly for normal systems (it’s a pretty arbitrary borderline), and it can be started manually with :-

# zpool scrub pool0

(“pool0” being the name of the pool to scrub)
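
On a stock Ubuntu install the zfsutils-linux package schedules regular scrubs for you (hence the scrubber you may not have been aware of); if you are going your own way, a root crontab entry along these lines will do the job – “pool0” again being the pool to scrub :-

0 2 1 * * /sbin/zpool scrub pool0

(that kicks off a scrub at 02:00 on the first of every month)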

Whilst a scrub is going on in the background, the only effect on the system is that disk accesses to that pool will be slightly slower than normal. Usually not enough to notice unless you are benchmarking!

When in progress the output of zpool status pool0 will show the current state and how long it is expected to take to complete the scrub. Once finished the status will look like :-

# zpool status | grep scan:
  scan: scrub repaired 0B in 0 days 09:19:27 with 0 errors on Sun Jun 21 10:36:28 2020

May 17 2020
 

There are two aspects to ZFS that I will be covering here – checksums and error-correcting memory. The first is a feature of ZFS itself; the second is a feature of the hardware that you are running and some claim that it is required for ZFS.

Checksums

By default ZFS keeps checksums of the blocks of data that it writes, to later verify that a data block hasn’t been subject to silent corruption. If it detects corruption, it can use whatever redundancy is available (mirrors, RAID-Z, or extra copies) to correct it, or at least flag that there is a problem.

If you have only one disk and don’t ask to keep multiple copies of each block, then checksums will do little more than protect the most important metadata and tell you when things go wrong.
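
If you are in that single-disk situation, you can at least ask ZFS to keep extra copies of every block so the checksums have something to repair from – at the cost of the corresponding disk space. A sketch, with a made-up dataset name; it only applies to newly written data :-

zfs set copies=2 pool0/home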

All that checksum calculation does make file operations slightly slower but frankly without benchmarks you are unlikely to notice. And it gives extra protection to your data.

For those who do not believe that silent data corruption exists, take a look at the relevant Wikipedia page. Everyone who has old enough files has come across occasional weird corruption in them, and whilst there are many possible causes, silent data corruption is certainly one of them.

Personally I feel that a probably unnoticeable loss of performance is more than balanced by the greater data resilience.

Error-Correcting Memory

(Henceforth “ECC”)

I’m an enthusiast for ECC memory – my main workstation has a ton of it, and I’ve insisted on ECC memory for years. I’ve seen errors being corrected (although that was back when I was running an SGI Indigo2). Reliability is everything.

However there are those who will claim you cannot run ZFS without ECC memory. Or that ZFS without ECC is more dangerous than any other file system without ECC.

Not really.

Part of the problem is that those with the most experience of ZFS are salty old Unix veterans who are justifiably contemptuous of server hardware that lacks ECC memory (that includes me). We would no sooner consider running a serious file server on hardware that lacks ECC memory than rely on disk ‘reliability’ and not mirror or RAID those fallible pieces of spinning rust.

ZFS will run fine without ECC memory.

But does the lack of ECC make ZFS any worse than other file systems?

It’s exceptionally unlikely – there are arguable examples of exceptionally esoteric failure conditions that may make things worse (the “scrub of death”) but I side with those who feel that such situations are not likely to occur in the real world.

And as always, why isn’t your data backed up anyway?

Apr 26 2020
 

Experimenting with Ubuntu’s “new” (relatively so) ZFS installation option is all very well, but encryption is not optional for a laptop that is taken around the place.

Perhaps I should have spent more time poking around the installer looking for the option, but enabling encryption post-install isn’t so difficult.

The first step is to create an encrypted filesystem – encryption only works on newly created filesystems and cannot be turned on later :-

zfs create -o encryption=on \
  -o keyformat=passphrase \
  rpool/USERDATA/ehome

You will be asked for the passphrase as it is created. Forgetting this is extremely inadvisable!

Once created, reboot to check that :-

  1. You get prompted for the passphrase (as of Ubuntu 20.04 you do).
  2. That the encrypted filesystem gets mounted automatically (likewise) – see the quick check sketched below.
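
That quick check from the command line looks something like this – the properties refer to the filesystem created above :-

zfs get encryption,keyformat,keystatus rpool/USERDATA/ehome
zfs mount | grep ehome

(keystatus should show “available” once the passphrase has been entered)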

At this point you should be able to create the filesystems for the relevant home directories :-

zfs create rpool/USERDATA/ehome/root
cd /root
rsync -arv . /ehome/root
cd /
zfs set mountpoint=/root rpool/USERDATA/ehome/root
(An error will result because something is already mounted at /root, but the mountpoint property is still set)
zfs set mountpoint=none rpool/USERDATA/root_xyzzy
(A similar error, for the same reason)

Repeat this for each user on the system and reboot; check that you can log in and that your files are present.
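
If there are more than a couple of users, a loop saves some typing – a sketch only, with made-up usernames, assuming the same layout as above and that the old datasets carry Ubuntu’s usual suffix (shown here as “_xyzzy”) :-

for u in alice bob
do
  zfs create rpool/USERDATA/ehome/${u}
  rsync -a /home/${u}/ /ehome/${u}/
  zfs set mountpoint=/home/${u} rpool/USERDATA/ehome/${u}
  zfs set mountpoint=none rpool/USERDATA/${u}_xyzzy
done

(the same mountpoint errors as above will appear for each user)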

This leaves the old unencrypted home directories around (which can be removed with zfs destroy -r rpool/USERDATA/root_xyzzy). It is possible that this re-arrangement of how home directories work will break some of Ubuntu’s features – such as scheduled snapshots of home directories (which is why the destroy command above needs the “-r” flag, to remove any such snapshots along with the dataset).

But it’s getting there.