Booting Ubuntu 24.04 with ZFS Root and systemd-boot: A Hard-Earned Guide

Before diving into the technical details, it's important to understand what makes this installation unique and why it required patience and persistence. Unlike typical Ubuntu installations that rely on GRUB and ext4, this system was crafted with a ZFS root on striped NVMe drives (though a mirror would have provided better redundancy) and uses systemd-boot as the bootloader. The result is a minimalist, robust setup that boots quickly, is resilient against disk errors, and is designed with flexibility in mind.

We didn't arrive here easily. GRUB was our first attempt — but repeated issues with its interaction with ZFS led us to abandon it. Ultimately, systemd-boot proved to be the cleaner and more reliable alternative. We also chose to let ZFS manage entire drives instead of carving out partitions, giving it full control over data layout and fault management. This article documents every step, the lessons learned, and the reasoning behind each decision.

Overview

This article documents the full process of building a clean, reliable Ubuntu desktop system that:

  • Boots with systemd-boot (no GRUB)
  • Uses a ZFS stripe as root
  • Avoids the common pitfalls of service misordering and boot crashes
  • Brings up a full MATE desktop environment on top

This guide was written after over 10 hours of real-world trial, error, and persistence — including several reinstalls, blind alleys, and ultimately, success.

Starting Point

We began by booting from the Ubuntu Server 24.04 USB installer, choosing the minimal install route to avoid Snap preloads and desktop bloat.

Before anything else, we installed the necessary tools into the live environment:

sudo apt update
sudo apt install zfsutils-linux debootstrap

These two packages are essential — zfsutils-linux is needed to create and manage ZFS pools, while debootstrap allows us to bootstrap a minimal Ubuntu system manually.

Preparing the Disks

  1. Wiped the partition tables entirely using sgdisk --zap-all and parted (a sketch follows this list).
  2. Used two FireCuda NVMe drives in a ZFS stripe, without any separate boot partition on them.
  3. Created a 2GiB EFI partition on a separate NVMe drive, formatted it vfat, and mounted it at /boot/efi. This partition holds all the kernels, initramfs images, and loader configs that come later. It could have been carved out of one of the ZFS drives instead, but with a spare disk available, keeping the ESP on it lets ZFS own whole disks.
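
As a rough sketch of the wipe step, using the device paths from this system (double-check yours before running anything destructive):

sgdisk --zap-all /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1
sgdisk --zap-all /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK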

Creating the EFI Partition

To find the boot disk to be used:

akadata@homer:~$ ls -l /dev/disk/by-id/ | grep addlink
lrwxrwxrwx 1 root root 13 May 11 22:21 nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799 -> ../../nvme2n1

Then use parted to create a new GPT label and an EFI partition:

parted /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799
(parted) mklabel gpt
(parted) mkpart ESP fat32 1MiB 2GiB
(parted) set 1 boot on
(parted) quit

Format the partition as FAT32:

mkfs.vfat -F32 /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799-part1

ZFS Root Pool

We created the ZFS pool manually with the two FireCuda drives, referenced by their /dev/disk/by-id paths. The full set of options is shown here in the mirror form; the stripe we actually used, and other layouts, follow below:

zpool create -o ashift=12 -O acltype=posixacl -O xattr=sa \
  -O relatime=off -O dnodesize=auto -O normalization=formD \
  -O mountpoint=none -O compression=off \
  -R /mnt tank mirror \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1 \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK

To create a stripe instead of a mirror (faster, no redundancy):

zpool create -R /mnt tank \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1 \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK

To create a 3-disk raidz1:

zpool create -R /mnt tank raidz1 \
  /dev/disk/by-id/nvme-disk1 \
  /dev/disk/by-id/nvme-disk2 \
  /dev/disk/by-id/nvme-disk3

To create a 4-disk raidz2:

zpool create -R /mnt tank raidz2 \
  /dev/disk/by-id/nvme-disk1 \
  /dev/disk/by-id/nvme-disk2 \
  /dev/disk/by-id/nvme-disk3 \
  /dev/disk/by-id/nvme-disk4

We later enabled compression selectively per dataset.
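
Enabling compression on a dataset later is just a property change; for example (lz4 shown here, dataset name from this layout):

zfs set compression=lz4 tank/home
zfs get compression tank/home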

System Preparation and Chroot Setup

Once the ZFS pool and EFI partition were created, the next steps involved preparing the Ubuntu system manually.

1. Create ZFS datasets for the root and home:

zfs create -o mountpoint=/ tank/ROOT
zfs create -o mountpoint=/ tank/ROOT/noble

zfs create -o mountpoint=/home tank/home

Before bootstrapping, mount the EFI partition at the correct location inside the target root (the purpose of the dataset layout is explained just below):

mkdir -p /tank/ROOT/noble/boot/efi
mount /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799-part1 \
/tank/ROOT/noble/boot/efi

We use tank/ROOT as a parent dataset for all bootable root environments. This structure allows for clean isolation of different OS roots (e.g., noble, arch, lfs) beneath a single management namespace. Each subdataset under tank/ROOT represents a full root filesystem. In this case, tank/ROOT/noble is used for the Ubuntu Noble installation.
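
As an illustration of that layout, another boot environment could later be created alongside noble (the name here is just an example); canmount=noauto keeps alternate roots from all trying to mount at / during boot:

zfs create -o mountpoint=/ -o canmount=noauto tank/ROOT/arch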

2. Bootstrap the base system using debootstrap:

debootstrap --include=linux-image-generic,linux-headers-generic,vim,sudo noble \
/tank/ROOT/noble http://archive.ubuntu.com/ubuntu

3. Bind necessary virtual filesystems and chroot:

mount --rbind /dev /tank/ROOT/noble/dev
mount --rbind /proc /tank/ROOT/noble/proc
mount --rbind /sys /tank/ROOT/noble/sys
mount --rbind /run /tank/ROOT/noble/run
mount -t efivarfs efivarfs /tank/ROOT/noble/sys/firmware/efi/efivars
chroot /tank/ROOT/noble

Note: Mounting efivarfs is critical for successful EFI bootloader installation and configuration.

4. Inside the chroot, generate the initramfs and enable the basic services:

update-initramfs -c -k all
systemctl enable ssh
systemctl set-default graphical.target
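
Two assumptions are worth spelling out here: enabling ssh only works if openssh-server is present, and for the initrd to import the pool and mount the ZFS root at boot, the target needs the ZFS userland and initramfs hooks. A minimal sketch, run inside the chroot before regenerating the initramfs:

apt update
apt install openssh-server zfsutils-linux zfs-initramfs
update-initramfs -c -k all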

Bootloader: systemd-boot

Installation

We opted for systemd-boot instead of GRUB:

apt install systemd-boot-efi
bootctl --path=/boot/efi install
bootctl status
bootctl update

bootctl status Output

System:
      Firmware: UEFI 2.90 (American Megatrends 5.32)
 Firmware Arch: x64
   Secure Boot: disabled
  TPM2 Support: yes
  Measured UKI: no
  Boot into FW: supported

Current Boot Loader:
      Product: systemd-boot 255.4-1ubuntu8
     Features: ✓ Boot counting
               ✓ Menu timeout control
               ✓ One-shot menu timeout control
               ✓ Default entry control
               ✓ One-shot entry control
               ✓ Support for XBOOTLDR partition
               ✓ Support for passing random seed to OS
               ✓ Load drop-in drivers
               ✓ Support Type #1 sort-key field
               ✓ Support @saved pseudo-entry
               ✓ Support Type #1 devicetree field
               ✓ Enroll SecureBoot keys
               ✓ Retain SHIM protocols
               ✓ Menu can be disabled
               ✓ Boot loader sets ESP information
          ESP: /dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538
         File: └─/EFI/systemd/systemd-bootx64.efi

Random Seed:
 System Token: set
       Exists: yes

Available Boot Loaders on ESP:
          ESP: /boot/efi (/dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538)
         File: ├─/EFI/systemd/systemd-bootx64.efi (systemd-boot 255.4-1ubuntu8)
               └─/EFI/BOOT/BOOTX64.EFI (systemd-boot 255.4-1ubuntu8)

Boot Loaders Listed in EFI Variables:
        Title: Linux Boot Manager
           ID: 0x0000
       Status: active, boot-order
    Partition: /dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538
         File: └─/EFI/systemd/systemd-bootx64.efi

        Title: UEFI OS
           ID: 0x0001
       Status: active, boot-order
    Partition: /dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538
         File: └─/EFI/BOOT/BOOTX64.EFI

Boot Loader Entries:
        $BOOT: /boot/efi (/dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538)
        token: 9270275906c34edd9126b5905f20cd8e

Default Boot Loader Entry:
         type: Boot Loader Specification Type #1 (.conf)
        title: Ubuntu 24.04 LTS
           id: 9270275906c34edd9126b5905f20cd8e-6.8.0-31-generic.conf
       source: /boot/efi//loader/entries/9270275906c34edd9126b5905f20cd8e-6.8.0-31-generic.conf
     sort-key: ubuntu
      version: 6.8.0-31-generic
   machine-id: 9270275906c34edd9126b5905f20cd8e
        linux: /boot/efi//9270275906c34edd9126b5905f20cd8e/6.8.0-31-generic/linux
       initrd: /boot/efi//9270275906c34edd9126b5905f20cd8e/6.8.0-31-generic/initrd.img-6.8.0-31-generic
      options: root=ZFS=tank/ROOT/noble rw systemd.machine_id=9270275906c34edd9126b5905f20cd8e


This installed its own bootloader and automatically generated loader entries under /boot/efi/loader/entries/. While this may work, it lacks explicit control over boot parameters such as specifying the exact ZFS root dataset, which is essential for reliability in multi-dataset configurations.

Before proceeding, ensure basic system identity is configured:

echo "homer" > /etc/hostname
echo "127.0.1.1 homer" >> /etc/hosts

To set up our own manually managed entry, we first created a directory on the ESP for the kernel and initramfs:

mkdir -p /boot/efi/ubuntu

Then we manually copied the kernel and initramfs into that location:

cp /boot/vmlinuz /boot/efi/ubuntu/
cp /boot/initrd.img /boot/efi/ubuntu/

We also copied the versioned files to retain flexibility:

cp /boot/vmlinuz-6.8.0-31-generic /boot/efi/ubuntu/
cp /boot/initrd.img-6.8.0-31-generic /boot/efi/ubuntu/

We also created our own loader entry manually for clarity:

/boot/efi/loader/entries/ubuntu.conf

title   Ubuntu Noble (ZFS)
linux   /ubuntu/vmlinuz
initrd  /ubuntu/initrd.img
options root=ZFS=tank/ROOT/noble rw

Both this and the UUID-based autogenerated one worked, but we preferred clarity.

Note: Every time you update the kernel or initramfs, you must repeat the copy step to ensure /boot/efi/ubuntu/ has the latest versions. Otherwise, the bootloader will not see your updates. While bootctl will also install and update its own kernel and initrd copies in a different managed location, those do not include the manual ZFS root parameters unless explicitly configured. Our approach gives clear control over what is booted and ensures consistency with the expected root dataset.
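
One way to make the copy step harder to forget is a small helper script run after each kernel update. This is only a sketch for the layout above; the script name and location are arbitrary:

#!/bin/sh
# /usr/local/sbin/update-esp -- refresh the kernel and initramfs copies on the ESP.
# /boot/vmlinuz and /boot/initrd.img are symlinks to the newest installed versions,
# and cp follows them, so the files under /boot/efi/ubuntu/ track the latest kernel.
set -e
cp /boot/vmlinuz /boot/initrd.img /boot/efi/ubuntu/
echo "Copied $(readlink /boot/vmlinuz) and $(readlink /boot/initrd.img) to /boot/efi/ubuntu/"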

Pitfalls and Problems

1. zfs-mount.service and zfs-volumes.target

Enabling these too early caused multiple boot failures:

  • Stalls during boot
  • Hanging zfs-mount.service waiting for non-existent zvols

Fix: We masked them completely:

ln -s /dev/null /etc/systemd/system/zfs-mount.service
ln -s /dev/null /etc/systemd/system/zfs-volumes.target
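
The same result can be achieved with systemctl, which creates exactly these /dev/null symlinks:

systemctl mask zfs-mount.service zfs-volumes.target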

2. Root Password & sudo

After bootstrapping the system with debootstrap, we had no root password and no sudo.

Fix: Set a root password (from recovery mode or an init=/bin/bash shell), install sudo, and add your user to the sudo group:

apt install vim sudo
usermod -aG sudo youruser

Optionally, to allow your user to run sudo commands without entering a password, add the following line to `/etc/sudoers` using `visudo` (visudo validates the syntax before saving):

username ALL=(ALL) NOPASSWD: ALL

Replace `username` (and `youruser` above) with your actual login name.

3. Multiple Reinstalls

Many failures required us to:

  • Re-wipe ZFS pools
  • Correct EFI partitions
  • Rebuild zpool.cache (see below)
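
Rebuilding the cache file is a single property set on the pool (the path shown is the stock location):

zpool set cachefile=/etc/zfs/zpool.cache tank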

Persistence paid off.

akadata@homer:/home/akadata# sudo zpool status
  pool: tank
 state: ONLINE
config:

	NAME                                                STATE     READ WRITE CKSUM
	tank                                                ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1  ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK  ONLINE       0     0     0

errors: No known data errors

ZFS supports automated scrubs to help detect and correct silent data corruption. To keep the pool healthy, enable a periodic scrub. The scrub timers shipped with recent OpenZFS releases are templated per pool (Ubuntu's zfsutils-linux also installs a monthly scrub cron job), so the unit name includes the pool:

systemctl enable zfs-scrub-monthly@tank.timer

This will automatically scrub the pool on a regular schedule behind the scenes.

You can also scrub manually any time with:

zpool scrub tank

And check the progress:

zpool status

akadata@homer:/home/akadata# sudo zpool scrub tank
akadata@homer:/home/akadata# sudo zpool status
  pool: tank
 state: ONLINE
  scan: scrub in progress since Mon May 12 08:20:15 2025
	30.4G / 30.4G scanned, 5.63G / 30.4G issued at 5.63G/s
	0B repaired, 18.51% done, 00:00:04 to go
config:

	NAME                                                STATE     READ WRITE CKSUM
	tank                                                ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1  ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK  ONLINE       0     0     0

errors: No known data errors

Final Boot Sequence

  • EFI loads systemd-boot
  • systemd-boot selects Ubuntu with ZFS=tank/ROOT/noble
  • No zfs-mount or zfs-volumes services interfere
  • initrd imports the pool and mounts /
  • MATE desktop launches smoothly

Desktop Setup

Once booting was solid:

apt install ubuntu-mate-desktop

  • PulseAudio configured manually
  • Bluetooth audio worked after pairing, installing pavucontrol, and switching to A2DP (a sketch follows below)
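
For reference, the Bluetooth audio steps looked roughly like this; the MAC address is a placeholder, and the A2DP switch was made in pavucontrol afterwards:

apt install pavucontrol
bluetoothctl
[bluetooth]# scan on
[bluetooth]# pair AA:BB:CC:DD:EE:FF
[bluetooth]# connect AA:BB:CC:DD:EE:FF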

Reflections

Over 10 hours went into debugging init failures, masking systemd services, reimporting ZFS pools, rebuilding initramfs, and correcting loader entries.

We initially tried both GRUB and systemd-boot. GRUB failed to reliably boot the ZFS root, and after numerous issues, we chose to go all-in with systemd-boot. The result was a cleaner and far faster boot process.

We also recognized that while it is technically possible to place EFI and ZFS on the same disk, allowing ZFS to manage entire drives gives it full control over redundancy, scrub health, and clean device layout — avoiding the complexities of manually partitioned hybrid disks. Keeping the EFI partition separate on its own small SSD ensures boot simplicity, and letting ZFS own whole disks aligns perfectly with its design philosophy.

What made this work was:

  • A willingness to understand the underlying tools
  • Avoiding GRUB and its complex boot logic
  • Learning how ZFS interacts with init and systemd

Benchmarking Your ZFS Pool

To quickly benchmark your ZFS stripe or mirror, install fio:

sudo apt install fio

Random Read Test

Run a basic 4K random read test:

fio --name=randread --ioengine=libaio --direct=1 --rw=randread --bs=4k --numjobs=4 --size=1G --runtime=60 --group_reporting --filename=/tank/testfile


Result:

READ: bw=1786MiB/s (1872MB/s), IOPS=457k, io=4096MiB (4295MB), run=2294msec


This result demonstrates excellent random read throughput and high IOPS for a two-disk NVMe ZFS stripe.

Sequential Read Tests

To test the maximum sequential read speed of your ZFS pool, use the following:

fio --name=seqread --ioengine=libaio --direct=1 --rw=read --bs=1M --numjobs=4 --size=1G --runtime=60 --group_reporting --filename=/tank/testfile


Result:

READ: bw=40.4GiB/s (43.4GB/s), IOPS=41.4k, io=4096MiB (4295MB), run=99msec


Extended Sequential Read Scaling

We tested larger sizes and more concurrent jobs for throughput scaling:

4GiB test with 4 jobs:

READ: bw=40.7GiB/s (43.7GB/s), IOPS=41.7k, io=16.0GiB, run=393msec


4GiB test with 15 jobs:

READ: bw=77.8GiB/s (83.6GB/s), IOPS=79.7k, io=60.0GiB, run=771msec


4GiB test with 24 jobs:

READ: bw=73.1GiB/s (78.5GB/s), IOPS=74.9k, io=96.0GiB, run=1313msec


Cached read test (without direct I/O, so reads can be served from the page cache):

READ: bw=69.0GiB/s (74.1GB/s), IOPS=70.7k, io=96.0GiB, run=1391msec


These results demonstrate the scalability of read throughput on a ZFS stripe using NVMe drives. With proper concurrency and large block sizes (1M), performance reached over 80GB/s under realistic system loads.

Remove the test file after benchmarking:

rm /tank/testfile


Next Steps

Now that we have:

  • A stable ZFS root
  • systemd-boot cleanly managing entries
  • MATE desktop running

We can safely snapshot the system:

sudo zfs snapshot tank/ROOT/noble@mate-clean


And experiment with new roots (like LFS or Arch) in tank/ROOT/xyz — each with its own loader entry.

Persistence builds systems. Let this guide light your path.

System Specifications

  • Motherboard: MSI MPG Z790 CARBON WIFI (MS-7D89), manufactured by Micro-Star International Co., Ltd.
  • CPU: Intel Core i9-13900K — running at stock speed, cooled by a Kraken 360mm AIO water cooler
  • Drives: Two Seagate FireCuda 530 NVMe drives (PCIe 4.0, up to 7000MB/s throughput), each directly attached to CPU PCIe lanes for maximum performance
  • Graphics: NVIDIA RTX 3060 Ti 12GB using PCIe 4.0 x8, sharing the CPU PCIe lanes with the NVMe drives
  • Memory: 128GB DDR5 (4x32GB Corsair DIMMs)
  • Model: CMT64GX5M2B5200C40
  • Speed: 4000 MT/s (configured), dual-rank, 1.1V
  • All DIMMs matched and operating in dual-channel configuration
  • Boot Device: Separate 2TiB NVMe drive used exclusively for the EFI system partition and later planned for additional ZFS storage for temporary data or virtual machine disks

This setup provides a strong balance of workstation and server-grade capability: exceptional performance under heavy I/O and multitasking, and an ideal platform for development, VM workloads, and testing advanced filesystem setups like ZFS root.

🚧 Appendix: Instant Recovery with ZFS Snapshots

If something breaks (bad upgrade, missing packages, broken audio, etc.), and you've taken a ZFS snapshot of your working root, recovery is instant.


Make a snapshot before doing anything else — and this is how to recover

ZFS with systemd-boot makes it very difficult to make a permanent mistake.

Before applying any risky change:

zfs snapshot tank/ROOT/ubuntu-noble@ready-to-develop

This was tested firsthand: while setting up distcc and ccache, something went wrong. The snapshot made earlier proved invaluable — within minutes, the system was fully functional again. The desktop returned, Firefox loaded with all logins and passwords intact, and nothing needed to be reconfigured.

Real Snapshot Proof

root@homer:/boot/efi/loader/entries# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
tank                     335G  3.19T    96K  none
tank/HOME                 96K  3.19T    96K  /home
tank/LOG                  96K  3.19T    96K  /var/log
tank/ROOT               65.8G  3.19T    96K  /
tank/ROOT/dev-session    424K  3.19T  26.4G  /
tank/ROOT/lfs           2.19G  3.19T  2.19G  /mnt/lfs
tank/ROOT/noble         63.1G  3.19T  59.1G  /
tank/ROOT/ubuntu-noble   452M  3.19T  26.4G  /
tank/SRV                  96K  3.19T    96K  /srv
tank/lfs                 671M  3.19T    96K  none
tank/lfs/sources         671M  3.19T   671M  /mnt/sources
tank/libvirt            10.4G  3.19T  10.4G  /var/lib/libvirt
tank/mysql                96K  3.19T    96K  none
tank/vm                  258G  3.19T    96K  none
tank/vm/win11            258G  3.28T   128G  -
root@homer:/boot/efi/loader/entries# zfs list -t snapshot
NAME                                      USED  AVAIL  REFER  MOUNTPOINT
tank/ROOT/noble@mate-clean               4.07G      -  26.2G  -
tank/ROOT/ubuntu-noble@stable-base       13.6M      -  26.4G  -
tank/ROOT/ubuntu-noble@ready-to-develop  10.5M      -  26.4G  -
tank/libvirt@initial                        0B      -  10.4G  -
tank/vm/win11@initial-install            30.0G      -   128G  -
root@homer:/boot/efi/loader/entries# 

  1. Clone the snapshot:
zfs clone tank/ROOT/ubuntu-noble@ready-to-develop tank/ROOT/dev-session

  2. Set mountpoint and disable auto mount:
zfs set mountpoint=/ tank/ROOT/dev-session
zfs set canmount=noauto tank/ROOT/dev-session

  3. Create a loader entry:
cp /boot/efi/loader/entries/ubuntu.conf /boot/efi/loader/entries/dev-session.conf
vim /boot/efi/loader/entries/dev-session.conf

Edit the new entry:

title	Ubuntu Noble dev-session (ZFS)
linux	/ubuntu/vmlinuz
initrd	/ubuntu/initrd.img
options	root=ZFS=tank/ROOT/dev-session ro quiet splash hugepages=32768 transparent_hugepage=never intel_iommu=on iommu=pt vfio-pci.ids=8086:a780 modprobe.blacklist=i915 systemd.log_level=debug systemd.log_target=console

  4. Reboot and pick the new entry at the boot menu if you want to test it without making it the default.

If it works, use it. Nothing is lost. You still have the old root filesystem mounted elsewhere if you need to retrieve files like Firefox profiles, Wi-Fi configs, netplan settings, or libvirt contents.

Optionally, move libvirt to its own dataset to persist across sessions:

zfs create tank/libvirt
zfs set mountpoint=/var/lib/libvirt tank/libvirt
cp -a /mnt/noble/var/lib/libvirt /var/lib/

And finally:

zfs snapshot tank/ROOT/ubuntu-noble@stable-base
zfs snapshot tank/ROOT/ubuntu-noble@ready-to-develop
zfs snapshot tank/libvirt@initial

From here, you're ready to experiment with distcc, ccache, compile flags, or rebuild your desktop entirely — and you can roll back or boot into a known-good state at any time.

✅ Steps to Recover:

  1. Clone the snapshot:

zfs clone tank/ROOT/ubuntu-noble@ready-to-develop tank/ROOT/dev-session

  2. Set mountpoint and canmount:

zfs set mountpoint=/ tank/ROOT/dev-session
zfs set canmount=noauto tank/ROOT/dev-session

  3. Add a boot entry (copy and edit the existing one), then update the new file:

cp /boot/efi/loader/entries/ubuntu.conf /boot/efi/loader/entries/dev-session.conf
vim /boot/efi/loader/entries/dev-session.conf

title Ubuntu Noble dev-session (ZFS)
options root=ZFS=tank/ROOT/dev-session ro ...

  4. Reboot and select "dev-session" — your clean desktop is ready.


🧠 Pro Tip:

Always take a snapshot before any major system change:
zfs snapshot tank/ROOT/ubuntu-noble@before-update


✨ Why This Works

Thanks to ZFS's copy-on-write and snapshot capabilities, your full root system can be cloned instantly, including /var, /etc, and /usr (in this layout /home lives on its own dataset, so it persists independently of root clones). This gives you:

  • Safe staging environment for development
  • Zero-risk system upgrades
  • A quick rollback path without reinstalling

You can test, break, recover — and never lose your working system.


Snapshot-based recovery is your get-out-of-jail-free card, built right into this Ubuntu 24.04 ZFS-root setup.