Booting Ubuntu 24.04 with ZFS Root and systemd-boot: A Hard-Earned Guide

Before diving into the technical details, it's important to understand what makes this installation unique and why it required patience and persistence. Unlike typical Ubuntu installations that rely on GRUB and ext4, this system was crafted with a ZFS root on striped NVMe drives (though a mirror would have provided better redundancy) and uses systemd-boot as the bootloader. The result is a minimalist, robust setup that boots quickly, detects data corruption through ZFS checksums (a stripe can detect, though not self-heal, disk errors), and is designed with flexibility in mind.

We didn't arrive here easily. GRUB was our first attempt — but repeated issues with its interaction with ZFS led us to abandon it. Ultimately, systemd-boot proved to be the cleaner and more reliable alternative. We also chose to let ZFS manage entire drives instead of carving out partitions, giving it full control over data layout and fault management. This article documents every step, the lessons learned, and the reasoning behind each decision.

Overview

This article documents the full process of building a clean, reliable Ubuntu desktop system that:

  • Boots with systemd-boot (no GRUB)
  • Uses a ZFS stripe as root
  • Avoids the common pitfalls of service misordering and boot crashes
  • Brings up a full MATE desktop environment on top

This guide was written after over 10 hours of real-world trial, error, and persistence — including several reinstalls, blind alleys, and ultimately, success.

Starting Point

We began by booting from the Ubuntu Server 24.04 USB installer, choosing the minimal install route to avoid Snap preloads and desktop bloat.

Preparing the Disks

  1. Wiped partition tables entirely using sgdisk --zap-all and parted (see the sketch just after this list).
  2. Used two FireCuda NVMe drives in a ZFS stripe, without any separate boot partition on them.
  3. Created an EFI partition on a separate NVMe drive, formatted it as vfat, and mounted it at /boot/efi. This partition holds the kernels, initramfs images, and any loader configuration added later. It could have been carved out of the drives in the ZFS stripe or mirror, but using a spare disk lets ZFS own whole devices.
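
For reference, the wipe in step 1 looked roughly like this on our drives (destructive; double-check the by-id paths before running):

sgdisk --zap-all /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1
sgdisk --zap-all /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK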

Creating the EFI Partition

To find the boot disk to be used:

akadata@homer:~$ ls -l /dev/disk/by-id/ | grep addlink
lrwxrwxrwx 1 root root 13 May 11 22:21 nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799 -> ../../nvme2n1

Then use parted to create a new GPT label and an EFI partition:

parted /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799
(parted) mklabel gpt
(parted) mkpart ESP fat32 1MiB 2GiB
(parted) set 1 boot on
(parted) quit
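
If you prefer to script this step, the same layout can be created non-interactively; a sketch of the equivalent parted invocation (on GPT, parted's esp flag is what the interactive boot flag sets):

parted --script /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799 \
  mklabel gpt \
  mkpart ESP fat32 1MiB 2GiB \
  set 1 esp on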

Format the partition as FAT32:

mkfs.vfat -F32 /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799-part1

ZFS Root Pool

We created the ZFS pool manually with the two FireCuda drives, addressed by their /dev/disk/by-id paths. The full command below shows the mirrored form with the dataset properties we wanted; stripe and raidz variants follow:

zpool create -o ashift=12 -O acltype=posixacl -O xattr=sa \
  -O relatime=off -O dnodesize=auto -O normalization=formD \
  -O mountpoint=none -O compression=off \
  -R /mnt tank mirror \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1 \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK

To create a stripe instead of a mirror (faster, but no redundancy; this is the layout we actually used, and it accepts the same -o/-O options as above):

zpool create -R /mnt tank \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1 \
  /dev/disk/by-id/nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK

To create a 3-disk raidz1:

zpool create -R /mnt tank raidz1 \
  /dev/disk/by-id/nvme-disk1 \
  /dev/disk/by-id/nvme-disk2 \
  /dev/disk/by-id/nvme-disk3

To create a 4-disk raidz2:

zpool create -R /mnt tank raidz2 \
  /dev/disk/by-id/nvme-disk1 \
  /dev/disk/by-id/nvme-disk2 \
  /dev/disk/by-id/nvme-disk3 \
  /dev/disk/by-id/nvme-disk4

We later enabled compression selectively per dataset.
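
For example, enabling LZ4 on the home dataset (which datasets to compress is a matter of preference; lz4 is a safe default):

zfs set compression=lz4 tank/home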

System Preparation and Chroot Setup

Once the ZFS pool and EFI partition were created, the next steps involved preparing the Ubuntu system manually.

1. Create ZFS datasets for the root and home:

zfs create -o mountpoint=/ tank/ROOT
zfs create -o mountpoint=/ tank/ROOT/noble

zfs create -o mountpoint=/home tank/home

Before bootstrapping, it's worth understanding the purpose of the dataset layout (explained after the next commands) and mounting the EFI partition at its eventual location under the new root:

mkdir -p /tank/ROOT/noble/boot/efi
mount /dev/disk/by-id/nvme-addlink_M.2_PCIE_G4x4_NVMe_2023022003000799-part1 \
/tank/ROOT/noble/boot/efi

We use tank/ROOT as a parent dataset for all bootable root environments. This structure allows for clean isolation of different OS roots (e.g., noble, arch, lfs) beneath a single management namespace. Each subdataset under tank/ROOT represents a full root filesystem. In this case, tank/ROOT/noble is used for the Ubuntu Noble installation.

2. Bootstrap the base system using debootstrap:

debootstrap --include=linux-image-generic,linux-headers-generic,vim,sudo noble \
/tank/ROOT/noble http://archive.ubuntu.com/ubuntu
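
debootstrap leaves only a minimal apt source behind. Before installing anything beyond the base system, you will likely want the full component set in the new root's /etc/apt/sources.list; a sketch for noble (adjust the mirror to taste):

deb http://archive.ubuntu.com/ubuntu noble main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu noble-updates main restricted universe multiverse
deb http://security.ubuntu.com/ubuntu noble-security main restricted universe multiverse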

3. Bind necessary virtual filesystems and chroot:

mount --rbind /dev /tank/ROOT/noble/dev
mount --rbind /proc /tank/ROOT/noble/proc
mount --rbind /sys /tank/ROOT/noble/sys
mount --rbind /run /tank/ROOT/noble/run
mount -t efivarfs efivarfs /tank/ROOT/noble/sys/firmware/efi/efivars
chroot /tank/ROOT/noble

Note: Mounting efivarfs is critical for successful EFI bootloader installation and configuration.

4. Inside the chroot, update the initramfs and install the base system tools:

update-initramfs -c -k all
systemctl enable ssh
systemctl set-default graphical.target
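
One point worth flagging: for update-initramfs to produce a ZFS-capable image, the chroot also needs the ZFS userland and initramfs hooks (and openssh-server for the ssh unit enabled above). This is a hedged reminder rather than a transcript of our exact session; the package names are Ubuntu's standard ones:

apt update
apt install zfsutils-linux zfs-initramfs openssh-server
update-initramfs -c -k all   # re-run so the image picks up ZFS support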

Bootloader: systemd-boot

Installation

We opted for systemd-boot instead of GRUB:

apt install systemd-boot systemd-boot-efi
bootctl --path=/boot/efi install
bootctl status
bootctl update

bootctl status Output

System:
      Firmware: UEFI 2.90 (American Megatrends 5.32)
 Firmware Arch: x64
   Secure Boot: disabled
  TPM2 Support: yes
  Measured UKI: no
  Boot into FW: supported

Current Boot Loader:
      Product: systemd-boot 255.4-1ubuntu8
     Features: ✓ Boot counting
               ✓ Menu timeout control
               ✓ One-shot menu timeout control
               ✓ Default entry control
               ✓ One-shot entry control
               ✓ Support for XBOOTLDR partition
               ✓ Support for passing random seed to OS
               ✓ Load drop-in drivers
               ✓ Support Type #1 sort-key field
               ✓ Support @saved pseudo-entry
               ✓ Support Type #1 devicetree field
               ✓ Enroll SecureBoot keys
               ✓ Retain SHIM protocols
               ✓ Menu can be disabled
               ✓ Boot loader sets ESP information
          ESP: /dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538
         File: └─/EFI/systemd/systemd-bootx64.efi

Random Seed:
 System Token: set
       Exists: yes

Available Boot Loaders on ESP:
          ESP: /boot/efi (/dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538)
         File: ├─/EFI/systemd/systemd-bootx64.efi (systemd-boot 255.4-1ubuntu8)
               └─/EFI/BOOT/BOOTX64.EFI (systemd-boot 255.4-1ubuntu8)

Boot Loaders Listed in EFI Variables:
        Title: Linux Boot Manager
           ID: 0x0000
       Status: active, boot-order
    Partition: /dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538
         File: └─/EFI/systemd/systemd-bootx64.efi

        Title: UEFI OS
           ID: 0x0001
       Status: active, boot-order
    Partition: /dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538
         File: └─/EFI/BOOT/BOOTX64.EFI

Boot Loader Entries:
        $BOOT: /boot/efi (/dev/disk/by-partuuid/3e069d1d-f0d4-45a8-aca8-dc8a10c4d538)
        token: 9270275906c34edd9126b5905f20cd8e

Default Boot Loader Entry:
         type: Boot Loader Specification Type #1 (.conf)
        title: Ubuntu 24.04 LTS
           id: 9270275906c34edd9126b5905f20cd8e-6.8.0-31-generic.conf
       source: /boot/efi//loader/entries/9270275906c34edd9126b5905f20cd8e-6.8.0-31-generic.conf
     sort-key: ubuntu
      version: 6.8.0-31-generic
   machine-id: 9270275906c34edd9126b5905f20cd8e
        linux: /boot/efi//9270275906c34edd9126b5905f20cd8e/6.8.0-31-generic/linux
       initrd: /boot/efi//9270275906c34edd9126b5905f20cd8e/6.8.0-31-generic/initrd.img-6.8.0-31-generic
      options: root=ZFS=tank/ROOT/noble rw systemd.machine_id=9270275906c34edd9126b5905f20cd8e

bootctl install placed systemd-boot on the ESP, and Ubuntu's kernel hooks automatically generated loader entries under /boot/efi/loader/entries/. While those autogenerated entries may work, they give us less explicit control over boot parameters such as the exact ZFS root dataset, which is essential for reliability in multi-dataset configurations.

Before proceeding, ensure basic system identity is configured:

echo "homer" > /etc/hostname
echo "127.0.1.1 homer" >> /etc/hosts

To take that control ourselves, we first created the directory that will hold our own kernel copies:

mkdir -p /boot/efi/ubuntu

Then we manually copied the kernel and initramfs into that location:

cp /boot/vmlinuz /boot/efi/ubuntu/
cp /boot/initrd.img /boot/efi/ubuntu/

We also copied the versioned files to retain flexibility:

cp /boot/vmlinuz-6.8.0-31-generic /boot/efi/ubuntu/
cp /boot/initrd.img-6.8.0-31-generic /boot/efi/ubuntu/

We also created our own loader entry manually for clarity:

/boot/efi/loader/entries/ubuntu.conf

title   Ubuntu Noble (ZFS)
linux   /ubuntu/vmlinuz
initrd  /ubuntu/initrd.img
options root=ZFS=tank/ROOT/noble rw

Both this and the machine-id-based autogenerated entry worked, but we preferred the clarity of our own.

Note: Every time you update the kernel or initramfs, you must repeat the copy step so that /boot/efi/ubuntu/ has the latest versions; otherwise the bootloader will keep booting the old ones. Ubuntu's kernel install hooks also maintain their own kernel and initrd copies in the machine-id directory on the ESP, but those do not include the ZFS root parameters unless explicitly configured. Our approach gives clear control over what is booted and keeps it consistent with the expected root dataset.
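
To avoid forgetting that copy after a kernel upgrade, one option is a small hook script. The directory /etc/kernel/postinst.d/ is the standard Debian/Ubuntu location for such hooks, but the script name and contents below are our own sketch, not something Ubuntu ships; mark it executable after creating it:

#!/bin/sh
# /etc/kernel/postinst.d/zz-copy-to-esp (hypothetical helper)
# Copy the current kernel and initramfs symlink targets to the manually
# managed ESP directory used by our loader entry.
set -e
cp /boot/vmlinuz /boot/efi/ubuntu/vmlinuz
cp /boot/initrd.img /boot/efi/ubuntu/initrd.img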

Pitfalls and Problems

1. zfs-mount.service and zfs-volumes.target

Enabling these too early caused multiple boot failures:

  • Stalls during boot
  • Hanging zfs-mount.service waiting for non-existent zvols

Fix: We masked them completely:

ln -s /dev/null /etc/systemd/system/zfs-mount.service
ln -s /dev/null /etc/systemd/system/zfs-volumes.target
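
Those symlinks are exactly what systemctl mask creates, so the same fix can be applied (and later undone with systemctl unmask) in one command:

systemctl mask zfs-mount.service zfs-volumes.target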

2. Root Password & sudo

After bootstrapping the system with debootstrap, we had no root password and no sudo.

Fix: Set a root password from recovery or the chroot, install the basics, and add your user to the sudo group:

apt install vim sudo
usermod -aG sudo youruser

To allow your user to run sudo commands without entering a password, add the following line to `/etc/sudoers` using `visudo`, replacing `username` with your actual login name (visudo validates the syntax before saving):

username ALL=(ALL) NOPASSWD: ALL

3. Multiple Reinstalls

Many failures required us to:

  • Re-wipe ZFS pools
  • Correct EFI partitions
  • Rebuild zpool.cache

Persistence paid off.

akadata@homer:/home/akadata# sudo zpool status
  pool: tank
 state: ONLINE
config:

	NAME                                                STATE     READ WRITE CKSUM
	tank                                                ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1  ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK  ONLINE       0     0     0

errors: No known data errors

ZFS supports automated scrubs to help detect silent data corruption (and repair it where redundancy exists). Recent OpenZFS releases ship per-pool scrub timers (zfs-scrub-weekly@ and zfs-scrub-monthly@); check systemctl list-unit-files | grep scrub for the exact names on your system, then enable one for the pool:

systemctl enable zfs-scrub-monthly@tank.timer

This will scrub the tank pool on a regular schedule behind the scenes.

You can also scrub manually any time with:

zpool scrub tank

And check the progress:

zpool status
akadata@homer:/home/akadata# sudo zpool scrub tank
akadata@homer:/home/akadata# sudo zpool status
  pool: tank
 state: ONLINE
  scan: scrub in progress since Mon May 12 08:20:15 2025
	30.4G / 30.4G scanned, 5.63G / 30.4G issued at 5.63G/s
	0B repaired, 18.51% done, 00:00:04 to go
config:

	NAME                                                STATE     READ WRITE CKSUM
	tank                                                ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YP1  ONLINE       0     0     0
	  nvme-Seagate_FireCuda_530_ZP2000GM30023_7VR04YWK  ONLINE       0     0     0

errors: No known data errors

Final Boot Sequence

  • EFI loads systemd-boot
  • systemd-boot selects Ubuntu with ZFS=tank/ROOT/noble
  • No zfs-mount or zfs-volumes services interfere
  • initrd imports the pool and mounts /
  • MATE desktop launches smoothly

Desktop Setup

Once booting was solid:

apt install ubuntu-mate-desktop

  • PulseAudio configured manually
  • Bluetooth audio worked after pairing, installing pavucontrol, and switching to A2DP
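
The mixer mentioned above is a single package away:

sudo apt install pavucontrol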

Reflections

Over 10 hours went into debugging init failures, masking systemd services, reimporting ZFS pools, rebuilding initramfs, and correcting loader entries.

We initially tried both GRUB and systemd-boot. GRUB failed to reliably boot the ZFS root, and after numerous issues, we chose to go all-in with systemd-boot. The result was a cleaner and far faster boot process.

We also recognized that while it is technically possible to place EFI and ZFS on the same disk, allowing ZFS to manage entire drives gives it full control over redundancy, scrub health, and clean device layout — avoiding the complexities of manually partitioned hybrid disks. Keeping the EFI partition separate on its own small SSD ensures boot simplicity, and letting ZFS own whole disks aligns perfectly with its design philosophy.

What made this work was:

  • A willingness to understand the underlying tools
  • Avoiding GRUB and its complex boot logic
  • Learning how ZFS interacts with init and systemd

Benchmarking Your ZFS Pool

To quickly benchmark your ZFS stripe or mirror, install fio:

sudo apt install fio

Random Read Test

Run a basic 4K random read test:

fio --name=randread --ioengine=libaio --direct=1 --rw=randread --bs=4k --numjobs=4 --size=1G --runtime=60 --group_reporting --filename=/tank/testfile

Result:

READ: bw=1786MiB/s (1872MB/s), IOPS=457k, io=4096MiB (4295MB), run=2294msec

This result demonstrates excellent random read throughput and high IOPS for a two-disk NVMe ZFS stripe.

Sequential Read Tests

To test the maximum sequential read speed of your ZFS pool, use the following:

fio --name=seqread --ioengine=libaio --direct=1 --rw=read --bs=1M --numjobs=4 --size=1G --runtime=60 --group_reporting --filename=/tank/testfile

Result:

READ: bw=40.4GiB/s (43.4GB/s), IOPS=41.4k, io=4096MiB (4295MB), run=99msec

Extended Sequential Read Scaling

We tested larger sizes and more concurrent jobs for throughput scaling:

4GiB test with 4 jobs:

READ: bw=40.7GiB/s (43.7GB/s), IOPS=41.7k, io=16.0GiB, run=393msec

4GiB test with 15 jobs:

READ: bw=77.8GiB/s (83.6GB/s), IOPS=79.7k, io=60.0GiB, run=771msec

4GiB test with 24 jobs:

READ: bw=73.1GiB/s (78.5GB/s), IOPS=74.9k, io=96.0GiB, run=1313msec

Cached read test (no `--direct=1`):

READ: bw=69.0GiB/s (74.1GB/s), IOPS=70.7k, io=96.0GiB, run=1391msec

These results show how read throughput scales on a ZFS stripe of NVMe drives: with large block sizes (1M) and enough concurrent jobs, reported bandwidth passed 80GB/s. Since that is far beyond the combined raw bandwidth of two PCIe 4.0 drives, much of the data is clearly being served from the ARC, so treat these figures as cached-read scaling rather than raw disk speed.

Remove the test file after benchmarking:

rm /tank/testfile

Next Steps

Now that we have:

  • A stable ZFS root
  • systemd-boot cleanly managing entries
  • MATE desktop running

We can safely snapshot the system:

sudo zfs snapshot tank/ROOT/noble@mate-clean

And experiment with new roots (like LFS or Arch) in tank/ROOT/xyz — each with its own loader entry.
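
Each additional root just needs its own kernel copies on the ESP and a loader entry pointing at its dataset. A hypothetical /boot/efi/loader/entries/arch.conf (dataset name and file paths are illustrative):

title   Arch Linux (ZFS)
linux   /arch/vmlinuz-linux
initrd  /arch/initramfs-linux.img
options root=ZFS=tank/ROOT/arch rw

And if an experiment goes wrong, zfs rollback tank/ROOT/noble@mate-clean returns the Ubuntu root to the snapshot taken above.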

Persistence builds systems. Let this guide light your path.

System Specifications

  • Motherboard: MSI MPG Z790 CARBON WIFI (MS-7D89), manufactured by Micro-Star International Co., Ltd.
  • CPU: Intel Core i9-13900K — running at stock speed, cooled by a Kraken 360mm AIO water cooler
  • Drives: Two Seagate FireCuda 530 NVMe drives (PCIe 4.0, up to 7000MB/s throughput), each directly attached to CPU PCIe lanes for maximum performance
  • Graphics: NVIDIA RTX 3060 Ti 12GB using PCIe 4.0 x8, sharing the CPU PCIe lanes with the NVMe drives
  • Memory: 128GB DDR5 (4x32GB Corsair CMT64GX5M2B5200C40 DIMMs), configured at 4000 MT/s, dual-rank, 1.1V, all DIMMs matched across the platform's two memory channels
  • Boot Device: Separate 2TiB NVMe drive used exclusively for the EFI system partition and later planned for additional ZFS storage for temporary data or virtual machine disks

This configuration provides a strong balance of workstation and server-grade capability, with exceptional performance under heavy I/O and multitasking: ideal for development, VM workloads, and testing advanced filesystem setups like ZFS root.