ZFS on Linux: The Basics

Set up OpenZFS on Linux: create pools with mirroring or RAID-Z, manage datasets with compression and quotas, and use snapshots for instant backups.

Advanced | Ubuntu, Debian, Fedora, Arch | 10 min read | Updated May 26, 2026

Before you start

  • Root or sudo access on the target system
  • One or more spare block devices or disks (do not use disks with data you need)
  • Kernel headers installed (required for DKMS module compilation on Debian/Ubuntu/Fedora)
  • OpenZFS 2.1 or later recommended; version 2.3+ required for RAID-Z expansion

ZFS is not just a filesystem — it is a combined volume manager and filesystem that brings enterprise storage concepts to any Linux machine. Copy-on-write semantics, built-in checksumming, snapshots, transparent compression, and RAID-Z parity all live under one roof. OpenZFS, the community fork that ships on Linux, is stable, actively maintained, and widely used in production. This guide walks you through installation, creating your first pool, carving it into datasets, and taking snapshots — the four pillars you need to be productive with ZFS.

Why ZFS Is Different

Traditional filesystems sit on top of a volume manager (like LVM) which sits on top of a RAID layer (like mdadm). ZFS collapses all three into a single coherent layer. Every block is checksummed end-to-end, so silent data corruption — bitrot — is detected and, on mirrored or RAID-Z vdevs, automatically repaired. Snapshots are nearly free because ZFS never overwrites data in-place; old blocks are preserved by the copy-on-write engine until no snapshot references them. Understanding this model makes everything else click.
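The detection half of that model can be imitated in a few lines of shell. This is a toy analogy only, not how ZFS is implemented: a temporary file stands in for a block, sha256sum stands in for the block checksum, and the variable names are made up for illustration.

```shell
# Toy analogy: a file plays the role of a block, sha256sum the block checksum.
workdir=$(mktemp -d)
block="$workdir/block.bin"

# Write the "block" and store its checksum separately, at write time.
printf 'important data' > "$block"
stored_sum=$(sha256sum "$block" | cut -d' ' -f1)

# Simulate bitrot: overwrite one byte in place without updating the checksum.
printf 'X' | dd of="$block" bs=1 seek=3 conv=notrunc status=none

# On read, recompute the checksum and compare it with the stored one.
current_sum=$(sha256sum "$block" | cut -d' ' -f1)
if [ "$current_sum" = "$stored_sum" ]; then
  verdict="block OK"
else
  verdict="checksum mismatch: corruption detected"
fi
echo "$verdict"

rm -rf "$workdir"
```

On a redundant vdev ZFS goes one step further than this sketch: it reads another copy (or reconstructs from parity) and rewrites the bad block.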

Installation

ZFS is not in the mainline Linux kernel due to CDDL/GPL licensing tension. On most distros you install it as a DKMS module or a pre-built out-of-tree module.

Debian / Ubuntu

The zfsutils-linux package provides the userland tools. The kernel module either ships with the kernel (Ubuntu) or is built by the zfs-dkms package (Debian).

sudo apt update
sudo apt install zfsutils-linux

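On Ubuntu the module ships pre-built with the distribution kernels, so nothing needs compiling. On Debian 12 the ZFS packages live in the contrib component, which must be enabled in your APT sources; the module is then built via DKMS.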

Fedora / RHEL 9 / Rocky Linux 9

Use the official OpenZFS repository. Fedora and RHEL-family systems use different release packages, shown separately below.

# Fedora
sudo dnf install https://zfsonlinux.org/fedora/zfs-release-2-5$(rpm --eval "%{dist}").noarch.rpm
sudo dnf install zfs

# Rocky / AlmaLinux / RHEL 9
sudo dnf install https://zfsonlinux.org/epel/zfs-release-2-5.el9.noarch.rpm
sudo dnf install epel-release
sudo dnf install zfs

Load the module immediately without rebooting:

sudo modprobe zfs

Arch Linux

# The archzfs repo is the easiest path; building from the AUR is the alternative
# Add the archzfs repo to /etc/pacman.conf first (see archzfs.com), then:
# (-Syu rather than -Sy avoids a partial upgrade)
sudo pacman -Syu zfs-dkms zfs-utils

Enable the ZFS services

Import pools at boot and mount datasets automatically:

sudo systemctl enable --now zfs-import-cache.service zfs-import.target
sudo systemctl enable --now zfs-mount.service zfs.target

Creating a Pool

A pool (zpool) is the top-level storage container. It is made of one or more vdevs (virtual devices). Choose the right topology up front — changing it later is limited.

  • stripe: a single disk, or multiple disks with no redundancy. Fast, but one disk failure destroys the pool.
  • mirror: two or more disks holding full copies. Survives the loss of every disk but one in the vdev.
  • RAID-Z1/Z2/Z3: single, double, or triple parity, comparable to RAID-5 and RAID-6 (RAID-Z3 adds a third parity disk). Tolerates 1/2/3 disk failures per vdev. Practical minimum 3/4/5 disks per vdev.
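To compare topologies before committing, a rough capacity estimate helps. The helper below is a hypothetical sketch (usable_disks is not a ZFS command): it prints how many disks' worth of space one vdev leaves after redundancy, ignoring metadata, padding, and reserved space.

```shell
# usable_disks TOPOLOGY DISK_COUNT
# Prints how many disks' worth of capacity remain in ONE vdev after redundancy.
# Rough estimate only; real usable space is somewhat lower.
usable_disks() {
  topology=$1
  disks=$2
  case "$topology" in
    stripe) echo "$disks" ;;
    mirror) echo 1 ;;
    raidz1) echo $((disks - 1)) ;;
    raidz2) echo $((disks - 2)) ;;
    raidz3) echo $((disks - 3)) ;;
    *) echo "unknown topology: $topology" >&2; return 1 ;;
  esac
}

usable_disks mirror 2   # prints 1
usable_disks raidz1 3   # prints 2
usable_disks raidz2 4   # prints 2
```

Multiply the result by the size of the smallest disk in the vdev to get an approximate usable capacity.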

Always reference disks by ID rather than by device names like sdb, which can change between reboots.

ls -l /dev/disk/by-id/ | grep -v part

Create a mirrored pool

sudo zpool create -o ashift=12 mypool mirror \
  /dev/disk/by-id/ata-WDC_WD40EZRZ_AAAA-1 \
  /dev/disk/by-id/ata-WDC_WD40EZRZ_BBBB-2

ashift=12 sets the pool's sector size to 2^12 bytes (4 KiB), matching almost all drives made after 2011. ashift cannot be changed after a vdev is created, and a value that is too small causes permanent write amplification, so set it correctly at creation time.
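Because ashift is the base-2 logarithm of the sector size, the value for a given drive can be worked out rather than memorized. The function below is an illustrative helper (ashift_for is a made-up name, not part of ZFS); physical sector sizes can be read with lsblk, though some drives report 512 bytes while using 4 KiB sectors internally.

```shell
# ashift_for SECTOR_BYTES
# Prints log2 of the sector size, or fails if it is not a power of two.
ashift_for() {
  size=$1
  shift_bits=0
  if [ "$size" -le 0 ] || [ $((size & (size - 1))) -ne 0 ]; then
    echo "sector size must be a power of two: $size" >&2
    return 1
  fi
  while [ "$size" -gt 1 ]; do
    size=$((size >> 1))
    shift_bits=$((shift_bits + 1))
  done
  echo "$shift_bits"
}

ashift_for 512    # prints 9
ashift_for 4096   # prints 12

# To see what the drives report (not run here):
#   lsblk -o NAME,PHY-SEC,LOG-SEC
```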

Create a RAID-Z2 pool (4-disk example)

sudo zpool create -o ashift=12 datapool raidz2 \
  /dev/disk/by-id/disk1 \
  /dev/disk/by-id/disk2 \
  /dev/disk/by-id/disk3 \
  /dev/disk/by-id/disk4

Verify the pool

zpool status mypool
zpool list

Output shows vdev topology, health (ONLINE), used/available space, and any errors. All three columns — READ, WRITE, CKSUM — should be zero on a healthy pool.
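For scripted checks, the error columns can be filtered with awk. This is a sketch that assumes the usual table layout of zpool status (a NAME STATE READ WRITE CKSUM header followed by one row per device); pool_error_lines is a hypothetical helper name.

```shell
# pool_error_lines: reads `zpool status` text on stdin and prints the name and
# counters of every device row whose READ, WRITE, or CKSUM value is not 0.
pool_error_lines() {
  awk '
    $1 == "NAME" && $2 == "STATE" { in_config = 1; next }  # table header
    in_config && NF == 0          { in_config = 0 }        # blank line ends the table
    in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
      print $1, $3, $4, $5
    }
  '
}

# Usage on a live system (not run here):
#   zpool status mypool | pool_error_lines
```

No output means every counter is zero. For a quick yes/no answer, zpool status -x reports only pools that have problems.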

Datasets

Within a pool you create datasets — the ZFS equivalent of directories with their own properties. Each dataset can have its own compression, quota, record size, and mount point, all inherited or overridden independently.

Create datasets

# Dataset for a home directory tree
sudo zfs create mypool/home

# Dataset with a specific mount point
sudo zfs create -o mountpoint=/srv/postgres mypool/pgdata

# Dataset with LZ4 compression (nearly always worth enabling)
sudo zfs create -o compression=lz4 mypool/backups

Set and get properties

# Enable compression on an existing dataset
sudo zfs set compression=lz4 mypool/home

# Set a quota on a child dataset (create it first: sudo zfs create mypool/home/alice)
sudo zfs set quota=100G mypool/home/alice

# List all properties for a dataset
zfs get all mypool/pgdata

Key dataset properties to know

  • compression: lz4 is fast and effective for almost every workload; zstd gives better ratios for cold data.
  • recordsize: the default of 128 KiB is good for general use; drop to 8K or 16K for database files to match the database's page size.
  • atime=off: disables access-time writes; safe for most workloads, with a meaningful performance gain on read-heavy data.
  • quota vs refquota: quota includes space used by snapshots and child datasets; refquota only counts the dataset's own live data.
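Size values such as 100G or 128K use binary multiples, so 100G means 100 × 1024³ bytes. The conversion can be sketched as below; zfs_size_to_bytes is a hypothetical helper that handles whole numbers with a single K, M, G, or T suffix only, which is narrower than what ZFS itself accepts.

```shell
# zfs_size_to_bytes SIZE
# Converts a whole-number size with an optional K/M/G/T suffix to bytes,
# using powers of 1024.
zfs_size_to_bytes() {
  size=$1
  number=${size%[KMGTkmgt]}    # strip one trailing suffix letter, if any
  suffix=${size#"$number"}     # whatever was stripped
  case "$suffix" in
    "")  echo "$number" ;;
    K|k) echo $((number * 1024)) ;;
    M|m) echo $((number * 1024 * 1024)) ;;
    G|g) echo $((number * 1024 * 1024 * 1024)) ;;
    T|t) echo $((number * 1024 * 1024 * 1024 * 1024)) ;;
  esac
}

zfs_size_to_bytes 128K   # prints 131072
zfs_size_to_bytes 100G   # prints 107374182400
```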

Snapshots

Snapshots are point-in-time, read-only copies of a dataset. Because ZFS is copy-on-write, creating a snapshot costs essentially nothing and takes milliseconds. Space is consumed only as the live dataset diverges from the snapshot.

Take a snapshot

# Naming convention: dataset@label
sudo zfs snapshot mypool/home@2025-01-15

# Recursive snapshot of a dataset and all its children
sudo zfs snapshot -r mypool@weekly-2025-01-15
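
Hard-coding the date gets tedious once snapshots become routine. A small helper can build the name from today's date; snapshot_name is a made-up function for illustration, and the optional label prefix is an assumption about how you want to name things.

```shell
# snapshot_name DATASET [LABEL_PREFIX]
# Prints DATASET@YYYY-MM-DD, or DATASET@PREFIX-YYYY-MM-DD when a prefix is given.
snapshot_name() {
  dataset=$1
  prefix=${2:-}
  stamp=$(date +%Y-%m-%d)
  if [ -n "$prefix" ]; then
    echo "${dataset}@${prefix}-${stamp}"
  else
    echo "${dataset}@${stamp}"
  fi
}

snapshot_name mypool/home
snapshot_name mypool weekly

# Usage (not run here):
#   sudo zfs snapshot "$(snapshot_name mypool/home)"
#   sudo zfs snapshot -r "$(snapshot_name mypool weekly)"
```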

List and inspect snapshots

zfs list -t snapshot
zfs list -t snapshot -o name,used,referenced,creation

Roll back a dataset

Rollback reverts a dataset to a snapshot and discards every change made since. By default it refuses if more recent snapshots exist; use -r to destroy them.

sudo zfs rollback mypool/home@2025-01-15

# Destroy intermediate snapshots and roll back
sudo zfs rollback -r mypool/home@2025-01-15
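
Since -r is destructive, it is worth listing what it would remove first. The filter below is a sketch: newer_snapshots is a hypothetical helper that expects snapshot names on stdin, oldest first, and prints the ones that come after the rollback target.

```shell
# newer_snapshots TARGET
# Reads snapshot names (oldest first) on stdin and prints those after TARGET,
# i.e. the snapshots a rollback -r to TARGET would destroy.
newer_snapshots() {
  awk -v target="$1" '
    found        { print }      # everything after the target
    $0 == target { found = 1 }  # the target itself is kept, not printed
  '
}

# Usage (not run here); -d 1 limits the listing to this dataset's own snapshots:
#   zfs list -H -t snapshot -d 1 -o name -s creation mypool/home \
#     | newer_snapshots mypool/home@2025-01-15
```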

Access snapshot contents without rolling back

Every dataset exposes its snapshots under the hidden .zfs/snapshot directory — no mount command needed.

# Paths assume mypool/home is mounted at /home (for example via -o mountpoint=/home)
ls /home/.zfs/snapshot/
cp /home/.zfs/snapshot/2025-01-15/alice/.bashrc /home/alice/.bashrc

Delete a snapshot

sudo zfs destroy mypool/home@2025-01-15

Routine Maintenance

Scrubbing

A scrub reads every block in the pool, verifies checksums, and repairs any errors it can. Run it monthly at minimum. On large pools it takes hours — schedule it with a systemd timer or cron.

sudo zpool scrub mypool

# Check scrub progress
zpool status mypool | grep scan
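
For the cron route, a monthly schedule fits in a single cron.d entry. The sketch below only prints that entry; the schedule (03:00 on the 1st), the /usr/sbin/zpool path, and the file name in the usage comment are assumptions to adjust. Some distro packages already ship their own scrub schedule, so check before adding a second one.

```shell
# scrub_cron_line POOL
# Prints a cron.d entry that scrubs POOL at 03:00 on the 1st of every month.
# The zpool path is an assumption; confirm it with: command -v zpool
scrub_cron_line() {
  pool=$1
  echo "0 3 1 * * root /usr/sbin/zpool scrub ${pool}"
}

scrub_cron_line mypool

# To install (not run here):
#   scrub_cron_line mypool | sudo tee /etc/cron.d/zpool-scrub-mypool
```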

Monitor pool health

zpool status -v
zpool iostat -v 5   # live I/O stats, refreshed every 5 seconds

Troubleshooting

Pool fails to import after reboot

If the cache file is stale, import by scanning for pools:

sudo zpool import -a          # import all available pools
sudo zpool import mypool      # import a specific pool by name

DEGRADED pool — one disk failed

Replace the failed disk (ZFS keeps serving data from a mirror or RAID-Z vdev while degraded). Resilvering starts automatically once zpool replace is issued.

sudo zpool replace mypool /dev/disk/by-id/old-disk /dev/disk/by-id/new-disk
zpool status mypool   # watch resilver progress

Module not loaded after kernel update

DKMS rebuilds modules automatically on kernel upgrades. If it failed:

sudo dkms autoinstall
sudo modprobe zfs

On Fedora/RHEL the DKMS build typically fails when the kernel-devel package matching the new kernel is missing. Install it, then re-run dkms autoinstall.

High ARC memory usage

ZFS's Adaptive Replacement Cache (ARC) aggressively uses free RAM, which is intentional. Linux will reclaim it under memory pressure. If you need a hard cap, set it in /etc/modprobe.d/zfs.conf:

echo 'options zfs zfs_arc_max=4294967296' | sudo tee /etc/modprobe.d/zfs.conf
# 4294967296 bytes = 4 GiB cap; adjust to your needs
# Rebuild the initramfs with the one command that matches your distro:
sudo dracut -f             # Fedora/RHEL
sudo update-initramfs -u   # Debian/Ubuntu
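
zfs_arc_max is given in bytes, so a different cap means redoing the multiplication. A small helper can generate the option line from a size in GiB; arc_max_line is a hypothetical name and the function only prints the line.

```shell
# arc_max_line GIB
# Prints the modprobe option line for an ARC cap of GIB gibibytes.
arc_max_line() {
  gib=$1
  echo "options zfs zfs_arc_max=$((gib * 1024 * 1024 * 1024))"
}

arc_max_line 4   # prints: options zfs zfs_arc_max=4294967296
arc_max_line 8   # prints: options zfs zfs_arc_max=8589934592

# To write the file (not run here):
#   arc_max_line 8 | sudo tee /etc/modprobe.d/zfs.conf
```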
Tested on: Ubuntu 24.04, Debian 12, Fedora 40, Rocky 9

Frequently asked questions

Can I add more disks to an existing RAID-Z vdev?
As of OpenZFS 2.3, online RAID-Z expansion (adding one disk at a time to an existing RAID-Z vdev) is supported. Before 2.3 it was not possible. Adding a whole new vdev to the pool has always been supported and is still the safest path for capacity expansion.
Is ZFS safe to use on a root filesystem?
Yes. Ubuntu has supported ZFS on root since 20.04 via the installer, and it is widely used in production. Fedora/RHEL require manual setup but are stable. The main caveat is that GRUB support for ZFS features lags behind OpenZFS releases — some newer features must be disabled or grub2-install re-run after enabling them.
How much RAM does ZFS need?
The old '1 GB per TB of storage' rule was a myth from legacy Solaris ZFS. OpenZFS runs acceptably with 2 GB of RAM for home use. More RAM means a larger ARC cache and better read performance, but there is no hard minimum tied to pool size.
Does compression slow down performance?
With lz4, almost never — lz4 compression is fast enough that reducing I/O bytes usually makes reads and writes faster, not slower. On CPU-constrained embedded systems or with heavier codecs like gzip-9, there can be a measurable overhead.
What is the difference between a snapshot and a clone?
A snapshot is read-only and stays tied to its origin dataset. A clone is a writable dataset created from a snapshot; it starts out sharing all blocks with the snapshot and only diverges as new data is written. Clones are useful for testing changes to data without duplicating it.
