What is the difference between clat and lat in fio output?

clat (completion latency) measures the time from when the I/O is submitted to the kernel until it completes. lat (total latency) adds the submission overhead on top. For async engines like libaio, they are nearly identical; clat is the number your application actually experiences.

Should I use libaio or io_uring as the I/O engine?

Use io_uring on kernel 5.1 or later with fio 3.18 or later—it has lower overhead and better reflects modern kernel I/O paths. Use libaio for compatibility with older systems or when comparing against historical benchmarks that used it.

Why does fio report lower IOPS than my drive's spec sheet?

Vendor specs are typically measured at a specific queue depth (often 32–128) with preconditioned drives. Consumer NVMe drives also have an SLC write cache that gives burst performance; once it fills, sustained performance drops significantly. Test with a file larger than the cache and match the rated queue depth.

Can I benchmark a filesystem instead of a raw block device?

Yes, using a file on a mounted filesystem is safer and tests the full I/O stack including the filesystem layer. Use direct=1 to bypass the page cache. Raw device tests are more accurate for storage hardware comparisons but are destructive.

How long should each fio test run?

At least 60 seconds with time_based to get past any burst caching effects. For consumer SSDs with large SLC caches, consider 120–300 seconds to see sustained performance after the cache is exhausted.

How to Benchmark Disk Performance with fio

fio (Flexible I/O Tester) is the standard tool for serious disk benchmarking on Linux. Unlike dd, which only scratches the surface, fio lets you simulate real workloads—random 4K reads mimicking a database, sequential 1M writes mimicking a video stream, or mixed read/write patterns matching a mail server. This guide walks through installing fio, writing job files, running meaningful tests, and reading the output without guessing.

Install fio

Install from your distribution's package manager. The version in most LTS repos is recent enough for everything here.

Debian / Ubuntu

sudo apt update && sudo apt install fio

Fedora / RHEL / Rocky

sudo dnf install fio

Arch

sudo pacman -S fio

Confirm the version after installing:

fio --version

Key Concepts Before You Test

Throughput — how many MB/s the device can sustain. Matters for sequential workloads: video, backups, large file copies.
IOPS — I/O operations per second. Matters for random workloads: databases, virtual machine images, busy web servers.
Latency — how long a single I/O takes. Even high-IOPS devices can have tail latencies that wreck application performance.
Queue depth (iodepth) — how many I/Os are in flight simultaneously. SSDs saturate at higher depths; spinning disks rarely benefit past 1–4.
I/O engine — libaio (Linux async I/O) is the right engine for benchmarking block devices. Use io_uring on kernels ≥ 5.1 for lower overhead; results will differ slightly from libaio.

Warning: Always target a test file or a raw block device you can afford to overwrite. Pointing a write job at the block device behind your root partition, or behind any mounted filesystem with important data, can cause data loss or filesystem corruption.

Writing fio Job Files

You can pass all options on the command line, but job files are reproducible and self-documenting. A job file uses INI-like syntax: a [global] section sets defaults, then each named section defines a job.

Sequential Read/Write Throughput

This test measures sustained throughput—what you care about for large sequential transfers.

cat > /tmp/seq-throughput.fio <<'EOF'
[global]
ioengine=libaio
direct=1
runtime=60
time_based
size=4g
numjobs=1
group_reporting

[seq-read]
rw=read
bs=1m
iodepth=8
filename=/tmp/fio-testfile

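[seq-write]
; fio starts all jobs in a file at once by default; stonewall makes this job wait for seq-read to finish
stonewall
rw=write
bs=1m
iodepth=8
filename=/tmp/fio-testfile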
EOF

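direct=1 bypasses the page cache so you measure the device, not RAM. It also matters for libaio, because Linux generally only queues libaio requests asynchronously for unbuffered I/O.
bs=1m is a 1 MiB block size, realistic for sequential workloads.
runtime=60 with time_based runs each job for 60 seconds regardless of how much data is transferred; fio loops over the 4 GiB file if it reaches the end early.
filename=/tmp/fio-testfile assumes /tmp lives on the disk you want to test. Where /tmp is a tmpfs (the default on Fedora, for example), point filename at a directory on the target disk instead.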

Random 4K Read/Write (IOPS Test)

This is the most important benchmark for SSDs and any storage backing a database or VM.

cat > /tmp/rand-iops.fio <<'EOF'
[global]
ioengine=libaio
direct=1
runtime=60
time_based
size=4g
numjobs=4
group_reporting

[rand-read]
rw=randread
bs=4k
iodepth=32
filename=/tmp/fio-testfile

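[rand-write]
; fio starts all jobs in a file at once by default; stonewall makes this job wait for rand-read to finish
stonewall
rw=randwrite
bs=4k
iodepth=32
filename=/tmp/fio-testfile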
EOF

numjobs=4 spawns four parallel workers. Multiply this by iodepth for total outstanding I/Os (128 here).
iodepth=32 keeps the device's internal queue filled on NVMe drives. Drop to 1–4 for spinning disks.

Mixed Read/Write with Latency Focus

Real applications rarely do 100% reads or writes. This job simulates a 70/30 read/write mix and captures latency percentiles.

cat > /tmp/mixed-latency.fio <<'EOF'
[global]
ioengine=libaio
direct=1
runtime=60
time_based
size=4g
numjobs=1
group_reporting
clat_percentiles=1
percentile_list=50:95:99:99.9

[mixed-rw]
rw=randrw
rwmixread=70
bs=4k
iodepth=16
filename=/tmp/fio-testfile
EOF

clat_percentiles=1 and percentile_list tell fio to report completion latency at the 50th, 95th, 99th, and 99.9th percentiles, separately for reads and writes. The 99th and 99.9th percentiles, often called "tail latency", show how slow the worst requests get, which is what users notice at peak load.

Running the Tests

Run a job file by passing it directly to fio:

sudo fio /tmp/rand-iops.fio

To benchmark a raw block device instead of a file (more accurate; a read test like this one leaves the data intact, but any write test destroys the device's contents):

sudo fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k \
  --ioengine=libaio --iodepth=32 --runtime=60 --time_based \
  --numjobs=4 --group_reporting --name=raw-randread
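Because a write job against a raw device destroys its contents, a small guard can check that the target is not mounted before fio runs. The following is a minimal sketch, not a complete safety net: device_is_mounted and the TARGET variable are hypothetical names, and the check only consults /proc/mounts, so it will not notice devices used by LVM, software RAID, or swap.

```shell
# Hypothetical helper: succeeds (exit 0) if the device, or a partition
# whose name starts with it, appears as a mount source in /proc/mounts.
device_is_mounted() {
    awk -v d="$1" 'index($1, d) == 1 { found = 1 } END { exit !found }' /proc/mounts
}

# Example target (assumption); replace with the device you intend to test.
TARGET="${TARGET:-/dev/sdb}"

if device_is_mounted "$TARGET"; then
    echo "refusing: $TARGET (or a partition on it) is mounted" >&2
else
    echo "ok: $TARGET does not appear in /proc/mounts"
fi
```

Run it with TARGET=/dev/yourdevice set in the environment, and only proceed to a write test when it prints the "ok" line.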

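To use io_uring on a modern kernel (kernel ≥ 5.1, fio ≥ 3.18), change the engine in the job file itself. The [global] section already sets ioengine=libaio, and an --ioengine flag on the command line may not override it, so the reliable route is a modified copy:

sed 's/^ioengine=.*/ioengine=io_uring/' /tmp/rand-iops.fio > /tmp/rand-iops-uring.fio
sudo fio /tmp/rand-iops-uring.fio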
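Before switching engines, it can help to confirm that both version requirements are met. This is a rough sketch that assumes GNU sort (for sort -V); version_ge is a hypothetical helper name. Passing the check does not guarantee io_uring is usable, since some containers and hardened systems disable it.

```shell
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

kernel="$(uname -r | cut -d- -f1)"                 # e.g. 5.15.0
if command -v fio >/dev/null 2>&1; then
    fio_ver="$(fio --version | sed 's/^fio-//')"   # e.g. 3.28
else
    fio_ver="0"                                    # fio not installed
fi

if version_ge "$kernel" 5.1 && version_ge "$fio_ver" 3.18; then
    echo "io_uring should be usable (kernel $kernel, fio $fio_ver)"
else
    echo "stick with libaio (kernel $kernel, fio $fio_ver)"
fi
```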

Interpreting fio Output

fio output is dense. Here is a representative snippet from a random-read job (values will vary by device):

rand-read: (groupid=0, jobs=4): err= 0: pid=12345
  read: IOPS=85.2k, BW=333MiB/s (349MB/s)(19.5GiB/60002msec)
    clat (usec): min=98, max=4521, avg=374.12, stdev=98.43
     lat (usec): min=99, max=4524, avg=374.89
    clat percentiles (usec):
     | 50.00th=[  338], 95.00th=[  562], 99.00th=[  742], 99.90th=[ 1237]

IOPS — 85.2k random 4K reads per second. Compare against your device's rated spec.
BW — Bandwidth in MiB/s and MB/s (fio shows both; note the units differ by ~5%).
clat — Completion latency: the time from submitting the I/O until it completes. This is what your application waits for.
lat — Total latency including submission overhead. Usually nearly identical to clat for async engines.
clat percentiles — The 99th percentile at 742 µs means 1 in 100 reads takes ≥ 742 µs. For a latency-sensitive database, watch the 99.9th percentile closely.
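The IOPS and BW figures describe the same measurement, so one can be checked against the other. The sketch below reproduces the arithmetic for the snippet above with awk; iops_to_mib and mib_to_mb are hypothetical helper names, and the small gap between 332.8 and the reported 333 MiB/s comes from fio rounding the IOPS figure to 85.2k.

```shell
# iops_to_mib IOPS BLOCK_BYTES: bandwidth in MiB/s (1 MiB = 1048576 bytes).
iops_to_mib() {
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / 1048576 }'
}

# mib_to_mb MIB: convert MiB/s to MB/s (1 MB = 1000000 bytes).
mib_to_mb() {
    awk -v mib="$1" 'BEGIN { printf "%.1f\n", mib * 1048576 / 1000000 }'
}

iops_to_mib 85200 4096   # 85.2k IOPS at 4 KiB blocks -> prints 332.8
mib_to_mb 333            # the BW figure from the snippet -> prints 349.2
```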

For sequential tests, focus on BW. For random tests, focus on IOPS and clat percentiles. Both matter for mixed workloads.

Saving Output for Comparison

sudo fio /tmp/rand-iops.fio --output=/tmp/results-sda.txt --output-format=normal

Use --output-format=json if you want to parse results with a script or feed them into a dashboard.
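As one possible way to consume the JSON, the sketch below pulls the job name, read IOPS, and 99th-percentile read completion latency out of a results file with jq. It assumes jq is installed and that the file came from a run such as sudo fio /tmp/rand-iops.fio --output=/tmp/results.json --output-format=json; summarize_fio_json and the RESULTS path are hypothetical names. The field names (jobs, read.iops, clat_ns.percentile) match the JSON layout of recent fio 3.x releases, so check them against your own output.

```shell
# summarize_fio_json FILE: print jobname, read IOPS, and p99 read
# completion latency in microseconds, tab-separated, one line per job.
summarize_fio_json() {
    jq -r '.jobs[]
           | [.jobname,
              (.read.iops | floor),
              ((.read.clat_ns.percentile["99.000000"] // 0) / 1000)]
           | @tsv' "$1"
}

# Assumed location of the JSON results; override with RESULTS=/path/to/file.
RESULTS="${RESULTS:-/tmp/results.json}"
if [ -f "$RESULTS" ] && command -v jq >/dev/null 2>&1; then
    summarize_fio_json "$RESULTS"
fi
```

For write jobs, swap .read for .write in the filter.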

Verification

After each run, confirm the test file was actually written (for write jobs):

ls -lh /tmp/fio-testfile

Cross-check sequential read throughput against a quick hdparm read test on the device that holds the test file:

sudo hdparm -t /dev/sda

The numbers won't be identical, because hdparm -t reads through the buffer cache from a single thread while the fio job uses direct I/O at a queue depth of 8. Still, a dramatic difference (more than 2×) suggests something is wrong with your fio setup, such as a missing direct=1 or an unsuitable iodepth.

Troubleshooting

Results look unrealistically fast

You almost certainly forgot direct=1. Without it, fio reads and writes hit the Linux page cache and you are benchmarking RAM, not the disk.
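A second possible cause, offered here as an assumption worth ruling out, is that the test file sits on a RAM-backed filesystem: on distributions that mount /tmp as tmpfs, the job files in this guide would exercise memory rather than the disk. The sketch below checks the filesystem type with GNU df; fstype_of and TEST_DIR are hypothetical names.

```shell
# fstype_of PATH: print the filesystem type backing PATH (GNU df).
fstype_of() {
    df --output=fstype "$1" | tail -n 1
}

# Directory that will hold the fio test file; /tmp matches the job files in this guide.
TEST_DIR="${TEST_DIR:-/tmp}"

case "$(fstype_of "$TEST_DIR")" in
    tmpfs|ramfs) echo "warning: $TEST_DIR is RAM-backed; fio would measure memory" ;;
    *)           echo "ok: $TEST_DIR is on $(fstype_of "$TEST_DIR")" ;;
esac
```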

fio reports "libaio not available"

Distribution packages of fio are normally built with libaio, so this error usually means fio was compiled from source without it. Install the libaio development package and rebuild fio, or use --ioengine=psync as a fallback. psync is synchronous, so iodepth values above 1 have no effect with it.

# Debian/Ubuntu
sudo apt install libaio-dev

# Fedora/RHEL
sudo dnf install libaio-devel

IOPS much lower than the drive's spec sheet

Drive specs are measured at the factory under ideal conditions at a specific queue depth (often 32 or 128). Try increasing iodepth or numjobs, and ensure size is large enough that the drive's SLC cache is exhausted (at least 2–4× the cache size, typically 4–16 GiB on consumer NVMe).
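One way to find the queue depth at which the drive stops scaling is to sweep iodepth and compare the IOPS of each run. The loop below is a sketch under stated assumptions: fio_cmd, TESTFILE, RUN, and the list of depths are illustrative choices, and by default the script only prints the commands. Set RUN=1 to execute them.

```shell
# Test file to read from (assumption; matches the job files in this guide).
TESTFILE="${TESTFILE:-/tmp/fio-testfile}"

# fio_cmd DEPTH: print the fio command line for one random-read run at that depth.
fio_cmd() {
    echo "fio --name=qd$1 --filename=$TESTFILE --size=4g --direct=1" \
         "--rw=randread --bs=4k --ioengine=libaio --iodepth=$1" \
         "--runtime=60 --time_based --group_reporting --output=/tmp/qd$1.txt"
}

for depth in 1 4 16 32 64 128; do
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$(fio_cmd "$depth")"   # actually run the benchmark
    else
        fio_cmd "$depth"              # dry run: just show the command
    fi
done
```

Each run writes its report to /tmp/qdN.txt; the depth at which IOPS stops climbing is roughly where the device saturates.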

Permission denied on /dev/sdX

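Raw device access requires root or membership in the disk group. Use sudo, or add your user to the group for repeated testing. Group membership takes effect at your next login, and it grants raw read/write access to every disk on the system, so prefer sudo on shared machines.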

sudo usermod -aG disk $USER