
How to Diagnose a Slow Network on Linux

Diagnose Linux network slowness layer by layer using ping, mtr, iperf3, ethtool, tcpdump, and dmesg—from bad cables to kernel buffer tuning.

Intermediate | Ubuntu, Debian, Fedora, Arch | 9 min read | Updated May 26, 2026

Before you start

  • Root or sudo access on the machine being diagnosed
  • A second machine or public server available to run iperf3 tests against
  • Basic familiarity with reading ip link and ping output
  • ethtool, mtr, iperf3, and tcpdump installed (install commands shown in guide)

A slow network on Linux can stem from a dozen different places: a flaky cable, a misconfigured NIC, a saturated link, a broken DNS resolver, or a misbehaving application. The trick is to work through the OSI stack methodically rather than guessing. This guide walks you through the essential tools—ping, mtr, iperf3, ethtool, tcpdump, and dmesg—and shows you which layer each one targets so you stop chasing ghosts.

Before you touch a packet analyser, confirm that the NIC and its link are healthy. A duplex mismatch or a bad cable will defeat every software fix you attempt.

Inspect NIC settings with ethtool

Replace eth0 with your actual interface name (find it with ip link).

ethtool eth0

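Look at Speed, Duplex, and Link detected. A gigabit port negotiating at 100 Mb/s or falling back to half duplex is a common culprit. If auto-negotiation settles on the wrong mode, restrict what the NIC advertises to the mode you want. Keep autoneg on for gigabit copper, because 1000BASE-T needs negotiation to bring the link up:

ethtool -s eth0 speed 1000 duplex full autoneg on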
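The Speed, Duplex, and Link detected check can be scripted. The following is a minimal sketch, not part of ethtool itself: check_link is a hypothetical helper name, and it assumes the usual "Speed: 1000Mb/s" / "Duplex: Full" / "Link detected: yes" layout of ethtool output.

```shell
# check_link EXPECTED_MBPS
# Reads `ethtool IFACE` output on stdin and compares it with the speed the
# port is expected to negotiate. Hypothetical helper, assumed output layout.
check_link() {
    awk -v want="$1" '
        /^[ \t]*Speed:/         { speed = $2; sub(/Mb\/s/, "", speed) }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link = $3 }
        END {
            if (link != "yes")        { print "FAIL: no link"; exit 1 }
            if (speed + 0 < want + 0) { print "WARN: speed " speed " Mb/s, expected " want; exit 1 }
            if (duplex != "Full")     { print "WARN: duplex " duplex; exit 1 }
            print "OK: " speed " Mb/s, " duplex " duplex"
        }'
}

# Usage (eth0 is a placeholder interface name):
#   ethtool eth0 | check_link 1000
```

The function returns a non-zero status on any warning, so it can gate later steps in a script.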

Check driver statistics for hardware-level errors:

ethtool -S eth0 | grep -iE 'error|drop|miss|fifo|over'

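These counters are cumulative since the driver loaded, so run the command twice a few seconds apart and see which values are still climbing. Rising CRC, frame, or other receive error counters point to a physical problem: swap the cable or SFP before going further.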
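Because driver counters only ever count up, comparing two snapshots is more telling than a single reading. A rough sketch follows; counters_increased is a hypothetical helper, and it assumes the "name: value" layout that ethtool -S prints. Counter names vary by driver, so the name filter reuses the pattern from the grep above.

```shell
# counters_increased BEFORE AFTER
# Both files hold `ethtool -S IFACE` output taken at different times.
# Prints each error-like counter whose value grew between the snapshots.
counters_increased() {
    awk -F': *' '
        NR == FNR { before[$1] = $2; next }   # first file: remember every value
        ($1 in before) && ($2 + 0 > before[$1] + 0) &&
        tolower($1) ~ /error|drop|miss|fifo|over/ {
            name = $1
            sub(/^[ \t]+/, "", name)          # strip the indentation ethtool adds
            print name ": +" ($2 - before[$1])
        }' "$1" "$2"
}

# Usage (paths and interface name are placeholders):
#   ethtool -S eth0 > /tmp/before; sleep 10; ethtool -S eth0 > /tmp/after
#   counters_increased /tmp/before /tmp/after
```

No output means none of the error-like counters moved during the interval.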

Read kernel messages with dmesg

The kernel logs NIC resets, firmware errors, and link flaps. Filter for your interface:

dmesg -T | grep -iE 'eth0|eno1|enp|firmware|reset|link'

Repeated "link is down / link is up" messages mean a physical or switch-port problem. Firmware error lines mean a driver or NIC issue—check for a driver update via your package manager.
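To put a number on link flapping, the down events can be counted. This is a small sketch with a hypothetical helper name, link_flaps. The exact wording of link messages differs between drivers, so the pattern below ("link is down" or "link down", case-insensitive) is an assumption that may need adjusting for your NIC.

```shell
# link_flaps IFACE
# Reads kernel log text on stdin and prints how many link-down events
# mention the given interface. The message wording is driver-specific.
link_flaps() {
    grep -i -- "$1" | grep -ciE 'link (is )?down' || true
}

# Usage (eth0 is a placeholder interface name):
#   dmesg -T | link_flaps eth0
```

More than a handful of events since boot on a wired port is worth a cable or switch-port swap.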

Test Basic Connectivity and Latency

ping — round-trip time and packet loss

Start with your default gateway, then a well-known external host:

ping -c 20 "$(ip route show default | awk '{print $3; exit}')"
ping -c 20 8.8.8.8

What to look for:

  • Packet loss to the gateway — problem is local: cable, switch port, or NIC.
  • Loss only to external hosts — problem is upstream: ISP or WAN link.
  • High jitter (wildly varying RTT) — congestion or a flaky wireless link.
  • Consistent high RTT to the gateway — gateway CPU overloaded or QoS misconfigured.
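The loss and jitter figures in the list above come from the last two lines of ping output. A minimal sketch for pulling them out is shown below; ping_summary is a hypothetical helper and it assumes the iputils summary format ("N% packet loss" and "rtt min/avg/max/mdev = ..."). The mdev value is ping's own measure of RTT variation and serves here as a rough jitter indicator.

```shell
# ping_summary
# Reads iputils `ping` output on stdin and prints loss, average RTT and mdev.
ping_summary() {
    awk '
        /packet loss/ {
            # the loss percentage is the field just before the word "packet"
            for (i = 1; i < NF; i++) if ($(i + 1) == "packet") loss = $i
        }
        /^rtt / { split($4, r, "/"); avg = r[2]; mdev = r[4] }
        END     { printf "loss=%s avg_ms=%s mdev_ms=%s\n", loss, avg, mdev }'
}

# Usage:
#   ping -c 20 8.8.8.8 | ping_summary
```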

mtr — path-level diagnosis

Install it if needed:

# Debian/Ubuntu
apt install mtr-tiny

# Fedora/RHEL/Rocky
dnf install mtr

# Arch
pacman -S mtr

Run a 100-packet report to a remote host:

mtr --report --report-cycles 100 8.8.8.8

Read the output column by column: Loss% and Avg latency per hop. Loss that appears at hop N and persists through all later hops means that hop, or the link feeding it, is the problem. Loss that appears at one hop only is usually ICMP rate-limiting on a router, not a real problem. A sudden latency jump of 20 ms+ that persists to the destination points to a congested or degraded link at that hop.
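The rule "loss only counts if it persists to the destination" is easy to apply mechanically. The sketch below is a hypothetical helper, mtr_persistent_loss, that assumes the standard mtr --report layout in which hop lines look like "  3.|-- host  5.0%  100 ...". It walks back from the final hop and reports the earliest hop from which every later hop also shows loss.

```shell
# mtr_persistent_loss
# Reads `mtr --report` output on stdin. Loss at a single intermediate hop is
# ignored; only loss that continues through the last hop is reported.
mtr_persistent_loss() {
    awk '
        /\|--/ { n++; hop[n] = $2; loss[n] = $3 + 0 }   # "5.0%" converts to 5
        END {
            start = 0
            for (i = n; i >= 1 && loss[i] > 0; i--) start = i
            if (start)
                printf "persistent loss from hop %d (%s): %.1f%%\n", start, hop[start], loss[start]
            else
                print "no persistent loss"
        }'
}

# Usage:
#   mtr --report --report-cycles 100 8.8.8.8 | mtr_persistent_loss
```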

Measure Raw Throughput

Latency tests tell you about delay; they do not tell you about bandwidth. Use iperf3 to measure actual throughput between two machines.

Set up the iperf3 server

On the remote machine (or a machine on the same LAN if you are testing internal speed):

iperf3 -s

Open the port in your firewall if necessary:

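# firewalld (Fedora/RHEL): runtime-only rule, discarded on reload or reboot
firewall-cmd --add-port=5201/tcp

# ufw (Ubuntu/Debian): persistent, remove afterwards with "ufw delete allow 5201/tcp"
ufw allow 5201/tcp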

Run the iperf3 client

Test download (server→client) and upload (client→server) separately:

# Upload: client sends to server
iperf3 -c SERVER_IP -t 30

# Download: server sends to client (-R reverses direction)
iperf3 -c SERVER_IP -t 30 -R

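Compare results against the negotiated link speed you saw in ethtool. A 1 Gbps link should push 900+ Mbps between two machines on the same switch with no other load. A result that plateaus just under 100 Mbps on a gigabit NIC usually means some port on the path negotiated 100 Mb/s; a duplex mismatch tends to produce much lower, erratic numbers alongside rising error counters. If the link itself is clean, look for a software bottleneck (check CPU during the test with vmstat 1).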
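The "900+ Mbps" figure follows from per-frame overhead. As a worked example, the sketch below estimates the best-case TCP payload rate for a given link speed. It assumes IPv4, TCP timestamps enabled (12 bytes of options), no VLAN tag, and full-size frames; tcp_goodput_mbps is a hypothetical helper name.

```shell
# tcp_goodput_mbps LINK_MBPS [MTU]
# Best-case TCP payload throughput in Mbps for an Ethernet link.
tcp_goodput_mbps() {
    awk -v link="$1" -v mtu="${2:-1500}" 'BEGIN {
        payload = mtu - 20 - 20 - 12      # IPv4 header, TCP header, TCP timestamp option
        wire    = mtu + 14 + 4 + 8 + 12   # Ethernet header, FCS, preamble, inter-frame gap
        printf "%.0f\n", link * payload / wire
    }'
}

# Usage:
#   tcp_goodput_mbps 1000        # standard 1500-byte MTU
#   tcp_goodput_mbps 1000 9000   # jumbo frames
```

With a 1500-byte MTU this works out to about 941 Mbps on gigabit Ethernet, so an iperf3 result in the low 940s means the link is effectively saturated.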

To simulate multiple streams (useful for detecting interrupt-affinity issues):

iperf3 -c SERVER_IP -t 30 -P 4

Inspect Traffic with tcpdump

When latency and throughput numbers point to a specific flow—or when you suspect retransmits—capture packets to confirm.

Capture connection setup and teardown

tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0' -w /tmp/flags.pcap

A flood of TCP RST or repeated SYN packets indicates a connection that cannot complete—often a firewall rule or a server that is not listening.
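Repeated SYNs are easier to spot when grouped by destination. The following sketch counts bare SYN packets (no ACK) per destination address and port. syn_by_dest is a hypothetical helper, and it assumes the text format tcpdump -nn prints for a single-interface capture, where the destination is the fifth field and the flags appear as "Flags [S],".

```shell
# syn_by_dest
# Reads `tcpdump -nn` text output on stdin and prints "count destination"
# for connection attempts, busiest destination first.
syn_by_dest() {
    awk '
        $6 == "Flags" && $7 == "[S]," {   # "[S.]" would be a SYN-ACK, skipped here
            dst = $5
            sub(/:$/, "", dst)            # drop the trailing colon tcpdump appends
            n[dst]++
        }
        END { for (d in n) print n[d], d }' | sort -rn
}

# Usage (reads back the capture written above):
#   tcpdump -nn -r /tmp/flags.pcap | syn_by_dest
```

A destination with many SYNs and few established connections is a good candidate for a firewall or listener check.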

Look for retransmission storms

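tcpdump prints raw segments without labelling retransmissions or duplicate ACKs. tshark, the command-line counterpart to Wireshark, runs the same TCP analysis and can filter on it:

tshark -l -i eth0 -Y 'tcp.analysis.retransmission || tcp.analysis.duplicate_ack' | head -40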

If you prefer a graphical view, save a pcap and open it in Wireshark. The Statistics → TCP Stream Graphs → Time-Sequence (tcptrace) view shows retransmit storms instantly.
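Before capturing anything, the kernel's own TCP counters give a quick sense of how much is being retransmitted. The sketch below reads /proc/net/snmp and prints retransmitted segments as a percentage of segments sent. retrans_pct is a hypothetical helper; the counters are cumulative since boot, so compare two readings taken under load rather than trusting a single value.

```shell
# retrans_pct [FILE]
# Prints RetransSegs as a percentage of OutSegs from the Tcp: lines of
# /proc/net/snmp (or another file in the same format).
retrans_pct() {
    awk '
        $1 == "Tcp:" && !seen {           # first Tcp: line holds the column names
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        $1 == "Tcp:" {                    # second Tcp: line holds the values
            sent = $(col["OutSegs"])
            re   = $(col["RetransSegs"])
            printf("%.2f\n", (sent > 0) ? 100 * re / sent : 0)
        }' "${1:-/proc/net/snmp}"
}

# Usage:
#   retrans_pct
```

A ratio that keeps rising during a transfer is a sign that a capture is worth the effort.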

Identify chatty processes

ss -tupn

This maps open connections to PIDs without needing a capture. Pair it with nethogs for per-process bandwidth in real time:

nethogs eth0

Check Kernel Network Stack Settings

Once you have ruled out hardware and path problems, the bottleneck may be inside the kernel itself.

# Check receive and send buffer sizes
sysctl net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem
# Check for dropped packets at the interface level
ip -s link show eth0

The RX errors and dropped counters in ip -s link are the kernel's per-interface totals and broadly correspond to the driver counters you saw in ethtool -S. If RX drops are climbing under load but ethtool shows no hardware errors, increase the NIC ring buffer:

ethtool -g eth0          # show current and pre-set maximum ring sizes
ethtool -G eth0 rx 4096  # increase RX ring; stay at or below the maximum shown by -g
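The TCP buffer limits printed by the sysctl command earlier in this section are easier to judge against the bandwidth-delay product of the path, which is the amount of data that must be in flight to keep the link full. A worked sketch follows, with bdp_bytes as a hypothetical helper name. Treat the comparison as a rule of thumb, since the usable window is somewhat smaller than the raw buffer size.

```shell
# bdp_bytes LINK_MBPS RTT_MS
# Bandwidth-delay product in bytes: (bits per second / 8) * RTT in seconds.
bdp_bytes() {
    awk -v mbps="$1" -v rtt_ms="$2" 'BEGIN {
        printf "%d\n", mbps * 1000000 / 8 * rtt_ms / 1000
    }'
}

# Usage: a 1 Gbps path with 50 ms round-trip time
#   bdp_bytes 1000 50
# Compare the result with the third (maximum) value of net.ipv4.tcp_rmem.
```

If the result is well above the tcp_rmem and tcp_wmem maximums, a single connection over that path is likely to be window-limited rather than link-limited.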

Verify DNS Is Not the Problem

Slow DNS makes applications feel network-slow even when throughput is fine. Time a lookup explicitly:

time dig google.com @$(resolvectl status | awk '/DNS Servers/{print $3; exit}')

Anything over 100 ms for a local resolver is worth investigating. Check /etc/resolv.conf and, on systemd-resolved systems, resolvectl status to confirm the right server is being used.
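dig reports its own timing on the ";; Query time:" line, which is more precise than wrapping the command in time. The sketch below extracts it and applies the 100 ms threshold. dns_check is a hypothetical helper; piping plain dig into it tests whichever resolver the system is configured to use, which also works on hosts without systemd-resolved.

```shell
# dns_check [LIMIT_MS]
# Reads `dig` output on stdin, prints the query time and flags it when it
# exceeds the limit (default 100 ms).
dns_check() {
    awk -v limit="${1:-100}" '/^;; Query time:/ {
        print $4 " ms " (($4 + 0 > limit + 0) ? "SLOW" : "ok")
    }'
}

# Usage:
#   dig google.com | dns_check 100
```

Run it a few times: the first lookup may miss the cache, so a slow first result followed by fast ones is normal.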

Verification Checklist

Run through these after making any change:

  1. ethtool eth0 — Speed/Duplex show the expected values, Link detected: yes.
  2. ping -c 50 GATEWAY — 0% loss, stable RTT.
  3. mtr --report 8.8.8.8 — no persistent loss along the path.
  4. iperf3 -c SERVER_IP -t 30 — throughput within 10% of link speed.
  5. ip -s link show eth0 — RX/TX error and drop counters not climbing.

Troubleshooting Quick Reference

Symptom                        | Likely Layer      | First Tool
Loss only to gateway           | Physical / L2     | ethtool, dmesg
Loss starts at a specific hop  | Network / L3      | mtr
Good ping, bad throughput      | Transport / L4    | iperf3, tcpdump
Fast locally, slow WAN         | WAN / ISP         | mtr (to external)
Apps slow, network tools fast  | Application / DNS | dig, ss, nethogs
Drops under load only          | Driver / buffer   | ethtool -S, ethtool -G

Tested on: Ubuntu 24.04, Fedora 40, Debian 12, Arch rolling

Frequently asked questions

Why does mtr show packet loss at one hop but not beyond it?
That hop is rate-limiting ICMP responses—a normal router behaviour, not a real problem. Only worry about loss that persists from a given hop all the way to the destination.
iperf3 shows much lower throughput than my link speed. What should I check first?
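Confirm speed and duplex with ethtool, then watch CPU usage during the test with vmstat. A single-stream iperf3 test can be CPU-bound on fast links; try multiple parallel streams with -P 4 to rule that out. iperf3 releases before 3.16 service all streams from a single thread, so on those versions run several iperf3 processes on different ports to spread the load across cores.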
How do I make ethtool ring buffer and duplex changes persist across reboots?
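On systemd systems, set AutoNegotiation=, BitsPerSecond=, Duplex=, and RxBufferSize= in the [Link] section of a systemd .link file that matches the interface. On NetworkManager systems, set the 802-3-ethernet.auto-negotiate, 802-3-ethernet.speed, and 802-3-ethernet.duplex properties with nmcli; recent NetworkManager releases also accept ethtool.ring-rx, and a dispatcher script covers older ones. Direct ethtool -s and -G calls do not survive a reboot.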
My ping RTT is fine but web pages load slowly. What is the likely cause?
Check DNS resolution time with dig and check for TCP retransmits with tcpdump. Also confirm no per-process bandwidth hog with nethogs, and test HTTP specifically with curl -w '%{time_connect} %{time_starttransfer}' to separate connect time from first-byte time.
Is tcpdump safe to run on a production server?
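Yes, with care. On a high-traffic interface, write directly to a pcap file with -w rather than printing to the terminal, which can itself cause drops and add CPU load. Bound the capture with -c (packet count), or rotate files with -C (size) or -G (seconds) together with -W (file count), and use a tight capture filter so only the relevant traffic is recorded.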
