How to Diagnose a Slow Network on Linux
Diagnose Linux network slowness layer by layer using ping, mtr, iperf3, ethtool, tcpdump, and dmesg—from bad cables to kernel buffer tuning.
Before you start
- Root or sudo access on the machine being diagnosed
- A second machine or public server available to run iperf3 tests against
- Basic familiarity with reading ip link and ping output
- ethtool, mtr, iperf3, and tcpdump installed (install commands shown in guide)
A slow network on Linux can stem from a dozen different places: a flaky cable, a misconfigured NIC, a saturated link, a broken DNS resolver, or a misbehaving application. The trick is to work through the OSI stack methodically rather than guessing. This guide walks you through the essential tools—ping, mtr, iperf3, ethtool, tcpdump, and dmesg—and shows you which layer each one targets so you stop chasing ghosts.
Check Physical and Link Layer First
Before you touch a packet analyser, confirm that the NIC and its link are healthy. A half-duplex mismatch or a bad cable will defeat every software fix you attempt.
Inspect NIC settings with ethtool
Replace eth0 with your actual interface name (find it with ip link).
ethtool eth0
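Look at Speed, Duplex, and Link detected. A gigabit port negotiating at 100 Mb/s or falling back to half-duplex is a common culprit. If auto-negotiation keeps settling on the wrong mode, restrict what the NIC advertises to 1000/full. Gigabit over copper requires auto-negotiation, so autoneg stays on: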
ethtool -s eth0 speed 1000 duplex full autoneg on
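The Speed, Duplex, and Link detected check can be scripted so it is easy to repeat after every change. The following is a minimal sketch, not part of ethtool: `check_link` is a hypothetical helper name, and it assumes the usual `Speed: 1000Mb/s` / `Duplex: Full` / `Link detected: yes` lines in the output.

```shell
# check_link EXPECTED_MBPS: reads `ethtool IFACE` output on stdin and
# reports whether speed, duplex, and link state look healthy.
# Hypothetical helper; field names are assumed from typical ethtool output.
check_link() {
    awk -F': *' -v want="$1" '
        /^[ \t]*Speed:/         { speed = $2 }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link = $2 }
        END {
            mbps = speed
            sub(/Mb\/s.*/, "", mbps)   # "1000Mb/s" -> "1000"
            ok = (mbps + 0 >= want + 0 && duplex == "Full" && link == "yes")
            printf "speed=%s duplex=%s link=%s verdict=%s\n", speed, duplex, link, (ok ? "ok" : "check")
            exit (ok ? 0 : 1)
        }
    '
}

# Usage (eth0 and 1000 are examples):
#   ethtool eth0 | check_link 1000
```

The function exits non-zero when something looks off, so it can gate a script or a monitoring check.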
Check driver statistics for hardware-level errors:
ethtool -S eth0 | grep -iE 'error|drop|miss|fifo|over'
Non-zero error counters here point to a physical problem: swap the cable or SFP before going further.
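Driver statistics can run to hundreds of lines, most of them zero. As a rough sketch, the filter below keeps only error-type counters with a non-zero value. `nonzero_errors` is a hypothetical helper name, and counter names vary by driver, so treat the keyword list as an assumption.

```shell
# nonzero_errors: reads `ethtool -S IFACE` output on stdin and prints only
# error-like counters whose value is greater than zero.
# Hypothetical helper; counter naming differs between drivers.
nonzero_errors() {
    awk -F': *' '
        tolower($1) ~ /error|drop|miss|fifo|over/ && $2 + 0 > 0 {
            gsub(/^[ \t]+/, "", $1)   # strip the leading indentation
            print $1 "=" $2
        }
    '
}

# Usage (eth0 is an example):
#   ethtool -S eth0 | nonzero_errors
```

Running it twice a minute apart shows whether a counter is still climbing or is left over from an old event.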
Read kernel messages with dmesg
The kernel logs NIC resets, firmware errors, and link flaps. Filter for your interface:
dmesg -T | grep -iE 'eth0|eno1|enp|firmware|reset|link'
Repeated "link is down / link is up" messages mean a physical or switch-port problem. Firmware error lines mean a driver or NIC issue—check for a driver update via your package manager.
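To put a number on link flaps, count the link-down messages in the kernel log. This is a small sketch: `count_link_flaps` is a hypothetical helper, and the exact wording of link messages depends on the driver (e1000e, for example, logs "NIC Link is Down"), so the pattern is an assumption that may need adjusting.

```shell
# count_link_flaps: reads kernel log text on stdin and prints how many
# link-down events it contains. Hypothetical helper; message wording is
# driver-specific, so adjust the pattern if your NIC logs differently.
count_link_flaps() {
    grep -ciE 'link (is )?down' || true   # grep exits 1 on zero matches
}

# Usage:
#   dmesg -T | count_link_flaps
```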
Test Basic Connectivity and Latency
ping — round-trip time and packet loss
Start with your default gateway, then a well-known external host:
ping -c 20 $(ip route show default | awk '{print $3}')
ping -c 20 8.8.8.8
What to look for:
- Packet loss to the gateway — problem is local: cable, switch port, or NIC.
- Loss only to external hosts — problem is upstream: ISP or WAN link.
- High jitter (wildly varying RTT) — congestion or a flaky wireless link.
- Consistent high RTT to the gateway — gateway CPU overloaded or QoS misconfigured.
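The figures those bullets refer to all sit in the last two lines of ping's output. A minimal sketch for pulling them out is shown here; `ping_summary` is a hypothetical helper, and it assumes the iputils summary format (`rtt min/avg/max/mdev = ...`). The mdev value is a reasonable stand-in for jitter.

```shell
# ping_summary: reads ping output on stdin and prints packet loss,
# average RTT, and mdev (RTT deviation, a proxy for jitter).
# Hypothetical helper; assumes the iputils ping summary layout.
ping_summary() {
    awk '
        /packet loss/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { loss = $i; sub(/%/, "", loss) }
        }
        /^(rtt|round-trip) / {
            split($4, t, "/")   # fields are min/avg/max/mdev
            avg = t[2]; mdev = t[4]
        }
        END { printf "loss=%s%% avg=%sms mdev=%sms\n", loss, avg, mdev }
    '
}

# Usage (8.8.8.8 as in the text):
#   ping -c 20 8.8.8.8 | ping_summary
```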
mtr — path-level diagnosis
Install it if needed:
# Debian/Ubuntu
apt install mtr-tiny
# Fedora/RHEL/Rocky
dnf install mtr
# Arch
pacman -S mtr
Run a 100-packet report to a remote host:
mtr --report --report-cycles 100 8.8.8.8
Read the output column by column: Loss% and Avg latency per hop. Loss that appears at hop N and persists through all later hops means that hop is the problem. Loss that appears at one hop only is usually ICMP rate-limiting on a router—not a real problem. A sudden latency jump of 20 ms+ that persists to the destination points to a congested or degraded link at that hop.
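The "persists through all later hops" rule can be applied mechanically. The sketch below walks the report from the last hop backwards and reports where the unbroken run of loss begins. `mtr_persistent_loss` is a hypothetical helper, and it assumes the default `mtr --report` layout, where each hop line starts with `N.|--` followed by the host and Loss%.

```shell
# mtr_persistent_loss: reads `mtr --report` output on stdin and prints the
# first hop from which loss continues all the way to the destination.
# Isolated loss at a middle hop (ICMP rate-limiting) is ignored.
# Hypothetical helper; assumes the default report column order.
mtr_persistent_loss() {
    awk '
        /^[ \t]*[0-9]+\.[|]--/ {
            n++
            host[n] = $2
            loss[n] = $3
            sub(/%/, "", loss[n])
        }
        END {
            if (n == 0 || loss[n] + 0 == 0) { print "no persistent loss"; exit }
            first = n
            while (first > 1 && loss[first - 1] + 0 > 0) first--
            printf "loss persists from hop %d (%s): %s%%\n", first, host[first], loss[first]
        }
    '
}

# Usage:
#   mtr --report --report-cycles 100 8.8.8.8 | mtr_persistent_loss
```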
Measure Raw Throughput
Latency tests tell you about delay; they do not tell you about bandwidth. Use iperf3 to measure actual throughput between two machines.
Set up the iperf3 server
On the remote machine (or a machine on the same LAN if you are testing internal speed):
iperf3 -s
Open the port in your firewall if necessary:
# firewalld (Fedora/RHEL): without --permanent the rule is runtime-only and disappears on reload or reboot
firewall-cmd --add-port=5201/tcp
# ufw (Ubuntu/Debian)
ufw allow 5201/tcp
Run the iperf3 client
Test download (server→client) and upload (client→server) separately:
# Upload: client sends to server
iperf3 -c SERVER_IP -t 30
# Download: server sends to client (-R reverses direction)
iperf3 -c SERVER_IP -t 30 -R
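Compare results against the negotiated link speed you saw in ethtool. A 1 Gbps link should push 900+ Mbps between two machines on the same switch with no other load. If you top out just under 100 Mbps on a gigabit NIC, the link has most likely negotiated at 100 Mb/s somewhere along the path, so recheck ethtool on both ends and the cable. Throughput that is low and erratic, with error counters climbing, points more toward a duplex mismatch. If the link looks right, suspect a software bottleneck (check CPU during the test with vmstat 1).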
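The comparison against link speed is simple arithmetic on the receiver line of the iperf3 summary. As an illustrative sketch, the helper below does that calculation. `link_utilisation` is a hypothetical name, and it assumes iperf3 was run with `-f m` so the bitrate is reported in Mbits/sec.

```shell
# link_utilisation LINK_MBPS: reads iperf3 text output on stdin and prints
# receiver throughput as a percentage of the given link speed.
# Hypothetical helper; assumes `iperf3 -f m` so rates are in Mbits/sec.
link_utilisation() {
    awk -v link="$1" '
        /receiver[ \t]*$/ {
            for (i = 2; i <= NF; i++)
                if ($i == "Mbits/sec") mbps = $(i - 1)
        }
        END {
            if (mbps == "") { print "no receiver line in Mbits/sec found"; exit 1 }
            printf "%.0f Mbits/sec = %.0f%% of %d Mb/s link\n", mbps, 100 * mbps / link, link
        }
    '
}

# Usage (1000 is the negotiated speed from ethtool):
#   iperf3 -c SERVER_IP -t 30 -f m | link_utilisation 1000
```

With parallel streams (-P), the last receiver line is the [SUM] line, so the same helper reports the aggregate.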
To simulate multiple streams (useful for detecting interrupt-affinity issues):
iperf3 -c SERVER_IP -t 30 -P 4
Inspect Traffic with tcpdump
When latency and throughput numbers point to a specific flow—or when you suspect retransmits—capture packets to confirm.
Capture connection setup and teardown flags
tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0' -w /tmp/flags.pcap
A flood of TCP RST or repeated SYN packets indicates a connection that cannot complete—often a firewall rule or a server that is not listening.
Look for retransmission storms
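# tcpdump does not label retransmissions or duplicate ACKs; tshark's TCP analysis does (requires tshark)
tshark -i eth0 -n -l -Y 'tcp.analysis.retransmission or tcp.analysis.duplicate_ack' 2>/dev/null | head -40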
If you prefer a graphical view, save a pcap and open it in Wireshark. The Statistics → TCP Stream Graphs → Time-Sequence (tcptrace) view shows retransmit storms instantly.
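For a quick retransmit figure without any capture, the kernel keeps cumulative TCP counters in /proc/net/snmp. The sketch below is a complement to the packet captures above, not a replacement. `tcp_retrans_pct` is a hypothetical helper, and the counters it reads (`OutSegs`, `RetransSegs`) are totals since boot, so compare two readings taken during the slow period rather than trusting a single value.

```shell
# tcp_retrans_pct [FILE]: prints retransmitted segments as a percentage of
# all segments sent, using the Tcp: header/value line pair in /proc/net/snmp.
# Hypothetical helper; pass "-" to read the same format from stdin.
tcp_retrans_pct() {
    awk '
        $1 == "Tcp:" && !seen {            # first Tcp: line holds column names
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        $1 == "Tcp:" {                     # second Tcp: line holds the values
            out = $(col["OutSegs"]); re = $(col["RetransSegs"])
            if (out > 0)
                printf "%.2f%% of %s segments retransmitted\n", 100 * re / out, out
        }
    ' "${1:-/proc/net/snmp}"
}

# Usage:
#   tcp_retrans_pct
```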
Identify chatty processes
ss -tupn
This maps open connections to PIDs without needing a capture. Pair it with nethogs for per-process bandwidth in real time:
nethogs eth0
Check Kernel Network Stack Settings
Once you have ruled out hardware and path problems, the bottleneck may be inside the kernel itself.
# Check receive and send buffer sizes
sysctl net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem
# Check for dropped packets at the interface level
ip -s link show eth0
The RX errors and dropped counters in ip -s link are the kernel's aggregate view of the interface; ethtool -S breaks the same problems down into per-driver counters. If RX drops are climbing under load but ethtool shows no hardware errors, increase the NIC ring buffer:
ethtool -g eth0 # show current and max ring sizes
ethtool -G eth0 rx 4096 # increase RX ring; the value must not exceed the pre-set maximum shown by -g
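Before raising the ring size, it helps to see how much headroom the NIC has. The sketch below extracts the current and maximum RX ring sizes from the `ethtool -g` output. `ring_headroom` is a hypothetical helper, and it assumes the usual two-section layout ("Pre-set maximums" followed by "Current hardware settings").

```shell
# ring_headroom: reads `ethtool -g IFACE` output on stdin and prints the
# current RX ring size next to the hardware maximum.
# Hypothetical helper; assumes the standard two-section ethtool -g layout.
ring_headroom() {
    awk '
        /^Pre-set maximums/          { section = "max" }
        /^Current hardware settings/ { section = "cur" }
        $1 == "RX:"                  { rx[section] = $2 }
        END { printf "rx current=%s max=%s\n", rx["cur"], rx["max"] }
    '
}

# Usage (eth0 is an example):
#   ethtool -g eth0 | ring_headroom
```

If current already equals max, a larger ring is not available and the drops need another fix.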
Verify DNS Is Not the Problem
Slow DNS makes applications feel network-slow even when throughput is fine. Time a lookup explicitly:
time dig google.com @$(resolvectl status | awk '/^[ \t]*DNS Servers:/{print $3; exit}')
Anything over 100 ms for a local resolver is worth investigating. Check /etc/resolv.conf and, on systemd-resolved systems, resolvectl status to confirm the right server is being used.
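A single lookup can be misleading because the resolver may answer from cache. A rough sketch for summarising several lookups is shown below, using the `;; Query time:` line that dig prints. `query_time_stats` is a hypothetical helper and `RESOLVER_IP` is a placeholder for your resolver's address.

```shell
# query_time_stats: reads the output of one or more dig runs on stdin and
# prints the number of queries plus average and worst query time.
# Hypothetical helper; relies on dig's ";; Query time: N msec" line.
query_time_stats() {
    awk '
        /^;; Query time:/ {
            n++
            sum += $4
            if ($4 + 0 > max) max = $4 + 0
        }
        END { if (n) printf "queries=%d avg=%.1fms max=%dms\n", n, sum / n, max }
    '
}

# Usage (RESOLVER_IP is a placeholder):
#   for i in 1 2 3 4 5; do dig google.com @RESOLVER_IP; done | query_time_stats
```

A large gap between max and avg usually means the first, uncached lookup was slow, which is what users notice.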
Verification Checklist
Run through these after making any change:
- ethtool eth0: Speed/Duplex show the expected values, Link detected: yes.
- ping -c 50 GATEWAY: 0% loss, stable RTT.
- mtr --report 8.8.8.8: no persistent loss along the path.
- iperf3 -c SERVER_IP -t 30: throughput within 10% of link speed.
- ip -s link show eth0: RX/TX error and drop counters not climbing.
Troubleshooting Quick Reference
| Symptom | Likely Layer | First Tool |
|---|---|---|
| Loss to the gateway itself | Physical / L2 | ethtool, dmesg |
| Loss starts at a specific hop | Network / L3 | mtr |
| Good ping, bad throughput | Transport / L4 | iperf3, tcpdump |
| Fast locally, slow WAN | WAN / ISP | mtr (to external) |
| Apps slow, network tools fast | Application / DNS | dig, ss, nethogs |
| Drops under load only | Driver / buffer | ethtool -S, ethtool -G |
Frequently asked questions
- Why does mtr show packet loss at one hop but not beyond it?
- That hop is rate-limiting ICMP responses—a normal router behaviour, not a real problem. Only worry about loss that persists from a given hop all the way to the destination.
- iperf3 shows much lower throughput than my link speed. What should I check first?
- Confirm duplex with ethtool, then watch CPU usage during the test with vmstat. A single-stream iperf3 test can be CPU-bound on fast links; try multiple parallel streams with -P 4 to rule that out.
- How do I make ethtool ring buffer and duplex changes persist across reboots?
- On systemd systems, set AutoNegotiation=, BitsPerSecond=, Duplex=, and RxBufferSize= in the [Link] section of a .link file (see systemd.link(5)); these are .link options applied by udev, not .network options. On NetworkManager systems use a dispatcher script or nmcli to set ethtool options. Direct ethtool -s and -G calls do not survive a reboot.
- My ping RTT is fine but web pages load slowly. What is the likely cause?
- Check DNS resolution time with dig and check for TCP retransmits with tcpdump. Also confirm no per-process bandwidth hog with nethogs, and test HTTP specifically with curl -w '%{time_connect} %{time_starttransfer}' to separate connect time from first-byte time.
- Is tcpdump safe to run on a production server?
- Yes, with care. On a high-traffic interface, write directly to a pcap file with -w rather than printing to the terminal, which can itself cause drops and add CPU load. Bound the capture with -c (packet count), or rotate files with -C (size) or -G (seconds) together with -W (number of files to keep).