Dropped Packets | Blue Matador

Each network device has a counter for the number of dropped packets. When packets are dropped, the transport layer (layer 4 of the OSI model) is responsible for retransmission.
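
On Linux, these per-interface counters are exposed in /proc/net/dev. The sketch below assumes a Linux host and reads the receive and transmit drop columns for each interface. The function name parse_drop_counters and the SAMPLE text are illustrative only, and the sample values are made up.

```python
from pathlib import Path

# Illustrative sample in the /proc/net/dev layout (values are made up).
SAMPLE = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed\n"
    "    lo:    1234      10    0    0    0     0          0         0     1234      10    0    0    0     0       0          0\n"
    "  eth0:  987654    4321    0    7    0     0          0         0   123456    2100    0    3    0     0       0          0\n"
)


def parse_drop_counters(proc_net_dev_text):
    """Return {interface: (rx_dropped, tx_dropped)} from /proc/net/dev text."""
    counters = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, stats = line.split(":", 1)
        fields = stats.split()
        if len(fields) < 16:
            continue  # not a complete statistics row
        # Column 3 is the receive "drop" field, column 11 the transmit "drop" field.
        counters[name.strip()] = (int(fields[3]), int(fields[11]))
    return counters


if __name__ == "__main__":
    path = Path("/proc/net/dev")
    # Fall back to the sample text on systems without /proc/net/dev.
    text = path.read_text() if path.exists() else SAMPLE
    for iface, (rx, tx) in parse_drop_counters(text).items():
        print(f"{iface}: rx_dropped={rx} tx_dropped={tx}")
```

The counters are cumulative since boot, so the useful signal is how quickly they grow between two readings, not their absolute value.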

UDP is fast but unreliable: the protocol never retransmits lost packets, and datagrams discarded above the interface (for example, when a socket receive buffer overflows) do not show up in the interface's dropped packet counter. If you are dropping a lot of packets and also sending a lot of UDP traffic, the problem may be worse than the network interface is letting on.
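
To see UDP losses that the interface counter does not report, you can look at the protocol-level statistics. The sketch below assumes a Linux host, where /proc/net/snmp carries a header line and a value line prefixed with "Udp:". The function name parse_udp_counters and the UDP_SAMPLE values are illustrative.

```python
from pathlib import Path

# Illustrative sample of the two "Udp:" lines in /proc/net/snmp (values are made up).
UDP_SAMPLE = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors InCsumErrors IgnoredMulti\n"
    "Udp: 1000 5 12 900 12 0 0 0\n"
)


def parse_udp_counters(proc_net_snmp_text):
    """Return the UDP counters as {name: value}, or {} if they are absent."""
    rows = [
        line.split()
        for line in proc_net_snmp_text.splitlines()
        if line.startswith("Udp:")  # does not match the separate "UdpLite:" rows
    ]
    if len(rows) < 2:
        return {}
    header, values = rows[0], rows[1]
    # Pair each column name with its value, skipping the "Udp:" prefix.
    return dict(zip(header[1:], (int(v) for v in values[1:])))


if __name__ == "__main__":
    path = Path("/proc/net/snmp")
    text = path.read_text() if path.exists() else UDP_SAMPLE
    udp = parse_udp_counters(text)
    print("InErrors:", udp.get("InErrors"), "RcvbufErrors:", udp.get("RcvbufErrors"))
```

A growing RcvbufErrors value suggests datagrams are being discarded because the receiving application is not draining its socket fast enough.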

While TCP does retransmit dropped packets, doing so can take multiple seconds, which can push your applications past their timeouts. RFC 6298 defines the retransmission timeout (RTO) calculation: the timer starts at 1 second and doubles after each consecutive timeout. In short, your application will fare better with a reliable network infrastructure.
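
To make the cost of a retransmission concrete, here is a minimal sketch of the RFC 6298 timer. The constants (alpha = 1/8, beta = 1/4, K = 4, a 1 second floor, doubling on timeout) come from the RFC. The class name RtoEstimator, the 1 ms clock granularity, and the 60 second cap are assumptions for illustration, and real operating systems may use different bounds.

```python
class RtoEstimator:
    """Sketch of the RFC 6298 retransmission timeout (RTO) calculation."""

    ALPHA = 1 / 8  # smoothing factor for SRTT
    BETA = 1 / 4   # smoothing factor for RTTVAR
    K = 4          # variance multiplier

    def __init__(self, clock_granularity=0.001, min_rto=1.0, max_rto=60.0):
        self.g = clock_granularity  # assumed 1 ms clock granularity
        self.min_rto = min_rto      # RFC 6298 rounds RTO up to 1 second
        self.max_rto = max_rto      # optional cap; the RFC requires at least 60 s
        self.srtt = None            # smoothed round-trip time
        self.rttvar = None          # round-trip time variation
        self.rto = 1.0              # initial RTO before any measurement

    def on_rtt_sample(self, r):
        """Update the estimator with a round-trip time measurement in seconds."""
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto

    def on_timeout(self):
        """Back off the timer after a retransmission timeout."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


if __name__ == "__main__":
    est = RtoEstimator()
    print("RTO after a 100 ms sample:", est.on_rtt_sample(0.100))
    waited = 0.0
    for loss in range(1, 4):
        waited += est.rto  # time spent waiting for this timer to expire
        est.on_timeout()
        print(f"loss {loss}: waited {waited} s so far, next RTO {est.rto} s")
```

Under these assumptions, a connection with a 100 ms round trip is held at the 1 second floor, so three consecutive losses of the same segment add up to 1 + 2 + 4 = 7 seconds of waiting.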

Dropped packets are most often a symptom of failing hardware: the network card, network cables, or network devices like switches and routers. They can also be a throughput issue, where you’re sending more data than the network components can handle. When network devices receive traffic faster than they can forward it, they hold the excess in buffers. When the buffers are full, new traffic is dropped.
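
The buffer behavior described above can be sketched as a simple tail-drop queue. This is a toy model with made-up numbers, not a description of any particular device. The function name simulate_tail_drop and its parameters are illustrative.

```python
def simulate_tail_drop(arrivals_per_tick, capacity, drain_per_tick):
    """Simulate a fixed-size buffer that drops arrivals once it is full.

    arrivals_per_tick: packets arriving in each time step
    capacity:          maximum number of packets the buffer can hold
    drain_per_tick:    packets the device can forward per time step
    Returns (forwarded, dropped, still_queued).
    """
    queued = 0
    forwarded = 0
    dropped = 0
    for arrivals in arrivals_per_tick:
        # Accept as many packets as there is free buffer space; drop the rest.
        accepted = min(arrivals, capacity - queued)
        dropped += arrivals - accepted
        queued += accepted
        # Forward up to the device's per-tick capacity.
        sent = min(queued, drain_per_tick)
        queued -= sent
        forwarded += sent
    return forwarded, dropped, queued


if __name__ == "__main__":
    # Steady traffic with one burst in the third tick.
    print(simulate_tail_drop([5, 5, 20, 5], capacity=10, drain_per_tick=6))
```

In this model, steady traffic below the drain rate never loses a packet, while a single burst larger than the buffer is partly dropped even though average load is modest.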

Effects

Possible issues caused by dropped packets include:

Increased number of timeouts to databases, services, and caches
Missing data in syslog, statsd, or other UDP monitoring tools
Increased 5xx HTTP status codes due to unexpected timeouts
Difficulties connecting to the server

Quick Fix

If you’re in a cloud environment, terminate your server and launch a replacement, which usually places it on a different physical host.

If you’re in a physical environment, correlate dropped packet reports to identify the faulty hardware.
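
One way to correlate reports is to look for components that every affected server depends on and that no healthy server uses. The sketch below assumes you can write down which components sit on each server's network path. The function name suspect_components and the server and device names are hypothetical.

```python
def suspect_components(paths, affected):
    """Return components shared by all affected servers and used by no healthy one.

    paths:    {server: set of components on that server's network path}
    affected: servers currently reporting dropped packets
    """
    affected = set(affected)
    if not affected:
        return set()
    healthy = set(paths) - affected
    # Components that every affected server depends on.
    common = set.intersection(*(set(paths[s]) for s in affected))
    # Components that also carry traffic for servers without drops.
    cleared = set().union(*(paths[s] for s in healthy))
    return common - cleared


if __name__ == "__main__":
    # Hypothetical topology: two web servers share switch-a, the database uses switch-b.
    topology = {
        "web1": {"switch-a", "router-1"},
        "web2": {"switch-a", "router-1"},
        "db1": {"switch-b", "router-1"},
    }
    print(suspect_components(topology, affected={"web1", "web2"}))
```

This is only a first filter. A component that fails intermittently can still carry traffic for a server that currently looks healthy, so an empty result does not clear the hardware.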

Thorough Fix

Start with network components including switches, routers, cables, and gateways. Replace them systematically while watching the dropped packet counters on servers within the network. If throughput is the issue, the network may only drop packets at certain times of day, when traffic peaks.
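
To check whether drops cluster around peak hours, you can record the cumulative drop counter at regular intervals and group the increases by hour. The sketch below assumes samples are collected in time order as (hour of day, cumulative counter) pairs. The function name drops_by_hour and the sample values are illustrative.

```python
from collections import defaultdict


def drops_by_hour(samples):
    """Sum counter increases per hour of day.

    samples: list of (hour_of_day, cumulative_drop_counter) in time order.
    """
    per_hour = defaultdict(int)
    for (_, previous), (hour, current) in zip(samples, samples[1:]):
        delta = current - previous
        if delta < 0:
            # The counter went backwards (for example after a reboot),
            # so treat the new value as the increase since the reset.
            delta = current
        per_hour[hour] += delta
    return dict(per_hour)


if __name__ == "__main__":
    # Made-up readings taken once per hour.
    readings = [(8, 100), (9, 100), (10, 105), (11, 400), (12, 402)]
    totals = drops_by_hour(readings)
    worst = max(totals, key=totals.get)
    print(totals, "worst hour:", worst)
```

If nearly all of the increase falls in the same hours every day, throughput is a more likely cause than failing hardware.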

After all network components have been verified, upgrade the firmware and device drivers on the affected servers. Also consider upgrading the OS.

Implement retry logic with explicit timeouts inside your application so that you, rather than TCP's retransmission timer alone, control how long a request waits before it is retried.
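
A minimal sketch of such retry logic follows. It assumes the wrapped operation enforces its own timeout (for example, a socket opened with a timeout argument) and raises TimeoutError or ConnectionError on failure. The function name call_with_retries and its default values are illustrative choices, not recommendations.

```python
import random
import time


def call_with_retries(operation, attempts=3, base_delay=0.2, max_delay=2.0,
                      sleep=time.sleep):
    """Call operation(), retrying on timeouts and connection errors.

    Waits a random time between retries, with the upper bound doubling on
    each attempt up to max_delay (exponential backoff with jitter).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt == attempts - 1:
                break  # out of attempts; re-raise below
            upper = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, upper))
    raise last_error


if __name__ == "__main__":
    outcomes = iter([TimeoutError("simulated"), TimeoutError("simulated"), "ok"])

    def flaky_operation():
        # Stand-in for a network call with its own short timeout.
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    print(call_with_retries(flaky_operation))
```

Only retry operations that are safe to repeat. Keeping the per-attempt timeout short and the number of attempts small bounds the total time a caller can wait.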

Resources

Packet Loss (Wikipedia)
Open Systems Interconnection (OSI) model (Wikipedia)
4 Causes of Packet Loss and How to Fix Them (Annesse)
TCP/IP network connectivity problems in Linux on AWS (AWS Documentation)
Troubleshoot VPN Packet Loss in Linux on AWS (AWS Documentation)
How to Check for Dropped Packets in Windows (Techwalla)
Investigating Lost Packets With Wireshark (YouTube, 2min 26s)
Detect dropped network packets on your linux system (The Wowza Guru)
RFC 6298: Computing TCP's Retransmission Timer (IETF)