/ˈleɪ.tən.si/

noun — "the wait time between asking and getting."

Latency is the amount of time it takes for data to travel from a source to a destination across a network. It measures delay rather than capacity, and directly affects how responsive applications feel, especially in real-time applications such as voice, video, and interactive services.
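
In practice, latency is often estimated by timing a round trip. The sketch below is one minimal way to do this in Python, timing a TCP handshake (roughly one round trip); the helper name, host, and port are illustrative placeholders, not a standard API:

    import socket
    import time

    def tcp_connect_latency_ms(host: str, port: int = 443) -> float:
        """Estimate round-trip latency by timing a TCP handshake (~1 RTT)."""
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass  # connection established; the handshake time is the measurement
        return (time.perf_counter() - start) * 1000  # seconds -> milliseconds

    # Hypothetical target; any reachable host with an open port works:
    # print(f"{tcp_connect_latency_ms('example.com'):.1f} ms")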

Technically, latency is usually measured in milliseconds (ms) and is the sum of four components: propagation delay, processing delay, queuing delay, and transmission delay. It plays a critical role in IP-based networks, wide-area network (WAN) links, and transport protocols such as TCP. Even with high Bandwidth, high latency can make a network feel slow or unresponsive.
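
Because those four components simply add, a back-of-the-envelope total is easy to compute. The following sketch illustrates this for a single link; every figure in it is an assumed example value, not a measurement:

    def one_way_latency_ms(distance_km: float, packet_bits: int, link_bps: float,
                           processing_ms: float = 0.1, queuing_ms: float = 0.5) -> float:
        """Sum the four classic delay components for a single link (illustrative)."""
        propagation_ms = distance_km / 200.0             # light in fiber: ~200 km per ms
        transmission_ms = packet_bits / link_bps * 1000  # time to serialize the packet
        return propagation_ms + transmission_ms + processing_ms + queuing_ms

    # 1,000 km of fiber, one 1,500-byte packet, a 1 Gb/s link:
    # ~5.0 ms propagation + ~0.012 ms transmission + 0.6 ms assumed overhead
    print(round(one_way_latency_ms(1000, 1500 * 8, 1e9), 3))  # -> 5.612

Note that propagation dwarfs transmission here: on long paths, distance, not link speed, sets the delay.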

Network mechanisms such as QoS can reduce the impact of latency by prioritizing time-sensitive traffic, but they cannot remove physical limits such as distance and the speed of light: light in optical fiber covers roughly 200 km per millisecond, so a 6,000 km transatlantic path has a floor of about 30 ms one way no matter how the traffic is handled. This is why latency is typically lower on local networks than across global Internet paths.

Key characteristics of latency include:

  • Time-based metric: measures delay, not data volume.
  • Distance-sensitive: increases with physical and logical path length.
  • Critical for real-time traffic: voice, gaming, and video are highly sensitive.
  • Independent of bandwidth: high throughput does not guarantee low latency (see the sketch after this list).
  • Cumulative: each network hop adds delay.
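
The last two characteristics can be made concrete with a toy model (all numbers below are assumed): delivery time is the sum of per-hop latencies plus message size divided by the bottleneck link rate, so for small interactive messages the latency term dominates no matter how fast the links are.

    def delivery_time_ms(message_bytes: int, hop_latencies_ms: list[float],
                         bottleneck_bps: float) -> float:
        """Time to deliver a message: accumulated hop delay plus serialization."""
        path_latency_ms = sum(hop_latencies_ms)  # every hop adds delay
        serialization_ms = message_bytes * 8 / bottleneck_bps * 1000
        return path_latency_ms + serialization_ms

    hops = [0.5, 2.0, 25.0, 2.0, 0.5]  # assumed per-hop delays in ms

    # A 200-byte game update: path latency dominates, even at 10 Gb/s.
    print(round(delivery_time_ms(200, hops, 10e9), 1))    # -> 30.0
    # A 1 GB file: serialization dominates, so Bandwidth matters instead.
    print(round(delivery_time_ms(10**9, hops, 10e9), 1))  # -> 830.0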

In real-world use, low latency is essential for online gaming, VoIP calls, financial trading, and industrial control systems. High latency may still allow large file transfers, but it degrades interactive experiences where immediate feedback matters.

Conceptually, latency is the pause between pressing a doorbell and hearing it ring inside.

See Bandwidth, QoS, WAN, TCP.