Latency and Throughput, Your Network’s Vital Signs

Analyzing these two metrics is the first step in making sure you’re using all the bandwidth that’s available to you.

James Walker, Software Engineer

Maintaining network performance is a crucial part of system reliability. Service providers need to understand how network conditions affect speed so they can enhance their networks to maintain fast data transfers, especially at scale.

Latency and throughput are two of the main metrics used to assess performance. Latency indicates the delay experienced by a piece of data while it's transferred, whereas throughput is a measure of the amount of data that can pass through a network in a given time period.

Analyzing latency and throughput can help identify issues that prevent you from fully utilizing your network's bandwidth. Bandwidth describes the theoretical maximum transfer speed, but actual throughput can fall short of it if your network is misconfigured or suffering from stability issues. In this article, you'll learn how latency and throughput affect performance so you can design more robust network architectures.


Latency: How Fast Data Travels

Latency refers to the delay between starting a data transfer and receiving a response. The delay can occur within the network layer or inside the application that processes the data. You can check a network's latency using the `ping` command, which sends an ICMP echo request to a specific network device and times the reply.
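
As a quick illustration, the following Python sketch estimates round-trip latency by timing TCP connection handshakes, a rough stand-in for `ping` when ICMP is blocked. The hostname and port here are placeholder assumptions, not specific to any service.

```python
import socket
import time

def tcp_rtt_ms(host: str, port: int = 443, samples: int = 5) -> float:
    """Estimate round-trip latency by timing TCP handshakes."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=2):
            pass  # connect() returning means one round trip has completed
        timings.append((time.perf_counter() - start) * 1000)
    return min(timings)  # the minimum filters out transient scheduling noise

# Hypothetical endpoint; substitute a host you control
print(f"RTT: {tcp_rtt_ms('example.com'):.1f} ms")
```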

Networks with high latency feel sluggish to use. Even simple actions become slow to respond due to the delay applied to each network request. It's important to anticipate potential sources of latency so you can keep your services performing well.

Because latency is the total time taken to receive a response, it's affected by various factors. The following list provides a starting point to investigate latency issues, but any part of your infrastructure could contribute to delays.

Geographical Distance

Geographic separation between users and your services creates high latency because of the increased distances that network data has to travel. This means more network hops between data centers and internet service providers, longer cable runs and ultimately increased exposure to physical constraints. For instance, light takes about 5 ms to travel 1,000 miles in a vacuum, and real-world networks are slower still: light moves at roughly two-thirds that speed in fiber, and routing adds extra distance and processing time.
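
To see why distance matters, here's a back-of-the-envelope calculation in Python. The refractive index is an assumed typical value for optical fiber, and real paths add routing hops on top of this physical minimum.

```python
# Typical values; real networks add routing, queuing and processing delays
C_VACUUM_KM_PER_S = 299_792       # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.47     # assumed typical value for optical fiber

def one_way_delay_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay through fiber over a given distance."""
    speed_in_fiber = C_VACUUM_KM_PER_S / FIBER_REFRACTIVE_INDEX  # ~204,000 km/s
    return distance_km / speed_in_fiber * 1000

# 1,000 miles is about 1,609 km: ~5.4 ms in a vacuum, ~7.9 ms in fiber, one way
print(f"{one_way_delay_ms(1609):.1f} ms")  # round trips double this figure
```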

To resolve geographical latency issues, you need to deploy additional edge servers closer to your users. Operating regional infrastructure reduces the distances that data has to travel, keeping response times fast. Distributing and caching content across regions ensures content is always available locally to users and minimizes the number of trips to your primary servers.

Faulty or Underperforming Hardware

Hardware issues can cause slowdowns that result in increased network latency. These problems are most commonly related to network interface cards (NICs), such as when an outdated or misconfigured network card is used. Buggy firmware can increase packet processing overheads or restrict the amount of bandwidth that's available. Similarly, physical connection problems, such as mistakenly using a cable rated for 1 Gbps with a 10 Gbps NIC, can reduce capacity and affect the latency you observe.
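
One quick check for this class of problem on Linux is the negotiated link speed the kernel exposes under `/sys/class/net`. The interface name in this sketch is an assumption; adjust it for your environment.

```python
from pathlib import Path

def link_speed_mbps(interface: str) -> int:
    """Read the negotiated link speed in Mb/s from sysfs (Linux only)."""
    # Some drivers raise OSError here if the link is down
    return int(Path(f"/sys/class/net/{interface}/speed").read_text().strip())

# "eth0" is an assumed interface name; a 10 Gbps NIC reporting 1000 here
# suggests the link negotiated down, often due to a cable or port problem
print(f"Negotiated speed: {link_speed_mbps('eth0')} Mb/s")
```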

Keep in mind that it's not always networking equipment that's the culprit. Application-level latency can also be due to high central processing unit (CPU) utilization, insufficient memory or poor I/O performance that limits the speed of storage reads and writes. You should evaluate different hardware options to select an optimal configuration for your workload, such as a modern multicore CPU, adequate memory and the use of fast NVMe SSDs instead of spinning HDDs for storage-dependent applications.

Increased Traffic Load

Network congestion leads to latency because your infrastructure and applications must serve more traffic simultaneously. The extra resource contention means operations take longer to complete, increasing the time that elapses before users get a response. This can also create cascading effects on other systems that interact with your workload.

You can manage traffic load by scaling your infrastructure and services in response to demand changes. For example, starting additional compute nodes, load balancers and app replicas helps maintain consistent performance during usage spikes. Use autoscaling to add new resources automatically and keep capacity aligned with current utilization.
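
As a sketch of the logic an autoscaler applies, the proportional rule below resembles the calculation Kubernetes' Horizontal Pod Autoscaler performs; the target utilization and replica counts are illustrative assumptions.

```python
import math

TARGET_UTILIZATION = 0.60  # assumed target: keep average CPU near 60%

def desired_replicas(current_replicas: int, current_utilization: float) -> int:
    """Proportional scaling rule, similar to Kubernetes' Horizontal Pod Autoscaler."""
    return max(1, math.ceil(current_replicas * current_utilization / TARGET_UTILIZATION))

# Four replicas averaging 90% CPU scale out to six to bring utilization down
print(desired_replicas(4, 0.90))  # -> 6
```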

Poorly Written Applications

Many network teams begin investigating suspected latency by checking for the problems discussed above, but sometimes the culprit is the application itself. Poorly optimized applications can cause latency issues when operations run slowly, occupy a network interface for too long or require excessive CPU, memory or I/O resources.

It's inevitable that some processes will take meaningful time to complete, but you should still architect your applications to minimize latency. Properly designing database queries, data structures, computational algorithms and memory access patterns allows programs to run efficiently without expensive wait times.
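
For example, a common source of avoidable latency is the "N+1 query" pattern, where an application issues one database round trip per item instead of batching. This hypothetical sketch assumes a placeholder `db.query` interface rather than any specific library's API.

```python
# `db.query` is an assumed placeholder interface, not a real library's API

def load_orders_slow(db, user_ids):
    # N+1 pattern: one query (and one network round trip) per user
    return [db.query("SELECT * FROM orders WHERE user_id = ?", (uid,)) for uid in user_ids]

def load_orders_fast(db, user_ids):
    # Batched query: a single round trip regardless of how many users there are
    placeholders = ", ".join("?" for _ in user_ids)
    return db.query(f"SELECT * FROM orders WHERE user_id IN ({placeholders})", tuple(user_ids))
```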

Throughput: How Much Data Travels at Once

Throughput is the amount of data that passes between two points in a network over a given period. If one gigabyte of data is transferred in one second, the throughput is 1 GB/s.
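
One detail worth keeping straight is units: data sizes are usually quoted in bytes (GB) while link speeds are quoted in bits (Gbps). This small helper makes the conversion explicit.

```python
def throughput(bytes_transferred: int, seconds: float) -> tuple[float, float]:
    """Return throughput as (GB/s, Gbps); note that 1 byte = 8 bits."""
    gb_per_s = bytes_transferred / seconds / 1e9
    return gb_per_s, gb_per_s * 8

# One gigabyte in one second is 1 GB/s, which is 8 Gbps in link-speed terms
print(throughput(1_000_000_000, 1.0))  # -> (1.0, 8.0)
```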

Throughput is often used interchangeably with bandwidth, but the two terms refer to different things. Bandwidth describes the maximum possible capacity of the network based on the infrastructure that's used, while throughput is the actual transfer rate achieved in practice.

When a network's throughput is low, data transfers will be slow. The network will struggle to handle demand spikes caused by multiple users, as there will be insufficient resources to keep traffic moving at the speed the available bandwidth suggests. Achieving high throughput is therefore key to creating a performant network.

Factors Affecting Throughput

Throughput is often influenced by other factors connected to network performance. For example, traffic congestion can reduce throughput, as the available resources must be shared between more connections. Latency will also impact throughput because all operations in the network will take more time to complete.

Poor throughput can be caused by packet loss as well. This typically occurs when unreliable hardware components are used in your network infrastructure. They may drop packets that must then be resent, reducing the amount of data that successfully passes through the network in a given time frame. Transfers involving mobile devices are particularly susceptible to packet loss because connectivity can be momentarily lost as the device changes location.
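
The interplay between loss, latency and throughput can be quantified with the well-known Mathis et al. approximation for steady-state TCP throughput. The segment size, RTT and loss rate below are illustrative values, not measurements.

```python
import math

def tcp_throughput_mbps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Mathis approximation: throughput <= (MSS / RTT) * (C / sqrt(p)), C ~= 1.22."""
    return (mss_bytes * 8 / 1e6) / rtt_s * 1.22 / math.sqrt(loss_rate)

# 1,460-byte segments, 50 ms RTT and 0.1% loss cap a flow at roughly 9 Mbps,
# regardless of how much bandwidth the underlying link provides
print(f"{tcp_throughput_mbps(1460, 0.050, 0.001):.1f} Mbps")
```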

When throughput appears to be inconsistent, it's often due to jitter. Jitter occurs when packets within a single transfer experience variable response times. Throughput and jitter can affect each other: low throughput often leads to jitter, and high jitter reduces overall throughput because some packets take longer to reach their destination than others, forcing connections to stay open longer than they otherwise would.
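
One simple way to quantify jitter, loosely modeled on the interarrival jitter calculation in RFC 3550, is to average the variation between consecutive latency samples. The sample values below are made up for illustration.

```python
def jitter_ms(rtt_samples_ms: list[float]) -> float:
    """Average the variation between consecutive round-trip time samples."""
    diffs = [abs(b - a) for a, b in zip(rtt_samples_ms, rtt_samples_ms[1:])]
    return sum(diffs) / len(diffs)

# Made-up samples: the spike to 35 ms drives the jitter figure up
print(f"{jitter_ms([21.0, 24.5, 20.8, 35.2, 22.1]):.1f} ms")  # -> 8.7 ms
```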

Latency vs. Throughput

You've seen that latency and throughput are closely related, but they describe different facets of network performance. Latency measures the delay experienced by individual network requests, while throughput concerns the total amount of data that the network can process in a set time frame.

Low latency means packets will reach their destinations quickly. This is especially important for real-time applications such as video calling, where any extra latency can cause unpleasant lag or stuttering in the stream. Conversely, high throughput ensures large data volumes can be transferred reliably, which is important for file download and streaming scenarios.

Networks should ideally offer both high throughput and low latency. This ensures that transfers will be completed quickly and reliably, even during relatively congested periods. However, this can be difficult to attain in real-world scenarios because so many conditions determine a network's actual performance. Optimizing your infrastructure, carefully architecting your apps and scaling systems to prioritize the metric that matters most to your service can help ensure users get a good experience, but latency and throughput could still be impacted by "last mile" problems like a flaky mobile signal or high utilization at the user's ISP.

Optimize Latency and Throughput for Stable Network Performance

In this article, you learned that latency and throughput issues degrade your network's performance and affect its ability to make use of the available bandwidth. Designing your infrastructure and applications for reduced latency and maximum throughput will improve networking outcomes, increasing user retention and preventing issues caused by dropped packets and connection failures.

Although it's not possible to eradicate latency and throughput issues from every network transfer, there are steps you can take to make it less likely that users will encounter a problem. Equinix's dedicated cloud platform can help reduce latency and maximize throughput by providing infrastructure you can deploy close to your end users. Equinix also supports direct private connections to other cloud providers, ISPs and enterprise networks, which can help reduce the latency of internetwork communications. This gives you consistent performance and high reliability compared to other cloud services, where your data often travels over the public internet and network links that are shared by many other customers.

Published on 15 May 2024