When faster isn’t better: How per-packet load balancing throttles your critical traffic

Q: What is per-session load balancing in SD-WAN optimization?

Per-session load balancing is a traffic distribution method that keeps all packets within a single TCP flow on one network path, preserving packet order. Unlike per-packet load balancing, which splits individual packets across multiple links, per-session approaches prevent out-of-order delivery that triggers unnecessary retransmissions and throughput degradation in TCP-dependent enterprise applications.

Q: How does per-packet compare to per-session load balancing?

Per-packet load balancing distributes individual packets across multiple paths to maximize raw bandwidth, but the resulting out-of-order delivery can cut TCP throughput by up to 70%. Per-session load balancing routes all packets in a flow along one path, preserving TCP's expected packet order. Per-packet methods remain suitable for UDP traffic like video streaming, where the application tolerates out-of-order arrival.

Q: How do out-of-order packets degrade TCP throughput mechanically?

When per-packet distribution routes TCP packets across paths with mismatched latencies, packets arrive out of sequence. TCP interprets this reordering as packet loss, shrinking its congestion window and triggering unnecessary retransmissions. High-latency paths delay acknowledgments, compounding the effect. Latency gaps exceeding 50 milliseconds between bonded links can crash TCP throughput by over 70%.

Q: What business benefit does latency-homogeneous link bonding deliver?

Bonding links with similar latency – ideally less than 20 milliseconds difference – prevents TCP from misinterpreting reordered packets as congestion. Latency gaps under 10 milliseconds maintain smooth throughput, while gaps of 10–50 milliseconds reduce throughput by 20–40%. Enforcing latency homogeneity preserves performance for cloud databases, video conferencing, and other latency-sensitive enterprise applications without sacrificing multi-link bandwidth.

Q: What should enterprises evaluate when deploying SD-WAN TCP optimization?

Enterprises should evaluate three capabilities: per-session load balancing as the default for TCP traffic, real-time latency monitoring that bonds only paths within tight thresholds (ΔRTT under 20 milliseconds), and modern TCP algorithms like BBR. BBR maintains approximately 80% throughput even with 100-millisecond latency swings, compared to roughly 30% for legacy CUBIC TCP stacks.

Per-packet load balancing can slash TCP throughput by up to 70% in multi-path networks. Learn why per-session routing and latency controls protect performance.

By Tayo Ogunseyinde

Systems Engineer

Read Time: 5 min
Published: February 11, 2025
Modified: May 28, 2026

Summary

Per-packet load balancing promises higher bandwidth but can devastate TCP throughput by up to 70% when path latencies diverge. Effective load balancing in SD-WAN optimization requires per-session steering for TCP flows, tight latency homogeneity across bonded links, and modern congestion algorithms like TCP BBR to maintain performance across heterogeneous paths.

Per-packet load balancing causes TCP to misinterpret out-of-order arrivals as congestion, triggering retransmissions and shrinking congestion windows dramatically.
Latency differences exceeding 50ms between bonded paths can crash TCP throughput by over 70%, crippling latency-sensitive applications.
Per-session load balancing preserves packet order within each flow, making it the preferred default for TCP-heavy enterprise environments.
Versa SD-WAN treats paths as equal only when latencies differ by less than 10%, enforcing the homogeneity TCP requires.
TCP BBR maintains roughly 80% throughput even with 100ms latency swings, significantly outperforming older CUBIC TCP under variable conditions.

Imagine your network as a highway. Per-packet load balancing splits your data into tiny cars and sends them down multiple lanes, promising faster speeds.

But what if some lanes have hidden potholes and traffic jams? For TCP, the protocol powering most of your critical apps, per-packet load balancing can backfire spectacularly.

Despite its bandwidth benefits, this approach can cripple TCP throughput in cloud environments. So, how can you avoid becoming a victim of your own network’s “efficiency”?

The hidden trap: How out-of-order packets strangle TCP

TCP, the workhorse of web traffic, thrives on predictability. It assumes packets arrive in order and within a stable timeframe.

But per-packet load balancing tosses this logic out the window by routing packets across paths with mismatched latencies. When packets take different routes, they arrive out of sequence.

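TCP mistakes this chaos for packet loss, slamming the brakes on data flow. Each out-of-order segment makes the receiver send a duplicate acknowledgment (ACK), and three duplicate ACKs in a row are enough to trigger a fast retransmit and a smaller congestion window. High-latency paths, such as satellite links, also delay ACKs, tricking TCP into thinking the network is congested.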

The result? Shrinking congestion windows and stalled transfers. In extreme cases, throughput plummets by 70% – a death knell for latency-sensitive apps like video calls or cloud databases.
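The reordering effect is easy to reproduce on paper. The following is a minimal sketch, not a TCP implementation: it assumes round-robin packet spraying, fixed one-way path delays, and a receiver that sends one cumulative ACK per arriving segment. The function names and the 1ms send interval are illustrative assumptions, and real stacks with SACK or reordering detection will behave less harshly.

```python
def arrival_order(num_packets, path_delays_ms, send_interval_ms=1.0):
    """Sequence numbers in arrival order when packets are sprayed
    round-robin across paths with the given one-way delays (assumed fixed)."""
    arrivals = []
    for seq in range(num_packets):
        path = seq % len(path_delays_ms)  # per-packet: next packet, next path
        arrive_at = seq * send_interval_ms + path_delays_ms[path]
        arrivals.append((arrive_at, seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_duplicate_acks(order):
    """Duplicate ACKs emitted by a simplified cumulative-ACK receiver."""
    received = set()
    next_expected = 0
    dup_acks = 0
    for seq in order:
        received.add(seq)
        if seq == next_expected:
            # In-order segment: advance past everything already buffered.
            while next_expected in received:
                next_expected += 1
        else:
            # Out-of-order segment: receiver re-ACKs the last in-order byte.
            dup_acks += 1
    return dup_acks


matched = arrival_order(8, [10, 10])      # two paths, same latency
mismatched = arrival_order(8, [10, 60])   # 50ms latency gap

print(matched, count_duplicate_acks(matched))        # [0, 1, ..., 7] 0
print(mismatched, count_duplicate_acks(mismatched))  # [0, 2, 4, 6, 1, 3, 5, 7] 3
```

With only eight packets and a 50ms gap, the sender already sees three duplicate ACKs, which is the classic threshold for a fast retransmit, even though nothing was lost.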

Why your network’s “speed boost” fails TCP

Per-packet load balancing maximizes raw bandwidth but ignores TCP’s need for orderly delivery. It’s like serving a gourmet meal course-by-course but shuffling the dishes randomly.

For example, bonding a 50ms terrestrial link with a 500ms satellite path results in ACKs from the satellite arriving too late. TCP’s timers panic, triggering unnecessary retransmissions and throttling speeds to a crawl.

Not all latency differences are created equal. If the difference is less than 10ms, it’s smooth sailing.

However, a 10–50ms difference can cause throughput to drop by 20–40% due to frantic retransmissions. When the difference exceeds 50ms, throughput can crash by over 70% as TCP spends more time recovering than sending.
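As a quick reference, the latency bands above can be written as a lookup. This is only a sketch that encodes the figures quoted in this article; the function name and the returned labels are illustrative, and the numbers are rules of thumb rather than measurements of any particular network.

```python
def latency_gap_impact(path_a_ms, path_b_ms):
    """Map the latency difference between two bonded paths to the
    throughput impact described in this article (rule-of-thumb bands)."""
    if path_a_ms < 0 or path_b_ms < 0:
        raise ValueError("latencies must be non-negative")
    gap_ms = abs(path_a_ms - path_b_ms)
    if gap_ms < 10:
        return "smooth: little to no TCP impact"
    if gap_ms <= 50:
        return "degraded: roughly 20-40% throughput drop"
    return "severe: over 70% throughput drop"


print(latency_gap_impact(50, 55))    # smooth: little to no TCP impact
print(latency_gap_impact(20, 60))    # degraded: roughly 20-40% throughput drop
print(latency_gap_impact(50, 500))   # severe: over 70% throughput drop
```

The last call is the terrestrial-plus-satellite pairing from the example above, which lands firmly in the severe band.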

What you should do: SD-WAN and smarter TCP

To address these issues, ditch per-packet load balancing for TCP and embrace per-session load balancing. Per-session load balancing keeps all packets in a flow on one path, preserving order.
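One common way to keep a flow on a single path is to hash its 5-tuple and use the result to pick a link. The sketch below illustrates that generic technique only; it is an assumption for illustration and not a description of how Versa or any other SD-WAN product selects paths. The flow tuple layout and the path names are hypothetical.

```python
import hashlib


def pick_path(flow, paths):
    """Choose one path for a flow by hashing its 5-tuple.

    flow:  (src_ip, dst_ip, protocol, src_port, dst_port), a hypothetical layout
    paths: list of path names
    Every packet of the same flow hashes to the same path, so order is preserved.
    """
    if not paths:
        raise ValueError("at least one path is required")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(paths)
    return paths[index]


paths = ["mpls", "broadband"]
flow = ("10.0.0.5", "203.0.113.10", "tcp", 51514, 443)

# All packets of this flow take the same path, however many are sent.
chosen = {pick_path(flow, paths) for _ in range(1000)}
print(len(chosen))  # 1
```

A plain hash spreads flows across links but knows nothing about link quality, which is why the latency checks described next still matter.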

It’s the default for SD-WAN solutions like those offered by Versa, which steer traffic based on real-time latency checks. Steps you should take include:

Reserve per-packet load balancing for UDP traffic, such as video streaming, that tolerates out-of-order arrival.
Enforce latency homogeneity by bonding links with similar latency (less than 20ms difference). Versa SD-WAN, for example, treats paths as “equal” only if their latencies differ by less than 10%.
Upgrade your TCP stack to modern algorithms like TCP BBR, used in Versa’s TCP proxy. BBR maintains roughly 80% of throughput even with 100ms latency swings, compared to roughly 30% for older CUBIC TCP.
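The first two steps can be sketched as a small policy check. This is a minimal illustration under stated assumptions: the function names, the path names, and the choice to measure each path against the fastest one are hypothetical, and the 20ms threshold is simply the figure quoted above. It does not represent any vendor's actual configuration or API.

```python
def bondable_paths(latencies_ms, max_delta_ms=20.0):
    """Return the paths whose latency is within max_delta_ms of the
    fastest path (assumed reference point), sorted by name."""
    if not latencies_ms:
        return []
    best = min(latencies_ms.values())
    return sorted(
        name for name, rtt in latencies_ms.items() if rtt - best < max_delta_ms
    )


def balancing_mode(protocol):
    """Per-packet only for UDP; per-session for TCP and, as a
    conservative assumption, for anything else."""
    return "per-packet" if protocol.lower() == "udp" else "per-session"


measured = {"mpls": 30.0, "broadband": 42.0, "satellite": 500.0}

print(bondable_paths(measured))   # ['broadband', 'mpls']
print(balancing_mode("TCP"))      # per-session
print(balancing_mode("udp"))      # per-packet
```

Here the satellite link is left out of the bond because it sits far outside the threshold, while the two terrestrial links remain eligible.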

The bottom line

Per-packet load balancing isn’t evil – it’s just context-sensitive. For TCP-heavy enterprises, the best approach is to default to per-session load balancing, bond links with tight latency controls (ΔRTT <20ms), and deploy SD-WAN for dynamic path selection and TCP optimizations.

By aligning load balancing strategies with protocol quirks, you can dodge the hidden pitfalls and keep your cloud apps running smoothly.

If you would like to get in touch to discuss your SD-WAN deployment, please drop us a line here!

Tayo Ogunseyinde has 20 years of network engineering and design experience and supports customers and partners across EMEA on Versa SD-WAN and SASE deployments. He writes deeply technical content on networking and routing aimed at experienced network engineers and speaks on AI-driven networking at industry events.
