Latency, jitter, packet loss, throughput — how to measure them, correlate them with RAN degradation, and fix them
1. Why Transport KPIs Matter More Than Most Teams Realise
Transport KPIs are the bridge between the physical network and RAN performance. When a gNB shows poor PDCP throughput, elevated RLC retransmission rates, or an increased handover failure ratio, the RAN team checks RF parameters first. Often they find nothing wrong, because the problem is on the transport. Without proper transport KPI monitoring, the investigation stalls and the blame game begins.
The four fundamental transport KPIs are latency, jitter (packet delay variation), packet loss, and throughput/utilisation. Each one has a direct causal relationship with specific RAN KPIs. Understanding these relationships is what separates a transport engineer who can diagnose cross-domain problems from one who just monitors link utilisation.
2. The Four Core KPIs — What They Mean in Practice
Latency — One-Way and Round-Trip
Latency in the transport context is the time a packet takes to traverse the network from source to destination. For 5G interfaces, the relevant latency measurements are:

| Interface | Latency Metric | Threshold | RAN Impact if Exceeded |
|---|---|---|---|
| Fronthaul (RU→DU) | One-way latency | < 100 µs (Option 7-2x) | HARQ timing failure, increased HARQ retransmissions |
| Midhaul (DU→CU) | One-way latency | < 1 ms | PDCP SDU reordering, RLC buffer growth |
| N3 (CU-UP→UPF) | Round-trip latency | < 10 ms (eMBB), < 1 ms (URLLC) | High application RTT, poor user throughput via slow-start |
| N2 (gNB→AMF) | Round-trip latency | < 100 ms | Slow RRC setup, paging delays, handover latency |
| X2/Xn (inter-gNB) | Round-trip latency | < 10 ms (X2 for PDCP split) | NSA: poor dual connectivity performance, PDCP reordering |

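The thresholds in the latency table can be encoded as a simple lookup for automated checks. The sketch below is illustrative only: the metric keys and the helper names `check_latency` and `violations` are hypothetical, and the values are transcribed from the table in microseconds.

```python
# Thresholds transcribed from the latency table, in microseconds.
# Metric keys and helper names are illustrative, not a standard API.
THRESHOLDS_US = {
    "fronthaul_one_way": 100,     # < 100 µs (Option 7-2x)
    "midhaul_one_way": 1_000,     # < 1 ms
    "n3_rtt_embb": 10_000,        # < 10 ms
    "n3_rtt_urllc": 1_000,        # < 1 ms
    "n2_rtt": 100_000,            # < 100 ms
    "xn_rtt": 10_000,             # < 10 ms
}


def check_latency(metric: str, measured_us: float) -> bool:
    """Return True when the measurement is strictly below the threshold."""
    return measured_us < THRESHOLDS_US[metric]


def violations(measurements: dict) -> list:
    """Return the sorted names of metrics that meet or exceed their threshold."""
    return sorted(m for m, v in measurements.items() if not check_latency(m, v))
```

Calling `violations({"n3_rtt_embb": 28_000, "n2_rtt": 40_000})` flags only the N3 path, because 28 ms exceeds the 10 ms eMBB limit while 40 ms is within the N2 limit.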
Jitter — Packet Delay Variation
Jitter is the variation in packet latency over time. A link with an average latency of 2 ms but jitter of 5 ms is worse for real-time services than a link with an average latency of 4 ms and jitter of 0.1 ms. Jitter destroys VoNR quality even when average latency is acceptable, because the receiver's de-jitter buffer can only absorb a bounded amount of delay variation: packets that arrive too late are discarded, and enlarging the buffer adds mouth-to-ear delay.
For PTP timing distribution, jitter on PTP packets (measured as PDV — Packet Delay Variation) directly translates to timing error at the RU. A PDV spike during a burst of data traffic on a shared link can push the RU’s phase error past the 1.5 µs threshold, causing TDD interference.
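To make the comparison concrete, here is a minimal sketch of two common ways to quantify jitter from a series of delay samples: peak-to-peak PDV and the mean absolute difference between consecutive packets. The sample values are invented to match the two links described above, and the function names are illustrative.

```python
from statistics import mean


def peak_to_peak_pdv(delays_ms):
    """Peak-to-peak packet delay variation: maximum delay minus minimum delay."""
    return max(delays_ms) - min(delays_ms)


def mean_consecutive_jitter(delays_ms):
    """Mean absolute delay difference between consecutive packets."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return mean(diffs)


# Invented samples: link A averages 2 ms with 5 ms of PDV,
# link B averages 4 ms with 0.1 ms of PDV.
link_a = [1.0, 1.0, 6.0, 1.0, 1.0, 1.0, 1.0, 4.0]
link_b = [4.0, 4.05, 3.95, 4.0, 4.05, 3.95, 4.0, 4.0]

for name, samples in (("A", link_a), ("B", link_b)):
    print(name, mean(samples), peak_to_peak_pdv(samples), mean_consecutive_jitter(samples))
```

Link A has the lower average but a far larger delay spread, which is what a de-jitter buffer or a PTP servo actually has to cope with.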
Packet Loss — The Silent Throughput Killer
Packet loss in transport is categorised as:
- Random loss: typically caused by bit errors on physical links. On fibre, the bit error rate (BER) should be below 10^-12. A BER above 10^-9 indicates a fibre or connector problem.
- Congestion loss: caused by queue overflow at router interfaces. This is the dominant loss mechanism in live networks. A 1% congestion loss on the N3 path can cut per-flow TCP throughput by roughly 30% or more, depending on RTT and the congestion-control algorithm, because the sender shrinks its congestion window each time it detects loss.
- Policer/shaper loss: intentional drops by traffic policers enforcing rate limits. These must be distinguished from congestion loss, because policer drops on RAN traffic indicate QoS misconfiguration.
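One way to reason about how loss limits TCP is the Mathis approximation, a well-known steady-state model for Reno-style loss-based congestion control: throughput ≈ (MSS / RTT) × (C / √p), with C ≈ √1.5. It is not a method this article prescribes, and it does not describe BBR or application-limited flows; treat the sketch below as a rough estimator only. The function name and default MSS are assumptions.

```python
from math import sqrt


def mathis_throughput_bps(rtt_s, loss_rate, mss_bytes=1460, c=sqrt(1.5)):
    """Rough steady-state TCP throughput estimate (Mathis approximation).

    rtt_s:      round-trip time in seconds
    loss_rate:  packet loss probability, 0 < loss_rate <= 1
    mss_bytes:  maximum segment size (1460 is an assumed typical value)
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / sqrt(loss_rate))


# Estimated per-flow throughput on a 28 ms path at three loss rates.
for loss in (0.0001, 0.001, 0.01):
    print(f"loss {loss:.2%}: {mathis_throughput_bps(0.028, loss) / 1e6:.1f} Mbps")
```

Because the estimate scales with 1/√p, a tenfold increase in loss reduces the modelled throughput by a factor of about 3.2, and halving the RTT doubles it.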
Throughput and Utilisation
Link utilisation is the most visible KPI but also the most misleading. A link at 70% average utilisation can still drop packets during bursts that last only milliseconds, and those bursts never show up in 5-minute polling intervals. Real transport monitoring requires sub-minute granularity and burst detection.

| Utilisation Level | Risk Assessment | Recommended Action |
|---|---|---|
| < 50% | Safe: adequate headroom for bursts | Monitor quarterly, no action needed |
| 50-70% | Watch zone: burst headroom shrinking | Increase monitoring frequency, plan upgrade within 12 months |
| 70-85% | Congestion risk during peak events | Priority upgrade planning, deploy traffic shaping, investigate offload options |
| > 85% | Active congestion likely, VoNR at risk | Immediate upgrade or traffic rerouting required |

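The bands in the utilisation table, and the way averaging hides bursts, can be illustrated with a short sketch. The band labels are shortened from the table, the boundary handling (70% falls in the higher band) is an assumption, and the per-second samples are invented.

```python
def classify_utilisation(pct: float) -> str:
    """Map a utilisation percentage onto the bands from the table.

    Boundary values are assigned to the higher-risk band at 50% and 70%,
    which is an assumption; the table does not specify them.
    """
    if pct < 50:
        return "safe"
    if pct < 70:
        return "watch"
    if pct <= 85:
        return "congestion risk"
    return "active congestion likely"


# Invented 5-minute window of per-second utilisation samples:
# 290 seconds at 68% plus a 10-second burst at line rate.
samples = [68.0] * 290 + [100.0] * 10
average = sum(samples) / len(samples)

print(classify_utilisation(average))       # band seen by 5-minute polling
print(classify_utilisation(max(samples)))  # band seen with per-second data
```

The 5-minute average lands in the watch zone even though the link spent ten seconds saturated, which is the blind spot that sub-minute telemetry is meant to close.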
3. Tools Used in Real Operations
TWAMP — Two-Way Active Measurement Protocol
TWAMP (RFC 5357) is the standard tool for measuring latency, jitter, and packet loss on live transport paths. It works by sending test packets between two TWAMP-enabled endpoints (typically PE routers or DU/CU management interfaces) and measuring timestamps at both ends. Unlike ICMP ping, TWAMP uses UDP with configurable DSCP markings — meaning you can measure QoS-class-specific performance on the actual forwarding path.
- Deploy TWAMP sessions on all major transport paths: fronthaul aggregation, midhaul hub-to-hub, N3 path to UPF
- Run separate TWAMP sessions for each DSCP class — EF (VoNR), AF41 (video), CS0 (best effort) — to detect class-specific issues
- Set measurement interval to 10 seconds for real-time monitoring — 5-minute polling misses burst events
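A TWAMP test exchange yields four timestamps: sender transmit (T1), reflector receive (T2), reflector transmit (T3), and sender receive (T4). The sketch below shows how the KPIs are derived from them. It assumes the timestamps have already been converted to integer microseconds; the class and function names are illustrative and do not belong to any particular TWAMP implementation.

```python
from dataclasses import dataclass


@dataclass
class TwampSample:
    t1: int  # sender transmit time (µs)
    t2: int  # reflector receive time (µs)
    t3: int  # reflector transmit time (µs)
    t4: int  # sender receive time (µs)


def round_trip_us(s: TwampSample) -> int:
    """RTT with the reflector's processing time removed.

    Valid even when the two clocks are not synchronised, because each
    difference is taken on a single clock.
    """
    return (s.t4 - s.t1) - (s.t3 - s.t2)


def forward_delay_us(s: TwampSample) -> int:
    """One-way sender-to-reflector delay; meaningful only with synchronised clocks."""
    return s.t2 - s.t1


def reverse_delay_us(s: TwampSample) -> int:
    """One-way reflector-to-sender delay; meaningful only with synchronised clocks."""
    return s.t4 - s.t3


def loss_ratio(sent: int, received: int) -> float:
    """Fraction of test packets that never came back."""
    return (sent - received) / sent
```

One-way figures, such as the fronthaul and midhaul thresholds, depend on both endpoints sharing a common time reference, whereas the round-trip calculation does not.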
Streaming Telemetry — The Modern Approach
SNMP polling every 5 minutes is not sufficient for 5G transport. Streaming telemetry (gRPC/gNMI) pushes interface counters, queue statistics, and routing state to a TSDB (Time Series Database — typically InfluxDB, Prometheus, or VictoriaMetrics) at 10-60 second intervals. This gives you:
- Sub-minute link utilisation — detect micro-bursts invisible to SNMP polling
- Per-queue drop counters — see which traffic class is being dropped, not just total drops
- BGP session state and route count changes — detect control plane events that precede data plane problems
- PTP clock state and time error — integrate timing monitoring into the same dashboard as traffic KPIs
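Telemetry streams deliver raw cumulative counters, so utilisation and drop ratios have to be derived from the difference between consecutive samples. The following sketch shows one way to do that, assuming 64-bit counters that wrap at most once between samples and a transmit counter that excludes dropped packets. All names are hypothetical.

```python
def counter_delta(prev: int, curr: int, bits: int = 64) -> int:
    """Difference between two samples of a cumulative counter, tolerating one wrap."""
    return (curr - prev) % (1 << bits)


def utilisation_pct(prev_octets, curr_octets, interval_s, link_bps):
    """Average link utilisation over the sampling interval, in percent."""
    bits_sent = counter_delta(prev_octets, curr_octets) * 8
    return bits_sent / interval_s / link_bps * 100


def queue_drop_ratio(prev_tx, curr_tx, prev_drop, curr_drop):
    """Dropped packets as a fraction of packets offered to the queue.

    Assumes the tx counter counts only packets actually transmitted.
    """
    tx = counter_delta(prev_tx, curr_tx)
    dropped = counter_delta(prev_drop, curr_drop)
    offered = tx + dropped
    return dropped / offered if offered else 0.0
```

Running `queue_drop_ratio` per traffic class on every sample is what turns a total drop count into the per-queue view described in the list above.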
4. Practical Example — High Latency Causing Poor User Throughput
An operator in Muscat reports that 5G download speeds at a CBD site are 30-50% below the expected throughput despite strong signal and low interference. Transport investigation reveals:

| KPI | Measured Value | Expected | Root Cause |
|---|---|---|---|
| N3 RTT (CU-UP to UPF) | 28 ms | < 8 ms | UPF is at a remote DC, so N3 traverses 3 extra hops |
| N3 jitter | 8 ms peak PDV | < 1 ms | N3 path shares a queue with bulk internet traffic, no QoS |
| TCP throughput calculation | 28 ms RTT, 0.1% loss → max ~16 Mbps per flow (Mathis approximation, 1460-byte MSS) | 100 Mbps+ | TCP congestion window cannot grow enough to fill the link; per-flow throughput is RTT-limited |

Fix: deploy a regional UPF at the Muscat hub and enable QoS on the N3 path. Result: N3 RTT dropped to 4 ms and throughput improved 4x.

Key insight: per-flow TCP throughput is capped at window size divided by RTT, which is the bandwidth-delay product relationship. A TCP flow on a 28 ms RTT path with a 65 KB window (65,535 bytes, the maximum without window scaling) can achieve at most about 18.7 Mbps, even on a 10GE link with zero congestion. High N3 RTT is a throughput ceiling, not a congestion problem. Fix the RTT, not the link capacity.
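The window-limited ceiling is simple arithmetic, sketched below with illustrative function names. The 65,535-byte window is the largest that can be advertised without TCP window scaling.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Maximum throughput of one flow that can have window_bytes in flight per RTT."""
    return window_bytes * 8 / rtt_s


def required_window_bytes(target_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to sustain target_bps."""
    return target_bps * rtt_s / 8


before = window_limited_throughput_bps(65_535, 0.028)  # remote UPF, 28 ms RTT
after = window_limited_throughput_bps(65_535, 0.004)   # regional UPF, 4 ms RTT
print(f"{before / 1e6:.1f} Mbps -> {after / 1e6:.1f} Mbps")
print(f"window needed for 100 Mbps at 28 ms: {required_window_bytes(100e6, 0.028):.0f} bytes")
```

The same window that caps a flow below 19 Mbps at 28 ms allows well over 100 Mbps at 4 ms, which is why moving the UPF helps where adding link capacity would not.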
5. KPI Correlation With RAN Issues

| RAN KPI Degradation | Transport KPI to Check | Likely Cause |
|---|---|---|
| VoNR call drop / MOS degradation | N3 jitter > 5 ms, priority queue drops | QoS misconfiguration: VoNR not in priority queue |
| High PDCP retransmission rate | Midhaul or N3 packet loss > 0.01% | Congestion on transport link, or WRED dropping GTP-U |
| RRC setup failure spike | N2 latency > 100 ms or packet loss | Transport congestion or routing failure on N2 path |
| Handover failure increase | X2/Xn latency > 10 ms or intermittent loss | X2 routing via congested or indirect path |
| TDD interference / PDCCH errors | PTP PDV spike, TE > 1 µs | Timing: PTP packets queued with data traffic |
| Poor throughput despite good RF | N3 RTT > 15 ms | UPF too far from RAN; throughput RTT-limited |

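The correlation table lends itself to a rule-based first-pass triage. The sketch below encodes each row as a condition over a dictionary of transport KPIs. The key names are hypothetical, unmeasured KPIs are ignored, and the thresholds are copied from the table rather than taken from any standard.

```python
def _gt(kpis, key, limit):
    """True when the KPI was measured and exceeds the limit."""
    value = kpis.get(key)
    return value is not None and value > limit


# (RAN symptom, transport condition from the correlation table).
RULES = [
    ("VoNR call drop / MOS degradation",
     lambda k: _gt(k, "n3_jitter_ms", 5) or _gt(k, "priority_queue_drops", 0)),
    ("High PDCP retransmission rate",
     lambda k: _gt(k, "midhaul_loss_pct", 0.01) or _gt(k, "n3_loss_pct", 0.01)),
    ("RRC setup failure spike",
     lambda k: _gt(k, "n2_latency_ms", 100) or _gt(k, "n2_loss_pct", 0)),
    ("Handover failure increase",
     lambda k: _gt(k, "xn_latency_ms", 10) or _gt(k, "xn_loss_pct", 0)),
    ("TDD interference / PDCCH errors",
     lambda k: _gt(k, "ptp_time_error_us", 1)),
    ("Poor throughput despite good RF",
     lambda k: _gt(k, "n3_rtt_ms", 15)),
]


def ran_symptoms_explained(kpis: dict) -> list:
    """Return the RAN symptoms that the measured transport KPIs could explain."""
    return [symptom for symptom, condition in RULES if condition(kpis)]


# Values from the Muscat example: 28 ms N3 RTT and 8 ms peak PDV.
print(ran_symptoms_explained({"n3_rtt_ms": 28, "n3_jitter_ms": 8}))
```

A match only narrows down where to look; confirming the likely cause still requires checking queue configuration, routing, or timing on the affected path.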
6. Common Monitoring Mistakes
- Using 5-minute SNMP polling for transport monitoring: this interval misses micro-bursts that last milliseconds but cause packet loss visible at the application layer. Deploy streaming telemetry at 10-30 s granularity as a minimum, and export per-queue drop counters, because even 10 s utilisation averages smooth out millisecond bursts while the drops they cause still register in the counters.
- Monitoring only average latency: the average hides tail latency. A link with an average of 2 ms and a 99th percentile of 20 ms is causing VoNR degradation at peak load. Always monitor P95 and P99 latency, not just the average.
- Not correlating transport and RAN KPIs in the same dashboard: RAN and transport teams monitor separate systems. Cross-domain correlation requires integrating the transport TSDB with RAN performance management data in a unified view.
- Ignoring queue drop counters: many operators monitor link utilisation but not per-queue drops. A link at 60% utilisation with 5% drops in the best-effort queue may be acceptable, since best-effort traffic is expected to absorb congestion. A link at 60% utilisation with 0.1% drops in the priority queue is a critical problem.
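The gap between average and tail latency is easy to demonstrate. This sketch uses a nearest-rank percentile and an invented set of 100 latency samples in microseconds; the function name is illustrative.

```python
import math
from statistics import mean


def percentile(samples, pct: int):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Invented data: 98 packets at 1.6 ms and 2 packets at 21.6 ms.
latencies_us = [1_600] * 98 + [21_600] * 2

print("mean:", mean(latencies_us))
print("P95: ", percentile(latencies_us, 95))
print("P99: ", percentile(latencies_us, 99))
```

The mean and even P95 look healthy here; only P99 exposes the slow packets, which are the ones a voice user actually notices.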
7. Design Recommendations
- Deploy TWAMP between all major transport endpoints — fronthaul aggregation, midhaul hubs, N3 path. Make TWAMP a network standard in commissioning acceptance criteria.
- Integrate transport telemetry with RAN PM data — build a unified dashboard (Grafana is the standard open-source choice) that shows transport KPIs and RAN KPIs side by side. Correlation analysis becomes visual, not forensic.
- Set latency SLAs per interface in transport contracts — do not accept vague ‘best effort with QoS’ wording. Specify TWAMP-measured latency, jitter, and loss thresholds per interface type, with penalties for sustained violations.
- Monitor P99 latency, not average — configure TWAMP and telemetry to export percentile latency metrics. P99 latency spikes are what cause user-visible service degradation.
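An SLA clause about sustained violations needs a precise definition of "sustained". One possible definition, shown here purely as an assumption, is a run of at least `min_run` consecutive measurement intervals in which the metric exceeds its threshold. The function name and parameters are hypothetical.

```python
def sustained_violations(values, threshold, min_run):
    """Return (start, end) index pairs, inclusive, for every run of at least
    min_run consecutive values that exceed the threshold."""
    runs = []
    start = None
    for i, value in enumerate(values):
        if value > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the series.
    if start is not None and len(values) - start >= min_run:
        runs.append((start, len(values) - 1))
    return runs


# Invented P99 latency per interval (ms) against a 10 ms SLA,
# counting only runs of three or more intervals.
p99_ms = [4, 12, 5, 11, 12, 13, 6, 14, 15, 16]
print(sustained_violations(p99_ms, threshold=10, min_run=3))
```

The isolated spike at index 1 is ignored, while the two three-interval runs are reported, so a single transient does not trigger a penalty but a persistent breach does.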
8. Summary — Practical Takeaways
Transport KPIs are not abstract numbers — they have direct, measurable causal relationships with RAN performance. High N3 RTT limits TCP throughput. Jitter degrades VoNR. Packet loss in the wrong traffic class causes retransmissions. Timing PDV causes TDD interference. Every transport team needs to monitor these KPIs proactively, at the right granularity, and correlate them with RAN KPIs to close the loop on cross-domain troubleshooting.

| Takeaway | Action |
|---|---|
| 5-minute polling is too slow | Deploy streaming telemetry at 10-30 s intervals on all core transport links |
| Monitor P99 latency, not average | Configure TWAMP to export percentile metrics, because the average hides tail latency problems |
| Correlate transport and RAN KPIs | Build a unified dashboard with transport jitter and RAN VoNR drops in the same view |
| TCP throughput is RTT-limited | Measure N3 RTT and fix UPF placement; do not upgrade link capacity to fix an RTT problem |
| Per-queue drops are the signal | Monitor priority queue drops separately; any drops there are a critical event |
