5GC Hardware & Infrastructure

COTS servers, x86 vs ARM, NIC offload, SmartNIC/DPU, DC design, vendor HW stacks, NFVI, sizing methodology

1. What Is 5GC Hardware — The Simple Version

5GC NFs run on standard COTS (Commercial Off-The-Shelf) servers: the same x86 platforms used in enterprise data centres, from Dell, HPE, Lenovo, or Supermicro. The difference is in how those servers are configured and which peripherals they carry. An AMF pod can run on any modern Xeon server. A UPF pod forwarding 100 Gbps of GTP-U traffic cannot: it needs specific NIC hardware, hugepages, CPU pinning, and NUMA-aware scheduling.

The other difference from enterprise is the data centre itself. A 5GC DC must support high power density (up to 20 kW per rack for UPF-heavy deployments), carrier-grade redundancy (2N power, dual-corded servers, a geo-redundant DC pair), and a network fabric that delivers < 0.5 ms east-west latency between NFs. Get the hardware wrong and no amount of software tuning recovers the performance.

Standards References
ETSI GS NFV-INF 001: NFV Infrastructure Overview
GSMA NG.126: Cloud Infrastructure Reference Model
3GPP TS 28.500: Management concept, architecture and requirements for mobile networks that include virtualized network functions

2. Architecture — The Hardware Stack

Typical 5GC Server Specifications

| Component | Signalling-Plane NFs (AMF, SMF, PCF) | User-Plane (UPF) | Notes |
| --- | --- | --- | --- |
| CPU | 2× Intel Xeon Scalable 4th Gen or AMD EPYC, 32–48 cores each | Same, but CPU pinning is mandatory for DPDK workers | Disable C-states in BIOS for UPF servers. C-state transitions add wake-up jitter. |
| RAM | 256–512 GB DDR5 ECC | 256–512 GB DDR5 ECC + hugepages pre-allocated | SMF needs large RAM for session state. UPF needs hugepages for DPDK. |
| NIC | 2× 25 GbE (management + SBI) | 2× 100 GbE SR-IOV (Mellanox ConnectX-7 or Intel E810) | UPF NIC choice directly determines max throughput. |
| Storage | NVMe SSD RAID-1 for OS; shared NVMe for UDR/CHF logs | OS NVMe only; UPF keeps no persistent state on disk | UDR needs low-latency NVMe for subscriber DB I/O. |
| Form factor | 2U rack-mount, hot-swap drives, dual-corded PSU | Same; no spinning disk | Carrier-grade: hot-swap everything, no single cable failure. |

Table 1 — 5GC server specifications by NF role. The hardware difference between signalling and user-plane NFs is primarily NIC type and hugepages.

x86 vs ARM

| Dimension | x86 (Intel/AMD) | ARM (Huawei Kunpeng, Ampere Altra) |
| --- | --- | --- |
| Ecosystem maturity | Dominant: all NF vendors ship x86-native builds | Growing: Ericsson, Nokia, Huawei have ARM roadmaps; check the version support matrix |
| Power efficiency | ~200–300 W per socket at full load | 30–50% lower power for equivalent throughput |
| DPDK support | Mature, extensively tested | Supported, but a smaller ecosystem with fewer NIC driver options |
| GCC operator use | Universal: all SA deployments currently x86 | On roadmap: ARM UPF deployments expected 2025+ |
| Decision | Default choice; lowest risk | Choose only if power/cooling is a primary constraint and the vendor supports an ARM version |

Table 2 — x86 vs ARM for 5GC. ARM offers power efficiency but requires explicit vendor version support validation before commitment.

NIC Selection — The UPF Bottleneck

| NIC Type | Max Throughput | Key Feature | Best For |
| --- | --- | --- | --- |
| Standard 10/25 GbE NIC | 10–25 Gbps | No offload; software only | Signalling-plane NFs only (AMF, SMF) |
| 100 GbE SR-IOV (Intel E810 / Mellanox ConnectX-6) | 100 Gbps | SR-IOV VFs, RSS, hardware filters | UPF N3/N6; required for any serious user-plane throughput |
| SmartNIC / DPU (Nvidia BlueField-3) | 100–400 Gbps | ARM co-processor, GTP-U offload, inline IPsec | UPF with GTP-U hardware offload; frees host CPU for session management |
| Intel IPU (E2000) | 100 Gbps | Programmable pipeline, P4-based offload | Advanced UPF offload; less deployed than BlueField currently |

Table 3 — NIC options for 5GC. SR-IOV 100 GbE is the minimum for UPF production deployments. SmartNIC/DPU enables GTP-U hardware offload and frees CPU for control plane work.
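
To see why the NIC, rather than the CPU core count, sets the ceiling for a UPF, it helps to convert link speed into packets per second. The sketch below uses standard Ethernet framing arithmetic (20 bytes of preamble, start-of-frame delimiter, and inter-frame gap per frame on the wire). The function names are illustrative only and do not come from any vendor tool.

```python
# Every Ethernet frame occupies 20 extra bytes on the wire:
# 7 B preamble + 1 B start-of-frame delimiter + 12 B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_mpps(link_gbps: float, frame_bytes: int) -> float:
    """Maximum frame rate (millions of packets/s) for a link speed and frame size.

    frame_bytes is the full Ethernet frame including the 4-byte FCS.
    """
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


def ns_per_packet(link_gbps: float, frame_bytes: int) -> float:
    """Time budget per packet (nanoseconds) if the port runs at line rate."""
    return 1e3 / line_rate_mpps(link_gbps, frame_bytes)


# 100 GbE with minimum-size 64-byte frames: ~148.81 Mpps, ~6.72 ns per packet.
small = line_rate_mpps(100, 64)
# 100 GbE with full-size 1518-byte frames: ~8.13 Mpps.
large = line_rate_mpps(100, 1518)
```

At small packet sizes a single 100 GbE port leaves only a few nanoseconds per packet, which is the budget that SR-IOV, RSS, and hardware offload are there to protect.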

3. Data Centre Design for 5GC

A 5GC data centre is not an enterprise DC. The combination of high-throughput UPF servers, geo-redundancy requirements, and carrier-grade availability targets drives specific design decisions:

| Design Parameter | 5GC Requirement | Reason |
| --- | --- | --- |
| Rack power density | 8–20 kW per rack | UPF servers with dual 100 GbE NICs + NVMe draw 1–2 kW each. 10–12 servers per rack puts a UPF rack at roughly 15–20 kW. |
| Cooling | Direct liquid cooling (DLC) or rear-door heat exchangers for UPF racks | Air cooling is insufficient above 15 kW/rack. Inlet temperature must stay < 25°C to prevent CPU throttling. |
| DC fabric | Spine-leaf, 100 GbE ToR switches, ECMP | < 0.5 ms AMF-to-SMF-to-UPF east-west latency. Oversubscription kills NF-to-NF SBI performance. |
| Power redundancy | 2N UPS + generator; dual-corded servers to separate PDU chains | A single PDU failure must not affect any NF. 5-nines availability requires 2N at every power layer. |
| DC pairing | Two DCs within 50–100 km | Beyond 100 km of fibre, inter-DC RTT exceeds 1 ms. Stateful NF replication latency affects AMF/SMF failover speed. |
| Network separation | Separate VLANs/VRFs for SBI, N3/N6 user-plane, OAM management | SBI and user-plane QoS requirements are different. Management traffic must not compete with N4 PFCP. |

Table 4 — 5GC data centre design parameters. The DC fabric east-west latency and power density are the two most commonly underestimated requirements.
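
Two of the figures in Table 4 are simple arithmetic and worth checking against your own numbers. The sketch below multiplies out rack power and estimates inter-DC round-trip time from fibre distance. It assumes roughly 5 µs of one-way propagation delay per kilometre of fibre (light at about two-thirds of its vacuum speed) and ignores switching and queuing delay. The `path_stretch` parameter is a hypothetical knob for fibre routes that are longer than the straight-line distance.

```python
FIBRE_US_PER_KM = 5.0  # assumption: ~5 microseconds one-way per km of optical fibre


def rack_power_kw(servers_per_rack: int, kw_per_server: float) -> float:
    """Total IT load of a rack, ignoring ToR switches and PDU losses."""
    return servers_per_rack * kw_per_server


def fibre_rtt_ms(distance_km: float, path_stretch: float = 1.0) -> float:
    """Propagation-only round-trip time between two DCs, in milliseconds."""
    one_way_us = distance_km * path_stretch * FIBRE_US_PER_KM
    return 2 * one_way_us / 1000.0


# 10 UPF servers at 1.5 kW each -> 15 kW, already at the air-cooling limit.
upf_rack = rack_power_kw(10, 1.5)
# 12 servers at the 2 kW worst case -> 24 kW, above the 20 kW design figure.
worst_case_rack = rack_power_kw(12, 2.0)
# 100 km of fibre -> 1.0 ms RTT before any equipment delay is added.
pair_rtt = fibre_rtt_ms(100)
```

The worst-case rack figure is why per-server power should be measured under UPF load rather than taken from the nameplate range.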

4. Key Parameters and Technical Terms

| Term | Definition | 5GC Significance |
| --- | --- | --- |
| NFVI | Network Functions Virtualisation Infrastructure. The compute, storage, and network hardware that NFs run on. | Abstracted by a VIM (OpenStack) or K8s. The NF sees vCPU, vMemory, vNIC, not physical hardware. |
| C-states | CPU power management states. Deeper C-states save more power but add latency to return to active. | Must be disabled in BIOS for UPF servers. C-state exit adds wake-up latency (nominally microseconds per transition, but reported as ms-level spikes under burst in the field), which is unacceptable for DPDK packet processing. |
| NUMA Node | Non-Uniform Memory Access node: one CPU socket and its directly attached memory banks. | UPF and AMF pods must have all vCPUs and memory from the same NUMA node. Cross-NUMA access: +30–80 ns per memory operation. |
| SR-IOV VF | Virtual Function created by an SR-IOV NIC. Multiple VFs share one physical NIC but have independent TX/RX queues. | The UPF pod gets one or more VFs for its N3 and N6 interfaces. With a DPDK poll-mode driver on the VF, traffic bypasses the host kernel network stack for near-line-rate forwarding. |
| RSS (Receive Side Scaling) | Hardware NIC feature that distributes incoming packets across multiple CPU cores using a hash of packet headers. | Distributes GTP-U flows across UPF DPDK worker threads by TEID hash. This needs a NIC and driver profile that can hash on GTP-U fields; hashing on the outer headers alone sees only the gNB-to-UPF tunnel endpoints. Without RSS, one core handles all N3 traffic. |
| GTP-U Offload | SmartNIC/DPU handles GTP-U encap/decap in hardware, not on the host CPU. | BlueField-3 can offload GTP-U processing, freeing host CPU for PFCP session management. Enables higher session density on the same server. |
| Inline IPsec | IPsec encryption/decryption performed in the NIC hardware pipeline. | N3 IPsec at full 100 Gbps line rate without CPU overhead. Requires a NIC with inline crypto (for example, crypto-enabled Mellanox ConnectX variants). Intel QAT also accelerates IPsec, but as a lookaside accelerator rather than inline in the NIC. |

Table 5: Hardware key terms. C-state settings and NUMA pinning are the two BIOS/OS configurations whose misconfiguration most commonly degrades UPF performance in production.
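
The NUMA requirement in Table 5 can be checked mechanically. Linux exposes each node's CPUs as a range list in /sys/devices/system/node/nodeN/cpulist. The sketch below parses that format and reports which nodes a set of pinned CPUs touches. The two-node topology in the example is hypothetical, and the function names are illustrative.

```python
def parse_cpulist(text: str) -> set[int]:
    """Expand a Linux cpulist string such as '0-15,32-47' into a set of CPU ids."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def numa_nodes_used(pinned_cpus: set[int], node_cpulists: dict[int, str]) -> set[int]:
    """Return the NUMA nodes that own at least one of the pinned CPUs."""
    return {
        node
        for node, cpulist in node_cpulists.items()
        if pinned_cpus & parse_cpulist(cpulist)
    }


# Hypothetical dual-socket layout with hyperthread siblings interleaved by range.
topology = {0: "0-15,32-47", 1: "16-31,48-63"}

aligned = numa_nodes_used({2, 3, 34, 35}, topology)    # all on node 0
straddling = numa_nodes_used({14, 15, 16, 17}, topology)  # split across both nodes
```

A pod whose CPU set maps to more than one node is a candidate for the cross-NUMA penalty described above.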

5. Sizing Methodology

Hardware sizing starts from subscriber and session projections, not from server specs. Work backward from traffic to hardware:

| NF | Primary Sizing Driver | Rule of Thumb per 1M Subscribers | Key Caveat |
| --- | --- | --- | --- |
| AMF | Registration/paging event rate | 16–32 vCPU, 64–128 GB RAM | Higher in dense urban areas with frequent idle/active transitions |
| SMF | Concurrent PDU sessions (stateful) | 32–64 vCPU, 256–512 GB RAM | Memory-dominant. 10M concurrent sessions = up to 80 GB RAM. |
| UPF | Throughput (Gbps) and concurrent sessions | 32–64 vCPU + 2× 100 GbE SR-IOV + hugepages | 1 vCPU per 3–5 Gbps with DPDK. SmartNIC GTP offload changes this significantly. |
| PCF | Policy decisions per second | 8–16 vCPU, 32–64 GB RAM | Scales with SMF, typically at a 1:1 vCPU ratio |
| UDM/UDR | Subscriber DB IOPS | 8–16 vCPU, 64–128 GB RAM + NVMe SSD | NVMe mandatory. Spinning disk latency causes auth failures at scale. |
| NRF | Discovery requests per second | 8–16 vCPU, 32–64 GB RAM | Often undersized. Add 50% headroom for registration storm scenarios. |

Table 6 — 5GC NF sizing rules of thumb. Always validate with vendor sizing tool using actual subscriber and traffic projections. These are starting points, not final specs.
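
The rules of thumb in Table 6 can be turned into a first-pass calculator. The sketch below is illustrative only: the per-session memory figure is derived from the table's "10M sessions = up to 80 GB" caveat (about 8 KB per session), the Gbps-per-vCPU default takes the conservative end of the 3–5 Gbps range, and all function names are hypothetical. It is not a substitute for a vendor sizing tool.

```python
import math


def upf_worker_vcpus(throughput_gbps: float, gbps_per_vcpu: float = 3.0) -> int:
    """DPDK worker vCPUs needed for a throughput target (1 vCPU per 3-5 Gbps)."""
    return math.ceil(throughput_gbps / gbps_per_vcpu)


def smf_session_ram_gb(sessions: int, bytes_per_session: int = 8_000) -> float:
    """Session-state memory for the SMF, using ~8 KB/session (80 GB / 10M sessions)."""
    return sessions * bytes_per_session / 1e9


def with_headroom(base_vcpus: int, headroom: float = 0.5) -> int:
    """Apply storm headroom, e.g. the 50% suggested for NRF."""
    return math.ceil(base_vcpus * (1 + headroom))


# 100 Gbps UPF: 34 workers at 3 Gbps/vCPU, 20 at 5 Gbps/vCPU.
conservative = upf_worker_vcpus(100, 3.0)
optimistic = upf_worker_vcpus(100, 5.0)
# 10M concurrent PDU sessions -> 80 GB of session state.
smf_ram = smf_session_ram_gb(10_000_000)
# NRF sized at 16 vCPU becomes 24 vCPU with 50% storm headroom.
nrf_vcpus = with_headroom(16)
```

The spread between the conservative and optimistic UPF figures is wide enough that it should be narrowed with a measured Gbps-per-core value for the actual NF build and packet mix.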

6. Common Issues in the Field

Field Note: NUMA Misconfiguration — AMF Latency 4× Expected
SA deployment: AMF N11 response latency P95 was 8 ms against 2 ms target.
CPU and memory utilisation appeared normal. No obvious bottleneck.
Investigation: AMF vCPUs were split across both NUMA nodes of a dual-socket server.
Memory allocated on NUMA node 0; some worker threads running on NUMA node 1.
Cross-NUMA memory access adding ~60 ns per session lookup. At high request rate: cumulative latency spike.
Fix: set topologyManagerPolicy=single-numa-node; set cpuManagerPolicy=static; restart AMF pods.
N11 latency P95 dropped to 1.8 ms. No hardware change.
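
The two kubelet settings named in the fix can be expressed as a KubeletConfiguration object. The sketch below builds one in Python and prints it as JSON. The field names are the standard kubelet configuration fields, while the reserved CPU range "0-1" is a placeholder assumption that must be adapted to the node: the static CPU manager policy needs some CPUs reserved for system and kubelet use.

```python
import json


def kubelet_numa_config(reserved_cpus: str) -> dict:
    """Build a KubeletConfiguration fragment for NUMA-aligned, pinned pods.

    reserved_cpus is a cpulist (for example "0-1") kept back for the OS and
    kubelet; the value used below is a placeholder, not a recommendation.
    """
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        "cpuManagerPolicy": "static",
        "topologyManagerPolicy": "single-numa-node",
        "reservedSystemCPUs": reserved_cpus,
    }


config = kubelet_numa_config("0-1")
print(json.dumps(config, indent=2))
```

Exclusive CPUs are only granted to pods in the Guaranteed QoS class with integer CPU requests, so the AMF pod spec needs requests equal to limits as well. Changing the CPU manager policy on a node that is already in service typically also involves draining the node and restarting the kubelet, not just restarting the pods.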

Field Note: C-States Not Disabled (UPF Packet Processing Jitter)
UPF throughput at low traffic (< 1 Gbps): excellent, < 0.1 ms forwarding latency.
Under burst traffic: latency spikes to 8–12 ms for first packets in each burst.
Root cause: CPU C-states enabled. CPU entered C3 state during quiet periods.
Return from C3 to C0 (active): ~5–10 ms wakeup latency. First burst packets delayed.
Fix: BIOS: set C-states = disabled. OS: add intel_idle.max_cstate=0 to kernel cmdline.
Burst latency spikes eliminated. Consistent < 0.5 ms forwarding latency.
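
Whether deep idle states are still enabled can be verified from the OS without a reboot into BIOS. Linux exposes each idle state under /sys/devices/system/cpu/cpuN/cpuidle/stateM/ with `name`, `latency` (exit latency in microseconds), and `disable` files. The sketch below lists enabled states whose exit latency exceeds a threshold. The 10 µs default is an arbitrary assumption, and the function name is illustrative.

```python
from pathlib import Path


def deep_idle_states_enabled(cpu_root="/sys/devices/system/cpu", max_exit_latency_us=10):
    """List (cpu, state_name, exit_latency_us) for enabled idle states above the threshold.

    Returns an empty list when no cpuidle directories exist (for example in
    some VMs) or when every deep state is disabled.
    """
    findings = []
    for state_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpuidle/state[0-9]*")):
        if (state_dir / "disable").read_text().strip() != "0":
            continue  # this state is already disabled for this CPU
        latency_us = int((state_dir / "latency").read_text())
        if latency_us > max_exit_latency_us:
            name = (state_dir / "name").read_text().strip()
            cpu = state_dir.parent.parent.name  # .../cpuN/cpuidle/stateM -> cpuN
            findings.append((cpu, name, latency_us))
    return findings
```

Calling `deep_idle_states_enabled()` on a UPF node and getting a non-empty list means some cores can still enter deep idle states. Many low-latency tuning guides pair `intel_idle.max_cstate=0` with `processor.max_cstate=1`, because the first parameter alone hands idle management to the ACPI driver; treat that as a pointer to check against your platform documentation rather than part of the fix reported above.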

7. Troubleshooting

| Symptom | Root Cause | Check | Fix |
| --- | --- | --- | --- |
| UPF throughput well below server spec | Hugepages not configured; DPDK on 4 KB pages | Node hugepages allocation; UPF pod hugepages resource request | Configure hugepages-1Gi on node; add resource request to UPF pod spec |
| AMF/SMF latency 2–5× expected at low utilisation | Cross-NUMA memory access | K8s node topology manager policy; numactl --hardware | Set single-numa-node policy; restart pods with CPU Manager pinning |
| UPF burst latency spikes (otherwise OK) | CPU C-states enabled: wakeup latency on first burst packets | BIOS C-state config; /sys/devices/system/cpu/cpu*/cpuidle/state*/disable | Disable C-states in BIOS; set intel_idle.max_cstate=0 on the kernel command line |
| UPF N3 cannot sustain > 10 Gbps despite 100 GbE NIC | SR-IOV VF not attached: kernel network stack bottleneck | UPF pod NetworkAttachmentDefinition; verify VF assigned to pod interface | Configure Multus SR-IOV CNI plugin; verify VF shows in UPF pod with ip link |
| NF pod repeatedly crashes during load | Insufficient memory limit: OOMKill | K8s events: kubectl get events; kubectl describe pod shows OOMKilled | Increase memory limit; set requests=limits for Guaranteed QoS |

Table 7 — Hardware platform troubleshooting. Most issues are BIOS/OS configuration errors discovered after deployment.
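
For the first row of Table 7, the node-side check amounts to reading the hugepage counters in /proc/meminfo. The sketch below parses that text and summarises how much hugepage memory is configured and still free. The sample input is made up for illustration, and /proc/meminfo only reports the default hugepage size, so other sizes would need /sys/kernel/mm/hugepages instead.

```python
import re

_FIELDS = re.compile(r"^(HugePages_Total|HugePages_Free|Hugepagesize):\s+(\d+)")


def hugepage_summary(meminfo_text: str) -> dict:
    """Summarise default-size hugepages from the contents of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        match = _FIELDS.match(line)
        if match:
            values[match.group(1)] = int(match.group(2))
    size_kb = values.get("Hugepagesize", 0)  # reported in kB
    total = values.get("HugePages_Total", 0)
    free = values.get("HugePages_Free", 0)
    return {
        "page_size_mib": size_kb // 1024,
        "total_gib": total * size_kb / (1024 * 1024),
        "free_gib": free * size_kb / (1024 * 1024),
    }


# Illustrative sample: 16 pages of 1 GiB configured, 12 still free.
sample = """MemTotal:       527954916 kB
HugePages_Total:      16
HugePages_Free:       12
Hugepagesize:    1048576 kB
"""
summary = hugepage_summary(sample)
```

On a real node the input would come from `open("/proc/meminfo").read()`. A `total_gib` of zero on a UPF node points at the hugepages row of the table before any deeper investigation.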

8. Summary — Key Takeaways

| Topic | Key Takeaway |
| --- | --- |
| COTS hardware | Standard x86 servers work for all 5GC NFs. Configuration matters more than brand. The same server misconfigured delivers 50% of its rated performance. |
| UPF NIC | SR-IOV 100 GbE (Intel E810 or Mellanox ConnectX) is the minimum for production UPF. SmartNIC/DPU (BlueField-3) enables hardware GTP-U offload for higher session density. |
| Hugepages | Configure hugepages-1Gi on the K8s node spec AND in the UPF pod resource request. Not optional for DPDK UPF: missing hugepages means 40–60% throughput loss. |
| NUMA + CPU pinning | Single-numa-node topology manager + static CPU manager policy. Cross-NUMA access is a silent latency multiplier. |
| C-states | Disable in BIOS for UPF servers. C-state wakeup latency causes burst latency spikes even when average utilisation is low. |
| DC design | Spine-leaf with < 0.5 ms east-west latency. 2N power. Geo-redundant DC pair within 50–100 km. Dedicated VLANs for SBI, user-plane, OAM. |
| Sizing | Work backward from subscriber/session projections. SMF is memory-dominant. UPF is throughput-dominant. NRF is request-rate-dominant; add 50% headroom for storms. |

Table 8 — Post 07 summary. Hardware is not the limiting factor. Configuration of hugepages, NUMA, C-states, and SR-IOV is.

Next: Post 08 — Cloud-Native 5GC on Kubernetes
