5GC Deployment Architectures

NSA, SA, Centralised vs Distributed UPF, Public, Private and Hybrid — a real-network perspective on every architecture decision your 5G Core will face

1. What Is 5GC Deployment Architecture — The Simple Version

5GC deployment architecture is the set of decisions that determine where your network functions live, who owns them, how the user plane is anchored, and what features you can actually offer. Most people think the 5G Core is just software you spin up in a DC. It is not. Every choice you make here — NSA vs SA, centralised vs distributed UPF, shared vs dedicated — defines your latency ceiling, your enterprise revenue potential, and your operational complexity for the next decade.

Get the architecture wrong and you end up with enterprise customers on an internet UPF 300 km away wondering why their URLLC latency is 38 ms. Or a private 5G factory site going dark during a WAN outage because the SMF is hosted remotely. These are not hypothetical failures. They happen at first deployments.

3GPP Reference
3GPP TS 23.501 — System Architecture for 5G System (the master reference for all architecture decisions)
3GPP TS 38.300 — NR and NG-RAN Overall Description (NSA Option 3x architecture and X2 interface)
3GPP TS 23.548 — 5G System Enhancements for Edge Computing (distributed UPF, UL CL, DNAI)
3GPP TS 23.502 — Procedures for 5G System (PDU session and mobility procedures)

2. Real Network Architecture — How Operators Actually Deploy

The Five Core Deployment Models

There is no single 5GC architecture. In practice you are making at least three independent decisions simultaneously: what type of core to use (NSA vs SA), where to anchor the user plane (centralised vs distributed UPF), and who owns what (public operator, dedicated private, or hybrid). These decisions interact — distributed UPF only exists in SA, private 5G nearly always requires SA, and hybrid is the commercial model that makes enterprise 5G economically viable.

NSA (Non-Standalone) — Option 3x

NSA is not a 5G Core architecture. The 5G Core does not exist in NSA. The LTE eNB is the Master Node. The 5G gNB is a Secondary Node that adds NR radio carriers for throughput. All NAS signalling — authentication, mobility, session management — flows through the existing 4G MME, SGW, and PGW unchanged. From the core’s perspective, an NSA UE is a 4G subscriber with a faster radio.

Component | Role in NSA | Interface / Notes
--- | --- | ---
LTE eNB (Master Node) | Controls the UE connection and anchors the control plane; user plane bearers are split between LTE and NR | S1-MME to MME; S1-U to SGW; X2 to gNB. X2 RTT must be < 10 ms
5G gNB (Secondary Node) | Adds NR carriers for user plane throughput only; no control-plane connection to the core | X2 to eNB; in Option 3x also S1-U to the SGW for the bearers it terminates. No N2, no N3. No 5GC NF is involved.
MME | All NAS: authentication, registration, mobility. Unchanged. | S1-MME from eNB. Same as 4G.
SGW / PGW | User plane anchor. Same 4G gateway. No PFCP, no SMF. | S1-U from eNB (and from the gNB in Option 3x). SGi to internet.
X2 Interface | Coordination between eNB and gNB for bearer split decisions | Silent performance killer if RTT > 10 ms. Covered in Section 6.

Table 1 — NSA Option 3x components. The 5G Core (AMF, SMF, UPF) is completely absent. NR is a radio enhancement only.

SA (Standalone) — The Real 5G Core

SA deploys a complete 5G Core based on Service-Based Architecture (SBA). Every control-plane network function (AMF, SMF, PCF, UDM, AUSF, NRF) is a standalone service that exposes HTTP/2 REST APIs on the Service-Based Interface (SBI). NFs find each other through the NRF (the DNS of 5GC). User plane is handled by the UPF, which sits outside the SBI and is programmed by the SMF via N4/PFCP. There is no LTE anchor.
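To make the discovery step concrete, here is a minimal sketch of how an AMF-side client might build an NF discovery request towards the NRF. The resource path and query parameter names follow my reading of the Nnrf_NFDiscovery API in 3GPP TS 29.510 and should be checked against the release you deploy; the NRF hostname and slice values are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_discovery_url(nrf_base, target_nf_type, requester_nf_type,
                        dnn=None, snssai=None):
    """Build an Nnrf_NFDiscovery GET URL (parameter names per TS 29.510)."""
    params = {
        "target-nf-type": target_nf_type,
        "requester-nf-type": requester_nf_type,
    }
    if dnn is not None:
        params["dnn"] = dnn
    if snssai is not None:
        # The slice list travels as a JSON array inside the query string.
        params["snssais"] = json.dumps([snssai], separators=(",", ":"))
    return f"{nrf_base}/nnrf-disc/v1/nf-instances?{urlencode(params)}"


# Hypothetical NRF address and slice values, for illustration only.
url = build_discovery_url(
    "https://nrf.example.internal", "SMF", "AMF",
    dnn="internet", snssai={"sst": 1, "sd": "000001"},
)
print(url)
```

In a real NF the request would also carry an OAuth2 access token obtained from the NRF, and the response would be cached for its validity period.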

SA is what unlocks network slicing, VoNR, URLLC, edge computing, and enterprise SLAs. Every GCC operator that has launched SA — Ooredoo Oman, STC, e&, Zain — needed SA for exactly these reasons. NSA gives you 5G radio. SA gives you a 5G platform.

NF | Full Name | What It Does | Key Interfaces
--- | --- | --- | ---
AMF | Access & Mobility Mgmt | NAS termination, registration, mobility, paging, handover coordination | N1(UE), N2(gNB), N8(UDM), N11(SMF), N12(AUSF), N15(PCF)
SMF | Session Management | PDU session lifecycle, UPF selection, N4/PFCP programming, charging triggers | N4(UPF), N7(PCF), N10(UDM), N11(AMF), N40(CHF)
UPF | User Plane Function | Forwards all subscriber data. GTP-U, QoS enforcement, usage metering, buffering. | N3(gNB), N4(SMF), N6(internet), N9(UPF-UPF)
PCF | Policy Control | PCC rules: 5QI, GBR/MBR, flow descriptions, charging triggers | N7(SMF), N15(AMF), N5(AF), N30(NEF)
UDM | Unified Data Mgmt | Subscriber profiles, authentication credentials, session data. SUCI de-concealment (SIDF). Uses UDR as backend. | N8(AMF), N10(SMF), N13(AUSF)
AUSF | Auth Server | 5G-AKA and EAP-AKA’ authentication. | N12(AMF), N13(UDM)
NRF | NF Repository | NF registration, discovery, OAuth2 tokens. Most critical NF in 5GC. | SBI: all NFs register here
NSSF | Slice Selection | Computes Allowed NSSAI for UE per TA. Redirects to correct AMF set. | N22(AMF)
CHF | Charging Function | Online credit control (prepaid quota) and offline CDR generation | N40(SMF)
NEF | Network Exposure | 3rd-party app access: QoS requests, traffic influence, event subscriptions | N33(AF), N29(SMF), N30(PCF)

Table 2 — SA 5GC NF reference. All SBI interfaces use HTTP/2 over TLS. The UPF is the only NF that touches actual subscriber data.

Centralised vs Distributed UPF

The UPF can sit anywhere on the network. That’s the whole point of CUPS (Control and User Plane Separation) defined in 3GPP TS 23.501. The SMF programs the UPF remotely via N4/PFCP — the two do not need to be co-located. A single SMF in your central DC can manage UPFs at 20 metro sites simultaneously. The question is where you place the UPF for each traffic type.

UPF Placement | One-Way Latency | Typical Use Case | Operational Reality
--- | --- | --- | ---
Central DC (1–2 sites) | 15–40 ms | Consumer broadband, VoNR, IoT data | Simplest: one site, full visibility, easy upgrades
Metro Edge DC (5–20 km from gNB) | 1–5 ms | Cloud gaming, CDN offload, enterprise branch | Multiple sites to manage. N4 over transport; needs QoS engineering.
On-Premises at Enterprise | < 0.5 ms | Smart factory URLLC, smart port, autonomous systems | Per-customer HW. SMF remote or local. WAN dependency risk.
Co-located with gNB / Hub | < 0.1 ms | Extreme edge research, very specific URLLC | Very high cost. Only justified for sub-1 ms requirements.

Table 3 — UPF placement options. Match placement to latency requirement. Centralised is the right default for consumer. On-premises is mandatory for OT/URLLC.
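As a rough illustration of "match placement to latency requirement", the sketch below picks the simplest placement whose worst-case one-way latency from Table 3 still fits a given budget. The function name and the idea of ordering tiers from simplest to most expensive are my own assumptions, not an operator tool.

```python
# Worst-case one-way latency per placement, taken from Table 3 (ms).
# Ordered from simplest/cheapest to most complex/expensive.
PLACEMENTS = [
    ("Central DC", 40.0),
    ("Metro Edge DC", 5.0),
    ("On-Premises at Enterprise", 0.5),
    ("Co-located with gNB / Hub", 0.1),
]


def pick_placement(one_way_budget_ms):
    """Return the simplest placement that fits the budget, or None."""
    for name, worst_case_ms in PLACEMENTS:
        if worst_case_ms <= one_way_budget_ms:
            return name
    return None  # no placement in the table can meet this budget


print(pick_placement(50))   # consumer broadband budget
print(pick_placement(1))    # factory URLLC budget
```

Using the worst-case figure is deliberately conservative: a site at the good end of a range may qualify for a tighter budget, but that needs measuring, not assuming.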

Ownership Models — Public, Private, Hybrid

Model | Infrastructure Ownership | User Plane Path | Best For
--- | --- | --- | ---
Public MBB | Operator owns everything | All traffic → central DC UPF → internet | Consumer broadband, standard business
Private (Isolated) | Enterprise-owned or operator-dedicated NFs | On-prem UPF; data never leaves enterprise boundary | Oil & gas, defence, utilities, critical manufacturing
Hybrid | Shared AMF/NRF, dedicated UPF per enterprise | Local UPF breakout + shared control plane | Commercial sweet spot for most enterprise customers

Table 4 — Ownership models. Hybrid delivers private 5G performance at managed service economics. Growing fastest in GCC enterprise market.

3. Step-by-Step — What Actually Happens

NSA: How the gNB Gets Added to an Active LTE Session

Here is what happens in an NSA network when a UE with 5G capability gets NR carriers added to its existing LTE session:

Step 1 — The UE completes a normal 4G LTE attach. MME authenticates via HSS, SGW creates a bearer, PGW allocates an IP address. At this point there is zero 5G involvement.

Step 2 — The eNB monitors UE measurements. When NR signal quality is strong enough (based on the configured inter-RAT B1 measurement threshold for NR), the eNB decides to add an NR Secondary Cell Group.

Step 3 — eNB sends X2: SgNB Addition Request to the gNB. This tells the gNB: “prepare NR radio resources for this UE and tell me what PDCP configuration to use.” The gNB responds with X2: SgNB Addition Request Acknowledge containing NR radio parameters and PDCP config.

Step 4 — eNB sends RRC Reconfiguration to the UE over LTE, instructing it to activate the NR Secondary Cell Group. The UE performs random access on the NR cell and confirms with RRC Reconfiguration Complete.

Step 5 — User data is now split. Some traffic flows: UE → NR → gNB → SGW. Other traffic continues: UE → LTE → eNB → SGW. The PGW and internet path are completely unchanged. The core sees nothing new — it is still a 4G session.

Key insight: The entire NR addition procedure rides on the X2 interface. If X2 RTT is above 10 ms — caused by X2 being routed over a slow microwave backhaul path — the gNB cannot keep up with bearer split decisions and the NR SCG gets dropped and re-added repeatedly. From the user’s perspective: 5G signal is strong but throughput is poor. From the RAN team’s perspective: looks like a radio issue. It is a transport issue.

SA: A PDU Session from UE to Internet — The Real Flow

Here is the complete flow when an SA-connected UE establishes a data session. This is what happens every time a device connects to the internet on a 5G SA network:

Step 1 — UE sends NAS: PDU Session Establishment Request to the AMF via the gNB on the N1 interface. The message contains: S-NSSAI (which slice), DNN (which data network, e.g. internet), PDU Session ID (UE-chosen, 1–15), PDU Session Type (for example IPv4), and SSC Mode (typically Mode 1, meaning the UPF anchor is preserved during mobility).

Step 2 — AMF queries the NRF to discover which SMF instances serve this S-NSSAI and DNN. NRF returns a list of SMF endpoints. AMF selects one and invokes Nsmf_PDUSession_CreateSMContext on N11. The SMF now owns this PDU session.

Step 3 — SMF queries the UDM on N10 (Nudm_SDM_Get) for the subscriber’s session management subscription data: Session-AMBR, subscribed QoS (default 5QI), static IP address if assigned. Without this, the SMF cannot know what the subscriber is entitled to.

Step 4 — SMF invokes the PCF on N7 (Npcf_SMPolicyControl_Create) to get PCC (Policy and Charging Control) rules for this session. PCF returns: QoS flow descriptors (5QI, ARP, optional GBR/MBR per flow), Session-AMBR, and charging rule identifiers. These rules drive both QoS enforcement at the UPF and charging at the CHF.

Step 5 — SMF selects a UPF. This is a five-stage decision: (1) filter by DNN pool — internet sessions go to internet UPF pool; (2) filter by S-NSSAI — slice-specific UPFs if configured; (3) location match — UE’s TAI mapped to geographically closest UPF site; (4) load balance — pick least loaded UPF within the selected site; (5) capability check — verify UPF supports needed features (buffering, UL CL if MEC session). Wrong configuration at any stage silently assigns the session to the wrong UPF.

Step 6 — SMF sends N4: PFCP Session Establishment Request to the selected UPF. This installs the complete rule set: PDR (packet detection — match this UE’s GTP-U TEID for uplink, match this UE IP for downlink), FAR (forward uplink to N6 / buffer downlink until gNB TEID is known), URR (measure volume for charging — trigger report at quota thresholds), QER (enforce Session-AMBR and per-flow bitrates). The UPF responds with its allocated N3 F-TEID — the GTP-U endpoint the gNB should tunnel to.

Step 7 — SMF returns to AMF on N11: PDU session context including UE IP address, QoS profile, and the UPF’s N3 F-TEID. AMF wraps this into N2: PDU Session Resource Setup Request and sends it to the gNB.

Step 8 — gNB allocates a Data Radio Bearer (DRB) with QoS matching the 5QI profile, configures its GTP-U stack, and sends RRC Reconfiguration to the UE. UE confirms with RRC Reconfiguration Complete.

Step 9 — gNB sends N2: PDU Session Resource Setup Response to the AMF, including its own N3 F-TEID. AMF forwards this to the SMF via N11: Nsmf_PDUSession_UpdateSMContext.

Step 10 — This is the step most simplified flow diagrams skip and the one that causes “uploads work, downloads fail” incidents in production. The SMF now knows the gNB’s N3 TEID. It sends N4: PFCP Session Modification Request to the UPF, updating the downlink FAR from Action=BUFFER to Action=FORWARD with the gNB’s TEID as the outer header creation target. The UPF now has a complete bidirectional forwarding path. Data flows.
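The buffer-then-forward behaviour of Steps 6 and 10 can be sketched as a toy state model. This is not a PFCP encoder and the class and field names are hypothetical; it only shows why downlink packets sit in a buffer until the modification with the gNB F-TEID arrives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Far:
    """Forwarding Action Rule, reduced to the fields this example needs."""
    action: str                       # "FORWARD" or "BUFFER"
    outer_ip: Optional[str] = None    # gNB N3 address (downlink only)
    outer_teid: Optional[int] = None  # gNB N3 TEID (downlink only)


class UpfSession:
    """Toy per-session forwarding state on the UPF."""

    def __init__(self):
        # Step 6: uplink can go to N6 at once; the downlink has no gNB
        # tunnel endpoint yet, so it starts in BUFFER.
        self.uplink = Far(action="FORWARD")
        self.downlink = Far(action="BUFFER")
        self.buffered = []

    def modify_downlink(self, gnb_ip, gnb_teid):
        # Step 10: the SMF supplies the gNB F-TEID and flips the action.
        self.downlink = Far("FORWARD", gnb_ip, gnb_teid)
        flushed, self.buffered = self.buffered, []
        return flushed  # packets released from the buffer

    def on_downlink_packet(self, packet):
        if self.downlink.action == "BUFFER":
            self.buffered.append(packet)
            return "buffered"
        return (f"gtpu to {self.downlink.outer_ip} "
                f"teid={self.downlink.outer_teid:#x}")


session = UpfSession()
before = session.on_downlink_packet(b"dl-1")           # arrives before Step 10
flushed = session.modify_downlink("10.0.0.7", 0x1A2B)  # Step 10 completes
after = session.on_downlink_packet(b"dl-2")
print(before, flushed, after)
```

If `modify_downlink` is never called, every downlink packet stays in `buffered` while uplink keeps working, which is exactly the asymmetric symptom described above.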

Pro tip: When you see asymmetric connectivity in SA — uplink works, downlink does not — the first thing to check is whether Step 10 completed. Pull SMF logs and search for PFCP_SESSION_MODIFICATION_TIMEOUT. If the N4 modification timed out (UPF N4 processing queue backed up, N4 transport congestion), the UPF is still in buffering mode for that session’s downlink. No alarm fires. The user just cannot receive data.

4. Practical Example — Ooredoo Oman SA Deployment Scenario

Consider an Ooredoo SA deployment in Muscat. The gNB is at a commercial tower site in Al Khuwair. The RAN connects to the Seeb hub via metro fibre. The 5GC runs across two DCs: Muscat Central (primary) and Muscat Backup (geo-redundant, < 30 km apart, < 1 ms RTT between DCs). The AMF, SMF, PCF, UDM, and NRF run as Kubernetes-managed CNFs across both DCs. The UPF for consumer internet is centralised at Muscat Central. A second UPF for a smart port enterprise customer is on-premises at the Port of Sultan Qaboos.

Network Element | Physical Location | Transport / Interface | Protocol
--- | --- | --- | ---
gNB | Al Khuwair tower site | Backhaul: 10GE fibre to Seeb hub | N2 (NGAP/SCTP) to AMF; N3 (GTP-U/UDP) to UPF
AMF (active) | Muscat Central DC | Internal DC fabric (< 0.5 ms to SMF) | N1/N2 to gNB/UE; N11 to SMF; HTTP/2 SBI
SMF (active) | Muscat Central DC | Internal DC fabric | N4/PFCP to UPF; N7/HTTP2 to PCF; N11 to AMF
UPF (Internet) | Muscat Central DC | N3 over MPLS backhaul; N6 to internet peering | GTP-U on N3; native IP on N6
UPF (Enterprise) | Port of Sultan Qaboos (on-prem) | N4 over MPLS WAN (dedicated 1G); N6 to port LAN | PFCP on N4; local LAN breakout on N6
NRF | Both DCs (active-active) | Internal SBI fabric | HTTP/2 from all NFs; etcd replication between DCs

Table 5 — Realistic Ooredoo-style SA deployment in Muscat. Consumer traffic anchored at central DC. Enterprise UPF on-premises for sub-1ms latency and data sovereignty.

A consumer subscriber in Al Khuwair watching YouTube: phone → gNB → N3 GTP-U over metro backhaul → UPF Muscat Central → N6 → CDN cache at internet exchange → video. Round-trip: approximately 8–12 ms. Acceptable for any consumer use case.

A smart crane controller at the port: IoT device → local gNB at port → N3 GTP-U → on-premises UPF at port → N6 → crane controller LAN. Round-trip: under 1 ms. Data never leaves the port boundary. URLLC requirements met. This is what made enterprise 5G commercially viable.

5. Key Parameters and Technical Terms

Term / Parameter | Definition | Why It Matters
--- | --- | ---
S-NSSAI | Single NSSAI = SST (8-bit slice type) + SD (24-bit slice differentiator). Identifies a network slice end-to-end. | PDU session routing, UPF pool selection, RAN slice admission. Misconfigured S-NSSAI = session routed to wrong slice.
DNN (Data Network Name) | APN equivalent in 5G. Maps to a specific data network, UPF pool, and N6 breakout target. | Stage 1 of UPF selection. DNN misconfiguration = wrong UPF = data to wrong network.
TAI (Tracking Area Identity) | MCC + MNC + TAC. Identifies the UE’s current location area. | Drives location-based UPF selection. Missing TAI in SMF mapping = silent fallback to default pool.
F-TEID | Fully Qualified TEID = IP address + 32-bit Tunnel Endpoint Identifier. Identifies one end of a GTP-U tunnel. | N3 tunnel setup between gNB and UPF. Mismatch = total traffic loss for that session.
N4 / PFCP | Interface between SMF and UPF. Protocol: PFCP over UDP port 8805. Programs UPF per-session rules. | Every PDU session create/modify/delete goes over N4. Latency > 3 ms starts causing PFCP timeouts.
PDR / FAR / URR / QER | PFCP rule types: Packet Detection Rule / Forwarding Action Rule / Usage Reporting Rule / QoS Enforcement Rule. | The complete UPF programming model. These four rule types define exactly what UPF does with each packet.
CUPS | Control and User Plane Separation. SMF (control) and UPF (user plane) can be anywhere on the network. | Foundation of distributed UPF. SMF in central DC programs UPFs at edge sites via N4 over WAN.
Session-AMBR | Aggregate Maximum Bit Rate for a PDU session. From UDM subscription data. | Enforced by UPF QER. Sets the speed limit for the session. Misconfigured = subscriber gets wrong plan speed.
SSC Mode | Session and Service Continuity mode. Mode 1: anchor UPF preserved on mobility. Mode 2/3: reanchoring allowed. | Affects UPF placement for mobile UEs. Mode 1 + distributed UPF = UE may keep distant UPF after moving.
N26 Interface | AMF-to-MME interface for session context transfer during 5G→4G handover. | Without N26: every 5G→4G handover causes a service interruption as PDU session is re-established in EPC.
GUAMI | Globally Unique AMF ID: MCC + MNC + AMF Region ID + AMF Set ID + AMF Pointer (6-bit). | Embedded in 5G-GUTI assigned to UE. Used by gNB to route new registrations to correct AMF in pool.
DNAI | Data Network Access Identifier. Labels a specific UPF location for edge traffic steering. | Used by NEF Traffic Influence API and UL CL to steer traffic to correct edge UPF for MEC applications.

Table 6 — Key architecture terms and parameters. The ones that cause the most production incidents: TAI mapping gaps, F-TEID mismatches, and N4 PFCP timeouts.
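The bit widths in the table translate directly into simple packing arithmetic. The sketch below packs an S-NSSAI (8-bit SST, 24-bit SD) and the AMF Identifier part of a GUAMI. The 6-bit AMF Pointer comes from the table; the 8-bit Region ID and 10-bit Set ID widths are my recollection of 3GPP TS 23.003 and should be verified there. The function names are illustrative.

```python
def pack_snssai(sst, sd):
    """Pack SST (8 bits) and SD (24 bits) into one 32-bit integer."""
    if not 0 <= sst <= 0xFF:
        raise ValueError("SST must fit in 8 bits")
    if not 0 <= sd <= 0xFFFFFF:
        raise ValueError("SD must fit in 24 bits")
    return (sst << 24) | sd


def unpack_snssai(value):
    """Split a packed S-NSSAI back into (sst, sd)."""
    return value >> 24, value & 0xFFFFFF


def pack_amf_id(region_id, set_id, pointer):
    """Pack Region ID (8) + Set ID (10) + Pointer (6) into 24 bits."""
    if not 0 <= region_id <= 0xFF:
        raise ValueError("AMF Region ID must fit in 8 bits")
    if not 0 <= set_id <= 0x3FF:
        raise ValueError("AMF Set ID must fit in 10 bits")
    if not 0 <= pointer <= 0x3F:
        raise ValueError("AMF Pointer must fit in 6 bits")
    return (region_id << 16) | (set_id << 6) | pointer


embb_slice = pack_snssai(sst=1, sd=0x000001)
amf_id = pack_amf_id(region_id=1, set_id=1, pointer=1)
print(f"{embb_slice:#010x} {amf_id:#08x}")
```

The range checks matter in practice: a Set ID or Pointer that silently overflows its field ends up identifying a different AMF.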

6. Common Issues in the Field

NSA: X2 Latency Killing NR Performance

In NSA, the X2 interface between eNB and gNB must have RTT below 10 ms for stable NR SCG operation. When eNB and gNB basebands are in a centralised baseband hotel connected via microwave backhaul, X2 can reach 15–25 ms. The gNB cannot maintain bearer split coordination at that latency. Result: NR SCG drops and re-adds every 30–60 seconds per UE. User experience: strong 5G signal indicator, but throughput 30–40% below expected. No alarm fires. The RAN team blames coverage. The transport team does not know X2 exists.

Field Note: X2 RTT 18 ms — NR SCG Instability Across 40 Sites, Salalah
Deployed NSA across 40 sites in Salalah with centralised basebands at regional hub. X2 routed via
shared microwave backhaul. Measured X2 RTT: 18–22 ms. NR SCG add/drop counter showed continuous
instability. Drive tests confirmed 5G icon present but peak throughput 35% of theoretical.
Fix: provisioned dedicated 1 GE fibre VLAN for X2 traffic with strict queue. RTT dropped to 2 ms.
Throughput recovered to expected levels. No radio configuration change needed.
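A check like the one that exposed this problem can be sketched in a few lines: given RTT samples per site, flag every site whose median X2 RTT exceeds the 10 ms limit. The site identifiers and sample values are invented for illustration, and using the median rather than the mean or a percentile is my assumption.

```python
from statistics import median

X2_RTT_LIMIT_MS = 10.0  # limit used throughout this article


def flag_x2_sites(samples_by_site, limit_ms=X2_RTT_LIMIT_MS):
    """Return {site: median_rtt_ms} for sites above the X2 RTT limit."""
    flagged = {}
    for site, samples in samples_by_site.items():
        rtt = median(samples)
        if rtt > limit_ms:
            flagged[site] = rtt
    return flagged


# Hypothetical measurements: one site on microwave, one on fibre.
measurements = {
    "SLL-001": [18.0, 20.5, 22.0],
    "SLL-002": [1.8, 2.0, 2.3],
}
flagged = flag_x2_sites(measurements)
print(flagged)
```

Running this against transport probes before NR activation is cheaper than discovering the issue through drive tests.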

SA: NRF Overload — The Domino Effect

The NRF is the single point of discovery for the entire 5GC. When NRF is slow or unavailable, all NF-to-NF calls start failing in sequence. The most common trigger is NRF discovery caching being disabled — every registration generates live NRF queries from AMF (two queries: one for AUSF, one for UDM). At scale during a cold start or launch event, this generates millions of NRF queries in minutes.

Field Note: NRF Overload on SA Commercial Launch — Registration Success Rate 71%
200,000 UE registrations within 20 minutes of commercial SA launch. AMF NRF caching was disabled.
Each registration: 2 NRF discovery queries (AUSF + UDM lookup). NRF CPU hit 95%.
Auth success rate dropped to 71%. NRF HTTP/2 response latency reached 800 ms.
Fix: enable NRF discovery result caching with 300-second TTL on all AMF instances.
NRF query rate dropped 85%. Auth success rate recovered to 99.8% within 3 minutes.
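The effect of caching can be illustrated with a small TTL cache in front of the discovery call. This is a sketch under simplifying assumptions: one AMF instance, one cache entry per NF type, and a hypothetical `query_nrf` callable standing in for the real discovery request. Real reductions, such as the 85% in the field note, depend on how many AMF instances and distinct query keys exist.

```python
import time


class DiscoveryCache:
    """Cache NRF discovery results per NF type for ttl_s seconds."""

    def __init__(self, ttl_s=300.0, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self.nrf_queries = 0          # how many lookups reached the NRF
        self._entries = {}            # nf_type -> (stored_at, result)

    def lookup(self, nf_type, query_nrf):
        now = self.clock()
        entry = self._entries.get(nf_type)
        if entry is not None and now - entry[0] < self.ttl_s:
            return entry[1]           # served from cache
        self.nrf_queries += 1
        result = query_nrf(nf_type)
        self._entries[nf_type] = (now, result)
        return result


class FakeClock:
    """Manually advanced clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


# Uncached load from the field note: 200,000 registrations x 2 queries
# spread over 20 minutes.
uncached_qps = 200_000 * 2 / (20 * 60)

clock = FakeClock()
cache = DiscoveryCache(ttl_s=300.0, clock=clock)
for _ in range(1000):                 # 1000 registrations on one AMF
    cache.lookup("AUSF", lambda t: [f"{t}-1"])
    cache.lookup("UDM", lambda t: [f"{t}-1"])
queries_within_ttl = cache.nrf_queries

clock.now = 301.0                     # TTL expired
cache.lookup("AUSF", lambda t: [f"{t}-1"])
queries_after_expiry = cache.nrf_queries
print(round(uncached_qps), queries_within_ttl, queries_after_expiry)
```

The trade-off is staleness: with a 300-second TTL, an NF that deregisters may still be selected for up to five minutes unless the consumer also subscribes to NRF status notifications.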

SA: Silent UPF Selection Fallback — Data Sovereignty Breach

The SMF maps UE Tracking Areas to UPF pools to ensure location-based selection. If a TAI is missing from the mapping table, the SMF does not throw an error; it silently falls back to the default UPF pool. For enterprise private slices, this means enterprise IoT device traffic is routed to the shared internet UPF. No alarm fires. The compliance audit finds it later.

Field Note: Factory IoT Traffic on Internet UPF — Compliance Incident
Smart factory IoT devices connecting from building extension (new TAI not in SMF mapping table).
SMF fell back to internet UPF pool. Factory OT traffic traversing shared infrastructure.
Discovered during quarterly security audit — 3 months after deployment.
Fix: TAI mapping audit before go-live. Monitor: alert on any enterprise DNN session assigned to internet UPF.
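Both halves of that fix can be sketched briefly: a lookup that reports when it fell back instead of doing so silently, and a monitor that scans a session table for enterprise DNN sessions on a shared pool. The table layout, pool names, and DNN names are hypothetical; a real SMF exposes this data through vendor-specific interfaces.

```python
def resolve_upf_pool(tai, tai_to_pool, default_pool):
    """Return (pool, fell_back) so a fallback is never silent."""
    pool = tai_to_pool.get(tai)
    if pool is None:
        return default_pool, True
    return pool, False


def find_sovereignty_violations(sessions, enterprise_dnns, shared_pools):
    """List sessions whose enterprise DNN landed on a shared UPF pool."""
    return [
        s for s in sessions
        if s["dnn"] in enterprise_dnns and s["upf_pool"] in shared_pools
    ]


# Hypothetical mapping: the building extension's TAI was never added.
tai_to_pool = {"42203-0x0A01": "factory-upf"}
pool, fell_back = resolve_upf_pool("42203-0x0A02", tai_to_pool, "internet-upf")

sessions = [
    {"supi": "imsi-1", "dnn": "factory.ot", "upf_pool": "factory-upf"},
    {"supi": "imsi-2", "dnn": "factory.ot", "upf_pool": pool},
    {"supi": "imsi-3", "dnn": "internet", "upf_pool": "internet-upf"},
]
violations = find_sovereignty_violations(
    sessions, enterprise_dnns={"factory.ot"}, shared_pools={"internet-upf"}
)
print(fell_back, [v["supi"] for v in violations])
```

Either signal alone would have cut the three-month detection delay to minutes.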

Private 5G: WAN Failure Blocks New Sessions

Split deployments — on-premises UPF, remote SMF at operator DC — create a dependency on WAN for N4 PFCP. When the WAN fails: N4 PFCP heartbeats time out (default 3 retries over 9 seconds). SMF declares UPF unreachable. Existing sessions continue forwarding — GTP-U tunnels are still intact. But no new PDU sessions can be established. Any device that reboots during the outage cannot reconnect. In a factory, this means new shifts cannot log in, powered-off equipment cannot re-attach.

7. Troubleshooting Approach

Start With the Architecture, Not the Symptoms

When a session failure or performance issue appears, the first question is always: which layer of the architecture does this belong to? Use this sequence:

Check 1 — Is this NSA or SA? NSA failures are almost always X2 latency, S1 transport, or MME capacity. SA failures are NRF, SMF, N4, or PCF. The core type defines the failure mode space.

Check 2 — Is it affecting all sessions or specific sessions? All sessions failing = NRF/AMF/N4 association problem. Specific DNN or S-NSSAI only = UPF pool selection or N4 session rule problem. Specific TAI only = location-based UPF mapping or transport path issue.

Check 3 — Is uplink working but downlink failing? This is Step 10 (PFCP Session Modification with gNB TEID) not completing. Pull SMF logs: search for PFCP_SESSION_MODIFICATION_TIMEOUT. Check UPF N4 queue depth and N4 transport latency.

Check 4 — Is enterprise traffic on the wrong UPF? Query SMF session table filtered by enterprise DNN. Cross-reference UPF assignment. If internet UPF is serving enterprise DNN sessions, you have a TAI mapping gap.
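The four checks can be written down as a small triage function, which is a convenient way to keep an on-call runbook consistent. The argument names and the wording of the returned hints are my own; the logic only restates the checks above.

```python
SCOPE_HINTS = {
    "all": "All sessions failing: NRF, AMF, or N4 association problem",
    "dnn_or_slice": "One DNN or S-NSSAI: UPF pool selection or N4 session rules",
    "tai": "One TAI: location-based UPF mapping or transport path",
}


def triage(core_type, scope="all", uplink_ok=True, downlink_ok=True,
           enterprise_on_shared_upf=False):
    """Return an ordered list of places to look, following Checks 1-4."""
    # Check 1: the core type defines the failure mode space.
    if core_type == "NSA":
        return ["NSA: look at X2 latency, S1 transport, or MME capacity"]

    # Check 2: how wide is the impact?
    findings = [SCOPE_HINTS[scope]]

    # Check 3: asymmetric connectivity points at the Step 10 modification.
    if uplink_ok and not downlink_ok:
        findings.append(
            "Downlink only: check whether the Step 10 PFCP modification completed"
        )

    # Check 4: enterprise sessions on the internet UPF.
    if enterprise_on_shared_upf:
        findings.append(
            "Enterprise DNN on internet UPF: look for a TAI mapping gap"
        )
    return findings


result = triage("SA", scope="tai", downlink_ok=False)
for line in result:
    print(line)
```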

Symptom | Most Likely Cause | Where to Look | Fix
--- | --- | --- | ---
5G NR connected but throughput 30–40% low (NSA) | X2 RTT > 10 ms causing SCG instability | NR SCG drop counter; measure X2 RTT on transport | Dedicate low-latency transport path for X2
SA registration success rate drops suddenly | NRF overloaded or unavailable | NRF CPU/memory; NRF error counter; AMF NRF cache hit ratio | Enable NRF discovery caching; scale NRF to 3+ pods
PDU session setup fails for one DNN only | DNN-to-UPF-pool misconfiguration or UPF capacity | SMF logs: UPF selection stage failure; UPF session table capacity | Fix pool mapping; add UPF capacity
Uploads work, downloads fail (SA) | PFCP Session Modification (Step 10) timeout | SMF: PFCP_SESSION_MODIFICATION_TIMEOUT; UPF N4 queue depth | Tune N4 timers; scale UPF N4 threads
Enterprise traffic on shared UPF | Missing TAI in SMF location-UPF mapping | SMF session table: enterprise DNN mapped to internet UPF | Complete TAI mapping table; add monitoring alert
Private 5G site: existing sessions up, new sessions fail | WAN failure causing N4 PFCP association loss to remote SMF | PFCP heartbeat status; WAN link state | Deploy local SMF on-premises; N4 becomes loopback
All sessions on all NFs fail simultaneously | NRF crashed or network partition isolates NRF | NRF pod status in K8s; inter-DC network health | NRF HA: 3 replicas minimum, anti-affinity across hosts

Table 7 — Troubleshooting guide by symptom. Architecture type (NSA vs SA) determines which failure mode is possible.

8. Design Recommendations — Consultant Level

Plan the Right Architecture Before You Deploy — Not After

Every architecture choice that seems reversible at the design stage becomes expensive to change in production. The three decisions that lock you in the most are: (1) UPF placement — adding distributed UPF post-launch means new HW procurement, transport engineering, and SMF reconfiguration; (2) NRF sizing — too small and you find out on launch day; (3) whether SMF is local or remote for private 5G sites — the WAN dependency is a hidden risk that only surfaces when the WAN fails.

Architect UPF placement during the SA core design phase, even if you deploy central-only initially. Document the distributed UPF rollout plan so it can be executed without redesigning the SMF configuration from scratch.

NRF — Treat It Like a Database Cluster, Not an Application

The NRF must always run with minimum 3 replicas, pod anti-affinity across physical hosts, and inter-DC replication for geo-redundant deployments. Enable discovery result caching on every NF that queries NRF — default TTL of 300 seconds reduces NRF query load by 80–90% in a live network. Load-test NRF with a simulated registration storm before commercial launch. If you cannot sustain 10× normal registration rate on NRF for 5 minutes, you will have problems.

N4 Interface — Not a Management Plane, a Mission-Critical Interface

N4 carries PFCP for every PDU session create, modify, and delete — plus handover path switch updates which can be 3–5× the session creation rate during busy hour. Mark N4 traffic with DSCP CS6. Reserve explicit bandwidth on every transport segment it crosses. Target < 1 ms one-way latency between SMF and each UPF. If N4 goes over WAN for a private 5G site, treat it like a leased line: SLA, monitoring, failover path.
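For reference, DSCP CS6 is code point 48, which occupies the top six bits of the IP TOS byte (0xC0). The sketch below shows the arithmetic and how an application could request that marking on a UDP socket using the standard `IP_TOS` socket option. Whether the marking survives depends on the host OS and on every router along the path honouring it, and production SMF/UPF software normally sets this through its own configuration rather than application code.

```python
import socket

DSCP_CS6 = 48      # Class Selector 6
PFCP_PORT = 8805   # PFCP over UDP, as listed in the parameters table


def dscp_to_tos(dscp):
    """DSCP is the upper 6 bits of the TOS byte; the low 2 bits are ECN."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2


def open_pfcp_socket(dscp=DSCP_CS6):
    """Create a UDP socket whose outgoing packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(DSCP_CS6)))
```

A typical use would be `sock = open_pfcp_socket()` followed by `sock.sendto(payload, (upf_ip, PFCP_PORT))`; the transport team still has to map CS6 to a queue with reserved bandwidth for the marking to mean anything.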

SMF and UPF Kubernetes Sizing — Guaranteed QoS Is Not Optional

Set K8s resource requests = limits for SMF and UPF pods. This gives them Guaranteed QoS class. Burstable pods are more likely to be OOMKilled or evicted under memory pressure, and an SMF pod that gets OOMKilled drops every PDU session it was managing, potentially millions of users. SMF memory consumption scales with active PDU sessions: estimate 10–16 MB per 1000 sessions. At 10 million concurrent sessions that is 100–160 GB across the SMF cluster before headroom. Size with 100% headroom, which puts the cluster at 200–320 GB.
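Taking the estimate of 10–16 MB per 1000 sessions at face value, the sizing arithmetic can be worked through as follows. The function is an illustrative helper using decimal gigabytes (1 GB = 1000 MB); actual per-session memory varies by vendor and feature set and should be measured.

```python
def smf_memory_gb(sessions, mb_per_1000_sessions, headroom=1.0):
    """Cluster-wide SMF memory in GB; headroom=1.0 means 100% extra."""
    base_mb = sessions / 1000 * mb_per_1000_sessions
    return base_mb * (1 + headroom) / 1000


sessions = 10_000_000
low = smf_memory_gb(sessions, 10, headroom=0.0)    # optimistic, no headroom
high = smf_memory_gb(sessions, 16, headroom=0.0)   # pessimistic, no headroom
sized = smf_memory_gb(sessions, 16, headroom=1.0)  # pessimistic + 100%
print(low, high, sized)
```

Dividing the sized figure by the per-pod memory limit gives the replica count, and because requests equal limits that number is also what the cluster must be able to schedule.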

Private 5G — The WAN Dependency is a Design Decision, Not an Afterthought

If you deploy a private 5G site with an on-premises UPF and a remote SMF hosted at the operator’s central DC, document clearly: when the WAN fails, existing sessions continue but no new sessions can be established. If the customer’s use case requires continuous operation (factory automation, port operations), the SMF must be local. The correct model for critical private 5G sites: local SMF + local UPF, WAN only for OAM and UDM access.

9. Summary — Key Takeaways

Topic | Key Takeaway
--- | ---
NSA | No 5G Core in NSA. NR radio on 4G core. X2 RTT must be < 10 ms. Correct for time-to-market but cannot support slicing, VoNR, URLLC, or enterprise SLAs. Plan SA migration before NSA launch.
SA | Full 5GC with SBA. NRF is the DNS and the most critical NF. PDU session setup is 10 steps, with Step 10 (downlink PFCP modification) the silent failure point for asymmetric connectivity.
Centralised UPF | Right default for consumer. Simple operations. 15–40 ms latency blocks URLLC and edge computing. Never commit to centralised UPF for enterprise OT without checking the latency SLA.
Distributed UPF | Enables URLLC and MEC. Requires SMF TAI mapping per site. Missing TAI = silent fallback to wrong UPF = data sovereignty risk. Plan before SA launch.
Private 5G | On-premises UPF mandatory for data sovereignty and URLLC. Remote SMF creates WAN dependency. Critical sites: local SMF + local UPF.
Hybrid | Best commercial model. Shared control plane reduces cost. Dedicated UPF per enterprise delivers performance. Growing fastest in GCC.
NRF | 3+ replicas, anti-affinity, discovery caching enabled. NRF failure takes down the entire 5GC simultaneously.
N4 / PFCP | Mission-critical interface. DSCP CS6, < 1 ms latency, explicit bandwidth. Shared transport causes PFCP timeouts that look like SMF bugs.
SMF Sizing | Memory-dominant. requests = limits in K8s (Guaranteed QoS). OOMKill = mass session drop.
UPF Selection Bug | Five-stage selection. Missing TAI at any stage = silent wrong-pool assignment. Audit mapping tables before go-live.

Table 8 — Post 01 summary. Architecture decisions made in the design phase are expensive to reverse in production.

Next: Post 02 — SMF Signalling Flows

Complete PDU session establishment with every message named and every parameter explained. N4/PFCP deep dive with real session rules. The full UPF selection algorithm. Modification and release procedures. Charging integration. And the top field failures engineers encounter in year one of SA production.

Muhammad Tahir Riaz

Data Analytics & Automation Consultant  |  17+ years telecom  |  trmtelcocloudai.com
