SBA & NF APIs Deep Dive

HTTP/2 on SBI, NRF discovery, SCP routing, OAuth2 tokens, mTLS, API versioning — the protocol layer every 5GC engineer must understand

1. What Is SBA — The Simple Version

The Service-Based Architecture (SBA) is the defining characteristic of 5GC and what makes it fundamentally different from the 4G EPC. In 4G, network elements communicated over point-to-point interfaces with dedicated protocols: Diameter for authentication and policy, GTP-C for session management, S1AP for eNB control. Every interface was a separate protocol to learn, configure, and troubleshoot.

In 5GC, every NF exposes its services as HTTP/2 REST APIs over the Service-Based Interface (SBI). An AMF exposes Namf services. An SMF exposes Nsmf services. They find each other through the NRF and call each other’s APIs directly. All NF-to-NF control signalling inside the core runs on HTTP/2 over TLS, the same protocol stack as modern web services (the interfaces towards the gNB and the UPF, N2 and N4, still use NGAP and PFCP). This brings enormous benefits in tooling, monitoring, and developer familiarity. It also brings new failure modes: HTTP/2 stream floods, OAuth2 token expiry storms, and API version mismatches.

3GPP Reference
3GPP TS 29.500 — Technical Realization of Service-Based Architecture
3GPP TS 29.510 — NRF Services: NF Management and Discovery
3GPP TS 33.501 Section 13 — Security for SBI: mTLS and OAuth2
RFC 7540 — HTTP/2 (the transport protocol for SBI)

2. Architecture — HTTP/2 on the SBI

Why HTTP/2, Not HTTP/1.1

| HTTP/2 Feature | How It Works | Benefit for 5GC SBI |
| --- | --- | --- |
| Multiplexing | Multiple request/response streams on a single TCP connection, with no HTTP-level head-of-line blocking (TCP-level blocking on packet loss remains) | AMF sends 1000s of parallel N11 requests without 1000s of TCP connections |
| Header compression (HPACK) | Repeated headers (auth token, content-type) compressed after first exchange | Roughly 60% header overhead reduction on high-frequency NF calls such as N11 SM context updates |
| Server Push | Server sends data proactively without client polling | Optional on the SBI and rarely relied on. PCF policy updates normally reach the SMF as notifications (an HTTP POST to a callback URI the SMF supplied), which also removes the need for a polling loop |
| Binary framing | Binary protocol, not text | Faster parsing, less CPU overhead per message vs HTTP/1.1 |
| Stream prioritisation | Critical streams (auth) can be prioritised over background requests | AMF auth flows get scheduling priority over periodic NRF heartbeats |

Table 1 — HTTP/2 features and 5GC SBI benefits. Multiplexing is the most impactful feature for high-volume NF-to-NF communication.

NRF — Service Discovery

The NRF (Network Repository Function) is the service registry for the entire 5GC. Every NF registers its profile on startup: NF Type (AMF, SMF, etc.), NF Instance ID (UUID), supported services, FQDN/IP, capacity, and supported S-NSSAIs/DNNs. Every NF that needs to call another NF first queries NRF to get the endpoint.

NRF discovery uses an HTTP/2 GET, for example: GET /nnrf-disc/v1/nf-instances?target-nf-type=UDM&requester-nf-type=AMF&supi=imsi-xxx&snssais=[…]. The target-nf-type and requester-nf-type parameters are mandatory; the rest are optional filters. NRF returns a SearchResult containing the matching NF profile(s). The consumer NF caches the result for the TTL specified in the response (validityPeriod). This cache is critical for NRF load management: without it, every NF call generates a live NRF query.
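The query-then-cache behaviour described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `build_discovery_url` and `DiscoveryCache` are hypothetical names, the NRF address is a placeholder, and the sketch includes `requester-nf-type`, which TS 29.510 lists as mandatory alongside `target-nf-type`.

```python
import time
from urllib.parse import urlencode


def build_discovery_url(nrf_api_root, target_nf_type, requester_nf_type, filters=None):
    """Build an Nnrf_NFDiscovery query URL from mandatory and optional filters."""
    params = {"target-nf-type": target_nf_type, "requester-nf-type": requester_nf_type}
    params.update(filters or {})
    return f"{nrf_api_root}/nnrf-disc/v1/nf-instances?{urlencode(params)}"


class DiscoveryCache:
    """Keeps SearchResult bodies until their validityPeriod (seconds) elapses."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # url -> (expires_at, search_result)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, result = entry
        if self._clock() >= expires_at:
            del self._entries[url]   # stale: caller must query NRF again
            return None
        return result

    def put(self, url, search_result):
        # Assumption: a missing validityPeriod means "do not cache".
        ttl = search_result.get("validityPeriod", 0)
        self._entries[url] = (self._clock() + ttl, search_result)
```

A consumer would call `get` before every discovery and only send the GET to NRF on a miss, then `put` the response. Keying the cache on the full query URL keeps results for different SUPI ranges or slices separate.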

SCP — Optional but Growing

The SCP (Service Communication Proxy) is an HTTP/2 proxy that can sit between any two NFs. In indirect communication mode, NFs send all SBI requests to the SCP, which handles discovery, load balancing, and routing. This centralises observability (all inter-NF traffic visible at SCP), circuit breaking, and load balancing. The trade-off: extra hop latency (~0.2–0.5 ms) on every NF-to-NF call.
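In indirect mode the consumer addresses the request to the SCP and carries the real destination in the 3gpp-Sbi-Target-apiRoot header. A rough sketch of that rewrite is shown here; `to_scp_request` is a hypothetical helper, the hostnames are placeholders, and the sketch assumes the target apiRoot has no deployment-specific path prefix.

```python
from urllib.parse import urlsplit


def to_scp_request(target_url, scp_api_root):
    """Rewrite a direct SBI URL so that it is sent via the SCP.

    Returns (url_to_send, extra_headers). The producer's scheme and authority
    move into the 3gpp-Sbi-Target-apiRoot header; the resource path and query
    are kept unchanged.
    """
    parts = urlsplit(target_url)
    target_api_root = f"{parts.scheme}://{parts.netloc}"
    path = parts.path + (f"?{parts.query}" if parts.query else "")
    url_to_send = scp_api_root.rstrip("/") + path
    return url_to_send, {"3gpp-Sbi-Target-apiRoot": target_api_root}
```

For example, a call to `https://smf1.example.net:8443/nsmf-pdusession/v1/sm-contexts` through `https://scp.example.net` becomes a request to `https://scp.example.net/nsmf-pdusession/v1/sm-contexts` with the SMF apiRoot in the header.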

3. Step-by-Step — OAuth2 Token Flow

Every NF-to-NF call on the SBI must be authorised via OAuth2 (3GPP TS 33.501). Here is the flow for an AMF calling the SMF:

Step 1: AMF needs to call SMF Nsmf_PDUSession_CreateSMContext. Before calling, AMF checks its token cache: does it have a valid token for scope “nsmf-pdusession:create” targeting this SMF instance?

Step 2: If there is no valid token, AMF sends POST /oauth2/token to NRF. The form-encoded body carries grant_type=client_credentials, the AMF’s own NF instance ID, scope=“nsmf-pdusession:create”, and the target SMF instance ID as the intended audience. The AMF proves its identity with its mTLS client certificate or, where TLS does not run end to end, with a client credentials assertion (a JWT signed with the AMF private key).

Step 3: NRF validates AMF’s identity (checks the AMF certificate or assertion), verifies that AMF is authorised for this scope, and returns a JWT access token signed with NRF’s private key. The token contains: issuer (NRF), subject (AMF instance ID), audience (SMF instance ID), scope, and expiry (3600 s is a common default; the lifetime is set by NRF configuration).

Step 4: AMF caches the token and calls SMF: POST /nsmf-pdusession/v1/sm-contexts with Authorization: Bearer <JWT>.

Step 5: SMF validates the token: it verifies the JWT signature using the NRF public key, checks that the scope matches the requested service, checks the expiry, and checks that the audience matches its own instance ID. If all checks pass, it processes the request. If the token has expired, it returns 401 and AMF fetches a fresh token and retries. If the token is valid but the scope does not cover the service, it returns 403.
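The producer-side checks in Step 5 can be sketched as a pure function over already-decoded claims. This is a simplified illustration under stated assumptions: `check_access_token_claims` is a hypothetical name, signature verification is assumed to have succeeded beforehand, the audience is treated as a single instance ID, and answering 401 for an audience mismatch is a choice made for this sketch.

```python
def check_access_token_claims(claims, my_instance_id, required_scope, now):
    """Return the HTTP status a producer might answer with for these JWT claims.

    claims         -- decoded JWT payload (dict); signature already verified
    my_instance_id -- NF instance ID of the producer doing the check
    required_scope -- scope needed for the requested service operation
    now            -- current time as a Unix timestamp
    """
    if claims.get("exp", 0) <= now:
        return 401  # expired: the consumer should fetch a fresh token and retry
    if claims.get("aud") != my_instance_id:
        return 401  # token was issued for a different producer (assumed status)
    granted = claims.get("scope", "").split()  # scope is a space-separated list
    if required_scope not in granted:
        return 403  # token is valid but does not cover this service
    return 200
```

Keeping the checks free of I/O makes them easy to unit-test with fixed timestamps, which is useful when reproducing expiry-related incidents.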

Pro Tip
Token expiry storms happen when all NFs simultaneously try to refresh tokens after a network event.
Stagger token refresh: each NF should refresh tokens at (expiry – random jitter) seconds.
Without jitter: all AMF pods refresh all SMF/AUSF/UDM tokens simultaneously after a maintenance window.
NRF token endpoint gets flooded: returns 503. NFs cannot call each other. Core degraded for minutes.
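The "(expiry – random jitter)" rule can be written in a few lines. The function name and the 60-second jitter window are assumptions chosen for illustration; the right window depends on how many NF instances share the NRF.

```python
import random


def next_refresh_delay(seconds_until_expiry, max_jitter=60.0, rng=random):
    """Seconds to wait before refreshing a token: expiry minus random jitter.

    Each NF instance draws its own jitter, so refreshes spread across the
    max_jitter window instead of arriving at the NRF in the same second.
    The result is clamped at zero for tokens that are about to expire.
    """
    jitter = rng.uniform(0.0, max_jitter)
    return max(0.0, seconds_until_expiry - jitter)
```

Passing a seeded `random.Random` instance as `rng` makes the schedule reproducible in tests, while production code can rely on the module-level generator.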

4. Key Parameters and Technical Terms

| Term | Definition | Operational Significance |
| --- | --- | --- |
| SBI | Service-Based Interface. The HTTP/2 over TLS interface used for all NF-to-NF communication. | Every NF needs a correct FQDN, TLS certificate, and OAuth2 scope configuration. An NF with any one of these misconfigured cannot communicate. |
| NF Profile | Data structure every NF registers in NRF: NF Type, UUID, services, endpoints, capacity, slice support. | NRF discovery returns NF profiles. Incomplete or stale profile = consumer NF cannot find or route to producer NF. |
| Service Name | Each NF service has a standard name: nsmf-pdusession, nudm-sdm, npcf-smpolicycontrol, etc. | Used in OAuth2 scope and NRF discovery query. Mismatch = 403 Forbidden or discovery returns no results. |
| validityPeriod | TTL of an NRF discovery result. The consumer NF caches discovered endpoints for this duration. | Set to 300 s minimum. NFs not caching = NRF overloaded by live queries. NFs caching too long = stale endpoints. |
| mTLS | Mutual TLS: both client NF and server NF present certificates signed by the operator CA. | All SBI communication requires mTLS. A certificate expiry on any NF breaks all its SBI connections. |
| OAuth2 scope | Defines which NF service the token authorises. Format: {service-name}:{operation}. | Example: “nsmf-pdusession:create” authorises CreateSMContext. NF rejects calls with tokens of wrong scope. |
| 3gpp-Sbi-Target-apiRoot | HTTP header carrying the apiRoot (scheme plus FQDN or IP and port) of the target NF for SCP routing. | Consumer NF sets this header when using SCP indirect communication. SCP uses it for its routing decision. |
| API version (URI path) | SBI API versions are embedded in the URI: /nsmf-pdusession/v1/. Major version = breaking change. | Both NFs must support compatible API versions. Mixed-version deployments during upgrades must be tested. |
| HTTP/2 stream | A logical bidirectional flow within an HTTP/2 connection. Multiple streams multiplex on one TCP connection. | Stream limit: SETTINGS_MAX_CONCURRENT_STREAMS, advertised by each endpoint. RFC 7540 sets no default limit and recommends at least 100; many HTTP/2 stacks default to 100. A sender must not exceed the peer's value. Configure per NF. |
| NF heartbeat | NF sends a periodic HTTP PATCH to its own NF instance resource in NRF, at the interval given by heartBeatTimer, to keep its registration active. | Typical setting: every 60 s. If NRF misses heartbeats for longer than its configured tolerance (for example heartBeatTimer × 2), it marks the NF as SUSPENDED and stops returning it in discovery results. |

Table 2 — SBA key parameters. Certificate management and OAuth2 scope configuration are the two most operationally intensive aspects of SBA in production.
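The "API version (URI path)" entry lends itself to a mechanical pre-upgrade check. The sketch below extracts the service name and major version from a resource path and compares it with what a producer claims to support. Both function names are hypothetical, and the regular expression assumes paths start directly with the service name (no deployment-specific prefix).

```python
import re

# Matches "/<service-name>/v<major>/..." at the start of an SBI resource path.
_API_URI = re.compile(r"^/(?P<service>[a-z0-9-]+)/v(?P<major>\d+)/")


def parse_api_version(path):
    """Return (service_name, major_version) from an SBI resource path."""
    match = _API_URI.match(path)
    if match is None:
        raise ValueError(f"not a versioned SBI path: {path!r}")
    return match.group("service"), int(match.group("major"))


def producer_supports(path, supported):
    """True if the producer lists this service at the major version in the path.

    supported maps service name -> set of major versions, for example
    {"nsmf-pdusession": {1}}.
    """
    service, major = parse_api_version(path)
    return major in supported.get(service, ())
```

Running a check like this over the request paths a consumer emits, against the versions each producer registers, gives an early warning before a phased upgrade rather than a 404 in production.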

5. Common Issues in the Field

Field Note: OAuth2 Token Expiry Storm — NRF Overload Post-Maintenance
All NF pods restarted during planned maintenance window (simultaneous restart for 4 NF types).
On restart, all pods simultaneously attempt OAuth2 token acquisition for all their peer NFs.
800 token requests/second hit NRF token endpoint (normal: 15/second).
NRF returned 503 for 4 minutes until its token processing queue drained.
NFs could not call each other — registration and session setup failed during this window.
Fix: add random jitter (0–60 seconds) to token refresh on NF pod startup.
Stagger NF pod restarts across 10-minute windows during maintenance.

Field Note: mTLS Certificate Expiry — Silent SBI Failures
SMF TLS certificate expired at 02:00. cert-manager had rotated the secret but SMF pods had not reloaded.
SMF continued presenting old (expired) certificate on SBI. AMF TLS handshake failed: certificate expired.
N11 calls from AMF to SMF: TLS handshake failure → 503. PDU session setup failure rate: 100%.
No NF-level alarm — TLS failure happens before the HTTP layer. SMF reported itself healthy.
Detection: AMF N11 error counter spike; Grafana TLS handshake failure rate.
Fix: deploy K8s Operator cert rotation that reloads SMF pods on cert secret rotation. Test before production.
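An early-warning check on certificate lifetime complements the reload fix above. The sketch below uses Python's standard `ssl.cert_time_to_seconds` to turn a certificate's notAfter string into a days-remaining figure. The function names and the 30-day threshold are assumptions for illustration; how the notAfter value is obtained (for example from `openssl s_client` output or a metrics exporter) is left out.

```python
import ssl

SECONDS_PER_DAY = 86400.0


def days_until_expiry(not_after, now_epoch):
    """Days left before a certificate's notAfter time (negative if expired).

    not_after uses the text format reported by Python's ssl module,
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_epoch = ssl.cert_time_to_seconds(not_after)
    return (expires_epoch - now_epoch) / SECONDS_PER_DAY


def needs_alert(not_after, now_epoch, threshold_days=30):
    """True when the certificate expires within threshold_days (assumed value)."""
    return days_until_expiry(not_after, now_epoch) < threshold_days
```

Note that this only tells you what a certificate file or endpoint reports. In the incident above the rotated secret was fine and the running pods were not, so the check needs to run against the certificate the NF actually presents on the SBI.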

6. Troubleshooting

| Symptom | Root Cause | Check | Fix |
| --- | --- | --- | --- |
| NF-to-NF calls return 401 Unauthorized or 403 Forbidden | OAuth2 token expired (401) or token scope does not cover the requested service (403) | Token expiry in JWT (decode and check exp claim); scope in token vs required | Refresh token; check scope config in NRF authorisation policy |
| NF-to-NF calls return 503 after maintenance | Token expiry storm: NRF token endpoint overloaded | NRF token endpoint request rate; NRF CPU during incident | Add startup jitter to token refresh; stagger NF pod restarts |
| SBI TLS handshake failure | NF certificate expired; cert-manager rotated secret but NF not reloaded | K8s Secret last-updated; NF pod certificate expiry: openssl s_client | Use Operator cert rotation that triggers pod reload; monitor cert expiry in Grafana |
| NRF discovery returns empty results | NF not registered or NF profile missing required fields (NF Type, service names) | NRF: GET /nnrf-nfm/v1/nf-instances?nf-type=<TYPE>; check profile completeness | Fix NF registration config; check NRF authorisation policy for NF type |
| API version mismatch between NFs after upgrade | One NF upgraded to v2 API, peer still on v1 | HTTP 404 or 415 on SBI calls; check URI version in NF logs | Validate API version compatibility before upgrade; run both versions in parallel during phased upgrade |

Table 3 — SBA troubleshooting. Certificate expiry and OAuth2 issues are the top two SBA-specific failure modes in production.
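For the first row of the table, "decode and check exp claim" needs nothing beyond the standard library. The sketch below reads the payload of a JWT for inspection only: it does not verify the signature, so it is a troubleshooting aid and not a validation step. The function names are hypothetical.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected header.payload.signature")
    payload = segments[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_remaining(token, now_epoch):
    """Seconds until the exp claim is reached (negative if already expired)."""
    return decode_jwt_payload(token)["exp"] - now_epoch
```

Comparing the decoded `scope` and `aud` claims with what the producer expects usually separates a 401 (expired or wrong audience) from a 403 (wrong scope) within a minute or two.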

7. Summary — Key Takeaways

| Topic | Key Takeaway |
| --- | --- |
| HTTP/2 on SBI | All NF-to-NF communication uses HTTP/2 over TLS. Multiplexing reduces TCP connection overhead. Proactive policy delivery relies on notifications to callback URIs; HTTP/2 Server Push is optional and rarely used. |
| NRF discovery | NFs find each other through NRF. Enable discovery result caching (validityPeriod=300s minimum). Without caching: NRF overloaded at scale. |
| OAuth2 | Every SBI call requires a valid JWT token. Tokens expire (commonly after 3600 s). Add jitter to token refresh on pod startup to prevent expiry storms. |
| mTLS | Both client and server NF present certificates. Expiry on any NF breaks all its SBI connections. Monitor cert expiry in Grafana with early alert. |
| SCP | Optional HTTP/2 proxy. Centralises discovery, load balancing, circuit breaking. Adds ~0.2–0.5 ms per hop. Use when central observability is a requirement. |
| API versioning | Version in URI path. Both NFs must match on major version. Test API compatibility before upgrades. Run versions in parallel during phased rollouts. |

Table 4 — Post 09 summary. SBA is HTTP/2 plus TLS plus OAuth2. All three layers can fail independently.

Next: Post 10 — AMF Deep Dive
