Our team recently shipped a new UPF, which is a huge improvement on our old UPF, and I drew the short straw of doing all the interop testing for the IMS.
Initially I thought there was an issue with IP routing, as I’d never see the SIP REGISTER from the UE, but I would see the IMS APN coming up.
I could access the internet from the UE IPs just fine, but that’s going to public IPs, whereas the P-CSCF is in private address space, and hosted on the same box as the UPF.
I spent hours on this as my lab servers do routing on a stick, and I thought some hardware offload somewhere was trying to fast path my packets and send them back to the server without going via the router.
Then I dug a little deeper and found I could see the TCP 3-way handshake between the UE and the P-CSCF, but no SIP packets.

This was confusing. Clearly we had at least intermittent two-way comms, as the TCP 3-way handshake confirmed that, but then why were packets not getting across?
We have an XCAP server hosted on our P-CSCF instances, so I tried hitting that from the phone in case there was something weird about routing to the network segment that hosts the P-CSCF. I could hit the XCAP server just fine, so now I was certain the UE IP pool could route to the P-CSCF, the TCP 3-way handshake was working, and payload could be pushed.

Then I dug into what happened after the 3 way handshake, and I found a TCP payload containing the start of the SIP REGISTER.

I traced it all the way through and lo, it’s hitting the P-CSCF:

Okay, but then what happens, because it’s only a fragment, not the complete re-assembled packet, so what’s going on?
Well, the P-CSCF sends a TCP ACK back to the UE.

The ACK gets forwarded to the UPF:

And then… Nothing? The UPF never encaps the TCP ACK back into GTP-U and never sends it on to the base station.
Eventually the UE re-sends the payload with the start of the REGISTER, but again it never gets the ACK from the P-CSCF.

So, naughty UPF, right? Not forwarding that ACK for some reason?
I started digging. Maybe the ACK was getting routed weirdly and landing on the UPF without going through the router?
Well, not quite…
When I started digging into the QER rules being installed, I noticed the MBR we had on the IMS APN in the HSS was tiny.

The UPF could only gate traffic heading to the UE, so it was gating the ACK: with such a tiny MBR in the QER, the downlink allowance was already used up, and the ACK never made it back.
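
To show why a tiny MBR can swallow something as small as a TCP ACK, here is a rough sketch of a token-bucket policer in Python. This is not how our UPF (or any particular UPF) implements QER enforcement. The `TokenBucket` class, the 1 kbps rate, the 100 byte burst and the packet sizes are all made-up values for illustration.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical downlink policer: rate in bits/s, bucket depth in bytes."""
    rate_bps: int
    burst_bytes: int
    tokens: float = 0.0
    last_ts: float = 0.0

    def __post_init__(self):
        # Start with a full bucket, as if the bearer had been idle.
        self.tokens = float(self.burst_bytes)

    def allow(self, size_bytes: int, now: float) -> bool:
        # Refill for the time elapsed since the last packet, capped at the burst size.
        elapsed = now - self.last_ts
        self.last_ts = now
        self.tokens = min(self.burst_bytes, self.tokens + elapsed * self.rate_bps / 8)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False  # not enough tokens: the packet is dropped


# Made-up downlink packets: (timestamp in seconds, description, size in bytes)
downlink = [
    (0.00, "SYN-ACK", 60),
    (0.05, "ACK for REGISTER fragment", 52),
]

policer = TokenBucket(rate_bps=1000, burst_bytes=100)  # assumed tiny MBR
for ts, name, size in downlink:
    verdict = "forwarded" if policer.allow(size, ts) else "dropped"
    print(f"{name}: {verdict}")
```

With these assumed numbers the first downlink packet fits in the bucket, but the ACK that follows 50 ms later does not, because only about 6 bytes of credit have been refilled in the meantime. The real packet sizes and timings on my setup would have been different, but the failure mode is the same shape.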

Time wasted – About 4 hours, but I will not make this mistake again!
