Tales from the Trenches: mode-set in AMR

This one was a bit of a head scratcher for me, but I’m always glad to learn something new.

The handset made a VoLTE call, and its SDP offer shows it supports AMR and AMR-WB:

        Media Attribute (a): rtpmap:116 AMR-WB/16000/1
        Media Attribute (a): fmtp:116 mode-set=0,1,2,3,4,5,6,7,8;mode-change-capability=2;max-red=220
        Media Attribute (a): rtpmap:118 AMR/8000/1
        Media Attribute (a): fmtp:118 mode-set=0,1,2,3,4,5,6,7;mode-change-capability=2;max-red=220
        Media Attribute (a): rtpmap:111 telephone-event/16000
        Media Attribute (a): fmtp:111 0-15

Okay, that’s pretty normal, I can see we have the mode-set parameter defined, which indicates what modes the handset supports for each codec.

In our problem scenario, the Media Gateway that the call was sent to responded with this SDP answer:

        Media Description, name and address (m): audio 24504 RTP/AVP 118 110
        Media Attribute (a): rtpmap:118 AMR/8000
        Media Attribute (a): fmtp:118 mode-set=7
        Media Attribute (a): rtpmap:110 telephone-event/8000
        Media Attribute (a): fmtp:110 0-15
        Media Attribute (a): ptime:20
        Media Attribute (a): sendrecv
        [Generated Call-ID: FA163E564B37-f4d-98f56700-735d25-65357ee0-9c488]

But we got an error about no available codecs and the call dropped – what gives?

Both sides support AMR (Only the phone supports AMR-WB), and the Media Gateway, as the answerer, supports mode-set 7, which is supported by the UE, so we should be good?

Well, not quite:

If mode-set is specified, it MUST be abided, and frames encoded with modes outside of the subset MUST NOT be sent in any RTP payload or used in codec mode requests. If not present, all codec modes are allowed for the payload type.

RFC 4867 – RTP Payload Format for AMR and AMR-WB

Okay, I get it, the answerer (media gateway) only supports mode 7, but the UE supports all the modes, so we should be fine right?

Well, no.

Section 8.3.1 in the RFC goes on to say in the Offer-Answer Model Considerations:

The parameter [mode-set] is bi-directional, i.e., the restricted set applies to media both to be received and sent by the declaring entity. If a mode set was supplied in the offer, the answerer SHALL return the mode-set unmodified or reject the payload type. However, the answerer is free to choose a mode-set in the answer only if no mode-set was supplied in the offer for a unicast two-peer session.

And there is our problem, and why the call is getting rejected.

The Media Gateway (the answerer in this scenario) is sending back only the mode-set it supports (7), but because the UE / handset (the offerer) included a mode-set in its offer, the Media Gateway should either return that mode-set unmodified (if it can support all the requested modes) or reject the payload type.

Instead we're seeing the Media Gateway respond with a mode-set of its own choosing, which it should not do: the answerer must either return the offered mode-set unchanged or reject it.
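
To make that concrete, here's a rough sketch (purely illustrative, not from any real stack) of the check you could run over the offer and answer fmtp lines:

# Minimal sketch of the RFC 4867 offer/answer mode-set rule for AMR / AMR-WB.
# Hypothetical helper, not the Media Gateway's actual logic.

def parse_mode_set(fmtp_line):
    """Return the mode-set from an a=fmtp line as a set of ints, or None if absent."""
    # Drop the "a=fmtp:<payload type>" prefix and keep just the parameter list
    params = fmtp_line.split(" ", 1)[1] if " " in fmtp_line else fmtp_line
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "mode-set":
            return {int(m) for m in value.split(",")}
    return None

def answer_is_valid(offer_fmtp, answer_fmtp):
    """If the offer carried a mode-set, the answer must return it unmodified
    (or reject the payload type entirely, which isn't modelled here)."""
    offered = parse_mode_set(offer_fmtp)
    answered = parse_mode_set(answer_fmtp)
    if offered is None:
        return True          # No mode-set in the offer: the answerer may pick one
    return answered == offered

# The scenario from the capture above:
offer  = "a=fmtp:118 mode-set=0,1,2,3,4,5,6,7;mode-change-capability=2;max-red=220"
answer = "a=fmtp:118 mode-set=7"
print(answer_is_valid(offer, answer))   # False - which is why the call fails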

And boom, another ticket to another vendor…

The role of the telecom Pillar & Cabinet in the Australian copper network

The gray telecom cabinets and pillars can be seen in suburbs across Australia, along rail corridors and even overseas.

But what do they do? What’s the difference between a pillar and a cabinet? Are they still used today? What’s inside? Why are they such an important part of the network?

What are they?

In a nutshell, they’re weatherproof (if properly cared for) enclosures for cross connecting (jumpering) cables.

This means that rather than doing the jumpering / cross connecting services in a dirty pit, a cabinet can be opened and the connection made quickly, in a clean, easily accessed, above ground housing.

They utilise a really clever design, that was the result of a competitive design process in the 1950s.

The Schrader valve (bike valve) at the top allows the units to remain pressurised; this means in areas subject to flooding, or where pressurised cables are used, the pillar remains watertight (although the practice of re-sealing them with air isn't very common anymore).

When the aluminum top plate is unlocked and spun off the threaded fitting, the linesworker can unscrew the big nut on top, and lift up the cover, which locks open at the top, revealing the terminal units (either solder tag blocks or Krone blocks) inside the unit.

Jumpering a service is just a matter of opening up the cabinet, finding the A side and the B side, and running jumper wire through the built in cable management loops from one side to the other.

Each of the Terminal Units is a pre-terminated strip with a few metres of tail, which is fed through the base of the pillar to a nearby pit where it can be joined 1-to-1 onto the underground cables; this means the units can be upgraded for additional capacity as needed.

While pressurised they are IP67 rated, but this only goes so far, check out this Telstra photo from Queensland Floods in 2010 from Taroom. https://www.flickr.com/photos/telstra-corp/5362036747/in/album-72157625841011142/

Why were they needed?

  • Cables are expensive. We want to minimize excess unused pairs and use the existing pairs with maximum flexibility and efficacy
  • Opening joints costs time, money, and risks disturbing other services. We want to avoid opening joints
  • Troubleshooting is also time consuming and costly. A convenient test point is needed for isolating where in a cable a fault lies. (Main Cable, Distribution Cable, etc)
  • Easily use gas/air filled cables, without having to constantly open and reseal them to splice in new joins / jumpers

Cabinet vs Pillar

Cabinets and Pillars look the same, but the hints as to their purpose are in their location and what’s sprayed on them in faded paint.

Pillars are used for cross-connecting main cables (“M” Pair from the exchange) with distribution cables (to subscribers “O” pairs which run down the street to the pit in the front of your house).

Pillars are generally stenciled with a “P” and a number, or just the DA (Distribution Area) number.

Cabinets are a more flexible setup where you can connect cables between Pillars, akin to a root & branch approach.

Cabinets cross connect Main Cables (“M” pair to the exchange), with Branch Cables (“B” Pair from the Cabinet to Pillar) and Distribution cables (“O” pair to the customer).

Cabinets are stenciled with the prefix “CA” and a number, and exist in the 900 and 1800 pair variants, where one is just taller than the other.

The blue example is direct from the Main Cable to the Pillar, while Cabinets are used in the black example.

This means the distribution can go via a Cabinet to the Pillar to the Customer, as shown in the top /grey lines in the diagram.

  1. Exchange Main Cables (Main Cables / M-Pairs) go to Cabinets
  2. Cabinets connect to Pillars (Branch Cables / B-Pairs)
  3. Pillars connect (Distribution Cables / O-Pairs) that run through the pits outside houses
  4. An Openable Joint inside the pit is used to connect the lead-in cable from a subscriber’s premises

Alternatively, the Cabinet may be bypassed and a direct cable goes between the Exchange and the Pillar; in that scenario it looks like the one shown in blue lines on the diagram.

  1. Exchange Main Cables (Main Cables / M-Pairs) go to Pillar
  2. Pillars connect (Distribution Cables / O-Pairs) that run through the pits outside houses
  3. An Openable Joint inside the pit is used to connect the lead-in cable from a subscriber’s premises
Display of 300, 900 and 1800 pair pillars and cabinets at the former Telstra Museum in Hawthorn

The Cabinet to Pillar model fell out of favor due to its increased complexity.
While it was cheaper to deploy the network using cabinets that cascaded down to feed pillars (you would only have to install enough cable for the “here and now” and could add additional Main & Branch cables as needed in a targeted manner), the move to outsourced lineswork for Telecom found that any increased complexity led to additional operational cost that outweighed the capital savings.

Use in the “Modern” Copper Customer Access Network

Pillars are still used in areas of Australia where NBNco have deployed Fibre to the Node.

NBN adds a row of X-Pairs (VDSL) and C-Pairs (Channel) to the pillar, which connect into the FTTN nodes themselves.

This means a customer with a traditional POTS line (M-Pair from the Exchange, C-Pair from the Cabinet to the Pillar, O-Pair from the Pillar to the Pit, and then the lead-in into their property) has the O-Pair and C-Pair buzzed out on the pillar, and then routed through the X-Pair and the C-Pair on the Node.

This puts the DSLAM in the Alcatel ISAM inline with the customer’s existing copper loop to the Exchange. The main cable comes from the exchange onto the M-Pair blocks in the Pillar, is jumpered onto the X-Pairs which go through the DSLAM, and come out as C-Pairs back onto the pillar. The C-Pair is then jumpered back to the Customer’s O-Pair and bingo, the FTTN cabinet is inline with the copper loop.

However as the PSTN services get dropped, the Main / M-Pair cables to the exchange can eventually be recovered, meaning the connection just goes from the C-Pair for VDSL out onto the O-Pair to the customer.

As part of the NBN migration some pillars were upgraded to include IDC / Punch Down blocks, and a rectangular version of the pillar was introduced.

NBN pillar

Oddly, these rectangular covers do not have rectangular units inside, but rather cylindrical ones, just like the pillars of old.

This does fix the missing lids issue – The lid is captive, but I’m not sure what other design improvements this introduces – if anyone has the insight I’d be keen to hear it!

Tales from the Trenches – Emergency Calling when Roaming

In my last post talking about the Emergency Calling Codes, I had a few comments asking what happens in roaming scenarios.

For example, an American visiting the UK, would have 911 on the Emergency Calling Codes list on their SIM card, but in the UK they dial 999 to reach emergency services.

There’s two angles to this, the first is if a roamer dials the emergency calling code of their home country, the other is if they dial the emergency calling code of the country they are in.

Let’s look at the first scenario, where the roamer dials the emergency calling code of their home country.

If our American abroad in the UK dials 911, that number is on the ECC list on the SIM, so it’s still flagged as an emergency call and just goes out with the standard urn:service:sos URN – The network never sees 911 or 999, just that it’s an SOS call that goes to the PSAP.

In this scenario, the fact the dialled number is not passed to the network is actually a positive, we get the intent that the user wants to reach emergency services, and route based on this.

But what if our American friend in need dials 999?
That’s the correct number for the end user to dial in the UK after all, but if that’s not in their ECC list on the SIM / device, it’d go through as a regular call right?

If the call does not get flagged as an emergency call on the UE this has its own set of complications and considerations:

S8-Home Routing for VoLTE means that as the UE doesn’t know this is an emergency call, the call will get routed back to the home network. This means the call doesn’t go to the E-CSCF in the visited network, and would probably just get a message saying the number they’ve dialed is unavailable, this would be exactly as if they dialed 999 at home in the US.

But we have a fix for this!
On each MME we can set a list of emergency numbers, which would let our visitor’s phone know what the emergency calling codes are on this network, and route the 999 call to the local PSAP, rather than home routing it.

MME Emergency Number list Config

This information is jammed into the Emergency Number List IE in the NAS Attach Accept body.

This means our American visitor in the UK, would know about 999 from the ECC list configured in the roaming operator’s MME.

The purpose of this information element is to encode emergency number(s) for use within the country where the IE is received.

3GPP TS 24.008: 10.5.3.13 – Emergency Number List
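
As a rough sketch of how a handset might read this, here's a hedged decode of the Emergency Number List IE contents; the layout assumed here (per entry: one length octet, one Emergency Service Category octet, then BCD digits) is my reading of TS 24.008 10.5.3.13, and the sample bytes are made up for the example:

# Hedged sketch: decoding the Emergency Number List IE contents (TS 24.008 10.5.3.13).
# Layout assumed here: per entry -> 1 length octet (covering the octets that follow),
# 1 Emergency Service Category octet (bits 1-5 used), then BCD digits with 0xF filler.

def decode_bcd(digits: bytes) -> str:
    """Swapped-nibble BCD digits, 0xF used as filler."""
    out = []
    for octet in digits:
        for nibble in (octet & 0x0F, octet >> 4):
            if nibble != 0x0F:
                out.append(str(nibble))
    return "".join(out)

def decode_emergency_number_list(contents: bytes):
    """Yield (number, category_bitmap) tuples from the IE contents (IEI/length already stripped)."""
    i = 0
    while i < len(contents):
        entry_len = contents[i]                # octets following this length octet
        category = contents[i + 1] & 0x1F      # bits 1-5: Police/Ambulance/Fire/Marine/Mountain
        number = decode_bcd(contents[i + 2 : i + 1 + entry_len])
        yield number, category
        i += 1 + entry_len

# Made-up example: one entry, category 0x02 (Ambulance), digits "999"
example = bytes([0x03, 0x02, 0x99, 0xF9])
for num, cat in decode_emergency_number_list(example):
    print(num, bin(cat))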

Where this becomes more problematic is unauthenticated emergency calling.

For example, our American visiting the UK, who is not roaming, dials 999.

We’ll assume the UK and US operator don’t have a VoLTE roaming agreement because they’ve been kicking the can down the road when it comes to VoLTE roaming… This is a super common scenario – last numbers I saw on this were last year with ~50 bilateral VoLTE agreements in place worldwide.

Because the phone is not attached to a local MME, the handset does not know that 999 is an emergency calling code (because it’s not on the SIM). After all, the only way it can get the Emergency Number List is from an MME, and not having been attached to an MME means the phone does not have the ECC list for the country, so the handset does not begin the emergency attach procedure to make the call.

Common sense prevails here, on the majority of phones and the majority of SIM profiles, codes like 112 or 911 are treated as emergency calls, but more obscure numbers, such as dialing 999 in the UK or 10111 for South African Police on a handset with US firmware, are not guaranteed to work. Generally dialing the Emergency Calling code in the home network would get you through to some emergency services (although as we talked about in the last post, this might get you routed to the wrong agency in countries where each agency has their own number).

A better way forward?

These days I don’t dial much (apart from if I’m making adjustments on the Step-by-Step exchange), when I call people I do it from contacts, hyperlinks, etc.

Emergency Dialler page in Android

There are mountains of research to suggest that asking people to remember codes and phone numbers is a struggle. A tourist who finds themselves in Tunisia in need of assistance is unlikely to remember that it’s 190 for an Ambulance, and 198 for Fire.

Perhaps the ECC list on a phone should populate a page of icons from the emergency page on the phone, with the universal icon for each agency, that sends to the URN for that service type?

Countries with a single PSAP could have the URNs for each service type routed to the same place, while countries with separate PSAPs for each service type can route accordingly.

Likewise, if a country does have a centralised PSAP for all call types, knowing which type was selected would still be useful: for example, if the user has pressed fire and is not responsive when the call is answered, the best unit to dispatch would probably be a fire engine.

GNS3 vCenter / ESXi – Allow Traffic

The other day I set up GNS3 in the lab for some testing; we run vCenter for our server workloads, so I chucked the OVA on there.

One issue I ran into is that when linking a Cloud Component to a router, I simply could not get a path in/out of the router, I wasn’t learning MAC addresses and my ARP requests were going unanswered.

Wireshark showed the ARP requests going down that interface, and broadcast traffic from the rest of the network, so what gives?

The answer was pretty simple: on the vHost itself I needed to enable Promiscuous mode to allow frames with L2 addresses that aren’t the VM’s own to be sent from within the VM.

Under Networking -> Port Groups -> the NICs you have assigned in GNS3:

Make sure Promiscuous mode, MAC address changes and Forged transmits are allowed – By default they’re denied on the vSwitch which it inherits from.

There’s obviously security concerns here, so think before you do, but that should have packets flowing.

Tales from the Trenches: The issue with Emergency Calling URNs in IMS Networks

A lot of countries have a single point of contact for emergency services; in Europe you’d call 112 in an emergency, 000 in Australia or 911 in the US. Calling this number in the country will get you the emergency services.

This means a caller can order an ambulance for smoke inhalation, and the fire brigade, in one call.

But that’s not the case in every country; many countries don’t have one number for the emergency services, they’ve got multiple; a phone number for police, a different number for fire brigade and a different number for an ambulance.

For example, in Brazil if you need the police you call 190, while 193 is the emergency number for the fire department; the police can be reached at 190 or 191 depending on whether it’s general or road policing, and medical emergencies are covered by 192. Other countries have similar setups.

This is all well and good if you’re in Brazil and you call 192 for an ambulance: the phone sends a SIP INVITE with a Request URI of sip:192@<domain>, because we can then put a rule into our E-CSCF to say if the number is 192, route it to the answer point for ambulances – But that’s not often the case on emergency calls.

In IMS, handsets generally detect the number dialed is on the Emergency Calling Code (ECC) list from the USIM Card.

The use of the ECC list means the phone knows this is an emergency call, and this is really important. For countries that use AML this can trigger the sending of the AML SMS, and Emergency Calls should always be allowed to be made, even without credit, a valid SIM card, or even a SIM in the phone at all.

But this comes with a cost; when a user dials 911, the phone doesn’t (generally) send a call to sip:911@<domain> like it would with any other dialled number, but rather the SIP INVITE is sent to urn:service:sos, which will be routed to the PSAP by the E-CSCF. When a call comes through to these URNs it’s given top priority in the network.

This is all well and good in a country where it doesn’t matter which emergency service you called, because all emergency calls route to a single PSAP, but in a country with multiple numbers, it’s really important that when you call an ambulance, your call doesn’t get routed to animal control.

That means the phone has to look at what emergency number you’ve dialled, and map it to the URN it sends the call to, so it matches what you’ve actually requested.

Recently we’ve been helping an operator in a country with a numbering plan like this, and we’ve been finding the limits of the standards here.
So let’s start by looking at what the standards state:

IMS Emergency Calling is governed by TS 103.479 which in turn delegates to IETF RFC 5031, but for the calling number to URN translation, it’s pretty quiet.

Let’s look at what RFC 5031 allows for URNs:

  • urn:service:sos.ambulance
  • urn:service:sos.animal-control
  • urn:service:sos.fire
  • urn:service:sos.gas
  • urn:service:sos.marine
  • urn:service:sos.mountain
  • urn:service:sos.physician
  • urn:service:sos.poison
  • urn:service:sos.police

The USIM’s Emergency Calling Codes EF would be the perfect source of this data; for each emergency calling code defined, you’ve got a flag to indicate what it’s for. Here’s what we’ve got available on the SIM Card:

  • Bit 1 Police
  • Bit 2 Ambulance
  • Bit 3 Fire Brigade
  • Bit 4 Marine Guard
  • Bit 5 Mountain Rescue
  • Bit 6 manually initiated eCall
  • Bit 7 automatically initiated eCall
  • Bit 8 is spare and set to “0”

So these could be mapped pretty easily you’d think – if the call is made to an Emergency Calling Code flagged with Bit 5 (Mountain Rescue), the URN would go to urn:service:sos.mountain.
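
As a rough sketch of how that mapping could look (purely illustrative, not any handset's actual implementation), the service category bits could be translated to RFC 5031 URNs like this:

# Illustrative mapping of EF_ECC Emergency Service Category bits to RFC 5031 URNs.
# This is a sketch of the idea, not any OEM's actual behaviour.

CATEGORY_BIT_TO_URN = {
    0: "urn:service:sos.police",     # Bit 1 - Police
    1: "urn:service:sos.ambulance",  # Bit 2 - Ambulance
    2: "urn:service:sos.fire",       # Bit 3 - Fire Brigade
    3: "urn:service:sos.marine",     # Bit 4 - Marine Guard
    4: "urn:service:sos.mountain",   # Bit 5 - Mountain Rescue
}

def ecc_category_to_urn(category: int) -> str:
    """Pick a URN from the category bitmap; fall back to the generic SOS URN."""
    for bit, urn in CATEGORY_BIT_TO_URN.items():
        if category & (1 << bit):
            return urn
    return "urn:service:sos"

print(ecc_category_to_urn(0b00010))  # Bit 2 set -> urn:service:sos.ambulance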

Alas from our research, we’ve found most OEMs send calls to the generic urn:service:sos, regardless of the dialled number and the ECC flags that are set on the SIM for that number.

One of the big chip vendors sends calls to an ECC flagged as Ambulance to urn:service:sos.fire, which is totally infuriating, and we’ve had to put a rule in our E-CSCF to handle this if the User Agent is set to one of their phones.

Is there room for improvement here? For sure! Emergency calling is super important, and time is of the essence, while animal control can probably transfer you to an ambulance, an emergency is by very nature time sensitive, and any time wasted can lead to worse outcomes.

While carrier bundles from the OEMs can handle this, the global ability to take any phone, from any country and call an emergency number is so important, that relying on a country-by-country approach here won’t suffice.

What could we do as an industry to address this?

Acknowledging that not all countries have a single point of contact for emergency service, introducing a simple mechanism in the UE SIP message to indicate what number (Emergency Calling Code) the user actually dialled would be invaluable here.

URNs are important, but knowing the dialled number when it comes to PSAP routing is just as important – This wouldn’t even need to be its own SIP header, it could just be thrown into the Contact header as another parameter.

Highly developed markets are often the first to embrace new tech (for us this means VoLTE and VoNR), but this means that these issues seen by less developed markets won’t appear until long after the standard has been set in stone, and often countries like this aren’t at the table of the standards bodies to discuss such requirements.

This easy, reasonable update to the standard has the potential to save lives, and next time this comes up in a working group I’ll be advocating for a change.

Information overload on NBN FTTH

At long last, more and more Australians are going to have access to fibre based access to the NBN, and this seemed like as good an excuse as any to take a deep dive into how NBN’s GPON based fibre services are delivered to homes.

We’ve looked at NBN FTTN architecture, and NBN FTTC architecture and Skymuster Satellite architecture, so now let’s talk about how FTTH actually looks.

Let’s start in your local exchange where you’ll likely find a Nokia (Well, probably Alcatel-Lucent branded) 7210 SAS-R access aggregation switch, which is where NBN’s transmission network ends, and the access network begins.

It in turn spits out a 10 gig interface to feed the Optical Line Terminal (OLT), which provides the GPON services; each port on the OLT is split out and can feed 32 subscribers.

In NBN’s case the OLT is a Nokia (Alcatel-Lucent) 7302, and rather than calling it an OLT, they call it a “FAN” or “Fibre Access Node” – Seemingly because they like the word node.

Each of the Nokia 7302s has at least one NGLT-A line card, which has 8 GPON ports. Each of the 8 ports on these cards can service 32 customers, and is fed by 2x 10Gbps uplinks to two 7210 SAS-R aggregation switches.

The chassis supports up to 16 cards, 8 ports each, 32 subs per port, giving us 4096 subscribers per FAN.

In some areas, FANs/OLTs aren’t located in an exchange but rather in a street cabinet, called a Temporary Fibre Access Node – Although it seems they’re very permanent.

To support Greenfields sites where a FAN site has not yet been established a cabinetised OLT solution is deployed, known as a Temporary FAN (TFAN).

In reality, each port on the OLT/FAN goes out via the Distribution Fibre Network (DFN), which links the ports on the OLTs to a distribution cabinet in the street, known as a Fibre Distribution Hub, or FDH.

If you look in FTTH areas, you’ll see the FDH cabinets.
The FDH is essentially a roadside optical distribution frame, used to cross connect cables from the Distribution Fibre Network (DFN) to the Local Fibre Network (LFN), and in a way, you can think of it as the GPON equivalent of a pillar, except this is where we have our optical splitters.

Remember when we were talking about the FAN/OLT how one port could serve 32 subscribers? We do that with a splitter, which takes one fibre from the DFN that runs to the FAN, and gives us 32 fibres we can connect ONTs onto to get service.

The FDH cabinets are made by Corning (OptiTect 576 fibre pad mounted cabinets) and you can see in the top right the Aqua cables go to the Distribution Fibre Network, and hanging below it on the right are the optical splitters themselves, which split the one fibre to the FAN into 32 fibres each on SC connectors.

These are then patched to the Local Fibre Network on the left hand side of the cabinet, where there’s up to 576 ports running across the suburb, and a “Parking” panel at the bottom where the unused ports from the splitter can be left until they’re needed and patched through.

The FDH cabinets also offer “passthrough”, allowing a fibre from the FAN to be patched straight through to the LFN without passing through the GPON splitter, although I’m not clear if NBN uses this capability to deliver the NBN Business services.

But having each port in the FDH going to one home would be too simple; you’d have to bring 576 individually sheathed cables to the FDH and you’d lose too much flexibility in how the cable plant can be structured, so instead we’ve got a few more joints to go before we make it to your house.

From the FDH cabinet we go out into the Local Fibre Network, but NBN has two variants of LFN – LFN and Skinny LFN.
The traditional LFN uses high-density ribbon fibres, which offer a higher fibre count but are a bit trickier to splice / work with.
The Skinny LFN uses lower fibre count cables with stranded fibres, and is the current preferred option.

The original LFN cables are ribbon fibres and range from 72 to 288 fibre counts, but I believe 144 is the most common.

These LFN cables run down streets and close to homes, but not directly to lead in cables and customer houses.

These run to “Transition Closures” (older NBN) or “Flexibility Joint Locations” (FJLs – newer NBN).

While researching this I saw references to “Breakout Joint Locations” (BJLs) which are used in FTTC deployments, and are a Tenio B6 enclosure for 2x 12 Fibers and 4x 1 Fibers with a 1×4 splitter.

The FJLs are TE Systems’ (now Commscope) Tenio range of fibre splice closures, and they’re used to splice the high fibre count cable from the FDH cabinets into smaller 12 fibre count cables that run to multiple “Splitter Multi Ports” or “SMPs” in pits outside houses, and can contain factory-installed splitters.

The splitters, referred to as “Multiports” or “SMPs” are Corning’s OptiSheath MultiPort Terminals, and they’re designed and laid out in such a way that the tech can activate a service, without needing to use a fusion splicer.

Due to the difficulty/cost in splicing fibre in pits for a service activation, NBNco opted to go from the FJL to the SMPs, where a field tech can just screw in a weatherproof fibre connector for the lead-in to the customer’s premises.

During installation / activation callouts, the tech is assigned an SMP in the pit near the customer’s house, and a port on it. This in turn goes to the FJL and onto FDH cabinet as we just covered, but that patching/splicing for that is already done, so the tech doesn’t need to worry about that.

The tech just plugs in a pre-terminated lead in cable with a weatherproof fibre end, and screws it into the allocated port on the SMP, then hauls the other end of the lead in cable to the Premises Connection Device (Made by Madison or Tyco), located on the wall of the customer’s house.

The customer end of the lead in cable may be a pre terminated SC connector, or may get mechanically spliced onto a premade SC pigtail. In either case, they both terminate onto an SC male connector, which goes into an SC-SC female coupler inside the PCD.

Next is the customer’s internal wiring; again, preterm cable is used to run between the PCD and the first Fibre Wall Outlet inside the house. This preterm cable joins to the lead-in cable inside the PCD via the SC-SC female coupler.

Inside the house we have the “Network Termination Device” (NTD), which is a GPON ONT; this is where the fibre from the street terminates and is turned into an Ethernet handoff to the customer. NBN has been through a few models of NTD, but the majority support 2x ATA ports for analog phones, and the option for an external battery backup unit to keep the device powered if mains power is lost.

Phew! That’s what I’ve been able to piece together from publicly available documentation, some of this may be out of date, and I can see there’s been several revisions to the LFN / DFN architectures over the years, if there’s anything I have incorrect here, please let me know!

Playing back AMR streams from Packet Captures

The other day I found myself banging my head on the table to diagnose an issue with Ringback tone on an SS7 link and the IMS.

On the IMS side, no RBT was heard, but I could see the Media Gateway was sending RTP packets to the TAS, and the TAS was sending it to the UE, but was there actual content in the RTP packets or was it just silence?

If this was PCM / G711 we’d be able to just playback in Wireshark, but alas we can’t do this for the AMR codec.

Filter the RTP stream out in Wireshark

Inside Wireshark I filtered each of the audio streams in one direction (one for the A-Party audio and one for the B-Party audio).

Then I needed to save each of the streams as a separate PCAP file (Not PCAPng).

Turn into AMR File

With the audio stream for one direction saved, we can turn it into an AMR file, using Juan Noguera (Spinlogic)’s AMR Extractor tool.

Clone the Repo from git, and then in the same directory run:

python3 pcap_parser.py -i AMR_B_Leg.pcap -o AMR_B_Leg.3ga

Playback with VLC / Audacity

I was able to play the file with VLC, and load it into Audacity to easily see that yes, the Ringback Tone was present in the AMR stream!

A look at Advanced Mobile Location SMS for Emergency Calls

Advanced Mobile Location (AML) is being rolled out by a large number of mobile network operators to provide accurate caller location to emergency services, so how does it work, what’s going on and what do you need to know?

Recently we’ve been doing a lot of work on emergency calling in IMS, and meeting requirements for NG-112 / e911, etc.

This led me to seeing my first Advanced Mobile Location (AML) SMS in the wild.

For those unfamiliar, AML is a fancy text message that contains the caller’s location, accuracy, etc., and is passed to emergency services when you make an emergency call in some countries.

It’s sent automatically by your handset (if enabled) when making a call to an emergency number, and it provides the dispatch operator with your location information, including extra metadata like the accuracy of the location information, height / floor if known, and level of confidence.

The standard is primarily driven by EENA, and, being backed by the European Union, it’s got almost universal handset support.

Google has their own version of AML called ELS, which they claim is supported on more than 99% of Android phones (I’m unclear on what this means for Harmony OS or other non-Google backed forks of Android), and Apple support for AML starts from iOS 11 onwards, meaning it’s supported on iPhones from the iPhone 5S onwards.

Call Flow

When a call is made to the PSAP based on the Emergency Calling Codes set on the SIM card or set in the OS, the handset starts collecting location information. The phone can pull this from a variety of sources, such as visible WiFi SSIDs, but the best is going to be GPS or one of its siblings (GLONASS / Galileo).

Once the handset has a good “lock” on a location (or if 20 seconds have passed since the call started), it bundles up all of this information into an SMS and sends it to the PSAP as a regular old SMS.

The routing from the operator’s SMSc to the PSAP, and the routing from the PSAP to the dispatcher screen of the operator taking the call, is all up to implementation. For the most part the SMS destination is the emergency number (911 / 112) but again, this is dependent on the country.

Inside the SMS

To the user, the AML SMS is not seen, in fact, it’s actually forbidden by the standard to show in the “sent” items list in the SMS client.

On the wire, the SMS looks like any regular SMS, it can use GSM7 bit encoding as it doesn’t require any special characters.

Each attribute is a key / value pair, with semicolons (;) delineating the individual attributes, and = separating the key and the value.

Below is an example of an AML SMS body:

A"ML=1;lt=+54.76397;lg=-
0.18305;rd=50;top=20130717141935;lc=90;pm=W;si=123456789012345;ei=1234567890123456;mcc=234;mnc=30; ml=128

If you’ve got a few years of staring at Wireshark traces in hex under your belt, then it’ll probably be pretty easy to get the gist of what’s going on: we’ve got the header (A"ML=1) which denotes this is AML and the version is 1.

After that we have the latitude (lt=), longitude (lg=), radius (rd=), time of positioning (top=), level of confidence (lc=), positioning method (pm=) with G for GNSS, W for WiFi signal, C for Cell or N if a position was not available, and so on.
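
As a rough sketch (not part of the AML spec itself), parsing that body is just a matter of splitting key/value pairs; the sample body below is the example from above:

# Minimal sketch: parsing an AML SMS body into a dict of attributes.
# The sample body is the example from above.

def parse_aml_body(body: str) -> dict:
    """Split an AML text body into key/value pairs (';' separated, '=' between key and value)."""
    attributes = {}
    for field in body.split(";"):
        key, _, value = field.partition("=")
        if key:
            attributes[key.strip()] = value.strip()
    return attributes

body = ('A"ML=1;lt=+54.76397;lg=-0.18305;rd=50;top=20130717141935;'
        'lc=90;pm=W;si=123456789012345;ei=1234567890123456;mcc=234;mnc=30;ml=128')

aml = parse_aml_body(body)
print(aml["lt"], aml["lg"], aml["pm"])   # +54.76397 -0.18305 W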

AML outside the ordinary

Roaming Scenarios

If an emergency occurs inside my house, there’s a good chance I know the address, and even if I don’t know my own address, it’s probably linked to the account holder information from my telco anyway.

AML and location reporting for emergency calls is primarily relied upon in scenarios where the caller doesn’t know where they’re calling from, and a good example of this would be a call made while roaming.

If I were in a different country, there’s a much higher likelihood that I wouldn’t know my exact address, however AML does not currently work across borders.

The standard suggests disabling the AML SMS when roaming, which is not that surprising considering the current state of SMS transport.

Without a SIM?

Without a SIM in the phone, calls can still be made to emergency services, however SMS cannot be sent.

That’s because the emergency calling standards for unauthenticated emergency calls only cater for the call itself – there is no provision for sending an SMS without a SIM.

This is a limitation, however it could be addressed by 3GPP in future releases if there is sufficient need.

HTTPS Delivery

The standard was revised to allow HTTPS as the delivery method for AML, for example, the below POST contains the same data encoded for use in a HTTP transaction:

v=3&device_number=%2B447477593102&location_latitude=55.85732&location_longitude=-4.26325
&location_time=1476189444435&location_accuracy=10.4&location_source=GPS&location_certainty=83
&location_altitude=0.0&location_floor=5&device_model=ABC+ABC+Detente+530&device_imei=354773072099116
&device_imsi=234159176307582&device_os=AOS&cell_carrier=&cell_home_mcc=234&cell_home_mnc=15
&cell_network_mcc=234&cell_network_mnc=15&cell_id=0213454321

Implementation of this approach is however more complex, and leads to little benefit.

The operator must zero-rate the DNS traffic to allow the FQDN for this to be resolved (it resolves to a different domain in each country), and allow traffic to this endpoint even if the customer has data disabled (see what happens when your handset has PS Data Off), or has run out of data.

Due to the EU’s stance on Net Neutrality, “Zero Rating” is a controversial topic, which means most operators have limited implementations of this, so most fall back to SMS.

Other methods for sharing location of emergency calls?

In some upcoming posts we’ll look at the GMLC used for E911 Phase 2, and how the network can request the location from the handset.

Further Reading

https://eena.org/knowledge-hub/documents/aml-specifications-requirements/

Australia’s secret underground telephone exchanges

A few years ago, I was out with a friend (who knows telecom history like no one else) who pointed at a patch of grass and some concrete and said “There’s an underground exchange under there”.

Being the telecommunications nerd that I am, I had a lot of follow up questions, and a very strong desire to see inside, but first, I’m going to bore you with some history.

I’ve written about RIMs – Remote Integrated Multiplexers before, but here’s the summary:

In the early ’90s, Australia was growing. Areas that had been agricultural or farmland were now being converted into housing estates and industrial parks, and they all wanted phone lines.
While the planners at Telecom Australia had generally been able to cater for growth, suddenly plonking 400 homes in what once was a paddock presented a problem.

There were traditional ways to solve this of course; expanding the capacity at the exchange in the nearest town, trenching larger conduits, running 600 pair cables from the exchange to the housing estate, and distributing this around the estate, but this was the go-go-nineties, and Alcatel had a solution, the Remote Integrated Multiplexer, or RIM.

A RIM is essentially a stack of line cards in a cabinet by the side of the road, typically fed by one or more E1 circuits. Now Telecom Australia didn’t need to upgrade exchanges, trench new conduits or lay vast quantities of costly copper – Instead they could meet this demand with a green cabinet on the nature strip.

This was a practical and quick solution to increase capacity in these areas, and it actually worked quite well; RIMs served many Australian housing estates until the copper switch off, many having been upgraded with “top-hats” to provide DSLAM services for these subscribers as well, or replaced with the CMUX, the evolved version. There are still RIMs alive in the CAN today; in areas serviced by NBN’s Fixed Wireless product, it’s not uncommon to see them still whirring away.

A typical RIM cabinet

But planning engineers realised some locations may not be suitable for a big green cabinet, so for these they developed the “Underground CAN Equipment Housing” (UCEH). It was designed as a solution for sensitive areas or locations where above ground housing of RIMs would not be suitable – which translated to areas where the council would not let them put their big green boxes on the nature strips.

So in Narre Warren in Melbourne’s outer suburbs Telecom Research Labs staff built the first underground bunker to house the exchange equipment, line cards, a distribution frame and batteries – a scaled down exchange capable of serving 480 lines, built underground.

Naturally, an underground enclosure faced some issues, cooling and humidity being the big two.

The AC systems used to address this were kind of clunky, and while the underground exchanges were not as visually noisy as a street cabinet, they were audibly noisy, to the point you probably wouldn’t want to live next to one.

Sadly, for underground exchange enthusiasts such as myself, by 1996, OH&S classified these spaces as “Confined Spaces”, which made accessing them onerous, and it was decided that new facilities like this one would only be dug if there were no other options.

This wasn’t Telecom Australia’s first foray into underground equipment shelters; some of the microwave sites Telecom built in the desert put the active equipment in underground enclosures, covered over by a sea freight container holding all the passive gear.

In the US the L-Carrier system used underground enclosures for the repeaters, and I have a vague memory of the Sydney-Melbourne Coax link doing the same.

Some of these sites still exist today, and I was lucky enough to see inside one, and let’s face it, if you’ve read this far you want to see what it looks like!

A large steel plate sunk into a concrete plinth doesn’t give away what sits below it.

A gentle pull and the door lifts open with a satisfying “woosh” – assisted by hydraulics that still seem to be working.

The power to the site has clearly been off for some time, but the sealed underground exchange is in surprisingly good condition, except for the musty smell of old electronics, which to be honest goes for any network site.

There’s an exhaust fan with a vent hose that hogs a good chunk of the ladder space, which feels very much like an afterthought.

Inside is pretty dark, to be expected I guess what with being underground, and not powered.

Inside is the power system (well, the rectifiers – the batteries were housed in a pit at the end of the UCEH entrance hatch, so inside there are no batteries), a distribution frame (MDF / IDF), and the Alcatel cabinets that are the heart of the RIM.

From the log books it appeared no one had accessed this in a very long time, but no water had leaked in, and all the equipment was still there, albeit powered off.

I’ve no idea how many time capsules like this still exist in the network today, but keep your eyes peeled and you might just spot one yourself!

Logging DSL Line Rate & SNR on a Draytek Modem

I am connected on a VDSL line, not by choice, but here we are.
DSL is many things, but consistent is not one of them, so I thought it’d be interesting to graph out the SNR and the line rate of the connection.

This is an NBN FTTN circuit, I run Mikrotiks for the routing, but I have a Draytek Vigor 130 that acts as a dumb modem and connects to the Tik.

Draytek exposes this info via SNMP, but the OIDs / MIBs are not part of the standard Prometheus snmp_exporter, so I’ve added them into snmp_exporter.yaml and restarted the snmp_exporter service.

draytek:
  walk:
  - 1.3.6.1.2.1.10.94.1.1.3.1.8
  - 1.3.6.1.2.1.10.94.1.1.3.1.4
  - 1.3.6.1.2.1.10.94.1.1.5.1.2.4
  - 1.3.6.1.2.1.10.94.1.1.4.1.2.4
  metrics:
  - name: Draytek_dsl_LineRate
    oid: 1.3.6.1.2.1.10.94.1.1.3.1.8
    type: gauge
    help: adslAtucCurrAttainableRate

  - name: Draytek_dsl_Linerate_Down
    oid: 1.3.6.1.2.1.10.94.1.1.4.1.2.4
    type: gauge
    help: Draytek_dsl_Linerate_Down

  - name: Draytek_dsl_Linerate_Up
    oid: 1.3.6.1.2.1.10.94.1.1.5.1.2.4
    type: gauge
    help: Draytek_dsl_Linerate_Up

  - name: Draytek_dsl_SNR
    oid: 1.3.6.1.2.1.10.94.1.1.3.1.4
    type: gauge
    help: adslAturCurrSnrMgn

Then I added this as a target in Prometheus:

  - job_name: Draytek Logger
    scrape_interval: 1m
    scrape_timeout: 30s
    static_configs:
          - targets: ['10.0.2.1']  # My modem

    metrics_path: /snmp
    params:
      module: ['draytek']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9116  # SNMP exporter address

And then from Grafana I can quantify exactly how bad my line is over time!

Only two dropouts today!

Australia’s East-West Microwave Link of the 1970s

On July 9, 1970 a $10 million dollar program to link Australia from East to West via Microwave was officially opened.
Spanning over 2,400 kilometres, it connected Northam (to the east of Perth) to Port Pirie (north of Adelaide) and thus connected the automated telephone networks of Australia’s Eastern States and Western States together, to enable users to dial each other and share video live, across the country, for the first time.

In 1877, long before road and rail lines, the first telegraph line – a single iron wire, was spanned across the Nullabor to link Australia’s Eastern states with Western Australia.

By 1930 an open-wire voice link had been established between the two sides of the continent.
This open-wire circuit was upgraded and rebuilt several times, to finally top out at 140 channels, but by the 1960s Australian Post Office (APO) engineers knew a higher bandwidth (broadband carrier) system was required if Standard Trunk Dialling (STD) was ever to be implemented, so someone in Perth could dial someone in Sydney without going via an operator.

A few years earlier Melbourne and Sydney were linked via a 600 kilometre long coaxial cable route, so APO engineers spent months in the Nullarbor desert surveying the soil conditions and came to the conclusion that a coaxial cable (like the recently opened Melbourne to Sydney coaxial cable) was possible, but would be very difficult to achieve.

Instead, in 1966, Alan Hume, the Postmaster-General, announced that the decision had been made to construct a network of Microwave relay stations to span from South Australia to Western Australia.

In the 1930s microwave communications had spanned the English Channel, and by 1951 AT&T’s Long Lines microwave network had opened, spanning the continental United States. So by the 1960s microwave transmission networks were commonplace throughout Europe and the US, and the technology was thought to be fairly well understood.

But APO engineers soon realised that the unique terrain of the desert and the weather conditions of the Nullabor had significant impacts on the transmission of radio waves. Again Research Labs staff went back to spend months in the desert measuring signal strength between test sites, to better understand how the harsh desert environment would impact the transmission and how to overcome these impediments.

The link was one of the longest ever attempted, longer than the distance from London to Moscow.

In the end it was decided that 59 towers with heights from 22 meters to 76 meters were to be built, topped off with 3.6m tall microwave dishes for relaying the messages between towers.

The towers themselves were to be built in a zig-zag pattern, to prevent overshooting microwave signals from interfering with signals for the next station in the chain.

Due to the remote nature of the repeater sites, 43 of the 59 had to be fully self-sufficient in terms of power.

Initial planning saw the power requirements of the repeater sites limited to 500 watts; APO engineers looked at the prevailing wind patterns and determined that, combined with batteries, wind generators could keep these sites online year round without the need for additional power sources. Unfortunately this 500 watt power consumption target quickly tripled, and diesel generators were added to make up any shortfall on calm days.

The addition of the Diesel gensets did not in any way reduce the need to conserve power – the more Diesel consumed, the more trips across the desert to refuel the diesel generators would be required, so the constant need to keep power to a minimum was one of the key restraints in the project.

The designs of these huts were reused after the project for extreme temperature equipment housings, including one reused by Broadcast Australia seen in Marble Bar – the hottest town in Australia.

Active cooling systems (Like Air Conditioning) were out of the question due to being too power hungry. APO engineers knew that the more efficient equipment they could use, the less heat they would produce, and the more efficient the system would be, so solid state (transistorised devices) were selected for the 2Ghz transmission equipment, instead of valves which would have been more power-hungry and produced more heat.

The reduced power requirement of the fully transistorized radio equipment meant that wind-supplied driven generators could provide satisfactory amounts of power provided that the wind characteristics of the site were suitable.

THE TELECOMMUNICATION JOURNAL OF AUSTRALIA / Volume 21 / Issue 21 / February 1971

So forced to use passive cooling methods, the engineers on the project designed the repeater huts to cleverly utilize ventilation and the orientation of the huts to keep them as cool as possible.

Construction was rough, but in just under 2 years the teams had constructed all 59 towers and the associated equipment huts to span the desert.

When the system first opened for service in July 1970, live TV programs could be simulcast on both sides of the country, for the first time, and someone in Perth could pick up the phone and call someone in Melbourne directly (previously this would have gone through an operator).

PMG Engineers designed a case to transport the fragile equipment spares – That resided in the back of a Falcon XR Station Wagon

The system offered 1+1 redundancy, and capacity for 600 circuits, split across up to 6 radio bearers, and a bearer could be dedicated at times to support TV transmissions, carried on 5 watt (2 watt when modulated) carriers, operating at 1.9 to 2.3Ghz.

By linking the two sides of Australia, Telecom opened up the ability to have a single time source distributed across the country: the station at Lyndhurst in Victoria created the 100 “microseconds” signal (generated by VNG), which was carried across the link.

Looking down one of the towers

Unlike AT&T’s Long Lines network, which lasted until after MCI, deregulation and the breakup off the Bell System, the East-West link didn’t last all that long.

By 1981, Telecom Australia (No longer APO) had installed their first experimental optic fibre cable between Clayton and Springvale, and fibre quickly became the preferred method for broadband carrier circuits between exchanges.

By 1987, Melbourne and Sydney were linked by fibre, and the benefits of fibre were starting to be seen more broadly, and by 1989, just under 20 years since the original East-West Microwave system opened, Telecom Australia completed a 2373 kilometre long / 14 fibre cable from Perth to Adelaide, and Optus followed in 1993.

This effectively made the microwave system redundant. Fibre provided a higher bandwidth, more reliable service, that was far cheaper to operate due to decreased power requirements. And so piece by piece microwave hops were replaced with fibre optic cables.

I’m not clear on which was the last link to be switched off (If you do know please leave a comment or drop me a message), but eventually at some point in the late 1980s or early 1990s, the system was decommissioned.

Many of the towers still stand today and carry microwave equipment on them, but it is a far cry from what was installed in the late 1960s.

Advertisement from Andrew Antennas

References

East-west microwave link opening (Press Release)

Walkabout.Vol. 35 No. 6 (1 June 1969) – Communications Across the Nullabor

$8 Million Trans-continental link

ABC Goldfields-Esperance – Australia’s first live national television broadcast

APO – Newsletter ‘New East-West Trunks System’

TelevisionAU.com 50 years since Project Australia

Whirlpool Post

TJA Article on spur to Lenora

VoLTE / IMS – Analysis Challenge

It’s challenge time, this time we’re going to be looking at an IMS PCAP, and answering some questions to test your IMS analysis chops!

Here’s the packet capture:

Easy Questions

  • What QCI value is used for the IMS bearer?
  • What is the registration expiry?
  • What is the E-UTRAN Cell ID the Subscriber is served by?
  • What is the AMBR of the IMS APN?

Intermediate Questions

  • Is this the first or subsequent registration?
  • What is the Integrity-Key for the registration?
  • What is the FQDN of the S-CSCF?
  • What Nonce value is used and what does it do?
  • What P-CSCF Addresses are returned?
  • What time would the UE need to re-register by in order to stay active?
  • What is the AA-Request in #476 doing?
  • Who is the OEM of the handset?
  • What is the MSISDN associated with this user?

Hard Questions

  • What port is used for the ESP data?
  • Which encryption algorithm and integrity algorithm are used?
  • How many packets are sent over the ESP tunnel to the UE?
  • Where should SIP SUBSCRIBE requests get routed?
  • What’s the model of phone?

The answers for each question are on the next page, let me know in the comments how you went, and if there’s any tricky ones!

Mobile IPv6 Tax?

Recently a Tweet from Dean Bubly got me thinking about how data is charged in cellular:

In the cellular world, subscribers are charged for data from the IP, transport and applications layers; this means you pay for the IP header, you pay for the TCP/UDP header, and you pay for the contents (the cat videos it contains).

This also means if an operator moves mobile subscribers from IPv4 to IPv6, there’s an extra 20 bytes for every packet sent / received that the customer is charged for – This is because the IPv6 header (40 bytes) is longer than the IPv4 header (20 bytes).

Source: ServerFault - https://serverfault.com/questions/547768/ipv4-header-vs-ipv6-header-size

In most cases, mobile subs don’t get a choice as to whether their connection is IPv4 or IPv6, but on a like for like basis, we can say that if a customer is on IPv6, every packet sent/received will have an extra 20 bytes of data consumed compared to IPv4.

This means subscribers use more data on IPv6, and this means they get charged for more data on IPv6.

For IoT applications, light users and PAYG users, this extra 20 bytes per packet could add up to something significant – But how much?

We can quantify this, but we’d need to know the number of packets sent on average, and the quantity of the data transferred, because the number of packets is the multiplier here.

So for starters I left a phone on the desk, registered to the network but just sitting in idle mode – but this was an engineering phone from an OEM, used only for testing, so it doesn’t have any apps loaded, isn’t signed into any applications, and isn’t checking anything in the background, so I thought I’d try something more realistic.

So to get a clearer picture, I chucked a SIM in my regular everyday phone I use personally, registered it to the cellular lab I have here. For the next hour I sniffed the GTP traffic for the phone while it was sitting on my desk, not touching the phone, and here’s what I’ve got:

Overall the PCAP includes 6,417,732 bytes of data, but this includes the transport and GTP headers, meaning we can drop everything above it in our traffic calculations.

Everything except the data encapsulated in GTP can be dropped

For this I’ve got 14 bytes of ethernet, 20 bytes of IP, 8 bytes of UDP and 5 bytes for TZSP (this is to copy the traffic from the eNB to my local machine), then we’ve got the transport from the eNB to the SGW: 14 bytes of ethernet again, 20 bytes of IP, 8 bytes of UDP and 8 bytes of GTP, then the payload itself. Phew.
All this means we can drop 97 bytes off every packet.

We have 16,889 packets and 6,417,732 bytes in total; dropping 97 bytes of headers from each packet gives us 1,638,233 bytes (~1.6MB) to discard, leaving a total of ~4.56 MB of traffic to/from the phone itself.

This means my Android phone consumes 4.5 MB of cellular data in an hour while sitting on the desk, with 16,889 packets in/out.

Okay, now we’re getting somewhere!

So now we can answer the question, if each of these 16k packets was IPv6, rather than IPv4, we’d be adding another 20 bytes to each of them, 20 bytes x 16,889 packets gives 337,780 bytes (~0.3MB) to add to the total.

If this traffic was transferred via IPv6, rather than IPv4, we’d be looking at adding 20 bytes to each of the 16,889 packets, which would equate to 0.3MB extra, or about 7% overhead compared to IPv4.
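
For the sake of checking the arithmetic, here's the same back-of-the-envelope calculation as a quick script (the packet and byte counts are just the ones from my capture above):

# Back-of-the-envelope check of the IPv6 overhead numbers from the capture above.

packets          = 16_889       # packets to/from the phone in the hour
capture_bytes    = 6_417_732    # total bytes in the PCAP, including capture + transport headers
overhead_per_pkt = 97           # Ethernet + IP + UDP + TZSP, plus Ethernet + IP + UDP + GTP

payload_bytes = capture_bytes - packets * overhead_per_pkt
ipv6_extra    = packets * 20    # an IPv6 header is 20 bytes longer than an IPv4 header

print(f"User-plane traffic: {payload_bytes / 1024 / 1024:.2f} MB")
print(f"Extra if IPv6:      {ipv6_extra / 1024 / 1024:.2f} MB "
      f"({100 * ipv6_extra / payload_bytes:.1f}% overhead)")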

But before you go on about what an outrage this IPv6 transport is, being charged for those extra bytes, that’s only one part of the picture.

There’s a reason operators are finally embracing IPv6, and it’s not to put an extra 7% of traffic on the network (I think if you asked most capacity planners, they’d say they want data savings, not growth).

IPv6 is, for lack of a better term, less rubbish than IPv4.

There’s a lot of drivers for IPv6, and some of these will reduce data consumption.
IPv6 is actually your stuff talking directly to the remote stuff; this means we don’t need to rely on NAT, so there’s no need for NAT keepalives or constantly opening new sessions, which is going to save you data. If you’re running apps that need to keep a connection to somewhere alive, these data savings could negate your IPv6 overhead costs.

Will these potential data savings when using IPv6 outweigh the costs?

That’s going to depend on your use case.

If you’re extremely bandwidth / data constrained – for example, you have an IoT device on an NTN / satellite connection that was having to push data every X hours via IPv4 because you couldn’t pull data from it as it had no public IP – then moving it to IPv6 so you can pull the data on demand from its public IP will save you data. That’s a win for IPv6.

If you’re a mobile user, watching YouTube, getting push notifications and using your phone like a normal human, probably not, but if you’re using data like a normal user, you’ve probably got a sizable data allowance that you don’t end up fully consuming, and the extra 20 bytes per packet will be nothing in comparison to the data used to watch a 2k video on your small phone screen.

DNS – TCP or UDP?

Ask someone with headphones and a lanyard in the halls of a datacenter what transport DNS uses, and there’s a good chance the answer you’d get back is UDP Port 53.

But not always!

In scenarios where the DNS response is large (beyond 512 bytes) a DNS query will shift over to TCP for delivery.

How does the client know when to shift the request to TCP? After all, the DNS server knows how big the response is, but the client doesn’t.

The answer is the Truncated flag, in the response.

The DNS server sends back a response, but with the Truncated bit set, as per RFC 1035:

TC TrunCation – specifies that this message was truncated due to length greater than that permitted on the transmission channel.

RFC 1035

Here’s an example of the truncated bit being set in the DNS response.

The DNS client, upon receiving a response with the truncated bit set, should run the query again, this time using TCP for the transport.
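If you were implementing that retry logic by hand, it would look something like this sketch using the dnspython library (the resolver IP and query name here are just placeholders):

import dns.flags
import dns.message
import dns.query

resolver = "8.8.8.8"                              # placeholder resolver
query = dns.message.make_query("example.com", "TXT")

# Try UDP first, like any normal stub resolver would
response = dns.query.udp(query, resolver, timeout=5)

# If the server set the TC (truncated) bit, re-run the same query over TCP
if response.flags & dns.flags.TC:
    response = dns.query.tcp(query, resolver, timeout=5)

print(response.answer)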

One prime example of this is the DNS NAPTR records used in roaming scenarios, where the responses are often quite large.

If these responses didn’t move over to TCP, you’d run the risk of MTU mismatches dropping the DNS traffic. Considering half of my life has been spent debugging DNS issues, and the other half debugging MTU issues, if I had to debug MTU and DNS issues together I’d be looking for a career change…

Improving WiFi Calling quality for WiFi Operators

I had a question recently on LinkedIn asking how a network engineer operating a WiFi network can prioritise Voice over WiFi traffic to ensure the best quality of experience for those calls.

Voice over WiFi is underpinned by the ePDG – Evolved Packet Data Gateway (essentially a fancy IPsec concentrator we authenticate to using the SIM, so our traffic can reach the P-CSCF over an otherwise untrusted connection). For someone operating a WiFi network, the question becomes: how do we identify the traffic heading to the ePDGs so we can prioritise it?

ePDGs can be discovered through a simple DNS lookup: once you know the Mobile Network Code and Mobile Country Code of the operators you want to prioritise, you can find the IPs really easily.

ePDG addresses take the form epdg.epc.mncXXX.mccYYY.pub.3gppnetwork.org so let’s look at finding the IPs for each of these for the operators in a country:

The first step is nailing down the Mobile Network Codes and Mobile Country Code of the operators you want to target – Wikipedia is a great source for this information.
Here in Australia we have the Mobile Country Code 505, and the big 3 operators all support Voice over WiFi, so let’s look at how we’d find the IPs for each.
Telstra has Mobile Network Code (MNC) 01 – in 3GPP DNS we always pad network codes to 3 digits, so that’ll be 001 – and the Mobile Country Code (MCC) for Australia is 505.
So to find the IPs for Telstra we’d run an nslookup for epdg.epc.mnc001.mcc505.pub.3gppnetwork.org – the list of IPs returned are the IPs you’ll see Voice over WiFi traffic going to, and the IPs you should give higher priority to:

For the other big operators in Australia epdg.epc.mnc002.mcc505.pub.3gppnetwork.org will get you Optus and epdg.epc.mnc003.mcc505.pub.3gppnetwork.org will get you VHA.

The same rules apply in other countries; you’d just need to update the MNC/MCC to match the operators in your country, do an nslookup and prioritise those IPs.
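Here’s a rough sketch of how you could script those lookups in Python (the MNC-to-operator mapping is just the Australian example from above):

import socket

MCC = "505"   # Australia
OPERATORS = {"Telstra": "001", "Optus": "002", "VHA": "003"}

for operator, mnc in OPERATORS.items():
    fqdn = f"epdg.epc.mnc{mnc}.mcc{MCC}.pub.3gppnetwork.org"
    try:
        # Collect the unique addresses returned for the ePDG FQDN
        ips = sorted({result[4][0] for result in socket.getaddrinfo(fqdn, None)})
        print(f"{operator}: {fqdn} -> {ips}")
    except socket.gaierror:
        print(f"{operator}: {fqdn} -> no records found")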

Generally these IPs are pretty static, but a certain level of maintenance will be required to keep the list up to date by rechecking it periodically.

Happy WiFi Calling!

CGrateS MySQL Rounding Error

I put my rates in with a stack of decimal places, because accuracy matters!

But when I manually calculated the costs associated with each transaction and compared them to CGrateS’ output, I seemed to have some rounding errors.

So what was the issue?

The schema in MySQL had the rate column set to DECIMAL(10,4), which gives us 10 digits in total, with only 4 of those after the decimal point – so my carefully entered rates were being rounded to 4 decimal places on import.
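To illustrate the difference, here’s a quick sketch using Python’s decimal module to mimic what MySQL does to the stored value in each case (the rate here is made up):

from decimal import Decimal

rate = Decimal("0.0041666667")   # a made-up rate with lots of precision

# DECIMAL(10,4): 10 digits in total, only 4 of them after the decimal point
print(rate.quantize(Decimal("0.0001")))           # 0.0042 - hence the rounding errors

# DECIMAL(10,10): 10 digits, all of them after the decimal point (so values must be < 1)
print(rate.quantize(Decimal("0.0000000001")))     # 0.0041666667 - precision preserved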

A quick alter table and a reimport of the rates and I was on my way!

"alter table tp_rates modify column  rate DECIMAL(10,10);"

Lesson learned and hopefully of use to any other CGrateS users who may be using MySQL as a StoreDB.

SMS Transport Wars?

There’s an old joke about standards: the great thing about standards is that there are so many to choose from.

SMS wasn’t there from the start of GSM, but within a year of the inception of 2G we had SMS, and we’ve had SMS, almost totally unchanged, ever since.

In a recent Twitter exchange, I was asked, what’s the best way to transport SMS?
As always the answer is “it depends” so let’s take a look together at where we’ve come from, where we are now, and how we should move forward.

How we got Here

Between 2G and 3G SMS didn’t change at all, but the introduction of 4G (LTE) caused a bit of a rethink regarding SMS transport.

Early builders of LTE (4G) networks launched their 4G offerings without 4G Voice support (VoLTE), with the idea that networks would “fall back” to using 2G/3G for voice calls.

This meant users got fast data, but to make or receive a call they relied on falling back to the circuit switched (2G/3G) network – Hence the name Circuit Switched Fallback.

Falling back to the 2G/3G network for a call was one thing, but some smart minds realised that if a phone had to fall back to 2G/3G every time a subscriber sent a text as well – and keep in mind this was ~2010, when SMS traffic was crazy high – it would put a huge amount of strain on the 2G/3G layers as subs constantly flip-flopped between them.

To address this the SGs-AP interface was introduced, linking the 4G core (MME) with the 2G/3G core (MSC), to support this stage where you had 4G/LTE but only for data, while SMS and calls still relied on the 2G/3G core (MSC).

The SGs-AP interface has two purposes: one, it can tell a phone on 4G to fall back to 2G/3G when it’s got an incoming call; and two, it can send and receive SMS.

SMS traffic over this interface is sometimes described as SMS-over-NAS, as it’s transported over a signaling channel to the UE.

This also worked when roaming, as the MSC from the 2G/3G network was still used, so SMS delivery worked the same when roaming as if you were in the home 2G/3G network.

Enter VoLTE & IMS

Of course when VoLTE entered the scene, it also came with its own option for delivering SMS to users, using IP rather than NAS signaling. This removed the reliance on a link to a 2G/3G core (MSC) to make calls and send texts.

This was great because it allowed operators to build networks without any 2G/3G network elements and build a fully standalone LTE only network, like Jio, Rakuten, etc.

VoLTE didn’t change anything about the GSM 2G/3G SMS PDU, it just bundled it up in a SIP message body – this is often referred to as SMS-over-IP.

SMS-over-IP doesn’t address any of the limitations from 2G/3G – you still need multipart messages to send payloads above 160 characters, and it carries all the same restrictions in order to be backwards compatible – but it is over IP, and it doesn’t need 2G or 3G.

In roaming scenarios, S8 Home Routing for VoLTE enabled SMS to be handled when roaming the same way as voice calls, which made SMS roaming a doddle.

4G SMS: SMS over IP vs SMS over NAS

So if you’re operating a 4G network, should you deliver your SMS traffic using SMS-over-IP or SMS-over-NAS?

Generally, if you’ve been evolving your network over the years, you’ve got an MSC and a 2G/3G network, and you may still do CSFB, so you’ve probably ended up doing SMS over NAS using the SGs-AP interface.
This method still relies on “the old ways” to work, which is fine until a discussion starts around sunsetting the 2G/3G networks and moving calling to VoLTE – and SMS over NAS is a bit of a mess when it comes to roaming.

Greenfield operators generally opt for SMS over IP from the start, but this has its own limitations: SMS over IP has awful efficiency, which makes it unsuitable for bandwidth-constrained NB-IoT applications; support for it is generally limited to more expensive chipsets, so the bargain-basement chips used for IoT often don’t support SMS over IP either; and enabling VoLTE comes with its own set of integration challenges.

5G enters the scene (Nsmsf_SMService)

5G rolled onto the scene with the opportunity to remove the SMS over NAS option and rely purely on SMS over IP (IMS), forcing the industry to standardise on a single option – alas, this did not happen.

Instead, 5GC introduces yet another delivery mechanism for SMS, just for 5GC without VoNR: the SMSF, which can still send and receive messages over 5G NAS signaling.

This added another option for SMS delivery dependent on the access network used, and the Nsmsf_SMService interface does not support roaming.

Of course if you are using Voice over NR (VoNR) then like VoLTE, SMS is carried in a SIP message to the IMS, so this negates the need for the Nsmsf_SMService.

2G/3G Shutdown – Diameter to replace SGs-AP (SGd)

With the 2G/3G shutdowns in the US, operators who had up until this point been relying on SMS-over-NAS via the SGs-AP interface back to their MSCs were forced to decide how to route SMS traffic once those MSCs were switched off.

This landed with SMS-over-Diameter, where the 4G core (MME) communicates over Diameter with the SMSc.

The advantage of this approach is that the Diameter protocol stack is the backbone of 4G roaming, so it’s not a stretch to get existing Diameter Routing Agents to start flicking SMS-over-Diameter messages between operators.

This has been adopted by all the US operators, but we’re not seeing it so widely deployed in the rest of the world.

State of Play

Option: MAP
Conditions: 2G/3G only
Notes: Relies on SS7 signaling and is very old. Supports roaming.

Option: SGs-AP (SMS-over-NAS)
Conditions: 4G only, relies on 2G/3G
Notes: Needs an MSC to be present in the network (generally because you have a 2G/3G network and have not deployed VoLTE). Supports limited roaming.

Option: SMS over IP (IMS)
Conditions: 4G / 5G
Notes: Not supported on 2G/3G networks. Relies on an IMS-enabled handset and network. Supports roaming in all S8 Home Routed scenarios. Device support limited, especially for IoT devices.

Option: Diameter SGd
Conditions: 4G only / 5G NSA
Notes: Only works on 4G or 5G NSA. Better device support than SMS over IP (IMS). Supports roaming in some scenarios.

Option: Nsmsf_SMService
Conditions: 5G standalone only
Notes: Only works on 5GC. Doesn’t support roaming.

The convoluted world of SMS delivery options

A Way Forward:

While the SMS payload hasn’t changed in the past 31 years, the way it’s transported has opened up a lot of options for operators, with no clear winner – all while SMS revenues and traffic volumes have continued to fall.

For better or worse, the industry needs to accept that SMS over NAS is an option for when there is no IMS, and that in order to decommission 2G/3G networks, IMS needs to be embraced – so supporting SMS over IP (IMS) in all future networks seems like the simple, logical way forward.

And with that clear path forward, we add in another wildcard…

Direct to device Satellite messes everything up…

Remember way back in this post when I said SMS over IP using IMS is a really, really inefficient way of getting data across? Well, that hasn’t been a problem as we’ve progressed up the generations of cellular tech, because each “G” has given us more bandwidth than the last.

To throw a spanner in the works, let’s introduce NB-IoT and Non-Terrestrial Networks which rely on Non-IP-Data-Delivery.

These offer the ability to cover the globe with a low-bandwidth / high-latency service that ensures a subscriber is always just a message away, and we’re already seeing real-world examples of these networks being deployed for messaging applications.

But when you’ve only got a finite amount of bandwidth, and massive latencies to contend with, the all-IP architecture of IMS (VoLTE / VoNR) and its woeful inefficiency starts to really sting.

Of course there are potential workarounds here – Robust Header Compression (ROHC) can shrink the headers down – but you’re still relying on TCP’s three-way handshake, TCP keepalive timers and IMS registrations, which in turn can starve the radio resources of the satellite link.

For NTN (satellite) networks the case is being heavily made to rely on Non-IP Data Delivery, so the logical answer for these applications is to move the traffic back to SMS over NAS.

End Note

Even with SMS now over 30 years old, we can still expect it to be a part of networks for years to come, even as WhatsApp, iMessage and the like offer enhanced services. As for how it’s transported, with the myriad of options available I’m expecting we’ll keep seeing a multi-transport mix long into the future.

For a simple, cut-and-dried 4G/5G-only network, IMS and SMS over IP make the most sense, but for anything outside of that, you’ve got a toolbox of options to build a solution that best meets your needs.

FreeSWITCH Bridge Timers 101

A cheat sheet for anyone trying to control FreeSWITCH bridge behaviour, for example to move calls elsewhere if they’re not answered or responded to:

originate_timeout

How long to wait for any response from the remote peer (100 TRYING, 180 RINGING, etc.).

This is useful for knowing when to give up and try a different peer as this peer is dead.

originate_retries

How many times to retransmit the INVITE if no 100 TRYING / 180 RINGING is received.

Like originate_timeout, this is handy for giving up sooner when a peer is dead and moving onto others.

progress_timeout

How long we wait after sending the SIP INVITE for a 180/183 before we give up.

This is handy for finding out if the remote end isn’t able to reach the endpoint you’re after (for example, to page the UE in a cellular context).

bridge_answer_timeout

How long we wait between the INVITE and a 200 OK (including time spent ringing) – this is useful for “no answer” timeouts.

If you want to know why a bridge failed – no-answer timeout reached, error on the remote end, etc. – you can check with the following variable:

variable_last_bridge_hangup_cause: [PROGRESS_TIMEOUT]

Which will allow you to tell if it’s no answer or progress timeout to blame.

Verify Android Signing Certificate for ARA-M Carrier Privileges in App

Part of the headache when adding the ARA-M certificate to a SIM is getting the correct certificate digest loaded.

The below command calculates the SHA-1 digest we need to load as the App ID on the SIM card’s ARA-M or ARA-F applet:

apksigner verify --verbose --print-certs "yourapp.apk"

You can then flash this onto the SIM with PySIM:

pySIM-shell (MF/ADF.ARA-M)> aram_store_ref_ar_do --aid FFFFFFFFFFFF --device-app-id 40b01d74cf51bfb3c90b69b6ae7cd966d6a215d4 --android-permissions 0000000000000001 --apdu-always
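If you’d rather calculate the digest yourself instead of copying it out of apksigner’s output, the value it prints should just be the SHA-1 of the DER-encoded signing certificate – here’s a rough sketch, assuming you’ve exported the certificate from your keystore as cert.der (a placeholder path):

import hashlib

# SHA-1 over the DER-encoded signing certificate gives the device-app-id
with open("cert.der", "rb") as f:
    print(hashlib.sha1(f.read()).hexdigest())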

FreeSWITCH – Keep Call-ID the same on both legs of a Bridged Call

I needed to have both legs of the B2BUA bridge call through FreeSWITCH using the same Call-ID (long story), and went down the rabbit hole of looking for how to do this.

A post from 15 years ago on the mailing list from Anthony Minessale said he’d added a “sip_outgoing_call_id” variable for this, and I found the commit, but it doesn’t work – more digging shows this variable disappeared somewhere along the way.

But by looking at what it changed I found sip_invite_call_id does the same thing now, so if you want to make both legs use the same Call-ID here ya go:

<action application="set" data="sip_invite_call_id=mycustomcallid"/>

Yeah, this post probably could’ve been a Tweet….