Tag Archives: VoIP

CGrateS – AttributeS

The docs describe AttributeS as a Key-Value-Store, but that’s probably selling it short – You can do some really cool stuff with AttributeS, and in this post, we’re going to learn about using AttributeS to transform stuff.

Note: Before we get started, I’d suggest copying this config file to use for testing.

Let’s look at a really basic example, where we add some data into AttributeS, match based on Account in CGrateS, and get back that data.

Let’s look at how this would look on the API:

{
    "method": "APIerSv2.SetAttributeProfile",
    "params": [{
        "Tenant": "cgrates.org",
        "ID": "ATTR_Nick_Key_Value_Example",
        "Contexts": ["*any"],
        "FilterIDs": [
            "*string:~*req.Account:1234"
        ],
        "Attributes": [
            {
            "FilterIDs": [],
            "Path": "*req.ExampleKey",
            "Type": "*constant",
            "Value": "ExampleValue"
            }
        ],
        "Blocker": False,
        "Weight": 10

    }],
}

So what are we doing in this API call?

Well, for starters we’re calling the SetAttributeProfile endpoint, this is where we go to create / update Attribute Profiles, but in this case, because we’re hitting it for the first time with this ID, we’re creating a new entry called “ATTR_Nick_Key_Value_Example“, this will match any Contexts (more on them later) where the FilterIDs is a string, where the request Account, is equal to 1234.

Let’s run this against the CGrateS API and take a look at the result:

import cgrateshttpapi
import pprint

CGRateS_Obj = cgrateshttpapi.CGRateS('localhost', 2080)

SetAttributeProfile = {
    "method": "APIerSv2.SetAttributeProfile",
    "params": [{
        "Tenant": "cgrates.org",
        "ID": "ATTR_Nick_Key_Value_Example",
        "Contexts": ["*any"],
        "FilterIDs": [
            "*string:~*req.Account:1234"
        ],
        "Attributes": [
            {
            "FilterIDs": [],
            "Path": "*req.ExampleKey",
            "Type": "*constant",
            "Value": "ExampleValue"
            }
        ],
        "Blocker": False,
        "Weight": 10
    }],
}
result = CGRateS_Obj.SendData(SetAttributeProfile)
pprint.pprint(result)

result = CGRateS_Obj.SendData({"method":"AttributeSv1.ProcessEvent",
                               "params":[
                                   {"Tenant":"cgrates.org",
                                    "Event":{"Account":"1234"},"APIOpts":{}}]})
pprint.pprint(result)

All going well you should have got the following back:

{'method': 'AttributeSv1.ProcessEvent', 'params': [{'Tenant': 'cgrates.org', 'Event': {'Account': '1234'}, 'APIOpts': {}}]}
{'error': None,
 'id': None,
 'result': {'APIOpts': {},
            'AlteredFields': ['*req.ExampleKey'],
            'Event': {'Account': '1234', 'ExampleKey': 'ExampleValue'},
            'ID': '',
            'MatchedProfiles': ['cgrates.org:ATTR_Nick_Key_Value_Example'],
            'Tenant': 'cgrates.org',
            'Time': None}}

This tells us we matched the Attribute with the ID ATTR_Nick_Key_Value_Example, and inside Event we can see that ExampleKey was added with value ExampleValue.

Okay, you’re saying, well what was the point of that?

Well, what if as a key in the attributes, we had the password for the SIP account, which we passed to our SIP switch (Kamailio, FreeSWITCH or Asterisk for example), and used that to authenticate?

Let’s see how that would look:

{
    "method": "APIerSv2.SetAttributeProfile",
    "params": [{
        "Tenant": "cgrates.org",
        "ID": "ATTR_Nick_Password_Example",
        "Contexts": ["*any"],
        "FilterIDs": [
            "*string:~*req.Account:1234"
        ],
        "Attributes": [
            {
            "FilterIDs": [],
            "Path": "*req.SIP_password",
            "Type": "*constant",
            "Value": "sosecretiputitonthewebsite"
            }
        ],
        "Blocker": False,
        "Weight": 10
    }],
}

Now if the CGrateS Agent for your SIP Switch, includes the *attributes flag, and the call is coming from 1234, we’ll get back a key called “SIP_password” with the value “sosecretiputitonthewebsite”, which you can use to auth the SIP account.

We can also return multiple AttributeS, for example, we created two Attributes (ATTR_Nick_Password_Example and ATTR_Nick_Key_Value_Example) which match on the account 1234. This means we’ll get back the SIP Password from ATTR_Nick_Password_Example and the key:value we set in ATTR_Nick_Key_Value_Example:

{'method': 'AttributeSv1.ProcessEvent', 'params': [{'Tenant': 'cgrates.org', 'Event': {'Account': '1234'}}]}
{'error': None,
 'id': None,
 'result': {'APIOpts': {},
            'AlteredFields': ['*req.SIP_password', '*req.ExampleKey'],
            'Event': {'Account': '1234',
                      'ExampleKey': 'ExampleValue',
                      'SIP_password': 'sosecretiputitonthewebsite'},
            'ID': '',
            'MatchedProfiles': ['cgrates.org:ATTR_Nick_Password_Example',
                                'cgrates.org:ATTR_Nick_Key_Value_Example'],
            'Tenant': 'cgrates.org',
            'Time': None}}

The order can be controlled by the Weight flag in the attribute, and if you want to stop matching any other AttributeS rules after the current Attribute, you can set the Blocker=True flag when you create/update the Attribute.

Okay, I hear you saying, that’s all well and good, I can add arbitrary key/values to stuff. Here endeth the lesson right?

Well not quite, because we can add key/values, but we can also rewrite variables using AttributeS.

Let’s imagine we’ve got 3 phone numbers (DIDs) associated with an account inside CGrateS, for example’s sake let’s say we have 12340001, 12340002 and 12340003, and we want any calls from these numbers to be billed to a CGrateS account called “NickTest1234”.

Our SIP switch doesn’t need to know anything about “NickTest1234”, just the 3 DIDs it can use to call out from your SIP stack. But to do this, we’d need CGrateS to transform any events from these DIDs to replace the Account value inside CGrateS, with NickTest1234.

Let’s see how that would look:

{'method': 'APIerSv2.SetAttributeProfile', 'params': [{'Tenant': 'cgrates.org', 'ID': 'ATTR_Calling_NickTest1234_12340001', 'Contexts': ['*any'], 'FilterIDs': ['*string:~*req.Account:12340001'], 'Attributes': [{'Path': '*req.Account', 'Type': '*constant', 'Value': 'NickTest1234'}], 'Weight': 0}], 'id': 1}

{'method': 'APIerSv2.SetAttributeProfile', 'params': [{'Tenant': 'cgrates.org', 'ID': 'ATTR_Calling_NickTest1234_12340002', 'Contexts': ['*any'], 'FilterIDs': ['*string:~*req.Account:12340002'], 'Attributes': [{'Path': '*req.Account', 'Type': '*constant', 'Value': 'NickTest1234'}], 'Weight': 0}], 'id': 2}

{'method': 'APIerSv2.SetAttributeProfile', 'params': [{'Tenant': 'cgrates.org', 'ID': 'ATTR_Calling_NickTest1234_12340003', 'Contexts': ['*any'], 'FilterIDs': ['*string:~*req.Account:12340003'], 'Attributes': [{'Path': '*req.Account', 'Type': '*constant', 'Value': 'NickTest1234'}], 'Weight': 0}], 'id': 3}

In the example code to go with this I’ve put together a simple for loop to add these – You can find the code on Github (link at the bottom).

So with these defined, let’s try and rate something, we’ll add a default Charger, and add an SMS balance, before simulating an SMS where the account is set to 12340003:

#Define default Charger
print(CGRateS_Obj.SendData({"method":"APIerSv1.SetChargerProfile","params":[{"Tenant":"cgrates.org","ID":"DEFAULT","FilterIDs":[],"AttributeIDs":["*none"],"Weight":0}]}))

#Add an SMS Balance
print(CGRateS_Obj.SendData({"method":"ApierV1.SetBalance","params":[{"Tenant":"cgrates.org","Account":"Nick_Test_123","BalanceType":"*sms","Categories":"*any","Balance":{"ID":"SMS_Balance_1","Value":"100","Weight":25}}],"id":13}))

import uuid
import datetime
now = datetime.datetime.now()
result = CGRateS_Obj.SendData({
    "method": "CDRsV2.ProcessExternalCDR",
    "params": [
        {
            "OriginID": str(uuid.uuid1()),
            "ToR": "*sms",
            "RequestType": "*pseudoprepaid",
            "AnswerTime": now.strftime("%Y-%m-%d %H:%M:%S"),
            "SetupTime": now.strftime("%Y-%m-%d %H:%M:%S"),
            "Tenant": "cgrates.org",
            #This is going to be transformed to Nick_Test_123 by Attributes
            "Account": "12340003",
            "Usage": "1",
        }
    ]
})
pprint.pprint(result)

Right, so all going well, here’s what you should see in the CDRs table:

Bing, despite the fact the Account in the ProcessExternalCDR was set to 12340003, and had no mention of “NickTest1234”, CGrateS transformed it to NickTest1234.

How did that happen? Well, inside our cgrates.json file we have set the cdrs and chargers modules to have a link to Attributes, which means that when we call CDRs or Chargers modules via the API, these will in turn bounce the data through AttributesS for any transformations.

This means we don’t need to run AttributeSv1.ProcessEvent ourselves, when we call CDRsV2.ProcessExternalCDR, the CDRs module will call AttributeSv1.ProcessEvent for us.

We can actually see this happening, using ngrep, which as you work more with CGrateS, is a tool you’ll get very familiar with, let’s take a peek:

sudo ngrep -t -W byline port 2012 -d lo

Now if we run the CDRsV2.ProcessExternalCDR again, we’ll see the CDRs module has called Attributes for us:

Boom, there it is, same as we ran, but it’s being handled by CGrateS for us.

If you look carefully you’ll see the context in the API request is set to “*cdrs”, this means the CDRs module is calling Attributes.

When we define each of our Attributes, as we did earlier in the post, we can set what contexts they are valid in, for example we may want to apply the transformation when called by CDRs, but not other modules, you can restrict that when you define the Attribute by setting “Contexts”: [“*cdrs”].

Okay, so we’ve done some account replacement, what else can we do?

Well, let’s look at some other use cases,

Here in Australia we’ve got a few valid dialing formats, you could dial E.164 format (Numbers look like: +61212341234), 0NSN format (Numbers look like: 02 1234 1234) or NSN format (Numbers look like: 1234 1234 assuming you’re in the 03 area code yourself).
If we want to define all our Destinations in E.164 format, we’ll need to to normalise the format using AttributeS, so the numbers always come as E.164.

Let’s give it a whirl with a static translation:

{
    "method": "APIerSv2.SetAttributeProfile",
    "params": [{
        "Tenant": "cgrates.org",
        "ID": "ATTR_0NSN_to_E164_Single",
        "Contexts": ["*any"],
        "FilterIDs": [
            "*string:~*req.Subject:0212341234"
        ],
        "Attributes": [
            {
            "FilterIDs": [],
            "Path": "*req.Subject",
            "Type": "*constant",
            "Value": "61212341234"
            }
        ],
        "Blocker": False,
        "Weight": 10
    }],
}

Now this will work, if we simulate an Event to AttributeS with the Subject 0212341234, it’ll get transformed by AttributeS to 61212341234.

The issue here is probably pretty obvious, the only matches one number, if we dial 0212341235 this all falls apart.

Enter our old friend Regex.

For starters, we’ll change the FilterIDs to match where the Account is NickTest7, this way we can set the rules on a per CGrateS account basis.

{
    "method": "APIerSv2.SetAttributeProfile",
    "params": [{
        "Tenant": "cgrates.org",
        "ID": "ATTR_0NSN_to_E164_02_Area_Code",
        "Contexts": ["*any"],
        "FilterIDs": [
            "*string:~*req.Account:NickTest7"
        ],
        "Attributes": [
            {
            "FilterIDs": [],
            "Path": "*req.Subject",
            "Type": "*variable",
            "Value": "~*req.Subject:s/^0(\d{1})(\d{8})$/61${1}${2}/"
            },
            {
            "FilterIDs": [],
            "Path": "*req.Subject",
            "Type": "*variable",
            "Value": "~*req.Subject:s/^(\d{8})$/612${1}/"
            },
        ],
        "Blocker": False,
        "Weight": 10
    }],
}

And then under AttributeS we’ve defined a rule to replace anything matching the 0NSN regex, to strip the first digit and append a 61, to put it in E.164 format, and in SN format as the second entry.

We can now test this out:

{'method': 'AttributeSv1.ProcessEvent', 'params': [{'Tenant': 'cgrates.org', 'Event': {'Account': 'NickTest7', 'Subject': '0312341234'}, 'APIOpts': {'*processRuns': 5, '*profileRuns': 5, '*subsys': '*sessions'}}]}
{'error': None,
 'id': None,
 'result': {'APIOpts': {'*processRuns': 5,
                        '*profileRuns': 5,
                        '*subsys': '*sessions'},
            'AlteredFields': ['*req.Subject'],
            'Event': {'Account': 'NickTest7', 'Subject': '61312341234'},
            'ID': '',
            'MatchedProfiles': ['cgrates.org:ATTR_0NSN_to_E164_02_Area_Code'],
            'Tenant': 'cgrates.org',
            'Time': None}}



{'method': 'AttributeSv1.ProcessEvent', 'params': [{'Tenant': 'cgrates.org', 'Event': {'Account': 'NickTest7', 'Subject': '12341234'}, 'APIOpts': {'*processRuns': 5, '*profileRuns': 5, '*subsys': '*sessions'}}]}
{'error': None,
 'id': None,
 'result': {'APIOpts': {'*processRuns': 5,
                        '*profileRuns': 5,
                        '*subsys': '*sessions'},
            'AlteredFields': ['*req.Subject'],
            'Event': {'Account': 'NickTest7', 'Subject': '61212341234'},
            'ID': '',
            'MatchedProfiles': ['cgrates.org:ATTR_0NSN_to_E164_02_Area_Code'],
            'Tenant': 'cgrates.org',
            'Time': None}}

And there you have it folks; our number format standardized.

We can combo / cascade AttributeS rules together, with the aid of the Weight and Blocker flags in the API.

Let’s imagine the 61212341234 number has been ported from Operator1 to Operator2, and the Destinations we’ve defined in CGrateS for this prefix are currently set to DST_Operator1.
But because this number has been ported we should use DST_Operator2, so we charge the Operator2, as this number has been ported.

This means we don’t need to duplicate destination definitions to show this number has been ported, as this will be updated as the call gets rated, so we just assign the Attribute to each ported number.

So let’s match where the Subject of the call is 61212341234 (even though we’re going to input the Subject as 12341234), and rewrite the Destination attribute to DST_Operator2:

{
    "method": "APIerSv2.SetAttributeProfile",
    "params": [{
        "Tenant": "cgrates.org",
        "ID": "ATTR_Ported_61212341234",
        "Contexts": ["*any"],
        "FilterIDs": [
            "*string:~*req.Subject:61212341234",
        ],
        "Attributes": [
            {
            "FilterIDs": [],
            "Path": "*req.Destination",
            "Type": "*constant",
            "Value": "DST_Operator2"
            },
        ],
        "Blocker": False,
        "Weight": 5
    }],
}

From the results we can see we matched two AttributeS rules, the first, ATTR_0NSN_to_E164_02_Area_Code reformatted the Subject of the call from 12341234 to 61212341234, then the updated Subject was passed through to ATTR_Ported_61212341234, which updated the Destination attribute to DST_Operator2.

{'method': 'AttributeSv1.ProcessEvent', 'params': [{'Tenant': 'cgrates.org', 'Event': {'Account': 'NickTest7', 'Subject': '12341234'}, 'APIOpts': {'*processRuns': 5, '*profileRuns': 5, '*subsys': '*sessions'}}]}
{'error': None,
 'id': None,
 'result': {'APIOpts': {'*processRuns': 5,
                        '*profileRuns': 5,
                        '*subsys': '*sessions'},
            'AlteredFields': ['*req.Subject', '*req.Destination'],
            'Event': {'Account': 'NickTest7',
                      'Destination': 'DST_Operator2',
                      'Subject': '61212341234'},
            'ID': '',
            'MatchedProfiles': ['cgrates.org:ATTR_0NSN_to_E164_02_Area_Code',
                                'cgrates.org:ATTR_Ported_61212341234'],
            'Tenant': 'cgrates.org',
            'Time': None}}

Hopefully this has helped you to dip a toe into the CGrateS AttributeS pool, and give you some ideas of what we can achieve inside AttributeS.

A complete working code & config is available on my Github here.

If you’re having issues, make sure you have loaded the config file, are running the latest version, and if in doubt (and not on a production system), this script will clear all the data for you so you can rule out anything interfering.

Android and Emergency Calling

In the last post we looked at emergency calling when roaming, and I mentioned that there are databases on the handsets for emergency numbers, to allow for example, calling 999 from a US phone, with a US SIM, roaming into the UK.

Android, being open source, allows us to see how this logic works, and it’s important for operators to understand this logic, as it’s what dictates the behavior in many scenarios.

It’s important to note that I’m not covering Apple here, this information is not publicly available to share for iOS devices, so I won’t be sharing anything on this – Apple has their own ecosystem to handle emergency calling, if you’re from an operator and reading this, I’d suggest getting in touch with your Apple account manager to discuss it, they’re always great to work with.

The Android Open Source Project has an “emergency number database”. This database has each of the emergency phone numbers and the corresponding service, for each country.

This file can be read at packages/services/Telephony/ecc/input/eccdata.txt on a phone with engineering mode.

Let’s take a look what’s in mainline Android for Australia:

You can check ECC for countries from the database on the AOSP repo.

This is one of the ways handsets know what codes represent emergency calling codes in different countries, alongside the values set in the SIM and provided by the visited network.

Tales from the Trenches: mode-set in AMR

This one was a bit of a head scratcher for me, but I’m always glad to learn something new.

The handset made a VoLTE call, and it’s SDP offer shows it can support AMR and AMR-WB:

        Media Attribute (a): rtpmap:116 AMR-WB/16000/1
        Media Attribute (a): fmtp:116 mode-set=0,1,2,3,4,5,6,7,8;mode-change-capability=2;max-red=220
        Media Attribute (a): rtpmap:118 AMR/8000/1
        Media Attribute (a): fmtp:118 mode-set=0,1,2,3,4,5,6,7;mode-change-capability=2;max-red=220
        Media Attribute (a): rtpmap:111 telephone-event/16000
        Media Attribute (a): fmtp:111 0-15

Okay, that’s pretty normal, I can see we have the mode-set parameter defined, which indicates what modes the handset supports for each codec.

In our problem scenario, the Media Gateway that the call was sent to responded with this SDP answer:

        Media Description, name and address (m): audio 24504 RTP/AVP 118 110
        Media Attribute (a): rtpmap:118 AMR/8000
        Media Attribute (a): fmtp:118 mode-set=7
        Media Attribute (a): rtpmap:110 telephone-event/8000
        Media Attribute (a): fmtp:110 0-15
        Media Attribute (a): ptime:20
        Media Attribute (a): sendrecv
        [Generated Call-ID: FA163E564B37-f4d-98f56700-735d25-65357ee0-9c488]

But we got an error about not available codecs and the call drops, what gives?

Both sides support AMR (Only the phone supports AMR-WB), and the Media Gateway, as the answerer, supports mode-set 7, which is supported by the UE, so we should be good?

Well, not quite:

If mode-set is specified, it MUST be abided, and frames encoded with modes outside of the subset MUST NOT be sent in any RTP payload or used in codec mode requests. If not present, all codec modes are allowed for the payload type.

RFC 4867 – RTP Payload Format for AMR and AMR-WB

Okay, I get it, the answerer (media gateway) only supports mode 7, but the UE supports all the modes, so we should be fine right?

Well, no.

Section 8.3.1 in the RFC goes on to say in the Offer-Answer Model Considerations:

The parameter [mode-set] is bi-directional, i.e., the restricted set applies to media both to be received and sent by the declaring entity. If a mode set was supplied in the offer, the answerer SHALL return the mode-set unmodified or reject the payload type. However, the answerer is free to choose a mode-set in the answer only if no mode-set was supplied in the offer for a unicast two-peer session.

And there is our problem, and why the call is getting rejected.

The Media Gateway (the answerer in this scenario) is sending back the mode-set it supports (7) but as the UE / handset (offerer) included the mode-set, the Media Gateway should either respond with the same mode set (if it supported all the requested modes) or reject it.

Instead we’re seeing the Media Gateway repond with the mode set, which it supports, which it should not do: The Media Gateway should either return the same mode-set (unmodified / unchanged) or reject it.

And boom, another ticket to another vendor…

Tales from the Trenches: The issue with Emergency Calling URNs in IMS Networks

A lot of countries have a single point of contact for emergency services; in Europe you’d call 112 in an emergency, 000 in Australia or 911 in the US. Calling this number in the country will get you the emergency services.

This means a caller can order an ambulance for smoke inhalation, and the fire brigade, in one call.

But that’s not the case in every country; many countries don’t have one number for the emergency services, they’ve got multiple; a phone number for police, a different number for fire brigade and a different number for an ambulance.

For example, in Brazil if you need the police, you call 190, while a for example, uses 193 as the emergency number for the fire department, the police can be reached at 190 or 191 depending on if it’s road policing or general, and medical emergencies are covered by 192. Other countries have similar setups.

This is all well and good if you’re in Brazil, and you call 192 for an ambulance, the phone sends a SIP INVITE with a Request URI of sip:[email protected], because we can put a rule into our E-CSCF to say if the number is 192 to route it to the answer point for ambulances – But that’s not often the case on emergency calls.

In IMS, handsets generally detect the number dialed is on the Emergency Calling Code (ECC) list from the USIM Card.

The use of the ECC list means the phone knows this is an emergency call, and this is really important. For countries that use AML this can trigger sending of the AML SMS that process, and Emergency Calls should always be allowed to be made, even without credit, a valid SIM card, or even a SIM in the phone at all.

But this comes with a cost; when a user dials 911, the phones doesn’t (generally) send a call to sip:[email protected] like it would with any other dialled number, but rather the SIP INVITE is sent to urn:service:sos which will be routed to the PSAP by the E-CSCF. When a call comes through to these URNs they’re given top priority in the network

This is all well and good in a country where it doesn’t matter which emergency service you called, because all emergency calls route to a single PSAP, but in a country with multiple numbers, it’s really important when you call and ambulance, your call doesn’t get routed to animal control.

That means the phone has to look at what emergency number you’ve dialed, and map the URN it sends the call to to match what you’ve actually requested.

Recently we’ve been helping an operator in a country with a numbering plan like this, and we’ve been finding the limits of the standards here.
So let’s start by looking at what the standards state:

IMS Emergency Calling is governed by TS 103.479 which in turn delegates to IETF RFC 5031, but for the calling number to URN translation, it’s pretty quiet.

Let’s look at what RFC 5031 allows for URNs:

  • urn:service:sos.ambulance
  • urn:service:sos.animal-control
  • urn:service:sos.fire
  • urn:service:sos.gas
  • urn:service:sos.marine
  • urn:service:sos.mountain
  • urn:service:sos.physician
  • urn:service:sos.poison
  • urn:service:sos.police

The USIM’s Emergency Calling Codes EF would be the perfect source of this data; for each emergency calling code defined, you’ve got a flag to indicate what it’s for, here’s what we’ve got available on the SIM Card:

  • Bit 1 Police
  • Bit 2 Ambulance
  • Bit 3 Fire Brigade
  • Bit 4 Marine Guard
  • Bit 5 Mountain Rescue
  • Bit 6 manually initiated eCall
  • Bit 7 automatically initiated eCall
  • Bit 8 is spare and set to “0”

So these could be mapped pretty easily you’d think, so if the call is made to an Emergency Calling Code flagged with Bit 4, the URN would go to urn:service:sos.mountain.

Alas from our research, we’ve found most OEMs send calls to the generic urn:service:sos, regardless of the dialled number and the ECC flags that are set on the SIM for that number.

One of the big chip vendors sends calls to an ECC flagged as Ambulance to urn:service:sos.fire, which is totally infuriating, and we’ve had to put a rule in our E-CSCF to handle this if the User Agent is set to one of their phones.

Is there room for improvement here? For sure! Emergency calling is super important, and time is of the essence, while animal control can probably transfer you to an ambulance, an emergency is by very nature time sensitive, and any time wasted can lead to worse outcomes.

While carrier bundles from the OEMs can handle this, the global ability to take any phone, from any country and call an emergency number is so important, that relying on a country-by-country approach here won’t suffice.

What could we do as an industry to address this?

Acknowledging that not all countries have a single point of contact for emergency service, introducing a simple mechanism in the UE SIP message to indicate what number (Emergency Calling Code) the user actually dialled would be invaluable here.

URNs are important, but knowing the dialed number when it comes to PSAP routing, is so important – This wouldn’t even need to be its own SIP header, it could just be thrown into the Contact header as another parameter.

Highly developed markets are often the first to embrace new tech (for us this means VoLTE and VoNR), but this means that these issues seen by less developed markets won’t appear until long after the standard has been set in stone, and often countries like this aren’t at the table of the standards bodies to discuss such requirements.

This easy, reasonable update to the standard, has the potential to save lives, and next time this comes up in a working group I’ll be advocating for a change.

Playing back AMR streams from Packet Captures

The other day I found myself banging my head on the table to diagnose an issue with Ringback tone on an SS7 link and the IMS.

On the IMS side, no RBT was heard, but I could see the Media Gateway was sending RTP packets to the TAS, and the TAS was sending it to the UE, but was there actual content in the RTP packets or was it just silence?

If this was PCM / G711 we’d be able to just playback in Wireshark, but alas we can’t do this for the AMR codec.

Filter the RTP stream out in Wireshark

Inside Wireshark I filtered each of the audio streams in one direction (one for the A-Party audio and one for the B-Party audio)

Then I needed to save each of the streams as a separate PCAP file (Not PCAPng).

Turn into AMR File

With the audio stream for one direction saved, we can turn it into an AMR file, using Juan Noguera (Spinlogic)’s AMR Extractor tool.

Clone the Repo from git, and then in the same directory run:

python3 pcap_parser.py -i AMR_B_Leg.pcap -o AMR_B_Leg.3ga

Playback with VLC / Audacity

I was able to play the file with VLC, and load it into Audacity to easily see that yes, the Ringback Tone was present in the AMR stream!

VoLTE / IMS – Analysis Challenge

It’s challenge time, this time we’re going to be looking at an IMS PCAP, and answering some questions to test your IMS analysis chops!

Here’s the packet capture:

Easy Questions

  • What QCI value is used for the IMS bearer?
  • What is the registration expiry?
  • What is the E-UTRAN Cell ID the Subscriber is served by?
  • What is the AMBR of the IMS APN?

Intermediate Questions

  • Is this the first or subsequent registration?
  • What is the Integrity-Key for the registration?
  • What is the FQDN of the S-CSCF?
  • What Nonce value is used and what does it do?
  • What P-CSCF Addresses are returned?
  • What time would the UE need to re-register by in order to stay active?
  • What is the AA-Request in #476 doing?
  • Who is the(opens in a new tab)(opens in a new tab)(opens in a new tab) OEM of the handset?
  • What is the MSISDN associated with this user?

Hard Questions

  • What port is used for the ESP data?
  • Which encryption algorithm and algorithm is used?
  • How many packets are sent over the ESP tunnel to the UE?
  • Where should SIP SUBSCRIBE requests get routed?
  • What’s the model of phone?

The answers for each question are on the next page, let me know in the comments how you went, and if there’s any tricky ones!

Improving WiFi Calling quality for WiFi Operators

I had a question recently on LinkedIn regarding how to preference Voice over WiFi traffic so that a network engineer operating the WiFi network can ensure the best quality of experience for Voice over WiFi.

Voice over WiFi is underpinned by the ePDG – Evolved Packet Data Gateway (this is a fancy IPsec tunnel we authenticate to using the SIM to drop our traffic into the P-CSCF over an unsecured connection). To someone operating a WiFi network, the question is how do we prioritise the traffic to the ePDGs and profile it?

ePDGs can be easily discovered through a simple DNS lookup, once you know the Mobile Network Code and Mobile Country code of the operators you want to prioritise, you can find the IPs really easily.

ePDG addresses take the form epdg.epc.mncXXX.mccYYY.pub.3gppnetwork.org so let’s look at finding the IPs for each of these for the operators in a country:

The first step is nailing down the mobile network code and mobile country codes of the operators you want to target, Wikipedia is a great source for this information.
Here in Australia we have the Mobile Country Code 505 and the big 3 operators all support Voice over WiFi, so let’s look at how we’d find the IPs for each.
Telstra has mobile network code (MNC) 01, in 3GPP DNS we always pad network codes to 3 digits, so that’ll be 001, and the mobile country code (MCC) for Australia is 505.
So to find the IPs for Telstra we’d run an nslookup for epdg.epc.mnc001.mcc505.pub.3gppnetwork.org – The list of IPs that are returned, are the IPs you’ll see Voice over WiFi traffic going to, and the IPs you should provide higher priority to:

For the other big operators in Australia epdg.epc.mnc002.mcc505.pub.3gppnetwork.org will get you Optus and epdg.epc.mnc003.mcc505.pub.3gppnetwork.org will get you VHA.

The same rules apply in other countries, you’d just need to update the MNC/MCC to match the operators in your country, do an nslookup and prioritise those IPs.

Generally these IPs are pretty static, but there will need to be a certain level of maintenance required to keep this list up to date by rechecking.

Happy WiFi Calling!

CGrateS MySQL Rounding Error

I put my rates in with a stack of decimal points, because accuracy matters!

But when I manually calculated the outputted costs associated with each transaction, I seemed to have some rounding errors.

So what was the issue?

The schema in MySQL was set to DECIMAL(10,4) which gives us 10 digits after the decimal point and 4 digits after.

A quick alter table and a reimport of the rates and I was on my way!

"alter table tp_rates modify column  rate DECIMAL(10,10);"

Lesson learned and hopefully of use to any other CGrateS users who may be using MySQL as a StoreDB.

FreeSWITCH Bridge Timers 101

A cheat sheet for anyone trying to control FreeSWITCH bridge behaviour if you’re trying to move calls around if not answered / responded to:

originate_timeout

How long to wait for any response to from the remote peer (100 TRYING, 180 RINGING, etc).

This is useful for knowing when to give up and try a different peer as this peer is dead.

originate_retries

How many times to retransmit the INVITE if no 100 TRYING / 180 RINGING is received.

Like originate_timeout, this is handy for giving up sooner when a peer is dead and moving onto others.

progress_timeout

How long we wait between sending the SIP INVITE before we get a 180/183 before we give up.

This is handy to find out if the remote end isn’t able to reach the endpoint you’re after (page the UE in a cellular context).

bridge_answer_timeout

How long do we wait between the INVITE and a 200 OK (Including RINGING) – This is useful for “no answer” timeouts.

If you want to know why a bridge failed, ie no answer timeout reached, error on the remote end, etc, we can see why with the following variable:

variable_last_bridge_hangup_cause: [PROGRESS_TIMEOUT]

Which will allow you to tell if it’s no answer or progress timeout to blame.

FreeSWITCH – Keep Call-ID the same on both legs of a Bridged Call

I needed to have both legs of the B2BUA bridge call through FreeSWITCH using the same Call-ID (long story), and went down the rabbit hole of looking for how to do this.

A post from 15 years ago on the mailing list from Anthony Minessale said he added “sip_outgoing_call_id” variable for this, and I found the commit, but it doesn’t work – More digging shows this variable disappears somewhere in history.

But by looking at what it changed I found sip_invite_call_id does the same thing now, so if you want to make both legs use the same Call-ID here ya go:

<action application="set" data="sip_invite_call_id=mycustomcallid"/>

Yeah, this post probably could’ve been a Tweet….

RTPengine – Installation & Configuration (Ubuntu 20.04 / 22.04)

I wrote a post a few years back covering installing RTPengine on Ubuntu (14.04 / 18.04) but it doesn’t apply in later Ubuntu releases such as 20.04 and 22.04.

To make everyone’s lives easier; David Lublink publishes premade repos for Ubuntu Jammy (22.04) & Focal (20.04).

Note: It looks like Ubuntu 23.04 includes RTPengine in the standard repos, so this won’t be needed in the future.

sudo add-apt-repository ppa:davidlublink/rtpengine
sudo apt update
sudo apt-get install ngcp-rtpengine

The Ambient Capabilities in the systemctl file bit me,

Commenting out :

#AmbientCapabilities=CAP_NET_ADMIN CAP_SYS_NICE

In /lib/systemd/system/ngcp-rtpengine-daemon.service and then reloading the service and restarting and I was off and running:

systemctl daemon-reload
systemctl restart rtpengine

Getting it Running

Now we’ve got RTPengine installed let’s setup the basics,

There’s an example config file we’ll copy and edit:

vi /etc/rtpengine/rtpengine.conf

We’ll uncomment the interface line and set the IP to the IP we’ll be listening on:

Once we’ve set this to our IP we can start the service:

systemctl restart rtpengine

All going well it’ll start and rtpengine will be running.

You can learn about all the startup parameters and what everything in the config means in the readme.

Want more RTP info?

If you want to integrate RTPengine with Kamailio take a look at my post on how to set up RTPengine with Kamailio.

For more in-depth info on the workings of RTP check out my post RTP – More than you wanted to Know

Authenticating Fixed Line Subscribers into IMS

We recently added support in PyHSS for fixed line SIP subscribers to attach to the IMS.

Traditional telecom operators are finding their fixed line network to be a bit of a money pit, something they’re required to keep operating to meet regulatory obligations, but the switches are sitting idle 99% of the time. As such we’re seeing more and more operators move fixed line subs onto their IMS.

This new feature means we can use PyHSS to serve as the brains for a fixed network, as well as for mobile, but there’s one catch – How we authenticate subscribers changes.

Most banks of line cards in a legacy telecom switches, or IP Phones, don’t have SIM slots to allow us to authenticate, so instead we’re forced to fallback to what they do support.

Unfortunately for the most part, what is supported by these IP phones or telecom switches is SIP MD5 Digest Authentication.

The Nonce is generated by the HSS and put into the Multimedia-Authentication-Answer, along with the subscriber’s password and sent in the clear to the S-CSCF.

Subscriber with Password made up of all 1's MAA response from HSS for Digest-MD5 Auth

The HSS then generates the the Multimedia-Auth Answer, it generates a nonce (in the 3GPP-SIP-Authenticate / 609 AVP) and sends the Subscriber’s password in the 3GPP-SIP-Authorization (610) AVP in response back to the S-CSCF.

I would have thought a better option would be for the HSS to generate the Nonce and Digest, and then the S-CSCF to just send the Nonce to the Sub and compare the returned Digest from the Sub against the expected Digest from the HSS, but it would limit flexibility (realm adaptation, etc) I guess.

The UE/UA (I guess it’s a UA in this context as it’s not a mobile) then generates its own Digest from the Nonce and sends it back to the S-CSCF via the P-CSCF.

The S-CSCF compares the received Digest response against the one it generated, and if the two match, the sub is authenticated and allowed to attach onto the network.

IMS iFC – SPT Session Cases

Mostly just reference material for me:

Possible values:

  • 0 (ORIGINATING_SESSION)
  • 1 TERMINATING_REGISTERED
  • 2 (TERMINATING_UNREGISTERED)
  • 3 (ORIGINATING_UNREGISTERED

In the past I had my iFCs setup to look for the P-Access-Network-Info header to know if the call was coming from the IMS, but it wasn’t foolproof – Fixed line IMS subs didn’t have this header.

            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>0</Group>
                    <Method>INVITE</Method>
                    <Extension></Extension>
                </SPT>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>1</Group>
                    <SIPHeader>
                      <Header>P-Access-Network-Info</Header>
                    </SIPHeader>
                </SPT>                
            </TriggerPoint>

But now I’m using the Session Cases to know if the call is coming from a registered IMS user:

        <!-- SIP INVITE Traffic from Registered Sub-->
        <InitialFilterCriteria>
            <Priority>30</Priority>
            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>0</Group>
                    <Method>INVITE</Method>
                    <Extension></Extension>
                </SPT>
                <SPT>
                    <Group>0</Group>
                    <SessionCase>0</SessionCase>
                </SPT>             
            </TriggerPoint>

Using fs_cli and ESL

If you work with FreeSWITCH there’s a good chance every time you do, you run fs_cli and attempt to read the firehose of data shown when making a call to make sense of what’s going on and why what you’re trying to do isn’t working.

But, if you are also using the Event Socket Language service built into FreeSWITCH (Which you totally should) either for programming FreeSWITCH behaviour, or for realtime charging with CGrateS, then you’ll find that fs_cli will fail to connect.

That’s because we’ve edited the event_socket.conf.xml file, and fs_cli uses the event socket to connect to FreeSWITCH as well.

But there’s a simple fix,

Create a new file in /etc/fs_cli.conf and populate it with the info needed to connect to your ESL session you defined in event_socket.conf.xml, so if this is is your

<configuration name="event_socket.conf" description="Socket Client">
  <settings>
    <param name="nat-map" value="false"/>
    <param name="listen-ip" value="10.98.0.76"/>
    <param name="listen-port" value="8021"/>
    <param name="password" value="mysupersecretpassword"/>
    <param name="apply-inbound-acl" value="NickACL"/>
  </settings>
</configuration>

Then your /etc/fs_cli.conf should look like:

[default]
; Put me in /etc/fs_cli.conf or ~/.fs_cli_conf
;overide any default options here
loglevel => 6
log-uuid => false


host     => 10.98.0.76
port     => 8021
password => mysupersecretpassword
debug    => 7

And that’s it, now you can run fs_cli and connect to the terminal once more!

SQN Sync in IMS Auth

So the issue was a head scratcher.

Everything was working on the IMS, then I go to bed, the next morning I fire up the test device and it just won’t authenticate to the IMS – The S-CSCF generated a 401 in response to the REGISTER, but the next REGISTER wouldn’t pass.

Wireshark just shows me this loop:

UE -> IMS: REGISTER
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)

So what’s going on here?

IMS uses AKAv1-MD5 for Authentication, this is slightly different to the standard AKA auth used in cellular, but if you’re curious, we’ve covered by IMS Authentication and standard AKA based SIM Authentication in cellular networks before.

When we generate the vectors (for IMS auth and standard auth) one of the inputs to generate the vectors is the Sequence Number or SQN.

This SQN ticks over like an odometer for the number of times the SIM / HSS authentication process has been performed.

There is some leeway in the SQN – It may not always match between the SIM and the HSS and that’s to be expected.
When the MME sends an Authentication-Information-Request it can ask for multiple vectors so it’s got some in reserve for the next time the subscriber attaches, and that’s allowed.

Information stored on USIM / SIM Card for LTE / EUTRAN / EPC - K key, OP/OPc key and SQN Sequence Number

But there are limits to how far out our SQN can be, and for good reason – One of the key purposes for the SQN is to protect against replay attacks, where the same vector is replayed to the UE. So the SQN on the HSS can be ahead of the SIM (within reason), but it can’t be behind – Odometers don’t go backwards.

So the issue was with the SQN on the SIM being out of Sync with the SQN in the IMS, how do we know this is the case, and how do we fix this?

Well there is a resync mechanism so the SIM can securely tell the HSS what the current SQN it is using, so the HSS can update it’s SQN.

When verifying the AUTN, the client may detect that the sequence numbers between the client and the server have fallen out of sync.
In this case, the client produces a synchronization parameter AUTS, using the shared secret K and the client sequence number SQN.
The AUTS parameter is delivered to the network in the authentication response, and the authentication can be tried again based on authentication vectors generated with the synchronized sequence number.

RFC 3110: HTTP Digest Authentication using AKA

In our example we can tell the sub is out of sync as in our Multimedia Authentication Request we see the SIP-Authorization AVP, which contains the AUTS (client synchronization parameter) which the SIM generated and the UE sent back to the S-CSCF. Our HSS can use the AUTS value to determine the correct SQN.

SIP-Authorization AVP in the Multimedia Authentication Request means the SQN is out of Sync and this AVP contains the RAND and AUTN required to Resync

Note: The SIP-Authorization AVP actually contains both the RAND and the AUTN concatenated together, so in the above example the first 32 bytes are the AUTN value, and the last 32 bytes are the RAND value.

So the HSS gets the AUTS and from it is able to calculate the correct SQN to use.

Then the HSS just generates a new Multimedia Authentication Answer with a new vector using the correct SQN, sends it back to the IMS and presto, the UE can respond to the challenge normally.

This feature is now fully implemented in PyHSS for anyone wanting to have a play with it and see how it all works.

And that friends, is how we do SQN resync in IMS!

HOMER API in Python

We’re doing more and more network automation, and something that came up as valuable to us would be to have all the IPs in HOMER SIP Capture come up as the hostnames of the VM running the service.

Luckily for us HOMER has an API for this ready to roll, and best of all, it’s Swagger based and easily documented (awesome!).

(Probably through my own failure to properly RTFM) I was struggling to work out the correct (current) way to Authenticate against the API service using a username and password.

Because the HOMER team are awesome however, the web UI for HOMER, is just an API client.

This means to look at how to log into the API, I just needed to fire up Wireshark, log into the Web UI via my browser and then flick through the packets for a real world example of how to do this.

Homer Login JSON body as seen by Wireshark

In the Login action I could see the browser posts a JSON body with the username and password to /api/v3/auth

{"username":"admin","password":"sipcapture","type":"internal"}

And in return the Homer API Server responds with a 201 Created an a auth token back:

Now in order to use the API we just need to include that token in our Authorization: header then we can hit all the API endpoints we want!

For me, the goal we were setting out to achieve was to setup the aliases from our automatically populated list of hosts. So using the info above I setup a simple Python script with Requests to achieve this:

import requests
s = requests.Session()

#Login and get Token
url = 'http://homer:9080/api/v3/auth'
json_data = {"username":"admin","password":"sipcapture"}
x = s.post(url, json = json_data)
print(x.content)
token = x.json()['token']
print("Token is: " + str(token))


#Add new Alias
alias_json = {
          "alias": "Blog Example",
          "captureID": "0",
          "id": 0,
          "ip": "1.2.3.4",
          "mask": 32,
          "port": 5060,
          "status": True
        }

x = s.post('http://homer:9080/api/v3/alias', json = alias_json, headers={'Authorization': 'Bearer ' + token})
print(x.status_code)
print(x.content)


#Print all Aliases
x = s.get('http://homer:9080/api/v3/alias', headers={'Authorization': 'Bearer ' + token})
print(x.json())

And bingo we’re done, a new alias defined.

We wrapped this up in a for loop for each of the hosts / subnets we use and hooked it into our build system and away we go!

With the Homer API the world is your oyster in terms of functionality, all the features of the Web UI are exposed on the API as the Web UI just uses the API (something I wish was more common!).

Using the Swagger based API docs you can see examples of how to achieve everything you need to, and if you ever get stuck, just fire up Wireshark and do it in the Homer WebUI for an example of how the bodies should look.

Thanks to the Homer team at QXIP for making such a great product!

Failures in cobbling together a USSD Gateway

One day recently I was messing with the XCAP server, trying to set the Call Forward timeout. In the process I triggered the UE to send a USSD request to the IMS.

Huh, I thought, “I wonder how hard it would be to build a USSD Gateway for our IMS?”, and this my friends, is the story of how I wasted a good chunk of my weekend trying (and failing) to add support for USSD.

You might be asking “Who still uses USSD?” – The use cases for USSD are pretty thin on the ground in this day and age, but I guess balance query, and uh…

But this is the story of what I tried before giving up and going outside…

Routing

First I’d need to get the USSD traffic towards the USSD Gateway, this means modifying iFCs. Skimming over the spec I can see the Recv-Info: header for USSD traffic should be set to “g.3gpp.ussd” so I knocked up an iFC to match that, and route the traffic to my dev USSD Gateway, and added it to the subscriber profile in PyHSS:

  <!-- SIP USSD Traffic to USSD-GW-->
        <InitialFilterCriteria>
            <Priority>25</Priority>
            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>1</Group>
                    <SIPHeader>
                      <Header>Recv-Info</Header>
                      <Content>"g.3gpp.ussd"</Content>
                    </SIPHeader>
                </SPT>                
            </TriggerPoint>
            <ApplicationServer>
                <ServerName>sip:ussdgw:5060</ServerName>
                <DefaultHandling>0</DefaultHandling>
            </ApplicationServer>
        </InitialFilterCriteria>

Easy peasy, now we have the USSD requests hitting our USSD Gateway.

The Response

I’ll admit that I didn’t jump straight to the TS doc from the start.

The first place I headed was Google to see if I could find any PCAPs of USSD over IMS/SIP.

And I did – Restcomm seems to have had a USSD product a few years back, and trawling around their stuff provided some reference PCAPs of USSD over SIP.

So the flow seemed pretty simple, SIP INVITE to set up the session, SIP INFO for in-dialog responses and a BYE at the end.

With all the USSD guts transferred as XML bodies, in a way that’s pretty easy to understand.

Being a Kamailio fan, that’s the first place I started, but quickly realised that SIP proxies, aren’t great at acting as the UAS.

So I needed to generate in-dialog SIP INFO messages, so I turned to the UAC module to generate the SIP INFO response.

My Kamailio code is super simple, but let’s have a look:

request_route {

        xlog("Request $rm from $fU");

        if(is_method("INVITE")){
                xlog("USSD from $fU to $rU (Emergency number) CSeq is $cs ");
                sl_reply("200", "OK Trying USSD Phase 1");      #Generate 200 OK
                route("USSD_Response"); #Call USSD_Response route block
                exit;
        }
}

route["USSD_Response"]{
        xlog("USSD_Response Route");
        #Generate a new UAC Request
        $uac_req(method)="INFO";
        $uac_req(ruri)=$fu;     #Copy From URI to Request URI
        $uac_req(furi)=$tu;     #Copy To URI to From URI
        $uac_req(turi)=$fu;     #Copy From URI to To URI
        $uac_req(callid)=$ci;   #Copy Call-ID
                                #Set Content Type to 3GPP USSD
        $uac_req(hdrs)=$uac_req(hdrs) + "Content-Type: application/vnd.3gpp.ussd+xml\r\n";
                                #Set the USSD XML Response body
        $uac_req(body)="<?xml version='1.0' encoding='UTF-8'?>
        <ussd-data>
                <language value=\"en\"/>
                <ussd-string value=\"Bienvenido. Seleccione una opcion: 1 o 2.\"/>
        </ussd-data>";
        $uac_req(evroute)=1;    #Set the event route to use on return replies
        uac_req_send();         #Send it!
}

So the UAC module generates the 200 OK and sends it back.

“That was quick” I told myself, patting myself on the back before trying it out for the first time.

Huston, we have a problem – Although the Call-ID is the same, it’s not an in-dialog response as the tags aren’t present, this means our UE send back a 405 to the SIP INFO.

Right. Perhaps this is the time to read the Spec…

Okay, so the SIP INFO needs to be in dialog. Can we do that with the UAC module? Perhaps not…

But the Transaction Module ™ in Kamailio exposes and option on the ctl API to generate an in-dialog UAC – this could be perfect…

But alas real life came back to rear its ugly head, and this adventure will have to continue another day…

Update: Thanks to a kindly provided PCAP I now know what I was doing wrong, and so we’ll soon have a follow up to this post named “Successes in cobbling together a USSD Gateway” just as soon as I have a weekend free.

Kamailio Bytes: Adding Prometheus + Grafana to Kamailio

I recently fell in love with the Prometheus + Grafana combo, and I’m including it in as much of my workflow as possible, so today we’ll be integrating this with another favorite – Kamailio.

Why would we want to integrate Kamailio into Prometheus + Grafana? Observability, monitoring, alerting, cool dashboards to make it look like you’re doing complicated stuff, this duo have it all!

I’m going to assume some level of familiarity with Prometheus here, and at least a basic level of understanding of Kamailio (if you’ve never worked with Kamailio before, check out my Kamailio 101 Series, then jump back here).

So what will we achieve today?

We’ll start with the simple SIP Registrar in Kamailio from this post, and we’ll add on the xhttp_prom module, and use it to expose some stats on the rate of requests, and responses sent to those requests.

So to get started we’ll need to load some extra modules, xhttp_prom module requires xhttp (If you’d like to learn the basics of xhttp there’s also a Kamailio Bytes – xHTTP Module post covering the basics) so we’ll load both.

xHTTP also has some extra requirements to load, so in the top of our config we’ll explicitly specify what ports we want to bind to, and set two parameters that control how Kamailio handles HTTP requests (otherwise you’ll not get responses for HTTP GET requests).

listen=tcp:0.0.0.0:9090
listen=tcp:0.0.0.0:5060
listen=udp:0.0.0.0:5060

http_reply_parse=yes
tcp_accept_no_cl=yes

Then where you load all your modules we’ll load xhttp and xhttp_prom, and set the basic parameters:

loadmodule "xhttp.so"
loadmodule "xhttp_prom.so"

# Define two counters and a gauge
modparam("xhttp_prom", "xhttp_prom_stats", "all")

By setting xhttp_prom module to expose all stats, this exposes all of Kamailio’s internal stats as counters to Prometheus – This means we don’t need to define all our own counters / histograms / gauges, instead we can use the built in ones from Kamailio. Of course we can define our own custom ones, but we’ll do that in our next post.

Lastly we’ll need to add an event route to handle HTTP requests to the /metrics URL:

event_route[xhttp:request] {
	xlog("Got a request!");
	xlog("$ru");
	$var(xhttp_prom_root) = $(hu{s.substr,0,8});
	if ($var(xhttp_prom_root) == "/metrics") {
			xlog("Called metrics");
			prom_dispatch();
			xlog("prom_dispatch() called");
			return;
	} else
		xhttp_reply("200", "OK", "text/html",
        		"<html><body>Wrong URL $hu</body></html>");
}

Restart, and browse to the IP of your Kamailio instance (mine is 10.01.23) port 9090 /metrics and you’ll get something like this:

Kamailio metrics endpoing used by Prometheus

That my friends, is the sort of data that Prometheus gobbles up, so let’s point Prometheus at it and see what data we get back.

Over on my Prometheus server I’ve edited /etc/prometheus/prometheus.yml to target our new Prometheus endpoint.

  - job_name: "kamailo"
    static_configs:
      - targets: ["10.0.1.23:9090"]  
    honor_timestamps: false

So how can we see this data? Well first off if we log into Prometheus we can see the data flowing in:

If we throw some SIP REGISTER traffic at our Kamailio instance and check on the kamailio_registrar_accepted_regs stat we can see our registrations.

After a few clicks in Grafana we can run some graphs for this data too:

So that’s it, Kamailio’s core stats are now exposed to Prometheus, and we can render this information in Grafana.

There’s a copy of the full code used here available in the Github, and in our next post we’ll look at defining our own metrics in Kamailio and then interacting with them.

Kamailio Diameter Routing Agent Support

Recently I’ve been working on open source Diameter Routing Agent implementations (See my posts on FreeDiameter).

With the hurdles to getting a DRA working with open source software covered, the next step was to get all my Diameter traffic routed via the DRAs, however I soon rediscovered a Kamailio limitation regarding support for Diameter Routing Agents.

You see, when Kamailio’s C Diameter Peer module makes a decision as to where to route a request, it looks for the active Diameter peers, and finds a peer with the suitable Vendor and Application IDs in the supported Applications for the Application needed.

Unfortunately, a DRA typically only advertises support for one application – Relay.

This means if you have everything connected via a DRA, Kamailio’s CDP module doesn’t see the Application / Vendor ID for the Diameter application on the DRA, and doesn’t route the traffic to the DRA.

The fix for this was twofold, the first step was to add some logic into Kamailio to determine if the Relay application was advertised in the Capabilities Exchange Request / Answer of the Diameter Peer.

I added the logic to do this and exposed this so you can see if the peer supports Diameter relay when you run “cdp.list_peers”.

With that out of the way, next step was to update the routing logic to not just reject the candidate peer if the Application / Vendor ID for the required application was missing, but to evaluate if the peer supports Diameter Relay, and if it does, keep it in the game.

I added this functionality, and now I’m able to use CDP Peers in Kamailio to allow my P-CSCF, S-CSCF and I-CSCF to route their traffic via a Diameter Routing Agent.

I’ve got a branch with the changes here and will submit a PR to get it hopefully merged into mainline soon.

Diameter Routing Agents (Why you need them, and how to build them) – Part 2 – Routing

What I typically refer to as Diameter interfaces / reference points, such as S6a, Sh, Sx, Sy, Gx, Gy, Zh, etc, etc, are also known as Applications.

Diameter Application Support

If you look inside the Capabilities Exchange Request / Answer dialog, what you’ll see is each side advertising the Applications (interfaces) that they support, each one being identified by an Application ID.

CER showing support for the 3GPP Zh Application-ID (Interface)

If two peers share a common Application-Id, then they can communicate using that Application / Interface.

For example, the above screenshot shows a peer with support for the Zh Interface (Spoiler alert, XCAP Gateway / BSF coming soon!). If two Diameter peers both have support for the Zh interface, then they can use that to send requests / responses to each other.

This is the basis of Diameter Routing.

Diameter Routing Tables

Like any router, our DRA needs to have logic to select which peer to route each message to.

For each Diameter connection to our DRA, it will build up a Diameter Routing table, with information on each peer, including the realm and applications it advertises support for.

Then, based on the logic defined in the DRA to select which Diameter peer to route each request to.

In its simplest form, Diameter routing is based on a few things:

  1. Look at the DestinationRealm, and see if we have any peers at that realm
  2. If we do then look at the DestinationHost, if that’s set, and the host is connected, and if it supports the specified Application-Id, then route it to that host
  3. If no DestinationHost is specified, look at the peers we have available and find the one that supports the specified Application-Id, then route it to that host
Simplified Diameter Routing Table used by DRAs

With this in mind, we can go back to looking at how our DRA may route a request from a connected MME towards an HSS.

Let’s look at some examples of this at play.

The request from MME02 is for DestinationRealm mnc001.mcc001.3gppnetwork.org, which our DRA knows it has 4 connected peers in (3 if we exclude the source of the request, as we don’t want to route it back to itself of course).

So we have 3 contenders still for who could get the request, but wait! We have a DestinationHost specified, so the DRA confirms the host is available, and that it supports the requested ApplicationId and routes it to HSS02.

So just because we are going through a DRA does not mean we can’t specific which destination host we need, just like we would if we had a direct link between each Diameter peer.

Conversely, if we sent another S6a request from MME01 but with no DestinationHost set, let’s see how that would look.

Again, the request is from MME02 is for DestinationRealm mnc001.mcc001.3gppnetwork.org, which our DRA knows it has 3 other peers it could route this to. But only two of those peers support the S6a Application, so the request would be split between the two peers evenly.

Clever Routing with DRAs

So with our DRA in place we can simplify the network, we don’t need to build peer links between every Diameter device to every other, but let’s look at some other ways DRAs can help us.

Load Control

We may want to always send requests to HSS01 and only use HSS02 if HSS01 is not available, we can do this with a DRA.

Or we may want to split load 75% on one HSS and 25% on the other.

Both are great use cases for a DRA.

Routing based on Username

We may want to route requests in the DRA based on other factors, such as the IMSI.

Our IMSIs may start with 001010001xxx, but if we introduced an MVNO with IMSIs starting with 001010002xxx, we’d need to know to route all traffic where the IMSI belongs to the home network to the home network HSS, and all the MVNO IMSI traffic to the MVNO’s HSS, and DRAs handle this.

Inter-Realm Routing

One of the main use cases you’ll see for DRAs is in Roaming scenarios.

For example, if we have a roaming agreement with a subscriber who’s IMSIs start with 90170, we can route all the traffic for their subs towards their HSS.

But wait, their Realm will be mnc901.mcc070.3gppnetwork.org, so in that scenario we’ll need to add a rule to route the request to a different realm.

DRAs handle this also.

In our next post we’ll start actually setting up a DRA with a default route table, and then look at some more advanced options for Diameter routing like we’ve just discussed.

One slight caveat, is that mutual support does not always mean what you may expect.
For example an MME and an HSS both support S6a, which is identified by Auth-Application-Id 16777251 (Vendor ID 10415), but one is a client and one is a server.
Keep this in mind!