The current Session Initiation Protocol specification is a product of the 21st century: RFC 3261 is dated June 2002. By that time, network and Internet security was nothing new. Still, this is the year 2011, and most phones and servers lack important security features. The problem is not a lack of security technology or solution proposals. It’s a lack of requirements from the market, and two paradigms that yet again meet – the netheads and the bellheads. In this article, I will try to give you an overview of the issues at hand without going too far into the technical details.

RFC 3261 security – TLS and S/MIME

In the original RFC, two security solutions were applied to the SIP protocol: TLS and S/MIME. TLS was used to give confidentiality to signalling and some level of identity checking between clients and servers, not end-to-end but hop by hop. S/MIME was proposed to get some level of end-to-end identity assurance as well as confidentiality. I was surprised to meet S/MIME again, since most people I’ve worked with consider it a total failure for e-mail security. The possibility of it going anywhere in the SIP world seemed minimal at best. So far I haven’t seen any implementations of S/MIME for SIP in production use. SIPit reports have indicated that there are some new implementations, so something is happening, but not much.

The complexity of implementing and managing a PKI is one of the big hurdles for both S/MIME and TLS. The TLS part has been worked on for a long time. There have been a number of clarifications and corrections to the original RFC. I’ve been focusing a lot on TLS testing at the SIPit events I’ve participated in. There have been many issues with the way phones handle certificates. If the certificates are not verified, there’s no use in setting up a TLS session at all. If you don’t know who you are exchanging keys with, you have no idea whether this is a secure session or not.
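To make the verification point concrete, here is a minimal sketch using Python’s standard ssl module. It is not taken from any phone’s firmware; the function names and the idea of passing in a CA file are my own assumptions for illustration. Port 5061 is the usual port for SIP over TLS. The first function builds a context that insists on a valid certificate chain and a matching host name. The second shows the configuration that makes a TLS session pointless.

```python
import socket
import ssl


def make_verifying_context(ca_file=None):
    """Client-side TLS context that refuses servers it cannot verify.

    ca_file is an optional path to trusted CA certificates (an assumption
    for this sketch); with None, the platform's default trust store is used.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.check_hostname = True            # certificate must match the server name
    ctx.verify_mode = ssl.CERT_REQUIRED  # certificate chain must validate
    return ctx


def make_non_verifying_context():
    """What NOT to do: encryption towards an unknown peer."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False           # has to be disabled before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE      # any certificate is accepted
    return ctx


def open_sips_connection(host, port=5061, ca_file=None):
    """Open a verified TLS connection to a SIP server (hypothetical helper)."""
    ctx = make_verifying_context(ca_file)
    raw = socket.create_connection((host, port), timeout=5)
    try:
        # The handshake fails with ssl.SSLCertVerificationError if the
        # certificate is untrusted or issued for a different host name.
        return ctx.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

A call such as open_sips_connection("sip.example.com") (a placeholder host name) either returns a socket to a server whose identity has been checked, or raises an error. With the non-verifying context the handshake succeeds against anyone, including a man in the middle.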

Now, if you do succeed in verification – what does this mean for the person using the phone? Should you really show the padlock on the screen? The user interface needs to be discussed much more. TLS only shows that you have a secure signalling path to the FIRST server you connect to. What happens after that, on the other side of that server, is out of your control. That’s why the authors of RFC 3261 added S/MIME, which is supposed to give end-to-end security for parts of the signalling between end points.

Media security – SRTP and the key exchange mess

When I ask people what a “secure call” really is, most people start with confidentiality – that no one should be able to listen in on the call. For that, we have the secure RTP protocol (SRTP), a way to encrypt a media stream between two endpoints. There are a number of issues here. How do we exchange encryption keys securely, so no one else can get access to them? And who is the other endpoint? Is that the device of the person you are calling or a device operated by a third party? And if you exchange keys in a secure way, what about lawful intercept?

There are a number of proposals on how to exchange keys for setting up an encrypted media stream in a SIP session. What’s implemented in most of the phones on the market? Sending keys in clear text in the session description for the media stream, possibly protected by TLS on each signalling hop. All intermediate servers in the call path will have access to the encryption keys. Add that most phones don’t really check the TLS certificate of the server and you will realize how secure these solutions are.
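One common form of this kind of key exchange is SDP Security Descriptions (RFC 4568), where the SRTP master key and salt travel base64-encoded in an a=crypto attribute of the session description. The sketch below, which assumes that mechanism, shows how little work it takes for any server that handles the SIP message to pull the key out. The SDP body is made up for illustration (documentation addresses, and the sample key string from the RFC’s own examples), and extract_srtp_keys is a hypothetical helper name.

```python
import base64

# Illustrative SDP offer; the addresses are documentation placeholders and
# the inline key is the sample value used in the RFC 4568 examples.
SDP_OFFER = """\
v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/SAVP 0
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:WVNfX19zZW1jdGwgKCkgewkyMjA7fQp9CnVubGVz|2^20|1:4
"""


def extract_srtp_keys(sdp):
    """Return (tag, crypto_suite, key_and_salt_bytes) for each a=crypto line."""
    found = []
    prefix = "a=crypto:"
    for line in sdp.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split()
        if len(fields) < 3:
            continue  # malformed attribute, skip it
        tag, suite, key_params = fields[0], fields[1], fields[2]
        method, _, key_info = key_params.partition(":")
        if method != "inline":
            continue
        # key_info is "base64(key||salt)|lifetime|MKI"; only the first part is secret.
        key_salt_b64 = key_info.split("|")[0]
        found.append((int(tag), suite, base64.b64decode(key_salt_b64)))
    return found


if __name__ == "__main__":
    for tag, suite, key_salt in extract_srtp_keys(SDP_OFFER):
        print(tag, suite, len(key_salt), "bytes of key material")
```

Every proxy on the path sees the SDP body after TLS has been terminated on its hop, so every proxy can do exactly this. For this crypto suite the decoded value is a 16-byte master key followed by a 14-byte salt.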

Netheads vs bellheads – different security paradigms

When working with network security, one quickly realizes that there are at least two very separate views on security – and it becomes very clear when working with telephony over networks. In the telco world, a VPN, virtual private network, is a network where the telco separates one customer’s traffic from that of other customers. The customer trusts that the telco does this correctly, and because of this trust in the operator’s services, the customer considers this a secure network.

In the datacom network world, a VPN is an encrypted network, managed by the customer. If someone else gets access to the encryption keys, it’s no longer considered a secure network. The service provider basically provides an insecure packet-based connection, and the customer secures the connections that need security.

In the SIP world, you can see from a number of proposals that these two worlds are meeting. Some of the security proposals have an architecture where a third party, the service provider, manages security. This is the model imposed on telcos by laws and regulation, and it’s reflected in the technical solutions. Other proposals tend to distribute security management to the end points, following the classical Internet and IETF security model. I think these conflicting views are one of the reasons why we have so many different proposals that don’t give ONE picture of how to manage security. This leads to a market where developers and customers just get confused and we have no working solutions to offer, and no interoperability between vendors.

DNSSEC coming into action – DANE

There is interesting work going on in an IETF working group called DANE, which targets one of the largest issues with TLS: the PKI overhead. With DNSSEC, we get a trusted directory service that we can use for inter-domain PKI assurance. If I want to set up a secure SIP call to your domain, I can look up a specific record in your DNS to verify the certificate your server presents. This record is signed by you in your DNS zone. Used this way, the DNSSEC platform simplifies all TLS and certificate-based solutions. I really look forward to the outcome of this working group and believe it will mean a lot for secure calls within an open federation, like the Internet. And just by having secure identification of end points, a lot of spam (called SPIT, spam over Internet telephony, in the SIP world) can be avoided.
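As a rough sketch of what such a check could look like, the code below assumes the TLSA record layout that came out of the DANE work (RFC 6698): the record lives at a name built from port, protocol and host, and carries a selector, a matching type and the certificate association data. The function names are my own, the DNS lookup and the DNSSEC validation are left out entirely, and only selector 0 (the full certificate) is handled. How this maps onto SIP’s own server discovery is not covered here.

```python
import hashlib
import hmac


def tlsa_owner_name(host, port=5061, proto="tcp"):
    """DNS name where the TLSA record for a service is published."""
    return f"_{port}._{proto}.{host.rstrip('.')}."


def tlsa_matches(cert_der, selector, matching_type, assoc_data):
    """Compare a server certificate (DER bytes) with TLSA association data.

    selector 0 = full certificate; selector 1 (public key only) would need
    an ASN.1 parser and is deliberately not implemented in this sketch.
    matching_type 0 = exact bytes, 1 = SHA-256, 2 = SHA-512.
    """
    if selector != 0:
        raise NotImplementedError("only selector 0 is handled in this sketch")
    if matching_type == 0:
        candidate = cert_der
    elif matching_type == 1:
        candidate = hashlib.sha256(cert_der).digest()
    elif matching_type == 2:
        candidate = hashlib.sha512(cert_der).digest()
    else:
        raise ValueError(f"unknown matching type: {matching_type}")
    # Constant-time comparison of the two byte strings.
    return hmac.compare_digest(candidate, assoc_data)


if __name__ == "__main__":
    # Stand-in bytes, NOT a real certificate; a phone would use the DER
    # certificate received in the TLS handshake.
    presented = b"stand-in for a DER-encoded certificate"
    published = hashlib.sha256(presented).digest()  # what the zone owner signs
    print(tlsa_owner_name("sip.example.com"))
    print(tlsa_matches(presented, 0, 1, published))
```

The point of the design is that the answer is only trustworthy if the DNS response itself was DNSSEC-validated. Without that, an attacker who can spoof DNS can publish a record that matches their own certificate.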

Security for realtime communication – do customers care?

The XMPP world struggles with the same issues as the SIP world, but the core XMPP service is not regulated the way the core SIP service – telephony – is. The same core questions apply to XMPP/Jabber as well as SIP:

  • What is a secure session? Chat, phone call, instant message, application sharing
  • Is there a need for secure identities?
  • Is there a need for encryption?
  • Do we need a PKI?
  • How do we indicate security to the users? Does the WWW padlock mean anything in this context, if so – what?

In the end, there must be a reason for the lack of focus on these topics. How many people use secure e-mail? Are we worrying about issues that our users don’t care about? I think it is high time for a reality check here. Is the lack of requirements for security in customer projects a lack of knowledge and understanding of the issues? Or is it actually an indication that this kind of security for phone calls, presence and chat is not needed?