Die SIP Die

by Colin Berkshire
If you have installed VOIP systems you know that the bane of their existence is the miserable SIP protocol.
One-way connections are almost the norm. Zero-way connections abound. And, it takes either a genius or a magician to make SIP work in complex networks with VPN.
There are two big problems with SIP:
  1. It has too many useless options and settings.
  2. Media uses RTP and there is no reliable way of ensuring that RTP is working.
It is often pointed out that SIP is simply a protocol for setting up calls, not for transporting them. This explanation always comes up when we’re having one-way audio problems. When I hear this I just glare at the person and say: “And, why is that a good thing when it causes so many problems?”
Take DTMF signaling. Do you want it in-band, out-of-band using RFC 2833, as SIP INFO messages, or perhaps some combination of these that might cause doubled-up digits?
How can something as simple and as basic as transmitting digits have so much unnecessary complexity? Never mind that in-band DTMF tones don’t even work over many codecs, such as G.729.
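To make the out-of-band option concrete, here is a minimal sketch of the four-byte "telephone-event" payload that carries a digit inside RTP, following the layout in RFC 4733 (the successor to RFC 2833). The function names and the default volume are my own choices for illustration, and this covers only the payload, not the RTP header or the retransmission rules around it.

```python
import struct

# Event codes for telephone-event payloads: digits 0-9 map to 0-9,
# '*' is 10, '#' is 11, and A-D are 12-15.
DTMF_EVENTS = {str(d): d for d in range(10)}
DTMF_EVENTS.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})
DTMF_DIGITS = {code: digit for digit, code in DTMF_EVENTS.items()}


def encode_dtmf_event(digit, duration, volume=10, end=False):
    """Pack one telephone-event payload (payload only, no RTP header).

    duration is in RTP timestamp units (160 units = 20 ms at 8 kHz).
    volume is the tone level as 0-63, meaning 0 to -63 dBm0.
    """
    if digit not in DTMF_EVENTS:
        raise ValueError(f"not a DTMF digit: {digit!r}")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Second byte: E (end) bit, reserved bit left at 0, then 6 bits of volume.
    flags = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", DTMF_EVENTS[digit], flags, duration)


def decode_dtmf_event(payload):
    """Unpack a 4-byte payload into (digit, end, volume, duration)."""
    event, flags, duration = struct.unpack("!BBH", payload)
    return DTMF_DIGITS[event], bool(flags & 0x80), flags & 0x3F, duration
```

Four bytes per digit is all the protocol needs. The doubled-digit problem shows up when a gateway sends these events and also lets the audible tone through in the audio stream, so the far end hears the same key twice.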
When you call a number do you want the near end or the far end to provide ringback?
But the greatest problem of all is those one-way audio connections. They happen because the SIP protocol has no way of knowing whether the audio is being delivered, and the clients have no way to tell the SIP controller that audio is getting through.
Now, try to get VOIP working through a VPN and you can spend endless time fiddling with settings without success.
SIP has become an implementation disaster.
SIP could be cleaned up with better negotiation. Clients could be asked if they can generate ring-back tones locally. An inquiry could be made of whether media is being delivered. It would be possible to create a new well-thought-through version of SIP where things are tightly nailed down. But I am not seeing anybody step up to the plate to do that.
Connecting a SIP phone to a PBX or VOIP service should be as simple as pressing a configure button and then keying in the IP address of the server, plus maybe a provisioning ID number. Everything else should be negotiated. And, connections should be verified by the clients and be confirmed back to the SIP server.
When a call is set up, the media path should be part of the negotiation process. The clients should pass peer-to-peer messages to each other and see if they can communicate directly. If they can, that is best, because a SIP extension in Tokyo can then just send audio to another SIP client in Tokyo without having it back-hauled twice to the PBX in the US.
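The kind of check I have in mind is not complicated. Here is a rough sketch, assuming each client can send a small UDP probe to the other's media port and get it echoed back. The probe message, the function names, and the fallback-to-relay rule are all hypothetical and not part of SIP or RTP as they stand today.

```python
import socket
import threading

PROBE = b"media-probe"  # hypothetical probe message, not defined by SIP or RTP


def echo_once(sock):
    """Far-end side: answer a single probe by echoing it to the sender."""
    try:
        data, addr = sock.recvfrom(1024)
        sock.sendto(data, addr)
    except OSError:
        pass  # socket closed or timed out; nothing to answer


def can_reach_directly(peer_addr, timeout=0.5):
    """Near-end side: send a probe and report whether the echo came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(PROBE, peer_addr)
            data, _ = sock.recvfrom(1024)
        except OSError:  # covers timeouts and unreachable-port errors
            return False
        return data == PROBE


def choose_media_path(peer_addr, relay_addr, timeout=0.5):
    """Use the direct path if the probe round-trips, else fall back to the relay."""
    if can_reach_directly(peer_addr, timeout):
        return peer_addr
    return relay_addr


if __name__ == "__main__":
    # Demo on loopback: one socket stands in for the far-end phone.
    far_end = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    far_end.bind(("127.0.0.1", 0))
    threading.Thread(target=echo_once, args=(far_end,), daemon=True).start()
    relay = ("192.0.2.1", 5004)  # documentation-range address as a placeholder PBX
    print(choose_media_path(far_end.getsockname(), relay))
    far_end.close()
```

A real phone would run the same probe in both directions and report the result back to the server, which is exactly the confirmation that is missing when a call ends up with one-way audio.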
Why is this a problem in the year 2015? Am I the only one out there who just hates all of the SIP settings and complexity and “personality” of getting things to work?
The first version of a SIP-like protocol that I am aware of was running on a switch in Bell Laboratories in the late 1970s. That was 35 years ago. In 35 years we should have been able to figure this out.