HD Voice, Clearly
There has been a lot of chatter lately about HD Voice, sometimes referred to by the name of its codec, G.722. Some people consider it the defining improvement to desktop phones; others (me) might consider it the only improvement to the desktop phone. The subject even had its own Jeff Pulver-sponsored conference recently. One thing that is crystal clear on this topic is that there is a lot of confusion. For example, owners of most of the newer Polycom SoundPoint phones will see a new HDVoice logo right on the handset. But that does not mean these users are experiencing HDVoice. It simply means the phone is capable of it. Like all codecs, it takes two to tango, but in most voice communications it takes more than two.
First, what is HDVoice? HDVoice is a registered trademark of Polycom, but it really refers to an emerging standard for wideband audio on telephones based on the G.722 codec. Our ears can hear a frequency range of 20 Hz to 20 kHz, though most of us can’t hear frequencies above 15 kHz. The original analog PSTN phone supports a frequency range of 300 Hz to 3.4 kHz. This is why classical piano is the recommended music on hold: the vast majority of the higher frequencies we hear are outside the audio range of telephones. Wideband phones support a frequency range of about 50 Hz to 7 kHz, just about double the traditional narrowband standard. For reference, the human voice range is about 80 Hz to 14 kHz.
This is clearly illustrated in FM radio advertisements where the actors do a mock phone conversation: one voice is clear (FM radio goes to 15 kHz) and the other is squeezed into a narrow band. Without being able to see any actors or props, the radio listener has no confusion about which actor is “on the phone”. Well, it turns out that the audio range of the phone was designed a very long time ago, and not until recently has anyone considered improving it. The difficulty in improving the standard is that it isn’t just the phones: the entire public network of carriers, gateways, switches, and phones is built around this narrowband standard.
When there is no PSTN involved, there has never really been a need to limit the audio to narrowband, but no one really cared, since most phone systems are not purchased for internal calls. VoIP changed the game a bit, because now phone systems are networked between sites via IP trunks. Over the past few years, SIP trunks (an industry-standard IP trunk) have started connecting systems, and supporting wideband over them turns out not to be very complex (though many SIP carriers do not support it).
So with phones in place and an increasing number of carriers supporting G.722, wideband phones are positioned well for rapid growth. It turns out that G.722 is not complex, and many phones can be upgraded with firmware. Cisco recently announced a firmware upgrade for this very reason. Wideband audio doesn’t even require more bandwidth: G.722 runs at 64 kbit/s, the same as G.711. Personally, I find it disappointing that this long-overdue upgrade is still so limited; if we are going to go to all this trouble, why not go to 15 kHz? But things are in play for G.722 to catch on this year.
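The claim that wideband needs no extra bandwidth can be checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this post, the 20 ms packet size is an assumed (though common) packetization interval, and the G.722 figures assume its 64 kbit/s mode, where the audio is split into two sub-bands that are each coded at 8,000 samples per second.

```python
# Back-of-the-envelope bit rates. Function names are hypothetical helpers,
# not part of any phone system or library.

def bitrate_bps(samples_per_second, bits_per_sample):
    """Raw codec bit rate in bits per second."""
    return samples_per_second * bits_per_sample

# G.711 (narrowband): 8,000 samples/s, 8 bits per sample.
G711_BPS = bitrate_bps(8000, 8)

# G.722 in its 64 kbit/s mode (wideband): 16 kHz input is split into two
# sub-bands, each at 8,000 samples/s; 6 bits go to the lower band and
# 2 bits to the upper band.
G722_BPS = bitrate_bps(8000, 6) + bitrate_bps(8000, 2)

def payload_bytes(bitrate, packet_ms=20):
    """Audio payload per packet, assuming a 20 ms packet by default."""
    return bitrate * packet_ms // 1000 // 8

print("G.711:", G711_BPS, "bit/s,", payload_bytes(G711_BPS), "bytes per packet")
print("G.722:", G722_BPS, "bit/s,", payload_bytes(G722_BPS), "bytes per packet")
```

Both codecs come out to the same bit rate and the same payload per packet, so the extra audio range comes from smarter coding rather than from a fatter pipe.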
[youtube=http://www.youtube.com/watch?v=LCfSNdbUCXw]
The name of this blog, Pin Drop Soup, refers to the last time we were sold on the clarity of voice communications. Analog communications tend to be hissy (think records vs. CDs), so when Sprint completed their national upgrade to an all-digital network, they convinced us the sound of a pin drop was the ultimate test. The Memorex commercials with Ella breaking the glass might be more appropriate for wideband, though unfortunately not with a high range of only around 7 kHz.
There is an extraordinary amount of confusion regarding wideband voice. The issue is that every device in the communication path needs to agree on the G.722 codec. Some technologies negotiate with the endpoint (correct) and some negotiate with the end (or midspan) switch. This can result in something called transcoding. Ideally, the endpoints of the conversation negotiate the codec used for the call, but if they can’t agree, additional processing is required to transcode between the mismatched codecs. In that case, a processing node has to convert the audio from one codec (say, a wideband one) to the codec the other side supports. Excessive transcoding can introduce delays and degrade voice quality. Transcoding is common between landline and cellular calls, as landlines generally use the G.711 codec and cellular calls use multiple codecs optimized for wireless transmission.
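To make the negotiation idea concrete, here is a deliberately simplified sketch. It is not how any particular phone system or SIP stack does it: the function names, the codec labels, and the fallback rule (each side uses its own first choice and a middle node transcodes) are all assumptions for illustration.

```python
# Toy model of codec negotiation between two endpoints.
# All names here are hypothetical; real systems exchange far more detail.

def negotiate(offer, supported_by_answerer):
    """Return the first codec in the caller's preference order that the
    other side also supports, or None if there is no common codec."""
    for codec in offer:
        if codec in supported_by_answerer:
            return codec
    return None

def plan_call(caller_codecs, callee_codecs):
    """Decide which codec each side uses and whether a node in the
    middle has to transcode between them."""
    common = negotiate(caller_codecs, set(callee_codecs))
    if common is not None:
        return {"caller": common, "callee": common, "transcode": False}
    # No common codec: assume each side falls back to its own first
    # choice and something in between converts.
    return {"caller": caller_codecs[0], "callee": callee_codecs[0],
            "transcode": True}

# Two wideband phones agree on G.722 end to end.
print(plan_call(["G722", "G711"], ["G722", "G711"]))
# A wideband phone calling a narrowband-only phone drops to G.711.
print(plan_call(["G722", "G711"], ["G711"]))
# No codec in common, so the call needs transcoding.
print(plan_call(["G722"], ["AMR"]))
```

Note the middle case: nothing is transcoded, yet the call is still narrowband, because the two ends could only agree on the narrowband codec.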
I have a wideband phone, but I don’t believe my headset is wideband. Many phone systems don’t support wideband at all. Microsoft OCS is currently restricted to narrowband, as are many of the digital phone systems currently on the market. But even when all the equipment is wideband compliant, the vast majority of the PSTN is not. If you are using traditional T1 or PRI links, or even analog links, dialing 9 to get out may indeed be setting the codec to narrowband. And remember, wideband has to be end-to-end, so talking to someone on, say, a cell phone or a narrowband G.711 connection means both ends are narrow, not just their end.
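The end-to-end point boils down to "the call is only as wide as its narrowest leg". A minimal sketch of that rule, using the upper frequency limits quoted earlier in this post (3.4 kHz narrowband, 7 kHz wideband) and made-up leg names and function names:

```python
# A call is only as wide as its narrowest leg.
# Limits are the figures used in this post; names are hypothetical.

NARROWBAND_HZ = 3400
WIDEBAND_HZ = 7000

def end_to_end_upper_limit(legs):
    """Highest frequency that survives every leg of the call path."""
    return min(legs.values())

def is_wideband_call(legs):
    """True only if every leg carries at least the wideband range."""
    return end_to_end_upper_limit(legs) >= WIDEBAND_HZ

internal_call = {
    "my handset": WIDEBAND_HZ,
    "IP trunk": WIDEBAND_HZ,
    "their handset": WIDEBAND_HZ,
}
outside_call = {
    "my handset": WIDEBAND_HZ,
    "PRI link": NARROWBAND_HZ,
    "their cell phone": NARROWBAND_HZ,
}

print(is_wideband_call(internal_call), end_to_end_upper_limit(internal_call))
print(is_wideband_call(outside_call), end_to_end_upper_limit(outside_call))
```

A single narrowband leg (the headset, the trunk, or the far end) is enough to pull the whole call down to narrowband for both parties.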
This month, for the first time, I experienced a customer that didn’t like wideband. The solution required changing the phones to narrowband handsets. Though I consider this the exception, the fact is there are still many people who prefer music on AM over FM radio. This particular customer had a noisy environment. They felt their bullpen environment resulted in too much noise being sucked into the handset’s microphone and didn’t consider wideband an improvement. Rather than deal with the background noise, they liked things the way they were: narrow. It just goes to show why there are so many different voice solutions out there.
Dave,
The trend in wideband voice sees a lot more than just the G.722 codec in play. It's just the lowest-common-denominator royalty-free preference of a few hardware companies. Some of the other codecs offer higher sample rates, but G.722 is the one thing that many people can deploy today.
The PSTN is by definition narrowband. There are regulations that stipulate filtering, all designed around 8 kHz sampling and a 3.4 kHz upper frequency limit.
Yes, it seems like transcoding is going to be necessary. But we live with transcoding even now. All cell calls happen in codecs native to the mobile networks (AMR, GSM) and are then transcoded to G.711 or G.729 for termination with the PSTN. In fact, many cell calls are transcoded to G.711 for switching and then converted back to AMR or GSM if you're calling another cell phone!
AFAIK, MS OCS does support wideband in some fashion. In fact, they have introduced their own wideband codec called RTA.