Codec Comparison

For comprehensive comparisons between BV16/BV32 and other codecs, please see the codec comparison tables in Annexes C and D (pages 82 - 89) of the PacketCable 2.0 Codec and Media Specification, available at http://www.cablelabs.com/specifications/PKT-SP-CODEC-MEDIA-I07-090702.pdf

Broadcom believes that, compared with the nearly two dozen other dominant codecs listed there, BV16 and BV32 offer the most compelling combination of low delay, low complexity, high quality, moderate bit-rate, and royalty-free availability in source code form.

The following figures give graphical comparisons of BroadVoice and other codecs in terms of delay, complexity, and speech quality. Most of the delay and complexity data are from the PacketCable 2.0 Codec and Media Specification referenced above. The voice quality data come either from objective quality measurements previously performed by Broadcom (for PESQ scores) or from subjective listening tests previously conducted by independent third-party test labs (for MOS scores).

(1) Coding Delay:

Typical end-to-end delay in VoIP systems includes delays due to codec buffering, processing, transmission/decoder bit-stream buffering, real-time OS multi-tasking, propagation, network nodes, jitter buffering, etc. If the packet size is the same as the codec frame size, a general rule is that the one-way end-to-end delay is typically around 5 x codec frame size + look-ahead. Based on this formula, the following figure compares the end-to-end delay of many speech codecs. Note that sample-based or extremely low-delay codecs such as G.711, G.726, G.728, and G.722 offer no real delay advantage over BroadVoice because they typically have to use a buffer to fill a packet of at least 5 ms anyway (packet header overhead is too high for packet sizes < 5 ms).
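The rule of thumb above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the frame sizes and look-ahead values in the table are commonly cited figures rather than numbers taken from this page, and the 40-byte header assumes uncompressed RTP over UDP over IPv4. The second function shows why packets shorter than 5 ms are impractical: for a 64 kb/s codec such as G.711, a 5 ms packet already carries as many header bytes as payload bytes.

```python
# Rule of thumb from the text: one-way delay ~ 5 x codec frame size + look-ahead,
# assuming the packet size equals the codec frame size.

# Assumption: uncompressed RTP (12) + UDP (8) + IPv4 (20) headers, in bytes.
RTP_UDP_IPV4_HEADER_BYTES = 12 + 8 + 20


def one_way_delay_ms(frame_ms, lookahead_ms=0.0, multiplier=5.0):
    """Estimate one-way end-to-end delay in milliseconds."""
    return multiplier * frame_ms + lookahead_ms


def header_overhead_fraction(bitrate_kbps, packet_ms):
    """Fraction of each packet's bytes spent on headers rather than speech."""
    payload_bytes = bitrate_kbps * packet_ms / 8.0  # kb/s x ms = bits
    total_bytes = RTP_UDP_IPV4_HEADER_BYTES + payload_bytes
    return RTP_UDP_IPV4_HEADER_BYTES / total_bytes


# Illustrative (frame size, look-ahead) pairs in ms; assumed values, not from this page.
# A sample-based codec such as G.711 is modeled by its packet size instead.
CODECS = {
    "BV16": (5.0, 0.0),
    "G.729": (10.0, 5.0),
    "G.711 (5 ms packets)": (5.0, 0.0),
}

if __name__ == "__main__":
    for name, (frame_ms, lookahead_ms) in CODECS.items():
        delay = one_way_delay_ms(frame_ms, lookahead_ms)
        print(f"{name}: about {delay:.1f} ms one-way")
    for packet_ms in (1.0, 5.0, 20.0):
        share = header_overhead_fraction(64.0, packet_ms)
        print(f"64 kb/s, {packet_ms:.0f} ms packets: {share:.0%} header overhead")
```

Under these assumptions, a codec with a 5 ms frame and a sample-based codec packetized at 5 ms come out with the same estimated delay, which is the point the paragraph makes.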

(2) Codec Complexity:

The following figures compare computational complexity in millions of instructions per second (MIPS), the RAM requirement, and the total memory footprint on a typical 16-bit fixed-point commercial DSP.

(3) Codec Output Voice Quality:

Figures 5 and 6 compare the output quality of narrowband and wideband codecs, respectively, using the objective measure Perceptual Evaluation of Speech Quality (PESQ) as defined in ITU-T Recommendation P.862. Each curve represents a single codec, evaluated with 13 different languages, with the PESQ scores averaged over 96 sentence pairs for each language.

Figures 7 and 8 compare the Mean Opinion Scores (MOS) obtained in formal subjective listening tests for narrowband and wideband codecs, conducted at Dynastat Inc. and Comsat Laboratories, respectively. Each listening test had 32 listeners. BV16 was rated better than the toll-quality codecs G.728, G.729, and 32 kb/s G.726 by statistically significant margins. BV32 was rated better than 64 kb/s G.722 by statistically significant margins in 4 out of 5 conditions and statistically equivalent in the remaining condition. In the test condition labels, "PLR" stands for "Packet Loss Rate".
