Radio over IP networks? - A Call for a Cautionary Approach to Audio Coding

Having supplied several hundred audio codecs to the new wave of Indian FM stations for IP delivery, and to countrywide networks in Australia, the UK, Finland, Norway, Japan, Korea, the USA and elsewhere, APT has a degree of insight into the market and would like to use the benefit of that experience to offer some observations.

When considering the choice of audio codecs in the broadcast chain, it is clear that impressive results can be achieved at low data rates using codecs such as AAC HE; indeed, such algorithms have been widely adopted for applications such as DMB, DRM, DAB+ and DVB-S/T. That, however, misses a very important point: such algorithms perform well for these applications when used in isolation.

The stand-alone performance of the audio encoding/decoding process is not the issue in every case.

Research by the European Broadcasting Union (EBU) identifies the following stages in the broadcast chain, any of which may see audio compression systems in use:

»     Source - file-based playout system, DJ's iPod, portable flash recorder, MiniDisc

»     Contribution circuit - ISDN dial-up from a music concert, sports event, etc.

»     Broadcast Studio Installation - long-term storage, file based, web server based

»     Secondary Distribution - studio to studio link, studio to transmitter link

»     Emission - delivery to the listener over FM/AM/DAB+/DRM/HD Radio/DMB/DVB-S/DVB-T/Internet

In each of the above cases an audio compression system is more than likely going to be used, and in the case of Emission only AM and FM (being analogue) are guaranteed free of audio compression.

So, clearly, it is not the standalone performance that matters but how these compression systems work together.

Sitting through a demonstration of how wonderful the latest and newest compression system sounds at low bit rates does nothing to inform the listener as to how it will sound if used in conjunction with other compression systems, or if cascaded itself with several encode/decode cycles. Nor does the standalone performance inform the listener of the delay characteristics.

It should be noted that all MPEG and AAC variants are psychoacoustic in principle: content which the algorithm predicts will not be noticed by the human ear is removed permanently. It works well. These algorithms sound good, and the more recent ones (such as AAC HE) sound the best. But once removed, the content is gone, and the risk arises when the signal is subjected to a second or third round of encoding and decoding. The fall-off in sound quality after successive encoding rounds can be quite severe; this effect is known as concatenation.

So when an engineer chooses a compression system for delivery of audio in real time from A to B, he should be asking whether his link is going to be the link that causes concatenation. If the engineer has been impressed with the performance of a psychoacoustic algorithm and chooses it for a contribution or distribution link, he now has to ask, for every piece of audio passing through that link...

»     What has the audio been through already? 

→   Has it already been subject to compression?

»     What is the audio going through next? 

→   Is it going to be subject to compression again?

»     Will my link be the link that introduces concatenation?
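The questions above can be sketched as a simple check over the chain. This is only an illustration that follows the article's premise (psychoacoustic stages accumulate, ADPCM and linear stages do not); the `Stage` class, the function names and the codecs assigned to each stage in `example_chain` are hypothetical, not taken from any real product or network.

```python
from dataclasses import dataclass
from typing import List, Optional

# Codec families as the article groups them; the set is illustrative, not exhaustive.
PSYCHOACOUSTIC = {"MPEG L2", "MPEG L3", "AAC LD", "AAC HE"}


@dataclass
class Stage:
    name: str             # stage in the broadcast chain
    codec: Optional[str]  # None means linear (uncompressed) audio


def psychoacoustic_generations(chain: List[Stage]) -> int:
    """Count the stages that apply a psychoacoustic encode/decode cycle."""
    return sum(1 for stage in chain if stage.codec in PSYCHOACOUSTIC)


def link_introduces_concatenation(chain: List[Stage], link_index: int) -> bool:
    """True if the chosen link is psychoacoustic and at least one other stage is too."""
    if chain[link_index].codec not in PSYCHOACOUSTIC:
        return False
    others = chain[:link_index] + chain[link_index + 1:]
    return psychoacoustic_generations(others) > 0


# Hypothetical chain: the codec at each stage is an assumption for illustration.
example_chain = [
    Stage("Source: file-based playout", "MPEG L3"),
    Stage("Contribution: ISDN dial-up", "MPEG L2"),
    Stage("Distribution: studio to transmitter link", "AAC LD"),
    Stage("Emission: DAB+", "AAC HE"),
]

if __name__ == "__main__":
    print(psychoacoustic_generations(example_chain))
    print(link_introduces_concatenation(example_chain, 2))
```

Swapping the codec of the distribution stage for an ADPCM type such as Enhanced apt-X makes `link_introduces_concatenation` return `False` for that link, which is the situation the article goes on to recommend.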

Instead, if the engineer chooses an ADPCM type of algorithm he can relax, knowing that these algorithms (such as Enhanced apt-X) are non-destructive and can cause NO concatenation even if cascaded up to 10 times. Enhanced apt-X has the additional benefit of being incredibly low delay. In a large country like India, relying increasingly on IP networks for backbone infrastructure, delay should be quite a concern.

Any network will have some transport latency; in the case of an IP network this can be quite considerable. An IP network has both standard transmission delays and packetising delays, and jitter buffers also contribute delay. The choice of audio compression algorithm is therefore critical in determining the actual end-to-end latency. An ADPCM type of compression system provides a very low delay solution. In theory the encode/decode time for Enhanced apt-X is around 3 ms; in practice this depends on sample rate (45 samples being the governing factor in the equation), and of course there is the network itself to contend with. Tested back to back on a bench, real-world results in an operating codec are as follows (on a short link running at 192 kbps):

»     Standard apt-X 16 bit = 17 ms

»     Enhanced apt-X24 = 10 ms

»     MPEG L2 = 103 ms

»     MPEG L3 = 160 ms

»     AAC LD = 63 ms

You can see a very clear difference here, with apt-X and Enhanced apt-X outperforming even AAC LD (low delay) by a considerable margin. Once you add IP stack delay (the time taken for assembly into RTP packets and transport through the UDP stack) and the service provider's network delay, you can see that really no solution other than an ADPCM type of algorithm is going to meet a low delay objective.
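To make the arithmetic concrete, here is a rough latency budget that adds the other delay sources named above to the bench figures. The codec delays are the ones quoted in this article; the packetising, jitter buffer and network figures in `ASSUMED` are purely hypothetical placeholders, and real values depend on packet size, buffer settings and the service provider.

```python
# Bench figures quoted in the article (ms, back to back on a short 192 kbps link).
CODEC_DELAY_MS = {
    "Standard apt-X 16 bit": 17,
    "Enhanced apt-X24": 10,
    "MPEG L2": 103,
    "MPEG L3": 160,
    "AAC LD": 63,
}

# Hypothetical overheads in ms; these are assumptions, not measurements.
ASSUMED = {"packetising_ms": 5, "jitter_buffer_ms": 20, "network_ms": 30}


def end_to_end_ms(codec, packetising_ms, jitter_buffer_ms, network_ms):
    """Sum the codec delay and the IP-related delays for one codec."""
    return CODEC_DELAY_MS[codec] + packetising_ms + jitter_buffer_ms + network_ms


def budget_table(**overheads):
    """Return the estimated end-to-end delay in ms for every codec in the table."""
    return {codec: end_to_end_ms(codec, **overheads) for codec in CODEC_DELAY_MS}


if __name__ == "__main__":
    # Print the codecs from lowest to highest estimated end-to-end delay.
    ranked = sorted(budget_table(**ASSUMED).items(), key=lambda item: item[1])
    for codec, total in ranked:
        print(f"{codec:<22} {total:>4} ms")
```

Because the same overheads are added to every codec, the ranking does not change; the sketch simply shows how much of a given delay target is already consumed by the codec before the network is taken into account.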

This is why Enhanced apt-X has become the algorithm of choice for many worldwide.

  • It is the only risk-free choice (it cannot cause concatenation),
  • It satisfies the most demanding audio purists (it sounds great), and
  • It has the added advantage of being low delay (by a massive margin, the lowest).

The EBU has specified it for all contribution links between its broadcasting members. In Australia the ABC has standardized on it for all its distribution. It is also now the standard in Korea, Japan, Norway, the UK and India (within the independent sector), and in many other countries.

So in conclusion, the success that MPEG and AAC have had in chasing the lowest bit rates is well noted, and indeed makes these types of algorithms appropriate where bandwidth is the single critical issue, i.e. distribution to the end listener through the internet, cellular networks, satellite and set-top boxes....

but

...there are several other applications where the use of such psychoacoustic-based technologies is not appropriate and can prove detrimental to the quality of a radio station's output. APT therefore believes it is beneficial to highlight the need for the gentler approach of apt-X technology and to point to its widespread acceptance throughout many networks worldwide.

[ Guy Gampell is the Asia Pacific Sales Manager for Audio Processing Technology. www.aptx.com ]

 

