There's no shortage of methods for compressing digital audio. The best approach depends on your storage, your fidelity needs, and the amount of processing at hand.

The terrific popularity of portable multimedia players and Internet media services such as iTunes has generated a lot of interest in audio coding. Audio coding is the art and science of compressing audio signals for efficient storage (small file size) and high-quality streaming (low bandwidth). In this article, I'll briefly cover the principles of audio coding and describe at a high level the popular MPEG audio codecs (MP3, AAC) as well as a few proprietary alternatives. The main focus of this article is on recent advances in audio coding technology and ongoing work in the MPEG audio committee. Specifically, we'll take a look at a number of recently developed techniques such as spectral band replication (SBR), integer MDCT (intMDCT), parametric audio coding, and binaural cue coding (BCC), each of which is enabling new applications and improved quality at lower bit rates.

A “raw” multichannel digital audio signal consists of sequences of 16-bit samples (one sequence per channel). Typical sampling frequencies in high-quality audio are 44.1 kHz or 48 kHz, although lower sampling frequencies are used for subwoofer channels. Raw audio needs huge space for storage or, equivalently, high bandwidth for streaming. As an example, one minute of DVD-quality audio data requires almost 30 MB of space, or 3.5 Mbps of bandwidth for real-time streaming. Audio coding can reduce this data rate by a factor of 20 with negligible impact on perceived audio quality.

The frequency masking phenomenon is exploited extensively in audio coding. Each frame of audio (a collection of samples, for example 1,024 samples) is first transformed to the frequency domain, thereby decomposing it into a collection of tones at various frequencies. The signal is then analyzed to determine which tones would be irrelevant because they are masked by nearby louder tones. By coding only the tones that are audible to the human ear, tremendous compression ratios can be achieved. Masking is a complex phenomenon, and what I just described is only a simplified model. In practical audio codecs, a comprehensive psychoacoustic model that mimics the properties of the human ear captures the comparative relevance of the audible frequency tones while deeming other tones inaudible. A good psychoacoustic model is at the heart of all high-quality audio codecs.

Figure 2 shows the block-level organization of a typical audio encoder. The input audio signal is first transformed to the frequency domain to enable analysis. Reversible transforms are best used in this step. In first-generation audio codecs such as Musicam, filter banks were employed. Of late, the modified discrete cosine transform (MDCT) is most preferred. As mentioned earlier, a psychoacoustic model is used to determine the relative importance (represented by a “step size”) of each transform coefficient. A smaller step size indicates that the corresponding tone is more important. The next step is quantization, where each transform coefficient (equivalently, the tone amplitude) is scaled down by the step size and converted to an integer called a quantization index. Quantization is a lossy process and reduces the amount of information content significantly, thereby contributing to a large reduction in bit rate. Naturally, more important tones suffer less quantization error. Next, the quantization indices are entropy coded. Entropy coding is a lossless technique that uses fewer bits to represent more likely quantization indices. Huffman coding is the most commonly used technique in entropy coding. Finally, the resulting data is formatted based on the standard specifications and packetized for storage or streaming.

It's easy to see that the decoder can follow a precise reverse process to reconstruct the audio signal. Figure 3 illustrates the typical decoder.
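
The quantization step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the quantizer of any particular MPEG codec: the coefficient values and step sizes are made up, the function names `quantize` and `dequantize` are hypothetical, and plain uniform rounding stands in for the more elaborate schemes that real codecs use. In a real encoder the step sizes would come from the psychoacoustic model.

```python
def quantize(coeffs, step_sizes):
    """Divide each transform coefficient by its step size and round
    the result to the nearest integer (the quantization index)."""
    return [round(c / s) for c, s in zip(coeffs, step_sizes)]


def dequantize(indices, step_sizes):
    """Decoder side: multiply each index by its step size to get an
    approximation of the original coefficient."""
    return [q * s for q, s in zip(indices, step_sizes)]


# Hypothetical transform coefficients (tone amplitudes) for one frame.
coeffs = [12.7, -3.2, 0.9, 5.4]

# Hypothetical step sizes: a smaller step marks a more important tone.
step_sizes = [0.5, 2.0, 4.0, 1.0]

indices = quantize(coeffs, step_sizes)
restored = dequantize(indices, step_sizes)
errors = [abs(c - r) for c, r in zip(coeffs, restored)]

for c, s, q, r, e in zip(coeffs, step_sizes, indices, restored, errors):
    print(f"coeff={c:6.2f} step={s:4.1f} index={q:3d} "
          f"restored={r:6.2f} error={e:.2f}")
```

With uniform rounding, the error for each coefficient is at most half of its step size, which is why tones that are given a smaller step size suffer less quantization error. The third coefficient, which has the largest step, quantizes to index 0 and is effectively dropped. The small integers in `indices` are what the entropy coder would then compress.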