FREQUENTLY ASKED QUESTIONS ON DIGITAL AUDIO

 

Q1. What is digital audio?
A: It is the representation of an audio signal by means of numbers, most often using a binary code (i.e., one that uses only zeros and ones).

Q2. Why is digital audio so widely used?
A: Because it has several advantages over analog audio that make it irreplaceable. First, it can be stored without degradation: it is harder to alter a number than the magnetic field, proportional to the signal, that is recorded on a magnetic tape such as an analog audio cassette. Second, it makes it possible to use available digital signal processing technology to modify sound in a controlled manner, with effects, filters, and other enhancements that are very difficult, if not impossible, to achieve with analog technology, such as delays, reverberation, and noise reduction.

Q3. How is an analog electric signal converted into a digital one?
A: Two processes are used: sampling (discretization in the time domain) and digitization (amplitude discretization, also known as quantization). Sampling consists in taking discrete values of the signal at regular time intervals. Digitization consists in splitting the whole range of amplitudes of the signal into many "bins" or subintervals, each labelled with a consecutive integer. The value of each sample is then replaced by the number identifying the bin into which it falls. For instance, if the signal ranges from 0 volts to 10 volts and that range is divided into 16 bins, a sample of 7.3 volts is assigned the integer part of 7.3*16/10 = 11.68, i.e., 11. This process is carried out by an analog-to-digital converter.
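
The arithmetic of this example can be sketched in a few lines of Python. The function name quantize and its parameters are illustrative assumptions, not the interface of any real converter; the default of 16 bins is inferred from the factor 16 in the example.

```python
def quantize(voltage, v_min=0.0, v_max=10.0, n_bins=16):
    """Return the index of the bin into which `voltage` falls (hypothetical helper)."""
    # Scale the voltage to the range [0, n_bins) and keep the integer part.
    index = int((voltage - v_min) * n_bins / (v_max - v_min))
    # Clamp so that out-of-range or full-scale inputs map to the first or last bin.
    return max(0, min(n_bins - 1, index))


print(quantize(7.3))  # integer part of 7.3 * 16 / 10 = 11.68
```

The clamping step is a design choice of this sketch: without it, an input of exactly 10 volts would yield 16, one past the last bin.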

Q4. How is the number of bins into which the signal values are classified chosen?
A: Usually a power of 2 is selected, so that the numbers assigned to the samples lie between 0 and 2^n - 1, where n is the bit count, i.e., the number of binary digits.

Q5. What is the resolution of a digital audio system?
A: It is the number of bits used to represent the audio samples. A higher resolution allows higher precision in the representation of the audio signal (i.e., it preserves more detail). For instance, with a resolution of 8 bits, the full range of variation of the signal is divided into 2^8 = 256 intervals, whereas with a resolution of 16 bits it is divided into 2^16 = 65536 subintervals, whose width is therefore much smaller. Consumer digital audio (such as that used in CDs or in computer sound cards) has a resolution of 16 bits. Professional audio equipment, such as that used in recording studios, works with higher resolutions such as 20 or 24 bits.
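
To make the effect of resolution concrete, the following sketch computes the number of intervals and the width of each one for several bit counts. The function names are hypothetical, and the 10-volt full range is an assumption carried over from the example in question 3.

```python
def bin_count(bits):
    """Number of intervals available with the given bit resolution."""
    return 2 ** bits


def bin_width(bits, full_range=10.0):
    """Width of one interval, in volts, for an assumed full range."""
    return full_range / bin_count(bits)


for bits in (8, 16, 24):
    print(bits, bin_count(bits), bin_width(bits))
```

Each additional bit doubles the number of intervals and halves their width.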

Q6. What is the sampling rate?
A: Sometimes called sampling frequency, it is the number of samples per unit time. The larger the sampling rate, the wider the frequency response of the system. The standard sampling rate for compact discs (CD) is 44.1 kHz.

Q7. How is the sampling rate chosen?
A: It must be greater than twice the highest frequency fmax present in the signal. This condition is called the Nyquist condition. Note that it is not enough that it be greater than twice the highest useful or meaningful frequency component (for instance, 20 kHz for high quality audio), since noise beyond that frequency may cause a special kind of distortion called aliasing.

Q8. What is aliasing?
A: One consequence of the so-called sampling theorem is that if a signal is sampled at a rate that does not fulfill the Nyquist condition (see previous question), then when you try to recover the original signal you get spurious frequency components (not originally present in the signal) along with the expected signal. Suppose, for instance, that we want to sample an audible signal corrupted with an inaudible 35-kHz ultrasonic tone. If we use the standard 44.1 kHz, when trying to recover the original signal a noise tone of 9.1 kHz (= 44.1 kHz - 35 kHz) will also be heard. You cannot just filter it out, because chances are that a useful signal component has the same frequency (and it would be lost along with the noise). Frequencies of this kind, which are folded back into the useful range of the spectrum, are called alias frequencies.
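
The folding described here can be sketched as a small calculation. The function name alias_frequency is a hypothetical helper; it assumes a single pure tone and an ideal sampler, and it returns the frequency at which that tone appears after sampling.

```python
def alias_frequency(f, fs):
    """Apparent frequency of a tone of frequency f sampled at rate fs."""
    # Reduce modulo fs, then reflect about fs/2 so the result lies in [0, fs/2].
    f = f % fs
    return fs - f if f > fs / 2 else f


print(alias_frequency(35_000, 44_100))  # 44100 - 35000 = 9100 Hz, i.e. 9.1 kHz
print(alias_frequency(1_000, 44_100))   # below fs/2, so the tone is unchanged
```

A tone that already satisfies the Nyquist condition is returned unchanged, which is why only content above half the sampling rate causes trouble.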

Q9. What happens when the sampling rate is already fixed by features of the system (for instance, because the signal is to be released in a commercial CD), so it cannot be chosen freely to comply with the Nyquist condition?
A: In such cases one has to act on the signal in order to filter out all of the potentially harmful frequency content. A device called an antialias filter is used to remove frequencies exceeding the Nyquist frequency, i.e., half the sampling rate fs. In the case of the commercial CD, which uses a sampling rate of 44.1 kHz, the antialias filter is designed to preserve all frequencies below 20 kHz and remove those beyond 22.05 kHz (= 44.1 kHz / 2).

Q10. How is the digital audio signal on a CD converted back into an audible signal?
A: A digital-to-analog converter is used. This device takes the stored samples and transforms them into a voltage by means of a scale factor. For instance, if the scale factor is 10/16 V, a sample equal to 11 is converted into a voltage equal to 11*10/16 = 6.875 V. The voltage corresponding to each individual sample is held constant until the next sample, yielding a stepped signal that can be smoothed by means of a filter called a smoothing filter.
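
The conversion and the hold step can be sketched as follows. Both function names are hypothetical, the 10/16 V scale factor is the one used in the example, and the hold is modelled simply by repeating each value a fixed number of times.

```python
def reconstruct(sample, v_max=10.0, n_bins=16):
    """Convert a stored sample number back into a voltage."""
    scale = v_max / n_bins  # volts per step, 10/16 V in the example
    return sample * scale


def zero_order_hold(voltages, repeat):
    """Hold each voltage constant for `repeat` output points (stepped signal)."""
    return [v for v in voltages for _ in range(repeat)]


print(reconstruct(11))  # 11 * 10 / 16 = 6.875
print(zero_order_hold([reconstruct(s) for s in (11, 12)], 3))
```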

Q11. Is the recovered signal identical to the original one?
A: No. If we compare the examples given in questions 3 and 10, we see that a signal value of 7.3 V has been "reconstructed" as 6.875 V, introducing an error of -0.425 V. This error can be reduced by reducing the width of the bins into which the whole signal range is divided, i.e., by increasing the bit resolution. The variation of this error over time constitutes what is called digitization noise (also known as quantization noise).
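
The error for the 7.3 V example can be checked with a short round-trip sketch. The function name round_trip_error is hypothetical, and a signal range of 0 to 10 volts is assumed, as in the earlier examples.

```python
def round_trip_error(voltage, v_max=10.0, n_bins=16):
    """Reconstructed voltage minus original voltage after digitization."""
    step = v_max / n_bins
    # Digitize: integer part of voltage / step, limited to the last bin.
    index = min(n_bins - 1, int(voltage / step))
    # Reconstruct with the same step and compare with the original value.
    return index * step - voltage


print(round(round_trip_error(7.3), 3))              # 6.875 - 7.3 = -0.425
print(round(round_trip_error(7.3, n_bins=256), 6))  # smaller error with 8 bits
```

With 256 bins (8 bits) instead of 16, the same sample is reconstructed much more closely, which illustrates why a higher resolution reduces the error.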

Q12. What is the relationship between digitization noise and resolution?
A: The best way to assess the noise present in any system (including digital audio systems) is by means of the signal-to-noise ratio (S/N) in decibels. In the case of digital signals, the maximum S/N attainable by a given system is roughly 6*n dB, where n is the bit resolution. For instance, a 16-bit system such as the CD is capable of reaching an S/N value of 6*16 = 96 dB. However, it should be noted that the actual S/N is usually less than that (for instance 90 dB) because of other sources of noise such as analog circuit noise.
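
The factor of 6 can be illustrated with a short sketch. It assumes that the S/N is taken as the ratio between the full range and one bin, i.e., the number of levels 2^n, expressed in decibels; both function names are hypothetical.

```python
import math


def max_snr_rule_of_thumb(bits):
    """Approximate maximum S/N in dB, using 6 dB per bit."""
    return 6 * bits


def max_snr_from_levels(bits):
    """S/N in dB taken as the ratio of 2**bits levels to one level (assumption)."""
    return 20 * math.log10(2 ** bits)


for bits in (8, 16, 24):
    print(bits, max_snr_rule_of_thumb(bits), round(max_snr_from_levels(bits), 2))
```

Since 20*log10(2) is about 6.02, each extra bit adds roughly 6 dB, which is where the rule of thumb comes from.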

Q13. What is the effect of the reconstructed signal being stepped instead of exactly equal to the original?
A: In theory it should not matter much, since the steps generate mostly spectral components beyond the audible range. However, it is advisable to include a smoothing filter in order to limit the high-frequency content to what is strictly necessary. This prevents high-frequency components that could interfere with other processes, such as radio or TV reception, and produce audible beats.


 

E-mail: fmiyara@fceia.unr.edu.ar