Hi-Rez - Another Myth Exploded!

Sep 11, 2011 at 10:48 PM Post #16 of 156
aliasing only happens at the ADC, and only then if the input contains frequencies above Nyquist at levels above 1 lsb - analog filtering the input is mandatory, large oversampling ratio in the ADC front end helps too
 
once converted to digital there are no further aliasing concerns as long as you are not changing the sample rate
 
handling 16 bit dithered source does require preserving the "extra" bits of resolution from any signal processing math operations until the final quantization/redithering process - if you have to "export" the data at 16 bits there is an extra processing overhead for the redithering if some digital processing (volume, EQ) is done in one piece of hardware and 16 bit audio has to be passed to another device for DAC
 
but within a device with a direct DAC output it is good enough to pass the processed data to the "24 bit" DAC and let the electronic noise floor dither it for you
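
As a rough illustration of that quantization/redithering step, here is a minimal numpy sketch, assuming a float signal in [-1, 1] and plain (non-noise-shaped) TPDF dither; the function name and parameter values are mine, for illustration only:

```python
import numpy as np

def requantize_16bit(x, seed=0):
    """Quantize a float signal in [-1, 1] to 16 bits with TPDF dither.

    Adding triangular-PDF noise of +/-1 LSB before rounding decorrelates
    the rounding error from the signal, leaving a benign constant noise
    floor instead of signal-dependent distortion.
    """
    rng = np.random.default_rng(seed)
    lsb = 1.0 / 32768.0                          # one 16-bit step
    dither = (rng.uniform(-0.5, 0.5, x.shape) +
              rng.uniform(-0.5, 0.5, x.shape)) * lsb
    q = np.round((x + dither) / lsb) * lsb       # snap to the 16-bit grid
    return np.clip(q, -1.0, 1.0 - lsb)

# A -60 dBFS tone survives as tone-plus-noise rather than collapsing
# into correlated truncation distortion.
fs = 44100
t = np.arange(fs) / fs
y = requantize_16bit(0.001 * np.sin(2 * np.pi * 1000 * t))
```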
 
Sep 11, 2011 at 11:24 PM Post #17 of 156
I meant effect processing specifically.
Wouldn't a reverb applied to a 192kS/s waveform be more defined and regular across a given time period than one applied to a 44.1kS/s waveform because the reverb has more samples to process?
 
Sep 12, 2011 at 2:16 AM Post #18 of 156
Some thoughts I had while reading:
 
The point about equipment reducing the oversampling rate at higher bit-rates is a good one. However, let's consider a typical DAC chip that operates at 384k. That is the bit rate of 8x oversampled (or up-sampled) 48k music. If the original track is at 96k, then it can only be 4x oversampled. The result is exactly the same bit rate being converted internally!
 
There may be some difference depending on where the oversampling or up-sampling was done prior to conversion, or whether it is the original rate at which the music was recorded, especially as there are different up-sampling algorithms. What I'm getting at here is that for high res (well, 96k) music to have a benefit, it would have to, as a result of being higher res, bypass some or other "flaw" in software or hardware up-sampling or oversampling in one's computer, converter or DAC. That's what I reckon anyway.

 
Sep 12, 2011 at 7:50 AM Post #20 of 156
I meant effect processing specifically.
Wouldn't a reverb applied to a 192kS/s waveform be more defined and regular across a given time period than one applied to a 44.1kS/s waveform because the reverb has more samples to process?


You raise a good point and one worth going into in greater detail. Hopefully when you've finished reading this post, digital audio will make a lot more sense.

When considering digital audio, there comes a point where the logic of what we see and know lets us down and we start to assume things about digital audio which are incorrect. This is because we are missing some vital information about how digital audio works, and without this information the reality of digital audio appears counter-intuitive. I'll try to explain the missing information as simply as I can:

Sampling Theory

Imagine a perfect circle. If we wanted to store and recreate that circle perfectly on a computer, logic would indicate that the more points we measure and store round the circumference of the circle, the more accurately we can recreate it. Even so, we would never be able to recreate the circle absolutely perfectly, because the circumference of a circle is actually made up of an infinite number of points and we can't measure or store an infinite number of values. But there is a completely different approach to this problem. Let's say we only measure and store 3 points on the circumference of the circle. Let's also say that we give the computer some limitations; it's only allowed to draw perfect circles, for example. Now if we give the computer those three points we measured, the computer will be able to recreate our original circle absolutely perfectly. It doesn't matter that there are only 3 points, as any perfect circle which intersects those 3 points must be identical to the original. Measuring and storing 10 or 1,000 points is not going to make our recreated circle any more accurate than using just 3 points!

Although digital audio is a lot more complicated, the basic concept is the same as with this analogy of the circle above. As with the circle example, we have to place some limitations on the system for it to perfectly recreate sound waves: the system can only recreate sine waves, and we must take more than two measuring (sampling) points per sine wave. As with the circle, providing we have more than two measurements (sampling points), the sine wave can be recreated perfectly. A million sampling points does not make the sine wave any more perfect (linear).

But, I hear you say, sound waves are a lot more complex than simple sine waves. True, but that's missing a fundamental fact: any sound you can hear is constructed from sine waves, as the human ear can only respond to sine waves. So if we can perfectly capture sine waves, by definition we can perfectly capture all sound. The other limitation we mentioned above (more than two sampling points per sine wave) explains the Nyquist Point: the sampling rate must be at least twice as fast as the highest audio frequency we wish to capture. So the Nyquist Point of a 96k sampling rate is 48kHz.
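
To make the circle analogy concrete in code: if we constrain the model to "a sine wave of known frequency", a handful of samples pins down the amplitude and phase exactly. A minimal sketch, with made-up values (0.7 amplitude, 0.3 radians phase) standing in for the "original" wave:

```python
import numpy as np

# Three samples of a 1 kHz sine taken at 2.5 kS/s -- more than two
# samples per cycle, as the sampling theorem requires.
f, fs = 1000.0, 2500.0
amp, phase = 0.7, 0.3                       # the "unknown" original
t = np.arange(3) / fs
samples = amp * np.sin(2 * np.pi * f * t + phase)

# Constrain the model to "a sine at frequency f" and solve for it:
# a*sin(2*pi*f*t) + b*cos(2*pi*f*t) determines amplitude and phase.
M = np.column_stack([np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)])
a, b = np.linalg.lstsq(M, samples, rcond=None)[0]

print(np.hypot(a, b), np.arctan2(b, a))     # recovers 0.7 and 0.3
```

More samples would not make the recovered sine any "more correct", just as more points round the circle add nothing once the circle is determined.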

Remember though, the practical application of sampling theory brings us into the world of electronic engineering and the trade-offs mentioned in my original post. However, digital audio is based on the concept of perfect linearity (unlike any other audio recording technology) and modern digital equipment can get us surprisingly close to this ideal.

Background

In 1928 Harry Nyquist came up with this sampling theory. Nyquist was working for Bell Labs at the time and his sampling theory was designed for use with telecommunications signals. Nyquist's paper was nothing more than interesting research until, some twenty years later, a genius mathematician (Claude Shannon) mathematically proved Nyquist's theory (turning it into a theorem) and incorporated it into his much grander Information Theory. Information Theory has a wide number of applications, from neurobiology to the understanding of black holes. Information Theory isn't just the basis of digital audio but of all digital information; Claude Shannon is often referred to as the father of the digital age. However, it would take nearly another 20 years (into the 1960s), until technology had advanced enough, for research to start into turning the Nyquist-Shannon Sampling Theorem into a practical digital audio system (by NHK, the BBC, et al.). The rest is history!

Summary


In my opinion, most of the audio industry exploits the fact that few consumers have an understanding of the Nyquist-Shannon Sampling Theorem to peddle their "bigger numbers is better" hype. The fact that a lot of audiophile magazines and reviewers also seem ignorant of the Theorem and its implications adds to the misinformation. If you take nothing else away from this post, just consider these 3 facts:

1. The Nyquist-Shannon Sampling Theorem provides for the capture and perfect recreation of all the information contained in any sound we can hear.
2. The Nyquist-Shannon Sampling Theorem is mathematically proven and scientifically uncontested.
3. If the Nyquist-Shannon Sampling Theorem is incorrect, there would be no digital audio or indeed no digital age at all!

G

MIT paper: Shannon Information Theory
 
Sep 12, 2011 at 9:41 AM Post #21 of 156
aliasing only happens at the ADC, and only then if the input contains frequencies above Nyquist at levels above 1 lsb - analog filtering the input is mandatory, large oversampling ratio in the ADC front end helps too
 
once converted to digital there are no further aliasing concerns as long as you are not changing the sample rate


Almost true. Modern professional ADCs are pretty complex bits of kit. They often have initial sampling rates of around 22MS/s (22 million samples a second); this provides a very high Nyquist Point and therefore allows very simple, smooth analogue anti-alias filters without artefacts. The inevitable trade-off is that at this high speed, resolution (bit depth) has to be restricted if we are to maintain accuracy. Usually 6-8 bits are used (if we used 24bit at 22MS/s we'd run into the same speed limitations of the laws of physics we encounter with 24/192). Once this initial sampling is complete, the digital data is passed through a decimation filter. This decimation filter mathematically converts sample rate into bit depth (not easy to explain without going into a lot more detail!) and applies another anti-alias filter relevant to the selected sample frequency (say 44.1kS/s). This seems like quite a lot of hassle (it is!) but the advantage is that the final anti-alias filter (in the decimation process) can be applied digitally, which allows us to alleviate the problems associated with steep analogue filters (phase issues and ringing). From this point on, you are correct that there are no further aliasing concerns unless you change the sample rate or convert the digital data back to analogue (for playback).
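
For the curious, here is a heavily simplified sketch of that "sample rate into bit depth" trade, assuming a plain dithered 8-bit front end and ignoring the noise shaping a real converter would use:

```python
import numpy as np
from scipy import signal

osr, fs_out = 256, 44100
fs_hi = fs_out * osr                          # fast, coarse front-end rate
t = np.arange(fs_hi // 10) / fs_hi            # 0.1 s test tone
x = 0.5 * np.sin(2 * np.pi * 1000 * t)

# A coarse 8-bit front end with TPDF dither (real converters also
# noise-shape, which is omitted here).
lsb = 2.0 / 2**8
rng = np.random.default_rng(0)
d = (rng.uniform(-.5, .5, x.size) + rng.uniform(-.5, .5, x.size)) * lsb
coarse = np.round((x + d) / lsb) * lsb

# The decimation filter low-passes then downsamples: quantization noise
# above the new Nyquist is discarded, so in-band SNR improves by
# roughly 10*log10(osr), i.e. about 24 dB here.
y = signal.resample_poly(coarse, 1, osr)
ref = 0.5 * np.sin(2 * np.pi * 1000 * np.arange(y.size) / fs_out)

mid = slice(100, -100)                        # skip filter edge effects
snr = lambda s, e: 10 * np.log10(np.mean(s**2) / np.mean(e**2))
print("front end:", snr(x, coarse - x), "dB")
print("after decimation:", snr(ref[mid], (y - ref)[mid]), "dB")
```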


Great post OP.
 
I'm in complete agreement with you, however there is a benefit to using a higher bit depth only if you want to use digital volume control in the DAC, so that you can achieve a certain degree of attenuation without loss of dynamic range.


Thanks. I tried to be as precise as I could with my wording on my original post (within the limitations of simplification). 16bit is more than sufficient for playback. By this I mean feeding the 16bits to a DAC and outputting the resultant analogue waveform to an amp (integrated or not). 16bits is not sufficient when processing digital audio, EQ'ing, volume changes (in the digital domain), etc. Professionally, everything is mixed at bit depths of at least 32bit (float) or 48bit (fixed). The usual workflow would be to output this mix at 24bit (dithered from 48bit) and hand it over to a mastering engineer. The mastering engineer will do some more EQ, compression, etc., (at 48bit or converted for some analogue processing) and then create a 16bit master using noise-shaped dither. It is a bad idea to do any further processing after the noise-shaped dither reduction to 16bit. You are quickly going to start losing fidelity (with a noise-shaped 16bit file) if you apply any digital processing (EQ for example), unless you've got quite specialised EQ equipment and know what you are doing. If you want to apply more digital processing you are much better off with 24bit files, although you still need some specialised EQ equipment (and to know what you are doing) to avoid loss of fidelity, but with 24bit that loss should be less noticeable. Lowering the volume in the digital domain would also count as digital processing and you would lose fidelity; this would be less noticeable with 24bit files (than with 16bit) but it would be even better to do it in the analogue domain with the amp.
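
To put a number on the digital volume point, here is a crude sketch (illustration only: no dither, made-up levels) of what a -48 dB digital volume does inside a purely 16-bit pipeline:

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs
x = 0.9 * np.sin(2 * np.pi * 440 * t)        # near full-scale source
gain = 10 ** (-48 / 20)                       # a -48 dB digital volume

lsb = 1.0 / 32768.0
q16 = lambda s: np.round(s / lsb) * lsb       # snap to 16-bit grid (no dither)

ideal = x * gain                              # attenuation at full precision
in16 = q16(ideal)                             # attenuation kept in 16 bits
err = in16 - ideal

# 48 dB is about 8 bits, so only ~8 bits of signal remain above the
# 16-bit step: effective resolution collapses accordingly.
print("SNR after -48 dB in a 16-bit pipeline:",
      10 * np.log10(np.mean(ideal**2) / np.mean(err**2)), "dB")
```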

G
 
Sep 12, 2011 at 10:03 AM Post #22 of 156
The point about equipment reducing the oversampling rate at higher bit-rates is a good one. However, let's consider a typical DAC chip that operates at 384k. That is the bit rate of 8x oversampled (or up-sampled) 48k music. If the original track is at 96k, then it can only be 4x oversampled. The result is exactly the same bit rate being converted internally!


Sorry, I'm not really sure I understand. When you say "bit-rates", do you mean bit depths? Even if you do mean bit depths, I still can't quite get my head around what you're saying.

G

 
Sep 12, 2011 at 11:51 AM Post #23 of 156


Quote:
You raise a good point and one worth going into in greater detail. [snip: the full sampling theory post, quoted from Post #20 above]


None of that answered any part of my question. I am not talking about the waveform represented by the samples at all. The reverb, being an entirely digital process, does not see the final waveform between the samples. It sees a series of dots and it processes those dots. What I'm wondering is if processing at a higher sampling resolution produces a more perceptually coherent (not necessarily mathematically correct) reverb when resampled back down, in the same way that rendering a CGI image at a higher sampling resolution and resampling it produces a more coherent image.
Please open the full images.
 
This image has one sample per pixel and, from a mathematical standpoint, contains all the information needed to reconstruct the image within the bandwidth constraints of the sampling resolution of the image.
 
http://upload.wikimedia.org/wikipedia/en/8/84/Mandelbrot-spiral-original.png
 
This image has multiple samples per pixel that have been averaged.  It is resampled from an image that has a 400x higher sampling resolution.
 
http://upload.wikimedia.org/wikipedia/en/6/63/Mandelbrot-spiral-antialiased-400-samples.png
 
While the first image is the closest mathematical representation of the underlying equation that can exist within the constraints of the image size, which looks better?
 
Sep 12, 2011 at 12:25 PM Post #24 of 156
People probably buy hi-rez not simply because it is high-resolution, but because most of the time it is mastered better than the 16-bit versions.
 
Sep 12, 2011 at 12:49 PM Post #25 of 156

 
On another subtopic, w.r.t. interpolation: how often does this come into play regarding the various sampling formats in the processes described above? Is it merely a compensation for defects in media, transmission, etc., or is it also a function of the calculations used in the conversion and filtering process?
 
 
Sep 12, 2011 at 12:57 PM Post #26 of 156
http://www.pelpix.info/silverplate1.flac
http://www.pelpix.info/silverplate2.flac
 
One of these has the original samples fed into the reverb process and one of these is oversampled.
Does one sound better?  If so, which?
 
Sep 12, 2011 at 2:14 PM Post #27 of 156
None of that answered any part of my question.  I am not talking about the waveform represented by the samples at all.  The reverb, being an entirely digital process, does not see the final waveform between the samples.  It sees a series of dots and it processes those dots.  What I'm wondering is if processing at a higher sampling resolution produces a more perceptually coherent (Not necessarily mathematically correct) reverb when resampled back down in the same way that rendering a CGI image at a higher sampling resolution and resampling it produces a more coherent image.


I wonder if you are getting confused? A higher sampling rate does not create any more resolution; once we exceed two samples per waveform, resolution is already perfect. The only advantage of a higher sample rate is to allow for a higher Nyquist Point and therefore a larger range of audio frequencies. If we sample a 1kHz sine wave with a sampling rate of 2.5kS/s we can reconstruct a perfect 1kHz sine wave. Sampling that sine wave at 96kS/s is not going to give us a more perfect reconstruction of the 1kHz sine wave. Reverb is in effect creating echoes (reflections) of the original sound wave; if our sampling rate is able to capture all sound waves within the spectrum of human hearing, that sample rate can also contain all the echoes. How "perceptually coherent" the reverb appears to be is down to the ability of the programmer. As with my original post, this is the theory. In practice, a reverb programmer could create inferior algorithms to deal with a 44.1kS/s signal compared to his/her algorithms for, say, a 96kS/s signal. In this case the reverb would "sound better" at 96kS/s, not because 96kS/s provides any more resolution (or any other benefit) but simply because the programmer was incompetent with his/her 44.1kS/s algorithms.
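
The 1kHz at 2.5kS/s claim is easy to check numerically with Whittaker-Shannon (sinc) reconstruction; a minimal sketch, noting that the tiny residual error comes from truncating the ideally infinite sinc sum, not from any lack of "resolution":

```python
import numpy as np

f, fs = 1000.0, 2500.0                   # 2.5 samples per cycle
n = np.arange(2000)                      # 0.8 s worth of samples
x = np.sin(2 * np.pi * f * n / fs)

# Whittaker-Shannon reconstruction on a 40x denser grid:
#   x(t) = sum_n x[n] * sinc((t - n/fs) * fs)
t = 500 / fs + np.arange(400) / (fs * 40)    # dense grid, away from edges
recon = (x * np.sinc((t[:, None] - n / fs) * fs)).sum(axis=1)

err = np.abs(recon - np.sin(2 * np.pi * f * t)).max()
print("max error:", err)   # small, and it shrinks as the window grows
```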

There are potentially some digital audio processes which would benefit from upsampling; compression is an example. Compression is a non-linear process which, depending on its application, can generate frequencies above the Nyquist Point (22kHz), which would then cause alias images in the hearing range. In this case it would make sense for the compression processor to upsample, compress, anti-alias filter and then downsample again.
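
A rough sketch of that upsample / process / filter / downsample recipe, using hard clipping as a crude stand-in for the non-linear stage (the levels and the 4x ratio are arbitrary choices for illustration):

```python
import numpy as np
from scipy import signal

fs = 44100
t = np.arange(fs) / fs
x = 0.9 * np.sin(2 * np.pi * 15000 * t)     # a 15 kHz tone

clip = lambda s: np.clip(s, -0.5, 0.5)      # crude stand-in non-linearity

# Naive: clipping at 44.1 kS/s creates harmonics at 45 kHz, 75 kHz, ...
# which fold back into the audible band (e.g. |3*15000 - 44100| = 900 Hz).
naive = clip(x)

# Oversampled: run the non-linearity at 4x, where those harmonics sit
# below the raised Nyquist, then filter and downsample. The decimation
# filter removes everything above 22.05 kHz before it can alias.
hi = signal.resample_poly(x, 4, 1)
safe = signal.resample_poly(clip(hi), 1, 4)
# An FFT of `naive` shows the 900 Hz alias; `safe` stays clean in-band.
```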

The difference, I believe, between your analogy with the image and digital audio is that with digital audio, when the signal is reconstructed, the only limitation is the frequency bandwidth, the resolution is infinite. The frequency bandwidth is not a problem either because even at 44.1kS/s (with a 22kHz Nyquist point) the bandwidth of the audio still exceeds the bandwidth of the human ear.

G
 
Sep 12, 2011 at 2:31 PM Post #28 of 156
On another subtopic, w.r.t. interpolation: how often does this come into play regarding the various sampling formats in the processes described above? Is it merely a compensation for defects in media, transmission, etc., or is it also a function of the calculations used in the conversion and filtering process?


Interpolation filters are used when upsampling. The only time upsampling should be used is when starting from a 44.1kS/s (or 48kS/s) sample rate, where it eases the job of the filters in a DAC. Upsampling should not be carried out on sample rates of 88.2kS/s or higher, simply because there is no benefit, only disadvantages: the same disadvantages as discussed in the OP.
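
In its simplest form an interpolation filter is just zero-stuffing followed by a low-pass; a minimal 4x sketch (the function is my own illustration; scipy.signal.resample_poly is the efficient polyphase equivalent):

```python
import numpy as np
from scipy import signal

def upsample4(x, fs):
    """Naive 4x interpolation: zero-stuff, then low-pass away the images."""
    stuffed = np.zeros(4 * x.size)
    stuffed[::4] = x                              # insert 3 zeros per sample
    # FIR low-pass at the *old* Nyquist; the gain of 4 restores the
    # level lost to the inserted zeros.
    taps = signal.firwin(127, fs / 2, fs=4 * fs) * 4
    return signal.lfilter(taps, 1.0, stuffed), 4 * fs
```

The quality of this low-pass (the interpolation filter) is what separates good resamplers from bad ones; the underlying information is unchanged either way.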

I'm wondering if I should go into some detail about how the digital information is reconstructed to the analogue signal. mmmm .... What do you think?

G.

 
Sep 12, 2011 at 2:35 PM Post #29 of 156


Quote:
I'm wondering if I should go into some detail about how the digital information is reconstructed to the analogue signal. mmmm .... What do you think?

G.


Yes please.  I also have a question about conflicting notions of 'bandwidth' relating to the transmission of digital signals between devices using various standards but I fear that might be too OT.
 
 
Sep 12, 2011 at 4:33 PM Post #30 of 156


Quote:
I wonder if you are getting confused? A higher sampling rate does not create any more resolution... [snip: the full reply, quoted from Post #27 above]


We're not talking about the waveform here. A digital reverb does not see a waveform or a sound. It does not see the samples as connected in any way, shape, or form. It applies the process to each individual sample as its own entity. All laws related to waveform capture and reproduction aren't really applicable in digital processing.
 
Here are three samples:

[image: three isolated sample points]

They represent this sine wave portion when traced as such (my line work is bad, so the curve isn't right, my apologies):

[image: the three samples traced as a smooth sine curve]

However, the average digital reverb simply connects the dots into a triangle wave:

[image: the same samples joined by straight lines into a triangle wave]
Digital processors that aren't as hopeless as the one demonstrated above use interpolation, but it is cheap and full of aliasing. The only reverb I have ever seen that doesn't alias without resampling is 2C-Audio's Aether. It's clean for the whole band and has zero aliasing, but processing time can be up to two hours for a 5 minute audio file.
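
To put a number on how badly "connect the dots" behaves inside an effect, here is a small sketch comparing a half-sample delay done by linear interpolation against the exact result, on a 10kHz tone (illustration only; real reverbs differ in detail):

```python
import numpy as np

fs = 44100
n = np.arange(4096)
x = np.sin(2 * np.pi * 10000 * n / fs)            # a 10 kHz tone

# The exact half-sample delay (we can cheat here: we know the signal).
exact = np.sin(2 * np.pi * 10000 * (n - 0.5) / fs)

# "Connect the dots": a half-sample delay by linear interpolation,
# i.e. averaging each sample with its predecessor.
linear = 0.5 * (x + np.roll(x, 1))

err = (linear - exact)[1:]                        # drop the wrapped sample
print("error relative to signal:",
      10 * np.log10(np.mean(exact[1:]**2) / np.mean(err**2)), "dB")
# ~12 dB: at high frequencies the crude "interpolation" audibly dulls
# and smears the signal; proper band-limited interpolation avoids this.
```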
 
