24bit vs 16bit, the myth exploded!
Nov 5, 2013 at 6:48 PM Post #1,396 of 7,175
Great post, gregorio. There are a couple of things I'd like to add about resolution, though. When you quantise an analogue signal, the noise introduced can actually be non-linear. This distortion can be audible. To prevent this, we can randomise the quantisation errors so that the resulting noise is random and spread uniformly across the frequency spectrum. This way, the noise introduced (called white noise) is linear. This process is known as dithering. Going even further, we can distribute this noise in varying amounts across different frequencies. This exploits the fact that we're more sensitive to some frequencies than others, so the frequencies we're more sensitive to should carry less noise than those we're less sensitive to. The total amount of noise is still the same; only the distribution is different. This process is known as noise shaping. At this point, you can expect the (white) noise to be well below the threshold of hearing at sane volumes.
 
Nov 5, 2013 at 7:11 PM Post #1,397 of 7,175
He mentioned noise shaping and dither, but that part could indeed be a bit more detailed.
 
 
 When you quantise an analogue signal, the noise introduced can actually be non-linear. 

This sounds confusing.
 
The problem is correlation between the input signal and quantization noise resulting in non-linear distortion. Dither is used to de-correlate the quantization noise from the input signal, resulting in low-level white noise instead of low-level distortion products.
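To make the de-correlation concrete, here's a small numpy sketch (my own, not from the thread; all values are illustrative): a low-level sine quantised without dither leaves a distinct harmonic spike (distortion correlated with the input), while TPDF dither turns the same error energy into broadband noise.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 16384
k0 = 100                        # fundamental lands exactly on FFT bin 100
x = 1.2 * np.sin(2 * np.pi * k0 * np.arange(n) / n)   # amplitude: 1.2 LSB

# Quantise to integer steps (LSB = 1.0 here for simplicity)
q_plain = np.round(x)                    # no dither: error correlates with x
tpdf = rng.random(n) - rng.random(n)     # TPDF dither, +/- 1 LSB wide
q_dith = np.round(x + tpdf)              # dithered: error becomes white noise

# Compare the energy at the 3rd harmonic (bin 300): a clear distortion
# spike without dither, buried in broadband noise with it
h3_plain = abs(np.fft.rfft(q_plain)[3 * k0])
h3_dith = abs(np.fft.rfft(q_dith)[3 * k0])
```

The dithered version has more total noise, but no component of it tracks the input signal, which is the whole point of dithering.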
 
Nov 5, 2013 at 7:30 PM Post #1,398 of 7,175
 ... Furthermore, once we get to 192k sampling rates and beyond, it is impossible to correctly filter them according to the demands of the Nyquist Theorem. While it's extremely unlikely that this will result in any audible problems, it is, in theory at least, lower digital fidelity. ...

 
I think it was a very good post. There is one part I am struggling with, quoted above. How is it impossible to filter a very high sampling rate according to the demands of the Nyquist Theorem? 
 
In any case, we don't go to 192 kHz in order to try and record signals up to 96 kHz. We do so in order that we can use a simpler, lower order, low-pass anti-aliasing filter.
 
Nov 5, 2013 at 7:48 PM Post #1,399 of 7,175
  He mentioned noise shaping and dither, but that part could indeed be a bit more detailed.

Yes, and also that quantisation on its own introduces non-linear noise. The point is there's more to it than "the noise in 16-bit audio is inaudible".
 
Quote:
This sounds confusing.
 
The problem is correlation between the input signal and quantization noise resulting in non-linear distortion. Dither is used to de-correlate the quantization noise from the input signal, resulting in low-level white noise instead of low-level distortion products.

What's confusing about it? We can indeed say that the quantisation noise is non-linear.
 
Nov 5, 2013 at 10:51 PM Post #1,400 of 7,175
  What's confusing about it? We can indeed say that the quantisation noise is non-linear.

 
What is non-linear noise? What's the opposite?
 
Nov 6, 2013 at 1:35 AM Post #1,401 of 7,175
   
What is non-linear noise? What's the opposite?

People sometimes refer to white noise as linear due to its flat power spectral density - a line. Noise is neither "linear" nor "non-linear"; it is characterised by its statistics. Otherwise it's not, strictly speaking, noise, because then we could simply remove it.
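To illustrate what "flat power spectral density" means in practice, a quick numpy sketch (my own, with arbitrary values): averaging white noise's spectral power over coarse bands gives roughly the same level everywhere, i.e. "a line".

```python
import numpy as np

rng = np.random.default_rng(1)
noise = rng.standard_normal(1 << 16)   # Gaussian white noise, 65536 samples

# Average the power spectrum over 32 coarse bands of 1024 bins each:
# "flat" means every band carries roughly the same power
spec = np.abs(np.fft.rfft(noise)) ** 2
bands = spec[1:32769].reshape(32, 1024).mean(axis=1)
flatness = bands.max() / bands.min()   # close to 1 for white noise
```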
 
Nov 6, 2013 at 4:42 AM Post #1,402 of 7,175
even the best and most dynamic of SACD have a dynamic range of no more than about 60dB


Hmm, that doesn't sound right (no pun intended). Surely a noise floor at -60dB would be audible. Or am I misunderstanding?




The SACD format is capable of delivering a dynamic range of 120 dB from 20 Hz to 20 kHz […] With appropriate low-pass filtering, a frequency response of 20 kHz can be achieved along with a dynamic range of nearly 120 dB, which is about the same dynamic range as PCM audio with a resolution of 20 bits.


http://en.wikipedia.org/wiki/Direct_Stream_Digital
 
Nov 6, 2013 at 7:56 AM Post #1,403 of 7,175
  He mentioned noise shaping and dither, but that part could indeed be a bit more detailed.

 
When I started this thread, the idea was to make as simple an explanation as possible, so it could be understood by Head-Fi'ers who may not have much interest in the deep technical detail. My last post continued in the same vein. There's a broad spectrum of people on this site and it's a tough ask to write an article which covers all of them. In my last post I tried to balance as little detail as possible with enough detail to avoid rendering what I wrote too inaccurate.
 
Quote:
  When you quantise an analogue signal, the noise introduced can actually be non-linear. This distortion can be audible. To prevent this, we can randomise the quantisation errors so that the resulting noise is random and spread uniformly across the frequency spectrum. This way, the noise introduced (called white noise) is linear. This process is known as dithering. Going even further, we can distribute this noise in varying amounts across different frequencies. This exploits the fact that we're more sensitive to some frequencies than others, so the frequencies we're more sensitive to should carry less noise than those we're less sensitive to. The total amount of noise is still the same; only the distribution is different.

 
I'm not sure I would use the term linear or non-linear in this context; correlated or de-correlated would perhaps be more accurate and less confusing. The act of dithering is the act of de-correlating the quantisation error, which is essentially just a fancy and more precise way of saying the errors are turned into random noise. If we're going into the finer detail, I'm not sure the last sentence I've quoted of yours is entirely accurate: generally, noise shaped dither actually slightly increases the total amount of noise but of course significantly decreases the amount of audible noise. However, modern mastering dither processors allow the mastering engineer to select the amount of dither applied as well as (and independently from) the amount of re-distribution (noise shaping) of that noise, so the answer to this point is not entirely clear cut.

Another factor to consider is where we can (or should) apply noise shaped dither. Where and how dither is applied in ADCs and processors can get pretty complex and difficult to understand, not least because the designers and the companies they work for tend not to want to divulge this information. When mixing/mastering, though, we generally do not want to noise shape the dither of every channel of sound in the mix, because if we start summing many channels together, all with their dither noise concentrated in the same frequency band, we may introduce unwanted audible artefacts in broadcast limiters and even when converting to lossy codecs. Generally, when dither is required during mixing, only standard TPDF (non-noise shaped) dither is used; as it's spread evenly across the spectrum there is no build up or concentration of noise in any one frequency band, just a 3dB increase in noise for every doubling of dithered channels summed. We then apply a noise shaped dither as the very final mastering process.

In most mixing environments today, though, the bit depths are so high that you don't need to apply dither while mixing, because even the correlated noise from truncating is still massively below audibility. It's generally only when reducing to 16-bit that correlated noise errors are of any potential concern. It should be noted that even when truncating to 16-bit there's relatively little evidence that this correlated noise (truncation error) can be heard at normal listening levels, so the application of noise shaped dither is to some extent a "playing it safe, just in case" approach.
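Uncorrelated noise sources sum in power, so each doubling of dithered channels raises the noise floor by about 3dB (not 6dB, as correlated signals would). A minimal numpy check, with arbitrary values:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

def rms_db(x):
    """RMS level in dB (arbitrary reference)."""
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)))

one = rng.standard_normal(n)
two = one + rng.standard_normal(n)   # two uncorrelated dither noises summed

gain_db = rms_db(two) - rms_db(one)  # ~3 dB per doubling, since power adds
```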
  There is one part I am struggling with, quoted above. How is it impossible to filter a very high sampling rate according to the demands of the Nyquist Theorem? In any case, we don't go to 192 kHz in order to try and record signals up to 96 kHz. We do so in order that we can use a simpler, lower order, low-pass anti-aliasing filter.

 
Nyquist demands that the signal is band limited. This means applying anti-alias and anti-imaging filters to remove any signal content above the Nyquist point (fs/2). In the case of 16/44.1 it's relatively trivial to accomplish 120dB or more of attenuation in the stop band (the range of frequencies above the Nyquist point) and therefore reduce aliasing to below the digital noise floor. But with 24/192 we have a great deal more processing to accomplish and no additional time in which to accomplish it. At these very high sample rates and bit depths we start hitting the limits of the laws of physics in how fast we can perform the calculations required to implement a filter which reduces aliasing to below the digital noise floor. The only way this is likely to change is with a new paradigm in processing; quantum computing, for example, could in theory solve the problem!

All professional ADCs initially sample at incredibly high rates (many megahertz) but they do so with a greatly reduced bit depth, generally 5 bits or so. In other words, you can have more bandwidth OR more accuracy, but not both! This is borne out in tests and in manufacturers' specifications: at 24/192, alias attenuation is generally only accomplished down to around -80dB, which results in distortion across the entire frequency spectrum, including the audible band! It's unlikely (but not impossible) that this failure to achieve sufficient alias attenuation to fully satisfy the Nyquist Theorem will be audible, but nevertheless this additional distortion does mean that, in theory at least, 24/192 is lower fidelity than 24/96. For the same reason, 24/384 and 32/384 perform even worse than 24/192 and are even lower fidelity. Given the choice, no knowledgeable music recording engineer would ever record at anything higher than 24/96, but they are sometimes not given the choice by the record companies employing them. Unfortunately the audiophile world is driven by marketing more than by fidelity!
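To get a rough feel for the processing argument, here's a sketch using the standard Kaiser-window FIR length estimate (the 120dB target and 2 kHz transition band are my own illustrative choices, not figures from the post): holding the same attenuation and transition width, a 192kHz filter needs several times the taps of a 44.1kHz one, and far more multiply-accumulates per second, because it also has far less time per sample.

```python
import math

def kaiser_taps(atten_db, trans_hz, fs):
    """Kaiser-window FIR length estimate for a given stop-band
    attenuation and transition-band width: N ~ (A - 7.95) / (2.285 * dw)."""
    d_omega = 2 * math.pi * trans_hz / fs
    return math.ceil((atten_db - 7.95) / (2.285 * d_omega))

# 120 dB of alias attenuation with a 2 kHz transition band just below fs/2
taps_44 = kaiser_taps(120, 2000, 44100)
taps_192 = kaiser_taps(120, 2000, 192000)

macs_44 = taps_44 * 44100     # multiply-accumulates per second
macs_192 = taps_192 * 192000  # same filter spec, much higher compute cost
```

This is only a back-of-the-envelope model of one filter topology, but it shows why "more processing in no additional time" bites at high rates.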
 
G
 
Nov 6, 2013 at 9:16 AM Post #1,405 of 7,175
Hmm, that doesn't sound right (no pun intended). Surely a noise floor at -60dB would be audible. Or am I misunderstanding?

 
No, you're not misunderstanding, that's exactly the point when mixing and mastering, or at least one of the points! While you don't want listeners to specifically hear the noise floor, you do want them to hear the details of the recording all the way down to the noise floor. This means the noise floor of the recording should ideally sit at roughly the same level as the noise floor of the audience's listening environment. Given that the average home listening environment has an ambient noise floor of very roughly 50dB, a recording with a dynamic range of 60dB would put the loudest peaks of the music at 110dB, which is extremely loud and would be uncomfortable for most people. In reality most people would play the music back quieter and simply not hear any of the details near the recording's noise floor, because they would be considerably quieter than the noise floor of their environment. That's why very few recordings have a dynamic range as wide as 60dB, and a dynamic range of 40dB or less is much more common. Listening to music in something like a moving car, the ambient noise floor is far higher than in an average home listening environment, and therefore reducing the dynamic range of the recording even further is a good thing.
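The arithmetic here can be sketched in a couple of lines (the 50dB ambient figure is the rough estimate used above):

```python
ambient_floor = 50   # dB SPL, rough noise floor of a quiet home listening room
dynamic_range = 60   # dB between a recording's peaks and its noise floor

# For the quietest detail to clear the room's noise floor, the peaks land at:
peak_spl = ambient_floor + dynamic_range   # 110 dB SPL: uncomfortably loud
```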
 
You've quoted the wiki article but you are confusing the container with its contents! Yes, SACD, like CD, is capable of around 120dB of perceptual dynamic range, but for the reasons explained above no one has ever released, or ever would release, a recording with a dynamic range of 120dB, as even 60dB of dynamic range is too much in the majority of cases!
 
G
 
Nov 6, 2013 at 9:41 AM Post #1,406 of 7,175
You've quoted the wiki article but you are confusing the container with its contents!


Well, you made it sound like SACD was technically inferior by quite a large margin, which it isn't. The dynamic range of recordings has nothing to do with the medium, be it SACD, CD, DVD-Audio or Blu-ray, since all of them are technically able to accommodate the largest dynamic range present in recordings.
 
Nov 6, 2013 at 9:49 AM Post #1,407 of 7,175
It's really the way you put it that is confusing. Let me quote you again:

even the best and most dynamic of SACD have a dynamic range of no more than about 60dB


That's still not technically true. The dynamic range is the difference between the loudest signal on the recording (often 0dBFS) and the actual noise floor of the recording, i.e. the noise that's present even when there's no signal (silence), which can easily be the noise floor of the medium itself (-96dB for CD-DA, -120dB for SACD) when the music is generated electronically, for instance.

When the end of a track is fading into silence, I expect that it doesn't abruptly stop at -60dB: that would be audible and would completely defeat the purpose of the effect.
 
Nov 6, 2013 at 10:31 AM Post #1,409 of 7,175
That's still not technically true. The dynamic range is the difference between the loudest signal on the recording (often 0dBFS) and the actual noise floor of the recording, i.e. the noise that's present even when there's no signal (silence), which can easily be the noise floor of the medium itself (-96dB for CD-DA, -120dB for SACD) when the music is generated electronically, for instance.

 
If we are talking about what is technically true then CD has a dynamic range of about 98dB and SACD has a range of about 6dB! However, both use noise shaped dither to achieve a perceptual dynamic range of 120dB. It may be possible to achieve a recording noise floor of -120dBFS with purely electronically generated music, but it wouldn't be easy, mainly due to the nature of generating electronic music and the processing which is usually applied to it. There would be no point in trying to achieve this though, as at any vaguely sensible listening level a recording noise floor at -120dBFS would be many times below the noise floor of the listening environment.
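The "about 98dB" figure follows from the usual rule of thumb for a plain PCM quantiser, roughly 6.02dB per bit plus 1.76dB. A sketch (the 1-bit case is only a crude approximation: DSD's usable range depends entirely on noise shaping, which is why the raw figure is so low):

```python
def quantiser_dr_db(bits):
    """Rule-of-thumb dynamic range of an undithered, un-noise-shaped PCM
    quantiser: full-scale sine vs quantisation noise, ~6.02*bits + 1.76 dB."""
    return 6.02 * bits + 1.76

cd_dr = quantiser_dr_db(16)      # ~98 dB, the figure quoted for CD
dsd_raw_dr = quantiser_dr_db(1)  # ~8 dB for 1 bit, before noise shaping
```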
 
Quote:
I really, really don't want to hear the noise floor. "Quiet" is way above 16 bit quantization noise, let alone 20 or 24 bit.

 
Of course you don't, but presumably you also don't want to miss any of the details in the music because they're below the noise floor of your listening environment? This is why recordings never attempt to get anywhere near the dynamic range limitations of the recording format, as I explained earlier. Sorry if you're still confused; it seems to have been caused by a typo, I intended to write "even the best and most dynamic of SACDs have a dynamic range of no more than about 60dB".
 
G
 
Nov 6, 2013 at 10:50 AM Post #1,410 of 7,175
  Sorry if you're still confused, it seems to have been caused by a typo, I intended to write "even the best and most dynamic of SACDs have a dynamic range of no more than about 60dB".

 
That is the difference between the loudest and quietest parts of the music, but noise can be heard even if it has a lower overall RMS level than the signal (though not by an extreme amount), not least because it has a different spectral distribution and not all of it is masked. Otherwise, those high quality SACDs could be quantised down (without noise shaping) to 11 bits at 44100 Hz and sound the same, which can be proven false with ABX testing (see also this post).
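For reference, a numpy sketch (my own figures, not from the thread) of where an 11-bit noise floor actually sits: TPDF-dithered requantisation to 11 bits leaves a flat noise floor in the region of -66dBFS, close to the 60dB dynamic range figure discussed above.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1 << 16
x = 0.5 * np.sin(2 * np.pi * np.arange(n) / 64)   # test tone, full scale = 1.0

bits = 11
step = 2.0 / 2 ** bits                     # LSB size for 11 bits over -1..+1
tpdf = (rng.random(n) - rng.random(n)) * step
q = np.round((x + tpdf) / step) * step     # TPDF-dithered requantisation

err = q - x
noise_dbfs = 20 * np.log10(np.sqrt(np.mean(err ** 2)))   # roughly -66 dBFS
```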
 
