24bit vs 16bit, the myth exploded!

Jul 6, 2018 at 3:29 PM Post #4,862 of 7,175
This article makes it sound to me that when using 24-bit audio instead of 16-bit, the range of loudness that the audio file can describe becomes greater. Why couldn't it instead increase the number of digital increments describing the difference between, say, -4 dB and -3 dB?

Additionally, why couldn't the dynamic range of 16-bit audio be expanded by reducing the number of digital increments per 1 dB change?

Is the dB difference between adjacent integer values (the values the 16 or 24 bits convert to) fixed?

To me it seems wasteful to allow a 16-bit audio file to have a noise floor ~60 dB below the quietest sound in the music (assuming a song with 36 dB of dynamic range). Why can't the lowest sound or output voltage (-36 dB) be set to 0000 0000 0000 0001 and the highest (0 dB) to 1111 1111 1111 1111?


I apologize in advance for posting to such an old forum post. I yearn for a greater understanding of this subject.

Thanks in advance!
 
Jul 6, 2018 at 7:52 PM Post #4,863 of 7,175
This article makes it sound to me that when using 24-bit audio instead of 16-bit, the range of loudness that the audio file can describe becomes greater. Why couldn't it instead increase the number of digital increments describing the difference between, say, -4 dB and -3 dB?

Additionally, why couldn't the dynamic range of 16-bit audio be expanded by reducing the number of digital increments per 1 dB change?

Is the dB difference between adjacent integer values (the values the 16 or 24 bits convert to) fixed?

To me it seems wasteful to allow a 16-bit audio file to have a noise floor ~60 dB below the quietest sound in the music (assuming a song with 36 dB of dynamic range). Why can't the lowest sound or output voltage (-36 dB) be set to 0000 0000 0000 0001 and the highest (0 dB) to 1111 1111 1111 1111?


I apologize in advance for posting to such an old forum post. I yearn for a greater understanding of this subject.

Thanks in advance!
For simplicity, PCM audio is linear. Each bit adds 6 dB of dynamic range. There is no need to increase the dynamic range of 16-bit audio (16 × 6 dB = 96 dB). With dither it is more than enough (perceptual dynamic range up to 120 dB!). 24-bit (24 × 6 dB = 144 dB) is useful in music production, but not needed in consumer audio. Even in the studio, all of that 24-bit dynamic range can't be used, because every electrical device has a higher background noise level.

Remember, half of the 16-bit range is used for negative values of the signal, so the maximum amplitude is 2^15 = 32768 = 0 dBFS. Then 23198 = -3 dBFS and 20675 = -4 dBFS. So there are 2523 "increments" between -4 dBFS and -3 dBFS. This number of increments isn't what matters, though; the level of quantization noise/dither is.
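If you want to check those numbers yourself, here's a small Python sketch (assuming signed 16-bit PCM with full scale at 2^15 counts; `dbfs_to_counts` is just a helper name for this post):

```python
# Quick check of the numbers above, assuming signed 16-bit PCM
# with full scale = 2**15 = 32768 counts.
FULL_SCALE = 2 ** 15

def dbfs_to_counts(dbfs):
    """Nearest integer sample value for a given dBFS level."""
    return round(FULL_SCALE * 10 ** (dbfs / 20))

print(dbfs_to_counts(-3))                       # 23198
print(dbfs_to_counts(-4))                       # 20675
print(dbfs_to_counts(-3) - dbfs_to_counts(-4))  # 2523 "increments"
```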

There is µ-law/A-law 8-bit audio (a companding algorithm) used in digital telecommunication that does what you propose. It results in quantization noise that fluctuates with the signal: the louder the signal, the louder the noise, and vice versa. This was an early form of perceptual audio encoding.
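For the curious, here's a rough Python sketch of the µ-law idea (µ = 255; the function names are made up for illustration, and real G.711 codecs use a segmented approximation rather than the raw logarithm). It shows the round-trip error shrinking for quiet samples and growing for loud ones, the fluctuating noise described above:

```python
import math

MU = 255  # the µ value used in North American telephony

def mulaw_encode(x):
    # compress a sample in [-1, 1] into the companded domain
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_decode(y):
    # expand a companded value back to a linear sample
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize8(v):
    # uniform 8-bit quantizer on [-1, 1]
    return round(v * 127) / 127

def roundtrip_error(x):
    linear = abs(x - quantize8(x))
    companded = abs(x - mulaw_decode(quantize8(mulaw_encode(x))))
    return linear, companded

print(roundtrip_error(0.01))   # quiet sample: companded error much smaller
print(roundtrip_error(0.47))   # loud sample: companded error larger than linear
```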
 
Jul 6, 2018 at 11:22 PM Post #4,864 of 7,175
For simplicity, PCM audio is linear. Each bit adds 6 dB of dynamic range. There is no need to increase the dynamic range of 16-bit audio (16 × 6 dB = 96 dB). With dither it is more than enough (perceptual dynamic range up to 120 dB!). 24-bit (24 × 6 dB = 144 dB) is useful in music production, but not needed in consumer audio. Even in the studio, all of that 24-bit dynamic range can't be used, because every electrical device has a higher background noise level.

Remember, half of the 16-bit range is used for negative values of the signal, so the maximum amplitude is 2^15 = 32768 = 0 dBFS. Then 23198 = -3 dBFS and 20675 = -4 dBFS. So there are 2523 "increments" between -4 dBFS and -3 dBFS. This number of increments isn't what matters, though; the level of quantization noise/dither is.

There is µ-law/A-law 8-bit audio (a companding algorithm) used in digital telecommunication that does what you propose. It results in quantization noise that fluctuates with the signal: the louder the signal, the louder the noise, and vice versa. This was an early form of perceptual audio encoding.
Good explanation. PCM is linear.

µ-law/A-law are not "perceptually coded" at all, just companded, which is why they don't work very well for high quality audio.
 
Jul 7, 2018 at 6:04 AM Post #4,865 of 7,175
[1] This article makes it sound to me that when using 24-bit audio instead of 16-bit, the range of loudness that the audio file can describe becomes greater.
[2] Why couldn't it instead increase the number of digital increments describing the difference between, say, -4 dB and -3 dB?
[3] Additionally, why couldn't the dynamic range of 16-bit audio be expanded by reducing the number of digital increments per 1 dB change?
[3a] Is the dB difference between adjacent integer values (the values the 16 or 24 bits convert to) fixed?
[3b] To me it seems wasteful to allow a 16-bit audio file to have a noise floor ~60 dB below the quietest sound in the music (assuming a song with 36 dB of dynamic range).
[3c] Why can't the lowest sound or output voltage (-36 dB) be set to 0000 0000 0000 0001 and the highest (0 dB) to 1111 1111 1111 1111?
[4] I apologize in advance for posting to such an old forum post. I yearn for a greater understanding of this subject.

1. Correct.

2. That IS exactly what it does! The basic concept (or mental image) of how digital audio works is actually pretty simple, once you've got your head around it. The difficult part about understanding the basics is "getting your head around it" in the first place, because it's based on some relatively complex maths (proposed 90 years ago and proven 70 years ago) and can be somewhat counter-intuitive in layman's terms. In other words, without a high level of education in maths, there are certain aspects of how digital audio works that you just have to accept. With this in mind, I'll try to explain using your examples and terminology in order to help you "get your head around it".

At the moment, your difficulty appears to be that you are thinking about the "range of loudness" and the "number of digital increments" as two different things, that it's possible to use a given number of bits to either describe the "range of loudness" or describe the "number of digital increments (between say -4dB and -3dB)". What you need to get your head around is that in effect they are the same thing: the end result in digital audio of increasing "the number of digital increments" is a larger "range of loudness". Remember that all sound is just a sine wave (or combination of sine waves) and that the dB value of a sine wave constantly varies with time; at every instant in time the dB value is different. Intuition would therefore suggest that to get a perfect measurement/description of a sine wave we would need an infinite number of measuring points (in time) and an infinite number of measurement bits/values (to describe the infinite number of dB values). Obviously that's impossible, and in effect our intuition is letting us down, because that's not what digital audio attempts to do. Digital audio approaches the problem from a different angle, and that's why we have to dump our intuition and "get our head around" that different angle.

Instead of trying to get a perfect measurement/description of a sine wave in the first place, digital audio works on the principle of deliberately taking imperfect measurements to start with and then using maths to predict and effectively correct those imperfections/errors. In effect, the difference between the actual dB values of our sine wave and our measured/assigned digital values (the imperfections/errors) is converted into white noise. In other words, what we end up with is a PERFECT description of our sine wave/s plus noise, and this is true no matter how many digital bits/values we have available, even with only one bit (two values)! With only two digital values to represent an infinite number of actual dB values we're obviously going to get a huge amount of imperfection/error, which means we'll end up with a perfect description of our sine wave/s and a huge amount of noise. As we increase our number of digital bits/values, we decrease the amount of imperfection/error and therefore end up with (as always) a perfect description of our sine wave/s, plus a decreasing amount of noise. With this in mind, try reading the OP again; it will make a lot more sense.
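One way to see the "perfect signal plus noise" idea in action is a toy dither experiment, sketched below in Python (assumptions: a 4-bit quantizer on a ±1 range and TPDF dither of ±1 LSB; this is only an illustration, not how real converters are built). A value smaller than half a quantization step vanishes entirely without dither, but with dither it survives as signal plus noise:

```python
import random

random.seed(0)
BITS = 4
STEP = 1 / 2 ** (BITS - 1)   # one quantization step ("LSB") for a [-1, 1] range

def quantize(x):
    return round(x / STEP) * STEP

x = 0.3 * STEP               # a level well below half a step
print(quantize(x))           # 0.0: with plain quantization the value is simply lost

N = 100_000
# TPDF dither: the sum of two uniform randoms, +/-1 LSB peak
noisy = [quantize(x + (random.random() - random.random()) * STEP) for _ in range(N)]
mean = sum(noisy) / N
print(mean)                  # close to 0.0375: the sub-LSB value survives as signal + noise
```

Averaging here stands in for what your ear and a reconstruction filter do: the error has been turned into benign noise rather than a deterministic loss of the signal.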

3. Hopefully, after reading the above and then the OP again, you can answer your subsequent questions yourself, but just in case: reducing the "number of digital increments per 1dB change" reduces the accuracy of the values stored in those "increments" or, put the other way around, increases the amount of imperfection/error (the amount of noise we end up with in addition to our perfect sine wave/s) and therefore decreases our dynamic range.
3a. I'm not sure I understand the question. The digital integer values stored are fixed values but what comes out of your DAC is not; it is a virtually perfect reconstruction of the (constantly varying) originally digitised sine wave/s plus an amount of noise, defined by the number of bits or by the analogue noise floor of your particular DAC.
3b. In a sense you're absolutely correct, but the idea with 16bit audio is that the digital noise floor is always below any of the other noise floors present, such as: the noise floor of your reproduction equipment (amp, speakers or HPs for example), the noise floor of your listening environment and the noise floor of the recording itself (the recording venue, microphones, etc.). While it's true that many pieces of popular music only have a dynamic range of around 36dB or so and therefore 6 bits would be sufficient, other types of music (and other audio content in general) can have a significantly greater dynamic range, up to around 60dB or even 70dB or so in a few cases.
3c. I suppose in theory you could, but what would you gain from doing this? You can't get any more perfect a description of the sine waves because it's already perfect, and the digital noise floor with 16bit is already below audibility. All you would do is effectively eliminate the quiet parts of other pieces of music/audio content.

4. No problem, that's why I started this thread in the first place. Please be aware that my explanation above is a simplification, it doesn't take into account "noise-shaping" and some other considerations but those other considerations aren't going to make much sense unless you've got your head around the basics first.

G
 
Jul 7, 2018 at 8:51 AM Post #4,866 of 7,175
This article makes it sound to me that when using 24-bit audio instead of 16-bit, the range of loudness that the audio file can describe becomes greater. Why couldn't it instead increase the number of digital increments describing the difference between, say, -4 dB and -3 dB?

Additionally, why couldn't the dynamic range of 16-bit audio be expanded by reducing the number of digital increments per 1 dB change?

Is the dB difference between adjacent integer values (the values the 16 or 24 bits convert to) fixed?

To me it seems wasteful to allow a 16-bit audio file to have a noise floor ~60 dB below the quietest sound in the music (assuming a song with 36 dB of dynamic range). Why can't the lowest sound or output voltage (-36 dB) be set to 0000 0000 0000 0001 and the highest (0 dB) to 1111 1111 1111 1111?


I apologize in advance for posting to such an old forum post. I yearn for a greater understanding of this subject.

Thanks in advance!
to add the noob view to what the others said.
PCM is cool because a given sample correlates easily with an output voltage for the analog signal coming out of the DAC. let's say the first bit can be 0 or 1V, the second bit would code for 0 or 0.5V (half the voltage, giving us -6dB), the third bit codes for half the previous value, and so on.
so one way to look at this makes you and me realize that more bits only let us code quieter and quieter signals. when we add more bits beyond 16, we're pretty much giving the background noise much better resolution ^_^. that doesn't seem like the best use of bits, yet in some ways it is.
here are other ways to look at this and draw different conclusions:
for example, if we wish to improve the resolution of the musical content, take my example and think about expressing the signal with 3 bits, then look at what more bits would do (I'm not bothering with how we'd need to code for negative values too on a sine wave, just so that the example is dead simple).
let's say we want the code for 0.4V. with 3 bits, 0.4V could be coded using only the second bit to get 0.5V. that approximation isn't great. increasing the resolution between those first bits, like you're suggesting, is actually what extra bits do in PCM. when you add an extra bit you can now code for 1V, 0.5V, 0.25V, and 0.125V and turn any combination ON anytime. now we can express our 0.4V as 0.25+0.125=0.375 instead of getting 0.5V with our 3-bit code. and the more bits, the easier it becomes to zero in on the desired value.
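that 0.4V walk-through can be sketched in a few lines of Python (a hypothetical greedy/successive-approximation helper for illustration, not anything a real DAC runs):

```python
def best_code(target, bits):
    """Greedy binary-fraction approximation: bit k contributes 1 V / 2**k."""
    value = 0.0
    for k in range(bits):
        weight = 1.0 / 2 ** k          # 1 V, 0.5 V, 0.25 V, ...
        if value + weight <= target:   # keep the bit only if it doesn't overshoot
            value += weight
    return value

for bits in (4, 8, 16):
    print(bits, best_code(0.4, bits))  # 4 bits give 0.375, as in the example above
```

each extra bit halves the worst-case gap to the target, which is the "zeroing in" described above.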
so in practice extra bits are already increasing the resolution within the audio content, as you suggest.

another way to look at it is with the additive properties of waves. we can visualize the perfect audio signal, and say that every variation from it is simply another sound being added to the perfect one. the obvious idea for this would be one signal as music and any variation from that being noise. if you increase the resolution so that you can code for a quieter noise, the result is going to look closer to the perfect music signal. it's like turning down the second source of sound; what's left is a cleaner first source. in that respect, just having a low enough quantization noise is giving the music higher resolution.
even if we only cared about not hearing that noise, we would probably wish to keep maybe 11 or 12 bits. something like 36dB (about 6 bits) wouldn't do for an audibly clean background. you want to be sure that the quantization noise (from the value of the lowest bit) is so far below the music that we won't notice it, including when the music is not stuck at 0dB all the time. all in all, when you start considering pretty extreme examples, and using replay gain or other digital attenuation like changing the volume on the computer, if you still wish for the quantization noise to go unnoticed, you probably won't end up too far from 16bit in a PCM system.


now what I talked about is the correlation between PCM and the voltage amplitude of the signal coming out of the DAC. because input and output do correlate, we can still talk as if a DAC were a perfect R2R design and get the right results. but in truth most DACs nowadays don't work that way and instead change the amplitude a great many times between each sample. so in practice modern DACs are not using the PCM coding of the file, not as is anyway.
 
Jul 7, 2018 at 9:39 AM Post #4,867 of 7,175
µ-law/A-law are not "perceptually coded" at all, just companded, which is why they don't work very well for high quality audio.

You just NEED to discredit people all the time

Not the way, for example, mp3s are, but the coding allows increased perceptual quality for speech, for which it was developed. The dynamic range of speech exceeds linear 8 bit, and µ-law/A-law coding allows a bigger dynamic range to be fit into 8 bits at the expense of increased quantization noise, which is less harmful than a lack of dynamic range. Hence better perceptual sound quality = perceptual coding of a kind, even if totally different from the perceptual coding methods developed for music/high quality audio.

You know I am right with this so just let it be, ok?
 
Jul 7, 2018 at 10:06 AM Post #4,868 of 7,175
Let me preface my reply by saying I am very surprised right now, this is the greatest number of intelligent responses I have ever gotten from posting in a forum before, EVER. Even posting in forums about circuits, I haven't had responses like these before. Thank you!

to add the noob view to what the others said.
PCM is cool because a given sample correlates easily with an output voltage for the analog signal coming out of the DAC. let's say the first bit can be 0 or 1V, the second bit would code for 0 or 0.5V (half the voltage, giving us -6dB), the third bit codes for half the previous value, and so on.
so one way to look at this makes you and me realize that more bits only let us code quieter and quieter signals. when we add more bits beyond 16, we're pretty much giving the background noise much better resolution ^_^. that doesn't seem like the best use of bits, yet in some ways it is.

I think this was the explanation that finally cracked the code for me. Let me take your example and use it to explain what I was thinking:

Let's use the 2-bit audio: we have 11=1.5V, 10=1V, 01=0.5V, and 00=0V. My thoughts were, why couldn't you code your audio file so that... wait, is "audio file" where "audiophile" came from???... code your audio file so that 00 is 0.2V instead. That is what I was sort of thinking. Because of the way it was explained, I was thinking that the last bits were being "assigned" smaller sound levels as more bits were added to the front of the bit value. But what you have explained has finally gotten me to think of this in what I think is the correct way: that adding more bits simply makes those last bits quieter, and there's no way around that. 0000 0000 0000 0000 = 0V, and 0000 0000 0000 0000 0000 0000 = 0V, and that couldn't be changed. Now I am understanding this. I will read the OP's post again now.

From the original post: "So, 24bit does add more 'resolution' compared to 16bit but this added resolution doesn't mean higher quality, it just means we can encode a larger dynamic range." and "The only difference between 16bit and 24bit is 48dB of dynamic range (8bits x 6dB = 48dB) and nothing else."

Now I do understand why 24 bit can, and always will encode a larger dynamic range than 16 bit, but I'm not so clear on why the added resolution does not mean higher quality. I get that with dither 16 bit can record an essentially perfect waveform with some noise, but wouldn't 24 bit with dither still be better than 16 bit with dither?

With your replies and re-reading of OP's post, I've been able to understand why adding bit depth does increase the dynamic range and that wouldn't change.

But does it not also increase the definition with which each note is described? When I was learning A to D stuff for chemical instrumentation, we were essentially taught that more bits are always better, because you get closer and closer to describing the true value of your analogue voltage using bits. Is this simply not true past a certain point in an audio signal? Is this another example of diminishing returns? Even if the difference is imperceptible to most listeners most of the time, is it not possible that say a uniquely distorted ~14kHz note could come through your system more similar to the way the artist intended using 24bit vs using 16 bit? Even if you couldn't notice a difference in quality? Say, maybe you don't really know which distortion is higher quality, but maybe if you knew the original sound, you could notice?







P.S. I have learned up to calculus 3 (3D calc) and differential equations, and I learned about the Nyquist frequency and how it works when learning about instrumentation for chemical spectroscopy & spectrometry. If this qualifies me to handle a higher-level audio understanding, please do throw it at me!



P.P.S. What are your thoughts on this:

 
Jul 7, 2018 at 12:25 PM Post #4,869 of 7,175
Now I do understand why 24 bit can, and always will encode a larger dynamic range than 16 bit, but I'm not so clear on why the added resolution does not mean higher quality. I get that with dither 16 bit can record an essentially perfect waveform with some noise, but wouldn't 24 bit with dither still be better than 16 bit with dither?

No, 16bit is already perfect; there is no "better" than perfect. In fact 1bit is perfect, just with a lot more noise. This is the basic tenet of the Nyquist theory presented 90 years ago and is what digital audio is based on. The added "resolution" of 24bit is what creates the larger dynamic range. I think the difficulty here might be that the term "resolution" in digital audio is marketed as meaning more or higher "quality" but that's not true: more "resolution" means more dynamic range, not more quality, because more quality cannot exist.

G
 
Jul 7, 2018 at 12:58 PM Post #4,870 of 7,175
to add the noob view to what the others said.
PCM is cool because a given sample correlates easily with an output voltage for the analog signal coming out of the DAC. let's say the first bit can be 0 or 1V, the second bit would code for 0 or 0.5V (half the voltage, giving us -6dB), the third bit codes for half the previous value, and so on.
so one way to look at this makes you and me realize that more bits only let us code quieter and quieter signals. when we add more bits beyond 16, we're pretty much giving the background noise much better resolution ^_^. that doesn't seem like the best use of bits, yet in some ways it is.
here are other ways to look at this and draw different conclusions:
for example, if we wish to improve the resolution of the musical content, take my example and think about expressing the signal with 3 bits, then look at what more bits would do (I'm not bothering with how we'd need to code for negative values too on a sine wave, just so that the example is dead simple).
let's say we want the code for 0.4V. with 3 bits, 0.4V could be coded using only the second bit to get 0.5V. that approximation isn't great. increasing the resolution between those first bits, like you're suggesting, is actually what extra bits do in PCM. when you add an extra bit you can now code for 1V, 0.5V, 0.25V, and 0.125V and turn any combination ON anytime. now we can express our 0.4V as 0.25+0.125=0.375 instead of getting 0.5V with our 3-bit code. and the more bits, the easier it becomes to zero in on the desired value.
so in practice extra bits are already increasing the resolution within the audio content, as you suggest.

another way to look at it is with the additive properties of waves. we can visualize the perfect audio signal, and say that every variation from it is simply another sound being added to the perfect one. the obvious idea for this would be one signal as music and any variation from that being noise. if you increase the resolution so that you can code for a quieter noise, the result is going to look closer to the perfect music signal. it's like turning down the second source of sound; what's left is a cleaner first source. in that respect, just having a low enough quantization noise is giving the music higher resolution.
even if we only cared about not hearing that noise, we would probably wish to keep maybe 11 or 12 bits. something like 36dB (about 6 bits) wouldn't do for an audibly clean background. you want to be sure that the quantization noise (from the value of the lowest bit) is so far below the music that we won't notice it, including when the music is not stuck at 0dB all the time. all in all, when you start considering pretty extreme examples, and using replay gain or other digital attenuation like changing the volume on the computer, if you still wish for the quantization noise to go unnoticed, you probably won't end up too far from 16bit in a PCM system.


now what I talked about is the correlation between PCM and the voltage amplitude of the signal coming out of the DAC. because input and output do correlate, we can still talk as if a DAC were a perfect R2R design and get the right results. but in truth most DACs nowadays don't work that way and instead change the amplitude a great many times between each sample. so in practice modern DACs are not using the PCM coding of the file, not as is anyway.

Or, one could compress, limit, and make-up gain the audio to fit nicely into the MSB! That tharez 'yoozin all the bits' for yah! lol
 
Jul 7, 2018 at 1:35 PM Post #4,871 of 7,175
No, 16bit is already perfect; there is no "better" than perfect. In fact 1bit is perfect, just with a lot more noise.

I think I understand the application of Nyquist as it relates to sampling rate, but I'm having a little difficulty understanding the '1 bit' statement as it relates to bit depth. Recording amplitude in one bit would allow you to know if signal was present or not but I don't see how it could do any more than that, and certainly not result in a recognizable copy of the original modulated sine wave. What am I missing?
 
Jul 7, 2018 at 1:46 PM Post #4,872 of 7,175
You just NEED to discredit people all the time

Not the way, for example, mp3s are, but the coding allows increased perceptual quality for speech, for which it was developed. The dynamic range of speech exceeds linear 8 bit, and µ-law/A-law coding allows a bigger dynamic range to be fit into 8 bits at the expense of increased quantization noise, which is less harmful than a lack of dynamic range. Hence better perceptual sound quality = perceptual coding of a kind, even if totally different from the perceptual coding methods developed for music/high quality audio.

You know I am right with this so just let it be, ok?
Here’s what perceptual coding is:
https://en.m.wikipedia.org/wiki/Perceptual_audio_coder

µ-law/A-law is not based on perception at all. So I guess I don’t know you’re right about this after all.

I’m not discrediting anyone, I’m adding accurate information.
 
Jul 7, 2018 at 1:53 PM Post #4,873 of 7,175
Recording amplitude in one bit would allow you to know if signal was present or not but I don't see how it could do any more than that, and certainly not result in a recognizable copy of the original modulated sine wave. What am I missing?

Essentially, with only 1 bit/2 values to play with, you would only be able to encode whether the amplitude of the signal being digitized is increasing or decreasing. Obviously, this is highly imperfect or, to put it another way, we have a great deal of error (resulting in a perfect waveform plus a great deal of noise). However, using aggressive noise-shaped dither, all that noise can be placed into the ultrasonic frequency range, assuming a high sample rate and therefore a large frequency range into which to redistribute that huge amount of noise. BTW, this isn't just theory, Sony actually produced such a digital audio format/product nearly 20 years ago: it's called SACD.
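To make the 1-bit idea concrete, here's a toy first-order sigma-delta modulator in Python (a hypothetical sketch, far simpler than the higher-order modulators real DSD/SACD encoders use): the output is only ever ±1, yet its average tracks the input. The leftover error is noise that the loop shapes toward high frequencies, where a filter (or your ears) can discard it.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta: integrate the input/feedback error, output its sign."""
    out, acc, y = [], 0.0, 0.0
    for x in samples:
        acc += x - y                   # accumulate error vs. the previous 1-bit output
        y = 1.0 if acc >= 0 else -1.0  # the only two output values
        out.append(y)
    return out

bits = sigma_delta_1bit([0.3] * 10_000)   # a steady input level of 0.3
print(sum(bits) / len(bits))              # the 1-bit stream averages out near 0.3
```

With a heavily oversampled music signal instead of a constant, a low-pass filter over the bitstream recovers the waveform, which is exactly the "perfect signal plus (shaped) noise" point.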

G
 
Jul 7, 2018 at 1:55 PM Post #4,874 of 7,175
Let's use the 2-bit audio: we have 11=1.5V, 10=1V, 01=0.5V, and 00=0V. My thoughts were, why couldn't you code your audio file so that... wait, is "audio file" where "audiophile" came from???... code your audio file so that 00 is 0.2V instead. That is what I was sort of thinking.

from a "language" perspective, of course we could assign any value to anything we want and start wherever we like. but we have to consider the practical application. how would we implement that in the DAC? if "no signal" sits at 0.2V, does that mean we have DC voltage on every silent passage?
in practice, if a DAC can output 2V max, that value will be assigned 0dB (loudest signal) and we go down from there in 6dB increments for each bit value, as low as the DAC can go. I'm not sure how we could do it differently. if we took a song where the lowest signal is at -36dB, and somehow 0dB was assigned to the max output and -36dB to the LSB value, that would result in quantization noise at -36dB, which really wouldn't improve fidelity.


I think I understand the application of Nyquist as it relates to sampling rate, but I'm having a little difficulty understanding the '1 bit' statement as it relates to bit depth. Recording amplitude in one bit would allow you to know if signal was present or not but I don't see how it could do any more than that, and certainly not result in a recognizable copy of the original modulated sine wave. What am I missing?
the obvious example of what he means is DSD. you code in 1 bit but can achieve the equivalent of 24/96 or even higher without too much difficulty. it's one of the best examples showing that whatever we have is, in the end, the right accurate signal plus some noise. when we have a way to push the noise around, what's left is the proper accurate signal.
of course if you just code stuff in 1 bit and reconstruct it like that, you just get noisy crap. it can't work on its own as a one-step-does-all, but then again few digital designs do.
 
Jul 7, 2018 at 3:25 PM Post #4,875 of 7,175
Here’s wha perceptual codinhg is:
https://en.m.wikipedia.org/wiki/Perceptual_audio_coder

µ-law/A-law Is not based on perception at all. So I guess I don’t know you’re right about this after all.

I’m not discrediting anyone, I’m adding accurate information.

It's semantics, bro! It is based on perception, because µ-law/A-law coding increases the intelligibility of speech. What is inaccurate about that?

Is it my fault µ-law/A-law is not called what it is for historical reasons? The much more advanced coders that came later don't change that.

"μ-law encoding effectively reduced the dynamic range of the signal, thereby increasing the coding efficiency while biasing the signal in a way that results in a signal-to-distortion ratio that is greater than that obtained by linear encoding for a given number of bits. This is an early form of perceptual audio encoding."

https://en.wikipedia.org/wiki/Μ-law_algorithm
 
