24bit vs 16bit, the myth exploded!

Aug 22, 2023 at 12:00 PM Post #7,081 of 7,175
To avoid distortion due to quantization error. Instead of a distorted signal we get a distortion-free signal plus uncorrelated noise, which contains all the "problems" of quantization. It is a better, wiser and more pleasant way to swallow quantization errors.

So the purpose of dither is to deal with unwanted noise from quantization errors?
 
Aug 22, 2023 at 12:04 PM Post #7,082 of 7,175
Even at 16 bit, dither makes only a very small difference. Dither isn’t a big thing. It’s a small improvement that in many cases doesn’t make any significant difference in the real world. In my sig there’s a link to a video by Ethan Winer where he demonstrates a track with and without dither. On his website you can download the tracks and listen to them carefully yourself. It’s worth checking out for yourself so you know exactly what degree of difference it makes, but the short answer is that you probably wouldn’t notice that dithering hadn’t been applied with normal music listening. It isn’t a very big difference at all. It’s more of a low level polish than it is a make or break.

I’m not saying you don’t need to use dithering. I’m just saying it doesn’t make a huge difference.
Dither is like Tardigrades: Impressive under the microscope, but we don't really notice it in everyday life. :relieved:
 
Aug 22, 2023 at 12:18 PM Post #7,083 of 7,175
So the purpose of dither is to deal with unwanted noise from quantization errors?
Yes, particularly noise that correlates with the signal (distortion). We trade nasty correlated noise for slightly stronger but less nasty uncorrelated noise. Adding dither to the signal before truncation randomises the quantization errors so they don't correlate with the signal. For example, rounding 1.2 to the closest integer gives 1 every time (correlation!), but if we add a random number (dither) between -0.5 and 0.5 to the signal, we are randomly rounding numbers between 0.7 and 1.7, which 80 % of the time gives 1 and 20 % of the time gives 2. So on "average" the quantized value is 0.8*1 + 0.2*2 = 1.2, which is what we want.
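The rounding arithmetic in that example is easy to check for yourself. A minimal Python sketch (the quantizer and the -0.5 to 0.5 dither range are exactly those of the example; the sample count and seed are my own illustrative choices):

```python
import random

def quantize(x):
    """Plain quantizer: round to the closest integer."""
    return round(x)

def quantize_dithered(x, rng):
    """Add rectangular dither in [-0.5, 0.5] before rounding."""
    return round(x + rng.uniform(-0.5, 0.5))

rng = random.Random(42)
N = 100_000
signal = 1.2

# Without dither every sample rounds to 1: a systematic -0.2 error
# that correlates perfectly with the signal.
plain = [quantize(signal) for _ in range(N)]

# With dither roughly 80 % of samples round to 1 and 20 % to 2, so the
# average comes back to ~1.2: the error is now uncorrelated noise.
dithered = [quantize_dithered(signal, rng) for _ in range(N)]

print(sum(plain) / N)     # exactly 1.0
print(sum(dithered) / N)  # close to 1.2
```

Averaged over many samples the dithered quantizer is unbiased, which is precisely the 0.8*1 + 0.2*2 = 1.2 arithmetic in the post.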
 
Last edited:
Aug 22, 2023 at 12:32 PM Post #7,084 of 7,175
Yes, particularly noise that correlates with the signal (distortion). We trade nasty correlated noise for slightly stronger but less nasty uncorrelated noise. Adding dither to the signal before truncation randomises the quantization errors so they don't correlate with the signal (e.g. rounding 1.2 to the closest integer gives 1 every time (correlation!), but if we add a random number between -0.5 and 0.5 to the signal, we are randomly rounding numbers between 0.7 and 1.7, which 80 % of the time gives 1 and 20 % of the time gives 2). So on "average" the quantized value is 0.8*1 + 0.2*2 = 1.2, which is what we want.

Is it fair to say a tiny compromise in accuracy is exchanged for the benefits of noise reduction?
 
Aug 22, 2023 at 12:54 PM Post #7,085 of 7,175
A higher level of noise isn't more accurate.
 
Aug 22, 2023 at 9:33 PM Post #7,086 of 7,175
Is it fair to say a tiny compromise in accuracy is exchanged for the benefits of noise reduction?
We compromise the noise floor (it gets higher) and in exchange we get rid of distortion. As to accuracy, I'd say no distortion is more accurate. But why not listen yourself? In the attachment you have a comparison of 16 bit reduced to 8 bit (the signal is 2 seconds of a 440 Hz tone at -10 dBFS, including about 0.5 seconds of fade in and fade out):
  • 2 sec of 16 bit
  • 2 sec of 8 bit, no dither
  • 2 sec of 16 bit
  • 2 sec of 8 bit, dither with flat noise
  • 2 sec of 16 bit
  • 2 sec of 8 bit, dither with shaped noise
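A comparison like that attachment can be recreated at home. A hedged sketch assuming NumPy: the -10 dBFS 440 Hz tone and the fades follow the post's description, but the TPDF dither and the 8-bit scaling are my own assumptions, and it produces float arrays rather than a playable file:

```python
import numpy as np

fs = 44100
t = np.arange(2 * fs) / fs
tone = 10 ** (-10 / 20) * np.sin(2 * np.pi * 440 * t)  # -10 dBFS sine

# ~0.5 s linear fade in and fade out, as described in the post.
fade = fs // 2
env = np.ones_like(tone)
env[:fade] = np.linspace(0.0, 1.0, fade)
env[-fade:] = np.linspace(1.0, 0.0, fade)
tone *= env

rng = np.random.default_rng(0)

def to_8bit(x, dither=False):
    """Quantize floats in [-1, 1] to 8-bit steps (returned as floats)."""
    scaled = x * 127.0
    if dither:
        # TPDF dither: sum of two uniforms, +-1 LSB peak amplitude.
        scaled = scaled + rng.uniform(-0.5, 0.5, x.size) \
                        + rng.uniform(-0.5, 0.5, x.size)
    return np.round(scaled) / 127.0

truncated = to_8bit(tone)              # error correlates with the tone
dithered = to_8bit(tone, dither=True)  # error becomes uncorrelated noise
```

Writing `truncated` and `dithered` out as WAV files and listening should reproduce the effect: the undithered version buzzes along with the tone, the dithered one just hisses.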
 


Aug 23, 2023 at 4:45 AM Post #7,087 of 7,175
Is it fair to say a tiny compromise in accuracy is exchanged for the benefits of noise reduction?
No, it’s the other way around, a tiny compromise in the digital noise floor for the benefits of no distortion. In practice though we get both, no distortion AND noise reduction, because we use noise-shaped dither which reduces the noise in the critical hearing band by 20 - 30dB.

You appear to think that the dither noise is more significant than the distortion it eliminates but in fact the opposite is true. Have a look at this image of the frequency domain effects of truncation vs dither:
3_5.png

This is a 1kHz signal: the yellow plot is the frequency result of truncation, the white plot is the same bit reduction using basic (triangular) dither. The horizontal lines (y-axis) are 10dB increments and the x-axis is 0Hz - 20kHz frequency (taken from page 8 of a dithering PDF distributed by iZotope). Note that the distortion products are inharmonically related and would therefore be more noticeable/audible than uncorrelated noise of the same level. Worse still, the distortion products actually peak around 10dB higher than the dither noise. Even worse, these distortion products are deterministic and correlated, so they will sum at +6dB (in the case of mixing/down-mixing), while the dither is random and uncorrelated and will therefore sum at +3dB.
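The kind of spectrum in that figure can be reproduced numerically. A sketch assuming NumPy, using an 8-bit quantizer to exaggerate the effect and plain TPDF (not noise-shaped) dither; the window and scaling are illustrative choices, not iZotope's:

```python
import numpy as np

fs = 48000
n = fs                                    # 1 second, 1 Hz bin spacing
t = np.arange(n) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)    # 1 kHz test tone

bits = 8
lsb = 2.0 ** -(bits - 1)

rng = np.random.default_rng(1)
tpdf = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * lsb

truncated = np.round(x / lsb) * lsb            # no dither
dithered = np.round((x + tpdf) / lsb) * lsb    # TPDF dither

def spectrum_db(sig):
    """Windowed magnitude spectrum in dB (relative scale)."""
    mag = np.abs(np.fft.rfft(sig * np.hanning(n))) / (n / 4)
    return 20 * np.log10(np.maximum(mag, 1e-12))

spec_trunc = spectrum_db(truncated - x)  # error only: discrete spikes
spec_dith = spectrum_db(dithered - x)    # error only: flat noise floor
```

Plotting `spec_trunc` against `spec_dith` gives the same picture the post describes: the undithered error concentrates into discrete spikes that peak well above the dithered noise floor.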

Here is a very short video showing the spectrogram of an original 1kHz signal at 24bit, then truncated to 16bit and finally converted to 16bit using noise-shaped dither. Notice the large number of distortion products with the truncation, then the complete lack of distortion products using noise-shaped dither at the expense of minuscule amount of noise in the critical hearing band and more noise starting around 17kHz, where our hearing is insensitive:

A higher level of noise isn't more accurate.
In this case it is. You really should understand at least the basics of dither if you’re going to make assertions about it! Didn’t you look at the image @71 dB posted before you replied? Dither is summed with the signal, so you end up with the signal plus some noise. How isn’t that more accurate than nothing at all (no signal and no noise)?

G
 
Aug 23, 2023 at 6:18 AM Post #7,088 of 7,175
Edit: i also dont think thousands of people use dither on 24bit->16bit conversion because its "rarely audible"
And you know that how? You mean you just thought up an idea, with no basis in any actual facts but assumed it’s probably true anyway, just because you thought of it?

The actual facts are:

1. Dither/Noise-shaped dither is taught in all digital audio/engineering courses, it’s discussed in all the textbooks, in all the online DAW forums and has been standard practice for 25 years or more.
2. “Rarely audible” (or really “extremely rarely audible”) means you cannot be absolutely sure it won’t be audible. So either you have to go through the recording with a fine-tooth comb to make sure there’s no audible distortion from truncating, which is likely to take quite some time (and not be 100% reliable), or spend a couple of seconds choosing noise-shaped dither instead of truncation and guaranteeing no audible distortion. Which do you think is more efficient/sensible?

In a sense you are right though, thousands of people don’t use dither, it’s more like tens of thousands! The only exceptions are likely to be kids in their bedrooms with no idea what they’re doing or potentially, an actual engineer who has spent considerable time checking and decided not to apply dither. I say “potentially” because I’ve never actually heard of any engineers who do that. Why would they?

G
 
Aug 23, 2023 at 6:42 AM Post #7,089 of 7,175
No, it’s the other way around, a tiny compromise in the digital noise floor for the benefits of no distortion. In practice though we get both, no distortion AND noise reduction, because we use noise-shaped dither which reduces the noise in the critical hearing band by 20 - 30dB.
Here it is good to make clear to all readers that shaped dither still adds noise power compared to the truncation error noise/distortion (no dither), but it is perceptually significantly quieter, which is what counts in consumer audio.

You appear to think that the dither noise is more significant than the distortion it eliminates but in fact the opposite is true. Have a look at this image of the frequency domain effects of truncation vs dither:
3_5.png


This is a 1kHz signal, the yellow plot is the frequency result of truncation, the white plot is the same bit reduction using basic (triangular) dither. The horizontal lines (y-axis) are 10dB increments and the x-axis is 0Hz - 20kHz frequency (taken from page 8 of a dithering PDF distributed by iZotope). Note that the distortion products are inharmonically related and would therefore be more noticeable/audible than uncorrelated noise of the same level but worse still, the distortion products actually peak around 10dB higher than the dither noise. Even worse, these distortion products are deterministic and correlated so they will sum at +6dB (in the case of mixing/down-mixing), while the dither is random and uncorrelated and therefore will sum at +3dB.
Just to make things clearer for everyone: the signal power of the dither is bigger than the signal power of the truncation error, even though the truncation error has much higher peaks above ~700 Hz. Below 1 kHz the truncation error is very "sparse", containing zero power at frequencies lower than ~250 Hz (however, as the truncation error correlates with the signal, its spectrum changes strongly with the signal). The truncation error has a little bit less energy, integrated over the whole 0-22050 Hz bandwidth, than the dither noise, but:

1) Dither is subjectively less annoying than truncation distortion.
2) Dither allows a distortion-free signal even below the noise floor!
3) The spectral shapes of the dither and the truncation error dictate their perceptual loudness.
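That total-energy claim is easy to check numerically. A rough sketch assuming a 16-bit quantizer and TPDF dither (for TPDF the theoretical gap is 10*log10(3) ≈ 4.8 dB: error power LSB²/4 dithered vs LSB²/12 undithered; other dither types differ):

```python
import numpy as np

fs = 48000
n = 1_000_000
x = 0.5 * np.sin(2 * np.pi * 1000.5 * np.arange(n) / fs)  # non-coherent tone

lsb = 2.0 ** -15                       # one 16-bit step
rng = np.random.default_rng(0)
tpdf = (rng.uniform(-0.5, 0.5, n) + rng.uniform(-0.5, 0.5, n)) * lsb

err_trunc = np.round(x / lsb) * lsb - x          # undithered error
err_dith = np.round((x + tpdf) / lsb) * lsb - x  # dithered error

def power_db(e):
    """Mean error power in dB re full scale."""
    return 10 * np.log10(np.mean(e ** 2))

print(power_db(err_trunc))  # about -101 dB (LSB^2/12)
print(power_db(err_dith))   # about -96 dB (LSB^2/4, ~4.8 dB more power)
```

So dither does cost a few dB of total noise power, exactly as stated, but in exchange the error spectrum becomes signal-independent.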
 
Aug 23, 2023 at 8:58 AM Post #7,090 of 7,175
[...] I say “potentially” because I’ve never actually heard of any engineers who do that. Why would they?

May I ask you something off-topic, and very, very hypothetical.

I know dithering is for making things that end up in a 16 bit (etc.) format and nothing else, like CDs.

But suppose I suspected cheap consumer DACs of sometimes being defective. Would you advise taking dithering and truncation to the "user side"?

I mean, if you had a scenario where the Portaudio library is used (by me, e.g. I'm making a video game).

Or something similar, like any audio output API in general. The API, here Portaudio, usually asks for a 32 bit float format to bring sound to the speakers. I don't know if you can set Portaudio to a 16 bit format, but let's assume you can.

So, I need to repeat myself here: on the one hand I would use the most abstract, high level API and not think about what it does. But, as I said, let's assume I don't trust it (or the driver, or the DAC) to do everything right.

Would you advise trying to "be on the safe side" by setting this API to 16 bit and supplying a carefully dithered stream, or is it enough to trust the high level API and the DACs on the market to bring it from 24/32 bit to the speakers without any driver/firmware defects? I mean, there might be some engineers who think truncation to 16 bit without dithering is okay somewhere in the DAC system.

I might even want to create a dithered 16 bit stream and then extend it to 32 bit myself. This seems to make no sense at first, but if I assumed the DAC or the high level API was just truncating it, undithered, from 32 to 16 again, I might want to do this?
 
Aug 23, 2023 at 9:55 AM Post #7,091 of 7,175
But suppose I suspected cheap consumer DACs of sometimes being defective. Would you advise taking dithering and truncation to the "user side"?
I’m not exactly sure what you’re asking. Modern DACs are all 24bit; even cheap DACs use cheap DAC chips that are 24bit. The vast majority oversample and reduce the bit depth to just a handful of bits, but always with noise-shaped dither. There are really only two choices for a DAC manufacturer: use an off-the-shelf DAC chip from one of the few DAC chip manufacturers, all of which operate very well with no artefacts near audibility, or effectively create your own with an FPGA. Only more expensive DACs sometimes don’t have an off-the-shelf DAC chip, so I’d be more worried about the expensive “boutique” DACs than cheap ones!

So I’m not sure why/what dithering and truncation would be done on the “user side”? Do you mean that a user might have the audio output of their computer set to 16bit and therefore your 32bit or 24bit files would get truncated or dithered before being sent to the DAC?

G
 
Aug 23, 2023 at 10:39 AM Post #7,092 of 7,175
So I’m not sure why/what dithering and truncation would be done on the “user side”? Do you mean that a user might have the audio output of their computer set to 16bit and therefore your 32bit or 24bit files would get truncated or dithered before being sent to the DAC?
I'm not sure either, sorry. So, there is music, just existing as a raw 32 bit format in RAM. That's a very likely case when mixing game sounds together.

And I want it to go to the speakers/headphones, stereo. The usual choice would be to ask the high level API for a 32 bit format, for outputting the final game music mix. The high level API (Portaudio) most often guarantees that you can set 32 bit and everything is fine. And when it goes to a 24 bit DAC chip, there's no problem.

It's just that when reading this thread I got the impression that 16 bit could be the right choice too, as the default choice. One usually tends to prefer one of the choices and stay there to keep things simple. There are still situations where 16 bit is even the only format (I believe it was Core Audio on iPhone), but this is surely only an API problem and it's getting sorted out over time; probably it's a thing of the past.

You gave the answer to my whole question already. DACs are 24 bit, there is no real sound problem. I was just too sceptical.

PS: Thank you a lot for taking the time, I appreciate it a lot.
 
Aug 23, 2023 at 10:57 AM Post #7,093 of 7,175
The high level API (Portaudio) most often guarantees that you can set 32 bit and everything is fine. And when it goes to a 24 bit DAC chip, there's no problem.
Yep, even if there is some truncation going on from 32bit to 24bit, the distortion will be way down below -120dB and obviously inaudible.

The only potential issue I could see would be on a device that allows the user to set the output bit depth to 16bit. I would still expect it to be dithered rather than truncated but there’s no way to know for sure unless it’s documented somewhere. Even if it is just truncated to 16bit, it’s very unlikely there will be any audible artefacts but if you think this scenario is possible and want to be certain of no audible artefacts, you could dither (or preferably noise-shape dither) the mixed 32bit stereo output to 16bit before outputting it.
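A minimal sketch of that last suggestion, assuming NumPy; the function name and the 32767 scaling are illustrative choices, not from PortAudio or any particular API, and it uses plain TPDF rather than noise-shaped dither:

```python
import numpy as np

def float_to_int16(x, rng=None):
    """Convert float samples in [-1.0, 1.0] to int16 with TPDF dither."""
    if rng is None:
        rng = np.random.default_rng()
    # TPDF dither, +-1 LSB peak amplitude, in integer-scale units.
    dither = rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
    scaled = x * 32767.0 + dither
    return np.clip(np.round(scaled), -32768, 32767).astype(np.int16)

# Example: a mixed stereo buffer from the game engine, ready for the API.
mix = np.zeros((512, 2), dtype=np.float32)
out = float_to_int16(mix, np.random.default_rng(0))
```

Noise shaping would additionally feed the quantization error back through a filter, but for a game mix plain TPDF dither already guarantees no truncation distortion.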

G
 
Aug 23, 2023 at 11:13 AM Post #7,094 of 7,175
No, it’s the other way around, a tiny compromise in the digital noise floor for the benefits of no distortion. In practice though we get both, no distortion AND noise reduction, because we use noise-shaped dither which reduces the noise in the critical hearing band by 20 - 30dB.

[...]

G


My point in all this is WHY compromise anything at all?

Simply stay in 24 bit.
 
Aug 23, 2023 at 11:21 AM Post #7,095 of 7,175
My point in all this is WHY compromise anything at all?

Simply stay in 24 bit.
A. It’s completely inaudible.
B. It’s compromised anyway, because the noise floor of the recording is going to be way higher than the dither noise.
C. 16bit obviously uses less data/bandwidth.
D. We can’t stay in 24bit, we have to mix and master at considerably more bits than 24bit (typically 64bit float). So we have to get the mix and master down to a distributable format and that might as well be 16bit as there’s no audible difference.

G
 
