24bit vs 16bit, the myth exploded!

xdog · Jul 21, 2014 at 1:45 PM

I think you have the wrong mindset/assumptions.

I have a similar problem with the need for amplification.
For instance, the Sennheiser HD600 reaches 94 dB (according to personalaudio.ru) at ~0.25 V, so technically most smartphones can provide sufficient power
for those headphones (or even the K701), so that the user gets that 85 dB of average volume for modern music
(I'm totally happy with the volume I get from my smartphone on the HD600 with music such as pop, alternative rock, trance...);
then you should be even happier with the 1 V standard (because more can cause hearing damage).
However, some users heavily insist that the HD600 needs an amplifier, and this is completely understandable if you consider that they are listening
to piano/violin duos, which can be recorded at something like -20 dB to -30 dB from the top; that might put the voltage requirement at around 10x compared to my (compressed) music,
which translates into a need for a more or less standard 6 V output amplifier.
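A rough sketch of the voltage arithmetic above, in Python. It assumes the sensitivity figure quoted in the post (94 dB at 0.25 V for the HD600) and the usual 20*log10 relation between voltage ratio and level; the function names are made up for illustration.

```python
import math

# Assumed reference point, taken from the figure quoted above (HD600: 94 dB at 0.25 V).
REF_SPL_DB = 94.0
REF_VOLTS = 0.25

def spl_at(volts, ref_spl_db=REF_SPL_DB, ref_volts=REF_VOLTS):
    """SPL at a given drive voltage, scaling by 20*log10 of the voltage ratio."""
    return ref_spl_db + 20.0 * math.log10(volts / ref_volts)

def volts_for(spl_db, ref_spl_db=REF_SPL_DB, ref_volts=REF_VOLTS):
    """Drive voltage needed to reach a given SPL."""
    return ref_volts * 10.0 ** ((spl_db - ref_spl_db) / 20.0)

print(round(spl_at(1.0), 1))         # level at the 1 V mentioned above
print(round(volts_for(94 + 20), 2))  # voltage for 20 dB more than the reference level
```

Every extra 20 dB of level costs a factor of 10 in voltage, which is where the "10x" for quietly recorded material comes from.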

Other examples:
- I remember that one person was bragging that the HD800 can reach 120 dB with low distortion.
- Please note that headphones can provide a lot of isolation (-20 dB over a wide frequency range is, I would guess, about the best result; with the Takstar HD6000, for example, it is difficult to hear what a person next to you is saying).
- Technically, I would guess that the noise in my room is at a level of around 30 dB (I think this is the noise of my quiet computer fan, which is the noisiest thing in the room);
however, with my Beats Mixr headphones (something like 120 dB @ 1 V) I can hear (playing 24-bit silence @ 1 V) the noise on the STX headphone output (which I think is around -100 dB?), and that something is going on with the X-Fi HD (I think the noise is something like 110 dB); and I had the same feeling about the STX headphone output with some less efficient headphones; so I would guess that 10 dB of noise is audible in silent passages [I don't care about it myself, because most of my music is 'full of content'] [and I mean those noise values are very quiet; the sound of my breathing seemed quite loud]
- Actually, one of my friends has a room with that funny 3D isolating material, so I would guess that he gets below 30 dB of ambient noise.

Nobody is saying that they will listen continuously at 120 dB, but [I'm guessing here; someone more accustomed to orchestral concerts could probably provide better values]
- 120 dB is the maximum theoretical output (see note below)
- 100 dB to 110 dB is the most likely maximum short-period value obtainable [but it would be better to have a little headroom just in case]
- 90 dB is the normal orchestral level
- 70 dB is the level of small duos or string quartets playing soft pieces
- 50 dB is the level of intro sequences
- 20-30 dB is the level of audible echoes and reverberation (the person I mentioned in the story said that they can hear differences only with that kind of hidden content, and not at the normal level of music)

You can play a little bit with the numbers, but the thing is that the 96 dB dynamic range (which most likely should be restated as 48 dB, see the previous post) might not be enough.
16-bit is OK for almost all contemporary music, but it might not be sufficient for classical (especially if someone would like to record everything with the same bit-to-real-dB mapping).

One more analogy [very exaggerated] (please note that this example would have to be made logarithmic, but our hearing also does an exponential->linear conversion):
Here we sell a car which can do 0-200 km/h;
the only problem is that the speed has discrete values.
That is not a real problem, because you have 200 km/h, 199.99 km/h, 199.97 km/h.
Oh, wait, but at the low end the first attainable level is 50 km/h, then 70 km/h, so that might be a problem.
But not really, because we have this super mechanism in which we randomly push the brakes at high frequencies,
so in reality you can get 20, 30, 36, 42 km/h.
But will you still drive at 150-200 km/h? Well, even if you don't want to, other people will force you to stay at 180-200 km/h.

I'm going to restate that: the question of whether 16-bit is sufficient is the question of whether, with 96 dB of dynamic range, we can cover the whole span that, for example, classical music can provide, from the full output of the orchestra to the light brushes of instruments and echoes (those things don't have to occur simultaneously), with sufficient quantization levels (technically you could argue that you can record a violin playing at a constant level of 80 dB with only 1 bit); or, more likely, the dynamic range should reach from the loudest sound to the lowest audible harmonic of the quietest instruments/echoes (which will most likely end up at the 0 dB at which humans start to perceive sound). From my calculation (based on the work of KrzysiekK) it seems that 16 bits might not be enough. Technically the cost of 24-bit is very low, just 1.5x the space is needed, and there is equipment that can easily reach, if not 24 bits, then 20 bits.
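For reference, the 96 dB figure and the 1.5x storage cost can be reproduced with a couple of lines of Python. This is only the textbook ratio between full scale and one quantization step (about 6.02 dB per bit); it says nothing about what is audible, and the function name is illustrative.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB (about 6.02 dB per bit)."""
    return 20.0 * math.log10(2 ** bits)

for bits in (16, 20, 24):
    print(bits, round(dynamic_range_db(bits), 1))

# Storage cost of 24-bit relative to 16-bit at the same sample rate and channel count.
print(24 / 16)
```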

PS: Personally, I'm really happy with 16-bit.

bigshot · Jul 21, 2014 at 2:02 PM

Peaks aren't the issue, the noise floor is. Dynamics in digital audio extends downward, not upward. The peaks are the same level no matter whether you use an MP3 or DSD. The difference in resolution is down in the super quiet stuff.

If your living room is very quiet, it might have a 30dB noise floor. In order to hear the quietest sounds in a 90dB recording, you would have to raise them above the level of your room tone. That means that the 90dB dynamic range is actually 120dB in practice. Your 120dB peaks in orchestral music recordings have a noise floor of the concert hall in the 30dB range too. All 24 bit would add to the sound quality would be a beautifully defined bed of noise that if you raised to an audible level, wouldn't sound all that different from redbook quantization noise.

The fact is, super wide dynamics are unpleasant to listen to. A 45-50dB dynamic range is overkill for comfortably listening to even the most dynamic music at a healthy volume. Most music, even classical music, is mixed to keep the dynamics in the range of comfort and doesn't come close to taxing even the abilities of redbook.

bigshot · Jul 21, 2014 at 2:12 PM

By the way, just because the threshold of pain for hearing is up around 120dB, it doesn't mean that humans can hear down to 0dB at the same time. Human hearing compensates for the average sound level and focuses on that. If there are loud horn blasts alternating with the musician turning the page of his music, you aren't going to hear the page turns the way you would if you had been sitting in a silent room for a few minutes and then heard a page turn.

stv014 · Jul 21, 2014 at 4:52 PM

Originally Posted by xdog

however, with my Beats Mixr headphones (something like 120 dB @ 1 V) I can hear (playing 24-bit silence @ 1 V) the noise on the STX headphone output (which I think is around -100 dB?)

It is about 20 uV (A-weighted) noise voltage at 44100 Hz sample rate with no load. Using a multiple of 48000 Hz instead improves that by 6-7 dB. So, if your headphones really have 120 dB/V sensitivity, the noise SPL at the most commonly used sample rate would be above 20 dBA.
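The estimate above can be checked with a short Python sketch. It only uses the two numbers from this post (20 uV of noise, 120 dB/V sensitivity); the function name is illustrative, and the 6.5 dB figure is simply an assumed midpoint of the 6-7 dB improvement mentioned.

```python
import math

def noise_spl_db(noise_volts, sensitivity_db_per_volt):
    """SPL of a noise voltage through headphones of the given sensitivity (dB SPL at 1 V)."""
    return sensitivity_db_per_volt + 20.0 * math.log10(noise_volts)

at_44k = noise_spl_db(20e-6, 120.0)
print(round(at_44k, 1))        # at 44100 Hz
print(round(at_44k - 6.5, 1))  # assuming a 6.5 dB improvement at a multiple of 48000 Hz
```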

Steve Eddy · Jul 21, 2014 at 5:20 PM

It bears repeating, the effective dynamic range of human hearing is about 60-70 dB.

se

xdog · Jul 21, 2014 at 5:39 PM

Just one more calculation on the quantization problem:

Info: This assumes that there is no dithering, where you would trade off something like 1-2 bits for some small noise.

So let's assume that we have simple 'music' consisting of 2 sine waves.
One of them has a value of 1 (the smallest possible; the next value is 2, which is +6 dB more) and is the least audible/perceivable sine wave.
The second sine has some arbitrary value where quantization should not be a problem, for instance 1024 (the next value is ~0.1% larger).

Now we have to take some physiological data:
- the smallest memorable difference in sound pressure level is 1 dB
- with an instant change, let's assume that it is ~0.3 dB (we want to be more audiophile-friendly here; you can decrease that value if you feel that you're better than that)
This is circa (log problem) a 1/16th change of 6 dB, which is 2^4.
So if I wanted to change the second sine wave by 0.3 dB (a perceivable change), I would have to change the first sine wave by 0.3 dB as well;
but here I would need at least 4 bits of quantization to make the volume transition smooth.
Otherwise it could happen, for example, that the second sine went from 50 dB to 50.3 dB, but the first one went from 12 dB to 18 dB, which would affect the sound.
[these considerations were made without any assumption about what the real number of bits per second is]
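The step sizes used in this calculation can be checked with a few lines of Python. This only compares adjacent integer peak amplitudes, with no dithering, exactly as assumed above; the function name is illustrative.

```python
import math

def step_db(code):
    """Level change, in dB, when a peak amplitude moves from `code` to `code + 1`."""
    return 20.0 * math.log10((code + 1) / code)

print(round(step_db(1), 2))     # from 1 to 2
print(round(step_db(1024), 4))  # from 1024 to 1025
```

The step at amplitude 1 is about 6 dB, while at 1024 it is well below the 0.3 dB threshold assumed above.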

I am still aware that for, let's say, Lady Gaga, where everything is around -10 dB from the top and the music is 'full-on',
for a variety of reasons (masking, THD of headphones/speakers, sound leakage) anything at around -60 dB is not going to be heard.
But -60 dB still leaves an additional -36 dB until we reach the bottom, which is 2^5, which is more quantization steps than I've calculated.

Thank you, I won't bother you further; I just love music and math.

RE: to those issues above: I understand that, but I said you can use heavily isolating headphones or even have the whole room isolated (you know, where any kind of noise feels welcome); and the 30 dB is still an average (meaning there are some random, more pronounced or well-defined noises), so it might be that a 10 dB 2 kHz tone would still reach you. As in the example I gave with the Beats, the noise produced by the DAC can be lower than the sound of my computer fan or my breathing, but I'm still able to perceive it due to its different place of origin and different character; the 30 dB floor leaves too much room for 'buts' to rebuke the 24-bit audio myth.

bigshot · Jul 21, 2014 at 7:16 PM

The threshold of perception for volume is between .5 and 1 dB. For music it is higher than with test tones.
30dB is as quiet as a library.

You aren't thinking in the range of reality. In the real world everything is much less sensitive than your abstract concepts.

castleofargh · Jul 21, 2014 at 9:19 PM

@xdog

the headphone isolating for the external noise, that's actually an interesting part. for the hd600 it doesn't stand, the isolation is pretty much zero until we reach a few khz. but sure some headphones do isolate a good deal, at least enough to dismiss some part of the ambient noises. that's a good point.

but overall you still try to justify 24bit for the sound that can exist, not for the sound we can hear. and that's where we part ways.
look at the dynamic range actually used on your albums. I own a few operas and a good deal of classical stuff. I don't have a lot using more than 70db of dynamic. and the few I have, when you listen to the silence you hear a lot of noises. I don't know if it was in the room when recording or if it's something copied from a vinyl, a wax cylinder, or if the tapes got damaged. but all I get past 60/70db is usually noise. and again that must account for less than 0.5% of my albums. and I'm avoiding ultracompressed stuff as much as I can, listening mostly to old stuff as a strange consequence of the loudness war.
but let's get past this, why restrict ourselves and sacrifice the 10albums that may have one day a use for 24bit dynamic? let's go for it.

if your ears work ok, around 80/90db you will have the stapedius muscle kicking in to moderate the vibrations in your ears. that would turn into a temporary change of sensitivity.
long story short, it's some kind of recalibration to avoid damage, just like the iris will do for brightness, except it's not a homogeneous response with the ears.
and just like the stars you stop seeing in daylight because the iris adapts for the brightest part of what you see, past a certain level of sound, the stapedius will try to reduce sensitivity to accommodate for the loudest sound. so the louder the sounds, the less sensitive you'll be to the quiet parts. if you listen to music with a 100db dynamic, you stop being able to hear 1 or 2 or 5db sounds as a human reaction to loud sound.
that's what Steve Eddy is talking about, he's just mentioning the result of conducted experiments.
so that ends the debate about 24bit, nobody with normally functioning ears will hear a 15db decay when listening to some 100 or 110db music. and even better, given that you reduced your internal ear sensitivity, you also lost the ability to discern the same levels of variations. you're overall less sensitive to sound, it's a damn waste to actually listen loud for detail retrieval.

then there are all the possible noises and distortions that will probably pass 20db with high colors when the music goes past 100db. the HD800 at 100db is still above 0.1% of distortion, that's only -60db below the original signal(and that's the best part of the distortion, there is actually more than that up to almost -40db). do the math and cry. your quiet details will be masked by loudest sounds that are not even part of the music.

and last but certainly not least, there is a limit to how much you can register at the same time on a conscious level. when listening to music you move freely from one instrument to another. to the voice, then the drums, something on the left, bam the singer in front. all those conscious or reflex choices make you concentrate mainly on those parts, dismissing most of the rest for the sake of concentration. because after all we humans can't think about 15 things at once. one after another no problem, all at once... no can do.
do you mean to tell me that you will focus on 25db decays when music is playing at 90db? all of what makes the music will be in the loudest parts, naturally you would never do that (even if you could), your attention gets captured by one of those loud sounds and makes you lose the last decay of another tone by choice.

so the only situation where I could agree ends up being some piece of music where we go from strong loud sounds to a pretty long, super quiet passage. that happens a lot in classical music, but did you look at the actual dynamic? usually the super quiet passage is -30 or -40db below 0. and for us it's already a great deal of difference, I often raise the volume for those passages. again my albums are around 60/70db of maximum dynamic at best. that leaves 26db on 16bit tracks to define the quietest sound of the entire piece. how important is that part? how far away from anything louder was it? because if it's not isolated then depending on the frequencies, some masking effect might happen and hide that quietest part anyway. sometimes something only 20 or 30db louder is enough to mask the quiet part. making the quietest sound of the album something we cannot actually hear because of other sounds, so something useless again as this sound never existed for us humans in the first place.

all this contributes to most of us in here saying that 16bit is actually enough for us. and people asking for more or pretending to hear more are liars, aliens from another planet, or simply believe that something they hear is actually the excess bits, when it's something else. and I'm still waiting to see some evidence from the people claiming they do better than what doctors measured for us in experiments. until then I will keep my 16bit and my money.

and about your first bit that makes for 0 or +6db... the 1bit=6db is an estimate based on multibit systems, you mentioned it several times, that could work only on a 1bit dac that would happen to have that value of voltage/no voltage. but as soon as there is more than 1bit, that hypothetical +6db step is supported by smaller values and will be activated only when the smaller values are not enough to move to the next amplitude. you got something mixed up here obviously.
with 5bits and a need for small steps variations at the lowest volume levels on the next sample, it will be the 4th and 5th bit that would change from 0 to 1 or 1 to 0. it's a pretty straightforward system, you want a given value, you use a combination of different discrete options to make the discrete value closest to the one you asked for.
there will never be a moment where the smallest possible step between 2 discrete values in a 16bit dac will be +6db.

esldude · Jul 21, 2014 at 10:10 PM

I think it is on some of the Wescott audio pages, but one fellow describes blind testing various amps and other components for the pro audio company he once designed for. He told how it quickly became obvious that the very best, most discriminating blind tests were with volume no higher than 75 db (average level I think). When they allowed people to set volume for a test he observed how they quickly raised volume to 85 or 90 db and within minutes had very poor ability to discriminate sound quality. Things easily perceived with statistical validity at the lower volume were not discerned at those higher volumes. Considering peaks with a 75 db level of likely no more than 90 db, and a 30 db noise floor, you aren't far from the 60 db or so effective real time dynamic range of human hearing. Which makes sense that would be the most precise range of listening. Low enough you don't often activate the muscle to reduce hearing sensitivity, and high enough to get the effective noise floor out of the noise floor of ambient sound.

stv014 · Jul 22, 2014 at 4:56 AM

Originally Posted by xdog

So let's assume that we have simple 'music' consisting of 2 sine waves.
One of them has a value of 1 (the smallest possible; the next value is 2, which is +6 dB more) and is the least audible/perceivable sine wave.
The second sine has some arbitrary value where quantization should not be a problem, for instance 1024 (the next value is ~0.1% larger).

Now we have to take some physiological data:
- the smallest memorable difference in sound pressure level is 1 dB
- with an instant change, let's assume that it is ~0.3 dB (we want to be more audiophile-friendly here; you can decrease that value if you feel that you're better than that)
This is circa (log problem) a 1/16th change of 6 dB, which is 2^4.
So if I wanted to change the second sine wave by 0.3 dB (a perceivable change), I would have to change the first sine wave by 0.3 dB as well;
but here I would need at least 4 bits of quantization to make the volume transition smooth.
Otherwise it could happen, for example, that the second sine went from 50 dB to 50.3 dB, but the first one went from 12 dB to 18 dB, which would affect the sound.
[these considerations were made without any assumption about what the real number of bits per second is]

All the above is only really relevant if no dithering is used. With dithering, the volume "steps" and distortion disappear, and there is only a constant noise floor.

You can see this on the graphs below:

This 44100 Hz/16-bit format sample is a mix of two sine waves, one has a frequency of 1000 Hz and a constant peak amplitude of 1024 (in 16-bit LSB units), and the other has a frequency of 1250 Hz and the peak amplitude increases exponentially from 0.25 to 2. Since dithering is used, there is a constant noise floor (hiss) without visible distortion products, and the level of the higher frequency tone increases smoothly without "steps". The graph on the right is a zoomed in version of the same sample.

Without dithering, it looks like this:

Now the noise floor is not as clean and consistent, but the amplitude still appears to increase continuously. This is possible because even the entropy from the higher level tone is enough for some dithering effect. In fact, with complex samples (like music when it is not at a very low level), dithering can be redundant, and just adds a small amount of extra noise. But it can be used to guarantee that quantization distortion is avoided.

With the louder low frequency tone removed, the effect of dithering becomes much more obvious (left: not dithered, right: dithered):

Without dithering, there is now high distortion, the amplitude increases in steps, and at the lowest levels the tone is cut off entirely. Dithering still produces a clean (other than the uncorrelated noise floor) output.
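The low-level case can be reproduced in a few lines of plain Python. This is a minimal sketch, not the utilities used for the graphs: it assumes a single 1250 Hz tone with a constant peak of 0.4 LSB, flat TPDF dither of +/-1 LSB with no noise shaping, and a fixed random seed; all names are illustrative.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer LSB (ties go up)."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/-1 LSB: sum of two uniform +/-0.5 LSB values."""
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def tone_amplitude(samples, freq, rate):
    """Estimate the peak amplitude of a sine at `freq` by projecting the samples onto it."""
    acc = sum(s * math.sin(2 * math.pi * freq * i / rate) for i, s in enumerate(samples))
    return 2.0 * acc / len(samples)

RATE, FREQ, PEAK = 44100, 1250, 0.4  # one second of a tone below half an LSB
rng = random.Random(0)
tone = [PEAK * math.sin(2 * math.pi * FREQ * i / RATE) for i in range(RATE)]

plain = [quantize(x) for x in tone]                      # no dither
dithered = [quantize(x + tpdf_dither(rng)) for x in tone]

print(set(plain))                                        # the undithered tone is cut off entirely
print(round(tone_amplitude(dithered, FREQ, RATE), 2))    # the dithered tone is still there, under noise
```

Without dither every sample rounds to zero, so the tone disappears; with dither the tone survives at roughly its original amplitude, buried in a constant noise floor.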

ab initio · Jul 23, 2014 at 12:01 AM

stv014 said:
Dithering still produces a clean (other than the uncorrelated noise floor) output.

... a noise floor at something like -120 dB, which is essentially non-existent! Do we even know of any equipment with a SNR > 120 dB?

Great example of the effect of dithering on sinusoids vs more complex signals (aka, 2 sinusoids!).

What software were you using to create the plots? Was it audio-specific software or general purpose data analysis software?

Cheers

Krutsch · Jul 23, 2014 at 1:08 AM

stv014 said:
All the above is only really relevant if no dithering is used. With dithering, the volume "steps" and distortion disappear, and there is only a constant noise floor.

<snip, snip>

That was a very helpful explanation of the effects of dithering and I appreciate you taking the time to post the plots with a clear explanation.

All of the ranting is worth wading through for posts like these...

bigshot · Jul 23, 2014 at 1:36 AM

Even without dithering, the noise is still inaudible under music.

stv014 · Jul 23, 2014 at 6:07 AM

ab initio said:
... a noise floor at something like -120 dB, which is essentially non-existent! Do we even know of any equipment with a SNR > 120 dB?

The overall A-weighted level of the noise floor is actually about -97.3 dBFS, which is normally still more than good enough for music listening. The analysis displays it in relatively narrow bands (the "50 Hz bandwidth" on the graph means 6.02 dB attenuation at +/- 25 Hz from the signal, with a Gaussian window), in which the noise energy is obviously lower than over the entire 22050 Hz bandwidth of the sample. That is also why the tone at about -102 dBFS can still be clearly seen (and heard, with the volume turned up and the much louder 1 kHz tone removed to prevent masking), even though it is "under" the A-weighted noise floor.
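As a ballpark check of why the graph shows the noise near -120 dB, here is a small Python sketch. It is an approximation only: it assumes flat (unshaped, unweighted) TPDF dither and treats the 50 Hz analysis bandwidth as a noise bandwidth, neither of which exactly matches the shaped dither and Gaussian window described above. The function names are illustrative.

```python
import math

def dithered_noise_dbfs(bits):
    """Unweighted noise of TPDF-dithered PCM relative to a full-scale sine.

    Assumes flat TPDF dither of 2 LSB peak-to-peak, whose total noise power is
    three times the plain quantization noise (LSB^2 / 12).
    """
    sine_snr = 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)
    return -(sine_snr - 10.0 * math.log10(3.0))

def band_level_db(total_db, band_hz, full_hz):
    """Level of flat noise seen in a band of band_hz out of full_hz total bandwidth."""
    return total_db + 10.0 * math.log10(band_hz / full_hz)

total = dithered_noise_dbfs(16)
print(round(total, 1))                                # whole 22050 Hz bandwidth
print(round(band_level_db(total, 50.0, 22050.0), 1))  # one 50 Hz band
```

The same total noise spread over narrow bands reads about 26 dB lower per band, which is why a tone below the overall noise level can still stand out on the graph.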

On the first graph, it can also be seen that there is more noise in the highest octave. This is intentional, to achieve a lower perceived noise loudness at the same unweighted RMS level.

Originally Posted by ab initio
Great example of the effect of dithering on sinusoids vs more complex signals (aka, 2 sinusoids!).

As shown, higher complexity can actually make dithering less important, as the lowest bits that are to be discarded by the quantization become more like white noise. But with proper dithering, the quantization error always just adds noise at a constant RMS level.

Originally Posted by ab initio

What software were you using to create the plots? Was it audio-specific software or general purpose data analysis software?

I used the utilities from the link in my signature to generate and analyze the samples. In case anyone is interested, I still have the script that runs all the required commands to reproduce the samples and graphs.

Dark_wizzie · Jul 23, 2014 at 6:58 AM

stv014 said:
The overall A-weighted level of the noise floor is actually about -97.3 dBFS, which is normally still more than good enough for music listening. The analysis displays it in relatively narrow bands (the "50 Hz bandwidth" on the graph means 6.02 dB attenuation at +/- 25 Hz from the signal, with a Gaussian window), in which the noise energy is obviously lower than over the entire 22050 Hz bandwidth of the sample. That is also why the tone at about -102 dBFS can still be clearly seen (and heard, with the volume turned up and the much louder 1 kHz tone removed to prevent masking), even though it is "under" the A-weighted noise floor.

On the first graph, it can also be seen that there is more noise in the highest octave. This is intentional, to achieve a lower perceived noise loudness at the same unweighted RMS level.

As shown, higher complexity can actually make dithering less important, as the lowest bits that are to be discarded by the quantization become more like white noise. But with proper dithering, the quantization error always just adds noise at a constant RMS level.

I used the utilities from the link in my signature to generate and analyze the samples. In case anyone is interested, I still have the script that runs all the required commands to reproduce the samples and graphs.

Even after you broke it down with pictures, it all reads like Martian to me.
