24bit vs 16bit, the myth exploded!
May 1, 2013 at 12:50 PM Post #1,127 of 7,175
If you turn the volume down you drown the low-level details in your room's noise floor, which is exactly why high bit depth doesn't really matter for playback.
 
Also, the 65536 discrete values are not different volume levels as you mention. Sample values fall somewhere between discrete values most of the time. That just results in quantization error, which is randomized using dither.
 
Playing a 16-bit audio file through a 24-bit DAC does multiply each sample. -32768 (16 bit) = -8388608 (24 bit) = -1.0 (normalized float) = 0 dBFS.
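
For anyone who wants to check that arithmetic, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the usual convention that full scale is 2^(bits-1); it is not a description of what any particular DAC does internally.

```python
import math

def to_24bit(sample_16):
    """Scale a signed 16-bit sample to the 24-bit range (multiply by 2**8)."""
    return sample_16 * 256

def normalize(sample, bits):
    """Map a signed integer sample to a float in [-1.0, 1.0)."""
    return sample / float(2 ** (bits - 1))

def dbfs(value):
    """Level of a normalized sample magnitude relative to full scale."""
    return 20.0 * math.log10(abs(value))

s16 = -32768
s24 = to_24bit(s16)
print(s24, normalize(s16, 16), normalize(s24, 24), dbfs(normalize(s24, 24)))
```

The same normalized value comes out of both bit depths, which is the point: the scaling changes the integers, not the level.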
 
May 1, 2013 at 7:41 PM Post #1,128 of 7,175
The post that Skamp refers me to is not a proof. It is a guess. We may consider that it is a reasonable guess, but that's all it is.
 
I do not disagree with Xnor either about a noise floor affecting one's ability to perceive signal. Noise is unhelpful. However, it is a particularly human gift to be able to screen out parts of the noise spectrum (even when large parts of that would normally be regarded as signal) and focus the ears on e.g. what the 3rd violin to the left in the second row of an orchestra is playing and conclude that he/she is exceptionally gifted/total rubbish. I am not sure that is easily quantifiable.
 
The discrete values from -32k to +32k are discrete volume levels. If you want to reduce the volume a bit e.g. to remove segments of a recording which have gone into clipping by exceeding 32k, you take the numbers down for a couple of microseconds in the waveform, or you take the whole waveform down a notch, or a larger part of it. One way or another, high numbers, high volume - one to one correlation.
 
I don't know at a bit level what over-sampling DACs do - I don't care much - what I do know is that if you take a digital recording in 24 bit resolution and look at the numbers, they still range from -32k to +32k but there is a decimal point in there for you to fool with which is not available in 16 bit. I know this, because I do it.
 
I even kind of get the impression from the little I read of audio product reviews that non-oversampling now tends to get a better press than oversampling. You have already irrevocably lost any extra precision. Oversampling always struck me as unproductive - you can no more conjure up a change of waveform by multiplying the numbers than you can retrieve what is lost in a 128kbps MP3 by re-recording it as a 320kbps. If you start with a recording where that precision is there from the start, it stays there until you eliminate it e.g. by compressing 24bit data down to 16bit. Garbage in, garbage out. 
 
Please don't think I am claiming to be able to tell the difference. I can barely hear anything above 12,000 Hz these days, let alone 25,000 and it took me quite a while, including a trip to the doctors' to get my ears syringed before I finally realised that my teenaged kids had blown the tweeters on my stereo :). To my partial credit I did notice something was wrong. 
I can't help feeling that CD specs were put together 35 years ago when digital technology was at an early stage. If we can't do better than that with the technology that's available to us now, I'd be a bit surprised - and it is not difficult to do. Most studio masters these days are done in digital form at 24/192 and it's pretty easy to copy digits.
 
May 1, 2013 at 8:30 PM Post #1,129 of 7,175
Quote:
Originally Posted by obuckley
 
I do not disagree with Xnor either about a noise floor affecting one's ability to perceive signal. Noise is unhelpful. However, it is a particularly human gift to be able to screen out parts of the noise spectrum (even when large parts of that would normally be regarded as signal) and focus the ears on e.g. what the 3rd violin to the left in the second row of an orchestra is playing and conclude that he/she is exceptionally gifted/total rubbish. I am not sure that is easily quantifiable.

Assume you boost an extremely low-level sound in a dithered 16 bit file so that you can hear the noise floor clearly. (This noise floor usually is recorded noise but let's assume the recording is perfect and the noise floor is just dither.)
Even if that sound is below the noise floor, you can still hear it. Have you ever had such an extremely clean recording that you had to turn up the volume so much that you could clearly hear the dither noise just to hear low-level details?

While I welcome recordings with great dynamic range, it does get annoying at some point. Like in movies where you have to turn up the volume to understand what the characters are saying in dialogs but that volume would blow your ears during action scenes. Also, if we take a look at (dynamically uncompressed) concert hall recordings, we rarely see >70 dB dynamic range due to the noise floor. Sure, maybe there are some details in the noise floor like someone scratching his/her nose, so what?
 
Quote:
The discrete values from -32k to +32k are discrete volume levels. If you want to reduce the volume a bit e.g. to remove segments of a recording which have gone into clipping by exceeding 32k, you take the numbers down for a couple of microseconds in the waveform, or you take the whole waveform down a notch, or a larger part of it. One way or another, high numbers, high volume - one to one correlation.

No, they are discrete values for individual samples. But sounds often consist of several hundreds or thousands of samples, not single samples. So the different volumes of a sound are not fixed to those discrete levels.
Let's take 8 bit quantization to make things simpler:
127 = -0.07 dBFS
126 = -0.14 dBFS
 
Generate a sine wave at -0.10 dBFS and quantize it to 8 bits using simple triangular dither. Take a look at the spectral analysis and compare it to that of the initial signal. They match, despite the fact that -0.10 dBFS = 126.5 (8 bit) and an 8 bit sample can only have the value 126 or 127, nothing in between (also see below).
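
If you don't have a spectrum analyzer handy, the experiment can be approximated with a short Python sketch. Everything here is an assumption for illustration: the 4096-sample length, the choice of 93 cycles (so the tone lands exactly on one DFT bin), the fixed random seed, and measuring the level from a single DFT bin instead of a full spectral plot.

```python
import cmath
import math
import random

FULL_SCALE = 128          # 8-bit signed: codes run from -128 to 127
N = 4096                  # number of samples analysed (arbitrary choice)
CYCLES = 93               # whole cycles in N samples, so the tone sits on one DFT bin
TARGET_DBFS = -0.10

def tpdf_dither(rng):
    """Triangular-PDF dither spanning +/-1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

def quantize_8bit(x, rng):
    """Add dither, round to the nearest integer code, clip to the 8-bit range."""
    code = math.floor(x + tpdf_dither(rng) + 0.5)
    return max(-FULL_SCALE, min(FULL_SCALE - 1, code))

def tone_level_dbfs(samples, cycles):
    """Amplitude of one DFT bin, expressed in dB relative to full scale."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * cycles * i / n)
              for i, s in enumerate(samples))
    amplitude = 2.0 * abs(acc) / n
    return 20.0 * math.log10(amplitude / FULL_SCALE)

rng = random.Random(0)    # fixed seed so the run is repeatable
peak = FULL_SCALE * 10 ** (TARGET_DBFS / 20)   # about 126.5, between two codes
ideal = [peak * math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
quantized = [quantize_8bit(x, rng) for x in ideal]

print("peak in 8-bit units:", round(peak, 2))
print("level before quantizing:", round(tone_level_dbfs(ideal, CYCLES), 3))
print("level after quantizing: ", round(tone_level_dbfs(quantized, CYCLES), 3))
```

Every stored sample is a whole number, yet the measured level of the tone should land very close to -0.10 dBFS instead of snapping to -0.07 or -0.14.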
 
Quote:
I don't know at a bit level what over-sampling DACs do - I don't care much - what I do know is that if you take a digital recording in 24 bit resolution and look at the numbers, they still range from -32k to +32k but there is a decimal point in there for you to fool with which is not available in 16 bit. I know this, because I do it.

I'm not speaking about the implementation of DACs either. The range of discrete values is a function of bit depth:
24 bit: -2^23 to 2^23-1 or -8388608 to 8388607
16 bit: -2^15 to 2^15-1 or -32768 to 32767
.. and so on ..
 
Integers do not have a decimal or fractional component. Integers are whole numbers (..., -2, -1, 0, 1, 2, ...).
 
If you "blow up" a 16 bit file to a 24 bit one you will have unused values. For example 32767 -> 8388352 and 32766 -> 8388096 which, surprise, is a difference of 256 (= 2^8).
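
A small Python sketch of the same numbers, assuming the "blow up" is a plain left shift by 8 bits (the helper names are hypothetical):

```python
def sample_range(bits):
    """Smallest and largest value of a signed integer sample with this bit depth."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def pad_16_to_24(sample_16):
    """Pad a 16-bit sample to 24 bits by shifting in eight zero bits."""
    return sample_16 << 8

print(sample_range(16))
print(sample_range(24))

step = pad_16_to_24(32767) - pad_16_to_24(32766)
print(pad_16_to_24(32767), pad_16_to_24(32766), step)

# Count how many of the 24-bit codes are actually used after padding.
low, high = sample_range(16)
used_codes = {pad_16_to_24(s) for s in range(low, high + 1)}
print(len(used_codes), "of", 2 ** 24, "possible 24-bit values are used")
```

Shifting back down by 8 bits returns the original 16-bit values exactly, so the padded file carries no extra resolution.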
 
Quote:
I even kind of get the impression from the little I read of audio product reviews that non-oversampling now tends to get a better press than oversampling. You have already irrevocably lost any extra precision. Oversampling always struck me as unproductive - you can no more conjure up a change of waveform by multiplying the numbers than you can retrieve what is lost in a 128kbps MP3 by re-recording it as a 320kbps. If you start with a recording where that precision is there from the start, it stays there until you eliminate it e.g. by compressing 24bit data down to 16bit. Garbage in, garbage out.

It seems you're confusing several different things here. Oversampling, as the name suggests, deals with the sample rate, not bit depth. Lossy codecs throw away information based on psychoacoustic models.
Please be more specific about each of these terms if you want a more detailed response.
 
Quote:
Please don't think I am claiming to be able to tell the difference. I can barely hear anything above 12,000 Hz these days, let alone 25,000 and it took me quite a while, including a trip to the doctors' to get my ears syringed before I finally realised that my teenaged kids had blown the tweeters on my stereo :). To my partial credit I did notice something was wrong. 
I can't help feeling that CD specs were put together 35 years ago when digital technology was at an early stage. If we can't do better than that with the technology that's available to us now, I'd be a bit surprised - and it is not difficult to do. Most studio masters these days are done in digital form at 24/192 and it's pretty easy to copy digits.

You have to distinguish all the recording and processing from the final playback/reproduction. Let's assume a recording is available in both 44.1/16 and 44.1/24 but you cannot hear a difference between the two. You wouldn't buy the usually more expensive 44.1/24 file, right?
But those people who earn money by selling you such formats also try to sell you that there are huge differences..
 
May 1, 2013 at 10:50 PM Post #1,130 of 7,175
The maximum comfortable dynamic range for music is around 40 dB. Beyond that, you have to listen so loud to hear the quiet parts that the loud parts are uncomfortable.
 
May 2, 2013 at 7:12 AM Post #1,131 of 7,175
I know exactly what the last two posters mean about e.g. turning up the volume to hear dialog in movies and then having to turn it down again when the ordnance goes off. Surely that is not a function of excess dynamic range in the recording, it's a function of bad mixing. Given how loud the ordnance is going to be, a good sound engineer should presumably set the amplitude for the spoken parts at a level that does not require volume knob adjustment. They often don't, but that is what is wrong.
 
I am simplifying slightly when I talk about the amplitude of a single sample. You probably cannot hear a single sample in isolation, although in my experience of e.g. cleaning up clicks and pops on digitised vinyl, you often find that it is a single sample which takes the click into clipping and by attenuating that amplitude (reducing the value from above +/- 32k to below that level), you solve the problem. You may solve it better by adjusting a few adjacent samples as well.
 
As regards the 8bit sine wave, are you saying Xnor, that the waveform matches exactly because all of the 126.5 values have been randomly allocated to 126 or 127 by the dithering process? Because if so, I don't think I would consider that to be an exact match.
 
I suppose the decimal points that I was referring to are an implementation in particular software. I understand that a 24bit binary number will be an integer. If you look at a 24bit digital waveform in Adobe Audition or the earlier Syntrillium equivalents, the samples all still range from - to + 32k, but they are all given decimal point precision, which you can adjust minutely as you wish. It had not occurred to me that the actual samples have a higher range adjusted through arithmetic, but I suppose that must be the case.
 
I do seem to have mixed sampling with bit-depth in what I wrote concerning precision, but I think the point is still that you do not achieve much by over-sampling during playback where you are just sampling the same unchangeable digit on your CD more than once. Isn't that a consequence of Nyquist's theorem? What I am saying in all of these cases (bit depth, sampling frequency, psycho-acoustic legerdemain) is that you cannot create useful extra information from a source that does not contain it, so it is easy enough to translate a 16/44.1 waveform into 24/192, but you have not added anything useful either as regards the bit depth or the sampling frequency; the old slow samples bear exactly the same relationship to each other. Just as it is easy to re-code a 128kbps MP3 file to 320kbps, but that is not the same as taking the original analog or digital source material and coding it to MP3 at 320kbps. Precision, if you apply that term in this context, has already been irrevocably lost, hence the adjective "lossy".
 
So all of this extra precision is useful to the recording engineers and for post-production, but once they have settled on the sound they want, they may as well pare it all down to 16/44.1 and throw away any extra precision because nobody can tell any difference. And a format invented before the existence of the personal computer happened to get the specification so adequate in all respects that no improvement is even theoretically possible nearly 40 years later. Impressive.
 
And the sound engineers responsible for voicing your CD base the final mix on a pair of "average" loudspeakers. So if you happen to own that model of speaker, you hear it as they intended, whereas if you spend $300,000 on your speakers, you get a massive over-emphasis in the bass which most average speakers cannot cope with and who knows what differences in the rest of the spectrum. Maybe Van Gogh had the right idea...
 
May 2, 2013 at 7:56 AM Post #1,132 of 7,175
On the volume of dialogs: a normal conversation is in the range of 40 - 60 dB SPL, a rifle being fired can hit 170 dB SPL, and a stun grenade even 10 dB more, which is comparable to small explosions I guess. I don't think any sound system can realistically reproduce such events, which is why the dynamic range is decreased a lot (maybe so that peaks only reach about 100 dB SPL) by the engineer.
Sure you can call it bad mixing, but the issue is dynamic range. You can "fix" such movies by using a dynamic range compressor.
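
To make the compressor idea concrete, here is a rough Python sketch of a static compression curve working on levels in dB. The threshold, the ratio and the two example levels are made-up numbers, and a real compressor also has attack and release behaviour that this leaves out.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static compressor curve: above the threshold, output rises 1 dB per `ratio` dB of input."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

# Hypothetical levels for a quiet dialog line and a loud explosion.
dialog_db = -30.0
explosion_db = 0.0

range_before = explosion_db - dialog_db
range_after = compress_db(explosion_db) - compress_db(dialog_db)
print("dynamic range before:", range_before, "dB")
print("dynamic range after: ", range_after, "dB")
```

With these assumed settings the gap between dialog and explosion is halved, so you can set the volume for the dialog without the action scenes getting painful.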
 
Yes, the level of the 8 bit sine wave in the spectrum analyzer matches that of the source file. If it didn't, you'd see the sine wave at -0.07 or -0.14 dBFS.
 
My Adobe Audition has a window called amplitude statistics and shows min/max sample values. These values are not limited to +/-32k at all.
 
Oversampling in the DAC offers huge benefits over old non-oversampling DACs. But you're right, you cannot add information by converting that was not there in the first place.
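
A toy Python sketch of that last point, using naive linear interpolation as a stand-in (this is not what a real oversampling filter in a DAC does, and the sample values are arbitrary). Every new sample is computed from the old ones, and dropping the new samples gives back exactly what you started with.

```python
def upsample_linear(samples, factor):
    """Insert linearly interpolated values between each pair of neighbouring samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out

def decimate(samples, factor):
    """Keep every `factor`-th sample, starting with the first."""
    return samples[::factor]

original = [0, 1200, -800, 32767, -32768, 5]   # arbitrary 16-bit sample values
upsampled = upsample_linear(original, 4)

print(len(original), "samples in,", len(upsampled), "samples out")
print("original recovered exactly:", decimate(upsampled, 4) == original)
```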
 
May 2, 2013 at 11:24 AM Post #1,133 of 7,175
Quote:
 
So all of this extra precision is useful to the recording engineers and for post-production, but once they have settled on the sound they want, they may as well pare it all down to 16/44.1 and throw away any extra precision because nobody can tell any difference. And a format invented before the existence of the personal computer happened to get the specification so adequate in all respects that no improvement is even theoretically possible nearly 40 years later. Impressive.

Pretty much true, though most engineers now agree that 44.1 is a little low. That frequency was chosen because early digital recording systems used video recorders to record the data in a standard NTSC video field, and 44.1 works out as the right sampling frequency to evenly fit bytes on scan lines in video fields of monochrome video decks. Consumer versions of that system sampled at 44.056 because they were stuck with the scan rates of NTSC color. Early non-video systems sampled at 50 kHz, which would have been more desirable then, and now. 60 kHz might be all that we ever need.
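
The arithmetic behind those two figures can be sketched in Python. The numbers of 3 samples per scan line and 245 usable lines per field are the commonly cited values, not something stated above, so treat them as assumptions.

```python
from fractions import Fraction

SAMPLES_PER_LINE = 3     # assumed: commonly cited figure for video-based PCM recorders
LINES_PER_FIELD = 245    # assumed: usable scan lines per field

MONOCHROME_FIELD_RATE = Fraction(60)          # fields per second, monochrome NTSC
COLOR_FIELD_RATE = Fraction(60000, 1001)      # about 59.94 fields per second, NTSC color

def sampling_rate(field_rate):
    """Samples per second that fit the video field structure."""
    return SAMPLES_PER_LINE * LINES_PER_FIELD * field_rate

print(sampling_rate(MONOCHROME_FIELD_RATE))
print(round(float(sampling_rate(COLOR_FIELD_RATE)), 3))
```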
Quote:
And the sound engineers responsible for voicing your CD base the final mix on a pair of "average" loudspeakers. So if you happen to own that model of speaker, you hear it as they intended, whereas if you spend $300,000 on your speakers, you get a massive over-emphasis in the bass which most average speakers cannot cope with and who knows what differences in the rest of the spectrum. Maybe Van Gogh had the right idea...

Monitor speakers and control rooms are generally average with respect to each other, but are considerably above average with respect to home systems. Dubbing stages for film work are highly standardized, and also exceed home systems. A 300K home system will not by definition result in extremely high bass response. I've calibrated a $350k audio system; its response wasn't different from a more modest system, and in fact, there were many aspects that were much worse. Bass response is mostly a room size and shape, and acoustic treatment issue.
 
Van Gogh had a lot of right ideas, but pinna removal probably wasn't one of them.
 
May 2, 2013 at 12:29 PM Post #1,134 of 7,175
Quote:
Originally Posted by obuckley
 
And a format invented before the existence of the personal computer happened to get the specification so adequate in all respects that no improvement is even theoretically possible nearly 40 years later. Impressive.

 
Well, human hearing has not improved over the last 40 years, the limits are still the same.
 
May 2, 2013 at 1:18 PM Post #1,135 of 7,175
Quote:
 
Well, human hearing has not improved over the last 40 years, the limits are still the same.

 
This. It's been said before, but I'll say it again. Really, it has nothing to do with whether there is a measurable, calculable difference. It's just whether that difference is audible, and given the extreme limitations of human biology, it really is not. When we get our bionic ear implants, then we'll reconsider. But as it is, our ears are not evolving any better in this short amount of time; if anything they're worse on average since there are so many more constant, loud noise sources in the environment.
 
May 2, 2013 at 1:19 PM Post #1,136 of 7,175
Theoretically, on a computer you can create audio signals sampled at a couple of MHz and with a bit depth of 64 bits or higher. Purely theoretically, you can increase the sampling rate and bit depth to infinity.
 
But when we have certain requirements like storing frequencies up to 20 kHz (based on human hearing limit) we don't need crazy high sampling rates. Also, all the analog stuff has its limits as well. Many mics for example roll-off a bit above 20 kHz.
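
As a rough illustration in Python (the function names are hypothetical, and the alias calculation assumes an ideal sampler with no anti-aliasing filter in front of it): a rate can store tones up to half its value, and anything above that folds back down.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def alias_hz(tone_hz, sample_rate_hz):
    """Frequency at which a sampled tone shows up, folded into 0..Nyquist."""
    folded = tone_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        return sample_rate_hz - folded
    return folded

for rate in (44100, 96000):
    print(rate, nyquist_hz(rate), alias_hz(20000, rate), alias_hz(25000, rate))
```

A 20 kHz tone survives at both rates; a 25 kHz tone only survives at 96 kHz, which matters only if you can hear it and your mics and speakers can handle it.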
 
May 2, 2013 at 7:55 PM Post #1,137 of 7,175
I plan on purchasing a SimAudio Moon 100D DAC ( http://www.simaudio.com/moon100D.htm ) for a setup that I'm building soon. It has a BurrBrown PCM1793 DAC, which has an 8x oversampling digital filter. In this particular case, would using 24/96 recordings make a difference (perhaps avoiding the oversampling and allowing better accuracy)? I'm sorry if this isn't an intelligent question, I don't know very much about sound science.
 
May 3, 2013 at 4:44 AM Post #1,138 of 7,175
Quote:
In this particular case, would using 24/96 recordings make a difference (perhaps avoiding the oversampling and allowing better accuracy)?

 
96 kHz input would be oversampled, too. The only difference is that the filter cuts off at 48 kHz instead of 22.05 kHz. If you cannot hear frequencies higher than 20-21 kHz (like the majority of people), chances are that it would sound the same. Note, however, that the 24/96 recordings could be mastered better (for reasons related to marketing, rather than technical advantages of the format), and in that case they will sound noticeably better; however, if you downsample them to 44.1 kHz with a good converter, they would sound the same, and better than the badly produced 44.1/16 version.
 
May 3, 2013 at 10:10 AM Post #1,139 of 7,175
Quote:
 
96 kHz input would be oversampled, too. The only difference is that the filter cuts off at 48 kHz instead of 22.05 kHz. If you cannot hear frequencies higher than 20-21 kHz (like the majority of people), chances are that it would sound the same. Note, however, that the 24/96 recordings could be mastered better (for reasons related to marketing, rather than technical advantages of the format), and in that case they will sound noticeably better; however, if you downsample them to 44.1 kHz with a good converter, they would sound the same, and better than the badly produced 44.1/16 version.

 
I did a bunch of ABX testing with 24/96 flac files converted to 16/44.1 lame mp3, and the files sounded identical to me. I actually had to ask for assistance to make sure I was doing everything correctly and was provided with step-by-step instructions to make the conversion correctly. (might have even been you)
 
I was amazed at my own results. Although I have a bit of tinnitus and can't hear anything over ~12-13KHz at normal volume levels after drinking too much caffeine or not getting plenty of rest.
 
May 3, 2013 at 12:12 PM Post #1,140 of 7,175
Some of the claims of the proponents of high rate files suggest that the improvements happen at least in part on the recording side.  That would mean that some high rate files we can download would not contain that supposed benefit because they may or may not have originated that way.  Up-sampled files wouldn't count, and things that started with old analog recordings wouldn't either. That takes the valid high rate files down to a much smaller number, and frankly, what you can get from on-line sellers usually comes with a disclaimer that they had nothing to do with how the files were created, and don't have any information about them.  
 
Then, at least one high-rate proponent (Bill Schnee, Bravura Records) claims he had to have a "special" A/D converter built before he could hear the difference (his ProTools gear simply wasn't good enough), and even then he claims you need 192 kHz before the benefits are readily apparent. If you reduce his claims to specifics, it's his "special" converter at 192/24, his "special" console, his "special" mix to 2 tracks, and then it's all wonderful, even if downsampled, though if that's done at least some of the "wonder" goes away. Of course, he doesn't actually sell anything, at least yet. When asked what, specifically, causes the improvement, he doesn't claim that it is just the additional bandwidth, though; it's something else. But unfortunately, he places the improvement into the "mysteriously unmeasurable" zone.
 
That would imply that garden-variety 192/24 ADCs mask the benefits of that bit rate. So files we can buy online won't contain the full high-rate glory. And you pretty much need...well...his entire studio and him to get that kind of dramatic result. That means there's no way for the rest of us to hear this stuff outside of the Christmas music downloads on his site. He didn't call DACs into question, and I'm sure I have no idea why not; some would.
 
I'm not saying I agree with any of this, but I find it interesting that someone of that stature would focus the issue that tightly and try to build a record company on that basis.
 
