10 bits of sample depth is more than enough for audio with tests that you can perform to prove it to yourself
Aug 26, 2014 at 1:54 PM Post #16 of 19
  Have you read the paper Don Hills linked to above? The paper looks like it goes to great lengths to relate the thresholds of human detection to the noise inherent in PCM audio.

The paper uses 120 dB SPL as the reference (0 dBFS) level, which is probably quite a bit (>= 10 dB) higher than it was in your test, and the "noise inherent in PCM audio" was assumed to be simple triangular PDF (TPDF) dither, which produces the most audible noise. For comparison, here is a frequency analysis of your Audacity test sample (green), the simple white-spectrum +/-1 LSB TPDF dither (blue), and a slightly better "shaped" TPDF dither (red) where the triangular noise is generated by differentiating uniformly distributed white noise:
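For anyone who wants to reproduce the two TPDF variants, here is a minimal sketch in plain Python. It is an illustration only, not the code used to make the files in this post: the function names, the seed, and the sample count are all assumptions. The white version sums two independent uniform samples; the shaped version takes the difference of successive samples from one uniform stream, which gives the same triangular PDF but a first-order high-pass spectrum.

```python
import math
import random

def white_tpdf(n, rng):
    """+/-1 LSB TPDF dither: sum of two independent uniform(-0.5, 0.5) samples."""
    return [rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5) for _ in range(n)]

def shaped_tpdf(n, rng):
    """Triangular PDF from differentiating one uniform white noise stream."""
    u = [rng.uniform(-0.5, 0.5) for _ in range(n + 1)]
    return [u[i + 1] - u[i] for i in range(n)]

def variance(x):
    """Population variance of a list of samples."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def lag1_correlation(x):
    """Normalized correlation between neighbouring samples (0 for white noise)."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

def shaped_vs_white_db(freq_hz, sample_rate_hz):
    """Theoretical spectral density of the shaped dither relative to the white
    one, in dB. The ratio is 1 - cos(w), so freq_hz must be above 0."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    return 10.0 * math.log10(1.0 - math.cos(w))

rng = random.Random(0)  # arbitrary seed, only for repeatability
white = white_tpdf(100_000, rng)
shaped = shaped_tpdf(100_000, rng)

print("variance (LSB^2):", variance(white), variance(shaped))
print("lag-1 correlation:", lag1_correlation(white), lag1_correlation(shaped))
for f in (1000, 5000, 11025, 20000):
    print(f, "Hz:", round(shaped_vs_white_db(f, 44100), 2), "dB re white TPDF")
```

Both variants have the same total power (1/6 LSB^2); the shaped one simply moves it out of the low and mid frequencies and toward the top of the band, crossing the white level at a quarter of the sample rate.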

So, to make the graphs in the PDF file relevant, lower the 16-bit noise levels by at least 10 dB first (<=110 vs. 120 dB SPL), then further adjust them by the difference relative to tpdf.wav above (for example, 15 dB lower up to about 5 kHz). As far as I can see, that moves all the noise under the threshold, although a non-flat transducer response, such as a large treble peak under 10 kHz, could change that.
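The adjustment is just adding dB offsets, which can be sketched as follows. This is a rough illustration under stated assumptions: the function names are made up, 0 dBFS is taken to be a full-scale sine, and the broadband figure counts quantization noise (LSB^2/12) plus white TPDF dither (LSB^2/6). The paper's graphs are per-frequency curves, so only the shift applies to them directly; the broadband floor is shown for context.

```python
import math

def tpdf_noise_dbfs(bits):
    """Broadband RMS level of quantization noise plus white TPDF dither,
    relative to a full-scale sine (assumed 0 dBFS reference)."""
    lsb = 2.0 / (2 ** bits)                    # full scale spans -1..+1
    noise_power = lsb ** 2 / 12 + lsb ** 2 / 6 # quantization + dither
    sine_power = 0.5                           # RMS^2 of a full-scale sine
    return 10.0 * math.log10(noise_power / sine_power)

def curve_shift_db(paper_ref_spl, actual_ref_spl, dither_offset_db):
    """dB to add to the paper's noise curves (negative means lower)."""
    return (actual_ref_spl - paper_ref_spl) + dither_offset_db

# Values from the post: 120 dB SPL in the paper, at most 110 dB SPL in the
# test, and about 15 dB less noise than tpdf.wav up to roughly 5 kHz.
shift = curve_shift_db(120.0, 110.0, -15.0)

print("16-bit TPDF noise:", round(tpdf_noise_dbfs(16), 2), "dBFS")
print("broadband floor at 120 dB SPL:", round(120.0 + tpdf_noise_dbfs(16), 1), "dB SPL")
print("broadband floor at 110 dB SPL:", round(110.0 + tpdf_noise_dbfs(16), 1), "dB SPL")
print("shift for the paper's curves below ~5 kHz:", shift, "dB")
```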
In any case, this type of "crank up the volume and listen to the noise floor in absolute silence" test is not really representative of actual music listening, similarly to another paper that "proves" the audibility of sub-nanosecond levels of jitter by comparing the levels of jitter products from 120 dB SPL ultrasonic signals under worst case conditions against the absolute threshold of hearing. Even the infamous Meyer and Moran paper that is frequently cited as evidence of Red Book transparency acknowledges that simply listening to the noise floor at very high volume with no music playing can reveal the quantization noise that would otherwise be masked.
Aug 27, 2014 at 4:33 PM Post #17 of 19
While I partially agree, I think this is a bit misleading as to what the negative consequences of reducing the bit depth are. This spectrogram is averaging the energy from the quiet bits (which are mostly vocals) and the loud bits (which include the heavy percussion, where I suspect much of the high-frequency energy comes from). The quiet bits don't contribute much energy, so the spectrogram for this whole passage isn't unlike the spectrogram of just the loud passages, except shifted down a few dB because it's being divided by a longer sampling time.

*facepalm* Took a quick look at the song you suggested and plotted it not realizing what I was plotting and not thinking about it. The high frequencies are harmonics and so of course they'd be down in amplitude but I forgot about the averaging.
Still, it was a good excuse to listen to King Crimson. Thanks for that!
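The averaging effect mentioned in the quote can be put in numbers with a small sketch. The durations and levels below are hypothetical, chosen only to show the mechanism: power is averaged over the total time, so a quiet stretch adds almost no energy but still lengthens the time the energy is divided by.

```python
import math

def average_level_db(segments):
    """Level of the time-averaged power for a list of (duration_s, level_db)."""
    total_time = sum(duration for duration, _ in segments)
    mean_power = sum(duration * 10 ** (level / 10.0)
                     for duration, level in segments) / total_time
    return 10.0 * math.log10(mean_power)

# Hypothetical passage: 30 s of loud percussion at -40 dB in some frequency
# band, followed by 30 s of quiet vocals at -70 dB in the same band.
loud_only = average_level_db([(30.0, -40.0)])
whole_passage = average_level_db([(30.0, -40.0), (30.0, -70.0)])

print("loud section only:", round(loud_only, 2), "dB")
print("whole passage:", round(whole_passage, 2), "dB")
print("drop from averaging:", round(loud_only - whole_passage, 2), "dB")
```

With equal loud and quiet durations the averaged curve sits about 3 dB below the loud-only curve, so the long-term spectrum understates how much high-frequency energy the loud passages actually contain.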
Aug 27, 2014 at 4:57 PM Post #18 of 19
  Still, it was a good excuse to listen to King Crimson. Thanks for that!

I'm happy to help!

Aug 28, 2014 at 8:30 AM Post #19 of 19
I recently got the remastered 24-bit version of Lizard. I think it is slightly more compressed than the original (as in dynamic range compression). It's still very dynamic though, and I think it is better - the original seemed too dynamic for me.
