Lossy compression and high bit depth
Feb 24, 2016 at 4:18 AM Thread Starter Post #1 of 4

charleski

Let me start out by saying that nothing I'll be writing about in this post actually matters. So let's not get bogged down in questions of whether you can actually hear the differences I'll demonstrate, because no, you almost certainly can't. I was curious about what happens to high-bit-depth files when they're compressed, couldn't find any authoritative info on the web, and decided to investigate myself. I'm posting what I found in case others are curious as well.
 
Transform-coding systems such as those used in MP3 and AAC compression keep the sample rate of the original, but they don't have a bit depth. The signal is converted from the time domain to the frequency domain: instead of a series of sample amplitudes, we get a series of coefficients describing the energy at different frequencies, which is then further compressed using a psychoacoustic model. In theory you can decompress an AAC file to produce a PCM signal with any bit depth you want, but in practice it doesn't make much sense to go higher than 16 bits. While 16bit PCM has a noise floor around -96dB, we know that correct use of dithering and noise shaping can allow it to carry signals with an amplitude well below this. So, does that survive lossy encoding?
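If anyone wants to reproduce this, here's a rough sketch of how the test tone can be generated (Python with numpy and soundfile; the duration and filenames are just my own illustrative choices, and this is a reconstruction of the idea rather than exactly what I did in Audition):

```python
import numpy as np
import soundfile as sf   # pip install soundfile

fs = 48000                     # sample rate in Hz
dur = 30.0                     # length in seconds (arbitrary)
amp = 10 ** (-120.0 / 20.0)    # -120dBFS peak amplitude

t = np.arange(int(fs * dur)) / fs
tone = amp * np.sin(2 * np.pi * 1000.0 * t)

# 24bit PCM quantises at roughly -144dBFS, so a -120dBFS tone is still
# representable (its peak is about 8 of the 24bit LSBs).
sf.write('tone_1khz_24bit.wav', tone, fs, subtype='PCM_24')
```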
 
Here's an uncompressed 1kHz -120dB signal at 48kHz/24bits:

 
Here's the same signal after the sample type has been converted (in Adobe Audition) to 16 bit using triangular dither and Neutral(Light) noise shaping with its default parameters:

The noise floor has gone up, but the 1kHz signal is still clearly visible, along with a large high-frequency noise hump caused by the noise shaping.
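I can't reproduce Audition's Neutral(Light) curve exactly (it's Adobe's own filter), but the general idea is just error feedback around the quantiser. Continuing the Python sketch above, a bare-bones first-order noise shaper with TPDF dither looks something like this:

```python
# Quantise to 16bit with triangular (TPDF) dither plus first-order
# error-feedback noise shaping. This is NOT Audition's Neutral(Light)
# curve, just the simplest shaper that pushes the error up in frequency.
rng = np.random.default_rng(0)
scale = 32767.0                                  # 16bit full scale
dither = (rng.uniform(-0.5, 0.5, tone.size) +    # TPDF = sum of two uniform
          rng.uniform(-0.5, 0.5, tone.size))     # sources, +/-1 LSB peak

shaped = np.empty(tone.size, dtype=np.int16)
err = 0.0
for n in range(tone.size):               # slow pure-Python loop, but fine
    w = tone[n] * scale - err            # for a short test file
    q = int(np.round(w + dither[n]))     # quantise with dither added
    err = q - w                          # error fed back on the next sample
    shaped[n] = min(max(q, -32768), 32767)

sf.write('tone_16bit_shaped.wav', shaped, fs, subtype='PCM_16')
```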
 
For those who are curious, this is what it looks like when converted to 16bit with triangular dither but no noise shaping:

The high-frequency hump has gone, but the overall noise level is a lot higher.
If you don't use any dither at all and convert to 16bits by lopping off the least-significant 8 bits, you simply end up with a blank file. Raw 16bit PCM can't encode a -120dB signal, so there's nothing there.
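Picking up the same variables from the sketch above, the dither-only and no-dither cases come out like this (again just illustrative):

```python
# Plain TPDF dither, no noise shaping: the tone survives, but sits in a
# flat (unshaped) dither noise floor instead of a shaped one.
flat = np.round(tone * scale + dither).astype(np.int16)
sf.write('tone_16bit_flat_dither.wav', flat, fs, subtype='PCM_16')

# No dither at all: the peak of a -120dBFS tone is only ~0.03 of a 16bit
# LSB, so whether you round or just drop the bottom 8 bits, every sample
# ends up at (or within one LSB of) zero -- hence the blank file.
undithered = np.round(tone * scale).astype(np.int16)
print(np.any(undithered))   # -> False
```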
 
Now, is this still there after AAC encoding? I used the QuickTime true-vbr encoder, which has been the gold standard among AAC encoders for some time (-tvbr 100). AAC decoding in all cases was done with the default decoder built into Audition.
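By the way, if you want to check decoded output outside Audition, a quick-and-dirty way to look at these spectra is something like the following (the filename is just a placeholder for whatever your decoder produces, and the dB values are relative rather than calibrated dBFS):

```python
import numpy as np
import soundfile as sf
from scipy.signal import welch

x, fs = sf.read('decoded_from_aac.wav')   # placeholder filename
if x.ndim > 1:
    x = x[:, 0]                            # just take one channel

# Long segments give enough resolution and averaging to pull a steady
# -120dB tone out of the dither noise.
f, pxx = welch(x, fs, nperseg=65536)
db = 10 * np.log10(pxx + 1e-30)

k = np.argmin(np.abs(f - 1000.0))
print(f'level in the bin nearest 1kHz: {db[k]:.1f} dB (relative)')
```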
 
First, let's see what happens if I simply compress the 24bit file without any manipulation:

Oops, it's gone. Somewhere along the line the signal has been truncated and everything below -96dB was stripped out.
 
But if I compress a 16bit file converted using triangular dither and Neutral(Light) noise-shaping, it's still there:

Comparing this to the uncompressed plot shown above, we can see that the overall noise level has gone up, probably because the high-frequency hump has been truncated, which damages the noise-shaping. But the 1kHz spike is clearly visible.
 
If we don't use any noise-shaping, we see a broad 5dB noise hump centered around the 1kHz peak:

 
 
Now, as I said right at the start, in practice none of this matters. When you're compressing a complex, busy piece of music, anything down at -120dB will probably get dumped by the encoder, which will allocate its bits to the stuff you can actually hear instead. But in principle AAC compression can retain ultra-low-level detail from 24bit files, well below the nominal 16bit noise floor, as long as the file has been properly dithered first.
 
Feb 24, 2016 at 10:24 AM Post #2 of 4
That was pretty cool
 
Feb 24, 2016 at 11:49 AM Post #3 of 4
Especially interesting considering Apple's explicit recommendations. There has to be a trade-off of sorts going on here.
 
Quote:
Apple – Mastered for iTunes
  An ideal master will have 24-bit 96kHz resolution. These files contain more detail from which our encoders can create more accurate encodes.
However, any resolution above 16-bit 44.1kHz, including sample rates of 48kHz, 88.2kHz, 96kHz, and 192kHz, will benefit from our encoding process.
(…) Don’t provide files that have been downsampled and dithered for a CD. This degrades the file’s audio quality.

 
Feb 27, 2016 at 5:15 AM Post #4 of 4
I looked through my library and found a couple of reasonably recent 'Mastered for iTunes' albums that I'd got. They're clearly dithering the signal, but it's hard to tell if they're using noise-shaping. One interesting common feature I found was a double spike around 16kHz that's most visible in the quiet sections:


These were found in albums from different artists, so it's unlikely they were present in the source. They aren't remotely large enough to work for noise-shaping and might just be an artifact of the encoder they're using.
 
