Why does quieter music require less data? (FLAC file size) | Headphone Reviews and Discussion - Head-Fi.org

stalepie · Aug 25, 2016 at 3:20 PM

With some recordings, older ones of course, FLAC might average around 500 or 600 kbps, but music that has large waveforms (louder) is closer to 1000 kbps or more. Does that mean the quieter music has less data? Is it not just quieter, but also carrying less information in the recording, and therefore less fidelity?

Ancipital · Aug 25, 2016 at 4:30 PM

To vastly oversimplify, the "louder" the recording, i.e. the more of the available dynamic range it uses, the more bits of entropy are potentially present, and the harder it is to code.

Of course it's way more complex than that, but you don't want to hear about residual coding and all that other jazz, and I would probably screw up explaining it

stalepie · Aug 25, 2016 at 5:02 PM

hmm but if you record a sound that doesn't require much dynamic range, and you record it quietly and then at higher gain, so the waveforms are larger (but not clipping yet), these both have the same audio fidelity? If you then shrink down the larger waveform to the same size, it makes as close to the same data as the code allows (like "amplify" in Audacity, setting it to negative to make a track quieter).

I would just think it would be the same amount of data but then a different amount of gain applied for playback, whether the waveform is small to start with or recorded hot. Confusing. Oh well, I don't know much about this subject.

stalepie · Aug 25, 2016 at 5:05 PM

Is a large waveform essentially wasted data? Like the way you can take a BMP or PNG image file and double its size in an image editor, save it again as a PNG or BMP and have a much larger file size on the computer even though it seems to be the same picture. Or is it like taking a photo at 2 megapixel and then another at 8 megapixel, with the latter having more actual detail?

castleofargh · Aug 25, 2016 at 5:29 PM

did you try this with wave to see what happens? it would be more relevant than flac for your question.
the uncompressed PCM signal will contain a given number of samples per second, and each sample will occupy a space corresponding to how many bits of information each sample contains. let's say for my own sake that we're dealing with a 4bit file instead of the usual 16 or 24bit:
then whatever the amplitude of the signal at a given moment, the PCM sample will be 4bits long, so it will have 4 values recorded, each being 0 or 1.
if you record total silence then all the samples will be 0000, the sample is still 4bits long. it takes the same space to store four 0 or four 1 or any combination of those. so uncompressed PCM at a given resolution uses the same storage space for a given length of time. and that contradicts your hypothesis.
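A minimal sketch of that point, assuming mono 16-bit PCM at 44.1 kHz and using Python's standard `wave` module (the helper name `wav_size_bytes` is made up for illustration). It writes one second of pure silence and one second of a full-scale signal, and both files should come out the same size:

```python
import io
import struct
import wave

def wav_size_bytes(samples, sample_rate=44100):
    """Write mono 16-bit PCM samples to an in-memory WAV file and return its size."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)            # mono
        w.setsampwidth(2)            # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return len(buf.getvalue())

one_second = 44100
silent = [0] * one_second                                        # digital silence
loud = [32767 if i % 2 else -32768 for i in range(one_second)]   # full-scale swing

silent_size = wav_size_bytes(silent)
loud_size = wav_size_bytes(loud)
print(silent_size, loud_size)  # the two sizes should match
```

The size depends only on duration, sample rate, channel count, and bit depth, never on what the samples contain.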

now flac, starting here this is only my guess and I might be full of crap. but it's a lossless compression format so anytime it can save some space without actually losing data, it tries to do it.
here is my own hypothesis. if the music is always loud then the signal sometimes reaches the maximum amplitude and the value (still for a 4bit system) would often reach 1111.
but if you have a quiet record that stays far below 0 dB (the maximum level), then maybe the signal never goes beyond 0111? and if the first value is 0 at all times, I imagine a compression algorithm might be able to take advantage of this and save a good deal of space with whatever trick it uses. like maybe sticking to the last value as long as not told otherwise or whatever the code involves (sorry I don't speak flac ^_^). but this has to do with the way the data is stored, not with the precision of the data as flac is lossless, and not with the quality of the signal.

Ancipital · Aug 25, 2016 at 6:06 PM

stalepie said:
hmm but if you record a sound that doesn't require much dynamic range, and you record it quietly and then at higher gain, so the waveforms are larger (but not clipping yet), these both have the same audio fidelity? If you then shrink down the larger waveform to the same size, it makes as close to the same data as the code allows (like "amplify" in Audacity, setting it to negative to make a track quieter).

I would just think it would be the same amount of data but then a different amount of gain applied for playback, whether the waveform is small to start with or recorded hot. Confusing. Oh well, I don't know much about this subject.

I suspect you probably need to stop saying "fidelity" at this point, in order to have a useful discussion, and be more specific. That said, we might slip more into the realm of Claude Shannon, and it has been too many years since I studied that, so maybe best not to

If you recorded a sound at two different gains at once, and normalised the louder one to match the peak value of the quieter one, you will often have a very similar result, but not always identical, due to specifics of quantisation effects and so forth. However, as you intuit, when you normalise the louder one to be quieter, you're throwing away information, as you're then using fewer bits to represent the amplitude. For example, if you have a 16 bit sample (16 bit being the most common, and most useful for listening), each sample has a possible value of 0-65535 (2^16=65536). If you reduce the amplitude of a bit of audio with a peak value of 60000 by 75%, so that the peak is 15000, that can be represented by only 14 bits (2^14=16384).
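The arithmetic in that example can be checked with a short sketch. It follows the post's simplification of treating samples as unsigned values 0-65535 (real 16-bit PCM is signed, -32768 to 32767), and `bits_needed` is just an illustrative helper name:

```python
def bits_needed(peak):
    """Number of bits needed to represent unsigned values from 0 up to peak."""
    return int(peak).bit_length()

original_peak = 60000
reduced_peak = original_peak // 4      # amplitude reduced by 75%

original_bits = bits_needed(original_peak)
reduced_bits = bits_needed(reduced_peak)
print(original_bits, reduced_bits)     # 16 14
```

Each halving of the amplitude frees up one bit, so a 75% reduction (two halvings) frees up two.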

Since FLAC uses various entropy coding tricks (Rice coding of the prediction residual, rather than arithmetic coding proper), I wouldn't be amazed if this was relevant, but don't quote me on that.

castleofargh,

FLAC does several things as part of its compression, including cheerfully coding things like long silences. It also does a form of residual coding (from what I recall, it has been a while). It's fairly adaptive and clever.

stalepie · Aug 25, 2016 at 6:07 PM

I hadn't tried WAV, but I just tried importing Britney Spears - Hot As Ice FLAC into Audacity, exporting 1 minute of it as a 16-bit signed Microsoft WAV, then 1 minute of "The Children's Hour" from the Batman Returns soundtrack, which was quieter, and the Batman WAV is 18 MB and Britney's is 33.

It probably works the same with WAV as you're describing with FLAC. I didn't think about that, discarding the redundant data beyond the dynamic range recorded -- so that does mean large waveforms are wasted space if the dynamics are not there to need it recorded that loud.

RRod · Aug 25, 2016 at 6:21 PM

stalepie said:
It probably works the same with WAV as you're describing with FLAC. I didn't think about that, discarding the redundant data beyond the dynamic range recorded -- so that does mean large waveforms are wasted space if the dynamics are not there to need it recorded that loud.

I see what you're getting at, and yes you have a point. A modern hyper-compressed pop recording doesn't "use" 16-bits in an *audible* sense, but something like FLAC can't make such decisions because it has to be lossless. If the bottom 6 bits are inaudible at normal listening levels but still have information in them, FLAC has to spend resources encoding them. You could decide on your own that the song is only 10-bit and zero-out the bottom 6-bits, and then FLAC would indeed encode the resulting file to a smaller size.
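As a rough sketch of the "zero out the bottom 6 bits" idea: the helper names below are hypothetical, and the second function only imitates the kind of check an encoder can make. The FLAC format does have a per-subframe "wasted bits" field for low-order bits that are zero in every sample, but this is not FLAC's actual code.

```python
def zero_low_bits(samples, n_bits):
    """Clear the lowest n_bits of every sample (this step is lossy)."""
    mask = ~((1 << n_bits) - 1)
    return [s & mask for s in samples]

def common_trailing_zero_bits(samples):
    """Count the low-order bits that are zero in every sample."""
    combined = 0
    for s in samples:
        combined |= s
    if combined == 0:
        return 0                      # pure silence: nothing to measure
    lowest_set_bit = combined & -combined
    return lowest_set_bit.bit_length() - 1

samples = [12345, -6789, 321, -32768, 32767]   # made-up 16-bit sample values
truncated = zero_low_bits(samples, 6)

wasted_before = common_trailing_zero_bits(samples)
wasted_after = common_trailing_zero_bits(truncated)
print(wasted_before, wasted_after)   # 0 6
```

Once the bottom 6 bits are known to be zero everywhere, a lossless encoder only has to describe the remaining 10 bits per sample, which is why the truncated file compresses smaller.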

stalepie · Aug 25, 2016 at 7:16 PM

ancipital said:
However, as you intuit, when you normalise the louder one to be quieter, you're throwing away information, as you're then using fewer bits to represent the amplitude. For example, if you have a 16 bit sample (16 bit being the most common, and most useful for listening), each sample has a possible value of 0-65535 (2^16=65536). If you reduce the amplitude of a bit of audio with a peak value of 60000 by 75%, so that the peak is 15000, that can be represented by only 14 bits (2^14=16384).

So there is more audible detail in a larger waveform? It's better to record as loud as you can before clipping? Not for reasons of loudness war.

stalepie · Aug 25, 2016 at 7:18 PM

rrod said:
I see what you're getting at, and yes you have a point. A modern hyper-compressed pop recording doesn't "use" 16-bits in an *audible* sense, but something like FLAC can't make such decisions because it has to be lossless. If the bottom 6 bits are inaudible at normal listening levels but still have information in them, FLAC has to spend resources encoding them. You could decide on your own that the song is only 10-bit and zero-out the bottom 6-bits, and then FLAC would indeed encode the resulting file to a smaller size.

Well I was thinking if you hear the same amount of detail (once the same volume is met to the ear) then the compression program (MP3, FLAC, AAC, whatever) could first shrink the waveform and then compress, and then remember the shrink value and upon decompressing during playback revert it to the same size wave.

MindsMirror · Aug 25, 2016 at 7:47 PM

Here's my explanation. Say you're making a 16 bit recording, and the audio has a practical dynamic range of 60dB (roughly 10 bits). You could record your audio at the maximum level and you would get samples that look something like this, where X represents bits containing your useful audio information and N represents bits below the practical noise floor of your recording.
XXXXXXXXXXNNNNNN

Your lossless compression schemes such as FLAC can't tell the difference between noise and signal, so it has to compress the full 16 bits worth of information.

Now say you reduce the volume by 36dB (equal to 6 bits), you'll get this.
000000XXXXXXXXXX
Your lossless compression can essentially discard the top 6 bits which are always zero and only needs to compress 10 bits worth of audio information, saving some space while still retaining the 10 bit 60dB dynamic range in your audio signal.

Now reduce the volume by another 36dB and you'll get something like this.
000000000000XXXX
Your compression only needs to compress 4 bits worth, but you've lost some useful audio information below the quantization level and your audio's dynamic range is reduced to 4 bits or 24dB.
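The dB figures above come from the rule of thumb that each bit is worth about 6 dB of amplitude. A small sketch of the conversion (`db_to_bits` is an illustrative name):

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB per bit of amplitude

def db_to_bits(db):
    """Convert a level difference in dB to the equivalent number of bits."""
    return db / DB_PER_BIT

for level_db in (60, 36, 24):
    print(level_db, "dB is about", round(db_to_bits(level_db), 2), "bits")
```

So 60 dB, 36 dB, and 24 dB land very close to 10, 6, and 4 bits respectively, which is where the round numbers in the bit patterns come from.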

In a real compression algorithm it's not quite that simple, it will dynamically use more or less data throughout the audio based on many factors. But this hopefully still gives you a basic understanding of why quieter audio can use less space in lossless compression.
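To make that a bit more concrete, here is a toy model, not FLAC itself. It assumes the simplest possible predictor (each sample is predicted to equal the previous one) and a single Rice parameter for the whole signal, whereas a real encoder picks among several predictors and adapts per block. The test signal (a 440 Hz tone plus noise) and all function names are invented for illustration.

```python
import math
import random

def rice_bits(value, k):
    """Bits needed to Rice-code one signed residual with parameter k."""
    folded = 2 * value if value >= 0 else -2 * value - 1   # map signed to unsigned
    return (folded >> k) + 1 + k    # unary quotient, stop bit, k remainder bits

def estimate_bits_per_sample(samples):
    """Estimate coded size using a previous-sample predictor and the best single k."""
    residuals = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    best_total = min(sum(rice_bits(r, k) for r in residuals) for k in range(20))
    return best_total / len(samples)

rng = random.Random(0)   # fixed seed so the run is repeatable
loud = [int(20000 * math.sin(2 * math.pi * 440 * i / 44100)) + rng.randint(-500, 500)
        for i in range(44100)]
quiet = [s >> 6 for s in loud]    # about 36 dB quieter (drops 6 bits)

loud_bps = estimate_bits_per_sample(loud)
quiet_bps = estimate_bits_per_sample(quiet)
print(round(loud_bps, 2), round(quiet_bps, 2))
```

The quiet version has residuals roughly 64 times smaller, so each one takes several fewer bits to code, even though both versions are 16-bit samples in the uncompressed file.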

RRod · Aug 25, 2016 at 8:18 PM

stalepie said:
Well I was thinking if you hear the same amount of detail (once the same volume is met to the ear) then the compression program (MP3, FLAC, AAC, whatever) could first shrink the waveform and then compress, and then remember the shrink value and upon decompressing during playback revert it to the same size wave.

Well, *lossy* codecs like mp3 and aac can do just that: they can decide that certain parts of the audio content can be stored with less precision due to considerations of audibility. A lossless codec like FLAC can't do that: it has to be able to return exactly the same bits as the original WAV.
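A minimal sketch of why "shrink, store, then scale back up" is not lossless in general. It assumes the shrink is a 6-bit shift (about 36 dB), and the function names and sample values are made up:

```python
def shrink(samples, shift):
    """Reduce amplitude by dropping the lowest `shift` bits."""
    return [s >> shift for s in samples]

def restore(samples, shift):
    """Scale back up by the same factor."""
    return [s << shift for s in samples]

original = [12345, -6789, 321, -1, 0, 32767]
round_trip = restore(shrink(original, 6), 6)
is_lossless = round_trip == original
print(round_trip, is_lossless)

# The round trip is exact only when the low bits were zero to begin with.
aligned = [s << 6 for s in [100, -50, 3]]
aligned_round_trip = restore(shrink(aligned, 6), 6)
print(aligned_round_trip == aligned)
```

Whatever was in the dropped bits is gone after the shrink, so the scheme only returns the original samples when those bits carried nothing in the first place.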

castleofargh · Aug 25, 2016 at 10:00 PM

stalepie said:
I hadn't tried WAV, but I just tried importing Britney Spears - Hot As Ice FLAC into Audacity, exporting 1 minute of it as a 16-bit signed Microsoft WAV, then 1 minute of "The Children's Hour" from the Batman Returns soundtrack, which was quieter, and the Batman WAV is 18 MB and Britney's is 33.

It probably works the same with WAV as you're describing with FLAC. I didn't think about that, discarding the redundant data beyond the dynamic range recorded -- so that does mean large waveforms are wasted space if the dynamics are not there to need it recorded that loud.

you just messed up your wav test. try again with same length, same bit depth, same sample rate.

stalepie · Aug 25, 2016 at 10:35 PM

castleofargh said:
you just messed up your wav test. try again with same length, same bit depth, same sample rate.

oh god you're so right. I'm so stupid. I exported the whole tracks, one was 3 minutes and the other was a minute 45, instead of Export Selected Audio for only a minute each. ughh

(or "argh," since I make people go "argh!")

castleofargh · Aug 26, 2016 at 12:40 AM

shiiit happens mate. I thought about it because I'm really good at messing up my own tests.

does this wav result clear things up for your initial question?
