There is a lot of talk of Hi-Rez audio these days; instead of one format, there are now several Hi-Rez formats becoming available. I'll try to explain here what these formats are, what they do and the reality behind their existence. Although there is a great deal of technical complexity in digital audio, I'll try to keep this as simple as I can (loss of accuracy is inevitable when simplifying but I'll do my best to minimise it).
Sound waves travelling through the air to our ears have only two attributes: the frequency of the waves (i.e. how many waves per second, measured in Hertz, Hz) and the height or energy of the waves (amplitude, measured in Decibels, dB). The different frequencies contained in sound waves are related to the pitch and tonal characteristics of what we hear, and the amplitude is related to what we hear as volume. A simple way of understanding digital audio is to assume that frequency is encoded by the sample rate (how many times a second we measure the sound waves) and the amplitude is encoded by the bit depth (the range of values available for each measurement).
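To make those two parameters concrete, here's a minimal Python sketch (my own illustration, not from any particular audio library) that samples a sine wave at a given rate and quantizes it to a given bit depth — the two knobs that define a digital audio format:

```python
# Illustrative sketch: sample a sine wave and quantize it, showing how
# sample rate and bit depth map to frequency and amplitude resolution.
import numpy as np

def sample_and_quantize(freq_hz, sample_rate, bit_depth, duration=0.001):
    """Sample a unit-amplitude sine wave and round each measurement
    to one of the 2**bit_depth available amplitude steps."""
    t = np.arange(0, duration, 1.0 / sample_rate)   # the sample instants
    wave = np.sin(2 * np.pi * freq_hz * t)          # the wave, measured at those instants
    steps = 2 ** (bit_depth - 1) - 1                # amplitude steps per polarity
    quantized = np.round(wave * steps) / steps      # snap each sample to the nearest step
    return quantized

# A 1 kHz tone with CD parameters (44.1kS/s, 16bit):
cd = sample_and_quantize(1000, 44100, 16)
```

With 16 bits the rounding error per sample is tiny (on the order of 1/32767 of full scale), which is why bit depth relates to noise floor and dynamic range rather than to any audible "detail".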
Bit Depth
I will just summarize here, as I covered this in a thread in more detail a couple of years ago (see here). Bit depth is responsible for encoding the volume, or dynamic range, of the sound waves. Each bit allows for approximately 6dB of dynamic range: 16bit = 96dB and 24bit = 144dB. There is no quality (or any other) difference between 16bit and 24bit except for the additional 48dB of dynamic range. This extra dynamic range is useful for professional recording (to provide headroom) but as headroom is not required on playback, 24bit is of no benefit to the consumer. Due to some clever technology (noise-shaped dither) we can enhance 16bit so it appears to the ear to be equivalent to 20bit (120dB dynamic range). Consider that the most dynamic recordings ever released have a dynamic range of no more than about 60dB and you can see that CD (16bit) already allows roughly 1000 times more dynamic range than is ever used. There are no disadvantages of 24bit over 16bit, except the additional storage space required.
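The "6dB per bit" figure comes straight from the maths: each extra bit doubles the number of amplitude steps, and doubling amplitude is about 6.02dB. A quick check (illustrative only):

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of an n-bit system: 20*log10(2^n) ≈ 6.02n dB."""
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))  # → 96.3 (the familiar ~96dB for CD)
print(round(dynamic_range_db(24), 1))  # → 144.5

# A 60dB margin (120dB noise-shaped 16bit vs a 60dB recording) is an
# amplitude factor of 10^(60/20) = 1000 -- the "1000 times" above.
print(10 ** (60 / 20))  # → 1000.0
```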
Sampling Rate
The sampling rate is measured in samples per second (S/s: kS/s or MS/s). The science upon which digital audio exists is called the Nyquist-Shannon Sampling Theorem. The sound wave frequencies which we can encode in digital audio are defined by the Nyquist Point, which works out at half the sampling frequency. With CD the sampling frequency is 44.1kS/s, so the audio frequency we can encode is limited to 22.05kHz. Anything above 22.05kHz (mostly just noise) has to be removed using a filter.
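The need for that filter is easy to demonstrate: any frequency above the Nyquist Point produces exactly the same samples as some frequency below it (an "alias"), so if it isn't filtered out it turns into audible distortion. A small numpy sketch (illustrative values, my own example):

```python
import numpy as np

fs = 44100               # CD sampling rate
n = np.arange(64)        # 64 sample indices

# A 30kHz tone is above the 22.05kHz Nyquist Point. When sampled at
# 44.1kS/s it is indistinguishable from a 14.1kHz tone (44100 - 30000):
tone_30k = np.cos(2 * np.pi * 30000 * n / fs)
alias_14k = np.cos(2 * np.pi * 14100 * n / fs)

print(np.allclose(tone_30k, alias_14k))  # → True
```

The two sets of samples are identical, which is why everything above the Nyquist Point must be removed before (or during) conversion — there is no way to tell the real tone and its alias apart afterwards.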
Human hearing in adults does not extend beyond 20kHz, so CD (44.1kS/s), with its audio limit of 22.05kHz, appears to cover any eventuality. However, applying the required filter (to remove anything beyond the Nyquist Point) is likely to cause some issues (phase or ringing issues) lower than the Nyquist Point (e.g. at around 20kHz). Increasing the sample rate to 96kS/s means the Nyquist Point is at 48kHz, which in theory allows plenty of space outside the range of human hearing to cure the phase or ringing issues. So in theory a sample rate of 96kS/s should provide a more linear (accurate) recording within the hearing range than 44.1kS/s. Indeed, measurements bear this out. However, I say in theory because a number of scientific studies have shown that, using correctly constructed tests, no one has been able to perceive a difference between 44.1kS/s and 96kS/s.
What about sample rates of 176.4kS/s, 192kS/s and even 384kS/s? Here we start running into problems. Electronic engineering is based on compromise. For example, using a filter creates complications with phase and ringing, however not using a filter breaks the Nyquist-Shannon Sampling Theorem and causes even greater problems (distortion due to alias images). So using a filter is a necessary compromise, being the lesser of two evils. These higher sample rates are also a compromise. 192kS/s provides a potential benefit of recording frequencies up to 96kHz. The downside is that the calculations which need to be carried out when implementing the required filter are now 4 times more complex and have to be carried out 4 times more quickly with 192kS/s than with 44.1kS/s. Unfortunately the laws of physics make this impossible*. To get around this problem, the chip designers have had to simplify the filter implementation at these higher sample rates, resulting in much less efficient filters. Distortion, ringing and phase issues are all measurably poorer at 176.4kS/s and 192kS/s than at 96kS/s. So, the trade-off here is: the benefit of being able to record frequencies between 48kHz and 96kHz versus more distortion and other unwanted artefacts. Also, consider these points:
1. Musical instruments produce virtually no energy beyond 48kHz, so there is nothing much there to record except noise.
2. Few standard studio microphones can record much above 20kHz and none record above 48kHz anyway.
3. 48kHz is already more than twice the highest frequency a human can hear.
4. Almost no commonly available speakers or cans reproduce anything higher than about 40kHz and most can't produce above 20kHz.
With this in mind, there are no advantages to theoretically being able to record frequencies between 48kHz and 96kHz; all that is left are the disadvantages of 176.4kS/s and higher.
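To put rough numbers on the filter problem: an FIR anti-alias filter needs more taps as the sample rate rises (to keep the same transition width in Hz), and it must also run more often per second. The toy cost model below is my own back-of-envelope illustration, not a description of any real converter chip:

```python
def fir_mults_per_second(sample_rate, transition_hz=2000):
    """Crude FIR cost estimate (illustrative only): tap count grows
    with sample_rate / transition_width, and the filter runs once
    per sample, so total work grows roughly with the rate squared."""
    taps = int(4 * sample_rate / transition_hz)  # rough tap-count estimate
    return taps * sample_rate                    # multiply-accumulates per second

base = fir_mults_per_second(44100)
hi = fir_mults_per_second(192000)
print(round(hi / base, 1))  # → 19.0, roughly (192000/44100)**2
```

So a 192kS/s filter of equivalent quality needs on the order of 19 times the computation of a 44.1kS/s one — which is why, in practice, chip designers cut corners on the filter instead.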
For these reasons, some of the very highest-end professional converters do not even provide sampling rates higher than 96kS/s. Any self-respecting, well-educated audio professional would not use sample rates of 176.4kS/s or higher unless forced to by clients.
Conclusion
There are no benefits of 24bit (or 32bit) over 16bit. There is a theoretical benefit of 88.2kS/s and 96kS/s over 44.1kS/s. The so-called Hi-Rez formats of 176.4kS/s and higher are actually of poorer quality and should be avoided. In other words, 16/44 already provides more “hi-rez” than the human ear can detect, but if you want to play it absolutely safe, 24/88.2 or 24/96 provides the highest resolution available.
Marketing
The difficulty facing the audio industry is that 16/44 is an old and well established technology. It's difficult and not very profitable to keep selling the same thing for years. On the other hand, it's easy to convince consumers that bigger numbers are better, so hi-rez provides an ideal opportunity to sell the same customers new equipment and new music collections. Everyone wins, the companies stay in business and the consumers think they are getting something better. The real shame is that instead of spending their development money improving the quality of their products at 16/44, they are spending their money aiming for bigger and bigger meaningless numbers to make their marketing departments happy, while actually reducing audio fidelity.
Observations
So, all those people who believe 24/192 is better than 16/44 are just fooling themselves? In a nutshell, yes, but it's not quite that simple! I know of examples where a 16/44 version has been deliberately doctored to sound worse than a 24/96 or 24/192 version. So in this case, they are not fooling themselves but are deliberately being fooled. Also, concerning converter design: as mentioned before, electronic engineering is effectively a trade-off. This trade-off can be the lesser of two technical problems or it can be cost versus quality. A converter or chip manufacturer may decide to spend more time and money on handling one sample rate better than another. So it's entirely possible that 24/96 may sound better than 16/44 with a particular converter due to its filter or other design considerations. There's no real way around the problems with 176.4kS/s and higher though, so the only explanation for hearing improvements at these sample rates is extraordinarily bad 44.1k and 96k filters in your DAC, or the placebo effect.
*Modern technology has reached the speed limits imposed by the laws of physics (the speed at which a capacitor can be charged, for example). Most CPU advances these days are largely centred around data handling and the efficiency of breaking down complex tasks into simpler ones, and providing more cores so more of these simpler tasks can be computed at the same time. More cores are of limited benefit to digital signal processing because often the tasks cannot be broken down any further. One task often starts with the results from a previous task, so these two tasks cannot be calculated at the same time.
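This kind of unbreakable task chain is exactly what recursive (IIR) filters, common throughout audio DSP, look like. A minimal sketch (a generic one-pole lowpass, my own example, not any specific chip's filter):

```python
def one_pole_lowpass(x, a=0.9):
    """One-pole IIR filter: y[n] = a*y[n-1] + (1-a)*x[n].
    Each output sample depends on the previous output sample,
    so the loop is inherently serial -- extra cores can't help."""
    y, prev = [], 0.0
    for sample in x:
        prev = a * prev + (1 - a) * sample  # needs the result of the last step
        y.append(prev)
    return y

out = one_pole_lowpass([1.0, 1.0, 1.0, 1.0])
```

Because y[n] needs y[n-1] before it can be computed, the samples must be processed strictly in order; the only way to go faster is a faster core, and that is exactly where the physical limits bite.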
Some further reading and supporting evidence:
Lavry White Paper
Benchmark Statement
ProSoundWeb (Big guns discussion)
G