Testing audiophile claims and myths
Jan 8, 2019 at 5:39 PM Post #11,881 of 17,336
Some things to keep in mind regarding whether differences matter and are worth extra storage, cost, etc.:

- One person may notice differences that another doesn't (or can't)

- The differences a person notices may vary over time, depending on how much attention they pay, where they direct their attention, state of body and health, past listening experiences, expectations, etc.

- The differences a person notices may depend on the track

- The differences a person notices may depend on other components in the signal chain

- Even if people notice and don't notice the same differences, they may give different importance to those differences

For a lot of things, these factors make it difficult, if not impossible, to make general statements about what differences can be heard and how much they matter. To a large extent, each person needs to make their own judgments and decisions.
 
Jan 8, 2019 at 6:13 PM Post #11,882 of 17,336
I am looking for a person who can discern a difference between lossy and lossless in a blind test under normal music listening conditions. So far I haven't found anyone. If someone thinks they can do it, please let me know. I would be happy to set you up with a test. So far I have tested dozens and dozens of audiophiles, including several posters in this forum and no one seems to be able to tell any difference above about 256. Thanks.
 
Jan 8, 2019 at 7:25 PM Post #11,883 of 17,336
I am looking for a person who can discern a difference between lossy and lossless in a blind test under normal music listening conditions. [...]

As we've discussed before, blind test results aren't necessarily conclusive, because:

- There are variations in how blind tests are designed, conducted, and interpreted; a blind test can produce false positive or false negative results due to problems in these areas

- Statistics can't be applied to blind tests in a simple way because hearing acuity can vary across trials - auditory perception isn't a simple and consistent "measuring device/process" (e.g., someone might really notice a difference in 10% of trials, and just be guessing in the other 90% of trials, thus producing an apparent null result)

- Memory inaccuracy and fuzziness is a problem in any listening comparison, whether sighted or blind (blinding only fixes the problem of expectation bias)

- Results of blind tests don't necessarily generalize to normal listening (especially with complex musical signals where the listener can't be sure of what differences to listen for, as compared to simple test samples where the listener knows what to listen for), so blind tests may not be a reliable tool to make such an inference

I can't propose a test better than a controlled blind test, but that doesn't mean that such a blind test is good enough to draw the kinds of sweeping conclusions which are sometimes asserted in Sound Science. There can be such a situation as "we're not sure, but leaning this way."
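The intermittent-detection point above can be made concrete with a toy simulation. All the numbers here are assumptions for illustration (a 16-trial session, a 10% true detection rate), not figures from any actual study: a listener who genuinely hears a difference in only a fraction of trials, and guesses in the rest, will usually produce a score a simple binomial test cannot distinguish from pure guessing.

```python
import random
from math import comb

def p_at_least(successes, trials, p=0.5):
    # One-sided binomial tail: chance of scoring `successes` or more
    # correct out of `trials` by pure 50/50 guessing.
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))

def abx_session(trials=16, detect_rate=0.10, seed=0):
    # Hypothetical listener: truly hears the difference in
    # `detect_rate` of trials, guesses 50/50 in the rest.
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        if rng.random() < detect_rate:
            correct += 1                 # genuinely detected
        elif rng.random() < 0.5:
            correct += 1                 # lucky guess
    return correct

score = abx_session()
p_value = p_at_least(score, 16)
# With a 10% true detection rate the expected score is only ~8.8/16,
# so most sessions look statistically indistinguishable from chance.
```

The point is not the exact numbers but the shape: a real but intermittent ability can sit well inside the noise floor of a short test.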
 
Jan 8, 2019 at 7:52 PM Post #11,884 of 17,336
It’s unlikely we will ever have a “perfect” testing methodology, but given the consistency of blind test results (people never successfully identifying differences in a statistically meaningful way), it’s unlikely that the methodology is fatally flawed. If success/failure were conditional, then we should be seeing individuals who can consistently identify lossy vs. lossless under specific conditions. That this never happens in controlled testing is fairly compelling.
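One hedged way to quantify "fairly compelling": even if every test subject were purely guessing, some would pass a single session by luck, so the telling observation is that nobody passes repeatedly. A sketch, where the 16-trial session, 13-correct pass threshold, and 100-subject pool are all assumed numbers:

```python
from math import comb

def p_pass(trials=16, threshold=13, p=0.5):
    # Chance a pure guesser reaches `threshold`+ correct in one session.
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(threshold, trials + 1))

def p_any_lucky_pass(subjects=100, sessions=1, **kw):
    # Chance that at least one of `subjects` guessers passes
    # `sessions` consecutive sessions by luck alone.
    per_subject = p_pass(**kw) ** sessions
    return 1 - (1 - per_subject) ** subjects

one_off = p_any_lucky_pass()               # a single session
replicated = p_any_lucky_pass(sessions=2)  # must pass twice in a row
# A one-off "pass" among 100 guessers is quite likely; a replicated
# pass is not. Replication, not any single result, carries the weight.
```

This is why the absence of even one listener who can repeat a passing score across controlled sessions is stronger evidence than any individual test.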
 
Jan 8, 2019 at 7:59 PM Post #11,885 of 17,336
As we've discussed before, blind test results aren't necessarily conclusive [...]


c'mon man... it ain't nearly as complicated as all that. Why is it so hard for some to admit that 256 and above lossy files are pretty damned hard to distinguish from lossless? "Memory inaccuracy and fuzziness" makes A/B testing unreliable, but some guy saying he KNOWS his new high-$ cable has "dramatically" improved the SQ of his ear buds, or that this DAC sounds way better than that DAC, is perfectly reasonable? It's simple. Listen to these two recordings - one a good (256 or better) MP3 and the other a lossless recording - and tell me which one sounds better. You can listen to them over and over and focus on whatever you want to focus on... then pick the one that sounds better. If you can pick the lossless more often than the lossy, you have golden ears.
 
Jan 8, 2019 at 9:28 PM Post #11,886 of 17,336
It’s unlikely we will ever have a “perfect” testing methodology, but given the consistency of blind test results [...]

c'mon man...it ain't nearly as complicated as all that. [...]

I'm comfortable with saying that the blind tests conducted so far are probably sufficient to conclude that any audible differences between lossless vs somewhat lossy, DACs, amps, and cables are very likely to be very subtle at most for the vast majority of listeners. (The qualifiers I italicized make this sort of statement acceptable to me. I'm not comfortable with sweeping and absolute statements, because we don't have the evidence and knowledge to support such statements.)

A corollary of the above statement is that it can very likely be concluded that the night and day differences that many listeners frequently report are due to misperceptions rather than real audible differences, in the vast majority of cases. We have an impressive ability to frequently misperceive things without being consciously aware of it, and we clearly didn't evolve to be able to make and remember these kinds of fine auditory distinctions. And yet we're drawn to try to make these kinds of distinctions, almost like it's some sort of sport. Weird …

As I've noted before, I don't notice any obvious difference between Spotify Extreme and Tidal lossless, and I happily use both. Given the choice, I usually use Tidal "just in case" there's a subtle difference I don't readily pick up, but I don't really worry that I'm missing out when I use Spotify. There's been talk for a while about Spotify offering a lossless option, and I'm guessing that they haven't done so yet because they don't want to be accused of charging people more for something which makes no difference.
 
Jan 8, 2019 at 10:57 PM Post #11,887 of 17,336
most of the time you act as if a statement is true until proved false, but this is supposedly a section about facts and science, not one about Judge Judy. claims without supporting evidence are worth nothing because science relies on data, not on our good will to trust an empty claim from some dude online. and if providing supporting evidence is overly complicated or impossible, then the answer isn't to accept the possibility that the claim is true. the answer is to ask why anybody would claim something he has no idea how to demonstrate.
empty claims can be rejected, and doing so doesn't mean we claim that the guy was wrong; it means that we have better things to do than getting into a debate without supporting data. it's the same good practice that refuses to argue non-falsifiable ideas or claims.
 
Jan 8, 2019 at 11:07 PM Post #11,889 of 17,336
Sounds good....

But, for starters, you will have to test it under ALL "normal listening conditions".
So, of course, you should try it in stereo.
Then you should try it in synthesized 5.1 and 7.1 channel surround sound, when decoded using Dolby PLIIx, DTS Neo-6, the Dolby Surround Upmixer, the DTS Neo-X upmixer, and the Auro 3D upmixer.
You should also confirm that there is no audible difference when it's played through the most common "headphone enhancement" plugins - including at least the "Dolby Headphone" plugin and the new "Atmos headphone" plugin for Windows.
You should probably also include a few of the proprietary surround sound modes offered by the major manufacturers.
These are all "normal listening conditions" used by large numbers of "typical listeners".
(We sell both stereo and home theater equipment at Emotiva - and many of our customers who own home theater equipment listen to their stereo music through a surround sound decoder in synthesized surround sound.)

I am looking for a person who can discern a difference between lossy and lossless in a blind test under normal music listening conditions. [...]
 
Jan 8, 2019 at 11:15 PM Post #11,890 of 17,336
But ok. I will admit I do still have some MP3s sitting around in my collection. But it's stuff I don't really care about, and I listen to it very rarely.
 
Jan 8, 2019 at 11:30 PM Post #11,891 of 17,336
I absolutely agree....
However, this leads us right around the circle, and back to a very basic question:
WHICH claim is the one that we are supposed to reject without proof?
If we accept the criterion that "we shouldn't accept ANY claim without proof" then we simply have two unproven claims.
I should also note that failure to prove that one of those claims is true does NOT prove by default that the other is true.

One person claims that "lossless files sound audibly the same as lossy files".
Another claims that, since lossy files can be shown to be measurably quite different, it seems likely that they will be audibly different.
(I personally suspect that, because lossy files are measurably very different, it seems likely that there will turn out to be situations where those differences lead to audible differences... but I make no claim to have tested it either way.)

It is a logical fallacy to assume that either of those claims is some sort of "default assumption"; by your criterion there is no such thing as a default assumption.
Neither claim is "obviously true" or "obviously likely to be true" or "obviously unlikely to be true".
They are BOTH "just empty claims" until and unless valid and relevant proof is presented.
We need to see actual proof before assuming that EITHER of those claims is true.
(Note that the scale and scope of the proof must be appropriate; for example, if you test five subjects and three pieces of equipment, you cannot reasonably generalize your results to "everyone" or "all equipment".)

I should also point out that most lossy compression methods, including the popular MP3, are not standardized.
If you compress the same original file into a "320k VBR MP3 file" using different encoders, you will end up with different results.
Each encoder uses its own "judgment" to decide what to discard.
They are all based on the same basic assumptions - but the details vary considerably.
Therefore, at a very minimum, when making this sort of claim, you must specify the EXACT encoder, version, and settings that were used.
(There is no reason to assume that different encoders, or the same encoder with slightly different settings, will produce equally "audibly transparent" results.)
And, yes, this can be a problem if you purchase MP3 files, because vendors often neglect to tell you what compressor and settings they used.
(Of course, the solution there is to find an encoder whose performance YOU trust, then compress your own files from lossless originals.)
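One hedged way to check the "different encoders produce different results" point is to decode both encodings back to PCM and measure the level of the difference signal. Real MP3 decoding needs external tools, so this toy stand-in quantizes the same sine wave to two different bit depths, mimicking two encoders discarding different amounts of detail; the bit depths and the resulting dB figures are illustrative assumptions only, not measurements of any actual codec.

```python
import math

def rms_db(samples):
    # RMS level in dB relative to full scale (1.0).
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def quantize(samples, bits):
    # Crude stand-in for an encoder: round onto a `bits`-deep grid.
    step = 2.0 / (2 ** bits)
    return [round(s / step) * step for s in samples]

sr = 44100
sine = [math.sin(2 * math.pi * 1000 * n / sr) for n in range(sr // 10)]

out_a = quantize(sine, 10)   # "encoder A"
out_b = quantize(sine, 12)   # "encoder B": same source, different choices
residual = [a - b for a, b in zip(out_a, out_b)]
residual_db = rms_db(residual)
# The two outputs are measurably different (a nonzero residual),
# yet the difference sits far below full scale.
```

The same decode-and-subtract approach, applied to real encoder outputs, is how one would verify the claim that two "320k VBR" files from different encoders are not the same signal.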

most of the time you act as if a statement is true until proved false, but this is supposedly a section about facts and science, not one about Judge Judy. claims without supporting evidence are worth nothing because science relies on data, not on our good will to trust an empty claim form some dude online. and if providing supporting evidence is overly complicated or impossible, then the answer isn't to accept the possibility that the claim is true. the answer is to ask why anybody would claim something he has no idea how to demonstrate.
empty claims can be rejected and doing so doesn't mean we claim that the guy was wrong, it means that we have better things to do than getting into a debate without supporting data. it's the same good practice that refuse to argue non falsifiable ideas or claims.
 
Jan 9, 2019 at 12:11 AM Post #11,892 of 17,336
Perhaps you have a short memory......

There was a time when "nobody could tell the difference between a cylinder recording and a live performer".
Then people insisted that vinyl "was so close to perfect that there was no point in looking for improvement".
Then we were told that most people couldn't tell "is it live or is it Memorex" (referring to cassettes).
Then there was a time when "most people were sure that 128k MP3 files were audibly perfect".
(Note that the developers of the MP3 compression process never made claims beyond that "most listeners" wouldn't notice a difference with "most music".)

I agree.... there's nothing to suggest that the methodology itself is flawed.
However, there really have NOT been "comprehensive widespread tests".
(It seems reasonable to suggest that, at least for now, no single group has both the resources and the inclination to perform those tests.)

I should also note something about human nature - which is that we learn and evolve in our ability to recognize things.
In one very early test, an audience was unable to tell the difference between a live performer and a cylinder recording.
HOWEVER, it is important to note that the audience who participated in that test had no experience whatsoever with recorded music... having only ever experienced live performances.
To them, that poor quality recording was "the closest thing they'd ever heard to a live performance - other than a live performance".
A modern audience would have been quick to notice the surface noise, ticks and pops, and distortion of the cylinder recording as "obvious hints that it was a record".
In short, we have LEARNED that ticks, pops, and hiss are artifacts often associated with mechanical recordings like vinyl records.

This strongly suggests an interesting avenue of research.

After doing careful tests to determine whether listeners can detect differences between lossless and lossy compressed files (using a particular level and sort of compression), we should take one group of listeners and "teach them the differences".
This would be accomplished by allowing them to listen to both versions of several different files - while pointing out the differences that exist "so they know what to listen for".
("Here's what those two files look like on an oscilloscope. Do you see the differences? Do you hear a difference that seems to correlate with the difference you see?")
We should then re-run the test, to find out whether our "taught" group has in fact LEARNED how to better notice and recognize the differences between the files.
We could then perform a double blind test to determine whether our "taught" group has actually LEARNED to be more accurate in distinguishing lossy files - or not.

We aren't born knowing how to tell a counterfeit painting from an original - doing so is a skill that we learn - and that some people have a particular aptitude for while others don't.
And, for those of us who lack that skill, the differences noticed by skilled experts are often invisible or very difficult to detect until they are pointed out to us.
Why would we assume that the ability to recognize the small differences caused by lossy compression shouldn't have a similar characteristic?
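The train-then-retest protocol described above is easy to pilot in code before involving real audio. This is only scaffolding under assumed numbers (16 trials per session, and hypothetical detection rates standing in for "untrained" and "trained" listeners); the empirical question the protocol would answer is whether teaching the cues actually raises a real listener's rate.

```python
import random

def run_abx(detect_rate, n_trials=16, rng=None):
    # One ABX session: each trial, X is secretly A or B. With
    # probability `detect_rate` the listener genuinely hears which
    # it is; otherwise they guess 50/50.
    rng = rng or random.Random()
    correct = 0
    for _ in range(n_trials):
        x = rng.choice("AB")
        answer = x if rng.random() < detect_rate else rng.choice("AB")
        correct += (answer == x)
    return correct

rng = random.Random(7)
before = run_abx(detect_rate=0.05, rng=rng)  # naive listener (assumed rate)
after = run_abx(detect_rate=0.60, rng=rng)   # after coaching (assumed rate)
# Comparing `before` vs `after` scores across a group of listeners is
# the re-run step: did training measurably improve detection?
```

The double-blind re-test then simply runs this same session structure with real listeners and real files, with neither the listener nor the proctor knowing which file is X.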

It’s unlikely we will ever have a “perfect” testing methodology, but given the consistency of blind test results [...]
 
Jan 9, 2019 at 12:31 AM Post #11,893 of 17,336
I absolutely agree....

And, yes, many people on all sides of the fence take advantage of the inclination we humans have to exaggerate our perceptions.
And, yes, many people also see it as a sort of competition... where the winner is more perceptive than his competitors (or has the more golden ear).

HOWEVER, you must always remember to maintain perspective....
It's quite likely that neither you nor I would get to the store any faster if we were driving a Formula 1 racing car.
However, it is still true that the Formula 1 racer is faster and does perform better than my Nissan Versa.

You may reasonably claim that "neither of us would benefit from owning a Formula 1 racer".
However, you may NOT reasonably suggest that "the Formula 1 racer doesn't perform better than the Nissan" just because we are unable to take advantage of the differences.

I have heard that Spotify has been "test marketing" their lossless service.
However, since we haven't heard any more about it, it seems likely that their test wasn't very successful (at least from the perspective of generating revenue).

I'm comfortable with saying that the blind tests conducted so far are probably sufficient to conclude that any audible differences between lossless vs somewhat lossy, DACs, amps, and cables are very likely to be very subtle at most for the vast majority of listeners. [...]
 
Jan 9, 2019 at 12:35 AM Post #11,894 of 17,336
Perhaps you have a short memory......

There was a time when "nobody could tell the difference between a cylinder recording and a live performer". [...]

Why would we assume that the ability to recognize the small differences caused by lossy compression shouldn't have a similar characteristic?


So many words and so little actual refutation. Wax cylinders, alleged testing from the 1920s with no references, and 70s advertising slogans - seriously?

Still waiting for you to present actual evidence rather than blindly lobbing grenades in the hopes of actually hitting something. It’s almost as if you have a financial stake in avoiding the data available from existing testing...
 
Jan 9, 2019 at 4:55 AM Post #11,895 of 17,336
[1] I can't propose a test better than a controlled blind test, but that doesn't mean that such a blind test is good enough to draw the kinds of sweeping conclusions which are sometimes asserted in Sound Science.
[2] There can be such a situation as "we're not sure, but leaning this way."
[3] I'm comfortable with saying that the blind tests conducted so far are probably sufficient to conclude that any audible differences between lossless vs somewhat lossy, DACs, amps, and cables are very likely to be very subtle at most for the vast majority of listeners. (The qualifiers I italicized make this sort of statement acceptable to me ...)

1. But we don't ONLY use "such a blind test" to draw the kinds of sweeping conclusions. Lossy compression algorithms have been tested extensively over a period of at least a couple of decades (and have been refined over that period) and not just with "such a blind test" but with countless and far more robust controlled double blind/ABX tests.

2. Yes, there can be but this isn't one of those situations!

3. You are of course entitled to your opinion, but this is the Sound Science forum, and what you personally are "comfortable with saying" or what "sort of statement is acceptable to you" is irrelevant. There is no evidence to support the assertion that the differences are "very likely to be very subtle at most"; all the reliable evidence indicates that the differences are "very likely to be completely inaudible". If there were audible but very subtle differences, we would see some evidence of that: a small minority of test subjects who could reliably identify a difference, and a somewhat higher percentage among trained listeners using the highest quality (most accurate) reproduction systems/environments. But we don't see this. We don't see ANY percentage of test subjects who can reliably identify a difference, even among the most highly trained listeners with the best reproduction equipment. The actual situation, as far as the evidence is concerned, is therefore: "We cannot be absolutely sure about anything, but we can be reasonably certain".

[1] But, for starters, you will have to test it under ALL "normal listening conditions".
[2] You should probably also include a few of the proprietary surround sound modes offered by the major manufacturers.
[3] These are all "normal listening conditions" used by large numbers of "typical listeners".

1. Clearly that statement is false. It's clearly impossible to test every combination of consumer equipment and listening environment, let alone test every consumer with every combination of equipment/environment. There can be no absolute proof, only a weight of evidence.

2. If a consumer wants to take a lossy (or lossless) recording and completely change the fidelity/purpose according to their own preferences, that's entirely up to them but it's not a lack of performance if it's used for a different purpose than it was designed for. Using your own analogy, a Formula 1 car does NOT have "better performance" than a Nissan Versa, if you want to go to a store (with your wife or kids, and buy something) a Formula 1 car has no performance at all, let alone "better performance"!

3. No they're not, they are abnormal listening conditions, not what lossy codecs were designed for. Additionally, it's not done "by large numbers" of "typical listeners", it's done by an extremely small minority.

[1] However, this leads us right around the circle, and back to a very basic question: WHICH claim is the one that we are supposed to reject without proof?
[1a] If we accept the criterion that "we shouldn't accept ANY claim without proof" then we simply have two unproven claims.
[1b] It is a logical fallacy to assume that either of those claims is some sort of "default assumption"; by your criterion there is no such thing as a default assumption.
[2] There was a time when "nobody could tell the difference between a cylinder recording and a live performer".
[2a] Then people insisted that vinyl "was so close to perfect that there was no point in looking for improvement".

1. It does indeed unfortunately lead us right around the circle again, and back to the very basic fallacy which you keep repeating! As there is and cannot be any absolute proof, the claim "we are supposed to reject" is the claim WHICH has no supporting reliable evidence in favour of the claim which has overwhelming supporting evidence.
1a. You can accept any criterion you choose, but this is the Sound Science forum (not the "KeithEmo's Criterion" forum) and therefore we "accept the criterion" of science, which is that we accept the claim which is supported by reliable evidence and reject the claim which has none! For example, Evolution has not been proven, and neither has Creationism, but scientifically we do NOT "simply have two unproven claims". Scientifically, the claim of Evolution is accepted (without proof) because of its weight of reliable evidence (and the lack of evidence for Creationism).
1b. No, it's a logical fallacy not to accept the "default assumption" of, say, Evolution.

2. No there wasn't! You keep contradicting yourself, you keep going on about proof and then make an assertion without even any supporting evidence, let alone proof. Why is that?
2a. What people? Clearly your statement does not apply to all people and specifically excludes entire groups of people (scientists and engineers for example). Why was the RIAA curve invented, why was digital audio invented, was no one able to measure anything 80 years ago?

And off we go again, round and round in circles!

G
 
