Absolutely.
I just wanted to point out that there's a distinct difference between statistical significance and a fact that can be generalized with absolute certainty... and many audiophiles in particular seem to miss that distinction.
The number of people with a potentially fatal peanut allergy is almost certainly "statistically insignificant".
And this even extends to the point that, if you see someone fall over after eating a peanut, the likelihood is that they DIDN'T collapse from a peanut allergy.
Yet neither of those statistical probabilities rules out the POSSIBILITY that he or she has a serious peanut allergy.
Unfortunately, many audio companies use less than credible science to sell their products, and many audiophiles seem too eager to believe what they read (or give in to their biases).
This leads to the quite reasonable suggestion that much of what audiophiles claim to hear is, in fact, the product of their own biases.
My point was basically that many people don't understand "how to read the statistics" and "how to design the tests".
For example, let's say I'm trying to prove that "the difference between FLAC and WAV files of the same bit depth and sample rate is inaudible".
(First off, the only real way to do this is to attempt to falsify that claim - attempt to prove that some people do hear a difference and then fail in that attempt.)
I could test a thousand people, with twenty files each, and perhaps produce a result that "there was no statistically significant correlation" when people attempted to tell which they were listening to.
However, by doing it that way, I would have failed to prove that there IS a difference, and I still could not PROVE that there isn't one; I could only suggest that there is NO difference.
(And only on the particular test equipment, with the particular sample files, and under the particular test conditions, I chose.)
The test protocols often used in audio generally fail to pick out small groups of outliers (for example the small percentage of people who have "absolute pitch").
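To make that concrete, here is a minimal simulation sketch (the numbers are my own illustration, not from any real test): 1000 listeners do 20 trials each; 990 guess at chance and 10 hypothetical "golden ears" are right 90% of the time. In a typical run the pooled hit rate stays close to 50%, while several of the outliers are individually significant.

```python
# Illustrative only: pooled vs. per-listener analysis of a forced-choice test.
import random
from math import comb, sqrt

random.seed(1)

TRIALS = 20
accuracies = [0.5] * 990 + [0.9] * 10   # true per-trial accuracy per listener
scores = [sum(random.random() < acc for _ in range(TRIALS)) for acc in accuracies]

def p_at_least(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Pooled view: the overall hit rate sits near 50% (normal-approximation z-score).
total, n_total = sum(scores), TRIALS * len(accuracies)
z = (total - 0.5 * n_total) / sqrt(n_total * 0.25)
print(f"pooled: {total}/{n_total} correct ({total / n_total:.1%}), z = {z:.2f}")

# Per-listener view: the outliers' scores are individually very unlikely under chance.
flagged = sum(p_at_least(s, TRIALS) < 0.001 for s in scores)
print(f"listeners individually significant at p < 0.001: {flagged}")
```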
From a practical point of view, if I wanted much more conclusive results, here's how I would run the test....
I would put out a public call for test subjects.
I would offer a prize of $500 to anyone who can "get 17 out of 20 correct" when trying to guess whether they're listening to a FLAC or a WAV file.
I would invite them to use the audio system of their choice.
(If I expected a lot of applicants, I might have a self-administered "screening round", to weed out those who obviously couldn't tell, before the "cash round".)
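For reference, that 17-out-of-20 bar is already quite hard to clear by luck; a quick sketch of the arithmetic (mine, not part of the original proposal):

```python
# Chance of hitting the proposed pass mark by pure guessing (p = 0.5 per trial).
from math import comb

p_pass = sum(comb(20, k) for k in range(17, 21)) / 2 ** 20
print(f"P(>= 17/20 correct by guessing) = {p_pass:.5f}")   # ~0.00129, roughly 1 in 780
```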
This protocol would:
- self-select for people likely to actually be able to hear a difference (at least by their own evaluation)
- provide those people an incentive to participate
- provide an incentive for them to try their hardest to succeed
- ensure that they were tested under optimal test conditions (within limitations)
I would now see if any of the applicants were INDIVIDUALLY able to distinguish which file was which format to a statistically significant degree.
And, if a statistically significant number of the participants "beat the odds", then I would conclude that I had a positive result.
And, if even a few participants "beat the odds", I would conclude that the result appeared significant, but statistically COULD still be due to random chance.
So I would RETEST my successful candidates - with a longer list of files.
And, if they were AGAIN able to "beat the odds" I would conclude that those few candidates were actually able to hear the difference.
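A rough sketch of why the retest matters (the 30-out-of-40 retest threshold and the applicant count are assumptions of mine, purely for illustration): with enough applicants, a lucky pass of round one is quite plausible, but a lucky pass of both rounds is not.

```python
# Illustrative thresholds: 17/20 in round one, then a hypothetical 30/40 retest.
from math import comb

def p_at_least(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5), i.e. pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p_round1 = p_at_least(17, 20)          # ~1.3e-3
p_round2 = p_at_least(30, 40)          # ~1.1e-3
print(f"one guesser passes round 1 by luck:     {p_round1:.2e}")
print(f"one guesser passes both rounds by luck: {p_round1 * p_round2:.2e}")

# With many applicants, someone passing round one by luck becomes quite likely;
# someone passing BOTH rounds by luck does not.  (Applicant count is assumed.)
APPLICANTS = 200
print(f"at least one of {APPLICANTS} guessers passes round 1: "
      f"{1 - (1 - p_round1) ** APPLICANTS:.1%}")
print(f"at least one of {APPLICANTS} passes both rounds:      "
      f"{1 - (1 - p_round1 * p_round2) ** APPLICANTS:.3%}")
```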
And, if, UNDER CONDITIONS CHOSEN BY THE APPLICANTS, their guesses were still random, I would have a pretty solid justification for claiming that there was probably no audible difference.
In short, I would have offered every reasonable opportunity for a positive result...
All I need is one person who can consistently and reliably hear a difference to state with certainty that "at least some humans can hear a difference".
However, BECAUSE I PROVIDED EVERY POSSIBLE OPPORTUNITY FOR A POSITIVE RESULT, if I FAIL to produce a positive result, then my failure gives credibility to my claim that the negative result is probably correct.
(Note that it is impossible to ever prove a negative in most situations.)
I would still have failed to test against the possibility that some specific small group could actually hear a difference but didn't participate (perhaps only children below the age of five can hear it...)
But that is probably a minor consideration.
However, I would have given every person who believes that there is in fact a difference the opportunity to "win their point" under their chosen conditions.
Therefore, I can assert that "I have given everyone a fair opportunity to prove me wrong in my claim that there is no difference - and nobody has succeeded in doing so."
@KeithEmo you're right, but with one small proviso. The more outcome measures you have, the higher the likelihood of some outcome reaching statistical significance. This is called P-hacking, and is a big problem in medicine and psychology at present. If you use a P-value of 0.05 (i.e. a 5% chance of the result being due to chance alone), you only need to measure about ten outcomes to have a pretty high likelihood that one reaches statistical significance (chances are 1-(19/20)^10, I think...)
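(That figure works out to about 40%; a one-line check, not part of the quoted post:)

```python
# With ten independent outcome measures, each tested at alpha = 0.05, the chance
# that at least one looks "significant" purely by chance is 1 - (19/20)^10.
alpha, m = 0.05, 10
print(f"{1 - (1 - alpha) ** m:.3f}")   # ~0.401, i.e. roughly a 40% false-alarm rate
```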
A better indicator of significance in your example would be clusters of people at one end of the normal distribution curve.