The validity of ABX testing
Jul 20, 2009 at 7:43 PM Post #46 of 109
Quote:

Originally Posted by PhilS
Maybe you could restate your reply to my previous post in terms of quick-switching, as I don't see why the putting analogy isn't pretty good. I think you're focusing on the trees rather than the forest when you say that we know how people react to bullets. (I'm not being critical; I'm just trying to explain my difficulties with the argument.)

To me, if a believer says quick-switching is something that can prevent people from hearing differences in ABX tests, I don't see why it isn't reasonable to accept a positive result and reject a negative result when quick-switching is utilized. The positive result may mean (1) the differences were so obvious that the quick-switching was not enough of a hindrance to overcome them, or (2) quick-switching is not a hindrance at all. OTOH, if the result is negative, we don't know that quick-switching is not a hindrance. Maybe that is what caused the negative result.



The reason the bullet analogy doesn't convince me is that the two situations work off different assumptions. For one, putting is purely a matter of skill, whereas in this instance it's about whether a difference can be detected in the first place. With putting, the variable we're looking to isolate is purely the skill of the subject, whereas with audio components we're looking to isolate both the skill of the subject and the equipment as well. That aside, however, the big problems I have with the assumptions are as follows:

1) With bullets flying over someone's head, we a) know that those bullets are there and b) we know that the bullets have effects that can foil the test. With quick-switching, we only know that quick-switching is employed - we don't know whether or not quick-switching has any effect that can foil the test. It's undeniable that loud noises and the fear of death will affect just about any test, let alone one concerning golf putting - it's common knowledge that those test conditions are awful. However, it's not common knowledge that quick-switching is inherently bad. In fact, we have empirical evidence that quick-switching increases the chances of success in other aspects of audio (i.e. codec testing).

2) If ABX tests were successful, nobody would argue that quick-switching mucks up the tests - so the only reason why quick-switching (or other aspects of testing) is even suspected to make a difference is because ABX tests don't reveal differences between components. However, I think suspecting something because of a given test result trend is not enough - there needs to be something more in order to give the argument credence.

Of course, this isn't limited to believers - I think more skeptics would argue that ABX tests are flawed if they produced positive cable results than would admit to it (not so much in the sound science section as in other subsections and on other forums altogether), and they would probably make similar arguments (that some factor or other caused faulty results). Nevertheless, if all the evidence we have for quick-switching messing tests up is the fact that ABX tests aren't producing positive results, that's not enough proof. It's a possibility, but we don't usually take possibilities seriously unless we have reasonable cause to - I don't, and I don't think most people do or should, consider every single possibility and give them all equal weight unless there's sufficient and equal probable cause for each one.

The reason I think that biases are at play here is that I've seen empirical evidence that placebo/bias effects have influenced many tests: many tests pretend to swap a cable while actually leaving the setup unchanged, and subjects still report hearing differences between the "cables." So there must be some sort of bias influence when it comes to testing, but I haven't seen compelling evidence of other factors having a large impact in the way that biases do.

Quote:

Originally Posted by mike1127 /img/forum/go_quote.gif
You say it pretty well here, just as you have all throughout this thread.

I think maybe Royalcrown's point is that SOME believers would base their entire argument on the success or failure of a test. In other words, this strawman "believer" being criticized by Royalcrown thinks like this:

- Someone didn't pass an ABX test.
- Therefore ABX tests are flawed.
- Now let me invent some reasons why.

There may in fact be a few people who think like that, but certainly none of them have participated in this sound science forum during the time I've been here.



That's not what I'm saying at all. If you want me to translate it into "stat-speak," I don't think there's reasonable cause to assume that p is greater than 0.5, for all of the reasons listed above. I try to refrain from such word usage because it's imprecise and impedes clarity, but if you insist. I was referring to your original post on quick-switching and imagination contamination, before you started referring to statistics. I don't know how many more times I have to say this: the thread is not about statistics.
 
Jul 20, 2009 at 7:55 PM Post #47 of 109
Quote:

Originally Posted by royalcrown
Of course, this isn't limited to believers - I think more skeptics would argue that ABX tests are flawed if they provided positive cable results than would admit to it (not really in the sound science section as much as on other subsections and other forums altogether), and they would probably argue similar things (that some factor or other caused faulty results).


If there was a substantial body of positive cable test evidence I would assume cables did make a difference and that my ears or system could not reveal it.

I would not assume there was something wrong with the test itself and then proceed to make up convoluted, snake-eating-its-own-tail arguments for why the test is flawed.

You are right though, some skeptics would.
 
Jul 20, 2009 at 9:41 PM Post #48 of 109
Quote:

Originally Posted by royalcrown
This is a pretty simple thread, but I decided to make a separate thread because I want to get as many answers as possible without threadjacking.

I think it's safe to assume that at least a handful of people on this forum believe that double blind testing is invalid as a form of testing, and argue from there that a negative result of an ABX test, as currently performed, with all of its downfalls (no pooling, large sample size, listening fatigue, quick-switching, imagination contamination, etc) does not mean much. The position, if I've summarized this properly, is that ABX is too flawed of a testing method to draw reliable conclusions.

Now say I were to perform an ABX test, with quick-switching, a large sample size, and no pooling. Say that in this hypothetical, the test comes back positive - people can, according to this test methodology, distinguish between, say, cables (what's being tested isn't that important, you can insert DACs or amps if you want).

The question is: would you accept this test?

Assuming that it was done properly, most "skeptics" would (I imagine) have sufficient proof to say that at least some people can tell the difference between two cables. This is my personal belief. However, this belief cuts both ways - we can't say that positive results are compelling without also saying that negative tests are compelling as well (even if they're far less compelling), or at least that they hold some sort of validity.

If, however, you are not a skeptic, is this positive result notable? If not, why not? And if so, how do you reconcile the apparent contradiction between shunning negative results and embracing positive results (given that ABX testing is so flawed to begin with)?




There shouldn't even be an argument about this. Double blind studies (placebo controlled ideally) are the gold standard for analyzing the effect of an independent variable such as drugs, vitamin C, prescription lenses, etc.

If audio components are claimed to produce an audible difference, constructing a valid experiment to determine the relationship between independent and dependent variables is as simple as enrolling yourself in statistics 101.

This is a no-brainer argument in the eyes of science, and science will kick the arse of ignorance every time. Every ... time.
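To show how small the statistics-101 part actually is: under the null hypothesis that the listener is guessing, each ABX trial is a fair coin flip, so the chance of any given score arising by luck is a one-line binomial sum. A minimal sketch in Python (the 12-out-of-16 score below is a made-up number, purely for illustration):

from math import comb

def abx_p_value(correct, trials):
    # Probability of getting `correct` or more right out of `trials`
    # if the listener is purely guessing (each trial is a 50/50 coin flip).
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(abx_p_value(12, 16))  # ~0.038 - such a score would be unlikely by luck alone

A score like that would count as a positive result; a score near 8/16 would be indistinguishable from guessing.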
 
Jul 20, 2009 at 10:11 PM Post #49 of 109
Quote:

Originally Posted by Catharsis
There shouldn't even be an argument about this. Double blind studies (placebo controlled ideally) are the gold standard for analyzing the effect of an independent variable such as drugs, vitamin C, prescription lenses, etc.

If audio components are claimed to produce an audible difference, constructing a valid experiment to determine the relationship between independent and dependent variables is as simple as enrolling yourself in statistics 101.

This is a no-brainer argument in the eyes of science, and science will kick the arse of ignorance every time. Every ... time.



Thank you for favoring us with your insights. We'll return later for further comments like: "I'm right and you're wrong and anybody who doesn't agree with me is a fool."


And now, back to a reasoned discussion of the issues.
 
Jul 20, 2009 at 10:23 PM Post #50 of 109
Quote:

Originally Posted by royalcrown

Of course, this isn't limited to believers - I think more skeptics would argue that ABX tests are flawed if they provided positive cable results than would admit to it (not really in the sound science section as much as on other subsections and other forums altogether), and they would probably argue similar things (that some factor or other caused faulty results).



I think I understand a little bit better what you're driving at. But I think that, if there is a positive result, it is more likely to be a valid result that cannot reasonably be explained as a false positive, as compared to the opposite situation, i.e., a negative result that can be explained by a flaw in the testing methodology.

So I guess where I come out on this is that it is not unreasonable for a believer to accept a positive result while at the same time having some doubts about the validity of a negative result. For example, I can see a reasonable believer saying, I have some concern that quick switching may interfere with hearing a difference. But if we get a positive result with quick switching, that is persuasive to me. I could also see that believer doubting very much the attacks on the test by a skeptic who alleges, for example, that there must have been some "signaling" or cheating going on.

Part of this is just the belief that people have in their own positions, and part of it is due to the null hypothesis and the different import of positive and negative results.
 
Jul 20, 2009 at 10:48 PM Post #51 of 109
Royalcrown,

Before proceeding, we need to agree on a few things or else this is a complete waste of time.

I am proposing hypotheses that need to be tested.

Do you understand that? I am not "assuming" anything. I am not "explaining" anything with "just so" stories.

It might be worth asking if you see any usefulness in proposing a hypothesis that there is a problem with quick-switching as a test. Are you content to accept the results we have so far? Do you see any usefulness in examining those conditions?

If not, there is no point in continuing. If not, you will simply regard everything I say as "just so" stories invented to explain "any results I don't like." Rather than a useful hypothesis that results from examining those conditions.
 
Jul 21, 2009 at 6:40 PM Post #52 of 109
Quote:

Originally Posted by PhilS
Thank you for favoring us with your insights. We'll return later for further comments like: "I'm right and you're wrong and anybody who doesn't agree with me is a fool."


And now, back to a reasoned discussion of the issues.




This is a bit like arguing whether science has any place in debating creationism by using "controversial" theories like evolution.
Much like creationism advocates, expensive-cable advocates insist that we adhere to their ideas without offering any proof, AND suggest that well-proven electrical equations need not apply in this audio arena.

In this sense, the burden of proof rests with those who think cables make a difference. Sound/electronics engineers and physicists (scientists, if you will) will point out that expensive cables don't work, based on well-proven equations that accurately predict electrical properties such as resistance, inductance and capacitance. Science takes the risk of being incorrect by subjecting its theories to experiments, whereas heresay like "expensive cables are better" is somehow expected to be taken as truth without any attempt to obtain evidence to support the conclusion.
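To give one concrete example of what those equations say (the figures here are typical textbook values I'm assuming, not measurements of any particular cable): treat an interconnect as its source's output impedance driving the cable's shunt capacitance, i.e. a first-order low-pass, and the roll-off lands orders of magnitude above the audio band.

from math import pi

# Assumed, illustrative values - not measurements of any real cable.
source_impedance_ohms = 100.0       # typical line-level output impedance
capacitance_pf_per_metre = 100.0    # typical interconnect capacitance
length_m = 1.5

c_total = capacitance_pf_per_metre * length_m * 1e-12   # total shunt capacitance in farads
f_corner_hz = 1.0 / (2 * pi * source_impedance_ohms * c_total)

print(f_corner_hz / 1e6)   # ~10.6 MHz, roughly 500x above the top of the audible range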

Quite simply, there is no reason whatsoever to believe that ABX testing cannot be applicable to examining the audible effects of cables on an audio system. This can be done double blind and controlled with absolute ease.
 
Jul 21, 2009 at 8:07 PM Post #53 of 109
Quote:

Originally Posted by Catharsis
Quite simply, there is no reason whatsoever to believe that ABX testing cannot be applicable to examining the audible effects of cables on an audio system. This can be done double blind and controlled with absolute ease.


You might notice this thread is a discussion among those who are willing to seriously consider the evidence for both sides, and those who don't take anything as gospel, even conventional scientific or engineering wisdom. Telling people the issue is settled, and saying in essence "you are stupid and foolish to try to debate it," is against the spirit of the scientific method. Your post is not constructive in this context. If you would like to answer a specific point with specific data rather than tell us "science will kick your ass" you are more than welcome to. Otherwise it looks like you are just chest-puffing. And if you really have no intention of listening to anyone's point and simply want to knock it down, what reason do you have to post to this thread?
 
Jul 21, 2009 at 8:55 PM Post #54 of 109
Quote:

Originally Posted by mike1127
You might notice this thread is a discussion among those who are willing to seriously consider the evidence for both sides, and those who don't take anything as gospel, even conventional scientific or engineering wisdom. Telling people the issue is settled, and saying in essence "you are stupid and foolish to try to debate it," is against the spirit of the scientific method. Your post is not constructive in this context. If you would like to answer a specific point with specific data rather than tell us "science will kick your ass" you are more than welcome to. Otherwise it looks like you are just chest-puffing. And if you really have no intention of listening to anyone's point and simply want to knock it down, what reason do you have to post to this thread?


Some of this is very frustrating, because there are clearly scientists, researchers, and other people here who frequently design and/or read/critique scientific studies that use blinded controlled comparisons to test a hypothesis. For this group of people, the interpretation of the results is very simple and straightforward.

If a controlled ABX experiment finds that subjects CAN distinguish between two cables, then there MAY actually be an audible difference between the two cables. It is up to everyone to determine if there were any statistical or methodological flaws that could have explained the test results OTHER THAN an actual audible difference in the cables.

On the other hand, if a controlled ABX experiment finds that subjects CANNOT distinguish between two cables beyond random chance, then there MAY NOT actually be an audible difference between the two cables. It is up to everyone to determine if there were any statistical or methodological flaws that may have failed to detect a difference in cables, should one have existed.

This is how a scientist would interpret an ABX experiment. This is a logical and reasoned approach. However, for people who have not read hundreds of published studies, or been involved in designing or critiquing experiments, this type of approach and reasoning does not come naturally. If you are a lawyer, a real estate agent, a mechanic, or a CEO, you can't expect to just read about a scientific experiment and interpret it correctly, the way it was designed to be. It can be frustrating for everyone when trying to communicate or discuss ABX tests because of this breakdown.
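To attach a number to "may have failed to detect a difference": statistical power depends heavily on how many trials are run. The figures below are assumptions chosen only to illustrate the idea - a listener with a true 70% hit rate, tested at the usual 5% significance level:

from math import comb

def critical_score(trials, alpha=0.05):
    # Smallest score that pure guessing reaches with probability <= alpha.
    for k in range(trials + 1):
        tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
        if tail <= alpha:
            return k
    return trials + 1

def power(trials, true_p, alpha=0.05):
    # Chance the test comes out "positive" if the listener's real hit rate is true_p.
    k = critical_score(trials, alpha)
    return sum(comb(trials, i) * true_p**i * (1 - true_p)**(trials - i)
               for i in range(k, trials + 1))

print(power(16, 0.70))   # ~0.45 - a 16-trial test misses this listener more often than not
print(power(40, 0.70))   # ~0.80 - with 40 trials the same listener is usually detected

In other words, a negative result from a short test is weak evidence on its own; it has to be read alongside the test's power.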
 
Jul 21, 2009 at 10:25 PM Post #55 of 109
Quote:

Originally Posted by mike1127
You might notice this thread is a discussion among those who are willing to seriously consider the evidence for both sides, and those who don't take anything as gospel, even conventional scientific or engineering wisdom. Telling people the issue is settled, and saying in essence "you are stupid and foolish to try to debate it," is against the spirit of the scientific method. Your post is not constructive in this context. If you would like to answer a specific point with specific data rather than tell us "science will kick your ass" you are more than welcome to. Otherwise it looks like you are just chest-puffing. And if you really have no intention of listening to anyone's point and simply want to knock it down, what reason do you have to post to this thread?


I understand the premise of the thread. My argument is that I can't think of a valid reason why ABX testing couldn't apply to cables. What's the hold-up?

Maybe cables DO make a difference; an ABX test would settle that.
 
Jul 21, 2009 at 10:33 PM Post #56 of 109
Quote:

Originally Posted by Catharsis
This is a bit like arguing whether science has any place in debating creationism by using "controversial" theories like evolution.
Much like creationism advocates, expensive-cable advocates insist that we adhere to their ideas without offering any proof, AND suggest that well-proven electrical equations need not apply in this audio arena.



Religious discussion is prohibited on this forum, and these threads are routinely edited when people delve into the creation vs. evolution analogy. So I won't point out the flaws in your analogy or other erroneous statements you have made in this regard. Let's stay away from that area.

Quote:

Originally Posted by Catharsis
In this sense, the burden of proof rests with those who think cables make a difference.


There is no burden of proof on anyone here. This is a hobbyists' forum, and we are having a discussion regarding some issues with ABX testing. Nobody has to "prove" anything to anybody.

Quote:

Originally Posted by Catharsis
Sound/electronics engineers and physicists (scientists, if you will) will point out that expensive cables don't work, based on well-proven equations that accurately predict electrical properties such as resistance, inductance and capacitance. Science takes the risk of being incorrect by subjecting its theories to experiments, whereas heresay like "expensive cables are better" is somehow expected to be taken as truth without any attempt to obtain evidence to support the conclusion.


It's not "heresay." It's not even hearsay. And nobody says anybody is required to take it as truth. You're misrepresenting positions and setting up straw men -- and generally acting like a troll.

Quote:

Originally Posted by Catharsis
Quite simply, there is no reason whatsoever to believe that ABX testing cannot be applicable to examining the audible effects of cables on an audio system.


Yes, there are reasons. They have been put forth in this thread and many others. The fact that you choose to ignore them, and pretend they are not there, so you can intrude on this discussion to advance your particular brand of dogma, doesn't mean they don't exist.
 
Jul 21, 2009 at 10:35 PM Post #57 of 109
Quote:

Originally Posted by SmellyGas
Some of this is very frustrating, because there are clearly scientists, researchers, and other people here who frequently design and/or read/critique scientific studies that use blinded controlled comparisons to test a hypothesis. For this group of people, the interpretation of the results is very simple and straightforward.

If a controlled ABX experiment finds that subjects CAN distinguish between two cables, then there MAY actually be an audible difference between the two cables. It is up to everyone to determine if there were any statistical or methodological flaws that could have explained the test results OTHER THAN an actual audible difference in the cables.

On the other hand, if a controlled ABX experiment finds that subjects CANNOT distinguish between two cables beyond random chance, then there MAY NOT actually be an audible difference between the two cables. It is up to everyone to determine if there were any statistical or methodological flaws that may have failed to detect a difference in cables, should one have existed.

This is how a scientist would interpret an ABX experiment. This is a logical and reasoned approach. However, for people who have not read hundreds of published studies, or been involved in designing or critiquing experiments, this type of approach and reasoning does not come naturally. If you are a lawyer, a real estate agent, a mechanic, or a CEO, you can't expect to just read about a scientific experiment and interpret it correctly, the way it was designed to be. It can be frustrating for everyone when trying to communicate or discuss ABX tests because of this breakdown.



I think you have eloquently articulated what I have been rambling about for the past several posts. I didn't mean to be offensive, but coming from a science background myself, I don't see why we're debating whether an ABX test is a valid experiment to conduct with cables. It would be easy IMO and in the opinion of just about any person who has ever conducted a scientific experiment.

There are some people on this forum who literally don't believe in science (therein lies a problem), and attempting to convince those individuals of the validity of an ABX test means going several chapters deeper into the science textbook. I'm a science nazi - I will admit that.

My apologies if I have offended anyone; SmellyGas gets where I'm coming from on this. Thanks SmellyGas, and wicked name btw.
 
Jul 21, 2009 at 10:42 PM Post #58 of 109
Quote:

Originally Posted by SmellyGas
If you are a lawyer, a real estate agent, a mechanic, or a CEO, you can't expect to just read about a scientific experiment and interpret it correctly, the way it was designed to be. It can be frustrating for everyone when trying to communicate or discuss ABX tests because of this breakdown.


OTOH, if you are experienced in other fields, you can bring something to this discussion that a scientist may not consider. Neither scientists nor science is foolproof and without error. And not every scientific method that has been employed with success in one field is applicable without any caveats in another field.

Quote:

Originally Posted by SmellyGas
If a controlled ABX experiment finds that subjects CAN distinguish between two cables, then there MAY actually be an audible difference between the two cables. It is up to everyone to determine if there were any statistical or methodological flaws that could have explained the test results OTHER THAN an actual audible difference in the cables.


I disagree. I think a plethora of observations that A sounds different than B is a reasonable basis for someone to question the methodology of a test that seems to indicate A does not sound different than B, especially when the test was developed in another field for different purposes, and the testing methodology may not be perfectly applicable to audio. There have been several articles published on this precise issue, and they have been referred to on other threads.
 
Jul 21, 2009 at 10:45 PM Post #59 of 109
Quote:

Originally Posted by Catharsis
I didn't mean to be offensive, but coming from a science background myself, I don't see why we're debating whether an ABX test is a valid experiment to conduct with cables.


And yet there are many people who participate in this forum who have a science background who do think it is an issue worthy of debate or discussion. I don't mean to be offensive, but if you don't, then don't debate or discuss with us. Go find another thread that deals with an issue you think is worthy of debate.
 
Jul 21, 2009 at 11:28 PM Post #60 of 109
Quote:

Originally Posted by PhilS
OTOH, if you are experienced in other fields, you can bring something to this discussion that a scientist may not consider.


This is true, but in many cases that I've seen here specifically, the objection coming from the non-science person usually indicates a lack of understanding of the process of "science" and/or of the methodology - of what it demonstrates and what it doesn't.

Quote:

I think a plethora of observations that A sounds different than B is a reasonable basis for someone to question the methodology of a test that seems to indicate A does not sound different than B,


If a "test result" doesn't match up with "what makes sense," you need to test for BIAS. Bias is anything that artificially influences your results other than the factor you are testing. The test, itself, however is sound. An controlled comparison in which an independent variable is changed (e.g. a cable) and a dependent variable is observed (e.g. blinded listener preference) is a solid methodology. On the other hand, the way blinded A/B comparison was adminstered could have been done with flaws that could introduce bias.

Quote:

especially when the test was developed in another field for different purposes, and the testing methodology may not be perfectly applicable to audio. There have been several articles published on this precise issue, and they have been referred to on other threads.


Science is designed to test hypotheses in any field where there are things you can observe. There are things science cannot test, but behaviors and preferences (i.e. listener preferences) *ARE* things that are measurable and testable. There is no perfect way to test every hypothesis out there, but the absence of a perfect test doesn't mean we shouldn't perform good tests and draw conclusions from them.
 
