ABX Reliability
May 16, 2010 at 9:40 AM Thread Starter Post #1 of 73

Slaughter

Not sure if you have seen this or if it has been posted here before. This backs up my feelings on audio ABX tests.
 
http://www.sieveking-sound.de/abx/
 
Most of you will fail this visual ABX test despite knowing for a fact that there is a difference between the items they are showing you. If you can't pass an ABX test in which there is a guaranteed measurable and visual difference in images, then how can an audio ABX be valid?
 
And before you bring up the fact that there might not be a measurable (or significantly measurable) difference in amps, cables, DACs, or whatever: measurements don't tell the whole story. Is soundstage/headstage measurable? No, but we can all agree AKGs have a massive soundstage compared to Grados. How can that be when it's not measurable? <rhetorical, don't attempt to answer>
 
May 16, 2010 at 11:03 AM Post #2 of 73


Quote:
Most of you will fail this visual ABX test despite knowing for a fact that there is a difference between the items they are showing you. If you can't pass an ABX test in which there is a guaranteed measurable and visual difference in images, then how can an audio ABX be valid?

This is a very misguided conclusion.  I believe that you have unknowingly provided a test case that actually supports this type of testing.  These types of tests try to determine whether or not one can detect/perceive a difference, not whether or not there is a measurable difference.
 
May 16, 2010 at 11:03 AM Post #3 of 73
Epic fail.
 
/thread.
 
May 16, 2010 at 11:18 AM Post #4 of 73
The limits of human perceptual abilities have been very well documented. In fact, there is a mature scientific field that has delineated those limits and abilities, developed almost exclusively through blind ABX tests. So yes, there are cases where real physical differences cannot be registered by our limited brains (e.g. the ability of headphones or DACs to resolve frequencies above 15-20 kHz).
 
http://en.wikipedia.org/wiki/Psychophysics
 
When you do a blind ABX of gear with your ears, the question being answered is 'can I discriminate the differences between two pieces of gear with my senses?' or 'is the claim that x sounds different from y true?'. If someone wants to test physical differences, the test is done with more sensitive equipment (microphones and computers). For example, in my laboratory we analyze molecules with chemical kits and not our tongues, because our senses cannot resolve their differences.
 
The standing of DBT/ABX is not threatened or affected by the limits of sensory perception. You implicitly accept the validity of (hundreds of) blind ABX tests and stake your well-being upon them every time you accept a prescription from your doctor for a drug whose effects are real and not limited to the fraudulent claims of its maker.
 
May 16, 2010 at 11:29 AM Post #5 of 73
Quote:
Most of you will fail this visual ABX test despite knowing for a fact that there is a difference between the items they are showing you, because you are only allowed to see A and B once and compare your memory of them to X. If you can't pass an ABX a visual memory based ABX test (that is not really an ABX test) in which there is a guaranteed measurable and visual difference in images, then how can an audio ABX be valid? then banana, banana fruitcake fly!

There I fixed that for you! I also made you a new avatar:

 
Quote:
And before you bring up the fact that there might not be a measurable (or significantly measurable) difference in amps, cables, DACs, or whatever: measurements don't tell the whole story. Is soundstage/headstage measurable? No, but we can all agree AKGs have a massive soundstage compared to Grados. How can that be when it's not measurable?

I don't agree with your notion that "measurements don't tell the whole story". If something is not measurable, then how does that differ from something that is nonexistent? And please define what you mean by soundstage/headstage before you just assert it to be non-measurable or that AKGs have more of it than Grados do.
 
May 16, 2010 at 11:33 AM Post #6 of 73
Bob, I assume you failed the test, even though there was in fact a measurable difference. I preferred image B over image A in the side-by-side comparison, even though I failed the blind test. Does this mean that I don't in fact prefer image B? Answer me that.
 
Ronald - Reason?
 
Eucariote - That is exactly where DBT is an epic fail. DBT never tells you which one is better or different. Whether I can tell is irrelevant to some degree. As I explained in another thread, all decent HDTVs look good when you are viewing them individually, but put them side by side and there is a sizable difference in black level, color accuracy and noise levels. Should people not buy the better one, whether it is cheaper or more expensive, just because they can only see the difference side by side? No. Most people will choose the superior display. I believe 1 in 5 people can't tell the difference between HD and SD. Does that mean that HD doesn't exist or that people shouldn't buy it? Unfortunately, we can never do an auditory side-by-side comparison, so there will never be an accurate result with an ABX test.
 
Forgot to mention, these are only my thoughts in regards to A/V ABX tests. I am sure it is useful for other things.
 
May 16, 2010 at 11:40 AM Post #7 of 73

 
Quote:
Not sure if you have seen this or if it has been posted here before. This backs up my feelings on audio ABX tests.
 
http://www.sieveking-sound.de/abx/
 
Most of you will fail this visual ABX test despite knowing for a fact that there is a difference between the items they are showing you. If you can't pass an ABX test in which there is a guaranteed measurable and visual difference in images, then how can an audio ABX be valid?
 
And before you bring up the fact that there might not be a measurable (or significantly measurable) difference in amps, cables, DACs, or whatever: measurements don't tell the whole story. Is soundstage/headstage measurable? No, but we can all agree AKGs have a massive soundstage compared to Grados. How can that be when it's not measurable? <rhetorical, don't attempt to answer>
 
This has been my issue with ABX since the beginning. To me, given the conditions and confusion, it is highly plausible that measured and PERCEIVED differences end up with a null result.
 
Also, what are we supposed to do with the results? If I fail an ABX test for cables, yet when I go home and swap out my cables I have a definite preference, am I supposed to trash the cables I prefer simply because of a failure in an ABX test? If it is placebo, why is it consistent in my normal listening environment?
 
To me, ABX simply tests short-term memory in stressful situations, and the results of an ABX test can't even be applied in a home setting. Seems like a nice little parlour trick.
 
May 16, 2010 at 12:15 PM Post #8 of 73
?? Your counterexample deals with clearly perceivable differences in TV images and so is not comparable. A better analogue would be TVs that differ in how they render infrared light (wavelengths longer than about 700 nanometers, outside the visible spectrum), which would be a real but non-perceivable difference. Regarding the side-by-side TVs, your eyes do not see two things at the same time; they saccade serially between them. Likewise, there are A/B boxes that you can use to switch back and forth between two or more audio/cable sources at a comparable rate.
 
And the test can be done by anyone, not just inexperienced people who might not tell the difference anyway. In fact, the burden is precisely upon those who say that they *do* hear a difference, with their experience and superior equipment, to show that they can do it blind, with a reliability that falls outside of a statistical random distribution of choices: a very low bar for success. As has been stated before, even after all these negative results, it would take just one set of golden ears with one magnificent set of audio equipment providing nonrandom data to convince me and other people who accept the scientific method.
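
To put a number on "falls outside of a statistical random distribution of choices", here is a minimal sketch, assuming every trial is an independent 50/50 guess when no difference is heard. The function name and the example scores are made up for illustration; they are not results from any test discussed in this thread.

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials`
    by pure guessing (assumption: each trial is an independent coin flip)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Hypothetical session scores, chosen only to illustrate the calculation.
for correct, trials in [(8, 16), (12, 16), (14, 16)]:
    print(f"{correct}/{trials} correct: p = {abx_p_value(correct, trials):.4f}")
```

Under that assumption, 12 correct out of 16 already comes out at roughly p = 0.038, so a listener does not need a perfect score to produce a nonrandom result.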
 
Quote:
 
 
Eucariote - That is exactly where DBT is an epic fail. DBT never tells you which one is better or different. Whether I can tell is irrelevant to some degree. As I explained in another thread, all decent HDTVs look good when you are viewing them individually, but put them side by side and there is a sizable difference in black level, color accuracy and noise levels. Should people not buy the lesser one, whether it is cheaper or more expensive, just because they can only see the difference side by side? No. Most people will choose the superior display. I believe 1 in 5 people can't tell the difference between HD and SD. Does that mean that HD doesn't exist or that people shouldn't buy it? Unfortunately, we can never do an auditory side-by-side comparison, so there will never be an accurate result with an ABX test.

 
May 16, 2010 at 12:20 PM Post #9 of 73


Quote:
Originally Posted by Slaughter
 
Most of you will fail this visual ABX test despite knowing for a fact that there is a difference between the items they are showing you. If you can't pass an ABX test in which there is a guaranteed measurable and visual difference in images, then how can an audio ABX be valid?
 


ABX tests get used to test the audible differences between lossy and lossless codecs all the time. There are very measurable differences between a lossy sound sample and a lossless sound sample, yet the listening tests are valid and useful. The purpose of the test is to find out whether there is an audible difference, not to verify whether there is a measurable difference.
 
Your conflation of the purpose of ABX tests is a complete straw man.
 
The ABX test as presented at sieveking is a failed attempt at an ABX implementation. One has to wonder why the Foobar ABX plugin doesn't do ABX testing the same way? <rhetorical question> It presents A and B, then presents 20 samples of either A or B and asks you to identify which one you see, with no opportunity to go back and compare A and B against X. Never mind the bogus 5-second wait. How is that even an ABX test? It is more like an AB test designed by someone who doesn't know what they are doing or is intentionally buggering it up.
 
Do the ABX test the way the Foobar plugin does and that sieveking color test will be useful.
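
A rough sketch of the loop a generic ABX tool runs, as described above: X is reassigned at random each round, and the score is simply the count of correct identifications. This is not the Foobar plugin's code. `simulate_abx` and the `p_hear` listener model (the listener truly hears the difference in a fraction `p_hear` of rounds and guesses otherwise) are assumptions made purely for illustration.

```python
import random

def simulate_abx(p_hear, trials=16, seed=None):
    """Simulate one ABX session and return the number of correct answers.

    p_hear is a hypothetical listener model: the chance, per round, that the
    listener genuinely perceives which reference X matches.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        x = rng.choice("AB")            # hidden identity of X for this round
        if rng.random() < p_hear:       # difference perceived: answer is right
            answer = x
        else:                           # nothing perceived: coin-flip guess
            answer = rng.choice("AB")
        correct += (answer == x)
    return correct

# Example: a listener who hears nothing versus one who always hears it.
print(simulate_abx(0.0, trials=16, seed=1), "of 16 for a pure guesser")
print(simulate_abx(1.0, trials=16, seed=1), "of 16 for a perfect listener")
```

The point of the randomised X is that the only route to a high score is actually telling A from B; how many times the listener switches between A, B and X before answering does not enter the score.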
 
May 16, 2010 at 12:20 PM Post #10 of 73
I have a question:
 
If cable believers want so bad to prove the scientists wrong about cable differences, and scientists love DBT, why don't the cable believers just do a DBT and get it over with? Why go to all the trouble of trying to prove DBTs ineffective before even doing one and publishing the results? If the differences are so perceivable, the test should come back positive anyway, and there would be no need for further argument.
 
May 16, 2010 at 12:40 PM Post #11 of 73
Ham Sandwich, I am a sucker for straw man. There is no way to present audio as an A/B the way you can with two static images. And the point of the above test is to prove to you that there is a difference and that you saw it at the beginning of the test and you still failed. It's basically rubbing it in your face. Open two browsers and toggle. 
 
Eucariote, when comparing things and you can't see or hear side by side, you are dealing with your perception.
 
Head Injury - Because of the fact that ABX is flawed. Read below.
 
The test still proves that there is a visual difference between the two images, even though you can't tell them apart individually. That doesn't make the difference go away. Whether intended or not, ABX tests are used in A/V circles to drum up bogus data to prove that there is no perceivable difference when there is one, just not always one you can see or hear when the items are presented individually.
 
Trust your ears and eyes, not your wallets and reviews.
 
May 16, 2010 at 12:53 PM Post #12 of 73


Quote:
 
Eucariote, when comparing things and you can't see or hear side by side, you are dealing with your perception.
 

 
 
True. You might find the link below useful. Your eyes move in your head and re-center stimuli on the fovea, thereby changing completely the (centered) inputs to your visual system and the entire contents of your visual system. Much like your ears when you shift inputs and even attention. At least you're getting there in baby steps.

 
http://en.wikipedia.org/wiki/Fovea

 
May 16, 2010 at 1:11 PM Post #13 of 73
No, I don't think so. Perception is not what we want to use. Using one's memory is a terrible way to conduct any type of testing. And to call it scientific is laughable. For me, I don't care what I perceive during an ABX test, my memory sucks. ABX never tells you if there is a difference, only that you can or can't perceive one. That is useless when you are buying A/V equipment, except maybe to help with ROI.
 
Quote:
 
 
True. You might find the link below useful. Your eyes move in your head and re-center stimuli on the fovea, thereby changing completely the (centered) inputs to your visual system and the entire contents of your visual system. Much like your ears when you shift inputs and even attention. At least you're getting there in baby steps.

 
http://en.wikipedia.org/wiki/Fovea



 
May 16, 2010 at 1:33 PM Post #14 of 73
Two posts before, the TV example was your exemplar of good perceptual comparisons. Now it's laughable. Most neuroscience and psychology departments in universities use these techniques routinely. And they know how perception works, along with the scientific method. Until you can present a consistent and informed position, I'll stop wasting my time.
 
You can argue methodology with these people in the meantime.
 
http://scholar.google.com/scholar?hl=en&as_sdt=2000&q=journal+of+vision+research
 
Quote:
No, I don't think so. Perception is not what we want to use. Using one's memory is a terrible way to conduct any type of testing. And to call it scientific is laughable. For me, I don't care what I perceive during an ABX test, my memory sucks. ABX never tells you if there is a difference, only that you can or can't perceive one. That is useless when you are buying A/V equipment, except maybe to help with ROI.
 

 



 
May 16, 2010 at 1:41 PM Post #15 of 73


Quote:
Ham Sandwich, I am a sucker for straw man. There is no way to present audio as an A/B the way you can with two static images. And the point of the above test is to prove to you that there is a difference and that you saw it at the beginning of the test and you still failed. It's basically rubbing it in your face. Open two browsers and toggle. 


Well that's one way to turn that AB test into an ABX test.  Take a screenshot at the beginning of the test showing both A and B.  Then toggle between the screenshot and the sample throughout the test.
 
Have you ever tried the Foobar ABX testing plugin?  You can flip rapidly between A, B, X, and Y as many times as you want for each test.  There is no gap or false delay between flipping from one to another.  You can repeat short sections or listen to longer sections.
 
If your complaint is that audio is transitory and therefore cannot be compared, then why even bother with subjective listening reviews? What are the subjective reviews comparing against? How could listening impressions be compared? Yes, comparing audio is more difficult than comparing static images. That doesn't mean you give up and say to heck with statistics and trying to develop statistically sound listening experiments.
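
One piece of a statistically sound listening experiment is fixing the number of trials before starting. A hedged sketch, assuming independent trials and a conventional 0.05 threshold (both are assumptions, as are the function names): it finds the pass mark for a session of `n` trials and the chance that a listener who is right with probability `p_correct` per trial reaches it.

```python
from math import comb

def tail(n, k, p):
    """P(at least k successes in n independent trials with success chance p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pass_mark(n, alpha=0.05):
    """Smallest score out of n that pure guessing reaches with probability
    below alpha; None if even a perfect score is not rare enough."""
    for k in range(n + 1):
        if tail(n, k, 0.5) < alpha:
            return k
    return None

def chance_of_passing(n, p_correct, alpha=0.05):
    """Probability that a listener with per-trial accuracy p_correct
    reaches the pass mark in an n-trial session."""
    k = pass_mark(n, alpha)
    return 0.0 if k is None else tail(n, k, p_correct)

# Hypothetical listener who is right 70% of the time, at several session lengths.
for n in (10, 16, 32):
    print(n, "trials: pass mark", pass_mark(n),
          "chance of passing", round(chance_of_passing(n, 0.7), 3))
```

A listener who is right 70% of the time is hearing something real, yet can still fall short in a short session, which is one reason to prefer longer or repeated sessions over a handful of trials.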
 
