What are the arguments against double blind tests (incl. ABX)?
Mar 8, 2012 at 5:43 PM Post #16 of 209
Mar 8, 2012 at 5:49 PM Post #17 of 209
Makes sense, I was just clarifying. I agree to a certain extent, though I am also a big proponent of the truth. If someone makes a claim that does not jibe with what we know of electrical behavior, electronics, how we hear, and physics in general, I want that claim justified and defended before I accept it and buy (or recommend to others) their product.
 
I don't need to ABX every single thing. In many cases it has already been done, or my decisions are made for aesthetic or usability reasons. But I do like having objective measurements and data in place to combat the specious and unreasonable claims made in this industry. They do nothing but harm the industry as a whole. The truth matters.
 
Mar 9, 2012 at 2:16 AM Post #18 of 209


Quote:
Originally Posted by M-13
I only have one girl friend...

 
And if you want to keep her, I suggest that you don't ask her to run a DBT for you.
 
My (universally non-audiophile) friends and neighbours already suspect I'm a bit strange. I don't want to put it beyond all doubt by asking them to help with any form of audiophile-related DBT/ABX.
 
 
 
Mar 9, 2012 at 6:46 AM Post #19 of 209
I can’t believe that no one here has addressed the main issue. For a very long time psychology was rejected as a science because its blind tests depended on short-term memory. Memory depends on your imagination, your expectations, and how strong your memory is. It can be overwritten, manipulated, and sometimes plain false.
 
Pick up a picture puzzle that has two pictures right next to each other, cover the first one, then look at the second one and try to find the differences. It is much harder to pinpoint the differences when you're depending on your memory. When the pictures are right next to each other, the differences are much more evident. When watching HD movies, most people are aware that the picture looks better somehow, but ask them to pinpoint the differences and most of them will get it wrong. They might be headed in the right direction, but most people will guess wrong.
 
A lot of people will also point out that test subjects are using gear and tracks that they’re familiar with. However, that is much worse. Why do people tell you not to proofread your own essay? Because no matter how many times you read it, you can’t see some of your own spelling mistakes. You will actually read the word as correct even though it is quite obvious to another person that you spelled it wrong. That’s why tests that depend on memory are so unreliable. You already have expectations, and stubborn people will never change those expectations. Some people will have made their decision just by hearing the first track.
 
I’ll give you another example, with my JH 16. I compared the normal cable against my Piccolo cable in a blind test. The difference was much more obvious when I connected the normal cable to the right JH 16 earpiece and the Piccolo cable to the left one (you can use the same trick with two different files). But it’s still tricky, because the brain processes the sound at each ear differently to help you pinpoint where a sound is coming from. If you’re surprised by a loud sound, sometimes you can’t even pinpoint where it came from, because your brain is spending its resources on keeping you alert in case of danger.
 
There is nothing wrong with the test; it’s the way it’s being used that people have a problem with. Basically, what they’re saying is that you’re taking a 12-year-old kid and giving him a test designed for engineering graduates. When he fails, you claim that the kid is dumb. There is nothing wrong with the test, but it’s designed for engineering graduates. The 12-year-old isn’t dumb just because the test was too hard.
 
As an artist who loves to draw and loves to take Photoshop courses, I can tell you that I know a lot of tricks that make an image look better even though the tricks themselves are undetectable. You will clearly see a difference, but you can never really tell what it is because it blends in with the image. Most of these audio blind tests are testing material with maybe a 10-20% difference. We’re not talking about noticing new stuff (like a chair that wasn’t there before); we’re talking about minor changes. However, these minor changes can sometimes tremendously increase enjoyment of the experience.
 
You cannot scientifically claim that people cannot tell the difference between two files when something as unreliable as memory is a major factor in the result of a test that was made harder than it should have been. Even if you cut the file up or loop through segments, you’re still using your memory. There are at least four best sellers by the dean of medicine at Harvard on why relying on memory in blind tests is unreliable and unscientific.
 
Mar 9, 2012 at 7:19 AM Post #20 of 209
I don't buy your analogy with the pictures. If there are differences between the pictures, you only need your eyes to find them. In a sighted listening test you are using your eyes and your hearing to find differences that are supposed to be in the sound. Seeing which component is playing doesn't change the fact that you have to rely on your memory to compare the sound.
 
 
 
Mar 9, 2012 at 7:40 AM Post #21 of 209
Quote:
Pick up a picture puzzle where they have 2 pictures right next to each other, cover the first one and then look at second one and try to find the differences. It is much harder to pinpoint the differences when you're depending on your memory. If they were right next to each other, the differences are much more evident.

You cannot claim that people cannot tell the difference between 2 files scientifically when you use something so unreliable like memory as a major factor in a test result that was made harder than it should have been. Even if you cut file up or loop through segments you’re still using your memory. There are at least 4 best sellers from the dean of medicine in Harvard on why using memory in blind test is unreliable and unscientific.


But I believe that an ABX test is pretty much like showing two pictures next to each other. You can switch back and forth with the flick of a switch, and I don't think it's much different from focusing with your eyes on one picture, then on the other (you can't focus on both at the same time). If memory is involved (the really short term memory), then it's similarly involved in both cases.

Have you ever run an ABX test with foobar?

Quote:
When watching HD movies, most people are aware that the picture looks better somehow

They're aware they're watching an HD movie in the first place, because they picked it beforehand. It's the textbook definition of bias. All that tells you is that it feels good to know that you're watching a High Definition movie as opposed to a Standard Definition movie. I won't deny that, and whenever the placebo effect has a positive effect on the level of enjoyment of something, I'm all for it.

It would be interesting to know whether they were equally aware of it (they just might be!) if someone blindly picked a (really well mastered) SD or HD movie for them and no one knew what they were watching. An ABX test tells you just that. I mean, it's the whole point of the test.

FWIW, I think it's a hell of a lot easier to ABX a DVD vs. a BluRay than 160kbps Ogg Vorbis vs. FLAC.
 
Mar 9, 2012 at 8:04 AM Post #22 of 209
Some discussion about this was here:
 
http://www.head-fi.org/t/419614/an-interesting-take-on-dbt
 
 
Quote:

Blind testing of something that is measured subjectively is inherently flawed, and perception of what good audio sounds like is usually subjective.

I work in a field where the double-blind randomised controlled trial (RCT) is the gold standard, due to large placebo effects in any unblinded trials. I love RCTs. The key issue here is that the measures are objective: they are hard, measurable, and repeatable. RCTs using subjective outcomes tend to be poorly regarded unless they use validated scales or mass amalgamation of data. Put simply, an RCT using subjective outcomes has so much bias that the results may well be meaningless.

Someone mentioned wine, and it's a good example. I do a lot of blind wine tasting and I'm good at it. It's useful for finding great new wines with no preconceptions. But I am aware that I can taste the same (unknown) wine in different circumstances and have very different opinions. Things like temperature, my mood, what my palate has been exposed to previously, and the comparison wines all affect my judgement. I'm quite capable of liking a wine one day and disliking it another. And neither time am I wrong, as my opinion is subjective. Others may like or dislike the same wine, neither of which invalidates my opinion.

Some wines have clear flaws. Good tasters can identify these (and they can be measured in a lab), and in a group of experienced tasters these measures can be objective, but wine quality never will be. Even wine judges, when awarding medals etc., will have dissenting opinions within the group.

Audio is the same. Unless you want to do double blind testing with machines measuring fixed validated outcomes like volume, bass depth, or frequency sweeps, it's pointless. Firstly, it's usually single-blind, as most people do it knowing the equipment involved, which immediately introduces bias. Plus all the outcomes are subjective, which makes any results flawed from a scientific viewpoint. And it's commonly done in groups, which is another source of bias, as groups tend to alter individual opinions towards a mean. Any preconceptions about the result (i.e. cables do/don't work) also introduce bias.

The net result of all this is that if you perform a group single-blind test with any preconceptions about the results (difference/no difference), you introduce significant bias into already subjective measurements, which makes it difficult to extrapolate the results to any other setting, i.e. it doesn't help me!

Results from DBT may be interesting, but they are unlikely to be more valid than simply saying 'My music sounds better like this'.
 
Mar 9, 2012 at 8:19 AM Post #23 of 209
Quote:
Blind testing of something that is measured subjectively is inherently flawed, and perception of what good audio sounds like is usually subjective.

Double blind tests don't ask you to decide which sounds better, they determine whether you can hear *any* difference, good or bad, between two files/components/systems. While the result may vary from one individual to the other, it's an objective test.


Quote:
Firstly, it's usually single-blind as most people do it knowing the equipment involved which immediately introduces bias

That's a fallacy. It's not "single" blind. Double blind means that neither the test subject (one) nor the operator (two, hence "double") knows which is which. It doesn't mean they don't know what equipment is used for the test. There's no bias precisely because no one knows which is which. You can't have a bias for something you don't know o_O

Edit: OK, I'll play devil's advocate here. The only bias I can think of is whether or not the test subject is supposed to hear a difference. My take is that, if there is a difference at all, they should hear it, and if they can't, the systems tested are equivalent, as far as listening is concerned, anyway.
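
For readers who have not run one, the procedure described in this post can be sketched roughly as follows. This is only an illustrative simulation, not code from any poster or from a real ABX tool; `run_abx_session`, `perfect_listener`, and `guesser` are hypothetical names, and the listener is modelled as a plain function.

```python
import random

def run_abx_session(identify, trials=16, rng=None):
    """Simulate an ABX session and return the number of correct answers.

    `identify` is a hypothetical stand-in for the listener: it receives the
    hidden identity of X ("A" or "B") and returns the listener's answer.
    A listener who hears no difference would simply ignore its argument.
    """
    rng = rng or random.Random()
    correct = 0
    for _ in range(trials):
        # X is assigned at random on every trial, so neither the subject
        # nor the operator can know in advance which source it is.
        hidden_x = rng.choice(["A", "B"])
        if identify(hidden_x) == hidden_x:
            correct += 1
    return correct

def perfect_listener(hidden_x):
    # Models someone who reliably hears the difference.
    return hidden_x

def guesser(hidden_x):
    # Models someone who hears no difference and answers at random.
    return random.choice(["A", "B"])

if __name__ == "__main__":
    print("perfect listener:", run_abx_session(perfect_listener), "/ 16")
    print("guesser:", run_abx_session(guesser), "/ 16")
```

The subject is never asked which source sounds better, only whether X is A or B, which is why the outcome is a count of correct identifications and not a preference.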
 
Mar 9, 2012 at 8:43 AM Post #24 of 209
Quote:
Originally Posted by Currawong
 
Quote:
And it's commonly done in groups - another source of bias, as groups tend to alter individual opinions towards a mean. Any preconceptions about the result (i.e. cables do/don't work) also introduce bias.

 
This is obviously not correct for ABX tests. It is not possible to give a "biased" answer in those, except by deliberately lying (e.g. always choosing A without even listening) to produce a false negative result. But the subjects are usually audiophiles with a clear motivation to prove that the difference exists, so this is not likely.
 
 
Mar 9, 2012 at 8:50 AM Post #25 of 209

I can’t really tell if you’re trolling or messing around. So let me get this straight: if I place two dots on a piece of paper, one with a black pen and the other with a red pen, I can only focus on one dot at a time? Ask any partially color-blind person about color blindness tests. When green and red are next to each other, they can tell the difference. But when they are shown one color and then the other, one at a time, they can’t tell the difference between the colors. By your logic that must prove that the colors are exactly the same, right? It’s all in their head, am I right? Switching back and forth depends on your memory, but looking at them at the same time doesn’t involve memory. If you still disagree, then you’re entitled to your opinion.
 
 
 
Quote:
But I believe that an ABX test is pretty much like showing two pictures next to each other. You can switch back and forth with the flick of a switch, and I don't think it's much different from focusing with your eyes on one picture, then on the other (you can't focus on both at the same time). If memory is involved (the really short term memory), then it's similarly involved in both cases.

 
There is nothing wrong with the test. What I’m saying is that you can’t really use memory as a major factor in determining blind test results. History has proved over and over again that it’s not reliable enough to use as a parameter in a blind test. No one accepted blind memory tests conducted by psychologists, because memory isn’t reliable enough to give consistent, unbiased, accurate results. I find it hilarious that you believe that people can imagine things because of a placebo effect, yet their memory is 100% accurate.
 
Quote:
They're aware they're watching an HD movie in the first place, because they picked it beforehand. It's the textbook definition of bias. All that tells you, is that it feels good to know that you're watching a High Definition movie as opposed to a Standard Definition movie. I won't deny that, and whenever the placebo effect has a positive outcome on the level of enjoyment of something, I'm all for it.
It would be interesting to know if they were equally aware of it (they just might!), if someone blindly picked a (really well mastered) SD or HD movie for them and no-one knew what they were watching. An ABX test tells you just that. I mean, it's the whole point of the test.
FWIW, I think it's a hell of a lot easier to ABX a DVD vs. a BluRay than 160kbps Ogg Vorbis vs. FLAC.


 
 
 
 
 
 
 
Mar 9, 2012 at 8:54 AM Post #26 of 209
By the way, usually the arguments against DBT also apply to the common sighted "A/B test" as well, and therefore fail to prove why the latter is better, only that any subjective comparison is inherently unreliable. But the use of DBT and accurate level matching are steps in the right direction to minimize the unreliability. Therefore, opponents should focus on finding advantages that only exist in sighted listening tests, or flaws that are specific to blind ones.
 
 
Mar 9, 2012 at 9:03 AM Post #27 of 209
Quote:
I can’t really tell if you’re trolling or messing around.

First, no need to shout. Keep your cool, man. And just because I don't share your opinion, doesn't make me a troll. You sound like you're so opinionated that you can't believe anyone in their right mind would disagree with you.

Quote:
So let me get this straight, if I place 2 dots on a piece of paper, one with a black pen and other with a red pen. I can only focus on one dot at a time?

Alright, you can tell the difference in that case by using your peripheral vision, but that's hardly a good example. To healthy eyes, different colors are, erm, well, different. It would be like ABXing a high pitch tone and a low pitch tone. Any healthy person would score 16/16 at that test using pretty much ANY equipment, even a telephone.
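
As a rough illustration of why a score like 16/16 is treated as convincing, the chance of reaching a given ABX score by pure guessing follows a binomial distribution with p = 0.5. The sketch below is not from any poster in this thread, and the function name `abx_p_value` is made up for illustration.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing.

    Assumes each trial is an independent coin flip (p = 0.5), which is the
    usual null hypothesis for an ABX test: the listener hears no difference.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the upper tail of the binomial distribution.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

if __name__ == "__main__":
    for score in (8, 12, 16):
        print(f"{score}/16 or better by guessing: {abx_p_value(score, 16):.6f}")
```

Under these assumptions, 16/16 by luck alone has a probability of 1 in 65,536, while 12/16 or better comes out at roughly 0.038.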
 
Mar 9, 2012 at 9:05 AM Post #28 of 209
It has nothing to do with your vision, hearing, taste, or even touch. It has to do with the fact that the test relies on your memory, at least in all the tests I’ve come across. All I’m saying is that memory is not reliable enough to be used as a tool in blind tests. Furthermore, every single school of medicine and physiology would state the exact same thing.
 
Mar 9, 2012 at 9:13 AM Post #29 of 209
I’m sorry I got carried away. I was just surprised that you believe people can only focus on one image at a time. You’re right, that’s your opinion and I should respect that.
 
Quote:
Alright, you can tell the difference in that case by using your peripheral vision, but that's hardly a good example. To healthy eyes, different colors are, erm, well, different. It would be like ABXing a high pitch tone and a low pitch tone. Any healthy person would score 16/16 at that test using pretty much ANY equipment, even a telephone.
 
It seems that I didn’t get my point across. What I’m saying is that blind tests would be more accurate if they did not depend on uncontrollable elements like human memory. The test would be more meaningful, especially for things that differ only slightly. Expecting a person to recall minor differences from memory is just asking a lot from a test, in my opinion. No one should really take it seriously.
 
The scientific method is the process by which scientists, collectively and over time, endeavor to construct an accurate (that is, reliable, consistent and non-arbitrary) representation of the world.

 
Quote:
First, no need to shout. Keep your cool, man. And just because I don't share your opinion, doesn't make me a troll. You sound like you're so opinionated that you can't believe anyone in their right mind would disagree with you.
Alright, you can tell the difference in that case by using your peripheral vision, but that's hardly a good example. To healthy eyes, different colors are, erm, well, different. It would be like ABXing a high pitch tone and a low pitch tone. Any healthy person would score 16/16 at that test using pretty much ANY equipment, even a telephone.



 
 
Mar 9, 2012 at 9:49 AM Post #30 of 209
OK, I thought of a better example. Please tell me which you find easier:
 
1. Listen to a recording of the numbers 1-10 read in random order, then listen again with one of the numbers moved to another position in the sequence. Do this three times back to back.
 
2. Listen to the same two recordings played at the same time through two different speakers, and do it three times back to back. Notice how much easier it is to spot which number has moved.
 
Most people will only get the first attempt right in the first experiment, while everyone will pass the second with flying colors. I cannot conclude from the first experiment that the number order is the same, because I made the test too hard by relying on people’s memory. Memory is inconsistent, unreliable, inaccurate, and varies with age. The first test cannot accurately represent the truth, while the second is much more accurate because memory wasn’t involved.
 
  
 
 
