
Should we have a stickied post/link about blind testing? - Page 2

post #16 of 23
Quote:
Originally Posted by pyramid6 View Post

… you will need at least 384 test runs.

Why do you claim 384 test runs are necessary? Can't a p-value of less than 0.05 be obtained with far fewer tests?
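For example, in an ABX test a listener who is purely guessing matches X correctly on any given trial with probability 1/2. With just ten trials, the chance of guessing nine or more correctly is (C(10,9) + C(10,10)) / 2^10 = 11/1024 ≈ 0.011, already below 0.05. So ten well-run trials, not hundreds, can establish significance for a single listener. (The trial counts here are just an illustration.)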
Edited by Jaywalk3r - 2/15/12 at 10:19am
post #17 of 23

http://en.wikipedia.org/wiki/File:Marginoferror95.PNG

 

A single experiment can, I think, if you measure it objectively. But if you are sampling a population, I think you need 384.

 

post #18 of 23
Quote:
Originally Posted by pyramid6 View Post

http://en.wikipedia.org/wiki/File:Marginoferror95.PNG

A single experiment can, I think, if you measure it objectively. But if you are sampling a population, I think you need 384.

We aren't looking for a 5% margin of error. What we want is a p-value < 0.05, so that the results are statistically significant at the 5% level.
post #19 of 23
Quote:
Originally Posted by Jaywalk3r View Post


We aren't looking for a 5% margin of error. What we want is a p-value < 0.05, so that the results are statistically significant at the 5% level.


Maybe I'm wrong, but I don't think you can do AB testing, or at least come away with something you could hang your hat on.

 

Let's say you have setup A and setup B. You double-blind test them and determine A is better. On its own, that information is useless. Now if you get 384 people to double-blind test them and 70% say A is better than B, then you can say 70% ± 5% of people will find A better than B.
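A rough sketch of where the 384 comes from, assuming the usual normal-approximation sample-size formula for estimating a proportion:

import math

z = 1.96   # z-score for 95% confidence
p = 0.5    # worst-case proportion; maximizes p * (1 - p)
e = 0.05   # desired margin of error (+/- 5%)

n = z**2 * p * (1 - p) / e**2   # required sample size
print(n, math.ceil(n))          # 384.16 -> 385, usually quoted as ~384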

 

I don't know if you can prove A sounds better than B, because what is "better"? I think statistical sampling is a more accurate way to test in audio.

 

Unless you are talking about measuring the actual waveforms, which is already done. That doesn't tell you how it will sound, or at least that's what people will argue.

 

 


Edited by pyramid6 - 2/15/12 at 3:09pm
post #20 of 23

I see where you're coming from now. You're talking about cases where there is a known, more accurate reference, like testing lossy vs. lossless. I'm talking about something subjective, like whether amp A is better than amp B. You would use different methods, I would think.

 

I still don't like ABX testing. Different doesn't equal better.

 

Personally, I would love to see double-blind testing with a large number of participants. It's just very hard to set up.

 

Edit: I think you're looking for http://en.wikipedia.org/wiki/Confidence_interval.


Edited by pyramid6 - 2/15/12 at 3:19pm
post #21 of 23
Quote:
Originally Posted by pyramid6 View Post

I still don't like ABX testing. Different doesn't equal better.

Correct. All ABX testing is designed to do is determine if the person taking the test can accurately differentiate A from B, not to determine a preference.

"Better" is subjective. Knowing that 54% of people prefer amp/codec/ice cream flavor A to amp/codec/ice cream flavor B (+/- 2%) tells us absolutely nothing about whether or not we will, individually, like A better than B, or if we will even be able to differentiate between the two.

With audio, we can usually determine objectively whether A or B is more accurate. However, we need (properly performed) ABX testing, or equivalent, to determine if the difference in accuracy is audible for a particular person. If someone can't hear the difference between the two, there's no need to spend 5x more on B instead of A, even if 75% (±2%) of listeners can reliably differentiate between the two blindly.
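A minimal sketch of scoring one listener's ABX session, assuming scipy is available; the 12-of-16 score is made up:

from scipy.stats import binomtest

trials, correct = 16, 12   # hypothetical ABX session
# one-sided test against chance (p = 0.5 if the listener is guessing)
result = binomtest(correct, trials, p=0.5, alternative="greater")
print(result.pvalue)       # ~0.038, below the usual 0.05 threshold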
Quote:
Originally Posted by pyramid6 View Post

I think you're looking for http://en.wikipedia.org/wiki/Confidence_interval.

No. I'm not trying to estimate a parameter with a statistic.
Edited by Jaywalk3r - 2/15/12 at 4:09pm
post #22 of 23

But your ABX testing only tells us that you can tell the difference. To generalize it to me or someone else, you need more people, a lot more. The only way to tell that 75% can tell the difference is to have more people running the same tests. It still doesn't tell us it will sound better.

 

post #23 of 23
Quote:
Originally Posted by pyramid6 View Post

But your ABX testing only tells us that you can tell the difference. To generalize it to me or someone else, you need more people, a lot more. The only way to tell that 75% can tell the difference is to have more people running the same tests. It still doesn't tell us it will sound better.

I can't generalize it to you at all, no matter how many other people are tested. You have to do your own ABX testing.

Even if 95% of people can't hear a difference between two amps, until you take the test, you won't know if you can detect a difference. And if you know that you can hear the difference between them, you might be justified in buying the $4000 amp instead of the $200 amp, whereas 95% of people wouldn't be.

Again, better is a subjective term. How do you define better? If we take better to mean more likely to be preferred, then knowing something is better offers no useful information to an individual. If 60% of people prefer Milky Way to Snickers, does that mean you will prefer Milky Way to Snickers? Having a confidence interval for the proportion of the population who prefer Milky Way is useless information to anyone who doesn't sell candy bars.
Edited by Jaywalk3r - 2/15/12 at 5:46pm