Hello,
Quote:
Originally Posted by mike1127
So in this analysis, we have done 8 trials, and I have gotten 7 right. This reaches a significance level of (8 choose 1) / (2 ^ 8) or 3%.
So I have already succeeded.
|
Two remarks here.
First, 3% counts as a success for you. That's fine, and I have no objection.
But it would not be a success for me. After all the experiments and tests I have done with interconnects, I have seen so much evidence in favor of the null hypothesis that I would need 0.1% at most to change my mind.
After all, I have already seen someone get a 0.2% probability of false success while listening to... nothing! Just hitting the keys at random in an ABX program, without even wearing the headphones!
Direct link:
Blind test challenge - Hydrogenaudio Forums
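To put numbers on these thresholds, here is a minimal sketch in Python. It assumes each trial is a fair 50/50 guess under the null hypothesis; the helper names `p_at_least` and `trials_needed` are mine, for illustration only. Note that the usual one-sided figure counts "7 or more right", which comes out slightly above the 3% quoted.

```python
from math import comb

def p_at_least(correct, trials):
    """Chance of getting `correct` or more right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def trials_needed(alpha):
    """Smallest number of trials at which a perfect score falls below alpha."""
    n = 1
    while p_at_least(n, n) >= alpha:
        n += 1
    return n

print(p_at_least(7, 8))      # 0.03515625, i.e. about 3.5%
print(trials_needed(0.001))  # 10: a perfect 10/10 is the shortest run under 0.1%
```

So under this assumption, a 0.1% criterion means something like 10 correct answers out of 10, not 7 out of 8.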
Second, as Wavoman says, your score is not really 7/8. And you gave the right explanation yourself:
Quote:
Originally Posted by mike1127
But this is a post-hoc analysis and carries some danger. For example, there is one thing that is a bit arbitrary. Why am I considered the answer about the ordering of the first two sub-trials so important that it is independent from my answer about the second two sub-trials, and why am I considering this post-hoc (i.e. it wasn't in the original protocol directions)? Because of my theory that I am most sensitive in the first or second listen, and because I was very confident about my answers. This does not convince you, of course.
|
Correct: the two problems are
-Why would the "fresh ears" identification be better than the "trained ears" one? If the theory opposite to yours is true, that hearing ability is better after some repetitions, then you have 6/8, not 7/8, because during the very last trial you got both the sequence and the identification wrong. Maybe you are right. Maybe the fresh-ears listening was the good one. But this has not been the object of a double-blind test so far. You are just assuming it.
-Why add the second score post hoc? If you go on with 50 other similar trials and get all the sequences correct but all the identifications wrong, what would you conclude? 50/50 correct, or 50/100 correct? You are giving yourself the choice between two possible ways of computing the score, and obviously you pick the one with the best result. That multiplies the probability of false success by roughly 2.
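The "roughly 2" can be checked with a short calculation. This is only a sketch under a simplifying assumption: the two scoring methods are treated as two independent 8-trial scores, each passing at 7 or more correct. In the real test the two scores share most of their answers, so the actual inflation lies somewhere between 1 and 2.

```python
from math import comb

def p_at_least(correct, trials):
    """Chance of getting `correct` or more right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# False-success probability with a single, pre-chosen scoring method
p_single = p_at_least(7, 8)

# False-success probability when either of two (assumed independent)
# scoring methods is allowed to count as a success
p_either = 1 - (1 - p_single) ** 2

print(p_single)             # about 0.035
print(p_either)             # about 0.069
print(p_either / p_single)  # about 1.96
```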
Quote:
Originally Posted by mike1127
This test takes two days, a week apart. I pick 8 test tracks. On both Day 1 and Day 2 I listen to the 8 tracks. However, for each track there is a random assignment of cable A (the cheap cable) or cable B (the expensive cable).
As I listen to each track, I write down my impressions of it, and try to assign a score (from 1 to 10) to various aspects of the sound, like the highs, the dynamics, etc.
[...]
After Day 2 is over, I compare notes. For each track, I see if I rated it more highly with the good cable.
|
That's a good idea. But you need to define clearly, before the test, what exactly will be considered a success and what exactly will be considered a failure.
Otherwise, it will always be possible to sift through the different tracks and characteristics, pick the ones that give a good score, and say "these ones were the most revealing, the other ones can be discarded".
You must not do this "post hoc".
However, to make the test easier, you can make this selection after all the listening sessions are over, but before your friend gives you the right answers. That way you can still discard the tracks or characteristics that did not seem significant to you, without knowing whether doing so will raise or lower your score.
In that case, it is better to make this choice without your friend watching. Non-verbal cues can be powerful.
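Here is one way such a fixed-in-advance scoring rule could look, as a rough Python sketch. Everything in it is an assumption for illustration: each track is heard with one cable on Day 1 and the other on Day 2, a track is a success only if the overall rating is strictly higher with cable B, a tie counts as a failure, and the ratings and the `kept` list are made-up numbers. The point is only that `kept` is written down before the cable assignment is revealed.

```python
from math import comb

def p_at_least(correct, trials):
    """Chance of getting `correct` or more right out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def score(day1_cable, day1_rating, day2_rating, kept):
    """Count kept tracks that were rated strictly higher with cable B."""
    hits = 0
    for t in kept:
        if day1_cable[t] == "B":
            rating_b, rating_a = day1_rating[t], day2_rating[t]
        else:
            rating_b, rating_a = day2_rating[t], day1_rating[t]
        if rating_b > rating_a:  # assumed rule: a tie is a failure
            hits += 1
    return hits, len(kept)

# Made-up example data for 8 tracks (indices 0..7)
day1_rating = [6, 8, 7, 5, 6, 7, 7, 8]
day2_rating = [7, 6, 7, 7, 5, 6, 8, 6]
kept = [0, 1, 3, 4, 5, 6, 7]  # track 2 discarded BEFORE unblinding

# Revealed by the friend only after `kept` is fixed
day1_cable = ["A", "B", "B", "A", "A", "B", "A", "B"]

hits, n = score(day1_cable, day1_rating, day2_rating, kept)
print(hits, n, p_at_least(hits, n))  # 6 7 0.0625
```

With these invented numbers, 6 out of 7 would not even reach 5%, let alone 0.1%, which shows how many tracks a convincing result would need.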