Quote:
Originally Posted by nick_charles
If you refer to the M & M study your statement is not true. In the M & M study it was not a matter of averaging.
There were well over 500 trials and over 50 subjects.
Nick -- Look at the sentence just before the one you quote and you will see that they did make a statement integrated over the whole population. That is what I meant by "averaging" ... and they did do it, honestly. Then they call out the few subjects who did well, and state (correctly) that this is still explainable by chance (under the hypothesis that no one can identify correctly which is which).
But I can fit other models to their data (or I could if they had really published all their data), including ones where certain people can correctly identify the high-res signal. In fact my models will fit the data better (in the sense of having higher likelihood under the model, or lower lack-of-fit measures such as chi-squared).
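To make the point concrete, here is a minimal sketch of that likelihood comparison. All the numbers are hypothetical (the panel size, trial count, and scores are invented for illustration, since the real per-subject data were not published): a "mixture" model that allows a few subjects to be genuine identifiers always fits at least as well as the pure-chance null, because it has more freedom.

```python
import math

def binom_loglik(k, n, p):
    """Log-likelihood of k successes in n Bernoulli(p) trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

# Hypothetical panel: 10 subjects, 16 ABX trials each.
# Most score near chance, but the last two score well.
scores = [8, 7, 9, 8, 6, 9, 7, 8, 14, 15]
n = 16

# Null model: everyone guesses (p = 0.5 for every subject).
ll_null = sum(binom_loglik(k, n, 0.5) for k in scores)

# Mixture model: each subject is either a guesser (p = 0.5) or an
# identifier (p = 0.9); assign each subject whichever label fits better.
ll_mix = sum(max(binom_loglik(k, n, 0.5), binom_loglik(k, n, 0.9))
             for k in scores)

print(ll_mix > ll_null)  # the richer model never fits worse
```

Higher likelihood alone proves nothing, of course -- a more flexible model always wins on fit -- which is exactly why classical testing demands stronger evidence before abandoning the null.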
But the way classical statistics works is to assume the null hypothesis is true (that no one can correctly identify the signals) and not move off it until there is compelling reason to: it refuses to abandon the null hypothesis as long as the observed data remain reasonably probable under it.
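That "compelling reason" is usually a p-value: the probability, assuming pure guessing, of a score at least as good as the one observed. A short sketch with hypothetical numbers (a subject scoring 12 of 16 ABX trials is my invention, not a figure from the study):

```python
import math

def binom_pvalue_onesided(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Hypothetical subject: 12 correct out of 16 ABX trials.
pval = binom_pvalue_onesided(12, 16)
print(round(pval, 4))  # → 0.0384
```

At the usual 0.05 threshold this single subject looks significant -- but with 50 subjects tested, a few such scores are expected by chance alone, which is the (correct) "still explainable by chance" argument above.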
More nuanced approaches began in WWII -- if you're in a submarine, and assume there is no enemy sub nearby, and wait to reject that null hypothesis while assembling sonar data, the odds are very high you will die underwater. Considering the loss ratios of your correct and incorrect actions, you should get out of there as soon as there is a hint of signal in the sonar noise.
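The submarine point is a decision-theory one: you act on expected loss, not on statistical significance. A tiny sketch, with loss values and probabilities that are purely illustrative:

```python
# Decide "evade" vs "stay" from noisy sonar, weighing asymmetric losses.
# All numbers here are illustrative, not from any real doctrine.
L_STAY_ENEMY = 1000.0    # loss if we stay and an enemy really is there
L_EVADE_NO_ENEMY = 1.0   # loss of evading when there was nothing

def should_evade(p_enemy):
    """Evade when the expected loss of staying exceeds that of evading."""
    return p_enemy * L_STAY_ENEMY > (1 - p_enemy) * L_EVADE_NO_ENEMY

# Even a faint hint of signal -- a 0.5% chance of an enemy -- triggers
# action, long before any classical test would reject "no enemy":
print(should_evade(0.005))  # → True
```

With a 1000:1 loss ratio, the break-even probability is roughly 0.1%, which is the formal version of "get outta there at the first hint of signal."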
Later these methods began to take individual subject differences into account, and there are now lots of subtle ways to test and separate consumer panels. M & M did not use these; however, there is no question that they got their classical statistics correct, and their conclusion is most likely correct.
But the conclusion supported by the data is not the point they want you to take away from the article. They want you to believe that "no one can hear the difference", but that is false IMO, and not proven by them in any way.
They have most likely proven something real about A/B/X testing of hi-res and redbook material. Who cares? I don't care that you can't tell whether X is A or B ... that is very hard, given the way our brains recall music.
I believe strongly -- I'm about to piss a lot of people off -- that the foundation of A/B/X testing is intellectually bankrupt. The stereo magazines find the same flaws I do, and use it to attack blind testing. They are just as wrong.
With no chance of knowing which is hi-res and which is redbook, I want to listen to two signals A and B over and over, as I like, switching when I like, as often as I like. This is one trial. At the end of the trial, I will say one of the following:
- "they sound the same", or
- "they sound different but I have no preference", or
- "they sound different and I prefer A to B", or
- "they sound different and I prefer B to A"
We then take a break. Then I do another trial. And another. And another. The scientist leading the experiment changes which is A, which is B, sometimes makes them the same (and then sometimes hi-res and sometimes redbook), throws a joker in the pack now and then (the signal degraded on purpose, etc.) and so on. Twenty or thirty trials.
If I pick the hi-res as my favorite most of the time, then we have proven that I can hear the difference. No placebo effect or memory effect or hidden cues is possible, unless the scientist running the tests tips me off with body language, and we can guard against that. We can also examine his record of assignments -- we need a third person to ensure fairness, or better yet, two people who have different opinions about this matter up front.
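"Most of the time" can be made precise with the same binomial machinery. A sketch of how the trial record could be scored -- the 30-trial record below is entirely hypothetical, as are the outcome labels ("same", "no-pref", or the preferred letter):

```python
import math

# Hypothetical record of 30 preference trials. Each entry is
# (truth, response): truth is which signal was hi-res ("A", "B", or
# "same" when both were identical); response is what the listener said.
trials = ([("A", "A")] * 11 + [("B", "B")] * 10
          + [("A", "B")] * 2            # preferred the redbook signal
          + [("B", "no-pref")] * 3      # heard a difference, no preference
          + [("same", "same")] * 4)     # joker trials: identical signals

# Among trials where a hi-res signal was present AND a preference was
# stated, count how often the preference landed on the hi-res signal.
stated = sum(1 for truth, resp in trials
             if truth in ("A", "B") and resp in ("A", "B"))
hi_res_preferred = sum(1 for truth, resp in trials
                       if truth in ("A", "B") and resp == truth)

# One-sided binomial p-value: chance of doing this well by coin-flipping.
pval = sum(math.comb(stated, i) * 0.5**stated
           for i in range(hi_res_preferred, stated + 1))
print(hi_res_preferred, stated, pval)
```

Preferring hi-res on 21 of 23 decided trials would be wildly improbable under coin-flipping, and because each trial asks for a preference rather than an A/B/X identification, no feat of auditory memory is required.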
How many people can do this is interesting, but speaks only to the economics of equipment manufacturing. I don't care. I only care whether I can do it. If I can, then I record in 24/96.
That's testing. Listener-blind (DBT not needed). Single subject. Airtight. No one else around. No peer pressure.
We want to know if we can hear a difference. So we test that. We do not want to know whether we can identify a particular signal as one of the two (A/B/X). That is nothing we care about.
It will take a year to get this all together, but (some of) the NJ meet crew has stacked hands and said we are going to try -- although in truth many were not interested at all.