Schiit Happened: The Story of the World's Most Improbable Start-Up
Jun 4, 2015 at 7:20 PM Post #6,661 of 151,780
Eish.. audio faculty testing... abx-ish

Recently had a long forum discussion that led to a discussion with my wife, who is very math and stats literate, about mathematical and statistical analysis of humans and ABX testing... I am quite poor at it... for a biologist...

Take-home message was this: the Central Limit Theorem holds and assumes, for the specific test, that every test subject must have the same opportunity to get the answer right or wrong. Then you will get a standard deviation based on chance alone... So if you violate this by training people to listen for certain things, or give them a job evaluating hi-fi equipment, or some have better hearing than others, you cannot use simple statistics to evaluate the results of such a test. Stats doesn't work for the extremities.


As someone who has taught graduate statistics in a psych department (that is, measuring human behavior) for more than a decade and a half, this is... wildly incorrect.

There are some serious issues with the way the ABX crowd draws conclusions from statistics—most critically, a failure to reject the null does not justify concluding that two things are equivalent (you basically can never "prove" two things are the same; you can only "prove" differences)—but what you're talking about here is not one of them. The Central Limit Theorem is a key theorem in statistics, but it's not about extreme values.
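
To put rough numbers on the "failure to reject the null" point, here is a minimal sketch using only the Python standard library. The 16-trial test length and the 60% true hit rate are assumptions picked for illustration, not figures from any actual ABX test, and `tail_prob` and `critical_score` are hypothetical helper names.

```python
from math import comb

def tail_prob(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct in n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_score(n, alpha=0.05):
    """Smallest score a pure guesser (p = 0.5) reaches with probability <= alpha."""
    for k in range(n + 1):
        if tail_prob(n, k, 0.5) <= alpha:
            return k
    return None  # no score is rare enough at this alpha (very short tests)

N_TRIALS = 16    # assumed test length
TRUE_RATE = 0.6  # assumed: a listener who really hears it 60% of the time

threshold = critical_score(N_TRIALS)
power = tail_prob(N_TRIALS, threshold, TRUE_RATE)

print(f"score needed to reject guessing: {threshold} of {N_TRIALS}")
print(f"chance the 60% listener reaches it: {power:.3f}")
```

Under these assumptions, a listener who genuinely hears a difference still fails the test most of the time, so a non-significant result says little about whether the two devices are equivalent.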

Example... everyone can run the 100m sprint in 10 seconds... vs. a few gifted people, who practise daily and are surrounded by a team of experts helping them, can run the 100m sprint in 10 seconds... vs. people can't run the 100m sprint in 10 seconds... which is what the stats would indicate


Hunh? That makes no sense. In what way do "the stats" indicate that? Which stats?

If the question is "does the distribution in the population contain any values less than X?" the stats are pretty clear: if you observe any values less than X, then it does. Since we can observe values less than X (that is, we see people do it in less than 10 seconds), the correct conclusion is that there are definitely population values less than X. So I'm not sure what you mean here.

So if you do ABX - and I suggest you keep the question simple: which do you prefer?


That's not what an ABX test asks. You could ask that question, but it would not then be an ABX test if you did. An ABX test asks "Can people tell the difference?" It's not about preference. There's no point in determining preference if people can't discriminate the two things in the first place.
 
Jun 4, 2015 at 7:28 PM Post #6,662 of 151,780
That's not what an ABX test asks. You could ask that question, but it would not then be an ABX test if you did. An ABX test asks "Can people tell the difference?" It's not about preference. There's no point in determining preference if people can't discriminate the two things in the first place.

 
You are of course correct about what an ABX test is set up to show.  But there's a potentially very interesting question in this: If people are asked for a preference rather than explicitly to discriminate, does the ability to discriminate improve?  (In other words, if a preference is requested, is A preferred to B at a non-random rate, in spite of the fact that if the discrimination question is asked explicitly, people are not able to differentiate A from B at non-random rates?)
 
Any psych research you're aware of on this, or the slightly more general topic of how the framing of the question affects A/B or A/B/X test results?
 
Jun 4, 2015 at 7:47 PM Post #6,663 of 151,780
I'm a chemical engineer and I can tell you for a fact that not a lot of people in the industry understand statistics beyond what an average is. If industry professionals don't get it, I don't think we're going to figure out all of the oddities of blind testing in this forum, and I really don't think it's all that important. The goal of doing blind testing at a booth is just so people can set aside their bias about what looks good and what they have looked up on the internet and salivated over, and get an idea of what sound they actually prefer. That is the goal, not statistically proving that one is better than the other for an individual or a group of people.
 
Jun 4, 2015 at 8:10 PM Post #6,664 of 151,780
This is a fun discussion.  Applying statistical tests to the sound of HP rigs.
You could easily design a test to compare two competing rigs or HPs.
Null hypothesis:  the systems sound the same
Alternate hypothesis:  the systems sound different
That's an attribute test - either yes or no.  No measure of how much the systems differ.  That would be much more complex.
I think many of us already run a basic attribute test like this - called A/B.  But, at a meet with more participants, you could gather data to help test whether the hypothesis is true or false.
 
All the while, listening to great music!  That's the really fun part.
 
Cheers,
RCB
 
Jun 5, 2015 at 4:52 AM Post #6,668 of 151,780
Sorry, for the sake of brevity I was lax and incorrect. I appreciate that I may not understand the finer details of statistics.
 
As I understand it, CLT says if every test subject has a random (or the same) chance of getting it right every time a choice must be made, then you will get a normal distribution of test subject performance. If you have a deviation from the normal distribution, it may be significant. Am I wrong about this?
 
In the Clarkson test and the Atkinson amplifier test, the results as a whole were basically normally distributed, but only the audio reviewers and industry professionals scored full or near-full marks. That SHOULD have signified to the testers that there is a correlation between training and performance, that the population is not homogeneous, and that a normal distribution might not mean the test scores are random. However, Clarkson told the people who scored full or near-full marks that no, they really can't hear it; they were just the "lucky coins" in the population who got all the answers right by accident, since the normal distribution still held. I appreciate that people better at stats would not do such a thing, but this is the way the most publicly upheld ABX tests were run.
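
A rough way to check the "lucky coins" argument is to ask how often a panel of pure guessers would produce a near-perfect scorer at all. The sketch below assumes 100 listeners, 16 trials each, and "near full marks" meaning 14 or more correct; all three numbers are made up for illustration and are not taken from the tests mentioned above.

```python
from math import comb

def guess_tail(n, k):
    """Chance that one pure guesser (p = 0.5) gets k or more of n trials right."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

def any_lucky(n_listeners, n, k):
    """Chance that at least one of n_listeners independent guessers scores >= k."""
    return 1 - (1 - guess_tail(n, k)) ** n_listeners

N_LISTENERS = 100  # assumed panel size
N_TRIALS = 16      # assumed trials per listener
NEAR_FULL = 14     # assumed cutoff for "near full marks"

one_guesser = guess_tail(N_TRIALS, NEAR_FULL)
whole_panel = any_lucky(N_LISTENERS, N_TRIALS, NEAR_FULL)

print(f"one guesser reaching {NEAR_FULL}/{N_TRIALS}: {one_guesser:.4f}")
print(f"at least one of {N_LISTENERS} guessers doing so: {whole_panel:.3f}")
```

With these assumed numbers, a single lucky scorer in a large panel is not very surprising on its own. Whether the high scorers all come from the trained group is a separate question that this calculation does not address.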
 
In this case, and I should have been more specific about this, using JUST the presence of a normal distribution is not adequate to say that the test shows that nobody can hear the difference in amps... So yes, I agree with you...
 
My example pointed out ONLY that the standard deviation in objectively measured performance is as absurd as using the average 100m sprint times to say that people can only run the 100m sprint in x seconds, the other results are just by chance.
 
So if you are going to use normal stats, you better ask the right question - which is one of the harder things to do, or you have to test and re-test those performing well - if that corresponds to some factor that may prejudice their scores - to confirm they do in fact score differently than the rest of the population. To be honest the breadth of human experience that could be grouped at a single show... would seem to indicate that there might be a need for this regardless of any apparent correlations.
 
Or you have to perform a multivariate analysis that includes background information, to see if there is any correlation between training and the observed trends... but you may have to classify the data objectively or subjectively and then we will probably fight a little more
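
As a much simpler stand-in for a full multivariate analysis, an exact permutation test can ask whether scores differ between listeners classified as trained and untrained. This is only a sketch: the scores below are invented for illustration (not data from any real test), and `exact_permutation_p` is a hypothetical helper name.

```python
from itertools import combinations

def exact_permutation_p(group_a, group_b):
    """One-sided exact permutation p-value for 'group_a scores higher'.

    Looks at every way of choosing len(group_a) listeners out of the pooled
    scores and counts how often their total is at least as large as the
    total actually observed for group_a.
    """
    pooled = list(group_a) + list(group_b)
    observed = sum(group_a)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(group_a)):
        total += 1
        if sum(pooled[i] for i in idx) >= observed:
            hits += 1
    return hits / total

# Invented scores out of 16 trials, for illustration only.
trained = [15, 14, 16, 13]
untrained = [8, 9, 7, 10, 8, 9, 11, 6]

p_value = exact_permutation_p(trained, untrained)
print(f"exact one-sided p-value: {p_value:.5f}")
```

The test only says whether the split by training lines up with the scores more than a random relabelling would. How listeners get classified as "trained" in the first place is still a judgment call.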
 
What I also do not mean to say is that everybody should buy the most expensive DAC, amp, and cables they can afford because they WILL hear a difference.
 
 
 

 
Jun 5, 2015 at 11:38 AM Post #6,670 of 151,780
But in a world with tens/hundreds of choices, the statistics at least help you to narrow your selection. It is difficult to know where to start sometimes. Other people's reviews and analyses can point you in the right direction.
 
Jun 5, 2015 at 11:41 AM Post #6,671 of 151,780
But in a world with tens/hundreds of choices, the statistics at least help you to narrow your selection. It is difficult to know where to start sometimes. Other people's reviews and analyses can point you in the right direction.

I usually just start with whatever I can afford today and then move up from there.  :)
 
Jun 5, 2015 at 11:54 AM Post #6,672 of 151,780
I usually just start with whatever I can afford today and then move up from there.  :)


True, but there's a lot of choice in any price range. But usually what you want is two levels up from what your wallet can handle. Lol
 
Jun 5, 2015 at 12:00 PM Post #6,673 of 151,780
True, but there's a lot of choice in any price range. But usually what you want is two level up from what your wallet can handle. Lol


Audio has always been an aspirational hobby.
 
Jun 5, 2015 at 1:11 PM Post #6,675 of 151,780
Statistics are useful to predict mass preference, but they are meaningless to illustrate individual choice.

+1 Well said.
 
