chipwelder
Just to understand... You are saying that the CLT only says whether or not you had a large enough sample size for the particular test (although I guess the specific test has nothing to do with it) and population? So if the data is not normally distributed, i.e. the shape of the curve is not roughly binomial, you just don't have a large enough sample size. This is what I believed... I was utterly dumbfounded when people said a normal distribution says things happen according to chance. I might be mistaken, but that seems to be what was said.
Here are the experiments. I don't believe any of them are actually set up ideally. I think some assumptions about what is audible and what is not were already built into the experimental design. And I believe that by asking people to get an answer correct, you are probably inducing stress. I would rather test two equal-ish amps and see if people trend in personal preference, i.e. here are two similar amps, we tweaked one, and we want to know if it made a difference: blindly select which version of the song you like better, if at all. Then you could probably make it a normal ABX too, but there is such a history around this that it is unlikely you will find people who don't care about the results.
http://www.stereophile.com/features/113/index.html
http://tom-morrow-land.com/tests/ampchall/
I am not sure I will hear a difference in such carefully controlled tests. I do BELIEVE I hear them otherwise...
Quote:
Warning: This is going to get technical. If you're not interested in statistics, just go ahead and skip this one.
That's not actually what the CLT says; the CLT is actually pretty complicated. What the CLT says is that if you are randomly sampling from a population, the sampling distribution of the mean will approach a normal distribution with parameters that are a function of the mean of that population, the standard deviation of that population, and the sample size (its mean is the population mean, and its standard deviation is the population standard deviation divided by the square root of the sample size). This will be true regardless of the shape of the population: as long as the sample size is large enough, the population need not be normally distributed, but the sampling distribution will be.
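For anyone who prefers to see this rather than read it, here is a rough simulation sketch. It assumes a deliberately skewed population (exponential, with mean 1 and standard deviation 1), which is my choice for illustration and not anything from the tests being discussed. The function names `sample_means` and `skewness` are made up for this example. For each sample size it draws many samples, takes the mean of each, and reports how the collection of means behaves.

```python
import random
import statistics

def sample_means(n, trials=2000, seed=0):
    """Means of `trials` random samples of size n from an exponential population.

    The exponential(1) population is an assumption for illustration:
    it is strongly skewed, with mean 1 and standard deviation 1.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

def skewness(xs):
    """Population skewness: 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

for n in (1, 5, 30, 200):
    means = sample_means(n)
    print(
        f"n={n:>3}  mean={statistics.fmean(means):.3f}  "
        f"sd={statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n)={1 / n ** 0.5:.3f}  "
        f"skew={skewness(means):.2f}"
    )
```

If the sketch behaves as expected, the mean of the sample means stays near 1, their spread tracks the sigma/sqrt(n) column, and the skew shrinks toward 0 as n grows, even though the population itself never stops being skewed.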
What makes this complicated is that the notion of "the sampling distribution of the mean" is itself not a simple idea. Most of my grad students really struggle with this concept at first, so unless people really want a seriously long post, I'm not even going to attempt it here.
What it sounds like you're talking about is not actually the CLT, but the normal approximation to the binomial. When responses are independent and binary with a stable probability, the outcomes are described by the binomial distribution. If the sample size is large (say, more than 30 or so), then (almost) nobody actually uses the binomial distribution; they approximate it with the normal.
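A minimal sketch of that approximation, assuming a listener who is purely guessing (p = 0.5) in an ABX-style test. The trial counts and scores are hypothetical numbers picked for illustration, and `binom_tail` and `normal_tail` are names invented here. The first function gives the exact binomial probability of getting at least k right; the second gives the normal approximation with the usual continuity correction.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def normal_tail(k, n, p=0.5):
    """Normal approximation to P(X >= k), using a continuity correction."""
    mu = n * p                      # binomial mean
    sigma = sqrt(n * p * (1 - p))   # binomial standard deviation
    return 1 - NormalDist(mu, sigma).cdf(k - 0.5)

# Hypothetical scores: 12 of 16 correct, and 60 of 100 correct.
for k, n in ((12, 16), (60, 100)):
    print(f"{k}/{n}: exact={binom_tail(k, n):.4f}  normal={normal_tail(k, n):.4f}")
```

The two columns should agree more closely as n grows, which is the practical reason the normal gets used in place of the binomial for larger samples.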
A normal distribution of what? I apologize but I don't remember the details of these tests.
I'm lost. It's possible to construct a statistical model where "experience" is a predictive factor, and then test it. If this wasn't what was done, it's very hard to justify such a conclusion based only on the descriptives of the shape of the outcome distribution.
Maybe. Depends on how they ran the test. In principle, if you test 100 people individually, you would expect to reject the default null for about 5% (or whatever your alpha level is) of those people by chance alone, even if none of them can actually hear a difference. However, this is a well-understood problem (multiple comparisons), and there are similarly well-established ways to correct for it.
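To make the multiple-comparisons point concrete, here is a simulation sketch under stated assumptions: 100 listeners who are all guessing, 16 trials each, and a one-sided exact binomial test per listener. None of these numbers come from the linked tests; they are hypothetical, as are the names `p_value` and `simulate`. The Bonferroni correction shown (divide alpha by the number of listeners) is just one of the established corrections, chosen because it is the simplest.

```python
import random
from math import comb

def p_value(k, n):
    """One-sided exact binomial p-value: P(X >= k) for a pure guesser (p = 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

def simulate(listeners=100, trials=16, alpha=0.05, seed=1):
    """Count guessers who 'pass' with and without a Bonferroni correction."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(listeners):
        # Every listener is guessing, so each trial is a fair coin flip.
        correct = sum(rng.random() < 0.5 for _ in range(trials))
        pvals.append(p_value(correct, trials))
    naive = sum(p < alpha for p in pvals)                   # uncorrected
    bonferroni = sum(p < alpha / listeners for p in pvals)  # corrected threshold
    return naive, bonferroni

naive, corrected = simulate()
print(f"guessers passing at alpha=0.05: {naive} of 100")
print(f"guessers passing after Bonferroni: {corrected} of 100")
```

One detail worth knowing: with only 16 trials the test is discrete, so the achievable false-positive rate per listener is about 3.8% (12 or more correct) and not exactly 5%. The uncorrected count will still typically be a handful of "golden ears" who were only guessing.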
That certainly sounds right. The claim that "on average, people don't hear a difference" is very different, statistically, from "nobody hears a difference."
Hmm. The use of the "just by chance" phrasing in statistics is usually an indicator of the tenability of a conclusion based on some hypothesized overall population parameter (usually a mean). The presence of extreme values doesn't tell you anything about the stability of the individual measurements, only the population mean. If anybody is trying to draw conclusions about the prevalence of extreme values based on a hypothesis test of the mean, well, yeah, that's almost certainly wrong.
Truer words were never written! Though I'd modify that to "any" stats, not just "normal" stats. Most of the scientific manuscripts I reject when I peer review, I reject on the basis of incorrectly performed statistical analyses. This is indeed hard, and the complexity of the issues involved is really easy to not fully appreciate, even for people who do this kind of thing for a living.
If you want to test "does this specific subgroup score differently, on average, than this other subgroup?" that's actually pretty easy to test, assuming adequate sample size. However, asking "are there specific individuals in the distribution who score in a way that is systematically deviant from everyone else?" is much harder.
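As a sketch of the "easy" subgroup question, here is one way it could be done: a two-sided permutation test on the difference in mean scores between two groups. This is a stand-in technique picked because it needs few assumptions, not a claim about what any of the linked tests did. The scores are invented (number correct out of 16 for hypothetical "experienced" and "casual" listeners), and `permutation_test` is a name made up for this example.

```python
import random
from statistics import fmean

def permutation_test(a, b, reps=10000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns (observed difference, approximate p-value).
    """
    rng = random.Random(seed)
    observed = fmean(a) - fmean(b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(reps):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = fmean(pooled[:len(a)]) - fmean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (reps + 1)

# Hypothetical data: correct answers out of 16 trials per listener.
experienced = [9, 11, 8, 12, 10, 9, 13, 8]
casual = [8, 7, 9, 10, 8, 6, 9, 8]
diff, p = permutation_test(experienced, casual)
print(f"mean difference = {diff:.2f}, p = {p:.3f}")
```

Note that this only answers the group-average question. It says nothing about whether one particular listener in either group is systematically different from everyone else, which is the harder problem.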
Again, I heartily agree!