Designing an experiment to detect differences between cables
Apr 19, 2004 at 2:49 PM Post #46 of 93
Quote:

Originally Posted by rodbac
I know what the odds were- 1 in a 1000 is hardly more statistically relevant for proof of something like what we're trying to prove than 50-50 (which is why I used the term "impressive"- it was too late to do the math).

I've never claimed to be an expert in stats, but I do know enough to tell you that correctly predicting 10 coin flips comes nowhere ****ing near the improbability to "wow" anyone who knows anything about it. Only explanation would be a weighted coin? Probability "infinitesimal"? Seriously, maybe if you're in 3rd grade.



Conventionally, the probability level used in statistics to determine significance is 0.05. In other words, an effect is considered "statistically significant" if there is less than a 1 in 20 probability that by rejecting the null hypothesis we are committing a Type I error. In the coin flip experiment, the null hypothesis would be that the coin is not weighted. In that case we would expect the number of heads in ten flips to follow a binomial distribution (roughly bell-shaped), with maximal probability of occurrence at 5/5. The 1 in 1024 probability events (all heads or all tails) would be at the extreme ends of the distribution. Since together they have far less than a 0.05 probability of occurrence if the null hypothesis is true, we would reject the null hypothesis if either occurred, and conclude that the coin was weighted. Rejecting on that rule, we would be committing a Type I error in only about 2 of every 1024 experiments (1 in 512), assuming we did a very large number of experiments. Since the conventional level in science is 1 in 20, we're on pretty firm ground.
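
To make the arithmetic concrete, here's a quick Python sketch (my own illustration, not part of the experiment) that computes the exact two-sided p-value for ten flips of a fair coin:

Code:

from math import comb

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: probability of an outcome at least
    as improbable as k successes in n trials, under chance p."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return sum(q for q in pmf if q <= pmf[k])

print(binomial_two_sided_p(10, 10))  # 2/1024, about 0.002 -- well under 0.05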

If you don't understand this, please bail. This is Stats 101. If you don't understand how it works, you don't have the capability to criticize any scientific experimentation, as you're missing the tools to understand how scientific data is analyzed.
 
Apr 19, 2004 at 2:49 PM Post #47 of 93
Quote:

Oh God, you're an expert on cables and statistics as well as an expert on rainbow foil... what a man you are, Rodbac


Crimeny, I come back in here to apologize to Hirsch and find a guy who thinks rainbow foil is reasonable taking pot shots...

Pink, I haven't claimed to be an "expert" on any of those things (well, the rainbow foil I guess I did ;) ). I am truly curious about the results of this cable experiment (if it's done PROPERLY), and have admitted I've only had a couple classes in my time that required heavy statistics.

However, I have more school under my belt than many MDs, and I am certainly able to spot a bad experiment (at least when it's as bad as "listen to these two cables and tell us whether they're different", or "if even one person can spot a difference, the difference exists").

Now to why I am here...

Hirsch, I wanted to come back and apologize for the harsh tone. Good luck with the experiment.
 
Apr 19, 2004 at 2:52 PM Post #48 of 93
Quote:

If you don't understand this, please bail. This is Stats 101. If you don't understand how it works, you don't have the capability to criticize any scientific experimentation, as you're missing the tools to understand how scientific data is analyzed.


Yep, you're completely, 100% correct. I'm WAY out of my league and I know that now.

Just go with the two cables, let the listener flip between the two and report if there's a dif. Get it banged out.

Good luck with it, Hirsch. I'll be watching for the results in my journals...
 
Apr 19, 2004 at 3:06 PM Post #49 of 93
Quote:

Originally Posted by rodbac
However, I have more school under my belt than many MDs, and I am certainly able to spot a bad experiment (at least when it's as bad as "listen to these two cables and tell us whether they're different", or "if even one person can spot a difference, the difference exists").

Hirsch, I wanted to come back and apologize for the harsh tone. Good luck with the experiment.



Apology accepted.

If you've had the education claimed, please read what I've written carefully, and try not to misquote me or take what I've said out of context. Regardless of your education to date, you still need a course in statistics, and another one in experimental design.

If you've never designed an experiment before, you have to realize that it is not done in one step. You start with the hypothesis, then fill in details. There is an exchange as various methods are discussed. Eventually, once a method is arrived at, the experiment may be ready for a trial run. The purpose of putting suggestions for general procedures out is to allow them to be criticized. However, it is also important that criticism be substantive.

"This is a bad experiment" is not a valid criticism. "This is bad experiment because it allows experimentor bias to influence the outcome" would be a valid criticism. It's valid because it has content that can either be accepted or refuted. Pot shots from the sidelines merely siderail the discussion. So far, those are all that you've delivered.
 
Apr 19, 2004 at 3:55 PM Post #50 of 93
Quote:

Originally Posted by Edwood
Ah. So you are splitting from one source three ways to three different IC's to one amp with three inputs?


Nope. No splitter is used. The source has two outputs, so no degradation.

Quote:

Not everyone lives with their mother.


That was an example. :P


Quote:

I guess the ghetto solution to the double blind would be to cover up the wires.


Or simply have the person doing the listening face the opposite direction and have someone else flip the switch.
 
Apr 19, 2004 at 11:39 PM Post #51 of 93
Quote:

Originally Posted by radrd
Or simply have the person doing the listening face the opposite direction and have someone else flip the switch.


He must also plug his/her ears with his/her index finger and go LALALALALALALALALALA...
 
Apr 19, 2004 at 11:41 PM Post #52 of 93
Quote:

He must also plug his/her ears with his/her index finger and go LALALALALALALALALALA...


Only if the switch makes a distinct noise for each position it is in... ;)
 
Apr 19, 2004 at 11:47 PM Post #53 of 93
Quote:

Originally Posted by radrd
Only if the switch makes a distinct noise for each position it is in... ;)



Cable 1: *cough*
Cable 2: *Ahhhh Chooooo*
Cable 3: *sniff*
Cable 4: *giggle* *psst...this is the good cable*

Don't give the cable dude any ideas :P
 
Apr 20, 2004 at 1:08 AM Post #54 of 93
I have a couple of thoughts. First, radrd's suggestion to use a single pair of 'phones sounds better than using two sets of 'phones. Switching would be faster and it would guarantee that everything except the cables would be the same. Second, have you considered trying to correlate the results of this somewhat subjective experiment with a more objective measurement of cable differences? For example, you could use someone's high-end Terratec or RME soundcard to record a source with different interconnects between the source and the soundcard. Another option would be Source -> Interconnects -> A/D converter -> USB Audio Device -> PC. I think that any audible difference between cables should be equally apparent when looking at recorded waveforms.
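
As a sketch of how the waveform comparison could work (purely illustrative; the file names are hypothetical, and I'm assuming the two captures are level-matched and time-aligned), a null test subtracts the two recordings and reports the residual:

Code:

import numpy as np
from scipy.io import wavfile

# Hypothetical captures of the same passage through two interconnects.
rate_a, a = wavfile.read("cable_a.wav")
rate_b, b = wavfile.read("cable_b.wav")
assert rate_a == rate_b, "recordings must share a sample rate"

n = min(len(a), len(b))
a = a[:n].astype(np.float64)
b = b[:n].astype(np.float64)

# Null test: energy of the difference signal relative to the signal
# itself, in dB. A residual far below the noise floor means the two
# captures are effectively identical.
residual_db = 10 * np.log10(np.sum((a - b) ** 2) / np.sum(a ** 2))
print(f"residual energy: {residual_db:.1f} dB relative to signal")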

Edit: One more thing. Where's the control group? I'd like to see a control group listening to identical pairs of cables.
 
Apr 20, 2004 at 3:32 AM Post #55 of 93
Putting Rodbac behind us, let’s take a look at what we need to accomplish. Our design is going to have several objectives that will be needed to give it credibility. These objectives can be expressed in both experimental and statistical ways. This may be overly advanced for some, and overly simplistic for others.

We need to eliminate alternative explanations for any effects observed. Obvious examples of this are subject and experimenter bias. I think we’re agreed that the subject must be blind to the exact experimental condition. I also think that we’re agreed that if the experimenter is not blinded to the experimental condition, then the setup cannot allow the subject to receive cues from him. Other possible effects we need to control for may be less obvious. We need to control for possible listening fatigue. Accordingly, all trials must be done in a random order. Once we decide what the trials are, we can use a random number generator to set the order. This must be done for each subject.
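
For the randomization, something as simple as the following would do (a sketch only; the trial types mirror the list later in this post, and the repetition count is a placeholder we still have to decide):

Code:

import random

TRIAL_TYPES = ["same cheap cable", "same expensive cable",
               "different cables", "same attenuators",
               "small attenuation at one input"]
REPS = 4  # repetitions per trial type -- still to be decided

def trial_order(subject_id: int) -> list:
    """Independent, reproducible random order for each subject."""
    trials = TRIAL_TYPES * REPS
    random.Random(subject_id).shuffle(trials)
    return trials

print(trial_order(1))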

We need to control for the hearing of the subject. We also need a way to control for deliberate sabotage. Sadly, one way to kill off a study of this type is for an individual with an agenda, and it’s pretty clear there are a few out there, to screw up the study by generating random responses. Only one or two could be enough to create enough variance to mess up a statistical analysis, unless we go with a really large sample size. For this, we need a positive control. The positive control would have to be an auditory stimulus that we know about, and that is just barely detectable by a person with normal hearing. If someone performed at chance levels on trials where there were two different cables (i.e. couldn’t tell the difference), but also performed at chance levels on trials where we know there was a detectable difference, we can throw that data out. The person either did not have normal hearing, or was just screwing around with the experiment. In either case, it’s not valid data. Without a positive control, we cannot rule out these effects, and a negative finding would be less meaningful. One way to do this would be to use a set of inline attenuators. We’d need three sets, to control for the effect of the attenuator itself. Two sets would have the same small attenuation, and could be used as a control. The third set would have just enough extra attenuation to produce a barely audible change in gain. We’d need to test to determine just how much this should be.
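
The screening rule could look something like this (my own sketch; the 10-trial positive-control block and the 0.05 cutoff are assumptions, not settled numbers):

Code:

from math import comb

def at_chance(hits: int, trials: int, alpha: float = 0.05) -> bool:
    """True if a positive-control score is consistent with guessing,
    i.e. not significantly better than 50%."""
    # One-sided binomial p-value for scoring `hits` or better by luck.
    p = sum(comb(trials, k) for k in range(hits, trials + 1)) / 2**trials
    return p >= alpha

print(at_chance(6, 10))   # True  -> exclude this subject's data
print(at_chance(9, 10))   # False -> hearing verified, keep the data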

The type of testing we’re going to try is actually pretty easy to interpret if a positive result is found. We know the distribution of responses that are possible in the absence of a difference between cables (think coin flip). However, we run into some statistical issues here. We can “disprove” a null hypothesis, and assign a probability to the values that we obtain occurring by chance. Conventionally, if we get an effect with a probability of occurrence of less than 0.05 (we call this criterion alpha), we say that we’ve got a significant effect. However, we are also accepting that there may be a 5% chance that if we say we’ve got a real effect, we’re wrong. We’ll obtain a “p” value for our results, which will tell us the actual probability of their occurrence in the absence of an effect (alpha is the criterion level, while p is the obtained value). If p < 0.05, we’re happy, and we can say we got an effect. If p < 0.001, we’ll be very, very happy. This probability is our estimate of committing what is called Type I error in statistics: rejecting the null hypothesis when it is true. If we say that we’ve got an effect, we are rejecting the null hypothesis, and obviously want the probability that we’re wrong to be as small as possible.

However, and this is important, failure to find a positive effect does not necessarily imply a negative effect. If we get, say, p = 0.10, we’ll say that the effect is not statistically significant, even though results that extreme would occur by chance only one time in ten. The effect may well be real; we just haven’t proven it to scientific standards. So, we need to take steps to control for Type II error: accepting the null hypothesis when it is false. One way to do this is to run a large number of subjects. The more subjects we run, the less chance that the absence of an effect is due to experimental error. The actual statistic for this is called beta. It is not always calculated for experiments, and is in fact difficult to do. Many scientists don’t bother, since the data is useless to them if they don’t get a positive effect. It usually requires a pilot study to get some initial numbers to plug into the calculation. However, if beta is not < 0.05, we will not be able to say that no effect was present. Since people are going to want to say that there was no effect, rather than that we failed to get a significant effect, we’re going to have to ensure that beta is small.
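
A rough feel for the numbers (an illustration only: the assumed 70% true hit rate is a guess, not a pilot-study figure):

Code:

from math import comb

def power(n: int, p_true: float, alpha: float = 0.05) -> float:
    """Chance of a significant result in n same/different trials,
    given a listener who is truly right with probability p_true."""
    # Smallest hit count whose one-sided chance probability is < alpha.
    tail, k_crit = 0.0, n + 1
    for k in range(n, -1, -1):
        tail += comb(n, k) / 2**n
        if tail >= alpha:
            break
        k_crit = k
    # Probability the real listener reaches that criterion (power = 1 - beta).
    return sum(comb(n, k) * p_true**k * (1 - p_true)**(n - k)
               for k in range(k_crit, n + 1))

for n in (20, 40, 60, 80):
    print(n, round(power(n, 0.70), 3))  # beta < 0.05 needs roughly 80 trials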

Whew. Anyone who is still reading this is a stronger man than I.

Some thoughts and questions, which I’ll express with more brevity than above:

How long is a session going to be? Do we want to put a time limit on the session, or let the subject take as long as he needs?

How many subjects are we going to run in a session? This will depend on how long a subject takes to finish his trials.

The trial types so far:

Type 1: The same cheap cable at both inputs to the amp.
Type 2: The same expensive cable at both inputs to the amp.
Type 3: Different cables at each input to the amp.
Type 4 (optional): The same attenuators at each input.
Type 5 (optional): Small attenuation at one input.

We're going to need to design the response sheet the subject will complete after each trial. Do we just need a "same"/"different" response, or do we want the subject to try to identify the expensive cable and give sonic impressions?

Are two types of cables enough? How do we select the ones we want to use?

How many repetitions do we want of each trial type we end up with?

Do we want to run this at multiple sites, to raise the N (number of subjects)? If so, how do we ensure that experimental conditions are comparable?

There's lots more, but this should give us something to work on.

Confused? Welcome to the wonderful world of science :P
 
Apr 20, 2004 at 4:19 AM Post #56 of 93
Quote:

Originally Posted by Hirsch
Do we want to run this at multiple sites, to raise the N (number of subjects)? If so, how do we ensure that experimental conditions are comparable?


Perhaps the experiment should be supervised by the same person (or people) at several different Head-Fi meets. Said person (or people) should be ignorant of the data collected until after the last run of the experiment. I suggest sealing it and sending it to a non-participating third party for safekeeping.
 
Apr 20, 2004 at 10:38 AM Post #57 of 93
Quote:

Originally Posted by Orpheus
the question: "Can cables audibly affect the tonality of your system?"

the goal is to devise a test that both believers and non-believers agree on. then we will execute this test and report the results.

so, let's do it.

post your ideas here.



Last week, a customer at "my" local hifi dealer invited me to participate in an audition by which he wanted to determine the differences between two amplifiers he was interested in (I only entered the listening room in the first place to look at a rare Gibson Les Paul they had there as decoration). I just kinda sat there, without any real interest, but just like the customer, I had difficulties spotting the differences between the amps.

Then the hifi dealer replaced the interconnect between CD player and amp, and the difference between the two cables was clear, and it was so much easier to hear than I had ever experienced before. The difference wasn't even subtle in the sense that you'd have to listen closely or go back and forth several times. It was unmistakable. The other guy heard it too. I did not even ask for the cable brands, but I noticed that the better sounding cable was 3-4 times as long as the dull one, which surprised me.

This dealer has the components arranged on a shelf, whose front faces the shop's window, and whose back faces the listening room. All persons in the room could clearly see the terminals of the CD player and the amp involved, what the dealer was doing, what the cables looked like, and that the same CD player and amp as before were involved. Moreover, this test was clearly not about selling cables, but rather about two amps, and apparently the dealer wanted to make it easier for the customer to spot the differences between the amps by installing a more revealing interconnect. That was a sufficiently "scientific" test setup for me.

So the question is less "Can cables audibly affect the tonality of your system?", because the answer is already known. The question is more: "How can I teach myself to better and easier notice the sonic differences between cables?" In my experience, that ability is the result of a learning process, which takes a little time, and a function of the level of relaxation the listener can reach. I remember how "hard" I listened when I first experimented with different cables. The proper relaxation is not easy to achieve. In the example above I was completely relaxed because it wasn't my audition, and I wasn't expecting others to expect me to spot the differences.

The most sophisticated test setup may fail to yield meaningful results if the listeners are too tense.
 
Apr 20, 2004 at 1:20 PM Post #58 of 93
Quote:

Originally Posted by Sugano-san
I just kinda sat there, without any real interest, but just like the customer, I had difficulties spotting the differences between the amps.

Then the hifi dealer replaced the interconnect between CD player and amp, and the difference between the two cables was clear, and it was so much easier to hear than I had ever experienced before. The difference wasn't even subtle in the sense that you'd have to listen closely or go back and forth several times. It was unmistakable. The other guy heard it too. I did not even ask for the cable brands, but I noticed that the better sounding cable was 3-4 times as long as the dull one, which surprised me.



How do you know the first wasn't an analog IC & the 2nd a digital IC, or vice versa? I'm not sure your "experiment" has proven anything.
CPW
 
Apr 20, 2004 at 1:40 PM Post #59 of 93
Quote:

Originally Posted by Hirsch
How long is a session going to be? Do we want to put a time limit on the session, or let the subject take as long as he needs?


As Sugano suggested, I think the listener should feel relaxed enough to take plenty of time. That said, there should be some limit, so the experimenter doesn't have to sit there all day.

Quote:

Originally Posted by Hirsch
How many subjects are we going to run in a session? This will depend on how long a subject takes to finish his trials.


From a non-statistical perspective, there should be enough samples that bias or error in one sample doesn't destroy the data.

Quote:

Originally Posted by Hirsch
We're doing to need to design the response sheet the subject will complete after each trial. Do we just need a "same" "different" response, or do we want the subject to try and identify the expensive cable and give sonic impressions?


It depends what you want the hypothesis to be. It seems to me that we are just looking for a qualitative difference in sound. In that case, a simple same/different response from the subject is enough information to test the hypothesis that cables do/don't make a difference in tonality.

Quote:

Are two types of cables enough? How do we select the ones we want to use?


I would suggest using multiple sample sets of cables, with two selected at random for each subject to test with. I think the different cables should be of comparable quality, differing in construction, geometry, or materials. That way the hypothesis can't be rejected because of comparing "zipwire" (which may have poor electrical properties, such as high resistance) to well-constructed cables.

The apparent difference, to a person who thinks he can hear one, may vary on different levels of equipment. I believe that I can tell a difference between certain cables on my headphone system, but I am hesitant to say whether or not I would be able to do so on an unfamiliar system.

We might want to consult an expert on auditory memory. It has been said before that our auditory memory is naturally poor. For this reason, it might make a huge difference if the experiment were conducted on familiar equipment with sufficient time to discern a difference.

It is also important to figure out what the test music will be. Will each subject hear the same music? What if the subject has a bias against the test music, or the test music is not recorded well? As I mentioned above, if the subject does not know the test music well, it may be too difficult to discern a difference due to our poor auditory memory. On the other hand, having a test subject bring their own music might introduce a variable that is out of our control.

One way to implement the experiment would be to create a sampler disc with five or so tracks from different genres, all thought to be of above average recording quality. Perhaps one classical track, one jazz track, one rock, one hip hop, etc. That way the music selection is controlled.

Quote:

Do we want to run this at multiple sites, to raise the N (number of subjects)? If so, how do we insure that experimental conditions are comparable?


If the experimental conditions aren't carefully controlled, the results may have little value to the community. Using multiple test sites is a good idea, since I think our sample size should be as high as possible; however, if there are different experimenters, there should be an exact procedure for them to follow. In addition, the equipment used at different test sites has to be identical.
 
Apr 20, 2004 at 2:11 PM Post #60 of 93
Something inside me told me that I should have stayed away from this thread, and the reply quoted below is a case in point.

On the other hand, I am confident that nothing questionable happened in the course of the demo I described, and I did want to share the experience that differences in analog interconnects can be quite easy to detect. Maybe even easier in a situation where you're not focused on the sound of cables, but something different. I was quite surprised myself.

In any event, I stand by my theory that tense listeners will have smaller chances to hear what's going on. I also stand by my theory that the ability to detect sonic differences easily is the result of a learning process.

Quote:

Originally Posted by cpw
How do you know the first wasn't an analog IC & the 2nd a digital IC, or vice versa? I'm not sure your "experiment" has proven anything.
CPW



 
