Scientific Listening Experiment: Can we tell the difference? Let's find out.
Dec 14, 2004 at 9:53 PM Thread Starter Post #1 of 70

TWIFOSP

I've become intrigued by the events in this thread:
http://www6.head-fi.org/forums/showthread.php?t=96841

Particularly the latter portions of the thread, which have turned into yet another cable discussion. I, personally, am of the camp that can hear the difference in cables, especially headphone recables.

However, I also have no scientific evidence that there is a difference. And most of the "studies" that have been performed in this regard have been flawed either in the experimental design or in the post-hoc analysis of the statistics and comparisons.

Therefore I would like to construct an official head-fi experiment where we put this issue to rest once and for all. I happen to do this kind of thing for a living, but not in the audio field. So I will offer my services to design the experiment and analyze the data for publication to the community. The results are welcome to be scrutinized by anyone. The conclusions as well as the raw data will be made completely public.

The goal: To find out if a group of listening experts (audiophiles) can statistically tell the difference between aftermarket and stock headphone cables.
The piggy-back goal: To find out if a group of listening experts can statistically detect improvements between aftermarket and stock cables.

What we NEED:
1. A control group. This group should contain at least 20-25 participants and should be composed of expert listeners. The population should also include some of those who claim to hear no difference in cables. (Don't worry, we have a way of making sure you don't sabotage the study.)

2. Control equipment. As a community, we must decide on the control source, the control power, the control amp, and the control headphones. This is open to discussion. I would think the 650s would be ideal because those seem to be the most contested headphone with cable upgrades. This is up to the community.

3. Control environment and time. This ideally could be done at a headphone meet. Realistically, this could take several hours, and everyone must be in the same location with access to the exact same control equipment.

This could take a while. We will first need to perform a control test and on-the-spot data analysis (MSA, variability checks, etc.) to ensure that the control group is capable of an acceptable level of variation and consistent data measurements. Anyone willing to commit would probably need to commit 4 hours.

4. Data collection. We will collect the data in an unbiased, blind-test manner. We will introduce intentional variations in the tests to ensure variation control among the listeners, as well as honest responses. The methods behind this will be known only to the facilitators of the event until after the study is over, to avoid listener bias.

5. Data analysis. Statistical analysis of the results to determine: A. Differences in listeners. B. Invalidation of listeners (false positives). C. And of course, differences in cables. I can do this, but I would be happy to share this task with someone else who is capable of this computation, if only to ensure I do not bias the tests (even though the raw data will be published).
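To give a concrete flavor of what the per-listener analysis could look like (this is only a sketch of one standard approach, not the method the facilitators will actually use): an exact binomial test tells you how likely a listener's blind-test score would be under pure guessing.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided exact binomial p-value: the probability of getting at
    least `correct` answers out of `trials` blind A/B trials by pure
    guessing (chance of a correct guess on each trial = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example: a listener who gets 12 of 16 trials right.
# p ~= 0.038, i.e. unlikely to be pure guessing at the 5% level.
print(abx_p_value(12, 16))
```

Scores near chance would flag a listener as unable to distinguish the conditions; consistently significant scores are the interesting cases.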

I understand if this never gets off the ground. It will require lots of organization and logistics. But I think it'd make for a great exercise at a head-fi meet. And I'd like to put this topic to rest once and for all.


So... who's up for it?

If we can allocate resources, people, and a location, I'll design the experiment and share it with the other facilitators. As mentioned, we can't share the methods publicly until the experiment is completed, to avoid bias (Hawthorne effect, w00t).
 
Dec 14, 2004 at 10:12 PM Post #2 of 70
I think this would be an interesting experiment, but it would not be anywhere near conclusive, because I think it is difficult for a person to perform well in an ABX test when they are not used to listening to the equipment being used, or the musical selections that are part of the test. To borrow from someone's analogy in the other thread, I might have a hard time telling apart two neighbor's twin girls that I just met, but I know if I had twin daughters I could tell them apart quite readily.
 
Dec 14, 2004 at 10:25 PM Post #3 of 70
Quote:

Originally Posted by PhilS
I think this would be an interesting experiment, but it would not be anywhere near conclusive, because I think it is difficult for a person to perform well in an ABX test when they are not used to listening to the equipment being used, or the musical selections that are part of the test. To borrow from someone's analogy in the other thread, I might have a hard time telling apart two neighbor's twin girls that I just met, but I know if I had twin daughters I could tell them apart quite readily.


A good point and I'm glad you raised it.

To get around this problem, we will conduct tests and calibration before the actual study begins. This process will calibrate everyone statistically on changes and differences. As long as everyone is honest, this process accounts for that problem. This process is known as a Measurement Systems Analysis (MSA). It is a key component of an objective test, and is used quite often to ensure the results of the final study are valid. Without it, a subjective survey test is never accurate.

As for music, I imagine that along with control music and tones, we will let each listener supply one track of their own to listen to. But the validity of a test subjects own music will be called into question unless the results of the other control music are also significant. Either way, we can draw interesting conclusions.
 
Dec 14, 2004 at 10:48 PM Post #4 of 70
I'm all for familiarity, but let's face it, the stuff I have at home isn't going to cut it for a test like this. We need a source and cans (and recording) that will all get out of the way. Transparent.
 
Dec 14, 2004 at 10:55 PM Post #5 of 70
Quote:

Originally Posted by Jahn
I'm all for familiarity, but let's face it, the stuff I have at home isn't going to cut it for a test like this. We need a source and cans (and recording) that will all get out of the way. Transparent.


I agree, but then again, I'd like to conduct the test with a real world headphone. Perhaps we'll have to use 2 control headphones.

Suggestions?
 
Dec 14, 2004 at 10:56 PM Post #6 of 70
Sounds like a great idea. I'm in... although I doubt I would pass the 'expert' exam.
 
Dec 14, 2004 at 11:15 PM Post #7 of 70
Quote:

Originally Posted by TWIFOSP
To get around this problem, we will conduct tests and calibration before the actual study begins. This process will calibrate everyone statistically on changes and differences. As long as everyone is honest, this process accounts for that problem. This process is known as a Measurement Systems Analysis (MSA). It is a key component of an objective test, and is used quite often to ensure the results of the final study are valid. Without it, a subjective survey test is never accurate.


Can you explain how this would be done, as I am not familiar with MSA. I've never had my ears "calibrated" before, how does one go about doing this? And does it hurt?
 
Dec 14, 2004 at 11:28 PM Post #8 of 70
Quote:

Originally Posted by PhilS
Can you explain how this would be done, as I am not familiar with MSA. I've never had my ears "calibrated" before, how does one go about doing this? And does it hurt?



Heheh, actually we wouldn't calibrate the listeners' ears. I wouldn't know the first thing about doing that. Instead, we'll calibrate our data collection methods to account for listener variation. To do this, we'll intentionally alter the conditions and measure the responses. If this were a task with known right answers, we'd then calibrate the listeners on the differences. But since we have no "right and wrong" template to go on, we'll just have to make some educated guesses.

Essentially it's just a mathematical way to benchmark a bunch of people who aren't trained to do something the same way. It'll help us explain listener variation instead of cable variation later on in the study.
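As an illustration of that benchmarking idea (a toy sketch only, assuming each listener scores the same hidden condition several times; a real MSA/gauge R&R is more involved than this): splitting the total variance into within-listener and between-listener parts shows how much of the spread comes from the listeners rather than from the cables.

```python
from statistics import mean, pvariance

def variance_components(ratings: dict) -> tuple:
    """ratings maps listener name -> repeated scores for the SAME hidden
    condition. Returns (within, between): the average within-listener
    variance (repeatability) and the variance of the per-listener means
    (reproducibility across listeners)."""
    within = mean(pvariance(scores) for scores in ratings.values())
    listener_means = [mean(scores) for scores in ratings.values()]
    between = pvariance(listener_means)
    return within, between

# Two perfectly consistent listeners who disagree with each other:
# all the variance is between listeners, none within.
print(variance_components({"ann": [5, 5, 5], "bob": [3, 3, 3]}))
```

A group with large within-listener variance couldn't support conclusions about cables, which is exactly what the pre-study check is meant to catch.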
 
Dec 14, 2004 at 11:43 PM Post #9 of 70
You live in Austin? I might be up for it, or maybe a meet. I have no opinion on cables since I've never had any experience with them. Heck, I don't even own an amp yet. I just want to be in it so I can listen to the good stuff.


EDIT: I can't type...
 
Dec 15, 2004 at 5:07 AM Post #10 of 70
In science, the use of the word control has a very specific meaning. It's a group that is treated identically to an experimental group, except for the experimental treatment itself. For example, in a pharmacology experiment, the placebo group would be a control group. There are also positive control groups, where a group is given a treatment with a known effect on the dependent variable. This is done to ensure that the measure is in fact sensitive to the manipulation that you're doing in the experiment. Controls are run to reduce the possibility of an observed effect (or lack of one) being due to experimental artifact. The way "control" is being used in this thread is gibberish. You will need to run appropriate control conditions in order to draw conclusions from your experiment.

You also need to understand the use of the word "statistics". Your goal is to determine if a group of experts can tell the difference between aftermarket and stock headphone cables. Note the absence of the word "statistics". When the experiment is complete, you will use descriptive statistics to summarize the data. You will also perform inferential statistics to determine if any results you obtain meet probabilistic criteria to call them a real effect and not experimental error. However, your subjects will hear differences, or they won't. That's not a statistical action (although you can argue that it is in the case of a guess, which is not what you were talking about).

Presumably, you will be changing headphone cords as your independent variable. Your dependent variable will be a correct or incorrect identification? If so, please clarify.

The biggest nonsense is your control test, with MSA and variability check. Variability of what? Presumably you intend to measure something known. What is it, and how does it relate to your experiment? About the only pretest you might run is a hearing test, to ensure that your subjects can detect a normal range of frequencies for their age group. What is an acceptable level of variation and data consistency, assuming normal hearing?

How did you arrive at a figure of 20-25 people? Why not 10 or 500?

Here's a trick question for you. Out of 25 people, how many people must reliably be able to detect cable differences for you to conclude that the differences are real?
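One way to see why this question is tricky (a sketch only, not part of any proposed protocol): even if nobody can hear anything, testing 25 listeners each at a 5% significance level makes a few individual "passes" almost inevitable.

```python
from math import comb

def p_at_least_m_pass(n: int, alpha: float, m: int) -> float:
    """Probability that m or more of n pure guessers each pass an
    individual test at significance level alpha (binomial upper tail)."""
    return sum(comb(n, k) * alpha**k * (1 - alpha)**(n - k)
               for k in range(m, n + 1))

# With 25 guessers at alpha = 0.05, at least one "passes" a single test
# roughly 72% of the time by chance alone -- which is why the word
# "reliably" (repeatable across sessions) is doing all the work in the
# question above.
print(p_at_least_m_pass(25, 0.05, 1))
```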
 
Dec 15, 2004 at 5:18 AM Post #11 of 70
Quote:

Originally Posted by Hirsch
Here's a trick question for you. Out of 25 people, how many people must reliably be able to detect cable differences for you to conclude that the differences are real?


1


JF
 
Dec 15, 2004 at 5:23 AM Post #12 of 70
Quote:

Originally Posted by JohnFerrier
1


For once we are in complete agreement.
 
Dec 15, 2004 at 7:25 AM Post #13 of 70
Quote:

Originally Posted by TWIFOSP
I've become intrigued by the events in this thread:
http://www6.head-fi.org/forums/showthread.php?t=96841
...
Therefore I would like to construct an official head-fi experiment where we put this issue to rest once and for all.



You mean that my explosive first post on Head-Fi might actually instigate some sort of positive action in the direction of an accountable and forum-approved test trial of specific audio components, even if only because of the provocativeness of said post given the sordid history of the topic under discussion? And here I was being attacked for contributing nothing to the boards.


Sorry people, just reeling from the substantial (and passionate) response.

(Perhaps unsurprisingly) I am very supportive of TWIFOSP's idea, and I'll leave it to all of you to decide on a mutually acceptable procedure. It would be interesting indeed to see if this goes anywhere, and the results if it does.
 
Dec 15, 2004 at 8:18 AM Post #14 of 70
Quote:

Originally Posted by TWIFOSP
Therefore I would like to construct an official head-fi experiment where we put this issue to rest once and for all.



Once and for all, LOL! It would be fun, though.

There is an earlier thread, I believe Hirsch started it, with an interesting long discussion of the complexities of controlling this kind of experiment.
 
Dec 15, 2004 at 1:58 PM Post #15 of 70
Asking people to spot subtle cable differences on some unfamiliar gear (source, amp, and headphone if they are not native Sennheiser users) is adding too many variables to the equation, IMO. People have no baseline for what that system is "supposed" to sound like in the first place, so spotting differences when a cable is swapped becomes incredibly difficult.

I don't think this is a meaningful test. You would have to find a way to allow people to use their own systems that they actually know and would be able to spot differences on.
 
