Testing audiophile claims and myths
Jul 7, 2019 at 7:14 PM Post #13,186 of 17,336
I don't disagree with BigShot here... but I do disagree with the vehemence with which he defends his point of view.

I've done lots of "engineering work" - designing and testing various equipment and components - which often requires that things be measured.
In almost every case, there is a reasonably well established level of accuracy that is required for a successful result.
In some circuits, you need resistors matched to within 1%; in others a 10% part will be perfectly adequate.
However, when I measure resistors, I still use a meter that is certified to be accurate within 1/10 of 1%.
(The difference in price for the more accurate meter is inconsequential.)
The reason I do this essentially boils down to: "It never hurts to be more accurate than you need to be".
In engineering, we often discuss a more specific benefit, commonly known as "safety margin".
("If you start out with something that's a lot better than what you need, then it will still be OK, even if it drifts a little bit, or if you were a little optimistic.")
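The "safety margin" idea can be sketched numerically. Here's a hypothetical guard-banding example (the function name and figures are my own illustration, not from any measurement standard):

```python
# Hypothetical "guard-banding" sketch: when the meter itself has an
# error budget, a part should only pass if its reading falls inside
# the tolerance band shrunk by the meter's worst-case error.
def passes(reading_pct_off_nominal: float,
           part_tolerance_pct: float,
           meter_accuracy_pct: float) -> bool:
    """Accept the part only if the reading sits inside the tolerance
    band reduced by the meter's own accuracy limit (the guard band)."""
    usable_limit = part_tolerance_pct - meter_accuracy_pct
    return abs(reading_pct_off_nominal) <= usable_limit

# A 0.1% meter leaves almost the whole 1% band usable (limit = 0.9%):
print(passes(0.85, 1.0, 0.1))   # True
# A 1% meter's guard band eats the entire tolerance (limit = 0.0%),
# so even a good part can't be passed with certainty:
print(passes(0.85, 1.0, 1.0))   # False
```

This is one concrete sense in which "more accurate than you need" buys you margin rather than wasted precision.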

And, not surprisingly, I tend to apply a similar standard or guideline to the audio equipment I own.
Whenever possible, I don't buy equipment that is "just good enough that its flaws are inaudible to me"...
I prefer to have equipment that is "a lot better than the bare minimum that I need"...
(So, if I was quite certain that THD was only audible above 0.5%, I would still prefer to have an amplifier with a THD of 0.05% instead of 0.5%, "because it has a better safety margin".)
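To compare THD figures like these with audibility thresholds, it helps to express them as dB below the fundamental. A quick conversion sketch (my own, not from the post):

```python
import math

def thd_to_db(thd_percent: float) -> float:
    """Express a THD percentage as dB relative to the fundamental."""
    return 20 * math.log10(thd_percent / 100.0)

# 0.5% THD sits about 46 dB below the signal; 0.05% about 66 dB below.
print(round(thd_to_db(0.5), 1))    # -46.0
print(round(thd_to_db(0.05), 1))   # -66.0
```

In other words, a 10x margin in THD is a 20 dB margin, which is why it looks like a lot on paper while often costing relatively little in practice.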

And, yes, I do also agree that there is "taking things too far"......
That's why I haven't spent $10k on a meter that's accurate to 0.001%.....
(However, if someone were to offer me one for only 20% more than my current meter cost, I might consider it...)
And why I'm willing to settle for audio equipment that's "probably only 10x better than what I really need".....
But I absolutely don't agree that "there is absolutely no reason to pursue better performance past what's demonstrably audible" - I'll take my 10x safety margin, even if it costs a little extra, as long as it isn't too much extra.

The important thing to remember about listening tests is that they are not intended to detect all differences, just AUDIBLE ones. There can be differences that are measurable, but not audible. That may be important from a theoretical point of view, but not from a practical one. We are looking for differences that will impact our systems when we are listening to music in our living rooms. The average controlled listening test with tones detects differences that are an order of magnitude smaller than anything that would make a difference when you're listening to Beethoven on the couch. There is such a thing as good enough. Too many people chase down rabbit holes of absolutism. Most differences you read about in audiophile forums are completely irrelevant to real world music listening. You have to make an effort to learn what the numbers actually sound like to really understand. There are a couple of good AES demonstrations in my sig file if you are interested.
 
Jul 7, 2019 at 8:18 PM Post #13,187 of 17,336
Ok, I now have read more about the Swedish Radio example: apparently the test was not well designed and set-up.
Combined with the assumption that bigshot is right about blind, level-matched, directly A/B-switched testing giving a better chance to discern differences, I retract my proposal to do sighted listening before doing a blind test.
 
Jul 7, 2019 at 11:50 PM Post #13,188 of 17,336
It's good to throw suggestions out there. You run it up the flagpole and figure things out based on how it does.
 
Jul 8, 2019 at 12:44 AM Post #13,189 of 17,336
I'm seeing a major flaw here in terms of how the idea of "subjective vs objective" is being conflated with the difference between blind and sighted tests.
Doing a double blind test does NOT ensure that you get objective results...
All a blind test does is to avoid one specific sort of bias - a false positive based on expectation bias.
For one thing, if you have an expectation bias NOT to hear a difference between two devices, a simple blind or ABX test will do nothing to eliminate it.
For another thing, a blind test doesn't eliminate purely subjective responses; for example, even in a blind test, a subject might PREFER the sound of a device with more distortion.
(And, even if you're supposedly testing for "whether they hear a difference or not" they may still apply a subjective standard to reporting "tiny differences" or "significant differences".)

There is a classic example of "visual bias" that is often described in psychology textbooks...
A well-prepared steak is divided into two halves - and one half is dyed bright green with an odorless tasteless food dye.
When subjects are invited to compare a green piece of steak to a normal looking piece, with their eyes closed, they routinely report no difference (proving the dye has no effect on taste).
But, when invited to sample pieces, and allowed to see what they're eating, subjects consistently report that the green steak doesn't taste good.
This clearly demonstrates some sort of bias to perceive green steak as "bad" (almost certainly because we've been taught to associate that color with "spoiled meat").
HOWEVER, this is NOT a matter of "subjective vs objective"... (which would suggest something based on "opinion or expectation").
Even when the subjects know that the steak is identical, and have no conscious expectation or opinion that they will taste different, they STILL perceive a difference.
It is in fact a demonstration that, due to the complex way in which our brains are wired, input from one sense may affect how we interpret input from a different sense.
We actually, OBJECTIVELY, experience the taste of the steak differently when we see a green color associated with it - and an MRI will show a different sensory response in our brain.

To me it seems like an exact parallel if a certain audiophile finds that a "big impressive speaker" sounds better - but only when he or she can see those "non-audible attributes"....
- or "one with a big impressive price tag"
- or "one with better specs"
(something "we imagine we hear because we expect to hear it" is quite different from "a subjective evaluation of something"...)

I would also agree that I personally prefer to know whether what I'm hearing is being influenced by this sort of bias or not.
Therefore, at least as a starting point, being able to determine how various devices sound while avoiding various common types of bias is certainly useful...
However, I DO NOT believe that the terms "subjective" and "objective" apply here.

The proper way to phrase the situation would be that "a blind test will rule out most positive expectation biases due to recognizing or seeing the product".
 
Jul 8, 2019 at 12:58 AM Post #13,190 of 17,336
Ok, I now have read more about the Swedish Radio example: apparently the test was not well designed and set-up.
Combined with the assumption that bigshot is right about blind, level-matched, directly A/B-switched testing giving a better chance to discern differences, I retract my proposal to do sighted listening before doing a blind test.
trick question: how do you confirm how successful casual listening is in detecting audible differences?
 
Jul 8, 2019 at 1:47 AM Post #13,191 of 17,336
did you perhaps quote the wrong post? because in the one you quote I didn't mention objective or subjective once ^_^.

I think we're long past expecting a reasonable answer from that source. Bias is one thing. Commercial interests are another.

On another topic, challenging beliefs is at the heart of objectivism. If it's impolite to challenge, you'll never arrive at the truth. Challenges are judged by how well they present their case. Supporting arguments are what matter ultimately. If you don't believe the supporting arguments, you prove them wrong. You don't just throw up smoke.
 
Last edited:
Jul 8, 2019 at 4:11 AM Post #13,192 of 17,336
@bigshot You can insist on not being rude all you want, but I felt obviously ridiculed by you and I think you understand why. I have no issues accepting my biases - I have been completely open about what I believe, and I've attempted to argue for that. Through getting it explained by the posters in this sub, dwelling on it and reading around (especially the main post) I have changed my view.

I guess the main reason I take offence is because you essentially accuse me of being intellectually dishonest on purpose. It is hard, having spent not only a lot of time dwelling on this hobby but also money and emotional energy, to then have the whole foundation of it essentially ripped away and to have to rebuild a new version of it - one that is closer to the truth.

Anyways, sorry for cluttering the thread.

Yesterday, after having concluded that what I've been hearing is likely to be so influenced by biases that I cannot really trust it at all, it kinda took over my whole day - I couldn't get it out of my head. Today, my STAX SR-L500 arrives in the mail. Have I wasted my money? I mean I swear I loved the estat sound, but I am starting to think that the differences between the LCD 2.2c and SR-L500 are mostly in my head - maybe they sound essentially identical (for most listening at least), and it's really just money out the window.
 
Last edited:
Jul 8, 2019 at 5:09 AM Post #13,193 of 17,336
@bigshot You can insist on not being rude all you want, but I felt obviously ridiculed by you and I think you understand why. I have no issues accepting my biases - I have been completely open about what I believe, and I've attempted to argue for that. Through getting it explained by the posters in this sub, dwelling on it and reading around (especially the main post) I have changed my view.

I guess the main reason I take offence is because you essentially accuse me of being intellectually dishonest on purpose. It is hard, having spent not only a lot of time dwelling on this hobby but also money and emotional energy, to then have the whole foundation of it essentially ripped away and to have to rebuild a new version of it - one that is closer to the truth.

Anyways, sorry for cluttering the thread.

Yesterday, after having concluded that what I've been hearing is likely to be so influenced by biases that I cannot really trust it at all, it kinda took over my whole day - I couldn't get it out of my head. Today, my STAX SR-L500 arrives in the mail. Have I wasted my money? I mean I swear I loved the estat sound, but I am starting to think that the differences between the LCD 2.2c and SR-L500 are mostly in my head - maybe they sound essentially identical (for most listening at least), and it's really just money out the window.
@bigshot has been a naughty boy recently and seems to have his posting style stuck on "roast".


about headphones and differences, or becoming paranoid about what to believe: all you need is to be clear about what you're asking for:
- if you're looking for community validation, then listen to the community.

- if you're looking for some objectively superior product that provides higher fidelity, then measurements and only measurements can tell you something about it. but as we discussed before, one headphone might do one thing better and not another. it will then come back to you and your own priorities to decide if that's the improvement you wanted or not. that objective quality route will usually be a struggle as it's very hard to find exhaustive and rigorous headphone measurements. professionals don't usually share their data, so you often end up with nothing much or some amateur measurements that may or may not properly reflect fidelity for your pair and your use.

- if you're looking for something you enjoy, the only person that matters is you, the only variable that counts is "am I enjoying this?". if you enjoy using a headphone, then it's doing something right. maybe it's not sound, maybe it's the sound but what really reaches you is a special sauce instead of some pure fidelity, or maybe it's about comfort, maybe it's because you read something in a review, maybe it's the price or the marketing around the headphone or whatever. the actual reason why we enjoy something doesn't need to be clear. I believe it's all right to just enjoy and be happy about it. all our warnings about biases and not mistaking a feeling for what's happening in the objective world are relevant when what you're looking for is a fact. or when you post on a forum and don't wish to misinform thousands of people by forcing unchecked beliefs onto others as if they were undeniable truths. but otherwise, are facts making us enjoy music more? not sure there is any correlation between the two. knowledge never claimed to be the path toward happiness.

what I'm trying to say is:


:wink:
 
Jul 8, 2019 at 6:11 AM Post #13,194 of 17,336
Would it be a good idea to do the following, or can there be any problems with this from a scientific perspective:
First do extensive sighted listening to the things you want to compare, taking as much time as you like. Get "familiar" with the sound, and try to find "suspected differences" and specific audio fragments with which you seem to hear these "suspected differences". Then do a well set up double blind ABX test using those specific audio fragments. Only if you can confirm the "suspected differences" in this test can they become "accepted differences".
In fact, I am wondering if this approach should not even be considered mandatory before concluding that there is no difference (or rather: I think this approach would raise the level of confidence that there is no difference). (Certain differences are maybe only audible with specific kinds of audio fragments, and not by all people. If you just do a double blind ABX test with a "random group of people" and a "randomly chosen set of audio/music fragments" and just look at the "statistical relevance" of the deviation from a 50% correct score, you could reach a false conclusion if, for example, none of the audio/music fragments was suitable for hearing the difference.)

There are a few problems here and in fact there's a better idea, one that is commonly (but not always) employed. Firstly the problems:

A. It's not such a good idea to use, say, an ABX test for "suspected" differences. First find out if there is an actual difference, which is generally easier, quicker and more reliable anyway. This way, the ABX test is hopefully going to answer a specific question: "Is this specific difference audible?". As I said in my post to AudioThief, "Commonly we have two sides to the scientific/rational approach", and "commonly" indicates "often but not always", and that's because we sometimes don't need the "observed evidence" side. For example, if there is no actual difference then there's no point in any sort of listening test because either it will just confirm what we already objectively know or it won't, in which case it must be a faulty test. "Observed evidence" might also be unnecessary because it's irrelevant for another reason or has already effectively been done. For example, if the difference is, say, just random noise at -140dBFS, then we can't even reproduce it in the first place, or if the difference is above 30kHz there's zero chance we can hear it, because we've already extensively tested frequency hearing thresholds over many decades, with countless subjects, and 30kHz is significantly above that threshold.
B. The goal of an ABX or other double blind test should always be to pass it. I can't think of a single published scientific audio double blind study where they used a "randomly chosen set of audio/music fragments"; they always use a set of carefully chosen audio fragments which maximise the chances of the differences being audible. In fact, a test signal will often be used, a signal which doesn't necessarily ever exist in music/audio recordings but is specifically designed to make the particular difference being tested more audibly detectable. This enables us to apply the results of the test to a wider group of people than just the test subjects, with a higher degree of confidence. A good example of this is the testing that has been carried out for random jitter audibility. Using a specifically designed test signal, some subjects have been able to discern random jitter down to just two or three nano-seconds. However, this signal never exists in nature or in any audio/music recordings, and when doing random jitter tests using music recordings, the lowest reported results have been 200 nano-secs, although most subjects struggle to discern lower than 500 nano-secs. From this we can draw some conclusions with a high degree of confidence: for example, while different people have different hearing acuity and different levels of listening skills, we can say with very high confidence that no one can hear random jitter below two or three nano-secs with musical material.
C. It's not really a problem. If those test subjects could not discern a difference then you have your answer, it makes no difference. The problem only occurs if you try to apply that answer to other people/everyone else.

And now the better idea, which also has several points (but can depend on exactly what it is that we're testing):

A. We don't just have a "random group of people" but a selected "random group". I know this sounds contradictory but let me give an example: If we randomly pick 20 people off the street, the probability is high that they'll all just be members of the public, it's unlikely there will be any audiophiles or any experienced, professional musicians or sound/music engineers in that group. So depending on what we're testing, we can (and often do) select a group of random test subjects that deliberately includes representation of all 4 of these groups.
B. Essentially the same as "B" above, not use a "randomly chosen set of audio fragments" but a very carefully chosen set of audio fragments or a specifically designed test signal.
C. Quite commonly the test subjects are "trained" for 20 or 30 minutes before the actual DB test, especially if the difference being tested is not one we're accustomed to identifying. This is typically done using either test signals or a section of modified music that's had the specific difference added at an artificially very high level, to make it easily identifiable and then over the course of the training the amount of this added difference is reduced. This acclimatises/sensitises the subjects to the difference being tested, optimises their chances of detecting the difference during the test and makes the results applicable to others with a higher degree of confidence.
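The "statistical relevance of the deviation from 50%" mentioned in the question can be made concrete with a one-sided binomial test. A sketch (the 5% significance criterion used in the comments is just the common convention, not anything mandated by ABX itself):

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials`
    by pure guessing (p = 0.5 per trial, one-sided)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 12/16 only just clears the usual 5% criterion:
print(round(abx_p_value(12, 16), 3))   # 0.038
# 9/16 is "above 50%" but nowhere near significant:
print(round(abx_p_value(9, 16), 3))    # 0.402
```

This also illustrates why a non-significant result only tells you these subjects didn't demonstrate a difference under these conditions, not that no one could.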

[1] I'm seeing a major flaw here in terms of how the idea of "subjective vs objective" is being conflated with the difference between blind and sighted tests. Doing a double blind test does NOT ensure that you get objective results...
[1a] For one thing, if you have an expectation bias NOT to hear a difference between two devices, a simple blind or abx test will do nothing to eliminate it.
[2] For another thing, a blind test doesn't eliminate purely subjective responses; for example, even in a blind test, a subject might PREFER the sound of a device with more distortion.
[2a] (And, even if you're supposedly testing for "whether they hear a difference or not" they may still apply a subjective standard to reporting "tiny differences" or "significant differences".)
[3] There is a classic example of "visual bias" that is often described in psychology textbooks...
It is in fact a demonstration that, due to the complex way in which our brains are wired, input from one sense may affect how we interpret input from a different sense.
We actually, OBJECTIVELY, experience the taste of the steak differently when we see a green color associated with it - and an MRI will show a different sensory response in our brain.

1. Yes it does, unless it's a flawed double blind test, in which case one may still have some confidence in the result but obviously not as much.
1a. Yes it will, unless it's a flawed blind or ABX test. Firstly, an expectation bias not to hear a difference doesn't necessarily result in a false negative. A number of times I've done ABX tests not expecting to detect a difference but have, or even thought during the test that I wasn't detecting a difference but the results demonstrated otherwise. This isn't always the case of course, in which case "a simple blind or ABX test" is flawed and something other than a "simple blind or ABX test" is called for (in order to arrive at a higher confidence level result), for instance a more complex ABX test: a bigger/wider sample size, for example, and/or some control iterations (e.g. one or more of the samples has an artificially high difference that's easily discernible).

2. I don't understand this assertion, a blind test is not designed to "eliminate purely subjective responses" it's designed to test them! What a blind test is designed to eliminate is biases which affect our "purely subjective response" (hearing). Most obviously, a blind test is designed to eliminate the biases introduced by sight and leave us with just the "purely subjective response" of hearing. If you want to eliminate subjective responses then you can't use ABX or any listening test, you have to use an objective measurement.
2a. If you're testing for "whether they hear a difference or not", it doesn't matter if the subjects detect a tiny difference or a massive one, just that they detected a difference.

3. Sorry, that's completely illogical. If two pieces of steak objectively have the exact same flavour, texture, etc., but we experience a difference between them, then by definition that experience cannot be objective, it must be subjective! And, if our subjective response (of taste) is influenced by a different sense, say sight, then of course there will be some difference in sensory response in our brain, it will obviously include more activity in those areas of the brain responsible for processing vision and probably some differences in those areas of the brain combining all the information to create the overall/final perception.

G
 
Jul 8, 2019 at 7:31 AM Post #13,195 of 17,336
Which headphones measure the overall best, i.e. the most "high fidelity" of all, not considering price class? I can imagine some pretty cheap model could share the frequency response of my Stax, but would they sound the same as a matter of fact? Or is figuring out fidelity a mix of many, many measurements? (I most often see the FR graph, but I don't understand any of the other measurements)
 
Jul 8, 2019 at 8:13 AM Post #13,196 of 17,336
Which headphones measure the overall best, i.e. the most "high fidelity" of all, not considering price class? I can imagine some pretty cheap model could share the frequency response of my Stax, but would they sound the same as a matter of fact? Or is figuring out fidelity a mix of many, many measurements? (I most often see the FR graph, but I don't understand any of the other measurements)

The Sony MDR-7506 (about $80) came out on top in a very careful study that took user-preference data into account, and they have come out on top a couple of times before that over the course of decades. No way they are as comfortable as your Stax though. A lot of people find them too emphasized in the mids and treble. The Sony MDR-V6 have their own Wikipedia page, which discusses the very similar Sony MDR-7506 at length.

My current faves for just knocking around are the Superlux HD 681s (about $30). They are full range and pretty flat and neutral, and very comfortable, really nice subjective sound for my taste, pretty close to top of the heap, but build quality is awful and you get negative bragging rights.

If you don't have Bosephobia, the QC35 IIs (about $350) are very comfy, measure quite well, and have a lot of the modern luxuries like wireless, noise cancelling, etc. Again, negative bragging rights.

You can check out the rtings.com web site, but their ratings were found not to correlate very well with user preference by Harman (a major manufacturer that has done a lot of research combining findings about what people prefer with objective measurements). Interestingly, Harman found that the group least accurate at predicting user preference was professional reviewers.
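As a rough illustration of how preference-correlated measurement scores of this kind tend to work, one common approach is to score a headphone by its deviation from a preferred target response. The target and measurements below are invented placeholders, not actual Harman or rtings data:

```python
# Crude sketch: rank headphones by RMS deviation (in dB) from a
# preferred target frequency response. All numbers are made up.
def rms_deviation(measured_db, target_db):
    """Root-mean-square deviation between two equal-length FR traces."""
    diffs = [(m - t) ** 2 for m, t in zip(measured_db, target_db)]
    return (sum(diffs) / len(diffs)) ** 0.5

target  = [0.0, 1.0, 2.0, 0.0, -1.0]   # hypothetical target, dB in 5 bands
phone_a = [0.5, 1.0, 2.5, 0.0, -1.5]   # tracks the target closely
phone_b = [4.0, 3.0, -2.0, 1.0, 2.0]   # deviates substantially

# Lower deviation = predicted to be preferred, all else being equal:
print(rms_deviation(phone_a, target) < rms_deviation(phone_b, target))  # True
```

Real preference models also weight frequency bands and penalize response roughness; this only shows the shape of the idea.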
 
Last edited:
Jul 8, 2019 at 8:35 AM Post #13,197 of 17,336
But are these headphones, as a matter of fact, the most high-fidelity ones, or simply those with the flattest FR graphs?
 
Jul 8, 2019 at 8:55 AM Post #13,198 of 17,336
But are these headphones, as a matter of fact, the most high-fidelity ones, or simply those with the flattest FR graphs?

They are all within a range such that it would be a matter of opinion, IMHO, along with many other headphones.

The MDR-7506s have been found to be the highest-fidelity headphones, full stop, on more than one occasion. Read the Wikipedia article. There are beauties and clunkers throughout the price spectrum. But a number of people just plain don't like the 7506s. : )
 
Last edited:
Jul 8, 2019 at 9:08 AM Post #13,199 of 17,336
But are these headphones, as a matter of fact, the most high-fidelity ones, or simply those with the flattest FR graphs?

They are all within a range such that it would be a matter of opinion, IMHO, along with many other headphones.

The MDR-7506s have been found to be the highest-fidelity headphones, full stop, on more than one occasion. Read the Wikipedia article. There are beauties and clunkers throughout the price spectrum. But a number of people just plain don't like them. : )

So if people believe that, say, the HD800 are better, is it a matter of opinion that would likely split 50/50? Or do people prefer lower fidelity? (I suspect a random group would prefer the HD800s over the Sony, statistically speaking.)
 
Jul 8, 2019 at 9:32 AM Post #13,200 of 17,336
Not really, although my reply wasn't directed specifically to your post, but to the whole discussion in general.
(Although I was sort of tagging onto your reference to blind testing...)

From a lot of what I read it seems to me that many people, especially on this forum, believe that "double blind tests eliminate the problem of a lack of objectivity".
There seems to be a somewhat dubious pair of assumptions that:
- without blind testing people and test results can't be objective
- when you perform a double blind test you always rule out the negative effects of bias

As if double blind testing were a "cure for all the ills of other types of testing and subjective opinions", as if performing a double-blind test will automatically produce an accurate result, and "people will convert to being objective once shown how foolish and inaccurate being subjective really is"... I merely wanted to "put double blind testing in its place" - which is as an excellent way to eliminate one particular sort of expectation bias, and nothing more or less than that.

did you perhaps quote the wrong post? because in the one you quote I didn't mention objective or subjective once ^_^.
 
