The Dishonesty of Sighted Listening Tests
Dec 2, 2016 at 5:13 AM Post #31 of 94
  I don't think they're inherently designed to make a specific group of people look bad...that might very well just be a by-product. I mean, the point of an ABX test is to take out as many variables as possible and to test things in an objective manner rather than to introduce any sort of subjective biases (whether those be intentional or subconscious). Is that not pretty much the definition of scientific testing?

It's very hard to remove variables when dealing with something as complex as human perception. I think the right kind of long-term listening test, with a black box that randomly switches things and then requires the person to press a button or something to "log" the change, might be good. It would satisfy me anyway. My point is simply that a lot of the tests don't take into account that people get "into" music and hear it better at certain times than others.
 
Personally I do think there are audible differences, but they are so small they don't contribute to the experience.
 
Dec 2, 2016 at 6:09 AM Post #32 of 94
 
Of course you'll hear a difference between speaker A and speaker B!!
This thread is not about using ABX to find out if you can tell A and B apart. Read it please.
 
Sean Olive showed there were differences in preference ratings between sighted and blind tests comparing a particular set of speakers.
The conclusion is that sighted tests are dishonest, which I think most of us agree with. What I'm saying is that in blind testing (read: listening to the speakers without knowing which speaker is playing and/or without knowing their placement, as in Olive's tests) you can build up all sorts of biases and trick yourself with eyes closed as well. Then you'll rate speakers A, B, C and D based on extremely subjective stuff anyway, because you're not a measurement tool. You might come back two days later and rate the speakers differently, let alone if you use different recordings during the test and/or listen at different levels.
 
You can remove one single variable (by going blind) but the rest is purely subjective anyway. And that's the way it is. Even matching levels is flawed because you cannot match at all frequencies.
I've done many blind tests in the past as well as sighted tests. In my experience the best way of noticing subtle differences and defining personal preferences is extended exposure to the system. Both sighted and blind tests with A/B switching within short periods of time yield pretty poor results in my experience, since you cannot focus on everything when listening to music and the whole comparative experience tends to become overwhelming and confusing.


Sounds like you failed to grok Sean Olive's work and what it means.  All of your objections to it have been tested and worked out by him and others doing such work already.  How can you develop biases and trick yourself if you don't know what you are listening to?  Somehow large groups of people tested have nearly all tricked themselves the same way and ended up agreeing with what sounds best to them.......over years.... from different countries and cultures and backgrounds?  Hell of a trick that.  Telepathic or psychopathic?  Pretty funny ideas.  Oh, and the usual rant of the put upon that subtle differences become clear over time as the short experience is overwhelming.  Hahahahaha!  I guess all those who have taken part in the tests would all agree over time to some other standard over longer auditioning, right?  Or all prefer a different standard, one which you would somehow twist around to being the more accurate result.
 
Dec 2, 2016 at 6:13 AM Post #33 of 94
  It's very hard to remove variables when dealing with something as complex as human perception. I think the right kind of long-term listening test, with a black box that randomly switches things and then requires the person to press a button or something to "log" the change, might be good. It would satisfy me anyway. My point is simply that a lot of the tests don't take into account that people get "into" music and hear it better at certain times than others.
 
Personally I do think there are audible differences, but they are so small they don't contribute to the experience.


You do realize the test you describe has been done before, don't you?  Just as you describe.
 
Look for info from Tom Nousaine.  An article titled "Flying Blind: The Case Against Long-Term Listening".
 
Results don't support your position.
 
Dec 2, 2016 at 7:01 AM Post #34 of 94
 
Sounds like you failed to grok Sean Olive's work and what it means. 
 
Me x3: it might sound like that to you, but then you are no reference at all.
 
All of your objections to it have been tested and worked out by him and others doing such work already. 
 
Me x3: I'm not here to argue against Olive's work, I'm here to point out that blind tests can be dishonest, something that you cannot deny.
 
How can you develop biases and trick yourself if you don't know what you are listening to? 
 
Me x3: Read my previous posts and you'll find out. The simple idea 'I think this one has more bass' can lead you to think the next one has less bass. You can get bored of the song the second, third, fourth time you play it, and that's a trick your mind plays. Then you might say the first loudspeaker had better PRaT or whatever.
 
Somehow large groups of people tested have nearly all tricked themselves the same way and ended up agreeing with what sounds best to them.......over years.... from different countries and cultures and backgrounds? 
 
Me x3: If you study Olive's work you'll realize there's no such consensus among people, and even with averaged results, different studies by Olive himself lead to slightly different conclusions. Different types of listeners have different preferences and so on...
 
Hell of a trick that.  Telepathic or psychopathic?  Pretty funny ideas.  Oh, and the usual rant of the put upon that subtle differences become clear over time as the short experience is overwhelming.  Hahahahaha!  I guess all those who have taken part in the tests would all agree over time to some other standard over longer auditioning, right?  Or all prefer a different standard, one which you would somehow twist around to being the more accurate result.
 
Me x3: Extended exposure increases the chances of getting a better understanding of the properties of a given reproduction system, in the same way that if you watch the same movie 100 times you'll know it better than if you watch it a single time.

 
Do yourself a favor and go study (or at least try) before coming here and making us waste time explaining to you why your thoughtless "Hahahaha" posts are of no use.
 
Dec 2, 2016 at 7:33 AM Post #35 of 94
I can't find any reference to the Nousaine article online other than a suggestion that it was about testing CDs sent in the mail.
 
I highly doubt the skeptics would accept any results from an in-home experiment.
 
I still believe that the perception of sound is a difficult variable. Yes, it is suggestible, but it can also be highly acute at certain times and not at others. The key is to test it at those times.
 
If an amp were switched at a moment when one is highly focused on the sound, I believe it would be noticed.
 
I also don't think the differences in modern amps appreciably impact the experience, i.e. tubes don't sound better.
 
Dec 2, 2016 at 12:28 PM Post #36 of 94
  I can't find any reference to the Nousaine article online other than a suggestion that it was about testing CDs sent in the mail.
 
I highly doubt the skeptics would accept any results from an in-home experiment.
 
I still believe that the perception of sound is a difficult variable. Yes, it is suggestible, but it can also be highly acute at certain times and not at others. The key is to test it at those times.
 
If an amp were switched at a moment when one is highly focused on the sound, I believe it would be noticed.
 
I also don't think the differences in modern amps appreciably impact the experience, i.e. tubes don't sound better.

 
I imagine a test where we take the subject's audio chain and put it into a black box with another chain. The subject then presses a button to turn on the black box, which will randomly choose one of the chains to use (all this is calibrated for volume output to a set of cans/speakers and has other necessary controls). The subject can then, at any time, indicate his satisfaction with what he is hearing out of the black box. The system stores the chain choice, the satisfaction scores, and hopefully content metadata, and we analyze it all after some longish amount of time. This would seem to be a way to satisfy those who think these kinds of tests need to take years.
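
To make the idea concrete, here is a rough sketch of the logging side of such a box. Everything in it is hypothetical (the `BlackBox` class, the two chain labels, the 1-10 score scale); it only shows the random choice being hidden from the listener and revealed at analysis time, not the level matching or the actual switching hardware.

```python
import random
from collections import defaultdict
from statistics import mean


class BlackBox:
    """Hypothetical switch box: hides which of two chains is playing."""

    def __init__(self, chains=("A", "B"), seed=None):
        self.chains = chains
        self._rng = random.Random(seed)
        self._active = None
        self.log = []  # (chain, score, metadata) tuples, hidden from the listener

    def power_on(self):
        # Pick a chain at random each session; the listener is never told which.
        self._active = self._rng.choice(self.chains)

    def rate(self, score, metadata=None):
        # The listener logs a satisfaction score (say 1-10) whenever they like.
        if self._active is None:
            raise RuntimeError("power_on() must be called before rating")
        self.log.append((self._active, score, metadata))

    def summary(self):
        # Unblinded only at analysis time: (mean score, count) per chain.
        by_chain = defaultdict(list)
        for chain, score, _ in self.log:
            by_chain[chain].append(score)
        return {c: (mean(s), len(s)) for c, s in by_chain.items()}
```

If the per-chain means in `summary()` end up about the same after months of ratings, the long-term argument has had its fair shot.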
 
Dec 2, 2016 at 1:29 PM Post #37 of 94
  This would seem to be a way to satisfy those who think these kinds of tests need to take years.

 
Or people could just accept what is already established....cognitive inputs affect sensory perception.
 
Anybody who has listened to music while high knows this. :)
 
Dec 2, 2016 at 2:01 PM Post #38 of 94
   
Do yourself a favor and go study (or at least try) before coming here and making us waste time explaining to you why your thoughtless "Hahahaha" posts are of no use.

 
What do "personal preferences" have to do with ABX?
ABX does not set out to prove that A is objectively better (or different) than B.
It is all about the subject, and how he perceives A and B, with sight removed from the equation.
Clearly there are many other variables in the equation, like time of day, mood, health, being tired, sleepy, hungry, you name it.
But if A and B are tested relatively close in time (and with the usual good testing practices), you can assume the variables listed above to be constant.
Of course, it can even happen that, for a given subject, one day A != B (or > if testing for "better"), and another day B == A.
Which in the end means A and B are effectively indistinguishable by the subject (within measurement noise) once you remove sight.
That does not mean ABX is "wrong", even if folks with a personal business agenda would like to infer that.
Like "Sighted tests are wrong, but see, ABX tests are too, so here is a $1000 device you cannot effectively tell apart from a $100 one".
 
Dec 2, 2016 at 3:35 PM Post #39 of 94
general rant for no reason: 
 
when I assume that I'm in control of my senses and brain, I'm assuming that objective reality and my own subjective reality are one and the same. because me believing is 100% of the requirements for something to be subjectively true to me, when I tell others that something is true, it means nothing and has no conclusive value whatsoever. the total of my demonstration can be summarized as "please believe me as I believe myself". what do we call this? an opinion. what is the conclusive power of an opinion? not much.
the only way to tell objectively if a given variable impacts my judgment is to test with and without that variable. then we see if I get different results, and we know how immune I am to bias from that extra variable. that's the definition of blind testing. a testing method defines what we can prove, and what we wish to prove puts conditions on the testing method. when we go with sighted evaluation, we put almost no conditions on the test, so we get almost no conclusive result from it. as long as people claim to care about how a device sounds, they should test the sound cleared of as many other variables as possible.
 
I always feel like I'm explaining logic to a 10-year-old when this matter comes up. I don't like the brown and the green M&M's, it's always been like that. they don't taste as good to me. I completely agree that the red and yellow are the right choice for the ads because they're the best. that's how I honestly feel. and plenty of people feel the same so it must be true...
this is my subjective reality, my opinion. now if I close my eyes, and someone takes note of the color I pick and how I feel about it, will it become obvious that I really prefer the taste of the yellow and red ones? well here is the thing, unless I try under such controlled conditions, I'll never really be sure if taste has anything to do with it. I can talk for the next 10 years about how sure I am, and make a list of all the people who agree with me, I'll never actually have proof of anything because I will never be able to identify the M&M's in a sighted test without seeing the color. it's so damn obvious.
 
now seeing a guy with a white coat, some computer and a crap test aren't legitimate reasons to accept any result given to us. scientists have discovered that people are more likely to trust anything if you start the sentence by "scientists have discovered".

again, the test defines what is conclusive and what isn't, not the subjective idea that science is true or not, or how famous the guy talking to you is. if you try objectivity, go all the way. there is an unbreakable relation between the conditions of a test and what answer it can deliver conclusively. when someone set out to trick people with the fake pono test, what his test demonstrated was that people are easy to trick (nothing new under the sun). he didn't prove anything about the sound of the pono.
so just reading results isn't the objective way. when you understand how something is done and why it's done, you're better armed to understand what the results mean. trying to convince somebody that abx is right when the guy doesn't actually understand what abx is and what it should be used for is a hopeless job. in the same way, criticizing a testing method when we don't understand it is just noise.
 
 
is long term listening a factor in audio? the only way to verify is to test for it. empty conjectures are just that.
is abx stressful to people? that too can be tested. the first time you do an abx, you may not perform as well as you would after you've done it 50 times. so what's the answer to that question? to try a given test and do 20 or 50 series over a few weeks. then look at the results and see if there is a pattern showing improvement. if there is, then you'd know that you need time to do a more significant abx test. but if you've done 3 and keep whining that abx is stressful, you're nothing but an annoying kid. not all questions can be answered with a lazy method. if I reject abx because I need to learn how to do it effectively with minimum stress, then do I give up painting because the first 3 times I tried I made garbage? do I run away from sex because the first time was a lot of stress?
all I see are lame excuses and fallacies to justify being lazy and sticking to the comfy ignorance of sighted evaluations instead of actually seeking the truth.
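
A rough sketch of the "look for a pattern showing improvement" step, assuming you have logged the fraction of correct answers for each abx series in order. The function name and the choice of a plain least-squares slope are my own illustration, not a standard procedure.

```python
def improvement_slope(session_scores):
    """Least-squares slope of fraction-correct against series index."""
    n = len(session_scores)
    if n < 2:
        raise ValueError("need at least two sessions")
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(session_scores) / n
    # Covariance of (index, score) divided by the variance of the index.
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, session_scores))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den
```

A clearly positive slope over 20 or 50 series suggests practice is helping; a slope near zero suggests the first few tries were already representative.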
 
Dec 2, 2016 at 5:28 PM Post #40 of 94
  I can't find any reference to the Nousaine article online other than a suggestion that it was about testing CDs sent in the mail.
 
I highly doubt the skeptics would accept any results from an in-home experiment.
 
I still believe that the perception of sound is a difficult variable. Yes, it is suggestible, but it can also be highly acute at certain times and not at others. The key is to test it at those times.
 
If an amp were switched at a moment when one is highly focused on the sound, I believe it would be noticed.
 
I also don't think the differences in modern amps appreciably impact the experience, i.e. tubes don't sound better.


No CDs sent in the mail.  Sorry, until six months ago it was on the web.  Mr. Nousaine passed away in March of this year, so his web page of articles is gone.
 
He built a black box; that black box either did nothing or added distortion to the signal, at a level known to be definitely audible in quick-switching ABX testing.  People were allowed to take the box home, connect it up and listen, any way they chose and for as long as they chose.  Of course they were told not to open the box or test it with instrumentation.  Quite a number of these (I don't remember the exact count) were sent out to audiophiles with good hearing.  Most returned them after 6 to 11 weeks.  A few kept them several months.  You were to decide, when you felt you knew, whether the box was straight-through or one that altered the sound.  The number of correct responses was almost exactly at chance level.
 
He later had those same people come and do a short-term ABX test where they got to listen and switch immediately.  The same box was being switched in or out.  I forget if it was 100% or just close to it that now scored better than a 95% confidence level.  He proceeded with another couple of rounds where the amount of distortion was decreased.  I forget the exact amount, but with quick-switch ABX these same people, who as a group didn't reliably hear the distortion over weeks, were able to hear and reliably identify distortion at maybe 1/5 the level of the box they took home for long-term audition.
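
For anyone wondering what "level of chance" and "better than a 95% confidence level" mean in numbers, here is a minimal sketch of the usual one-sided binomial calculation for an ABX run. The function names are mine, and the trial counts in the note below are illustrative, not figures from Nousaine's test.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Each trial is a 50/50 guess, so count the outcomes with >= `correct` hits.
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials


def passes_95(correct, trials):
    # "Better than a 95% confidence level" read here as one-sided p < 0.05.
    return abx_p_value(correct, trials) < 0.05
```

With 16 trials, 12 correct gives a p-value of about 0.038 and clears the bar, while 11 correct gives about 0.105 and does not.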
 
Further tests have been done to see the effects of audition length and switching time.  Short segments and rapid switching give the most discriminating results.  Lengthening the audition, and most especially lengthening the time required to switch, reduces the acuity of the test procedure.  Audiophiles always feel much better about long-term listening.  They grow confident with time, especially in sighted testing.  That confidence doesn't seem to translate into better results in testing.
 
Your ideas aren't new to anyone, I have read them dozens and dozens of times.  Results when you don't know what you are listening to don't support your ideas. 
 
Dec 2, 2016 at 5:36 PM Post #41 of 94
   
What do "personal preferences" have to do with ABX?
ABX does not set out to prove that A is objectively better (or different) than B.
It is all about the subject, and how he perceives A and B, with sight removed from the equation.
Clearly there are many other variables in the equation, like time of day, mood, health, being tired, sleepy, hungry, you name it.
But if A and B are tested relatively close in time (and with the usual good testing practices), you can assume the variables listed above to be constant.
Of course, it can even happen that, for a given subject, one day A != B (or > if testing for "better"), and another day B == A.
Which in the end means A and B are effectively indistinguishable by the subject (within measurement noise) once you remove sight.
That does not mean ABX is "wrong", even if folks with a personal business agenda would like to infer that.
Like "Sighted tests are wrong, but see, ABX tests are too, so here is a $1000 device you cannot effectively tell apart from a $100 one".


The purpose of this thread is not judging ABX.
ABX is a fine practice to find out if you can (under those particular conditions) tell A and B apart.
If you want my opinion on that matter, it's a fine practice. With shortcomings, but other types of tests have normally more shortcomings.
 
You cannot objectively assume the variables listed to be constant. It's not that far from assuming sight won't change your preference.
 
You can prefer A at a certain moment in time because you liked A's bass more and you were testing for bass, then prefer B because it makes vocals stand out more, then prefer A because it has a more natural tonal balance for the recording used, then prefer B for having a more natural balance for a different recording. And that is far from A = B.
 
I never said ABX is wrong, I just said why blind testing can be dishonest when used to state preference.
Closing our eyes doesn't make us a tool designed to measure preference in an objective manner.
 
I really don't get the 1000usd vs 100usd device thing. Price means nothing to me. My 200usd K702 bests my 1000usd HD800 with some piano recordings, blind or sighted. My DALI Zensor 1 sounded better than my more expensive FOCAL Aria with certain recordings in certain rooms. My 70usd E10K sounds very similar to my 700usd Schiit Bifrost/Asgard2 combo. Using price as a measure of performance or quality is a senseless generalization normally used by those who tend to over-simplify complex problems.
 
Dec 2, 2016 at 5:56 PM Post #42 of 94
   
Do yourself a favor and go study (or at least try) before coming here and making us waste time explaining to you why your thoughtless "Hahahaha" posts are of no use.


Look at this.  It isn't the only one.  You will find that this one write-up of testing shows the lack of veracity of much of your post.
 
https://dl.dropboxusercontent.com/u/16343460/AES%20137%20The%20Influecne%20of%20Listeners%27%20Experence%2C%20Age%2C%20and%20Cultire%20on%20Headphone%20Sound%20Quality%20Preferences%20.key.pdf
 
Dec 2, 2016 at 6:01 PM Post #43 of 94
 
is long term listening a factor in audio? the only way to verify is to test for it. empty conjectures are just that.
is abx stressful to people? that too can be tested. the first time you do an abx, you may not perform as well as you would after you've done it 50 times. so what's the answer to that question? to try a given test and do 20 or 50 series over a few weeks. then look at the results and see if there is a pattern showing improvement. if there is, then you'd know that you need time to do a more significant abx test. but if you've done 3 and keep whining that abx is stressful, you're nothing but an annoying kid. not all questions can be answered with a lazy method. if I reject abx because I need to learn how to do it effectively with minimum stress, then do I give up painting because the first 3 times I tried I made garbage? do I run away from sex because the first time was a lot of stress?
all I see are lame excuses and fallacies to justify being lazy and sticking to the comfy ignorance of sighted evaluations instead of actually seeking the truth.

Preferring loudspeaker A over loudspeakers B, C and D on a blind test at a given moment in time, in a given room, with given recordings at given volume, doesn't objectively make A a better speaker than B, C and D.
 
Picking 1000 people and doing the same test on all of them then averaging the results to see what's the better speaker is of little use to the random individual in deterministic terms. You can be one of the minority who prefer speaker C for some reason.
 
You can train yourself for ABX and critical listening, but again I don't think that's the topic here. Stating preference is a subjective matter, and as such it could be labelled as being dishonest, sighted or blind. If Mr. Reviewer states he prefers speaker A over speaker B in his room, with his music, his listening levels and his personal preferences, it doesn't matter much if he was blind or not when doing the test; his results are useful for him under those conditions, which will eventually change. Going blind doesn't make the test objective, it just makes it 1 variable less subjective.
 
Dec 2, 2016 at 6:27 PM Post #44 of 94
 
Look at this.  It isn't the only one.  You will find that this one write-up of testing shows the lack of veracity of much of your post.
 
https://dl.dropboxusercontent.com/u/16343460/AES%20137%20The%20Influecne%20of%20Listeners%27%20Experence%2C%20Age%2C%20and%20Cultire%20on%20Headphone%20Sound%20Quality%20Preferences%20.key.pdf

It doesn't.

Picking a large number of people and averaging results won't provide useful information to you if you happen to be different from the average guy. That's why Olive's target response is useful for companies trying to please a bigger number of people, but it doesn't mean you'll like it best. Some people will always listen to heavy metal music while some people will always listen to opera. Olive's studies were made using sets of particular recordings. His studies are useful for Harman and for the sake of knowing what the average guy prefers, but they cannot say what's better for a given individual.
 
Most people prefer HD600 over DT880, but I prefer DT880 over HD600.
A friend of mine is a bass head, so he will like the one with more bass, and he doesn't care about mids and treble.
He won't care about the average guy at all, he will just look for the one with more bass.
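
A toy illustration of that point, with ratings I made up on the spot (they are not data from Olive or anyone else): the group average can pick one speaker while a particular listener clearly prefers another.

```python
from statistics import mean

# Made-up blind ratings (0-10) for illustration only; not data from any study.
ratings = {
    "listener_1": {"A": 8, "B": 6, "C": 5},
    "listener_2": {"A": 7, "B": 6, "C": 4},
    "listener_3": {"A": 8, "B": 5, "C": 6},
    "bass_head": {"A": 4, "B": 5, "C": 9},
}


def group_means(ratings):
    # Average score per speaker across all listeners.
    speakers = next(iter(ratings.values())).keys()
    return {s: mean(r[s] for r in ratings.values()) for s in speakers}


def favorite(scores):
    # Speaker with the highest score in a {speaker: score} mapping.
    return max(scores, key=scores.get)


print(favorite(group_means(ratings)))  # the group's pick
print(favorite(ratings["bass_head"]))  # one individual's pick
```

Here the group average favors A, while the bass head's own scores favor C, so the averaged result tells him little about what he should buy.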
 
Dec 2, 2016 at 8:47 PM Post #45 of 94
 You cannot objectively assume the variables listed to be constant. It's not that far from assuming sight won't change your preference.

 
Oh, no, it's far, and the whole point of these discussions.
Someone claiming "A is night and day better than B", then either failing miserably to tell A from B in an AB test, or, MUCH more commonly, vanishing like vampires at dawn as soon as AB tests are mentioned.
 
 You can prefer A at a certain moment in time because you liked A's bass more and you were testing for bass, then prefer B because it makes vocals stand out more, then prefer A because it has a more natural tonal balance for the recording used, then prefer B for having a more natural balance for a different recording. And that is far from A = B.

 
As I said, if your evaluation function changes its value as time varies, it means that function depends on time.
And if you consider A vs. B as a +/- function around zero, and its integral over time is close to zero, then it simply means that, according to your evaluation function, A and B are indistinguishable.
That has nothing to do with AB tests being unreliable.
On the contrary, if the same subject claimed in a sighted test that "A is night and day better than B", that totally proves the biasing effect of sight.
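
A rough sketch of that "integral close to zero" idea, assuming each blind session yields one signed score (positive favors A, negative favors B). The function names and the two-standard-error cutoff are my own rule of thumb for illustration, not a formal test.

```python
from math import sqrt
from statistics import mean, stdev


def net_preference(scores):
    """Mean signed preference (+ favors A, - favors B) and its standard error."""
    if len(scores) < 2:
        raise ValueError("need at least two sessions")
    m = mean(scores)
    se = stdev(scores) / sqrt(len(scores))
    return m, se


def indistinguishable(scores, z=2.0):
    # Rule of thumb (an assumption): the mean lies within z standard errors of zero.
    m, se = net_preference(scores)
    return abs(m) <= z * se
```

A listener who flips between +1 and -1 from session to session nets out to zero, which is exactly the "A and B are indistinguishable" case, while one who scores A higher every time does not.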
 
 

 
