you cannot trust your eyes, so why trust your ears?

mike1127 · Jul 7, 2009 at 11:56 PM

Quote:

Originally Posted by royalcrown
The measurements told you plenty of information: it was written at a specific grade level, and can be comprehended by an approximate percentage of the world.

The only problem was they were wrong. When presented with a paragraph with crucial grammatical errors and poor sentence construction, they blithely went right ahead and produced a number.
Quote:

Ph0rk's score indicates that his post was, according to wikipedia, "easily understandable by 13- to 15-year old students" (Flesch-Kincaid readability test - Wikipedia, the free encyclopedia).

Ha.

Quote:

I've said it before, and I'll say it again: the purpose of audio measurements is not to tell people how something sounds. They exist to determine whether or not a device is beyond audible limits or to determine differences between components (i.e. whether or not a difference exists, not how that difference will perceptually manifest itself).

I don't think there is any agreed-upon reason that measurements exist. However, measurements generally fail to determine whether components sound identical for the same reason they fail to predict whether something sounds good.

Perhaps this example will make it clearer. If two paragraphs receive the same Flesch-Kincaid score, does that tell you the two paragraphs contain the same text?

Quote:

Besides, your claim that "people who favor the use of measurements don't seem to realize that they are an extremely tiny peek into a universe of behavior" has no substantiation.

It has an easy substantiation! Incredibly trivial. Measurements are made with only a small subset of possible inputs to a device. Q.E.D.

Quote:

Where has it ever been proven in the course of history that the audio measurements we have are incomplete?

I think the proof would have to go the other way.

mike1127 · Jul 7, 2009 at 11:59 PM

Quote:

Originally Posted by ph0rk
To be honest though, I think that because I write and speak so much at my day gig, I tend to treat web forums as more of a conversation than written discourse, which is why I rarely point out the typos and grammar of others.

I don't believe I have EVER pointed out someone's grammar on a web forum before---but the subject became apropos as soon as you wanted to blame PhilS for not understanding your paragraph, and tried to justify its readability through measurements, and RoyalCrown took your angle and further tried to insult PhilS and myself.

ph0rk · Jul 8, 2009 at 12:38 AM

Quote:

Originally Posted by mike1127
I don't believe I have EVER pointed out someone's grammar on a web forum before---but the subject became apropos as soon as you wanted to blame PhilS for not understanding your paragraph, and tried to justify its readability through measurements, and RoyalCrown took your angle and further tried to insult PhilS and myself.

Fair enough.

PhilS probably would have gotten less snark if he was more particular about which sentence/bit he didn't grok - I don't write poetry (even sober) but it wasn't -that- bad. To just throw up one's hands and give up made my first thought be that he didn't even try. To quote an entire paragraph and say "I don't get you" isn't too far removed from replying "wut?".

Of course I get the impression that people don't read the entirety of a post pretty often here and elsewhere, so I shouldn't read too much into it.

Goodness, we are about as far off topic as we can be.

PhilS · Jul 8, 2009 at 1:09 AM

Quote:

Originally Posted by ph0rk

PhilS probably would have gotten less snark if he was more particular about which sentence/bit he didn't grok - I don't write poetry (even sober) but it wasn't -that- bad. To just throw up one's hands and give up made my first thought be that he didn't even try. To quote an entire paragraph and say "I don't get you" isn't too far removed from replying "wut?".

Yeah, perhaps I should have tried to explain better why I was confused. I didn't quite know where to begin, though. In any event, let's return to the topic, or things reasonably related to it. The other stuff is water under the bridge as far as I'm concerned.

royalcrown · Jul 8, 2009 at 3:06 AM

Quote:

Originally Posted by mike1127
I think PhilS's point was that if you can't rule out "PhilS is an idiot" based on his other posts on this thread, then you are being insulting. I don't think anyone says you are "attacking" him, but you and Ph0rk are definitely not as polite as PhilS. (Neither am I.) I do think that Ph0rk's paragraph can be demonstrated fairly convincingly to be far harder than a 5th grade level, and generally "hard to read" to a non-technical specialist.

I was able to guess what Ph0rk meant... but my job actually involves decoding poorly written computer software and documentation. I practice this skill regularly. In fact, of all the people in my group, I'm the best at it---partly because I have developed certain reading techniques and see it as a challenge.

In other words, I'm no fifth grader.

I don't rule anything out based on assumption alone. If that makes me impolite, so be it, but I'd rather be impolite than dishonest. Besides, the point of my post wasn't to implicate anything about anyone - I wasn't even defending ph0rk. I just saw an interesting topic (how do we apply measurements to other fields, and how does this link to audio measurements) and tried to keep the focus of the thread going.

Quote:

Originally Posted by mike1127
The only problem was they were wrong. When presented with a paragraph with crucial grammatical errors and poor sentence construction, they blithely went right ahead and produced a number.

Well for one, the only mistake you pointed out was misspelling "hear" for "here," which is hardly a crucial mistake (they're homonyms). Furthermore, grammatical errors don't necessarily lead to an unclear sentence - they can inhibit clarity, and in some cases completely destroy meaning, but that doesn't mean that's necessarily the case. Completely correct grammar and spelling are neither necessary nor sufficient conditions for a clear paragraph. I don't have the link on hand, but I distinctly remember a paragraph I've read several times where every word was misspelled but the paragraph was still easily readable.

Quote:

Originally Posted by mike1127
Ha.

Sure? I sent the link over to my friends who have little brothers in middle school, and none of them found it particularly hard to read (though I imagine their minds were elsewhere). It's not compelling proof of anything, but waving away something without giving a reason for it is no more of a response than going "LOLOLOLOLOLOL" in a flashing annoying font.

Quote:

Originally Posted by mike1127
I don't think there is any agreed-upon reason that measurements exist. However, measurements generally fail to determine whether components sound identical for the same reason they fail to predict whether something sounds good.

You're right in that there is no agreed-upon reason that measurements exist. However, that doesn't mean that one can simply point out one aspect where measurements don't work, and then from that conclude that measurements are not useful. Measurements have uses, even if they're not a cure-all for everything, and omitting measurements altogether on the basis that they're incomplete ignores the uses that they do have.

That said, you stated that measurements fail to determine whether components sound identical for the same reason they fail to predict whether something sounds good. But what reason is that in the first place? More importantly, when have measurements ever failed to determine whether components sound identical?

Quote:

Originally Posted by mike1127
Perhaps this example will make it clearer. If two paragraphs receive the same Flesch-Kincaid score, does that tell you the two paragraphs contain the same text?

Of course not - but that doesn't mean that the measurements aren't useful. The text may differ, but the Flesch-Kincaid is a metric of readability. This is a perfect example of how missing the purpose of the measurements leads to an incorrect conclusion. The Flesch-Kincaid metric wasn't designed to tell whether or not two texts are identical; it was designed to assess readability, which it does just fine. Likewise, audio measurements aren't designed to tell you what sounds good; that's probably not possible in the first instance. But that doesn't mean that audio measurements should be disregarded altogether - that's throwing out the baby with the bathwater.

Quote:

Originally Posted by mike1127
It has an easy substantiation! Incredibly trivial. Measurements are made with only a small subset of possible inputs to a device. Q.E.D.

Say you have a circuit that consists of one LED and a battery supply. Once connected, the LED will be of a specific brightness (assuming the battery doesn't cook the LED). Now say you have the battery supply, one resistor, and then the LED. Depending on the resistor value, the LED will dim to some degree. Let's say the resistor was of the proper value, and the LED brightness is cut in half. If you measure the input and output, you'll notice the voltage will diminish according to the resistor used. In this instance, the resistor is doing all sorts of things: it's affecting the noise level, adding some inductance, and more stuff that engineers can explain better than I can. However, if all I want is the LED to be lower in brightness, I don't care about the "world" of other stuff the resistor is doing. I just want the LED down. Perhaps I should rephrase my question: do you have any substantiation that this unmodeled behavior matters? Perhaps theoretically there's some behavior we haven't measured, but there's no proof that said behavior actually matters when it comes to audio.

Quote:

Originally Posted by mike1127
I think the proof would have to go the other way.

How do you plan on showing that measurements are incomplete other than demonstrating that two components that measure identically can be audibly distinguished from each other?

Bullseye · Jul 8, 2009 at 8:19 AM

Quote:

Originally Posted by mike1127
Hey Bullseye, you've said that you want to do good in the world by preventing people from wasting their money on exotic cables. I think you will have more success if you learn good communication skills. We've said it before, and I'll say it again---putting an emoticon after an insult does not make it "not an insult." And if these insults are coming out of you because English is not your first language, you might want to learn a few things about English.

Mike, I don't know if you are having any trouble dealing with sentences or if you feel attacked by absolutely everything, but you are being oversensitive about a few words that are not even directed at you...

You are being paranoid. No, it is not an insult, it is a fact!

Royalcrown, did you feel insulted after my sentence? Did I hurt your feelings in any way by saying in an ironic way you should go "cry to your bed saying Why are these believers so bad with me?"

mike1127 · Jul 8, 2009 at 9:37 AM

(Note: I wrote this post late at night so numerous edits were necessary today to clarify it. You might want to make sure you are reading the new version. Yes, I am included among the people who can sometimes write terrible grammar!)

Royalcrown,

Perhaps I have given the mistaken impression that I think measurements are totally useless or have no basis in reality. A lot of your reply here doesn't really address my point(s), but perhaps I have not made my point(s) clear.

To me, the question is: how well do measurements correlate with perceived sound? A related question is: how complete are our models of audio devices; that is, how significant is unmodeled behavior? You say that measurements are useful for deciding if two devices sound similar, and that using measurements to decide how something will sound subjectively is the wrong use of measurements.

But I never expected a measurement to tell me "what a speaker sounds like" in an aesthetic sense. However, there is some relationship between distortion measurements and sound quality. A device with high distortion---I mean really high distortion---always sounds bad. A device with less distortion will sound better (better fidelity to the original). The question is: how far can you take this relationship? Are there any measurements that are closely correlated with sound quality? Consider measurement X: suppose a device with 1% X distortion sounds accurate, 2% sounds so-so, and 3% sounds bad. That would be a close correlation.

If a particular set of measurements is so complete that it leaves no significant unmodeled behavior, then we can use that set to determine if two devices are identical.

If you survey audio engineers, you will find a range of opinions about measurements. Some engineers say they are very useful, and will say they've found a way to relate almost any perception to a measurement. (In other words, if I report that a speaker sounds "warm" they will say, "Yup, it has a 250 Hz bump.") Other engineers say that measurements are nearly useless except for diagnosing gross types of distortion.

Why such a range of opinions? Probably because people are listening for different things. In the general "area" I move in---that is, the type of equipment and music I use---measurements have not proven to be very useful. So I downplay the usefulness of measurements.

This is not to say measurements are useless in principle. It's just that in my "area", we don't have any good ones that really correspond to sound fidelity in a more refined way than detecting gross distortion.

Quote:

Originally Posted by Royalcrown
Quote:

Originally Posted by mike1127
Perhaps this example will make it clearer. If two paragraphs receive the same Flesch-Kincaid score, does that tell you the two paragraphs contain the same text?

Of course not - but that doesn't mean that the measurements aren't useful. The text may differ, but the Flesch-Kincaid is a metric of readability. This is a perfect example of how missing the purpose of the measurements leads to an incorrect conclusion. The Flesch-Kincaid metric wasn't designed to tell whether or not two texts are identical; it was designed to assess readability, which it does just fine. Likewise, audio measurements aren't designed to tell you what sounds good; that's probably not possible in the first instance. But that doesn't mean that audio measurements should be disregarded altogether - that's throwing out the baby with the bathwater.

I was addressing the question of whether audio measurements are good for determining whether two devices are audibly the same. You raised this point yourself:

Quote:

Originally Posted by Royalcrown on measurements
They exist to determine whether or not a device is beyond audible limits or to determine differences between components (i.e. whether or not a difference exists, not how that difference will perceptually manifest itself).

First of all, Flesch-Kincaid is a measure of comprehensibility, not readability. I would like to see a fifth grader prove their comprehension of that paragraph by repeating it back in their own words.

The issue is not "what measurements are for." The issue is that any measurement of anything, whether it is Flesch-Kincaid or frequency response, is a kind of lossy compression. It takes a rich source of information and compresses it into a few numbers. That's why Flesch-Kincaid cannot tell you if two paragraphs are the same---because so much information has been lost. The same is true of audio measurements.
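To make the "lossy compression" point concrete, here is a minimal sketch of the Flesch-Kincaid grade-level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The syllable counter is a crude vowel-group heuristic and the sample sentences are made up for illustration; neither comes from the thread or from any particular readability tool. Because the score is built from three counts only, reversing the word order leaves it unchanged even though the result is nonsense.

```python
import re


def count_syllables(word):
    # Crude heuristic (an assumption for illustration): one syllable per vowel group.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text):
    # Flesch-Kincaid grade level, computed from three counts only.
    words = re.findall(r"[A-Za-z']+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59


# Hypothetical sample text, then the same words in reverse order.
original = "Measurements are useful. They do not capture everything we hear."
scrambled = " ".join(reversed(original.split()))

print(scrambled)
print(fk_grade(original) == fk_grade(scrambled))
```

The two texts differ, yet the metric cannot tell them apart, which is the sense in which a single number discards most of the information in its input.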

Quote:

Perhaps I should rephrase my question: do you have any substantiation that this unmodeled behavior matters? Perhaps theoretically there's some behavior we haven't measured, but there's no proof that said behavior actually matters when it comes to audio.

This question is at the heart of the matter. However, I think it's silly to presume that our measurements are complete, and that the "proof" is needed to show they aren't. It should be the other way. Our measurements are grossly simplified representations of complex behavior. No proof is needed for that. That is the definition of a measurement.

Consider codec research. Codec researchers have models of the ear/brain that predict whether the distortion induced by lossy compression is likely to be audible, and how audible. Through experimentation, both listening tests and developing understanding of neurology, those models have been refined.

If the models are good enough, there should come a time when codec researchers can predict the results of listening tests without having to do them. For example, if someone comes up with a new lossy compression algorithm, we should be able to apply a measurement to it and determine how it would fare in listening tests. (Because you are interested in determining if two devices sound the same, consider that equivalent to asking if a codec introduces audible distortion.)

If we are going to say our models are complete, then this should be true of any new codec... particularly if it's a radically new type of algorithm. No matter how new or different than what we've seen before, it should be analyzed via existing measurements, and any listening results predicted with perfect accuracy.

But I think most scientists would never say that you should stop running the listening experiments. There may come a time (or it may have come already) when so much success has occurred that it doesn't seem necessary to run the tests, but I'm sure that scientists would want to keep running some tests, especially with new ideas.

I am not a codec researcher, so we would really have to get one to comment on the question: how complete are our models of the ear/brain's response to lossy codecs? You asked: "Have we ever predicted two things to sound the same, and yet they prove to be distinguishable in a double-blind test?" I would guess this situation comes up frequently in codec research. If it didn't, the research would be over! We would know everything there is to know about codecs.

Now, codecs are a specialized area of knowledge. A particular type of device. If you are going to say our measurements are complete as a whole, that would apply to all audio devices... amplifiers, DACs, headphones, etc. (*)

It seems to be a very, very bold claim to claim that we have a way to completely characterize the behavior of all these devices. Consider what a brilliant accomplishment it would be to achieve this just for codecs... and codecs are probably THE easiest type of device to run ABX tests on.

Because I'm not an audio engineer, I don't have a lot of field data. EDIT: let me qualify my resume a bit here. I took a class in college in which I designed speakers and worked with an audio engineer. We did a lot of measurements, and none of them correlated with sound quality, especially not frequency response or harmonic distortion. After college I worked for a company that serviced audio test equipment and I spent some time at Harman, so I was able to observe in an indirect way their process for evaluating speakers through measurements. This was Floyd Toole, supposedly the king of scientific/subjective evaluation. And their speakers sounded terrible. Something was wrong with their approach.

Later I discovered that LP is a higher-resolution medium than CD. There are no measurements to explain that. I have some of my own guesses, though... I think the impulse response of a system may be key, and LP and CD have very different impulse responses (because CD has a brick-wall filter at 22.05 kHz). But there is no single number that demonstrates less distortion in analog. I take this to mean we haven't found the right measurement.

(EDIT - clarified this terrible sentence!) Part of my opinion comes from this fact: the good pieces I own are made by companies who don't rely exclusively on measurements and put subjective evaluation as a high priority; while companies who emphasize measurements over all else make terrible-sounding stuff (to my ears).

(*) EDIT: to say we can completely characterize a device doesn't mean that we can predict how that device will sound subjectively in all circumstances, but it means there are no significant unmodeled effects, so that two devices can be compared for audible equality with confidence.

gotchaforce · Jul 8, 2009 at 10:14 AM

i made a thread similar to this a while ago.. I'll go and read it again, let's see if it was the same type of replies!

http://www.head-fi.org/forums/f21/tw...y-ears-301377/

hmm well it went off on a different track probably because my original post was a sarcastic bait..

i think this thread got off on the wrong foot, saying your eyes can be tricked by optical illusions doesn't really mean anything for your sense of hearing. Your sense of hearing definitely does get tricked by the mere power of suggestion though (in the form of price, specific details to listen to, or just plain saying something will sound better after X or Y event happens). basically saying "trust your ears" is dumb because there is no way to isolate what your ear is hearing and what your mind is THINKING it's hearing. The appropriate thing to say would be "trust your ears with the aid of a test that takes out bias"

PhilS · Jul 8, 2009 at 5:31 PM

Are there any comparisons where you can trust your ears without a blind test? If you're using the headphone out from a cheap receiver to listen to your headphones, and you buy a nice Ray Samuels amp (let's say the Apache, for example), and the RS amp sounds better to you, are you entitled, without a blind test, to (1) tell others on this forum with some degree of reliability or respect that the RS amp is an improvement, and (2) have confidence that your ears are actually hearing an improvement? Or does the fact that you did not conduct a blind test in this instance undercut your entire conclusion and basis for offering your opinion?

P.S. Assume for the purposes of argument that in this instance we cannot compare measurements.

haloxt · Jul 8, 2009 at 5:43 PM

Why would you even compare the rs amp and the apache, they both measure the same and any differences as shown by the measurements would be inaudible. While we're at it, I have ipod touches I will trade for your super-sized dacs and amps, you just pay shipping of your dacs and amps so long as they retail for $500+ and I will mail you a 2gb ipod touch in exchange. That way we can save the world from global warming by recycling your super-sized dacs and amps and you will get a cool piece of audio equipment that you can carry around with you everywhere you go.

Dane · Jul 8, 2009 at 5:49 PM

1) I don't see any problems with sharing one's experiences, otherwise this forum would be quite boring. I read it and take it for what it is.

2) You can have full confidence that you experience an improvement when you experience an improvement

As such your music experience just got better - whatever the cause is.

I guess that as long as people report what they personally experience and don't pretend that it is an absolute truth, then there's no problem.

PhilS · Jul 8, 2009 at 5:59 PM

Quote:

Originally Posted by Dane
1) I don't see any problems with sharing one's experiences, otherwise this forum would be quite boring. I read it and take it for what it is.

2) You can have full confidence that you experience an improvement when you experience an improvement

As such your music experience just got better - whatever the cause is.

I guess that as long as people report what they personally experience and don't pretend that it is an absolute truth, then there's no problem.

I agree with you, but I suspect your viewpoint is more liberal or open-minded than some other folks who are participating in this thread or who participate regularly in this sub-forum. We will see (if they choose to answer the question).

Real Man of Genius · Jul 8, 2009 at 6:36 PM

Quote:

Originally Posted by PhilS
Are there any comparisons where you can trust your ears without a blind test? If you're using the headphone out from a cheap receiver to listen to your headphones, and you buy a nice Ray Samuels amp (let's say the Apache, for example), and the RS amp sounds better to you, are you entitled, without a blind test, to (1) tell others on this forum with some degree of reliability or respect that the RS amp is an improvement, and (2) have confidence that your ears are actually hearing an improvement? Or does the fact that you did not conduct a blind test in this instance undercut your entire conclusion and basis for offering your opinion?

P.S. Assume for the purposes of argument that in this instance we cannot compare measurements.

You are entitled to say what you please.
As far as credibility, I would say it depends on the nature of the claim and who is making it.
You seem a reasonable fellow so I would heed your opinion without a blind test in the example you gave which is hardly farfetched, IMO.
If it was someone like haloxt saying how much amazingly better his Super Expensive Cable is then I would give it the credence it (or him) deserves: none whatsoever. In that case only actually witnessing a successful DBT would sway my opinion. The difference being the nature of the claim (highly unlikely, IMO) and the credibility of the person making the claim (zero).
I am more likely to believe someone I respect.
I am more likely to believe a claim that makes sense.
I am less likely to believe someone I do not respect.
I am less likely to believe a claim that makes no sense.
In the case of a respected person making an unlikely claim?
It would bear investigating. Possibly by a DBT if applicable.

haloxt · Jul 8, 2009 at 7:34 PM

Only a philistine gives merit to what is said based on how much respect he has for the person speaking. Truly you are a real enemy of genius.

mike1127 · Jul 8, 2009 at 10:09 PM

The question came up earlier on this thread: do scientists have an agenda?

As some of you may have gathered, I have serious reservations about what audio scientists say about audio reproduction. However, I don't think that scientists are irrational, and I don't have a conspiracy theory.

Scientists like to understand things. It is satisfying (particularly for a scientist's kind of personality) to find models of the world that successfully predict things.

How should we run listening tests to determine if two devices are identical? Quick-switch ABX testing is a practical method.

Treating the human mind as a "black box" or an "instrument" is a good first assumption. What do I mean by "treating the human mind as an instrument"? Instruments have the following properties:

If you give an instrument the same signal on two different occasions, it gives the same response (within the limits of noise).
Instruments don't care about context---they don't care what signals came earlier or what is coming next.
Instruments don't have expectations.
Instruments have a particular sensitivity and they are susceptible to noise.

So it is very practical to treat the human mind as having these properties. The quick-switch ABX tests are designed, then, to find the limits of sensitivity and noise of the human mind.

But as I have pointed out, human minds are not instruments in this sense. And quick-switch ABX is a context that changes the situation, compared to listening to music for enjoyment.

What I am told by scientists on the internet is that they've had a great deal of success modeling and predicting the brain's response to certain kinds of signals and distortion mechanisms, under quick-switch ABX conditions.

I take this to mean that the brain behaves consistently under quick-switch ABX conditions. I also take this to mean that the scientists have discovered what kinds of signals and distortions are detectable under those conditions.

I think that scientists are satisfied with this success. Codec theory has also been very useful commercially. So it is natural that they would resist the idea there is an entire unexplored domain---namely, how the brain behaves in other contexts. That would challenge the idea that they have good, and fairly complete, theories.

I am often asked, "But where's the evidence? Can you point to an experiment that shows greater sensitivity to a particular distortion under long-term listening?"

Well, I don't "get around" a whole lot, so I'm not sure. Maybe such an experiment is out there. However, I can point out why the practical difficulties are so enormous that it is unlikely that a fair test has ever been done.

In a long-term-listening test, trials take a long time. Then there is the question of how many trials need to be run to avoid Type II error (failing to reject the null hypothesis that the devices are the same when they are in fact audibly different). This relates to p, the probability that the subject gives the correct answer. If the test subject can tell two devices apart with perfect accuracy, p is 1.0. If the test subject is purely guessing, p is 0.5. We have no way of knowing what p is for any particular listener, but if p is only slightly above 0.5, an enormous number of trials is needed to reduce the chance of Type II error. If p is 0.6, then something like 100 trials are needed!
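The trial-count argument can be checked with a short calculation. The sketch below assumes a one-sided exact binomial test against guessing (p = 0.5) at a 0.05 significance level; the post does not say which test or error rates are intended, so those choices and the function names are illustrative only. It finds how many correct answers are needed to reject "guessing", then computes how often a listener with a given true p would reach that count. The Type II error rate is one minus that power.

```python
from math import comb


def tail(n, k, p):
    # P(X >= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n, alpha=0.05):
    # Smallest number of correct answers that rejects "pure guessing" at level alpha.
    # k = n + 1 gives an empty sum (probability 0), so the loop always returns.
    for k in range(n + 2):
        if tail(n, k, 0.5) <= alpha:
            return k


def power(n, p, alpha=0.05):
    # Probability that a listener with true hit rate p passes the test.
    return tail(n, critical_value(n, alpha), p)


# Assumed trial counts, chosen only to show the trend for a listener with p = 0.6.
for n in (16, 50, 100, 200):
    print(n, critical_value(n), round(1 - power(n, 0.6), 3))
```

Each printed row shows the number of trials, the passing score, and the Type II error rate under these assumptions, so the "something like 100 trials" estimate can be compared against whatever error rate one is willing to accept.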

Can you imagine a long-term-listening test that has 100 trials? How long would that take? Years?
