Testing audiophile claims and myths

Discussion in 'Sound Science' started by prog rock man, May 3, 2010.
  1. KeithEmo
    The short answer to your first question is YES.
    Frequency response is just one of several measurements - although arguably the most important one.

    This is where it gets somewhat complicated - and where the disagreements tend to start.

    In most cases, differences in frequency response as small as 0.5 dB tend to not be especially noticeable.
    Some folks will debate endlessly about whether it's possible for a human to detect them...
    However, whether they can be detected under ideal conditions or not, they probably aren't important enough to matter.
    (If the frequency response zig-zagged up and down by 0.5 dB, you might notice something odd, but that is rarely the case in real life.)
    Likewise, while people will argue about the actual threshold for hearing THD, most of us agree 0.004% is very far below what would be audible.
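    As a quick sanity check on the THD figure (my own arithmetic, not from the post), a distortion percentage converts to dB relative to the fundamental via 20·log10(THD/100); 0.004% works out to roughly -88 dB, which is why it is generally considered far below audibility:

```python
from math import log10

def thd_percent_to_db(thd_percent):
    """Convert a THD percentage into dB relative to the fundamental."""
    return 20 * log10(thd_percent / 100)

print(f"{thd_percent_to_db(0.004):.1f} dB")  # 0.004% THD -> about -88 dB
```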

    If you look at those pictures of filter impulse response.......
    Impulse response is related to frequency response - but not directly - and not always in easily predicted ways.
    It is POSSIBLE that, along with the differences shown in those images, one or more of those filters MIGHT also have a significant, and audibly different, frequency response.
    However, it's also POSSIBLE that three filters with those very different impulse responses COULD all measure within that spec for frequency response and THD.
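    A toy illustration of the "related, but not directly" point (my own numbers and a plain-Python DFT, not any specific DAC filter): time-reversing an asymmetric impulse response moves the ringing from after the peak to before it, yet leaves the magnitude of the frequency response unchanged in every bin.

```python
import cmath
import math

def mag_db(h, k):
    """Magnitude in dB of DFT bin k of impulse response h."""
    n = len(h)
    X = sum(h[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
    return 20 * math.log10(abs(X) + 1e-30)  # epsilon guards against log(0)

# Toy asymmetric impulse response and its time reversal: the "ringing"
# sits after the peak in one and before it in the other...
h = [0.1, 0.3, 1.0, 0.3, 0.1, 0.05]
h_rev = h[::-1]

# ...yet every frequency bin has the same magnitude in both.
for k in range(len(h)):
    print(k, round(mag_db(h, k), 3), round(mag_db(h_rev, k), 3))
```

    This is essentially the minimum-phase versus linear-phase trade-off: two filters can share one magnitude response while distributing their ringing very differently in time.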

    You will find that there is significant debate about whether even "obvious" differences in impulse response are actually audible or not.
    The most usual claims are that "excessive ringing is noticeable" and that "pre-ringing is more noticeable than post-ringing".
    In nature, many sounds exhibit significant ringing after the main sound, while nothing in nature starts ringing before the sound itself....
    The debate is about whether the amount of ringing exhibited by DACs (which occurs for very short times at very high frequencies) is audible or not.
    Many folks in this forum will insist that "tests show that you can't possibly hear a difference and must be imagining it".
    However, many quite mainstream companies, including folks like Dolby Labs, and the makers of a lot of pro audio editing software, consider it to be "worth addressing".
    (For example, the latest version of Dolby's professional encoder suite offers options to reduce pre-ringing, and, obviously, DAC manufacturers often offer filter choices.)

     
    james444 likes this.
  2. Phronesis
    I read something interesting about blind wine tasting today:

    - When a wine is 'bad', judges will consistently rate it as bad. They can tell the difference between bad and good. Bad wines aren't usually submitted to wine competitions.

    - When sampling the same wine multiple times - and not knowing it's the same - judges will tend to rate the wine as tasting better each time.

    - When tasting good wines, judges are quite inconsistent, with themselves and others, in rating which ones they prefer. Despite wines being chemically different, when they're all of high quality, the perceptual ability of judges seems to become confused.

    I know there's some circular reasoning here related to good and bad and quality, but just go with it for a moment ...

    Some possible implications for audio blind testing:

    - When gear sounds substantially different (e.g., headphones), people can probably reliably tell which gear is which, and express a valid preference.

    - When gear sounds very similar, even if there are audible differences, our ability to tell which gear is which may fall apart during the testing. We may not even be able to express consistent preferences. And that may mean that any audible differences which are really there just don't matter much. So ... both of these can be true:

    (a) There's an audible difference which is missed in blind tests.

    (b) The tests themselves show that the audible difference is unimportant.
     
    Last edited: Dec 17, 2018
  3. bigshot
    The kinds of audible differences that might be missed in careful controlled testing are too small to make any difference for the purposes of listening to recorded music in the home. There's much more likelihood of significant false results if you apply no controls and go purely with subjective impressions.
     
  4. Phronesis
    I generally am inclined to agree. I find it difficult to make the case that an audible difference missed in a blind test could somehow be a substantial difference in normal listening. I could be wrong about that (due to a difference in how perception works in the test versus normal listening), but that seems quite unlikely to me.
     
    Last edited: Dec 17, 2018
  5. Steve999
    Interestingly, if I digested it correctly, that Harman paper you distributed pointed out that A/B testing can exaggerate the significance of differences for normal listening purposes. I think this was because a difference may be quite apparent in a controlled A/B comparison but not as important as one might infer for general quality or preference of the sound reproduction. This is going purely by memory, so if I am paraphrasing the article incorrectly, feel free to refine or correct.

    As to wines, I'm not surprised. But it's alright cause it's midnight and I got two more bottles of wine. :wine_glass:<Great. . now you have me on one of my country music kicks.:tractor: >

     
    Last edited: Dec 17, 2018
  6. bigshot
    A level matched, A/B switched, blind comparison is what you use to determine the most subtle of differences. If you have to struggle to hear a difference that way, it flat out doesn't matter. The truth is that that kind of control is overkill. But it's a hell of a lot better than doing no controls and deciding to spend an extra $500 on nothing. It's a lot easier to make an error on the side of getting the impression you hear a difference that doesn't exist than to not discern a difference in a controlled test. All this bias cuts two ways stuff is BS.
     
    Last edited: Dec 18, 2018
  7. castleofargh Contributor
    quasi-modo talkin:
    guys please, easy on the sarcasm, trolling, personal criticism or whatever you want to call it. we're here to judge facts and points of view about audio, not to judge people. if you can't stand the way somebody acts, ignore him, or just don't talk to him. if some ideas range from weird to false, present your views and back them up with something reliable, so that most people can clearly make up their minds on the topic and avoid being misguided. we don't need to chase away those who don't act or think like we do. that's what happens to many of us all too often in the rest of the forum and in amateur audio forums in general when we dare bring up revolutionary ideas like how a listening test should probably not involve eyesight:weary:. if you hate that half as much as I do, you know how unfair and revolting it feels. the counter to that intolerance should be tolerance, not intolerance toward the type of people/behaviors that are not us. I guess we can make this section our own and try to force the rest of the world out, but if that's who we are, at the very least let's not pretend that we're any better.


    don't get me wrong, I'm not saying to tolerate false claims or logical fallacies!!!! wrong is wrong. you can throw fact based rocks at those while I tie them to a pole myself. we can defend our beliefs whatever they are with reason, experiments, gathered evidence, and without attacking people who aren't the way we want them to be(AKA bigotry).
     
  8. castleofargh Contributor
    the minimalistic specs provided by a manufacturer are rarely enough to say that something will be audibly transparent. most are measured into a specific load at a specific output level, and we don't know what happens under other conditions. also, THD isn't the only form of distortion.

    yes drivers' limitations make many things less relevant to us. they're a significant part of why I don't care all that much about the rest of the chain. those filters are a more or less fancy way to low pass frequencies at sample rate/2. what matters is mostly the quality of the band limiting to avoid aliasing IMO. I'm really not sure that the rest makes that much of a difference. some phase shift in the upper freqs, that's hardly noticeable in direct A/B. some small FR roll off, that sometimes is noticeable, but it's not like we're using perfectly flat headphones anyway. same idea with pre ringing. we could certainly make filters that will create audible consequences for at least some people with the right test signal. but in general, I wouldn't lose sleep over that.
    the designer is going to have to decide where he goes and how far for various objective variables. my own ignorant position is that a reasonable balance of compromises will always give the best results and would probably not sound anything special. so as a result I'm sort of against anybody advertising how he's pushing one single variable to 11. which is how you might end up with clear differences IMO. that, or simple incompetence. but hopefully such people only pretend to be all in on one single variable to market their difference, while the design is actually more mindful and balanced than they tell us.
    DACs have had all sorts of filters for a while now, and if we can still have people thinking that almost all DACs sound the same, I think you can guess what they will think about the filter in the DAC ^_^.
     
    james444, WoodyLuvr and Phronesis like this.
  9. gregorio
    1. Thanks for answering my question. So when the band comes on Thursday, I'll make them a cup of coffee and ask them to wait for a century.
    2. Which to the rational mind would present a serious problem, because the equipment in a studio is SPECIFICALLY DESIGNED for recording musical instruments (unlike measurement mics)!

    3. If "NEITHER OF US knows if it is CURRENTLY possible to do so" then you admit that your claim (that it is possible) was FALSE, it was entirely speculation and how exactly does your speculation qualify as "pure science"?? However, you are clearly CONTRADICTING YOURSELF: If you need to ask "for proposals from companies" to invent the "equipment necessary" (to "record a cymbal properly"), then OBVIOUSLY the "equipment necessary" doesn't currently exist and therefore it is NOT currently possible to "record a cymbal properly"?
    3a. No you didn't and I explained why, so why are you repeating that falsehood?

    1. Absolutely you should, more wasted and insulting pages of incorrect facts, misrepresentations and FALSE SPECULATION!
    2. It's CLEARLY listed as a "free-field measurement mic" and NOT as a "general purpose microphone". So clearly, you are NOT "bothering with accurate scientific descriptions"!! Maybe it was a typo and you meant to say that you're "bothering with silly arguments like inaccurate unscientific descriptions"!
    3. If only!

    1. So why do you simply ignore any request to "point out tests" which support your (or other audiophile) claims and instead respond with inapplicable analogies and unfounded/incorrect speculations that you falsely present as fact/"pure science"?
    2. You've done this a few times, though often incorrectly.
    3. You've suggested lots of tests (pointlessly because they've already been done countless times) but repeatedly REFUSED to do any of those (or other suggested tests) yourself, AND your basis for that refusal has been that you are "not interested" and/or your company would not gain from them.
    4. BUT you have NOT discussed the "results we get" because you refuse to get any.
    5. You not only refuse to discuss practical issues but actually insult me when I try to!

    There's no number 6 in that list: Endlessly invent unfounded speculations, even when they contradict the known facts/tests/evidence. There's no number 7: Endlessly assert/discuss that flying pigs (even dead frozen ones), manned missions to other solar systems, TVs that give you a suntan and all manner of other nonsense are "possible". There's no number 8: Repeatedly misquote and/or misrepresent the actual facts/science to serve your own agenda.

    I entirely agree with the list you posted, which makes the fact that you repeatedly ignore every single one of those points (and instead post according to a completely different list) so hypocritical, off-topic, insulting and effectively trolling! Why won't you STOP with all this unsupported speculations/nonsense/BS and instead follow your own advice and stick to the list?

    G
     
  10. Phronesis
    I suspect that this is true, but are there any good studies correlating blind tests using short music excerpts with normal listening tests? I’d like to see evidence here. Otherwise, we’re just making claims.
     
  11. castleofargh Contributor
    but a blind test using short samples is a normal listening test ^_^.
     
  12. Phronesis
    What I'm looking for is a study which uses multiple listeners and music excerpts, varies the duration of the music excerpts and the switching time, does both blind and sighted comparisons in the A/B, and also has listeners blind and sighted listen for long durations and note their observations on the sound quality (not A/B identification) with respect to things like tonal balance, instrument separation, stage, detail, etc. (the kind of stuff people write in flowery gear reviews). This involves a lot of trials, so it would take some time to complete the study.

    By correlating all of the results, we can get an idea of how the various variables affect the results, and whether there's any relationship between results of blind A/B testing with short excerpts versus notes from normal extended listening.

    I've yet to run across even one good study along these lines, and lots of poorly documented, conducted, and/or interpreted studies aren't a substitute for one good study which has some scientific rigor. I think we all have an idea of what the results of the study would be, but until such a study is done, we're all somewhat speculating about the conclusions. "Burden of proof" stuff doesn't really matter, because the goal is to figure out the answer based on evidence, not win internet forum debates.
     
  13. KeithEmo
    I'm just going to point out one final thing... then I really am going to give up.

    One minute you say that what I'm proposing isn't possible...
    The next minute you say "we've already tested it and found it made no difference"...
    Could you at least make up your mind about which you believe there?
    (I'm pretty sure that, if it's impossible, then you haven't actually tested it.)

    And, yet again, no....
    The fact that your local WalMart....
    Or your local recording studio....
    Doesn't have a certain piece of equipment....
    Says very little about whether it is either possible or even currently available....

    When you need a piece of scientific equipment....
    - first you see if you have it
    - then you ask the people who sell it if they have one (purchase proposal)
    - then, if they don't, you ask them if they can make it and what it will cost (design proposal)
    - then, if nobody is willing to offer a proposal, you conclude that it is not currently available
    - then you decide if it's worth bothering to try to design it yourself

    However, since I'm not a studio engineer, I'll admit that you could quite well be right.
    Perhaps, if a typical studio doesn't have one, they just tell the customer who asks that "it's impossible"...

    However, this is all moot, since I already posted a catalog page for the equipment necessary.
    (It seems to be readily available, not unreasonably expensive, and not even especially new.)

    [IN CASE ANYONE MISSED IT]
    https://www.bksv.com/-/media/literature/Product-Data/bp2212.ashx

    Feel free to suggest that "you don't like the way that B&K microphone sounds" - if you've actually heard one.
    (That would be an "artistic decision"....)
    However, it pretty obviously exists.



     
    Last edited: Dec 18, 2018
  14. castleofargh Contributor
    the nature of what cues you're trying to perceive will define the testing protocol, so I don't even see how what you're asking for could be done to answer a general question.
    but for detection of small changes in sound, which is usually what we discuss here, we have a bunch of recommendations starting with https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1116-3-201502-I!!PDF-E.pdf (not sure if it's the latest version).
    the reason for short samples isn't that we notice short samples better per se. it is that we remember things well only for a limited amount of time. it's all about memory instead of perception and this has at least 2 papers I know of, saying about the same things. which is that accuracy of a recalled audio event starts to drop after only a few seconds. so if your test sample is longer than what your brain can accurately recall, you're in trouble.
    also it's obvious that when the point of the test is to detect differences between 2 samples, having them as close as possible in time to one another helps. a little like how it's easier to find variations between 2 pictures when they rapidly switch on a screen, compared to spending 5 minutes looking only at one and then 5 minutes looking only at the other. the ability to go back and forth is a great help in identifying and confirming variations. the only real problem comes when the effect of something is felt only in the long term; obviously a 3-second sample might fail there. and also when you have no clue what you're looking for, your short sample might not contain the cues you're supposed to notice. but beyond that I don't think there is much doubt left that rapid switching and short samples are the most effective listening method to detect small differences.

    now if you're talking about identifying the components of one track, then obviously more time will probably help as you're not looking for differences with another sample.
     
    Last edited: Dec 18, 2018
  15. KeithEmo
    I agree. I hate to be trite, but none of the "famous studies" that I continually see quoted would have gotten me a passing grade in my college level science courses. They all violate one or more of the basic tenets of "properly conducted scientific tests" - at least if you wish to produce results that are considered to be credible.

    I also see a frequent lack of distinction between statistical results and absolute results. For example, let's say you test 50 subjects, with 20 trials each (that's 1000 total trials)... and you find that you got 532 overall "correct" responses... but one subject was right 18/20....

    Statistically your overall results suggest that the outcome was within the standard error for a random result. Likewise, the fact that one subject scored 18/20 is itself not statistically significant. However, that result is in fact suggestive enough to justify further testing. It's simple enough to run an additional 20 trials with just that subject to confirm that his anomalous result was random. There is a QUALITATIVE decision involved. It really DOES matter whether that one test subject can routinely get more correct than all your other subjects, or whether he was simply lucky that day. If someone throws ten heads in a row you do NOT "write it off as an anomaly". You conduct additional tests to CONFIRM whether it was an anomaly or a very significant result.
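    The arithmetic behind that judgment call is easy to check with exact binomial tail probabilities. A minimal sketch, using the hypothetical figures above (532/1000 correct overall, one subject at 18/20, 50 subjects) and assuming a fair-coin chance model for guessing:

```python
from math import comb

def tail_prob(n, k):
    """Exact P(X >= k) for X ~ Binomial(n, 0.5): k or more correct by pure guessing."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p_overall = tail_prob(1000, 532)   # 532 or more correct in 1000 pooled trials
p_subject = tail_prob(20, 18)      # one subject scoring 18/20 or better
# With 50 subjects, the chance that at least one reaches 18/20 by luck alone:
p_any_subject = 1 - (1 - p_subject) ** 50

print(f"P(>=532/1000 by guessing):         {p_overall:.4f}")
print(f"P(>=18/20 by guessing):            {p_subject:.6f}")
print(f"P(some subject of 50 hits 18/20):  {p_any_subject:.4f}")
```

    Running the numbers shows why a single 18/20 score is exactly the kind of result that warrants a confirming follow-up run rather than a shrug: even allowing for 50 subjects, such a score arises by luck only about 1% of the time.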

    I'm not sure whether this sort of error qualifies as a specific sort of fallacy... or just falls under the heading of "misusing statistics" or "confusing statistics with specific facts"... but it sure happens a lot.

    In the context of testing for very subtle audible differences between... (anything)...

    If the goal is to determine the absolute fact of whether "a difference is audible or not", then I would suggest one slight difference..... I would suggest that the most sensitive version of an A/B test would be not an A/B test, but the A/B/A variation. In this form of test you play THREE samples in sequence, timed or not, or simply allow the listener to switch back and forth multiple times as they like. This covers the possibility that, for some unknown reason, the listener may be more sensitive to some small difference between the samples when switching in one direction than in the other. (If we're trying to find the smallest detectable difference then we want to ensure that we include the most sensitive conditions possible.)

    At the risk of using an analogy to prove the point, if we were asking observers to compare two colors, the most sensitive test for doing so is to have the colors shown at the same time in direct contact. When matching colors it is universally acknowledged that the most accurate results will be obtained when comparing two OVERLAPPING samples. When comparing "swatches" you hold the swatch OVER the color sample you're testing. (Many very accurate old-school scientific tests worked this way - and many still do.) You do NOT look at the samples sequentially or hold them several inches apart. Showing samples sequentially, or at the same time, but not overlapping, universally reduces the sensitivity of the test. I suggest that allowing the user to switch as often as possible, and as quickly as possible, or as infrequently as they prefer, achieves the same goal. It both minimizes the opportunity for memory to affect the results and allows each test subject to choose their own "most sensitive test conditions".

    An A/B/X test is FAR less sensitive since it asks the user to not only recognize a difference but also to characterize what that difference is. Half of the sample sets in our A/B/A test will contain samples that are different, and half will contain samples that are the same. Therefore, if the different samples are "audibly indistinguishable" we would expect the same random 50/50 result for both types of sets. And, if the observer is statistically more accurate in identifying sets where the samples are different, then we must conclude that SOMETHING is causing them to be able to tell which is which. (And, since it is essentially a forced-choice situation, we have included "unconscious factors" as well as conscious ones.)
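    The same/different bookkeeping described above can be sketched in a few lines (the names and trial counts are illustrative only, not any standard protocol implementation): half the sets present identical samples, half present different ones, and a listener who genuinely cannot hear the difference should score near 50% on both kinds.

```python
import random

def make_trials(n, seed=0):
    """Half 'same' (A/A/A) and half 'different' (A/B/A) sets, in random order."""
    rng = random.Random(seed)
    trials = ["same"] * (n // 2) + ["different"] * (n - n // 2)
    rng.shuffle(trials)
    return trials

def proportion_correct(trials, responses):
    """Fraction of trials where the listener's same/different call was right."""
    return sum(t == r for t, r in zip(trials, responses)) / len(trials)

trials = make_trials(40)
rng = random.Random(1)
guesses = [rng.choice(["same", "different"]) for _ in trials]  # a pure guesser
print(f"guesser: {proportion_correct(trials, guesses):.2f}")   # near 0.5 in expectation
print(f"perfect: {proportion_correct(trials, trials):.2f}")    # always 1.00
```

    Comparing the score on "different" sets against the score on "same" sets is what separates genuine detection from response bias.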

    I would also reiterate that, if we're testing something like the audibility of ultrasonic content, then we MUST confirm that such content is both present in our test samples, and actually delivered to the ears of the listener at the listening position. (We must ensure that both our test samples, and all of the gear we use in the test, "is delivering the sample we're testing for".)

    I would also note that what we're talking about here is "a test to determine the absolute minimum difference that is audible". This is far different than, for example, testing for "what the majority of people notice" or "what the majority of people find important". In tests of THAT sort, it may be desirable to deliberately include other confounding factors. For example, we may conclude that, if the majority of listeners are unable to detect a difference when their listening sessions are deliberately separated by a thirty second delay, those differences are "unimportant", and "unlikely to be noticed in a typical listening situation". In other words, answering THAT question might call for a different and less stringent test. (Perhaps we actually need to develop a specific test for "what's important in a typical listening situation"... with specific, and less stringent, requirements.)

     
    Last edited: Dec 18, 2018