Sighted training and blind tests

Discussion in 'Sound Science' started by edgeworth, Apr 2, 2018.
  1. edgeworth
    Can someone point me to any recent studies that have followed up very old AT&T work on training people to hear differences?

    I vaguely remember an old AT&T or Bell Labs study showing that people who could not identify a low-frequency tone mixed in with noise were able to reliably identify it after it had been demonstrated to them in non-blind (sighted) conditions first.

    Also I wonder if audio engineers have databases that allow meta-studies of their work?

    I'm thinking that in this age of big data that the potential for discovering subtle but persistent effects is there.

    I'm reminded of this when looking at the new biology studies of genes and intelligence. Many small-scale studies could not find significant effects, but when combined with other studies, researchers have managed to publish work (usually with enormous sample sizes, on the order of 500k or 1M participants) showing clear links between certain genes and a) educational attainment and b) IQ in serious biology journals.

    I remember once looking at a large cable test that seemed to show negative results, but when I looked through the data it was clear that a subset of listeners could -- at statistically significant levels -- tell which cable was which. There was no follow-up nor paper on this, and the people doing the test didn't respond to my queries. But I haven't seen many studies that are large enough, and recombinable enough, to allow one to use the most sophisticated statistics to ferret out subset effects. I am not an audio engineer, though I work with large-scale statistical data, so I'd appreciate it if an expert could give me some references.

    Thanks.
     
  2. skwoodwiva
    This may prompt you to look harder & maybe find this one too.
    It is why I am here:
    https://www.head-fi.org/threads/is-there-science-genetics-behind-the-audiophile-phenomenon.875390/
     
  3. bigshot
    I think pitch is probably a very good way to isolate sound in noise.
     
  4. castleofargh Contributor
    the AES guys will be able to help you more than us. the only real notable work on meta-analysis of listening tests has been the attempt to show that high-res audio is audible, https://qmro.qmul.ac.uk/xmlui/bitst...High Resolution 2016 Published.pdf?sequence=1 and let's just say it wasn't unanimously acclaimed.
    too many choices were left to subjective decisions, despite how heavily they ended up weighing on the results.

    but for sure there must be treasures to be found when combining all the small-sample studies, if we were able to find enough common denominators (not that easy, sadly, when just a different test signal can greatly change the significance of the results). good luck with your search. I for one would certainly love to see more work done on such subjects.
     
  5. bigshot
    The trick is to try to isolate the type of perception, not to cherry pick to prove a result. A lot of times people will just look for the exception to the rule to prove their point. The exception isn't nearly as important as a typical result under real world conditions.
     
  6. gregorio
    That's not "very old" work; AT&T are a fairly new company. Musicians, though, have had listening training (sighted and blind) for many centuries, probably at least six. Listening training is a completely established, accepted and fundamental part of the education of all formally trained musicians. The same is true of music and sound engineers: all the university courses on the subject I've ever heard of contain at least one mandatory module specifically on listening training.

    In the education world there are certainly such databases of student work and results, but they are not publicly available; in fact, in most countries such databases are covered by strict data privacy/protection laws. Music and sound engineers do not in general have such databases. The testing of equipment is important because it influences music and sound engineers' ability to do their job, but it's incidental to their actual job. Equipment testing is typically limited to periods of studio refurbishment and is generally not done formally, recorded officially or made public, although it is frequently discussed within the community of pro engineers.

    One has to be very careful when interpreting what constitutes "statistically significant". For example, it's entirely possible to toss a coin 10 times and by pure chance get 9 heads. An over-simplified understanding of statistics suggests that 10 tosses should yield 5 heads and 5 tails, but that's not really how statistics works. Statistics actually predicts a bell curve of pure-chance probabilities and therefore, with enough batches of 10 tosses, we would actually expect to see one or more 9-heads results purely by chance. So such a result would NOT indicate that I am actually able to toss a "head" deliberately!

    In a large cable test (or any other test), statistics dictates that there will be a certain number of outlier results (results which vary significantly from the mean) purely by chance. A subset of listeners achieving outlier results is NOT in itself indicative of that subset actually being able to tell which cable was which; it can equally be indicative of pure chance/guessing! Only if the subset's results fell beyond the bell curve of expected chance probabilities could that subset of listeners be described as "statistically significant"!
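    The coin-toss arithmetic here is easy to check. A minimal sketch in Python (my own illustration, not from the thread; the listener count is a made-up assumption):

```python
from math import comb

def p_at_least(heads: int, tosses: int = 10) -> float:
    """Probability of at least `heads` heads in `tosses` fair-coin tosses."""
    return sum(comb(tosses, k) for k in range(heads, tosses + 1)) / 2 ** tosses

p = p_at_least(9)    # 11/1024, roughly 1.1 %
listeners = 100      # made-up number of purely guessing test subjects
print(f"P(>=9 heads in 10 tosses) = {p:.4f}")
print(f"expected 9-or-better scorers among {listeners} guessers: {listeners * p:.1f}")
```

    So in a cable test with 100 listeners, roughly one "9 out of 10" scorer is expected even if nobody hears anything at all, which is the point about outliers above.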

    If it were me, I would like to run further tests on those who had outlier results, just to be doubly certain their results were the predicted, pure chance outliers. However, this is not a necessity and time and funds often don't allow further levels of testing.

    G
     
  7. edgeworth
    Of course one has to be careful. That's why I tried to follow up, but no one responded. I have published multiple statistical studies in econ and psychology journals, so I am aware of the issues involved. But there has also been much progress in both statistics and econometrics in using more sophisticated methods to tease out various effects in a causally significant way. I think it is safe to say that today the level of statistical sophistication in leading econometrics papers is higher than what is typical in the leading medical or engineering journals.

    I would also note that heterogeneity -- the tendency of different people to have different acuity, under different conditions -- further complicates the interpretation of even negative results for small samples. Note that in medicine there is an analogous case where non-test results are used regularly, except we don't see them as such: side effects. Although there are strict protocols for drug-approval testing, doctors routinely list a variety of side effects that were never tested double-blind, because they might be important. I would argue that subjective acoustics is a bit like this. Many drugs in current use were only double-blind tested for effect X but ended up being used because effect Y was routinely observed ex post, without double-blind testing. Furthermore, in some cases the benefit seems strong enough for some people that doctors allow a procedure whether or not it is a placebo -- such as acupuncture. Even if it is a placebo, if there are no side effects and there is no way of recreating the beneficial effects without an actual drug -- with a greater risk of side effects -- then the profession is in favor of just allowing it.
     
  8. bigshot
    There's also the basic question of practicality... If you have to jump through hoops to prove something exists at all, does it really matter?
     
  9. edgeworth
    Well, as I noted for side effects in medicine, they clearly matter. And for other work, it's interesting for scientists because the hard-to-prove can be important in different contexts. In finance, in psychology, in genetics, many things are hard to prove but are potentially very important. If an engineer says effect A doesn't exist at all, that statement is false if even one person in the world can prove it wrong. It may not matter in practice, or it may mean that more people can hear it, unevenly, in hard-to-prove situations. If an audiophile fools himself, at most he wastes a bit of time and money. If a scientist fools himself, he harms the quest for better knowledge.
     
  10. skwoodwiva
    Are you a researcher or physician?
    Do you know anything about tinnitus? My wife has it.
     
  11. bigshot
    My point in being here is to use science to make better sound, not the other way around. Some people might feel differently I guess.
     
  12. edgeworth
    The first step to better sound is a better model of hearing and of the subjective valuation of what we hear. I believe that the tendency to treat accuracy as electrical neutrality rather than perceived accuracy is a big issue. Furthermore, even more than in medicine, blind adherence to the 95% confidence threshold is damaging.

    Just to make a quick digression: science is about avoiding both false positives and false negatives. There is nothing magical about the use of 95% confidence as the cutoff in scientific journals; it is an arbitrary convention to limit false positives. But as Bayesians know, it may also result in too many false negatives. Depending on the loss function involved, we may want to further investigate situations at a lower confidence level if the costs of doing so are not large. In medicine we are purposely conservative for fear of bad effects. But even then, in certain cases -- such as terminal cancer -- experimental drugs whose promising results have not yet reached the 95% confidence level are admitted if there is nothing to lose. Similarly, as I said, doctors take reported (non-verified) side effects into consideration because ignoring them might prove damaging.
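    The false-positive/false-negative trade-off can be made concrete with a small sketch (my own illustration; the 20-trial ABX session and the 70% true detection rate are made-up assumptions):

```python
from math import comb

def binom_sf(c: int, n: int, p: float) -> float:
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def critical_correct(n: int, alpha: float) -> int:
    """Smallest score that is significant at level alpha under the
    null hypothesis of pure guessing (p = 0.5)."""
    for c in range(n + 1):
        if binom_sf(c, n, 0.5) <= alpha:
            return c
    return n + 1  # no achievable score is significant

n = 20            # trials in a hypothetical ABX session
true_rate = 0.7   # a listener who genuinely hears the difference 70% of the time
for alpha in (0.05, 0.10):
    c = critical_correct(n, alpha)
    miss = 1 - binom_sf(c, n, true_rate)  # false-negative probability for that listener
    print(f"alpha={alpha:.2f}: need {c}/{n} correct; "
          f"P(this real effect still 'fails' the test) = {miss:.2f}")
```

    Relaxing alpha lowers the pass bar by a trial and cuts the chance of missing that real (but modest) effect, which is exactly the loss-function argument: where the cost of a false positive is small, the stricter threshold isn't automatically the right one.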

    A deep research project in psychoacoustics might want to examine how sound is perceived differently by different people and how different distortions are valued or not. Some have already done this crudely, but there is a lot more to be done. It is even harder because audio engineering is not a large field, and most of the top 20 or so universities don't have large teams of engineers plus psychologists examining these issues, which means research is limited and spotty.

    I am not in a medical field so I can't even say much about biological differences and how those relate, but those surely matter.
     
  13. bigshot
    I really don’t have much knowledge of or interest in statistical models and testing procedures. I’m focused more on thresholds of perception and how those relate to the sound a system makes playing back recorded music. My experience with that tells me that there is a much wider latitude for forgiveness of problems in sound than most audiophiles allow for. In fact, audiophiles often worry about things that are an order of magnitude or more below the “it just doesn’t matter” line. Headphones are more subject to that than speakers. In general, discussion of speakers is more grounded in the real world than discussion of electronics and headphones.

    What I look for is practical techniques I can use to put together a great sounding system. There isn’t a lot of that in the audiophile world. But if you’re interested in the theoretical side of things there’s lots to chew on. Especially if you’re interested in esoteric exceptions to the rule.
     
  14. gregorio
    But doesn't that mean not jumping to conclusions without sufficient evidence? You stated that "when I looked through the data it was clear that a subset of listeners could -- at statistically significant levels -- tell which cable was which". I don't know the study to which you are referring, but can you be certain a subset of listeners could tell which cable was which; does the evidence really support such a claim? Even if the outlier results fell entirely within the predicted bell curve, that would not be absolutely conclusive proof that those individuals could not detect a difference. The only claim we could make is that there was no evidence that anyone could detect a difference. It's largely for this reason that science cannot prove a negative: it's impossible to test everyone who is, has been or will be alive.

    Following on from the previous point, I agree with it, but only within certain limits. For example, the hearing range of Homo sapiens is typically given as 20Hz-20kHz. Exceedingly few people actually have such a large hearing range, though. Having said this, I did hear of one reliably documented case of a test subject who was able to detect 23kHz. However, this raises two points: 1. The conditions of that test would not exist under any normal conditions of consumers listening to music, and 2. The audiophile world (or some portion of it) seems convinced that it's important to capture and reproduce audio frequencies up to 96kHz or, with some of the newest formats, 2 or more times higher than 96kHz. That is so far beyond what all the evidence suggests that it's safe to assume no one can hear such frequencies, regardless of any audiophile testimony to the contrary. Obviously we haven't tested every single member of our species, so we can't be absolutely certain, but if a person existed who could hear such frequency ranges, they would need such a different physiology that in all likelihood they would no longer technically qualify as Homo sapiens but would be some other species/sub-species.

    1. Again, we've been doing this for a very long time; the first documented music competition occurred (I believe) in 700BC. Even five centuries ago, "subjective valuations" were very sophisticated and taught as standard to composers and musicians. We have a very good model of hearing, but there can never be an accurate model of "subjective valuation" because there is no precise, universally accepted definition of "value". The best example of a model of subjective valuation (for hearing) we currently have is probably that of loudness, which on the face of it seems easy to evaluate. However, it is not an accurate model; it is based (as all measurements of subjective valuation would have to be) on the mean response of a large group of people, and of course that results in the model being somewhat inaccurate for anyone who does not fall exactly on that mean.

    2. No, that's not an issue at all! The issue, where there is one, is simply that some people confuse accuracy with personal preference (personal subjective valuation). Accuracy is simple: from the beginning of the recording chain all the way through to the end of the reproduction chain, the only thing we effectively have is an electrical signal. It is therefore easy to measure this electrical signal after it has passed through any piece (or pieces) of equipment in the chain, compare it with the signal before it passed through that equipment, and determine how closely they match, and therefore how accurate (high-fidelity) that equipment is. The big issue, if there is one, is that there's really no such thing as "perceived accuracy" in the first place; "perceive" and "accuracy" are effectively mutually exclusive terms! A "perception" is by definition an interpretation by the brain, not a precise or accurate measurement. Furthermore, that's a very good thing, because without perception (and the fact that it's not an accurate measurement) there would be no music in the first place! In short, don't confuse accuracy with perception; they are two entirely different things!
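    The before/after signal comparison described here is essentially what engineers call a null test: subtract the input from the output and measure what's left. A minimal sketch (my own illustration; the 1 kHz sine and the 0.1% attenuation are made-up):

```python
import math

def rms(x):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def null_test_db(signal_in, signal_out):
    """Level of the residual (out minus in) relative to the input, in dB.
    More negative means the two signals match more closely."""
    residual = [a - b for a, b in zip(signal_out, signal_in)]
    return 20 * math.log10(rms(residual) / rms(signal_in))

# made-up example: a 1 kHz sine through a device that attenuates it by 0.1 %
fs, n = 48000, 4800
sine_in = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(n)]
sine_out = [0.999 * s for s in sine_in]
print(f"residual: {null_test_db(sine_in, sine_out):.1f} dB")  # -60.0 dB
```

    A device that changes the signal by 0.1% leaves a residual 60 dB below the signal; no listening or perception is involved in that number, which is the distinction being drawn above.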

    G
     
    Last edited: Apr 4, 2018
  15. amirm
    Yes, because you don't know whether there are more sensitive cases or people with even higher acuity.

    And as noted, knowing something is impossible to detect is hugely valuable. Take that away and a lot of arguments cannot be made.
     
