Originally Posted by dwareing
I think people are right to have high expectations for binaural. The idea that a recording which reproduces the eardrum movement you would have got live should sound the same is quite compelling. This assumes that the vibrations of the eardrum are the only input to the brain that conveys sound and its location. The direction part of the sound vector has no sound itself, so where the brain decides to locate the sound in the auditory image could come from vision, variation of ITD with head movement, the hairs on your arms, or wishful thinking, etc.
There are many factors (cues) involved. Some may be 'ANDed' and others 'ORed'. Also there are many types of sound source - musical notes, impulses, random phase noise, different frequency bands, etc. On top of that there are varying types and amounts of reflections. You cannot take one set of conditions and apply your observations to another set of conditions. It is very difficult (and unwise) to reach firm conclusions. Sound localization is a process rather than a state. Closing your eyes while listening to a sound is not the same as being blind from birth, for instance.
Jeff Anderson is on the vision tack at the moment. The idea is that we could all learn to use a standard pinna response (his), calibrating our auditory system by watching videos.
I watched and listened to the videos and agree with your observations. The distance perception is quite vague, however, so you can place the sound at her lips on my screen 2 ft away, or farther away when the eyes are closed. I enjoyed the recording, as I did the first time I heard it, but for me in my living room the scale was quite small.
Many years ago I made a MiniDisc recording of a waterfall with in-ear mics. When I replayed it on headphones in a field it sounded almost credible in scale. When I played it in a room it shrank to fit the room, so I played it in a bathroom, and then in the car. It fitted each space. I conclude there was very little actual distance information on that recording.
I learnt of the McGurk Effect and other such things many years ago from a very nice guy called Richard O. Duda, via emails.
This place is a hotbed of high-end beliefs; I won't be hanging around. Cheers - David.
I think we're largely in agreement (on a number of issues); what happens in one set of boundary conditions isn't necessarily applicable to another set of conditions.
I guess what I struggle with (a lot) is that many comparisons are made wherein multiple variables are changed at the same time. As anyone who has conducted experiments will tell you, changing only one variable at a time makes the conclusions (usually) more salient and logical. Unfortunately, most of the research aimed at answering a lot of questions about binaural (or, for that matter, other formats like Ambisonics, etc.) is confined to publications by the A.E.S., A.S.A., or other technical-centric organizations, and regrettably, this puts much of the published work into the realm of 'specialized' rather than general. Thus, most people who might have an interest (and they are few...) may not at first be sure where to turn to find such controlled experiments, or could even be intimidated by dealing with publications by such organizations.
The videos...I had a somewhat similar impression of the distance being vague (in the Sumkali videos), but at the same time, I think I know what is going on. Again, remember that nearly all of the instruments were mic'd or taken via direct box to the FOH mixing console. I don't recall how much of each instrument was in the main mix, but...the fact that there is a PA component (close to the mannequin head, on-center) means that the natural arrival time from each instrument is going to be affected - essentially...blurred. This is due to the fact that the sound from the instruments (via the microphones) arrives before the naturally radiated acoustic power from the instruments - this has GOT to affect the phase relationship; phase is time...time is phase...and timing is a major factor in localization. Frankly, were one to piggy-back the various mic channels and compare them to the signals from the mannequin ears, one could (via signal processing) glean a great deal of insight as to the phase(s), or rather, the deltas...but this would only help explain what is going on - and not fix it. Yes, it would be grand to run this as an experiment: take a band that is primarily acoustic, alternately turn the PA on and off during the recording, and perhaps incorporate video as another interesting variable (one could expand this further and think about 3-D video as well). I think it would be interesting to see how people's perceptions of localization change in the presence of varying degrees of PA content as compared to the acoustical content.
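To make the 'deltas' idea a bit more concrete, here is a minimal sketch (Python, standard library only) of how one might estimate the arrival-time offset between a close-mic channel and one mannequin-ear channel using a brute-force cross-correlation. Everything in it is an assumption for illustration: the function name, the 48 kHz sample rate, and the synthetic noise that stands in for real audio. A real recording, with reverberation and several sources at once, would give a far less clean peak.

```python
import random

def estimate_lag(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best lines up with `reference`.

    A positive result means `delayed` arrives later than `reference`.
    Brute-force cross-correlation; fine for short test signals only.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += r * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

SAMPLE_RATE = 48_000  # Hz (assumed for this example)

# Synthetic stand-ins: white noise as the "mic" channel, and the same
# noise attenuated and delayed by 120 samples as the "ear" channel.
rng = random.Random(0)
mic = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_delay = 120
ear = [0.0] * true_delay + [0.5 * s for s in mic[:-true_delay]]

lag = estimate_lag(mic, ear, max_lag=300)
delay_ms = 1000 * lag / SAMPLE_RATE
print(f"estimated offset: {lag} samples = {delay_ms} ms")
```

As noted above, this kind of analysis would only help explain what is going on in a given recording; it does nothing to fix it.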
I've seen something related in many houses of worship...counter-intuitively, oftentimes articulation is actually made worse by a PA because the sound from the PA arrives before the naturally-radiated sound of the person speaking / the people singing, and the time delay and also reverb of the space can actually make it harder to hear when the PA is in use. Yes, much of this can be compensated for using time delays and such, but most budget installations don't allow for this.
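For a rough feel of the numbers involved in that kind of time-delay compensation, here is a small sketch under simple assumptions: straight-line paths, a speed of sound of about 343 m/s (roughly room temperature), and no allowance for console or processing latency. The function name and the distances are hypothetical, and a real installation would be tuned by measurement and by ear rather than by this arithmetic alone.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate, dry air at about 20 degrees C

def pa_delay_ms(talker_to_listener_m, speaker_to_listener_m):
    """Delay (ms) to add to the PA feed so that it does not arrive at the
    listener ahead of the direct acoustic sound from the talker.

    Returns 0.0 when the loudspeaker path is already the longer one.
    """
    acoustic_ms = 1000 * talker_to_listener_m / SPEED_OF_SOUND_M_S
    pa_ms = 1000 * speaker_to_listener_m / SPEED_OF_SOUND_M_S
    return max(0.0, acoustic_ms - pa_ms)

# Hypothetical seat: 12 m from the person speaking, 5 m from the nearest loudspeaker.
delay = pa_delay_ms(12.0, 5.0)
print(f"PA delay needed: {delay:.1f} ms")
```

The 7 m path difference in this made-up example works out to about 20 ms, which gives a sense of how far ahead of the natural sound an uncompensated PA can be.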
Anyway...I'm rambling. Again. Let me get back to my original point (by way of example) with regard to the importance of natural phase.
A counterpoint to much of what I typically record is something that I do for select people that I know, namely, family histories. How does this differ from other things that I record?
Well, apart from the genre, the main difference here is that the natural arrival times (from each person's mouth to the mannequin) are preserved - unlike the case where one has the acoustic, un-amplified path, and a path augmented by the PA. In these family history sessions (basically, discussions), there are no artificial means of augmenting sound - people are simply sitting and speaking to one another.
In this instance, I record binaurally in a space with which all who will likely listen (mostly the family members) are well acquainted...for instance, in a living room, a kitchen...or wherever the participants feel most natural and comfortable. Now...mind you this is just anecdotal, but I will say this...when I make such recordings, when the people hear them their jaws drop, they grin uncontrollably, and look around / point to things that are in the space. A good example of this was one event where I recorded a family history in a home, and on the wall was a clock - an older, gear-driven AC-motor-powered clock that had seen better days. It was functional enough, but made a sort of low whirring / slight grinding sound. When I played the recordings for them there, immediately after the fact, using standard closed-back headphones (nothing esoteric), every person commented on how they could hear the clock - they looked toward where the clock was relative to the mannequin head, and their gestures were really raised hands pointing to where the clock was situated (relative to the mannequin).
For this reason (and others), I always want to record family histories in the family home if at all possible. Why? Well, apart from what I wrote previously, there's a learned element. That is, if you think about it, people tend to learn, if unconsciously, the acoustics of the space(s) in which they spend a great deal of time. So, by recording on (acoustically) 'familiar turf', the recordings take on added realism...in particular...to the family members because so many crucial details of the space are pretty faithfully rendered by the binaural technique - yet, if you play the same recordings for someone who does not know the acoustics of the space, they often have a different perspective and experience. Yes, I could record them in a conference room, and yes, all would be heard, but it would not carry the same 'weight' as incorporating the known acoustics of the family home.
Likewise, I know of at least a few car audio system providers (who provide systems to the auto manufacturers) who run their listening tests, using binaural audio, in the vehicle. That is, they may have several variants to test, but they record them all binaurally, at the same location, etc. When the systems are to be judged by the listeners, they do so in the vehicle using playback of the binaural recordings that were made previously with the mannequin head in the vehicle, situated where the listeners would have been situated (and where they ARE situated for the listening tests). They do this because they have found that there is better agreement with what is ultimately judged to be the 'right' audio system when the headphone playback of the target systems takes place in the vehicle rather than in a controlled listening space.
I have likewise seen this in situ in my day job (Sound Quality / NVH Manager for Lear Corporation). For example, when we record in a vehicle interior (binaurally) and play the sounds back in the vehicle (whether the headphones are closed or open doesn't really matter in this case), the jurors overwhelmingly judge the recordings as 'very realistic' and have no real issues localizing sounds of interest, usually with quite good precision (in the scope of the tests that we execute). However...when the jurors are played the same sounds seated elsewhere than in the vehicle, they sense them differently - not that they judge them as unrealistic, but they do tend to judge them as less realistic. Context...plays a big, big role. Oftentimes, we record binaurally not to (necessarily) gauge localization, but to do relative comparisons of the sonic attributes of each system being tested - in this case we're looking for more information about which product sound is preferred the most when compared to the others (for some of the readers out there, this may come as a shock, but many, many products' sounds are very carefully engineered to sound a very specific way, and not necessarily just to meet an existing noise specification - drills, dishwashers, transmissions, door closures, sunroofs, power windows...all of these are scrutinized and tweaked to try and get the right sound, that is, the proper Sound Quality).