How do we hear height in a recording with earphones?

Discussion in 'Sound Science' started by vidal, Mar 20, 2017.

  1. Vidal
    Can someone please provide an explanation (a link to website would be fine) on how we hear height with earphones. Just normal stereo recordings.
     
    I've read that the outer ear/inner ear has a role to play in terms of reflections etc. but with IEMs that interplay is removed, right?
     
    I keep seeing 3D soundstage being used when describing the sound from IEMs and I'm sorry but I just can't fathom how that works. Left and right along with imaging and distance I can see how we'd perceive these elements but height just doesn't make sense.
     
    Be as technical as you like, I'll pick up.
     
  2. spruce music
    I wonder the same thing.  Normally our outer ear causes comb filtering that results from reflections off the folds of the outer ear (the pinna).  You get a dip in response centered between 6,500 Hz and 11,000 Hz depending upon direction.  The dip is higher in frequency the higher the sound direction, with 11 kHz being maybe 70 degrees up.  A dip at 6,500 Hz is only slightly elevated.
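     
    A rough way to put numbers on the comb-filtering idea above: if one reflection arrives a delay tau after the direct sound and the two are summed, the first cancellation sits at f = 1/(2*tau). The Python sketch below is only a toy single-reflection model, not a measured pinna response; the function names, the 0.5 reflection gain and the 343 m/s speed of sound are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def first_notch_hz(delay_s):
    """First cancellation frequency when a signal is summed with a copy delayed by delay_s."""
    return 1.0 / (2.0 * delay_s)


def delay_for_notch(notch_hz):
    """Reflection delay (seconds) that puts the first notch at notch_hz."""
    return 1.0 / (2.0 * notch_hz)


def comb_gain_db(freq_hz, delay_s, reflection_gain=0.5):
    """Level in dB of direct sound plus one delayed reflection: |1 + g*exp(-j*2*pi*f*tau)|."""
    phase = 2.0 * math.pi * freq_hz * delay_s
    real = 1.0 + reflection_gain * math.cos(phase)
    imag = -reflection_gain * math.sin(phase)
    return 20.0 * math.log10(math.hypot(real, imag))


# The two dip frequencies mentioned in the post.
for notch in (6500.0, 11000.0):
    tau = delay_for_notch(notch)
    extra_path_mm = tau * SPEED_OF_SOUND * 1000.0
    depth_db = comb_gain_db(notch, tau)
    print(f"notch {notch:.0f} Hz: delay {tau * 1e6:.1f} us, "
          f"extra path {extra_path_mm:.1f} mm, depth {depth_db:.1f} dB")
```

    Under these assumptions, dips at 6,500 Hz and 11,000 Hz correspond to reflection delays of roughly 77 and 45 microseconds, i.e. extra path lengths of about 1.5 to 2.5 cm, which is at least the right order of magnitude for the size of an outer ear.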
     
    I have trouble finding phones that image outside my head.  Even then, images in the middle third of my head also seem to go up.  I guess the way phones sit on my head, they just end up oriented that way.  What I have experienced is a distorted 2D image.  I might get some slight 3D effect with some binaural recordings.  Other than binaural, I can't say I have ever gotten 3D imaging over headphones.  What "up" aspect I get is clearly a distortion that is the same on all recordings.
     
    Binaural recordings done with a dummy head and ear should give you some 3D effect if the phones don't otherwise obscure the effect.
     
  3. castleofargh Contributor
    the HRTF stuff I'm partially using for my crossfeed (not mine but works better than nothing) shows a behavior similar to what @spruce is saying with sound right in front on the horizontal axis: a recess (about 5 dB, sometimes more) that appears at a higher frequency when the source of the sound is more elevated.
     
    the outer ear is clearly responsible for hearing position on a vertical axis. with IEMs, the brain can only try to find known patterns and interpret them as best it can. that's how I get pretty much all my music on headphones and IEMs with the phantom center elevated when it should be in front of me. it's a problem that doesn't exist for me with speakers or IRL sounds. having the sound coming at me at a 90-degree angle ruins my ability to correctly judge elevation, and as my HRTF is obviously different from the average head, model standards give me fairly crappy results.
     
    I will be able to tell more in a few months when I'll get a smyth realiser, as I plan to record everything and try to find patterns, at least to get answers for my own head.
     
     
    I remember watching some BBC stuff where the person got some blu-tack to change the shape of her outer ears, and then got herself trolled a few times by the other person clapping at different vertical positions (eyes closed obviously). I never tried that, might be worth the experience?
     
  4. Vidal
    Thanks for the responses guys, I've been straining to hear height and just not getting it. Looks like I can stop wasting my time at least with standard IEMs.
     
    When I first started properly listening I'd just get a left/right axis within my head along which the elements of the music would be distributed. As time has gone on I've got better equipment and found some good recordings that seem to have more 'hidden' clues with regard to position.
     
    I'm guessing my brain is interpreting that as depth, so I think I'm getting two axes. It's not as clear as real life; I'm having to really concentrate to get any perception of depth. Some of this is down to subtle variations in the volume of instruments.
     
    Would it be correct to say that people who 'perceive' height with IEMs are over interpreting? 
     
    (Thanks for the mention of HRTF, on the back of that I found a good article on how we interpret position)
     
  5. gregorio
     
    1. That's a very common circular logic fallacy amongst audiophiles. As others have pointed out, due to individual headphone colouration, individual HRTF and individual perception, the human brain can sometimes mis-interpret the relationship between frequency, phase and levels of the sounds/instruments which are actually in a particular mix and create an illusion of height information. The circular logic comes into play because if a group of audiophiles decide that they like this unintentional height illusion then: any equipment which seems to enhance this perception is better than equipment which doesn't, any mixes which seem to exhibit this phenomenon are better than those which don't and therefore, anyone who doesn't perceive this illusion either has poor equipment, poor hearing or both. The reason this is all a fallacy is that there is no height information in stereo. The equipment isn't "good" because it's revealing "hidden clues", because there are no hidden clues to reveal, just a fluke of the interaction of the elements in a mix with the headphones, HRTF and perception of each individual. We could just as easily say/decide that as there is no height info in a stereo mix, any equipment which creates an illusion of height is poor/bad rather than good, anyone who perceives it has flawed perception or any mix which exhibits it has a fault/problem.
     
    2. I don't think we can generalise in this way. Many people do experience the perception of "height" in some mixes with some headphones/IEMs, I certainly have. Some are probably experiencing it purely because of some cognitive bias; numerous testimonials and reviews by those who have experienced it (or believe they have) and the assumption that more expensive equipment must be better/more "revealing" are just two of many common biases which can lead to an "over interpreting" or even, the belief that "night and day" differences exist where in fact there are none.
     
    G
     
  6. Vidal
     
    Thanks for the input Gregorio
     
    So purely IEM, purely stereo music recordings there really is only a single axis of information getting to the ear?
     
    Depth (i.e. forward/back) and height are the brain's interpretation of the information it's being provided, when I personally think I hear depth it's not really there just my brain rationalising the sound as though it was in a 3D environment?
     
  7. RRod
     
    Our brain learns how to take the subtle variations in sound due to our ear/body shapes and extract a 3D sound "landscape" from our stereo ears. To the extent that a particular recording+IEM combination might, by happenstance, capture some of those variations, we can certainly perceive the sound seeming ahead/behind/above/below due to our brain trying to work-out an artificial sound field.
     
  8. gregorio
     
    It's not quite that simple I'm afraid. All of stereo is in effect an illusion created by the brain, even left/right on a single axis is an illusion because in practice we don't have a single axis, we only have two points (two speakers) and everything in a mix which appears to be positioned between the speakers is a consequence of the brain's interpretation. So a sound which appears to be in the centre (commonly the vocal for example) is not actually in the centre, it's only in the left and right speaker. If we make the timing and level of that vocal equal in both speakers and we're positioned equidistant from both speakers our brain won't identify the sound as being two separate sound sources coming equally from two speakers, it will interpret it as a single sound source coming from the centre point between the speakers. This is an illusion which we're obviously able to manipulate, if we make the level higher in the right speaker than the left and/or if we delay (by a few milliseconds) the vocal going to the left speaker, then the brain will still interpret this as a single source but a single source somewhere to the right of centre. How far right depends on how much higher the level (in the right speaker) and/or how much delay in the left speaker. It also depends of course on how far apart the speakers are placed and in the case of headphones, the "speakers" are effectively placed at the ultimate extremes of left and right.
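     
    The level/delay manipulation described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a mono signal held as a plain list of samples and a 48 kHz sample rate; the function name and the example values (6 dB, 1 ms) are made up for the sketch and are not taken from any particular mixing tool.

```python
SAMPLE_RATE = 48000  # Hz, assumed for the example


def pan_right(mono, level_diff_db=0.0, left_delay_samples=0):
    """Return (left, right) channels that place a mono source right of centre.

    The right channel keeps the original level; the left channel is
    attenuated by level_diff_db and delayed by left_delay_samples.
    With both parameters at zero the channels are identical, which is
    the "phantom centre" case.
    """
    left_gain = 10.0 ** (-level_diff_db / 20.0)
    # Delay the left channel by prepending silence.
    left = [0.0] * left_delay_samples + [s * left_gain for s in mono]
    # Pad the right channel at the end so both channels have equal length.
    right = list(mono) + [0.0] * left_delay_samples
    return left, right


# Example: a short click, 6 dB quieter and 1 ms later in the left channel.
click = [1.0, 0.5, 0.25]
delay = round(0.001 * SAMPLE_RATE)  # 1 ms expressed in samples
left, right = pan_right(click, level_diff_db=6.0, left_delay_samples=delay)
```

    Both channels still carry the same source; only the inter-channel level and timing differ, which is all the brain has to go on when it places the image somewhere between (or, on headphones, inside) the two "speakers".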
     
    Although it's a little more convoluted, we can also manipulate depth/distance. A sound which is distant will be quieter than one which is close, it will have less high frequency content (because high frequencies are absorbed by air) and it will have more reverb (reflections/echoes/the "sound" of the room acoustics). All of these we can change; we can lower the volume, roll off the high freqs with EQ and add reverb/echoes, thereby creating the illusion of distance on whichever instruments/sounds within the mix we want, and by combining some sounds which appear distant with others which appear closer we create a soundscape with depth.
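     
    As a toy illustration of those three distance cues (lower level, less high-frequency content, more reflected sound), here is a short Python sketch. It assumes a signal stored as a list of samples; the one-pole low-pass and the single echo are crude stand-ins for real EQ and reverb, and every name and default value is a hypothetical choice for the example.

```python
def push_back(signal, gain=0.5, lowpass_coeff=0.3, echo_delay=3, echo_gain=0.4):
    """Make a source sound 'further away' using three simple cues."""
    # 1) Quieter: scale every sample down.
    quieter = [s * gain for s in signal]

    # 2) Duller: one-pole low-pass, y[n] = a*x[n] + (1 - a)*y[n-1].
    #    A smaller coefficient removes more high-frequency content.
    dulled = []
    previous = 0.0
    for sample in quieter:
        previous = lowpass_coeff * sample + (1.0 - lowpass_coeff) * previous
        dulled.append(previous)

    # 3) More "room": add one delayed, attenuated copy (a single echo
    #    standing in for reverb). Output is echo_delay samples longer.
    out = dulled + [0.0] * echo_delay
    for i, sample in enumerate(dulled):
        out[i + echo_delay] += echo_gain * sample
    return out


# Example: process a single click (impulse).
near = [1.0, 0.0, 0.0, 0.0]
far = push_back(near)
```

    Mixing an untouched "near" source with a processed "far" one is the basic idea behind building a soundscape with depth; real mixes use proper EQ and reverb rather than these simplified stages.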
     
    The perception of height is a different kettle of fish though, it's a consequence of a mis-interpretation due to the (headphones') "speakers" being isolated from each other and at the most extreme left and right positions, a sound presentation which cannot exist in the real world. The illusion of height is therefore an unintended mis-perception caused by the brain trying to make sense of this impossible sound presentation, it's therefore a highly unpredictable illusion and one we can't manipulate with any certainty (and therefore we don't even try). This makes it quite different to the illusion of depth and left/right positioning, which work on both speakers and headphones, which mimic how sound behaves in the real world and are entirely predictable, manipulatable intended perceptions.
     
    G
     
  9. Vidal
    Gregorio thanks for that last post, that really helps me a lot. I just couldn't figure out how, to use your phrase, 'manipulatable intended perceptions' could be done for height.
     
    It's good to know my brain/ears are working fine when I perceive depth and left/right, but the fact that I'm not getting height isn't a failure of my listening abilities or equipment. 
     
    I'll stop spending now (I wish that was true).
     
  10. spruce music
    I had someone insist simple pair miking could provide height cues especially coincident pairs.  There might be something to that.  As coincident stereo pairs are offset vertically a small distance there could be comb filtering in the frequencies that give us a sense of height.  It wouldn't be accurate, but it might sometimes cause that sensation.  Of course recordings done that way commercially available are highly rare beasts.  I also suppose binaural done with dummy heads and outer ears at least similar to our own in theory should manage it though such recordings have never worked well for me done that way. 
     
  11. jgazal
    [Attached image: wp_ss_20170321_00012.png]

    Considering that the pinna shape more or less follows a common human pattern, although varying in size, and excluding ILD and ITD cues, how much do you believe the frequency peaks and dips depicted in the image above (spectral cues) deviate from person to person?

    If the peaks and dips stay more or less at the same frequencies, how much do you think they differ in dB?

    Do you think that playing a binaural microphone dummy head and torso recording (generic HRTF) through loudspeakers with crosstalk cancellation (second personal HRTF filtering) completely ruins the perception of elevation?

    Or does it just increase elevation perception errors (i.e. comparing the real position of the recorded source and the virtual elevation indicated by the listener)?
     
  12. spruce music
     This doesn't really answer your question exactly.  Just maybe informs how this all works for us.  Firstly yes different pinna shapes are different enough one person's filtering would be confusing to another.
     
    I wish I had a link to a copy of the article, but an experiment was done a few years back about this.  They measured the height and position perception for a large group of volunteers to document how accurate it was and whether people heard height and position accurately compared to each other (they did).  They then custom molded inserts to alter the pinna shape for each volunteer, and got them to agree to wear them for, I think, two weeks.  Or at least a few days.
     
    At first their height and position perceptions were all scrambled.  Not accurate at all, not consistent, just badly wrong.  I forget if it was 12 hours later or the next day, but anyway their auditory processing learned this new pattern in time.  At some point they could test height and position perception and do every bit as well as they initially could.  There were two surprising results.  They expected that when they removed the ear inserts the subjects would require hours to straighten it out again.  Instead, within minutes height and position were accurately perceived.  Almost as if the auditory system had kept one set of filters in memory.  Somehow it figured out "hey, we are back to this" and switched to the old filters.  So they put the ear inserts back in and, like the original condition, the subjects had excellent perceptive ability after minutes.  It was like the auditory portion of the brain had two different patterns for hearing position and height, allowing it to switch between them in only a couple of minutes as needed.
     
    So maybe if they did binaural recordings with a standard ear shape, and we all listened to enough hours (perhaps with video that accurately matched sound sources) we could all in time have this second pattern of pinna to work with and switch to for listening to music.  Of course you still have the issue of how replay over phones interacts with your real pinna to confuse the issue.
     
  13. castleofargh Contributor
    I'm confident that could be done, all the brain needs is training. the problem is that we don't train by listening to our favorite albums, because we never really know where the sound was supposed to be. but with exercises where we'd know where the sound should be(visual cues to assist would be the most effective as always), then we can learn many things.
    I think I mentioned this once, I've spent a few months using the tweeters of my laptop as only sound source while using external screen and keyboard, so the laptop was on the side. after some time(weeks) of watching some TV shows that way, I came to "feel" like the sound was correctly centered, and no longer was I annoyed by the guy talking in front of me and the sound coming from the side and below.
    of course my brain would switch calibration only when conscious that I was using that system, I didn't start hearing off centered cues for everyday sounds.  it's like learning to drive or to use rollerblades, there would be some serious issues if we couldn't rapidly revert to properly walking ^_^. we're amazing machines.
     
  14. ricosuave
    Would these be a possibility? https://www.indiegogo.com/projects/adel-drum-bionic-earbuds-music-headphones#/
     
  15. castleofargh Contributor

    no that doesn't address the problem at all. the adel stuff is sort of a hybrid between vented IEMs and sealed IEMs. think vented IEM with a little condom instead of the hole. so it's almost sealed but not solid sealed (I'd be curious to see the distortion figures with this system). personally, if I was really scared of pneumatic pressure from sealed IEMs, I'd use good old ear buds that don't seal anything. some happen to sound very nice but as they don't isolate it's meaningless in a noisy town so I don't use them.
    anyway I'm getting off topic ^_^, this does nothing for acoustic cues that should come from the shape of the ear/head/torso.
     
