How do we hear height in a recording with earphones

Discussion in 'Sound Science' started by vidal, Mar 20, 2017.
  1. jgazal
    I wish I could give you double reputation for your post.

    You clearly have a much larger repertoire on this subject than I do, and I appreciate your patience and your willingness to share your experience.



    You are right. There is no way to achieve xtc or beamforming without DSP.

    What I wanted to say is that such DSP may not need to be loaded with a custom-measured personal room binaural impulse response (PRIR) to perform that HRTF translation in the digital domain; part of the HRTF translation can occur naturally in the acoustic domain, because the listener hears through loudspeakers.



    AFAIK, the custom bacch-sp also acquires what I would call a PRIR and uses head tracking to allow the listener to turn his/her head.



    Perhaps I am reading this page and his FAQ too optimistically.

    Throughout those pages, although he compares the BA2L method to ambisonics and expressly uses the phrase "3D sound", he carefully avoids the word "elevation", as well as "error in elevation" and "elevation perception collapse".

    That's the main reason I keep asking whether anyone has compared the number of elevation errors listeners make with the bacch-sp versus ambisonics, exponentially increasing my post count in the process...
     
  2. pinnahertz
    If an HRTF translation is not made, two HRTFs are applied to what the listener hears: that of the recording head and his own, but with all sound arriving at his ears from only two very forward directions.  The confusion that results destroys the solidity and accuracy of the image and hurts it in all vectors.  Again, crosstalk comp will get you a huge spacious image outside of the speakers; it just won't be well defined, and it is very unpredictable.  
     
    The in-room PRIR would be very important too, as the room itself layers a second unrelated 3D acoustic space on top of the original.  
     
    These are precisely the kinds of things that confound 3D sound from speakers.  Even very rudimentary crosstalk comp, like the kind I played with using analog circuits decades ago, required consideration of the room. I had no means to convolve an impulse response, so all I could do was reduce the impact of the room.  I did that with acoustic treatment and by moving the speakers out into the room so that early reflections arrived at least 10ms behind the first arrival. That actually makes a huge difference, but the room is still in the equation.  If I were doing it now I'd start by comping out the room as part of the crosstalk comp (essentially some form of de-verb), and then go after comping out the listener's HRTF.  That would effectively put virtual headphones on the listener, so long as he didn't move even 1/8".  
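    For anyone curious what "crosstalk comp" means numerically: at its core it is a per-frequency inversion of the 2x2 matrix of speaker-to-ear transfer functions, so that each speaker's output cancels its leakage into the opposite ear. The sketch below is a toy illustration only, not pinnahertz's analog method or any product's algorithm; the ipsilateral/contralateral responses are made-up placeholders, and the regularization constant is an assumption to keep the inverse well behaved.

    ```python
    import numpy as np

    def xtc_filters(H_ii, H_ic, reg=1e-3):
        """Invert the 2x2 speaker-to-ear transfer matrix per frequency bin.

        H_ii: ipsilateral response (speaker to same-side ear), complex array.
        H_ic: contralateral response (speaker to opposite ear), complex array.
        reg:  Tikhonov regularization so the inverse stays bounded.
        Returns per-bin 2x2 filter matrices C such that H @ C is close to I.
        """
        n = len(H_ii)
        C = np.zeros((n, 2, 2), dtype=complex)
        for k in range(n):
            H = np.array([[H_ii[k], H_ic[k]],
                          [H_ic[k], H_ii[k]]])  # symmetric listening setup
            Hh = H.conj().T
            # Regularized inverse: (H^H H + reg*I)^-1 H^H
            C[k] = np.linalg.solve(Hh @ H + reg * np.eye(2), Hh)
        return C

    # Toy example: contralateral path is half as loud and slightly delayed
    freqs = np.linspace(1, 20000, 256)
    H_ii = np.ones_like(freqs, dtype=complex)
    H_ic = 0.5 * np.exp(-1j * 2 * np.pi * freqs * 0.0002)  # ~0.2 ms extra path
    C = xtc_filters(H_ii, H_ic)

    # Applying the filters should nearly diagonalize the system in each bin,
    # i.e. each ear receives only its intended channel.
    H0 = np.array([[H_ii[0], H_ic[0]], [H_ic[0], H_ii[0]]])
    ```

    The matrix inverse blows up at frequencies where the two paths become nearly identical, which is one reason real crosstalk cancellers need regularization and are so sensitive to head movement.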
     
    jgazal likes this.
  3. Russin
    My understanding is that part of the illusion has to do with our minds interpreting non-spatial information in a spatial way.
     
    For example - have you ever walked past an extremely well-damped surface on one side, and you can almost "feel" the wall next to you due to the way that it deadens sound?
     
    Our mind does this kind of thing automatically with various frequencies and other sound characteristics as a way to approximate our environment - the same way you get an idea of the shape and composition of a room based on the sounds you hear in it and your expectation of what the sound is "objectively."
     
    Your mind is trying to calculate the correct "position" of frequencies based on your life experience. Various things that are mixed into the track can impact the degree to which you can place them.
     
    Another example of how this is used - in many tracks, when a vocal is panned slightly, you offset it with quiet reverb on the other side. The tonality and character of this reverb set the "size" of the virtual room your mind is creating.
     
    Amazing things, brains.
     
    Disclaimer - not an expert.
     
  4. pinnahertz
    Agreed that the brain can interpret non-spatial information as spatial. That's why you can get a sense of depth and size from a mono recording that technically presents no binaural spatial information at all. However, walking past an acoustic absorber and sensing its presence is real spatial perception, not interpreted.
    Not life experience alone, though.  The spatial perception ability is designed in and at least partially pre-programmed: a newborn baby readily turns its head toward its mother's voice, with no life experience to depend on.
    Hmmm...I don't think I can agree with "many tracks", though. An off-center vocal would be relatively rare, and a lopsided reverb even more so, unless we're listening to some very early stereo recordings made before we had the art figured out. However, modern reverb programs have all sorts of dimensional adjustments available that include room size, surface reflectivity, spectral decay, and on and on...
    Yes, I too am amazed constantly. The more I learn about sensory perception, the more amazed I am. Incredible design, and so far we haven't come close to replicating it.
     
  5. Russin
     
     
    Sorry for any confusion. I was giving the example of the wall as something that is in fact spatial, to try to demonstrate to the OP what our brains are trying to "figure out" when we hear various acoustic elements in a recording.
     
    Yes, you are absolutely right that many aspects of our perception are, in fact, innate. Super interesting field of study trying to figure out which ones.
     
    Regarding vocal panning - I'd be hard-pressed to find any song containing back-up vocals that had them aligned dead-center along with the lead vocals, and I find it quite common for vocals to be arranged spatially. Perhaps we are talking about different things? You can find many examples of this in common music; the panned reverb in Queen comes to mind as a particularly common and noticeable one.
     
  6. pinnahertz
    All you had to do was say "back-up" vocals, and I would have agreed.  Not about asymmetrical reverb, though.  To pan reverb it has to show up as a mono signal.  Reverb has been stereo for a very long time, the use of it as a mono source would be limited to the rare, amateur, or a specific special effect.  Possibly the Queen track was the latter. 
     
  7. Russin
     
    Ah yea - playing fast and loose with the vocab. I can be more specific, sorry about that.
     
    Panning/weighting of stereo reverb does seem to show up in the wild, but maybe I'm just being naive about how common it is. It's easy enough to find tutorials and how-tos that use reverb panning, EQ, and delay as aspects of spatial placement (e.g. http://designingsound.org/2012/12/panning-reverb-returns/).
     
    Perhaps I'm just handling this from too bare bones of a perspective and you're referring to the fact that many complex stereo reverb plugins perform similar functions automatically?
     
  8. Vidal
     
    I am still reading this thread, although some of it's way beyond my understanding and I'm having to Google stuff just to keep up. 
     
  9. theheightisreal
    Listen, I'm not a scientist or a gearhead, nor is my anecdotal evidence definitive proof, but I have always heard "height" in recordings, and I haven't heard this from other people so I know I'm not being influenced by others. I arrived here by googling, and this is a direct quote, "how do recordings sound like different heights"
     
    I have noticed height information in random songs (heard with earphones and headphones), but I've never taken the time to write them down because I didn't know it was a thing (much less that it could be disputed). Maybe an instrument or two sounded like they were physically above the rest, stuff like that.
    Now, perhaps what I perceive is a misrepresentation and there's no way to manually create the illusion of height, but the fact remains that, intentional or not, I hear said height.
    The graphic posted a few replies back with the frequency dips suggests that there is scientific basis for my (and others') claim.
     
    The first time I remember noticing this was over ten years ago with a car stereo (yeah) that had a setting called DSO (dynamic stage organizer) with about 3 "heights." It was surreal toggling between them and hearing the band change location. If you're skeptical about height and you ever ride in a car that has this feature, try it out.
     
  10. castleofargh Contributor

    nobody is denying that we can feel height listening to tracks. the issue here is to control what people will perceive. most of the albums where you get that piano slightly elevated, the drums way up there and stuff like that, well that information wasn't recorded on purpose. most likely the instrument was recorded in mono, and then mixed and only panned left and right when put down to stereo. the 3D sound we get sometimes with headphones is in the end a mistake. a fun and most of the time enjoyable mistake, but a mistake still. and the exact place where you're hearing an instrument is likely not where I would hear it with the same gear.
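    To make concrete what "recorded in mono, then panned left and right" involves: the usual tool is a constant-power pan law, which only sets a level ratio between the two channels. The sketch below is a generic textbook pan law, not anything from a specific mixing console; notice that nothing in it encodes elevation, which is why any height you hear from such a mix is an accident of your own ears.

    ```python
    import numpy as np

    def constant_power_pan(mono, pos):
        """Standard constant-power pan law.

        pos: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
        Only the left/right level ratio changes; there is no elevation
        information anywhere in this operation.
        """
        theta = (pos + 1.0) * np.pi / 4.0   # map [-1, 1] onto [0, pi/2]
        return np.cos(theta) * mono, np.sin(theta) * mono

    sig = np.ones(8)                        # placeholder mono "instrument"
    left, right = constant_power_pan(sig, 0.0)   # panned dead center
    # each channel carries ~0.707 of the signal, so total power stays constant
    ```

    Constant power (cos²+sin²=1) is used instead of a plain linear crossfade so the source doesn't dip in loudness as it sweeps across the stereo image.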
     
    I was lost in this myself not long ago (and a few things are still not clear TBH): http://www.head-fi.org/t/796868/frontal-sounds-go-up-instead-of-further-in-front-of-me-with-headphones-why
    sure enough I perceive height with headphones; I even perceive it when I shouldn't and don't want to.
    since then, I've fooled around with HRTF measurements from other people because I don't yet have the tools to really use custom measurements, and managed to slightly improve my own situation. center sounds were on my forehead or on top of my head most of the time; now it's more at eye level, but still in my face.  and I found a way to push it away from me, but then the little mono mustard starts climbing again. hopefully the smith realiser will solve this once and for all for me. but vertical cues in a typical stereo album aren't really a thing from the recording and mixing/mastering point of view, and that's what our friends here were talking about.
     
    Vidal likes this.
  11. stalepie
    I don't hear much height with headphones, or depth. For instance, with ASMR videos of head scratching/tapping/massages, I have to stare at the video to believe it. If I just listen while reading other tabs (such as this Head-Fi thread) I lose a lot of the spatial awareness. In this video, as her tapping moves close to the ears it does indeed sound like it's on the ears, but in other areas, like right on top, it sounds more in-the-head or mono: https://youtu.be/4ghkeIk2tnQ?t=1882
     
    Fortunately, with music you are not usually listening to a band or concert staged much higher or lower than you, or all around you (surround sound). 
     
  12. Computerpro3
    I attended a performance of Brahms' 2nd concerto today.  I happened to have absolutely perfect seats - first row balcony, dead center of the stage.   I shut my eyes and concentrated on evaluating the sound as I would through headphones.  I was quite surprised to hear not only depth but height - false height, technically - in the sound.  I could hear the piano and certain instruments higher than they actually were... a good-sounding, but "fake", soundstage.   I believe a lot of these cues have to do with the venue of the recording: I heard them in real life, so it makes sense that microphones would pick them up.  A lot of the sound is reflecting off acoustic panels and the ceiling.  And given that my seats were elevated (balcony), the performers weren't directly in front of me anyway - which is something to think about when considering what is "realistic."    
     
    spruce music likes this.
  13. Vidal
     
    But surely if you are listening live you're getting positional reflections from your own outer ear and head structures that would not be captured by the microphones in a recording?
     
    Plus when listening to a recording with earphones/headphones these structures are bypassed again. 
     
  14. pinnahertz
    What you heard was neither false nor fake, it was the real 3D soundfield arriving at your ears that was created by the sound from the instruments reflecting around a venue designed specifically to do that.  You experienced and heard exactly what was in the room, exactly at your seat.  A good concert hall will have lots of those reflections, by intent and design.  That's the "soundstage" that classical music attempts to capture, though usually from a much closer position to the orchestra.  There may have been a bit of disconnect from what you saw and heard, but that happens with reflections of sound all the time. 
     
    For example, you are in a mountain valley and someone demonstrates an echo.  He's far enough away that you don't hear much of his direct sound, but you hear his voice reflecting off a rock face across the valley.  It sounds like he is on the rock face.  It's not fake or false; that's the real acoustic environment you are in, and the real echo you are hearing.  The sound is misleading as to the guy's location, but that is how sound works, and pretty often.  
     
    But, that same 3D space you heard the Brahms in is extremely difficult if not impossible to replicate with a pair of headphones.  Because of that, while some may experience a sense of height in a recording played on headphones, it's accidental, and has little relation to the actual acoustic event being recorded because the recording method is largely height-blind.  
     
  15. jgazal

    I was hesitant to post this in the Realiser thread, but since you mentioned such recording blindness and this is a sound science forum, I believe this is a better suited thread.

    In my quest to understand the constraints of both i) binaural HRTF content filtered by a personalized HRIR* (PRIR**) and output as a binaural stream for headphones, and ii) higher order Ambisonics (HOA) content transcoded with PRIR parameters to a binaural stream for headphones, I have just found this interesting thesis by a Norwegian author discussing one of the available methods for implementing the latter (ii), although it uses a generic HRTF from a Neumann KU-100 without a torso:

    Binaural Reproduction of Higher Order Ambisonics - A Real-Time Implementation and Perceptual Improvements - Jakob Vennerød - Norwegian University of Science and Technology
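    One common way to binauralize ambisonics, discussed in that literature, is the "virtual loudspeaker" approach: decode to a ring of virtual speakers, then convolve each feed with the HRIR pair for that speaker's direction and sum. The sketch below is my own heavily simplified illustration, not the thesis's implementation: it is first-order and horizontal-only, uses a basic sampling decoder with normalization conventions glossed over, and the delta "HRIRs" are placeholders where a real system would use measured pairs (e.g. from a KU-100).

    ```python
    import numpy as np

    def foa_to_binaural(W, X, Y, speaker_az, hrirs_l, hrirs_r):
        """Decode horizontal first-order ambisonics (channels W, X, Y) to
        virtual loudspeakers at azimuths speaker_az (radians), then
        binauralize each feed with that direction's HRIR pair and sum."""
        n_out = len(W) + len(hrirs_l[0]) - 1
        left = np.zeros(n_out)
        right = np.zeros(n_out)
        for az, hl, hr in zip(speaker_az, hrirs_l, hrirs_r):
            # basic sampling decoder gain for each virtual speaker
            feed = 0.5 * (W + X * np.cos(az) + Y * np.sin(az))
            left += np.convolve(feed, hl)
            right += np.convolve(feed, hr)
        return left, right

    # Four virtual speakers; delta "HRIRs" stand in for real measured pairs.
    az = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
    delta = np.zeros(32); delta[0] = 1.0
    s = np.random.randn(256)                  # a source encoded straight ahead
    L, R = foa_to_binaural(s, s, np.zeros_like(s), az, [delta] * 4, [delta] * 4)
    # with left/right-symmetric HRIRs, a frontal source reaches both ears equally
    ```

    Adding elevation means moving to full 3D spherical harmonics and a speaker layout covering the sphere, and how well elevation survives that pipeline is exactly the comparison I would love to see tested against bacch-sp.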

    Dr. Stephen Smyth mentioned that they will provide an Ambisonics decoder in the mid term, written by a third party. It is not clear yet how many channels it will decode to, but it may be less than the available 16 channels.

    Since they postponed the Realiser delivery date, I hope they can write and improve the Ambisonics decoder before they ship the first units. But in his announcement Dr. Stephen only mentioned a probable 4k HDR HDMI improvement.

    It is a pity, because the curiosity of comparing both methods is just killing me.

    *HRIR - head related impulse response
    ** PRIR - personalized room impulse response
     
