The Science Of Soundstage
Sep 21, 2021 at 11:53 AM Post #16 of 81
Headphone spatiality is difficult, because all errors in it are fed directly into our ears. There is nothing in between to mitigate errors or create real spatial information, as is the case with speakers. The most convincing soundstage with headphones happens with binaural sound. Unfortunately binaural sound is not common, because almost all stereophonic music has been produced primarily for speakers, and binaural recordings don't work well with speakers. The secret of binaural recordings is the correct, or near correct, combination of many, many spatial cues.

EQ may help with soundstage (if it can attenuate spatial problems), but it is very limited. Often the wideness of headphone sound is ruined by excessive spatiality targeted for speakers. When the level difference between the ears (ILD) is too large compared to natural levels, spatial hearing is likely to deduce that the sound is very near one ear (the side with the higher level), because that would explain the level difference. Reducing ILD to natural levels helps spatial hearing believe the sound is further away, especially if other spatial cues support it (for example, a high reverberation level compared to the direct sound). So, more channel separation can actually make the headphone soundstage narrower! On speakers the effect is the opposite, because the listening room regulates spatial cues to natural levels. Instead of increasing channel separation, the target should be natural levels. That's when we get the "widest" headphone sound; going over or under that "sweet spot" makes the sound narrower, just in different ways (the sound is outside or inside the head).
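The "reduce ILD toward natural levels" idea above can be sketched as a simple mid/side width control. This is a generic stereo-width technique, not anything from a specific product; the function name and the example width factor are purely illustrative:

```python
def set_stereo_width(left, right, width):
    """Scale channel separation (and hence ILD) via mid/side processing.

    width = 1.0 leaves the signal unchanged; width < 1.0 shrinks the
    level difference between the channels; width > 1.0 exaggerates it.
    """
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # content common to both channels
        side = 0.5 * (l - r)   # inter-channel difference
        side *= width          # shrink or grow the separation
        out_l.append(mid + side)
        out_r.append(mid - side)
    return out_l, out_r

# A sample hard-panned fully left, pulled halfway toward the centre:
l2, r2 = set_stereo_width([1.0], [0.0], 0.5)  # -> ([0.75], [0.25])
```

Note this scales all frequencies equally, whereas the post argues that only low-frequency ILD really needs taming.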

Spatial hearing looks at many spatial cues and tries to come up with an interpretation that makes sense. If a recording has a huge ILD (indicating the sound is near the listener) and also a strong reverberation level compared to the direct sound (indicating the sound is far away), spatial hearing may struggle a lot to make head or tail of it. This struggle even causes listening fatigue for some listeners, including me.

Tonal accuracy is partly a myth. Sounds don't need to have tonal accuracy. Instead, sound reproduction needs to be transparent. In real life sound gets spatially coloured, and that colouration is the information about the kind of existence the sound has. Does the sound happen in a church? Or in a forest? Or in your living room? Our hearing expects the sound to be coloured. That colour equals the physical existence of the sound. Too much colour is called bad acoustics.

Cross feed, with a time delay, might help to increase the impression of width in the soundstage. In my own tests with closed headphones though, where I simply mixed some of the left and right channels together without a delay, it actually seemed to make the soundstage a bit narrower. Though it probably did make things seem a bit farther away in the Z-axis.

I'm sure you're correct though that the spatial cues are very important. And if you are listening to open headphones in a fairly reflective or reverberant room, then you may get a more realistic cross feed from the headphones themselves in some of the higher frequencies, that is more delayed in time. As well as the ambient sounds of the room, which will also be different in each ear. And all of this will probably contribute in some ways to your perception of greater width/spaciousness in the sound.

I am not an expert on any of this though. So I may be completely off base on all of this.
 
Sep 21, 2021 at 12:53 PM Post #17 of 81
Cross feed, with a time delay, might help to increase the impression of width in the soundstage. In my own tests with closed headphones though, where I simply mixed some of the left and right channels together without a delay, it actually seemed to make the soundstage a bit narrower. Though it probably did make things seem a bit farther away in the Z-axis.
Yes, a correct delay (ITD) is needed. For sounds coming from 90° left or right this delay is about 640 microseconds. For sounds coming from about a 30° angle (mimicking speakers) the delay is about 200-250 microseconds, which is a typical value for cross-feeders (at low frequencies). The amount of sound to be crossfed is strongly frequency-dependent. At high frequencies high ILD is OK and even desired, while at low frequencies large ILD is spatial poison! This is why headphone spatiality is so complex. So many things matter. Loudspeaker spatiality is more forgiving, because the room acoustics regulate/shape spatiality.
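Put together, the numbers above (a ~200-250 µs interaural delay, crossfeed confined to low frequencies, some attenuation) suggest a toy crossfeed like the following Python sketch. The cutoff and gain values are illustrative choices of mine, not taken from any published crossfeed design:

```python
import math

def crossfeed(left, right, fs=48000, delay_us=250, cutoff_hz=700, gain_db=-4.5):
    """Toy crossfeed: feed a delayed, low-pass-filtered, attenuated copy
    of each channel into the opposite ear.

    delay_us approximates the ITD for a ~30 degree source; the one-pole
    low-pass confines the crossfeed to low frequencies, where large ILD
    sounds unnatural. All parameter values are illustrative.
    """
    delay = max(1, round(fs * delay_us / 1e6))   # delay in samples (~12 at 48 kHz)
    gain = 10 ** (gain_db / 20)                  # dB -> linear
    a = math.exp(-2 * math.pi * cutoff_hz / fs)  # one-pole low-pass coefficient
    out_l, out_r = [], []
    lp_l = lp_r = 0.0
    for i in range(len(left)):
        # delayed opposite-channel samples (zero before the delay elapses)
        dl = left[i - delay] if i >= delay else 0.0
        dr = right[i - delay] if i >= delay else 0.0
        # low-pass the delayed signals
        lp_l = (1 - a) * dl + a * lp_l
        lp_r = (1 - a) * dr + a * lp_r
        out_l.append(left[i] + gain * lp_r)   # right leaks into the left ear
        out_r.append(right[i] + gain * lp_l)  # left leaks into the right ear
    return out_l, out_r
```

Feeding an impulse into the left channel only, the right output stays silent for the first ~250 µs and then carries a quieter, duller copy, which is the delayed, frequency-dependent leakage described above.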

I'm sure you're correct though that the spatial cues are very important. And if you are listening to open headphones in a fairly reflective or reverberant room, then you may get a more realistic cross feed from the headphones themselves in some of the higher frequencies, that is more delayed in time. As well as the ambient sounds of the room, which will also be different in each ear. And all of this will probably contribute in some ways to your perception of greater width in the sound.
Even open headphones have very little acoustic cross-feed, and it is "enough" only at high frequencies, if at all. At low frequencies the leakage is nowhere near strong enough. One idea would be to make the "headband" of the headphone hollow so that the sound could leak through it to the other ear. Some damping material inside the tunnel would prevent resonances and too much leakage at high frequencies.

I think environmental sound can "augment" the spatiality of recordings by adding real-life spatiality that the ear can't tell apart from the recorded sound, but it requires noticeable environmental sounds, which is usually unwanted (that's why people use closed noise-canceling headphones in noisy places).

I am not an expert on any of this though. So I may be completely off base on all of this.
I don't think you are completely off. You seem to have a pretty good understanding.
 
Sep 21, 2021 at 1:45 PM Post #18 of 81
I have listened to binaural recordings, just listened to some last night actually, and they are no different to me at all. There was no added depth or width that I could perceive. I have heard normal recordings do a better job if mixed correctly. I tried open and closed backs because some say the effect doesn't work well with open back headphones. Still no difference to my ears. I have to assume it has something to do with the drivers or how they are aligned? Much in the same way it works with speakers. With speakers you can change the soundstage by putting them at different distances, closer or farther to the listener, closer or farther apart, and messing with the toe-in. I think with headphones they either have a wide soundstage or they don't. You really can't trick them into doing it. Even with DSP or those gaming headsets with so-called 7.1 mode. I have heard certain amps that can slightly increase the soundstage IF the headphones already have a wide soundstage to begin with. But not by much. It maybe increases the bubble by a few inches or so. And it is entirely possible that could be a mental thing. With a blind A/B test I might not be able to tell the difference. I know that for me my Asgard 3 amp and Modi 3+ seem to add a bit more width and depth with my Sundaras. And with my K371s the imaging seems to be more precise. Again, I am not ashamed to admit that it could all be in my head. Just that is how I perceive it.

I have spent a good 30 years trying to find a headphone setup that could come close to what speakers can do. Haven't found it yet. And from everything I have seen, heard or read, I won't, because it just can't be done. They are just two different things. And then when you add Dolby Atmos or DTS Neural:X into the mix it is a completely different ball game.

And the thing is, some people like the closeness of headphones; they prefer it. For whatever reason they don't want or like a really open soundstage. And that is cool, you do you. Headphones can do things a lot of speakers can't, at a much lower price point. And without bothering other people around you.
 
Sep 21, 2021 at 2:41 PM Post #19 of 81
It has to do with the physiognomy of your body. The way you perceive directionality of sound is connected to the shape of your shoulders, ear canals and head. Every person is different, so it's impossible to come up with a one size fits all binaural recording.

I'm with you. Binaural does nothing much for me either. And the only way for headphones to sound the same as speakers in a room is to calibrate to your own physiognomy and perform fairly complex processing to recreate the reflections and delays in sound as it bounces around a room. That requires a computer of some sort.
 
Sep 21, 2021 at 3:24 PM Post #20 of 81
A lot of people don't know what soundstage is, so I'll make this thread for y'all to discuss. Soundstage is the perception of width. Things sounding far away in the presence region. It is desired because that is how the human ear hears things. The human ear prefers V-shaped sound. Here is a tutorial on how to create soundstage via examples. Let's start with the legendary HD800S:
[image attachment]

Here is an example of the HD600 having no soundstage:
[image attachment]
You will notice the treble is directly competing with the high-mids presence region. In fact the Sennheiser treble veil, combined with the presence being more prominent than the treble, leads to the complete lack of soundstage. So what is soundstage? The high mids being scooped out in relation to treble emphasis. Also, a bass boost can help with soundstage, as a bass boost leads to the proximity effect of being close to a sound source. That, combined with a presence scoop, adds to the soundstage. V-shaped headphones inherently have soundstage, and this is how the human ear hears sound:
[image attachment]
Thanks for reading.

[image attachment]
Instead of trying to turn correlation into causation when the evidence is thin, let's think about how sound is altered IRL when it comes from further away. The sound gets quieter with greater distance. And high frequencies end up being attenuated more than low frequencies.
Of course that is meaningful in magnitude only for decent distances, which are not what typical stereo albums and headphones are going to give you anyway. So I'm perfectly fine expecting that a different process might be involved for perception of small distances (besides sight, head movement, and whatnot).
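For the broadband-level part of the argument, the free-field point-source (inverse-square) relation gives a quick sense of the magnitudes involved. This deliberately ignores air absorption (the mechanism that attenuates highs more than lows), and "free field" is itself an idealization:

```python
import math

def level_drop_db(d_near, d_far):
    """Level change in dB when a point source in a free field moves
    from d_near to d_far (same units). Positive result = quieter."""
    return 20 * math.log10(d_far / d_near)

# Doubling the distance costs ~6 dB:
print(level_drop_db(1.0, 2.0))   # ~6.02 dB
# From 20 cm to 1 m (the two distances in the paper's figure):
print(level_drop_db(0.2, 1.0))   # ~13.98 dB
```

This supports the point above: the centimetre-scale "distances" inside a headphone stage produce level changes far too small to be the whole story.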
The issue now is that with short distances, we cannot ignore HRTF. I couldn't find the paper I know on this, but there is a small sample graphed in that one:
Near-field head-related transfer-function measurement and database of human subjects
see fig3
https://asa.scitation.org/doi/pdf/10.1121/1.5027019

[screenshot: Fig. 3 from the linked paper]

This would be what interests us as you discuss width. Basically you match the first and third graphs to see the variation between the two distances (0.2 m and 1 m). It doesn't seem to agree with your proposition. But then again, 0.2 m already seems quite far for the perceived width of most headphones. So maybe we'd need measurements at smaller distances (which would surely show even bigger disparities between subjects...).


As an anecdote, I do feel more width in music if I EQ my headphone as an HD800 (or actually use one, but that doesn't help, as an HD800 is much more than an FR). I can't stand that tuning!!! It completely ruins whatever perceived tonal balance I like/feel to be correct, but I do feel like the fully panned instruments are a little further away from me.


I started reading that post thinking it was nonsensical fantasy. Then the more I read others saying it for me, the more I wondered if maybe there was something to it despite the argued causes being rather suspicious^_^. The brain is a really strange machine.
 
Sep 21, 2021 at 3:45 PM Post #21 of 81
I have listened to binaural recordings, just listened to some last night actually, and they are no different to me at all. There was no added depth or width that I could perceive. I have heard normal recordings do a better job if mixed correctly. I tried open and closed backs because some say the effect doesn't work well with open back headphones. Still no difference to my ears. I have to assume it has something to do with the drivers or how they are aligned? Much in the same way it works with speakers. With speakers you can change the soundstage by putting them at different distances, closer or farther to the listener, closer or farther apart, and messing with the toe-in. I think with headphones they either have a wide soundstage or they don't. You really can't trick them into doing it. Even with DSP or those gaming headsets with so-called 7.1 mode. I have heard certain amps that can slightly increase the soundstage IF the headphones already have a wide soundstage to begin with. But not by much. It maybe increases the bubble by a few inches or so. And it is entirely possible that could be a mental thing. With a blind A/B test I might not be able to tell the difference. I know that for me my Asgard 3 amp and Modi 3+ seem to add a bit more width and depth with my Sundaras. And with my K371s the imaging seems to be more precise. Again, I am not ashamed to admit that it could all be in my head. Just that is how I perceive it.

I have spent a good 30 years trying to find a headphone setup that could come close to what speakers can do. Haven't found it yet. And from everything I have seen, heard or read, I won't, because it just can't be done. They are just two different things. And then when you add Dolby Atmos or DTS Neural:X into the mix it is a completely different ball game.

And the thing is, some people like the closeness of headphones; they prefer it. For whatever reason they don't want or like a really open soundstage. And that is cool, you do you. Headphones can do things a lot of speakers can't, at a much lower price point. And without bothering other people around you.
The silly expensive solution for speaker simulation:
https://www.head-fi.org/threads/smyth-research-realiser-a16.807459/

The free app that requires binaural mics to make measurements(no head tracking):
https://www.head-fi.org/threads/recording-impulse-responses-for-speaker-virtualization.890719/

That might still not make you feel exactly like you're using speakers, because you'll feel the headphone on your head, you might mess up the measurements, you might need to actually see speakers for your brain to be tricked, and of course a headphone won't shake your lungs with sub frequencies (although you can purchase a so-called "shaker" you wear or put on your chair). So it wouldn't be right to claim that this stuff will fool you for sure. But it sure has fooled most people who tried it (and the A16, adding some variables, logically has an increased chance of tricking you).
The secret is personal measurement of sounds altered by your own head, ears and shoulders, as suggested by @bigshot. Being unique sucks in this particular case, as it makes generic solutions inadequate for almost everybody. IMO, getting some sort of head tracking can really bring you halfway there, but then I know of other people who turned it off on their A16 because they found that it didn't add much... So again, being unique is the one thing to always keep in your mind for such topics.
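The core of this kind of speaker virtualization is convolving each speaker feed with the impulse responses measured from that speaker to each of your ears. Here is a bare-bones sketch of that idea; the `brirs` dict layout is a hypothetical convention for this example, not any standard file format, and real products add head tracking, resampling, and much more:

```python
def convolve(x, h):
    """Direct-form FIR convolution (O(N*M); fine for a sketch)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def virtualize(left, right, brirs):
    """Render a stereo signal for headphones through measured binaural
    room impulse responses (BRIRs).

    `brirs` maps (speaker, ear) -> impulse response, e.g.
    brirs[('L', 'left')] is the path from the left speaker to the left
    ear. Assumes both channels and all four IRs have matching lengths.
    """
    out_left = convolve(left, brirs[('L', 'left')])
    out_right = convolve(left, brirs[('L', 'right')])
    # Add the right speaker's contribution to each ear.
    for i, v in enumerate(convolve(right, brirs[('R', 'left')])):
        out_left[i] += v
    for i, v in enumerate(convolve(right, brirs[('R', 'right')])):
        out_right[i] += v
    return out_left, out_right
```

With trivial one-sample IRs (1.0 on the ipsilateral paths, 0.0 on the contralateral ones) the output equals the input, which is a handy sanity check; real BRIRs are thousands of samples long and would normally be convolved via FFT instead.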
 
Sep 21, 2021 at 4:11 PM Post #22 of 81
Apple integrated head tracking into their Dolby Spatial Audio, but I find it only pans things along the axis between my ears, I can't use it to measure distances by sound and locate the source the way a deer in the forest can detect a predator at a great distance. (I have to use the analogy to describe that. I don't know what the actual technical term for it is.)
 
Sep 21, 2021 at 4:49 PM Post #23 of 81
I have spent a good 30 years trying to find a headpone set up that could come close to what speakers can do. Haven't found it yet. And from everything I have seen, heard or read I won't because it just can't be done. They are just two different things. And then when you add dolby atmos or DTS neural X into the mix it is completely different ball game.
YEP, different. Headphones can give miniature soundstage, not loudspeaker soundstage. It is like expecting a 32" TV to give the same picture as movie theatres.
 
Sep 21, 2021 at 4:58 PM Post #24 of 81
I have a set of bass shakers. They are OK, but not as good as a subwoofer. They do a decent job when combined with subwoofers though, to extend the range down below 20 Hz. It's a cool effect, but not the same as an actual sub that will play down to 15 Hz. Like you said, they won't move your hair or suck the air out of your lungs like my large 15 inch sub will, lol. It takes some fiddling with crossovers and EQ though. Otherwise you can end up feeling deep voices in your butt, lol. Haven't tried it yet with my headphones but I am thinking about it. Was actually just sitting here thinking about it before I read this. I just have to get the adapters and whatnot to use it with a 1/4 inch setup. I wish we still had a Radio Shack. I miss that place.
 
Sep 21, 2021 at 5:30 PM Post #25 of 81
Voices in my butt could keep the voices in my head company.
 
Sep 22, 2021 at 7:47 PM Post #26 of 81
I often get soundstage and imaging confused. When I think of stereo imaging though, and the relative positions of sounds from left to right, I mainly think of driver symmetry (or left/right balance), and also the accuracy/neutrality of the frequency response. Some headphones seem to have clearer imaging than others though. So distortion and maybe also impulse response may also get into the act on that. (Edit: the video below also mentions differences in phase on this.)

Rtings has tests for both passive soundstage and imaging. I'm not sure that I agree with their definitions, or really even know enough about them to comment. But here is the way they describe the two...

SOUNDSTAGE:
Soundstage determines the space and environment of sound, as created by the headphones. That is, it determines the perceived location and size of the sound field itself, whereas imaging determines the location and size of the objects within the sound field. In other words, soundstage is the localization and spatial cues not inherent to the audio content (music), and headphones have to 'create' them rather than 'reproduce' them. This differs from imaging, which is the localization and spatial cues inherent to the audio content.
https://www.rtings.com/headphones/tests/sound-quality/passive-soundstage

IMAGING:
Imaging determines where, how far, and how wide each object should be in the stereo image. That is, it controls the location, transparency, and stereo balance of objects in the mix, as intended by the audio source. In other words, imaging is the localization and spatial cues inherent to the audio content that loudspeakers/headphones have to "reproduce" rather than "create". This differs from soundstage, which is the localization and spatial cues not inherent to the audio content.
https://www.rtings.com/headphones/tests/sound-quality/imaging

More on the differences from Sam at Rtings...

 
Sep 22, 2021 at 8:03 PM Post #27 of 81
You have it backwards. There is soundstage in speaker systems, and it corresponds with your definition of imaging. That is the definition sound engineers use.
 
Sep 22, 2021 at 8:10 PM Post #28 of 81
Here is another 2-3 minute explanation of the differences by DMS. This is mostly subjective.

 
Sep 22, 2021 at 8:22 PM Post #29 of 81
Headphone audiophiles use the term wrong. It isn't uncommon, but that doesn't mean they aren't wrong. They make up their own meanings for words. And then the ones who know even less pick it up and repeat it. That's why I don't put much stock in audiophile advertorial.

The term goes back to the early days of stereo. John Culshaw was the producer who pioneered the idea of a mix that placed the instruments and sound sources across the left/right plane to simulate a concert stage, in opera recordings from the 50s.
 
Sep 22, 2021 at 8:25 PM Post #30 of 81
You have it backwards. There is soundstage in speaker systems, and it corresponds with your definition of imaging. That is the definition sound engineers use.

Perhaps. Maybe it works differently with speakers than with headphones though.

I associate the accuracy and precision of the stereo imaging primarily with the symmetry of the drivers though, and to a lesser extent the neutrality/accuracy/extension of the FR. And the clarity of the imaging with things like distortion, and maybe also impulse response. Though the latter may have more to do with soundstage and spatial cues.

I think of stereo imaging in much the same way that DMS described, as the ability to accurately locate the positions of sounds along an axis that extends from left to right. Whereas I think of soundstage as more of a spatial quality. The two may be intertwined though.

The rubber band analogy in DMS's video was sort of interesting. I think what it is implying is that a broader sense of soundstage will basically stretch the stereo image out, making it seem wider, and possibly also deeper.
 
