Is soundstage actually detrimental to spatial audio?
Sep 17, 2019 at 8:18 PM Post #76 of 162
So it snaps to the next angle every 15 degrees? Does it crossfade from angle to angle? I don't see how you could get a very precise front soundstage if it didn't. It's just a guesstimate, but I would say that with a clearly recorded soundstage, I can probably discern twice that resolution... 6 or 7 degrees. I'm thinking of recordings where I can tell that the flutes are to the left of the oboes... that kind of thing.

Can it do more than 5.1 or 7.1? I can do a real physical surround speaker setup, so there's no need to synthesize that. But it would be interesting if they could do Atmos object-based audio. I suppose it's way too expensive for anyone to mix music that way. Maybe someday.
 
Last edited:
Sep 17, 2019 at 8:52 PM Post #77 of 162
So it snaps to the next angle every 15 degrees? Does it crossfade from angle to angle? I don't see how you could get a very precise front soundstage if it didn't.
They interpolate in between the measured angles. At first I wondered exactly how; I thought it couldn't be a simple weighted average of the 2 measured impulse responses, because that would boil down to panning between 2 virtual speakers. Then someone posted a link to a document describing Smyth patents, and that cleared it up:
(If I understood correctly): First they strip the initial delay from each impulse response (so effectively you only keep the HRTF filtering effect, which implicitly also contains the interaural level difference, but not the timing). Then they interpolate the two stripped impulse responses (this time as a simple weighted average). And then, based on the actual head position, they re-calculate and re-add the initial delays (thus also respecting the interaural time difference).
(If you think about it carefully, you will understand that this works correctly for the direct-sound portion of the speakers, but that reflections and reverberation could be messed up a little bit timing-wise. Maybe that could explain why some people have reported hearing tiny differences between the real soundstage and the virtual soundstage. A smaller "inter-measurement angle" would decrease this issue, I assume.)

[Edit: here is the patent description:
https://patents.google.com/patent/US20060045294A1
Actually they don't strip the complete initial delay, but only part of it to "align" the impulses, I now think.]
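
To make that concrete, here's a rough sketch of how I read that interpolation step (my own reconstruction from the patent text, not Smyth's actual code; the function names, the sample-based delays and the per-ear handling are my assumptions):

```python
import numpy as np

def shift(h, n):
    """Delay (n > 0) or advance (n < 0) an impulse response by n samples,
    zero-padding instead of wrapping around."""
    out = np.zeros_like(h)
    if n >= 0:
        out[n:] = h[:len(h) - n]
    else:
        out[:len(h) + n] = h[-n:]
    return out

def interpolate_brir(h_a, h_b, delay_a, delay_b, frac):
    """h_a, h_b: impulse responses (one ear) measured at the two neighbouring
    head angles; delay_a, delay_b: their initial delays in samples;
    frac: where the current head angle sits between the two angles (0..1)."""
    # 1) Align: strip the initial delay from each response, so only the
    #    HRTF filtering shape remains (incl. the interaural level difference).
    a = shift(h_a, -delay_a)
    b = shift(h_b, -delay_b)
    # 2) Interpolate the aligned responses as a simple weighted average.
    h = (1.0 - frac) * a + frac * b
    # 3) Re-add an initial delay recalculated for the actual head angle;
    #    doing this per ear restores the interaural time difference.
    return shift(h, int(round((1.0 - frac) * delay_a + frac * delay_b)))
```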

Can it do more than 5.1 or 7.1? I can do a real physical surround speaker setup, so there's no need to synthesize that. But it would be interesting if they could do Atmos object-based audio.
It does Dolby Atmos decoding up to 9.1.6, but for fixed overhead speaker positions, not "direct binaural rendering" of the sound objects at arbitrary positions.
DTS and Auro-3D are not yet supported. But you can input 16 channels of analog, 16 channels of PCM via USB, or 8 channels of PCM via HDMI, and use 16 virtual speakers at any positions you want.
 
Last edited:
Sep 17, 2019 at 10:47 PM Post #78 of 162
@RRod got one, so I'm expecting some interesting feedback from him once he's done reading the 900000000000-page manual 5 times and testing a bunch of stuff. He's a particularly interesting source of intel, as he's also been fooling around with convolution and binaural recordings on headphones (and he speaks transfer function fluently, while at best I can mention Laplace to try and sound like an intellectual).
 
Sep 18, 2019 at 1:21 AM Post #79 of 162
you fooled me!
 
Sep 18, 2019 at 6:30 AM Post #80 of 162
As the soundstage shrinks, it becomes compressed. Shrink the triangle and the soundstage shrinks. By the time you are working with miniature speakers that are three inches apart and your face is three inches from them, you aren't going to hear much in the way of soundstage any more. Pressing the transducers over your ears completely eliminates the distance part of the triangle, and with it the soundstage.

Soundstage is clearer as it gets larger, up to a certain point. If it gets too large, the room itself will start intruding with unwanted delays and reflections, like playing back sound in a stadium with the speakers at either end of the football field and the listening position in the back row, center.

The optimal size for a soundstage is one that matches the scale of what you would experience sitting in a good seat in front of a real orchestra or band, while still maintaining the recommended 60 degree spread. I find that about 14 feet or so matches what you would experience in a symphony hall pretty closely, but 8 feet would work OK too. I'm sure you could go up to 20 feet to get close to a jazz club and still make it work in the home.
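
For reference, the geometry behind those numbers is simple: with the recommended 60 degree spread, you and the two speakers form an equilateral triangle, so the speakers end up exactly as far apart as you sit from them. A quick sketch (just triangle math, using the distances mentioned above):

```python
import math

def speaker_spacing(distance_ft, spread_deg=60.0):
    """Speaker-to-speaker distance needed to hold a given angular spread
    at the listening position (isosceles-triangle geometry)."""
    return 2.0 * distance_ft * math.sin(math.radians(spread_deg / 2.0))

for d_ft in (8, 14, 20):
    print(f"{d_ft} ft back -> speakers {speaker_spacing(d_ft):.1f} ft apart")
```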



Every time you moved your head, you would know. And the lack of natural primary depth cues (the envelope of the room) would tell you.



But most modern commercial music is miked closely. The engineer is counting on the listening room and speaker placement to take care of the closest distances. He only creates secondary depth cues to synthesize the larger distances.



There is a diffuse feeling with open headphones that the sound is around you, rather than the close, boxed-in feeling of closed cans. But we aren't talking about size for soundstage, we are talking about distance. Secondary depth cues are fine. But secondary depth cues added to the natural depth cues created by space in a real live room are what create soundstage.

You keep thinking of PHYSICAL soundfields! WE DON'T HEAR THAT!!!! WE HEAR CUES!!! How many times do I need to say it?
 
Sep 18, 2019 at 7:24 AM Post #81 of 162
I would love to get out of that loop of one guy defining a term, discussing the implications, and then having someone else explain that he disagrees because his use of the term means something else entirely. It's like a joke where we replaced half the dictionary with "soundstage".
Many complain about semantics, but we're pretty much still stuck at the very concern I had in the very first reply to the first post, and that happens because we keep using an obviously confusing term for various and clearly incompatible purposes.
One thinks that so long as he feels a sense of distance or direction from a headphone, then the headphone has its own soundstage. One thinks that so long as 2 headphones place an instrument at a different distance or angle, then those headphones have their own soundstages. One says that soundstage is about the sound, affected by... a stage. I find it pretty hard to reject that definition when it literally involves the words in "soundstage".
Maybe we can try "objective soundstage" vs "subjective soundstage"? I'd rather have less ambiguous terms that clearly point to whatever we're discussing, but I'm a little desperate to find any solution you guys are willing to agree on at this point.
 
Sep 18, 2019 at 8:01 AM Post #82 of 162
How are those cues directional in a way that would be deterministic for front vs. rear placement? Particularly for instruments that are static and can’t leverage the Doppler effect?

Speakers are physically in front in stereo and around you in multichannel. How do headphones replicate that physical placement and resultant front/rear soundstage, not just near/far?

You keep thinking of PHYSICAL soundfields! WE DON'T HEAR THAT!!!! WE HEAR CUES!!! How many times do I need to say it?

Still have the same question on how these cues support front/rear placement via headphone in standard stereo recordings.
 
Last edited:
Sep 18, 2019 at 12:12 PM Post #83 of 162
You keep thinking of PHYSICAL soundfields! WE DON'T HEAR THAT!!!! WE HEAR CUES!!!

Physical distance cues are more realistic than secondary distance cues because they're real. Soundstage depends on a combination of natural distance cues created by the room to push the stage out in front of you in a plane, and secondary distance cues to synthesize depth beyond that plane. The envelope of the room and the physical space in front of you is the most important, because that's where soundstage gets its physical distance. And you don't get soundstage without physical distance. It isn't sound affected by a stage, it's sound *presented* as a stage... the performers are in front of you and you are sitting a distance back in the audience seats.

If you take a mono recording that is totally dry (no secondary distance cues other than the room itself) and play it through two stereo speakers, it will still sound like the sound is located in front of you at a distance, it just won't have the left to right sound location information. With headphones, if you play a dry mono track through the cans, you get an image right in the middle of your skull. Headphones don't impart any sense of distance themselves. You need speakers and a room for that. Stereo gives you the left/right and the physical distance between the listener and the speakers gives you the distance.

The space around the sound is just as important in many ways as the sound itself. The room in a speaker system is like the sounding box on a stringed instrument or the bell on a horn. Imagine how a violin would sound if it were just the fingerboard from top to bottom with no body. You can push your ear right up to it and hear the notes, but it won't sound like a violin. Soundstage is like that too. You don't get it without physical space for the sound to inhabit.
 
Last edited:
Sep 18, 2019 at 12:38 PM Post #84 of 162
It's not really directly related, but I have a bunch of interesting SACDs by the Royal Philharmonic Orchestra that are quite unusual. They miked the orchestra in a recording studio with something like 80 microphones at a time. Then they fed all that into a monster mixing board and created entirely synthetic left-to-right location info and entirely synthetic secondary depth cues. Then they used surround sound, with the rear channels pulling the instruments at the front of the band out towards the middle of the room. What you end up with is strings forward, almost in your lap, woodwinds at the normal plane for stereo soundstage, and percussion bathed in secondary depth cues. It's a unique sound with a great deal of depth. I imagine that in stereo with headphones there would be very little depth at all; in fact, it would probably sound just like any other orchestral recording.
 
Sep 18, 2019 at 10:36 PM Post #85 of 162
Physical distance cues are more realistic than secondary distance cues because they're real.
Yes and no. If the signal at the eardrum were the same, it wouldn't matter how it came to be. What makes you right here is simply that non-physical stuff will typically have obvious variations contradicting or affecting the original/desired cues.

if you play a dry mono track through the cans, you get an image right in the middle of your skull. Headphones don't impart any sense of distance themselves.
For this specifically, I have to side with some of the stuff @Hifiearspeakers was trying to argue. The same mono signal is not perceived by me at exactly the same place on various headphones. I assume that the main reason (excluding a headphone with horrible distortion) is simply the FR difference between models, and maybe where the driver is placed and how big it is. Nothing magical, but I will end up feeling the sound slightly off-center on one pair, and of course the image going up a lot in my case for most headphones considered diffuse-field neutral. It's not a big deal, in the sense that I can create displacement of similar magnitude simply with my imagination, or by deciding to have my eyes open or closed and my head moving or not, but I get those variations anyway between headphones.
Once EQed for mono at "eye level", the remaining differences are really not much in my limited experience, but still not zero.
 
Sep 19, 2019 at 3:06 AM Post #86 of 162
Yes and no. If the signal at the eardrum were the same, it wouldn't matter how it came to be. What makes you right here is simply that non-physical stuff will typically have obvious variations contradicting or affecting the original/desired cues.

Directionality. The room is around you 360 degrees. You are getting delays and reflections from all around you that add to the sense of space. Two speakers can never do that. Our ears are very good at meshing two sets of distance cues into one. You listen to the real room ambience for things that are closer and recorded dry, and you listen to the secondary cues for things that are further away and recorded wet. With the distance between the listener and the speakers, you get added realism. That should be self evident if you've ever heard a good speaker setup in a good room. It sounds more visceral and you can pinpoint things at a distance infinitely better than with cans.

If you calibrated the headphones for EQ and levels, mono would be smack dab in the middle of your head. The only reason it isn't always like that is variance in manufacturing tolerances. Transducers aren't consistent; even the left and right transducers in the same set of headphones aren't exactly the same. I was told that midrange-to-good headphones generally have a tolerance of +/- 3dB. That is enough to make a difference, especially if the left and right aren't carefully chosen to match each other. The naturalness of the room's effect on speakers makes small variability in calibration less apparent than with cans. That is why cans sound more detailed, but speakers have better soundstage.
 
Last edited:
Sep 19, 2019 at 3:58 AM Post #87 of 162
Directionality. The room is around you 360 degrees. You are getting delays and reflections from all around you that add to the sense of space. Two speakers can never do that. Our ears are very good at meshing two sets of distance cues into one. You listen to the real room ambience for things that are closer and recorded dry, and you listen to the secondary cues for things that are further away and recorded wet. With the distance between the listener and the speakers, you get added realism. That should be self evident if you've ever heard a good speaker setup in a good room. It sounds more visceral and you can pinpoint things at a distance infinitely better than with cans.

If you calibrated the headphones for EQ and levels, mono would be smack dab in the middle of your head. The only reason it isn't always like that is variance in manufacturing tolerances. Transducers aren't consistent; even the left and right transducers in the same set of headphones aren't exactly the same. I was told that midrange-to-good headphones generally have a tolerance of +/- 3dB. That is enough to make a difference, especially if the left and right aren't carefully chosen to match each other. The naturalness of the room's effect on speakers makes small variability in calibration less apparent than with cans. That is why cans sound more detailed, but speakers have better soundstage.
To be clear, I'm not advocating that headphones have soundstage or that they do something right about subjective image. I'm just saying that we differentiate real sound sources in a room from on-ear ones by how acoustically different they are at the eardrum (and with our eyes and our body). The right sound, having all the cues of the room at the eardrum but sent from a headphone, would sound like the sound sources in the actual room (minus tactile subs).

About EQing the headphone: you're right about channel balance, but I wasn't even talking about this. My argument was about how the unique FR of a headphone can be interpreted by my brain as altitude cues for mono signals (sounds from a source in front of me at different altitudes bounce on different areas of the outer ear, resulting in an FR change per vertical angle). Maybe you remember some post I made a while back wondering why the sound usually rose above my head (mentally) the more mono a track was. Since then I've done my homework and understand why, and how to solve the issue, even if it's a PITA to do properly. David Griesinger has a video tutorial of sorts suggesting to actually place one speaker in front of us and EQ the headphone tone by tone to get the same perceived FR as from the speaker.
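
If you ever want to apply the gains you find by ear that way, here's a minimal sketch of a per-band correction (standard RBJ audio-EQ-cookbook peaking filters; the band list in the example is hypothetical, you'd fill it with your own tone-by-tone results):

```python
import numpy as np
from scipy.signal import lfilter

def peaking_biquad(fs, f0, gain_db, q=2.0):
    """Peaking-EQ biquad coefficients from the RBJ audio-EQ cookbook."""
    big_a = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * big_a, -2 * np.cos(w0), 1 - alpha * big_a])
    a = np.array([1 + alpha / big_a, -2 * np.cos(w0), 1 - alpha / big_a])
    return b / a[0], a / a[0]

def apply_headphone_eq(x, fs, bands):
    """bands: list of (center_hz, gain_db) pairs found by ear, matching the
    headphone to a frontal speaker tone by tone."""
    for f0, gain_db in bands:
        b, a = peaking_biquad(fs, f0, gain_db)
        x = lfilter(b, a, x)
    return x

# e.g. apply_headphone_eq(samples, 44100, [(3000, -2.5), (8000, 1.5)])
```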

To summarize: just because the headphone lacks many of the typical cues, and with non-binaural stuff just messes with the stereo delivery, doesn't mean that all that mess, plus the specific FR of a headphone, won't cause the brain to create some impression of placement or space. And another headphone, having a different FR and whatever other sort of issue, can give a subjective impression of placement that's different from the first headphone. So in that ultra-specific respect, different headphones can have different ways of placing stuff in our minds. Not correct, not even necessarily consistent from one listener to the next, but a perceivable difference still. I think that perceivable difference is what those defending headphone "soundstage" were trying to defend all along. They perceive something, and they perceive something different with another headphone. Some clearly assumed that some versions of those impressions were somehow accurate, which is clearly wrong, but they still do feel some sense of direction and volume and whatever else, caused by the right cues, by the wrong cues, by the power of imagination, etc. And they do feel the result as being headphone-specific.
 
Sep 19, 2019 at 6:13 AM Post #88 of 162
I have had the flu recently, so I'm not feeling that great, but I'll try:

Still have the same question on how these cues support front/rear placement via headphone in standard stereo recordings.

The pinna shadows high-frequency rear sounds more than front sounds. Microphones tend to shadow rear sounds more than front sounds too. That's the main idea; it's complicated, and the result is often bad, but sometimes it's good.
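
As a toy illustration of that shadowing idea (not a real pinna model, which has direction-dependent notches; the first-order low-pass and the 4 kHz cutoff are arbitrary choices of mine):

```python
import numpy as np
from scipy.signal import butter, lfilter

def fake_rear_shadow(x, fs, cutoff_hz=4000.0):
    """Crudely mimic how the pinna shadows sound arriving from behind by
    rolling off the highs. A real pinna response is far more complex."""
    b, a = butter(1, cutoff_hz / (fs / 2.0), btype="low")
    return lfilter(b, a, x)
```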
 
Sep 19, 2019 at 6:20 AM Post #89 of 162
Physical distance cues are more realistic than secondary distance cues because they're real. Soundstage depends on a combination of natural distance cues created by the room to push the stage out in front of you in a plane, and secondary distance cues to synthesize depth beyond that plane. The envelope of the room and the physical space in front of you is the most important, because that's where soundstage gets its physical distance. And you don't get soundstage without physical distance. It isn't sound affected by a stage, it's sound *presented* as a stage... the performers are in front of you and you are sitting a distance back in the audience seats.

If you take a mono recording that is totally dry (no secondary distance cues other than the room itself) and play it through two stereo speakers, it will still sound like the sound is located in front of you at a distance, it just won't have the left to right sound location information. With headphones, if you play a dry mono track through the cans, you get an image right in the middle of your skull. Headphones don't impart any sense of distance themselves. You need speakers and a room for that. Stereo gives you the left/right and the physical distance between the listener and the speakers gives you the distance.

The space around the sound is just as important in many ways as the sound itself. The room in a speaker system is like the sounding box on a stringed instrument or the bell on a horn. Imagine how a violin would sound if it were just the fingerboard from top to bottom with no body. You can push your ear right up to it and hear the notes, but it won't sound like a violin. Soundstage is like that too. You don't get it without physical space for the sound to inhabit.

Our hearing has no realism meter other than how much the cues make sense. That's why I use crossfeed: to make the cues make more sense and appear more realistic to my spatial hearing.
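
For anyone curious what crossfeed actually does, here is a minimal sketch of the usual textbook version: each ear gets a delayed, attenuated, low-pass-filtered copy of the opposite channel, mimicking the head-shadowed path from a speaker to the far ear. The gain/cutoff/delay values below are typical ballpark figures, not the settings of any particular implementation:

```python
import numpy as np
from scipy.signal import butter, lfilter

def crossfeed(left, right, fs, gain_db=-8.0, cutoff_hz=700.0, delay_us=300.0):
    """Return (left_out, right_out) with a simple crossfeed applied."""
    b, a = butter(1, cutoff_hz / (fs / 2.0), btype="low")
    d = int(round(fs * delay_us / 1e6))      # interaural-ish delay in samples
    g = 10.0 ** (gain_db / 20.0)             # dB -> linear gain

    def bleed(x):
        # Low-pass, attenuate and delay one channel for the opposite ear.
        y = g * lfilter(b, a, x)
        return np.concatenate([np.zeros(d), y[:len(y) - d]])

    return left + bleed(right), right + bleed(left)
```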
 
Sep 19, 2019 at 1:04 PM Post #90 of 162
The room imparts an envelope on the sound that is instantly recognizable as real, because it is a real envelope of reflections and directionality imparted by the room. The room is a huge part of the sound in speaker systems; headphones don't have that. Crossfeed is a good way to process the signal to take the curse off headphone listening, but it isn't the same as the complex acoustics of a real live room, nor is it intended to be.

Headphones do not have soundstage because there is no physical space involved.
Crossfeed isn't the same as the effect of the room on sound with speaker systems.
Perhaps the Smyth device will solve your problem. But at this point, it is a very expensive solution.
 
