Is soundstage actually detrimental to spatial audio?
Oct 15, 2019 at 4:22 PM Post #151 of 162
It isn't "different" if the person doing the mix *intended* for room acoustics to be wrapped around the sound. The vast majority of commercial music is designed to be played on speakers in a room. That's how the engineers monitor the mix and that is the sound they want.

I am tired of this semantics war. If you read my posts about how I justify crossfeed, you'll see I keep saying the same thing: crossfeed helps because recordings are mixed for speakers. We disagree much less than we fight. Wrong is different from right. Somehow you don't understand that many people live in circumstances in which headphones are more convenient DESPITE their weaknesses compared to speakers. The engineers can want what they want, but in life we don't always get what we want.
 
Oct 15, 2019 at 4:54 PM Post #152 of 162
Headphones are an acceptable compromise if you can't afford or don't have the space for speakers. But good speakers in a room are the sort of rig most commercially recorded music is designed for. Crossfeed and other types of signal processing are fine if you like the effect, but they don't make headphones sound anything like speakers.
 
Oct 15, 2019 at 4:56 PM Post #153 of 162
speakers don't sound anything like headphones.
 
Oct 15, 2019 at 4:59 PM Post #154 of 162
No, not the same at all.
 
Oct 16, 2019 at 8:44 PM Post #155 of 162
my really last attempt at explaining that different things are different, in case someone still cares about the topic and wishes to think about it. here are the various systems to consider and try to understand:

model 1: human psychoacoustics, the use of ILD and ITD (interaural level and time differences) to locate one physical sound source somewhere around us. freedom of movement, with head movement even helping to get a more accurate triangulation (more reference points, more cues), and vision helping to confirm the correlation between the sound and some visual source causing it. in short, us in our daily lives in a real environment, with the cat making cat sounds coming from the actual position of the cat.
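to put a rough number on the ITD cue, here is a quick sketch of my own (not part of the original explanation), using the common spherical-head approximation; the head radius is an assumed average value, not a measurement.

```python
# rough sketch of the ITD cue from model 1: how much earlier the near ear
# hears a distant source at a given azimuth. spherical-head (Woodworth)
# approximation; head radius is an assumed average, not a measured value.
import math

def itd_seconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """interaural time difference for a distant source at the given azimuth
    (0 deg = straight ahead, 90 deg = fully to one side)."""
    theta = math.radians(azimuth_deg)
    # extra path around the head to the far ear: r * (sin(theta) + theta)
    return head_radius_m * (math.sin(theta) + theta) / speed_of_sound

for az in (0, 30, 60, 90):
    print(f"{az:2d} deg -> ITD ~ {itd_seconds(az) * 1e6:.0f} us")
```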

model 2: stereo speaker playback. 2 physical sound sources are tasked with making us feel like one virtual sound source (or many) is located somewhere in a general direction between the 2 speakers. the main trick involves sending the signal to both speakers and making one louder (panning). we have free head movements allowing ILD/ITD cues from the active sound field to let us place the 2 speakers as having a fixed position in the room (which is the case, so that's cool). we get room reverb, giving a few more cues about the position of the speakers, but also about the room itself. over time the acoustic and visual cues keep agreeing with our expectations and with model 1 for the room and the speakers. but the actual track and its virtual sound sources are another story. they may have their own psychoacoustic cues that may agree with model 1 for that room and those speakers from our listening position, or maybe not. it's the fake that kind of feels real.
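the panning trick can be illustrated with a toy pan-law calculation; the constant-power (sin/cos) law below is just one common choice, assumed here for illustration, not something any particular mix necessarily uses.

```python
# sketch of the panning trick in model 2: one mono source becomes a pair of
# left/right gains. constant-power (sin/cos) pan law, assumed only for
# illustration; real mixes use various pan laws.
import math

def constant_power_pan(pan):
    """pan in [-1.0, 1.0]: -1 = hard left, 0 = center, +1 = hard right.
    returns (left_gain, right_gain)."""
    angle = (pan + 1.0) * math.pi / 4.0   # map [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)

for pan in (-1.0, -0.5, 0.0, 0.5, 1.0):
    left, right = constant_power_pan(pan)
    print(f"pan {pan:+.1f}: L = {left:.2f}, R = {right:.2f}")
```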

model 3: headphone playback. 2 physical sound sources stuck on our ears. moving our head tends to give us the idea that the physical sound sources are on our ears, because they are (good job, senses!). if we naturally rely on model 1, that's when we make use of ILD and ITD to place the physical sound sources on our own head. depending on the audio cues related to the virtual sound source (let's say some instrument on the track), we might try to make sense of the impossible by placing that instrument inside our head, somewhere between the 2 drivers or at one of the drivers. but maybe our brain understands that it's physically impossible and keeps pushing almost every virtual sound source near each ear even when it's almost mono. or maybe some place on the head, right outside of it. maybe the FR of the headphone (why not some clear channel imbalance or some funny phase games) happens to remind our brain of the acoustic cues for a given direction, and we'll feel like the singer is not at the horizon but somewhere up or down (still probably in the vicinity of our head).
as the headphone sends sound from a position that bypasses some of what we use to locate a sound source (HRTF), a direct consequence is that one listener might not place a virtual sound at the same position as another listener would with the same headphone. our interpretation is affected in a personal way, whereas in model 1 or model 2 (if seated at the same position) we all tend to point toward the same direction when asked where a sound is coming from.
on the bright side, headphones often have much lower distortion than speakers, are right on the ears, and don't have room reverb, so they do get to deliver a very clear signal where it's easy to notice details if we care about that.

model 4: the idea behind crossfeed. we imagine 2 physical sound sources at a distance (speakers) and a dummy head facing them. I say dummy head because the notion of head movement is entirely ignored. now we doodle this and draw lines between the speakers and the ears. each speaker sends one line toward the left ear and one toward the right ear, so we end up with 4 lines. in that model, a little geometry (head size, speaker angle, speed of sound at room temperature) gives us the delay between left and right ears for the sound coming from the left speaker (and the same for the right speaker). we can also measure how the sound from one speaker gets affected by the dummy head, which blocks some frequencies more than others at the ear opposite a given speaker.
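just to put numbers on that geometry step; the ear spacing, speaker angle and sample rate below are assumptions picked only for illustration, and the straight-line paths are a deliberate simplification of the doodle.

```python
# rough numbers for the geometry step of model 4, using the straight "lines"
# of the doodle: the far ear is simply further away from the left speaker than
# the near ear. ear spacing, speaker angle and sample rate are assumptions
# picked only to illustrate the scale of the delay.
import math

EAR_SPACING_M = 0.18      # assumed distance between the ears
SPEED_OF_SOUND = 343.0    # m/s, roughly room temperature
SAMPLE_RATE = 44100       # Hz

def interaural_delay(speaker_angle_deg):
    """extra travel time to the far ear for a distant speaker at this azimuth,
    treating the sound paths as the straight lines of the doodle."""
    theta = math.radians(speaker_angle_deg)
    extra_path = EAR_SPACING_M * math.sin(theta)
    return extra_path / SPEED_OF_SOUND

delay = interaural_delay(30.0)   # typical stereo speaker placement
print(f"~{delay * 1e6:.0f} us, about {delay * SAMPLE_RATE:.0f} samples at 44.1 kHz")
```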
we can now try to simulate this so that when placing a headphone on the dummy head, the left channel will be copied, EQed, and delayed before being sent to the right ear, as if the sound source were still the pair of speakers. if done with a really specific EQ based on accurate measurements from the dummy head, we can get an impulse corresponding to each of the 4 "lines of sound" in our initial doodle, and that could count as part of the dummy head's HRTF (head-related transfer function) for that specific situation where nothing ever moves at all. that's about the best objective approach we can take to properly simulate speaker sound for that dummy head while using headphones. we'd still need to compensate for the headphone's own response, but that would be very convincing as far as audio cues go. if that dummy head could talk (but still not move at all!!!! ever!!!) it would say that it works great.
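a hedged sketch of that 4-impulse simulation follows; the impulse responses here are crude placeholders (a plain delay plus attenuation), standing in for the responses measured on the dummy head.

```python
# sketch of the "4 lines of sound" simulation: each ear hears both speakers,
# each through its own impulse response. the two impulse responses below are
# crude placeholders (pure delay + gain); a real version would use responses
# measured on the dummy head.
import numpy as np

def delayed_impulse(delay_samples, gain, length=64):
    h = np.zeros(length)
    h[delay_samples] = gain
    return h

# placeholder responses: the same-side path arrives first at full level, the
# opposite-side path ~12 samples later (~270 us at 44.1 kHz) and attenuated.
h_same = delayed_impulse(0, 1.0)
h_opposite = delayed_impulse(12, 0.5)

def simulate_speakers(left_ch, right_ch):
    """binaural signal for headphones, as if left_ch/right_ch came from two
    speakers in front of the dummy head (which never, ever moves)."""
    ear_left = np.convolve(left_ch, h_same) + np.convolve(right_ch, h_opposite)
    ear_right = np.convolve(right_ch, h_same) + np.convolve(left_ch, h_opposite)
    return ear_left, ear_right

# usage: a click hard-panned left still reaches the right ear, later and
# quieter, just as it would from a real left speaker.
left = np.zeros(32); left[0] = 1.0
right = np.zeros(32)
ear_l, ear_r = simulate_speakers(left, right)
print(np.argmax(np.abs(ear_l)), np.argmax(np.abs(ear_r)))   # 0 and 12
```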
the basic principle of crossfeed is inspired by this. it also completely neglects head movements, as if that were no big deal. it doesn't bother with the headphone's initial FR at all. it doesn't apply your correct EQ to the sound coming from the opposite direction before mixing it in, because it doesn't know what the correct EQ should be for your own head. it also doesn't know the size of your skull, but hopefully you can set that yourself correctly.
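for reference, here is roughly what a basic crossfeeder does with those simplifications: the opposite channel is attenuated, low-passed (a stand-in for head shadow) and delayed, then mixed in. the gain, cutoff and delay values below are generic assumptions, not the settings of any particular crossfeed product.

```python
# minimal crossfeed sketch: mix an attenuated, low-passed, delayed copy of the
# opposite channel into each ear. gain, cutoff and delay are generic
# assumptions, not taken from any specific crossfeed design.
import numpy as np

SAMPLE_RATE = 44100

def one_pole_lowpass(x, cutoff_hz, sample_rate=SAMPLE_RATE):
    """very simple one-pole low-pass, standing in for head shadowing."""
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / sample_rate)
    y = np.zeros_like(x)
    state = 0.0
    for i, sample in enumerate(x):
        state += alpha * (sample - state)
        y[i] = state
    return y

def crossfeed(left, right, gain=0.4, cutoff_hz=700.0, delay_samples=12):
    """return (left_out, right_out) with the opposite channel bled in."""
    def bleed(x):
        filtered = one_pole_lowpass(x, cutoff_hz) * gain
        return np.concatenate([np.zeros(delay_samples), filtered])[:len(x)]
    return left + bleed(right), right + bleed(left)
```

a real design would usually also attenuate the direct channel slightly to keep loudness constant; none of this addresses the personal-HRTF or head-movement objections above.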
so in practice, even if you didn't sense the headphone on your head, even if your eyes weren't your main and most trusted source of intel regarding the world around you, chances are you could still find issues with that processing of the music.
forget the 2 speakers + 2 ears doodle for a sec and consider real-life circumstances instead. that's really all you need to do to find the many issues in that system and in @71 dB's "demonstrations of a factual improvement".
an unrealistic model does not characterize a real system. it's that simple.


@71 dB if crossfeed is an objective approach that factually improves sound over default headphone listening (and not just something that you and a few people like subjectively), I want to propose a similar approach to the topic of masturbation. masturbation isn't as good as sex; I believe this is an accepted opinion for most people. and here is another fact: a human partner is mostly water. so I postulate that masturbating while holding a bottle of water is an objective improvement over not holding that bottle of water, and that's why people will like it more. it is my sincere belief that both our positions and arguments are exactly as factual and just as easily refutable, because they rely on obvious logical fallacies and complete disregard for anything besides the variables we present.
 
Oct 17, 2019 at 2:23 AM Post #157 of 162
Masturbation is cheaper and more convenient! You can do it without disturbing the neighbors! Just like headphones!
 
Oct 17, 2019 at 6:47 AM Post #159 of 162
model 4: the idea behind crossfeed. we imagine 2 physical sound sources at a distance (speakers) and a dummy head facing them. I say dummy head because the notion of head movement is entirely ignored. now we doodle this and draw lines between the speakers and the ears. each speaker sends one line toward the left ear and one toward the right ear, so we end up with 4 lines. in that model, a little geometry (head size, speaker angle, speed of sound at room temperature) gives us the delay between left and right ears for the sound coming from the left speaker (and the same for the right speaker). we can also measure how the sound from one speaker gets affected by the dummy head, which blocks some frequencies more than others at the ear opposite a given speaker.
we can now try to simulate this so that when placing a headphone on the dummy head, the left channel will be copied, EQed, and delayed before being sent to the right ear, as if the sound source were still the pair of speakers. if done with a really specific EQ based on accurate measurements from the dummy head, we can get an impulse corresponding to each of the 4 "lines of sound" in our initial doodle, and that could count as part of the dummy head's HRTF (head-related transfer function) for that specific situation where nothing ever moves at all. that's about the best objective approach we can take to properly simulate speaker sound for that dummy head while using headphones. we'd still need to compensate for the headphone's own response, but that would be very convincing as far as audio cues go. if that dummy head could talk (but still not move at all!!!! ever!!!) it would say that it works great.
the basic principle of crossfeed is inspired by this. it also completely neglects head movements, as if that were no big deal. it doesn't bother with the headphone's initial FR at all. it doesn't apply your correct EQ to the sound coming from the opposite direction before mixing it in, because it doesn't know what the correct EQ should be for your own head. it also doesn't know the size of your skull, but hopefully you can set that yourself correctly.
so in practice, even if you didn't sense the headphone on your head, even if your eyes weren't your main and most trusted source of intel regarding the world around you, chances are you could still find issues with that processing of the music.
forget the 2 speakers + 2 ears doodle for a sec and consider real-life circumstances instead. that's really all you need to do to find the many issues in that system and in @71 dB's "demonstrations of a factual improvement".
an unrealistic model does not characterize a real system. it's that simple.

These problems don't arise from using crossfeed; they are there all along. Headphones ignore head movements, so how is it a "crossfeed problem"? It is a headphone problem. Same with FR. As if the FR were perfect without crossfeed? No crossfeed is in fact an even more wrong EQ. Not doing something is not better than doing something half-well if you have to do something. This all comes down to systems thinking. People are used to thinking headphone sound is fine, but it is actually very wrong, because recordings are mixed for speakers. Headphones "need" binaural sound, and when you mix for speakers you don't do binaural sound, because binaural sound on speakers sounds crap. HRTF tells us the "spatial information" space: the possible combinations and ranges of ILD, ITD and other cues. Headphones get those completely wrong, and we end up far outside the HRTF-based spatial information space. Crossfeed scales the spatial information back inside that space so that it makes sense. I don't understand why this doesn't work for some people, but for me it is a big improvement, and I will never stop using crossfeed unless it's to move to even more sophisticated processing. Headphone sound as it is? No way! I'm not torturing my ears with excessive spatiality mixed for speakers (which assumes acoustic crossfeed, early reflections and reverberation) if the solution to scale that spatiality for headphones is a $50 DIY crossfeeder!

I have to say you people are good at finding problems with crossfeed, but you totally ignore the problems of speaker listening and of headphones as they are. The problems don't arise just because we do something additional (which should be done anyway!). The problems are there all along, and crossfeed fixes one of them: the scaling of speaker spatiality to headphone spatiality. It's one (big) problem less, at least for me!
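To put one toy number on that "scaling" idea (the crossfeed gain below is an illustrative assumption, not a recommended setting): a hard-panned source on plain headphones has an essentially unlimited ILD, while mixing in the opposite channel at some gain caps it at a finite value, closer to what the same source would produce from a speaker.

```python
# toy illustration of "scaling spatial information": the ILD (in dB) of a
# hard-panned source, without and with crossfeed. the crossfeed gain is an
# illustrative assumption, not a recommended setting.
import math

def ild_db(near_gain, far_gain):
    """interaural level difference in dB between the two ears."""
    if far_gain == 0.0:
        return math.inf        # hard-panned source on plain headphones
    return 20.0 * math.log10(near_gain / far_gain)

print("no crossfeed:  ", ild_db(1.0, 0.0), "dB")
print("with crossfeed:", round(ild_db(1.0, 0.4), 1), "dB")   # ~8 dB
```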
 
Oct 17, 2019 at 1:08 PM Post #160 of 162
Speakers present music the way the engineers intend it to be presented: with the spatial envelope of the room wrapped around the music. Headphones can't do that, and crossfeed won't add the spatial qualities of a real room with speakers.
 
Oct 17, 2019 at 1:34 PM Post #161 of 162
These problems don't arise from using crossfeed; they are there all along. Headphones ignore head movements, so how is it a "crossfeed problem"? It is a headphone problem. Same with FR. As if the FR were perfect without crossfeed? No crossfeed is in fact an even more wrong EQ. Not doing something is not better than doing something half-well if you have to do something. This all comes down to systems thinking. People are used to thinking headphone sound is fine, but it is actually very wrong, because recordings are mixed for speakers. Headphones "need" binaural sound, and when you mix for speakers you don't do binaural sound, because binaural sound on speakers sounds crap. HRTF tells us the "spatial information" space: the possible combinations and ranges of ILD, ITD and other cues. Headphones get those completely wrong, and we end up far outside the HRTF-based spatial information space. Crossfeed scales the spatial information back inside that space so that it makes sense. I don't understand why this doesn't work for some people, but for me it is a big improvement, and I will never stop using crossfeed unless it's to move to even more sophisticated processing. Headphone sound as it is? No way! I'm not torturing my ears with excessive spatiality mixed for speakers (which assumes acoustic crossfeed, early reflections and reverberation) if the solution to scale that spatiality for headphones is a $50 DIY crossfeeder!

I have to say you people are good at finding problems with crossfeed, but you totally ignore the problems of speaker listening and of headphones as they are. The problems don't arise just because we do something additional (which should be done anyway!). The problems are there all along, and crossfeed fixes one of them: the scaling of speaker spatiality to headphone spatiality. It's one (big) problem less, at least for me!
sure, sure. once again you demonstrate an unmatched skill in selective reading. I'm done with that broken record routine.
 
