Recording live performances from multiple mikes and laying down multiple tracks is an art and a science. It is very hard to do well enough that, once the mixing is done in the studio, what you hear resembles a live performance, with a sound stage and correct instrument placement. The only way it happens is with a lot of expertise at processing and tweaking. The alternative is to stereo mike and let the chips fall where they may. Sometimes it works, most times not.
The mechanisms, physics, anatomy, and brain-processing of hearing are often overlooked by audiophiles. For example, I am amazed that people describe a vertical image that they perceive when they hear a recording through one DAC and that isn't there when they hear it through another. Animals with ears located on the same vertical plane don't perceive vertical auditory information unless they tilt their heads. So are they hearing a tall soundstage with one DAC, or are they leaning their head differently? I'm not saying that there aren't embedded cues about vertical images in music, but if people are hearing them it's because they've moved their ears. There are so many examples like this that we ought to write a review of how hearing works and make it sticky on every page. We could, for example, turn our attention to a statement I just read, which I think was "as jitter is lowered you get more bass and smoother trebles" (http://www.head-fi.org/t/766347/schiit-yggdrasil-impressions-thread/45#post_11609682). Jeez, I didn't know that; somewhere I read that the jitter present in modern digital equipment was inaudible.
If you think about it logically, the most (only) accurate way to record something should be to use a pair of microphones in the ears of a simulated head (a binaural recording) - since the microphones should then be recording exactly what our ears would hear if we were there. Of course, this only works if you then play the recording through headphones (with speakers you get an extra stage of mixing of left and right between your speakers and your ears).
However, there are several very good reasons for doing it other ways. First, with a binaural recording, you have no opportunity to "adjust" the recording later - you can't turn one instrument or another up or down, and you certainly can't re-record one or another later to "fix a problem". Second, some instruments just don't seem to come out sounding right when you do it that way. (And, again, you have no way to adjust things. When you record a drum set using several microphones, you can alter the balance in the mix between the microphone near the cymbal, which records more "bite", and the one near the bass drum head, which records more "thump" - you can use that ability either to deliberately make the sound different, or to "adjust" it to sound more like it really did to begin with.) Finally, everyone's ears, and even the shape of everyone's head, are different. So a binaural recording made using a perfect copy of my head and ears may not deliver the right cues to you - because your head and ears are different from mine. Our brains have a remarkable ability to "figure out" what all those cues mean - which means that we have an equally remarkable ability to notice when they're wrong.
"Where we hear a sound as coming from" seems to be based on a complex combination of frequency response and phase. A sound that you hear that actually originates from "up and to the left" arrives at each of your ears at different times, and the frequency response is altered as it wraps around your head. However, even that is an oversimplification. In fact, the sound reaches your left ear directly, after passing the curves of your ear, and some of it reaches your right ear after wrapping around your head, while some of it reaches your right ear after bounding off the wall to your right - and the proportions, delays, and frequency response of each varies - and all of those things are different for me than you because our head and ear shapes are different.
One very well-known way in which this complicates things concerns the delay times associated with reverberance. If we hear a sound, followed by echoes of that same sound, how our brains interpret those echoes depends on how long the delay is. Echoes that arrive within a few milliseconds are not identified as echoes - our brains hear them as "a live room"; echoes that arrive after a long time sound like distinct echoes; and our brains use this information both to judge whether a room is "live" or "dead", and to get an idea of how large the room is. Most of us can figure out a lot about the size and shape of a room by listening to the echoes.
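To put rough numbers on those delays, here is a minimal Python sketch that turns the extra distance a reflection travels into a delay. The speed of sound (343 m/s) and the 30 ms cutoff between "blends into the room" and "distinct echo" are my own assumptions for illustration; the real cutoff depends on the kind of sound and on the listener, so treat `fusion_limit_ms` as a knob, not a fact.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def echo_delay_ms(direct_path_m, reflected_path_m):
    """Extra time (in ms) the reflection needs compared with the direct sound."""
    extra_path_m = reflected_path_m - direct_path_m
    return extra_path_m / SPEED_OF_SOUND_M_S * 1000.0


def describe_echo(delay_ms, fusion_limit_ms=30.0):
    """Crude label; fusion_limit_ms is an assumed, adjustable cutoff."""
    if delay_ms < fusion_limit_ms:
        return "fused with the direct sound"
    return "heard as a separate echo"


# Listener 3 m from the source, with reflections travelling 4 m, 8 m and 40 m.
for direct, reflected in [(3.0, 4.0), (3.0, 8.0), (3.0, 40.0)]:
    delay = echo_delay_ms(direct, reflected)
    print(f"{reflected - direct:5.1f} m extra path -> {delay:6.1f} ms, {describe_echo(delay)}")
```

Each extra metre of path costs a little under 3 ms, which is why reflections in a small room land in the "live room" range while reflections in a big hall can arrive late enough to stand out on their own.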
The subject is actually very complex - and it is NOT perfectly understood yet.
In the context of DACs, different DACs produce different outputs. With the same impulse signal, one DAC may produce 2 ms of post-ringing and no pre-ringing, while another may produce 1 ms of pre-ringing and 1 ms of post-ringing. In both of those cases, you have extra signal - the ringing - that "doesn't belong" - which counts as distortion of some sort. We now have to introduce another concept called "masking". What that means is that, if a very quiet noise happens just before or just after a loud noise, we don't hear the quiet noise (we say "the quieter noise is masked by the louder noise"). However, how masking works is very complicated. How well a louder noise masks a quieter one depends on how the frequencies of the noises relate, their relative loudnesses, and their relative timing - and all that depends on the frequency range in which they occur. So, for example, a 1 kHz noise at a certain level will entirely mask a 500 Hz noise that occurs within a certain number of milliseconds of it and is a certain number of dB quieter. And masking is not symmetrical in time; a loud noise masks a quiet noise of similar frequency that occurs AFTER it much better, and for a longer period of time, than one that occurs before it.
So here's the theory about how those two DACs could produce different "height information"... Both DACs produce some extra ringing that shouldn't be there. However, with some particular signal, one of those DACs produces 1 ms of ringing before the signal, and 1 ms of ringing after it, while the other DAC produces 2 ms of ringing after the signal, and none before it. Since it is known that masking works better when the quiet signal is after the loud one, this means that, due to masking, the ringing on the DAC with 2 ms of post-ringing is LESS AUDIBLE than the ringing on the DAC with 1 ms of pre-ringing and 1 ms of post-ringing.
(Just to be perfectly clear, the fact that this ringing exists, and that it is different for different DAC filters, is not at all in question. It is easy to measure, and easy to demonstrate, and is shown and spelled out on most DAC chip spec sheets. It can also be deliberately exaggerated to the point where it is clearly audible to anyone. The ONLY question is whether specific differences present in specific real-world DAC chips are audible or not.)
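For anyone who wants to see the pre-ringing versus post-ringing split without a scope, here is a small Python sketch. It is a toy under stated assumptions, not a model of any real DAC chip: `windowed_sinc` builds a generic linear-phase low-pass filter (the symmetric kind that rings before and after the peak), and `post_only` is just a made-up decaying oscillation standing in for a filter that only rings afterwards. The tap count, cutoff, and decay rate are arbitrary illustration values.

```python
import math


def windowed_sinc(num_taps, cutoff):
    """Linear-phase low-pass FIR taps; cutoff is a fraction of the sample rate (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hann)
    return taps


def ringing_split(impulse_response):
    """Return (pre, post): summed squared samples before and after the main peak."""
    peak = max(range(len(impulse_response)), key=lambda i: abs(impulse_response[i]))
    pre = sum(v * v for v in impulse_response[:peak])
    post = sum(v * v for v in impulse_response[peak + 1:])
    return pre, post


# Symmetric filter: the peak sits in the middle, so it rings on both sides.
linear_phase = windowed_sinc(101, 0.45)

# Toy stand-in (hypothetical): peak first, then a decaying oscillation.
post_only = [math.exp(-n / 8.0) * math.cos(0.9 * math.pi * n) for n in range(101)]

for name, response in [("linear phase", linear_phase), ("post-ringing only", post_only)]:
    pre, post = ringing_split(response)
    print(f"{name:18s} pre-ringing energy {pre:.4f}, post-ringing energy {post:.4f}")
```

The symmetric filter splits its ringing evenly between "before" and "after", while the one-sided response has nothing ahead of the peak. At a 44.1 kHz sample rate the 50 taps ahead of the peak in the first filter span about 1.1 ms, which is the same ballpark as the figures being discussed here.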
Now, assuming that our brain interprets the sound of that ringing as "height information" - perhaps because it mimics the reverberance information of a high ceiling - the DAC with the less well-masked ringing makes music sound like the source has more height. (So, even if we don't hear the ringing as a sound, the differences in the ringing on the two DACs produce a signal that tricks the "height calculators" in our brains into perceiving different room sizes or instrument locations.) And, yes, this is one of those situations where the claim is that a cue that we can't "hear" still influences our "experience" in other ways. Unfortunately, this is a very complicated subject and, contrary to what some people seem to want to think, it is not perfectly understood yet.
(Note that I'm not necessarily saying that the theory is correct - however, that IS the theory, and nobody has proven that it is NOT correct yet.)
To test this theory, we would have to compare not only whether people hear a difference between those filters, but whether the people who claim to hear a difference claim that one "always produces a higher image than the other" or not...
It's easy enough to prove to yourself that the situation is very complex. Just put one finger or earplug in one of your ears. You will find that, even though it may not be accurate, you WILL still have a strong sense of where music and other sounds are coming from - the world does not "turn into mono" like you may think it would when you can only hear with one ear, so there are obviously lots of cues that your brain is using besides the relative timing and levels between your two ears.
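For a sense of scale on that "relative timing" cue, here is a rough Python sketch using the textbook spherical-head (Woodworth) approximation for the arrival-time difference between the two ears. The head radius and speed of sound are assumed round numbers, and the model deliberately has no outer ear and no room in it, so it is only a ballpark.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed "average" head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def itd_seconds(azimuth_deg):
    """Woodworth spherical-head estimate of the interaural time difference.

    Valid for a distant source at 0 to 90 degrees azimuth, where 0 is straight
    ahead and 90 is directly to one side.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


for azimuth in (0, 30, 60, 90):
    print(f"{azimuth:2d} degrees -> {itd_seconds(azimuth) * 1e6:4.0f} microseconds")
```

Even for a sound directly to one side, the difference comes out at well under a millisecond. Since this model contains nothing but two points on a ball, everything your brain still works out with one ear plugged has to come from the other cues, such as ear shape and reflections.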