I agree that the majority of the time we are trying to make it "better than real".
However, there are some music genres where this isn't the case; effectively, we're trying to make it better so that it does sound real.
Quite often in audiophile discussions the topic is brought around to the comparison of a live acoustic performance, such as orchestral music, with a recorded equivalent.
The problem here is quite different to the "better than [and not even directly concerned with] real" which is the case with the non-acoustic genres.
In the case of acoustic genres such as orchestral, I would re-word the part I've highlighted in bold to: "The result would often not appear to be entirely realistic or very exciting, because what we hear at an orchestral concert is not real in the first place!" What actually enters our ears and what we perceive are two different things. Our brain will filter out or reduce what it thinks is irrelevant, such as the constant noise floor of the audience, and increase the level of what it thinks is most important, such as what we are looking at (the instrument(s) with the solo line, for example).
This isn't "real" at all, although of course it feels entirely real. Clearly, even with a theoretically perfect capture system, all we're going to record is the real sound waves but when reproduced, the brain is generally not going to perceive those sound waves as it would in the live performance because the visual cues and other biases which informed that perception are entirely different.
So, the trend over the decades has been to create an orchestral music product which sounds realistic relative to human perception, rather than just accurately capture the sound waves which would enter one's ears. To achieve this we use elaborate mic'ing setups which allow us to alter the relative levels of various parts of the orchestra in mixing (as our perception would in the live performance).
However, a consequence of this is messed-up timing, as sound wave arrival times are going to vary between all the different mics (which are necessarily in significantly different positions). This is an unavoidable trade-off: we're always going to get messed-up spatial information, but with careful adjustment during mixing we can hopefully end up with a mix which is not perceived to be too spatially messed-up (even though it still is).
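To put rough numbers on that timing problem, here's a minimal sketch. The speed of sound and the mic distances are assumed, illustrative values only, not taken from any particular session:

```python
# Illustrative only: arrival-time offset between a main pair and a spot mic
# at assumed distances from the same instrument.
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

def arrival_delay_ms(extra_path_m: float) -> float:
    """Extra arrival time (in ms) for sound travelling extra_path_m further."""
    return extra_path_m / SPEED_OF_SOUND * 1000.0

# Example: a main pair 10 m from the violins vs. a spot mic 0.5 m away.
# The same note reaches the main pair ~27.7 ms after the spot mic.
print(round(arrival_delay_ms(10.0 - 0.5), 1))  # → 27.7
```

Delays of tens of milliseconds are well above audibility thresholds for timing/comb-filtering effects, which is why spot mics are routinely time-aligned (or at least level-balanced carefully) in the mix.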
This "careful adjustment" is done mainly on speakers but is typically checked on HPs and further adjustments may be made if the illusion/perception of not being spatially messed-up is considered to be too negatively affected by HP presentation.
This brings me back to what I stated previously, that pretty much whatever we listen to and however we're listening to it (speakers, HPs, HPs with crossfeed, etc.) we've always got messed-up timing, "spatial distortion" or whatever else you want to call it.
PS. I know you're probably aware of all this already bigshot.
(...)
G
Thought experiment:
Imagine that you record an orchestra with an Eigenmike (32 capsules) placed at row A, seat 2, and that you convolve the highest possible number of virtual speakers with a high-density HRTF set. At row A, seat 3, there is a listener who was born blind. At row A, seat 1, there is a viewer with normal eyesight. Finally, at row B, seat 2, there is a listener who recently lost their sight. Full audience.
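As an aside, the rendering chain in that setup (decode the spherical-array recording to N virtual speakers, convolve each feed with the HRIR for its direction, sum to two ears) can be sketched roughly as below. The array shapes and the toy HRIRs are placeholders, not a real Ambisonics decoder or measured HRTF data:

```python
import numpy as np

def binaural_render(speaker_feeds, hrirs_left, hrirs_right):
    """Mix N virtual-speaker feeds down to binaural via HRIR convolution.

    speaker_feeds: (N, samples) decoded virtual-loudspeaker signals
    hrirs_left/hrirs_right: (N, taps) head-related impulse responses,
    one pair per virtual speaker direction.
    """
    left = sum(np.convolve(feed, h) for feed, h in zip(speaker_feeds, hrirs_left))
    right = sum(np.convolve(feed, h) for feed, h in zip(speaker_feeds, hrirs_right))
    return left, right

# Toy example: 2 virtual speakers with single-tap "HRIRs" (pure level cues).
feeds = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
hL = np.array([[0.9], [0.3]])  # left ear hears speaker 0 louder
hR = np.array([[0.3], [0.9]])  # right ear hears speaker 1 louder
left, right = binaural_render(feeds, hL, hR)
```

Real HRIRs are hundreds of taps long and direction-dependent, but the structure is the same: one convolution per virtual speaker per ear, then a sum.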
Questions:
Are you saying that the viewer with normal eyesight would only perceive, with headphone playback, a soundfield identical to the one he/she heard live if, and only if, he/she uses a “perfect” virtual reality headset displaying images from where he/she was seated?
Are you saying that blind listeners cannot precisely locate sounds at the live event, for instance, identify where the soloist is playing?
Are you saying that only blind listeners would perceive, with headphone playback, a soundfield identical to the one they heard live?
Are you saying that accuracy in locating sounds (at least in the horizontal plane) differs between a blind listener and a blindfolded viewer who has normal eyesight?
Are you saying that blind listeners are not capable of sound selective attention (cocktail party effect)?
Do you think that the listener who was born blind and the listener who recently lost their sight will achieve different sound-localisation accuracy?
I agree that vision can in some circumstances override sound cues. I also agree that vision is normally the sense that allows you to train your brain to locate sound sources with your ears, and that you can retrain your brain if your vision does not match your sound cues.
But I don’t know if that is the only route to creating a virtual soundfield map in your brain (or is it maybe a neural-network physical simulacrum of a soundfield map?).
Someone who was born blind can walk to his mother when she calls “my angel”. Some are capable of echolocation. Some play blind soccer.
But I don’t know if all psychoacoustic processing phenomena are caused by ambiguities between visual and sound cues.
Are you sure you can claim that?