@jgazal bigshot has tried to explain but you don't seem to be getting the implications of what he's saying. You agree with him but then continue as if you don't, or rather, as if he'd not posted anything. Let's take an example, your quote of Choueiri: "
For serious music listening of music recorded in real acoustic spaces ...". Much, if not everything, you've stated and quoted is effectively based on this, but it's contradictory and inapplicable. This statement eliminates the vast majority of recorded music, because the vast majority of music is not recorded in a real acoustic space. It also eliminates pretty much all other music because, although it is recorded in a real acoustic space, due to the way it's recorded and mixed it ends up not being a real acoustic space. So, the obvious question with "
serious listening of music" would be: serious listening of what music? Choueiri has eliminated pretty much all commercial music recordings! I'm not saying Choueiri's work is definitely worthless nonsense. It *might* potentially have some influence on future developments, it might not have any influence and just be research which expands scientific knowledge, or it might actually be relatively worthless. I don't know the current cutting edge of scientific knowledge and don't know if Choueiri is expanding it.
[1] There are certainly two drivers to keep the reference locked to the mastering room when we are dealing with music content. One is objective and the other is, let's say, volitive: a) the consumer environment; b) preference for the best seat in the audience. [c)] But there may be a third driver: apprehension that one single coincident microphone could detract from the creative intent of artists, producers, and recording, mixing and mastering engineers.
2. I don’t believe that shifting the reference can in any way detract from the creative intent ... Giving the user access to 360 degrees of freedom in the x, y and z axes of a sound-field may sound challenging, but it does not detract from the creativity. It may just be a new language. This analogy between cinema's limited angle of view and VR's freedom of view might be useful:
1a. There appears to be a general misconception in the audiophile world about what mastering is, what a mastering room is for and how it's used. A mastering room is a room with (hopefully) superb acoustics and accuracy of playback. Incidentally, superb acoustics doesn't mean little/no acoustics. The bigger misconception is that the mastering engineer masters *to* this room, i.e. that they are attempting to create a master which sounds great in the mastering room. We need the accuracy of the mastering room to hear exactly what is going on with the mix and exactly what we're doing/applying, but we are NOT trying to create a master which takes advantage of that accuracy. If we did, we would be defeating the whole purpose of mastering in the first place!

In practice, a mastering room will have the most accurate speakers/acoustics possible, but it will also have pretty much the worst speakers possible, and it will have headphones too. So when I see audiophiles suggesting that recreating the mastering room provides the highest fidelity playback, I want to ask: which mastering room, the one with the great speakers or the one with the crappiest speakers, and what about the mastering engineer's ears and subjective opinion? The last of these is the most important, because the master he/she creates is a compromise between the two! The thinking being: the mastering room with the great speakers is loosely representative of the best quality playback, the mastering room with the crappy speakers is loosely representative of the worst, and what virtually all consumers will experience is something in between these two scenarios. Therefore, if we can create a master which works reasonably well on both, we've got a master applicable to most consumers. Of course, it's virtually impossible to create a master which works perfectly on both sets of speakers.
Generally, the more perfectly the master works on the crappy speakers, the less perfectly it works on the great speakers, and vice versa. In other words, the reference is NOT locked to the mastering room; it's not even locked to the effectively two vastly different mastering rooms. It's locked to the mastering engineer's subjective opinion of somewhere between the two, which is in turn informed by the likely listening circumstances of the target consumers and modified by the client (artist, producer, record label). The audiophile concept of recreating, or getting as close as possible to, the mastering room playback is most likely/almost certainly counterproductive!!
1b. This brings us back to my initial point: even when we're recording in a real acoustic space, that's not what we're trying to create. We're NOT trying to recreate the best seat in the house! And the reason we're not trying to recreate the best seat in the house is that we're talking about an audience member, NOT a seat! In other words, the actual sound waves which would enter an audience member's ears are substantially different from what that audience member would perceive. What we hear is always a perception, a combination of our senses and our expectations. This perception is potentially constantly changing as we decide, consciously and subconsciously, what to focus on and, just as importantly for recording/mixing purposes, what not to focus on!

For example, in reality (the actual sound waves), the audience is making constant noise. However, as we're looking at (say) the orchestra, the brain will decide that constant noise is irrelevant/unimportant/an unwanted distraction and reduce its perceived level or even eliminate it entirely, unless something non-constant occurs (a loud cough, for example) or we consciously decide to override what our brain is doing by focusing our attention on that audience noise instead of the orchestra. This is only one of the tricks our perception is constantly playing in order to better hear and make sense of the world. Our hearing can also reduce some of the reflections, the number, duration and/or levels of those reflections; it will even reduce parts of the orchestra itself, the parts we're not concentrating on. For example, if the lead violin gets a solo, our brain will match that sound with our eyes and the combination will reinforce the focus of attention on that violin and reduce everything else.
On the face of it, this all appears to be pretty unnatural but of course it's actually the exact opposite: it's how our hearing has evolved to work and it's the only thing we've ever experienced as individuals, from even before we're born, so it actually sounds ENTIRELY natural. There's an obvious problem here: with a music recording our sight is contradicting our hearing. We're seeing, say, our living room but hearing an orchestra in a concert hall, and there is little/no reinforcement effect between our sight and hearing which, in the real-life scenario, would result in our brain manipulating, subconsciously reducing and amplifying, the various elements of what is entering our ears in favour of other elements. Assuming we're talking about an actual acoustic event, such as a symphony concert for example, then what we're doing with the recording and mixing is NOT trying to capture and recreate the actual audio reality but to create a sort of generalisation of what we would have perceived. This effectively means applying those reducing/amplifying brain manipulations to the recording itself, because our brain will not perform those manipulations when we're listening in our sitting rooms. There is no algorithm for this, there are too many variables at play, and ultimately it all comes down to the skill/technique, perception and subjective opinion of the engineers/producer. Also, this is in addition to any creative intent! For example, with our violin solo above, we might decide to make the lead violin a tiny bit louder and more present in the recording to emulate what we would likely have perceived had we been there (and our sight and hearing had combined to create this perception). Artistically though, we might decide that what we would have perceived is still not quite right or could be subjectively better; maybe we would make the violin even louder, or quieter again, or maybe tweak some other aspect of the sound.
1c. That one coincident pair not only severely limits what we can do artistically but, even if it were a perfect coincident pair (which is impossible), it would still only give us a perfect recording of the sound waves entering the ear, not a recreation of what we would perceive!
2. I understand how you have arrived at that belief but it's false! It would have a massive detrimental effect on creativity. Your analogy with cinema highlights this fact, although you don't seem to realise it. What you're talking about is not just a new language, it's a different language, a language which has no words or acceptable way of expressing most of the art/creativity of filmmaking, but which does provide some words/expressions which current filmmaking does not contain. What VR does is defined by its name, present a virtual REALITY, but reality is not what we're after! Narrative filmmaking is ultimately all about storytelling: we read a book by an author and are limited to the author's words in the order he/she placed them; we watch a film by a filmmaker/s and are limited to the frame and timeline the filmmaker/s present us with. VR presents the opportunity for the consumer to look and go where they want. This has an advantage and a disadvantage. The disadvantage is that the filmmakers no longer completely control exactly what you're seeing and hearing at any instant in time, and this means most of the subtle artistic storytelling tools evaporate or become chance. For example, a particular scene or even the whole story might depend on a simple gesture, a subtle facial expression, a subtle inflection, something going on in the background of a frame, something implied by a camera angle/movement or even a subtle sound. With VR we cannot use any of these and many other similar tools, because there's a fair chance the audience will miss most/all of it, and it's almost certain they'll miss at least some of it, because they will be looking wherever they want instead of where they're intended to. We'd have to create a film where it wouldn't matter if all the subtle cues were missed, and that has massive implications for the complexity of the stories, character development and interaction, in fact pretty much every aspect of the art of modern narrative filmmaking.
In effect, the advantage of VR is that the consumer themselves effectively becomes the storyteller, and this opens up a whole new set of interesting possibilities. But, and this is also the disadvantage, the more control the consumer has, the less control the filmmakers have, and I'm effectively substituting my own storytelling abilities for those of the filmmakers. I don't know about anyone else but I go and watch a film because of the storytelling abilities of Spielberg, Nolan, Kubrick or Singer; my own storytelling abilities are pathetic in comparison! I'm not saying VR is therefore pathetic, I think it has a bright and interesting future, but as a different thing, as a different entertainment experience altogether, rather than as an evolution of and replacement for film.
The idea of the performance and the individual sound objects being separate from the space the sound inhabits opens up a ton of new possibilities. Being able to play an object based recording and switch environments and mixes on the fly completely blows my mind. I can't wait to be able to do that.
The idea would be fun to an extent but in practice you'd lose more than you gained. Some of your "guesses" about the SACDs you mentioned were quite a bit wide of the mark; the technology for what you're guessing is barely possible even today, let alone in the 1990s. I did some work at CTS by the way, in the '90s, so probably around the time those recordings were made. I knew some of the personnel there, one of them quite well, but that's going back a bit; CTS closed its doors for the last time about 15 or so years ago.
G