[1] They may have tried to minimize room acoustics in the studio, but that doesn't mean that they don't intend for there to be room acoustics in home playback.
[2] Some forms of distortion actually *correct* error, like the timing calculations done by an AVR when you input the distances of the various speakers to the listening position.
[3] The distortion created by DSPs can actually improve the sound beyond what was heard by the engineers when the music was being mixed.
[3a] The best part of speakers is that there is a *natural* coloration to a real world installation that actually *enhances* the sound of the recording. In this case coloration is a good thing. I know a lot of people think good sound is just the sound, nothing but the sound, but with humans that isn't true. We want *present* sound- meaning it sounds like the music is in the room with us.
In general I agree with your post but there are a few points I disagree with:
1. No, this is virtually never the case. In TV/Film we tend to use more absorption to reduce room acoustics but never minimise them. The philosophy of music studio design is quite different though, typically with much less absorption and much more diffusion. In music production/mastering, room acoustics are therefore not minimised much beyond the average home environment but they are randomised to (hopefully) achieve a reasonably flat, neutral acoustic.
2. Yes, but in this case what's being corrected is your personal speaker positioning. The ideal goal is to add a distortion which cancels out the distortion caused by your speaker positions, resulting in a distortion-free reproduction of the recording.
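The distance/delay correction described above is simple enough to sketch. This is an illustrative toy, not any specific AVR's algorithm (the speaker names and distances are made up): the AVR delays the closer speakers so that sound from every speaker arrives at the listening position simultaneously, using delay = (farthest distance - this speaker's distance) / speed of sound.

```python
# Sketch of the time-alignment an AVR derives from entered speaker
# distances (illustrative only; real products add level trim, bass
# management etc.). Closer speakers are delayed so all arrivals coincide.

SPEED_OF_SOUND = 343.0  # metres per second, in air at ~20 degrees C

def alignment_delays_ms(distances_m):
    """Return per-speaker delays (in milliseconds) that time-align
    every speaker's arrival with the farthest speaker's arrival."""
    farthest = max(distances_m.values())
    return {
        name: (farthest - d) / SPEED_OF_SOUND * 1000.0
        for name, d in distances_m.items()
    }

# Hypothetical room: distances from each speaker to the listening position.
delays = alignment_delays_ms(
    {"front_left": 2.5, "front_right": 2.8, "surround_left": 1.9}
)
# The farthest speaker (front_right) gets zero delay; the nearest
# (surround_left) gets the most, about 2.6 ms here.
```

The added delay is itself a "distortion" of the signal, but one calculated to cancel the distortion introduced by the asymmetric speaker placement.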
3. Now this is a dangerous assertion which falls within audiophile myth/misrepresentation! As far as fidelity is concerned, you CANNOT improve on what was heard by the artists/engineers in the studio! You can of course change it according to your personal tastes, but what you end up with is lower fidelity, which is not an improvement, just a better match to your personal preference. IMO it's very important to make this distinction, because much of the misleading audiophile marketing and myth is based on perverting it: effectively selling a preference as a higher-fidelity improvement.
3a. This is essentially the same as point #3. The recording was released with the "natural colouration" of a real-world speaker installation and the human factor ALREADY baked in! It was mixed and adjusted on speakers by humans, according to their human perception and subjective opinion. It therefore already contains the exact amount of "present sound" intended, and adding more colouration to compensate for "humans" is effectively compensating twice. Now, maybe your subjective opinion differs from that of the artists; in my case it frequently does, and I often feel I would have done something somewhat differently, but I still want to hear what the artists themselves did. I don't want my system to automatically apply some adjustment to perhaps more closely match my subjective preference/opinion; I want to hear those artists' preference/opinion.
[1] If you do not play above 90db, in pure technical terms, it should be all there, at 16bit.
[1a] If there is an audible difference, and sure, that might be the case, what is causing it?
[2] There is also a ton of added distortion, for non-"dry rooms". Why this need for this super accurate rendering then?
[3] That something is complex, is not proof of much. But if it is complex, then it is not simple. ... What is expected, currently, with the results at hand, is that positional accuracy, as done by hearing, will correlate with the findings of the bounds discovered thus far. But that is for the simple stuff. There might be combination of variables, or complexity, that suddenly reveals a different result, as a result of how the brain and senses work. Until that is a known, it will remain an unknown.
[4] People also need to realize, that hardly any, if any, theory in physics is proven. They are just not proven false...
1. Assuming noise-shaped dither, which is standard practice, then we're talking more in the range of 120dB, not 90dB, AND that figure sits above the noise floor of your listening environment, which is probably around 20dB (with headphones) or >30dB with speakers. Therefore, your statement should be "if you do not play above 140dB it should all be there at 16bit". Can your headphones actually output 140dB? If not, any talk of audibility is irrelevant, because you obviously cannot hear what your equipment is not producing in the first place. I don't know of any headphones which can, but let's say there are some; now we can talk about audibility, and then we run into an even more serious problem: 140dB is well beyond the threshold of pain and well into the range of serious, permanent hearing damage. These two factors, producing headphones with 140dB output and what it would do to you if you actually tried to listen to such an output, are why we don't need more than 16bit!
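The arithmetic behind those figures can be sketched directly. This is a rough back-of-envelope check, not a precise psychoacoustic model: the ~20dB perceptual gain attributed to noise-shaped dither and the 20dB SPL room noise floor are the approximate figures used above, not measured constants.

```python
# Back-of-envelope check of the "140dB at 16 bits" claim.
# Assumed figures (from the discussion above, rounded): ~20 dB perceptual
# gain from noise-shaped dither, ~20 dB SPL ambient noise with headphones.

def ideal_snr_db(bits):
    """Theoretical SNR of an ideal quantiser with a full-scale sine:
    the standard 6.02 * bits + 1.76 dB formula (plain dither, no shaping)."""
    return 6.02 * bits + 1.76

plain_16bit = ideal_snr_db(16)       # ~98 dB without noise shaping
noise_shaping_gain = 20              # assumed perceptual gain of shaped dither
room_noise_floor = 20                # assumed dB SPL, quiet room, headphones

peak_spl_needed = plain_16bit + noise_shaping_gain + room_noise_floor
# ~138 dB SPL of peak playback level before the limits of 16 bit could
# even theoretically become audible -- far past the threshold of pain.
```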
1a. A range of potential factors: a fault or a deliberate design choice by a DAC manufacturer, or, in many cases, placebo or comparing different versions/masters.
2. No, music recordings are not made in dry rooms or designed for playback in dry rooms, as I explained above.
3. You are confusing two effectively unrelated factors. One factor is the container format (16/44 or 48); the other is what we choose to put into that container. As an analogy, let's say that the container is a plate and what we choose to put on that plate is food. As far as the food is concerned, we don't fully understand the perception of taste. A plate only has two things to worry about, though: not adding its own flavour to the food, and being big enough to contain any amount of food one person could eat. Provided the plate achieves these two requirements, the perception of taste and our understanding of it is completely irrelevant as far as the plate is concerned. It is of course entirely relevant to the food we put on the plate, but that's a human choice, a factor unrelated to the plate itself. 16/44 is already a plate which is far bigger than could ever be required; how would a plate another 100 times bigger improve the food? And as far as adding its own flavour is concerned ...
4. We're not talking about theories of physics. We're talking about proven mathematics, which you have been supplied with, mathematics which prove, using the analogy above, that the plate does NOT add its own flavour. The difficulty was the implementation of that proven maths with technology/engineering, but this difficulty has to be put into context. Even in the earliest days of consumer digital audio this "difficulty" was relatively (though not entirely) audibly insignificant, and the astounding advances in digital technology over the last 30+ years mean that not only are we way past any notion of audible significance, but we can achieve this feat at astoundingly low cost, about $1.50 trade price for such a DAC chip. That doesn't mean that all DACs are audibly perfect, because of course it's up to the individual DAC manufacturer how they choose to implement that proven maths: whether they go down the proven route of cheap, effectively perfect audio, or take the different route of imperfect audio in order to differentiate their product.
[1] I really like this post. It falls inline with my impression of how musicians typically work.
[2] Noise Reduction is used in post, if a mic picks up too much noise. Many find this a non-issue.
[3] As for math, high res recordings should sound as low res, but not in my case.
1. You say that, but it's completely contrary to your previous assertion of recording everything in mono with minimised acoustics. I presume you've seen, for example, how a drummer typically works? Have you ever seen a drummer record the snare drum in mono, then record the kick drum in mono, then the hi-hats, then each of the toms, then each of the cymbals? No, that's both impractical and aesthetically undesirable. They play the whole drum kit in one go; it's recorded both with spot mics on some of the instruments and in stereo, often with a room mic as well, and then all of this is mixed together in stereo for an aesthetically pleasing result. Essentially, classical music ensembles are recorded similarly, and so are most other genres of acoustic music. Additionally, minimising acoustics is particularly preposterous because all acoustic instruments rely on acoustics, in some cases entirely! For example, an audience never hears the direct sound of a French horn: the bell of the instrument is pointed towards the rear wall of the concert venue and the audience only ever hears the reflections. Recording the direct sound and minimising the reflections/acoustics would result in something quite different to the expected/desired French horn sound.
2. NR is routine practice in Film/TV; in fact you'd be hard pressed to find any TV/Film without NR, but not so in music. NR cannot perfectly differentiate noise from signal, because the difference is essentially a matter of human perception. Removing noise also damages the signal to some degree and is therefore avoided.
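The way NR damages signal can be shown with a deliberately crude toy. Real noise reduction works spectrally and is far more sophisticated than this simple amplitude gate, but it shares the underlying flaw: anything quiet enough to resemble noise gets attenuated, including genuine signal such as a reverb tail.

```python
import math

def noise_gate(samples, threshold=0.02):
    """Toy 'noise reduction': silence anything below the threshold.
    Illustrative only -- real NR is spectral, but shares this flaw:
    it cannot tell quiet signal from noise."""
    return [0.0 if abs(s) < threshold else s for s in samples]

# A quiet, decaying reverb tail: this is pure signal, with no noise at all.
tail = [0.05 * math.exp(-n / 100) * math.sin(n / 3.0) for n in range(1000)]
gated = noise_gate(tail)
# The gate destroys the end of the tail (and every low-level sample)
# along with any noise it was meant to remove: the signal is damaged.
```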
3. If it really is not just a trick of your perception (placebo), and provided you are not inadvertently comparing different versions/masters, then the only explanation is either that your DAC has been deliberately designed not to implement the maths for 16/44 audibly perfectly, or that the higher sample rates contain ultrasonic frequencies which are causing audible intermodulation distortion downstream of your DAC.
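That intermodulation mechanism is easy to demonstrate numerically. A minimal sketch, using an invented second-order nonlinearity as a toy model of an imperfect amplifier or driver (the 0.1 coefficient and the 24/27 kHz tone pair are arbitrary choices for illustration): two inaudible ultrasonic tones produce a brand-new, clearly audible 3 kHz difference tone.

```python
import math

fs = 192000                       # high sample rate so ultrasonic tones fit
f1, f2, N = 24000.0, 27000.0, fs  # two ultrasonic tones, one second of audio

def nonlinear(x, a=0.1):
    """Toy model of a slightly nonlinear analogue stage downstream of the
    DAC; the second-order term creates sum and difference tones."""
    return x + a * x * x

def tone_level(signal, freq):
    """Amplitude of one exact-bin frequency via direct DFT correlation."""
    re = sum(s * math.cos(2 * math.pi * freq * n / fs) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / fs) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

ultrasonic = [math.sin(2 * math.pi * f1 * n / fs)
              + math.sin(2 * math.pi * f2 * n / fs) for n in range(N)]
distorted = [nonlinear(s) for s in ultrasonic]

clean_imd = tone_level(ultrasonic, f2 - f1)   # ~0: no 3 kHz tone in the input
audible_imd = tone_level(distorted, f2 - f1)  # ~0.1: new 3 kHz difference tone
```

Neither input tone is audible on its own, yet the distorted output contains an in-band 3 kHz product: this is why ultrasonic content in high-sample-rate files can audibly change the sound without any "extra resolution" being heard.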
I absolutely recognize that my example was more theoretical than practical. My point was to refute the claim that mic self noise and preamp noise were the limiting factor.
Not sure I understand, because in practice mic self noise and preamp noise often ARE the limiting factor, or at least contributory, along with the noise floor of the environment.
G