You're making a very basic - and false - assumption.
The original analog signal is a series of measurements of the pressure of the air over time.
It has no constraints and no limitations.... no band limitations... and no windowing limitations.
It is NOT a sinc function... it is completely arbitrary.
Set off a string of fire crackers.
Each pop is a single pressure wave, which expands, and eventually hits your ears.
What follows is a whole bunch of odd little squiggles in pressure as that wave bounces around and interacts with other stuff.
If I had a "perfect oscilloscope" I could draw a "perfect" picture of it.
There is no sound whatsoever before the first pop.
Things like "windowing errors" and "Gibbs ringing" do not exist in the original.
They are ERRORS that result from the conversion into digital.
1)
I'm going to hit a bell......... now.
In order to make an "accurate digital representation" of that signal - sort of......
You can sample it.
When you then reconstruct those samples - in order to do so perfectly, and get back a "perfect" version of the original (in terms of energy distribution) - your reconstructed signal would have to extend backward in time.
Forget the practicalities of that.......
The original bell hit had ZERO energy before I hit the bell; your reconstruction does; therefore your reconstruction has an error.
(The bell was NOT ringing before I hit it... but, in your reconstructed signal there is ringing before the bell hit; therefore they are NOT the same.)
Therefore, the ONLY question is whether the error that we know exists is audible or not.
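You can check the bell argument numerically. Here's a small sketch (my own toy signal, sample rate, and numbers - nothing from any real DAC): a decaying tone that is exactly zero before sample 0, reconstructed with the ideal sinc interpolation formula. Every sample before the hit is zero, but the band-limited waveform between those samples is not.

```python
# Toy illustration (my own numbers): a "bell hit" that is exactly zero
# before t = 0, reconstructed with the ideal sinc interpolation formula
#   x(t) = sum_n x[n] * sinc(t/T - n)
import numpy as np

fs = 100.0                      # sample rate in Hz (arbitrary)
T = 1.0 / fs
n = np.arange(-50, 51)          # sample indices; the hit is at n = 0
x = np.where(n >= 0, np.exp(-n / 10.0) * np.sin(2 * np.pi * 5.0 * n * T), 0.0)

def reconstruct(t):
    """Ideal band-limited reconstruction from the samples x[n]."""
    return float(np.sum(x * np.sinc(t / T - n)))

# Every SAMPLE before the hit is zero...
assert np.all(x[n < 0] == 0.0)
# ...but the reconstructed WAVEFORM before the hit is not: pre-ringing.
pre = max(abs(reconstruct((k + 0.5) * T)) for k in range(-5, 0))
print(pre)   # small, but definitely not zero
```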
2)
A DAC does NOT use a sinc function.
The output of a DAC is NOT "a sum of sample-weighted sinc functions with various delays".
The DAC (chip) outputs a stream of analog voltages - one for each sample you feed to it.
The DAC (chip) does not put out any signal before it receives the first sample (even if a "real" sinc reconstruction would require it to do so).
I'm not a mathematician....... but I believe that the "problem" is that the sinc reconstruction of a signal that starts or stops must extend forward and backward in time to infinity.
(The sinc reconstruction of a continuous sine wave needn't do that..... which is why the theory works perfectly for continuous sine waves.)
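The contrast in point 2 is easy to see in code. This is a sketch under my own toy assumptions: I model the bare chip (very roughly) as a zero-order hold, which is causal, and compare it to the ideal sinc reconstructor, which is not.

```python
# Toy contrast (my own simplification): a bare DAC chip acts roughly
# like a zero-order hold -- it outputs each sample's voltage until the
# next sample arrives -- so it is causal.  An *ideal* sinc
# reconstructor is not causal, because every sinc tail extends
# infinitely in both directions.
import numpy as np

samples = np.array([0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0])  # toy input
T = 1.0  # one second per sample, just for illustration

def zoh(t):
    """Zero-order-hold output: hold each sample until the next one."""
    if t < 0:
        return 0.0                      # nothing before the first sample
    i = min(int(t // T), len(samples) - 1)
    return float(samples[i])

def ideal_sinc(t):
    """Ideal reconstruction: x(t) = sum_n x[n] * sinc(t/T - n)."""
    n = np.arange(len(samples))
    return float(np.sum(samples * np.sinc(t / T - n)))

assert zoh(-0.5) == 0.0                 # the chip is silent before t = 0
assert abs(ideal_sinc(-0.5)) > 1e-3     # the ideal reconstructor is not
```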
I can give you a more ridiculous - but still valid - example.......
Let's design the most ridiculous filter imaginable.
It will be a super-duper-hyper-narrow bandpass filter.
It will pass 400.0000000000 Hz, with a cutoff of a million dB per octave.
I'm too lazy to do the math, but you will find that, due to the tradeoff between time resolution and frequency resolution (sharpness)
....our fun filter will take SEVERAL SECONDS to ring up to approximately full output level once it receives a 400 Hz input signal
....and our fun filter will ring detectably for several seconds after the signal stops (it will actually ring forever, but I've made sure it will ring powerfully enough that it will be easy to see).
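You can put rough numbers on that tradeoff. Here's a back-of-the-envelope sketch using a second-order resonator (my own stand-in; the Q value is an assumption, not a literal "million dB per octave" filter): its envelope decays as exp(-pi * f0 * t / Q), so the time constant is tau = Q / (pi * f0).

```python
# Rough numbers for the time/frequency tradeoff (illustrative Q value,
# my own assumption): a second-order resonator's envelope decays as
# exp(-pi * f0 * t / Q), giving a time constant tau = Q / (pi * f0).
import math

f0 = 400.0       # centre frequency, Hz
Q = 2500.0       # an assumed, very high quality factor
tau = Q / (math.pi * f0)        # seconds per 1/e of envelope decay
t_40dB = tau * math.log(100)    # time to ring down by 40 dB (x100)

print(f"time constant: {tau:.2f} s, 40 dB ring-down: {t_40dB:.1f} s")
```

With those numbers the envelope time constant is about two seconds, and ringing down a mere 40 dB takes around nine seconds - "several seconds", exactly as claimed above.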
I now create a tone burst that is 40 cycles of a 400 Hz tone (it exists for 0.1 seconds).
If I play it from reasonably good speakers in an anechoic chamber it will seem to start and stop quite suddenly.
I can create my signal by taking the output of a signal generator set to 400 Hz and gating it at the zero crossing point to pass forty full cycles and then stop.
(I'm going to gate it using an FET for a switch.)
Now I'm going to send this signal to my fun filter.
The input of my filter will be a 0.1 second set of forty sine wave cycles of a 400 Hz tone.
The output will NOT.
It will increase in level gradually, and it will continue to ring - decreasing gradually - for several seconds after the input stops.
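This whole experiment is easy to simulate. The sketch below is my own toy version: a second-order resonator biquad (RBJ-cookbook style, Q chosen to keep the simulation short rather than "a million dB per octave") fed with the 40-cycle, 0.1-second burst. The output ramps up during the burst and keeps ringing long after the input goes silent.

```python
# Toy simulation of the tone-burst experiment (my own stand-in filter:
# a second-order bandpass resonator; Q and sample rate are assumptions
# chosen so the ringing is easy to see in a short run).
import numpy as np

fs, f0, Q = 8000.0, 400.0, 200.0
w0 = 2 * np.pi * f0 / fs
alpha = np.sin(w0) / (2 * Q)

# RBJ-cookbook bandpass biquad, 0 dB peak gain at f0 (normalized by a0)
b = np.array([alpha, 0.0, -alpha]) / (1 + alpha)
a = np.array([1.0, -2 * np.cos(w0) / (1 + alpha), (1 - alpha) / (1 + alpha)])

t = np.arange(int(1.5 * fs)) / fs
burst = np.where(t < 0.1, np.sin(2 * np.pi * f0 * t), 0.0)  # 40 cycles

# direct-form I filter loop
y = np.zeros_like(burst)
for i in range(len(burst)):
    y[i] = b[0] * burst[i]
    if i >= 1:
        y[i] += b[1] * burst[i - 1] - a[1] * y[i - 1]
    if i >= 2:
        y[i] += b[2] * burst[i - 2] - a[2] * y[i - 2]

# The input is silent after 0.1 s, but the filter keeps ringing:
late = y[int(0.3 * fs):int(0.4 * fs)]
print(np.sqrt(np.mean(late ** 2)))   # clearly above zero
```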
The INFORMATION it contains will be the same (which satisfies Nyquist and Shannon).
(Nyquist and Shannon don't actually specify how long I have to wait for all of my information to "accumulate" or "reconstruct".)
However, the FORM of that information will be very different...... which may or may not satisfy a human being.
Basically, at the risk of oversimplifying, information theory says that, as long as you follow certain constraints, the SAME INFORMATION will still be there.
HOWEVER, their definition of the term "information" isn't intuitively what you might think.
I could post this message in Braille, or in MIME encoding...... the same information would be there...... but it would LOOK quite different.
Likewise, Nyquist & Shannon make a statement about the information.... but not about how our signal SOUNDS, or whether it is AUDIBLY the same as the original.
(When we design DACs, we do our best to design the filters and such so that the output also SOUNDS audibly similar.
Besides the sampling-theory constraints, we follow other constraints that are based on acoustics and human perception.
For example, we don't spread out a tick over ten seconds - because we know that, even if the information content is the same, it will SOUND different.)
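The "tick spread out over time" idea can be made concrete. Here's a toy version (all numbers are my own illustrative choices): convolve a click with a long chirp. The information survives - a matched filter recovers the click's exact position - but the waveform, and therefore the sound, is utterly different.

```python
# Toy version of the "spread-out tick": convolving a click with a long
# chirp preserves the information (a matched filter recovers the click's
# position exactly) while completely changing the waveform's FORM.
# All numbers are my own illustrative choices.
import numpy as np

fs = 1000
t = np.arange(2 * fs) / fs                          # 2-second chirp
chirp = np.sin(2 * np.pi * (10 * t + 100 * t ** 2))  # 10 -> 410 Hz sweep

click = np.zeros(3 * fs)
click[fs // 2] = 1.0                                # a single tick at 0.5 s

smeared = np.convolve(click, chirp)                 # tick spread over 2 s
recovered = np.correlate(smeared, chirp, mode="valid")  # matched filter

# The smeared signal no longer looks (or sounds) like a tick...
assert np.count_nonzero(np.abs(smeared) > 0.1) > 1000
# ...but the matched filter still finds it at exactly 0.5 s.
assert np.argmax(np.abs(recovered)) == fs // 2
```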
I feel sick and tired today, so it's hard to think. The reconstructed analog signal is a sum of sample-weighted sinc functions with various delays. I don't see why this should work only for sine waves.
Time errors shouldn't depend on delay. Why would a filter cause a different time error for a delayed signal? That doesn't make sense - it would require time-variant filters. Maybe it's all because the sinc functions are actually windowed versions. In that case we can increase the window size and reduce the error, making it as small as we want (or need).