Sorry to cause offence. You can't simplify beyond a certain point and still accurately convey the essentials here.
If everything in the total ADC/DAC chain is done as close as possible to the Nyquist-Shannon criterion, a filter with pre-ring is a necessity, both in the anti-aliasing filter at the record end and in the reconstruction filter at the replay (DAC) end. The ringing is at the corner frequency, and decays in amplitude symmetrically as you move away from the peak response point. It is caused by the filter itself. In other words, a good brickwall filter needs a time-domain impulse response with both pre-ring and post-ring. Technically speaking, the longer that impulse response, the more accurate the result; the same goes for the accuracy of the filter coefficients. Here Rob Watts is undoubtedly correct insofar as the replay filter is concerned. He also concedes that the ADC filter used by the recording/mastering engineer is normally outside the control of the DAC designer. Rob seems to believe an optimum sinc replay filter is best whatever is used at the record end. He has far more experience of that, in terms of listening tests, than I do.
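To illustrate the symmetric pre-ring and post-ring described above, here is a minimal sketch of a truncated (Hann-windowed) sinc low-pass filter. The function name, the 101-tap length, the window choice and the 0.45 x fs cutoff are assumptions picked for the example, not anything taken from a real ADC or DAC design.

```python
import math

def windowed_sinc_lowpass(num_taps, cutoff):
    """Linear-phase low-pass FIR taps.

    cutoff is a fraction of the sample rate (0 to 0.5); num_taps should be odd.
    """
    centre = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        t = n - centre
        # The ideal brickwall impulse response is a sinc; t == 0 is its limit value.
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        # A Hann window truncates the infinitely long sinc to num_taps samples.
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]  # normalise to unity gain at DC

taps = windowed_sinc_lowpass(num_taps=101, cutoff=0.45)
centre = len(taps) // 2
pre_ring = taps[:centre]        # taps before the main peak
post_ring = taps[centre + 1:]   # taps after the main peak

print("largest pre-ring tap :", max(abs(h) for h in pre_ring))
print("largest post-ring tap:", max(abs(h) for h in post_ring))
print("mirror images:", all(abs(a - b) < 1e-12
                            for a, b in zip(pre_ring, reversed(post_ring))))
```

The taps before the peak mirror the taps after it, which is what makes the filter linear phase and is also why it cannot avoid pre-ring.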
It is important to point out that the signal we are able to reconstruct, using the Shannon-Nyquist-Whittaker method, is the analogue signal we wish to record AFTER it has been through a brickwall filter, i.e. we are not claiming to reconstruct the original signal. That said, the theory says that if the original input signal has no frequency components above the brickwall frequency, the filter should not affect it.
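As a rough sketch of the reconstruction being discussed, the Whittaker-Shannon sum rebuilds the value between samples as a sum of sinc functions, one per sample. The example below assumes a 0.1 cycles/sample sine as the band-limited input and a finite block of 2001 samples, so the result is only approximate (the true sum is infinite). The names `sinc` and `reconstruct` are made up for the illustration.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t):
    """Whittaker-Shannon sum at time t (in sample periods) over a finite block."""
    return sum(x * sinc(t - n) for n, x in enumerate(samples))

freq = 0.1  # cycles per sample, well below the 0.5 Nyquist limit (assumed test tone)
samples = [math.sin(2 * math.pi * freq * n) for n in range(2001)]

t = 1000.5  # halfway between two samples, in the middle of the block
estimate = reconstruct(samples, t)
exact = math.sin(2 * math.pi * freq * t)

print("reconstructed:", estimate)
print("exact        :", exact)
```

The small remaining difference comes from truncating the sum to a finite number of samples, which is the same reason a longer filter gives a more accurate result.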
The fact that DAC designers often offer a series of possible reconstruction filters to the user shows this is all an inexact science in practice.
Hope this helps. I'm an electronics engineer and physicist. If I've got anything wrong maybe Rob or someone else can put me right.
I have been enjoying the recent technical posts, but I thought I ought to clarify some things.
If you look into sampling theory, it has nothing to say about how bandwidth limiting is done before sampling; it merely requires no energy at and above FS/2, which for 44.1 kHz would be 22.05 kHz. If the signal is not bandwidth limited, aliasing occurs. In practice, with a real ADC, the issue here is decimation from the n-bit quantizer running at 3 or 6 MHz (conventional ADCs) or 104 MHz (my pulse array ADCs). Since it's relatively easy to filter audio in the analogue domain to prevent aliasing at 104 MHz, the challenge is in designing the digital decimation filters, which then supply the samples for the output (OP) data. So long as you achieve acceptable aliasing performance, in principle it does not matter whether the decimation filter is IIR (causal, so will NOT pre-ring) or symmetric FIR (strictly non-causal, so will pre-ring). Note that the decimation filter does not have to be sinc or brick-wall, just good enough to give acceptable aliasing performance.
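The aliasing arithmetic above can be sketched in a few lines. The 30 kHz test tone and the helper name `alias_frequency` are assumptions for illustration; the point is only that, once sampled at 44.1 kHz, a tone above FS/2 produces exactly the same samples as a tone folded back below FS/2.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency in 0..fs/2 that a tone at f_hz lands on once sampled at fs_hz."""
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)

fs = 44_100
tone = 30_000                      # above FS/2 = 22.05 kHz, so it will alias
alias = alias_frequency(tone, fs)  # 44.1 kHz - 30 kHz = 14.1 kHz

# The two cosines are indistinguishable at the sample instants.
original_samples = [math.cos(2 * math.pi * tone * n / fs) for n in range(32)]
alias_samples = [math.cos(2 * math.pi * alias * n / fs) for n in range(32)]
worst_mismatch = max(abs(a - b) for a, b in zip(original_samples, alias_samples))

print("alias frequency (Hz):", alias)
print("worst sample mismatch:", worst_mismatch)
```

Nothing after the sampler can tell the two tones apart, which is why the band limiting has to happen before the sample rate is reduced.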
Of course, defining acceptable aliasing performance is not easy, as the brain is perceptually extremely sensitive to vanishingly small amounts of aliasing (based on a number of different listening tests), and getting the required performance is a huge challenge with IIR but relatively simple with FIR. The downside of IIR is the loss of phase linearity: the filter delay changes with frequency, and this would be substantial. My instinct is that FIR, with no phase linearity issues, would be much better. My FIR decimation filters have already been designed and listened to, and they have only improved sound quality, as one benefit of the filters is to remove HF distortion and noise from the ADCs. The listening tests were in non-decimation mode, so the filters were just acting as low-pass filters, and this suggests that the pre-ringing is subjectively completely inconsequential.
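The phase linearity point can be shown with a toy comparison: a short symmetric FIR against a one-pole IIR low-pass. Both filters are made-up examples, far simpler than any real decimation filter, and the group delay is estimated numerically from the phase difference between two closely spaced frequencies.

```python
import cmath

def freq_response(b, a, w):
    """H(e^{jw}) for numerator b and denominator a (coefficients of z^-k)."""
    z_inv = cmath.exp(-1j * w)
    num = sum(c * z_inv ** k for k, c in enumerate(b))
    den = sum(c * z_inv ** k for k, c in enumerate(a))
    return num / den

def group_delay(b, a, w, dw=1e-4):
    """Group delay in samples: minus the slope of phase against frequency."""
    ratio = freq_response(b, a, w + dw) / freq_response(b, a, w - dw)
    return -cmath.phase(ratio) / (2 * dw)

# Symmetric 5-tap FIR (assumed example): linear phase, delay of 2 samples.
fir_b, fir_a = [1 / 9, 2 / 9, 3 / 9, 2 / 9, 1 / 9], [1.0]
# One-pole IIR low-pass y[n] = 0.1 x[n] + 0.9 y[n-1] (assumed example).
iir_b, iir_a = [0.1], [1.0, -0.9]

freqs = [0.1, 0.5, 1.0]  # radians per sample
fir_delays = [group_delay(fir_b, fir_a, w) for w in freqs]
iir_delays = [group_delay(iir_b, iir_a, w) for w in freqs]

print("FIR group delay (samples):", fir_delays)
print("IIR group delay (samples):", iir_delays)
```

The FIR delay is the same at every frequency, while the IIR delay varies by several samples across the band even for this very gentle filter.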
Your previous post about ringing was absolutely correct. It's something that the audio industry gets wrong all the time: ringing is only a consequence of energy being there. If there is no output at 20 to 22.05 kHz, there will be zero ringing from the filter with transients (this applies to both decimation and interpolation filters). In practice, the energy from the mic at 20 kHz is very low, typically -60 dB.
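A quick way to see that ringing needs energy near the corner frequency is to pass two signals through the same linear-phase low-pass filter: a single-sample click, which has energy at every frequency, and a smooth pulse with essentially none near the corner. The filter (a 101-tap Hann-windowed sinc at 0.45 x fs), the Gaussian pulse width and all names below are assumptions chosen for the sketch.

```python
import math

def lowpass_taps(num_taps=101, cutoff=0.45):
    """Hann-windowed sinc low-pass, unity gain at DC (cutoff as a fraction of fs)."""
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        taps.append(ideal * (0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [h / gain for h in taps]

def convolve(x, h):
    """Direct-form FIR filtering (full convolution)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            y[i + k] += xi * hk
    return y

def max_change(x, h):
    """Largest difference between the filtered signal and the delay-aligned input."""
    delay = (len(h) - 1) // 2
    y = convolve(x, h)
    return max(abs(y[n + delay] - x[n]) for n in range(len(x)))

taps = lowpass_taps()
length, middle = 400, 200
click = [1.0 if n == middle else 0.0 for n in range(length)]                  # wideband
smooth = [math.exp(-0.5 * ((n - middle) / 8.0) ** 2) for n in range(length)]  # low-frequency only

click_change = max_change(click, taps)
smooth_change = max_change(smooth, taps)

print("click  changed by up to:", click_change)
print("smooth changed by up to:", smooth_change)
```

The click comes out with the filter's ringing added on both sides, while the smooth pulse passes through almost untouched, even though the filter is identical in both cases.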
But the interpolation filter within all DACs is a very different challenge, and is several orders of magnitude more of an issue. Firstly, you must use a sinc function if you wish to perfectly preserve transient timing. If you use any other type of filter, transients will be constantly modulated by the program material, sometimes too early, sometimes too late, and this will have huge impacts perceptually, as the timing of transients is vitally important for all facets of audio perception. Music, after all, is defined by transients.
An additional problem with the DAC interpolation filter is suppression of image aliasing. Once a signal is sampled (let's say at 48 kHz), images at 0 dB exist up to infinite frequencies, with an image centred at every 48 kHz interval; these images must be removed, as they create out-of-band aliasing, cause noise floor modulation and in themselves damage transient timing.
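For a concrete feel of where those images sit, the sketch below lists the first few image frequencies of a single tone. The 1 kHz tone and the helper name `image_frequencies` are assumptions for the example; for a tone at f sampled at fs, the images appear at k x fs - f and k x fs + f for every whole number k.

```python
def image_frequencies(f_hz, fs_hz, num_multiples):
    """Images of a sampled tone around the first num_multiples multiples of fs."""
    images = []
    for k in range(1, num_multiples + 1):
        images.extend([k * fs_hz - f_hz, k * fs_hz + f_hz])
    return images

images = image_frequencies(1_000, 48_000, 3)
print(images)  # [47000, 49000, 95000, 97000, 143000, 145000]
```

The pattern continues at every further multiple of 48 kHz, which is the material the interpolation filter has to suppress.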
As to your comment "The fact that DAC designers often offer a series of possible reconstruction filters to the user shows this is all an inexact science in practice."
It's exactly NOT an inexact science - I only give one version of the WTA1 filter as this is fine tuned to give the best possible SQ performance by reducing the transient timing uncertainty by as much as possible. The reason other companies give different filter options is because they do not understand what they are doing. In essence they are using varying transient distortions to allow listeners to play with the sound. But using distortions to modify the sound will not allow one to fully experience the emotions of the original musical event, which is why I give the best available with the given FPGA and no options. It would be very easy for me to give lots of filter options - and commercially it would perhaps be better if I did - but I refuse to give options when I know it is damaging the true performance. My goal is to recover the sound from the mic as accurately as possible, and not produce toys that I know will degrade this performance.