It's easy to assume that compression automatically means lower quality, as it often has with audio in the past. However, that is not always the case. Zip files, for example, are a form of compression with no loss of data. With audio, we normally send all the information for each sample; it is more efficient to transmit only the difference from one sample to the next, which loses no data while using less bandwidth.
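As a rough illustration of that idea, here is a minimal sketch of plain delta encoding (my own example, not MQA's or Zip's actual scheme): the differences between consecutive samples tend to be small numbers, yet the original samples can be rebuilt exactly.

```python
# Illustrative sketch of lossless delta encoding: store the first sample and
# then only the change from one sample to the next; decoding is bit-exact.

def delta_encode(samples):
    """Return the first sample followed by successive differences."""
    encoded = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        encoded.append(cur - prev)
    return encoded

def delta_decode(encoded):
    """Rebuild the original samples by accumulating the differences."""
    samples = [encoded[0]]
    for diff in encoded[1:]:
        samples.append(samples[-1] + diff)
    return samples

pcm = [1000, 1004, 1003, 999, 1002]     # hypothetical sample values
deltas = delta_encode(pcm)              # [1000, 4, -1, -4, 3] -- smaller numbers to send
assert delta_decode(deltas) == pcm      # exact reconstruction, no data lost
```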
Compression, however, isn't what makes MQA special; it is somewhat beside the point. The process of "deblurring" is how MQA improves the sound quality over regular files.
Understanding MQA's deblurring
A causal transmission system has dispersive properties which result from filtering or attenuation. Fine details in the time waveform can be smeared or obscured if the end-to-end impulse response is not matched to the signal and to the receiver (the human listener).
Blurring has a direct parallel in the optical world, in the design of lenses, the dispersion of light in media, and in image processing. In electronics, it is well understood by the designers of oscilloscopes.
There is now considerable evidence from neuroscience that the human listener is more sensitive to time than to frequency, by which I mean both that the human listener can outperform Fourier time-frequency uncertainty and that sensitivity to temporal microstructure is finer than a linear system of the same bandwidth would allow. The fine details in sound that are important for the human listener seem to be on timescales as short as 5µs. It is critical to appreciate that these small-scale events in time do not necessarily have origins in high-frequency content. Sounds arrive at the microphone from different objects and directions, including reverberation, and voices and instruments are not point sources. It is very interesting that this order of sensitivity is not coupled to the human tonal limit of ~18kHz.
In a linear analogue system with cascaded elements contributing to high-frequency roll-off, temporal detail is smoothed by a function which shifts the centroid (group delay) and spreads, or can even merge, finely separated events.
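A small numerical sketch can make this concrete. The cascade below is my own illustration, not anything from MQA; the corner frequency, simulation rate and stage count are arbitrary assumptions. An impulse is passed through identical one-pole low-pass stages, and after each stage the energy centroid moves later and the spread widens.

```python
# Sketch: temporal smearing from a cascade of first-order high-frequency roll-offs.
import numpy as np

fs = 1_000_000                    # simulation rate, Hz (arbitrary, for illustration)
fc = 100_000                      # per-stage -3 dB corner, Hz (assumed)
a = np.exp(-2 * np.pi * fc / fs)  # one-pole coefficient: y[n] = (1-a)*x[n] + a*y[n-1]

def one_pole(x):
    """Apply a single first-order low-pass stage to the signal x."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = (1 - a) * x[n] + (a * y[n - 1] if n else 0.0)
    return y

x = np.zeros(512)
x[0] = 1.0                        # unit impulse into the chain
t = np.arange(len(x)) / fs
for stage in range(1, 9):         # a cascade of up to eight elements
    x = one_pole(x)
    w = x / x.sum()               # normalised impulse response used as a weighting
    centroid = np.sum(t * w)      # energy centroid, i.e. group delay
    spread = np.sqrt(np.sum(w * (t - centroid) ** 2))
    print(f"{stage} stage(s): centroid {centroid*1e6:5.2f} us, spread {spread*1e6:5.2f} us")
```

With the values assumed here, each stage pushes the centroid back by roughly a microsecond, which is exactly the kind of accumulation the budget argument below refers to.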
If we consider a complete recording chain, it may be that the designer of each individual component intended it to cover the frequency range up to 100kHz, but that is unlikely. Until recently it was considered adequate for an individual component to show a response flat to, say, 30kHz, whereas temporal considerations suggest this is barely adequate for the whole journey from performer to listener through a cascade of microphone, preamplifier, mixer, converter pre- and post-filters, replay pre- and power amplifiers and playback transducer. What we see here is that such a chain has already used up 8µs of the budget, while, by extrapolation, limiting just one component to 30kHz uses the entire budget.
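For a feel of where figures of this order come from, here is a back-of-the-envelope calculation. The 0.35/bandwidth rise-time rule and root-sum-square combining are my own assumptions for illustration, not MQA's published derivation, so the numbers land in the same ballpark as, rather than exactly on, the figures above.

```python
# Sketch: relating per-component bandwidth to temporal smear across a chain.
import math

def rise_time_us(bandwidth_hz):
    """Approximate 10-90% rise time of a single-pole roll-off (0.35 / bandwidth), in us."""
    return 0.35 / bandwidth_hz * 1e6

stages = 8                                       # mic, preamp, mixer, converter filters, amps, transducer
per_stage_100k = rise_time_us(100_000)           # ~3.5 us for one element flat to 100 kHz
chain_100k = math.sqrt(stages) * per_stage_100k  # ~9.9 us for the whole chain (root-sum-square)
single_30k = rise_time_us(30_000)                # ~11.7 us from a single 30 kHz element

print(f"one 100 kHz element: {per_stage_100k:4.1f} us")
print(f"chain of {stages} such elements: {chain_100k:4.1f} us")
print(f"one 30 kHz element alone: {single_30k:4.1f} us")
```

On these assumptions, a single 30kHz element contributes about as much smear as the entire wide-bandwidth chain combined, which is the sense in which it "uses the entire budget".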
It is critical to appreciate that this argument is based on the temporal smear of signals and not on the (unlikely) requirement that the human listener benefits from signal harmonics in the range 30–100kHz.
The fact that our systems should exhibit wide bandwidth does not mean that high frequencies are themselves the reason; rather, our neural processing is sensitive at the microsecond level to changes made within the audio band by filtering above 20kHz.
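To illustrate that last point (again my own sketch with assumed parameters, not an MQA filter), a steep linear-phase low-pass whose only job is to remove content above roughly 20kHz still stretches the time-domain response of everything that passes through it; the script below estimates how long its ringing stays within 60dB of the peak.

```python
# Sketch: ringing of a windowed-sinc "brick-wall" low-pass near 20 kHz.
import numpy as np

fs = 96_000                                  # sample rate, Hz (assumed)
taps = 255                                   # FIR length; odd and symmetric -> linear phase
cutoff = 20_000 / (fs / 2)                   # cutoff near 20 kHz, normalised to Nyquist

n = np.arange(taps) - (taps - 1) / 2
h = cutoff * np.sinc(cutoff * n) * np.hamming(taps)  # windowed-sinc low-pass prototype
h /= h.sum()                                         # unity gain at DC

# How long does the impulse response stay within 60 dB of its peak?
above = np.where(np.abs(h) > np.abs(h).max() * 1e-3)[0]
span_us = (above[-1] - above[0]) / fs * 1e6
print(f"impulse response stays within 60 dB of its peak for about {span_us:.0f} us")
```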
One final point is that this definition of deblurring is made in the analogue domain; sound is analogue in the air, and if there is a digital storage channel it should fit into this framework.