aptX HD is not even true 16-bit. The bits are discarded at the very beginning due to its ADPCM-like encoding.
There is no discarding at the beginning.... The beginning being the point where the algorithm knows nothing about what it can safely reduce or discard without losing accuracy of representation. What I mean is that ADPCM uses a differential signal to reduce the number of bits required to represent the *same* digital signal as closely as possible, only discarding information when unusual cases arise.
Imagine a single 1 kHz maximum-volume sine wave represented in 16-bit PCM: sample by sample it rises from zero to the upper limit (+32767), falls back through zero to the lower limit (-32768), returns to zero and repeats. ADPCM starts by storing the difference between consecutive samples instead of the absolute value of each sample. At a 44.1 kHz sample rate those differences are much smaller numbers than the full 16-bit range.
We can estimate the maximum differential between samples for this scenario (it's just a single sine wave): within one quarter of a period the wave goes all the way from zero up to the maximum level, and at 44.1 kHz a quarter period of a 1 kHz tone spans about 11 samples, so each individual step covers only a fraction of the full range.
So, roughly, divide the absolute range (0 to 32767) by the number of samples in a quarter period, which is 44.1 kHz / (4 * 1 kHz), approximately 11 - call it a factor of 10. Each difference near the zero crossing is then on the order of 32767 / 10, roughly 3,300 (the true worst case, where the sine is steepest, is a bit larger, about 32767 * 2 * pi * 1,000 / 44,100, roughly 4,700), shrinking towards zero as the wave approaches its peak.
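You can check those numbers with a quick Python sketch (my own illustration, not aptX code) that measures the worst-case sample-to-sample step for exactly this scenario - a full-scale 1 kHz sine sampled at 44.1 kHz:

```python
import math

FS = 44_100   # sample rate (Hz)
F = 1_000     # tone frequency (Hz)
A = 32767     # 16-bit full scale

# Two periods of a full-scale sine, quantised to 16-bit integers.
samples = [round(A * math.sin(2 * math.pi * F * n / FS)) for n in range(2 * FS // F)]

# The differential signal: what a plain differential stage would
# store instead of the absolute sample values.
deltas = [samples[n] - samples[n - 1] for n in range(1, len(samples))]

peak = max(abs(d) for d in deltas)   # steepest step, near a zero crossing
bits = peak.bit_length() + 1         # +1 for the sign bit
print(f"peak delta {peak} -> about {bits} signed bits instead of 16")
```

The peak difference comes out somewhat above the rough quarter-period estimate, because the sine is steepest at the zero crossing, but still far below the full 16-bit range.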
So ADPCM in this case only needs about a tenth of the original range to represent the same signal. That's not a tenth of the information, though - we're in binary, so a tenth of the range is actually a saving of roughly 3 bits per sample (2 to the power 3 is 8 - roughly 10).
So straight away, without doing anything else, differential coding can represent this sine wave in roughly 13-14 bits per sample instead of 16 - with no loss at all.
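The "no loss at all" part is easy to demonstrate in a minimal sketch: store full-width deltas, then rebuild the signal with a running sum.

```python
import math

FS, F, A = 44_100, 1_000, 32767
samples = [round(A * math.sin(2 * math.pi * F * n / FS)) for n in range(FS // F)]

# Encode: first sample as-is, then differences only.
deltas = [samples[0]] + [samples[n] - samples[n - 1] for n in range(1, len(samples))]

# Decode: a running sum of the deltas rebuilds every sample bit-exactly.
out, acc = [], 0
for d in deltas:
    acc += d
    out.append(acc)

assert out == samples  # lossless round trip
```

As long as the deltas themselves are stored exactly, the transform is perfectly invertible - the compression only becomes lossy once you quantise the deltas to fewer bits.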
That's just the starting point. It's actually more complicated than that, of course, with adaptive ranges (the quantiser step size can vary) and predicted waveforms that are subtracted out to shrink the range of the differential representation even further.
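Here's a toy illustration of the adaptive part (my own sketch - real codecs such as IMA ADPCM use tuned step-size tables and better predictors, and aptX's details differ again): quantise each delta to 4 bits, and grow or shrink the step size depending on whether the quantiser is saturating. The key property is that encoder and decoder adapt in lockstep from the codes alone, so no step-size side information needs to be transmitted.

```python
import math

QMAX = 7  # 4-bit signed codes: -8 .. +7

def adapt(step, code):
    """Shared step-size rule: grow on saturation, shrink on tiny codes."""
    if abs(code) >= QMAX:
        return min(4096, step * 2)
    if abs(code) <= 1:
        return max(1, step // 2)
    return step

def encode(samples):
    step, pred, codes, local = 16, 0, [], []
    for s in samples:
        # Quantise the prediction error to a 4-bit code.
        code = max(-QMAX - 1, min(QMAX, round((s - pred) / step)))
        codes.append(code)
        pred += code * step          # encoder mirrors the decoder's state
        local.append(pred)
        step = adapt(step, code)
    return codes, local

def decode(codes):
    step, pred, out = 16, 0, []
    for code in codes:
        pred += code * step
        out.append(pred)
        step = adapt(step, code)
    return out

FS, F, A = 44_100, 1_000, 32767
samples = [round(A * math.sin(2 * math.pi * F * n / FS)) for n in range(4 * FS // F)]
codes, local = encode(samples)
assert decode(codes) == local              # decoder stays in lockstep
assert all(-8 <= c <= 7 for c in codes)    # every code fits in 4 bits
```

This is deliberately crude - the doubling/halving rule is slow to track a sudden transient - but it shows the mechanism: the step size adapts to the signal, so the same small code range can cover both quiet and loud passages.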
If the encoded data rate is a reasonably high proportion of the original PCM data rate, quite a lot of the signal comes through very close to losslessly. ADPCM isn't trying to select frequencies and reduce their accuracy using psycho-acoustic models the way other codecs do; it's just a different technique to start from, with other methods layered on top to adapt the compression.
aptX also splits the signal into sub-bands first, which helps isolate and restrict the possible ranges within each narrower frequency band.
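As a toy illustration of why sub-bands help (a Haar-style two-band split of my own, much cruder than the QMF filter banks aptX reportedly uses): for a low-frequency tone, nearly all the energy lands in the "sum" band, so the "difference" band spans a tiny range and can be coded with far fewer bits - and the split itself is perfectly invertible.

```python
import math

FS, F, A = 44_100, 1_000, 32767
x = [round(A * math.sin(2 * math.pi * F * n / FS)) for n in range(2 * FS // F)]

# Haar-style split: pairwise sums carry the low band, pairwise
# differences the high band (kept unnormalised so everything is exact).
low  = [x[2*i] + x[2*i + 1] for i in range(len(x) // 2)]
high = [x[2*i] - x[2*i + 1] for i in range(len(x) // 2)]

print("low-band peak :", max(map(abs, low)))
print("high-band peak:", max(map(abs, high)))

# Perfectly invertible: l + h = 2*x[2i] and l - h = 2*x[2i+1], both even.
rec = []
for l, h in zip(low, high):
    rec += [(l + h) // 2, (l - h) // 2]
assert rec == x
```

For this 1 kHz tone the high band's peak is an order of magnitude below the low band's, which is exactly the property a sub-band codec exploits when it allocates different bit depths per band.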
So.... I hope some of that made sense to you.
This is sound science.