If the jump from 16 to 24-bit depth mainly has to do with the dynamic range of sound, and isn't likely to be perceived due to the dynamic range already allowed by 16-bit, why is the jump from 8 to 16-bit sound so instantly noticeable, even on very poor speakers?
I remember listening to a bunch of 8- and 16-bit clips back when I first got a sound card on my PC.
The jump from 8 to 16 bit is immediately noticeable because of how quantization error (the difference between the exact value of the analog signal and the nearest digital value) is usually handled: dithering, which "converts" that error into a noise floor. At 8 bit, with no noise shaping applied, this noise floor sits only about 48 dB below the signal, so it's easily audible. If you're listening at normal volume, that puts the noise somewhere around 30-something dB, i.e. the level of a quiet room, which is never totally quiet; thus, easily audible.
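To make that concrete, here's a minimal sketch of dithered quantization, assuming TPDF (triangular) dither of ±1 LSB, a quiet 997 Hz test tone, and a 44.1 kHz rate. The roughly -48 dBFS error floor at 8 bit versus roughly -96 dBFS at 16 bit falls out of the measurement:

```python
import math
import random

def quantize(x, bits, dither=False):
    """Quantize x (in [-1.0, 1.0]) to one of 2**bits levels."""
    scale = 2 ** (bits - 1)
    if dither:
        # TPDF dither: difference of two uniforms, +/- 1 LSB peak
        x += (random.random() - random.random()) / scale
    return round(x * scale) / scale

# A quiet (-60 dBFS) 997 Hz sine, one second at 44.1 kHz
amp = 10 ** (-60 / 20)
sig = [amp * math.sin(2 * math.pi * 997 * n / 44100) for n in range(44100)]

for bits in (8, 16):
    err = [quantize(s, bits, dither=True) - s for s in sig]
    rms = math.sqrt(sum(e * e for e in err) / len(err))
    print(f"{bits}-bit dithered noise floor: {20 * math.log10(rms):.1f} dBFS")
```

The dithered error behaves like broadband noise uncorrelated with the signal, which is exactly why it sounds like hiss rather than distortion.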
Now, if you shift to 16 bit, the noise floor is 96 dB below the main signal, which means the noise doesn't even reach the threshold of audibility at normal listening volumes, and if it somehow does, the noise floor of the 16 bit file remains buried in the noise floor of the room.
24 bit is about lowering the noise floor to -144 dB, i.e. lowering an already inaudible noise floor, which is therefore absolutely inaudible.
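The round 48/96/144 dB figures above come from the rule of thumb of roughly 6 dB per bit; the textbook formula for an undithered full-scale sine adds a small constant, SNR ≈ 6.02·N + 1.76 dB. A quick sketch:

```python
# Each bit buys ~6 dB of dynamic range; the exact textbook figure
# for an undithered full-scale sine is SNR = 6.02*N + 1.76 dB.
for bits in (8, 16, 24):
    print(f"{bits}-bit: ~{6.02 * bits + 1.76:.0f} dB")
```

Either way, the ordering is the same: 8 bit lands in easily audible territory, 16 bit at the edge of the room's own noise floor, 24 bit far below it.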
I think naike was getting at the fact that a lot of DAWs and sound applications can internally process the sound at higher bit-depths than what hardware supports (32-bit for Foobar, 48-bit for Pro Tools, etc.). Even in professional studios, most hardware sound cards and DACs only support up to 24-bit.
As other posters have commented, 16-bit has a ton of dynamic range if used properly, so what is the benefit of higher bit depths? Dynamic accuracy. Less interpolation. Even if we're talking about rock music that only uses the top 10% of available dynamic range, that 10% represents 6,553.6 values in a 16-bit system and 1,677,721.6 values in a 24-bit system. This is the same benefit seen with an increased sampling rate. Not only does an increased sampling rate allow higher representable frequencies, it results in more accuracy in the audible range due to a decreased need for interpolation, thanks to the additional samples.
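The value counts quoted above are just 10% of the total level counts at each bit depth; a two-line check, purely to confirm the arithmetic:

```python
# 10% of the available levels at each bit depth
for bits in (16, 24):
    levels = 2 ** bits
    print(f"{bits}-bit: {levels} levels, top 10% spans {levels * 0.10} values")
```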
You are speaking of accuracy, but what is it? In my opinion, you are speaking of the ability to distinguish a low-level signal when it's played at the same time as a high-level one. Let 0 dB be the highest level played; with 16 bit that corresponds to a 96 dB range.
Suppose you have a 1st sine wave playing at -6 dB (in 16-bit terms, an amplitude of (2^16)/2 = 32768) and a 2nd sine wave, representing a low-level detail, playing at -84 dB (an amplitude of 4). Does it matter if that amplitude is changed from 4 to 3 or 5?
Not that much, because this secondary signal is already 78 dB softer than the main one. Saying that there are more values in 24 bit is accurate, but it doesn't bring anything to the listener; all those minute variations are basically inaudible. With noise shaping we can reproduce secondary, tertiary, quaternary... signals that are up to 90 dB softer than the main signal, and anything softer would be inaudible anyway.
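Working the example above out in dB terms (using the full 2^16 range as the 0 dB reference, as in the example) shows how little a 4-to-3 or 4-to-5 amplitude change actually moves the detail's level:

```python
import math

FULL = 2 ** 16  # full 16-bit range, the 0 dB reference used above
for amplitude in (32768, 5, 4, 3):
    db = 20 * math.log10(amplitude / FULL)
    print(f"amplitude {amplitude:>5}: {db:6.1f} dB")
```

Amplitude 32768 lands at about -6 dB and amplitude 4 at about -84 dB; nudging 4 to 3 or 5 shifts the detail by only a couple of dB, down at a level that's already some 78 dB under the main tone.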