Let's work out some numbers. Take the maximum recorded level on a CD to be 0 dB, which means that the softest detail is at -96 dB.
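For anyone wondering where the 96 dB comes from, here is a minimal sketch of the usual rule of thumb: the ratio between full scale and one quantization step, expressed in dB. The function name is mine, purely for illustration.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

# 16-bit audio, as on a CD: roughly 6 dB per bit
print(round(dynamic_range_db(16), 1))  # 96.3
```

This is the plain quantization figure only; it ignores dither and noise shaping.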
Let's consider The Planets (Holst), conducted by Dutoit with the Montreal Symphony Orchestra. It is a CD with VERY large dynamics.
The loudest instant is at -1 dB and the recording averages -30 dB. That's a very soft recording, very much akin to a torture test.
Say you want to listen at a loud level, actually louder than the real-life performance: let's say 90 dB on average, which puts the peaks at 120 dB (~equivalent of a gunshot at 1 m).
The passages where one would notice a lack of detail are the softest ones, for example the beginning of the Saturn passage, averaging -47 dB with a minimum at -60 dB.
This means that the softest detail can still be 96-60=36 dB softer than the softest second of music.
Let's see how this plays into our scenario: 120-60=60, so at this exact second the music plays at 60 dB, with the smallest details at 60-36=24 dB, drowned about 6 dB below the room's noise floor (around 30 dB in a quiet room). That's not even taking into account that when playing a signal at 60 dB, the main signal somewhat masks the smaller details (i.e. it's easier to hear a mosquito flying in a silent room than at a rock concert).
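If you want to redo this arithmetic with your own listening level, here is a small sketch. The 120 dB peak and the -60 and -96 dB levels are the numbers used above; the 30 dB room noise floor is an assumption for a quiet room, and all the names are made up for the example.

```python
PEAK_SPL_DB = 120.0     # playback level of a full-scale (0 dB) signal
ROOM_NOISE_DB = 30.0    # assumed noise floor of a quiet listening room
CD_FLOOR_DB = -96.0     # softest detail a plain 16-bit file can hold

def spl(level_db: float, peak_spl_db: float = PEAK_SPL_DB) -> float:
    """Convert a level on the disc (relative to full scale) to a level in the room."""
    return peak_spl_db + level_db

quiet_passage = spl(-60.0)                 # softest second of Saturn
smallest_detail = spl(CD_FLOOR_DB)         # softest detail on the disc
margin = smallest_detail - ROOM_NOISE_DB   # negative means below the room noise

print(quiet_passage, smallest_detail, margin)  # 60.0 24.0 -6.0
```

Change PEAK_SPL_DB to something more civilized and the smallest details sink even further under the room noise.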
Now, some of you may say that 36 dB below the main signal is not enough. This is where dither plus noise shaping comes in: basically it's a psychoacoustic trick that brings the subjective dynamic range of a 16-bit file up to about 120 dB (you can look up the details, it's quite complicated).
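To give a feel for the dither half of that trick, here is a toy sketch (my own illustration, not how any real mastering tool does it, and with no noise shaping at all). A signal sitting at a quarter of one quantization step is simply rounded away without dither, but survives on average once triangular (TPDF) noise is added before rounding.

```python
import random

def quantize(x: float) -> int:
    """Round to the nearest quantization step (one step = 1 LSB)."""
    return round(x)

def tpdf_dither(rng: random.Random) -> float:
    """Triangular noise between -1 and +1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

def average_output(level: float, n: int, dither: bool, seed: int = 0) -> float:
    """Quantize a constant level n times and return the mean of the results."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        noise = tpdf_dither(rng) if dither else 0.0
        total += quantize(level + noise)
    return total / n

level = 0.25  # a detail four times smaller than one step
plain = average_output(level, 100_000, dither=False)
dithered = average_output(level, 100_000, dither=True)

print(plain)     # 0.0, the detail is gone
print(dithered)  # close to 0.25, the detail is back, buried in noise
```

The price is a little added noise; noise shaping then moves that noise to frequencies where the ear is less sensitive.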
Now we get our softest signals at 0 dB, 30 dB below the noise floor of the room and 60 dB below our main signal.
All problems are solved, 16 bit playback is indeed enough.
Finally, why do I call this a torture test? Because a "normal" CD that has not succumbed to the loudness wars is usually mastered much louder, maybe at -12 to -15 dB, with its softest passages at -30 dB and not at -60 dB like in the Planets CD.
PS: Graphic example of how a 1-bit (black and white) image can subjectively have a greater bit depth (i.e. more shades of gray) if you look at it from a certain distance.
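The same idea in a few lines of code, as a rough sketch: a flat gray is turned into pure black and white pixels using a standard 4x4 Bayer threshold pattern (ordered dithering, which may or may not be the method used for the picture). Seen from a distance the eye averages the pixels, and that average comes back to the original gray.

```python
# Standard 4x4 Bayer ordered-dithering matrix
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]

def dither_flat_gray(gray: float, size: int = 8) -> list[list[int]]:
    """Render a flat gray (0.0 = black, 1.0 = white) as a 1-bit image."""
    return [
        [1 if gray > (BAYER_4X4[y % 4][x % 4] + 0.5) / 16 else 0
         for x in range(size)]
        for y in range(size)
    ]

def mean(image: list[list[int]]) -> float:
    """Average brightness, i.e. what the eye sees from far away."""
    return sum(map(sum, image)) / (len(image) * len(image[0]))

image = dither_flat_gray(0.25)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
print(mean(image))  # 0.25
```

Every pixel is either fully on or fully off, yet a 4x4 tile can stand in for 17 different gray levels.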