The SACD is mastered at about -35dB RMS, and I swear 20dB of that is just for this one 'ping' on a glockenspiel. Been wanting to set up a blind test on that one for some time to see how much can be taken off the top.
This raises another interesting point (beyond the one you are making), one which is virtually always ignored but which makes a huge difference: the difference between what an instrument actually produces and how it is heard (or expected to be heard). Metal perc instruments, particularly those struck by hard beaters (like a glock, gamelan, triangle, etc.), produce massive transients. However, even the front row of an audience is going to be at least 30 feet away from the glock, so most of that massive transient (and its substantial HF components) is simply not going to be there any more. It's going to be absorbed massively by everything: the floors, walls, intervening musicians and their chairs, and even substantially by the air itself. What the instrument actually produces and what we would actually hear are two vastly different things. However, what we record could be either, depending on where we position the mic. If we were to place a mic a foot or a few inches from the glock we'd (more or less) record exactly what it actually produces, and we'd therefore have to apply some processing to bring it in line with what an audience would expect to hear. So, should that 20dB glock peak even be in the master to start with? Is it merely a consequence of the mic positioning that could/should have been reduced/dealt with? Logically of course, a glock ping should not overpower an entire orchestra going full pelt (fortissimo) by 20dB. This is just one of numerous examples of where compression is (or could be) vital and not the "black and white" evil it's made out to be.
In addition to this, there's your point that a few dB of compression cannot generally be heard, but how much cannot be heard depends entirely on what we're compressing and the type of compression used. And this brings up another overlooked/misunderstood point: compression/limiting isn't "a thing", it's a wide range of things with numerous different characteristics. Compressors cost anywhere from $0 to $50,000 (and $0 does not mean bad, just different) and any completed commercial music product from the last 40+ years is going to have at least 2 different compressors applied, possibly as many as 8 or so, at various stages (throughout mixing, mastering and, on older recordings, even during recording), with some of them used multiple times with different settings. So, suggesting that no compression be applied and that the consumer then adds their own is nonsense.
I think the old school way of looking at peak was as an emotional climax point, with the expectation you were waiting for it.
There was no old school way of looking at peaks, you couldn't see them! Analogue (VU) meters had a measurement window of about 0.3 secs (300ms) but transient peaks are in the range of a few (or a few tens of) milliseconds, so you either couldn't see them at all or their level was vastly under-reported! With digital, on the other hand, our measurement window (with CD, 44.1kS/s) is about 0.00002 secs (roughly 22µs or 0.02ms), and not only can we "see" the transients but we can "see" different parts/segments of the transients. Of course, though, with analogue you didn't get digital clipping, just compression/saturation/breakup.
Your post, though, is a little confusing: it's not clear whether you're talking about physical peaks or musical structure, which raises yet another very important and overlooked/misunderstood point. I warned you, this is a rabbit hole!!
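To put a rough number on that under-reporting, here is a minimal sketch in Python. It is not a model of real VU ballistics: the 300ms window is simply treated as a plain RMS average, and the 5ms burst length and 4kHz tone are made-up values chosen purely for illustration.

```python
import math

SAMPLE_RATE = 44_100   # CD sample rate, samples per second
WINDOW_MS = 300        # approximate VU measurement window from the post
BURST_MS = 5           # hypothetical transient length (assumption)
TONE_HZ = 4_000        # hypothetical burst frequency (assumption)

def make_signal():
    """Full-scale tone burst at the start of an otherwise silent window."""
    n_window = SAMPLE_RATE * WINDOW_MS // 1000
    n_burst = SAMPLE_RATE * BURST_MS // 1000
    return [
        math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE) if i < n_burst else 0.0
        for i in range(n_window)
    ]

def to_db(x):
    """Convert a linear amplitude to dB relative to full scale (1.0)."""
    return 20 * math.log10(x)

signal = make_signal()

# What a per-sample digital peak meter would report.
sample_peak = max(abs(s) for s in signal)

# Crude stand-in for a slow meter: RMS over the whole 300ms window.
window_rms = math.sqrt(sum(s * s for s in signal) / len(signal))

peak_db = to_db(sample_peak)
rms_db = to_db(window_rms)

print(f"sample peak:       {peak_db:6.1f} dBFS")
print(f"300ms window RMS:  {rms_db:6.1f} dBFS")
print(f"under-reported by: {peak_db - rms_db:6.1f} dB")
```

With these made-up numbers the slow window reads roughly 20dB below the sample peak, and the shorter the transient, the bigger that gap gets.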

Some of my answers may have appeared rather glib and that's because a full answer would take too long. Part of the problem is that, from my perspective, I'm some way down the rabbit hole while many of the posters here are just looking in from the outside and haven't even noticed that there is a rabbit hole! Communication is therefore going to be an issue without some common ground, some idea of the basic layout of the rabbit warren, and as this is the science forum, I'm presuming people are here to gain some understanding of what's really going on under the surface.
For this reason, I very strongly suggest reading this SOS article before we try to continue. It's the best I've seen on the subject because it's saying it as it really is, not the massively over-simplified soundbite/propaganda bilge which was necessary to get consumers interested in the issue! BTW, the article doesn't give all the answers, just a more reasonable place to start.
It's quite nonsense in my view to squeeze the sound to gain e.g. 15dB and then the material doesn't use the full dynamic range of the medium.
The reason your view is nonsense is because if we did "use the full dynamic range of the medium" not a single person would like it, not even an extreme audiophile! All we'd get is complaints and demands for refunds and that's why when we make very dynamic recordings we literally use about 1,000 times less than the full dynamic range available!
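As a rough sanity check on that "1,000 times" figure, here is a small sketch. Two things in it are assumptions rather than anything stated above: that the medium is 16-bit CD, and that "1,000 times" means an amplitude ratio (a factor of 1,000 in amplitude is 60dB).

```python
import math

CD_BITS = 16        # assumed medium: 16-bit CD
FACTOR = 1_000      # "1,000 times less", read here as an amplitude ratio (assumption)

# Theoretical dynamic range of 16-bit linear PCM, ignoring dither and noise shaping.
cd_range_db = 20 * math.log10(2 ** CD_BITS)

# An amplitude ratio of 1,000 expressed in dB.
factor_db = 20 * math.log10(FACTOR)

# What would be left for the recording itself under this reading.
used_db = cd_range_db - factor_db

print(f"16-bit range: {cd_range_db:.1f} dB")
print(f"factor 1,000: {factor_db:.1f} dB")
print(f"range used:   {used_db:.1f} dB")
```

Under those assumptions the range actually used comes out at roughly 36dB, which is in the same ballpark as the very dynamic SACD mentioned at the top of the thread.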
... if you want to hear the human voice in all its dynamic glory go to a church with 1/2 decent acoustics ...
Actually, I'd say the exact opposite, that a church is just about the worst place to go for "dynamic glory". In a church you've typically got a large space with very reflective surfaces and therefore a long, dense reverb with an RT60 of at least 4 secs and up to around 8 secs (and nearly double that in a large cathedral). This means notes overlap with the reverb of previous notes, there's no silence between notes and the dynamic range is reduced by the very wet acoustics. If you want to hear the dynamic glory of the human voice, try being about 3ft away from Placido Domingo in a broadcast studio. That's close enough to hear his jacket rustle as he breathes in and close enough to feel like you're in the middle of an artillery barrage! I'm not joking, the amount of volume was literally staggering (and the noise I made when I did stagger made him aware of my presence, and he stopped and apologised). There are very few mics which wouldn't be damaged if you tried to place them where you normally would for a pop/rock singer. I wasn't even directly in front of him and it was easily over 100dB.
G