Let's start with speakers. Their ideal frequency response is a flat line from 20 Hz to 20,000 Hz. That means that if you feed them an input signal made up of different frequencies at the same level (voltage), each frequency will measure the same volume on a dB meter. This has nothing to do with the frequency distribution of the music itself; it merely ensures that the speaker does not change the frequency distribution (tonal balance) of the music. If the music is bass heavy, the speaker will reproduce a bass-heavy sound; if it has a lot of treble, the speaker will reproduce a lot of treble.
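To make "flat" a bit more concrete, here is a minimal Python sketch. The measurement values and the function names (`level_db`, `deviation_from_flat`) are made up for illustration and are not real speaker data; the only real formula used is the standard voltage-to-decibel conversion, 20·log10(V/Vref).

```python
import math

def level_db(voltage, reference=1.0):
    """Level in dB of a voltage relative to a reference voltage."""
    return 20 * math.log10(voltage / reference)

def deviation_from_flat(levels_db):
    """Largest deviation (in dB) of any measured level from the average level."""
    levels = list(levels_db)
    mean = sum(levels) / len(levels)
    return max(abs(level - mean) for level in levels)

# Hypothetical measurements (frequency in Hz -> measured level in dB),
# taken with the same input voltage at every frequency.
flat_speaker = {20: 85.0, 200: 85.5, 2000: 84.5, 20000: 85.0}
bass_heavy_speaker = {20: 91.0, 200: 88.0, 2000: 85.0, 20000: 82.0}

print(deviation_from_flat(flat_speaker.values()))        # 0.5
print(deviation_from_flat(bass_heavy_speaker.values()))  # 4.5
```

The first speaker stays within half a dB of its average level, so it leaves the tonal balance of the music essentially alone; the second one boosts the bass and cuts the treble regardless of what the music contains.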
In addition, the overwhelming majority of sound engineers mix and master music on studio monitors, almost all of which have an extremely flat frequency response curve (not to mention that the studio itself is acoustically treated for flatness), and all the equalizing is done using those monitors as a reference. Thus, if you want to hear what the sound engineers intended, you need speakers with a flat FR (frequency response).
However, higher frequencies are attenuated faster as they travel through the air (that's why you only hear the bass when your neighbors play music). A good sound engineer mixes the music knowing this. If the microphones are placed near the instrument, the engineer equalizes the high frequencies to give an impression of space. If he wants to make it seem like it's played in a concert hall, he has to reduce the higher frequencies according to that distance. The final product (the music you buy) is then realistic, as far as frequency distribution is concerned, when listened to on speakers with a flat FR at the same distance the sound engineer listened from.
Knowing that, we can move on to the FR of headphones. Because the transducers are so close to the ear, high frequencies aren't attenuated by the time they reach the eardrum (not to mention the reflections from your pinna and your head). That's why the FR curve of headphones is not flat: higher frequencies should be reproduced at a lower volume to mimic the effect of traveling through air to reach our ears. Even so, the added high frequencies of headphones with a flatter FR make them sound more detailed than some speakers.
The lack of content in the high frequencies is partly due to distance attenuation and partly due to the fact that instruments don't naturally produce these frequencies. The highest sung note is an F6, whose fundamental is 1397 Hz; for other instruments, refer to this chart:
This is of course pure theory, and it assumes that the engineer has perfect equipment and is trying to mix accurately. Some sound engineers use less accurate monitors, are not aware of their monitors' faults, and thus create colored recordings. Other engineers might intentionally color the sound to make it sound "better" on cheap equipment.
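As a footnote to the 1397 Hz figure quoted for F6 above, it can be checked with a few lines of Python. This sketch assumes standard tuning (A4 = 440 Hz), 12-tone equal temperament, and the common MIDI numbering where middle C (C4) is note 60; the function name `midi_to_hz` is just a label chosen for this example.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Fundamental frequency of a MIDI note in 12-tone equal temperament."""
    # A4 is MIDI note 69; each semitone multiplies the frequency by 2**(1/12).
    return a4_hz * 2 ** ((midi_note - 69) / 12)

# Assumes the convention C4 = 60, so C6 = 84 and F6 = 89.
F6 = 89

print(round(midi_to_hz(F6)))  # 1397
```

Keep in mind this is only the fundamental: overtones sit at multiples of it, which is why there is still some musical content well above 1397 Hz, just much less of it.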