Voltage sensitivity is taken at some reference frequency, typically 1 kHz. The frequency response chart shows you the output of the headphone through a range of frequencies given a constant voltage. The current draw can be worked out from the impedance graph of the headphone. The O2 should be able to power those headphones to near-deafening levels. Interestingly enough, the HE-6 as measured by Tyll is far less sensitive, with an output of 77 dB/mW and 90 dB at 1 volt.
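To show how the two HE-6 figures above relate, here is a rough sketch, assuming the load is purely resistive and both numbers were taken at the same reference frequency. The function names are made up for illustration. At 1 volt the power into an impedance Z is 1000/Z milliwatts, so the gap between the per-volt and per-milliwatt figures implies an impedance.

```python
import math

def sensitivity_per_volt(spl_per_mw, impedance_ohms):
    """Convert dB SPL at 1 mW to dB SPL at 1 V for a resistive load (assumption)."""
    power_mw_at_1v = 1000.0 / impedance_ohms  # P = V^2 / Z, expressed in mW
    return spl_per_mw + 10 * math.log10(power_mw_at_1v)

def implied_impedance(spl_per_mw, spl_per_volt):
    """Impedance in ohms implied by the gap between the two sensitivity figures."""
    return 1000.0 / 10 ** ((spl_per_volt - spl_per_mw) / 10)

# Figures quoted in the post: 77 dB/mW and 90 dB at 1 volt.
he6_impedance = implied_impedance(77.0, 90.0)
print(round(he6_impedance, 1))  # about 50 ohms
```

The 13 dB gap works out to roughly 50 ohms, so the two quoted numbers are consistent with each other.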
Bigshot, I think you need to use a steeper filter. When I look at it in SPAN, I can see you also trimmed off some response below 15 kHz. Not that I don't get your point about high frequencies, but perhaps try a test that removes just those higher frequencies instead?
Sean Olive's compensation is more or less derived from adjustments to diffuse field equalization. Golden Ears uses a different target, and their compensation is already built into the graphs. For example, here is the same JH16 presented earlier, now on an uncompensated graph.
The NT6 is specced to go to 18 kHz, and looking at measurements of a couple of sets, it appears to cut off at 17-18 kHz. Curiously enough, the single-driver Hidition ear first appears to have better presence in the top octave, as measured.
His measurement gear is described here: http://www.innerfidelity.com/content/headphone-measurment-proceedures-introduction-and-equipment As for what the electrical measurements show that is audible, I don't see anything besides phase.
The nozzles are different? Do you still have both on hand to take a photo of them together?
iOS has access to much better, more powerful equalizers than the Android platform: Accudio, EQu, and Equalizer, off the top of my head. In absolute terms, speakers are at a disadvantage and don't extend as deep in the bass as IEMs.
I remember seeing that the standard deviation of the human FR, using a microphone at the eardrum, was 1 decibel. The variation that we see in listener descriptions has more to do with acclimation, taste, and training than anything else, IMO. If Harman can get a bunch of testers to hear the same headphone similarly, I doubt that head differences are that big a deal.
The bulk of the energy is below 1 kHz, but the ear is most discriminating in the high midrange and low treble. The ear basically works out so that we hear equal energy per octave (pink noise), so on a spectrum most music would look tilted towards the bass.
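A quick numerical check of the "equal energy per octave" point, as a sketch only. It assumes an idealized pink spectrum whose power density is exactly 1/f, and the helper name and the octave edges are arbitrary choices for illustration.

```python
import math

def band_energy(f_lo, f_hi, steps=10000):
    """Integrate a 1/f power density from f_lo to f_hi with the midpoint rule."""
    width = (f_hi - f_lo) / steps
    return sum(width / (f_lo + (i + 0.5) * width) for i in range(steps))

# Octave bands from about 31 Hz up to 16 kHz (arbitrary illustrative edges).
octaves = [(f, 2 * f) for f in (31.25, 62.5, 125, 250, 500, 1000, 2000, 4000, 8000)]
energies = [band_energy(lo, hi) for lo, hi in octaves]

# Every octave holds the same energy, ln(2), even though the density itself
# falls by about 3 dB each time the frequency doubles.
psd_tilt_db_per_octave = 10 * math.log10(0.5)

for (lo, hi), e in zip(octaves, energies):
    print(f"{lo:8.2f}-{hi:8.2f} Hz: {e:.6f}")
print(f"density tilt: {psd_tilt_db_per_octave:.2f} dB per octave")
```

That downward slope of the density is why a spectrum plotted per hertz looks bass-heavy even when each octave carries the same energy.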
That would yield 110 dB, give or take a couple of dB for manufacturing consistency. These measurements are usually taken at 1 kHz. This assumes that he/she listens to music with 110 dB peaks. The Galaxy S4 is limited to about 1 volt output, which yields about 102 dB into the HD 600.
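For the arithmetic behind those numbers, a minimal sketch, assuming the headphone behaves linearly so that SPL rises by 20*log10 of the voltage ratio. The function names are invented for this example, and the 102 dB at 1 volt figure is the one quoted above.

```python
import math

def spl_at_voltage(spl_at_1v, volts):
    """SPL reached at a given drive voltage, from the 1 V sensitivity."""
    return spl_at_1v + 20 * math.log10(volts)

def voltage_for_spl(spl_at_1v, target_spl):
    """Drive voltage needed to reach a target SPL."""
    return 10 ** ((target_spl - spl_at_1v) / 20)

# HD 600 figure from the post: about 102 dB at 1 volt.
needed = voltage_for_spl(102.0, 110.0)
print(round(needed, 2))  # roughly 2.5 volts for 110 dB peaks
```

So a source limited to about 1 volt falls about 8 dB short of 110 dB peaks on that headphone.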
