Regarding graphs, I don't see the point of adding an HRTF to a dummy head measurement. The dummy is already an HRTF. So for me the "reference" is the grey curve: it's what's in, what's out, and this is what counts for me. I don't want to compare the frequency response to a target curve decided by a group of people saying it suits their taste best, nor compare headphones to speakers; I prefer headphones.
That's not completely accurate. The grey line is raw: a perfectly "flat" headphone will not measure flat. It will show a rise of roughly 8 dB from 2-8 kHz. Tyll explains here:
http://www.innerfidelity.com/content/headphone-measurements-explained-frequency-response-part-one
He uses the raw (grey) line not because flat means flat (it doesn't), but because he is familiar with the shape of the HRTF and would rather compensate in his mind than use the recommended compensation, because the latter over-compensates.
Think of it this way: your ear canal emphasizes tones from roughly 2 - 8 kHz. So if you listen to a natural wideband sound with near equal energy at all frequencies, like waves hitting the beach, when that sound hits your eardrum it's boosted roughly 8 dB from 2 - 8 kHz. That's what your brain says is "flat" psychoacoustically. Put differently: if you hold the dummy head in the open air at the beach and use it to measure this natural ocean sound you'd see the rise from 2-8 kHz. This is the dummy head's HRTF effect on the sound.
This is the grey line of Tyll's graphs. A "flat" sound does not measure flat.
Thus a headphone should measure, in absolute terms, a lift in the 2-8 kHz range in order to sound "flat" to your ears. It has to provide the lift that your HRTF would because it's bypassing your HRTF.
That's why all the grey curves for well-designed "flat" headphones show a lift in this range. So does the LCD-2F. Yet its lift is less than the HRTF effect, so it's actually slightly down in this range, perceptually or psychoacoustically. This is consistent with Audeze's own measurements, which use a different HRTF yet show a roughly similar response: the LCD-2F is slightly down in the 2-8 kHz range.
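To put rough numbers on that, here is a minimal Python sketch of the arithmetic: perceived level = raw (grey curve) level minus the HRTF lift. Everything in it is an assumption for illustration. The HRTF is reduced to a flat 8 dB step from 2-8 kHz, and the "raw" values are made up, not taken from Tyll's or Audeze's measurements. The function names are hypothetical too.

```python
# Illustrative only: all numbers below are made up, not real measurements.

def hrtf_lift_db(freq_hz):
    """Crude stand-in for the dummy head's HRTF: a flat 8 dB lift from 2 to 8 kHz."""
    return 8.0 if 2000 <= freq_hz <= 8000 else 0.0

def perceived_db(raw_db_by_freq):
    """Subtract the assumed HRTF lift from a raw (grey curve) measurement."""
    return {f: raw - hrtf_lift_db(f) for f, raw in raw_db_by_freq.items()}

# Hypothetical raw measurement of a headphone that lifts only 6 dB in the band.
raw = {500: 0.0, 1000: 0.0, 2000: 6.0, 4000: 6.0, 8000: 6.0, 12000: 0.0}

compensated = perceived_db(raw)
for freq, level in sorted(compensated.items()):
    print(f"{freq:>6} Hz: {level:+.1f} dB")
```

With these made-up values the 2-8 kHz band comes out 2 dB below the rest after compensation, even though the raw curve shows a 6 dB lift there. A headphone whose raw lift matched the assumed 8 dB would come out at 0 dB across the board, i.e. perceptually "flat".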
Everyone's head is a bit different, so every person has a different HRTF. I believe that plays a big role in why different people think different headphones are closer to the "real thing". They simply hear "the real thing" differently.