Most of the tests that people conduct are garbage, anyway. ~Whee, look at me, I did a frequency test!~
Break-in with drivers is like break-in with shoes: Takes a while for the fresh-from-factory materials to get into their comfortably flexible place. Are you gonna hear a huge difference? Can you jump higher with an old pair of Nikes than you can with a new pair, or vice versa? Probably not to any appreciable degree. But the tightness of new shoes may translate into discomfort. Similarly, the tightness of new drivers may translate into problems rendering the envelope and phase of a sound.
We are, among other things, talking about the time it takes for the motor (magnet and voice coil) to push the diaphragm to a desired level at a desired frequency.
Check this link, and scroll down until you see some images:
http://www.bbesound.com/technologies/BBE_HDS/
It's a simplified diagram, but it gets the point across pretty well. The inertia of the moving parts, the resonance of the diaphragm, the tension of the damping, and everything else inside a speaker all fight the signal, and all of it contributes to a misshapen waveform.
Misshapen, but when averaged out on a frequency response plot, it still hits the desired frequency at the desired amplitude.
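If you want to see that for yourself, here's a rough toy sketch in plain Python. It is not a model of a real driver, and every name in it (make_wave, phase_of_third, and so on) is made up for illustration. It builds two waves from the same two sine components at the same amplitudes, shifts the timing of one component in the second wave, and then compares the two magnitude spectra (the thing a magnitude-only frequency plot shows) against the actual waveform shapes.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform; slow, but fine for a short toy signal."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def make_wave(n, phase_of_third):
    """One cycle of a fundamental plus a third harmonic at half amplitude.
    Only the phase (timing) of the harmonic changes between calls."""
    return [
        math.sin(2 * math.pi * t / n)
        + 0.5 * math.sin(2 * math.pi * 3 * t / n + phase_of_third)
        for t in range(n)
    ]

N = 64  # samples per cycle (arbitrary choice for the demo)
clean = make_wave(N, 0.0)
shifted = make_wave(N, math.pi / 2)  # same harmonic, different timing

# Magnitude spectra: phase information is thrown away here.
mag_clean = [abs(x) for x in dft(clean)]
mag_shifted = [abs(x) for x in dft(shifted)]

# Largest gap between the two magnitude spectra (should be rounding noise).
max_mag_diff = max(abs(a - b) for a, b in zip(mag_clean, mag_shifted))
# Largest sample-by-sample gap between the two waveforms.
max_wave_diff = max(abs(a - b) for a, b in zip(clean, shifted))

print(f"max magnitude-spectrum difference: {max_mag_diff:.2e}")
print(f"max waveform difference:           {max_wave_diff:.3f}")
```

The magnitude spectra match down to floating-point noise, while the waveforms themselves differ by a big chunk of the signal's amplitude. That says nothing about whether burn-in does this to a driver; it only shows that a magnitude plot can't tell these two shapes apart.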
To my knowledge, there are few or no Standardized Scientific Tests™ which reliably put a headphone through its paces on those teeeeensy tiny details; the things that go beyond generalized frequency and amplitude.
So. This doesn't solve the riddle, you know? But it at least gives a hint that there's more to it than pure imagination. I'm not saying that a lot of 'burn-in' experiences aren't pure imagination -- just saying that frequency response tests by their very nature do not account for the kinds of details which burn-in would be affecting.