^A very good question, indeed.
I can't give you a full answer, but much of it is about how sound decays within the headphone/ears. Most music is made for playback on speakers, so the soundstage will not be rendered correctly on headphones. Binaural music is made for headphones.
Soundstage can be deep, wide and high. If a headphone has a soundstage that lacks depth, you will feel that you are hit by a 'wall' of sound IME. If it lacks width, music seems to be crammed together IME.
There is another side of the coin when speaking of soundstage: imaging. Imaging is how precise and defined instruments are WITHIN the soundstage. For example, with the HiFiMAN HE-500, instruments beside you (to the left or to the right) seem very defined, but sounds coming from the front/middle, like vocals, seem a bit smeared IME.
Hope that helps.