Based on my personal experience, I can think of a few things.
1. Frequency manipulation or personal signature
Most cans or buds are tuned to give a particular frequency response or sound signature. Those that are brighter and less bass-emphasised also tend to have a wider sound stage. I think the basic reason is that bass congests the other frequencies and gives the illusion of a more claustrophobic sound, when in reality the emphasised bass is just masking the other frequencies somewhat. Equally, if the bass is less prominent, the other frequencies are given more of a chance to sing (they're easier to hear), and as a result they sound better spread out or more apparent.
2. Physical design, drivers and resonance.
With cans, unlike with, say, amps or DACs, physical form and design also play a part. For example, wider or larger cups might affect the perceived width of the sound, because they have more space to spread the sound over. The effect is probably smaller than you'd expect, since ultimately the sound still travels down a small ear canal, but its method of getting there could vary.
Added to that, the shape or design of the cups holding the drivers could also have an effect: the way the sound is delivered through the cans, the way it's filtered, and the way it resonates on its way to the ear. Whether the can is open or closed, the shape of the cup, the filter applied to the driver, and the type and size of the driver could all matter.
Having said all that, I still feel that there will never be a headphone with a perfect balance of frequencies suited to all music, only one suited to personal taste or to particular music genres. Even then it's tricky, because different genres sound better on different cans, and different music is recorded using different cans. My belief is that if you inject more colour into the bass, you inevitably lose some width and air elsewhere, and if you add more sparkle at the expense of bass, you get the opposite. You can tweak this until you find a balance appropriate to your personal taste, but I don't think there's a de facto ideal, given all the variables in play and the static nature of the actual signal being fed.