This might not be the correct use of these terms. It would be nice to get things right and in agreement. Please fill me in if I am wrong.
I don't know what the correct terminology is, but from my experience, the following makes sense to me. (I start off by talking about speaker sound.)
Sound stage
Some systems seem to disappear. There is no sound "coming from the speaker", and if I close my eyes, it is actually hard to pinpoint where the speakers are. For such systems, the far right is about the straight line going from my head to the speaker, just continuing behind the speaker. Some of these systems are not as great at drawing the depth.
By depth, I mean the ability to place the instruments along the lines that run from my head through the speakers, or in the area behind the speakers. Some systems are great for orchestral music, as instruments appear both closer and further back in depth.
A combination of reproduction in both depth and width is rare. It is also strongly dependent on the room in which the speakers are placed, and on how everything is positioned, in particular within the extended triangle made up by my head and the two speakers.
Some speaker makers actively try to draw the sound to the front of the stage, like Monitor Audio. In that case, the soundstage could be described as close and personal, meaning nothing more than that the speaker is tuned to draw the music closer to the foreground. The foreground is the line running between the speakers.
Other speakers, like the Snells I am using, do it the old-fashioned way. They like to draw things more in depth. That is just awesome for big stage music, like big choirs, but not so great for a single artist with a guitar.
I do not know if I hit this correctly, but as I see it, width and depth, as to where instruments might possibly be positioned, are important for sound stage. As would be the way the sound stage is drawn, like front or rear heavy. That sort of makes sense to me. If I got this wrong, please let me know.
Imaging
This term is quite new to me, as I would not use it in my native tongue. The way I see it, how well instruments and voices are placed in the stereo perspective, the positioning, is independent of the sound stage itself. Well, to some degree. By that, I mean that even if reproduction is wide and deep, instruments might not be placed with spectacular accuracy. Where they are placed, both by depth and width, might be a bit blurry. In my native tongue, I am accustomed to calling this perspective. My understanding is that this would be imaging.
I guess that body should fall into this as well? Let me explain. If an instrument is well reproduced, all its tones are placed correctly. In some recordings, the movement of the fingers pressing the strings of a guitar is placed to the right, while the fingers hitting the strings are slightly to the left. The sound of the wooden guitar box extends a bit more, as it should. That is a hyper excellent recording, and superb imaging. Also, the expected sounds are reproduced as expected, and in tune. If so, the guitar seems to be given body, and virtually exists.
If I were to use a term like imaging, that would make sense to me. I might be wrong though. Would be nice to nail this one.
Articulation
Articulation is related to imaging. In my experience, this is not about placement in the stereo perspective. It is more about how well articulated the reproduction of voices and instruments is.
By that I mean tone and details. Not just that things are reproduced, but that they are reproduced with precision. This is best described by the horrific reproduction of percussion under mp3 compression, in particular cymbals. They are reproduced by mp3, but that is it. The finer details are completely lost in the conversion. It is washed out into a high-pitched something, lacking almost any articulation.
Separation
To me, this is the ability to separate instruments and vocals. If I focus, I am able to track an instrument, and the easier it is, the better the separation. I can hear what it plays.
Then there is, again, false separation. If I lossy-compress music, some instruments actually get easier to follow. But that often comes at a cost, as other instruments simply disappear. Filtering out other instruments is not improved separation.
Musicality
I do not think this has been mentioned, but musicality is, to me, the ability to draw me into the music, to engage me. Meaning what, exactly?
To me, first and foremost, that everything is reproduced in harmony. By that I mean that harmonies are heard. Like listening to a choir, the voices harmonize. Or even for pop, the instruments blend as they should. They are reproduced correctly by tone. To me, when that is the case, I sort of forget everything else, as I am sensitive to that.
Also, nothing should be off. If you have great speed and attack in the highs, slow and dull bass sounds off against that. The infamous hiss of the HD800 is another example.
Details
Details is simpler to quantify. It is simply whether sounds are audibly reproduced. Accuracy has little to do with this. Some gear lifts all the lower-level sounds, which is a safe bet to improve details. But not articulation. Not musicality. Not imaging. Just lifting the low-level treble usually brings out a lot of details, like the breath of the woman singing.
Great reproduction of details is oftentimes accompanied by great dynamic range. By that I mean the ability to be articulate across the entire dynamic range. If you listen to metal, the details remain. If you play classical music, articulation will be great for all instruments, both those playing loud and those playing soft. But sounds will be reproduced at their correct sound level. There is a huge difference in that.
This is why, when people just throw out there that something is more detailed, that is not necessarily a good thing.
Clarity
This is a tricky term. A silly way to describe it would be that the less haze in the reproduction, the better the clarity. Like removing foam covering the speaker. Only it is bloody tricky to pinpoint in the reproduction, and when a lossy mp3 is clearer, things suddenly turn messy on me.
Clarity is the one thing that tricks me. My Note3 mobile phone sounds clear as a bell, but why?
Sound drawn closer to me, front-heavy that is, typically sounds cleaner.
Also, a simpler reproduction, highlighting the main traits of the instrument, oftentimes sounds clearer to me. It's like USM (unsharp mask) in photography, in which less sharp looks sharper.
It is unclear to me what makes up clarity, as judged by my senses.
In my experience, real clarity is best described as the combination of the other aspects, and the synergy of them: sound stage, imaging, separation, articulation, and details. When they mix, the perceived clarity is of a completely different nature. That is the best I can do.
Might sound silly, but I have learned not to trust my ears on "clarity". If I tune anything by my experience of that, I mess everything up.
To add to my confusion, digital noise masked my rig at one point. The sound stage was tilted far back. Something was clearly off. Removing a greater part of the digital noise moved everything way more up front. I had mistaken digital noise for sound stage and imaging. Embarrassing, to say the least.
Headphones
This is really easy. Headphones give you a sound stage as if you had the speakers right up to your ears, as you do. Left to right passes through your head.
Depth is relatively small, as the distance between the speakers is small, and the distance between you and the line running between the speakers is practically nil.
The HD800 has the elements at a distance from the ear, resulting in a slightly longer path between them and your ears. By default, that offers a wider sound stage. The axis between the cans no longer passes through the middle of your head, rather the front part of it. (So, I guess that means that I hear voices in my head then? Not sure if I like the sound of that.)
As for the rest of the terms, as I have described it here, headphones excel.
In fact, if anyone made user-specific vector correction to the sound, and placed the sounds individually on the fly, headphones are the only current device that may reproduce 3D in any direction. If the listener had a directional sensor on their head, sounds would stay fixed in place, like one meter in front, slightly to the left, even as the listener turns their head.
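As a toy illustration of that head-tracking idea (the flat 2D setup and all numbers here are my own assumptions, not how any actual product works): keep the source at a fixed spot in the room, and re-express its position in the listener's head frame every time the head turns.

```python
import math

def source_in_head_frame(source_xy, head_yaw_deg):
    """Rotate a world-fixed source position into the listener's head
    frame (x = right of nose, y = straight ahead), so the sound stays
    put in the room while the head turns. Positive yaw = turning right."""
    x, y = source_xy
    yaw = math.radians(head_yaw_deg)
    hx = x * math.cos(yaw) - y * math.sin(yaw)
    hy = x * math.sin(yaw) + y * math.cos(yaw)
    return hx, hy

# A source one meter ahead, slightly to the left (x = -0.2, y = 1.0).
src = (-0.2, 1.0)
print(source_in_head_frame(src, 0))   # head straight: source unchanged
print(source_in_head_frame(src, 90))  # head turned 90 deg right: source
                                      # is now a meter to your left
```

A renderer would then turn those head-frame coordinates into per-ear delays and levels, which is the part current stereo mixes never provide.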
The limited sound stage of current tech is due to how the music is recorded and reproduced. Left to right is oftentimes just a difference in sound level. For real acoustic recordings, you also get the time delay between the microphones, and environmental reflections. But that is not matched to the listener's ears.
The ear picks up at least the following:
- Level differences (as used in stereo recordings)
- Time differences between the sound hitting each ear (acoustic recordings)
- Directional movement of the source (as with an airplane moving toward or away from you: it sounds different)
Example: Most people can pinpoint a plane in the sky (well, they would point to where it was about three seconds ago, if the plane is 1000 meters away). At that distance, the sound level difference between the ears is tiny. For a point source in free field, doubling the distance halves the sound pressure, a drop of about 6 dB, so going from 500 m to 1000 m only loses about 6 dB, resulting in hardly any level difference between the ears. But the sound hits the ears at a slight time delta. That difference in time is instrumental in the human ability to pinpoint sounds.
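That time delta is tiny. A common textbook approximation for it is Woodworth's spherical-head formula; the head radius below is an assumed average, not a measurement:

```python
import math

# Woodworth's interaural time difference (ITD) approximation for a
# spherical head: ITD = (r / c) * (theta + sin(theta)),
# where theta is the azimuth of the source in radians.
HEAD_RADIUS = 0.0875    # meters, assumed average adult head
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def itd_seconds(azimuth_deg):
    """Time delta between the ears for a source at the given angle
    off straight ahead (0 = dead center, 90 = directly to one side)."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 90):
    print(f"{az:3d} deg -> {itd_seconds(az) * 1e6:.0f} microseconds")
```

At 90 degrees to the side this lands somewhere around 0.6 to 0.7 milliseconds, and the brain resolves differences far smaller than that.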
Even more impressive is that the speed of sound changes quite a bit with temperature, but that does not seem to affect our hearing much.
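For reference, the standard linear approximation for the speed of sound in dry air shows how big that swing actually is (this is the usual textbook fit, good enough for everyday temperatures):

```python
def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air, in m/s.
    Linear fit: 331.3 m/s at 0 degrees C, plus 0.606 m/s per degree."""
    return 331.3 + 0.606 * temp_c

# From a freezing winter day to a hot summer one:
for t in (-20, 0, 20, 35):
    print(f"{t:+3d} C -> {speed_of_sound(t):.1f} m/s")
```

That is roughly a 10 percent change from -20 C to +35 C, yet localization still works, presumably because both ears are affected equally.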
A plane moving toward you has a higher pitch, as its movement compresses the sound waves. You hear this in particular as the plane passes above you, as the pitch drops at that point. I lived near an airport, so this became second nature to me. The same applies to cars.
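That pitch change is the classic Doppler shift. A minimal sketch for a stationary listener and a source moving straight toward or away (the engine tone below is made up for illustration):

```python
def doppler_pitch(f_source, v_source, c=343.0):
    """Perceived frequency for a stationary listener.
    v_source > 0 means the source moves toward you, < 0 away from you.
    Formula: f' = f * c / (c - v_source)."""
    return f_source * c / (c - v_source)

engine = 200.0  # Hz, a hypothetical engine tone
print(doppler_pitch(engine, +70))  # approaching at ~250 km/h: pitch rises
print(doppler_pitch(engine, -70))  # receding: pitch drops
```

The moment of flyover is where the sign flips, which is exactly the sudden change you hear as the plane passes overhead.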
The headphone is the only current device that can possibly reproduce all of this, as it by design could reproduce the time delta. But the tech is not there yet.