Physics of soundstage perception?
Nov 27, 2009 at 12:03 AM Thread Starter Post #1 of 23

turnstyle (100+ Head-Fier, joined Oct 29, 2009; 122 posts, 10 likes)
Hi all, I gather 'soundstage' generally refers to how spacious a pair of phones sounds, yes?

What I don't quite understand is: how, physically, is it that two different sets of IEMs are perceived to have totally different soundstages? I.e., even comparing two high-end IEMs -- both are essentially small nozzles poked into your ear -- how is it that one may sound broad and spacious ("several rows back, facing the stage") and another might sound tight and in your head ("on the stage, rather than in the audience")?

It would be very interesting if anybody happens to understand this from a physical/perceptual/psychoacoustical basis...
 
Nov 27, 2009 at 12:40 AM Post #2 of 23
Quite simply, it's the fit.

Compared to loudspeakers and room acoustics, with IEMs we are talking about millimeters of spatial difference making a world of difference in how they sound.
 
Nov 27, 2009 at 1:37 AM Post #3 of 23
Quote:

Originally Posted by Spyro
Quite simply, it's the fit.

Compared to loudspeakers and room acoustics, with IEMs we are talking about millimeters of spatial difference making a world of difference in how they sound.



But if that's all there is to it, I would then think you could coax either a broad or a focused soundstage out of any given pair of IEMs, depending on how you position them (depth/angle) within your ear. Isn't there something else going on?

And if it really is just about how the nozzle is positioned, what sort of positioning leads to a broad vs. a focused soundstage?

Thanks...
 
Nov 27, 2009 at 1:43 AM Post #4 of 23
I'd have to ask you to give a more specific example. Generally speaking, Head-Fiers agree for the most part on which IEMs have a large or small soundstage.

Also, some people use the words soundstage and instrument separation interchangeably. My guess is this is what confuses people. The Westone UM3X has huge instrument separation but a small soundstage; the W3 has decent instrument separation but a much larger soundstage.
 
Nov 27, 2009 at 1:44 AM Post #5 of 23
The tubes are also distanced differently depending on the make and model. Drivers can sit further from the ear in the larger housings than in universals, for instance.
 
Nov 27, 2009 at 2:19 AM Post #6 of 23
I hope I don't seem like I'm arguing! Rather, I'm just hoping to better understand what set of physical and/or electronic characteristics result in one kind of soundstage vs. another. And so I'm not really asking about one set of IEMs in particular.

For example, if it mostly comes down to the positioning of the nozzle in the ear, and the positioning of the drivers within the IEM, then I'm just curious what sort of positioning generally results in what sort of soundstage.

Just as an example, what is it that gives the IE8 a big soundstage vs. the UM3X a tight soundstage?
 
Nov 27, 2009 at 2:31 AM Post #7 of 23
I don't have experience with the IE8s so I can't comment, but the ES3Xs have a large soundstage compared with the UM2s. Both are from the same company, though the customs have a different driver configuration, the tubes are longer, and the drivers appear to be more recessed in the housing. Any or all of this combined may give the sense of a larger soundstage.

I don't have much experience with customs, but it might be that all customs have larger soundstages than their universal counterparts precisely because the drivers are more recessed and the tubes are longer.
 
Nov 27, 2009 at 2:36 AM Post #8 of 23
To me, soundstage is closely related to sound signature and the time sound takes to reach our eardrums. We perceive 'positioning' and 'spaciousness' based on certain frequency bands. Most recordings were mixed for speaker systems, where the sound from the speakers mixes in the air, creating a virtual soundstage, before it reaches our ears. That in-air mixing changes the sound waves in ways we usually perceive as 'soundstage' or 'sound space'. But with IEMs there is no in-air mixing; the sound travels directly to our eardrums. IMHO, this is what makes IEMs sound more accurate, and more 'hi-fi', than speaker systems.

Our ear canal shape also alters the frequency response, enhancing certain ranges, typically around 5-7 kHz. That's why we may find a different sound signature with IEMs, headphones, and speakers when we listen to the same performance.
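The canal-resonance boost described above can be sketched with a standard peaking-EQ biquad. This is only a sketch, using the widely published RBJ "Audio EQ Cookbook" coefficient formulas; the 6.5 kHz center, 10 dB gain, and Q of 2 are illustrative numbers I chose, not measured ear data.

```python
import math

def peaking_eq(samples, fs=44100.0, f0=6500.0, gain_db=10.0, q=2.0):
    """Toy biquad peaking EQ (RBJ cookbook) boosting around 6.5 kHz,
    roughly where the ear canal's own resonance sits."""
    a_gain = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    # Feed-forward (b) and feedback (a) coefficients for a peak at f0.
    b0 = 1.0 + alpha * a_gain
    b1 = -2.0 * math.cos(w0)
    b2 = 1.0 - alpha * a_gain
    a0 = 1.0 + alpha / a_gain
    a1 = -2.0 * math.cos(w0)
    a2 = 1.0 - alpha / a_gain
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in samples:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x1, x2, y1, y2 = x, x1, y, y1
        out.append(y)
    return out
```

Frequencies far from the peak pass through roughly unchanged (the filter's gain at DC is exactly 1), while content near 6.5 kHz comes out boosted, which is the kind of coloration the ear canal itself imposes.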




Thank you.

References:
Head-related transfer function - Wikipedia
http://www.mbfys.ru.nl/~johnvo/papers/nn98.pdf
 
Nov 27, 2009 at 4:28 AM Post #9 of 23
Soundstage does not literally correlate to a physical difference, in terms of the size of the earphone or the length of the sound outlet. It's all about the perception of space. Welcome to the world of psychoacoustics.

People often say they want their sound pure as the driven snow, but most of us don't want pure data; we want an experience. We're willing to put up with a certain degree of distortion to feed a craving or two, because music is not entirely objective. It is a subjective experience. Like our storytelling, we're willing to bend things a bit to get what we want.

When you shove an earpiece into your ear canal, you have a speaker squawking. With two of them, you might have stereo, but it's not the equivalent of two loudspeakers. The sound is in your head. It's centered deep in your skull. That's not the same experience as a pair of loudspeakers simulating a live performance. You want to be fooled into believing that what you are hearing is "out there" rather than "in here."

To mimic that experience, and give yourself headroom, earphones have to do more than simply "play the recording" as if they were loudspeakers jammed into your ear. In fact, that's exactly what they're going to do - since they are just loudspeakers jammed into your ear. But - and this is a big but - that presentation can be manipulated to make it sound more like a set of loudspeakers across the room (or, hopefully, a live performance).

So, what creates the illusion of space? The recording of a guitar presumably takes place with the microphone at point-blank range. It's the point of view of the pickup or the microphone on a stand. This version of the sound is the aural equivalent of a 50-yard-line seat at a football game. If it's all you hear, your mind accepts this as being up close and personal. The result is the perception of a very narrow soundstage.

Unless you're right up there in row one, your perception of the same performance will probably be a mixture of sounds - the instruments themselves, the echo of those instruments, the crowd, et cetera. There's a parallax of sound which your mind is trying to sort out. But one of the ironies of creating a concert recording is the fact that you can't just put the mic in the back. Our minds have this amazing ability to filter things out so we can concentrate. On a certain level, you'd not think you were at a concert if you didn't hear the decay and distortion of those non-musical (or at least non-original) elements. On the other hand, if that stuff is too loud, it interferes with your ability to enjoy the music, itself.

From a psychoacoustic perspective, certain general ideas apply. Louder means closer. Quieter means farther away. This is the aural equivalent of bigger and smaller as indicia of distance. We tend to hear more bass up front. The lack of it makes us feel further away. Clarity tends to improve the closer we are to the stage. The further away we get, the more we notice echoes, reverberations and other forms of distortion that muddy up the presentation.
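The "louder means closer" cue has a simple free-field form. As a sketch, assuming an idealized point source with no room reflections (real rooms add reverberation, so the actual falloff is gentler):

```python
import math

def distance_attenuation_db(distance_m, reference_m=1.0):
    """Free-field inverse-square law: every doubling of distance
    drops the level by about 6 dB, one of the cues the brain reads
    as 'farther away'."""
    return -20.0 * math.log10(distance_m / reference_m)

# Moving from 1 m to 2 m costs about 6 dB; to 4 m, about 12 dB.
```

Recordings and EQ exploit this in reverse: nudging a track's level up a few dB is enough to pull its image toward the listener.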

So, how can an earphone's presentation affect the overall sense of space? The earphone will do what a loudspeaker does and "play the recording," but how it plays that recording will affect our sense of space. An emphasis on midrange, for example, will narrow the soundstage. How? Because most of what we hear is in the midrange. That's our survival frequency. The midrange contains most of the critical data we need to make things out. Now, the good thing is that midrange is lush with detail. Without good midrange, you won't feel involved in the recording. That detail is what gives you a sense that you're "there." It's the sonic equivalent of a zoom lens. It's the scraping of the pick on an electric guitar. It's the squeak of fingers moving over frets. It's the wooden percussiveness of a keyboard, the sharp rattle of a tightly bound snare drum. It's texture. The further you are from the action, the less texture you get. This stuff either disappears or gets absorbed in sonic mud. So, to be "there," you have to have enough midrange. It's the difference between feeling like you're in front of the band and feeling like you're coming out of the concourse tunnel with a drink in your hand.

Part of that midrange experience is the speed of the delivery. You need a fast and responsive driver in order to maintain instrument separation. A recording is a criss-crossing of patterns. The more responsive the driver, the more you can make out the fine details between instruments. Less gets absorbed into sonic mud.

But while you need good midrange, you also need the right balance of HF and LF. While these have less to do with the critical information in a recording (you don't need a good woofer or tweeter to recognize a song off a cheap radio), they add presence. Without good bass, the recording doesn't feel "real." In a real concert setting, there are LF sounds that are deep and powerful. Without deep and convincing bass, the presentation shrinks. On the other hand, because LF waves are much longer than HF waves, they tend to crowd the soundstage. Grotesquely elevated bass makes you feel like you're standing next to the bass amp. It buries everything in sonic mud. Bass works better where it's not overwhelming and where it's cleaner. In loudspeaker design, bass reflex systems (the ones with vents) have higher output, but acoustic suspension (where the enclosure is sealed) produces tighter, cleaner bass. Tighter bass lets you enjoy the bass without having it ruin the presentation. You hear more detail while having more space for other kinds of sounds.

It also helps to have the right kind of bass. The closer you get to the midrange (mid-bass, upper-bass) the greater the danger of crowding the soundstage. Some earphones are marketed as having "great bass" but they're really pushing mid- and upper-bass. It's often punchy, hollow, cardboard-sounding bass - and it sucks. You can get away with enhancing the bass, because it's a craving and bass is harder to reproduce (it takes more energy) - but a headphone that produces "boomy" bass will collapse the soundstage.

A critical part of creating that larger soundstage involves HF. Hard surfaces reflect sound, and while bass may walk through walls, HF bounces more readily. HF waves are shorter. An earphone with muted HF will lose soundstage. More HF (and further from the midrange) will "open things up." It will create an airy atmosphere. Too much of it will make the listener feel too far removed from the action, but there should be enough of it to mimic the hard-surface echo of a large room. There are DSP settings that do nothing more than boost HF a little and add a slight time delay to fool the brain into feeling that it is in a larger space. A headphone won't provide DSP effects, but it should have enough HF presence to feel "open" and "wide." The other important ingredient is the speed of the driver. HF signals are short and fast, so the faster the driver, the better the detail.
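The delay trick mentioned above can be sketched in a few lines. This is a toy version under my own assumptions: one quiet, delayed copy stands in for an early wall reflection, and the 15 ms delay and 0.25 level are made-up illustrative values (a fuller effect would also high-pass the reflection, since hard surfaces bounce HF more readily).

```python
def widen(samples, sample_rate=44100, delay_ms=15.0, level=0.25):
    """Toy 'bigger room' DSP: mix in one quiet, slightly delayed
    copy of the signal, standing in for an early reflection."""
    delay = int(sample_rate * delay_ms / 1000.0)
    out = []
    for i, s in enumerate(samples):
        # Before the delay line fills, there is no reflection yet.
        echo = samples[i - delay] if i >= delay else 0.0
        out.append(s + level * echo)
    return out
```

The brain hears the quiet, delayed copy as the room answering back, which is exactly the cue those spaciousness DSP presets lean on.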

In the end, soundstage is created by finding the right signature with the right balance between midrange (for clarity), HF (for hard-surface reflection) and LF (for presence). If it's too warm, soundstage collapses. If it's too sweet, the listener feels too far away. If there's too much emphasis on midrange, the listener feels thrust to the front and the soundstage narrows again. We say we want flat response, but we lie. In my opinion, what we're looking for is some version of an EQ smile. We want a little extra bass. We want a little extra sizzle at the high end. Upper bass crowds. Lower treble distances. Too much mid reminds us that we have tiny little speakers jammed in our ears. Beyond finding the right sound signature, we also need fast drivers with the right rate of decay. Slower drivers are sloppy drivers, which is why the bass needs to be tight, not reflexive (choose better bass over louder bass).
 
Nov 27, 2009 at 5:34 AM Post #12 of 23
Bilavideo: Very good descriptions.

So ask ourselves: do speaker systems provide an accurate presentation of recordings? IMHO, there are many variables and parameters that can change the sound signature before it hits our eardrums.


But we can still get good imaging with headphones, earphones, or IEMs. Try listening to binaural or dummy-head recordings. Which gives the most faithful reproduction of the recording, speakers or head-fi gear? If a performance was recorded live in a studio with a good mic arrangement and mixing, then an accurate hi-fi system should make it sound like being in the studio. But yes, we do have our own preferences, and most of the time what we enjoy does not actually need to be 'precise' or 'accurate'. I like precise sound reproduction; my IEMs' soundstage could be described as small, close, or forward, but I enjoy them.

Finally, just enjoy your audio system and let your ears be the judge. But learning the science of audio also helps us understand what happens behind the scenes, and ourselves too. Also, do try to listen to other audio systems and gear as much as possible. It can give you ideas and guidance about what you are looking for.

Thank you.
 
Nov 27, 2009 at 1:31 PM Post #13 of 23
Brilliant post, thanks.

So this suggests a flatter response with fast mids would generally provide more detail and a smaller soundstage -- whereas bringing up the EQ 'smile' can expand the perceived soundstage, at the expense of a bit of perceived detail -- is that about right?

For those of you with experience across several IEMs, would you say that there is generally a trade-off between fast, detailed mids and an expansive-feeling soundstage?

Neat info.
 
Nov 27, 2009 at 5:09 PM Post #14 of 23
Quote:

Originally Posted by archimeaties
Wow!! Bravo Bilavideo, that is totally worth a sticky!

You've also got me thinking about the driver speeds of IEMs. Is that information a standard spec? I don't remember seeing it anywhere...



I don't think it is. I can't say I've ever tripped over it. In the loudspeaker world, where designing your own system is much more common, there's a lot more information available to the consumer-as-builder. That's where I first heard about the speed issue. It comes up a lot in comparing speaker materials. Thirty years ago, a tweeter was just a miniature version of a woofer. Both were paper cones. But stiffened paper has a relatively slow decay rate (unless you're Bose and you want to charge people three grand for a system full of paper mid/tweets). Paper tweeters and midrange speakers still exist, but they're at the bottom of a list of alternative materials. Titanium is faster but it has to be damped to avoid ringing. Cloth/textile tweeters (sometimes called silk tweeters) are preferred because they're light and fast, and they have a good decay rate, so they're smoother. Even then, there are tradeoffs between smoothness (a fast decay rate) and sheer speed, so a more expensive option is the aluminum tweeter, sometimes called the super tweeter because it's used to generate hyper-audible frequencies of up to 40 kHz for HF "sparkle and presence."

But this issue isn't limited to tweeters. If you go onto the forums at places like Parts Express, you'll frequently catch builders discussing the relative merits of larger and smaller woofers. The size of the cone is not the only factor determining the quality of a woofer (the type of material used and the size of the magnet are other key factors). Still, all things being equal, a larger cone can drive deeper than a smaller cone. On the other hand, when it comes to mid- and upper-bass, a lot of designers prefer smaller woofers or mid-woofers. Why? They say the smaller cones are "faster." Just as there are woofers made of paper, reinforced (doped) paper, Kevlar and plastic (used a lot with subwoofers), the size of the cone has an impact on its speed. For upper bass, which involves shorter waves, many builders prefer a smaller woofer (say, 5 1/2 to 6 1/2 inches) to a 15" monster. They swear by it. For things like snare drums and light tom-toms, as well as low registers on the piano and upper registers on the cello, they prefer the smaller, more responsive woofer to the bigger, deeper-digging death star.

Now, I bring all this up because speed and decay rate are talked about all the time in the world of loudspeakers. They are considered major factors in capturing a clean, detailed, sound. If you want good instrument separation, if you want to feel like you could walk between the instruments, you want a driver that's fast enough to change on a dime. You also want a driver with a short decay rate, one that doesn't linger and ring past where it was supposed to stop. The better the speed and decay rate, the more defined your music will be. The slower the two, the muddier it will be. It's the sonic equivalent of screen resolution.
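The decay-rate idea can be put in rough numbers. As a sketch, using the textbook envelope for a second-order resonance (nothing here is a measured driver spec; the inputs are illustrative):

```python
import math

def ring_down_60db_ms(f0_hz, q_factor):
    """Rough time for a resonance's ringing to fall by 60 dB.
    A second-order resonance at f0 with quality factor Q decays
    with amplitude envelope exp(-pi * f0 * t / Q), i.e. time
    constant tau = Q / (pi * f0); a 60 dB drop takes about 6.9
    time constants. Higher Q (slower decay) = more audible ringing."""
    tau = q_factor / (math.pi * f0_hz)
    return 6.9 * tau * 1000.0
```

Two things fall out of the formula: high-Q (underdamped) drivers ring longer at the same frequency, and for the same Q a bass resonance rings far longer than a treble one, which is one reason sloppy decay is most audible as muddy bass.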

Now, which drivers excel at speed and decay rate? I don't know about individual drivers, at least not yet. What I do know is this: balanced armature (BA) drivers are faster. These tiny hearing-aid drivers rely on a miniature armature (reed) that is lighter and faster than a dynamic driver's diaphragm and voice coil. The dynamics have their own advantages, especially in producing more dynamic bass, but the BAs are faster and clearer.
 
Nov 27, 2009 at 8:06 PM Post #15 of 23
Quote:

Originally Posted by bakhtiar
So ask ourselves: do speaker systems provide an accurate presentation of recordings? IMHO, there are many variables and parameters that can change the sound signature before it hits our eardrums. But we can still get good imaging with headphones, earphones, or IEMs. Try listening to binaural or dummy-head recordings. Which gives the most faithful reproduction of the recording, speakers or head-fi gear?



Here is the paradox.

We all want an accurate, "precise," presentation of the original recording - or at least that's what we think we want. If given the specific option of a "distorted" or "inaccurate" presentation, we would certainly turn it down. The most obvious problem with "bad" presentations - tinny sound, boomy bass, sonic mud, defective imaging - is the distortion, itself. When you can hear a presentation and tell, right away, that it has been colored, the results are usually unsatisfactory.

The problem, in seeking a totally "objective" presentation, is that sound is not objective to begin with. The same concert sounds different depending on whether you're in the front row, a few rows back or hearing it all from the very back of the room. Some of us like it up front, with a relatively narrow soundstage but where you can hear the instrumentation with that point-blank clarity that puts you practically onstage. Some of us want a little more distance, a little wider soundstage, where we can take in not just the sound of the instruments, but the unofficial instrumentation of the room, hall or arena. For these folks, the medium in which the concert takes place is part of the concert, too. When they hear the recording, they want to hear this part as well. They want the room (or their head) to be transformed into this place. For still others, the perfect spot is the back of the room, where the sound of the performers and the concert hall are most equally balanced or blended. It's like having the whole concert playing inside a snow globe. This preference reminds me of those East Asian paintings where no attempt is made to focus on any individual. If people look like they're part of the landscape, that's because they are.

I must confess that my musical preferences are not so Buddhist. When I turn on a recording, I don't want to be one with the universe. I don't want to be analytical and detached. I want to get caught up in the disharmony of the performers as they stir things up. I want the excitement. Sometimes, I want to be up on stage, or in the front row, narrowly focused on the ecstasy of the performance. Sometimes, I want a wider soundstage, a few rows back, where I can get a little more perspective. I suspect that most listeners fall somewhere between the two. They aren't as interested in the secondary sounds of the concert hall as they are in the performers, themselves. But between the front-row listeners and the mid-hall listeners, there's a running debate about how much of the concert-hall acoustics should play a part in what constitutes the performance. The front-row listeners care little or nothing about the concert hall. The mid-hall listeners want at least some of those artifacts because it's part of the overall experience.

In any performance, where you stand changes your perspective. By the same token, where you put the mic changes the nature of the recording. In studio recordings, it's customary to put the mic up close and record each element as if it were in your face. The sound engineer only wants the business end of the speaker, the trumpet, the drums, et cetera. Room ambience is unwanted. When separate elements are being recorded, the engineer doesn't want the combined effect of all these ambient sounds. In studio recordings, as in movie sound, ambient sound may be simulated as a separate recording mixed in for effect. This gives studio recordings better detail (since all you hear are the instruments) but a deader feel than a live performance. When you're playing a studio recording, you'd like to imagine that you have the band playing live in your living room, but what you're really playing is a blend of individual tracks designed to sound like a studio session.

Just as a film director uses lighting, focus and camera placement to manipulate your visual focus, the sound engineer uses sound levels and "imaging" to draw your focus to different aspects of the performance. Balance, for example, doesn't just determine the sound levels of two speakers in stereo. It determines the placement or "imaging" of the instruments as their ghosts play before you in the room. If the piano is on the left, it should sound as if it's coming from the left. If the lead singer is centered on stage, he or she should "appear" in the sonic middle of the performance. If the drums or bass or lead guitar are off to the right, they should "appear" off to the right. In a stereo recording, "imaging" is created by manipulating balance and volume. What's louder is closer. Its prominence puts it into the foreground. The sonic balance between left and right speakers can literally cause a sound to ping back and forth across the room. If we gave it a numeric indicator, with five being highest and zero being lowest, and with the numbers arranged so that the number on the left corresponded to the left speaker while the number on the right corresponded to the right speaker, we could describe imaging as follows:

5-5 would be front and center
5-0 would be front and exclusively to the left
0-5 would be front and exclusively to the right
3-0 would be back and to the left
0-3 would be back and to the right
4-3 would be a little to the left, and a little less out in front
3-4 would be a little to the right, and a little less out in front

By manipulating all of the sound levels of the separate instruments or voices in the recording, the studio engineer can create "imaging." When you close your eyes, you'll be able to say (amazingly), "The piano is here. The lead singer is here. The drums are over there." With imaging, little changes in balance and volume go a long way, that is, as long as your speakers are placed far enough apart. In the old stereo recordings from the late 60s, the left-right dynamics were insane. That's because most people had a portable turntable with left and right speakers sitting next to each other. The left-right arrangement was used to produce surprising pops like the highly-dramatic sequences in The Who's "Pinball Wizard," which starts out with rhythm guitar on the right, then an electric guitar power chord screaming from the left.
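The balance-and-volume imaging described above can be sketched as a toy pan law. This uses the common constant-power (sin/cos) convention from mixing desks; it's a generic illustration, not anything specific to the recordings discussed.

```python
import math

def pan(sample, position):
    """Place a mono sample in the stereo field. position runs from
    -1.0 (hard left) through 0.0 (center) to +1.0 (hard right).
    The sin/cos law keeps perceived loudness roughly even as a
    source moves across the stage."""
    angle = (position + 1.0) * math.pi / 4.0  # 0 .. pi/2
    left = sample * math.cos(angle)
    right = sample * math.sin(angle)
    return left, right
```

A source panned hard left sends essentially nothing to the right channel, while a centered source splits evenly between both; sweeping `position` over time is what makes an image "ping" across the room.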

These old recordings sound dynamic but weird when played on today's loudspeaker systems. The imaging is exaggerated, and the tracks are basically separate mono recordings pinging out of different speakers. Some of them have been remastered to adjust for a different way of listening to recorded music. Still, they're not as troublesome on loudspeaker systems as they are on headphones, because loudspeakers have a natural cross-channel mixing, or crossfeed (sometimes called crosstalk). In a room with loudspeakers, the left and right sounds naturally mix, since sounds are transmitted through waves which emanate in every direction (even behind the speaker). In a headphone system, you don't get that natural crossfeed, so what's recorded separately remains separate. Your brain notices this, which is one way it distinguishes between loudspeaker sound and headphone sound. Some headphone amps compensate for this with a crossfeed switch which mixes the channels just enough to mimic the natural crossfeed of a loudspeaker system.
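A naive crossfeed mixer can be sketched like this. Note the hedge: real crossfeed circuits (the Linkwitz design, for one) also delay and low-pass the bled signal, since sound reaching the far ear arrives later and duller; this sketch only mixes levels, and the 0.3 amount is an arbitrary illustrative value.

```python
def crossfeed(left, right, amount=0.3):
    """Bleed a fraction of each channel into the other, mimicking
    the acoustic crosstalk of loudspeakers in a room (levels only;
    no delay or filtering)."""
    out_left = [l + amount * r for l, r in zip(left, right)]
    out_right = [r + amount * l for l, r in zip(left, right)]
    return out_left, out_right
```

Applied to one of those hard-panned 60s mixes, a signal that existed only in the left channel now also appears quietly in the right, which is enough to stop the brain flagging the presentation as "headphones."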

Because of their open-air design, Grado headphones fool the brain into hearing a phantom crossfeed, which one reviewer erroneously attributed to the left ear hearing sound leakage from the right speaker cup. In fact, what produces this phantom crosstalk on the Grados is the ambient sound of the room, which the brain then has to process. Ironically, the mixture of ambient noise with the recorded sound gives the brain the impression that the sound is coming from "out there." I can't count the number of times I've been wearing my Grados and caught myself fooled into thinking I was hearing all this music from the room I was in. Intellectually, I knew such was obviously not the case, but when I wasn't thinking about it, my mind would drift back to accepting the illusion.

When it comes to soundstage, I suspect the question we are all asking ourselves is how we can get enough of the detail to feel "sucked into" the performance while, at the same time, being able to accept this performance as if it were happening live, in front of us. The moment you give serious thought to what you're doing, the illusion vanishes. Absent a mental illness, you know who you are and where you are; you know you're listening to headphones and that the sound pumping into you is from tiny speakers plugged into your ears. But when you let your mind drift, you want to be fooled into "being there" regardless of what you may be doing. If I'm working outside, or taking a walk, or taking a drive, I know I'm not in a concert hall but I still want to be swept away. I want to feel as if I'm hearing that concert, enough to forget where I am and what I'm doing.

Soundstage is important to the illusion. I know that if I were at a concert, I would not just hear the instruments but hear little artifacts of the arena. On the other hand, unless you're listening to a recording of a live concert, there is no arena. You're hearing a studio recording, which is a collection of individual tracks, recorded at point-blank range, and then edited together to produce composite "imaging." It's an illusion at best. What's more, the raw data - just a collection of vibrations - can fall far short of that illusion. Your phenomenal brain, with its ability to analyze its environment, can tell the difference between a live recording and a bunch of tracks slapped together in a studio. This is where a "precise" recording can end up feeling dead.

But how do you make a studio recording sound live? You can't change what's in the recording but there are little tweaks that can give the illusion a nudge. Too much distortion draws attention to itself and ruins the mix. The distortions I speak of are more subtle, designed to operate at or below the threshold of conscious awareness.

First, you need midrange. Midrange contains most of the critical detail - drum snares, pick scrapes, vocal timbre, piano chords, etc. We live in a world of mostly midrange. If you don't have enough midrange, you're not really "there." The best guitar riffs are in the midrange. Most vocals are there as well. It's the detail we can most easily put our fingers on. It's also the Jan Brady of many sound systems, overlooked when builders are focusing on fancy tweeters and subwoofers. Much of what's in Bose's overpriced room systems comes down to an over-abundance of midrange speakers designed to grab the heart of the sound. On the other hand, if you have too much midrange, the soundstage will collapse. You'll only hear the instruments themselves. In fact, you'll only hear part of the instrumentation.

It's an interesting balancing act. The SE530 probably has the lushest midrange, which is quite forward. The UM3X is more recessed, to make way for the LF and HF, but still very "natural" and "present." The Westone 3 seems to ratchet up that EQ smile with a heavier emphasis on the midbass and treble. I have a pair of JVC Marshmallows which are boomy but which recess the mids so much I have to EQ them to open them up. The Etymotic ER4 is known for its crystal-clear mids and treble, even if the bass is anemic, though I think the treble overshadows the mids. The TF10 also recesses the mids in favor of more bass and treble.

Part of giving the presentation a "live" sound is to have ample bass but quality is as important as quantity. The same jiggling that produces a front wave produces a back wave of equal amplitude. The problem is that the two waves cancel each other out, which is why speakers have a baffle, to separate front waves from back waves. Somebody figured out you can recycle the back wave, just as long as you separate it from the front wave, which gave birth to the "bass reflex" systems that dominate the loudspeaker landscape. Venting lets a speaker produce as much as twice the amount of bass you'd get from a sealed-off acoustic suspension system. The problem is, the bass you end up with is muddier since it represents an echoing. This thicker bass is fine for the "bass is bass" crowd, but it has a tendency to crowd the mix. You don't want bass that carpet bombs the presentation. You want bass that adds flavor, like items in a buffet.

This is especially important because bass isn't just that throbbing bong-bong-bong you hear out of the backs of lowriders. It's the shadow to the light. It's a fundamental part of the low register of many instruments. Piano has bass. Drums have bass. Even the human voice has bass. It's like a primary color, which can neither be sucked out of the picture nor left to dominate it. There's a bass fetish enjoyed by total bassheads, which is its own animal, but most of us crave our share of bass. When we crank up the bass, what we're trying to do is compensate for the lack of it in cheap, all-purpose, wide-range speakers. Cheap stereos force the consumer to crank up both the bass and the treble because the speakers do a poor job of communicating either. In a sonic environment where you have the power to get what you want, cranking up the bass too much will bury everything else (these are big waves). A live performance has ample bass (which distinguishes it from listening to a radio), but a little of the real thing goes a long way.

If you want to feel like you're up in front, crank up more bass and the soundstage will collapse. If you want to feel a bit further back, relax the bass till you've found your sweet spot.

Last but not least, treble adds to the sense of soundstage. When midrange crowds out the treble, the soundstage narrows. One way to "open it up" is to increase the level of treble in the mix. This matters because HF waves are short: they bounce readily off of hard surfaces. A lot of HF reflections (for lack of a better term), arriving from different directions, can fool the brain into feeling extra space. Because of their short wavelengths and rapid cycles, HF vibrations are the most ethereal. When Grado wanted to "open up" its midrange-heavy RS-1, it created a G-cush pad that looked like a salad bowl. The pad (which Grado marketed as "a concert hall for your ears") simply distanced the ear from the driver, which "opened up" the soundstage by allowing better HF dispersion. As a purchaser of a GS-1000, my first (and nagging) complaint was the sibilance (which is nothing more than too much HF). I eventually fixed the problem by taking a pair of scissors and cutting back the cushion until the proper balance was restored.
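A quick aside on the physics: all sound travels through air at the same speed (roughly 343 m/s at room temperature), so treble isn't "faster" than bass; what distinguishes it is wavelength. A short sketch of the standard formula λ = c / f makes the point (the three example frequencies are my own choices, not from the post):

```python
# Wavelength of sound in air: lambda = c / f.
C = 343.0  # approximate speed of sound in air at 20 C, in m/s

for freq in (50, 1_000, 10_000):        # bass, midrange, treble
    wavelength_cm = C / freq * 100
    print(f"{freq:>6} Hz -> {wavelength_cm:7.1f} cm")
```

A 10 kHz tone has a wavelength of about 3.4 cm, so it reflects cleanly off small hard surfaces (and off the bowl of a Grado pad), while a 50 Hz wave is nearly 7 m long and simply flows around them. That's the sense in which lots of HF reflections can suggest extra space.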

Too much HF draws attention to itself and takes you away from the "presence" of the music. It can be like moving to the very rear of the concert hall. But a certain amount of HF is critical to feeling that sense of space, especially the higher reaches of HF. Lower HF is just an extension of the midrange. You need it but it doesn't contribute nearly as much to the sense of space as the more extreme HF. Because they use drivers originally intended for hearing aids, single-driver BA designs tend to capture more midrange than treble. The logic of a hearing aid is sometimes opposite to that of an IEM. Someone relying on a hearing aid to make out human speech has less interest in "soundstage," especially if we're talking about ambient sounds that draw focus away from human speech. The "narrow soundstage" that listeners complain of is the "better focus" that sells hearing aids.

Makers of BA drivers, like Knowles and Sonion, have adapted to the audio market by designing "tweeters" with better HF extension. The whole point of the triple-driver design is to be able to set the right levels for each aspect of the mix - bass, mids and treble. The blend created by a given sound signature will produce different types of "soundstage." The idea that "widest is best" is as extreme as shoving the listener to the front of the stage. Some want that "on the stage" feel, others want the sound as they'd get it in the highest rows of the "nosebleed" section, but most listeners probably fall somewhere between rows 1 and 20. Most of us want to be "there" but we also want some perspective. What's more, a little "perspective" helps facilitate the illusion that the band is performing a few meters away, not from some crevice inside our skull.

One of the things to keep in mind is that few of us listen to the same kind of music all the time. What may sound perfectly "neutral" on one recording may sound a bit "flat" on another. What may sound perfectly "kicking" on one track may sound ridiculously "overcooked" on another. Unless you leave the control to your amp, and your earphones are left as neutral as possible, there are going to be mismatches. Most commercial monitors are designed with tweaks that produce a certain sonic "flavor." The ER4 is all about producing a crisp, clean sound, even if it sacrifices some bass. The Westones are warm, with the UM3X going for a more neutral sound while the Westone 3 goes for kick. The SE530 is heavy on the mids. The TF10 tends to recess the mids in order to highlight the high and low ends. The PFE seeks a sonic balance, but it comes in two filter styles - light and dark. As a knockaround pair, I have some Marshmallows, which are so dark I have to use the Treble Boost setting on my iPod to open them up.

The eartip also makes a difference. Silicone tips deliver better HF. The foamies deliver better bass. Usually, you get the one without the other. Customs get the best of both worlds: a tighter fit (giving better bass) without a natural damper like foam, so the HF is less veiled.