Quote:
Originally Posted by bakhtiar
So ask ourselves, do speaker systems provide an accurate presentation of recordings? IMHO, there are many variables/parameters which can change the sound signature before it hits our eardrums. But we can still get good imaging with headphones, earphones or IEMs. Try listening to binaural or dummy-head recordings. Which one is the more real reproduction of the recordings, speakers or head-fis?
Here is the paradox.
We all want an accurate, "precise," presentation of the original recording - or at least that's what we think we want. If given the specific option of a "distorted" or "inaccurate" presentation, we would certainly turn it down. The most obvious problem with "bad" presentations - tinny sound, boomy bass, sonic mud, defective imaging - is the distortion, itself. When you can hear a presentation and tell, right away, that it has been colored, the results are usually unsatisfactory.
The problem, in seeking a totally "objective" presentation, is that sound is not objective to begin with. The same concert sounds different depending on whether you're in the front row, a few rows back or hearing it all from the very back of the room. Some of us like it up front, with a relatively narrow soundstage but where you can hear the instrumentation with that point-blank clarity that puts you practically onstage. Some of us want a little more distance, a little wider soundstage, where we can take in not just the sound of the instruments, but the unofficial instrumentation of the room, hall or arena. For these folks, the medium in which the concert takes place is part of the concert, too. When they hear the recording, they want to hear this part as well. They want the room (or their head) to be transformed into this place. For still others, the perfect spot is the back of the room, where the sound of the performers and the concert hall are most equally balanced or blended. It's like having the whole concert playing inside a snow globe. This preference reminds me of those East Asian paintings where no attempt is made to focus on any individual. If people look like they're part of the landscape, that's because they are.
I must confess that my musical preferences are not so Buddhist. When I turn on a recording, I don't want to be one with the universe. I don't want to be analytical and detached. I want to get caught up in the disharmony of the performers as they stir things up. I want the excitement. Sometimes, I want to be up on stage, or in the front row, narrowly focused on the ecstasy of the performance. Sometimes, I want a wider soundstage, a few rows back, where I can get a little more perspective. I suspect that most listeners fall somewhere between the two. They aren't as interested in the secondary sounds of the concert hall as they are in the performers, themselves. But between the front-row listeners and the mid-hall listeners, there's a running debate about how much of the concert-hall acoustics should play a part in what constitutes the performance. The front-row listeners care little or nothing about the concert hall. The mid-hall listeners want at least some of those artifacts because it's part of the overall experience.
In any performance, where you stand changes your perspective. By the same token, where you put the mic changes the nature of the recording. In studio recordings, it's customary to put the mic up close and record each element as if it were in your face. The sound engineer only wants the business end of the speaker, the trumpet, the drums, et cetera. Room ambience is unwanted. When separate elements are being recorded, the engineer doesn't want the combined effect of all these ambient sounds. In studio recordings, as in movie sound, ambient sound may be simulated as a separate recording mixed in for effect. This gives studio recordings better detail (since all you hear are the instruments) but a deader feel than a live performance. When you're playing a studio recording, you'd like to imagine that you have the band playing live in your living room, but what you're really playing is a blend of individual tracks designed to sound like a studio session.
Just as a film director uses lighting, focus and camera placement to manipulate your visual focus, the sound engineer uses sound levels and "imaging" to draw your focus to different aspects of the performance. Balance, for example, doesn't just determine the sound levels of two speakers in stereo. It determines the placement or "imaging" of the instruments as their ghosts play before you in the room. If the piano is on the left, it should sound as if it's coming from the left. If the lead singer is centered on stage, he or she should "appear" in the sonic middle of the performance. If the drums or bass or lead guitar are off to the right, they should "appear" off to the right. In a stereo recording, "imaging" is created by manipulating balance and volume. What's louder is closer. Its prominence puts it into the foreground. The sonic balance between left and right speakers can literally cause a sound to ping back and forth across the room. If we gave it a numeric indicator, with five being highest and zero being lowest, and with the numbers arranged so that the number on the left corresponded to the left speaker while the number on the right corresponded to the right speaker, we could describe imaging as follows:
5-5 would be front and center
5-0 would be front and exclusively to the left
0-5 would be front and exclusively to the right
3-0 would be back and to the left
0-3 would be back and to the right
4-3 would be a little to the left, and a little less out in front
3-4 would be a little to the right, and a little less out in front
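To make the arithmetic behind that notation concrete, here's a toy sketch (Python is my choice, not the post's, and the function name and the pan/depth mapping are my own illustration) of how a left/right level pair on that 0-5 scale translates into an apparent image position:

```python
# Illustrative only: maps the post's (left, right) level notation
# to a rough stereo image. Not any real mixer's algorithm.

def image_position(left_level, right_level, max_level=5):
    """Map a (left, right) level pair to a rough stereo image.

    Returns (pan, depth): pan runs from -1.0 (hard left) to +1.0
    (hard right); depth runs from 0.0 (far back) to 1.0 (up front).
    Louder overall means closer; the left/right ratio sets the
    lateral placement.
    """
    total = left_level + right_level
    if total == 0:
        return 0.0, 0.0          # silence: no image at all
    pan = (right_level - left_level) / total
    depth = max(left_level, right_level) / max_level
    return pan, depth

# The post's examples, re-expressed:
print(image_position(5, 5))   # (0.0, 1.0)  front and center
print(image_position(5, 0))   # (-1.0, 1.0) front, hard left
print(image_position(3, 0))   # (-1.0, 0.6) back and to the left
print(image_position(4, 3))   # slightly left, a bit less forward
```

The point of the sketch is the post's rule of thumb: the ratio between channels places the sound left or right, while the overall level pushes it forward or back.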
By manipulating all of the sound levels of the separate instruments or voices in the recording, the studio engineer can create "imaging." When you close your eyes, you'll be able to say (amazingly), "The piano is here. The lead singer is here. The drums are over there." With imaging, little changes in balance and volume go a long way, that is, as long as your speakers are placed far enough apart. In the old stereo recordings from the late 60s, the left-right dynamics were insane. That's because most people had a portable turntable with left and right speakers sitting next to each other. The left-right arrangement was used to produce surprising pops like the highly dramatic sequences in The Who's "Pinball Wizard," which starts out with rhythm guitar on the right, then an electric guitar power chord screaming from the left.
These old recordings sound dynamic but weird when played on today's loudspeaker systems. The imaging is exaggerated, and the tracks are basically separate mono recordings pinging out of different speakers. Some of them have been remastered to adjust for a different way of listening to recorded music. Still, they're not as troublesome on loudspeaker systems as they are on headphones because loudspeakers have a natural cross-channel mixing, or crossfeed (sometimes called crosstalk). In a room with loudspeakers, the left and right sounds naturally mix since sounds are transmitted through waves which emanate in every direction (even behind the speaker). In a headphone system, you don't get that natural crossfeed, so what's recorded separately remains separate. Your brain notices this, which is one way it distinguishes between loudspeaker sound and headphone sound. Some headphone amps compensate for this with a crossfeed switch which mixes the channels just enough to mimic the natural crossfeed of a loudspeaker system.
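To illustrate the idea (this is my own minimal sketch, not any particular amp's circuit; real crossfeed circuits also low-pass filter and slightly delay the bled signal, since the far ear hears a duller, later copy), the core of a crossfeed mix is just feeding a fraction of each channel into the opposite one:

```python
# Illustrative crossfeed mixer in pure Python. Only shows the level
# mixing described in the post; real designs add filtering and delay.

def crossfeed(left, right, amount=0.3):
    """Blend `amount` of each channel into the opposite one.

    `left` and `right` are equal-length lists of samples. Output is
    scaled by 1/(1+amount) so the overall level stays roughly the same.
    """
    scale = 1.0 / (1.0 + amount)
    out_left = [(l + amount * r) * scale for l, r in zip(left, right)]
    out_right = [(r + amount * l) * scale for l, r in zip(left, right)]
    return out_left, out_right

# A hard-panned '60s-style mix: guitar only in the left channel.
left = [1.0, 0.5, -0.5]
right = [0.0, 0.0, 0.0]
l, r = crossfeed(left, right)
# After crossfeed, the right channel carries a quiet copy of the
# guitar, the way your right ear would hear a left loudspeaker.
```

With `amount=0.0` the channels pass through untouched, which is what a headphone presents by default; raising it pulls those hard-panned old recordings back toward how they'd sound in a room.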
Because of their open-air design, Grado headphones fool the brain into hearing a phantom crossfeed which one reviewer erroneously attributed to the left ear hearing sound leakage from the right speaker cup. In fact, what produces this phantom crosstalk on the Grados is the ambient sound of the room, which the brain then has to process. Ironically, the mixture of ambient noise with the recorded sound gives the brain the impression that the sound is coming from "out there." I can't count the number of times I've been wearing my Grados and caught myself fooled into thinking I was hearing all this music from the room I was in. Intellectually, I knew such was obviously not the case, but when I wasn't thinking about it, my mind would drift back to accepting the illusion.
When it comes to soundstage, I suspect the question we are all asking ourselves is how we can get enough of the detail to feel "sucked into" the performance while, at the same time, being able to accept this performance as if it were happening live, in front of us. The moment you give serious thought to what you're doing, the illusion vanishes. Absent a mental illness, you know who you are and where you are; you know you're listening to headphones and that the sound pumping into you is from tiny speakers plugged into your ears. But when you let your mind drift, you want to be fooled into "being there" regardless of what you may be doing. If I'm working outside, or taking a walk, or taking a drive, I know I'm not in a concert hall but I still want to be swept away. I want to feel as if I'm hearing that concert, enough to forget where I am and what I'm doing.
Soundstage is important to the illusion. I know that if I were at a concert, I would not just hear the instruments but hear little artifacts of the arena. On the other hand, unless you're listening to a recording of a live concert, there is no arena. You're hearing a studio recording, which is a collection of individual tracks, recorded at point-blank range, and then edited together to produce composite "imaging." It's an illusion at best. What's more, the raw data - just a collection of vibrations - can fall far short of that illusion. Your phenomenal brain, with its ability to analyze its environment, can tell the difference between a live recording and a bunch of tracks slapped together in a studio. This is where a "precise" recording can end up feeling dead.
But how do you make a studio recording sound live? You can't change what's in the recording but there are little tweaks that can give the illusion a nudge. Too much distortion draws attention to itself and ruins the mix. The distortions I speak of are more subtle, designed to operate at or below the threshold of conscious awareness.
First, you need midrange. Midrange contains most of the critical detail - drum snares, pick scrapes, vocal timbre, piano chords, etc. We live in a world of mostly midrange. If you don't have enough midrange, you're not really "there." The best guitar riffs are in the midrange. Most vocals are there as well. It's the detail we can most easily put our fingers on. It's also the Jan Brady of many sound systems, overlooked when builders are focusing on fancy tweeters and subwoofers. Much of what goes into Bose's overpriced room systems comes down to an over-abundance of midrange speakers designed to grab the heart of the sound. On the other hand, if you have too much midrange, the soundstage will collapse. You'll only hear the instruments themselves. In fact, you'll only hear part of the instrumentation.
It's an interesting balancing act. The SE530 has probably the lushest midrange, which is quite forward. The UM3X is more recessed, to make way for the LF and HF, but still very "natural" and "present." The Westone 3 seems to ratchet up that EQ smile with a heavier emphasis on the midbass and treble. I have a pair of JVC Marshmallows which are boomy but which recess the mids so much I have to EQ them to open them up. The Etymotic ER4 is known for its crystal clear mids and treble, even if the bass is anemic, though I think the treble overshadows the mids. The TF10 also recesses the mids in favor of more bass and treble.
Part of giving the presentation a "live" sound is to have ample bass, but quality is as important as quantity. The same jiggling that produces a front wave produces a back wave of equal amplitude but opposite phase. The problem is that the two waves cancel each other out, which is why speakers have a baffle, to separate front waves from back waves. Somebody figured out you can recycle the back wave, just as long as you separate it from the front wave, which gave birth to the "bass reflex" systems that dominate the loudspeaker landscape. Venting lets a speaker produce as much as twice the amount of bass you'd get from a sealed-off acoustic suspension system. The problem is, the bass you end up with is muddier since it represents an echoing. This thicker bass is fine for the "bass is bass" crowd, but it has a tendency to crowd the mix. You don't want bass that carpet bombs the presentation. You want bass that adds flavor, like items in a buffet.
This is especially important because bass isn't just that throbbing bong-bong-bong you hear out of the backs of lowriders. It's the shadow to the light. It's a fundamental part of the low register of many instruments. Piano has bass. Drums have bass. Even the human voice has bass. It's like a primary color, which can neither be sucked out of the picture nor left to dominate it. There's a bass fetish enjoyed by total bassheads, which is its own animal, but most of us crave our share of bass. When we crank up the bass, what we're trying to do is compensate for the lack of it in cheap, all-purpose, wide-range speakers. Cheap stereos force the consumer to crank up both the bass and the treble because the speakers do a poor job of communicating either. In a sonic environment where you have the power to get what you want, cranking up the bass too much will bury everything else (these are big, slow waves). A live performance has ample bass (which distinguishes it from listening to a radio), but a little of the real thing goes a long way.
If you want to feel like you're up in front, crank up more bass and the soundstage will collapse. If you want to feel a bit further back, relax the bass till you've found your sweet spot.
Last but not least, treble adds to the sense of soundstage. When midrange crowds out the treble, the soundstage narrows. One way to "open it up" is to increase the level of treble in the mix. This is important because HF waves are fast and short. They bounce readily off of hard surfaces. A lot of HF feedback (for lack of a better term), coming from different directions, can fool the brain into feeling extra space. Because of their size and speed, HF vibrations are the most ethereal. When Grado wanted to "open up" its midrange-heavy RS-1, it created a G-cush pad that looked like a salad bowl. The pad (which Grado marketed as "a concert hall for your ears") simply distanced the ear from the driver, which "opened up" the soundstage by allowing better HF dispersion. As a purchaser of a GS-1000, my first (and nagging) complaint was the sibilance (which is nothing more than too much HF). I eventually fixed the problem by taking a pair of scissors and cutting back on the cushion until the proper balance was restored.
Too much HF draws attention to itself and takes you away from the "presence" of the music. It can be like moving to the very rear of the concert hall. But a certain amount of HF is critical to feeling that sense of space, especially the higher reaches of HF. Lower HF is just an extension of the midrange. You need it but it doesn't contribute nearly as much to the sense of space as the more extreme HF. Because they use drivers originally intended for hearing aids, single-driver BA designs tend to capture more midrange than treble. The logic of a hearing aid is sometimes opposite to that of an IEM. Someone relying on a hearing aid to make out human speech has less interest in "soundstage," especially if we're talking about ambient sounds that draw focus away from human speech. The "narrow soundstage" that listeners complain of is the "better focus" that sells hearing aids.
Makers of BA drivers, like Knowles and Sonion, have adapted to the audio market by designing "tweeters" with better HF extension. The whole point of the triple-driver design is to be able to set the right levels for each aspect of the mix - bass, mids and treble. The blend created by a given sound signature will produce different types of "soundstage." The idea that "widest is best" is as extreme as shoving the listener to the front of the stage. Some want that "on the stage" feel, others want the sound as they'd get it up in the highest rows of the "nosebleed" section, but most listeners probably fall somewhere between rows 1 and 20. Most of us want to be "there," but we also want some perspective. What's more, a little "perspective" helps facilitate the illusion that the band is performing a few meters away, not from inside some crevice within our skull.
One of the things to keep in mind is that few of us listen to the same kind of music all the time. What may sound perfectly "neutral" on one recording may sound a bit "flat" on another. What may sound perfectly "kicking" on one track may sound ridiculously "overcooked" on another. Unless you leave the control to your amp, and your earphones are left as neutral as possible, there are going to be mismatches. Most commercial monitors are designed with tweaks that produce a certain sonic "flavor." The ER4 is all about producing a crisp, clean sound, even if it sacrifices some bass. The Westones are warm, with the UM3X going for a more neutral sound while the Westone 3 goes for kick. The SE530 is heavy on the mids. The TF10 tends to recess the mids in order to highlight the high and low ends. The PFE seeks a sonic balance, but it comes in two filter styles - light and dark. As a knockaround pair, I have some Marshmallows, which are so dark I have to use the Treble Boost setting on my iPod to open them up.
The eartip also makes a difference. Silicone delivers better HF. The foamies deliver better bass. Usually, you get the one without the other. Customs get the best of both worlds with a tighter fit (giving better bass) but without a natural dampener like foam, so the HF is less veiled.