To crossfeed or not to crossfeed? That is the question...

Nov 12, 2017 at 10:53 AM Post #211 of 2,192
I don't think I want "my room" in the recording I am listening to, at least not always. I believe classical music recordings have all the spatial information there is to have, and crossfeed allows that information to enter my ears in a reasonable way. If it is something else, like rock or techno, I don't think "my room" is needed either. It is a more or less artificial soundstage, and I think reducing channel separation to natural levels is all that really matters. That is my opinion and other people may differ.
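"Reducing channel separation to natural levels" is what a basic crossfeed filter does: each ear also receives a low-passed, slightly delayed, attenuated copy of the opposite channel, roughly as a head would allow acoustically. A minimal sketch (the cutoff, level, and delay values here are illustrative assumptions, not any particular commercial implementation):

```python
import numpy as np
from scipy.signal import butter, lfilter

def crossfeed(left, right, fs=44100, cutoff=700.0, feed_db=-8.0, delay_us=300):
    """Add a low-passed, delayed, attenuated copy of each channel to the
    opposite channel, roughly mimicking what a head allows acoustically."""
    b, a = butter(1, cutoff / (fs / 2))      # gentle first-order low-pass
    gain = 10 ** (feed_db / 20)              # crossfeed level, e.g. -8 dB
    d = max(1, round(fs * delay_us * 1e-6))  # interaural-style delay, in samples
    def feed(x):
        y = lfilter(b, a, x) * gain
        return np.concatenate([np.zeros(d), y[:-d]])
    return left + feed(right), right + feed(left)

# A hard-left impulse: after crossfeed, some energy reaches the right ear too.
left = np.zeros(1024); left[0] = 1.0
right = np.zeros(1024)
out_l, out_r = crossfeed(left, right)
```

Stronger crossfeed just means a higher feed level and/or cutoff; a feed level of minus infinity leaves the recording untouched.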
 
Nov 12, 2017 at 12:30 PM Post #212 of 2,192
I believe classical music recordings have all the spatial information there is to have and crossfeed allows that information to enter my ears in a reasonable way.

Do you believe that the following classical music recording arrangements will render equivalent spatial information for headphone playback with added electronic crosstalk?

A) your “DIY Jecklin disk microphone” [or ORTF (French, 110 degrees apart) or NOS (Dutch, 90 degrees apart)] sitting at the conductor's spot, recorded direct to disk;
B) these examples from Decca: The Decca Sound: Secrets Of The Engineers.

Which of them do you believe will sound better with and without crosstalk?
 
Last edited:
Nov 12, 2017 at 12:57 PM Post #213 of 2,192
I don't think I want "my room" in the recording I am listening to, at least not always. I believe classical music recordings have all the spatial information there is to have and crossfeed allows that information to enter my ears in a reasonable way.

Oh dear. Are you sure you don't want to rephrase this? I had written a reply but thought you might want to reconsider first.
 
Nov 12, 2017 at 12:58 PM Post #214 of 2,192
2. HRTF-convolution techniques also create depth => 3D movie instead of 2D

The parallel of 3D movies and crossfeed is completely false.

One would argue that, with standing waves and over-emphasized bass at low frequencies and comb filtering from early reflections at mid and high frequencies, the analogy is indeed incorrect, or at least a crude parallel.

Nevertheless, when advocating his crosstalk cancellation algorithm, Dr. Choueiri compares it to a stereoscope, a device people used to wear in order to perceive the 3D effect in stereoscopic pictures:


https://www.audiostream.com/content/bacch-prelude

What I would object to most is the claim that HRTF convolution alone, at playback, causes the 3D effect.

The synthesis of "binaural mixes" (equivalent to binaural recordings produced through dummy heads or humans with in-ear microphones), played back with speakers (the listener's HRTF applied as acoustic convolution) and crosstalk cancellation, also produces the 3D effect (perhaps with imprecise rendering of elevation).

Binaural recordings made with dummy head microphones, played back with speakers (the listener's HRTF applied as acoustic convolution) and crosstalk cancellation, also produce the 3D effect (perhaps with imprecise rendering of elevation).

Regular stereo recordings with natural ILD and ITD, played back with speakers (the listener's HRTF applied as acoustic convolution) and crosstalk cancellation, also produce the horizontal 360-degree soundstage effect, though they probably fail to render any elevation.

In the last three playback environments, the loudspeaker crosstalk cancellation algorithm may be improved with electronic PRIR convolution.

Binaural recordings played back with headphones with HRTF convolution, without electronic crosstalk, and with headtracking also produce the 3D effect (perhaps with imprecise rendering of elevation).

Regular stereo recordings with natural ILD and ITD, played back with headphones with electronic HRTF convolution and headtracking, but with a lower level of electronic crosstalk than one would find acoustically, also produce the 3D effect (perhaps with imprecise rendering of elevation). See:

By the way, I recently created a PRIR for stereo sources that simulates perfect crosstalk cancellation. To create it, I measured just the center speaker, and fed both the left and right channel to that speaker, but the left ear only hears the left channel because I muted the mic for the right ear when it played the sweep tones for the left channel, and the right ear only hears the right channel because I muted the mic for the left ear when it played the sweep tones for the right channel. The result is a 180-degree sound field, and sounds in the center come from the simulated center speaker directly in front of you, not from a phantom center between two speakers, so they do not have comb-filtering artifacts as they would from a phantom center.
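The muted-mic measurement above amounts to removing the cross terms from ordinary speaker virtualization: in a normal binaural simulation each channel is convolved with both of a speaker's ear responses, while in the quoted PRIR each channel reaches only its own ear. A NumPy sketch (the impulse-response names are hypothetical placeholders, not Realiser internals):

```python
import numpy as np

def virtualize(x_l, x_r, h_to_l, h_to_r, g_to_l, g_to_r):
    """Ordinary speaker virtualization: the left speaker's signal x_l
    reaches both ears (h_*), and so does the right speaker's x_r (g_*)."""
    ear_l = np.convolve(x_l, h_to_l) + np.convolve(x_r, g_to_l)
    ear_r = np.convolve(x_l, h_to_r) + np.convolve(x_r, g_to_r)
    return ear_l, ear_r

def virtualize_no_crosstalk(x_l, x_r, h_center_to_l, h_center_to_r):
    """The muted-mic PRIR: both channels 'play' through the one measured
    center speaker, but each channel is convolved only with that speaker's
    response to its own ear -- crosstalk cancellation by construction."""
    return np.convolve(x_l, h_center_to_l), np.convolve(x_r, h_center_to_r)

# Toy responses: with a unit impulse as the ear response, the no-crosstalk
# path passes each channel through to its own ear untouched.
h = np.array([1.0])
ear_l, ear_r = virtualize_no_crosstalk(np.array([1.0, 2.0]),
                                       np.array([3.0, 4.0]), h, h)
```

Central content then carries the measured front speaker's single response in both ears, which is consistent with the post's point about avoiding phantom-center comb filtering.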

Binaural recordings sound amazing with this PRIR and head tracking.

Using the first PRIR, central sounds seem to be in front of you, and they move properly as you turn your head. However, far-left and far-right sounds stay about where they were. That is, they sound about the same as they did without a PRIR, and they don't move as you turn your head. In other words, far-left sounds stay stuck to your left ear, and far-right sounds stay stuck to your right ear. It's possible to shift the far-left and far-right sounds towards the front by using the Realiser's mix block, which can add a bit of the left signal to the front speaker for the right ear, and a bit of the right signal to the front speaker for the left ear.

Binaural recordings and regular stereo recordings played back with headphones, with electronic HRTF convolution but with added electronic crosstalk and headtracking, will render the external pan-pot stereo effect we are used to perceiving with regular speakers in a room.

Binaural recordings made by the user themselves, played back with headphones without HRTF convolution, without electronic crosstalk, and with headtracking, also produce the 3D effect (perhaps with more precise rendering of elevation).

Object-based tracks mixed with a personalized HRTF convolution (one measured in an anechoic chamber), played back with headphones without electronic crosstalk and with headtracking, also produce the 3D effect (perhaps with more precise rendering of elevation, depending on the HRTF density or the quality of the interpolation algorithm).

Higher-order ambisonics with a PRIR convolution, played back with headphones without electronic crosstalk and with headtracking, also produces the 3D effect (perhaps with more precise rendering of elevation, depending on the order used: 3rd order (16 channels) or 4th order (25 channels), with recordings from Eigenmikes). And with a little more research and an array of Eigenmikes, perhaps even soundfield navigation of recorded venues! (https://www.princeton.edu/3D3A/Publications/Tylka_POMA_NavigationEvaluation.html)

Pan-pot stereo recordings with unnatural ILD and ITD, played back with speakers and crosstalk cancellation, or played back with headphones without the addition of crossfeed, will sound odd, much as 71dB fears. See:

3 Is the 3D realism of BACCH™ 3D Sound the same with all types of stereo recordings?
(...)

All other stereophonic recordings fall on a spectrum ranging from recordings that highly preserve natural ILD and ITD cues (these include most well-made recordings of “acoustic music” such as most classical and jazz music recordings) to recordings that contain artificially constructed sounds with extreme and unnatural ILD and ITD cues (such as the pan-potted sounds on recordings from the early days of stereo). For stereo recordings that are at or near the first end of this spectrum, BACCH™ 3D Sound offers the same uncanny 3D realism as for binaural recordings. At the other end of the spectrum, the sound image would be an artificial one and the presence of extreme ILD and ITD values would, not surprisingly, lead to often spectacular sound images perceived to be located in extreme right or left stage, very near the ears of the listener or even sometimes inside of his head (whereas with standard stereo the same extreme recording would yield a mostly flat image restricted to a portion of the vertical plane between the two loudspeakers).
(...)
https://www.princeton.edu/3D3A/PureStereo/Pure_Stereose13.html#x28-1300013

So many possibilities; too difficult to write them all down without getting something wrong. Eager to test them all.

P.s.: Edited several times to correct mistakes and to cover more content x playback environment possibilities.
 
Last edited:
Nov 12, 2017 at 1:34 PM Post #215 of 2,192
One would argue that at low frequencies (standing waves and over-emphasized bass) and with comb filtering from early reflections at mid and high frequencies, the analogy is indeed incorrect, or at least a crude parallel.

Nevertheless, when advocating his crosstalk cancellation algorithm, Dr. Choueiri compares it to a stereoscope, a device people used to wear in order to perceive the 3D effect in stereoscopic pictures:
To save time... "Depth perception arises from a variety of depth cues. These are typically classified into binocular cues that are based on the receipt of sensory information in three dimensions from both eyes and monocular cues that can be represented in just two dimensions and observed with just one eye.[2][3] Binocular cues include stereopsis, eye convergence, disparity, and yielding depth from binocular vision through exploitation of parallax. Monocular cues include size: distant objects subtend smaller visual angles than near objects, grain, size, and motion parallax.[4]" (quote from Wikipedia)

Visual depth perception is far more different from spatial hearing than it is similar to it. The parallels really don't work.

The biggest problem with 3D imaging is the focus plane is fixed but the convergence distance is constantly changing and usually out of parity with the focus plane (the screen or aerial image distance). The problems in reproducing 3D audio are completely different.
 
Nov 12, 2017 at 2:10 PM Post #216 of 2,192
The biggest problem with 3D imaging is the focus plane is fixed but the convergence distance is constantly changing and usually out of parity with the focus plane (the screen or aerial image distance). The problems in reproducing 3D audio are completely different.

Please forgive me, I had not dug into it enough. Indeed, it was on my to-do list:

I just hope Smyth Research develops a way to seamlessly integrate the Realiser A16 with software that can display stereoscopic three-dimensional pictures. I would love to use a BRIR and a VR headset with a stereoscopic picture of the measured room to match vision and audition. I asked them and they did not answer. Perhaps parallax errors, the viewing angle of VR headsets, and other difficulties with stereoscopic 3D 360-degree images don't allow a precise match between the real speakers' image and the virtual speakers' sound just yet. I am looking forward to anybody figuring this out.
 
Nov 12, 2017 at 3:27 PM Post #217 of 2,192
Yeah, visual depth perception is quite different from auditory depth perception, with one exception: the importance of head movement. We turn our heads to perceive depth, like a deer will cock its head back and forth to determine how far away that coyote is. We do the same thing with sound to perceive directionality and distance. Beyond that, it's best to think of depth in sound as either primary depth cues (slight phase, echo, and location information that exists in real space in the listening room) or secondary depth cues (echo information recorded into the music itself). The primary depth cues are real. The secondary ones are copied from a different space. If you combine the two well, secondary cues can greatly enhance the perceived depth in a recording. If the secondary cues don't jibe with the real-world environment, they can detract and just muddy up the sound.
 
Nov 12, 2017 at 3:39 PM Post #218 of 2,192
Do you believe that the following classical music recording arrangements will render equivalent spatial information for headphones playback with added electronic crosstalk?

A) your “DIY Jecklin disk microphone” [or ORTF (French, 110 degrees apart) or NOS (Dutch, 90 degrees apart)] sitting at the conductor spot direct to disk;
B) these examples from Decca: The Decca Sound: Secrets Of The Engineers.

Which of them do you believe will sound better with and without crosstalk?

My short answer is: B) needs much stronger crossfeed than A), so without crossfeed A) sounds better.

Oh dear. Are you sure you don't want to rephrase this? I had written a reply but thought you might want to reconsider first.

Do you hear the acoustics of your living room when you are in the classical concert? The acoustics of the concert hall is all we need to capture.
 
Nov 12, 2017 at 4:31 PM Post #219 of 2,192
Do you hear the acoustics of your living room when you are in the classical concert?
Edit: Not if I attend the actual concert, but when listening to a recording, yes, and so do you and everyone else.
The acoustics of the concert hall is all we need to capture.
Two problems: 1. We can't capture concert hall acoustics in any practical way that even begins to be the complete acoustic picture. 2. The goal of all recordings is to present an acceptable representation of something, good enough to suspend disbelief, not to replicate the original. That's both practical and good because replication of the original is mostly impossible because it never actually existed at all.
 
Last edited:
Nov 12, 2017 at 7:37 PM Post #220 of 2,192
Two problems: 1. We can't capture concert hall acoustics in any practical way that even begins to be the complete acoustic picture. 2. The goal of all recordings is to present an acceptable representation of something, good enough to suspend disbelief, not to replicate the original. That's both practical and good because replication of the original is mostly impossible because it never actually existed at all.

Problems problems problems… …how about putting headphones on, setting proper crossfeed level and just enjoy the music instead of thinking about these problems? As you said, it's good enough to suspend disbelief...
 
Nov 12, 2017 at 11:38 PM Post #221 of 2,192


Speakers are the one area of home audio that offers better quality for more money. That's because they're mechanical. It doesn't hold true for electronics. A circuit board is a circuit board. But when you're working with voice coils and acoustics, it don't come cheap. Expensive speakers are expensive for a reason.
I'm sure there are lousy expensive ones, but cheap ones aren't generally very good. You get what you pay for with speakers.

Great speakers sound more natural than great headphones. I'll take speakers over headphones any day of the week. They sound more real. I have good headphones, but they stay in the drawer most of the time because they don't hold a candle to my speakers. The only time I wear them is when I'm editing and I don't want to annoy the people around me. I never listen to headphones for pleasure.

A good engineer with a budget can make something that sounds as good as or better than a buffoon with pounds and pounds of beryllium. Yes, the construction and materials cost is higher than for headphones, but you don't need to spend a fortune on speakers to get high fidelity, especially in today's age, with the prevalence of artificial materials and automated manufacturing. The "didn't spend enough" excuse is used to stymie all sorts of arguments: "You don't hear what I do because you didn't spend the money", and suchlike trains of thought. It's a statement that's impossible to respond to with any intelligence, and it defers to the superiority of the wallet, which is what spurs such high spending in this hobby. Spend enough money, and nobody on any forum can question you. Owners of summit gear are treated like royalty, their perspectives unquestioned. It's a toxic attitude in this hobby. I'm not saying you are one of those types, but you should not appeal to their logic.

I understand your preference for speakers. Many people feel the same way. That said, the premise of Head-Fi is headphone listening, and that can't be avoided. Some people, I'm imagining all sorts of apartment dwellers in LA, NY, Tokyo, London, etc., simply can't own, or listen loudly enough to, a speaker setup with much enthusiasm. Those people don't have much choice but to listen to headphones for pleasure. Personally, I'm able to listen to headphones and speakers, and get pleasure (and utility) out of both for different reasons.
 
Nov 13, 2017 at 1:19 AM Post #222 of 2,192
Problems problems problems… …how about putting headphones on, setting proper crossfeed level and just enjoy the music instead of thinking about these problems? As you said, it's good enough to suspend disbelief...
How about when "setting proper crossfeed" means not using it at all? I believe I gave an example of that...

But, you said:
I believe 1. classical music recordings have all the spatial information there is to have and 2. crossfeed allows that information to enter my ears in a reasonable way.

1. All the spatial information there is to have? Even modest familiarity with stereo microphone arrays should reveal the complete nonsense of that statement. ORTF/XY: less than a hemisphere. Coincident pair: no ITD. M/S: no ITD. The Decca Tree: scrambled ITD, and unique ILD, but capturing less than a hemisphere. Spaced omnis: fully scrambled ITD and ILD. Spot mic: mono, no 3D spatial information. And those are the commonly used ones. They all fall far short of capturing “all the spatial information there is to have”, but each is usable as an element for creating a believable mix.

2. Since the information isn’t even partially captured, it’s not going to enter your ears, crossfeed or not. Crossfeed corrects for one problem: widely separated mixes listened to on headphones. There hasn't been a widely separated orchestral ping-pong style recording made commercially in perhaps half a century.
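The geometry behind those array descriptions is easy to quantify: for a distant source, the arrival-time difference between two capsules is spacing × sin(angle) / c. A coincident pair therefore carries no time difference at all, while widely spaced omnis produce differences no head could. A far-field sketch (the spacings are typical textbook values, not any specific session):

```python
import math

C = 343.0  # speed of sound, m/s

def pair_time_difference_ms(spacing_m, angle_deg):
    """Arrival-time difference between two mics for a distant source
    at angle_deg off-axis (far-field plane-wave approximation)."""
    return spacing_m * math.sin(math.radians(angle_deg)) / C * 1000

# Coincident (XY) pair: zero spacing, so no time difference at any angle.
# ORTF: capsules ~17 cm apart, roughly head-like timing (~0.5 ms max).
# Spaced omnis metres apart: far beyond the ~0.65 ms a head can produce.
for d, name in [(0.0, "coincident"), (0.17, "ORTF"), (3.0, "spaced omnis")]:
    print(f"{name}: {pair_time_difference_ms(d, 90):.2f} ms at 90 degrees")
```

This only covers timing; the ILD behavior of each array (cardioid patterns, baffles, spot mics) is a separate question, which is part of why none of them captures the full spatial picture.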

You’ve completely missed the purpose of recording and reproduction. We aren’t trying to replicate the entire acoustic event, we are creating something new that represents the impression and feeling of the event when played in two-channel stereo on speakers in a typical home. You don't need to capture all the spatial information there is to do that, which is fortunate because we can't grab even a fraction of it anyway. That's why we have other solutions. There is better spatial representation when 5.1 or greater is used, but we still aren't replicating the original, or even close to it.

I worked on a series of recordings for broadcast of a world-class orchestra in their home hall. The hall had, at that time, been tragically "ruinovated" to the point that the acoustic environment was no longer really very good for concerts, being overly dry, with an assortment of other issues. We used many mics, including mono spots and various stereo pairs, and added (gasp!) artificial reverberation, mixed actively and judiciously, to create something that not even the live audience heard: good concert acoustics. But the AKG BX20 reverb (yes, it was springs!) didn't generate any 3D space, it generated random space. The Lexicon 224 that came shortly after did a better job, with the same result. They accomplished our goal, but very little spatial information from the original hall existed in those recordings.
 
Last edited:
Nov 13, 2017 at 4:49 AM Post #223 of 2,192
How about when "setting proper crossfeed" means not using it at all? I believe I gave an example of that...


You should be a lawyer. I feel like being in court while debating with you. Everything I say you use against me.

1. Yes, you use the best option available and that's it. What else can you do? You have it in stereo or, in better cases, as multichannel.

2. I believe it is captured almost completely, considering the sound is reproduced with stereophonic headphones or speakers. Even just one mono microphone in a room would be exposed to the acoustics, capturing the reverberation time as a function of frequency completely while of course ignoring directional information. A stereo microphone pair captures a lot of that directional information, and multichannel microphone setups do even better. Classical music in general doesn't suffer from strong stereo separation, but the microphone setups cause some excessive separation. For example, if you have an AB pair 7 feet apart, the ITD information will be exaggerated almost 10-fold in headphone listening. If you use ORTF, the ITD will be well scaled, but the cardioid microphones will produce excessive ILD information. Well, of course you don't think about headphones. You think speakers, and that's why the result is often not so optimal for headphones. Crossfeed helps correct this. I like to use quite strong crossfeed on orchestral music, and with string quartets for some reason. Solo piano music is the rebel against crossfeed and often doesn't want much of it.
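The "almost 10-fold" figure above checks out with simple arithmetic: a 7 ft AB pair spans roughly ten head-widths, so the maximum arrival-time difference it captures is about ten times what ears ever receive. A sketch (the 0.22 m effective interaural distance is a common rough approximation, not a measured value):

```python
import math

C = 343.0          # speed of sound, m/s
HEAD_WIDTH = 0.22  # effective interaural distance, m (rough approximation)

def max_time_difference_ms(spacing_m):
    """Largest possible arrival-time difference for a far-field source,
    i.e. one arriving along the axis joining the two receivers."""
    return spacing_m / C * 1000

ab_spacing = 7 * 0.3048                       # 7 ft in metres, ~2.13 m
mic_itd = max_time_difference_ms(ab_spacing)  # ~6.2 ms between the mics
ear_itd = max_time_difference_ms(HEAD_WIDTH)  # ~0.64 ms between human ears
ratio = mic_itd / ear_itd                     # ~9.7, the "almost 10-fold"
```

On speakers the acoustic crosstalk partially masks this exaggeration; on headphones each ear receives one mic's feed directly, which is where crossfeed comes in.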

The purpose of recording is to sell recordings and to make money doing so. Most music listeners understand absolutely nothing about what we are talking about here. They'd have a hard time telling a ping-pong stereo recording apart from mono. Classical music is recorded in a way that tries to replicate the acoustic event as accurately as possible, while a recorded rock concert perhaps has another philosophy.

Your recording in the "ruinovated" acoustics may not have much real acoustic information, but hearing can be fooled; otherwise using spring reverbs would only ruin things. I don't believe our hearing expects 100% accurate spatial information, because such a thing isn't unique: move to another seat in a concert hall and the acoustics your ears experience change. However, you don't notice the changed acoustics, because our hearing is used to that. Movement changes acoustics.* However, excessive stereo separation never happens in real life, no matter where you sit in a concert hall. That's why crossfeed is so important.

* I did some binaural test recordings with electret mics in my ears a couple of years ago. I walked outdoors and came home. The acoustics changed a lot from an open environment to a closed space. When I come home in everyday life, I don't pay much attention to this change because it's expected, but listening to the recording at my computer with headphones made the changes seem huge, because they were unexpected. I wasn't moving at all, so why were the acoustics changing so much?
 
Nov 13, 2017 at 12:25 PM Post #224 of 2,192
The thing about speakers and cost is that it all depends on the volume level and size of the room. If you have a small dorm room, near field monitors and sitting close will do, and that doesn't cost much at all. But if you have a good size space to fill and you want to get the volume up, it isn't inexpensive. Perhaps you're just defining expensive differently than I am. A decent set of speakers for a 5.1 system in a good sized living room would run between $5,000 and $10,000. To me, that qualifies as expensive. I'm not talking about speakers that are $15,000 apiece. I know that exists, but that seems like overkill to me. I can totally see spending a grand or two for a speaker though. (I've done it myself.)
 
Nov 13, 2017 at 12:52 PM Post #225 of 2,192
The required space does have an impact on cost. If you are nearfield, as in the example you used for a dorm room, you can get a pair of powered monitors relatively cheap, and hear everything there is to hear for $500 or less. For a decent 5.1 system in an average living room, I think you could probably get away with spending $2,000-$3,000 (close to my own budget) and have an unbelievably good sounding system, including receiver and sub. It won't be a 1,000 sq ft system, but for an average 15x15 living room it would do the trick.

One day, I hope to spend about $10k on a truly high end system. I'm not saying that spending thousands of dollars on speakers is a waste. There are great speakers out there that fetch a hefty sum because they have solid engineering and materials behind them. But high price is not a guarantee of such quality, and low price doesn't necessarily mean you're missing out on much either.
 

Users who are viewing this thread

Back
Top