2. HRTF-convolution techniques also create depth => 3D movie instead of 2D
The parallel of 3D movies and crossfeed is completely false.
One could argue that, given standing waves and bass overhang at low frequencies and comb filtering from early reflections at mid and high frequencies, the analogy is indeed incorrect, or at least a crude parallel.
Nevertheless, when advocating his crosstalk cancellation algorithm, Dr. Choueiri compares it to a stereoscope, a device people used to wear in order to perceive the 3D effect of stereoscopic pictures:
https://www.audiostream.com/content/bacch-prelude
What I would object to most is the claim that HRTF convolution alone, at playback, causes the 3D effect.
Synthesized binaural mixes (equivalent to binaural recordings produced through dummy heads or humans with in-ear microphones), played back with speakers (acoustic convolution with the listener's HRTF) and crosstalk cancellation, also cause the 3D effect (perhaps with imprecise rendering of elevation).
Binaural recordings made with dummy head microphones, played back with speakers (acoustic convolution with the listener's HRTF) and crosstalk cancellation, also cause the 3D effect (perhaps with imprecise rendering of elevation).
Regular stereo recordings with natural ILD and ITD, played back with speakers (acoustic convolution with the listener's HRTF) and crosstalk cancellation, also cause the horizontal 360-degree soundstage effect, but probably struggle to render any elevation.
In the last three playback environments, the loudspeaker crosstalk cancellation algorithm may improve with electronic PRIR convolution.
Binaural recordings played back with headphones, with HRTF convolution, without electronic crosstalk and with headtracking, also cause the 3D effect (perhaps with imprecise rendering of elevation).
Regular stereo recordings with natural ILD and ITD, played back with headphones, with electronic HRTF convolution and headtracking, but with a lower level of electronic crosstalk than one would find with acoustical crosstalk, also cause the 3D effect (perhaps with imprecise rendering of elevation). See:
By the way, I recently created a PRIR for stereo sources that simulates perfect crosstalk cancellation. To create it, I measured just the center speaker, and fed both the left and right channel to that speaker, but the left ear only hears the left channel because I muted the mic for the right ear when it played the sweep tones for the left channel, and the right ear only hears the right channel because I muted the mic for the left ear when it played the sweep tones for the right channel. The result is a 180-degree sound field, and sounds in the center come from the simulated center speaker directly in front of you, not from a phantom center between two speakers, so they do not have comb-filtering artifacts as they would from a phantom center.
Binaural recordings sound amazing with this PRIR and head tracking.
Using the first PRIR, central sounds seem to be in front of you, and they move properly as you turn your head. However, far-left and far-right sounds stay about where they were. That is, they sound about the same as they did without a PRIR, and they don't move as you turn your head. In other words, far-left sounds stay stuck to your left ear, and far-right sounds stay stuck to your right ear. It's possible to shift the far-left and far-right sounds towards the front by using the Realiser's mix block, which can add a bit of the left signal to the front speaker for the right ear, and a bit of the right signal to the front speaker for the left ear.
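The routing described in the quote above can be sketched in a few lines. This is only an illustration of the idea (each ear fed by its own channel, plus an optional small bleed of the opposite channel); the function name `mix_block` and the `bleed_db` parameter are hypothetical and are not the Realiser's actual interface.

```python
def mix_block(left, right, bleed_db=None):
    """Return the signals fed to the virtual front speaker for each ear.

    bleed_db=None models the "perfect crosstalk cancellation" PRIR:
    the left ear gets only the left channel, the right ear only the right.
    A negative bleed_db (e.g. -20) adds an attenuated copy of the opposite
    channel, which is assumed here to pull hard-panned sounds toward the front.
    """
    gain = 0.0 if bleed_db is None else 10 ** (bleed_db / 20)
    left_ear = [l + gain * r for l, r in zip(left, right)]
    right_ear = [r + gain * l for l, r in zip(left, right)]
    return left_ear, right_ear


# A sound present only in the left channel (first sample) and one present
# only in the right channel (second sample).
left_channel = [1.0, 0.0]
right_channel = [0.0, 1.0]

no_bleed = mix_block(left_channel, right_channel)
some_bleed = mix_block(left_channel, right_channel, bleed_db=-20)
print(no_bleed)
print(some_bleed)
```

Without bleed, each ear feed is identical to its own channel; with a -20 dB bleed, each ear also receives the opposite channel at one tenth of its amplitude. The resulting feeds would then still have to be convolved with the measured center-speaker PRIR.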
Binaural recordings and regular stereo recordings played back with headphones, with electronic HRTF convolution, but with added electronic crosstalk and headtracking, will render the external pan-pot stereo effect we are used to perceiving with regular speakers in a room.
Binaural recordings made by the listener himself, played back with headphones, without HRTF convolution, without electronic crosstalk and with headtracking, also cause the 3D effect (perhaps with more precise rendering of elevation).
Object-based tracks mixed with a personalized HRTF convolution (one measured in an anechoic chamber), played back with headphones, without electronic crosstalk and with headtracking, also cause the 3D effect (perhaps with more precise rendering of elevation, depending on the HRTF density or the quality of the interpolation algorithm).
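To make the role of HRTF density and interpolation a bit more concrete, here is a minimal sketch of rendering one object with a pair of head-related impulse responses (HRIRs). It assumes a naive linear blend between two neighbouring measured directions, which is only a stand-in for whatever interpolation a real renderer uses; all function names and the toy impulse responses are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution, written out so no external library is needed."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def interpolate_hrir(hrir_a, hrir_b, weight):
    """Naive linear blend of two equal-length HRIRs (weight 0 -> a, 1 -> b).

    The sparser the measured grid, the further apart a and b are, and the
    more such a blend can smear the spectral cues used for elevation.
    """
    return [(1 - weight) * a + weight * b for a, b in zip(hrir_a, hrir_b)]


def render_object(mono, hrir_left, hrir_right):
    """Return the (left ear, right ear) headphone signals for one object."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data: the object sits halfway between two measured directions.
left_hrir = interpolate_hrir([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], 0.5)
right_hrir = interpolate_hrir([0.0, 0.5, 0.0], [0.0, 0.0, 0.5], 0.5)
left_ear, right_ear = render_object([1.0, 2.0], left_hrir, right_hrir)
print(left_ear, right_ear)
```

With headtracking, the pair of HRIRs would be re-selected (or re-interpolated) each time the head orientation changes, so that the object stays fixed in the room rather than fixed to the head.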
Higher-order ambisonics with a PRIR convolution, played back with headphones, without electronic crosstalk and with headtracking, also causes the 3D effect (perhaps with more precise rendering of elevation depending on the order used: 3rd order with 16 channels or 4th order with 25 channels, with recordings from 32-capsule Eigenmikes). And with a little more research and an array of Eigenmikes, perhaps soundfield navigation of recorded venues! (
https://www.princeton.edu/3D3A/Publications/Tylka_POMA_NavigationEvaluation.html)
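As a quick check on the channel counts involved: a full-sphere ambisonic signal of order N carries (N + 1)^2 channels. The helper below is just that formula; the function name is made up for illustration.

```python
def ambisonic_channels(order):
    """Number of channels in a full-sphere ambisonic signal of the given order."""
    if order < 0:
        raise ValueError("ambisonic order must be non-negative")
    return (order + 1) ** 2


for order in range(1, 5):
    print(order, ambisonic_channels(order))
```

So 3rd order needs 16 channels and 4th order needs 25, which is why a microphone array with 32 capsules has enough capsules to support encoding up to 4th order.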
Pan-pot stereo recordings with unnatural ILD and ITD, played back with speakers and crosstalk cancellation, or played back with headphones without the addition of crossfeed, will sound odd, as 71dB fears. See:
3 Is the 3D realism of BACCH™ 3D Sound the same with all types of stereo recordings?
(...)
All other stereophonic recordings fall on a spectrum ranging from recordings that highly preserve natural ILD and ITD cues (these include most well-made recordings of “acoustic music” such as most classical and jazz music recordings) to recordings that contain artificially constructed sounds with extreme and unnatural ILD and ITD cues (such as the pan-potted sounds on recordings from the early days of stereo). For stereo recordings that are at or near the first end of this spectrum, BACCH™ 3D Sound offers the same uncanny 3D realism as for binaural recordings. At the other end of the spectrum, the sound image would be an artificial one and the presence of extreme ILD and ITD values would, not surprisingly, lead to often spectacular sound images perceived to be located in extreme right or left stage, very near the ears of the listener or even sometimes inside of his head (whereas with standard stereo the same extreme recording would yield a mostly flat image restricted to a portion of the vertical plane between the two loudspeakers).
(...)
https://www.princeton.edu/3D3A/PureStereo/Pure_Stereose13.html#x28-1300013
So many possibilities that it is too difficult to write them all down without getting something wrong. Anxious to test them all.
P.S.: Edited several times to correct mistakes and to cover more content x playback environment possibilities.