Of course, the stereo width and the depth cues greatly depend on the recording and mixing methods employed, whereby I was focusing on the subjective assessment that some speaker systems amid their interactions with the room seemed to lack the channel-to-channel image width I would expect of certain recordings played through my Genelecs or binaural head-tracking, but at least had the sound roughly on a line between said channels, primarily occupying the phantom center, whereas a few notable speaker systems, not too fancy, successfully created the illusion of the stereo line being positioned in space a foot or so above the tweeters and midrange drivers and a few feet behind the speakers.
Mmmm. Assuming a standard 2 channel stereo mix, then the “few notable speaker systems” did not “successfully create the illusion of the stereo line being positioned in space a foot or so above the tweeters …”; they unsuccessfully reproduced the recording. There is no height information in a 2 channel stereo mix and no intention of creating the illusion that there is. So if you are hearing the stereo line above the speakers, that is due to some fault in reproduction: maybe some timing/phase or even frequency issue between the drivers or, more likely, some undesirable reflections from the ceiling that your brain is interpreting as the stereo line being higher than it actually is. It might even be completely imaginary. Sounds/instruments being perceived as behind or in front of the speakers is something we can (and virtually always do) manipulate though.
"Out-there-ness" is the sense of the chance of some sound sources indeed imaging from beyond the front wall of the room or there being a large virtual space intersecting into rather than being confined by the physical listening space; e.g. if I were able to create a perfect personalized binaural recording of a concert hall from the middle of the fifth row and convincingly hear that imaging even if standing with my face against a wall.
OK, I get what you mean by “out-there-ness” now. We can create that effect/illusion as far as width of the stereo image is concerned (for example by applying “shuffling”, although this effect is quite rarely employed) and also for depth, using a combination of tools I’ll explain below. But of course we can’t create, and are not trying to create, a personalised binaural recording of a concert with a standard 2 channel stereo mix. All we can do is very vaguely approximate that subjectively and hope it’s going to translate well enough to a consumer’s listening environment so that their brain fills in the gaps and they imagine they’re hearing a relatively accurate representation of the orchestra in a concert hall.
I at the moment cannot set up my Genelecs to produce the illusion of imaging further than the line between the channels, though ceiling or wall reflections in my crappy setup could push some things like vocals above or leftward on the stereo line which I mainly find to be an annoyance.
Clearly you have some acoustic issues, and monitors such as Genelecs, which are particularly good at accurately reproducing stereo positioning/imaging, are going to be particularly badly affected by room/acoustic problems. I presume you’re using some near-field Genelecs? In which case, they are designed to be used in a near-field environment, e.g. with a listening position about 1-2m away from the monitors and with the monitors placed at least that distance away from the rear and side walls of your room.
With my binaural head-tracking, for the grand majority of recordings that don't do weird time panning or phase tricks, I end up with a largely 1D stereo image between the two virtual channels, whereby even depth cues seem to image along that fixed line between the channels.
Again, I don’t really understand. Binaural head-tracking obviously doesn’t work using speakers, even near-field speakers/monitors. Are you talking about “binauralising” a standard 2 channel stereo mix using headphones rather than speakers? If not, the results of that are going to be highly variable and could indeed be perceived as a largely 1D stereo image.
From my DSP simulation of reflections using upmixing and delays, if the delay is small enough, the sound sources end up getting panned/stretched in the direction of the simulated reflection, but I hadn't been able to simulate increased depth/distance so long as my measured HRTF ITDs are tied to the 1.5 m distance; maybe I need to play with front wall reflections. i.e. If I play my DSP while facing out into the living room, I still feel like the stereo line is expectedly no further than 1.5 m in front of me due to the distance at which I measured my HRTF and ITDs.
It seems you are trying to apply HRTF/ITD to speaker reproduction? That’s rarely, if ever, going to work because a standard 2 channel stereo mix already contains reflections and processing that account for how we hear in an acoustic space. The engineers/producer obviously have an HRTF of their own and are subjectively mixing according to what they are hearing/monitoring. You cannot add your own to that because your speakers are not isolated from each other (as HP earcups are). Maybe I’m misunderstanding what you’re trying to do?
As such, I don't know whether I would rather attempt to play with speaker choice, placement, and room treatment to position the stage exactly where I want it (again, my personal experience of live classical concerts is primarily frontal with rare or insignificant perception of wall or ceiling reflections) versus just directly measuring an HRTF from 10 m away and applying binaural pans for said virtual channels, adding simulated reflections as I please.
The way we create depth when mixing is to manipulate volume, HF roll-off, reflections and compression. We can do this artificially, using reverb units, EQ, fader level and compressors, or we can do it through microphone layouts/positioning and then panning, time adjusting and balancing or, most commonly, some combination of the two approaches. Most importantly, depth is not created by these effects, it is created by the contrast between them. E.g. merely adding reverb, HF roll-off, etc., will still result in a “1D line”; it’s the contrast of having different reverb, HF roll-off, time offset, etc., between the different instruments/sounds in the mix which creates depth. For example, applying the same reverb, the same level, HF roll-off, time offset, etc., to say both a violin and a trumpet will result in them both being perceived at the same depth/distance (the same 1D plane). But applying more compression, less HF roll-off, less reverb and more level to only the violin will result in it being perceived as closer than the trumpet, and now we no longer have the perception of a 1D plane: we would perceive “depth”.
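To make the "contrast" point a bit more concrete, here is a minimal Python sketch, not how any real mix is done. Everything in it is an assumption for illustration: the function names (`place`, `one_pole_lowpass`, `add_reflection`), the 1 kHz test tone and all the parameter values are made up, a single delayed copy stands in for a reverb unit, a one-pole filter stands in for EQ, and compression is left out entirely. It only shows that two sources given identical treatment come out identical (the same "plane"), while different treatment produces measurably different signals.

```python
import math

def db_to_gain(db):
    """Convert a level in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def one_pole_lowpass(samples, alpha):
    """Crude HF roll-off: alpha in (0, 1], smaller alpha = duller sound."""
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def add_reflection(samples, delay, gain):
    """Mix in one delayed, attenuated copy as a stand-in for reverb."""
    out = list(samples) + [0.0] * delay
    for i, x in enumerate(samples):
        out[i + delay] += gain * x
    return out

def place(samples, level_db, alpha, refl_delay, refl_gain):
    """Apply level, HF roll-off and a single reflection, in that order."""
    scaled = [db_to_gain(level_db) * x for x in samples]
    return add_reflection(one_pole_lowpass(scaled, alpha), refl_delay, refl_gain)

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

# Hypothetical test signal: 0.1 s of a 1 kHz sine at 48 kHz.
SR = 48000
tone = [math.sin(2 * math.pi * 1000 * n / SR) for n in range(SR // 10)]

# "Near" source: louder, brighter, weaker reflection.
near = place(tone, level_db=0.0, alpha=0.9, refl_delay=480, refl_gain=0.1)
# "Far" source: quieter, duller, stronger reflection.
far = place(tone, level_db=-9.0, alpha=0.3, refl_delay=480, refl_gain=0.5)
# A second source given exactly the "far" treatment: no contrast with `far`.
same = place(tone, level_db=-9.0, alpha=0.3, refl_delay=480, refl_gain=0.5)

print("near RMS:", round(rms(near), 3), "far RMS:", round(rms(far), 3))
print("identical treatment gives identical result:", far == same)
```

Whether a listener actually hears `near` as closer than `far` depends on the material, the rest of the mix and the playback system; the code only demonstrates the signal-level contrast, not the perception.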
Anyways, going back to the OP, the point was my proposing the use of a pink noise volume pan to in effect "trace out" the line through which most traditionally panned sound sources would be positioned on your playback system, revealing to you the "actual shape" of your "soundstage" as induced by your speakers + room …
Using pink noise will only trace out the line between your speakers and, in some cases, extreme issues with your room acoustics; it will not reveal the actual shape of your (perceivable) soundstage. This is because adding reverb to pink noise just results in slightly louder pink noise, adding pink noise with a HF roll-off to pink noise results in pink noise with slightly more bass, and adding quieter pink noise to pink noise results in slightly louder pink noise. In other words, pretty much all those things that define our perception of depth in a mix (or in the real world) simply don’t work with pink (or white) noise. All we perceive is a slightly louder or quieter pink noise, or pink noise with a somewhat different frequency balance, not any positional depth/distance cues. Again, I’m not sure I’ve correctly understood/interpreted what you’re trying to say?
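As a rough illustration of the "slightly louder" arithmetic, here is a small Python sketch. It assumes the added noise is uncorrelated with the original, so that powers simply add; that is a simplification, since a real reverb tail is built from delayed copies of the same signal. The function name `summed_level_increase_db` is made up for this example.

```python
import math

def summed_level_increase_db(delta_db):
    """Level rise (dB) when uncorrelated noise that is delta_db quieter
    is added to a noise signal, assuming the two powers simply add."""
    return 10.0 * math.log10(1.0 + 10.0 ** (-delta_db / 10.0))

for delta in (0, 6, 10, 20):
    rise = summed_level_increase_db(delta)
    print(f"added noise {delta:>2} dB down -> total rises by {rise:.2f} dB")
```

Under that assumption, even noise added at the same level raises the total by only about 3 dB, and noise 10 dB down raises it by well under 1 dB, which is a change in loudness rather than a distance cue.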
G