Correct me if I am wrong but wouldn't a "generic" HRTF altered audio signal (i.e. what we get with basically all these VSS products) already simulate the audio effect produced by your outer ear, thus it does not necessarily matter if you are using an IEM? I.e., the HRTF magic your brain relies on from soundwaves bouncing around your ears, body and face is already "baked in." Basically all that would matter is the IEM's accuracy, imaging and soundstage?
IEMs would matter, however, for anything relying on an in-ear mic recording of your headphone's sound signature, e.g. the Realiser, since you cannot do that with an IEM. (But you could still just do the speaker calibration part...) So the best possible (i.e. most individually calibrated) way to do VSS would not work with an IEM, but everything else would. That's my understanding at least, someone tell me if I am wrong.
Yes. Further, for the HRTF to REALLY work PROPERLY for 3D positional audio, the HRTF (or your head/torso/pinnae in real life) needs to be fed a multi-directional signal.
No amount of stereo only signal from fixed position stereo headphone transducers will give you ADDITIONAL positional cues.
Only in real life, with free ears and listening to natural sounds coming from all directions, do your natural head/torso/pinnae masking and reflections (= your own HRTF) really give additional positional cues as to the direction the sound is coming from. That, and moving your head even slightly.
If you record this natural multi-stream, true-multidirectional signal with a stereo microphone, then the directional signals are GONE forever.
No amount of playing back that signal using IEMs, closed headphones, open headphones, big headphones, small headphones, will regenerate or recreate those positional signals. They are gone.
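To make the "gone forever" point concrete, here is a rough sketch of why a plain two-mic stereo recording can't tell front from back. It assumes a very simplified model that is mine and not from any product: two omnidirectional mics spaced roughly ear-width apart, no head between them, and a distant source. The names `MIC_SPACING` and `inter_mic_delay` are made up for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature
MIC_SPACING = 0.18      # m, assumed spacing, roughly ear to ear

def inter_mic_delay(azimuth_deg):
    """Arrival-time difference (seconds) between the two mics for a distant source.

    0 degrees is straight ahead, 90 is hard right, 180 is directly behind.
    Simplified far-field model: two omni mics, no head or pinnae in the way.
    """
    return MIC_SPACING * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

front = inter_mic_delay(30)   # source at front right
rear = inter_mic_delay(150)   # same source mirrored to rear right
print(front, rear)
```

In this model the two delays come out the same, so the recording of the front source and the recording of the rear source are indistinguishable. Whatever headphone plays it back later has nothing left to work with.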
Yes, you can buy a headset with a huge reverberating closed headphone cup with LOTS of phase errors and it will create a sense of "bigger soundstage" and "wider space" and "space that envelops around your head", but it won't be one bit more accurate as to the original direction of the sound signals. In fact, while euphonic to some, it will be WORSE for directional accuracy.
This is basic acoustics/psychoacoustics.
The second way, which the BEST of the 3D headphone virtualization algos try to recreate, is:
1) You have a true discrete, multi-directional multichannel signal (non-downmixed 7.1 discrete multichannel audio is an example). This can be an artificially computed signal (like a gaming / VR environment) or a natural multichannel recording made in a natural soundspace (say, a recording in a concert hall).
2) Using the above (1) multichannel signal, the 3D virtualization algo uses an HRTF function, either generic or tailored to your head/ears, to map these 7.1 discrete audio channels into stereo headphone playback while trying to retain (mimic) as much of the original positional cues as possible. Some positional and spatial cues will be lost, but the good algos can do a fairly competent job.
However, if you already feed that HRTF algo a stereo (non-multichannel) signal, it can NOT recreate positional or spatial cues using the HRTF algos. It needs a real multichannel signal.
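A toy sketch of what step 2 boils down to: each discrete channel gets filtered with the left-ear and right-ear impulse response for its speaker direction, and the results are summed into two headphone feeds. Everything here is an assumption for illustration. The `TOY_HRIRS` values are made-up placeholders, not measured HRTF data, and `render_binaural` is a hypothetical name, not any product's API.

```python
def convolve(signal, kernel):
    """Plain direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

# Placeholder impulse responses (left ear, right ear) per speaker direction.
# Made-up numbers, NOT measured data: the far ear is quieter and delayed.
TOY_HRIRS = {
    "front_left":  ([1.0, 0.0, 0.0], [0.0, 0.6, 0.0]),
    "front_right": ([0.0, 0.6, 0.0], [1.0, 0.0, 0.0]),
    "rear_left":   ([0.8, 0.0, 0.2], [0.0, 0.0, 0.4]),
    "rear_right":  ([0.0, 0.0, 0.4], [0.8, 0.0, 0.2]),
}

def render_binaural(channels):
    """Mix named discrete channels down to (left, right) headphone feeds."""
    length = max(len(sig) for sig in channels.values()) + 2  # 3-tap filters
    left = [0.0] * length
    right = [0.0] * length
    for name, sig in channels.items():
        h_left, h_right = TOY_HRIRS[name]
        for n, v in enumerate(convolve(sig, h_left)):
            left[n] += v
        for n, v in enumerate(convolve(sig, h_right)):
            right[n] += v
    return left, right

click = [1.0, 0.0]
front = render_binaural({"front_left": click})
rear = render_binaural({"rear_left": click})
print(front)
print(rear)
```

The same click lands at the ears differently depending on which channel carried it, and that is the positional cue. If the input is already plain stereo, there are only two channels to assign a direction to, so the front/rear distinction has nothing to be computed from.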
With this in mind, it should be obvious that when one feeds one's ears stereo signals from any pair of stereo headphones, the difference between IEMs and full-size headphones as to accurate and natural positional & spatial cues should not be that great. Especially if the stereo signal being listened to was already created by an artificial 3D virtualization HRTF algo.
Adding your ear/pinnae reflections to that signal doesn't recover any additional sound cues from the stereo signal.
There is a small caveat here.
If you have, say, headspeakers like the AKG K1000 and an audio signal recorded with a stereo microphone pair spaced ear-width apart (non-HRTF mixed), then that combination (very rare!) can recreate a bit of the HRTF signals through the K1000 headspeakers, which IEMs cannot. But that is a very rare and special case.
Another caveat would be a true multi-transducer headphone, where the transducers are spaced far enough apart from each other and are fed a true non-downmixed multichannel audio signal. AFAIK, such headphones do NOT exist (in commercial production). The faux multi-transducer gaming headphones don't have the transducers spaced far enough apart, nor the sound coming from separate enough directions. That last part is IMHO, I haven't seen papers on it.
But many IEMs can do very nice bass.
Yes, but NONE can approximate the actual visceral physical force of a 70mm driver moving air against your eardrum at 2 watts of amplification.
The only thing that will happen if you try that with IEMs is earbleed and blown out eardrums, not more physical feel of the bass.
Yes, there are great IEMs, I'm not saying that. Yes IEMs have their pros, and so do full-sized headphones with large transducers.
But having better "pinnae based HRTF related positional audio" is not one of those advantages (for either headphone type).