Panning for headphones (HPAN)
Dec 2, 2021 at 2:59 PM Thread Starter Post #1 of 1

71 dB

The limits of amplitude panning

Traditional amplitude panning isn't perfect, but it works relatively well with speakers. With headphones, this approach results in quite unnatural spatiality. So, what should panning for headphones (HPAN) look like?

ILD (Interaural Level Difference)

HRTF measurements tell us the natural levels of ILD. Since HRTFs are individual and we are talking about panning rather than real binaural sound, the solution is to approximate average HRTFs with simple shelf filters. The combined power of both channels should add up to the power of the original mono sound to be panned:

L² + R² = M² ,​

where L is the RMS amplitude of the left channel, R is the RMS amplitude of the right channel and M is the RMS amplitude of the original sound to be panned. If we scale M = 1, we get:

L² + R² = 1.​

L and R are functions of the panning angle, but what kind of functions? Well, it is a well-known fact that:

sin²𝛂 + cos²𝛂 = 1 ,​

for all angles 𝛂. For center-panned sounds, 𝛂 = 𝛑/4 radians (45°). If our panning angle is 𝛏 in radians, we get 𝛂 as a function of 𝛏:

𝛂 = 𝛑/4 + 𝛃 * 𝛏 ,​

where 𝛃 is a frequency-dependent coefficient chosen so that it gives proper ILD levels based on the HRTF approximation. For low frequencies (ILD ≈ 3 dB for hard-panned sounds) 𝛃 ≈ 0.1, and for the high treble frequencies (ILD ≈ 20 dB) 𝛃 ≈ 0.47.
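As a quick sanity check of the relation between 𝛃 and ILD, here is a minimal Python sketch. It assumes that a hard-panned sound corresponds to 𝛏 = 𝛑/2 (an assumption, although it matches the delay formulas later in this post), and the function names are made up for illustration:

```python
import math

def pan_gains(xi, beta):
    """Constant-power gains (left, right) for pan angle xi in radians."""
    alpha = math.pi / 4 + beta * xi
    return math.cos(alpha), math.sin(alpha)

def ild_db(xi, beta):
    """Right-minus-left level difference in dB for pan angle xi."""
    left, right = pan_gains(xi, beta)
    return 20 * math.log10(right / left)

def beta_for_ild(target_db, xi_max=math.pi / 2):
    """Invert the relation: beta that gives target_db at xi = xi_max.

    xi_max = pi/2 for a hard-panned sound is an assumption.
    """
    return (math.atan(10 ** (target_db / 20)) - math.pi / 4) / xi_max

if __name__ == "__main__":
    for beta in (0.1, 0.47):
        left, right = pan_gains(math.pi / 2, beta)
        print(f"beta={beta}: L={left:.3f} R={right:.3f} "
              f"L^2+R^2={left**2 + right**2:.3f} "
              f"ILD={ild_db(math.pi / 2, beta):.1f} dB")
    for target in (3.0, 20.0):
        print(f"target {target} dB -> beta={beta_for_ild(target):.3f}")
```

Under that 𝛏 = 𝛑/2 assumption, 𝛃 = 0.1 comes out at roughly 2.8 dB, while 𝛃 = 0.47 comes out at roughly 26.5 dB, so the treble coefficient may need some tuning (or a different maximum angle) to land exactly on a 20 dB target. `beta_for_ild` can be used to experiment with that.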

ISD (Interaural Spectral Difference)

To implement this frequency-dependent ILD (ISD, actually), the sound is divided into a lower and an upper frequency band at about 800 Hz, above which the head-shadow effect becomes much larger. Gain values are calculated for both channels in both the lower and the higher frequency band. When the bands are mixed together, the sum is a reasonable approximation of how the shadow effect increases from low to high frequencies:

L_low = cos ( 𝛑/4 + 0.1 * 𝛏 )​
L_high = cos ( 𝛑/4 + 0.47 * 𝛏 )​
R_low = sin ( 𝛑/4 + 0.1 * 𝛏 )​
R_high = sin ( 𝛑/4 + 0.47 * 𝛏 )​
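The four gains above could be applied roughly as in the following Python sketch. The crossover used here (a one-pole low-pass, with the high band taken as the input minus the low band) is only an illustrative assumption; the post does not specify a filter type, and `hpan_ild` and its parameters are made-up names:

```python
import math

def hpan_ild(mono, xi, sample_rate=44100, crossover_hz=800.0,
             beta_low=0.1, beta_high=0.47):
    """Pan a list of mono samples using band-dependent constant-power gains.

    Returns (left, right) sample lists. The one-pole crossover is an
    assumption made for illustration, not part of the original description.
    """
    a_low = math.pi / 4 + beta_low * xi
    a_high = math.pi / 4 + beta_high * xi
    l_low, r_low = math.cos(a_low), math.sin(a_low)
    l_high, r_high = math.cos(a_high), math.sin(a_high)

    # Smoothing coefficient of a one-pole low-pass at crossover_hz.
    k = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)

    low = 0.0
    left, right = [], []
    for x in mono:
        low += k * (x - low)   # low band (one-pole low-pass state)
        high = x - low         # complementary high band: low + high == x
        left.append(l_low * low + l_high * high)
        right.append(r_low * low + r_high * high)
    return left, right
```

With 𝛏 = 0 both channels receive the same signal. Note that each band is constant-power on its own, but a simple split like this one is not power-complementary around the crossover, so a steeper or better-matched crossover would probably be preferable in a real implementation.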

ITD (Interaural Time Difference)

ITD is pretty easy, especially if we assume a linear relation between ITD and angle, which is not far from the truth. The delays (in microseconds) for the left and right channels would be of the form:

delay_left = ( 𝛑/2 + 𝛏 ) * 640 / 𝛑​
delay_right = ( 𝛑/2 - 𝛏 ) * 640 / 𝛑​

This means that the left and right channel delays always add up to 640 microseconds (= max ITD). Centered sounds are delayed 320 microseconds in both channels, so that the ITD is zero.
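The delay formulas translate directly into code. A small Python sketch follows; the function names and the sample-rate conversion are additions for illustration only:

```python
import math

MAX_ITD_US = 640.0  # maximum ITD in microseconds, as used above

def itd_delays_us(xi):
    """Return (delay_left, delay_right) in microseconds for pan angle xi."""
    delay_left = (math.pi / 2 + xi) * MAX_ITD_US / math.pi
    delay_right = (math.pi / 2 - xi) * MAX_ITD_US / math.pi
    return delay_left, delay_right

def delay_in_samples(delay_us, sample_rate=48000):
    """Convert a delay in microseconds to a (fractional) sample count."""
    return delay_us * 1e-6 * sample_rate

if __name__ == "__main__":
    for xi in (-math.pi / 2, 0.0, math.pi / 2):
        dl, dr = itd_delays_us(xi)
        print(f"xi={xi:+.3f}: left={dl:.0f} us, right={dr:.0f} us, "
              f"ITD={dl - dr:+.0f} us")
```

At 48 kHz, 640 microseconds is only about 30.7 samples, so the delays are generally fractional. Some form of interpolated (fractional) delay line would likely be needed for smooth panning.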

Comments

This approach amounts to headphone panning in a free field, devoid of the primary spatial cues present with speakers, such as the early reflections of reverberation. Adding a simulation of these cues to the panning process could make the result even more natural, but it increases the complexity dramatically.

I am currently writing a Nyquist plugin to test out these ideas and to gain more understanding of headphone spatiality. The holy grail would be a super-panning method that works well for both speakers and headphones.
 
