Smyth Research Realiser A16
Feb 22, 2017 at 10:44 PM Post #601 of 16,030
1. the measurement is an analog process.  

2. then something happens in a black box to create a filter which enables an equalized headphone to match the sound the microphones "heard" in the ears.  

Currently, no modeling seems to be out there to systematically measure the respective

3. transfer functions of the room,

4. the playback system, and the individual doing the listening.  

5. If such modeling were available, it would be possible to create three discrete filters, one per transfer function. That would simplify things greatly and let this simulation become something other than the expensive, small-volume, bespoke process it currently is.

6. And needless to say, this refinement should be possible because the entire reproduction of digital sound is a mathematical process.  The convolution filters Smyth is creating are mathematical programming instructions altering the mathematical data file the music player is passing to a DAC.
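For illustration, a minimal sketch of what such a convolution filter does to the digital data stream (the 3-tap impulse response below is a made-up toy; a real PRIR channel is thousands of taps long):

```python
import numpy as np

# Hypothetical 3-tap impulse response standing in for one measured
# PRIR channel; real filters are far longer.
impulse_response = np.array([1.0, 0.5, 0.25])
dry_signal = np.array([1.0, 0.0, 0.0, 0.0])  # a unit impulse

# "Altering the mathematical data file" is literally this operation:
wet_signal = np.convolve(dry_signal, impulse_response)
# Convolving a unit impulse with a filter returns the filter itself.
```

Each output sample of the music stream is a weighted sum of recent input samples, with the weights taken from the measured impulse response.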


1. I believe there is an ADC right after the microphone input preamp. The parameters are digitally stored (a PRIR occupies a few kB?) and inserted into the function.

2. I thought that you knew what the algorithm does, but, like anyone else, you just couldn't figure out what equations and parameters they use.

3. IMHO that is what a recording microphone does. Some better than others.

4. That is more or less what the Realiser does, but it is intrinsically merged with the playback RIR measured by the user.

5. What's your opinion about the item number 5 in my previous post?

6. Thank you for explaining that the Realiser incorporates a mathematical function that may run on low computing power.
IMHO the difficulty lies exactly on how to design such algorithms/equations/functions.
Dolby became what it is today by licensing noise reduction and surround sound.
What the Realiser offers is orders of magnitude better than that (although restricted to headphones), and if they decide a box will give them the best return on all the investment they made to design such equations, that seems fair to me.
 
Feb 22, 2017 at 11:16 PM Post #602 of 16,030
By analog I mean that users have no transparent way of getting a digital file of their individual HRTF, abstracted from the Smyth hardware, that would be portable across platforms. Such a file could be combined with equally portable files for listening rooms and speaker systems, so that users could have their HRTF taken once, match it with a particularly well-designed listening room, and finally with whatever anechoically captured speaker transfer function they chose.  
 
Ultimately, this is where the train is heading, though I think it's going to take a while to get there.
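The three-discrete-filters idea can be sketched numerically: discrete convolution is associative, so separately measured impulse responses (all toy, invented data here, not real measurements) could in principle be pre-combined into one portable filter.

```python
import numpy as np

# Hypothetical toy impulse responses for the three transfer
# functions discussed above (HRTF, room, speaker). Real ones would
# each be measured separately and distributed as portable files.
hrtf = np.array([1.0, 0.3])
room = np.array([1.0, 0.0, 0.2])
speaker = np.array([0.9, 0.1])

signal = np.array([1.0, -1.0, 0.5, 0.0])

# Filtering through the chain one stage at a time...
out_chain = np.convolve(np.convolve(np.convolve(signal, hrtf), room), speaker)

# ...equals filtering once with the pre-combined filter, because
# convolution is associative.
combined = np.convolve(np.convolve(hrtf, room), speaker)
out_combined = np.convolve(signal, combined)
```

This is why, mathematically at least, swapping in a different room or speaker file would only mean recombining filters, not re-measuring everything.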
 
Feb 23, 2017 at 11:53 AM Post #603 of 16,030
Hi,
 
Quite a lot of the recent questions here have their answers given directly by Stephen Smyth in his very interesting and long interview with HCFR (in English); see here:
 
http://www.homecinema-fr.com/hcfr-le-podcast-tech-v3-2-entretien-avec-stephen-smyth/
 
... yup, it was a quite nice chat with Stephen... 

 
Hugo
 
Feb 23, 2017 at 2:23 PM Post #604 of 16,030
 
(...)
1. from my own trials, and I have no idea if it's solid advice or not (could be accidental correlation), I have found that the measurements from people with the same head circumference, or diameter from ear to ear, have consistently felt better (...)

 


1. If two heads with the same diameter result in more or less the same ITD (correlation between head diameter and distance between the eardrums), do you think pinna shapes are also more or less the same, i.e. that there is a direct correlation between head diameter and pinna size, so that ILDs are similar? Do you think reflections from the torso also change the ILD and could make the ILDs of these two individuals with the same head diameter a little different?

 

I wouldn't be too optimistic about that. Just how the ears keep growing with age probably makes this unlikely. But again, I only know averaged values about everything, so you'd need an actual expert for those things.
For the torso I really have no idea. To be clear, what I said was that the few almost-successful results I had came from people with the same head size as mine or very close. But that's it; plenty of HRTFs of the right head size didn't work for me at all.
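The head-diameter/ITD correlation being discussed is usually illustrated with the classic Woodworth spherical-head formula; the sketch below is a toy calculation under that idealization (real ITDs also depend on ear placement and torso reflections, which the model ignores).

```python
import math

def woodworth_itd(head_radius_m, azimuth_deg, c=343.0):
    """Woodworth spherical-head estimate of the interaural time
    difference (ITD, seconds) for a far-field source at a given
    azimuth. A deliberate idealization: the head is a rigid sphere
    with ears at opposite poles."""
    theta = math.radians(azimuth_deg)
    return head_radius_m / c * (theta + math.sin(theta))

# Two heads of identical radius give identical model ITDs, which is
# the correlation under discussion. 0.0875 m is a commonly assumed
# average adult head radius.
itd_a = woodworth_itd(0.0875, 90)
itd_b = woodworth_itd(0.0875, 90)
```

Under this model the ITD at 90 degrees comes out around 650 microseconds for an average head, but the formula says nothing about pinna shape, which is exactly why matching head size alone is not enough.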
 
 
2. When you say not to add crosstalk, do you mean the crossfeed mixing in the convolution to place the speakers at 30 degrees (or wherever they're placed)? Or do you mean some specific method or filter to try to turn speakers into binaural-ready gear (I vaguely remember something like that exists, but I don't remember what it is or does)?
(...)
 


2. I was referring to this:
There are a number of methods for generating 3D soundfields from loudspeakers. The three most promising are 1) Ambisonics, 2) Wave Field Synthesis and 3) Binaural Audio through Two Loudspeakers (BA2L). The first two methods rely on using a large number of microphones/recording channels for recording, and a large number of loudspeakers for playback, and are thus incompatible with existing stereo recordings. The third method, BA2L, relies on only two recorded channels and two loudspeakers only, and is compatible with the vast majority of existing stereo recordings (recorded with or without a dummy head).
(...)
The playback of a raw binaural signal through two loudspeakers results in a significant degradation of the integrity of the binaural cues transmitted to the listener ears because of the crosstalk that exists between the loudspeakers and the contralateral ears. Such unintended crosstalk, which obviously does not exist in playback with headphones, requires cancellation or effective reduction if binaural audio is to be successfully implemented through loudspeakers.
https://www.princeton.edu/3D3A/BACCH_intro.html
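A single-frequency toy sketch of the crosstalk cancellation idea from the quote (the numbers are invented, and this is not the actual BACCH filter design, which also handles regularization and spectral coloration):

```python
import numpy as np

# H maps the two speaker signals to the two ears at one frequency
# bin: diagonal terms are the direct paths, off-diagonal terms are
# the unwanted crosstalk to the contralateral ears.
H = np.array([[1.0 + 0.0j, 0.4 + 0.1j],   # left ear  <- (L spk, R spk)
              [0.4 + 0.1j, 1.0 + 0.0j]])  # right ear <- (L spk, R spk)

# In the idealized case the cancellation filter is just the inverse
# of H, applied to the binaural program before the speakers.
C = np.linalg.inv(H)

binaural = np.array([0.7 + 0.2j, -0.3 + 0.5j])  # desired ear signals
ear_signals = H @ (C @ binaural)                 # what the ears receive
```

With C in the chain the acoustic paths and the filter cancel, so each ear receives its intended binaural channel; without it, the off-diagonal terms of H smear the cues, which is the degradation the quoted passage describes.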

Wow, that feels counterintuitive when the very purpose of binaural recording is to improve headphone listening. But the technological attempt sure is interesting.
OK, so for that and a few other points, I now think I understand your initial question better ^_^. I'm now thinking the Ambisonic option would probably be better; there are just too many problems with the other option IMO.
If I get the idea, we start with a binaural recording (with or without a dummy head's influence), sent to speakers with a technology aimed at offering a better binaural experience, but on speakers. Then that sound is imitated by the Realiser to give the same experience on headphones. Basically it's all partially wrong except the head-tracking element of the Realiser. As far as the binaural album is concerned, it would work better straight into an EQed headphone, without BA2L or the Realiser. I understand the reasoning, but I doubt we can make it work as well as a proper 3D recording (not surround, "real" 3D like Ambisonics) made with many more than 2 mics. 
I guess to achieve binaural playback with good head tracking, including the HRTF impact, the cleanest idea would be to use the Realiser on a headphone and then on a KEMAR head (or whatever was used to record the binaural track), both done in front of the same speakers, but done as if it's the KEMAR we're trying to simulate in the calibration. Maybe the convolution would work that way, making the headphone sound like what the KEMAR head was hearing?
But I don't know how "clever" the Realiser is. Maybe it's always implied in the calibration that one measurement will have crosstalk and the other won't, because it expects a speaker-vs-headphone calibration? Instead of the speaker-vs-speaker-except-the-head-changes setup I'm thinking about? IDK.
 
6. Since I guess most elevation cues are imprinted by the size and shape of the pinna, I also guess that XY (90), ORTF (110), Jecklin disk and other microphone arrangements, though retaining a good amount of horizontal ILD and ITD cues, could lead to errors in elevation perception.

I agree, recording elevation correctly seems like the harder challenge in a binaural system. Two mics can entirely miss it, and a dummy head, while registering the right cues, has the problem that they're only the right cues if your head is like the dummy's. If the dummy head were my head, then it would logically be close to perfection and superior to anything else.
The recording side is certainly tricky, and I'm guessing many products would instead rely on virtual simulation for elevation (for ease of use: like panning, but in 3D, building a non-existing soundstage), which brings the same problem as a dummy head. It's expected to work on a perfectly average head and uses that as the model for the calculations. Back to square one.
 
Feb 23, 2017 at 4:13 PM Post #605 of 16,030
maybe it's always implied in the calibration that one measurement will have crosstalk and the other won't, because it expects a speaker-vs-headphone calibration?


Pure Stereo shines at reproducing binaural recordings through two loudspeakers and gives an uncannily accurate 3D reproduction that is far more stable and realistic than that obtained by playing binaural recordings through headphones 17.

(...)

17 This is because binaural playback through headphones or earphones is very prone to head internalization of sound (which means that the sound is perceived to be inside the head) and requires, in order to avoid this problem, an excellent match between the geometric features of the head of the listener and those of the dummy head with which the recording was made (this problem has been recently surmounted by the Smyth headphones technology http://www.smyth-research.com/). Pure Stereo does not suffer from this problem as the sound is played back through loudspeakers far from the listener’s ears.
https://www.princeton.edu/3D3A/Publications/Pure_Stereo.pdf


By the way, I recently created a PRIR for stereo sources that simulates perfect crosstalk cancellation. To create it, I measured just the center speaker and fed both the left and right channels to that speaker. The left ear only hears the left channel, because I muted the mic for the right ear while it played the sweep tones for the left channel, and the right ear only hears the right channel, because I muted the mic for the left ear while it played the sweep tones for the right channel. The result is a 180-degree sound field. Sounds in the center come from the simulated center speaker directly in front of you, not from a phantom center between two speakers, so they do not have the comb-filtering artifacts they would have from a phantom center.

Binaural recordings sound amazing with this PRIR and head tracking.
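A toy numerical sketch of why this muted-mic PRIR behaves like perfect crosstalk cancellation (one-tap gains standing in for full impulse responses; the values are invented):

```python
import numpy as np

# A normal single-speaker PRIR measurement captures a 2x2 path
# matrix, since each channel's sweep reaches both ears. Muting the
# contralateral microphone during each sweep zeroes the off-diagonal
# terms, so the stored PRIR routes left channel -> left ear only and
# right channel -> right ear only.
normal_prir = np.array([[1.0, 0.6],    # both ears hear both sweeps
                        [0.6, 1.0]])
muted_mic_prir = np.array([[1.0, 0.0],  # right-ear mic muted on L sweep
                           [0.0, 1.0]]) # left-ear mic muted on R sweep

binaural = np.array([0.8, -0.4])  # a binaural program's ear signals

# With the diagonal PRIR, each ear receives exactly its own channel,
# which is what headphone playback of binaural material wants.
ears = muted_mic_prir @ binaural
```

The diagonal matrix is the headphone equivalent of an ideal XTC system: the binaural cues pass through untouched, while the Realiser still contributes the center speaker's coloration and head tracking.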


OMG, it took me a lot of time to realize that your procedure achieves Dr. Choueiri's two-channel speaker crosstalk cancellation, but using headphones and the Realiser, which already addresses the "non-idealities, (e.g., mismatches between the HRTF of the listener and that used to encode the recording, movement of the perceived sound image with movement of the listener’s head, lack of bone-conducted sound, transducer-induced resonances in the ear canal, discomfort, etc.)" of "the location of the playback transducers in or very near the ears" that Dr. Choueiri describes in his article: http://www.princeton.edu/3D3A/Publications/BACCHPaperV4d.pdf.
Thank you very much for that!

How do you mute the opposite microphone?


To mute it I unplug the left or right microphone from the Y-junction between sweeps. I set the "post silence" to 8 seconds beforehand to give me enough time. To make it easier I plan to hook up an A/B switch.

I actually got the idea from a comment by Timothy Link in this Stereophile article about Dr. Choueiri's BACCH.
http://www.stereophile.com/content/bacch-sp-3d-sound-experience

You can also add a rear speaker to the PRIR for the left and right surround channels to achieve a full 360-degree circle like PanAmbiophonics, and additional speakers for hall ambience.


Jose Luis Gazal on September 4
@Smyth Research: Would it be possible to implement an optional function that allows the user to experiment a playback mode in which the signals assigned to left side speakers are not played back at the right headphone driver and vice versa?

Stephen Smyth Collaborator on September 4
@Jose Luis Gazal - yes we could do that for you.

https://www.kickstarter.com/projects/1959366850/realiser-a16-real-3d-audio-headphone-processor/comments
 
Feb 25, 2017 at 10:03 AM Post #606 of 16,030
Sorry to keep writing about it.

(...) a way of transparently getting a digital file of their individual HRTF (...)
Ultimately, this is where the train is heading, though I think it's going to take a while to get there.


Comparing databases instead of modeling HRTFs seems promising (Process of HRTF individualization by 3D statistical ear model), although I don't know how precise the morphological data coming from a photograph or scan is, nor how many different individuals' HRTFs a database would need in order to achieve the precision the Realiser A16 gets by simply measuring a PRIR.
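A minimal sketch of the database-comparison idea (the feature vectors and subject names below are invented, and a real system would use far richer morphological descriptors than three numbers):

```python
import numpy as np

# Hypothetical database: per subject, a small morphological feature
# vector extracted from a photo or scan (say, head width, pinna
# height, pinna depth in cm). Each subject also has a stored HRTF.
database = {
    "subject_01": np.array([15.2, 6.4, 2.9]),
    "subject_02": np.array([14.1, 5.9, 3.1]),
    "subject_03": np.array([15.9, 6.8, 2.7]),
}

def closest_subject(features, db):
    """Return the database key whose features are nearest in
    Euclidean distance; that subject's HRTF would be used."""
    return min(db, key=lambda k: np.linalg.norm(db[k] - features))

listener = np.array([15.0, 6.3, 2.8])  # new listener's measurements
match = closest_subject(listener, database)
```

The open questions raised above map directly onto this sketch: how accurately the feature vector can be extracted from a photo, and how densely the database must sample the population before the nearest match is perceptually close enough.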

I also don't know if the Smyth Research BRIR/PRIR database collected on the Exchange site could be used for a similar comparison, since it is intrinsically merged with the room impulse response (RIR, as I mentioned before) of the mixing/playback room measured.

I guess to achieve binaural playback with good head tracking, including the HRTF impact, the cleanest idea would be to use the Realiser on a headphone and then on a KEMAR head (or whatever was used to record the binaural track), both done in front of the same speakers, but done as if it's the KEMAR we're trying to simulate in the calibration. Maybe the convolution would work that way, making the headphone sound like what the KEMAR head was hearing?


I need to rephrase/complete what I wrote before:

5. That's my fundamental doubt and the reason I asked your opinion. :)

I guess the HRTF of a KEMAR manikin or a Neumann KU100 during recording adds to the PRIR applied at playback, leading to deviations in the elevation/spectral cues (more wave interference with two such steps).

Ideally, if one could capture not only such standard dummy microphones' HRTFs but also our own HRTF in an anechoic chamber, the Realiser convolution algorithm could compensate for such wave interference by taking into account the dummy microphone HRTF used in the recording (the audio distribution digital file could carry such metadata).

Also ideally, the BRIRs intended to be sold on the Realiser Exchange site could be captured by studios using such standard dummy microphones, to make Smyth Research's BRIR-to-PRIR personalization method more precise.

But in practice anechoic rooms are rare. :)


Now I will stop writing. :)
 
Feb 25, 2017 at 12:04 PM Post #607 of 16,030
Well, the HRTF of dummy heads is known (they pretty much made the head so that it would have a particular impact on sound). The real problem right now is getting a clear reference for our own heads. I believe the scans will take over at some point, because while it's a lot of work to build sound models from pictures or 3D scans, eventually we'll be able to go anywhere and have it done, without requiring the room to have an exact acoustic (identical to all the other places around the world where HRTFs would be made), plus the same speakers with the same exact sound. That feels too demanding as a process.
Meanwhile, 3D scans work well nowadays; they usually just need a guy making sure no mesh went crazy, and that's it. At least for the scan part ^_^.
 
 
 
I didn't reply to the earlier post because I simply fail to build a proper mental model of what they explain. I find some ideas conflicting; I don't feel that a DSP can really be enough to make speakers in a room truly binaural in the way I imagine it. Canceling out some of the signal when it reaches the other ear: I get the general concept, but I really don't get how that could possibly work in a room with all the interference and a dude moving just a little on his chair. Maybe if I were more knowledgeable in acoustics I could visualize something like this? IDK.

 
Feb 25, 2017 at 12:18 PM Post #608 of 16,030
If you can have a body scan by laser and have a bespoke suit made from it, I don't know why we can't get to the same point with HRTFs.  Seems doable, but the problem right now is that anyone with any knowledge in this regard wants to use it to sell you some boat anchor (black box) rather than a software-based solution.
 
Feb 25, 2017 at 2:03 PM Post #609 of 16,030
(...) I don't feel that a DSP can really be enough to make speakers in a room really binaural in the way I imagine it. canceling out some of the signal when it reaches the other ear, I get the general concept but I really don't get how that could possibly work in a room with all the interference and a dude moving just a little on his chair. (...)


At the beginning, Dr. Choueiri's filter for a given listening position was processed with parameters acquired with PRIR measurements, but he had not implemented head tracking:

7 Does Pure Stereo require sitting in a sweet spot?

Like standard stereo, where serious listening requires the audiophile to sit in an optimal location called the sweet spot, Pure Stereo also has a sweet spot. However, unlike standard stereo, where the sweet spot must be at a given location and in a vertical plane that is exactly equidistant from the two loudspeakers, the sweet spot of Pure Stereo can be designed to be anywhere in the listening room7, because the Pure Stereo filter can compensate for any asymmetries in the listening configuration (like it does for non-idealities in the loudspeakers and the hi-fi system).

During the filter design session the audiophile chooses the various locations in which he likes to have a Pure Stereo sweet spot. A filter is then designed for each location and the filters are loaded in the processor. The audiophile can then switch between these filters using a simple button on the processor and a display that shows the name of the sweet spot location (e.g. “Main Listening Chair,” “Reading Side Chair,” “Family Couch,” etc.)

When sitting in the sweet spot, the listener will not sense that sounds are emanating from the loudspeakers8. The loudspeakers completely disappear acoustically.

The Pure Stereo sweet spot is large and robust enough so that a listener sitting in it perceives a high-fidelity 3D image without having to strain to keep his head in a fixed position. It does not require any more precision in sitting than standard stereo requires for serious listening. In fact, Pure Stereo imaging is so robust that more than one listener can experience most features of the 3D image as long as they sit near the sweet spot, ideally in front or behind it. Moving a few feet to the side of the sweet spot, however, will cause the 3D image to collapse and the sound to be perceived to emanate from the loudspeakers. Therefore, listeners sitting well outside the sweet spot will hear the sound clearly, but it will lack the 3D imaging and sound equalization that the Pure Stereo system produces.

8 Does Pure Stereo require special positioning of the loudspeakers?

While Pure Stereo filters can be readily designed for a pair of loudspeakers and a sweet spot in any geometric configuration (including the standard stereo triangle configuration9), it is highly recommended that the loudspeakers be positioned in the so-called “stereo dipole” configuration, which typically has the loudspeakers only about 50 cm apart.
While this configuration may initially look odd or surprising, it is in fact a superior configuration for 3D imaging - the 3D image is more robust and less sensitive to listener head movements. It is very unlikely that audiophiles who experience Pure Stereo with this loudspeaker configuration would ever wish to switch back to the standard stereo triangle loudspeaker configuration (although that could easily be done by moving the loudspeakers apart and switching to the corresponding Pure Stereo filter).

_____________

6 Sound directivity is the extent to which loudspeakers beam the sound towards the listener instead of broadcasting it in all directions around the room.

7 It should ideally be located at a close enough distance from the loudspeakers to minimize the ratio of reflected to direct sound, since early sound reflections, i.e. those arriving 20 ms or earlier at the ears after the arrival of the direct sound, are the most detrimental to stereo imaging. This minimal distance depends on the directivity of the loudspeakers and the sound reflectivity of the listening room. In practice, the minimal distance is more than 1.5 meters.

8 Except for the rare cases where the sound source in the original soundfield happens to coincide with the location of one of the loudspeakers. In that case, the sound source will be imaged, correctly, at the location of that loudspeaker.

9 i.e. a loudspeaker half-span of 60 degrees.

https://www.princeton.edu/3D3A/Publications/Pure_Stereo.pdf


Later he implemented a head-tracking infrared camera to solve the very problem you are describing:

The BACCH-SP (...) incorporates (...):

1) Individualized tonally-transparent crosstalk cancellation filters (called BACCH® filters) designed from quick calibration (impulse response) measurements made by the listener using calibrated binaural microphones placed in the listener’s ears.

2) Real time adjustment of the 3D audio sweet spot though high-spatial-resolution tracking of the listener’s head position (even in pitch darkness) using a 3D infrared camera and advanced custom-built head tracking software.

3) 64-bit audio processing through a dedicated multi-core CPU running custom-built algorithms and optimized convolution engines.

https://www.princeton.edu/3D3A/PureStereo/Pure_Stereose4.html


With more than one user you definitely need more transducers, wave field synthesis and beamforming at each pinna:

The three main problems of 3D Audio through XTC and their recent solutions
The main impediment to the wide adoption of 3D Audio through XTC has been the huge spectral coloration that XTC filters inherently impose on the sound emitted by the loudspeakers. This impediment has been completely removed by the advent of the BACCH™ Filter. The fundamental nature of this spectral coloration, its basic features, its dependencies, and optimal methods to abate it with minimal adverse effects on XTC performance, are discussed in detail in this technical paper, which describes most basic aspects of BACCH™ filters.

The second impediment to the wide commercial adoption of 3D Audio through XTC has been the inherent existence of a "sweet-spot" in which the listener's head must be in order to perceive a true 3D image. This difficulty has been completely removed by the use of recent robust head-tracking technologies (such as the widely available Kinect IR depth sensor), which seamlessly move the sweet-spot with the head of the listener (e.g. the jss-BACCH System; where "jss" stands for "jogging sweet-spot").

The third and final impediment to the wide commercialization of 3D Audio through XTC has been the limitation of having a single sweet-spot which limits the 3D listening experience to a single listener. This limitation has been completely removed with the recent development of Dynasonix, through a 2-year collaboration between the 3D3A Lab and Cambridge Mechatronics Ltd., which allows creating multiple 3D sweet-spots for multiple listeners who are free to move while seamlessly maintaining full 3D audio imaging.
https://www.princeton.edu/3D3A/BACCH_intro.html


anyone with any knowledge in this regard wants to use it to sell you some boat anchor (black box) rather than a software based solution.


It seems Dr. Choueiri is now selling a software based solution with Bacch filter that his corporation licensed from Princeton University: https://www.theoretica.us/bacch-dsp/

Please let us know how much the company is charging and how much of each copy goes to the University as royalties.
 
Feb 25, 2017 at 3:35 PM Post #610 of 16,030
I have subscribed to their newsletter.  There doesn't seem to be a product yet, and it seems to be aimed more at recording engineers using DAWs to master binaural recordings than at audiophile playback on PCs, but we'll see.  And the head tracking seems to be accomplished by the camera on the monitor, rather than through a 9-axis IMU (which is the Edge of the Art).
 
Feb 28, 2017 at 2:02 AM Post #612 of 16,030
Yep :)
 
Am waiting for some more progress so I can order a 2nd headphone unit for the other cans.
 
Feb 28, 2017 at 8:26 PM Post #613 of 16,030
Can this product be used effectively with any headphone, or do we need the HD800?
Currently I own the Sony Z1R, HD600 and PSB M4U1; will they fare well with the A16, or will I need to invest in an HD800?
 
Feb 28, 2017 at 8:51 PM Post #614 of 16,030
Kumar,
The thought process is: the simulation is more accurate the more accurate and transparent the headphone is. For example, the HD600 would probably take EQ well to sound like the room, but that headphone's inherent slowness will still be there and add its own effect to the sound. The HD800 is an example of a pretty darn transparent and "fast" or responsive headphone, so you need something "like" that.
 
Feb 28, 2017 at 9:06 PM Post #615 of 16,030
Kumar,
The thought process is: the simulation is more accurate the more accurate and transparent the headphone is. For example, the HD600 would probably take EQ well to sound like the room, but that headphone's inherent slowness will still be there and add its own effect to the sound. The HD800 is an example of a pretty darn transparent and "fast" or responsive headphone, so you need something "like" that.


Thanks a lot
Got it. I guess Sony Z1R will do well with A16
 
