DIY Stereo Smyth Realiser
May 15, 2014 at 8:50 PM Thread Starter Post #1 of 13

dude_500
100+ Head-Fier
Joined: Nov 16, 2008 · Posts: 467 · Likes: 38
A quick divergence from DIY electrostatic projects... working on making a DIY stereo Smyth Realiser (I couldn't care less about more than 2 channels since I only listen to music, so I can get by with a lot less DSP than the 8-channel Realiser needs).
 

 
Currently my hardware is a Digilent Nexys 4 dev board with an Artix-7 100T FPGA. Hard to beat for the price: 240 18x24 multipliers and 135 36 Kb RAM blocks. I built an add-on board which receives and transmits SPDIF and also has a stereo mic input amplifier and a 24-bit ADC.
 
So far I've got all the I/O systems set up for SPDIF and the ADC, and designed/tested a 16384-sample 4-way stereo true convolution (104 Gbit/s memory bandwidth, 2.9 billion 24x24 multiplies/s... aren't FPGAs awesome?)
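For anyone wanting to prototype a long-FIR convolution on a PC before committing it to HDL, here's a minimal overlap-add FFT convolution sketch in NumPy. The block size is just for illustration; the FPGA implementation obviously works very differently:

```python
import numpy as np

def overlap_add_convolve(x, h, block=16384):
    """Convolve signal x with a long FIR kernel h using FFT blocks (overlap-add)."""
    n_fft = 2 * block                      # zero-padded FFT size
    H = np.fft.rfft(h, n_fft)              # kernel spectrum, computed once
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        Y = np.fft.rfft(seg, n_fft) * H    # fast convolution of one block
        y_seg = np.fft.irfft(Y, n_fft)
        end = min(start + n_fft, len(y))
        y[start:end] += y_seg[:end - start]  # overlap-add the block's tail
    return y
```

The result matches direct time-domain convolution (`np.convolve`) but with far fewer multiplies per output sample, which is the same trade the FPGA design exploits.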
 
Next up will be designing the room capture algorithm. This is actually not all that difficult in principle: play a log sine sweep, record it, and perform ifft(fft(measurement)/fft(test_signal)) to get the convolution impulse. Actually implementing it will be a bit of work, since I'll need to manually code the FFT library using an external RAM chip to handle a 524288-sample FFT. Fortunately the dev board comes with a 128 Mbit RAM chip. I'll store the captured impulses on an SD card.
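The ifft(fft(measurement)/fft(test_signal)) step is easy to sanity-check on a PC first. A NumPy sketch; the small `eps` term is my addition (not part of the formula above) to keep bins where the test signal has little energy from blowing up:

```python
import numpy as np

def capture_impulse(test_signal, measurement, eps=1e-10):
    """Recover the impulse response via ifft(fft(measurement)/fft(test_signal))."""
    n = len(measurement)
    T = np.fft.rfft(test_signal, n)   # test signal zero-padded to measurement length
    M = np.fft.rfft(measurement, n)
    # Regularized division: M/T rewritten so near-zero bins don't explode
    H = M * np.conj(T) / (np.abs(T) ** 2 + eps)
    return np.fft.irfft(H, n)
```

One nice property of the log sweep (Farina's method) is that harmonic distortion from the speaker lands as separate impulses ahead of the main one, so it can be windowed out of the captured response.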
 
Ultimately I'll add a head-tracker too, but that's still a ways down the road. The hardware should be simple enough, just an accelerometer on a wireless card. The software is not that bad either: I can stream a new kernel into the RAM buffer every few milliseconds in real time while the convolution is operating.
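For the kernel streaming, the usual trick to avoid clicks when the head moves is to run the old and new impulse responses in parallel for one block and crossfade their outputs. A hypothetical block-level sketch (the function and names are mine, not from the project):

```python
import numpy as np

def crossfade_kernels(x, h_old, h_new):
    """Swap convolution kernels over one block without clicks:
    convolve with both kernels and linearly fade from old to new output."""
    y_old = np.convolve(x, h_old)[:len(x)]
    y_new = np.convolve(x, h_new)[:len(x)]
    fade = np.linspace(0.0, 1.0, len(x))   # ramp from old kernel to new kernel
    return (1.0 - fade) * y_old + fade * y_new
```

The cost is one extra convolution during the transition block, which is cheap relative to running the convolver continuously.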
 
 
More to come...
 
May 17, 2014 at 8:14 AM Post #2 of 13

 
Which microphone capsules are you going to use? What is the plug-in power voltage of your microphone input?
 
The Knowles FB-AE-30209-000 microphone with the MB6022ASC-1 capsule seems to work with 10 V DC.
 
Do you know any other microphone/capsule that can be inserted deeply in the ear canal?
 
I wish you all the best with this project!
 
May 17, 2014 at 10:20 AM Post #3 of 13
I'm using these: http://www.soundprofessionals.com/cgi-bin/gold/item/SP-TFB-2
 
I don't know exactly what capsule they use, but they seem to have decent reviews and they at least report a full-range frequency response. Although I'm a bit suspicious of that, since basically all the response plots on the Sound Professionals site are copies of the WM-61 response, which is discontinued.
 
It doesn't insert deep into the ear canal, but it should capture all the head acoustics, which I believe is all that really matters for this project.
 
I have them biased at about 3 V in this case. I could go up to 10 V on the board, but I believe most electret capsules would be unhappy with 10 V (not sure; documentation is sparse on what the realistic maximum is, and I don't even know which capsule I have).
 
May 17, 2014 at 11:09 AM Post #4 of 13
There is also the Primo EM-158. You can buy it here. Just cut up a cheap headphone and DIY it.
 
I have tried this one, with matched capsules. I record people and street ambience with such microphones placed in my own ears, but it simply does not work when I play back the file: all sounds stay behind my head, like every other generic binaural recording. However, it apparently worked for other people recording with my gear; they say they get better 3D rendering on playback.
 
Perhaps the microphone must be inserted deeply into the ear canal? For a taper's point of view on this subject, see here.
Perhaps my recorder does not have good enough fidelity? It has 3 V plug-in power (maybe less when loaded) and records at 16-bit/44.1 kHz, but I do not know the quality of the internal preamp.
Would head-tracking work better for me?
 
I really do not know.
 
My reasoning for a higher diaphragm voltage: the higher the plug-in-power DC voltage, the less distortion when the capsule is exposed to a louder sine sweep, so a better signal-to-noise ratio reaches your processor.
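Rough arithmetic behind that reasoning, under the big simplification that the capsule's maximum undistorted output swing scales with the supply voltage while the noise floor stays put (real electret FET stages differ):

```python
import math

def headroom_gain_db(v_high, v_low):
    """Approximate extra headroom from a higher plug-in-power voltage,
    assuming max output swing scales with supply and noise is unchanged."""
    return 20 * math.log10(v_high / v_low)

# Raising plug-in power from 3 V to 10 V under these assumptions:
print(round(headroom_gain_db(10, 3), 1))  # about 10.5 dB
```

So even if the real gain is a fraction of that, it's plausible the higher supply helps during loud sweep measurements.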
 
There must be a reason for Smyth Research choosing a microphone that enters the ear canal.
 
May 17, 2014 at 11:22 AM Post #5 of 13
I don't think the Smyth mics really enter the canal; here's a picture of them. And since we know the mic faces outwards, it's not at the inner tip (which barely enters the canal anyway).
 
Actually, I think you want them right at the entrance of the ear canal. The reason is that the canal is still there when you wear the headphones, so you shouldn't measure the canal response. If you did, you'd be convolving it in while it's also present in reality, so the canal's response would be "double counted". Instead, you just want to measure the head transfer function, which is not present when wearing headphones.
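A toy NumPy illustration of the double-counting argument, with made-up two-tap responses (the numbers are arbitrary; only the structure matters):

```python
import numpy as np

R = np.array([1.0, 0.5])        # hypothetical head/room response (no canal)
C = np.array([1.0, 0.3])        # hypothetical ear-canal response

natural = np.convolve(R, C)     # real listening: sound passes head, then canal

# Blocked-meatus mic captures only R; headphone playback re-adds the real canal:
blocked_playback = np.convolve(R, C)              # matches natural listening

# Deep-insert mic captures R*C; playback then applies the canal a second time:
deep_playback = np.convolve(np.convolve(R, C), C) # canal "double counted"
```

Here `blocked_playback` equals `natural`, while `deep_playback` does not, which is the whole argument for measuring at the canal entrance.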
 
At least, that's my logic. I could be totally off.
 

 
May 17, 2014 at 11:55 AM Post #6 of 13
Why SOS, when you can read directly from the source:
 
 
 
(...)
 
PRIR measurement apparatus and procedure
 
Custom miniature omni-directional microphones, designed specifically for the SVS measurement procedures, are placed at the entrance to the left and right ear canals, in a blocked meatus configuration.
(...)
 

 
(...)
 
Headphone EQ

Lastly, using the same microphones in the same in-ear location, the headphone-to-pinna response is recorded and a headphone equalisation filter for an individual user is generated and stored. Headphone equalisation filters may be needed to compensate for the filtering effects of the pinna when the virtualised audio is finally presented to the listener over headphones. The degree of equalisation required will ultimately depend on the design of the headphones. Circumaural headphones will need quite aggressive equalisation, compared to in-ear type headphones that should require no pinna-related compensation. (...)
 
http://www.smyth-research.com/articles_files/SVSAES.pdf

 
May 17, 2014 at 5:37 PM Post #7 of 13
  (couldn't care less about any more than 2 channels, only listen to music - can get by with a lot less DSP than the 8-channel realiser needs).

 
Just one more thought about stereo music vs. multichannel video content. 
 
I am sure you are going to read every paragraph of Dr. Smyth's paper, but reading it again, this passage is now screaming at me:
 
(...)
 
VIRTUALISATION PROBLEMS

Conflicting aural and visual cues

Even if headphone virtualisation is acoustically accurate, it can still cause confusion if the aural and visual impressions conflict. [8] If the likely source of a sound cannot be identified visually, it may be perceived as originating from behind the listener, irrespective of auditory cues to the contrary. Dynamic head-tracking strengthens the auditory cues considerably, but may not fully resolve the confusion, particularly if sounds appear to originate in free space. Simple visible markers, such as paper speakers placed at the apparent source positions, can help to resolve the remaining audio-visual perceptual conflicts. Generally the problems associated with conflicting cues become less important as users learn to trust their ears.
 
(...)
 
http://www.smyth-research.com/articles_files/SVSAES.pdf 
 

 
People usually underestimate how much our brain relies on visual cues to make sense of spatial hearing. Multichannel content is usually linked with visual content, which may also help you better understand the auditory cues.
 
I thought that remembering the moment when I was recording my binaural files would help me place each source, but perhaps the neural networks I use to recall visual memory are not related to those of my visual cortex…
 
May 19, 2014 at 3:22 PM Post #9 of 13
  I just want to say bravo, this looks amazing.  Also you got me interested in learning how to code FPGA.  Should I learn Verilog or VHDL?  

 
I've only worked with Verilog.
 
I believe VHDL is the standard for most of the world, but US industry is almost entirely Verilog-centered. Probably not relevant to a hobbyist, but very relevant if you ever plan to use the skills at work.
 
Some people say it is harder to learn VHDL after having learned Verilog, but people generally agree that it doesn't really matter which one you learn first, and if you get really serious about FPGAs you'll surely end up learning both eventually.
 
Jan 29, 2022 at 3:48 PM Post #10 of 13
A quick divergence from DIY electrostatic projects... working on making a DIY stereo smyth realiser (...)

More to come...
Hi,
Did you ever get to complete the project?
If not, how close did you get to your goals, and what were the challenges?
Thanks,
pico
 
Feb 4, 2022 at 11:46 PM Post #13 of 13
Intriguing, but I understand every project doesn't merit being brought fully to life... as a sometimes-coder I've been intrigued by FPGAs and, in my world, what they can bring to streamers and DACs. I have an ambition to learn Verilog so I can create FIFOs to enable better reclocking for the DACs I build.
 
