Recording Impulse Responses for Speaker Virtualization

Discussion in 'Sound Science' started by jaakkopasanen, Oct 9, 2018.
  1. jaakkopasanen
    My idea here would be not only to have this working for myself but also to create a process, a guide and tools so that pretty much anybody can do it themselves. In that sense it's not realistic to ask the random headphone enthusiast to know or to learn how to detect distortion in the impulse responses or even to go read through manufacturer documents. What I really would like to have is an almost one-click solution which processes the sine sweeps, analyzes them, gives warnings to the user if things didn't go as expected and of course produces the output impulse responses which can be used directly with different convolution software. Something like what I did with my AutoEQ project. Of course the user still needs to own the mics at minimum so it's not quite possible to have that low a level of commitment from the user.
     
    mindbomb likes this.
  2. jaakkopasanen
    Finally found a bit of time for doing measurements using the Zoom H1n instead of the motherboard's mic input. The improvement is significant. I have been testing with music from Spotify and with videos on YouTube for speech. Previously speech was a good test to notice differences between headphones and speakers but now with the new measurements even speech is very deceptive. I haven't yet done any of the proposed filtering tricks so it's just pure convolution.

    What the biggest sources of error are right now, I'm not sure. My headphones (HiFiMAN HE400S) are certainly not in the same class as my speakers (Dynaudio Focus 110). Also I'm not sure if my headphone compensation is very good. At least it doesn't have separate compensation for left and right channel. Maybe just by upgrading headphones and doing better compensation I would achieve flawless results.

    Measurement is just stereo for now but I'll try to get multi-channel working later. A disappointing discovery was that the Zoom H1n creates overdub tracks by having the original track mixed in so it's not so simple to use that. Maybe I can subtract the original sweep track from the mix. The other option is to adjust latencies between recordings by hand but that might prove to be quite difficult because a delay of a few samples can mess it up.

    Certainly this thing has huge potential. I usually don't care too much about sound stage with music but this is already now so good that I would not go back to listening music with headphones without this if I have the choice.
     
    Last edited: Nov 11, 2018
    Joe Bloggs likes this.
  3. Joe Bloggs Contributor
    Very interesting! Let me know if you get any ideas for automating this process. I have a programmer friend here working on related stuff that might help.
     
  4. jaakkopasanen
    I'm making progress. Today I recorded my first 7.0 HRIR. The 7 channels are recorded two at a time because I only have a stereo speaker setup. The first round was for front left and right in the normal listening position. The 2nd was recorded while looking 120 degrees to the right so that the left speaker becomes back left and the right speaker becomes side left. The 3rd round was the opposite, i.e. looking 120 degrees to the left, making the left and right speakers side right and back right respectively. The last round was the center channel while looking directly at the left speaker.

    PCs don't have consistent recording latency, meaning the channel delay will be arbitrary, which is not usable for binaural speaker virtualization. I did however come up with a method to correct channel delays in post-processing. This correction syncs the channels by the ear which first receives the signal: left ear for left side speakers and right ear for right side speakers. Delay between left speaker and left ear should be the same as between right speaker and right ear. I then looked up head breadth for adults on Wikipedia and made some calculations about how much sooner the signal from each speaker should arrive at the ear compared to the center of the head. The center of the head is a reference point where all speakers should have equal delay. Variance in head breadths is so small that it has next to no effect on actual channel delays. I'm using the following delays with a sampling rate of 48 kHz: 5 samples for front and back, 10 samples for center and 0 samples for side channels. This seems to be working wonders. I still need to do some testing to confirm that the break points will have good imaging (between front left and side left, front right and side right, back left and back right).
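
    The 5 / 10 / 0 sample figures above can be reproduced with a simple plane-wave model. This is only a sketch of one way to arrive at them, not necessarily the calculation used in the post: the 15 cm head breadth, the 343 m/s speed of sound, the speaker azimuths and the function name are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
HEAD_BREADTH = 0.15     # m, assumed typical adult head breadth
SAMPLE_RATE = 48000     # Hz, as stated in the post

# Assumed speaker azimuths in degrees (0 = straight ahead, positive = right)
SPEAKER_AZIMUTHS = {
    "FL": -30, "FR": 30, "C": 0,
    "SL": -90, "SR": 90, "BL": -150, "BR": 150,
}


def channel_delay_samples(azimuth_deg, fs=SAMPLE_RATE):
    """Delay at the ear nearest the speaker, relative to a side speaker."""
    radius = HEAD_BREADTH / 2
    # Plane-wave approximation: the near ear receives the wavefront
    # r * |sin(azimuth)| / c seconds before the center of the head would.
    lead = radius * abs(math.sin(math.radians(azimuth_deg))) / SPEED_OF_SOUND
    # A side speaker has the largest lead, so it gets zero added delay.
    max_lead = radius / SPEED_OF_SOUND
    return round((max_lead - lead) * fs)


delays = {name: channel_delay_samples(az) for name, az in SPEAKER_AZIMUTHS.items()}
print(delays)
```

    Under these assumptions the side channels come out at 0 samples, front and back at 5 and center at 10, which matches the values quoted above.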

    So now I have very good HRIR for movies and games. Music is so sensitive to sound signature that it still needs more work. I need to work more on headphone compensation because I think I made mistakes with it the last time. Now I have better tools for it but still it remains a mystery. I was under the impression that I should equalize headphones flat as heard by the in-ear mics but this doesn't look to be the case. I actually got good results just by using normal Harman-like target curve for Sennheiser HD 518. Fortunately I have some things I can try to improve this.

    The harshness problem I mentioned earlier might be caused by incorrect headphone compensation or insufficient sampling rate. I read somewhere that sampling rate needs to be at least 6 times as high as the highest frequency in sine sweep recording for deconvolution to work well. I have so far done my recordings at 48 kHz so this could explain something bad happening to highs.

    I created a github repository for this project: https://github.com/jaakkopasanen/Impulcifer. Maybe your friend would like to take a look. Is he working on or does he have knowledge about deconvolution algorithms? I added simple FFT deconvolution to the repository but it's not as good as Voxengo Deconvolver. I would like to learn which is the best way to go about doing deconvolution. Eventually I want to have a full pipeline from measurements to HRIR output and headphone compensation.
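
    For readers unfamiliar with the term, a "simple FFT deconvolution" can be sketched as a regularized spectral division of the recording by the sweep. This is a generic textbook approach and not the code in the repository; the function names, the regularization constant and the synthetic test impulse response are all assumptions made for illustration.

```python
import numpy as np


def exponential_sweep(f1, f2, duration, fs):
    """Logarithmic (exponential) sine sweep from f1 to f2 Hz."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * duration / rate * (np.exp(t * rate / duration) - 1))


def fft_deconvolve(recording, sweep, ir_length, reg=1e-6):
    """Estimate an impulse response by regularized spectral division."""
    n = len(recording) + len(sweep) - 1  # long enough to avoid circular wrap-around
    X = np.fft.rfft(sweep, n)
    Y = np.fft.rfft(recording, n)
    power = np.abs(X) ** 2
    # The regularization keeps the division stable where the sweep has no energy
    H = Y * np.conj(X) / (power + reg * power.max())
    return np.fft.irfft(H, n)[:ir_length]


fs = 48000
sweep = exponential_sweep(20.0, 20000.0, 1.0, fs)

# Synthetic "room": a direct sound and one inverted reflection (made-up values)
true_ir = np.zeros(512)
true_ir[100] = 0.5
true_ir[250] = -0.25

recording = np.convolve(sweep, true_ir)
estimated_ir = fft_deconvolve(recording, sweep, len(true_ir))
print(int(np.argmax(np.abs(estimated_ir))))
```

    The estimate is band-limited to the range the sweep covers, so the recovered peaks are slightly lower and wider than in the synthetic response.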
     
    aleksanderp and castleofargh like this.
  5. jaakkopasanen
    I messed up the headphone compensation I mentioned in the previous reply. I forgot to deconvolve the logarithmic sine sweep recording to an impulse response. A logarithmic sine sweep has an unbalanced frequency response so it cannot be used as is. Now I did the compensation again and got a very moderate compensation curve. This sounds a lot better although the frequency response for the left ear has a massive dip in the 8 to 10 kHz range asking for a lot of positive gain which does not do wonders for the sound. I'm using the right ear EQ for both. Not ideal but serves me fine for the time being.

    I also tested this new compensation by listening (without speaker virtualization) to music I recorded earlier with my in-ear mics and that sounds just wonderful. This too indicates that the problems with harshness are not in headphone compensation but in the impulse response estimation. Hopefully increasing sampling rate produces better results.
     
  6. Joe Bloggs Contributor
    I'm not able to understand all of what goes above but some comments:
    1. Interaural time delay for one speaker seems much more important to get right than the time delay between different speakers to the close ear, which in real life would vary wildly between different setups unless you have the speakers in a perfect circle around you.
    2. For interaural time delay I'm simply using a stereo binaural mic and recording the sweep response of both ears at the same time from one speaker sweep. This gives me accurate interaural time delays.
    3. Also, pretty sure that the recording latency isn't a variable "whatever" but some fixed "whatever" which means that if you place your sweep at a fixed time on your DAW and route it to a different speaker each time, then if you hit "record" on that and crop the same timed stereo chunk of output from each recording for deconvolution, you'd also have a set of recordings that are not only timed correctly between the two ears but also timed correctly relative to each speaker's distance from your ears, if that tickles your fancy.
    4. I hesitate to declare the same if you record long chunks of audio including multiple sweeps from different speakers then try to crop the same timed recordings relative to each sweep, because of clock drift.
    5. Assuming you're not compensating the response of your HRTF recordings from each speaker to your binaural mics in any way, yes, you want to finalize by putting headphones over your ear and record the transfer function from your headphones to your binaural mics, then neutralize that, to make your headphones simulate canalphones stuffed where the binaural mics were. But in practice I found the effect less than ideal :/

    If you add me at joe0bloggs on Skype, facebook or Telegram we could discuss and further share our findings more quickly and thoroughly :)
     
    Last edited: Dec 9, 2018
  7. jaakkopasanen
    1. True
    2. I'm doing the same
    3. Latency varies between recordings unfortunately. This is a consequence of CPU scheduling by operating system which allocates CPU cycles to different processes in non-deterministic fashion.
    4. As far as I understand, measuring all speakers in one go, assuming you have access to a 7 speaker setup, should make all channel delays sync. I don't know if clock drift would affect this in practice. Problems come when measuring speakers individually: if you have a different delay to the front left speaker than to the front right speaker, the stereo image will be unbalanced. Measuring two speakers at a time can mitigate this but there will still be problems between different measurements. I suspect what I'm doing is actually better than anything else because this way the channel delay will be synced even if the speakers are not set up in a perfect circle.
    5. This is my approach. In theory this should work perfectly because whatever the ear canal transfer function is, it is there when listening to speakers or headphones. For my most recent experiment this worked well in practice as well. One proposed alternative is to estimate the frequency response by comparing 1/3 octave band noises against a single fixed band, say around 500 Hz, and adjusting the levels to be the same loudness. This would be repeated with speakers and then with headphones. I'm going to try this at some point and report back. This could make it possible to compensate IEMs which is not possible otherwise. Ideally everything would be measured at the ear drum but that's not feasible I'm afraid.
     
  8. johnn29
    Really excited to see this come to fruition. I've been toying with all the concepts recently so much of what you've made makes sense. Currently using SXFI/ Waves/SXFI and would love to get real measurements from my own ears into the mix. There's a few companies out there that will offer that service eventually - JVC had a product called EXOFIELD at CES2018 but it's not available commercially yet.
     
  9. jaakkopasanen
    I would love to see this service being commercially available for consumers. That Exofield seems interesting, I'll have to look into it more later. There's also a Finnish startup called Hefio which seems to be doing this as well. The founder Marko Hiipakka has published studies about ear canal acoustics calibration and measuring HRTFs with particle velocity microphones at the ear canal entrance in a way that makes highly accurate estimation of the frequency response at the ear drum possible. Normally headphone calibration with the same ear canal blocking microphones is not exact because headphones change ear canal resonances and ideally those need to be accounted for.

    I bought Sennheiser HD 800 headphones and the listening fatigue I mentioned seems to have disappeared. So I guess it was mainly caused by my older headphones HiFiMAN HE400S although I need to experiment more to confirm this. I haven't yet tried higher sampling rates but that should in theory improve the impulse response estimation on high frequencies.
     
  10. jaakkopasanen
    It's been a while since my last update but the project is progressing well. I have logarithmic sine sweep measurement process and code done with phase controlled ESS and inverse filter deconvolution. I also have a headphone compensation implemented and both the speaker measurement and the headphone measurement can be done in a couple of minutes. This is a very good baseline and the results are already very good but I also have a lot of ideas to improve it. I haven't for example implemented the tracking filter for noise reduction yet and room correction is still missing though that should be done soon since I have all the algorithms for it.
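
    As background on the terms used above, inverse filter deconvolution of an exponential sine sweep (ESS) is commonly done Farina-style: the sweep is time-reversed and given an envelope that tilts its spectrum the opposite way, and convolving a recording with that filter compresses the sweep into an impulse. The sketch below shows that generic idea only. It does not include the phase control mentioned in the post, it is not taken from Impulcifer, and all names and parameter values are assumptions.

```python
import numpy as np


def exponential_sweep(f1, f2, duration, fs):
    """Exponential sine sweep (ESS) from f1 to f2 Hz."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * duration / rate * (np.exp(t * rate / duration) - 1))


def inverse_filter(sweep, f1, f2):
    """Time-reversed sweep with an envelope proportional to instantaneous frequency."""
    n = len(sweep)
    # Envelope falls from 1 (at f2) to f1/f2 (at f1), i.e. 6 dB per octave,
    # which compensates the sweep's excess low-frequency energy.
    envelope = np.exp(-np.arange(n) / n * np.log(f2 / f1))
    return sweep[::-1] * envelope


def fft_convolve(a, b):
    """Linear convolution computed through the FFT."""
    n = len(a) + len(b) - 1
    return np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)


fs = 48000
f1, f2 = 20.0, 20000.0
sweep = exponential_sweep(f1, f2, 0.5, fs)

# Simulated recording: the sweep arrives 37 samples late at 80 % level (made-up values)
recording = np.concatenate([np.zeros(37), 0.8 * sweep])

response = fft_convolve(recording, inverse_filter(sweep, f1, f2))
# With no delay the peak would sit at index len(sweep) - 1
measured_delay = int(np.argmax(np.abs(response))) - (len(sweep) - 1)
print(measured_delay)
```

    The result here is unnormalized, so only the position and shape of the peak are meaningful, not its absolute level.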

    I just finished writing the guide for doing the measurements with Audacity and processing them with Impulcifer. Eventually I would like to have a website which guides the user but for now this is a process that can be done by anyone who has the patience to read through the guide. Find the project and the guide here: https://github.com/jaakkopasanen/Impulcifer. All the ideas about improving the results can be found in the issues: https://github.com/jaakkopasanen/Impulcifer/issues

    I would be thrilled to hear experiences from others if someone here has the required recording gear available and could take a couple of minutes to try this out. I'm super hyped myself about the results so far and about the potential which all the improvements hold.
     
    Joe Bloggs likes this.
  11. johnn29
    I posted on the OOYH thread but just so people here know - it's the real deal. Once it's matured a bit I'll post a review compared to Super X-Fi which has really improved.

    The prospect of doing room correction in a virtual room sounds amazing - and that's the one thing missing for me. The simulated rooms from SXFI and OOYH have been set up better than my real room. If I could just get a flat frequency response and virtually reduce reverb it's gonna be a winner.

    Just a big shame there's no open source DTS:X or Dolby Atmos decoder on the horizon. We're never going to get a decent virtual experience of those. Dolby Atmos for Headphones does the height channels but it sounds like crap.
     
  12. johnn29
    Just thought I'd give a quick update, after a lot of experimenting I finally got it working with a full 7.1 HRIR. Used the stereo sweep and finally figured out the center channel with the X for ignore.

    The measurement process has a ton of issues due to its nature. I was pressing the inst/line button accidentally on the Behringer which caused errors. I spent a total of, I think 4 hours, to get this working over a couple days. Once the guide is fully written it won't take that long - but I was in a rush to get it working before a month off!

    So far I've been going back and forth between SXFI and Impulcifer. Impulcifer is slightly better for me. There's going to be no contest when I can fix the frequency response of my room though - that's the only thing OOYH and SXFI have going for them.

    I'm assuming that without headphone compensation I can specify a flat EQ for any set of IEMs and it'd have decent results. I'll report back on that. Obviously the binaural mics I've got have no calibration file but it could be the only way to use Impulcifer with IEMs.

    Seriously great work man! You working alone put out something better than Creative Labs and OOYH and it's not vaporware like the Realiser

    Edit: Skipping HP compensation and using a flat EQ computed via AutoEQ works amazingly well with my BT Sony WI-1000X in-ear noise cancellers. I can EQ these to flat because my pair was measured on a GRAS coupler, so I think I'm getting excellent results.

    Just remember to resample to 44.1 kHz for Bluetooth, because Impulcifer outputs at 48 kHz.
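
    Going from 48 kHz to 44.1 kHz means changing the sample count by the ratio 147/160. A minimal NumPy sketch of Fourier-domain resampling is shown below to illustrate the idea; it assumes the signal decays to silence at both ends (as an impulse response does), the function name is made up, and in practice a dedicated resampler in an audio editor would normally be used instead.

```python
from fractions import Fraction

import numpy as np


def resample_fft(signal, fs_in, fs_out):
    """Resample by truncating or zero-padding the spectrum (Fourier method)."""
    ratio = Fraction(fs_out, fs_in)  # 44100/48000 reduces to 147/160
    n_in = len(signal)
    n_out = int(round(n_in * ratio))
    spectrum = np.fft.rfft(signal)
    n_bins = n_out // 2 + 1
    out_spectrum = np.zeros(n_bins, dtype=complex)
    keep = min(n_bins, len(spectrum))
    out_spectrum[:keep] = spectrum[:keep]  # drop content above the new Nyquist
    # Scale so that the waveform amplitude is preserved
    return np.fft.irfft(out_spectrum, n_out) * (n_out / n_in)


fs_in, fs_out = 48000, 44100
tone = np.sin(2 * np.pi * 1000 * np.arange(fs_in) / fs_in)  # 1 s test tone
resampled = resample_fft(tone, fs_in, fs_out)
expected = np.sin(2 * np.pi * 1000 * np.arange(fs_out) / fs_out)
print(len(resampled), float(np.max(np.abs(resampled - expected))))
```

    When the resampled data is used as a convolution filter rather than played back as audio, its overall gain ends up lower by the same 147/160 ratio (about 0.7 dB), which is only a level change.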

    Virtually being able to take my main theater room with me on a plane is something I just can't get over!
     
    Last edited: Jul 16, 2019
    castleofargh likes this.
  13. jaakkopasanen
    I'm glad to hear that you got it working and are enjoying the results. I've been quite busy lately so I had to rush the guide and I haven't got time to write a proper guide for surround measurement.

    One option for IEMs is to use AutoEq to equalize your around ear headphones to sound like the IEMs and use that equalization profile when measuring headphone compensation with the around ear headphones. I tried this quickly by equalizing my HD 800 to the natural frequency response of CustomArt FIBAE 3. Results were obviously not as good as with HD 800 but still better than any HRIR not personalized for me. This still needs a bit more investigation to get the best out of it.
     
  14. johnn29
    I'll give that a shot. The flat EQ is sounding really good to me. I'll try playing with the 3khz band that you mentioned on Github.

    Oratory measured my XM3s and Grado GW100 so I can compare the results from the Impulcifer headphone compensation vs AutoEQ to flat with a GRAS as the source on some over-ears.
     
  15. johnn29
    So I tried it out - flat EQ vs headphone compensation. The Headphone Compensation renders a more realistic sound than EQ to flat. Although it's very close.

    I also have some extra info on the IEM for HRTF from oratory that I'll post on a GitHub thread.

    What I'm finding so strange is how your brain works. I did a bunch of testing on my laptop in a cafe between the Dolby Headphone Cinema room with a flat EQ and Impulcifer. My tests there made me conclude that I actually preferred the sound signature of the Dolby Headphone and it had a crazy realistic center channel for an artificial HRIR. Impulcifer did have more accuracy in the rear channels but it just didn't sound right. That was on Friday.

    Today, sitting in my actual theater room, where I took the measurements, it's the complete opposite. I don't like the sound of the Dolby Headphone room and the center channel just isn't right. But Impulcifer sounds amazing and like my actual speakers. I'm even enjoying music on it which did not sound right in the cafe.

    Psychoacoustics is weird. I suspect my brain knows what my theater room sounds like and is correcting for it. But when I take that same sound into an unfamiliar environment it knows that the center channel should be coming from a distance away too - whereas in the cafe I was on a laptop so Dolby's HRIR was sufficient for that.

    It makes it really hard to do traditional A vs B testing and figuring out which you prefer. I just didn't appreciate how important the room was and even the visual cues from the speakers in the room etc.
     
    arnaud likes this.