Human threshold of sound and timing
Dec 11, 2008 at 5:35 AM Thread Starter Post #1 of 16

Champ04

I've long been curious about all the various claims and arguments regarding cable audibility and jitter audibility. Recently I devised a little experiment whereby I could take a look into my own ability to "hear" these differences.
I would like to share my methods for this little experiment and get some quality feedback on it.
Let me respectfully request that all feedback be of a useful nature regardless of one's perspective. I don't care if you agree or not. Just make sure it's useful. If you feel the need to rag on people who claim to hear differences in cables or if you want to disparage objectivists' methods, please refrain from contributing.
That being said, here is what I did...

I began with a pair of Scaena 3.2 speakers.
These speakers are a line array design using 12 cone drivers aligned vertically immediately next to 9 ribbon-type tweeters. Not including the separate subwoofer, they are a two-way design.
The speakers use a 1st order crossover, which means that there is inherently minimal phase error. Thus, with careful setup, I should be able to get a relatively good step response.
For those who may not know, a step response is a measurement that, among other things, is very useful for measuring the relative timing delay between the various drivers within a particular speaker.
Below is an example of a textbook step response. Sound from all drivers arrives at the ear at exactly the same time, giving you a relatively nice right-triangle graph.
D4afig08.jpg


Here is an example of a speaker that is not time aligned. Note how the three peaks show that the tweeter leads the midrange which also leads the woofer.
Ht5fig7.jpg
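For anyone who wants to see why staggered arrivals show up as separate peaks, here is a minimal sketch in Python. It is only an illustration under loud assumptions: each driver's step response is modelled as an instant rise followed by an exponential decay, all drivers have the same level and polarity, and the arrival times and decay constant are made-up numbers, not measurements of any speaker in this thread.

```python
import math

def driver_step(n_samples, arrival, decay=20.0):
    """Idealised step response of one driver (assumed shape): silent until
    `arrival`, then an instant rise to 1.0 followed by an exponential decay."""
    return [0.0 if n < arrival else math.exp(-(n - arrival) / decay)
            for n in range(n_samples)]

def speaker_step(arrivals, n_samples=200):
    """Sum the per-driver responses as they would add at the microphone."""
    parts = [driver_step(n_samples, a) for a in arrivals]
    return [sum(values) for values in zip(*parts)]

def count_peaks(signal):
    """Count strict local maxima in the summed response."""
    return sum(1 for i in range(1, len(signal) - 1)
               if signal[i - 1] < signal[i] > signal[i + 1])

aligned = speaker_step([10, 10, 10])    # all three drivers arrive together
staggered = speaker_step([10, 30, 60])  # tweeter, then midrange, then woofer

print(count_peaks(aligned), count_peaks(staggered))  # prints: 1 3
```

The time-aligned case gives one sharp rise and a smooth decay (the "right triangle"), while the staggered case gives one peak per driver arrival.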


Any speaker with a vertical array of drivers must use either a stepped baffle or a tilted baffle in order to achieve time alignment of its drivers.
The Scaena is different in that the drivers are side by side. This allows me to affect the timing response by adjusting the relative toe-in and toe-out with respect to the listening position.

For the experiment I chose musical passages that I was very familiar with and also that contained instruments that I have experience playing, namely acoustic guitar and drums. So there was a very high degree of familiarity.

The actual experiment consisted of adjusting the toe-in/toe-out of the right channel while listening from a fixed and predetermined listening position. Both speakers began in the same horizontal plane relative to the listening position, i.e. the same distance away. Both speakers were rotated around the same fixed spot, their inside front spike. I started from a point that was more or less 30 degrees toed out from what might be visually considered a position pointed directly at the listener. I started by toeing in the speaker a quarter of an inch at a time. After a handful of trials I narrowed the adjustment to an eighth of an inch and then eventually to mere sixteenth-of-an-inch adjustments.
With each trial I listened to the same track segments, listening first for the sustain in the cymbals and guitar strings and secondly for the exquisitely subjective notion of, "That sounds more like the real thing."
The idea being that a mismatch in timing between the drivers would result in cancellations in the musical waveform by the time it hit the ear and thus reduce the audibility of natural sustain and sense of space.
I went until I noticed the sound start to deteriorate from the previous trial. I went a few more trials beyond that point, just to be sure and then back to what I felt was the best subjective performance the speaker could achieve.
I anchored the speaker and moved on to the left channel.

On the left channel I took a calibrated measurement microphone and placed it in exactly the center of the listening position. Then I adjusted the toe-in of the left channel as before, only this time I wasn't listening for anything. Rather, I took measurements at each interval until I ended up with the best step response I could manage with the speakers.

Once I found the perfect spot for the left channel I left the microphone exactly where it was and measured the step response of the right channel. I also took various impulse responses of single speakers as well as the pair.
I also took careful physical measurements of each speaker's placement relative to the listening position.

The results were quite surprising.
The physical placements of the speakers seem to have ended up as near-identical mirror images of each other relative to the listening position. The downside was that there was a certain margin of error because I could not very well measure distance and sit in the position at the same time. But I took multiple measurements to try to be as accurate as possible. I feel comfortable in reporting that the two speakers were within a quarter inch, total, of their respective mirror images. I feel as though my knowledge of geometry is strong, but it is quite difficult to define baseline points in space with a good degree of accuracy.
To try to get even more precise measurements I used the impulse response of the test to ascertain similarity.
Here is the resulting measurement using this method.
In it the top trace is the impulse response of the right channel only. The bottom trace is of the stereo pair. Other than amplitude, they appear remarkably similar. You would expect the amplitude of the pair to be roughly twice that of a single speaker.
impulse.jpg


For reference, here is an example of the impulse response taken from a stereo pair where one speaker is at a slightly different distance than the other. Note the two initial peaks. Both having the same amplitude but arriving at different times.
impulse2.jpg
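The difference between the two impulse pictures can be sketched with two ideal impulses. This is a toy model, not the measurement software used here: the 10 µs sample period, the arrival times, and the perfectly sharp unit impulses are all assumptions chosen for illustration.

```python
SAMPLE_PERIOD_US = 10.0  # assumed: 100 kHz measurement rate, 10 µs per sample

def impulse(arrival, n_samples=100):
    """Ideal unit impulse arriving at sample index `arrival`."""
    sig = [0.0] * n_samples
    sig[arrival] = 1.0
    return sig

def stereo_sum(left, right):
    """What a single microphone sees when both speakers play at once."""
    return [l + r for l, r in zip(left, right)]

def peaks(signal):
    """Return (time in µs, amplitude) for every non-zero sample."""
    return [(i * SAMPLE_PERIOD_US, v) for i, v in enumerate(signal) if v > 0.0]

matched = stereo_sum(impulse(30), impulse(30))  # both speakers equidistant
offset = stereo_sum(impulse(30), impulse(50))   # one speaker 200 µs late

print(peaks(matched))  # [(300.0, 2.0)]
print(peaks(offset))   # [(300.0, 1.0), (500.0, 1.0)]
```

Equal arrival times give one peak at twice the single-speaker amplitude; unequal arrival times give two peaks of equal amplitude separated by the delay.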


Here is an example of the step response I achieved via set up using listening only.
step.jpg


Ok, so here begins the discussion.
I do not claim any sort of "golden ears". Nor do I claim to be superhuman in any way.

I do acknowledge that I have many years experience as a "critical" listener. I have spent hundreds of hours over the last 10+ years setting up speakers in various listening spaces.
I do acknowledge that I have listened to the test tracks literally hundreds of times.
I do acknowledge a certain degree of "experience" in knowing what the actual instruments used in the test tracks sound like live.
This process took several hours to complete, and in that time I was careful to avoid listening fatigue as well as what I would call "rigid listening". In other words, before each trial I would calmly sit down, take a few deep breaths, and then focus on feeling relaxed. I would describe it as listening without holding on too tight, if that makes any sense.

In the measurements the time divisions are milliseconds. So, for example, in the second example picture of the impulse response, the time from the first vertical line to the second is 900 microseconds. That means the time from peak to peak of the response itself is somewhere between 100 and 300 microseconds, most likely right around 200 µs.
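To relate those delays to physical distance, a small Python sketch can convert an arrival-time difference into a path-length difference. The speed of sound used (343 m/s, roughly dry air at 20 °C) is an assumption; the room temperature was not given.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 °C
METRES_PER_INCH = 0.0254

def path_difference_inches(delay_us):
    """Path-length difference (inches) that produces a delay of `delay_us` µs."""
    metres = SPEED_OF_SOUND_M_S * delay_us * 1e-6
    return metres / METRES_PER_INCH

for delay_us in (100, 200, 300):
    print(f"{delay_us} µs -> {path_difference_inches(delay_us):.2f} in")
```

Under that assumption, 200 µs corresponds to a path difference of about 2.7 inches, and the 100 to 300 µs range to roughly 1.35 to 4.05 inches.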

So, what are all your thoughts on this?

The whole process was very carefully planned out and I took great precaution to be as "scientific" about it as possible.
I began with absolutely no experience with the speakers in their particular position and without a lot of previous thought given to toe-in adjustments. I adjusted the first one by ear so that there would be as few visual cues vis a vis objective measurements as possible.

Are there any biases that I may have missed?
How well does this one experience indicate the human ability (read: my ability) to hear timing on the level of a hundred microseconds or less?
Is this even significant? Have other tests shown a lower threshold?
How could I go about repeating this experiment and minimize any biases related to experience with the first?
What ramifications does this little experiment have, if any?

Thank you, to all of you who were engaged enough to read this to the end!
 
Dec 11, 2008 at 12:48 PM Post #2 of 16
That was an interesting read. Thanks for posting it. I only wish there were more hands on experiments like this.

Quote:

Originally Posted by Champ04
In the measurements the time divisions are milliseconds. So, for example, in the second picture example of impulse response the time from the first vertical line to the second is 900nanoseconds. Which means the time from peak to peak of the response itself is somewhere between 100 and 300 nanoseconds. Most likely right around 200ns.


Btw...did you mean microseconds there?
 
Dec 11, 2008 at 2:15 PM Post #3 of 16
Quote:

Originally Posted by b0dhi
Btw...did you mean microseconds there?


Crap, that's a good point. I think the milli part is right, but that means the subdivisions are micro, not nano. I must have been getting a little tired.

Thanks for the heads up. I'll go back and edit it quick.
 
Dec 11, 2008 at 9:19 PM Post #4 of 16
In your reference impulse response, does the time difference of the two initial peaks correlate with the difference in distance of the two speakers?

Did you take this measurement beforehand, and then intentionally move one speaker again after your experiment and measure again?
 
Dec 11, 2008 at 9:56 PM Post #5 of 16
Great job Champ,
This is very impressive to say the least. What this shows us is that the human ear (read: your ears) is capable of discerning minute timing differences below about 100 microseconds (judging by the chart you used as the example, with its peaks about 200 microseconds apart). This is useful in determining how sensitive human hearing is and applying this to all audio products: cables, jitter, clocks, etc. It also allows one to see how small a time sample size is needed to get an accurate representation of the smallest unit of human hearing measured in time and magnitude.

Might I ask what sample rate and bit depth the music you used was?

Again great job, truly inspiring!
Dave
 
Dec 11, 2008 at 10:35 PM Post #6 of 16
Interesting stuff, thanks for doing it. 100 microseconds (0.1 ms) timing discrimination is pretty impressive and pretty much bang on the limits that psychophysics research cites for young adults with good hearing. We can officially call you golden eared, but in a good way.
 
Dec 12, 2008 at 12:30 AM Post #7 of 16
Quote:

Here is an example of the step response I achieved via set up using listening only.


You show a graph of the step response you obtained by listening only ... how would it look using measurements? Also, was that the only graph you measured or did you work your way up to that result?

I'd like to see a series of graphs with the speakers toed in and out of the ideal position .... especially in 1/16" adjustments .... to see exactly how much change there is or in other words how critical the positioning is. I'd also like to see the absolute best graphs you could obtain using a microphone only to adjust both speakers.

If the "sweet spot" angle is that critical (1/16" adjustments), wouldn't the fact that you initially used your ears (which would effectively be two microphones spaced 6" apart) to analyze the first speaker mean that each time you moved to make an adjustment, both ears would need to be returned to the exact same position, pointed directly at the same distance/angle towards the speaker? (To do this, you'd need some sort of head clamp.) If they weren't, wouldn't they introduce their own time delay variances vs. the single stationary microphone you used for the second speaker setup? Again, if 1/16" makes a huge difference in the speaker's angle, wouldn't the same hold true at the microphone or ear position? I guess what I'm getting at is that because of the excellent dispersion properties of most quality modern speakers, perhaps your results are easy to obtain at various speaker angles and don't necessarily show that ears can distinguish minute nuances like it seems you're suggesting. What would the graphs look like if the speaker were turned 1/16", then 1/8", then 3/16", etc.? It would help to know how much leeway there is in positioning/angles.

I'm not an expert at any of this, and am not criticizing the obvious hard work you've put into your experiment, but without other measurements at other angles to compare to the 3 graphs you've shown, I have no way of knowing if you've proven anything so far. Also, it would be nice to know how far away the speakers were from the listening position, the speakers' placement in the room, as well as the room's acoustic environment.

Quote:

The results were quite surprising.
The physical placement of the speakers seem to end up with near identical mirror images of each other relative to the listening position.


I don't find it all that surprising. I think the question is "how near?" and "how far off" of these positions can you go before you start noticing degradation in the graphs? Perhaps I'm completely missing the point?
 
Dec 12, 2008 at 11:40 AM Post #8 of 16
Quote:

Originally Posted by nick_charles
Interesting stuff, thanks for doing it. 100 microseconds (0.1 ms) timing discrimination is pretty impressive and pretty much bang on the limits that psychophysics research cites for young adults with good hearing. We can officially call you golden eared, but in a good way.



Actually it's quite a bit lower than that. See:

Quote:

One of the most important binaural cues for sound localization and separation of signals is the interaural time difference (ITD). Humans can detect an ITD as small as 10-20 µs.


(2005; Neural Correlates of Behavioral Thresholds to Interaural Time Differences)

However, the discernibility of time differences depends on frequency.

I haven't confirmed this myself, but I've read that some researchers have documented ITD thresholds of 2-5 µs in some people using very high frequency pulses.
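To put the 10-20 µs figure in perspective, here is a small sketch using Woodworth's spherical-head approximation, ITD = (a / c)(θ + sin θ), for a distant source at azimuth θ. The head radius (8.75 cm) and speed of sound (343 m/s) are assumed typical values, and the formula is only an approximation of a real head. Note that it describes the timing difference between the two ears, not the timing between drivers in one speaker.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed typical adult head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 °C

def woodworth_itd_us(azimuth_deg):
    """Interaural time difference in µs for a distant source at `azimuth_deg`
    (0 = straight ahead, 90 = directly to one side), spherical-head model."""
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta)) * 1e6

for angle in (1, 2, 30, 90):
    print(f"{angle:>2} deg -> {woodworth_itd_us(angle):.1f} µs")
```

Under these assumptions, an ITD of 10-20 µs corresponds to a source only about one to two degrees off centre, while a source directly to one side gives roughly 650 µs.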
 
Dec 12, 2008 at 1:29 PM Post #9 of 16
Quote:

Originally Posted by b0dhi
Actually it's quite a bit lower than that. See:


(2005; Neural Correlates of Behavioral Thresholds to Interaural Time Differences)

However, the discernibility of time differences depends on frequency.

I haven't confirmed this myself, but I've read that some researchers have documented ITD thresholds of 2-5 µs in some people using very high frequency pulses.



Nice post!

It makes a lot of sense that the detectable time difference depends on frequency (and therefore wavelength) and magnitude, with time being the dependent variable.

Dave
 
Dec 12, 2008 at 6:33 PM Post #10 of 16
Very neat, and good read.
Very interesting speaker system, too.
 
Dec 13, 2008 at 7:32 AM Post #11 of 16
Thank you all, for your contribution.
I've had an overwhelming week. But I hope to answer all the pertinent questions as soon as I can.
Thanks again. I'll be back with some answers very soon.
 
Dec 13, 2008 at 12:47 PM Post #12 of 16
Great post!

Interesting reading indeed...
 
Dec 13, 2008 at 6:38 PM Post #13 of 16
I really enjoyed reading this also, but I have the same question as mbriant. Since you were so precise in moving the angle of the speakers, what did you do to make sure your head always returned to the exact same position? As I'm sure you're aware, moving your head only 1/100th of an inch could have negated all of the careful placing you had just done, depending on how the speakers and your listening position were configured.
 
Dec 13, 2008 at 6:46 PM Post #14 of 16
Quote:

Originally Posted by Muftobration
I really enjoyed reading this also, but I have the same question as mbriant. Since you were so precise in moving the angle of the speakers, what did you do to make sure your head always returned to the exact same position? As I'm sure you're aware, moving your head only 1/100th of an inch could have negated all of the careful placing you had just done, depending on how the speakers and your listening position were configured.


Indeed, this would have increased the number of trials necessary to get to the point where his ears could not perceive a difference; however, the only thing that really had to stay put was the microphone. He did not state the number of trials needed to get that positioning, nor did he state a time period, so he could have kept returning to the position multiple times attempting to regain it. The idea of the ear is to funnel in a wide area of sound, so a slight movement would not affect it nearly as much as would a turn of the speaker (the source of the sound). It likely would have reduced the number of trials and the time required if, say, he had a head clamp or a chin rest and a dot on the wall to focus his eyes on, but this is unnecessary because he would have achieved the same result: indistinguishable left and right channel timing differences.

Dave
 
Dec 13, 2008 at 6:58 PM Post #15 of 16
Ok, I still have to sort through some of the measured results. And I will have to take some more measurements in order to get pictures. But I can answer some questions so far.

Quote:

Originally Posted by digger945
In your reference impulse response, does the time difference of the two initial peaks correlate with the difference in distance of the two speakers?
Did you take this measurement beforehand, and then intentionally move one speaker again after your experiment and measure again?



Yes, it does correlate. The picture showing the dual peaks was actually taken after all other testing was done. I had theorized that taking an impulse of both speakers at once would ideally show a result nearly identical, with the exception of amplitude, to that of a single speaker. I was shocked that I actually achieved this the first time out. So to make sure I wasn't missing something I moved the microphone two inches to the left. This is where the picture of the dual peak comes from.
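As a rough sanity check on that correlation, the arrival-time gap for a sideways microphone move can be estimated from geometry. This sketch assumes the 8.5 ft (102 in) listening distance mentioned later in this post, a speed of sound of 343 m/s, and speakers placed 30 degrees either side of centre. The 30 degree angle is purely an assumption for illustration; the actual angle was not stated.

```python
import math

SPEED_OF_SOUND_IN_S = 343.0 / 0.0254  # inches per second (343 m/s assumed)

def arrival_gap_us(mic_offset_in, distance_in=102.0, half_angle_deg=30.0):
    """Arrival-time gap (µs) between two mirror-image speakers when the
    microphone is moved sideways by `mic_offset_in` inches from centre."""
    a = math.radians(half_angle_deg)
    x, y = distance_in * math.sin(a), distance_in * math.cos(a)
    near = math.hypot(x - mic_offset_in, y)  # speaker the mic moved towards
    far = math.hypot(x + mic_offset_in, y)   # speaker the mic moved away from
    return (far - near) / SPEED_OF_SOUND_IN_S * 1e6

print(f"{arrival_gap_us(2.0):.0f} µs")
```

With these assumed numbers a two-inch sideways move gives a gap of roughly 150 µs, the same order as the 100 to 300 µs read off the graph; a wider speaker angle would give a larger gap.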

Quote:

Originally Posted by myinitialsaredac
Might I ask what sample rate and bit depth the music you used was?


I used a 16-bit, 44.1 kHz CD-R. But the Esoteric transport was set to upsample to 176 kHz.

Quote:

Originally Posted by mbriant
You show a graph of the step response you obtained by listening only ... how would it look using measurements? Also, was that the only graph you measured or did you work your way up to that result?


The picture I posted already is of the right channel. This is the one that I adjusted by ear only. I took this measurement after settling on this position and haven't moved the speaker since. No other measurements were used on that channel.

Quote:

Originally Posted by mbriant
I'd also like to see the absolute best graphs you could obtain using a microphone only to adjust both speakers.


Below is an example of the absolute best I was able to achieve through measuring. However, it should be noted that this one came later and at a greater distance. My actual listening position is 8.5 ft away from each speaker. To get the measurement seen here I had to measure from a distance of 13 ft. This allowed all the drivers in the array to sum a little better and caused less ripple at the top of the step.
The results for the listening position look a little more choppy at the top, but you are still able to distinguish the tweeter from the mids. In the picture below the first tiny blip downward is the transition between drivers. That little blip gets deeper the more the speaker is toed out, until there are actually two distinct curves.
step2.jpg


Quote:

Originally Posted by mbriant
If the "sweet spot" angle is that critical (1/16" adjustments), wouldn't the fact that you initially used your ears to analyze the first speaker mean that each time you moved to make an adjustment, both ears would need to be returned to the exact same position, pointed directly at the same distance/angle towards the speaker?


Absolutely. This was a challenge, but not one that couldn't be overcome. I was careful to try to sit in exactly the same spot each time. This isn't very difficult with my chair, since it is narrow enough to aid in this anyway. I also experimented both with turning my head towards the individual speaker so that both ears were equidistant and with looking straight ahead as in a normal stereo listening situation. On paper it looks like this couldn't possibly be very precise. It was a concern of mine before the experiment as well.
But there must have been some degree of consistency, because changes of 1/16th of an inch at the speaker were audible at the listening position.

Unfortunately, I have to go to work now. I'll be back asap to answer more questions.

Thanks again for all the input.
 
