The Most Important Spec Sheet: The Human Ear
Jan 17, 2013 at 1:05 PM Post #46 of 95
Quote:
Can someone point to a web page that gives specs on jitter audibility?
What amount of dB is audible? 1 dB? .5 dB? Web cite?
Length of time for auditory memory?
What other specs would be useful?

Artifact Audibility Report
Audiophoolery
Perception - the Final Frontier
 
The audibility of deviations in frequency response depends on the frequency, volume, and (believe it or not) room acoustics unless you're using headphones. Auditory memory is good for less than one second, though there's more to it than that.
 
--Ethan
 
Jan 17, 2013 at 9:44 PM Post #48 of 95
Quote:
The audibility of deviations in frequency response depends on the frequency, volume, and (believe it or not) room acoustics unless you're using headphones. Auditory memory is good for less than one second, though there's more to it than that.
 
--Ethan

The audibility of a response deviation depends on both its magnitude and its Q (bandwidth). The lower the Q (the more spectrum affected), the less deviation is required for it to be audible; conversely, the higher the Q, the more deviation is required. Positive response deviations are somewhat more audible than negative ones, especially at higher Q. At the extreme, a very high-Q notch that is 30 dB deep can be inaudible unless it lands on a specific musical note. A peak of identical Q but with 30 dB of gain is more likely to be audible.
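To put rough numbers on "Q" here, a small sketch follows. It assumes the common -3 dB bandwidth convention (bandwidth in Hz = center frequency / Q) and the standard conversion between Q and bandwidth in octaves. The function names are made up for illustration, and not every parametric EQ defines Q the same way.

```python
import math

def bandwidth_hz(center_hz, q):
    """Bandwidth in Hz, assuming the -3 dB convention BW = f0 / Q."""
    return center_hz / q

def bandwidth_octaves(q):
    """Bandwidth in octaves for a given Q: N = 2 * asinh(1 / (2Q)) / ln(2)."""
    return 2.0 * math.asinh(1.0 / (2.0 * q)) / math.log(2.0)

# Illustrative Q values at an assumed 1 kHz center frequency.
for q in (0.7, 1.414, 10.0, 50.0):
    octaves = bandwidth_octaves(q)
    print(f"Q={q:6.3f}: {bandwidth_hz(1000.0, q):7.1f} Hz wide, "
          f"{octaves:.3f} octaves ({octaves * 12:.2f} semitones)")
```

Under these assumptions a Q of 50 at 1 kHz is only 20 Hz wide, about a third of a semitone, which fits the point that such a narrow notch matters only when a note happens to land on it.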
 
I'm not sure the term "auditory memory" is correct for what we're going for, which I think is the ability to discern differences between two sample signals when they are compared with brief dead gaps between them, as in ABX testing, where it's well known that the switching time between A, B, and X must be kept extremely short for subjects to be able to resolve small differences. Even a 250 ms gap hurts resolution significantly. I'm sure there's a paper...just don't quite know what to search for. "Auditory memory" turns up a bunch of stuff relating to memory based on audio stimulus.
 
The most widely publicized audible group delay figures come from a 1978 paper by Blauert and Laws, which put the audibility threshold at 2 ms at 1 kHz, with the minimum threshold of 1 ms at 2 kHz. A 2005 AES paper by Flanagan, Moore and Stone used impulses and found the mid-band threshold at 1.6 ms. But the audibility of GD depends not only on how much delay there is, but also on what the signal is, and on whether the group delay curve is constant or changing. Much of the work done on the effects of group delay was motivated by early digital recorders with multi-pole analog anti-aliasing and reconstruction filters, which by their nature had a fair amount of group delay. In the early digital days, mixing was done in the analog domain, so a signal could pass through many iterations of anti-alias and reconstruction filters before finally ending up as audio out of a CD player. Today, with over-sampling and high bit rates, the initial anti-aliasing filter requirements aren't as severe, and the filters are mostly realized digitally. Mixing occurs in the digital domain too, so the chances of filtering and re-filtering accumulating audible group delay have diminished.
 
We could probably safely set the threshold of audibility for group delay at 1ms for any frequency group, though the extremes would obviously have higher audibility thresholds.
 
As to inter-channel delay (differential time delay between two channels), we could probably base that figure on the Minimum Audible Angle, the smallest detectable difference in a source's position on a horizontal plane. That's a very frequency-dependent figure, ranging from about 1 degree at 1 kHz and below to 18 degrees at high frequencies (ref: "Hearing", ed. Brian C. J. Moore, Ch. 9, "Spatial Hearing and Related Phenomena" by D. Wesley Grantham). Assuming a source straight ahead, a 1 degree angle shift right or left calculates out to a 9 us interaural time difference, whereas 18 degrees is a 160 us difference. We should probably throw out the larger figure and stay with 9 us as the audible inter-channel delay threshold.
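Those two interaural figures can be reproduced with Woodworth's spherical-head approximation, ITD = (r/c)(theta + sin theta). This is only a sketch under assumed values: a head radius of 8.75 cm and a speed of sound of 343 m/s, neither of which is given in the post, so the post may well have used a different model.

```python
import math

def woodworth_itd_us(angle_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference in microseconds for a source at angle_deg
    off straight ahead, using Woodworth's spherical-head approximation.
    head_radius_m and speed_of_sound are assumed, illustrative values."""
    theta = math.radians(angle_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta)) * 1e6

# The two minimum audible angles quoted in the post.
for angle in (1.0, 18.0):
    print(f"{angle:4.0f} degrees: {woodworth_itd_us(angle):6.1f} us")
```

With these assumptions the results land close to the 9 us and 160 us quoted in the post.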
 
Jan 18, 2013 at 10:39 AM Post #49 of 95
I'm not sure the term "auditory memory" is correct for what we're going for, which I think is the ability to discern differences between two sample signals when they are compared with brief dead gaps between them, as in ABX testing, where it's well known that the switching time between A, B, and X must be kept extremely short for subjects to be able to resolve small differences. Even a 250 ms gap hurts resolution significantly. I'm sure there's a paper...just don't quite know what to search for. "Auditory memory" turns up a bunch of stuff relating to memory based on audio stimulus.


This is where I'm confused too - my understanding is based on SM and echoic memory, but that doesn't deal with ABX testing. So generally you'll see 5-10s stated for SM, but there's always a clause about how SM gets increasingly unreliable as it decays (it "fades"), and that perfect recall from SM is generally not possible. Which I'm guessing gets in the way of ABX testing.

The most widely publicized audible group delay figures come from a 1978 paper by Blauert and Laws, which put the audibility threshold at 2 ms at 1 kHz, with the minimum threshold of 1 ms at 2 kHz. A 2005 AES paper by Flanagan, Moore and Stone used impulses and found the mid-band threshold at 1.6 ms. But the audibility of GD depends not only on how much delay there is, but also on what the signal is, and on whether the group delay curve is constant or changing. Much of the work done on the effects of group delay was motivated by early digital recorders with multi-pole analog anti-aliasing and reconstruction filters, which by their nature had a fair amount of group delay. In the early digital days, mixing was done in the analog domain, so a signal could pass through many iterations of anti-alias and reconstruction filters before finally ending up as audio out of a CD player. Today, with over-sampling and high bit rates, the initial anti-aliasing filter requirements aren't as severe, and the filters are mostly realized digitally. Mixing occurs in the digital domain too, so the chances of filtering and re-filtering accumulating audible group delay have diminished.


Interesting. Thanks.

We could probably safely set the threshold of audibility for group delay at 1ms for any frequency group, though the extremes would obviously have higher audibility thresholds.


My understanding is that 1ms is not possible below a certain frequency bound - e.g. when you're dealing with say, 50hz, group delay will be at least a few ms, and generally it isn't worried about until it becomes many tens (or hundreds :ph34r:) of ms. But this is based on speaker building and drivers, not within a signal domain where "flat" is less of a problem. I also remember (reading, hearing at a presentation, who knows) that as frequency goes up, higher group delay will present more problems - that we're supposed to be more sensitive to 10ms of GD at say, 5khz, than we are at 50hz. But I'm not sure if that applies or not (I don't have a citation for this one).
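For a feel for the numbers in that reply, here is a small sketch. It assumes the speaker's low end behaves like a second-order high-pass (a common model for a sealed box, but an assumption here), whose group delay at the corner frequency works out to 2Q / (2*pi*f0). The function names and the Q of 0.7071 are illustrative choices, not measurements.

```python
import math

def corner_group_delay_ms(corner_hz, q=0.7071):
    """Group delay (ms) at the corner of a second-order high-pass: 2Q / w0.
    The second-order model and the default Q are assumptions."""
    return 2.0 * q / (2.0 * math.pi * corner_hz) * 1000.0

def delay_in_cycles(delay_ms, freq_hz):
    """Express a delay as a number of periods of a tone at freq_hz."""
    return delay_ms / 1000.0 * freq_hz

for f0 in (20.0, 50.0, 100.0):
    print(f"{f0:5.0f} Hz corner: {corner_group_delay_ms(f0):5.2f} ms at the corner")

for freq in (50.0, 5000.0):
    print(f"10 ms at {freq:6.0f} Hz = {delay_in_cycles(10.0, freq):5.1f} cycles")
```

Under that model a 50 Hz corner already gives about 4.5 ms, and 10 ms is half a cycle at 50 Hz but 50 cycles at 5 kHz.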
 
Jan 18, 2013 at 1:00 PM Post #50 of 95
Quote:
Ethan, I'm curious if you could expand on this.


You already got some other good comments, and the Perception article I linked above explains in more detail. Basically, 1) it's difficult to remember the exact tonality of a sound for very long, and 2) it's difficult to pay attention to everything going on in music at once.
 
--Ethan
 
Jan 18, 2013 at 3:49 PM Post #51 of 95
You already got some other good comments, and the Perception article I linked above explains in more detail. Basically, 1) it's difficult to remember the exact tonality of a sound for very long, and 2) it's difficult to pay attention to everything going on in music at once.

--Ethan


Gotcha. I can certainly agree with #2 just based on experience as well. :)
 
Jan 18, 2013 at 11:54 PM Post #52 of 95
This is where I'm confused too - my understanding is based on SM and echoic memory, but that doesn't deal with ABX testing. So generally you'll see 5-10s stated for SM, but there's always a clause about how SM gets increasingly unreliable as it decays (it "fades"), and that perfect recall from SM is generally not possible. Which I'm guessing gets in the way of ABX testing.

The ability to make qualitative comparisons between two very slightly different signals diminishes much more quickly. A second of silence between the two choices reduces the ability to detect differences very significantly. The greatest ability to discern slight differences occurs when the two choices are presented with an audibly seamless transition. Very different from SM and echoic memory.

My understanding is that 1ms is not possible below a certain frequency bound - e.g. when you're dealing with say, 50hz, group delay will be at least a few ms, and generally it isn't worried about until it becomes many tens (or hundreds :ph34r:) of ms. But this is based on speaker building and drivers, not within a signal domain where "flat" is less of a problem. I also remember (reading, hearing at a presentation, who knows) that as frequency goes up, higher group delay will present more problems - that we're supposed to be more sensitive to 10ms of GD at say, 5khz, than we are at 50hz. But I'm not sure if that applies or not (I don't have a citation for this one).


Got to be a little careful here. The audibility of GD as a form of distortion is not related to the amount of delay but to the fact that GD may not be constant over the full spectrum. If it were constant over the full spectrum it would just be delay, and it could be hours long and wouldn't matter. But if GD were plotted vs frequency and the result was a radically changing curve, then we have issues.

Minimal GD shouldn't be an issue with basically flat electronic systems or single-driver transducers. The fact that it's much less audible at the frequency extremes benefits multi-driver speakers with crossovers, though in active designs it's possible to include delay equalizers that flatten the GD curve if desired.
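The distinction between plain delay and group delay distortion can be sketched numerically. The code below estimates group delay as the negative slope of phase versus angular frequency for two made-up systems: a pure 1 s delay, and a first-order all-pass at 2 kHz used here as a stand-in for the summed response of a simple crossover. Both systems, the test frequencies, and the helper names are assumptions for illustration.

```python
import cmath
import math

def pure_delay(freq_hz, delay_s=1.0):
    """Frequency response of an ideal delay line (assumed 1 s long)."""
    return cmath.exp(-2j * math.pi * freq_hz * delay_s)

def allpass_1st(freq_hz, corner_hz=2000.0):
    """First-order all-pass, H(jw) = (w0 - jw) / (w0 + jw)."""
    w, w0 = 2.0 * math.pi * freq_hz, 2.0 * math.pi * corner_hz
    return complex(w0, -w) / complex(w0, w)

def group_delay_s(system, freq_hz, df=0.01):
    """Central-difference estimate of -d(phase)/d(omega), in seconds."""
    dphi = cmath.phase(system(freq_hz + df)) - cmath.phase(system(freq_hz - df))
    dphi = (dphi + math.pi) % (2.0 * math.pi) - math.pi  # undo phase wrap
    return -dphi / (2.0 * math.pi * 2.0 * df)

def spread_s(system, freqs):
    """Difference between the largest and smallest group delay over freqs."""
    delays = [group_delay_s(system, f) for f in freqs]
    return max(delays) - min(delays)

FREQS = [20.0, 200.0, 2000.0, 20000.0]
print(f"pure delay spread: {spread_s(pure_delay, FREQS) * 1e3:.6f} ms")
print(f"all-pass spread:   {spread_s(allpass_1st, FREQS) * 1e3:.6f} ms")
```

The pure delay comes out flat (spread near zero) no matter how long it is, while the all-pass varies by roughly 0.16 ms across the band, still well under the roughly 1 ms thresholds quoted earlier in the thread.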
 
Jan 19, 2013 at 9:10 AM Post #54 of 95
The ability to make qualitative comparisons between two very slightly different signals diminishes much more quickly. A second of silence between the two choices reduces the ability to detect differences very significantly. The greatest ability to discern slight differences occurs when the two choices are presented with an audibly seamless transition. Very different from SM and echoic memory.


See, this is where I'm perhaps getting hung up - it almost seems like you're trying to have a discussion external to SM/echoic memory (which is the system being relied upon here).

Got to be a little careful here. The audibility of GD as a form of distortion is not related to the amount of delay but to the fact that GD may not be constant over the full spectrum. If it were constant over the full spectrum it would just be delay, and it could be hours long and wouldn't matter. But if GD were plotted vs frequency and the result was a radically changing curve, then we have issues.


It...usually is a "radically changing curve" - run a sim or two and take a look. Again FWIR/AFAIK it's never going to be a flat line when you look at a speaker - with something like a CD player it should be pretty close, but with a speaker you won't get there (you won't ever get 100% perfect timing with a speaker).

Minimal GD shouldn't be an issue with basically flat electronic systems or single-driver transducers. The fact that it's much less audible at the frequency extremes benefits multi-driver speakers with crossovers, though in active designs it's possible to include delay equalizers that flatten the GD curve if desired.


And my understanding is that DSP is not an "all the way" fix to GD, especially with LFE systems. Single drivers usually have higher GD down low than multi-driver systems, at least IME.
 
Jan 19, 2013 at 10:48 AM Post #55 of 95
It...usually is a "radically changing curve" - run a sim or two and take a look. Again FWIR/AFAIK it's never going to be a flat line when you look at a speaker - with something like a CD player it should be pretty close, but with a speaker you won't get there (you won't ever get 100% perfect timing with a speaker).
And my understanding is that DSP is not an "all the way" fix to GD, especially with LFE systems. Single drivers usually have higher GD down low than multi-driver systems, at least IME.

There's no reason a DSP couldn't correct GD in an LFE system to below the audible threshold, especially since the same thing can be accomplished with an analog topology. 100% flat is not necessary, and tolerance for GD is quite high in LFE bands.
 
Jan 19, 2013 at 11:05 AM Post #56 of 95
See, this is where I'm perhaps getting hung up - it almost seems like you're trying to have a discussion external to SM/echoic memory (which is the system being relied upon here).

Echoic memory refers to the ability to "re-hear" sound events mentally for a few seconds, primarily the information content. The mechanism accomplishes the equivalent of a visual rescan.

That's not exactly the same mechanism as an instantaneous qualitative comparison, in which the qualitative difference emerges very quickly if the test signals are switched without a silent gap. Echoic memory of information content lasts 4-10 seconds depending on the study you look at, and can be improved by practice, for example by people using dictation skills or copying Morse code. Qualitative comparison ability is much shorter-lived, becoming measurably impaired with silent gaps as small as 100 ms, and cannot be improved AFAIK.
 
Jan 19, 2013 at 2:11 PM Post #57 of 95
Echoic memory refers to the ability to "re-hear" sound events mentally for a few seconds, primarily the information content. The mechanism accomplishes the equivalent of a visual rescan.

That's not exactly the same mechanism as an instantaneous qualitative comparison, in which the qualitative difference emerges very quickly if the test signals are switched without a silent gap. Echoic memory of information content lasts 4-10 seconds depending on the study you look at, and can be improved by practice, for example by people using dictation skills or copying Morse code. Qualitative comparison ability is much shorter-lived, becoming measurably impaired with silent gaps as small as 100 ms, and cannot be improved AFAIK.


Okay - this makes sense (I'm familiar with the parts on SM/echoic memory and the numbers there). Thanks. :beerchug:
 
Jan 28, 2013 at 11:31 AM Post #58 of 95
Quote:
 
I think it's rather misleading to think of the dynamic range of human hearing in the same way we do a piece of electronic gear.
 
Yes, we can hear down to the thermal noise floor of the air itself, and up to the threshold of pain at 130dB. But NOT AT THE SAME TIME.

 
Yes, spot on and the ear is quite poor in this regard. But it's why perceptual codecs (such as MP3) can work at all. 
 
If you want to hear a 1 kHz tone and there is an interfering tone at 900 Hz that is 30 dB higher than the signal of interest, the interferer completely masks the signal of interest. If your ear were an ADC, this 30 dB dynamic range would translate to about 5-6 bits. The ADC spec here would roughly be SFDR.
 
A modern ADC can tell the difference between a 1 kHz tone at 0 dBFS and a 999 Hz tone at -110 dBFS, something the ear could never hope to do.
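The bit counts here follow from the rule of thumb that each bit of resolution is worth about 6.02 dB (20*log10(2)). A quick sketch, ignoring the extra 1.76 dB term in the full ideal-ADC SNR formula; the function name is invented for illustration:

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB per bit

def db_to_bits(range_db):
    """Approximate number of bits needed to span range_db of dynamic range."""
    return range_db / DB_PER_BIT

# The masking range and the ADC range mentioned in the post.
for range_db in (30.0, 110.0):
    print(f"{range_db:5.0f} dB is about {db_to_bits(range_db):.1f} bits")
```

By this estimate 30 dB is roughly 5 bits, at the low end of the range given above, and 110 dB is roughly 18 bits.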
 
Jan 28, 2013 at 11:42 AM Post #59 of 95
Epic thread!
 
 
I also want to add another little quirk - the ol' noodle.
 
Besides the physical neurobiology, there is the perception of the sounds. I guess I would call it training the brain to recognize what it hears and be able to differentiate it and describe it.
 
 
Take genetic twins - train one in music, with all the details, notes, examples, etc etc
 
The other one you leave alone.
 
 
Pop some audiophile headphones/tuned rig on both and see what they both say...  (Note: results may be tainted with any hearing damage they each may have had in their lives)
 
...would be interesting.
 
