Hi all,
Been busy! China three weeks ago, then a week in Indonesia, then a week in the Midwest, then SF (twice!), finally getting back on this...
ITDs and ILDs change with frequency AND total SPL. It's just like sensitivity to THD: frequency, spectrum, and amplitude together tend to determine whether THD can be heard, and if so whether or not it is objectionable. ITDs as low as 100 microseconds can be important in the midrange, growing to hundreds of milliseconds in the bass ranges. It's not a linear relationship either.
But we're talking about CSDs, and that's not necessarily what an ITD/ILD would inform (ITDs and ILDs tend to be for localization). There has been a tremendous amount of research into timbre, tonality and resonances. One of the better papers was by Floyd Toole and Sean Olive, back in the late 80s I think. It had some really good guidelines and research about how resonances, even 20+ dB down, can affect the perceived timbre of an instrument, making a viola in the upper registers sound like a violin in the lower registers, for example. Gabrielsson and Tolve followed up in the mid 1990s with more research.
Essentially, what makes a viola sound like a viola, or a tenor sax sound like a tenor sax, isn't just the musical range of the instrument but the complex resonance structure the instrument creates. Thus it is more than just its frequency response (range of notes); it is the way those notes decay that builds the individual character of the instrument. And just as a transducer with a hot bass (or lean bass!) frequency response will skew what you hear, a transducer that hangs on for multiple milliseconds, when the signals of interest last only a fraction of a millisecond, will audibly color what you hear.
Actual sensitivity to CSD resonances is complex, as it depends not just on the length of the resonance, but also on its amplitude and on the details of other resonances around it. Single or relatively sparse resonances tend to be more audibly benign. As the total number of resonances increases, all of them tend to become more audible. It's almost as if the "total amount of excess signal after the event" is what matters, and I've been noodling on ways to model that, but so far it's a pretty complex problem. Suffice to say, one or two resonances that hang on no more than 2-3 periods tend to be OK (provided they are at least 18 dB down from the passband); more resonances, longer duration, or higher amplitude begin to add an audible coloration.
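As a rough illustration of that rule of thumb, here is a minimal Python sketch. The names (Resonance, likely_benign) and the hard pass/fail thresholds are assumptions made purely for illustration; it only encodes the numbers quoted above (one or two resonances, no more than 3 periods, at least 18 dB down) and is in no way a validated audibility model. It also converts "periods" into milliseconds, since the same number of periods lasts far longer in the bass than in the treble.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resonance:
    """One resonance read off a CSD plot (hypothetical container)."""
    freq_hz: float        # center frequency of the resonance
    decay_periods: float  # how many periods it hangs on after the event
    level_db: float       # level relative to passband (negative = below)

    @property
    def decay_ms(self) -> float:
        # One period lasts 1/f seconds, so N periods last N/f seconds.
        return 1000.0 * self.decay_periods / self.freq_hz


def likely_benign(resonances: List[Resonance],
                  max_count: int = 2,
                  max_periods: float = 3.0,
                  min_db_down: float = 18.0) -> bool:
    """Apply the rule of thumb: few, short, low-level resonances.

    The thresholds are taken from the post; treating them as hard
    cutoffs is an assumption of this sketch.
    """
    if len(resonances) > max_count:
        return False
    return all(r.decay_periods <= max_periods and r.level_db <= -min_db_down
               for r in resonances)


if __name__ == "__main__":
    bass = Resonance(freq_hz=100.0, decay_periods=3.0, level_db=-20.0)
    treble = Resonance(freq_hz=5000.0, decay_periods=3.0, level_db=-20.0)
    for r in (bass, treble):
        print(f"{r.freq_hz:.0f} Hz: {r.decay_periods} periods = {r.decay_ms} ms")
    print("benign by rule of thumb:", likely_benign([bass, treble]))
```

Three periods at 100 Hz works out to 30 ms, while three periods at 5 kHz is only 0.6 ms, which is why the guideline is stated in periods and not in absolute time. A real model would presumably need to weigh count, duration, and level together instead of checking each one separately.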
And this doesn't even touch on masking, where details can be hidden by such resonances, especially if the resonances are high in amplitude and long in duration...
Someone much more educated and driven than I needs to research this more, but I can safely report that such research is ongoing!