I should note something here...
When I'm talking about those "5 uS timing differences" I am NOT talking about jitter or timing errors between channels.
What I'm talking about is having the same sound recorded in both channels, with a time delay being added to one of the channels.
The "overall time resolution" of any digital recording of a continuous sine wave is essentially infinite (in practice it is limited by noise and quantization, not by the spacing between samples).
If I take a 500 Hz sine wave, delay one channel by 5 uS, and record it on a stereo CD, you will be able to easily resolve the 5 uS difference (on an oscilloscope).
This works because we can accurately reconstruct the two 500 Hz sine waves (in the two channels), and compare them.
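For anyone who wants to check the sine wave case numerically, here is a minimal Python sketch. It is only an illustration under stated assumptions: a 44.1 kHz sample rate, 16-bit rounding without dither, a 10 ms analysis window (exactly 5 cycles of 500 Hz), and a simple phase comparison standing in for "looking at it on an oscilloscope". The function names are made up for this example.

```python
import math

FS = 44100.0      # assumed CD sample rate in Hz
F = 500.0         # test tone frequency in Hz
DELAY = 5e-6      # delay added to one channel, in seconds
N = 441           # 10 ms of samples = exactly 5 cycles of 500 Hz

def record(delay):
    """Sample a full-scale 500 Hz sine delayed by `delay`, rounded to 16-bit integers."""
    return [round(32767 * math.sin(2 * math.pi * F * (n / FS - delay)))
            for n in range(N)]

def phase_lag(samples):
    """Phase lag (radians) of the tone, found by correlating with sin and cos at F."""
    i = sum(s * math.sin(2 * math.pi * F * n / FS) for n, s in enumerate(samples))
    q = sum(s * math.cos(2 * math.pi * F * n / FS) for n, s in enumerate(samples))
    return math.atan2(-q, i)

left = record(0.0)       # undelayed channel
right = record(DELAY)    # channel delayed by 5 uS

# Convert the phase difference between the channels back into a time difference.
estimated_delay = (phase_lag(right) - phase_lag(left)) / (2 * math.pi * F)
print(f"estimated delay: {estimated_delay * 1e6:.3f} uS")
```

The estimate comes from fitting a known, continuous waveform across hundreds of samples, which is why the spacing between samples is not the limit in this case.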
HOWEVER, our brain uses differences in arrival time to "calculate" location.
Assuming I start with a sound located equally in both channels, I can "move" its apparent location from left to right by adding delay to one channel or the other.
(Our brain calculates that the source is closer to one ear or the other by comparing differences in arrival times at each ear.)
Now let's assume that I start with an unreasonably abrupt impulse (let's say it is 5 uS long).
(And, yes, I can easily generate a 5 uS sound pulse using various methods.)
Even though most of the energy in that impulse will be at inaudible frequencies, enough will extend into the audible range that we will hear it as a click.
And, if I delay that click in one channel or the other, it will seem to shift locations - between left and right.
However, because that click actually falls between two samples at 44k, we will NOT be able to precisely reconstruct its time from our 44k sample rate recording.
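To put numbers on "falls between two samples", here is the arithmetic as a short Python sketch. It assumes "44k" means the 44.1 kHz CD rate; the variable names are just for illustration.

```python
FS = 44100.0           # assumed CD sample rate in Hz
PULSE_US = 5.0         # impulse length in microseconds

# Time between consecutive samples, in microseconds.
sample_period_us = 1e6 / FS

# How much of one sample interval the 5 uS impulse occupies.
pulse_fraction = PULSE_US / sample_period_us

print(f"sample period: {sample_period_us:.2f} uS")
print(f"impulse spans {pulse_fraction:.2f} of one sample interval")
```

So the whole impulse is less than a quarter of the gap between two adjacent samples.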
When we apply our band limiting, that impulse will indeed be spread out into a longer waveform that extends over multiple samples.
And, by looking carefully at that new waveform, we will be able to infer where the original impulse occurred in time.
HOWEVER:
1) in order to do so we will have to make certain assumptions about the filter we used
(we'll assume that, if there is equal amplitude in two samples, then the pulse was equidistant in time between them - but this assumption relies on our filter spreading the energy symmetrically in time)
2) the new waveform will be very different than the original
3) more importantly, unlike the original, our new waveform will have a much more gradual envelope
3a) as a result, mechanisms that rely on sensing abrupt edges of waveforms will be less able to accurately "find" the beginning edge of the impulse
3b) current research seems to strongly suggest that our brains do in fact look for those "edges"
3c) this in turn suggests that turning a sharp impulse into a more gradual band limited waveform may compromise the accuracy with which our brains can determine its exact beginning
3d) and this, in turn, suggests that doing so may reduce the accuracy with which our brains are able to utilize this particular location cue
(if the starting time of the impulse cannot be determined distinctly, then we will have the equivalent of a blurry image when we attempt to compare them)
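The band limiting and the symmetry assumption from point 1 can be sketched in Python as follows. This is a toy model, not a description of any real converter: it assumes an ideal, perfectly symmetric sinc filter, a 44.1 kHz sample rate, and a short 64 sample window, and the function names are invented for the example. Real anti-aliasing filters differ, which is exactly why the inference depends on knowing the filter.

```python
import math

FS = 44100.0   # assumed CD sample rate in Hz
N = 64         # short analysis window, in samples

def sinc(x):
    """Normalized sinc, the impulse response of an ideal band-limiting filter."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def band_limited_click(t0):
    """Samples of an ideal band-limited impulse that occurred at time t0 (seconds)."""
    return [sinc(n - t0 * FS) for n in range(N)]

def infer_click_time(samples):
    """Infer the click time from the strongest adjacent sample pair.

    Only valid if the filter really was an ideal symmetric sinc.
    """
    k = max(range(N - 1), key=lambda n: samples[n] + samples[n + 1])
    frac = samples[k + 1] / (samples[k] + samples[k + 1])
    return (k + frac) / FS

# A click exactly halfway between samples 32 and 33 gives two equal samples.
halfway = band_limited_click(32.5 / FS)

# A click 5 uS after sample 32 is spread over many samples by the filter.
click_time = 32 / FS + 5e-6
samples = band_limited_click(click_time)
spread = sum(1 for s in samples if abs(s) > 0.01 * max(samples))
inferred_time = infer_click_time(samples)

print(f"samples above 1% of peak: {spread}")
print(f"inferred offset from sample 32: {(inferred_time - 32 / FS) * 1e6:.3f} uS")
```

In this idealized case the timing can be inferred from the sample values, but only because the filter shape is known, and the stored waveform is a slowly decaying ripple rather than a sharp edge.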
The result of all this MAY be that our brains end up being less accurate in their estimation of where sound objects are located in space.
The result could be that objects seem to be in different locations, or that we perceive the location of individual instruments as being less distinct.
(A similar effect occurs with "stereoscopic 3D video"; when the various depth cues conflict, even slightly, which they often do, the image seems "less distinct and less real".)
Note that, if you accept the current "spectrum analyzer" model of the human ear (with a bunch of "detector hairs", each of which is "tuned" to a distinct frequency), then the frequency which the "top hair" responds to will determine the highest frequency we can detect as a continuous sine wave.
However, that number says nothing about the TIME RESOLUTION (how quickly, and how accurately, our brain can respond to WHEN a particular hair was excited).
Please note that I'm not specifically suggesting that this will turn out to be true... however, I don't assume that it isn't true either.
Recent research in how your brain calculates eye movement has shown that the mechanism is quite different than what we previously thought... and not especially intuitive to most people.
Therefore, I prefer NOT to make claims that rest on inferences drawn from old or incomplete information...
I agree with you... however, this thread is not a discussion about enjoying music.
It is about some very specific assertions about what is and is not "humanly perceptible".
1) No.
16/44.1 audio can resolve 5 uS timing differences easily ON CONTINUOUS SINE WAVES (and not so well under some other conditions and with some other waveforms).
2) I agree with you.
However, this thread is NOT about "whether high-res is worthwhile"...
It makes a very specific assertion (and it asserts, not that the difference is "nearly impossible to detect" but rather that the difference is "IMPOSSIBLE to detect").
In fact, the title of the thread actually makes two assertions that directly contradict each other (as does the original paper).
The original Xaph Audio paper actually asserts that the high-res version will sound audibly WORSE because of interactions between equipment.
(That claim must rest on the idea that there will be an audible difference after all.)