I have zero clue as to what you are speaking on
This means I failed to communicate effectively! Some time passes between a DAC receiving a digital signal and the corresponding analog signal coming out of it, and I endeavored to quantify that delay or latency.
That delay appears to be less than 4 ms for Schiit DACs. So if I watch a video and the sound is reproduced 75 ms later than it should be, the DAC is probably not the main culprit.
That said, swapping a DAC that has 1 ms of latency for one that has 4 ms may push the system's total latency past the threshold of detectability when it was previously just below it.
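To make that concrete, here's a rough sketch of a latency budget. Only the 1 ms and 4 ms DAC figures come from the discussion above; the other stages, their values, and the 10 ms threshold are made-up numbers purely for illustration.

```python
# Hypothetical detectability threshold, for illustration only.
THRESHOLD_MS = 10.0

# Hypothetical latencies for the rest of the playback chain (not measured).
OTHER_STAGES_MS = {"player buffering": 5.0, "transport": 3.5}


def total_with_dac(dac_ms):
    """Sum the latency of every stage in the chain, including the DAC."""
    return sum(OTHER_STAGES_MS.values()) + dac_ms


for dac_ms in (1.0, 4.0):
    total = total_with_dac(dac_ms)
    verdict = "above" if total > THRESHOLD_MS else "below"
    print(f"DAC {dac_ms} ms -> total {total} ms ({verdict} threshold)")
```

With these made-up numbers the 1 ms DAC keeps the chain at 9.5 ms, while the 4 ms DAC pushes it to 12.5 ms, even though the DAC is never the largest contributor.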
According to this, the average human reaction time is about 250 ms:
https://humanbenchmark.com/tests/reactiontime
That figure includes the latency between the software drawing green in memory and the display actually showing it (which varies dramatically with display technology), the brain processing the change and instructing a finger to tap or click, the finger actually touching the screen, the screen processing the input and making it available to the software layer, and the software reading it and calculating the time passed.
When designing software (or websites) the typical guideline is to make sure that an action has a perceivable effect at most 100 ms later, i.e. everything in that range is perceived as instantaneous. But that's for visual perception.
For audio, the threshold appears to be much lower. Around 10 ms seems to be a good baseline based on some googling, but this stuck out to me:
"That said, even before latencies reach that (roughly approximate) 10-12 ms point, some musicians may be more sensitive than others, given the nature of the musical sounds they make. Players of percussive instruments like drummers may be bothered by even small amounts of latency that others might not notice, due to the sharp attacks of drums and percussion instruments. In the studio I’ve had drummers comment on latencies of only around 6-8 ms, which most performers are oblivious to. I figured out that this was in part due to the fact that they were also hearing their notes acoustically - getting a better headphone seal and cranking the level in the cans until it dominated helped, enabling them to subconsciously compensate for any subliminally-perceived lag they felt between feeling the stick hit the drum and hearing it in the monitor mix."
https://ask.audio/articles/monitoring-latency-how-low-can-you-go
Now that's in a context where you take an action and expect an acoustic response to it. I remember playing a virtual drum set in virtual reality once and it wasn't nearly as enjoyable as I expected simply due to the obvious audio latency.
I suspect the tolerance is greater when watching a video, especially at a frame rate of 24 FPS, where about 41.7 ms go by without the image changing at all. Then again, the clock starts ticking the moment the first frame of a gunshot or explosion (anything that makes a sudden sound) appears.
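The frame-time arithmetic is easy to check. Here's a small sketch; the function names are my own, and the 75 ms figure is just the example delay from earlier in this post.

```python
def frame_duration_ms(fps):
    """Time one frame stays on screen, in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_ms, fps):
    """Express an audio delay as a number of video frames."""
    return latency_ms / frame_duration_ms(fps)


for fps in (24, 30, 60):
    print(f"{fps} FPS: {frame_duration_ms(fps):.1f} ms per frame")

# The 75 ms example delay, measured in 24 FPS frames.
print(f"75 ms at 24 FPS = {latency_in_frames(75.0, 24):.1f} frames")
```

So a 75 ms delay is a bit under two frames at 24 FPS, but more than four frames at 60 FPS.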
I imagine for dialog the tolerance is greater since the exact moment when a sound should be produced is typically not easy to determine. I find lip sync issues most noticeable when someone starts or stops speaking, less so while someone continues to speak.
The numbers out there vary considerably.
"For television applications, the Advanced Television Systems Committee recommends that audio should lead video by no more than 15 milliseconds and audio should lag video by no more than 45 milliseconds. However, the ITU performed strictly controlled tests with expert viewers and found that the threshold for detectability is -125ms to +45ms. For film, acceptable lip sync is considered to be no more than 22 milliseconds in either direction."
(https://en.m.wikipedia.org/wiki/Audio-to-video_synchronization)
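Here's a rough sketch that checks a given audio/video offset against the three windows quoted above. The sign convention is my assumption, since the quote doesn't spell it out: positive means audio leads video, negative means audio lags. The names are made up for illustration.

```python
# Offsets in ms. Assumed convention: positive = audio leads video,
# negative = audio lags video. Bounds are taken from the quote above.
WINDOWS_MS = {
    "ATSC recommendation": (-45.0, 15.0),
    "ITU detectability": (-125.0, 45.0),
    "film lip sync": (-22.0, 22.0),
}


def within_windows(offset_ms):
    """Report, per window, whether the offset falls inside its bounds."""
    return {
        name: low <= offset_ms <= high
        for name, (low, high) in WINDOWS_MS.items()
    }


# Audio arriving 75 ms late, as in the example earlier in this post.
for name, ok in within_windows(-75.0).items():
    print(f"{name}: {'inside' if ok else 'outside'}")
```

Under that assumed convention, audio that is 75 ms late falls outside the ATSC and film windows but is still inside the ITU detectability range.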
It's also important to note that standards for video transmission need to be tighter than the limits of perception, since the devices reproducing the audio and video add their own latency on top of whatever offset the signal already carries.