sajunky
Headphoneus Supremus
Perfect. You are truly a network engineer, and it shows in all the details. I have similar networking experience and can confirm that everything is 100% correct. The same goes for USB.
Factually, there are many wrong things in what you said.
I will only enumerate a few:
- The USB transfer does not have timing in the signal being transmitted. The clock of the DAC controls the timing of the USB packets that are transmitted and buffered.
- Over USB, the DATA is not an audio signal, and is not transferred to the DAC in a waveform, but in packets of binary DATA.
- Over the LAN, the sound is also transferred as binary DATA by Ethernet packets. The transfer can be synchronous or asynchronous, depending on the streaming protocol being used. In the case of an asynchronous protocol, the clock of the Ethernet network module of the endpoint (the streamer) is the master.
- Unlike a USB transfer, which is a direct transfer between two devices controlled by a single clock, on the LAN the packets pass through a multitude of devices: router, switches, possibly FMCs… and each of them has its own clock that synchronizes the streamed packets. They produce jitter in the LAN transmission.
- The DATA in USB transmission contains a rendered sound that arrives at the DAC in the form of USB packets. In LAN transmission, the sound needs to be rendered first. That's the job of the streamer, and it precedes the DAC.
- Upstream jitter is difficult to eliminate completely. In the best case, it is reduced when the sound is clocked, after being rendered. The better the clock, the lower the jitter.
- If the streamer is connected to the DAC by coaxial or I2S, the DDC of the streamer clocks the sound. If it is connected by USB, it's the clock of the DAC that clocks it.
EDIT
You may be confusing a coaxial connection with a USB connection when you say that there's a timing component in the audio that is transferred through USB… Through USB, the sound is transferred neither as PCM nor as DSD, but as USB packets of binary DATA.
I agree that both Ethernet and USB deliver the audio data stream asynchronously. Asynchronous means that the receiver uses its own clock to control the timing of the data it takes in, so there is no added jitter and no need for audio reclocking. It is like loading data from a hard drive (no matter how complicated things are on the transmission line).
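To make the asynchronous point concrete, here is a rough sketch of a receiver that drains a buffer on its own clock. It is a toy model, not how any particular DAC or streamer is implemented: the function name `simulate`, the 1 ms packet period, and the prefill depth are all assumptions chosen for illustration.

```python
import random

def simulate(arrival_jitter_ms, n_packets=200, packet_ms=1.0, prefill=20, seed=0):
    """Toy FIFO model: packets arrive with random delay, the receiver
    reads one packet per tick of its own clock.

    Returns (output_times, underruns). All parameters are illustrative.
    """
    rng = random.Random(seed)
    # Packet i is sent at i * packet_ms and delayed by up to arrival_jitter_ms.
    arrivals = [i * packet_ms + rng.uniform(0, arrival_jitter_ms)
                for i in range(n_packets)]
    # Playback starts once the first `prefill` packets are in the buffer.
    start = max(arrivals[:prefill])
    output_times = []
    underruns = 0
    for k in range(n_packets):
        tick = start + k * packet_ms      # set by the receiver's clock only
        if arrivals[k] > tick:            # packet not yet in the buffer
            underruns += 1
        output_times.append(tick)
    return output_times, underruns

clean_out, clean_underruns = simulate(arrival_jitter_ms=0.0)
jittery_out, jittery_underruns = simulate(arrival_jitter_ms=5.0)

# Spacing between output ticks, rounded to hide floating-point noise.
clean_gaps = {round(b - a, 9) for a, b in zip(clean_out, clean_out[1:])}
jittery_gaps = {round(b - a, 9) for a, b in zip(jittery_out, jittery_out[1:])}
print(clean_gaps, jittery_gaps, clean_underruns, jittery_underruns)
```

In this model the output spacing is identical with and without arrival jitter, because the ticks come from the receiver alone. The transport only matters if the buffer runs dry, which is why the prefill depth has to exceed the worst-case delay.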
An important point to understand is that clock accuracy matters at the place where a continuous audio data stream is created. For example, a network endpoint (streamer) or USB sink receives asynchronous data and then converts it to I2S, S/PDIF (or AES/EBU).
Now a question. Why (assuming perfect galvanic isolation on the network connection and asynchronous delivery) does clocking on the dirty side of the network matter? It doesn't need to be synchronised with the audio clock, but a low-jitter clock is beneficial; many have confirmed this. Using special protocols also matters.
The only explanation is the bursty nature of the load on the receiver's power supply. If network frames are small and arrive at regular intervals, the noise spreading through the power supply is reduced.
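As a back-of-the-envelope illustration of what "bursty" means here, the sketch below compares two ways of delivering the same amount of data in 100 ms. It only counts bytes handled per millisecond and says nothing about actual power-supply noise; the frame sizes, intervals, and helper names are assumptions made up for the example.

```python
def load_profile(frame_bytes, interval_ms, window_ms=100):
    """Bytes handled in each 1 ms slot when one frame arrives every interval_ms."""
    return [frame_bytes if t % interval_ms == 0 else 0 for t in range(window_ms)]

def crest_factor(profile):
    """Peak load divided by mean load; 1.0 means a perfectly even load."""
    mean = sum(profile) / len(profile)
    return max(profile) / mean

# Same total data in the window, delivered two different ways (made-up numbers).
bursty = load_profile(frame_bytes=15000, interval_ms=50)   # large bursts, rarely
regular = load_profile(frame_bytes=300, interval_ms=1)     # small frames, every ms

print(sum(bursty), sum(regular))                      # identical totals
print(crest_factor(bursty), crest_factor(regular))    # uneven vs. even load
```

The totals match, but the bursty profile has a much higher peak-to-mean ratio, which is the kind of uneven load the explanation above has in mind.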