Everything you wanted to know about USB audio but were afraid to ask
Basically, when any USB device is plugged into a computer, the computer asks the device for what is called its enumeration. This is a big table of information that declares what the device can do, etc. You can look at some of this information on a PC with USBView and on Macs with USB Prober.
In Audio without drivers there are basically 3 methods of Isochronous (ISO) transmission:
1) Sync: not used much. Basically the device strictly tracks the computer on the sample rate.
2) Adaptive: here the data is buffered and the sample clock is updated every so often to match the computer.
3) Async: here the device tells the computer what rate to send the data at with a feedback packet. Kind of like flow control, a 3 byte data packet tells the computer to slow down or speed up the data flow. Most promising.
If you write a driver then you can do just about anything, but most are doing what is called Block Mode (0404 USB). The computer sends down packets and the device spills them out. Block mode is low priority and therefore can cause skipping and other dropouts, which some of you have seen.
Ok back to ISO stuff...
In the enumeration you declare what is called the rate of the SOF (Start of Frame) packet. This is a short packet, usually sent every 1ms for audio, that determines the timing from the computer in Adaptive mode. In Async mode I use this to determine how much data is in my buffer and whether I should slow down or speed up the data flow.
Typically what happens in all operating systems is that a packet is sent between each of the SOF frames. Since these happen every 1ms let's look at what we have.
Ok 44.1/16 bits. That means we need 4 bytes of data per stereo sample (32 bits, or 16 bits by 2). We are running at 44,100 samples a second, and since each SOF is 1ms we have 1000 SOFs a second, so we divide by 1000. So we have (44,100/1000)*4 bytes = 176.4 bytes per frame.
Well, it is impossible to send a fraction of a byte, or for that matter part of a 4 byte stereo sample, so what the computer does is send 9 packets of 176 (44 samples) then 1 packet of 180 (45 samples):
(176*9 + 180)/10 = 176.4
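The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular operating system schedules its packets. The function name and its parameters are made up for the example, and the real host may order the long and short packets differently.

```python
def packet_sizes(sample_rate, bytes_per_stereo_sample, n_packets):
    """Bytes sent in each 1 ms frame, carrying leftover fractional samples forward."""
    sizes = []
    leftover = 0  # accumulated samples, in units of 1/1000 of a sample
    for _ in range(n_packets):
        leftover += sample_rate           # one frame is 1/1000 of a second
        whole_samples = leftover // 1000  # only whole samples can be sent
        leftover -= whole_samples * 1000
        sizes.append(whole_samples * bytes_per_stereo_sample)
    return sizes


sizes = packet_sizes(44100, 4, 10)
print(sizes)                    # nine packets of 176 bytes, then one of 180
print(sum(sizes) / len(sizes))  # average bytes per frame: 176.4
```

At 48K the same function gives 192 bytes in every frame, which is why the odd packet only shows up at the 44.1 family of rates.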
In the case of Adaptive, which is the most common type of controller (Benchmark, most of the PCM27xx series and others), the controller takes the SOF frame, compares it to its internal clock, and then adjusts the serial stream to the DAC to make sure that the internal buffers don't underrun or overrun.
In Async mode we actually compare the input DMA pointer (iDMA) to the outgoing DMA pointer (oDMA) going to the DAC and determine if we need the computer to stream faster or slower.
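Here is a rough sketch of what that comparison could look like. It is not my firmware: the function names, the half-full target and the proportional gain are all assumptions made for the example. The 3 byte value is packed as 10.14 fixed point samples per 1ms frame, which is my understanding of the full speed feedback format in the USB Audio Class 1.0 spec.

```python
def buffer_fill(idma, odma, capacity):
    """Samples waiting in a circular buffer between the USB input and DAC output pointers."""
    return (idma - odma) % capacity


def requested_rate(nominal, fill, target, capacity, max_adjust=0.5):
    """Samples per frame to ask for: more when the buffer runs low, fewer when it fills up.

    A plain proportional nudge, assumed here for illustration only.
    """
    error = (target - fill) / capacity  # positive when the buffer is too empty
    return nominal + max_adjust * error


def encode_feedback_10_14(samples_per_frame):
    """Pack samples per 1 ms frame as 10.14 fixed point in 3 little-endian bytes."""
    value = round(samples_per_frame * (1 << 14))
    return value.to_bytes(3, "little")


fill = buffer_fill(idma=300, odma=44, capacity=1024)  # 256 samples queued
rate = requested_rate(44.1, fill, target=512, capacity=1024)
print(rate > 44.1)                        # buffer is low, so ask for more data
print(encode_feedback_10_14(44.1).hex())  # nominal 44.1 samples per frame
```

The point to notice is that nothing here touches the DAC clock. Only the number sent back to the computer changes.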
So basically the reason that Async is better than Adaptive is that in Async the clock does not change, only the rate at which the data reaches the device. This is not to say that the mode alone makes one device better than another; there are many other attributes that determine how good a device may be.
Jitter... mainly confusion and stuff.
First, there is no jitter in the USB stream like there is in SPDIF, because there is no clock riding over the data like there is in SPDIF products. What there is in all clocked audio devices, both SPDIF and USB, is what is called intrinsic jitter. Intrinsic jitter is caused by the internal clocking of the audio receiver (SPDIF, USB, Firewire, etc.), by the DAC, and by the power supply.
Much of the jitter in USB Audio devices is due to the USB receiver itself. See, many of them, including the TAS1020, TUSB3200 and PCM27xx devices, take some multiple of the required USB 12MHz clock to derive the serial clock for the DAC. Well, try to get 44.1, 48, 88.2, 96K from a 12MHz clock and you can see the problem. Many of these use clock multipliers and dividers of great complexity to do this. This causes most of the jitter that people write about when saying that USB Audio is bad. Well... not really that true.
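To put numbers on the 12MHz problem, here is a small sketch. It assumes the DAC wants a master clock of 256 times the sample rate. That is a common choice but it is my assumption for the example, not something any of the parts above require, and the names are made up.

```python
from fractions import Fraction

USB_CLOCK_HZ = 12_000_000
MCLK_MULTIPLE = 256  # assumed master clock multiple, for illustration only


def mclk_ratio(sample_rate):
    """Exact ratio of the wanted master clock to the 12 MHz USB clock."""
    return Fraction(MCLK_MULTIPLE * sample_rate, USB_CLOCK_HZ)


for fs in (44100, 48000, 88200, 96000):
    ratio = mclk_ratio(fs)
    print(fs, ratio, "whole multiple" if ratio.denominator == 1 else "needs multiply and divide")
```

Under that assumption 44.1K needs 588/625 of the USB clock and 48K needs 128/125. None of the four rates comes out as a whole multiple, which is why these chips need the multipliers and dividers in the first place.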
Also, with Async mode and either an ARM controller or the TAS1020/TUSB3200, you can feed the Master Clock (MCLK) into the device from a very low jitter clock and derive all the audio clock signals from it, lowering the jitter by a substantial amount.
Or you can do what other companies on the SPDIF side have done for a long time: use a PLL/VCXO to rederive the clock, and buffer the data with FIFOs and other things to clean up the signal between the USB Audio receiver and the DAC or processor.
Part of the problem with USB Audio at this point is that there isn't really a controller you can buy off the shelf that already implements all this stuff without programming. The main reason is the enumeration. The number of options is really staggering. Also, making a generic controller that works with anything is a task that may never pan out.
With, say, an ARM processor or the TAS1020/TUSB3200 you can program your way into some really good designs. This is no easy task, as USB is not something simple to deal with. There is a lot going on to make one of these work really well. But expect more and more companies to climb on board with this technology in the near future.
Five years ago this month I made my first USB DAC. I thought, this is silly, nobody is ever going to buy this thing. My first listen through the Cosecant and I knew it was going to be hot.