Before asking about digital audio: READ this Digital Audio Primer

FallenAngel · Feb 18, 2009 at 11:06 AM

Hi,

I have seen and answered so many of the same questions regarding digital audio that I decided it's time for a reasonably comprehensive primer on digital audio. It seems a great many people do not understand what goes into a digital audio system, so this is truly necessary, and those who know enough to get along might just learn something along the way as well.

Scope

This primer will deal with a "standard" digital system without any digital processing or filters such as DTS, Dolby, 3DCMSS, etc. I wish to keep the scope of this article as concrete in the "2-channel headphone world" as possible; let's not get off track.

Section 1: Parts involved in a digital system

Let's start with an introduction and a description of the main parts of the digital system. ALL DIGITAL SYSTEMS WILL CONTAIN ALL OF THESE COMPONENTS, no exceptions. Please try to understand this FACT: you will not have a digital system if any one of them is missing. Yes, there are lots of "devices" (although I would not call them that) that contain many or even all of these components in a single case/enclosure/box/card/etc.

Please note that I will use the term "device" to describe the following : "A circuit, that may be standalone or part of a system or sub-system that performs a single task". This simply means that by "device" I do not mean a "standalone box" or separate object, simply a circuit or combination of circuits that are designed for the purpose of achieving a single goal.

Digital Media : A device (such as a hard disk, flash disk, CD, DVD, etc) that stores digital music on it.
Source / Transport : A device or interface that is responsible for reading from digital media and outputting a digital output stream.
Digital Receiver : A device that reads a digital input stream, restores the clock if necessary (more on this later) and sends it (usually in I2S format) to the DAC.
DAC : Digital to Analog Converter. This device reads a digital input stream and converts it to an analog output stream (whether this be a Current-Output DAC or Voltage-Output DAC)
I/V stage (Current to Voltage Converter) : A device that converts Current output from a DAC to Voltage output that can be amplified and handled down the line. *Note before you get totally confused - some DACs are "Voltage-Output" and either do not need this done or simply have the I/V stage inside the DAC chip*.
Analog Output Stage : An "amplifier" that takes the low-level signal from the DAC and amplifies it to what is called "line level", generally between 1V and 2V (although some go lower/higher than that, I'm just taking a "general medium" here). The "output stage" description is a little all-encompassing as it generally includes a low-pass filter as well as an amplifier. This also may be on the DAC chip itself.
Amplifier : A device that amplifies the analog output stream from line level to "headphone level", or "pre-amp level" because almost all headphone amps make wonderful pre-amps, although I won't get into power amps in this article.
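
To put numbers on "line level": the voltages quoted in the list above are RMS values, and they are often given in decibels relative to a reference instead. Here is a minimal sketch of that conversion, assuming the usual references of 1 V RMS for dBV and about 0.7746 V RMS for dBu. The function and constant names are made up for illustration and do not come from any particular library.

```python
import math

# Reference voltages (assumptions stated in the text above):
# dBu is referenced to the voltage that gives 1 mW into 600 ohms.
DBU_REF_VOLTS = math.sqrt(0.001 * 600)  # about 0.7746 V RMS
DBV_REF_VOLTS = 1.0                     # 1 V RMS


def volts_to_db(volts_rms, ref_volts):
    """Convert an RMS voltage to decibels relative to ref_volts."""
    return 20.0 * math.log10(volts_rms / ref_volts)


def db_to_volts(level_db, ref_volts):
    """Convert a level in decibels (relative to ref_volts) to RMS volts."""
    return ref_volts * 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    for volts in (1.0, 2.0):
        print(volts, "V RMS =",
              round(volts_to_db(volts, DBV_REF_VOLTS), 2), "dBV =",
              round(volts_to_db(volts, DBU_REF_VOLTS), 2), "dBu")
```

So the "1V to 2V" range above spans roughly 6 dB, whichever reference you prefer.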

Section 2: Connections between system parts

Now that we have handled the basic components of the digital system, let's get into some details of the stages, interfaces and interactions that go on. I feel this is most clearly described by showing the interaction between every step in the chain.

Media -> Transport : Low-Level (beyond scope of article) data reading algorithms of reading data from media.
Transport -> Digital Receiver : Here is where it gets REALLY complicated. There are quite a few ways to send digital data from one point to another (having earned a diploma which included digital communications and digital network security, I can say there's more here than most think). I will include the details in the "Digital formats" section.
Digital Receiver -> DAC : Most likely I2S, sometimes S/PDIF.
DAC -> I/V Stage : I admit to not knowing much about the I/V section of the digital chain and will simply state the simplified explanation - current of a certain amount goes in, voltage of corresponding amount comes out. This will certainly be edited once I learn more about I/V conversion.
I/V -> Analog Output Stage : This stage is very simple (unless you choose to make it complicated). This generally happens on the same circuit board and within the same case as the "DAC Unit" since most DACs cannot send their output through interconnect wires (not powerful enough - don't think this is a drawback, DACs are simply not meant to drive interconnects), so the analog output stage is simply connected through a short wire or PCB trace and is a standard analog signal.
Analog Output Stage -> Amplifier : This is where the wires generally come in; what people generally consider the "DAC portion" is completed and can now send a line-level signal to any other device such as a headphone amp, pre-amp, etc.

Section 3: Digital Formats

As I mentioned above, there are quite a few different ways to send digital audio (generally from the Source/Transport to the Digital Receiver, but sometimes directly to the DAC, depending on format and DAC input).

S/PDIF : The most commonly used digital audio format. This Wiki article has some basic info on the protocol and its applications. I suggest you read it as well as the part on Biphase mark code. A key note on S/PDIF is that it is a streaming signal without error correction, so what leaves the transport (and what's done with it before it leaves) and what gets to the receiver is a one-time transmission and must be transmitted correctly. Further to the article:

1) The clock being extracted from the signal - while BMC encoding does make extracting the clock easier, it still does not make it foolproof, as two adjacent cells at the same level may still be read incorrectly (for example, taken to be a single cell); it just makes it easier to send both the data and clock at the same time. If proper re-clocking is implemented, jitter is not a problem; otherwise, it can be. Do note that jitter can be a problem at different stages in the system, not just this transmission. Read the article, learn what it actually means. Good info on this can be found in papers by Dunn and Hawksford and possibly Pohlmann's book.
2) There are generally three ways to send S/PDIF : Optical TOSLINK, 75 Ohm Coax and 110 Ohm AES/EBU. AES/EBU is not going to be in scope of this article, but you can check out this Wiki. There is a great amount of info on S/PDIF here, it gets considerably more technical but still a good read.
3) Optical TOSLINK vs 75Ohm Coax with advantages and disadvantages:

Optical TOSLINK is not an electrical connection, so it avoids ground loops. Coax may cause ground loops if not isolated, preferably using input/output pulse transformers (this also helps timing, as pulse transformers are designed to pass precisely timed signals).

Optical TOSLINK cannot run over long distances (5m is long, 10m is getting dangerous), Coax can run for 10-15m without problems. There is an optimal length for coax, it is 3m-5m, this is due to signal timing.

Optical TOSLINK should not be bent at extreme angles as optical signals are susceptible to reflection. Coax does not have any such restrictions.
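
For those who want to see how the clock ends up inside the S/PDIF data, here is a minimal sketch of biphase mark coding as it is usually described: the line level flips at the start of every bit, and flips once more in the middle of the bit when that bit is a 1. Each bit therefore becomes two half-bit "cells". The function names are made up for illustration; a real receiver works on an analog waveform and a recovered clock, not on a tidy list of cells.

```python
def bmc_encode(bits, start_level=0):
    """Encode data bits as a list of half-bit cell levels (0 or 1)."""
    level = start_level
    cells = []
    for bit in bits:
        level ^= 1          # transition at the start of every bit
        cells.append(level)
        if bit:
            level ^= 1      # extra mid-bit transition encodes a 1
        cells.append(level)
    return cells


def bmc_decode(cells):
    """Decode half-bit cells back to bits: differing halves mean 1."""
    if len(cells) % 2:
        raise ValueError("expected an even number of half-bit cells")
    return [1 if cells[i] != cells[i + 1] else 0
            for i in range(0, len(cells), 2)]


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0]
    cells = bmc_encode(data)
    print(cells)
    print(bmc_decode(cells) == data)
```

Two things fall out of this: the line never sits at one level for more than two cells, which is what gives the receiver regular edges to lock onto, and inverting the polarity of the whole signal does not change the decoded data.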

I2S : The native interface of most DACs. There is a little info in this Wiki. Generally this is kept on the circuit board, although some transports can output this format and some DACs can accept it directly. Along with the data, there are 2 clocks sent separately, the Bit Clock and the Word Clock. This means that no clock reconstruction is needed and the signal can be very low-jitter. The master clock can sometimes be sent as well. This is also a send-and-forget standard and is subject to the same limitations as S/PDIF in terms of error correction.
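
The I2S clocks are simple multiples of the sample rate, which a few lines of arithmetic make concrete. This is a sketch under stated assumptions: 32-bit slots per channel and a master clock of 256 times the sample rate are common choices, but both vary between designs, and the function name is made up for illustration.

```python
def i2s_clocks(sample_rate_hz, bits_per_slot=32, mclk_multiple=256):
    """Return the three I2S-related clock frequencies in Hz.

    bits_per_slot and mclk_multiple are assumptions; check the datasheet
    of the actual DAC or receiver for the values it expects.
    """
    return {
        # Word clock toggles once per stereo sample (left/right select).
        "word_clock_hz": sample_rate_hz,
        # Bit clock ticks once per data bit, for both channels.
        "bit_clock_hz": sample_rate_hz * bits_per_slot * 2,
        # Master clock, when provided, is a fixed multiple of the sample rate.
        "master_clock_hz": sample_rate_hz * mclk_multiple,
    }


if __name__ == "__main__":
    for rate in (44100, 48000, 96000):
        print(rate, i2s_clocks(rate))
```

Note that the word clock and bit clock are not the same thing: the word clock runs at the sample rate, while the bit clock runs many times faster.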

USB : This gets a little interesting as there is no all-encompassing standard. Since USB is a computer connection, it is not restricted to send-and-forget. Generally the standard employed is USB Audio 1.1, as the drivers for this standard are distributed with most operating systems. USB Audio 1.1 is a streaming, non-error-corrected protocol that, as commonly implemented, is limited to 16-bit depth and a 48kHz sample rate. Since the PC is a smart device, audio over USB is only limited by the drivers and interfaces. Some devices like the EMU 0404 USB use the USB 2.0 protocol and, with their own drivers (as well as firmware on the unit), can achieve higher bit depths and sample rates. On raw bit rate alone, even USB 1.1 at 12 Mbit/s is above the roughly 9.2 Mbit/s that 24bit/192kHz stereo needs, although protocol overhead and packet-size limits eat into that; it also needs drivers to be written and USB receivers to be made to those standards.
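
The bandwidth side of the USB question is plain arithmetic. The sketch below works out the raw PCM data rate for a few formats and compares it with two full-speed (USB 1.1) figures: the 12 Mbit/s signalling rate, and a limit of 1023 bytes per 1 ms frame for a single isochronous packet. The second figure is my reading of the USB specification, so treat it as an assumption; the function and constant names are made up for illustration.

```python
FULL_SPEED_MBIT_PER_S = 12.0      # USB 1.1 full-speed signalling rate
FULL_SPEED_ISO_MAX_BYTES = 1023   # assumed max isochronous payload per 1 ms frame


def pcm_stream(sample_rate_hz, bit_depth, channels=2):
    """Raw (uncompressed) PCM data rate, ignoring all protocol overhead."""
    bytes_per_sample = (bit_depth + 7) // 8
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return {
        "mbit_per_s": bytes_per_second * 8 / 1e6,
        "bytes_per_ms": bytes_per_second / 1000,
    }


if __name__ == "__main__":
    for rate, depth in ((48000, 16), (96000, 24), (192000, 24)):
        s = pcm_stream(rate, depth)
        print(f"{depth}bit/{rate}Hz:", s,
              "under signalling rate:", s["mbit_per_s"] < FULL_SPEED_MBIT_PER_S,
              "fits one packet per frame:",
              s["bytes_per_ms"] <= FULL_SPEED_ISO_MAX_BYTES)
```

Under these assumptions, 16bit/48kHz and 24bit/96kHz stereo fit comfortably, while 24bit/192kHz stereo is below the raw signalling rate but larger than one full-speed isochronous packet per frame, so a real device would need more than raw bit rate on its side.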

There seems to be a lot of confusion regarding USB Audio with terms like "USB DAC" and without a doubt, it can be confusing when such things are said. There is no such thing as a "USB DAC", there is such a thing as a DAC that has a USB receiver, sometimes they are even on the same chip, but nevertheless, it is easier to understand that those two separate devices are there even though they may not necessarily be physically separate.

Just to really hammer this down, USB is the protocol, a USB Receiver receives the data (after it has "agreed" with the computer how it will communicate and drivers have been loaded) and the computer sends the data (in which case it acts as a transport).

Section 4: Sample digital systems

I wanted to take a little time to show a few "sample systems" and illustrate that they include all parts mentioned above. I will use only a couple of common components to illustrate this.

*If the mods/admins or component manufacturers would like me NOT to use their product in this breakdown, please send me a PM and I will take it down, but considering this info is in other threads on this site, I hope that won't be a problem.

System 1
Marantz SA8001 CD Player -> TOSLINK cable -> Stello DA100 DAC -> Analog RCA cables -> HeadAmp GS-1 -> Headphones

Digital Media : Audio CD
Source/Transport : Marantz SA8001
-> S/PDIF over Optical TOSLINK cable
Digital Receiver : TORX optical module converts Optical to wire signal and connects using wires to AKM AK4117 S/PDIF receiver inside Stello DA100
-> I2S over PCB traces (must verify)
DAC : AKM AK4395 DAC inside Stello DA100
I/V : AK4395 is Voltage-Out DAC
-> Analog signal over PCB traces
Analog Output Stage : OPA2604 Balanced-Unbalanced converter, 4x NE5534 low pass filter, discrete output stage (amplifier)
-> Analog signal over RCA cables
Amplifier : HeadAmp GS-1

System 2
Computer -> USB Cable -> Stello DA100 -> Analog RCA cables -> HeadAmp GS-1 -> Headphones
Digital Media : MP3 files on hard disk
Source/Transport : USB Host Controller on computer motherboard
-> USB Audio 1.1 Stream over USB cable
Digital Receiver : PCM2704 USB receiver inside Stello DA100
remainder is same as system 1

System 3
Computer -> ESI Juli@ sound card -> Coax S/PDIF cable -> Stello DA100 -> Analog RCA cables -> HeadAmp GS-1 -> Headphones

Digital Media : MP3 files on hard disk
Source/Transport : ESI Juli@
-> S/PDIF over Coax cable *
Digital Receiver : AKM AK4117 S/PDIF receiver inside Stello DA100
remainder is same as system 1

*NOTE: Inside the ESI Juli@, the path is a little more complicated than the simplified version above. The full details depend on design in the card but the overall picture is this:
The CPU processes some data and sends it off to the VIA Envy24HT-S DSP for finishing, through the Southbridge (ICH), and the VIA Envy24HT-S processes the data as configured by the drivers and card firmware, and then passes along via I2S to the AKM AK4114 digital transceiver to be output via coax cable.

System 4
Computer -> ESI Juli@ sound card -> Analog RCA cables -> HeadAmp GS-1 -> Headphones

Digital Media : MP3 files on hard disk
Source/Transport : ESI Juli@
-> Internal I2S over PCB traces
Digital Receiver : The Envy24HT-S DSP can output I2S directly to the DAC, thus there is no real "digital receiver"
-> Internal I2S over PCB traces
DAC : AKM AK4358 inside the ESI Juli@
remainder is same as system 1
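
The point of these breakdowns, that every system contains every stage no matter how the boxes are arranged, can also be written down as data. This is only an illustrative sketch: the stage names come from Section 1, the entries are copied from Systems 1 to 3 above, and the `missing_stages` helper is made up for this example.

```python
STAGES = [
    "Digital Media", "Source/Transport", "Digital Receiver",
    "DAC", "I/V", "Analog Output Stage", "Amplifier",
]

SYSTEM_1 = {
    "Digital Media": "Audio CD",
    "Source/Transport": "Marantz SA8001",
    "Digital Receiver": "AKM AK4117 inside Stello DA100",
    "DAC": "AKM AK4395 inside Stello DA100",
    "I/V": "not needed, AK4395 is a Voltage-Out DAC",
    "Analog Output Stage": "OPA2604, 4x NE5534, discrete output stage",
    "Amplifier": "HeadAmp GS-1",
}

# Systems 2 and 3 only change the front of the chain.
SYSTEM_2 = {
    **SYSTEM_1,
    "Digital Media": "MP3 files on hard disk",
    "Source/Transport": "USB Host Controller on computer motherboard",
    "Digital Receiver": "PCM2704 USB receiver inside Stello DA100",
}

SYSTEM_3 = {
    **SYSTEM_1,
    "Digital Media": "MP3 files on hard disk",
    "Source/Transport": "ESI Juli@",
}


def missing_stages(system):
    """Return the stages from Section 1 that a system fails to account for."""
    return [stage for stage in STAGES if stage not in system]


if __name__ == "__main__":
    for name, system in (("System 1", SYSTEM_1), ("System 2", SYSTEM_2),
                         ("System 3", SYSTEM_3)):
        print(name, "missing:", missing_stages(system))
```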

Author's notes and requests
1) Please take the time to read the entire article (especially before posting a question here).

2) A few ground rules please:

"What's better : USB or S/PDIF" is GROSSLY out of scope and purpose of this article
"What's better : Sound card or DAC", RTFM before you dare ask this question!
Whether "RFI concerns/issues" inside computers is "detrimental" to the quality of the sound will not be discussed, nor will there be discussions about "faults" of using an internal sound card as DAC vs just Transport.
Before saying anything about Jitter, you must read all articles (Wiki and otherwise) linked to in this article, thoroughly. I do not want ignorance about this commonly misunderstood topic to play a part in any discussions here.
Discussion of what is considered "audible" vs "measurable" is to be kept to a complete minimum or this rule will change to simply ban it.

3) Constructive criticism is always welcome, key word being constructive. I know I must have missed some things, please do point it out and I will update this article.

I hope this article helps alleviate the confusion that seems to circle digital audio. Thanks for reading!

FallenAngel · Feb 19, 2009 at 9:42 AM

First draft completed.

obobskivich · Feb 19, 2009 at 9:45 AM

I'm not in love with your explanation of internal DSP audio and whatnot, it isn't wrong, it's just very vague/generalistic

other than that, four thumbs up

FallenAngel · Feb 19, 2009 at 9:48 AM

Quote:

Originally Posted by obobskivich
I'm not in love with your explanation of internal DSP audio and whatnot, it isn't wrong, it's just very vague/generalistic

other than that, four thumbs up

Please elaborate on the DSP part, I did not wish to cover DSP in depth in this article, but a little detail can't hurt.

apatN · Feb 19, 2009 at 10:45 AM

Thanks for your work. I will read it all when I find some time.

obobskivich · Feb 19, 2009 at 10:58 AM

Quote:

Originally Posted by FallenAngel
Please elaborate on the DSP part, I did not wish to cover DSP in depth in this article, but a little detail can't hurt.

Quote:

Inside the ESI Juli@, the path is a little more complicated than the simplified version above. This is what really happens:
CPU processes data, sends through SouthBridge to the VIA Envy24HT-S PCI interface on the ESI Juli@. This PCI interface converts the data into an S/PDIF stream and sends it through the AKM AK4114 digital transceiver out of the card via breakout cable.

the CPU processes some data and sends it off to the DSP for finishing, through the southbridge/northbridge (I prefer MCH and ICH, but whatever), it basically "generates data for the DSP to compute", much like how the CPU generates data for your GPU to process for 3D, and the Envy24 isn't a PCI interface, it's a PCI compatible DSP which processes the data as configured by the drivers and card firmware, and then passes along via I2S to the AKM

if there is no direct digital out, instead of going to the S/PDIF Tx, it goes along I2S to the D/A or CODEC

now the Envy has a built-in S/PDIF Tx as well, however if you're using that AKM, it's going via I2S (and being converted to S/PDIF there), the advantage to the AKM is that while the Envy has a built in S/PDIF Tx, it doesn't have a built in S/PDIF Rx, and the ESI is designed to be top quality, so you want the lower jitter of the AKM for output

just because we're talking digital audio doesn't mean all digital audio is the same, there are probably half a dozen digital to digital conversions which can take place before the audio even hits a digital to analogue bit (but not always, I'm just saying you could see this depending on implementation)

nick_charles · Feb 19, 2009 at 4:46 PM

This is a worthy pursuit and I commend your effort.

Quote:

If proper re-clocking is implemented, Jitter is not a problem, otherwise, it can be. Do note that jitter can be a problem at different stages in the system, not just this transmission. Read the article, learn what it actually means.

This is contentious. Better sources for a technical description of Jitter in Audio would be papers by Dunn and Hawksford, possibly Pohlmann's book. As for the audible effect of jitter, this is a real can of worms, and in any case many knowledgeable commentators will say it is only a problem at the DAC.

The *evidence* for Jitter ever being an *audible* problem is questionable at best.

Quote:

Optical TOSLINK cannot run over long distances (5m is long, 10m is getting dangerous), Coax can run for 10-15m without problems. There is an optimal length for coax, it is 3m-5m, this is due to signal timing.

Do you mean an optical cable needs to be 3m to 5m long ?

Quote:

This is also a send-and-forget standard and falls to the same limitations as S/PDIF in terms of error correction.

Which are ?

FallenAngel · Feb 19, 2009 at 7:46 PM

Quote:

Originally Posted by obobskivich
the CPU processes some data and sends it off to the DSP for finishing, through the southbridge/northbridge (I prefer MCH and ICH, but whatever), it basically "generates data for the DSP to compute", much like how the CPU generates data for your GPU to process for 3D, and the Envy24 isn't a PCI interface, it's a PCI compatible DSP which processes the data as configured by the drivers and card firmware, and then passes along via I2S to the AKM...

I updated the article, thanks for the info (still not very detailed, but a little more so).

Quote:

Originally Posted by nick_charles
Better sources for a technical description of Jitter in Audio would be papers by Dunn and Hawksford, possibly Pohlmann's book, as for the audible effect of jitter, this is a real can of worms and in any case many knowledgeable commentators will say it is only a problem at the DAC.

Good to know, thanks. I'll toss that in.

Quote:

Originally Posted by nick_charles
The *evidence* for Jitter ever being an *audible* problem is questionable at best.

Which is why I would love to steer away from this discussion.

Quote:

Originally Posted by nick_charles
Do you mean an optical cable needs to be 3m to 5m long ?

No, but a coax cable should be.

Quote:

Originally Posted by nick_charles
Which are ?

When a signal is received, it is assumed to be correct, in correct order and perfectly timed, there is no check for errors or timing irregularities and no way to resend it (because there is no way to know it needs to be).

gregorio · Feb 19, 2009 at 10:04 PM

Some observations:

"When a signal is received, it is assumed to be correct, in correct order and perfectly timed, there is no check for errors or timing irregularities and no way to resend it (because there is no way to know it needs to be)."

I may be wrong here, but that was not my impression. AFAIK, the datastream from CD, DAT and other sources includes data used for error checking (minimum parity checking, plus more complex error correction) which is then used by the DAC. This is regardless of the method of transmission SPDIF, AES/EBU, etc. The professional DACs I've used always had error checking and I thought consumer DACs do as well?

You've mentioned bit-clock and word clock, AFAIK, these are the same thing. Word Clock is the only type of clock embedded in the metadata AFAIK.

Jitter is a can of worms! Severe jitter is easily noticeable. When we get down to moderate jitter, slight phasing issues may be noticed. At low levels, jitter becomes much more weird. In some tests with low level jitter, a slightly higher jitter was perceived (by audio professionals) as an improvement!? One of the other main problems with jitter is that there is no standard measurement point. So a piece of gear with a lower jitter spec may in fact cause more jitter than one with a higher jitter spec. This is true in both pro and consumer gear. Conclusion: Lower jitter does not necessarily mean better audio quality.

Line level = 0dBv = 0.775v (usually).

G

nick_charles · Feb 19, 2009 at 10:14 PM

Quote:

Originally Posted by FallenAngel
No, but a coax cable should be. (3M to 5M)

Puzzled, normally the shorter the better , reference ?

nick_charles · Feb 19, 2009 at 10:32 PM

Quote:

Originally Posted by gregorio
Jitter is a can of worms! Severe jitter is easily noticeable. When we get down to moderate jitter, slight phasing issues may be noticed. At low levels, jitter becomes much more weird. In some tests with low level jitter, a slightly higher jitter was perceived (by audio professionals) as an improvement!? One of the other main problems with jitter is that there is no standard measurement point. So a piece of gear with a lower jitter spec may in fact cause more jitter than one with a higher jitter spec. This is true in both pro and consumer gear. Conclusion: Lower jitter does not necessarily mean better audio quality.

Line level = 0dBv = 0.775v (usually).

G

Can you point me to the jitter listening tests you mention , can I get them from the AES library, thanks. Can you PM me with it as I do not want this thread to derail into a jitter thread, thanks.

gregorio · Feb 19, 2009 at 10:58 PM

Quote:

Originally Posted by nick_charles
Puzzled, normally the shorter the better , reference ?

Professionally 3m - 5m is the maximum length advised for an unbalanced analogue cable, although shorter is better. In the case of digital audio transmission, you can probably get away with a fair bit longer as the quality is a bit of a non-issue until the receiving system cannot resolve a zero from a one. That's why in commercial studios we tend to stick with AES/EBU (or MADI for multi-channel applications) and steer clear of optical and SPDIF.

G

FallenAngel · Feb 20, 2009 at 12:59 AM

Quote:

Originally Posted by gregorio
Some observations:

"When a signal is received, it is assumed to be correct, in correct order and perfectly timed, there is no check for errors or timing irregularities and no way to resend it (because there is no way to know it needs to be)."

I may be wrong here, but that was not my impression. AFAIK, the datastream from CD, DAT and other sources includes data used for error checking (minimum parity checking, plus more complex error correction) which is then used by the DAC. This is regardless of the method of transmission SPDIF, AES/EBU, etc. The professional DACs I've used always had error checking and I thought consumer DACs do as well?

You've mentioned bit-clock and word clock, AFAIK, these are the same thing. Word Clock is the only type of clock embedded in the metadata AFAIK.

Jitter is a can of worms! Severe jitter is easily noticeable. When we get down to moderate jitter, slight phasing issues may be noticed. At low levels, jitter becomes much more weird. In some tests with low level jitter, a slightly higher jitter was perceived (by audio professionals) as an improvement!? One of the other main problems with jitter is that there is no standard measurement point. So a piece of gear with a lower jitter spec may in fact cause more jitter than one with a higher jitter spec. This is true in both pro and consumer gear. Conclusion: Lower jitter does not necessarily mean better audio quality.

Line level = 0dBv = 0.775v (usually).

G

The S/PDIF standard is streamed and of course not completely without structure and the "words" are sent in a standard way. The data in those "words" is not CRC checked and therefore is not verified to be correct on the receiving end.
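
For reference, as the S/PDIF subframe layout is usually described, each 32-slot subframe ends in a single parity bit chosen so that time slots 4 to 31 contain an even number of ones. That is a long way from a CRC: it can flag an odd number of flipped bits, it cannot say which bit is wrong, and there is no mechanism to ask for a resend. A minimal sketch, with made-up function names and the slot numbering taken as an assumption:

```python
def with_parity(slots_4_to_30):
    """Append the even-parity bit (slot 31) to the 27 payload slots 4..30."""
    if len(slots_4_to_30) != 27:
        raise ValueError("expected 27 bits for time slots 4..30")
    parity = sum(slots_4_to_30) % 2
    return list(slots_4_to_30) + [parity]


def parity_ok(slots_4_to_31):
    """True when slots 4..31 hold an even number of ones."""
    return sum(slots_4_to_31) % 2 == 0


if __name__ == "__main__":
    sent = with_parity([1, 0, 1] + [0] * 24)

    one_error = list(sent)
    one_error[5] ^= 1          # a single flipped bit is detected
    two_errors = list(one_error)
    two_errors[9] ^= 1         # a second flipped bit hides the first

    print(parity_ok(sent), parity_ok(one_error), parity_ok(two_errors))
```

What a receiver does when the parity check fails (mute, repeat the last sample, ignore it) is up to the receiver; nothing in the stream corrects the data.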

Along with the signal there are 3 clocks:
Bit Clock
Left/Right Word Clock
Master Clock

In S/PDIF, I believe that the bit clock is what is embedded in the signal using BMC encoding.

Let us not get into audible vs inaudible jitter as this can be both objective and subjective and can quickly take things out of scope of a comprehensive introduction to digital audio.

There is no error checking or correction built into I2S or S/PDIF (including AES/EBU which is almost identical to S/PDIF, just higher voltage and speed, frames or "words" are generally the same). They are all streaming formats.

Quote:

Originally Posted by nick_charles
Puzzled, normally the shorter the better , reference ?

Quote from this article I linked earlier : "Coaxial S/PDIF connections work typically at least to 10-15 meter distances with good 75 ohm coaxial cable." The reason for this is proper timing of the rise and fall of the signal: if it's too short, rise and fall will mix together; too long and it'll seem like the signal never arrived (from what I can remember from my data communications class, should confirm though). This is for COAX only, optical has problems over long distances with reflection/refraction and a load of other issues.
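
To put rough numbers on the timing side of this, the sketch below compares how long a signal takes to travel along a coax cable with how long one S/PDIF half-bit cell lasts. It assumes a velocity factor of 0.66, which is a typical figure for solid-polyethylene coax but varies by cable, and it does not settle whether any particular length is better; the function names are made up for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def one_way_delay_ns(length_m, velocity_factor=0.66):
    """Time for a signal edge to travel the cable once, in nanoseconds.

    velocity_factor is an assumption; look it up for the actual cable.
    """
    return length_m / (velocity_factor * SPEED_OF_LIGHT_M_PER_S) * 1e9


def spdif_cell_ns(sample_rate_hz):
    """Duration of one biphase half-bit cell in nanoseconds.

    Each frame has 2 subframes of 32 time slots, and biphase mark coding
    uses two cells per slot, so there are 128 cells per sample period.
    """
    return 1e9 / (sample_rate_hz * 128)


if __name__ == "__main__":
    print("cell at 44.1kHz:", round(spdif_cell_ns(44100), 1), "ns")
    for length in (0.5, 1.5, 5.0, 10.0):
        delay = one_way_delay_ns(length)
        # A reflection off the receiver returns to it after two more trips.
        print(length, "m: one-way", round(delay, 1), "ns,",
              "reflection arrives", round(2 * delay, 1), "ns later")
```

Under these assumptions, even a 10m cable delays the signal by well under one cell, so the open question is how reflections line up with the edges the receiver is trying to time, not whether the bits arrive.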

Quote:

Originally Posted by gregorio
Professionally 3m - 5m is the maximum length advised for an unbalanced analogue cable, although shorter is better. In the case of digital audio transmission, you can probably get away with a fair bit longer as the quality is a bit of a non-issue until the receiving system cannot resolve a zero from a one. That's why in commercial studios we tend to stick with AES/EBU (or MADI for multi-channel applications) and steer clear of optical and SPDIF.

G

We are talking about digital here. With analog, I still have no problems with driving a 10m long interconnect with a proper buffer, but that's another story. Digital coax does have a limitation of approximately 10m while AES/EBU can go longer, but since XLR cables are COMPLETELY worthless in terms of keeping the 110-ohm standard, signal drop is very common even if the equipment is transformer coupled.

obobskivich · Feb 20, 2009 at 3:21 AM

lots of great contributions from nick_charles and gregorio (I've said it before, and I'll say it again, greg, you're an asset to this forum)

@ greg, I believe most consumer equipment does have some form of error correction, however that doesn't mean that it's correcting correctly

nick_charles · Feb 20, 2009 at 3:40 PM

Quote:

Originally Posted by FallenAngel
Quote from this article I linked earlier : "Coaxial S/PDIF connections work typically at least to 10-15 meter distances with good 75 ohm coaxial cable." The reason for this is proper timing of the rise and fall of the signal: if it's too short, rise and fall will mix together; too long and it'll seem like the signal never arrived (from what I can remember from my data communications class, should confirm though).

If you could please. I would have thought that the timing issue you mention would be an issue for the receiver, not the cable: that the receiver could not keep up with the data speed. I remember writing some comms software for a Hayes modem back in the 1980s and it never managed to handshake properly until a colleague inserted a 15 second delay for the modem to reset fully.

I know that Steve Nugent believes that 1.5M is a minimum for coax, but this has been subject to some, shall we say, robust debate.

Again thanks for your hard work.
