Hi-fi audio signal chain -- no more sigma-delta
Dec 21, 2014 at 3:11 PM Thread Starter Post #1 of 110

I just had an idea of a complete hi-fi audio signal chain -- from the recording studio to our ears. It goes like this:
 
condenser mic -> successive approximation ADC -> PCM audio codec -> compressed file (FLAC)
 
FLAC -> R-2R dac -> headphones/speakers
 
The problem with digital audio is the spread of the sigma-delta architecture in DAC and ADC design to cut costs. The world has not yet begun to tap the full potential resolution of digital audio. Quality R-2R DACs are available, such as the Totaldac and the Aqua La Scala S2, but a more affordable NOS, filterless R-2R DAC using the PCM1704 is needed.
 
It's not that R-2R/SAR converters are better sounding, it's that sigma-delta design is theoretically flawed. Single-bit sigma-delta cannot faithfully reproduce audio signals (at the same level as 16/44 PCM) until we reach operating/sampling frequencies of 3GHz. Right now the fastest SD converter I've heard of, ESS Sabre, operates at ~100MHz. EDIT: Please refer to this thread regarding real resolution of ESS http://www.diyhifi.org/forums/viewtopic.php?f=2&t=2241
 
On the recording side, if sound is recorded with crappy sigma-delta ADCs, there is no possibility for hi-fi reproduction. This is the main lacking area. I would like someone to make a successive-approximation (SAR) analog-to-digital converter, which is the highest fidelity architecture possible, perhaps by designing a circuit around existing SAR ADC chips like the Analog Devices AD7621.
 
This should be developed and promoted widely in recording studios... a 16-bit SAR ADC can revolutionize audio in the world today... better than vinyl and all hitherto audio formats in existence.
 
To all circuit designers out there... hear my plea! I will be entertained by your responses. Thanks.

EDIT: Check out my other thread 
 ​ 
EDIT: Jump to Post 83 http://www.head-fi.org/t/747265/hi-fi-audio-signal-chain-no-more-sigma-delta/75#post_11203199
 
Dec 21, 2014 at 3:37 PM Post #2 of 110
Not sure if Sound Science is the best forum to create this thread at, given that most replies here will probably not agree with your post. In any case, oversampling delta-sigma converters are capable of reproducing CD quality audio and better. It is not a new technology now, and has been well tested over the years. The reason why R2R is disappearing is not just cost cutting, delta-sigma actually performs better if properly designed and implemented, while still being cheap.
 
Dec 21, 2014 at 3:55 PM Post #3 of 110
I agree that sigma-delta can reproduce CD audio, but its resolution is much less than R-2R. Sigma-delta performs better only in specific tests such as THD. Theoretically, sigma-delta (below GHz clock rates) is inferior to R-2R, and real-world performance is limited by theory. Sigma-delta originated as a cost-saving ploy by IC manufacturers, and attempts to improve its performance have been successful, but it simply cannot match R-2R.

I believe this is the right forum, as I'm trying to make a point about audio theory.
 
Dec 21, 2014 at 4:41 PM Post #4 of 110
you're going to make all the "DSD is analog" fans very sad if you start saying that pulse modulated signal is flawed.

 
 
I love R2R for one reason and one only, it's what helped me understand digital audio. else I think you just have to look at how many manufacturers still build them for audio to know it's not where you should put your pension funds.
yes pulse modulated signal will always add some noise, the faster the clock the smaller the noise. keeping the idea that we're dealing with waves so increasing either sample speed or sample precision we can end up with the very same resolution(just like DSD does with supposedly 1bit). 
sure magnifying that noise on a scope reading is great for advertising, next to some kick ass square waves for the very last diNOSaurs DACs, it's impressive for sure. but in reality it's nothing to fuss about.
 
all in all the industry decided to go with increased samplerate instead of sample resolution, I would guess that they had good reasons to do so(even if it's only just money). one day we'll hit a wall with speed and then manufacturers will go back to add bits(real bits like a R2R) to be able to claim superiority of something. I'm sure if you wait long enough it will come back, there are only so many things we can keep pretending to improve in audio. but by then we will be listening to 128bit/600mhz records, and all that stuff will still sound just like a good old CD.
 
Dec 21, 2014 at 4:54 PM Post #5 of 110
In reality waiting for sampling rates to increase for DSD/Sigma-delta to reach PCM resolution levels (i.e. from megahertz range to gigahertz) is going to take forever... and I want hi-fi audio now. 24-bit R2R dacs are here... but they are expensive. All we need to do is get the recording people set on SAR ADC to take the audio world as a whole to the next level.
 
Dec 21, 2014 at 6:07 PM Post #6 of 110
Originally Posted by m3_arun
 
Sigma-delta cannot faithfully reproduce audio signals (at the same level as 16/44 PCM) until we reach operating/sampling frequencies of 3GHz.

 
This would only be the case with a very naive and sub-optimal design that basically outputs a simple pulse width modulated square wave, as I guess the figure of 3 GHz comes from multiplying 44100 Hz by 65535. This is analogous to halftone dither in image processing, whereas noise shaping is more like error diffusion (e.g. Floyd-Steinberg) dither, which is more efficient at the same resolution (see these example pictures). By the way, even then 3 GHz would not be correct, because modern delta-sigma converters are multi-bit.
 
In reality, a one-bit stream at a 44100 * 256 Hz sample rate can have quite a bit better than 8-bit resolution in the audio band. That is because the mathematically correct way to extract the audio information at the original 44100 Hz sample rate is not simply dividing it into separate 256-sample blocks and averaging them, but rather lowpass filtering it with an impulse response of sin(x*PI/256)/(x*PI) (where x is the distance in samples as a signed integer), and then taking only every 256th sample. This means that, for example, a pattern of 111000 is a slightly different audio signal than 101010 or 100011, even though all have a weight of 3. Noise shaping does not use a simple static PWM-like pattern; it dithers the input signal with uncorrelated noise, and moves most of the quantization error out of the audio band by using a (filtered) feedback loop.
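To make the difference between the two decimation methods concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular DAC or ADC works: the modulator is a plain first-order error-feedback noise shaper without dither (simpler than the dithered, filtered loops described above), the test tone is 1 kHz at half scale, and the sinc filter is truncated to 16 * 256 samples on each side with a Hann window. All function names are made up for the example.

```python
import numpy as np

OSR = 256                  # oversampling ratio from the post
FS_OUT = 44100             # target sample rate, Hz
FS_IN = FS_OUT * OSR       # 1-bit stream rate, Hz (about 11.29 MHz)
HALF = 16 * OSR            # filter half-length in input samples (assumption)
FREQ = 1000.0              # test tone, Hz (assumption)

def modulate(x):
    """First-order 1-bit noise shaper in error-feedback form, no dither."""
    bits = np.empty(len(x))
    err = 0.0
    for i, sample in enumerate(x):
        u = sample - err                    # subtract the previous quantization error
        bits[i] = 1.0 if u >= 0 else -1.0   # 1-bit quantizer with levels -1 and +1
        err = bits[i] - u                   # error made on this sample
    return bits

def sinc_decimate(bits):
    """Lowpass with a windowed sin(x*PI/OSR)/(x*PI), keep every OSR-th sample."""
    n = np.arange(-HALF, HALF + 1)
    h = np.sinc(n / OSR) / OSR * np.hanning(len(n))
    count = (len(bits) - len(h)) // OSR + 1
    return np.array([h @ bits[k * OSR:k * OSR + len(h)] for k in range(count)])

def block_average(bits):
    """Naive method: average each block of OSR samples."""
    count = len(bits) // OSR
    return bits[:count * OSR].reshape(count, OSR).mean(axis=1)

def snr_db(y, freq):
    """Fit a sine of known frequency (plus DC) and compare it with the residual."""
    t = np.arange(len(y)) / FS_OUT
    basis = np.column_stack([np.sin(2 * np.pi * freq * t),
                             np.cos(2 * np.pi * freq * t),
                             np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
    residual = y - basis @ coef
    tone = basis[:, :2] @ coef[:2]
    return 10 * np.log10(np.mean(tone ** 2) / np.mean(residual ** 2))

t = np.arange(600 * OSR) / FS_IN
bits = modulate(0.5 * np.sin(2 * np.pi * FREQ * t))
snr_sinc = snr_db(sinc_decimate(bits), FREQ)
snr_block = snr_db(block_average(bits), FREQ)
print(f"sinc lowpass + decimate: {snr_sinc:.1f} dB")
print(f"block averaging:         {snr_block:.1f} dB")
```

The same 1-bit stream goes into both decimators, so any gap between the two printed figures comes only from how the audio band is extracted, not from the modulator. A higher-order or dithered modulator would change the absolute numbers.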
 
  Sigma delta performs better only in specific tests such as THD, etc. Theoretically sigma delta (below GHz clock rate) is inferior to R-2R, and real world performance is limited by theory.

 
Actually, it performs better in more tests than just THD. Even in difference extraction tests using complex music (of which examples have been posted on this forum in the past, even in one of the threads linked in my signature), there is no evidence of unexpected large increase in noise and/or distortion. A well designed delta-sigma DAC can indeed perform as advertised, and not just with sine waves. The theory that it requires GHz clock frequencies for adequate performance is flawed (in fact, it is the first time I see that claim), just like some popular theories regarding the use of negative feedback in amplifiers, or even the basics of digital audio. The audiophile industry likes to invent pseudo-science to be able to sell products that try to fix things that have not been broken for years, if not decades.
 
Dec 21, 2014 at 6:25 PM Post #7 of 110
  you're going to make all the "DSD is analog" fans very sad if you start saying that pulse modulated signal is flawed.

 
Well, a theory that "proves" that their 2.8224 MHz/1-bit SACDs have only 6-bit audio resolution (comparable to AM radio) is definitely something to be worried about.
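For reference, the arithmetic behind the "3 GHz" and "6-bit" figures can be sketched as follows. This only reproduces the naive pulse-counting model that the posts above argue is wrong; the function names are hypothetical.

```python
import math

CD_RATE = 44100  # Hz

def naive_clock_hz(bits, base_rate=CD_RATE):
    """Clock needed if every PCM sample were spelled out as 2**bits - 1 pulse slots."""
    return base_rate * (2 ** bits - 1)

def naive_bits(stream_rate_hz, base_rate=CD_RATE):
    """Resolution the pulse-counting model would assign to a 1-bit stream."""
    return math.log2(stream_rate_hz / base_rate)

print(naive_clock_hz(16))       # 2890093500, i.e. about 2.9 GHz
print(naive_bits(2_822_400))    # 6.0 for a 2.8224 MHz SACD stream
```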
 
Dec 21, 2014 at 6:35 PM Post #8 of 110
Actually you are right. Close the thread! Thank you stv014 for the explanation.

So the converter is fine, but the concept of a binary-coded audio codec (DSD) is just wasteful. Why would you need 2.8 or 5.6 MHz data to describe a signal with 20 kHz bandwidth?
 
Dec 21, 2014 at 6:41 PM Post #9 of 110
  you're going to make all the "DSD is analog" fans very sad if you start saying that pulse modulated signal is flawed.

 

 
Did you mean to say "pulse density modulation", as doesn't PCM also have "pulse" and "modulation" in the acronym?
 
Dec 21, 2014 at 7:38 PM Post #10 of 110
 
  you're going to make all the "DSD is analog" fans very sad if you start saying that pulse modulated signal is flawed.

 

 
Did you mean to say "pulse density modulation", as doesn't PCM also have "pulse" and "modulation" in the acronym?

 I mentioned DSD because it was the extreme example of what the OP was against. going as low as 1bit, yet still having as much resolution as high res pcm. but yeah I could have been a little more precise as pretty much anything is pulse modulated, be it in time or amplitude.
 
Dec 21, 2014 at 8:49 PM Post #11 of 110
   I mentioned DSD because it was the extreme example of what the OP was against. going as low as 1bit, yet still having as much resolution as high res pcm. but yeah I could have been a little more precise as pretty much anything is pulse modulated, be it in time or amplitude.

 
I gotcha. And yes, much as I don't like DSD/SigDelt b/c I can't cleanly apply the PCM theory that makes sense to me to it, in the end it's really all about how well the reconstruction of the DAC matches the theoretical waveform that should come out of a perfect one in the audible passband. I'd have to imagine that the major DACs of the day do a pretty good job, R2R or not.
 
Dec 22, 2014 at 5:24 AM Post #12 of 110
  Why would you need 2.8 or 5.6 MHz data to describe a signal with 20 kHz bandwidth?

 
Because CD quality audio has a bit rate of 44100*16 = 705600 bps / channel, and with 1-bit encoding, that is the theoretical minimum that is required to avoid loss of information. Additionally, SACD is more comparable to 96 kHz/18-20 bit PCM, so it is already about 1.8 Mbps of information per channel. Encoding it as a noise shaped 1-bit stream is not perfectly efficient either, which increases the required sample rate to a few MHz in practice.
 
96/24 format PCM has a bit rate of ~2.3 Mbps/channel, and higher dynamic range than SACD over its 48 kHz bandwidth. What makes the latter popular among audiophiles is that it does not have a hard band limit at a relatively low ultrasonic frequency. Although practical implementations might still use a ~50 kHz lowpass filter anyway, and since DSD is not well suited to digital processing, it is not unlikely that many commercial SACDs have actually been converted from PCM.
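The per-channel bit rates quoted above can be checked with a few lines of Python. The format labels are just names for this example, and the DSD figure is the raw 1-bit stream rate of 2.8224 MHz.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Raw bit rate of one PCM channel in bits per second."""
    return sample_rate_hz * bits_per_sample

rates = {
    "CD 44.1 kHz / 16-bit": pcm_bit_rate(44100, 16),
    "PCM 96 kHz / 18-bit": pcm_bit_rate(96000, 18),
    "PCM 96 kHz / 20-bit": pcm_bit_rate(96000, 20),
    "PCM 96 kHz / 24-bit": pcm_bit_rate(96000, 24),
    "DSD 2.8224 MHz / 1-bit": pcm_bit_rate(2_822_400, 1),
}

for name, bps in rates.items():
    print(f"{name:24s} {bps:>9d} bps  ({bps / 1e6:.2f} Mbps)")
```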
 
Jan 5, 2015 at 4:37 PM Post #13 of 110
I would like to add something to this discussion; I believe stv014 is right about the first part, that I forgot to take into account the multi-bit nature of modern delta-sigma converters. However, after reading this article:
 
http://www.positive-feedback.com/Issue65/dac.htm

and doing some independent research, I have come to the conclusion that my original ideas still stand; that filterless R2R design is theoretically optimal for PCM audio reproduction.
 
If we take into account multi-bit converters, say 6-bit, the required frequency for an SD converter to decode a 16/44.1 kHz PCM signal would be divided by 2^6. So instead of 3 GHz, 6-bit SD converters at about 45 MHz can convert CD audio perfectly. This means the ESS 9018, for example, which is capable of running at around 100 MHz, is probably the only SD converter that has barely 16-bit resolution.
 
EDIT: I've found that the ESS Sabre runs at about 8.5 MHz for 44.1 kHz PCM audio.

To refute the claim that separate "samples" in a PDM (pulse-density modulated) signal have different "values" and count towards the bit resolution of the DAC, we can look at an example:
 
Say we have a sequence 001 001 001 in a PDM signal. This is no different in the passband than 001 100 010; because the shifting of the position of the bits can only add information that is above the Nyquist frequency (half the sampling rate), and filtering, no matter how sophisticated, cannot actually add any information to the signal.
 
As far as filtering, I believe the physical transducers are adequate for low-pass filtering in audio. Artificial filtering can only induce undesirable phase aberrations. I welcome your comments.
 
Jan 5, 2015 at 5:40 PM Post #14 of 110
Originally Posted by m3_arun
 
Say we have a sequence 001 001 001 in a PDM signal. This is no different in the passband than 001 100 010; because the shifting of the position of the bits can only add information that is above the Nyquist frequency (half the sampling rate), and filtering, no matter how sophisticated, cannot actually add any information to the signal.

 
This is incorrect, and, as already explained above in post #6, is based on the naive assumption that downsampling by an integer ratio of N can be performed by simply averaging N sample blocks of the input. Using your specific patterns to "oversample" by a factor of 3 from 14700 Hz to 44100 Hz sample rate, it can be clearly seen below that the "passband" of 0 to 7350 Hz is not identical at all between the two cases:

Also, even with the simple (and incorrect) block averaging method, the two patterns become different in terms of block weights if you simply delay them by 1 sample: 100 100 100 vs. 000 110 001.
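A quick way to check both points numerically is the short Python sketch below. It assumes each 9-sample pattern repeats indefinitely at a 44100 Hz sample rate, so a 9-point DFT gives its spectrum at 0, 4900, 9800, 14700 and 19600 Hz; the helper names are made up for the example.

```python
import numpy as np

FS = 44100  # sample rate of the 3x "oversampled" stream, Hz

a = np.array([int(c) for c in "001001001"])
b = np.array([int(c) for c in "001100010"])

def spectrum(x):
    """DFT magnitudes, treating the pattern as one period of a repeating signal."""
    return np.abs(np.fft.rfft(x))

def block_weights(x, size=3):
    """Number of ones in each consecutive block of `size` samples."""
    return x.reshape(-1, size).sum(axis=1)

freqs = np.fft.rfftfreq(len(a), d=1 / FS)
spec_a, spec_b = spectrum(a), spectrum(b)

for f, ma, mb in zip(freqs, spec_a, spec_b):
    print(f"{f:7.0f} Hz   A: {ma:.3f}   B: {mb:.3f}")

# Same patterns delayed by one sample (circular shift, matching the post)
print(block_weights(np.roll(a, 1)))   # 100 100 100
print(block_weights(np.roll(b, 1)))   # 000 110 001
```

The 4900 Hz bin lies inside the 0 to 7350 Hz passband. The first pattern has no energy there while the second one does, even though both have identical block weights before the delay.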
 
I recommend watching this video for some basics of sampling and digital audio.
 
Jan 5, 2015 at 5:55 PM Post #15 of 110
As far as filtering, I believe the physical transducers are adequate for low-pass filtering in audio. Artificial filtering can only induce undesirable phase aberrations.

 
For phase aberration from filtering, it is only the overall response that matters, regardless of whether it is artificial or not. In practice, transducers introduce more phase errors under 20 kHz than any decent DAC, especially one that uses a linear phase reconstruction filter.
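As a small illustration of what "linear phase" means here, the Python sketch below builds a symmetric FIR lowpass and measures its group delay in the passband. The filter (63 taps, Hann-windowed sinc, cutoff at a quarter of the sample rate) is an arbitrary example chosen for the demonstration, not the reconstruction filter of any actual DAC.

```python
import numpy as np

N = 63                                   # odd tap count, symmetric around the middle tap
n = np.arange(N) - (N - 1) / 2
h = 0.5 * np.sinc(0.5 * n) * np.hanning(N)   # windowed-sinc lowpass, symmetric taps

NFFT = 1024
H = np.fft.rfft(h, NFFT)                 # frequency response on a dense grid
w = 2 * np.pi * np.arange(len(H)) / NFFT # radians per sample

passband = w < 0.4 * np.pi
phase = np.unwrap(np.angle(H[passband]))
group_delay = -np.diff(phase) / np.diff(w[passband])   # in samples

print(group_delay.min(), group_delay.max())
```

For symmetric taps the group delay should come out as (N - 1) / 2 samples at every passband frequency, which means the filter only delays the signal and does not shift frequencies relative to each other.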

 
