Hi-fi audio signal chain -- no more sigma-delta
Jan 5, 2015 at 6:06 PM Post #16 of 110
I think that graph is a great way to explain things. However, the point I am making is about the coding scheme, not about downsampling; it is about preserving the information present in the original audio signal.
Yes, we can filter the PDM signal to get different waveforms, and maybe 100 will be converted into something different from 001, but that filtering process is not adding real resolution to the signal; if the waveform that results from converting 100, 010, 001 is different, but still has nothing to do with the original audio waveform (rather it has to do with what kind of filter is being applied), then it's not audio-band information. It's like adding a special effect to the sound to make it seem clearer.
Additionally, delaying the signal is the same as changing the signal altogether. My intention is not to start an argument; thank you for your comments.
 
Jan 5, 2015 at 6:48 PM Post #17 of 110
  I think that graph is a great way to explain things. However, the point I am making is about the coding scheme, not about downsampling; it is about preserving the information present in the original audio signal.

 
Downsampling is how we reconstruct the content at the original low sample rate, so it is definitely relevant, and your specific method on which your theory is based is wrong.
 
Originally Posted by m3_arun
 
Yes, we can filter the PDM signal to get different waveforms, and maybe 100 will be converted into something different from 001, but that filtering process is not adding real resolution to the signal; if the waveform that results from converting 100, 010, 001 is different, but still has nothing to do with the original audio waveform (rather it has to do with what kind of filter you're applying), then it's not audio-band information.

 
You can see on the FFT that the two patterns have different content below 7350 Hz (the "audio band" for this particular example). Only the second one produces a tone at 4900 Hz. There are 512 possible 3x3 sample 1-bit patterns, and converting these to 1/3 sample rate will produce more than just 64 (4x4x4) different 3-sample patterns.
 
  Additionally, delaying the signal is the same as changing the signal altogether. My intention is not to start an argument; thank you for your comments.


Delaying a signal is a purely linear process. It cannot introduce new components that did not already exist in the signal; it only changes the phase. The fact that the block averaging results did not agree with this already shows that there is a problem (notably aliasing) with the averaging method, and therefore also with the theory based on it.
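
A rough way to check this in software is sketched below. It is only an illustration under assumptions that are not in the posts above: the 9-sample pattern is treated as periodic, the "correct" downsampler is an ideal FFT brick-wall filter, and the function names are made up for this example. It compares the magnitude spectrum of a pattern and a delayed copy of it after each kind of 3:1 downsampling.

```python
import numpy as np

def block_average(bits, factor=3):
    """Decimate by averaging non-overlapping blocks (the disputed method)."""
    return np.asarray(bits, dtype=float).reshape(-1, factor).mean(axis=1)

def brickwall_decimate(bits, factor=3):
    """Ideal low-pass on a periodic pattern, then keep every factor-th sample."""
    x = np.asarray(bits, dtype=float)
    n_out = len(x) // factor
    k = np.arange(len(x))
    k = np.minimum(k, len(x) - k)            # distance of each DFT bin from DC
    keep = k <= (n_out - 1) // 2             # bins below the new Nyquist
    filtered = np.fft.ifft(np.where(keep, np.fft.fft(x), 0)).real
    return filtered[::factor]

def magnitudes(y):
    """Magnitude spectrum of the low-rate pattern."""
    return np.abs(np.fft.fft(y))

a = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0])
b = np.roll(a, 2)                            # same pattern, delayed by two samples

for name, decimate in (("block average", block_average),
                       ("brick-wall", brickwall_decimate)):
    same = np.allclose(magnitudes(decimate(a)), magnitudes(decimate(b)))
    print(f"{name:13s}: magnitude spectrum unchanged by delay? {same}")
```

With the filtered downsampler the delay only rotates phase, while block averaging lets out-of-band content alias into the band, so the magnitudes change.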
 
Jan 5, 2015 at 7:24 PM Post #18 of 110
   
your specific method on which your theory is based is wrong.
 

I think this is uncalled for
 
There are 512 possible 3x3 sample 1-bit patterns, and converting these to 1/3 sample rate will produce more than just 64 (4x4x4) different 3-sample patterns.  

That is dithering noise being added to the signal, not real resolution (i.e., it has nothing to do with the original audio signal).
 
I agree with your comment about filtering; I believe that an overall zero-phase shift is the most desirable outcome for the audio signal chain.
 
EDIT: Linear phase shift is also acceptable
 
Jan 5, 2015 at 7:28 PM Post #19 of 110
  I think this is uncalled for

 
That is a factual statement, not an insult. It's completely called for, since your hypothesis about how DACs work is indeed wrong.
 
 
Quote:
  That is random noise being added to the signal

 
It's not random, since it is correlated to the input. Random noise would be completely uncorrelated to the original signal.
 
Jan 5, 2015 at 7:36 PM Post #20 of 110
Theory is not based on method; method is based on theory! I have edited my post above. Correlated or uncorrelated, noise is noise; it's definitely not music. Thank you for an interesting debate; my mind is made up.
 
Jan 5, 2015 at 8:38 PM Post #21 of 110
Won't an unfiltered R2R DAC give you exactly the dreadful staircase output that the marketing guys use every day to scare ignorant audiophiles into going high-res?
 
R2R ladders are easy to imagine: you just use as many resistors of the same value as you need to get your number of bits, the impedance doesn't change, it's fast, all is beautiful. But in a world where we always complain that 24-bit DACs aren't really capable of 24-bit resolution, didn't you ask yourself why nobody is making some fancy 64-bit resolution R2R DAC?
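
To put rough numbers on that question, here is a small sketch. It assumes an ideal binary-weighted converter and the usual rule of thumb that the most significant branches must be matched to about one LSB; the helper name is hypothetical.

```python
def lsb_fraction(bits):
    """Size of one LSB as a fraction of full scale for an ideal N-bit converter."""
    return 2.0 ** -bits

for bits in (16, 20, 24, 64):
    ppm = lsb_fraction(bits) * 1e6           # parts per million of full scale
    print(f"{bits:2d} bits: 1 LSB = {ppm:.3g} ppm of full scale")
```

Each extra bit halves the allowed mismatch, which is why the matching requirement quickly outruns what real resistors can hold.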
 
Rejecting filters? I don't even want to start. Each tech has needs and limits, and at some point reality kicks in.
 
 
If only the Sabre reaches 16-bit resolution, how come we have so many DACs using other delta-sigma chips with specs better than 100 dB? Problem?
 
 
 

 
Jan 5, 2015 at 10:19 PM Post #22 of 110
Yes, it will give you staircase output. I believe the theoretically optimum output should be a sequence of impulses, and not a staircase, something modern circuits are more than capable of. I would like to see that in modern DACs as well. People are making fancy R2R DACs... 64-bit is exorbitant, but 24-bit is definitely being done... see Total DAC, MSB Tech for examples. Like I said in a previous post, sigma-delta converters can perform really well in synthetic tests, because they are literally designed to have good specs. R2R may not be superior in tests, but I believe all tests are fallible, and only by sticking to the theory can we come up with optimal performance. Discovering the truth about sigma-delta vs R2R resolution requires a solid grasp of the Shannon-Nyquist theorem, which is the theoretical basis for digital audio. I'm no expert in audio, but if I'm going to invest in a system, I want to make sure I'm making the right purchases. Now I know R2R is the right way to go. That is not to say that high-performance sigma-delta DACs do not exist. It's just that R2R is the simpler and more effective, but more costly, way of achieving high performance.
 
EDIT: I want to add that R2R would be superior to SD in a test for dynamic resolution, but I know of no such test at the moment. I don't want to give the impression that the superiority of R2R architecture is just some mumbo-jumbo. Tests for THD, SNR, and DNR favor SD converters, and SD converters were designed for these measurements.
 
Jan 5, 2015 at 10:38 PM Post #23 of 110
  Yes, it will give you staircase output. I believe the theoretically optimum output should be a sequence of impulses, and not a staircase, something modern circuits are more than capable of. I would like to see that in modern DACs as well. People are making fancy R2R DACs... 64-bit is exorbitant, but 24-bit is definitely being done... see Total DAC, MSB Tech for examples. Like I said in a previous post, sigma-delta converters can perform really well in synthetic tests, because they are literally designed to have good specs. R2R may not be superior in tests, but I believe all tests are fallible, and only by sticking to the theory can we come up with optimal performance. Discovering the truth about sigma-delta vs R2R resolution requires a solid grasp of the Shannon-Nyquist theorem, which is the theoretical basis for digital audio. I'm no expert in audio, but if I'm going to invest in a system, I want to make sure I'm making the right purchases. Now I know R2R is the right way to go. That is not to say that high-performance sigma-delta DACs do not exist. It's just that R2R is the simpler and more effective, but more costly, way of achieving high performance.

 
Even a true impulse train has to be filtered to remove imaging. In fact, it's the same imaging as the zero-order hold, just without the sinc multiplicative factor.
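
As a quick numerical illustration of that sinc factor: an ideal impulse train repeats the baseband spectrum around every multiple of the sample rate at equal strength, while a zero-order hold scales the same images by |sinc(f/fs)|. The 44.1 kHz rate and 1 kHz tone below are example values chosen here, not figures from the thread, and the function name is made up.

```python
import numpy as np

def zoh_gain_db(freq_hz, sample_rate_hz):
    """Magnitude (dB) of the zero-order-hold sinc envelope at freq_hz."""
    # np.sinc(x) is sin(pi*x)/(pi*x), matching |H(f)| = |sinc(f/fs)|.
    gain = np.abs(np.sinc(freq_hz / sample_rate_hz))
    return 20 * np.log10(np.maximum(gain, 1e-12))   # floor avoids log10(0)

fs = 44100.0      # example sample rate (assumption)
tone = 1000.0     # example tone (assumption)

print(f"tone  at {tone:8.0f} Hz: ZOH envelope {zoh_gain_db(tone, fs):7.2f} dB")
for k in (1, 2):
    for image in (k * fs - tone, k * fs + tone):
        print(f"image at {image:8.0f} Hz: ZOH envelope {zoh_gain_db(image, fs):7.2f} dB")
```

The hold attenuates the images but does not remove them, so a reconstruction filter is still needed in both cases.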
 
Jan 6, 2015 at 4:50 AM Post #26 of 110
  That is dithering noise being added to the signal, not real resolution (i.e., it has nothing to do with the original audio signal).

 
It is "real resolution" in the band of the original audio signal. The fact that correctly downsampling all the 512 possible 9x1 bit patterns by a factor of 3 yields more than 64 unique 3-sample patterns already shows that there is more than just 2 bits per sample of information that can be encoded in the "audio band", and this does not agree with your simple PDM model that predicts only 2 bits of possible resolution in this case (basically, 000, 100, 110, and 111 for 4 levels). The purpose of noise shaped dithering, which is more efficient than PWM (see the halftone vs. error diffusion dither example images linked in post #6), is exactly to increase the effective resolution in a low frequency band by moving most of the quantization noise outside that band.
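
The pattern count can be checked by brute force. The sketch below is one possible way to do it, not necessarily the exact procedure used for the graph: it assumes each 9-bit pattern is periodic and uses an ideal FFT brick-wall filter as the "correct" 3:1 downsampler, with block averaging shown for comparison. All function names are invented for this example.

```python
import itertools
import numpy as np

def block_average(bits, factor=3):
    """Average non-overlapping blocks: the simple 4-level-per-sample model."""
    return np.asarray(bits, dtype=float).reshape(-1, factor).mean(axis=1)

def brickwall_decimate(bits, factor=3):
    """Ideal low-pass on a periodic pattern, then keep every factor-th sample."""
    x = np.asarray(bits, dtype=float)
    n_out = len(x) // factor
    k = np.arange(len(x))
    k = np.minimum(k, len(x) - k)            # distance of each DFT bin from DC
    keep = k <= (n_out - 1) // 2             # bins below the new Nyquist
    filtered = np.fft.ifft(np.where(keep, np.fft.fft(x), 0)).real
    return filtered[::factor]

def count_unique(decimate):
    """Count distinct 3-sample outputs over all 512 nine-bit patterns."""
    outputs = set()
    for bits in itertools.product((0, 1), repeat=9):
        y = decimate(np.array(bits))
        outputs.add(tuple(np.round(y, 9)))   # round away floating-point noise
    return len(outputs)

n_block = count_unique(block_average)
n_filtered = count_unique(brickwall_decimate)
print("block averaging  :", n_block, "unique outputs")
print("filtered decimate:", n_filtered, "unique outputs")
```

Block averaging can only ever produce the 4x4x4 combinations, while the filtered version distinguishes many more input patterns within the same band.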
 
  EDIT: I want to add that R2R would be superior to SD in a test for dynamic resolution, but I know of no such test at the moment. I don't want to give the impression that the superiority of R2R architecture is just some mumbo-jumbo. Tests for THD, SNR, and DNR favor SD converters, and SD converters were designed for these measurements.

 
To be able to test dynamic resolution, first you need an accurate definition of what it is. Otherwise, it is just one of those audiophile invented terms, like PRaT, which cannot be tested objectively because no one can tell what they really are.
 
As suggested in post #6, if you do not trust standard measurements, it is always possible to perform difference extraction tests with any signal of your choice, even complex music. In this particular case, it would not even be necessary to use any analog hardware, as the effectiveness of the noise shaping/delta-sigma modulation itself can be tested by simulating it in software, downsampling the output, and then subtracting it from the original input to see how much error there is.
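
A minimal sketch of such a software-only test follows. Everything specific in it is an assumption made for illustration: a first-order 1-bit modulator, 64x oversampling, a sine test signal, and an FFT brick-wall filter for the downsampling step. Real converters use higher-order modulators and practical filters, and any other input signal could be substituted.

```python
import numpy as np

def delta_sigma_1bit(x):
    """First-order delta-sigma modulator: returns a +/-1 bitstream for |x| < 1."""
    out = np.empty(len(x))
    integrator = 0.0
    feedback = 0.0
    for n, sample in enumerate(x):
        integrator += sample - feedback      # accumulate the running error
        feedback = 1.0 if integrator >= 0 else -1.0
        out[n] = feedback
    return out

def downsample(y, osr):
    """Brick-wall low-pass (periodic signal assumed), then keep every osr-th sample."""
    spectrum = np.fft.rfft(y)
    spectrum[len(y) // (2 * osr):] = 0       # zero all bins at or above the new Nyquist
    return np.fft.irfft(spectrum, n=len(y))[::osr]

def rms(v):
    return float(np.sqrt(np.mean(np.square(v))))

N, OSR, CYCLES = 16384, 64, 5                # example sizes, chosen for this sketch
x = 0.5 * np.sin(2 * np.pi * CYCLES * np.arange(N) / N)
reference = x[::OSR]                         # the input at the low sample rate

# Difference extraction: downsampled 1-bit stream minus the original input.
err_ds = rms(downsample(delta_sigma_1bit(x), OSR) - reference)
# Same test for a plain 1-bit quantizer without noise shaping.
err_plain = rms(downsample(np.where(x >= 0, 1.0, -1.0), OSR) - reference)

print(f"RMS error, delta-sigma 1-bit : {err_ds:.5f}")
print(f"RMS error, plain 1-bit       : {err_plain:.5f}")
```

The residual after subtraction is the in-band error, so it directly measures how much of the original signal survives the 1-bit encoding.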
 
Jan 6, 2015 at 3:02 PM Post #28 of 110
   
It is "real resolution" in the band of the original audio signal. The fact that correctly downsampling all the 512 possible 9x1 bit patterns by a factor of 3 yields more than 64 unique 3-sample patterns already shows that there is more than just 2 bits per sample of information that can be encoded in the "audio band", and this does not agree with your simple PDM model that predicts only 2 bits of possible resolution in this case (basically, 000, 100, 110, and 111 for 4 levels). The purpose of noise shaped dithering, which is more efficient than PWM (see the halftone vs. error diffusion dither example images linked in post #6), is exactly to increase the effective resolution in a low frequency band by moving most of the quantization noise outside that band.
 
 
To be able to test dynamic resolution, first you need an accurate definition of what it is. Otherwise, it is just one of those audiophile invented terms, like PRaT, which cannot be tested objectively because no one can tell what they really are.
 
As suggested in post #6, if you do not trust standard measurements, it is always possible to perform difference extraction tests with any signal of your choice, even complex music. In this particular case, it would not even be necessary to use any analog hardware, as the effectiveness of the noise shaping/delta-sigma modulation itself can be tested by simulating it in software, downsampling the output, and then subtracting it from the original input to see how much error there is.


I must admit I'm a little surprised at this response. No, adding dithering noise is not adding real resolution to the signal. It's like magnifying an image and then sharpening it; while it may please our senses, it is not information that was present in the original audio signal, so for me, it's not high fidelity.

What I mean by resolution is how well discrete amplitudes in the input match discrete amplitudes in the output. An audio signal is completely described by an amplitude-vs-time graph (as the Shannon-Nyquist theorem states), but not by a frequency spectrum. This is why resolution must be studied in the time domain.

It's not about trust. All tests and measurements are made with certain assumptions of what kind of behavior is optimal. In this case we are not looking for performance in the right areas.
 
I agree. If you can provide an extraction test, I think it would be similar to the dynamic resolution test idea I am trying to convey.
 
EDIT: I'd like to add this useful link about sigma-delta converters: http://www.ti.com/lit/an/slyt076/slyt076.pdf It really helped me confirm that correlated noise is not the same as time-domain resolution.
 
Jan 6, 2015 at 4:02 PM Post #30 of 110
 
I must admit I'm a little surprised at this response. No, adding dithering noise is not adding real resolution to the signal. It's like magnifying an image and then sharpening it; while it may please our senses, it is not information that was present in the original audio signal, so for me, it's not high fidelity.

What I mean by resolution is how well discrete amplitudes in the input match discrete amplitudes in the output. An audio signal is completely described by an amplitude-vs-time graph (as the Shannon-Nyquist theorem states), but not by a frequency spectrum. This is why resolution must be studied in the time domain.

It's not about trust. All tests and measurements are made with certain assumptions of what kind of behavior is optimal. In this case we are not looking for performance in the right areas.
 
I agree. If you can provide an extraction test, I think it would be similar to the dynamic resolution test idea I am trying to convey.

 
Your comments on dither would be more accurate if dither were something done after the samples have been obtained, but when done during capture it has a real effect on dynamic range within the main audible band. And any set of samples is not "the original signal", if we define this term as "what went into the microphone".
 
Theoretically, frequency and time are exactly equivalent when conditions are met. In practice, of course, sacrifices have to be made since conditions aren't exactly met. The question therefore becomes one of audibility; that is, can you audibly detect these errors in the time domain that come about by frequency optimizations?
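
For sampled data, the equivalence is easy to demonstrate with a discrete Fourier transform. The short sketch below uses an arbitrary random test signal (an assumption, any signal works): it goes to the frequency domain and back, then compares the signal energy computed in each domain.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)                # arbitrary test signal

X = np.fft.fft(x)                            # time domain -> frequency domain
x_back = np.fft.ifft(X).real                 # frequency domain -> time domain

energy_time = np.sum(x ** 2)
energy_freq = np.sum(np.abs(X) ** 2) / len(x)   # Parseval's relation for the DFT

print("round trip recovers the samples:", np.allclose(x, x_back))
print("energy matches in both domains :", np.isclose(energy_time, energy_freq))
```

Nothing is lost in either direction, so an error that is small in one domain is equally small in the other.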
 
