NOS DACs and upsampling

Mar 24, 2025 at 4:04 AM Thread Starter Post #1 of 89

Rayon

There has been quite a bit of discussion on upsampling in the context of NOS DACs, especially R2R. For oversampling DACs, the highest sample rate and bit depth the DAC can accept often sounds best, but with NOS DACs things get more complicated.

Examples of questions that have been on the table several times:
  • What bit depth should one choose when using PCM?
    • Should one go with the lowest bit depth with perfect linearity, or is there something to be gained by tolerating some non-linearity to get a higher bit depth?
  • What sample rate sounds best?
  • PCM vs. DSD, and why?
Some motivation for a separate thread:
  • Not all users of HQPlayer and PGGB have NOS DACs
    • It occasionally feels like this discussion steals quite a bit of focus from more general discussion about upsampling
  • The discussion is already split across HQP, PGGB and DAC-specific threads
    • Having a centralized place for this helps owners of different DACs share thoughts
 
Mar 24, 2025 at 5:01 AM Post #2 of 89
I’m very curious about Goldensound's recommendation of 18 bits for the Cyan 2. Is this just optimal from a technical measurement perspective? I’d love a clear explanation of why one would want to lower the bit depth fed to a NOS DAC.

Intuitively I have the sense that if a DAC like the Cyan 2 has a 24-bit ladder, 24-bit input can be supported technically, so is it just that the higher-order resistors are less accurate and therefore should not be used?
 
Mar 24, 2025 at 5:21 AM Post #3 of 89
24-bit input can be supported technically, so is it just that the higher-order resistors are less accurate and therefore should not be used?
This. At a certain point the distortion-to-resolution ratio just isn't worth it anymore. The question is, how much deviation is "too much"?

Here's a linearity measurement of Holo May:
[Image: Holo May linearity measurement]


20 bits clearly sounds clean, as it should according to the picture (20 × 6 dB = 120 dB), so one wouldn't gain anything from using, for example, 19 bits. But what about bits 21-24: which bit depth should one use?

To my ears 21 bits adds a little bit of distortion, but it contributes a lot to dynamics. 22 starts to sound more distorted and also maybe kind of softer. However, 24 bits sounds better than 22 bits again. My guess is that with 24 bits I have so many lower-accuracy bits in use that even though they generate more noise together, their errors aren't fully correlated with each other and thus they smooth each other out a bit. But this is just a guess. Right now I'm torn between 21 and 24 bits.

But generally ±0.1 dB seems to be inaudible territory, 0.5 dB is audible, and beyond 0.5 dB it starts to get really problematic. To my understanding linearity is also more critical if noise shaping is used. I personally like Gaussian dither more with the May anyway.
 
Mar 24, 2025 at 7:02 AM Post #4 of 89
But generally ±0.1 dB seems to be inaudible territory, 0.5 dB is audible, and beyond 0.5 dB it starts to get really problematic. To my understanding linearity is also more critical if noise shaping is used. I personally like Gaussian dither more with the May anyway.
I think linearity is more critical if noise shaping is used because you usually lower the bit depth when you noise shape. With noise shaping you are basically compressing the same data into fewer bits, so if the last bit is "off" you're losing a much bigger chunk of your music signal.
I don't think the linearity of bits 21-24 matters more with LNS15 than with Gaussian dither if you output 24 bits.
I’m very curious about Goldensound's recommendation of 18 bits for the Cyan 2. Is this just optimal from a technical measurement perspective? I’d love a clear explanation of why one would want to lower the bit depth fed to a NOS DAC.

Intuitively I have the sense that if a DAC like the Cyan 2 has a 24-bit ladder, 24-bit input can be supported technically, so is it just that the higher-order resistors are less accurate and therefore should not be used?
Creating R2R DACs with resistors accurate enough for signals down at -140 dB is nearly impossible. Holo has used a clever trick to pump the linear range up to ~20 bits for the flagship models and 18 for the Cyan, but most other R2R DACs, even very expensive ones, are not linear beyond 14-16 bits.

Normally 16 bits gives you a noise floor of -96 dB. However, with noise shaping and high output rates all that noise can be moved towards the higher frequencies and filtered out by the analog filter, giving you a noise floor below -140 dB with only 16 bits. Since your noise floor is then so low, you don't need additional bits to make it even lower, so you only want to use your "perfect" bits instead of the imperfect ones too (in the case of the Cyan the imperfect bits are bits 19-24): the extra lowered noise floor from those bits doesn't add anything useful, and the imperfections only add distortion. That's the theory anyway.
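A minimal sketch of the mechanism (Python with NumPy; all names and parameters are illustrative): quantize a tone to 16 bits at a 16x rate, once with flat TPDF dither and once through a first-order error-feedback noise shaper, then compare the noise inside the audio band. A first-order shaper won't reach the -140 dB quoted above (that takes the much higher-order shapers in tools like HQPlayer or PGGB), but it shows how the noise gets pushed out of band:

```python
import numpy as np

fs = 44100 * 16                  # 16x oversampled output rate
n = 1 << 16
t = np.arange(n) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)    # 1 kHz test tone
q = 2.0 ** -15                            # 16-bit step (full scale = +/-1)

def quant(v):
    return np.round(v / q) * q

def tpdf(size):                           # triangular dither, +/-1 LSB
    return (np.random.rand(size) - np.random.rand(size)) * q

flat = quant(x + tpdf(n))                 # plain dithered 16-bit quantization

shaped = np.empty(n)                      # first-order error feedback
e = 0.0
for i in range(n):
    v = x[i] - e                          # subtract the previous error
    shaped[i] = quant(v + tpdf(1)[0])
    e = shaped[i] - v                     # error is high-passed by (1 - z^-1)

def inband_noise_db(y):                   # noise power below 20 kHz
    spec = np.abs(np.fft.rfft(y - x)) ** 2 / n ** 2
    freqs = np.fft.rfftfreq(n, 1 / fs)
    return 10 * np.log10(2 * spec[freqs <= 20000].sum())

print("in-band noise, flat dither :", round(inband_noise_db(flat), 1), "dBFS")
print("in-band noise, noise shaped:", round(inband_noise_db(shaped), 1), "dBFS")
```

Both versions have roughly the same total noise power; the shaped version just concentrates it above the audio band, which is exactly what the analog filter then removes.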
 
Mar 24, 2025 at 8:49 AM Post #5 of 89
Got it! Yeah, I remember reading how hard it is from a physics standpoint to add more bits to an R2R ladder; makes sense.

What is the clever trick you mentioned? And why bother if the bits can essentially be considered useless? Is it just to allow compatibility with 24-bit audio files? (If so, does 16-bit audio on such a DAC just bypass this issue?)
 
Mar 24, 2025 at 9:54 AM Post #6 of 89
Got it! Yeah, I remember reading how hard it is from a physics standpoint to add more bits to an R2R ladder; makes sense.

What is the clever trick you mentioned?
I don't know the exact details, but Holo uses a second R2R ladder, driven by an FPGA, purely to compensate for the errors of the first one. Intuitively, it adds the two outputs together such that their errors cancel each other out, similar to balanced interconnects or Supersymmetry amplifiers.
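Since the actual scheme isn't public, here is a purely hypothetical toy model (Python/NumPy; every number and name here is invented for illustration, not Holo's method) of the general idea: if each code's error can be measured, a second, coarser ladder can emit roughly the negative of that error so the sum lands closer to the ideal level:

```python
import numpy as np

rng = np.random.default_rng(0)
BITS = 8                                   # toy ladder; real ones are 24-bit
N = 2 ** BITS

ideal = np.arange(N) / N                   # ideal output level per input code
mismatch = rng.normal(0, 1e-3, N)          # random resistor errors
main_ladder = ideal + mismatch             # what the uncorrected ladder emits

# Hypothetical calibration: measure each code's error once, then have a
# second, coarser ladder emit roughly the negative of that error.
correction = -np.round(mismatch / 1e-4) * 1e-4   # corrector has limited resolution
corrector_err = rng.normal(0, 1e-5, N)           # ... and its own smaller mismatch
compensated = main_ladder + correction + corrector_err

print("worst error, main ladder only   :", np.abs(main_ladder - ideal).max())
print("worst error, with second ladder :", np.abs(compensated - ideal).max())
```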
And why bother if the bits can essentially be considered useless? Is it just to allow compatibility with 24-bit audio files? (If so, does 16-bit audio on such a DAC just bypass this issue?)
They are certainly not useless! They can be made useless by upsampling externally with advanced noise shapers, but in a NOS scenario you definitely want the extra resolution provided by these last few bits.

On a side note, I never tried the IIR2 filter with 24 bits. It's a special filter in the sense that it sounds like crap above 96 kHz, but that sample rate limitation works quite well with 24 bits and Gaussian dither. Absolutely breathtaking transients with pop and rock music, with a nice analog timbre.
 
Mar 24, 2025 at 10:45 AM Post #7 of 89
I think linearity is more critical if noise shaping is used because you usually lower the bit depth when you noise shape. With noise shaping you are basically compressing the same data into fewer bits, so if the last bit is "off" you're losing a much bigger chunk of your music signal.
I don't think the linearity of bits 21-24 matters more with LNS15 than with Gaussian dither if you output 24 bits.
That explanation makes sense. Basically, the last bit carries a lot of weight when noise shaping is used. I was thinking something like: noise-shaped information is more internally correlated, and thus non-linearities could cause surprising anomalies.
 
Mar 24, 2025 at 4:53 PM Post #8 of 89
Noise-shaped dither, when done correctly, linearizes R2R nonlinearities without sacrificing bit depth.

If the DAC is linear to, say, 18 bits, then you noise shape to 18 bits and the last 6 bits are padded with 0, so they become 'don't care' and do not contribute to the sound, effectively removing the non-linearity beyond 18 bits (because those resistor elements never come into play). This is how noise shaping helps linearize. However, to be able to noise shape, you need a higher sampling rate, 8x or higher, to be able to push the quantization noise to higher frequencies where it stays inaudible.
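In sample terms, the zero-padding amounts to something like this (a Python sketch; the function name is made up): every sample lands on an 18-bit grid inside the 24-bit output word, so the bottom 6 bits are always zero and those ladder elements never switch:

```python
import numpy as np

def to_18_bits_in_24(sample):
    """Quantize a [-1, 1) float onto an 18-bit grid, expressed as a 24-bit code.

    The bottom 6 bits of the result are always zero, so the least accurate
    resistors of a 24-bit ladder never switch.
    """
    code = int(np.round(sample * 2 ** 17))        # 18-bit signed value
    code = max(-2 ** 17, min(2 ** 17 - 1, code))  # clip to the 18-bit range
    return code << 6                              # pad six zero LSBs

print(format(to_18_bits_in_24(0.1234567) & 0xFFFFFF, '024b'))  # ends in 000000
```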

Nonlinearity due to resistor mismatch is bad because it introduces distortions that are correlated with the music: a music signal at specific levels always gets distorted in a predictable manner, as it likely involves the same resistors in the R2R ladder. One could also just truncate and dither to 20 bits. Objectively, noise shaping will result in a more accurate oversampled signal, but there is a subjective element as well that one may prefer.

With R2R DACs, nonlinearity can also increase with higher sample rates, so what works best at 8x or 16x sampling rate may not work at 32x. Part of this is due to settling time and you may need to drop a few more bits.
 
Mar 24, 2025 at 7:54 PM Post #9 of 89
Noise-shaped dither, when done correctly, linearizes R2R nonlinearities without sacrificing bit depth.

If the DAC is linear to, say, 18 bits, then you noise shape to 18 bits and the last 6 bits are padded with 0, so they become 'don't care' and do not contribute to the sound, effectively removing the non-linearity beyond 18 bits (because those resistor elements never come into play). This is how noise shaping helps linearize. However, to be able to noise shape, you need a higher sampling rate, 8x or higher, to be able to push the quantization noise to higher frequencies where it stays inaudible.

Nonlinearity due to resistor mismatch is bad because it introduces distortions that are correlated with the music: a music signal at specific levels always gets distorted in a predictable manner, as it likely involves the same resistors in the R2R ladder. One could also just truncate and dither to 20 bits. Objectively, noise shaping will result in a more accurate oversampled signal, but there is a subjective element as well that one may prefer.
I just want to comment on the terminology. If we noise shape or Gaussian dither to 20 bits instead of 24 bits, we are sacrificing bit depth by definition. However, we are able to encode more "DAC-usable" information into those bits when noise shaping is used, once the limitations of the DAC are taken into consideration.

Based on earlier discussions around this theme, there seems to be a common understanding of the idea of noise shaping as such (padding the least significant, non-linear bits with 0 and then moving quantization noise up into inaudible frequencies by distributing dither wisely). It also seems to be common knowledge why we may want to use noise shaping in theory, and that nonlinearities in those last bits (due to inaccuracies in resistors) cause distortion. The open questions are quite technical.

Where we seem to lack consensus:
  • How much nonlinearity is too much
    • For argument's sake, assume an R2R DAC that is perfectly linear to 8 bits and then has 8 more bits with -0.1 dB deviation
      • Which sounds best when upsampling to 16fs: just using all 16 bits, or noise shaping to 8 bits?
        • If 16 bits, why?
      • Asking because many people seem to prefer still using those less linear resistors with some small deviation
  • Does noise shaping sound artificial? (Not a technical question as such, but people don't share the same taste)
    • To me noise shaping, especially higher-order noise shaping, sounds unnatural vs. Gaussian dither
      • Yes, this is subjective preference, but in the end we do all this to enjoy, which is always subjective
        • Understanding why some people enjoy dither over noise shaping would help us make better products, potentially understand our hearing system better and even ourselves
  • I clearly prefer dithered 21 bits (and 24 bits) over noise-shaped (or normally dithered) 20 bits with the Holo May, and I would like to understand why
    • Higher than 20 bits sounds more dynamic
      • Why?
    • The May is only perfectly linear to 20 bits -> an explanation is needed, as in theory noise-shaped 20 bits should carry more information
      • At least if we assume that any deviation means that we should ditch the bit
        • I personally don't know the formula we should use to calculate whether a bit should be kept, other than GoldenOne's suggestion of SNR / 6
    • Others have had similar experiences with other R2R DACs

With R2R DACs, nonlinearity can also increase with higher sample rates, so what works best at 8x or 16x sampling rate may not work at 32x. Part of this is due to settling time and you may need to drop a few more bits.
Thanks for extra confirmation on this. I've seen some speculation that this could be the case.

Tangentially related: I've also noticed that we need more bit depth at higher sample rates to keep the sound "as tall". I don't think this is related, but it's an interesting phenomenon nevertheless. My current favourite is 4fs @ 21 bits. If I increase the sample rate to 8fs, the sound is not only pushed a bit further away from me, it also sounds flatter in comparison. It's a bit of a problem if we want more bits as a function of sample rate while the DAC's capability to keep up is actually going down.
 
Mar 25, 2025 at 1:04 AM Post #10 of 89
I just want to comment on the terminology. If we noise shape or Gaussian dither to 20 bits instead of 24 bits, we are sacrificing bit depth by definition. However, we are able to encode more "DAC-usable" information into those bits when noise shaping is used, once the limitations of the DAC are taken into consideration.
Whenever re-quantization occurs, i.e., when converting from a higher bit depth (such as the 64 bits used during computation) to a lower bit depth (as required by a DAC), you have three options:

Truncation:
Simply rounding the signal to a lower bit depth introduces harmonic distortion that is audible and should be avoided. This occurs because the quantization noise becomes correlated with the music signal; signals within a specific range consistently fall into the same quantized bins. Consequently, the noise is modulated by the music, resulting in a harsher, more digital sound - akin to viewing a pixelated image. Here, the noise is defined as the difference between the truncated (quantized) signal and the original high-bit-depth music signal.
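The correlation is easy to demonstrate (a sketch in Python with NumPy, exaggerated to an 8-bit step for clarity): quantize a quiet tone with no dither, and the error repeats with the signal, so its spectrum shows discrete harmonics instead of a flat floor:

```python
import numpy as np

fs, n = 48000, 1 << 16
t = np.arange(n) / fs
x = 0.01 * np.sin(2 * np.pi * 1000 * t)   # quiet 1 kHz tone, -40 dBFS

q = 2.0 ** -7                             # 8-bit step, exaggerated for clarity
truncated = np.round(x / q) * q           # re-quantize with no dither

err = truncated - x                       # the quantization noise
spectrum = np.abs(np.fft.rfft(err * np.hanning(n)))
freqs = np.fft.rfftfreq(n, 1 / fs)
# The spectrum shows spikes at multiples of 1 kHz: the error repeats with
# the signal, i.e., harmonic distortion correlated with the music.
print("strongest error component near", round(freqs[spectrum.argmax()]), "Hz")
```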

Dithering:
Dithering involves adding a small amount of randomness (using methods such as triangular or Gaussian dithering) to the least significant bit. This randomness prevents the music signal from falling into predictable quantized bins, resulting in a more natural and less digital sound even though no additional information is added. Essentially, dithering decorrelates the quantization noise introduced by truncation; in this case, the noise is the difference between the quantized-and-dithered signal and the original high-bit-depth music signal.
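Continuing the sketch above under the same assumptions: adding TPDF dither (Gaussian works similarly, with a different amplitude distribution) before the same rounding step removes the harmonics and leaves a benign, signal-independent noise floor:

```python
import numpy as np

fs, n = 48000, 1 << 16
t = np.arange(n) / fs
x = 0.01 * np.sin(2 * np.pi * 1000 * t)
q = 2.0 ** -7

tpdf = (np.random.rand(n) - np.random.rand(n)) * q  # triangular dither, +/-1 LSB
dithered = np.round((x + tpdf) / q) * q
# The error (dithered - x) is now flat, signal-independent noise: the
# harmonic spikes are gone, at the cost of a slightly higher noise floor.
```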

Noise Shaping:
While truncation preserves dynamic range and dithering slightly reduces it (with the benefits of dithering outweighing this minor loss), noise shaping can effectively increase the dynamic range. This technique is particularly useful for linearizing DACs that require such correction. Noise shaping employs a feedback mechanism to filter the error (i.e., the noise), attenuating noise at lower frequencies while allowing higher frequencies to pass. In effect, the quantization noise is "pushed" to a frequency range where it is less audible.

Depending on the design of the noise shaper and the extent of in-band noise attenuation, the dynamic range can be increased within the bandwidth of interest. This approach allows more information to be "packed" into a narrow frequency band (typically below 100 kHz), effectively increasing the bit depth in that band. Although noise shaping is highly beneficial for R2R DACs, its advantages are less obvious for delta-sigma DACs, where linearity is less critical and the focus shifts to increasing dynamic range.

Even with noise shaping, truncation occurs first; therefore, it is generally advisable to apply dithering before computing the error, to avoid high-frequency distortions. Though I write 'noise shaping', I mean dithered noise shaping. The only exception is when noise shaping is used for 1-bit modulators, where dithering is not possible.
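To make the feedback loop concrete, here is a sketch (Python/NumPy, illustrative parameters) of a dithered first-order error-feedback shaper at an 8x rate, with the dither applied before the error is computed, as described above. Real shapers are much higher order, but the structure is the same:

```python
import numpy as np

fs, n = 48000 * 8, 1 << 16            # 8x rate: room above the audio band
t = np.arange(n) / fs
x = 0.01 * np.sin(2 * np.pi * 1000 * t)
q = 2.0 ** -7

shaped = np.empty(n)
e = 0.0                               # previous quantization error
for i in range(n):
    v = x[i] - e                      # feed the error back, inverted
    d = (np.random.rand() - np.random.rand()) * q    # dither first ...
    shaped[i] = np.round((v + d) / q) * q
    e = shaped[i] - v                 # ... then compute the error to feed back
# The error (shaped - x) is filtered by (1 - z^-1): attenuated at low
# frequencies, rising toward fs/2 where the analog filter removes it.
```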

How much nonlinearity is too much
  • For argument's sake, assume an R2R DAC that is perfectly linear to 8 bits and then has 8 more bits with -0.1 dB deviation
    • Which sounds best when upsampling to 16fs: just using all 16 bits, or noise shaping to 8 bits?
      • If 16 bits, why?
    • Asking because many people seem to prefer still using those less linear resistors with some small deviation
Objectively, you want to linearize every bit for maximum accuracy. However, the point at which additional linearization is unnecessary depends on when you stop hearing a difference. If you can distinguish between a 20‑bit dithered signal and a 20‑bit noise-shaped signal, then you have your answer. Additionally, you must separate personal preference from technical accuracy; a more accurate signal does not automatically mean you will prefer it.

Does noise shaping sound artificial? (Not a technical question as such, but people don't share the same taste)
  • To me noise shaping, especially higher-order noise shaping, sounds unnatural vs. Gaussian dither
    • Yes, this is subjective preference, but in the end we do all this to enjoy, which is always subjective
      • Understanding why some people enjoy dither over noise shaping would help us make better products, potentially understand our hearing system better and even ourselves
I prefer noise shaping. Though the objective answer is clear, you may need to run a poll to get a sense of the distribution; not very scientific, though.
The May is only perfectly linear to 20 bits -> an explanation is needed, as in theory noise-shaped 20 bits should carry more information
  • At least if we assume that any deviation means that we should ditch the bit
    • I personally don't know the formula we should use to calculate whether a bit should be kept, other than GoldenOne's suggestion of SNR / 6
Below is the linearity measurement performed by JA on Stereophile. Linearity is straightforward: as you increase the input signal level, the DAC’s output should increase proportionally. If the output sometimes deviates - going up or down unexpectedly - that will introduce distortion. In the graph below, the blue trace represents how the output level of a 1 kHz signal (shown on the left y-axis) changes with the input level (x-axis) as it increases from -140 dB to 0 dB. Ideally, this blue line should be perfectly straight from one corner to the other, but you can see it is slightly curved near -140 dB. Since these small deviations are harder to notice, the error is plotted separately in red. This error is calculated as the ratio (x/y) expressed in dB; when x equals y, the ratio is 1, which is 0 dB. Therefore, you want the red line to stay as close to 0 dB as possible.



[Image: Stereophile linearity measurement of the Holo May: 1 kHz output level (blue) and level error (red) vs. input level]

In the graph you can see the error settles very close to 0 starting at about -120 dB, and 120 dB / 6 ≈ 20 bits. The reason behind this is simple: the smallest number you can represent using 20 bits is 1/2^20, and that value in dB is 20 · log10(2^-20) ≈ -120 dB. Each bit contributes a doubling of the resolution, so going to 21 bits gives 20 · log10(2^-21) ≈ -126 dB. Each doubling is a change of 6 dB, because 20 · log10(2^-1) = 20 · log10(0.5) ≈ -6 dB (i.e., each halving decreases the level by 6 dB). So you can simply divide by 6 to get the approximate number of useful bits.
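The same arithmetic in a few lines of Python, just restating the formula above:

```python
import math

for bits in (16, 18, 20, 21, 24):
    step_db = 20 * math.log10(2 ** -bits)       # level of the smallest step
    print(f"{bits} bits -> {step_db:7.1f} dB")  # ~ -6.02 dB per bit
# 120 dB of linear range / 6 dB-per-bit ~= 20 usable bits, matching the graph.
```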

Noise shaping doesn't add extra bits; it simply leverages the linear region of the graph (i.e., from -120 dB to 0 dB) to encode more information. How is this possible? How can 20 bits effectively store the equivalent of 64 bits of information?

A simplified example can illustrate the concept. Suppose you have a bit depth of 1, meaning you can only represent two values: -1 and 1 (with -1 corresponding to a bit value of 0 and 1 corresponding to a bit value of 1). Now, what if you want to represent 0.5? You would need a bit depth of 2. One way to achieve this is by increasing the sampling rate fourfold, for instance from 44.1 kHz to 4 × 44.1 kHz. If you transmit a sequence such as [1, 1, 1, -1] (four samples instead of one), the average value is (1 + 1 + 1 - 1) / 4 = 0.5. This averaging acts as a simple low-pass filter. Since we cannot hear frequencies above 20 kHz, our auditory perception effectively filters out the noise (averages it, in some sense), allowing us to perceive 0.5 even though each individual sample is only 1 bit.

This intentionally simplistic example demonstrates the principle behind noise shaping. In practice, moving average filters are commonly used to convert 1-bit DSD to PCM.
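That averaging example as runnable code (a Python/NumPy sketch; the moving-average kernel is the crudest possible low-pass, purely for illustration):

```python
import numpy as np

one_bit = np.array([1, 1, 1, -1], dtype=float)  # four 1-bit samples
print(one_bit.mean())                           # 0.5, which 1 bit alone cannot encode

# The same idea at scale: a longer 1-bit pattern, low-passed by a crude
# moving average, settles on a value no single sample can represent.
pattern = np.tile([1.0, 1.0, 1.0, -1.0], 64)
kernel = np.ones(16) / 16                       # moving-average low-pass
print(np.convolve(pattern, kernel, mode='valid')[:4])   # ~0.5 everywhere
```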
 
Mar 25, 2025 at 6:42 AM Post #11 of 89
I just want to comment on the terminology. If we noise shape or Gaussian dither to 20 bits instead of 24 bits, we are sacrificing bit depth by definition. However, we are able to encode more "DAC-usable" information into those bits when noise shaping is used, once the limitations of the DAC are taken into consideration.

Based on earlier discussions around this theme, there seems to be a common understanding of the idea of noise shaping as such (padding the least significant, non-linear bits with 0 and then moving quantization noise up into inaudible frequencies by distributing dither wisely). It also seems to be common knowledge why we may want to use noise shaping in theory, and that nonlinearities in those last bits (due to inaccuracies in resistors) cause distortion. The open questions are quite technical.

Where we seem to lack consensus:
  • How much nonlinearity is too much
    • For argument's sake, assume an R2R DAC that is perfectly linear to 8 bits and then has 8 more bits with -0.1 dB deviation
      • Which sounds best when upsampling to 16fs: just using all 16 bits, or noise shaping to 8 bits?
        • If 16 bits, why?
      • Asking because many people seem to prefer still using those less linear resistors with some small deviation
  • Does noise shaping sound artificial? (Not a technical question as such, but people don't share the same taste)
    • To me noise shaping, especially higher-order noise shaping, sounds unnatural vs. Gaussian dither
      • Yes, this is subjective preference, but in the end we do all this to enjoy, which is always subjective
        • Understanding why some people enjoy dither over noise shaping would help us make better products, potentially understand our hearing system better and even ourselves
  • I clearly prefer dithered 21 bits (and 24 bits) over noise-shaped (or normally dithered) 20 bits with the Holo May, and I would like to understand why
    • Higher than 20 bits sounds more dynamic
      • Why?
    • The May is only perfectly linear to 20 bits -> an explanation is needed, as in theory noise-shaped 20 bits should carry more information
      • At least if we assume that any deviation means that we should ditch the bit
        • I personally don't know the formula we should use to calculate whether a bit should be kept, other than GoldenOne's suggestion of SNR / 6
    • Others have had similar experiences with other R2R DACs


Thanks for extra confirmation on this. I've seen some speculation that this could be the case.

Tangentially related: I've also noticed that we need more bit depth at higher sample rates to keep the sound "as tall". I don't think this is related, but it's an interesting phenomenon nevertheless. My current favourite is 4fs @ 21 bits. If I increase the sample rate to 8fs, the sound is not only pushed a bit further away from me, it also sounds flatter in comparison. It's a bit of a problem if we want more bits as a function of sample rate while the DAC's capability to keep up is actually going down.
If we accept the hypothesis that a bit of nonlinearity isn't a big deal up to a certain threshold, and that the ear prefers higher bit depth, then I find something peculiar: you like 24 bits and 21, but not 22 or 23.

Now, from the May measurement graph we can see the linearity is perfect up to 20 bits, but to the naked eye the blue trace does not start to deviate significantly until below -134 dB, which would suggest 23 bits as the best compromise.

To gain more insight into the effect of sample rate on bits it would be great if someone could do a linearity measurement of the May with 8, 16 and 32x rate signals.

Finally, I think it would be useful to also characterize the other end of the spectrum. As Goldensound showed in the Cyan thread, in the audible band 18-bit LNS15 is strictly superior to 24-bit flat dither.
Could we go even lower? If we try LNS15 with only 8 bits, does it sound horrible or somewhat okay?

Finally, we still need a plausible theory for why different dithering options have audible differences when all their effects should be significantly below what the human ear can hear.
One explanation is that the human brain does its own noise filtering; this is also why vinyl sounds okay despite measurements suggesting otherwise. I would imagine that in nature most naturally occurring noise is Gaussian, which is why the brain's own noise filtering finds it most natural?
 
Mar 25, 2025 at 7:03 AM Post #12 of 89
Yes, my big question is: why does non-linearity matter at all at those incredibly low signal levels? I probably have an incomplete understanding, but my intuition is that this would only affect really quiet passages of music. What am I missing?
 
Mar 25, 2025 at 8:08 AM Post #13 of 89
Yes, my big question is: why does non-linearity matter at all at those incredibly low signal levels? I probably have an incomplete understanding, but my intuition is that this would only affect really quiet passages of music. What am I missing?
That's the problem: according to science nothing matters below ~-110 dB or above ~15 kHz (depending on age), yet people can reliably pass ABX tests between DAC filters that only affect 20 kHz, and things like noise floor modulation are very audible despite being below these limits.

Our model of the combination of the human ear and the brain's auditory processing is unfortunately incomplete. One hint is that the human ear acts more like a wavelet transform (https://en.m.wikipedia.org/wiki/Wavelet_transform), while most of our audio equipment testing is done using a Fourier transform.
Most people cannot hear pure 20 kHz tones, but 20 kHz content can affect the waveform shape, which the ear is sensitive to, and the wavelet model accounts for that better.
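A tiny illustration of the waveform-shape point (a Python/NumPy sketch; this demonstrates the arithmetic, not audibility): adding a 20 kHz component changes the crest and the maximum slope of a lower-frequency waveform, even though the 20 kHz tone on its own would be inaudible to most listeners:

```python
import numpy as np

fs = 192000
t = np.arange(0, 0.001, 1 / fs)                 # one millisecond
low = np.sin(2 * np.pi * 2000 * t)              # 2 kHz fundamental
both = low + 0.2 * np.sin(2 * np.pi * 20000 * t)

# Peak value and maximum slope both change once the 20 kHz component
# is added, altering the waveform's shape.
print("peak     :", low.max(), "->", both.max())
print("max slope:", np.abs(np.diff(low)).max(), "->", np.abs(np.diff(both)).max())
```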
 
Mar 25, 2025 at 10:08 AM Post #15 of 89
Is there really anyone who can't hear above 20 kHz but could reliably pass ABX tests between DAC filters that don't roll off before 20 kHz?
Goldensound did a nice video on it (the resulting ASR thread was hilarious, since they concluded he must have cheated until Amir came in and declared that filter differences are known to be audible).
There's also this nice article: https://www.audiosciencereview.com/forum/index.php?threads/high-resolution-audio-does-it-matter.11/
 
