NOS dacs and upsampling

Mar 30, 2025 at 5:07 PM Post #46 of 73
I do encourage you, and everybody who has never tried, to go and fool around with files at bit depths where the background noise is clearly audible. Then you can apply various noise shaping, or progressively increase the bit depth and find out where the noise stops being noticeable. Because obviously, just because something super loud is noticed doesn't mean the same thing at a very quiet level will also be noticed. The entire concept of a hearing threshold exists because our sensitivity to various things is always limited somewhere.

Spoiler(things start sounding the same well before 20bit). At a purely digital level, the question of what we do with the bits below 20 is irrelevant to a human ear and a modern playback system.
Actually this is what I've done today. I used 8fs and started from 16 bits, gradually raising the bit depth. It surprised me how much I liked 16bits, but after a while I started to miss the depth that a higher bit depth brings. I could hear constant improvement in sound until 19bits, but between 19 and 20 bits I didn't like the change anymore. While 20 bits seemed more... real?, I lost track of when the decay ends. With 20 bits the sound just decays into blackness. That made the sound feel like it's too soft, while it's likely "more correct". Then when I went to 21 bits, I realized that now the noise started to rise again as it wasn't perfectly linear. However, this noise was irritating my ears. Doing an A/B between 19 and 21 bits, 19 was preferable of the two, and I prefer both over 20. All the testing was done with Gaussian dither (no noise shaping).
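For anyone who wants to repeat this kind of sweep outside a player, here is a minimal Python sketch. It is not the tool used for the listening above: the test tone, the dither level of 0.5 LSB RMS and the helper names `requantize` and `noise_floor_dbfs` are all assumptions made for illustration. It rounds a tone to 16 through 21 bits with Gaussian dither and reports the resulting error level relative to a full-scale amplitude of 1.0.

```python
import math
import random


def requantize(samples, bits, dither_lsb=0.5, rng=None):
    """Round float samples in [-1, 1) to `bits` bits after adding Gaussian dither.

    dither_lsb is the dither standard deviation in LSBs (an assumed value).
    """
    rng = rng or random.Random(0)
    scale = 2 ** (bits - 1)
    out = []
    for x in samples:
        q = round(x * scale + rng.gauss(0.0, dither_lsb))
        q = max(-scale, min(scale - 1, q))  # clip to the legal integer range
        out.append(q / scale)
    return out


def noise_floor_dbfs(original, processed):
    """RMS of the added error, in dB relative to a full-scale amplitude of 1.0."""
    err = [p - o for o, p in zip(original, processed)]
    rms = math.sqrt(sum(e * e for e in err) / len(err))
    return 20 * math.log10(rms)


# 1 kHz tone at -20 dB, 0.1 s long, at 8fs (8 x 44.1 kHz = 352.8 kHz)
fs = 352_800
tone = [0.1 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs // 10)]

floors = {bits: noise_floor_dbfs(tone, requantize(tone, bits)) for bits in range(16, 22)}
for bits, floor in floors.items():
    print(f"{bits} bits: error level about {floor:.1f} dB re full scale")
```

Each extra bit should lower the printed figure by roughly 6 dB. This only models the digital side; it says nothing about how a particular DAC behaves at those levels.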

So @plumpudding2 it may well be that I just want to hear the noise floor (or limits of the resolution), but juuuuust barely (and capabilities of my headphones).

And thanks @gregorio for answering more constructively, even though I was emotional. As an enthusiast it's just so hard to talk about these themes freely anywhere so that the discussion would stay together until the problem was solved or at least the right questions understood.

P.S. It's been very interesting to follow the discussion here generally.
 
Mar 30, 2025 at 5:23 PM Post #47 of 73
As for digital/analog, I think it's a false conversation. Vinyl has the amplitude accuracy and noise floor of a bad digital solution, but it's analog(continuous stuff). It definitely does not have infinite resolution and strictly speaking, nothing has. Certainly not our ears that are made of discrete elements triggering, or not, neurons, which act in a binary way. Action potential is reached, or it isn't.
That this somehow results in us feeling a fluid realistic experience to the point we constantly convince ourselves it's the objective reality, that's the impressive and most mysterious part of it all.
I was referring to live music, as in someone singing or playing a guitar and listening to a live performance, not playback via digital/vinyl. To say neural pathways are discrete is an oversimplification. Sure, sigmoid functions have been used to model activations of artificial neural networks (though ReLU is more favored for hidden layers now). But the strength of the signal and the timing and frequency of firing play an important role too. I was not trying to claim that we need infinite bit-depth; I was pointing to the fact that looking for the lowest audible level may not be the best approach.
 
Mar 30, 2025 at 5:30 PM Post #48 of 73
Actually this is what I've done today. I used 8fs and started from 16 bits, gradually raising the bit depth. It surprised me how much I liked 16bits, but after a while I started to miss the depth that a higher bit depth brings. I could hear constant improvement in sound until 19bits, but between 19 and 20 bits I didn't like the change anymore. While 20 bits seemed more... real?, I lost track of when the decay ends. With 20 bits the sound just decays into blackness. That made the sound feel like it's too soft, while it's likely "more correct". Then when I went to 21 bits, I realized that now the noise started to rise again as it wasn't perfectly linear. However, this noise was irritating my ears. Doing an A/B between 19 and 21 bits, 19 was preferable of the two, and I prefer both over 20.
Was this just dither or with noise shaping? With dither alone, DAC non-linearity will become an additional variable.
 
Mar 30, 2025 at 7:09 PM Post #49 of 73
Was this just dither or with noise shaping? With dither alone, DAC non-linearity will become an additional variable.
Just gaussian dither as I prefer it. It's somehow more lively with R2R dac.
 
Mar 30, 2025 at 7:40 PM Post #50 of 73
Just gaussian dither as I prefer it. It's somehow more lively with R2R dac.
Thanks for clarifying. From your post, you were not looking for audible noise (as was originally suggested) but you noticed improvements in depth between 16 and 19 bits. While I understand your preference is for Gaussian dither and 8x for dynamics, is it not true that you mentioned earlier that you heard improved depth with noise shaping (even though you did not prefer it)? What happens at even higher rates such as 16x?

The reason I am asking these questions is to move the focus away from the lowest audible levels to when bit-depth stops to matter (assuming you are interested in exploring it). Depending on the noise shaper you are using, the effective bit-depth may still be higher than 20bits.
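To put a rough number on "effective bit-depth", here is a small Python sketch of the usual textbook estimate for an idealised noise shaper. It assumes white quantisation noise and a noise transfer function of the form (1 - z^-1)^order; real shapers, including whatever a given upsampling program uses, will differ. The function names `shaping_gain_db` and `extra_bits` are made up for this example.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def shaping_gain_db(osr, order):
    """In-band SQNR gain of an ideal noise shaper at oversampling ratio `osr`.

    order=0 means plain oversampling with no shaping (gain = 10*log10(osr)).
    """
    numerator = (2 * order + 1) * osr ** (2 * order + 1)
    return 10 * math.log10(numerator / math.pi ** (2 * order))


def extra_bits(osr, order):
    """The same gain expressed as equivalent extra bits within the original band."""
    return shaping_gain_db(osr, order) / DB_PER_BIT


for osr in (4, 8, 16):
    for order in (0, 1, 2):
        print(f"{osr}x, order {order}: about {extra_bits(osr, order):.1f} extra bits")
```

Under these assumptions, a stream at a given word length and 8x or 16x can carry a few more bits of in-band resolution than its nominal bit depth once shaping is applied. Whether any of that is audible is a separate question.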
 
Mar 31, 2025 at 2:21 AM Post #51 of 73
Thanks for clarifying.
Thanks for the help!
From your post, you were not looking for audible noise (as was originally suggested) but you noticed improvements in depth between 16 and 19 bits. While I understand your preference is for Gaussian dither and 8x for dynamics, is it not true that you mentioned earlier that you heard improved depth with noise shaping (even though you did not prefer it)? What happens at even higher rates such as 16x?
I mean, noise shaping does improve depth as well, but when I turn noise shaping on, it changes the sound qualitatively. It cleans up the sound, but the sound kind of loses its edge. It doesn't sound as snappy, like some realism is lost. Imaging gets better and it's easier to place everything in space, but at the same time it no longer feels like these things happen in the same air space with you. Or at least this is how I would describe it.

Regarding 8fs, I actually tend to prefer 4fs as that one is more in my face. I've just now been using 8fs for testing as then it's easier to also test the effect of noise shaping while keeping other parameters unchanged. However, now that I found the beauty of 19bits, I can keep some of the in-your-face nature of 4fs, while enjoying bigger sound. I may test even higher rates with these lower rates and see how I like it. Previously it's been way too laid back for my preferences.

Maybe as a child of the digital era there needs to be something wrong for it to sound familiar. And I'm only half-joking. When I tried 16bits for the first time, it made me emotional and music sounded like how it used to sound in my childhood, but on steroids. If it's too analog, it doesn't sound like music to me anymore. I love a small touch of digital as I learned to associate it with something cool (like, with CDs you could just select which track you wanted to listen to and all the cool music (like Darude - Sandstorm) was on CDs, while my father's boring old music was on vinyls and cassettes).
The reason I am asking these questions is to move the focus away from the lowest audible levels to when bit-depth stops to matter (assuming you are interested in exploring it). Depending on the noise shaper you are using, the effective bit-depth may still be higher than 20bits.
I understand. And this is what I've occasionally referred to when saying things like "bit depth and noise shaping are trying to solve the same problem". However, to me increasing the actual bit depth that I feed to my DAC brings very obviously superior sound compared to bringing in the noise shaper. They both contribute to depth, for example, but with noise shaping it just becomes boring. I've for example used both noise shaping and dither for 8fs. And currently all the testing is being done with PGGB.

And regarding "when bit-depth stops to matter": assuming that the measurements available in Stereophile reflect the linearity of my unit, to me the sweet spot seems to be 19bits with Gaussian dither (as May's linearity is flat until 20bits but I prefer to keep it at 19). But these preferences tend to come and go. I've also noticed that different settings synergize differently with different headphones. Right now I've been using MDR-Z1R.

EDIT: I did some further testing, still with 4fs. With that I still prefer 21bits over any of 16-20. With 4fs the nonlinearity of 21bits gives just enough spice to make things interesting and 4fs clearly needs those bits in order to not sound flat. Everything sounds pretty much perfect to my ears with MDR-Z1R. However, I've at least reached my goal and I think I understand the anatomy of all this a bit better. It's quite interesting how much the effect of lowering the bit depth differs between 4fs and 8fs.
 
Mar 31, 2025 at 5:24 AM Post #52 of 73
Decorrelation is not the same as removing quantization noise. It prevents the quantization noise from being modulated by the music signal, but it does not eliminate the quantization error itself.
Dither absolutely does eliminate the quantisation error itself, assuming the correct amount of standard TPDF dither. That’s the whole point of dither.
Unless you apply noise shaping, dither is not a substitute for bit-depth.
Increasing bit depth reduces the amount of quantisation error, dither (with or without noise shaping) eliminates quantisation error entirely.
For example, 8-bit audio with dither is not the same as 24-bit audio with dither and 8-bit audio will have an increased noise floor due to quantization error. While dither may reduce harmonic distortion caused by quantization error, 8-bit audio will still sound less detailed and more smoothed out compared to 16-bit or 24-bit audio.
Dither will eliminate all quantisation error, converting it to white (uncorrelated) noise. The dither noise floor will be very significantly higher with 8 bit because there’s significantly more quantisation error, so any detail below about 48dB will be buried in the dither noise floor. With 16bit, that dither noise floor is below both the noise floor of recordings and audibility given a reasonable playback level. This is why the CD standard was specified with 16bits rather than 8bit and even 16bit is somewhat overkill.
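A small Python sketch may help show what both sides of this exchange are describing. It is an illustration under assumed values, not a measurement: the input level of 0.3 LSB, the sample count and the helper name `quantize` are made up. Without dither, a value below half an LSB always rounds to zero, so the error follows the signal. With TPDF dither (the sum of two uniform random values, 2 LSB peak to peak) the average output tracks the input, and what remains is noise at a fixed level.

```python
import math
import random


def quantize(x_lsb, rng=None):
    """Round a value expressed in LSBs; add TPDF dither when an rng is given."""
    if rng is not None:
        x_lsb += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return math.floor(x_lsb + 0.5)


rng = random.Random(1)
n = 200_000
level = 0.3  # a constant input of 0.3 LSB, well below one quantisation step

plain = sum(quantize(level) for _ in range(n)) / n          # undithered average
dithered = sum(quantize(level, rng) for _ in range(n)) / n  # TPDF-dithered average

# Total error with TPDF dither is 0.5 LSB RMS (1/12 + 2/12 LSB^2 of variance).
# For 8 bits, one LSB is 2**-7 of a full-scale peak of 1.0.
floor_8bit_dbfs = 20 * math.log10(0.5 * 2 ** -7)

print(f"undithered average: {plain:.3f} LSB, dithered average: {dithered:.3f} LSB")
print(f"8-bit TPDF noise floor: about {floor_8bit_dbfs:.1f} dB re full scale")
```

The error power does not vanish. It stops depending on the signal and becomes a steady noise floor, which for 8 bits lands near the 48dB figure mentioned above.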
At a 44.1 kHz sampling rate, you're limited by the Nyquist frequency of 22.05 kHz. Yes, you can apply psychoacoustic noise shaping to gently push some of the noise above 2–3 kHz, and then more steeply between 20 kHz and 22 kHz. But there’s very little headroom to work with, and some of that shaped noise still remains within the audible band. You might gain the equivalent of 1 or 2 extra bits, but that’s not enough to accommodate a significant reduction in bit depth—certainly not down to 1, 5, or even 8 bits.
There’s plenty of headroom for noise-shaped dither at 16bit 44.1kHz: human hearing drops off dramatically above about 14kHz and is almost non-existent above 17kHz in adults. In fact, the equivalent of 3-4 extra bits is routinely achieved with no audible dither noise (at reasonable listening levels). You’re correct that it wouldn’t be enough to accommodate a reduction in bit depth to 1 or 5 bits and probably not to 8 bits, but then no one ever tries to!
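As a rough illustration of what noise shaping does, here is a first-order error-feedback requantiser in Python. This is a generic textbook structure, not the psychoacoustic shapers actually used for 16bit 44.1kHz releases. The test signal, the block-average "low-pass" and the function names are assumptions for the example.

```python
import math
import random


def tpdf(rng):
    """TPDF dither, 2 LSB peak to peak."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)


def requantize(samples_lsb, shaped, seed=0):
    """Round samples (in LSBs) to integers, optionally feeding the error back."""
    rng = random.Random(seed)
    out, prev_err = [], 0.0
    for x in samples_lsb:
        v = x - prev_err if shaped else x  # subtract the previous sample's error
        y = math.floor(v + tpdf(rng) + 0.5)
        prev_err = y - v                   # total error made on this sample
        out.append(y)
    return out


def low_freq_error_rms(samples_lsb, output, block=32):
    """Crude low-pass: RMS of the error averaged over non-overlapping blocks."""
    err = [y - x for x, y in zip(samples_lsb, output)]
    means = [sum(err[i:i + block]) / block
             for i in range(0, len(err) - block + 1, block)]
    return math.sqrt(sum(m * m for m in means) / len(means))


signal = [2.5 * math.sin(2 * math.pi * n / 200) for n in range(64_000)]
flat_rms = low_freq_error_rms(signal, requantize(signal, shaped=False))
shaped_rms = low_freq_error_rms(signal, requantize(signal, shaped=True))
print(f"low-frequency error, flat dither: {flat_rms:.4f} LSB")
print(f"low-frequency error, shaped:      {shaped_rms:.4f} LSB")
```

With the feedback on, the output error becomes the difference between successive errors, so its low-frequency part shrinks while the total error power roughly doubles and moves toward high frequencies. Practical shapers use higher-order filters weighted toward where hearing is least sensitive.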
Most software-based upsampling filters fit this definition of relatively fast roll-off starting around 19–20 kHz, with the exception of a few designed specifically for lower-bitrate formats. These software filters often offer selectable characteristics, such as linear-phase, minimum-phase, or mixed-phase. While not all may be optimal, most are objectively superior to the filters implemented within typical DAC hardware.
If they are optimal (do not cause any audible artefacts) then “objectively superior” is just “on paper”, so they might be preferable to someone who likes better performance figures on paper but they cannot be audibly preferable.
Even if the effective dynamic range for a specific genre of music is, say, 60dB, the music signal varies continuously (i.e., it is infinitely granular) within this range. If 10bits were used to accommodate a 60dB dynamic range, there would be exactly 1024 levels available to represent it; with one of the bits being the sign bit, that is really about 512 levels in each of the positive and negative directions. The question is whether 512 levels are enough, given we are used to living in an analog world where the resolution is infinite.
Sorry but we do not live in an analogue world, we live in an acoustic world and regardless, neither the acoustic nor the analogue worlds have infinite resolution. The “levels available to represent” are largely irrelevant because we do not output those discrete levels, we reconstruct the continuously varying analogue signal from those discrete levels. The question isn’t if the number of discrete levels available with 10bit is enough because no one uses 10bit and it’s proven that even just 2 levels is enough (SACD for example). The roughly 32,000 available levels with 16bit is way more than enough. Lastly, very few recordings have a dynamic range of 60dB, the vast majority have half or less than half that dynamic range.
I was pointing to the fact that looking for the lowest audible level may not be the best approach.
Why? If it’s below the lowest audible level, i.e. is inaudible, or is so low in level it can’t even exist as sound in the first place, then what does it matter?
With dither alone, DAC non-linearity will become an additional variable.
Why would dither noise that’s well below the noise floor of any recording cause a non-linearity in a DAC?
The reason I am asking these questions is to move the focus away from the lowest audible levels to when bit-depth stops to matter (assuming you are interested in exploring it).
Again, why would you want to move the focus away from what’s audible to what isn’t?
I mean, noise shaping does improve depth as well, but when I turn noise shaping on, it changes the sound qualitatively. It cleans up the sound, but the sound kind of loses its edge.
The term “noise shaping” is short for “noise-shaped dither”. The applied dither is noise-shaped, not the recording. It does not affect the recording, it does not “clean up the sound”, change the sound qualitatively or improve the depth. If applying noise shaped dither makes any audible difference at all, there is something very seriously wrong with how it’s being applied!

G
 
Mar 31, 2025 at 10:56 AM Post #53 of 73
I mean, noise shaping does improve depth as well, but when I turn noise shaping on, it changes the sound qualitatively. It cleans up the sound, but the sound kind of loses its edge. It doesn't sound as snappy, like some realism is lost. Imaging gets better and it's easier to place everything in space, but at the same time it no longer feels like these things happen in the same air space with you. Or at least this is how I would describe it.

Regarding 8fs, I actually tend to prefer 4fs as that one is more in my face. I've just now been using 8fs for testing as then it's easier to also test the effect of noise shaping while keeping other parameters unchanged. However, now that I found the beauty of 19bits, I can keep some of the in-your-face nature of 4fs, while enjoying bigger sound. I may test even higher rates with these lower rates and see how I like it. Previously it's been way too laid back for my preferences.

Maybe as a child of the digital era there needs to be something wrong for it to sound familiar. And I'm only half-joking. When I tried 16bits for the first time, it made me emotional and music sounded like how it used to sound in my childhood, but on steroids. If it's too analog, it doesn't sound like music to me anymore. I love a small touch of digital as I learned to associate it with something cool (like, with CDs you could just select which track you wanted to listen to and all the cool music (like Darude - Sandstorm) was on CDs, while my father's boring old music was on vinyls and cassettes).
Thanks for walking me through your perspective and explaining the reasoning behind your preference. With R2R DACs, increasing the sampling rate pushes out-of-band images further away, which reduces reconstruction artifacts and timing errors - subjectively improving depth perception. However, the trade-off is often a reduction in the ‘liveliness’ that many people seek in R2R DACs. For me, the loss of depth is more noticeable than the loss of liveliness, so my personal preference is to use higher sampling rates.
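To make "pushes out-of-band images further away" concrete, here is a short Python sketch. It assumes the DAC behaves as an ideal zero-order hold at whatever rate it is fed and that any upsampling filter has already removed the images below the new rate. Both are simplifications, and the function names are made up for this example.

```python
import math


def image_frequencies(f_hz, fs_hz, count=2):
    """Image frequencies k*fs - f and k*fs + f for k = 1..count."""
    return [k * fs_hz + sign * f_hz
            for k in range(1, count + 1) for sign in (-1, 1)]


def zoh_level_db(f_hz, fs_hz):
    """Level of a component at f_hz after an ideal zero-order hold: |sin(x)/x|."""
    x = math.pi * f_hz / fs_hz
    return 20 * math.log10(abs(math.sin(x) / x))


tone = 10_000  # a 10 kHz tone
for fs in (44_100, 8 * 44_100):
    first = image_frequencies(tone, fs)[0]
    print(f"fs = {fs} Hz: first image at {first} Hz, "
          f"about {zoh_level_db(first, fs):.1f} dB after the hold")
```

At 1fs the first image of a 10 kHz tone sits at 34.1 kHz and is only mildly attenuated by the hold. At 8fs it moves out to 342.8 kHz and comes out considerably lower, before any analog filtering.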

When keeping the sampling rate constant and increasing the bit-depth, you’re effectively squeezing in more information (by increasing the precision of the intermediate samples that were inserted). But beyond a certain point - around 19 bits or more - DAC nonlinearity starts introducing distortion, which adds a coloration that you’ve found you enjoy. On the other hand, applying noise shaping helps to linearize the response and can provide a few more ‘effective bits.’ My own preference is to use noise shaping to extract even more effective resolution, as I tend to value the added detail and texture it brings.

While the PCM path in R2R DACs does have its limitations - some of which stem from nonlinearity, others from settling time issues at higher sampling rates - the DSD path bypasses many of those constraints. Personally, I prefer the DSD path for these reasons.
 
Mar 31, 2025 at 10:21 PM Post #54 of 73
However, the trade-off is often a reduction in the ‘liveliness’ that many people seek in R2R DACs. For me, the loss of depth is more noticeable than the loss of liveliness, so my personal preference is to use higher sampling rates.

DDC and signal reclocking before it goes to DAC is the key to get liveliness back from the depth increase with oversampling :)
 
Apr 1, 2025 at 8:59 AM Post #55 of 73
DDC and signal reclocking before it goes to DAC is the key to get liveliness back from the depth increase with oversampling :)
FWIW, I’ve always felt that the ‘liveliness’ and wider presentation are colorations caused by out-of-band images. I’m not particularly sensitive to the liveliness aspect - it’s more anecdotal, based on what others have reported - but on a 2-channel setup, the change in perceived width has been noticeable.
 
Apr 1, 2025 at 9:23 AM Post #56 of 73
FWIW, I’ve always felt that the ‘liveliness’ and wider presentation are colorations caused by out-of-band images. I’m not particularly sensitive to the liveliness aspect - it’s more anecdotal, based on what others have reported - but on a 2-channel setup, the change in perceived width has been noticeable.

Oversampling is more transparent to the hardware limitations of the system, while NOS pretty much masks those limitations. Improving the timing stability and noise after oversampling will bring back that live, propulsive feeling: very palpable, yet still an unmistakably oversampled (more correct-sounding) presentation. In other words, like the feel of high-end vinyl but without the surface noise.
 
Apr 1, 2025 at 10:53 PM Post #57 of 73
HiRes is less fatiguing to me. The sound is very similar to CD, except it's softer, like something sharp and annoying is gone. The difference might be hard to spot when listening for 30 seconds; rather, it's a more pleasant experience after 30 minutes, no fatigue vs some fatigue.
 
Apr 2, 2025 at 4:44 AM Post #58 of 73
FWIW, I’ve always felt that the ‘liveliness’ and wider presentation are colorations caused by out-of-band images. I’m not particularly sensitive to the liveliness aspect - it’s more anecdotal, based on what others have reported - but on a 2-channel setup, the change in perceived width has been noticeable.
I maintain the belief that the distortion/nonlinearity etc of NOS or PCM with Gaussian dither etc must somehow compensate for something that's missing. The human ear + brain isn't dumb, if something sounds more natural it probably is more natural.
Theory would predict the most technically correct, lowest distortion lowest noise floor signal to be the most natural sounding. If the human ear prefers something else theory must be incomplete I reckon.
However, as I type this I do realise hardware stores sell fancy OLED TVs by turning the saturation and contrast all the way up to artificial levels. I guess if the human eye wants "better than real" then maybe the ear does too.

I vaguely recall an MQA study where they found that the ear isn't able to differentiate properly between "true" hi-res content and artificial hi-res, where a 44.1 kHz recording got injected with an artificial high-frequency signal with a downward-sloping power spectrum that mimics true hi-res. I believe that was the basis for their "origami" unfolding: no need to be lossless, according to their testing. Maybe the NOS out-of-band images produce a similar result?
 
Apr 2, 2025 at 7:35 AM Post #59 of 73
But beyond a certain point - around 19 bits or more - DAC nonlinearity starts introducing distortion, which adds a coloration that you’ve found you enjoy.
As already mentioned, anything that occurs at “19bits or more” cannot exist as acoustic sound (at a reasonable listening level) and therefore what anyone has “found you enjoy” obviously cannot have anything to do with sound or hearing.

G
 
Apr 2, 2025 at 10:26 AM Post #60 of 73
I maintain the belief that the distortion/nonlinearity etc of NOS or PCM with Gaussian dither etc must somehow compensate for something that's missing. The human ear + brain isn't dumb, if something sounds more natural it probably is more natural.
Theory would predict the most technically correct, lowest distortion lowest noise floor signal to be the most natural sounding. If the human ear prefers something else theory must be incomplete I reckon.
However, as I type this I do realise hardware stores sell fancy OLED TVs by turning the saturation and contrast all the way up to artificial levels. I guess if the human eye wants "better than real" then maybe the ear does too.

I vaguely recall an MQA study where they found that the ear isn't able to differentiate properly between "true" hi-res content and artificial hi-res, where a 44.1 kHz recording got injected with an artificial high-frequency signal with a downward-sloping power spectrum that mimics true hi-res. I believe that was the basis for their "origami" unfolding: no need to be lossless, according to their testing. Maybe the NOS out-of-band images produce a similar result?
Yes, personal preference plays a significant role here - and so does the playback chain. Objectively less accurate options can still be preferred, sometimes because they mask flaws, and other times because they emphasize certain characteristics. It’s the same reason why some people prefer tube gear, even though it introduces harmonic distortion (I use them too).

However, these preferences aren’t universal, and it’s difficult to generalize across all listeners. Ideally, I would hope that most of us lean toward a preference for accurate playback. A good example is the development of the Harman target curve - while the majority of listeners may prefer it over other alternatives, there’s still room to personalize it based on individual taste or the headphone.
 
