It’s a lot more of a complicated world than you think, especially if you’re “asking industry folks”, because they are audiophile industry folks and the only ones allowed to talk to you are marketers, the actual engineers are under NDAs and can’t talk to you. Your article seems to revolve around this issue:
“DSD measures identically in either output mode. And the actual filtering they are using on DSD is EXTREMELY gentle, which allows one of the biggest strengths of DSD to shine out, and that is the transient response. Also, it allows enough ultrasonic noise to enter the ears, and even though we cannot hear it, that ultrasonic quantization noise, unlike random PCM quantization noise, stays harmonically related to what we can hear. Psycho-acoustic experts theorize that this plays a big part in why DSD sounds so 'good' to many people.”
Unfortunately, that’s just marketing. The filtering employed on DSD is very gentle, but it doesn’t affect transient response, so that is not one of its biggest strengths. In fact DSD doesn’t really have a biggest strength; technically it is inferior to PCM, though not to the point of audibility. There is no quantisation noise with DSD, there is only “noise-shaped” dither noise. It is NOT harmonically related to what we can hear, just like noise-shaped dither with PCM, and even if it were related, it is ultrasonic and we cannot hear it anyway. Lastly, psychoacoustic experts do not theorise that this plays any part at all in why DSD sounds good to many people.
I’m not sure if you’re aware but almost all DSD/SACD recordings are converted to PCM for mixing and mastering. The only exceptions are a relatively small number of direct to disk recordings that were mixed live in analog (and therefore are noisier) and not mastered.
G
I am more aware than most of how conversations with some industry folks are filled with marketing muck. (Actually, the one or two 'insiders' I talk to don't give a crap about their NDA, one because he intellectually owns the technology his former company is still selling. Never would I suggest he actually breaks his NDA, but the candor is extreme, and the disdain for marketing is severe.) I was trying to be humble because I am not a programmer or super math guy who understands the most complicated math and noise-shaping principles. But if we must do this.
DSD and PCM both have strengths and weaknesses. Amplitude resolution at high frequencies is a weakness of DSD and a strength of PCM. Perceived SNR at low frequencies is a strength of delta-sigma. And if we are getting past marketing, then let's just drop the DSD moniker altogether and talk about what it is: single-bit pulse density modulation. Now let's go to its basic form (sans noise shaping, just the most basic averaging without the noise-shaping feedback), which uses time-slicing averaging. You can approximate the actual basic resolution over any arbitrary time window; although that isn't how the format is defined, it is a basic, undeniable characteristic of its modus operandi. Then go beyond that and combine the basic averaging with noise shaping to fool the ear into hearing more resolution than is there. Actual resolution vs perceived resolution.
Yes, DSD has an advantage in transient resolution. This is undeniable. The amount of advantage is INDEED moderated by filtering. The averaging itself lengthens the transient response, and furthermore ringing will smear the response as well. Oversampling and gentle filters work with DSD for the same reason they work with PCM, but DSD has much more oversampling to work with, which allows even smoother filtration. The caveat is that this lets ultrasonic noise pass through into the analog system, which causes loss of actual resolution through intermodulation distortion and amplification of harmonic distortion. Also, the faster you go (DSD256, DSD512, DSD1024), the more stress is put on the digital logic, which increases said distortions. So a balance must be found.
You are keen on pointing out DSD weaknesses as reasons to call it inferior. Yes, it is inferior in several ways. I would argue it's superior in other ways. Both PCM and DSD have their strengths and weaknesses. It is pointless to argue about which is better overall, or worse overall. They are different. Now, subjectively, people may have a preference. And there has actually been research done into that. What are your references when you say psychoacoustics has nothing to say on the matter?
Also, to say there is no quantization noise in DSD is a complete falsehood. At its core, before noise shaping, it is a 1-bit PCM system that uses averaging. Each 1-bit pulse DEFINITELY has massive quantization noise. That noise is lessened with simple averaging over time periods. Remember, this is BEFORE noise shaping. Considering that DSD64, for example, is a 64 x 44,100 Hz oversampling system, it is natural to use a time window of 64 pulses. One drawback, since again we are using a time-slicing system, is that the actual resolution really depends on the frequency being quantized. (A way to show this difference in a time-slicing system would be to just use a bigger time period, but that is beyond our scope right now.)
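For anyone who wants to check the rate arithmetic, here is a minimal Python sketch. The 44,100 Hz base rate and the 64x multiplier are the figures mentioned above; the helper names `dsd_rate_hz` and `window_us` and the list of multipliers in the demo are made up for illustration only.

```python
BASE_RATE_HZ = 44_100  # CD sample rate, used here as the DSD base rate


def dsd_rate_hz(multiplier: int) -> int:
    """1-bit sample rate for a DSD variant, e.g. multiplier=64 for DSD64."""
    return multiplier * BASE_RATE_HZ


def window_us(multiplier: int, pulses: int) -> float:
    """Duration, in microseconds, of a window holding `pulses` 1-bit samples."""
    return pulses / dsd_rate_hz(multiplier) * 1e6


if __name__ == "__main__":
    for m in (64, 128, 256, 512):
        # A window of m pulses at DSDm always spans one 44.1 kHz sample period.
        print(f"DSD{m}: {dsd_rate_hz(m):,} Hz, "
              f"{m}-pulse window = {window_us(m, m):.2f} us")
```

A 64-pulse window at DSD64 covers exactly one 44.1 kHz sample period, which is where the roughly 22.68 microsecond figure comes from.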
To understand the resolution in oversampled 1-bit pulse averaging, let's break down the process. (I put the math into ChatGPT to make this go quicker and format it faster. I am sorry if you think that lessens this somehow, but the information is actually spot-on correct.)
Therefore, with 64 oversampled 1-bit pulses every 22.68 microseconds, the average amplitude resolution across that 22.68 microseconds is indeed approximately 6 bits. This means that by averaging the 1-bit pulses, you achieve a resolution that is roughly equivalent to a 6-bit binary representation.
So we have 6 bits now over a time-sliced period. We have gone from a resolution of 6 dB at 1 bit to a much improved 36 dB in a 44,100 Hz PCM sample space, but there is still MASSIVE QUANTIZATION NOISE HERE. It's basic sampling theory. Let's forget that this is a 1-bit system. Make it a 2-, 3-, 4-, or 5-bit pulse in a pulse density system and we are still dealing with amplitude resolution and amplitude error, and therefore quantization noise.
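The numbers above can be checked with a short sketch. It assumes the simplest reading of "averaging", namely counting how many of the N pulses in a window are 1, which gives N + 1 distinguishable levels, and it uses the usual rule of thumb of roughly 6.02 dB per bit. The function names are illustrative, and this is plain averaging with no noise shaping.

```python
import math


def averaging_levels(pulses: int) -> int:
    """Distinct averages of `pulses` 1-bit samples: 0 ones up to `pulses` ones."""
    return pulses + 1


def equivalent_bits(pulses: int) -> float:
    """Binary word length that would give the same number of levels."""
    return math.log2(averaging_levels(pulses))


def approx_range_db(bits: float) -> float:
    """Rule-of-thumb range: 20*log10(2), about 6.02 dB, per bit."""
    return 20 * math.log10(2) * bits


if __name__ == "__main__":
    bits = equivalent_bits(64)
    print(f"64 pulses -> {averaging_levels(64)} levels, about {bits:.2f} bits")
    print(f"6 bits -> about {approx_range_db(6):.1f} dB")
```

Under these assumptions, 64 pulses give 65 levels, or slightly more than 6 bits, and 6 bits correspond to roughly 36 dB.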
The trick of delta-sigma modulation (and yes, it's a good trick) is to move that noise to where we can't hear it. But it's quantization noise, and unlike PCM, which has randomized noise, the noise in DSD is ABSOLUTELY HARMONICALLY RELATED to the actual content we wish to sample.
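To make the noise-moving trick concrete, here is a minimal sketch of a first-order 1-bit delta-sigma modulator in plain Python. This is a textbook toy, not the higher-order modulators used for real DSD, and the function names are made up for illustration. It only shows that a simple low-pass (a 64-pulse moving average) of the 1-bit stream tracks the input closely, meaning most of the error sits at high frequencies; it says nothing either way about whether that error is harmonically related to the signal.

```python
import math


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] -> list of +1.0/-1.0."""
    integrator = 0.0
    feedback = 0.0  # previous 1-bit output, fed back and subtracted from the input
    bits = []
    for x in samples:
        integrator += x - feedback  # accumulate the running error
        feedback = 1.0 if integrator >= 0 else -1.0  # 1-bit quantizer
        bits.append(feedback)
    return bits


def moving_average(values, window):
    """Crude low-pass filter: mean of each run of `window` consecutive values."""
    out = []
    acc = 0.0
    for i, v in enumerate(values):
        acc += v
        if i >= window:
            acc -= values[i - window]
        if i >= window - 1:
            out.append(acc / window)
    return out


if __name__ == "__main__":
    rate_hz = 64 * 44_100  # DSD64 rate
    # One millisecond of a 1 kHz sine at half scale.
    tone = [0.5 * math.sin(2 * math.pi * 1_000 * n / rate_hz)
            for n in range(rate_hz // 1_000)]
    bits = delta_sigma_1bit(tone)
    err = max(abs(a - b) for a, b in
              zip(moving_average(bits, 64), moving_average(tone, 64)))
    print(f"worst 64-pulse average error: {err:.4f}")
```

Every individual pulse is off by a large amount, yet the windowed average stays close to the input because the feedback loop keeps the accumulated error bounded.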
In DSD, ultrasonic noise is typically harmonically related to the audio signal due to the nature of delta-sigma modulation. This harmonic structure can extend well beyond the human hearing range (20 Hz to 20 kHz). Although the ultrasonic frequencies are above the audible range, their harmonic relationships can influence subharmonic frequencies within the audible range through intermodulation distortion, which can enhance the perception of a richer and more complex sound. These frequencies might interact with the auditory system in ways that affect the perception of lower frequencies, potentially adding to the sense of depth and spatiality in the audio. Even if they are not audible in the traditional sense, the brain-ear system can still process them, and they can affect what our brain gives us to hear under 20 kHz.
Also, the human auditory system exhibits nonlinearities, meaning that interactions between ultrasonic frequencies and audible frequencies can generate audible artifacts or enhance existing tones.
And perhaps most important is harmonic masking. Ultrasonic content can have masking effects, altering how lower frequencies are perceived. This can lead to a cleaner and more detailed perception of the mid and low frequencies.
And subjectively speaking, many listeners prefer audio with rich harmonic content, including ultrasonic harmonics, as they may contribute to a perception of higher fidelity and naturalness.
References for all the above:
Psychoacoustics: Facts and Models by Hugo Fastl and Eberhard Zwicker
The Influence of High-Frequency Audio Content on the Perception of High-Resolution Audio (AES paper)
Perceptual Audio Coders: What To Listen For by James D. Johnston
High-Resolution Audio: The Technology, Perception, and Impact by Joshua D. Reiss and Andrew P. McPherson
And yes, of course I am aware of how DSD is edited and mastered. Would you also like a discussion on how DSD-wide works? How it has a 34-tap FIR filter that converts the 1-bit DSD into an 8-bit, 256-level system, with an impulse response similar to that of 96 kHz PCM? How it deals with the redundant samples after filtering (since it doesn't decimate) and further increases perceived resolution by re-noise-shaping the system after the FIR filter? We could look at the AES paper describing it and go over it together.
Of course, there is the Pyramix method of conversion to a format marketed as DXD. The entire mix can be converted to 352.8 kHz, or the conversion can be used only to punch in and punch out for things like cross-fades in simple editing, so the majority of the native DSD is preserved. I believe that as you move to DSD128 the PCM editing rate doubles as well, and when you move to DSD256 it doubles yet again, so at THAT massive PCM rate, who cares what format it is in anyway???
This is all beside the point, though, because there is a niche market that many people, including myself, patronize, which offers music in native DSD with little to no PCM conversion. And yes, classical and jazz music IS what I like; since I have a degree in Classical Piano, it suits me. Also, I love the fact that you can transfer analog tape, which I subjectively feel sounds remarkable, especially in its early generations, directly to DSD.
OH, and back to where we started. You picked up on the part where I was pointing out that there is no difference in the filtering or volume level of DSD in this Topping iteration of the ak4493. I offer it as evidence that their marketing is incorrect, because if DSD bypass (bypassing the volume control and the multi-bit modulator) is in use in fixed volume mode, it will measure differently. The filters should show evidence of it. Yet you took that simple observation and went off on tangents I wasn't even talking about.
I guess it doesn't pay to be humble, because so many people on here assume you know nothing, when they themselves don't understand the more narrow point you were trying to make.
Thanks for the critique! All this extra work helps this enthusiast plebe keep learning. It's folks like you, who make assumptions first, that started pushing me to learn more in the first place.