I wonder if what's happening here is that you are hearing an architectural issue in the way DAVE works?
What if DAVE is working better because the first-stage WTA is inactive when it is fed 16FS (or 768 kHz sample rate) material? The power consumption within DAVE associated with the arduous primary WTA could be substantially lower if it's not being used.
Alternatively, is there a subtle bug in DAVE's first-stage WTA, or some kind of interaction between the 16FS WTA output and the second-stage WTA input?
Also, haven't you said in the past that there's an issue with the coefficients inside the primary WTA, which is the reason that the HF filter sounds better when playing back 44.1 kHz sample rate material through DAVE, even though "it shouldn't"?
For sure Dave sounds a bit better and measures slightly better via an M scaler or with a 768k input.
And when I first started listening to the M scaler, I could not believe the sound quality improvements of M scaling; and I did seriously entertain the possibility that there was a fault in the WTA coding of Dave.
But for sure, I know that is not the case, for a couple of reasons. Firstly, simulation. A Verilog simulation is not a simulation in the sense that it approximates the output; with a Verilog simulation, if your FPGA module is fed that particular data set, you are guaranteed that exact output (assuming the real FPGA meets timing closure). In the past, simulation was very limited - there was no way I could simulate a WTA filter and get enough data to do an FFT. Today, it's not a problem; I can do a 4 million point FFT from simulation data and find out exactly how well the module is performing, to a level of accuracy that you can't get with real-world measurements. So for example, here is the plot of the THD and noise performance of the output noise shaper from the M scaler:
This is my usual test of distortion that I use for digital modules - can it perfectly reproduce a -301 dB signal? I do this test because I know that depth reproduction relies upon perfect reproduction of small signals in terms of amplitude. What is interesting with this test is that the truncator is perfectly reproducing the dither of the 80 bit test tone; the noise floor you can see at -385 dB comes from the test tone. So actually it has better than 80 bit performance up to 15 kHz. All the modules in Dave pass this test, and so does the M scaler. Also, when I test the filter performance I know for certain it is performing as intended - when you examine FFTs of the filter performance, the side lobes are compared against the ideal; if one coefficient is incorrect (even among half a million others), you will see a difference, and part of my testing process is to ensure they are identical.
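As a toy illustration of the small-signal test described above, here is a minimal Python sketch. It is not the M scaler's noise shaper, whose design is not given in this post; it assumes a simple first-order error-feedback shaper, a quantizer step of 1.0 (one LSB), and a test tone of 0.25 LSB, and all function names are made up for the example. It shows the general principle only: plain requantization loses a tone that is smaller than half a step, while a noise shaper keeps its amplitude and moves the error to higher frequencies.

```python
import math

def tone_amplitude(samples, k):
    """Amplitude of the component at DFT bin k (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def quantize(x):
    """Round to the nearest integer step (assumed: 1 LSB = 1.0)."""
    return math.floor(x + 0.5)

def plain_requantize(signal):
    """Requantize each sample independently, with no error feedback."""
    return [quantize(x) for x in signal]

def noise_shaped_requantize(signal):
    """First-order error feedback (an assumption for this sketch): subtract
    the previous quantization error from the next input, which pushes the
    error towards high frequencies."""
    out, err = [], 0.0
    for x in signal:
        u = x - err          # input corrected by the previous error
        y = quantize(u)
        err = y - u          # error made on this sample
        out.append(y)
    return out

N, K, AMP = 4096, 4, 0.25    # samples, test-tone bin, amplitude in LSB
tone = [AMP * math.sin(2 * math.pi * K * i / N) for i in range(N)]

plain_amp = tone_amplitude(plain_requantize(tone), K)
shaped_amp = tone_amplitude(noise_shaped_requantize(tone), K)

print("plain requantizer :", plain_amp)   # the tone is lost entirely
print("noise shaper      :", shaped_amp)  # close to the original 0.25
```

With plain rounding every sample becomes zero, so the tone disappears. With the error feedback in place, the measured amplitude at the test frequency stays close to the original 0.25 LSB. The real test works at vastly lower levels and needs far more than double-precision arithmetic, which this sketch does not attempt.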
So I know objectively that Dave and the M scaler are correct; but the real proof is when you set it to video mode. In this mode, I select a different set of coefficients for some 16,000 out of the half a million, and merely insert the data into a different point in the SRAM buffer. Everything else is the same; Dave can't know there is a change; THD and noise are identical, the signal path is identical; but when you listen to 2/3 million taps against the full one million you hear a surprisingly big difference - the sound-stage opens up tremendously with the full 1 M taps.
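The coefficient check mentioned earlier (one wrong value among half a million shows up when the filter's FFT is compared against the ideal) can be illustrated in miniature. The sketch below is not the WTA filter, whose coefficients are not published here; it assumes a 255-tap Hann-windowed sinc as a stand-in reference, and the function names and the error size `DELTA` are made up for the example. It relies on a basic property of FIR filters: changing a single coefficient by `DELTA` changes the complex frequency response by exactly `DELTA` in magnitude at every frequency.

```python
import cmath
import math

def windowed_sinc(num_taps, cutoff):
    """Hann-windowed sinc low-pass (stand-in filter for this sketch);
    cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def response(taps, freq):
    """Complex frequency response at freq (fraction of the sample rate)."""
    return sum(c * cmath.exp(-2j * math.pi * freq * n) for n, c in enumerate(taps))

def max_deviation(taps, reference, freqs):
    """Largest difference between two responses over the given frequencies."""
    return max(abs(response(taps, f) - response(reference, f)) for f in freqs)

reference = windowed_sinc(255, 0.25)
freqs = [i / 64 for i in range(33)]   # 0 to half the sample rate

DELTA = 1e-6                          # hypothetical error in one coefficient
faulty = list(reference)
faulty[100] += DELTA

identical_dev = max_deviation(reference, reference, freqs)
faulty_dev = max_deviation(faulty, reference, freqs)

print("identical coefficients:", identical_dev)
print("one wrong coefficient :", faulty_dev)
```

An error of 1e-6 in one tap appears as a flat difference of about -120 dB relative to unity gain across the whole band, so it is most visible in the stop band where the side lobes sit, while an exact match gives a difference of zero.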
So there really is something odd about the full million taps, and running at 705.6/768 kHz, as the sound-stage also collapses at lower sample rates. And I still do not fully understand why this is the case. I have some ideas why, and will be testing these ideas out with Davina.
Rob