Blind test: 6 DACs compared

Discussion in 'Sound Science' started by skamp, Jan 28, 2013.
  1. skamp
    That's very impressive. What differences did you hear?
  2. miceblue
    As I mentioned in that other post, sample C sounds a little more dynamic to me versus G. The bass was clearer to me on the K 701's I have and things just sounded a little more airy and punchy. Like I said though, I'm not saying C is the "original" file as maybe the clearer bass is a bad thing.
  3. stv014
    I think I identified most, if not all, of the files with the help of some measurements. Here is what I measured, in alphabetical order of the device names (I do not show the corresponding FLAC letters so that the test is not spoiled). The second frequency response graph in each pair is skamp's RMAA result for comparison. Note that my frequency response graphs (created from the music) have some anomalies in the top octave.
    Realtek ALC663 (pitch correction = 0.9999115) - the ripple in the FR gives it away clearly:
    fr_alc663.png     fr.png
    Sansa Clip+ (pitch correction = 0.9995014) - this was hard to tell apart from the iPod, but the Clip+ has a more rolled-off top octave, and a very slight FR ripple. I am not 100% sure about these two, however, and the channel imbalance does not match the RMAA graphs (it could depend on the volume).
    fr_clip.png     fr.png
    E-Mu 0204 (pitch correction = 1.0, measured as 0.999999995815, because the DAC and ADC share the same clock) - here my FR graph matches skamp's reasonably well:
    fr_emu.png     fr.png
    Galaxy Nexus (pitch correction = 0.9998235) - relatively large and obvious FR errors, perhaps the easiest file to identify:
    fr_galaxy.png     fr.png
    iPod Classic (pitch correction = 0.9995120) - this was hard to tell apart from the Clip+, but the Clip+ has a more rolled-off top octave, and a very slight FR ripple. I am not 100% sure about these two, however, and the channel imbalance does not match the RMAA graphs (it could depend on the volume).
    fr_ipod.png     fr.png
    ODAC (pitch correction = 1.0000974) - there is a slow high frequency roll-off that begins quite early. Its pitch accuracy is also better than that of the various portable players.
    fr_odac.png     fr.png
    Original sample, for completeness only (pitch correction = 1.0, obviously):
    Some interesting observations:
    - not all files seem to be accurately level matched (some DAPs in particular are slightly louder than the source file)
    - portable players can have a pitch error as high as 500 ppm; however, at less than 1 cent, this is still not audible
    - the E-Mu 0204 used for the recording apparently has a slight low frequency roll-off on the left channel, but not on the right channel; this is odd, even if not necessarily audible
    - one of the devices inverts the phase of the output; I will not say which one, because that would partly spoil the test
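
    The "500 ppm is less than 1 cent" observation above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the quoted pitch correction factors can be read directly as frequency ratios (the device's own error would be the inverse, which has practically the same magnitude).

```python
import math

def ratio_to_ppm(ratio: float) -> float:
    """Deviation of a pitch/clock ratio from 1.0, in parts per million."""
    return (ratio - 1.0) * 1e6

def ratio_to_cents(ratio: float) -> float:
    """Pitch offset in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)

# Pitch correction factors quoted in the measurements above.
corrections = {
    "Realtek ALC663": 0.9999115,
    "Sansa Clip+": 0.9995014,
    "Galaxy Nexus": 0.9998235,
    "iPod Classic": 0.9995120,
    "ODAC": 1.0000974,
}

for name, ratio in corrections.items():
    print(f"{name}: {ratio_to_ppm(ratio):+.1f} ppm, {ratio_to_cents(ratio):+.3f} cents")
```

    With these numbers the two portable players land just under 500 ppm, which is still below 1 cent in magnitude.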
  4. skamp

    This test is about audibility.

    They're level matched within 0.02 dB.
  5. stv014
    I admit I would probably not be able to tell most of the files apart, except maybe the Galaxy Nexus (because of the frequency response) and anything that is not level matched accurately enough or has other problems.
    The iPod is louder than the original by 0.23 dB on the right channel, and by 0.07 dB on the left channel. The exact values depend on how it is measured, but even on the FR graphs above it can be seen clearly. The Clip+ (or at least what I think is the Clip+) is also louder than the original by about 0.14 dB on both channels. These are not major differences, but are above the "standard" allowed maximum of 0.1 dB.
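
    To put these dB figures in perspective, here is a small sketch that converts a level difference in dB to an amplitude ratio using the usual 20*log10 convention. The helper name and the labels are illustrative only; the dB values are the ones quoted above.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude (voltage) ratio."""
    return 10 ** (db / 20)

# Level differences relative to the original file, as quoted above.
levels_db = {
    "iPod, right channel": 0.23,
    "iPod, left channel": 0.07,
    "Clip+ (suspected)": 0.14,
    "0.1 dB matching tolerance": 0.10,
}

for label, db in levels_db.items():
    ratio = db_to_amplitude_ratio(db)
    print(f"{label}: {db:+.2f} dB = {100 * (ratio - 1):.2f}% higher amplitude")
```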
  6. skamp

    Please publish positive ABX logs of a 0.23 dB difference.
  7. stv014
    I have seen someone at HydrogenAudio successfully ABX a difference of only 0.1 dB, but that was under "ideal" conditions (with a particularly well suited sample, very fast switching, low ambient noise, etc.). I will give the files I suspect to possibly sound different a try. By the way, if you want a positive ABX log, just have a look at miceblue's one; C is louder overall than G, but that is of course not necessarily the reason for the positive result, even if the description of a "little more dynamic" sound does fit a slight loudness difference.
  8. skamp
    Also, the Clip+ suffers from stereo crosstalk which measures at -54 dB. I tested it with a stereo file that has content in only one channel, and that's a lot more obvious than a 0.23 dB difference. With normal music though, I couldn't hear a difference, and I would be surprised if anyone could.
  9. skamp
    Also, ReplayGain finds a 0.01 dB difference between C and G. I can make a new recording of C later, though; I won't replace the current file, obviously.
  10. stv014
    That seems to depend on what ReplayGain scanner you use:
    Given that it calculates an overall gain for both channels (so that 0.07 dB on one channel and 0.23 dB on the other is combined to 0.15 dB on both, for example), it seems to mostly agree with my measurements.
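
    The 0.15 dB figure is roughly what comes out if the two channel offsets are averaged in the power domain. The sketch below shows only that arithmetic; it is not how a real ReplayGain scanner works (which also applies loudness filtering and statistics over short blocks), and the function name is hypothetical.

```python
import math

def combined_gain_db(left_db: float, right_db: float) -> float:
    """Average two per-channel level offsets in the power domain, in dB."""
    left_power = 10 ** (left_db / 10)
    right_power = 10 ** (right_db / 10)
    return 10 * math.log10((left_power + right_power) / 2)

# Per-channel offsets quoted for the iPod: 0.07 dB (left), 0.23 dB (right).
print(round(combined_gain_db(0.07, 0.23), 2))  # 0.15
```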
  11. skamp
    I corrected the channel imbalance on C: download C2.flac.
  12. stv014
    foo_abx 1.3.4 report
    foobar2000 v1.2.2
    2013/02/10 19:13:28
    File A: F:\abx\D.flac
    File B: F:\abx\E.flac
    19:13:28 : Test started.
    19:15:31 : 00/01  100.0%
    19:15:44 : 00/02  100.0%
    19:17:04 : 01/03  87.5%
    19:17:24 : 02/04  68.8%
    19:18:08 : 03/05  50.0%
    19:18:45 : 04/06  34.4%
    19:19:46 : 05/07  22.7%
    19:27:40 : 06/08  14.5%
    19:30:38 : 06/09  25.4%
    19:30:59 : 07/10  17.2%
    19:32:23 : 08/11  11.3%
    19:34:06 : 09/12  7.3%
    19:35:38 : 10/13  4.6%
    19:41:25 : 11/14  2.9%
    19:42:09 : 12/15  1.8%
    19:42:51 : Test finished.
    ----------
    Total: 12/15 (1.8%)
  13. skamp
    stv014: was that really blind? You analyzed the file first and you were aware of frequency response variations prior to your listening test. Also, did you set the number of trials to 15 before starting, or did you just stop when you were satisfied with your score?
  14. stv014
    Knowing what A and B are does not in any way make an ABX test invalid, only the "do you prefer A or B" type of tests.
    I stopped the test because I do not have more time for it currently, and I am reasonably confident of hearing a difference, even though it is admittedly minor and difficult to notice. If I were only going for a minimum sufficient result, I would have stopped at 13 trials, because the probability of guessing was already under the required 5% there. Note also that the log begins with 2 of the 3 failed attempts; if I were "cheating", I would never have included those, but would simply have reset the test until the first attempt was successful.
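
    The percentages in the ABX log are consistent with a one-sided binomial probability: the chance of getting at least that many trials right by pure guessing. A minimal sketch follows, with an illustrative function name. Note that this calculation assumes the number of trials was fixed in advance.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers in `trials` by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Scores taken from the last three lines of the log above.
for correct, trials in [(10, 13), (11, 14), (12, 15)]:
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.1%}")
```

    This prints 4.6%, 2.9% and 1.8%, matching the log, and shows why 10/13 was already below the 5% threshold.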
  15. miceblue
    Oh dang, I'll need to try the ABX test with the new file when I get the chance.
