Tidal lossless listening test, what's going on here?
Nov 24, 2014 at 8:24 PM Thread Starter Post #1 of 75

limpidglitch

Headphoneus Supremus
Joined
Oct 29, 2008
Posts
3,389
Likes
195
Location
Sandnes
 
Apart from 5 trials being on the short side of sufficient, the test seems reasonable. If you can score 4/5 or better four times in a row, the result should be pretty solid.
I did just that and got 16/20 correct, P(X≥16) = 0.006.
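That 0.006 is just the binomial tail probability of scoring 16 or more out of 20 by guessing. A quick sketch (the function name is mine, not from any test tool):

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more
    correct answers in n two-alternative trials by pure guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(round(binom_tail(16, 20), 3))  # 0.006
```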

I've never managed anything like that in self-administered tests comparing lossless against high-bitrate lossy, be it AAC, MP3 or Ogg Vorbis, so why can I do it in this case?
 
The test caused a little stir on Reddit, but I'm not really content with the answers given there, even with a Tidal official taking part in the discussion.
Might it just be the result of choosing a 'bad' or sub-optimal encoder, one that prioritizes CPU time over transparency?
 
Nov 24, 2014 at 8:43 PM Post #2 of 75

Head Injury

Headphoneus Supremus
Joined
Sep 11, 2009
Posts
5,404
Likes
440
I downloaded the files used for the first song, "Flesh & Bone". Unfortunately the files are slightly out of sync, so I can't get a good null in Audacity; at least, I don't know how to align them accurately.
 
I'll download the rest and throw them in Dropbox for people to experiment with.
 
Nov 24, 2014 at 9:09 PM Post #4 of 75

castleofargh

Sound Science Forum Moderator
Joined
Jul 2, 2011
Posts
9,483
Likes
4,875
I'll try later, but let's be honest: having Tidal set up the test to determine whether they sound better than Spotify or iTunes is like asking Tyll Hertsens if he likes Hawaiian shirts.
 
Nov 24, 2014 at 9:22 PM Post #5 of 75

Head Injury

Dropbox link to test files: https://www.dropbox.com/sh/72c6yijkg1nwkze/AAC9uCKCXBnVg-9paXxXjWU2a?dl=0
 
No null because of the sync issue, but Plot Spectrum shows a slight increase in bass and treble in the lossless version of the Killers song. The bass difference is about 1 dB at 60 Hz, rising to 3 dB at 20 Hz; likewise, the treble difference is about 1 dB from 10 to 16 kHz. I don't know whether this is a result of the codec itself or some EQ on Tidal's part. Hopefully someone with better testing software will find something more concrete.
 
Will check some of the others to see if I notice anything similar.
 
Daft Punk shows virtually the same boosts to bass and treble (it looked like just treble at first, until I increased the sample size).
 
Hey, the Eagles get similar treatment: treble increases by nearly 3 dB in the 11-12 kHz area on the lossless copy.
 
I did a quick test on one of my own FLAC files, using a 320 kbps MP3 for comparison (I don't have an AAC encoder). There are no obvious changes to the spectrum from the lossy encoding, apart from a tiny (fraction-of-a-dB) increase below 30 Hz and above 15 kHz, and a sharp drop-off at 20 kHz.
 
If AAC behaves even remotely similarly, I think it's safe to say Tidal is applying EQ to either their lossy or their lossless files in this test. Whether they do so in the streaming service itself, I'm not sure.
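The kind of averaged-spectrum comparison described above can be reproduced in a few lines. This is a sketch only, with synthetic sine tones standing in for the two decoded files (in practice you would load the actual WAVs), and `avg_spectrum_db` is an illustrative helper:

```python
import numpy as np

def avg_spectrum_db(signal, sr, n_fft=4096):
    """Average magnitude spectrum in dB: average |FFT| over successive
    Hann-windowed frames, then convert the mean to dB."""
    hops = range(0, len(signal) - n_fft, n_fft)
    mags = [np.abs(np.fft.rfft(signal[i:i + n_fft] * np.hanning(n_fft)))
            for i in hops]
    return 20 * np.log10(np.mean(mags, axis=0) + 1e-12)

# Synthetic stand-ins for the two versions: same tone, one boosted by 1 dB
sr = 44100
t = np.arange(sr * 2) / sr
a = np.sin(2 * np.pi * 1000 * t)
b = a * 10**(1 / 20)  # +1 dB overall

diff = avg_spectrum_db(b, sr) - avg_spectrum_db(a, sr)
bin_1k = round(1000 * 4096 / sr)  # FFT bin nearest 1 kHz
print(round(diff[bin_1k], 2))  # ~1.0 dB difference at 1 kHz
```

A real comparison would also need the two files time-aligned first, which is exactly the problem reported with these samples.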
 
Nov 24, 2014 at 9:32 PM Post #6 of 75

limpidglitch

  I'll try later, but let's be honest: having Tidal set up the test to determine whether they sound better than Spotify or iTunes is like asking Tyll Hertsens if he likes Hawaiian shirts.

 
Absolutely, but it would be nice to have something concrete before we start pointing fingers.
And importantly, they're not comparing themselves against iTunes or Spotify, but against their own lossy service. So if the lossless tier sounds better, it might mean they're hampering their own budget service, intentionally or not.
 
Nov 25, 2014 at 3:00 PM Post #7 of 75

CoiL

Member of the Trade: Wood Audio Accessories & Modifications
Joined
Jan 23, 2013
Posts
6,336
Likes
3,188
Location
Estonia
Did it twice with an Aune T1 (modded) + Fidelio X1: first 4/5, then 5/5. I had issues with the Killers song, which imo doesn't have good mastering.

 
Nov 25, 2014 at 3:41 PM Post #8 of 75

bigshot

Headphoneus Supremus
Joined
Nov 16, 2004
Posts
23,091
Likes
4,737
Location
Hollywood USA
Take the lossless file, make your own lossy version, and do the test again. It sounds like there's a problem with the lossy files.
 
Nov 25, 2014 at 8:36 PM Post #9 of 75

shstux

New Head-Fier
Joined
Nov 25, 2014
Posts
7
Likes
13
When I saw the Tidal listening test, I was really annoyed by the poor reporting of results (i.e. the lack of stats).
I went and built an 'alternative' test which uses an ABX-style format to test *exactly the same samples*.
 
You can do the test here: http://abx.digitalfeed.net
 
You can choose to do 5, 10, or 20 trials per track, and if you do 10 or 20 trials it will give you per-track results as well as overall results (as p < .01, p < .05, etc.; I haven't included a specific p-value display).
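For reference, the scores needed to cross those thresholds fall straight out of the binomial distribution; a quick sketch (the function names are illustrative, not part of the site):

```python
from math import comb

def tail(k, n):
    """Chance probability of k or more correct in n forced-choice trials."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

def threshold(n, alpha):
    """Smallest score whose chance probability falls below alpha."""
    return next(k for k in range(n + 1) if tail(k, n) < alpha)

print(threshold(10, 0.05), threshold(10, 0.01))  # 9 10
print(threshold(20, 0.05), threshold(20, 0.01))  # 15 16
```

So with 10 trials you need 9/10 for p < .05 and a perfect 10/10 for p < .01; with 20 trials, 15/20 and 16/20 respectively.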
 
There can be a bit of glitching when switching between A, B and X. Unfortunately that's just a limitation of the Web Audio API (or the way I'm using it). It's worth reading the instructions: there are hotkeys for everything, so you don't need to click buttons with your mouse (and can keep your eyes closed and concentrate).
 
I just knocked this up over a couple of evenings, so feedback would be awesome.
Also, FF and Chrome only. Sorry, Web Audio API support isn't good enough in other browsers.
 
As I said, these are the same samples so if they were out of sync on the original test they will be on here too. I would love to re-create the samples to avoid that, but I don't know what encoders they used. Depending on how people like this test, I'll probably make a different version which uses files I've encoded to known parameters.
 
Finally: if anyone has a good article on Signal Detection Theory and how to use it for ABX, let me know. I'm using basic frequentist statistics at the moment because I can't find my old lecture notes on SDT.
 
Disclosure: I'm using Google Analytics and putting final results in the URL so I can see how people are doing overall, but that's all the logging I do.
 
Nov 25, 2014 at 10:26 PM Post #10 of 75

limpidglitch

  When I saw the Tidal listening test, I was really annoyed by the poor reporting of results (i.e. the lack of stats).
I went and built an 'alternative' test which uses an ABX-style format to test *exactly the same samples*. […]

 
Nice work! That's more like how it should have been done in the first place. Although you need some real stamina to do the 5x10 and 5x20 trials. :)
Apart from a few typos (Bonferroni, not Benferonni), the only major issue is the out-of-sync problem, which I believe artificially inflated my scores (using FF 33.1).
 

 
Nov 25, 2014 at 10:39 PM Post #11 of 75

shstux

   
Nice work! That's more like how it should have been done in the first place. Although you need some real stamina to do the 5x10 and 5x20 trials. :)
Apart from a few typos (Bonferroni, not Benferonni), the only major issue is the out-of-sync problem, which I believe artificially inflated my scores (using FF 33.1).
 

 
Thanks for the feedback. I really wish Tidal had provided info on what compression they used so I could re-create their samples without weird sync problems.
 
I'll correct that spelling error, too =)
 
To make sure there are no systematic sync problems in the *listening* system, three independent streams are played (i.e. A, B and X are each played distinctly) rather than just switching to the A or B stream when X is chosen. I was worried that with only two streams, it might glitch going from X to the incorrect option but not from X to the correct option. The streams are always started in the same order, too, regardless of whether X is A or B. So any *systematic* sync problems should only come from the source files. Should.
 
I'm thinking I might soon modify the system so that people can pass a query-string option pointing to a settings file and run their own samples through my interface (i.e. they would provide a list of files, images, etc., and I would read them into my code and handle the playback). Maybe in the next few days.
 
Nov 25, 2014 at 10:45 PM Post #12 of 75

Head Injury

   
Thanks for the feedback. I really wish Tidal had provided info on what compression they used so I could re-create their samples without weird sync problems. […]

They're using 320 kbps AAC; the info's under "About the High Fidelity Test".
 
Allowing users to upload their own files would be awesome, I was about to suggest it! I'm sure many people won't test themselves because they're unable or unwilling to install a plugin like the foobar2000 one; if we had a website to point them to, they'd really have no excuse. Maybe you could even hook it up to do conversions server-side, so they just upload one lossless file and the rest is handled by their settings?
 
Nov 25, 2014 at 11:03 PM Post #13 of 75

shstux

  They're using 320 kbps AAC; the info's under "About the High Fidelity Test".
 
Allowing users to upload their own files would be awesome, I was about to suggest it! I'm sure many people won't test themselves because they're unable or unwilling to install a plugin like the foobar2000 one; if we had a website to point them to, they'd really have no excuse. Maybe you could even hook it up to do conversions server-side, so they just upload one lossless file and the rest is handled by their settings?

 
 
Initially, at least, people running tests would need to host the audio files, images, and a configuration file on their own systems (but could use something like Cloudflare's free tier to get them cached on a CDN).
The main issue with things like server-side conversion is that I'm running on $10-a-month hosting, so I think they'd throw me off pretty quickly =)
 
Still, I could pretty readily provide instructions for setting up CDN caching. I think Dropbox actually sets the CORS headers (required on the server hosting the files), so someone could keep their FLAC files etc. in their public Dropbox and use them from there. Once I've got something going I'll ask for more feedback and ideas.
 
 
One option for people who want to test using files they have on their local machine (or files they download) is this: http://webabx.nfshost.com/
I haven't used it at all, so I don't know how well it worked. I just looked at it for
 
 
What I'd *really* like to do is implement a more complex psychometric testing system where the lossy compression bitrate switches (i.e. from 64 kbps to 96 kbps to 128 kbps, etc.) depending on whether people answer correctly. For example, it might start at 96 kbps, drop to 64 kbps after enough wrong answers, or step up to 128 and then 192 after correct ones. I'll have to look through my old study notes, because it's been a long time since I studied psychometrics =)
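That kind of adaptive procedure is essentially a staircase. A minimal sketch of the idea, with an illustrative bitrate ladder and function names (nothing the site actually implements):

```python
# Simple 1-up/1-down staircase over a ladder of bitrates.
LADDER = [64, 96, 128, 192, 256, 320]  # kbps, example values

def run_staircase(answers, start=1):
    """Step up the ladder after a correct answer, down after a wrong one.
    `answers` is a list of booleans (True = listener answered correctly).
    Returns the sequence of bitrates presented."""
    i, presented = start, []
    for correct in answers:
        presented.append(LADDER[i])
        i = min(i + 1, len(LADDER) - 1) if correct else max(i - 1, 0)
    return presented

print(run_staircase([True, True, False, True]))  # [96, 128, 192, 128]
```

The bitrate at which the sequence starts oscillating is then an estimate of the listener's transparency threshold.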
 
 
Also, for anyone interested, I've put up a 128 kbps (VBR) Opus test: http://abx.digitalfeed.net/opus.html
It uses exactly the same UI as the other one, so check the URL to make sure you're in the right place.
It uses the same lossless samples as the base, but I compressed them to Opus (and then decoded back to lossless).
Turns out the bad sync was in the source files, because this test certainly doesn't have those problems.
 
Nov 25, 2014 at 11:04 PM Post #14 of 75

limpidglitch

  They're using 320 kbps AAC; the info's under "About the High Fidelity Test".
 

 
That's not terribly precise. No mention of which encoder, what settings within that encoder, or whether normalization was applied.
 

 
