When is mp3 transparent? An ABX test of compressed audio versus lossless
Dec 23, 2014 at 7:06 PM Thread Starter Post #1 of 10

Jon Sonne

Member of the Trade: Lucky Ears
Introduction
 
I know there are other threads out there with the same topic. I just want to share my personal experience on compressed audio. For a long time, I was convinced that I was not able to hear the difference between 128 kbps and lossless files. Then I tried the “golden ear challenge” by Philips and passed their 128 kbps vs. reference test. This got me thinking: Was I really able to hear a difference, or was it just pure luck?
 
I’m a scientist, and it could not be helped: I had to conduct an experiment to evaluate whether I could hear the difference between compressed and lossless audio. More specifically, I did an ABX test of 128, 160, 192, 224, 256 and 320 kbps constant bit rate MP3, each against 44.1 kHz 16 bit WAV.
  
 
 
ABX software:
 
I used ABXTester for Mac OSX.
 
For those of you who are not familiar with the app:
In this app you choose two songs. The first song is marked as “A” and the other song is marked as “B”. The app then generates a random list of five songs marked as “X”, each of which is identical to either “A” or “B”. It is now your task to tell the app whether each “X” song is “A” or “B”. You can play song A, B and X as many times as you like before you decide which song X is.
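For anyone who wants to see the idea in code, here is a minimal sketch of one such run. It is not how ABXTester is actually implemented; the function names `make_abx_run` and `score_run` are made up for illustration, and it only covers the hidden assignment and the scoring, not the audio playback.

```python
import random

def make_abx_run(n_trials=5, rng=None):
    """Return a hidden list of 'A'/'B' labels, one per X (hypothetical helper)."""
    rng = rng or random.Random()
    return [rng.choice("AB") for _ in range(n_trials)]

def score_run(hidden, answers):
    """Count how many of the listener's answers match the hidden labels."""
    return sum(h == a for h, a in zip(hidden, answers))

if __name__ == "__main__":
    hidden = make_abx_run(5)
    # A listener who only guesses is simulated by a second random draw.
    guesses = make_abx_run(5)
    print(f"{score_run(hidden, guesses)}/5 correct by guessing")
```

A run of five Xs like this corresponds to one of the "x/5" scores reported in the results.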
 
 
 
Music files:
 
I ripped Workin’ Day And Night by Michael Jackson from the 2001 remastered edition of Off The Wall. I used the iTunes constant bitrate MP3 encoder and WAV encoder.
 
 
 
What I listened for:
 
I listened to the beginning of the track and focused on the maracas.
 
 
 
Set-up:
 
MacBook Pro -> Head-Direct RE-0
 
I used my good ol’ Head-Direct RE-0 and my even older MacBook Pro for this test. I used a parametric equalizer for each audio channel to improve the RE-0’s stereo balance and to alter the RE-0’s frequency response to a more un-coloured sound (to my ears). This really improves the sound quality of these IEMs and makes them sound more like reference full size cans. For a guide on how to use parametric EQ, please take a look at this thread: http://www.head-fi.org/t/615417/how-to-equalize-your-headphones-advanced-tutorial-in-progress
 
 
 
Results:
 
I started with some lower bitrates to get some training. It was actually relatively easy for me to distinguish them from the wav file reference. The maracas sound snappy, dynamic and clear on the reference file, while the compressed files are more diffuse and less dynamic. Here is my data:
 
128 kbps: 5/5 correct
 
160 kbps: 5/5 correct
 
192 kbps: 5/5 correct
 
 
I need more data to say if the above result is significant or not, but I was just very certain that I could tell them apart, so I will leave the result as it is.
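As a rough reference point, the chance of scoring a perfect run by pure guessing can be worked out directly. The sketch below assumes each X is an independent 50/50 guess; the function name `chance_all_correct` is just illustrative.

```python
from fractions import Fraction

def chance_all_correct(trials):
    """Probability of getting every trial right when guessing at 50/50."""
    return Fraction(1, 2) ** trials

if __name__ == "__main__":
    p = chance_all_correct(5)
    print(p)          # 1/32
    print(float(p))   # 0.03125
```

So a single 5/5 run happens by luck about 3% of the time, which is suggestive but rests on only five answers; more runs per bitrate would make the case firmer.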
 
Now, it becomes more interesting above 192 kbps. On 224 kbps there is still clearly a difference, but it is much less obvious. Here are the results:
 
 
1st trial   2nd trial   3rd trial   4th trial   5th trial
4/5         4/5         4/5         5/5         4/5
 
 
Total: 21/25
 
A significant result (one with 95% confidence) can be claimed if the number of correct responses exceeds N/2 + sqrt(N), where N is the number of guesses (Source: http://en.wikipedia.org/wiki/ABX_test). For N = 25 this gives 12.5 + 5 = 17.5, so you need at least 18 correct answers to have a significant result with 95% confidence. Since I got 21 out of 25 correct, it is likely that I can discern the two files.
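The rule of thumb above can be cross-checked against an exact one-sided binomial calculation. This is only a sketch under the assumption that every guess is an independent 50/50 trial; the helper names (`p_value`, `min_correct_exact`, `min_correct_rule`) are made up for this example.

```python
from math import comb, floor, sqrt

def p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    return tail / 2 ** trials

def min_correct_exact(trials, alpha=0.05):
    """Smallest score whose guessing probability is at or below alpha."""
    for k in range(trials + 1):
        if p_value(k, trials) <= alpha:
            return k
    return None

def min_correct_rule(trials):
    """Smallest integer strictly greater than N/2 + sqrt(N)."""
    return floor(trials / 2 + sqrt(trials)) + 1

if __name__ == "__main__":
    print(min_correct_rule(25))    # 18
    print(min_correct_exact(25))   # 18
    for score in (21, 18, 16):
        print(score, round(p_value(score, 25), 4))
```

For 25 guesses both approaches agree on a threshold of 18, so 21/25 and 18/25 clear the bar while 16/25 does not.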
 
 
It became even more difficult with 256 kbps:
 
 
1st trial   2nd trial   3rd trial   4th trial   5th trial
3/5         3/5         3/5         5/5         4/5
 
 
Total: 18/25 correct
 
Even though I made some errors, the result is still significant at the 95% confidence level.
 
 
When I then tried to compare 320 kbps to WAV, I was really in doubt whether I could hear a difference at all. Results:
 
 
1st trial   2nd trial   3rd trial   4th trial   5th trial
3/5         2/5         4/5         3/5         4/5
 
 
Total: 16/25 correct
 
The 16 out of 25 correct answers is not enough to say that it is a significant result at the 95% confidence level. So… I guess I was not able to tell the difference here.
 
 
 
Conclusion:
 
To my great surprise, it is very likely that I am able to hear a difference between lossless WAV and MP3 encoded at bitrates equal to or lower than 256 kbps. But it is unlikely that I am able to distinguish between 320 kbps and lossless audio. The audio quality of the lower bitrate files, e.g. 128, 160, 192 and 224 kbps, was IMO noticeably worse than the quality of the WAV file playback. However, the difference in quality between 256 or 320 kbps and lossless was negligible to my ears.
 
 
 
Considerations for the next experiment:
 
I only focused on a single instrument (the maracas), on a single track. To get to a more general conclusion on compressed audio, I need to test more tracks and focus on different instruments.
 
I discovered after the experiment that the 2001 remastered edition of Off The Wall has a significantly poorer dynamic range (DR09) compared to the original version (DR17). Source: The Dynamic Range Database, link: http://dr.loudness-war.info/album/list?artist=&album=Off+the+Wall. For the next experiment, I will exclusively use tracks with a dynamic range of at least 17.
 
A pair of reference full size cans might improve my success rate. Next time I will do the test with my Sennheiser HD650 in a quiet environment.
 
 
I hope you found this interesting.
 
Cheers,
 
/LuckyEars
 
Dec 23, 2014 at 9:29 PM Post #2 of 10
Seems reasonable. I think most people who listen intently could pass the 128 test without much fuss. Someone with better ears than I can probably do 256, especially if they use markers. And your decreasing trend with bit-rate is exactly what we'd expect to see. I'd be curious how you'd do with another codec, say AAC.
 
Dec 24, 2014 at 11:27 AM Post #4 of 10
  Seems reasonable. I think most people who listen intently could pass the 128 test without much fuss. Someone with better ears than I can probably do 256, especially if they use markers. And your decreasing trend with bit-rate is exactly what we'd expect to see. I'd be curious how you'd do with another codec, say AAC.

Good, it makes sense. I will try with AAC when I have the time.

 
  Did you take any steps to eliminate the clipping during playback inherent to MP3 (and other lossy compression schemes)?

I was not aware that this is a problem. How can you eliminate clipping during playback? 
 
Dec 24, 2014 at 11:41 AM Post #5 of 10
  I was not aware that this is a problem. How can you eliminate clipping during playback? 

 
It can be, especially with tracks that hit 0 dB or close to it for significant periods. The only time I was ever able to detect a difference between a VBR 0 MP3 file and the WAV was on a very hot track, and the added clipping after encoding was measurable. You could lower the level of the source track before encoding or just choose a different track. There are always a few problem tracks that give encoders trouble. It is in any case good practice to try with different tracks, as otherwise you cannot generalize your result...
 
Dec 24, 2014 at 6:41 PM Post #6 of 10
   
It can be, especially with tracks that hit 0 dB or close to it for significant periods ....... You could lower the level of the source track before encoding or just choose a different track.

 
I just looked at the waveform of the track intro in Logic, and it is nowhere near 0 dB, so I guess clipping would not be a problem for this track. The first passage of the song is actually very "quiet":
 
[Image: waveform of the track intro in Logic]
 
 
Quote:
 
The only time I was ever able to detect a difference between a VBR 0 MP3 file and the WAV was on a very hot track, and the added clipping after encoding was measurable.

 
A hot track as in hot treble? I actually chose Workin' Day And Night by Michael Jackson exactly because the intro has a lot of treble in it.
 
   
The only time I was ever able to detect a difference between a VBR 0 MP3 file and the WAV was on a very hot track, and the added clipping after encoding was measurable.

 
Keep in mind that I used constant bit rates. VBR 0 should sound better than a constant 256 kbps. 
 
   
It is in any case good practice to try with different tracks as otherwise you cannot generalize your result...

 
Of course, I will try with some other tracks. My conclusion was just a preliminary conclusion on this single song. 
 
Dec 24, 2014 at 11:18 PM Post #7 of 10
I was not aware that this is a problem. How can you eliminate clipping during playback? 

I don't know what applications you can use on a Mac. On a PC, MP3Gain can reduce the volume of MP3s in 1.5 dB steps. It can scan the files and reduce the playback volume to prevent clipping. ReplayGain tags and an application that supports them can do the same thing without altering the files.
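To give a feel for the arithmetic behind those 1.5 dB steps, here is a small sketch. It is not MP3Gain's actual algorithm, just the dB math; the function name and the example peak of 1.2 (20% over full scale) are assumptions for illustration.

```python
import math

STEP_DB = 1.5  # step size mentioned above

def steps_to_avoid_clipping(peak, step_db=STEP_DB):
    """Number of gain-reduction steps needed to bring `peak` to full scale or below.

    `peak` is the largest decoded sample magnitude, with 1.0 meaning full scale.
    """
    if peak <= 1.0:
        return 0
    over_db = 20 * math.log10(peak)   # how far above full scale, in dB
    return math.ceil(over_db / step_db)

if __name__ == "__main__":
    peak = 1.2                         # hypothetical decoded peak
    steps = steps_to_avoid_clipping(peak)
    new_peak = peak * 10 ** (-steps * STEP_DB / 20)
    print(steps, round(new_peak, 3))
```

A peak of 1.2 is about 1.6 dB over full scale, so one step is not quite enough and two steps (3 dB) are needed.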

Edit: typos
 
Dec 24, 2014 at 11:33 PM Post #8 of 10
  It can be, especially with tracks that hit 0 dB or close to it for significant periods. The only time I was ever able to detect a difference between a VBR 0 MP3 file and the WAV was on a very hot track, and the added clipping after encoding was measurable. You could lower the level of the source track before encoding or just choose a different track. There are always a few problem tracks that give encoders trouble. It is in any case good practice to try with different tracks, as otherwise you cannot generalize your result...

 
That's not exactly correct. You don't need a significant period of loud passages. Nearly all MP3 tracks clip during playback. AAC files do it too. They store the audio information as floating point data, and when it's converted to fixed point on decoding it can clip if the source gets near or to 0 dB. Lowering the level before encoding is probably not the proper solution. Reducing the gain during the decode stage based on ReplayGain tags is probably the better solution if you don't want to actually change the file.
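A toy illustration of that conversion step, as a sketch only: the sample values are invented, real decoders differ in the details, and `to_int16` is a hypothetical helper. It shows decoded floating point samples slightly over full scale being clamped when written as 16 bit integers, and the same samples surviving intact when gain is reduced first.

```python
def to_int16(samples, gain_db=0.0):
    """Convert float samples (1.0 = full scale) to 16 bit ints, counting clips."""
    gain = 10 ** (gain_db / 20)
    out, clipped = [], 0
    for s in samples:
        v = round(s * gain * 32767)
        if v > 32767 or v < -32768:
            clipped += 1
            v = max(-32768, min(32767, v))  # hard clip to the int16 range
        out.append(v)
    return out, clipped

if __name__ == "__main__":
    decoded = [0.5, 0.98, 1.04, -1.02, 0.3]   # made-up decoder output
    _, n_plain = to_int16(decoded)
    _, n_reduced = to_int16(decoded, gain_db=-1.5)
    print(n_plain, n_reduced)   # 2 0
```

Applying the gain before the integer conversion is the point: once a sample has been clamped, turning the volume down afterwards does not bring the lost peak back.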
 
Dec 25, 2014 at 3:40 AM Post #9 of 10
   
That's not exactly correct. You don't need a significant period of loud passages. Nearly all MP3 tracks clip during playback. AAC files do it too. They store the audio information as floating point data, and when it's converted to fixed point on decoding it can clip if the source gets near or to 0 dB. Lowering the level before encoding is probably not the proper solution. Reducing the gain during the decode stage based on ReplayGain tags is probably the better solution if you don't want to actually change the file.

 
If you look at my earlier post, you can see the waveform of the track I used, and it is nowhere near 0 dB, so I think clipping would not be an issue for this particular experiment.
 
Dec 25, 2014 at 9:02 AM Post #10 of 10
  If you look at my earlier post, you can see the waveform of the track I used, and it is nowhere near 0 dB, so I think clipping would not be an issue for this particular experiment.

 
I'd agree. The track you're listening to certainly doesn't seem loud enough to encounter clipping. I only brought it up because when ABX'ing an MP3 against the lossless file, you want to make sure you're checking for compression artifacts and not audible clipping, though it could be argued that the clipping is an inherent part of the compression and that you have to take additional steps specifically to remove it.
 
