What I'm trying to say is, what's important is not whether it *looks* the same, but whether it *sounds* the same. It's called psychoACOUSTICS for a reason.
Every so often, some guy comes up with the bright idea of subtracting the compressed waveform from the original and measuring how much residual is left, then calling the encoder with the least difference from the original the best.
And every time this happens, he gets laughed out of any serious sound compression discussion forum.
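To make the flawed idea concrete, here is a minimal sketch of that naive metric (the function name and the toy signals are mine, just for illustration). It scores codecs by the RMS of the sample-by-sample residual, and the toy example shows one way it fails: a tiny phase shift of a pure tone is perceptually irrelevant, yet produces a large residual.

```python
import numpy as np

def naive_difference_score(original, decoded):
    """The flawed metric described above: RMS of the sample-by-sample
    residual between original and decoded waveforms. Lower = 'better'
    by this (wrong) criterion. Inputs: equal-length float arrays."""
    residual = original - decoded
    return np.sqrt(np.mean(residual ** 2))

# Why it fails: an 18 kHz tone shifted by half a radian of phase sounds
# identical to the original, yet the residual is far from zero.
fs = 44100
t = np.arange(fs) / fs
original = np.sin(2 * np.pi * 18000 * t)
shifted = np.sin(2 * np.pi * 18000 * t + 0.5)  # inaudible phase shift
print(naive_difference_score(original, shifted))  # large, despite identical sound
```

By this score the phase-shifted "codec" looks terrible, while an encoder that silently deleted an audible mid-band tone could score better. That is the whole problem with sample-domain comparison.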
If you really want to 'see' (um) what I mean...
http://www.audio-illumination.org/fo...4&hl=wondering Skip to dibrom's post after reading the first post.
Looking at graphs seems like a very scientific thing to do on most occasions, but it is simply the wrong methodology for this topic.
Or, more accurately: it is possible in theory to see from a graph whether one codec is better than another, but it is very counterintuitive. For example, an overall frequency analysis may show that one codec preserves as much treble energy as the original while another codec has much less. Yet it may turn out that the codec with more treble energy has it all messed up in the form of pre- and post-ringing, while the other codec correctly encoded only the parts of the treble in the music that we are actually likely to notice. At minimum, you must look at a spectrogram, which shows time as well as frequency.
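The point about needing time *and* frequency can be sketched in a few lines (this is a bare-bones STFT magnitude spectrogram of my own, not any particular tool's output). A click concentrates its energy in one time frame of the spectrogram, whereas a single overall FFT smears it across every frequency bin with no hint of *when* it happened, which is exactly the information you need to spot pre- and post-ringing around transients.

```python
import numpy as np

def spectrogram(x, win_len=1024, hop=512):
    """Minimal STFT magnitude spectrogram: rows = time frames,
    cols = frequency bins. Unlike one overall FFT, this preserves
    WHEN each frequency occurs."""
    window = np.hanning(win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        frame = x[start:start + win_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)

# A click at the midpoint of a 1-second signal: only the frames that
# overlap the impulse carry energy; the rest are silent.
fs = 8000
x = np.zeros(fs)
x[fs // 2] = 1.0  # impulse ("click")
S = spectrogram(x)
print((S.sum(axis=1) > 1e-9).sum(), "of", len(S), "frames contain the click")
```

Ringing shows up on such a plot as energy leaking into the frames *before* and *after* a transient, which an overall frequency analysis can never reveal.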
Even then you are likely to make many mistakes. For example, after 'waveform subtraction', one codec may show a little deviation from the original during the attack of an instrument, whereas another may apparently show far more deviation during the 'steady state' phase (i.e. the time when the instrument is holding a note). You might conclude from the graph that the latter codec is the poorer one. But it may turn out that the first codec's deviation manifests as offending pre-echo and an alteration of the instrument's perceived character. (The sound in the attack phase is much more important in determining an instrument's perceived character than the sound in the steady state; in fact, when the attack phases are edited out of a sound clip, people even have difficulty telling apart instruments from completely different families.)
And yet everything in the above examples could be the other way round: the codec with roughly the correct amount of treble COULD indeed be the more accurate-sounding codec, and the codec with more distortion in the steady phase COULD indeed be the one distorting the character of the instrument. There is simply no way for human eyes to tell which is the case. When all is said and done, it is simply better to trust good ears, PROPERLY UTILIZED: well trained in detecting psychoacoustic encoder artifacts, with good acuity, in good mental condition (e.g. well-rested), and working with ABX comparisons.
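What makes ABX results trustworthy is simple binomial statistics: if the listener cannot really hear a difference, each trial is a coin flip. Here is a small sketch (function name mine) of the standard significance calculation for an ABX session.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` ABX trials by pure guessing (p = 0.5 per trial). A small
    p-value is evidence the listener genuinely hears a difference."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 14/16 correct is very unlikely by chance; 9/16 is entirely unremarkable.
print(abx_p_value(14, 16))  # ~0.002: a real audible difference
print(abx_p_value(9, 16))   # ~0.40: indistinguishable from guessing
```

This is why a claim like "I can hear the difference" carries no weight on its own, while 14/16 in a blind ABX run does.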
Why do psychoacoustic encoders have to produce output WAVs that differ from the original at all? Because it is simply impossible to fit all the information in the original WAV into the bitrates these encoders use. All is not lost, however, because it is also impossible for the human ear to take in all the information available in its acoustic surroundings. (Based on the anatomy of the human hearing system, especially the auditory nerve, it is estimated that auditory information is sent to the Central Nervous System at a rate of only 10-20kbps!!!) For this reason, there is a whole range of waveforms that map to the same representation after processing by the human auditory system, and the waveform can therefore be given a 'simpler' representation (informationally speaking, not conceptually) than PCM.
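Taking that 10-20kbps figure at face value (it is the post's estimate, not mine), a bit of arithmetic shows just how much headroom a psychoacoustic encoder has to throw information away:

```python
# Back-of-envelope: CD-quality stereo PCM vs the cited 10-20 kbps
# estimate of information actually reaching the CNS.
fs = 44100        # samples per second
bits = 16         # bits per sample
channels = 2
pcm_kbps = fs * bits * channels / 1000  # = 1411.2 kbps
for cns_kbps in (10, 20):
    print(f"PCM carries ~{pcm_kbps / cns_kbps:.0f}x more bits "
          f"than the {cns_kbps} kbps estimate")
```

Even a generous reading leaves a factor of ~70-140 between what PCM stores and what the ear can deliver, which is why a 128kbps encode can in principle sound transparent despite discarding over 90% of the bits.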
Since humans process sound neither purely in the time domain (PCM waveform) nor purely in the frequency domain (spectrogram), the human ear has different ideas about what sounds more similar or more different than the ideas you would get from a waveform or a spectrogram.