Perceptual coding

  1. VNandor
    I'm no expert on any of the topics mentioned, but I know that if you process music you'll probably have to use oversampling. It's built into virtually every piece of DSP software nowadays. For playback, 44.1kHz is enough to reproduce the signal perfectly if the signal is bandlimited to 22kHz. I don't know if today's filters are capable enough but I'd assume yes. As far as I know the only advantage 24 bit has over 16 bit is bigger dynamic range and headroom.
    As far as I know the ear and brain consistently trash out/dismiss a portion of the information and mp3 takes advantage of this. You could argue about how well it works but what I know is that mp3 works for me perfectly so far. It not only manages to convey the art and move me emotionally just as much as lossless but I literally can't spot any difference between mp3 and lossless (as I already mentioned).
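For reference, the two numbers behind the 44.1kHz and 16 bit vs 24 bit points can be worked out in a few lines. This is only a rough sketch: it assumes an ideal quantizer and a full-scale sine wave (the usual textbook 6.02N + 1.76 dB figure), and the function names are made up for illustration.

```python
import math


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0


def dynamic_range_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine.

    Equivalent to the textbook approximation 6.02 * N + 1.76 dB.
    """
    return 20.0 * math.log10(2 ** bits) + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    print(f"Nyquist limit at 44.1kHz: {nyquist_hz(44_100):.0f} Hz")
    for bits in (16, 24):
        print(f"{bits} bit: about {dynamic_range_db(bits):.1f} dB")
```

On paper that puts 16 bit at roughly 98 dB and 24 bit at roughly 146 dB. Real converters and real rooms don't reach the 24 bit figure, so treat it as an upper bound.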
  2. FFBookman

    This shows how flawed ABX tests really are. I probably would fail a certain percentage of ABX tests also. 
    It's a horrible test for determining the true render quality of a final mix, for a variety of reasons: shortness of sample time, quick switching between poorly labeled sources, the loss of passive listening when aware of the test, level matching, mix/production tricks to sweeten the sound, and usually unfamiliar material in an unfamiliar environment, to name just a few.
    [Note that when mixing we AB things all the time, but this is in the mix stage, in very small segments, singular tracks, or at least singular decisions. This is not flawed since there is no wrong answer, it's creative  :wink: ]
    Given a more natural, long-term listening test, on a variety of systems, at various times of day, with a variety of speaker sets, you should really see your accuracy rise.
    If you can't spot the difference between lossy and lossless given different speaker sets, different times of day, different environments -- and your playback rig is not absolute rubbish -- I really suspect a focus or hearing problem. Of course that gets into sensitive stuff because I'm no doctor or shrink, so ya know, end of topic there.
  3. FFBookman

    But this is the key -- Bigger dynamic range and headroom is a huge advantage over 16 bit. People always think about audio as up-down, like a drawn waveform flat on the screen, but remember it is 3 dimensional. It's actually far more complex than 3 dimensional because it's binaural L-R, and we simulate binaural stereo in fantastical ways with our production systems.
    Headroom is actually depth, too.  Width. Space.  More gradients in all directions. Less pixelation on all parameters.
    What I don't understand is why everyone "gets" this with visual but believes it's the opposite, or that the basics are different, for audio.
    If your jpeg is 3200k it either:
    a) has many more colors than a 320k jpg
    b) is much larger in dimension than the 320k jpg
    c) is both more colorful and larger in dimension than the 320k jpg
    If we know the jpeg is the same pic with the same dimensions, and you believe dimensions for audio are runtime only, then it must have 10x the color depth. But the dimensions in audio are far more data intensive than 2-dimensional jpeg art, so the headroom is far more helpful.
    The first thing you notice when hearing 24bit audio rendered properly is the largeness of the sound and the room it was recorded in. Headroom can also be considered elbow room.

    As far as your perception of mp3, there's all kinds of reasons why everything might sound the same to you.
    First you have the source file.
    You have the rendering through the DAC.
    You have the rendering and amplification in analog stage.
    You have output amps and wiring.
    You may have software EQ and processing.
    You have interconnects to the speakers.
    You have billions of speakers in the world.
    You have trillions of rooms to set them in.
    You have even more thoughts and ideas going through your head as you listen to music.
    Any of the above after the source file can and will degrade the signal. I just think the initial degradation of the source file is unnecessary and more psychologically damaging than believed.  After that it's up to you, not much I can do to help :wink:
  4. WraithApe
    Quick-switching between properly transcoded, level-matched files (from a single lossless source) is the only reliable way to determine a difference. If you rely on A-B testing you are inevitably at the mercy of the transitory nature of auditory memory (just a few seconds), subjectivity and expectation bias. Like I said before, the fact that it works for 192 and below vs FLAC consistently proves the test works. Even at 192, the differences are not night and day and you have to train your ear to listen for subtle compression artefacts.
    I can assure you there is nothing wrong with my playback rig or my hearing - as it happens, I had an audiology test last year and got an above average result. I've also completed the Golden Ear challenge. I have to say, this line of argumentation about people's gear and/or hearing is rather weak IMO.
  5. FFBookman

    That's fair, I think it's weak too. I'm not saying you can't hear or have bad gear, I'm just hoping you can quantify that, and you have.
    I think you are on the cusp of a breakthrough. Take your methodical ABX tests and extend them. Your first statement about quick switching being the "only reliable way" is not true, as you explain yourself. It's not reliable at all. It breaks down even good listeners like yourself. The quick-switch ABX test for lossless vs lossy is garbage and you should never trust its results. The quicker you come to believe that the happier you'll be. Relying on a false metric is worse than relying on none at all.
    Ignoring the ear's ability to adapt and then detect after several minutes of taking in material is a mistake. Ignoring the brain's ability to layer the sound onto familiar surroundings - render spaces - is a mistake. Ignoring day parts and environmental variables is a mistake.

    My idea for a proper listening test is similar to the one used by Ayre audio, who happen to make the amazing sounding PonoPlayer amongst other gear I can't afford.
    Basically - it is a blind test using 50 of your favorite songs coded in different resolutions. They are rendered on a DAP that allows you some ability to rewind and review, but generally just plays them on volume-matched shuffle in sets of AB and displays a code for your journal entry. You are encouraged to play this through every speaker system you already own, at your own leisure, and write down results.  It will play 2 versions of each song back to back, and let you determine which one did what for you.
    You are also encouraged to video record yourself while listening to songs for review and commentary on your body movements during playbacks.
    The final piece is after reviewing the video and your notes, you get to pick which one you enjoyed more - A or B, and own it. 

    Charlie Hansen from Ayre has described his circuit builders' listening test as a multi-day event, including at work and at home, on a variety of systems and environments, with the focus being on deep listening and full immersion into the musical number. After consuming the music in such a way, his engineers are told to decide based purely on feel, i.e. which one makes them feel better, and not turn to specs or scopes or theory.
  6. Music Alchemist
    A controlled test with a single variable that determines whether you can reliably distinguish between two things is infinitely better than a casual test with endless variables that, at best, determines your subjective feelings and proves nothing.
  7. WraithApe
    FF, I can't help but feel you're wilfully ignoring my key point - if ABX testing consistently produces statistically significant results for 96/128/192 vs FLAC (and it does), that proves it is a viable means of testing. The fact that it generally breaks down when you get to 256/320 vs FLAC just goes to show how effective perceptual coding is at that resolution.
    Anyway, I'm outta here. To paraphrase Deckard, I was done with this subject when I came into this thread; I'm even more done with it now!
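For anyone wondering what "statistically significant" means for an ABX run: the usual approach is a one-sided binomial test against pure guessing (a 50% chance per trial). Here is a minimal sketch of that calculation. The function name and the trial counts are made up for illustration and are not taken from anyone's actual results in this thread.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials`
    by guessing alone (each trial is a fair coin flip)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # Hypothetical scores for a 16-trial session.
    for score in (9, 12, 14):
        print(f"{score}/16 correct: p = {abx_p_value(score, 16):.4f}")
```

A small p-value means the score is unlikely to come from guessing. A large one only says this run did not show a difference, which is not the same as proving there is none.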
  8. VNandor
    Bigger dynamic range is bigger dynamic range. End of story.
    I thought the headroom you have left tells you how much you can amplify the signal before you clip it.
    Yeah I don't trust those shills on the internet who do tests independently, I listen to the salesman's BS instead and let the prices do the talking. However they're salesmen and not EEs after all, so they might not know what they're talking about, and because of that I'm planning to do my own set of tests. So far no positive results with ABX tests but I'm planning to capture the output of my different DACs with different cables, amps, different humidity, moon phase, at different parts of the day, etc.
    Anyways you aren't here to discuss technical details, you've pretty much settled down in that regard I guess. You said you want to know why lossy survives. I'm surprised you didn't figure it out. People buy it -> it will survive. Why do people buy it? They either don't care all that much (nor do the artists, since they let their stuff be released in mp3), or don't have a system that could reveal the differences, or maybe they care but they think differently than you do. Or evil aliens who benefit from the use of mp3 mind-control us humans into using mp3. Ok, seriously, I can't come up with any other good reasons right now.
  9. FFBookman

    Keep believing it if you want to -  it's the only test that tells you 320 = 1200.
    I know it's crap to believe there is no difference because your tests tell you so, and if my millions of words can't convince you few remaining holdouts oh well.
    It's been educational. I do believe they have gotten better at perceptual coding. But I just don't understand why they bother anymore. It was only created due to the largeness of initial audio files on the internet. That was literally 24 years ago they started down that path. I stream Netflix and several other things day and night, but we for some reason still think we don't have bandwidth for lossless.
    My point is philosophical - why believe a test that tells you less is the same?  Especially when so many people are shouting at you that you must be crazy to listen to that test?  Oh well, rock on folks. If you can get your mojo going with 10% of the file that's fine, you'll always have the ability to go up in quality sometime down the road. 
    As a person who mixes music I am telling you that you are really missing out. Something in the path isn't working if you hear 24bit rendered properly and think "hmm, sounds like an mp3 to me".
  10. FFBookman

    Better resolution includes better resolution on everything.  The entire signal. Not just headroom.  Headroom is a way to diminish what it is. It's overall resolution.
    It's 16,000,000 possible points of data  (24bit) vs 64,000 possible points (16bit).
    How are you gonna take a grid of 16 million down to 64 thousand? Reduce the bit depth and then apply dither, which is fuzz, over top, to cover up the artifacts. There are several different fuzz wave shapes to choose from, and they each sound different. Unless you've AB'd 16bit dithered mixes from 24bit masters you just don't know what's being pulled out. It's impossible to describe in words. It's space, shape, depth, etc.
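To make the numbers concrete: 24 bit gives 2^24 possible sample values and 16 bit gives 2^16. The sketch below counts them and shows one common way of going from 24 bit to 16 bit with dither. It assumes TPDF (triangular) dither, which is just one of the several dither shapes mentioned above, and the function names are made up for illustration. It is not a description of what any particular mastering tool does.

```python
import random


def levels(bits: int) -> int:
    """Number of distinct sample values at a given bit depth."""
    return 2 ** bits


def requantize_24_to_16(sample_24: int, rng: random.Random) -> int:
    """Reduce one signed 24-bit sample to 16 bits using TPDF dither.

    One 16-bit step equals 256 24-bit steps. The dither is the sum of two
    uniform random values, giving a triangular spread of +/- one 16-bit step.
    """
    step = 1 << 8
    dither = rng.uniform(-step / 2, step / 2) + rng.uniform(-step / 2, step / 2)
    quantized = round((sample_24 + dither) / step)
    # Clamp to the signed 16-bit range.
    return max(-32768, min(32767, quantized))


if __name__ == "__main__":
    print(f"24 bit: {levels(24):,} levels")
    print(f"16 bit: {levels(16):,} levels")
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    print([requantize_24_to_16(1000, rng) for _ in range(8)])
```

Any single output lands on a neighbouring 16-bit step, but the average over many samples stays close to the original value. That is the point of dither: rounding error turns into low-level noise rather than distortion tied to the signal.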
  11. Music Alchemist
    No one is saying that high bit rate lossy and lossless are the same in terms of data; it's just that all the evidence demonstrates that there is no audible difference between them, and none of the evidence has disproved that. If you want to disprove it, plenty of people have shown you how to attempt to do so.
  12. FFBookman

    When every producer and musician I know proves that it exists, and has for over 20 years now, nothing you can post will convince me it doesn't. Such strange types, the lossy defenders.  You truly believe that your body can't distinguish between the same song coming at you using 320k bandwidth or 3200k? 
    It's so obvious to me when hearing hi-res music that it's amazing you want a test to tell you it's the same as lossy.  Is there anything else in your life where you believe that your tests show that math and industry experts are wrong?
    May I refer you to http://www.grammy.com/quality-sound-matters to find all sorts of experts that clearly disagree with you. The people that actually make and distribute music disagree with you. Scientists and internet physicists can't tell you anything about making and recording music, because they have no clue. Your experts are the wrong experts.
    If there was no difference we'd all mix with lossy tracks --- so much more processing power available, so much hard drive space saved.  
  13. FFBookman

    I've stated several times that I think ABX tests are total garbage for mixed music testing. Quick switching is misleading unless you did the master mix yourself and you know what should be there.
    Since the test is garbage all the data it generates is garbage, aka misleading, aka invalid, so any theories supported by that data could also be invalid.
    All that said, of course higher bitrate lossy will sound better than lower bitrates, since they are throwing less away. Perhaps their coding even got better at fooling us. That should be obvious, I'm not arguing whether it has improved, or whether 320k is better than 192k. Of course it is. But 3200k is better than 320k whether you believe you can detect it or not. It's the master. It's the original.
    But it's basic to me, it's data, it's math, it's the total soundstage: All the parameters, all the frequencies, all the non-frequencies (air), all the layering and interplay of signal ---- all gets put into a data stream.  The size of that data stream is predetermined.
    24/88 - the production standard since the 90's - about 3000k per second for stereo PCM.
    24/96 - the production standard from 2008-2014 - about 4000k per second for stereo PCM.
    24/192 - the best we work at these days - about 5500k per second for stereo PCM.
    None of those would fit through the 90's internet pipes (tubes?!) so MPEG compression was needed. Degraded signal resulted, fancied up with a fun nickname: lossy.
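The per-second figures quoted above are ballpark numbers. The data rate of uncompressed PCM is just bit depth x sample rate x channels, so it is easy to check. A quick sketch, assuming "24/88" means 88.2kHz and that "k" means kilobits per second (1 kbit = 1000 bits); the function name is made up for illustration.

```python
def pcm_kbps(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> float:
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return bit_depth * sample_rate_hz * channels / 1000


if __name__ == "__main__":
    formats = [
        ("16/44.1", 16, 44_100),
        ("24/88.2", 24, 88_200),
        ("24/96", 24, 96_000),
        ("24/192", 24, 192_000),
    ]
    for label, bits, rate in formats:
        print(f"{label}: {pcm_kbps(bits, rate):.1f} kbps stereo")
```

By this formula the stereo rates come out a fair bit higher than the rough figures in the post, and lossless compression such as FLAC brings the actual file size back down again.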
  14. FFBookman
    a panel of industry types and mastering engineers discussing distribution quality, from 2014:
  15. VNandor
    If you want to debate audibility why don't you join in one of those pointless threads in the sound science sub-forum? Much better than starting another pointless thread in the GD sub-forum.