Perceptual coding

  1. FFBookman

    I have a blind test available to me with my own materials on my own rig, built right into my DAP. it's called the 'revealer' and it was provided as a firmware from pono.
    I don't bother posting "results" since A- they really don't prove anything I don't already know, and B- you wouldn't trust me/them if they did.
    I can close my eyes, run my fingers in a circle, and the ponoplayer will select 1 of 4-5 versions of the same exact 24bit source file encoded to various qualities, some lossless, some lossy.
    have you seen this? let me grab a screenshot from someone else:
    You have to set it up in the desktop client by feeding it 24bit files, then it will churn out 3-4 versions below what you give it, depending on where you start. Then you sync and the player has those songs set up in a separate area to play around with. you can close your eyes and blind test away.
    The playback mechanism stays in the same location when you select a different rez and almost gets to gapless switching, but not quite. I don't know what kind of volume matching they might have done, if any, but I know they used open source downsampling, dithering, and lossy compression techniques that are not deliberately old or flawed. I have read and posted with enough people inside that project to believe they aren't trying some VW crap on the files.  
    Screenshot shows, from bottom to top: 192k MP3 - 256k AAC - 16/44 PCM Lossless - 24/96 PCM lossless - 24/192 PCM lossless.  (I don't think it can include DSD, although the player can play them natively.)
    I have 5-6 songs in there. It's so obvious to me within seconds, almost every time, that I sit here using the internet to try and understand why people claim to not hear it. 
    Granted I can't call exactly which one I'm on every time, but I immediately sense whether I am in the big room or the small room.
    Soon as there is a cymbal crash or hi hat roll or super complicated texture like layers of guitar distortion, then you know where you are.   Bass guitar is a real good tell. If you hear it as a low stringed instrument with string sounds you are in hi-res. The mix and the timbre of the instruments, things like the solidness of the kick drum in that bass line, is another tell. Multi-part voice harmonies -- if you hear all the voices come in and layer, not gel into one glob, it's lossless. If you hear breath and lip smacks, it's hi-res.
    I hear digital artifacts and narrow room in the red.
    I hear wide open spaces and accurate instrument blends in the green and yellow.
    Blue usually confuses me b/c it's so good and so bad, depending on which direction you are coming from!  After lossy, CD sounds amazing. After hi-res, CD sounds flat and lifeless.
  2. FFBookman
    Nyquist-Shannon - valid theory for less than critical sound sampling. but i love my music. i'm not a rich guy, i'm a pissed off poor consumer tired of being sold less than the original, and i'm also a producer that has been working in 24bit for well over a decade now so I know all about it too. Just trying to mate the professional world with the consumer world, for the sake of music, and precedent.
    Try to imagine if everyone still had SD TV's, 35" or so.  But when you saw interviews with Hollywood types and movie stars they had 1080p 70" screens. When you went into video editing areas they had 1080p 50" screens.  In sports bars they had 60" HD everywhere. But at home everyone still had 35" SD, regardless of income level or nerdiness.
    People just didn't believe you needed anything bigger than 35", anything better than SD, in the house, because you can't tell the difference anyway. Something about viewing distance and resolution of the eye in tests someone told you about once. Who knows. People who had those big high resolution TV's were just teeveophiles, the worst type of pony-tailed rich guy nerd.
    As far as stretching my example to an extreme to test it -- I've never heard anything in PCM higher than 24/192. And even that, I've probably only heard and own a few hours of music at 24/192. I have never heard DSD, MQA, or other random formats.  But none of that actually means that less is more. You are just trying to blow out my theory with ridiculous numbers. I think you know that Floating point is a totally different thing, there is no higher than 24bit PCM bit-depth audio that I know of on the planet as a transit format. I'm not talking about what is happening inside the digital realm with oversampling and converting to different data structures. I'm talking about file format, already encapsulated.
    Analog audio has a limit. We don't need infinite processing to recreate it. Accurate audio is just slightly higher than 16/44 data space can achieve. I would be surprised to hear much better digital quality than 24/192.
    Even 20/88 sounds amazing, markedly better than 16/44 when i heard it 25 years ago.
    As far as thread title, i wasn't going for bait and switch. I just don't want to be seen as the CD-defender against lossy argument. I want to see hi-res across the board, to save music. there's still hope.
    Sony's archive guy said in 2013 that the first time they tried to digitize their back catalog (1985-95) they only got about 30% of the way complete, then they switched to 24bit in the late 90's.
  3. Music Alchemist
    Use dBpoweramp and convert the files yourself, then get back to me.
  4. VNandor
    Where did you get that idea? The theorem is very clear about what it is. If the signal is bandlimited, the signal can be reproduced perfectly if the sampling rate is at least two times higher than the highest frequency in the given signal. If you use a sampling rate of 192kHz instead of 44.1kHz when the signal is bandlimited to ~20kHz you will get the exact same analog data after conversion, just from more samples. According to the theory the only benefit you can get from higher sampling rates is higher frequencies. In the real world your DAC might be able to play back 192kHz wavs better than 44.1kHz wavs but I personally don't know the technical reasons behind it.
    I have never ever said that less is more. You may want to read it again. And I'm glad you noticed where I'm going with the stupid numbers. For me 192/24 seems just as pointless as 352/24bit (yeah) or the I don't even know what numbers DSD to you. They are effectively meaningless after a certain point. You say 352kHz probably sounds the same as 192kHz despite 352 being the bigger number. But you've just said bigger is oh so much always better. So how does it work again? I'm saying the same thing but with 44.1kHz vs 192kHz based on what I know about digital audio. (And based on my personal experience which again doesn't have much to do with actual facts about audibility.)
    Are you going with your gut now or are there any theories/facts behind the assumption? I haven't encountered any analog source that had a dynamic range around 95dB or exceeded it. Granted I don't have thousands of vinyls and tapes lying around. Or do you refer to actual live music as analog audio?
    Symphonic orchestras can be very loud. If 120dB SPL is the threshold of pain, then I would say it can get to ~100 dB SPL as I wanted to plug my ears with my fingers when I unsuspectingly attended a contemporary classical music concert. It didn't hurt but I hadn't realized how loud symphonic orchestras can be until I heard something I didn't like listening to.
    BUT there will always be a good amount of ambient noise; 20dB SPL is considered to be a very low ambient noise and is typical in recording studios AFAIK. That's about 80dB dynamic range, less than what 16 bits can represent. I believe one could possibly make an orchestra play in a way that permanently damages the ears of its audience but it would be silly.
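    The arithmetic in this post can be sketched in a few lines. This is only an illustration: it assumes the textbook rule that the ratio between full scale and one quantization step is 20*log10(2^N) dB, and it takes the ~100 dB SPL peak and 20 dB SPL noise floor straight from the post as rough figures. The function and constant names are made up for the example.

```python
import math

def pcm_dynamic_range_db(bits: int) -> float:
    """Theoretical ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

# Rough figures quoted in the post, not measurements.
ORCHESTRA_PEAK_DB_SPL = 100
STUDIO_NOISE_FLOOR_DB_SPL = 20
acoustic_range_db = ORCHESTRA_PEAK_DB_SPL - STUDIO_NOISE_FLOOR_DB_SPL

for bits in (16, 24):
    print(f"{bits}-bit PCM: {pcm_dynamic_range_db(bits):.1f} dB")
print(f"Orchestra over room noise: {acoustic_range_db} dB")
```

    Under those assumptions 16 bits comes out around 96 dB and 24 bits around 144 dB, against roughly 80 dB for the orchestra example.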
  5. FFBookman
    How I know science is on my side: Neuroscientists.  These people are just scratching the surface on how human brains work.
    Telephone engineering from 1950's (Nyquist) or software engineering from the 1990's (lossy coding) do not drive neuroscience, they take advantage of the ignorance in neuroscience.
    When the neuroscientists image the brain as it listens to music, and finally get around to trying different resolutions, this thread can start anew.
  6. FFBookman
    I've read the people pushing Nyquist-Shannon admit that 16/44 is not enough for critical listening, especially in the 80's when it was first being pushed on us. They just claim consumers don't critically listen, so 16/44 is sufficient, or good enough.  Sorry I can't cite a source other than my own digital audio education which I started to receive around 1988, and my professor then also said that 16/44 was not enough for critical listening, just mainstream use.
    I answered the part about the big numbers. Analog audio is not infinite. Our auditory system is not infinite. It does indeed have a range (more than just frequency or headroom) but it does have limits.  16/44 PCM isn't enough to properly encapsulate those limits.
    To my ears 20/88 should be enough space to not degrade the original great mix.  You need more space than 16 bit allows, and you need more freq range than 22k.  But it's not infinite.
    Floating point is not the same as fixed point. When you cite 32bit I think you are talking about internal, floating point math, not an encapsulated file for transfer and render.
  7. VNandor
    Any type of sound is only frequency and amplitude changing over time. There is definitely more to our auditory system because no one hears the harmonics individually when listening to an instrument. When the sound is interpreted by the auditory system it will gain such qualities as clear, harsh, dull, whatever. But it all comes from the interpretation of various frequencies and amplitudes. What's even more, the same frequencies and amplitudes can be interpreted differently from time to time! But I fail to see how it's relevant to sound reproduction as any given sound will only have frequency and amplitude values despite the complex auditory system.
    So where's the flaw in my logic when I say we don't need more than 16/44.1? AFAIK people can't hear above ~20kHz and if you don't want to make a recording which makes the quietest parts to drown in background noise and/or make your ears bleed at the loudest parts you don't need more than 16 bits. The only flaw is that experiences show otherwise?
    So far, neuroscience didn't take a side regarding high-res audio. Nonetheless it would be very interesting if they found out that music that contains ultra-sonic frequencies can stimulate our brains differently.
  8. Music Alchemist
    Yes it is. I know you've already seen this article and ignored it, but I really think you should seriously read it.
    In a nutshell, 16/44 already reproduces all the frequencies and dynamic range that human ears can hear. Having more than that provides zero audible benefit for playback.
    My first guess about your situation is simply that the conversion software you're using is ruining everything. That's why I recommended dBpoweramp instead.
  9. FFBookman

    Well we've come full circle.  
    xiph dot org is the main reason i post these things online - i am completely opposed to their stance and their findings.  i actually blame most of the ignorance on this topic online to xiph.org, so no, i won't be reading that article again.
    xiph.org is a bunch of people that have never recorded music, never set foot in a recording or mastering studio, trying to tell all audio professionals they are full of crap.
    yet monty himself won't tell me what format he listens to his music in.  at one point i got him to admit he listens to ogg (lossy) and flac (lossless) but he wouldn't tell me if he owned any 24bit audio, or if he's even heard 24bit audio.  he also wouldn't tell me about his playback rigs.
    so i suspect monty - and xiph.org as a whole - is suffering from what you ascribe me -- their rig can't play hi-res, or their ears can't hear hi-res, so they mask their true intentions (promoting lossy) to attack hi-res.
  10. Music Alchemist
    Well, I take it you're never going to bother converting the files yourself with dBpoweramp and conducting proper controlled tests...so I'm out.
  11. VNandor
    So, if you think 192 kHz has any advantage over 44.1 kHz besides better extension in the high frequencies, you are wrong. I got a 192/24 track, downsampled it to 44.1 then null tested it with the original 192/24 file. Here's what's left, and please note I added 50dB amplification to the analyzed signal:

    Total silence from 0 Hertz to ~20kHz (probably to 22.05) and some rather quiet high frequency information. If you believe there would be differences below 20kHz I will take my time to find a more refined spectrum analyzer, something that could go down to -300dB maybe? So do you think all the scientists who say people only hear from 20Hz to 20kHz are wrong and tested with the wrong methods? You could probably get big and gain a lot of recognition and money if you managed to prove all of them wrong. You also ignored my post about why 95dB dynamic range is not enough to reproduce music. Do you think live music has a bigger dynamic range than that?
    I noticed you repeatedly use the 'I recorded and mixed music for xx years' argument. You may very well know how to make a great recording but you seem to be completely ignorant about how digital signal processing, the human ears, and human psyche work. Nothing wrong with that, you aren't an engineer, biologist, or psychologist after all and I will be more than happy to apologize about my ignorance and pertinacity once you or anyone else publishes a study that explains how and why people can hear up to 96kHz. But until then, I'm going to stick with "my facts" and experiences, not yours.
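    For anyone who wants to see the shape of a null test without any audio software, here is a rough pure-Python sketch. It is not the procedure used for the screenshot in this post. It assumes a synthetic three-tone signal instead of a real track, a high rate of 176.4 kHz instead of 192 kHz so the ratio to 44.1 kHz is a whole number, and an idealized brick-wall resampler built on the DFT, which no real converter matches exactly. All names (`resample`, `tone`, `hi_res` and so on) are made up for the example.

```python
import cmath
import math

FS_HI = 176_400          # assumed hi-res rate (4 x 44.1 kHz), not the thread's 192 kHz
N_HI = 256               # samples in one analysis block at the high rate
N_LO = N_HI // 4         # the same time span sampled at 44.1 kHz

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def resample(x, n_out):
    """Ideal (brick-wall) resampling of one periodic block via the DFT."""
    n_in = len(x)
    spec = dft(x)
    half = min(n_in, n_out) // 2
    out = [0j] * n_out
    for k in range(half):            # DC and positive frequencies below Nyquist
        out[k] = spec[k]
    for k in range(1, half):         # the matching negative frequencies
        out[n_out - k] = spec[n_in - k]
    scale = n_out / n_in             # keeps tone amplitudes unchanged
    return [(v * scale).real for v in idft(out)]

def tone(bin_index, amplitude):
    return [amplitude * math.sin(2 * math.pi * bin_index * t / N_HI)
            for t in range(N_HI)]

# Two audible tones (about 1.4 kHz and 13.8 kHz) plus one ultrasonic tone (about 34.5 kHz).
hi_res = [a + b + c for a, b, c in zip(tone(2, 0.5), tone(20, 0.3), tone(50, 0.1))]

cd_rate = resample(hi_res, N_LO)         # downsample to 44.1 kHz
round_trip = resample(cd_rate, N_HI)     # bring it back up so the samples line up
residual = [a - b for a, b in zip(hi_res, round_trip)]   # the null test itself

# Amplitude of the residual in each frequency bin up to the high-rate Nyquist.
residual_amp = [abs(v) / (N_HI / 2) for v in dft(residual)[: N_HI // 2]]
bin_hz = FS_HI / N_HI
audible_peak = max(a for k, a in enumerate(residual_amp) if k * bin_hz <= 20_000)
ultrasonic_peak = max(a for k, a in enumerate(residual_amp) if k * bin_hz > 20_000)

print(f"largest residual at or below 20 kHz: {audible_peak:.2e}")
print(f"largest residual above 20 kHz: {ultrasonic_peak:.3f}")
```

    With these assumptions the residual at or below 20 kHz is at the level of floating-point rounding error, and the only thing left is the ultrasonic tone. A real converter uses a finite filter, so a real null test would show a small but nonzero residual near the cutoff.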
  12. FFBookman

    I already told you I have several songs that started at 24bit and have been converted down to a variety of resolutions, all loaded up and ready to test on my DAP. I don't need to use your dBpoweramp program to hear down sampled audio and lossy audio.
    My results of "a proper blind test" matter not, since there's no such thing as a proper blind listening test that gives scientifically accurate results.
    What I don't understand is how you claim you have a test to prove that I am imagining things. To prove that lossy = lossless.  It's patently ridiculous. Do you have any other tests that tell you 2 ≠ 2?   Any at all, from any other topic?   
    And the legends of music production that I study also were imagining things. And the record labels who left 16/44 as a primary format long ago, they are imagining things.
    Tom Petty is imagining things.

    Well nice discussion I guess.   You have a garbage test giving you garbage results, that go against both common sense and experts in the field, and you still choose to believe that test. 
  13. FFBookman
    You can keep looking for lost sound on your digitally generated pictures - you won't find much there.

    It's all about accuracy and blending of sounds.
    Accuracy of the instruments and how they blend.
    Accuracy of the delays, pre delays, decays, and various EQ bands on top of each other.
    How many voices and sounds can you hear at once?
    Then there's depth of the soundstage. Width of the soundstage. Width of each instrument within the soundstage. Center.
    Is that graph you are showing me in stereo? Of course not, it's just a picture. A picture of a single moment in timeline based media. So there's absolutely no information about the mix, the shape, or the movement of the sounds in that graph. 
    Music isn't 2D and can't be represented by oscilloscopes or frequency graphs. If you don't believe me try to mix a song with just oscilloscopes and frequency graphs. Our modern quantitative internet minds think we can represent it in a picture but that's very dangerous. None of those pictures come close to representing the true complexity of music.
    You are using the wrong tools for this argument. Only your ears and your perception in a stereo field can "find" what I'm talking about. Also - what dither did you use for your downsampling from 24bit to 16bit? There's several dithers and they all sound different.  They all apply a nice fuzz over the waveform to try and hide the downsampling degradation.
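    On the dither question: the thread never says which dither was used, so the following is only a generic sketch of one common kind, TPDF dither, applied when reducing a sample to 16 bits. The helper `to_16bit`, the 0.3 LSB test level and the fixed random seed are all assumptions made for the example, not anything taken from the Pono software or the posts above.

```python
import random

FULL_SCALE_16 = 32767   # largest positive 16-bit sample value

def to_16bit(sample, rng=None):
    """Quantize a float sample in [-1.0, 1.0] to a 16-bit integer.

    If rng is given, TPDF dither (the difference of two uniform values,
    spanning +/-1 LSB) is added before rounding.
    """
    scaled = sample * FULL_SCALE_16
    if rng is not None:
        scaled += rng.random() - rng.random()
    return max(-32768, min(32767, round(scaled)))

# A constant level of 0.3 LSB: too small for plain rounding to register.
quiet = 0.3 / FULL_SCALE_16
trials = 20_000

plain = [to_16bit(quiet) for _ in range(trials)]
rng = random.Random(0)   # fixed seed so the run is repeatable
dithered = [to_16bit(quiet, rng) for _ in range(trials)]

plain_mean = sum(plain) / trials
dithered_mean = sum(dithered) / trials
print(f"without dither: mean = {plain_mean:.3f} LSB")
print(f"with TPDF dither: mean = {dithered_mean:.3f} LSB")
```

    Without dither the sub-LSB level rounds to zero every time. With dither the individual samples are noisy, but their average lands close to the original 0.3 LSB, which is the sense in which dither trades quantization distortion for a steady noise floor.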
  14. FFBookman
    You say "any type of sound is only frequency and amplitude changing over time".  I agree. And then you have removed the changing over time, and you have removed the stereo imaging that gives it shape/depth/width.
    Also, our ears sensitivity to levels is WAY more than your pictures draw. These cute digital analysis tools with their 4" high visualizations are showing, what, 4bits of visual resolution?   So reduced it's inaccurate.
    If your pictures were accurate they'd be on a grid of 16,000,000 x 16,000,000 pixels (24bit) for EVERY SECOND OF AUDIO, x2 FOR STEREO. The differences between the two signals (L & R) would have to be visually represented too, which from what I've seen doesn't really exist except for in the frequency domain.
    Even pictures of lossy would have to be a grid of 65,000 x 65,000 pixels per second x2.  Why don't we do this?  Well a 24bit waveform would take 222,000 inches to show on screen, or 18,000 feet of screen space. That's never happening. So the pics are kind of useless.
    No pictures generated by the computer (except during production) can possibly show where a sound is supposed to move, how far back it is, how and what it blends with, what's in front or behind it.