Why 24 bit audio and anything over 48k is not only worthless, but bad for music.
Nov 6, 2017 at 12:54 PM Post #2,507 of 3,525
Honestly, I'm not grasping so much as suggesting that the situation is not as simple and definite as many people seem to prefer to believe.

I disagree with your assertion about my comparison to 3D video... although you may well be right that we tend to ignore the weaker cue.
I agree that the dominant cue takes precedence..... but has anyone studied whether a dissonant, less-dominant cue is totally ignored, or whether it has some small effect?
(With 3D video, you see both cues; with audio, do you not notice the less dominant cue at all, or does it act as a distraction, and render the use of the dominant cue less certain?
For example, does the dominant cue take longer for your brain to process because there is another conflicting cue present, or does the conflicting cue affect the results?)
(It seems to get rather complicated in practice)......

https://msu.edu/~rakerd/Hartmann et al, Transaural Experiments and a Revised Duplex Theory.pdf

Also, while a square wave only contains the fundamental frequency and odd-order harmonics, a single impulse contains frequency components at every frequency - down to DC.
(Although, with a 5 µs impulse, the energy present at audible frequencies is going to be pretty low.)
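As a rough sanity check of that last point, here is a minimal numpy sketch (the 1 MHz simulation rate and unit pulse amplitude are arbitrary assumptions, purely for illustration) estimating how much of a 5 µs rectangular pulse's energy actually falls below 20 kHz:

```python
import numpy as np

# Simulate a 5 us rectangular pulse at a rate high enough to represent it.
fs = 1_000_000                  # 1 MHz simulation rate (assumed)
x = np.zeros(fs)                # one second of silence...
x[1000:1005] = 1.0              # ...containing a 5-sample (5 us) pulse

X = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1.0 / fs)

# A 5 us rectangle has a |sinc|-shaped spectrum with its first null at
# 1/(5 us) = 200 kHz, so it is nearly flat across the audible band, but only a
# small fraction of the pulse's total energy lies below 20 kHz.
audible = freqs <= 20_000
fraction = np.sum(X[audible] ** 2) / np.sum(X ** 2)
print(f"fraction of pulse energy below 20 kHz: {fraction:.1%}")   # roughly 10%
```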


Actually that's not quite it. The apparent location of a sound is dependent on the localization mechanism that is dominant, and that depends on the frequency of the sound. "Duplex Theory".

The spectrum of a 5 µs pulse would be placed well above the frequency band where both ITD and ILD cross over, so its location would be dominated by ILD (assuming it could be transduced and heard at all), which 5 µs of interchannel delay alone will not change. The apparent location shift won't be quite what you expect, or allude to. In fact the spectrum of that pulse is above the audible range. You'll hear a click if you generate it only because of the nonlinearity of any transducer. The click you hear is a byproduct, a distortion. In any case a 5 µs interchannel time delay is far, far below hearing angular position resolution.
But because of its frequency content (including whatever a band-limiting filter does to it), and the fact that the only thing changed is ITD, which is not the dominant localization mechanism at that frequency, the apparent image shift will not happen that way... or at all. It simply is not an issue, ignoring the fact that such a pulse does not occur in any acoustic event. Hence, I strongly suspect, our inability to localize it based on ITD alone.
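For a sense of the scale involved here, a quick back-of-the-envelope sketch (the head radius, speed of sound and the simple spherical-head Woodworth approximation are assumptions for illustration, not a claim about any particular study):

```python
import numpy as np
from scipy.optimize import brentq

# Assumed constants for a simple spherical-head (Woodworth) ITD model.
HEAD_RADIUS = 0.0875    # metres (a commonly used approximation)
SPEED_OF_SOUND = 343.0  # m/s at room temperature

def itd_seconds(azimuth_rad):
    """Woodworth approximation: ITD = (r/c) * (theta + sin(theta))."""
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth_rad + np.sin(azimuth_rad))

def azimuth_for_itd(itd):
    """Numerically invert the model: azimuth (radians) producing a given ITD."""
    return brentq(lambda th: itd_seconds(th) - itd, 0.0, np.pi / 2)

print(f"ITD for a source at 90 degrees: {itd_seconds(np.pi / 2) * 1e6:.0f} us")
print(f"azimuth implied by a 5 us ITD:  {np.degrees(azimuth_for_itd(5e-6)):.2f} degrees")
```

Under those assumptions the full left-to-right ITD range is roughly 650 µs, and a 5 µs difference maps to well under one degree of azimuth.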
Again, you seem to be ignoring how spatial localization actually works. Our brains do not always "look for those edges" to determine anything, especially when those "edges" never occur in life. What you're getting at here is a possible limitation of digital systems that falls outside of the application.
No, your visual parallel with 3D video is incorrect. 3D suffers from a divergence between two simultaneous localization cues, convergence and focus position. They operate together, but 3D projection demands they separate. There is no parallel in audio.

I think there is some serious grasping for straws going on here.
 
Nov 6, 2017 at 1:07 PM Post #2,508 of 3,525
Hmmmmm..... so let's see.
If I start out with a short signal that occurs between two sample points, when I band limit it it will be transformed into a much more spread-out sinc "equivalent" waveform.
But I'll still be able to tell when it occurred because that spread-out waveform will be spread out across multiple samples (so I can see where "the highest spot on the hill" is).
However, if I use an apodizing filter, then the pre-ringing will be converted into post-ringing... so my "hill" will no longer be symmetrical.
I wonder how I'm supposed to tell where my short impulse signal occurred (where between the two samples that "bracketed" it).
Your apodizing filter has removed the information we need in order to retrieve that time information.
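To make the "highest spot on the hill" idea concrete, here is a minimal sketch (the 64x interpolation factor and scipy's polyphase resampler are my own choices, and it assumes a symmetric, linear-phase reconstruction - exactly the symmetry an apodizing/minimum-phase filter would break):

```python
import numpy as np
from scipy.signal import resample_poly

# A band-limited "click" landing 0.37 of a sample after sample 1000 (an assumed
# position): its samples are simply a sinc centred on that fractional instant.
true_pos = 1000.37
x = np.sinc(np.arange(2048) - true_pos)

# Interpolate 64x and look for "the highest spot on the hill".
up = 64
y = resample_poly(x, up, 1)
est_pos = np.argmax(y) / up
print(f"true position: {true_pos:.3f} samples, estimated: {est_pos:.3f} samples")
```

With a symmetric filter the interpolated peak lands back on the original fractional position to within the interpolation step; with an asymmetric filter you would need to know the filter's exact shape to undo its skew.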

In the sense that all waveforms can be described as some sort of complex combination of sine waves, you are correct.
However, there are microphones and speakers that respond well past 20 kHz.

I'm talking about the edges of the sound amplitude envelope.
(Which, in the case of a single short impulse, would be the same as the leading edge of the impulse itself.)
(And, yes, if I start with 0 VDC, then gate a single cycle of a sine wave, the beginning of that waveform will be "an edge".)

An audiophile myth which you keep repeating and which I've explained is nonsense:
1. Agreed, assuming a linear-phase filter, but a minimum-phase/apodizing filter doesn't: you only get post-ringing, not pre-ringing.
2. What original?
3. What original?
3a. What beginning edge of the impulse?
3b. Not that I'm aware of. All the research I'm aware of suggests that ITD measures/compares phase relationships of the signal reaching different ears. How can the brain measure/compare edges when there are no edges?
3c. A suggestion based on edges which don't exist.
3d. A further suggestion based on edges which don't exist!

A sine wave does not have an "edge"! Modulated sine waves do not have an edge! Anything other than sine waves cannot exist; they cannot travel through air; even if they could, your ear drum cannot respond to them; an analogue current can only be sine waves, and our transducers (mics, speaker drivers) can only respond to or recreate sine waves. So, what "original" waveform are you talking about in 2 & 3? You CANNOT be talking about an original waveform, you can ONLY be talking about digital data, which cannot exist as a waveform, and then you're complaining that this data which cannot be a waveform becomes distorted when we try to turn it into a waveform, huh? Let's put it another way: let's say we invented a theoretical system and filter which could perfectly reconstruct your impulse, what then? What are you going to convert it into? You can't convert it into an analogue electrical signal because you'll lose your "edges", and you cannot use some other method because your speakers/headphones will distort your edges, as will the air and your ear drums.



And how can we have anything other than a "blurry image"? ONLY "blurry images" can travel through air, ONLY "blurry images" can be recorded, reproduced and heard. Think about it for a moment! With a square wave you have an instantaneous rise time ("edge"): a speaker cone would have to be in two different places at the same instant in time to accurately reproduce it, the molecules in the air would have to be in two different places at the same instant in time to transfer that square wave, and so would your ear drums, and so would the electrons in the analogue electrical current.

G
 
Nov 6, 2017 at 1:20 PM Post #2,509 of 3,525
In simplest terms.....
Start with a 100 Hz square wave.... which is a collection of every odd-order harmonic of 100 Hz.
If you were to try to reproduce it exactly with a sinc reconstruction you would need an infinite number of terms (impractical).
So band limit it.
It will now be incorrect because you have truncated the series......
You will also have introduced additional errors by whatever band limiting filter you used.
(Now you get to decide WHICH of those errors is audible......... note that the errors we have now are not simple "extra harmonic components" like normal distortion.)

Incorrect to who? Batman? God? Even if 16/44.1 audio could miraculously play all of the infinite odd harmonics, your square wave would be quite low-pass filtered after headphones or speakers. Amplifiers have finite bandwidth. Even if your speakers had a response up to 50 kHz, the radiation pattern would be interesting to say the least. Your ears would need to be "on the radiation axis" to get anything. High frequencies attenuate in the air. Your ears are a low-pass filter, etc. What do you need harmonics above 20 kHz for? I don't need them. The bandlimited version is correct for my ears.

The correct way to create bandlimited square waves is to sum harmonics up to 20 kHz. That way you don't need to use filters.
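For instance, here is a minimal sketch of that additive approach (the sample rate, one-second duration and 100 Hz fundamental are arbitrary choices of mine): sum the odd harmonics with the usual 1/n Fourier weights, stop at 20 kHz, and you get a band-limited square wave directly, with no brick-wall filter involved:

```python
import numpy as np

fs = 44_100              # sample rate (arbitrary choice)
f0 = 100.0               # fundamental of the square wave
t = np.arange(fs) / fs   # one second of signal

# Fourier series of a square wave: odd harmonics weighted by 1/n, truncated at
# 20 kHz instead of being generated full-band and low-pass filtered afterwards.
square = np.zeros_like(t)
n = 1
while n * f0 <= 20_000:
    square += (4 / np.pi) * np.sin(2 * np.pi * n * f0 * t) / n
    n += 2

print(f"highest harmonic used: {(n - 2) * f0:.0f} Hz")
print(f"peak value (Gibbs overshoot): {square.max():.3f} vs. 1.0 for an ideal square")
```

The roughly 9% overshoot it reports is inherent to stopping the series at 20 kHz; it is the same ripple a brick-wall filter would produce, just arrived at without one.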
 
Nov 6, 2017 at 1:28 PM Post #2,510 of 3,525
You're making a very basic - and false - assumption.
The original analog signal is a continuous record of the pressure of the air over time.
It has no constraints and no limitations.... and no band limitations... and no windowing limitations.
It is NOT a sinc function... it is completely arbitrary.
Set off a string of fire crackers.
Each pop is a single pressure wave, which expands, and eventually hits your ears.
What follows is a whole bunch of odd little squiggles in pressure as that wave bounces around and interacts with other stuff.
If I had a "perfect oscilloscope" I could draw a "perfect" picture of it.
There is no sound whatsoever before the first pop.

Things like "windowing errors" and "Gibbs ringing" do not exist in the original.
They are ERRORS that result from the conversion into digital.

1)
I'm going to hit a bell......... now.
In order to make an "accurate digital representation" of that signal - sort of......
You can sample it.
When you then reconstruct those samples, in order to do so perfectly, and get back a "perfect" version of the original (in terms of energy distribution) your reconstructed signal would have to extend backward in time.
Forget the practicalities of that.......
The original bell hit had ZERO energy before I hit the bell; your reconstruction does have energy there; therefore your reconstruction has an error.
(The bell was NOT ringing before I hit it... but, in your reconstructed signal there is ringing before the bell hit; therefore they are NOT the same.)
Therefore, the ONLY question is whether the error that we know exists is audible or not.
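As an illustration of the pre-ringing being argued about, here is a small sketch (the 255-tap length, 20 kHz cutoff and scipy's firwin/minimum_phase routines are my own choices, not a model of any specific converter) comparing how far a linear-phase and a minimum-phase low-pass ring on either side of their main peak:

```python
import numpy as np
from scipy.signal import firwin, minimum_phase

# A linear-phase FIR low-pass near 20 kHz and a minimum-phase version of it
# (length, cutoff and window are arbitrary choices for illustration).
fs = 44_100
lin = firwin(255, 20_000, fs=fs)
minph = minimum_phase(lin)

def ring_extent(h, threshold=0.01):
    """How many samples before/after its largest tap a filter stays above
    `threshold` times that tap - i.e. how far its ringing extends each way."""
    h = np.asarray(h)
    peak = int(np.argmax(np.abs(h)))
    big = np.abs(h) > threshold * np.abs(h[peak])
    first, last = int(np.argmax(big)), len(h) - 1 - int(np.argmax(big[::-1]))
    return peak - first, last - peak

# The symmetric (linear-phase) filter rings as far before its peak as after it,
# which is what ends up ahead of a filtered transient; the minimum-phase filter
# has essentially no ringing ahead of its peak, only after it.
for name, h in (("linear-phase ", lin), ("minimum-phase", minph)):
    before, after = ring_extent(h)
    print(f"{name}: rings {before} samples before and {after} samples after its peak")
```

The symmetric, linear-phase response rings as far ahead of its peak as after it, and it is that leading portion which shows up before a filtered transient; the minimum-phase version rings essentially only afterwards, at the price of a frequency-dependent delay.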

2)
A DAC does NOT use a sinc function.
The output of a DAC is NOT "a sum of sample-weighted sinc-functions with various delays".
The DAC (chip) outputs a stream of analog voltages - one for each sample you feed to it.
The DAC (chip) does not put out a signal before it receives the first sample (even if a "real" sinc reconstruction would require it to do so).

I'm not a mathematician....... but I believe that the "problem" is that the sinc function of a non-continuous waveform must extend forward and backward in time to infinity.
(The sinc function of a continuous sine wave needn't do that..... which is why the theory works perfectly for continuous sine waves.)
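For what it's worth, practical oversampling interpolation approximates that ideal, infinite sinc with a finite windowed version, so nothing has to exist before the first sample arrives; here is a minimal sketch of the idea (the 8x ratio, 513-tap length and Kaiser window are assumptions of mine, not a description of any particular DAC chip):

```python
import numpy as np
from scipy.signal import firwin, upfirdn

fs = 44_100
oversample = 8

# A finite, windowed-sinc interpolation filter: the practical stand-in for the
# ideal infinite sinc. The gain of `oversample` compensates for zero-stuffing.
taps = firwin(513, cutoff=fs / 2, fs=fs * oversample,
              window=("kaiser", 9.0)) * oversample

# A short test tone at the original rate, pushed through the classic
# zero-stuff-then-FIR polyphase upsampler.
x = np.sin(2 * np.pi * 1_000 * np.arange(1_000) / fs)
y = upfirdn(taps, x, up=oversample)
print(f"{len(x)} samples at {fs} Hz -> {len(y)} samples at {fs * oversample} Hz")
```

Truncating and windowing the sinc is exactly the trade that swaps the "extends to infinity" problem for a finite delay plus a small, controllable reconstruction error.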

I can give you a more ridiculous - but still valid - example.......

Let's design the most ridiculous filter imaginable.
It will be a super-duper-hyper-narrow bandpass filter.
It will pass 400.0000000000 Hz, with a rolloff of a million dB per octave.
I'm too lazy to do the math, but you will find that, due to the tradeoff between time resolution and sharpness
....our fun filter will take SEVERAL SECONDS to ring up to approximately full output level once it receives a 400 Hz input signal
....and our fun filter will ring detectably for several seconds after the signal stops (it will actually ring forever, but I've made sure it will ring powerfully enough that it will be easy to see).

I now create a tone burst that is 40 cycles of a 400 Hz tone (it exists for 0.1 seconds).
If I play it from reasonably good speakers in an anechoic chamber it will seem to start and stop quite suddenly.
I can create my signal by taking the output of a signal generator set to 400 Hz and gating it at the zero crossing point to pass forty full cycles and then stop.
(I'm going to gate it using an FET for a switch.)

Now I'm going to send this signal to my fun filter.

The input of my filter will be a 0.1 second set of forty sine wave cycles of a 400 Hz tone.
The output will NOT.
It will increase in level gradually and decrease (continue to ring) for several seconds.
The INFORMATION it contains will be the same (which satisfies Nyquist and Shannon).
(Nyquist and Shannon don't actually specify how long I have to wait for all of my information to "accumulate" or "reconstruct".)
However, the FORM of that information will be very different...... which may or may not satisfy a human being.
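Here is a quick numerical sketch of that time/frequency trade-off (the Q of 5000, the second-order iirpeak resonator and the ten-second observation window are my own stand-ins - nothing like a million dB per octave, but narrow enough to show the effect):

```python
import numpy as np
from scipy.signal import iirpeak, lfilter

fs = 44_100
f0 = 400.0
Q = 5_000                         # an extremely narrow resonance (assumed value)
b, a = iirpeak(f0, Q, fs=fs)

# 40 cycles of 400 Hz (0.1 s), gated at zero crossings, followed by silence.
burst = np.sin(2 * np.pi * f0 * np.arange(int(0.1 * fs)) / fs)
x = np.concatenate([burst, np.zeros(int(10 * fs))])

y = lfilter(b, a, x)
env = np.abs(y)
peak = env.max()

# How long does the output take to peak, and how long does it keep ringing
# (here: stay above 10% of its own peak) after the 0.1 s input has stopped?
print(f"input lasts 0.10 s; output peaks at t = {np.argmax(env) / fs:.2f} s")
print(f"output stays above 10% of its peak until t = {np.nonzero(env > 0.1 * peak)[0][-1] / fs:.2f} s")
```

Even this comparatively tame filter barely begins to build up during the 0.1 second burst and keeps ringing for several seconds afterwards, which is the behaviour described above.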

Basically, at the risk of being intuitive, the information theory says that, as long as you follow certain constraints, the SAME INFORMATION will still be there.
HOWEVER, their definition of the term "information" isn't intuitively what you might think.
I could post this message in Braille, or in MIME encoding...... the same information would be there...... but it would LOOK quite different.
Likewise, Nyquist & Shannon make a statement about the information.... but not about how our signal SOUNDS, or whether it is AUDIBLY the same as the original.

(When we design DACs, we do our best to design the filters and such so that the output also SOUNDS audibly similar.
Besides following the constraints, we follow other constraints that are based on acoustics and human perception.
For example, we don't spread out a tick over ten seconds - because we know that, even if the information content is the same, it will SOUND different.)

In real life everything does have limits and band limiting of sorts, so what you say in the introduction is clearly false. Air isn't some superconductor of vibrations, a mic or the eardrum is band limiting the signal, and I can't think of one thing that will agree with your statement. I'm limited by the speed of my movement when I go to hit the bell. The bell is clearly limited by how fast and how much it will flex under my hit. The whole thing will, for the duration of me still pushing into the bell, be subjected to a deceleration over time (from the bell resisting my movement, and then starting to resonate at its own freq). Maybe I can't define that movement entirely with my redbook sampling; components above F/2 would be one obvious reason to start with.
All your examples are, in my eyes, the Dirac pulse argument revamped again and again. Of course you won't have an ideal steady-state behavior while looking at the most extreme transient part of any system. Hitting the bell, a square wave, or anything of the sort amounts to showing stuff outside of Nyquist's theorem. They let you win the fidelity argument. So if you want to discuss increasing fidelity, do that. But if you want to discuss audibility, then that's a big pile of empty rhetoric.
If the question is really potential audibility, then a measurable error isn't a motive to avoid it if, as far as we know, it's not audible. Who cares about the missing ultrasonic components of a transient that a human brain probably never perceived in real life? The "better safe than sorry" approach, suggesting we act just in case, is caution when the occurrence is credible based on statistics and facts, but paranoia when we act based on nothing concrete. The more I learn each year, the more I give my vote for the latter.

I'll put my eternal question here: if something is missing, as we can easily demonstrate, and is audible in music as many believe it to be, why is it so hard to pass a blind test? That's the question we have to answer, because a blind test is what will prove audibility.
 
Nov 6, 2017 at 3:00 PM Post #2,511 of 3,525
...There are microphones and speakers that respond well past 20 kHz...

But are there singers whose voices produce such frequencies? How about instruments? I guess you could generate the sound electronically, so perhaps that's the next big thing in EDM, ultrasonic frequencies that nobody can hear!

You could be the Emperor's New DJ.
 
Nov 6, 2017 at 3:46 PM Post #2,512 of 3,525
But are there singers whose voices produce such frequencies? How about instruments? I guess you could generate the sound electronically, so perhaps that's the next big thing in EDM, ultrasonic frequencies that nobody can hear!

You could be the Emperor's New DJ.
Gonna be some seriously empty crowds.
Imagine an EDM event, where all the people in the crowd are like "where's the music at?".
The DJ says it's playing right now, and all the sounds are above 20 kHz XD
 
Nov 7, 2017 at 11:01 AM Post #2,513 of 3,525
I think you will find plenty of instruments which produce harmonics that extend up well above 20 kHz (cymbals certainly do).
(Likewise, a synthesizer set to produce square waves can do so - depending on the synthesizer.)
The only question is whether humans can tell when those harmonics are missing or not.

Personally, I'd rather risk recording some frequencies I can't hear than miss some that I can.

But are there singers whose voices produce such frequencies? How about instruments? I guess you could generate the sound electronically, so perhaps that's the next big thing in EDM, ultrasonic frequencies that nobody can hear!

You could be the Emperor's New DJ.
 
Nov 7, 2017 at 11:17 AM Post #2,514 of 3,525
Gonna be some seriously empty crowds.
Imagine an EDM event, where all the people in the crowd are like "where's the music at?".
The DJ says it's playing right now, and all the sounds are above 20 kHz XD


Thing is, there will be that one guy who hears it and starts dancing. Then you have a bunch of people dancing. Then you have the dude who said “And those who were seen dancing were thought to be insane by those who could not hear the music.” get up and bitch-slap the DJ.
 
Nov 7, 2017 at 11:18 AM Post #2,515 of 3,525
I've got to mention something here.........
I simply don't understand why some people seem so resentful about this entire subject.

I've never driven my car over 90 mph..... yet I still see no reason why it's "bad" that my car "can" go that fast.
In general, in almost every other subject, most people agree that you're better off if your tools can actually deliver, not just adequate performance, but performance that's BETTER than necessary.
It goes by names like "safety margin" and "margin for error" and "headroom" and even "clearance".
And who would really buy a car that can only go 56 mph?

So, why, even if you believe that we can't hear above 20 kHz, is it so awful to allow some safety margin?
If I were recording bats, and found out that their cries extended to 46 kHz, I would buy a microphone whose response extended to 60 kHz; I wouldn't buy one that went up to 46.1 kHz.
So, why, even if humans can only hear to 20 kHz, doesn't it make equal sense to make recordings that extend "well above" 20 kHz..... just in case.... to leave a little safety margin?
Why would anyone specifically choose to use a sample rate that's "just barely good enough"?

There is a reason why the 44.1k sample rate was chosen for CDs.....
The reason is that, with the constraints of the technology at the time CDs were invented, the time/space constraint on CDs was considered to be important.
They couldn't have used the next-higher standard sample rate without reducing the storage time on a standard CD below one hour - which had been established as a target requirement.
Using the 44.1k sample rate, they were able to fit over an hour on a disk, and still deliver frequency response that was a tiny bit above the bare minimum necessary.
The 48k sample rate was already in use on DAT tapes, and was considered to be a sort of standard; they would have used that except that, if they had, they couldn't have fit an hour on a CD.
In fact, most movie audio (on DVDs) is still standardized at 48k.... and not 44.1k.

However, when you're talking about download FILES, that constraint simply doesn't exist.
(in fact, even originally, it was strictly tied to "fitting a complete album on a CD")
 
Nov 7, 2017 at 11:27 AM Post #2,516 of 3,525
You might want to check out something called "a mosquito ringtone" : http://www.freemosquitoringtone.org/
(You don't dance to them; school kids use them so the adult teacher can't hear their phone ring.)

Thing is, there will be that one guy who hears it and starts dancing. Then you have a bunch of people dancing. Then you have the dude who said “And those who were seen dancing were thought to be insane by those who could not hear the music.” get up and bitch-slap the DJ.
 
Nov 7, 2017 at 12:21 PM Post #2,517 of 3,525
"Nyquist" says: A continuous time signal can be represented in its samples and can be recovered back when sampling frequency fs is greater than or equal to the twice the highest frequency component of message signal. Note that it's talking about "continuous time signals" and not impulses.

That's fantastic because the music I listen to with my stereo consists of waveforms that Nyquist can perfectly reproduce. This makes me happy and satisfies me. I wouldn't expect a car to drive on the ocean. It wasn't designed for that. It was designed to drive on roads. I wouldn't expect digital audio to reproduce impulses and square waves and other theoretical things we can't really hear. I would expect it to reproduce what it was designed to reproduce- music.

I think you will find plenty of instruments which produce harmonics that extend up well above 20 kHz (cymbals certainly do).
(Likewise, a synthesizer set to produce square waves can do so - depending on the synthesizer.)
The only question is whether humans can tell when those harmonics are missing or not.

No, they can't. Studies have shown that super audible frequencies have no impact at all on audio fidelity of recorded music. And auditory masking makes sure that you probably can't even hear some of the upper level harmonics in the audible range either. The truth is that ears don't have a brick wall filter. They fade out starting at around 15kHz or so. When you listen to music, the core frequencies are MUCH more important to perceived audio fidelity than the top octave. In fact, of the ten octaves or so that humans can hear, the top octave is the least important. Why do people expend so much energy worrying about the least important thing? Makes no sense.

You can rest easy knowing that CD sound is all you need. When you look for the truth and use your ears, OCD and pathological attention to detail isn't a problem any more. You have a lot more time and attention to spend on things that really matter- like appreciating great music. Music can be appreciated even without perfect fidelity. Acoustic Caruso 78s never fail to impress me. I can't imagine life without the Duke Ellington sides from the early 30s. And Bruno Walter's first act of Die Walkure from 1935 has never been bettered regardless of the sound technology advances. We are VERY fortunate to be living in an era of perfect recorded sound. Why go looking for theoretical imperfections? Appreciate how great digital audio is.

Focus on the road, not the clarity of the windshield.
 
Nov 7, 2017 at 12:58 PM Post #2,518 of 3,525
In all fairness, I don't think "people expend a lot of energy worrying about the top octave".
It's simply that, regardless of our personal limitations, the asserted goal of high fidelity is to reproduce the music as accurately as possible.
Therefore, some of us just like the idea that our gear will do so.
(So far I haven't seen a thread bemoaning how selling us audio amplifiers with a frequency response past 20 kHz is a major scam.)
Why is this such a big deal ONLY when it comes to high-res files?

My reply is simply.........
What's the big deal about paying an extra $5 for a 96k file instead of a 44k file?
I'm not personally all that sure I'd hear a difference......
But I'd rather pay an extra five bucks to get the version that's "twice as good as I need" instead of the one that's "just barely as good as I need" (most of my other equipment is also a bit better than I strictly need).
And, no, I'm not totally convinced that it's quite good enough.
But, even if it is, what's so awful about paying a few cents more to buy a lot of extra safety margin?
And what's the big deal about "only" being able to fit 20,000 albums on a $150 hard drive?
(If I just read this thread, I'd think 96k files were selling for $500 each - like fancy power cables.)

That's fantastic because the music I listen to with my stereo consists of waveforms that Nyquist can perfectly reproduce. This makes me happy and satisfies me. I wouldn't expect a car to drive on the ocean. It wasn't designed for that. It was designed to drive on roads. I wouldn't expect digital audio to reproduce impulses and square waves and other theoretical things we can't really hear. I would expect it to reproduce what it was designed to reproduce- music.

No, they can't. Studies have shown that super audible frequencies have no impact at all on audio fidelity of recorded music. And auditory masking makes sure that you probably can't even hear some of the upper level harmonics in the audible range either. The truth is that ears don't have a brick wall filter. They fade out starting at around 15kHz or so. When you listen to music, the core frequencies are MUCH more important to perceived audio fidelity than the top octave. In fact, of the ten octaves or so that humans can hear, the top octave is the least important. Why do people expend so much energy worrying about the least important thing? Makes no sense.

You can rest easy knowing that CD sound is all you need. When you look for the truth and use your ears, OCD and pathological attention to detail isn't a problem any more. You have a lot more time and attention to spend on things that really matter- like appreciating great music. Music can be appreciated even without perfect fidelity. Acoustic Caruso 78s never fail to impress me. I can't imagine life without the Duke Ellington sides from the early 30s. And Bruno Walter's first act of Die Walkure from 1935 has never been bettered regardless of the sound technology advances. We are VERY fortunate to be living in an era of perfect recorded sound. Why go looking for theoretical imperfections? Appreciate how great digital audio is.

Focus on the road, not the clarity of the windshield.
 
Nov 7, 2017 at 1:01 PM Post #2,519 of 3,525
I like a clean windshield, but that's what my server is for.
 
Nov 7, 2017 at 1:12 PM Post #2,520 of 3,525
In all fairness, I don't think "people expend a lot of energy worrying about the top octave".
It's simply that, regardless of our personal limitations, the asserted goal of high fidelity is to reproduce the music as accurately as possible.
Therefore, some of us just like the idea that our gear will do so.
(So far I haven't seen a thread bemoaning how selling us audio amplifiers with a frequency response past 20 kHz is a major scam.)
Why is this such a big deal ONLY when it comes to high-res files?

My reply is simply.........
What's the big deal about paying an extra $5 for a 96k file instead of a 44k file?
I'm not personally all that sure I'd hear a difference......
But I'd rather pay an extra five bucks to get the version that's "twice as good as I need" instead of the one that's "just barely as good as I need" (most of my other equipment is also a bit better than I strictly need).
And, no, I'm not totally convinced that it's quite good enough.
But, even if it is, what's so awful about paying a few cents more to buy a lot of extra safety margin?
And what's the big deal about "only" being able to fit 20,000 albums on a $150 hard drive?
(If I just read this thread, I'd think 96k files were selling for $500 each - like fancy power cables.)
@gregorio explained in his 24bit vs 16bit thread that the difference is not twice as much!
Hopefully you're still with me, because we can now go on to precisely what happens with bit depth. Going back to the above, when we add a 'bit' of data we double the number of values available and therefore halve the number of quantisation errors. If we halve the number of quantisation errors, the result (after dithering) is a perfect waveform with half the amount of noise. To phrase this using audio terminology, each extra bit of data moves the noise floor down by 6dB (half). We can turn this around and say that each bit of data provides 6dB of dynamic range (*4). Therefore 16bit x 6dB = 96dB. This 96dB figure defines the dynamic range of CD. (24bit x 6dB = 144dB).

So, 24bit does add more 'resolution' compared to 16bit but this added resolution doesn't mean higher quality, it just means we can encode a larger dynamic range. This is the misunderstanding made by many. There are no extra magical properties, nothing which the science does not understand or cannot measure. The only difference between 16bit and 24bit is 48dB of dynamic range (8bits x 6dB = 48dB) and nothing else. This is not a question for interpretation or opinion, it is the provable, undisputed logical mathematics which underpins the very existence of digital audio.

So, can you actually hear any benefits of the larger (48dB) dynamic range offered by 24bit? Unfortunately, no you can't. The entire dynamic range of some types of music is sometimes less than 12dB. The recordings with the largest dynamic range tend to be symphony orchestra recordings but even these virtually never have a dynamic range greater than about 60dB. All of these are well inside the 96dB range of the humble CD. What is more, modern dithering techniques (see 3 below) perceptually enhance the dynamic range of CD by moving the quantisation noise out of the frequency band where our hearing is most sensitive. This gives a perceivable dynamic range for CD up to 120dB (150dB in certain frequency bands).
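As a quick numerical check of that "6dB per bit" arithmetic, here is a small sketch (the -6 dBFS test tone and simple TPDF dither are my own choices) that quantises the same signal to 16 and 24 bits and measures the resulting noise floor:

```python
import numpy as np

def quantise(x, bits, rng):
    """Quantise x (full scale +/-1.0) to `bits` with TPDF dither; return the error."""
    step = 2.0 / (2 ** bits)
    dither = (rng.random(x.size) - rng.random(x.size)) * step   # +/-1 LSB TPDF
    return np.round((x + dither) / step) * step - x

rng = np.random.default_rng(0)
fs = 48_000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 997 * t)      # a -6 dBFS test tone (assumed signal)

# Each extra bit should lower the dithered noise floor by about 6 dB, so the
# 16 bit and 24 bit measurements should come out roughly 48 dB apart.
for bits in (16, 24):
    err = quantise(x, bits, rng)
    print(f"{bits}-bit noise floor: {10 * np.log10(np.mean(err ** 2)):.1f} dB re full scale")
```

The two printed noise floors land about 48dB apart, i.e. 8 bits x 6dB, which is exactly the difference described in the quoted text.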
 
