How Digital Audio works
Sep 7, 2014 at 3:23 PM Thread Starter Post #1 of 29

FoxSpirit

Head-Fier
Joined
Dec 13, 2010
Posts
88
Likes
13
Xiph.org has put up a fantastic video explaining the basic science behind digital audio. I was impressed.
http://xiph.org/video/vid2.shtml
 
Have fun.
 
Sep 7, 2014 at 7:26 PM Post #3 of 29
Yes, it's a nice vid, but you would probably find it already in most Sound Science topics about hi-res, bit depth, sample rate, DACs or ADCs....
 
in fact http://www.head-fi.org/newsearch?advanced=1&action=disp&search=http%3A%2F%2Fxiph.org%2Fvideo%2Fvid2.shtml&titleonly=0&byuser=&output=posts&replycompare=gt&numupdates=&sdate=0&newer=1&sort=relevance&order=descending&Search=SEARCH&Search=SEARCH
 
Sep 7, 2014 at 7:39 PM Post #4 of 29
Yes, a very good video. Though it has been posted before, it is well worth noting once again.
 
If everyone posting would first view and understand that video, there would be far fewer foolish exchanges. The context of other questions would also be much clearer and cleaner to discuss.
 
Oct 8, 2014 at 6:49 PM Post #5 of 29
So, I just watched Monty's vid for the second time, having seen it some months ago. And I have to say, he does an impressive job of explaining what he knows.
 
The problem is that he clearly doesn't know what he doesn't know.
 
What he doesn't know is something that I used to know, long ago, and that my son happened to write in a paper for school a couple of weeks ago that touched on the Nyquist-Shannon Theorem. That is that the theorem only applies to the digitization of continuous bandwidth-limited functions, which, as you'll recall from the video, can be accurately digitized by a sample having a sampling frequency not less than twice the maximum frequency of the bandwidth-limited function being sampled. The key here is that the function must be continuous.
 
Music is not continuous. That's why digital music, sampled at 44.1kHz, doesn't always sound as good as you would hope it should. Indeed, I often wonder why it sounds as good as it does, and perhaps that's the question we really should ask.
 
In any event, every drum beat, every cymbal crash, every piano keystroke, every pluck of a stringed instrument, even every catch of a burr on a horsehair on a violin, viola, cello or bass string, creates a discontinuity that an A-to-D converter has a hard time dealing with. The result is a mere approximation in the digital code, and then an approximation in the resulting signal after reconversion by a DAC. That's why many people hear a difference, and not a happy one, between analogue and digital recordings.
 
Oct 8, 2014 at 7:47 PM Post #6 of 29
  Nyquist-Shannon Theorem... only applies to the digitization of continuous bandwidth-limited functions
 
Music is not continuous. That's why digital music, sampled at 44.1kHz, doesn't always sound as good as you would hope it should.

 
As long as a transient isn't any faster than one wave of the highest frequency Redbook reproduces, it is represented. That means a transient would have to be faster than 1/20,000th of a second to be missed. How small is that? Well here is a photo that captures 1/20,000th of a second of a drop of milk.
 

 
There's nothing with that fast of a transient in music. Transients in music are thousands of times slower than that. A quick snare drum hit takes up about 1/5th of a second. That is 4000 waves at 20kHz (or roughly 8800 samples at 44.1kHz). That's plenty of continuity to define the sound of a sharp transient in music.
 
Human hearing can't distinguish short burst transients. If you listen to a continuous 1kHz tone, it sounds like a tone. If you hear a short burst of the same tone, it sounds like a click.
 
So not only are transients in music multiple orders of magnitude slower than Redbook's ability to reproduce them, the human ear can't discern very short burst transients as anything but just an instant of noise anyway.
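For anyone who wants to check the arithmetic in this post, here is a minimal Python sketch. The 20kHz top frequency and the 1/5 second snare hit are the round figures used above, not measurements, and the function names are made up for illustration.

```python
# Rough arithmetic for the snare example. All figures are the post's
# round numbers, not measurements.
SAMPLE_RATE_HZ = 44_100   # Redbook CD sample rate
TOP_FREQ_HZ = 20_000      # nominal upper limit of hearing
SNARE_SECONDS = 0.2       # "about 1/5th of a second"


def cycles_in(duration_s: float, freq_hz: float) -> float:
    """Number of full waves of freq_hz that fit in duration_s."""
    return duration_s * freq_hz


def samples_in(duration_s: float, rate_hz: int) -> int:
    """Number of samples taken during duration_s at rate_hz."""
    return round(duration_s * rate_hz)


if __name__ == "__main__":
    print("waves at 20kHz:", cycles_in(SNARE_SECONDS, TOP_FREQ_HZ))
    print("samples at 44.1kHz:", samples_in(SNARE_SECONDS, SAMPLE_RATE_HZ))
```

A 1/5 second hit spans 4000 waves at 20kHz and a bit under 9000 samples at 44.1kHz.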
 
Oct 8, 2014 at 7:52 PM Post #7 of 29
So here's my only question
 
With regards to these threads, are the informed parties participating with the goal of moving towards an ultimate truth, or are they simply here to disprove each other and further their own egos?
 
it's for that reason that I generally avoid getting involved in these threads
 
also the information in the video was very useful and very basic. I appreciated it and enjoyed watching, now have fun you guys with your pissing match from  here on out
 
I also would like to mention that the video guy clearly admitted that he left a lot out and chose to have a simple presentation.
 
 
 
The problem is that he clearly doesn't know what he doesn't know.
 

So from my limited perspective, the above comment does nothing to bring any sort of forward momentum to the topic at hand, but is rather a pointless insult lobbed to either discredit Monty or bring some kind of credit to the poster.
 
From my completely unbiased perspective [as I don't know jack schiit about this] I appreciated his simple explanation and clearly understood that there was a lot left out of that video. He clearly acknowledged this, so I'm sure he's aware of what he doesn't know and works to fill in the blanks. No thanks to such pointless insults from his peers.
 
Instead of posting your objections as an open-ended question for the sake of KNOWLEDGE, you took a pot shot at Monty and bragged about yourself and your son. Which is fine, it's great that you have that pride, but there is no place for such a display of pride here. From my perspective you allowed your pride to "muck" up this discussion. It's all in how it's worded, and sadly I can't hear you speaking, so I'm making some pretty drastic assumptions. However, I voice my opinion in the hope that someone will recognize when their pride is taking away from the forward momentum that we all want, and adjust their actions accordingly.
 
To that end I hope I can learn something here! That first video was pretty cool, but if this derails into another pissing match, I'll just avoid the Sound Science board for another couple of weeks. And by the way, my decision to "avoid" the board because of the actions of a few is in itself an act of "pride" getting in the way of learning. I can see that in myself, and I hope others who are more serious about learning this will not allow such pride to impede the ultimate goal, as I, the filthy casual, have done.
 
Oct 8, 2014 at 8:51 PM Post #8 of 29
My purpose is to share the techniques I've learned to give other people a leg up on getting good sound, and to learn from others so I can figure out techniques to make my own system sound better.
 
Oct 8, 2014 at 8:55 PM Post #9 of 29

Mshenay, you are certainly right about many of these exchanges--I almost wrote debates, but they are often, perhaps usually, not that--that occur on internet boards. It was not my intent to appear to be a braggart, or proud either of myself or my son. I apologize for giving that impression.
 
That said, what made me speak up was that I was offended by Monty's tone in his video. I believe that he oozes a smugness that he has not earned, and will not have earned, until he learns a lot more math. Sorry if that bothers you, but it's the math that underlies this stuff that's really important.
 
So, I will try to explain what I was trying to say one more time, and then I'll shut up.
 
Here's the thing. Everyone who says that Redbook digital is just fine and dandy because its cutoff frequency--22.05kHz--is above the upper limit of normal human hearing--20kHz--and then cites the Nyquist Theorem to back it up is citing a flawed version of the Nyquist Theorem. That theorem actually applies to continuous functions only--things that look like those sine waves that Monty put up on his scopes. Once he goes to a square wave, as he actually demonstrated, the ability to reproduce the wave through sampling no longer exists. That's what happens when you bandwidth limit that square wave--you get all those ripples, and the rise time is slower than the original square wave was, and the resulting output wave has the ripples and the slow rise time too. So Monty, instead of demonstrating the principle that digital audio is a really good reproducer of original signals, actually demonstrated the principle of garbage in, garbage out. He didn't reproduce the square wave. He reproduced his bandwidth limited version of the square wave.
 
As for Bigshot's issue of whether musical transients are slow enough for digital reproduction to capture them, well, some are discernible and some are not. To say that a snare drum hit takes 1/5 second isn't the relevant fact. You need to know what the snare drum hit's rise time is. I haven't been able to find that in a quick bout of googling, but I have found that the rise time of a cymbal hit is 1ms--1/1,000 second. With the high frequencies involved in that cymbal hit, you're going to need to be very lucky to get the samples to capture them in the time of that rise. Yes, you will get the decay, but you want both, accurately. Moreover, there are lots of musical sounds that have faster rise times, indeed, infinitely fast rise times--plucked strings and hammered piano strings, for example. In both, the string starts in a deformed position and then it is released, and therefore the string's vibrations begin upon the release, so the attack portion of its wave envelope has infinite slope. That's the kind of discontinuity I'm talking about.
 
No sampling frequency can truly accurately convey these transients within the meaning of Nyquist-Shannon, but when you increase the sampling frequency, you increase the probability that you will do a better job of conveying the information you are digitizing. That's the point of a higher sampling frequency, and that's why, as you increase the sampling frequency you will get better and better fidelity, although admittedly you're also going to reach a point of diminishing marginal returns at some sampling frequency. The question becomes what that frequency is for the vast majority of people, and for the vast majority of critical listeners.
 
Oct 8, 2014 at 8:56 PM Post #10 of 29
  My purpose is to share the techniques I've learned to give other people a leg up on getting good sound, and to learn from others so I can figure out techniques to make my own system sound better.

This is true, I do enjoy Big's posts. He is a master of the "neutral" type.
 
Oct 8, 2014 at 9:32 PM Post #11 of 29
 
That theorem actually applies to continuous functions only--things that look like those sine waves that Monty put up on his scopes. Once he goes to a square wave, as he actually demonstrated, the ability to reproduce the wave through sampling no longer exists. 

 
There is nothing in music that even remotely resembles a square wave. That is a theoretical sound that only exists in machines. I'm sure someone could hobble a computer and make it produce only square waves, but it wouldn't sound anything like the real world.
 
Nyquist is perfectly able to cover all of the audible sound in recorded music... everything from a symphony orchestra to a snare drum. And if a snare drum hit is 1/5th of a second, you can bet the attack on it isn't anywhere near 1/20,000th of a second. The difference in scale there is monumental. And the human ear itself is limited as to the speed of a transient it can discern as being different from a faster or slower transient. Within the band of human hearing, the waveform Redbook produces is identical to the one the ear would hear from the original. You are talking about theories that don't have any application in practice.
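The sampling theorem part of this can be checked numerically. The sketch below is only an illustration under stated assumptions: the two test tones (1kHz and 15kHz), the 44.1kHz rate, and the truncation of the ideal sinc sum to a finite window are all choices made here, not anything taken from the video. It samples a band-limited signal and then rebuilds its value at an instant that falls between samples, using Whittaker-Shannon (sinc) interpolation.

```python
import math

SAMPLE_RATE_HZ = 44_100  # Redbook sample rate


def signal(t: float) -> float:
    """Band-limited test signal: two tones, both below 22.05kHz (assumed)."""
    return (math.sin(2 * math.pi * 1_000 * t)
            + 0.5 * math.sin(2 * math.pi * 15_000 * t + 0.3))


def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x) / (pi x), with the limit value at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(t: float, half_width: int = 4_000) -> float:
    """Rebuild signal(t) from its samples only.

    The ideal formula sums over infinitely many samples; here the sum is
    truncated to half_width samples on each side of t, so a small
    residual error is expected.
    """
    position = t * SAMPLE_RATE_HZ          # time measured in samples
    center = round(position)
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        total += signal(n / SAMPLE_RATE_HZ) * sinc(position - n)
    return total


if __name__ == "__main__":
    t = 0.01234567  # an instant that does not land on a sample
    print("original:     ", signal(t))
    print("reconstructed:", reconstruct(t))
```

With a few thousand samples on each side, the rebuilt value should agree with the original to well under 1%. The residual comes from truncating a sum that the theorem assumes is infinite.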
 
Oct 8, 2014 at 9:38 PM Post #12 of 29
  As for Bigshot's issue of whether musical transients are slow enough for digital reproduction to capture them, well, some are discernible and some are not. To say that a snare drum hit takes 1/5 second isn't the relevant fact. You need to know what the snare drum hit's rise time is. I haven't been able to find that in a quick bout of googling, but I have found that the rise time of a cymbal hit is 1ms--1/1,000 second. With the high frequencies involved in that cymbal hit, you're going to need to be very lucky to get the samples to capture them in the time of that rise. Yes, you will get the decay, but you want both, accurately. Moreover, there are lots of musical sounds that have faster rise times, indeed, infinitely fast rise times--plucked strings and hammered piano strings, for example. In both, the string starts in a deformed position and then it is released, and therefore the string's vibrations begin upon the release, so the attack portion of its wave envelope has infinite slope. That's the kind of discontinuity I'm talking about.
 
No sampling frequency can truly accurately convey these transients within the meaning of Nyquist-Shannon, but when you increase the sampling frequency, you increase the probability that you will do a better job of conveying the information you are digitizing. That's the point of a higher sampling frequency, and that's why, as you increase the sampling frequency you will get better and better fidelity, although admittedly you're also going to reach a point of diminishing marginal returns at some sampling frequency. The question becomes what that frequency is for the vast majority of people, and for the vast majority of critical listeners.

 
It's still coming from the real world and is continuous (not a true discontinuity), just sometimes with some ultrasonic content maybe.
 
If that doesn't fit into Nyquist frequency, it gets aliased down or filtered out somewhere in the A/D chain, in various parts (maybe both to some degree), depending on the implementation. The key is whether or not the ultrasonics (or even aberrations, especially looking at phase shifts close to Nyquist in practical systems, non-brickwall filters, etc.) are valuable to the listening experience or impact it. Say you don't capture them. Is that important?
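To make the "aliased down" case concrete, here is a small sketch of where a pure tone lands if it is sampled with no anti-aliasing filter at all. The no-filter setup is an assumption for illustration only, since real converters filter first, and the function name is made up.

```python
def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling.

    Tones above half the sample rate fold back into the band between
    0 and sample_rate_hz / 2.
    """
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    for tone in (10_000, 22_050, 30_000, 50_000):
        print(tone, "Hz ->", alias_frequency(tone, 44_100), "Hz")
```

A 30kHz tone sampled at 44.1kHz would show up at 14.1kHz, which is why the filtering ahead of the converter matters.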
 
I've not really seen much evidence of any kind of critical listener benefiting from those differences other than a couple of papers that don't quite seem to be reproducible. Is it fair to throw out some anecdotes without proper controls, these results? Some say not. On the other hand, hundreds of people in Meyer/Moran and other examples couldn't detect the 44.1 kHz A/D/A loop inserted into the high-res playback... though people would dispute those results for various reasons too. Plenty of people have done comparisons in studios when digital was newer between a straight recording (all analog) played through speakers vs. recorded digitally and played back, and other such things. Personally, from what I've listened for myself, they don't seem to matter for me, so I leave it at that for myself.
 
You can also examine the physiology of the ear and go from that angle, but I know even less about that.
 
Oct 8, 2014 at 9:56 PM Post #13 of 29
  So, I just watched Monty's vid for the second time, having seen it some months ago. And I have to say, he does an impressive job of explaining what he knows.
 
The problem is that he clearly doesn't know what he doesn't know.
 
What he doesn't know is something that I used to know, long ago, and that my son happened to write in a paper for school a couple of weeks ago that touched on the Nyquist-Shannon Theorem. That is that the theorem only applies to the digitization of continuous bandwidth-limited functions, which, as you'll recall from the video, can be accurately digitized by a sample having a sampling frequency not less than twice the maximum frequency of the bandwidth-limited function being sampled. The key here is that the function must be continuous.
 
Music is not continuous. That's why digital music, sampled at 44.1kHz, doesn't always sound as good as you would hope it should. Indeed, I often wonder why it sounds as good as it does, and perhaps that's the question we really should ask.
 
In any event, every drum beat, every cymbal crash, every piano keystroke, every pluck of a stringed instrument, even every catch of a burr on a horsehair on a violin, viola, cello or bass string, creates a discontinuity that an A-to-D converter has a hard time dealing with. The result is a mere approximation in the digital code, and then an approximation in the resulting signal after reconversion by a DAC. That's why many people hear a difference, and not a happy one, between analogue and digital recordings.


As suggested by bigshot, the problem would be at best for what? The first sample? The first 2 samples of the new signal before we can pretend we're back in the conditions of a continuous signal? And what would be the error margin? I would guess that if it was significant, it would appear on at least some measurements. To me the fact that a DAC is by far the least distorting part of an audio system is proof enough that the theory works just fine for our actual needs. And in any case it's a lot more precise than any analog alternative available.
Nyquist is math, so obviously it assumes perfection, as sadly only math does. But then impedance bridging is made for a source with an impedance infinitely close to zero, power sources are supposed to be perfect, USB ports are supposed to deliver exactly 5V all the time...
That leads me to the pissing contest mentioned by Mshenay. We can always argue about something being not good enough, and if we can improve it, hell, why not! But if we do that in audio, I wish we would go for speakers and headphones first, as they are clearly the weak part of the audio chain. That is where massive sound improvements can be expected.
Each time someone cries wolf on jitter, cables, or poor old 44kHz not being enough and the SACD scam saving the world, we're wasting time on stuff 80dB or more below the signal. I find that to be a great waste of time and resources when at the same time we can't seem to get a modern CD without clipping or with even 80dB of dynamic range. Same with headphones and speakers having distortion well above -80dB. 80dB below signal, that's 0.01% for THD. What sound system provides that from the album to our ears? The DAC (digital processing) is the good kid, first in class in that respect. The day he ends up being last in class, I'll be all over those Nyquist limits and jitter buggers.
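The dB to percent conversion used here is easy to verify. A minimal sketch, assuming the usual amplitude (20 log10) convention for distortion figures; the function name is hypothetical.

```python
def db_to_percent(level_db: float) -> float:
    """Convert a level in dB relative to the signal into a percentage
    of signal amplitude, using the 20 * log10 amplitude convention."""
    return 100 * 10 ** (level_db / 20)


if __name__ == "__main__":
    for level in (-40, -60, -80):
        print(level, "dB is about", db_to_percent(level), "% of the signal")
```

Under that convention, -80dB works out to about 0.01% and -40dB to about 1%.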
 
Oct 8, 2014 at 11:46 PM Post #14 of 29
Here's the thing. Everyone who says that Redbook digital is just fine and dandy because its cutoff frequency--22.05kHz--is above the upper limit of normal human hearing--20kHz--and then cites the Nyquist Theorem to back it up is citing a flawed version of the Nyquist Theorem. That theorem actually applies to continuous functions only--things that look like those sine waves that Monty put up on his scopes. Once he goes to a square wave, as he actually demonstrated, the ability to reproduce the wave through sampling no longer exists. That's what happens when you bandwidth limit that square wave--you get all those ripples, and the rise time is slower than the original square wave was, and the resulting output wave has the ripples and the slow rise time too. So Monty, instead of demonstrating the principle that digital audio is a really good reproducer of original signals, actually demonstrated the principle of garbage in, garbage out. He didn't reproduce the square wave. He reproduced his bandwidth limited version of the square wave.

 
To add to rebuttals of this point:
The issue isn't whether he can recreate the square wave exactly, it's whether or not we, as humans, can hear the difference between his reconstructed signal and the original analog square wave.  The frequency content of the *difference* between the reconstructed signal and the original square will ideally all be above Nyquist.  Here for instance is the theoretical reconstruction of a 4186hz square wave from 44.1k and 96k samples, along with their difference:
http://i1292.photobucket.com/albums/b574/rolandoarodriguez/square_zps39bffd05.png
 
Here is said difference as a 96/16 wav:
https://drive.google.com/file/d/0BwmVtb5IwniEWmNKMTJEMVNEelU/view?usp=sharing
 
I sure can't hear anything from it.  The fact is that we hear any square wave above about 6600hz as a sine wave, since its first overtone (the third harmonic, at three times the fundamental) already sits at or beyond the roughly 20kHz limit of our hearing.
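The point about where the overtones fall can be sketched in a few lines. This assumes an ideal square wave, which contains only odd harmonics, and treats each Nyquist limit (22.05kHz and 48kHz) as a perfect cutoff; the function name is made up for illustration.

```python
def square_wave_harmonics(fundamental_hz: float, limit_hz: float) -> list:
    """Odd harmonics of an ideal square wave that fall below limit_hz."""
    harmonics = []
    n = 1
    while n * fundamental_hz < limit_hz:
        harmonics.append(n * fundamental_hz)
        n += 2  # an ideal square wave has odd harmonics only
    return harmonics


if __name__ == "__main__":
    kept_44k = square_wave_harmonics(4186, 22_050)   # Nyquist for 44.1kHz
    kept_96k = square_wave_harmonics(4186, 48_000)   # Nyquist for 96kHz
    print("kept at 44.1kHz:", kept_44k)
    print("kept at 96kHz:  ", kept_96k)
    print("difference:     ", [h for h in kept_96k if h not in kept_44k])
```

For the 4186Hz square wave, the harmonics present in the 96kHz version but missing from the 44.1kHz one all sit above 22.05kHz, which matches the silent difference file.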
 
Oct 9, 2014 at 5:41 AM Post #15 of 29
...  
That said, what made me speak up was that I was offended by Monty's tone in his video. I believe that he oozes a smugness that he has not earned, and will not have earned, until he learns a lot more math. Sorry if that bothers you, but it's the math that underlies this stuff that's really important. ...
...
Here's the thing. Everyone who says that Redbook digital is just fine and dandy because its cutoff frequency--22.05kHz--is above the upper limit of normal human hearing--20kHz--and then cites the Nyquist Theorem to back it up is citing a flawed version of the Nyquist Theorem. That theorem actually applies to continuous functions only ...
...
As for Bigshot's issue of whether musical transients are slow enough for digital reproduction to capture them, well, some are discernible and some are not. To say that a snare drum hit takes 1/5 second isn't the relevant fact. You need to know what the snare drum hit's rise time is. I haven't been able to find that in a quick bout of googling, but I have found that the rise time of a cymbal hit is 1ms--1/1,000 second. With the high frequencies involved in that cymbal hit, you're going to need to be very lucky to get the samples to capture them in the time of that rise. Yes, you will get the decay, but you want both, accurately. Moreover, there are lots of musical sounds that have faster rise times, indeed, infinitely fast rise times--plucked strings and hammered piano strings, for example. In both, the string starts in a deformed position and then it is released, and therefore the string's vibrations begin upon the release, so the attack portion of its wave envelope has infinite slope. That's the kind of discontinuity I'm talking about.
 
...

 
Play the message, not the man. Some people do find Monty's mannerisms irritating, but the video does a very good job of explaining the theory without resorting to math, which would be over the head of the target audience. His grasp of the math involved is likely to be entirely adequate, given his track record as a codec developer.
 
The Nyquist-Shannon Sampling Theorem still applies to transients. Every transient (step function) is composed of a series of continuous functions, as Monty shows with the addition of harmonics to make a square wave.
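The "addition of harmonics" idea can be tried out directly. Below is a minimal sketch, assuming an ideal unit-amplitude square wave and a perfect cutoff; the 1kHz fundamental and the function name are illustrative choices, not anything from the video. It sums the standard Fourier series of a square wave, keeping only the harmonics below the cutoff.

```python
import math


def band_limited_square(t: float, fundamental_hz: float, cutoff_hz: float) -> float:
    """Value at time t of a square wave built from its odd sine harmonics,
    keeping only those below cutoff_hz.

    Uses the Fourier series (4 / pi) * sum(sin(2 pi n f t) / n), n odd.
    """
    total = 0.0
    n = 1
    while n * fundamental_hz < cutoff_hz:
        total += math.sin(2 * math.pi * n * fundamental_hz * t) / n
        n += 2
    return 4 / math.pi * total


if __name__ == "__main__":
    quarter_period = 1 / (4 * 1_000)  # middle of the positive half cycle
    for cutoff in (2_000, 6_000, 22_050):
        value = band_limited_square(quarter_period, 1_000, cutoff)
        print("cutoff", cutoff, "Hz ->", round(value, 4))
```

Each term is a smooth, continuous sine, yet as more harmonics are allowed through, the value in the middle of the half cycle settles toward the flat top of the square wave. The ripples left over are what the band limit removes, not an error introduced by sampling.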
 
Your understanding of transients created by musical instruments appears to be somewhat less than complete. For example, the initial rise time of a plucked string is far from infinitely fast. The string does not instantly go from rest to maximum velocity when released, it obeys the laws of physics and takes a finite time. Granted, some instruments may generate components in excess of 20 kHz, and sampling at 44 kHz may not capture all of them, but this isn't a shortcoming unique to digital reproduction. A few minutes' experimentation with an audio editor and some well recorded 24/96 music should give you an appreciation of the real-world spectrum content of musical instruments, and if you apply a low pass filter at decreasing frequencies you're likely to be surprised to find out how low you can cut off before it becomes noticeable.
 
