digital theory versus reality
Jul 6, 2016 at 10:35 PM Thread Starter Post #1 of 88

johncarm

100+ Head-Fier
Joined
Jun 27, 2014
Posts
313
Likes
21
I read the first pages of Gregorio's 24 bit/16 bit digital "myth exploded" thread. Within the first few pages, it was all theory. There should be no audible difference between 16 bits and 24 bits assuming the equipment works in an ideal manner.
 
But, does the fact that no audio device is perfect affect the ideal bit rates/depths?
 
Let me clarify what I mean by "no audio device is perfect." 
 
I DO NOT mean the fact that digital signals are band-limited and require anti-aliasing. Nor do I consider the need for dither to be a malfunction. 
 
One "imperfection" could be jitter. Also, maybe nonlinearities in the analog portion of ADCs and DACs. Also, maybe nonlinearities in ADC conversion if any such errors could have a pervasive effect on distortion in the output signal.
 
So basically the question is, does the existence of real-world problems such as these require higher bit rates or depths?
 
Also I want to ask a question about distortion as conceived of "in time."
 
I usually see distortion described by its level relative to 0 dBFS, and it is obviously pretty tiny in digital. This is describing the amplitude of the distortion.
 
I'm curious if it could also be characterized as a time distortion.
 
Let me explain where I'm coming from with this. I did some sound synthesis via software in college, and I wanted to make synthesized instruments "sound real." Some kinds of tiny imperfections in time, like randomly varying the phase of a waveform by amounts that were not consciously perceptible, increased the realism. This was more obvious when the waveform had high-frequency transients, i.e. spikes. 
 
A "spike" in a waveform is defined not only by its amplitude and frequency content, but also by the moment in time that it occurs. It's an event, so to speak.
 
It was interesting---maybe my ear was hearing the relative position of spikes at a high level of precision. We would need some psychoacoustic experimentation to find out.
 
There is another kind of sound synthesis which involves "events," namely granular synthesis. You start by defining a short sound, and then create a sustained sound by overlapping many instances of the short sound. The sustained sound's characteristics can be modified by choosing the relationship in time among the "grains," as the instances are called.
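As an illustrative sketch of the idea (grain length, spacing and jitter values are arbitrary, not taken from any particular synthesizer), granular synthesis can be only a few lines of numpy:

```python
import numpy as np

fs = 44100
# A single "grain": a 20 ms Hann-windowed 440 Hz burst (values illustrative)
t = np.arange(int(0.02 * fs)) / fs
grain = np.hanning(len(t)) * np.sin(2 * np.pi * 440 * t)

# Sustain it for ~1 second by overlapping a grain every 5 ms, with a
# little random jitter in each grain's start time
out = np.zeros(fs)
hop = int(0.005 * fs)
rng = np.random.default_rng(0)
for start in range(0, fs - len(grain), hop):
    start = max(start + int(rng.integers(-20, 21)), 0)
    out[start:start + len(grain)] += grain
```

Changing the hop size, the jitter range, or the grain envelope changes the character of the sustained sound, which is exactly the "relationship in time among the grains" described above.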
 
I'm picturing impulses travelling through the brain's neurons, and a pattern in the nervous system is established by the relative timing of these impulses.
 
It seems to me that digital distortion arising from sources such as antialiasing filters distorts the shape of a transient, which could then have an effect on the timing of the neuron signal it triggers.
 
In that case, understanding the effects of distortion would be a matter of understanding the ear's response not to amplitude, but rather to relative timing of many transients.
 
So my question is, what work has been done on this in psychoacoustics, and is there any analysis or experiment to demonstrate that antialiasing filters don't disrupt the brain's perception of transient timing?
 
Jul 7, 2016 at 9:38 AM Post #2 of 88
  I read the first pages of Gregorio's 24 bit/16 bit digital "myth exploded" thread. Within the first few pages, it was all theory.

 
True, but let's not forget that digital audio theory is quite different from many other scientific theories. If we take, say, the theory of evolution as a comparison, it's a theory which was designed to explain an observation in the natural world. This theory has a wealth of supporting evidence and therefore our confidence in it being correct is extremely high, but not absolutely certain beyond all doubt. This is not the case with digital audio theory though. The basic tenet of digital audio theory was not designed to explain a natural phenomenon, it was designed as a mathematical theory/concept. Some 20 years later the theory was proven mathematically, and then roughly 20 years later again, various organisations started trying to invent equipment to realise this theory in practice.
 
Quote:
  But, does the fact that no audio device is perfect affect the ideal bit rates/depths?

 
The simple answer is "no, it shouldn't" but the accurate answer is not so simple. There are many variables at play here, not least of which is that an audio device manufacturer may not even be aiming for "perfect" in the first place. Indeed, there have been DACs released on the market which deliberately break the basic tenets of digital audio purely for marketing purposes, even though this degrades the audio fidelity within the audible range! With these sorts of shenanigans going on (particularly in the audiophile segment of the market), the accurate answer would be "yes, it can"; almost anything is possible, including that an audio device could affect the "ideal bit rates/depths". All we can say is that no competently engineered device should affect the ideal bit rates/depths, and this even includes relatively cheap audio devices.
 
Quote:
  So basically the question is, does the existence of real-world problems such as these require higher bit rates or depths?

 
No. There is a list of measurable non-linearities in ADCs and DACs but typically these are orders of magnitude below audibility and there are even some which are potentially audible but are solved by local up/down sampling rather than requiring a higher sample rate or bit depth in the distribution format.
 
  A "spike" in a waveform is defined not only by its amplitude and frequency content, but also by the moment in time that it occurs.

 
I see where you are coming from. As musicians we have 3 fundamental basics: volume, pitch and timing. If we're taught anything at all about the science of sound, we're usually taught that volume (loudness) = amplitude and pitch = frequency, and as amplitude and frequency are all we are able to record or reproduce, this leaves timing rather unaccounted for! Unfortunately, "volume = amplitude" and "pitch = frequency" is not actually true, it's just a useful over-simplification which is only partially true or true sometimes. Without going into reams of detail, our timing component is already dealt with by frequency, because frequency is the number of audio cycles per second (Hertz), so timing is implicit in measuring/recording/reproducing frequency. In other words, a spike (transient) is in fact defined only by its amplitude and frequency content. This fact invalidates most of the rest of your post.
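The point that timing is implicit in the frequency-domain description can be stated precisely as the Fourier shift theorem:

```latex
x(t-\tau) \;\longleftrightarrow\; X(f)\,e^{-i 2\pi f \tau}
```

Delaying a signal by \tau leaves the magnitude spectrum |X(f)| untouched and changes only the phase, so a system that preserves amplitude and phase at every frequency necessarily preserves the timing of every transient; there is no separate "timing" quantity left over.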
 
  So my question is, what work has been done on this in psychoacoustics, and is there any analysis or experiment to demonstrate that antialiasing filters don't disrupt the brain's perception of transient timing?

 
The timing itself, as mentioned above, is a non-issue. However, there is a potential issue when employing a steep anti-aliasing/reconstruction filter within the theoretical hearing spectrum, as is the case with the 44.1k sample rate. Such a filter can cause a phenomenon called pre- or post-ringing: essentially echoes of the transient which occur before or after the transient itself. These phenomena can be heard, but it requires particular listening conditions and test signals specifically designed for the task (which don't occur naturally). Modern ADC/DAC design schemes (oversampling, improved filter topologies, etc.) have effectively eliminated the issue, even in relatively cheap units, although again, the possibility of incompetent designs (deliberate or not) existing on the market cannot be ruled out.
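For readers who want to see the pre/post-ringing shape, here is an illustrative numpy sketch (a windowed-sinc filter with arbitrary parameters, not a model of any particular DAC): a steep linear-phase lowpass near Nyquist responds to a single-sample impulse with ripple both before and after the main peak.

```python
import numpy as np

fs = 44100
# A steep linear-phase FIR lowpass near Nyquist (windowed-sinc design;
# the 20 kHz cutoff and 255 taps are illustrative, not any real DAC's filter)
ntaps, fc = 255, 20000.0
n = np.arange(ntaps) - (ntaps - 1) / 2
h = (2 * fc / fs) * np.sinc(2 * fc / fs * n) * np.blackman(ntaps)

# The response to a single-sample impulse is h itself: a main peak with
# ripple both BEFORE and after it -- the pre- and post-"echoes"
peak = int(np.argmax(np.abs(h)))
pre_ring = np.max(np.abs(h[:peak - 5]))
post_ring = np.max(np.abs(h[peak + 6:]))
```

Because the filter is linear-phase, the ripple is exactly symmetric around the peak, which is why this design choice produces "pre-ringing" at all; minimum-phase designs push all the ringing after the transient.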
 
There's the famous Meyer & Moran study published by the AES in 2007, which demonstrated no discernible difference between high sample rate recordings and the same recordings truncated down to 44.1/16 by a CD player. But beyond that, I'm afraid I cannot provide links or citations off the top of my head for the experiment or other info I mentioned. Maybe someone else will be kind enough to provide some or you could research pre and post ringing yourself.
 
G
 
Jul 7, 2016 at 10:40 AM Post #3 of 88
 
Let me explain where I'm coming from with this. I did some sound synthesis via software in college, and I wanted to make synthesized instruments "sound real." Some kinds of tiny imperfections in time, like randomly varying the phase of a waveform by amounts that were not consciously perceptible, increased the realism. This was more obvious when the waveform had high-frequency transients, i.e. spikes. 

I'm not 100% sure about that but I think it's called phase distortion? Anyways, I wonder how we can discern two signals which have the same amplitude and frequency but different phase by using frequency and amplitude values only.
 
Jul 7, 2016 at 3:51 PM Post #4 of 88
 gregorio
True but let's not forget that digital audio theory is quite different from many other scientific theories. If we take say the theory of evolution as a comparison, it's a theory which has been designed to explain an observation in the natural world. This theory has a wealth of supporting evidence and therefore our confidence in it being correct is extremely high but not absolutely certain beyond all doubt. This is not the case with digital audio theory though. 

That's not how I would put it. Digital signal theory is math. And you always have to check if reality matches the math. Theoretical physics has the same issue. There is a deep and somewhat mysterious link between math and reality. If we imagine all possible universes, there is no guarantee that we end up in a universe that is well-explained by math, so progress in, say, physics is astounding. But again, you have to check the reality.
 
The theory you are explaining corresponds to maybe a first-year course in digital signal processing. It doesn't get into jitter. That's just one issue. The central problem is verifying that the reality matches the theory well enough.
 
 gregorio
 
I see where you are coming from. As a musician we have 3 fundamental basics: Volume, pitch and timing. If we're taught anything at all about the science of sound, we're usually taught that volume (loudness) = amplitude and pitch = frequency and as amplitude and frequency is all we are able to record or reproduce, this leaves timing rather unaccounted for! Unfortunately, "volume = amplitude" and "pitch = frequency" is not actually true, it's just a useful over-simplification which is only partially true or true sometimes. Without going into reams of details, our timing component is already dealt with by frequency, because frequency is the number of audio cycles per second (Hertz), so timing is implicit in measuring/recording/reproducing frequency. In other words, a spike (transient) is in fact defined only by it's amplitude and frequency content. This fact invalidates most of the rest of your post.

 
I am familiar with the Fourier transform and how it's a complete representation of a signal. 
 
But that's theory again. 
 
You are asking the question "What numbers/functions do we need for a complete description of this transient?"
 
I'm asking an entirely different question, which is "How does the device distort the transient and how do we characterize that distortion?" 
 
Turntable wow and flutter is a distortion. You wouldn't take an FFT of the turntable's output to characterize its wow and flutter. That's the wrong tool for the job. Instead you would characterize it by something like average or peak variation in speed.
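To make the "right tool for the job" point concrete, here is a hedged numpy sketch of one way wow could be measured: track the instantaneous frequency of the 3150 Hz tone commonly used for wow and flutter tests and report its relative deviation (the 0.3% depth and 1 Hz wow rate are illustrative assumptions, not a reference to any standard's limits).

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs                  # one second of audio
# Simulated turntable output: a 3150 Hz test tone with 0.3% wow at 1 Hz
# (depth and rate are illustrative values)
f0, depth, rate = 3150.0, 0.003, 1.0
phi = 2 * np.pi * f0 * t + (f0 * depth / rate) * np.sin(2 * np.pi * rate * t)
x = np.sin(phi)

# Analytic signal via FFT (a numpy-only Hilbert transform)
X = np.fft.fft(x)
w = np.zeros(fs); w[0] = w[fs // 2] = 1.0; w[1:fs // 2] = 2.0
z = np.fft.ifft(X * w)

# Instantaneous frequency; its relative deviation from 3150 Hz is the wow
inst = np.diff(np.unwrap(np.angle(z))) * fs / (2 * np.pi)
wow = np.max(np.abs(inst[100:-100] - f0)) / f0   # recovers ~0.003
```

The FFT is still in there as a computational tool, but the figure reported is a speed deviation, not a spectrum, which is the distinction being made above.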
 
We're in odd territory here, compared to the theory/math you are using. A theory is useful, in part, because it allows prediction of an experiment that has not yet been run. The math you describe allows an exact prediction. For instance, say we are talking about a perfectly linear device L. If we measure its transfer function in the frequency domain, we can predict its response to any signal. In theory. We have "characterized" the behavior of the device.
 
But we have to ask in what ways the device deviates from reality. And in doing so, we end up characterizing distortions in ways that don't allow much prediction.
 
For instance, we measure wow and flutter of a turntable, but that doesn't help us predict the turntable's output level O at time T given an input signal S.
 
So back to DACs. I often see a plot of the frequency response of a DAC. And it usually looks very flat through the audible band, giving the impression it has no significant distortion.
 
But is that the right tool for characterizing the way the DAC distorts transients? I don't think so. For one thing, it doesn't allow you to predict the response of the DAC to a given input. We are no longer in "theory land."
 
This is where I think the timing of a transient needs to be investigated. I would be interested if there is any psychoacoustical work characterizing the precision of the ear/brain in picking up timing relationships among transients. Such investigation is necessary to characterize not a purely theoretical device, but instead the ways it deviates from theory.
 
I understand what you are saying about pre and post ringing. I'm interested in reading the Meyer & Moran paper as well (although it costs money so I am not likely to). But I would ask, how do we know these things? How do we know that pre and post ringing can only be heard under certain conditions? How did Meyer and Moran make their determination? I am not asking you to give a complete explanation, but I'm not too impressed with what I've seen about psychoacoustics so far. The experiments seem to be run under pretty limited conditions or investigate a very small subset of phenomena.
 
Jul 7, 2016 at 4:51 PM Post #5 of 88
  That's not how I would put it. Digital signal theory is math. And you always have to check if reality matches the math. Theoretical physics has the same issue. There is a deep and somewhat mysterious link between math and reality. If we imagine all possible universes, there is no guarantee that we end up in a universe that is well-explained by math, so progress in, say, physics is astounding. But again, you have to check the reality.
 

 
The link between math and reality is deep, but not necessarily mysterious. On the fringes of theoretical physics, in areas like a unifying theory, we simply don't know a lot, but in many other areas we do. Newton's laws explain motion on a human scale, and relativity explains it on a galactic scale. There are areas that we don't understand, sure, but if you say "Digital signal theory is math" and conflate its established mathematics with the not-yet-complete mathematics that would unify Einstein and Newton, then you simply don't understand either subject. There is a difference between not understanding something, and it being mysterious and unknowable. 
 
Jul 7, 2016 at 5:33 PM Post #6 of 88
   
The link between math and reality is deep, but not necessarily mysterious. On the fringes of theoretical physics, in areas like a unifying theory, we simply don't know a lot, but in many other areas we do. Newton's laws explain motion on a human scale, and relativity explains it on a galactic scale. There are areas that we don't understand, sure, but if you say "Digital signal theory is math" and conflate its established mathematics with the not-yet-complete mathematics that would unify Einstein and Newton, then you simply don't understand either subject. There is a difference between not understanding something, and it being mysterious and unknowable. 

Sure, there are differences between digital signal processing and theoretical physics, but in both cases you have to check if reality matches the theory. It's pretty simple.
 
Jul 7, 2016 at 5:45 PM Post #7 of 88
and it has been checked by real world application
 
digital analog signal theory is one of the most powerful and thoroughly tested applications of math to technology, with "real world results" - it's always fun reading of how it can't work over an internet DSL connection sending Mbaud of data in digitally synthesized QAM, converted to analog DAC output, passed over 1/2 mile of 1/2 century old voice telephone twisted pair to my desktop where the analog signal is then ADC captured, decoded digitally to my computer
 
 
a question; is this a "Gish Gallop" rhetorical attack? - you just keep throwing out easily answered/debugged constructions - but too fast for any to reply in depth - and then you declare we don't have the answers?
 
Jul 7, 2016 at 5:54 PM Post #8 of 88
There is some investigation into pulse trains: groups of transients which the ear can group together, timing the differences between the two ears. You typically will see how the stereo effect is made up of loudness differences between the two ears, which predominate above 1500 Hz, and timing differences between the ears, which predominate at 800 Hz and lower. The two are mixed from 800-1500 Hz. The ear can group closely spaced pulse trains that are at higher frequencies and hear the timing difference between the two ears.
 
Not quite directly from that, we know that at around 700 Hz or so the ear/brain can distinguish, between test tones and other stimuli, a difference down to about 10 microseconds between the two ears. It probably can't quite distinguish that small a time difference with the pulse trains.
 
In regards to digital audio, the timing of transients with frequencies below Nyquist is much better than 10 microseconds. CD can potentially do this to something like 55 picoseconds.
 
This still isn't direct and complete data for your exact question, though it is somewhat in the area.
 
Pre and post ringing of competent digital gear I don't think is a problem. For one, the ringing (and most of it is not ringing, actually) occurs at frequencies in the transition band only. With CD that is between 20 kHz and 22.05 kHz. Not likely you can hear that either. Secondly, actual hard ringing as seen in test signals won't occur in a musical file, due to the filtering prior to ADC conversion. So your ear can't hear the pre or post ringing, and what is left is the blunted transient, which is blunted due to frequencies above 20 kHz having been removed. The higher-bandwidth transient, were it to occur, would locate the energy of the transient more precisely in a smaller time space, but your ear's limited frequency response would smear it out anyway. As long as the filter in the digital gear isn't incompetently designed, it won't smear out more than the ear does. Which means the movement of your eardrum to a bandwidth-restricted signal and a wide-bandwidth signal is going to be the same, or so very nearly the same it won't matter.
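On the 10 microsecond binaural figure above: a hedged numpy sketch (the test signal and the cross-spectrum method are illustrative, not taken from any cited study) shows that a 10 µs inter-channel delay, well under one sample period, survives 44.1 kHz sampling and can be recovered from the phase of the cross-spectrum.

```python
import numpy as np

fs = 44100
itd = 10e-6                        # a 10 microsecond interaural delay
n = np.arange(4096)
left = np.sinc(n - 2000.0)                 # band-limited click
right = np.sinc(n - 2000.0 - itd * fs)     # same click, ~0.44 samples later

# Recover the delay from the phase slope of the cross-spectrum
X, Y = np.fft.rfft(left), np.fft.rfft(right)
cross = X * np.conj(Y)
k = np.arange(len(cross))
use = k < int(0.45 * len(cross))           # stay clear of Nyquist
phase = np.unwrap(np.angle(cross[use]))
omega = 2 * np.pi * k[use] / len(n)
delay_samples = np.polyfit(omega, phase, 1)[0]
itd_est = delay_samples / fs               # recovers ~10e-6 seconds
```

The delay is far smaller than the ~22.7 µs sample period, yet it is encoded faithfully, which is the sense in which sampled timing is "much better than 10 microseconds."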
 
Jul 7, 2016 at 8:59 PM Post #9 of 88
  and it has been checked by real world application
 
digital analog signal theory is one of the most powerful and thoroughly tested applications of math to technology, with "real world results" - it's always fun reading of how it can't work over an internet DSL connection sending Mbaud of data in digitally synthesized QAM, converted to analog DAC output, passed over 1/2 mile of 1/2 century old voice telephone twisted pair to my desktop where the analog signal is then ADC captured, decoded digitally to my computer
 
 
a question; is this a "Gish Gallop" rhetorical attack? - you just keep throwing out easily answered/debugged constructions - but too fast for any to reply in depth - and then you declare we don't have the answers?

Here's what I mean by checking theory against reality.
 
I don't mean that digital theory as a big totality needs to be checked against reality as some big abstract entity and declared "Valid!" for all time.
 
Some objections to digital theory don't need to be checked as they can be resolved in the mathematical domain, such as the idea that "digital destroys a waveform by slicing it up." Showing a working DAC busts that argument, but it's not really necessary.
 
I mean that digital theory can be used to predict the behavior of a specific ADC or DAC. The "reality" is how that specific ADC or DAC actually behaves.
 
Digital theory does not predict that a waveform will be "sliced up". There's no need to verify or contradict a prediction that is not even made by the theory. 
 
However, we do know there are some real-world problems that are not part of the basic theory, such as jitter. So we do need to measure the jitter and apply maybe some other measurements together with some theory to check that the ADC/DAC behaves reasonably close to the basic theory. And this needs to be done for each new design of ADC/DAC.
 
Note the analogy between digital theory predicting the behavior of a DAC, and theoretical physics predicting the behavior of fundamental particles. In both cases, we expect reality to diverge from the theory to some degree.
 
EDIT: Oh, regarding "Gish," don't be ridiculous; I'm asking these questions because I'm interested in the answers. I have a lot of questions stored up over years of thinking about it. There are 20 regular members on this forum at the very least and only 1 of me, so I don't see how I could be overwhelming the forum. But if you would like, I'll slow down.
 
I hadn't heard the term Gish Gallop before, but I'm asking questions; he was making assertions. 
 
Jul 7, 2016 at 9:09 PM Post #10 of 88
  There is some investigation into pulse trains: groups of transients which the ear can group together, timing the differences between the two ears. You typically will see how the stereo effect is made up of loudness differences between the two ears, which predominate above 1500 Hz, and timing differences between the ears, which predominate at 800 Hz and lower. The two are mixed from 800-1500 Hz. The ear can group closely spaced pulse trains that are at higher frequencies and hear the timing difference between the two ears.
 
Not quite directly from that, we know that at around 700 Hz or so the ear/brain can distinguish, between test tones and other stimuli, a difference down to about 10 microseconds between the two ears. It probably can't quite distinguish that small a time difference with the pulse trains.
 
In regards to digital audio, the timing of transients with frequencies below Nyquist is much better than 10 microseconds. CD can potentially do this to something like 55 picoseconds.
 
This still isn't direct and complete data for your exact question, though it is somewhat in the area.
 
Pre and post ringing of competent digital gear I don't think is a problem. For one, the ringing (and most of it is not ringing, actually) occurs at frequencies in the transition band only. With CD that is between 20 kHz and 22.05 kHz. Not likely you can hear that either. Secondly, actual hard ringing as seen in test signals won't occur in a musical file, due to the filtering prior to ADC conversion. So your ear can't hear the pre or post ringing, and what is left is the blunted transient, which is blunted due to frequencies above 20 kHz having been removed. The higher-bandwidth transient, were it to occur, would locate the energy of the transient more precisely in a smaller time space, but your ear's limited frequency response would smear it out anyway. As long as the filter in the digital gear isn't incompetently designed, it won't smear out more than the ear does. Which means the movement of your eardrum to a bandwidth-restricted signal and a wide-bandwidth signal is going to be the same, or so very nearly the same it won't matter.

 
I want to think about this in greater depth, but it's interesting that the nervous system has timing resolution of 10 microseconds in that one particular context. I wonder what the math says about this problem: say we have an analog signal S that has two successive transients spaced T seconds apart. Do a Fourier transform of the finite-duration signal. Now modify S by moving the transients closer together by 10 microseconds; call that S_2. If the brain really has 10 microsecond precision, then S and S_2 should trigger distinct patterns in the brain. It might be interesting, then, to ask what bandwidth is needed in a system that transmits S and S_2 in order to preserve the timing. 10 microseconds is the period of 100 kHz. Does that suggest a 100 kHz bandwidth is needed to preserve that timing and that a much lower bandwidth would disrupt the timing?
 
Jul 8, 2016 at 5:00 AM Post #11 of 88
The point that you're missing, is that 44.1kHz audio can encode transients at arbitrarily high timing accuracy. If a transient happens in time between two samples, it does not just fall into the cracks or get shifted left or right to the nearest sample, as you would seem to believe. Rather, it is encoded by adjacent samples and no timing information is lost except for the fact that the resulting audio is bandlimited.

For illustration, from top to bottom,

1. Two impulses spaced one sample apart in an audio stream encoded at 999999 Hz. Taking it as a round 1 MHz, the time between them would be 1 microsecond.
2. The audio stream is resampled down to 44100 Hz. The result appears to be a wild ride and the time difference between the channels appears to have disappeared. But if you look closely you'd see that while the digital peak samples have moved to the same timed sample, the amplitude of each sample is subtly different in each channel. For example, the right channel reaches full scale while the left channel doesn't.
3. The audio stream is resampled from 44100 Hz back up to 999999 Hz. As you can see from the red clipping indicators, the timing information of the impulses has been recovered--they peak exactly 1 sample apart, just as before the two resampling passes. The only difference is that the impulse has been band-limited to 22050 Hz and under--hence the "ringing" shape of roughly 22050 Hz frequency.
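That round trip can be reproduced numerically. A numpy sketch (using 882 kHz = 20 x 44.1 kHz in place of 999999 Hz so the resampling ratios stay integer, and ideal FFT-based resampling rather than whatever resampler produced the original illustration):

```python
import numpy as np

up = 20                        # 882 kHz = 20 x 44.1 kHz keeps ratios integer
N_hi = 81920                   # samples at the high rate
N_lo = N_hi // up              # samples at 44.1 kHz

def to_44100(x):               # ideal band-limit to 22.05 kHz + decimate
    return np.fft.irfft(np.fft.rfft(x)[:N_lo // 2 + 1], n=N_lo)

def back_to_882k(x):           # ideal reconstruction by spectral zero-padding
    X = np.zeros(N_hi // 2 + 1, dtype=complex)
    lo = np.fft.rfft(x)
    X[:lo.size] = lo
    return np.fft.irfft(X, n=N_hi)

left = np.zeros(N_hi);  left[40000] = 1.0   # one click per channel,
right = np.zeros(N_hi); right[40001] = 1.0  # one high-rate sample apart

l2 = back_to_882k(to_44100(left))
r2 = back_to_882k(to_44100(right))

# Both clicks are now band-limited "ringing" shapes, but their peaks are
# still exactly one high-rate sample apart: the timing survived 44.1 kHz
delay = np.argmax(r2) - np.argmax(l2)       # == 1
```

Amplitude scaling is ignored here; the point is that the sub-sample timing offset is encoded in the 44.1 kHz sample values and recovered exactly on reconstruction.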
 
Jul 8, 2016 at 10:50 AM Post #12 of 88
  [1] That's not how I would put it. Digital signal theory is math. And you always have to check if reality matches the math. Theoretical physics has the same issue.
 
[2] If we imagine all possible universes, there is no guarantee that we end up in a universe that is well-explained by math, so progress in, say, physics is astounding. But again, you have to check the reality.

 
1. No, theoretical physics does not have the same problem, it has the opposite problem. Digital audio theory does not exist to explain how digital audio works; it's exactly the other way around: digital audio works because the theory is correct. The "reality" (digital audio components) is based on the theory, not the other way around as is the case with theoretical physics.
 
2. No, you don't have to check the reality. The reality doesn't need checking because it is already proven by the fact that the universe exists. What you have to check is the theory and if it doesn't match the reality, you have to change the theory. Digital audio theory is mathematically proven and the digital audio "reality" has been designed from the theory. So here we have to do the opposite and check the reality ... and if the reality doesn't match the theory then we need to change the reality.
 
  I am familiar with the Fourier transform and how it's a complete representation of a signal. 
 
Turntable wow and flutter is a distortion. You wouldn't take an FFT of the turntable's output to characterize its wow and flutter. That's the wrong tool for the job. Instead you would characterize it by something like average or peak variation in speed.

 
You can't have it both ways, either a Fourier transform is a complete representation of a signal or it isn't. If it is, then a Fourier transform can indeed perfectly represent wow and flutter but if it isn't and you can prove that it isn't, then you're in for some serious fame and riches as you'll have disproved a proven mathematical theorem which no one else has successfully contested in nearly 200 years! You are correct though that in practice, we may choose to use a tool other than an FFT to measure wow and flutter, not because an FFT is in any way inaccurate or "wrong" but simply because another tool may present the results in a more convenient format.
 
 
We're in odd territory here, compared to the theory/math you are using. A theory is useful, in part, because it allows prediction of an experiment that has not yet been run. The math you describe allows an exact prediction. For instance, say we are talking about a perfectly linear device L. If we measure its transfer function in the frequency domain, we can predict its response to any signal. In theory. We have "characterized" the behavior of the device.

 
I don't understand what you mean. If device L is perfectly linear and we measure its transfer function in the frequency domain, it will measure perfect fidelity in "reality", not only in theory.
 
 
[1] But we have to ask in what ways the device deviates from reality. [2] And in doing so, we end up characterizing distortions in ways that don't allow much prediction. ...For instance, we measure wow and flutter of a turntable, but that doesn't help us predict the turntable's output level O at time T given an input signal S.

 
1. The device is reality, we have to look at how that reality deviates from the theory.
 
2. That depends on the type of device and what you mean by "prediction". Sure there is a certain amount of inaccuracy and randomness in digital devices (ADCs/DACs). For example we can't make perfectly accurate clocks, we can't know exactly how many electrons are going to collide inside a resistor and we can't know exactly what quantisation errors there will be but we can predict all these things with a statistical probability. A turntable on the other hand is far less predictable because we have some of the same randomness as with digital devices but also physical/mechanical variables, such as dust, plus friction and various other forces, many of which constantly change due to mechanical wear.
 
 
[1] Digital theory does not predict that a waveform will be "sliced up" ...  
[2] I often see a plot of the frequency response of a DAC. And it usually looks very flat through the audible band, giving the impression it has no significant distortion. But is that the right tool for characterizing the way the DAC distorts transients? I don't think so. For one thing, it doesn't allow you to predict the response of the DAC to a given input.

 
1. Yes it does! That's the first and most fundamental tenet of digital audio, the Nyquist/Shannon Sampling Theorem.
 
2. There is only frequency and amplitude, nothing else ... and, this isn't just true of digital audio but of all audio recording/reproduction, since the dawn of audio recording! What has changed in the last 150 years or so is our ability to reduce the factors which distort the amplitude and frequency. If the plot of a DAC is flat, then there is no distortion in the amplitude or frequency and we have perfect fidelity. In reality of course no DAC is perfectly flat/linear but to see the distortion in amplitude/frequency we need a plot with a far larger scale than manufacturers typically publish. Of course though, a far larger scale is needed because the distortions are typically tiny, well below or even orders of magnitude below audibility. My question is then: If not a plot of amplitude/frequency, what would be a good tool in your opinion for characterizing frequency and amplitude? Or, are you saying there is something there other than just frequency and amplitude?
 
G
 
Jul 8, 2016 at 5:12 PM Post #13 of 88
   
 
2. No, you don't have to check the reality. The reality doesn't need checking because it is already proven by the fact that the universe exists. What you have to check is the theory and if it doesn't match the reality, you have to change the theory. Digital audio theory is mathematically proven and the digital audio "reality" has been designed from the theory. So here we have to do the opposite and check the reality ... and if the reality doesn't match the theory then we need to change the reality.
 
 
You can't have it both ways: either a Fourier transform is a complete representation of a signal or it isn't. If it is, then a Fourier transform can indeed perfectly represent wow and flutter. If it isn't, and you can prove that it isn't, then you're in for some serious fame and riches, as you'll have disproved a mathematical theorem which no one has successfully contested in nearly 200 years! You are correct, though, that in practice we may choose a tool other than an FFT to measure wow and flutter, not because an FFT is in any way inaccurate or "wrong" but simply because another tool may present the results in a more convenient format.
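The completeness claim is easy to sanity-check numerically: a forward FFT followed by an inverse FFT returns the original samples to within floating-point rounding, whatever the signal contains. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)            # any signal at all: noise, music, a wobbly tone

# Transform to the frequency domain and straight back again.
x_back = np.fft.ifft(np.fft.fft(x)).real

err = np.max(np.abs(x - x_back))
print(err)                               # essentially zero (well below 1e-10)
```

Nothing is lost in the transform itself; the only question in practice is how conveniently a given measurement reads off it.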
 

 
I'm trying to focus on the practical reality of designing, building and testing audio devices. Some of this discussion is getting bogged down in the trees and not the forest.
 
Are you saying that you could design a DAC, the only theory necessary being the discrete Fourier transform, and then sell it without testing it? 
 
Regarding wow and flutter, suppose you ask me to characterize the wow and flutter of a turntable. So I send you an FFT of the output of the turntable. Does that give you enough to go on?
 
Regarding frequency and amplitude: Why are we measuring a device? What's the practical reality here? Maybe we are testing its fidelity in a situation that requires high fidelity. Maybe we are checking that it is functioning correctly. It could be a lot of things. In essence, we are characterizing its behavior. 
 
So let's say I tell you I have a device here, and I send you a plot of frequency and amplitude of its output. Does that help you know its fidelity? Does that help you check whether it is functioning correctly? Does it help you characterize its behavior? Does it help you with anything?
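On the wow-and-flutter question a few paragraphs up, an FFT of a steady test tone does carry the information directly: slow speed variation is frequency modulation, and it shows up as sidebands around the tone, spaced at the wow rate. A synthetic sketch (all numbers made up, not measurements of any real turntable):

```python
import numpy as np

fs = 44100
dur = 4                                        # seconds; gives 0.25 Hz FFT resolution
t = np.arange(fs * dur) / fs
f0, wow_rate, wow_depth = 3000.0, 2.0, 0.001   # hypothetical: 0.1% wow at 2 Hz on a 3 kHz tone

# FM: instantaneous frequency is f0 * (1 + wow_depth * cos(2*pi*wow_rate*t)).
phase = 2*np.pi*f0*t + (f0*wow_depth/wow_rate) * np.sin(2*np.pi*wow_rate*t)
x = np.sin(phase)

mag = np.abs(np.fft.rfft(x))
df = 1 / dur                                   # 0.25 Hz per bin
carrier  = mag[int(f0/df)]                     # bin at 3000 Hz
sideband = mag[int((f0 + wow_rate)/df)]        # bin at 3002 Hz: the wow signature
print(sideband > 0.1 * carrier)                # True: the 2 Hz wobble is plainly visible
```

A dedicated wow-and-flutter meter is essentially doing this demodulation for you and reporting the result in a handier unit.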
 
Jul 8, 2016 at 5:18 PM Post #14 of 88
The point that you're missing is that 44.1kHz audio can encode transients with arbitrarily high timing accuracy. If a transient falls between two samples in time, it does not just drop into the cracks or get shifted left or right to the nearest sample, as you seem to believe. Rather, it is encoded by the adjacent samples, and no timing information is lost except that the resulting audio is band-limited.

For illustration, from top to bottom,
1. Two impulses, one per channel, spaced one sample apart in an audio stream encoded at 999999Hz. Taking it as a round 1MHz, the time between them is 1 microsecond.
2. The audio stream is resampled down to 44100Hz. The result appears to be a wild ride, and the time difference between the channels appears to have disappeared. But if you look closely you'd see that while the peak samples have moved to the same sample position, the amplitude of each sample is subtly different in each channel. For example, the right channel reaches full scale while the left channel doesn't.
3. The audio stream is resampled from 44100Hz back up to 999999Hz. As you can see from the red clipping indicators, the timing information of the impulses has been recovered: they peak exactly 1 sample apart, just as before the two resampling passes. The only difference is that the impulse has been band-limited to 22050Hz and under, hence the "ringing" shape at roughly 22050Hz.
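The same round-trip experiment can be reproduced in a few lines with SciPy. This sketch uses an 882 kHz "high rate" (a clean 20x multiple of 44.1 kHz, chosen here purely for convenience) and offsets one impulse by 7 high-rate samples, i.e. 0.35 of a 44.1 kHz sample period:

```python
import numpy as np
from scipy.signal import resample_poly

n = 65536
a = np.zeros(n); a[30000] = 1.0   # impulse in one channel, at the 882 kHz rate
b = np.zeros(n); b[30007] = 1.0   # impulse 7 high-rate samples later: a sub-sample
                                  # offset at 44.1 kHz (0.35 of a sample period)

down = [resample_poly(x, 1, 20) for x in (a, b)]   # down to 44.1 kHz
up   = [resample_poly(x, 20, 1) for x in down]     # and back up to 882 kHz

delay = np.argmax(up[1]) - np.argmax(up[0])
print(delay)   # ~7: the sub-sample timing offset survives the round trip
```

The impulses come back band-limited (sinc-shaped ringing), but their relative timing is preserved to well under one 44.1 kHz sample period.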

 
I appreciate the detail you put into this. I do understand it.
 
However, this discussion is getting bogged down in the trees to some extent. Let me try to restate my point.
 
I want to look at the big picture. The big picture of audio reproduction always includes the ear and the brain. 
 
Let's say we measure the sound pressure and call it signal S. Now we ask, what is going on inside the ear and brain? The brain has, among other things, synapses that have an electrical potential. So, we could ask, is the signal S present somewhere in the brain? That is, is there a synapse somewhere that is fluctuating in potential pretty much precisely along with S?
 
Jul 8, 2016 at 9:24 PM Post #15 of 88
There are nerve bundles leading from the ear to the brain that do something like this, yes. I am pretty sure what direction your questions will take next, and you aren't going to get the answers you seem to be looking for: namely, something to show that digital audio is somehow lacking and has overlooked some very basic facts about hearing and the brain.
 
