Why 24 bit audio and anything over 48k is not only worthless, but bad for music.
Nov 3, 2017 at 6:46 PM Post #2,476 of 3,525
I think they said that a 44.1 kHz sampling rate can perfectly reproduce any sine wave representing sound that the human ear can hear. That's good enough for me. I tend to listen to music with human ears.

It doesn't matter if I use fancy equipment or cheap equipment. I haven't ever seen any evidence that super-audible frequencies are audible, so higher sampling rates don't have much purpose for my human ears. Feel free to believe that pigs can fly and you can hear things that other humans might not be able to, but there's plenty of evidence on Nyquist's side and on the side of audiologists who have established the thresholds of human hearing. The ball is in your court to either prove that Nyquist is wrong and the audible range isn't perfectly reconstructed, or that human ears are capable of things that no one tested them for before. I think both of those things are unlikely, but the best way to test that would be a simple level-matched, direct A/B switchable, double-blind listening test between Redbook and high sampling/bit rate audio. Go to it, tiger! Achieve that and you'll be the most famous audiophile in the world! Maybe they'll add a KeithEmo corollary to the Nyquist theorem.
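For anyone who actually wants to run such a test, the statistics side is simple. A minimal sketch in Python (the trial count and the 14-of-16 result are illustrative assumptions, not a standard protocol):

```python
import random
from math import comb

def abx_pvalue(correct, trials):
    """One-sided binomial p-value: the chance of getting at least
    `correct` right out of `trials` by pure coin-flip guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Randomise which file plays as "X" on each trial so neither the
# listener nor the operator can predict the assignment.
trials = 16
assignments = [random.choice(("A", "B")) for _ in range(trials)]

# Suppose the listener identified X correctly on 14 of 16 trials:
print(abx_pvalue(14, 16))  # ≈ 0.0021, i.e. very unlikely to be guessing
```

A result like 9 of 16 (p > 0.25) is indistinguishable from guessing, which is why single short ABX runs prove very little either way.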
 
Nov 3, 2017 at 6:52 PM Post #2,477 of 3,525
Wait, are we suddenly being cast as advocates of the Theiss paper, the most stinky of the papers from the hi-res meta-analysis? What's next: I'm shilling for Oohashi?

Find me someone who is just totally convinced of the superiority of hi-res, and I'll offer to sneak in their house and change all their material to 128k Opus or AAC. Anyone care to take wagers on whether any of them would notice without some kind of blue light or flashing text letting them know?
 
Nov 3, 2017 at 7:33 PM Post #2,478 of 3,525
...
(And, yes, I can easily generate a 5 µs sound pulse using various methods.)
...
Not really interested in the arguments generally; I have 99+% of my music as CD or rips thereof, so there ends my practical/investment interest in "high res".

I am interested in what amplification and transducers you used to "easily generate a 5 µs sound pulse", and the measurement techniques and equipment you used to verify you had created a 5 µs sound pulse.
I assume by "pulse" you mean something like this:

[attached image: Single-Pulse.png]

If you can actually generate a 5 µs sound pulse, wouldn't transmission through the air alter its profile?
And as for hearing it, wouldn't you be hearing resonances in your hearing equipment triggered by it?

Just wondering...
 
Nov 3, 2017 at 9:42 PM Post #2,479 of 3,525
... Just wondering...

The point usually missed is that such a pulse is not valid in digital form. It cannot be accurately digitised: it contains frequencies (energy) above half the sampling frequency of the ADC at any real-world sampling rate. If you run such a pulse through an ADC and plot the resulting sample values, you get a graph that looks exactly like the graph you get after running the same invalid pulse, in digital form, through a DAC. And the timing precision of the pulse after passing through the ADC/DAC chain is, for all practical purposes, exact.
Shannon and Nyquist showed that as long as you keep all components of the input signal below half the sampling frequency, you can reconstruct the original signal perfectly, not just in terms of amplitude but in terms of temporal relationships too. They only addressed sampling, and assumed infinite resolution in amplitude. With a digital signal, the precision is limited by the number of amplitude quantisation steps. The actual best time resolution for a 16-bit, 44.1 kHz PCM channel is about 55 picoseconds. To put that in perspective, light travels less than an inch in that time.
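That timing-precision claim is easy to check numerically: delay one channel of a 500 Hz tone by 5 µs, quantise both channels to 16 bits at 44.1 kHz, and recover the delay from the samples alone. A minimal sketch (Python with NumPy assumed; the tone frequency and delay are illustrative):

```python
import numpy as np

fs = 44100.0                 # CD sample rate
f = 500.0                    # test tone frequency (Hz)
d = 5e-6                     # 5 microsecond inter-channel delay
n = np.arange(int(fs))       # one second of samples
t = n / fs

left = np.sin(2 * np.pi * f * t)
right = np.sin(2 * np.pi * f * (t - d))       # same tone, delayed 5 us
left = np.round(left * 32767) / 32767         # 16-bit quantisation
right = np.round(right * 32767) / 32767

def phase(x):
    """Phase of a tone at frequency f, estimated from its samples alone."""
    return np.arctan2(np.dot(x, np.cos(2 * np.pi * f * t)),
                      np.dot(x, np.sin(2 * np.pi * f * t)))

est = (phase(left) - phase(right)) / (2 * np.pi * f)
print(est * 1e6)  # recovered delay in microseconds, ≈ 5.0
```

The recovered delay lands within a fraction of a nanosecond of the true 5 µs, far finer than the 22.7 µs sample spacing, because the delay is encoded in the phase of every sample, not in the sample grid itself.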
 
Nov 4, 2017 at 1:39 AM Post #2,480 of 3,525
There's a third option... There shouldn't be any audible difference on any recording.

I've supervised more mixes than I can count in some very good sound studios in Hollywood. The last step is to output the mix to 16/44.1 and for everyone involved to compare it to the original still in the board for final sign off. If I ever heard a difference between the two, I would have thrown up a red flag, as would have the engineers and talent. The equipment in the room is always carefully calibrated to be consistent and perfect. It represents the reference standard. We never spent much time worrying about how a mix would sound on uncalibrated equipment or DACs that performed out of spec because the range of error would be so broad, there would be no point. We approved 16/44.1 on the reference system and made sure it matched everyone's intentions. And the bounce down never sounded different at all.

I think you're operating beyond the range of reality. It's great to finesse the details, but they have to be perceivable. And the level of finessing that makes sense for a recording studio is greater than the level required to play back that recording in the home. I can see arguing for the need to keep noise floors down in a mix where you're boosting levels on multiple channels, but when I sit in my living room and listen to an album, audibly transparent is audibly transparent.

This is the difference between recording engineers and audiophiles. When engineers output to 16/44.1 from a 24/96 source and hear a difference, they know something is wrong and start looking for what is broken or set wrong. When an audiophile manages to hear something wrong, they view it as proof that they have superhuman hearing, not that something is broken.
 
Nov 4, 2017 at 1:57 AM Post #2,481 of 3,525
That is true. Unlike the high-frequency limit, which is quite sudden and absolute, low-frequency sensitivity remains, albeit at very high thresholds. Here is a composite of various research from the last 20 years or so on the audibility of low frequencies through the ear alone (i.e. body sensation excluded):

I would like to know how they test for low frequency sensitivity yet block out body sensation.
 
Nov 4, 2017 at 2:15 AM Post #2,482 of 3,525
Close-mic'ed drum hits.
Room noise isn't specified as SPL, but there are 10 dB studios in the world. The one I last worked in was NC 15 (we don't use NC anymore).

Yup, and it's not even a challenge.

You are either measuring dB SPL or pascals. I always measure flat and add the curves later. We don't use NC anymore because NC masks the problem. You can easily have an NC 10 room that is completely unsuitable for foley recording. Anytime someone uses NC or A-weighting, they are just trying to meet some compliance target, not solve the problem.
 
Nov 4, 2017 at 2:37 AM Post #2,483 of 3,525
I should note something here......

When I'm talking about those "5 µs timing differences" I am NOT talking about jitter or timing errors between channels.

What I'm talking about is having the same sound recorded in both channels, with a time delay being added to one of the channels.

The "overall time resolution" of any digital recording of a continuous sine wave is essentially infinite.
If I take a 500 Hz sine wave, and record it on a stereo CD after delaying one channel by 5 µs, you will be able to easily resolve the 5 µs difference (on an oscilloscope).
The reason this works is because we can accurately reconstruct the two 500 Hz sine waves (in the two channels), and compare them.

HOWEVER, our brain uses differences in arrival time to "calculate" location.
Assuming I start with a sound located equally in both channels, I can "move" its apparent location from left to right by adding delay to one channel or the other.
(Our brain calculates that the source is closer to one ear or the other by comparing differences in arrival times at each ear.)
Actually, that's not quite it. The apparent location of a sound depends on which localization mechanism is dominant, and that depends on the frequency of the sound. "Duplex Theory".
Now let's assume that I start with an unreasonably abrupt impulse (let's call it 5 µs).
(And, yes, I can easily generate a 5 µs sound pulse using various methods.)
Even though most of the energy in that impulse will be at inaudible frequencies, enough will extend into the audible range that we will hear it as a click.
And, if I delay that click in one channel or the other, it will seem to shift locations - between left and right.
The spectrum of a 5 µs pulse lies well above the frequency band where ITD and ILD cross over, so its location would be dominated by ILD (assuming it could be transduced and heard at all), which 5 µs of interchannel delay alone will not change. The apparent location shift won't be quite what you expect, or allude to. In fact, the spectrum of that pulse is above the audible range. You'll hear a click if you generate it only because of the nonlinearity of any transducer; the click you hear is a byproduct, a distortion. In any case, a 5 µs interchannel time delay is far, far below the threshold of angular position resolution in hearing.
However, because that click actually falls between two samples at 44k, we will NOT be able to precisely reconstruct its time from our 44k sample rate recording.
When we apply our band limiting, that impulse will indeed be spread out into a longer waveform that extends over multiple samples.
And, by looking carefully at that new waveform, we will be able to infer where the original impulse occurred in time.
But given its frequency content, whatever a band-limiting filter does to it, and the fact that the only thing changed is ITD, which is not the dominant localization mechanism at that frequency, the apparent image shift will not happen that way... or at all. It simply is not an issue, ignoring the fact that such a pulse does not occur in any acoustic event. Hence, I strongly suspect, our inability to localize it based on ITD alone.
HOWEVER:
1) in order to do so we will have to make certain assumptions about the filter we used
(we'll assume that, if there is equal amplitude in two samples, then the pulse was equidistant in time between them - but this assumption relies on our filter spreading the energy symmetrically in time)
2) the new waveform will be very different from the original
3) more importantly, unlike the original, our new waveform will have a much more gradual envelope
3a) as a result, mechanisms that rely on sensing abrupt edges of waveforms will be less able to accurately "find" the beginning edge of the impulse
3b) current research seems to strongly suggest that our brains do in fact look for those "edges"
3c) this in turn suggests that turning a sharp impulse into a more gradual band limited waveform may compromise the accuracy with which our brains can determine its exact beginning
3d) and this, in turn, suggests that doing so may reduce the accuracy with which our brains are able to utilize this particular location cue
(if the starting time of the impulse cannot be determined distinctly, then we will have the equivalent of a blurry image when we attempt to compare them)
Again, you seem to be ignoring how spatial localization actually works. Our brains do not always "look for those edges" to determine anything, especially when those "edges" never occur in life. What you're getting at here is a possible limitation of digital systems that falls outside of the application.
The result of all this MAY be that our brains end up being less accurate in their estimation of where sound objects are located in space.
The result could be that objects seem to be in different locations, or that we perceive the location of individual instruments as being less distinct.
(A similar effect occurs with "stereoscopic 3D video"; when the various depth cues conflict, even slightly, which they often do, the image seems "less distinct and less real".)
No, your visual parallel with 3D video is incorrect. 3D suffers from a divergence between two simultaneous localization cues: convergence and focus position. They operate together, but 3D projection demands they separate. There is no parallel in audio.
Note what follows if you accept the current "spectrum analyzer" model of the human ear (a bunch of "detector hairs", each of which is "tuned" to a distinct frequency).
The frequency to which the "top hair" responds will determine the highest frequency we can detect as a continuous sine wave.
However, that number says nothing about TIME RESOLUTION (how quickly, and how accurately, our brain can respond to WHEN a particular hair was excited).

Please note that I'm not specifically suggesting that this will turn out to be true...... however I don't assume that it isn't true either.
Recent research into how your brain calculates eye movement has shown that the mechanism is quite different from what we previously thought... and not especially intuitive to most people.
Therefore, I prefer NOT to make claims based on inferences based on old or incomplete information...
I think there is some serious grasping at straws going on here.
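One narrow, testable piece of this exchange: band-limiting does smear a one-sample impulse into a sinc-shaped waveform spread over many samples, yet the centre of that waveform remains recoverable to a small fraction of a sample period. A minimal sketch (Python with NumPy assumed; the impulse position and upsampling factor are arbitrary illustrative choices):

```python
import numpy as np

N = 4096
true_pos = 1000.37              # impulse centre in samples (fractional)
n = np.arange(N)
x = np.sinc(n - true_pos)       # a band-limited impulse, sampled on the grid

# Interpolate 64x by zero-padding the spectrum, then locate the peak.
up = 64
X = np.fft.rfft(x)
X[-1] *= 0.5                    # split the Nyquist bin before zero-padding
Xz = np.zeros(N * up // 2 + 1, dtype=complex)
Xz[:len(X)] = X
y = np.fft.irfft(Xz, n=N * up) * up
est = np.argmax(y) / up
print(est)  # within about 0.02 samples of 1000.37
```

The "spread out" waveform still pins down its own centre far more finely than one sample period, so the spreading per se does not destroy the timing information; whether the ear can use it is the separate question argued above.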
 
Nov 4, 2017 at 2:51 AM Post #2,484 of 3,525
I'm not jumping back into anything.....

Someone specifically asked whether there would be any reason whatsoever for someone to purchase higher quality content "even if they couldn't hear the difference on their current equipment".
My reply is that it still makes sense to purchase it, even if you don't hear any difference on your current equipment, IF you expect that you might later have equipment on which you WILL hear a difference.
(Note that I didn't say anything about price; although I will certainly assert that some AVRs sound noticeably inferior to others, I wasn't the one who suggested "AVRs" as the example.)

I also make no assertion that ALL 20 year old equipment is inferior.... although I would also note that significant advances in ADC and DAC technology have certainly occurred.
While avoiding generalities, I would suggest that it's not unreasonable to suspect that at least some equipment available today is better than ANY available 20 years ago.
I would also assert that many of the tests I see quoted repeatedly were performed on equipment that I consider suspect... and none of them seemed to have thoroughly verified the capabilities of the equipment they used.
When you perform an experiment, one of the first validation steps is to confirm and document that your equipment can in fact deliver the test stimuli it is intended to test.
So, if you want to test whether people can hear 30 kHz, you start by using a test microphone to confirm that you have 30 kHz actually physically present at the test location (and present in your test content).
And right after that you'll want to be sure you can get your transducer to generate that 30 kHz signal, without changing its level or distorting it, and get that signal all the way to the listening ear. That's where this really falls apart. There are transducers that have some 30 kHz response, but it's hardly flat and distortion-free. Until you get clean transducers, your test includes all their distortion products added to the signal.
Errrrr..... I disagree with your final assertion entirely.

If you want to ask "whether high-res audio makes sense for the average consumer" then by all means let's look at statistics cubes.
HOWEVER, if we're talking about a scientific claim about whether "the difference is audible", a SINGLE well documented and repeatable example is enough to establish that it is.
(If a single human can reliably hear a difference, on a single file, on a single combination of equipment, then "it is audible by human beings".)
Then, once that "if" is established, we would want to know both "why" and "how many".
I don't think any scientist or engineer would accept a single example as anything but an anomaly.
Of course, we absolutely require the best possible cross section of humans.......
(If it turns out that only left handed midget harp players from Burundi can hear it, we wouldn't want to miss that by not including at least one of them.)
From the data so far, it seems quite possible that "audiophiles who are certain they can hear a difference" may NOT be the best possible subjects.
(I might suggest running a few tests using school-age children as test subjects - since younger humans have been documented as generally having better high-frequency hearing than older humans.)
…and you'll be generating the aforementioned data array.
 
Nov 4, 2017 at 2:55 AM Post #2,485 of 3,525
You are either measuring dB SPL or pascals. I always measure flat and add the curves later. We don't use NC anymore because NC masks the problem. You can easily have an NC 10 room and it be completely unsuitable for foley recording. Anytime someone uses NC or A weighting they are just trying to meet some compliance not solve the problem.
I used SPL because I meant SPL. I clearly stated, and you quoted, that we don't use NC anymore. But we do use PNC, which is a modification of NC and is more stringent. You can measure any way you want, but if you're working with constructing studios with architects and HVAC contractors, you need to work with PNC. I seriously doubt an NC 10 room would be unacceptable for foley, especially given that most foley effects have nothing below 150 Hz. We have filters for that.
 
Nov 4, 2017 at 3:17 AM Post #2,486 of 3,525
Great. Now you are getting it. Yes, for classic stereo reproduction using speakers there is some form of soundstage. And no, there is hardly any soundstage as by physical space, outside the head of the listener, when using headsets. Not for regular recordings.

What I am arguing is simply: record every instrument in mono, with as little room acoustics as possible, and place the instrument or artist on the fly, or by pre-calculation based on the dimensions of the listener's head. And, yes, finally, someone who gets the point that you would have to do this on the fly, to correct for head movement, if the listener is to experience the sound source as fixed in physical space. That is perfectly possible, using vectors and math, and a sensor in the headset for head movements.


This has been done many times; I find it to be exaggerated and fake-sounding. Go to AES or NAB, and somebody is always demoing some great new 3D or VR audio. I think it was Dolby demoing VR for headphones. It is always exaggerated. If you want it to sound real, place your microphones to capture what you want 360° (hemispherical, spherical), and map the physical locations of the microphones. Place loudspeakers at the inverse of the microphone positions a few feet out and sit in the center. It will be real enough that you will be looking over your shoulder.
 
Nov 4, 2017 at 3:21 AM Post #2,487 of 3,525
The point usually missed is that such a pulse is not valid in digital form.

The point missed in this case is what I was talking about.

It was nothing to do with ADC, DAC or any other DSP; it was purely about the generation, propagation and reception of a 5 µs SOUND pulse.
BTW, a 5 µs pulse is totally valid digitally; a pulse is a common digital output. I can make one with a single monostable multivibrator, or a digital file ..0,0,0,0,1,0,0... read at 200 kHz, with either fed into an electro-acoustical transducer. What I was querying is what kind of electro-acoustical transducer could accurately perform the task, and how it would be received unadulterated by the listener.
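For what it's worth, that file is trivial to create. A sketch using Python's standard wave module (the filename and padding lengths are arbitrary choices; whether any transducer chain reproduces the result faithfully is exactly the open question here):

```python
import wave
import struct

# The "...0,0,0,0,1,0,0..." stream described above: one full-scale
# sample (5 us wide at 200 kHz) surrounded by silence, 16-bit mono.
samples = [0] * 100 + [32767] + [0] * 100
with wave.open("pulse_200k.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # 16-bit samples
    w.setframerate(200000)     # 200 kHz sample rate
    w.writeframes(b"".join(struct.pack("<h", s) for s in samples))
```

Whether any playback chain will even accept a 200 kHz WAV, let alone deliver the pulse to the ear intact, is a separate hardware problem.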
 
Nov 4, 2017 at 3:40 AM Post #2,488 of 3,525
I used SPL because I meant SPL. I clearly stated, and you quoted, that we don't use NC anymore. But we do use PNC, which is a modification of NC and is more stringent. You can measure any way you want, but if you're working with constructing studios with architects and HVAC contractors, you need to work with PNC. I seriously doubt an NC 10 room would be unacceptable for foley, especially given that most foley effects have nothing below 150 Hz. We have filters for that.

PNC is an improvement, but it is still a curve. I often have to come in after the fact to find out why there is a problem. HVAC contractors rarely meet the specification, so I have the requirements put in their contract.
Most movies are overhyped audio, big footsteps pushed up 30dB with HVAC rumble and you have a problem.
What is the old saying "In theory there is no difference between theory and practice. In practice there is."
 
Nov 4, 2017 at 3:47 AM Post #2,489 of 3,525
The point missed in this case is what I was talking about.

It was nothing to do with ADC, DAC or any other DSP; it was purely about the generation, propagation and reception of a 5 µs SOUND pulse.
BTW, a 5 µs pulse is totally valid digitally; a pulse is a common digital output. I can make one with a single monostable multivibrator, or a digital file ..0,0,0,0,1,0,0... read at 200 kHz, with either fed into an electro-acoustical transducer. What I was querying is what kind of electro-acoustical transducer could accurately perform the task, and how it would be received unadulterated by the listener.


If it is truly unadulterated, I would expect the listener to hear nothing.
 
Nov 4, 2017 at 4:12 AM Post #2,490 of 3,525
... BTW, a 5 µs pulse is totally valid digitally; a pulse is a common digital output. I can make one with a single monostable multivibrator, or a digital file ..0,0,0,0,1,0,0... read at 200 kHz, with either fed into an electro-acoustical transducer. What I was querying is what kind of electro-acoustical transducer could accurately perform the task, and how it would be received unadulterated by the listener.

..0,0,0,0,1,0,0 ... is an invalid digital (audio) data stream. It cannot be generated by sampling an analogue signal. Forget about doing this digitally.
As for accurately transducing this pulse to acoustic waves, I don't know of any commercially available transducers that have a flat response from DC to the MHz range. Even if you had one, attenuation of sound in air rises rapidly with frequency. The transducer would have to be placed against the listener's ear.

By this point, you're totally off into the weeds. The ear simply doesn't have any mechanism by which to perceive 200 kHz signals.
 
