Can someone explain frequency in music?
Jan 24, 2013 at 10:04 PM Post #18 of 34
stv014,
 
That is an interesting file, especially the one that sounds like crickets chirping. I guess I'm not made to hear really detailed stuff or high frequencies, as I can barely, if at all, tell the difference on the one with some tunes. Then again, my max is at 16k.
 
I was reading somewhere that the average person loses around 2 kHz per 10 years due to age (genetics/exposure), but all that is changing as more people listen to earphones and cell phones too loud.
 
I used to do that and now I don't, but I'm still wondering if I'm going to need a hearing aid by the time I reach 70.
 
Jan 24, 2013 at 10:55 PM Post #19 of 34
I had done a test related to this a few days ago, before signing up with Head-Fi.
 
I can't remember the exact web link, but the test file I downloaded (because Firefox kept blocking scripts) was by Elliott H. Berger, Aearo Corporation;
I needed to compare files, but the buffering was taking so long that I'd forget.
 
His explanation, if I'm not wrong, was that hearing loss is supposed to be a gradual process over a lifetime, but you are right: if your average song is very loud, the damage to the hearing is not recoverable if the ears never get a rest.
 
How do you know? Tinnitus (ringing) or odd frequencies in the ear, like a constant buzzing, that aren't related to the music.
Hope that helps.
 
Jan 25, 2013 at 6:55 AM Post #20 of 34
Quote:
Originally Posted by soundeffect
 
That is an interesting file, especially the one that sounds like crickets chirping. I guess I'm not made to hear really detailed stuff or high frequencies, as I can barely, if at all, tell the difference on the one with some tunes. Then again, my max is at 16k.

 
There are two groups of files. Those with names beginning with "lpf" are lowpass filtered (these are the ones with tunes); that is, the high-frequency information above the frequency in the file name is removed from them. See also the graph: the yellow trace corresponds to lpf10kHz.flac, which has a flat response up to about 9.6 kHz, halves the level at 10 kHz, and has basically everything removed at 10.8 kHz and above. The other "lpf" files increase the cutoff frequency in 2 kHz steps. The last, 22 kHz one is in fact not filtered at all. The "hpf" files (the ones that only contain "chirps", and above a certain threshold that varies by person, nothing audible) are highpass filtered, and are essentially what was removed from the corresponding "lpf" files. If I used linear phase filters, you could in fact mix each lpf+hpf pair and get back the unfiltered 22 kHz version, other than the addition of some quantization noise.
Note that a previous version of this package included broken "hpf" files in which some low-frequency information was still audible.
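For anyone curious how such complementary pairs can be made, here is a minimal sketch of my own (not the script used for the actual test files) of a linear-phase FIR lowpass and its matching highpass, whose outputs sum back to the original signal apart from the filter's fixed delay. The cutoff, tap count and test signal are arbitrary assumptions:

```python
# Sketch only: complementary linear-phase lowpass/highpass FIR pair.
# The highpass is a delayed unit impulse minus the lowpass, so lpf + hpf
# reconstructs the input except for the constant (ntaps - 1) / 2 delay.
import numpy as np
from scipy.signal import firwin, lfilter

fs = 44100          # sample rate, Hz
cutoff = 10000      # example cutoff, as in lpf10kHz.flac
ntaps = 1001        # odd length gives an integer group delay

lp = firwin(ntaps, cutoff, fs=fs)   # linear-phase lowpass taps
hp = -lp
hp[ntaps // 2] += 1.0               # "spectral inversion": delta - lowpass

rng = np.random.default_rng(0)
x = rng.standard_normal(fs)         # one second of white noise as test input

low = lfilter(lp, 1.0, x)
high = lfilter(hp, 1.0, x)

delay = (ntaps - 1) // 2
err = low[delay:] + high[delay:] - x[:len(x) - delay]
print("max reconstruction error:", np.max(np.abs(err)))   # tiny (float rounding only)
```

Because both filters are linear phase and delay everything equally, the two halves line up and cancel perfectly when summed, which is what makes the lpf+hpf remixing work.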
 
It would be interesting if those who believe that a 44.1 kHz sample rate is not enough for them listened to these files. Similarly to the bit depth comparison, it can be surprising to find out how difficult it really is to hear bandlimiting and quantization noise.
 
Jan 25, 2013 at 10:11 PM Post #21 of 34
Sound localization works by the detection of an interaural time difference of a common source arriving at the two ears (in the pons, not the auditory cortex). Humans can discern sources that arrive as little as 10-20 µs apart (Hafter ER, Dye RH, Gilkey RH, 1979). So to preserve all the spatial information of a musical performance, the brain would require time resolution of at least 50 kHz. This is not 'heard' in pitch, but in location.
 
Jan 26, 2013 at 3:04 AM Post #22 of 34
Quote:
Sound localization works by the detection of an interaural time difference of a common source arriving at the two ears (in the pons, not the auditory cortex). Humans can discern sources that arrive as little as 10-20 µs apart (Hafter ER, Dye RH, Gilkey RH, 1979). So to preserve all the spatial information of a musical performance, the brain would require time resolution of at least 50 kHz. This is not 'heard' in pitch, but in location.

 
Probably. But just to clarify, that doesn't necessarily mean to me that we can hear, or need, information at frequencies at or above 50 kHz or 100 kHz. According to this paper, Hafter ER, Dye RH, Gilkey RH (1979) seem to be dealing with Interaural Phase Differences (IPD) in audible-frequency signals of > 150 ms duration.
 
I'm not an expert in interaural cues, but this paper describes some of the concepts. It mentions that the Just-Noticeable Difference (JND) for Interaural Time Difference (ITD) does indeed seem to be 10-20 µs. However, the same paper mentions that ITD is estimated "from the phase of the ratio of the complex transfer functions for the right and left ears," and that "ITDs are generally assumed to be relatively unimportant above 2 kHz", which is obviously quite a few cycles below 50 kHz...
 
As with all technical journals, I would need to re-read it a few times to digest the content (especially since this is all new to me). However, I wouldn't be too quick to assume that just because we may be able to discern sources 10-20 µs apart, it is because we can hear 50 kHz (20 µs) or 100 kHz (10 µs)... Especially given that there is plenty of evidence that we humans can't.
 
Given the above, I don't think Hafter ER, Dye RH, Gilkey RH (1979) necessarily claim in their "Lateralization of Tonal Signals which have neither onsets nor offsets" that we need 50 kHz or 100 kHz of music signal bandwidth (requiring an above-200 kHz sampling rate) to be able to discern sources that arrive as little as 10-20 µs apart.
 
It seems to me that 20 kHz of pristine audio signal bandwidth is up to the task, and not contradictory to the findings of your 1979 source.
 
Jan 26, 2013 at 8:21 AM Post #23 of 34
Quote:
Humans can discern sources that arrive as little as 10-20 µs apart (Hafter ER, Dye RH, Gilkey RH, 1979). So to preserve all the spatial information of a musical performance, the brain would require time resolution of at least 50 kHz. This is not 'heard' in pitch, but in location.

Are you saying that 44.1 kHz is not enough to encode time delays as big as 10 to 20 µs? Because it certainly is. It can encode time delays down to a couple of ns, actually.
 
Jan 26, 2013 at 8:29 AM Post #24 of 34
Quote:
Sound localization works by the detection of an interaural time difference of a common source arriving at the two ears (in the pons, not the auditory cortex). Humans can discern sources that arrive as little as 10-20 µs apart (Hafter ER, Dye RH, Gilkey RH, 1979). So to preserve all the spatial information of a musical performance, the brain would require time resolution of at least 50 kHz. This is not 'heard' in pitch, but in location.

 
Sample rate does not affect the "resolution" of delays that can be applied to a signal. It really only limits the maximum frequency that can be encoded.
 
You can see an impulse delayed by 1/3 and 2/3 sample below:
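As a rough companion to that picture (a sketch of my own, not the file behind the screenshot), a bandlimited impulse can be placed a fraction of a sample between grid points by sampling a shifted, windowed sinc, and the fractional position can be recovered from those samples alone. The 64-sample length and 300x interpolation factor are arbitrary:

```python
# Sketch: bandlimited impulses delayed by 0, 1/3 and 2/3 of a sample.
import numpy as np
from scipy.signal import resample

def bandlimited_impulse(n, delay):
    # windowed sinc centered at (n // 2 + delay) samples
    t = np.arange(n) - (n // 2 + delay)
    return np.sinc(t) * np.hanning(n)

for d in (0.0, 1 / 3, 2 / 3):
    x = bandlimited_impulse(64, d)        # 64 ordinary samples
    fine = resample(x, 64 * 300)          # 300x sinc interpolation
    peak = np.argmax(fine) / 300 - 32     # peak position relative to sample 32
    print(f"intended delay {d:.3f} samples, recovered approximately {peak:.3f}")
```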
 

 
Jan 26, 2013 at 10:01 PM Post #25 of 34
Quote:
 
Probably. But just to clarify, that doesn't necessarily mean to me that we can hear, or need, information at frequencies at or above 50 kHz or 100 kHz. According to this paper, Hafter ER, Dye RH, Gilkey RH (1979) seem to be dealing with Interaural Phase Differences (IPD) in audible-frequency signals of > 150 ms duration.
 
I'm not an expert in interaural cues, but this paper describes some of the concepts. It mentions that the Just-Noticeable Difference (JND) for Interaural Time Difference (ITD) does indeed seem to be 10-20 µs. However, the same paper mentions that ITD is estimated "from the phase of the ratio of the complex transfer functions for the right and left ears," and that "ITDs are generally assumed to be relatively unimportant above 2 kHz", which is obviously quite a few cycles below 50 kHz...
 
As with all technical journals, I would need to re-read it a few times to digest the content (especially since this is all new to me). However, I wouldn't be too quick to assume that just because we may be able to discern sources 10-20 µs apart, it is because we can hear 50 kHz (20 µs) or 100 kHz (10 µs)... Especially given that there is plenty of evidence that we humans can't.
 
Given the above, I don't think Hafter ER, Dye RH, Gilkey RH (1979) necessarily claim in their "Lateralization of Tonal Signals which have neither onsets nor offsets" that we need 50 kHz or 100 kHz of music signal bandwidth (requiring an above-200 kHz sampling rate) to be able to discern sources that arrive as little as 10-20 µs apart.
 
It seems to me that 20 kHz of pristine audio signal bandwidth is up to the task, and not contradictory to the findings of your 1979 source.

 
True, the interaural time difference is calculated from the phase difference of same-frequency signals arriving at the two ears. This activity comes from neurons in corresponding tonotopic areas of the cochlea, which are only sensitive to frequencies not much higher than 20 kHz. But as noted before, the audible difference in phase can be as little as 10-20 microseconds. So ~2x more detailed information about the arrival time of sound is important.
 
Quote:
Sample rate does not affect the "resolution" of delays that can be applied to a signal. It really only limits the maximum frequency that can be encoded.
 
You can see an impulse delayed by 1/3 and 2/3 sample below:
 

 
 
I'm a little unclear on your example. The phase of the oscillation is indeed the important information, but in your example 4 samples are used to encode the different phase shifts of the signal. But the smaller-amplitude frequencies appear to use the minimum Nyquist rate of 2 samples per oscillation. How is phase outside of the minimum sample times encoded in this signal? Here the sample rate does seem to affect the resolution of delays.

 
Jan 26, 2013 at 10:51 PM Post #26 of 34
Yes, people often jump to the "means we must hear >100 kHz" conclusion without any thought about the relationships between dynamic range, frequency, and integration time - mathematical correlation of signals with a huge dynamic range can resolve phase very accurately despite frequency band-limiting.
 
Auto- and cross-correlation functions are one place where neural-net parallelism wins big, despite slow chemical diffusion across junctions, slow propagation in axons, and neuron firing rates saturating at a fraction of even 20 kHz.
 
 
One crude way to visualize the phase resolution of digital audio is to consider how many distinct lines you can draw between adjacent samples - just "connecting the dots". For a half-of-full-scale step we can draw 32768 lines of the same slope (not exactly a legal signal, but it gives the outside number for "phase resolution") - that would be ~23 µs / 32768 ≈ 700 ps (pico = 10^-12).
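In code form, that back-of-the-envelope number (just my restatement of the arithmetic above, nothing more) looks like this:

```python
# Crude "connect the dots" phase resolution estimate for 44.1 kHz / 16 bit.
fs = 44100
bits = 16
sample_interval = 1.0 / fs          # ~22.7 us between adjacent samples
levels = 2 ** (bits - 1)            # 32768 distinct half-scale step heights
print(f"sample interval: {sample_interval * 1e6:.1f} us")
print(f"crude phase step: {sample_interval / levels * 1e12:.0f} ps")  # ~690 ps
```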
 
There is a slope / sample-interval trade-off (which can be used to calculate a sine amplitude) - for the lowest few-µs numbers anyone has reported for ITD, even a -60 dB sine can naively be phase-resolved at 16 bits.
 
In fact, we can do even better with noise-shaped dither at the frequencies where we hear best.
 
Jan 27, 2013 at 6:09 AM Post #27 of 34
Quote:
 
True, the interaural time difference is calculated from the phase difference of same-frequency signals arriving at the two ears.  This activity comes from neurons in corresponding tonotopic areas of the cochlea, which are only sensitive to frequencies not much higher than 20 kHz

 
To be honest, I think most of us would be lucky to have a hearing range of 18 kHz. I personally don't think I can hear past 15 kHz.
 
Quote:
Originally Posted by eucariote
 
But as noted before, the audible difference in phase can be as little as 10-20 microseconds. So ~2x more detailed information about the arrival time of sound is important.
 
Like jcx said, the ~Nx more detailed information is probably embedded in the dynamic range of the < 20 kHz audio signals picked up by our left and right ears. Furthermore, the results probably depend on how the brain processes these signals.
 
I don't think the brain processes instantaneous times of arrival. My understanding is that the brain needs quite a few ms of audio data from both ears to figure out localization.
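As a toy illustration of that idea (entirely my own construction, with made-up numbers: a 10 µs delay, < 2 kHz noise, 16-bit quantization), the delay between two bandlimited 44.1 kHz channels can be recovered far more finely than one sample period by looking at the phase of the cross-spectrum over a stretch of signal:

```python
# Sketch: recover a 10 us "ITD" from two 44.1 kHz channels of < 2 kHz noise,
# even after 16-bit quantization. The delay is well under one sample (22.7 us).
import numpy as np
from scipy.signal import firwin, lfilter

fs = 44100
n = 8192                                   # ~186 ms of signal
true_delay = 10e-6 * fs                    # 10 us expressed in samples (~0.44)

rng = np.random.default_rng(1)
x = lfilter(firwin(129, 2000, fs=fs), 1.0, rng.standard_normal(n))

# Exact fractional-sample delay for the "right ear", applied in the frequency domain.
f = np.fft.rfftfreq(n)                     # cycles per sample
left = x
right = np.fft.irfft(np.fft.rfft(x) * np.exp(-2j * np.pi * f * true_delay), n)

# Quantize both channels to 16 bits (shared scale), as on a CD.
scale = 0.5 / np.max(np.abs(x))
q = lambda s: np.round(s * scale * 32767) / 32767
left, right = q(left), q(right)

# The delay is the slope of the cross-spectrum phase divided by 2*pi.
cross = np.fft.rfft(left) * np.conj(np.fft.rfft(right))
band = (f * fs > 100) & (f * fs < 1900)    # use the band that actually has energy
slope = np.polyfit(f[band], np.unwrap(np.angle(cross[band])), 1)[0]
print(f"true delay {true_delay / fs * 1e6:.2f} us, "
      f"estimated {slope / (2 * np.pi) / fs * 1e6:.2f} us")
```

Nothing in the estimate relies on content above 2 kHz; the sub-sample timing falls out of integrating phase information over many milliseconds of signal.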
 
Quote:
I'm a little unclear on your example. The phase of the oscillation is indeed the important information, but in your example 4 samples are used to encode the different phase shifts of the signal. But the smaller-amplitude frequencies appear to use the minimum Nyquist rate of 2 samples per oscillation. How is phase outside of the minimum sample times encoded in this signal? Here the sample rate does seem to affect the resolution of delays.

 
stv014's continuous-time bandlimited impulse plots seem to have slightly different amplitudes and oscillation behavior, probably because one needs to sample above the Nyquist rate to get the same results regardless of the sampling phase offset.
 
However, I think the point of stv014's example is that the phase is embedded in the amplitudes of all of the sampled values of the bandlimited signal. A sine wave has different values from 0 to just below pi...
 
Jan 27, 2013 at 6:48 AM Post #28 of 34
Quote:
Originally Posted by ultrabike
 
stv014's continuous-time bandlimited impulse plots seem to have slightly different amplitudes and oscillation behavior, probably because one needs to sample above the Nyquist rate to get the same results regardless of the sampling phase offset.

 
All three waveforms should look the same in continuous time with an "ideal" converter. The fact that they do not in the picture is due to the limitations of the audio editor; it does use sinc interpolation to display the waveform, but it uses only a few neighboring samples, so the reconstruction is not quite perfect.
 
Jan 27, 2013 at 7:04 AM Post #29 of 34
Quote:
 
I'm a little unclear on your example. The phase of the oscillation is indeed the important information, but in your example 4 samples are used to encode the different phase shifts of the signal. But the smaller-amplitude frequencies appear to use the minimum Nyquist rate of 2 samples per oscillation. How is phase outside of the minimum sample times encoded in this signal? Here the sample rate does seem to affect the resolution of delays.

 
The ideal continuous-time representation of a discrete-time impulse (one sample of 1.0 at a certain time, and zeros everywhere else) is sin(x) / x, where x is the time difference relative to the time of the impulse, in samples multiplied by π. At x = 0, the value of the function is 1. Obviously, this function extends infinitely in both directions, so a real-world approximation needs to be windowed to a finite length; this causes frequency response errors near the Nyquist frequency, and imaging above it, but still does not limit the "resolution" of the phase or frequency to discrete steps. To fully reconstruct the continuous-time signal mathematically, you scale a copy of the above-described sinc function by each sample value, and then sum all the scaled and sample-delayed sinc functions (i.e. convolve the sample sequence with the sinc). This will, for example, turn a coarse, "stair-stepped" digital sine wave into a smooth continuous one. Also, if you offset x by any fraction of π, that will delay the signal by a fraction of a sample.
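A quick sketch of that reconstruction (my own toy example, with an arbitrary 10 kHz tone and a 50x finer evaluation grid; the finite number of samples means the edges are not reconstructed accurately):

```python
# Sketch: Whittaker-Shannon reconstruction. Each sample scales a shifted sinc;
# summing the scaled sincs evaluates the continuous-time waveform anywhere.
import numpy as np

fs = 44100
f0 = 10000                                    # ~4.4 samples per cycle: looks "stair-stepped"
n = np.arange(64)
samples = np.sin(2 * np.pi * f0 * n / fs)

t = np.arange(0, 64, 1 / 50)                  # evaluation times, in sample units
recon = np.sum(samples[None, :] * np.sinc(t[:, None] - n[None, :]), axis=1)

ideal = np.sin(2 * np.pi * f0 * t / fs)
middle = (t > 16) & (t < 48)                  # away from the truncation edges
print("max error away from the edges:", np.max(np.abs(recon - ideal)[middle]))
```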
 
Jan 27, 2013 at 7:17 AM Post #30 of 34
Quote:
Originally Posted by jcx
 
One crude way to visualize the phase resolution of digital audio is to consider how many distinct lines you can draw between adjacent samples - just "connecting the dots". For a half-of-full-scale step we can draw 32768 lines of the same slope (not exactly a legal signal, but it gives the outside number for "phase resolution") - that would be ~23 µs / 32768 ≈ 700 ps (pico = 10^-12).

 
Uncorrelated noise in the signal (like that from properly dithered quantization) will obviously make it harder to accurately detect the frequency, phase, and amplitude of a tone, but that is also true of analog (continuous-time) signals. However, it does not result in discrete "phase steps". With more noise, a longer signal is needed for the same detection accuracy. With a long -3 dBFS test tone, I was able to measure delays in a 44.1 kHz / 16-bit dithered file with a random error of a few tens of ps, rather than in 700 ps steps.
Increasing the sample rate reduces the length (in seconds) of audio needed for the same accuracy at the same bit depth, but that is because the noise floor is lowered: the same total noise power is distributed over a wider frequency range. So, going from 44100 Hz to 192000 Hz, for example, decreases the noise by 6.39 dB in the same bandwidth, assuming white-noise dither.
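For reference, here is a rough sketch of that kind of measurement, with parameters I made up rather than the ones actually used (a 1 kHz tone at -3 dBFS, ten seconds long, delayed by 100 ps, TPDF-dithered and quantized to 16 bits at 44.1 kHz); the delay comes straight out of the tone's phase:

```python
# Sketch: measure a 100 ps delay between two 16-bit dithered 44.1 kHz tones
# from the phase of the tone's DFT bin.
import numpy as np

fs = 44100
f0 = 1000.0
n = fs * 10                                 # 10 s of signal; f0 lands exactly on bin 10000
delay = 100e-12
amp = 10 ** (-3 / 20)                       # -3 dBFS

rng = np.random.default_rng(2)
t = np.arange(n) / fs

def quantize16(x):
    tpdf = (rng.random(len(x)) - rng.random(len(x))) / 32768   # +/- 1 LSB TPDF dither
    return np.round((x + tpdf) * 32767) / 32767

ref = quantize16(amp * np.sin(2 * np.pi * f0 * t))
dly = quantize16(amp * np.sin(2 * np.pi * f0 * (t - delay)))

k = round(f0 * n / fs)                      # DFT bin of the tone
w = np.exp(-2j * np.pi * k * np.arange(n) / n)
measured = (np.angle(np.sum(ref * w)) - np.angle(np.sum(dly * w))) / (2 * np.pi * f0)
print(f"true delay 100 ps, measured {measured * 1e12:.0f} ps")
```

Over repeated runs with different seeds, the residual error here should stay within a few tens of ps, consistent with the figure quoted above.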
 
