Head-Fi.org › Forums › Equipment Forums › Sound Science › Can someone explain audio resolution to me?

Can someone explain audio resolution to me?

post #1 of 77
Thread Starter 
So I understand that going from 16/44 to 24/44 increases the bit depth of an audio track, and going from 24/44 to 24/96 increases the sample rate. What I understand this to mean is that more samples per second and greater bit depth generally provide more detail and higher quality, up to the limits of your hearing abilities.

What I don't understand is why some people say that 96kHz music is pointless because humans only hear up to about 20kHz. I thought sample rate and sound frequencies were separate from each other? If the 96 meant sound frequency and not sample rate, then 192kHz music would bring all the bats within earshot to your house, wouldn't it?
post #2 of 77

Sample rate determines the maximum frequency that can be reproduced. In theory it is half the sample rate (and a component exactly at that limit already cannot be reproduced without loss of information), for example 22.05 kHz for a 44.1 kHz sample rate. The practical limit where there is no significant degradation (roll-off etc.) with a real DAC is slightly lower, but still above 20 kHz at a 44.1 kHz sample rate with a decent converter.
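A quick way to convince yourself of this (a minimal numpy sketch of my own, nothing from any textbook): a tone above fs/2 produces exactly the same sample values as its alias below fs/2, so after sampling the two are indistinguishable.

```python
import numpy as np

fs = 44100                      # CD sample rate
t = np.arange(fs) / fs          # one second of sample instants

# A 25 kHz tone is above fs/2 = 22.05 kHz, so once sampled it is
# indistinguishable from its alias at fs - 25 kHz = 19.1 kHz.
tone_25k = np.cos(2 * np.pi * 25000 * t)
tone_19k1 = np.cos(2 * np.pi * 19100 * t)

print(np.allclose(tone_25k, tone_19k1))   # True: identical sample values
```

This is why the anti-aliasing filter before the ADC has to remove everything above fs/2: once it's sampled, there is no way to tell the two tones apart.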

 

Bit depth affects the amount of quantization noise that is added to the signal (6.02 dB lower for every additional bit, until it becomes limited by noise from analog components). With dithering, the quantization noise can be made uncorrelated to the input signal, and it is then basically like hiss from analog equipment.
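The 6.02 dB per bit figure is easy to check numerically. A rough sketch (plain rounding, no dither; numpy assumed, and the 997 Hz tone is just an arbitrary test frequency):

```python
import numpy as np

def snr_db(x, bits):
    # quantize by plain rounding to the given bit depth, then measure SNR
    scale = 2 ** (bits - 1)
    q = np.round(x * scale) / scale
    return 10 * np.log10(np.mean(x**2) / np.mean((x - q)**2))

fs = 44100
t = np.arange(fs) / fs
x = 0.999 * np.sin(2 * np.pi * 997 * t)   # near-full-scale test tone

# each extra bit buys about 6.02 dB, so going from 8 to 16 bits gains ~48 dB
print(snr_db(x, 16) - snr_db(x, 8))
```

The measured gap comes out very close to 8 bits x 6.02 dB = 48 dB, matching the rule of thumb.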


Edited by stv014 - 8/16/13 at 7:44am
post #3 of 77
Quote:
Originally Posted by InternetSandman View Post

So I understand that going from 16/44 to 24/44 increases the bit depth of an audio track, and going from 24/44 to 24/96 increases the sample rate. What I understand this to mean is that more samples per second and greater bit depth generally provide more detail and higher quality, up to the limits of your hearing abilities.

What I don't understand is why some people say that 96kHz music is pointless because humans only hear up to about 20kHz. I thought sample rate and sound frequencies were separate from each other? If the 96 meant sound frequency and not sample rate, then 192kHz music would bring all the bats within earshot to your house, wouldn't it?

 

 

What do you mean by detail? The problem is that you are confusing data with quality. The bit depth determines the SNR and the dynamic range, but a 16-bit sample and even a dithered 8-bit sample can both equally well represent a given signal within a specified bandwidth sampled at 2x the highest frequency you want to capture; the 8-bit version will be noisier, but the signal will be just as well recreated.

 

It seems intuitive that more bits per sample creates more detail/quality, but this is a pervasive audiophile myth. What you get with more bits is a better SNR, and that extra "detail" is at the bottom of the dynamic range, which is already quiet even for humble 16 bit - the recreated waveforms from bandwidth-limited 16-bit and 24-bit samples are identical, as they should be. No doubt you have seen sampling represented as stair steps, where more samples per second means smaller steps; this is highly misleading. When a DAC converts a digital signal, it outputs a smooth, continuous analog waveform, not a set of impulses. You can test this by running the outputs through an analog scope.
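If you want to check the "same signal, just noisier" claim yourself, here is a toy numpy sketch using TPDF dither (the sum of two uniform noises - a common textbook choice, not necessarily what any particular converter does): the quantization error comes out essentially uncorrelated with the signal at both 8 and 16 bits, just louder at 8.

```python
import numpy as np

def dithered_quantize(x, bits, rng):
    scale = 2 ** (bits - 1)
    # TPDF dither: sum of two uniform +/-0.5 LSB noises, added before rounding
    d = rng.uniform(-0.5, 0.5, x.shape) + rng.uniform(-0.5, 0.5, x.shape)
    return np.round(x * scale + d) / scale

rng = np.random.default_rng(0)
fs = 44100
t = np.arange(fs) / fs
x = 0.25 * np.sin(2 * np.pi * 440 * t)     # a quarter-scale 440 Hz tone

for bits in (8, 16):
    err = x - dithered_quantize(x, bits, rng)
    corr = np.corrcoef(x, err)[0, 1]       # ~0: error is signal-independent hiss
    print(bits, np.sqrt(np.mean(err**2)), corr)
```

In both cases the error behaves like hiss riding on top of an otherwise intact signal; only its level changes with bit depth.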

 

Despite careless manufacturer claims of more detail from 24 bits, to date there have been few controlled studies comparing bit depths. The largest of these (Meyer and Moran) suggested that only when the output level was raised to extreme volumes could the difference in background noise be heard; no qualitative difference was found at normal listening levels.

 

The sample rate dictates the highest frequency that can be captured by the A/D process, and that limit is fs/2 - so if you sample music at 44.1kHz you can accurately capture up to about 22,050Hz (practically a little less), but 20kHz is no problem. Few people can hear above 20kHz, and this ability declines with age; there are no musical fundamentals above about 5300Hz, and even in music with strong harmonics the energy above 20kHz is relatively low. As long ago as the late 1970s, even before the CD standard, some JVC researchers experimented with filtering off high frequencies; in their study nobody detected a roll-off at 20kHz. At the time, analog FM radio in most countries did not carry musical signals above about 15kHz, yet it was subjectively judged as high quality.

 

 

 

 

Sampling at 96kHz allows frequencies of up to about 48kHz to be captured. How many speakers can give you anything at 48kHz I do not know; such things as super tweeters do exist, but I do not know how widely used they are. Yes, you can find harmonics in the trumpet, cymbals, and the Balinese gamelan that extend above 40kHz, even above 50kHz (Oohashi et al.). The audibility of such signals is highly disputed; even a physiological response to them is highly debated.

 

There are other technical arguments for sampling above 44.1kHz, to do with how easy it is to implement filtering and to relocate noise out of the audible range, but it is much harder to justify the audible merits of playback at 96kHz sample rates.

post #4 of 77

Also keep in mind that there is noise that microphones pick up in the recording studio. It's not much, but except in maybe the quietest parts of certain mixes, it's going to be above the level induced by dithering from 16-bit output playback and should pretty much completely mask it.

 

Unless you are listening very loudly and in a quiet environment, the ambient noise level of the listening environment (even attenuated by in-ear monitors, at least usually) is going to be well above the dithered quantization noise as well.

 

With respect to ultrasonic frequencies, ignoring for now whether you can hear them, many sound reproduction systems don't handle them well anyway, particularly headphones and IEMs. Microphones used for recording instruments and vocals really have no need to capture, say, 40 kHz either.

 

 

By the way, with respect to bit depth there is a simple demo for 16 bits on down, so you can listen for yourself. It's in this video:

http://www.youtube.com/watch?v=BYTlN6wjcvQ

 

audio files found here:

http://www.ethanwiner.com/aes/

post #5 of 77
^^^ Thanks Mike. Also, not to be a shill, my Audio Expert book explains all of this in great detail using plain English explanations.

--Ethan
post #6 of 77

http://www.xiph.org/video/ may help some (or not)

post #7 of 77

When an engineer is recording an extraordinarily weak signal, it may be useful to have more than 16 bits, since the quantization noise floor can become audible and perhaps troublesome when the signal is greatly amplified later. I switched to 24 bits as soon as I could because of the breakup of long reverb tails in live classical music recordings, but even with reverb tails, the noise problem isn't noticeable under normal listening conditions.

 

The Nyquist theorem lies behind the choice of the music CD sample rate: twice the highest frequency you want to capture. 22,050 Hz is well out of my hearing range now, and unfortunately out of that of many 20-year-olds as well, due to unhealthy listening habits, especially on earphones.

 

However, pure frequency information alone is not all there is to the listening experience. The A/D and D/A processing used by digital music equipment involves filtering in order to properly represent the signal, and different filter types have different effects. Even if the microphones only pick up to about 25 kHz, and the speakers only play up to about 25 kHz, the hardware still responds differently to files encoded at different sample rates, and for some listeners the difference is audible. The differences are extraordinarily subtle, though, and not of interest to many listeners.

post #8 of 77
Thread Starter 
Quote:
Originally Posted by UltMusicSnob View Post

When an engineer is recording an extraordinarily weak signal, it may be useful to have more than 16 bits, since the quantization noise floor can become audible and perhaps troublesome when the signal is greatly amplified later. I switched to 24 bits as soon as I could because of the breakup of long reverb tails in live classical music recordings, but even with reverb tails, the noise problem isn't noticeable under normal listening conditions.

The Nyquist theorem lies behind the choice of the music CD sample rate: twice the highest frequency you want to capture. 22,050 Hz is well out of my hearing range now, and unfortunately out of that of many 20-year-olds as well, due to unhealthy listening habits, especially on earphones.

However, pure frequency information alone is not all there is to the listening experience. The A/D and D/A processing used by digital music equipment involves filtering in order to properly represent the signal, and different filter types have different effects. Even if the microphones only pick up to about 25 kHz, and the speakers only play up to about 25 kHz, the hardware still responds differently to files encoded at different sample rates, and for some listeners the difference is audible. The differences are extraordinarily subtle, though, and not of interest to many listeners.

So in essence, and just to clarify, 24/44 would essentially be the highest practical quality for playback on even high end headphones, and even then it's genre dependent? (I can't imagine the extra low noise floor would matter in rock, for example)

If this is the case I'm glad I only bought 3 albums from HDTracks that were 24/192
post #9 of 77

Yep, I purchased a couple of HDtracks albums in all three basic sample rates just to see.  All appeared to be from the same master.  The master was different from previous CD releases.  All sounded the same to me.  I now buy only the lowest (cheapest) sample rates from them.  I even digitally filtered out the ultrasonic content and differenced them.  Below 20kHz there is no difference.

 

If you ever have reason to downsample to different rates (44.1 and 48), the 96kHz version might leave fewer resampling artifacts, though I doubt any of them are audible.  For the most part, just get the lowest sample rates and be happy.
 

post #10 of 77

actually, dither can be used to give even greater psychoacoustically perceived S/N when you increase the sample rate - I'd rather have 16/96 than 24/44 if trying to responsibly over-bound human hearing ability

 

http://www.meridian-audio.com/w_paper/Coding2.PDF

 

if you want to play the game with only a 50% data increase, still go for sample rate - you would end up pretty much with Stuart's suggestion in the paper - and today's dither algorithms can improve on what he had available


Edited by jcx - 8/17/13 at 2:50pm
post #11 of 77
When you are listening to playback of music, standard CD quality is audibly identical to higher resolutions. In order to hear a difference, you would have to turn the volume up to ear-splitting levels and be able to hear sounds that no human can hear. It's a complete waste.

It's very useful for mixing, however. Studio engineers need that extra range so they can raise the volume of elements in the mix without bringing the noise floor up with it.
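A toy numpy illustration of that headroom argument (the -60 dBFS level and the +40 dB boost are made-up numbers, purely for the sketch): store a quiet element, then push it up in the mix, and compare how much quantization noise comes up with it at 16 vs 24 bits.

```python
import numpy as np

def quantize(x, bits):
    scale = 2 ** (bits - 1)
    return np.round(x * scale) / scale

fs = 44100
t = np.arange(fs) / fs
quiet = 0.001 * np.sin(2 * np.pi * 440 * t)   # element recorded around -60 dBFS

def boosted_noise_db(bits, gain=100.0):        # gain of 100x = a +40 dB fader move
    # amplifying the stored track amplifies its quantization error with it
    err = (quantize(quiet, bits) - quiet) * gain
    return 10 * np.log10(np.mean(err**2))

# the 8 extra bits keep roughly 48 dB of noise floor in reserve
print(boosted_noise_db(16) - boosted_noise_db(24))
```

That reserve is why 24-bit recording makes sense in the studio even when the 16-bit delivery format is transparent for playback.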
post #12 of 77
Quote:
Originally Posted by InternetSandman View Post


So in essence, and just to clarify, 24/44 would essentially be the highest practical quality for playback on even high end headphones, and even then it's genre dependent? (I can't imagine the extra low noise floor would matter in rock, for example)

If this is the case I'm glad I only bought 3 albums from HDTracks that were 24/192

Umm, I may have soft-pedaled the hi-res too much.

The key is "not of interest to  many listeners." These details are of very high interest to me, however.

 

It depends on what you're listening to and how. If you like to have something playing in the background while you chat with company, or while you drive in your car, then plain old Redbook audio at 44,100 Hz and 16 bits is going to be just fine.

 

Personally, I give a lot of attention to music when I listen, to such a degree that if I can't listen attentively then I don't want to listen at all.

 

There ARE real differences in playback sound between music CD audio and 192/24. I can hear those differences, so I upsample my music CD's to full 192/24. Yeah, it's a lot of space on the HD, and it takes time---that's my problem, no one else's.

 

To me, it's very much worth it. Playback at 192/24 sounds better to me, and I can prove it.  http://www.head-fi.org/t/676885/successful-abx-testing-to-hear-the-difference-between-redbook-audio-vs-upsampled-to-192-24

 

But it depends on what you want to listen to, and how. I wouldn't bother with my old Sex Pistols album, but I'm in the process of converting everything I own by Yasutaka Nakata.

post #13 of 77
Quote:
Originally Posted by jcx View Post

actually, dither can be used to give even greater psychoacoustically perceived S/N when you increase the sample rate - I'd rather have 16/96 than 24/44 if trying to responsibly over-bound human hearing ability

 

http://www.meridian-audio.com/w_paper/Coding2.PDF

 

if you want to play the game with only a 50% data increase, still go for sample rate - you would end up pretty much with Stuart's suggestion in the paper - and today's dither algorithms can improve on what he had available


Perhaps I don't quite understand this correctly, but I've been wondering for a while which is better: 16/96 or 24/44. A 16/96 stream is 48/33 as much data as a 24/44 stream; however, if all of the information is bandwidth-limited below 20kHz, then either sample rate, 44 or 96, is sufficient to capture the audio. In that case, the additional dynamic range stored in the 24-bit depth is theoretically superior (I think?) to the extra samples in the 16-bit stream, because the extra data in the 16/96 stream redundantly stores the same band-limited signal, whereas the 8 extra bits in the 24/44 stream are all extra dynamic range.

 

I guess the thought experiment is this... start with a mathematically defined bandwidth limited signal, say 

 

Quote:
f_samplerate1 = 44100; f_samplerate2 = 96000; f_lcm = 14112000; %! holy crap, this would be ridiculous to calculate! better use 44100 and 88200...
f_samplerate1 = 44100; f_samplerate2 = 88200; f_exact = 176400; % compare to the exact waveform values at the upsampled rate of 176.4 kHz
 
 
T = 1; % period in seconds
t1 = [0:f_samplerate1*T-1]/f_samplerate1; % 1 second of audio, so FFT frequencies are spaced by 1/T = 1 Hz
t2 = [0:f_samplerate2*T-1]/f_samplerate2; % 1 second of audio, so FFT frequencies are spaced by 1/T = 1 Hz
te = [0:f_exact*T-1]/f_exact;
 
% this is our mathematically exact definition of the signal
flim = 20000;
fs = 1:flim; % the frequencies present
a = -1+2*rand([1, flim]); % cosine coefficients for a 20 kHz bandwidth-limited signal --- white noise
b = -1+2*rand([1, flim]); % sine coefficients for a 20 kHz bandwidth-limited signal --- white noise
 
A_Squared = sum( a.^2 + b.^2 ); % Parseval says sum( abs(y_j)^2 ) = sum( abs(fft(y_j))^2 ), so we can normalize the signal to a nominal +/- 1 amplitude
a = a/sqrt(A_Squared);
b = b/sqrt(A_Squared); % -> these coefficients give an rms amplitude of about 0.7071 (i.e. 1/sqrt(2)), but samples can reach +/- 3 or so (i.e. +/- 4 std)
% to be safe, cut the samples down by a factor of 4 to avoid clipping
a = a/4;
b = b/4;
 
Fy1 = zeros(size(t1));
Fy2 = zeros(size(t2));
Fe = zeros(size(te));
 
Fy1(2:flim+1) = a + 1i*b; % fill in DFT coefficients
Fy1 = 0.5* (Fy1 + fliplr(circshift(Fy1, [0,-1])));
 
Fy2(2:flim+1) = a + 1i*b; % fill in DFT coefficients
Fy2 = 0.5* (Fy2 + fliplr(circshift(Fy2, [0,-1])));
 
Fe(2:flim+1) = a + 1i*b; % fill in DFT coefficients
Fe = 0.5* (Fe + fliplr(circshift(Fe, [0,-1])));
 
y1 = ifft(Fy1*f_samplerate1, 'symmetric');
y2 = ifft(Fy2*f_samplerate2, 'symmetric');
ye = ifft(Fe*f_exact, 'symmetric');
 
dither_function = @(y) randn(size(y)); % Gaussian white noise with an rms of 1 LSB. I don't know how dither is actually implemented - can somebody point me towards an easy noise-shaping algorithm? I could generate white noise (like here), apply a shaping curve in FFT space, invert the FFT, and rescale to a nominal amplitude of +/- 1 bit?
 
% now cut the samples down to 16 bits and 24 bits
y1_24 = round( 2^23 * y1 + dither_function(y1) ); % now integers in the +/- 2^23 range
y2_16 = round( 2^15 * y2 + dither_function(y2) ); % now integers in the +/- 2^15 range
 
% "playback" on an upsampling DAC or with upsampling DSP
y1_24_up176 = 4*ifft(ifftshift( [zeros([1, length(y1)*1.5]), fftshift(fft(y1_24/(2^23))), zeros([1, length(y1)*1.5])] ), 'symmetric');
y2_16_up176 = 2*ifft(ifftshift( [zeros([1, length(y2)*0.5]), fftshift(fft(y2_16/(2^15))), zeros([1, length(y2)*0.5])] ), 'symmetric');
 
% compare to the original mathematically defined waveform
err1 = ye - y1_24_up176;
err2 = ye - y2_16_up176;
 
% compare the squared errors of the two - how else would you quantify the error or quality?
[sum(err1.^2), sum(err2.^2)]

10*log10([sum(err1.^2), sum(err2.^2)])
 
figure(1), plot(t1, y1, te, ye-y1_24_up176); % plots signal 1 and its reconstruction error (a small error means good reconstruction)
figure(2), plot(t2, y2, te, ye-y2_16_up176); % plots signal 2 and its reconstruction error (a small error means good reconstruction)
 

 

Quote:
> ans =
 

1.0e-03 *

    0.0000    0.1770

 

> ans =

 

-85.6572  -37.5203

 


This code runs in MATLAB and should run in Octave, so anybody else can try this.

 

Basically, I mathematically defined a sound and sampled it at two rates and two different bit depths. I used white noise with an rms of 1 LSB for the dithering.

Here, it appears that 24-bit sampled more slowly outperforms 16-bit sampled twice as fast, despite 24/44 having about 2/3 the data density of 16/96.

 

I would like to test noise shaping, but I don't know how to implement it.
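For what it's worth, the simplest shaper I know of is first-order error feedback: carry each sample's rounding error into the next sample before rounding, which gives the quantization noise a high-pass (1 - z^-1) shape. A Python sketch of my own (not tested against the MATLAB above; the 8-bit depth and 4 kHz comparison band are arbitrary choices):

```python
import numpy as np

def shape_quantize(x, bits):
    """First-order error-feedback quantizer: each sample's rounding error is
    added to the next sample before rounding, pushing the quantization noise
    toward high frequencies."""
    scale = 2 ** (bits - 1)
    out = np.empty_like(x)
    err = 0.0
    for n, v in enumerate(x * scale):
        v = v + err
        q = np.rint(v)
        err = v - q              # carry the rounding error forward
        out[n] = q / scale
    return out

fs = 44100
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 1000 * t)      # 1 kHz test tone

plain = np.round(x * 128) / 128             # flat 8-bit rounding
shaped = shape_quantize(x, 8)

def band_energy(e, fs, f_hi):
    # total error energy in the band below f_hi
    spec = np.abs(np.fft.rfft(e)) ** 2
    freqs = np.fft.rfftfreq(len(e), 1 / fs)
    return spec[freqs < f_hi].sum()

# the shaped error should carry much less energy below 4 kHz
print(band_energy(x - plain, fs, 4000) > band_energy(x - shaped, fs, 4000))
```

The total error energy actually goes up with shaping; the win is that it is moved out of the band the ear is most sensitive to, which is the same trade the Meridian paper describes.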

 

Sorry for the poorly documented example, I'll try to elaborate on it later.

 

cheers

 

 

 

EDIT: in Octave, replace all "ifft(*,'symmetric')" with "real(ifft(*))"


Edited by ab initio - 8/17/13 at 10:57pm
post #14 of 77
Thread Starter 
Quote:
Originally Posted by UltMusicSnob View Post

Umm, I may have soft-pedaled the hi-res too much.
The key is "not of interest to  many listeners." These details are of very high interest to me, however.

It depends on what you're listening to and how. If you like to have something playing in the background while you chat with company, or while you drive in your car, then plain old Redbook audio at 44,100 Hz and 16 bits is going to be just fine.

Personally, I give a lot of attention to music when I listen, to such a degree that if I can't listen attentively then I don't want to listen at all.

There ARE real differences in playback sound between music CD audio and 192/24. I can hear those differences, so I upsample my music CD's to full 192/24. Yeah, it's a lot of space on the HD, and it takes time---that's my problem, no one else's.

To me, it's very much worth it. Playback at 192/24 sounds better to me, and I can prove it.  http://www.head-fi.org/t/676885/successful-abx-testing-to-hear-the-difference-between-redbook-audio-vs-upsampled-to-192-24

But it depends on what you want to listen to, and how. I wouldn't bother with my old Sex Pistols album, but I'm in the process of converting everything I own by Yasutaka Nakata.

If I'm perfectly honest, even if I could hear the differences between 16/44 and 24/192 (I've never run an ABX test myself), it wouldn't matter for my listening purposes. The vast majority of the time I'm using my IEMs (soon to be CIEMs, when they finally show up in the mail) as earplugs at work; I'm in a noisy environment both on the bus and at work, so while I can enjoy the music, the difference between the two would probably be undetectable in my listening environment, regardless of genre (jazz, rock, orchestral, some pop).

That being said, the higher-resolution music that I own does seem to put a bit more strain on my DAP's battery life, which barely gets me through the whole day as it is, so what program would you recommend for taking my 24/96 and 24/192 content down to 24/44?
post #15 of 77
Quote:
Originally Posted by UltMusicSnob View Post

It depends on what you're listening to and how. If you like to have something playing in the background while you chat with company, or while you drive in your car, then plain old Redbook audio at 44,100 Hz and 16 bits is going to be just fine. Personally, I give a lot of attention to music when I listen, to such a degree that if I can't listen attentively then I don't want to listen at all.

 

When you upsample, there is absolutely no difference whatsoever, unless the program you use is introducing artifacts. If you can hear a difference, it isn't music you are hearing; it's noise introduced by your upsampling conversion.

 

In order to hear the difference between high-resolution and redbook versions of the same recording, you would have to turn the volume on your stereo up as loud as a jet engine at close range.

 

Attentive listening has nothing to do with bitrate.


Edited by bigshot - 8/17/13 at 11:00pm