Half and full precision floating point for audio?
Oct 18, 2013 at 1:56 PM Thread Starter Post #1 of 7

shaocaholica

I'm fairly new to audio and I mainly work with pixels.  In visual effects, all studios now use half and full precision floating point (16bit/32bit) to represent pixel values for intermediates, which may need lots and lots of processing, so using FP lets you get away with things like negative values and values above 1 without clipping.  Of course the math is much more complex and it eventually gets converted to int for display in a framebuffer.  For playback and publishing purposes only 8 or 10 bits are required, as current displays can't use any additional precision, yet.
 
So are there any mics and A/D hardware that can even do floating point sound?  Could it be useful for recording?  Are there standards for FP audio data?
 
Oct 18, 2013 at 2:17 PM Post #2 of 7
Single/double precision in programming usually refers to 32/64 bit IEEE floating point.
 
For integers common resolutions are 8, 16 (CD), 20, 24, 32 bits per sample, where 8 is usually unsigned (0 to 2^8-1) and the rest is signed, e.g. -2^15 to 2^15-1 for 16 bits.
 
Floating point resolutions are 32 and 64 bits per sample, with sample values normalized to the range -1.0 to +1.0.
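
To make the two sample ranges concrete, here is a minimal sketch of converting between signed integer samples and normalized floats. The function names are made up for illustration, and dividing by 2^(bits-1) is only one common convention (some software scales by 2^(bits-1) - 1 instead).

```python
def int_to_float(sample: int, bits: int = 16) -> float:
    """Map a signed integer sample to a float in [-1.0, 1.0)."""
    full_scale = 1 << (bits - 1)          # 32768 for 16 bits
    return sample / full_scale


def float_to_int(value: float, bits: int = 16) -> int:
    """Map a float back to a signed integer, clipping out-of-range values."""
    full_scale = 1 << (bits - 1)
    lo, hi = -full_scale, full_scale - 1  # -32768 .. 32767 for 16 bits
    return max(lo, min(hi, round(value * full_scale)))


print(int_to_float(-32768))  # most negative 16-bit sample
print(float_to_int(0.5))     # half of full scale
print(float_to_int(1.5))     # above full scale: clipped on conversion
```

A float value above 1.0 survives intermediate processing untouched and is only clipped at the final conversion to integer, which is the same benefit described for pixel values in the first post.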
 
 
Most digital signal processing software works with 32 or even 64 bit floats. Some DACs accept 32 bit floats; most accept 24 or 20 bit integers, and most older ones only 16 bit integers. Similarly, some ADCs output 32 bit data.
 
The problem is of course that a high resolution does not indicate high SNR, low distortion etc. The commonly achieved performance is roughly 20 bits... so the lower 4 bits in 24 bit samples would be noise.
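
As a rough illustration of how bit depth relates to SNR, the sketch below uses the textbook formula for an ideal quantizer driven by a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB). That formula is not from this post, and real converters fall short of it; the 20 bit figure is the estimate given above.

```python
import math

DB_PER_BIT = 20 * math.log10(2)       # about 6.02 dB per bit
SINE_OFFSET = 10 * math.log10(1.5)    # about 1.76 dB for a full-scale sine


def ideal_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal quantizer with the given bit depth."""
    return bits * DB_PER_BIT + SINE_OFFSET


def effective_bits(snr_db: float) -> float:
    """Invert the formula: how many ideal bits a measured SNR corresponds to."""
    return (snr_db - SINE_OFFSET) / DB_PER_BIT


for bits in (16, 20, 24):
    print(f"{bits} bits: {ideal_snr_db(bits):.1f} dB")
```

Under these assumptions, hardware that measures like a 20 bit ideal converter leaves the lowest 4 bits of a 24 bit sample below its own noise floor.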
 
Oct 18, 2013 at 2:21 PM Post #3 of 7
Ah ok.  In VFX we like to use half precision 16bit FP as it's -enough- to pass the threshold of visual scrutiny.  Internally all the tools we use convert to 32bit FP for calculations and transforms, but when stored to disk we typically use 16bit FP because of disk space concerns.
 
Oct 18, 2013 at 2:30 PM Post #4 of 7
 
In software, floating point is widely used for audio processing, since it is fast on modern CPUs, and also makes the code simpler (no need to constantly normalize/check for clipping, etc.). However, there is generally not much point increasing ADC/DAC resolution above that of 24 bit integer samples, since it would be difficult to build hardware that has enough dynamic range to make the sample format the limiting factor.
 
Floating point audio files usually contain 32-bit IEEE format samples (1 bit sign + 8 bits exponent (biased, so a stored value of 127 means 2^0) + 23 bits mantissa with an implied MSB=1), which is also natively supported by x86 CPUs. 64-bit floats are also commonly used internally in software.
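
The 32-bit layout described above can be inspected with Python's standard `struct` module. This is just a sketch; `float32_fields` is a made-up helper name.

```python
import struct


def float32_fields(x: float) -> tuple[int, int, int]:
    """Split a value, stored as IEEE 754 single precision, into its bit fields."""
    (raw,) = struct.unpack("<I", struct.pack("<f", x))  # reinterpret as uint32
    sign = raw >> 31                  # 1 bit
    exponent = (raw >> 23) & 0xFF     # 8 bits, biased by 127
    mantissa = raw & 0x7FFFFF         # 23 stored bits (leading 1 is implied)
    return sign, exponent, mantissa


print(float32_fields(1.0))    # exponent field 127 means 2^0
print(float32_fields(-0.5))   # sign bit set, exponent field 126 means 2^-1
```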
 
Oct 18, 2013 at 2:33 PM Post #5 of 7
The problem is that half precision floating point has only 11 significant bits, so quantization noise could cause more problems than simply using 16 bit signed integers.
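
A quick way to see this tradeoff is to compare the worst rounding error of half precision and 16 bit integers over a loud range and a quiet range. This is only a sketch: the helper names and the two test ranges are arbitrary choices, and it relies on the `"e"` (half precision) format code in Python's `struct` module.

```python
import struct


def to_half(x: float) -> float:
    """Round x to the nearest IEEE half precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_int16(x: float) -> float:
    """Round x to the nearest 16 bit integer step, returned as a float."""
    return max(-32768, min(32767, round(x * 32768))) / 32768


def worst_error(quantize, lo: float, hi: float, steps: int = 10000) -> float:
    """Largest rounding error seen over evenly spaced values in [lo, hi]."""
    values = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(abs(quantize(v) - v) for v in values)


for lo, hi in ((0.5, 0.999), (0.001, 0.002)):
    print(f"range {lo}..{hi}: "
          f"half={worst_error(to_half, lo, hi):.2e}  "
          f"int16={worst_error(to_int16, lo, hi):.2e}")
```

Near full scale the half float error is larger than the integer error, while for very quiet samples it is smaller, since the float step size shrinks with the signal level and the integer step size does not.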
 
I guess in visual effects a high dynamic range is more important / a better tradeoff.
 
Oct 18, 2013 at 2:40 PM Post #6 of 7
 
I guess in visual effects a high dynamic range is more important / a better tradeoff.

 
Yeah it is.  It's really motivated by disk space.  Next time you see a big budget VFX shot in a movie try to find quantization noise :)
 
But I'm sure as storage gets cheaper and faster we'll eventually move to 32bit FP files on disk.  Remember that a single VFX frame can have 1000s of elements each with millions of pixels and each pixel has at least 3/4 channels and sometimes more.  Multiply that by 16bits and some lossy (yep lossy) compression and you still have a huge amount of data to build a single visual frame.
 
Basically it's like recording a large orchestra, only each instrument and singer has their own isolated channel.  That's a lot of data.
 
Oct 18, 2013 at 2:45 PM Post #7 of 7
  Remember that a single VFX frame can have 1000s of elements each with millions of pixels and each pixel has at least 3/4 channels and sometimes more.

That definitely helps, a lot.
 
In audio we just have 16 bits * n channels (usually just 2) * Fs (sampling rate, usually 44100 Hz) = 1,411,200 bits/s = 176,400 bytes/s, or about 172 KiB/s
 
Of course during production you'd have more "channels", 24 or probably 32 bits and 96 kHz.
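
The same arithmetic as a small sketch, for uncompressed PCM. The function name is made up, and the 48 channel count in the production example is a hypothetical figure chosen for illustration, not something stated in the thread.

```python
def data_rate_bytes_per_s(bits: int, channels: int, sample_rate: int) -> float:
    """Uncompressed PCM data rate in bytes per second."""
    return bits * channels * sample_rate / 8


cd = data_rate_bytes_per_s(16, 2, 44100)
print(f"CD stereo: {cd / 1024:.1f} KiB/s")

# Hypothetical production session: 48 channels of 32 bit samples at 96 kHz.
session = data_rate_bytes_per_s(32, 48, 96000)
print(f"Production session: {session / 1024**2:.1f} MiB/s")
```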
 
