Quote:
Having some trouble moving from digital photography, where I am totally comfortable with the concepts of dynamic range, bit depth and resolution... and maybe the terms have a different meaning in audio than in digital photography, but to some extent digital should be digital....
Dynamic range is what it is based on the sensor, and has nothing to do with bit depth (dynamic range = the difference in stops between the darkest and lightest source where the sensor can detect a difference)
Partially true, in that the sensor is the limiting factor, but so is bit depth. The confusion occurs in the scaling a camera does between sensor output and the digital conversion. An 8-stop sensor can still be scaled so it is digitized to 12 bits per channel, even though the actual sensor dynamic range is much less than what 12 bits per channel is capable of. In photography we are also concerned with how big the steps are in the gray scale. This is one way digital audio and digital imaging differ.
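A rough numeric sketch of that scaling in Python; the linear scale is a simplification (real cameras apply a non-linear response curve), and the numbers are purely for illustration:

```python
# Hypothetical numbers: an 8-stop sensor digitized to 12 bits per channel.
sensor_stops = 8
bit_depth = 12

sensor_ratio = 2 ** sensor_stops      # 256:1 usable light range
code_levels = 2 ** bit_depth          # 4096 available codes

# Scaling the sensor range linearly onto the code range means the codes
# outnumber the sensor's distinguishable levels by a wide margin.
codes_per_level = code_levels / sensor_ratio
print(f"{code_levels} codes over a {sensor_ratio}:1 sensor range "
      f"(~{codes_per_level:.0f} codes per sensor level)")
```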
Quote:
The bit depth is the precision with which the strength of a given "piece of light" can be measured - the light is what it is and bit depth is simply a measure of precision. In images this is relevant particularly in the editing process, where changes to an 8-bit image, e.g. JPG (roughly analogous to MP3; actually 3 x 8 bit = 24, one each for the red, green and blue channels), suffer more significant "quantization errors" than 12-, 14- or 16-bit images, e.g. TIFF or RAW (roughly analogous to FLAC etc.)
The precision-of-measurement idea is right, but the analogies are a bit off. JPG images are reduced in size by breaking the image into small blocks of pixels, converting each block into frequency components, and quantizing away the fine detail the eye is least likely to notice; on display, each block is reconstructed from what's left. How aggressively that's done is chosen by the JPG quality setting, which is pretty high in cameras and variable in image-processing software.

MP3 (technically MPEG-1, Layer 3) processing is a bit different in that it uses the concept of masking to determine what's needed and what's not. Masking is where a dominant loud frequency makes another nearby, lower-level frequency inaudible. While that's sort of similar to JPG image processing, audio is changing over time, so the data that can be eliminated because it's not audible changes in definition on a continual basis.

Also, when you compare JPG or MP3 compression, bit depth is technically a separate issue. You're right that lower bit depths mean larger approximations, but that's only related to, not the same as, the actual data-reduction methods. TIFF and RAW are lossless, as are FLAC, ALAC, AIFF and WAV, but a TIFF image can also carry metadata, and a RAW image has tags that are required for proper rendition and are camera-specific as to dynamic range, gamma, color, etc. None of that happens in any of the audio formats. Part of what goes into a RAW file is determined by the scaling and calibration of the sensor. In audio, there isn't any of that going on.
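To illustrate the bit-depth side of that, here's a toy Python sketch; it's not JPG or MP3, just the separate point that coarser quantizers produce larger approximation errors:

```python
import numpy as np

# Quantize the same "signal" at several bit depths and measure the error.
rng = np.random.default_rng(0)
signal = rng.uniform(-1.0, 1.0, 100_000)   # stand-in for audio samples

for bits in (8, 12, 16):
    levels = 2 ** (bits - 1)               # signed quantizer step count
    quantized = np.round(signal * levels) / levels
    err = signal - quantized
    rms_db = 20 * np.log10(np.sqrt(np.mean(err ** 2)))
    print(f"{bits:2d} bits: RMS quantization error ~ {rms_db:.1f} dB re full scale")
```

Each added bit halves the step size, so the error floor drops by about 6 dB per bit, which ties into the dynamic-range point below.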
Quote:
The resolution is the density, in a given area, of the photosites of the sensor and would seem to correspond to the samples per second in audio. The more photosites (pixels), the higher the resolution.
So the way I see this in audio is: take a given sound pressure, say 100 dB - the bit depth would determine the difference between 100.0000000000 (lower bit depth) and 100.000000000012345 (higher bit depth) - whether that is an audible difference is probably still open for discussion, but I don't see that bit depth is relevant to dynamic range; it certainly isn't in digital photography.
The resolution analogy is good as far as pixel count vs. sample rate. However, bit depth is always related to dynamic range, in both photography and audio. The fewer the bits, the less range between the maximum signal or light level and the minimum (and noise) level.

In audio, quantization is linear, meaning there's no scaling before conversion. So there is a fixed relationship between bit depth and available dynamic range, roughly 6 dB per bit, not counting noise shaping and dither. 16-bit audio is basically capable of 96 dB between maximum level and the noise floor.

The same is true in photography, except there is scaling dictated by the sensor. Your sensor's blackest black and whitest white land somewhere between the minimum and maximum of the digital word, even if the actual sensor output is non-linear. The key to decoding that scaling is provided in meta tags, and is important for RAW decoding. Ever get the RAW profile wrong in Photoshop? Probably not, because that's mostly been fixed by now, but early on you could sometimes mis-decode a RAW image; the results were interesting, but not useful. It's that correction that's got you confused.

The other issue is color profiles in display and output devices. Your sensor may capture 10 stops, but your display can't show that, and you certainly can't print it, so what profiles do is apply another correction to let your (hopefully calibrated) screen "fake" a 10-stop image. We don't do that in digital audio either.
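The "roughly 6 dB per bit" figure is just the math of linear PCM: each added bit doubles the number of steps, worth 20*log10(2) ≈ 6.02 dB. A quick Python check:

```python
import math

# Theoretical range between full scale and the quantization floor for
# linear PCM, ignoring dither and noise shaping.
for bits in (8, 16, 24):
    dr_db = 20 * math.log10(2) * bits   # ~6.02 dB per bit
    print(f"{bits:2d}-bit PCM: ~{dr_db:.0f} dB")   # 8 -> ~48, 16 -> ~96, 24 -> ~144
```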
Trying to say it simply: bit depth always relates to dynamic range. In audio, the steps have a fixed size; in imaging, the size of a step is scaled to the sensor/scanner, and then again to the display or output device, such that the sensor's minimum black still falls within the bit depth and its maximum white stays below the maximum the bit depth defines.
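To put numbers on that contrast, a minimal sketch; the sensor black/white endpoints are made-up values purely for illustration:

```python
bit_depth = 12
codes = 2 ** bit_depth                     # 4096 code values

# Audio: the step is a fixed fraction of full scale, always.
audio_step = 1.0 / codes
print(f"audio step: {audio_step:.6f} of full scale, fixed by the format")

# Imaging: the same codes are stretched between whatever the sensor's
# calibrated black and white points happen to be (hypothetical numbers).
sensor_black, sensor_white = 40.0, 3900.0
image_step = (sensor_white - sensor_black) / codes
print(f"image step: {image_step:.3f} sensor units, set by calibration")
```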