gregorio
Headphoneus Supremus
- Joined: Feb 14, 2008
- Posts: 6,798
- Likes: 4,057
[1] Now imagine raising the volume (dramatically, if you like) just during the quietest moment of the track. ... Theoretically, you should then be able to 1) not destroy your hearing and 2) hear the difference between a 24-bit and 16-bit dithered noise floor.
[2] I recently found the following article, which describes this more eloquently than I ever could: http://www.tonmeister.ca/wordpress/2014/09/15/audio-mythinformation-16-vs-24-bit-recordings/
1. This was discussed, albeit briefly, in the first couple of pages of this thread. Let's take an extreme example: a recording with a 72dB dynamic range. This is extreme because hardly any commercial recordings have a dynamic range of more than 60dB. For a 72dB dynamic range we need about 12 bits of data/resolution. Now let's say we whack the volume up by 20dB during the quietest parts so that we can hear the digital noise floor (and differentiate 24bit from 16bit). As Spruce Music says, what you've effectively done is manual compression: you've raised the noise floor by 20dB while the peak volume remains the same (because you lower the volume again during the loud parts). The 72dB dynamic range of our recording is now 52dB, for which we only need about 9 bits!
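The arithmetic above can be sketched in a few lines. This is just an illustration of the ~6.02 dB-per-bit rule of thumb applied to the 72dB/20dB numbers from the example; the function name is mine, not from any audio library:

```python
import math

def bits_for_dynamic_range(dr_db: float) -> int:
    """Bits needed to cover a given dynamic range, using the rule of
    thumb of ~6.02 dB per bit (20 * log10(2) dB per bit)."""
    return math.ceil(dr_db / (20 * math.log10(2)))

recording_dr = 72.0  # the extreme example above
boost_db = 20.0      # volume raised during the quiet passages

print(bits_for_dynamic_range(recording_dr))             # 12
print(bits_for_dynamic_range(recording_dr - boost_db))  # 9
```

Riding the fader up by 20dB shrinks the range that actually needs encoding, so the bit requirement drops accordingly.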
2. I see this kind of thing quite often, even in some published papers. Most of the information in that article is based on 16bit with TPDF dither. The result of this dither is white noise (as the article states), and under certain circumstances the potential problems described might exist, although I would dispute both their magnitude and how often they would be encountered in practice. However, my main problem is with the starting premise: in the real world, how many commercial 16bit recordings actually have TPDF dither? If we're comparing 24bit vs 16bit, the answer is pretty much none at all! It has NEVER been standard practice to apply TPDF dither to a 24bit master/mix for 16bit distribution; it's always some form of noise-shaped dither. This brings us to the appendix, where the author admits his potential problem with TPDF dither is eradicated by noise-shaped dither, but introduces a new potential problem in the form of possible IMD, caused by the shaped dither's noise energy up around the >16kHz range. However, this again makes no sense when comparing 24bit to 16bit. In practice we don't really encounter 24/44.1 files; 24bit consumer music files are typically 96kHz or 192kHz, which provide a significantly higher frequency response. If a replay system can't handle the audible range and is generating IMD products from frequency content at say 17kHz, what IMD products is it going to generate from frequency content at say 30kHz?
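For readers unfamiliar with the term, TPDF dither itself is simple to sketch: add the sum of two independent uniform noises (a triangular probability density spanning ±1 LSB) before rounding. This is a minimal illustration of the textbook technique, not any particular mastering tool's implementation, and assumes float samples in [-1, 1):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, illustration only

def quantize_tpdf(x: np.ndarray, bits: int = 16) -> np.ndarray:
    """Quantise a float signal in [-1, 1) to `bits` bits with TPDF dither.
    Summing two independent uniform noises gives a triangular PDF spanning
    +/- 1 LSB, which decorrelates the quantisation error into a steady
    white noise floor instead of signal-dependent distortion."""
    lsb = 2.0 ** (1 - bits)  # quantisation step for signed full-scale samples
    dither = (rng.uniform(-0.5, 0.5, x.shape) +
              rng.uniform(-0.5, 0.5, x.shape)) * lsb
    return np.round((x + dither) / lsb) * lsb

# A 1 kHz test tone at -6 dBFS, one second at 44.1 kHz
t = np.arange(44100) / 44100.0
tone = 0.5 * np.sin(2 * np.pi * 1000.0 * t)
dithered = quantize_tpdf(tone, bits=16)
```

Noise-shaped dither, by contrast, filters that error spectrum to push its energy up above ~16kHz, which is exactly the high-frequency content the article's appendix worries about.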
Quote:
Originally Posted by castleofargh
some stuff are relatively easy, like the maximum dynamic range of an ADC or a DAC. but when looking at microphones, recording methods, different studio environments... it becomes hard to put a number on things.
As Pinnahertz stated, dynamic range is a "weakest link" scenario: it's defined by the point in the whole recording/playback chain which has the smallest dynamic range. The peak/high point of that dynamic range is probably defined by the consumer amp and speakers/cans part of the chain, but let's generously assume high quality consumer playback equipment and define the peak part of our dynamic range equation as 120dBSPL (dictated by the human hearing part of the chain). For most music it's going to be a substantially lower figure than this, but again let's take an extreme case, say a symphony orchestra, which could commonly have sustained peaks up around 105dB but may contain the odd transient up to nearly 120dBSPL, which is very loud but just about bearable provided those transients are infrequent and of very short duration. Let's also take another extreme, very well isolating IEMs/headphones in a very quiet room, giving a listening environment of say 20dBSPL. Using these two extremes to define our boundaries, we have a potential dynamic range of 100dB, which could in theory mean we would be able to differentiate between 16 and 24bit. However, this is in theory, not in practice, because there are a few serious holes in this scenario:
1. I chose a symphony orchestra as an example because logic suggests it produces the largest dynamic range: typically from one, two or a small handful of musicians playing quietly at one extreme, to 80-120 musicians simultaneously playing as loud as they can at the other. A large top class studio should have a noise floor around 30dBSPL, which btw already reduces our potential dynamic range from 100dB to 90dB, but it doesn't stop there: put say 90 living, breathing, moving musicians in that studio and its noise floor is no longer even close to 30dBSPL. Our potential dynamic range is now down to about 60dB for average peak levels and 75dB for the occasional transient. And obviously, if we're talking about recording a live performance, then we typically have around 1-4 thousand living, breathing, moving audience members adding to the noise floor, and probably around another 10dB or so reduction in our potential dynamic range.
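Why the floor climbs so fast becomes obvious if you sum the noise sources as acoustic powers rather than dB values. The per-musician figure below is purely a hypothetical number of mine for illustration, not a measurement:

```python
import math

def sum_noise_db(levels_dbspl):
    """Combine incoherent noise sources by summing their acoustic powers:
    L_total = 10 * log10(sum(10 ** (L_i / 10)))."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_dbspl))

# Hypothetical numbers: an empty 30 dBSPL studio, plus 90 musicians each
# contributing ~20 dBSPL of breathing/rustling noise at the mic position.
floor = sum_noise_db([30.0] + [20.0] * 90)
print(round(floor, 1))  # 40.0: the floor rises 10 dB before a note is played
```

Even sources individually 10dB quieter than the room can dominate the combined floor once there are enough of them, which is why a full stage never measures like an empty one.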
2. As far as dynamic range is concerned, our hearing operates on a similar fundamental principle to our sight. We have a wide visual dynamic range for brightness, from a bright sunny day to a fairly dark room. However, we're all well aware that this is a bit of a clever trick: our eyes actually have a much smaller dynamic range than those limits suggest, but get around this by effectively making that limited dynamic range window moveable. We can see well in a darkened room, but if we leave that room and walk into bright sunlight, it's dazzling to the point of being painful. The upper limit of our visual dynamic range is significantly less than our theoretical/usual limit, until our eyes have had time to adjust their dynamic range window; once they have, if we re-enter the darkened room we can no longer see well, it's pitch black, again until our eyes readjust their dynamic range window to a lower/darker level. The same happens with our hearing: if we really have achieved a listening environment of just 20dBSPL, our upper limit is no longer 105dBSPL (with occasional 120dB transients), it's significantly/proportionately lower. The evidence I've seen suggests our ears' moveable dynamic range window is somewhere between 30dB and 60dB.
3. Closely related to my response #2 to csglinux above, we have to be careful about the quoted figures: how do they actually apply to the real world? Yes, a symphony orchestra can produce transient peaks up to 120dB; yes, a harmon-muted trumpet can produce measurable output at >80kHz, etc. But what can be produced and what is actually heard are two different things. Notwithstanding the fact that I don't know of any symphonies which require a trumpet to use a harmon mute, or that we can't hear 80kHz, just because we can record and measure 80kHz content with a mic placed a few inches from the trumpet's bell is meaningless in terms of an accurate or realistic recording, unless you're accustomed to sitting just a few inches in front of the trumpet during a symphony performance?! In practice we're going to be at least about 30ft away, and probably double that in a "prime" seat. Add to this a dozen or so living absorption panels in the way (say three or so desks of violas and a few rows of audience) and see how much 80kHz trumpet content you can record now! It's even worse with say a French horn, where between what the horn actually produces and what you (in the audience) hear there is maybe: 100ft of air, two percussion sections, a wall (!), 4 or 5 desks of violins and a few rows of audience! The same goes for the volume figures for the orchestra: where are we measuring those 120dBSPL transient peaks? From just in front of the conductor, or from a few rows back in the audience? If it's the latter, then our peaks (and therefore dynamic range) are probably around 10dB or more lower.
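The "10dB or more lower" figure follows directly from the free-field inverse-square law, which loses about 6dB per doubling of distance. The reference distance below is a hypothetical number of mine, and real halls (with absorption, bodies and reflections) behave less cleanly than this sketch:

```python
import math

def spl_at_distance(spl_ref_db: float, ref_dist: float, dist: float) -> float:
    """Free-field inverse-square law: SPL falls ~6 dB per doubling of
    distance. Ignores air absorption, bodies and walls, all of which
    only remove more energy (especially at high frequencies)."""
    return spl_ref_db - 20 * math.log10(dist / ref_dist)

# Hypothetical: 120 dBSPL transient measured ~3 ft in front of the orchestra
print(round(spl_at_distance(120.0, 3.0, 30.0), 1))  # 100.0 at 30 ft
print(round(spl_at_distance(120.0, 3.0, 60.0), 1))  # 94.0 at 60 ft
```

So even before counting a single viola desk or audience row, geometry alone takes roughly 20-26dB off the close-mic figure by the time you reach a realistic seat.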
Taking the above three practicalities into account: 1. The dynamic range limit of the 16bit format is pretty much the least of our dynamic range bottlenecks. Even in extreme circumstances it's no worse than the second least, with several other bottlenecks of greater significance actually defining the practical dynamic range. 2. Dynamic range is effectively an artistic decision, which in the case of acoustic performance genres is defined not by the sound the instrument/s actually produce but by where we choose to position the listener, and therefore where we place the mics relative to the orchestra/sound sources. 3. I don't believe it's entirely coincidental that the dynamic range of the most dynamic music recordings is generally no more than about 60dB.
G