R2R/multibit vs Delta-Sigma - Is There A Measurable Scientific Difference That's Audible
Feb 12, 2016 at 12:20 PM Post #826 of 1,344
There seem to be several conversations going on here....
 
To address your first comment - simply - if the original acoustic source didn't have any ringing, but the digital representation of it does, then the digital representation is NOT perfect - and the ringing added by the digital process is indeed a flaw. (Since there shouldn't be any ringing at all, but some ringing is inevitable, and even inherent, in any currently available digital representation, we must admit that none of the solutions are perfect, and so we are picking a compromise. The best we may even hope for is "the best digital representation, given the sample rate, and the other limitations of the system we've chosen".) However, even beyond that, we can't absolutely choose a single "best compromise" because the choice depends on our parameters; one filter gives the flattest power response, another gives the best phase response, and yet another gives the most accurate impulse response... name your poison. (And, in the end, the question returns to "which of the flaws is least audible?" And, of course, we are hoping to be able to choose an option where there are NO AUDIBLE FLAWS.)
 
My comment about how some DACs do in fact process different sample rates differently was directed at your comment about "the results of conversions performed by Sox sounding identical to you". And my point was that, when comparing files at different sample rates, you can't assume that listening to both on the same DAC constitutes "a level playing field". For example, assuming that you started with a 24/192k file and converted it to 16/44k using Sox, it's possible that the 16/44k file has actually been seriously altered, but that the DAC itself is seriously altering the 24/192k file when it plays it, and so their apparent similarity is simply the result of two similar errors. (This isn't as far fetched as it sounds - because many DACs - especially older ones - actually do have significantly higher levels of various types of distortion at higher sample rates. Therefore, because of design limitations, the DAC you've chosen could actually be "cancelling out" some positive difference in the higher sample rate file by adding more distortion to it during playback.) And, in the particular case of the example you've chosen, the 44k version of the file you converted using Sox might sound identical BECAUSE THERE'S SOMETHING WRONG, and, if so, then doing a better quality conversion might just produce a 16/44k file that sounded BETTER than the 24/192k original ON THAT DAC. (I would agree that it's somewhat unlikely, and the only way I could imagine to determine if the different result was better would be to repeat the same experiment with several different converters and DACs, but such a result is not at all beyond the range of possibility, which means that it's just one more factor that makes the results uncertain.)
 
There absolutely ARE DACs in existence which process and filter source files at different sample rates differently enough that the differences in how the DAC handles the files exceed the differences in the files themselves. (I owned a well-known non-oversampling DAC whose frequency response was 20-20k +/- 0.25 dB at 96k, but 20-20k +0/-3 dB at 44k. If you'd compared your two files on that DAC, and they were audibly identical, then that would have meant that the files were in fact far different... )
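For what it's worth, a roll-off of about that size at 44k is roughly what plain zero-order-hold (sin(x)/x) droop would predict for a non-oversampling DAC with no compensation. The post doesn't say that this was the cause, so the sketch below is only a back-of-envelope check under that assumption; the function name is made up for illustration.

```python
import math

def zoh_droop_db(freq_hz, sample_rate_hz):
    """Level in dB (relative to DC) of an ideal zero-order hold at freq_hz.

    Assumes a hypothetical non-oversampling DAC whose only roll-off is the
    sin(x)/x droop from holding each sample for one full sample period.
    """
    x = math.pi * freq_hz / sample_rate_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))

# Droop at 20 kHz for the two sample rates mentioned in the post
for rate in (44100, 96000):
    print(rate, round(zoh_droop_db(20000, rate), 2))
```

Under that assumption the droop at 20 kHz comes out at roughly -3 dB for 44.1k and well under 1 dB for 96k, which is at least in the same ballpark as the figures quoted above.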
 
As for the fact that you seem to not "believe that ringing matters" - there seem to be a lot of voices to the contrary. (While you can feel free to argue at what level it becomes audible, the vast majority of DAC chip manufacturers, and people who write CODECS, all seem to agree that, at some point, pre-ringing becomes audible and unpleasant. We're not talking about something that's being claimed by a few fringe companies; we're talking about something that Wolfson and several DAC manufacturers consider important, that virtually all DAC chip vendors specify, and that Dolby Labs considers to be an important feature in their latest professional level encoder. Therefore, I really don't think it's reasonable to dismiss it out of hand.)   
I would suggest that, if you want to establish reasonably that a given sample rate converter is "inaudible", a good start would be to take a 16/44k file, upsample it to 96k, then down-sample it back to 44k - using the same converter. If they sound identical, then, excluding the possibility that the conversions might introduce errors that cancel out, you will have at least established the possibility that the conversions are "perfect". To me, simply converting one to another, using an "arbitrarily good enough" conversion, then comparing the two on an "arbitrarily good enough" DAC and playback system, is leaving far too many variables not adequately controlled.
 
Now, again, we need to remember that we're discussing "proving something to a reasonable degree of certainty to make it a scientific claim" here.
 
If the question is simply of whether "there seems to be enough evidence to convince you or me that it probably doesn't matter to us" - then the level of proof required is FAR lower.
 
(I do apologize, to a degree, for "trying to pick you apart on scientific details" - but, at least to me, you seem to be getting dangerously close to the line of "I don't hear a difference - therefore there can't possibly be one". I personally don't have especially "good pitch", so I can't hear when a guitar is slightly out of tune... but I still can't rule out that many of the people I know who claim that they CAN hear when one is even slightly out of tune might still be telling the truth... )
 
I did know someone once who tried the experiment of taking a 44k file, upsampling it to 96k, and then downsampling it to 44k again - using one of the popular audio editing programs (I believe it was an early version of Adobe Audition). Note that he used actual music, and not steady-state test tones or pink noise. He was horrified at the level and type of differences that existed between the original file and the double-converted one - which should have been "identical". (I've never tried that experiment, and it might be interesting to see how Sox would fare with it.)
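One way to put a number on a round trip like that, rather than relying on listening alone, is a simple null test: subtract the double-converted samples from the original and express what is left relative to the original. The sketch below is a minimal version of that idea; the function name and the toy data are assumptions for illustration, and it presumes both sample lists are at the same rate and already time-aligned.

```python
import math

def residual_db(original, round_tripped):
    """Energy of (original - round_tripped) relative to the original, in dB.

    Both arguments are sequences of samples at the same rate and already
    time-aligned; any delay or gain change from the converter has to be
    corrected first or it will dominate the result.
    """
    n = min(len(original), len(round_tripped))
    signal = sum(x * x for x in original[:n])
    if signal == 0:
        raise ValueError("original is empty or silent")
    error = sum((x - y) ** 2 for x, y in zip(original[:n], round_tripped[:n]))
    if error == 0:
        return float("-inf")  # the two are sample-for-sample identical
    return 10 * math.log10(error / signal)

# Toy data standing in for real files: a 1 kHz tone and a copy 0.1% quieter
tone = [math.sin(2 * math.pi * 1000 * i / 44100) for i in range(4410)]
quieter = [0.999 * x for x in tone]
print(round(residual_db(tone, quieter), 1))
```

With the toy data the residual sits around -60 dB, which also shows why level matching matters: a 0.1% gain error alone leaves that much behind, before any filter differences are counted.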
 
 
Quote:
  Ringing is technically not a flaw. The *ideal*, meaning technically perfectly correct filter rings indefinitely at the cutoff frequency. This is why having a cutoff higher than we can hear is important, and why non-minimum phase filters *in the audible range* can be problematic. I am well aware of how bad some SRCs are, but SoX (which was the program under question) isn't one of them. And why are you trying to expand into how DACs work at different rates? The whole point here was to get two sources of the same content to the same rate so they can be compared. I'm also well aware of what people "claim" to hear, which somehow suddenly disappears under certain situations (that are being well hashed to death in the other thread so we'll leave that be).

 
Feb 12, 2016 at 1:01 PM Post #827 of 1,344
To address your first comment - simply - if the original acoustic source didn't have any ringing, but the digital representation of it does, then the digital representation is NOT perfect - and the ringing added by the digital process is indeed a flaw.

 
But we're talking about reproduction here, and a properly reproduced band-limited signal will ring.
 
This isn't as far fetched as it sounds - because many DACs - especially older ones - actually do have significantly higher levels of various types of distortion at higher sample rates.

 
Yes which is why I told the question-asker to first check for IMD at the higher rates in his system. But that IMD would not be caused by SoX upsampling because the resampler will lowpass the upsampled file at the original bandlimit. That means if he compares the two versions and they are identical in the audible spectrum but sound different, then it is very likely distortion is affecting the playback of the higher-frequency original.
 
I would suggest that, if you want to establish reasonably that a given sample rate converter is "inaudible", a good start would be to take a 16/44k file, upsample it to 96k, then down-sample it back to 44k - using the same converter. If they sound identical, then, excluding the possibility that the conversions might introduce errors that cancel out, you will have at least established the possibility that the conversions are "perfect". To me, simply converting one to another, using an "arbitrarily good enough" conversion, then comparing the two on an "arbitrarily good enough" DAC and playback system, is leaving far too many variables not adequately controlled.

 
Which is exactly what I've done with SoX in blind testing with numerous resampler settings. I really don't get what the heck people are worrying about so much with this.
 
(I do apologize, to a degree, for "trying to pick you apart on scientific details" - but, at least to me, you seem to be getting dangerously close to the line of "I don't hear a difference - therefore there can't possibly be one". I personally don't have especially "good pitch", so I can't hear when a guitar is slightly out of tune... but I still can't rule out that many of the people I know who claim that they CAN hear when one is even slightly out of tune might still be telling the truth... )

 
I'm perfectly willing to believe that someone with unusually high frequency hearing can eke out stuff at 44.1, especially with test signals. But, referencing again the particular matter at hand, I think the answer to the question "why does this 192 version sound different than this 96 version" isn't particularly going to be screwed up by a decent resampler, since we're already talking about a cutoff near 48kHz.
 
Feb 12, 2016 at 1:52 PM Post #828 of 1,344
Worth pointing out that ringing actually only occurs in the transition band of a filter.  Other effects that look like ringing are really just bandwidth limiting.
 
So let us say we are using 96 kHz sampling.  Ringing is in the 40 kHz to 48 kHz area.  If we took a mic feed and added ringing at times at 40 kHz, why would you expect to hear that?
 
At 44.1 kHz the ringing would be just above 20 kHz.  But with musical sources it is only rarely of any substantial level.  While there are people who can hear to 22 or 23 kHz, their threshold at those frequencies is around or above 100 dB in loudness.
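As a rough illustration of where the ringing sits, the sketch below takes the impulse response of an idealised brick-wall low-pass and estimates the frequency of its oscillating tail by counting zero crossings. Placing the cutoff exactly at half the sample rate is an assumption made for simplicity (real filters start rolling off a bit lower, as described above), and the function names are invented for the example.

```python
import math

def ideal_lowpass(t, cutoff_hz):
    """Impulse response of an ideal (brick-wall) low-pass filter at time t."""
    if t == 0:
        return 2 * cutoff_hz
    return math.sin(2 * math.pi * cutoff_hz * t) / (math.pi * t)

def ringing_frequency(cutoff_hz, start=1e-4, duration=0.01, step=1e-6):
    """Estimate the oscillation frequency of the tail from its zero crossings."""
    crossings = 0
    steps = round(duration / step)
    previous = ideal_lowpass(start, cutoff_hz)
    for i in range(1, steps + 1):
        current = ideal_lowpass(start + i * step, cutoff_hz)
        if previous * current < 0:
            crossings += 1
        if current != 0:
            previous = current  # skip exact zeros so the sign change is still seen
    # Two zero crossings per cycle
    return crossings / (2 * duration)

# Cutoff assumed at half the sample rate for 44.1 kHz and 96 kHz material
for cutoff in (22050, 48000):
    print(cutoff, ringing_frequency(cutoff))
```

The estimate lands at the cutoff frequency itself, so for 96 kHz material the ringing is far above anything a listener could be expected to hear, and for 44.1 kHz material it sits right at the top edge of the band.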
 
Filters that are designed to prevent pre-ringing usually have other artefacts at 44.1 kHz, like altering the freq. response down in the audible range.
 
All of that ignores the fact that the ADC will have filtered out signals that would ring the DAC filter.  Yes, you can generate a pulse or squarewave that would ring a DAC filter, but you won't ever get that signal if the source was via an ADC.
 
Feb 12, 2016 at 4:14 PM Post #829 of 1,344
All agreed.... and, as I've said before, the reality is that the differences in PRODUCTION VALUES between different CDs, and even between different reissues of the same CD, usually FAR overshadow any of the tiny differences we're talking about here. (In other words, most modern CDs have such serious flaws, introduced deliberately or accidentally during recording and mastering, that the differences between most decent DACs are quite small by comparison. And, yes, sometimes it does start to feel like we're discussing which brand of $1000 a square foot museum glass to put over a cheap $50 print of the Mona Lisa.)
 

  Quote:
   
But we're talking about reproduction here, and a properly reproduced band-limited signal will ring.
 
 
Yes which is why I told the question-asker to first check for IMD at the higher rates in his system. But that IMD would not be caused by SoX upsampling because the resampler will lowpass the upsampled file at the original bandlimit. That means if he compares the two versions and they are identical in the audible spectrum but sound different, then it is very likely distortion is affecting the playback of the higher-frequency original.
 
 
Which is exactly what I've done with SoX in blind testing with numerous resampler settings. I really don't get what the heck people are worrying about so much with this.
 
I'm perfectly willing to believe that someone with unusually high frequency hearing can eke out stuff at 44.1, especially with test signals. But, referencing again the particular matter at hand, I think the answer to the question "why does this 192 version sound different than this 96 version" isn't particularly going to be screwed up by a decent resampler, since we're already talking about a cutoff near 48kHz.

 
Feb 13, 2016 at 10:35 AM Post #830 of 1,344
Here's a link to a VERY interesting website that offers a comparison of the technical accuracy of the sample rate conversion performed by a whole bunch of popular software programs. You'll notice that the various options on many of the programs often produce very different results; and that the results from many of the programs that don't offer a choice of options are also quite different (either by choice, or because they simply didn't write a very good filter algorithm). However, it makes it pretty clear that "just converting it in your favorite editor" is really a hit-and-miss process. Be sure to check out the impulse response and 1 kHz sideband spectra of each.

 
http://src.infinitewave.ca/

 
I used that site before and love it.  It's also why I never do SRC in anything other than SoX (almost always with VHQ linear phase).  
 
Some of the results for highly regarded software suites are surprising and somewhat inexplicable (if you can't do good SRC, just embed SoX).  Examples:
 
-ProTools HD 10.3.5: sweep & transition filters
-Adobe CS6: sweep, 1 kHz (damn!)
-Weiss Saracon: sweep (uh oh), transition (wth)
 
 
dBpoweramp has also gotten a lot better, apparently.
 
Feb 13, 2016 at 11:31 AM Post #832 of 1,344
   
Do notice, however, that Audition CS6 is completely fine, so they obviously know how to do it properly.

 
Yes, which boggles as to why the regular CS6 media encoder is so much worse.
 
Feb 13, 2016 at 12:11 PM Post #834 of 1,344
   
Because it's ostensibly a video encoder, would be my guess. It's not such a critical feature then, so can be made less memory and processor intensive with an (I assume) acceptable trade-off in quality

 
I guess, but with modern CPUs/OSes and the tiny bandwidth that audio represents vs video, I'm surprised there would even be a need to compromise.
 
Feb 13, 2016 at 1:23 PM Post #835 of 1,344
I'm not in any way experienced with video production, but I suspect software like that will rarely have to convert between 96 and 44.1kHz sample rates. Most camcorders and such don't go higher than 48kHz.
In a prosumer/professional scenario sound will usually be recorded separately, and then possibly at a higher sample rate, but in that case it will also be processed separately, before being added to the video in something like Premiere Pro.
 
Feb 19, 2016 at 9:54 AM Post #836 of 1,344
Yes, a properly band-limited signal will/must have some ringing (which some will insist is an "unavoidable flaw" of digital audio); and, yes, that ringing should theoretically be up at frequencies where it is inaudible. (There is also some theoretical minimum of ringing due to the math, but many encoders may well produce more than this because of their choices or compromises.) However, the exact characteristics of the ringing on playback are "negotiable". For one thing, some DACs have far less ringing than others (presumably because the others' filters add ringing, and again presumably because lengthened ringing is a trade-off against some other desirable benefit - like flatter phase response). In addition to that, the ringing can be "moved around". One popular thing to do is to virtually eliminate all pre-ringing (you can mathematically "push" all the pre-ringing until after the impulse; you get virtually no pre-ringing, but a post-ringing period that is twice as long). Since masking applies much less to events before the masking stimulus than it does to events after it, the logic is that post-ringing is less audible - even if there is more of it.
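The "pushing" of pre-ringing to after the impulse can be seen on a deliberately tiny toy filter. The sketch below builds two 5-tap FIR filters from the same 3-tap prototype: one with symmetric taps (linear phase) and one with all its zeros inside the unit circle (minimum phase). The prototype coefficients are arbitrary values chosen for illustration, not taken from any real DAC, and a real reconstruction filter would have far more taps; the point is only that the two share one magnitude response while arranging their energy differently in time.

```python
import cmath
import math

def convolve(a, b):
    """Plain FIR convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def magnitude(taps, omega):
    """|H(e^jw)| of an FIR filter at normalised frequency omega (rad/sample)."""
    return abs(sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(taps)))

def energy_before_peak(taps):
    """Fraction of the total energy that arrives before the largest tap."""
    peak = max(range(len(taps)), key=lambda n: abs(taps[n]))
    total = sum(h * h for h in taps)
    return sum(h * h for h in taps[:peak]) / total

# Toy prototype: both zeros of 1 - 0.4 z^-1 + 0.3 z^-2 lie inside the unit
# circle (|z|^2 = 0.3), so it is minimum phase. Values are illustrative only.
prototype = [1.0, -0.4, 0.3]

minimum_phase = convolve(prototype, prototype)       # zeros stay inside the circle
linear_phase = convolve(prototype, prototype[::-1])  # symmetric taps, peak in the middle

print("linear phase :", linear_phase, energy_before_peak(linear_phase))
print("minimum phase:", minimum_phase, energy_before_peak(minimum_phase))
print("same magnitude at pi/3:",
      math.isclose(magnitude(minimum_phase, math.pi / 3),
                   magnitude(linear_phase, math.pi / 3)))
```

In this toy case the symmetric filter has two taps of ripple on each side of its peak, while the minimum-phase one starts with its largest tap and has all four remaining taps after it, which is the same "no pre-ringing, twice as much post-ringing" trade described above, just on a very small scale.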
 
Since the effect of doing so is only relevant to transients, it doesn't theoretically affect measured steady-state frequency response. (But, the specific implementations of that feature on many DACs do in fact affect the frequency response, which could be what people hear in many cases.) Now, without getting into an argument about whether that ringing should be audible, the fact is that many people are quite convinced that it is. And, yes, our little Ego DACs have multiple filters, which differ only in where they place the ringing, and, yes, at least to me and many other people, they do sound audibly different - very slightly - with certain source material. Dolby Labs and several other major vendors have also incorporated the option to select different filter types in their recent encoder products.
 
(I can't speak for others, but I can state that most of us here do in fact hear slight differences between the filters - at least with some content and some playback devices. However, we haven't run any careful double-blind tests; for a very simple reason... Being able to choose between multiple filters is a feature that our customers like, and it costs virtually nothing to add that feature to a DAC, so we have no specific reason to "validate" whether those differences are in fact due to other factors... or even whether they exist. It's quite possible that many other vendors feel the same way.)
 
I can also think of an endless string of scenarios where ringing COULD affect things that are audible. For one thing, there is the possibility of increased IMD somewhere - like you mentioned. For another, most tweeters exhibit some mechanical ringing; perhaps having ringing energy at inaudible frequencies causes some tweeters to ring, or causes them to ring longer once they start ringing - by adding energy. Ideas like this would have to be tested individually, on a case-by-case basis, in order to be proven to occur - or not. (However, by and large, it's not unreasonable to suggest that moving energy around at frequencies near those that are audible might cause something audible to occur.)  
 
(And you also can't ignore the fact that many DACs do in fact have rather different distortion specs at 96k than at 192k - for whatever design reasons.)
 
In short, I do agree with you that a well-implemented SRC should produce, at worst, less of an audible difference than many other likely factors.
 
Quote:
Originally Posted by RRod
 
 
But we're talking about reproduction here, and a properly reproduced band-limited signal will ring.
 
 
Yes which is why I told the question-asker to first check for IMD at the higher rates in his system. But that IMD would not be caused by SoX upsampling because the resampler will lowpass the upsampled file at the original bandlimit. That means if he compares the two versions and they are identical in the audible spectrum but sound different, then it is very likely distortion is affecting the playback of the higher-frequency original.
 
 
Which is exactly what I've done with SoX in blind testing with numerous resampler settings. I really don't get what the heck people are worrying about so much with this.
 
I'm perfectly willing to believe that someone with unusually high frequency hearing can eke out stuff at 44.1, especially with test signals. But, referencing again the particular matter at hand, I think the answer to the question "why does this 192 version sound different than this 96 version" isn't particularly going to be screwed up by a decent resampler, since we're already talking about a cutoff near 48kHz.

 
Feb 19, 2016 at 10:32 AM Post #837 of 1,344
One popular thing to do is to virtually eliminate all pre-ringing (you can mathematically "push" all the pre-ringing until after the impulse; you get virtually no pre-ringing, but a post-ringing period that is twice as long). Since masking applies much less to events before the masking stimulus than it does to events after it, the logic is that post-ringing is less audible - even if there is more of it.

 
Yes you can have minimum or intermediate phase filters, but if the ringing they are moving about is inaudible then masking is a bit irrelevant.
 
(I can't speak for others, but I can state that most of us here do in fact hear slight differences between the filters - at least with some content and some playback devices. However, we haven't run any careful double-blind tests; for a very simple reason... Being able to choose between multiple filters is a feature that our customers like, and it costs virtually nothing to add that feature to a DAC, so we have no specific reason to "validate" whether those differences are in fact due to other factors... or even whether they exist. It's quite possible that many other vendors feel the same way.)

 
Striving to completely eliminate ringing can lead one to do things like use filters that roll off quite a bit before 20kHz, which could certainly lead to audible differences. Of course, people will naturally want to attribute these audible differences to "less time smearing" instead of "worse frequency response."
 
 
  I can also think of an endless string of scenarios where ringing COULD affect things that are audible. For one thing, there is the possibility of increased IMD somewhere - like you mentioned. For another, most tweeters exhibit some mechanical ringing; perhaps having ringing energy at inaudible frequencies causes some tweeters to ring, or causes them to ring longer once they start ringing - by adding energy. Ideas like this would have to be tested individually, on a case-by-case basis, in order to be proven to occur - or not. (However, by and large, it's not unreasonable to suggest that moving energy around at frequencies near those that are audible might cause something audible to occur.)  
 
(And you also can't ignore the fact that many DACs do in fact have rather different distortion specs at 96k than at 192k - for whatever design reasons.)

 
This is indeed another avenue for audibility, but again we're in the frequency domain and not the time domain where people seem to want to assign improvements. Certainly transducers are the weak end of the equation and one shouldn't do things to them like have a constant ringing frequency or, for that matter, send them a bunch of ultrasonic frequency content that they can't actually deal with. Funny that when we're talking about filtering for lowly 22050Hz, distortion associated with ringing is a horrible thing, but when we're talking about hi-res music sounding "better", distortion from ultrasonics is suddenly a good thing (for many people on this forum, not necessarily yourself).
 
Having had equipment that sucked it up at 192, I must agree with your last point. But again, this makes me wonder why we're even bothering with 192, let alone 384 which some people seem to crave.
 
Feb 19, 2016 at 10:34 AM Post #838 of 1,344
 
Having had equipment that sucked it up at 192, I must agree with your last point. But again, this makes me wonder why we're even bothering with 192, let alone 384 which some people seem to crave.

 
Well, I would say that there are plenty in the audio engineering world who think 192 is stupid, and 384 doubly so.  These are driven by sales/marketing needs.
 
Feb 19, 2016 at 10:42 AM Post #839 of 1,344
   
Well, I would say that there are plenty in the audio engineering world who think 192 is stupid, and 384 doubly so.  These are driven by sales/marketing needs.

 
Yeah, "makes me wonder" was the wrong word choice. It's all about them benjamins.
 
Feb 19, 2016 at 11:43 AM Post #840 of 1,344
There is one thing that I wonder about on your first point: Can a tone which is itself inaudible still mask a tone which IS audible? Specifically, could a loud tone at 22 kHz, which is itself inaudible, affect your ear in such a way that it would reduce your ability to hear high frequencies that are audible? (Or, more in the realm of what might be audible, could ringing at 22 kHz make a cymbal sound dull because its presence is reducing the sensitivity of your ear to harmonics which are audible? This might make sense of the claim by some people that, with some DACs, the "reverberant tail" of some instruments is altered or reduced.)
 
(Note that I'm not specifically suggesting that this is in fact occurring; however, it seems possible, and I'm not aware of anyone ever actually testing it.)
 
And, yes, you're absolutely right about your second point. While it should theoretically be possible to "rearrange ringing" without altering the frequency response significantly, the "apodizing filters" used by many DACs to do so seem to often cause a serious high frequency roll off (on one DAC I had it was -3 dB at 20 kHz with that filter engaged; and flat without it). Not only did it produce significant audible differences, but they were very much in line with how people often describe "the benefits" of such a filter.
 
On your final point, I'm inclined to agree with you. However, I also suggest that the question is very different depending on whether you approach it as a music producer, a music seller, or a consumer.
 
1) As a producer, the question is of whether the 192k version is inherently audibly superior to the 96k version (or has other benefits).
 
2) As a seller, plain and simple, the question is whether you can charge more for it (or use it to justify selling someone another copy of an album they already have).
 
3) And, as a customer, the question is whether YOU benefit. For example, if an album was mastered at 192k, then the 192k version is presumably "the first generation". Therefore, the 96k version will have gone through a conversion (which opens up the possibility that it lost some quality in the conversion process). There is also the possibility that the producer has altered the file in other ways as well; either to deliberately force the 96k version to sound inferior to the 192k "premium" version, or because he honestly feels that "the guys who buy 192k files are looking for a different sound". (It is a well known fact that CD reissues of vinyl albums were often deliberately made to sound different to meet the expectations of their customers.)
 
I tend to base my purchasing decisions on that third point. When an album is reissued by someone like HDtracks, or a major studio, they often do actually re-mix and re-master the audio, so the new version is often (not always) audibly and significantly different from the previous releases. In those situations, I fully expect that the highest resolution version they offer will be the first generation; and I also expect that, if they do any additional tweaking after the conversion, it will be the best sounding version (remember that this could be simply because it's one conversion closer to the "master", or because they deliberately made it different). However, either way, I assume that, IF there is a difference, the 192k one will probably be the better version. And, if I do purchase that version, I then have no particular reason to convert it to a lower sample rate afterwards (because space is cheap, converting takes effort, and the conversion just might compromise the quality).
 
So it really depends on who you mean by "we". It's pretty obvious that at least my second point is true; people WILL buy it, which is all the justification anybody needs to produce and sell it. (There are also more abstract reasons; for example, there's a reason why dishwasher detergent packets come in "regular", "extra strength", and "ultimate" - even though they may turn out to have the same ingredients... the reason is that people prefer variety, and so people are more likely to purchase a brand that offers them options. And, when offered three options, many people will avoid the lowest and the highest, and buy the middle one - based on factors that have little to do with actual differences between the choices - and even if the lowest option is perfectly suited to their needs.)
 
 
Quote:
   
Yes you can have minimum or intermediate phase filters, but if the ringing they are moving about is inaudible then masking is a bit irrelevant.
 
 
Striving to completely eliminate ringing can lead one to do things like use filters that roll off quite a bit before 20kHz, which could certainly lead to audible differences. Of course, people will naturally want to attribute these audible differences to "less time smearing" instead of "worse frequency response."
 
 
 
This is indeed another avenue for audibility, but again we're in the frequency domain and not the time domain where people seem to want to assign improvements. Certainly transducers are the weak end of the equation and one shouldn't do things to them like have a constant ringing frequency or, for that matter, send them a bunch of ultrasonic frequency content that they can't actually deal with. Funny that when we're talking about filtering for lowly 22050Hz, distortion associated with ringing is a horrible thing, but when we're talking about hi-res music sounding "better", distortion from ultrasonics is suddenly a good thing (for many people on this forum, not necessarily yourself).
 
Having had equipment that sucked it up at 192, I must agree with your last point. But again, this makes me wonder why we're even bothering with 192, let alone 384 which some people seem to crave.

 
