To crossfeed or not to crossfeed? That is the question...
Dec 10, 2017 at 4:58 PM Post #436 of 2,146
I agree completely, and that’s why I think future acquisition of HRTFs via biometrics has an edge over PRIRs. Anyway, Smyth Research Realiser PRIRs cannot be compared with Dolby Headphone DH-1 at all, because the former are literally that (personalised room impulse responses) while the latter relied on generic BRIRs/HRIRs. Chances are the lower adoption of Dolby Headphone DH-1 had more to do with the lack of personalisation than with the choice of mastering rooms in which the generic BRIRs/HRIRs were acquired.



You’ve got me there, particularly bearing in mind the efficiency of the interpolation algorithm. You have to hear it for yourself. Nevertheless, here is what Professor Smyth says about your apprehension:

Oh, I'm certainly not trying to pass judgment on the Realiser based on my experience with the Dolby. I was just mentioning it in terms of reference simulations having less overall spatial processing. The Dolby algorithm isn't necessarily bad (I actually use it occasionally), but I would assume the Realiser is more natural. It's something I've been curious about trying, and I cannot make any subjective claims until I do so.

I have been doing that with the layman's multimedia guide to immersive sound for the technically minded. Thank god, because there were a lot of errors (and probably still are). Please help me find them if you have some free time.

Thanks for letting me know! I didn't see anything wrong, but I will mention any errors I find. I'm horrible with editing my own words. I only see what I meant to say, not what's actually there. Amazes me what gets by sometimes.
 
Dec 10, 2017 at 6:57 PM Post #437 of 2,146
There is a lot of relevance in this comment. I like some crossfeed; I think it reduces fatigue by making things sound more natural acoustically. However, when people start talking about Smyth Realisers, room-size algorithms, HRTFs, etc., I have to admit that there is a part of me that becomes reactionary. It reminds me of VR, and I see a direct effect of gaming technology making its way into music listening through devices such as the Smyth Realiser. In gaming, the emphasis on spatial perception is important for either immersion or competition. Many of the headphone advancements like HRTF functions or 3D tracking came from that field. I think they are innovative solutions, but I do not understand why it is being assumed that an ideal headphone experience is a mimic of speakers, and that in order to experience "reference" sound on headphones we must first artificially simulate the interaction of sound in a space that was never really there. It is precisely this lack of space that gives headphones their own character, with both its positives and negatives. You get a presentation isolated from ambient/acoustic effects, which can be beneficial. It's an intimate and detailed presentation. When that gets traded for a sense of synthetic space, like wearing a TrackIR in a 3D game, the whole experience seems less authentic to me. Neither headphones nor speakers, but a Frankenstein in between. For me, it not only fails to suspend my disbelief, it puts me right in the uncanny valley!

I still maintain that I like some crossfeed, but I don't know how I feel about replicating speakers to the full extent with HRTFs, head trackers, binaural mics, etc. That is my personal preference though, and if others wish to pursue this technology I encourage them and will stay informed of their progress. I do not, however, think that should become the standard of headphone listening.
I'll answer for myself. The album was mastered using speakers, so I consider them to be the first meaningful reference. Ideally I would want the sound of the speakers in the very studio where the mastering was done, while sitting where the engineer sat. That is my own idea of the sound as the artist intended it, not the often poor sound coming out of giant speakers at a live concert. A live event is the true sound of the band playing in front of me, but it is not what I wish to replicate, because I don't really like that, and also because it is typically impossible with a mixed and mastered album. So I aim for the next best thing: the sound as the engineer heard it when doing the mastering in the studio. I don't really get that either, but I try to get close to it, and it starts with speaker sound.
I do believe that one day (soon) it will be a reality and albums will have the data (one way or another) to use on our devices and blend the result with our own HRTF. And maybe when that tech is everywhere, making an album will change too, and released albums will become the sound someone in a VIP seat heard at the live event in some glorious room without spectators, or the sound as if you're next to the singer (although I don't think that would sound great). With the potential for good mimicry of a given space or given speakers comes all the potential to produce differently. So you should, IMO, see the Realiser as a foot in the door of future audio.
For now, if I can get on headphones at night the sound I get from my speakers during the day, that would already be mighty cool. Anything beyond that will be a bonus to me. ^_^
 
Dec 10, 2017 at 7:00 PM Post #438 of 2,146
@castleofargh, is it possible to edit an original post indefinitely, or do editing capabilities get locked after a certain point?
I have no idea ^_^. But a great many topics are edited on a regular basis, so if there is a limit I'd expect it to be really high. Do whatever you wish without worry.
 
Dec 11, 2017 at 2:17 AM Post #439 of 2,146
1a. Aliasing of what? There is nothing above 22.05kHz if you're feeding 16/44.1, upsampling does NOT magically recreate those frequencies ALREADY removed above the Nyquist point. Same with bit depth: If we've got a 16bit file and convert it to 24bits it does NOT magically generate data for those extra 8 bits, all it does is fill/pad those 8 bits with zeros!
1b. The quantisation/round-off error is ALWAYS in the LSB of the plugin format, which is 64bit float in many cases, 32bit float in others.
1c. Sample rate has nothing to do with phase.

2. That's of course nonsense! How does 1.25343 x 1.54789 give a less accurate result than 1.2534300000000 x 1.54789?
2a. Due to the above, your assertion obviously has nothing to do with why plugins upsample! There are 3 potential reasons plugins upsample: 1. The plugin is using some non-linear process which generates content above 22.05kHz; typically something like an analogue-modelled compressor will do this in order to generate IMD in the audible band. 2. It might be more practical for a plugin to operate at a single sample rate and up/down sample its input to match; some convolution reverbs do this, for example. 3. It could be purely marketing, to fool newbs who are gullible enough to believe that a higher sample rate must be better because it's a bigger number!
2b. DAWs do not operate at a higher sample rate! If you record in 44.1kHz, they operate at 44.1kHz. Their internal mix environment is commonly 64bit float, some are 32bit float or in some older DAWs it's 48bit fixed.

3. All of this is irrelevant nonsense!! Let's take your 1.25343 as our 16bit value and convert it to 24bit, so now we have something like 1.2534300000000. What happens if we were to feed those two values into a 64bit plugin? Our 16bit value gets padded with a whole bunch of zeros to create a 64bit word so that our 64bit plugin can actually process it, so now we have:
1.253430000000000... On the other hand, our 24bit word gets padded with a whole bunch of zeros to create a 64bit word so that our 64bit plugin can actually process it, so now we have:
1.253430000000000... In both cases we've got 1.25343 followed by exactly the same number of zeros, so WHAT'S THE DIFFERENCE??
The result of all the internal calculations of the plugin is also a 64bit float (because it's a 64bit plugin!). The quantisation error is in the LSB of that 64bit result (because it's a 64bit plugin!). The output of the plugin when it's finished all its calculations is also a 64bit float (because it's a 64bit plugin!), which either stays as a 64bit float if the data paths between plugins are 64bit, or gets truncated to 32bit if that's the width of the data path.
The difference between a 16bit word or a 16bit word padded to 24bits is LITERALLY zero (or 8 zeros if you want to be really precise about it) and once input into a 64bit plugin even the number of trailing zeros is the same!!! The only way your examples and statements would make any sense would be if feeding a 16bit word to a 64bit plugin somehow magically changed all the plugin's internal coding/processing and turned it into a 16bit plugin, while feeding it a 24bit word magically turned it into a 24bit plugin. That's of course nonsense, all that happens is that those 16 or 24 bit words are padded with zeros to 64bit floats and that 64bit plugin is always a 64bit plugin!
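Here is a minimal numpy sketch of that zero-padding point (illustrative values only, assuming the usual ±1.0 float scaling a plugin works in):

```python
import numpy as np

x16 = np.int16(12345)                     # a 16-bit PCM sample
x24 = np.int32(x16) << 8                  # the same sample zero-padded to 24 bits

# normalise both to the +/-1.0 range a float plugin works in
f_from_16 = np.float64(x16) / 2**15
f_from_24 = np.float64(x24) / 2**23

print(f_from_16 == f_from_24)                        # True: identical 64-bit floats
print(f_from_16 * 1.54789 == f_from_24 * 1.54789)    # True: identical "plugin" maths
```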

4. What's your DAC got to do with it? You are talking about the precision of plugin processing, not whether or not your DAC is incompetently designed, which is a completely different issue!

5. Ah, it seems like the suggestion in my previous post was incorrect. Instead, try a book which explains the very basics of digital first, and then you might correctly understand what's in Bob's book!

G

Please do not ascribe to me primitive ideas, as if I believe that converting the word length from 16 bit to 24 bit "recreates" the extra 8 bits (or that upsampling "recreates" some frequencies). I may have some misconceptions, but none as stupid as that.

Let's talk about the word length (bit depth) first.

I do understand that these extra bits will be "padded" with zeros. But only initially, because, as we apply more and more processes to the audio, these zeros will quickly be replaced with figures other than zeros. After several processes, even a 64 bit word length will probably not be long enough to accurately represent the full result of all the computations without rounding it off.

Do you agree that (from the technical, mathematical point of view - let's forget for now the debate whether we can hear it or not) after 16-bit audio is heavily processed by 32-bit or 64-bit plugins, the 24-bit representation of their final computational result will be more accurate/complete than the 16-bit representation? If so, do you agree that it's better to send out to the DAC the signal in 24-bit format rather than 16-bit?
 
Last edited:
Dec 11, 2017 at 2:29 AM Post #440 of 2,146
When making music I "render" the raw tracks and often crossfeed first and then add reverberation. The "direct sound" is strongly crossfed while the reverberation contains greater ILD. Summing these together ensures that the ILD levels stay low enough. Works nicely imo.

Why do you think it's better to inject reverberations after the crossfeed? I tried both ways. Reverb after the crossfeed sounded worse to me.

In theory, which way is more correct?

What if we inject a bit of reverb before the crossfeed AND a bit after?
 
Dec 11, 2017 at 2:41 AM Post #441 of 2,146
Ok, I think I'll start one up. I think I'd enjoy maintaining a thread like that, but I'll need help from different people on different platforms and players. I could update the original post with links to their posts or other threads. I will type up some kind of an intro to start with, and maybe a few basic links. It'd be nice to have a single repository of reviews, links, and help. Nobody can find all the DSPs out there on their own. Too many.

@catleofargh, is it possible to edit an original post indefinitely or do editing capabilities get locked out after a certain point?
Awesome! Looks great thus far. May I suggest that you QUICKLY reserve two or three spots right under your initial opening post (by replying to your post two or three times and entering the following into each: "RESERVED FOR FUTURE DATA STICKIES").

If I may here are a few suggestions:

Add "Components" to Plugins (e.g. PLUGINS & COMPONENTS)

Add "Samplers & Ditherers" under Plugins & Components​

Add "Emulators" under Plugins & Components (so that we can add tube saturators, reverb, pre-amps, and other emulation/saturation plugins)

Add "VST Chainer" under Plugins & Components (e.g. KVR Audio Art Teknika Console)​

Add a third section entitled: "ACTIVE DSP LIST ORDER"
I am extremely interested in understanding what should go before what, and why, as there seems to be conflicting information/opinion on this (e.g. I have read that volume control should always go before any limiter/clipper, that resamplers should go before crossfeed - resampler > crossfeed > volume > limiter - and that limiters should always be last in the chain). Should equalizers, saturators, and/or pre-amps go before or after crossfeed?


Add this alternative VST Adapter to your list:
Yohng Foobar2000 VST Wrapper
I found that the standard VST 2.4 adapter sometimes acts up with some VST dll plugins in WIN10 (e.g. SlickEQ and Voxengo Marvel GEQ constantly crash and/or throw errors in the VST 2.4 adapter on WIN10), but they work perfectly via George Yohng's VST Wrapper, which is very easy to install and use.
 
Last edited:
Dec 11, 2017 at 5:04 AM Post #442 of 2,146
1. Why do you think it's better to inject reverberations after the crossfeed? I tried both ways. Reverb after the crossfeed sounded worse to me.

2. In theory, which way is more correct?

3. What if we inject a bit of reverb before the crossfeed AND a bit after?

1. With speakers you have direct sound, early reflections and reverberation. Putting reverb after crossfeed makes the "direct" sound and the reverb sound a bit different. Reverb not being crossfed leaves potential spatial distortion, but when you have dozens of audio tracks they mask each other a bit: the spatial distortion of a mix is less than the spatial distortion of the individual tracks. I tend to crossfeed the direct sound heavily; it's not just "crossfeed", but ILD/ITD based panning. I also "shape" the ILD and keep it low at the lowest frequencies. It's a working strategy that suits me. I'm not a guru on this; I learn all the time while making music. I usually duplicate the raw track, which may contain some reverb already. The first copy is the direct sound and gets perhaps a floor-reflection simulation (increases realism) and ILD/ITD panning (crossfeed), while the second copy gets reverb and possibly ILD reduction at bass if needed. Then I set proper levels for the tracks and mix them together.

2. Hard to say; it's a matter of taste, and what sounds best is most correct. As reverberation is a diffuse sound field, every single reflection should theoretically be crossfed according to its angle of arrival, which makes things insanely complicated. You could have "partial" reverbs that you crossfeed differently and mix it all together. Whether it's worth it, I don't know.

3. Probably works nicely. Haven't tried. Good idea.
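A rough numpy sketch of the two-path routing described in 1, with made-up parameter values (the ILD, ITD and reverb settings here are purely illustrative, not anyone's actual mixing settings):

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 44100
mono = np.random.randn(fs)                       # stand-in for one second of a raw mono track

def ild_itd_pan(x, ild_db=4.0, itd_ms=0.3):
    """Pan a mono signal to the left with a level difference and a small delay (toy values)."""
    itd = int(round(itd_ms * 1e-3 * fs))
    near = np.concatenate([x, np.zeros(itd)])                          # left ear
    far = 10 ** (-ild_db / 20) * np.concatenate([np.zeros(itd), x])    # right ear: quieter, later
    return np.stack([near, far], axis=1)

def toy_reverb(x, t60=0.4):
    """Very crude reverb: convolve with exponentially decaying noise, same in both ears."""
    n = int(t60 * fs)
    ir = np.random.randn(n) * np.exp(-6.9 * np.arange(n) / n)
    wet = 0.05 * fftconvolve(x, ir)[: len(x)]
    return np.stack([wet, wet], axis=1)

direct = ild_itd_pan(mono)[: len(mono)]    # path 1: "direct" sound, crossfed/panned
reverb = toy_reverb(mono)                  # path 2: diffuse reverb, low ILD
out = direct + reverb                      # mix the two paths to stereo
```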
 
Dec 11, 2017 at 5:05 AM Post #443 of 2,146
[1] Please do not ascribe to me primitive ideas, as if I believe that converting the word length from 16 bit to 24 bit "recreates" the extra 8 bits (or that upsampling "recreates" some frequencies). I may have some misconceptions, but none as stupid as that.
[2] I do understand that these extra bits will be "padded" with zeros. But only initially, because, as we apply more and more processes to the audio, these zeros will quickly be replaced with figures other than zeros. [2a] After several processes, even a 64 bit word length will probably not be long enough to accurately represent the full result of all the computations without rounding it off.
[3] Do you agree that (from the technical, mathematical point of view - let's forget for now the debate whether we can hear it or not) after 16-bit audio is heavily processed by 32-bit or 64-bit plugins, the 24-bit representation of their final computational result will be more accurate/complete than the 16-bit representation? [3a] If so, do you agree that it's better to send out to the DAC the signal in 24-bit format rather than 16-bit?

1. There are only two choices: A. Ascribe to you those primitive ideas/stupid misconceptions, or B. Ascribe to you a decent basic understanding of the principles of plugin processing, in which case the only logical conclusion is that you were deliberately giving incorrect advice.

2. I don't get the "But only initially". You advised: "The first plugin should be a high-quality upsampler so that all further calculations are done with a higher precision and less quality loss." - The initial input into a subsequent plugin (and "further calculations") is the ONLY thing you are affecting with this advice. Whatever happens after this initial input into a 64bit plugin occurs in 64bit, NOT in 16, 24 or whatever other bit depth you input!
2a. This statement is also completely unrelated to whether or not you initially feed the plugin with a 16bit word or a 16bit word padded with zeros to 24bit. Even though it's unrelated, I'll answer it anyway, or rather, I've already answered it, twice! Yes, you will get quantisation error in the LSB of the 64bit result. The truth of your statement depends on what you mean by "accurately represent": if you mean in terms of pure mathematics, then no, 64bit would not be enough. If by "accurately represent" you are talking in terms of sound waves, which of course is the whole point of the mathematics in the first place, then yes, it's way more than enough. I'm not talking about audibility here but the sound waves themselves. A sound wave is the movement of billions of air molecules, and the quantisation error at the 64bit level represents an energy level significantly lower than that required to move billions of air molecules; therefore, it's unable to affect a sound wave. Obviously there's a cumulative effect of quantisation error, but you'd probably need many hundreds of 64bit plugins in series before you accumulated enough quantisation error energy to have any effect on a sound wave, and probably thousands for that effect to be potentially audible.
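A small sketch of how slowly 64bit float rounding error accumulates, using Python's exact rational arithmetic as the reference (the 500 gain values here are arbitrary stand-ins for "plugin" stages):

```python
from fractions import Fraction

gains = [1.001, 0.999, 1.0000003, 0.97, 1.03] * 100    # 500 "plugin" gain stages
x_float = 0.376739501953125                             # a 16-bit sample, already a float
x_exact = Fraction(x_float)                             # the same value, tracked exactly

for g in gains:
    x_float *= g                     # ordinary 64-bit float maths, rounds at every step
    x_exact *= Fraction(g)           # exact rational maths, never rounds

drift = abs(Fraction(x_float) - x_exact)
print(float(drift))                  # accumulated 64-bit rounding error, roughly 1e-15 here
print(2 ** -24)                      # one LSB of a 24-bit output (~6e-8), vastly larger
```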

3. Mathematically, yes. But "technically", it would entirely depend on what processing we're talking about and what we're processing (the input audio file/s). BTW, by "technical" I'm not talking about audibility but the technical practicalities of sound wave reproduction.
3a. Providing we're NOT talking about what's audible, then possibly. However, this is a question of the signal to noise ratios of an individual replay system and environment. Of course NONE of this is related to using a preliminary upsampling plugin, as we're now talking about the output of a plugin chain and therefore a bit depth defined by those plugins and by the processing environment in which they are employed (the bit depth of the data connections between plugins), NOT the bit depth of what we initially feed that processing chain!

Your advice to use an upsampler as the first plugin to improve the precision of subsequent plugins/calculations was incorrect. Also incorrect was your advised positioning of the dither plugin, which you now appear to have accepted. All the points above are unrelated to my refuting these parts of your advice.

G
 
Dec 11, 2017 at 8:19 AM Post #444 of 2,146
2. I don't get the "But only initially". You advised: "The first plugin should be a high-quality upsampler so that all further calculations are done with a higher precision and less quality loss." - The initial input into a subsequent plugin (and "further calculations") is the ONLY thing you are affecting with this advice. Whatever happens after this initial input into a 64bit plugin occurs in 64bit, NOT in 16, 24 or whatever other bit depth you input!

No, it's not the ONLY thing. My advice affects not only the bit depth at the input to the VST-chainer, but also the sample rate (which we agreed not to talk about for now). And, probably (this is what I am not sure about), it also affects the bit depth at the output of the VST-chainer.

After the dBpoweramp/SSRC upsampler in Foobar upsamples 44/16 audio to 176/32 (or 176/64? not sure about it), it passes the result to the VST-chainer in 32bit format. After the VST-chainer finishes its calculations it passes the result to Foobar (or to Foobar VST wrapper which links Foobar to VST-chainer? again, not sure) in 32bit format, right? Foobar truncates it to 24 bits and sends the data out to the DAC.

What would happen if I don't upsample and send the original 44/16 audio to the VST-chainer? I know that the extra bits will be padded with zeros at the input and the calculations will be done in 32bit, but what would happen at the output of the VST-chainer? Will the VST-chainer assume: "Ok, the signal was handed to me in 16bit format, I've done my job in 32bit format, but now I need to truncate and hand back the result in the same wordlength (16) as I had received it"?

Or will it disregard the initial bit depth (16) at the input and return the new, increased bit depth (32) at its output?
 
Dec 11, 2017 at 10:02 AM Post #445 of 2,146
[1] After the VST-chainer finishes its calculations it passes the result to Foobar (or to Foobar VST wrapper which links Foobar to VST-chainer? again, not sure) in 32bit format, right? [2] Foobar truncates it to 24 bits and sends the data out to the DAC.

[3] What would happen if I don't upsample and send the original 44/16 audio to the VST-chainer? I know that the extra bits will be padded with zeros at the input and the calculations will be done in 32bit, but what would happen at the output of the VST-chainer? Will the VST-chainer assume: "Ok, the signal was handed to me in 16bit format, I've done my job in 32bit format, but now I need to truncate and hand back the result in the same wordlength (16) as I had received it"?
[3a] Or will it disregard the initial bit depth (16) at the input and return the new, increased bit depth (32) at its output?

1. Either 32 or 64bit float, you'd have to refer to the manual or the developers for which, probably 32bit though.
2. That will depend: is Foobar 32 or 64bit? Is it outputting through its own driver or through, say, ASIO? If it's through its own driver it presumably has an output bit depth setting; otherwise it will output 32 bits to the driver.
3. It depends on the exact coding of the software. However, my previous answer also comes into play: if it has its own driver and provides an option for 16bit output and you select that option, it will output in that format. Otherwise it will assume a 32 or 64bit depth, most probably 32bit to maintain compatibility with virtually every DAW/audio software.
3a. This. Audio data within the PC environment is expected to be 32bit, even 16/24 bit data is encapsulated within 32bit frames so it would make no sense to truncate to 16 or even 24bit as it would just be padded with zeros again. Modern PC processors and data paths are 64bit but of course all the software and drivers in the chain need to be able to accept 64bit, if they don't they'll simply truncate to 32bit or fail/crash if they cannot recognise 64bit words.
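A quick numpy check of why re-truncating inside the PC buys nothing: carrying 16bit samples in float frames and taking them back out is a bit-exact round trip (this sketch assumes the common divide-by-32768 scaling):

```python
import numpy as np

pcm16 = np.arange(-32768, 32768).astype(np.int16)      # every possible 16-bit sample value
frames = pcm16.astype(np.float32) / 32768.0             # how hosts typically carry audio internally
restored = np.round(frames * 32768.0).astype(np.int16)

print(np.array_equal(pcm16, restored))                  # True: the float frames lose nothing
```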

G
 
Dec 11, 2017 at 11:22 AM Post #447 of 2,146
Awesome! Looks great thus far. May I suggest that you QUICKLY reserve two or three spots right under your initial opening post (by replying to your post two or three times and entering the following into each: "RESERVED FOR FUTURE DATA STICKIES").

If I may here are a few suggestions:

Add "Components" to Plugins (e.g. PLUGINS & COMPONENTS)

Add "Samplers & Ditherers" under Plugins & Components​

Add "Emulators" under Plugins & Components (so that we can add tube saturators, reverb, pre-amps, and other emulation/saturation plugins)

Add "VST Chainer" under Plugins & Components (e.g. KVR Audio Art Teknika Console)​

Add a third section entitled: "ACTIVE DSP LIST ORDER"
I am extremely interested in understanding what should go before what, and why, as there seems to be conflicting information/opinion on this (e.g. I have read that volume control should always go before any limiter/clipper, that resamplers should go before crossfeed - resampler > crossfeed > volume > limiter - and that limiters should always be last in the chain). Should equalizers, saturators, and/or pre-amps go before or after crossfeed?


Add this alternative VST Adapter to your list:
Yohng Foobar2000 VST Wrapper
I found that the standard VST 2.4 adapter sometimes acts up with some VST dll plugins in WIN10 (e.g. SlickEQ and Voxengo Marvel GEQ constantly crash and/or throw errors in the VST 2.4 adapter on WIN10), but they work perfectly via George Yohng's VST Wrapper, which is very easy to install and use.

Great ideas! Thank you very much for the suggestions Woodyluvr, this is exactly the kind of help I was hoping I'd receive! I'm kind of kicking myself because I didn't read your post while I still had the chance to reserve spots under the OP, but oh well I guess. I updated a couple of descriptions last night, am about to upload more screenshots, and will make the additions you mentioned. I will also change the current saturation heading to a general Emulators heading, and include more subsections under Emulators as they come. I'm also going to be adding a Normalization section for peak and loudness, thinking of you and replay gain and others using Soundcheck :wink:. Keep the suggestions coming. Thanks again.
 
Dec 11, 2017 at 11:40 AM Post #448 of 2,146
Great ideas! Thank you very much for the suggestions Woodyluvr, this is exactly the kind of help I was hoping I'd receive!... ...I'm also going to be adding a Normalization section for peak and loudness, thinking of you and replay gain and others using Soundcheck :wink:. Keep the suggestions coming. Thanks again.
You are very welcome. Looking forward to seeing that section, as well as more on crossfeed, as I am still uncertain how to set the Meier Crossfeed slider... floating between 6 and 20.
 
Dec 12, 2017 at 2:28 AM Post #449 of 2,146
This. Audio data within the PC environment is expected to be 32bit, even 16/24 bit data is encapsulated within 32bit frames so it would make no sense to truncate to 16 or even 24bit as it would just be padded with zeros again.

Ok.

If that is really so, then it is indeed unnecessary for the user to convert 16 bit audio to a higher bit depth, because not only will it be done automatically anyway, it will also be kept at the higher bit depth until the very output. (But it's still advisable to set Foobar's output setting to 24 bit or 32 bit, depending on how much your DAC can accept, because now, at the end of a processing chain, the lower bits - bits 17 to 24 - contain useful information. They are not zeros anymore.)

But here we need to come back to the issue of upsampling (changing the sample rate).

As I am still convinced that it's beneficial to upsample, let's say from 44 to 88 or 176, prior to feeding the audio data to the processing chain of plugins, note that we then end up with not only a higher sample rate but also a higher bit depth, because, as was shown above, the audio data at the output of the upsampler will be 176/32, not 176/16.

Aleksey Vaneev (the author of Voxengo VST plugins):

"Almost all types of audio processes benefit from an oversampling: probably, only gain adjustment, panning and convolution plug-ins have no real use for it. An oversampling helps plug-ins to create more precise filters with minimized warping at highest frequencies, to reduce aliasing artifacts in compressors and saturators, to improve a level detection precision in peak compressors." (quoting from here)

A quote from Bob Katz's book "Mastering Audio: The Art and the Science":

[attached image: scanned excerpt from the book]


Also, consider this argument: plugins oversample (optionally or automatically).
DACs also oversample or upsample the signal.

https://en.wikipedia.org/wiki/Oversampling:
"Oversampling improves resolution, reduces noise and helps avoid aliasing and phase distortion by relaxing anti-aliasing filter performance requirements."

So, since plugins/DACs upsample/oversample anyway, why don't we help them do this work (fully or at least partially) by upsampling the signal first ourselves with a high-quality upsampler such as the dBpoweramp/SSRC resampler available in Foobar? I don't think the quality of the oversampling/upsampling inside plugins is superior to a dedicated upsampler. I am sure dBpoweramp/SSRC can increase the sample rate with a better result - measurements show it to be the best among similar upsamplers.

So our signal, even without an upsampler in Foobar, starts out at a modest 44/16 and in any case undergoes these changes on its way:

Step A (audio file): 44/16
Step B (VST plugins - let's assume they oversample everything to 176): 176/32
Step C (at the Foobar output limited by the DAC driver, and at the DAC input): 44/24
Step D (inside the DAC - let's assume it oversamples all incoming signals to 352): 352/32

So, if our signal ends up as 352/32 in the DAC, why would it not be beneficial to insert our highest-quality upsampler between steps A and B to upsample the signal first to 88 or 176?

Please note that there is downsampling happening from step B to step C (if we don't upsample before the VST chain).

But if we upsample before our VST chain, then step C looks like this:

Step C (foobar output, limited by the DAC driver, and at the DAC input): 176/24

If we don't upsample, then the VST plugin has to do the full job, i.e. oversample 4X.
If we upsample to 88, then the VST plugin has to do only half of its job, i.e. oversample only 2X.
If we upsample to 176, then the VST plugin does not have to oversample at all.

What's wrong with my logic?
 
Last edited:
Dec 12, 2017 at 7:43 AM Post #450 of 2,146
But it's still advisable to set Foobar's output setting to 24 bit or 32 bit, depending on how much your DAC can accept, because now, at the end of a processing chain, the lower bits - bits 17 to 24 - contain useful information. They are not zeros anymore.

No, they're not zeros anymore; in almost all cases they're reasonably random zeros and ones, i.e. noise. Also, don't forget that your DAC cannot resolve the last four bits or so, so really we're talking about what's in bits 16-20, assuming there is something "useful" there in the first place. If there is, then it's a question of what your amp, transducers (speakers/HP) and listening environment combined are capable of, and then finally, if there is still anything "useful" actually being produced, whether your ears are capable of hearing it. With noise-shaped dither we can achieve the equivalent of 20bit performance with 16bit anyway, and in the vast majority of cases most people can't hear the artefacts of just truncating 32 or 24bit to 16bit.

However, if you're doing this self-administered plugin processing as the end user, then you might as well not apply dither and just output the 32bit from Foobar. Unless Foobar is directly outputting to your DAC, which I assume it isn't (it's most likely outputting to a driver), there's no advantage (or disadvantage) to outputting 24bit from Foobar, so you might as well leave it at 32bit.
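For illustration, here is a small numpy sketch of the dither point above, using plain (flat) TPDF dither rather than the noise-shaped kind: a tone quieter than one 16bit step vanishes under straight rounding but survives, buried in benign noise, once dither is added first (the test values are arbitrary):

```python
import numpy as np

fs, f = 44100, 1000.0
t = np.arange(fs) / fs
x = 0.4 * np.sin(2 * np.pi * f * t)        # a 1 kHz tone only 0.4 LSB high (16-bit LSB units)

rng = np.random.default_rng(0)
plain = np.round(x)                                                    # straight requantisation
tpdf = rng.uniform(-0.5, 0.5, fs) + rng.uniform(-0.5, 0.5, fs)         # triangular dither, +/-1 LSB
dithered = np.round(x + tpdf)

def tone_level(sig):
    """Amplitude of the 1 kHz component, recovered by correlating with the sine."""
    return 2 * np.dot(sig, np.sin(2 * np.pi * f * t)) / len(sig)

print(tone_level(plain))       # ~0: the quiet tone is simply erased
print(tone_level(dithered))    # ~0.4: the tone survives, sitting under a noise floor
```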

What's wrong with my logic?

Several things:

1. You appear to be confusing two different things here: anti-alias and anti-imaging filters. To comply with Nyquist, we have to remove all freqs above the Nyquist point to avoid aliasing. With oversampling our Nyquist point is far higher, which allows a much simpler, cheaper and less damaging analogue filter and also spreads the dither noise over a wider band, much of which is then discarded when a secondary (digital) decimation filter is applied. This is why pro ADCs oversample into the multi-MHz range, commonly somewhere around 22MHz. However, none of this is applicable in our case, because what we're dealing with has already been anti-alias filtered and there is nothing above the 22.05kHz Nyquist point of our 44.1/16 input file/signal!
1a. What Bob is talking about was absolutely true, for a number of years. I sometimes used to run sessions at 96kHz as many plugins operated audibly better at that rate: many/most soft-synths, compressors, some EQs and various others. There is now (and has been for quite a few years) no benefit to this, as the processing power available to plugin developers has increased significantly and the coding is more sophisticated. Most plugins no longer benefit from a higher sample rate, and those few which do (such as non-linear analogue modelling plugins) now simply up/down sample where necessary and can apply better filters when doing so. I no longer run sessions higher than 44.1 or 48kHz for audio quality reasons; I do so only if a client requires delivery at a higher sample rate.

2. In your "Step B", if we were to make that assumption, then what you've asserted might have some merit. However, you cannot make that assumption! Those plugins which do legitimately upsample would do so at x2 (either 88.2 or 96kHz) as any theoretical benefits of filters and any non-linear processes are perfectly addressed by those sample rates and going higher is just a waste of processing and potentially less accurate. Those plugins with a fixed sample rate (such as some convolution reverbs and some soft-synths/samplers for example) tend to operate at 96kHz. I know of no plugins which upsample to 176.4kHz legitimately and by "legitimately" I mean processors which upsample to that rate for any reason other than marketing. The only exception to this would be plugin processors designed to deal with intersample peaks, such as true-peak limiters, although some upsample well beyond 176kHz.

3. Your last paragraph: if we don't upsample to 176 and the plugins do automatically upsample to 176, then yes, there would be fewer applications of anti-alias filters. BUT:
A. If we're feeding the plugin a signal with no content above 22.05 and if the processor is not creating any content above that freq, then we're applying a smoother, higher frequency filter where there is no frequency content anyway, to a signal which already has (whatever) filter artefacts from being filtered to 22.05kHz to start with.
B. All plugins do not automatically upsample to 176. In the case of a plugin with a fixed sample rate of say 96kHz then: You upsample to 176, adding a filter in the process. The plugin downsamples, processes and upsamples again, adding two more filters to the process. That's the application of 3 filters where if you'd just fed the plugin 44.1 to start with, there would only have been two filters applied.
C. If the plugin does not upsample (and there's no reason to, in most cases) then: You're upsampling for no benefit, adding an unnecessary processing step, an additional filter and risking lower precision by operating at a higher than optimal sample rate.
D. If we were to upsample to 88.2, we'd be adding a filter. If the plugin is operating at 176.4, then it has to upsample from 88.2 to 176.4 and add another filter, then downsample to 88.2 and add another filter. If we'd just fed the plugin 44.1 to start with there would only be two filters applied, rather than three.

4. Your fear of up/down sampling and the effects of applying filters to this process is unwarranted. The filters in plugins today are far superior to those of 15 or so years ago and are audibly transparent.

5. In the case of a DAC oversampling to say 352kHz, then going from 44.1kHz to 352kHz is theoretically better than going from 44.1 to 176 and then from 176 to 352. It's one less processing step and filter application, the same as 3d above.
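A quick scipy sketch of that filter-counting point: resampling a band-limited test signal to 352.8kHz in one hop versus via an intermediate 176.4kHz stage, then measuring the residue between the two paths (the test tones are arbitrary, and this says nothing about audibility by itself):

```python
import numpy as np
from scipy.signal import resample_poly

fs = 44100
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t) + 0.3 * np.sin(2 * np.pi * 9000 * t)   # band-limited test signal

one_hop = resample_poly(x, 8, 1)                          # 44.1k -> 352.8k in a single step
two_hop = resample_poly(resample_poly(x, 4, 1), 2, 1)     # 44.1k -> 176.4k -> 352.8k

residue = one_hop - two_hop
print(20 * np.log10(np.max(np.abs(residue)) / np.max(np.abs(one_hop))))  # dB difference between paths
```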

None of the above is absolutely set in stone, as of course it all depends on the skill/effort of the plugin developer.

G
 
