Neutrality is boring - is there something more going on, and can upsampling help?
May 5, 2015 at 9:20 AM Thread Starter Post #1 of 18

chongky

Ever since I started this hobby I have found a discrepancy between what is supposedly "neutral" and what is "warm, inviting and musical" sound. Almost everyone (let's be honest) prefers a warm sound and finds it more involving and fun than a cold, analytical, "neutral" sound. It was after I purchased an ODAC/O2 combo that I started to think more deeply about this question.
 
Technically, the ODAC/O2 cannot be faulted: ruler-flat frequency response within the audible range, great signal-to-noise ratio and VERY transparent. However, when I pair it with a neutral pair of cans like the Soundmagic HP150, the effect is nice but a little dull. The mids are thin and emaciated, and although soundstage and imaging are good, it seems to vindicate the notion that "neutral" is boring.
 
Another neutral and supposedly "unmusical" pair of cans oft criticized by detractors is the Sennheiser HD800. Possibly the most resolving dynamic headphones around, but once paired with the O2 amp, the two seem to highlight each other's weaknesses instead of complementing each other.
 
I often listen to classical music and it was only after purchasing a warmish pair of cans - the Philips Fidelio X2 - that I began to engage more with the music. I've been thinking about this and would like to propose a hypothesis and an experiment to see if my hypothesis is correct.
 
Now comes the science part.
 
I strongly suspect this thinness has to do with interaural time delay. Both O2 amp and HD800 are so revealing that this shortcoming at the source is ruthlessly exposed. The sources for this are http://www.ambisonic.net/pdf/hiresaudio.pdf and http://www.jamminpower.com/PDF/New%20Audio%20Formats.pdf. To quote:
 
"If you put a pulse into one ear, then a pulse slightly delayed into the other ear, most people can hear a time delay of 15 microseconds or more. Under some circumstances, some people can hear time delays of 3 to 5 microseconds. Note that one sample at 48Khz is 20.833 microseconds. At 96Khz, it is 10.4167 microseconds. The minimal inter-aural (across the two ears) time delay that most people can hear is less than one sample period at 48Khz.
 
"When listening with both ears, everyone can distinguish 96Khz recordings from 48Khz recordings, and everyone prefers the 96Khz recordings...the reason being probably because some time-domain resolution between the left and right ear signals is more accurately preserved at 96Khz."
 
This is just a gut instinct. At the risk of being ridiculed (for I'm no science guy, no engineer), I'm curious enough to ask. I am hoping someone will take up the challenge and see whether this hypothesis might be correct.
 
A warm pair of cans does "smear" the audibility of these interaural delays inherent in CDs (16-bit/44.1 kHz) so they become less offensive. I also suspect Charles Hansen of Ayre Audio has stumbled on something like this, as his DACs use single-pass 16x oversampling. And this may be the same reason why DSD sounds good (although technically inferior to PCM) - some time-domain resolution is preserved better at higher sampling rates.
 
If my hypothesis is correct, then oversampling a CD quality file to 352.8 kHz would (somewhat) resolve this problem. (192 kHz - 5.208 microseconds; 352.8 kHz - 2.834 microseconds.)
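
The sample periods quoted in this post can be checked with a few lines of Python. This is only a sketch of the arithmetic (period = 1 / sample rate); the function name `sample_period_us` is made up for illustration. By this formula one sample at 352.8 kHz lasts about 2.83 microseconds. The numbers say nothing about audibility, only how long one sample lasts.

```python
# Duration of one sample (in microseconds) at common PCM sample rates.
RATES_HZ = [44_100, 48_000, 96_000, 192_000, 352_800]


def sample_period_us(rate_hz: float) -> float:
    """Return the time between two consecutive samples, in microseconds."""
    return 1e6 / rate_hz


for rate in RATES_HZ:
    print(f"{rate / 1000:6.1f} kHz -> {sample_period_us(rate):7.3f} us per sample")
```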
 
My proposed experiment:
 
Take the HD800 and O2 amp and pair them with a DAC capable of decoding 352.8 kHz. Using software, upsample a CD-source FLAC file to 352.8 kHz and listen to it. If my hypothesis is correct, the sound might become fuller and more analog, because the inter-aural time delay is drastically diminished.
 
I understand someone on the Science forum has done a blind test for DXD/CD on loudspeakers and couldn't distinguish them, but interaural time delays are most noticeable on headphones. For my part, I don't have a DAC which upsamples to 352.8 kHz, nor do I own an HD800, so I can't conduct this experiment. I'm just curious to see if my proposal works.
 
Pardon my bad English. 
 
 
May 5, 2015 at 9:41 AM Post #2 of 18
ITD is a channel to channel thing at mid-range frequencies. It's about the lateral position of a sound. So if the ITD changes with time a musical instrument would seem to move from left to right.  But this is about mid-range frequencies, maybe 1000 Hz and lower, so I'm not sure just how it could change with time.  Nor do I read about musical instruments moving from side to side.
 
May 5, 2015 at 9:53 AM Post #3 of 18
  "If you put a pulse into one ear, then a pulse slightly delayed into the other ear, most people can hear a time delay of 15 microseconds or more. Under some circumstances, some people can hear time delays of 3 to 5 microseconds. Note that one sample at 48Khz is 20.833 microseconds. At 96Khz, it is 10.4167 microseconds. The minimal inter-aural (across the two ears) time delay that most people can hear is less than one sample period at 48Khz.
 
"When listening with both ears, everyone can distinguish 96Khz recordings from 48Khz recordings, and everyone prefers the 96Khz recordings...the reason being probably because some time-domain resolution between the left and right ear signals is more accurately preserved at 96Khz."

 
A sample rate of 44.1 kHz does not imply there has to be an inter-channel delay of 1 / 88200 = 11.3 us. It is entirely up to the implementation of the DAC. For example, in my tests a PCM1792 based sound card had virtually none, and for another one that uses the CS4398, it was ~40-50 ns (perhaps one cycle of the 44100 * 512 Hz = 22.5792 MHz clock frequency of the DAC). In other words, orders of magnitude below the threshold of audibility.
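
The figures in the paragraph above can be reproduced with simple arithmetic. A rough sketch, assuming the 512x master-clock multiple mentioned above and taking the 15 microsecond threshold from the quote in post #1; the variable names are only illustrative:

```python
FS_HZ = 44_100            # CD sample rate
MCLK_MULT = 512           # master-clock multiple quoted above
ITD_THRESHOLD_US = 15.0   # "most people" threshold from the quoted article

half_sample_us = 1e6 / (2 * FS_HZ)   # 1 / 88200 s, in microseconds
mclk_hz = FS_HZ * MCLK_MULT          # DAC master clock frequency
mclk_period_ns = 1e9 / mclk_hz       # one master clock cycle, in nanoseconds

print(f"half sample period : {half_sample_us:.2f} us")
print(f"master clock       : {mclk_hz / 1e6:.4f} MHz")
print(f"one clock cycle    : {mclk_period_ns:.1f} ns")
print(f"threshold / cycle  : {ITD_THRESHOLD_US * 1000 / mclk_period_ns:.0f}x")
```

One master clock cycle comes out a few hundred times shorter than the quoted 15 microsecond threshold.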
 
Also, sampling does not limit the time resolution, it just removes (in a mathematically perfect implementation) all information above half the sample rate. Upsampling cannot recover the lost high frequency information, and modern DACs internally oversample anyway. Fortunately, removing content above 22.05 kHz should not make an audible difference, because human hearing is limited to about 20 kHz (less for older people).
 
May 5, 2015 at 10:19 AM Post #4 of 18
Hi stv014,
 
I hope you'll be patient with me, as I'm not science-trained. You may be right, but:
 
1) What Moorer is talking about seems different from the notion of inter-channel delay - more of the time-domain resolution as preserved in a CD recording. Or am I wrong?
 
2) I understand that time resolution becomes infinite with the addition of the correct dither. But Moorer explains in his article: "If dither is applied properly, you can produce a waveform that you can adjust on the sub-microsecond level, and still get the waveform to change smoothly and evenly. While that works [at conventional sampling rates], it would work better at 96 or 192, or with DSD. And it's not the same thing as reproducing sub-sample, which, of course, you can't do without a higher sampling rate."
 
May 5, 2015 at 10:42 AM Post #5 of 18
As I already noted, sampling does not limit time resolution, only the bandwidth. It is possible to delay CD quality digital audio by nanoseconds (or capture such delays with an A/D converter), which is obviously a far smaller amount than the duration of one sample.
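
To make the point about sub-sample delays concrete, here is a minimal sketch in plain Python (standard library only). It is not the test setup described in this thread: it simply generates a 1 kHz tone at 44.1 kHz / 16 bit with TPDF dither, delays one channel by 1 microsecond (about 1/23 of a sample period), and recovers the delay from the phase difference between the channels. The tone frequency, amplitude, delay and function names are all assumptions chosen for illustration.

```python
import math
import random

FS = 44_100          # CD sample rate in Hz
F0 = 1_000.0         # test tone in Hz; exactly 1000 cycles fit into one second
N = FS               # one second of audio
DELAY_S = 1e-6       # 1 microsecond, far less than one sample period (~22.7 us)
LSB = 1.0 / 32768    # 16-bit step size for a full scale of +/-1.0


def tone_16bit(delay_s, rng):
    """Sine at F0 delayed by delay_s, quantized to 16 bits with TPDF dither."""
    out = []
    for n in range(N):
        x = 0.5 * math.sin(2 * math.pi * F0 * (n / FS - delay_s))
        dither = rng.random() - rng.random()   # triangular PDF, +/-1 LSB peak
        out.append(round(x / LSB + dither) * LSB)
    return out


def tone_phase(samples):
    """Phase of the F0 component, from projections onto sine and cosine."""
    w = 2 * math.pi * F0 / FS
    s = sum(x * math.sin(w * n) for n, x in enumerate(samples))
    c = sum(x * math.cos(w * n) for n, x in enumerate(samples))
    return math.atan2(c, s)


rng = random.Random(0)   # fixed seed so the run is repeatable
left = tone_16bit(0.0, rng)
right = tone_16bit(DELAY_S, rng)

estimated_delay_s = (tone_phase(left) - tone_phase(right)) / (2 * math.pi * F0)
print(f"applied delay  : {DELAY_S * 1e9:.1f} ns")
print(f"measured delay : {estimated_delay_s * 1e9:.1f} ns")
```

The measured value should land within a few nanoseconds of the applied delay, even though both channels are ordinary 16-bit, 44.1 kHz data.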
 
Dithering is a separate issue, and it deals with the quantization error (of the 16-bit or other resolution), rather than sampling. Correctly implemented dithering makes the quantization error uncorrelated to the input, so that instead of distortion, it is just noise at a constant level. A higher sample rate benefits dithering simply by reducing the noise density, as the same overall amount of quantization noise is distributed over a greater bandwidth (4x sample rate = 6.02 dB = 1 bit lower noise in the audio band). It also makes noise shaping more effective, because a greater percentage of the total bandwidth is inaudible. However, the noise floor of 44.1/16 format CD audio, which is about -95.8 dBFS A-weighted from 0 to 20 kHz with the simplest white noise TPDF dither, is already good enough for virtually all musical content under normal listening conditions, and if necessary, it can be improved with noise shaping.
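
The "4x sample rate = 6.02 dB = 1 bit" relation, and a rough idea of the 16-bit noise floor, can be sketched as follows. This assumes white (unshaped) TPDF dither of +/-1 LSB peak and a full-scale sine as the reference; it is unweighted and covers the full bandwidth, so it will not match the A-weighted 0 to 20 kHz figure quoted above. The function names are hypothetical.

```python
import math


def inband_noise_reduction_db(rate_ratio):
    """Drop in noise density when the same white noise power is spread
    over rate_ratio times the bandwidth."""
    return 10 * math.log10(rate_ratio)


def tpdf_snr_db(bits):
    """Full-scale sine power over total noise power (quantization + TPDF dither)."""
    full_scale_lsb = 2 ** (bits - 1)
    signal_power = full_scale_lsb ** 2 / 2   # sine of peak amplitude full_scale_lsb
    noise_power = 1 / 12 + 2 * (1 / 12)      # quantizer + two uniform dither terms, in LSB^2
    return 10 * math.log10(signal_power / noise_power)


print(f"4x sample rate : {inband_noise_reduction_db(4):.2f} dB lower noise density")
print(f"16-bit TPDF    : {tpdf_snr_db(16):.1f} dB below a full-scale sine (unweighted)")
```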
 
If you have not seen it yet, it is recommended to watch this video (especially the chapter "bandlimitation and timing" at about 17:20), as well as others on the same site, for an easy to understand explanation of the basics of digital audio.
 
May 5, 2015 at 10:58 AM Post #6 of 18
Just performed the following test:
Make a waveform at 11289600 Hz (256x CD), 16384 samples in length. Put two unit impulses (16-bit) at samples 8193 and 8195 in the L and R channels respectively. These are now just under 2/10 of a microsecond apart. Downsample to 44.1 kHz, then back up to 11 MHz. The new sinc-like waveforms at 11 MHz end up with peaks, you guessed it, 2 samples apart. No between-channel timing information was lost in the down-conversion.
 
If you put the pulses in the same channel, then yes, you lose the separation between the impulse peaks. But the quote you had was "If you put a pulse into one ear, then a pulse slightly delayed into the other ear"... which leads me to believe they are talking about the first case.
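
An idealized version of the first case can be sketched in plain Python. This is not the tool or the exact settings used for the test above: it assumes a perfect sinc low-pass for the downsampling step, keeps only 64 samples at 44.1 kHz (a truncation chosen to keep the example small), ignores the overall gain factor, and places the impulses 0 and 2 fine-grid samples after an arbitrary center instead of at samples 8193 and 8195.

```python
import math

FS = 44_100      # CD rate in Hz
RATIO = 256      # fine grid is 256 x CD = 11,289,600 Hz
N = 64           # number of 44.1 kHz samples kept (assumed truncation)
CENTER = 32      # 44.1 kHz sample index near which the impulses sit


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def downsampled_impulse(offset_fine):
    """44.1 kHz samples of an ideally low-passed impulse located
    offset_fine fine-grid samples after CENTER."""
    tau = CENTER + offset_fine / RATIO
    return [sinc(n - tau) for n in range(N)]


def peak_on_fine_grid(samples):
    """Sinc-interpolate back to the fine grid and return the peak's
    offset from CENTER, in fine-grid samples."""
    best_k, best_v = 0, -math.inf
    for k in range(-RATIO, RATIO + 1):
        t = CENTER + k / RATIO
        v = sum(s * sinc(t - n) for n, s in enumerate(samples))
        if v > best_v:
            best_k, best_v = k, v
    return best_k


left_peak = peak_on_fine_grid(downsampled_impulse(0))
right_peak = peak_on_fine_grid(downsampled_impulse(2))
separation_fine = right_peak - left_peak
separation_us = separation_fine / (FS * RATIO) * 1e6
print(f"peak separation: {separation_fine} fine samples = {separation_us:.3f} us")
```

Under these assumptions the two peaks stay 2 fine-grid samples apart after the round trip through 44.1 kHz, consistent with the result reported above.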
 
May 5, 2015 at 11:15 AM Post #8 of 18
May 5, 2015 at 11:16 AM Post #9 of 18
  How can you say that resampling does something and using higher sampling rates doesn't?

 
Because it's digital signal processing that does technical things I don't understand and significantly changes the sound, whereas if you just convert files to higher resolution, it doesn't change the sound. Just use the software and hear for yourself.
 
May 5, 2015 at 11:23 AM Post #10 of 18
   
Because it's digital signal processing that does technical things I don't understand and significantly changes the sound, whereas if you just convert files to higher resolution, it doesn't change the sound. Just use the software and hear for yourself.

 
When you say "resampling", I think people assume you mean the processes of interpolation and decimation. If it changes the sound audibly, then it's probably doing something like EQ that is meant to be audible.
 
May 5, 2015 at 12:12 PM Post #11 of 18
  When you say "resampling", I think people assume you mean the processes of interpolation and decimation. If it changes the sound audibly, then it's probably doing something like EQ that is meant to be audible.

 
It's not EQ. I believe it does more than real-time resampling/oversampling/upsampling as well. You can ask @Dobrescu George for a technical explanation about how it works. It provides a dramatic improvement in sound quality; I just don't know how exactly.
 
May 5, 2015 at 12:39 PM Post #12 of 18
   
It's not EQ. I believe it does more than real-time resampling/oversampling/upsampling as well. You can ask @Dobrescu George for a technical explanation about how it works. It provides a dramatic improvement in sound quality; I just don't know how exactly.

 
Based on the description it looks like it allows for selectable re-sampling algorithms so that you can play whatever content (including DSD) on whatever your system might be. This is the kind of thing that always falls apart in blind testing. But if you like it, great. Back to the thread.
 
May 5, 2015 at 12:43 PM Post #13 of 18
  Based on the description it looks like it allows for selectable re-sampling algorithms so that you can play whatever content (including DSD) on whatever your system might be. This is the kind of thing that always falls apart in blind testing. But if you like it, great. Back to the thread.

 
Translation: I (along with the countless others who hear an obvious difference) am deaf and stupid. Very nice. This is why I don't like to post in Sound Science anymore...
 
May 5, 2015 at 12:47 PM Post #14 of 18
   
Translation: I (along with the countless others who hear an obvious difference) am deaf and stupid. Very nice. This is why I don't like to post in Sound Science anymore...

 
If it's obvious then it would hold up in blind testing and it never does; that has nothing to do with intelligence or severe hearing loss. And when you post in sound science you're supposed to try to be scientific!! Maybe that's why you get such responses. None of this has to do with the topic anyway, unless you plan on explaining how this program helps with time resolution in a reasoned manner.
 
May 5, 2015 at 12:54 PM Post #15 of 18
  If it's obvious then it would hold up in blind testing and it never does; that has nothing to do with intelligence or severe hearing loss. And when you post in sound science you're supposed to try to be scientific!! Maybe that's why you get such responses. None of this has to do with the topic anyway, unless you plan on explaining how this program helps with time resolution in a reasoned manner.

 
You are telling me that I am imagining a difference (for a product that you haven't even tried, I might add), when it is a very obvious dramatic difference. Saying so is highly insulting to my intelligence and implies that I have bad hearing as well. The difference is real, not imagined.
 
Real-time upsampling helps because it sounds better. It is as simple as that. I don't have to provide a scientific explanation to answer that question. All you have to do is use the software I linked to, with the settings I linked to, and hear it with your own ears.
 
