Testing audiophile claims and myths
Feb 5, 2024 at 7:03 PM Post #17,161 of 17,336
Unless I misunderstand you, DeltaWave (the software you linked) can already kinda do that. It can give you the difference of spectrum as well as the difference of amplitudes over time. If you want a graph that shows spectrum over time on a 2D plot, maybe just load the difference file produced by DeltaWave into a spectrogram viewer. You could also use a real-time spectrum analyzer if you're OK with time being represented as it actually passes instead of on the X axis.
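For illustration, a minimal sketch of that spectrogram route, assuming the aligned difference file has been exported from DeltaWave (the "null.wav" name below is just a placeholder):

```python
# Minimal sketch (not DeltaWave itself): spectrogram of the exported null file.
# "null.wav" is a hypothetical file name.
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

rate, null = wavfile.read("null.wav")
if null.ndim > 1:                                  # keep one channel if stereo
    null = null[:, 0]
if np.issubdtype(null.dtype, np.integer):
    null = null / np.iinfo(null.dtype).max         # scale integer PCM to +/-1

f, t, Sxx = spectrogram(null, fs=rate, nperseg=4096, noverlap=2048)
plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-20), shading="auto")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.colorbar(label="Residual power (dB)")
plt.show()
```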
I was particularly interested in methods for analyzing the null that could separate linear distortions from nonlinear distortions.
 
Feb 5, 2024 at 9:28 PM Post #17,163 of 17,336
Wouldn't the changing impedance levels between the DAC/AMPs and headphones affect the final sound heard by the tester that wouldn't be revealed in a direct null test? That seems to me to be a possibility here.
If the DAC, the amp and perhaps also the transducer coupling incur a linear distortion (a phase or magnitude response alteration) at the node prior to the transducer, then that would show up in the electrical null test. If the linear distortion, or a difference in any other distortion, only occurs at the output of the transducer, then we would need to conduct an acoustic null test between the outputs of the same transducer driven by the different stacks (or whichever component has been swapped out), though I suppose that is necessarily harder and more bottlenecked (adding a microphone into the mix) than electrical null tests.
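A rough sketch of what such an electrical null comparison boils down to, assuming two already time- and level-aligned captures (the file names are placeholders, and DeltaWave's alignment is far more careful than a plain subtraction):

```python
# Rough sketch: subtract two aligned electrical captures and report the null depth.
# "stack_a.wav" / "stack_b.wav" are placeholder file names.
import numpy as np
from scipy.io import wavfile

rate_a, a = wavfile.read("stack_a.wav")
rate_b, b = wavfile.read("stack_b.wav")
assert rate_a == rate_b, "sample rates must match"

a = a.astype(np.float64)
b = b.astype(np.float64)
n = min(len(a), len(b))
diff = a[:n] - b[:n]                               # the residual ("null") signal

# Null depth: residual RMS relative to the RMS of capture A, in dB.
null_depth_db = 20 * np.log10(np.sqrt(np.mean(diff ** 2)) /
                              np.sqrt(np.mean(a[:n] ** 2)))
print(f"null depth relative to capture A: {null_depth_db:.1f} dB")
```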
 
Feb 6, 2024 at 4:57 AM Post #17,164 of 17,336
We should think that it’s the same old BS audiophiles/reviewers have been coming up with for 35 years or so, ever since properly executed double blind tests started demonstrating that no one, including audiophiles, could discern audible differences between many/most/all components. So in order for audiophile manufacturers to justify their prices, audiophiles to justify their purchases or reviewers to justify their jobs/reputations, there was no alternative but to (falsely) discredit DBT/ABX testing. Numerous methods have been employed to accomplish this over the years: Not understanding and/or misrepresenting what blind testing is, what it’s for or how it should be used, performing invalid blind tests, making up false conclusions/assertions not indicated by the results, concentrating on *potential* deficiencies of the testing protocol (even those already solved decades ago) and various other ways and variations on the above. The quoted article employed pretty much ALL of the mentioned methods!!
I think https://www.audiosciencereview.com/...-error-metric-discussion-and-beta-test.19841/ was the latest development for that on ASR, but I still need to do more reading.
“Latest development” in terms of judging the audibility of a null test result. However, it’s still somewhat limited in this regard, as it has to rely on psychoacoustic models rather than on the actual hearing ability and listening skills of individual listeners. So it should be viewed as a sort of “ballpark” guesstimate. For an accurate answer, an audibility discernment test is required, say an ABX test.
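For reference, an ABX score is normally judged against chance with a one-sided binomial test; a quick sketch, where the 13/16 score is purely an illustrative number:

```python
# Judge an ABX score against chance (p = 0.5 per trial) with a one-sided binomial test.
from scipy.stats import binomtest

trials, correct = 16, 13                           # illustrative numbers only
result = binomtest(correct, trials, p=0.5, alternative="greater")
print(f"{correct}/{trials} correct, p-value = {result.pvalue:.4f}")
# Conventionally, p < 0.05 suggests the listener was probably not guessing.
```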
I would be more interested in software that could plot any spectral differences with respect to time for two chains playing the same music through the same transducer.
Personally, I’m far less interested in that. Firstly, “two chains” is an awful lot of variables and would therefore tell us little/nothing about individual components in each chain, and secondly, measuring the output of the transducer introduces a bunch of potential measurement inaccuracies, particularly with HPs/IEMs, that are very likely to be of far higher magnitude than the differences we’re trying to measure between many other components.
As for Resolve, others may be better at explaining the details of what may have been wrong with the test and what additional controls were needed.
God, where to start! In fact, it’s difficult to think of anything he got right, even the very premise of blind testing, let alone how he executed it!

G
 
Feb 6, 2024 at 6:15 AM Post #17,165 of 17,336
Just got to reading the article, have to concur with @gregorio here.

The other limitation is that I wasn’t able to test just the DAC portions through a common amplifier, again because I didn’t use an RCA switch. This means that I was just evaluating two completely distinct source chains for differences, and not individual pieces of each chain.
4th paragraph. The preceding paragraph denotes a problem as well, but this one kills this iteration of the test. Not isolating the variables properly is fatal to the credibility of the test.
 
Feb 6, 2024 at 9:53 AM Post #17,166 of 17,336
Just got to reading the article, have to concur with @gregorio here.


4th paragraph. The preceding paragraph denotes a problem as well, but this one kills this iteration of the test. Not isolating the variables properly is fatal to the credibility of the test.

I completely understand, but I still have to ask:
if every component (DAC, Amp, Source) supposedly cannot be told apart from another, why would you expect differences with two separate chains?
(I guess the main issue was his mediocre approach to volume matching, using the SPL from the headphones instead of measuring the output voltage of the amplifier?)
 
Feb 6, 2024 at 10:04 AM Post #17,167 of 17,336
I completely understand, but I still have to ask:
if every component (DAC, Amp, Source) supposedly cannot be told apart from another, why would you expect differences with two separate chains?
(I guess the main issue was his mediocre approach to volume matching, using the SPL from the headphones instead of measuring the output voltage of the amplifier?)
Standard procedure for designing a double blind study is to define the hypothesis being tested and your operational parameters, identify and eliminate any confounding variables that could contaminate the study, and control for operator and subject bias errors, in order to isolate and clearly identify causal links that can then be replicated in peer review. The study design in question is completely inadequate. This makes the study an invalid test as opposed to an unsound one, meaning he made a process error, and the test can thus be refuted without addressing the results.

I'm pretty sure gregorio is going to comment on this too, but amps can make a material difference in sound based on the tech used. An SS amp will be clearly different from a tube amp, for instance. Sources also have varying levels of noise floor and output impedance that can have a material effect on the resulting sound. You have to control for those variables to eliminate those possibilities.
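A hedged illustration of the output impedance point: the amp's output impedance and the headphone's frequency-dependent impedance form a voltage divider, so a higher output impedance reshapes the frequency response. The impedance figures in the sketch are invented, not measurements of any real IEM:

```python
# Voltage divider between amp output impedance and headphone impedance.
import numpy as np

z_headphone = {"100 Hz": 300.0, "1 kHz": 60.0, "10 kHz": 45.0}   # ohms, hypothetical IEM

def level_change_db(z_load, z_out):
    """Level at the load relative to an ideal 0-ohm source, in dB."""
    return 20 * np.log10(z_load / (z_load + z_out))

for z_out in (0.1, 10.0):                          # near-ideal vs. high output impedance (ohms)
    shaped = {f: round(level_change_db(z, z_out), 2) for f, z in z_headphone.items()}
    print(f"Zout = {z_out:>4} ohm -> relative level (dB): {shaped}")
```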
 
Feb 6, 2024 at 12:16 PM Post #17,168 of 17,336
I completely understand, but I still have to ask:
if every component (DAC, Amp, Source) supposedly cannot be told apart from another, why would you expect differences with two separate chains?
(I guess the main issue was his mediocre approach to volume matching, using the SPL from the headphones instead of measuring the output voltage of the amplifier?)
The idea of testing the audibility of something specific, at least as a scientific approach with the purpose of demonstrating something as fact, is to do the best we can to remove anything that isn't what we're actually testing.
It goes against that very principle to introduce more variables, like different amps, when the target of the test is the sound of 2 DACs. Even two units of the same amp model are, at least in principle, an extra risk of variables we did not want in the test.
Perhaps it matters, perhaps not, and the true answer is probably something like "it depends". So avoiding it is just the better choice if available.
 
Feb 6, 2024 at 1:59 PM Post #17,169 of 17,336
The idea of testing the outputs of two entire chains for the same transducer was to test any claims of "synergy", though individual components within said chains could still be swapped out. For example, how to definitively prove to someone that the combination of different gear doesn't magically cause them to all behave differently in concert.

As for acoustic measurements, provided the test rig is highly stable, my particular interest is in separating analysis of linear and nonlinear distortions to see whether the amp even in nominal operation can somehow change its transfer function to the transducer or induce compression depending on the signal in a manner not revealed by sine sweeps or multitone measurements.
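One rough sketch of that kind of separation (not a full method): estimate the best-fit linear transfer function from stimulus to capture with the standard H1 = Sxy/Sxx estimator, filter the stimulus through it and subtract; whatever remains is noise plus everything the linear model could not explain (nonlinear distortion, time variance). File names below are placeholders and the captures are assumed to already be time-aligned:

```python
# Split a capture into a best-fit linear part and a residual via the H1 estimator.
# "stimulus.wav" / "capture.wav" are placeholder file names.
import numpy as np
from scipy.io import wavfile
from scipy.signal import csd, welch, fftconvolve

rate, x = wavfile.read("stimulus.wav")             # what was sent into the chain
_,    y = wavfile.read("capture.wav")              # what was captured at the output
if x.ndim > 1: x = x[:, 0]
if y.ndim > 1: y = y[:, 0]
x = x.astype(np.float64)
y = y.astype(np.float64)
n = min(len(x), len(y))
x, y = x[:n], y[:n]

nper = 8192
_, Sxy = csd(x, y, fs=rate, nperseg=nper)          # cross-spectral density
_, Sxx = welch(x, fs=rate, nperseg=nper)           # stimulus power spectral density
H = Sxy / Sxx                                      # H1 estimate of the linear response

h = np.fft.irfft(H)                                # impulse response (assumes small delay)
y_linear = fftconvolve(x, h, mode="full")[:n]      # stimulus passed through the linear fit
residual = y - y_linear                            # noise + whatever the linear fit misses

def rms_db(v):
    return 20 * np.log10(np.sqrt(np.mean(v ** 2)) + 1e-20)

print(f"capture {rms_db(y):.1f} dB | linear part {rms_db(y_linear):.1f} dB | "
      f"residual (noise + nonlinear) {rms_db(residual):.1f} dB")
```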
 
Feb 6, 2024 at 2:48 PM Post #17,170 of 17,336
The idea of testing the outputs of two entire chains for the same transducer was to test any claims of "synergy", though individual components within said chains could still be swapped out. For example, how to definitively prove to someone that the combination of different gear doesn't magically cause them to all behave differently in concert.

As for acoustic measurements, provided the test rig is highly stable, my particular interest is in separating analysis of linear and nonlinear distortions to see whether the amp even in nominal operation can somehow change its transfer function to the transducer or induce compression depending on the signal in a manner not revealed by sine sweeps or multitone measurements.
I think you want to go look into just about any paper with Steve Temme as an author or coauthor.
Anytime they speak pure math, my brain brings up the monkey on a bicycle, but I remember finding:
https://www.researchgate.net/public...ibility_and_listener_preference_in_headphones
and
https://www.listeninc.com/wp/media/2023/06/paper_51_AES_practical_implementation_of_PRB.pdf
And another one about non-coherence, which might have been older and no longer relevant, seemingly in the vein of what you're thinking of. At least the guy agrees that THD isn't all that great a measurement when it comes to what we perceive and like, and he's been looking into more than THD for a while now. Those are all about transducers; we have to consider that most amps or DACs will probably do one or two orders of magnitude better, as they usually manage to for other variables. But admittedly, as we typically don't see measurements beyond THD and sometimes IMD, maybe it's one of those situations where many manufacturers don't fix the issues that people don't know about?
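As a small sketch of the coherence idea: the magnitude-squared coherence between stimulus and capture sits near 1 wherever the device behaves as a linear system and drops wherever noise or nonlinear distortion shows up, which is one way of looking past a single THD number. File names are placeholders:

```python
# Magnitude-squared coherence between stimulus and capture.
# "stimulus.wav" / "capture.wav" are placeholder file names.
import numpy as np
from scipy.io import wavfile
from scipy.signal import coherence

rate, x = wavfile.read("stimulus.wav")
_,    y = wavfile.read("capture.wav")
if x.ndim > 1: x = x[:, 0]
if y.ndim > 1: y = y[:, 0]
n = min(len(x), len(y))

f, Cxy = coherence(x[:n].astype(np.float64), y[:n].astype(np.float64),
                   fs=rate, nperseg=8192)
worst = int(np.argmin(Cxy[1:])) + 1                # ignore the DC bin
print(f"lowest coherence {Cxy[worst]:.3f} near {f[worst]:.0f} Hz")
```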

My way to find papers usually stops at looking for a title or a guy and adding PDF at the end of my google search, so I might not be the best source for this stuff. :sweat_smile:
 
Feb 7, 2024 at 3:12 AM Post #17,171 of 17,336
if every component (DAC, Amp, Source) supposedly cannot be told apart from another, why would you expect differences with two separate chains?
But most components CAN be “told apart from another”!! You can either relatively easily create the conditions under which they can be “told apart”, or simply not compensate for the audible differences. For example, many DACs have significantly different output voltages, ranging from around 0.5V to over 4V, and unless compensated (precisely volume matched) they would be easily “told apart”. Even if precisely volume matched, it’s still relatively easy to create conditions under which they could be differentiated, say looping a very quiet section and whacking up the gain. It’s even more the case with amps: the output wattage and impedance of different amps vary. It obviously wouldn’t be hard to tell the difference between, say, a 10W amp and a 500W amp when using 400W speakers, or when comparing an amp that has a higher output impedance than the transducer’s impedance with an amp that has an appropriate impedance.
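To put a rough number on that, using just the 0.5V and 4V endpoints mentioned above:

```python
# 0.5 V vs 4 V full-scale output, expressed as a level difference:
import math
print(f"{20 * math.log10(4.0 / 0.5):.1f} dB")      # ~18.1 dB -- trivially audible if unmatched
```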

Comparing whole chains multiplies all the variables and conditions that have to be precisely matched and compensated for, in order to eliminate them as causes of audible differences. Which brings us back to what @KinGensai stated about “isolating the cause” and what I stated about making false conclusions/assertions not indicated by the results.
For example, how to definitively prove to someone that the combination of different gear doesn't magically cause them to all behave differently in concert.
If someone believes in magic and doesn’t understand or accept the science then there’s no way to “definitively prove” it to them. Even ignoring magic and just going on the science/facts, it’s still not possible to definitively prove that the combination of different gear doesn’t cause an audible difference because it commonly can, as explained above.

G
 
Feb 7, 2024 at 4:30 AM Post #17,172 of 17,336
gotcha!
Thanks for pointing this out.


Would be great to have a conclusive list of "things" that can cause amplifiers, DACs, CD-players, etc. to actually affect the output.

e.g. Output voltage, impedance, distortion? damping? SINAD?

Having a nice spreadsheet.. or concept map?... or flowchart? 🤔


I once found an article by Sanders regarding amplifiers and other gear: http://sanderssoundsystems.com/technical-white-papers

like this:
http://sanderssoundsystems.com/technical-white-papers/162-audio-equipment-testing-white-paper

and he defines "basic quality criteria" for amps and other gear as follows:
[attached image: Sanders' table of basic quality criteria for amps and other gear]


I really liked those articles.
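As an aside on the SINAD criterion mentioned earlier, a hedged sketch of how such a figure is typically derived from a capture of a single test tone: everything in the spectrum that isn't the fundamental counts as noise + distortion. Real analyzers are more careful about windowing and notch width, and the file name below is hypothetical:

```python
# Everything that isn't the fundamental counts as noise + distortion.
# "loopback_1khz.wav" is a hypothetical capture of a 1 kHz tone.
import numpy as np
from scipy.io import wavfile

rate, sig = wavfile.read("loopback_1khz.wav")
if sig.ndim > 1:
    sig = sig[:, 0]
sig = sig.astype(np.float64)
sig -= np.mean(sig)                                # remove DC

spec = np.abs(np.fft.rfft(sig * np.hanning(len(sig)))) ** 2    # power spectrum
freqs = np.fft.rfftfreq(len(sig), d=1.0 / rate)

k = int(np.argmax(spec))                           # bin of the fundamental
fund = spec[max(k - 3, 0):k + 4].sum()             # fundamental +/- a few leakage bins
noise_and_dist = spec.sum() - fund
sinad_db = 10 * np.log10(fund / noise_and_dist)
print(f"fundamental near {freqs[k]:.0f} Hz, SINAD ≈ {sinad_db:.1f} dB")
```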
 
Feb 7, 2024 at 8:26 AM Post #17,174 of 17,336
Assuming an equally good seal ...
would the density / rigidity of the material make a difference?
Would the depth of the tip's collar over the edge of the nozzle make a difference?

Material makes a notable difference, as does obscuring the nozzle by various amounts, obviously. But each IEM is affected differently, and tips that sound good on one may not on another. Hence I've found there is no superior tip design that works for all. Also, what sounds good to me may not to someone else.
Azla Crystals use a very firm material, but for me they affect the overall sound, and the highs especially.
 
