Testing the claim: "I can hear differences between lossless formats."

Oct 26, 2014 at 11:25 PM Post #121 of 721
 
Does the method have a name?
It reminds me a bit of the randomized response method, where you set up the questionnaire so that you can eke out the result using Bayes' theorem.


Controls. 
 
The file being 1 dB louder would be a positive control: not glaringly obvious, but known to almost always be detected blind.  If the positive control gives a null result, it would indicate a problem with the test, or perhaps other confounding variables. 
 
A negative control would be comparing a file to itself.  You expect null results; getting non-null results would point to some problem with the test.  Perhaps the methodology is unintentionally unblinding the trials.
 
For example, you could do ABX testing on 5 files, as sketched below.  In one case, A and B would be the same file, giving a negative control.  In another, one file could be made subtly louder, as a positive control.  The other 3 files could then be genuinely different in the way being tested for, perhaps differing sample rates. 
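Here is a minimal sketch of how such a session could be laid out (in Python; the file names and trial counts are purely illustrative, nothing from this thread): the genuine comparisons and the two controls go into one trial list, get shuffled together, and the true identity of X is recorded for scoring later.

```python
import random

# Hypothetical stimuli: pairs of files to compare in one ABX session.
# The file names are placeholders, not files anyone in this thread used.
test_pairs = [
    ("track_44k.wav", "track_96k.wav"),          # genuine difference under test
    ("track_16bit.wav", "track_24bit.wav"),      # genuine difference under test
    ("track_cd.wav", "track_remaster.wav"),      # genuine difference under test
    ("track_ref.wav", "track_ref.wav"),          # negative control: identical files
    ("track_ref.wav", "track_ref_plus1dB.wav"),  # positive control: ~1 dB louder copy
]

TRIALS_PER_PAIR = 10

def build_trial_plan(pairs, trials_per_pair, seed=None):
    """Return a shuffled list of ABX trials; each trial records which file
    plays as A, which as B, and which of the two X actually is."""
    rng = random.Random(seed)
    plan = []
    for a, b in pairs:
        for _ in range(trials_per_pair):
            x_is = rng.choice(["A", "B"])          # hidden identity of X
            plan.append({"A": a, "B": b,
                         "X": a if x_is == "A" else b,
                         "answer": x_is})
    rng.shuffle(plan)   # interleave control trials with the real comparisons
    return plan

if __name__ == "__main__":
    for trial in build_trial_plan(test_pairs, TRIALS_PER_PAIR, seed=42)[:5]:
        print(trial)
```

The negative-control pair should score at chance and the positive-control pair close to 100%; anything else suggests the test setup itself is broken.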
 
Oct 27, 2014 at 9:18 AM Post #122 of 721
 
The place would be empty because objectivists are shirking just as much as subjectivists. "You can't prove a negative." "You're shifting the burden of proof." If you believe in testing, test your own beliefs too instead of just taking them for granted.

 
Give us a test designed for the task and I'm sure people would take it.  The problem, as others are pointing out, is that ABX is designed for people who want to show they *can* discern between two tracks.  But it's easy to deliberately fail such a test: just answer A the whole time.  We need a test that avoids this issue when the question is whether someone cannot discern between two tracks.

MRI+polygraph+one of your kids loses a finger every 2 wrong answers?
Not sure a lot of people have the practical means to set up the necessary test.
 
 
 
 
 
 
 
 
 
 
As long as rejecting a claim requires proving the opposite, it will be a dead end. Nothing in the real world works like that (luckily), and I don't see the need to disprove every weird claim.
Nobody can prove that no man can fly with the force of his mind. Should we let someone claiming exactly that roam freely? Treat him with consideration and convince people it's true because we don't know how to prove it's impossible? Maybe the guy can't do it when we watch him because he's shy? Maybe he fails experiments because of the stress, or maybe any human or man-made technology interferes with his brainwaves? But when he's alone at his house, sure, he can hover above his bed anytime!
At some point we have to keep it real.
Even when some humans can do something, it doesn't mean everybody can, or that we should treat that ability as a normal human thing. Having one guy who can hold his breath for 20 minutes (with hyperoxygenation) isn't reason enough to start writing in science books that humans can do 20 minutes, because pretty much anybody else would asphyxiate and die in under 10 minutes. And believing anybody coming out of nowhere and telling you he can stay 10 minutes underwater without breathing isn't being open-minded, it's being gullible. And I'm talking about something that actually exists sometimes, unlike so many things in audio.
 
 
I know I'll have a hard time simply telling FLAC from max-bitrate MP3 with my average IEMs (unless I hand-pick the passages), and someone comes along telling us that one lossless format sounds warmer than another on several listening systems. Yeah, sure.
And nobody ever measured that before, and the guys making the lossless codecs never noticed it. Only our buddy doing casual listening and encoding. Surely the problem is with the codecs... or not.
 
Oct 27, 2014 at 11:17 AM Post #123 of 721
MRI+polygraph+one of your kids loses a finger every 2 wrong answers?
Not sure a lot of people have the practical means to set up the necessary test.
 
I know I'll have a hard time simply telling FLAC from max-bitrate MP3 with my average IEMs (unless I hand-pick the passages), and someone comes along telling us that one lossless format sounds warmer than another on several listening systems. Yeah, sure.
And nobody ever measured that before, and the guys making the lossless codecs never noticed it. Only our buddy doing casual listening and encoding. Surely the problem is with the codecs... or not.

 
Yeah, this is another level of "head asplode" above the "I get benefit from 96kHz signals" argument.  At least there, there is some difference in file content; here we can straight up show that the WAV data decoded from the various lossless codecs is bit-identical, and identical is a subset of indistinguishable.  And instead of the immediate assumption "man, maybe I have some weird mixer setting in my software", it's "prove to me you can't hear a difference!"
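If anyone wants to check that for themselves, here is a minimal sketch (assuming each lossless file has already been decoded back to WAV with whatever tool you trust; the file names are placeholders): hashing only the PCM frames ignores header and metadata differences and shows whether the audio data is bit-identical.

```python
import hashlib
import wave

def pcm_md5(path):
    """Hash only the decoded PCM frames, so header/metadata differences
    between the two WAV files cannot affect the comparison."""
    with wave.open(path, "rb") as w:
        return hashlib.md5(w.readframes(w.getnframes())).hexdigest()

# Placeholder names: the same master decoded from two different lossless codecs.
print(pcm_md5("decoded_from_flac.wav") == pcm_md5("decoded_from_alac.wav"))
```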
 
Oct 27, 2014 at 6:43 PM Post #124 of 721
 
As long as rejecting a claim requires proving the opposite, it will be a dead end. Nothing in the real world works like that (luckily), and I don't see the need to disprove every weird claim.

 
I agree with you in this case (the guy himself should consider investigating further).

But the usual claims about DACs, amps, and cables aren't so weird, and tests against those claims would have to be run too if you believe in testing and want to use it to support the claim that you aren't hearing a difference.
 
Oct 27, 2014 at 11:53 PM Post #125 of 721
Your broad point that failing to reject the null hypothesis of "no difference" does not equate to proving that both items are the "same" is correct on a theoretical level. Our conclusion is not that those two items are the "same," but rather that if there is no audible difference between two things, we can practically say that any sonic differences that may exist are beneath our threshold of hearing, so from a practical standpoint they are audibly transparent and identical to the human ear.


Ok, glad you see my point. It is rather basic.

Here's my problem with the second part of your statement. Suppose, for the sake of argument, that the OP's ABX results are statistically significant at a 90% confidence level but not at 95%. If you demand 95%, you would fail to reject the null hypothesis. But it does not necessarily follow that there are no audible differences at all.
 
Oct 28, 2014 at 1:27 AM Post #126 of 721
Ok, glad you see my point. It is rather basic.

Here's my problem with the second part of your statement. Suppose, for the sake of argument, that the OP's ABX results are statistically significant at a 90% confidence level but not at 95%. If you demand 95%, you would fail to reject the null hypothesis. But it does not necessarily follow that there are no audible differences at all.

 
Whether something is statistically significant depends on the p-value falling below a certain threshold. Different confidence levels have different associated thresholds: at a 95% confidence level the threshold is 0.05, and at 90% it is 0.1.
 
You will not get a different result using the same data with different confidence levels.
 
Oct 28, 2014 at 4:53 AM Post #127 of 721
Here's my problem with the second part of your statement. Suppose, for the sake of argument, that the OP's ABX results are statistically significant at a 90% confidence level but not at 95%. If you demand 95%, you would fail to reject the null hypothesis. But it does not necessarily follow that there are no audible differences at all.

 
The simple solution is to do more trials, and combine the results of all of them (i.e. not just selectively include those that suit whatever you are trying to prove). The larger the sample size, the easier it is (the lower the percentage of correct answers required) to get a p-value close to zero if an audible difference does indeed exist. For example, with a total of 30 trials, it only takes a score of 20/30 (~67%) to reach a p-value under 0.05, and with 22/30 or 24/30, it would be only 0.008 or 0.0007, respectively. A score of 37/50 (74%) gives a p-value under 0.0005, and 39/50 (78%, arguably still a reasonable requirement to claim an audible difference) reduces it to below 0.00005, or a 99.995% confidence level. It can never be zero, but it can be close enough for practical purposes without requiring an extreme sample size.
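For anyone who wants to check those figures, here is a short sketch of the underlying arithmetic: the one-sided binomial probability of scoring at least that well by pure guessing (50% per trial).

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided binomial p-value: the probability of getting at least
    `correct` right out of `trials` by pure guessing (p = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# The scores quoted above:
for correct, trials in [(20, 30), (22, 30), (24, 30), (37, 50), (39, 50)]:
    print(f"{correct}/{trials}: p = {abx_p_value(correct, trials):.6f}")
```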
 
Oct 28, 2014 at 5:31 AM Post #128 of 721
The confidence level is the probability we fail to reject H0 given that it is true.  Thus, if you want to guard against rejecting the null incorrectly, you must set a high confidence level.  Doing this at a fixed sample size, though, decreases your power, which is the probability you reject H0 given that Ha is true.  A properly specified experiment decides on both confidence level and power before commencing; changing your mind about significance after you see the results is one of the no-nos.
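To make that trade-off concrete, here is a small sketch under an illustrative assumption (a listener who is right on 70% of trials, a number picked out of thin air): for a given significance level it finds the minimum passing score, then asks how often such a listener would actually clear it at each sample size.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def abx_power(n_trials, p_true, alpha=0.05):
    """Power of an n-trial ABX test for a listener whose true per-trial
    hit rate is p_true, at significance level alpha."""
    # smallest score that would be declared significant at this alpha
    critical = next(k for k in range(n_trials + 1)
                    if binom_tail(k, n_trials, 0.5) <= alpha)
    return binom_tail(critical, n_trials, p_true)

# Illustrative only: power grows with trial count for a fixed 70% hit rate.
for n in (16, 30, 50, 100):
    print(n, "trials -> power", round(abx_power(n, 0.70), 3))
```

Raising the confidence level (lowering alpha) pushes the critical score up and the power down unless the number of trials goes up with it.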
 
Oct 28, 2014 at 8:29 AM Post #129 of 721
Whether something is statistically significant depends on the p-value falling below a certain threshold. Different confidence levels have different associated thresholds: at a 95% confidence level the threshold is 0.05, and at 90% it is 0.1.

You will not get a different result using the same data with different confidence levels.


You will get the same p-value either way, but you will most certainly get different results. Whether you reject or fail to reject the null (the only two possible results) depends entirely on your choice of confidence level, which is arbitrary.
 
Oct 28, 2014 at 1:38 PM Post #132 of 721
 Whether you reject or fail to reject the null (the only two possible results) depends entirely on your choice of confidence level, which is arbitrary.

Yes, true enough.  Let us suppose we choose a 50% confidence level.  Then a large share of test runs would come out positive at our chosen level, including roughly half of those where someone is purely guessing at random.  That means the test would not be able to differentiate between an audible effect and an inaudible one. 
 
So we could instead choose a 75% confidence level.  Fewer test results would meet that bar, and we would have improved discrimination between audible and truly inaudible effects.  Yet just guessing, or choosing randomly when something is inaudible, would still give an apparently positive result in about 1 out of 4 tests.  The typically used 95% confidence level is chosen so that very few (1 in 20) of our tests of an inaudible effect will give a false positive.
 
Now, in one of your earlier posts you imagined something genuinely audible at the 90% confidence level that fails to meet 95%.  The answer to that situation, where something is audible but only barely, not discerned every single time yet really audible sometimes, is an increased number of trials.  In 20 trials, 15 correct (75%) is needed for 95% confidence.  In 100 trials, only 60 correct (60%) is needed for 95% confidence.  Something inaudible might randomly yield 12 of 20 correct, but then likely only 8 of the next 20 and maybe 10 of the next 20.  A genuinely audible effect is more likely to yield 12 of 20 correct, then 13 of the next 20, then 11 of the next 20, and so on until, over 100 trials, it clears the 60%-correct mark and the 95% confidence level.  
 
Lest you take offense, I am not explaining confidence levels to you, I am explaining the effects of choosing confidence levels.
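A quick simulation along the lines of that paragraph, with the barely audible case modelled as a purely illustrative 60% per-trial hit rate: block by block the guesser and the barely-able listener can look similar, but pooling the 100 trials gives the real effect a much better chance of reaching significance than any single 20-trial block does.

```python
import random
from math import comb

def p_value(correct, trials):
    """One-sided binomial p-value against pure guessing (p = 0.5 per trial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

def run_blocks(hit_rate, blocks=5, block_size=20, seed=0):
    """Simulate several 20-trial ABX blocks and pool them."""
    rng = random.Random(seed)
    scores = [sum(rng.random() < hit_rate for _ in range(block_size))
              for _ in range(blocks)]
    total = sum(scores)
    return scores, total, p_value(total, blocks * block_size)

# hit_rate 0.5 models pure guessing; 0.6 models a barely audible difference
for label, rate in [("guessing", 0.5), ("barely audible", 0.6)]:
    scores, total, p = run_blocks(rate)
    print(f"{label}: blocks {scores}, pooled {total}/100, p = {p:.3f}")
```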
 
Nov 15, 2014 at 8:51 PM Post #133 of 721
Interestingly, a few people have claimed to be able to reliably hear a difference between 24-bit and 16-bit audio as well.
 
@daltonljj @Existence @RUMAY408
 
You are hereby invited to strut your stuff and show us if you can demonstrate this with statistical significance!
 
(Or if you've done tests before, you can present documentation.)
 
Nov 15, 2014 at 10:05 PM Post #134 of 721
  Interestingly, a few people have claimed to be able to reliably hear a difference between 24-bit and 16-bit audio as well.
 
@daltonljj @Existence @RUMAY408
 
You are hereby invited to strut your stuff and show us if you can demonstrate this with statistical significance!
 
(Or if you've done tests before, you can present documentation.)

No arguments from me about sample rates above 44.1.  24/192 and now up to 32/384 versions are memory killers in my book.  That's just my opinion using my own equipment.  24-bit depth makes a difference to me, but again, that's just my opinion. 
Many of the better albums that show up on HDTracks were first issued on SACD or DVD-A.  The newer downloads are often from the original master tapes, e.g. Van Halen.  I'm impressed with the better albums, but there are dogs here as well.
I have hundreds of CDs, hundreds of MP3 albums, and hundreds of vinyl albums.  The loudness wars ruined a lot of CDs from about the mid-80s on.  Digital downloads that equaled vinyl albums on some level were what I was after.
HDTracks' 16/44 downloads (and above) have instituted better-than-average quality standards; whether the cost is worth it is always a question.
 
The above article is referenced frequently here and in other audio communities; dogma becomes dogma when it is never questioned, and I am too skeptical to believe everything I'm told to believe.
 
Since you paraphrased me, not so reliably, I thought I would throw down my entire quote.
 
