I get the whole idea of wanting truly accurate data and facts; it's the ideal we all aspire to. but let's be honest, nothing provides that, not even real scientists doing real science. the best we get is moving closer to the truth and further away from fooling ourselves. you can always find some flaw in any test; there is always the potential for something we didn't think of, or don't even know exists. there is always a limit to the resolution of the measurements...
what should matter is knowing whether, on average, one method gives more reliable results than another. shooting down ABX because of the risk of a false negative or false positive looks to me like saying that cars and bikes shouldn't be used for transportation because they suck at climbing stairs. one problem doesn't mean it doesn't work great for everything else. you use a test for what it's relatively good at, and for the rest, you try to find a better test. if you think the reliability is poor, then you just don't draw conclusions you shouldn't draw, and we're back to the burden of proof and how people should avoid making claims for half-baked reasons. and "I failed to pass the ABX test" isn't saying the same thing as "there is no difference". on that we very much agree. we go as far as we confidently can. it doesn't mean we shouldn't use a test that isn't 100% reliable. even 80% reliable is better than nothing.
Any test needs some measure of calibration or qualification to check its suitability for the role it is being used for. In a test which relies on statistical analysis, such as ABX testing, this is represented by the power of the test, which is related to the level of false positives, the level of false negatives, the sample size, etc. The power of the test is directly related to its accuracy.
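To make that concrete, here is a rough sketch (plain Python, standard library only) of how the false negative rate of a forced-choice test like ABX could be estimated. The function names, the 16-trial example and the assumed 70% true detection rate are my own illustrative assumptions, not measured data from any real test.

```python
from math import comb


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_score(n, alpha=0.05):
    """Smallest number of correct answers that a pure guesser (p = 0.5)
    reaches with probability <= alpha. Returns None if even a perfect
    score is not rare enough (too few trials)."""
    for k in range(n + 1):
        if binom_tail(n, k, 0.5) <= alpha:
            return k
    return None


def power(n, p_true, alpha=0.05):
    """Probability that a listener who is right with probability p_true
    on each trial reaches the pass mark (hypothetical model: independent
    trials with a constant hit rate)."""
    k = critical_score(n, alpha)
    if k is None:
        return 0.0
    return binom_tail(n, k, p_true)


if __name__ == "__main__":
    n = 16          # assumed number of trials
    p_true = 0.7    # assumed true per-trial hit rate, for illustration only
    print("pass mark:", critical_score(n), "of", n)
    print("power:", round(power(n, p_true), 3))
    print("false negative rate:", round(1 - power(n, p_true), 3))
```

With a fixed pass mark, a listener who genuinely hears a small difference can still fail a short test, which is why I keep asking about trial counts and false negative figures.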
You made claims about the accuracy of home-based blind testing & I asked for statistics to back up those claims. I'm not shooting down home-based ABX testing - I just find that, without these statistics, it is another anecdote about listening (one that seems to be skewed towards not hearing any differences). If you can present false negative statistics for home-based ABX tests that show my opinion is wrong, then I will change my view.
Essentially you are asking me to accept a tool that you claim is accurate, yet you can't show me any calibration results for it. Instead you talk about how all the other tools are not good enough. I'm not convinced, just as I would not be convinced if you claimed you were more accurate at target practise than others & tried to prove it by telling me about one guy having only one eye, another's twitch, or another's lack of balance.
right now, ABX is available to all curious people, and can help find out a number of things on a number of subjects. any time I test an obviously audible difference, I get 100% in an ABX, so it's at least as good as sighted evaluation for obvious stuff. it's not inferior, and it's not the lottery you're trying to depict.
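for what it's worth, the odds of acing an ABX by pure guessing are easy to sketch. this is just a toy calculation, assuming each trial is an independent coin flip for someone who hears nothing; the function names are made up for illustration.

```python
def chance_of_perfect_score(trials):
    """Probability that a pure guesser gets every trial right,
    assuming independent 50/50 trials."""
    return 0.5 ** trials


def trials_for_confidence(alpha=0.05):
    """Smallest number of trials for which a perfect score by pure
    guessing has probability <= alpha (alpha must be between 0 and 1)."""
    n = 1
    while chance_of_perfect_score(n) > alpha:
        n += 1
    return n


if __name__ == "__main__":
    for n in (5, 10, 16):
        print(n, "trials:", chance_of_perfect_score(n))
    print("trials needed at alpha=0.05:", trials_for_confidence(0.05))
```

so a perfect run over a decent number of trials is very hard to get by luck, which is all I'm claiming for the obvious cases.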
how do I know whether I picked an amp because it sounds better or because I like how it looks? well, it's simple: you hide how it looks and I test again. there is nothing wrong with preferring a given look, and it's perfectly ok to buy a product for that reason. but when we pretend to be testing only for sound, we should at least try to remove as many external variables as possible. sighted evaluation fails to offer that 100% of the time.
Yes, that's one of the problems with ABX testing - its availability to all curious people, so we see all sorts of results of varying quality without any ability to judge the quality of the test. It's just a curiosity, nothing more!
so I wish for something better than blind testing, or simply better than ABX, and I'm also ok with not drawing weird conclusions or giving too much credibility to some half-controlled personal test. but sighted evaluation isn't an alternative to blind testing. all those arguing against it while offering no other choice don't know what testing sound means. so any claim about sound made from sighted evaluation should be challenged when you feel it might be wrong, and checked with any available audio test that removes at least some external variables. and if someone made a claim about how something sounds, then the burden of proof means he should either retract his statement, or do what he can to offer proper confirmation that he didn't make stuff up.
if he can't, then most likely he should retract his statement, as nobody forced him to claim something he couldn't try to verify.
maybe rather than talking about the burden of proof, I should have talked about how silly it is to make statements we aren't entitled to make. but parenting should have made that clear long ago, and we all know you can't stop people from talking nonsense, so instead I pointed to the burden of proof, which is IMO a proper way of arguing against illegitimate claims.
Yes, I'm glad you wish for a better test, but that doesn't mean we should demand that someone do a curiosity test that produces results of unknown quality.