My take on foobar ABX test with very subtle effects on files.
Aug 5, 2017 at 2:37 AM Thread Starter Post #1 of 45

WindowsX

Member of the Trade: Fidelizer Audio
I tried an ABX test today in foobar2000 on very subtle changes: I added subtle DSP effects that I can normally notice. I think the result is interesting, so I decided to share it here.

A while into the ABX test, stress and fatigue started to build up. At some point, the difference became harder and harder to notice.

I can normally tell those two files apart easily, but A B X Y selection takes more than listening skills. It needs high endurance and patience.

After performing the test, the result was as I predicted. If we divide the test into 3 parts, I made fine progress in the first part, choosing correctly with no mistakes at all.

I started to make mistakes in the middle part, and my results in the last part looked like guessing. Please keep in mind that this is ABX on very subtle changes I added to the file myself.

Please keep in mind that I'm just sharing my experience here. I have no opinion about ABX test itself.
 
Aug 5, 2017 at 7:21 AM Post #2 of 45
ABX in any form doesn't impose a comparison time limit, or a test duration limit, or any comparison limits at all. You could spread your trials over days, weeks or years. You could listen to any choice for as long as you like. You could even take a break for as long as you like, any time you like.

The stress you refer to was self-imposed.
 
Aug 5, 2017 at 9:43 AM Post #3 of 45
I agree winX, it certainly is a fatigue factor in your setup.

At some point, even working in a studio, the ears are "done" for the day.
 
Aug 5, 2017 at 10:23 AM Post #4 of 45
I agree winX, it certainly is a fatigue factor in your setup.

At some point, even working in a studio, the ears are "done" for the day.
Again, fatigue is not an inseparable aspect of ABX testing, especially if done by an individual with his own comparator. If a tester feels fatigued or unable to make a decision but continues trials anyway, he's biasing the results. A properly administered DBT controls for all biases, including tester fatigue.

If you drive without rest until you have an accident, do you blame the road or your choice to drive until your abilities are impaired?
 
Aug 5, 2017 at 11:03 AM Post #5 of 45
I agree winX, it certainly is a fatigue factor in your setup.

At some point, even working in a studio, the ears are "done" for the day.

Exactly. And there's nothing you can do but indulge yourself with alcohol after that happens. :)

Again, fatigue is not an inseparable aspect of ABX testing, especially if done by an individual with his own comparator. If a tester feels fatigued or unable to make a decision but continues trials anyway, he's biasing the results. A properly administered DBT controls for all biases, including tester fatigue.

If you drive without rest until you have an accident, do you blame the road or your choice to drive until your abilities are impaired?

More like you should change the way you handle the measurements rather than ignoring uncontrollable real-world factors. There's no fatigue and stress gauge to monitor, like in games, that would let me take a breather after a certain threshold. I find ABX isn't the thing for me to measure very subtle changes, even though I'm confident in my hearing: A B X Y times 16 = a minimum of 64 listening tests with a small gap of silence in between. I swear even final exams didn't stress me out as much as this one.

Of course, some people with great endurance and stamina may not notice this as an issue, and I respect everyone who can pass an ABX test wonderfully. It worked fine for me when comparing MP3 and FLAC files, but with the small effects I added with foobar DSP it sometimes may not.

Regards,
Keetakawee
 
Aug 5, 2017 at 11:09 AM Post #6 of 45
Exactly. And there's nothing you can do but indulge yourself with alcohol after that happens. :)

More like you should change the way you handle the measurements rather than ignoring uncontrollable real-world factors. There's no fatigue and stress gauge to monitor, like in games, that would let me take a breather after a certain threshold. I find ABX isn't the thing for me to measure very subtle changes, even though I'm confident in my hearing: A B X Y times 16 = a minimum of 64 listening tests with a small gap of silence in between. I swear even final exams didn't stress me out as much as this one.

Of course, some people with great endurance and stamina may not notice this as an issue, and I respect everyone who can pass an ABX test wonderfully. It worked fine for me when comparing MP3 and FLAC files, but with the small effects I added with foobar DSP it sometimes may not.

Regards,
Keetakawee
You're not getting it. There is no stress or fatigue built into the ABX/DBT test methodology except what you added by how you chose to do it.

You're doing it wrong and therefore adding an uncontrolled bias.
 
Aug 5, 2017 at 11:11 AM Post #7 of 45
You're not getting it. There is no stress or fatigue built into the ABX/DBT test methodology except what you added by how you chose to do it.

You're doing it wrong and therefore adding an uncontrolled bias.

Oh? Can someone control the test and make me do it right? How can they tell that I should take a break and come back to do it again? Get real, man: not everyone can do an ABX test without feeling stressed, and that can dull your perception to some extent.

Regards,
Keetakawee
 
Aug 5, 2017 at 11:31 AM Post #8 of 45
I tried an ABX test today in foobar2000 on very subtle changes: I added subtle DSP effects that I can normally notice. I think the result is interesting, so I decided to share it here.

A while into the ABX test, stress and fatigue started to build up. At some point, the difference became harder and harder to notice.

I can normally tell those two files apart easily, but A B X Y selection takes more than listening skills. It needs high endurance and patience.

After performing the test, the result was as I predicted. If we divide the test into 3 parts, I made fine progress in the first part, choosing correctly with no mistakes at all.

I started to make mistakes in the middle part, and my results in the last part looked like guessing. Please keep in mind that this is ABX on very subtle changes I added to the file myself.

Please keep in mind that I'm just sharing my experience here. I have no opinion about ABX test itself.


let's start with the elephant in the room, the bolded parts I quoted. aside from the abx test, how do you decide what you can or cannot notice? if it's a sighted test, you know how confirmation bias alone makes it irrelevant to us and should make it irrelevant to you. if it's another blind test, which one is it? what is the audible difference? do you also get tired after the same amount of time repeating it?

and as mentioned above, there is nothing in the ABX procedure forcing you to do 20 runs in one go. foobar's abx might seem that way because once you've started you won't keep it open for days, but you can just add up the results of doing 3 runs each time and do the stats yourself once you've reached whatever number you decided on at the start. that's the only requirement: if you decided the whole experiment would contain 20 runs, then you must end it at 20. not when you're bored, not when you get a result you like. but how you decide to space those trials is up to you. of course we can get tired of doing something with extreme focus, and even if it's not about getting tired, we just don't all have the same attention span. so it's up to you to find out how long you're good to go and then set sessions shorter than that.
also, just like anything in life, if you do it often enough, stress will go down or at least you'll get used to the experience. if every time I drove a car I felt like I did the first few times, it would be a nightmare. with time and practice the brain goes "ok I've done that stuff many times, I'm not dead, I'm not even surprised anymore, guess I was panicking for no reason". it's after all a pattern machine.
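
to put a number on the "you must end it at 20" rule above, here is a rough sketch, not anything foobar's comparator does itself. it assumes a listener who is purely guessing (50% per trial), a one-sided binomial test, and a 0.05 threshold; the function names and the threshold are my own choices for illustration. it compares stopping at exactly 20 trials with checking the score after every trial and quitting as soon as it looks good.

```python
from math import comb


def p_value(correct, trials):
    """One-sided binomial p-value: chance of >= `correct` hits by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def guess_pass_rate_fixed(trials, alpha=0.05):
    """Chance that a pure guesser 'passes' when the trial count is fixed in advance."""
    return sum(
        comb(trials, k) for k in range(trials + 1) if p_value(k, trials) <= alpha
    ) / 2 ** trials


def guess_pass_rate_with_peeking(max_trials, alpha=0.05):
    """Chance that a pure guesser 'passes' when the score is checked after
    every trial and the test is stopped the first time p <= alpha."""
    alive = {0: 1.0}  # correct count -> probability of reaching it without stopping
    passed = 0.0
    for n in range(1, max_trials + 1):
        nxt = {}
        for k, pr in alive.items():
            for k2 in (k, k + 1):  # miss or hit, each with probability 0.5
                nxt[k2] = nxt.get(k2, 0.0) + pr * 0.5
        alive = {}
        for k, pr in nxt.items():
            if p_value(k, n) <= alpha:
                passed += pr  # stop here and declare success
            else:
                alive[k] = pr
    return passed


if __name__ == "__main__":
    print("fixed at 20 trials:", guess_pass_rate_fixed(20))
    print("peeking up to 20  :", guess_pass_rate_with_peeking(20))
```

the peeking number can never come out lower than the fixed one, because every run that passes at trial 20 also passes under peeking, plus all the lucky streaks along the way. that is the reason the total has to be decided before the first trial.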

now something I believe I must clear up too, even if it's the captain obvious moment of the day. it's a statistical tool, and you cannot treat it as if it's not one. the idea that because you succeeded 2 or 3 times, you should succeed all the time in a test is wrong. almost nothing in life follows that rule. statistics give us a degree of confidence, and it's normal for something harder to notice to result in lower confidence. if I abx a loud 1 kHz tone against a quiet song, even if I get tired, stressed, and mix up the X/Y pressing from time to time, I will end up with a very high level of confidence that I can tell them apart.
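
as a small illustration of the "degree of confidence" idea above, the usual way to score an ABX run is a one-sided binomial test against guessing. this is only a sketch: the function name is made up, and it assumes each trial is an independent 50/50 guess when there is no audible difference.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials`
    by pure guessing (one-sided binomial test, p = 0.5 per trial)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


if __name__ == "__main__":
    # 3 out of 3 still happens by luck one time in eight
    print("3/3  :", abx_p_value(3, 3))
    # a few mistakes over a longer run can still be strong evidence
    print("12/16:", abx_p_value(12, 16))
    print("16/16:", abx_p_value(16, 16))
```

so a short perfect streak says little, while a longer run with some slips from fatigue or wrong button presses can still give a low probability of being luck.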
 
Aug 5, 2017 at 11:31 AM Post #9 of 45
[1] More like you should change the way you handle the measurements rather than ignoring uncontrollable real-world factors. ...
[2] Oh? Can someone control the test and make me do it right? How can they tell that I should take a break and come back to do it again?

1. You don't change the way you handle the measurement, what you do is NOT ignore the uncontrollable factors like fatigue!

2. You don't need someone else, you just need to be able to recognise when fatigue is affecting your judgement. That's something pretty much all adults have to learn to do! If you don't, what are you going to do, wait until you drive into an oncoming truck before you take a break? And then what, undergo surgery (hopefully by a non-fatigued surgeon!), buy a new car and smash into another truck?

G
 
Aug 5, 2017 at 11:31 AM Post #10 of 45
1. Oh? Can someone control the test and make me do it right? 2. How can they tell that I should take a break and come back to do it again? 3. Get real, man: not everyone can do an ABX test without feeling stressed, and that can dull your perception to some extent.

Regards,
Keetakawee

1. Sure.

2. That's up to you. Nobody else would know that.

3. Everyone could, but few do. Then, out of ignorance they blame the test methodology rather than understanding the biases they themselves have introduced.
 
Aug 5, 2017 at 12:04 PM Post #11 of 45
Claiming that I should find my own approach to deal with fatigue, without being specific about methods to deal with it, sounds like ignorance to me. I'm sorry for not being strong enough to deal with fatigue myself. I can tell A B X Y apart quite easily for the first 5 rounds, but I'm not sure after that.

By the way, comparing music listening to truck driving and surgery is hilarious. Nice puns, man. :)

Regards,
Keetakawee
 
Aug 5, 2017 at 12:16 PM Post #12 of 45
Claiming that I should find my own approach to deal with fatigue, without being specific about methods to deal with it, sounds like ignorance to me.

You can't "deal with fatigue", you just have to recognise it and take a break, same as when driving. Alternatively, don't try to recognise it, just "ignore" it, bias your ABX testing and probably kill yourself in a road traffic accident!

G
 
Aug 5, 2017 at 12:55 PM Post #13 of 45
When I tried ABX testing with differences I could just barely notice, it took me around a week to get enough testing done. I chose to focus only on really short samples where I believed the differences were most apparent. I did 5 takes at a time, once a day, and I never got a clear 5/5. Some days, no matter how hard I tried, I couldn't get any testing done simply because I couldn't hear the difference, despite not changing anything I'm aware of. With a homemade ABX test, there are still a lot of uncontrolled things going on, and I believe that's still true to some extent of more professional tests as well.
 
Aug 5, 2017 at 1:40 PM Post #14 of 45
You did get results! 5/5 is not the goal; ABX test results are statistical data. You are looking for results significantly better than random guessing (50%), which would indicate an audible difference.

5 trials is insufficient for any degree of statistical accuracy, but you could combine all trials from your total testing and get better accuracy.
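
A rough sketch of what combining the trials could look like, assuming a one-sided binomial test against guessing. The daily scores below are invented purely for illustration (they are not anyone's actual results), and the function names are hypothetical. Pooling like this is only fair if the total number of sessions was decided before testing began.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def pool_sessions(sessions):
    """Add up (correct, trials) pairs from separate sittings into one total."""
    total_correct = sum(correct for correct, _ in sessions)
    total_trials = sum(trials for _, trials in sessions)
    return total_correct, total_trials


# Made-up week of 5-trial sittings, never a clean 5/5 (illustration only)
hypothetical_week = [(4, 5), (3, 5), (4, 5), (4, 5), (3, 5), (4, 5), (4, 5)]

if __name__ == "__main__":
    # A single 4/5 sitting is weak evidence on its own
    print("one sitting, 4/5:", abx_p_value(4, 5))
    correct, trials = pool_sessions(hypothetical_week)
    print(f"pooled, {correct}/{trials}:", abx_p_value(correct, trials))
```

No single sitting here would convince anyone, but the pooled total is very unlikely to come from guessing, which is the point of treating the results as statistical data.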
 
Aug 5, 2017 at 1:54 PM Post #15 of 45
When it comes to subtle differences in home audio, most of the time I find that if you have to go to extremes in blind testing to determine whether it's audible or not, it probably doesn't matter anyway. There are always things you can work on that actually do make a clear difference. If you enjoy testing, then the procedure can be fun though.
 