Quote:
Originally Posted by
hannyjuca 
Not at all, it has a swap delay.
Not only that, but you only see A and B at the very beginning and cannot refer back to them during the test. You see A and B once (so you had better commit them perfectly to memory) and are then shown 20 versions of X, one at a time, for the rest of the test. A proper ABX test would let you refer back to A and B whenever you want.
An ABX test should measure whether the subject can detect a difference and correctly match X with A or B. It should be designed to minimize any other factors that could influence the result.
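To make the "match X with A or B" part concrete, here is a rough Python sketch of how a run of ABX trials could be generated and scored. This is only an illustration under my own assumptions, not how the Sieveking test is implemented: the function names (make_trials, score, guess_probability) are made up for this post, and it assumes each X is independently A or B with equal probability.

```python
import random
from math import comb

def make_trials(n_trials, seed=None):
    """Hypothetical helper: pick the hidden identity of X ('A' or 'B') for each trial."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

def score(trials, answers):
    """Count how many of the subject's answers match the hidden identity of X."""
    return sum(t == a for t, a in zip(trials, answers))

def guess_probability(n_correct, n_trials):
    """Chance of getting at least n_correct right out of n_trials by pure guessing.

    Assumes every trial is an independent 50/50 guess (binomial model).
    """
    favourable = sum(comb(n_trials, k) for k in range(n_correct, n_trials + 1))
    return favourable / 2 ** n_trials

if __name__ == "__main__":
    trials = make_trials(20, seed=1)
    answers = trials[:15] + ["?"] * 5  # a subject who gets exactly 15 of 20 right
    print(score(trials, answers), guess_probability(15, 20))
```

With 20 trials, 15 or more correct answers would happen by pure guessing only about 2% of the time. That number only tells you about the ability to see a difference if nothing else (such as having to remember A and B) is dragging the score down.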
The ABX test by Sieveking introduces a source of error by forcing you to memorize the colors of A and B for the duration of the test. It becomes more of a psychology experiment on color memory than a proper ABX test of whether you can detect a difference between A and B.
The Sieveking ABX test is an example of how not to design a proper ABX test.
There will always be some psychology-experiment component to an ABX test, relating to the memory of audio (or in this case color) and other psychological factors. The goal when designing an ABX experiment should be to minimize those factors, not maximize them. The Sieveking test ends up maximizing the influence of color memory rather than minimizing it.