That makes sense... and also note that the human voice covers a relatively narrow frequency range, so it wouldn't strain even the limited frequency response of an Edison phonograph.
I would also suggest that they were taking advantage of several known human bias mechanisms.
First off, we humans have a strong tendency to assign things to known and familiar categories. For example, we are all familiar with the saying "when you hear hooves, you think horses, not zebras". What nobody even bothers to mention is that you don't even consider ten-legged robots wearing horseshoes. Instead you choose from among things you are familiar with. When the Edison phonograph debuted, very few people had ever experienced a machine that could reproduce a human voice. Therefore, when hearing a human voice, everyone assumed they were listening to a human singing. Since most of the participants had never experienced any other source of human voice, they were essentially faced with the single option of assuming that they were listening to a human. And, having made that choice, they then became biased to notice details that tended to support that choice, and not to notice details that were dissonant with it.
From their point of view, they never actually even evaluated the situation. What they experienced was a human singer they could see, followed by "something they couldn't see that sounded very much like the human singer". They had no specific reason to suspect that the human singer was no longer singing, and they had no experience whatsoever with anything that might have served as a replacement. Therefore, based on that knowledge, the only logical conclusion was that "the singer was still singing". They essentially had no compelling reason to doubt that, no compelling reason to consider alternatives, and no alternatives to consider even if they had wanted to.
If you or I were to hear an Edison recording today, we would notice all sorts of minor discrepancies, like noise, and speed variations, and even the occasional tick or pop, which to us would represent obvious clues that we were listening to a flawed recording. However, consider someone who had never heard those sorts of flaws before, and furthermore had already decided that they were listening to a human performer.
All of those discrepancies that we would take as obvious clues that we were listening to a recording would have little or no similar meaning for them. To them, that noise might sound like slightly noisy steam heat, the speed variations could be a slightly odd mannerism of the performer, and the ticks and pops might represent noises made by machinery backstage, or by someone in the audience dropping something. However, because we are familiar with recording technology, and the sorts of flaws common in older recording equipment, we would take them as "obvious clues" that we had switched over to a recording.
They would have been "thoroughly primed" to think they were listening to a human singer.
And they would have no experience whatsoever with the clues that might indicate a mechanical reproduction.
Therefore they would have no reason to suspect that the singer had been replaced.
Or, to put it another way, since they had never heard a phonograph before, they had had no opportunity to learn how to tell the difference between a phonograph and a human performer. And, because of that lack of training, they simply overlooked the clues that would have been obvious to a trained observer.
(One might imagine that an aborigine who was totally unfamiliar with modern technology might run screaming from a black and white video of an attacking lion - because, to him, the similarity of the experience to actually being attacked by a lion would far outweigh the differences... and the differences which would be obvious to us would have no clear meaning to him. And, if you asked him later, he would probably reply that the black and white two dimensional moving image "looked pretty much like a real lion to him" and "he had no reason to suspect it wasn't a real lion".)
Back in the teens when recording was in its infancy, Thomas Edison would conduct blind "Tone Tests" in vaudeville theaters. The test consisted of a singer on stage singing a song. The lights would go out making the whole theater pitch black. When the lights came on, the singer would be gone and an Edison Laboratory model C-19 would be singing the song. During the blackout, they would switch off in a pause between verses. Contemporary reports were all favorable. People were astonished that a phonograph could exactly duplicate a human voice.
The problem with discerning live from "Memorex" depends more on the directionality of the sound than it does on the fidelity. The horn of a phonograph projected sound in much the same way a human voice does. The Edison engineers would position the horn so it lined up closely with the singer, so when it handed off, the directionality wouldn't change. The natural acoustics of the vaudeville theater would wrap the same acoustic around the phonograph that it did around the singer. The wall reflections and reverb were all identical because the recording was dry, just like the voice.
I've heard that some antique phonograph fans have recreated Edison Tone Tests and have gotten similar results.