Introduction and test protocol:
Pierre, Frederic (Head-fi member Superfred21), Karim and myself (Eric, Head-fi member Eric65) gathered at my place on 7-8 September 2013 to perform a shootout of 4 electrostatic amplifiers driving the Stax SR009 headphone, namely: Eddie Current Electra; Audiovalve RKV mk2 + Wee energizer from Woo Audio; Stax SRM 727II; Stax SRM 007t2. In order to remain as objective as possible, the tests were performed blind, with rigorous level matching of all the amplifiers.
This comparative blind test of 4 amplifiers lasted 4 hours and involved 2 testers and 2 operators, who switched the gear connections and ensured strict adherence to the blind protocol.
Level matching was performed with a 500Hz test tone, using a voltmeter at the amplifier outputs with a precision of 0.1V (corresponding to about 0.1dB in equivalent SPL) and a sound level meter to check that the actual listening level after plugging in the headphone was consistent. The voltage level was set to 6V and verified at the beginning, during, and at the end of the test, with maximum variations of 0.1V observed.
While the 4 of us had some preliminary listening sessions on Saturday, only the 2 most experienced members (Pierre and Frederic) participated in the blind test. 5 music samples were chosen for their good sound quality and because they represent a wide range of music genres. Both testers were very familiar with the recordings. Both testers used their own SR009 headphone for all the tests (after noticing on Saturday that the 3 SR009 sounded somewhat different from each other). Due to time constraints, each recording was listened to for a maximum of 3 minutes:
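To put the voltage tolerance in perspective, here is a minimal sketch that converts a voltage deviation into a level difference in dB. It assumes that the headphone's sound pressure scales linearly with drive voltage, which is my assumption and not something measured during the meet; the function and constant names are illustrative only.

```python
import math

def level_difference_db(v_measured: float, v_reference: float) -> float:
    """Level difference in dB between two RMS voltages (20 * log10 of the ratio)."""
    return 20.0 * math.log10(v_measured / v_reference)

REFERENCE_V = 6.0  # target output level at 500Hz, from the protocol
TOLERANCE_V = 0.1  # voltmeter precision / maximum drift observed

# Assumption: SPL tracks drive voltage linearly, so a voltage ratio maps
# directly to an SPL difference.
upper_db = level_difference_db(REFERENCE_V + TOLERANCE_V, REFERENCE_V)
lower_db = level_difference_db(REFERENCE_V - TOLERANCE_V, REFERENCE_V)

print(f"+{TOLERANCE_V} V -> {upper_db:+.2f} dB")
print(f"-{TOLERANCE_V} V -> {lower_db:+.2f} dB")
```

Under that assumption, a 0.1V deviation around 6V works out to a bit under 0.15dB in either direction, which is in line with the "about 0.1dB" figure quoted above.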
- Melody Gardot : The absence (Pop)
- Jazz at the Pawnshop (Jazz)
- François Couperin : Apothéoses (Spoken male voice + Baroque orchestra)
- Lalo : Symphonie Espagnole (Symphonic orchestra + violin)
- Barbara : Nantes (Female voice, speaking / singing)
Special attention was paid to preventing the testers from recognizing the gear they were listening to: they never saw the gear, the headphone cables were never tugged, and dummy cables and a dummy headphone were used to simulate plugging / unplugging during each amplifier switch.
The amplifiers were evaluated over the following 6 criteria, some being very objective but others (such as neutrality and overall musicality) being more influenced by the listener's taste, hence the use of 2 testers this time:
- Soundstage & imaging
- Tone, timbre
- Perceived tonal balance (which sounds most neutral)
- Overall musicality
The amplifiers were evaluated in pairs (A/B test), each tester judging which was better for each of the 6 criteria above. The evaluations were repeated 5 times, once per recording. While the ideal comparison would have involved 6 combinations (A-B, A-C, A-D, B-C, B-D, C-D), it was not possible to do so with all 5 recordings. We thus opted for a "reference" amplifier that would be compared to the others (A-B, A-C, A-D). Using this protocol, the listeners could always go back to a reference sound signature, which made the A/B test a bit easier on them. The main disadvantage is that amplifier A was listened to 3x more than the others, so it's not a completely fair comparison between all 4 amplifiers.
The reference amplifier (A) was only known to the 2 operators (myself and Karim). The testers were made aware only after the test ended. It should be noted that neither was able to guess with certainty which amplifier it was at the end of the 4 hours of listening test.
The reference amplifier wasn't chosen randomly, as the goal was to pick a unit that did not sound drastically different from the 3 others (i.e. neither much better nor much worse). Assuming amplifiers relying on valves could have a distinct sound signature from the solid state device, it was decided that a tube amp would be used as reference (3 out of the 4 tested amps are valve-based designs). We also did not want to choose the amp that was assumed to be the best of the lot (Electra) or the worst (SRM 007t) going in (based on previous non-blind comparisons, including those made on the first day of this meet). On the day of the blind test (2nd day), the amplifier used as "reference" (A) was the RKV-Wee combination.
In practice, for each sound track, the tester would fill in a test grid with all 6 criteria listed and select which amplifier (A or B/C/D) was better. The order of passage of the B/C/D amplifiers was varied across the 5 tunes played but, as stated above, the reference amplifier A would systematically come back at every other listen to act as a reset. Each tune was thus played 6 times in total (A > [B/C/D] > A > [B/C/D] > A > [B/C/D]). Each of the B/C/D amplifiers was listened to 5 times (5 tunes) and the A amplifier 15 times.
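The listening sequence described above can be sketched as follows. This is only an illustration: the post does not say how the B/C/D order was varied from tune to tune, so the seeded random shuffle is my assumption, and `build_schedule` and the short tune labels are hypothetical names.

```python
import random
from collections import Counter
from itertools import combinations

AMPS = ["A", "B", "C", "D"]  # A is the reference amplifier
TUNES = ["Gardot", "Pawnshop", "Couperin", "Lalo", "Barbara"]

def build_schedule(tunes, reference="A", others=("B", "C", "D"), seed=0):
    """Return a list of (tune, amp) listens: A > x > A > y > A > z for each tune."""
    rng = random.Random(seed)  # assumption: the order was varied at random
    schedule = []
    for tune in tunes:
        order = list(others)
        rng.shuffle(order)
        for amp in order:
            schedule.append((tune, reference))  # reference acts as a reset
            schedule.append((tune, amp))
    return schedule

schedule = build_schedule(TUNES)
listens = Counter(amp for _, amp in schedule)
round_robin_pairs = list(combinations(AMPS, 2))  # the 6 pairings we could not afford

print(f"{len(round_robin_pairs)} pairings in a full round robin, 3 with a reference")
print(f"{len(schedule)} listens in total: {dict(listens)}")
```

Counting the listens this way reproduces the figures above: 6 listens per tune, 15 for the reference and 5 for each of the other amplifiers.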
For each (A / ?) comparison and musical passage, the 2 testers had to say out loud (note from Arnaud: I don't know why it was done this way, as this is bound to influence the testers' choices) which amp was better for each of the 6 criteria: a black cross for reference amp "A", a red cross for amp "?", or a green cross when the tester could not tell the 2 amps apart.
Because amplifier A (RKV+Wee combo) was listened to 3x more often than the other amps, its score (number of times it was picked) was divided by 3 for the "weighted" results. Both weighted and raw scores are presented below.
Note finally that the protocol listed above was debated over 3 weeks on the French forum HCFR (note from Arnaud: French people love to argue and debate to death, we certainly had a fair number of discussions this time too ;) ). Although far from perfect, this test protocol was retained as the most time effective, and fortunately so, because much less could be achieved than originally planned during this weekend meet.
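As a rough sketch of the bookkeeping this implies, the number of crosses collected at each level follows directly from the protocol. The variable names are mine, and the single filled-in comparison at the end is a made-up placeholder, not data from the test.

```python
from collections import Counter

N_CRITERIA = 6
COMPARISONS = ["A/B", "A/C", "A/D"]
N_RECORDINGS = 5
TESTERS = ["Pierre", "Frederic"]

# One cross per criterion and per comparison on each tester's grid.
crosses_per_grid = N_CRITERIA * len(COMPARISONS)
crosses_per_criterion = N_RECORDINGS * len(COMPARISONS) * len(TESTERS)
crosses_per_comparison = N_CRITERIA * len(TESTERS) * N_RECORDINGS
crosses_total = crosses_per_grid * N_RECORDINGS * len(TESTERS)

def tally(crosses):
    """Count crosses by colour: black = reference, red = other amp, green = no difference."""
    return Counter(crosses)

# Hypothetical example of one tester's 6 crosses for one A/? comparison.
example = tally(["black", "red", "red", "green", "black", "red"])

print(crosses_per_grid, crosses_per_criterion, crosses_per_comparison, crosses_total)
print(dict(example))
```

That gives 18 crosses per grid, 30 per criterion, 60 per A/? comparison, and 180 over the whole test.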
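For illustration, the weighting amounts to the small calculation below. The counts are placeholders I made up to show the arithmetic, not the actual results, and `weighted_scores` is a hypothetical helper name.

```python
def weighted_scores(raw_scores, reference, listens_ratio=3):
    """Divide the reference amp's count by how many times more often it was heard."""
    return {
        amp: (count / listens_ratio if amp == reference else float(count))
        for amp, count in raw_scores.items()
    }

# Placeholder counts (NOT the measured results): times each amp was picked.
raw = {"A": 30, "B": 9, "C": 8, "D": 3}
weighted = weighted_scores(raw, reference="A")

for amp in raw:
    print(f"{amp}: raw {raw[amp]:>2}, weighted {weighted[amp]:.1f}")
```

Only the reference amplifier's count changes; the other three are left as they are, since each of them was heard the same number of times.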
I start with the raw data for each recording, i.e. the sum of the crosses for all the 3 A/? comparisons, 18 crosses total (6 criteria x 3 A/? comparisons):
The following table sums the crosses over both testers and all 5 recordings, but for each distinct criterion (5 recordings x 3 A/? comparisons x 2 testers = 30 crosses total for each criterion):
The following tables present the total number of crosses for each A/? comparison (6 criteria x 2 testers x 5 songs = 60 crosses in total) along with an unweighted % result:
Below, the results are presented in weighted format. In particular, the score of the RKV-Wee combo is divided by 3 because it is statistically represented 3x more than all the other amps due to the test protocol employed.
First, the weighted results for each recording (sum of the crosses for all the 3 A/? comparisons, the RKV-Wee score being 1/3 of the true number):
Next, the sum of the crosses over both testers and all 5 recordings, but for each distinct criterion (5 recordings x 3 A/? comparisons x 2 testers = 30 crosses total for each criterion, the RKV-Wee score being 1/3 of the true number):
Having participated in these tests as one of the 2 operators, I can guarantee the accuracy of these raw data results and assure you of the following 2 essential points of the procedure:
1. The procedure was rigorously followed: the reference (A) and B/C/D amps were kept secret until the end of the test. At the end of the test, one of the testers (Pierre) thought the reference amplifier was the SRM727 and the other tester (Frederic) hesitated between the RKV-Wee and SRM007t.
2. The amplifiers were rigorously level matched, using a 500Hz tone set at 6V +/-0.05V. An SPL meter was used to verify that the levels were matched at +/-0.25dB.
My personal opinion: subjectively, during non-blind yet level matched testing, the Electra (a beautiful looking amp) seems to be slightly ahead of the RKV-Wee combo. Furthermore, the SRM727 amp is not enjoyable with the 009 at "high" SPL levels. In particular, during initial tests on the 1st day, the 727 amp would sound hard during dynamic peaks for one of the female voice recordings (Barbara - "Nantes" tune), which is rather unusual for this particular recording. In this situation, the other amplifiers (including the SRM007t) all subjectively came ahead of the SRM727 amp (note from Arnaud: this is along the same lines as the observations posted by Tyll Hertsens in the Inner Fidelity article).
The SRM727-SR009 combo is thus very good (even excellent) as long as I keep it at a listening level I personally qualify as low to moderate, but it can sound edgy / hard on female voices at higher listening levels.
Objectively (raw and especially weighted results from the blind test on day 2), it would seem the RKV-Wee combo comes ahead (which may surprise at first), followed very closely by the Electra and then the SRM727, the SRM007t amp being very far behind. However, the differences between the RKV-Wee, the Electra, and the SRM727 are not statistically significant and, from this test, we cannot clearly establish that any one of these 3 amps came ahead. On the other hand, the position of the 007t comes out loud and clear from the tests: even the weighted results show it as very far behind.
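For readers who want to check significance themselves, one possible approach (not necessarily the one behind the statement above) is an exact two-sided sign test on the crosses of a single A/? comparison, leaving out the "no difference" crosses. The sketch below is my own; the counts passed to it are placeholders, not the measured results.

```python
from math import comb

def sign_test_p_value(wins_a: int, wins_b: int) -> float:
    """Exact two-sided binomial test against p = 0.5, ties already excluded."""
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Placeholder counts (NOT the measured results), out of 60 crosses minus ties.
print(f"close split  28 vs 25: p = {sign_test_p_value(28, 25):.3f}")
print(f"clear split  50 vs 10: p = {sign_test_p_value(50, 10):.2e}")
```

Keep in mind that this treats every cross as an independent trial, which is optimistic here: the 6 criteria are judged on the same passage by the same tester, and the answers were given out loud, so real p-values would be less favorable than this simple test suggests.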
My personal explanation for why the RKV-Wee fares very well against all the other amps in direct comparison is that, even though technically it may not be the most transparent of the bunch, it synergizes rather well with the SR009. In particular, the 009 headphone is not free of issues (some find that it lacks foundation and is artificially bright).
Other equipment used:
Drive: Audiomat D1 linked to the DAC in AES/EBU (used for 1 of the CDs as well as to play the 500Hz tone for level matching)
D/A Converter: PS Audio Perfect Wave DAC used with a NAS drive + tablet to select the tunes (note from Arnaud: the drive was thus not used during the blind tests)
Note that we also had another DAC (TotalDac D1 dual), and while another objective of the meet was to perform a blind shootout of 3 DACs (PS Audio, Audiomat, TotalDac), we ran out of time...
- Eddie Current Electra with PSVANE tubes
- Audiovalve RKV mark 2 + WooAudio Wee energizer using a custom adapter cable between the RKV headphone out and the Wee HP input (the RKV impedancer thus wasn't used)
- Stax SRM 727 II (plugged into PS Audio PowerPlant P3)
- Stax SRM 007t2 (plugged into PS Audio PowerPlant P3)
High quality power cables and interconnects were used. All amplifiers were connected to the PW DAC in balanced mode, except for the RKV, which only has unbalanced (single-ended) inputs.
Here are the prices of the amplifiers in France / USA:
- Stax SRM 727 : 3000 Euros inc. taxes
- Stax SRM 007t : 3500 Euros inc. taxes
- Audiovalve RKV : 2200 Euros inc. taxes + WooAudio Wee energizer (500 USD + import duties)
- Eddie Current Electra : 4000 USD + import duties + ~500 USD PSVANE tube complement
Test setup and participants:
Eddie Current Electra with PSVANES tubes:
TotalDac D1 DAC on top of the Audiomat drive:
Stax 009 on top of the Electra power supply, RKV on the right side:
PerfectWave DAC on top of the Audiomat DAC:
Stax SRM-007t and SRM-727II (black):
WooAudio Wee in the foreground, AudioValve RKV with glass top on the back:
Another shot of the Electra with Stax SR-007mk2 headphones (black color, not used during blind test):
A view of the XLR interconnects and power cables:
PS Audio PowerPlant P3, NAS drive, and what I believe is the Audiomat power supply (silver box):
Karim (one of the operators) and Pierre (one of the testers):
Eric and Pierre:
Pierre and Karim:
Eric, Karim, Pierre:
Frederic wearing the SR009:
Pierre wearing the SR009:
Edit: corrected the names in the captions...