I prefer to pull test tracks that use instruments that I play, so my test tracks tend to be more focused than my general listening. I also tend to rotate tracks in and out of my evaluation list pretty frequently, while generally sticking to the same principles for selection. In general, it stops me from getting distracted from the evaluation by the "oh wow, I haven't listened to this in a while" reaction.
I generally start with a set of tracks in which I look for specific characteristics.
[Classical] Avi Avital - Bach - Sonata in E Minor, BWV 1034 - IV. Allegro. The last movement of a flute sonata played on mandolin.
[It's complicated] Chris Thile & Mike Marshall - Live Duets - Carpathian Mt. Breakdown. Mandolin and mandola duet. Good test for instrument separation and accurate timbre.
[Also complicated] Chris Thile & Edgar Meyer - Chris Thile & Edgar Meyer - Farmer and the Duck. This is the closest thing to a bass test I have. Aside from parts of the track dropping down into the lowest regions of the double bass's range, it's also where I'm listening for sloppiness in the bass. Personally, I prefer the double bass to electronic stuff for my bass tests, because I have a better idea of what it should sound like.
[Bluegrass] Chris Thile & Michael Daves - Sleep With One Eye Open - If I Should Wander Back Tonight. Mandolin and guitar with harmonizing male vocals. Good test for balance and male vocals.
[Classical] Alan Civil - Mozart: The Horn Concertos - Horn Concerto No. 3 in E flat, K. 447 - II. Romanze. French horn concerto. An instrument I play with a piece I've also played.
[Classical] Canadian Brass - Canadian Brass Takes Flight - Little Fugue in G Minor. Two trumpets, French horn, trombone, and tuba playing a Bach fugue. Good balance test. The tuba needs to not be overpowering, and the trumpets can't sound too sharp and piercing.
[Classical] Atlanta Symphony Orchestra and Chorus - Verdi: Requiem - II. Dies Irae. Test for epicness. More specifically, I'm looking for the bass drum hits. They must rattle my brain. This is the one instance where I will not tolerate "politeness" in my sound.
[Classical] San Francisco Symphony - Mahler Symphony No. 4 - IV. Sehr Behaglich. This is both my test for female vocals (soprano soloist throughout the piece) and my general test for how symphonic pieces will sound on something.
Other than this kind of stuff, I generally just try a bit of whatever has my attention at the moment to see if the headphones/speakers/whatever really grab me. The stuff I listed above can knock a piece of equipment out of contention, but once I feel I've cut out the options with technical limitations, I'm more comfortable sitting back, being less analytical, and just seeing what I enjoy.
A test for how crappy recordings of music you still enjoy will sound seems like a good idea. However, I already own a pair of HD 650s, so I think I'm set in the "forgiving of bad recordings" department.