Help needed: insight about Grado drivers
Frequency response graphs only tell part of the story about how a headphone sounds.  Even adding other measurements, like square wave response and distortion, still gives you only part of the story.
The frequency response graphs don't tell you anything about soundstage (aka headstage) improvements, the quality and characteristics of the treble, midrange, and bass, or other important factors.  You can have every headphone in the SR series measure essentially the same in a frequency response graph, yet each of them can sound different when listening.  Some of the improvements across the various SR series Grados have to do with improved headstage, and that's after accounting for pads by putting bowls on an SR60 when comparing against an SR225, so you're comparing apples to apples.  That's something that doesn't show up in a frequency response graph.

Soundstage does show up, though; it's affected by frequency, time, amplitude, dispersion characteristics (which can be highlighted in the others), and more.  How else do you think things like Dolby Headphone or the Smyth Realizer would work?   Quality of treble, midrange, and bass?  I'm guessing you're referring to spectral decay and possibly distortion?  Those, too, measure very much the same across the SR series on ryumatsuba's site, even between the original and i versions.
And that's the issue: it's not just that one measurement is alike . . . it's that they're all almost identical, save for what could be considered margin of error in both driver manufacturing and measuring.
I'm not saying you aren't going to notice differences between Grado "A" and Grado "B", though.  In fact, I'd expect it to some extent.  However, from what I've seen in various measurements, it's not for the reasons you allude to.  Look at those graphs: at some points you can see deviations of 3dB or more.  The problem is that these probably aren't engineered differences (not specifically), but just differences in yield or in fit on the testing gear.  In other words, you could probably experience the same differences between multiple units of the exact same model (which to some extent is to be expected).  Different tensions in the headband (which can compress the foam), the position on the head, and just average manufacturing variance can all have an impact.
What I guess I'm saying is that, in my belief, the number on the Grado is largely irrelevant to how it sounds or measures (i.e., going up the line is no guarantee of an "improvement").  The only time I've seen LARGE variances is with the RS, GS, and PS in comparison to the SR.  Even then, we could argue whether those are really "improvements", in spite of them being more expensive.

