It's not as if they're tampering with the codecs. The differences come down to resources, buffers, background processes, or subtle effects of code and registry configuration. Ears are just very sensitive, especially when you're dealing with something that's inherently revealing and relatively neutral.
I wouldn't restrict myself to a hybrid, but it seems you're not strictly doing so. I'd consider a custom like the CTM CT-300, but to stay within your price and requirements, maybe the DN-2000J. I haven't heard it myself, but a few people here whom I trust consider it a clear upgrade over the 2000 and the best of the moderately priced hybrids.
There's no great way other than keeping the volume down. Inline resistors were common in the days of receivers, but they raised the output impedance. They were basically there to drop noise and gain. You could do the same for single-driver dynamic headphones. Your best bet is to find a speaker switch that includes a headphone jack to accomplish this.
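
To put rough numbers on the inline-resistor trade-off, here's a quick sketch. It treats the headphone as a plain resistive load, which is a simplification that's closest to true for single-driver dynamics. The 68 ohm resistor and 32 ohm headphone are made-up example values, and `series_resistor_effect` is just an illustrative helper, not anything from a real tool.

```python
import math

def series_resistor_effect(r_series_ohm, z_load_ohm, r_source_ohm=0.0):
    """Return (effective output impedance in ohms, level change in dB)
    for a resistor wired in series with a headphone.

    Assumes a purely resistive headphone load (a simplification).
    """
    # The headphone sees the amp's own output impedance plus the resistor.
    effective_output_z = r_source_ohm + r_series_ohm
    # Simple voltage divider: fraction of the signal reaching the headphone.
    gain = z_load_ohm / (effective_output_z + z_load_ohm)
    level_change_db = 20 * math.log10(gain)
    return effective_output_z, level_change_db

# Hypothetical example values: 68 ohm resistor, 32 ohm headphone.
out_z, att = series_resistor_effect(68.0, 32.0)
print(f"output impedance: {out_z:.0f} ohm, level change: {att:.1f} dB")
```

Both signal and amp noise drop by the same amount, so you turn the volume up to compensate and the hiss stays lower. The cost is that the headphone now sees a much higher output impedance.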