The ear has a lot of harmonic distortion (HD) because it has lots of moving parts that vibrate inside. The ear actually generates noise by itself when there is no signal (think of the ear as an active amplifier that has idle noise). The ear has so much HD that it easily masks the harmonic distortion of tubes for the most part. However, the ear's harmonic distortion is mainly low-order and decays rapidly as the harmonic order n gets higher. A solid-state (SS) amp may have lower overall HD, but if it has even a tiny bit of higher-order HD, that could be quite audible. Don't forget the ear has a 130 dB dynamic range and can hear 20 dB into the noise when there is no masking. Triode vacuum tubes may have a lot of HD, but it is mainly low-order stuff and easily masked by the ear itself. Speakers have lots of HD too, but it is also low-order stuff. Therefore even the best speakers appear to have a lot of distortion on paper but can still sound very transparent. Digital sources generally measure with very little HD, but those tiny jitter artifacts that are not harmonically related to the signal can be quite detrimental as well, even though they may be only -100 dBFS.
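To put the dB figures above in perspective, here is a quick conversion sketch. It assumes the numbers are amplitude ratios (sound pressure or voltage), so the 20*log10 convention applies; the function and variable names are just for illustration.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a level in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10 ** (db / 20)

def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude ratio back to dB."""
    return 20 * math.log10(ratio)

# A jitter artifact at -100 dBFS, as a fraction of full scale.
jitter_fraction = db_to_amplitude_ratio(-100)

# The 130 dB dynamic range quoted for the ear, as a loudest-to-quietest ratio.
ear_range_ratio = db_to_amplitude_ratio(130)

print(f"-100 dBFS is {jitter_fraction * 100:.4f}% of full scale")
print(f"130 dB is an amplitude ratio of about {ear_range_ratio:.3g} to 1")
```

So -100 dBFS is one part in 100,000 of full scale (0.001%), and 130 dB is an amplitude ratio of roughly 3 million to 1.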
Now the myth about intermodulation distortion (IMD). The ear has a lot of IMD, too. In fact, IMD can be mathematically calculated from HD, both for the ear and for an amp without feedback. IMD produced by amplifiers without feedback is masked by the ear in the same manner as HD. What happens when there is feedback? Then there is transient IMD, which is signal dependent and not harmonically related to the input. The more feedback, the worse the transient IMD, and these artifacts are quite detrimental. It's not that feedback is all evil. Without feedback no amplification can be linear. Tubes are a lot more linear than transistors in open-loop circuits, so less feedback can be used in tube amps in general. There are also tube and SS designs that use no or minimal global feedback, but there is still local feedback anyway.
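Here is a rough sketch of the "IMD can be calculated from HD" point. It assumes the feedback-free amp can be modeled as a memoryless polynomial transfer curve, which is a simplification; the coefficients `A2` and `A3` are made up for illustration. The same coefficients that set the 2nd and 3rd harmonics in a single-tone test also set the difference-tone products in a two-tone test.

```python
import cmath
import math

N = 1024             # samples per analysis window
F1, F2 = 50, 60      # tone frequencies in DFT bins (whole cycles per window)
A2, A3 = 0.05, 0.01  # hypothetical 2nd- and 3rd-order coefficients

def amp(x):
    """Memoryless, feedback-free transfer curve: y = x + A2*x^2 + A3*x^3."""
    return x + A2 * x**2 + A3 * x**3

def tone(k, amplitude=1.0):
    """One window of a cosine that completes exactly k cycles."""
    return [amplitude * math.cos(2 * math.pi * k * n / N) for n in range(N)]

def bin_amplitude(samples, k):
    """Amplitude of the component at bin k, via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / N)
              for n, s in enumerate(samples))
    return 2 * abs(acc) / N

# Single-tone test: harmonic distortion products at 2*F1 and 3*F1.
single = [amp(x) for x in tone(F1)]
hd2 = bin_amplitude(single, 2 * F1)
hd3 = bin_amplitude(single, 3 * F1)

# Two-tone test: intermodulation products at F2-F1 and 2*F1-F2.
two = [amp(a + b) for a, b in zip(tone(F1), tone(F2))]
imd2 = bin_amplitude(two, F2 - F1)
imd3 = bin_amplitude(two, 2 * F1 - F2)

print("IMD2 / HD2 =", round(imd2 / hd2, 6))
print("IMD3 / HD3 =", round(imd3 / hd3, 6))
```

For this polynomial model the ratios come out at 2 and 3, which is what the trigonometric expansion predicts for equal-amplitude tones. A real amp with memory or feedback will not follow this simple relationship, which is the point of the feedback discussion above.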
We can measure the "quantity" of amplifier distortion easily and accurately with FFT analyzers, but we can't measure the "quality." Scientific instruments can measure more accurately than the ear can hear, and that's a fact (come on, scientists are even measuring gravitational waves nowadays, so they definitely can measure amplifier distortion). But only the ear can determine the quality. There is already living proof that some distortion makes things sound better, not worse. In some state-of-the-art digital equalizers, the CPUs are so powerful that they can alter the frequency response (FR) without any mathematical artifacts. But engineers still find these equalizers sound worse than the classic tube equalizers. So they measure the distortion of tube equalizers and program that distortion into their math algorithms. Lo and behold, the newest digital EQs achieve amazing sound by intentionally adding the right distortion, not by eliminating it.
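A small sketch of the quantity vs. quality problem: two made-up devices with the same THD number but completely different harmonic content. `device_a` and `device_b` are hypothetical transfer curves built from Chebyshev polynomials so that a full-scale sine produces exactly one extra harmonic; they do not model any real amp.

```python
import cmath
import math

N = 1024    # samples per analysis window
FUND = 16   # fundamental frequency in DFT bins (whole cycles per window)

def bin_amplitude(samples, k):
    """Amplitude of the component at bin k, via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / N)
              for n, s in enumerate(samples))
    return 2 * abs(acc) / N

def harmonic_spectrum(samples, max_order=10):
    """Level of each harmonic relative to the fundamental, keyed by order."""
    fund = bin_amplitude(samples, FUND)
    return {order: bin_amplitude(samples, order * FUND) / fund
            for order in range(2, max_order + 1)}

def thd(spectrum):
    """Total harmonic distortion as the RMS sum of the harmonic levels."""
    return math.sqrt(sum(level**2 for level in spectrum.values()))

def device_a(x):
    """Hypothetical device adding 1% 2nd harmonic (Chebyshev T2)."""
    return x + 0.01 * (2 * x**2 - 1)

def device_b(x):
    """Hypothetical device adding 1% 7th harmonic (Chebyshev T7)."""
    return x + 0.01 * (64 * x**7 - 112 * x**5 + 56 * x**3 - 7 * x)

sine = [math.cos(2 * math.pi * FUND * n / N) for n in range(N)]
spec_a = harmonic_spectrum([device_a(x) for x in sine])
spec_b = harmonic_spectrum([device_b(x) for x in sine])

print(f"device A: THD = {thd(spec_a) * 100:.2f}%, dominant order = {max(spec_a, key=spec_a.get)}")
print(f"device B: THD = {thd(spec_b) * 100:.2f}%, dominant order = {max(spec_b, key=spec_b.get)}")
```

Both devices measure 1% THD, yet one puts all of it in the 2nd harmonic and the other in the 7th. A single THD figure can't tell them apart, and going by the masking argument earlier, they would not be expected to sound alike.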
Is this saying that a directly heated triode amplifier with zero feedback sounds the best? Not really. But many say such an amp can have 10% 2nd-order HD and still sound very transparent. Every amp design has shortcomings. An amplification device that is totally distortionless, immensely powerful and super linear simply does not exist, and the "quality" of distortion is still poorly understood.
If the history of audio has taught us anything, it is that any reasonable amplification device can be used to build great sounding amplifiers, including all kinds of tubes, transistors, ICs or even fast-switching transistors (class D and derivatives). Fighting about tube vs SS makes no sense.