I'm not sure whether you're misusing the term (I suspect so); I'm still trying to get my head around the implications of your test. If you truncate to 8 bits, you'll get truncation error in the 8th bit, and truncation error is correlated with the signal. Applying dither in theory randomises (decorrelates) this error, so what you end up with when using dither is the same amount of error (level) but distributed evenly as white noise rather than as spikes at particular frequencies. I say "in theory" because in practice dither is often applied at a somewhat higher level than the level of the truncation error, particularly in the case of noise-shaped dither.
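Just to illustrate the correlated-spikes vs white-noise point, here's a minimal NumPy sketch. All the specifics are my own assumptions for the demo (floats on a ±1 full scale, a synthetic 997 Hz test tone, floor-truncation to an 8-bit step, ±1 LSB TPDF dither); it's not anyone's actual test procedure. It compares the error spectrum's peak-to-mean ratio with and without dither: plain truncation concentrates the error into distortion spikes, dither spreads it flat.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 48000
n = 1 << 16
t = np.arange(n) / fs
# Assumed test signal: a low-level tone (a few LSBs in size), not bin-aligned.
x = 0.05 * np.sin(2 * np.pi * 997 * t)

lsb = 2.0 / 2**8                               # one 8-bit step over a -1..+1 range

def truncate(sig, dither=0.0):
    """Truncate to the 8-bit grid, optionally adding dither first."""
    return np.floor((sig + dither) / lsb) * lsb

err_trunc = truncate(x) - x                    # error correlated with the signal
tpdf = (rng.random(n) - rng.random(n)) * lsb   # +-1 LSB triangular (TPDF) dither
err_dith = truncate(x, tpdf) - x               # error decorrelated into noise

def peak_to_mean_db(e):
    """Spectral peak-to-mean ratio: high = spikes, low = flat white noise."""
    e = e - e.mean()                           # remove the DC offset of floor truncation
    spec = np.abs(np.fft.rfft(e * np.hanning(len(e))))
    return 20 * np.log10(spec.max() / spec.mean())

print(f"error spectrum peak/mean, truncated: {peak_to_mean_db(err_trunc):5.1f} dB")
print(f"error spectrum peak/mean, dithered:  {peak_to_mean_db(err_dith):5.1f} dB")
```

The truncated error shows a ratio tens of dB higher (distortion harmonics of the tone), while the dithered error's spectrum is essentially flat, even though the overall error level is similar.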
If I'm understanding your test, below -48dB the 8bit padded version will just be digital silence (zeros in the last 8 bits), compared to non-zero data in some of those 8 LSBs in the original: an obvious numerical difference. However, truncation error will be occurring in the 8th bit, and although it isn't obvious numerically (because, like the original, it will contain non-zero values at times), you should get a difference measurable up to -42dB, the same as if you had applied a perfect 1 LSB dither. With noise-shaped dither you'd almost certainly get a difference file which is higher, probably around -38dB, though of course that would depend on the actual settings of the dither applied.

Dither is not such a complex process to get one's head around, but like many things in audio it's a bit of a rabbit hole when we start looking in detail at its practical application. For example, we have to think about the application of dither in quite different terms when reducing bit depth to 16bit as compared to reducing it to 8bit.
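To make the -42dB figure concrete, here's a small sketch of the "difference file" idea under my own assumed conditions (a synthetic uniform-noise stand-in for programme material, floats on a ±1 full scale, plain floor-truncation to 8 bits). The peak of the difference approaches exactly 1 LSB at 8 bits, i.e. 20·log10(1/128) ≈ -42.1 dBFS, which is where that number comes from.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1 << 18
# Assumed stand-in for higher-bit-depth programme material.
x = 0.9 * (2 * rng.random(n) - 1)

lsb = 2.0 / 2**8                   # one 8-bit step over a -1..+1 full scale

trunc8 = np.floor(x / lsb) * lsb   # truncate to 8 bits (zeros in the 8 LSBs)
diff = x - trunc8                  # the "difference file"

peak_db = 20 * np.log10(np.abs(diff).max())
rms_db = 20 * np.log10(np.sqrt(np.mean(diff**2)))
print(f"difference peak: {peak_db:6.1f} dBFS")   # approaches 1 LSB, about -42.1 dBFS
print(f"difference RMS:  {rms_db:6.1f} dBFS")    # around -47 dBFS for floor truncation
```

The RMS sits a few dB below the peak, which is why the floor is usually quoted near -48dB while the difference remains measurable up to about -42dB.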