You have a set target level for loudness. A pop music track stuck near digital full scale all the time will be lowered to that target level.
In contrast, a very dynamic track of classical music will have an average level way below digital FS, and often below replaygain's target. So replaygain will want to boost that track to make it feel as loud as the pop music tracks. The problem is that the track might already have some peaks near digital FS. If something peaks at -3dB and replaygain boosts the entire track by 11dB, that peak would land at +8dB, above full scale, and that will clip.
The louder the target for replaygain, the bigger the boost for those very dynamic tracks and the more clipping will result (if nothing else is involved!).
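To put rough numbers on that, here is a small sketch of the arithmetic. The function names and the loudness/peak figures are made up for illustration (they are not from any replaygain tool), and everything is simply in dB relative to full scale.

```python
def replaygain_boost_db(track_loudness_db, target_db):
    """Gain in dB needed to move the track's measured loudness to the target."""
    return target_db - track_loudness_db


def clips(peak_dbfs, gain_db):
    """True if the highest peak would end up above digital full scale (0 dBFS)."""
    return peak_dbfs + gain_db > 0


# Hypothetical dynamic classical track: quiet on average, but with a peak at -3 dBFS.
track_loudness_db = -29
peak_dbfs = -3

# Try three hypothetical targets, from loud to quiet.
for target_db in (-18, -23, -27):
    gain_db = replaygain_boost_db(track_loudness_db, target_db)
    status = "clips" if clips(peak_dbfs, gain_db) else "ok"
    print(f"target {target_db} dB: boost {gain_db:+d} dB, peak at {peak_dbfs + gain_db:+d} dBFS, {status}")
```

With these invented numbers, the two louder targets push the peak over full scale, and only the quietest one leaves the track intact.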
I guess that's the general idea.
Now you might have something to prevent clipping. The most direct way to prevent clipping is to stop boosting when the peaks reach FS level. That will leave you with a very dynamic track that will sound much quieter than all the very compressed tracks, because the boost applied to that track will only be a small one instead of going all the way to the average target loudness.
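That "stop boosting at FS" idea can be sketched like this. It is only a guess at the simplest form of clipping prevention, with a made-up function name, not a description of how any particular player does it.

```python
def clip_safe_gain_db(wanted_gain_db, peak_dbfs):
    """Limit the boost so the highest peak never goes above 0 dBFS."""
    headroom_db = 0 - peak_dbfs  # how far the peak sits below full scale
    return min(wanted_gain_db, headroom_db)


# Dynamic track: replaygain wants +11 dB, but the peak is already at -3 dBFS.
applied = clip_safe_gain_db(11, -3)
print(applied)       # boost actually applied, in dB
print(11 - applied)  # how many dB the track ends up below the target

# Compressed pop track: replaygain wants to lower it, so the limit never kicks in.
print(clip_safe_gain_db(-8, 0))
```

In this example the dynamic track only gets 3 dB of its 11 dB boost, so it plays 8 dB quieter than the tracks that did reach the target.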
So a lower replaygain target can simply allow more of your tracks to actually sound matched in loudness (without clipping). Of course that could come at the price of more bits sacrificed to Cthulhu down there, deep in the digital abyss.
And a higher target (if it's actually applied) will either clip your dynamic tracks a lot to get them to that target, or leave you with a bunch of quieter tracks that you'll hardly hear when they come u