Go to www.analog.com and search for the AD1896A datasheet. It's an asynchronous sample rate converter chip (so it can do not only upsampling but also downsampling, or even conversion between two nominally equal rates), and on page 18 of the datasheet the manufacturer explains how the rate conversion works.
In a nutshell, you have an incoming signal at one sampling rate (say 44.1 kHz) and want an output signal at a different rate (say 96 kHz, or even 44.1 kHz again). It's easy to see that in practice the ratio of the two clocks will be an irrational number: the clocks are never exactly 96000 Hz or 44100 Hz, and even if they were, they drift over time due to the properties of crystal oscillators, noise, interference, temperature changes, etc. In other words, you have jitter.
So you interpolate the input signal to a very high sampling rate — by a factor of 2^20 in the case of the AD1896A. Now you have a signal that is identical to the original (save for rounding errors), but instead of one sample every 1/44100 seconds, you have one sample every 1/(2^20 × 44100) seconds. These samples are queued into a FIFO (first in, first out) buffer. Then you sample this signal at the output rate (96 kHz), with a timing precision of better than 5 picoseconds, using a digital servo loop. It turns out that the higher the interpolation factor (2^20), the lower the error in the output signal.
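The interpolate-then-resample idea can be sketched in a few lines. This is a hypothetical toy model, not the AD1896A's actual algorithm: it quantizes each output instant onto a fine 2^20-per-input-sample time grid and uses simple linear interpolation as a stand-in for the chip's real FIR filter.

```python
# Toy asynchronous rate conversion: snap each output sample instant onto
# a fine time grid (upsample_factor points per input sample), then
# linearly interpolate the input there. A finer grid = smaller timing error.
import math

def resample(x, fs_in, fs_out, upsample_factor):
    out = []
    n_out = int((len(x) - 1) * fs_out / fs_in)  # stay inside the input
    for m in range(n_out):
        t = m / fs_out                # output sample instant, in seconds
        pos = t * fs_in               # position measured in input samples
        # Quantize the position to the fine grid (1/(fs_in*upsample_factor) s),
        # mimicking "one sample every 1/(2**20 * 44100) seconds".
        pos = round(pos * upsample_factor) / upsample_factor
        i = int(pos)
        frac = pos - i
        # Linear interpolation between neighbouring input samples
        # (the real chip uses a 64-tap FIR here).
        out.append((1 - frac) * x[i] + frac * x[i + 1])
    return out

# 1 kHz sine, converted from 44.1 kHz to 96 kHz.
fs_in, fs_out = 44100, 96000
x = [math.sin(2 * math.pi * 1000 * n / fs_in) for n in range(512)]
y = resample(x, fs_in, fs_out, 2**20)
```

Even with crude linear interpolation, the output tracks the ideal 96 kHz samples of the sine closely; the residual error comes from the interpolator, not from output-clock timing, because the fine grid makes the timing quantization negligible.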
In practice, of course, you're not going to compute all 2^20 interpolated samples per input sample, because you don't have a DSP that operates in the GHz range — and besides, you don't need to: only the interpolated samples that actually land at output instants are ever used. In the end, all they use is a 64-tap FIR filter. This filter would need 2^26 coefficients (64 taps for each of the 2^20 possible phases), but even those are not all stored: only a subset of the coefficients is kept, and the rest are interpolated from those.
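The coefficient-interpolation trick can be sketched like this. Again a hypothetical illustration, not the AD1896A's stored table: a small windowed-sinc filter is precomputed at a coarse set of fractional-delay phases, and the taps for any in-between phase are linearly interpolated from the two nearest stored phases.

```python
# Store FIR taps for only a coarse set of phases; interpolate the rest.
import math

TAPS = 8            # taps per phase (the AD1896A uses 64)
STORED_PHASES = 16  # coarse table size (the real chip stores far more)

def sinc_taps(frac):
    """Hann-windowed sinc taps for a fractional delay frac in [0, 1)."""
    taps = []
    for k in range(TAPS):
        t = k - (TAPS - 1) / 2 - frac
        s = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        w = 0.5 - 0.5 * math.cos(2 * math.pi * (k + 0.5) / TAPS)
        taps.append(s * w)
    return taps

# Precompute taps at STORED_PHASES + 1 evenly spaced phases (0.0 .. 1.0).
table = [sinc_taps(p / STORED_PHASES) for p in range(STORED_PHASES + 1)]

def interpolated_taps(frac):
    """Taps for an arbitrary phase, from the two nearest stored phases."""
    pos = frac * STORED_PHASES
    lo = int(pos)
    a = pos - lo   # blend weight between stored phase lo and lo + 1
    return [(1 - a) * c0 + a * c1
            for c0, c1 in zip(table[lo], table[lo + 1])]
```

Because the taps vary smoothly with the fractional delay, linearly interpolating between stored phases gives coefficients very close to the exactly computed ones, at a fraction of the storage cost.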
As a welcome side effect, the output signal will have greatly reduced jitter.