Converting the original file to DSD, or up-sampling it in the source, is a very bad idea. The rule of thumb is to always keep the original data intact, as Mojo's processing is far more sophisticated and capable than that of any PC or mobile device.
DSD as a format has two major and serious flaws:
1. Timing. The noise shapers used with DSD have severe timing errors. You can see this easily using Verilog simulations. If you use a step change transient (op is zero, then goes high) with a large signal, then do the same with a small signal, you get major differences in the analogue output - the large signal has no delay, the small signal has a much larger delay. This is simply due to the noise shaper requiring time for the internal integrators to respond to the error. This amplitude-related timing error is of the order of microseconds and is very audible. Whenever there is a timing inaccuracy, the brain has problems making sense of the sound, and perceives the timing error as a softness to the transient; in short, timing errors screw up the ability to hear the starting and stopping of notes.
2. Small signal accuracy. Noise shapers have problems with very small signals, in that the 64 times 1 bit output (DSD 64) does not have enough innate resolution to resolve them accurately. What happens when small signals are not properly reproduced? You get a big degradation in the ability to perceive depth information, and this makes the sound flat with no layering of instruments in space. Now there is no limit to how accurate the noise shaper needs to be; with the noise shaper in Mojo I have 1000 times more small signal resolution than conventional DACs - and against DSD 64 it's 10,000 times more resolving power. This is why so many users have reported that Mojo has so much better space and sounds more 3D with better layering - it's mostly down to the resolving power of the pulse array noise shaper. This problem of depth perception is unlimited in the sense that to perfectly reproduce depth you need no limit to the resolving power of the noise shaper.
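For anyone who wants to poke at the integrator behaviour described in the two points above without setting up a Verilog simulation, here is a minimal sketch in Python of a first-order 1-bit noise shaper. It is a textbook toy and an assumption on my part for illustration only: real DSD modulators and the pulse array noise shaper in Mojo are much higher order, and the names `noise_shape_1bit` and `moving_average` are made up. It only shows the basic mechanism, namely that the 1-bit output tracks the input through an integrator that accumulates the error.

```python
def noise_shape_1bit(samples):
    """First-order 1-bit noise shaper (toy model, hypothetical helper).

    Inputs are expected in the range -1.0 .. +1.0.
    The output is a list of +1.0 / -1.0 values.
    """
    integrator = 0.0   # accumulates the error between input and output
    feedback = 0.0     # previous 1-bit output, fed back to the input
    bits = []
    for x in samples:
        integrator += x - feedback
        bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(bit)
        feedback = bit
    return bits


def moving_average(values, window):
    """Crude low-pass filter standing in for the analogue output stage."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


# A steady input of 0.25 encoded as a stream of +1/-1 values.
bits = noise_shape_1bit([0.25] * 10000)
filtered = moving_average(bits, 1000)
```

To try the step test from point 1, feed in a list that is zero for a while and then jumps to a large value (say 0.5), repeat with a small value (say 0.001), and compare how long `filtered` takes to settle in each case. Treat whatever you see as a rough illustration of the mechanism, not as a measurement of any real DAC.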
So if you take a PCM signal and convert it to DSD you hear two problems - a softness to the sound, as you can no longer perceive the starting and stopping of notes; and a very flat sound-stage with no layering as the small signals are not reproduced accurately enough, so the brain can't use the very small signals that are used to give depth perception.
The second issue, with using the transport to up-sample (44.1 to 176.4, say), is that the up-samplers in a PC or mobile device are very crude, with very limited processing power and poor algorithms. This results in timing problems, and as with DSD you can't hear the starting and stopping of notes correctly. These timing problems also screw up the perception of timbre (how bright or dark instruments sound), the pitch reproduction of bass (the starting transients of bass let you follow the bass tune), and of course stereo imagery (left-right placement is handled by the brain using timing differences between the ears). Now Mojo has a very advanced algorithm (WTA) that is designed to maximise timing reconstruction (the missing timing information from one sample to the next), and huge processing power to more accurately calculate what the original analogue values are from one sample to the next. It's got 500 times more processing power than normal, and this allows much more accurate reconstruction of the original analogue signal.
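To give a feel for what "calculating the analogue values from one sample to the next" involves, here is a rough Python sketch of up-sampling with a Hann-windowed sinc interpolator. This is emphatically not the WTA filter, whose details are not given here; it is a generic textbook technique swapped in for illustration, and the names `windowed_sinc_interpolate`, `upsample` and `half_width` are hypothetical. The point it illustrates is that every in-between value is a weighted sum over many neighbouring samples, so the length and shape of the filter decide how well the signal between samples is reconstructed.

```python
import math


def windowed_sinc_interpolate(samples, t, half_width=16):
    """Estimate the signal value at fractional sample position t.

    Uses a Hann-windowed sinc kernel spanning roughly half_width samples
    either side of t. Samples outside the list are treated as zero.
    """
    center = math.floor(t)
    total = 0.0
    for n in range(center - half_width + 1, center + half_width + 1):
        if 0 <= n < len(samples):
            d = t - n  # distance from sample n, in sample periods
            sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
            window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            total += samples[n] * sinc * window
    return total


def upsample(samples, factor=4, half_width=16):
    """Return the signal at `factor` times the original sample rate."""
    return [
        windowed_sinc_interpolate(samples, i / factor, half_width)
        for i in range(len(samples) * factor)
    ]


# A slow sine wave, 20 samples per cycle, up-sampled by 4
# (the same ratio as 44.1 to 176.4).
original = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
upsampled = upsample(original, factor=4)
```

Changing `half_width` (the number of samples either side that contribute to each output value) is the knob to play with: a short kernel is cheap but crude, a long one costs more arithmetic per output sample.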
So the long and the short of it is: don't let the source mess with the signal (except perhaps with a good EQ program) and let Mojo deal with the original data, as Mojo is way more capable.
Rob