Well, it's true this is not the best CPU, but FiiO uses a generic CPU governor; Interactive, I think it was called. It is aggressive, but at the same time it tries to keep the frequency as low as possible. That is why this happens. If you could set the CPU to stay at its max frequency all the time, it would do at least DSD128, I think. I have a Sony Xperia Z2 from 2014 and it does up to DSD128; that is a Snapdragon 801 chipset, 8-9 years old. From this I conclude that with the CPU in an aggressive governor mode, any or most devices released in 2016-2017 or later will do at least DSD128, and most from 2019 on can do DSD256 with no problem (I mean even a $150-200 phone should be able to do it).
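If you want to check which governor the device is really using, here is a rough sketch. It assumes the firmware exposes the standard Linux cpufreq files under /sys/devices/system/cpu and that you have some way to run Python on the device (a terminal app, for example); both are assumptions on my side, I have not tried this on a FiiO. The function name `read_cpufreq` is just something I made up. It only reads the values. Changing the governor normally needs root.

```python
from pathlib import Path

# Standard Linux cpufreq location; assumed (not verified) to exist on the DAP.
CPU_ROOT = Path("/sys/devices/system/cpu")

def read_cpufreq(cpu_root=CPU_ROOT):
    """Return {core_name: {field: value}} for every core that exposes cpufreq."""
    fields = ("scaling_governor", "scaling_cur_freq", "scaling_max_freq")
    report = {}
    for cpufreq_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        info = {}
        for name in fields:
            try:
                info[name] = (cpufreq_dir / name).read_text().strip()
            except OSError:
                # File missing or not readable without root.
                info[name] = None
        report[cpufreq_dir.parent.name] = info
    return report

if __name__ == "__main__":
    for core, info in read_cpufreq().items():
        print(core, info)
```

If scaling_cur_freq stays well below scaling_max_freq while the DSD upsampling is stuttering, that would fit the governor explanation.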
However, you could probably try using a phone as the player and the DAP as a DAC only. It is kind of a clumsy way to do it, but it may be possible, and I am almost sure you will hear an improvement, especially with DSD256 upsampling. There can be battery drain and other inconveniences with this, though, because the DAP may not have an option to disable USB charging. You can avoid that if the device has 2 ports, one dedicated to charging and one for USB input, but most new DAPs do not have 2 ports. If your device has 2 USB ports, it is worth trying.
I recommend trying to upsample to DSD simply because it makes the sound more dynamic. It does not make the sound higher quality than PCM, but it improves the dynamics a lot: the bass is there when it needs to be, the treble is not harsh, and the mids change so quickly that voices do not sound anxious. I always use EQ on anything PCM; with DSD you just don't need it, everything is in place. Super separation and soundstage. You do not hear more details, but you hear them much better 'pronounced' than on PCM, and that is a big improvement. It is better on DSD128 and 256 for sure. DSD64 is... better than PCM if you have 64-bit processing on, though not as noticeable as 128 or 256. Yes, that goes for any PCM upsampled to DSD as well, and for any files, from MP3 up to FLAC and everything else. And Neutron does this upsampling exceptionally well.
Maybe try 32-bit processing, so with 64-bit off. Then see if it can do 32-bit + DSD128 or 256.
You could also convert the files to DSD and play them that way, as you said, but that takes a lot of space.
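To give an idea of how much space, here is a quick calculation. It assumes raw 1-bit stereo DSD with no container overhead and no compression, so real files will differ a bit. DSD64 runs at 64 x 44.1 kHz, and 128 and 256 double it each time. The function name is mine, just for illustration.

```python
BASE_RATE_HZ = 44_100  # CD sample rate; DSD64 is 64 times this

def dsd_bytes_per_minute(multiple, channels=2):
    """Raw DSD size per minute: 1 bit per sample per channel."""
    bits_per_second = multiple * BASE_RATE_HZ * channels
    return bits_per_second * 60 // 8

for multiple in (64, 128, 256):
    mb = dsd_bytes_per_minute(multiple) / 1_000_000
    print(f"DSD{multiple}: {mb:.1f} MB per minute (stereo)")
```

That comes out to about 42.3, 84.7 and 169.3 MB per minute, so a 4-minute track at DSD256 is roughly 677 MB. For comparison, uncompressed CD-quality PCM is about 10.6 MB per minute.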
Ok, thanks for reading.