Note that in my first post I didn’t say that DSD and PCM were converted to 5 bit, high rate PCM. I said “something like 5 bit PCM at a high sample rate and noise shaped”. That noise shaping allows a better signal to noise ratio in the audio frequency band than you’d have if you just used 5 bit, high rate PCM. The DACs typically use a 5 bit (or whatever) sigma delta modulator to generate that intermediate format, so one can just as reasonably argue that instead of DSD being converted to PCM, PCM is being converted to something closer to DSD.
It may sound like just semantics, but if you are interested in the practical differences between PCM and DSD, noise shaping does matter and noise shaping PCM doesn’t give you PCM, it gives you noise shaped PCM. In that sense saying DSD and PCM get converted to PCM in most DACs simply isn’t true.
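To make the noise shaping point concrete, here is a rough sketch in Python. It compares plain 5 bit quantization against a first-order error-feedback noise shaper, which is about the simplest relative of a sigma delta modulator. Everything in it is an assumption for illustration: the function names, the 32 level quantizer, the test tone, and the crude moving-average "audio band" filter. Real DAC chips use higher-order modulators and much better filters.

```python
import math

LEVELS = 32                 # 5 bit quantizer (assumed for illustration)
STEP = 2.0 / (LEVELS - 1)   # level spacing over a -1..+1 full-scale range


def quantize(x):
    """Round x to the nearest of the 32 levels, clipping at full scale."""
    idx = round((x + 1.0) / STEP)
    idx = max(0, min(LEVELS - 1, idx))
    return idx * STEP - 1.0


def make_sine(count, period, amplitude):
    """A slow sine: 'period' samples per cycle, i.e. heavily oversampled."""
    return [amplitude * math.sin(2.0 * math.pi * n / period) for n in range(count)]


def plain(signal):
    """Plain 5 bit PCM at the high rate: quantize each sample independently."""
    return [quantize(x) for x in signal]


def noise_shaped(signal):
    """First-order error feedback: subtract the previous quantization error
    before quantizing, so the output error becomes e[n] - e[n-1]."""
    out = []
    err = 0.0
    for x in signal:
        v = x - err
        y = quantize(v)
        err = y - v
        out.append(y)
    return out


def inband_noise_power(signal, output, window=32):
    """Mean square of the error after a moving average of length 'window'.
    The moving average is a crude stand-in for a low pass filter."""
    err = [y - x for x, y in zip(signal, output)]
    total = sum(err[:window])
    acc = (total / window) ** 2
    count = 1
    for n in range(window, len(err)):
        total += err[n] - err[n - window]
        acc += (total / window) ** 2
        count += 1
    return acc / count


if __name__ == "__main__":
    sig = make_sine(8192, 4096, 0.5)
    print("plain 5 bit, in-band noise power: ", inband_noise_power(sig, plain(sig)))
    print("noise shaped, in-band noise power:", inband_noise_power(sig, noise_shaped(sig)))
```

Both outputs use the same 32 levels at the same rate. The only difference is where the quantization error ends up in frequency: the shaped version pushes it toward high frequencies, so much less of it survives the low pass.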
One can choose an intermediate format that can, in principle, be converted back to PCM or DSD losslessly (over the audio frequency band). That isn’t true for 24/88.2k PCM vs DSD. You can arguably hear when DSD is converted to 24/352.8k PCM. But I suspect that with the right up and down sampling filters, DSD to 24/192 to DSD could be sonically transparent for most users.
“keep[ing] DSD as DSD all the way through the conversion to analog process” essentially means you are just using a low pass filter. The DS could do that, but I don’t want one mode that supports a volume control and another that doesn’t.
The DS converts DSD and PCM to a superset of each: the sampling rate is at least the sampling rate of the input and the sample width is greater than the sample width of the input. This intermediate format can be easily converted back losslessly to the original inputs. However, with DSD input that upsampled format still has the noise shaping of the DSD and so isn’t, strictly speaking, PCM.
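As a toy illustration of the “superset” idea, the Python sketch below widens 1 bit DSD samples and 16 bit PCM samples into a common wider, faster format and then recovers the originals exactly. This is not the DS’s actual processing: the 24 bit width, the function names, and especially the sample-repeat upsampling are assumptions chosen only to make the round trip obvious. A real DAC would use proper upsampling filters.

```python
WIDE_BITS = 24                           # hypothetical intermediate sample width
FULL_SCALE = (1 << (WIDE_BITS - 1)) - 1  # largest positive value at that width


def dsd_to_wide(bits):
    """Map each 1 bit DSD sample (0 or 1) to -/+ full scale at the same rate."""
    return [FULL_SCALE if b else -FULL_SCALE for b in bits]


def wide_to_dsd(samples):
    """Recover the DSD bits from the sign of each wide sample."""
    return [1 if s > 0 else 0 for s in samples]


def pcm_to_wide(samples, in_bits, ratio):
    """Widen PCM by left shifting, and raise the rate by repeating each
    sample 'ratio' times (a toy upsampler, not a real interpolation filter)."""
    shift = WIDE_BITS - in_bits
    out = []
    for s in samples:
        out.extend([s << shift] * ratio)
    return out


def wide_to_pcm(samples, in_bits, ratio):
    """Undo pcm_to_wide: keep every 'ratio'-th sample and shift back down."""
    shift = WIDE_BITS - in_bits
    return [s >> shift for s in samples[::ratio]]


if __name__ == "__main__":
    dsd = [1, 0, 0, 1, 1, 1, 0, 1]
    pcm = [-32768, -1, 0, 1, 32767]
    print(wide_to_dsd(dsd_to_wide(dsd)) == dsd)
    print(wide_to_pcm(pcm_to_wide(pcm, 16, 64), 16, 64) == pcm)
```

Note that the widened DSD stream is still just the original bit pattern at two levels, shaped noise and all. It only stops looking like DSD once something like a volume change or a filter uses the extra width.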
The overall point is that there’s a continuum of formats and that, with proper handling, many can act as a good, sonically transparent intermediate format between PCM and DSD.
IMO much of the character of various DAC chips comes from the quality of the upsampling and/or downsampling filters used, not whether they use a particular format as a part of the process.