Hmm, OK. A question which warrants a book, not just a forum comment, but here goes.
The main dimensions of performance for a DAC are:
- Hitting exactly the right voltage for each sample
- Reproducing each sample at exactly the right time
- Making the transitions between samples track, as closely as possible, a perfectly bandwidth-limited waveform, i.e. one with no components above half the sampling rate
The last point – filtering – is in some ways the hardest, because in theory it can be done perfectly, but only if you have a perfect recording as well as the ability to perform calculations on the entire length of the recording all at once. In practice, filtering is compromised, and different filter design choices can lead to very different sounding products, even ones built around the exact same DAC chip. The filtering implementation can include both physical components (analog circuits after D-to-A conversion) and mathematical ones (calculations on the data prior to D-to-A). And there is a niche class of DACs which have no explicit filters at all… they just let your ears/brain deal with the issues.
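To make the "perfect filtering needs the whole recording" point concrete, here's a toy Python sketch (purely my own illustration, nothing from any real DAC) of ideal sinc reconstruction, where every output point is a weighted sum over every input sample:

```python
import numpy as np

def sinc_reconstruct(samples, fs, t_out):
    # Whittaker-Shannon reconstruction: each "analog" output point is a
    # weighted sum over ALL input samples, which is why the theoretically
    # perfect filter needs the entire recording at once.
    n = np.arange(len(samples))
    return np.sum(samples * np.sinc(fs * t_out[:, None] - n), axis=1)

fs = 48_000                                    # sample rate (Hz)
n = np.arange(64)
samples = np.sin(2 * np.pi * 1_000 * n / fs)   # 64 samples of a 1 kHz tone
t_fine = np.linspace(0, 63 / fs, 1000)         # a much finer "analog" time axis
analog = sinc_reconstruct(samples, fs, t_fine)
```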
Measurements which tell us something about a DAC’s filtering characteristics include impulse response traces (amplitude vs time) where we can see pre/post ringing; group delay (phase shift vs frequency); standard frequency response plots (amplitude vs frequency); and imaging/aliasing of a fixed input tone (amplitude vs frequency).
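If you want to see that pre/post ringing for yourself, a generic linear-phase low-pass built with scipy (again just an illustration, not any particular DAC's filter) shows it directly in its impulse response:

```python
import numpy as np
from scipy.signal import firwin

fs = 44_100
taps = firwin(255, cutoff=20_000, fs=fs)  # a generic linear-phase low-pass filter
# For an FIR filter the taps ARE the impulse response: energy appears both
# before and after the main peak (pre- and post-ringing).
peak = np.argmax(np.abs(taps))
print("pre-ringing taps :", np.count_nonzero(np.abs(taps[:peak]) > 1e-4))
print("post-ringing taps:", np.count_nonzero(np.abs(taps[peak + 1:]) > 1e-4))
```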
The second point – clocking – is very challenging in both design and implementation but thankfully is pretty easy to observe. Jitter sounds BAD. Less jitter always sounds better. The DS DAC family is particularly good in this respect and I believe that’s a large part of its appeal to listeners.
Jitter observations (not measurements!) of an entire DAC system are usually shown in the form of an amplitude vs frequency plot of the DAC’s reproduction of a pure sine wave. Jitter is visible as a broadening of the base of the spike, because variations in timing cause variations in frequency at the analog output. Actual measurements are hard. In rare cases we might get a “phase noise” plot showing the amplitude (of jitter) vs frequency. There’s no single number which can describe jitter though, so be cautious of anybody relying on “picoseconds” or “femtoseconds” in their marketing.
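If anyone wants to see why that skirt appears, here's a quick Python simulation (with deliberately exaggerated jitter, nothing like a real product) of a sine sampled on a slightly wobbly clock:

```python
import numpy as np

rng = np.random.default_rng(0)
fs, f0, n = 48_000, 12_000, 1 << 16
t = np.arange(n) / fs

clean   = np.sin(2 * np.pi * f0 * t)             # perfect clock
jitter  = rng.normal(0, 10e-9, n)                # 10 ns RMS timing error, far worse than any decent DAC
jittery = np.sin(2 * np.pi * f0 * (t + jitter))  # same tone, wobbly clock

window = np.hanning(n)
def spectrum_db(x):
    return 20 * np.log10(np.abs(np.fft.rfft(x * window)) + 1e-12)
# Compare spectrum_db(clean) and spectrum_db(jittery) around the 12 kHz bin:
# the jittered version shows the broadened "skirt" at the base of the spike.
```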
Now the question of how exactly the machine turns numbers into voltages. This is the main area of difference between all those architectures you listed. The simplest mechanisms just take each (PCM) sample as it arrives and select a combination of resistors in a circuit to map the numeric value to a corresponding voltage. Ladder DACs and R2R DACs work this way. The challenge there is getting the level of detail that we want across the dynamic range we can hear. Each binary digit (bit) in a sample adds about 6dB to the range of values the sample can represent. 16 bits gets us to roughly 96dB between the largest and smallest values, with 65,536 possible values, and that is sufficient to make audio which sounds really good to humans. But to build that with resistors in a ladder or R2R arrangement requires 0.0015% precision!
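The arithmetic behind those numbers, in case anyone wants to try other bit depths (just my back-of-the-envelope version in Python):

```python
import math

bits = 16
levels = 2 ** bits                            # 65,536 distinct output values
dynamic_range_db = 20 * bits * math.log10(2)  # ~6.02 dB per bit -> ~96 dB at 16 bits
step_fraction = 1 / levels                    # the biggest resistor has to be accurate
                                              # to better than one step of the smallest
print(f"{levels} levels, {dynamic_range_db:.1f} dB, "
      f"precision needed ~{100 * step_fraction:.4f}%")   # ~0.0015%
```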
Measurements of a DAC’s precision often turn up as noise floor plots (often with a very low level sine sticking up from the floor), and the occasional amplitude vs time trace of a minuscule sine wave where you might expect to see a three-level stair step shape.
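That three-level stair step is easy to reproduce in Python: quantise a -90dBFS sine to 16 bits with no dither and only three codes survive (an illustration only, obviously not how a measurement rig works):

```python
import numpy as np

fs, f0 = 44_100, 1_000
t = np.arange(512) / fs
amplitude = 10 ** (-90 / 20)                  # a -90dBFS sine, near the bottom of the 16-bit range
sine = amplitude * np.sin(2 * np.pi * f0 * t)

codes = np.round(sine * 32767).astype(int)    # straight 16-bit quantisation, no dither
print(sorted(set(codes)))                     # -> [-1, 0, 1]: the three-level stair step
```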
But it turns out that with the crazy advances in computing-power-per-dollar over the decades, we can now transfer some of the budget from precision resistors and spend it on precision silicon to get better overall performance. The idea is to have a DAC mechanism with many fewer possible output levels operating at a much higher frequency, feed it a calculated signal which contains the original audio with very little noise in the audio band but a whole bunch of noise higher up, then take the output from the DAC and pass it through a simple analog filter to attenuate the ultrasonics. The vast majority of modern DACs use this approach, with a combination of digital filtering to produce a high frequency version of the input, then sigma-delta modulation (and noise shaping) to reduce the number of bits per sample. The DS DAC just happens to take this all the way down to single-bit form.
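A textbook first-order sigma-delta modulator in Python gives the flavour of that last step (this is emphatically not Ted's code, just the standard teaching example):

```python
import numpy as np

def first_order_sdm(x):
    # Textbook first-order sigma-delta modulator: a many-bit input (roughly
    # -1..+1) goes in, a +/-1 bitstream comes out, and the quantisation
    # error gets pushed up in frequency (noise shaping).
    out = np.empty(len(x))
    integrator, feedback = 0.0, 0.0
    for i, sample in enumerate(x):
        integrator += sample - feedback              # accumulate error vs. last output
        feedback = 1.0 if integrator >= 0 else -1.0  # 1-bit quantiser
        out[i] = feedback
    return out

fs = 44_100 * 64                                     # a 64x oversampled rate, for illustration
t = np.arange(1 << 16) / fs
bitstream = first_order_sdm(0.5 * np.sin(2 * np.pi * 1_000 * t))
# A simple low-pass over `bitstream` recovers the 1 kHz tone; the leftover
# noise sits mostly up near fs/2, where the analog filter removes it.
```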
There aren’t any new kinds of measurements to consider for that, though your noise floor certainly looks different as the number of bits decreases and you rely on noise shaping to move that energy into higher frequencies away from the audio band. That’s essentially the focus of the recent conversation re Stereophile and the DS MkII.
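For the curious, the shape of that rising floor falls straight out of the noise transfer function; here's the first-order case worked in Python (higher-order modulators push the in-band noise down further still):

```python
import numpy as np

fs = 44_100 * 64                      # the illustrative oversampled rate from above
f = np.linspace(1, fs / 2, 10_000)
# Magnitude of the first-order noise-shaping response |1 - z^-1|: quantisation
# noise is suppressed at low frequencies and boosted towards fs/2.
ntf_db = 20 * np.log10(np.abs(1 - np.exp(-2j * np.pi * f / fs)))
print(f"shaping at 1 kHz: {np.interp(1_000, f, ntf_db):+.1f} dB")   # strongly suppressed
print(f"shaping at fs/2 : {ntf_db[-1]:+.1f} dB")                    # boosted by ~6 dB
```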
One other point to mention just to round out the differences between some of the DAC architectures you mentioned: parallelisation. You cannot expect perfect hardware, but you can use an understanding of statistics to get a group of devices to behave together as if they were a single more-perfect device. DCS takes this to an extreme with their Ring architecture (SDM to 5-bit, then a ladder-like decoder which randomly selects from a pool of 60-odd resistors for each bit IIRC). Ted has given us four parallel digital switches per channel, all active all the time, for the same reason. These hardware components, along with the quality of power supplies and the mitigation of EMR, set the physical noise floor for the machine.
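The statistics behind that is just the usual 1/sqrt(N) averaging; a quick Python sanity check (a toy model only, nothing like the real Ring DAC or the DS switch arrangement):

```python
import numpy as np

rng = np.random.default_rng(1)
tolerance = 0.01                      # each element is off by ~1% (random, but fixed per element)

for n_elements in (1, 4, 64):
    # 100,000 trials of building a "DAC" from n_elements imperfect parts and averaging them
    errors = rng.normal(0, tolerance, size=(100_000, n_elements))
    combined_error = errors.mean(axis=1)
    print(f"{n_elements:3d} elements -> error is "
          f"{combined_error.std() / tolerance:.2f}x that of a single element")
# Roughly 1.00x, 0.50x and 0.13x: averaging N uncorrelated parts shrinks the
# error by about 1/sqrt(N), which is why parallelism buys you precision.
```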
What we’re sort of hoping in the recent discussion on this thread is either that a change to the way the noise floor measurements are made will show that the DS MkII is actually performing competitively in that respect, or that a change to the digital filtering or SDM algorithms will deliver a quieter audio band. Or alternatively we might decide that it sounds great and the measurement is irrelevant.