Good explanation of High Res Audio for people who don't believe in it?

I work part time as a recording engineer and end up getting into intractable arguments about high resolution audio with other engineers, especially on the internet. It’s a friendlier space in the audiophile world because people here actually listen to things in order to make judgements about them.

I’ve recorded and listened extensively over 15 years with DSD, 24 bit, 192khz, and even 352khz digital, as well as 2" 24 track tape. Every one of these is qualitatively better than 16/44.1, in many cases WAY better. I’ve also found all of these changes in audio recording and reproduction to have distinct and predictable signatures across various sets of AD and DA converters. When I talk about this with other people they say I am crazy, delusional, ignorant, etc. and ask me to prove every observation with a double blind ABX test like in a research study.

Because my subjective experience of high res audio is immediately tossed out, I often try to argue in technical terms citing research papers, which gets tiring. I’ve probably had the ubiquitous Monty Montgomery Audio Myths video recommended or posted at me dozens of times, where Monty uses his oscilloscope to demonstrate that standard digital AD and DA perfectly recreates any and all signals with just 16 bits and a 44.1khz sample rate.

I end up fighting an uphill battle against these claims:

  1. Resolution above 16 bits in digital audio is pointless because nobody listens to music loud enough to hear the 16 bit noise floor. 24 bit audio is only used for practical recording and editing purposes and has no sonic benefit.
  2. Thanks to the existence of dither, 16 bit, 24 bit and DSD all sound exactly the same. Any benefits of 24 bit audio are completely preserved to CD quality when you add dither.
  3. Digital filtering has no impact on sound quality whatsoever because we can’t hear filter ringing, and timing smear is just an optical illusion from looking at impulse/sinc plots. Time and frequency are the same, therefore, all audio below the filter cutoff is represented perfectly and completely, with perfect impulse/transient response according to Nyquist. (therefore sample rates above 44.1khz are bogus)
  4. Sample rates above 96khz create heavy amounts of intermodulation distortion, degrade audio, and will wreck your equipment. Dan Lavry from Lavry Engineering says so.
  5. All PCM has perfect linearity so there is no need for greater bit depth or DSD
  6. 192khz exists because of fraudulent conspiracy to sell audiophile and pro audio equipment
  7. Double blind tests show that nobody can hear the difference between 16/44.1, 24/192 and DSD
  8. Because of its high noise floor, analogue tape can be perfectly recorded with only 12 bits.

Experience tells me that all of the above claims are false, but I have a hard time making the case.

Does anybody have good answers, or insights about these points? I know they are pretty technical in nature. I think this is a significant topic because outside small corners of the audiophile world and even smaller corners of pro audio, the Monty Montgomery perspective that 16/44.1 is audio perfection and that High Res is delusion is now the consensus viewpoint, and it becomes hard to even talk about practical audio stuff without every claim getting called into question.

5 Likes

These “discussions” will never stop, they are intractable. Many, if not most, have completely made up their minds without any personal experience. Fortunately some people out there have ears and some are willing to entertain at least hand waving explanations.

When I was a little younger, people spent a lot of time arguing about whether quantum mechanics was real. As with the current issues, many had made up their minds that QM was rubbish. It cracked me up that any software engineer didn’t realize that computers as we build them couldn’t possibly work without a good understanding of QM. I also had plenty of arguments about metastability with hardware engineers; at least they teach about it in school now. In most of those cases people were just as self righteous as the people you are running into.

Or, for example, Galen’s papers about the ICONOCLAST cables have a lot of solid information explaining how and why they are built, but many people still claim that those effects are not relevant for audio frequencies. If they honestly think they understand basic science, then they could understand the papers and at least reconsider their “beliefs”.

11 Likes

Given a sufficiently excellent recording and mastering, I mostly agree. Having 24 bits to play with is more forgiving though.

Noise shaping allows all those formats to encode the exact same audible signal with a given SNR – up to certain limits which are partly dependent on the sampling frequency. But how they sound when played back is highly dependent on the DAC implementation, and it would be a very rare DAC that could make them all sound the same.

If you have audio engineering colleagues who can not hear the difference between a linear vs a minimum phase filter in something like an original DACMagic, I hope they are not working on anything especially high quality. Sample rates above 44.1 make it easier to lessen the impact of filters.

Would recommend not buying that kind of equipment. Mine seems fine.

It is not sufficient to use a linear encoding scheme. You must also have linear encoders and decoders which convert between analog and digital. Sigma-delta schemes exist because it’s much easier to get that linearity with increased frequency in the time domain than with increased precision and accuracy in a varying voltage domain.

If we are talking about media delivered to consumers, then that is partly true. The rest of the truth is that it made sense to explore increased resolution for engineering reasons, and that PCM playback systems with poor (or no!) filters can give better results when run with higher resolution pre-filtered inputs

Yes, all blanket statements are always correct.

The sonic quality of that tape hiss will be rather different than the dithered quantisation noise. Same level, maybe. Same sound, doubtful.
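
For anyone following along who hasn't seen what dither actually does, here is a minimal sketch in Python. It is a toy under stated assumptions, not a model of any real converter: values are expressed in units of one LSB, the "signal" is just a constant 0.4 LSB, and the names `quantize` and `tpdf_dither` are made up for illustration. It only shows that dither trades the loss of a sub-LSB signal for noise; it says nothing about how that noise sounds.

```python
import math
import random

def quantize(value_lsb):
    """Round to the nearest whole LSB (round half up)."""
    return math.floor(value_lsb + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither spanning -1..+1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

rng = random.Random(0)   # fixed seed so the run is repeatable
signal_lsb = 0.4         # hypothetical signal smaller than one LSB
n = 20_000

plain = [quantize(signal_lsb) for _ in range(n)]
dithered = [quantize(signal_lsb + tpdf_dither(rng)) for _ in range(n)]

plain_mean = sum(plain) / n        # the signal vanishes: every sample is 0
dithered_mean = sum(dithered) / n  # the average recovers roughly 0.4 LSB

print(f"undithered mean: {plain_mean:.3f} LSB")
print(f"dithered mean:   {dithered_mean:.3f} LSB")
```

Without dither the 0.4 LSB level is simply rounded away; with dither it survives in the average, buried in noise whose character is exactly what the tape-hiss comparison above is about.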

I hope you find some of that helpful in easing your frustrations.

2 Likes

I think the only way to “explain” it is to let them hear the difference on a setup that’s resolving enough, and to tell them it can’t be heard on most less revealing setups, which doesn’t mean the difference doesn’t exist.

Explaining the theory behind something that is mainly audible is, imo, only helpful if you have people who want to understand why it exists, not people who want to use theory to prove the opposite to you.

1 Like

I don’t think you’re going to get much argument here. Great post ;^ )

Could be explaining it wrong on some of these points, Ted. Help me out, but there seems to be a belief that the optimum sample rate for video applies to audio. We have a problem at the VERY beginning of the process.

1.0 You want audio to be at multiples of the 44.1 sample rate; other sample rates are best for video and were hijacked for audio with DVD movie soundtracks.

2.0 DSD is mastered as PCM, and may start at the wrong sample rate as the source equipment is VIDEO optimized, and MAYBE get interpolated to a 44.1 multiple going back to DSD. The digital sample frequency “mistake” is built in from the start, though.

3.0 DSD needs a better clock than PCM to reduce quantization errors on time and magnitude axis. Higher bit depth improves the accuracy of digital quantization. This mitigates jitter and non linearity of the quantization value for each bit. Better, well shielded, clocks are here to stay on good equipment

4.0 Pure DSD is hard to find, but it is very good if you do, and should follow the best sample frequency for audio.

Digital can be far better than the mess of source files we have today. Eventually, even with PCM mastering converted to DSD and at the right sample rate and higher bit depths for pure audio we can see improvements. The real battle is at the VERY beginning.

Galen

When I was working for a company doing audio post for video, the video sample rate was 44.0559kHz. That was fun to translate back and forth from 44.1k (999/1000 for drop frame.)
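
A quick arithmetic check of that pulled-down rate, as a sketch in Python. The 999/1000 factor is the one quoted above; the 1000/1001 factor is the ratio usually given for NTSC frame rates and is added here as an assumption for comparison. Both land on the same 44.0559 kHz at this precision.

```python
from fractions import Fraction

CD_RATE_HZ = 44_100

# Factor quoted in the post above.
quoted_factor = Fraction(999, 1000)
# Ratio usually cited for NTSC pull-down (assumption, not from the post).
ntsc_factor = Fraction(1000, 1001)

quoted_rate_hz = float(CD_RATE_HZ * quoted_factor)
ntsc_rate_hz = float(CD_RATE_HZ * ntsc_factor)

print(f"44.1k * 999/1000  = {quoted_rate_hz / 1000:.4f} kHz")
print(f"44.1k * 1000/1001 = {ntsc_rate_hz / 1000:.4f} kHz")
```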

Some of the better players that support multichannel audio and video have the audio clock being the master clock and also some let you turn off the video stuff entirely if you aren’t using it.

Larger sample widths do indeed help to lower the relative jitter, but they are inherently less linear than a one bit encoding (two values define a line and hence there’s no DNL, INL, etc.).

DSD moves the quantization noise out of the audio band and can easily get better than -144dB (i.e. 24bit) resolution over the audio band.
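
The "-144dB (i.e. 24bit)" equivalence comes from the usual rule of thumb of about 6.02 dB per bit. A small sketch, assuming the simple 20·log10(2^N) definition of dynamic range (the full-scale-sine SQNR figure adds roughly another 1.76 dB); the function names are illustrative only:

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB

def dynamic_range_db(bits):
    """Ideal PCM dynamic range for a given word length, in dB."""
    return bits * DB_PER_BIT

def bits_for_dynamic_range(db):
    """Word length whose ideal dynamic range equals the given figure."""
    return db / DB_PER_BIT

for bits in (12, 16, 20, 24):
    print(f"{bits} bit -> {dynamic_range_db(bits):.1f} dB")

print(f"120 dB -> {bits_for_dynamic_range(120.0):.1f} bits")
```

These are ideal figures for the number format only; they say nothing about what a real converter achieves.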

The 48, 96, 192, etc. set of sample rates and the 44.1, 88.2, 176.4 set can be interconverted relatively cleanly: their least common multiple sample rate is 28.224MHz (10 x the nominal DSD sample rate of 2.8224MHz): 28.224MHz / 176.4kHz is 160 and 28.224MHz / 192k is 147.
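
Those ratios are easy to verify. A minimal sketch in Python, using only the rates named above:

```python
import math

RATE_44K1_FAMILY_HZ = 176_400   # 4 x 44.1 kHz
RATE_48K_FAMILY_HZ = 192_000    # 4 x 48 kHz
DSD64_RATE_HZ = 2_822_400       # nominal single-rate DSD

# Least common multiple via the gcd identity lcm(a, b) = a * b / gcd(a, b).
common_rate_hz = (RATE_44K1_FAMILY_HZ * RATE_48K_FAMILY_HZ
                  // math.gcd(RATE_44K1_FAMILY_HZ, RATE_48K_FAMILY_HZ))

print(common_rate_hz)                         # common rate in Hz
print(common_rate_hz // RATE_44K1_FAMILY_HZ)  # ratio to 176.4 kHz
print(common_rate_hz // RATE_48K_FAMILY_HZ)   # ratio to 192 kHz
print(common_rate_hz // DSD64_RATE_HZ)        # ratio to single-rate DSD
```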

1 Like

Is DSD resolution equal to its dynamic range in this case? If so, why do most DSD converters bottom out at -120dB (except the 6 bit ESS chips)? DSD definitely sounds better and more resolving than 20 bit PCM, which would be the equivalent bit depth for the dynamic range of most DSD conversion.

😝

OMG, thanks for the memories, Ted👍🏻

@IanB52: My first question is, did you work on building B52’s, or are they a favorite band? ✊🏻

Very well put question. Lots of good answers have followed it. I would say that, unless you actually Enjoy arguing with measurementalists and theorists, life is too short, bruh. I mean if Ted says what he said… 🤷🏻‍♂️

Quite often it seems to be that a subset of folks are focusing on theory (often in the absence of experience) as in, “The Numbers Don’t Lie, therefore such-and-such Can’t Possibly Be…”

And SO MANY arguments are essentially one side arguing about SOUND and the other is arguing about MUSIC. Talking past one another. This is a fundamental confusion.

I’ve been an A/V producer most of my life. How something looks and/or sounds reproduced after the fact on a video or audio system has never “fooled me” vs. the look or sound of the live experience while shooting or recording (or simply attending) that event/performance.

I often have a hard time imagining a world in which sound theory trumps listening to music. I suppose/imagine that this situation is reversed for measurementalists. But as I said, I truly have a hard time imagining it. I freely admit that.

Just the other day, I was sitting in on a discussion where someone asked if any of us had heard a Master Tape vs. a digital version of same (not sure if resolution or format was specified). He was mightily impressed by the difference - likely assuming nearly none of us had had that experience. I was about to say some version of, “…Duh” when we were interrupted. But it brought this point home to me once again.

The designers of SACD picked 64 times oversampling as a compromise that would both allow multichannel music (along with a stereo only version) on a DVD form factor and keep at least 120dB S/N over the audio band. The resolution can be much better than 24 bit PCM below approx. 20kHz and is worse above 20kHz. I always get a kick out of people that complain about DSD’s noise floor above 20kHz compared to Redbook. The thing is Redbook has a 0dB S/N above about 22k and single rate DSD is still in the range of 120dB near 22k and still has a non-zero S/N up past 80 or 100kHz.
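
For reference, the rates behind those numbers, as a small Python sketch. It only works out the sample rate and the Nyquist limits (half the sample rate); the S/N figures above are not something this arithmetic can show.

```python
CD_RATE_HZ = 44_100
OVERSAMPLING = 64   # the SACD choice described above

dsd64_rate_hz = OVERSAMPLING * CD_RATE_HZ

# Nyquist limit: nothing above half the sample rate can be represented.
redbook_nyquist_hz = CD_RATE_HZ / 2
dsd64_nyquist_hz = dsd64_rate_hz / 2

print(f"Single rate DSD: {dsd64_rate_hz / 1e6:.4f} MHz")
print(f"Redbook Nyquist: {redbook_nyquist_hz / 1e3:.2f} kHz")
print(f"DSD64 Nyquist:   {dsd64_nyquist_hz / 1e6:.4f} MHz")
```

The Redbook limit of 22.05 kHz is the "about 22k" above which Redbook carries no signal at all.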

The thing to remember is that we can barely build a 22 bit accurate DAC; the spectrum plots you see are always averaged, which lowers random noise relative to the signal present.

Double rate DSD can easily be more accurate over the audio band than 24 bit PCM and can also support multiple passes of editing without the noise growing appreciably into the audio range.

…What He Said👊🏻

1 Like

Frankly I will also have to admit as both an audiophile and an audio engineer, that I’ve long gotten a boot out of how All These Here Formats were determined by things other than Ultimate SQ. Practical Considerations tend to be paramount.

1 Like

I had not thought of it this way before. A wonderful, and amusing, observation.

1 Like

…IKR?..

EDIT: sorry the Forum software generated all above my post.

Haha! I thought I had just picked a random number, but I do like the B52s. Hmm…

I tend to see a lot of the arguments as based on tension between different epistemological attitudes. Engineers often feel comfortable with a finite and fully understood reality or system that they can control, and so I think it can be attractive to place the “limits” of human perception and the real world behavior of electronic systems well below what can be reproduced by the most standard and economical modern technologies, and be done with it. That way you don’t have to let too many ambiguities or too much complexity into what appears to be an airtight knowledge set. From that point of view, a “perfect” narrowly band limited system of digital audio based on a finite and easily manipulated code of absolute number values is extremely attractive, and something like DSD based on relative values and with enormous bandwidth is just the opposite.

This makes subjective experience problematic when listeners start hearing things that aren’t easily controlled by the current standard of technology, or explained by the dominant textbook theory, and so we get a lot of gatekeeping demands like double blind tests, scientific studies, etc.

1 Like

And Sigma Delta Modulation isn’t tractable mathematically. It’s a feedback loop with essentially infinite gain which is hard to model linearly. But all I care about is that it sounds good…
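
To make the "feedback loop" concrete, here is a toy first-order, one-bit sigma-delta modulator in Python. This is a textbook-style sketch under stated assumptions, not how any particular DAC or ADC is built: real modulators are higher order, and the function name and the constant 0.3 test input are made up for illustration.

```python
def sigma_delta_1bit(samples):
    """First-order one-bit modulator: integrate the error, output its sign."""
    integrator = 0.0
    feedback = 0.0   # previous one-bit output fed back to the input
    bits = []
    for x in samples:
        integrator += x - feedback                    # accumulate the error
        out = 1.0 if integrator >= 0.0 else -1.0      # one-bit quantizer
        bits.append(out)
        feedback = out
    return bits

# A constant input of 0.3 (full scale is +/-1.0), heavily oversampled.
level = 0.3
bitstream = sigma_delta_1bit([level] * 10_000)
average = sum(bitstream) / len(bitstream)

print(sorted(set(bitstream)))   # only two output values
print(f"{average:.3f}")         # the density of +1s tracks the input level
```

The output only ever takes two values, yet its running average follows the input. The hard nonlinearity of the one-bit quantizer inside the loop is what makes the analysis awkward.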

2 Likes