AES/EBU vs HDMI Digital Outputs

A perfect question.

Threads on this board often veer off in interesting ways.


Oh how I wish that the above were true… The problem is, a sequence of square waves going through a cable is not going to appear at the other end like it did at the transmitting end. Nature likes round corners, especially when transmitting voltage through a cable, which is another name for a transmission line. Specifically, transmission lines prefer sine waves. So the characteristics of the transmission line (with its various inductance, resistance, and length characteristics) determine whether the voltages present on the receiving end allow two things: recovering the clock and recovering the 0/1 data.

The point is, a cable is an analog device. This could be a rude awakening to those steeped in digital technology [those who are surprised should remember DSL, which attempted to transmit digital data over twisted-pair telephone wire - sometimes it worked and often it didn't]. Typically, at the receiving end of a transmission line, receivers don't see +1 volt, -1 volt, or 0 volts but instead see voltages which vary between 1 and -1, rounded corners, decreasing voltages (for a succession of +1 or -1 volts - due to DC buildup) and similar "corrupted" waveforms. Receivers have to guess at a 1 or 0 based on the voltage amplitude at a point in time.
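To make the "rounded corners" point concrete, here is a small sketch (not any specific interface - the filter constant and bit timing are invented) that models the cable as a first-order RC low-pass and shows a naive receiver thresholding the degraded waveform back into bits:

```python
# Sketch: a square wave "sent" through a cable modeled as a first-order
# RC low-pass. The receiver only sees rounded, attenuated edges and must
# threshold them back into bits.

def rc_filter(samples, alpha):
    """First-order low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

def to_bits(samples, threshold=0.5):
    """Naive receiver: decide 1 or 0 from the voltage at each sample."""
    return [1 if v > threshold else 0 for v in samples]

# Transmit each bit as 8 samples of +1 V or 0 V.
bits = [1, 0, 1, 1, 0, 0, 1]
tx = [float(b) for b in bits for _ in range(8)]

rx = rc_filter(tx, alpha=0.3)   # rounded corners after the "cable"
decoded = to_bits(rx[7::8])     # sample near the end of each bit cell

print(decoded)  # → [1, 0, 1, 1, 0, 0, 1]
```

With a mild filter the bits survive; make `alpha` small enough (a longer or worse cable) and the levels no longer cross the threshold in time, and the receiver starts guessing wrong.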

There are many "solutions" to the transmission of digital data, but some of the simplest are bipolar encoding and bit stuffing. Check out Bipolar Encoding on Wikipedia, which has a brief discussion of this. Bipolar encoding is instructive in that a succession of 1s is encoded as +1 volt followed by -1 volt followed by +1 volt followed by… The result is closer to a sine wave, which the transmission line likes. The problem then becomes a series of zeros, which can cause the receiver to lose the (sender's) clock. To combat this, the transmitter can stuff 1 bits at predetermined intervals so that a string of 0s is always broken up by the predetermined extra stuffing bit (1 or -1 volt), again recognizing that transmission lines like sine waves. The receivers, knowing this, throw away the stuffing bit and supposedly the original bit stream is recovered.
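The two ideas can be sketched in a few lines. This is an illustrative toy, not any real line code: the stuffing interval (every 4 data bits here) is an arbitrary choice, and real schemes differ in detail:

```python
# Sketch of bipolar (AMI) encoding plus bit stuffing: each 1 alternates
# between +1 and -1 so the line carries no DC; runs of 0s are broken up
# by stuffing an extra pulse after every N data bits.

def bipolar_encode(bits, stuff_every=4):
    """Return line levels; a stuffing pulse is inserted after every
    `stuff_every` data bits so the receiver never loses the clock."""
    levels, polarity, count = [], 1, 0
    for b in bits:
        if b:
            levels.append(polarity)
            polarity = -polarity       # alternate mark inversion
        else:
            levels.append(0)
        count += 1
        if count % stuff_every == 0:
            levels.append(polarity)    # stuffed transition
            polarity = -polarity
    return levels

def bipolar_decode(levels, stuff_every=4):
    """Strip the stuffing pulses and map any nonzero level back to 1."""
    bits, i = [], 0
    while i < len(levels):
        chunk = levels[i:i + stuff_every + 1]
        bits.extend(1 if v else 0 for v in chunk[:stuff_every])
        i += stuff_every + 1
    return bits

data = [1, 1, 0, 0, 0, 0, 0, 1]
line = bipolar_encode(data)
print(bipolar_decode(line) == data)  # → True
```

Note the decoder only works because it knows where the stream starts - which is exactly the synchronization problem raised below.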

Bit stuffing and bipolar encoding sound like a good solution, but one problem remains: if you stumble into the middle of a bit stream that uses bit stuffing, how do you know which is the stuffing bit and which is data? In Ethernet, you occasionally send a checksum and, if that doesn't check out, ask for a retransmission. Not so in real-time audio: you either got it right or you have corrupted audio (which audio geeks like to call distortion). In some digital transmission protocols, there is a "synchronization bit sequence" which can never appear in a data stream. I'll leave the research on that to you.

So yes, cables make a difference. I wish it were otherwise but sadly it is true. For electrical digital data cables (as opposed to fiber optics), and, especially in audio, I try to use the shortest possible cable and the best I can afford.


The devil is in the details. It all comes down to assumptions; reality is far more interesting.

God is also in the details.

It is obviously complicated.


Ok let me explain this one more time…

A digital sender switches the voltage on a wire… >1 volt == 1, <0.6 volt == 0… That is all that comes out of that source… a chain of timed 1s and 0s.

Now the wire gets its hands on the signal. It doesn't matter if it arrives at the other end as a sine wave, a triangle wave, a square wave, or your Uncle Jim's used socks. >1 volt == 1, <0.6 volt == 0.

Now the receiving end does not amplify, modify, or even keep this signal. It takes a very narrow sample in the center of the bit time. A bit window of 100 µs is probably sampled for less than 1 µs at its center. The incoming signal is not kept or processed in any way… If the sampling detects >1 volt, a 1 is latched into its input buffer. If the sampling detects <0.6 volt, a 0 is latched into its input buffer. Then everything moves over one bit and the process repeats.
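The center-sampling idea above can be sketched as follows. The oversampling factor and voltage levels are invented for illustration; only the thresholds (>1 V = 1, <0.6 V = 0) come from the post:

```python
# Sketch: the receiver ignores the waveform shape and latches a 1 or 0
# from a single narrow sample taken in the middle of each bit window.
# Everything between sample points is simply discarded.
import math

SAMPLES_PER_BIT = 10  # arbitrary oversampling factor for the sketch

def latch_bits(waveform, samples_per_bit=SAMPLES_PER_BIT):
    """Sample each bit cell once, at its center, and threshold."""
    bits = []
    center = samples_per_bit // 2
    for i in range(0, len(waveform) - samples_per_bit + 1, samples_per_bit):
        v = waveform[i + center]           # one narrow sample, mid-cell
        bits.append(1 if v > 1.0 else 0)   # >1 V → 1, otherwise 0
    return bits

# A degraded waveform: the levels sag and there is ripple, but the
# centers of the bit cells are still clearly above or below threshold.
clean = [1.4 if b else 0.2 for b in (1, 0, 1, 1, 0) for _ in range(SAMPLES_PER_BIT)]
noisy = [v + 0.15 * math.sin(0.9 * n) for n, v in enumerate(clean)]

print(latch_bits(noisy))  # → [1, 0, 1, 1, 0]
```

The noise never reaches the latched data because the decision is made once per bit, at the cleanest point of the cell.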

The information inside the receiving machine is derived from the travelling signal… but it is not the travelling signal. That is discarded immediately after sampling.

The signal on the cable can have noise, hum, and any number of other issues… but that signal is discarded as soon as it is sampled in the receiving system. It no longer exists… what we have is a perfect digital copy of the source inside the receiving device.

Thus the data on both ends is stored in memory as plain old binary 1s and 0s… an exact digital copy of the digital source… This new copy is then passed to your DAC, hard disk, flash card… whatever.

It is an exact bit-for-bit digital copy of the original stored in the receiving machine's memory… how can it possibly sound different?

I can think of only 2 scenarios where changing a digital data cable would affect the end sound from your system: 1) You have grounding problems and the new cable is better grounded, causing a "ground loop" current between audio sections of your gear which can induce hum --OR-- 2) The data cable you took out had poor connections and bits were being lost in the transfer of data.

The method above is not marvelous or revolutionary… It's how these things are routinely done… because an exact copy is needed at both ends of that cable.

No one thinks that a working audio system doesn't get correct bits over the wire… That's not the issue at all.

The issue is the timing of the bits. The timing matters because the bits are being "shoved down the DAC's throat" and hence the DAC needs to track the incoming clock. A PLL is the worst case because it tries to track every edge of every bit - i.e. it passes most of the jitter right on thru. The more technical answer is it passes jitter below the bottom of its control loop bandwidth. If the control loop has too low a bandwidth, then the DAC will get too far ahead of or too far behind the incoming samples and the buffer will overflow. Older DACs just had a one-sample buffer, so that limited the amount of jitter that was rejected. The next-least-bad case is an FLL, a control loop that only attempts to track the average frequency - but that still has a control loop bandwidth/buffer size tradeoff.
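The tracking-loop behavior can be demonstrated numerically. This is a minimal first-order loop with invented numbers, not any real DAC's clock recovery; it just shows that a phase-tracking loop is a low-pass filter on jitter - jitter below the loop bandwidth passes through, jitter above it is attenuated:

```python
# Sketch: a first-order tracking loop (PLL-like) chasing the incoming
# clock's phase. Low-frequency phase wander survives; high-frequency
# jitter is suppressed by the loop filter.
import math

def pll_track(phase_in, bandwidth=0.02):
    """First-order loop: local phase chases input phase at a fixed gain."""
    out, local = [], 0.0
    for p in phase_in:
        local += bandwidth * (p - local)  # phase detector + loop filter
        out.append(local)
    return out

def peak(xs):
    return max(abs(x) for x in xs)

n = 4000
slow = [0.5 * math.sin(2 * math.pi * 0.001 * i) for i in range(n)]  # LF jitter
fast = [0.5 * math.sin(2 * math.pi * 0.2 * i) for i in range(n)]    # HF jitter

slow_out = pll_track(slow)
fast_out = pll_track(fast)

# Print how much of each jitter component reaches the recovered clock.
print(peak(slow_out) / peak(slow), peak(fast_out) / peak(fast))
```

Raising `bandwidth` tracks the source more tightly (less buffer drift, more jitter passed); lowering it cleans the clock but lets the buffer fill level wander - the exact tradeoff described above.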

In any case, the jitter that gets thru is the low-frequency jitter that audio is most sensitive to. (The phase noise of the jitter is convolved with the audio, smearing out the frequency even of a sine wave, and since music is more than sine waves and it's a convolution, the energy of the effects from the jitter depends directly on the energy of the music…)

This is all simple control theory. When dealing with only digital data, all of this makes perfect sense and is common knowledge to practitioners of the field. But when audio comes into the picture people forget that jitter matters. It's simple to calculate the noise introduced into audio by jitter, but you need to know the phase noise of the remaining jitter that affects the final clock…
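As a back-of-the-envelope version of that calculation, the standard sampling-jitter bound says a full-scale sine at frequency f sampled with rms clock jitter t_j can achieve an SNR of at most about -20·log10(2π·f·t_j):

```python
# The standard jitter-limited SNR bound for sampling a full-scale sine.
import math

def jitter_snr_db(f_hz, tj_rms_s):
    """SNR ceiling (dB) for a full-scale sine of frequency f_hz
    sampled with rms clock jitter tj_rms_s."""
    return -20 * math.log10(2 * math.pi * f_hz * tj_rms_s)

# 1 ns of rms jitter on a 10 kHz tone limits SNR to about 84 dB -
# nowhere near the ~144 dB theoretical range of 24-bit audio.
print(round(jitter_snr_db(10_000, 1e-9), 1))  # → 84.0
```

This is only the white-jitter bound; as the posts above note, the spectral shape (phase noise) of the jitter matters as much as its rms value.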

[Edit: this isn't a bad intro: https://headfonics.com/2017/12/what-is-jitter-in-audio/]
[Edit 2: Also a good read about jitter: https://www.stereophile.com/reference/1093jitter/index.html]


Communication requires a transmitter, a receiver, and a medium to connect the two. Any one of the three can fail, spectacularly or subtly. As you point out, if everything is working correctly, it is perfect. In the engineering world, if I can paraphrase Forrest Gump, "(stuff) happens".

One of the most successful communications I have ever been a part of concerned buying $10K of audio gear. My wife said, "You do realize that $10K represents about 35 trips to the Seattle Symphony - where you'll experience unlimited dynamic range and absolutely zero distortion?" I replied that I did, but wasn't that like saying a broken clock, that is exactly right twice a day, is better than a clock that's off by no more than a minute at all times? She then informed me that I should never compare a condition of marriage with a broken clock.

Although English is not her first language, I'd rate this communication at a 97 of 100, because I'm sure I got her message. The remaining 3% is only because I was afraid to ask if she was telling me to buy a tux…


Wonderful fun post, Raystone!

And thanks Ted for the explanation and links.

You are making the same mistake everyone else is… assuming that the signals on the cable are somehow fed directly to the DAC, and in a working system they aren't…

Take the USB example…

The DAC is likely working at 96,000 samples per second. For 24-bit audio that is 288,000 bytes per second of raw D-to-A conversion.

However, USB is much faster than that. USB 1 (the slowest) can transfer up to 12 megabits per second… that is 1.5 megabytes per second, or roughly 5 times the speed of the DAC.

OOPS… we can't feed that directly into the DAC now, can we?

The answer is to buffer the data… the USB sender puts out a burst of maybe 1 megabyte at a time. This data is sampled and stored as I described above. Now it sits in the receiver's memory to be timed into the DAC at the correct sample rate. Once the buffer is nearly spent, the receiver sends a simple "More" signal and the sender transmits another burst of data to refill the buffer.
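The burst-and-refill scheme can be sketched like this. The buffer size, low-water mark, and burst size are invented for illustration (real USB audio uses much larger numbers and a defined packet protocol):

```python
# Sketch: bursts arrive much faster than the DAC consumes samples, so
# they land in a buffer; the receiver asks for "More" when it runs low.
from collections import deque

BUFFER_SIZE = 16   # samples the receiver can hold
LOW_WATER = 4      # refill threshold
BURST = 12         # samples per burst from the sender

class Receiver:
    def __init__(self, source):
        self.buf = deque()
        self.source = source      # callable returning the next burst

    def _refill(self):
        burst = self.source(BURST)
        self.buf.extend(burst[:BUFFER_SIZE - len(self.buf)])

    def next_sample(self):
        """Clock one sample out to the DAC at the DAC's own rate."""
        if len(self.buf) <= LOW_WATER:
            self._refill()        # send "More" upstream
        return self.buf.popleft()

# Fake sender: an endless ramp of sample values delivered in bursts.
counter = iter(range(10**6))
sender = lambda n: [next(counter) for _ in range(n)]

rx = Receiver(sender)
out = [rx.next_sample() for _ in range(40)]
print(out == list(range(40)))  # → True: samples arrive in order, no gaps
```

The key point is that `next_sample` runs on the DAC's clock while the bursts run on the sender's clock - the buffer decouples the two rates.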

The trick with the sampling window I described above is pretty cool… It is sampling the input at the centre of each bit window… The incoming bit can be early or late by up to 49% and it will still read the data accurately into the DAC's memory… so much for our jitter problem.

This isn't analog processing where the signal on the RCA jack is fed directly into the input of an amplifier… it is much more complex than that.

You canā€™t get around the laws of physics, no matter how much you argue.

No one said that the bits are fed directly into the DAC; I've always mentioned the buffer they go into.

Still, something has to clock them out of the buffer. Where does that clock come from? How does it track the incoming clock? Is that tracking perfect? If so, then all of the incoming clock's jitter is in the output clock. Is that tracking sloppy? Then how often would the buffer overflow? No matter how you slice it, jitter from the incoming clock affects the DAC's clock (the one that takes the data out of the buffer.) The ironic part is that (unlike high-speed digital) the part of the jitter that most affects the audio is the part that gets thru:

Search for these words [phase noise pll control theory] and you'll get plenty of good info, a few examples:

https://en.wikipedia.org/wiki/Phase-locked_loop
http://www.analog.com/media/en/training-seminars/tutorials/MT-086.pdf
http://www.highfrequencyelectronics.com/index.php?option=com_content&view=article&id=1354:phase-locked-loop-noise-transfer-functions&catid=134:2016-01-january-articles&Itemid=189

The difference between bit-perfect data transfer (e.g. confirmed by an MD5 checksum) and streaming-related transfer (jitter-affected) seems to be a recurring source of confusion, similar to trying to explain to people that I2S has nothing to do with an HDMI signal. After a couple of days, the same question surfaces again.

PS:
I thought the S/PDIF receiver uses a Schmitt trigger to determine the 0 & 1 threshold on the rising and falling edges - or maybe this technique is outdated?

A comment on USB transfer:
Here the transfer is done in packets.

Since they have to operate at different speeds, there has to be two clocks. One to track the incoming data on the USB and one to time data out into the DAC.

The DAC clock can be extremely precise using a high frequency crystal and divider circuits.

The USB clock would (IIRC) be derived from the incoming data, so that it always samples at the center bit mark.

Buffer overflows and underflows are handled just like they are in any serial interface… with signals generally passed back and forth on the cable… "More", "Pause", "Error" etc.

Most often the whole thing is under FPGA or MCU control…

S/PDIF is Manchester encoded. For example, each bit is in a time unit, and, say, the falling clock is always on the unit boundaries; then if the rising clock is in the middle you call it a 1, and if it's about 1/4 of the way thru you call it a zero. You can implement that with simple logic: some gates, some flip-flops, and a PLL (which you can build with a non-linear element and some gates and flip-flops…) I'll post a circuit when I have time.
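The biphase-mark variant used by S/PDIF can be sketched in software. This is a simplified model (no preambles or subframe structure): the line toggles at every bit-cell boundary, and a 1 adds an extra toggle mid-cell while a 0 does not:

```python
# Sketch of biphase-mark coding (the Manchester family S/PDIF uses):
# two half-cells per bit; decode a bit by comparing its two halves.

def bmc_encode(bits, level=1):
    """Return two line levels (half-cells) per bit, starting from `level`."""
    halves = []
    for b in bits:
        level = -level            # mandatory toggle at cell boundary
        halves.append(level)
        if b:
            level = -level        # extra mid-cell toggle encodes a 1
        halves.append(level)
    return halves

def bmc_decode(halves):
    """A bit is 1 when its two half-cells differ, 0 when they match."""
    return [1 if halves[i] != halves[i + 1] else 0
            for i in range(0, len(halves), 2)]

data = [1, 0, 0, 1, 1, 0]
line = bmc_encode(data)
print(bmc_decode(line) == data)  # → True
```

Because there is a guaranteed transition at every cell boundary regardless of the data, the receiver's PLL can always recover the clock - even from a long run of zeros.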

It's easier in software :slight_smile:

Nope, we were talking about AES3, TOSLink or S/PDIF (see the title of this thread), where there is no path from the DAC back to the source. The DAC has two choices: track the incoming clock (which passes the low-frequency phase noise to the local clock) or use Asynchronous Sample Rate Conversion which, tho it allows the local clock to be cleaner, also changes the data as it encodes the incoming jitter into the data.

You do realize I design DACs for a living, and in them I use an FPGA and write the software for it, don't you?

It depends on whether S/PDIF is using NRZ logic or not… USB does.

Rather than have the thing switching back and forth for every bit, it will simply leave the signal where it is for consecutive bits… so for two 1s in a row it will not switch to 0 then back to 1… it just stays at 1 for two bit times. The same is true for consecutive 0s… it just stays at 0 until a 1 comes along.
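That hold-the-level behavior is easy to show. A minimal sketch (illustrative timing, not any real interface's waveform):

```python
# Sketch of NRZ: the line level simply holds for the whole bit cell, with
# no return to zero between consecutive identical bits.

def nrz_waveform(bits, samples_per_bit=4):
    """Map each bit to a held level (1 -> 1.0 V, 0 -> 0.0 V)."""
    return [1.0 if b else 0.0 for b in bits for _ in range(samples_per_bit)]

wave = nrz_waveform([1, 1, 0])
print(wave)  # → eight 1.0s (two 1-bits held high) then four 0.0s

# Only one edge in the whole waveform - which is exactly why the
# receiver needs its own timing reference to find the bit centers.
transitions = sum(1 for a, b in zip(wave, wave[1:]) if a != b)
print(transitions)  # → 1
```

Compare this with the biphase/Manchester scheme discussed earlier, which guarantees an edge every bit cell at the cost of doubling the transition rate.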

A Schmitt trigger can be used to detect transitions… but you still have to sample in the centre of each bit time to get the actual data back.

As I said, S/PDIF is using Manchester encoding…

Had to look that one up… I knew it as Biphase Encoding…

Yes, I knew that. Thatā€™s why I was kind of surprised to see you taken in by the rather naive notion that digital cables change the sound.

You do realise that as far back as the early 1980s I was designing and servicing computers for a living… don't you?

Yep, there are a confusing number of protocols (and subtle variations on them)

Rather than looking thru the definitions to get the specific choices of up/down/transitions, 0s, 1s, I just hook up the scope.

AES, etc. cheat the low level protocol with invalid transitions every once in a while to mark their frames, but otherwise they are a simple protocol.

https://en.wikipedia.org/wiki/Manchester_code

That's why I'm surprised you don't seem to know anything about the insidiousness of jitter and are taken in by the rather naive notion that bits are bits in audio.