Jitter is the variation in delay from one frame to the next. However, in the case of an Ethernet frame the application never sees this jitter. I’ll describe, while leaving out a lot of detail, the way Linux networking functions (generically, not talking about SR-IOV, affinity, or other configurable things).
The Ethernet NIC watches the wire for frames that belong to it (strictly speaking, the PHY handles the signaling and the MAC does the address filtering). This can be broadcast traffic, multicast traffic, or traffic specifically for its MAC address. When the NIC sees something of interest it moves those bits into network buffer memory and raises an interrupt, a sort of “hey, data here, come get it” sent to Linux. Linux then identifies which application the traffic belongs to and moves that data to memory space where the application can process it. How the application is made aware there is data to process… there are various ways this works.
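As one illustration of how an application can be told there is data waiting, here is a minimal Python sketch that asks the kernel to report when a UDP socket becomes readable and only then reads the datagram. The function names `receive_one` and `demo`, the payload, and the use of loopback are my own choices for the example, not anything specific to a real audio application; blocking reads, `epoll`, and signal-driven I/O are other common approaches.

```python
import selectors
import socket


def receive_one(sock, timeout=1.0):
    """Wait until the kernel marks the socket readable, then read one datagram.

    Returns the payload bytes, or None if nothing arrived within the timeout.
    """
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    try:
        # select() sleeps until the kernel has queued data for this socket.
        events = sel.select(timeout)
        if not events:
            return None
        data, _addr = sock.recvfrom(2048)
        return data
    finally:
        sel.close()


def demo():
    """Send one datagram over loopback and receive it (hypothetical payload)."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        tx.sendto(b"audio-frame-0", rx.getsockname())
        return receive_one(rx)
    finally:
        tx.close()
        rx.close()
```

Calling `demo()` should hand back the same bytes that were sent. Notice the application never touches the NIC or the interrupt; it only sees data the kernel has already queued for it.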
Anyway… that whole process of pulling bits off the wire and getting them to the application isn’t a perfect “clock” and has its own jitter. This is why everything is buffered, even networked audio. You don’t play the first audio packet you receive… you buffer anywhere from 50 ms to many seconds of audio before you start playback. This effectively hides any network jitter from the playback routines.
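To put rough numbers on that buffering, here is a small Python sketch of a fixed playout delay. It assumes packets arrive in order with none lost, that each packet carries 20 ms of audio, and that playback starts a fixed time after the first packet arrives. The function name `late_packets` and the arrival times are made up for illustration; real players usually use adaptive buffers.

```python
def late_packets(arrivals_ms, frame_ms=20, prebuffer_ms=50):
    """Return the indexes of packets that arrive after their playout deadline.

    arrivals_ms  -- arrival time of each packet in ms, in sequence order
    frame_ms     -- audio duration carried by each packet (assumed 20 ms)
    prebuffer_ms -- how long we wait after the first packet before playing
    """
    if not arrivals_ms:
        return []
    playback_start = arrivals_ms[0] + prebuffer_ms
    late = []
    for seq, arrival in enumerate(arrivals_ms):
        # Packet `seq` must be in memory by the time playback reaches it.
        deadline = playback_start + seq * frame_ms
        if arrival > deadline:
            late.append(seq)
    return late


# Hypothetical arrivals: nominally one packet every 20 ms, but jittery.
arrivals = [0, 35, 41, 95, 82, 100]

no_buffer = late_packets(arrivals, prebuffer_ms=0)
with_buffer = late_packets(arrivals, prebuffer_ms=50)
```

With no prebuffer, four of the six packets miss their slot. With 50 ms of prebuffer, none do, even though the arrival times are exactly the same. The jitter is still there on the network; the playback side just never notices it.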
If a frame is just gone, for whatever reason, then yes, you lose a segment of audio and you’ll get a pop, static, something. But networks are so reliable these days that it’s really rare to see packet loss on a home network. Switched networks are not plagued with the shared-wire “hub” problems from decades ago. Port contention, yes (but there are ways around this as well)… but not the collisions and retries from back in the “hub” days.