Thomson Explains: SMPTE 2110 - Part 3
PTP, NMOS, and SDPs!
Tier 3: Advanced SMPTE 2110 Knowledge
Tier 3 delves into the specifics of SMPTE ST 2110 and related standards that form a modern IP media system. This is where you get into the guts of 2110: the exact parts of the standard, the role of PTP in detail, how audio, video, and data are encapsulated, and the emerging protocols (like NMOS) that make a large 2110 system manageable. After Tier 3, you should have a conceptual blueprint of a SMPTE 2110 system in your head and how all the pieces fit.
Precision Time Protocol (PTP) and SMPTE ST 2059 (Timing in IP)
We introduced PTP in Tier 2; now let’s expand on it. PTP (IEEE 1588) is the cornerstone of synchronization in IP media systems, and SMPTE 2110 explicitly relies on the SMPTE ST 2059-2 PTP profile for broadcast applications.
How PTP Works (Basics): PTP uses a leader-follower paradigm. One device is elected as the Grandmaster (GM) clock (usually a dedicated time server disciplined by GPS/GNSS for accuracy). All other devices (PTP clients) adjust their clocks to match the GM. PTP achieves this by exchanging timestamped packets (Sync, Follow_Up, Delay_Request, Delay_Response) to measure network delays and calibrate each client’s clock. The result: every camera, mixer, audio console, etc., has its clock synchronized to within microseconds. The SMPTE ST 2059-2 “broadcast profile” of PTP sets specific parameters (message rates, domain number, etc.) appropriate for video – e.g. it uses PTP domain 127 by default for ST 2110 systems and sends Sync messages 8 times per second, faster than general-purpose PTP profiles.
By contrast, the AES67 audio profile often uses domain 0 with different message rates – important if integrating with other audio networks.
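To make the exchange concrete, here is a minimal sketch (not production code) of the arithmetic a PTP client performs with the four timestamps from one Sync/Delay_Request exchange. The timestamp values are invented purely for illustration.

```python
# Minimal sketch of the PTP offset/delay math a client performs.
# t1: GM sends Sync, t2: client receives Sync,
# t3: client sends Delay_Request, t4: GM receives Delay_Request.
# Timestamp values below are invented for illustration (seconds).

t1 = 1000.000000000   # Sync departure (GM clock)
t2 = 1000.000150200   # Sync arrival (client clock)
t3 = 1000.000300000   # Delay_Request departure (client clock)
t4 = 1000.000350100   # Delay_Request arrival (GM clock)

# Assuming a symmetric network path, mean path delay and clock offset are:
mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
offset_from_master = ((t2 - t1) - (t4 - t3)) / 2

print(f"path delay ≈ {mean_path_delay * 1e6:.3f} µs")
print(f"client offset ≈ {offset_from_master * 1e6:.3f} µs")
# The client then slews its clock by -offset_from_master to converge on the GM.
```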
PTP Grandmaster and Domains: In a given PTP domain, there is one active Grandmaster. If multiple clocks are present, the Best Master Clock Algorithm (BMCA) elects one based on priority and quality parameters. It’s crucial that your whole facility is on the same PTP domain (usually domain 127 for ST 2110). If a device is set to a different domain, it will ignore the GM and either free-run or try to be its own GM – effectively not syncing with others. For example, an audio device left at AES67 default domain 0 won’t lock to the video GM on domain 127 – leading to drift. Or if someone accidentally introduces a second GM on the same domain with a higher priority, it could overthrow the intended leader, causing chaos. (We’ll see in Tier 4 how to detect a rogue PTP leader.)
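As a rough illustration of how BMCA picks a winner – and of the rogue-GM scenario above – the sketch below ranks hypothetical clocks by the IEEE 1588 dataset-comparison order (priority1, clockClass, clockAccuracy, variance, priority2, then clockIdentity as tiebreaker). It ignores stepsRemoved and topology, so treat it as a simplified model, not the full algorithm; the clock attributes are made up.

```python
# Simplified BMCA illustration: lower values win at each comparison step.
# Clock attributes are hypothetical examples, not taken from real devices.

candidates = [
    # (name, priority1, clockClass, clockAccuracy, variance, priority2, clockIdentity)
    ("GM-A (GPS-locked)",    128, 6,   0x21, 0x4E5D, 128, "08-00-11-ff-fe-21-e1-b0"),
    ("GM-B (GPS-locked)",    128, 6,   0x21, 0x4E5D, 129, "08-00-11-ff-fe-33-aa-02"),
    ("Rogue (free-running)",  10, 248, 0xFE, 0xFFFF, 128, "00-1b-21-ff-fe-00-00-01"),
]

def bmca_key(clock):
    name, p1, clock_class, accuracy, variance, p2, identity = clock
    return (p1, clock_class, accuracy, variance, p2, identity)

best = min(candidates, key=bmca_key)
print("Elected grandmaster:", best[0])
# The rogue clock wins here purely because someone set priority1=10, even
# though its clockClass (248, free-running) is far worse -- exactly the
# "rogue GM" failure mode described above.
```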
Locking and Alignment: All 2110 senders use PTP time to timestamp their RTP packets (each RTP packet carries a timestamp derived from the IEEE 1588 clock in its header). This allows receivers to play out multiple essences in sync. ST 2110-10 requires all devices to share a common PTP epoch, so a video frame’s timestamp corresponds to a precise point on the common timeline defined by ST 2059. The accuracy target is on the order of 1 µs for professional media.
In practical terms, this means that if two cameras point at a clock, their captured frame timestamps differ by no more than about a microsecond. PTP makes that possible.
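A small sketch of how a sender derives an RTP timestamp from PTP time under ST 2110-10: the video media clock runs at 90 kHz and RTP timestamps wrap at 32 bits. The capture time used here is illustrative.

```python
# Sketch: deriving a 2110 RTP timestamp from PTP time (ST 2110-10 uses a
# 90 kHz media clock for video; RTP timestamps are 32-bit and wrap).
VIDEO_CLOCK_HZ = 90_000

def rtp_timestamp(ptp_seconds_since_epoch: float, clock_hz: int = VIDEO_CLOCK_HZ) -> int:
    """Media-clock ticks since the PTP epoch, truncated to 32 bits."""
    ticks = int(ptp_seconds_since_epoch * clock_hz)
    return ticks % (2 ** 32)

# Two senders that sample the same instant produce the same timestamp,
# which is what lets a receiver align video, audio and ANC flows.
capture_time = 1_700_000_000.0166833   # illustrative PTP time in seconds
print(rtp_timestamp(capture_time))                  # video (90 kHz clock)
print(rtp_timestamp(capture_time, clock_hz=48_000)) # audio (48 kHz clock)
```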
PTP in the Network: PTP packets are multicast (to 224.0.1.129 in IPv4). Network switches can either treat them like any other packet or (ideally) participate in PTP timing. High-end switches offer Boundary Clock (BC) or Transparent Clock (TC) modes. A Boundary Clock switch acts as a PTP client to the GM and then itself becomes a local GM for devices downstream – this reduces load and jitter in the timing distribution. In media deployments it is recommended to use switches with Boundary Clock enabled, to simplify sync distribution.
Without BC, PTP traffic is simply forwarded like multicast; this works on a small flat network, but in large or routed networks it can get complex (you’d need to allow PTP through routers, potentially use multicast routing, etc. – something to avoid if possible). For a junior engineer: know whether your network infrastructure supports PTP and how it’s configured. Misconfiguring PTP (like leaving a switch in default mode that filters or delays PTP packets) will cripple your media streams.
Blunt point: If PTP isn’t solid, NOTHING in a 2110 facility will be reliable. Always configure at least two PTP Grandmasters (primary and backup) and ensure all endpoints report “PTP Locked”. PTP status is the first thing to check when devices show sync errors. SMPTE chose PTP for “media signal generation and synchronization” in IP systems, making it as mission-critical as black burst was in the analog/SDI days.
SMPTE ST 2110 Suite Overview (Essence Streams and Specifications)
Now we focus on the SMPTE ST 2110 standard itself. ST 2110 is not a single document but a suite of standards (several parts) that describe how to send video, audio, and data over an IP network in real time. The SMPTE ST 2110 suite “specifies the carriage, synchronization, and description of separate elementary essence streams over IP for real-time production and playout.” The foundation came from the Video Services Forum’s TR-03 recommendation – which basically said “send each essence as its own RTP stream over IP”. Here are the key parts of SMPTE ST 2110 a junior engineer should know:
ST 2110-10: System Timing and Definitions. This is the top-level document that outlines the overall principles. It establishes that senders and receivers must use PTP timing (the SMPTE 2059-2 profile), and it defines common terms and behaviors. Think of -10 as the umbrella: if you read only one part, -10 gives the general rules (e.g. all streams use RTP over UDP/IP, all devices must have PTP, etc.).
ST 2110-20: Uncompressed Active Video. This part covers video essence transport. It defines how to packetize uncompressed video frames into RTP packets. Video is sent as a series of RTP packets for each frame (or line), excluding the blanking intervals – only active picture is sent, saving bandwidth versus SDI, which carried blanking. Key for engineers: 2110-20 video is high bandwidth (e.g. HD ~1.5 Gbps, UHD ~12 Gbps per stream), so it stresses your network. It’s also sensitive to packet timing (we’ll get to -21 next). Each video flow is identified by a multicast IP:port and has an SDP description (format, resolution, etc.).
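To get a feel for those bitrates, here is a back-of-the-envelope calculator for the active-picture payload of a 2110-20 flow. It ignores RTP/UDP/IP/Ethernet overhead, so the results sit a bit below the SDI-equivalent link rates quoted above; formats and figures are illustrative.

```python
# Rough active-video payload bitrate for a 2110-20 flow (4:2:2 sampling).
# Ignores RTP/UDP/IP/Ethernet overhead (roughly a few percent extra on the wire).

def active_video_bps(width, height, fps, bit_depth=10, samples_per_pixel=2.0):
    # 4:2:2 carries 2 samples per pixel (Y plus alternating Cb/Cr).
    return width * height * samples_per_pixel * bit_depth * fps

for name, (w, h, fps) in {
    "1080i59.94 (HD)":  (1920, 1080, 29.97),   # interlaced: ~30 full frames/s
    "1080p59.94 (3G)":  (1920, 1080, 59.94),
    "2160p59.94 (UHD)": (3840, 2160, 59.94),
}.items():
    print(f"{name}: ~{active_video_bps(w, h, fps) / 1e9:.2f} Gbps")
```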
ST 2110-21: Traffic Shaping for Video. A critical but slightly abstract part – it defines timing models for how video packets are sent. Because a whole video frame’s worth of packets could theoretically burst out of a sender as fast as the NIC can send (which would overwhelm some receivers or network buffers), 2110-21 introduced the concept of “narrow senders” that pace out the packets over the frame interval. It defines two traffic models: Type N (narrow) and Type W (wide) senders, with specific timing constraints on packet emission. For practical purposes, as an operator you want to ensure your senders are “narrow” (most professional equipment is) so that traffic is smooth. If you see a device marked as “2110-21 Type W”, be cautious as it may burst traffic more. -21 also helps ensure deterministic buffering – a receiver can allocate a certain size buffer if it knows the sender is narrow. This is deep engineering detail, but awareness matters if you debug packet loss – if a sender doesn’t conform to -21, you could overflow switch buffers. The 2110-21 spec essentially guarantees that even high bitrate video is network-friendly if everyone follows the rules.
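As a simplified illustration of why -21 pacing matters, the sketch below estimates how many packets a 1080p59.94 frame needs and the inter-packet gap if the sender spreads them evenly across the frame period (roughly the linear pacing idea). The 1,200-byte payload size is an assumption, and the real -21 timing models are more detailed than this.

```python
# Simplified pacing estimate for a "narrow" 2110-20 sender.
# Assumes ~1200 bytes of video payload per RTP packet (an assumption;
# real packing depends on the packing mode in use).

frame_bits = 1920 * 1080 * 2 * 10          # active 4:2:2 10-bit frame
frame_bytes = frame_bits // 8
payload_per_packet = 1200
frame_period_s = 1 / 59.94

packets_per_frame = -(-frame_bytes // payload_per_packet)   # ceiling division
even_spacing_us = frame_period_s / packets_per_frame * 1e6

print(f"{packets_per_frame} packets per frame")
print(f"~{even_spacing_us:.2f} µs between packets if paced evenly")
# A wide (Type W) sender may instead emit the same packets in a much
# shorter burst, which is what stresses receiver and switch buffers.
```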
ST 2110-30: PCM Digital Audio. This part specifies carriage of audio (usually 24-bit PCM) in RTP streams, based on the AES67 profile. It typically allows up to 8 channels per stream (though more can be supported). Audio flows have much smaller bitrates than video (e.g. roughly 9–10 Mbps for 8 channels of 48 kHz/24-bit), and one video typically has multiple associated audio flows. Important: 2110-30 audio must also lock to PTP (it uses the same epoch, so audio samples can align with video frame boundaries if needed). If an audio device isn’t PTP-locked, you’ll get drift or slips (just like unsynchronized audio in analog).
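For scale, this quick sketch computes the packet size and payload bitrate of a 2110-30 stream carrying 8 channels of 48 kHz/24-bit audio with a 1 ms packet time (the common case); the per-packet overhead figure is a rough estimate.

```python
# 2110-30 (AES67-style) audio stream sizing: 8 ch, 48 kHz, 24-bit, 1 ms packets.
channels, sample_rate, bytes_per_sample = 8, 48_000, 3
packet_time_s = 0.001                      # 1 ms packet time

samples_per_packet = int(sample_rate * packet_time_s)          # 48 per channel
payload_bytes = samples_per_packet * channels * bytes_per_sample
packets_per_second = int(1 / packet_time_s)

payload_mbps = payload_bytes * 8 * packets_per_second / 1e6
overhead_bytes = 54                        # rough RTP+UDP+IP+Ethernet estimate
wire_mbps = (payload_bytes + overhead_bytes) * 8 * packets_per_second / 1e6

print(f"{payload_bytes} bytes of audio per packet, {packets_per_second} packets/s")
print(f"~{payload_mbps:.1f} Mbps payload, ~{wire_mbps:.1f} Mbps on the wire")
```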
ST 2110-31: AES3 Transparent Audio Transport. This carries non-PCM audio signals (AES3 encapsulated) like Dolby-E or other compressed audio that fits in an AES3 data stream. It’s less common for a new engineer to handle, but be aware it exists in case your facility passes Dolby-E (used in some broadcast workflows).
ST 2110-40: Ancillary Data. This covers how ancillary data (e.g. captions, timecode, SCTE markers) is carried in its own RTP stream. Each 2110-40 stream can carry multiple ANC “packets” as they were in SDI VANC. For example, you might have a teletext caption ANC data stream associated with a video. The key is that things like timecode can be sent separately and remain in sync via PTP timestamps. A practical example: your facility might embed VITC timecode in SDI for legacy gear, but also produce a 2110-40 stream with that same timecode so IP-only gear can get it. As a junior engineer, you likely just route these like any other stream when needed.
ST 2110-22: Compressed Video (CBR). This part (added later) allows certain compressed video formats (like JPEG XS or J2K) to be sent within the 2110 framework. It’s essentially a way to reduce bandwidth when needed, while still using RTP. Not every system uses -22, but remote productions or bandwidth-limited links might. If your facility is full 10/25Gb, you likely stick to uncompressed (-20). But if you see mention of JPEG XS over 2110, that’s -22 in action (useful to know but maybe above “absolute minimum” – just keep it in mind).
Note! (There are other parts like ST 2110-41 for more advanced metadata mapping, ST 2110-23 for combining multiple video streams for a single ultra-high bitrate signal, etc., but those are specialized. I recommend you focus on -10, -20, -30, -40, and -21 as the core.)
In summary, SMPTE 2110 turns what used to be one SDI signal into multiple IP flows: one for video, one or more for audio, one for ancillary data, all synchronized via PTP. The standards above detail how each is formatted. A good exercise: take a simple scenario – one camera with two audio channels and timecode – and enumerate what streams you’d have:
A 2110-20 video stream (e.g. 1080p60).
A 2110-30 audio stream (2 channels).
A 2110-40 ANC stream (carrying timecode, maybe also CC data).
All share the same PTP clock and will have consistent timing.
Citing a reference: The SMPTE FAQ succinctly says the suite covers “uncompressed video and audio streams” (2110-10/-20/-30), “traffic shaping” (2110-21), “compressed video” (2110-22), “AES3 audio” (2110-31), and “ancillary data packets” (2110-40) in RTP over UDP/IP. Those are the buzzwords to remember.
Session Description Protocol (SDP) and Stream Descriptions
When you plug an SDI cable from a camera into a monitor, the monitor immediately gets the video – the format is implicit in the electrical signal. In IP, when a receiver joins a multicast stream, how does it know what it’s receiving (resolution, frame rate, audio channels, etc.)? Enter SDP files.
What SDP Is: Session Description Protocol is a format for describing streaming sessions. In SMPTE 2110, each sender (e.g. a camera or device output) generates an SDP file (a small text file) that describes the stream’s parameters – e.g. for a video stream: “This is 1080p59.94, RTP payload type 112, resolution 1920x1080, colorimetry BT.709, PTP timestamp format…” etc. For an audio stream: channel count, sample rate, bit depth, etc. The SDP also specifies the multicast address and port for that flow. Essentially, the SDP is the digital equivalent of labeling a cable and specifying the signal format.
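To make that concrete, here is a hypothetical, hand-written SDP for a single 2110-20 video flow (addresses, IDs, and parameter values are invented). Real files from your devices will differ in detail, but the overall shape is the same:

```
v=0
o=- 1443716955 1443716955 IN IP4 192.168.10.5
s=Camera1 Video
t=0 0
m=video 5004 RTP/AVP 112
c=IN IP4 239.10.10.1/64
a=rtpmap:112 raw/90000
a=fmtp:112 sampling=YCbCr-4:2:2; width=1920; height=1080; exactframerate=60000/1001; depth=10; colorimetry=BT709; PM=2110GPM; TP=2110TPN; SSN=ST2110-20:2017
a=ts-refclk:ptp=IEEE1588-2008:08-00-11-FF-FE-21-E1-B0:127
a=mediaclk:direct=0
```

The m= and c= lines give the destination port and multicast group, the fmtp line carries the video format, and the ts-refclk line identifies the PTP grandmaster and domain the sender is locked to.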
How SDP is Used: In a 2110 facility with a control system, when you route a source to a destination, the controller will provide the destination with the source’s SDP info. Typically the source device hosts the SDP file (accessible via an API or a directory), and the control system “gets” that SDP and then instructs the receiver to “set up and receive according to this SDP.” One SDP per flow is the norm. For example, an IP camera might have an SDP for its video flow, and separate SDPs for each of its audio flows. The system uses these to connect the streams properly.
Why You Should Care: Troubleshooting stream compatibility often comes down to SDP mismatches. If a receiver is not showing video, perhaps it doesn’t support the format advertised in the SDP (e.g. maybe the camera is 1080p60 but the receiver only handles up to 1080p30). Or if someone manually types the wrong multicast address, the SDP in the receiver might not match the actual stream. Understanding SDP allows you to independently verify streams. You can open an SDP file in a text editor – it’s mostly human-readable. For example, it might contain lines like a=framerate:59.94 or a=recvonly.
SDP and Standards: SMPTE 2110 builds on existing standards – SDP is defined by the IETF (RFC 4566) and adopted for use in ST 2110. It’s not something you have to memorize, but you should be able to find and read an SDP when needed. Many vendor UIs will show the SDP or at least the key parameters.
In practice, a junior engineer might use SDP as follows: If a device won’t lock to a stream, retrieve the SDP from the source and compare it to what the receiver expects. Are payload types matching? Is the PTP timing indicated correct? Tools like Wireshark can also interpret SDPs. Think of SDP as the “spec sheet” for a stream – always check the spec if there’s a problem.
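A tiny sketch of that “check the spec sheet” habit: pull the key fields out of an SDP and eyeball them against what the receiver claims to support. Attribute names follow the 2110-20 example above; the file name and capability check are made-up illustrations.

```python
# Sketch: extract key parameters from a 2110 video SDP for a sanity check.
import re

def sdp_summary(sdp_text: str) -> dict:
    """Pull out multicast address, payload type, and selected fmtp parameters."""
    info = {}
    m = re.search(r"^c=IN IP4 (\S+)", sdp_text, re.M)
    if m:
        info["connection"] = m.group(1)
    m = re.search(r"^m=video (\d+) RTP/AVP (\d+)", sdp_text, re.M)
    if m:
        info["port"], info["payload_type"] = int(m.group(1)), int(m.group(2))
    m = re.search(r"^a=fmtp:\d+ (.+)$", sdp_text, re.M)
    if m:
        for param in m.group(1).split(";"):
            if "=" in param:
                key, _, value = param.strip().partition("=")
                info[key] = value
    return info

# summary = sdp_summary(open("camera1_video.sdp").read())   # hypothetical file
# print(summary.get("width"), summary.get("height"), summary.get("exactframerate"))
# Compare these against the receiver's supported formats before blaming the network.
```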
NMOS (Networked Media Open Specifications) – Discovery, Control, and Management
As systems scale to dozens or hundreds of streams, manually handling IP addresses and SDPs becomes impractical. NMOS, developed by AMWA, is a set of open specifications that sits above SMPTE 2110 to handle device discovery, registration, and connection management.
Discovery & Registration (IS-04): NMOS IS-04 is like a directory service. When an ST 2110 device (camera, monitor, etc.) boots up, it registers itself and its streams to an NMOS registry (usually an HTTP API running on a server). It announces: “I am Device A, I have a source named Camera1, which has a video flow at this multicast address, here’s the SDP, etc.” This allows a central overview of all available senders and receivers in the network. Instead of typing in IPs, a user can see friendly names.
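For a flavour of IS-04, the sketch below asks a registry’s Query API for all registered senders. The registry address is hypothetical, and the API version path (v1.3 here) depends on what your registry actually exposes.

```python
# Sketch: list senders known to an NMOS IS-04 registry via its Query API.
# Registry address and version are assumptions for illustration.
import requests

REGISTRY = "http://nmos-registry.example.local:8080"   # hypothetical

resp = requests.get(f"{REGISTRY}/x-nmos/query/v1.3/senders", timeout=5)
resp.raise_for_status()

for sender in resp.json():
    # Each sender resource carries an id, a label, and (usually) a manifest_href
    # pointing at the SDP file for its flow.
    print(sender["id"], sender.get("label"), sender.get("manifest_href"))
```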
Connection Management (IS-05): Once discovery is done, NMOS IS-05 provides a way to connect streams. A controller can tell a receiver “subscribe to that sender” through an API call (essentially, “set your RX to this multicast address/port”). Under the hood, the receiver will get the SDP and configure accordingly, but NMOS abstracts that away. This means operators might simply choose sources and destinations by name on a panel, and NMOS does the heavy lifting. For an engineer, knowing NMOS means you can interface with these APIs or at least understand what the broadcast controller is doing.
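And a matching IS-05 sketch: stage a sender’s SDP onto a receiver and activate it immediately. The node address, UUIDs, SDP file, and API version (v1.1) are placeholders; the staged/activation structure is the part worth recognising when you read controller logs.

```python
# Sketch: IS-05 connection request -- tell a receiver to take a given sender.
# The node address, receiver/sender UUIDs and SDP file are placeholders.
import requests

NODE = "http://receiver-node.example.local:8080"        # hypothetical device
RECEIVER_ID = "6a3b2c10-0000-0000-0000-000000000001"     # placeholder UUID
SENDER_ID = "9f1d4e20-0000-0000-0000-000000000002"       # placeholder UUID
sdp_text = open("camera1_video.sdp").read()              # hypothetical SDP file

patch = {
    "sender_id": SENDER_ID,
    "master_enable": True,
    "activation": {"mode": "activate_immediate"},
    "transport_file": {"type": "application/sdp", "data": sdp_text},
}

url = f"{NODE}/x-nmos/connection/v1.1/single/receivers/{RECEIVER_ID}/staged"
resp = requests.patch(url, json=patch, timeout=5)
resp.raise_for_status()
print(resp.json().get("activation"))   # shows whether the activation was accepted
```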
Why NMOS Matters: NMOS was created to ensure interoperability between multivendor IP gear and to avoid proprietary control systems locking you in. It’s supported by the Joint Task Force on Networked Media (JT-NM), which includes AMWA, EBU, SMPTE, and VSF.
The goal is “straightforward interoperability between products from a wide range of manufacturers” by providing a common control layer on top of ST 2110.
In plain terms: it lets a Sony camera and a Grass Valley switcher and a Lawo audio console all register their streams in one place and connect them without custom drivers. As a junior engineer, you might not write NMOS APIs, but you should know what IS-04 and IS-05 refer to (discovery and connection management, respectively), and be aware if your facility uses an NMOS controller or a vendor-specific system.
Other NMOS Specs: NMOS has grown: IS-08 (Audio Channel Mapping), IS-07 (Events & Tally), NMOS security specs, etc. Those are more advanced, but if you work with audio channel management (like mapping intercom channels), IS-08 could be relevant. To start, focus on IS-04/05 basics: they are essential for automating large IP systems. Without NMOS, you’d be configuring each device’s multicast subscriptions manually – possible for a demo, not for a live TV station with hundreds of signals.
In practice: Make sure you know how to access the NMOS registry (it might be a web UI or REST API). If a new device isn’t showing up, maybe it didn’t register – you might need to troubleshoot its network or the registry. If a stream won’t connect, check the NMOS logs to see if the connection request was sent/acknowledged. Understanding NMOS fosters a mindset that we treat the system more like IT (with service discovery) rather than static wiring.
Hybrid SDI/IP Environments and Legacy Integration
As of now, many facilities are hybrid – part IP, part SDI. Tier 3 knowledge includes understanding how to integrate the two worlds strategically.
Gateways and Conversion: We already touched on IP/SDI gateways. At a deeper level, appreciate the challenges: e.g., converting an IP stream to SDI introduces a bit of latency (buffering a frame perhaps). If you have parallel IP and SDI paths of the same signal, they might not be exactly time-aligned without a frame sync. An engineer must plan for these offsets.
Genlock and PTP Bridging: In a hybrid plant, you likely still have a black burst reference for SDI gear and PTP for IP gear. Those two reference systems must be locked together. This is done by locking the PTP Grandmaster to the house sync (or vice versa). For instance, a Grandmaster can output a black burst signal (many GPS/PTP generators have analog sync outputs) so that legacy SDI gear and the PTP-driven IP gear are all essentially in one sync domain. It’s possible to run them separately, but then an IP-to-SDI gateway has to perform frame synchronization (buffer and align frames) which adds latency. Better to lock them. Know the devices in your chain: if you have a SPG (sync pulse generator) and a PTP Grandmaster, ensure one disciplines the other.
Video and Audio Consoles: Some production switchers and audio consoles might internally still work with SDI/AES signals and use gateway cards at their edges for IP I/O. When working with these, you might be dealing with a mix of IP streams going in/out, but the console’s internal routing is like an SDI router. Be mindful of where conversion happens (usually those devices will hide it, but under stress, you might have to check those gateway modules).
Operational Workflow: Operators shouldn’t have to care whether a source is coming via SDI or IP. Your job as an engineer is to make the hybrid transparent. This often means managing a control system that encompasses both. For example, a routing control panel might have some destinations that are physical SDI crosspoints and others that send an IP subscription via NMOS. Part of Tier 3 knowledge is understanding the overall system architecture – which control software talks to the SDI router, which talks to the NMOS controller, and how they coordinate. As a junior engineer, you may not design it, but you will definitely be the one making it work at 3 AM when a feed needs patching.
In summary of Tier 3: You now know the specific standards making up SMPTE 2110, the importance of precise timing (PTP) binding it all, and the ecosystem (SDP, NMOS) that lets multiple streams and devices operate coherently. The theme is modularity and interoperability – each essence is separate, each device is discoverable, and everything syncs via a common clock. Where Tier 1 and 2 gave you pieces of the puzzle, Tier 3 should allow you to see the whole picture of an IP media facility: camera sends video, audio, ancillary as separate multicast streams (2110-20/30/40), network transports them (with multicast and maybe PIM), PTP keeps them in lockstep, NMOS registers them, and a control system connects them to receivers which decode via SDP info. A quick check: If someone mentions “2110-20” or “NMOS IS-04” or “PTP domain”, you should know what that means now. If you do, congrats – you’re speaking the same language as the IP broadcast engineers.