VideoBackend Subsystem Design
This note defines the target design for the VideoBackend subsystem introduced in PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md.
It focuses on input/output device lifecycle, pacing, buffering, and recovery policy for live video I/O. It does not redefine the whole app architecture. Its job is to make the backend boundary concrete enough that later phases can move current DeckLink and bridge code toward one clear ownership model.
Purpose
VideoBackend is the hardware-facing timing subsystem.
It owns:
- video device discovery and capability inspection
- input and output device configuration
- input callback handling
- output callback handling
- buffer-pool ownership for device-facing frames
- playout headroom policy
- queueing and pacing policy between render and hardware
- input signal presence tracking
- backend lifecycle and degraded-state transitions
It does not own:
- GL contexts
- frame composition
- shader execution
- persistence
- control mutation policy
- render snapshot publication
The core rule is:
- RenderEngine produces frames
- VideoBackend moves those frames to and from hardware at the right cadence
Why This Subsystem Exists
Today the boundary between render and hardware pacing is still too blurred.
The main current pressure points are:
- OpenGLVideoIOBridge still performs render-facing work inside the output completion callback
- DeckLinkSession owns device setup, mutable output frame pools, and schedule timing in one class
- the output scheduler currently reacts to late and dropped frames with a fixed skip policy
- the current output frame pool and preroll depth are not sourced from one policy object:
  - DeckLinkSession::ConfigureOutput() creates 10 mutable output frames
  - kPrerollFrameCount is currently 12
Those overlaps make latency, buffering, and recovery behavior harder to reason about.
Subsystem Responsibilities
VideoBackend should own the following responsibilities explicitly.
1. Device Discovery and Capability Reporting
The subsystem should:
- discover available input and output devices
- choose the configured input/output pair
- inspect mode support and pixel-format support
- expose capability facts needed by higher layers
Examples:
- input present or absent
- output present or absent
- model name
- keyer support
- internal/external keying availability
- supported pixel formats for the configured mode
- input/output frame sizes
This work currently lives mostly in DeckLinkSession, primarily DiscoverDevicesAndModes(...) and SelectPreferredFormats(...).
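As a rough illustration, the capability facts above could be surfaced as plain data that higher layers read without touching the device API. The struct below is a hypothetical sketch; none of these names exist in the current code.

```cpp
// Hypothetical sketch: names and fields are assumptions, not existing code.
#include <cstdint>
#include <string>
#include <vector>

struct BackendCapabilities {
    bool inputPresent = false;                    // an input device was discovered
    bool outputPresent = false;                   // an output device was discovered
    std::string modelName;                        // device model reported by the driver
    bool supportsInternalKeying = false;
    bool supportsExternalKeying = false;
    std::vector<uint32_t> supportedPixelFormats;  // for the configured mode
    uint32_t inputWidth = 0;
    uint32_t inputHeight = 0;
    uint32_t outputWidth = 0;
    uint32_t outputHeight = 0;
};
```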
2. Input Lifecycle and Input Callback Handling
The subsystem should:
- configure input mode and pixel format
- install and own the input callback delegate
- start and stop capture streams
- translate hardware input frames into backend-level input frame events
- track signal-present versus no-input-source conditions
It should not decide how uploaded textures are produced. That belongs to RenderEngine.
The backend may expose input frames as:
- borrowed CPU-accessible frame views
- backend-managed input frame objects
- typed input events containing signal state and frame payload metadata
This work is currently split across:
- DeckLinkSession::ConfigureInput
- CaptureDelegate::VideoInputFrameArrived
- OpenGLVideoIOBridge::VideoFrameArrived
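As an illustration of the typed-input-event option listed above, a backend-level input event could look roughly like the sketch below. Every name here is hypothetical, and the payload representation (borrowed view versus backend-managed object) is deliberately left open.

```cpp
// Hypothetical sketch: the payload type and event fields are assumptions.
#include <chrono>
#include <cstdint>
#include <memory>

struct InputFramePayload;  // borrowed CPU-accessible view or backend-managed frame object

struct InputFrameEvent {
    bool signalPresent = false;                         // hardware reports a valid source
    std::chrono::steady_clock::time_point arrivalTime;  // when the input callback fired
    uint32_t width = 0;
    uint32_t height = 0;
    uint32_t pixelFormat = 0;                           // device pixel format code
    std::shared_ptr<InputFramePayload> payload;         // empty when no signal is present
};
```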
3. Output Lifecycle and Output Callback Handling
The subsystem should:
- configure output mode and pixel format
- own the output frame pool
- install and own the scheduled-frame completion callback
- start scheduled playback
- stop scheduled playback
- account for completion results such as completed, late, dropped, and flushed
It should not render the next frame in the callback path.
This work is currently split across DeckLinkSession::ConfigureOutput(...), DeckLinkSession::HandlePlayoutFrameCompleted(...), and the output completion handling in OpenGLVideoIOBridge.
4. Pacing and Scheduling Policy
The subsystem should own:
- target frame duration and timescale
- schedule time generation
- preroll policy
- spare-buffer policy
- queue headroom policy
- late-frame and dropped-frame recovery policy
This is not just a utility detail. It is one of the main timing responsibilities of the subsystem.
The current VideoPlayoutScheduler is a useful seed, but it is too small and too implicit to represent the eventual backend policy by itself.
5. Device-Facing Buffer Pools
The subsystem should own all device-facing buffers that exist to satisfy the hardware API contract.
Examples:
- mutable output frames created through DeckLink
- any staging buffers required by a future non-DeckLink backend
- reusable CPU frame containers for hardware ingress/egress
The goal is to make buffer depth and lifetime explicit and measurable.
RenderEngine may own render surfaces and GPU readback resources. VideoBackend owns the buffers required to talk to the hardware or OS video I/O API.
6. Backend Health and Degraded State
The subsystem should publish operational state such as:
- running normally
- prerolling
- temporarily late
- dropping frames
- no input signal
- output stopped
- failed to configure
This state should be reported to HealthTelemetry, not hidden inside debug logs or modal dialog paths.
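One hedged way to make that reporting structural is a snapshot object handed to HealthTelemetry. The shape below is an assumption about what such a snapshot might carry, not a defined interface.

```cpp
// Hypothetical sketch of a structured health report; field names are assumptions.
#include <cstdint>

enum class BackendOperationalState {
    RunningNormally,
    Prerolling,
    TemporarilyLate,
    DroppingFrames,
    NoInputSignal,
    OutputStopped,
    FailedToConfigure,
};

struct BackendHealthSnapshot {
    BackendOperationalState state = BackendOperationalState::OutputStopped;
    uint64_t lateFrameCount = 0;
    uint64_t droppedFrameCount = 0;
    uint32_t outputQueueDepth = 0;
    bool inputSignalPresent = false;
};
```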
Boundary With Other Subsystems
This subsystem must stay aligned with the Phase 1 dependency rules.
Allowed directions:
- VideoBackend -> RenderEngine
- VideoBackend -> HealthTelemetry
Not allowed in the target design:
- VideoBackend -> RuntimeStore
- VideoBackend -> RuntimeCoordinator
- VideoBackend -> ControlServices
The important operational boundary is:
- VideoBackend may request or consume rendered output frames
- it may not own frame composition policy
That means:
- no shader parameter validation here
- no persistence decisions here
- no direct mutation of runtime state here
State Owned by VideoBackend
VideoBackend should own the following state categories.
Device Configuration State
Examples:
- selected device handles
- configured input/output formats
- negotiated pixel formats
- keyer configuration
- output model name
- supported keying flags
Session Lifecycle State
Examples:
- discovered
- configured
- prerolling
- running
- degraded
- stopping
- stopped
- failed
Input Runtime State
Examples:
- signal present or missing
- last observed input format properties
- input frame counters
- input callback timestamps
- queued capture frames awaiting render ingestion
Output Runtime State
Examples:
- output queue depth
- scheduled frame index
- completed frame index
- late frame count
- dropped frame count
- spare buffer count
- current headroom target
Backend-Owned Transient Buffers
Examples:
- output mutable frame pool
- playout ring buffer entries
- input frame handoff queue
- staging buffers if required by the device API
This is transient live state, not persisted state.
Target Lifecycle Model
VideoBackend should eventually expose an explicit lifecycle state machine rather than relying on scattered imperative calls.
Suggested states:
- uninitialized
- discovering
- discovered
- configuring
- configured
- prerolling
- running
- degraded
- stopping
- stopped
- failed
Suggested transition rules:
- uninitialized -> discovering
- discovering -> discovered | failed
- discovered -> configuring | stopped
- configuring -> configured | failed
- configured -> prerolling | stopped
- prerolling -> running | failed | stopping
- running -> degraded | stopping | failed
- degraded -> running | stopping | failed
- stopping -> stopped
Why this matters:
- startup failure reporting becomes more predictable
- backend recovery can become policy-driven
- telemetry can report backend state directly
- later backends do not need to mimic DeckLink's exact imperative shape
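A minimal sketch of how the states and transition rules above could become explicit data rather than scattered imperative calls follows; the enum and helper are illustrative only.

```cpp
// Illustrative sketch: states mirror the list above, transitions mirror the rules above.
enum class BackendLifecycleState {
    Uninitialized, Discovering, Discovered, Configuring, Configured,
    Prerolling, Running, Degraded, Stopping, Stopped, Failed,
};

// Returns true when the requested transition matches the suggested rules.
inline bool IsAllowedTransition(BackendLifecycleState from, BackendLifecycleState to) {
    using S = BackendLifecycleState;
    switch (from) {
        case S::Uninitialized: return to == S::Discovering;
        case S::Discovering:   return to == S::Discovered || to == S::Failed;
        case S::Discovered:    return to == S::Configuring || to == S::Stopped;
        case S::Configuring:   return to == S::Configured || to == S::Failed;
        case S::Configured:    return to == S::Prerolling || to == S::Stopped;
        case S::Prerolling:    return to == S::Running || to == S::Failed || to == S::Stopping;
        case S::Running:       return to == S::Degraded || to == S::Stopping || to == S::Failed;
        case S::Degraded:      return to == S::Running || to == S::Stopping || to == S::Failed;
        case S::Stopping:      return to == S::Stopped;
        default:               return false;  // Stopped and Failed are terminal in this sketch
    }
}
```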
Target Timing Model
The long-term timing design should be producer/consumer playout.
Current Model
Today the callback path effectively does this:
- DeckLink signals completion.
- The callback path asks for a new output buffer.
- The callback path enters the shared GL section.
- The callback path renders the next frame.
- The callback path reads it back.
- The callback path schedules the next hardware frame.
That path is visible in the bridge's PlayoutFrameCompleted() handling.
This couples output timing directly to render work.
Target Model
The target model should be:
- RenderEngine produces completed output frames at the configured cadence.
- RenderEngine places them into a bounded queue owned or mediated by VideoBackend.
- VideoBackend dequeues ready frames when the device needs them.
- hardware callbacks only:
  - record completion results
  - release or recycle buffers
  - dequeue and schedule the next ready frame
  - raise underrun or degraded-state signals if needed
The timing rule becomes:
- render is the producer
- hardware output is the consumer
This gives the app a clear place to manage:
- target latency
- playout headroom
- stale-frame reuse
- underrun behavior
- spare buffer policy
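As a sketch of the producer/consumer handoff, assuming a hypothetical RenderedFrame type and a bounded queue owned or mediated by the backend:

```cpp
// Minimal sketch only; real code would add recycling, telemetry, and wakeup hooks.
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <optional>

struct RenderedFrame;  // placeholder for a completed, device-ready frame

class ReadyFrameQueue {
public:
    explicit ReadyFrameQueue(std::size_t maxDepth) : mMaxDepth(maxDepth) {}

    // Producer side: RenderEngine queues frames ahead of need.
    // Returns false when the queue is full so the caller can count the overflow.
    bool Push(std::shared_ptr<RenderedFrame> frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.size() >= mMaxDepth) return false;
        mFrames.push_back(std::move(frame));
        return true;
    }

    // Consumer side: the output callback dequeues a ready frame, or nothing on underrun.
    std::optional<std::shared_ptr<RenderedFrame>> TryDequeue() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.empty()) return std::nullopt;
        auto frame = std::move(mFrames.front());
        mFrames.pop_front();
        return frame;
    }

private:
    std::mutex mMutex;
    std::deque<std::shared_ptr<RenderedFrame>> mFrames;
    std::size_t mMaxDepth;
};
```

In this sketch RenderEngine calls Push after producing a frame, and the hardware completion path calls TryDequeue when the device is ready for the next one.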
Input Buffering and Pacing
The input side needs a simpler but still explicit handoff model.
Recommended target behavior:
- hardware callbacks push input frames into a bounded ingress queue
- RenderEngine pulls the newest useful input frame when preparing a render
- if the ingress queue overflows, old frames are discarded according to policy
Recommended default policy for live playout:
- prefer recency over completeness
- drop stale capture frames instead of blocking render or output
The current "skip upload if the GL bridge is busy" behavior is directionally correct for live timing:
But in the target architecture that decision should move out of GL lock acquisition and into an explicit backend-to-render handoff queue policy.
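A sketch of that handoff policy follows, with the frame event type left generic: the queue bounds depth in the hardware callback and lets the render side take only the newest frame. All names are hypothetical.

```cpp
// Hypothetical sketch of a recency-first ingress queue.
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

template <typename FrameEvent>
class InputIngressQueue {
public:
    explicit InputIngressQueue(std::size_t maxDepth) : mMaxDepth(maxDepth) {}

    // Called from the hardware input callback: never blocks render or output.
    void Push(FrameEvent event) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (!mEvents.empty() && mEvents.size() >= mMaxDepth) {
            mEvents.pop_front();  // discard the stalest frame rather than blocking
            ++mDroppedFrames;
        }
        mEvents.push_back(std::move(event));
    }

    // Called by the render side: take the newest frame, discard anything older.
    bool TakeNewest(FrameEvent& out) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mEvents.empty()) return false;
        out = std::move(mEvents.back());
        mDroppedFrames += mEvents.size() - 1;  // everything older counts as dropped
        mEvents.clear();
        return true;
    }

    std::uint64_t DroppedFrames() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mDroppedFrames;
    }

private:
    mutable std::mutex mMutex;
    std::deque<FrameEvent> mEvents;
    std::size_t mMaxDepth;
    std::uint64_t mDroppedFrames = 0;
};
```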
Suggested input metrics:
- input frames received
- no-signal transitions
- input queue depth
- dropped input frames
- oldest queued input age
Output Buffering and Headroom Policy
Output buffering should be policy-driven from one source of truth.
The target design should define a playout buffering policy object with at least:
- target preroll depth
- minimum spare device buffers
- maximum queued rendered frames
- allowed catch-up depth
- underrun behavior
Example policy fields:
- targetPrerollFrames
- minSpareOutputBuffers
- maxReadyFrames
- maxCatchUpFrames
- reuseLastFrameOnUnderrun
- allowAdaptiveHeadroom
This replaces the current split between:
- fixed mutable frame pool size in DeckLinkSession::ConfigureOutput()
- fixed preroll count in kPrerollFrameCount
- fixed skip-ahead recovery in VideoPlayoutScheduler
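A minimal sketch of such a policy object, with field names taken from the list above; the default values are placeholders, not tuned recommendations.

```cpp
// Illustrative policy object collecting values that are currently hardcoded in
// DeckLinkSession::ConfigureOutput(), kPrerollFrameCount, and VideoPlayoutScheduler.
#include <cstdint>

struct PlayoutPolicy {
    uint32_t targetPrerollFrames = 12;     // depth scheduled before playback starts
    uint32_t minSpareOutputBuffers = 2;    // device buffers kept free for the driver
    uint32_t maxReadyFrames = 4;           // bound on the rendered-frame queue
    uint32_t maxCatchUpFrames = 2;         // how far the schedule may jump when late
    bool reuseLastFrameOnUnderrun = true;  // repeat the last frame instead of going black
    bool allowAdaptiveHeadroom = false;    // permit the backend to grow headroom under load
};
```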
Underrun and Recovery Policy
The backend should define explicit behavior for when no fresh frame is ready at schedule time.
Candidate policies:
- Reuse the last completed rendered frame.
- Reuse the last scheduled output frame.
- Schedule a known black or degraded frame.
- Temporarily increase headroom if the system is repeatedly catching up.
Which one is correct may differ by operating mode, but the choice should be explicit rather than incidental.
Similarly, completion-result handling should become measured rather than fixed.
The current scheduler does this:
- late or dropped frame -> mScheduledFrameIndex += 2
That is a useful emergency simplification, but not a durable backend contract.
The target backend should instead track:
- scheduled frame index
- completed frame index
- backlog depth
- late streaks
- dropped streaks
- current operating headroom
Then recovery can use measured lag, not a hardcoded skip.
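As an illustration of measured recovery, the sketch below derives a catch-up amount from counters like those above instead of applying a fixed skip. The names and the exact heuristic are assumptions, not a proposed final rule.

```cpp
// Hypothetical sketch: derive a catch-up amount from measured counters.
#include <algorithm>
#include <cstdint>

struct BackendTimingCounters {
    std::uint64_t scheduledFrameIndex = 0;  // frames handed to the device so far
    std::uint64_t completedFrameIndex = 0;  // completions reported by the device
    std::uint32_t lateStreak = 0;           // consecutive late completions
    std::uint32_t droppedStreak = 0;        // consecutive dropped completions
};

// Advance further the longer the late/dropped streak, but never beyond the
// policy's catch-up bound or the measured in-flight backlog.
inline std::uint32_t ComputeCatchUpFrames(const BackendTimingCounters& c,
                                          std::uint32_t maxCatchUpFrames) {
    const std::uint32_t streak = c.lateStreak + c.droppedStreak;
    if (streak == 0) return 0;
    const std::uint64_t backlog = c.scheduledFrameIndex - c.completedFrameIndex;
    return static_cast<std::uint32_t>(
        std::min<std::uint64_t>({streak, backlog, maxCatchUpFrames}));
}
```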
Suggested Public Interface
This is not a final class API. It describes the shape the subsystem should move toward.
Discovery and Configuration
- DiscoverDevices(...)
- SelectFormats(...)
- ConfigureInput(...)
- ConfigureOutput(...)
- GetCapabilities()
- GetBackendState()
Lifecycle
- StartCapture()
- StartPlayout()
- StopCapture()
- StopPlayout()
- Shutdown()
Input Handoff
- PollInputFrame(...) or TryDequeueInputFrame(...)
- ReportInputSignalState(...)
Output Handoff
- QueueRenderedFrame(...)
- TryDequeueReadyFrameForSchedule(...)
- RecycleCompletedFrame(...)
Timing and Recovery
- SetPlayoutPolicy(...)
- AccountForCompletionResult(...)
- BuildBackendTimingSnapshot()
Health Reporting
- BuildBackendHealthSnapshot()
- GetWarningState()
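One possible rendering of that shape as an abstract C++ interface is sketched below. Parameter and return types are placeholders, several of the listed calls are collapsed or omitted, and none of the type names are final.

```cpp
// Shape sketch only: not a final API, and all referenced types are placeholders.
#include <memory>

struct DeviceSelection;       // configured input/output pair
struct FormatSelection;       // negotiated modes and pixel formats
struct BackendCapabilities;   // capability facts exposed to higher layers
struct InputFrameEvent;       // backend-level input frame event
struct RenderedFrame;         // completed, device-ready output frame
struct PlayoutPolicy;         // preroll, spare-buffer, and recovery policy
struct BackendTimingSnapshot; // counters and queue depths for telemetry
struct BackendHealthSnapshot; // operational state for HealthTelemetry

class IVideoBackend {
public:
    virtual ~IVideoBackend() = default;

    // Discovery and configuration
    virtual bool DiscoverDevices(DeviceSelection& outSelection) = 0;
    virtual bool SelectFormats(FormatSelection& outFormats) = 0;
    virtual bool ConfigureInput(const FormatSelection& formats) = 0;
    virtual bool ConfigureOutput(const FormatSelection& formats) = 0;
    virtual BackendCapabilities GetCapabilities() const = 0;

    // Lifecycle
    virtual bool StartCapture() = 0;
    virtual bool StartPlayout() = 0;
    virtual void StopCapture() = 0;
    virtual void StopPlayout() = 0;
    virtual void Shutdown() = 0;

    // Input handoff
    virtual bool TryDequeueInputFrame(InputFrameEvent& outEvent) = 0;

    // Output handoff
    virtual bool QueueRenderedFrame(std::shared_ptr<RenderedFrame> frame) = 0;

    // Timing, recovery, and health
    virtual void SetPlayoutPolicy(const PlayoutPolicy& policy) = 0;
    virtual BackendTimingSnapshot BuildBackendTimingSnapshot() const = 0;
    virtual BackendHealthSnapshot BuildBackendHealthSnapshot() const = 0;
};
```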
Suggested Internal Components
The subsystem will likely be easier to evolve if its responsibilities are split internally.
Possible internal structure:
VideoBackendSession
Owns:
- high-level lifecycle state
- configuration
- input/output subcomponents
- policy objects
InputEndpoint
Owns:
- input device callback registration
- input frame queue
- signal detection state
OutputEndpoint
Owns:
- output device callback registration
- output device buffer pool
- schedule/dequeue logic
- preroll and output queue management
PlayoutPolicy
Owns:
- preroll target
- spare buffer target
- underrun behavior
- catch-up and lateness rules
BackendTimingState
Owns:
- frame counters
- queue depth snapshots
- late/dropped streaks
- observed intervals
These can remain implementation details in Phase 1, but the design should leave room for them.
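A compact composition sketch, assuming the component names above, might look like this; ownership choices (by value versus pointer) are intentionally undecided and all types are hypothetical.

```cpp
// Hypothetical composition sketch of the internal split described above.
#include <memory>

class InputEndpoint;        // input callback registration, ingress queue, signal state
class OutputEndpoint;       // output callbacks, device buffer pool, schedule/dequeue logic
struct PlayoutPolicy;       // preroll, spare buffers, underrun and catch-up rules
struct BackendTimingState;  // counters, queue-depth snapshots, late/dropped streaks

class VideoBackendSession {
public:
    // Lifecycle and configuration entry points would live here.
private:
    std::unique_ptr<InputEndpoint> mInput;
    std::unique_ptr<OutputEndpoint> mOutput;
    std::unique_ptr<PlayoutPolicy> mPolicy;        // could be held by value once concrete
    std::unique_ptr<BackendTimingState> mTiming;
};
```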
Mapping From Current Code
Current DeckLinkSession
Should mostly migrate into VideoBackend:
- device discovery
- input configuration
- output configuration
- keyer capability handling
- output frame pool ownership
- lifecycle state handling
Candidates to stay backend-owned:
- DiscoverDevicesAndModes(...)
- SelectPreferredFormats(...)
- ConfigureInput(...)
- ConfigureOutput(...)
- Start()
- Stop()
- HandleVideoInputFrame(...)
- HandlePlayoutFrameCompleted(...)
Current VideoPlayoutScheduler
Should likely become:
- a backend-owned policy helper or timing component under VideoBackend
It is still a backend concern, but it should be expanded beyond a single counter and fixed skip rule.
Current OpenGLVideoIOBridge
Should split between:
- RenderEngine:
  - input texture upload scheduling
  - render submission
  - readback or output-frame production
- VideoBackend:
  - input ingress queue
  - output callback and scheduling policy
  - pacing stats
The most important migration is:
- remove render work from PlayoutFrameCompleted()
Previous Runtime Status Updates
Frame pacing and signal status setters that were historically called from the bridge should route through:
VideoBackend -> HealthTelemetry
rather than the old pattern:
- callback/bridge -> RuntimeHost
Migration Plan
The migration should avoid a flag-day rewrite.
Step 1. Name the backend boundary explicitly
Create a conceptual VideoBackend interface around the existing VideoIODevice/DeckLinkSession shape without moving all logic at once.
Step 2. Pull timing policy into backend-owned objects
Move:
- completion accounting
- headroom configuration
- frame-pool sizing
- queue depth reporting
behind explicit backend policy types.
This can happen before changing the render thread model.
Step 3. Separate callback work from render work
Change the output completion path so it stops rendering immediately in the callback chain.
Intermediate step:
- callback records completion and wakes a playout worker
Target step:
- callback only dequeues and schedules already-ready frames
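A sketch of that target-step callback follows, assuming a ready-frame queue like the one in the Target Model section. The stubs stand in for real backend plumbing and none of these names exist in the current code; the point is only that no render work runs in the callback.

```cpp
// Hypothetical sketch of the target completion path.
#include <memory>
#include <optional>

enum class CompletionResult { Completed, Late, Dropped, Flushed };
struct RenderedFrame;  // completed, device-ready frame produced by RenderEngine

class OutputEndpointSketch {
public:
    // Invoked from the hardware completion callback thread.
    void OnFrameCompleted(CompletionResult result) {
        RecordCompletionResult(result);  // completed / late / dropped / flushed counters
        RecycleCompletedBuffer();        // return the device buffer to its pool

        if (auto frame = TryDequeueReadyFrame()) {
            ScheduleFrame(*frame);       // hand an already-rendered frame to the device
        } else {
            SignalUnderrun();            // policy decides: reuse last frame, black, etc.
        }
    }

private:
    // Stubs standing in for real backend plumbing.
    void RecordCompletionResult(CompletionResult) {}
    void RecycleCompletedBuffer() {}
    std::optional<std::shared_ptr<RenderedFrame>> TryDequeueReadyFrame() { return std::nullopt; }
    void ScheduleFrame(const std::shared_ptr<RenderedFrame>&) {}
    void SignalUnderrun() {}
};
```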
Step 4. Move input handoff to a bounded queue
Replace direct callback-to-GL upload behavior with:
- backend-owned input queue
- render-owned dequeue/upload policy
Step 5. Introduce explicit backend lifecycle states
Start surfacing:
- configured
- prerolling
- running
- degraded
- failed
before changing all recovery behavior.
Step 6. Route backend health to HealthTelemetry
Move debug-only warnings and ad hoc status strings toward structured counters and backend snapshots.
Risks
Latency Versus Stability Tradeoff
Increasing headroom reduces deadline misses but increases end-to-end latency. The backend must make that tradeoff explicit and configurable enough for live use.
Hidden Coupling During Migration
The current bridge still mixes backend and render concerns. Partial extraction can accidentally preserve the old coupling under new names if the callback path is not cleaned up deliberately.
Buffer Ownership Ambiguity
If device-facing buffers and render-facing buffers are not separated clearly, lifetime bugs and timing regressions will remain easy to reintroduce.
Backend-Specific Assumptions
The first target is still DeckLink-centric. The interface should avoid baking in assumptions that would make alternate backends awkward later.
Recovery Policy Complexity
A more explicit backend model will surface choices that are currently hidden:
- stale frame reuse
- black-frame fallback
- adaptive headroom
- catch-up rules
That is healthy, but it will require deliberate policy decisions.
Open Questions
- Should VideoBackend own both input and output under one session object long-term, or should it expose distinct input and output endpoints under a shared shell?
- Should queue ownership sit fully inside VideoBackend, or should there be a narrow shared frame-exchange interface between RenderEngine and VideoBackend?
- What should the default underrun policy be for live playout: reuse last frame, reuse newest completed frame, or output black?
- Should adaptive headroom be automatic, operator-configurable, or both?
- At what point should preview timing be treated as a backend concern versus a render concern? The Phase 1 direction says preview is subordinate to render, not owned by the backend, but later timing work may still require explicit coordination.
- How much of the current VideoIOState belongs inside VideoBackend versus HealthTelemetry snapshots?
Short Version
VideoBackend should become the subsystem that owns hardware timing, device lifecycle, buffer policy, and playout recovery.
It should not render frames.
The target direction is:
- RenderEngine produces frames ahead of need
- VideoBackend consumes and schedules them
- callbacks become lightweight control-plane events
- headroom, queue depth, and recovery become explicit backend policy
- hardware health is reported structurally instead of being inferred from scattered logs and bridge behavior