Phase 7 Design: Backend Lifecycle And Playout

This document expands Phase 7 of ARCHITECTURE_RESILIENCE_REVIEW.md into a concrete design target.

Phase 4 made the render thread the sole owner of normal runtime GL work, but output timing is still callback-coupled: DeckLink completion callbacks synchronously request render-thread output production before scheduling the next hardware frame. Phase 7 should make backend lifecycle, buffer policy, playout headroom, and recovery explicit.

Status

  • Phase 7 design package: proposed.
  • Phase 7 implementation: not started.
  • Current alignment: VideoBackend, VideoIODevice, DeckLinkSession, and VideoPlayoutScheduler exist. Phase 4 removed callback-thread GL ownership, but the DeckLink completion path still waits for render-thread output production.

Current backend footholds:

  • VideoBackend wraps device discovery/configuration, start/stop, input callback handling, output completion handling, and telemetry publication.
  • DeckLinkSession owns DeckLink device handles, frame pool creation, preroll, keyer configuration, and scheduled playback.
  • VideoPlayoutScheduler owns basic schedule time generation and simple late/drop skip-ahead behavior.
  • OpenGLVideoIOBridge is the current adapter between VideoBackend and RenderEngine.
  • HealthTelemetry receives a subset of signal, render, and pacing stats.

Why Phase 7 Exists

The current output path works only while render/readback stays comfortably inside budget. A late render makes the completion callback late, which reduces device-side headroom, which makes the next callback even more fragile: the failure mode compounds.

The resilience review calls this the main remaining live-resilience risk after Phase 4:

  • output playout is still effectively render-on-demand from the DeckLink completion callback
  • buffer pool size and preroll depth are not sourced from one policy
  • late/dropped recovery is a fixed skip rule
  • backend lifecycle is imperative rather than represented as explicit states

Phase 7 should separate hardware timing from render production.

Goals

Phase 7 should establish:

  • explicit backend lifecycle states and allowed transitions
  • one playout policy for frame pool size, preroll, headroom, and underrun behavior
  • a bounded producer/consumer output queue between render and DeckLink scheduling
  • lightweight DeckLink callbacks that dequeue/schedule/account rather than render
  • measured recovery from late/dropped frames
  • structured backend health reporting
  • tests for scheduler, queue, lifecycle, and underrun policy without DeckLink hardware

Non-Goals

Phase 7 should not require:

  • a new renderer
  • changing shader/state composition
  • replacing DeckLink support with multiple backends
  • full telemetry UI redesign
  • removing every synchronous API immediately
  • perfect adaptive latency policy in the first pass

Target Timing Model

The target model is producer/consumer playout:

RenderEngine/render scheduler produces completed output frames
  -> bounded ready-frame queue
  -> VideoBackend consumes ready frames
  -> DeckLink callback schedules already-prepared frames

The callback should not wait for rendering. It should:

  • record completion result
  • recycle/release completed buffers
  • dequeue a ready frame or apply underrun policy
  • schedule the next frame
  • publish backend timing/health observations
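The callback responsibilities above can be sketched as a dequeue-or-fallback flow. This is a minimal illustration, not the project's actual API: `ReadyQueue`, `CallbackStats`, and the reuse-last-frame fallback are all assumptions, and the real path would recycle buffers and call the DeckLink schedule API where the comments indicate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

enum class CompletionResult { Completed, Late, Dropped, Flushed };

struct ReadyFrame { uint64_t frame_index; };

// Stand-in for the bounded ready-frame queue between render and backend.
class ReadyQueue {
public:
    void push(ReadyFrame f) { frames_.push_back(f); }
    std::optional<ReadyFrame> pop() {
        if (frames_.empty()) return std::nullopt;
        ReadyFrame f = frames_.front();
        frames_.pop_front();
        return f;
    }
    std::size_t depth() const { return frames_.size(); }
private:
    std::deque<ReadyFrame> frames_;
};

struct CallbackStats { uint64_t scheduled = 0; uint64_t underruns = 0; };

// The whole callback body: record, recycle (elided), dequeue or apply
// underrun policy, schedule. It never waits for rendering.
void onFrameCompleted(CompletionResult result, ReadyQueue& queue,
                      std::optional<ReadyFrame>& lastScheduled,
                      CallbackStats& stats) {
    (void)result;  // record completion result / late-drop accounting here
    std::optional<ReadyFrame> next = queue.pop();
    if (!next) {
        ++stats.underruns;
        next = lastScheduled;  // underrun policy: reuse newest scheduled frame
    }
    if (next) {
        lastScheduled = next;
        ++stats.scheduled;  // stand-in for scheduling on the device
    }
}
```

The point of the shape is that every branch is O(1) and lock-scope-free apart from the queue access, so the callback stays bounded even when render falls behind.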

Target Lifecycle Model

Suggested backend states:

  1. Uninitialized
  2. Discovering
  3. Discovered
  4. Configuring
  5. Configured
  6. Prerolling
  7. Running
  8. Degraded
  9. Stopping
  10. Stopped
  11. Failed

Suggested transition rules:

  • Uninitialized -> Discovering
  • Discovering -> Discovered | Failed
  • Discovered -> Configuring | Stopped
  • Configuring -> Configured | Failed
  • Configured -> Prerolling | Stopped
  • Prerolling -> Running | Failed | Stopping
  • Running -> Degraded | Stopping | Failed
  • Degraded -> Running | Stopping | Failed
  • Stopping -> Stopped

The exact enum can change, but the lifecycle should become observable and testable.
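One way to make the suggested transitions observable and testable is a pure predicate over the enum. This is a sketch assuming the state names listed above; it is not the project's actual `VideoBackendStateMachine`, and the terminal-state handling is an illustrative choice.

```cpp
#include <cassert>
#include <initializer_list>

enum class BackendState {
    Uninitialized, Discovering, Discovered, Configuring, Configured,
    Prerolling, Running, Degraded, Stopping, Stopped, Failed
};

// Returns true when `to` is a legal successor of `from`, mirroring the
// transition table in the design text.
bool isAllowedTransition(BackendState from, BackendState to) {
    auto any = [to](std::initializer_list<BackendState> allowed) {
        for (BackendState s : allowed) if (s == to) return true;
        return false;
    };
    switch (from) {
        case BackendState::Uninitialized: return any({BackendState::Discovering});
        case BackendState::Discovering:   return any({BackendState::Discovered, BackendState::Failed});
        case BackendState::Discovered:    return any({BackendState::Configuring, BackendState::Stopped});
        case BackendState::Configuring:   return any({BackendState::Configured, BackendState::Failed});
        case BackendState::Configured:    return any({BackendState::Prerolling, BackendState::Stopped});
        case BackendState::Prerolling:    return any({BackendState::Running, BackendState::Failed, BackendState::Stopping});
        case BackendState::Running:       return any({BackendState::Degraded, BackendState::Stopping, BackendState::Failed});
        case BackendState::Degraded:      return any({BackendState::Running, BackendState::Stopping, BackendState::Failed});
        case BackendState::Stopping:      return any({BackendState::Stopped});
        default:                          return false;  // Stopped/Failed treated as terminal
    }
}
```

Because the predicate is pure, invalid-transition tests need no DeckLink hardware or threads.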

Proposed Collaborators

VideoBackendStateMachine

Pure or mostly pure lifecycle transition helper.

Responsibilities:

  • validate state transitions
  • produce transition observations
  • track failure reasons
  • keep start/stop/recovery behavior auditable

Non-responsibilities:

  • DeckLink API calls
  • rendering
  • persistence

PlayoutPolicy

Policy object for queue and timing behavior.

Expected fields:

  • target preroll frames
  • maximum ready frames
  • minimum spare device buffers
  • underrun behavior
  • maximum catch-up frames
  • adaptive headroom enabled/disabled
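The fields above could live in one value object, with derived quantities (such as device pool size) computed from the policy rather than declared separately. Field names, defaults, and the derivation rule here are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>

enum class UnderrunBehavior {
    ReuseNewestCompleted, ReuseLastScheduled, ScheduleBlack, SkipAhead
};

struct PlayoutPolicy {
    std::size_t targetPrerollFrames = 3;
    std::size_t maxReadyFrames = 4;
    std::size_t minSpareDeviceBuffers = 2;
    UnderrunBehavior underrun = UnderrunBehavior::ReuseNewestCompleted;
    std::size_t maxCatchUpFrames = 2;
    bool adaptiveHeadroom = false;

    // The pool must cover prerolled frames in flight, the ready queue at
    // its maximum depth, and spare buffers for render/readback.
    std::size_t devicePoolSize() const {
        return targetPrerollFrames + maxReadyFrames + minSpareDeviceBuffers;
    }
};
```

Deriving the pool size this way is what Step 2 of the migration plan means by "frame pool size derives from policy": changing one knob cannot silently desynchronize the others.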

RenderOutputQueue

Bounded queue or ring for completed output frames.

Responsibilities:

  • accept completed render outputs
  • expose ready frames for scheduling
  • track depth, drops, stale reuse, and underruns
  • keep ownership/lifetime clear between render and backend
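A minimal bounded queue with the depth/drop/underrun accounting listed above might look like the following. Drop-newest on overflow is one possible policy among several; the class shape is a sketch, not the planned implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

struct OutputFrame { uint64_t index; };

class RenderOutputQueue {
public:
    explicit RenderOutputQueue(std::size_t maxDepth) : maxDepth_(maxDepth) {}

    // Returns false (and counts a drop) when the queue is already full.
    bool push(OutputFrame frame) {
        if (frames_.size() >= maxDepth_) { ++drops_; return false; }
        frames_.push_back(frame);
        return true;
    }

    // Returns the oldest ready frame, or nullopt (and counts an underrun).
    std::optional<OutputFrame> pop() {
        if (frames_.empty()) { ++underruns_; return std::nullopt; }
        OutputFrame f = frames_.front();
        frames_.pop_front();
        return f;
    }

    std::size_t depth() const { return frames_.size(); }
    uint64_t drops() const { return drops_; }
    uint64_t underruns() const { return underruns_; }

private:
    std::size_t maxDepth_;
    std::deque<OutputFrame> frames_;
    uint64_t drops_ = 0;
    uint64_t underruns_ = 0;
};
```

In the real queue the elements would carry device-buffer ownership, so push/pop also define the hand-off point between render and backend.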

OutputFramePool

Backend-owned device buffer pool.

Responsibilities:

  • own DeckLink mutable frames
  • expose available buffers for render/readback or scheduling
  • recycle completed frames
  • report spare-buffer depth

PlayoutController

Coordinates policy, ready frames, device schedule times, and completion accounting.

Responsibilities:

  • preroll frames
  • schedule next frame
  • handle late/drop/completed/flushed results
  • apply underrun policy
  • publish timing state

Output Queue Policy

The initial output queue should be small and bounded.

Candidate defaults:

  • target ready frames: 2-3
  • max ready frames: 3-5
  • underrun: reuse last completed frame if available, otherwise black
  • late/drop: increase degraded counters and optionally increase headroom within limits

The exact numbers should be measured, but the policy should live in one place instead of being split across constants.

Underrun Policy

When no fresh rendered frame is available, options are:

  1. reuse newest completed frame
  2. reuse last scheduled frame
  3. schedule black/degraded frame
  4. skip/catch up schedule time

Phase 7 should pick one default and make it visible in telemetry. Reusing the newest completed frame is often the best first policy for live visual continuity, but key/fill behavior may require careful testing.
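One shape for the underrun decision: given the policy choice and whatever frames are still at hand, pick a frame to reschedule, or none (meaning schedule black / skip ahead). Names are illustrative; option 1 with a last-scheduled fallback reflects the suggested default, not a decided design.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

enum class UnderrunPolicy {
    ReuseNewestCompleted, ReuseLastScheduled, ScheduleBlack, SkipAhead
};

struct Frame { uint64_t index; };

std::optional<Frame> resolveUnderrun(UnderrunPolicy policy,
                                     std::optional<Frame> newestCompleted,
                                     std::optional<Frame> lastScheduled) {
    switch (policy) {
        case UnderrunPolicy::ReuseNewestCompleted:
            // Fall back to last scheduled, then to black, if nothing
            // has completed yet (e.g. during early preroll).
            return newestCompleted ? newestCompleted : lastScheduled;
        case UnderrunPolicy::ReuseLastScheduled:
            return lastScheduled;
        default:
            return std::nullopt;  // caller schedules black or skips ahead
    }
}
```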

Migration Plan

Step 1. Name Lifecycle States

Introduce a backend state enum and transition reporting while leaving scheduling behavior essentially unchanged.

Initial target:

  • state changes are explicit
  • invalid transitions are detectable
  • tests cover allowed transitions

Step 2. Create Playout Policy Object

Unify fixed constants and scheduler assumptions.

Initial target:

  • frame pool size derives from policy
  • preroll count derives from policy
  • late/drop recovery reads policy

Step 3. Add Ready Output Queue

Introduce a bounded queue for completed output frames.

Initial target:

  • pure queue tests
  • explicit depth/underrun metrics
  • no DeckLink dependency in queue tests

Step 4. Move Callback Toward Dequeue/Schedule

Stop producing frames directly in the completion callback path.

Transitional target:

  • callback wakes/schedules a backend worker
  • worker consumes ready frames

Final target:

  • callback only records, recycles, dequeues, schedules

Step 5. Make Render Produce Ahead

Teach render/output code to keep the ready queue filled to target headroom.

Initial target:

  • render thread produces on demand until queue has target depth
  • callback does not synchronously wait for fresh render
  • stale/black fallback is explicit on underrun
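The produce-ahead loop is small in shape: render only while the ready queue is below target depth. The sketch below uses a plain `std::deque<int>` of frame ids as a stand-in for the real queue and production path; every name is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Fill the ready queue up to targetDepth, producing one frame per
// iteration. Returns how many frames were produced this pass.
std::size_t fillToTarget(std::deque<int>& readyQueue, std::size_t targetDepth,
                         int& nextFrameId) {
    std::size_t produced = 0;
    while (readyQueue.size() < targetDepth) {
        readyQueue.push_back(nextFrameId++);  // renderNextFrame() in the real path
        ++produced;
    }
    return produced;
}
```

Whether this loop runs on backend demand, a timer, or queue-fill pressure is exactly the open question listed at the end of this document; the loop shape is the same either way.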

Step 6. Replace Fixed Late/Drop Recovery

Replace fixed +2 schedule-index recovery with measured lag/headroom accounting.

Initial target:

  • track scheduled index, completed index, queue depth, late streak, drop streak
  • recovery decisions use measured lag
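A measured replacement for the fixed +2 skip could compute how far the schedule index lags the completion clock and advance by that lag, clamped by policy. The index bookkeeping below is an illustrative sketch, not the current scheduler's fields.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct PlayoutLagState {
    uint64_t nextScheduleIndex = 0;   // next frame index we intend to schedule
    uint64_t lastCompletedIndex = 0;  // newest hardware-completed frame index
};

// Returns the schedule index to use after a late/dropped completion:
// advance by measured lag, never by more than maxCatchUpFrames.
uint64_t catchUpIndex(const PlayoutLagState& s, uint64_t maxCatchUpFrames) {
    if (s.lastCompletedIndex < s.nextScheduleIndex)
        return s.nextScheduleIndex;  // still ahead of hardware: no skip
    uint64_t lag = s.lastCompletedIndex + 1 - s.nextScheduleIndex;
    uint64_t skip = std::min(lag, maxCatchUpFrames);
    return s.nextScheduleIndex + skip;
}
```

The clamp keeps a single bad frame from triggering a large visible jump, while a sustained stall still converges toward the hardware clock over a few callbacks.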

Step 7. Route Backend Health Structurally

Publish backend lifecycle, queue depth, underrun, late/drop, and degraded-state observations through HealthTelemetry.

Testing Strategy

Recommended tests:

  • allowed lifecycle transitions pass
  • invalid lifecycle transitions fail
  • playout policy derives frame pool/preroll sizes consistently
  • output queue preserves ordering
  • bounded output queue rejects/drops according to policy
  • underrun reuses last frame or black according to policy
  • late/drop accounting updates degraded state
  • scheduler catch-up uses measured lag, not fixed skip
  • stop drains/recycles device-frame ownership in pure fakes

Useful homes:

  • VideoPlayoutSchedulerTests for scheduler evolution
  • VideoIODeviceFakeTests for fake backend lifecycle
  • a new VideoBackendStateMachineTests
  • a new RenderOutputQueueTests

Risks

Latency Risk

More headroom means more latency. Phase 7 should make latency a visible policy choice.

Buffer Lifetime Risk

Render and backend will share ownership boundaries around output buffers. Frame ownership must be explicit to avoid reuse while hardware still owns a frame.

Underrun Policy Risk

Reusing stale frames can be visually acceptable, but wrong key/fill behavior may be worse than black. Test with real output.

Callback Thread Risk

Even after decoupling render, callback work must stay small and bounded.

Scope Risk

Backend lifecycle and playout queue are related, but either can grow large. Implement in small, testable slices.

Phase 7 Exit Criteria

Phase 7 can be considered complete once the project can say:

  • backend lifecycle states and transitions are explicit
  • playout policy owns preroll, pool size, headroom, and underrun behavior
  • output callbacks no longer synchronously wait for render production
  • render produces completed output frames into a bounded queue
  • underrun behavior is explicit and observable
  • late/drop recovery is measured rather than fixed skip-only
  • backend health reports lifecycle, queue, underrun, late, and dropped state
  • queue/lifecycle/scheduler behavior has non-DeckLink tests

Open Questions

  • What should the default ready-frame depth be at 30fps and 60fps?
  • Should underrun reuse last completed, last scheduled, or black?
  • Should output queue depth be user-configurable?
  • Should render cadence be driven by backend demand, a timer, or queue-fill pressure?
  • How should external keying influence stale-frame/black fallback?
  • Should input and output lifecycle states be separate endpoints under one backend shell?

Short Version

Phase 7 should stop making DeckLink callbacks wait for render.

Render produces ahead into a bounded queue. The backend consumes ready frames according to explicit lifecycle and playout policy. Queue depth, underruns, late frames, dropped frames, and degraded states become measured and visible.