video-shader-toys/docs/subsystems/RenderEngine.md
Aiden 718e4dcadd
2026-05-11 19:05:29 +10:00


RenderEngine Subsystem Design

This document expands the RenderEngine portion of PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md. It defines the target ownership, boundaries, and migration shape for the rendering subsystem so later phases can move GL work out of today's mixed orchestration paths without inventing new boundaries on the fly.

The intent here is not to force a one-step rewrite. It is to make the target render boundary explicit enough that later work on events, live-state layering, sole-owner GL threading, and backend decoupling all land in the same place.

Purpose

RenderEngine is the live frame-production subsystem.

It owns:

  • GL context ownership in the target architecture
  • render loop cadence and render task execution
  • shader program and render-pass execution once build outputs are available
  • capture texture upload scheduling once frames are accepted for render
  • temporal history resources
  • shader feedback resources
  • render-local transient overlays
  • preview-ready frame production
  • playout-ready frame production
  • render-local reset and rebuild behavior

It does not own:

  • persisted runtime state
  • high-level mutation policy
  • OSC/UI ingress
  • device discovery or callback policy
  • playout queue policy
  • operator-visible health policy beyond publishing observations

In the Phase 1 terminology, RenderEngine consumes snapshots plus render-local transient state and produces completed visual frames plus timing signals.

Why This Subsystem Needs A Sharp Boundary

The current rendering path is split across several classes:

  • OpenGLComposite.cpp constructs the renderer, render pipeline, shader programs, runtime services, and video bridge in one owner.
  • OpenGLRenderPipeline.cpp performs pass execution, pack/readback, preview paint, and performance stat publication.
  • OpenGLVideoIOBridge.cpp accepts capture frames and still performs render work from the playout completion callback path.
  • RenderFrameStateResolver and RenderStateComposer now keep frame-state selection and live value composition outside GL drawing, while RenderEngine still owns the current GL resource and draw path.

That split is workable today, but it creates architectural pressure:

  • before Phase 4, GL ownership was thread-shared instead of sole-owned.
  • render and playout timing are still callback-coupled.
  • preview and playout are produced in the same immediate path.
  • render-local transient state now has clearer Phase 3 boundaries, and Phase 4 has since removed the GL sharing through callback and UI entrypoints.
  • it is difficult to test render behavior separately from app bootstrap and hardware integration.

RenderEngine exists to absorb that responsibility into one subsystem with one direction of ownership. Phase 4 has completed the GL ownership part of this target: normal runtime GL work now enters through the RenderEngine render thread.

Responsibilities

1. Sole GL Ownership

In the target design, RenderEngine should be the only subsystem that performs long-lived GL work.

That includes:

  • context binding and release policy
  • framebuffer and texture lifetime
  • shader program binding and draw execution
  • upload/readback buffer lifetime
  • preview blit or present paths
  • render-local resource reset on rebuild or video-format changes

This is the most important boundary. Other subsystems may request work or provide data, but they should not directly perform GL commands.

2. Snapshot Consumption

RenderEngine should consume immutable or near-immutable render snapshots from RuntimeSnapshotProvider.

It is responsible for:

  • detecting snapshot version changes
  • rebuilding or re-binding render-local resources when the snapshot changes
  • resolving render-pass execution from snapshot contents
  • separating structural snapshot changes from transient overlay changes

It should not inspect mutable runtime store objects directly.
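
The version-detection and structural-versus-parameter split described above can be sketched as follows. This is a minimal illustration, assuming C++17 and a hypothetical `RenderSnapshot` carrying two counters; the names `RenderSnapshot`, `structuralVersion`, `SnapshotChange`, and `SnapshotConsumer` are assumptions for this example and are not taken from the codebase.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical snapshot shape; the real snapshot type is not defined in this document.
struct RenderSnapshot {
    std::uint64_t version = 0;            // assumed to change on every publish
    std::uint64_t structuralVersion = 0;  // assumed to change only when layers/passes/resources change
};

enum class SnapshotChange { None, ParameterOnly, Structural };

// Render-side consumer: holds the last adopted snapshot and classifies each new one.
// It only ever sees const snapshots, so it cannot mutate runtime-owned state.
class SnapshotConsumer {
public:
    SnapshotChange Adopt(std::shared_ptr<const RenderSnapshot> latest) {
        if (!latest || (mCurrent && latest->version == mCurrent->version))
            return SnapshotChange::None;
        // The first adopted snapshot is always treated as structural.
        const bool structural =
            !mCurrent || latest->structuralVersion != mCurrent->structuralVersion;
        mCurrent = std::move(latest);
        return structural ? SnapshotChange::Structural : SnapshotChange::ParameterOnly;
    }

    const RenderSnapshot* Current() const { return mCurrent.get(); }

private:
    std::shared_ptr<const RenderSnapshot> mCurrent;
};
```

Under this sketch, only a `Structural` result would justify reallocating feedback, history, or pack resources; a `ParameterOnly` result would just re-bind values.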

3. Frame Production

RenderEngine should produce completed frames for two consumers:

  • preview presentation
  • VideoBackend playout consumption

Those outputs may share most of their render work, but they are not equal-priority outputs. The subsystem rule from Phase 1 should be preserved:

  • playout is the primary timing-sensitive output
  • preview is subordinate and best-effort

4. Render-Local Transient State

RenderEngine owns transient visual state that affects output but is not persisted truth.

Examples:

  • temporal history textures
  • feedback ping-pong buffers
  • render-local OSC/live overlay state
  • queued input frames accepted for upload
  • cached readback frames
  • preview-only presentation state
  • in-flight rebuild generations

This state should remain render-local even when it influences visible output.

Phase 5's RuntimeStateLayerModel explicitly keeps temporal history, feedback state, accepted input frames, staged output frames, preview staging, and screenshot/readback staging in the render-local category. These are deliberately outside the persisted/committed/transient-automation parameter composition rule.

RuntimeLiveState now owns transient automation invalidation for render-facing compatibility. It can clear overlays for a target layer/control key and prunes overlays that no longer resolve to the current layer and parameter definitions before applying them to a frame. This keeps shader reload, preset load, and layer removal behavior local to the live-state/composition boundary instead of scattering it through GL drawing code.

5. Shader Build Application

Compilation itself may eventually move into a separate build service, but once shader build outputs exist, RenderEngine owns:

  • program creation/link usage
  • pass graph application
  • sampler/texture binding layout application
  • resource reallocation required by shader shape changes
  • safe invalidation of old render-local feedback/history resources

6. Render Timing Publication

RenderEngine should publish observations to HealthTelemetry such as:

  • frame render duration
  • upload duration
  • pass execution duration
  • pack/readback duration
  • preview present timing
  • rebuild stalls
  • dropped/skipped input uploads
  • output frame production latency

It should publish them, not own the health policy built from them.
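
One way to keep publication separate from policy is a scoped timer that pushes a sample into a sink and does nothing else. The sketch below assumes C++17; `RenderTimingSample`, `TimingSink`, and `ScopedStageTimer` are hypothetical names, and the real HealthTelemetry interface may look quite different.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>

// Hypothetical observation record; field names are illustrative only.
struct RenderTimingSample {
    std::string stage;                        // e.g. "upload", "passes", "readback"
    std::chrono::nanoseconds duration{0};
};

// Sink the render side publishes into. HealthTelemetry would supply it;
// the render side never interprets the samples itself.
using TimingSink = std::function<void(const RenderTimingSample&)>;

// RAII timer: measures one scope and publishes exactly one sample on destruction.
class ScopedStageTimer {
public:
    ScopedStageTimer(std::string stage, TimingSink sink)
        : mStage(std::move(stage)),
          mSink(std::move(sink)),
          mStart(std::chrono::steady_clock::now()) {}

    ~ScopedStageTimer() {
        if (!mSink) return;  // no sink attached: measure nothing, publish nothing
        const auto elapsed = std::chrono::steady_clock::now() - mStart;
        mSink(RenderTimingSample{
            mStage, std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed)});
    }

    ScopedStageTimer(const ScopedStageTimer&) = delete;
    ScopedStageTimer& operator=(const ScopedStageTimer&) = delete;

private:
    std::string mStage;
    TimingSink mSink;
    std::chrono::steady_clock::time_point mStart;
};
```

A render stage would wrap its work in a block containing a `ScopedStageTimer`, leaving thresholds and alerting entirely to the consumer of the samples.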

Non-Responsibilities

The target boundary should remain explicit about what does not belong here.

RenderEngine should not:

  • decide whether a parameter mutation is persisted
  • normalize OSC/UI actions
  • choose device modes
  • own DeckLink callback behavior
  • own playout headroom policy
  • perform stack preset serialization
  • broadcast UI state
  • treat telemetry as a control plane

Those rules matter because the current codebase often solves timing issues by letting the render path reach sideways into nearby systems.

GL Ownership Model

Current Rule

One subsystem owns GL. RenderEngine now starts a dedicated render thread, binds the existing GL context on that thread for normal runtime work, and routes input upload, output render, preview presentation, screenshot capture, shader application, and render-local reset work through render-thread requests.

The render thread should:

  • create or adopt the GL context
  • execute all frame production work
  • perform accepted texture uploads
  • execute all pass graphs
  • manage async readback and output packing
  • manage feedback/history resets and reallocations

Other threads should interact with the subsystem through queues, snapshots, and completion signals, not by borrowing the GL context.
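
The request-routing idea can be illustrated with a small single-consumer task queue. This is a sketch under stated assumptions, not the actual RenderEngine thread: `RenderThread`, `Post`, and `Id` are hypothetical names, and the GL context binding is only indicated by a comment.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical render thread: every request runs on the one thread that would own GL.
class RenderThread {
public:
    RenderThread() : mWorker([this] { Run(); }) {}

    ~RenderThread() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStopping = true;
        }
        mWake.notify_one();
        mWorker.join();  // remaining queued requests are drained before exit
    }

    // Queue work for the render thread. The future completes once the work has run there.
    std::future<void> Post(std::function<void()> task) {
        auto packaged = std::make_shared<std::packaged_task<void()>>(std::move(task));
        std::future<void> done = packaged->get_future();
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mTasks.push([packaged] { (*packaged)(); });
        }
        mWake.notify_one();
        return done;
    }

    std::thread::id Id() const { return mWorker.get_id(); }

private:
    void Run() {
        // A real implementation would bind the GL context here, once, and keep it.
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mWake.wait(lock, [this] { return mStopping || !mTasks.empty(); });
                if (mTasks.empty()) return;  // stopping and fully drained
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            task();
        }
    }

    std::mutex mMutex;
    std::condition_variable mWake;
    std::queue<std::function<void()>> mTasks;
    bool mStopping = false;
    std::thread mWorker;  // declared last so the members above exist before Run() starts
};
```

Callers that need a result can wait on the returned future, which mirrors the transitional synchronous request/response path; callers that do not can simply discard it.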

Remaining Timing State

GL ownership is no longer shared across callback-driven and UI entrypoints.

The remaining timing issue is not shared GL ownership; it is the transitional synchronous output request/response path. The DeckLink completion callback still waits while the render thread produces an output frame, fills the DeckLink buffer, and then schedules the next frame.

Migration Direction

The next target path should be:

  1. input callback enqueues frame payloads or references
  2. render thread accepts the latest usable input frame
  3. render thread performs uploads on its own cadence
  4. render thread produces completed output frames ahead of backend demand
  5. backend callbacks only dequeue and schedule pre-rendered frames

Phase 4 completed the part that removes callback-thread GL ownership. Phase 7 should complete the producer/consumer playout part.
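
Steps 1 and 2 of the target path can be sketched as a newest-only mailbox between the input callback and the render thread. This assumes C++17 and picks the "newest-only" policy purely for illustration; the input ownership policy is still an open question in this document, and `InputFrame` and `LatestFrameMailbox` are hypothetical names.

```cpp
#include <cstdint>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical input payload.
struct InputFrame {
    std::uint64_t sequence = 0;
    std::vector<std::uint8_t> pixels;
};

// Newest-only handoff: the callback never touches GL, it only replaces the pending frame.
class LatestFrameMailbox {
public:
    // Called from the capture callback thread.
    // Returns true if an unconsumed frame was overwritten (counted as a skipped upload).
    bool Publish(InputFrame frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        const bool replaced = mPending.has_value();
        if (replaced) ++mDropped;
        mPending = std::move(frame);
        return replaced;
    }

    // Called from the render thread on its own cadence.
    std::optional<InputFrame> TakeLatest() {
        std::lock_guard<std::mutex> lock(mMutex);
        std::optional<InputFrame> out = std::move(mPending);
        mPending.reset();  // a moved-from optional is still engaged, so clear it explicitly
        return out;
    }

    std::uint64_t DroppedCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mDropped;
    }

private:
    mutable std::mutex mMutex;
    std::optional<InputFrame> mPending;
    std::uint64_t mDropped = 0;
};
```

The dropped counter is the kind of value that would feed the "dropped/skipped input uploads" observation rather than drive any decision inside the mailbox.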

Render Loop Boundaries

RenderEngine should own a render loop with explicit phases. A good target shape is:

  1. drain render-side commands and accepted service events
  2. swap to the latest published snapshot if needed
  3. apply render-local transient overlays
  4. accept or coalesce latest input frame for upload
  5. perform required uploads
  6. execute pass graph
  7. update temporal and feedback resources
  8. pack and stage output frame(s)
  9. publish preview-ready image if due
  10. publish playout-ready frame(s) to VideoBackend
  11. emit timing and health samples

The important property is that preview, playout preparation, feedback maintenance, and upload execution all happen under one render-owned cadence rather than as ad hoc side effects of unrelated callbacks.
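
As a rough sketch of that single cadence, the phases can be modelled as a fixed, ordered set of slots that one iteration walks through. The `Phase` enumerators below simply mirror the list above; `RenderLoopIteration` and its methods are hypothetical, and treating only the preview phase as skippable is an assumption made for the example.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <utility>

// One enumerator per phase, in execution order.
enum class Phase {
    DrainCommands, SwapSnapshot, ApplyOverlays, AcceptInput, Upload, ExecutePasses,
    UpdateHistory, StageOutput, PublishPreview, PublishPlayout, EmitTiming, Count
};

constexpr std::size_t kPhaseCount = static_cast<std::size_t>(Phase::Count);

class RenderLoopIteration {
public:
    void SetHandler(Phase phase, std::function<void()> handler) {
        mHandlers[static_cast<std::size_t>(phase)] = std::move(handler);
    }

    // Runs registered handlers in phase order, regardless of registration order.
    // The preview phase is skipped unless previewDue is true. Returns how many phases ran.
    int Run(bool previewDue) const {
        int ran = 0;
        for (std::size_t i = 0; i < kPhaseCount; ++i) {
            if (static_cast<Phase>(i) == Phase::PublishPreview && !previewDue) continue;
            if (mHandlers[i]) {
                mHandlers[i]();
                ++ran;
            }
        }
        return ran;
    }

private:
    std::array<std::function<void()>, kPhaseCount> mHandlers;
};
```

The point of the fixed ordering is that no external callback decides when uploads or preview happen; they only ever run as a step of a render-owned iteration.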

Snapshot And Overlay Interaction

RenderEngine should treat snapshots and overlays as different layers of state.

Snapshot Inputs

Snapshots should provide:

  • layer stack structure
  • shader/package selections
  • validated committed parameter values
  • pass graph definitions
  • resource requirements derived from runtime state

Render-Local Overlay Inputs

Overlays should provide:

  • active automation targets
  • smoothed transient parameter overrides
  • temporary visual state that should not persist back into the store
  • queued reset/rebuild invalidations for render-local resources

Resolution Rule

The render-side resolution order should be:

  1. snapshot committed state forms the baseline
  2. render-local transient overlays are applied on top
  3. feedback/history resources influence shading as render-local inputs
  4. completed frame is produced without mutating the underlying snapshot

This is especially important now that the OSC work has already moved toward render-local overlays. Phase 1 should keep that direction: render consumes committed truth plus transient live overlays, but render does not become the owner of persisted truth.
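
The first two resolution steps can be sketched as a pure function over two maps. This is an illustration only, assuming C++17 and flat string-keyed float parameters; `ParameterValues` and `ResolveFrameParameters` are hypothetical names. Ignoring overlay keys with no committed counterpart is modelled on the pruning behavior described for RuntimeLiveState, not copied from it.

```cpp
#include <map>
#include <string>

// Hypothetical flat parameter set: control key -> value.
using ParameterValues = std::map<std::string, float>;

// Returns the values a frame should be shaded with.
// Both inputs are taken by const reference, so the committed snapshot cannot be mutated.
ParameterValues ResolveFrameParameters(const ParameterValues& committed,
                                       const ParameterValues& overlay) {
    ParameterValues resolved = committed;  // step 1: committed state is the baseline
    for (const auto& [key, value] : overlay) {
        auto it = resolved.find(key);
        // step 2: overlays win, but only for keys that still resolve to a committed parameter
        if (it != resolved.end()) it->second = value;
    }
    return resolved;
}
```

Because the result is a fresh value, dropping an overlay simply means the next frame resolves back to the committed baseline with no write-back step.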

Preview And Playout Relationship

Preview should be a subordinate consumer of render results, not a peer that can disturb playout timing.

Target Rule

  • playout deadlines come first
  • preview is best-effort
  • preview cadence may be reduced independently
  • preview failure must not stall output frame production

Current State

Today preview still hangs off the render pipeline path through mPaint() in OpenGLRenderPipeline::RenderFrame(). That keeps preview close enough to the playout path that it is still part of the same timing surface.

Target Shape

RenderEngine should internally distinguish:

  • playout-ready frame production
  • preview presentation or preview-copy publication

Possible later implementations:

  • playout frame and preview frame share one composite render, but preview present is decoupled and rate-limited
  • render publishes a preview texture handle or CPU-side preview image to a preview presenter
  • preview updates are skipped under load without affecting playout queue fill

The exact implementation can change later, but the subsystem contract should already assume preview is subordinate.
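
A subordinate preview can be approximated with a small gate that the render loop consults before presenting. The sketch below is one possible shape, with `PreviewGate`, its minimum-interval policy, and the `playoutUnderPressure` flag all being assumptions for the example rather than existing code.

```cpp
#include <chrono>

// Hypothetical preview gate: preview is rate-limited and always yields to playout.
class PreviewGate {
public:
    using Clock = std::chrono::steady_clock;

    explicit PreviewGate(Clock::duration minInterval) : mMinInterval(minInterval) {}

    // Returns true when a preview may be presented for this frame.
    // The caller supplies the time and its own view of playout pressure.
    bool ShouldPresent(Clock::time_point now, bool playoutUnderPressure) {
        if (playoutUnderPressure) {  // playout deadlines come first
            ++mSkipped;
            return false;
        }
        if (mHasPresented && now - mLast < mMinInterval) {  // independent cadence limit
            ++mSkipped;
            return false;
        }
        mLast = now;
        mHasPresented = true;
        return true;
    }

    unsigned SkippedCount() const { return mSkipped; }

private:
    Clock::duration mMinInterval;
    Clock::time_point mLast{};
    bool mHasPresented = false;
    unsigned mSkipped = 0;
};
```

A skipped preview only increments a counter here, which keeps preview failure or throttling from ever feeding back into output frame production.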

Interaction With RuntimeSnapshotProvider

RenderEngine should depend on RuntimeSnapshotProvider, not on RuntimeStore.

Expected interactions:

  • query latest snapshot version
  • consume latest stable snapshot
  • detect structural versus parameter-only changes
  • make no mutation requests back into the snapshot provider during render

Expected non-interactions:

  • no direct persistence reads/writes
  • no raw store mutation
  • no direct service ingress handling

This is one of the main Phase 1 guardrails, because the current code often achieves convenience by letting render reach back into runtime-owned mutable objects.

Interaction With VideoBackend

The target dependency direction stays:

  • VideoBackend -> RenderEngine

That means:

  • backend requests or consumes ready frames
  • backend reports output timing/completion events
  • render does not own output device policy

RenderEngine should expose frame-production and queue-facing interfaces, while VideoBackend owns:

  • device callback handling
  • output scheduling policy
  • buffer pool policy
  • backend state transitions

In later phases, this should evolve toward a producer/consumer queue where:

  • render produces completed frames ahead of demand
  • backend consumes already-produced frames
  • callbacks drive dequeue/schedule/accounting only
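
The producer/consumer relationship can be sketched as a bounded queue whose capacity is chosen by the backend. This is a minimal illustration assuming C++17; `ReadyFrame`, `ReadyFrameQueue`, and the underrun counter are hypothetical, and whether render should expose a single frame or a ring is still listed as an open question.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical completed-frame handle.
struct ReadyFrame {
    std::uint64_t frameIndex = 0;
};

// Bounded queue: render produces ahead of demand, backend callbacks only dequeue.
class ReadyFrameQueue {
public:
    // headroom is the maximum number of pre-rendered frames; in this sketch the
    // backend picks the value, so latency policy stays outside the render side.
    explicit ReadyFrameQueue(std::size_t headroom) : mHeadroom(headroom) {}

    // Render thread: returns false when the queue is full, so latency cannot grow unbounded.
    bool TryPush(ReadyFrame frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.size() >= mHeadroom) return false;
        mFrames.push_back(std::move(frame));
        return true;
    }

    // Backend callback: never waits for rendering. An empty queue is counted as an underrun.
    std::optional<ReadyFrame> TryPop() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.empty()) {
            ++mUnderruns;
            return std::nullopt;
        }
        ReadyFrame frame = std::move(mFrames.front());
        mFrames.pop_front();
        return frame;
    }

    std::size_t UnderrunCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mUnderruns;
    }

private:
    const std::size_t mHeadroom;
    mutable std::mutex mMutex;
    std::deque<ReadyFrame> mFrames;
    std::size_t mUnderruns = 0;
};
```

What the backend does on an underrun (repeat the last frame, schedule black, report a fault) is deliberately left out, since that is output scheduling policy.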

Current Code Mapping

The following current responsibilities should converge into RenderEngine.

From OpenGLComposite

  • render-local overlay management
  • render-facing rebuild application
  • screenshot-related render execution hooks
  • render bootstrap ownership currently mixed with app bootstrap

From OpenGLRenderPipeline

  • frame render orchestration
  • output pack conversion
  • async readback state
  • output frame caching
  • preview-ready signal publication

From OpenGLVideoIOBridge

  • GL texture upload execution should move under render ownership
  • playout callback render work should move out of the callback path

Remains Outside RenderEngine

  • device callback registration
  • playout scheduling policy
  • signal/device status lifecycle
  • runtime mutation policy

Suggested Internal Components

This document does not require final class names, but RenderEngine will likely be easier to evolve if it is not one monolithic replacement for OpenGLComposite.

Reasonable internal pieces could include:

  • RenderLoopController
  • RenderSnapshotConsumer
  • RenderOverlayState
  • RenderInputQueue
  • RenderPassExecutor
  • RenderHistoryManager
  • RenderOutputStager
  • PreviewPresenter

Those are internal implementation helpers. They should not become new cross-cutting subsystem boundaries by themselves.

Public Interface Shape

Aligned with the Phase 1 design, RenderEngine should eventually expose operations in this family:

  • StartRenderLoop(...)
  • StopRenderLoop()
  • ConsumeSnapshot(...)
  • EnqueueInputFrame(...)
  • ApplyOverlayUpdate(...)
  • RequestRenderLocalReset(...)
  • HandleRebuildOutputs(...)
  • TryProduceOutputFrame(...)
  • GetPreviewFrame(...)
  • ReportRenderState()

Interface goals:

  • calls are explicit about whether they mutate render-local state or request frame production
  • no caller needs direct GL access
  • preview and playout are exposed as outputs, not as reasons for callers to enter the render path
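
To make the operation family more concrete, here is one possible abstract interface. Only the method names come from the list above; every parameter type, return type, and supporting struct is a hypothetical placeholder chosen so the sketch compiles under C++17, and the `IRenderEngine` name itself is an assumption.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Placeholder payload types, invented for this sketch.
struct RenderSnapshot { std::uint64_t version = 0; };
struct InputFrame { std::vector<std::uint8_t> pixels; };
struct OverlayUpdate { std::uint64_t controlKey = 0; float value = 0.0f; };
struct ShaderBuildOutputs { std::uint64_t generation = 0; };
struct OutputFrame { std::uint64_t frameIndex = 0; std::vector<std::uint8_t> packed; };
struct PreviewFrame { std::uint64_t frameIndex = 0; };
struct RenderStateReport { bool loopRunning = false; std::uint64_t snapshotVersion = 0; };

class IRenderEngine {
public:
    virtual ~IRenderEngine() = default;

    // Lifecycle.
    virtual bool StartRenderLoop() = 0;
    virtual void StopRenderLoop() = 0;

    // Mutate render-local state. None of these produce a frame or expose GL.
    virtual void ConsumeSnapshot(std::shared_ptr<const RenderSnapshot> snapshot) = 0;
    virtual void EnqueueInputFrame(InputFrame frame) = 0;
    virtual void ApplyOverlayUpdate(const OverlayUpdate& update) = 0;
    virtual void RequestRenderLocalReset() = 0;
    virtual void HandleRebuildOutputs(const ShaderBuildOutputs& outputs) = 0;

    // Outputs. An empty optional means nothing is ready yet; callers never enter the render path.
    virtual std::optional<OutputFrame> TryProduceOutputFrame() = 0;
    virtual std::optional<PreviewFrame> GetPreviewFrame() = 0;

    // Observation only.
    virtual RenderStateReport ReportRenderState() const = 0;
};
```

Grouping the methods this way keeps the first interface goal visible in the signature list: state-mutating calls return nothing, while frame-producing calls return an optional output.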

Migration Plan From Current Code

Step 1. Name The Boundary

Treat OpenGLRenderPipeline plus the render portions of OpenGLComposite and OpenGLVideoIOBridge as conceptually belonging to RenderEngine, even before physical extraction is complete.

Step 2. Stop New Render Work From Escaping

As new features are added, keep:

  • feedback buffers
  • temporal history
  • render-local overlays
  • preview state

inside render-owned code paths instead of putting them back into runtime storage or service layers.

Step 3. Isolate Snapshot Consumption

Introduce snapshot-facing APIs so render no longer depends on broad runtime-state access for frame production.

Current status: Phase 3 introduced RenderFrameInput, RenderFrameState, and RenderFrameStateResolver, so frame-state selection is named and no longer lives inside GL drawing. Phase 4 built on that contract and moved normal runtime GL ownership onto the render thread.

Step 4. Move Uploads Onto Render Ownership

Input callbacks should enqueue or hand off frame data; render executes the upload.

Step 5. Break Callback-Driven Rendering

Move from "render in playout completion callback" to "render ahead and let backend consume ready frames."

Step 6. Decouple Preview Cadence

Make preview a best-effort presentation path with its own skip/rate-limit policy.

Step 7. Narrow OpenGLComposite

After the above, OpenGLComposite should collapse toward a composition root and legacy adapter rather than remaining the owner of render behavior.

Risks

Latency Risk

Moving to queue-based frame production can accidentally increase latency if headroom is allowed to grow without policy. RenderEngine should therefore expose queue-friendly production, but VideoBackend must still own explicit latency/headroom policy.

Resource Churn Risk

Snapshot changes, shader rebuilds, and video-format changes can cause expensive reallocation of:

  • feedback surfaces
  • history buffers
  • output pack resources
  • readback buffers

The subsystem needs clear structural-change boundaries so parameter-only changes do not trigger broad resource churn.

Preview Coupling Risk

If preview remains too close to the render/playout path, it can continue to steal budget from output production even after the rest of the subsystem is cleaned up.

Readback Deadline Risk

The current async readback path still falls back to synchronous reads when the deadline is missed. That behavior may remain necessary, but RenderEngine should treat it as a degraded-path metric, not as an invisible normal case.
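
Treating the fallback as a visible metric can be as simple as counting which path each readback took. The sketch below is illustrative only; `ReadbackPath` and `ReadbackStats` are hypothetical names, and the actual readback code is not shown in this document.

```cpp
#include <cstdint>

enum class ReadbackPath { Async, SynchronousFallback };

// Hypothetical degraded-path counter for output readback.
class ReadbackStats {
public:
    void Record(ReadbackPath path) {
        ++mTotal;
        if (path == ReadbackPath::SynchronousFallback) ++mFallbacks;
    }

    std::uint64_t FallbackCount() const { return mFallbacks; }

    // Fraction of readbacks that missed the async deadline; 0.0 when nothing was recorded.
    double FallbackRatio() const {
        return mTotal == 0 ? 0.0
                           : static_cast<double>(mFallbacks) / static_cast<double>(mTotal);
    }

private:
    std::uint64_t mTotal = 0;
    std::uint64_t mFallbacks = 0;
};
```

Publishing the ratio alongside the other timing observations would let HealthTelemetry decide what counts as degraded, instead of the render path silently absorbing the cost.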

Overlay Complexity Risk

Render-local overlays are powerful, but they can become a hidden second state model if not kept clearly subordinate to committed snapshot state.

Open Questions

  • Should preview become a separate presenter helper inside RenderEngine, or remain a subordinate callback/output sink?
  • Where should screenshot capture live long-term: inside RenderEngine, or in a small render consumer layered on top of it?
  • Should shader compilation outputs be delivered to render as whole-framegraph rebuild packages, or incrementally by layer/pass?
  • How should input frame ownership work under load: newest-only, bounded queue, or policy selected by backend mode?
  • Should render expose one playout-ready frame at a time, or a bounded ring the backend drains directly?
  • What exact distinction should the snapshot provider publish between structural changes and parameter-only changes so render rebuilds stay cheap?

Phase 1 Exit Criteria For RenderEngine

For Phase 1, this subsystem design is sufficiently defined once the project agrees that:

  • render is the sole long-term owner of GL work
  • render consumes snapshots, not mutable runtime store objects
  • preview is subordinate to playout
  • feedback/history/overlays are render-local transient state
  • backend callbacks should converge toward dequeue/schedule behavior rather than direct rendering
  • current render responsibilities in OpenGLComposite, OpenGLRenderPipeline, and OpenGLVideoIOBridge are expected to migrate under this subsystem

Short Version

RenderEngine should become the subsystem that owns live GPU execution and nothing else.

It consumes committed snapshots plus render-local overlays, owns the full GL lifecycle, produces preview and playout-ready frames, and publishes timing observations. It should not own persistence, control ingress, or hardware scheduling policy. If later phases hold to that line, timing work and render-state work can get cleaner without reintroducing the same cross-thread coupling in a different form.