aiden/video-shader-toys

Aiden 20476bdf63

Step 3

2026-05-11 17:41:59 +10:00

Phase 4 Design: Render Thread Ownership

This document expands Phase 4 of ARCHITECTURE_RESILIENCE_REVIEW.md into a concrete design target.

Phase 1 named the subsystems. Phase 2 added the typed event substrate. Phase 3 made render-facing live state explicit through RuntimeLiveState, RenderStateComposer, RenderFrameInput, RenderFrameState, RenderFrameStateResolver, and RuntimeServiceLiveBridge. Phase 4 can now focus on the core timing-risk boundary: making one render thread the only owner of OpenGL work.

Status

Phase 4 design package: proposed.
Phase 4 implementation: Step 3 started. The existing synchronous RenderEngine entrypoints delegate their GL bodies to named ...OnRenderThread(...) helpers, preview/screenshot/render-reset/input-upload/output-render requests pass through a small RenderCommandQueue compatibility mailbox, and RenderEngine now starts a dedicated render thread for normal runtime GL work.
Current alignment: the repo has a named frame-state contract and cleaner render-state preparation. Normal runtime GL work is routed through the render thread after startup, while startup initialization still runs before the render thread is started.

Current GL ownership footholds:

RenderEngine owns GL resources, a dedicated render thread, the current synchronous compatibility shims, a small render command mailbox, and named render-thread helper methods.
RenderFrameInput / RenderFrameState provide the frame-state contract that a render thread can consume.
RenderFrameStateResolver prepares the render-facing layer state before drawing.
OpenGLVideoIOBridge still calls RenderEngine::TryUploadInputFrame(...) from the input path and RenderEngine::RenderOutputFrame(...) from the output path.
OpenGLComposite::paintGL(...), screenshot capture, input upload, and output rendering still call synchronous RenderEngine methods, but those methods now invoke render-thread work once OpenGLComposite::Start() has started the render thread.

Why Phase 4 Exists

The resilience review identifies shared GL ownership as the main remaining timing and failure-isolation risk. Today the shared context lock protects correctness, but it does not isolate timing:

input callbacks can attempt texture upload
output callbacks can trigger frame rendering and readback
preview paint can enter the same GL context
screenshot capture can enter the same GL context
the DeckLink completion path is still too close to render work

That means brief input, preview, readback, or callback stalls can still collide on the most timing-sensitive path.

Phase 4 should turn GL from a shared resource guarded by a lock into a resource owned by one thread with explicit queues and handoff points.

Goals

Phase 4 should establish:

one render thread as the sole long-lived owner of the GL context
non-render threads enqueue work instead of binding the GL context
input upload requests are accepted and executed by the render thread
output frame rendering is requested or scheduled through render-owned work
preview and screenshot requests become render-thread commands or consumers
RenderFrameInput / RenderFrameState become the stable data contract for frame production
GL context entrypoints are reduced to render-thread-only code paths
tests for queue semantics, request coalescing, and lifecycle behavior without requiring DeckLink hardware

Non-Goals

Phase 4 should not require:

the final producer/consumer playout queue for DeckLink
the final DeckLink lifecycle state machine
replacing the async readback policy
implementing background persistence
completing Phase 5's deeper live-state layering
replacing every UI or backend API at once

Those are later phases or follow-on work. Phase 4 is about making GL ownership deterministic first.

Current GL Entry Points

The current code paths that matter most are:

| Entry point | Current behavior | Phase 4 direction |
| --- | --- | --- |
| `RenderEngine::TryUploadInputFrame(...)` | synchronous compatibility shim; after render-thread startup it queues input upload work and waits for render-thread completion | enqueue latest input frame; render thread uploads without callback-owned GL |
| `RenderEngine::RenderOutputFrame(...)` | synchronous compatibility shim; after render-thread startup it queues output render work and waits for render-thread completion | render thread executes output frame production |
| `RenderEngine::TryPresentPreview(...)` | synchronous compatibility shim; after render-thread startup it queues preview presentation and waits for render-thread completion | render thread or preview presenter consumes latest completed frame |
| `RenderEngine::CaptureOutputFrameRgbaTopDown(...)` | synchronous compatibility shim; after render-thread startup it queues screenshot readback and waits for render-thread completion | screenshot request becomes render-thread command |
| `OpenGLVideoIOBridge::UploadInputFrame(...)` | calls render upload directly | push input frame into render queue/mailbox |
| `OpenGLVideoIOBridge::RenderScheduledFrame(...)` | calls render output directly from backend path | request/consume render-produced output without callback-owned GL |

Target Ownership Model

Render Thread

The render thread should own:

wglMakeCurrent(...) for the rendering context
all GL resource creation/destruction
input texture upload
pass execution
output pack conversion
async readback buffers and fences
preview presentation or preview frame publication
screenshot readback
temporal history and feedback resources

Other Threads

Other threads may:

enqueue input frames or replace the latest input frame
publish control/runtime/backend events
request shader build application
request render-local resets
request screenshots
consume ready output frames or receive completion notifications

Other threads should not:

call GL directly
bind or unbind the render context
wait on GL fences directly
mutate render-local resource state

Proposed Collaborators

`RenderThread`

Owns the OS thread, wakeup primitive, lifecycle, and render-loop execution.

Responsibilities:

start and stop the render thread
bind the GL context for the thread lifetime or render-loop lifetime
drain render commands
execute frame production work
publish lifecycle and failure observations

Non-responsibilities:

runtime mutation policy
DeckLink scheduling policy
durable persistence

`RenderCommandQueue`

Small bounded queue or command mailbox for render-thread work.

Current implementation:

RenderCommandQueue exists as a pure C++ mailbox helper.
Preview present and screenshot capture requests use latest-value coalescing.
Input upload requests use latest-value coalescing. The synchronous shims still drain the input frame immediately, while the caller's memory is still valid; an enqueue-and-return upload path will need copied or otherwise owned frame storage.
Output frame requests use FIFO semantics so scheduled output demand is not collapsed.
Render-local reset requests coalesce to the strongest pending reset scope.
The synchronous compatibility shims submit queued work to the render thread and wait for completion once the render thread is running.
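
The coalescing rules listed above can be sketched as a small mailbox. This is only an illustration, not the repo's actual RenderCommandQueue: the names `SketchRenderMailbox`, `ResetScope`, and `OutputRequest`, the integer frame payloads, and the ordering of reset scopes are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical reset scopes, ordered weakest to strongest (an assumption).
enum class ResetScope { None = 0, TemporalHistory = 1, ShaderFeedback = 2, Full = 3 };

// Hypothetical output request payload.
struct OutputRequest { std::uint64_t frameIndex = 0; };

class SketchRenderMailbox {
public:
    // Latest-value: a newer input frame replaces an unconsumed older one.
    void PostInputFrame(std::uint64_t frameId) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mLatestInput) ++mReplacedInputs;
        mLatestInput = frameId;
    }
    // FIFO: every output request is kept (bounding omitted for brevity).
    void PostOutputRequest(OutputRequest request) {
        std::lock_guard<std::mutex> lock(mMutex);
        mOutputRequests.push_back(request);
    }
    // Strongest-wins: pending resets collapse to the widest scope requested.
    void RequestReset(ResetScope scope) {
        std::lock_guard<std::mutex> lock(mMutex);
        mPendingReset = std::max(mPendingReset, scope);
    }
    // The Take* methods are what the render thread would call when draining.
    std::optional<std::uint64_t> TakeInputFrame() {
        std::lock_guard<std::mutex> lock(mMutex);
        return std::exchange(mLatestInput, std::nullopt);
    }
    std::optional<OutputRequest> TakeOutputRequest() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mOutputRequests.empty()) return std::nullopt;
        OutputRequest next = mOutputRequests.front();
        mOutputRequests.pop_front();
        return next;
    }
    ResetScope TakeReset() {
        std::lock_guard<std::mutex> lock(mMutex);
        return std::exchange(mPendingReset, ResetScope::None);
    }
    std::uint64_t ReplacedInputCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mReplacedInputs;
    }

private:
    mutable std::mutex mMutex;
    std::optional<std::uint64_t> mLatestInput;
    std::deque<OutputRequest> mOutputRequests;
    ResetScope mPendingReset = ResetScope::None;
    std::uint64_t mReplacedInputs = 0;
};
```

Because the mailbox has no GL dependency, its semantics can be exercised in a plain unit test, which is the property the testing strategy relies on.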

Possible commands:

UploadInputFrame
RenderOutputFrame
PrepareFrameState
ApplyShaderBuild
ResetTemporalHistory
ResetShaderFeedback
PresentPreview
CaptureScreenshot
Stop

High-rate commands should be coalesced where appropriate. Input frames should likely be latest-value rather than unbounded FIFO.

`RenderFrameCoordinator`

Optional helper that combines Phase 3's frame contract with render-thread execution.

Responsibilities:

build or receive RenderFrameInput
call RuntimeServiceLiveBridge and RenderFrameStateResolver
hand RenderFrameState to RenderEngine

This can begin as a thin helper. The important part is that it keeps frame-state preparation explicit when renderEffect() stops being called directly from the callback path.

`RenderOutputMailbox`

Optional transitional bridge for output frames.

Responsibilities:

hold the latest completed output frame or a small bounded set
let backend code consume output without owning GL
report underrun/stale-frame reuse observations

This may be a late Phase 4 step or a Phase 7 playout-policy step. At minimum, Phase 4 should avoid designing the render thread in a way that rules out this mailbox later.

Threading Contract

Phase 4 should make thread ownership visible in APIs.

Candidate naming:

RenderEngine::StartRenderThread(...)
RenderEngine::StopRenderThread()
RenderEngine::EnqueueInputFrame(...)
RenderEngine::RequestOutputFrame(...)
RenderEngine::RequestPreviewPresent(...)
RenderEngine::RequestScreenshot(...)

Render-thread-only methods should be private or clearly named:

RenderEngine::UploadInputFrameOnRenderThread(...)
RenderEngine::RenderOutputFrameOnRenderThread(...)
RenderEngine::CaptureOutputFrameRgbaTopDownOnRenderThread(...)

The current TryUploadInputFrame, RenderOutputFrame, TryPresentPreview, and CaptureOutputFrameRgbaTopDown methods can remain as compatibility shims during migration, but their implementations should move toward enqueue-and-wait or enqueue-and-return behavior instead of binding GL directly from the caller's thread.
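
An enqueue-and-wait shim of the kind described above might look roughly like the following. It is a minimal sketch: `SketchRenderThread` and `InvokeAndWait` are hypothetical names, the work item stands in for a GL body, and timeouts are left out to keep the shape visible.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>

class SketchRenderThread {
public:
    SketchRenderThread() : mThread([this] { Run(); }) {}
    ~SketchRenderThread() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStopping = true;
        }
        mWake.notify_one();
        mThread.join();
    }

    bool IsRenderThread() const {
        return std::this_thread::get_id() == mThread.get_id();
    }

    // Compatibility shim: the caller blocks on the result but never binds GL.
    bool InvokeAndWait(std::function<bool()> work) {
        // Already on the render thread: run inline to avoid self-deadlock.
        if (IsRenderThread()) return work();
        std::packaged_task<bool()> task(std::move(work));
        std::future<bool> result = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mStopping) return false;  // refuse work during shutdown
            mTasks.push_back(std::move(task));
        }
        mWake.notify_one();
        return result.get();
    }

private:
    void Run() {
        for (;;) {
            std::packaged_task<bool()> task;
            {
                std::unique_lock<std::mutex> lock(mMutex);
                mWake.wait(lock, [this] { return mStopping || !mTasks.empty(); });
                if (mTasks.empty()) return;  // stopping and fully drained
                task = std::move(mTasks.front());
                mTasks.pop_front();
            }
            task();  // runs outside the lock so callers can keep enqueueing
        }
    }

    std::mutex mMutex;
    std::condition_variable mWake;
    std::deque<std::packaged_task<bool()>> mTasks;
    bool mStopping = false;
    std::thread mThread;  // declared last so the other members exist before Run()
};
```

A public method such as TryPresentPreview could then forward to `InvokeAndWait` with a lambda that calls its `...OnRenderThread(...)` helper, which keeps the caller-facing signature unchanged during migration.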

Frame Production Shape

A target render-thread frame should look like:

wake for input, output demand, preview demand, shader build, reset, screenshot, or stop
drain bounded render commands
coalesce to the latest input frame and latest control/live state
build RenderFrameInput
prepare RenderFrameState
upload accepted input frame
render layer stack
pack output if needed
stage readback or output buffer
publish preview/screenshot/output completion as needed
record timing and queue metrics

The exact cadence can remain demand-driven initially. The architectural win is that the demand wakes the render thread rather than borrowing GL from the caller.
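
The ordering of one demand-driven iteration can be sketched without any GL calls. The `PendingWork` struct, the `RunFrameOnce` function, and the step labels below are hypothetical; each label stands in for the real work named in the list above.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical snapshot of what was drained from the command queue.
struct PendingWork {
    std::optional<std::uint64_t> latestInputFrame;  // already coalesced
    std::uint32_t outputRequests = 0;
    bool previewRequested = false;
    bool screenshotRequested = false;
    bool stopRequested = false;
};

// Runs one iteration and returns the steps it performed, in order.
std::vector<std::string> RunFrameOnce(PendingWork& work) {
    std::vector<std::string> steps;
    if (work.stopRequested) {
        steps.push_back("stop");
        return steps;
    }
    const bool needsFrame = work.outputRequests > 0 ||
                            work.previewRequested || work.screenshotRequested;
    if (needsFrame) {
        steps.push_back("build-frame-input");
        steps.push_back("prepare-frame-state");
    }
    if (work.latestInputFrame) {
        steps.push_back("upload-input");
        work.latestInputFrame.reset();
    }
    if (!needsFrame) return steps;  // input-only wake: upload and go back to sleep
    steps.push_back("render-layer-stack");
    if (work.outputRequests > 0) {
        steps.push_back("pack-output");
        steps.push_back("stage-readback");
        --work.outputRequests;
    }
    if (work.previewRequested) {
        steps.push_back("present-preview");
        work.previewRequested = false;
    }
    if (work.screenshotRequested) {
        steps.push_back("capture-screenshot");
        work.screenshotRequested = false;
    }
    return steps;
}
```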

Migration Plan

Step 1. Name Render-Thread-Only Methods

Split existing direct GL methods into public request methods and private render-thread methods, with minimal behavior change.

Initial target:

keep current synchronous behavior where callers need a result
move GL bodies into clearly render-thread-owned helpers for upload, output render, preview presentation, and screenshot readback
make future queue migration mechanical

Step 2. Add Render Command Queue

Introduce a small queue/mailbox for render commands.

Start with low-risk commands:

preview present request
screenshot request
render-local reset requests
input upload request
output render request

The queue and wakeup behavior still need the dedicated render thread before the callbacks stop borrowing the GL context.

Step 3. Start A Dedicated Render Thread

Create the render thread and make it own context binding.

create a dedicated render thread owned by RenderEngine
bind the existing GL context on the render thread for normal runtime work
stop the render thread before GL context destruction
keep transitional synchronous request/response for output frames
remove normal runtime dependence on the shared GL CRITICAL_SECTION
add timeout/failure behavior for render-thread requests

Transitional behavior still allows synchronous request/response for output frames. A render-thread request now fails fast if it cannot begin within the request timeout. If a task has already started and runs over budget, the overrun is logged and the caller keeps waiting until the task completes safely. The important change is that the caller waits for render-thread completion rather than taking the GL context itself.
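
One way to express the fail-fast-before-start, wait-after-start rule is a small request object with an explicit state. This is a sketch under assumptions: `SketchRenderRequest`, `RequestState`, and the reuse of one timeout as both start deadline and run budget are illustrative choices, not the repo's implementation.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <utility>

enum class RequestState { Pending, Running, Done, Cancelled };

class SketchRenderRequest {
public:
    explicit SketchRenderRequest(std::function<void()> work) : mWork(std::move(work)) {}

    // Render thread: runs the work unless the caller already gave up.
    void Execute() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mState != RequestState::Pending) return;
            mState = RequestState::Running;
        }
        mChanged.notify_all();
        mWork();
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mState = RequestState::Done;
        }
        mChanged.notify_all();
    }

    // Caller: returns false if the work did not start within the timeout.
    bool Wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mMutex);
        const bool started = mChanged.wait_for(
            lock, timeout, [this] { return mState != RequestState::Pending; });
        if (!started) {
            mState = RequestState::Cancelled;  // a late Execute() becomes a no-op
            return false;
        }
        // Once started, the work may touch caller-owned memory, so wait it out.
        const bool onBudget = mChanged.wait_for(
            lock, timeout, [this] { return mState == RequestState::Done; });
        if (!onBudget) {
            mOverBudget = true;  // a real implementation would log here
            mChanged.wait(lock, [this] { return mState == RequestState::Done; });
        }
        return true;
    }

    bool WasOverBudget() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mOverBudget;
    }

private:
    mutable std::mutex mMutex;
    std::condition_variable mChanged;
    std::function<void()> mWork;
    RequestState mState = RequestState::Pending;
    bool mOverBudget = false;
};
```

The request object has to outlive both the caller and the render thread's queue entry; shared ownership would be one way to guarantee that in practice.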

Step 4. Move Input Upload To The Render Thread

Change OpenGLVideoIOBridge::UploadInputFrame(...) so it enqueues or replaces the latest input frame.

Policy targets:

bounded memory
latest-frame wins under load
input upload skip count is observable
input callback never waits for GL
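
These policy targets suggest a single-slot mailbox that copies the frame at publish time. The sketch below assumes a hypothetical `SketchInputMailbox` holding raw bytes; whether the real bridge copies or references backend-owned buffers is still an open question in this design.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

class SketchInputMailbox {
public:
    // Input callback: copies the frame and returns without touching GL.
    void Publish(const std::uint8_t* data, std::size_t size) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mHasFrame) ++mReplaced;  // an unconsumed frame is being overwritten
        mPending.assign(data, data + size);
        mHasFrame = true;
    }

    // Render thread: takes ownership of the newest frame, if there is one.
    bool TryTake(std::vector<std::uint8_t>& out) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (!mHasFrame) return false;
        out.swap(mPending);  // swapping lets both buffers keep their capacity
        mHasFrame = false;
        return true;
    }

    // Observable skip count for telemetry.
    std::uint64_t ReplacedCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mReplaced;
    }

private:
    mutable std::mutex mMutex;
    std::vector<std::uint8_t> mPending;
    bool mHasFrame = false;
    std::uint64_t mReplaced = 0;
};
```

Memory stays bounded at one pending buffer plus the consumer's buffer. The callback does hold the mutex for the duration of a copy, so it waits on a memcpy at worst, never on GL.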

Step 5. Move Output Rendering To The Render Thread

Change OpenGLVideoIOBridge::RenderScheduledFrame(...) so it requests render-thread output production or consumes a completed render-thread output.

Transitional option:

synchronous request/response through the render thread

Better follow-up:

render ahead into a bounded output queue and let backend callbacks consume ready frames
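
A bounded ready-frame queue for the follow-up option could be as small as the sketch below. `SketchReadyFrameQueue` and `ReadyFrame` are hypothetical names, and the payload is reduced to a frame index.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>

// Hypothetical completed output frame; a real one would carry pixel storage.
struct ReadyFrame { std::uint64_t frameIndex = 0; };

class SketchReadyFrameQueue {
public:
    explicit SketchReadyFrameQueue(std::size_t capacity) : mCapacity(capacity) {}

    // Render thread: a false return means "stop rendering ahead for now".
    bool TryPush(ReadyFrame frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.size() >= mCapacity) return false;
        mFrames.push_back(frame);
        return true;
    }

    // Backend callback: never blocks; an empty queue is counted as an underrun.
    std::optional<ReadyFrame> TryPop() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.empty()) {
            ++mUnderruns;
            return std::nullopt;
        }
        ReadyFrame frame = mFrames.front();
        mFrames.pop_front();
        return frame;
    }

    std::uint64_t UnderrunCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mUnderruns;
    }

private:
    mutable std::mutex mMutex;
    std::deque<ReadyFrame> mFrames;
    const std::size_t mCapacity;
    std::uint64_t mUnderruns = 0;
};
```

On an underrun the backend would need its own policy, for example repeating the previous frame, which is the stale-frame reuse that RenderOutputMailbox is meant to report.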

Step 6. Decouple Preview And Screenshot Requests

Preview should become best-effort:

request preview presentation from the render thread
skip when render is busy or output deadline pressure is high
record preview skips

Screenshot should become:

queued render-thread capture request
async disk write remains outside render thread

Step 7. Remove Shared GL Lock From Normal Paths

Once all GL entrypoints are render-thread-owned:

remove normal dependence on pMutex for render correctness
keep assertions or diagnostics that detect wrong-thread GL calls
leave only lifecycle synchronization where needed
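
A wrong-thread diagnostic can be a thread-id comparison plus a counter. In this sketch `SketchRenderThreadGuard` is a hypothetical helper; a debug build could assert on the result, while a release build only records the count.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

class SketchRenderThreadGuard {
public:
    // Called once by the render thread after it binds the GL context.
    void BindToCurrentThread() { mOwner.store(std::this_thread::get_id()); }

    // Called at the top of each ...OnRenderThread(...) helper.
    // Returns false and counts the violation when called from another thread.
    bool CheckOnRenderThread() {
        if (std::this_thread::get_id() == mOwner.load()) return true;
        mWrongThreadCalls.fetch_add(1, std::memory_order_relaxed);
        return false;
    }

    std::uint64_t WrongThreadCallCount() const {
        return mWrongThreadCalls.load(std::memory_order_relaxed);
    }

private:
    // A default-constructed id matches no running thread.
    std::atomic<std::thread::id> mOwner{std::thread::id()};
    std::atomic<std::uint64_t> mWrongThreadCalls{0};
};
```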

Testing Strategy

Phase 4 tests should avoid hardware where possible.

Recommended tests:

render command queue preserves FIFO for non-coalesced commands
latest-input mailbox drops older frames under load
stop command wakes and drains the render thread
screenshot request receives one completion or failure
output render request reports timeout/failure if render thread is stopped
render reset commands coalesce where expected
wrong-thread render-only methods are not publicly reachable

Suggested homes:

RuntimeEventTypeTests for new render/backend observations
RuntimeSubsystemTests for pure request/coalescing helpers
a new RenderThreadTests target for queue/mailbox/lifecycle helpers that do not require GL

Manual verification will still be needed for:

real DeckLink input/output
preview interaction
screenshot capture
shader reload while rendering

Telemetry Added During Phase 4

Phase 4 should add minimal metrics while moving ownership:

render command queue depth
input frames accepted, replaced, and dropped
render-thread wake reason counts
render-thread frame duration
output request latency
preview request skipped count
screenshot request success/failure count
wrong-thread GL call diagnostics if practical

Full operational reporting remains Phase 8, but these metrics make the threading migration debuggable.
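
A few of these counters could live in a lock-free struct that the render thread updates and a reporter reads. The `SketchRenderMetrics` struct, the `WakeReason` enum, and the field names are assumptions for illustration only.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical wake reasons; Count is a sentinel used to size the array.
enum class WakeReason : std::size_t {
    Input, OutputDemand, Preview, Screenshot, Reset, Stop, Count
};

struct SketchRenderMetrics {
    std::array<std::atomic<std::uint64_t>,
               static_cast<std::size_t>(WakeReason::Count)> wakeCounts{};
    std::atomic<std::uint64_t> inputAccepted{0};
    std::atomic<std::uint64_t> inputReplaced{0};
    std::atomic<std::uint64_t> previewSkipped{0};
    std::atomic<std::uint64_t> maxFrameMicros{0};

    void RecordWake(WakeReason reason) {
        wakeCounts[static_cast<std::size_t>(reason)].fetch_add(
            1, std::memory_order_relaxed);
    }

    // Keeps the worst frame duration seen so far, without taking a lock.
    void RecordFrameDuration(std::uint64_t micros) {
        std::uint64_t seen = maxFrameMicros.load(std::memory_order_relaxed);
        while (micros > seen &&
               !maxFrameMicros.compare_exchange_weak(
                   seen, micros, std::memory_order_relaxed)) {
            // compare_exchange_weak reloads 'seen' on failure; retry.
        }
    }
};
```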

Risks

Deadlock Risk

Synchronous request/response shims can deadlock if the caller is already on the render thread or holds a lock the render thread needs. Phase 4 should keep request waits narrow and add render-thread detection early.

Latency Risk

Moving work through queues can hide latency. Queue depth and output request latency should be measured from the first migration step.

Lifetime Risk

Moving context ownership changes startup and shutdown order. The render thread must stop before GL resources or window/context handles are destroyed.
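
In C++ the shutdown order can be made structural, because members are destroyed in reverse declaration order. The sketch below uses hypothetical stand-ins (`SketchGlContext`, `SketchRenderWorker`, `SketchEngine`) and a log vector to make the order visible.

```cpp
#include <string>
#include <thread>
#include <vector>

// Stand-in for the GL context and window handles.
struct SketchGlContext {
    explicit SketchGlContext(std::vector<std::string>& log) : mLog(log) {}
    ~SketchGlContext() { mLog.push_back("context destroyed"); }
    std::vector<std::string>& mLog;
};

// Stand-in for the render thread; a real one would signal stop before joining.
struct SketchRenderWorker {
    explicit SketchRenderWorker(std::vector<std::string>& log)
        : mLog(log), mThread([] { /* render loop would run here */ }) {}
    ~SketchRenderWorker() {
        mThread.join();
        mLog.push_back("render thread stopped");
    }
    std::vector<std::string>& mLog;
    std::thread mThread;
};

struct SketchEngine {
    explicit SketchEngine(std::vector<std::string>& log) : context(log), worker(log) {}
    SketchGlContext context;    // declared first, so destroyed last
    SketchRenderWorker worker;  // declared last, so destroyed first
};
```

With this layout the render thread is always joined before the context goes away, even on early-exit paths, and a test can assert the order from the log.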

Callback Pressure Risk

If DeckLink callbacks wait too long for render-thread work, Phase 4 may improve GL ownership but still leave callback timing fragile. A synchronous bridge is acceptable as a transition, but the design should keep the path open for producer/consumer playout.

Preview Coupling Risk

Preview can remain a hidden budget consumer if it stays in the output frame path. Phase 4 should keep preview explicitly best-effort, even if physical decoupling continues later.

Phase 4 Exit Criteria

Phase 4 can be considered complete once the project can say:

one render thread owns the GL context during normal operation
input callbacks do not bind GL or wait on GL upload
output callbacks do not bind GL directly
preview and screenshot requests enter render through explicit render-thread requests
RenderFrameInput / RenderFrameState remain the frame-state contract
normal frame production no longer depends on a shared GL CRITICAL_SECTION
render-thread queue/mailbox behavior has non-GL tests
shutdown order is explicit and tested or manually verified

Open Questions

Should the first output migration be synchronous request/response, or should Phase 4 go directly to a small ready-frame queue?
Should the render thread own RuntimeServiceLiveBridge calls, or should frame state be prepared just before enqueue?
How much input frame memory should be copied at enqueue time versus referenced from backend-owned buffers?
Should preview present on the render thread, or should render publish a preview image/texture to a separate presenter?
What timeout should output callbacks use if the render thread cannot produce a frame in time?
Should wrong-thread GL access be enforced with assertions, telemetry, or both?

Short Version

Phase 4 should make GL ownership boring and deterministic.

One render thread owns the context. Other threads submit work or consume results. Input upload, frame rendering, readback, preview, and screenshot capture all move behind render-thread entrypoints. The first implementation can be transitional and partly synchronous, but after Phase 4 the app should no longer rely on callback and UI paths borrowing the GL context under one shared lock.
