docs update
Aiden
2026-05-11 17:16:39 +10:00
parent e5c5920ccd
commit ebc10a9925
5 changed files with 417 additions and 31 deletions

@@ -0,0 +1,373 @@
# Phase 4 Design: Render Thread Ownership
This document expands Phase 4 of [ARCHITECTURE_RESILIENCE_REVIEW.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/ARCHITECTURE_RESILIENCE_REVIEW.md) into a concrete design target.
Phase 1 named the subsystems. Phase 2 added the typed event substrate. Phase 3 made render-facing live state explicit through `RuntimeLiveState`, `RenderStateComposer`, `RenderFrameInput`, `RenderFrameState`, `RenderFrameStateResolver`, and `RuntimeServiceLiveBridge`. Phase 4 can now focus on the core timing-risk boundary: making one render thread the only owner of OpenGL work.
## Status
- Phase 4 design package: proposed.
- Phase 4 implementation: not started.
- Current alignment: the repo has a named frame-state contract and cleaner render-state preparation, but GL work is still entered through multiple paths protected by one shared `CRITICAL_SECTION`.
Current GL ownership footholds:
- `RenderEngine` owns GL resources and the current context-binding helpers.
- `RenderFrameInput` / `RenderFrameState` provide the frame-state contract that a render thread can consume.
- `RenderFrameStateResolver` prepares the render-facing layer state before drawing.
- `OpenGLVideoIOBridge` still calls `RenderEngine::TryUploadInputFrame(...)` from the input path and `RenderEngine::RenderOutputFrame(...)` from the output path.
- `OpenGLComposite::paintGL(...)`, screenshot capture, input upload, and output rendering still reach GL through `RenderEngine` methods that bind the shared context under `pMutex`.
## Why Phase 4 Exists
The resilience review identifies shared GL ownership as the main remaining timing and failure-isolation risk. Today the shared context lock protects correctness, but it does not isolate timing:
- input callbacks can attempt texture upload
- output callbacks can trigger frame rendering and readback
- preview paint can enter the same GL context
- screenshot capture can enter the same GL context
- the DeckLink completion path is still too close to render work
That means brief input, preview, readback, or callback stalls can still collide on the most timing-sensitive path.
Phase 4 should turn GL from a shared resource guarded by a lock into a resource owned by one thread with explicit queues and handoff points.
## Goals
Phase 4 should establish:
- one render thread as the sole long-lived owner of the GL context
- non-render threads enqueue work instead of binding the GL context
- input upload requests are accepted and executed by the render thread
- output frame rendering is requested or scheduled through render-owned work
- preview and screenshot requests become render-thread commands or consumers
- `RenderFrameInput` / `RenderFrameState` become the stable data contract for frame production
- GL context entrypoints are reduced to render-thread-only code paths
- tests for queue semantics, request coalescing, and lifecycle behavior without requiring DeckLink hardware
## Non-Goals
Phase 4 should not require:
- the final producer/consumer playout queue for DeckLink
- the final DeckLink lifecycle state machine
- replacing the async readback policy
- implementing background persistence
- completing Phase 5's deeper live-state layering
- replacing every UI or backend API at once
Those are later phases or follow-on work. Phase 4 is about making GL ownership deterministic first.
## Current GL Entry Points
The current code paths that matter most are:
| Entry point | Current behavior | Phase 4 direction |
| --- | --- | --- |
| `RenderEngine::TryUploadInputFrame(...)` | attempts to take the GL lock, binds the context, uploads input texture | enqueue latest input frame; render thread uploads |
| `RenderEngine::RenderOutputFrame(...)` | takes the GL lock, binds the context, renders, packs and reads back output | render thread executes output frame production |
| `RenderEngine::TryPresentPreview(...)` | attempts to take the GL lock and presents preview | render thread or preview presenter consumes latest completed frame |
| `RenderEngine::CaptureOutputFrameRgbaTopDown(...)` | takes the GL lock and reads output pixels | screenshot request becomes render-thread command |
| `OpenGLVideoIOBridge::UploadInputFrame(...)` | calls render upload directly | push input frame into render queue/mailbox |
| `OpenGLVideoIOBridge::RenderScheduledFrame(...)` | calls render output directly from backend path | request/consume render-produced output without callback-owned GL |
## Target Ownership Model
### Render Thread
The render thread should own:
- `wglMakeCurrent(...)` for the rendering context
- all GL resource creation/destruction
- input texture upload
- pass execution
- output pack conversion
- async readback buffers and fences
- preview presentation or preview frame publication
- screenshot readback
- temporal history and feedback resources
### Other Threads
Other threads may:
- enqueue input frames or replace the latest input frame
- publish control/runtime/backend events
- request shader build application
- request render-local resets
- request screenshots
- consume ready output frames or receive completion notifications
Other threads should not:
- call GL directly
- bind or unbind the render context
- wait on GL fences directly
- mutate render-local resource state
## Proposed Collaborators
### `RenderThread`
Owns the OS thread, wakeup primitive, lifecycle, and render-loop execution.
Responsibilities:
- start and stop the render thread
- bind the GL context for the thread lifetime or render-loop lifetime
- drain render commands
- execute frame production work
- publish lifecycle and failure observations
Non-responsibilities:
- runtime mutation policy
- DeckLink scheduling policy
- durable persistence
### `RenderCommandQueue`
Small bounded queue or command mailbox for render-thread work.
Possible commands:
- `UploadInputFrame`
- `RenderOutputFrame`
- `PrepareFrameState`
- `ApplyShaderBuild`
- `ResetTemporalHistory`
- `ResetShaderFeedback`
- `PresentPreview`
- `CaptureScreenshot`
- `Stop`
High-rate commands should be coalesced where appropriate. Input frames should likely be latest-value rather than unbounded FIFO.
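
As a rough illustration of the bounded, coalescing behavior described above, the sketch below assumes payload-free commands and a hypothetical `IsCoalescable` policy. The enum values mirror some of the candidate commands; the class shape, capacity handling, and merge rule are assumptions, not the project's API.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>

// Hypothetical command kinds, named after some of the candidate commands.
enum class RenderCommandKind {
    RenderOutputFrame,
    ResetTemporalHistory,
    PresentPreview,
    CaptureScreenshot,
    Stop,
};

// Assumption: these kinds carry no payload, so a duplicate that is already
// pending adds nothing and can be merged into the existing entry.
inline bool IsCoalescable(RenderCommandKind kind) {
    return kind == RenderCommandKind::PresentPreview ||
           kind == RenderCommandKind::ResetTemporalHistory;
}

class RenderCommandQueue {
public:
    explicit RenderCommandQueue(std::size_t capacity) : capacity_(capacity) {}

    // Returns false when the queue is full so the caller can count the drop.
    bool TryPush(RenderCommandKind kind) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (IsCoalescable(kind)) {
            for (RenderCommandKind pending : pending_) {
                if (pending == kind) return true;  // merged with a pending request
            }
        }
        if (pending_.size() >= capacity_) return false;
        pending_.push_back(kind);
        return true;
    }

    // Pops the oldest pending command, preserving FIFO order.
    std::optional<RenderCommandKind> TryPop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.empty()) return std::nullopt;
        RenderCommandKind kind = pending_.front();
        pending_.pop_front();
        return kind;
    }

    std::size_t Depth() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<RenderCommandKind> pending_;
};
```

A real queue would probably always accept `Stop` and would carry payloads for commands such as `ApplyShaderBuild`; both are left out here to keep the coalescing rule visible.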
### `RenderFrameCoordinator`
Optional helper that combines Phase 3's frame contract with render-thread execution.
Responsibilities:
- build or receive `RenderFrameInput`
- call `RuntimeServiceLiveBridge` and `RenderFrameStateResolver`
- hand `RenderFrameState` to `RenderEngine`
This can begin as a thin helper. The important part is that it keeps frame-state preparation explicit when `renderEffect()` stops being called directly from the callback path.
### `RenderOutputMailbox`
Optional transitional bridge for output frames.
Responsibilities:
- hold the latest completed output frame or a small bounded set
- let backend code consume output without owning GL
- report underrun/stale-frame reuse observations
This may be a late Phase 4 step or a Phase 7 playout-policy step. At minimum, Phase 4 should avoid designing the render thread in a way that rules out adding this mailbox later.
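
One possible shape for such a mailbox, holding only the latest completed frame, is sketched below. The `OutputFrame` struct, the method names, and the counters are hypothetical; a real frame would likely wrap a backend buffer rather than a `std::vector`.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical completed-frame payload.
struct OutputFrame {
    std::uint64_t frameIndex = 0;
    std::vector<std::uint8_t> pixels;
};

class RenderOutputMailbox {
public:
    // Render thread: publish a completed frame, replacing any older one.
    void Publish(OutputFrame frame) {
        auto shared = std::make_shared<const OutputFrame>(std::move(frame));
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(shared);
        fresh_ = true;
    }

    // Backend thread: returns the latest frame without touching GL.
    // Returns nullptr (and counts an underrun) before the first publish.
    std::shared_ptr<const OutputFrame> Consume() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!latest_) {
            ++underruns_;
            return nullptr;
        }
        if (!fresh_) ++staleReuses_;  // same frame handed out again
        fresh_ = false;
        return latest_;
    }

    std::uint64_t Underruns() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return underruns_;
    }

    std::uint64_t StaleReuses() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return staleReuses_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const OutputFrame> latest_;
    bool fresh_ = false;
    std::uint64_t underruns_ = 0;
    std::uint64_t staleReuses_ = 0;
};
```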
## Threading Contract
Phase 4 should make thread ownership visible in APIs.
Candidate naming:
- `RenderEngine::StartRenderThread(...)`
- `RenderEngine::StopRenderThread()`
- `RenderEngine::EnqueueInputFrame(...)`
- `RenderEngine::RequestOutputFrame(...)`
- `RenderEngine::RequestPreviewPresent(...)`
- `RenderEngine::RequestScreenshot(...)`
Render-thread-only methods should be private or clearly named:
- `RenderEngine::UploadInputFrameOnRenderThread(...)`
- `RenderEngine::RenderOutputFrameOnRenderThread(...)`
- `RenderEngine::CaptureOutputFrameOnRenderThread(...)`
The current `TryUploadInputFrame`, `RenderOutputFrame`, `TryPresentPreview`, and `CaptureOutputFrameRgbaTopDown` methods can remain as compatibility shims during migration, but their implementations should move toward enqueue-and-wait or enqueue-and-return behavior instead of binding GL directly from the caller's thread.
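
To make the enqueue-and-wait versus enqueue-and-return distinction concrete, here is a minimal sketch of a serial worker whose `Submit` returns a `std::future`. `RenderWorker` is a hypothetical stand-in for the render thread, not the project's class; it performs no GL work and only shows the handoff.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the render thread: runs submitted work serially.
class RenderWorker {
public:
    RenderWorker() : thread_([this] { Run(); }) {}

    ~RenderWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

    // Enqueue-and-return: the caller may drop the future. Calling get() on it
    // gives enqueue-and-wait behavior for callers that need a result.
    template <typename Fn>
    auto Submit(Fn fn) -> std::future<decltype(fn())> {
        using Result = decltype(fn());
        auto task = std::make_shared<std::packaged_task<Result()>>(std::move(fn));
        std::future<Result> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            work_.push_back([task] { (*task)(); });
        }
        wake_.notify_one();
        return result;
    }

private:
    void Run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !work_.empty(); });
                if (work_.empty()) return;  // stopping and fully drained
                job = std::move(work_.front());
                work_.pop_front();
            }
            job();  // executed outside the lock, on the worker thread
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> work_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the members above exist before Run() starts
};
```

Under this assumption, a compatibility shim such as `RenderOutputFrame` could forward to `Submit(...)` and call `get()` on the result, while an input shim could submit and return immediately.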
## Frame Production Shape
A target render-thread frame should look like:
1. wake for input, output demand, preview demand, shader build, reset, screenshot, or stop
2. drain bounded render commands
3. coalesce to the latest input frame and latest control/live state
4. build `RenderFrameInput`
5. prepare `RenderFrameState`
6. upload accepted input frame
7. render layer stack
8. pack output if needed
9. stage readback or output buffer
10. publish preview/screenshot/output completion as needed
11. record timing and queue metrics
The exact cadence can remain demand-driven initially. The architectural win is that the demand wakes the render thread rather than borrowing GL from the caller.
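
A demand-driven wakeup of this kind could be sketched as a small signal that accumulates wake reasons until the render thread collects them. The `WakeReason` flags and the `RenderWakeSignal` class are illustrative assumptions; the real wake set would follow step 1 of the list above.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical wake reasons, one bit each so several can be pending at once.
enum WakeReason : std::uint32_t {
    kWakeInput = 1u << 0,
    kWakeOutputDemand = 1u << 1,
    kWakePreview = 1u << 2,
    kWakeScreenshot = 1u << 3,
    kWakeStop = 1u << 4,
};

class RenderWakeSignal {
public:
    // Any thread: record why the render thread should wake, then wake it.
    void Notify(std::uint32_t reasons) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_ |= reasons;  // repeated reasons collapse into one bit
        }
        wake_.notify_one();
    }

    // Render thread: block until at least one reason is pending, then
    // return all pending reasons and clear them.
    std::uint32_t Wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return pending_ != 0; });
        std::uint32_t reasons = pending_;
        pending_ = 0;
        return reasons;
    }

private:
    std::mutex mutex_;
    std::condition_variable wake_;
    std::uint32_t pending_ = 0;
};
```

Because the mask is returned as a whole, the render loop can increment per-reason counters from it before draining commands.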
## Migration Plan
### Step 1. Name Render-Thread-Only Methods
Split existing direct GL methods into public request methods and private render-thread methods, with as little behavior change as possible.
Initial target:
- keep current synchronous behavior where callers need a result
- move GL bodies into clearly render-thread-owned helpers
- make future queue migration mechanical
### Step 2. Add Render Command Queue
Introduce a small queue/mailbox for render commands.
Start with low-risk commands:
- preview present request
- screenshot request
- render-local reset requests
Then move input upload and output render requests once the queue and wakeup behavior are proven.
### Step 3. Start A Dedicated Render Thread
Create the render thread and make it own context binding.
Transitional behavior may still allow synchronous request/response for output frames. The important change is that the caller waits for render-thread completion rather than taking the GL context itself.
### Step 4. Move Input Upload To The Render Thread
Change `OpenGLVideoIOBridge::UploadInputFrame(...)` so it enqueues or replaces the latest input frame.
Policy targets:
- bounded memory
- latest-frame wins under load
- input upload skip count is observable
- input callback never waits for GL
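
The policy targets above can be illustrated with a single-slot mailbox. This is a sketch under assumptions: `InputFrame` and `LatestInputMailbox` are hypothetical names, and the frame is copied into a `std::vector` here even though the copy-versus-reference question is still open.

```cpp
#include <cstdint>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical input payload; a real one might reference backend-owned buffers.
struct InputFrame {
    std::uint64_t sequence = 0;
    std::vector<std::uint8_t> pixels;
};

class LatestInputMailbox {
public:
    // Input callback: the lock is held only for a move, never across GL work.
    void Publish(InputFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (slot_.has_value()) ++replacedCount_;  // older frame was never uploaded
        slot_ = std::move(frame);
    }

    // Render thread: take the newest frame, leaving the slot empty.
    std::optional<InputFrame> Take() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::optional<InputFrame> frame = std::move(slot_);
        slot_.reset();  // a moved-from optional still holds a value
        return frame;
    }

    std::uint64_t ReplacedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return replacedCount_;
    }

private:
    mutable std::mutex mutex_;
    std::optional<InputFrame> slot_;
    std::uint64_t replacedCount_ = 0;
};
```

Memory stays bounded at one pending frame, and `ReplacedCount()` gives the observable skip count.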
### Step 5. Move Output Rendering To The Render Thread
Change `OpenGLVideoIOBridge::RenderScheduledFrame(...)` so it requests render-thread output production or consumes a completed render-thread output.
Transitional option:
- synchronous request/response through the render thread
Better follow-up:
- render ahead into a bounded output queue and let backend callbacks consume ready frames
### Step 6. Decouple Preview And Screenshot Requests
Preview should become best-effort:
- request preview presentation from the render thread
- skip when render is busy or output deadline pressure is high
- record preview skips
Screenshot should become:
- queued render-thread capture request
- async disk write remains outside render thread
### Step 7. Remove Shared GL Lock From Normal Paths
Once all GL entrypoints are render-thread-owned:
- remove normal dependence on `pMutex` for render correctness
- keep assertions or diagnostics that detect wrong-thread GL calls
- leave only lifecycle synchronization where needed
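
A wrong-thread diagnostic can be as small as a recorded owner thread id plus a counter. The `RenderThreadGuard` name and its methods below are hypothetical; whether a violation should assert, log, or only count is one of the open questions.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

class RenderThreadGuard {
public:
    // Call once from the render thread after it has bound the GL context.
    void BindToCurrentThread() { owner_.store(std::this_thread::get_id()); }

    bool IsOwner() const { return owner_.load() == std::this_thread::get_id(); }

    // Returns true on the owning thread. On any other thread (or before
    // binding) it counts the violation and returns false.
    bool Check() {
        if (IsOwner()) return true;
        wrongThreadCalls_.fetch_add(1, std::memory_order_relaxed);
        return false;
    }

    std::uint64_t WrongThreadCalls() const {
        return wrongThreadCalls_.load(std::memory_order_relaxed);
    }

private:
    // A default-constructed id does not match any running thread.
    std::atomic<std::thread::id> owner_{std::thread::id{}};
    std::atomic<std::uint64_t> wrongThreadCalls_{0};
};
```

Each render-thread-only method could call `Check()` at entry, with a debug-build assertion layered on top if that proves useful.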
## Testing Strategy
Phase 4 tests should avoid hardware where possible.
Recommended tests:
- render command queue preserves FIFO for non-coalesced commands
- latest-input mailbox drops older frames under load
- stop command wakes and drains the render thread
- screenshot request receives one completion or failure
- output render request reports timeout/failure if render thread is stopped
- render reset commands coalesce where expected
- render-thread-only methods are not publicly reachable from other threads
Existing useful homes:
- `RuntimeEventTypeTests` for new render/backend observations
- `RuntimeSubsystemTests` for pure request/coalescing helpers
- a new `RenderThreadTests` target for queue/mailbox/lifecycle helpers that do not require GL
Manual verification will still be needed for:
- real DeckLink input/output
- preview interaction
- screenshot capture
- shader reload while rendering
## Telemetry Added During Phase 4
Phase 4 should add minimal metrics while moving ownership:
- render command queue depth
- input frames accepted, replaced, and dropped
- render-thread wake reason counts
- render-thread frame duration
- output request latency
- preview request skipped count
- screenshot request success/failure count
- wrong-thread GL call diagnostics if practical
Full operational reporting remains Phase 8, but these metrics make the threading migration debuggable.
## Risks
### Deadlock Risk
Synchronous request/response shims can deadlock if the caller is already on the render thread or holds a lock the render thread needs. Phase 4 should keep request waits narrow and add render-thread detection early.
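
The sketch below shows one way to keep a synchronous wait narrow: run the work inline when the caller is already the render thread, and otherwise wait with a timeout. The `post` callback is a hypothetical hook that hands work to the render thread's queue, and `RunOnRenderThread` is an illustrative helper, not an existing function.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <utility>

enum class RequestResult { Completed, TimedOut };

// `post` is a hypothetical hook that enqueues a job for the render thread.
// `work` must own everything it touches: after a timeout it may still run later.
inline RequestResult RunOnRenderThread(
    std::thread::id renderThreadId,
    const std::function<void(std::function<void()>)>& post,
    std::function<void()> work,
    std::chrono::milliseconds timeout) {
    // Already on the render thread: waiting on our own queue would never return.
    if (std::this_thread::get_id() == renderThreadId) {
        work();
        return RequestResult::Completed;
    }

    auto done = std::make_shared<std::promise<void>>();
    std::future<void> finished = done->get_future();
    post([work = std::move(work), done] {
        work();
        done->set_value();
    });

    // Bounded wait so a stalled or stopped render thread cannot hang the caller.
    return finished.wait_for(timeout) == std::future_status::ready
               ? RequestResult::Completed
               : RequestResult::TimedOut;
}
```

This does not remove the second hazard named above: the caller should still release any lock the render thread might need before calling.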
### Latency Risk
Moving work through queues can hide latency. Queue depth and output request latency should be measured from the first migration step.
### Lifetime Risk
Moving context ownership changes startup and shutdown order. The render thread must stop before GL resources or window/context handles are destroyed.
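
One way to make that order explicit is to rely on C++ member destruction order: declare the thread after the handles it uses, and join it in the destructor body. The sketch below uses hypothetical `WindowAndContextHandles` and `RenderHost` types that only log their teardown, so the order can be checked without GL.

```cpp
#include <atomic>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for window/context handles; logs its own teardown.
struct WindowAndContextHandles {
    explicit WindowAndContextHandles(std::vector<std::string>& log) : log_(log) {}
    ~WindowAndContextHandles() { log_.push_back("handles destroyed"); }
    std::vector<std::string>& log_;
};

class RenderHost {
public:
    explicit RenderHost(std::vector<std::string>& log)
        : log_(log), handles_(log), thread_([this] { Loop(); }) {}

    ~RenderHost() {
        running_.store(false);
        thread_.join();  // the render loop has fully exited past this point
        log_.push_back("render thread stopped");
        // handles_ is destroyed only after this destructor body returns.
    }

private:
    // Placeholder loop; a real render thread would block on a wake signal.
    void Loop() {
        while (running_.load()) std::this_thread::yield();
    }

    std::vector<std::string>& log_;
    std::atomic<bool> running_{true};
    WindowAndContextHandles handles_;  // declared before the thread that uses it
    std::thread thread_;               // constructed last, stopped first
};
```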
### Callback Pressure Risk
If DeckLink callbacks wait too long for render-thread work, Phase 4 may improve GL ownership but still leave callback timing fragile. A synchronous bridge is acceptable as a transition, but the design should keep the path open for producer/consumer playout.
### Preview Coupling Risk
Preview can remain a hidden budget consumer if it stays in the output frame path. Phase 4 should keep preview explicitly best-effort, even if physical decoupling continues later.
## Phase 4 Exit Criteria
Phase 4 can be considered complete once the project can say:
- [ ] one render thread owns the GL context during normal operation
- [ ] input callbacks do not bind GL or wait on GL upload
- [ ] output callbacks do not bind GL directly
- [ ] preview and screenshot requests enter render through explicit render-thread requests
- [ ] `RenderFrameInput` / `RenderFrameState` remain the frame-state contract
- [ ] normal frame production no longer depends on a shared GL `CRITICAL_SECTION`
- [ ] render-thread queue/mailbox behavior has non-GL tests
- [ ] shutdown order is explicit and tested or manually verified
## Open Questions
- Should the first output migration be synchronous request/response, or should Phase 4 go directly to a small ready-frame queue?
- Should the render thread own `RuntimeServiceLiveBridge` calls, or should frame state be prepared just before enqueue?
- How much input frame memory should be copied at enqueue time versus referenced from backend-owned buffers?
- Should preview present on the render thread, or should render publish a preview image/texture to a separate presenter?
- What timeout should output callbacks use if the render thread cannot produce a frame in time?
- Should wrong-thread GL access be enforced with assertions, telemetry, or both?
## Short Version
Phase 4 should make GL ownership boring and deterministic.
One render thread owns the context. Other threads submit work or consume results. Input upload, frame rendering, readback, preview, and screenshot capture all move behind render-thread entrypoints. The first implementation can be transitional and partly synchronous, but after Phase 4 the app should no longer rely on callback and UI paths borrowing the GL context under one shared lock.