docs update

2026-05-11 17:16:39 +10:00
parent e5c5920ccd
commit ebc10a9925
5 changed files with 417 additions and 31 deletions
--- a/docs/PHASE_4_RENDER_THREAD_OWNERSHIP_DESIGN.md
+++ b/docs/PHASE_4_RENDER_THREAD_OWNERSHIP_DESIGN.md
@@ -0,0 +1,373 @@
+# Phase 4 Design: Render Thread Ownership
+
+This document expands Phase 4 of [ARCHITECTURE_RESILIENCE_REVIEW.md](ARCHITECTURE_RESILIENCE_REVIEW.md) into a concrete design target.
+
+Phase 1 named the subsystems. Phase 2 added the typed event substrate. Phase 3 made render-facing live state explicit through `RuntimeLiveState`, `RenderStateComposer`, `RenderFrameInput`, `RenderFrameState`, `RenderFrameStateResolver`, and `RuntimeServiceLiveBridge`. Phase 4 can now focus on the core timing-risk boundary: making one render thread the only owner of OpenGL work.
+
+## Status
+
+- Phase 4 design package: proposed.
+- Phase 4 implementation: not started.
+- Current alignment: the repo has a named frame-state contract and cleaner render-state preparation, but GL work is still entered through multiple paths protected by one shared `CRITICAL_SECTION`.
+
+Current GL ownership footholds:
+
+- `RenderEngine` owns GL resources and the current context-binding helpers.
+- `RenderFrameInput` / `RenderFrameState` provide the frame-state contract that a render thread can consume.
+- `RenderFrameStateResolver` prepares the render-facing layer state before drawing.
+- `OpenGLVideoIOBridge` still calls `RenderEngine::TryUploadInputFrame(...)` from the input path and `RenderEngine::RenderOutputFrame(...)` from the output path.
+- `OpenGLComposite::paintGL(...)`, screenshot capture, input upload, and output rendering still reach GL through `RenderEngine` methods that bind the shared context under `pMutex`.
+
+## Why Phase 4 Exists
+
+The resilience review identifies shared GL ownership as the main remaining timing and failure-isolation risk. Today the shared context lock protects correctness, but it does not isolate timing:
+
+- input callbacks can attempt texture upload
+- output callbacks can trigger frame rendering and readback
+- preview paint can enter the same GL context
+- screenshot capture can enter the same GL context
+- the DeckLink completion path is still too close to render work
+
+That means brief input, preview, readback, or callback stalls can still collide on the most timing-sensitive path.
+
+Phase 4 should turn GL from a shared resource guarded by a lock into a resource owned by one thread with explicit queues and handoff points.
+
+## Goals
+
+Phase 4 should establish:
+
+- one render thread as the sole long-lived owner of the GL context
+- non-render threads that enqueue work instead of binding the GL context
+- input upload requests accepted and executed by the render thread
+- output frame rendering requested or scheduled through render-owned work
+- preview and screenshot requests handled as render-thread commands or consumers
+- `RenderFrameInput` / `RenderFrameState` as the stable data contract for frame production
+- GL context entrypoints reduced to render-thread-only code paths
+- tests for queue semantics, request coalescing, and lifecycle behavior that do not require DeckLink hardware
+
+## Non-Goals
+
+Phase 4 should not require:
+
+- the final producer/consumer playout queue for DeckLink
+- the final DeckLink lifecycle state machine
+- replacing the async readback policy
+- implementing background persistence
+- completing Phase 5's deeper live-state layering
+- replacing every UI or backend API at once
+
+Those are later phases or follow-on work. Phase 4 is about making GL ownership deterministic first.
+
+## Current GL Entry Points
+
+The current code paths that matter most are:
+
+| Entry point | Current behavior | Phase 4 direction |
+| --- | --- | --- |
+| `RenderEngine::TryUploadInputFrame(...)` | attempts to take the GL lock, binds the context, uploads input texture | enqueue latest input frame; render thread uploads |
+| `RenderEngine::RenderOutputFrame(...)` | takes the GL lock, binds the context, renders, packs and reads back output | render thread executes output frame production |
+| `RenderEngine::TryPresentPreview(...)` | attempts to take the GL lock and presents preview | render thread or preview presenter consumes latest completed frame |
+| `RenderEngine::CaptureOutputFrameRgbaTopDown(...)` | takes the GL lock and reads output pixels | screenshot request becomes render-thread command |
+| `OpenGLVideoIOBridge::UploadInputFrame(...)` | calls render upload directly | push input frame into render queue/mailbox |
+| `OpenGLVideoIOBridge::RenderScheduledFrame(...)` | calls render output directly from backend path | request/consume render-produced output without callback-owned GL |
+
+## Target Ownership Model
+
+### Render Thread
+
+The render thread should own:
+
+- `wglMakeCurrent(...)` for the rendering context
+- all GL resource creation/destruction
+- input texture upload
+- pass execution
+- output pack conversion
+- async readback buffers and fences
+- preview presentation or preview frame publication
+- screenshot readback
+- temporal history and feedback resources
+
+### Other Threads
+
+Other threads may:
+
+- enqueue input frames or replace the latest input frame
+- publish control/runtime/backend events
+- request shader build application
+- request render-local resets
+- request screenshots
+- consume ready output frames or receive completion notifications
+
+Other threads should not:
+
+- call GL directly
+- bind or unbind the render context
+- wait on GL fences directly
+- mutate render-local resource state
+
+## Proposed Collaborators
+
+### `RenderThread`
+
+Owns the OS thread, wakeup primitive, lifecycle, and render-loop execution.
+
+Responsibilities:
+
+- start and stop the render thread
+- bind the GL context for the thread lifetime or render-loop lifetime
+- drain render commands
+- execute frame production work
+- publish lifecycle and failure observations
+
+Non-responsibilities:
+
+- runtime mutation policy
+- DeckLink scheduling policy
+- durable persistence
+
+### `RenderCommandQueue`
+
+Small bounded queue or command mailbox for render-thread work.
+
+Possible commands:
+
+- `UploadInputFrame`
+- `RenderOutputFrame`
+- `PrepareFrameState`
+- `ApplyShaderBuild`
+- `ResetTemporalHistory`
+- `ResetShaderFeedback`
+- `PresentPreview`
+- `CaptureScreenshot`
+- `Stop`
+
+High-rate commands should be coalesced where appropriate. Input frames should likely be latest-value rather than unbounded FIFO.
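
The queue described above could be sketched as follows. This is a minimal illustration, not the project's implementation: the `RenderCommandKind` values reuse names from the command list, but `RenderCommand`, `IsCoalescable`, `TryPush`, `TryPop`, and the choice of which commands coalesce are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical command kinds, named after the "possible commands" list.
enum class RenderCommandKind {
    ApplyShaderBuild,
    ResetTemporalHistory,
    ResetShaderFeedback,
    PresentPreview,
    CaptureScreenshot,
    Stop,
};

struct RenderCommand {
    RenderCommandKind kind;
};

// Assumption: resets and preview requests are idempotent, so a second pending
// copy adds nothing and can be merged into the first.
inline bool IsCoalescable(RenderCommandKind kind) {
    return kind == RenderCommandKind::ResetTemporalHistory ||
           kind == RenderCommandKind::ResetShaderFeedback ||
           kind == RenderCommandKind::PresentPreview;
}

class RenderCommandQueue {
public:
    explicit RenderCommandQueue(std::size_t capacity) : capacity_(capacity) {}

    // Callable from any thread. Returns false when the queue is full; the
    // caller decides whether to drop, retry, or count the rejection.
    bool TryPush(const RenderCommand& command) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (IsCoalescable(command.kind)) {
            for (const RenderCommand& pending : commands_) {
                if (pending.kind == command.kind) {
                    ++coalescedCount_;
                    return true;  // merged into the already pending command
                }
            }
        }
        if (commands_.size() >= capacity_) return false;
        commands_.push_back(command);
        return true;
    }

    // Called by the render thread. Commands leave in the order they entered.
    bool TryPop(RenderCommand& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (commands_.empty()) return false;
        out = commands_.front();
        commands_.pop_front();
        return true;
    }

    std::size_t Depth() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return commands_.size();
    }

    std::size_t CoalescedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return coalescedCount_;
    }

private:
    mutable std::mutex mutex_;
    std::deque<RenderCommand> commands_;
    std::size_t capacity_;
    std::size_t coalescedCount_ = 0;
};
```

In this sketch `Depth()` and `CoalescedCount()` are the hooks a queue-depth metric could read. Input frames are deliberately left out of the queue, since they are expected to be latest-value rather than FIFO.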
+
+### `RenderFrameCoordinator`
+
+Optional helper that combines Phase 3's frame contract with render-thread execution.
+
+Responsibilities:
+
+- build or receive `RenderFrameInput`
+- call `RuntimeServiceLiveBridge` and `RenderFrameStateResolver`
+- hand `RenderFrameState` to `RenderEngine`
+
+This can begin as a thin helper. The important part is that it keeps frame-state preparation explicit when `renderEffect()` stops being called directly from the callback path.
+
+### `RenderOutputMailbox`
+
+Optional transitional bridge for output frames.
+
+Responsibilities:
+
+- hold the latest completed output frame or a small bounded set
+- let backend code consume output without owning GL
+- report underrun/stale-frame reuse observations
+
+This may be a late Phase 4 step or a Phase 7 playout-policy step. Phase 4 should at least avoid designing the render thread in a way that rules out adding this mailbox later.
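
One possible shape for such a mailbox, holding only the latest completed frame, is sketched below. `OutputFrame`, `Publish`, `Consume`, and the two counters are hypothetical names chosen for the example. The sketch copies the frame on consume for simplicity, which a real implementation would likely avoid.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical CPU-side output frame produced by render-thread readback.
struct OutputFrame {
    std::uint64_t frameIndex = 0;
    std::vector<std::uint8_t> bytes;
};

class RenderOutputMailbox {
public:
    // Called by the render thread after readback completes.
    void Publish(OutputFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(frame);
        hasFrame_ = true;
        fresh_ = true;
    }

    // Called by backend code; never touches GL. Returns false on underrun
    // (nothing has been published yet). Re-reading a frame that was already
    // consumed is allowed but counted as a stale reuse.
    bool Consume(OutputFrame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasFrame_) {
            ++underrunCount_;
            return false;
        }
        if (!fresh_) ++staleReuseCount_;
        fresh_ = false;
        out = latest_;  // copy, so the mailbox keeps a frame for reuse
        return true;
    }

    std::uint64_t UnderrunCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return underrunCount_;
    }

    std::uint64_t StaleReuseCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return staleReuseCount_;
    }

private:
    mutable std::mutex mutex_;
    OutputFrame latest_;
    bool hasFrame_ = false;
    bool fresh_ = false;
    std::uint64_t underrunCount_ = 0;
    std::uint64_t staleReuseCount_ = 0;
};
```

The two counters correspond to the underrun and stale-frame reuse observations listed above. How they are surfaced as events is left open here.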
+
+## Threading Contract
+
+Phase 4 should make thread ownership visible in APIs.
+
+Candidate naming:
+
+- `RenderEngine::StartRenderThread(...)`
+- `RenderEngine::StopRenderThread()`
+- `RenderEngine::EnqueueInputFrame(...)`
+- `RenderEngine::RequestOutputFrame(...)`
+- `RenderEngine::RequestPreviewPresent(...)`
+- `RenderEngine::RequestScreenshot(...)`
+
+Render-thread-only methods should be private or clearly named:
+
+- `RenderEngine::UploadInputFrameOnRenderThread(...)`
+- `RenderEngine::RenderOutputFrameOnRenderThread(...)`
+- `RenderEngine::CaptureOutputFrameOnRenderThread(...)`
+
+The current `TryUploadInputFrame`, `RenderOutputFrame`, `TryPresentPreview`, and `CaptureOutputFrameRgbaTopDown` methods can remain as compatibility shims during migration, but their implementations should move toward enqueue-and-wait or enqueue-and-return behavior instead of binding GL directly from the caller's thread.
+
+## Frame Production Shape
+
+A target render-thread frame should look like:
+
+1. wake for input, output demand, preview demand, shader build, reset, screenshot, or stop
+2. drain bounded render commands
+3. coalesce to the latest input frame and latest control/live state
+4. build `RenderFrameInput`
+5. prepare `RenderFrameState`
+6. upload accepted input frame
+7. render layer stack
+8. pack output if needed
+9. stage readback or output buffer
+10. publish preview/screenshot/output completion as needed
+11. record timing and queue metrics
+
+The exact cadence can remain demand-driven initially. The architectural win is that the demand wakes the render thread rather than borrowing GL from the caller.
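
As an illustration of step 1, the wake side of a demand-driven loop could be a small reason bitmask guarded by a condition variable. The names `WakeReason`, `RenderWakeSignal`, `Notify`, and `WaitAndTake` are hypothetical. The sketch assumes that repeated wake requests of the same kind can collapse into one.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical wake reasons, one bit each, mirroring step 1 of the list.
enum WakeReason : std::uint32_t {
    kWakeInput = 1u << 0,
    kWakeOutputDemand = 1u << 1,
    kWakePreviewDemand = 1u << 2,
    kWakeShaderBuild = 1u << 3,
    kWakeReset = 1u << 4,
    kWakeScreenshot = 1u << 5,
    kWakeStop = 1u << 6,
};

class RenderWakeSignal {
public:
    // Called from any thread. Repeated reasons collapse into a single bit.
    void Notify(WakeReason reason) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_ |= reason;
        }
        wake_.notify_one();
    }

    // Called from the render thread. Blocks until at least one reason is
    // pending, then returns the whole set and clears it.
    std::uint32_t WaitAndTake() {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return pending_ != 0; });
        const std::uint32_t reasons = pending_;
        pending_ = 0;
        return reasons;
    }

private:
    std::mutex mutex_;
    std::condition_variable wake_;
    std::uint32_t pending_ = 0;
};
```

A render loop built on this would call `WaitAndTake()` at the top of each iteration, bump a per-reason counter for telemetry, and exit once `kWakeStop` is set and the pending commands have been drained.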
+
+## Migration Plan
+
+### Step 1. Name Render-Thread-Only Methods
+
+Split existing direct GL methods into public request methods and private render-thread methods while keeping behavior essentially unchanged.
+
+Initial target:
+
+- keep current synchronous behavior where callers need a result
+- move GL bodies into clearly render-thread-owned helpers
+- make future queue migration mechanical
+
+### Step 2. Add Render Command Queue
+
+Introduce a small queue/mailbox for render commands.
+
+Start with low-risk commands:
+
+- preview present request
+- screenshot request
+- render-local reset requests
+
+Then move input upload and output render requests once the queue and wakeup behavior are proven.
+
+### Step 3. Start A Dedicated Render Thread
+
+Create the render thread and make it own context binding.
+
+Transitional behavior may still allow synchronous request/response for output frames. The important change is that the caller waits for render-thread completion rather than taking the GL context itself.
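
A rough sketch of that transitional request/response shape, using only the C++ standard library, is shown below. `RenderThreadSketch` and `InvokeAndWait` are hypothetical names. The real thread would bind the GL context where the comment indicates; this example contains no GL calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

class RenderThreadSketch {
public:
    RenderThreadSketch() : thread_([this] { Run(); }) {}

    // Stops the loop after pending work has drained, then joins.
    ~RenderThreadSketch() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

    bool IsOnRenderThread() const {
        return std::this_thread::get_id() == thread_.get_id();
    }

    // Runs `work` on the render thread and blocks the caller until it is done.
    // If the caller already is the render thread, runs inline so the thread
    // never waits on itself.
    template <typename Work>
    auto InvokeAndWait(Work work) -> decltype(work()) {
        using Result = decltype(work());
        if (IsOnRenderThread()) return work();
        auto task = std::make_shared<std::packaged_task<Result()>>(std::move(work));
        std::future<Result> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back([task] { (*task)(); });
        }
        wake_.notify_one();
        return result.get();
    }

private:
    void Run() {
        // A real render thread would bind the GL context here, once.
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and fully drained
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // executed outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the other members exist before Run() starts
};
```

The inline path for callers already on the render thread covers one of the deadlock cases. A caller that holds a lock the render thread needs would still deadlock, so waits of this kind should be made without holding other locks.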
+
+### Step 4. Move Input Upload To The Render Thread
+
+Change `OpenGLVideoIOBridge::UploadInputFrame(...)` so it enqueues or replaces the latest input frame.
+
+Policy targets:
+
+- bounded memory
+- latest-frame wins under load
+- input upload skip count is observable
+- input callback never waits for GL
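
These policy targets fit a single-slot, latest-value mailbox. The sketch below is one way to express it. `InputFrame`, `LatestInputMailbox`, `Publish`, and `TryTake` are hypothetical names. The sketch assumes pixels are copied before publishing, which is one of the options still listed under Open Questions.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical input payload; assumes the pixels were copied by the caller.
struct InputFrame {
    std::uint64_t sequence = 0;
    std::vector<std::uint8_t> pixels;
};

class LatestInputMailbox {
public:
    // Called from the input callback. Holds the lock only long enough to swap
    // the slot, and never touches GL. An unconsumed frame is overwritten.
    void Publish(InputFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (hasFrame_) ++replacedCount_;
        frame_ = std::move(frame);
        hasFrame_ = true;
    }

    // Called from the render thread just before upload.
    bool TryTake(InputFrame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasFrame_) return false;
        out = std::move(frame_);
        hasFrame_ = false;
        return true;
    }

    // Number of frames overwritten before the render thread picked them up.
    std::uint64_t ReplacedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return replacedCount_;
    }

private:
    mutable std::mutex mutex_;
    InputFrame frame_;
    bool hasFrame_ = false;
    std::uint64_t replacedCount_ = 0;
};
```

Memory stays bounded at one pending frame, and `ReplacedCount()` gives the observable skip count.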
+
+### Step 5. Move Output Rendering To The Render Thread
+
+Change `OpenGLVideoIOBridge::RenderScheduledFrame(...)` so it requests render-thread output production or consumes a completed render-thread output.
+
+Transitional option:
+
+- synchronous request/response through the render thread
+
+Better follow-up:
+
+- render ahead into a bounded output queue and let backend callbacks consume ready frames
+
+### Step 6. Decouple Preview And Screenshot Requests
+
+Preview should become best-effort:
+
+- request preview presentation from the render thread
+- skip when render is busy or output deadline pressure is high
+- record preview skips
+
+Screenshot should become:
+
+- queued render-thread capture request
+- async disk write remains outside the render thread
+
+### Step 7. Remove Shared GL Lock From Normal Paths
+
+Once all GL entrypoints are render-thread-owned:
+
+- remove normal dependence on `pMutex` for render correctness
+- keep assertions or diagnostics that detect wrong-thread GL calls
+- leave only lifecycle synchronization where needed
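
A wrong-thread diagnostic can be as small as a thread-local flag plus a counter. The following sketch is an assumption about one workable shape, not existing project code. All helper names are hypothetical, and `UploadInputFrameOnRenderThread` here is a stand-in free function with no GL in it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Per-thread flag: true only on a thread that marked itself as the render thread.
inline bool& RenderThreadFlag() {
    thread_local bool isRenderThread = false;
    return isRenderThread;
}

// Process-wide count of render-only calls made from the wrong thread.
inline std::atomic<std::uint64_t>& WrongThreadCallCounter() {
    static std::atomic<std::uint64_t> count{0};
    return count;
}

// The render thread calls this once, right after it binds the GL context.
inline void MarkCurrentThreadAsRenderThread() { RenderThreadFlag() = true; }

// Returns true on the render thread; otherwise records the violation.
inline bool CheckOnRenderThread() {
    if (RenderThreadFlag()) return true;
    WrongThreadCallCounter().fetch_add(1, std::memory_order_relaxed);
    return false;
}

// Stand-in for a render-thread-only method.
inline bool UploadInputFrameOnRenderThread() {
    if (!CheckOnRenderThread()) return false;  // refuse rather than touch GL
    // GL upload work would go here.
    return true;
}
```

This variant records and refuses instead of asserting, which suits release builds. A debug build could additionally assert inside `CheckOnRenderThread()`.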
+
+## Testing Strategy
+
+Phase 4 tests should avoid hardware where possible.
+
+Recommended tests:
+
+- render command queue preserves FIFO for non-coalesced commands
+- latest-input mailbox drops older frames under load
+- stop command wakes and drains the render thread
+- screenshot request receives one completion or failure
+- output render request reports timeout/failure if render thread is stopped
+- render reset commands coalesce where expected
+- render-thread-only methods are not publicly reachable, so wrong-thread calls cannot compile from outside `RenderEngine`
+
+Existing useful homes:
+
+- `RuntimeEventTypeTests` for new render/backend observations
+- `RuntimeSubsystemTests` for pure request/coalescing helpers
+- a new `RenderThreadTests` target for queue/mailbox/lifecycle helpers that do not require GL
+
+Manual verification will still be needed for:
+
+- real DeckLink input/output
+- preview interaction
+- screenshot capture
+- shader reload while rendering
+
+## Telemetry Added During Phase 4
+
+Phase 4 should add minimal metrics while moving ownership:
+
+- render command queue depth
+- input frames accepted, replaced, and dropped
+- render-thread wake reason counts
+- render-thread frame duration
+- output request latency
+- preview request skipped count
+- screenshot request success/failure count
+- wrong-thread GL call diagnostics if practical
+
+Full operational reporting remains Phase 8, but these metrics make the threading migration debuggable.
+
+## Risks
+
+### Deadlock Risk
+
+Synchronous request/response shims can deadlock if the caller is already on the render thread or holds a lock the render thread needs. Phase 4 should keep request waits narrow and add render-thread detection early.
+
+### Latency Risk
+
+Moving work through queues can hide latency. Queue depth and output request latency should be measured from the first migration step.
+
+### Lifetime Risk
+
+Moving context ownership changes startup and shutdown order. The render thread must stop before GL resources or window/context handles are destroyed.
+
+### Callback Pressure Risk
+
+If DeckLink callbacks wait too long for render-thread work, Phase 4 may improve GL ownership but still leave callback timing fragile. A synchronous bridge is acceptable as a transition, but the design should keep the path open for producer/consumer playout.
+
+### Preview Coupling Risk
+
+Preview can remain a hidden budget consumer if it stays in the output frame path. Phase 4 should keep preview explicitly best-effort, even if physical decoupling continues later.
+
+## Phase 4 Exit Criteria
+
+Phase 4 can be considered complete once the project can say:
+
+- [ ] one render thread owns the GL context during normal operation
+- [ ] input callbacks do not bind GL or wait on GL upload
+- [ ] output callbacks do not bind GL directly
+- [ ] preview and screenshot requests enter render through explicit render-thread requests
+- [ ] `RenderFrameInput` / `RenderFrameState` remain the frame-state contract
+- [ ] normal frame production no longer depends on a shared GL `CRITICAL_SECTION`
+- [ ] render-thread queue/mailbox behavior has non-GL tests
+- [ ] shutdown order is explicit and tested or manually verified
+
+## Open Questions
+
+- Should the first output migration be synchronous request/response, or should Phase 4 go directly to a small ready-frame queue?
+- Should the render thread own `RuntimeServiceLiveBridge` calls, or should frame state be prepared just before enqueue?
+- How much input frame memory should be copied at enqueue time versus referenced from backend-owned buffers?
+- Should preview present on the render thread, or should render publish a preview image/texture to a separate presenter?
+- What timeout should output callbacks use if the render thread cannot produce a frame in time?
+- Should wrong-thread GL access be enforced with assertions, telemetry, or both?
+
+## Short Version
+
+Phase 4 should make GL ownership boring and deterministic.
+
+One render thread owns the context. Other threads submit work or consume results. Input upload, frame rendering, readback, preview, and screenshot capture all move behind render-thread entrypoints. The first implementation can be transitional and partly synchronous, but after Phase 4 the app should no longer rely on callback and UI paths borrowing the GL context under one shared lock.