docs update
docs/PHASE_4_RENDER_THREAD_OWNERSHIP_DESIGN.md
# Phase 4 Design: Render Thread Ownership

This document expands Phase 4 of [ARCHITECTURE_RESILIENCE_REVIEW.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/ARCHITECTURE_RESILIENCE_REVIEW.md) into a concrete design target.

Phase 1 named the subsystems. Phase 2 added the typed event substrate. Phase 3 made render-facing live state explicit through `RuntimeLiveState`, `RenderStateComposer`, `RenderFrameInput`, `RenderFrameState`, `RenderFrameStateResolver`, and `RuntimeServiceLiveBridge`. Phase 4 can now focus on the core timing-risk boundary: making one render thread the only owner of OpenGL work.

## Status

- Phase 4 design package: proposed.
- Phase 4 implementation: not started.
- Current alignment: the repo has a named frame-state contract and cleaner render-state preparation, but GL work is still entered through multiple paths protected by one shared `CRITICAL_SECTION`.

Current GL ownership footholds:

- `RenderEngine` owns GL resources and the current context-binding helpers.
- `RenderFrameInput` / `RenderFrameState` provide the frame-state contract that a render thread can consume.
- `RenderFrameStateResolver` prepares the render-facing layer state before drawing.
- `OpenGLVideoIOBridge` still calls `RenderEngine::TryUploadInputFrame(...)` from the input path and `RenderEngine::RenderOutputFrame(...)` from the output path.
- `OpenGLComposite::paintGL(...)`, screenshot capture, input upload, and output rendering still reach GL through `RenderEngine` methods that bind the shared context under `pMutex`.

## Why Phase 4 Exists

The resilience review identifies shared GL ownership as the main remaining timing and failure-isolation risk. Today the shared context lock protects correctness, but it does not isolate timing:

- input callbacks can attempt texture upload
- output callbacks can trigger frame rendering and readback
- preview paint can enter the same GL context
- screenshot capture can enter the same GL context
- the DeckLink completion path is still too close to render work

That means even brief input, preview, readback, or callback stalls can still delay the most timing-sensitive path, because every caller contends for the same lock.

Phase 4 should turn GL from a shared resource guarded by a lock into a resource owned by one thread with explicit queues and handoff points.

## Goals

Phase 4 should establish:

- one render thread as the sole long-lived owner of the GL context
- non-render threads enqueue work instead of binding the GL context
- input upload requests are accepted and executed by the render thread
- output frame rendering is requested or scheduled through render-owned work
- preview and screenshot requests become render-thread commands or consumers
- `RenderFrameInput` / `RenderFrameState` become the stable data contract for frame production
- GL context entrypoints are reduced to render-thread-only code paths
- queue semantics, request coalescing, and lifecycle behavior are covered by tests that do not require DeckLink hardware

## Non-Goals

Phase 4 should not require:

- the final producer/consumer playout queue for DeckLink
- the final DeckLink lifecycle state machine
- replacing the async readback policy
- implementing background persistence
- completing Phase 5's deeper live-state layering
- replacing every UI or backend API at once

Those are later phases or follow-on work. Phase 4 is about making GL ownership deterministic first.

## Current GL Entry Points

The current code paths that matter most are:

| Entry point | Current behavior | Phase 4 direction |
| --- | --- | --- |
| `RenderEngine::TryUploadInputFrame(...)` | attempts to take the GL lock, binds the context, uploads input texture | enqueue latest input frame; render thread uploads |
| `RenderEngine::RenderOutputFrame(...)` | takes the GL lock, binds the context, renders, packs and reads back output | render thread executes output frame production |
| `RenderEngine::TryPresentPreview(...)` | attempts to take the GL lock and presents preview | render thread or preview presenter consumes latest completed frame |
| `RenderEngine::CaptureOutputFrameRgbaTopDown(...)` | takes the GL lock and reads output pixels | screenshot request becomes render-thread command |
| `OpenGLVideoIOBridge::UploadInputFrame(...)` | calls render upload directly | push input frame into render queue/mailbox |
| `OpenGLVideoIOBridge::RenderScheduledFrame(...)` | calls render output directly from backend path | request/consume render-produced output without callback-owned GL |

## Target Ownership Model

### Render Thread

The render thread should own:

- `wglMakeCurrent(...)` for the rendering context
- all GL resource creation/destruction
- input texture upload
- pass execution
- output pack conversion
- async readback buffers and fences
- preview presentation or preview frame publication
- screenshot readback
- temporal history and feedback resources

### Other Threads

Other threads may:

- enqueue input frames or replace the latest input frame
- publish control/runtime/backend events
- request shader build application
- request render-local resets
- request screenshots
- consume ready output frames or receive completion notifications

Other threads should not:

- call GL directly
- bind or unbind the render context
- wait on GL fences directly
- mutate render-local resource state

## Proposed Collaborators

### `RenderThread`

Owns the OS thread, wakeup primitive, lifecycle, and render-loop execution.

Responsibilities:

- start and stop the render thread
- bind the GL context for the thread lifetime or render-loop lifetime
- drain render commands
- execute frame production work
- publish lifecycle and failure observations

Non-responsibilities:

- runtime mutation policy
- DeckLink scheduling policy
- durable persistence

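
The lifecycle described for `RenderThread` can be sketched roughly as follows. This is a minimal C++17 illustration, not the project's implementation: `RenderThreadSketch`, `Post`, and `Run` are hypothetical names, commands are modeled as plain `std::function` objects, and the GL context binding is only marked by comments.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical sketch: a real RenderThread would bind the GL context
// (for example with wglMakeCurrent) where the comments indicate.
class RenderThreadSketch {
public:
    RenderThreadSketch() : worker_([this] { Run(); }) {}
    ~RenderThreadSketch() { Stop(); }

    // Callable from any thread; never touches GL.
    void Post(std::function<void()> command) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (stopping_) return;  // requests arriving after Stop() are dropped
            commands_.push_back(std::move(command));
        }
        wake_.notify_one();
    }

    // Runs everything already queued, then joins. Safe to call twice from the
    // owning thread; must not be called from the render thread itself.
    void Stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

private:
    void Run() {
        // Bind the GL context here, once, for the thread lifetime.
        for (;;) {
            std::deque<std::function<void()>> batch;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !commands_.empty(); });
                batch.swap(commands_);
                if (batch.empty() && stopping_) break;
            }
            for (auto& command : batch) command();  // executed outside the lock
        }
        // Release the GL context here, before the thread exits.
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> commands_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before Run() starts
};
```

Draining the whole batch under one lock acquisition keeps the time that producers can be blocked short, which matters for callback threads.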
### `RenderCommandQueue`

Small bounded queue or command mailbox for render-thread work.

Possible commands:

- `UploadInputFrame`
- `RenderOutputFrame`
- `PrepareFrameState`
- `ApplyShaderBuild`
- `ResetTemporalHistory`
- `ResetShaderFeedback`
- `PresentPreview`
- `CaptureScreenshot`
- `Stop`

High-rate commands should be coalesced where appropriate. Input frames should likely be latest-value rather than unbounded FIFO.

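
One possible shape for a bounded, coalescing command queue is sketched below in C++17. The payload structs are placeholders, only a subset of the commands listed above is modeled, and the choice to coalesce only the two reset commands is an assumption made for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical payloads; real commands would carry frame data, shader builds, etc.
struct ResetTemporalHistory {};
struct ResetShaderFeedback {};
struct CaptureScreenshot { int requestId; };
struct Stop {};

using RenderCommand =
    std::variant<ResetTemporalHistory, ResetShaderFeedback, CaptureScreenshot, Stop>;

class RenderCommandQueueSketch {
public:
    explicit RenderCommandQueueSketch(std::size_t capacity) : capacity_(capacity) {}

    // Returns false when the queue is full so the caller can count the rejection.
    bool TryPush(RenderCommand command) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (IsCoalescable(command)) {
            for (const RenderCommand& pending : pending_) {
                if (pending.index() == command.index()) return true;  // already queued
            }
        }
        if (pending_.size() >= capacity_) return false;
        pending_.push_back(std::move(command));
        return true;
    }

    // Render thread only: takes everything queued so far, in FIFO order.
    std::vector<RenderCommand> Drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<RenderCommand> drained(pending_.begin(), pending_.end());
        pending_.clear();
        return drained;
    }

private:
    static bool IsCoalescable(const RenderCommand& command) {
        return std::holds_alternative<ResetTemporalHistory>(command) ||
               std::holds_alternative<ResetShaderFeedback>(command);
    }

    const std::size_t capacity_;
    std::mutex mutex_;
    std::deque<RenderCommand> pending_;
};
```

In this sketch a `Stop` command can be rejected when the queue is full. A real queue would probably want `Stop` to bypass the capacity check or use a separate flag.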
### `RenderFrameCoordinator`

Optional helper that combines Phase 3's frame contract with render-thread execution.

Responsibilities:

- build or receive `RenderFrameInput`
- call `RuntimeServiceLiveBridge` and `RenderFrameStateResolver`
- hand `RenderFrameState` to `RenderEngine`

This can begin as a thin helper. The important part is that it keeps frame-state preparation explicit when `renderEffect()` stops being called directly from the callback path.

### `RenderOutputMailbox`

Optional transitional bridge for output frames.

Responsibilities:

- hold the latest completed output frame or a small bounded set
- let backend code consume output without owning GL
- report underrun/stale-frame reuse observations

This may be a late Phase 4 step or a Phase 7 playout-policy step. At minimum, Phase 4 should avoid designing the render thread in a way that rules out adding this mailbox later.

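
A latest-frame variant of this mailbox could look roughly like the C++17 sketch below. `OutputFrameSketch` is a stand-in for whatever CPU-side frame the readback path produces, and the distinction between "underrun" (nothing published yet) and "stale reuse" (same frame handed out again) is an assumption about how the observations might be split.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical CPU-side output frame.
struct OutputFrameSketch {
    std::uint64_t frameIndex = 0;
    std::vector<std::uint8_t> pixels;
};

class RenderOutputMailboxSketch {
public:
    // Render thread: publish the newest completed frame, replacing any older one.
    void Publish(std::shared_ptr<const OutputFrameSketch> frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(frame);
        fresh_ = true;
    }

    // Backend thread: returns the latest frame without touching GL.
    // Returns nullptr (and counts an underrun) until the first frame is published.
    std::shared_ptr<const OutputFrameSketch> Consume() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!latest_) {
            ++underruns_;
            return nullptr;
        }
        if (!fresh_) ++staleReuses_;  // same frame handed out again
        fresh_ = false;
        return latest_;
    }

    std::uint64_t UnderrunCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return underruns_;
    }

    std::uint64_t StaleReuseCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return staleReuses_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const OutputFrameSketch> latest_;
    bool fresh_ = false;
    std::uint64_t underruns_ = 0;
    std::uint64_t staleReuses_ = 0;
};
```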
## Threading Contract

Phase 4 should make thread ownership visible in APIs.

Candidate naming:

- `RenderEngine::StartRenderThread(...)`
- `RenderEngine::StopRenderThread()`
- `RenderEngine::EnqueueInputFrame(...)`
- `RenderEngine::RequestOutputFrame(...)`
- `RenderEngine::RequestPreviewPresent(...)`
- `RenderEngine::RequestScreenshot(...)`

Render-thread-only methods should be private or clearly named:

- `RenderEngine::UploadInputFrameOnRenderThread(...)`
- `RenderEngine::RenderOutputFrameOnRenderThread(...)`
- `RenderEngine::CaptureOutputFrameOnRenderThread(...)`

The current `TryUploadInputFrame`, `RenderOutputFrame`, `TryPresentPreview`, and `CaptureOutputFrameRgbaTopDown` methods can remain as compatibility shims during migration, but their implementations should move toward enqueue-and-wait or enqueue-and-return behavior instead of binding GL directly from the caller's thread.

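
The split between public request methods, private render-thread-only methods, and a compatibility shim might look like the C++17 sketch below. The signatures are invented for illustration (the real `RenderEngine` signatures are not shown in this document), `PostToRenderThread` is a hypothetical hook for whatever queue Phase 4 introduces, and the capture body returns dummy pixels instead of doing GL readback.

```cpp
#include <cstdint>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <utility>
#include <vector>

using RenderTask = std::function<void()>;
using PostToRenderThread = std::function<void(RenderTask)>;  // hypothetical queue hook

struct ScreenshotSketch {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgba;
};

class RenderEngineFacadeSketch {
public:
    explicit RenderEngineFacadeSketch(PostToRenderThread post) : post_(std::move(post)) {}

    // Any thread: enqueue-and-return. The caller decides whether and how long to wait.
    // The facade must outlive any task it has posted.
    std::future<ScreenshotSketch> RequestScreenshot() {
        auto promise = std::make_shared<std::promise<ScreenshotSketch>>();
        std::future<ScreenshotSketch> result = promise->get_future();
        post_([this, promise] {
            try {
                promise->set_value(CaptureOutputFrameOnRenderThread());
            } catch (...) {
                promise->set_exception(std::current_exception());
            }
        });
        return result;
    }

    // Compatibility shim: enqueue-and-wait, keeping a synchronous call shape.
    ScreenshotSketch CaptureOutputFrameRgbaTopDown() { return RequestScreenshot().get(); }

private:
    // Render thread only: the GL readback would live here.
    ScreenshotSketch CaptureOutputFrameOnRenderThread() {
        ScreenshotSketch shot;
        shot.width = 2;
        shot.height = 2;
        shot.rgba.assign(16, 0xFF);  // dummy 2x2 RGBA pixels
        return shot;
    }

    PostToRenderThread post_;
};
```

Because the GL body is private, callers on other threads can only reach it through a request, which is the property the naming convention above is meant to make visible.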
## Frame Production Shape

A target render-thread frame should look like:

1. wake for input, output demand, preview demand, shader build, reset, screenshot, or stop
2. drain bounded render commands
3. coalesce to the latest input frame and latest control/live state
4. build `RenderFrameInput`
5. prepare `RenderFrameState`
6. upload accepted input frame
7. render layer stack
8. pack output if needed
9. stage readback or output buffer
10. publish preview/screenshot/output completion as needed
11. record timing and queue metrics

The exact cadence can remain demand-driven initially. The architectural win is that the demand wakes the render thread rather than borrowing GL from the caller.

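
Step 1 of the list above (wake reasons) could be tracked with an atomic bitmask, as in this C++17 sketch. The enum values, `WakeReasonsSketch`, and `PlanFrame` are all hypothetical, and the actual wakeup primitive (event or condition variable) is deliberately left out.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical wake reasons, one bit each, mirroring step 1 above.
enum WakeReason : std::uint32_t {
    kWakeInput        = 1u << 0,
    kWakeOutputDemand = 1u << 1,
    kWakePreview      = 1u << 2,
    kWakeShaderBuild  = 1u << 3,
    kWakeReset        = 1u << 4,
    kWakeScreenshot   = 1u << 5,
    kWakeStop         = 1u << 6,
};

class WakeReasonsSketch {
public:
    // Any thread: record why the render thread should wake.
    // Repeated signals for the same reason collapse into one bit.
    void Signal(WakeReason reason) {
        pending_.fetch_or(reason, std::memory_order_acq_rel);
    }

    // Render thread: take and clear all pending reasons in one step.
    std::uint32_t TakeAll() {
        return pending_.exchange(0, std::memory_order_acq_rel);
    }

private:
    std::atomic<std::uint32_t> pending_{0};
};

// Which parts of the frame to run for this wake.
struct FramePlanSketch {
    bool uploadInput = false;
    bool renderOutput = false;
    bool presentPreview = false;
    bool captureScreenshot = false;
    bool stop = false;
};

inline FramePlanSketch PlanFrame(std::uint32_t reasons) {
    FramePlanSketch plan;
    plan.uploadInput = (reasons & kWakeInput) != 0;
    plan.renderOutput = (reasons & kWakeOutputDemand) != 0;
    plan.presentPreview = (reasons & kWakePreview) != 0;
    plan.captureScreenshot = (reasons & kWakeScreenshot) != 0;
    plan.stop = (reasons & kWakeStop) != 0;
    return plan;
}
```

The taken bitmask can also feed per-reason wake counters, so the same value serves both scheduling and diagnostics.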
## Migration Plan

### Step 1. Name Render-Thread-Only Methods

Split existing direct GL methods into public request methods and private render-thread methods, with minimal behavior change.

Initial target:

- keep current synchronous behavior where callers need a result
- move GL bodies into clearly render-thread-owned helpers
- make future queue migration mechanical

### Step 2. Add Render Command Queue

Introduce a small queue/mailbox for render commands.

Start with low-risk commands:

- preview present request
- screenshot request
- render-local reset requests

Then move input upload and output render requests once the queue and wakeup behavior are proven.

### Step 3. Start A Dedicated Render Thread

Create the render thread and make it own context binding.

Transitional behavior may still allow synchronous request/response for output frames. The important change is that the caller waits for render-thread completion rather than taking the GL context itself.

### Step 4. Move Input Upload To The Render Thread

Change `OpenGLVideoIOBridge::UploadInputFrame(...)` so it enqueues or replaces the latest input frame.

Policy targets:

- bounded memory
- latest-frame wins under load
- input upload skip count is observable
- input callback never waits for GL

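
The policy targets above can be met by a single-slot mailbox, sketched here in C++17. `InputFrameSketch` and the counter names are hypothetical, and the sketch assumes the frame bytes are copied at enqueue time, which the Open Questions section leaves undecided.

```cpp
#include <cstdint>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical input frame; assumes the bytes were copied out of the backend buffer.
struct InputFrameSketch {
    std::uint64_t sequence = 0;
    std::vector<std::uint8_t> bytes;
};

class LatestInputMailboxSketch {
public:
    // Input callback: holds a short lock and never waits for GL.
    // At most one frame is retained, so memory stays bounded.
    void Publish(InputFrameSketch frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (slot_) ++replaced_;  // an unconsumed frame is being overwritten
        slot_ = std::move(frame);
        ++accepted_;
    }

    // Render thread: take the newest frame, if any, leaving the slot empty.
    std::optional<InputFrameSketch> Take() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::optional<InputFrameSketch> taken = std::move(slot_);
        slot_.reset();  // a moved-from optional still holds a value until reset
        return taken;
    }

    std::uint64_t AcceptedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return accepted_;
    }

    std::uint64_t ReplacedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return replaced_;
    }

private:
    mutable std::mutex mutex_;
    std::optional<InputFrameSketch> slot_;
    std::uint64_t accepted_ = 0;
    std::uint64_t replaced_ = 0;
};
```

`ReplacedCount()` is the observable skip count: every replaced frame is an input frame that was never uploaded.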
### Step 5. Move Output Rendering To The Render Thread

Change `OpenGLVideoIOBridge::RenderScheduledFrame(...)` so it requests render-thread output production or consumes a completed render-thread output.

Transitional option:

- synchronous request/response through the render thread

Better follow-up:

- render ahead into a bounded output queue and let backend callbacks consume ready frames

### Step 6. Decouple Preview And Screenshot Requests

Preview should become best-effort:

- request preview presentation from the render thread
- skip when render is busy or output deadline pressure is high
- record preview skips

Screenshot should become:

- a queued render-thread capture request
- followed by an async disk write that stays off the render thread

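
A best-effort preview decision could be as small as the C++17 sketch below. The inputs (`outputWorkPending`, `slackToOutputDeadline`) and the 2 ms default threshold are assumptions chosen for illustration; the document does not specify how deadline pressure is measured.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical policy object, used only on the render thread.
struct PreviewPolicySketch {
    std::chrono::microseconds minimumSlack{2000};  // assumed threshold, tune per deployment
    std::uint64_t presented = 0;
    std::uint64_t skipped = 0;

    // Returns true if the preview may be presented this frame.
    // Output work always wins; preview runs only when there is slack left.
    bool ShouldPresent(bool outputWorkPending,
                       std::chrono::microseconds slackToOutputDeadline) {
        const bool allow = !outputWorkPending && slackToOutputDeadline >= minimumSlack;
        if (allow) {
            ++presented;
        } else {
            ++skipped;  // recorded so preview starvation is visible
        }
        return allow;
    }
};
```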
### Step 7. Remove Shared GL Lock From Normal Paths

Once all GL entrypoints are render-thread-owned:

- remove normal dependence on `pMutex` for render correctness
- keep assertions or diagnostics that detect wrong-thread GL calls
- leave only lifecycle synchronization where needed

## Testing Strategy

Phase 4 tests should avoid hardware where possible.

Recommended tests:

- render command queue preserves FIFO for non-coalesced commands
- latest-input mailbox drops older frames under load
- stop command wakes and drains the render thread
- screenshot request receives exactly one completion or failure
- output render request reports timeout/failure if render thread is stopped
- render reset commands coalesce where expected
- render-thread-only methods are not publicly reachable from other threads

Existing useful homes:

- `RuntimeEventTypeTests` for new render/backend observations
- `RuntimeSubsystemTests` for pure request/coalescing helpers
- a new `RenderThreadTests` target for queue/mailbox/lifecycle helpers that do not require GL

Manual verification will still be needed for:

- real DeckLink input/output
- preview interaction
- screenshot capture
- shader reload while rendering

## Telemetry Added During Phase 4

Phase 4 should add minimal metrics while moving ownership:

- render command queue depth
- input frames accepted, replaced, and dropped
- render-thread wake reason counts
- render-thread frame duration
- output request latency
- preview request skipped count
- screenshot request success/failure count
- wrong-thread GL call diagnostics if practical

Full operational reporting remains Phase 8, but these metrics make the threading migration debuggable.

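
A lock-free counter block is one low-overhead way to collect a subset of these metrics. The C++17 sketch below uses hypothetical names and covers only a few of the listed values; it is not tied to any existing telemetry type in the repo.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical metrics block: written from several threads, read by diagnostics.
struct RenderThreadMetricsSketch {
    std::atomic<std::uint64_t> inputAccepted{0};
    std::atomic<std::uint64_t> inputReplaced{0};
    std::atomic<std::uint64_t> previewSkipped{0};
    std::atomic<std::uint64_t> wrongThreadGlCalls{0};
    std::atomic<std::uint32_t> maxQueueDepth{0};
    std::atomic<std::int64_t> maxFrameNanos{0};

    // Keeps the high-water mark of the command queue depth.
    void ObserveQueueDepth(std::uint32_t depth) {
        std::uint32_t seen = maxQueueDepth.load(std::memory_order_relaxed);
        while (depth > seen &&
               !maxQueueDepth.compare_exchange_weak(seen, depth, std::memory_order_relaxed)) {
            // compare_exchange_weak reloads 'seen' on failure; loop until stored or not larger
        }
    }

    // Keeps the longest render-thread frame duration observed so far.
    void ObserveFrameDuration(std::chrono::nanoseconds duration) {
        const std::int64_t nanos = duration.count();
        std::int64_t seen = maxFrameNanos.load(std::memory_order_relaxed);
        while (nanos > seen &&
               !maxFrameNanos.compare_exchange_weak(seen, nanos, std::memory_order_relaxed)) {
        }
    }
};
```

Relaxed ordering is enough here because the counters are diagnostic only and do not guard any other data.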
## Risks

### Deadlock Risk

Synchronous request/response shims can deadlock if the caller is already on the render thread or holds a lock the render thread needs. Phase 4 should keep request waits narrow and add render-thread detection early.

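
Render-thread detection for a synchronous shim can be sketched as follows (C++17). `RenderThreadAffinitySketch` and `RunOnRenderThreadAndWait` are hypothetical names, and `post` stands for whatever enqueue function the render thread exposes. The sketch covers only the self-wait case; it does nothing about callers that hold a lock the render thread needs.

```cpp
#include <exception>
#include <functional>
#include <future>
#include <thread>

class RenderThreadAffinitySketch {
public:
    // Call once from the render thread at startup, before other threads query it.
    void BindToCurrentThread() { renderThreadId_ = std::this_thread::get_id(); }

    bool IsOnRenderThread() const {
        return std::this_thread::get_id() == renderThreadId_;
    }

private:
    std::thread::id renderThreadId_;  // default value matches no running thread
};

// Runs 'work' on the render thread and waits for it. If the caller is already
// on the render thread, the work runs inline instead of waiting on itself.
template <typename Post>
void RunOnRenderThreadAndWait(const RenderThreadAffinitySketch& affinity,
                              Post&& post,
                              std::function<void()> work) {
    if (affinity.IsOnRenderThread()) {
        work();
        return;
    }
    std::promise<void> done;
    std::future<void> finished = done.get_future();
    // Capturing by reference is safe because this function blocks until completion.
    post([&work, &done] {
        try {
            work();
            done.set_value();
        } catch (...) {
            done.set_exception(std::current_exception());
        }
    });
    finished.get();  // rethrows if the work failed
}
```

The same `IsOnRenderThread()` check can back the wrong-thread assertions mentioned in Step 7.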
### Latency Risk

Moving work through queues can hide latency. Queue depth and output request latency should be measured from the first migration step.

### Lifetime Risk

Moving context ownership changes startup and shutdown order. The render thread must stop before GL resources or window/context handles are destroyed.

### Callback Pressure Risk

If DeckLink callbacks wait too long for render-thread work, Phase 4 may improve GL ownership but still leave callback timing fragile. A synchronous bridge is acceptable as a transition, but the design should keep the path open for producer/consumer playout.

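
One way to bound how long a callback waits on the synchronous bridge is a timed wait with a fallback to the last good frame, sketched below in C++17. `ReadyFrameSketch`, `WaitForOutputFrame`, and the fallback policy are assumptions for illustration; the appropriate timeout is still listed as an open question.

```cpp
#include <chrono>
#include <cstdint>
#include <future>
#include <memory>

// Hypothetical completed output frame.
struct ReadyFrameSketch {
    std::uint64_t frameIndex = 0;
};
using ReadyFramePtr = std::shared_ptr<const ReadyFrameSketch>;

struct CallbackOutcomeSketch {
    ReadyFramePtr frame;
    bool timedOut;
};

// Backend callback side: wait up to 'budget' for the render thread, otherwise
// fall back to the last good frame so the callback returns on time.
// On timeout the pending future stays valid and can be checked again later.
inline CallbackOutcomeSketch WaitForOutputFrame(std::future<ReadyFramePtr>& pending,
                                                std::chrono::milliseconds budget,
                                                const ReadyFramePtr& lastGoodFrame) {
    if (pending.valid() && pending.wait_for(budget) == std::future_status::ready) {
        return {pending.get(), false};
    }
    return {lastGoodFrame, true};
}
```

Counting the `timedOut` outcomes gives a direct measure of how often the transitional bridge misses the callback budget.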
### Preview Coupling Risk

Preview can remain a hidden budget consumer if it stays in the output frame path. Phase 4 should keep preview explicitly best-effort, even if physical decoupling continues later.

## Phase 4 Exit Criteria

Phase 4 can be considered complete once the project can say:

- [ ] one render thread owns the GL context during normal operation
- [ ] input callbacks do not bind GL or wait on GL upload
- [ ] output callbacks do not bind GL directly
- [ ] preview and screenshot requests enter render through explicit render-thread requests
- [ ] `RenderFrameInput` / `RenderFrameState` remain the frame-state contract
- [ ] normal frame production no longer depends on a shared GL `CRITICAL_SECTION`
- [ ] render-thread queue/mailbox behavior has non-GL tests
- [ ] shutdown order is explicit and tested or manually verified

## Open Questions

- Should the first output migration be synchronous request/response, or should Phase 4 go directly to a small ready-frame queue?
- Should the render thread own `RuntimeServiceLiveBridge` calls, or should frame state be prepared just before enqueue?
- How much input frame memory should be copied at enqueue time versus referenced from backend-owned buffers?
- Should preview present on the render thread, or should render publish a preview image/texture to a separate presenter?
- What timeout should output callbacks use if the render thread cannot produce a frame in time?
- Should wrong-thread GL access be enforced with assertions, telemetry, or both?

## Short Version

Phase 4 should make GL ownership boring and deterministic.

One render thread owns the context. Other threads submit work or consume results. Input upload, frame rendering, readback, preview, and screenshot capture all move behind render-thread entrypoints. The first implementation can be transitional and partly synchronous, but after Phase 4 the app should no longer rely on callback and UI paths borrowing the GL context under one shared lock.