Doc cleanup

Aiden
2026-05-30 21:35:07 +10:00
parent 56883439a6
commit fce3c3c5ae
17 changed files with 37 additions and 6289 deletions

@@ -18,7 +18,7 @@ The app loads shader packages from `shaders/`, compiles Slang to GLSL at runtime
Native app internals are grouped by boundary:
- `app/`: startup/shutdown orchestration and runtime layer controller.
- `app/`: startup/shutdown orchestration, runtime-content controller boundary, config, preview, telemetry, and HTTP hookup.
- `control/`: HTTP/WebSocket server, command parsing, and runtime-state JSON presentation.
- `frames/`: system-memory frame exchange and input mailbox handoff.
- `render/`: render thread, readback, runtime render scene, and shared-context shader program preparation.
@@ -129,6 +129,8 @@ The control UI provides:
- A searchable shader library for adding layers.
- Compact parameter rows with inline descriptions and intended OSC route copy controls.
- Manual shader reload.
- Host config editing, save, restart request, and NDI input source discovery.
- Compact video I/O and render cadence status.
## Package
@@ -205,6 +207,7 @@ Current native test coverage includes:
"oscBindAddress": "127.0.0.1",
"oscPort": 9000,
"oscSmoothing": 0.18,
"runtimeShaderId": "",
"input": {
"backend": "decklink",
"device": "default",
@@ -218,12 +221,14 @@ Current native test coverage includes:
"frameRate": "59.94",
"pixelFormat": "auto",
"keying": {
"external": true,
"alphaRequired": true
"external": false,
"alphaRequired": false
}
},
"autoReload": true,
"maxTemporalHistoryFrames": 12
"maxTemporalHistoryFrames": 12,
"previewEnabled": false,
"previewFps": 59.94
}
```
@@ -245,7 +250,7 @@ http://127.0.0.1:<serverPort>
The current layer stack is autosaved to `runtime/runtime_state.json` whenever durable UI/API layer changes are accepted: add/remove, shader assignment, bypass state, ordering, parameter updates, parameter reset, and reload compatibility refreshes. Saves are debounced and written on a background worker, with a final flush during shutdown.
On startup, the host tries to reload `runtime/runtime_state.json` before compiling the stack. Valid saved layers are rebuilt in saved order, with shader id, bypass state, and parameter values restored. Missing shader packages are skipped, invalid saved parameter values fall back to shader defaults, and if the file is missing or unusable the app falls back to the configured default shader.
On startup, the host tries to reload `runtime/runtime_state.json` before compiling the stack. Valid saved layers are rebuilt in saved order, with shader id, bypass state, and parameter values restored. Missing shader packages are skipped, invalid saved parameter values fall back to shader defaults, and if the file is missing or unusable the app falls back to the optional configured startup shader. The checked-in config leaves `runtimeShaderId` empty, so a fresh host uses the simple fallback renderer until layers are added or a saved stack exists.
Manual stack preset and screenshot routes are still present in the UI/OpenAPI surface, but they are not implemented by the current native command path yet. `runtime_state.json` is the supported latest-working-state mechanism for now.
@@ -278,7 +283,7 @@ Each parameter row still exposes the intended OSC route in the UI. The native ho
The control UI currently still shows preset and screenshot controls from the intended route surface. Those endpoints return an unimplemented action result in the native host until their backend paths are wired.
The planned screenshot output directory is:
The reserved screenshot output directory is:
```text
runtime/screenshots/
@@ -318,8 +323,8 @@ Runtime-generated files are intentionally ignored:
- `runtime/shader_cache/active_shader.raw.frag`
- `runtime/shader_cache/active_shader.frag`
- `runtime/runtime_state.json` autosaved latest stack and parameter state.
- `runtime/stack_presets/*.json` planned manual preset output; preset routes are not implemented in the native host yet.
- `runtime/screenshots/*.png` planned screenshot output; screenshot capture is not implemented in the native host yet.
- `runtime/stack_presets/*.json` reserved manual preset output; preset routes are not implemented in the native host yet.
- `runtime/screenshots/*.png` reserved screenshot output; screenshot capture is not implemented in the native host yet.
Only `runtime/templates/` and `runtime/README.md` are tracked.
@@ -366,7 +371,6 @@ If these variables are not set, CMake first looks under the private `video-io-3r
- Annotate included shaders
- allow 3 vector exposed controls
- add nearest sampling to the extra shader pass
- add spout input/output (https://github.com/leadedge/Spout2)
- Add AJA input and output (assuming I can get hold of an AJA card)
- Add Bluefish input and output (again assuming card access)
- Endpoint to show OSC paths separately instead of as part of the control UI

@@ -1,725 +0,0 @@
# Architecture Resilience Review
This note summarizes the main architectural improvements that would make the app more resilient during live use, especially around timing isolation, failure isolation, and recoverability.
Phase checklist:
- [x] Define subsystem boundaries and target architecture
- [x] Introduce an internal event model
- [x] Split `RuntimeHost`
- [x] Finish live-state and service-facing coordination
- [x] Make the render thread the sole GL owner
- [x] Refactor live state layering into an explicit composition model
- [x] Move persistence onto a background snapshot writer
- [x] Make DeckLink/backend lifecycle explicit with a state machine
- [ ] Make playout timing proactive and deadline-aware
- [ ] Add structured health, telemetry, and operational reporting
Checklist note:
- The checked Phase 1 item means the subsystem vocabulary, dependency direction, state categories, design package, and runtime implementation foothold are in place.
- The checked Phase 2 item means the internal event model substrate is complete enough for later phases: the typed event vocabulary, app-owned dispatcher, coalesced event pump, reload bridge events, production bridges, and pure event tests are in place. Remaining items in [PHASE_2_INTERNAL_EVENT_MODEL_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_2_INTERNAL_EVENT_MODEL_DESIGN.md) are narrow follow-ups, mainly completion/failure observations and later replacement of the runtime-store poll fallback with real file-watch events.
- The checked Phase 3 item means the render-facing state path now has named live-state, composition, frame-state, resolver, and service-bridge boundaries. `OpenGLComposite::renderEffect()` is reduced to runtime work, frame input construction, and frame rendering.
- The checked Phase 4 item means normal runtime GL work is now owned by a dedicated `RenderEngine` render thread. Input upload, output render, preview, screenshot capture, render-local resets, and shader application enter through render-thread queue/request paths instead of caller-thread context borrowing. The remaining output timing risk is callback-coupled synchronous output production, which is intentionally tracked for the later DeckLink/backend lifecycle and playout-queue work.
- The checked Phase 5 item means persisted, committed/session, transient automation, and render-local state are explicitly named. `CommittedLiveState` physically owns current session layer state, `RuntimeLiveState` owns transient OSC overlays, `RenderStateComposer` consumes a layered input contract, and reset/reload/preset overlay invalidation is centralized and covered by non-GL tests.
- It does not mean the whole app is fully extracted. Backend lifecycle/playout queue policy and richer telemetry continue through later phases.
## Timing Review
The recent OSC work removed several control-path stalls, but the app still has a few deeper timing characteristics that matter for live resilience:
- output playout is still effectively render-on-demand from the DeckLink completion callback
- output buffering and preroll are now larger, but the buffering model is still static and only loosely related to actual render cost
- GPU readback is partly asynchronous, but the fallback path still returns to synchronous readback on any miss
- preview presentation is best-effort and render-thread queued, but still shares the same render-thread budget as playout
- background service timing is partially event-driven; runtime-store scanning still uses a bounded compatibility poll fallback
Those points are important because they affect not just average performance, but how the app behaves under brief spikes, device jitter, or load bursts.
## Key Findings
### 1. The original runtime host carried too many responsibilities
The original `RuntimeHost` acted as:
- config store
- persistent state store
- live parameter/state authority
- shader package registry owner
- status/telemetry sink
- control mutation entrypoint
That makes it a single contention and failure domain. It is also why OSC and render timing issues repeatedly surfaced around shared state access.
Relevant code:
- `RuntimeHost.h`
Recommended direction:
- split persisted config/state from live render-facing state
- separate status/telemetry updates from control mutation paths
- make render consume snapshots rather than sharing a large mutable authority object
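A minimal sketch of the snapshot direction, not the project's actual classes. `RenderSnapshot` and `SnapshotPublisher` are hypothetical names, and the parameter map is a stand-in for whatever render-facing state is needed. The control side replaces a whole immutable snapshot; the render side holds the lock only long enough to copy a pointer.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical immutable view of the state the render path needs.
struct RenderSnapshot {
    unsigned long long generation = 0;
    std::map<std::string, float> parameters;
};

// Control side publishes whole snapshots; render side grabs the latest one.
class SnapshotPublisher {
public:
    void Publish(std::map<std::string, float> parameters) {
        auto next = std::make_shared<RenderSnapshot>();
        std::lock_guard<std::mutex> lock(mutex_);
        next->generation = current_->generation + 1;
        next->parameters = std::move(parameters);
        current_ = std::move(next);  // readers holding the old snapshot keep it alive
    }

    std::shared_ptr<const RenderSnapshot> Latest() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const RenderSnapshot> current_ = std::make_shared<RenderSnapshot>();
};
```

A render frame would call `Latest()` once at the start and read only from that pointer, so a mutation arriving mid-frame cannot change values under it.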
### 2. OpenGL ownership has moved to the render thread
Phase 4 removed normal runtime dependence on the old shared GL `CRITICAL_SECTION`. `RenderEngine` now owns a dedicated render thread and binds the GL context there for normal input upload, output rendering, preview presentation, screenshot capture, shader application, and render-local reset work.
Relevant code:
- [RenderEngine.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/RenderEngine.cpp:36)
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:11)
- [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:168)
This removes cross-thread GL context borrowing as the central correctness model. The remaining timing risk is that output frame production is still synchronous from the DeckLink completion path, so a render/readback spike can still reduce playout headroom.
Recommended direction:
- keep the render thread as the sole GL owner
- replace synchronous output request/response with a bounded producer/consumer playout queue
- keep preview and screenshot subordinate to output deadline pressure
### 3. Control flow is spread across polling and shared-memory patterns
`RuntimeServices` currently mixes:
- file polling
- deferred OSC commit handling
- control service orchestration
OSC ingest, overlay application, and host sync are distributed across several components.
Relevant code:
- [RuntimeServices.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.h:26)
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:178)
Recommended direction:
- introduce a small internal event pipeline or message bus
- use typed events for OSC, reloads, persistence requests, and status changes
- make timing ownership explicit per subsystem
Example event types:
- `OscParameterTargeted`
- `RenderOverlaySettled`
- `PersistStateRequested`
- `ShaderReloadRequested`
- `DeckLinkStatusChanged`
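One way to express typed events like these is a `std::variant` (C++17). The sketch below is illustrative only: the payload fields and the `Describe` helper are assumptions, and only three of the example event names are shown.

```cpp
#include <string>
#include <variant>

// Hypothetical payloads; the field names are illustrative only.
struct OscParameterTargeted { std::string route; float value; };
struct PersistStateRequested {};
struct ShaderReloadRequested { std::string shaderId; };

using AppEvent = std::variant<OscParameterTargeted, PersistStateRequested, ShaderReloadRequested>;

// Helper for building one visitor out of several lambdas.
template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Returns a short label per event; a real handler would route to a subsystem.
inline std::string Describe(const AppEvent& event) {
    return std::visit(Overloaded{
        [](const OscParameterTargeted& e) { return std::string("osc:") + e.route; },
        [](const PersistStateRequested&) { return std::string("persist"); },
        [](const ShaderReloadRequested& e) { return std::string("reload:") + e.shaderId; },
    }, event);
}
```

Because `std::visit` requires a handler for every alternative, adding a new event type fails to compile until each visitor handles it.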
### 4. Error handling is still heavily UI-coupled
Failures are often surfaced via `MessageBoxA`, while background services mainly log with `OutputDebugStringA`.
Relevant code:
- [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:314)
- [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:478)
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:205)
This is not ideal for a live system where modal dialogs and silent debug logging are both poor operational behavior.
Recommended direction:
- introduce structured in-app error reporting
- define severity levels and counters
- prefer degraded runtime states over modal failure handling where possible
- add a rolling log file for operational troubleshooting
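As a rough sketch of severity levels and counters, assuming hypothetical `Severity`, `HealthRecord`, and `HealthLog` types: reports are counted per severity and kept in a bounded in-memory history that a UI or log writer could drain, instead of raising a modal dialog.

```cpp
#include <array>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

enum class Severity { Info = 0, Warning = 1, Error = 2 };

struct HealthRecord {
    Severity severity;
    std::string subsystem;
    std::string message;
};

// Counts reports per severity and keeps only the most recent `capacity` records.
class HealthLog {
public:
    explicit HealthLog(std::size_t capacity) : capacity_(capacity) {}

    void Report(Severity severity, std::string subsystem, std::string message) {
        std::lock_guard<std::mutex> lock(mutex_);
        ++counts_[static_cast<std::size_t>(severity)];
        recent_.push_back(HealthRecord{severity, std::move(subsystem), std::move(message)});
        if (recent_.size() > capacity_) recent_.pop_front();
    }

    std::size_t Count(Severity severity) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return counts_[static_cast<std::size_t>(severity)];
    }

    std::size_t RecentSize() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return recent_.size();
    }

private:
    mutable std::mutex mutex_;
    std::size_t capacity_;
    std::array<std::size_t, 3> counts_{};
    std::deque<HealthRecord> recent_;
};
```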
### 5. Live OSC overlay and persisted state now have an explicit layering model
Phase 5 formalized the previous hand-managed reconciliation between:
- base persisted state owned by `RuntimeStore` serialization/preset IO
- committed session state owned by `CommittedLiveState`
- transient OSC overlay state owned by `RuntimeLiveState`
- render-local temporal, feedback, preview, screenshot, and playout state owned by `RenderEngine`
Relevant code:
- [CommittedLiveState.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/live/CommittedLiveState.h:1)
- [RuntimeLiveState.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/live/RuntimeLiveState.h:1)
- [RenderStateComposer.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/live/RenderStateComposer.h:1)
- [RuntimeStateLayerModel.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/live/RuntimeStateLayerModel.h:1)
Current direction:
- render resolves values with a named composition rule:
- `final = base + committed + transient`
- settled OSC commits are session-only by default and do not request persistence unless policy explicitly opts in
- reset, reload, preset load, and shader compatibility changes prune or clear transient overlays at the live-state boundary
- render-local temporal and feedback resources remain outside the parameter layering model
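The composition rule can be read as a layered override. That reading is an assumption here: the note does not define `+`, and the real `RenderStateComposer` may blend or smooth values instead. Under that assumption, and with a hypothetical `ComposeLayers` helper over plain name/value maps:

```cpp
#include <map>
#include <string>

using ParameterMap = std::map<std::string, float>;

// Later layers win: transient overrides committed, which overrides base.
// Parameters missing from a later layer fall through to the earlier one.
inline ParameterMap ComposeLayers(const ParameterMap& base,
                                  const ParameterMap& committed,
                                  const ParameterMap& transient) {
    ParameterMap result = base;
    for (const auto& [name, value] : committed) result[name] = value;
    for (const auto& [name, value] : transient) result[name] = value;
    return result;
}
```

Clearing the transient map, for example on reset or reload, then restores the committed value without any extra sync rule.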
### 6. DeckLink lifecycle could be modeled more explicitly
`DeckLinkSession` has a number of imperative calls, but startup, preroll, running, degraded, and stopped are not represented as an explicit state machine.
Relevant code:
- [DeckLinkSession.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.h:17)
Recommended direction:
- introduce explicit session states
- define allowed transitions
- centralize recovery behavior
- make shutdown ordering and degraded-mode behavior more predictable
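A sketch of explicit session states with an allowed-transition table. The state names follow the ones listed above; the specific transitions and the `SessionLifecycle` class are assumptions for illustration, not the existing `DeckLinkSession` behavior.

```cpp
enum class SessionState { Stopped, Starting, Prerolling, Running, Degraded };

// Single place that defines which transitions are legal.
constexpr bool IsAllowedTransition(SessionState from, SessionState to) {
    switch (from) {
    case SessionState::Stopped:    return to == SessionState::Starting;
    case SessionState::Starting:   return to == SessionState::Prerolling || to == SessionState::Stopped;
    case SessionState::Prerolling: return to == SessionState::Running || to == SessionState::Stopped;
    case SessionState::Running:    return to == SessionState::Degraded || to == SessionState::Stopped;
    case SessionState::Degraded:   return to == SessionState::Running || to == SessionState::Stopped;
    }
    return false;
}

class SessionLifecycle {
public:
    SessionState State() const { return state_; }

    // Rejects illegal transitions instead of silently applying them.
    bool TransitionTo(SessionState next) {
        if (!IsAllowedTransition(state_, next)) return false;
        state_ = next;
        return true;
    }

private:
    SessionState state_ = SessionState::Stopped;
};
```

Every state can reach `Stopped`, which keeps shutdown ordering a property of the table rather than of each call site.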
Timing-specific additions:
- separate "device callback received" from "render the next output frame" so output cadence is not driven directly by the completion callback thread
- make playout headroom configurable and adaptive instead of using a fixed compile-time preroll count
- track an explicit backend health state such as `running-steady`, `catching-up`, `late`, and `dropping`
Relevant timing code:
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:86)
- [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:420)
- [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:487)
- [VideoPlayoutScheduler.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/VideoPlayoutScheduler.cpp:26)
Why this matters:
- the output completion path currently requests a scheduled render through `OpenGLVideoIOBridge::RenderScheduledFrame()`, which asks the render thread to render/read back synchronously and then schedules the next frame in one callback-driven flow.
- `VideoPlayoutScheduler::AccountForCompletionResult()` currently reacts to both late and dropped frames by blindly advancing the schedule index by `2`, which is simple but not especially robust.
- `kPrerollFrameCount` is now `12`, but `DeckLinkSession::ConfigureOutput()` still creates a fixed pool of `10` mutable output frames. That mismatch suggests the buffering model is not being sized from one coherent source of truth.
Recommended direction:
- move playout to a producer/consumer model where a render worker fills output buffers ahead of the DeckLink callback
- define buffer-pool sizing from one policy object, for example: preroll depth, minimum spare buffers, and allowed catch-up depth
- replace fixed "skip two frames" recovery with measured lag accounting based on actual scheduled-versus-completed position
- expose playout latency as a runtime setting or policy, rather than burying it in a constant
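A sketch of the policy-object idea, under stated assumptions. `PlayoutPolicy` and `CatchUpFrames` are hypothetical, the preroll default mirrors the `12` mentioned above, and the spare and catch-up values are placeholders. Lag is approximated here as the shortfall of queued frames against the preroll target, which is only one possible definition.

```cpp
#include <algorithm>
#include <cstdint>

// One source of truth for buffering: the pool size is derived, not hard-coded.
struct PlayoutPolicy {
    std::uint32_t prerollFrames = 12;      // matches the preroll count quoted above
    std::uint32_t minimumSpareFrames = 2;  // placeholder value
    std::uint32_t maxCatchUpFrames = 4;    // placeholder value

    constexpr std::uint32_t OutputPoolSize() const { return prerollFrames + minimumSpareFrames; }
};

// Measured recovery: skip only as many frames as the queue is actually short,
// clamped to the policy's catch-up limit, instead of a fixed "skip two".
inline std::uint32_t CatchUpFrames(const PlayoutPolicy& policy,
                                   std::uint64_t scheduledCount,
                                   std::uint64_t completedCount) {
    const std::uint64_t queuedAhead =
        scheduledCount > completedCount ? scheduledCount - completedCount : 0;
    if (queuedAhead >= policy.prerollFrames) return 0;
    const std::uint64_t deficit = policy.prerollFrames - queuedAhead;
    return static_cast<std::uint32_t>(std::min<std::uint64_t>(deficit, policy.maxCatchUpFrames));
}
```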
### 6a. The current playout timing model is still callback-coupled
The app now has more headroom, but the next output frame is still produced directly in the scheduled-frame completion callback path.
Relevant code:
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:86)
- [DeckLinkFrameTransfer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkFrameTransfer.cpp:53)
That means the completion callback is currently responsible for:
- frame pacing accounting
- acquiring the next output buffer
- requesting render-thread output production
- waiting for render/readback completion
- performing output readback
- scheduling the next frame
This works when the app is comfortably within budget, but it makes deadline misses much harder to absorb gracefully.
Recommended direction:
- make the DeckLink callback a lightweight notifier
- have a dedicated playout worker or render worker keep an ahead-of-time queue of ready output frames
- treat callback time as control-plane time, not render time
### 6b. A producer/consumer playout model would be a better long-term fit
The stronger architecture for this app is:
- a render scheduler or dedicated render thread runs at the configured video cadence
- rendering produces completed output frames ahead of need
- those frames are placed into a bounded queue or ring buffer
- the DeckLink side consumes already-prepared frames when callbacks indicate they are needed
That is a better fit than callback-driven rendering because it separates:
- render timing
- GL ownership
- output-device timing
- latency policy
In that model:
- render is the producer
- DeckLink is the timing consumer
- the queue between them becomes the main place to manage latency versus resilience
Why this is preferable:
- brief callback jitter is less likely to become a visible dropped frame
- render spikes can be absorbed by queue headroom instead of immediately missing output deadlines
- latency becomes an explicit policy choice rather than an incidental side effect of callback timing
- queue depth, underruns, stale-frame reuse, and catch-up behavior become measurable and tunable
Recommended direction:
- move toward a bounded producer/consumer playout queue
- make queue depth and target headroom runtime policy, not compile-time constants
- define explicit underrun behavior, for example:
- reuse newest completed frame
- reuse last scheduled frame
- output black or degraded frame
- keep DeckLink callbacks limited to dequeue/schedule/accounting work wherever possible
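A minimal sketch of a bounded playout queue with one explicit underrun rule, "reuse last scheduled frame". `OutputFrame` and `PlayoutQueue` are hypothetical names, and a real frame would carry a buffer handle rather than just an index.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>

struct OutputFrame { unsigned long long index = 0; };

class PlayoutQueue {
public:
    explicit PlayoutQueue(std::size_t capacity) : capacity_(capacity) {}

    // Producer (render side): refuses the frame when the queue is full.
    bool TryPush(OutputFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (ready_.size() >= capacity_) return false;
        ready_.push_back(frame);
        return true;
    }

    // Consumer (device side): never blocks. On underrun it counts the miss and
    // repeats the last frame handed out, or returns nothing if there was none.
    std::optional<OutputFrame> PopOrRepeat() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!ready_.empty()) {
            last_ = ready_.front();
            ready_.pop_front();
            return last_;
        }
        ++underruns_;
        return last_;
    }

    std::size_t Underruns() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return underruns_;
    }

private:
    mutable std::mutex mutex_;
    std::size_t capacity_;
    std::size_t underruns_ = 0;
    std::deque<OutputFrame> ready_;
    std::optional<OutputFrame> last_;
};
```

The consumer path does no rendering and no waiting, which is what keeps callback time as control-plane time.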
### 7. Persistence should be more asynchronous and debounced
Status: addressed by Phase 6.
Relevant current `RenderCadenceCompositor` code:
- `src/app/RuntimeLayerControllerControls.cpp`
- `src/runtime/state/RuntimeStatePersistence.cpp`
- `src/runtime/state/RuntimeStatePersistence.h`
Runtime-state persistence now flows from accepted durable layer-stack mutations into a debounced background writer. The layer controller owns the current snapshot source, while `RuntimeStatePersistenceWriter` owns serialization, temp-file replacement, coalescing, result reporting, and shutdown flushing.
The remaining architecture concern is broader persistence policy, not direct mutation-path disk writes:
- whether preset saves should stay synchronous
- whether runtime config writes should share the persistence writer
- whether failed writes should retry automatically or wait for the next request
This improves both resilience and timing safety.
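The debounce and coalescing behavior can be sketched without threads or disk I/O. `DebouncedSnapshot` is a hypothetical helper, not `RuntimeStatePersistenceWriter`; it assumes the caller passes in the current time and performs the actual write with whatever it gets back.

```cpp
#include <chrono>
#include <optional>
#include <string>
#include <utility>

class DebouncedSnapshot {
public:
    using Clock = std::chrono::steady_clock;

    explicit DebouncedSnapshot(Clock::duration quietPeriod) : quietPeriod_(quietPeriod) {}

    // Each request replaces the pending snapshot and restarts the quiet period.
    void Request(std::string snapshot, Clock::time_point now) {
        pending_ = std::move(snapshot);
        lastRequest_ = now;
    }

    // Returns the snapshot to write once no request has arrived for quietPeriod.
    std::optional<std::string> TakeIfDue(Clock::time_point now) {
        if (!pending_ || now - lastRequest_ < quietPeriod_) return std::nullopt;
        return Take();
    }

    // Shutdown path: hand back whatever is pending regardless of timing.
    std::optional<std::string> Flush() { return Take(); }

private:
    std::optional<std::string> Take() {
        std::optional<std::string> due = std::move(pending_);
        pending_.reset();
        return due;
    }

    Clock::duration quietPeriod_;
    Clock::time_point lastRequest_{};
    std::optional<std::string> pending_;
};
```

A burst of parameter updates then produces one write of the newest snapshot, and `Flush()` covers the final save during shutdown.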
### 8. Telemetry is useful, but still too coarse
The app already records render timing and playout pacing, which is a good foundation.
Relevant code:
- [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:24)
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:24)
Recommended direction:
Add lightweight tracing for:
- input callback latency
- input upload skip count
- render-thread request latency
- render queue depth
- render time
- pass build/compile latency
- readback time
- output scheduling lag
- output queue depth
- preroll depth versus spare-buffer depth
- preview present cost and skipped-preview count
- control queue depth
- runtime state lock contention
That would make future tuning and failure diagnosis much easier.
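A lightweight way to collect several of these timings is a per-stage stat plus an RAII timer. `TimingStat` and `ScopedTimer` are hypothetical names; the sketch assumes each stat is only touched from one thread, for example the render thread.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Running totals for one measured stage (draw, readback wait, preview present, ...).
struct TimingStat {
    std::uint64_t samples = 0;
    double totalMs = 0.0;
    double maxMs = 0.0;

    void Record(double ms) {
        ++samples;
        totalMs += ms;
        maxMs = std::max(maxMs, ms);
    }

    double AverageMs() const {
        return samples == 0 ? 0.0 : totalMs / static_cast<double>(samples);
    }
};

// Records the lifetime of the enclosing scope into a TimingStat.
class ScopedTimer {
    using Clock = std::chrono::steady_clock;

public:
    explicit ScopedTimer(TimingStat& stat) : stat_(stat), start_(Clock::now()) {}
    ~ScopedTimer() {
        const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        stat_.Record(elapsed.count());
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    TimingStat& stat_;
    Clock::time_point start_;
};
```

Wrapping each stage in its own `ScopedTimer` is what splits one total render number into separate draw, pack, and readback figures.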
Timing-specific observations from the current code:
- render time is captured as one total number in [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:24), but not split into draw, pack, readback wait, readback copy, or preview present
- frame pacing stats are recorded in [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:17), but there is no explicit visibility into how much queued playout headroom remains
- input uploads are intentionally skipped when the GL bridge is busy in [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:60), but the app does not currently surface how often that is happening
### 8a. Preview and playout are still too close together
The desktop preview is rate-limited, but still presented from inside the render pipeline path.
Relevant code:
- [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:54)
- [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:235)
This means preview presentation can still consume time on the same path that is trying to meet output deadlines.
Recommended direction:
- treat preview as best-effort and entirely subordinate to playout
- move preview present to a separate presentation schedule fed from the latest completed render
- record preview skips and preview present cost independently from playout timing
### 8b. Readback is improved, but still not fully deadline-safe
The async readback path is a good step, but the miss path still falls back to synchronous `glReadPixels()` and then flushes the async pipeline.
Relevant code:
- [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:150)
- [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:228)
That means a single late GPU fence can push the app back onto the most timing-sensitive path exactly when it is already under pressure.
Recommended direction:
- increase readback instrumentation before changing policy again
- consider deeper readback buffering or a true stale-frame reuse policy instead of immediate synchronous fallback
- separate "freshest possible frame" policy from "never miss output deadline" policy and make that tradeoff explicit
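The tradeoff in the last point can be made explicit as a small decision function. The enum names and `ChooseReadbackAction` are hypothetical; the sketch only shows where the policy choice would sit, not how the existing pipeline decides.

```cpp
enum class ReadbackPolicy { PreferFreshFrame, NeverMissDeadline };
enum class ReadbackAction { UseAsyncResult, SynchronousReadback, ReuseLastFrame };

// Decides what to do when the output path needs pixels for the next frame.
constexpr ReadbackAction ChooseReadbackAction(ReadbackPolicy policy,
                                              bool fenceSignaled,
                                              bool hasPreviousFrame) {
    if (fenceSignaled) return ReadbackAction::UseAsyncResult;
    // Async miss: only the deadline-first policy is allowed to go stale.
    if (policy == ReadbackPolicy::NeverMissDeadline && hasPreviousFrame) {
        return ReadbackAction::ReuseLastFrame;
    }
    return ReadbackAction::SynchronousReadback;
}
```

With `NeverMissDeadline`, a late fence costs one repeated frame instead of a synchronous stall on the most timing-sensitive path.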
### 8c. Background control and file-watch timing are partially event-driven
`ControlServices::PollLoop()` now uses a condition-variable wakeup for queued OSC commit work and a fallback timer for compatibility polling. That removes the old fixed `25 x Sleep(10)` cadence as the default OSC commit timing model, but file-watch/runtime-store refresh work still relies on a compatibility poll path.
Relevant code:
- [ControlServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/ControlServices.cpp:217)
That is acceptable as transitional non-critical background work. The Phase 2 bridge now publishes typed reload/file-change events when changes are detected; a later file-watch implementation can replace scanning as the source.
Recommended direction:
- replace runtime-store scanning with true file-watch events when practical
- isolate truly background work from latency-sensitive control reconciliation
- add separate metrics for queue age, not just queue depth
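Queue age needs an enqueue timestamp per item. A small sketch with a hypothetical `AgedQueue` follows; it is not thread-safe on its own and assumes the caller supplies the clock reading and any locking.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <utility>

template <class T>
class AgedQueue {
public:
    using Clock = std::chrono::steady_clock;

    void Push(T item, Clock::time_point now) {
        items_.push_back(Entry{std::move(item), now});
    }

    bool Pop(T& out) {
        if (items_.empty()) return false;
        out = std::move(items_.front().item);
        items_.pop_front();
        return true;
    }

    std::size_t Depth() const { return items_.size(); }

    // Age of the oldest waiting item; zero when the queue is empty.
    Clock::duration OldestAge(Clock::time_point now) const {
        return items_.empty() ? Clock::duration::zero() : now - items_.front().enqueuedAt;
    }

private:
    struct Entry {
        T item;
        Clock::time_point enqueuedAt;
    };
    std::deque<Entry> items_;
};
```

Depth alone cannot distinguish a busy queue from a stuck one: a depth of one with a growing `OldestAge` means the consumer has stalled.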
## Phased Roadmap
This roadmap is ordered by architectural dependency rather than by “quick wins.” The goal is to move the app toward clearer ownership boundaries and safer live behavior without doing later work on top of foundations that are likely to change again.
### Phase 1. Define subsystem boundaries and target architecture
Before changing major internals, formalize the target responsibilities for each major part of the app.
Status:
- Design deliverable: complete.
- Runtime implementation foothold: complete.
- Target boundary extraction: not complete across the whole app; remaining work is tracked by later phases, especially the event model, render ownership, live-state layering, backend lifecycle, telemetry, and persistence work.
Target split:
- `RuntimeStore`
- persisted config
- persisted layer stack
- preset persistence
- `RuntimeCoordinator`
- mutation validation and classification
- committed-live versus transient policy
- snapshot and persistence requests
- `RuntimeSnapshotProvider`
- render-facing immutable or near-immutable snapshots
- parameter values prepared for the render path
- `ControlServices`
- OSC ingress
- web control ingress
- reload/file-watch requests
- commit/persist requests
- `RenderEngine`
- sole owner of live GL rendering
- sole consumer of render snapshots plus transient overlays
- `VideoBackend`
- DeckLink input/output lifecycle
- pacing and scheduling
- `HealthTelemetry`
- logging
- counters
- timing traces
- degraded-state reporting
Why this phase comes first:
- it prevents later refactors from reintroducing responsibility overlap
- it gives names to the seams the later phases will build around
- it reduces the risk of replacing one monolith with several poorly-defined ones
Suggested deliverables:
- a short architecture diagram
- a responsibility table for each subsystem
- a list of allowed dependencies between subsystems
- a dedicated Phase 1 design note:
- [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md)
- a subsystem design bundle index:
- [docs/subsystems/README.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/subsystems/README.md)
Current implementation note:
The repo now has concrete runtime classes, folders, read models, and subsystem tests for the Phase 1 names. These classes are the runtime foothold for later phases; app-wide extraction still continues around eventing, render ownership, backend lifecycle, persistence, and telemetry.
### Phase 2. Introduce an internal event model
Once subsystem boundaries are defined, introduce a typed event pipeline between them. This should happen before large state splits so the app has a stable coordination model.
Dedicated design note:
- [PHASE_2_INTERNAL_EVENT_MODEL_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_2_INTERNAL_EVENT_MODEL_DESIGN.md)
Example event families:
- control events
- `OscParameterTargeted`
- `UiParameterCommitted`
- `TriggerFired`
- runtime events
- `ShaderReloadRequested`
- `PackagesRescanned`
- `PersistStateRequested`
- render events
- `OverlayApplied`
- `OverlaySettled`
- `SnapshotPublished`
- backend events
- `InputSignalChanged`
- `OutputLateFrameDetected`
- `OutputDroppedFrameDetected`
- health events
- `SubsystemWarningRaised`
- `SubsystemRecovered`
Why this phase comes second:
- it provides a migration path away from direct cross-calls
- it makes ownership explicit before data structures are split apart
- it lets you move one subsystem at a time without losing coordination
Suggested outcome:
- the app stops relying on “shared object plus mutex plus polling” as the default coordination pattern
### Phase 3. Finish live-state and service-facing coordination
After the event model exists, finish separating live committed state and service-facing coordination from the runtime facades.
Dedicated design note:
- [PHASE_3_LIVE_STATE_SERVICE_COORDINATION_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_3_LIVE_STATE_SERVICE_COORDINATION_DESIGN.md)
Recommended split:
- `RuntimeStore`
- owns config and saved layer data
- handles serialization/deserialization
- does not sit on the live render path
- `RuntimeCoordinator`
- resolves control actions
- validates mutations
- publishes new snapshots
- bridges events between services and render
- `RuntimeSnapshotProvider`
- publishes immutable render snapshots
- avoids large shared mutable structures on the render path
Why this phase comes before render-thread isolation:
- render isolation is easier when the render thread consumes clean snapshots instead of a large mutable host object
- otherwise the GL refactor still drags along too much shared state complexity
Primary design rule:
- render should read snapshots
- persistence should write stored state
- services should request mutations through the coordinator
### Phase 4. Make the render thread the sole GL owner
With state and coordination cleaner, move to a dedicated render-thread model.
Dedicated design note:
- [PHASE_4_RENDER_THREAD_OWNERSHIP_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_4_RENDER_THREAD_OWNERSHIP_DESIGN.md)
Status:
- complete for GL ownership
- remaining playout-headroom work is tracked under Phase 7/backend lifecycle
Target behavior:
- one thread owns the GL context
- input callbacks never perform GL work directly
- output callbacks never perform GL work directly
- preview presentation, texture upload, render passes, readback, and output pack work are all issued by the render thread
Other threads should only:
- enqueue new video frames
- enqueue control updates
- enqueue backend events
- consume produced output buffers
Why this phase comes here:
- it is much safer once state access and control coordination are no longer centered on one shared runtime object
- it avoids coupling the render-thread refactor to storage and service refactors at the same time
Expected benefits:
- less cross-thread GL contention
- easier timing reasoning
- much lower risk of callback-driven stalls
- a clearer foundation for future GPU pipeline work
### Phase 5. Refactor live state layering into an explicit composition model
Once rendering and snapshots are isolated, formalize how final parameter values are derived.
Dedicated design note:
- [PHASE_5_LIVE_STATE_LAYERING_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_5_LIVE_STATE_LAYERING_DESIGN.md)
Status:
- complete for the current architecture
- `RuntimeStateLayerModel` names the state categories
- `CommittedLiveState` physically owns committed/session layer state
- `RenderStateComposer` consumes `LayeredRenderStateInput`
- `RuntimeLiveState` owns transient overlay smoothing, generation, commit settlement, and compatibility pruning
- settled OSC commits update session state without requesting persistence by default
Render should derive final values from a clear composition rule such as:
- `final = base + committed + transient`
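The composition rule above can be sketched as a simple precedence lookup. This is a minimal illustration, assuming that `+` means "later layers override earlier ones" for a given parameter; `ParameterLayers` and `ComposeFinalValue` are hypothetical names, not the app's `RenderStateComposer` API.

```cpp
#include <initializer_list>
#include <map>
#include <optional>
#include <string>

// Hypothetical parameter layers; the names are illustrative only.
struct ParameterLayers {
    std::map<std::string, float> base;       // e.g. manifest defaults
    std::map<std::string, float> committed;  // accepted UI/API/OSC commits
    std::map<std::string, float> transient;  // in-flight overlay values
};

// Resolve one parameter. Assumption: transient overrides committed,
// which overrides base. Returns nullopt when no layer defines the name.
inline std::optional<float> ComposeFinalValue(const ParameterLayers& layers,
                                              const std::string& name) {
    for (const auto* layer : {&layers.transient, &layers.committed, &layers.base}) {
        const auto it = layer->find(name);
        if (it != layer->end()) {
            return it->second;
        }
    }
    return std::nullopt;
}
```

Keeping the rule in one pure function means render can evaluate it against a snapshot without calling back into persistence or control services.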
Why this phase follows render isolation:
- once render owns snapshot consumption, it becomes much easier to cleanly evaluate layered state without touching persistence or control services
- it turns the current OSC overlay behavior into a first-class model instead of an implementation detail
Expected benefits:
- fewer one-off sync rules
- clearer behavior for OSC, UI changes, and automation
- easier future expansion to presets, cues, or timed transitions
### Phase 6. Move persistence onto a background snapshot writer
Status: complete. Runtime-state persistence is now a background concern rather than a synchronous side effect of mutations.
Dedicated design note:
- [PHASE_6_BACKGROUND_PERSISTENCE_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_6_BACKGROUND_PERSISTENCE_DESIGN.md)
Implemented behavior:
- mutations update authoritative in-memory stored state
- persistence requests are queued
- disk writes are debounced and coalesced
- writes use temp-file replacement where practical
- shutdown flush behavior is explicit and tested
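The temp-file replacement step can be sketched as follows. This is a minimal example under stated assumptions, not the app's persistence writer: `WriteFileViaTempReplace` and the `.tmp` suffix are hypothetical, and whether the rename is an atomic replace depends on the platform and standard library.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Write `contents` to a sibling temp file, then rename it over `target`.
// If the write fails, the previous target file is left untouched.
inline bool WriteFileViaTempReplace(const std::filesystem::path& target,
                                    const std::string& contents) {
    std::filesystem::path temp = target;
    temp += ".tmp";  // hypothetical suffix
    {
        std::ofstream out(temp, std::ios::binary | std::ios::trunc);
        if (!out) {
            return false;
        }
        out << contents;
        out.flush();
        if (!out) {
            return false;
        }
    }  // stream is closed before the rename
    std::error_code ec;
    std::filesystem::rename(temp, target, ec);  // replaces an existing file on POSIX
    if (ec) {
        std::filesystem::remove(temp, ec);  // best-effort cleanup
        return false;
    }
    return true;
}
```

A background worker would call a function like this only after debouncing, so a burst of mutations produces one write of the latest stored state.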
Why this phase comes after state splitting:
- otherwise persistence logic will need to be rewritten twice
- it should operate on the new `RuntimeStore` model, not on a mixed-responsibility runtime object
Expected benefits:
- less timing interference
- better corruption resistance
- cleaner restart/recovery semantics
### Phase 7. Make DeckLink/backend lifecycle explicit with a state machine
Once the render and state layers are cleaner, refactor the video backend into an explicit lifecycle model.
Dedicated design note:
- [PHASE_7_BACKEND_LIFECYCLE_PLAYOUT_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_7_BACKEND_LIFECYCLE_PLAYOUT_DESIGN.md)
Suggested states:
- uninitialized
- devices-discovered
- configured
- prerolling
- running
- degraded
- stopping
- stopped
- failed
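The suggested states can be sketched as an enum with an explicit transition check. The transition table below is an assumption made for illustration; the design note may allow different edges, and `BackendState` and `IsTransitionAllowed` are hypothetical names.

```cpp
// Hypothetical enum mirroring the suggested states above.
enum class BackendState {
    Uninitialized, DevicesDiscovered, Configured, Prerolling,
    Running, Degraded, Stopping, Stopped, Failed
};

// Assumed transition table; the real lifecycle may permit other edges.
inline bool IsTransitionAllowed(BackendState from, BackendState to) {
    using S = BackendState;
    if (to == S::Failed) {
        return from != S::Failed;  // assumption: any non-failed state may fail
    }
    switch (from) {
    case S::Uninitialized:     return to == S::DevicesDiscovered;
    case S::DevicesDiscovered: return to == S::Configured;
    case S::Configured:        return to == S::Prerolling || to == S::Stopped;
    case S::Prerolling:        return to == S::Running || to == S::Stopping;
    case S::Running:           return to == S::Degraded || to == S::Stopping;
    case S::Degraded:          return to == S::Running || to == S::Stopping;
    case S::Stopping:          return to == S::Stopped;
    case S::Stopped:           return to == S::Configured;
    case S::Failed:            return to == S::Stopped;
    }
    return false;
}
```

Because the check is a pure function, startup and shutdown ordering can be covered by non-GL unit tests.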
Why this phase belongs here:
- the backend should integrate with the new event model
- degraded/recovery behavior will be easier once rendering and state coordination are already more deterministic
Expected benefits:
- safer startup/shutdown ordering
- clearer recovery behavior
- easier handling of missing input, dropped frames, or reconfiguration
- a clearer place to own playout headroom policy, output queue sizing, and late-frame recovery behavior
### Phase 7.5. Make playout timing proactive and deadline-aware
Phase 7 made backend lifecycle, ready-frame queueing, measured recovery, and backend playout health visible. The remaining timing-specific work is to make output production proactive instead of demand-filled by completion pressure.
Dedicated design note:
- [PHASE_7_5_PROACTIVE_PLAYOUT_TIMING_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_7_5_PROACTIVE_PLAYOUT_TIMING_DESIGN.md)
Expected benefits:
- output frames are produced ahead based on queue pressure or cadence
- DeckLink completion handling normally consumes already-ready frames
- preview and synchronous readback fallback become explicitly subordinate to playout deadlines
- queue depth, readback misses, preview skips, and render timing explain why headroom drains
### Phase 8. Add structured health, telemetry, and operational reporting
This phase should happen after the main ownership changes so the telemetry can reflect the final architecture instead of a transient one.
Dedicated design note:
- [PHASE_8_HEALTH_TELEMETRY_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_8_HEALTH_TELEMETRY_DESIGN.md)
Recommended coverage:
- render queue depth
- input callback latency
- input upload skip count
- output scheduling lag
- output queue depth and spare-buffer depth
- readback timing
- readback fence wait timing
- synchronous readback fallback count
- preview present timing and skipped-preview count
- snapshot publish frequency
- persistence queue depth
- event queue depth
- backend state transitions
- warning/error counters per subsystem
Also replace modal-only error handling with:
- structured in-app health state
- severity-based logging
- rolling log files
- operator-visible degraded-state messages
Why this phase comes last:
- it should instrument the architecture you intend to keep
- otherwise instrumentation work gets invalidated by the refactor
## Recommended Execution Order
If this is approached as a serious architecture program rather than opportunistic cleanup, the recommended order is:
1. Define subsystem boundaries and target architecture.
2. Introduce the internal event model.
3. Finish runtime live-state/service coordination.
4. Make the render thread the sole GL owner.
5. Formalize live state layering and composition.
6. Move persistence to a background snapshot writer.
7. Refactor DeckLink/backend lifecycle into an explicit state machine.
8. Make playout timing proactive and deadline-aware.
9. Add structured telemetry, health reporting, and operational diagnostics.
## Why This Order Makes Sense
This order tries to avoid doing foundational work twice.
- The event model comes before major subsystem extraction so coordination patterns stabilize early.
- Runtime state ownership is split before render isolation so the render thread does not inherit a monolithic state model.
- Live state layering is formalized only after render ownership is clearer.
- Persistence moved after the state model split so it could target the durable snapshot model rather than an older mixed-responsibility runtime object.
- Telemetry is intentionally late so it instruments the architecture that survives the refactor.
## Short Version
The app is in a much better place than it was before the OSC timing work. The shared-GL ownership risk has now been addressed by Phase 4; the main remaining live-resilience risk is output playout headroom because DeckLink callbacks still synchronously request render-thread output production. The most sensible path forward is:
1. define boundaries
2. establish an event model
3. split state ownership
4. isolate rendering
5. formalize layered live state
6. complete background persistence
7. make backend lifecycle explicit
8. make playout timing proactive
9. add health and telemetry
That sequence gives each later phase a cleaner foundation than the current app has today.

@@ -11,8 +11,8 @@ The app is a native C++ OpenGL compositor with:
- optional DeckLink input
- optional DeckLink scheduled output
- a render-thread-owned OpenGL context
- runtime Slang shader packages from `shaders/`
- a configurable active layer stack
- a pluggable app-side runtime-content controller
- the default runtime Slang shader package stack from `shaders/`
- a local HTTP/WebSocket control server
- optional Win32/GDI preview from system-memory output frames
- background runtime-state persistence
@@ -20,7 +20,7 @@ The app is a native C++ OpenGL compositor with:
Primary source areas:
- `src/app`: startup/shutdown orchestration, config loading, runtime layer controller
- `src/app`: startup/shutdown orchestration, config loading, video backend factory, runtime-content controller boundary, and the default shader runtime-content controller
- `src/render`: cadence clock, input texture upload, render-content boundary, readback, and runtime GL support
- `src/render/thread`: render thread lifecycle, cadence loop, metrics, and runtime shader commit mailbox
- `src/render/runtime`: render-thread-owned runtime shader scene, renderer, text texture upload cache, and shared-context shader prepare worker
@@ -44,18 +44,20 @@ Primary source areas:
Startup broadly proceeds as:
1. Load `config/runtime-host.json` through `AppConfigProvider`, then apply CLI overrides.
2. Load the supported shader catalog from the configured `shaderLibrary`.
3. Start the runtime-state persistence writer.
4. Try to restore `runtime/runtime_state.json`.
5. If restore fails or no usable state exists, create one layer from the configured default shader.
6. Start the render thread.
7. Queue background Slang builds for every pending active layer.
8. Build a small completed-frame reserve.
9. Start optional preview, optional video output, telemetry, and HTTP control.
2. Initialize the active `IRuntimeContentController`. The checked-in app uses `ShaderRuntimeContentController`, which loads the supported shader catalog, starts runtime-state persistence, and tries to restore `runtime/runtime_state.json`.
3. If restore fails or no usable state exists, the shader controller falls back to the optional configured startup shader. The checked-in config leaves `runtimeShaderId` empty, so a fresh host keeps the simple fallback renderer.
4. Start the render thread.
5. Start the active runtime-content controller. The shader controller queues background Slang builds for every pending active layer.
6. Build a small completed-frame reserve.
7. Start optional preview, optional video output, telemetry, and HTTP control.
The runtime-state restore is intentionally app/control side work. The render thread does not read JSON, inspect the shader library, or decide what to compile.
## Runtime Layer State
## Runtime Content And Shader Layer State
`RenderCadenceApp` owns startup, video output, preview, telemetry, OSC status, and HTTP server lifetime. It does not own the Slang shader stack directly. Runtime content plugs in through `IRuntimeContentController`.
The checked-in implementation is `ShaderRuntimeContentController`. It wraps `RuntimeLayerController`, exposes shader catalog/layer JSON for `/api/state`, handles shader layer POST commands, and publishes render-ready shader layer snapshots to the render thread.
`RuntimeLayerController` owns the app-side layer model and coordinates:
@@ -78,7 +80,7 @@ The runtime-state restore is intentionally app/control side work. The render thr
- current parameter values
- render-ready artifacts
The current durable runtime state is stored in `runtime/runtime_state.json`. It contains the active stack order, shader ids, bypass flags, and parameter values. On startup, valid saved layers are restored in order. Missing shader packages are skipped, invalid saved parameter values fall back to manifest defaults, and a missing or unusable file falls back to the configured default shader.
The current durable runtime state is stored in `runtime/runtime_state.json`. It contains the active stack order, shader ids, bypass flags, and parameter values. On startup, valid saved layers are restored in order. Missing shader packages are skipped, invalid saved parameter values fall back to manifest defaults, and a missing or unusable file falls back to the optional configured startup shader.
Manual stack preset routes are present in the UI/OpenAPI surface but are not implemented in the current native command path yet. `runtime_state.json` is the supported latest-working-state mechanism.
@@ -132,7 +134,7 @@ The render path consumes published render-layer snapshots. It does not:
- handle HTTP or OSC
- call DeckLink discovery/setup APIs
When a runtime shader build completes, the app publishes a render-layer artifact. The render thread forwards pending layer snapshots to the active render-content adapter. The default `RuntimeShaderRenderContent` owns the runtime scene, diffs the snapshot, and queues changed pass programs to the shared-context prepare worker. The render thread swaps in an already-prepared render plan at a frame boundary through that adapter.
When a runtime shader build completes, the default shader runtime-content controller publishes a render-layer artifact. The render thread forwards pending layer snapshots to the active render-content adapter. The default `RuntimeShaderRenderContent` owns the runtime scene, diffs the snapshot, and queues changed pass programs to the shared-context prepare worker. The render thread swaps in an already-prepared render plan at a frame boundary through that adapter.
## Video And Preview
@@ -156,7 +158,7 @@ The HTTP server runs on its own thread. `HttpControlServer` owns socket lifetime
- OpenAPI/Swagger docs
- `GET /api/state`
- `/ws` state updates
- layer mutation POST routes
- layer mutation POST routes, dispatched to the active runtime-content controller
- `/api/reload`
Known but not implemented in the current native command path:

@@ -1,414 +0,0 @@
# DeckLink / OpenGL Lessons Learned
This document summarizes the practical lessons from the Phase 3-7.7 refactor work, especially the DeckLink playout timing experiments.
It is intentionally broader than the phase design docs. The goal is to preserve what we now know about the system so future architecture choices start from evidence instead of rediscovering the same constraints.
## High-Level Lesson
The application is not just a renderer with a video output attached.
It is a real-time playout system with several independent clocks:
- the selected output cadence, for example 59.94 fps
- the GPU render/readback timeline
- the DeckLink scheduled playback clock
- the Windows thread scheduler
- the input capture callback cadence
- the preview/window message loop
- the runtime/control update cadence
Stable playback depends on assigning one owner to each timing domain and keeping those domains loosely coupled.
## What Worked
### Named State Contracts Helped
`RenderFrameInput` and `RenderFrameState` made the render path easier to reason about.
Before that, frame rendering depended on scattered choices about snapshots, cache state, layer state, input source state, and runtime service state. Naming the frame contract made it possible to move logic out of `RenderEngine` and toward explicit frame construction.
Lesson:
- keep frame inputs explicit
- keep render-frame state immutable for the duration of a frame
- avoid making the renderer ask global systems which state it should use mid-frame
### Render-Thread Ownership Helped
Moving GL work behind a render-thread boundary reduced wrong-thread GL access risk and made ownership clearer.
The current render thread is still shared by output render, input upload, preview, screenshot, resize, and reset work, so it is not yet a pure output cadence thread. But the ownership direction is right.
Lesson:
- GL context ownership should be explicit
- public methods should enqueue or request work
- render-thread methods should own GL bodies
- synchronous calls should be reserved for places that genuinely need a result
### Background Persistence Was Worth It
Moving persistence away from hot render/control paths reduced incidental latency risk and made state writes easier to reason about.
Lesson:
- runtime/control persistence should not sit on output render timing
- shutdown flushing is fine, steady-state blocking is not
### Lifecycle State Was Worth It
The backend lifecycle model gave us better failure and shutdown vocabulary.
This became important once startup stopped being a single `Start()` call and became:
- prepare output schedule
- start render cadence
- warm up real frames
- start input streams
- start scheduled playback
Lesson:
- playout startup needs phases
- degradation should be explicit
- shutdown order should be deliberate and testable
## What Did Not Work
### Completion-Driven Rendering Was Too Fragile
Rendering on or near DeckLink completion can average the target frame rate, but it leaves no headroom.
When the callback asks for a frame just-in-time, any small delay in render, readback, scheduling, or Windows wake timing becomes visible as a buffer dip or stutter.
Lesson:
- DeckLink completion should release scheduled resources and wake scheduling
- it should not render
- it should not decide visual fallback policy in steady state
### Black Fallback Hid The Real Timing Problem
Scheduling black on app-ready underrun made the pipeline appear to keep moving while producing visible black flicker.
It also made diagnosis harder because DeckLink could have scheduled frames while the app visibly failed.
Lesson:
- black is a startup/error/degraded-state policy, not normal steady-state recovery
- steady-state underruns should be measured as timing failures
### Synthetic Schedule Lead Was Misleading
The synthetic scheduled/completed index could report a large buffer while DeckLink still showed low actual device buffer depth.
Real DeckLink `GetBufferedVideoFrameCount()` telemetry was necessary to separate:
- app-owned scheduled slots
- synthetic schedule lead
- actual hardware/device buffer depth
Lesson:
- measure actual device buffer depth
- keep synthetic counters only as diagnostics
- do not infer device health from internal stream indexes alone
### Schedule Cursor Recovery Must Be Conservative
The DeckLink schedule cursor should normally advance as a continuous stream timeline. Continuously realigning the next scheduled stream time to the sampled playback cursor can create its own timing fault: output may look like low FPS even when render and scheduling counters average 59.94/60 fps.
What worked better:
- use the exact DeckLink frame duration for the render cadence
- keep healthy scheduling on a continuous stream cursor
- measure schedule lead from DeckLink playback time versus the next schedule time
- realign only after real pressure, such as a late/drop report or dangerously low measured lead
- re-arm proactive realignment only after lead has recovered
Lesson:
- schedule recovery is an output-edge safety valve, not a per-frame timing policy
- if recovery increments continuously, the recovery path has become the problem
- include schedule lead and realignment count in telemetry/logs so drift is visible before guessing
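The "realign only under pressure, re-arm only after recovery" rule can be sketched as a small hysteresis policy. This is an illustration under assumptions: the class name, thresholds, and the choice to gate both triggers on a single armed flag are hypothetical simplifications, not the app's scheduler.

```cpp
// Hypothetical policy object; thresholds are expressed in frames of schedule lead.
class ScheduleRealignPolicy {
public:
    ScheduleRealignPolicy(double lowLeadFrames, double recoveredLeadFrames)
        : lowLead_(lowLeadFrames), recoveredLead_(recoveredLeadFrames) {}

    // Returns true when the caller should realign the schedule cursor now.
    bool ShouldRealign(double measuredLeadFrames, bool lateOrDropReported) {
        if (!armed_) {
            // Re-arm only once measured lead has recovered; never realign while disarmed.
            if (measuredLeadFrames >= recoveredLead_) {
                armed_ = true;
            }
            return false;
        }
        if (lateOrDropReported || measuredLeadFrames < lowLead_) {
            armed_ = false;
            ++realignCount_;  // expose this in telemetry
            return true;
        }
        return false;
    }

    unsigned RealignCount() const { return realignCount_; }

private:
    double lowLead_;
    double recoveredLead_;
    bool armed_ = true;
    unsigned realignCount_ = 0;
};
```

If `RealignCount()` climbs steadily during a healthy run, that is the signal described above that the recovery path itself has become the problem.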
### More Buffer Is Not Automatically Smoother
Increasing DeckLink scheduled frames sometimes made the reported device buffer look healthier while visible motion still stuttered.
The problem was not only "how many frames are scheduled"; it was also whether the scheduled frames represented a stable render cadence.
Lesson:
- buffer depth absorbs jitter, but it cannot fix bad cadence ownership
- a full buffer of poorly timed or repeated frames can still look wrong
### Speed-Up Catch-Up Was The Wrong Instinct
Letting the producer sprint to refill the buffer created new timing artifacts.
The render side should behave like a stable game/render loop: render at the selected cadence, record lateness, and only skip ticks when render/GPU work itself overruns.
Lesson:
- the render thread should not render faster because DeckLink is empty
- buffer drain is a failure signal, not a sprint signal
- warmup should fill buffers before playback starts
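A time-driven cadence that records lateness and skips ticks instead of sprinting can be sketched like this. `CadenceClock` is a hypothetical name, and the skip rule (drop any tick that is already a full frame in the past) is one reasonable assumption rather than the app's exact policy.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical cadence clock: ticks advance by a fixed frame duration.
class CadenceClock {
public:
    using Clock = std::chrono::steady_clock;

    CadenceClock(Clock::time_point start, Clock::duration frameDuration)
        : next_(start), frame_(frameDuration) {}

    Clock::time_point NextDue() const { return next_; }

    // Call after rendering a frame. Advances one tick; any further ticks that
    // are already a full frame in the past are skipped and counted, so the
    // loop never renders faster than cadence to catch up.
    void Advance(Clock::time_point now) {
        next_ += frame_;
        while (next_ + frame_ <= now) {
            next_ += frame_;
            ++skippedTicks_;
        }
    }

    std::uint64_t SkippedTicks() const { return skippedTicks_; }

private:
    Clock::time_point next_;
    Clock::duration frame_;
    std::uint64_t skippedTicks_ = 0;
};
```

The render loop would sleep until `NextDue()`, render once, then call `Advance(Clock::now())`; the state of the DeckLink buffer never enters the calculation.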
## GPU Readback Lessons
### The Original Readback Path Was The Major Collapse
Early Phase 7.5 telemetry showed `glReadPixels(..., nullptr)` into the PBO costing roughly 8-14 ms on representative samples. That was enough to collapse ready depth and cause long freezes.
Direct synchronous readback was worse on the sampled machine.
Cached-output mode, while visually invalid for live output, immediately recovered timing. That proved ongoing GPU-to-CPU transfer was the major cost in that version of the path.
Lesson:
- isolate readback cost from render cost
- use intentionally invalid cached-output experiments when diagnosing throughput
- do not assume async PBO is actually cheap on every format/driver path
### BGRA8 Packing Changed The Problem
Changing the output path so readback matched the DeckLink BGRA8 format made `asyncQueueReadPixelsMs` drop dramatically in sampled runs.
Long pauses disappeared and the remaining issue became short stutters/cadence gaps.
Lesson:
- output/readback format matters
- avoid format conversions on the readback path when possible
- BGRA8 is a good current format target for experiments
- v210/YUV packing can be deferred until cadence is stable
### DeckLink SDK Fast Transfer Was Not Available On The Test GPU
The SDK OpenGL fast-transfer path depends on hardware/extension support that was not present on the RTX 4060 Ti test machine:
- NVIDIA DVP path was gated around Quadro-style support
- `GL_AMD_pinned_memory` was not exposed
Lesson:
- SDK fast-transfer samples are useful references but not a universal fix
- unsupported fast-transfer code should not be central to the architecture
- the default path must work with ordinary consumer GPUs
## DeckLink Lessons
### DeckLink Wants Scheduled System-Memory Frames
Using `CreateVideoFrameWithBuffer()` lets DeckLink schedule frames backed by our system-memory slots.
That is the right ownership model for this app:
- render/readback writes into a slot
- DeckLink schedules a frame that references that slot
- the slot is protected until DeckLink completion
Lesson:
- system-memory slots are the contract between render and playout
- scheduled slots must not be recycled early
- completed-but-unscheduled slots should form a bounded FIFO reserve for playout
### Startup Needs Real Preroll
Starting scheduled playback before real rendered frames exist creates avoidable startup fragility.
The better startup shape is:
- prepare the DeckLink schedule
- start render cadence
- render warmup frames at normal cadence
- schedule those frames as preroll
- start DeckLink scheduled playback
Lesson:
- do not use black preroll as the normal startup path
- do not render faster during warmup
- if warmup cannot fill in a bounded time, fail/degrade visibly
## Buffering Lessons
### There Are Two Different Buffers
The app has at least two important frame stores:
- system-memory completed FIFO reserve frames
- DeckLink scheduled/device buffer
They have different ownership rules.
Completed-but-unscheduled frames should be a bounded FIFO reserve for playout. If that reserve overflows, dropping the oldest completed frame is an app-side reserve policy and should be counted separately from DeckLink dropped frames.
Scheduled frames are not disposable because DeckLink may still read them.
Lesson:
- completed frames waiting for playout are a bounded FIFO reserve
- scheduled frames are owned by DeckLink until completion
- keep metrics for both
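The completed-frame reserve described above can be sketched as a bounded FIFO that drops and counts the oldest entry on overflow. This is a minimal single-threaded illustration; `CompletedFrameReserve` and the use of plain slot indices are assumptions, and a real frame store would add locking and slot state tracking.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical bounded FIFO of completed, not-yet-scheduled slot indices.
class CompletedFrameReserve {
public:
    explicit CompletedFrameReserve(std::size_t capacity) : capacity_(capacity) {}

    // Adds a completed slot. If the reserve is full, the oldest slot is dropped,
    // counted, and returned so the caller can mark it free again.
    std::optional<int> Push(int slot) {
        if (capacity_ == 0) {
            ++overflowDrops_;
            return slot;
        }
        std::optional<int> dropped;
        if (queue_.size() == capacity_) {
            dropped = queue_.front();
            queue_.pop_front();
            ++overflowDrops_;
        }
        queue_.push_back(slot);
        return dropped;
    }

    // Takes the oldest completed slot for scheduling, if any.
    std::optional<int> PopOldest() {
        if (queue_.empty()) {
            return std::nullopt;
        }
        const int slot = queue_.front();
        queue_.pop_front();
        return slot;
    }

    // App-side reserve drops, reported separately from DeckLink dropped frames.
    std::uint64_t OverflowDrops() const { return overflowDrops_; }

private:
    std::size_t capacity_;
    std::deque<int> queue_;
    std::uint64_t overflowDrops_ = 0;
};
```

Scheduled slots never enter this queue, which keeps the "disposable reserve" and "owned by DeckLink" rules separate in code as well as in metrics.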
### Consume-Before-Render Is The Wrong Model For Completed Frames
If the render cadence waits for completed frames to be consumed, DeckLink timing can indirectly slow the renderer.
That couples the clocks again.
Lesson:
- render cadence should keep rendering at selected cadence
- render acquire should not evict completed frames that are waiting for playout
- if the completed reserve overflows, drop/count the oldest unscheduled completed frame
- only scheduled/in-flight saturation should prevent rendering to a safe slot
## Render Thread Lessons
### The Current Render Thread Is Still Shared
The GL render thread currently handles:
- output rendering
- input upload
- preview present
- screenshot capture
- render reset commands
- shader/resource operations
Output render can therefore be delayed by queued or inline work.
Lesson:
- "one GL thread" is not the same as "one output cadence thread"
- output render should become the highest-priority GL operation
- non-output GL work needs budgets, coalescing, or deferral
### Input Upload Is A Suspect Timing Coupling
Output render currently processes input upload work immediately before rendering the output frame.
That keeps input fresh but can steal time from the exact frame we are trying to render on cadence.
Lesson:
- measure input upload count and time immediately before output render
- test policies such as `one_before_output` or `skip_before_output`
- prefer latest-input semantics over draining every pending upload
### CPU Input Conversion Can Be Worse Than Input Copy
When DeckLink input only exposed UYVY8 on the test machine, an initial CPU UYVY-to-BGRA conversion in the input callback took roughly a full frame budget in sampled runs and reduced input cadence dramatically.
Moving the input edge to raw UYVY8 capture changed the ownership:
- DeckLink callback copies raw supported input bytes into `InputFrameMailbox`
- the mailbox keeps latest-frame semantics and uses a contiguous copy when row strides match
- the render thread uploads/decodes UYVY8 into the shader-visible `gVideoInput` texture
- runtime shaders continue to see decoded input, not packed capture bytes
Lesson:
- keep input callbacks as capture/copy edges
- keep GL decode/upload in the render-owned path
- measure input copy, upload, and decode separately
- do not hide expensive format conversion inside the DeckLink callback
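Latest-frame semantics at the input edge can be sketched as a single-slot mailbox. This is an illustration only: `LatestFrameMailbox` and `RawInputFrame` are hypothetical and do not describe the actual `InputFrameMailbox` interface.

```cpp
#include <cstdint>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical raw capture frame (for example, packed UYVY8 bytes).
struct RawInputFrame {
    std::vector<std::uint8_t> bytes;
    std::uint64_t sequence = 0;
};

// Single-slot mailbox: the capture callback overwrites, the render thread takes.
class LatestFrameMailbox {
public:
    // Called from the capture callback: copy/move in and return quickly.
    void Publish(RawInputFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (latest_) {
            ++overwritten_;  // an unconsumed frame was replaced
        }
        latest_ = std::move(frame);
    }

    // Called from the render thread: take the newest frame, if one is pending.
    std::optional<RawInputFrame> TakeLatest() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::optional<RawInputFrame> out = std::move(latest_);
        latest_.reset();
        return out;
    }

    std::uint64_t Overwritten() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return overwritten_;
    }

private:
    mutable std::mutex mutex_;
    std::optional<RawInputFrame> latest_;
    std::uint64_t overwritten_ = 0;
};
```

The callback does no format conversion here; decode and upload stay with whoever calls `TakeLatest()` on the render-owned path.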
### Preview And Screenshot Must Stay Secondary
Preview is useful, but DeckLink output is the real-time path.
Screenshot and preview share GL resources and can block or queue work on the same render thread.
Lesson:
- preview should be skipped when output is under pressure
- screenshot capture should be treated as disruptive unless proven otherwise
- forced preview/screenshot should be visible in telemetry
## Telemetry Lessons
The most useful telemetry has been the kind that separates timing domains:
- output render queue wait
- render/draw time
- readback queue time
- readback fence/map/copy time
- app ready/completed queue depth
- system-memory free/rendering/completed/scheduled counts
- actual DeckLink buffered-frame count
- DeckLink schedule-call time/failures
- late/drop completion counts
Lesson:
- averages are not enough
- timing spikes matter more than steady low values
- count ownership states, not just queue depth
- keep experiment logs short and evidence-based
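The point that averages hide spikes can be sketched with a small accumulator that tracks the worst sample and the over-budget count alongside the mean. `TimingStat` is a hypothetical helper, not part of the app's telemetry module.

```cpp
#include <cstdint>

// Hypothetical per-interval timing accumulator, in milliseconds.
struct TimingStat {
    std::uint64_t count = 0;
    std::uint64_t overBudget = 0;
    double sumMs = 0.0;
    double maxMs = 0.0;

    // Record one sample against a per-frame budget.
    void Record(double ms, double budgetMs) {
        ++count;
        sumMs += ms;
        if (ms > maxMs) {
            maxMs = ms;
        }
        if (ms > budgetMs) {
            ++overBudget;
        }
    }

    double AverageMs() const {
        return count == 0 ? 0.0 : sumMs / static_cast<double>(count);
    }
};
```

With samples of 2 ms, 2 ms, and 14 ms against an 8 ms budget, the average is 6 ms and looks healthy, while `maxMs` and `overBudget` expose the one sample that would have drained headroom.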
## Current Architectural Direction
The current direction is still sound:
```text
Render cadence loop
renders at selected output cadence
writes completed system-memory frames into a bounded FIFO reserve
never sprints to refill DeckLink
Frame store
owns free / rendering / completed / scheduled slots
recycles unscheduled completed frames when needed
protects scheduled frames until completion
DeckLink playout scheduler
consumes completed frames
tops up actual device buffer
never renders
Completion callback
releases scheduled slots
records completion result
wakes scheduler
```
## Rewrite Lesson
A full restart is not obviously the right next move.
The current repo now contains:
- working runtime/control architecture
- useful phase docs
- non-GL tests around key state machines
- real telemetry
- a clearer understanding of DeckLink and OpenGL timing
The better next step is likely a contained "V2 spine" inside the current app:
- harden the render cadence loop
- harden the frame store
- separate DeckLink scheduling
- demote preview/screenshot/input upload below output cadence
- delete old compatibility branches as they become unnecessary
A full rewrite becomes attractive only if the current GL ownership model cannot be made deterministic without excessive surgery, or if the project switches rendering API.
## Practical Rules Going Forward
- One timing authority per domain.
- Render cadence is time-driven, not completion-driven.
- DeckLink scheduling is device-buffer-driven, not render-driven.
- Completion callbacks release and report; they do not render.
- System-memory completed frames are a bounded FIFO reserve.
- Scheduled frames are protected until DeckLink completion.
- Startup uses real rendered warmup/preroll.
- Black fallback is degraded/error behavior, not steady-state behavior.
- Output render has priority over preview, screenshot, and bulk input upload.
- Measure before adding recovery branches.

@@ -1,6 +1,6 @@
# Forking The Render Cadence Base
This note captures the fork-readiness review for using this repository as a base where the render cadence and video I/O work are kept, but the GPU content being rendered is replaced in a separate repository.
This note captures the current fork-readiness state for using this repository as a base where the render cadence and video I/O work are kept, but the GPU content being rendered is replaced in a separate repository.
## Verdict
@@ -76,7 +76,6 @@ Before cutting a long-lived fork, fix or decide these items:
- Make `config/runtime-host.json` portable. Current checked-in defaults include a local NDI source name and DeckLink output.
- Decide whether the fork keeps the Slang shader package contract. If not, replace `ShaderRuntimeContentController`, retire or clearly isolate `shaders/SHADER_CONTRACT.md`, shader package UI, and shader manifest tests.
- For a D3D engine fork, decide whether to bridge D3D output into the existing GL/readback path temporarily or replace `RenderThread`/readback with a D3D-native publisher that still writes completed `SystemFrameExchange` frames.
- Mark older docs that reference `apps/LoopThroughWithOpenGLCompositing` as historical, or update them to point at the current `src/` implementation.
- Keep `runtime/` generated output ignored, and keep only `runtime/templates/` plus `runtime/README.md` tracked.
- Keep the private SDK bundle as a submodule only if the new repo is intended for the same org/build environment. External forks should use ignored `3rdParty/` or explicit CMake SDK paths.

@@ -1,582 +0,0 @@
# New Render Cadence App Plan
Status: historical implementation plan. `apps/RenderCadenceCompositor` now exists; use [apps/RenderCadenceCompositor/README.md](../apps/RenderCadenceCompositor/README.md) and [Render Cadence Golden Rules](RENDER_CADENCE_GOLDEN_RULES.md) as the current implementation contract.
This plan describes a new application folder that rebuilds the output path from the proven `DeckLinkRenderCadenceProbe` architecture, but as a maintainable app foundation rather than a monolithic probe file.
The first goal is not to port the current compositor feature set. The first goal is to reproduce the probe's smooth 59.94/60 fps DeckLink output with clean module boundaries, tests where possible, and a structure that can later accept the shader/runtime/control systems without compromising timing.
## Working Name
Suggested folder:
```text
apps/RenderCadenceCompositor
```
Suggested executable:
```text
RenderCadenceCompositor
```
The existing app remains intact:
```text
apps/LoopThroughWithOpenGLCompositing
```
The probe remains the control sample:
```text
apps/DeckLinkRenderCadenceProbe
```
## Design Principle
The app is built around one spine:
```text
Render cadence thread
-> owns GL context
-> renders at selected frame cadence
-> performs async BGRA8 readback
-> publishes completed system-memory frames
System frame exchange
-> owns Free / Rendering / Completed / Scheduled slots
-> bounded FIFO reserve for completed unscheduled frames
-> protects scheduled frames until DeckLink completion
DeckLink output thread
-> consumes completed frames
-> schedules to target buffer depth
-> releases scheduled frames on completion
-> never renders
```
Everything else must fit around that spine.
## Non-Negotiable Rules
- The render thread owns its GL context from initialization to shutdown.
- The render thread is driven by selected render cadence, not DeckLink demand.
- DeckLink scheduling never calls render code.
- Completion callbacks never render.
- No synchronous render request exists in the output path.
- Preview, screenshot, input upload, shader rebuild, and runtime control cannot run ahead of a due output frame.
- Completed unscheduled frames are a bounded FIFO reserve; overflow drops are counted separately from DeckLink drops.
- Scheduled frames are protected until DeckLink completion.
- Startup warms up real rendered frames before scheduled playback starts.
## Borrow From The Probe
Keep these behaviors from `DeckLinkRenderCadenceProbe`:
- hidden OpenGL context owned by the render thread
- simple render loop with `nextRenderTime`
- BGRA8 render target
- PBO ring readback
- non-blocking fence polling with zero timeout
- system-memory slots with `Free`, `Rendering`, `Completed`, `Scheduled`
- preserve completed frames waiting for playout; drop/count the oldest completed frame only if the bounded reserve overflows
- DeckLink playout thread only schedules completed frames
- warmup completed frames before `StartScheduledPlayback()`
- one-line-per-second timing telemetry
## Do Not Borrow Directly
The probe is deliberately compact. Do not carry over these probe limitations into the new app:
- one huge `.cpp` file
- hard-coded output mode as permanent behavior
- render pattern, frame store, PBO logic, DeckLink playout, COM setup, and telemetry mixed together
- no reusable interfaces
- no unit-testable non-GL core
## Proposed Folder Structure
```text
apps/RenderCadenceCompositor/
README.md
RenderCadenceCompositor.cpp
app/
RenderCadenceApp.cpp
RenderCadenceApp.h
AppConfig.cpp
AppConfig.h
AppConfigProvider.cpp
AppConfigProvider.h
control/
HttpControlServer.cpp
HttpControlServer.h
RuntimeStateJson.h
platform/
ComInit.cpp
ComInit.h
HiddenGlWindow.cpp
HiddenGlWindow.h
Win32Console.cpp
Win32Console.h
render/
thread/
RenderThread.cpp
RenderThread.h
RenderCadenceClock.cpp
RenderCadenceClock.h
SimpleMotionRenderer.cpp
SimpleMotionRenderer.h
readback/
Bgra8ReadbackPipeline.cpp
Bgra8ReadbackPipeline.h
PboReadbackRing.cpp
PboReadbackRing.h
frames/
SystemFrameExchange.cpp
SystemFrameExchange.h
SystemFrameTypes.h
video/
DeckLinkOutput.cpp
DeckLinkOutput.h
DeckLinkOutputThread.cpp
DeckLinkOutputThread.h
telemetry/
CadenceTelemetry.cpp
CadenceTelemetry.h
CadenceTelemetryJson.h
TelemetryHealthMonitor.h
logging/
Logger.cpp
Logger.h
json/
JsonWriter.cpp
JsonWriter.h
```
The new app can reuse selected existing source files from the current app at first:
- `videoio/decklink/DeckLinkSession.*`
- `videoio/decklink/DeckLinkDisplayMode.*`
- `videoio/decklink/DeckLinkVideoIOFormat.*`
- `videoio/decklink/DeckLinkFrameTransfer.*`
- `videoio/VideoIOFormat.*`
- `videoio/VideoIOTypes.h`
- `videoio/VideoPlayoutScheduler.*`
- `gl/renderer/GLExtensions.*`
Longer term, shared code should move into common libraries, but the first version can link these files directly to avoid a big build-system refactor.
## Module Responsibilities
### `RenderCadenceApp`
Owns top-level startup/shutdown sequencing.
Responsibilities:
- initialize COM
- discover/select DeckLink output
- create frame exchange
- start render thread
- wait for completed-frame warmup
- start DeckLink output thread
- wait for scheduled buffer warmup
- start DeckLink scheduled playback
- start telemetry printer
- stop in reverse order
It should not contain OpenGL drawing code, frame slot policy, or DeckLink scheduling loops.
### `AppConfig`
Owns runtime settings for the initial app.
Initial settings:
- output mode preference
- output width/height validation
- frame buffer capacity
- PBO depth
- warmup completed-frame count
- target DeckLink scheduled depth
- telemetry interval
Initial values should match the successful probe:
```text
systemFrameSlots = 12
pboDepth = 6
warmupFrames = 4
targetDeckLinkBufferedFrames = 4
pixelFormat = BGRA8
```
### `HiddenGlWindow`
Owns hidden Win32 window, device context, and OpenGL context creation.
Responsibilities:
- create hidden window with `CS_OWNDC`
- choose/set pixel format
- create `HGLRC`
- expose `MakeCurrent()` and `ClearCurrent()`
- destroy context/window safely
Only `RenderThread` should call `MakeCurrent()` after startup.
### `RenderThread`
Owns the render loop and GL context for its full lifetime.
Responsibilities:
- create/bind hidden GL context
- resolve GL extensions
- initialize renderer/readback pipeline
- run cadence loop
- render one frame when due
- queue PBO readback
- consume completed PBOs into `SystemFrameExchange`
- record telemetry
- destroy GL resources on the render thread
It must not:
- wait for DeckLink
- schedule DeckLink frames
- block waiting for a system frame slot when a completed unscheduled frame can be dropped instead
- accept arbitrary GL tasks ahead of output frames
### `RenderCadenceClock`
Small, testable cadence helper.
Responsibilities:
- track target frame duration
- return whether a render is due
- compute sleep duration
- detect overrun/skipped ticks
- never speed up to fill buffers
This should be unit tested without GL.
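A minimal sketch of such a helper follows, assuming a positive frame duration and a caller that passes in the current time so the class can be tested with synthetic timestamps. The class name, method names, and the skip policy (any tick at or before the render's finish time is skipped and counted) are illustrative assumptions, not the app's actual `RenderCadenceClock` API.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical sketch: names and skip policy are assumptions, not the app's API.
class CadenceClockSketch {
public:
    using Clock = std::chrono::steady_clock;
    using Duration = Clock::duration;
    using TimePoint = Clock::time_point;

    // frameDuration is assumed to be greater than zero.
    CadenceClockSketch(Duration frameDuration, TimePoint start)
        : mFrameDuration(frameDuration), mNextRenderTime(start) {}

    // True once the next render tick has arrived.
    bool IsDue(TimePoint now) const { return now >= mNextRenderTime; }

    // Time the caller may sleep before the next tick (zero if already due).
    Duration SleepDuration(TimePoint now) const {
        return IsDue(now) ? Duration::zero() : mNextRenderTime - now;
    }

    // Call after rendering a due frame. Advances by one frame duration.
    // Ticks that have already passed are skipped and counted, so the loop
    // never renders back-to-back frames to catch up.
    void MarkRendered(TimePoint now) {
        mNextRenderTime += mFrameDuration;
        if (now >= mNextRenderTime) {
            const auto missed = (now - mNextRenderTime) / mFrameDuration + 1;
            mNextRenderTime += mFrameDuration * missed;
            mSkippedTicks += static_cast<std::uint64_t>(missed);
            ++mOverruns;
        }
    }

    std::uint64_t SkippedTicks() const { return mSkippedTicks; }
    std::uint64_t Overruns() const { return mOverruns; }

private:
    Duration mFrameDuration;
    TimePoint mNextRenderTime;
    std::uint64_t mSkippedTicks = 0;
    std::uint64_t mOverruns = 0;
};
```

Because time is injected rather than read inside the class, a test can step through an overrun scenario without sleeping or creating a GL context.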
### `SimpleMotionRenderer`
First renderer only.
Responsibilities:
- render obvious smooth motion and color changes
- produce BGRA8-compatible framebuffer content
- make dropped/repeated frames visually obvious
This intentionally avoids shader-package/runtime complexity.
### `Bgra8ReadbackPipeline`
Owns output framebuffer and BGRA8 readback orchestration.
Responsibilities:
- configure render target dimensions
- render into an RGBA8/BGRA-compatible texture
- coordinate `PboReadbackRing`
- publish completed frames into `SystemFrameExchange`
### `PboReadbackRing`
Owns PBO/fence state.
Responsibilities:
- queue readback into the next free PBO slot
- poll completed fences with zero timeout
- map/copy completed PBOs into provided system-memory slots
- count PBO misses
- clean up fences/PBOs on render thread
This is GL-backed, but the state model should be small and easy to reason about.
### `SystemFrameExchange`
The central handoff between render and video.
Responsibilities:
- own system-memory frame buffers
- track slot states: `Free`, `Rendering`, `Completed`, `Scheduled`
- provide `AcquireForRender()`
- provide `PublishCompleted()`
- provide `ConsumeCompletedForSchedule()`
- provide `ReleaseScheduledByBytes()`
- drop the oldest completed unscheduled frame when render needs a slot and no free slot exists
- expose metrics
This should be unit tested heavily.
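The slot lifecycle can be sketched without any pixel storage, which is also roughly the shape a unit test would exercise. This is a simplified illustration under stated assumptions: slots are identified by index rather than by buffer pointer (so `ReleaseScheduled` here stands in for `ReleaseScheduledByBytes()`), generation handles are omitted, and the class name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <vector>

enum class SlotState { Free, Rendering, Completed, Scheduled };

// Hypothetical sketch of the slot state machine; not the app's real class.
class FrameExchangeSketch {
public:
    explicit FrameExchangeSketch(std::size_t slotCount)
        : mStates(slotCount, SlotState::Free) {}

    // Never blocks: prefers a Free slot, otherwise recycles the oldest
    // Completed (unscheduled) slot and counts the drop. Scheduled slots
    // are never taken.
    std::optional<std::size_t> AcquireForRender() {
        std::lock_guard<std::mutex> lock(mMutex);
        for (std::size_t i = 0; i < mStates.size(); ++i) {
            if (mStates[i] == SlotState::Free) {
                mStates[i] = SlotState::Rendering;
                return i;
            }
        }
        if (!mCompletedFifo.empty()) {
            const std::size_t oldest = mCompletedFifo.front();
            mCompletedFifo.pop_front();
            mStates[oldest] = SlotState::Rendering;
            ++mCompletedDrops;
            return oldest;
        }
        ++mAcquireMisses;
        return std::nullopt;
    }

    bool PublishCompleted(std::size_t slot) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (slot >= mStates.size() || mStates[slot] != SlotState::Rendering) return false;
        mStates[slot] = SlotState::Completed;
        mCompletedFifo.push_back(slot);
        return true;
    }

    // Hands the oldest completed frame to the video side (FIFO order).
    std::optional<std::size_t> ConsumeCompletedForSchedule() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mCompletedFifo.empty()) return std::nullopt;
        const std::size_t slot = mCompletedFifo.front();
        mCompletedFifo.pop_front();
        mStates[slot] = SlotState::Scheduled;
        return slot;
    }

    // Called on DeckLink completion; only Scheduled slots can be released.
    bool ReleaseScheduled(std::size_t slot) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (slot >= mStates.size() || mStates[slot] != SlotState::Scheduled) return false;
        mStates[slot] = SlotState::Free;
        return true;
    }

    std::uint64_t CompletedDrops() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mCompletedDrops;
    }
    std::uint64_t AcquireMisses() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mAcquireMisses;
    }

private:
    mutable std::mutex mMutex;
    std::vector<SlotState> mStates;
    std::deque<std::size_t> mCompletedFifo;
    std::uint64_t mCompletedDrops = 0;
    std::uint64_t mAcquireMisses = 0;
};
```

Keeping the drop counter inside the exchange is what lets overflow drops be reported separately from DeckLink drops.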
### `DeckLinkOutput`
Thin wrapper around `DeckLinkSession` for output-only use.
Responsibilities:
- discover/select output mode
- configure output callback
- prepare output schedule
- schedule app-owned system-memory frames
- start scheduled playback
- stop/release resources
- expose actual DeckLink buffered count
No input support in the first version.
### `DeckLinkOutputThread`
Owns playout scheduling loop.
Responsibilities:
- keep scheduled depth near target
- consume completed frames from `SystemFrameExchange`
- schedule them through `DeckLinkOutput`
- release frame if scheduling fails
- sleep briefly when scheduled buffer is full or no completed frame exists
It must not render.
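One iteration of that loop might look like the sketch below. The hooks struct, the result enum, and the function name are hypothetical; in the real app the hooks would presumably be backed by `DeckLinkOutput` and `SystemFrameExchange`, whose actual signatures are not shown in this document.

```cpp
#include <cstddef>
#include <functional>
#include <optional>

// Hypothetical hooks so the step can be exercised without DeckLink hardware.
struct PlayoutStepHooks {
    std::function<std::size_t()> bufferedFrameCount;               // actual device buffered count
    std::function<std::optional<std::size_t>()> consumeCompleted;  // next completed slot, if any
    std::function<bool(std::size_t)> schedule;                     // false when scheduling fails
    std::function<void(std::size_t)> release;                      // return slot to the exchange
};

enum class PlayoutStepResult { Scheduled, BufferFull, NoCompletedFrame, ScheduleFailed };

// Schedules at most one frame. The caller sleeps briefly on BufferFull or
// NoCompletedFrame. No rendering is ever requested from here.
inline PlayoutStepResult RunPlayoutStep(const PlayoutStepHooks& hooks, std::size_t targetDepth) {
    if (hooks.bufferedFrameCount() >= targetDepth) {
        return PlayoutStepResult::BufferFull;
    }
    const std::optional<std::size_t> slot = hooks.consumeCompleted();
    if (!slot) {
        return PlayoutStepResult::NoCompletedFrame;
    }
    if (!hooks.schedule(*slot)) {
        hooks.release(*slot);  // do not leak the slot when scheduling fails
        return PlayoutStepResult::ScheduleFailed;
    }
    return PlayoutStepResult::Scheduled;
}
```

Expressing the step as a pure function over hooks keeps the thread loop itself trivial and lets the scheduling policy be tested with synthetic frames.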
### `CadenceTelemetry`
Owns counters, not policy.
Initial counters:
- rendered frames
- completed readback frames
- scheduled frames
- completion count
- completed-frame drops
- acquire misses
- schedule underruns
- PBO queue misses
- DeckLink late count
- DeckLink dropped count
- free/rendering/completed/scheduled slot counts
- actual DeckLink buffered frames
### `TelemetryHealthMonitor`
Samples cadence telemetry once per interval and logs only health events.
Normal telemetry is available through the HTTP state endpoint. The console should not receive a healthy once-per-second cadence line.
Health events:
- warning when DeckLink late/dropped-frame counters increase
- warning when schedule failures increase
- error when app/DeckLink output buffering is starved
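The health events above amount to comparing two consecutive telemetry samples. The sketch below illustrates that comparison; the struct fields, the function name, and in particular the definition of "starved" (no completed frames in the app and no frames buffered in DeckLink) are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical snapshot of the counters the monitor samples each interval.
struct CadenceSampleSketch {
    std::uint64_t deckLinkLate = 0;
    std::uint64_t deckLinkDropped = 0;
    std::uint64_t scheduleFailures = 0;
    std::uint32_t completedFrames = 0;         // completed, unscheduled frames held by the app
    std::uint32_t deckLinkBufferedFrames = 0;  // frames currently buffered by DeckLink
};

enum class HealthSeverity { Warning, Error };

struct HealthEvent {
    HealthSeverity severity;
    std::string message;
};

// Returns events only when something changed for the worse, so a healthy
// interval produces no console output.
inline std::vector<HealthEvent> EvaluateHealth(const CadenceSampleSketch& previous,
                                               const CadenceSampleSketch& current) {
    std::vector<HealthEvent> events;
    if (current.deckLinkLate > previous.deckLinkLate ||
        current.deckLinkDropped > previous.deckLinkDropped) {
        events.push_back({HealthSeverity::Warning, "DeckLink late/dropped frames increased"});
    }
    if (current.scheduleFailures > previous.scheduleFailures) {
        events.push_back({HealthSeverity::Warning, "schedule failures increased"});
    }
    // Assumed starvation rule: nothing completed in the app and nothing buffered.
    if (current.completedFrames == 0 && current.deckLinkBufferedFrames == 0) {
        events.push_back({HealthSeverity::Error, "output buffering starved"});
    }
    return events;
}
```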
## Startup Sequence
Target first-version startup:
```text
main
-> load AppConfig through AppConfigProvider
-> initialize COM
-> create SystemFrameExchange
-> start RenderThread
-> wait for completed frame warmup
-> optionally discover/select/configure DeckLink output
-> if DeckLink is available:
-> start DeckLinkOutputThread
-> wait for scheduled depth warmup
-> DeckLinkOutput start scheduled playback
-> if DeckLink is unavailable:
-> continue without video output
-> start TelemetryHealthMonitor
-> start HttpControlServer
-> wait for Enter
```
Shutdown:
```text
stop HttpControlServer
stop TelemetryHealthMonitor
stop DeckLinkOutputThread
DeckLinkOutput stop playback
stop RenderThread
DeckLinkOutput release resources
release COM
```
## First Milestone: Modular Probe Equivalent
This is the only goal for the initial implementation.
Feature set:
- console app
- output-only DeckLink
- no input
- hidden GL context
- simple motion renderer
- BGRA8 only
- PBO async readback
- bounded FIFO system-memory frame exchange
- warmup before playback
- one-line telemetry
Acceptance:
- visible DeckLink output is smooth
- `renderFps` near selected cadence
- `scheduleFps` near selected cadence
- scheduled count and DeckLink buffered count stable around 4
- no continuously increasing late/drop counts
- no continuous PBO misses
- behavior matches or exceeds `DeckLinkRenderCadenceProbe`
## Second Milestone: Testable Core
Before porting compositor features, add tests for non-GL/non-DeckLink pieces.
Test targets:
- `SystemFrameExchangeTests`
- `RenderCadenceClockTests`
- `CadenceTelemetryTests`
Important cases:
- slot lifecycle transitions
- scheduled slots are protected
- completed unscheduled frames can be dropped
- stale handles/generations are rejected
- cadence does not speed up to refill buffers
- cadence records overrun/skipped ticks
## Third Milestone: Replace Simple Renderer With Render Interface
Add an interface around frame rendering:
```text
IRenderScene
-> InitializeGl()
-> RenderFrame(frameIndex, time)
-> ShutdownGl()
```
The first implementation remains `SimpleMotionRenderer`.
This creates the insertion point for shader-package rendering later without changing timing/scheduling.
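In C++ the interface could be expressed roughly as follows. The `Sketch` suffixes, the parameter types, the `bool` return from `InitializeGl()`, and the `RunFrames` driver are assumptions for illustration; the counting scene is a GL-free stand-in, not `SimpleMotionRenderer`.

```cpp
#include <cstdint>

// Hypothetical C++ shape of the IRenderScene outline above.
class IRenderSceneSketch {
public:
    virtual ~IRenderSceneSketch() = default;
    virtual bool InitializeGl() = 0;  // called on the render thread with the context bound
    virtual void RenderFrame(std::uint64_t frameIndex, double timeSeconds) = 0;
    virtual void ShutdownGl() = 0;    // called on the render thread before the context is released
};

// GL-free stand-in that only records calls; useful for exercising the loop.
class CountingSceneSketch final : public IRenderSceneSketch {
public:
    bool InitializeGl() override { initialized = true; return true; }
    void RenderFrame(std::uint64_t frameIndex, double timeSeconds) override {
        lastFrameIndex = frameIndex;
        lastTimeSeconds = timeSeconds;
        ++framesRendered;
    }
    void ShutdownGl() override { initialized = false; }

    bool initialized = false;
    std::uint64_t framesRendered = 0;
    std::uint64_t lastFrameIndex = 0;
    double lastTimeSeconds = 0.0;
};

// Drives a scene for a fixed number of frames. Time is derived from the
// frame index, so scene animation follows the cadence rather than wall time.
inline bool RunFrames(IRenderSceneSketch& scene, std::uint64_t frameCount,
                      double frameDurationSeconds) {
    if (!scene.InitializeGl()) return false;
    for (std::uint64_t i = 0; i < frameCount; ++i) {
        scene.RenderFrame(i, static_cast<double>(i) * frameDurationSeconds);
    }
    scene.ShutdownGl();
    return true;
}
```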
## Fourth Milestone: Begin Porting Current App Features
Port only after the modular probe equivalent is stable.
Suggested order:
1. shader package compile/load
2. render pass/layer stack drawing
3. runtime snapshot input to renderer
4. live state overlays
5. control services
6. persistence/runtime store
7. preview from system-memory frames
8. screenshot from system-memory frames
9. input capture via CPU latest-frame mailbox
Each port must preserve the rule that the render thread cadence is primary.
## What Not To Port Early
Do not port these until the output spine is proven:
- DeckLink input
- preview GL presentation
- screenshot GL readback
- HTTP/OSC control services
- shader hot reload
- persistence
- runtime state JSON/open API
- complex telemetry/event dispatch
These are useful, but they are exactly the kinds of features that can accidentally reintroduce timing coupling.
## Build Plan
Initial CMake can follow the probe pattern:
```cmake
set(RENDER_CADENCE_APP_DIR "${CMAKE_CURRENT_SOURCE_DIR}/apps/RenderCadenceCompositor")
add_executable(RenderCadenceCompositor
# selected shared DeckLink/video/gl support files
# new modular app files
)
```
Later, shared source should be split into libraries:
```text
video_shader_decklink
video_shader_videoio
video_shader_gl_support
render_cadence_core
```
Avoid doing that library split before the first modular app works.
## VS Code Launch
Add a separate launch profile:
```text
Debug RenderCadenceCompositor
```
Run it as a console app so telemetry remains visible.
## Documentation
Add:
```text
apps/RenderCadenceCompositor/README.md
```
The README should record:
- intended architecture
- build/run instructions
- expected telemetry
- test result notes
- differences from the old app
- differences from the probe
## Success Criteria Before Porting More Features
Do not start feature porting until the new app can run with:
- stable smooth DeckLink output
- stable target scheduled depth
- stable actual DeckLink buffered count
- no regular visible freezes
- no steady PBO misses
- no steadily increasing late/dropped completions
- focus/minimize changes do not affect output cadence
- clean shutdown without hangs
This gives us a clean foundation. Once this is true, every feature added later has to prove it does not damage the spine.


@@ -4,10 +4,12 @@ These are the non-negotiable rules for the new render-cadence architecture.
They exist because the old app drifted into a place where DeckLink timing, render work, shader build work, state coordination, readback, and recovery behavior all influenced each other. The new app should stay boring, explicit, and easy to reason about.
## 1. The Render Thread Owns Its GL Context
## 1. The Render Thread Owns Its Render Context
Only the render thread may bind and use its primary OpenGL context.
For a fork that replaces the renderer with D3D or another graphics API, read this rule as the same ownership contract: the cadence/render thread owns the primary render device/context path, and non-render services exchange prepared data with it rather than calling rendering APIs themselves.
Allowed on the render thread:
- GL resource creation and destruction for resources it owns


@@ -1,448 +0,0 @@
# Render Thread Ownership Plan
This plan describes how to make the main compositor behave like the successful `DeckLinkRenderCadenceProbe`: one render cadence owner, one GL context owner, no unrelated work able to interrupt output frame production.
The goal is not just "all GL calls happen on one thread". The current app mostly does that during runtime already. The real goal is:
- the output render thread owns its GL context for its whole lifetime
- output cadence is driven by the render thread, not by DeckLink completion timing
- non-output GL work cannot sit ahead of output frames
- callers cannot block the render thread while waiting for synchronous answers
- DeckLink scheduling consumes completed system-memory frames and never causes rendering
## Current Risk Points
The current main app still has several ways to interrupt output cadence.
### Shared GL Executor
`RenderEngine` owns the GL context during runtime, but it acts as a general task executor.
The same queue/path can run:
- output frame render
- input upload
- preview present
- screenshot capture
- render resets
- shader/program commits
- resource resize
- state clearing
That means output frames are not guaranteed to be the next GL work item at the selected frame time.
### Synchronous Output Render Request
`VideoBackend` drives output production from its output producer thread, then calls:
```text
VideoBackend
-> OpenGLVideoIOBridge::RenderScheduledFrame
-> RenderEngine::RequestOutputFrame
-> TryInvokeOnRenderThread
```
That makes output production a request/response interaction. The producer waits for the render thread, and the render thread is still shared with other work.
### Input Upload Shares Output Context
DeckLink input capture currently flows into:
```text
VideoBackend::HandleInputFrame
-> OpenGLVideoIOBridge::UploadInputFrame
-> RenderEngine::QueueInputFrame
-> render thread upload
```
Even with coalescing, input upload can consume render-thread time and GPU bandwidth directly before output rendering.
### Preview And Screenshot Share Output Context
Preview and screenshot are lower-priority features, but today they still execute on the render thread.
Preview is best-effort at the caller side, but once queued it can still occupy the same context. Screenshot capture can be more expensive because it performs readback and CPU-side image preparation.
### Startup Context Ownership Is Transitional
The Win32 startup path creates and binds the GL context before `RenderEngine::StartRenderThread()`.
That is acceptable as a transitional state, but the final model should make context ownership explicit:
- bootstrap thread creates the window/context
- bootstrap thread releases it
- render thread binds it
- only render thread initializes GL resources
- only render thread destroys GL resources
### Render Callback Re-enters App State
`OpenGLRenderPipeline::RenderFrame()` calls a callback into `OpenGLComposite::renderEffect()`.
That callback builds `RenderFrameInput`, resolves frame state, drains runtime live state, and then calls back into `RenderEngine` to draw the prepared frame.
This works, but it means the output render path still reaches up into app/runtime code at frame time.
## Target Runtime Shape
The main app should match this ownership model:
```text
runtime/control threads
-> publish snapshots, live overlays, reset requests, shader-build results
-> never call GL
render cadence thread
-> sole owner of output GL context
-> wakes at selected render cadence
-> samples latest render input/state
-> renders one frame
-> queues async readback/copies completed readback into system-memory slot
-> publishes completed frame to bounded FIFO output reserve
video output thread
-> consumes completed system-memory frames
-> schedules DeckLink frames to target buffer depth
-> processes completion results
-> never calls GL
optional input upload path
-> writes latest input frame into CPU-side latest-frame buffer
-> render thread imports/uploads at a controlled point in its frame
preview/screenshot path
-> consumes already-rendered output/system-memory frame when possible
-> never interrupts output render cadence
```
## Non-Negotiable Rules
- The render thread never waits for DeckLink.
- DeckLink callbacks never render.
- Runtime/control threads never directly execute GL.
- Preview and screenshot never execute ahead of output frames.
- Input upload is never a separate urgent GL task ahead of output render.
- Shader/resource commits are applied only at a frame boundary.
- Telemetry on the hot path must be lock-light or try-lock only.
- The render thread cadence does not speed up to refill buffers.
- If output work overruns, the render thread records the overrun and resumes the selected cadence policy.
## Implementation Plan
### 1. Add Thread/Context Ownership Guards
Add explicit render-thread ownership checks around all GL entry points.
Deliverables:
- `RenderEngine` exposes `IsOnRenderThread()` for assertions/tests.
- GL-facing classes get debug-only owner checks where practical.
- wrong-thread GL access becomes a counted telemetry warning, not just `OutputDebugStringA`.
- tests cover that public request methods do not execute GL directly.
Acceptance:
- every `RenderEngine` public method is classified as either request-only, lifecycle-only, or render-thread-only.
- render-thread-only methods are private or guarded.
- no normal runtime caller can accidentally invoke GL work inline.
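A guard of this kind can be very small. The sketch below assumes the render thread calls a bind method once at startup, before any other thread consults the guard; the class and method names other than `IsOnRenderThread()` are hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical ownership guard; BindToCurrentThread() is assumed to run once
// on the render thread before any other thread calls the check methods.
class RenderThreadGuardSketch {
public:
    void BindToCurrentThread() { mOwner = std::this_thread::get_id(); }

    bool IsOnRenderThread() const { return std::this_thread::get_id() == mOwner; }

    // Use at GL entry points: returns false and counts the violation instead
    // of only writing a debug string.
    bool CheckGlAccess() {
        if (IsOnRenderThread()) return true;
        ++mWrongThreadAccessCount;
        return false;
    }

    std::uint64_t WrongThreadAccessCount() const { return mWrongThreadAccessCount.load(); }

private:
    std::thread::id mOwner;  // default value matches no running thread
    std::atomic<std::uint64_t> mWrongThreadAccessCount{0};
};
```

The counter can then be surfaced through telemetry so wrong-thread access shows up as a health signal rather than a silent log line.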
### 2. Move GL Initialization Fully Onto The Render Thread
Start the render thread before compiling shaders and initializing GL resources.
Current startup does:
```text
InitOpenGLState()
-> CompileDecodeShader
-> CompileOutputPackShader
-> InitializeResources
-> CompileLayerPrograms
StartRenderThread()
```
Move toward:
```text
create context on Win32 thread
release context on Win32 thread
StartRenderThread()
render thread binds context
render thread initializes extensions, shaders, resources
```
Deliverables:
- a single `RenderEngine::StartAndInitialize(RenderInitializationConfig)` path.
- GL extension resolution happens on the render thread.
- shader/resource initialization is a render-thread startup phase.
- `RenderEngine` destructor only destroys resources on the render thread.
Acceptance:
- after `StartRenderThread()`, no non-render thread binds or uses the app GL context.
- shutdown order is deterministic: stop video output, stop render cadence, destroy GL resources, release context.
### 3. Replace Synchronous Output Render Requests With Render-Owned Cadence
Move output cadence out of `VideoBackend` and into the render system.
Current:
```text
VideoBackend output producer
-> cadence tick
-> acquire output slot
-> synchronous render-thread request
```
Target:
```text
RenderEngine output cadence loop
-> cadence tick
-> acquire/free output slot through a non-blocking frame-sink interface
-> render frame
-> publish completed frame
```
Deliverables:
- introduce `RenderedFrameSink` or similar interface owned by video output.
- render thread pulls/claims a free system-memory slot without waiting.
- if no free slot exists, render thread drops/recycles the oldest unscheduled completed frame or records backpressure without blocking.
- remove `RenderEngine::RequestOutputFrame()` from the steady-state output path.
Acceptance:
- output rendering continues even if DeckLink completion is delayed.
- no `std::future` wait exists in the output cadence path.
- `VideoBackend` no longer owns the producer render loop; it owns scheduling/completion only.
### 4. Make The Render Thread A Frame Loop, Not A Task Queue
Keep a command mailbox, but process it only at safe frame-boundary points.
Frame loop:
```text
while running:
wait until next render timestamp
apply bounded frame-boundary commands
sample latest frame input/state
upload latest input frame if enabled and budget allows
render output frame
queue/consume readback
publish completed frame
record timings
```
Command classes:
- frame-boundary commands: reset temporal history, reset shader feedback, commit prepared shader programs
- background/low-priority commands: preview, screenshot, diagnostic readback
- non-GL commands: state publication, telemetry, persistence
Deliverables:
- replace FIFO render task queue with a priority/mailbox model.
- output cadence is the loop's main clock.
- commands have budget classes and max work per frame.
- long commands are deferred rather than blocking the current output tick.
Acceptance:
- preview/screenshot cannot run immediately before a due output frame.
- reset/shader work is applied between frames and measured.
- output render starts within a small jitter window when the GPU is not overrun.
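The frame-boundary mailbox idea can be sketched as a queue that is drained only at one point in the loop and only up to a fixed budget. This illustration uses a simple per-frame command count as the budget and a hypothetical class name; a real implementation might budget by time or by command class instead.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical mailbox: other threads post commands, the render thread
// drains a bounded number of them at each frame boundary.
class FrameBoundaryMailboxSketch {
public:
    using Command = std::function<void()>;

    void Post(Command command) {
        std::lock_guard<std::mutex> lock(mMutex);
        mPending.push_back(std::move(command));
    }

    // Runs at most maxCommands; the rest wait for the next frame boundary.
    // Commands execute outside the lock so posters are never blocked by them.
    std::size_t DrainAtFrameBoundary(std::size_t maxCommands) {
        std::deque<Command> batch;
        {
            std::lock_guard<std::mutex> lock(mMutex);
            while (!mPending.empty() && batch.size() < maxCommands) {
                batch.push_back(std::move(mPending.front()));
                mPending.pop_front();
            }
        }
        for (Command& command : batch) {
            command();
        }
        return batch.size();
    }

    std::size_t PendingCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mPending.size();
    }

private:
    mutable std::mutex mMutex;
    std::deque<Command> mPending;
};
```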
### 5. Move Input Capture To A CPU Latest-Frame Buffer
Input capture should not enqueue independent GL upload tasks.
Target:
```text
DeckLink input callback
-> copy/coalesce latest CPU input frame
-> return quickly
render thread frame boundary
-> if input version changed, upload latest frame
-> render using last successfully uploaded input texture
```
Deliverables:
- introduce `InputFrameMailbox` with latest-frame semantics.
- remove `RenderEngine::QueueInputFrame()` from the callback path.
- render thread owns the upload moment.
- if upload would exceed budget, render thread can reuse the previous input texture and record an input-upload skip.
Acceptance:
- input capture enabled does not create arbitrary render-thread tasks.
- output cadence remains stable when input frames arrive.
- telemetry separates input-frame arrival, upload count, upload skips, and upload cost.
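Latest-frame semantics can be illustrated with a version counter: the capture callback overwrites a single CPU buffer, and the render thread copies it only when the version has changed. The class and method names below are assumptions, and a production mailbox would likely avoid the per-frame vector copy.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical latest-frame mailbox: newer frames overwrite older ones, so
// the render thread only ever sees the most recent input.
class InputFrameMailboxSketch {
public:
    // Called from the capture callback; copies and returns quickly.
    void Publish(const std::uint8_t* data, std::size_t size) {
        std::lock_guard<std::mutex> lock(mMutex);
        mLatest.assign(data, data + size);
        ++mVersion;
    }

    // Called by the render thread at a frame boundary. Copies the latest
    // frame into `out` only if it is newer than `lastSeenVersion`.
    bool TakeIfNewer(std::uint64_t& lastSeenVersion, std::vector<std::uint8_t>& out) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mVersion == lastSeenVersion) return false;
        out = mLatest;
        lastSeenVersion = mVersion;
        return true;
    }

private:
    std::mutex mMutex;
    std::vector<std::uint8_t> mLatest;
    std::uint64_t mVersion = 0;
};
```

When `TakeIfNewer` returns false, the render thread simply keeps using the previously uploaded input texture.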
### 6. Move Preview To A Consumer Path
Preview should consume the latest completed output image instead of asking the output GL context to present.
Options:
- CPU preview from latest system-memory output frame.
- a separate preview GL context fed asynchronously from completed frames.
- a low-priority render-thread blit only when output has measurable slack.
Recommended first step:
- use latest system-memory BGRA8 output for the window preview.
Deliverables:
- preview reads from latest completed/scheduled output frame copy.
- `TryPresentPreview()` no longer queues GL work on the output render thread.
- preview FPS throttling remains caller-side.
Acceptance:
- forcing preview cannot delay output rendering.
- minimizing/focusing the window does not affect output cadence.
### 7. Move Screenshot To Completed Frame Capture
Screenshot should capture from the latest completed output frame unless an explicit "exact render capture" mode is requested.
Deliverables:
- screenshot request reads the latest system-memory output frame.
- PNG write remains async.
- optional diagnostic exact-GL screenshot is disabled during live output or explicitly marked disruptive.
Acceptance:
- screenshot request does not call `glReadPixels` on the output render context during steady-state playout.
### 8. Make Shader Commits Frame-Boundary Work
Prepared shader builds are CPU/background work; GL program commit is still GL work.
Deliverables:
- shader build queue produces `PreparedShaderBuild`.
- render thread sees latest pending prepared build at a frame boundary.
- commit is applied only between frames.
- expensive commits can temporarily enter a measured "render reconfigure" state.
Acceptance:
- shader commits do not interleave midway through output render.
- output timing telemetry records commit duration separately from normal render duration.
### 9. Split Output Scheduling From Rendering Completely
`VideoBackend` should become a playout/scheduling owner, not a render producer.
Target:
```text
RenderEngine
-> produces completed frames at render cadence
VideoBackend
-> schedules completed frames up to target DeckLink depth
-> processes completions
-> releases scheduled slots
```
Deliverables:
- `VideoBackend` owns `SystemOutputFramePool`, or a new `SystemFrameExchange` owns it between render/video.
- render thread publishes completed frames into the exchange.
- video output thread schedules from the exchange.
- no render calls exist in completion handling or scheduling paths.
Acceptance:
- DeckLink buffer depth changes cannot directly cause render-thread wakeups except through non-blocking availability signals.
- render cadence can be tested without DeckLink by using a fake frame sink.
- video scheduling can be tested without GL by using synthetic frames.
### 10. Preserve The Probe As The Reference Contract
The `DeckLinkRenderCadenceProbe` is now the control sample.
Deliverables:
- document which main-app components correspond to the probe components.
- add a small regression checklist:
- render FPS near target
- schedule FPS near target
- DeckLink buffered frames stable
- no late or dropped frames
- no PBO misses or readback stalls
- focus/minimize does not change output cadence
Acceptance:
- after each migration step, compare the main app telemetry against the probe's known-good behavior.
## Suggested Order Of Work
1. Add ownership guards and classify render methods.
2. Move GL initialization/destruction fully onto the render thread.
3. Introduce a render-owned cadence loop behind a feature flag.
4. Add a frame-sink/exchange interface between render and video.
5. Move output production from `VideoBackend` to the render cadence loop.
6. Convert input upload to latest-frame mailbox semantics.
7. Move preview to completed-frame consumption.
8. Move screenshot to completed-frame capture.
9. Convert shader commits/resets to frame-boundary mailbox commands.
10. Remove old synchronous output render request path.
## Feature Flags During Migration
Use flags only to keep testing safe, not as long-term compatibility layers.
Suggested flags:
```text
VST_RENDER_CADENCE_OWNER=render_thread
VST_DISABLE_INPUT_CAPTURE=1
VST_PREVIEW_SOURCE=system_frame
VST_SCREENSHOT_SOURCE=system_frame
```
Remove each flag once the new behavior is proven and becomes the only supported path.
## Telemetry Needed
Add or preserve counters for:
- render tick jitter
- render tick overrun
- output render duration
- GL command mailbox depth by class
- frame-boundary command duration
- input upload duration and skips
- readback queue/consume duration
- completed system-memory frame depth
- scheduled DeckLink frame depth
- DeckLink actual buffered frames
- preview frames consumed
- screenshot requests served from system memory
The key metric is whether output render starts on time. Buffer depth alone is not enough; a full buffer can still contain stale or repeated frames.
## Completion Definition
This work is complete when:
- the output render thread owns the app GL context from initialization through shutdown
- output rendering is driven by the render thread's selected frame cadence
- no non-output task can run ahead of a due output frame
- `VideoBackend` never asks the render thread to render synchronously
- DeckLink scheduling consumes already completed system-memory frames
- input upload, preview, screenshot, shader commits, and resets are all frame-boundary, mailbox, or consumer-side operations
- main-app telemetry approaches the cadence probe behavior under the same output mode


@@ -1,589 +0,0 @@
# ControlServices Subsystem Design
This document expands the `ControlServices` subsystem described in [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md). It defines the target role of `ControlServices` as the ingress boundary for non-render control sources and the normalization layer that turns external input into typed internal actions.
The intent here is to make `ControlServices` explicit enough that later phases can extract it from the current `RuntimeServices` / `ControlServer` / `OscServer` mix without inventing new boundaries ad hoc.
## Purpose
`ControlServices` is the subsystem that accepts external control traffic and turns it into safe, typed, low-cost input for the rest of the app.
In the target architecture, `ControlServices` should:
- own ingress for OSC, HTTP/REST-style control routes, WebSocket session management, and file-watch/reload signals
- normalize transport-specific payloads into typed internal actions or events
- apply ingress-local buffering, coalescing, deduplication, and rate limiting where useful
- expose service timing and health observations to `HealthTelemetry`
- forward normalized actions into `RuntimeCoordinator`
It should not:
- decide persistence policy
- mutate persisted state directly
- build render snapshots
- own render-local overlay state
- own device timing or playout policy
This subsystem is intentionally narrow in authority and broad in transport coverage.
## Why This Subsystem Exists
Today the app already has a recognizable control-services slice, but it is spread across several classes:
- `RuntimeServices` hosts control server startup, OSC queues, deferred OSC commits, and file-watch polling:
- [RuntimeServices.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.h:26)
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:24)
- `ControlServer` owns HTTP, WebSocket upgrade, static asset serving, and direct callback-based route dispatch:
- [ControlServer.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/ControlServer.h:15)
- [ControlServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/ControlServer.cpp:88)
- `OscServer` owns UDP socket receive, OSC decode, and parameter callback dispatch:
- [OscServer.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/OscServer.h:11)
- [OscServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/OscServer.cpp:58)
The current shape works, but it mixes:
- transport handling
- action normalization
- direct callback dispatch
- coarse background polling
- transient queue ownership
- UI broadcast behavior
- partial runtime mutation coordination
That overlap is exactly what Phase 1 is trying to remove.
## Design Goals
`ControlServices` should optimize for:
- low-latency ingress without forcing immediate whole-app work
- clear transport boundaries
- deterministic normalization of external input
- isolation of service-specific timing concerns
- easy replacement of polling flows with typed events
- no direct knowledge of render-local implementation details
- safe behavior under bursty traffic such as high-rate OSC
## Subsystem Responsibilities
`ControlServices` owns the following concerns.
### 1. Transport Ingress
It accepts input from external control-facing sources such as:
- OSC/UDP parameter control
- HTTP API requests from the native control UI or external clients
- WebSocket connection lifecycle for state consumers
- file-watch triggers and manual reload requests
- future automation ingress such as MIDI, serial, or remote control bridges
The key rule is that transport-specific details stop here.
### 2. Action Normalization
Every ingress path should be converted into a typed internal action or event before it touches runtime policy.
Examples:
- OSC `/layer/param` traffic becomes `AutomationTargetReceived`
- `POST /api/layers/add` becomes `LayerAddRequested`
- `POST /api/reload` becomes `ShaderReloadRequested`
- file-watch changes become `RegistryChangedDetected` or `ReloadRequested`
The rest of the app should not need to know whether an action came from UDP, HTTP, the embedded UI, or a background watcher.
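As a rough illustration, the two HTTP examples above could be normalized into a `std::variant` of action types. The empty payload structs and the `NormalizeHttpRoute` function are hypothetical; real actions would carry parsed request data.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical typed actions; real ones would carry request payloads.
struct LayerAddRequested {};
struct ShaderReloadRequested {};

using ControlAction = std::variant<LayerAddRequested, ShaderReloadRequested>;

// Maps a transport-level request to a typed action. Unknown routes yield
// no action, so transport details never reach runtime policy.
inline std::optional<ControlAction> NormalizeHttpRoute(const std::string& method,
                                                       const std::string& path) {
    if (method != "POST") return std::nullopt;
    if (path == "/api/layers/add") return ControlAction{LayerAddRequested{}};
    if (path == "/api/reload") return ControlAction{ShaderReloadRequested{}};
    return std::nullopt;
}
```

Downstream code can then dispatch with `std::visit` without knowing which transport produced the action.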
### 3. Ingress-Local Buffering and Coalescing
`ControlServices` may maintain short-lived queues or coalesced maps when that is the correct place to absorb bursty input.
Examples:
- latest-value coalescing per OSC route
- pending reload edge detection
- bounded outbound state-broadcast requests
- short-lived delivery queues for already-classified follow-up work, as long as commit and persistence policy still belong to `RuntimeCoordinator`
This state is ingress-local and must not become a substitute for committed runtime state.
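Latest-value coalescing per OSC route could be sketched as follows. `OscRouteCoalescer` and its methods are hypothetical names, and a plain mutex is assumed to be cheap enough for the receive thread; this is not the existing `RuntimeServices` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical latest-value coalescer: one slot per OSC route.
class OscRouteCoalescer
{
public:
    // Called from the OSC receive thread. Overwrites any undrained value for the route.
    void Submit(const std::string& route, float value)
    {
        assert(!route.empty());
        std::lock_guard<std::mutex> lock(mMutex);
        const bool inserted = mLatest.insert_or_assign(route, value).second;
        if (!inserted)
            ++mCollapsedCount; // an older value was replaced before anyone consumed it
    }

    // Called once per consumer tick. Returns the pending values and leaves the map empty.
    std::vector<std::pair<std::string, float>> Drain()
    {
        std::unordered_map<std::string, float> pending;
        {
            std::lock_guard<std::mutex> lock(mMutex);
            pending.swap(mLatest);
        }
        return std::vector<std::pair<std::string, float>>(pending.begin(), pending.end());
    }

    // Suitable for reporting as a telemetry counter.
    std::size_t CollapsedCount() const
    {
        std::lock_guard<std::mutex> lock(mMutex);
        return mCollapsedCount;
    }

private:
    mutable std::mutex mMutex;
    std::unordered_map<std::string, float> mLatest;
    std::size_t mCollapsedCount = 0;
};
```

The map is emptied on every drain, so it never accumulates into a second copy of committed parameter state.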
### 4. WebSocket Session Management
The subsystem owns connection lifecycle for clients that observe runtime state, but it does not own the authoritative runtime model.
It is responsible for:
- accepting WebSocket upgrades
- tracking connected clients
- forwarding serialized state snapshots or health payloads produced elsewhere
- applying broadcast throttling or collapse policies when necessary
It is not responsible for deciding what the authoritative state is.
### 5. File-Watch and Reload Ingress
The subsystem should own the detection side of registry/file changes and reload requests.
It may:
- observe filesystem changes
- debounce bursts of related file events
- translate those changes into typed reload actions
It should not directly trigger render rebuilds or mutate shader/package state itself.
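One way to debounce bursts of file events before emitting a single reload action is a quiet-period check. `ReloadDebouncer` is a hypothetical helper, and the clock is passed in by the caller so the logic stays testable; how events are detected is out of scope here.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical debouncer: emits one reload signal after a burst of file events goes quiet.
class ReloadDebouncer
{
public:
    using Clock = std::chrono::steady_clock;

    explicit ReloadDebouncer(Clock::duration quietPeriod) : mQuietPeriod(quietPeriod)
    {
        assert(quietPeriod >= Clock::duration::zero());
    }

    // Records a raw filesystem event. Each new event restarts the quiet period.
    void NoteFileEvent(Clock::time_point now) { mLastEvent = now; }

    // Returns true exactly once per burst, after the quiet period has elapsed with no new events.
    bool ShouldEmitReload(Clock::time_point now)
    {
        if (!mLastEvent || now - *mLastEvent < mQuietPeriod)
            return false;
        mLastEvent.reset();
        return true;
    }

private:
    Clock::duration mQuietPeriod;
    std::optional<Clock::time_point> mLastEvent;
};
```

The debouncer only answers "is a reload signal due"; turning that into a typed action and deciding what to rebuild stays with the normalization layer and `RuntimeCoordinator`.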
### 6. Service Health and Timing Reporting
`ControlServices` should emit operational signals into `HealthTelemetry`, including:
- OSC packet rate
- OSC decode failures
- queue depth / coalesced route count
- dropped or collapsed ingress events
- HTTP error counts
- WebSocket connection count
- reload request frequency
- file-watch failures
- service-thread startup/shutdown errors
## Explicit Non-Responsibilities
The following must stay outside `ControlServices` in the target design.
### Persistence Decisions
The subsystem may report that an input requested a state change, but it should not decide whether that change is persisted.
That belongs to `RuntimeCoordinator`, with `RuntimeStore` and the later persistence writer carrying out durable writes when policy requests them.
### Render Snapshot Publication
`ControlServices` must not publish render-facing snapshots or poke render-local structures directly.
### Render-Local Overlay Ownership
Live OSC automation overlays belong to the live-state/render preparation boundary (`RuntimeLiveState` today). Temporal state, shader feedback, output staging, and other render-only transient state belong to `RenderEngine`.
`ControlServices` may ingest and coalesce automation targets, but it should not own how those targets are composed, committed, persisted, or applied inside the render domain.
### Hardware Timing or Playout Recovery
Device scheduling, queue headroom, and callback recovery belong to `VideoBackend`, not the control ingress path.
## Ingress Boundary Model
The clean boundary for `ControlServices` is:
- external transport in
- typed action/event out
That implies three layers inside the subsystem.
### Transport Adapters
These are protocol-facing components.
Examples:
- `OscIngress`
- `HttpControlIngress`
- `WebSocketSessionHost`
- `FileWatchIngress`
Responsibilities:
- socket/file watcher lifecycle
- protocol decoding
- request framing
- transport-level validation
- low-level authentication or origin checks later if added
### Normalization Layer
This layer translates decoded transport input into typed actions.
Responsibilities:
- route parsing
- payload type normalization
- parameter name/key resolution where that is purely syntactic
- conversion from transport-specific errors into typed ingress errors
This layer should not perform deep runtime mutation policy.
### Service Coordination Shell
This shell owns:
- startup/shutdown ordering for ingress services
- shared ingress-local queues
- service-thread lifecycle
- handing normalized actions to `RuntimeCoordinator`
- handing outbound snapshot payloads to WebSocket clients
This shell is the spiritual successor to the hosting part of current `RuntimeServices`, but with a much narrower responsibility set.
## Service Timing Concerns
`ControlServices` is the correct place to isolate transport-level timing concerns that should not leak into whole-app state policy.
### OSC Timing
Current behavior already points in the right direction:
- OSC receive is on its own thread in `OscServer`
- latest values are coalesced by route in `RuntimeServices`
- updates are applied once per render tick rather than per packet
Relevant code:
- [OscServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/OscServer.cpp:95)
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:65)
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:82)
Target rule:
- network receive and decode stay inside `ControlServices`
- coalescing policy stays inside `ControlServices`
- classification of the resulting action belongs to `RuntimeCoordinator`
- render-local application belongs to `RenderEngine`
This keeps high-rate ingress cheap without giving the service layer authority over render behavior or committed-state policy.
### HTTP / UI Timing
HTTP control requests are operator-facing and usually low-rate, but the UI can still generate bursts through slider drags or repeated parameter edits.
`ControlServices` should:
- normalize each request into a typed action
- allow collapse/throttle policies for purely observational outbound state pushes
- avoid synchronous full-state serialization on every ingress event where possible
It should not decide whether a request results in immediate, deferred, transient, or persisted mutation. That is a coordinator concern.
### WebSocket Broadcast Timing
Outbound state streaming is control-plane behavior, not core runtime ownership.
Current code already distinguishes immediate and requested broadcasts:
- [ControlServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/ControlServer.cpp:163)
- [ControlServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/ControlServer.cpp:170)
Target rule:
- `ControlServices` may own broadcast scheduling and collapse policy
- the source state payload should come from snapshot/telemetry producers, not from service-owned mutable state
### File-Watch Timing
Current file-watch and deferred OSC commit work run on a coarse poll loop:
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:194)
This is one of the cleaner migration opportunities in the whole app.
Target rule:
- file-watch detection belongs in `ControlServices`
- coarse polling should eventually be replaced with either event-driven watching or a narrower, typed background loop
- detected changes should be debounced and surfaced as typed reload-related actions
### Service Backpressure
`ControlServices` needs explicit backpressure rules for high-rate sources.
Recommended policies:
- coalesce latest-value automation by route
- bound per-service queues
- count and report dropped/coalesced events
- prefer collapsing observation work before collapsing operator mutations
- never let service queues become hidden durable state
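A bounded queue with drop accounting might look like the sketch below. `BoundedActionQueue` is hypothetical, and drop-oldest is an assumed policy that suits observation work better than operator mutations. The class is not thread-safe on its own; the caller is assumed to provide locking.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical bounded ingress queue: drops the oldest entry when full and counts the drop.
template <typename Action>
class BoundedActionQueue
{
public:
    explicit BoundedActionQueue(std::size_t capacity) : mCapacity(capacity)
    {
        assert(capacity > 0);
    }

    void Push(Action action)
    {
        if (mItems.size() == mCapacity)
        {
            mItems.pop_front(); // make room by discarding the oldest pending entry
            ++mDropped;
        }
        mItems.push_back(std::move(action));
    }

    // Moves the oldest entry into 'out'. Returns false when the queue is empty.
    bool TryPop(Action& out)
    {
        if (mItems.empty())
            return false;
        out = std::move(mItems.front());
        mItems.pop_front();
        return true;
    }

    std::size_t Size() const { return mItems.size(); }
    std::size_t DroppedCount() const { return mDropped; } // report this to telemetry

private:
    std::size_t mCapacity;
    std::deque<Action> mItems;
    std::size_t mDropped = 0;
};
```

Keeping the drop count next to the queue makes it straightforward to publish dropped and coalesced totals rather than losing events silently.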
## Interfaces
These are suggested target-facing interfaces, not final class signatures.
### Subsystem Shell
Possible top-level responsibilities:
- `Start(...)`
- `Stop()`
- `PublishStateSnapshot(...)`
- `PublishHealthSnapshot(...)`
- `DrainNormalizedActions(...)`
The shell should feel like a host for ingress adapters plus a normalization/buffering boundary.
### OSC Ingress
Possible responsibilities:
- `StartOscIngress(...)`
- `StopOscIngress()`
- `ConfigureOscBinding(...)`
- `EnqueueDecodedOscMessage(...)`
- `DrainCoalescedAutomationTargets(...)`
### HTTP / Web Control Ingress
Possible responsibilities:
- `StartHttpIngress(...)`
- `StopHttpIngress()`
- `HandleHttpRequest(...)`
- `HandleWebSocketUpgrade(...)`
- `QueueStateBroadcastRequest()`
### File-Watch Ingress
Possible responsibilities:
- `StartFileWatchIngress(...)`
- `StopFileWatchIngress()`
- `PollOrConsumeFileEvents(...)`
- `DrainReloadSignals(...)`
### Normalized Action Types
These should likely become shared event/action definitions in Phase 2, but `ControlServices` should be designed around them now.
Examples:
- `LayerAddRequested`
- `LayerRemoveRequested`
- `LayerReorderRequested`
- `LayerBypassSetRequested`
- `LayerShaderSetRequested`
- `ParameterSetRequested`
- `LayerResetRequested`
- `StackPresetSaveRequested`
- `StackPresetLoadRequested`
- `ShaderReloadRequested`
- `ScreenshotRequested`
- `AutomationTargetReceived`
- `RegistryChangeDetected`
## Data Ownership Inside The Subsystem
`ControlServices` is allowed to own ingress-local ephemeral state.
Examples:
- connected WebSocket client list
- pending broadcast flag
- coalesced OSC route map
- outstanding decoded-but-undrained action queue
- file-watch debounce state
- transport error counters before publication to telemetry
It should not own:
- authoritative layer stack state
- committed parameter values
- render snapshots
- playout queue state
- shader feedback or render overlays
The rule is simple:
- if the state exists only to absorb or forward external input, it can live here
- if the state defines how the app should behave over time, it belongs elsewhere
## Outbound Boundaries
`ControlServices` talks outward in only a few approved directions.
### To `RuntimeCoordinator`
Primary outbound path.
It sends:
- normalized mutation requests
- automation targets
- reload requests
- stack preset requests
It does not send:
- transport-specific objects such as raw sockets or OSC packet structures
- render-facing state objects
### To `HealthTelemetry`
Observation-only relationship.
It sends:
- counters
- warnings
- timing samples
- service health transitions
It should not use `HealthTelemetry` as a hidden control path.
### From Snapshot / Telemetry Producers To Web Clients
`ControlServices` may deliver serialized outbound payloads to WebSocket clients, but the authoritative payload contents should be produced by the owning subsystems.
That means a later design may look like:
- `RuntimeSnapshotProvider` provides render-facing snapshot payloads or a runtime-state projection derived from those published snapshots
- `RuntimeCoordinator` or a later runtime-read-model helper provides control-plane runtime summaries when the UI needs more than raw render state
- `HealthTelemetry` provides health payloads
- `ControlServices` delivers them to connected observers
## Current Code Mapping
This section maps the current implementation onto the target subsystem.
### Current `RuntimeServices`
Should split into:
- `ControlServices` shell
- temporary compatibility adapter into `RuntimeCoordinator`
- removal of any direct runtime-state mutation responsibilities over time
Likely keep under `ControlServices`:
- service startup/shutdown
- OSC update coalescing
- Web control hosting shell
- file-watch ingress hosting
Should move out later:
- legacy direct runtime polling dependency
- deferred OSC commit behavior that has since moved behind coordinator-facing outcomes
- any remaining direct state-broadcast decisions tied to runtime internals
### Current `ControlServer`
Should become primarily:
- HTTP ingress adapter
- WebSocket session host
- static asset/doc host if that remains embedded
The callback table in current code:
- [ControlServer.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/ControlServer.h:18)
is a useful migration aid, but long-term it should evolve from callback-per-action toward typed action emission.
### Current `OscServer`
Should remain transport-focused.
Its clean long-term responsibilities are:
- UDP socket lifecycle
- OSC frame decode
- syntactic route extraction
- emitting decoded automation payloads into the `ControlServices` shell
It should not own any runtime state semantics beyond ingress decoding.
## Migration Plan
The safest migration is incremental.
### Step 1. Name The Boundary Explicitly
Create and use the `ControlServices` name in docs and future interfaces before moving all logic.
This document is part of that step.
### Step 2. Convert Callback Thinking Into Action Thinking
Without changing all runtime code at once, introduce typed action/event shapes for the major ingress paths.
The goal is for transports to emit actions, even if temporary adapters still call into existing code.
### Step 3. Extract Service Hosting From `OpenGLComposite`
`OpenGLComposite` currently owns `RuntimeServices` startup and consumption:
- [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:312)
- [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:723)
That should move toward a composition root or subsystem host arrangement where render is no longer the owner of control ingress.
### Step 4. Remove Direct Runtime Mutation Dependency
Previously, polling and deferred OSC commit work operated directly against runtime storage:
- [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:194)
That has been routed through coordinator-facing actions; later phases should replace the remaining polling shape with event-driven flows.
### Step 5. Split Out Observation Delivery
WebSocket outbound delivery can stay in `ControlServices`, but serialization ownership should move toward the owning subsystems so the service layer stops assembling authoritative state itself.
## Risks
### Risk 1. Recreating `RuntimeHost` Coupling Under A New Name
If `ControlServices` is allowed to keep direct knowledge of runtime mutation internals, it will become a renamed version of the same coupling problem.
Mitigation:
- keep the boundary strict
- route mutations through coordinator interfaces
- treat any direct runtime mutation calls as migration-only compatibility
### Risk 2. Service Queues Becoming Hidden State Authority
Latest-value OSC maps and reload debounce flags are appropriate here. Full committed runtime state is not.
Mitigation:
- define ingress-local versus authoritative state explicitly
- bound queues
- publish queue metrics into telemetry
### Risk 3. WebSocket Broadcast Path Reintroducing Heavy Synchronous Work
If `ControlServices` becomes the place where whole runtime state is rebuilt or serialized on every input, it will recreate timing stalls.
Mitigation:
- broadcast snapshots produced elsewhere
- collapse redundant outbound requests
- track serialization/broadcast timing in telemetry
### Risk 4. Polling Surviving Too Long As Architecture
Some polling may remain during migration, but it should not become the permanent contract.
Mitigation:
- isolate polling behind ingress interfaces
- make replacement with event-driven flows a planned Phase 2/3 outcome
## Open Questions
- Should the embedded static UI/docs hosting stay inside `ControlServices`, or move to a thinner app-shell concern while control APIs remain in `ControlServices`?
- Should outbound state for WebSocket clients be one combined payload or separate runtime and health channels?
- How much route/key resolution should happen in `ControlServices` versus `RuntimeCoordinator`?
- Should any deferred automation-settle delivery remain in `ControlServices`, or should all commit/settle policy move entirely into coordinator/render ownership once the live-state model is formalized?
- When file watching is modernized, should reload classification live entirely in `ControlServices`, or should it emit a lower-level `FilesChanged` event and let `RuntimeCoordinator` decide reload semantics?
- Will future non-OSC automation sources reuse the same `AutomationTargetReceived` path, or need source-specific typed actions for policy reasons?
## Short Version
`ControlServices` should become the app's clean ingress boundary:
- transport handling stays here
- input normalization stays here
- ingress-local buffering stays here
- mutation policy does not
- authoritative runtime state does not
- render-local transient state does not
If later phases keep that line sharp, the app gains a control layer that is fast, testable, and timing-aware without becoming another shared state authority.

@@ -1,647 +0,0 @@
# HealthTelemetry Subsystem Design
This document expands the `HealthTelemetry` subsystem introduced in [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md).
`HealthTelemetry` is the subsystem that owns operational visibility for the app. Its purpose is to gather health state, warnings, counters, logs, and timing observations from the other subsystems and publish them in a structured way without becoming a second control plane.
Before the Phase 1 runtime split, those responsibilities were fragmented across `RuntimeHost` status setters, ad hoc `OutputDebugStringA` calls, callback-local warnings, and UI-facing runtime-state payloads. The result was that the app could often detect problems, but did not yet have one clear place that answered:
- what is healthy right now
- what is degraded right now
- what has recently gone wrong
- which subsystem is under pressure
- how timing behavior is trending over time
`HealthTelemetry` is the target boundary that should answer those questions.
## Why This Subsystem Exists
The codebase already contains meaningful health and timing signals, but some are still spread through unrelated ownership domains:
- previous `RuntimeHost` status fields stored signal and timing status:
  - `RuntimeHost.h`
  - `RuntimeHost.cpp`
- render and bridge code historically reported timing by writing back into `RuntimeHost`:
  - [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:50)
  - [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:49)
- backend warning paths still log directly:
  - [DeckLinkFrameTransfer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkFrameTransfer.cpp:84)
  - [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:305)
- control ingress failures still log directly:
  - [OscServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/OscServer.cpp:142)
  - [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:100)
This creates several recurring problems:
- health information shares storage and lock scope with runtime state
- warnings are not consistently classified by subsystem or severity
- timing data is hard to compare across render, control, and backend paths
- UI connection state and operational state are too closely coupled
- logging is mostly text-first instead of structured-first
- recovery behavior is hard to audit because the app does not retain a coherent health snapshot
`HealthTelemetry` exists so timing and health concerns have one subsystem whose only job is observation and reporting, instead of drifting back into runtime storage, callback-local logging, or UI payload assembly.
## Design Goals
`HealthTelemetry` should optimize for:
- one authoritative home for operational visibility
- structured health state per subsystem
- timing and counter recording that does not require a UI to be connected
- low-friction reporting from render, backend, coordinator, and services
- explicit degraded-mode reporting instead of only raw text logs
- support for live operator summaries and deeper engineering diagnostics
- minimal risk of telemetry writes becoming a render or callback bottleneck
## Responsibilities
`HealthTelemetry` owns structured operational visibility.
Primary responsibilities:
- accept timing samples from major subsystems
- accept counter deltas and point-in-time gauges
- accept warning, error, and degraded-state transitions
- collect subsystem-scoped health state
- collect operator-visible summary state
- collect structured log entries
- build stable health snapshots for UI, diagnostics, and later persistence/export if desired
- retain recent history needed for short-term troubleshooting
- classify observations by subsystem, severity, and category
Secondary responsibilities that still fit here:
- smoothing or rolling-window summaries for timing metrics
- mapping raw subsystem observations into operator-facing health summaries
- deduplicating repeated warnings
- tracking warning open/clear lifecycles
- providing bounded in-memory history for recent logs and warning transitions
## Explicit Non-Responsibilities
`HealthTelemetry` should not become a behavior owner.
It does not own:
- layer stack truth
- persistence policy
- render scheduling
- DeckLink scheduling
- OSC buffering or routing
- reload coordination
- shader compilation
- recovery actions themselves
It also should not decide:
- whether render should skip a frame
- whether `VideoBackend` should increase queue depth
- whether `RuntimeCoordinator` should reject a mutation
- whether `ControlServices` should drop or coalesce ingress traffic
Those decisions belong to the subsystem being observed. `HealthTelemetry` may describe that a subsystem is degraded, but it must not quietly become the mechanism that tells the app how to react.
## Ownership Boundaries
`HealthTelemetry` owns the following state categories.
### Structured Log State
Examples:
- subsystem name
- severity
- category
- timestamp
- message
- optional structured fields such as layer id, preset name, queue depth, or shader id
This replaces the idea that `OutputDebugStringA` text is itself the main diagnostic product.
### Warning And Error State
Examples:
- active warning set
- warning occurrence counts
- first-seen and last-seen timestamps
- clear timestamps
- subsystem-scoped degraded flags
This is the durable in-memory operational state that should answer "what is currently wrong?" even if no UI was connected when the warning was raised.
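The warning lifecycle described above could be tracked with a small keyed table. This is a sketch under assumptions: `WarningRecord`, `WarningTable`, and the string warning keys are hypothetical, and timestamps are supplied by the caller.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

using TelemetryTime = std::chrono::steady_clock::time_point;

// Hypothetical per-warning record covering counts and the open/clear lifecycle.
struct WarningRecord
{
    std::uint64_t occurrences = 0;
    TelemetryTime firstSeen{};
    TelemetryTime lastSeen{};
    std::optional<TelemetryTime> clearedAt; // empty while the warning is active
};

class WarningTable
{
public:
    // Raising an existing key bumps its count and reopens it instead of adding a duplicate.
    void Raise(const std::string& key, TelemetryTime now)
    {
        assert(!key.empty());
        auto [it, inserted] = mRecords.try_emplace(key);
        WarningRecord& record = it->second;
        if (inserted)
            record.firstSeen = now;
        record.lastSeen = now;
        record.clearedAt.reset();
        ++record.occurrences;
    }

    // Marks an active warning as cleared but keeps its history for later inspection.
    void Clear(const std::string& key, TelemetryTime now)
    {
        const auto it = mRecords.find(key);
        if (it != mRecords.end() && !it->second.clearedAt)
            it->second.clearedAt = now;
    }

    std::size_t ActiveCount() const
    {
        std::size_t active = 0;
        for (const auto& entry : mRecords)
            active += entry.second.clearedAt ? 0 : 1;
        return active;
    }

    // Returns nullptr when the key has never been raised.
    const WarningRecord* Find(const std::string& key) const
    {
        const auto it = mRecords.find(key);
        return it == mRecords.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, WarningRecord> mRecords;
};
```

Because cleared records stay in the table, the question "what went wrong recently" can still be answered after the condition has recovered. A real implementation would also need to bound the table.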
### Timing State
Examples:
- render duration
- frame budget
- playout completion interval
- smoothed completion interval
- queue depth
- input upload skip count
- async readback fallback count
- control ingress lag or queue depth
- snapshot publication cost
This state should be organized as time-series-like rolling telemetry, not as a grab bag of unrelated `double` fields mixed into the runtime store.
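As an illustration of rolling telemetry, a fixed-capacity window over recent samples can provide average and peak values without unbounded growth. `RollingTimingSeries` is a hypothetical name, and samples are assumed to be non-negative durations in milliseconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size rolling window over recent timing samples.
class RollingTimingSeries
{
public:
    explicit RollingTimingSeries(std::size_t capacity) : mSamples(capacity, 0.0)
    {
        assert(capacity > 0);
    }

    void Record(double milliseconds)
    {
        if (mCount == mSamples.size())
            mSum -= mSamples[mNext]; // window is full: evict the oldest sample
        else
            ++mCount;
        mSamples[mNext] = milliseconds;
        mSum += milliseconds;
        mNext = (mNext + 1) % mSamples.size();
    }

    std::size_t Count() const { return mCount; }

    double Average() const
    {
        return mCount == 0 ? 0.0 : mSum / static_cast<double>(mCount);
    }

    // Peak over the current window. Valid samples always occupy the first mCount slots.
    double Max() const
    {
        double result = 0.0;
        for (std::size_t i = 0; i < mCount; ++i)
            result = std::max(result, mSamples[i]);
        return result;
    }

private:
    std::vector<double> mSamples;
    std::size_t mNext = 0;
    std::size_t mCount = 0;
    double mSum = 0.0;
};
```

The running sum keeps `Average()` constant-time, at the cost of slow floating-point drift over very long runs; recomputing the sum periodically would be one way to handle that.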
### Health Snapshot State
Examples:
- current subsystem health summaries
- current operator-facing overall health summary
- most recent warning list
- recent counters and timing summaries
- "degraded but still running" status
This is the material that `ControlServices` or a diagnostics endpoint may later publish.
## State Model
The subsystem should model health and telemetry in a way that supports both machine-friendly and operator-friendly views.
Suggested conceptual model:
- `TelemetryLogEntry`
- `TelemetryWarningRecord`
- `TelemetryCounterState`
- `TelemetryGaugeState`
- `TelemetryTimingSeries`
- `SubsystemHealthState`
- `HealthSnapshot`
Important distinction:
- raw observations are append/update operations
- health snapshots are derived read models
That distinction matters because the system should be able to retain richer recent telemetry internally than what is necessarily sent to the UI on every refresh.
## Subsystem Health Domains
`HealthTelemetry` should track health by subsystem rather than as one flat status blob.
At minimum, Phase 1 should assume domains for:
- `RuntimeStore`
- `RuntimeCoordinator`
- `RuntimeSnapshotProvider`
- `ControlServices`
- `RenderEngine`
- `VideoBackend`
Optional cross-cutting domain:
- `ApplicationShell`
Each domain should be able to express states such as:
- `Healthy`
- `Warning`
- `Degraded`
- `Error`
- `Unavailable`
The exact enum can change, but the design should preserve the idea that each subsystem reports into its own health lane first, and only then is an overall status derived.
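Deriving an overall status from per-subsystem lanes could be as simple as taking the worst lane. The sketch assumes the enumerators are declared in increasing order of severity and that `Unavailable` ranks worst; both are assumptions, as are the `HealthLanes` name and the `Count` sentinel.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Declared in assumed order of increasing severity.
enum class SubsystemHealthState { Healthy, Warning, Degraded, Error, Unavailable };

// Domains from the list above, plus a Count sentinel used only for sizing.
enum class TelemetrySubsystem
{
    RuntimeStore,
    RuntimeCoordinator,
    RuntimeSnapshotProvider,
    ControlServices,
    RenderEngine,
    VideoBackend,
    Count
};

// Hypothetical holder: one health lane per subsystem, overall status derived on read.
class HealthLanes
{
public:
    void Report(TelemetrySubsystem subsystem, SubsystemHealthState state)
    {
        assert(subsystem != TelemetrySubsystem::Count);
        mLanes[static_cast<std::size_t>(subsystem)] = state;
    }

    SubsystemHealthState Get(TelemetrySubsystem subsystem) const
    {
        assert(subsystem != TelemetrySubsystem::Count);
        return mLanes[static_cast<std::size_t>(subsystem)];
    }

    // Overall status is the most severe lane.
    SubsystemHealthState Overall() const
    {
        return *std::max_element(mLanes.begin(), mLanes.end());
    }

private:
    // Value-initialized, so every lane starts as Healthy.
    std::array<SubsystemHealthState, static_cast<std::size_t>(TelemetrySubsystem::Count)> mLanes{};
};
```

Since the overall value is computed on read, nothing stores it separately and it cannot drift out of sync with the lanes.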
## Logging Boundaries
Logging belongs here, but logging should be structured-first.
Expected inputs:
- subsystem-scoped debug information
- warning and error messages
- recovery events
- notable state transitions
- significant operator actions that matter for diagnostics
Expected design rules:
- textual messages are still useful, but they should be wrapped in a structured log entry
- repeated transient failures should be rate-limited or deduplicated at the telemetry layer where possible
- log storage should be bounded in memory
- UI publication should read from health/log snapshots, not scrape stdout/debug output
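The bounded-storage and deduplication rules above can be sketched with a small in-memory buffer. `LogEntry` and `BoundedLogBuffer` are hypothetical, folding only consecutive identical entries is an assumed simplification, and thread safety is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical structured entry. A real one would also carry severity, category, and a timestamp.
struct LogEntry
{
    std::string subsystem;
    std::string message;
    std::size_t repeatCount = 1;
};

// Hypothetical bounded log store that folds consecutive duplicates into one entry.
class BoundedLogBuffer
{
public:
    explicit BoundedLogBuffer(std::size_t capacity) : mCapacity(capacity)
    {
        assert(capacity > 0);
    }

    void Append(std::string subsystem, std::string message)
    {
        if (!mEntries.empty() && mEntries.back().subsystem == subsystem &&
            mEntries.back().message == message)
        {
            ++mEntries.back().repeatCount; // same entry again: count it instead of storing it
            return;
        }
        if (mEntries.size() == mCapacity)
            mEntries.pop_front(); // memory stays bounded by discarding the oldest entry
        mEntries.push_back(LogEntry{std::move(subsystem), std::move(message), 1});
    }

    // Returns a stable copy that readers can serialize without touching the buffer.
    std::vector<LogEntry> Snapshot() const
    {
        return std::vector<LogEntry>(mEntries.begin(), mEntries.end());
    }

private:
    std::size_t mCapacity;
    std::deque<LogEntry> mEntries;
};
```

Publication code would read `Snapshot()` rather than scraping debug output, which matches the last rule in the list.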
Examples of current direct log paths that should eventually move behind `HealthTelemetry`:
- OSC decode/dispatch failures
- screenshot write failures
- DeckLink fallback warnings
- late/dropped frame warnings
## Metrics And Timing Boundaries
Timing and metrics should also move here, but their ownership line matters.
`HealthTelemetry` should own:
- metric collection interfaces
- rolling summaries
- recent history buffers
- warning thresholds if the app later chooses to define them declaratively
- operator-facing derived summaries
The producing subsystem should still own:
- the meaning of the measurement
- when it is sampled
- whether it triggers local mitigation
Examples:
- `RenderEngine` owns when render duration is sampled
- `VideoBackend` owns when queue depth or playout lateness is sampled
- `ControlServices` owns when ingress backlog is sampled
- `RuntimeSnapshotProvider` owns when snapshot publish/build timing is sampled
`HealthTelemetry` should not invent those timings by inference. It records them when producers report them.
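On the producer side, a scoped timer is one low-friction way to report a timing sample at the moment the producer chooses. In this sketch `TimingSample`, `ScopedTimingSample`, and the `Sink` callback are hypothetical; the sink stands in for whatever `RecordTimingSample(...)` eventually accepts.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <utility>

// Hypothetical sample shape; the real TelemetryTimingSample is not defined yet.
struct TimingSample
{
    std::string metric;
    double milliseconds = 0.0;
};

// Measures the lifetime of a scope and reports it to a sink on destruction.
class ScopedTimingSample
{
public:
    using Sink = std::function<void(const TimingSample&)>;

    ScopedTimingSample(std::string metric, Sink sink)
        : mMetric(std::move(metric)),
          mSink(std::move(sink)),
          mStart(std::chrono::steady_clock::now())
    {
        assert(mSink != nullptr);
    }

    ~ScopedTimingSample()
    {
        const std::chrono::duration<double, std::milli> elapsed =
            std::chrono::steady_clock::now() - mStart;
        mSink(TimingSample{mMetric, elapsed.count()}); // the sink must not throw
    }

    ScopedTimingSample(const ScopedTimingSample&) = delete;
    ScopedTimingSample& operator=(const ScopedTimingSample&) = delete;

private:
    std::string mMetric;
    Sink mSink;
    std::chrono::steady_clock::time_point mStart;
};
```

The producer decides where the scope begins and ends, so it keeps ownership of what the measurement means; the telemetry side only receives the finished sample.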
## Proposed Interfaces
These are target-shape interfaces, not final signatures.
### Write/Record Interface
Core write-side operations could look like:
```cpp
#include <string_view>

// Forward declarations only; the concrete shapes are still to be designed.
enum class TelemetrySeverity;
enum class TelemetrySubsystem;
enum class SubsystemHealthState; // assumed to be an enum so it can be passed by value
struct TelemetryLogEntry;
struct TelemetryWarning;
struct TelemetryTimingSample;
struct TelemetryCounterDelta;
struct TelemetryGaugeUpdate;

// Write-side surface that producers call to report observations.
class IHealthTelemetry
{
public:
    virtual ~IHealthTelemetry() = default;

    virtual void AppendLogEntry(const TelemetryLogEntry& entry) = 0;
    virtual void RaiseWarning(const TelemetryWarning& warning) = 0;
    virtual void ClearWarning(std::string_view warningKey) = 0;
    virtual void RecordTimingSample(const TelemetryTimingSample& sample) = 0;
    virtual void RecordCounterDelta(const TelemetryCounterDelta& delta) = 0;
    virtual void RecordGauge(const TelemetryGaugeUpdate& gauge) = 0;
    virtual void ReportSubsystemState(TelemetrySubsystem subsystem,
                                      SubsystemHealthState state) = 0;
};
```
The key is that every subsystem should be able to publish observations without also needing to know how UI payloads, rolling summaries, or log retention are implemented.
### Read Interface
Expected read-side operations:
- `BuildHealthSnapshot()`
- `GetSubsystemHealth(...)`
- `GetRecentLogs(...)`
- `GetActiveWarnings()`
- `GetRecentTimingSummary(...)`
Design notes:
- the read interface should return stable snapshots or read models
- UI/WebSocket publication should consume those snapshots through `ControlServices`
- read-side access should not require direct knowledge of internal ring buffers or lock layout
## Producer Expectations By Subsystem
The parent Phase 1 design already allows multiple subsystems to publish into telemetry. This section makes that concrete.
### From `RuntimeCoordinator`
Expected observations:
- mutation rejected
- reload requested
- preset apply failed
- transient state cleared due to compatibility rules
- policy-driven degraded notices such as repeated invalid external control input
### From `RuntimeSnapshotProvider`
Expected observations:
- snapshot publication duration
- snapshot build failure
- snapshot version churn metrics
- repeated publish retries or stale-snapshot conditions
### From `ControlServices`
Expected observations:
- OSC decode failures
- WebSocket broadcast failures
- REST/control transport errors
- ingress queue depth
- coalescing/drop counts
- file-watch reload request activity
### From `RenderEngine`
Expected observations:
- frame render duration
- upload duration
- readback duration
- fallback to synchronous readback
- preview present timing
- render-local state resets caused by reload or incompatibility
### From `VideoBackend`
Expected observations:
- current playout queue depth
- system-memory playout frame counts by state: free, ready, and scheduled
- system-memory playout underrun, repeat, and drop counters
- system-memory frame age at schedule and completion time
- input signal state
- late frames
- dropped frames
- backend mode changes
- fallback from 10-bit to 8-bit input
- output-only black-frame mode
## Current Code Mapping
The current codebase already contains several telemetry responsibilities that should migrate here.
### Previous `RuntimeHost` Status Setters
These were the clearest initial migration candidates:
- `SetSignalStatus(...)`
- `TrySetSignalStatus(...)`
- `SetPerformanceStats(...)`
- `TrySetPerformanceStats(...)`
- `SetFramePacingStats(...)`
- `TrySetFramePacingStats(...)`
See:
- `RuntimeHost.h`
- `RuntimeHost.cpp`
In the target architecture, this kind of state should not sit on the same object that owns persistent layer truth.
### Render Timing Production
Current render timing is produced in:
- [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:50)
That timing sample should conceptually become:
- `RenderEngine -> HealthTelemetry::RecordTimingSample(...)`
not the old pattern:
- `RenderEngine -> RuntimeHost::TrySetPerformanceStats(...)`
### Playout And Signal Status Production
Current signal and frame pacing updates are produced in:
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:49)
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:61)
These should eventually become structured `VideoBackend` observations instead of bridge-to-host status writes.
### Direct Warning And Log Paths
Current examples:
- late/dropped frame warnings:
  - [DeckLinkFrameTransfer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkFrameTransfer.cpp:84)
- backend fallback warnings:
  - [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:305)
  - [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:320)
- OSC errors:
  - [OscServer.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/OscServer.cpp:142)
  - [RuntimeServices.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.cpp:100)
All of these are clear migration candidates for `AppendLogEntry(...)`, `RaiseWarning(...)`, or counter/timing updates.
## Health Snapshot Contract
`HealthTelemetry` should expose one coherent health snapshot that other publication layers can consume.
That snapshot should be able to answer, at minimum:
- what the overall app health is
- whether input signal is present
- whether playout is healthy, degraded, or underrunning
- whether render timing is within budget
- what active warnings exist
- what recent notable events occurred
- what the current subsystem-specific states are
The important boundary is:
- `HealthTelemetry` builds the health snapshot
- `ControlServices` may publish it
- UI consumes it
That avoids rebuilding health summaries ad hoc in UI-facing runtime state serializers.
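As a minimal sketch of the contract above, a snapshot could be a plain value type with a derived overall state. The names `HealthState`, `ActiveWarning`, and `HealthSnapshot`, and the rule that overall health is the worst contributing state, are assumptions for illustration rather than types from the codebase.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical shared vocabulary; ordered so that a larger value is worse.
enum class HealthState { Healthy = 0, Degraded = 1, Error = 2 };

struct ActiveWarning {
    std::string subsystem;  // e.g. "VideoBackend"
    std::string message;
    HealthState severity = HealthState::Degraded;
};

// Read-only value built by HealthTelemetry and handed to publishers.
struct HealthSnapshot {
    bool inputSignalPresent = false;
    HealthState playout = HealthState::Healthy;
    bool renderWithinBudget = true;
    std::vector<ActiveWarning> activeWarnings;
    std::vector<std::string> recentEvents;

    // Assumed policy: overall health is the worst contributing state.
    HealthState Overall() const {
        HealthState worst = playout;
        if (!renderWithinBudget || !inputSignalPresent)
            worst = std::max(worst, HealthState::Degraded);
        for (const ActiveWarning& warning : activeWarnings)
            worst = std::max(worst, warning.severity);
        return worst;
    }
};
```

Because the snapshot is a value, `ControlServices` could serialize a copy without holding any telemetry lock.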
## Concurrency Expectations
This subsystem will likely receive updates from multiple threads:
- control ingress threads
- render thread
- backend callback threads
- coordinator/service threads
So the design should assume:
- low-contention write paths
- bounded memory
- no long-held global mutex that callbacks and render both depend on
Phase 1 does not require a lock-free implementation, but it does require the architecture to avoid recreating the old problem where health writes share the same lock as durable state and render-facing concerns.
Practical expectations:
- per-domain aggregation or lightweight internal locking is acceptable
- read snapshots should be cheap and stable
- callback paths should record telemetry cheaply and return
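One way to meet these expectations is a small recorder per producer domain, so that render, backend callbacks, and control never contend on one lock. The sketch below is an assumption-laden illustration: `DomainTimingRecorder`, its method names, and the ring capacity of 8 are all hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical per-domain recorder: one instance per producer domain.
class DomainTimingRecorder {
public:
    struct Summary {
        std::uint64_t count;
        std::size_t samples;
        std::uint32_t maxMicros;
    };

    // Hot path: a relaxed atomic add, no lock.
    void AddCount(std::uint64_t delta) {
        mCount.fetch_add(delta, std::memory_order_relaxed);
    }

    // Hot path: short critical section writing one slot of a fixed ring,
    // so memory stays bounded no matter how long the app runs.
    void RecordSampleMicros(std::uint32_t micros) {
        std::lock_guard<std::mutex> lock(mMutex);
        mSamples[mNext] = micros;
        mNext = (mNext + 1) % kCapacity;
        if (mSize < kCapacity) ++mSize;
    }

    // Read path: derives a small summary; writers are blocked only briefly.
    Summary Read() const {
        std::lock_guard<std::mutex> lock(mMutex);
        Summary summary{mCount.load(std::memory_order_relaxed), mSize, 0};
        for (std::size_t i = 0; i < mSize; ++i)
            if (mSamples[i] > summary.maxMicros) summary.maxMicros = mSamples[i];
        return summary;
    }

private:
    static constexpr std::size_t kCapacity = 8;  // illustrative size only
    mutable std::mutex mMutex;
    std::array<std::uint32_t, kCapacity> mSamples{};
    std::size_t mNext = 0;
    std::size_t mSize = 0;
    std::atomic<std::uint64_t> mCount{0};
};
```

The lock here is private to one domain, which is the property that matters: a slow reader of control telemetry cannot delay a render-thread write.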
## Migration Plan From Current Code
The safest migration path is to peel telemetry responsibilities away from the existing classes incrementally.
### Step 1: Introduce The `HealthTelemetry` Interface
Create a small interface and health model types first.
Initial responsibilities:
- append structured logs
- record timing samples
- record counter deltas
- raise and clear warnings
- build a read-only health snapshot
The first implementation can still be backed by simple in-memory structures.
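A first in-memory version might look like the sketch below. `AppendLogEntry` and `RaiseWarning` follow the names used earlier in this note; every other name (`LogEntry`, `BasicHealthSnapshot`, `ClearWarning`, `AddCounterDelta`, `RecordTimingSample`, `BuildSnapshot`) and the 256-entry log bound are assumptions for illustration.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

enum class Severity { Info, Warning, Error };

// Structured entry: severity, subsystem, and category are fields, not text.
struct LogEntry {
    Severity severity;
    std::string subsystem;
    std::string category;
    std::string message;
};

struct BasicHealthSnapshot {
    std::vector<LogEntry> recentLog;
    std::map<std::string, std::string> activeWarnings;  // key -> message
    std::map<std::string, long long> counters;
    std::map<std::string, double> lastTimingMillis;
};

// Hypothetical first pass: one mutex and plain containers.
class HealthTelemetry {
public:
    void AppendLogEntry(LogEntry entry) {
        std::lock_guard<std::mutex> lock(mMutex);
        mLog.push_back(std::move(entry));
        if (mLog.size() > kMaxLogEntries) mLog.pop_front();  // bounded memory
    }
    void RaiseWarning(const std::string& key, std::string message) {
        std::lock_guard<std::mutex> lock(mMutex);
        mWarnings[key] = std::move(message);
    }
    void ClearWarning(const std::string& key) {
        std::lock_guard<std::mutex> lock(mMutex);
        mWarnings.erase(key);
    }
    void AddCounterDelta(const std::string& name, long long delta) {
        std::lock_guard<std::mutex> lock(mMutex);
        mCounters[name] += delta;
    }
    void RecordTimingSample(const std::string& name, double millis) {
        std::lock_guard<std::mutex> lock(mMutex);
        mLastTiming[name] = millis;
    }
    // Returns a copy so readers never hold references into live state.
    BasicHealthSnapshot BuildSnapshot() const {
        std::lock_guard<std::mutex> lock(mMutex);
        BasicHealthSnapshot snapshot;
        snapshot.recentLog.assign(mLog.begin(), mLog.end());
        snapshot.activeWarnings = mWarnings;
        snapshot.counters = mCounters;
        snapshot.lastTimingMillis = mLastTiming;
        return snapshot;
    }

private:
    static constexpr std::size_t kMaxLogEntries = 256;  // illustrative bound
    mutable std::mutex mMutex;
    std::deque<LogEntry> mLog;
    std::map<std::string, std::string> mWarnings;
    std::map<std::string, long long> mCounters;
    std::map<std::string, double> mLastTiming;
};
```

A single mutex is acceptable for a first pass only because this object is separate from durable state; the write paths can be split per domain later without changing callers.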
### Step 2: Keep New Observations Off Runtime Storage
Route new health-style work into `HealthTelemetry` instead of adding more status fields to runtime storage.
This prevents the old status surface from growing during migration.
### Step 3: Replace Legacy Status Setters With Telemetry Producers
Refactor:
- render timing writes
- signal status writes
- playout pacing writes
so they publish structured observations instead of mutating store-adjacent fields.
### Step 4: Replace Direct `OutputDebugStringA` Warning Paths
Wrap common warning/error cases in telemetry producers.
This includes:
- OSC decode/dispatch failures
- DeckLink late/dropped frame notifications
- backend fallback notices
- screenshot write failures
Direct debug output can remain as a sink of telemetry if desired, but not as the primary source of truth.
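The "debug output as a sink" idea can be sketched as follows. `TelemetryEvent`, `TelemetryEventLog`, and `FormatDebugLine` are hypothetical names, and the example is single-threaded for brevity. On Windows, a sink could pass the formatted line to `OutputDebugStringA`; the sketch uses a generic callback so it stays portable.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct TelemetryEvent {
    std::string subsystem;
    std::string message;
};

// Hypothetical event log: the structured record is stored first, and
// sinks (debug output, files, UI) are derived from it afterwards.
class TelemetryEventLog {
public:
    using Sink = std::function<void(const TelemetryEvent&)>;

    void AddSink(Sink sink) { mSinks.push_back(std::move(sink)); }

    void Publish(TelemetryEvent event) {
        mEvents.push_back(std::move(event));  // source of truth
        for (const Sink& sink : mSinks) sink(mEvents.back());
    }

    const std::vector<TelemetryEvent>& Events() const { return mEvents; }

private:
    std::vector<TelemetryEvent> mEvents;
    std::vector<Sink> mSinks;
};

// Text formatting lives in the sink path, not in the producers.
std::string FormatDebugLine(const TelemetryEvent& event) {
    return "[" + event.subsystem + "] " + event.message + "\n";
}
```

With this shape, disabling the debug sink loses no information, because the structured event was recorded before any sink ran.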
### Step 5: Publish Health Snapshot Through UI/Diagnostics Paths
Once the snapshot format exists, let `ControlServices` publish health summaries and recent warnings explicitly rather than depending on the runtime-state payload alone.
## Risks
### 1. Telemetry becomes a hidden behavior controller
If warning states start being used as the indirect way subsystems tell each other what to do, the subsystem boundary will fail.
Guardrail:
- telemetry observes and reports
- it does not coordinate or command
### 2. Logging stays string-only
If the subsystem only centralizes text logging without structure, later diagnostics will still be difficult.
Guardrail:
- severity, subsystem, category, and optional fields should be first-class
### 3. Timing writes become too expensive
If every sample requires heavy locking or snapshot rebuilds, render and callback timing could regress.
Guardrail:
- cheap recording path
- derived summaries built separately from hot-path writes
### 4. Health snapshot duplicates runtime truth
If health snapshots start storing copies of durable runtime state, the subsystem boundary will blur again.
Guardrail:
- health snapshots summarize operational state
- they do not become a second runtime store
### 5. Warning severity semantics drift by subsystem
If each subsystem invents its own meaning for warning/degraded/error, operator visibility becomes noisy and inconsistent.
Guardrail:
- define shared severity and health-state vocabulary early
## Open Questions
### 1. Should debug-output sinks remain enabled by default?
Current recommendation:
- yes, as a sink fed by structured telemetry entries, not as the source of truth
### 2. How much timing history should be retained in memory?
Current recommendation:
- enough for short-term live troubleshooting and UI summaries
- not an unbounded time-series archive
### 3. Should operator-facing health and engineering diagnostics use the same snapshot?
Current recommendation:
- share one core telemetry model
- allow separate derived views for concise operator summaries versus deeper engineering detail
### 4. Where should threshold policy live if the app later formalizes warnings like "render over budget"?
Current recommendation:
- telemetry may evaluate declared thresholds
- subsystem owners still own mitigation behavior
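A small sketch of that split, assuming a hypothetical `TimingThreshold` type and `EvaluateThreshold` helper: telemetry only reports which warning applies, and the caller decides whether and how to mitigate.

```cpp
#include <optional>
#include <string>

// Hypothetical declared threshold: a budget plus the warning it maps to.
struct TimingThreshold {
    std::string warningKey;
    double budgetMillis;
};

// Returns the warning key to raise when the observation exceeds the
// budget, or nothing when it is within budget. No mitigation happens here.
std::optional<std::string> EvaluateThreshold(const TimingThreshold& threshold,
                                             double observedMillis) {
    if (observedMillis > threshold.budgetMillis) return threshold.warningKey;
    return std::nullopt;
}
```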
### 5. Should input signal presence remain part of runtime state or move fully into telemetry?
Current recommendation:
- treat it as operational health state under `VideoBackend` reporting into telemetry
- avoid keeping it as a core durable runtime-store concern
## Success Criteria For This Subsystem
`HealthTelemetry` can be considered well-defined once the following are unambiguously true of the codebase:
- all major subsystems have one place to publish timing, warnings, and counters
- health and timing state no longer share ownership with durable runtime state
- the UI can consume a stable health snapshot without scraping unrelated runtime fields
- direct debug-string warning paths are being retired or wrapped behind structured telemetry
- degraded-but-running conditions are visible as first-class state
## Short Version
`HealthTelemetry` is the subsystem that should answer:
- what is healthy right now
- what is degraded right now
- what recent warnings and errors occurred
- how render, control, and playout timing are behaving
It should:
- collect structured logs
- collect warnings and counters
- collect timing samples and gauges
- build stable health snapshots for publication
It should not:
- own core runtime truth
- decide app behavior
- coordinate recovery actions
- become a replacement for the render or backend policy layers
If this boundary holds, later phases can keep moving toward a much more diagnosable live system without putting timing and warning state back into runtime storage.


@@ -1,44 +0,0 @@
# Subsystem Notes Index
The current, phase-free architecture summary is:
- [Current System Architecture](../CURRENT_SYSTEM_ARCHITECTURE.md)
Start there when you want to understand how the application works now.
This directory contains deeper notes for individual subsystem boundaries. These notes were originally written during the phased architecture work, so some files may still mention migration steps or target-state language. Treat them as companion notes, not as the source of truth when they disagree with the current architecture summary.
## Recommended Reading Order
1. [Current System Architecture](../CURRENT_SYSTEM_ARCHITECTURE.md)
2. [RuntimeStore](RuntimeStore.md)
3. [RuntimeCoordinator](RuntimeCoordinator.md)
4. [RuntimeSnapshotProvider](RuntimeSnapshotProvider.md)
5. [ControlServices](ControlServices.md)
6. [RenderEngine](RenderEngine.md)
7. [VideoBackend](VideoBackend.md)
8. [HealthTelemetry](HealthTelemetry.md)
That order follows the current ownership story:
- durable state first
- mutation and publication next
- control ingress after that
- render ownership and video timing next
- operational visibility last
## Subsystem Notes
- [RuntimeStore](RuntimeStore.md): durable runtime-state facade over layer-stack, config, package-catalog, presentation, and persistence boundaries.
- [RuntimeCoordinator](RuntimeCoordinator.md): mutation validation, state classification, reset/reload policy, and publication/persistence requests.
- [RuntimeSnapshotProvider](RuntimeSnapshotProvider.md): render-facing snapshot publication boundary backed by explicit render snapshot building/versioning.
- [ControlServices](ControlServices.md): OSC, HTTP/WebSocket, and file-watch ingress plus normalization and service-local buffering.
- [RenderEngine](RenderEngine.md): GL ownership boundary, render-local transient state, preview, and playout-ready frame production.
- [VideoBackend](VideoBackend.md): device lifecycle, input/output pacing, buffer policy, and producer/consumer playout behavior.
- [HealthTelemetry](HealthTelemetry.md): logs, warnings, counters, timing traces, and subsystem health snapshots.
## Historical Documents
The `docs/PHASE_*` files and experiment logs record how the architecture evolved. They are useful when you need rationale, investigation history, or rejected paths, but they are no longer arranged as the main feature split for the app.
For current implementation work, use [Current System Architecture](../CURRENT_SYSTEM_ARCHITECTURE.md) as the entry point and only dip into the phase documents when you need context for why a subsystem ended up this way.


@@ -1,486 +0,0 @@
# RenderEngine Subsystem Design
This document expands the `RenderEngine` portion of [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md). It defines the target ownership, boundaries, and migration shape for the rendering subsystem so later phases can move GL work out of today's mixed orchestration paths without inventing new boundaries on the fly.
The intent here is not to force a one-step rewrite. It is to make the target render boundary explicit enough that later work on events, live-state layering, sole-owner GL threading, and backend decoupling all land in the same place.
## Purpose
`RenderEngine` is the live frame-production subsystem.
It owns:
- GL context ownership in the target architecture
- render loop cadence and render task execution
- shader program and render-pass execution once build outputs are available
- capture texture upload scheduling once frames are accepted for render
- temporal history resources
- shader feedback resources
- render-local transient overlays
- preview-ready frame production
- playout-ready frame production
- render-local reset and rebuild behavior
It does not own:
- persisted runtime state
- high-level mutation policy
- OSC/UI ingress
- device discovery or callback policy
- playout queue policy
- operator-visible health policy beyond publishing observations
In the Phase 1 terminology, `RenderEngine` consumes snapshots plus render-local transient state and produces completed visual frames plus timing signals.
## Why This Subsystem Needs A Sharp Boundary
The current rendering path is split across several classes:
- [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:86) constructs the renderer, render pipeline, shader programs, runtime services, and video bridge in one owner.
- [OpenGLRenderPipeline.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:31) performs pass execution, pack/readback, preview paint, and performance stat publication.
- [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:58) accepts capture frames and still performs render work from the playout completion callback path.
- `RenderFrameStateResolver` and `RenderStateComposer` now keep frame-state selection and live value composition outside GL drawing, while `RenderEngine` still owns the current GL resource and draw path.
That split is workable today, but it creates architectural pressure:
- render and playout timing are still callback-coupled.
- preview and playout are produced in the same immediate path.
- render-local transient state now has clearer Phase 3/5 boundaries, but output production is still synchronously requested by the backend completion path.
- it is difficult to test render behavior separately from app bootstrap and hardware integration.
`RenderEngine` exists to absorb that responsibility into one subsystem with one direction of ownership. Phase 4 has completed the GL ownership part of this target: normal runtime GL work now enters through the `RenderEngine` render thread.
## Responsibilities
### 1. Sole GL Ownership
In the target design, `RenderEngine` should be the only subsystem that performs long-lived GL work.
That includes:
- context binding and release policy
- framebuffer and texture lifetime
- shader program binding and draw execution
- upload/readback buffer lifetime
- preview blit or present paths
- render-local resource reset on rebuild or video-format changes
This is the most important boundary. Other subsystems may request work or provide data, but they should not directly perform GL commands.
### 2. Snapshot Consumption
`RenderEngine` should consume immutable or near-immutable render snapshots from `RuntimeSnapshotProvider`.
It is responsible for:
- detecting snapshot version changes
- rebuilding or re-binding render-local resources when the snapshot changes
- resolving render-pass execution from snapshot contents
- separating structural snapshot changes from transient overlay changes
It should not inspect mutable runtime store objects directly.
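A sketch of version-based change detection, assuming snapshots carry two hypothetical counters (`structureVersion` and `parameterVersion`). The real snapshot type and its versioning scheme may differ; the point is that render can decide between a rebuild and a cheap rebind without inspecting the store.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical immutable snapshot with two version counters.
struct RenderSnapshot {
    std::uint64_t structureVersion = 0;  // layer stack or pass graph shape
    std::uint64_t parameterVersion = 0;  // committed parameter values only
};

enum class SnapshotChange { None, ParametersOnly, Structural };

class RenderSnapshotConsumer {
public:
    // Adopts the latest snapshot and reports what kind of work it implies.
    SnapshotChange Consume(std::shared_ptr<const RenderSnapshot> latest) {
        if (!latest) return SnapshotChange::None;
        SnapshotChange change = SnapshotChange::None;
        if (!mCurrent || latest->structureVersion != mCurrent->structureVersion)
            change = SnapshotChange::Structural;      // rebuild resources
        else if (latest->parameterVersion != mCurrent->parameterVersion)
            change = SnapshotChange::ParametersOnly;  // rebind values only
        mCurrent = std::move(latest);
        return change;
    }

    const RenderSnapshot* Current() const { return mCurrent.get(); }

private:
    std::shared_ptr<const RenderSnapshot> mCurrent;
};
```

Holding the snapshot through `shared_ptr<const ...>` keeps it alive for the duration of a frame even if a newer one is published mid-render.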
### 3. Frame Production
`RenderEngine` should produce completed frames for two consumers:
- preview presentation
- `VideoBackend` playout consumption
Those outputs may share most of their render work, but they are not equal-priority outputs. The subsystem rule from Phase 1 should be preserved:
- playout is the primary timing-sensitive output
- preview is subordinate and best-effort
### 4. Render-Local Transient State
`RenderEngine` owns transient visual state that affects output but is not persisted truth.
Examples:
- temporal history textures
- feedback ping-pong buffers
- render-local OSC/live overlay state
- queued input frames accepted for upload
- cached readback frames
- preview-only presentation state
- in-flight rebuild generations
This state should remain render-local even when it influences visible output.
Phase 5's `RuntimeStateLayerModel` explicitly keeps temporal history, feedback state, accepted input frames, staged output frames, preview staging, and screenshot/readback staging in the render-local category. These are deliberately outside the persisted/committed/transient-automation parameter composition rule.
`RuntimeLiveState` now owns transient automation invalidation for render-facing compatibility. It can clear overlays for a target layer/control key and prunes overlays that no longer resolve to the current layer and parameter definitions before applying them to a frame. This keeps shader reload, preset load, and layer removal behavior local to the live-state/composition boundary instead of scattering it through GL drawing code.
Render snapshots now flow through a named `CommittedLiveStateReadModel`, so render-facing committed state is distinct from durable storage and physically owned by `CommittedLiveState`.
### 5. Shader Build Application
Compilation itself may eventually move into a separate build service, but once shader build outputs exist, `RenderEngine` owns:
- program creation/link usage
- pass graph application
- sampler/texture binding layout application
- resource reallocation required by shader shape changes
- safe invalidation of old render-local feedback/history resources
### 6. Render Timing Publication
`RenderEngine` should publish observations to `HealthTelemetry` such as:
- frame render duration
- upload duration
- pass execution duration
- pack/readback duration
- preview present timing
- rebuild stalls
- dropped/skipped input uploads
- output frame production latency
It should publish them, not own the health policy built from them.
## Non-Responsibilities
The target boundary should remain explicit about what does not belong here.
`RenderEngine` should not:
- decide whether a parameter mutation is persisted
- normalize OSC/UI actions
- choose device modes
- own DeckLink callback behavior
- own playout headroom policy
- perform stack preset serialization
- broadcast UI state
- treat telemetry as a control plane
Those rules matter because the current codebase often solves timing issues by letting the render path reach sideways into nearby systems.
## GL Ownership Model
### Current Rule
One subsystem owns GL. `RenderEngine` now starts a dedicated render thread, binds the existing GL context on that thread for normal runtime work, and routes input upload, output render, preview presentation, screenshot capture, shader application, and render-local reset work through render-thread requests.
The render thread should:
- create or adopt the GL context
- execute all frame production work
- perform accepted texture uploads
- execute all pass graphs
- manage async readback and output packing
- manage feedback/history resets and reallocations
Other threads should interact with the subsystem through queues, snapshots, and completion signals, not by borrowing the GL context.
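The queue-based interaction can be sketched with a small command queue. `RenderCommandQueue`, `Post`, and `DrainOnRenderThread` are hypothetical names, not the existing `RenderEngine` request API, and the thread itself is omitted so the example stays short.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical queue: any thread may post, only the render thread drains.
class RenderCommandQueue {
public:
    using Command = std::function<void()>;

    // Callable from callback, control, or UI threads. No GL work runs here.
    void Post(Command command) {
        std::lock_guard<std::mutex> lock(mMutex);
        mPending.push_back(std::move(command));
    }

    // Called only on the render thread, with the GL context bound.
    // The batch is swapped out under the lock and executed outside it,
    // so posting threads never wait on GL work.
    std::size_t DrainOnRenderThread() {
        std::vector<Command> batch;
        {
            std::lock_guard<std::mutex> lock(mMutex);
            batch.swap(mPending);
        }
        for (Command& command : batch) command();
        return batch.size();
    }

private:
    std::mutex mMutex;
    std::vector<Command> mPending;
};
```

A render loop would typically call `DrainOnRenderThread()` at the top of each iteration, before consuming snapshots or uploading input.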
### Remaining Timing State
GL ownership is no longer shared across callback-driven and UI entrypoints:
- input upload is requested through [OpenGLVideoIOBridge::UploadInputFrame()](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:11)
- playout-triggered render is requested through [OpenGLVideoIOBridge::RenderScheduledFrame()](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:18)
- render-pass execution occurs in [OpenGLRenderPipeline::RenderFrame()](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:31)
- preview and screenshot paths enter `RenderEngine` queue/request methods
The remaining timing issue is not shared GL ownership; it is the transitional synchronous output request/response path. The DeckLink completion callback still waits while the render thread produces an output frame, fills the DeckLink buffer, and then schedules the next frame.
### Migration Direction
The next target path should be:
1. input callback enqueues frame payloads or references
2. render thread accepts the latest usable input frame
3. render thread performs uploads on its own cadence
4. render thread produces completed output frames ahead of backend demand
5. backend callbacks only dequeue and schedule pre-rendered frames
Phase 4 completed the part that removes callback-thread GL ownership. Phase 7 should complete the producer/consumer playout part.
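The producer/consumer target in steps 4 and 5 could be sketched as a bounded queue of ready frames. `OutputFrame`, `ReadyFrameQueue`, and the depth parameter are assumptions for illustration; the real headroom policy belongs to `VideoBackend`.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical completed frame, already packed for output.
struct OutputFrame {
    std::uint64_t frameIndex = 0;
    std::vector<std::uint8_t> pixels;
};

class ReadyFrameQueue {
public:
    explicit ReadyFrameQueue(std::size_t maxDepth) : mMaxDepth(maxDepth) {}

    // Render thread: returns false when headroom is full, so render does
    // not run further ahead and latency stays bounded.
    bool TryPush(OutputFrame frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.size() >= mMaxDepth) return false;
        mFrames.push_back(std::move(frame));
        return true;
    }

    // Backend callback: only dequeues, never renders. An empty result is
    // an underrun, and the caller's policy decides what to schedule.
    std::optional<OutputFrame> TryPop() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.empty()) return std::nullopt;
        OutputFrame frame = std::move(mFrames.front());
        mFrames.pop_front();
        return frame;
    }

private:
    const std::size_t mMaxDepth;
    std::mutex mMutex;
    std::deque<OutputFrame> mFrames;
};
```

With this shape the completion callback holds the lock only long enough to move one frame out, instead of waiting for a render to finish.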
## Render Loop Boundaries
`RenderEngine` should own a render loop with explicit phases. A good target shape is:
1. drain render-side commands and accepted service events
2. swap to the latest published snapshot if needed
3. apply render-local transient overlays
4. accept or coalesce latest input frame for upload
5. perform required uploads
6. execute pass graph
7. update temporal and feedback resources
8. pack and stage output frame(s)
9. publish preview-ready image if due
10. publish playout-ready frame(s) to `VideoBackend`
11. emit timing and health samples
The important property is that preview, playout preparation, feedback maintenance, and upload execution all happen under one render-owned cadence rather than as ad hoc side effects of unrelated callbacks.
## Snapshot And Overlay Interaction
`RenderEngine` should treat snapshots and overlays as different layers of state.
### Snapshot Inputs
Snapshots should provide:
- layer stack structure
- shader/package selections
- validated committed parameter values
- pass graph definitions
- resource requirements derived from runtime state
### Render-Local Overlay Inputs
Overlays should provide:
- active automation targets
- smoothed transient parameter overrides
- temporary visual state that should not persist back into the store
- queued reset/rebuild invalidations for render-local resources
### Resolution Rule
The render-side resolution order should be:
1. snapshot committed state forms the baseline
2. render-local transient overlays are applied on top
3. feedback/history resources influence shading as render-local inputs
4. completed frame is produced without mutating the underlying snapshot
This is especially important after the OSC work already moved toward render-local overlays. Phase 1 should keep that direction: render consumes committed truth plus transient live overlays, but render does not become the owner of persisted truth.
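Steps 1, 2, and 4 of the resolution order can be sketched for scalar parameters as below. `ParameterValues` and `ResolveFrameParameters` are hypothetical, and real parameters are likely richer than a `double`. Overlays whose key no longer exists in the committed baseline are dropped, which is an assumed pruning policy.

```cpp
#include <map>
#include <string>

using ParameterValues = std::map<std::string, double>;

// Committed snapshot values form the baseline; transient overlays are
// applied on top. The snapshot is taken by const reference and a copy is
// returned, so resolving a frame never mutates committed state.
ParameterValues ResolveFrameParameters(const ParameterValues& committed,
                                       const ParameterValues& overlays) {
    ParameterValues resolved = committed;
    for (const auto& [key, value] : overlays) {
        auto it = resolved.find(key);
        if (it != resolved.end()) it->second = value;  // unknown keys are pruned
    }
    return resolved;
}
```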
## Preview And Playout Relationship
Preview should be a subordinate consumer of render results, not a peer that can disturb playout timing.
### Target Rule
- playout deadlines come first
- preview is best-effort
- preview cadence may be reduced independently
- preview failure must not stall output frame production
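A best-effort preview gate along these lines might look like the following sketch. `PreviewLimiter` and the `playoutUnderLoad` flag are assumptions; how load is detected is left to the render loop.

```cpp
#include <chrono>

// Hypothetical gate deciding whether this frame should also be previewed.
class PreviewLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit PreviewLimiter(double maxFps)
        : mMinInterval(std::chrono::duration_cast<Clock::duration>(
              std::chrono::duration<double>(1.0 / maxFps))) {}

    // Returns true when a preview update is due. Playout always wins:
    // under load the preview is skipped rather than delayed.
    bool ShouldPresent(Clock::time_point now, bool playoutUnderLoad) {
        if (playoutUnderLoad) return false;
        if (mHasPresented && now - mLast < mMinInterval) return false;
        mLast = now;
        mHasPresented = true;
        return true;
    }

private:
    Clock::duration mMinInterval;
    Clock::time_point mLast{};
    bool mHasPresented = false;
};
```

Passing the time in as an argument keeps the gate deterministic and easy to test without a running render loop.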
### Current State
Today preview still hangs off the render pipeline path through `mPaint()` in [OpenGLRenderPipeline::RenderFrame()](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLRenderPipeline.cpp:54). That keeps preview close enough to the playout path that it is still part of the same timing surface.
### Target Shape
`RenderEngine` should internally distinguish:
- playout-ready frame production
- preview presentation or preview-copy publication
Possible later implementations:
- playout frame and preview frame share one composite render, but preview present is decoupled and rate-limited
- render publishes a preview texture handle or CPU-side preview image to a preview presenter
- preview updates are skipped under load without affecting playout queue fill
The exact implementation can change later, but the subsystem contract should already assume preview is subordinate.
## Interaction With `RuntimeSnapshotProvider`
`RenderEngine` should depend on `RuntimeSnapshotProvider`, not on `RuntimeStore`.
Expected interactions:
- query latest snapshot version
- consume latest stable snapshot
- detect structural versus parameter-only changes
- never mutate the snapshot provider's state during render
Expected non-interactions:
- no direct persistence reads/writes
- no raw store mutation
- no direct service ingress handling
This is one of the main Phase 1 guardrails, because the current code often achieves convenience by letting render reach back into runtime-owned mutable objects.
## Interaction With `VideoBackend`
The target dependency direction stays:
- `VideoBackend -> RenderEngine`
That means:
- backend requests or consumes ready frames
- backend reports output timing/completion events
- render does not own output device policy
`RenderEngine` should expose frame-production and queue-facing interfaces, while `VideoBackend` owns:
- device callback handling
- output scheduling policy
- buffer pool policy
- backend state transitions
In later phases, this should evolve toward a producer/consumer queue where:
- render produces completed frames ahead of demand
- backend consumes already-produced frames
- callbacks drive dequeue/schedule/accounting only
## Current Code Mapping
The following current responsibilities should converge into `RenderEngine`.
### From `OpenGLComposite`
- render-local overlay management
- render-facing rebuild application
- screenshot-related render execution hooks
- render bootstrap ownership currently mixed with app bootstrap
### From `OpenGLRenderPipeline`
- frame render orchestration
- output pack conversion
- async readback state
- output frame caching
- preview-ready signal publication
### From `OpenGLVideoIOBridge`
- GL texture upload execution should move under render ownership
- playout callback render work should move out of the callback path
### Remains Outside `RenderEngine`
- device callback registration
- playout scheduling policy
- signal/device status lifecycle
- runtime mutation policy
## Suggested Internal Components
This document does not require final class names, but `RenderEngine` will likely be easier to evolve if it is not one monolithic replacement for `OpenGLComposite`.
Reasonable internal pieces could include:
- `RenderLoopController`
- `RenderSnapshotConsumer`
- `RenderOverlayState`
- `RenderInputQueue`
- `RenderPassExecutor`
- `RenderHistoryManager`
- `RenderOutputStager`
- `PreviewPresenter`
Those are internal implementation helpers. They should not become new cross-cutting subsystem boundaries by themselves.
## Public Interface Shape
Aligned with the Phase 1 design, `RenderEngine` should eventually expose operations in this family:
- `StartRenderLoop(...)`
- `StopRenderLoop()`
- `ConsumeSnapshot(...)`
- `EnqueueInputFrame(...)`
- `ApplyOverlayUpdate(...)`
- `RequestRenderLocalReset(...)`
- `HandleRebuildOutputs(...)`
- `TryProduceOutputFrame(...)`
- `GetPreviewFrame(...)`
- `ReportRenderState()`
Interface goals:
- calls are explicit about whether they mutate render-local state or request frame production
- no caller needs direct GL access
- preview and playout are exposed as outputs, not as reasons for callers to enter the render path
## Migration Plan From Current Code
### Step 1. Name The Boundary
Treat `OpenGLRenderPipeline` plus the render portions of `OpenGLComposite` and `OpenGLVideoIOBridge` as conceptually belonging to `RenderEngine`, even before physical extraction is complete.
### Step 2. Stop New Render Work From Escaping
As new features are added, keep:
- feedback buffers
- temporal history
- render-local overlays
- preview state
inside render-owned code paths instead of putting them back into runtime storage or service layers.
### Step 3. Isolate Snapshot Consumption
Introduce snapshot-facing APIs so render no longer depends on broad runtime-state access for frame production.
Current status: Phase 3 introduced `RenderFrameInput`, `RenderFrameState`, and `RenderFrameStateResolver`, so frame-state selection is named and no longer lives inside GL drawing. Phase 4 built on that contract and moved normal runtime GL ownership onto the render thread.
### Step 4. Move Uploads Onto Render Ownership
Input callbacks should enqueue or hand off frame data; render executes the upload.
### Step 5. Break Callback-Driven Rendering
Move from "render in playout completion callback" to "render ahead and let backend consume ready frames."
### Step 6. Decouple Preview Cadence
Make preview a best-effort presentation path with its own skip/rate-limit policy.
### Step 7. Narrow `OpenGLComposite`
After the above, `OpenGLComposite` should collapse toward a composition root and legacy adapter rather than remaining the owner of render behavior.
## Risks
### Latency Risk
Moving to queue-based frame production can accidentally increase latency if headroom is allowed to grow without policy. `RenderEngine` should therefore expose queue-friendly production, but `VideoBackend` must still own explicit latency/headroom policy.
### Resource Churn Risk
Snapshot changes, shader rebuilds, and video-format changes can cause expensive reallocation of:
- feedback surfaces
- history buffers
- output pack resources
- readback buffers
The subsystem needs clear structural-change boundaries so parameter-only changes do not trigger broad resource churn.
### Preview Coupling Risk
If preview remains too close to the render/playout path, it can continue to steal budget from output production even after the rest of the subsystem is cleaned up.
### Readback Deadline Risk
The current async readback path still falls back to synchronous reads when the deadline is missed. That behavior may remain necessary, but `RenderEngine` should treat it as a degraded-path metric, not as an invisible normal case.
### Overlay Complexity Risk
Render-local overlays are powerful, but they can become a hidden second state model if not kept clearly subordinate to committed snapshot state.
## Open Questions
- Should preview become a separate presenter helper inside `RenderEngine`, or remain a subordinate callback/output sink?
- Where should screenshot capture live long-term: inside `RenderEngine`, or in a small render consumer layered on top of it?
- Should shader compilation outputs be delivered to render as whole-framegraph rebuild packages, or incrementally by layer/pass?
- How should input frame ownership work under load: newest-only, bounded queue, or policy selected by backend mode?
- Should render expose one playout-ready frame at a time, or a bounded ring the backend drains directly?
- What exact distinction should the snapshot provider publish between structural changes and parameter-only changes so render rebuilds stay cheap?
## Phase 1 Exit Criteria For `RenderEngine`
For Phase 1, this subsystem design is sufficiently defined once the project agrees that:
- render is the sole long-term owner of GL work
- render consumes snapshots, not mutable runtime store objects
- preview is subordinate to playout
- feedback/history/overlays are render-local transient state
- backend callbacks should converge toward dequeue/schedule behavior rather than direct rendering
- current render responsibilities in `OpenGLComposite`, `OpenGLRenderPipeline`, and `OpenGLVideoIOBridge` are expected to migrate under this subsystem
## Short Version
`RenderEngine` should become the subsystem that owns live GPU execution and nothing else.
It consumes committed snapshots plus render-local overlays, owns the full GL lifecycle, produces preview and playout-ready frames, and publishes timing observations. It should not own persistence, control ingress, or hardware scheduling policy. If later phases hold to that line, timing work and render-state work can get cleaner without reintroducing the same cross-thread coupling in a different form.


@@ -1,564 +0,0 @@
# RuntimeCoordinator Design Note
This document defines the target design for the `RuntimeCoordinator` subsystem introduced in [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md).
`RuntimeCoordinator` is the mutation and policy layer for the app. Its job is to accept already-normalized actions from ingress systems, decide whether those actions are valid, classify how they should affect durable and live state, and trigger downstream publication or persistence work without taking ownership of rendering, device callbacks, or disk serialization details.
## Why This Subsystem Exists
Before the Phase 1 runtime split, the app's mutation path was spread across several places:
- `RuntimeHost` performed validation, mutation, persistence, render-state invalidation, and some status updates:
- `RuntimeHost.h`
- `RuntimeHost.cpp`
- `OpenGLComposite` currently acts like an orchestration shell and a mutation coordinator at the same time:
- [OpenGLCompositeRuntimeControls.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLCompositeRuntimeControls.cpp:1)
- `RuntimeServices` still owns some deferred control flow around OSC commit and polling:
- [RuntimeServices.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.h:46)
That overlap makes several kinds of regressions more likely:
- persistence policy leaks into control handlers
- render invalidation rules are spread across UI and non-UI paths
- transient automation behavior is hard to reason about
- reload behavior is partly a render concern and partly a runtime concern
- future event-model work has no single policy owner to target
`RuntimeCoordinator` exists to centralize those decisions without becoming a new monolith.
## Core Responsibilities
`RuntimeCoordinator` should own the following responsibilities.
### 1. Mutation intake after normalization
`RuntimeCoordinator` accepts typed, already-parsed actions from `ControlServices` or composition-root adapters. Examples:
- add/remove/move layer
- change shader on a layer
- change a parameter value
- reset a layer
- save or load a stack preset
- request a shader/package reload
- apply a transient automation target
- commit or clear transient overlay state
The coordinator should not parse JSON, decode OSC payloads, or inspect HTTP payload syntax. That belongs to ingress systems.
### 2. Validation and policy decisions
The coordinator validates whether a requested mutation is allowed and decides how it should behave.
Examples:
- whether a layer id exists
- whether a shader id is valid
- whether a parameter exists on the targeted shader
- whether a value is within the definition's allowed range or enum set
- whether a trigger should update committed state, transient state, or both
- whether a structural change should preserve compatible transient state such as feedback buffers
This is the policy surface that used to be spread between `RuntimeHost` methods such as:
- `AddLayer(...)`
- `SetLayerShader(...)`
- `UpdateLayerParameter(...)`
- `UpdateLayerParameterByControlKey(...)`
- `ApplyOscTargetByControlKey(...)`
- `ResetLayerParameters(...)`
See `RuntimeHost.h`.
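As an illustration of the parameter checks listed above, the following sketch validates a requested value against a simplified definition. The types are hypothetical stand-ins, not the real shader parameter definitions, and clamping out-of-range values (rather than rejecting them) is an assumed policy chosen for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified parameter definition: a numeric range only.
struct ParameterRangeSketch
{
    std::string id;
    double minValue = 0.0;
    double maxValue = 1.0;
};

struct ValidationOutcome
{
    bool accepted = false;
    double normalizedValue = 0.0;
    std::string errorMessage;
};

// Rejects unknown parameters and clamps known ones into their declared range.
// Clamping instead of rejecting out-of-range values is an assumption.
inline ValidationOutcome ValidateParameterValue(
    const std::vector<ParameterRangeSketch>& definitions,
    const std::string& parameterId,
    double requestedValue)
{
    const auto it = std::find_if(definitions.begin(), definitions.end(),
        [&](const ParameterRangeSketch& d) { return d.id == parameterId; });
    if (it == definitions.end())
    {
        ValidationOutcome rejected;
        rejected.errorMessage = "unknown parameter: " + parameterId;
        return rejected;
    }

    ValidationOutcome outcome;
    outcome.accepted = true;
    outcome.normalizedValue = std::clamp(requestedValue, it->minValue, it->maxValue);
    return outcome;
}
```

Keeping the outcome as a value (accepted flag, normalized value, error message) lets the coordinator decide what to do next without the validator touching any state.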
### 3. State classification
The coordinator decides which state category a mutation affects:
- persisted state
- committed live state
- transient live overlay state
- health/timing state only
The design rule is that classification belongs here, not in the ingress layer and not in render code.
Phase 5 codifies the shared vocabulary for this classification in `RuntimeStateLayerModel`. Current committed session parameter values and layer bypass state are committed-live/session state owned by `CommittedLiveState`; runtime compile/reload flags are coordination state rather than durable store truth.
### 4. Snapshot publication requests
When a mutation changes render-facing state, the coordinator asks `RuntimeSnapshotProvider` to publish a new snapshot or mark one dirty for publication.
The coordinator does not build render snapshots itself.
### 5. Persistence requests
When a mutation changes durable state, the coordinator asks `RuntimeStore` to record the new authoritative state and, when applicable, request persistence through the store's write path.
The coordinator does not serialize files directly.
### 6. Cross-subsystem consistency policy
The coordinator is where "what else must happen if this changes?" lives.
Examples:
- a layer add/remove/move may require:
  - store mutation
  - snapshot republish
  - compatibility-preserving render-state reset policy
  - optional UI-state notification via later event-model work
- a stack preset load may require:
  - replacement of committed layer stack state
  - invalidation of transient overlay state that no longer maps cleanly
  - snapshot republish
  - deferred persistence request
- an automation target may require:
  - transient overlay update only
  - no persistence write
  - optional later commit into committed live state if policy says so
## Explicit Non-Responsibilities
`RuntimeCoordinator` should explicitly not own the following.
### Not a persistence engine
It does not:
- read or write files
- decide file formats
- own preset storage layout
- perform debounced disk flushing logic
Those belong in `RuntimeStore` and later persistence helpers.
### Not a render engine
It does not:
- own GL objects
- perform shader compilation
- reset temporal history textures directly
- build render passes
- hold frame queues
It may request policy outcomes that cause render-local resets, but render performs the work.
### Not a hardware/backend owner
It does not:
- configure DeckLink
- react directly to device callbacks
- schedule playout
- own input signal callbacks
### Not an ingress transport layer
It does not:
- parse OSC wire messages
- host websockets
- own HTTP handlers
- own polling loops
### Not a health reporting sink
It can emit mutation outcomes and warnings to `HealthTelemetry`, but it should not own counters, logs, or dashboards.
## Mutation Policy
The coordinator should sort mutations into a small number of policy classes rather than making ad hoc per-call decisions.
### Durable mutation
Updates authoritative state that should survive beyond the current session flow.
Examples:
- add/remove/move layer
- change selected shader on a layer
- update a parameter via UI or API
- load a stack preset
- reset a layer to defaults
Expected coordinator behavior:
1. validate the request
2. normalize the target and value if needed
3. update committed/durable state via `RuntimeStore`
4. request snapshot publication
5. request persistence according to policy
### Live committed mutation
Updates committed current-session state that should be treated as true until changed again, but may not need synchronous persistence.
Examples:
- a UI action that changes a parameter repeatedly while dragging
- a manual operator bypass toggle during live use
Expected coordinator behavior:
1. update committed live state
2. request snapshot publication
3. decide whether persistence should happen immediately, be debounced, or be deferred
### Transient overlay mutation
Affects output but should not masquerade as stored truth.
Examples:
- active OSC automation target
- short-lived trigger-driven visual automation state
Expected coordinator behavior:
1. validate the route and target parameter
2. classify the action as transient
3. update overlay state through the appropriate owner boundary
4. avoid persistence unless a separate commit policy is invoked
### Coordination-only mutation
A request that mainly exists to trigger a flow rather than edit value state.
Examples:
- request reload
- request publish-now
- request clear transient state on reset/rebuild
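The four policy classes above can be sketched as a lookup from class to expected downstream effects. This is a rough illustration under stated assumptions: the enum and struct names are hypothetical, and the `PolicyDecides` value stands in for the "immediate, debounced, or deferred" choice described for live committed mutations.

```cpp
enum class MutationPolicyClass { Durable, LiveCommitted, TransientOverlay, CoordinationOnly };

// When, if ever, the store should be asked to write to disk.
enum class PersistenceTiming { Never, PolicyDecides, Requested };

struct PolicyEffects
{
    bool updatesCommittedState = false;
    bool updatesTransientOverlay = false;
    bool requestsSnapshotPublication = false;
    PersistenceTiming persistence = PersistenceTiming::Never;
};

// Maps a policy class to the effects a caller should expect.
inline PolicyEffects EffectsFor(MutationPolicyClass policyClass)
{
    PolicyEffects effects;
    switch (policyClass)
    {
    case MutationPolicyClass::Durable:
        effects.updatesCommittedState = true;
        effects.requestsSnapshotPublication = true;
        effects.persistence = PersistenceTiming::Requested;
        break;
    case MutationPolicyClass::LiveCommitted:
        effects.updatesCommittedState = true;
        effects.requestsSnapshotPublication = true;
        // Immediate, debounced, or deferred: left to coordinator policy.
        effects.persistence = PersistenceTiming::PolicyDecides;
        break;
    case MutationPolicyClass::TransientOverlay:
        // Overlay only: no committed-state change and no persistence.
        effects.updatesTransientOverlay = true;
        break;
    case MutationPolicyClass::CoordinationOnly:
        // Triggers a flow; this sketch assigns no value-state effects.
        break;
    }
    return effects;
}
```

Centralizing the mapping in one function is one way to keep call sites from re-deriving these decisions per handler.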
## Interaction With State Categories
This section restates the Phase 1 state model specifically from the coordinator's perspective.
### Persisted state
`RuntimeCoordinator` does not own persisted state, but it decides when persisted state should change.
Typical interaction:
- validate request
- call into `RuntimeStore`
- receive success/failure
- request persistence if policy says this mutation should be durable
### Committed live state
This is the coordinator's primary logical domain.
The coordinator is the policy owner of:
- current layer stack composition
- current selected shaders
- current bypass flags
- current operator-authored parameter values
`CommittedLiveState` is the physical owner for this current-session layer state. `RuntimeStore` persists or skips disk writes according to coordinator policy and remains the compatibility facade for existing mutation call shapes.
### Transient live overlay state
The coordinator defines the rules for transient state, but should not become the long-term storage owner for render-local transient data.
The expected split is:
- coordinator owns policy
- `ControlServices` may own short ingress-side queues and coalescing buffers
- `RenderEngine` owns render-local transient application state
- `VideoBackend` owns playout and device transient state
For OSC specifically, the coordinator should eventually decide:
- whether an automation change is transient-only
- whether it should later commit into committed live state
- what reset/reload actions invalidate it
Phase 5 sets the default settled OSC policy to session-only. `CommitOscParameterByControlKey(...)` updates committed session state through the store with persistence disabled, publishes ordinary mutation/state-change observations, and does not request a persistence write unless a future explicit policy opts into durable OSC commits.
The committed-live concept now has a physical owner, `CommittedLiveState`, plus a named read model, `CommittedLiveStateReadModel`. The coordinator remains the owner of whether a mutation should be durable or session-only, while `RuntimeStore` persists or skips disk writes according to that policy.
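The session-only default can be pictured with a toy store that counts persistence requests. This is only a sketch: `SketchStore`, `PersistenceMode`, and `CommitOscValueSketch` are hypothetical names and do not mirror the real signature of `CommitOscParameterByControlKey(...)` or the real `RuntimeStore` API.

```cpp
#include <map>
#include <string>

enum class PersistenceMode { SessionOnly, Durable };

// Toy stand-in for the store: holds committed values and counts how often
// a disk write would have been requested.
struct SketchStore
{
    std::map<std::string, double> committedValues;
    int persistenceRequests = 0;

    void SetCommittedValue(const std::string& controlKey, double value, PersistenceMode mode)
    {
        committedValues[controlKey] = value;
        if (mode == PersistenceMode::Durable)
            ++persistenceRequests;
    }
};

// Commits a settled OSC value. The default policy is session-only, so the
// committed value changes but no persistence write is requested unless a
// caller explicitly opts into a durable commit.
inline void CommitOscValueSketch(SketchStore& store,
                                 const std::string& controlKey,
                                 double value,
                                 PersistenceMode oscPolicy = PersistenceMode::SessionOnly)
{
    store.SetCommittedValue(controlKey, value, oscPolicy);
}
```

The policy lives in the argument chosen by the coordinator-side function, while the store simply honors whichever mode it is given.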
### Health and timing state
The coordinator may emit events like:
- mutation rejected
- reload requested
- preset load succeeded/failed
- transient state cleared because structure changed
But those are observations into `HealthTelemetry`, not coordinator-owned data.
## Proposed Interfaces
These are target-shape interfaces, not final signatures.
### Input-facing API
Core mutation entrypoints could look like:
```cpp
#include <string>

// Forward declarations only; the shapes of these types are still to be defined.
struct RuntimeMutationRequest;
struct RuntimeMutationResult;
struct ReloadRequest;
struct OverlayCommitRequest;
struct RuntimeResetScope; // assumed to be a struct for this sketch

class RuntimeCoordinator
{
public:
    RuntimeMutationResult ApplyMutation(const RuntimeMutationRequest& request);
    RuntimeMutationResult ApplyAutomationTarget(const RuntimeMutationRequest& request);
    RuntimeMutationResult ResetLayer(const std::string& layerId);
    RuntimeMutationResult RequestReload(const ReloadRequest& request);
    RuntimeMutationResult CommitOverlayState(const OverlayCommitRequest& request);
    RuntimeMutationResult ClearTransientStateForScope(const RuntimeResetScope& scope);
};
```
The important point is not the exact names. It is that ingress systems send typed requests into one policy owner.
### Downstream collaborators
The coordinator likely needs collaborators conceptually equivalent to:
- `IRuntimeStore`
- `IRuntimeSnapshotProvider`
- `IHealthTelemetry`
- compatibility adapters only where older call shapes still need to be supported during migration
### Mutation result shape
A useful result structure should carry more than success/failure. It should support policy-driven downstream behavior without re-deriving the decision elsewhere.
Suggested fields:
- `accepted`
- `errorMessage`
- `stateChanged`
- `persistedStateChanged`
- `committedLiveStateChanged`
- `transientStateChanged`
- `snapshotPublicationRequired`
- `persistenceRequested`
- `renderResetScope`
- `telemetryNotes`
This prevents callers from guessing whether they need to reload, publish, or persist.
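One possible rendering of the suggested fields as a struct, together with a small planner that turns a result into an ordered list of downstream effects. The `Sketch` suffixes, the reset-scope enum values, and the effect labels are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical reset-scope vocabulary; the real set is an open question.
enum class RenderResetScopeSketch { None, LayerTransientOverlay, LayerHistory, WholeStack };

struct RuntimeMutationResultSketch
{
    bool accepted = false;
    std::string errorMessage;
    bool stateChanged = false;
    bool persistedStateChanged = false;
    bool committedLiveStateChanged = false;
    bool transientStateChanged = false;
    bool snapshotPublicationRequired = false;
    bool persistenceRequested = false;
    RenderResetScopeSketch renderResetScope = RenderResetScopeSketch::None;
    std::vector<std::string> telemetryNotes;
};

// Derives downstream work purely from the result, so the caller never has to
// guess from which method it called. The ordering here is an assumption.
inline std::vector<std::string> PlanDownstreamEffects(const RuntimeMutationResultSketch& result)
{
    std::vector<std::string> effects;
    if (!result.accepted)
    {
        effects.push_back("report-rejection");
        return effects;
    }
    if (result.renderResetScope != RenderResetScopeSketch::None)
        effects.push_back("apply-render-reset");
    if (result.snapshotPublicationRequired)
        effects.push_back("publish-snapshot");
    if (result.persistenceRequested)
        effects.push_back("request-persistence");
    return effects;
}
```

A composition root could iterate the returned list and dispatch each effect to the owning subsystem.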
## Current Code Mapping
The current app does not have a separate coordinator class, but several existing code paths are clearly doing coordinator work.
### `OpenGLCompositeRuntimeControls.cpp`
Methods like:
- `AddLayer(...)`
- `RemoveLayer(...)`
- `MoveLayer(...)`
- `SetLayerBypass(...)`
- `SetLayerShader(...)`
- `UpdateLayerParameterJson(...)`
- `ResetLayerParameters(...)`
- `SaveStackPreset(...)`
- `LoadStackPreset(...)`
currently do this pattern:
1. call a host/store mutation directly
2. decide whether to call `ReloadShader(...)`
3. call `broadcastRuntimeState()`
See [OpenGLCompositeRuntimeControls.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLCompositeRuntimeControls.cpp:1).
That "call host, then decide reload/broadcast policy" logic is a direct candidate for migration into `RuntimeCoordinator`.
### Previous `RuntimeHost`
`RuntimeHost` previously combined:
- mutation validation
- state mutation
- value normalization
- persistence writes
- render-state dirty marking
Examples from the old `RuntimeHost.cpp`:
- `AddLayer(...)`
- `SetLayerShader(...)`
- `UpdateLayerParameter(...)`
- `UpdateLayerParameterByControlKey(...)`
- `ApplyOscTargetByControlKey(...)`
- `ResetLayerParameters(...)`
- `LoadStackPreset(...)`
The target design is not to move all implementation in one step. It is to peel policy and orchestration decisions away first.
### `RuntimeServices`
Current OSC-specific flow in `RuntimeServices` includes:
- queueing updates
- applying pending updates
- queueing commits
- consuming completed commits
- clearing OSC state
See [RuntimeServices.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/control/RuntimeServices.h:46).
The coordinator should eventually own the rules for when these updates are transient, when they commit, and what reset/reload does to them, while `ControlServices` keeps only the ingress mechanics.
## Recommended Internal Model
The coordinator should remain small enough to reason about. A good target is to split its internal logic into policy-focused helpers rather than letting one class become another `RuntimeHost`.
Possible internal helper concepts:
- `LayerMutationPolicy`
- `ParameterMutationPolicy`
- `PresetMutationPolicy`
- `ReloadPolicy`
- `OverlayPolicy`
That can still be presented as one subsystem to the rest of the app, while keeping the implementation testable.
## Snapshot Publication Contract
The coordinator should never force callers to know whether a snapshot must be rebuilt. That policy should be owned here.
Examples:
- parameter changes require snapshot publication
- layer reorder requires snapshot publication
- shader swap requires snapshot publication and render-local rebuild work
- stack preset load requires snapshot publication and likely broader transient-state invalidation
- pure health/status changes do not require snapshot publication
This contract matters because current call sites often use coarse actions like `ReloadShader()` after structural edits. The coordinator should return a more precise outcome than "reload or not."
## Reload and Reset Policy
Reload and reset behavior has been a recurring source of edge cases in the current app, especially with shader feedback, temporal history, and OSC overlay state.
The coordinator should define explicit reset scopes such as:
- parameter-values-only reset
- committed-live-state reset for a layer
- transient-overlay reset for a layer
- render-local-history reset for a layer
- whole-stack structural reset
- reload-induced compatibility reset
That allows later phases to stop encoding reset behavior implicitly in UI handlers or render rebuild code.
Phase 5 has made this more concrete for OSC overlays: coordinator results now carry a named transient OSC invalidation request, with layer-scoped invalidation used for layer removal and manual parameter reset. The render/live-state owner still decides compatibility details, but callers no longer infer transient reset behavior from a generic boolean.
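The reset scopes listed above could be modeled as combinable flags, so one result can name several scopes at once. This is a sketch only: the enum, its values, and the helper functions are hypothetical, and the mapping chosen for a manual parameter reset is an assumption loosely based on the layer-scoped OSC invalidation described above.

```cpp
#include <cstdint>

enum class ResetScope : std::uint32_t
{
    None = 0,
    ParameterValues = 1u << 0,
    CommittedLiveLayer = 1u << 1,
    TransientOverlayLayer = 1u << 2,
    RenderHistoryLayer = 1u << 3,
    WholeStackStructure = 1u << 4,
    ReloadCompatibility = 1u << 5,
};

// Combines two scope sets.
constexpr ResetScope operator|(ResetScope a, ResetScope b)
{
    return static_cast<ResetScope>(static_cast<std::uint32_t>(a) | static_cast<std::uint32_t>(b));
}

// True when `set` contains `flag`.
constexpr bool Includes(ResetScope set, ResetScope flag)
{
    return (static_cast<std::uint32_t>(set) & static_cast<std::uint32_t>(flag)) != 0;
}

// Assumed policy: a manual parameter reset restores values and invalidates
// the layer's transient OSC overlay, but leaves structure untouched.
constexpr ResetScope ScopeForManualParameterReset()
{
    return ResetScope::ParameterValues | ResetScope::TransientOverlayLayer;
}
```

Named flags let render and live-state owners check exactly which scopes apply instead of interpreting a generic boolean.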
## Migration Plan From Current Code
The coordinator should be introduced incrementally.
### Step 1. Define request and result types
Introduce typed mutation request/result objects without changing most internals yet.
### Step 2. Wrap direct runtime mutations behind coordinator entrypoints
The first implementation could still delegate heavily into existing runtime mutation paths, but the call sites should stop deciding policy on their own.
For example, instead of:
1. `OpenGLComposite::AddLayer()`
2. direct layer-add mutation
3. `ReloadShader(true)`
4. `broadcastRuntimeState()`
the flow becomes:
1. `OpenGLComposite` or `ControlServices` creates a typed request
2. `RuntimeCoordinator::ApplyMutation(...)`
3. coordinator returns a result describing snapshot, reset, and persistence needs
4. composition root dispatches those downstream effects
### Step 3. Move validation and classification out of direct mutation helpers
Once coordinator entrypoints are stable, pull up:
- mutation classification
- reset/reload policy
- transient-versus-durable decisions
while leaving raw store operations in place.
### Step 4. Split storage and snapshot collaborators
Only after the coordinator is clearly owning policy should storage and snapshot responsibilities be split into real target subsystems.
## Key Risks
### Risk 1. Coordinator becomes a new god object
If the coordinator starts owning persistence details, status counters, or render reset mechanics directly, it will just recreate the current problem under a new name.
Mitigation:
- keep collaborators explicit
- keep request/result types narrow
- avoid direct dependencies on render or backend internals
### Risk 2. Call sites bypass coordinator during migration
If new code bypasses `RuntimeCoordinator` for convenience, the architecture will fork into two policy systems.
Mitigation:
- treat the coordinator as the required entrypoint for new non-render mutations
- add compatibility adapters rather than parallel mutation paths
### Risk 3. Too much policy stays implicit in return conventions
If callers still infer policy from "which method was called," the coordinator will not actually clarify the system.
Mitigation:
- return explicit mutation outcomes
- define reset and publication scopes as named concepts
### Risk 4. Transient-state ownership remains fuzzy
OSC overlay behavior, feedback invalidation, and reload compatibility can easily blur subsystem boundaries again.
Mitigation:
- coordinator owns classification rules
- subsystem owners retain storage ownership
- reset scopes are explicit
## Open Questions
- Should preset load/save stay synchronous through early migration, or should the coordinator always treat them as policy requests whose persistence effects may complete later?
- Should reload requests be modeled as a dedicated mutation class distinct from ordinary control mutations from the start?
- How much normalization of parameter values should remain in store-side helpers versus moving into coordinator policy helpers?
- Should transient overlay commit policy be global, or parameter-definition-driven for specific shader controls?
- What is the minimal reset-scope vocabulary needed to avoid hard-coding reload behavior in `RenderEngine` later?
## Short Version
`RuntimeCoordinator` is where the app decides what a valid change means.
It should:
- accept typed mutations from ingress systems
- validate and classify them
- update durable and committed state through `RuntimeStore`
- request render-facing publication through `RuntimeSnapshotProvider`
- request persistence when policy requires it
- define reset, reload, and transient-overlay rules
It should not:
- parse transport payloads
- own GL work
- own device callbacks
- write files directly
- become a replacement monolith for every kind of state

@@ -1,476 +0,0 @@
# RuntimeSnapshotProvider Subsystem Design
This document expands the `RuntimeSnapshotProvider` subsystem from [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md) into a concrete subsystem design.
The goal of `RuntimeSnapshotProvider` is to separate render-facing state publication from both runtime mutation policy and durable storage. In the target architecture, render should consume published snapshots rather than reaching into `RuntimeStore` or lock-protected live objects directly.
## Purpose
`RuntimeSnapshotProvider` is the boundary between runtime-owned state and render-consumable state.
It exists to solve three problems that Phase 1 pulled apart:
- render state was built directly out of `RuntimeHost` under a shared mutex
- render read and refreshed partially mutable cached layer state in more than one place
- state publication, state versioning, and dynamic frame-field refresh need explicit ownership
Before the Phase 1 runtime split, the closest behavior lived in:
- `RuntimeHost::GetLayerRenderStates(...)`
- `RuntimeHost::TryGetLayerRenderStates(...)`
- `RuntimeHost::TryRefreshCachedLayerStates(...)`
- `RuntimeHost::RefreshDynamicRenderStateFields(...)`
- `RuntimeHost::BuildLayerRenderStatesLocked(...)`
- the render-side cache usage in [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:589)
`RuntimeSnapshotProvider` has absorbed that responsibility in a cleaner and more publish-oriented way.
## Responsibilities
`RuntimeSnapshotProvider` is responsible for:
- publishing stable, versioned snapshots that can be consumed without large shared mutable locks
- giving `RenderEngine` a cheap read path for the latest committed snapshot
- making snapshot invalidation and publication rules explicit
`RenderSnapshotBuilder` is responsible for:
- building render-facing snapshots from the committed-live read model and package/runtime metadata supplied by `RuntimeStore`
- separating structural snapshot changes from dynamic frame fields
- translating runtime layer state into render-ready layer descriptors
- attaching immutable or near-immutable shader/package-derived data needed by render
- maintaining render snapshot version counters and frame advancement
It is not responsible for:
- deciding whether a mutation is valid
- classifying a change as transient versus durable
- directly accepting OSC/UI/file-watch requests
- disk persistence
- GL resource allocation
- shader compilation execution
- render-local transient overlays such as live OSC overlay state, temporal history textures, or feedback textures
## Design Principles
### Render consumes published state, not store internals
The render side should never need to walk `RuntimeStore` structures directly or perform per-frame reconstruction under the store lock.
### Structural data and dynamic frame fields are different classes of data
The layer stack, shader ids, parameter definitions, texture assets, font assets, feedback declarations, and temporal requirements change relatively infrequently. Frame count, wall time, UTC time, and similar values change every frame.
`RuntimeSnapshotProvider` should publish structural snapshots and provide a separate mechanism for frame-local dynamic enrichment, rather than rebuilding everything for every frame.
### Snapshot reads should be cheap and explicit
The render side should be able to say:
- give me the latest published snapshot
- tell me whether the structural snapshot version changed
- apply dynamic frame fields for this frame
without having to infer cache validity from multiple host-owned counters and fallback lock behavior.
### Published shape should be stable
The shape of render-facing layer state should remain consistent across phases even if the underlying store or coordination model changes.
## Snapshot Inputs
`RenderSnapshotBuilder` should build from a read-oriented runtime view, not from direct mutation calls. `RuntimeSnapshotProvider` should consume the builder's output and own publication/cache behavior.
That view now includes:
- committed live layer state from `CommittedLiveStateReadModel`
- package and manifest metadata supplied through `RuntimeStore`
- durable runtime configuration needed to describe render-facing dimensions and defaults
The important Phase 1 rule is not "the provider always reads one specific object." It is:
- the builder consumes read-oriented committed runtime state
- the provider consumes builder-published render snapshot data
- the provider does not own mutation policy
- render consumes the provider's published output instead of reaching back into whichever runtime object currently stores the truth
## Snapshot Model
The subsystem should publish a render snapshot object rather than loose vectors and ad hoc version getters.
Suggested top-level shape:
```cpp
#include <cstdint>
#include <vector>

struct RuntimeRenderLayerSnapshot; // per-layer shape, defined separately

struct RuntimeRenderSnapshot
{
    uint64_t snapshotVersion = 0;
    uint64_t structureVersion = 0;
    uint64_t parameterVersion = 0;
    uint64_t packageVersion = 0;
    uint64_t publicationSequence = 0;
    unsigned inputWidth = 0;
    unsigned inputHeight = 0;
    unsigned outputWidth = 0;
    unsigned outputHeight = 0;
    std::vector<RuntimeRenderLayerSnapshot> layers;
};
```
Suggested per-layer shape:
```cpp
#include <map>
#include <string>
#include <vector>

// ShaderParameterDefinition, ShaderParameterValue, ShaderTextureAsset,
// ShaderFontAsset, TemporalHistorySource, and FeedbackSettings are assumed to
// come from the existing shader type headers (for example ShaderTypes.h).
struct RuntimeRenderLayerSnapshot
{
    std::string layerId;
    std::string shaderId;
    std::string shaderName;
    double mixAmount = 1.0;
    double bypass = 0.0;
    std::vector<ShaderParameterDefinition> parameterDefinitions;
    std::map<std::string, ShaderParameterValue> parameterValues;
    std::vector<ShaderTextureAsset> textureAssets;
    std::vector<ShaderFontAsset> fontAssets;
    bool isTemporal = false;
    TemporalHistorySource temporalHistorySource = TemporalHistorySource::None;
    unsigned requestedTemporalHistoryLength = 0;
    unsigned effectiveTemporalHistoryLength = 0;
    FeedbackSettings feedback;
};
```
This is intentionally close to today's [RuntimeRenderState](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/shader/ShaderTypes.h:134), but split so dynamic fields are not embedded in the published structural snapshot.
Suggested per-frame dynamic supplement:
```cpp
struct RuntimeRenderFrameContext
{
    double timeSeconds = 0.0;
    double utcTimeSeconds = 0.0;
    double utcOffsetSeconds = 0.0;
    double startupRandom = 0.0;
    double frameCount = 0.0;
};
```
`RenderEngine` can combine `RuntimeRenderSnapshot` and `RuntimeRenderFrameContext` into its final frame-local render input without forcing snapshot republish every frame.
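A rough sketch of how per-frame context could be produced independently of snapshot publication, assuming a small clock helper built on `std::chrono`. `FrameContextSketch` is a reduced, hypothetical version of the frame context, and `FrameClockSketch` is an invented name; the real timing source may differ.

```cpp
#include <chrono>
#include <cstdint>

// Reduced, hypothetical subset of the frame context fields.
struct FrameContextSketch
{
    double timeSeconds = 0.0;
    double utcTimeSeconds = 0.0;
    double frameCount = 0.0;
};

class FrameClockSketch
{
public:
    FrameClockSketch() : start_(std::chrono::steady_clock::now()) {}

    // Produces the dynamic fields for the next frame. Nothing here reads,
    // mutates, or republishes the structural snapshot.
    FrameContextSketch NextFrame()
    {
        using Seconds = std::chrono::duration<double>;
        FrameContextSketch context;
        // Monotonic elapsed time since the clock was created.
        context.timeSeconds = Seconds(std::chrono::steady_clock::now() - start_).count();
        // Wall-clock time; the epoch is the Unix epoch on common platforms
        // (guaranteed from C++20).
        context.utcTimeSeconds =
            Seconds(std::chrono::system_clock::now().time_since_epoch()).count();
        context.frameCount = static_cast<double>(frameIndex_++);
        return context;
    }

private:
    std::chrono::steady_clock::time_point start_;
    std::uint64_t frameIndex_ = 0;
};
```

Render would call `NextFrame()` once per frame and pair the result with whichever snapshot it already holds.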
## Publication Rules
The provider should publish a new structural snapshot when any render-relevant structural or committed-live field changes, including:
- layer add/remove/reorder
- shader id change on a layer
- layer bypass change
- parameter value change that is part of committed live state
- shader package metadata refresh that changes parameter definitions, assets, temporal declarations, or feedback declarations
- input or output dimensions that change render-facing layer interpretation
- stack preset load that changes any render-facing state
The provider should not publish a new structural snapshot just because:
- time advanced by one frame
- frame count increased
- preview cadence changed
- render-local transient overlay state changed
- temporal history or feedback textures changed
- device playout queue state changed
That distinction matters because the current model effectively mixes structural publication with frame-local refresh and lock-driven fallback logic.
## Versioning Model
The provider should own explicit version domains rather than exposing only host-wide counters.
Recommended version domains:
- `structureVersion`
  - changes when the layer graph or shader/package-derived structure changes
- `parameterVersion`
  - changes when committed parameter or bypass values change
- `packageVersion`
  - changes when shader manifests or package-derived metadata relevant to render changes
- `snapshotVersion`
  - a composed version for consumers that only need a single fast invalidation key
- `publicationSequence`
  - monotonic sequence number for diagnostics and telemetry
Recommended rules:
- `snapshotVersion` changes whenever any render-visible aspect of the structural snapshot changes
- `structureVersion` should not change for pure parameter edits
- `parameterVersion` should not change for time-only updates
- dynamic frame context should not require any version change
This makes later cache policy much cleaner:
- shader rebuild decisions can key off structure/package changes
- parameter buffer refresh can key off parameter changes
- frame-local updates can ignore snapshot publication entirely
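The bump rules above can be sketched as a single function over the version domains. The names are hypothetical, and treating `snapshotVersion` as a counter that advances with any domain change is an assumption; a composed hash or tuple would satisfy the same rules.

```cpp
#include <cstdint>

struct SnapshotVersionsSketch
{
    std::uint64_t structureVersion = 0;
    std::uint64_t parameterVersion = 0;
    std::uint64_t packageVersion = 0;
    std::uint64_t snapshotVersion = 0;
};

enum class ChangeKind { Structure, Parameter, Package, FrameOnly };

// Bumps only the domain that changed, plus the composed snapshot version.
// Frame-only changes bump nothing.
inline void ApplyChange(SnapshotVersionsSketch& versions, ChangeKind kind)
{
    switch (kind)
    {
    case ChangeKind::Structure:
        ++versions.structureVersion;
        break;
    case ChangeKind::Parameter:
        ++versions.parameterVersion;
        break;
    case ChangeKind::Package:
        ++versions.packageVersion;
        break;
    case ChangeKind::FrameOnly:
        return; // dynamic frame context never changes a version
    }
    ++versions.snapshotVersion;
}
```

A render-side cache could then compare `structureVersion` and `packageVersion` for rebuild decisions and `parameterVersion` for buffer refresh.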
## Snapshot Read Rules
The target read contract for `RenderEngine` should be:
1. acquire the latest published snapshot atomically or under a very small provider-owned read lock
2. compare relevant versions with the render-side cached state
3. if unchanged, reuse render-local compiled/cached resources
4. if changed, rebuild only the portions implied by the changed version domains
5. attach the current `RuntimeRenderFrameContext` for the frame being rendered
Important rule:
- `RenderEngine` should never partially mutate the provider's published snapshot in place.
The old `TryRefreshCachedLayerStates(...)` host path is gone. The remaining dynamic refresh is explicit: `RuntimeSnapshotProvider::RefreshDynamicRenderStateFields(...)` updates frame-local fields on render-owned copies, while published snapshot structure and committed parameter data stay behind the provider boundary.
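Steps 1 and the in-place mutation rule can be illustrated with a replace-on-publish slot that hands out `std::shared_ptr<const ...>` under a very small lock. This is a minimal sketch, not the provider's actual implementation; `SnapshotSketch` and `SnapshotSlotSketch` are hypothetical names.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>

// Stand-in for the published snapshot; only the version is modeled.
struct SnapshotSketch
{
    std::uint64_t snapshotVersion = 0;
};

class SnapshotSlotSketch
{
public:
    // Builds the new immutable snapshot outside the lock, then swaps the
    // pointer. Readers holding the previous snapshot are unaffected.
    void Publish(SnapshotSketch snapshot)
    {
        std::shared_ptr<const SnapshotSketch> next =
            std::make_shared<SnapshotSketch>(std::move(snapshot));
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(next);
    }

    // Returns the latest published snapshot, or nullptr before first publish.
    // The lock covers only a pointer copy.
    std::shared_ptr<const SnapshotSketch> GetLatest() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return latest_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const SnapshotSketch> latest_;
};
```

Because readers receive a pointer to `const` data, render code cannot mutate the published snapshot through this handle and must copy anything it wants to enrich per frame.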
## Render-Facing Data Shape Rules
The published snapshot should contain exactly the data render needs to interpret a layer, but not render-local execution artifacts.
Include:
- layer identity
- shader identity and display name
- parameter definitions
- committed parameter values
- bypass and mix flags needed for layer evaluation
- texture and font asset declarations
- temporal settings
- feedback settings
- input/output dimensions when they affect shader configuration or resource interpretation
Do not include:
- GL object ids
- framebuffer handles
- compiled shader programs
- live texture bindings resolved to hardware units
- temporal history texture state
- feedback buffer contents
- queued OSC overlays
- queued input frames
- preview frame caches
- DeckLink buffer handles
This boundary matters because the current `RuntimeRenderState` is already close to render-ready data, but the subsystem contract should stop short of actual device or GL execution artifacts.
## Proposed Public Interface
Suggested interface shape:
```cpp
#include <cstdint>
#include <memory>

// RuntimeRenderSnapshot, RuntimeRenderFrameContext, RuntimeStoreView, and
// SnapshotBuildOptions are assumed to be declared elsewhere.
class IRuntimeSnapshotProvider
{
public:
    virtual ~IRuntimeSnapshotProvider() = default;
    virtual RuntimeRenderSnapshot BuildSnapshot(
        const RuntimeStoreView& storeView,
        const SnapshotBuildOptions& options) const = 0;
    virtual void PublishSnapshot(RuntimeRenderSnapshot snapshot) = 0;
    virtual std::shared_ptr<const RuntimeRenderSnapshot> GetLatestSnapshot() const = 0;
    virtual uint64_t GetSnapshotVersion() const = 0;
    virtual RuntimeRenderFrameContext BuildFrameContext() const = 0;
};
```
Likely supporting methods:
- `BuildLayerSnapshot(...)`
- `BuildFrameContext(...)`
- `ComputeSnapshotVersion(...)`
- `DidStructureChange(...)`
- `DidParametersChange(...)`
- `PublishIfChanged(...)`
Notes:
- `GetLatestSnapshot()` should ideally return a shared immutable snapshot pointer or equivalent stable handle
- `BuildFrameContext()` may remain provider-owned or later move behind a clock/timing helper if that subsystem becomes more explicit
- publication should be initiated by `RuntimeCoordinator`, not by render
## Relationship to Other Subsystems
### `RuntimeStore`
`RenderSnapshotBuilder` depends on store-owned durable metadata and the committed-live read model exposed through store-facing read APIs. `RuntimeSnapshotProvider` depends on the builder rather than reaching into store internals directly.
Committed session layer state now lives in `CommittedLiveState`; `RuntimeStore` remains the facade that combines that read model with package metadata and persistence-owned data for snapshot publication.
Neither the builder nor provider should mutate the store directly.
### `RuntimeCoordinator`
`RuntimeCoordinator` decides when a mutation requires snapshot republish.
The provider should not reclassify policy. It should only:
- build
- compare
- publish
based on the change request it is asked to materialize.
### `RenderEngine`
`RenderEngine` is the main consumer.
It should:
- read the latest published snapshot
- treat that snapshot as immutable
- derive render-local artifacts from it
- keep frame-local overlays and history outside the provider
### `HealthTelemetry`
The provider should emit:
- snapshot publication counts
- snapshot build duration
- version bump reason categories
- publication suppression counts when no effective change occurred
- warning states if snapshot build repeatedly fails
This is especially important while migrating away from the current lock/fallback model.
## Current Code Mapping
The current runtime path is:
1. get latest published snapshot from provider
2. compare snapshot versions produced by `RenderSnapshotBuilder`
3. rebuild through `RenderSnapshotBuilder` only if needed
4. apply render-local overlay state
5. attach frame context
That replaced the old mixed lock/cache/fallback flow that lived around [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/composite/OpenGLComposite.cpp:589).
`RenderSnapshotBuilder` now owns:
- layer render-state construction
- render-facing translation of committed live state plus package metadata
- explicit version composition for render-visible state
- dynamic frame-field refresh for render-owned copies
`RuntimeSnapshotProvider` now owns:
- published snapshot cache ownership
- version matching for already-published snapshots
- publication events and snapshot publish observations
## Migration Plan
### Step 1: Introduce provider types without changing behavior
- define `RuntimeRenderSnapshot`, `RuntimeRenderLayerSnapshot`, and `RuntimeRenderFrameContext`
- initially implement provider methods as thin wrappers over existing behavior
- completed: replace the temporary `RuntimeHost` backing source with `RenderSnapshotBuilder`
### Step 2: Route render reads through the provider
- replace direct host/store layer-state reads with provider snapshot reads
- preserve current version behavior first, even if internally bridged to existing counters
### Step 3: Separate structural publication from frame context
- stop rebuilding structural layer state just to refresh time and frame values
- let render request frame context separately each frame
### Step 4: Remove mutable snapshot refresh paths
- completed: retire the old `TryRefreshCachedLayerStates(...)` host path
- publish new snapshots for committed parameter changes instead of mutating published snapshot structure in place
### Step 5: Move publication triggering fully behind `RuntimeCoordinator`
- no render-driven snapshot rebuilding
- coordinator requests publication after successful committed mutations and reloads
## Risks
### Risk: snapshot copies become expensive
Publishing whole snapshots on every parameter commit could be expensive if the layer stack grows.
Mitigation:
- use immutable shared snapshots with replace-on-publish semantics
- consider per-layer structural sharing later if real profiles justify it
- avoid republishing for frame-local time-only changes
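One way to read "immutable shared snapshots with replace-on-publish semantics" is a slot that swaps a `std::shared_ptr<const T>` under a short lock. The sketch below assumes hypothetical types (`LayerSnapshotSketch`, `PublishedSnapshotSketch`, `SnapshotSlotSketch`) and is not taken from the codebase.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct LayerSnapshotSketch {
    std::string shaderId;
    bool bypass = false;
};

struct PublishedSnapshotSketch {
    unsigned long long version = 0;
    std::vector<LayerSnapshotSketch> layers;
};

class SnapshotSlotSketch {
public:
    // Build the new snapshot outside the lock, then swap the pointer under it.
    void Publish(PublishedSnapshotSketch next) {
        std::shared_ptr<const PublishedSnapshotSketch> fresh =
            std::make_shared<PublishedSnapshotSketch>(std::move(next));
        std::lock_guard<std::mutex> lock(mMutex);
        mCurrent = std::move(fresh);
    }

    // Readers copy the pointer only; the snapshot they hold is never mutated.
    std::shared_ptr<const PublishedSnapshotSketch> Acquire() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mCurrent;
    }

private:
    mutable std::mutex mMutex;
    std::shared_ptr<const PublishedSnapshotSketch> mCurrent;
};
```

A reader that acquired a snapshot before a publish keeps a valid, unchanged copy until it drops the handle, so the lock is held only for the pointer copy. The cost that remains is the snapshot build itself, which is where per-layer structural sharing could help later.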
### Risk: unclear boundary between committed state and transient overlay state
If overlays are accidentally folded into the published snapshot, the provider will recreate the coupling that the subsystem split is supposed to remove.
Mitigation:
- keep overlays render-local or coordinator-owned transient state
- document that snapshots represent committed render-facing truth, not in-flight automation state
### Risk: version domains are under-specified
If version rules are not crisp, render may still over-rebuild or miss needed updates.
Mitigation:
- make version bump reasons explicit
- log version-domain changes during migration
- add tests around parameter-only, structure-only, and package-only changes
### Risk: snapshot publication is treated as a background convenience rather than a core contract
If code keeps reaching around the provider into the store, the architecture will remain half-split.
Mitigation:
- treat provider publication as the only supported render-facing state publication path
- convert direct host/store render-state methods into adapters, then remove them
## Testing Strategy
The provider should be testable without GL or hardware.
Recommended tests:
- snapshot build from a sample layer stack
- parameter-only mutation increments `parameterVersion` but not `structureVersion`
- layer reorder increments `structureVersion`
- shader manifest change increments `packageVersion`
- frame context changes over time without forcing `snapshotVersion` changes
- repeated publish with no effective change suppresses unnecessary version bumps
- feedback and temporal declarations are preserved correctly in published layer snapshots
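The version-domain tests above could be driven by a small pure function like the one below. The field names follow the list; `ChangeKindSketch`, `ApplyChangeSketch`, and the rule that `snapshotVersion` advances whenever any domain advances are assumptions made for illustration.

```cpp
#include <cstdint>

struct SnapshotVersionDomainsSketch {
    std::uint64_t structureVersion = 0;
    std::uint64_t parameterVersion = 0;
    std::uint64_t packageVersion = 0;
    std::uint64_t snapshotVersion = 0;
};

enum class ChangeKindSketch { None, ParameterOnly, Structure, Package };

// Returns true when a new snapshot needs publishing.
inline bool ApplyChangeSketch(SnapshotVersionDomainsSketch& versions, ChangeKindSketch kind) {
    switch (kind) {
    case ChangeKindSketch::None:
        return false;  // no effective change: suppress every bump
    case ChangeKindSketch::ParameterOnly:
        ++versions.parameterVersion;
        break;
    case ChangeKindSketch::Structure:
        ++versions.structureVersion;
        break;
    case ChangeKindSketch::Package:
        ++versions.packageVersion;
        break;
    }
    ++versions.snapshotVersion;  // assumed: any domain bump advances the overall version
    return true;
}
```

Because the function touches no GL or hardware state, each bullet in the list maps to a few assertions against the counters.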
## Open Questions
- Should output dimensions live inside the top-level snapshot only, or also be copied into each layer snapshot for compatibility with current code paths?
- Should package-derived compile-ready pass source metadata eventually be published by this provider, or remain a separate build artifact pipeline?
- Is `BuildFrameContext()` part of the provider long-term, or should timing/clock publication become its own helper owned adjacent to `HealthTelemetry`?
- Do parameter-only changes always require full snapshot republish, or should later phases add more granular per-layer publication handles?
- Should the provider own input signal dimensions directly, or should those come from a backend-published runtime environment view supplied during build?
## Completion Criteria For This Subsystem
`RuntimeSnapshotProvider` can be considered architecturally in place once:
- render no longer reads `RuntimeStore` or legacy host render state directly
- render consumes published snapshot handles rather than rebuilding layer vectors from host state
- dynamic frame fields are supplied separately from structural snapshot publication
- snapshot version domains are explicit and observable
- transient overlays remain outside the published snapshot contract
## Short Version
`RuntimeSnapshotProvider` should become the single place that turns committed runtime state into render-consumable published snapshots.
Its contract is:
- build from store-owned state
- publish immutable or near-immutable render snapshots; the current implementation keeps the last matching versioned snapshot in `RuntimeSnapshotProvider`
- version them explicitly
- keep frame-local timing separate
- give render a cheap, lock-light read path
If that boundary is held, later phases can isolate render timing and decouple playout without inventing a second render-state authority.


@@ -1,590 +0,0 @@
# RuntimeStore Subsystem Design
This document expands the `RuntimeStore` portion of [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md) into a subsystem-specific design note.
The purpose of `RuntimeStore` is to give the Phase 1 target architecture one clear home for durable runtime data. Before the Phase 1 runtime split, that responsibility was spread through `RuntimeHost`, where persistence, mutation entrypoints, render-state building, shader metadata access, and status reporting all shared the same object and lock domain. `RuntimeStore` is the design boundary that separates "what the app knows and saves" from "how the app decides to mutate it" and "how rendering consumes it."
## Role In The Phase 1 Architecture
Within the Phase 1 subsystem model, `RuntimeStore` is the durable data authority.
It exists to answer questions like:
- what runtime configuration is currently loaded
- what the saved layer stack structure is
- what the saved parameter values are
- what stack presets exist and what they contain
- what package and manifest metadata is available for validation and snapshot building
It should not answer questions like:
- should this control mutation be allowed
- should this OSC value be treated as transient or persisted
- how should the render thread consume state
- when should output frames be scheduled
- what warnings should be shown to the operator
That policy belongs elsewhere:
- mutation policy: `RuntimeCoordinator`
- render-facing publication: `RuntimeSnapshotProvider`
- hardware timing: `VideoBackend`
- operational visibility: `HealthTelemetry`
## Design Goals
`RuntimeStore` should optimize for:
- explicit ownership of durable runtime data
- predictable disk-backed load and save behavior
- minimal knowledge of GL, callbacks, or live playout timing
- stable read models for validation and snapshot building
- a clean seam for introducing debounced or asynchronous persistence later
- testability without GPU or DeckLink dependencies
## Responsibilities
`RuntimeStore` owns persisted and operator-authored state.
Primary responsibilities:
- load runtime host configuration from disk
- load saved runtime state from disk
- save runtime state snapshots to disk
- own the stored layer stack model
- own persisted parameter values and bypass flags
- own stack preset serialization and deserialization
- own package/manifest metadata needed across renders and reloads
- expose query/read APIs over stored state
- expose write APIs for coordinator-approved durable mutations
- normalize or repair stored data at load boundaries when necessary
Secondary responsibilities that still fit here:
- path resolution for runtime state and preset files
- preset name normalization/file-stem safety
- compatibility handling for older saved-state schemas
- default seeding of initial persistent state when no saved runtime exists
## Non-Responsibilities
`RuntimeStore` must not become a general convenience layer again.
It does not own:
- render-thread timing
- GL objects or resource lifetime
- shader compilation orchestration
- render-local transient state such as temporal history, feedback buffers, preview caches, or playout queues
- OSC smoothing, coalescing, or overlay application
- websocket broadcast policy
- REST or OSC ingress handling
- device callbacks, queue-depth policy, or preroll policy
- app-wide health aggregation
It also should not directly decide:
- whether a mutation is valid in policy terms
- whether a change should persist immediately, eventually, or not at all
- when a new render snapshot should be published
- whether a reload should be treated as config-only, package-only, or render-affecting
Those are coordinator concerns, not store concerns.
## State Ownership
`RuntimeStore` should own the following state categories.
Phase 5 names this boundary in code through `RuntimeStateLayerModel`: persisted layer stack data, saved parameter values, and stack presets are classified as base persisted state. Operator/session values are owned by `CommittedLiveState`; their mutation policy is committed-live policy owned by the coordinator, not durable-store policy by default.
Phase 5 also adds `CommittedLiveState` as the physical owner of current session/operator layer state and `CommittedLiveStateReadModel` as the named read boundary for render snapshot publication. `RuntimeStore` still owns file IO, config, package metadata, preset persistence, and persistence requests, but it delegates current-session layer mutations to `CommittedLiveState`.
### Runtime Configuration
Examples:
- server/control ports
- OSC bind address
- OSC smoothing defaults
- runtime paths and directory configuration
- any host-side configuration loaded from `config/runtime-host.json`
This data is durable, file-backed, and not inherently render-local.
### Persistent Layer Stack State
Examples:
- ordered layer list
- stable layer ids
- selected shader id per layer
- bypass state
- persisted parameter values
This is the stored "official" layer model, not a render-thread working copy.
### Stack Presets
Examples:
- preset names
- serialized saved layer stacks under `runtime/stack_presets`
Preset files are durable artifacts and should remain in the store domain even if later phases add async writing.
### Shader/Package Metadata Needed As Durable Reference Data
Examples:
- discovered shader package manifests
- parameter definitions used for validation/default restoration
- manifest-level capability metadata such as temporal history and feedback declarations
- package ordering that should survive across reloads
Important distinction:
- manifest and package metadata belongs here
- render-ready compiled programs and GPU resources do not
### Load-Time Compatibility/Repair State
Examples:
- schema version adaptation
- default value filling for missing parameters
- removal or migration of layers that reference missing packages
- preset compatibility cleanup
This should be treated as store hygiene during ingest, not runtime mutation policy.
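As a rough sketch of load-time repair, the function below drops layers whose shader package is missing and fills parameters absent from the saved state with defaults. All names (`StoredLayerRepairSketch`, `PackageDefaultsSketch`, `RepairReportSketch`, `NormalizeLoadedLayersSketch`) and the choice to remove rather than migrate such layers are assumptions; the actual `NormalizeLoadedState(...)` behavior may differ.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct StoredLayerRepairSketch {
    std::string shaderId;
    std::map<std::string, double> parameters;
};

// shader id -> (parameter id -> default value)
using PackageDefaultsSketch = std::map<std::string, std::map<std::string, double>>;

struct RepairReportSketch {
    int removedLayers = 0;
    int filledParameters = 0;
};

inline RepairReportSketch NormalizeLoadedLayersSketch(std::vector<StoredLayerRepairSketch>& layers,
                                                      const PackageDefaultsSketch& catalog) {
    RepairReportSketch report;
    std::vector<StoredLayerRepairSketch> kept;
    for (auto& layer : layers) {
        auto package = catalog.find(layer.shaderId);
        if (package == catalog.end()) {
            ++report.removedLayers;  // layer references a package that no longer exists
            continue;
        }
        for (const auto& definition : package->second) {
            // emplace keeps an existing saved value and inserts only when the key is missing
            if (layer.parameters.emplace(definition.first, definition.second).second) {
                ++report.filledParameters;
            }
        }
        kept.push_back(std::move(layer));
    }
    layers = std::move(kept);
    return report;
}
```

Returning a report rather than logging keeps the repair step free of telemetry dependencies; the caller decides what to surface.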
## Data Model Boundaries
`RuntimeStore` should present data in durable-model terms rather than live-render terms.
Core model groupings:
- `RuntimeConfigModel`
- `PersistentLayerStackModel`
- `LayerStoredState`
- `StoredParameterValue`
- `StackPresetModel`
- `ShaderPackageCatalog` or equivalent durable package registry view
The exact C++ types may differ from these names, but the boundary should hold:
- store models describe durable intent
- snapshot models describe render consumption
That means `RuntimeStore` should not expose render-optimized structures such as `RuntimeRenderState` directly as its primary interface.
## Interface Shape
The Phase 1 architecture doc already sketches the high-level interface. This section expands it.
### Load / Save Interface
Expected responsibilities:
- `LoadConfig()`
- `LoadPersistentState()`
- `BuildPersistentStateSnapshot(...)`
- `RequestPersistence(...)`
- `LoadStackPreset(...)`
- `SaveStackPreset(...)`
- `GetStackPresetNames()`
Design notes:
- `Load*` operations should parse and normalize external file content into durable in-memory models.
- `Save*` operations should serialize durable models without needing render or control subsystem context.
- Debounce/background writing wraps these operations rather than redefining store ownership.
### Read Interface
Expected responsibilities:
- `GetRuntimeConfig()`
- `GetStoredLayerStack()`
- `FindStoredLayer(...)`
- `GetShaderPackageCatalog()`
- `GetStackPresetNames()`
- `BuildPersistenceSnapshot()` or equivalent stable serialization input
Design notes:
- read APIs should support coordinator validation and snapshot building
- read APIs should avoid exposing raw mutable internals across subsystem boundaries
- stable read snapshots from the store are fine; render snapshots are still the snapshot provider's job
### Write Interface
Expected responsibilities:
- `SetStoredLayerStack(...)`
- `ReplaceStoredLayer(...)`
- `SetStoredParameterValue(...)`
- `SetStoredBypassState(...)`
- `SetStoredShaderSelection(...)`
- `ReplaceShaderPackageCatalog(...)`
Design notes:
- writes should assume the coordinator already decided the mutation is allowed
- store APIs may still enforce structural invariants and shape correctness
- writes should not contain ingress-specific policy like OSC smoothing or UI throttling
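A write method in this style might look like the sketch below: it checks shape (the layer exists, the parameter is already defined) and nothing else. `StoreWriteResultSketch`, `StoredLayerWriteSketch`, and `LayerStackWriteSketch` are hypothetical, and the real `SetStoredParameterValue(...)` signature is not given in this note.

```cpp
#include <map>
#include <string>
#include <vector>

enum class StoreWriteResultSketch { Ok, UnknownLayer, UnknownParameter };

struct StoredLayerWriteSketch {
    std::string layerId;
    std::map<std::string, double> parameters;  // keys are the parameters the layer defines
};

class LayerStackWriteSketch {
public:
    void AddLayer(const StoredLayerWriteSketch& layer) { mLayers.push_back(layer); }

    // Structural checks only; whether the write is allowed was decided by the caller.
    StoreWriteResultSketch SetStoredParameterValue(const std::string& layerId,
                                                   const std::string& parameterId,
                                                   double value) {
        for (auto& layer : mLayers) {
            if (layer.layerId != layerId) continue;
            auto it = layer.parameters.find(parameterId);
            if (it == layer.parameters.end()) return StoreWriteResultSketch::UnknownParameter;
            it->second = value;
            return StoreWriteResultSketch::Ok;
        }
        return StoreWriteResultSketch::UnknownLayer;
    }

    bool TryGetStoredParameterValue(const std::string& layerId,
                                    const std::string& parameterId,
                                    double& out) const {
        for (const auto& layer : mLayers) {
            if (layer.layerId != layerId) continue;
            auto it = layer.parameters.find(parameterId);
            if (it == layer.parameters.end()) return false;
            out = it->second;
            return true;
        }
        return false;
    }

private:
    std::vector<StoredLayerWriteSketch> mLayers;
};
```

Nothing here knows whether the value came from OSC, the UI, or a preset, which is the point of keeping ingress policy in the coordinator.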
### Normalization / Validation-Support Interface
Expected responsibilities:
- `NormalizeLoadedState(...)`
- `EnsureStoredDefaults(...)`
- `MakeSafePresetFileStem(...)`
- package lookup helpers for parameter-definition queries
Design notes:
- lightweight structure and schema validation belongs here
- policy validation belongs in the coordinator
- render compatibility translation belongs in the snapshot provider
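For the preset file-stem helper, one plausible normalization is shown below. The rules are assumptions chosen for illustration (lowercase ASCII letters and digits, runs of anything else collapsed to a single `-`, and a fallback stem of `preset`); the project's `MakeSafePresetFileStem(...)` may use different rules.

```cpp
#include <cctype>
#include <string>

inline std::string MakeSafePresetFileStemSketch(const std::string& presetName) {
    std::string stem;
    bool lastWasSeparator = true;  // start true so leading separators are skipped
    for (unsigned char ch : presetName) {
        if (std::isalnum(ch)) {
            stem.push_back(static_cast<char>(std::tolower(ch)));
            lastWasSeparator = false;
        } else if (!lastWasSeparator) {
            stem.push_back('-');  // collapse any run of other characters to one dash
            lastWasSeparator = true;
        }
    }
    while (!stem.empty() && stem.back() == '-') {
        stem.pop_back();  // drop a trailing separator
    }
    return stem.empty() ? std::string("preset") : stem;
}
```

Because path separators and dots never survive, a name such as `../evil` cannot escape the preset directory in this sketch.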
## Concurrency Expectations
`RuntimeStore` should be designed as a shared data authority, but not as the app's global lock for everything.
Phase 1 design expectations:
- coordinator-driven writes may still be synchronized internally
- read APIs should be safe for coordinator and snapshot-provider use
- render should not directly take a large mutable store lock in the target architecture
This implies:
- `RuntimeStore` may keep an internal mutex during migration
- that mutex should protect durable models only
- render-facing consumers should eventually read via `RuntimeSnapshotProvider`, not by reaching into the store
One of the main goals here is avoiding the old situation where runtime lock scope effectively mixed:
- persistent state
- status reporting
- render-state caches
- timing stats
- reload flags
`RuntimeStore` should sharply narrow that scope.
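A narrow lock scope could look like the following sketch, where the mutex guards only the durable layer list and readers receive a copy instead of holding the lock. `DurableLayerSketch` and `DurableStackSketch` are hypothetical names, not classes from the codebase.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

struct DurableLayerSketch {
    std::string shaderId;
    bool bypass = false;
};

class DurableStackSketch {
public:
    void Append(const std::string& shaderId) {
        DurableLayerSketch layer;
        layer.shaderId = shaderId;
        std::lock_guard<std::mutex> lock(mMutex);
        mLayers.push_back(layer);
    }

    bool SetBypass(std::size_t index, bool bypass) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (index >= mLayers.size()) return false;
        mLayers[index].bypass = bypass;
        return true;
    }

    // Callers work on a copy, so nobody holds the store lock while rendering or serializing.
    std::vector<DurableLayerSketch> CopyLayers() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mLayers;
    }

private:
    mutable std::mutex mMutex;  // guards mLayers only: no status, timing, or render caches
    std::vector<DurableLayerSketch> mLayers;
};
```

The copy is the trade-off: it costs an allocation per read, which is acceptable for coordinator and snapshot-builder use but is another reason render should read published snapshots instead.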
## Dependency Rules
Per the Phase 1 subsystem design, `RuntimeStore` should sit low in the dependency graph.
Allowed inbound dependencies:
- `RuntimeCoordinator -> RuntimeStore`
- `RenderSnapshotBuilder -> RuntimeStore`
- temporary migration shims from `ControlServices` only where explicitly tolerated
Allowed outbound dependencies:
- file/serialization helpers
- package manifest parsing helpers
- pure utility types
Not allowed:
- `RuntimeStore -> RenderEngine`
- `RuntimeStore -> VideoBackend`
- `RuntimeStore -> ControlServices`
- `RuntimeStore -> HealthTelemetry` for behavior control
The store may emit errors or return result objects, but it should not coordinate the rest of the system directly.
## Current Code Mapping
Before the Phase 1 runtime split, `RuntimeHost` contained many responsibilities that needed to move into `RuntimeStore` or adjacent runtime collaborators.
Previous code paths:
- config load:
  - `RuntimeHost.cpp`
- persistent state load:
  - `RuntimeHost.cpp`
- persistent state save:
  - `RuntimeHost.cpp`
- preset save/load:
  - `RuntimeHost.cpp`
- state serialization helpers:
  - `RuntimeHost.cpp`
- path and file helpers:
  - `RuntimeHost.cpp`
Durable-state mutation entrypoints that previously lived on `RuntimeHost` but are conceptually split between coordinator and store:
- layer stack edits:
  - `AddLayer`
  - `RemoveLayer`
  - `MoveLayer`
  - `MoveLayerToIndex`
- committed-state edits:
  - `SetLayerBypass`
  - `SetLayerShader`
  - `UpdateLayerParameter`
  - `ResetLayerParameters`
The target split should be:
- validation/policy/orchestration -> `RuntimeCoordinator`
- durable state write application -> `RuntimeStore`
Methods that were intentionally not moved into `RuntimeStore` because they belong under other runtime subsystems:
- render-state building and caching:
  - `GetLayerRenderStates`
  - `TryRefreshCachedLayerStates`
  - `BuildLayerRenderStatesLocked`
- status/timing reporting:
  - `SetSignalStatus`
  - `SetPerformanceStats`
  - `SetFramePacingStats`
  - `AdvanceFrame`
- live reload flags/polling shell:
  - `PollFileChanges`
  - `ManualReloadRequested`
  - `ClearReloadRequest`
Those belong under other target subsystems.
## Proposed Internal Subcomponents
`RuntimeStore` does not need to be one monolithic class forever. A practical internal shape would be:
- `RuntimeConfigStore`
  - runtime host config load and resolved paths
The current codebase has completed this part of the split: `RuntimeConfigStore` owns config parsing, path resolution, configured ports/formats, runtime roots, and shader compiler paths, while `RuntimeStore` exposes compatibility-shaped delegates for existing callers.
- `CommittedLiveState`
  - current committed/session layer stack and parameter values
  - layer CRUD/reorder and shader selection for the running session
  - committed-live read model for snapshot publication
- `LayerStackStore`
  - backing layer stack mechanics used by committed-live state
  - layer CRUD/reorder and shader selection helpers
  - stack preset value serialization/load helpers
- `RuntimeStatePresenter` / `RuntimeStateJson`
  - runtime-state JSON assembly
  - layer-stack presentation serialization
- `RenderSnapshotBuilder`
  - render-state assembly and parameter refresh
  - dynamic frame-field refresh and render snapshot version counters
- `ShaderPackageCatalog`
  - durable manifest/package metadata
  - shader package scanning, status/order/lookup, and asset/source change comparison
- `PersistenceWriter` helper
  - synchronous at first, async/debounced later
The current codebase has completed the committed-live split: `CommittedLiveState` owns current committed/session layer state using `LayerStackStore` backing mechanics. `RuntimeStore` keeps file IO, package metadata, persistence serialization, persistence requests, preset file access, and facade methods for existing callers.
The current codebase has completed the render snapshot split: `RenderSnapshotBuilder` owns render-state assembly, cached parameter refresh, dynamic frame-field refresh, and render snapshot versions. `RuntimeSnapshotProvider` depends on this builder rather than on `RuntimeStore` friendship.
The current codebase has also completed the presentation split: `RuntimeStatePresenter` owns top-level runtime-state JSON assembly, while `RuntimeStateJson` owns the layer-stack and parameter presentation shape used by runtime state clients.
The current codebase has also completed the package split: `ShaderPackageCatalog` owns package scanning and registry comparison, while `RuntimeStore` uses it to keep layer state valid and to build compatibility read models.
These can still be presented through one subsystem facade during migration.
## Persistence Model
The store should treat persistence as durable snapshot management, not incremental side-effect spraying.
Target behavior:
- in-memory durable models are updated first
- serialization snapshots are built from those models
- save requests persist a coherent snapshot
This matters because earlier code called persistent-state saves directly from mutation paths. Phase 6 removed that pressure point: accepted durable mutations now publish persistence requests, and `RuntimeStore::RequestPersistence(...)` builds a coherent snapshot for the background writer.
The Phase 1 design for `RuntimeStore` should therefore assume:
- store ownership of serialization remains
- persistence requests, not mutation methods, are the durable write boundary
Phase 6 added that background snapshot writer underneath this subsystem, while keeping the durable model here.
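The request-based write boundary can be illustrated with a coalescing slot: many accepted mutations request persistence, and the writer persists only the newest coherent snapshot. This is a single-threaded sketch with a hypothetical `PersistenceRequestSlotSketch` class; the real background writer would also need synchronization and debounce timing, which are omitted here.

```cpp
#include <functional>
#include <string>
#include <utility>

class PersistenceRequestSlotSketch {
public:
    // Called after an accepted durable mutation with a fully serialized snapshot.
    void Request(std::string serializedSnapshot) {
        mPending = std::move(serializedSnapshot);  // a newer request replaces the older one
        mHasPending = true;
    }

    // Called by the writer; returns true when a snapshot was handed to `write`.
    bool Flush(const std::function<void(const std::string&)>& write) {
        if (!mHasPending) return false;
        mHasPending = false;
        write(mPending);
        return true;
    }

private:
    std::string mPending;
    bool mHasPending = false;
};
```

Three requests followed by one flush produce a single write containing the last snapshot, so a burst of parameter edits does not translate into a burst of disk writes.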
## Migration Plan From Current Code
The safest migration path is to extract responsibilities by interface, not by big-bang rename.
### Step 1: Introduce The `RuntimeStore` Name And Facade
Create a facade interface for the durable-data parts that used to live in `RuntimeHost`.
Initial likely contents:
- config load/save access
- persistent layer-stack get/set access
- preset load/save access
- package catalog read access
This stage is complete: `RuntimeStore` owns its durable/session backing fields directly rather than wrapping a `RuntimeHost` object.
### Step 2: Move Pure Persistence Helpers First
Low-risk extractions:
- path resolution helpers
- file read/write helpers
- preset enumeration and serialization helpers
- persistent-state serialization/deserialization helpers
These have relatively low coupling to GL and backend timing.
### Step 3: Split Durable Models From Render Cache/Status Fields
Durable fields to move out or conceptually separate:
- `mPersistentState`
- runtime config fields
- preset roots and runtime roots
- package catalog/order metadata
Fields that should stay elsewhere:
- render-state dirty flags and caches
- status/timing counters
- reload flags
This is one of the most important separations in the whole program.
### Step 4: Route Durable Mutations Through Coordinator-Owned Policy
Once the coordinator exists, `RuntimeStore` write calls should become lower-level and less policy-rich.
Examples:
- `SetStoredParameterValue(...)` rather than `ApplyOscTargetByControlKey(...)`
- `ReplaceStoredLayerStack(...)` rather than `LoadStackPreset(...)` directly mutating every downstream concern
### Step 5: Keep Render Off The Store
As `RuntimeSnapshotProvider` arrives, render should stop reading store internals directly.
That is the moment where `RuntimeStore` becomes a proper durable authority instead of a shared mutable app center.
## Risks
### 1. Recreating `RuntimeHost` Under A New Name
The biggest risk is calling something `RuntimeStore` while leaving policy, status, and render-cache behavior attached.
Guardrail:
- only durable data and store hygiene belong here
### 2. Letting Validation Drift Into Persistence
Store-level shape validation is appropriate. High-level mutation policy is not.
Risk examples:
- store decides whether OSC should persist
- store decides whether a layer reorder should trigger snapshot publication
- store decides whether a reload is render-only or package-affecting
Those are coordinator decisions.
### 3. Overexposing Mutable Internals
If callers keep direct mutable access to the underlying vectors/maps, the subsystem boundary will exist only on paper.
Guardrail:
- prefer controlled write methods and stable read models
### 4. Coupling Package Metadata Too Tightly To Compile Outputs
Package manifest and parameter-definition metadata belongs here. Compiled program state does not.
Guardrail:
- keep compile products and GPU artifacts out of the store
### 5. Using The Store Lock As A Global Synchronization Shortcut
This would recreate timing and contention issues in a new form.
Guardrail:
- store locking protects durable models only
- render synchronization must happen through snapshots, not by sharing the store lock
## Open Questions
### 1. How Much Shader Package Data Should Live Here?
Clear yes:
- manifest metadata
- parameter definitions
- package discovery/order information
Still open:
- whether compile-ready transformed sources belong here or in a later build-focused subsystem
Current recommendation:
- keep only durable reference/package metadata here
### 2. Should Preset Application Be A Store Operation Or A Coordinator Operation?
The file load and preset parse clearly belong here.
The policy question of how a loaded preset affects live state, snapshot publication, overlays, and notifications belongs in the coordinator.
Current recommendation:
- `RuntimeStore` loads preset content
- `RuntimeCoordinator` decides how to apply it
### 3. How Early Should Async Persistence Land?
Phase 1 does not require it, but the store design should not block it.
Current recommendation:
- keep synchronous save semantics initially if needed
- shape the interfaces so a background writer can be introduced without changing subsystem ownership
## Success Criteria For This Subsystem
`RuntimeStore` can be considered well-defined once the codebase can say, without ambiguity:
- all durable runtime config and saved layer data has one authoritative home
- stack presets are owned by that same durable-data subsystem
- render does not depend on store internals directly
- timing/status/reporting state is no longer mixed into the same subsystem
- persistence ownership is clear even before async persistence is introduced
## Short Version
`RuntimeStore` is the subsystem that should answer:
- what durable runtime data exists
- what saved layer stack and parameters exist
- what presets and package metadata exist
- how that durable data is loaded and serialized
It should not answer:
- whether a mutation should happen
- how rendering should consume state
- how hardware pacing should work
- what health warnings should be emitted
If this boundary holds, later phases can continue without recreating the old coupling under a different class name.


@@ -1,694 +0,0 @@
# VideoBackend Subsystem Design
This note defines the target design for the `VideoBackend` subsystem introduced in [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md).
It focuses on input/output device lifecycle, pacing, buffering, and recovery policy for live video I/O. It does not redefine the whole app architecture. Its job is to make the backend boundary concrete enough that later phases can move current DeckLink and bridge code toward one clear ownership model.
## Purpose
`VideoBackend` is the hardware-facing timing subsystem.
It owns:
- video device discovery and capability inspection
- input and output device configuration
- input callback handling
- output callback handling
- buffer-pool ownership for device-facing frames
- playout headroom policy
- queueing and pacing policy between render and hardware
- input signal presence tracking
- backend lifecycle and degraded-state transitions
It does not own:
- GL contexts
- frame composition
- shader execution
- persistence
- control mutation policy
- render snapshot publication
The core rule is:
- `RenderEngine` produces frames
- `VideoBackend` moves those frames to and from hardware at the right cadence
## Why This Subsystem Exists
Today the boundary between render and hardware pacing is still too blurred.
The main current pressure points are:
- `OpenGLVideoIOBridge` still performs render-facing work inside the output completion callback:
  - [OpenGLVideoIOBridge.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:83)
- `DeckLinkSession` owns device setup, mutable output frame pools, and schedule timing in one class:
  - [DeckLinkSession.h](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.h:13)
  - [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:289)
- the output scheduler currently reacts to late and dropped frames with a fixed skip policy:
  - [VideoPlayoutScheduler.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/VideoPlayoutScheduler.cpp:26)
- the current output frame pool and preroll depth are not sourced from one policy object:
  - `DeckLinkSession::ConfigureOutput()` creates `10` mutable output frames
  - `kPrerollFrameCount` is currently `12`
Those overlaps make latency, buffering, and recovery behavior harder to reason about.
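A single policy object for pool size and preroll depth might be as small as the sketch below. `PlayoutBufferPolicySketch` and the rule that the pool equals preroll plus spare frames are assumptions; this note does not say how the two numbers should relate, only that they should come from one place.

```cpp
// Hypothetical policy: both numbers derive from one object instead of two constants.
struct PlayoutBufferPolicySketch {
    int prerollFrameCount;
    int spareFrameCount;

    bool IsValid() const { return prerollFrameCount > 0 && spareFrameCount >= 0; }

    // Assumed rule: enough frames to fill preroll plus some spare headroom.
    int OutputPoolSize() const { return prerollFrameCount + spareFrameCount; }
};

// Helper for auditing existing constants against the assumed rule.
inline bool PoolCoversPreroll(int poolSize, int prerollFrameCount) {
    return poolSize >= prerollFrameCount;
}
```

Under this assumed rule the current pair (pool of `10`, preroll of `12`) would fail `PoolCoversPreroll`. Whether that is an actual fault depends on how frames are reused during preroll, which is not stated here.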
## Subsystem Responsibilities
`VideoBackend` should own the following responsibilities explicitly.
### 1. Device Discovery and Capability Reporting
The subsystem should:
- discover available input and output devices
- choose the configured input/output pair
- inspect mode support and pixel-format support
- expose capability facts needed by higher layers
Examples:
- input present or absent
- output present or absent
- model name
- keyer support
- internal/external keying availability
- supported pixel formats for the configured mode
- input/output frame sizes
This work is currently mostly in:
- [DeckLinkSession.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:76)
### 2. Input Lifecycle and Input Callback Handling
The subsystem should:
- configure input mode and pixel format
- install and own the input callback delegate
- start and stop capture streams
- translate hardware input frames into backend-level input frame events
- track signal-present versus no-input-source conditions
It should not decide how uploaded textures are produced. That belongs to `RenderEngine`.
The backend may expose input frames as:
- borrowed CPU-accessible frame views
- backend-managed input frame objects
- typed input events containing signal state and frame payload metadata
This work is currently split across:
- [DeckLinkSession::ConfigureInput](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:221)
- [CaptureDelegate::VideoInputFrameArrived](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkFrameTransfer.cpp:33)
- [OpenGLVideoIOBridge::UploadInputFrame](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:11)
### 3. Output Lifecycle and Output Callback Handling
The subsystem should:
- configure output mode and pixel format
- own the output frame pool
- install and own the scheduled-frame completion callback
- start scheduled playback
- stop scheduled playback
- account for completion results such as completed, late, dropped, and flushed
It should not render the next frame in the callback path.
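A completion path that only accounts for results, and leaves rendering to another thread, could be sketched like this. The enum mirrors the four outcomes named above; `CompletionResultSketch`, `CompletionCountersSketch`, `RecordCompletion`, and `NeedsRecovery` are hypothetical and are not the DeckLink SDK types.

```cpp
#include <atomic>
#include <cstdint>

enum class CompletionResultSketch { Completed, Late, Dropped, Flushed };

// Atomic counters so a callback thread can record while another thread reads.
struct CompletionCountersSketch {
    std::atomic<std::uint64_t> completed{0};
    std::atomic<std::uint64_t> late{0};
    std::atomic<std::uint64_t> dropped{0};
    std::atomic<std::uint64_t> flushed{0};
};

// Cheap enough for a callback: one atomic increment, no rendering, no locks.
inline void RecordCompletion(CompletionCountersSketch& counters, CompletionResultSketch result) {
    switch (result) {
    case CompletionResultSketch::Completed:
        counters.completed.fetch_add(1, std::memory_order_relaxed);
        break;
    case CompletionResultSketch::Late:
        counters.late.fetch_add(1, std::memory_order_relaxed);
        break;
    case CompletionResultSketch::Dropped:
        counters.dropped.fetch_add(1, std::memory_order_relaxed);
        break;
    case CompletionResultSketch::Flushed:
        counters.flushed.fetch_add(1, std::memory_order_relaxed);
        break;
    }
}

// Assumed policy hook: only late and dropped frames ask the scheduler to react.
inline bool NeedsRecovery(CompletionResultSketch result) {
    return result == CompletionResultSketch::Late || result == CompletionResultSketch::Dropped;
}
```

The counters can then be read by whatever publishes backend health, without the callback itself touching render or telemetry code.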
This work is currently split across:
- [DeckLinkSession::ConfigureOutput](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:273)
- [DeckLinkSession::Start](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkSession.cpp:358)
- [PlayoutDelegate::ScheduledFrameCompleted](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/videoio/decklink/DeckLinkFrameTransfer.cpp:79)
### 4. Pacing and Scheduling Policy
The subsystem should own:
- target frame duration and timescale
- schedule time generation
- preroll policy
- spare-buffer policy
- queue headroom policy
- late-frame and dropped-frame recovery policy
This is not just a utility detail. It is one of the main timing responsibilities of the subsystem.
The current `VideoPlayoutScheduler` is a useful seed, but it is too small and too implicit to represent the eventual backend policy by itself.
### 5. Device-Facing Buffer Pools
The subsystem should own all device-facing buffers that exist to satisfy the hardware API contract.
Examples:
- mutable output frames created through DeckLink
- any staging buffers required by a future non-DeckLink backend
- reusable CPU frame containers for hardware ingress/egress
The goal is to make buffer depth and lifetime explicit and measurable.
`RenderEngine` may own render surfaces and GPU readback resources. `VideoBackend` owns the buffers required to talk to the hardware or OS video I/O API.
### 6. Backend Health and Degraded State
The subsystem should publish operational state such as:
- running normally
- prerolling
- temporarily late
- dropping frames
- no input signal
- output stopped
- failed to configure
This state should be reported to `HealthTelemetry`, not hidden inside debug logs or modal dialog paths.
## Boundary With Other Subsystems
This subsystem must stay aligned with the Phase 1 dependency rules.
Allowed directions:
- `VideoBackend -> RenderEngine`
- `VideoBackend -> HealthTelemetry`
Not allowed in the target design:
- `VideoBackend -> RuntimeStore`
- `VideoBackend -> RuntimeCoordinator`
- `VideoBackend -> ControlServices`
The important operational boundary is:
- `VideoBackend` may request or consume rendered output frames
- it may not own frame composition policy
That means:
- no shader parameter validation here
- no persistence decisions here
- no direct mutation of runtime state here
## State Owned by VideoBackend
`VideoBackend` should own the following state categories.
### Device Configuration State
Examples:
- selected device handles
- configured input/output formats
- negotiated pixel formats
- keyer configuration
- output model name
- supported keying flags
### Session Lifecycle State
Examples:
- discovered
- configured
- prerolling
- running
- degraded
- stopping
- stopped
- failed
### Input Runtime State
Examples:
- signal present or missing
- last observed input format properties
- input frame counters
- input callback timestamps
- queued capture frames awaiting render ingestion
### Output Runtime State
Examples:
- output queue depth
- free system-memory playout frame count
- ready system-memory playout frame count
- scheduled system-memory playout frame count
- scheduled frame index
- completed frame index
- late frame count
- dropped frame count
- underrun/repeat/drop counters for system-memory playout policy
- frame age at schedule time and completion callback time
- spare buffer count
- current headroom target
### Backend-Owned Transient Buffers
Examples:
- output mutable frame pool
- playout ring buffer entries
- input frame handoff queue
- staging buffers if required by the device API
This is transient live state, not persisted state.
## Target Lifecycle Model
`VideoBackend` should eventually expose an explicit lifecycle state machine rather than relying on scattered imperative calls.
Suggested states:
1. `uninitialized`
2. `discovering`
3. `discovered`
4. `configuring`
5. `configured`
6. `prerolling`
7. `running`
8. `degraded`
9. `stopping`
10. `stopped`
11. `failed`
Suggested transition rules:
- `uninitialized -> discovering`
- `discovering -> discovered | failed`
- `discovered -> configuring | stopped`
- `configuring -> configured | failed`
- `configured -> prerolling | stopped`
- `prerolling -> running | failed | stopping`
- `running -> degraded | stopping | failed`
- `degraded -> running | stopping | failed`
- `stopping -> stopped`
Why this matters:
- startup failure reporting becomes more predictable
- backend recovery can become policy-driven
- telemetry can report backend state directly
- later backends do not need to mimic DeckLink's exact imperative shape
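The suggested states and transition rules above can be sketched as a small table-driven state machine. This is an illustrative sketch only: `BackendState`, `IsAllowedTransition`, and `BackendLifecycle` are hypothetical names, not existing types in the app, and `stopped` and `failed` are treated as terminal because no outgoing transitions are listed for them.

```cpp
#include <cstdint>

// Hypothetical enum mirroring the suggested lifecycle states.
enum class BackendState : std::uint8_t {
    Uninitialized, Discovering, Discovered, Configuring, Configured,
    Prerolling, Running, Degraded, Stopping, Stopped, Failed
};

// Returns true only for the transitions listed in the suggested rules.
inline bool IsAllowedTransition(BackendState from, BackendState to) {
    using S = BackendState;
    switch (from) {
    case S::Uninitialized: return to == S::Discovering;
    case S::Discovering:   return to == S::Discovered || to == S::Failed;
    case S::Discovered:    return to == S::Configuring || to == S::Stopped;
    case S::Configuring:   return to == S::Configured || to == S::Failed;
    case S::Configured:    return to == S::Prerolling || to == S::Stopped;
    case S::Prerolling:    return to == S::Running || to == S::Failed || to == S::Stopping;
    case S::Running:       return to == S::Degraded || to == S::Stopping || to == S::Failed;
    case S::Degraded:      return to == S::Running || to == S::Stopping || to == S::Failed;
    case S::Stopping:      return to == S::Stopped;
    case S::Stopped:
    case S::Failed:        return false; // assumption: terminal, no rules are listed
    }
    return false;
}

// Minimal holder that rejects transitions outside the rule table.
class BackendLifecycle {
public:
    BackendState State() const { return mState; }

    bool TryTransition(BackendState next) {
        if (!IsAllowedTransition(mState, next)) {
            return false; // caller can report the rejected transition to telemetry
        }
        mState = next;
        return true;
    }

private:
    BackendState mState = BackendState::Uninitialized;
};
```

Keeping the rules in one function means telemetry and tests can check the same table the backend uses, rather than inferring state from call order.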
## Target Timing Model
The long-term timing design should be producer/consumer playout.
### Current Model
Today the callback path effectively does this:
1. DeckLink signals completion.
2. The callback path asks for a new output buffer.
3. The callback path requests render-thread output production.
4. The render thread renders the next frame.
5. The render thread reads it back into the output buffer.
6. The callback path schedules the next hardware frame.
That path is visible in:
- [OpenGLVideoIOBridge::RenderScheduledFrame](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:18)
This no longer borrows the GL context from the callback thread, but it still couples output timing directly to render-thread work.
### Target Model
The target model should be:
1. `RenderEngine` produces completed output frames at the configured cadence.
2. `RenderEngine` places them into a bounded queue owned or mediated by `VideoBackend`.
3. `VideoBackend` dequeues ready frames when the device needs them.
4. Hardware callbacks only:
   - record completion results
   - release or recycle buffers
   - dequeue and schedule the next ready frame
   - raise underrun or degraded-state signals if needed
The timing rule becomes:
- render is the producer
- hardware output is the consumer
This gives the app a clear place to manage:
- target latency
- playout headroom
- stale-frame reuse
- underrun behavior
- spare buffer policy
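A minimal sketch of the producer/consumer handoff described above, assuming a simple mutex-guarded bounded queue. `ReadyFrame` and `ReadyFrameQueue` are hypothetical names for illustration. A real implementation would carry a buffer handle and timing metadata rather than only a frame index.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a completed, ready-to-schedule output frame.
struct ReadyFrame {
    std::uint64_t frameIndex = 0;
};

// Bounded queue between the render producer and the hardware consumer.
class ReadyFrameQueue {
public:
    explicit ReadyFrameQueue(std::size_t capacity) : mCapacity(capacity) {}

    // Producer side (render). Returns false when the queue is full so the
    // producer can decide whether to wait or drop.
    bool TryPush(const ReadyFrame& frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.size() >= mCapacity) {
            return false;
        }
        mFrames.push_back(frame);
        return true;
    }

    // Consumer side (hardware callback). Takes only a short lock and never
    // waits for a frame to be rendered. An empty queue counts as an underrun.
    bool TryPop(ReadyFrame& out) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.empty()) {
            ++mUnderruns;
            return false;
        }
        out = mFrames.front();
        mFrames.pop_front();
        return true;
    }

    std::size_t Depth() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mFrames.size();
    }

    std::uint64_t UnderrunCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mUnderruns;
    }

private:
    const std::size_t mCapacity;
    mutable std::mutex mMutex;
    std::deque<ReadyFrame> mFrames;
    std::uint64_t mUnderruns = 0;
};
```

The queue capacity is where target latency and playout headroom become a single measurable number. What happens on a failed `TryPop` is a separate decision for the underrun policy.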
## Input Buffering and Pacing
The input side needs a simpler but still explicit handoff model.
Recommended target behavior:
- hardware callbacks push input frames into a bounded ingress queue
- `RenderEngine` pulls the newest useful input frame when preparing a render
- if the ingress queue overflows, old frames are discarded according to policy
Recommended default policy for live playout:
- prefer recency over completeness
- drop stale capture frames instead of blocking render or output
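The recency-first policy above could look roughly like the following sketch. `CapturedFrame` and `InputIngressQueue` are hypothetical names, and the drop accounting is one possible choice rather than the app's current behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a captured input frame.
struct CapturedFrame {
    std::uint64_t sequence = 0;
};

// Bounded ingress queue that prefers recency over completeness.
class InputIngressQueue {
public:
    explicit InputIngressQueue(std::size_t capacity)
        : mCapacity(capacity == 0 ? 1 : capacity) {}

    // Called from the hardware input callback. Never blocks on render:
    // when the queue is full, the oldest frame is discarded.
    void Push(const CapturedFrame& frame) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.size() >= mCapacity) {
            mFrames.pop_front();
            ++mDropped;
        }
        mFrames.push_back(frame);
    }

    // Called by render when preparing a frame. Takes the newest queued
    // frame and discards anything older as stale.
    bool TryTakeNewest(CapturedFrame& out) {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mFrames.empty()) {
            return false;
        }
        out = mFrames.back();
        mDropped += mFrames.size() - 1;
        mFrames.clear();
        return true;
    }

    std::uint64_t DroppedCount() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mDropped;
    }

private:
    const std::size_t mCapacity;
    mutable std::mutex mMutex;
    std::deque<CapturedFrame> mFrames;
    std::uint64_t mDropped = 0;
};
```

With a capacity of one this degenerates to a latest-value mailbox. A larger capacity only matters if render ever needs more than the newest frame.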
The current latest-input mailbox behavior is directionally correct for live timing:
- [OpenGLVideoIOBridge::UploadInputFrame](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:11)
The next improvement is to make the backend-to-render handoff policy more explicit in telemetry and playout scheduling, rather than treating it as only a render command mailbox detail.
Suggested input metrics:
- input frames received
- no-signal transitions
- input queue depth
- dropped input frames
- oldest queued input age
## Output Buffering and Headroom Policy
Output buffering should be policy-driven from one source of truth.
The target design should define a playout buffering policy object with at least:
- target preroll depth
- minimum spare device buffers
- maximum queued rendered frames
- allowed catch-up depth
- underrun behavior
Example policy fields:
- `targetPrerollFrames`
- `minSpareOutputBuffers`
- `maxReadyFrames`
- `maxCatchUpFrames`
- `reuseLastFrameOnUnderrun`
- `allowAdaptiveHeadroom`
This replaces the current split between:
- fixed mutable frame pool size in `DeckLinkSession::ConfigureOutput()`
- fixed preroll count in `kPrerollFrameCount`
- fixed skip-ahead recovery in `VideoPlayoutScheduler`
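As a rough illustration, the example policy fields could be gathered into one struct that other sizing decisions derive from. The field names come from the list above. The default values, `IsValid`, and `RequiredOutputPoolSize` are assumptions made for this sketch, not measured or recommended settings.

```cpp
// Sketch of a single source of truth for output buffering.
// Default values are placeholders chosen for illustration only.
struct PlayoutPolicy {
    unsigned targetPrerollFrames = 3;
    unsigned minSpareOutputBuffers = 1;
    unsigned maxReadyFrames = 4;
    unsigned maxCatchUpFrames = 2;
    bool reuseLastFrameOnUnderrun = true;
    bool allowAdaptiveHeadroom = false;
};

// Hypothetical sanity check: playout needs at least one preroll frame
// and room for at least one ready frame.
inline bool IsValid(const PlayoutPolicy& policy) {
    return policy.targetPrerollFrames > 0 && policy.maxReadyFrames > 0;
}

// Hypothetical derivation: size the device-facing frame pool from the
// policy instead of hardcoding it in output configuration.
inline unsigned RequiredOutputPoolSize(const PlayoutPolicy& policy) {
    return policy.targetPrerollFrames + policy.minSpareOutputBuffers;
}
```

Deriving the pool size and preroll count from one object is what removes the split: changing headroom becomes one edit, and telemetry can report the policy that is actually in effect.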
## Underrun and Recovery Policy
The backend should define explicit behavior for when no fresh frame is ready at schedule time.
Candidate policies:
1. Reuse the last completed rendered frame.
2. Reuse the last scheduled output frame.
3. Schedule a known black or degraded frame.
4. Temporarily increase headroom if the system is repeatedly catching up.
Which one is correct may differ by operating mode, but the choice should be explicit rather than incidental.
Similarly, completion-result handling should become measured rather than fixed.
The current scheduler does this:
- late or dropped frame -> `mScheduledFrameIndex += 2`
That is a useful emergency simplification, but not a durable backend contract.
The target backend should instead track:
- scheduled frame index
- completed frame index
- backlog depth
- late streaks
- dropped streaks
- current operating headroom
Then recovery can use measured lag, not a hardcoded skip.
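One way to picture recovery based on measured lag is the sketch below. Everything here is hypothetical: `PlayoutTimingState`, `CompletionResult`, and `AccountForCompletion` are illustrative names, and `measuredLagFrames` is assumed to be computed by the caller (for example from the hardware clock versus scheduled stream time).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical completion outcomes reported by the hardware callback.
enum class CompletionResult { Completed, Late, Dropped };

// Hypothetical tracked timing state.
struct PlayoutTimingState {
    std::uint64_t scheduledFrameIndex = 0;
    std::uint64_t completedFrameIndex = 0;
    unsigned lateStreak = 0;
    unsigned droppedStreak = 0;
};

// Updates counters for one completion and returns how many frame slots
// the schedule index was advanced to catch up. The skip is the measured
// lag, bounded by maxCatchUpFrames, instead of a fixed "+= 2".
inline std::uint64_t AccountForCompletion(PlayoutTimingState& state,
                                          CompletionResult result,
                                          std::uint64_t measuredLagFrames,
                                          std::uint64_t maxCatchUpFrames) {
    ++state.completedFrameIndex;

    if (result == CompletionResult::Completed) {
        // A clean completion ends any late or dropped streak.
        state.lateStreak = 0;
        state.droppedStreak = 0;
        return 0;
    }

    if (result == CompletionResult::Late) {
        ++state.lateStreak;
    } else {
        ++state.droppedStreak;
    }

    const std::uint64_t skip = std::min(measuredLagFrames, maxCatchUpFrames);
    state.scheduledFrameIndex += skip;
    return skip;
}
```

The streak counters are kept so a policy layer could later decide to raise headroom or enter a degraded state after repeated misses. This sketch only records them.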
## Suggested Public Interface
This is not a final class API. It describes the shape the subsystem should move toward.
### Discovery and Configuration
- `DiscoverDevices(...)`
- `SelectFormats(...)`
- `ConfigureInput(...)`
- `ConfigureOutput(...)`
- `GetCapabilities()`
- `GetBackendState()`
### Lifecycle
- `StartCapture()`
- `StartPlayout()`
- `StopCapture()`
- `StopPlayout()`
- `Shutdown()`
### Input Handoff
- `PollInputFrame(...)` or `TryDequeueInputFrame(...)`
- `ReportInputSignalState(...)`
### Output Handoff
- `QueueRenderedFrame(...)`
- `TryDequeueReadyFrameForSchedule(...)`
- `RecycleCompletedFrame(...)`
### Timing and Recovery
- `SetPlayoutPolicy(...)`
- `AccountForCompletionResult(...)`
- `BuildBackendTimingSnapshot()`
### Health Reporting
- `BuildBackendHealthSnapshot()`
- `GetWarningState()`
## Suggested Internal Components
The subsystem will likely be easier to evolve if its responsibilities are split internally.
Possible internal structure:
### `VideoBackendSession`
Owns:
- high-level lifecycle state
- configuration
- input/output subcomponents
- policy objects
### `InputEndpoint`
Owns:
- input device callback registration
- input frame queue
- signal detection state
### `OutputEndpoint`
Owns:
- output device callback registration
- output device buffer pool
- schedule/dequeue logic
- preroll and output queue management
### `PlayoutPolicy`
Owns:
- preroll target
- spare buffer target
- underrun behavior
- catch-up and lateness rules
### `BackendTimingState`
Owns:
- frame counters
- queue depth snapshots
- late/dropped streaks
- observed intervals
These can remain implementation details in Phase 1, but the design should leave room for them.
## Mapping From Current Code
### Current `DeckLinkSession`
Should mostly migrate into:
- `VideoBackend`
  - device discovery
  - input configuration
  - output configuration
  - keyer capability handling
  - output frame pool ownership
  - lifecycle state handling
Candidates to stay backend-owned:
- `DiscoverDevicesAndModes(...)`
- `SelectPreferredFormats(...)`
- `ConfigureInput(...)`
- `ConfigureOutput(...)`
- `Start()`
- `Stop()`
- `HandleVideoInputFrame(...)`
- `HandlePlayoutFrameCompleted(...)`
### Current `VideoPlayoutScheduler`
Should likely become:
- a backend-owned policy helper or timing component under `VideoBackend`
It is still a backend concern, but it should be expanded beyond a single counter and fixed skip rule.
### Current `OpenGLVideoIOBridge`
Should split between:
- `RenderEngine`
  - input texture upload scheduling
  - render submission
  - readback or output-frame production
- `VideoBackend`
  - input ingress queue
  - output callback and scheduling policy
  - pacing stats
The most important migration is:
- remove render work from `PlayoutFrameCompleted()`
### Previous Runtime Status Updates
Frame pacing and signal status setters that were historically called from the bridge should route through:
- `VideoBackend -> HealthTelemetry`
rather than the old pattern:
- callback/bridge -> `RuntimeHost`
## Migration Plan
The migration should avoid a flag-day rewrite.
### Step 1. Name the backend boundary explicitly
Create a conceptual `VideoBackend` interface around the existing `VideoIODevice`/`DeckLinkSession` shape without moving all logic at once.
### Step 2. Pull timing policy into backend-owned objects
Move:
- completion accounting
- headroom configuration
- frame-pool sizing
- queue depth reporting
behind explicit backend policy types.
This can happen before changing the render thread model.
### Step 3. Separate callback work from render work
Change the output completion path so it stops rendering immediately in the callback chain.
Intermediate step:
- callback records completion and wakes a playout worker
Target step:
- callback only dequeues and schedules already-ready frames
### Step 4. Move input handoff to a bounded queue
Replace direct callback-to-GL upload behavior with:
- backend-owned input queue
- render-owned dequeue/upload policy
### Step 5. Introduce explicit backend lifecycle states
Start surfacing:
- configured
- prerolling
- running
- degraded
- failed
before changing all recovery behavior.
### Step 6. Route backend health to `HealthTelemetry`
Move debug-only warnings and ad hoc status strings toward structured counters and backend snapshots.
## Risks
### Latency Versus Stability Tradeoff
Increasing headroom reduces deadline misses but increases end-to-end latency. The backend must make that tradeoff explicit and configurable enough for live use.
### Hidden Coupling During Migration
The current bridge still mixes backend and render concerns. Partial extraction can accidentally preserve the old coupling under new names if the callback path is not cleaned up deliberately.
### Buffer Ownership Ambiguity
If device-facing buffers and render-facing buffers are not separated clearly, lifetime bugs and timing regressions will remain easy to reintroduce.
### Backend-Specific Assumptions
The first target is still DeckLink-centric. The interface should avoid baking in assumptions that would make alternate backends awkward later.
### Recovery Policy Complexity
A more explicit backend model will surface choices that are currently hidden:
- stale frame reuse
- black-frame fallback
- adaptive headroom
- catch-up rules
That is healthy, but it will require deliberate policy decisions.
## Open Questions
- Should `VideoBackend` own both input and output under one session object long-term, or should it expose distinct input and output endpoints under a shared shell?
- Should queue ownership sit fully inside `VideoBackend`, or should there be a narrow shared frame-exchange interface between `RenderEngine` and `VideoBackend`?
- What should the default underrun policy be for live playout: reuse the last scheduled output frame, reuse the newest completed rendered frame, or output black?
- Should adaptive headroom be automatic, operator-configurable, or both?
- At what point should preview timing be treated as a backend concern versus a render concern? The Phase 1 direction says preview is subordinate to render, not owned by the backend, but later timing work may still require explicit coordination.
- How much of the current `VideoIOState` belongs inside `VideoBackend` versus `HealthTelemetry` snapshots?
## Short Version
`VideoBackend` should become the subsystem that owns hardware timing, device lifecycle, buffer policy, and playout recovery.
It should not render frames.
The target direction is:
- `RenderEngine` produces frames ahead of need
- `VideoBackend` consumes and schedules them
- callbacks become lightweight control-plane events
- headroom, queue depth, and recovery become explicit backend policy
- hardware health is reported structurally instead of being inferred from scattered logs and bridge behavior

View File

@@ -103,7 +103,7 @@ Those features should be ported only after the cadence spine is stable.
## V1 Feature Parity Checklist
This tracks parity with `apps/LoopThroughWithOpenGLCompositing`.
This tracks current production-facing feature coverage for the `src/` app.
- [x] Stable DeckLink output cadence
- [x] BGRA8 and UYVY8 system-memory output paths
@@ -149,11 +149,11 @@ This tracks parity with `apps/LoopThroughWithOpenGLCompositing`.
- [x] Startup restore from latest runtime layer state
- [x] Debounced background autosave for durable layer-stack changes
- [ ] Manual stack preset save/load
- [ ] Persistent config writes
- [x] Persistent host config writes with restart request
- [ ] OSC ingress
- [x] Preview output from a non-consuming system-memory tap
- [ ] Screenshot capture
- [ ] External keying support
- [x] DeckLink external keying request through the Output alpha config/control path
- [ ] Full V1 health/runtime presentation model
## Build