diff --git a/docs/ARCHITECTURE_RESILIENCE_REVIEW.md b/docs/ARCHITECTURE_RESILIENCE_REVIEW.md
index 6756299..8ccee8b 100644
--- a/docs/ARCHITECTURE_RESILIENCE_REVIEW.md
+++ b/docs/ARCHITECTURE_RESILIENCE_REVIEW.md
@@ -538,6 +538,10 @@ Expected benefits:
 
 Once rendering and snapshots are isolated, formalize how final parameter values are derived.
 
+Dedicated design note:
+
+- [PHASE_5_LIVE_STATE_LAYERING_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_5_LIVE_STATE_LAYERING_DESIGN.md)
+
 Recommended layers:
 
 - base persisted state
diff --git a/docs/PHASE_3_LIVE_STATE_SERVICE_COORDINATION_DESIGN.md b/docs/PHASE_3_LIVE_STATE_SERVICE_COORDINATION_DESIGN.md
index aeb6e97..1103b77 100644
--- a/docs/PHASE_3_LIVE_STATE_SERVICE_COORDINATION_DESIGN.md
+++ b/docs/PHASE_3_LIVE_STATE_SERVICE_COORDINATION_DESIGN.md
@@ -24,7 +24,7 @@ Current footholds:
 - `RuntimeServiceLiveBridge` translates service OSC queues into render live-state updates and queues settled overlay commit requests.
 - `RuntimeEventDispatcher` now routes accepted mutations, reloads, snapshots, shader build events, backend observations, and health observations.
 
-The current architecture is much better than the original `RuntimeHost` shape. The remaining render risk is now mostly Phase 4 work: GL calls are still reached from callback and UI paths through a shared context lock rather than through one render-thread owner.
+The current architecture is much better than the original `RuntimeHost` shape. Phase 4 has since moved normal runtime GL work onto the `RenderEngine` render thread, so the remaining render-facing risk is no longer shared context ownership; it is the later producer/consumer playout work needed to keep DeckLink callbacks from synchronously waiting on output production.
 
 ## Why Phase 3 Exists
 
@@ -314,7 +314,7 @@ Before calling Phase 3 complete, update:
 - architecture review checklist
 - Phase 4 assumptions about render thread input state
 
-Status: complete. The Phase 4 design note starts from the `RenderFrameInput` / `RenderFrameState` contract and the remaining shared-GL ownership paths.
+Status: complete. The Phase 4 design note started from the `RenderFrameInput` / `RenderFrameState` contract and has now completed the shared-GL ownership migration.
 
 ## Testing Strategy
 
diff --git a/docs/PHASE_5_LIVE_STATE_LAYERING_DESIGN.md b/docs/PHASE_5_LIVE_STATE_LAYERING_DESIGN.md
new file mode 100644
index 0000000..3b3bfc1
--- /dev/null
+++ b/docs/PHASE_5_LIVE_STATE_LAYERING_DESIGN.md
@@ -0,0 +1,390 @@
+# Phase 5 Design: Live State Layering And Composition
+
+This document expands Phase 5 of [ARCHITECTURE_RESILIENCE_REVIEW.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/ARCHITECTURE_RESILIENCE_REVIEW.md) into a concrete design target.
+
+Phase 1 named the subsystems. Phase 2 added the typed event substrate. Phase 3 made render-facing live state explicit through `RuntimeLiveState`, `RenderStateComposer`, `RenderFrameInput`, `RenderFrameState`, `RenderFrameStateResolver`, and `RuntimeServiceLiveBridge`. Phase 4 made one render thread the owner of normal runtime GL work. Phase 5 should now make the live parameter model itself explicit: persisted truth, operator/session truth, and transient automation should be separate layers with one predictable composition rule.
+
+## Status
+
+- Phase 5 design package: proposed.
+- Phase 5 implementation: not started.
+- Current alignment: Phase 3 introduced the first pure composition boundary and transient OSC overlay owner, but committed runtime values are still physically stored through `RuntimeStore`/`LayerStackStore`, and transient OSC overlay state is applied through render-facing helpers rather than through a first-class layered state model.
+
+Current live-state footholds:
+
+- `RuntimeStore` owns persisted layer stack, parameter values, presets, config, and render snapshot read models.
+- `RuntimeCoordinator` owns mutation validation, classification, accepted/rejected event publication, snapshot/reload follow-ups, and the policy switch between committed states and live snapshots.
+- `RuntimeSnapshotProvider` publishes render-facing snapshots from committed runtime state.
+- `RuntimeLiveState` owns transient OSC overlay bookkeeping, smoothing, generation tracking, and commit-settlement policy.
+- `RenderStateComposer` combines base render states with live overlay state and returns final per-frame layer states plus settled commit requests.
+- `RuntimeServiceLiveBridge` drains OSC ingress/completion queues and applies them to render live state during frame preparation.
+
+## Why Phase 5 Exists
+
+The resilience review identifies live OSC overlay and persisted state as separate concepts that still do not have a fully formal model. The app now has better boundaries, but several policies are still implicit:
+
+- whether a value is durable, committed for the current session, or transient automation
+- whether an OSC value should merely influence the current frame or eventually commit
+- what reload, preset load, layer removal, shader change, and reset should do to transient values
+- which layer wins when UI/operator changes race with OSC automation
+- which state changes should publish snapshots, request persistence, or only affect render frames
+
+Without a formal layering model, these rules can leak across `RuntimeStore`, `RuntimeCoordinator`, `RuntimeLiveState`, `RenderStateComposer`, and service bridges. Phase 5 should make those rules boring and testable.
+
+## Goals
+
+Phase 5 should establish:
+
+- explicit state layers for persisted, committed/session, and transient automation values
+- one named composition contract for final render values
+- clear ownership for layer-specific mutation policy
+- explicit reset/reload/preset behavior for transient and committed state
+- a clean path for OSC automation to remain high-rate without becoming durable state by accident
+- tests for layer precedence, lifecycle, invalidation, and commit policy without GL or DeckLink
+- documentation that distinguishes render-local temporal/feedback state from parameter/live-state overlays
+
+## Non-Goals
+
+Phase 5 should not require:
+
+- a background persistence writer implementation
+- a DeckLink producer/consumer playout queue
+- a full cue/timeline/preset performance system
+- a new UI state-management framework
+- replacing every synchronous coordinator API
+- moving temporal history or shader feedback into the runtime state model
+
+Those are later phases or separate feature work. Phase 5 is about parameter and live-value layering.
+
+## Target State Model
+
+Phase 5 should formalize three layers:
+
+| Layer | Owner | Lifetime | Persistence | Render role |
+| --- | --- | --- | --- | --- |
+| Base persisted state | `RuntimeStore` / `LayerStackStore` | survives restart | written to disk | default layer stack, shader selections, saved parameter values |
+| Committed live state | `RuntimeCoordinator` or a new live-session collaborator | current running session | may request persistence depending on mutation type | operator/UI/current truth until changed again |
+| Transient automation overlay | `RuntimeLiveState` or a new automation overlay collaborator | high-rate/short-lived | not persisted directly | temporary OSC/automation target applied over committed truth |
+
+The target composition rule is:
+
+```text
+final render state = base persisted state + committed live state + transient automation overlay
+```
+
+The actual implementation may continue using render snapshots as the base transport. The important part is that each layer has named ownership, documented lifetime, and tested precedence.
+
+## Current Composition Shape
+
+Today, final frame state is prepared through this path:
+
+1. `OpenGLComposite::renderEffect()` processes runtime work.
+2. `OpenGLComposite` builds `RenderFrameInput`.
+3. `RuntimeServiceLiveBridge` drains OSC updates and completed commits.
+4. `RenderEngine` updates `RuntimeLiveState`.
+5. `RenderFrameStateResolver` chooses committed states or live snapshot states.
+6. `RenderStateComposer` applies transient overlay values.
+7. `RenderEngine::RenderPreparedFrame(...)` consumes `RenderFrameState`.
+
+That is a good Phase 3/4 foundation. Phase 5 should make the hidden assumptions in steps 5 and 6 explicit enough that reset/reload/preset and future UI automation behavior are not scattered across those collaborators.
+
+## Proposed Collaborators
+
+### `RuntimeStateLayerModel`
+
+Optional pure model that names the layers and composition metadata.
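
A minimal sketch of what such a pure model could look like, in C++ and under stated assumptions: `LayerSlot`, `LayeredParameterValue`, `ResolveFinalValue`, and `ResetToBase` are hypothetical names that do not exist in the repository, and a single `float` stands in for whatever value type real parameters use. It only illustrates the precedence rule (transient over committed over base) and a reset that drops both override layers.

```cpp
#include <cassert>

// Hypothetical illustration types; none of these exist in the codebase.
enum class ValueLayer { Base, Committed, Transient };

// One optional override: `present` says whether the layer holds a value.
struct LayerSlot {
    bool present = false;
    float value = 0.0f;
};

// A single parameter seen through the three proposed layers.
struct LayeredParameterValue {
    float base = 0.0f;     // base persisted/snapshot value
    LayerSlot committed;   // committed live/session override
    LayerSlot transient;   // transient automation overlay
};

struct ResolvedValue {
    float value;
    ValueLayer source;     // which layer supplied the final value
};

// Topmost valid layer wins: transient, then committed, then base.
inline ResolvedValue ResolveFinalValue(const LayeredParameterValue& v) {
    if (v.transient.present) return ResolvedValue{v.transient.value, ValueLayer::Transient};
    if (v.committed.present) return ResolvedValue{v.committed.value, ValueLayer::Committed};
    return ResolvedValue{v.base, ValueLayer::Base};
}

// Sketch of a manual parameter reset: both override layers are cleared,
// so the base value becomes visible again.
inline void ResetToBase(LayeredParameterValue& v) {
    v.committed = LayerSlot();
    v.transient = LayerSlot();
}
```

Returning the winning layer alongside the value is a design choice for testability: a precedence test can assert where a value came from, not only what it is.
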
+
+Responsibilities:
+
+- represent base, committed, and transient layer state inputs
+- define precedence and invalidation categories
+- expose a pure composition function or input object
+- keep GL, services, persistence, and device callbacks out of the model
+
+Non-responsibilities:
+
+- disk IO
+- OSC socket handling
+- render-thread scheduling
+- shader compilation
+
+This may be a small set of structs rather than a large class. The value is in naming the contract.
+
+### `CommittedLiveState`
+
+Optional runtime/session collaborator if committed session state needs to move out of `RuntimeStore`.
+
+Responsibilities:
+
+- hold operator/UI committed values that are true for the current session
+- distinguish persistence-required commits from session-only commits
+- expose a read model for snapshot publication
+- provide reset/load behavior separate from durable storage
+
+Non-responsibilities:
+
+- transient OSC smoothing
+- disk writes
+- GL resources
+
+Phase 5 can defer this physical split if the policy is documented and covered by tests. The key is that committed-live state becomes a distinct concept even if it still lives inside existing storage temporarily.
+
+### `AutomationOverlayState`
+
+Possible evolution of `RuntimeLiveState`.
+
+Responsibilities:
+
+- hold transient automation values keyed by route/layer/parameter identity
+- track generation, commit-in-flight, and completion
+- apply smoothing and settle policy
+- decide whether an overlay is render-only, commit-requesting, stale, or invalidated
+
+Non-responsibilities:
+
+- owning committed truth
+- persistent state mutation
+- snapshot publication
+
+This can start by renaming or narrowing current `RuntimeLiveState` responsibilities rather than replacing it outright.
+
+### `LayeredStateComposer`
+
+Possible evolution of `RenderStateComposer`.
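
As a rough illustration only, a composer of this kind could be a pure function over plain maps, which keeps it testable without OpenGL. Everything here is an assumption made for the sketch: `OverlaySample`, `ComposedFrame`, and `ComposeFrame` are hypothetical names, parameters are keyed by a plain string, and smoothing is assumed to have been applied already by the overlay owner.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not taken from the codebase.
// An overlay sample is assumed to arrive already smoothed by its owner.
struct OverlaySample {
    float value = 0.0f;    // smoothed overlay value for this frame
    bool settled = false;  // true once the overlay has reached its target
};

struct ComposedFrame {
    std::map<std::string, float> finalValues;  // values the frame should render with
    std::vector<std::string> settledKeys;      // parameters whose overlay asks to commit
};

// Pure composition: committed values form the floor, valid overlays win on top.
// Overlays for parameters that no longer exist are ignored rather than applied.
inline ComposedFrame ComposeFrame(const std::map<std::string, float>& committed,
                                  const std::map<std::string, OverlaySample>& overlays) {
    ComposedFrame frame;
    frame.finalValues = committed;
    for (const auto& entry : overlays) {
        std::map<std::string, float>::iterator it = frame.finalValues.find(entry.first);
        if (it == frame.finalValues.end()) continue;  // no matching parameter: skip
        it->second = entry.second.value;
        if (entry.second.settled) frame.settledKeys.push_back(entry.first);
    }
    return frame;
}
```

The function neither mutates its inputs nor touches storage; the caller decides what to do with `settledKeys`, for example turning them into commit requests.
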
+
+Responsibilities:
+
+- apply the target precedence rule
+- produce final `RuntimeRenderState` values for a frame
+- return commit requests or overlay observations when policy says a transient value settled
+- keep value composition testable without OpenGL
+
+Non-responsibilities:
+
+- frame rendering
+- service queue draining
+- storage mutation
+
+## Layering Rules
+
+### Precedence
+
+Default precedence should be:
+
+1. base persisted/snapshot value
+2. committed live/session value
+3. transient automation overlay
+
+The topmost valid layer wins for discrete values. Numeric/vector values may be smoothed by overlay policy before they win.
+
+### Identity
+
+Layering should use stable render-facing identity:
+
+- layer id for persisted structural identity
+- layer key/control key for OSC-facing identity
+- parameter id for shader-defined identity
+- parameter control key for external-control identity
+
+Phase 5 should document which identity is authoritative when layer id and control key disagree or when a shader changes.
+
+### Invalidations
+
+The following should have explicit behavior:
+
+- layer removed: clear committed and transient state for that layer
+- layer shader changed: clear or remap parameter overlays according to compatible control keys
+- preset loaded: replace base/committed state and clear incompatible transient overlays
+- shader reload with same controls: preserve compatible transient overlays where safe
+- manual reset parameters: clear committed overrides and transient overlays for that layer
+- no input/source changes: should not affect parameter layers
+
+### Commit Policy
+
+Transient automation may:
+
+- remain render-only
+- settle and request a committed mutation
+- commit without persistence
+- commit with persistence only when the control path explicitly requests it
+
+The policy should be explicit per ingress path or parameter category. Phase 5 does not need a full UI for it, but the default behavior should be documented and tested.
+
+## Event And Snapshot Contract
+
+Phase 5 should clarify which changes publish which effects:
+
+| Change | Snapshot publication | Persistence request | Render reset | Runtime event |
+| --- | --- | --- | --- | --- |
+| persisted layer stack mutation | yes | yes | maybe | accepted mutation + persistence requested |
+| operator live parameter change | yes | maybe | no, unless structural | accepted mutation |
+| transient OSC overlay update | no committed snapshot by default | no | no | optional overlay observation |
+| overlay settled commit | yes if accepted | usually no for OSC | no | accepted mutation or overlay-settled observation |
+| preset load | yes | maybe | temporal/feedback policy dependent | accepted mutation + reload/reset observations |
+| shader change/reload | yes after build | maybe | temporal/feedback policy dependent | shader build/reload events |
+
+This table should evolve with implementation, but Phase 5 should prevent transient overlay updates from masquerading as durable committed state.
+
+## Migration Plan
+
+### Step 1. Inventory Current State Layers
+
+Document and/or encode where each current state category lives:
+
+- persisted layer stack and parameter values
+- committed current-session parameter values
+- runtime compile/reload flags
+- transient OSC overlays
+- render-local temporal history and feedback state
+
+Initial target:
+
+- identify which fields are durable, committed-live, transient automation, render-local, or health/config
+- update subsystem docs where the current ownership is misleading
+- add small tests for classification if a pure helper exists
+
+### Step 2. Name The Layered Composition Input
+
+Introduce a named composition input model around the current `RenderStateCompositionInput`.
+
+Initial target:
+
+- make base/committed/transient inputs visible in type names or field names
+- keep `RenderStateComposer` behavior unchanged at first
+- add tests that assert precedence with no GL
+
+Possible outcomes:
+
+- evolve `RenderStateCompositionInput`
+- add a new `LayeredRenderStateInput`
+- add a thin adapter that feeds existing `RenderStateComposer`
+
+### Step 3. Make Reset And Reload Policy Explicit
+
+Move reset/reload transient-state decisions into one policy point.
+
+Initial target:
+
+- layer removal clears matching transient overlays
+- shader change clears incompatible overlays
+- preset load clears incompatible overlays
+- shader reload can preserve compatible overlays when requested
+- temporal/feedback resets stay render-local and separate from parameter overlays
+
+This is where Phase 5 should prevent "clear everything" and "preserve everything" from being scattered through unrelated code.
+
+### Step 4. Clarify OSC Commit Semantics
+
+Make the transient-to-committed path explicit.
+
+Initial target:
+
+- document and test whether settled OSC commits persist
+- ensure stale generation completions are ignored
+- ensure one settled route does not clear unrelated overlay state
+- publish or preserve useful events for accepted overlay commits
+
+Current Phase 3 behavior is a good base; Phase 5 should make the policy easier to reason about from the code.
+
+### Step 5. Separate Committed-Live Concept From Durable Storage
+
+Decide whether to physically split committed-live state now or introduce a read/model boundary first.
+
+Conservative option:
+
+- leave storage physically in `RuntimeStore`
+- add a named committed-live read model
+- keep persistence decisions in `RuntimeCoordinator`
+
+Stronger option:
+
+- introduce `CommittedLiveState`
+- make `RuntimeSnapshotProvider` consume committed live state through a read model
+- leave durable writes in `RuntimeStore`
+
+Phase 5 does not need a flag-day split. It needs the concept to stop being implicit.
+
+### Step 6. Update Docs And Exit Criteria
+
+Before calling Phase 5 complete, update:
+
+- architecture review checklist
+- `RuntimeCoordinator`, `RuntimeStore`, `RuntimeSnapshotProvider`, `RenderEngine`, and `ControlServices` subsystem docs
+- Phase 6 assumptions about persistence inputs
+- Phase 7 assumptions about what render/backend state is not part of live parameter layering
+
+## Testing Strategy
+
+Phase 5 tests should avoid GL, DeckLink, sockets, and filesystem writes where possible.
+
+Recommended tests:
+
+- base value is used when no committed or transient value exists
+- committed value overrides base value
+- transient overlay overrides committed value
+- numeric smoothing applies only to transient overlay values
+- trigger/bool/discrete overlay behavior is explicit
+- layer removal clears matching transient state
+- shader change preserves only compatible overlays if policy allows
+- preset load clears or replaces committed/transient state according to policy
+- settled OSC overlay creates the expected commit request
+- settled OSC commit does not request persistence unless policy says so
+- stale commit completion does not clear a newer overlay
+- render-local temporal/feedback resets do not mutate parameter layers
+
+Existing useful homes:
+
+- `RuntimeLiveStateTests` for overlay generation, smoothing, settle, and invalidation behavior
+- `RuntimeSubsystemTests` for coordinator mutation, persistence request, and reset/reload policy
+- `RuntimeEventTypeTests` for any new observations or accepted mutation events
+- a possible new `RuntimeStateLayeringTests` target if the composition model gets a pure helper
+
+## Risks
+
+### Over-Abstraction Risk
+
+It would be easy to introduce too many state containers. Phase 5 should add names where they clarify behavior, not create an elaborate framework.
+
+### Persistence Confusion Risk
+
+Committed live state and persisted state are related but not identical. If Phase 5 blurs them, Phase 6's background persistence writer will inherit ambiguous inputs.
+
+### Automation Surprise Risk
+
+OSC automation can be high-rate and transient, but users may expect settled values to become "real." The commit policy needs to be explicit enough that UI, OSC, presets, and reloads behave predictably.
+
+### Identity/Compatibility Risk
+
+Shader changes and preset loads can invalidate layer/parameter identities. Phase 5 should prefer conservative clearing over accidental application of an old automation value to the wrong control.
+
+### Render Coupling Risk
+
+Render-local resources such as temporal history, feedback buffers, readback caches, and playout queues are not parameter layers. Keeping them out of this model avoids turning Phase 5 into a render-resource refactor.
+
+## Phase 5 Exit Criteria
+
+Phase 5 can be considered complete once the project can say:
+
+- [ ] persisted, committed-live, and transient automation layers are named in code or clear read models
+- [ ] final render-value precedence is explicit and covered by tests
+- [ ] `RenderStateComposer` or its replacement consumes a layered input contract
+- [ ] reset/reload/preset behavior for transient overlays is centralized or clearly delegated
+- [ ] OSC overlay settle/commit behavior is explicit, including persistence policy
+- [ ] `RuntimeStore` remains durable-state focused and does not absorb transient automation policy
+- [ ] render-local temporal/feedback state remains separate from live parameter layering
+- [ ] subsystem docs and the architecture review reflect the final ownership model
+
+## Open Questions
+
+- Should committed live state remain physically in `RuntimeStore` for now, or move to a `CommittedLiveState` collaborator?
+- Should transient OSC overlay updates become app-level typed events, or stay source-local through `RuntimeServiceLiveBridge`?
+- Should overlay commit persistence be global, ingress-specific, or parameter-definition-driven?
+- What compatibility rule should apply when shader reload preserves a control key but changes parameter shape?
+- Should preset load clear all transient automation, or only automation that no longer maps to the loaded stack?
+- Should UI slider drags use the committed-live layer directly, or a short-lived transient layer that commits on release?
+
+## Short Version
+
+Phase 5 should make live values boring and explicit.
+
+Persisted state is durable truth. Committed live state is current-session/operator truth. Transient automation is high-rate overlay truth. Render consumes the composed result, and each layer has clear ownership, lifetime, persistence behavior, and reset/reload rules.
diff --git a/docs/subsystems/VideoBackend.md b/docs/subsystems/VideoBackend.md
index dfec491..b8ead25 100644
--- a/docs/subsystems/VideoBackend.md
+++ b/docs/subsystems/VideoBackend.md
@@ -300,16 +300,16 @@ Today the callback path effectively does this:
 
 1. DeckLink signals completion.
 2. The callback path asks for a new output buffer.
-3. The callback path enters the shared GL section.
-4. The callback path renders the next frame.
-5. The callback path reads it back.
+3. The callback path requests render-thread output production.
+4. The render thread renders the next frame.
+5. The render thread reads it back into the output buffer.
 6. The callback path schedules the next hardware frame.
 
 That path is visible in:
 
 - [OpenGLVideoIOBridge::RenderScheduledFrame](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:18)
 
-This couples output timing directly to render work.
+This no longer borrows the GL context from the callback thread, but it still couples output timing directly to render-thread work.
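
To make the coupling concrete: if the completion callback waits for the render thread to render and read back a frame, the callback is blocked for roughly the sum of those two durations, and any frame whose production exceeds the frame period delays scheduling. The following sketch is arithmetic only and rests on assumptions: `FrameProduction`, `CallbackBlockingMs`, and `CountLateCallbacks` are hypothetical names, and queueing delay on the render thread is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame timings in milliseconds; not measured values.
struct FrameProduction {
    double renderMs;    // time the render thread spends rendering
    double readbackMs;  // time spent reading back into the output buffer
};

// With synchronous production, the callback blocks for render plus readback.
inline double CallbackBlockingMs(const FrameProduction& frame) {
    return frame.renderMs + frame.readbackMs;
}

// Counts callbacks whose blocking time exceeds one frame period.
inline std::size_t CountLateCallbacks(const std::vector<FrameProduction>& frames,
                                      double framePeriodMs) {
    std::size_t late = 0;
    for (std::size_t i = 0; i < frames.size(); ++i) {
        if (CallbackBlockingMs(frames[i]) > framePeriodMs) ++late;
    }
    return late;
}
```

A producer/consumer playout queue would change this relationship: the callback would take an already produced frame, so its blocking time would no longer track render time directly.
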
 
 ### Target Model
 
@@ -352,11 +352,11 @@ Recommended default policy for live playout:
 - prefer recency over completeness
 - drop stale capture frames instead of blocking render or output
 
-The current "skip upload if the GL bridge is busy" behavior is directionally correct for live timing:
+The current latest-input mailbox behavior is directionally correct for live timing:
 
 - [OpenGLVideoIOBridge::UploadInputFrame](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/pipeline/OpenGLVideoIOBridge.cpp:11)
 
-But in the target architecture that decision should move out of GL lock acquisition and into an explicit backend-to-render handoff queue policy.
+The next improvement is to make the backend-to-render handoff policy more explicit in telemetry and playout scheduling, rather than treating it as only a render command mailbox detail.
 
 Suggested input metrics: