video-shader-toys/docs/subsystems/RuntimeCoordinator.md
Aiden 77590f4a62
Phase 5 step 1
2026-05-11 18:53:59 +10:00

RuntimeCoordinator Design Note

This document defines the target design for the RuntimeCoordinator subsystem introduced in PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md.

RuntimeCoordinator is the mutation and policy layer for the app. Its job is to accept already-normalized actions from ingress systems, decide whether those actions are valid, classify how they should affect durable and live state, and trigger downstream publication or persistence work without taking ownership of rendering, device callbacks, or disk serialization details.

Why This Subsystem Exists

Before the Phase 1 runtime split, the app's mutation path was split across several places:

  • RuntimeHost performed validation, mutation, persistence, render-state invalidation, and some status updates:
    • RuntimeHost.h
    • RuntimeHost.cpp
  • OpenGLComposite currently acts like an orchestration shell and a mutation coordinator at the same time
  • RuntimeServices still owns some deferred control flow around OSC commit and polling

That overlap makes several kinds of regressions more likely:

  • persistence policy leaks into control handlers
  • render invalidation rules are spread across UI and non-UI paths
  • transient automation behavior is hard to reason about
  • reload behavior is partly a render concern and partly a runtime concern
  • future event-model work has no single policy owner to target

RuntimeCoordinator exists to centralize those decisions without becoming a new monolith.

Core Responsibilities

RuntimeCoordinator should own the following responsibilities.

1. Mutation intake after normalization

RuntimeCoordinator accepts typed, already-parsed actions from ControlServices or composition-root adapters. Examples:

  • add/remove/move layer
  • change shader on a layer
  • change a parameter value
  • reset a layer
  • save or load a stack preset
  • request a shader/package reload
  • apply a transient automation target
  • commit or clear transient overlay state

The coordinator should not parse JSON, decode OSC payloads, or inspect HTTP payload syntax. That belongs to ingress systems.
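
One way to picture "typed, already-parsed actions" is a closed set of request structs that ingress code fills in before calling the coordinator. The sketch below is illustrative only: the struct names, their fields, and DescribeAction are hypothetical and do not come from the current codebase.

```cpp
#include <string>
#include <variant>

// Hypothetical typed actions; names and fields are assumptions for illustration.
struct AddLayerAction { std::string shaderId; };
struct RemoveLayerAction { std::string layerId; };
struct SetParameterAction { std::string layerId; std::string parameterId; double value; };
struct RequestReloadAction { bool preserveFeedback; };

// The coordinator sees one of a closed set of typed actions, never raw JSON or OSC bytes.
using RuntimeAction = std::variant<AddLayerAction, RemoveLayerAction,
                                   SetParameterAction, RequestReloadAction>;

// Helper that lets std::visit take one lambda per action type.
template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Returns a short label per action; a stand-in for real per-action policy dispatch.
inline std::string DescribeAction(const RuntimeAction& action)
{
    return std::visit(Overloaded{
        [](const AddLayerAction& a) { return "add layer with shader " + a.shaderId; },
        [](const RemoveLayerAction& a) { return "remove layer " + a.layerId; },
        [](const SetParameterAction& a) { return "set " + a.parameterId + " on " + a.layerId; },
        [](const RequestReloadAction&) { return std::string("reload"); }
    }, action);
}
```

Because the variant is closed, adding a new action type forces every visitor to handle it, which keeps policy decisions from being silently skipped.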

2. Validation and policy decisions

The coordinator validates whether a requested mutation is allowed and decides how it should behave.

Examples:

  • whether a layer id exists
  • whether a shader id is valid
  • whether a parameter exists on the targeted shader
  • whether a value is within the definition's allowed range or enum set
  • whether a trigger should update committed state, transient state, or both
  • whether a structural change should preserve compatible transient state such as feedback buffers

This is the policy surface that used to be spread between RuntimeHost methods such as:

  • AddLayer(...)
  • SetLayerShader(...)
  • UpdateLayerParameter(...)
  • UpdateLayerParameterByControlKey(...)
  • ApplyOscTargetByControlKey(...)
  • ResetLayerParameters(...)

See RuntimeHost.h.
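
As a minimal sketch of the existence and range checks listed above, assuming a simplified numeric-only parameter definition: ParameterDefinition, ValidationResult, and both function names are hypothetical, and enum-set validation is left out.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, numeric-only parameter definition; the real definition type is not shown in this note.
struct ParameterDefinition
{
    std::string id;
    double minValue = 0.0;
    double maxValue = 1.0;
};

struct ValidationResult
{
    bool accepted = false;
    std::string errorMessage;
};

// Looks up a parameter by id; returns nullptr when the shader does not declare it.
inline const ParameterDefinition* FindParameter(const std::vector<ParameterDefinition>& definitions,
                                                const std::string& id)
{
    auto it = std::find_if(definitions.begin(), definitions.end(),
                           [&](const ParameterDefinition& d) { return d.id == id; });
    return it == definitions.end() ? nullptr : &*it;
}

// Rejects unknown parameters and out-of-range values with an explanatory message.
inline ValidationResult ValidateNumericParameter(const std::vector<ParameterDefinition>& definitions,
                                                 const std::string& id, double value)
{
    const ParameterDefinition* definition = FindParameter(definitions, id);
    if (definition == nullptr)
        return {false, "unknown parameter: " + id};
    if (value < definition->minValue || value > definition->maxValue)
        return {false, "value out of range for: " + id};
    return {true, ""};
}
```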

3. State classification

The coordinator decides which state category a mutation affects:

  • persisted state
  • committed live state
  • transient live overlay state
  • health/timing state only

The design rule is that classification belongs here, not in the ingress layer and not in render code.

Phase 5 has started codifying the shared vocabulary for this classification in RuntimeStateLayerModel. The current model records committed session parameter values, layer bypass state, and runtime compile/reload flags as committed-live/session coordination state, even though some of those values are still physically backed by RuntimeStore during migration.
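
The four categories can be sketched as an enum plus a single classification function owned by the coordinator. This is not the RuntimeStateLayerModel vocabulary: the enumerator names, the MutationSource tag, and the mapping rules are all assumptions chosen to mirror the examples in this note.

```cpp
// The four state categories listed above; enumerator names are assumptions.
enum class StateCategory { Persisted, CommittedLive, TransientOverlay, HealthTimingOnly };

// Hypothetical origin tag supplied by ingress. Ingress reports where a request
// came from; the coordinator decides which state category it affects.
enum class MutationSource { UiOrApi, OscAutomation, StatusProbe };

// Illustrative mapping only; the real rules are a policy decision for the coordinator.
inline StateCategory ClassifyMutation(MutationSource source, bool isContinuousDrag)
{
    switch (source)
    {
    case MutationSource::OscAutomation:
        return StateCategory::TransientOverlay;
    case MutationSource::StatusProbe:
        return StateCategory::HealthTimingOnly;
    case MutationSource::UiOrApi:
        // A drag in progress is treated as live committed state; a settled edit is durable.
        return isContinuousDrag ? StateCategory::CommittedLive : StateCategory::Persisted;
    }
    return StateCategory::HealthTimingOnly; // not reached for valid enumerators
}
```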

4. Snapshot publication requests

When a mutation changes render-facing state, the coordinator asks RuntimeSnapshotProvider to publish a new snapshot or mark one dirty for publication.

The coordinator does not build render snapshots itself.

5. Persistence requests

When a mutation changes durable state, the coordinator asks RuntimeStore to record the new authoritative state and, when applicable, request persistence through the store's write path.

The coordinator does not serialize files directly.

6. Cross-subsystem consistency policy

The coordinator is where "what else must happen if this changes?" lives.

Examples:

  • a layer add/remove/move may require:
    • store mutation
    • snapshot republish
    • compatibility-preserving render-state reset policy
    • optional UI-state notification via later event-model work
  • a stack preset load may require:
    • replacement of committed layer stack state
    • invalidation of transient overlay state that no longer maps cleanly
    • snapshot republish
    • deferred persistence request
  • an automation target may require:
    • transient overlay update only
    • no persistence write
    • optional later commit into committed live state if policy says so

Explicit Non-Responsibilities

RuntimeCoordinator should explicitly not own the following.

Not a persistence engine

It does not:

  • read or write files
  • decide file formats
  • own preset storage layout
  • perform debounced disk flushing logic

Those belong in RuntimeStore and later persistence helpers.

Not a render engine

It does not:

  • own GL objects
  • perform shader compilation
  • reset temporal history textures directly
  • build render passes
  • hold frame queues

It may request policy outcomes that cause render-local resets, but render performs the work.

Not a hardware/backend owner

It does not:

  • configure DeckLink
  • react directly to device callbacks
  • schedule playout
  • own input signal callbacks

Not an ingress transport layer

It does not:

  • parse OSC wire messages
  • host websockets
  • own HTTP handlers
  • own polling loops

Not a health reporting sink

It can emit mutation outcomes and warnings to HealthTelemetry, but it should not own counters, logs, or dashboards.

Mutation Policy

The coordinator should sort mutations into a small number of policy classes rather than making ad hoc per-call decisions.

Durable mutation

Updates authoritative state that should survive beyond the current session.

Examples:

  • add/remove/move layer
  • change selected shader on a layer
  • update a parameter via UI or API
  • load a stack preset
  • reset a layer to defaults

Expected coordinator behavior:

  1. validate the request
  2. normalize the target and value if needed
  3. update committed/durable state via RuntimeStore
  4. request snapshot publication
  5. request persistence according to policy
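
The five steps above can be sketched against in-memory stand-ins. FakeStore and FakeSnapshotProvider are hypothetical placeholders for RuntimeStore and RuntimeSnapshotProvider, clamping is an assumed normalization rule, and "always persist" is an assumed policy.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical stand-in for RuntimeStore.
struct FakeStore
{
    std::map<std::string, double> parameters;
    int persistenceRequests = 0;

    bool SetParameter(const std::string& key, double value) { parameters[key] = value; return true; }
    void RequestPersistence() { ++persistenceRequests; }
};

// Hypothetical stand-in for RuntimeSnapshotProvider.
struct FakeSnapshotProvider
{
    int publishRequests = 0;
    void RequestPublish() { ++publishRequests; }
};

struct DurableResult
{
    bool accepted = false;
    std::string errorMessage;
};

inline DurableResult ApplyDurableParameterChange(FakeStore& store, FakeSnapshotProvider& snapshots,
                                                 const std::string& key, double value,
                                                 double minValue, double maxValue)
{
    // 1. validate the request
    if (key.empty())
        return {false, "empty parameter key"};
    // 2. normalize the value (clamping is an assumption, not a documented rule)
    const double normalized = std::clamp(value, minValue, maxValue);
    // 3. update committed/durable state via the store
    if (!store.SetParameter(key, normalized))
        return {false, "store rejected mutation"};
    // 4. request snapshot publication
    snapshots.RequestPublish();
    // 5. request persistence according to policy (here: always)
    store.RequestPersistence();
    return {true, ""};
}
```

Note that the coordinator only asks for publication and persistence; the counters in the fakes stand in for work that the real collaborators would perform.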

Live committed mutation

Updates committed current-session state that should be treated as true until changed again, but may not need synchronous persistence.

Examples:

  • a UI action that changes a parameter repeatedly while dragging
  • a manual operator bypass toggle during live use

Expected coordinator behavior:

  1. update committed live state
  2. request snapshot publication
  3. decide whether persistence should happen immediately, be debounced, or be deferred

Transient overlay mutation

Affects output but should not masquerade as stored truth.

Examples:

  • active OSC automation target
  • short-lived trigger-driven visual automation state

Expected coordinator behavior:

  1. validate the route and target parameter
  2. classify the action as transient
  3. update overlay state through the appropriate owner boundary
  4. avoid persistence unless a separate commit policy is invoked
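
A minimal sketch of the overlay idea, assuming a single in-memory structure for brevity. In the target design the overlay storage would live with its owning subsystem rather than with the coordinator; LayeredParameters and the function names are hypothetical.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical model: committed values plus an overlay that shadows them.
struct LayeredParameters
{
    std::map<std::string, double> committed;
    std::map<std::string, double> overlay;

    // Render-facing value: the overlay wins while an entry is present.
    std::optional<double> Effective(const std::string& key) const
    {
        if (auto it = overlay.find(key); it != overlay.end())
            return it->second;
        if (auto it = committed.find(key); it != committed.end())
            return it->second;
        return std::nullopt;
    }
};

// Applies an automation target without touching committed state.
// Returns false when the target parameter is unknown.
inline bool ApplyTransientTarget(LayeredParameters& state, const std::string& key, double value)
{
    if (state.committed.find(key) == state.committed.end())
        return false;           // validation failed: no such target
    state.overlay[key] = value; // classified as transient, so only the overlay changes
    return true;                // no persistence is requested here
}

// Separate commit policy: folds the overlay value into committed state and clears the overlay entry.
inline bool CommitOverlay(LayeredParameters& state, const std::string& key)
{
    auto it = state.overlay.find(key);
    if (it == state.overlay.end())
        return false;
    state.committed[key] = it->second;
    state.overlay.erase(it);
    return true;
}
```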

Coordination-only mutation

A request that exists mainly to trigger a flow rather than to edit stored values.

Examples:

  • request reload
  • request publish-now
  • request clear transient state on reset/rebuild

Interaction With State Categories

This section restates the Phase 1 state model specifically from the coordinator's perspective.

Persisted state

RuntimeCoordinator does not own persisted state, but it decides when persisted state should change.

Typical interaction:

  • validate request
  • call into RuntimeStore
  • receive success/failure
  • request persistence if policy says this mutation should be durable

Committed live state

This is the coordinator's primary logical domain.

Even while committed live state is physically stored inside RuntimeStore, the coordinator should be considered the policy owner of:

  • current layer stack composition
  • current selected shaders
  • current bypass flags
  • current operator-authored parameter values

Transient live overlay state

The coordinator defines the rules for transient state, but should not become the long-term storage owner for render-local transient data.

The expected split is:

  • coordinator owns policy
  • ControlServices may own short ingress-side queues and coalescing buffers
  • RenderEngine owns render-local transient application state
  • VideoBackend owns playout and device transient state

For OSC specifically, the coordinator should eventually decide:

  • whether an automation change is transient-only
  • whether it should later commit into committed live state
  • what reset/reload actions invalidate it

Health and timing state

The coordinator may emit events like:

  • mutation rejected
  • reload requested
  • preset load succeeded/failed
  • transient state cleared because structure changed

But those are observations reported to HealthTelemetry, not coordinator-owned data.

Proposed Interfaces

These are target-shape interfaces, not final signatures.

Input-facing API

Core mutation entrypoints could look like:

#include <string>

// Forward declarations only; the concrete shapes are still to be defined.
struct RuntimeMutationRequest;
struct RuntimeMutationResult;
struct ReloadRequest;
struct OverlayCommitRequest;
struct RuntimeResetScope; // used below; declared as a struct here as a placeholder

class RuntimeCoordinator
{
public:
    // Every entrypoint takes a typed request and returns an explicit outcome.
    RuntimeMutationResult ApplyMutation(const RuntimeMutationRequest& request);
    RuntimeMutationResult ApplyAutomationTarget(const RuntimeMutationRequest& request);
    RuntimeMutationResult ResetLayer(const std::string& layerId);
    RuntimeMutationResult RequestReload(const ReloadRequest& request);
    RuntimeMutationResult CommitOverlayState(const OverlayCommitRequest& request);
    RuntimeMutationResult ClearTransientStateForScope(const RuntimeResetScope& scope);
};

The important point is not the exact names. It is that ingress systems send typed requests into one policy owner.

Downstream collaborators

The coordinator likely needs collaborators conceptually equivalent to:

  • IRuntimeStore
  • IRuntimeSnapshotProvider
  • IHealthTelemetry
  • compatibility adapters only where older call shapes still need to be supported during migration
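
As a rough sketch of how those collaborators might be injected: the interface names follow the list above, but every method name and the CoordinatorSketch class are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical collaborator shapes; method names are assumptions.
class IRuntimeStore
{
public:
    virtual ~IRuntimeStore() = default;
    virtual bool SetLayerBypass(const std::string& layerId, bool bypassed) = 0;
    virtual void RequestPersistence() = 0;
};

class IRuntimeSnapshotProvider
{
public:
    virtual ~IRuntimeSnapshotProvider() = default;
    virtual void MarkSnapshotDirty() = 0;
};

class IHealthTelemetry
{
public:
    virtual ~IHealthTelemetry() = default;
    virtual void RecordMutationRejected(const std::string& reason) = 0;
};

// Holds references only: the coordinator owns none of its collaborators.
class CoordinatorSketch
{
public:
    CoordinatorSketch(IRuntimeStore& store, IRuntimeSnapshotProvider& snapshots, IHealthTelemetry& telemetry)
        : mStore(store), mSnapshots(snapshots), mTelemetry(telemetry) {}

    // A bypass toggle: mutate through the store, then mark the snapshot dirty.
    // Rejections are reported to telemetry rather than counted here.
    bool SetLayerBypass(const std::string& layerId, bool bypassed)
    {
        if (!mStore.SetLayerBypass(layerId, bypassed))
        {
            mTelemetry.RecordMutationRejected("unknown layer: " + layerId);
            return false;
        }
        mSnapshots.MarkSnapshotDirty();
        return true;
    }

private:
    IRuntimeStore& mStore;
    IRuntimeSnapshotProvider& mSnapshots;
    IHealthTelemetry& mTelemetry;
};
```

Injecting interfaces this way also lets policy be tested with small fakes, without GL, devices, or disk.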

Mutation result shape

A useful result structure should carry more than success/failure. It should support policy-driven downstream behavior without re-deriving the decision elsewhere.

Suggested fields:

  • accepted
  • errorMessage
  • stateChanged
  • persistedStateChanged
  • committedLiveStateChanged
  • transientStateChanged
  • snapshotPublicationRequired
  • persistenceRequested
  • renderResetScope
  • telemetryNotes

This prevents callers from guessing whether they need to reload, publish, or persist.
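
The suggested fields could be carried in a plain struct such as the one below. The field names follow the list above; the field types, the RenderResetScopeSketch placeholder enum, and the two factory helpers are assumptions.

```cpp
#include <string>
#include <utility>
#include <vector>

// Placeholder scope vocabulary; this note leaves the exact set open.
enum class RenderResetScopeSketch { None, Layer, WholeStack };

// Field names follow the suggested list; types are assumptions.
struct RuntimeMutationResult
{
    bool accepted = false;
    std::string errorMessage;
    bool stateChanged = false;
    bool persistedStateChanged = false;
    bool committedLiveStateChanged = false;
    bool transientStateChanged = false;
    bool snapshotPublicationRequired = false;
    bool persistenceRequested = false;
    RenderResetScopeSketch renderResetScope = RenderResetScopeSketch::None;
    std::vector<std::string> telemetryNotes;
};

// Hypothetical helper: a rejected request changes nothing and carries a reason.
inline RuntimeMutationResult Rejected(std::string message)
{
    RuntimeMutationResult result;
    result.errorMessage = std::move(message);
    return result;
}

// Hypothetical helper: an accepted transient-only change that requests no persistence.
// Whether overlay changes also need snapshot publication is left open here.
inline RuntimeMutationResult TransientOnly()
{
    RuntimeMutationResult result;
    result.accepted = true;
    result.stateChanged = true;
    result.transientStateChanged = true;
    return result;
}
```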

Current Code Mapping

The current app does not have a separate coordinator class, but several existing code paths are clearly doing coordinator work.

OpenGLCompositeRuntimeControls.cpp

Methods like:

  • AddLayer(...)
  • RemoveLayer(...)
  • MoveLayer(...)
  • SetLayerBypass(...)
  • SetLayerShader(...)
  • UpdateLayerParameterJson(...)
  • ResetLayerParameters(...)
  • SaveStackPreset(...)
  • LoadStackPreset(...)

currently do this pattern:

  1. call a host/store mutation directly
  2. decide whether to call ReloadShader(...)
  3. call broadcastRuntimeState()

See OpenGLCompositeRuntimeControls.cpp.

That "call host, then decide reload/broadcast policy" logic is a direct candidate for migration into RuntimeCoordinator.

Previous RuntimeHost

RuntimeHost previously combined:

  • mutation validation
  • state mutation
  • value normalization
  • persistence writes
  • render-state dirty marking

Examples from the old RuntimeHost.cpp:

  • AddLayer(...)
  • SetLayerShader(...)
  • UpdateLayerParameter(...)
  • UpdateLayerParameterByControlKey(...)
  • ApplyOscTargetByControlKey(...)
  • ResetLayerParameters(...)
  • LoadStackPreset(...)

The target design is not to move all implementation in one step. It is to peel policy and orchestration decisions away first.

RuntimeServices

Current OSC-specific flow in RuntimeServices includes:

  • queueing updates
  • applying pending updates
  • queueing commits
  • consuming completed commits
  • clearing OSC state

See RuntimeServices.h.

The coordinator should eventually own the rules for when these updates are transient, when they commit, and what reset/reload does to them, while ControlServices keeps only the ingress mechanics.

The coordinator should remain small enough to reason about. A good target is to split its internal logic into policy-focused helpers rather than letting one class become another RuntimeHost.

Possible internal helper concepts:

  • LayerMutationPolicy
  • ParameterMutationPolicy
  • PresetMutationPolicy
  • ReloadPolicy
  • OverlayPolicy

That can still be presented as one subsystem to the rest of the app, while keeping the implementation testable.

Snapshot Publication Contract

The coordinator should never force callers to know whether a snapshot must be rebuilt. That policy should be owned here.

Examples:

  • parameter changes require snapshot publication
  • layer reorder requires snapshot publication
  • shader swap requires snapshot publication and render-local rebuild work
  • stack preset load requires snapshot publication and likely broader transient-state invalidation
  • pure health/status changes do not require snapshot publication

This contract matters because current call sites often use coarse actions like ReloadShader() after structural edits. The coordinator should return a more precise outcome than "reload or not."
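
The examples in this section can be written down as a single lookup so that callers never decide publication themselves. MutationKind, PublicationOutcome, and OutcomeFor are hypothetical names; the table simply restates the bullets above.

```cpp
// Hypothetical mutation kinds matching the examples in this section.
enum class MutationKind { ParameterChange, LayerReorder, ShaderSwap, StackPresetLoad, HealthStatusOnly };

struct PublicationOutcome
{
    bool snapshotPublicationRequired;
    bool renderLocalRebuildRequired;
    bool transientInvalidationLikely;
};

// Encodes the contract in one place instead of at each call site.
constexpr PublicationOutcome OutcomeFor(MutationKind kind)
{
    switch (kind)
    {
    case MutationKind::ParameterChange:  return {true, false, false};
    case MutationKind::LayerReorder:     return {true, false, false};
    case MutationKind::ShaderSwap:       return {true, true, false};
    case MutationKind::StackPresetLoad:  return {true, false, true};
    case MutationKind::HealthStatusOnly: return {false, false, false};
    }
    return {false, false, false}; // not reached for valid enumerators
}
```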

Reload and Reset Policy

Reload and reset behavior has been a recurring source of edge cases in the current app, especially with shader feedback, temporal history, and OSC overlay state.

The coordinator should define explicit reset scopes such as:

  • parameter-values-only reset
  • committed-live-state reset for a layer
  • transient-overlay reset for a layer
  • render-local-history reset for a layer
  • whole-stack structural reset
  • reload-induced compatibility reset

That allows later phases to stop encoding reset behavior implicitly in UI handlers or render rebuild code.
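
One possible encoding of those scopes is a set of combinable flags, sketched below. Treating scopes as bit flags is an assumption (this note does not say whether scopes combine), and ResetScopeFlag and HasScope are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical flag encoding of the reset scopes listed above.
enum class ResetScopeFlag : std::uint32_t
{
    None                    = 0,
    ParameterValues         = 1u << 0,
    CommittedLiveLayer      = 1u << 1,
    TransientOverlayLayer   = 1u << 2,
    RenderLocalHistoryLayer = 1u << 3,
    WholeStackStructural    = 1u << 4,
    ReloadCompatibility     = 1u << 5
};

// Combines two scopes into one value.
constexpr ResetScopeFlag operator|(ResetScopeFlag a, ResetScopeFlag b)
{
    return static_cast<ResetScopeFlag>(static_cast<std::uint32_t>(a) | static_cast<std::uint32_t>(b));
}

// True when any bit of `flag` is set in `value`.
constexpr bool HasScope(ResetScopeFlag value, ResetScopeFlag flag)
{
    return (static_cast<std::uint32_t>(value) & static_cast<std::uint32_t>(flag)) != 0;
}
```

With named scopes, a shader swap could for example request `TransientOverlayLayer | RenderLocalHistoryLayer` without implying a whole-stack reset.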

Migration Plan From Current Code

The coordinator should be introduced incrementally.

Step 1. Define request and result types

Introduce typed mutation request/result objects without changing most internals yet.

Step 2. Wrap direct runtime mutations behind coordinator entrypoints

The first implementation could still delegate heavily into existing runtime mutation paths, but the call sites should stop deciding policy on their own.

For example, instead of:

  1. OpenGLComposite::AddLayer()
  2. direct layer-add mutation
  3. ReloadShader(true)
  4. broadcastRuntimeState()

the flow becomes:

  1. OpenGLComposite or ControlServices creates a typed request
  2. RuntimeCoordinator::ApplyMutation(...)
  3. coordinator returns a result describing snapshot, reset, and persistence needs
  4. composition root dispatches those downstream effects
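
Step 4 of the new flow might look roughly like this at the composition root. MutationOutcome is a trimmed-down stand-in for the coordinator's result, and DownstreamEffects is a hypothetical bundle of hooks; only broadcastRuntimeState corresponds to a name used by the current call sites.

```cpp
#include <functional>

// Trimmed-down result carrying only the fields needed for dispatch.
struct MutationOutcome
{
    bool accepted = false;
    bool snapshotPublicationRequired = false;
    bool persistenceRequested = false;
    bool renderResetRequired = false;
};

// Hypothetical downstream hooks wired up by the composition root.
struct DownstreamEffects
{
    std::function<void()> publishSnapshot;
    std::function<void()> requestPersistence;
    std::function<void()> resetRenderState;
    std::function<void()> broadcastRuntimeState;
};

// The composition root performs only what the coordinator's result asks for;
// it no longer decides reload or publication policy itself.
inline void DispatchOutcome(const MutationOutcome& outcome, const DownstreamEffects& effects)
{
    if (!outcome.accepted)
        return;
    if (outcome.snapshotPublicationRequired)
        effects.publishSnapshot();
    if (outcome.renderResetRequired)
        effects.resetRenderState();
    if (outcome.persistenceRequested)
        effects.requestPersistence();
    effects.broadcastRuntimeState();
}
```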

Step 3. Move validation and classification out of direct mutation helpers

Once coordinator entrypoints are stable, pull up:

  • mutation classification
  • reset/reload policy
  • transient-versus-durable decisions

while leaving raw store operations in place.

Step 4. Split storage and snapshot collaborators

Only after the coordinator is clearly owning policy should storage and snapshot responsibilities be split into real target subsystems.

Key Risks

Risk 1. Coordinator becomes a new god object

If the coordinator starts owning persistence details, status counters, or render reset mechanics directly, it will just recreate the current problem under a new name.

Mitigation:

  • keep collaborators explicit
  • keep request/result types narrow
  • avoid direct dependencies on render or backend internals

Risk 2. Call sites bypass coordinator during migration

If new code bypasses RuntimeCoordinator for convenience, the architecture will fork into two policy systems.

Mitigation:

  • treat the coordinator as the required entrypoint for new non-render mutations
  • add compatibility adapters rather than parallel mutation paths

Risk 3. Too much policy stays implicit in return conventions

If callers still infer policy from "which method was called," the coordinator will not actually clarify the system.

Mitigation:

  • return explicit mutation outcomes
  • define reset and publication scopes as named concepts

Risk 4. Transient-state ownership remains fuzzy

OSC overlay behavior, feedback invalidation, and reload compatibility can easily blur subsystem boundaries again.

Mitigation:

  • coordinator owns classification rules
  • subsystem owners retain storage ownership
  • reset scopes are explicit

Open Questions

  • Should committed live state remain physically stored in RuntimeStore, or should the coordinator gain a live-session companion object before Phase 3?
  • Should preset load/save stay synchronous through early migration, or should the coordinator always treat them as policy requests whose persistence effects may complete later?
  • Should reload requests be modeled as a dedicated mutation class distinct from ordinary control mutations from the start?
  • How much normalization of parameter values should remain in store-side helpers versus moving into coordinator policy helpers?
  • Should transient overlay commit policy be global, or parameter-definition-driven for specific shader controls?
  • What is the minimal reset-scope vocabulary needed to avoid hard-coding reload behavior in RenderEngine later?

Short Version

RuntimeCoordinator is where the app decides what a valid change means.

It should:

  • accept typed mutations from ingress systems
  • validate and classify them
  • update durable and committed state through RuntimeStore
  • request render-facing publication through RuntimeSnapshotProvider
  • request persistence when policy requires it
  • define reset, reload, and transient-overlay rules

It should not:

  • parse transport payloads
  • own GL work
  • own device callbacks
  • write files directly
  • become a replacement monolith for every kind of state