RuntimeCoordinator Design Note
This document defines the target design for the RuntimeCoordinator subsystem introduced in PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md.
RuntimeCoordinator is the mutation and policy layer for the app. Its job is to accept already-normalized actions from ingress systems, decide whether those actions are valid, classify how they should affect durable and live state, and trigger downstream publication or persistence work without taking ownership of rendering, device callbacks, or disk serialization details.
Why This Subsystem Exists
Before the Phase 1 runtime split, the app's mutation path was split across several places:
- RuntimeHost performed validation, mutation, persistence, render-state invalidation, and some status updates: RuntimeHost.h, RuntimeHost.cpp
- OpenGLComposite currently acts like an orchestration shell and a mutation coordinator at the same time
- RuntimeServices still owns some deferred control flow around OSC commit and polling
That overlap makes several kinds of regressions more likely:
- persistence policy leaks into control handlers
- render invalidation rules are spread across UI and non-UI paths
- transient automation behavior is hard to reason about
- reload behavior is partly a render concern and partly a runtime concern
- future event-model work has no single policy owner to target
RuntimeCoordinator exists to centralize those decisions without becoming a new monolith.
Core Responsibilities
RuntimeCoordinator should own the following responsibilities.
1. Mutation intake after normalization
RuntimeCoordinator accepts typed, already-parsed actions from ControlServices or composition-root adapters. Examples:
- add/remove/move layer
- change shader on a layer
- change a parameter value
- reset a layer
- save or load a stack preset
- request a shader/package reload
- apply a transient automation target
- commit or clear transient overlay state
The coordinator should not parse JSON, decode OSC payloads, or inspect HTTP payload syntax. That belongs to ingress systems.
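As a rough illustration of what "typed, already-parsed actions" could mean, the sketch below models a few of the listed actions as a closed `std::variant`. All type and field names (`AddLayerAction`, `SetParameterAction`, `DescribeAction`, and so on) are hypothetical and not taken from the codebase; the point is only that the coordinator sees typed fields, never JSON or OSC payloads.

```cpp
#include <string>
#include <variant>

// Hypothetical action types; names and fields are illustrative only.
struct AddLayerAction { std::string shaderId; };
struct RemoveLayerAction { std::string layerId; };
struct SetParameterAction { std::string layerId; std::string parameterId; double value; };
struct RequestReloadAction {};

// A closed set of typed actions that ingress systems could produce.
using RuntimeAction = std::variant<AddLayerAction, RemoveLayerAction,
                                   SetParameterAction, RequestReloadAction>;

// Returns a short label for an action. The visitor works on typed fields,
// so no transport-level parsing happens on the coordinator side.
inline std::string DescribeAction(const RuntimeAction& action)
{
    struct Visitor
    {
        std::string operator()(const AddLayerAction& a) const { return "add-layer:" + a.shaderId; }
        std::string operator()(const RemoveLayerAction& a) const { return "remove-layer:" + a.layerId; }
        std::string operator()(const SetParameterAction& a) const
        {
            return "set-parameter:" + a.layerId + "/" + a.parameterId;
        }
        std::string operator()(const RequestReloadAction&) const { return "request-reload"; }
    };
    return std::visit(Visitor{}, action);
}
```

A variant keeps the action set closed, so adding a new action kind forces every visitor to handle it. A class hierarchy or tagged struct would work equally well if that fits the existing code better.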
2. Validation and policy decisions
The coordinator validates whether a requested mutation is allowed and decides how it should behave.
Examples:
- whether a layer id exists
- whether a shader id is valid
- whether a parameter exists on the targeted shader
- whether a value is within the definition's allowed range or enum set
- whether a trigger should update committed state, transient state, or both
- whether a structural change should preserve compatible transient state such as feedback buffers
This is the policy surface that used to be spread between RuntimeHost methods such as:
- AddLayer(...)
- SetLayerShader(...)
- UpdateLayerParameter(...)
- UpdateLayerParameterByControlKey(...)
- ApplyOscTargetByControlKey(...)
- ResetLayerParameters(...)
See RuntimeHost.h.
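One of the checks above, "whether a value is within the definition's allowed range", might look like the following minimal sketch. `ParameterDefinition`, `ValidationOutcome`, and `ValidateParameterValue` are assumed names, and rejecting an out-of-range value (rather than clamping it) is itself a policy choice that this note leaves open.

```cpp
#include <optional>
#include <string>

// Hypothetical parameter definition; a real one would also cover enum sets.
struct ParameterDefinition
{
    std::string id;
    double minValue;
    double maxValue;
};

struct ValidationOutcome
{
    bool accepted;
    std::string errorMessage;
};

// Validates a requested value against the result of a definition lookup.
// An empty optional stands for "parameter does not exist on this shader".
inline ValidationOutcome ValidateParameterValue(
    const std::optional<ParameterDefinition>& definition, double value)
{
    if (!definition)
        return {false, "unknown parameter"};
    if (value < definition->minValue || value > definition->maxValue)
        return {false, "value out of range for " + definition->id};
    return {true, ""};
}
```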
3. State classification
The coordinator decides which state category a mutation affects:
- persisted state
- committed live state
- transient live overlay state
- health/timing state only
The design rule is that classification belongs here, not in the ingress layer and not in render code.
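Because a single mutation can touch more than one category (for example a trigger that updates "committed state, transient state, or both"), one possible representation is a small set of bit flags. The names `StateCategory`, `Affects`, and `ClassifyTrigger` below are assumptions made for illustration, and the trigger rule is only one plausible policy.

```cpp
#include <cstdint>

// Hypothetical bit flags for the four state categories named above.
enum class StateCategory : std::uint8_t
{
    None = 0,
    Persisted = 1 << 0,
    CommittedLive = 1 << 1,
    TransientOverlay = 1 << 2,
    HealthOnly = 1 << 3,
};

// Combines two category sets.
constexpr StateCategory operator|(StateCategory a, StateCategory b)
{
    return static_cast<StateCategory>(
        static_cast<std::uint8_t>(a) | static_cast<std::uint8_t>(b));
}

// True when the set contains the given category.
constexpr bool Affects(StateCategory set, StateCategory category)
{
    return (static_cast<std::uint8_t>(set) & static_cast<std::uint8_t>(category)) != 0;
}

// Illustrative policy: a trigger always touches the overlay and optionally
// also commits into committed live state.
constexpr StateCategory ClassifyTrigger(bool commitToLiveState)
{
    return commitToLiveState
        ? (StateCategory::TransientOverlay | StateCategory::CommittedLive)
        : StateCategory::TransientOverlay;
}
```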
4. Snapshot publication requests
When a mutation changes render-facing state, the coordinator asks RuntimeSnapshotProvider to publish a new snapshot or mark one dirty for publication.
The coordinator does not build render snapshots itself.
5. Persistence requests
When a mutation changes durable state, the coordinator asks RuntimeStore to record the new authoritative state and, when applicable, request persistence through the store's write path.
The coordinator does not serialize files directly.
6. Cross-subsystem consistency policy
The coordinator is where "what else must happen if this changes?" lives.
Examples:
- a layer add/remove/move may require:
  - store mutation
  - snapshot republish
  - compatibility-preserving render-state reset policy
  - optional UI-state notification via later event-model work
- a stack preset load may require:
  - replacement of committed layer stack state
  - invalidation of transient overlay state that no longer maps cleanly
  - snapshot republish
  - deferred persistence request
- an automation target may require:
  - transient overlay update only
  - no persistence write
  - optional later commit into committed live state if policy says so
Explicit Non-Responsibilities
RuntimeCoordinator should explicitly not own the following.
Not a persistence engine
It does not:
- read or write files
- decide file formats
- own preset storage layout
- perform debounced disk flushing logic
Those belong in RuntimeStore and later persistence helpers.
Not a render engine
It does not:
- own GL objects
- perform shader compilation
- reset temporal history textures directly
- build render passes
- hold frame queues
It may request policy outcomes that cause render-local resets, but render performs the work.
Not a hardware/backend owner
It does not:
- configure DeckLink
- react directly to device callbacks
- schedule playout
- own input signal callbacks
Not an ingress transport layer
It does not:
- parse OSC wire messages
- host websockets
- own HTTP handlers
- own polling loops
Not a health reporting sink
It can emit mutation outcomes and warnings to HealthTelemetry, but it should not own counters, logs, or dashboards.
Mutation Policy
The coordinator should classify mutations into a small number of policy classes rather than making ad hoc per-call decisions.
Durable mutation
Updates authoritative state that should survive beyond the current session flow.
Examples:
- add/remove/move layer
- change selected shader on a layer
- update a parameter via UI or API
- load a stack preset
- reset a layer to defaults
Expected coordinator behavior:
- validate the request
- normalize the target and value if needed
- update committed/durable state via RuntimeStore
- request snapshot publication
- request persistence according to policy
Live committed mutation
Updates committed current-session state that should be treated as true until changed again, but may not need synchronous persistence.
Examples:
- a UI action that changes a parameter repeatedly while dragging
- a manual operator bypass toggle during live use
Expected coordinator behavior:
- update committed live state
- request snapshot publication
- decide whether persistence should happen immediately, be debounced, or be deferred
Transient overlay mutation
Affects output but should not masquerade as stored truth.
Examples:
- active OSC automation target
- short-lived trigger-driven visual automation state
Expected coordinator behavior:
- validate the route and target parameter
- classify the action as transient
- update overlay state through the appropriate owner boundary
- avoid persistence unless a separate commit policy is invoked
Coordination-only mutation
A request that mainly exists to trigger a flow rather than edit value state.
Examples:
- request reload
- request publish-now
- request clear transient state on reset/rebuild
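The four classes above can be read as a lookup from policy class to expected coordinator behavior. The sketch below encodes only what the lists state. All identifiers (`MutationPolicyClass`, `PersistenceHandling`, `ExpectedBehavior`, `BehaviorFor`) are hypothetical, and the row for coordination-only mutations is an assumption, since no expected behavior is listed for that class.

```cpp
// Hypothetical names throughout; only the four classes come from the text.
enum class MutationPolicyClass { Durable, LiveCommitted, TransientOverlay, CoordinationOnly };

// How the coordinator treats persistence for a given class.
enum class PersistenceHandling { Avoid, RequestPerPolicy, ChooseTiming };

struct ExpectedBehavior
{
    bool updatesCommittedState;
    bool updatesOverlayState;
    bool requestsSnapshotPublication;
    PersistenceHandling persistence;
};

constexpr ExpectedBehavior BehaviorFor(MutationPolicyClass policyClass)
{
    switch (policyClass)
    {
    case MutationPolicyClass::Durable:
        return {true, false, true, PersistenceHandling::RequestPerPolicy};
    case MutationPolicyClass::LiveCommitted:
        // Immediate, debounced, or deferred persistence is decided per mutation.
        return {true, false, true, PersistenceHandling::ChooseTiming};
    case MutationPolicyClass::TransientOverlay:
        // Overlay updates go through the owning subsystem; nothing is persisted
        // unless a separate commit policy is invoked.
        return {false, true, false, PersistenceHandling::Avoid};
    case MutationPolicyClass::CoordinationOnly:
        // Assumption: triggers a flow without editing value state.
        return {false, false, false, PersistenceHandling::Avoid};
    }
    return {false, false, false, PersistenceHandling::Avoid};
}
```

Keeping this as a table makes it easy to review the policy in one place and to test it without any runtime state.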
Interaction With State Categories
This section restates the Phase 1 state model specifically from the coordinator's perspective.
Persisted state
RuntimeCoordinator does not own persisted state, but it decides when persisted state should change.
Typical interaction:
- validate request
- call into RuntimeStore
- receive success/failure
- request persistence if policy says this mutation should be durable
Committed live state
This is the coordinator's primary logical domain.
Even while committed live state is physically stored inside RuntimeStore, the coordinator should be considered the policy owner of:
- current layer stack composition
- current selected shaders
- current bypass flags
- current operator-authored parameter values
Transient live overlay state
The coordinator defines the rules for transient state, but should not become the long-term storage owner for render-local transient data.
The expected split is:
- coordinator owns policy
- ControlServices may own short ingress-side queues and coalescing buffers
- RenderEngine owns render-local transient application state
- VideoBackend owns playout and device transient state
For OSC specifically, the coordinator should eventually decide:
- whether an automation change is transient-only
- whether it should later commit into committed live state
- what reset/reload actions invalidate it
Health and timing state
The coordinator may emit events like:
- mutation rejected
- reload requested
- preset load succeeded/failed
- transient state cleared because structure changed
But those are observations into HealthTelemetry, not coordinator-owned data.
Proposed Interfaces
These are target-shape interfaces, not final signatures.
Input-facing API
Core mutation entrypoints could look like:
#include <string>

// Forward declarations only; the concrete request/result shapes are still open.
struct RuntimeMutationRequest;
struct RuntimeMutationResult;
struct ReloadRequest;
struct OverlayCommitRequest;
struct RuntimeResetScope; // assumed to be a struct here; it could equally be an enum

class RuntimeCoordinator
{
public:
    RuntimeMutationResult ApplyMutation(const RuntimeMutationRequest& request);
    RuntimeMutationResult ApplyAutomationTarget(const RuntimeMutationRequest& request);
    RuntimeMutationResult ResetLayer(const std::string& layerId);
    RuntimeMutationResult RequestReload(const ReloadRequest& request);
    RuntimeMutationResult CommitOverlayState(const OverlayCommitRequest& request);
    RuntimeMutationResult ClearTransientStateForScope(const RuntimeResetScope& scope);
};
The important point is not the exact names. It is that ingress systems send typed requests into one policy owner.
Downstream collaborators
The coordinator likely needs collaborators conceptually equivalent to:
- IRuntimeStore
- IRuntimeSnapshotProvider
- IHealthTelemetry
- compatibility adapters only where older call shapes still need to be supported during migration
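To make the collaborator idea concrete, here is a minimal sketch in which a coordinator receives its three collaborators by reference and handles one mutation (a bypass toggle). Only the interface names come from this note. Every method (`HasLayer`, `SetLayerBypass`, `MarkDirty`, `ReportWarning`), the `CoordinatorSketch` class, and the in-memory fakes are assumptions for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical collaborator interfaces; all methods are assumed.
class IRuntimeStore
{
public:
    virtual ~IRuntimeStore() = default;
    virtual bool HasLayer(const std::string& layerId) const = 0;
    virtual void SetLayerBypass(const std::string& layerId, bool bypassed) = 0;
};

class IRuntimeSnapshotProvider
{
public:
    virtual ~IRuntimeSnapshotProvider() = default;
    virtual void MarkDirty() = 0;
};

class IHealthTelemetry
{
public:
    virtual ~IHealthTelemetry() = default;
    virtual void ReportWarning(const std::string& message) = 0;
};

// Validates, mutates through the store, then requests publication.
// It never touches files, GL objects, or device callbacks.
class CoordinatorSketch
{
public:
    CoordinatorSketch(IRuntimeStore& store, IRuntimeSnapshotProvider& snapshots,
                      IHealthTelemetry& telemetry)
        : mStore(store), mSnapshots(snapshots), mTelemetry(telemetry) {}

    // Returns true when the bypass change was accepted and applied.
    bool SetLayerBypass(const std::string& layerId, bool bypassed)
    {
        if (!mStore.HasLayer(layerId))
        {
            mTelemetry.ReportWarning("mutation rejected: unknown layer " + layerId);
            return false;
        }
        mStore.SetLayerBypass(layerId, bypassed);
        mSnapshots.MarkDirty();
        return true;
    }

private:
    IRuntimeStore& mStore;
    IRuntimeSnapshotProvider& mSnapshots;
    IHealthTelemetry& mTelemetry;
};

// In-memory fakes for exercising the policy without a real runtime.
struct FakeStore : IRuntimeStore
{
    std::set<std::string> layers;
    std::set<std::string> bypassed;
    bool HasLayer(const std::string& layerId) const override { return layers.count(layerId) != 0; }
    void SetLayerBypass(const std::string& layerId, bool value) override
    {
        if (value) bypassed.insert(layerId); else bypassed.erase(layerId);
    }
};

struct FakeSnapshots : IRuntimeSnapshotProvider
{
    int dirtyCount = 0;
    void MarkDirty() override { ++dirtyCount; }
};

struct FakeTelemetry : IHealthTelemetry
{
    std::vector<std::string> warnings;
    void ReportWarning(const std::string& message) override { warnings.push_back(message); }
};
```

Injecting interfaces rather than concrete subsystems is what lets the policy be tested with fakes like these, and it keeps the coordinator free of direct dependencies on render or backend internals.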
Mutation result shape
A useful result structure should carry more than success/failure. It should support policy-driven downstream behavior without re-deriving the decision elsewhere.
Suggested fields:
- accepted
- errorMessage
- stateChanged
- persistedStateChanged
- committedLiveStateChanged
- transientStateChanged
- snapshotPublicationRequired
- persistenceRequested
- renderResetScope
- telemetryNotes
This prevents callers from guessing whether they need to reload, publish, or persist.
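A direct transcription of the suggested fields into a struct could look like the sketch below. The field names come from the list above. The field types, the placeholder `RenderResetScope` enum, and the `Rejected` helper are assumptions; a fuller reset-scope vocabulary is discussed under Reload and Reset Policy.

```cpp
#include <string>
#include <utility>
#include <vector>

// Placeholder scope values; hypothetical and deliberately coarse.
enum class RenderResetScope { None, Layer, WholeStack };

struct RuntimeMutationResult
{
    bool accepted = false;
    std::string errorMessage;
    bool stateChanged = false;
    bool persistedStateChanged = false;
    bool committedLiveStateChanged = false;
    bool transientStateChanged = false;
    bool snapshotPublicationRequired = false;
    bool persistenceRequested = false;
    RenderResetScope renderResetScope = RenderResetScope::None;
    std::vector<std::string> telemetryNotes;
};

// Convenience helper: a rejected mutation changes nothing downstream.
inline RuntimeMutationResult Rejected(std::string message)
{
    RuntimeMutationResult result;
    result.errorMessage = std::move(message);
    return result;
}
```

Defaulting every flag to "nothing happened" means a code path that forgets to set a field fails safe: no publication, no persistence, no reset.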
Current Code Mapping
The current app does not have a separate coordinator class, but several existing code paths are clearly doing coordinator work.
OpenGLCompositeRuntimeControls.cpp
Methods like:
- AddLayer(...)
- RemoveLayer(...)
- MoveLayer(...)
- SetLayerBypass(...)
- SetLayerShader(...)
- UpdateLayerParameterJson(...)
- ResetLayerParameters(...)
- SaveStackPreset(...)
- LoadStackPreset(...)
currently do this pattern:
- call a host/store mutation directly
- decide whether to call ReloadShader(...)
- call broadcastRuntimeState()
See OpenGLCompositeRuntimeControls.cpp.
That "call host, then decide reload/broadcast policy" logic is a direct candidate for migration into RuntimeCoordinator.
Previous RuntimeHost
RuntimeHost previously combined:
- mutation validation
- state mutation
- value normalization
- persistence writes
- render-state dirty marking
Examples from the old RuntimeHost.cpp:
- AddLayer(...)
- SetLayerShader(...)
- UpdateLayerParameter(...)
- UpdateLayerParameterByControlKey(...)
- ApplyOscTargetByControlKey(...)
- ResetLayerParameters(...)
- LoadStackPreset(...)
The target design is not to move all implementation in one step. It is to peel policy and orchestration decisions away first.
RuntimeServices
Current OSC-specific flow in RuntimeServices includes:
- queueing updates
- applying pending updates
- queueing commits
- consuming completed commits
- clearing OSC state
See RuntimeServices.h.
The coordinator should eventually own the rules for when these updates are transient, when they commit, and what reset/reload does to them, while ControlServices keeps only the ingress mechanics.
Recommended Internal Model
The coordinator should remain small enough to reason about. A good target is to split its internal logic into policy-focused helpers rather than letting one class become another RuntimeHost.
Possible internal helper concepts:
- LayerMutationPolicy
- ParameterMutationPolicy
- PresetMutationPolicy
- ReloadPolicy
- OverlayPolicy
That can still be presented as one subsystem to the rest of the app, while keeping the implementation testable.
Snapshot Publication Contract
The coordinator should never force callers to know whether a snapshot must be rebuilt. That policy should be owned here.
Examples:
- parameter changes require snapshot publication
- layer reorder requires snapshot publication
- shader swap requires snapshot publication and render-local rebuild work
- stack preset load requires snapshot publication and likely broader transient-state invalidation
- pure health/status changes do not require snapshot publication
This contract matters because current call sites often use coarse actions like ReloadShader() after structural edits. The coordinator should return a more precise outcome than "reload or not."
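The examples in this section can be written down as a small table from mutation kind to publication outcome, which is the kind of "more precise outcome" a caller could act on. `MutationKind`, `PublicationOutcome`, and `PublicationFor` are hypothetical names, and the sketch encodes only what the list above states; anything the list does not mention is left as `false`.

```cpp
// Hypothetical mutation kinds matching the examples in this section.
enum class MutationKind
{
    ParameterChange,
    LayerReorder,
    ShaderSwap,
    StackPresetLoad,
    HealthStatusOnly,
};

struct PublicationOutcome
{
    bool snapshotPublicationRequired;
    bool renderLocalRebuildRequired;
    bool transientInvalidationLikely;
};

constexpr PublicationOutcome PublicationFor(MutationKind kind)
{
    switch (kind)
    {
    case MutationKind::ParameterChange:  return {true, false, false};
    case MutationKind::LayerReorder:     return {true, false, false};
    case MutationKind::ShaderSwap:       return {true, true, false};
    case MutationKind::StackPresetLoad:  return {true, false, true};
    case MutationKind::HealthStatusOnly: return {false, false, false};
    }
    return {false, false, false};
}
```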
Reload and Reset Policy
Reload and reset behavior has been a recurring source of edge cases in the current app, especially with shader feedback, temporal history, and OSC overlay state.
The coordinator should define explicit reset scopes such as:
- parameter-values-only reset
- committed-live-state reset for a layer
- transient-overlay reset for a layer
- render-local-history reset for a layer
- whole-stack structural reset
- reload-induced compatibility reset
That allows later phases to stop encoding reset behavior implicitly in UI handlers or render rebuild code.
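As a starting point, the scopes listed above could be named in a single enum so that results, telemetry, and render code all refer to the same vocabulary. The enum name `ResetScope`, its enumerators, and `ResetScopeName` are assumptions, and the final set of scopes is still an open question.

```cpp
#include <string>

// Hypothetical enum mirroring the reset scopes listed above.
enum class ResetScope
{
    ParameterValuesOnly,
    CommittedLiveStateForLayer,
    TransientOverlayForLayer,
    RenderLocalHistoryForLayer,
    WholeStackStructural,
    ReloadInducedCompatibility,
};

// Human-readable name, e.g. for telemetry notes attached to a mutation result.
inline std::string ResetScopeName(ResetScope scope)
{
    switch (scope)
    {
    case ResetScope::ParameterValuesOnly:        return "parameter-values-only reset";
    case ResetScope::CommittedLiveStateForLayer: return "committed-live-state reset for a layer";
    case ResetScope::TransientOverlayForLayer:   return "transient-overlay reset for a layer";
    case ResetScope::RenderLocalHistoryForLayer: return "render-local-history reset for a layer";
    case ResetScope::WholeStackStructural:       return "whole-stack structural reset";
    case ResetScope::ReloadInducedCompatibility: return "reload-induced compatibility reset";
    }
    return "unknown reset scope";
}
```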
Migration Plan From Current Code
The coordinator should be introduced incrementally.
Step 1. Define request and result types
Introduce typed mutation request/result objects without changing most internals yet.
Step 2. Wrap direct runtime mutations behind coordinator entrypoints
The first implementation could still delegate heavily into existing runtime mutation paths, but the call sites should stop deciding policy on their own.
For example, instead of:
- OpenGLComposite::AddLayer()
- direct layer-add mutation
- ReloadShader(true)
- broadcastRuntimeState()
the flow becomes:
- OpenGLComposite or ControlServices creates a typed request
- RuntimeCoordinator::ApplyMutation(...)
- coordinator returns a result describing snapshot, reset, and persistence needs
- composition root dispatches those downstream effects
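The last step of that flow, where the composition root acts on the returned result instead of deciding policy itself, might be sketched as follows. The trimmed `MutationResult` reuses field names from the suggested result shape. `IDownstreamEffects`, `DispatchEffects`, and `CountingEffects` are hypothetical and exist only to show the control flow.

```cpp
// Trimmed result type; field names follow the suggested result shape.
struct MutationResult
{
    bool accepted;
    bool snapshotPublicationRequired;
    bool persistenceRequested;
};

// Hypothetical sink that the composition root would implement.
class IDownstreamEffects
{
public:
    virtual ~IDownstreamEffects() = default;
    virtual void PublishSnapshot() = 0;
    virtual void RequestPersistence() = 0;
};

// Dispatches exactly the effects the coordinator asked for, and nothing
// at all for a rejected mutation.
inline void DispatchEffects(const MutationResult& result, IDownstreamEffects& effects)
{
    if (!result.accepted)
        return;
    if (result.snapshotPublicationRequired)
        effects.PublishSnapshot();
    if (result.persistenceRequested)
        effects.RequestPersistence();
}

// Counting fake, handy for checking the dispatch logic in isolation.
struct CountingEffects : IDownstreamEffects
{
    int published = 0;
    int persisted = 0;
    void PublishSnapshot() override { ++published; }
    void RequestPersistence() override { ++persisted; }
};
```

Compared with the old pattern of calling ReloadShader(true) and broadcastRuntimeState() unconditionally, the call site here contains no policy: it only executes what the result describes.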
Step 3. Move validation and classification out of direct mutation helpers
Once coordinator entrypoints are stable, pull up:
- mutation classification
- reset/reload policy
- transient-versus-durable decisions
while leaving raw store operations in place.
Step 4. Split storage and snapshot collaborators
Only after the coordinator is clearly owning policy should storage and snapshot responsibilities be split into real target subsystems.
Key Risks
Risk 1. Coordinator becomes a new god object
If the coordinator starts owning persistence details, status counters, or render reset mechanics directly, it will just recreate the current problem under a new name.
Mitigation:
- keep collaborators explicit
- keep request/result types narrow
- avoid direct dependencies on render or backend internals
Risk 2. Call sites bypass coordinator during migration
If new code bypasses RuntimeCoordinator for convenience, the architecture will fork into two policy systems.
Mitigation:
- treat the coordinator as the required entrypoint for new non-render mutations
- add compatibility adapters rather than parallel mutation paths
Risk 3. Too much policy stays implicit in return conventions
If callers still infer policy from "which method was called," the coordinator will not actually clarify the system.
Mitigation:
- return explicit mutation outcomes
- define reset and publication scopes as named concepts
Risk 4. Transient-state ownership remains fuzzy
OSC overlay behavior, feedback invalidation, and reload compatibility can easily blur subsystem boundaries again.
Mitigation:
- coordinator owns classification rules
- subsystem owners retain storage ownership
- reset scopes are explicit
Open Questions
- Should committed live state remain physically stored in RuntimeStore, or should the coordinator gain a live-session companion object before Phase 3?
- Should preset load/save stay synchronous through early migration, or should the coordinator always treat them as policy requests whose persistence effects may complete later?
- Should reload requests be modeled as a dedicated mutation class distinct from ordinary control mutations from the start?
- How much normalization of parameter values should remain in store-side helpers versus moving into coordinator policy helpers?
- Should transient overlay commit policy be global, or parameter-definition-driven for specific shader controls?
- What is the minimal reset-scope vocabulary needed to avoid hard-coding reload behavior in RenderEngine later?
Short Version
RuntimeCoordinator is where the app decides what a valid change means.
It should:
- accept typed mutations from ingress systems
- validate and classify them
- update durable and committed state through RuntimeStore
- request render-facing publication through RuntimeSnapshotProvider
- request persistence when policy requires it
- define reset, reload, and transient-overlay rules
It should not:
- parse transport payloads
- own GL work
- own device callbacks
- write files directly
- become a replacement monolith for every kind of state