aiden/video-shader-toys

Aiden 618831d578

Phase 1 goals

2026-05-10 22:36:34 +10:00

Phase 1 Design: Subsystem Boundaries and Target Architecture

This document expands Phase 1 of ARCHITECTURE_RESILIENCE_REVIEW.md into a concrete target design. Its purpose is to define the long-term subsystem split before later phases introduce a full event model, split RuntimeHost, and move rendering onto a sole-owner render thread.

The main goal of Phase 1 is not to immediately rewrite the app. It is to establish clear ownership boundaries so later refactors all move toward the same architecture instead of solving local problems in conflicting ways.

Why Phase 1 Exists

Today the app works, but too many responsibilities still converge in a few places:

RuntimeHost owns persistence, live layer state, shader package access, status reporting, and mutation entrypoints.
OpenGLComposite coordinates runtime setup, render state retrieval, shader rebuild handling, transient OSC overlay behavior, and video backend integration.
DeckLink callback-driven playout still reaches directly into render-facing work.
Background services rely on polling and shared mutable state more than explicit subsystem contracts.

Those are exactly the kinds of overlaps that make timing issues, state regressions, and recovery edge cases harder to solve cleanly.

Phase 1 creates a map for where each responsibility should eventually live.

Design Goals

The target architecture should optimize for:

live timing isolation
explicit state ownership
predictable recovery behavior
clear boundaries between persistent state and transient live state
easier testing of non-GL and non-hardware logic
fewer cross-thread shared mutable objects
a playout model that can evolve toward producer/consumer scheduling

Non-Goals

Phase 1 does not itself require:

replacing every direct call with events immediately
moving all rendering to a new thread yet
redesigning the shader contract again
changing DeckLink behavior in place
removing all existing classes before replacements exist

This phase is the target design and the dependency rules. Later phases perform the actual extraction.

Current Pressure Points

The following current code paths are the strongest evidence for the split proposed here:

RuntimeHost is both store and live authority:
- RuntimeHost.h
- RuntimeHost.cpp
OpenGLComposite is both app orchestrator and render/runtime coordinator:
- OpenGLComposite.cpp
- OpenGLComposite.cpp
RuntimeServices mixes service orchestration with polling and deferred state work:
- RuntimeServices.h
- RuntimeServices.cpp
Playout is still callback-coupled to render-facing work:
- OpenGLVideoIOBridge.cpp

Target Subsystems

The long-term architecture should converge on seven primary subsystems:

RuntimeStore
RuntimeCoordinator
RuntimeSnapshotProvider
ControlServices
RenderEngine
VideoBackend
HealthTelemetry

The split below is intentionally sharper than the current code. The point is to make ownership obvious.

Subsystem Responsibilities

`RuntimeStore`

RuntimeStore owns persisted and operator-authored state.

It is the source of truth for:

runtime config loaded from disk
persisted layer stack structure
persisted parameter values
stack preset serialization/deserialization
shader/package metadata that must survive across renders

It should not be responsible for:

render-thread timing
GL resource lifetime
live transient overlays
hardware callback coordination
UI/websocket broadcasting policy

Design rules:

disk I/O belongs here or in its dedicated writer helper
values here are authoritative for saved state
writes may be debounced later, but the data model itself belongs here

`RuntimeCoordinator`

RuntimeCoordinator is the mutation and policy layer.

It is responsible for:

receiving valid mutation requests from controls, services, or automation
validating requested changes against shader definitions and config rules
resolving how persisted state, committed live state, and transient overlays should interact
requesting snapshot publication when state changes affect render
requesting persistence when stored state changes

It should not be responsible for:

direct disk serialization details
direct GL work
hardware device lifecycle
polling loops

Design rules:

all non-render mutations should eventually flow through this layer
this layer decides whether a change is persisted, transient, or both
this layer owns state policy, not device policy

`RuntimeSnapshotProvider`

RuntimeSnapshotProvider publishes render-facing snapshots.

It is responsible for:

building immutable or near-immutable render snapshots
translating runtime state into render-ready structures
publishing versioned snapshots
serving the render side without large mutable shared locks

It should not be responsible for:

deciding whether a mutation is allowed
directly applying UI/OSC requests
persistence
shader compilation orchestration

Design rules:

render consumes snapshots, not live mutable store objects
snapshots should be cheap to read and explicit about version changes
dynamic frame-only values may still be attached later, but the snapshot shape should stay stable
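
The snapshot rules above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `LayerSnapshot`, `RenderSnapshot`, `SnapshotProvider`, and their fields are hypothetical names, and a short mutex around a pointer swap is assumed to be acceptable for the render side.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical render-facing description of one layer.
struct LayerSnapshot {
    std::string shaderId;
    std::vector<float> parameters;
    bool bypassed = false;
};

// Treated as immutable once published: readers only ever see it as const.
struct RenderSnapshot {
    std::uint64_t version = 0;
    std::vector<LayerSnapshot> layers;
};

class SnapshotProvider {
public:
    // Wraps the layers in a fresh snapshot and bumps the version.
    std::uint64_t Publish(std::vector<LayerSnapshot> layers) {
        auto next = std::make_shared<RenderSnapshot>();
        next->layers = std::move(layers);
        std::lock_guard<std::mutex> lock(mutex_);
        next->version = latest_->version + 1;
        latest_ = std::move(next);
        return latest_->version;
    }

    // The lock covers only a pointer copy; the caller then reads lock-free.
    std::shared_ptr<const RenderSnapshot> GetLatest() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return latest_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const RenderSnapshot> latest_ =
        std::make_shared<RenderSnapshot>();
};
```

A render loop built on this sketch would call `GetLatest()` once per frame and compare `version` against the last one it consumed. A snapshot it already holds stays valid and unchanged even if a newer one is published mid-frame.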

`ControlServices`

ControlServices is the ingress boundary for non-render control sources.

It is responsible for:

OSC receive and route resolution
REST/websocket/control UI ingress
file-watch or reload request ingress
translating external inputs into typed internal actions/events
low-cost buffering/coalescing where appropriate

It should not be responsible for:

persistence decisions
render snapshot building
hardware playout policy
direct long-lived state ownership beyond ingress-specific queues

Design rules:

external inputs enter here and are normalized before they touch core state
service-specific timing concerns stay here unless they affect whole-app policy
no service should directly mutate render-facing state structures
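
As one possible shape for the "low-cost buffering/coalescing" responsibility, the sketch below keeps only the newest value per target between drains. `ParameterAction` and `CoalescingInbox` are hypothetical names, and the code is single-threaded for brevity; a real ingress path would need a lock or a per-thread queue.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical typed action produced by an ingress service.
struct ParameterAction {
    std::string target;  // e.g. an already-resolved OSC route
    float value = 0.0f;
};

// Keeps only the newest value per target between drains.
class CoalescingInbox {
public:
    void Enqueue(ParameterAction action) {
        auto found = indexByTarget_.find(action.target);
        if (found != indexByTarget_.end()) {
            // A value for this target is already pending: overwrite it.
            pending_[found->second].value = action.value;
            return;
        }
        indexByTarget_.emplace(action.target, pending_.size());
        pending_.push_back(std::move(action));
    }

    // Hands the coalesced batch to the caller and resets the inbox.
    std::vector<ParameterAction> Drain() {
        std::vector<ParameterAction> batch = std::move(pending_);
        pending_.clear();
        indexByTarget_.clear();
        return batch;
    }

private:
    std::vector<ParameterAction> pending_;
    std::unordered_map<std::string, std::size_t> indexByTarget_;
};
```

The drained batch is what would be handed on toward the coordinator. Nothing in the inbox touches render-facing structures.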

`RenderEngine`

RenderEngine is the owner of live rendering behavior.

It is responsible for:

sole ownership of GL work in the target architecture
shader program lifecycle once compilation outputs are available
texture upload scheduling
render-pass execution
temporal history and shader feedback resources
transient render-only overlays
preview production as a subordinate output
output-frame production for the video backend

It should not be responsible for:

persistence
user-facing control normalization
hardware discovery/configuration
high-level runtime mutation policy

Design rules:

render consumes snapshots plus render-local transient state
render-local state is allowed if it stays render-local
preview must be treated as best-effort relative to playout

`VideoBackend`

VideoBackend owns input/output device lifecycle and playout policy.

It is responsible for:

input device configuration and callbacks
output device configuration and callbacks
frame scheduling policy
buffer-pool ownership
playout headroom policy
input signal status
backend state transitions and recovery logic

It should not be responsible for:

composing frames
owning GL contexts long-term
validating shader parameter changes
persistence

Design rules:

this subsystem is the consumer of rendered output frames, not the owner of frame composition policy
it should evolve toward producer/consumer playout rather than callback-driven rendering
backend state should be explicit and reportable

`HealthTelemetry`

HealthTelemetry owns structured operational visibility.

It is responsible for:

logging
warning/error counters
timing traces
subsystem health state
degraded-mode reporting
operator-visible health summaries

It should not be responsible for:

deciding core app behavior
owning render or backend state
persistence policy

Design rules:

all major subsystems publish health information here
health visibility should outlive UI connection state
modal dialogs should not be the main operational surface

Target Dependency Rules

The architecture should follow these rules as closely as possible.

Allowed dependency directions:

ControlServices -> RuntimeCoordinator
RuntimeCoordinator -> RuntimeStore
RuntimeCoordinator -> RuntimeSnapshotProvider
RuntimeCoordinator -> HealthTelemetry
RuntimeSnapshotProvider -> RuntimeStore
RenderEngine -> RuntimeSnapshotProvider
RenderEngine -> HealthTelemetry
VideoBackend -> RenderEngine
VideoBackend -> HealthTelemetry

Conditionally allowed during migration:

ControlServices -> HealthTelemetry
ControlServices -> RuntimeStore only through temporary compatibility shims

Not allowed in the target design:

RenderEngine -> RuntimeStore
RenderEngine -> ControlServices
VideoBackend -> RuntimeStore
ControlServices -> RenderEngine for direct mutation
RuntimeStore -> RenderEngine
HealthTelemetry -> any subsystem for control flow
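
The dependency lists above are small enough to encode as data and check at compile time. The following is a sketch under that assumption; the `Subsystem` enum, `kAllowedEdges`, and `IsAllowed` are illustrative names, and the migration-only edges are deliberately left out of the table.

```cpp
#include <array>
#include <cstddef>

enum class Subsystem {
    RuntimeStore, RuntimeCoordinator, RuntimeSnapshotProvider,
    ControlServices, RenderEngine, VideoBackend, HealthTelemetry
};

struct Edge { Subsystem from; Subsystem to; };

// The nine "allowed" directions from the list above.
constexpr std::array<Edge, 9> kAllowedEdges = {{
    {Subsystem::ControlServices, Subsystem::RuntimeCoordinator},
    {Subsystem::RuntimeCoordinator, Subsystem::RuntimeStore},
    {Subsystem::RuntimeCoordinator, Subsystem::RuntimeSnapshotProvider},
    {Subsystem::RuntimeCoordinator, Subsystem::HealthTelemetry},
    {Subsystem::RuntimeSnapshotProvider, Subsystem::RuntimeStore},
    {Subsystem::RenderEngine, Subsystem::RuntimeSnapshotProvider},
    {Subsystem::RenderEngine, Subsystem::HealthTelemetry},
    {Subsystem::VideoBackend, Subsystem::RenderEngine},
    {Subsystem::VideoBackend, Subsystem::HealthTelemetry},
}};

// True only when "from -> to" appears in the allowed table.
constexpr bool IsAllowed(Subsystem from, Subsystem to) {
    for (std::size_t i = 0; i < kAllowedEdges.size(); ++i) {
        if (kAllowedEdges[i].from == from && kAllowedEdges[i].to == to) {
            return true;
        }
    }
    return false;
}

static_assert(IsAllowed(Subsystem::RenderEngine, Subsystem::RuntimeSnapshotProvider),
              "render reads snapshots");
static_assert(!IsAllowed(Subsystem::RenderEngine, Subsystem::RuntimeStore),
              "render must not reach into the store");
```

A table like this could back a unit test or a review checklist. It does not enforce include-level dependencies by itself.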

The key principle is:

store owns durable data
coordinator owns mutation policy
snapshot provider owns render-facing state publication
render owns live GPU execution
backend owns device timing
telemetry observes all of them

State Ownership Model

The app has several different kinds of state, and Phase 1 should name them explicitly.

Persisted State

Owned by RuntimeStore.

Examples:

layer stack structure
selected shader ids
saved parameter values
runtime host config
stack presets

Committed Live State

Owned logically by RuntimeCoordinator. Where it is physically stored, in RuntimeStore or in a live-state companion, depends on the later implementation.

Examples:

current operator-selected parameter values
current bypass state
current selected shader for each layer

This is state that should normally survive until explicitly changed and can be persisted if policy says so.

Transient Live Overlay State

Owned by the subsystem that consumes it, not by the persisted store.

Examples:

active OSC overlay targets while automation is flowing
shader feedback buffers
temporal history textures
queued input frames
in-flight preview state
playout queue state

This is where many current issues come from. The design rule is:

transient state may influence output
transient state should not masquerade as persisted truth
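
One way to picture that rule is a value lookup where an overlay can win for output purposes without ever overwriting the committed value. The sketch below assumes scalar float parameters keyed by name; `LayeredParameters` and its methods are hypothetical.

```cpp
#include <map>
#include <optional>
#include <string>

class LayeredParameters {
public:
    void SetCommitted(const std::string& key, float value) { committed_[key] = value; }
    void SetOverlay(const std::string& key, float value) { overlay_[key] = value; }
    void ClearOverlay(const std::string& key) { overlay_.erase(key); }

    // What output should use: the overlay wins while it is active.
    std::optional<float> Effective(const std::string& key) const {
        if (auto it = overlay_.find(key); it != overlay_.end()) {
            return it->second;
        }
        return Committed(key);
    }

    // What persistence should use: overlays are never consulted.
    std::optional<float> Committed(const std::string& key) const {
        auto it = committed_.find(key);
        if (it == committed_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::map<std::string, float> committed_;
    std::map<std::string, float> overlay_;
};
```

Clearing the overlay falls back to the committed value, so automation that stops flowing leaves no trace in saved state.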

Health and Timing State

Owned by HealthTelemetry.

Examples:

frame pacing stats
render timing
late/dropped frame counters
queue depths
warning states

Target Runtime Flow

This section describes the intended long-term flow once later phases are in place.

Control Mutation Flow

OSC/UI/file-watch input enters ControlServices.
ControlServices normalizes it into an internal action or event.
RuntimeCoordinator validates and classifies the action.
If the action changes durable state, RuntimeStore is updated.
If the action changes render-facing state, RuntimeSnapshotProvider publishes a new snapshot.
If the action requires persistence, a persistence request is queued.
Health/timing observations are emitted separately.
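
The validate, classify, update, publish, and persist steps of this flow can be sketched as a single function that reports what it did. Everything here is an assumption made for illustration: `ControlAction`, `MutationOutcome`, and `CoordinatorSketch` are hypothetical, the 0..1 range check stands in for real shader-definition validation, and the store, snapshot version, and write queue are reduced to plain members.

```cpp
#include <cstdint>
#include <map>
#include <string>

struct ControlAction {
    std::string parameter;
    float value = 0.0f;
    bool durable = true;  // false for automation-style transient changes
};

struct MutationOutcome {
    bool accepted = false;
    bool storeUpdated = false;
    bool snapshotPublished = false;
    bool persistenceQueued = false;
};

class CoordinatorSketch {
public:
    MutationOutcome Apply(const ControlAction& action) {
        MutationOutcome outcome;
        // Validate: placeholder rule, which also rejects NaN.
        const bool inRange = action.value >= 0.0f && action.value <= 1.0f;
        if (action.parameter.empty() || !inRange) {
            return outcome;
        }
        outcome.accepted = true;

        // Classify: only durable changes touch stored state and persistence.
        if (action.durable) {
            stored_[action.parameter] = action.value;
            outcome.storeUpdated = true;
            ++queuedWrites_;
            outcome.persistenceQueued = true;
        }

        // Any accepted change here is render-facing, so publish.
        live_[action.parameter] = action.value;
        ++snapshotVersion_;
        outcome.snapshotPublished = true;
        return outcome;
    }

    std::uint64_t SnapshotVersion() const { return snapshotVersion_; }
    std::uint64_t QueuedWrites() const { return queuedWrites_; }

private:
    std::map<std::string, float> stored_;  // stands in for RuntimeStore
    std::map<std::string, float> live_;    // stands in for snapshot input
    std::uint64_t snapshotVersion_ = 0;
    std::uint64_t queuedWrites_ = 0;
};
```

The point of returning an outcome struct is that tests can assert on the policy decision without any GL, disk, or hardware in the loop.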

Render Flow

RenderEngine consumes the latest published snapshot.
RenderEngine combines that snapshot with render-local transient state.
RenderEngine performs uploads, pass execution, feedback/history maintenance, and output production.
RenderEngine produces:
- preview-ready output
- video-backend-ready output frames
- render timing and warning signals

Video Output Flow

Target long-term flow:

RenderEngine produces completed output frames ahead of demand.
VideoBackend consumes those frames from a bounded queue or ring buffer.
Device callbacks only drive dequeue/schedule/accounting behavior.
HealthTelemetry records queue depth, lateness, underruns, and recovery events.
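
A bounded queue between render and the device callback might look like the sketch below. It is a simplified, mutex-based illustration rather than a tuned ring buffer; `OutputFrame` and `BoundedFrameQueue` are hypothetical names, and the underrun counter stands in for what would be reported to HealthTelemetry.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>

// Hypothetical stand-in for a completed output frame.
struct OutputFrame {
    std::uint64_t frameIndex = 0;
};

class BoundedFrameQueue {
public:
    explicit BoundedFrameQueue(std::size_t capacity) : capacity_(capacity) {}

    // Render side: returns false when full so the producer can wait or drop.
    bool TryPush(OutputFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.size() >= capacity_) {
            return false;
        }
        frames_.push_back(frame);
        return true;
    }

    // Device-callback side: never renders, only dequeues and accounts.
    std::optional<OutputFrame> TryPop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.empty()) {
            ++underruns_;
            return std::nullopt;
        }
        OutputFrame frame = frames_.front();
        frames_.pop_front();
        return frame;
    }

    std::size_t Depth() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return frames_.size();
    }

    std::uint64_t Underruns() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return underruns_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<OutputFrame> frames_;
    std::uint64_t underruns_ = 0;
};
```

With this shape, an empty queue in the callback is an accounting event rather than a trigger to render on the callback thread. What the backend schedules on underrun (for example repeating the previous frame) is a policy decision this sketch leaves open.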

Reload / Shader Rebuild Flow

file-watch or manual reload enters through ControlServices
RuntimeCoordinator classifies the reload request
RuntimeStore and shader/package metadata are refreshed if needed
RuntimeSnapshotProvider republishes affected snapshot state
RenderEngine rebuilds render-local resources from the new snapshot/build outputs

The important boundary here is that reload is not "a render concern that also touches persistence." It is a coordinated runtime concern with a render-local execution phase.

Suggested Public Interfaces

These are not final class signatures, but they show the shape the architecture should move toward.

`RuntimeStore`

Core responsibilities:

LoadConfig()
LoadPersistentState()
SavePersistentStateSnapshot(...)
GetStoredLayerStack()
SetStoredLayerStack(...)
GetStackPresetNames()
SaveStackPreset(...)
LoadStackPreset(...)

`RuntimeCoordinator`

Core responsibilities:

ApplyControlMutation(...)
ApplyAutomationTarget(...)
ResetLayer(...)
RequestReload(...)
CommitOverlayState(...)
PublishSnapshotIfNeeded()
RequestPersistenceIfNeeded()

`RuntimeSnapshotProvider`

Core responsibilities:

BuildSnapshot(...)
GetLatestSnapshot()
GetSnapshotVersion()
PublishSnapshot(...)

`ControlServices`

Core responsibilities:

StartOscIngress(...)
StartWebControlIngress(...)
StartFileWatchIngress(...)
EnqueueControlAction(...)
DrainServiceEvents(...)

`RenderEngine`

Core responsibilities:

StartRenderLoop(...)
ConsumeSnapshot(...)
EnqueueInputFrame(...)
ProduceOutputFrame(...)
ResetRenderLocalState(...)
HandleRebuildOutputs(...)

`VideoBackend`

Core responsibilities:

ConfigureInput(...)
ConfigureOutput(...)
StartPlayout(...)
StopPlayout(...)
ConsumeRenderedFrame(...)
ReportBackendState(...)

`HealthTelemetry`

Core responsibilities:

RecordTimingSample(...)
RecordCounterDelta(...)
RaiseWarning(...)
ClearWarning(...)
AppendLogEntry(...)
BuildHealthSnapshot()
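
To make the shape of that interface concrete, here is a small sketch covering four of the listed methods. The parameter types, the `HealthSnapshot` struct, and the `HealthTelemetrySketch` class name are assumptions, since the list above leaves the signatures open.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// Hypothetical read-only view handed to UI or logging consumers.
struct HealthSnapshot {
    std::map<std::string, std::int64_t> counters;
    std::map<std::string, std::string> activeWarnings;
};

class HealthTelemetrySketch {
public:
    void RecordCounterDelta(const std::string& name, std::int64_t delta) {
        std::lock_guard<std::mutex> lock(mutex_);
        state_.counters[name] += delta;
    }

    void RaiseWarning(const std::string& key, const std::string& message) {
        std::lock_guard<std::mutex> lock(mutex_);
        state_.activeWarnings[key] = message;
    }

    void ClearWarning(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        state_.activeWarnings.erase(key);
    }

    // Returns a copy, so readers never hold the telemetry lock afterwards.
    HealthSnapshot BuildHealthSnapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return state_;
    }

private:
    mutable std::mutex mutex_;
    HealthSnapshot state_;
};
```

Note that nothing here calls back into other subsystems. Publishers push observations in, and consumers pull a copy out, which keeps telemetry from becoming a control path.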

Mapping From Current Code to Target Subsystems

This is not a one-to-one rename plan. It is a responsibility migration map.

Current `RuntimeHost`

Should eventually split across:

RuntimeStore
RuntimeCoordinator
RuntimeSnapshotProvider
parts of HealthTelemetry

Likely examples:

config loading/saving -> RuntimeStore
layer stack mutation validation -> RuntimeCoordinator
render state building/versioning -> RuntimeSnapshotProvider
timing/status setters -> HealthTelemetry

Current `RuntimeServices`

Should eventually become mostly:

ControlServices
a small service-hosting shell

Likely examples:

OSC ingress/coalescing -> ControlServices
file-watch ingress -> ControlServices
deferred service coordination now done by polling -> split between ControlServices and event-driven coordinator calls

Current `OpenGLComposite`

Should eventually split across:

application bootstrap shell
RenderEngine
orchestration glue that wires subsystems together

Likely examples:

render-pass facing code -> RenderEngine
app/service/backend bootstrap -> composition root
runtime mutation API surface -> coordinator-facing adapter, not render owner

Current `OpenGLVideoIOBridge` and `DeckLinkSession`

Should eventually align more clearly under:

VideoBackend
RenderEngine

Likely examples:

device callback and scheduling policy -> VideoBackend
GL upload/readback/render work -> RenderEngine

Architectural Guardrails

As later phases begin, these rules should be treated as guardrails.

1. No new cross-cutting state should be added to `RuntimeHost`

If a new feature needs durable state, place it conceptually under RuntimeStore. If it needs render-local transient state, place it conceptually under RenderEngine. If it needs timing/status counters, place it conceptually under HealthTelemetry.

2. Render-local state should stay render-local

Do not push shader feedback, temporal history, preview caches, or playout queues back into the store just to make them easy to reach from other systems.

3. Device callbacks should not become a dumping ground for app work

Callback threads should converge toward signaling and queue management, not core rendering, persistence, or control mutation.

4. Persistence should not be used as a control synchronization mechanism

Saving state is not how subsystems discover changes. Published snapshots and explicit events should handle that.

5. Health reporting should observe, not coordinate

Telemetry systems may record warnings and degraded states, but they should not become the hidden control plane for the app.

Migration Strategy

Phase 1 is a design phase, but it should support incremental migration.

Suggested Deliverables for Completing Phase 1

Phase 1 can reasonably be considered complete once the project has:

this subsystem-boundary design document
agreed subsystem names and responsibilities
agreed allowed dependency directions
explicit state categories: persisted, committed live, transient overlay, health/timing
a current-to-target responsibility map for RuntimeHost, RuntimeServices, OpenGLComposite, and backend/render bridge code
a decision that later phases will build against this target rather than inventing new boundaries ad hoc

Open Questions For Later Phases

These do not block Phase 1, but they should remain visible.

Should shader package registry ownership live entirely in RuntimeStore, or should compile-ready derived registry data move into the snapshot provider?
Should committed live state be stored directly in RuntimeStore, or split into store plus live-session state owned by the coordinator?
How much of shader build orchestration belongs to RenderEngine versus a separate build service?
At what phase should preview become fully decoupled from playout cadence?
Should persistence become its own PersistenceWriter subsystem in Phase 6, or remain an implementation detail under RuntimeStore?

Short Version

Phase 1 should establish one simple set of ownership rules for the rest of the refactor:

durable state lives in the store
mutation policy lives in the coordinator
render-facing state is published as snapshots
external control sources enter through services
GL work belongs to render
hardware pacing belongs to the backend
health visibility belongs to telemetry

If later phases keep to those rules, the architecture will become materially more resilient without needing another round of foundational boundary changes.
