video-shader-toys/docs/subsystems/RuntimeSnapshotProvider.md

# RuntimeSnapshotProvider Subsystem Design

This document expands the `RuntimeSnapshotProvider` subsystem from [PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/docs/PHASE_1_SUBSYSTEM_BOUNDARIES_DESIGN.md) into a concrete subsystem design.

The goal of `RuntimeSnapshotProvider` is to separate render-facing state publication from both runtime mutation policy and durable storage. In the target architecture, render should consume published snapshots rather than reaching into `RuntimeStore` or lock-protected live objects directly.

## Purpose

`RuntimeSnapshotProvider` is the boundary between runtime-owned state and render-consumable state.

It exists to solve three current problems:

- render state is still built directly out of `RuntimeHost` under a shared mutex
- render reads and refreshes partially mutable cached layer state in more than one place
- state publication, state versioning, and dynamic frame-field refresh are not yet explicit subsystems

Today the closest equivalent behavior lives in:

- [RuntimeHost::GetLayerRenderStates(...)](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/RuntimeHost.cpp:1535)
- [RuntimeHost::TryGetLayerRenderStates(...)](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/RuntimeHost.cpp:1543)
- [RuntimeHost::TryRefreshCachedLayerStates(...)](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/RuntimeHost.cpp:1554)
- [RuntimeHost::RefreshDynamicRenderStateFields(...)](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/RuntimeHost.cpp:1582)
- [RuntimeHost::BuildLayerRenderStatesLocked(...)](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/runtime/RuntimeHost.cpp:1598)
- the render-side cache usage in [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/OpenGLComposite.cpp:589)

`RuntimeSnapshotProvider` should absorb that responsibility, but in a cleaner and more publish-oriented way.

## Responsibilities

`RuntimeSnapshotProvider` is responsible for:

- building render-facing snapshots from durable store state plus whatever committed-live state view the Phase 3 split ultimately exposes
- publishing stable, versioned snapshots that can be consumed without large shared mutable locks
- separating structural snapshot changes from dynamic frame fields
- translating runtime layer state into render-ready layer descriptors
- attaching immutable or near-immutable shader/package-derived data needed by render
- giving `RenderEngine` a cheap read path for the latest committed snapshot
- making snapshot invalidation and publication rules explicit

It is not responsible for:

- deciding whether a mutation is valid
- classifying a change as transient versus durable
- directly accepting OSC/UI/file-watch requests
- disk persistence
- GL resource allocation
- shader compilation execution
- render-local transient overlays such as live OSC overlay state, temporal history textures, or feedback textures

## Design Principles

### Render consumes published state, not store internals

The render side should never need to walk `RuntimeStore` structures directly or perform per-frame reconstruction under the store lock.

### Structural data and dynamic frame fields are different classes of data

The layer stack, shader ids, parameter definitions, texture assets, font assets, feedback declarations, and temporal requirements change relatively infrequently. Frame count, wall time, UTC time, and similar values change every frame.

`RuntimeSnapshotProvider` should publish structural snapshots and provide a separate mechanism for frame-local dynamic enrichment, rather than rebuilding everything for every frame.

### Snapshot reads should be cheap and explicit

The render side should be able to say:

- give me the latest published snapshot
- tell me whether the structural snapshot version changed
- apply dynamic frame fields for this frame

without having to infer cache validity from multiple host-owned counters and fallback lock behavior.

### Published shape should be stable

The shape of render-facing layer state should remain consistent across phases even if the underlying store or coordination model changes.

## Snapshot Inputs

`RuntimeSnapshotProvider` should build from a read-oriented runtime view, not from direct mutation calls.

That view will likely include:

- durable configuration and layer-stack data from `RuntimeStore`
- committed live values from either:
  - `RuntimeStore`, while committed live state is still co-located there, or
  - a coordinator-owned live-state companion once Phase 3 finishes the split
- package and manifest metadata required to describe render-facing layer structure

The important Phase 1 rule is not "the provider always reads one specific object." It is:

- the provider consumes read-oriented committed runtime state
- the provider does not own mutation policy
- render consumes the provider's published output instead of reaching back into whichever runtime object currently stores the truth

## Snapshot Model

The subsystem should publish a render snapshot object rather than loose vectors and ad hoc version getters.

Suggested top-level shape:

```cpp
struct RuntimeRenderSnapshot
{
	uint64_t snapshotVersion = 0;
	uint64_t structureVersion = 0;
	uint64_t parameterVersion = 0;
	uint64_t packageVersion = 0;
	uint64_t publicationSequence = 0;
	unsigned inputWidth = 0;
	unsigned inputHeight = 0;
	unsigned outputWidth = 0;
	unsigned outputHeight = 0;
	std::vector<RuntimeRenderLayerSnapshot> layers;
};
```

Suggested per-layer shape:

```cpp
struct RuntimeRenderLayerSnapshot
{
	std::string layerId;
	std::string shaderId;
	std::string shaderName;
	double mixAmount = 1.0;
	double bypass = 0.0;
	std::vector<ShaderParameterDefinition> parameterDefinitions;
	std::map<std::string, ShaderParameterValue> parameterValues;
	std::vector<ShaderTextureAsset> textureAssets;
	std::vector<ShaderFontAsset> fontAssets;
	bool isTemporal = false;
	TemporalHistorySource temporalHistorySource = TemporalHistorySource::None;
	unsigned requestedTemporalHistoryLength = 0;
	unsigned effectiveTemporalHistoryLength = 0;
	FeedbackSettings feedback;
};
```

This is intentionally close to today’s [RuntimeRenderState](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/shader/ShaderTypes.h:134), but split so dynamic fields are not embedded in the published structural snapshot.

Suggested per-frame dynamic supplement:

```cpp
struct RuntimeRenderFrameContext
{
	double timeSeconds = 0.0;
	double utcTimeSeconds = 0.0;
	double utcOffsetSeconds = 0.0;
	double startupRandom = 0.0;
	double frameCount = 0.0;
};
```

`RenderEngine` can combine `RuntimeRenderSnapshot` and `RuntimeRenderFrameContext` into its final frame-local render input without forcing snapshot republish every frame.
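
As a minimal sketch of that combination, the following pairs one shared, immutable snapshot with a per-frame context. `RenderFrameInput` and `MakeFrameInput` are hypothetical names, not existing code, and the structs are trimmed stand-ins that keep only a few of the fields suggested above.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Trimmed stand-ins for the suggested structs (only a few fields kept).
struct RuntimeRenderLayerSnapshot
{
	std::string layerId;
	std::string shaderId;
};

struct RuntimeRenderSnapshot
{
	uint64_t snapshotVersion = 0;
	std::vector<RuntimeRenderLayerSnapshot> layers;
};

struct RuntimeRenderFrameContext
{
	double timeSeconds = 0.0;
	double frameCount = 0.0;
};

// Hypothetical frame-local input: the snapshot is shared by pointer and the
// per-frame values are carried by value, so nothing has to be republished.
struct RenderFrameInput
{
	std::shared_ptr<const RuntimeRenderSnapshot> snapshot;
	RuntimeRenderFrameContext frame;
};

// Hypothetical helper: pairs one published snapshot with one frame's fields.
inline RenderFrameInput MakeFrameInput(
	std::shared_ptr<const RuntimeRenderSnapshot> snapshot,
	const RuntimeRenderFrameContext& frame)
{
	assert(snapshot != nullptr);
	return RenderFrameInput{ std::move(snapshot), frame };
}
```

Two consecutive frames built this way would point at the same snapshot object and differ only in their `frame` member.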

## Publication Rules

The provider should publish a new structural snapshot when any render-relevant structural or committed-live field changes, including:

- layer add/remove/reorder
- shader id change on a layer
- layer bypass change
- parameter value change that is part of committed live state
- shader package metadata refresh that changes parameter definitions, assets, temporal declarations, or feedback declarations
- input or output dimension changes that alter render-facing layer interpretation
- stack preset load that changes any render-facing state

The provider should not publish a new structural snapshot just because:

- time advanced by one frame
- frame count increased
- preview cadence changed
- render-local transient overlay state changed
- temporal history or feedback textures changed
- device playout queue state changed

That distinction matters because the current model effectively mixes structural publication with frame-local refresh and lock-driven fallback logic.
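
The two lists above can be sketched as a single classification function. `RuntimeChangeKind` and `RequiresSnapshotPublish` are hypothetical names used only for illustration. The sketch also assumes the caller has already decided that a dimension change or preset load is render-relevant, since the rules above make those cases conditional.

```cpp
#include <cassert>

// Hypothetical change categories mirroring the two lists above.
enum class RuntimeChangeKind
{
	LayerStackEdited,            // add, remove, or reorder
	LayerShaderChanged,
	LayerBypassChanged,
	CommittedParameterChanged,
	PackageMetadataRefreshed,
	DimensionsChanged,           // assumed already known to affect render
	StackPresetLoaded,           // assumed already known to affect render
	FrameTimeAdvanced,
	FrameCountIncreased,
	PreviewCadenceChanged,
	RenderOverlayChanged,
	HistoryOrFeedbackTextureChanged,
	PlayoutQueueChanged,
};

// Returns true when the change should produce a new structural snapshot.
inline bool RequiresSnapshotPublish(RuntimeChangeKind kind)
{
	switch (kind)
	{
	case RuntimeChangeKind::LayerStackEdited:
	case RuntimeChangeKind::LayerShaderChanged:
	case RuntimeChangeKind::LayerBypassChanged:
	case RuntimeChangeKind::CommittedParameterChanged:
	case RuntimeChangeKind::PackageMetadataRefreshed:
	case RuntimeChangeKind::DimensionsChanged:
	case RuntimeChangeKind::StackPresetLoaded:
		return true;
	case RuntimeChangeKind::FrameTimeAdvanced:
	case RuntimeChangeKind::FrameCountIncreased:
	case RuntimeChangeKind::PreviewCadenceChanged:
	case RuntimeChangeKind::RenderOverlayChanged:
	case RuntimeChangeKind::HistoryOrFeedbackTextureChanged:
	case RuntimeChangeKind::PlayoutQueueChanged:
		return false;
	}
	assert(false && "unhandled RuntimeChangeKind");
	return false;
}
```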

## Versioning Model

The provider should own explicit version domains rather than exposing only host-wide counters.

Recommended version domains:

- `structureVersion`
  - changes when the layer graph or shader/package-derived structure changes
- `parameterVersion`
  - changes when committed parameter or bypass values change
- `packageVersion`
  - changes when shader manifests or package-derived metadata relevant to render changes
- `snapshotVersion`
  - a composed version for consumers that only need a single fast invalidation key
- `publicationSequence`
  - monotonic sequence number for diagnostics and telemetry

Recommended rules:

- `snapshotVersion` changes whenever any render-visible aspect of the structural snapshot changes
- `structureVersion` should not change for pure parameter edits
- `parameterVersion` should not change for time-only updates
- dynamic frame context should not require any version change

This makes later cache policy much cleaner:

- shader rebuild decisions can key off structure/package changes
- parameter buffer refresh can key off parameter changes
- frame-local updates can ignore snapshot publication entirely
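
One possible way to express these rules in code is sketched below. The flag names, `SnapshotVersions`, and `BumpVersions` are hypothetical. The sketch also assumes `snapshotVersion` is a plain counter bumped once per effective change, which is only one way to realize a "composed" version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags naming which version domains a change touched.
enum SnapshotChangeFlags : uint32_t
{
	kChangeNone = 0,
	kChangeStructure = 1u << 0,
	kChangeParameters = 1u << 1,
	kChangePackage = 1u << 2,
};

// The version fields from the suggested snapshot shape, grouped together.
struct SnapshotVersions
{
	uint64_t snapshotVersion = 0;
	uint64_t structureVersion = 0;
	uint64_t parameterVersion = 0;
	uint64_t packageVersion = 0;
	uint64_t publicationSequence = 0;
};

// Returns the versions a newly published snapshot would carry.
// Assumption: snapshotVersion and publicationSequence advance once per
// effective change; frame-only updates pass kChangeNone and move nothing.
inline SnapshotVersions BumpVersions(SnapshotVersions v, uint32_t changes)
{
	const uint32_t known = kChangeStructure | kChangeParameters | kChangePackage;
	assert((changes & ~known) == 0);

	if (changes == kChangeNone)
		return v;
	if (changes & kChangeStructure)
		++v.structureVersion;
	if (changes & kChangeParameters)
		++v.parameterVersion;
	if (changes & kChangePackage)
		++v.packageVersion;
	++v.snapshotVersion;
	++v.publicationSequence;
	return v;
}
```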

## Snapshot Read Rules

The target read contract for `RenderEngine` should be:

1. acquire the latest published snapshot atomically or under a very small provider-owned read lock
2. compare relevant versions with the render-side cached state
3. if unchanged, reuse render-local compiled/cached resources
4. if changed, rebuild only the portions implied by the changed version domains
5. attach the current `RuntimeRenderFrameContext` for the frame being rendered

Important rule:

- `RenderEngine` should never partially mutate the provider’s published snapshot in place

That means today’s `TryRefreshCachedLayerStates(...)` behavior is a migration waypoint, not a target pattern. Once the provider exists, the render side should treat the snapshot as immutable input and keep any overlays or last-frame adjusted values inside `RenderEngine`.
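
The read contract above can be sketched as a small render-side helper. `RenderSnapshotCache`, `RebuildPlan`, and `ConsumeSnapshot` are hypothetical names, and the snapshot struct is a trimmed stand-in. The mapping from version domains to rebuild work follows the versioning section but remains an assumption about how `RenderEngine` would use it.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Trimmed stand-in: only the version domains matter for this sketch.
struct RuntimeRenderSnapshot
{
	uint64_t structureVersion = 0;
	uint64_t parameterVersion = 0;
	uint64_t packageVersion = 0;
};

// Hypothetical render-local cache: holds the last snapshot render consumed.
struct RenderSnapshotCache
{
	std::shared_ptr<const RuntimeRenderSnapshot> current;
};

// Hypothetical description of what render has to redo for this snapshot.
struct RebuildPlan
{
	bool rebuildShaders = false;
	bool refreshParameterBuffers = false;
};

// Compares the latest snapshot with the cached one and records it as current.
// The snapshot itself is never modified; only the render-local cache changes.
inline RebuildPlan ConsumeSnapshot(
	RenderSnapshotCache& cache,
	std::shared_ptr<const RuntimeRenderSnapshot> latest)
{
	assert(latest != nullptr);
	RebuildPlan plan;
	const RuntimeRenderSnapshot* previous = cache.current.get();
	const bool first = (previous == nullptr);

	plan.rebuildShaders = first
		|| latest->structureVersion != previous->structureVersion
		|| latest->packageVersion != previous->packageVersion;
	plan.refreshParameterBuffers = first
		|| latest->parameterVersion != previous->parameterVersion;

	cache.current = std::move(latest);
	return plan;
}
```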

## Render-Facing Data Shape Rules

The published snapshot should contain exactly the data render needs to interpret a layer, but not render-local execution artifacts.

Include:

- layer identity
- shader identity and display name
- parameter definitions
- committed parameter values
- bypass and mix values needed for layer evaluation
- texture and font asset declarations
- temporal settings
- feedback settings
- input/output dimensions when they affect shader configuration or resource interpretation

Do not include:

- GL object ids
- framebuffer handles
- compiled shader programs
- live texture bindings resolved to hardware units
- temporal history texture state
- feedback buffer contents
- queued OSC overlays
- queued input frames
- preview frame caches
- DeckLink buffer handles

This boundary is important because the current `RuntimeRenderState` is already close to render-ready data, but the subsystem contract should stop short of actual device or GL execution artifacts.

## Proposed Public Interface

Suggested interface shape:

```cpp
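#include <cstdint>
#include <memory>

class IRuntimeSnapshotProvider
{
public:
	virtual ~IRuntimeSnapshotProvider() = default;

	// Builds a candidate snapshot from a read-oriented runtime view; does not publish it.
	virtual RuntimeRenderSnapshot BuildSnapshot(
		const RuntimeStoreView& storeView,
		const SnapshotBuildOptions& options) const = 0;

	// Replaces the latest published snapshot; initiated by RuntimeCoordinator, not render.
	virtual void PublishSnapshot(RuntimeRenderSnapshot snapshot) = 0;
	// Cheap read path for render: a stable handle to the latest published snapshot.
	virtual std::shared_ptr<const RuntimeRenderSnapshot> GetLatestSnapshot() const = 0;
	// Single fast invalidation key for consumers that do not need per-domain versions.
	virtual uint64_t GetSnapshotVersion() const = 0;
	// Per-frame dynamic fields, supplied separately from the structural snapshot.
	virtual RuntimeRenderFrameContext BuildFrameContext() const = 0;
};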
```

Likely supporting methods:

- `BuildLayerSnapshot(...)`
- `BuildFrameContext(...)`
- `ComputeSnapshotVersion(...)`
- `DidStructureChange(...)`
- `DidParametersChange(...)`
- `PublishIfChanged(...)`

Notes:

- `GetLatestSnapshot()` should ideally return a shared immutable snapshot pointer or equivalent stable handle
- `BuildFrameContext()` may remain provider-owned or later move behind a clock/timing helper if that subsystem becomes more explicit
- publication should be initiated by `RuntimeCoordinator`, not by render
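
A minimal sketch of the "shared immutable snapshot pointer" idea follows, assuming a small provider-owned mutex that guards only the pointer swap. `PublishedSnapshotSlot` is a hypothetical helper, not the full `IRuntimeSnapshotProvider`, and the snapshot struct is a trimmed stand-in.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>

// Trimmed stand-in for the published snapshot.
struct RuntimeRenderSnapshot
{
	uint64_t snapshotVersion = 0;
};

// Hypothetical holder for the latest published snapshot.
class PublishedSnapshotSlot
{
public:
	// Replace-on-publish: the new immutable object is built outside the lock,
	// and the lock is held only while the pointer is swapped.
	void Publish(RuntimeRenderSnapshot snapshot)
	{
		auto next = std::make_shared<const RuntimeRenderSnapshot>(std::move(snapshot));
		std::lock_guard<std::mutex> lock(mMutex);
		mLatest = std::move(next);
	}

	// Readers copy the pointer under the lock and then read without it.
	std::shared_ptr<const RuntimeRenderSnapshot> GetLatest() const
	{
		std::lock_guard<std::mutex> lock(mMutex);
		return mLatest;
	}

private:
	mutable std::mutex mMutex;
	std::shared_ptr<const RuntimeRenderSnapshot> mLatest;
};
```

A reader that still holds an older handle keeps a valid, unchanged snapshot after a newer one is published, which is the property render relies on when it treats snapshots as immutable input.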

## Relationship to Other Subsystems

### `RuntimeStore`

`RuntimeSnapshotProvider` depends on store-owned durable data and package metadata through a read-oriented interface or view.

If committed live state remains physically co-located with the store during early migration, the provider may read it through the same view. If committed live state moves behind a coordinator-owned live-session model later, the provider should consume that through a similarly read-oriented view.

It should not mutate the store directly.

### `RuntimeCoordinator`

`RuntimeCoordinator` decides when a mutation requires snapshot republish.

The provider should not reclassify policy. It should only:

- build
- compare
- publish

based on the change request it is asked to materialize.
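
The build, compare, publish sequence could look roughly like the sketch below. `MinimalSnapshotPublisher` is hypothetical, the content comparison is reduced to a single stand-in field, and locking is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Trimmed stand-in: shaderIds represents the full render-visible content.
struct RuntimeRenderSnapshot
{
	uint64_t snapshotVersion = 0;
	std::vector<std::string> shaderIds;
};

// Hypothetical single-threaded publisher illustrating compare-then-publish.
class MinimalSnapshotPublisher
{
public:
	// Returns true when a new snapshot was published and false when the
	// candidate matched the current one and publication was suppressed.
	bool PublishIfChanged(RuntimeRenderSnapshot candidate)
	{
		if (mLatest && mLatest->shaderIds == candidate.shaderIds)
			return false;

		candidate.snapshotVersion = mLatest ? mLatest->snapshotVersion + 1 : 1;
		mLatest = std::make_shared<const RuntimeRenderSnapshot>(std::move(candidate));
		return true;
	}

	std::shared_ptr<const RuntimeRenderSnapshot> Latest() const
	{
		return mLatest;
	}

private:
	std::shared_ptr<const RuntimeRenderSnapshot> mLatest;
};
```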

### `RenderEngine`

`RenderEngine` is the main consumer.

It should:

- read the latest published snapshot
- treat that snapshot as immutable
- derive render-local artifacts from it
- keep frame-local overlays and history outside the provider

### `HealthTelemetry`

The provider should emit:

- snapshot publication counts
- snapshot build duration
- version bump reason categories
- publication suppression counts when no effective change occurred
- warning states if snapshot build repeatedly fails

This is especially important while migrating away from the current lock/fallback model.
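
As an illustration only, the counters and build timing listed above might be gathered in a plain struct before being handed to `HealthTelemetry`. `SnapshotTelemetry`, `TimedBuild`, and `RecordPublishOutcome` are hypothetical names; how they would be reported is not specified here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical counters a provider could accumulate between reports.
struct SnapshotTelemetry
{
	uint64_t publishedCount = 0;
	uint64_t suppressedCount = 0;
	std::chrono::nanoseconds lastBuildDuration{ 0 };
};

// Runs a build callable and records how long it took on a monotonic clock.
template <typename BuildFn>
auto TimedBuild(SnapshotTelemetry& telemetry, BuildFn build) -> decltype(build())
{
	const auto start = std::chrono::steady_clock::now();
	auto result = build();
	telemetry.lastBuildDuration = std::chrono::duration_cast<std::chrono::nanoseconds>(
		std::chrono::steady_clock::now() - start);
	return result;
}

// Counts a publication, or a suppression when no effective change occurred.
inline void RecordPublishOutcome(SnapshotTelemetry& telemetry, bool published)
{
	if (published)
		++telemetry.publishedCount;
	else
		++telemetry.suppressedCount;
}
```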

## Current Code Mapping

The current code suggests the following migration map.

### Move into `RuntimeSnapshotProvider`

From `RuntimeHost`:

- layer render-state construction from `BuildLayerRenderStatesLocked(...)`
- render-facing translation of layer persistent state plus package metadata
- explicit version composition for render-visible state
- dynamic frame-context construction currently done in `RefreshDynamicRenderStateFields(...)`

### Stop exposing directly from the host/store boundary

Current methods that should become compatibility shims and later disappear:

- `GetLayerRenderStates(...)`
- `TryGetLayerRenderStates(...)`
- `TryRefreshCachedLayerStates(...)`
- `RefreshDynamicRenderStateFields(...)`

### Render-side compatibility during migration

The current `OpenGLComposite` cache path:

- reads versions from `RuntimeHost`
- conditionally calls `TryRefreshCachedLayerStates(...)`
- conditionally rebuilds full layer state
- then reapplies render-local OSC overlay state

During migration, that should become:

1. get latest published snapshot from provider
2. compare snapshot versions against render-local cache
3. rebuild only if needed
4. apply render-local overlay state
5. attach frame context

That is a much cleaner split than the current mixed lock/cache/fallback flow in [OpenGLComposite.cpp](/c:/Users/Aiden/Documents/GitHub/video-shader-toys/apps/LoopThroughWithOpenGLCompositing/gl/OpenGLComposite.cpp:589).
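
Step 4 of that flow, applying render-local overlay state without touching the snapshot, can be sketched as follows. `ParameterValues` uses `double` as a stand-in for `ShaderParameterValue`, `ResolveLayerValues` is a hypothetical helper, and the rule that overlays only override parameters already present in the snapshot is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the snapshot's parameter map; double replaces ShaderParameterValue.
using ParameterValues = std::map<std::string, double>;

// Produces the values render uses for this frame. The committed map comes
// from the published snapshot and is only read; the result is render-local.
inline ParameterValues ResolveLayerValues(
	const ParameterValues& committed,
	const ParameterValues& overlay)
{
	ParameterValues resolved = committed;
	for (const auto& entry : overlay)
	{
		// Assumption: overlays may override known parameters but not add new ones.
		auto it = resolved.find(entry.first);
		if (it != resolved.end())
			it->second = entry.second;
	}
	return resolved;
}
```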

## Migration Plan

### Step 1: Introduce provider types without changing behavior

- define `RuntimeRenderSnapshot`, `RuntimeRenderLayerSnapshot`, and `RuntimeRenderFrameContext`
- implement provider methods as thin wrappers over current `RuntimeHost` logic
- keep `RuntimeHost` as the backing source temporarily
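
A thin wrapper for this step might simply split host-built layer state into the new types. In the sketch below, `LegacyLayerState` is a hypothetical stand-in for today's per-layer render state (its field names are not taken from the real `RuntimeRenderState`), and `SplitLegacyStates` assumes every layer carries the same dynamic values for a given frame.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for host-built layer state with dynamic fields embedded.
struct LegacyLayerState
{
	std::string layerId;
	std::string shaderId;
	double timeSeconds = 0.0;
	double frameCount = 0.0;
};

// Trimmed versions of the new provider types.
struct RuntimeRenderLayerSnapshot
{
	std::string layerId;
	std::string shaderId;
};

struct RuntimeRenderSnapshot
{
	std::vector<RuntimeRenderLayerSnapshot> layers;
};

struct RuntimeRenderFrameContext
{
	double timeSeconds = 0.0;
	double frameCount = 0.0;
};

struct SplitResult
{
	RuntimeRenderSnapshot snapshot;
	RuntimeRenderFrameContext frame;
};

// Copies structural fields into the snapshot and lifts the dynamic fields
// into a single frame context (taken from the first layer, by assumption).
inline SplitResult SplitLegacyStates(const std::vector<LegacyLayerState>& legacy)
{
	SplitResult result;
	result.snapshot.layers.reserve(legacy.size());
	for (const LegacyLayerState& state : legacy)
		result.snapshot.layers.push_back({ state.layerId, state.shaderId });

	if (!legacy.empty())
	{
		result.frame.timeSeconds = legacy.front().timeSeconds;
		result.frame.frameCount = legacy.front().frameCount;
	}
	return result;
}
```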

### Step 2: Route render reads through the provider

- replace direct `RuntimeHost` layer-state reads with provider snapshot reads
- preserve current version behavior first, even if internally bridged to existing counters

### Step 3: Separate structural publication from frame context

- stop rebuilding structural layer state just to refresh time and frame values
- let render request frame context separately each frame

### Step 4: Remove mutable snapshot refresh paths

- retire `TryRefreshCachedLayerStates(...)`
- publish new snapshots for committed parameter changes instead of mutating render-cached host-derived vectors in place

### Step 5: Move publication triggering fully behind `RuntimeCoordinator`

- no render-driven snapshot rebuilding
- coordinator requests publication after successful committed mutations and reloads

## Risks

### Risk: snapshot copies become expensive

Publishing whole snapshots on every parameter commit could be expensive if the layer stack grows.

Mitigation:

- use immutable shared snapshots with replace-on-publish semantics
- consider per-layer structural sharing later if real profiles justify it
- avoid republishing for frame-local time-only changes
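
If profiling later justifies per-layer structural sharing, one possible shape is a snapshot that holds shared handles to immutable layers, so a single-layer edit copies pointers rather than layer bodies. `LayerSnapshot`, `SharedSnapshot`, and `WithLayerReplaced` are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Trimmed stand-in for a published layer.
struct LayerSnapshot
{
	std::string layerId;
	double mixAmount = 1.0;
};

using LayerHandle = std::shared_ptr<const LayerSnapshot>;

// Hypothetical snapshot variant that shares immutable layers between versions.
struct SharedSnapshot
{
	std::vector<LayerHandle> layers;
};

// Builds the next snapshot with one layer replaced. Untouched layers are
// shared with the base snapshot; the base itself is never modified.
inline std::shared_ptr<const SharedSnapshot> WithLayerReplaced(
	const SharedSnapshot& base,
	std::size_t index,
	LayerSnapshot replacement)
{
	assert(index < base.layers.size());
	auto next = std::make_shared<SharedSnapshot>(base);
	next->layers[index] = std::make_shared<const LayerSnapshot>(std::move(replacement));
	return next;
}
```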

### Risk: unclear boundary between committed state and transient overlay state

If overlays are accidentally folded into the published snapshot, the provider will recreate the coupling that the subsystem split is supposed to remove.

Mitigation:

- keep overlays render-local or coordinator-owned transient state
- document that snapshots represent committed render-facing truth, not in-flight automation state

### Risk: version domains are under-specified

If version rules are not crisp, render may still over-rebuild or miss needed updates.

Mitigation:

- make version bump reasons explicit
- log version-domain changes during migration
- add tests around parameter-only, structure-only, and package-only changes
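
For the logging mitigation, a small helper that names the domains which differ between two snapshots may be enough during migration. `VersionDomains` and `DescribeVersionChanges` are hypothetical, and the output format is arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// The three independent version domains from the versioning model.
struct VersionDomains
{
	uint64_t structureVersion = 0;
	uint64_t parameterVersion = 0;
	uint64_t packageVersion = 0;
};

// Returns a comma-separated list of the domains that changed, or "none".
inline std::string DescribeVersionChanges(
	const VersionDomains& before,
	const VersionDomains& after)
{
	std::string reasons;
	auto add = [&reasons](const char* name)
	{
		if (!reasons.empty())
			reasons += ',';
		reasons += name;
	};

	if (before.structureVersion != after.structureVersion)
		add("structure");
	if (before.parameterVersion != after.parameterVersion)
		add("parameter");
	if (before.packageVersion != after.packageVersion)
		add("package");

	return reasons.empty() ? std::string("none") : reasons;
}
```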

### Risk: snapshot publication is treated as a background convenience rather than a core contract

If code keeps reaching around the provider into the store, the architecture will remain half-split.

Mitigation:

- treat provider publication as the only supported render-facing state publication path
- convert direct host/store render-state methods into adapters, then remove them

## Testing Strategy

The provider should be testable without GL or hardware.

Recommended tests:

- snapshot build from a sample layer stack
- parameter-only mutation increments `parameterVersion` but not `structureVersion`
- layer reorder increments `structureVersion`
- shader manifest change increments `packageVersion`
- frame context changes over time without forcing `snapshotVersion` changes
- repeated publish with no effective change suppresses unnecessary version bumps
- feedback and temporal declarations are preserved correctly in published layer snapshots

## Open Questions

- Should output dimensions live inside the top-level snapshot only, or also be copied into each layer snapshot for compatibility with current code paths?
- Should package-derived compile-ready pass source metadata eventually be published by this provider, or remain a separate build artifact pipeline?
- Is `BuildFrameContext()` part of the provider long-term, or should timing/clock publication become its own helper that lives alongside `HealthTelemetry`?
- Do parameter-only changes always require full snapshot republish, or should later phases add more granular per-layer publication handles?
- Should the provider own input signal dimensions directly, or should those come from a backend-published runtime environment view supplied during build?

## Completion Criteria For This Subsystem

`RuntimeSnapshotProvider` can be considered architecturally in place once:

- render no longer reads `RuntimeStore` or `RuntimeHost` render state directly
- render consumes published snapshot handles rather than rebuilding layer vectors from host state
- dynamic frame fields are supplied separately from structural snapshot publication
- snapshot version domains are explicit and observable
- transient overlays remain outside the published snapshot contract

## Short Version

`RuntimeSnapshotProvider` should become the single place that turns committed runtime state into render-consumable published snapshots.

Its contract is:

- build from read-oriented committed runtime state, whether it is store-owned or coordinator-owned
- publish immutable or near-immutable render snapshots
- version them explicitly
- keep frame-local timing separate
- give render a cheap, lock-light read path

If that boundary is held, later phases can split `RuntimeHost`, isolate render timing, and decouple playout without inventing a second render-state authority.