Doc cleanup
Aiden
2026-05-12 01:37:20 +10:00
parent 709d3d3fa4
commit 2531d871e8
17 changed files with 594 additions and 5461 deletions


@@ -0,0 +1,529 @@
# Current System Architecture
This document describes how the application currently works.
It replaces the phase-by-phase design trail as the best entry point for understanding the repo. The older phase documents remain useful history, but they mix implementation notes, experiments, and target designs. This document is organized by current runtime behavior and subsystem ownership instead.
## Application Shape
The app is a live OpenGL compositor with DeckLink input/output, runtime control services, persistent layer-stack state, live state overlays, health telemetry, and a small internal event model.
At runtime the major subsystems are:
- `OpenGLComposite`
- `RuntimeStore`
- `RuntimeCoordinator`
- `RuntimeSnapshotProvider`
- `RuntimeServices`
- `RuntimeUpdateController`
- `RenderEngine`
- `VideoBackend`
- `DeckLinkSession`
- `HealthTelemetry`
- `RuntimeEventDispatcher`
- `PersistenceWriter`
The key architectural rule is:
- runtime/control subsystems decide what state should exist
- render subsystems decide how to draw that state
- video subsystems decide how frames move to and from hardware
- telemetry observes behavior without becoming a control plane
## Process Startup
The Win32 app creates the window, chooses a pixel format, creates an OpenGL context, initializes COM, and constructs `OpenGLComposite`.
`OpenGLComposite` owns the high-level assembly of the runtime:
- runtime store
- runtime coordinator
- runtime services
- runtime update controller
- render engine
- video backend
Startup proceeds broadly as:
1. COM and OpenGL are initialized by the Win32 app.
2. `OpenGLComposite::InitDeckLink()` discovers/configures DeckLink and runtime state.
3. Runtime services are started.
4. Shader programs and GL resources are initialized.
5. The render thread is started.
6. The video backend starts output preroll and playback.
The normal VS Code debug launch currently sets:
```text
VST_DISABLE_INPUT_CAPTURE=1
```
That disables DeckLink input capture for output-timing isolation while keeping the output path active.
## Runtime State
### `RuntimeStore`
`RuntimeStore` owns durable runtime data and file-backed state.
It owns:
- runtime host configuration
- stored layer stack data
- persisted parameter values
- stack presets
- shader package catalog metadata
- runtime state presentation data
- persistence requests
It does not own render-thread resources, DeckLink timing, control ingress, or mutation policy.
### `CommittedLiveState`
`CommittedLiveState` owns current session/operator layer state that is live but not necessarily persisted as the durable base state.
It gives the renderer and snapshot builder a named read model for current committed layer state.
### `RuntimeCoordinator`
`RuntimeCoordinator` is the mutation policy boundary.
It validates and applies runtime mutations, classifies whether changes are persisted/committed/transient, emits persistence requests, and produces render reset/reload decisions.
It keeps mutation decisions out of:
- the render engine
- control services
- video backend
- telemetry
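As an illustration of a mutation policy boundary, the sketch below classifies a mutation and derives follow-up decisions in one place. Every name in it (`Mutation`, `MutationDecision`, `Decide`, the `MutationSource` values) is hypothetical, and the mapping from source to scope is an assumption made for the example, not the actual `RuntimeCoordinator` policy.

```cpp
#include <cassert>
#include <string>

// Hypothetical types: they only illustrate "classify once, decide once".
enum class MutationSource { PresetLoad, OperatorEdit, OscOverlay };
enum class MutationScope { Persisted, Committed, Transient };

struct Mutation {
    MutationSource source;
    std::string parameter;
    double value;
};

struct MutationDecision {
    bool accepted = false;
    MutationScope scope = MutationScope::Transient;
    bool requestPersistence = false;  // would become a persistence request
    bool requestRenderReset = false;  // would become a render reset/reload decision
};

// Validates the mutation and classifies it. The mapping below is assumed.
inline MutationDecision Decide(const Mutation& m) {
    MutationDecision d;
    if (m.parameter.empty()) return d;  // validation failed: nothing is applied
    d.accepted = true;
    switch (m.source) {
    case MutationSource::PresetLoad:
        d.scope = MutationScope::Persisted;
        d.requestPersistence = true;
        d.requestRenderReset = true;
        break;
    case MutationSource::OperatorEdit:
        d.scope = MutationScope::Committed;
        break;
    case MutationSource::OscOverlay:
        d.scope = MutationScope::Transient;
        break;
    }
    return d;
}
```

Callers such as control services would only forward the mutation and act on the returned decision, which keeps the policy itself out of the render, control, video, and telemetry code.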
### `RuntimeSnapshotProvider`
`RuntimeSnapshotProvider` publishes render-facing snapshots.
It owns the currently published render snapshot and gives the render path a stable read boundary. Rendering does not read mutable store objects directly.
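One common way to build such a read boundary is to publish immutable snapshots behind a shared pointer. The sketch below assumes that approach; `SnapshotPublisher` and the `RenderSnapshot` fields are hypothetical and do not describe the real `RuntimeSnapshotProvider` internals.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical snapshot payload; the real snapshot contents are not shown here.
struct RenderSnapshot {
    std::uint64_t version = 0;
    float opacity = 1.0f;
};

class SnapshotPublisher {
public:
    // Builds the new snapshot outside the lock, then swaps the pointer.
    void Publish(const RenderSnapshot& next) {
        auto fresh = std::make_shared<const RenderSnapshot>(next);
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(fresh);
    }

    // The render path holds the returned pointer for a whole frame, so a later
    // Publish() cannot change the state it is drawing from.
    std::shared_ptr<const RenderSnapshot> Acquire() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const RenderSnapshot> current_ =
        std::make_shared<const RenderSnapshot>();
};
```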
## Live State And Layering
The current render state is built from named layers of state:
- persisted layer/package/default state from the runtime store
- committed live/session state
- transient live overlays from OSC/control input
- render-local state owned by the renderer
`RuntimeStateLayerModel` names these categories. `RenderStateComposer` and `RuntimeLiveState` combine live values into render-facing state.
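A minimal C++17 sketch of resolving one parameter across these layers follows. It assumes that a transient overlay wins over a committed value, which in turn wins over the persisted base; that precedence and the `LayeredValue` type are assumptions for illustration, not the documented behavior of `RenderStateComposer`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical per-parameter view of the state layers.
struct LayeredValue {
    double persisted = 0.0;           // durable base from the runtime store
    std::optional<double> committed;  // committed live/session value, if any
    std::optional<double> transient;  // transient OSC/control overlay, if any
};

// Resolves the value a frame would use (assumed precedence: top layer wins).
inline double Resolve(const LayeredValue& v) {
    if (v.transient) return *v.transient;
    if (v.committed) return *v.committed;
    return v.persisted;
}
```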
`RenderFrameInput` and `RenderFrameState` are the frame contract:
- `RenderFrameInput` describes what kind of frame is being built
- `RenderFrameState` describes the resolved state used to draw that frame
The renderer should not ask global state systems which snapshot or layer state to use midway through drawing.
## Control And Events
### `RuntimeServices`
`RuntimeServices` owns runtime-facing services such as OSC/control integration and service lifecycle.
It connects control ingress to the coordinator and live-state bridge.
### `ControlServices`
`ControlServices` handles OSC/control ingress, buffering, and polling/wake behavior.
It does not own runtime mutation policy. It normalizes ingress and asks the coordinator/runtime services to apply changes.
### `RuntimeEventDispatcher`
The app uses typed runtime events for internal coordination and observation.
Events are used for:
- runtime state broadcast requests
- shader build lifecycle
- backend state changes
- input/output frame observations
- timing samples
- health and queue observations
Events say what happened. Commands/request methods still exist where a caller needs an immediate success/failure answer.
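Typed events of this kind can be modeled in C++17 with `std::variant`. The sketch below is one possible shape; the event structs and the `EventDispatcher` class are hypothetical and are not the actual `RuntimeEventDispatcher` API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical event payloads: each one records something that happened.
struct ShaderBuildFinished { std::string package; bool succeeded; };
struct BackendStateChanged { std::string state; };
struct TimingSample { double milliseconds; };

using RuntimeEvent =
    std::variant<ShaderBuildFinished, BackendStateChanged, TimingSample>;

class EventDispatcher {
public:
    using Handler = std::function<void(const RuntimeEvent&)>;

    void Subscribe(Handler handler) { handlers_.push_back(std::move(handler)); }

    // Observers are notified in subscription order; nothing is returned,
    // which is why request methods remain for success/failure answers.
    void Publish(const RuntimeEvent& event) const {
        for (const Handler& handler : handlers_) handler(event);
    }

private:
    std::vector<Handler> handlers_;
};
```

A subscriber can pick out the event types it cares about with `std::get_if<TimingSample>(&event)` and ignore the rest.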
## Persistence
Persistence is handled by `PersistenceWriter`.
Runtime mutations can enqueue persistence requests without blocking the render/output path. Shutdown performs a bounded persistence flush.
The store owns durable state; the writer owns background write execution.
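The split between enqueueing and writing can be sketched with a worker thread and a bounded wait at shutdown. `BackgroundWriter` below is a hypothetical stand-in, not the real `PersistenceWriter`; in particular, abandoning unwritten requests after the timeout is an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class BackgroundWriter {
public:
    BackgroundWriter() : worker_([this] { Run(); }) {}
    ~BackgroundWriter() { Shutdown(std::chrono::milliseconds(0)); }

    // Called from the mutation path: only touches the queue, never does I/O.
    void Enqueue(std::function<void()> write) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(write));
        }
        wake_.notify_one();
    }

    // Waits up to `timeout` for queued writes to drain, then stops the worker.
    // Returns true if everything was written. A write already in progress is
    // allowed to finish, so the bound covers the wait, not a single write.
    bool Shutdown(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        const bool drained = idle_.wait_for(
            lock, timeout, [this] { return queue_.empty() && !busy_; });
        stopping_ = true;
        lock.unlock();
        wake_.notify_one();
        if (worker_.joinable()) worker_.join();
        return drained;
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (stopping_) return;  // remaining writes are abandoned (assumed)
            std::function<void()> write = std::move(queue_.front());
            queue_.pop_front();
            busy_ = true;
            lock.unlock();
            write();  // file I/O would happen here, off the render/output path
            lock.lock();
            busy_ = false;
            idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    std::deque<std::function<void()>> queue_;
    bool busy_ = false;
    bool stopping_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```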
## Render System
### `RenderEngine`
`RenderEngine` owns normal runtime OpenGL work.
It starts a dedicated render thread and binds the GL context on that thread. Runtime GL work enters through render-thread requests or render command queues.
The render thread handles:
- output frame rendering
- input frame upload
- preview present
- screenshot capture
- render-local resets
- shader/rebuild application
- temporal history and shader feedback resources
Startup initialization still happens before the render thread starts, while the app explicitly owns the context. Normal runtime work is routed through `RenderEngine`.
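The idea of funnelling all runtime GL work through one thread can be sketched as a single-consumer task queue. `RenderThreadExecutor` is hypothetical and contains no GL calls; in the real engine the worker would bind the GL context before draining the queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

class RenderThreadExecutor {
public:
    RenderThreadExecutor() : thread_([this] { Run(); }) {}
    ~RenderThreadExecutor() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();  // queued work is drained before the thread exits
    }

    // Queues work and returns immediately (command-queue style).
    std::future<void> Post(std::function<void()> work) {
        auto task = std::make_shared<std::packaged_task<void()>>(std::move(work));
        std::future<void> done = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back([task] { (*task)(); });
        }
        wake_.notify_one();
        return done;
    }

    // Synchronous request: the caller blocks behind everything queued earlier.
    void Invoke(std::function<void()> work) { Post(std::move(work)).get(); }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reached once stopping_ is set
            std::function<void()> job = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();
            job();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> queue_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so it starts after the other members
};
```

Because `Invoke` waits behind earlier queued work, a synchronous request can be delayed by whatever else is already in the queue.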
### Current Render-Thread Limitation
The current render thread is a shared GL executor, not a pure output-only cadence thread.
This means output render can still be delayed by:
- input upload work
- preview present requests
- screenshot capture
- render reset commands
- shader/resource update work
- synchronous render-thread request queue wait
For output-timing diagnosis, input capture can be disabled with:
```text
VST_DISABLE_INPUT_CAPTURE=1
```
When this variable is set, the backend skips DeckLink input configuration/start and `HasInputSource()` reports false.
### `OpenGLRenderPipeline`
`OpenGLRenderPipeline` draws the frame and performs output packing/readback.
The current output path:
1. binds the composite framebuffer
2. calls the render effect callback
3. blits/composes into the output framebuffer
4. packs the output for the configured pixel format
5. flushes GL
6. reads output into the provided system-memory output frame
7. records render/readback timing
For BGRA8 output, the pipeline uses a BGRA-compatible pack framebuffer and async PBO readback by default.
## Video Backend
### `VideoBackend`
`VideoBackend` owns app-level video device lifecycle, output production, system-memory frame slots, and backend playout health.
It owns:
- backend lifecycle state
- output production worker
- output completion worker
- system-memory output frame pool
- ready/completed output queue
- render cadence controller
- playout policy
- output frame scheduling into `VideoIODevice`
- backend timing and queue telemetry
It does not own GL drawing. It asks `OpenGLVideoIOBridge` / `RenderEngine` to render into system-memory output frames.
### Lifecycle
The current backend lifecycle includes:
- discovery
- configuring
- configured
- prerolling
- running
- degraded
- stopping
- stopped
- failed
Startup now separates output schedule preparation from scheduled playback:
1. prepare the DeckLink output schedule
2. start output completion worker
3. start output producer worker
4. warm up rendered system-memory preroll frames
5. optionally start input streams
6. start DeckLink scheduled playback
### Output Production
The output producer is cadence-driven.
`RenderCadenceController` tracks the selected output frame duration and decides when the producer should render another frame.
The render producer attempts to render one output frame per selected output tick. It does not speed up just because DeckLink is empty.
If render/GPU work is late enough, the cadence controller can skip late ticks according to policy.
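A cadence tracker along these lines can be sketched with `std::chrono`. `CadenceClock` and its late-tick tolerance parameter are hypothetical; the actual `RenderCadenceController` skip policy is not specified in this document.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

class CadenceClock {
public:
    using Clock = std::chrono::steady_clock;

    CadenceClock(Clock::time_point start, Clock::duration frameDuration)
        : nextTick_(start), frameDuration_(frameDuration) {
        assert(frameDuration > Clock::duration::zero());
    }

    // A frame is due purely because time has passed, never because the
    // downstream buffer happens to be empty.
    bool IsDue(Clock::time_point now) const { return now >= nextTick_; }
    Clock::time_point NextTick() const { return nextTick_; }

    // Call after producing a frame. Moves to the next tick and, if the
    // producer is more than `lateTolerance` ticks behind, skips the ticks
    // that were missed. Returns the number of skipped ticks.
    std::int64_t Advance(Clock::time_point now, std::int64_t lateTolerance = 1) {
        nextTick_ += frameDuration_;
        if (now <= nextTick_ + frameDuration_ * lateTolerance) return 0;
        const std::int64_t skipped = (now - nextTick_) / frameDuration_;
        nextTick_ += frameDuration_ * skipped;
        return skipped;
    }

private:
    Clock::time_point nextTick_;
    Clock::duration frameDuration_;
};
```

For 59.94 fps output the frame duration would be 1001/60000 s, about 16.68 ms per tick.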
### System-Memory Frame Pool
`SystemOutputFramePool` owns reusable system-memory output slots.
Slots have four states:
- `Free`
- `Rendering`
- `Completed`
- `Scheduled`
Completed-but-unscheduled frames are treated as a latest-N cache. If render cadence needs space and old completed frames have not been scheduled, the oldest unscheduled completed frame can be recycled.
Scheduled frames are protected until DeckLink reports completion.
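The slot rules above can be sketched as a small state machine. `FramePool` here is a simplified, single-threaded stand-in for `SystemOutputFramePool`: it tracks only slot states, and its method names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

enum class SlotState { Free, Rendering, Completed, Scheduled };

class FramePool {
public:
    explicit FramePool(std::size_t slotCount) : slots_(slotCount) {}

    // Prefers a free slot; otherwise recycles the oldest completed slot that
    // was never scheduled. Rendering and Scheduled slots are never taken.
    std::optional<std::size_t> AcquireForRender() {
        std::optional<std::size_t> oldest;
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].state == SlotState::Free) {
                slots_[i].state = SlotState::Rendering;
                return i;
            }
            if (slots_[i].state == SlotState::Completed &&
                (!oldest || slots_[i].completedOrder < slots_[*oldest].completedOrder)) {
                oldest = i;
            }
        }
        if (oldest) slots_[*oldest].state = SlotState::Rendering;
        return oldest;
    }

    void MarkCompleted(std::size_t i) {
        Transition(i, SlotState::Rendering, SlotState::Completed);
        slots_[i].completedOrder = ++completedCounter_;
    }
    void MarkScheduled(std::size_t i) {
        Transition(i, SlotState::Completed, SlotState::Scheduled);
    }
    // Called when the device reports completion for a scheduled slot.
    void Release(std::size_t i) {
        Transition(i, SlotState::Scheduled, SlotState::Free);
    }

private:
    struct Slot {
        SlotState state = SlotState::Free;
        std::uint64_t completedOrder = 0;
    };

    void Transition(std::size_t i, SlotState from, SlotState to) {
        assert(i < slots_.size() && slots_[i].state == from);
        slots_[i].state = to;
    }

    std::vector<Slot> slots_;
    std::uint64_t completedCounter_ = 0;
};
```

In the real app a recycled slot would also have to be removed from the ready queue, which this sketch leaves out.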
### Output Queue
`RenderOutputQueue` holds completed unscheduled output frames waiting to be scheduled.
It is bounded and latest-N:
- pushing beyond capacity releases/drops the oldest ready frame
- `DropOldestFrame()` is used when the frame pool needs to recycle old completed work
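A bounded latest-N queue can be sketched as follows. `LatestNQueue`, `TryPop`, and `DroppedCount` are hypothetical names; only `DropOldestFrame()` appears in this document, and its real signature is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>

template <typename Frame>
class LatestNQueue {
public:
    explicit LatestNQueue(std::size_t capacity) : capacity_(capacity) {}

    // Adds a ready frame. If that exceeds capacity, the oldest ready frame is
    // dropped and returned so the caller can release its slot.
    std::optional<Frame> Push(Frame frame) {
        ready_.push_back(std::move(frame));
        if (ready_.size() > capacity_) return DropOldestFrame();
        return std::nullopt;
    }

    // Removes the oldest ready frame and counts it as dropped.
    std::optional<Frame> DropOldestFrame() {
        std::optional<Frame> frame = TryPop();
        if (frame) ++dropped_;
        return frame;
    }

    // Removes the oldest ready frame so it can be scheduled.
    std::optional<Frame> TryPop() {
        if (ready_.empty()) return std::nullopt;
        Frame frame = std::move(ready_.front());
        ready_.pop_front();
        return frame;
    }

    std::size_t Size() const { return ready_.size(); }
    std::size_t DroppedCount() const { return dropped_; }

private:
    std::size_t capacity_;
    std::size_t dropped_ = 0;
    std::deque<Frame> ready_;
};
```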
### Scheduling
`VideoBackend::ScheduleReadyOutputFramesToTarget()` schedules completed system-memory frames up to the configured preroll/scheduled target.
The number of frames handed to DeckLink is capped using the app-owned scheduled-slot count. The real DeckLink buffered-frame count is recorded separately as telemetry.
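Scheduling up to a target can be sketched as a loop over the ready frames. The `ScheduleState` struct, the free function, and the choice to stop on the first failed schedule call are assumptions for illustration, not the body of `VideoBackend::ScheduleReadyOutputFramesToTarget()`.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Hypothetical app-side bookkeeping; frames are plain ints here.
struct ScheduleState {
    std::deque<int> readyFrames;  // completed, unscheduled frames (oldest first)
    int scheduledCount = 0;       // app-owned count of frames handed to the device
};

// Schedules ready frames until the app-owned count reaches `target` or no
// ready frame is left. `scheduleFrame` stands in for the device call and
// returns false on failure. Returns how many frames were scheduled.
inline int ScheduleReadyFramesToTarget(ScheduleState& state, int target,
                                       const std::function<bool(int)>& scheduleFrame) {
    int scheduledNow = 0;
    while (state.scheduledCount < target && !state.readyFrames.empty()) {
        const int frame = state.readyFrames.front();
        if (!scheduleFrame(frame)) break;  // assumed: leave the frame queued
        state.readyFrames.pop_front();
        ++state.scheduledCount;
        ++scheduledNow;
    }
    return scheduledNow;
}
```

A completion would decrement `scheduledCount`, after which the next call tops the device back up to the target.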
### Completion Handling
DeckLink completion callbacks do not render.
The callback path reports completion into `VideoBackend`, which processes completions on a backend worker. Completion processing:
- releases the system-memory slot by buffer pointer
- records pacing
- accounts for the late/dropped/flushed/completed result
- records telemetry
- wakes the output producer
## DeckLink Integration
### `DeckLinkSession`
`DeckLinkSession` is the DeckLink implementation of `VideoIODevice`.
It owns:
- DeckLink discovery
- input/output mode selection
- DeckLink input/output interfaces
- keyer configuration
- capture and playout delegates
- schedule-time generation through `VideoPlayoutScheduler`
- DeckLink frame scheduling
- actual buffered-frame telemetry
For output, system-memory frames are scheduled through DeckLink `CreateVideoFrameWithBuffer()`.
When a system-memory frame is scheduled, `DeckLinkSession` records a map from the DeckLink frame object back to the app-owned system-memory buffer pointer. On completion, the buffer pointer is returned so `VideoBackend` can release the matching slot.
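The frame-to-buffer bookkeeping can be sketched with a mutex-guarded map. `ScheduledBufferMap` is hypothetical and uses opaque pointers in place of DeckLink frame objects; no DeckLink API is called here.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

class ScheduledBufferMap {
public:
    // Remembers which app-owned buffer backs a scheduled frame object.
    void OnScheduled(const void* frameObject, void* buffer) {
        std::lock_guard<std::mutex> lock(mutex_);
        bufferByFrame_[frameObject] = buffer;
    }

    // Looks up and forgets the buffer for a completed frame object.
    // Returns nullptr if the frame object is unknown.
    void* OnCompleted(const void* frameObject) {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = bufferByFrame_.find(frameObject);
        if (it == bufferByFrame_.end()) return nullptr;
        void* buffer = it->second;
        bufferByFrame_.erase(it);
        return buffer;
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return bufferByFrame_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<const void*, void*> bufferByFrame_;
};
```

The mutex is there because scheduling and completion callbacks may arrive on different threads.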
### Actual DeckLink Buffer Telemetry
`DeckLinkSession` calls `GetBufferedVideoFrameCount()` after schedule/completion where available.
Telemetry separates:
- actual DeckLink buffered frames
- app-owned scheduled system-memory slots
- synthetic schedule/completion counters
- late/drop/flushed completion results
## Output Timing Experiments And Current Finding
The repo includes `DeckLinkRenderCadenceProbe`, a small standalone test app under:
```text
apps/DeckLinkRenderCadenceProbe
```
The probe does not use the main runtime, shader system, preview path, input upload path, or shared render engine. It uses:
- one OpenGL render thread with its own hidden GL context
- simple BGRA8 motion rendering
- async PBO readback
- latest-N system-memory frame slots
- a playout thread that feeds DeckLink
- real rendered warmup before scheduled playback
The first hardware result was smooth at roughly 59.94/60 fps with:
- `renderFps` near 59.9
- `scheduleFps` near 59.9
- DeckLink actual buffered frames stable at 4
- no late frames
- no dropped frames
- no PBO misses
- no completed-frame drops
That result shows the clean architecture can hold output cadence on the test machine. Remaining main-app timing issues are therefore likely integration/ownership issues in the main app rather than a fundamental DeckLink/OpenGL/BGRA8 limitation.
The highest-value current suspects are:
- input upload sharing the output render thread
- shared render-thread task queue contention
- preview/screenshot work
- runtime/render-state work on the output path
## Health Telemetry
`HealthTelemetry` owns app-visible health and timing observations.
It records:
- signal/input status
- performance/render timing
- event queue timing
- backend lifecycle/playout state
- output render queue wait
- output render/readback timing
- system-memory frame counts
- actual DeckLink buffer depth
- late/drop/flushed/completed frame counters
- schedule-call timing/failure counts
Several hot-path telemetry calls use try-lock variants so observation does not become a major timing dependency.
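A try-lock recording path can be sketched with `std::unique_lock` and `std::try_to_lock`. `TimingStats` is a hypothetical example, not the `HealthTelemetry` interface; counting skipped samples is an assumption.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

class TimingStats {
public:
    // Hot-path call: never blocks. If the lock is busy, the sample is skipped
    // and counted instead of making the caller wait.
    bool TryRecord(double milliseconds) {
        std::unique_lock<std::mutex> lock(mutex_, std::try_to_lock);
        if (!lock.owns_lock()) {
            skipped_.fetch_add(1, std::memory_order_relaxed);
            return false;
        }
        ++samples_;
        totalMs_ += milliseconds;
        return true;
    }

    // Reader-side calls may block briefly; they are not on the hot path.
    double AverageMs() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return samples_ == 0 ? 0.0 : totalMs_ / static_cast<double>(samples_);
    }
    std::uint64_t SampleCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return samples_;
    }
    std::uint64_t SkippedCount() const {
        return skipped_.load(std::memory_order_relaxed);
    }

private:
    mutable std::mutex mutex_;
    std::uint64_t samples_ = 0;
    double totalMs_ = 0.0;
    std::atomic<std::uint64_t> skipped_{0};
};
```

The trade-off is that some samples are lost under contention, which is acceptable for observation but would not be for control decisions.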
Runtime state presentation exposes telemetry through the runtime JSON/open API surface.
## Preview And Screenshot
Preview is best-effort.
`OpenGLComposite::paintGL()` skips preview when the backend reports output pressure. Preview presentation is requested through the render thread.
Screenshot capture is also a render-thread request. It reads pixels from the output framebuffer and writes PNG asynchronously after capture.
Both preview and screenshot share GL execution with output render, so they are secondary to output timing.
## Output Readback Modes
The output readback path supports environment-selected modes:
```text
VST_OUTPUT_READBACK_MODE=async_pbo
VST_OUTPUT_READBACK_MODE=sync
VST_OUTPUT_READBACK_MODE=cached_only
```
Default behavior is `async_pbo`.
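Mode selection from the environment can be sketched as below (C++17). The enum and function names are hypothetical, and falling back to `async_pbo` for unrecognized values is an assumption; only the variable name and its three values come from this document.

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

enum class ReadbackMode { AsyncPbo, Sync, CachedOnly };

// Maps the raw variable value to a mode. Unset or unrecognized values fall
// back to the default (the fallback for unknown values is assumed).
inline ReadbackMode ParseReadbackMode(const char* value) {
    if (value == nullptr) return ReadbackMode::AsyncPbo;
    const std::string_view text(value);
    if (text == "sync") return ReadbackMode::Sync;
    if (text == "cached_only") return ReadbackMode::CachedOnly;
    return ReadbackMode::AsyncPbo;
}

// Reads the variable once, for example at startup.
inline ReadbackMode ReadbackModeFromEnvironment() {
    return ParseReadbackMode(std::getenv("VST_OUTPUT_READBACK_MODE"));
}
```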
Experiment findings:
- direct synchronous readback was slower on the sampled machine
- cached-only recovered timing but is visually invalid for live motion
- BGRA8 pack framebuffer plus async PBO removed the earlier large readback stall
## Current Debug/Experiment Launches
VS Code launch configurations include:
- `Debug LoopThroughWithOpenGLCompositing`
- `Debug LoopThroughWithOpenGLCompositing - sync readback experiment`
- `Debug LoopThroughWithOpenGLCompositing - cached output experiment`
- `Debug DeckLinkRenderCadenceProbe`
The default main-app debug launch currently disables input capture with `VST_DISABLE_INPUT_CAPTURE=1` so output timing can be tested without input upload interference.
## Current Ownership Summary
| Area | Current Owner |
| --- | --- |
| Durable runtime config/state | `RuntimeStore` |
| Current committed live layer state | `CommittedLiveState` |
| Mutation validation/policy | `RuntimeCoordinator` |
| Render snapshot publication | `RuntimeSnapshotProvider` |
| OSC/control ingress | `RuntimeServices` / `ControlServices` |
| Internal event dispatch | `RuntimeEventDispatcher` |
| Background persistence writes | `PersistenceWriter` |
| GL context and normal GL work | `RenderEngine` render thread |
| Render-pass execution and output readback | `OpenGLRenderPipeline` |
| Device lifecycle and output production | `VideoBackend` |
| DeckLink API integration | `DeckLinkSession` |
| Operational health/timing | `HealthTelemetry` |
## Current Runtime Flow Summary
### Control Mutation
```text
OSC/API/control input
-> RuntimeServices / ControlServices
-> RuntimeCoordinator
-> RuntimeStore / CommittedLiveState / RuntimeLiveState
-> RuntimeSnapshotProvider publication or live overlay update
-> RuntimeEventDispatcher observations
```
### Output Render
```text
VideoBackend output producer
-> RenderCadenceController tick
-> SystemOutputFramePool acquire rendering slot
-> OpenGLVideoIOBridge::RenderScheduledFrame
-> RenderEngine::RequestOutputFrame
-> render thread
-> OpenGLRenderPipeline::RenderFrame
-> system-memory output slot
-> RenderOutputQueue completed frame
```
### DeckLink Playout
```text
RenderOutputQueue completed frame
-> VideoBackend schedules to target
-> DeckLinkSession::ScheduleOutputFrame
-> CreateVideoFrameWithBuffer
-> ScheduleVideoFrame
-> DeckLink playback
-> completion callback
-> VideoBackend completion worker
-> release scheduled system-memory slot
```
### Input Capture
When input capture is enabled:
```text
DeckLink input callback
-> VideoBackend::HandleInputFrame
-> OpenGLVideoIOBridge::UploadInputFrame
-> RenderEngine::QueueInputFrame
-> render thread upload
```
When `VST_DISABLE_INPUT_CAPTURE=1`, this flow is skipped.
## Known Current Constraints
- The main app render thread still handles multiple kinds of GL work.
- Output render still uses a synchronous request/response call into the render thread.
- Input upload can contend with output render when input capture is enabled.
- Preview and screenshot share the render thread.
- Phase/experiment documents still exist as historical notes, but this document is the current architecture summary.
## Practical Rules
- Keep one owner for each kind of state.
- Keep GL work on the render thread.
- Keep DeckLink completion callbacks passive.
- Treat completed unscheduled output frames as latest-N cache entries.
- Protect scheduled output frames until DeckLink completion.
- Keep output timing more important than preview/screenshot.
- Measure timing by domain instead of adding fallback branches blindly.