# RenderCadenceCompositor

This app is the modular version of the working DeckLink render-cadence probe.

Its job is to prove the production-facing foundation before the current compositor's shader/runtime/control features are ported over.

Before adding features here, read the guardrails in [Render Cadence Golden Rules](../../docs/RENDER_CADENCE_GOLDEN_RULES.md).

## Architecture

```text
RenderThread
  owns a hidden OpenGL context
  polls latest input frames without waiting
  uploads input frames into a render-owned GL texture
  renders simple BGRA8 motion at selected cadence
  queues async PBO readback
  publishes completed frames into SystemFrameExchange

InputFrameMailbox
  owns latest disposable CPU input slots
  drops older unsampled input frames when newer frames arrive
  protects the one frame currently being uploaded by render
  uses a single contiguous copy when capture row stride matches mailbox row stride

SystemFrameExchange
  owns Free / Rendering / Completed / Scheduled slots
  drops old completed unscheduled frames when render needs space
  protects scheduled frames until DeckLink completion

DeckLinkOutputThread
  consumes completed system-memory frames
  schedules them into DeckLink up to target depth
  never renders
```

Startup warms up real rendered frames before DeckLink scheduled playback starts. When DeckLink input is available, startup also waits briefly for two ready input frames before the render thread starts so the first render ticks are deliberate rather than lucky.
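
The slot lifecycle in the diagram can be sketched as a small state machine. This is a minimal single-threaded illustration, not the app's `SystemFrameExchange`: the class name, method names, and the sequence counter are assumptions made for the example, and a real exchange would need synchronization between the render and output threads.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical names throughout; only the four states come from the diagram.
enum class SlotState { Free, Rendering, Completed, Scheduled };

template <std::size_t N>
class FrameExchangeSketch {
public:
    // Render side: prefer a Free slot, otherwise recycle the oldest Completed
    // slot that was never scheduled. Scheduled slots are never reclaimed here.
    std::optional<std::size_t> acquireForRender() {
        for (std::size_t i = 0; i < N; ++i) {
            if (slots_[i].state == SlotState::Free) {
                slots_[i].state = SlotState::Rendering;
                return i;
            }
        }
        std::optional<std::size_t> oldest = oldestCompleted();
        if (!oldest) return std::nullopt;  // everything is Rendering or Scheduled
        ++droppedCompleted_;
        slots_[*oldest].state = SlotState::Rendering;
        return oldest;
    }

    void publishCompleted(std::size_t i) {
        slots_[i].state = SlotState::Completed;
        slots_[i].sequence = ++nextSequence_;
    }

    // Output side: hand the oldest Completed slot to the scheduler.
    std::optional<std::size_t> takeForSchedule() {
        std::optional<std::size_t> oldest = oldestCompleted();
        if (oldest) slots_[*oldest].state = SlotState::Scheduled;
        return oldest;
    }

    // Called when the device reports playout completion for slot i.
    void releaseScheduled(std::size_t i) { slots_[i].state = SlotState::Free; }

    SlotState state(std::size_t i) const { return slots_[i].state; }
    std::uint64_t droppedCompleted() const { return droppedCompleted_; }

private:
    struct Slot {
        SlotState state = SlotState::Free;
        std::uint64_t sequence = 0;
    };

    std::optional<std::size_t> oldestCompleted() const {
        std::optional<std::size_t> best;
        for (std::size_t i = 0; i < N; ++i) {
            if (slots_[i].state != SlotState::Completed) continue;
            if (!best || slots_[i].sequence < slots_[*best].sequence) best = i;
        }
        return best;
    }

    std::array<Slot, N> slots_{};
    std::uint64_t nextSequence_ = 0;
    std::uint64_t droppedCompleted_ = 0;
};

// With two slots: one frame is scheduled, so render recycles the other
// completed frame instead of touching the scheduled one.
inline void frameExchangeExample() {
    FrameExchangeSketch<2> exchange;
    auto first = exchange.acquireForRender();
    exchange.publishCompleted(*first);
    auto scheduled = exchange.takeForSchedule();
    auto second = exchange.acquireForRender();
    exchange.publishCompleted(*second);
    auto recycled = exchange.acquireForRender();
    assert(recycled && *recycled == *second);
    assert(exchange.droppedCompleted() == 1);
    assert(exchange.state(*scheduled) == SlotState::Scheduled);
}
```

The point of the sketch is the asymmetry: completed frames are disposable when render needs space, while scheduled frames only return to `Free` through the completion path.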

## Current Scope

Included now:

- output-only DeckLink
- optional DeckLink input edge with BGRA8 capture or raw UYVY8 capture decoded on the render thread
- non-blocking startup when DeckLink output is unavailable
- hidden render-thread-owned OpenGL context
- simple smooth-motion renderer
- BGRA8-only output
- non-blocking latest-frame input mailbox
- fast contiguous mailbox copy path for matching input row strides
- bounded two-frame input warmup before render cadence starts
- render-thread-owned input texture upload
- async PBO readback
- latest-N system-memory frame exchange
- rendered-frame warmup
- background Slang compile of `shaders/happy-accident`
- app-owned display/render layer model for shader build readiness
- app-owned submission of a completed shader artifact
- render-thread-owned runtime render scene for ready shader layers
- shared-context GL prepare worker for runtime shader program compile/link
- render-thread-only GL program swap once a prepared program is ready
- manifest-driven stateless single-pass shader packages
- manifest-driven stateless named-pass shader packages
- atomic render-plan swap after every pass program is prepared
- HTTP shader list populated from supported stateless full-frame shader packages
- default float, vec2, color, boolean, enum, and trigger parameters
- small JSON writer for future HTTP/WebSocket payloads
- JSON serialization for cadence telemetry snapshots
- background logging with `log`, `warning`, and `error` levels
- local HTTP control server matching the OpenAPI route surface
- HTTP layer controls for add, remove, reorder, bypass, shader change, parameter update, and parameter reset
- trigger parameters as latest-pulse controls with shader-visible count/time
- startup config provider for `config/runtime-host.json`
- quiet telemetry health monitor
- non-GL frame-exchange tests
- non-GL input-mailbox tests

Intentionally not included yet:

- additional input format conversion/scaling
- temporal/history/feedback shader storage
- texture/LUT asset upload
- text-parameter rasterization
- runtime state
- OSC control
- persistent control/state writes
- trigger event history for stacked repeated pulses
- preview
- screenshots
- persistence

Those features should be ported only after the cadence spine is stable.

## V1 Feature Parity Checklist

This tracks parity with `apps/LoopThroughWithOpenGLCompositing`.

- [x] Stable DeckLink output cadence
- [x] BGRA8 system-memory output path
- [x] Render thread owns its primary GL context
- [x] Output startup warmup before scheduled playback
- [x] Non-blocking startup when DeckLink output is unavailable
- [x] Runtime shader package discovery
- [x] Background Slang shader compile
- [x] Shared-context GL shader/program preparation
- [x] Render-thread program swap at a frame boundary
- [x] Stateless single-pass shader rendering
- [x] Stateless named-pass shader rendering
- [x] Atomic multipass render-plan commit
- [x] Shader add/remove control path
- [x] Previous-layer texture handoff for stacked shaders
- [x] Supported shader list in HTTP/UI state
- [x] Local HTTP server
- [x] WebSocket state updates for the UI
- [x] OpenAPI document serving
- [x] Static control UI serving
- [x] Startup config loading from `config/runtime-host.json`
- [x] Cadence telemetry JSON
- [x] Health logging for schedule/drop/starvation events
- [x] Runtime parameter updates from HTTP controls
- [x] Layer reorder/bypass/set-shader/update-parameter/reset-parameter HTTP controls
- [x] Trigger parameter pulse count/time for latest trigger events
- [x] Optional DeckLink input capture
- [x] UYVY8 input capture with render-thread GPU decode to shader input texture
- [x] Latest-frame CPU input mailbox
- [x] Fast contiguous input mailbox copy when source/destination stride matches
- [x] Bounded two-frame input warmup before render cadence starts
- [x] Render-owned input texture upload
- [x] Runtime shaders receive input through `gVideoInput`
- [x] Live DeckLink input bound to `gVideoInput`
- [ ] Input format conversion/scaling
- [ ] Temporal history buffers
- [ ] Feedback buffers
- [ ] Texture asset loading and upload
- [ ] LUT asset loading and upload
- [ ] Text parameter rasterization
- [ ] Trigger history/event buffers for overlapping repeated trigger effects
- [ ] Full runtime state store/read model
- [ ] Persistent layer stack/config writes
- [ ] OSC ingress
- [ ] Preview output
- [ ] Screenshot capture
- [ ] External keying support
- [ ] Full V1 health/runtime presentation model

## Build

```powershell
cmake --build --preset build-debug --target RenderCadenceCompositor -- /m:1
```

The executable is:

```text
build\vs2022-x64-debug\Debug\RenderCadenceCompositor.exe
```

## Run

Run from VS Code with:

```text
Debug RenderCadenceCompositor
```

Or from a terminal:

```powershell
build\vs2022-x64-debug\Debug\RenderCadenceCompositor.exe
```

Press Enter to stop.

To test a different compatible shader package:

```powershell
build\vs2022-x64-debug\Debug\RenderCadenceCompositor.exe --shader solid-color
```

Use `--no-shader` to keep the simple motion fallback only.

## Startup Config

On startup the app loads `config/runtime-host.json` through `AppConfigProvider`, then applies explicit CLI overrides.

Currently consumed fields:

- `serverPort`
- `shaderLibrary`
- `oscBindAddress`
- `oscPort`
- `oscSmoothing`
- `inputVideoFormat`
- `inputFrameRate`
- `outputVideoFormat`
- `outputFrameRate`
- `autoReload`
- `maxTemporalHistoryFrames`
- `previewFps`
- `enableExternalKeying`

The loaded config is treated as a read-only startup snapshot. Subsystems that need config should receive this snapshot or a narrowed config struct from app orchestration; they should not reload files independently.

Supported CLI overrides:

- `--shader <shader-id>`
- `--no-shader`
- `--port <port>`
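
The "load file config, then apply CLI overrides" order can be sketched as below. This is an illustration only and not `AppConfigProvider`: the struct and function names are hypothetical, the snapshot is narrowed to one field, and the mapping from `--port` to `serverPort` is an assumption based on the field and flag names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical narrowed view of the startup snapshot.
struct ConfigSnapshotSketch {
    int serverPort = 0;
};

// Hypothetical holder for the three documented CLI flags.
struct CliOverridesSketch {
    std::optional<std::string> shaderId;  // --shader <shader-id>
    bool noShader = false;                // --no-shader
    std::optional<int> port;              // --port <port>
};

// Parses only the documented flags; unknown arguments are ignored.
// std::stoi throws on non-numeric input, which a real parser would handle.
inline CliOverridesSketch parseOverrides(const std::vector<std::string>& args) {
    CliOverridesSketch overrides;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "--no-shader") {
            overrides.noShader = true;
        } else if (args[i] == "--shader" && i + 1 < args.size()) {
            overrides.shaderId = args[++i];
        } else if (args[i] == "--port" && i + 1 < args.size()) {
            overrides.port = std::stoi(args[++i]);
        }
    }
    return overrides;
}

// Takes the file snapshot by value and returns a new snapshot, so the
// result can be handed around as read-only startup state.
inline ConfigSnapshotSketch applyOverrides(ConfigSnapshotSketch fileConfig,
                                           const CliOverridesSketch& overrides) {
    if (overrides.port) fileConfig.serverPort = *overrides.port;
    return fileConfig;
}

inline void overridesExample() {
    ConfigSnapshotSketch fromFile;
    fromFile.serverPort = 8080;
    const auto overrides = parseOverrides({"--shader", "solid-color", "--port", "9000"});
    const auto effective = applyOverrides(fromFile, overrides);
    assert(effective.serverPort == 9000);
    assert(overrides.shaderId && *overrides.shaderId == "solid-color");
    assert(!overrides.noShader);
}
```

Returning a fresh snapshot rather than mutating shared state matches the read-only intent described above: subsystems receive the merged result and never re-read the file.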

## Expected Telemetry

Startup, shutdown, shader-build, and render-thread event messages are written through the app logger. Telemetry is intentionally separate and remains a compact once-per-second cadence line.

The logger writes to the console, `OutputDebugStringA`, and `logs/render-cadence-compositor.log` by default. Render-thread log calls use the non-blocking path so diagnostics do not become cadence blockers.

## HTTP Control Server

The app starts a local HTTP control server on `http://127.0.0.1:8080` by default, searching nearby ports if that one is busy.

Current endpoints:

- `GET /` and UI asset paths: serve the bundled control UI from `ui/dist`
- `GET /api/state`: returns OpenAPI-shaped display data with cadence telemetry, supported shaders, output status, and a read-only current runtime layer
- `GET /ws`: upgrades to a WebSocket and streams state snapshots when they change
- `GET /docs/openapi.yaml` and `GET /openapi.yaml`: serve the OpenAPI document
- `GET /docs`: serves Swagger UI
- `POST /api/layers/add`, `/remove`, `/reorder`, `/set-bypass`, `/set-shader`, `/update-parameter`, and `/reset-parameters` use the shared runtime control-command path
- other OpenAPI POST routes are present but return `{ "ok": false, "error": "Endpoint is not implemented in RenderCadenceCompositor yet." }`

The HTTP server runs on its own thread. It serves static UI/docs files, samples/copies telemetry through callbacks, and translates POST bodies into runtime control commands. Command execution is app-owned, so future OSC ingress can create the same commands without depending on HTTP route code. Control commands may update the display layer model, start background shader builds, or publish an already-built render-layer snapshot, but they do not call render work or DeckLink scheduling directly.

## Optional DeckLink Output

DeckLink output is an optional edge service in this app.

Startup order is:

1. start render thread
2. warm up rendered system-memory frames
3. try to attach DeckLink output
4. start telemetry and HTTP either way

If DeckLink discovery or output setup fails, the app logs a warning and continues running without starting the output scheduler or scheduled playback. This keeps render cadence, runtime shader testing, HTTP state, and logging available on machines without DeckLink hardware or drivers.

`/api/state` reports the output status in `videoIO.statusMessage`.

## Optional DeckLink Input

DeckLink input is an optional edge service in this app.

Startup order is:

1. create `InputFrameMailbox`
2. try to attach DeckLink input for the configured input mode
3. prefer BGRA8 capture, otherwise accept raw UYVY8 capture and configure the mailbox for UYVY8 bytes
4. start `DeckLinkInputThread`
5. wait briefly for two ready input warmup frames before starting render cadence
6. leave input absent if discovery, setup, format support, or stream startup fails

`DeckLinkInput` and `DeckLinkInputThread` are deliberately narrow. They capture BGRA8 frames directly or raw UYVY8 frames into `InputFrameMailbox`; they do not call GL, render, preview, screenshot, shader, or output scheduling code. UYVY8-to-RGBA decode happens later inside the render-thread-owned input texture upload path, so the DeckLink callback stays a capture/copy edge only. The mailbox uses one contiguous copy when the capture row stride matches the configured mailbox row stride, and falls back to row-by-row copy only for padded or mismatched frames. Unsupported input modes or formats outside BGRA8/UYVY8 are reported explicitly and treated as an unavailable edge rather than silently converted.
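
The stride-dependent copy described above can be illustrated with a small helper. This is a sketch under stated assumptions, not the mailbox implementation: `copyFrameRows` and its parameters are hypothetical names, and the fast path assumes both buffers hold at least `stride * rows` bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies `rows` rows from src to dst. `rowBytes` is the number of visible
// bytes per row; the strides may include trailing padding.
inline void copyFrameRows(const std::uint8_t* src, std::size_t srcStride,
                          std::uint8_t* dst, std::size_t dstStride,
                          std::size_t rowBytes, std::size_t rows) {
    if (srcStride == dstStride) {
        // Fast path: identical layout, so one contiguous copy covers every
        // row (padding bytes included).
        std::memcpy(dst, src, srcStride * rows);
        return;
    }
    // Fallback: padded or mismatched frames are copied one row at a time.
    for (std::size_t row = 0; row < rows; ++row) {
        std::memcpy(dst + row * dstStride, src + row * srcStride, rowBytes);
    }
}

inline void copyFrameRowsExample() {
    // Two rows of four visible bytes; the source is padded to six bytes per
    // row while the destination is tightly packed.
    const std::vector<std::uint8_t> padded = {1, 2, 3, 4, 0, 0, 5, 6, 7, 8, 0, 0};
    std::vector<std::uint8_t> tight(8, 0);
    copyFrameRows(padded.data(), 6, tight.data(), 4, 4, 2);
    assert((tight == std::vector<std::uint8_t>{1, 2, 3, 4, 5, 6, 7, 8}));
}
```

Either path is a plain memory copy with no format conversion, which is consistent with keeping the capture callback a copy-only edge.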

Input warmup is startup-only and bounded. It may delay render-thread startup for a short window, but it does not add waits to the steady-state render cadence loop.

The app samples telemetry once per second.

Normal cadence samples are available through `GET /api/state` and are not printed to the console. The telemetry monitor only logs health events:

- warning when DeckLink late/dropped-frame counters increase
- warning when schedule failures increase
- error when the app/DeckLink output buffer is starved
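
The health rules above amount to comparing consecutive telemetry samples. The following is a minimal sketch, not `TelemetryHealthMonitor`: the struct and function names are hypothetical, the rules are collapsed into a single returned level, and treating "starved" as a device buffer depth of zero is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of one telemetry sample; field names follow the
// counters shown in the cadence line.
struct OutputSampleSketch {
    std::uint64_t late = 0;
    std::uint64_t dropped = 0;
    std::uint64_t scheduleFailures = 0;
    std::uint32_t decklinkBuffered = 0;
};

enum class HealthEvent { None, Warning, Error };

inline HealthEvent classifySample(const OutputSampleSketch& previous,
                                  const OutputSampleSketch& current) {
    // Assumption: starvation means the buffered depth reached zero.
    if (current.decklinkBuffered == 0) return HealthEvent::Error;
    const bool lateOrDropped =
        current.late > previous.late || current.dropped > previous.dropped;
    const bool scheduleFailed =
        current.scheduleFailures > previous.scheduleFailures;
    return (lateOrDropped || scheduleFailed) ? HealthEvent::Warning
                                             : HealthEvent::None;
}

inline void classifySampleExample() {
    OutputSampleSketch previous;
    previous.decklinkBuffered = 4;

    OutputSampleSketch steady = previous;
    assert(classifySample(previous, steady) == HealthEvent::None);

    OutputSampleSketch oneLate = previous;
    oneLate.late = 1;
    assert(classifySample(previous, oneLate) == HealthEvent::Warning);

    OutputSampleSketch starved = previous;
    starved.decklinkBuffered = 0;
    assert(classifySample(previous, starved) == HealthEvent::Error);
}
```

Because only counter increases and starvation produce events, a steady sample stays silent, which is what keeps the console quiet during normal operation.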

Input telemetry:

- `inputFramesReceived`: frames accepted into `InputFrameMailbox`
- `inputFramesDropped`: ready input frames dropped or missed because the mailbox was full
- `inputLatestAgeMs`: age of the newest submitted input frame
- `inputUploadMs`: render-thread GL upload/decode submission time for the latest uploaded input frame
- `inputFormatSupported`: whether the latest frame reaching the render upload path was BGRA8 or UYVY8 compatible
- `inputSignalPresent`: whether any input frame has reached the mailbox
- `inputCaptureFps`: DeckLink input callback capture rate
- `inputConvertMs`: input-edge CPU conversion time; expected to remain `0` for BGRA8 and raw UYVY8 capture because UYVY8 decode is render-thread GPU work
- `inputSubmitMs`: time spent copying/submitting the latest captured input frame to `InputFrameMailbox`
- `inputCaptureFormat`: selected DeckLink input format (`BGRA8`, `UYVY8`, or `none`)
- `inputNoSignalFrames`: DeckLink callbacks reporting no input source
- `inputUnsupportedFrames`: input frames rejected before mailbox submission
- `inputSubmitMisses`: input frames that could not be submitted to the mailbox

Runtime shaders continue rendering when input is missing. If no mailbox frame has been uploaded yet, shader samplers use the runtime fallback source texture; once DeckLink input is flowing, shaders such as CRT and trigger-ripple sample the real/latest input through `gVideoInput`.

Healthy first-run signs:

- visible DeckLink output is smooth
- `renderFps` is close to the selected cadence
- `scheduleFps` is close to the selected cadence after warmup
- `scheduled` stays near 4
- `decklinkBuffered` stays near 4 when available
- `late` and `dropped` do not increase continuously
- `scheduleFailures` does not increase
- `shaderCommitted` becomes `1` after the background Happy Accident compile completes
- `shaderFailures` remains `0`

`completedPollMisses` means the DeckLink scheduling thread woke up before a completed frame was available. It is not a DeckLink playout underrun by itself. Treat it as healthy polling noise when `scheduled`, `decklinkBuffered`, `late`, `dropped`, and `scheduleFailures` remain stable.

## Runtime Slang Shader Test

On startup the app begins compiling the selected shader package on a background thread owned by the app orchestration layer. The default is `shaders/happy-accident`.

The render thread keeps drawing the simple motion renderer while Slang compiles. It does not choose packages, launch Slang, or track build lifecycle. Once a completed shader artifact is published, the render-thread-owned runtime scene queues changed layers to a shared-context GL prepare worker. That worker compiles/links runtime shader programs off the cadence thread. The render thread only swaps in an already-prepared GL program at a frame boundary. If either the Slang build or GL preparation fails, the app keeps rendering the current renderer or simple motion fallback.

Current runtime shader support is deliberately limited to stateless full-frame packages:

- one or more named passes
- one sampled source input per pass
- named intermediate outputs routed by the pass manifest
- final visible output must be named `layerOutput`
- no temporal history
- no feedback storage
- no texture/LUT assets yet
- no text parameters yet
- manifest defaults initialize parameters
- HTTP controls can update runtime parameter values without rebuilding GL programs when the shader program is unchanged
- trigger parameters are treated as latest-pulse controls: each press increments the trigger count and records the current runtime time
- repeated trigger history is not stored yet, so effects such as `trigger-ripple` restart from the latest trigger rather than accumulating overlapping ripples
- the first layer receives a small fallback source texture until DeckLink input is available
- the first layer receives the latest DeckLink input texture through both `gVideoInput` and `gLayerInput` when input frames are available
- stacked layers receive the original input through `gVideoInput` and the previous ready layer output through `gLayerInput`
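
The latest-pulse trigger behavior in the list above can be sketched as a two-field state. The names here are hypothetical and do not come from the app; the sketch only shows why a second press replaces the first rather than stacking with it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical shader-visible trigger state: a running count plus the
// runtime time of the most recent press. No per-press history is kept.
struct TriggerStateSketch {
    std::uint32_t count = 0;
    double lastPulseSeconds = 0.0;
};

// Each press increments the count and overwrites the recorded time.
inline void pulse(TriggerStateSketch& trigger, double runtimeSeconds) {
    ++trigger.count;
    trigger.lastPulseSeconds = runtimeSeconds;
}

// What an effect could derive per frame: time since the latest press only.
inline double secondsSincePulse(const TriggerStateSketch& trigger,
                                double runtimeSeconds) {
    return runtimeSeconds - trigger.lastPulseSeconds;
}

inline void triggerExample() {
    TriggerStateSketch trigger;
    pulse(trigger, 1.0);
    pulse(trigger, 1.5);  // the 1.0 press is no longer recoverable
    assert(trigger.count == 2);
    assert(trigger.lastPulseSeconds == 1.5);
    assert(secondsSincePulse(trigger, 2.0) == 0.5);
}
```

With only the latest time available, an effect driven by elapsed time restarts on every press, which matches the `trigger-ripple` limitation noted above.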

Shader source semantics:

- `gVideoInput` means the latest decoded shader-visible video input for every layer.
- `gLayerInput` means the previous layer output.
- the first layer may receive `gLayerInput = gVideoInput`.
- later layers receive `gVideoInput = original input` and `gLayerInput = previous layer`.
- named intermediate pass inputs inside a multipass layer are still routed through the selected pass-source slot; layer stacking should use `gLayerInput`.
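
The per-layer source rules above reduce to a small selection function. This is an illustrative sketch with hypothetical names; `TextureHandle` stands in for whatever texture identifier the render scene actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using TextureHandle = std::uint32_t;  // stand-in for a GL texture name

// The two layer-level sources a shader can sample.
struct LayerSourcesSketch {
    TextureHandle videoInput;  // bound as gVideoInput
    TextureHandle layerInput;  // bound as gLayerInput
};

// Layer 0 sees the video input on both samplers; later layers keep the
// original video input and additionally see the previous layer's output.
inline LayerSourcesSketch resolveLayerSources(std::size_t layerIndex,
                                              TextureHandle videoInput,
                                              TextureHandle previousLayerOutput) {
    if (layerIndex == 0) return {videoInput, videoInput};
    return {videoInput, previousLayerOutput};
}

inline void layerSourcesExample() {
    const TextureHandle video = 10;
    const TextureHandle previous = 20;

    const auto first = resolveLayerSources(0, video, previous);
    assert(first.videoInput == video && first.layerInput == video);

    const auto second = resolveLayerSources(1, video, previous);
    assert(second.videoInput == video && second.layerInput == previous);
}
```

Keeping `gVideoInput` constant across the stack lets any layer reach the original input, while `gLayerInput` carries the chained result.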

The `/api/state` shader list uses the same support rules as runtime shader compilation and reports only packages this app can run today. Unsupported manifest feature sets such as temporal, feedback, texture-backed, font-backed, or text-parameter shaders are hidden from the control UI for now.

Runtime shaders are exposed through `RuntimeLayerModel` as display layers with manifest parameter definitions, current parameter values, build status, and render-ready artifacts. POST controls mutate this app-owned model and may start background shader builds when the selected shader changes.

When a layer becomes render-ready, the app publishes the ready render-layer snapshot to the render thread. The render thread owns the GL-side `RuntimeRenderScene`, diffs that snapshot at a frame boundary, queues new or changed pass programs to the shared-context prepare worker, swaps in a full prepared render plan only after every pass is ready, removes obsolete GL programs, and renders ready layers in order. Stacked stateless full-frame shaders render through internal ping-pong targets so each layer can sample the previous layer through `gLayerInput`; multipass shaders route named intermediate outputs through their manifest-declared pass inputs, and the final ready layer renders to the output target.
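
The "swap only after every pass is ready" rule can be sketched as a pending plan that is promoted at a frame boundary. This is not `RuntimeRenderScene`: all type and member names are hypothetical, the `prepared` flag stands in for whatever the prepare worker reports, and thread handoff is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one pass program of a layer.
struct PassProgramSketch {
    std::string passName;
    bool prepared = false;  // set once compile/link has finished elsewhere
};

struct RenderPlanSketch {
    std::vector<PassProgramSketch> passes;

    // A plan is usable only when it has passes and all of them are prepared.
    bool allPrepared() const {
        return !passes.empty() &&
               std::all_of(passes.begin(), passes.end(),
                           [](const PassProgramSketch& p) { return p.prepared; });
    }
};

struct PlanSlotSketch {
    RenderPlanSketch active;
    std::optional<RenderPlanSketch> pending;

    // Called once per frame boundary. The active plan is replaced as a whole
    // or left untouched; it is never partially updated.
    bool commitAtFrameBoundary() {
        if (!pending || !pending->allPrepared()) return false;
        active = std::move(*pending);
        pending.reset();
        return true;
    }
};

inline void planCommitExample() {
    PlanSlotSketch slot;
    slot.pending = RenderPlanSketch{{{"blur", true}, {"layerOutput", false}}};
    assert(!slot.commitAtFrameBoundary());  // one pass still unprepared
    assert(slot.active.passes.empty());

    slot.pending->passes[1].prepared = true;
    assert(slot.commitAtFrameBoundary());   // whole plan swaps in at once
    assert(slot.active.passes.size() == 2);
    assert(!slot.pending.has_value());
}
```

Until the commit succeeds the previous plan keeps rendering, so a slow or failed pass never leaves a half-updated multipass layer on screen.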

Successful handoff signs:

- telemetry shows `shaderCommitted=1`
- output changes from the simple motion pattern to the Happy Accident shader
- render/schedule cadence remains near 60 fps during and after the handoff
- DeckLink buffer remains stable

## Baseline Result

Date: 2026-05-12

User-visible result:

- output was smooth
- DeckLink held a 4-frame buffer

Representative telemetry:

```text
renderFps=59.9 scheduleFps=59.9 free=8 completed=0 scheduled=4 completedPollMisses=30 scheduleFailures=0 completions=720 late=0 dropped=0 decklinkBuffered=4 scheduleCallMs=1.2
renderFps=59.8 scheduleFps=59.8 free=7 completed=1 scheduled=4 completedPollMisses=36 scheduleFailures=0 completions=1080 late=0 dropped=0 decklinkBuffered=4 scheduleCallMs=4.7
renderFps=59.9 scheduleFps=59.9 free=7 completed=1 scheduled=4 completedPollMisses=86 scheduleFailures=0 completions=1381 late=0 dropped=0 decklinkBuffered=4 scheduleCallMs=2.1
```

Read:

- render cadence and DeckLink schedule cadence both held roughly 60 fps
- app scheduled depth stayed at 4
- actual DeckLink buffered depth stayed at 4
- no late frames, dropped frames, or schedule failures were observed
- completed poll misses were benign because playout remained fully fed

## Tests

```powershell
cmake --build --preset build-debug --target RenderCadenceCompositorFrameExchangeTests -- /m:1
ctest --test-dir build\vs2022-x64-debug -C Debug -R RenderCadenceCompositorFrameExchangeTests --output-on-failure
```

## Relationship To The Probe

`apps/DeckLinkRenderCadenceProbe` proved the timing model in one compact file.

This app keeps the same core behavior but splits it into modules that can grow:

- `frames/`: system-memory handoff
- `platform/`: COM/Win32/hidden GL context support
- `render/`: cadence thread, clock, and simple renderer
- `frames/InputFrameMailbox`: non-blocking latest-frame CPU input handoff with contiguous-copy fast path for matching row strides
- `render/InputFrameTexture`: render-thread-owned upload of the latest CPU input frame into GL, including raw UYVY8 decode into the shader-visible input texture
- `render/readback/`: PBO-backed BGRA8 readback and completed-frame publication
- `render/runtime/RuntimeRenderScene`: render-thread-owned GL scene for ready runtime shader layers
- `render/runtime/RuntimeShaderPrepareWorker`: shared-context runtime shader program compile/link worker
- `runtime/`: app-owned shader layer readiness model, runtime Slang build bridge, and completed artifact handoff
- `control/`: control action results and runtime-state JSON presentation
- `control/http/`: local HTTP API, static UI serving, OpenAPI serving, and WebSocket updates
- `json/`: compact JSON serialization helpers
- `video/`: DeckLink output wrapper and scheduling thread
- `telemetry/`: cadence telemetry
- `telemetry/TelemetryHealthMonitor`: quiet health event logging from telemetry samples
- `app/`: startup/shutdown orchestration
- `app/AppConfigProvider`: startup config loading and CLI overrides

## Next Porting Steps

Only after this app matches the probe's smooth output:

1. replace `SimpleMotionRenderer` with a render-scene interface
2. port shader package rendering
3. port runtime snapshots/live state
4. add control services
5. add preview/screenshot from system-memory frames
6. add scaling and additional input format support after the BGRA8/raw-UYVY8 input edge is stable