video-shader-toys/docs/NEW_RENDER_CADENCE_APP_PLAN.md
Aiden bc690e2a87
Clean up pass
2026-05-12 13:14:52 +10:00

New Render Cadence App Plan

This plan describes a new application folder that rebuilds the output path from the proven DeckLinkRenderCadenceProbe architecture, but as a maintainable app foundation rather than a monolithic probe file.

The first goal is not to port the current compositor feature set. The first goal is to reproduce the probe's smooth 59.94/60 fps DeckLink output with clean module boundaries, tests where possible, and a structure that can later accept the shader/runtime/control systems without compromising timing.

Working Name

Suggested folder:

apps/RenderCadenceCompositor

Suggested executable:

RenderCadenceCompositor

The existing app remains intact:

apps/LoopThroughWithOpenGLCompositing

The probe remains the control sample:

apps/DeckLinkRenderCadenceProbe

Design Principle

The app is built around one spine:

Render cadence thread
  -> owns GL context
  -> renders at selected frame cadence
  -> performs async BGRA8 readback
  -> publishes completed system-memory frames

System frame exchange
  -> owns Free / Rendering / Completed / Scheduled slots
  -> latest-N semantics for completed unscheduled frames
  -> protects scheduled frames until DeckLink completion

DeckLink output thread
  -> consumes completed frames
  -> schedules to target buffer depth
  -> releases scheduled frames on completion
  -> never renders

Everything else must fit around that spine.

Non-Negotiable Rules

  • The render thread owns its GL context from initialization to shutdown.
  • The render thread is driven by selected render cadence, not DeckLink demand.
  • DeckLink scheduling never calls render code.
  • Completion callbacks never render.
  • No synchronous render request exists in the output path.
  • Preview, screenshot, input upload, shader rebuild, and runtime control cannot run ahead of a due output frame.
  • Completed unscheduled frames are latest-N and disposable.
  • Scheduled frames are protected until DeckLink completion.
  • Startup warms up real rendered frames before scheduled playback starts.

Borrow From The Probe

Keep these behaviors from DeckLinkRenderCadenceProbe:

  • hidden OpenGL context owned by the render thread
  • simple render loop with nextRenderTime
  • BGRA8 render target
  • PBO ring readback
  • non-blocking fence polling with zero timeout
  • system-memory slots with Free, Rendering, Completed, Scheduled
  • drop oldest completed unscheduled frame if render needs space
  • DeckLink playout thread only schedules completed frames
  • warmup completed frames before StartScheduledPlayback()
  • one-line-per-second timing telemetry

Do Not Borrow Directly

The probe is deliberately compact. Do not carry over these probe limitations into the new app:

  • one huge .cpp file
  • hard-coded output mode as permanent behavior
  • render pattern, frame store, PBO logic, DeckLink playout, COM setup, and telemetry mixed together
  • no reusable interfaces
  • no unit-testable non-GL core

Proposed Folder Structure

apps/RenderCadenceCompositor/
  README.md
  RenderCadenceCompositor.cpp

  app/
    RenderCadenceApp.cpp
    RenderCadenceApp.h
    AppConfig.cpp
    AppConfig.h
    AppConfigProvider.cpp
    AppConfigProvider.h

  control/
    HttpControlServer.cpp
    HttpControlServer.h
    RuntimeStateJson.h

  platform/
    ComInit.cpp
    ComInit.h
    HiddenGlWindow.cpp
    HiddenGlWindow.h
    Win32Console.cpp
    Win32Console.h

  render/
    RenderThread.cpp
    RenderThread.h
    RenderCadenceClock.cpp
    RenderCadenceClock.h
    SimpleMotionRenderer.cpp
    SimpleMotionRenderer.h
    Bgra8ReadbackPipeline.cpp
    Bgra8ReadbackPipeline.h
    PboReadbackRing.cpp
    PboReadbackRing.h

  frames/
    SystemFrameExchange.cpp
    SystemFrameExchange.h
    SystemFrameTypes.h

  video/
    DeckLinkOutput.cpp
    DeckLinkOutput.h
    DeckLinkOutputThread.cpp
    DeckLinkOutputThread.h

  telemetry/
    CadenceTelemetry.cpp
    CadenceTelemetry.h
    CadenceTelemetryJson.h
    TelemetryHealthMonitor.h

  logging/
    Logger.cpp
    Logger.h

  json/
    JsonWriter.cpp
    JsonWriter.h

The new app can reuse selected existing source files from the current app at first:

  • videoio/decklink/DeckLinkSession.*
  • videoio/decklink/DeckLinkDisplayMode.*
  • videoio/decklink/DeckLinkVideoIOFormat.*
  • videoio/decklink/DeckLinkFrameTransfer.*
  • videoio/VideoIOFormat.*
  • videoio/VideoIOTypes.h
  • videoio/VideoPlayoutScheduler.*
  • gl/renderer/GLExtensions.*

Longer term, shared code should move into common libraries, but the first version can link these files directly to avoid a big build-system refactor.

Module Responsibilities

RenderCadenceApp

Owns top-level startup/shutdown sequencing.

Responsibilities:

  • initialize COM
  • discover/select DeckLink output
  • create frame exchange
  • start render thread
  • wait for completed-frame warmup
  • start DeckLink output thread
  • wait for scheduled buffer warmup
  • start DeckLink scheduled playback
  • start telemetry health monitor
  • stop in reverse order

It should not contain OpenGL drawing code, frame slot policy, or DeckLink scheduling loops.

AppConfig

Owns runtime settings for the initial app.

Initial settings:

  • output mode preference
  • output width/height validation
  • frame buffer capacity
  • PBO depth
  • warmup completed-frame count
  • target DeckLink scheduled depth
  • telemetry interval

Initial values should match the successful probe:

systemFrameSlots = 12
pboDepth = 6
warmupFrames = 4
targetDeckLinkBufferedFrames = 4
pixelFormat = BGRA8
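
As a rough illustration, the initial settings could be held in a plain struct with the probe's values as defaults. This is a minimal sketch, not the app's actual AppConfig: the field types, the empty-string meaning of outputModePreference, the one-second telemetry interval, and the IsValid() rules are all assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical sketch: defaults mirror the probe values listed in the plan.
// Field types and validation rules are assumptions, not the real AppConfig.
enum class OutputPixelFormat { Bgra8 };

struct AppConfig {
    std::string outputModePreference;              // empty = no preference (assumption)
    std::size_t systemFrameSlots = 12;             // frame buffer capacity
    std::size_t pboDepth = 6;
    std::size_t warmupFrames = 4;                  // completed frames required before playback
    std::size_t targetDeckLinkBufferedFrames = 4;  // target scheduled depth
    OutputPixelFormat pixelFormat = OutputPixelFormat::Bgra8;
    std::chrono::milliseconds telemetryInterval{1000};  // assumed one-second interval
};

// Returns true when the settings are internally consistent.
inline bool IsValid(const AppConfig& config) {
    if (config.systemFrameSlots == 0 || config.pboDepth == 0) return false;
    if (config.warmupFrames == 0 || config.targetDeckLinkBufferedFrames == 0) return false;
    if (config.telemetryInterval.count() <= 0) return false;
    // Assumed rule: at least one slot must remain for rendering while the
    // target number of frames is scheduled, and warmup must fit in the pool.
    return config.systemFrameSlots > config.targetDeckLinkBufferedFrames &&
           config.systemFrameSlots >= config.warmupFrames;
}
```

Keeping validation as a free function over a plain struct means it can be unit tested without DeckLink or GL.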

HiddenGlWindow

Owns hidden Win32 window, device context, and OpenGL context creation.

Responsibilities:

  • create hidden window with CS_OWNDC
  • choose/set pixel format
  • create HGLRC
  • expose MakeCurrent() and ClearCurrent()
  • destroy context/window safely

Only RenderThread should call MakeCurrent() after startup.

RenderThread

Owns the render loop and GL context for its full lifetime.

Responsibilities:

  • create/bind hidden GL context
  • resolve GL extensions
  • initialize renderer/readback pipeline
  • run cadence loop
  • render one frame when due
  • queue PBO readback
  • consume completed PBOs into SystemFrameExchange
  • record telemetry
  • destroy GL resources on the render thread

It must not:

  • wait for DeckLink
  • schedule DeckLink frames
  • block waiting for a system frame slot when a completed unscheduled frame can be dropped instead
  • accept arbitrary GL tasks ahead of output frames

RenderCadenceClock

Small, testable cadence helper.

Responsibilities:

  • track target frame duration
  • return whether a render is due
  • compute sleep duration
  • detect overrun/skipped ticks
  • never speed up to fill buffers

This should be unit tested without GL.
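
One possible shape for this helper is sketched below. The method names (IsRenderDue, SleepDuration, MarkRendered, SkippedTicks) are assumptions for illustration. Time is passed in explicitly so tests can drive the clock without sleeping, and an overrun skips the missed ticks rather than rendering a burst to catch up.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical sketch of a GL-free cadence helper; names are assumptions.
class RenderCadenceClock {
public:
    using Clock = std::chrono::steady_clock;
    using Duration = Clock::duration;
    using TimePoint = Clock::time_point;

    RenderCadenceClock(Duration frameDuration, TimePoint firstRenderTime)
        : frameDuration_(frameDuration), nextRenderTime_(firstRenderTime) {
        assert(frameDuration_ > Duration::zero());
    }

    // True once the next cadence tick has been reached.
    bool IsRenderDue(TimePoint now) const { return now >= nextRenderTime_; }

    // How long the render loop may sleep before the next tick.
    Duration SleepDuration(TimePoint now) const {
        return IsRenderDue(now) ? Duration::zero() : nextRenderTime_ - now;
    }

    // Call with the time at which the due frame was started. Advances one
    // tick; any further ticks already in the past are skipped and counted,
    // so the loop never renders faster than cadence to refill buffers.
    std::uint64_t MarkRendered(TimePoint now) {
        nextRenderTime_ += frameDuration_;
        if (now < nextRenderTime_) return 0;
        const auto missed = (now - nextRenderTime_) / frameDuration_ + 1;
        nextRenderTime_ += frameDuration_ * missed;
        skippedTicks_ += static_cast<std::uint64_t>(missed);
        return static_cast<std::uint64_t>(missed);
    }

    std::uint64_t SkippedTicks() const { return skippedTicks_; }

private:
    Duration frameDuration_;
    TimePoint nextRenderTime_;
    std::uint64_t skippedTicks_ = 0;
};
```

With a 10 ms frame duration, a frame started 35 ms after the first tick advances the next render time to 40 ms and records two skipped ticks (20 ms and 30 ms).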

SimpleMotionRenderer

First renderer only.

Responsibilities:

  • render obvious smooth motion and color changes
  • produce BGRA8-compatible framebuffer content
  • make dropped/repeated frames visually obvious

This intentionally avoids shader-package/runtime complexity.

Bgra8ReadbackPipeline

Owns output framebuffer and BGRA8 readback orchestration.

Responsibilities:

  • configure render target dimensions
  • render into an RGBA8/BGRA-compatible texture
  • coordinate PboReadbackRing
  • publish completed frames into SystemFrameExchange

PboReadbackRing

Owns PBO/fence state.

Responsibilities:

  • queue readback into the next free PBO slot
  • poll completed fences with zero timeout
  • map/copy completed PBOs into provided system-memory slots
  • count PBO misses
  • clean up fences/PBOs on render thread

This is GL-backed, but the state model should be small and easy to reason about.
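
To show how small the state model can be, here is a GL-free sketch of the ring bookkeeping only. The class name PboRingModel and its methods are hypothetical. The isSignaled callback stands in for a zero-timeout fence check such as glClientWaitSync(fence, 0, 0), and the comments mark where the real GL calls would go.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical GL-free model of the PBO ring state; not the real class.
class PboRingModel {
public:
    explicit PboRingModel(std::size_t depth) : inFlight_(depth, false) {
        assert(depth > 0);
    }

    // Claims the next ring slot for a readback. Returns nullopt (a PBO miss)
    // if that slot still holds an unconsumed readback.
    std::optional<std::size_t> QueueReadback() {
        if (inFlight_[writeIndex_]) {
            ++misses_;
            return std::nullopt;
        }
        const std::size_t slot = writeIndex_;
        inFlight_[slot] = true;  // real code: glReadPixels into the PBO, then glFenceSync
        writeIndex_ = (writeIndex_ + 1) % inFlight_.size();
        return slot;
    }

    // Polls the oldest in-flight slot without blocking. Returns the slot
    // index when its fence has signaled, otherwise nullopt.
    std::optional<std::size_t> PollCompleted(
        const std::function<bool(std::size_t)>& isSignaled) {
        if (!inFlight_[readIndex_] || !isSignaled(readIndex_)) return std::nullopt;
        const std::size_t slot = readIndex_;
        inFlight_[slot] = false;  // real code: map, copy to system memory, unmap, delete fence
        readIndex_ = (readIndex_ + 1) % inFlight_.size();
        return slot;
    }

    std::uint64_t Misses() const { return misses_; }

private:
    std::vector<bool> inFlight_;
    std::size_t writeIndex_ = 0;
    std::size_t readIndex_ = 0;
    std::uint64_t misses_ = 0;
};
```

This model consumes readbacks strictly in queue order, which is an assumption; a real ring could also check every in-flight fence on each poll.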

SystemFrameExchange

The central handoff between render and video.

Responsibilities:

  • own system-memory frame buffers
  • track slot states: Free, Rendering, Completed, Scheduled
  • provide AcquireForRender()
  • provide PublishCompleted()
  • provide ConsumeCompletedForSchedule()
  • provide ReleaseScheduledByBytes()
  • drop oldest completed unscheduled frame when render needs a slot
  • expose metrics

This should be unit tested heavily.
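
The slot lifecycle can be sketched as a small mutex-guarded state machine. The method names below come from the list above, but the signatures, the index-based handles, and the oldest-first consume order are assumptions. The sketch also omits the generation checks that a real implementation would need to reject stale handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <vector>

// Hypothetical sketch of the slot state machine; signatures are assumptions.
enum class SlotState { Free, Rendering, Completed, Scheduled };

class SystemFrameExchange {
public:
    SystemFrameExchange(std::size_t slotCount, std::size_t bytesPerFrame)
        : slots_(slotCount) {
        assert(slotCount > 0 && bytesPerFrame > 0);
        for (auto& slot : slots_) slot.bytes.resize(bytesPerFrame);
    }

    // Render side: take a Free slot, else drop the oldest Completed frame.
    // Scheduled slots are never taken. Returns nullopt on an acquire miss.
    std::optional<std::size_t> AcquireForRender() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].state == SlotState::Free) {
                slots_[i].state = SlotState::Rendering;
                return i;
            }
        }
        if (!completed_.empty()) {
            const std::size_t oldest = completed_.front();
            completed_.pop_front();
            ++completedDrops_;
            slots_[oldest].state = SlotState::Rendering;
            return oldest;
        }
        ++acquireMisses_;
        return std::nullopt;
    }

    // Render side: mark a Rendering slot as Completed.
    bool PublishCompleted(std::size_t index) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index >= slots_.size() || slots_[index].state != SlotState::Rendering) return false;
        slots_[index].state = SlotState::Completed;
        completed_.push_back(index);
        return true;
    }

    // Output side: take the oldest Completed frame and protect it as Scheduled.
    std::optional<std::size_t> ConsumeCompletedForSchedule() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (completed_.empty()) return std::nullopt;
        const std::size_t index = completed_.front();
        completed_.pop_front();
        slots_[index].state = SlotState::Scheduled;
        return index;
    }

    // Completion side: free the Scheduled slot that owns this buffer.
    bool ReleaseScheduledByBytes(const void* bytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& slot : slots_) {
            if (slot.state == SlotState::Scheduled && slot.bytes.data() == bytes) {
                slot.state = SlotState::Free;
                return true;
            }
        }
        return false;
    }

    std::uint8_t* Bytes(std::size_t index) { return slots_[index].bytes.data(); }
    std::uint64_t CompletedDrops() const { std::lock_guard<std::mutex> lock(mutex_); return completedDrops_; }
    std::uint64_t AcquireMisses() const { std::lock_guard<std::mutex> lock(mutex_); return acquireMisses_; }

private:
    struct Slot {
        SlotState state = SlotState::Free;
        std::vector<std::uint8_t> bytes;
    };
    mutable std::mutex mutex_;
    std::vector<Slot> slots_;
    std::deque<std::size_t> completed_;  // Completed slot indices, oldest first
    std::uint64_t completedDrops_ = 0;
    std::uint64_t acquireMisses_ = 0;
};
```

Because the class has no GL or DeckLink dependency, the lifecycle cases listed under the second milestone can be exercised directly in unit tests.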

DeckLinkOutput

Thin wrapper around DeckLinkSession for output-only use.

Responsibilities:

  • discover/select output mode
  • configure output callback
  • prepare output schedule
  • schedule app-owned system-memory frames
  • start scheduled playback
  • stop/release resources
  • expose actual DeckLink buffered count

No input support in the first version.

DeckLinkOutputThread

Owns playout scheduling loop.

Responsibilities:

  • keep scheduled depth near target
  • consume completed frames from SystemFrameExchange
  • schedule them through DeckLinkOutput
  • release frame if scheduling fails
  • sleep briefly when scheduled buffer is full or no completed frame exists

It must not render.
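
A single iteration of the scheduling loop might look like the sketch below. PlayoutHooks, PlayoutStep, and RunPlayoutStep are hypothetical names. The callbacks stand in for DeckLinkOutput and SystemFrameExchange so the decision logic can be tested without hardware, and nothing in the step can reach render code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>

// Hypothetical sketch: callbacks stand in for DeckLinkOutput and
// SystemFrameExchange; none of these names come from the real app.
struct PlayoutHooks {
    std::function<unsigned()> bufferedFrameCount;                  // actual DeckLink buffered count
    std::function<std::optional<std::size_t>()> consumeCompleted;  // next completed slot, if any
    std::function<bool(std::size_t)> schedule;                     // true when scheduling succeeded
    std::function<void(std::size_t)> release;                      // return a slot to the exchange
};

enum class PlayoutStep { Scheduled, BufferFull, NoFrame, ScheduleFailed };

// Runs one scheduling decision. The caller sleeps briefly on BufferFull
// or NoFrame and otherwise loops again immediately.
inline PlayoutStep RunPlayoutStep(const PlayoutHooks& hooks, unsigned targetDepth) {
    assert(targetDepth > 0);
    if (hooks.bufferedFrameCount() >= targetDepth) return PlayoutStep::BufferFull;

    const std::optional<std::size_t> slot = hooks.consumeCompleted();
    if (!slot) return PlayoutStep::NoFrame;

    if (!hooks.schedule(*slot)) {
        hooks.release(*slot);  // do not leak the slot when scheduling fails
        return PlayoutStep::ScheduleFailed;
    }
    return PlayoutStep::Scheduled;
}
```

Returning an enum rather than sleeping inside the step keeps the timing policy in the thread loop and the decision logic testable.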

CadenceTelemetry

Owns counters, not policy.

Initial counters:

  • rendered frames
  • completed readback frames
  • scheduled frames
  • completion count
  • completed-frame drops
  • acquire misses
  • schedule underruns
  • PBO queue misses
  • DeckLink late count
  • DeckLink dropped count
  • free/rendering/completed/scheduled slot counts
  • actual DeckLink buffered frames
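
A counters-only telemetry object could be sketched with relaxed atomics and a plain snapshot struct. Only a subset of the counters above is shown, and the method and field names are assumptions. Relaxed ordering is used on the assumption that the counters are independent statistics and are not used to synchronize other data between threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sketch covering a subset of the planned counters.
struct CadenceTelemetrySnapshot {
    std::uint64_t renderedFrames = 0;
    std::uint64_t completedReadbackFrames = 0;
    std::uint64_t scheduledFrames = 0;
    std::uint64_t completedFrameDrops = 0;
    std::uint64_t pboQueueMisses = 0;
};

class CadenceTelemetry {
public:
    void AddRendered() { Bump(renderedFrames_); }
    void AddCompletedReadback() { Bump(completedReadbackFrames_); }
    void AddScheduled() { Bump(scheduledFrames_); }
    void AddCompletedFrameDrop() { Bump(completedFrameDrops_); }
    void AddPboQueueMiss() { Bump(pboQueueMisses_); }

    // Copies the current values. Counters are read one by one, so the
    // snapshot is not a single atomic view across all fields.
    CadenceTelemetrySnapshot Snapshot() const {
        CadenceTelemetrySnapshot s;
        s.renderedFrames = renderedFrames_.load(std::memory_order_relaxed);
        s.completedReadbackFrames = completedReadbackFrames_.load(std::memory_order_relaxed);
        s.scheduledFrames = scheduledFrames_.load(std::memory_order_relaxed);
        s.completedFrameDrops = completedFrameDrops_.load(std::memory_order_relaxed);
        s.pboQueueMisses = pboQueueMisses_.load(std::memory_order_relaxed);
        return s;
    }

private:
    using Counter = std::atomic<std::uint64_t>;
    static void Bump(Counter& counter) { counter.fetch_add(1, std::memory_order_relaxed); }

    Counter renderedFrames_{0};
    Counter completedReadbackFrames_{0};
    Counter scheduledFrames_{0};
    Counter completedFrameDrops_{0};
    Counter pboQueueMisses_{0};
};
```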

TelemetryHealthMonitor

Samples cadence telemetry once per interval and logs only health events.

Normal telemetry is available through the HTTP state endpoint. The console should not receive a healthy once-per-second cadence line.

Health events:

  • warning when DeckLink late/dropped-frame counters increase
  • warning when schedule failures increase
  • error when app/DeckLink output buffering is starved
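
The health rules above can be sketched as a pure comparison between two samples, which keeps the policy testable without a timer thread. HealthSample, HealthEvent, and EvaluateHealth are hypothetical names. Treating a DeckLink buffered count of zero as "starved" is an assumption, since the plan does not define the threshold.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch; names and the starvation threshold are assumptions.
struct HealthSample {
    std::uint64_t deckLinkLateCount;
    std::uint64_t deckLinkDroppedCount;
    std::uint64_t scheduleFailures;
    std::uint32_t deckLinkBufferedFrames;
};

enum class HealthSeverity { Warning, Error };

struct HealthEvent {
    HealthSeverity severity;
    std::string message;
};

// Compares the previous and current interval samples and returns only
// the events worth logging. A healthy interval returns an empty list.
inline std::vector<HealthEvent> EvaluateHealth(const HealthSample& previous,
                                               const HealthSample& current) {
    std::vector<HealthEvent> events;
    if (current.deckLinkLateCount > previous.deckLinkLateCount ||
        current.deckLinkDroppedCount > previous.deckLinkDroppedCount) {
        events.push_back({HealthSeverity::Warning, "DeckLink late/dropped frames increased"});
    }
    if (current.scheduleFailures > previous.scheduleFailures) {
        events.push_back({HealthSeverity::Warning, "schedule failures increased"});
    }
    if (current.deckLinkBufferedFrames == 0) {  // assumed definition of starvation
        events.push_back({HealthSeverity::Error, "DeckLink output buffering is starved"});
    }
    return events;
}
```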

Startup Sequence

Target first-version startup:

main
  -> load AppConfig through AppConfigProvider
  -> initialize COM
  -> create SystemFrameExchange
  -> start RenderThread
  -> wait for completed frame warmup
  -> optionally discover/select/configure DeckLink output
  -> if DeckLink is available:
       -> start DeckLinkOutputThread
       -> wait for scheduled depth warmup
       -> DeckLinkOutput start scheduled playback
  -> if DeckLink is unavailable:
       -> continue without video output
  -> start TelemetryHealthMonitor
  -> start HttpControlServer
  -> wait for Enter

Shutdown:

stop HttpControlServer
stop TelemetryHealthMonitor
stop DeckLinkOutputThread
DeckLinkOutput stop playback
stop RenderThread
DeckLinkOutput release resources
release COM

First Milestone: Modular Probe Equivalent

This is the only goal for the initial implementation.

Feature set:

  • console app
  • output-only DeckLink
  • no input
  • hidden GL context
  • simple motion renderer
  • BGRA8 only
  • PBO async readback
  • latest-N system-memory frame exchange
  • warmup before playback
  • one-line telemetry

Acceptance:

  • visible DeckLink output is smooth
  • renderFps near selected cadence
  • scheduleFps near selected cadence
  • scheduled count and DeckLink buffered count stable around 4
  • no continuous late/drop count
  • no continuous PBO misses
  • behavior matches or exceeds DeckLinkRenderCadenceProbe

Second Milestone: Testable Core

Before porting compositor features, add tests for non-GL/non-DeckLink pieces.

Test targets:

  • SystemFrameExchangeTests
  • RenderCadenceClockTests
  • CadenceTelemetryTests

Important cases:

  • slot lifecycle transitions
  • scheduled slots are protected
  • completed unscheduled frames can be dropped
  • stale handles/generations are rejected
  • cadence does not speed up to refill buffers
  • cadence records overrun/skipped ticks

Third Milestone: Replace Simple Renderer With Render Interface

Add an interface around frame rendering:

IRenderScene
  -> InitializeGl()
  -> RenderFrame(frameIndex, time)
  -> ShutdownGl()

The first implementation remains SimpleMotionRenderer.

This creates the insertion point for shader-package rendering later without changing timing/scheduling.
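
In C++ the interface could be declared roughly as follows. The three method names come from the outline above, while the return types, the parameter types, and the CountingRenderScene test double are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the interface; return and parameter types are assumptions.
// All three methods are expected to be called on the render thread only.
class IRenderScene {
public:
    virtual ~IRenderScene() = default;
    virtual bool InitializeGl() = 0;
    virtual void RenderFrame(std::uint64_t frameIndex, double timeSeconds) = 0;
    virtual void ShutdownGl() = 0;
};

// Hypothetical GL-free stand-in for exercising the render loop in tests.
class CountingRenderScene final : public IRenderScene {
public:
    bool InitializeGl() override {
        initialized_ = true;
        return true;
    }

    void RenderFrame(std::uint64_t frameIndex, double /*timeSeconds*/) override {
        assert(initialized_);  // rendering before InitializeGl() is a bug
        lastFrameIndex_ = frameIndex;
        ++framesRendered_;
    }

    void ShutdownGl() override { initialized_ = false; }

    bool IsInitialized() const { return initialized_; }
    std::uint64_t FramesRendered() const { return framesRendered_; }
    std::uint64_t LastFrameIndex() const { return lastFrameIndex_; }

private:
    bool initialized_ = false;
    std::uint64_t framesRendered_ = 0;
    std::uint64_t lastFrameIndex_ = 0;
};
```

SimpleMotionRenderer would implement the same interface, so RenderThread only depends on IRenderScene and the cadence loop stays unchanged when a shader-package scene is added.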

Fourth Milestone: Begin Porting Current App Features

Port only after the modular probe equivalent is stable.

Suggested order:

  1. shader package compile/load
  2. render pass/layer stack drawing
  3. runtime snapshot input to renderer
  4. live state overlays
  5. control services
  6. persistence/runtime store
  7. preview from system-memory frames
  8. screenshot from system-memory frames
  9. input capture via CPU latest-frame mailbox

Each port must preserve the rule that the render thread cadence is primary.

What Not To Port Early

Do not port these until the output spine is proven:

  • DeckLink input
  • preview GL presentation
  • screenshot GL readback
  • HTTP/OSC control services
  • shader hot reload
  • persistence
  • runtime state JSON/open API
  • complex telemetry/event dispatch

These are useful, but they are exactly the kinds of features that can accidentally reintroduce timing coupling.

Build Plan

Initial CMake can follow the probe pattern:

set(RENDER_CADENCE_APP_DIR "${CMAKE_CURRENT_SOURCE_DIR}/apps/RenderCadenceCompositor")

add_executable(RenderCadenceCompositor
  # selected shared DeckLink/video/gl support files
  # new modular app files
)

Later, shared source should be split into libraries:

video_shader_decklink
video_shader_videoio
video_shader_gl_support
render_cadence_core

Avoid doing that library split before the first modular app works.

VS Code Launch

Add a separate launch profile:

Debug RenderCadenceCompositor

Run it as a console app so telemetry remains visible.

Documentation

Add:

apps/RenderCadenceCompositor/README.md

The README should record:

  • intended architecture
  • build/run instructions
  • expected telemetry
  • test result notes
  • differences from the old app
  • differences from the probe

Success Criteria Before Porting More Features

Do not start feature porting until the new app can run with:

  • stable smooth DeckLink output
  • stable target scheduled depth
  • stable actual DeckLink buffered count
  • no regular visible freezes
  • no steady PBO misses
  • no steadily increasing late/dropped completions
  • focus/minimize changes do not affect output cadence
  • clean shutdown without hangs

This gives us a clean foundation. Once this is true, every feature added later has to prove it does not damage the spine.