New Render Cadence App Plan
Status: historical implementation plan. apps/RenderCadenceCompositor now exists; use apps/RenderCadenceCompositor/README.md and Render Cadence Golden Rules as the current implementation contract.
This plan describes a new application folder that rebuilds the output path from the proven DeckLinkRenderCadenceProbe architecture, but as a maintainable app foundation rather than a monolithic probe file.
The first goal is not to port the current compositor feature set. The first goal is to reproduce the probe's smooth 59.94/60 fps DeckLink output with clean module boundaries, tests where possible, and a structure that can later accept the shader/runtime/control systems without compromising timing.
Working Name
Suggested folder:
apps/RenderCadenceCompositor
Suggested executable:
RenderCadenceCompositor
The existing app remains intact:
apps/LoopThroughWithOpenGLCompositing
The probe remains the control sample:
apps/DeckLinkRenderCadenceProbe
Design Principle
The app is built around one spine:
Render cadence thread
  -> owns GL context
  -> renders at selected frame cadence
  -> performs async BGRA8 readback
  -> publishes completed system-memory frames
System frame exchange
  -> owns Free / Rendering / Completed / Scheduled slots
  -> bounded FIFO reserve for completed unscheduled frames
  -> protects scheduled frames until DeckLink completion
DeckLink output thread
  -> consumes completed frames
  -> schedules to target buffer depth
  -> releases scheduled frames on completion
  -> never renders
Everything else must fit around that spine.
Non-Negotiable Rules
- The render thread owns its GL context from initialization to shutdown.
- The render thread is driven by selected render cadence, not DeckLink demand.
- DeckLink scheduling never calls render code.
- Completion callbacks never render.
- No synchronous render request exists in the output path.
- Preview, screenshot, input upload, shader rebuild, and runtime control cannot run ahead of a due output frame.
- Completed unscheduled frames are a bounded FIFO reserve; overflow drops are counted separately from DeckLink drops.
- Scheduled frames are protected until DeckLink completion.
- Startup warms up real rendered frames before scheduled playback starts.
Borrow From The Probe
Keep these behaviors from DeckLinkRenderCadenceProbe:
- hidden OpenGL context owned by the render thread
- simple render loop with nextRenderTime
- BGRA8 render target
- PBO ring readback
- non-blocking fence polling with zero timeout
- system-memory slots with Free, Rendering, Completed, Scheduled
- preserve completed frames waiting for playout; drop/count the oldest completed frame only if the bounded reserve overflows
- DeckLink playout thread only schedules completed frames
- warmup completed frames before StartScheduledPlayback()
- one-line-per-second timing telemetry
Do Not Borrow Directly
The probe is deliberately compact. Do not carry over these probe limitations into the new app:
- one huge .cpp file
- hard-coded output mode as permanent behavior
- render pattern, frame store, PBO logic, DeckLink playout, COM setup, and telemetry mixed together
- no reusable interfaces
- no unit-testable non-GL core
Proposed Folder Structure
apps/RenderCadenceCompositor/
  README.md
  RenderCadenceCompositor.cpp
  app/
    RenderCadenceApp.cpp
    RenderCadenceApp.h
    AppConfig.cpp
    AppConfig.h
    AppConfigProvider.cpp
    AppConfigProvider.h
  control/
    HttpControlServer.cpp
    HttpControlServer.h
    RuntimeStateJson.h
  platform/
    ComInit.cpp
    ComInit.h
    HiddenGlWindow.cpp
    HiddenGlWindow.h
    Win32Console.cpp
    Win32Console.h
  render/
    RenderThread.cpp
    RenderThread.h
    RenderCadenceClock.cpp
    RenderCadenceClock.h
    SimpleMotionRenderer.cpp
    SimpleMotionRenderer.h
    Bgra8ReadbackPipeline.cpp
    Bgra8ReadbackPipeline.h
    PboReadbackRing.cpp
    PboReadbackRing.h
  frames/
    SystemFrameExchange.cpp
    SystemFrameExchange.h
    SystemFrameTypes.h
  video/
    DeckLinkOutput.cpp
    DeckLinkOutput.h
    DeckLinkOutputThread.cpp
    DeckLinkOutputThread.h
  telemetry/
    CadenceTelemetry.cpp
    CadenceTelemetry.h
    CadenceTelemetryJson.h
    TelemetryHealthMonitor.h
  logging/
    Logger.cpp
    Logger.h
  json/
    JsonWriter.cpp
    JsonWriter.h
The new app can reuse selected existing source files from the current app at first:
- videoio/decklink/DeckLinkSession.*
- videoio/decklink/DeckLinkDisplayMode.*
- videoio/decklink/DeckLinkVideoIOFormat.*
- videoio/decklink/DeckLinkFrameTransfer.*
- videoio/VideoIOFormat.*
- videoio/VideoIOTypes.h
- videoio/VideoPlayoutScheduler.*
- gl/renderer/GLExtensions.*
Longer term, shared code should move into common libraries, but the first version can link these files directly to avoid a big build-system refactor.
Module Responsibilities
RenderCadenceApp
Owns top-level startup/shutdown sequencing.
Responsibilities:
- initialize COM
- discover/select DeckLink output
- create frame exchange
- start render thread
- wait for completed-frame warmup
- start DeckLink output thread
- wait for scheduled buffer warmup
- start DeckLink scheduled playback
- start telemetry printer
- stop in reverse order
It should not contain OpenGL drawing code, frame slot policy, or DeckLink scheduling loops.
AppConfig
Owns runtime settings for the initial app.
Initial settings:
- output mode preference
- output width/height validation
- frame buffer capacity
- PBO depth
- warmup completed-frame count
- target DeckLink scheduled depth
- telemetry interval
Initial values should match the successful probe:
systemFrameSlots = 12
pboDepth = 6
warmupFrames = 4
targetDeckLinkBufferedFrames = 4
pixelFormat = BGRA8
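As a minimal sketch, these defaults could live in a plain struct with a small validation helper. The struct layout, the `ValidateAppConfig` name, and every individual check are assumptions made for illustration; the plan only fixes the default values.

```cpp
#include <cstddef>
#include <string>

// Hypothetical settings struct; defaults mirror the probe values listed above.
struct AppConfig {
    std::size_t systemFrameSlots = 12;
    std::size_t pboDepth = 6;
    std::size_t warmupFrames = 4;
    std::size_t targetDeckLinkBufferedFrames = 4;
    std::string pixelFormat = "BGRA8";
};

// Returns an empty string when the settings are usable, otherwise a reason.
// The individual checks are assumptions, not rules stated by this plan.
inline std::string ValidateAppConfig(const AppConfig& config) {
    if (config.pixelFormat != "BGRA8")
        return "only BGRA8 is supported in the first version";
    if (config.pboDepth == 0)
        return "pboDepth must be at least 1";
    if (config.warmupFrames == 0 || config.warmupFrames > config.systemFrameSlots)
        return "warmupFrames must be between 1 and systemFrameSlots";
    // Scheduled slots stay protected until DeckLink completion, so the exchange
    // needs the scheduled depth plus at least one slot left to render into.
    if (config.systemFrameSlots <= config.targetDeckLinkBufferedFrames)
        return "systemFrameSlots must exceed targetDeckLinkBufferedFrames";
    return {};
}
```

Returning a reason string rather than a bool keeps startup failures easy to log before any thread is started.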
HiddenGlWindow
Owns hidden Win32 window, device context, and OpenGL context creation.
Responsibilities:
- create hidden window with CS_OWNDC
- choose/set pixel format
- create HGLRC
- expose MakeCurrent() and ClearCurrent()
- destroy context/window safely
Only RenderThread should call MakeCurrent() after startup.
RenderThread
Owns the render loop and GL context for its full lifetime.
Responsibilities:
- create/bind hidden GL context
- resolve GL extensions
- initialize renderer/readback pipeline
- run cadence loop
- render one frame when due
- queue PBO readback
- consume completed PBOs into SystemFrameExchange
- record telemetry
- destroy GL resources on the render thread
It must not:
- wait for DeckLink
- schedule DeckLink frames
- block waiting for a system frame slot when a completed unscheduled frame can be dropped instead
- accept arbitrary GL tasks ahead of output frames
RenderCadenceClock
Small, testable cadence helper.
Responsibilities:
- track target frame duration
- return whether a render is due
- compute sleep duration
- detect overrun/skipped ticks
- never speed up to fill buffers
This should be unit tested without GL.
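One possible shape for such a helper is sketched below. Time is injected by the caller so the class can be tested without sleeping. The method names and the skip policy (any tick already reached when a frame finishes is counted as skipped instead of being rendered late) are assumptions, not behavior confirmed by the probe.

```cpp
#include <chrono>
#include <cstdint>

class RenderCadenceClock {
public:
    using Clock = std::chrono::steady_clock;
    using Duration = Clock::duration;
    using TimePoint = Clock::time_point;

    RenderCadenceClock(Duration frameDuration, TimePoint start)
        : frameDuration_(frameDuration), nextRenderTime_(start) {}

    bool IsDue(TimePoint now) const { return now >= nextRenderTime_; }

    // How long the render loop may sleep before the next frame is due.
    Duration SleepFor(TimePoint now) const {
        return IsDue(now) ? Duration::zero() : nextRenderTime_ - now;
    }

    // Call after rendering the due frame. Advances by one tick, then skips and
    // counts whole ticks that have already passed, so a late frame never
    // causes a burst of catch-up renders.
    void MarkRendered(TimePoint now) {
        nextRenderTime_ += frameDuration_;
        if (now >= nextRenderTime_) {
            const auto missed = (now - nextRenderTime_) / frameDuration_ + 1;
            skippedTicks_ += static_cast<std::uint64_t>(missed);
            nextRenderTime_ += frameDuration_ * missed;
        }
    }

    std::uint64_t SkippedTicks() const { return skippedTicks_; }

private:
    Duration frameDuration_;
    TimePoint nextRenderTime_;
    std::uint64_t skippedTicks_ = 0;
};
```

Because the clock only ever moves `nextRenderTime_` forward in whole frame durations, nothing in it can render faster than the selected cadence, whatever the buffer levels are.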
SimpleMotionRenderer
First renderer only.
Responsibilities:
- render obvious smooth motion and color changes
- produce BGRA8-compatible framebuffer content
- make dropped/repeated frames visually obvious
This intentionally avoids shader-package/runtime complexity.
Bgra8ReadbackPipeline
Owns output framebuffer and BGRA8 readback orchestration.
Responsibilities:
- configure render target dimensions
- render into an RGBA8/BGRA-compatible texture
- coordinate PboReadbackRing
- publish completed frames into SystemFrameExchange
PboReadbackRing
Owns PBO/fence state.
Responsibilities:
- queue readback into the next free PBO slot
- poll completed fences with zero timeout
- map/copy completed PBOs into provided system-memory slots
- count PBO misses
- clean up fences/PBOs on render thread
This is GL-backed, but the state model should be small and easy to reason about.
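To illustrate how small that state model can be, here is a GL-free sketch of the ring bookkeeping only. The class name `PboRingState` and its methods are hypothetical. The fence check is passed in as a callable; in a real implementation it would presumably wrap a zero-timeout `glClientWaitSync` call, which is an assumption about the eventual code rather than something shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical bookkeeping for a PBO ring; it owns no GL objects.
class PboRingState {
public:
    explicit PboRingState(std::size_t depth) : depth_(depth) {}

    // Reserves the next slot in ring order for a new readback.
    // Returns nullopt and counts a miss when every slot is still in flight.
    std::optional<std::size_t> Queue() {
        if (pending_ == depth_) {
            ++queueMisses_;
            return std::nullopt;
        }
        const std::size_t slot = (tail_ + pending_) % depth_;
        ++pending_;
        return slot;
    }

    // Checks only the oldest in-flight slot. fenceSignaled(slot) must return
    // immediately (zero timeout); this function never waits or retries.
    template <typename FenceSignaled>
    std::optional<std::size_t> PollOldest(FenceSignaled fenceSignaled) {
        if (pending_ == 0 || !fenceSignaled(tail_)) return std::nullopt;
        const std::size_t slot = tail_;
        tail_ = (tail_ + 1) % depth_;
        --pending_;
        return slot;
    }

    std::uint64_t QueueMisses() const { return queueMisses_; }
    std::size_t Pending() const { return pending_; }

private:
    std::size_t depth_;
    std::size_t tail_ = 0;     // oldest in-flight slot
    std::size_t pending_ = 0;  // number of in-flight slots
    std::uint64_t queueMisses_ = 0;
};
```

Keeping the index arithmetic separate from the GL calls means the miss counting and ordering can be unit tested, while the GL-backed class only maps slot indices to PBO and fence handles.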
SystemFrameExchange
The central handoff between render and video.
Responsibilities:
- own system-memory frame buffers
- track slot states: Free, Rendering, Completed, Scheduled
- provide AcquireForRender()
- provide PublishCompleted()
- provide ConsumeCompletedForSchedule()
- provide ReleaseScheduledByBytes()
- drop oldest completed unscheduled frame when render needs a slot
- expose metrics
This should be unit tested heavily.
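The slot lifecycle above can be sketched as a small mutex-guarded state machine. This is an illustration under assumptions: the `FrameHandle` type, the generation counter, the `Bytes()` accessor, and the return types are hypothetical, and only the four method names come from this plan.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <optional>
#include <vector>

enum class SlotState { Free, Rendering, Completed, Scheduled };

// The generation makes a handle from an earlier use of a slot detectably stale.
struct FrameHandle {
    std::size_t index;
    std::uint64_t generation;
};

class SystemFrameExchange {
public:
    SystemFrameExchange(std::size_t slotCount, std::size_t bytesPerFrame)
        : slots_(slotCount) {
        for (Slot& slot : slots_) slot.bytes.resize(bytesPerFrame);
    }

    // Never blocks. Prefers a Free slot, then drops the oldest Completed
    // frame. Rendering and Scheduled slots are never taken.
    std::optional<FrameHandle> AcquireForRender() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].state == SlotState::Free) return BeginRender(i);
        }
        if (!completed_.empty()) {
            const std::size_t oldest = completed_.front();
            completed_.pop_front();
            ++completedDrops_;
            return BeginRender(oldest);
        }
        ++acquireMisses_;
        return std::nullopt;
    }

    bool PublishCompleted(FrameHandle handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!IsCurrent(handle, SlotState::Rendering)) return false;
        slots_[handle.index].state = SlotState::Completed;
        completed_.push_back(handle.index);
        return true;
    }

    std::optional<FrameHandle> ConsumeCompletedForSchedule() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (completed_.empty()) return std::nullopt;
        const std::size_t index = completed_.front();
        completed_.pop_front();
        slots_[index].state = SlotState::Scheduled;
        return FrameHandle{index, slots_[index].generation};
    }

    // Called from the completion path with the scheduled frame's buffer pointer.
    bool ReleaseScheduledByBytes(const void* bytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (Slot& slot : slots_) {
            if (slot.state == SlotState::Scheduled && slot.bytes.data() == bytes) {
                slot.state = SlotState::Free;
                return true;
            }
        }
        return false;
    }

    // Buffers are never resized after construction, so the pointer stays valid.
    std::uint8_t* Bytes(FrameHandle handle) { return slots_[handle.index].bytes.data(); }
    std::uint64_t CompletedDrops() const { std::lock_guard<std::mutex> l(mutex_); return completedDrops_; }
    std::uint64_t AcquireMisses() const { std::lock_guard<std::mutex> l(mutex_); return acquireMisses_; }

private:
    struct Slot {
        SlotState state = SlotState::Free;
        std::uint64_t generation = 0;
        std::vector<std::uint8_t> bytes;
    };
    FrameHandle BeginRender(std::size_t i) {
        slots_[i].state = SlotState::Rendering;
        return FrameHandle{i, ++slots_[i].generation};
    }
    bool IsCurrent(FrameHandle h, SlotState expected) const {
        return h.index < slots_.size() && slots_[h.index].generation == h.generation &&
               slots_[h.index].state == expected;
    }
    mutable std::mutex mutex_;
    std::vector<Slot> slots_;
    std::deque<std::size_t> completed_;  // FIFO of Completed slot indices
    std::uint64_t completedDrops_ = 0;
    std::uint64_t acquireMisses_ = 0;
};
```

In this sketch the FIFO reserve is bounded implicitly by the slot count, and an acquire miss is only possible when every slot is either Rendering or Scheduled.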
DeckLinkOutput
Thin wrapper around DeckLinkSession for output-only use.
Responsibilities:
- discover/select output mode
- configure output callback
- prepare output schedule
- schedule app-owned system-memory frames
- start scheduled playback
- stop/release resources
- expose actual DeckLink buffered count
No input support in the first version.
DeckLinkOutputThread
Owns playout scheduling loop.
Responsibilities:
- keep scheduled depth near target
- consume completed frames from SystemFrameExchange
- schedule them through DeckLinkOutput
- release frame if scheduling fails
- sleep briefly when scheduled buffer is full or no completed frame exists
It must not render.
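A single iteration of that loop might look like the sketch below. `Exchange` and `Output` are duck-typed stand-ins for SystemFrameExchange and DeckLinkOutput; `BufferedFrameCount()`, `ScheduleFrame()`, and `ReleaseScheduled()` are hypothetical method names chosen for the example, not the real interfaces.

```cpp
#include <optional>

enum class PumpResult { Scheduled, BufferFull, NoCompletedFrame, ScheduleFailed };

// One iteration of a hypothetical playout loop. It only moves completed
// frames toward the output and has no path that could trigger a render.
// ConsumeCompletedForSchedule() is assumed to return an optional-like value.
template <typename Exchange, typename Output>
PumpResult PumpOnce(Exchange& exchange, Output& output, unsigned targetDepth) {
    if (output.BufferedFrameCount() >= targetDepth)
        return PumpResult::BufferFull;

    auto frame = exchange.ConsumeCompletedForSchedule();
    if (!frame)
        return PumpResult::NoCompletedFrame;

    if (!output.ScheduleFrame(*frame)) {
        // The slot is already marked Scheduled, so hand it back explicitly.
        exchange.ReleaseScheduled(*frame);
        return PumpResult::ScheduleFailed;
    }
    return PumpResult::Scheduled;
}
```

The thread body would call this in a loop until asked to stop, sleeping briefly whenever the result is `BufferFull` or `NoCompletedFrame`. Returning a result code keeps the sleep and telemetry decisions out of the scheduling step, which also makes the step testable with fakes.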
CadenceTelemetry
Owns counters, not policy.
Initial counters:
- rendered frames
- completed readback frames
- scheduled frames
- completion count
- completed-frame drops
- acquire misses
- schedule underruns
- PBO queue misses
- DeckLink late count
- DeckLink dropped count
- free/rendering/completed/scheduled slot counts
- actual DeckLink buffered frames
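A counters-only telemetry object could be sketched with relaxed atomics and a plain snapshot struct, so the render and output threads can bump counters without taking locks. Only a subset of the counters is shown, and the method and field names are assumptions.

```cpp
#include <atomic>
#include <cstdint>

// Plain copy of the counters at one instant; safe to pass between threads.
struct CadenceTelemetrySnapshot {
    std::uint64_t renderedFrames = 0;
    std::uint64_t scheduledFrames = 0;
    std::uint64_t completedFrameDrops = 0;
    std::uint64_t pboQueueMisses = 0;
};

class CadenceTelemetry {
public:
    void AddRenderedFrame() { Increment(renderedFrames_); }
    void AddScheduledFrame() { Increment(scheduledFrames_); }
    void AddCompletedFrameDrop() { Increment(completedFrameDrops_); }
    void AddPboQueueMiss() { Increment(pboQueueMisses_); }

    // Each counter is read atomically; the set as a whole is not one
    // consistent cut, which is acceptable for once-per-interval reporting.
    CadenceTelemetrySnapshot Snapshot() const {
        CadenceTelemetrySnapshot s;
        s.renderedFrames = renderedFrames_.load(std::memory_order_relaxed);
        s.scheduledFrames = scheduledFrames_.load(std::memory_order_relaxed);
        s.completedFrameDrops = completedFrameDrops_.load(std::memory_order_relaxed);
        s.pboQueueMisses = pboQueueMisses_.load(std::memory_order_relaxed);
        return s;
    }

private:
    static void Increment(std::atomic<std::uint64_t>& counter) {
        counter.fetch_add(1, std::memory_order_relaxed);
    }
    std::atomic<std::uint64_t> renderedFrames_{0};
    std::atomic<std::uint64_t> scheduledFrames_{0};
    std::atomic<std::uint64_t> completedFrameDrops_{0};
    std::atomic<std::uint64_t> pboQueueMisses_{0};
};

// Rate derived from two counter readings taken intervalSeconds apart.
inline double FramesPerSecond(std::uint64_t before, std::uint64_t after,
                              double intervalSeconds) {
    return intervalSeconds > 0.0
               ? static_cast<double>(after - before) / intervalSeconds
               : 0.0;
}
```

Values such as renderFps and scheduleFps can then be computed by whoever reads the snapshots, which keeps policy out of the counter class.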
TelemetryHealthMonitor
Samples cadence telemetry once per interval and logs only health events.
Normal telemetry is available through the HTTP state endpoint. The console should not receive a healthy once-per-second cadence line.
Health events:
- warning when DeckLink late/dropped-frame counters increase
- warning when schedule failures increase
- error when app/DeckLink output buffering is starved
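The health events above can be derived by comparing two consecutive samples. The sketch below assumes a hypothetical `HealthSample` struct and treats "starved" as both the completed-frame reserve and the DeckLink buffer being empty; the plan does not define the exact condition, so that threshold is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical subset of telemetry needed for health checks.
struct HealthSample {
    std::uint64_t deckLinkLate = 0;
    std::uint64_t deckLinkDropped = 0;
    std::uint64_t scheduleFailures = 0;
    unsigned completedFrames = 0;   // completed, unscheduled frames in the app
    unsigned deckLinkBuffered = 0;  // frames currently buffered by DeckLink
};

enum class Severity { Warning, Error };

struct HealthEvent {
    Severity severity;
    std::string message;
};

// Compares the previous and current sample. A healthy interval yields no
// events, so nothing is logged to the console.
inline std::vector<HealthEvent> EvaluateHealth(const HealthSample& previous,
                                               const HealthSample& current) {
    std::vector<HealthEvent> events;
    if (current.deckLinkLate > previous.deckLinkLate ||
        current.deckLinkDropped > previous.deckLinkDropped) {
        events.push_back({Severity::Warning, "DeckLink late/dropped count increased"});
    }
    if (current.scheduleFailures > previous.scheduleFailures) {
        events.push_back({Severity::Warning, "schedule failures increased"});
    }
    // Assumed definition of starvation: nothing ready and nothing buffered.
    if (current.completedFrames == 0 && current.deckLinkBuffered == 0) {
        events.push_back({Severity::Error, "output buffering is starved"});
    }
    return events;
}
```

Working on deltas rather than absolute counts means a one-off drop during startup does not produce a warning on every later interval.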
Startup Sequence
Target first-version startup:
main
  -> load AppConfig through AppConfigProvider
  -> initialize COM
  -> create SystemFrameExchange
  -> start RenderThread
  -> wait for completed frame warmup
  -> optionally discover/select/configure DeckLink output
  -> if DeckLink is available:
       -> start DeckLinkOutputThread
       -> wait for scheduled depth warmup
       -> DeckLinkOutput start scheduled playback
  -> if DeckLink is unavailable:
       -> continue without video output
  -> start TelemetryHealthMonitor
  -> start HttpControlServer
  -> wait for Enter
Shutdown:
stop HttpControlServer
stop TelemetryHealthMonitor
stop DeckLinkOutputThread
DeckLinkOutput stop playback
stop RenderThread
DeckLinkOutput release resources
release COM
First Milestone: Modular Probe Equivalent
This is the only goal for the initial implementation.
Feature set:
- console app
- output-only DeckLink
- no input
- hidden GL context
- simple motion renderer
- BGRA8 only
- PBO async readback
- bounded FIFO system-memory frame exchange
- warmup before playback
- one-line telemetry
Acceptance:
- visible DeckLink output is smooth
- renderFps near selected cadence
- scheduleFps near selected cadence
- scheduled count/decklink buffered count stable around 4
- no continuous late/drop count
- no continuous PBO misses
- behavior matches or exceeds DeckLinkRenderCadenceProbe
Second Milestone: Testable Core
Before porting compositor features, add tests for non-GL/non-DeckLink pieces.
Test targets:
- SystemFrameExchangeTests
- RenderCadenceClockTests
- CadenceTelemetryTests
Important cases:
- slot lifecycle transitions
- scheduled slots are protected
- completed unscheduled frames can be dropped
- stale handles/generations are rejected
- cadence does not speed up to refill buffers
- cadence records overrun/skipped ticks
Third Milestone: Replace Simple Renderer With Render Interface
Add an interface around frame rendering:
IRenderScene
  -> InitializeGl()
  -> RenderFrame(frameIndex, time)
  -> ShutdownGl()
The first implementation remains SimpleMotionRenderer.
This creates the insertion point for shader-package rendering later without changing timing/scheduling.
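Expressed as a C++ abstract class, the interface could look like the following sketch. The parameter types, the `bool` return from `InitializeGl()`, and the `SceneTimeSeconds` helper are assumptions; the plan only names the three methods and the `frameIndex` and `time` arguments.

```cpp
#include <cstdint>

// All three methods are expected to run on the render thread, with its GL
// context current.
class IRenderScene {
public:
    virtual ~IRenderScene() = default;

    // Assumed to return false when GL resources could not be created.
    virtual bool InitializeGl() = 0;

    // Draws exactly one frame into the currently bound render target.
    virtual void RenderFrame(std::uint64_t frameIndex, double timeSeconds) = 0;

    virtual void ShutdownGl() = 0;
};

// Hypothetical helper: derives scene time from the frame index so animation
// follows the render cadence rather than wall-clock jitter.
inline double SceneTimeSeconds(std::uint64_t frameIndex, double framesPerSecond) {
    return static_cast<double>(frameIndex) / framesPerSecond;
}
```

SimpleMotionRenderer would be the first class to implement this, and a later shader-package renderer could replace it without the render thread's cadence loop changing.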
Fourth Milestone: Begin Porting Current App Features
Port only after the modular probe equivalent is stable.
Suggested order:
- shader package compile/load
- render pass/layer stack drawing
- runtime snapshot input to renderer
- live state overlays
- control services
- persistence/runtime store
- preview from system-memory frames
- screenshot from system-memory frames
- input capture via CPU latest-frame mailbox
Each port must preserve the rule that the render thread cadence is primary.
What Not To Port Early
Do not port these until the output spine is proven:
- DeckLink input
- preview GL presentation
- screenshot GL readback
- HTTP/OSC control services
- shader hot reload
- persistence
- runtime state JSON/open API
- complex telemetry/event dispatch
These are useful, but they are exactly the kinds of features that can accidentally reintroduce timing coupling.
Build Plan
Initial CMake can follow the probe pattern:
set(RENDER_CADENCE_APP_DIR "${CMAKE_CURRENT_SOURCE_DIR}/apps/RenderCadenceCompositor")
add_executable(RenderCadenceCompositor
# selected shared DeckLink/video/gl support files
# new modular app files
)
Later, shared source should be split into libraries:
video_shader_decklink
video_shader_videoio
video_shader_gl_support
render_cadence_core
Avoid doing that library split before the first modular app works.
VS Code Launch
Add a separate launch profile:
Debug RenderCadenceCompositor
Run it as a console app so telemetry remains visible.
Documentation
Add:
apps/RenderCadenceCompositor/README.md
The README should record:
- intended architecture
- build/run instructions
- expected telemetry
- test result notes
- differences from the old app
- differences from the probe
Success Criteria Before Porting More Features
Do not start feature porting until the new app can run with:
- stable smooth DeckLink output
- stable target scheduled depth
- stable actual DeckLink buffered count
- no regular visible freezes
- no steady PBO misses
- no steadily increasing late/dropped completions
- focus/minimize changes do not affect output cadence
- clean shutdown without hangs
This gives us a clean foundation. Once this is true, every feature added later has to prove it does not damage the spine.