Performance chasing
All checks were successful
CI / React UI Build (push) Successful in 10s
CI / Native Windows Build And Tests (push) Successful in 2m51s
CI / Windows Release Package (push) Successful in 2m55s

Aiden
2026-05-11 23:10:45 +10:00
parent c5cead6003
commit a434a88108
18 changed files with 1115 additions and 82 deletions


@@ -7,7 +7,7 @@ Phase 7 made backend lifecycle, playout policy, ready-frame queueing, late/drop
## Status
- Phase 7.5 design package: proposed.
-- Phase 7.5 implementation: Step 2 in progress.
+- Phase 7.5 implementation: Step 5 in progress.
- Current alignment: Phase 7 is complete. `RenderOutputQueue`, `VideoPlayoutPolicy`, `VideoPlayoutScheduler`, `VideoBackendLifecycle`, and backend playout telemetry exist. The backend worker fills the ready queue on completion demand, but render production is not yet proactively driven by queue pressure or video cadence.
Current footholds:
@@ -19,6 +19,9 @@ Current footholds:
- `HealthTelemetry::BackendPlayoutSnapshot` exposes queue depth, underruns, late/drop streaks, and recovery decisions.
- Step 1 adds baseline timing fields for ready-queue min/max/zero-depth samples and output render duration.
- Step 2 adds a pure `OutputProductionController` for queue-pressure production decisions.
+- Step 3 adds a proactive output producer worker that keeps `RenderOutputQueue` warm after playback starts.
+- Step 4 skips non-forced preview presentation while output ready-queue depth is below target.
+- Step 5 makes async readback misses prefer cached output over synchronous readback after bootstrap.
## Timing Review Findings
@@ -199,15 +202,23 @@ Move from demand-filled output production to queue-pressure production.
Initial target:
-- producer wakes when queue depth is below target
-- producer requests render-thread output production until target depth is reached
-- producer stops when backend stops or render thread shuts down
-- completion worker mostly schedules from already-ready frames
+- [x] producer wakes when queue depth is below target
+- [x] producer requests render-thread output production until target depth is reached
+- [x] producer stops when backend stops or render thread shuts down
+- [x] completion worker mostly schedules from already-ready frames
Exit criteria:
-- normal playback does not depend on completion processing to fill the queue from empty
-- callback/completion pressure and render production pressure are separate
+- [x] normal playback does not depend on completion processing to fill the queue from empty
+- [x] callback/completion pressure and render production pressure are separate
+Implementation notes:
+- `VideoBackend` starts the completion worker before device start, then starts the output producer only after DeckLink start succeeds. This avoids fighting DeckLink preroll for the same output frame pool.
+- `OutputProducerWorkerMain()` periodically wakes and uses `OutputProductionController` to decide whether to produce, wait, or throttle.
+- Completion handling records pacing/recovery, updates producer pressure, schedules a ready frame, and wakes the producer to refill headroom.
+- Completion handling keeps a one-frame synchronous fallback when the ready queue is unexpectedly empty, then falls back to black underrun behavior if that also fails.
+- Producer shutdown is explicit and joined before video output teardown.
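Editor's sketch (not part of the commit): the produce/wait/throttle decision described in the notes above can be expressed as a pure function of queue state. This is a minimal illustration under stated assumptions, not the actual `OutputProductionController` API. `ProductionAction`, `QueuePressure`, and `DecideProduction` are hypothetical names, and the rule that separates "wait" from "throttle" (queue at capacity) is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the decision a pure production controller makes.
enum class ProductionAction { Produce, Wait, Throttle };

// Snapshot of queue state handed to the controller (hypothetical fields).
struct QueuePressure {
    std::size_t readyDepth;   // frames currently in the ready queue
    std::size_t targetDepth;  // headroom the producer tries to maintain
    std::size_t capacity;     // hard upper bound on the ready queue
};

// Pure function: no locks, clocks, or I/O, so it can be unit-tested in isolation.
inline ProductionAction DecideProduction(const QueuePressure& p) {
    assert(p.targetDepth <= p.capacity);
    if (p.readyDepth < p.targetDepth) {
        return ProductionAction::Produce;   // below target: refill headroom
    }
    if (p.readyDepth >= p.capacity) {
        return ProductionAction::Throttle;  // assumed rule: back off when full
    }
    return ProductionAction::Wait;          // at or above target, not yet full
}
```

Keeping the decision free of side effects means the worker loop only has to translate the returned action into a render request or a sleep, which may make pacing behavior easier to test separately from threading.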
### Step 4. Prioritize Playout Over Preview
@@ -215,15 +226,21 @@ Make preview explicitly subordinate to output playout deadlines.
Initial target:
-- skip or delay preview when ready queue depth is below target
+- [x] skip or delay preview when ready queue depth is below target
- count skipped previews
- record preview present cost separately from output render cost
Exit criteria:
-- preview cannot drain output headroom invisibly
+- [x] preview cannot drain output headroom invisibly
- runtime telemetry shows preview skips and preview present cost
+Implementation notes:
+- `OpenGLComposite::paintGL(false)` now skips preview presentation when `VideoBackend` reports that the ready queue is below the target depth.
+- Forced preview paints are still allowed so resize/manual paint behavior remains intact.
+- Preview skip counters and present-cost telemetry remain follow-up work for this step.
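Editor's sketch (not part of the commit): the preview gating rule in the notes above reduces to a small predicate. The `PreviewGate` type and its `skippedPreviews` counter are hypothetical. The counter in particular models the follow-up telemetry the notes say is not yet implemented, and the real check lives inside `OpenGLComposite::paintGL` and `VideoBackend`, whose interfaces are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper that decides whether a preview paint may proceed.
struct PreviewGate {
    // Assumed telemetry counter; the source lists skip counting as follow-up work.
    std::uint64_t skippedPreviews = 0;

    // Returns true when the preview should be presented.
    bool ShouldPresent(bool forced, std::size_t readyDepth, std::size_t targetDepth) {
        if (forced) {
            return true;            // resize/manual paints are always allowed
        }
        if (readyDepth < targetDepth) {
            ++skippedPreviews;      // output headroom is low: playout wins
            return false;
        }
        return true;
    }
};
```

Counting skips at the decision point would make preview starvation visible in telemetry rather than leaving it to be inferred from a stale preview window.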
### Step 5. Make Readback Miss Policy Deadline-Aware
Avoid turning a late async readback fence into synchronous deadline pressure by default.
@@ -232,13 +249,20 @@ Initial target:
- count async readback misses
- count synchronous fallback uses
-- allow policy to prefer stale/black output over synchronous fallback when queue pressure is high
-- keep current fallback available while behavior is measured
+- [x] allow policy to prefer stale/black output over synchronous fallback when queue pressure is high
+- [x] keep current fallback available while behavior is measured
Exit criteria:
-- readback fallback is an explicit policy decision
-- late GPU fences do not automatically block the most timing-sensitive path
+- [x] readback fallback is an explicit policy decision
+- [x] late GPU fences do not automatically block the most timing-sensitive path
+Implementation notes:
+- `OpenGLRenderPipeline::ReadOutputFrame()` now uses synchronous readback only to bootstrap the first cached output frame.
+- After cached output exists, an async readback miss copies the cached output frame into the DeckLink output frame instead of blocking on synchronous `glReadPixels`.
+- Async readback queueing now skips when the next PBO slot is still in flight rather than deleting an in-flight fence and overwriting it.
+- Miss/fallback counters remain follow-up telemetry work for this step.
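Editor's sketch (not part of the commit): the readback miss policy and the PBO slot rule in the notes above can be modeled without any OpenGL calls. Everything here is hypothetical: `ReadbackSource`, `ReadbackCounters`, `ChooseReadbackSource`, `PboSlot`, and `TryQueueReadback` are illustrative names, a plain `inFlight` flag stands in for a GPU fence, and the counters model telemetry the notes describe as follow-up work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Where the output frame's pixels come from for this readback attempt.
enum class ReadbackSource { AsyncResult, SyncBootstrap, CachedOutput };

// Assumed telemetry; the source lists these counters as not yet implemented.
struct ReadbackCounters {
    std::uint64_t asyncMisses = 0;
    std::uint64_t syncFallbacks = 0;
};

// Policy: synchronous readback is used only until a cached frame exists.
inline ReadbackSource ChooseReadbackSource(bool asyncReady, bool hasCachedOutput,
                                           ReadbackCounters& counters) {
    if (asyncReady) {
        return ReadbackSource::AsyncResult;
    }
    ++counters.asyncMisses;
    if (hasCachedOutput) {
        return ReadbackSource::CachedOutput;  // reuse stale frame, do not block
    }
    ++counters.syncFallbacks;
    return ReadbackSource::SyncBootstrap;     // first frame only
}

// One slot of a PBO ring; inFlight stands in for an unsignaled fence.
struct PboSlot {
    bool inFlight = false;
};

// Queue a readback only if the next slot is free; never overwrite in-flight work.
template <std::size_t N>
bool TryQueueReadback(std::array<PboSlot, N>& slots, std::size_t& next) {
    if (slots[next].inFlight) {
        return false;  // skip this frame's readback instead of clobbering the slot
    }
    slots[next].inFlight = true;
    next = (next + 1) % N;
    return true;
}
```

Separating the source decision from the GL work keeps the deadline-sensitive path free of blocking calls once a cached frame is available, at the cost of occasionally repeating a stale frame.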
### Step 6. Tune Headroom Policy