Input GPU decoding

Aiden
2026-05-12 20:26:03 +10:00
parent ce28904891
commit fd4b70ec9c
6 changed files with 278 additions and 66 deletions


@@ -40,7 +40,7 @@ Startup warms up real rendered frames before DeckLink scheduled playback starts.
 Included now:
 - output-only DeckLink
-- optional DeckLink input edge with BGRA8 capture or UYVY8-to-BGRA8 CPU conversion
+- optional DeckLink input edge with BGRA8 capture or raw UYVY8 capture decoded on the render thread
 - non-blocking startup when DeckLink output is unavailable
 - hidden render-thread-owned OpenGL context
 - simple smooth-motion renderer
@@ -74,7 +74,7 @@ Included now:
 Intentionally not included yet:
-- input format conversion
+- additional input format conversion/scaling
 - temporal/history/feedback shader storage
 - texture/LUT asset upload
 - text-parameter rasterization
@@ -118,7 +118,7 @@ This tracks parity with `apps/LoopThroughWithOpenGLCompositing`.
 - [x] Layer reorder/bypass/set-shader/update-parameter/reset-parameter HTTP controls
 - [x] Trigger parameter pulse count/time for latest trigger events
 - [x] Optional DeckLink input capture
-- [x] UYVY8-to-BGRA8 input conversion
+- [x] UYVY8 input capture with render-thread GPU decode to shader input texture
 - [x] Latest-frame CPU input mailbox
 - [x] Render-owned input texture upload
 - [x] Runtime shaders receive input through `gVideoInput`
@@ -247,11 +247,11 @@ Startup order is:
 1. create `InputFrameMailbox`
 2. try to attach DeckLink input for the configured input mode
-3. prefer BGRA8 capture, otherwise accept UYVY8 capture and convert to BGRA8 before the mailbox
+3. prefer BGRA8 capture, otherwise accept raw UYVY8 capture and configure the mailbox for UYVY8 bytes
 4. start `DeckLinkInputThread`
 5. leave input absent if discovery, setup, format support, or stream startup fails
-`DeckLinkInput` and `DeckLinkInputThread` are deliberately narrow. They capture BGRA8 frames directly or convert UYVY8 frames to BGRA8 before submitting to `InputFrameMailbox`; they do not call GL, render, preview, screenshot, shader, or output scheduling code. Unsupported input modes or formats outside BGRA8/UYVY8 are reported explicitly and treated as an unavailable edge rather than silently converted.
+`DeckLinkInput` and `DeckLinkInputThread` are deliberately narrow. They capture BGRA8 frames directly or raw UYVY8 frames into `InputFrameMailbox`; they do not call GL, render, preview, screenshot, shader, or output scheduling code. UYVY8-to-RGBA decode happens later inside the render-thread-owned input texture upload path, so the DeckLink callback stays a capture/copy edge only. Unsupported input modes or formats outside BGRA8/UYVY8 are reported explicitly and treated as an unavailable edge rather than silently converted.
 The app samples telemetry once per second.
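The UYVY8-to-RGBA decode that this change moves onto the render thread can be checked against a small CPU reference. The sketch below reuses the limited-range Rec.709 coefficients that appear in both the removed `StoreRec709UyvyAsBgra` and the new fragment shader; the names `Rgba8`, `ToByte`, `DecodeRec709`, and `DecodeUyvyMacroPixel` are hypothetical and are not part of the project.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: one decoded pixel as 8-bit R, G, B, A.
using Rgba8 = std::array<std::uint8_t, 4>;

// Clamp a normalized channel to [0, 1] and round it to a byte.
inline std::uint8_t ToByte(double value)
{
    const double clamped = std::min(1.0, std::max(0.0, value));
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0));
}

// Same limited-range Rec.709 arithmetic as rec709YCbCr2rgba in the diff.
inline Rgba8 DecodeRec709(std::uint8_t yByte, std::uint8_t cbByte, std::uint8_t crByte)
{
    const double y = (yByte - 16.0) / 219.0;
    const double cb = (cbByte - 16.0) / 224.0 - 0.5;
    const double cr = (crByte - 16.0) / 224.0 - 0.5;
    Rgba8 pixel{};
    pixel[0] = ToByte(y + 1.5748 * cr);
    pixel[1] = ToByte(y - 0.1873 * cb - 0.4681 * cr);
    pixel[2] = ToByte(y + 1.8556 * cb);
    pixel[3] = 255;
    return pixel;
}

// A UYVY8 macropixel is four bytes (U, Y0, V, Y1) that cover two horizontal
// pixels sharing one chroma pair.
inline std::array<Rgba8, 2> DecodeUyvyMacroPixel(const std::uint8_t* macroPixel)
{
    assert(macroPixel != nullptr);
    std::array<Rgba8, 2> pixels{};
    pixels[0] = DecodeRec709(macroPixel[1], macroPixel[0], macroPixel[2]);
    pixels[1] = DecodeRec709(macroPixel[3], macroPixel[0], macroPixel[2]);
    return pixels;
}
```

A reference like this is only meant for spot-checking a few known values (video black, video white) against a readback of the decoded texture; it says nothing about GPU rounding behavior.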
@@ -266,11 +266,11 @@ Input telemetry:
 - `inputFramesReceived`: frames accepted into `InputFrameMailbox`
 - `inputFramesDropped`: ready input frames dropped or missed because the mailbox was full
 - `inputLatestAgeMs`: age of the newest submitted input frame
-- `inputUploadMs`: render-thread GL upload time for the latest uploaded input frame
-- `inputFormatSupported`: whether the latest frame reaching the render upload path was BGRA8-compatible
+- `inputUploadMs`: render-thread GL upload/decode submission time for the latest uploaded input frame
+- `inputFormatSupported`: whether the latest frame reaching the render upload path was BGRA8 or UYVY8 compatible
 - `inputSignalPresent`: whether any input frame has reached the mailbox
 - `inputCaptureFps`: DeckLink input callback capture rate
-- `inputConvertMs`: input-edge UYVY8-to-BGRA8 conversion time for the latest converted frame
+- `inputConvertMs`: input-edge CPU conversion time; expected to remain `0` for BGRA8 and raw UYVY8 capture because UYVY8 decode is render-thread GPU work
 - `inputSubmitMs`: time spent submitting the latest captured/converted input frame to `InputFrameMailbox`
 - `inputCaptureFormat`: selected DeckLink input format (`BGRA8`, `UYVY8`, or `none`)
 - `inputNoSignalFrames`: DeckLink callbacks reporting no input source
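The millisecond fields above (`inputUploadMs`, `inputSubmitMs`) are wall-clock intervals taken with `std::chrono::steady_clock`, as the code in this commit does around the upload and decode calls. A minimal sketch of that measurement pattern follows; `MeasureMilliseconds` is a hypothetical helper, not something the project defines.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: run `work` and return the elapsed wall-clock time in
// milliseconds as a double, using the monotonic steady_clock.
template <typename Work>
double MeasureMilliseconds(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto end = std::chrono::steady_clock::now();
    assert(end >= start); // steady_clock never goes backwards
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

When the timed work is a sequence of GL calls, an interval like this covers only the CPU-side submission, since the driver may execute the commands later. That matches the README wording "upload/decode submission time".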
@@ -380,7 +380,7 @@ This app keeps the same core behavior but splits it into modules that can grow:
 - `platform/`: COM/Win32/hidden GL context support
 - `render/`: cadence thread, clock, and simple renderer
 - `frames/InputFrameMailbox`: non-blocking latest-frame CPU input handoff
-- `render/InputFrameTexture`: render-thread-owned upload of the latest CPU input frame into GL
+- `render/InputFrameTexture`: render-thread-owned upload of the latest CPU input frame into GL, including raw UYVY8 decode into the shader-visible input texture
 - `render/readback/`: PBO-backed BGRA8 readback and completed-frame publication
 - `render/runtime/RuntimeRenderScene`: render-thread-owned GL scene for ready runtime shader layers
 - `render/runtime/RuntimeShaderPrepareWorker`: shared-context runtime shader program compile/link worker
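The "non-blocking latest-frame CPU input handoff" described for `frames/InputFrameMailbox` can be pictured as a single slot that the capture side overwrites and the render side drains. The sketch below is one possible shape under stated assumptions: `LatestFrameSlot` and its methods are hypothetical, the real mailbox's API and drop accounting are not shown in this diff, and counting an overwritten unread frame as a drop is a choice made here for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical single-slot mailbox; the project's InputFrameMailbox may differ.
class LatestFrameSlot
{
public:
    // Producer side: copy the frame in without waiting. If the consumer holds
    // the lock, the frame is dropped instead of blocking the capture callback.
    bool Submit(const std::uint8_t* bytes, std::size_t size, std::uint64_t frameIndex)
    {
        assert(bytes != nullptr || size == 0);
        std::unique_lock<std::mutex> lock(mMutex, std::try_to_lock);
        if (!lock.owns_lock())
        {
            mDropped.fetch_add(1, std::memory_order_relaxed);
            return false;
        }
        if (mHasFrame)
            mDropped.fetch_add(1, std::memory_order_relaxed); // unread frame is replaced
        mBytes.assign(bytes, bytes + size);
        mFrameIndex = frameIndex;
        mHasFrame = true;
        return true;
    }

    // Consumer side: take the newest frame if one is waiting.
    bool TryTake(std::vector<std::uint8_t>& out, std::uint64_t& frameIndex)
    {
        std::lock_guard<std::mutex> lock(mMutex);
        if (!mHasFrame)
            return false;
        out.swap(mBytes);
        frameIndex = mFrameIndex;
        mHasFrame = false;
        return true;
    }

    std::uint64_t Dropped() const { return mDropped.load(std::memory_order_relaxed); }

private:
    std::mutex mMutex;
    std::vector<std::uint8_t> mBytes;
    std::uint64_t mFrameIndex = 0;
    bool mHasFrame = false;
    std::atomic<std::uint64_t> mDropped{ 0 };
};
```

Because the slot stores opaque bytes, the same handoff works for BGRA8 and raw UYVY8 frames; only the consumer needs to know the pixel format.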
@@ -403,4 +403,4 @@ Only after this app matches the probe's smooth output:
 3. port runtime snapshots/live state
 4. add control services
 5. add preview/screenshot from system-memory frames
-6. add scaling and additional input format support after the BGRA8/UYVY8 input edge is stable
+6. add scaling and additional input format support after the BGRA8/raw-UYVY8 input edge is stable


@@ -112,8 +112,14 @@ int main(int argc, char** argv)
 RenderCadenceCompositor::DeckLinkInputConfig deckLinkInputConfig;
 deckLinkInputConfig.videoFormat = inputVideoMode;
 std::string deckLinkInputError;
-if (deckLinkInput.Initialize(deckLinkInputConfig, deckLinkInputError) &&
-    deckLinkInputThread.Start(deckLinkInputError))
+if (deckLinkInput.Initialize(deckLinkInputConfig, deckLinkInputError))
+{
+    inputMailboxConfig.pixelFormat = deckLinkInput.CapturePixelFormat();
+    inputMailboxConfig.rowBytes = VideoIORowBytes(inputMailboxConfig.pixelFormat, inputMailboxConfig.width);
+    inputMailbox.Configure(inputMailboxConfig);
+}
+if (deckLinkInput.IsInitialized() && deckLinkInputThread.Start(deckLinkInputError))
 {
     deckLinkInputStarted = true;
     RenderCadenceCompositor::Log("app", "DeckLink input edge started for " + inputVideoMode.displayName + ".");
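The hunk above reconfigures the mailbox row size from the capture pixel format. As a rough illustration of the arithmetic involved, the sketch below computes unpadded row sizes for the two formats, the width of the RGBA8 texture that holds raw UYVY8 macropixels, and the texel count passed to `GL_UNPACK_ROW_LENGTH`. All names here (`PixelFormat`, `MinRowBytes`, `PackedTexelWidth`, `UnpackRowLengthTexels`) are hypothetical; the project's `VideoIORowBytes` is not shown in this diff and may pad or align differently.

```cpp
#include <cassert>

// Hypothetical stand-in for the project's VideoIOPixelFormat.
enum class PixelFormat { Bgra8, Uyvy8 };

// Width of the RGBA8 texture holding raw UYVY8 data: one texel per
// macropixel, and one macropixel covers two horizontal pixels.
constexpr unsigned PackedTexelWidth(unsigned width)
{
    return (width + 1u) / 2u;
}

// Minimum bytes per row with no padding. BGRA8 uses four bytes per pixel;
// UYVY8 uses four bytes per macropixel, so an odd width rounds up.
constexpr unsigned MinRowBytes(PixelFormat format, unsigned width)
{
    return format == PixelFormat::Bgra8 ? width * 4u : PackedTexelWidth(width) * 4u;
}

// GL_UNPACK_ROW_LENGTH is expressed in texels, so a byte stride is divided by
// the four bytes of an RGBA8 texel. A stride that is not a multiple of four
// cannot be expressed this way, hence the precondition.
inline unsigned UnpackRowLengthTexels(unsigned rowBytes)
{
    assert(rowBytes % 4u == 0u);
    return rowBytes / 4u;
}
```

For a 1920-pixel-wide frame this gives 7680 bytes per BGRA8 row, 3840 bytes per UYVY8 row, and a 960-texel packed texture. A capture device may report a larger stride than these minimums, which is why the upload path takes the row length from the frame rather than from the width.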


@@ -2,6 +2,64 @@
 #include <chrono>
+#ifndef GL_FRAMEBUFFER_BINDING
+#define GL_FRAMEBUFFER_BINDING 0x8CA6
+#endif
+namespace
+{
+constexpr GLuint kUyvyTextureUnit = 0;
+const char* kDecodeVertexShader = R"GLSL(
+#version 430 core
+out vec2 vTexCoord;
+void main()
+{
+    vec2 positions[3] = vec2[3](
+        vec2(-1.0, -1.0),
+        vec2( 3.0, -1.0),
+        vec2(-1.0, 3.0));
+    vec2 texCoords[3] = vec2[3](
+        vec2(0.0, 0.0),
+        vec2(2.0, 0.0),
+        vec2(0.0, 2.0));
+    gl_Position = vec4(positions[gl_VertexID], 0.0, 1.0);
+    vTexCoord = texCoords[gl_VertexID];
+}
+)GLSL";
+const char* kUyvyDecodeFragmentShader = R"GLSL(
+#version 430 core
+layout(binding = 0) uniform sampler2D uPackedUyvy;
+uniform vec2 uDecodedSize;
+in vec2 vTexCoord;
+out vec4 fragColor;
+vec4 rec709YCbCr2rgba(float yByte, float cbByte, float crByte)
+{
+    float y = (yByte - 16.0) / 219.0;
+    float cb = (cbByte - 16.0) / 224.0 - 0.5;
+    float cr = (crByte - 16.0) / 224.0 - 0.5;
+    return vec4(
+        y + 1.5748 * cr,
+        y - 0.1873 * cb - 0.4681 * cr,
+        y + 1.8556 * cb,
+        1.0);
+}
+void main()
+{
+    ivec2 decodedSize = ivec2(uDecodedSize);
+    ivec2 outputCoord = ivec2(clamp(gl_FragCoord.xy, vec2(0.0), vec2(decodedSize - ivec2(1))));
+    int sourceY = decodedSize.y - 1 - outputCoord.y;
+    ivec2 packedCoord = ivec2(clamp(outputCoord.x / 2, 0, max(decodedSize.x / 2 - 1, 0)), sourceY);
+    vec4 macroPixel = texelFetch(uPackedUyvy, packedCoord, 0) * 255.0;
+    float ySample = (outputCoord.x & 1) != 0 ? macroPixel.a : macroPixel.g;
+    fragColor = clamp(rec709YCbCr2rgba(ySample, macroPixel.r, macroPixel.b), vec4(0.0), vec4(1.0));
+}
+)GLSL";
+}
 InputFrameTexture::~InputFrameTexture()
 {
     ShutdownGl();
@@ -29,9 +87,19 @@ GLuint InputFrameTexture::PollAndUpload(InputFrameMailbox* mailbox)
         mLastUploadMilliseconds = std::chrono::duration_cast<std::chrono::duration<double, std::milli>>(uploadEnd - uploadStart).count();
         ++mUploadedFrames;
     }
+    else if (frame.bytes != nullptr && frame.pixelFormat == VideoIOPixelFormat::Uyvy8 && EnsureTexture(frame) && EnsureRawUyvyTexture(frame) && EnsureDecodeProgram())
+    {
+        mLastFrameFormatSupported = true;
+        const auto uploadStart = std::chrono::steady_clock::now();
+        UploadUyvy8Frame(frame);
+        DecodeUyvy8Frame(frame);
+        const auto uploadEnd = std::chrono::steady_clock::now();
+        mLastUploadMilliseconds = std::chrono::duration_cast<std::chrono::duration<double, std::milli>>(uploadEnd - uploadStart).count();
+        ++mUploadedFrames;
+    }
     else
     {
-        mLastFrameFormatSupported = frame.pixelFormat == VideoIOPixelFormat::Bgra8;
+        mLastFrameFormatSupported = frame.pixelFormat == VideoIOPixelFormat::Bgra8 || frame.pixelFormat == VideoIOPixelFormat::Uyvy8;
         mLastUploadMilliseconds = 0.0;
     }
@@ -43,9 +111,15 @@ void InputFrameTexture::ShutdownGl()
 {
     if (mTexture != 0)
         glDeleteTextures(1, &mTexture);
+    if (mRawTexture != 0)
+        glDeleteTextures(1, &mRawTexture);
     mTexture = 0;
+    mRawTexture = 0;
     mWidth = 0;
     mHeight = 0;
+    mRawWidth = 0;
+    mRawHeight = 0;
+    DestroyDecodeResources();
 }
 bool InputFrameTexture::EnsureTexture(const InputFrame& frame)
@@ -80,6 +154,41 @@ bool InputFrameTexture::EnsureTexture(const InputFrame& frame)
     return mTexture != 0;
 }
+bool InputFrameTexture::EnsureRawUyvyTexture(const InputFrame& frame)
+{
+    if (frame.width == 0 || frame.height == 0)
+        return false;
+    const unsigned rawWidth = (frame.width + 1u) / 2u;
+    if (mRawTexture != 0 && mRawWidth == rawWidth && mRawHeight == frame.height)
+        return true;
+    if (mRawTexture != 0)
+        glDeleteTextures(1, &mRawTexture);
+    mRawTexture = 0;
+    glGenTextures(1, &mRawTexture);
+    glBindTexture(GL_TEXTURE_2D, mRawTexture);
+    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
+    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
+    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
+    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
+    glTexImage2D(
+        GL_TEXTURE_2D,
+        0,
+        GL_RGBA8,
+        static_cast<GLsizei>(rawWidth),
+        static_cast<GLsizei>(frame.height),
+        0,
+        GL_RGBA,
+        GL_UNSIGNED_BYTE,
+        nullptr);
+    glBindTexture(GL_TEXTURE_2D, 0);
+    mRawWidth = rawWidth;
+    mRawHeight = frame.height;
+    return mRawTexture != 0;
+}
 void InputFrameTexture::UploadBgra8FrameFlippedVertically(const InputFrame& frame)
 {
     glBindTexture(GL_TEXTURE_2D, mTexture);
@@ -106,3 +215,127 @@ void InputFrameTexture::UploadBgra8FrameFlippedVertically(const InputFrame& fram
     glPixelStorei(GL_UNPACK_ROW_LENGTH, 0);
     glBindTexture(GL_TEXTURE_2D, 0);
 }
+void InputFrameTexture::UploadUyvy8Frame(const InputFrame& frame)
+{
+    glBindTexture(GL_TEXTURE_2D, mRawTexture);
+    glPixelStorei(GL_UNPACK_ALIGNMENT, 4);
+    glPixelStorei(GL_UNPACK_ROW_LENGTH, frame.rowBytes > 0 ? static_cast<GLint>(frame.rowBytes / 4) : 0);
+    glTexSubImage2D(
+        GL_TEXTURE_2D,
+        0,
+        0,
+        0,
+        static_cast<GLsizei>((frame.width + 1u) / 2u),
+        static_cast<GLsizei>(frame.height),
+        GL_RGBA,
+        GL_UNSIGNED_BYTE,
+        frame.bytes);
+    glPixelStorei(GL_UNPACK_ROW_LENGTH, 0);
+    glBindTexture(GL_TEXTURE_2D, 0);
+}
+void InputFrameTexture::DecodeUyvy8Frame(const InputFrame& frame)
+{
+    GLint previousFramebuffer = 0;
+    glGetIntegerv(GL_FRAMEBUFFER_BINDING, &previousFramebuffer);
+    if (mDecodeFramebuffer == 0)
+        glGenFramebuffers(1, &mDecodeFramebuffer);
+    glBindFramebuffer(GL_FRAMEBUFFER, mDecodeFramebuffer);
+    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, mTexture, 0);
+    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE)
+    {
+        glBindFramebuffer(GL_FRAMEBUFFER, static_cast<GLuint>(previousFramebuffer));
+        return;
+    }
+    glViewport(0, 0, static_cast<GLsizei>(frame.width), static_cast<GLsizei>(frame.height));
+    glDisable(GL_SCISSOR_TEST);
+    glDisable(GL_DEPTH_TEST);
+    glDisable(GL_BLEND);
+    glUseProgram(mDecodeProgram);
+    const GLint decodedSizeLocation = glGetUniformLocation(mDecodeProgram, "uDecodedSize");
+    if (decodedSizeLocation >= 0)
+        glUniform2f(decodedSizeLocation, static_cast<GLfloat>(frame.width), static_cast<GLfloat>(frame.height));
+    glActiveTexture(GL_TEXTURE0 + kUyvyTextureUnit);
+    glBindTexture(GL_TEXTURE_2D, mRawTexture);
+    glBindVertexArray(mDecodeVertexArray);
+    glDrawArrays(GL_TRIANGLES, 0, 3);
+    glBindVertexArray(0);
+    glBindTexture(GL_TEXTURE_2D, 0);
+    glActiveTexture(GL_TEXTURE0);
+    glUseProgram(0);
+    glBindFramebuffer(GL_FRAMEBUFFER, static_cast<GLuint>(previousFramebuffer));
+}
+bool InputFrameTexture::EnsureDecodeProgram()
+{
+    if (mDecodeProgram != 0)
+        return true;
+    if (!CompileShader(GL_VERTEX_SHADER, kDecodeVertexShader, mDecodeVertexShader))
+        return false;
+    if (!CompileShader(GL_FRAGMENT_SHADER, kUyvyDecodeFragmentShader, mDecodeFragmentShader))
+        return false;
+    if (!LinkProgram(mDecodeVertexShader, mDecodeFragmentShader, mDecodeProgram))
+        return false;
+    glUseProgram(mDecodeProgram);
+    const GLint samplerLocation = glGetUniformLocation(mDecodeProgram, "uPackedUyvy");
+    if (samplerLocation >= 0)
+        glUniform1i(samplerLocation, static_cast<GLint>(kUyvyTextureUnit));
+    glUseProgram(0);
+    if (mDecodeVertexArray == 0)
+        glGenVertexArrays(1, &mDecodeVertexArray);
+    return mDecodeProgram != 0 && mDecodeVertexArray != 0;
+}
+void InputFrameTexture::DestroyDecodeResources()
+{
+    if (mDecodeFramebuffer != 0)
+        glDeleteFramebuffers(1, &mDecodeFramebuffer);
+    if (mDecodeVertexArray != 0)
+        glDeleteVertexArrays(1, &mDecodeVertexArray);
+    if (mDecodeProgram != 0)
+        glDeleteProgram(mDecodeProgram);
+    if (mDecodeVertexShader != 0)
+        glDeleteShader(mDecodeVertexShader);
+    if (mDecodeFragmentShader != 0)
+        glDeleteShader(mDecodeFragmentShader);
+    mDecodeFramebuffer = 0;
+    mDecodeVertexArray = 0;
+    mDecodeProgram = 0;
+    mDecodeVertexShader = 0;
+    mDecodeFragmentShader = 0;
+}
+bool InputFrameTexture::CompileShader(GLenum shaderType, const char* source, GLuint& shader)
+{
+    shader = glCreateShader(shaderType);
+    glShaderSource(shader, 1, &source, nullptr);
+    glCompileShader(shader);
+    GLint compileResult = GL_FALSE;
+    glGetShaderiv(shader, GL_COMPILE_STATUS, &compileResult);
+    if (compileResult != GL_FALSE)
+        return true;
+    glDeleteShader(shader);
+    shader = 0;
+    return false;
+}
+bool InputFrameTexture::LinkProgram(GLuint vertexShader, GLuint fragmentShader, GLuint& program)
+{
+    program = glCreateProgram();
+    glAttachShader(program, vertexShader);
+    glAttachShader(program, fragmentShader);
+    glLinkProgram(program);
+    GLint linkResult = GL_FALSE;
+    glGetProgramiv(program, GL_LINK_STATUS, &linkResult);
+    if (linkResult != GL_FALSE)
+        return true;
+    glDeleteProgram(program);
+    program = 0;
+    return false;
+}
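The coordinate mapping inside `kUyvyDecodeFragmentShader` is easy to misread, so here is a CPU-side restatement of it: each output pixel maps to a packed texel at half the horizontal position on the vertically flipped row, and the low bit of the output x selects the first or second luma sample. `PackedSample` and `MapOutputPixel` are hypothetical names used only for this sketch; the arithmetic follows the shader's `main()` as shown in this diff.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type: which packed texel to fetch and which luma to use.
struct PackedSample
{
    int texelX;
    int texelY;
    bool useSecondLuma; // false selects Y0 (green channel), true selects Y1 (alpha channel)
};

// Mirrors the integer arithmetic of main() in kUyvyDecodeFragmentShader.
inline PackedSample MapOutputPixel(int x, int y, int width, int height)
{
    assert(width > 0 && height > 0);
    const int clampedX = std::min(std::max(x, 0), width - 1);
    const int clampedY = std::min(std::max(y, 0), height - 1);
    // The shader clamps to width / 2 - 1, so for an odd width the last output
    // column reads from the last full macropixel rather than the padded one.
    const int maxTexelX = std::max(width / 2 - 1, 0);
    PackedSample sample{};
    sample.texelX = std::min(clampedX / 2, maxTexelX);
    sample.texelY = height - 1 - clampedY; // vertical flip
    sample.useSecondLuma = (clampedX & 1) != 0;
    return sample;
}
```

Writing the mapping out this way makes the odd-width edge case visible: the raw texture is allocated `(width + 1) / 2` texels wide, while the fetch is clamped to `width / 2 - 1`.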


@@ -23,11 +23,26 @@ public:
 private:
     bool EnsureTexture(const InputFrame& frame);
+    bool EnsureRawUyvyTexture(const InputFrame& frame);
+    bool EnsureDecodeProgram();
     void UploadBgra8FrameFlippedVertically(const InputFrame& frame);
+    void UploadUyvy8Frame(const InputFrame& frame);
+    void DecodeUyvy8Frame(const InputFrame& frame);
+    void DestroyDecodeResources();
+    static bool CompileShader(GLenum shaderType, const char* source, GLuint& shader);
+    static bool LinkProgram(GLuint vertexShader, GLuint fragmentShader, GLuint& program);
     GLuint mTexture = 0;
+    GLuint mRawTexture = 0;
+    GLuint mDecodeFramebuffer = 0;
+    GLuint mDecodeVertexArray = 0;
+    GLuint mDecodeProgram = 0;
+    GLuint mDecodeVertexShader = 0;
+    GLuint mDecodeFragmentShader = 0;
     unsigned mWidth = 0;
     unsigned mHeight = 0;
+    unsigned mRawWidth = 0;
+    unsigned mRawHeight = 0;
     uint64_t mUploadedFrames = 0;
     uint64_t mUploadMisses = 0;
     double mLastUploadMilliseconds = 0.0;


@@ -3,7 +3,6 @@
 #include "DeckLinkVideoIOFormat.h"
 #include "../logging/Logger.h"
-#include <algorithm>
 #include <chrono>
 #include <new>
@@ -24,28 +23,6 @@ bool FindInputDisplayMode(IDeckLinkInput* input, BMDDisplayMode targetMode, IDec
     return FindDeckLinkDisplayMode(iterator, targetMode, foundMode);
 }
-unsigned char ClampToByte(double value)
-{
-    if (value <= 0.0)
-        return 0;
-    if (value >= 255.0)
-        return 255;
-    return static_cast<unsigned char>(value + 0.5);
-}
-void StoreRec709UyvyAsBgra(unsigned char yByte, unsigned char uByte, unsigned char vByte, unsigned char* destination)
-{
-    const double y = (static_cast<double>(yByte) - 16.0) / 219.0;
-    const double cb = (static_cast<double>(uByte) - 16.0) / 224.0 - 0.5;
-    const double cr = (static_cast<double>(vByte) - 16.0) / 224.0 - 0.5;
-    const double red = y + 1.5748 * cr;
-    const double green = y - 0.1873 * cb - 0.4681 * cr;
-    const double blue = y + 1.8556 * cb;
-    destination[0] = ClampToByte(blue * 255.0);
-    destination[1] = ClampToByte(green * 255.0);
-    destination[2] = ClampToByte(red * 255.0);
-    destination[3] = 255;
-}
 }
 DeckLinkInputCallback::DeckLinkInputCallback(DeckLinkInput& owner) :
@@ -122,7 +99,7 @@ bool DeckLinkInput::Initialize(const DeckLinkInputConfig& config, std::string& e
     }
     Log(
         "decklink-input",
-        std::string("DeckLink input enabled in ") + (mCapturePixelFormat == bmdFormat8BitBGRA ? "BGRA8" : "UYVY8-to-BGRA8 conversion") + " mode.");
+        std::string("DeckLink input enabled in ") + (mCapturePixelFormat == bmdFormat8BitBGRA ? "BGRA8" : "UYVY8 raw capture") + " mode.");
     mCallback.Attach(new (std::nothrow) DeckLinkInputCallback(*this));
     if (mCallback == nullptr)
@@ -197,6 +174,11 @@ DeckLinkInputMetrics DeckLinkInput::Metrics() const
     return metrics;
 }
+VideoIOPixelFormat DeckLinkInput::CapturePixelFormat() const
+{
+    return mCapturePixelFormat == bmdFormat8BitYUV ? VideoIOPixelFormat::Uyvy8 : VideoIOPixelFormat::Bgra8;
+}
 void DeckLinkInput::HandleFrameArrived(IDeckLinkVideoInputFrame* inputFrame)
 {
     if (inputFrame == nullptr)
@@ -263,7 +245,7 @@ void DeckLinkInput::HandleFrameArrived(IDeckLinkVideoInputFrame* inputFrame)
     {
         Log(
             "decklink-input",
-            std::string("First DeckLink ") + (mCapturePixelFormat == bmdFormat8BitBGRA ? "BGRA8" : "UYVY8 converted BGRA8") + " input frame submitted to InputFrameMailbox.");
+            std::string("First DeckLink ") + (mCapturePixelFormat == bmdFormat8BitBGRA ? "BGRA8" : "UYVY8 raw") + " input frame submitted to InputFrameMailbox.");
     }
     inputFrameBuffer->EndAccess(bmdBufferAccessRead);
@@ -305,7 +287,7 @@ bool DeckLinkInput::DiscoverInput(const DeckLinkInputConfig& config, std::string
         {
             mInput = candidateInput;
             mCapturePixelFormat = bmdFormat8BitYUV;
-            Log("decklink-input", "DeckLink input device selected for UYVY8 capture with CPU BGRA8 conversion.");
+            Log("decklink-input", "DeckLink input device selected for UYVY8 raw capture with render-thread GPU decode.");
             return true;
         }
     }
@@ -360,34 +342,10 @@ bool DeckLinkInput::SubmitUyvy8Frame(IDeckLinkVideoInputFrame* inputFrame, const
     if (width == 0 || height == 0 || sourceRowBytes < static_cast<long>(width * 2u))
         return false;
-    const unsigned destinationRowBytes = VideoIORowBytes(VideoIOPixelFormat::Bgra8, width);
-    const auto convertStart = std::chrono::steady_clock::now();
-    mConversionBuffer.resize(static_cast<std::size_t>(destinationRowBytes) * static_cast<std::size_t>(height));
-    const unsigned char* sourceBytes = static_cast<const unsigned char*>(bytes);
-    for (unsigned y = 0; y < height; ++y)
-    {
-        const unsigned char* sourceRow = sourceBytes + static_cast<std::size_t>(y) * static_cast<std::size_t>(sourceRowBytes);
-        unsigned char* destinationRow = mConversionBuffer.data() + static_cast<std::size_t>(y) * static_cast<std::size_t>(destinationRowBytes);
-        for (unsigned x = 0; x < width; x += 2)
-        {
-            const unsigned pairOffset = x * 2u;
-            const unsigned char u = sourceRow[pairOffset + 0];
-            const unsigned char y0 = sourceRow[pairOffset + 1];
-            const unsigned char v = sourceRow[pairOffset + 2];
-            const unsigned char y1 = sourceRow[pairOffset + 3];
-            StoreRec709UyvyAsBgra(y0, u, v, destinationRow + static_cast<std::size_t>(x) * 4u);
-            if (x + 1u < width)
-                StoreRec709UyvyAsBgra(y1, u, v, destinationRow + static_cast<std::size_t>(x + 1u) * 4u);
-        }
-    }
-    const auto convertEnd = std::chrono::steady_clock::now();
-    mConvertMilliseconds.store(
-        std::chrono::duration_cast<std::chrono::duration<double, std::milli>>(convertEnd - convertStart).count(),
-        std::memory_order_relaxed);
+    mConvertMilliseconds.store(0.0, std::memory_order_relaxed);
     const uint64_t frameIndex = mCapturedFrames.load(std::memory_order_relaxed);
     const auto submitStart = std::chrono::steady_clock::now();
-    const bool submitted = mMailbox.SubmitFrame(mConversionBuffer.data(), destinationRowBytes, frameIndex);
+    const bool submitted = mMailbox.SubmitFrame(bytes, static_cast<unsigned>(sourceRowBytes), frameIndex);
     const auto submitEnd = std::chrono::steady_clock::now();
     mSubmitMilliseconds.store(
         std::chrono::duration_cast<std::chrono::duration<double, std::milli>>(submitEnd - submitStart).count(),


@@ -63,6 +63,7 @@ public:
     bool IsInitialized() const { return mInput != nullptr; }
     bool IsRunning() const { return mRunning.load(std::memory_order_acquire); }
+    VideoIOPixelFormat CapturePixelFormat() const;
     DeckLinkInputMetrics Metrics() const;
     void HandleFrameArrived(IDeckLinkVideoInputFrame* inputFrame);
@@ -80,7 +81,6 @@ private:
     BMDPixelFormat mCapturePixelFormat = bmdFormat8BitBGRA;
     CComPtr<IDeckLinkInput> mInput;
     CComPtr<DeckLinkInputCallback> mCallback;
-    std::vector<unsigned char> mConversionBuffer;
     std::atomic<bool> mRunning{ false };
     std::atomic<uint64_t> mCapturedFrames{ 0 };
     std::atomic<uint64_t> mNoInputSourceFrames{ 0 };