Skip to content

fix: resolve camera preview and export pipeline issues#1679

Merged
richiemcilroy merged 3 commits intoCapSoftware:mainfrom
hidumou:fix/camera-and-export-pipeline
Apr 1, 2026
Merged

fix: resolve camera preview and export pipeline issues#1679
richiemcilroy merged 3 commits intoCapSoftware:mainfrom
hidumou:fix/camera-and-export-pipeline

Conversation

@hidumou
Copy link
Copy Markdown
Contributor

@hidumou hidumou commented Mar 25, 2026

Summary

This PR fixes three issues in the desktop recording/export pipeline:

  1. Camera preview did not re-open when the same camera was already active.
  2. Windows camera exports could show snow/noisy artifacts because camera buffers were read with the wrong stride/layout assumptions.
  3. 4K 60fps export could fail with Failed to export recording: Encode: Invalid argument because encoded packets could reach the MP4 muxer with invalid timestamp ordering (pts < dts).

Changes

  • Re-open the camera preview window when the selected camera is already active but the preview window is closed.
  • Respect camera buffer stride/layout on Windows during FFmpeg conversion and texture upload paths.
  • Tighten the libx264 high-throughput export configuration used for 4K60 software fallback by disabling B-frames.
  • Add a packet timestamp safety fix before MP4 muxing so packets are not written with pts < dts.

Root Cause

For the 4K60 export failure, logs showed non-monotonic timestamp ordering during software H.264 export. The encoder/muxer path could produce packets where pts became smaller than dts, which causes FFmpeg/MP4 muxing to fail with Invalid argument.

For the Windows camera artifact issue, camera frames were previously treated as tightly packed buffers, but real device buffers can contain row padding/stride. That caused corrupted image interpretation in export/output paths.

Validation

  • cargo fmt --all
  • cargo check -p cap-enc-ffmpeg -p cap-export
  • cargo check --manifest-path apps/desktop/src-tauri/Cargo.toml

Additionally, the failing local 4K60 export project was re-run and progressed well past the previous early failure range without immediately reproducing Encode: Invalid argument.

Greptile Summary

This PR fixes three distinct bugs in the recording/export pipeline: the camera preview not re-opening when the same camera device is re-selected, Windows camera export artifacts caused by ignoring device buffer row-padding (stride), and 4K60 export failures due to pts < dts timestamp ordering violations.

Key changes:

  • lib.rs: Adds CapWindowId::Camera.get(&app_handle).is_none() to the early-return branch so the preview window is re-shown only when it's actually closed; also extracts skip_camera_window.unwrap_or(false) to a single binding to eliminate repeated unwraps.
  • base.rs: Introduces a secondary guard that clamps pts to dts when the encoder emits a packet with pts < dts independently of any DTS ordering violation — a valid scenario at stream initialisation or during B-frame drain.
  • h264.rs: Sets bf=0 and b-adapt=0 in the libx264 HighThroughput software path, removing B-frames as the primary cause of pts < dts during 4K60 export.
  • win.rs: Introduces infer_camera_buffer_layout, which derives primary_stride and secondary_stride from the actual buffer length, and threads both values through copy_plane, flip_camera_buffer_into, camera_frame_to_ffmpeg, and upload_mf_buffer_to_texture. All format branches (NV12, NV21, YUYV422, UYVY422, YUV420P, YV12, P010, RGB/BGR variants) now correctly skip padding bytes on both the FFmpeg conversion and D3D11 texture upload paths.

Confidence Score: 5/5

Safe to merge — all changes are targeted bug fixes with correct implementations and no regressions identified.

All three bug fixes are logically sound: the camera window guard is correct, the pts<dts clamp is a standard muxer safety net, and the stride-aware buffer handling is comprehensively applied across every format branch. The two edge-case concerns raised in prior review threads are pre-existing limitations with clear error logging, not new regressions. No P0 or P1 issues remain.

crates/recording/src/output_pipeline/win.rs — largest and most complex change; worth a final read-through of the NV21 UV swap loop and the YUV420P V-plane offset arithmetic.

Important Files Changed

Filename Overview
apps/desktop/src-tauri/src/lib.rs Camera preview fix: adds CapWindowId::Camera.get().is_none() guard so the window is only re-opened when it's actually closed; also extracts skip_camera_window.unwrap_or(false) to a local to avoid repeated unwrap calls.
crates/enc-ffmpeg/src/base.rs Adds a secondary pts<dts safety clamp after the existing DTS-monotonicity block; correctly handles the distinct case where the encoder emits pts<dts without a DTS ordering violation.
crates/enc-ffmpeg/src/video/h264.rs Disables B-frames (bf=0, b-adapt=0) in the libx264 HighThroughput software fallback path, eliminating the primary source of pts<dts timestamp ordering violations during 4K60 export.
crates/recording/src/output_pipeline/win.rs Large refactor introducing infer_camera_buffer_layout and compact_primary_stride to correctly handle device-supplied row padding; all format branches now use the inferred stride. Edge cases (MJPEG hard-fail in texture path, asymmetric UV stride) already noted in prior review threads.

Reviews (2): Last reviewed commit: "fmt" | Re-trigger Greptile

Comment on lines +1850 to +1867
let layout = infer_camera_buffer_layout(
frame.pixel_format,
frame.width as usize,
frame.height as usize,
original_data.len(),
)
.ok_or_else(|| {
windows::core::Error::new(
windows::core::HRESULT(-1),
format!(
"Failed to infer camera buffer layout for {:?} {}x{} ({} bytes)",
frame.pixel_format,
frame.width,
frame.height,
original_data.len()
),
)
})?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 MJPEG cameras will now hard-fail in upload_mf_buffer_to_texture

infer_camera_buffer_layout always returns None for PixelFormat::MJPEG and PixelFormat::H264, so this function will now return an error immediately for any camera that streams in MJPEG format. Before this PR, the function continued (incorrectly, producing garbage), but now it fails outright.

camera_frame_to_ffmpeg correctly short-circuits MJPEG at the top of the function before reaching the layout inference. upload_mf_buffer_to_texture has no equivalent guard.

If any MJPEG cameras are used for the live preview path (which goes through texture upload), the preview will break silently. Consider adding a similar early-return guard:

// MJPEG/H264 frames cannot be uploaded as raw pixel textures
if matches!(
    frame.pixel_format,
    cap_camera_windows::PixelFormat::MJPEG | cap_camera_windows::PixelFormat::H264
) {
    return Err(windows::core::Error::new(
        windows::core::HRESULT(-1),
        "MJPEG/H264 cannot be uploaded as a raw texture".to_string(),
    ));
}

If MJPEG was already broken in this path before (the old bytes_per_pixel = _ => 2 was clearly wrong for compressed data), this is acceptable fail-fast behavior — but it should be documented explicitly.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/win.rs
Line: 1850-1867

Comment:
**MJPEG cameras will now hard-fail in `upload_mf_buffer_to_texture`**

`infer_camera_buffer_layout` always returns `None` for `PixelFormat::MJPEG` and `PixelFormat::H264`, so this function will now return an error immediately for any camera that streams in MJPEG format. Before this PR, the function continued (incorrectly, producing garbage), but now it fails outright.

`camera_frame_to_ffmpeg` correctly short-circuits MJPEG at the top of the function before reaching the layout inference. `upload_mf_buffer_to_texture` has no equivalent guard.

If any MJPEG cameras are used for the live preview path (which goes through texture upload), the preview will break silently. Consider adding a similar early-return guard:

```rust
// MJPEG/H264 frames cannot be uploaded as raw pixel textures
if matches!(
    frame.pixel_format,
    cap_camera_windows::PixelFormat::MJPEG | cap_camera_windows::PixelFormat::H264
) {
    return Err(windows::core::Error::new(
        windows::core::HRESULT(-1),
        "MJPEG/H264 cannot be uploaded as a raw texture".to_string(),
    ));
}
```

If MJPEG was already broken in this path before (the old `bytes_per_pixel = _ => 2` was clearly wrong for compressed data), this is acceptable fail-fast behavior — but it should be documented explicitly.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +1149 to +1186
fn infer_camera_buffer_layout(
pixel_format: cap_camera_windows::PixelFormat,
width: usize,
height: usize,
data_len: usize,
) -> Option<CameraBufferLayout> {
use cap_camera_windows::PixelFormat;

if width == 0 || height == 0 || data_len == 0 {
return None;
}

let total_rows = match pixel_format {
PixelFormat::NV12
| PixelFormat::NV21
| PixelFormat::YUV420P
| PixelFormat::YV12
| PixelFormat::P010 => height + (height / 2),
PixelFormat::YUYV422
| PixelFormat::UYVY422
| PixelFormat::GRAY16
| PixelFormat::RGB565
| PixelFormat::ARGB
| PixelFormat::RGB32
| PixelFormat::RGB24
| PixelFormat::BGR24
| PixelFormat::GRAY8 => height,
PixelFormat::MJPEG | PixelFormat::H264 => return None,
};

if total_rows == 0 || data_len % total_rows != 0 {
return None;
}

let primary_stride = data_len / total_rows;

match pixel_format {
PixelFormat::NV12 | PixelFormat::NV21 => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Stride inference may fail for devices with asymmetric UV plane alignment

infer_camera_buffer_layout derives primary_stride from data_len / total_rows, which assumes both the luma and chroma planes in planar formats (YUV420P, YV12) use strides consistent with a single stride value. Specifically, for YUV420P it assumes secondary_stride = primary_stride / 2.

This assumption holds for the vast majority of Windows camera drivers (MF aligns all planes to the same width multiple), but a device that provides a Y plane with stride 1936 and U/V planes with stride 1024 (a different alignment unit) would fail the data_len % total_rows != 0 check and cause the frame to be dropped with an error.

The resulting error is logged clearly, so failed frames are visible. This is a reasonable trade-off given that such cameras are rare, but it's worth noting as a known limitation of the stride inference approach.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/recording/src/output_pipeline/win.rs
Line: 1149-1186

Comment:
**Stride inference may fail for devices with asymmetric UV plane alignment**

`infer_camera_buffer_layout` derives `primary_stride` from `data_len / total_rows`, which assumes both the luma and chroma planes in planar formats (YUV420P, YV12) use strides consistent with a single stride value. Specifically, for YUV420P it assumes `secondary_stride = primary_stride / 2`.

This assumption holds for the vast majority of Windows camera drivers (MF aligns all planes to the same width multiple), but a device that provides a Y plane with stride 1936 and U/V planes with stride 1024 (a different alignment unit) would fail the `data_len % total_rows != 0` check and cause the frame to be dropped with an error.

The resulting error is logged clearly, so failed frames are visible. This is a reasonable trade-off given that such cameras are rare, but it's worth noting as a known limitation of the stride inference approach.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +88 to +92
if let (Some(pts), Some(dts)) = (self.packet.pts(), self.packet.dts())
&& pts < dts
{
self.packet.set_pts(Some(dts));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Second pts < dts guard is correct but worth a comment

The existing DTS monotonicity block (lines 76–86) already adjusts PTS when DTS is bumped. This new block catches the distinct case where the encoder emits a packet with pts < dts without any DTS ordering violation (i.e., the first block didn't fire). This is a real scenario with libx264 at certain configurations.

A short comment would help future readers understand why two separate guards exist:

// Secondary guard: encoder may emit pts < dts even when DTS is
// already monotonic (e.g. during stream init or B-frame drain).
if let (Some(pts), Some(dts)) = (self.packet.pts(), self.packet.dts())
    && pts < dts
{
    self.packet.set_pts(Some(dts));
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/enc-ffmpeg/src/base.rs
Line: 88-92

Comment:
**Second `pts < dts` guard is correct but worth a comment**

The existing DTS monotonicity block (lines 76–86) already adjusts PTS when DTS is bumped. This new block catches the distinct case where the encoder emits a packet with `pts < dts` without any DTS ordering violation (i.e., the first block didn't fire). This is a real scenario with libx264 at certain configurations.

A short comment would help future readers understand why two separate guards exist:

```rust
// Secondary guard: encoder may emit pts < dts even when DTS is
// already monotonic (e.g. during stream init or B-frame drain).
if let (Some(pts), Some(dts)) = (self.packet.pts(), self.packet.dts())
    && pts < dts
{
    self.packet.set_pts(Some(dts));
}
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@richiemcilroy
Copy link
Copy Markdown
Member

Hey @hidumou, was this an issue you have personally experienced?

@richiemcilroy
Copy link
Copy Markdown
Member

@greptileai please re-review the PR

@richiemcilroy richiemcilroy merged commit 188dd93 into CapSoftware:main Apr 1, 2026
11 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants