Conversation
…ject Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
…kwards-compatible caption migration Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
…pipelines Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
…ildup Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
…line Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
…ll drag/resize/split support Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
…board settings store Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
… generation Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
|
Cursor Agent can help with this pull request. Just |
| if is_special && !show_special_keys && !is_modifier { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Right now Backspace gets skipped when show_special_keys is false (it hits the continue before the backspace handling), so the displayed text won’t reflect deletions unless special keys are enabled.
| if is_special && !show_special_keys && !is_modifier { | |
| continue; | |
| } | |
| if is_special && !show_special_keys && !is_modifier && event.key != "Backspace" { | |
| continue; | |
| } |
| all_events.presses.sort_by(|a, b| { | ||
| a.time_ms | ||
| .partial_cmp(&b.time_ms) | ||
| .unwrap_or(std::cmp::Ordering::Equal) | ||
| }); |
There was a problem hiding this comment.
partial_cmp + unwrap_or(Equal) can hide NaNs and makes ordering less explicit. Since this is f64, total_cmp is a nice drop-in here.
| all_events.presses.sort_by(|a, b| { | |
| a.time_ms | |
| .partial_cmp(&b.time_ms) | |
| .unwrap_or(std::cmp::Ordering::Equal) | |
| }); | |
| all_events | |
| .presses | |
| .sort_by(|a, b| a.time_ms.total_cmp(&b.time_ms)); |
| fn flush_keyboard_data(output_path: &Path, presses: &[KeyPressEvent]) { | ||
| let events = KeyboardEvents { | ||
| presses: presses.to_vec(), | ||
| }; | ||
| if let Ok(json) = serde_json::to_string_pretty(&events) | ||
| && let Err(e) = std::fs::write(output_path, json) | ||
| { | ||
| tracing::error!( | ||
| "Failed to write keyboard data to {}: {}", | ||
| output_path.display(), | ||
| e | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
This currently drops JSON serialization errors silently, and to_string_pretty allocates a full string every flush. Writing directly to a file handles both and is a bit cheaper.
| fn flush_keyboard_data(output_path: &Path, presses: &[KeyPressEvent]) { | |
| let events = KeyboardEvents { | |
| presses: presses.to_vec(), | |
| }; | |
| if let Ok(json) = serde_json::to_string_pretty(&events) | |
| && let Err(e) = std::fs::write(output_path, json) | |
| { | |
| tracing::error!( | |
| "Failed to write keyboard data to {}: {}", | |
| output_path.display(), | |
| e | |
| ); | |
| } | |
| } | |
| fn flush_keyboard_data(output_path: &Path, presses: &[KeyPressEvent]) { | |
| let events = KeyboardEvents { | |
| presses: presses.to_vec(), | |
| }; | |
| let file = match std::fs::File::create(output_path) { | |
| Ok(file) => file, | |
| Err(e) => { | |
| tracing::error!( | |
| "Failed to open keyboard data file {}: {}", | |
| output_path.display(), | |
| e | |
| ); | |
| return; | |
| } | |
| }; | |
| let mut writer = std::io::BufWriter::new(file); | |
| if let Err(e) = serde_json::to_writer(&mut writer, &events) { | |
| tracing::error!( | |
| "Failed to write keyboard data to {}: {}", | |
| output_path.display(), | |
| e | |
| ); | |
| } | |
| } |
| let grouped = cap_project::group_key_events( | ||
| &all_events, | ||
| grouping_threshold_ms, | ||
| linger_duration_ms, | ||
| show_modifiers, | ||
| show_special_keys, | ||
| ); |
There was a problem hiding this comment.
Minor robustness: if these come from the UI as floats, clamping to non-negative avoids end < start segments when values go negative.
| let grouped = cap_project::group_key_events( | |
| &all_events, | |
| grouping_threshold_ms, | |
| linger_duration_ms, | |
| show_modifiers, | |
| show_special_keys, | |
| ); | |
| let grouping_threshold_ms = grouping_threshold_ms.max(0.0); | |
| let linger_duration_ms = linger_duration_ms.max(0.0); | |
| let grouped = cap_project::group_key_events( | |
| &all_events, | |
| grouping_threshold_ms, | |
| linger_duration_ms, | |
| show_modifiers, | |
| show_special_keys, | |
| ); |
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
|
@greptileai please re-review the PR |
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
|
@greptileai please re-review the pr |
| let prefix = modifier_prefix(&modifiers); | ||
| if !prefix.is_empty() { | ||
| current_group_start = Some(event.time_ms); | ||
| current_display = prefix; | ||
| current_keys.push(KeyPressDisplay { | ||
| key: event.key.clone(), |
There was a problem hiding this comment.
Empty segments emitted when Backspace clears group
When a user types a character, presses Backspace to delete it (making current_display empty), then waits past grouping_threshold_ms before typing again, the should_start_new_group && let Some(start) = current_group_start branch fires and pushes a segment with an empty display_text. This produces invisible/blank keyboard overlay segments on the timeline.
The fix is to skip the push when current_display is empty, mirroring the guard already present on the final segment push at the bottom of the function:
| let prefix = modifier_prefix(&modifiers); | |
| if !prefix.is_empty() { | |
| current_group_start = Some(event.time_ms); | |
| current_display = prefix; | |
| current_keys.push(KeyPressDisplay { | |
| key: event.key.clone(), | |
| if should_start_new_group && let Some(start) = current_group_start { | |
| if !current_display.is_empty() { | |
| segment_counter += 1; | |
| segments.push(KeyboardTrackSegment { | |
| id: format!("kb-{segment_counter}"), | |
| start: start / 1000.0, | |
| end: (last_key_time + linger_duration_ms) / 1000.0, | |
| display_text: current_display.clone(), | |
| keys: current_keys.clone(), | |
| fade_duration_override: None, | |
| position_override: None, | |
| color_override: None, | |
| background_color_override: None, | |
| font_size_override: None, | |
| }); | |
| } | |
| current_display.clear(); | |
| current_keys.clear(); | |
| current_group_start = None; | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/project/src/keyboard.rs
Line: 245-250
Comment:
**Empty segments emitted when Backspace clears group**
When a user types a character, presses Backspace to delete it (making `current_display` empty), then waits past `grouping_threshold_ms` before typing again, the `should_start_new_group && let Some(start) = current_group_start` branch fires and pushes a segment with an empty `display_text`. This produces invisible/blank keyboard overlay segments on the timeline.
The fix is to skip the push when `current_display` is empty, mirroring the guard already present on the final segment push at the bottom of the function:
```suggestion
if should_start_new_group && let Some(start) = current_group_start {
if !current_display.is_empty() {
segment_counter += 1;
segments.push(KeyboardTrackSegment {
id: format!("kb-{segment_counter}"),
start: start / 1000.0,
end: (last_key_time + linger_duration_ms) / 1000.0,
display_text: current_display.clone(),
keys: current_keys.clone(),
fade_duration_override: None,
position_override: None,
color_override: None,
background_color_override: None,
font_size_override: None,
});
}
current_display.clear();
current_keys.clear();
current_group_start = None;
}
```
How can I resolve this? If you propose a fix, please make it concise.| const duration = segment.end - segment.start; | ||
| const remaining = duration - time; | ||
| if (time < 0.3 || remaining < 0.3) return; | ||
|
|
||
| segments.splice(index + 1, 0, { | ||
| ...segment, | ||
| start: segment.start + time, | ||
| end: segment.end, | ||
| }); | ||
| segments[index].end = segment.start + time; | ||
| }), | ||
| ); | ||
| }, | ||
| deleteKeyboardSegments: (segmentIndices: number[]) => { | ||
| batch(() => { | ||
| setProject( | ||
| "timeline", |
There was a problem hiding this comment.
Duplicate segment
id after split
splitKeyboardSegment spreads ...segment into the new right-hand segment, which copies the original's id. After the split both halves share the same id string. The same issue exists in splitCaptionSegment (~line 460). Any code that uses id as a unique key (e.g., the renderer looking up per-segment overrides, undo/redo reconciliation) will be unable to distinguish the two halves.
Generate a fresh ID for the new segment on each split, e.g.:
segments.splice(index + 1, 0, {
...segment,
id: `kb-split-${Date.now()}-${Math.random().toString(36).slice(2)}`,
start: segment.start + time,
end: segment.end,
});Apply the same fix in splitCaptionSegment.
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/desktop/src/routes/editor/context.ts
Line: 430-446
Comment:
**Duplicate segment `id` after split**
`splitKeyboardSegment` spreads `...segment` into the new right-hand segment, which copies the original's `id`. After the split both halves share the same `id` string. The same issue exists in `splitCaptionSegment` (~line 460). Any code that uses `id` as a unique key (e.g., the renderer looking up per-segment overrides, undo/redo reconciliation) will be unable to distinguish the two halves.
Generate a fresh ID for the new segment on each split, e.g.:
```typescript
segments.splice(index + 1, 0, {
...segment,
id: `kb-split-${Date.now()}-${Math.random().toString(36).slice(2)}`,
start: segment.start + time,
end: segment.end,
});
```
Apply the same fix in `splitCaptionSegment`.
How can I resolve this? If you propose a fix, please make it concise.| } | ||
| }); | ||
|
|
||
| const generateCaptionsFromTrack = async () => { | ||
| if (!editorInstance) return; | ||
|
|
||
| setEditorState("timeline", "tracks", "caption", true); |
There was a problem hiding this comment.
Caption track stays enabled on failed or empty generation
setEditorState("timeline", "tracks", "caption", true) runs unconditionally before the async call. When transcription throws or returns zero segments the caption track is left visible (showing the empty-state UI) with no automatic rollback. The user must manually toggle it off in the track header.
Consider only enabling the track on success, or resetting it on failure:
| } | |
| }); | |
| const generateCaptionsFromTrack = async () => { | |
| if (!editorInstance) return; | |
| setEditorState("timeline", "tracks", "caption", true); | |
| const generateCaptionsFromTrack = async () => { | |
| if (!editorInstance) return; | |
| setEditorState("captions", "isGenerating", true); | |
| try { | |
| const result = await transcribeEditorCaptions(editorInstance.path); | |
| if (result.segments.length < 1) { | |
| toast.error( | |
| "No captions were generated. The audio might be too quiet or unclear.", | |
| ); | |
| return; | |
| } |
Then add setEditorState("timeline", "tracks", "caption", true) only inside the successful branch, after the produce call.
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/desktop/src/routes/editor/Timeline/index.tsx
Line: 559-565
Comment:
**Caption track stays enabled on failed or empty generation**
`setEditorState("timeline", "tracks", "caption", true)` runs unconditionally before the async call. When transcription throws or returns zero segments the caption track is left visible (showing the empty-state UI) with no automatic rollback. The user must manually toggle it off in the track header.
Consider only enabling the track on success, or resetting it on failure:
```suggestion
const generateCaptionsFromTrack = async () => {
if (!editorInstance) return;
setEditorState("captions", "isGenerating", true);
try {
const result = await transcribeEditorCaptions(editorInstance.path);
if (result.segments.length < 1) {
toast.error(
"No captions were generated. The audio might be too quiet or unclear.",
);
return;
}
```
Then add `setEditorState("timeline", "tracks", "caption", true)` only inside the successful branch, after the `produce` call.
How can I resolve this? If you propose a fix, please make it concise.Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
|
@greptileai please re-review the pr |
|
|
||
| segments.splice(index + 1, 0, { | ||
| ...segment, | ||
| id: `kb-split-${Date.now()}-${Math.random().toString(36).slice(2)}`, |
There was a problem hiding this comment.
Since this is running in the desktop app, would you be up for using crypto.randomUUID() (already used elsewhere in apps/desktop) instead of Date.now()/Math.random() for split IDs?
| id: `kb-split-${Date.now()}-${Math.random().toString(36).slice(2)}`, | |
| id: `kb-split-${crypto.randomUUID()}`, |
|
|
||
| segments.splice(index + 1, 0, { | ||
| ...segment, | ||
| id: `cap-split-${Date.now()}-${Math.random().toString(36).slice(2)}`, |
There was a problem hiding this comment.
Same thought here for consistency + collision resistance.
| id: `cap-split-${Date.now()}-${Math.random().toString(36).slice(2)}`, | |
| id: `cap-split-${crypto.randomUUID()}`, |
Adds keyboard press recording and display, and migrates captions and keyboard presses to interactive timeline tracks for improved editing and visualization.
Greptile Summary
This PR adds keyboard press recording and display as a new feature, and migrates both captions and keyboard events into interactive timeline tracks — a significant UX improvement over the previous static sidebar-only controls.
Key changes:
crates/project/src/keyboard.rs: Binary (magic-prefixed) + JSON-fallback serialization forKeyboardEvents, andgroup_key_eventswhich converts raw key-press events into timeline segments with modifier-prefix support, Backspace handling, and command-shortcut grouping. Includes a solid unit test suite.crates/rendering/src/layers/keyboard.rs: GPU keyboard overlay renderer usingglyphonfor text and a dedicated WGPU pipeline for rounded backgrounds; handles fade, bounce animation, and caption-collision avoidance.CaptionsTrack/KeyboardTracktimeline components: Interactive drag-resize / drag-move / split / multi-select track rows, wired into the mainTimelinecomponent.KeyboardTabsidebar: Full font, color, position, animation, and behavior settings for keyboard overlays; includes a "Generate Keyboard Segments" button.general_settings.rs: Addscapture_keyboard_events(default true) andtranscription_hintsfields. The default hints include placeholder example values ("My Brand Name","mywebsite.com") that will be passed to Whisper for every new user.KeyboardTab.tsx:generateSegments()silently does nothing when the backend returns 0 segments, giving no user feedback — unlike the caption generation path which emits a descriptive toast.context.ts:timelineBoundsis destructured but unused in bothCaptionsTrackandKeyboardTrack.Confidence Score: 4/5
keyboardSegmentsin fallback initializer, caption track enabled on failure) are all addressed. Two new issues remain: (1)generateSegmentsinKeyboardTab.tsxgives no user feedback on an empty result — surprising since the caption path does — and (2) the defaulttranscription_hintscontains placeholder strings that will bias Whisper for every new user. Neither is a crash-level bug, but (2) in particular is a data-quality issue that will silently affect all new installations.apps/desktop/src/routes/editor/KeyboardTab.tsx(silent failure on empty generation) andapps/desktop/src-tauri/src/general_settings.rs(placeholder transcription hints as defaults).Important Files Changed
group_key_eventsthat groups raw key presses into timeline segments. Thorough unit tests included; known Space-key display issue (key stored as" "rather than"Space") is already tracked in prior review threads.generateCaptionsFromTrackasync function with proper toast feedback.captionSegmentsandkeyboardSegmentsare now included in all timeline initializations.splitKeyboardSegment,deleteKeyboardSegments,splitCaptionSegment,deleteCaptionSegmentsproject actions; initializes keyboard segments on first load via an effect; migrates the obsolete"above-captions"position string to"bottom-center". IDs for split segments are correctly unique.timelineBoundsdestructured but unused.timelineBoundsis destructured but unused (same issue as CaptionsTrack).generateSegments()silently does nothing when 0 segments are returned, giving no user feedback — unlike the caption generation path which emits a toast.capture_keyboard_events(default true) andtranscription_hintsfields. Default hints include placeholder values"My Brand Name"and"mywebsite.com"that will be fed to Whisper for every new user, potentially degrading transcription accuracy.CaptionTrackSegment,KeyboardSettings,KeyboardDatastructs andcaption_segments/keyboard_segmentsfields toTimelineConfiguration. All new fields have#[serde(default)]for backward compatibility.keyboardpath toMultipleSegment,keyboard_events()helper with binary/JSON fallback, and a migration that populatestimeline.caption_segmentsfromcaptions.segmentsfor existing projects without timeline captions.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD REC[Recording Session] -->|keyboard capture ON| KB_FILE[keyboard.bin per segment] KB_FILE --> META[MultipleSegment.keyboard_events] META -->|first editor load| AUTO_INIT{keyboardSegments empty?} AUTO_INIT -->|yes| GEN_CMD[generate_keyboard_segments Tauri command] AUTO_INIT -->|no| SKIP[skip init] GEN_CMD --> GROUP[group_key_events\nRust: threshold, linger, modifiers, special] GROUP --> KB_SEGMENTS[timeline.keyboardSegments] KB_SEGMENTS --> KB_TRACK[KeyboardTrack\ndrag / split / select] KB_TRACK -->|edit| KB_SEGMENTS KB_SEGMENTS --> RENDER[keyboard.rs GPU layer\nfade + bounce + position] CAPTIONS_DATA[captions.segments] -->|migration in meta.rs| CAP_SEGMENTS[timeline.captionSegments] CAP_SEGMENTS --> CAP_TRACK[CaptionsTrack\ndrag / split / select] CAP_TRACK -->|generate btn| TRANSCRIBE[transcribeEditorCaptions\nWhisper] TRANSCRIBE --> CAP_SEGMENTS RENDER --> OUTPUT[Exported video] CAP_SEGMENTS --> OUTPUTPrompt To Fix All With AI
Reviews (4): Last reviewed commit: "fix(editor): defer caption track enable ..." | Re-trigger Greptile