Waveform Editor

The Waveform tab inside the Subtitle Editor is a full editing surface, not just a viewer. It offers a large active-cue bar above the wave, text labels inside every region, visible keyframe and quality markers, and toolbar sliders for amplitude and playback rate.

The actual subtitle file (SRT or ASS) is rewritten only when you press Save in the modal — every drag and shortcut just stages an in-memory edit, so you can experiment freely.
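
The staging behaviour described above can be pictured with a small model. This is only an illustrative sketch: the class name `StagedCues` and the `write` callback are hypothetical, not names from the editor's code.

```python
from __future__ import annotations

from typing import Callable

Cue = tuple[int, int]  # (start_ms, end_ms)


class StagedCues:
    """Holds edits in memory; nothing is written until save() is called."""

    def __init__(self, cues_on_disk: list[Cue]) -> None:
        self._saved = list(cues_on_disk)    # last state written to disk
        self._working = list(cues_on_disk)  # state shown in the editor

    def stage(self, index: int, start_ms: int, end_ms: int) -> None:
        # A drag or shortcut only touches the working copy.
        self._working[index] = (start_ms, end_ms)

    @property
    def edited(self) -> bool:
        return self._working != self._saved

    def save(self, write: Callable[[list[Cue]], None]) -> None:
        # Only here does the (hypothetical) writer get called.
        write(list(self._working))
        self._saved = list(self._working)
```

Until `save()` runs, discarding the working copy leaves the file on disk untouched, which is what makes experimenting safe.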

What you see

┌─────────────────────────────────────────────────────────────┐
│ Cue 23 / 471   00:01:23.400 → 00:01:25.800   (2.400 s)      │
│ Sie haben mich nicht ernst genommen.                         │   ← Active cue bar (A)
│ Aber das war ein Fehler.                                     │
├─────────────────────────────────────────────────────────────┤
│ 0   5   10   15   20   25 …                                  │   ← Sticky timeline (G)
│   ████░ ░░░  ████   ░░  █████   ░░░░  ████                   │   ← Waveform with cue regions
│   "Hallo"   "Du da?"   "Stop!"   "Ja, …"   "Hört mich an."  │   ← Region labels (B)
│   ╿       ╿       ╿       ╿       ╿       ╿                  │   ← Keyframe markers (C, opt-in)
│ ▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▒▒ ▂▂▂▂▂▂▂  ▓▓ ▂▂▂▂▂▂▂▂▂▂▂                  │   ← Gap/overlap markers (F)
└─────────────────────────────────────────────────────────────┘

Opening It

1. Open any cached subtitle from Library → Movie/Episode detail.
2. Click Edit on a sidecar pill; the Subtitle Editor opens.
3. Switch to the Waveform tab. The audio track is extracted on demand (cached per audio stream), so the first open per video may take a few seconds.

If the source video has multiple audio tracks (Japanese / English / commentary), a track picker appears next to the toolbar. Switching tracks re-extracts the audio and renders the chosen track.

Editing a Cue

Click a cue in the Cues tab to select it. The selected region highlights on the waveform.
Drag the region edges to retime start/end. When you release, the edit snaps to the closest anchor (see Snap Targets) and enforces a minimum cue duration of 80 ms.
Drag the region body to shift both edges by the same delta, which is useful for nudging a whole line without changing its length.
Use left and right click to set an edge quickly:
- Left click anywhere on the waveform → sets the selected cue’s start to that time (snapped).
- Right click → sets the end.

The header pill on the cue turns “edited” once you’ve staged a change; Save writes it back to disk.
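
The edge-drag and body-drag rules above can be sketched as follows. This is a minimal illustration, not the editor's code: the function names are hypothetical, and it assumes a too-short drag is clamped rather than rejected and that times never go below zero.

```python
from __future__ import annotations

MIN_CUE_MS = 80  # minimum cue duration stated above


def retime_edge(start_ms: int, end_ms: int, edge: str, new_ms: int) -> tuple[int, int]:
    """Move one edge to new_ms, keeping the cue at least MIN_CUE_MS long."""
    if edge == "start":
        # Assumption: clamp instead of rejecting the drag.
        return max(0, min(new_ms, end_ms - MIN_CUE_MS)), end_ms
    if edge == "end":
        return start_ms, max(new_ms, start_ms + MIN_CUE_MS)
    raise ValueError(f"unknown edge: {edge!r}")


def shift_body(start_ms: int, end_ms: int, delta_ms: int) -> tuple[int, int]:
    """Shift both edges by the same delta; the duration is unchanged."""
    delta_ms = max(delta_ms, -start_ms)  # assumption: never move before 0
    return start_ms + delta_ms, end_ms + delta_ms
```

For example, dragging the start of a 1000–2000 ms cue to 1990 ms would leave it at 1920 ms under these assumptions, because the cue may not become shorter than 80 ms.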

Snap Targets

Drag-end and click commits use a three-pool snap algorithm: the closest anchor inside its tolerance wins; ties favour keyframes, then scenes, then neighbouring cues.

Pool	Default tolerance	Source
Keyframes	150 ms	`ffprobe -skip_frame nokey` against the video
Scene changes	200 ms	PySceneDetect (lazy-imported; install `scenedetect` to enable)
Neighbouring cue boundaries	80 ms	The currently parsed subtitle file

Scene markers paint as thin amber lines on the waveform. The “N scene cuts” status pill confirms how many were detected; if the host doesn’t have scenedetect installed, the pool stays empty and the algorithm falls back to keyframes + neighbours.
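
One way to read the three-pool rule is the sketch below. It uses the default tolerances from the table and the stated tie-break order; the function name, the pool keys, and the choice to treat the tolerance as inclusive are assumptions made for illustration.

```python
from __future__ import annotations

# Pools in tie-break order, with the default tolerances from the table (ms).
POOLS = (("keyframe", 150), ("scene", 200), ("cue", 80))


def snap(t_ms: int, anchors: dict[str, list[int]]) -> tuple[int, str | None]:
    """Return (snapped_time, pool); pool is None when nothing is in range."""
    best = None  # (distance, priority, anchor, pool)
    for priority, (pool, tolerance) in enumerate(POOLS):
        for anchor in anchors.get(pool, ()):
            distance = abs(anchor - t_ms)
            if distance > tolerance:  # assumption: tolerance is inclusive
                continue
            candidate = (distance, priority, anchor, pool)
            # Smaller distance wins; on equal distance, the earlier pool wins.
            if best is None or candidate[:2] < best[:2]:
                best = candidate
    if best is None:
        return t_ms, None  # nothing close enough: keep the raw time
    return best[2], best[3]
```

An empty or missing pool (for instance, no scene cuts because `scenedetect` is not installed) simply contributes no candidates, which matches the fallback to keyframes and neighbours described above.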

Keyboard Shortcuts

The editor follows Aegisub’s Medusa-style layout. Press ? at any time to open a shortcut overlay listing every action.

Group	Keys	Action
Playback	`Space`	Play / pause
Timing	`S` / `D`	Set start / end at playhead
Timing	`F` / `G`	Split at cursor / merge with next cue
Navigation	`←` / `→`	Seek 100 ms back / forward
Navigation	`Shift+←` / `Shift+→`	Seek 1 s back / forward
Navigation	`↑` / `↓`	Select previous / next cue
Zoom	`+` / `-`	Zoom in / out (multiplicative)
Help	`?`	Open this overlay

Shortcuts are intentionally suspended while the help modal is open so the close key never collides with another action.

Toolbar

Control	Purpose
Play/Pause	Toggles playback. Same as `Space`.
Edit / Locked	Drag-edit on/off. Disabled in read-only mode.
Shortcuts	Opens the shortcut overlay.
Spectrogram	Layers a spectrogram view (BSD-3-Clause WaveSurfer plugin). Persisted in `localStorage`.
Scrub	Plays a tiny audio window (~30 Hz throttled) while you drag, so you can hear what you’re snapping to.
Keyframes (v0.84+)	Toggles thin teal vertical lines at every video keyframe so you can see the snap targets you’re aligning to. Off by default — keyframes are dense (~one per 0.5 s) and clutter the wave. Persisted.
Zoom slider (horizontal)	1–50 px/sec. Linear slider; the `+`/`-` keys do exponential steps.
Amplitude zoom slider (vertical) (v0.84+)	1×–5× scaling of the wave’s `barHeight`. Crucial for editing quiet dialogue — without it whispers look flat. Persisted.
Playback rate slider (v0.84+)	0.5×–2.0× with `preservesPitch=true`, so slowing dialogue down doesn’t make it sound chipmunk-ish. Aegisub-style audition workflow. Persisted.
Auto-center	Keeps the selected cue in view as you switch cues.
Audio track	Hidden when the video has only one audio stream.
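
The difference between the linear zoom slider and the multiplicative `+`/`-` keys can be sketched like this. The 1–50 px/sec range comes from the table; the step factor of 1.25 and the function names are hypothetical.

```python
MIN_PX_PER_SEC = 1.0
MAX_PX_PER_SEC = 50.0
KEY_ZOOM_FACTOR = 1.25  # hypothetical step factor, not taken from the editor


def clamp_zoom(px_per_sec: float) -> float:
    """Keep the zoom inside the slider's range."""
    return max(MIN_PX_PER_SEC, min(MAX_PX_PER_SEC, px_per_sec))


def zoom_by_key(px_per_sec: float, direction: int) -> float:
    """direction is +1 for the + key and -1 for the - key."""
    return clamp_zoom(px_per_sec * KEY_ZOOM_FACTOR ** direction)
```

A multiplicative step changes the zoom by the same proportion at every level, so one key press feels similar at 2 px/sec and at 40 px/sec, whereas the slider moves in equal absolute increments.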

Quality Markers (Gap / Overlap)

Two coloured bars along the bottom 6 px of the wave flag adjacent-cue defects:

Red bar: two cues overlap (next.start < prev.end). The bar spans the overlap interval; hover for Overlap: cues N ↔ N+1. Most players render overlapped cues unpredictably, so fix these before saving.
Amber bar: tight gap (default < 80 ms). The bar spans the gap; hover for Tight gap (<80 ms): cues N → N+1. Whether you fix or keep it depends on the source, since fast dialogue often demands tight chaining.

Both layers are always on, because they’re the kind of quality signal you want to see permanently. To suppress them for a particular component instance, set `showGapOverlapMarkers={false}` on `WaveformEditor`; no UI toggle is shipped because the bars are intentionally cheap.
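
The two checks can be sketched as a single pass over adjacent cues. This is a minimal illustration with hypothetical names; it assumes cues are sorted by start time and that cues which touch exactly (gap of 0 ms) are not flagged.

```python
from __future__ import annotations

TIGHT_GAP_MS = 80  # default threshold stated above


def quality_markers(cues: list[tuple[int, int]]) -> list[tuple[str, int, int]]:
    """Return (kind, from_ms, to_ms) for each adjacent pair with a defect.

    `cues` is a list of (start_ms, end_ms) sorted by start time.
    """
    markers = []
    for (_, prev_end), (next_start, next_end) in zip(cues, cues[1:]):
        gap = next_start - prev_end
        if gap < 0:
            # Overlap: next.start < prev.end; the bar covers the shared span.
            markers.append(("overlap", next_start, min(prev_end, next_end)))
        elif 0 < gap < TIGHT_GAP_MS:
            # Assumption: a gap of exactly 0 ms is not reported.
            markers.append(("tight_gap", prev_end, next_start))
    return markers
```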

ASS Karaoke Display

When the active cue is an ASS karaoke line (\k<cs>, \K<cs>, \kf<cs>, \ko<cs> overrides), the editor paints a thin purple tick per syllable on top of the waveform. This is display-only. Sublarr deliberately does not retime karaoke — that workflow stays in Aegisub.
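
To illustrate where such ticks could come from: ASS karaoke durations are given in centiseconds, so each syllable starts at the sum of the durations before it. The sketch below is an assumption about how tick positions might be derived, not the editor's parser. The function name is hypothetical, and it places one tick at the start of each syllable.

```python
from __future__ import annotations

import re

# Matches \k, \K, \kf and \ko followed by a duration in centiseconds.
KARAOKE_TAG = re.compile(r"\\(kf|ko|k|K)(\d+)")
OVERRIDE_BLOCK = re.compile(r"\{([^}]*)\}")


def syllable_ticks(cue_start_ms: int, ass_text: str) -> list[int]:
    """Return the absolute start time (ms) of each karaoke syllable."""
    ticks = []
    offset_ms = 0
    for block in OVERRIDE_BLOCK.findall(ass_text):
        for _tag, centiseconds in KARAOKE_TAG.findall(block):
            ticks.append(cue_start_ms + offset_ms)
            offset_ms += int(centiseconds) * 10  # 1 cs = 10 ms
    return ticks
```

Because the ticks are derived from the tags and never written back, a display like this cannot alter the karaoke timing in the file.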

Limits and Notes

Read-only fall-back: if your install doesn’t grant write access to the subtitle file, the Waveform tab still loads but the Edit/Locked toggle stays locked.
Scene detection is best-effort: PySceneDetect is lazy-imported. If it’s missing you’ll simply see no scene markers, with no warning and no failure. To enable it, install the optional dependency on the host or rebuild the Docker image with `pip install scenedetect`.
Per-track caching: Audio extraction caches per (video_path, mtime, track_index). Switching tracks repeatedly during a session is cheap.
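
A cache key built from that tuple might look like the sketch below. The function name and the use of a SHA-256 digest are assumptions; only the three key components come from the note above.

```python
import hashlib
import os


def audio_cache_key(video_path: str, track_index: int) -> str:
    """Derive a key from (video_path, mtime, track_index)."""
    mtime_ns = os.stat(video_path).st_mtime_ns
    raw = f"{os.path.abspath(video_path)}|{mtime_ns}|{track_index}"
    # Hashing is only a convenient way to get a filesystem-safe name.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Including the modification time means a replaced or re-encoded video gets a fresh extraction, while switching back to a track already extracted for the same file reuses the cached audio.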
