Dubtitle Detection
Anime releases routinely bundle several English subtitle tracks in one container:
- Fansub / source subs — translate the original Japanese dialogue.
- Dubtitle (CC / SDH) — transcribe the spoken English dub.
- Signs & songs / forced — on-screen text and lyrics only.
They are usually all tagged eng with no standard flag to tell them apart. So if you watch the English dub, your player often shows the fansub track — which is a translation of the Japanese and doesn’t match what’s actually being said. The line you hear and the line you read drift apart.
Dubtitle Detection identifies the one track that matches the dub.
How it works
Detection runs in two tiers, cheapest first. Most files are resolved by Tier 1 alone, with no audio processing.
Tier 1 — heuristics (no audio)
Every English track is classified from metadata and cue statistics:
| Signal | What it tells us |
|---|---|
| Disposition / title | forced flag or Signs/Forced/Songs in the title → not a dialogue track. |
| SDH / CC markers | SDH/CC/HI in the title or the hearing-impaired flag → transcribes the audio, i.e. a strong dubtitle signal. |
| Cue density | Signs tracks are sparse (a handful of cues per minute); dialogue tracks are dense. |
| Characters per second | Dubtitles tend to run slightly “hotter” (more characters per second) than a literal translation of the Japanese. |
| Overlapping cues | Fansubs overlap dialogue with signs more often than a clean dub caption. |
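
The three statistical signals in the table (cue density, characters per second, overlapping cues) can be computed from parsed cues alone. The following is a minimal sketch, assuming cues are already parsed into start/end times in seconds plus text. The `Cue` type and `cue_stats` function are hypothetical names for illustration, not Sublarr's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def cue_stats(cues: list[Cue], duration_s: float) -> dict[str, float]:
    """Return cue density, characters per second, and overlap ratio for one track."""
    if not cues or duration_s <= 0:
        return {"cues_per_min": 0.0, "chars_per_sec": 0.0, "overlap_ratio": 0.0}

    ordered = sorted(cues, key=lambda c: c.start)
    shown_s = sum(max(c.end - c.start, 0.0) for c in ordered)
    chars = sum(len(c.text.replace("\n", " ").strip()) for c in ordered)

    # A cue counts as overlapping when it starts before an earlier cue has ended.
    overlaps = 0
    latest_end = ordered[0].end
    for cue in ordered[1:]:
        if cue.start < latest_end:
            overlaps += 1
        latest_end = max(latest_end, cue.end)

    return {
        "cues_per_min": len(ordered) / (duration_s / 60.0),
        "chars_per_sec": chars / shown_s if shown_s > 0 else 0.0,
        "overlap_ratio": overlaps / len(ordered),
    }
```

A signs track would show a low `cues_per_min`, while a fansub that stacks sign translations over dialogue would show a higher `overlap_ratio` than a clean dub caption.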
Tier 1 resolves the unambiguous cases for free:
- One full-text English track → that’s the dubtitle, done.
- A lone SDH/CC track among otherwise sparse/signs tracks → that’s the dubtitle, done.
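
The two rules above can be sketched as a single check on how many full-text tracks remain. This is an illustration only: the function name is hypothetical, and treating the Likely CC and Candidate labels as "full-text" is an assumption based on the labels shown in the track panel.

```python
from __future__ import annotations

# Assumption: these two labels mark dense dialogue tracks; the rest are signs or sparse.
FULL_TEXT_LABELS = {"Likely CC", "Candidate"}


def resolve_tier1(labels: dict[int, str]) -> tuple[int | None, str]:
    """Return (track id, reason); the id is None when Tier 1 cannot decide."""
    full_text = [tid for tid, label in labels.items() if label in FULL_TEXT_LABELS]
    if len(full_text) == 1:
        tid = full_text[0]
        if labels[tid] == "Likely CC":
            return tid, "lone SDH/CC track"
        return tid, "only full-text track"
    if not full_text:
        return None, "no full-text English track"
    return None, "ambiguous: needs Tier 2 audio match"
```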
Tier 2 — audio match (only when ambiguous)
When two or more full-text English tracks remain (e.g. a fansub and a dubtitle, both untitled), Sublarr listens to the dub to break the tie:
- Samples the English dub audio at three points of the file (~12 % / 50 % / 85 %, avoiding the opening and ending themes).
- Transcribes those windows with Whisper.
- Scores each subtitle’s wording in those windows against the transcript.
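
As a rough illustration of the sampling step, the sketch below turns the three percentages into time windows. The 30-second window length and the choice to centre each window on its percentage are assumptions; the source only states the approximate positions.

```python
from __future__ import annotations


def sample_windows(
    duration_s: float,
    window_s: float = 30.0,  # assumed window length, not documented
    fractions: tuple[float, ...] = (0.12, 0.50, 0.85),
) -> list[tuple[float, float]]:
    """Return (start, end) windows in seconds, centred on each fraction of the runtime."""
    windows = []
    for frac in fractions:
        centre = duration_s * frac
        start = max(0.0, centre - window_s / 2)
        end = min(duration_s, start + window_s)
        windows.append((start, end))
    return windows
```

For a 24-minute episode (1440 s) this yields windows starting at roughly 2:38, 11:45 and 20:09, which keeps the samples clear of a typical opening and ending.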
A dubtitle matches the dub transcript strongly (typically 0.7–0.9); a fansub doesn’t (0.2–0.4). The highest match above the threshold is flagged as the dubtitle. The separation is wide and stable even with imperfect transcription, because scoring ignores word order and punctuation and focuses on whether the spoken words are present.
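
One way to get a score that ignores word order and punctuation is a bag-of-words overlap: the share of transcribed words that also occur in the subtitle text for the same window. The sketch below is an assumption about the general technique, not Sublarr's actual scoring function, and `match_score` and `track_score` are hypothetical names.

```python
import re
from collections import Counter

_WORD = re.compile(r"[a-z0-9']+")


def tokens(text: str) -> Counter:
    """Lowercase the text and count its words, dropping punctuation."""
    return Counter(_WORD.findall(text.lower().replace("’", "'")))


def match_score(subtitle_text: str, transcript: str) -> float:
    """Fraction of transcript words that also appear in the subtitle (order ignored)."""
    spoken = tokens(transcript)
    if not spoken:
        return 0.0
    shown = tokens(subtitle_text)
    hits = sum((spoken & shown).values())  # multiset intersection
    return hits / sum(spoken.values())


def track_score(windows: list[tuple[str, str]]) -> float:
    """Average match_score over (subtitle_text, transcript) pairs, one per window."""
    if not windows:
        return 0.0
    return sum(match_score(sub, heard) for sub, heard in windows) / len(windows)
```

With the transcript "you came back I can't believe it", the subtitle "I can't believe it, you came back!" scores 1.0 despite the different order, while "It cannot be... you returned!" shares only two of the seven spoken words.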
Using it
- Open a series in the Library and expand an episode to its track panel.
- When a file has more than one subtitle track, a Detect dubtitle button appears.
- Click it. Sublarr classifies every English track and, if needed, samples the dub audio.
The results are shown inline:
- The suggested dubtitle gets a green Dubtitle badge (with its audio match score, when Tier 2 ran).
- Every other English track shows its Tier-1 label — Likely CC, Candidate, Signs/Forced or Sparse — and its score.
- A summary line explains how the decision was reached.
Nothing is applied automatically. Once you’ve confirmed which track is the dubtitle you can act on it with the existing track tools — extract it as a sidecar, or strip the unwanted tracks via foreign-track removal.
Settings
Dubtitle Detection is off by default. On-demand detection from the track panel always works regardless of the toggle; the setting only governs automatic detection.
| Setting | Default | Values | Effect |
|---|---|---|---|
| Dubtitle detection | off | toggle | Master switch for automatic dubtitle handling. Manual detection in the track panel is always available. |
| Minimum match score | 0.55 | 0.0–1.0 | Tier-2 acceptance threshold. A track must score at least this against the dub audio to be flagged. Lower it if real dubtitles are being missed; raise it to be stricter. |
| Minimum margin | 0.15 | 0.0–1.0 | Unattended only. The winner must beat the runner-up track by at least this much, so two near-tied tracks are never auto-flagged on a coin-flip. The on-demand button still surfaces the best guess. |
| Auto cue floor | 70 | integer | Unattended only. A track needs at least this many cues to be considered during the scheduled sweep — stricter than the on-demand path so a sparse track can’t fluke a match. |
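
How the three numeric settings interact can be sketched as follows. The defaults come from the table; the `DetectionSettings` and `choose_dubtitle` names, and the exact order of the checks, are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DetectionSettings:
    min_score: float = 0.55   # "Minimum match score"
    min_margin: float = 0.15  # "Minimum margin" (unattended only)
    auto_cue_floor: int = 70  # "Auto cue floor" (unattended only)


def choose_dubtitle(
    scores: dict[int, float],
    cue_counts: dict[int, int],
    settings: DetectionSettings,
    unattended: bool,
) -> int | None:
    """Return the track id to flag, or None if no track qualifies."""
    candidates = dict(scores)
    if unattended:
        # The scheduled sweep ignores tracks with too few cues.
        candidates = {
            tid: score
            for tid, score in candidates.items()
            if cue_counts.get(tid, 0) >= settings.auto_cue_floor
        }
    if not candidates:
        return None

    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    best_id, best = ranked[0]
    if best < settings.min_score:
        return None
    # Near-ties are rejected only when nobody is there to confirm the result.
    if unattended and len(ranked) > 1 and best - ranked[1][1] < settings.min_margin:
        return None
    return best_id
```

With scores of 0.62 and 0.58, the on-demand path would still surface the first track as the best guess, while the unattended path would flag nothing because the margin is only 0.04.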
Automatic detection
When Dubtitle detection is on, a nightly scheduled sweep (Settings →
System → Scheduler → dubtitle_scan) classifies the dubtitle on library files
that have multiple embedded English tracks and records the result — flag
only, no files are changed. The track panel then shows the dubtitle badge on
open without you clicking Detect. The sweep is bounded per run, skips files
it has already classified (re-running only when the file changes), and applies
the stricter minimum margin and auto cue floor guardrails above. With the
toggle off, the job does nothing.
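
The "bounded per run" and "skip unless the file changed" behaviour could be implemented along these lines. This is a sketch under stated assumptions: change detection via file size and modification time, and a per-run cap of 50 files, are illustrative choices and are not documented. `plan_sweep` is a hypothetical name.

```python
from __future__ import annotations

import os
from typing import Callable

Fingerprint = tuple[int, int]


def fingerprint(path: str) -> Fingerprint:
    """Cheap change marker: (size in bytes, mtime in nanoseconds)."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def plan_sweep(
    paths: list[str],
    seen: dict[str, Fingerprint],
    max_files: int = 50,  # assumed per-run bound
    fingerprint_fn: Callable[[str], Fingerprint] = fingerprint,
) -> list[str]:
    """Pick the files to classify this run: new or changed ones, up to max_files."""
    todo: list[str] = []
    for path in paths:
        if len(todo) >= max_files:
            break
        if seen.get(path) != fingerprint_fn(path):
            todo.append(path)
    return todo
```

After classifying a file, the caller would store its fingerprint in `seen`, so the next sweep skips it until the file is replaced or remuxed.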
Fetch, then verify
The same audio-matching powers a verification step for downloaded subtitles. Because external providers don’t reliably tag dubtitles — they usually label them just “English” or “SDH” — fetching by tag alone is a guess. Instead, a fetched English subtitle can be scored against the dub audio and rejected if it doesn’t match, so what lands on disk is verified to fit the dub rather than trusted blindly.
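
In outline, the verification step is an accept-or-reject gate around whatever scoring function the audio match uses. The sketch below is hypothetical: `verify_fetched` is an illustrative name, the scorer is passed in as a parameter, and reusing the 0.55 minimum match score as the rejection threshold is an assumption.

```python
from __future__ import annotations

from typing import Callable


def verify_fetched(
    subtitle_text: str,
    transcript: str,
    score_fn: Callable[[str, str], float],
    min_score: float = 0.55,  # assumed to mirror "Minimum match score"
) -> tuple[bool, float]:
    """Return (accepted, score) for one downloaded subtitle."""
    score = score_fn(subtitle_text, transcript)
    return score >= min_score, score
```

A caller would write the subtitle to disk only when the first element is true, and could log the score for rejected downloads.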
Limitations
- Scores aren’t perfect — always confirm before stripping tracks. A sentence the fansub and dub happen to share can nudge a single window; the three-window average smooths this out but isn’t infallible.
- Tier 2 requires an English dub audio track. A subs-only release with two fansub-style English tracks can’t be told apart by audio — only by the Tier-1 heuristics.
- Image-based subtitle tracks (PGS, VobSub) can’t be text-matched and are skipped.
- If no track clears the threshold, Sublarr reports “likely no dubtitle present” and points you at a full Whisper transcription of the dub as the fallback.
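
For the image-based limitation above, the skip can be expressed as a codec filter. The sketch assumes track metadata shaped like ffprobe stream entries, where PGS is reported as `hdmv_pgs_subtitle` and VobSub as `dvd_subtitle`; whether Sublarr reads codecs this way is an assumption, and `text_matchable` is a hypothetical name.

```python
# ffprobe codec names for the two image-based formats named above.
IMAGE_SUBTITLE_CODECS = {"hdmv_pgs_subtitle", "dvd_subtitle"}  # PGS, VobSub


def text_matchable(tracks: list[dict]) -> list[dict]:
    """Keep only subtitle tracks whose cues are text and can be scored."""
    return [t for t in tracks if t.get("codec_name") not in IMAGE_SUBTITLE_CODECS]
```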
See also
- Stream Management — extract embedded tracks and strip the ones you don’t want.
- Transcription — configure the Whisper backend that powers Tier 2.
- Language Profiles — which languages Sublarr keeps and targets.