# Performance Tuning

The defaults are tuned for libraries up to a few thousand items running on modest hardware. When you have a 10k+ library, slow storage, or aggressive translation goals, this page is the order in which to turn the knobs.

Before tuning anything, run Settings → Diagnostics → Run full self-test. The metrics panel at the bottom shows where the actual bottleneck is — guessing wastes time.

| Symptom | Likely bottleneck |
| --- | --- |
| Library scan slow | Storage read I/O or metadata API rate limits. |
| Wanted scan slow | Provider rate limits or per-provider concurrency too low. |
| Translation slow | LLM backend throughput, not Sublarr. |
| UI sluggish | Database query times — see the Database section below. |
| High memory | Redis cache + translation memory buffers. |

For large libraries (10k+ items), the default of 4 metadata workers under Settings → Automation → Search & Scan is conservative. Tune:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| Min metadata workers | 4 | 8–16 | Fast network + capable host. |
| Yield (ms) | 0 | 5–50 | Slow storage (NFS, SMB) — I/O saturation symptoms. |
| Min file size (MB) | 100 | 200+ | Lots of trailers / clips polluting scan results. |
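
To see what these two knobs trade off, here is a minimal sketch of a bounded scan-worker pool that pauses between filesystem reads. It is not Sublarr’s actual scanner; the function names and queue layout are invented for illustration.

```python
import asyncio
from pathlib import Path

async def scan_worker(queue: asyncio.Queue, yield_ms: int) -> None:
    while True:
        path: Path = await queue.get()
        try:
            path.stat()          # the metadata read that hits storage
        except OSError:
            pass                 # missing/unreadable file: skip it
        queue.task_done()
        if yield_ms:
            # A short pause between I/O calls keeps slow NFS/SMB mounts
            # responsive instead of hammering them with back-to-back stats.
            await asyncio.sleep(yield_ms / 1000)

async def scan(paths: list[Path], workers: int, yield_ms: int) -> None:
    queue: asyncio.Queue = asyncio.Queue()
    for p in paths:
        queue.put_nowait(p)
    tasks = [asyncio.create_task(scan_worker(queue, yield_ms))
             for _ in range(workers)]
    await queue.join()           # every queued path has been processed
    for task in tasks:
        task.cancel()            # workers loop forever; stop them now

asyncio.run(scan([Path("/media/example.mkv")], workers=8, yield_ms=10))
```

More workers drain the queue faster; a non-zero yield deliberately slows each worker down so the storage backend gets breathing room.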

Provider rate limits, not Sublarr, are usually the bottleneck. Set these under Settings → Automation:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| Max concurrent provider searches | 4 | 8–16 | Many providers + a capable network. |
| Per-provider concurrency | 2 | 1 | A provider rejects bursts — symptom: 429s in Logs. |
| Provider delay (ms) | 0 | 200–1000 | Same symptom as above; applies per provider. |
| Scan interval (h) | 6 | 2–4 (smaller libraries) / 12–24 (huge libraries) | Balance freshness vs. provider load. |
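
Conceptually, these knobs behave like stacked concurrency limits plus a spacing delay. A minimal sketch under assumed names (this is not Sublarr’s real internals):

```python
import asyncio

GLOBAL_LIMIT = asyncio.Semaphore(8)   # "Max concurrent provider searches"
PER_PROVIDER = {"opensubtitles": asyncio.Semaphore(1)}  # "Per-provider concurrency"
DELAY_MS = 500                        # "Provider delay (ms)"

async def search(provider: str, query: str) -> str:
    # Both limits must be free before the request goes out.
    async with GLOBAL_LIMIT, PER_PROVIDER[provider]:
        await asyncio.sleep(DELAY_MS / 1000)  # spaces out bursts that trigger 429s
        return f"results for {query!r} from {provider}"  # stand-in for the HTTP call

async def main() -> None:
    print(await asyncio.gather(
        search("opensubtitles", "dune 2021"),
        search("opensubtitles", "dune part two 2024"),
    ))

asyncio.run(main())
```

With per-provider concurrency at 1 and a 500 ms delay, the two searches above run strictly one after the other, which is exactly the behavior that calms a 429-happy provider.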

Throughput is mostly determined by your LLM backend, not by Sublarr. The relevant Sublarr knobs live under Settings → Translation:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| Concurrent translations | 2 | 4–8 | Fast backend (DeepL, Gemini Flash, local Ollama on GPU). |
| Batch size (cues per request) | 15 | 25–40 (capable models) / 8–10 (smaller models) | Bigger batch = fewer round trips, but bigger context. |
| Request timeout (s) | 90 | 180–300 | Slow backend (CPU Ollama, large model). |
| Backoff base (s) | 5 | 2 | Backend recovers quickly from errors. |
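
The batch-size trade-off is plain arithmetic. A sketch with a hypothetical chunking helper, counting round trips for a typical episode:

```python
def chunk(cues: list[str], batch_size: int) -> list[list[str]]:
    """Split subtitle cues into request-sized batches."""
    return [cues[i:i + batch_size] for i in range(0, len(cues), batch_size)]

cues = [f"cue {n}" for n in range(600)]  # an episode is a few hundred cues

for batch_size in (8, 15, 40):
    # Bigger batches mean fewer round trips, but each request carries a
    # larger prompt, which is slower per request and riskier on small models.
    print(f"batch_size={batch_size:>2} -> {len(chunk(cues, batch_size))} requests")
```

At the default of 15 cues per request, a 600-cue episode takes 40 round trips; at 40 cues it takes 15, but each request asks far more of the model at once.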

For Ollama specifically:

| Hardware | Recommended batch size | Concurrent translations |
| --- | --- | --- |
| CPU only | 8 | 1 |
| Low-end GPU (8 GB) | 10 | 1–2 |
| Mid-range GPU (16 GB) | 15–20 | 2–3 |
| High-end GPU (24 GB+) | 25–40 | 4–6 |

For libraries above 5k items, switch from SQLite to PostgreSQL — see PostgreSQL Setup. The migration takes minutes; you don’t have to plan downtime around it.

If you’re already on PostgreSQL, check connection-pool sizing under Settings → System → Database — see the PostgreSQL Setup guide for what each field does.
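
As a general illustration of what connection-pool fields usually mean (shown with SQLAlchemy, which may not match Sublarr’s actual stack; the DSN is made up):

```python
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://sublarr:secret@db:5432/sublarr",  # hypothetical DSN
    pool_size=5,        # connections held open permanently
    max_overflow=10,    # extra connections allowed under burst load
    pool_timeout=30,    # seconds to wait for a free connection before erroring
    pool_recycle=1800,  # drop and replace connections older than 30 min
)
```

Whatever the exact field names, the trade-off is the same: a bigger pool handles more concurrent queries but holds more open connections against PostgreSQL’s max_connections.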

Redis is optional, but when configured via SUBLARR_REDIS_URL it takes on two roles:

| Role | Effect when enabled |
| --- | --- |
| Provider cache | Faster cache lookups that survive container restarts. Sublarr’s in-memory cache works without Redis but resets on restart. |
| Job queue backend | Durable across restarts; jobs that were in-flight before a crash resume cleanly. |

Each role has its own toggle:

| Toggle | Where |
| --- | --- |
| Redis cache enabled | Settings → System |
| Redis queue enabled | Settings → System |

If you’ve configured SUBLARR_REDIS_URL but disabled both toggles, Redis is unused. Enable the toggles to actually benefit.
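
A minimal sketch of the provider-cache role, assuming the redis-py client; the key name and TTL are illustrative, not Sublarr’s actual cache schema:

```python
import os

import redis

r = redis.Redis.from_url(os.environ["SUBLARR_REDIS_URL"])

# Cache a provider search result for an hour. Unlike an in-process dict,
# this entry survives a container restart.
r.setex("provider:opensubtitles:dune-2021", 3600, b"cached search results")
print(r.get("provider:opensubtitles:dune-2021"))
```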

For disk usage:

| Symptom | Fix |
| --- | --- |
| Trash filling fast | Lower Settings → Subtitles → Stream Management → MKV / Subtitle backup retention. |
| Cache too big | Settings → Providers → Cache TTL — lower it, or click Clear all caches. |
| Audio cache too big | Settings → System → Sync Engines → Cache size limit (GB). |
| Translation memory too big | Settings → Translation → Cost & Memory → Memory tab → Reset. |

Sync engines are CPU-heavy and sometimes the slowest single operation. Tune Settings → System → Sync Engines:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| ffsubsync max offset (s) | 60 | 30 | You’re confident your subs drift at most 30 s. Faster. |
| alass split penalty | 7 | 15+ | You don’t want a resync split mid-file. Faster. |
| Sanity threshold (s) | 2.5 | 5.0 | Heavily drifted source material — reject fewer engine outputs. |
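
Sublarr invokes these engines for you, but the two knobs map onto the engines’ own CLI flags. A rough sketch of the equivalent manual invocations, assuming the stock ffsubsync and alass CLIs and made-up file paths:

```python
import subprocess

# ffsubsync: a 30 s search window instead of the 60 s default is faster.
subprocess.run(
    ["ffsubsync", "movie.mkv", "-i", "movie.srt", "-o", "movie.synced.srt",
     "--max-offset-seconds", "30"],
    check=True,
)

# alass: a split penalty above the default 7 discourages mid-file splits.
subprocess.run(
    ["alass", "movie.mkv", "movie.srt", "movie.synced.srt",
     "--split-penalty", "15"],
    check=True,
)
```
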
For network-level symptoms:

| Symptom | Fix |
| --- | --- |
| Provider calls slow while the LAN looks fine | Check Settings → Diagnostics → External — internet round-trip latency. |
| Provider calls intermittent | Configure an outbound HTTP proxy via the standard HTTP_PROXY/HTTPS_PROXY env vars. |
| TLS handshakes slow | Increase Settings → Providers → Search timeout (s) so handshakes don’t time out before the request runs. |
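
The proxy variables work because most HTTP stacks, Python’s requests included, pick them up from the environment automatically. A quick sketch with a made-up proxy address:

```python
import os

import requests

os.environ["HTTPS_PROXY"] = "http://proxy.lan:3128"  # hypothetical proxy

# requests routes this call through the proxy without any explicit config.
print(requests.get("https://example.com", timeout=10).status_code)
```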

Sublarr’s resident memory grows with library size + translation memory + Whisper model. Rough budget:

| Library size | Whisper medium | Whisper large-v3 | Translation memory |
| --- | --- | --- | --- |
| 5k items | +600 MB | +2.4 GB | +30 MB |
| 20k items | same | same | +120 MB |
| 100k items | same | same | +600 MB |

Plus base Python + Flask process: ~300 MB.
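
As a back-of-envelope check, the table works out to roughly base process + Whisper model + about 6 MB of translation memory per 1k items. A small sketch using those quoted figures (the helper name is ours, not Sublarr’s):

```python
BASE_MB = 300                                  # Python + Flask process
WHISPER_MB = {"medium": 600, "large-v3": 2400}
TM_MB_PER_1K_ITEMS = 6                         # from the table above

def estimate_rss_mb(items: int, whisper_model: str) -> int:
    """Rough resident-memory budget in MB; hypothetical helper."""
    return BASE_MB + WHISPER_MB[whisper_model] + (items // 1_000) * TM_MB_PER_1K_ITEMS

print(estimate_rss_mb(20_000, "medium"))     # ~1020 MB
print(estimate_rss_mb(100_000, "large-v3"))  # ~3300 MB
```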

If you’re running constrained:

  1. Use Whisper base instead of medium (-400 MB, modest accuracy hit).
  2. Disable translation if you don’t use it (reclaims TM + Whisper memory if the Whisper backend is unused).
  3. Disable Redis cache mode (small saving; mostly relevant on tiny VMs).

The Diagnostics page is the source of truth — its metrics panel shows where time and bytes actually go. Tune in response to data, not in response to feeling. If you can’t identify the bottleneck, open a GitHub issue with a Diagnostics export and the maintainers will help.