# Performance Tuning

The defaults are tuned for libraries up to a few thousand items running on modest hardware. When you have a 10k+ library, slow storage, or aggressive translation goals, this page is the order in which to turn the knobs.

Before tuning anything, run Settings → Diagnostics → Run full self-test. The metrics panel at the bottom shows where the actual bottleneck is — guessing wastes time.

| Symptom | Likely bottleneck |
| --- | --- |
| Library scan slow | Storage read I/O or metadata API rate limits. |
| Wanted scan slow | Provider rate limits or per-provider concurrency too low. |
| Translation slow | LLM backend throughput, not Sublarr. |
| UI sluggish | Database query times — see the Database section below. |
| High memory | Redis cache + translation memory buffers. |

For large libraries (10k+ items), the default of 4 metadata workers under Settings → Automation → Search & Scan is conservative. Tune:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| Min metadata workers | 4 | 8–16 | Fast network + capable host. |
| Yield (ms) | 0 | 5–50 | Slow storage (NFS, SMB) — I/O saturation symptoms. |
| Min file size (MB) | 100 | 200+ | Lots of trailers / clips polluting scan results. |
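
To see what these two knobs trade off, here is a minimal sketch of a bounded scan-worker pool that pauses between filesystem reads. It is not Sublarr’s actual scanner; the function names and queue layout are invented for illustration.

```python
import asyncio
from pathlib import Path

async def scan_worker(queue: asyncio.Queue, yield_ms: int) -> None:
    while True:
        path: Path = await queue.get()
        try:
            path.stat()          # the metadata read that hits storage
        except OSError:
            pass                 # missing/unreadable file: skip it
        queue.task_done()
        if yield_ms:
            # A short pause between I/O calls keeps slow NFS/SMB mounts
            # responsive instead of hammering them with back-to-back stats.
            await asyncio.sleep(yield_ms / 1000)

async def scan(paths: list[Path], workers: int, yield_ms: int) -> None:
    queue: asyncio.Queue = asyncio.Queue()
    for p in paths:
        queue.put_nowait(p)
    tasks = [asyncio.create_task(scan_worker(queue, yield_ms))
             for _ in range(workers)]
    await queue.join()           # every queued path has been processed
    for task in tasks:
        task.cancel()            # workers loop forever; stop them now

asyncio.run(scan([Path("/media/example.mkv")], workers=8, yield_ms=10))
```

More workers drain the queue faster; a non-zero yield deliberately slows each worker down so the storage backend gets breathing room.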

Provider rate limits, not Sublarr, are usually the bottleneck. Set these under Settings → Automation:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| Max concurrent provider searches | 4 | 8–16 | Many providers + a capable network. |
| Per-provider concurrency | 2 | 1 | A provider rejects bursts — symptom: 429s in Logs. |
| Provider delay (ms) | 0 | 200–1000 | Same symptom as above; applies per provider. |
| Scan interval (h) | 6 | 2–4 (smaller libraries) / 12–24 (huge libraries) | Balance freshness vs. provider load. |
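
Conceptually, these knobs behave like stacked concurrency limits plus a spacing delay. A minimal sketch under assumed names (this is not Sublarr’s real internals):

```python
import asyncio

GLOBAL_LIMIT = asyncio.Semaphore(8)   # "Max concurrent provider searches"
PER_PROVIDER = {"opensubtitles": asyncio.Semaphore(1)}  # "Per-provider concurrency"
DELAY_MS = 500                        # "Provider delay (ms)"

async def search(provider: str, query: str) -> str:
    # Both limits must be free before the request goes out.
    async with GLOBAL_LIMIT, PER_PROVIDER[provider]:
        await asyncio.sleep(DELAY_MS / 1000)  # spaces out bursts that trigger 429s
        return f"results for {query!r} from {provider}"  # stand-in for the HTTP call

async def main() -> None:
    print(await asyncio.gather(
        search("opensubtitles", "dune 2021"),
        search("opensubtitles", "dune part two 2024"),
    ))

asyncio.run(main())
```

With per-provider concurrency at 1 and a 500 ms delay, the two searches above run strictly one after the other, which is exactly the behavior that calms a 429-happy provider.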

Throughput is mostly determined by your LLM backend, not by Sublarr. The relevant Sublarr knobs live under Settings → Translation:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| Concurrent translations | 2 | 4–8 | Fast backend (DeepL, Gemini Flash, local Ollama on GPU). |
| Batch size (cues per request) | 15 | 25–40 (capable models) / 8–10 (smaller models) | Bigger batch = fewer round trips, but bigger context. |
| Request timeout (s) | 90 | 180–300 | Slow backend (CPU Ollama, large model). |
| Backoff base (s) | 5 | 2 | Backend recovers quickly from errors. |
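
The batch-size trade-off is plain arithmetic. A sketch with a hypothetical chunking helper, counting round trips for a typical episode:

```python
def chunk(cues: list[str], batch_size: int) -> list[list[str]]:
    """Split subtitle cues into request-sized batches."""
    return [cues[i:i + batch_size] for i in range(0, len(cues), batch_size)]

cues = [f"cue {n}" for n in range(600)]  # an episode is a few hundred cues

for batch_size in (8, 15, 40):
    # Bigger batches mean fewer round trips, but each request carries a
    # larger prompt, which is slower per request and riskier on small models.
    print(f"batch_size={batch_size:>2} -> {len(chunk(cues, batch_size))} requests")
```

At the default of 15 cues per request, a 600-cue episode takes 40 round trips; at 40 cues it takes 15, but each request asks far more of the model at once.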

For Ollama specifically:

| Hardware | Recommended batch size | Concurrent translations |
| --- | --- | --- |
| CPU only | 8 | 1 |
| Low-end GPU (8 GB) | 10 | 1–2 |
| Mid-range GPU (16 GB) | 15–20 | 2–3 |
| High-end GPU (24 GB+) | 25–40 | 4–6 |

For libraries above 5k items, switch from SQLite to PostgreSQL — see PostgreSQL Setup. The migration takes minutes; you don’t have to plan downtime around it.

If you’re already on PostgreSQL, check connection-pool sizing under Settings → System → Database — see the PostgreSQL Setup guide for what each field does.
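
As a general illustration of what connection-pool fields usually mean (shown with SQLAlchemy, which may not match Sublarr’s actual stack; the DSN is made up):

```python
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://sublarr:secret@db:5432/sublarr",  # hypothetical DSN
    pool_size=5,        # connections held open permanently
    max_overflow=10,    # extra connections allowed under burst load
    pool_timeout=30,    # seconds to wait for a free connection before erroring
    pool_recycle=1800,  # drop and replace connections older than 30 min
)
```

Whatever the exact field names, the trade-off is the same: a bigger pool handles more concurrent queries but holds more open connections against PostgreSQL’s max_connections.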

Redis is optional, but when configured via SUBLARR_REDIS_URL it takes on two roles:

| Role | Effect when enabled |
| --- | --- |
| Provider cache | Faster cache lookups that survive container restarts. Sublarr’s in-memory cache works without Redis but resets on restart. |
| Job queue backend | Durable across restarts; jobs that were in-flight before a crash resume cleanly. |

Each role has its own toggle:

| Toggle | Where |
| --- | --- |
| Redis cache enabled | Settings → System |
| Redis queue enabled | Settings → System |

If you’ve configured SUBLARR_REDIS_URL but disabled both toggles, Redis is unused. Enable the toggles to actually benefit.
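
A minimal sketch of the provider-cache role, assuming the redis-py client; the key name and TTL are illustrative, not Sublarr’s actual cache schema:

```python
import os

import redis

r = redis.Redis.from_url(os.environ["SUBLARR_REDIS_URL"])

# Cache a provider search result for an hour. Unlike an in-process dict,
# this entry survives a container restart.
r.setex("provider:opensubtitles:dune-2021", 3600, b"cached search results")
print(r.get("provider:opensubtitles:dune-2021"))
```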

For disk usage:

| Symptom | Fix |
| --- | --- |
| Trash filling fast | Lower Settings → Subtitles → Stream Management → MKV / Subtitle backup retention. |
| Cache too big | Settings → Providers → Cache TTL — lower it, or click Clear all caches. |
| Audio cache too big | Settings → System → Sync Engines → Cache size limit (GB). |
| Translation memory too big | Settings → Translation → Cost & Memory → Memory tab → Reset. |

Sync engines are CPU-heavy and sometimes the slowest single operation. Tune Settings → System → Sync Engines:

| Setting | Default | Tune to | When |
| --- | --- | --- | --- |
| ffsubsync max offset (s) | 60 | 30 | You’re confident your subs drift at most 30 s. Faster. |
| alass split penalty | 7 | 15+ | You don’t want a resync split mid-file. Faster. |
| Sanity threshold (s) | 2.5 | 5.0 | Heavily drifted source material — reject fewer engine outputs. |
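
Sublarr invokes these engines for you, but the two knobs map onto the engines’ own CLI flags. A rough sketch of the equivalent manual invocations, assuming the stock ffsubsync and alass CLIs and made-up file paths:

```python
import subprocess

# ffsubsync: a 30 s search window instead of the 60 s default is faster.
subprocess.run(
    ["ffsubsync", "movie.mkv", "-i", "movie.srt", "-o", "movie.synced.srt",
     "--max-offset-seconds", "30"],
    check=True,
)

# alass: a split penalty above the default 7 discourages mid-file splits.
subprocess.run(
    ["alass", "movie.mkv", "movie.srt", "movie.synced.srt",
     "--split-penalty", "15"],
    check=True,
)
```
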
For network-level symptoms:

| Symptom | Fix |
| --- | --- |
| Provider calls slow while the LAN looks fine | Check Settings → Diagnostics → External — internet round-trip latency. |
| Provider calls intermittent | Configure an outbound HTTP proxy via the standard HTTP_PROXY/HTTPS_PROXY env vars. |
| TLS handshakes slow | Increase Settings → Providers → Search timeout (s) so handshakes don’t time out before the request runs. |
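
The proxy variables work because most HTTP stacks, Python’s requests included, pick them up from the environment automatically. A quick sketch with a made-up proxy address:

```python
import os

import requests

os.environ["HTTPS_PROXY"] = "http://proxy.lan:3128"  # hypothetical proxy

# requests routes this call through the proxy without any explicit config.
print(requests.get("https://example.com", timeout=10).status_code)
```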

Sublarr’s resident memory grows with library size + translation memory + Whisper model. Rough budget:

| Library size | Whisper medium | Whisper large-v3 | Translation memory |
| --- | --- | --- | --- |
| 5k items | +600 MB | +2.4 GB | +30 MB |
| 20k items | same | same | +120 MB |
| 100k items | same | same | +600 MB |

Plus base Python + Flask process: ~300 MB.
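
As a back-of-envelope check, the table works out to roughly base process + Whisper model + about 6 MB of translation memory per 1k items. A small sketch using those quoted figures (the helper name is ours, not Sublarr’s):

```python
BASE_MB = 300                                  # Python + Flask process
WHISPER_MB = {"medium": 600, "large-v3": 2400}
TM_MB_PER_1K_ITEMS = 6                         # from the table above

def estimate_rss_mb(items: int, whisper_model: str) -> int:
    """Rough resident-memory budget in MB; hypothetical helper."""
    return BASE_MB + WHISPER_MB[whisper_model] + (items // 1_000) * TM_MB_PER_1K_ITEMS

print(estimate_rss_mb(20_000, "medium"))     # ~1020 MB
print(estimate_rss_mb(100_000, "large-v3"))  # ~3300 MB
```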

If you’re running constrained:

  1. Use Whisper base instead of medium (-400 MB, modest accuracy hit).
  2. Disable translation if you don’t use it (reclaims TM + Whisper memory if the Whisper backend is unused).
  3. Disable Redis cache mode (small saving; mostly relevant on tiny VMs).

The Diagnostics page is the source of truth — its metrics panel shows where time and bytes actually go. Tune in response to data, not in response to feeling. If you can’t identify the bottleneck, open a GitHub issue with a Diagnostics export and the maintainers will help.