Threshold philosophy

How to think about the cascade of thresholds in pipeline.yaml.

The cascade order

For any given track, the validator walks a cascade to determine the threshold to use:

1. Per-pattern override (longest substring match wins)
2. Per-sound-class override
3. Per-category override
4. Validator default

null at any level disables the check entirely for that scope.
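
The cascade above can be sketched in Python. This is a minimal illustration, not the validator's actual code: the function name resolve_threshold, its arguments, and the default of 4.0 are hypothetical, and only the section keys are taken from the pipeline.yaml examples in this document. The sketch assumes an explicit null is returned as None and stops the walk, while a missing key falls through to the next level.

```python
from typing import Any, Optional

# Sentinel meaning "this level has no entry", distinct from an explicit null.
_MISSING = object()


def resolve_threshold(
    track_name: str,
    sound_class: str,
    category: str,
    config: dict[str, Any],
    default: Optional[float],
) -> Optional[float]:
    """Return the threshold for a track, or None if the check is disabled."""
    # 1. Per-pattern: among patterns found in the track name, the longest wins.
    patterns = config.get("z_thresh_per_pattern") or {}
    matches = [p for p in patterns if p in track_name]
    if matches:
        return patterns[max(matches, key=len)]

    # 2. Per-sound-class, then 3. per-category.
    for section, key in (
        ("z_thresh_per_sound_class", sound_class),
        ("z_thresh_per_category", category),
    ):
        value = (config.get(section) or {}).get(key, _MISSING)
        if value is not _MISSING:
            return value  # may be None, which disables the check

    # 4. Validator default.
    return default


# Example config mirroring the snippets in this document (None stands for null).
config = {
    "z_thresh_per_pattern": {"cat-purr": None, "ambient-pad": 6.0, "hum": None},
    "z_thresh_per_sound_class": {"synth-pure": None, "animal-specific": 5.0},
    "z_thresh_per_category": {"nature": None, "noises": 5.0},
}

# "ambient-pad-hum" contains both "ambient-pad" and "hum"; the longer pattern wins.
print(resolve_threshold("ambient-pad-hum", "pad", "sleep", config, 4.0))
```

Under these assumptions, a track in the nature category with sound class animal-specific resolves to 5.0 rather than null, because the sound-class level is consulted before the category level.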

When to use which

Per-pattern (most specific)

Use when you’ve identified a single track or small group of tracks that has a legitimate quirk:

z_thresh_per_pattern:
  cat-purr: null         # cat purr-roll IS the content
  ambient-pad: 6.0       # this pad has loud sub
  hum: null              # all hum tracks have low-end variance

Per-sound-class

Use when an entire class of sound (regardless of catalog category) needs different tuning:

z_thresh_per_sound_class:
  synth-pure: null       # CLAP doesn't understand pure tones
  animal-specific: 5.0   # animals do unpredictable things

Per-category

Use when an entire catalog section (sleep, frequencies, etc.) needs adjustment:

z_thresh_per_category:
  nature: null           # nature beds have natural transients
  noises: 5.0            # noise has high variance by nature
  affirmations: null     # TTS doesn't match CLAP descriptions

Default

Validator-wide. Should be the strictest sensible value — looser overrides are explicit.

Common patterns

"Disable yamnet for affirmations"

score_threshold_per_category:
  affirmations: 0.95   # effectively disables (no real-world class hits 0.95)
allowed_classes:
  affirmations: [Speech, Narration, Female speech]

Better: keep score_threshold tight and add Speech to allowed_classes instead. A 0.95 threshold hides every class, including ones you would want flagged.
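
To see why the allowlist is the better lever, here is a small Python sketch. It assumes, and this is only an assumption about how the yamnet validator behaves, that a class is reported when its score reaches score_threshold and it is not in allowed_classes. The function name unexpected_classes, the sample scores, and the 0.3 threshold are hypothetical.

```python
def unexpected_classes(
    scores: dict[str, float],
    score_threshold: float,
    allowed: set[str],
) -> list[str]:
    """Return classes that score at or above the threshold and are not allowlisted."""
    return sorted(
        name
        for name, score in scores.items()
        if score >= score_threshold and name not in allowed
    )


# Hypothetical scores for an affirmations track with a dog barking in the background.
scores = {"Speech": 0.82, "Dog": 0.41, "Music": 0.10}
allowed = {"Speech", "Narration", "Female speech"}

# Tight threshold plus allowlist: speech is accepted, the dog is still reported.
print(unexpected_classes(scores, 0.3, allowed))

# Threshold raised to 0.95 with no allowlist: nothing is reported at all.
print(unexpected_classes(scores, 0.95, set()))
```

With the tight threshold the stray class still surfaces; with 0.95 it is lost along with everything else.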

"Loosen spectral-anomaly for nature beds"

z_thresh_per_category:
  nature: 6.0   # nature beds have legitimate bird chirps + thumps

"Disable intent-drift for synth-pure"

z_thresh_per_sound_class:
  synth-pure: null

Anti-patterns

Loosening the default to silence one false positive. Use per-pattern.
Disabling a whole category for one track. Use per-pattern.
Setting different defaults for the same axis on different validators. Use one source of truth — typically the LLM judge — and have spectral / yamnet / CLAP catch the easier cases.

Verifying a threshold change

After editing pipeline.yaml:

1. Use the /validators/<name> page’s “Run on demand” against a track that previously failed.
2. Verify the verdict changes as expected.
3. Re-run against a track you know is good (regression check).
4. Commit.

The PATCH endpoint writes atomically (tmp + rename), so a crash mid-write doesn’t leave a half-written pipeline.yaml.
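
For reference, the tmp + rename technique looks roughly like this in Python. This is a generic sketch of the pattern, not the endpoint's actual implementation; the function name write_atomically is hypothetical.

```python
import os
import tempfile


def write_atomically(path: str, text: str) -> None:
    """Write text to path so readers see either the old file or the new one."""
    # Create the temp file in the target directory so the rename stays
    # on one filesystem, which is what makes it atomic.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".yaml")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # On any failure, remove the temp file and leave the original untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before os.replace runs, the original file is still intact and only a stray temp file remains.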
