---
name: detect-anomalies
description: Find and resolve label disagreements across multiple ML model filters in .data files using `skraak calls detect-anomalies`
---
# Detect and Resolve Anomalies
Compare corresponding segments across ≥2 ML model filters. Flag where models disagree on species+calltype or certainty. Resolve by visual inspection or by nominating one model as authoritative.
## Step 1: Run detect-anomalies
Get from user: **base folder**, **model filters** (min 2), optional **species scope**.
```bash
./skraak calls detect-anomalies \
--folder <path> \
--model opensoundscape-kiwi-1.0 \
--model opensoundscape-kiwi-1.2 \
--model opensoundscape-kiwi-1.5 \
[--species Kiwi] \
> /tmp/anomalies.json
```
Stderr reports: files examined, files with all models, anomaly counts by type.
Stdout: full JSON with every anomaly, file path, segment times per model, and labels.
Parse anomaly count and show a summary to the user before proceeding.
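
One way to produce that summary from the saved JSON is sketched below. It assumes the top-level `.anomalies` array and the per-entry `.type` field that the Step 2 jq expression uses; the function name `summarize_anomalies` is hypothetical.

```bash
# Sketch: count anomalies per type from the detect-anomalies JSON.
# Assumes `.anomalies[]` entries carry a `.type` field (as used in Step 2).
summarize_anomalies() {  # usage: summarize_anomalies <anomalies.json>
  jq -r '.anomalies | group_by(.type)[] | "\(.[0].type)\t\(length)"' "$1"
}
```

Calling `summarize_anomalies /tmp/anomalies.json` prints one tab-separated `type count` line per anomaly type.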
## Step 2: Parse and present anomalies
Extract the anomaly list:
```bash
jq -r '.anomalies[] | "\(.file) — \(.type) — \(.segments | map("\(.model)=\(.species)/\(.calltype)(\(.start)-\(.end))") | join(", "))"' \
/tmp/anomalies.json | head -40
```
Group by type (label_mismatch vs certainty_mismatch). Present a numbered list to the user like:
```
1. D09/…/20251108_234500.WAV.data — label_mismatch
1.0=Kiwi/Male(702.5-735) 1.2=Kiwi/Male(702.5-730) 1.5=Kiwi/Duet(712.5-732.5)
```
## Step 3: Choose resolution strategy
Ask the user which approach:
**A) Nominate an authoritative model** (e.g. "trust 1.2 — I reviewed it manually")
→ Propagate that model's labels to the others. Proceed to Step 5.
**B) Visual inspection**
→ Generate clips and examine spectrograms. Proceed to Step 4.
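
For option A, the label corrections can be derived mechanically from the anomaly JSON. The sketch below is one possible approach, not part of the tool: it assumes the `.file`, `.segments[].model`, `.species`, `.calltype`, `.start`, and `.end` fields seen in the Step 2 jq expression, and the function name `propagate_from` is hypothetical. It covers label differences only, since the name of the certainty field is not shown here.

```bash
# Sketch: list corrections needed so every other model matches the authoritative one.
# Output rows (tab-separated): file, model, floor(start)-ceil(end), from, to.
propagate_from() {  # usage: propagate_from <authoritative_model> <anomalies.json>
  jq -r --arg auth "$1" '
    .anomalies[]
    | .file as $file
    | (.segments[] | select(.model == $auth)) as $a      # authoritative segment
    | .segments[]
    | select(.model != $auth)                            # never touch the authoritative model
    | select(.species != $a.species or .calltype != $a.calltype)
    | [$file, .model, "\(.start | floor)-\(.end | ceil)",
       "\(.species)/\(.calltype)", "\($a.species)/\($a.calltype)"]
    | @tsv
  ' "$2"
}
```

The segment column uses the times of the model being changed, matching the Step 5 rule.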
## Step 4: Visual inspection (optional)
Generate clips for anomalous segments only. Use the widest time range across models for each anomaly:
```bash
./skraak calls clip --folder <base_folder> --prefix <prefix> \
--output /tmp/anomaly_clips/ \
--filter <any_model> --species <species> \
--size 448 --color \
[--night --lat <float> --lng <float>]
```
Read each PNG. For each anomaly determine the correct label. Produce a verdict table:
```
# | File | Time | Models say | Verdict
1 | D09/20251108_234500 | 702-735 | 1.0=Male, 1.2=Male, 1.5=Duet | Male
2 | D09/20251126_230000 | 880-895 | 1.0=Male, 1.2=Male, 1.5=Noise | Male
```
Skip segments that are too faint or short to classify confidently.
**Present the table and ask user to approve before making any changes.**
## Step 5: Build correction list
For each anomaly where corrections are needed, identify which model(s) differ from the verdict and list the changes:
```
# | File | Model | Segment | From | To
1 | D09/20251108_234500 | 1.5 | 712-733 | Duet | Male
2 | D09/20251126_230000 | 1.5 | 880-895 | Noise | Male
```
**Segment times**: use `floor(start)-ceil(end)` of the *model being changed* (not the anchor model).
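
The rounding can be sketched as a small helper. `seg_range` is a hypothetical name, and the sketch assumes non-negative times in seconds:

```bash
# Sketch: turn a model's fractional segment times into the integer
# floor(start)-ceil(end) form expected by `calls modify --segment`.
seg_range() {  # usage: seg_range <start_seconds> <end_seconds>
  awk -v s="$1" -v e="$2" 'BEGIN {
    lo = int(s)                                # int() truncates, i.e. floor for s >= 0
    hi = (e == int(e)) ? int(e) : int(e) + 1   # ceil for e >= 0
    printf "%d-%d\n", lo, hi
  }'
}
```

For example, `seg_range 712.5 732.5` prints `712-733`, the segment used in row 1 of the correction list.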
**Present the correction list to the user and confirm before executing.**
## Step 6: Execute corrections
```bash
# --species takes Species+Calltype, e.g. Kiwi+Male, Kiwi+Duet, or "Don't Know"
./skraak calls modify \
--file <full_path_to_data_file> \
--filter <model_being_corrected> \
--segment <floor_start>-<ceil_end> \
--species <Species+Calltype> \
--certainty 90 \
--reviewer <reviewer_name>
```
Parallelize across different files; serialize writes within the same file.
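
One way to get that behaviour is to start one background job per file and run that file's corrections in order inside it. The sketch below is an illustration only: the tab-separated input format, the function name `apply_corrections`, and the `SKRAAK_CMD` override are all hypothetical, and empty fields are not supported because tabs are treated as whitespace by `read`.

```bash
# Sketch: parallel across files, sequential within each file.
# stdin: tab-separated rows of  file, model, segment, species, reviewer.
# SKRAAK_CMD is a hypothetical override; set SKRAAK_CMD=echo for a dry run.
apply_corrections() {
  list=$(mktemp) || return 1
  cat > "$list"
  tab=$(printf '\t')
  cut -f1 "$list" | sort -u | {
    while IFS= read -r file; do
      (
        # All rows for one file run in order inside a single background job.
        awk -F'\t' -v f="$file" '$1 == f' "$list" |
          while IFS="$tab" read -r f model segment species reviewer; do
            ${SKRAAK_CMD:-./skraak} calls modify --file "$f" --filter "$model" \
              --segment "$segment" --species "$species" \
              --certainty 90 --reviewer "$reviewer" </dev/null
          done
      ) &
    done
    wait   # block until every per-file job has finished
  }
  rm -f "$list"
}
```

Running it first with `SKRAAK_CMD=echo` prints the commands that would be executed, which is a convenient last check before any file is modified.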
## Step 7: Verify
Re-run detect-anomalies on the same folder. Anomaly count should drop to zero (or near zero if some were skipped).
```bash
./skraak calls detect-anomalies \
--folder <path> \
--model opensoundscape-kiwi-1.0 \
--model opensoundscape-kiwi-1.2 \
--model opensoundscape-kiwi-1.5 \
2>&1 >/dev/null | head -3   # keep only the stderr summary; discard the JSON on stdout
```
Report final counts to user.
## Rules
- **Never modify** the authoritative model's labels — only correct the others to match.
- **Lonely segments** (a model has no overlapping segment) are silently skipped by the tool — not anomalies.
- **Certainty mismatches**: if labels agree but certainty differs, use the authoritative model's certainty.
- **Don't Know / Noise**: use `--species "Don't Know"` or `--species Noise` with no calltype.
- **Segment matching**: the tool uses `floor(start)` and `ceil(end)` — always pass integer seconds to `calls modify`.
- **Reviewer**: the name of the model or human making the correction (e.g. `Claude`, `haiku-4.5`).
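
The `--species` formatting rule above can be captured in a small helper so that labels without a calltype are never given a trailing `+`. A minimal sketch; `species_arg` is a hypothetical name:

```bash
# Sketch: build the --species value from a species and an optional calltype.
species_arg() {  # usage: species_arg <species> [calltype]
  if [ -n "${2:-}" ]; then
    printf '%s+%s\n' "$1" "$2"   # e.g. Kiwi+Male
  else
    printf '%s\n' "$1"           # e.g. Noise, Don't Know
  fi
}
```

Quote the result when passing it on, for example `--species "$(species_arg "Don't Know")"`, so that the embedded space survives.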
## Many-folder workflow
```bash
FOLDERS=(
"/media/david/Pomona-4/Pomona/D09/2026-04-06"
"/media/david/Pomona-4/Pomona/F10/2026-04-06"
)
for F in "${FOLDERS[@]}"; do
tag=$(echo "$F" | awk -F/ '{print $(NF-1)"_"$NF}')  # site_date, e.g. D09_2026-04-06
./skraak calls detect-anomalies \
--folder "$F" \
--model opensoundscape-kiwi-1.0 \
--model opensoundscape-kiwi-1.2 \
--model opensoundscape-kiwi-1.5 \
> "/tmp/anomalies_${tag}.json"
done
# Tally
jq -s 'map({total:.anomalies_total,lm:.label_mismatches,cm:.certainty_mismatches}) |
reduce .[] as $x ({t:0,lm:0,cm:0};
{t:(.t+$x.total),lm:(.lm+$x.lm),cm:(.cm+$x.cm)})' \
/tmp/anomalies_*.json
```