---
name: data-mapping
description: Build a mapping.json file that translates species and calltype names from .data files to DB labels, using interactive prompts
---
# Data File Species/Calltype Mapping
Build a `mapping.json` that translates `.data` file species and calltype names to database `species.label` and `call_type.label` values. Map-only: never create new species or calltypes in the DB.
## When to Use
When the user needs to map species/calltypes from `.data` files before importing selections. Typically run before `skraak import selections`.
## Workflow
### Step 1: Summarise .data files
```bash
./skraak calls summarise --folder <folder> --brief
```
Parse the JSON output. Extract unique species and per-species calltypes from the `filters` object. Each filter has:
- `species`: map of species name to count
- `calltypes`: map of species name to (calltype name to count)
Collect all unique species across all filters, and for each species collect all unique calltypes.
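The collection step could be sketched in Python as follows. The key names `filters`, `species` and `calltypes` come from the description above. Everything else is an assumption: that `filters` is an object keyed by filter name, that counts should be summed across filters, and the helper name `collect_names`. The sample input is hypothetical, not real `skraak` output.

```python
import json


def collect_names(summary):
    """Merge species and calltype counts across every filter in the summary."""
    species_counts = {}
    calltypes_by_species = {}
    # Assumption: `filters` maps a filter name to an object holding
    # `species` and `calltypes`, as described in Step 1.
    for filt in summary.get("filters", {}).values():
        for name, count in filt.get("species", {}).items():
            species_counts[name] = species_counts.get(name, 0) + count
        for name, calltypes in filt.get("calltypes", {}).items():
            merged = calltypes_by_species.setdefault(name, {})
            for calltype, count in calltypes.items():
                merged[calltype] = merged.get(calltype, 0) + count
    return species_counts, calltypes_by_species


# Hypothetical sample shaped like the description above.
sample = json.loads("""
{
  "filters": {
    "filter-a": {
      "species": {"GSK": 300, "Morepork": 15},
      "calltypes": {"GSK": {"Male": 200, "Female": 100}}
    },
    "filter-b": {
      "species": {"GSK": 42},
      "calltypes": {"GSK": {"Female": 20, "Duet": 22}}
    }
  }
}
""")

species_counts, calltypes_by_species = collect_names(sample)
```

If the same segments can appear under more than one filter, summing would double-count them; in that case only the set of names should be relied on.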
### Step 2: Query DB for available species and calltypes
```bash
./skraak sql --db ./db/skraak.duckdb "SELECT id, label FROM species WHERE active = true"
./skraak sql --db ./db/skraak.duckdb "SELECT ct.label, s.label as species FROM call_type ct JOIN species s ON ct.species_id = s.id WHERE ct.active = true ORDER BY s.label, ct.label"
```
Parse the JSON results. Build a lookup: DB species labels, and per-species DB calltype labels.
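A minimal sketch of the lookup, assuming each query returns a JSON array of row objects keyed by column name (the actual output shape of `skraak sql` is not shown here, so adjust the row access if it differs). The function name `build_db_lookup` and the sample rows are hypothetical.

```python
def build_db_lookup(species_rows, calltype_rows):
    """Return {DB species label: [DB calltype labels]} for active species."""
    lookup = {row["label"]: [] for row in species_rows}
    for row in calltype_rows:
        # The calltype query does not filter on species.active, so calltypes
        # belonging to species absent from the first query are dropped here.
        if row["species"] in lookup:
            lookup[row["species"]].append(row["label"])
    return lookup


# Hypothetical rows in the assumed shape.
species_rows = [
    {"id": "sp1", "label": "Haast Tokoeka"},
    {"id": "sp2", "label": "Morepork"},
]
calltype_rows = [
    {"label": "Female - Solo", "species": "Haast Tokoeka"},
    {"label": "Male - Solo", "species": "Haast Tokoeka"},
    {"label": "Whistle", "species": "Retired Species"},
]

db_lookup = build_db_lookup(species_rows, calltype_rows)
```

A species with an empty list in the lookup has no calltypes defined, which is the case the error-handling rules ask to warn about.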
### Step 3: Interactive mapping
For each unique species found in `.data` files:
1. Show the user the .data species name and its count
2. Use `AskUserQuestion` with DB species labels as options (pick the most likely matches, up to 4 options + "Skip - no match")
3. If the user picks a DB species and that species has calltypes in the .data files:
   - For each .data calltype, use `AskUserQuestion` with that DB species' calltypes as options (up to 4 + "Keep as-is")
   - "Keep as-is" means the .data calltype name already equals the DB calltype label, so no mapping is needed and the calltype is omitted from the `calltypes` map
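One way to "pick the most likely matches" is a small ranking heuristic. The sketch below is an assumption, not part of `skraak`: it tries a case-insensitive exact match, then an initialism match (such as "GSK" against "Great Spotted Kiwi"), then fuzzy matches from the standard library's `difflib`. The function name `shortlist` is hypothetical, and the user still makes the final choice.

```python
import difflib


def shortlist(data_name, db_labels, limit=4):
    """Return up to `limit` DB labels to offer as options, best guesses first."""
    wanted = data_name.lower()
    # 1. Case-insensitive exact matches.
    exact = [label for label in db_labels if label.lower() == wanted]
    # 2. Labels whose word initials spell the .data name.
    initials = [
        label for label in db_labels
        if "".join(word[0] for word in label.split()).lower() == wanted
    ]
    # 3. Fuzzy matches by similarity ratio (case-sensitive).
    fuzzy = difflib.get_close_matches(data_name, db_labels, n=limit, cutoff=0.6)

    ordered = []
    for label in exact + initials + fuzzy:
        if label not in ordered:  # de-duplicate while preserving rank order
            ordered.append(label)
    return ordered[:limit]


labels = ["Haast Tokoeka", "Great Spotted Kiwi", "Morepork"]
gsk_options = shortlist("GSK", labels)
morepork_options = shortlist("morepork", labels)
```

The returned labels would fill the option slots in `AskUserQuestion`, with "Skip - no match" or "Keep as-is" added alongside them.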
### Step 4: Write mapping.json
Save the mapping to `<folder>/mapping_YYYY-MM-DD.json`, using the current date in the filename (e.g. `mapping_2026-03-14.json`):
```json
{
  "Don't Know": {
    "species": "Don't Know"
  },
  "GSK": {
    "species": "Haast Tokoeka",
    "calltypes": {
      "Male": "Male - Solo",
      "Female": "Female - Solo"
    }
  },
  "Morepork": {
    "species": "Morepork"
  }
}
```
Structure rules:
- Top-level keys = .data file species names
- `species` value = DB species label
- `calltypes` map = only present if calltypes exist AND at least one needs remapping
- Each calltype entry: .data calltype name -> DB calltype label
- Omit calltypes that map to themselves (user chose "Keep as-is")
- If a species maps to itself with no calltype remapping, still include it with just `{"species": "SpeciesName"}`
- If the user chose "Skip", warn and omit that species entirely
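The structure rules can be expressed as a short Python sketch. The shape of `choices` (skipped species as `None`, otherwise a pair of DB species label and per-calltype picks) and the names `build_mapping` and `mapping_filename` are assumptions made for illustration.

```python
import datetime
import json


def build_mapping(choices):
    """Turn the user's picks into the mapping structure.

    choices: {data_species: None if skipped,
              else (db_species, {data_calltype: db_calltype})}
    """
    mapping = {}
    for data_species, choice in choices.items():
        if choice is None:
            # "Skip": warn and leave the species out entirely.
            print(f"warning: skipping {data_species!r} (no DB match)")
            continue
        db_species, calltype_choices = choice
        entry = {"species": db_species}
        # "Keep as-is" picks map a name to itself and are omitted.
        remapped = {src: dst for src, dst in calltype_choices.items() if src != dst}
        if remapped:
            entry["calltypes"] = remapped
        mapping[data_species] = entry
    return mapping


def mapping_filename(today):
    """Dated filename, e.g. mapping_2026-03-14.json."""
    return f"mapping_{today.isoformat()}.json"


choices = {
    "Don't Know": ("Don't Know", {}),
    "GSK": ("Haast Tokoeka", {
        "Male": "Male - Solo",
        "Female": "Female - Solo",
        "Duet": "Duet",  # user chose "Keep as-is"
    }),
    "Morepork": ("Morepork", {}),
    "Weka": None,  # user chose "Skip"
}
mapping = build_mapping(choices)
mapping_text = json.dumps(mapping, indent=2, ensure_ascii=False)
```

With these picks, `mapping` has the same content as the example above, and `mapping_text` is what would be written to the file named by `mapping_filename(datetime.date.today())` inside `<folder>`.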
### Step 5: Display summary
Print the final `mapping.json` content to the user as formatted JSON.
## Error Handling
- If a .data species has no reasonable DB match, warn and skip (don't write to mapping)
- If a DB species has no calltypes defined but .data files have calltypes for it, warn the user (the calltypes will be ignored on import)
- If `calls summarise` finds no .data files, abort with a clear message
- Never create new species or calltypes in the DB
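The calltype warning above could be checked before writing the file. This is a sketch under assumed data shapes: `mapping` as in Step 4, `calltypes_by_species` as `{data species: {calltype: count}}`, and `db_lookup` as `{DB species label: [calltype labels]}`. The function name `calltype_warnings` is hypothetical.

```python
def calltype_warnings(mapping, calltypes_by_species, db_lookup):
    """List species whose .data calltypes would be ignored on import."""
    warnings = []
    for data_species, entry in mapping.items():
        db_species = entry["species"]
        has_data_calltypes = bool(calltypes_by_species.get(data_species))
        has_db_calltypes = bool(db_lookup.get(db_species))
        if has_data_calltypes and not has_db_calltypes:
            warnings.append(
                f"{data_species!r} -> {db_species!r}: DB species has no calltypes; "
                "calltypes in .data files will be ignored on import"
            )
    return warnings


# Hypothetical inputs for illustration.
mapping = {
    "GSK": {"species": "Haast Tokoeka"},
    "Morepork": {"species": "Morepork"},
}
calltypes_by_species = {"GSK": {"Male": 200}, "Morepork": {"Call": 15}}
db_lookup = {"Haast Tokoeka": ["Male - Solo"], "Morepork": []}

warnings = calltype_warnings(mapping, calltypes_by_species, db_lookup)
```

The check only reads its inputs; consistent with the map-only rule, nothing here touches the DB.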
## Example Session
```
Found 4 species in .data files: GSK (342), Don't Know (28), Morepork (15), Weka (3)
Mapping species: "GSK" (342 segments)
> Which DB species? [Haast Tokoeka | Great Spotted Kiwi | Stewart Island Tokoeka | Other | Skip]
User picks: Haast Tokoeka
GSK has calltypes: Male (200), Female (120), Duet (22)
> Map calltype "Male"? [Male - Solo | Male - Duet | Keep as-is | Other]
User picks: Male - Solo
...
Mapping species: "Don't Know" (28 segments)
> Which DB species? [Don't Know | Skip]
User picks: Don't Know
Writing mapping.json...
```