changelog

quietlight
May 13, 2026, 4:40 AM
ADBDI3ZCIIYXNTLXSJO2CTKUZ5JZFSGTPPJUT3AG52ZKMP4HEYCAC

Dependencies

Change contents

  • edit in me.txt at line 776
    [3.330][3.330:40094](),[3.40094][3.15250:24529]()
    Good progress — the leaf files (sql, time, metadata, isnight, prepend, xxhash, export, replay, create,
    update, cluster, dataset, location, pattern) are all os.Exit-free in their Run* functions. The remaining 103
    are in:
    - calls.go (internal functions like runCallsFromPreds, runCallsShowImages etc.)
    - calls_*.go (subcommand implementations)
    - import.go (internal runImport* functions)
    - common.go (the requireFlags function)
    This is a good stopping point for Stream 3c. The key architectural change is done: main.go has a single exit
    point, and the core CRUD + utility commands return errors. The calls_* and import internal functions can be
    converted in a follow-up.
    Skraak Codebase Review: Recommendations
    Based on thorough exploration with gosymdb, gocyclo, and direct file reading, here are my findings organized
    by theme.
    ────────────────────────────────────────────────────────────────────────────────
    1. Global Mutable State: dbPath Package Variable
    File: tools/sql.go:16 — var dbPath string
    Impact: Every DB-touching tool function implicitly reads this package-level variable. The cmd layer must call
    tools.SetDBPath() before invoking any tool function — a hidden temporal coupling. 25 references across 11
    files. This is the single biggest architectural smell.
    Recommendation: Make dbPath an explicit parameter on every tool function that needs it. This makes data flow
    visible, eliminates the SetDBPath function entirely, and makes the tools package side-effect-free and
    testable in parallel without mutexes. The Input structs already exist for every tool — add DBPath string to
    them (or a dedicated DBConfig struct embedded in each input).
    ────────────────────────────────────────────────────────────────────────────────
    2. Duplicated Validation/Verification Functions
    There are 8 local verify/validate functions in tools/ that duplicate or partially overlap with
    db/validation.go:
    ┌─────────────────────────────────────────────────┬────────────────────────────────────┬────────────────────┐
    │ Local (tools/) │ db/ equivalent │ Used by │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ verifyDatasetActive(*sql.DB) │ db.DatasetExistsAndActive(Querier) │ 1 caller │
    │ │ │ (dataset.go) │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ verifyDatasetExistsAndActive(Querier) │ wraps db.DatasetExistsAndActive │ location.go │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ verifyLocationExistsAndActive(interface{QueryRo │ db.LocationBelongsToDataset(Querie │ location.go │
    │ w...}) │ r) │ │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ verifyPatternExistsAndActive(*sql.DB) │ none in db/ │ pattern.go │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ verifyPatternExists(*LoggedTx) │ none │ cluster.go │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ validateClusterActive(*sql.DB) │ none │ cluster.go │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ validateClusterCyclicPattern(*sql.DB) │ none │ cluster.go │
    ├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
    │ validateCyclicPattern(*sql.DB) │ none │ cluster.go │
    └─────────────────────────────────────────────────┴────────────────────────────────────┴────────────────────┘
    Problems:
    - Inconsistent parameter types: some take *sql.DB, some take db.Querier, some take inline
    interface{QueryRow...} — this is confusing
    - verifyDatasetActive in dataset.go duplicates db.DatasetExistsAndActive but with a different signature and
    error message
    - verifyLocationExistsAndActive defines an anonymous interface instead of using db.Querier
    Recommendation: Consolidate all entity-exists-and-active checks into db/validation.go using the db.Querier
    interface consistently. Delete the local duplicates in tools/. The existing db.Querier interface is good —
    extend it and use it everywhere.
    ────────────────────────────────────────────────────────────────────────────────
    3. High Cyclomatic Complexity (40+ Functions ≥ 11)
    Worst offenders (complexity 14):
    - RunCalls, ExportDataset, CallsPropagateFolder, bulkReadCSV, loadSpeciesCalltypeIDs, normalizeFlat,
    Model.View, Model.handleSwitchKey
    Average per-package: tools and cmd both have many functions in the 11–14 range. The lint_test threshold is
    14, so these are just barely passing.
    Recommendation:
    - Extract flag-parsing helpers in cmd/ — nearly every Run* function follows the same pattern: fs :=
    flag.NewFlagSet(...), parse, validate required flags, call tools.SetDBPath, call tool function, printJSON.
    The "validate required flags" block alone is duplicated ~20 times.
    - Extract early-return guard clauses in the tools functions — many if err != nil { return ... } chains can be
    collapsed.
    - Break up the TUI's View() (complexity 14, 700+ lines) into sub-render methods (which is already partially
    done, but the switch dispatch is the bottleneck).
    ────────────────────────────────────────────────────────────────────────────────
    4. tools/ Package is a God Package (265 non-test functions, 28 source files)
    556 symbols — 52% of the entire codebase's symbols live in tools/. It mixes:
    - DB CRUD operations (cluster, dataset, location, pattern, import, export)
    - File-processing pipelines (calls_from_preds, calls_classify, calls_clip, calls_clip_labels)
    - AviaNZ format I/O (data file reading/writing)
    - Audio processing (spectrogram generation, playback)
    - Validation logic
    - SQL execution
    Recommendation: Split tools/ into 3–4 focused sub-packages:
    - tools/calls — all calls_* files (from_preds, from_birda, from_raven, classify, clip, clip_labels, modify,
    propagate, push_certainty, detect_anomalies, show_images, summarise)
    - tools/crud — cluster, dataset, location, pattern create/update
    - tools/import — import_file, import_files, import_segments, import_unstructured, bulk_file_import
    - tools/sql — sql, export (or merge into db/)
    This would also reduce merge conflicts, as these are largely independent domains.
    ────────────────────────────────────────────────────────────────────────────────
    5. Repetitive cmd/ Boilerplate
    Every Run* function in cmd/ repeats this pattern:
    1. Create flag.NewFlagSet
    2. Define flags (always --db)
    3. Parse args
    4. missing := []string{} + check required flags
    5. tools.SetDBPath(*dbPath)
    6. defer initEventLog(*dbPath)()
    7. Build tools.XxxInput{...}
    8. Call tools.Xxx(input)
    9. printJSON(output) or error to stderr + os.Exit(1)
    Recommendation: Create a cmd-internal helper:
    ```go
    type CommandConfig struct {
    DBPath string
    EventLog bool
    }
    func parseCommand(args []string, requiredFlags ...string) (CommandConfig, *flag.FlagSet) { ... }
    ```
    This would eliminate ~60% of the boilerplate in cmd/ files.
    ────────────────────────────────────────────────────────────────────────────────
    6. MarshalJSON Boilerplate in db/types.go
    Four types (Dataset, Location, Cluster, CyclicRecordingPattern) each have a nearly identical MarshalJSON
    method that just reformats time.Time fields as RFC3339 strings. The pattern is: create anonymous struct
    identical to the receiver but with string instead of time.Time, copy all fields, convert timestamps, marshal.
    Recommendation: Use a helper or custom type:
    - Option A: Define type JSONTime time.Time with a MarshalJSON that outputs RFC3339, and use it in the struct
    tags.
    - Option B: Use json.Marshal with a single helper that reflects over the struct and converts time.Time
    fields.
    - This would remove ~80 lines of boilerplate.
    ────────────────────────────────────────────────────────────────────────────────
    7. AviaNZ Types Defined in calls_from_preds.go but Used Across Files
    AviaNZMeta, AviaNZLabel, AviaNZSegment are defined in calls_from_preds.go but used by calls_from_birda.go,
    calls_from_raven.go, and the shared writeDotDataFileSafe function. They're AviaNZ format concerns, not
    prediction-specific.
    Recommendation: Move these types to a dedicated file (e.g., tools/avianz_types.go or utils/data_file.go since
    the rest of the AviaNZ parsing already lives there). This makes the dependency graph honest and reduces the
    cognitive load of calls_from_preds.go (733 lines, one of the largest files).
    ────────────────────────────────────────────────────────────────────────────────
    8. Mixed DB Access Patterns in tools/
    The tools layer uses 4 different DB access patterns, sometimes within the same file:
    ┌───────────────────────────────────────┬───────────────────────────────────────────┬───────────────────────┐
    │ Pattern │ Example │ Used in │
    ├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
    │ db.WithWriteTx(ctx, dbPath, toolName, │ cluster.go, dataset.go, location.go │ CRUD tools │
    │ fn) │ │ │
    ├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
    │ db.OpenReadOnlyDB(dbPath) + manual │ bulk_file_import.go, export.go, │ Read-heavy tools │
    │ db.Close() │ import_files.go │ │
    ├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
    │ db.OpenWriteableDB(dbPath) + manual │ bulk_file_import.go, import_file.go, │ Write-heavy without │
    │ │ import_segments.go │ tx logging │
    ├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
    │ db.WithReadDB(dbPath, fn) │ import_files.go, import_unstructured.go │ Simple read queries │
    └───────────────────────────────────────┴───────────────────────────────────────────┴───────────────────────┘
    Some tools open a writeable DB but don't use WithWriteTx, meaning their mutations aren't logged for replay.
    import_segments.go and import_file.go are notable — they open *sql.DB directly and do inserts without the
    transaction logger.
    Recommendation: Standardize on db.WithWriteTx for all mutations (it's already there for exactly this purpose)
    and db.WithReadDB for reads. The manual Open*DB + defer Close() pattern should be the exception, not the
    norm. This also fixes the audit-gap where some writes aren't replayable.
    ────────────────────────────────────────────────────────────────────────────────
    9. tui/ Directly Imports tools for ClassifyState
    The TUI package imports tools specifically for ClassifyState, ClassifyConfig, KeyBinding, BindingResult, and
    FormatLabels. This creates a dependency from the presentation layer to the business-logic layer. The TUI
    calls tools.FormatLabels() and directly manipulates tools.ClassifyState.
    Recommendation: If tools/ is split per recommendation #4, the classify-related types should move to a
    tools/calls sub-package, which the TUI can import. Alternatively, extract the types (ClassifyConfig,
    KeyBinding, BindingResult) into utils/ since they're pure data with no DB dependency — ClassifyState is the
    only one that has real logic (and it should stay in tools/calls).
    ────────────────────────────────────────────────────────────────────────────────
    10. Error Handling: Inconsistent os.Exit in cmd/
    There are 174 os.Exit calls across cmd/ files, with no structured error types. Every function exits directly
    on any error. This makes testing cmd/ functions impossible and prevents composable error handling.
    Recommendation: Return errors from Run* functions instead of calling os.Exit. Have a single top-level
    dispatcher in main.go that handles os.Exit(1) based on the returned error. This would make the cmd/ layer
    testable and allow future programmatic use of the CLI.
    ────────────────────────────────────────────────────────────────────────────────
    Summary: Priority-Ordered Changes
    ┌───────────┬────────────────────────────────────────────────┬─────────────────────────────────────┬────────┐
    │ Priority │ Change │ Impact │ Effort │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🔴 High │ Eliminate var dbPath — make it explicit │ Testability, parallel-safety, │ Medium │
    │ │ parameter │ clarity │ │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🔴 High │ Consolidate duplicated verify/validate into │ Consistency, DRY │ Low │
    │ │ db/validation.go │ │ │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟡 Medium │ Split tools/ into sub-packages │ Maintainability, merge conflicts, │ High │
    │ │ │ cognitive load │ │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟡 Medium │ Standardize DB access patterns (use │ Auditability, consistency │ Medium │
    │ │ WithWriteTx/WithReadDB) │ │ │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟡 Medium │ Reduce cmd/ boilerplate with helper functions │ DRY, readability │ Low │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟢 Low │ Deduplicate MarshalJSON in db/types.go │ DRY, ~80 lines removed │ Low │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟢 Low │ Move AviaNZ types out of calls_from_preds.go │ Dependency clarity │ Low │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟢 Low │ Return errors from cmd/ instead of os.Exit │ Testability │ Medium │
    ├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
    │ 🟢 Low │ Reduce cyclomatic complexity in worst │ Maintainability │ Medium │
    │ │ functions │ │ │
    └───────────┴────────────────────────────────────────────────┴─────────────────────────────────────┴────────┘
    #: 3
    Item: Reduce cyclomatic complexity
    Status: ⚠️ Partial
    Notes: Still ~15 functions at 13–14 (e.g. ExportDataset, CallsPropagateFolder, bulkReadCSV,
    loadSpeciesCalltypeIDs, RunCalls, handleSwitchKey). Same offenders as before — not yet broken up.
    #: 8
    Item: Standardize DB access patterns
    Status: ⚠️ Partial
    Notes: WithWriteTx / WithReadDB are used in CRUD tools, but tools/import/*.go (bulk_file_import, import_file,
    import_files, import_segments) still call db.OpenWriteableDB directly without transaction logging. The
    audit-gap recommendation #8 flagged is still present.
    Evaluation: Recommendations Implementation Status
    ### 1. Global Mutable State: var dbPath — ✅ RESOLVED
    Was: var dbPath package-level variable in tools/sql.go with SetDBPath() function, 25 references across 11
    files.
    Now: SetDBPath and var dbPath are completely eliminated. Every Input struct now has an explicit DBPath string
    field (e.g., ClusterInput.DBPath, DatasetInput.DBPath, ExecuteSQLInput.DBPath). A small resolveDBPath()
    helper delegates to db.ResolveDBPath(). The cmd layer passes *dbPath into each input struct directly. This is
    exactly what was recommended — data flow is now visible, no temporal coupling, parallel-safe.
    ### 2. Duplicated Validation/Verification Functions — ✅ MOSTLY RESOLVED
    Was: 8 local verify/validate functions in tools/ with inconsistent parameter types.
    Now: db/validation.go has been significantly expanded with consistent db.Querier-based functions:
    - DatasetExistsAndActive, LocationBelongsToDataset, LocationExistsAndActive, ClusterExistsAndActive,
    ClusterBelongsToLocation, PatternExistsAndActive, ValidateLocationBelongsToDataset, plus type-specific
    validators.
    The old local duplicates in dataset.go, location.go, pattern.go are gone — those files now call db.*
    directly.
    Remaining issue: tools/cluster.go still has two local wrappers:
    - validateClusterActive(database *sql.DB, ...) — just delegates to db.ClusterExistsAndActive
    - validateClusterCyclicPattern(database *sql.DB, ...) — adds nil/empty check before calling
    db.PatternExistsAndActive
    These are thin wrappers (3–10 lines each) with a bit of business logic in the cyclic pattern one. Not a major
    smell, but the first one is a pure pass-through that could be inlined.
    ### 3. High Cyclomatic Complexity — 🟡 PARTIALLY IMPROVED
    Was: Worst offenders at complexity 14, many functions in the 11–14 range. Lint threshold at 14.
    Now: No functions exceed 14 (the lint threshold), but 6 functions still sit at complexity 14 (the max):
    - RunCalls (cmd), ExportDataset (tools), CallsPropagateFolder (calls), loadSpeciesCalltypeIDs (import),
    bulkReadCSV (import), Model.handleSwitchKey (tui)
    Plus ~28 functions in the 11–13 range. The TUI's View() was previously at complexity 14 and 700+ lines — now
    the TUI is 874 lines but View no longer appears in the top offenders (it was likely broken into sub-methods
    like handleSwitchKey, handleClipKey, etc.). So some progress on the TUI front.
    Verdict: The ceiling hasn't changed (still 14), but there's been some redistribution. The lint threshold is
    acting as a ceiling rather than a target — functions cluster just below/at it.
    ### 4. tools/ God Package — ✅ MOSTLY RESOLVED
    Was: 265 non-test functions, 28 source files, 556 symbols — 52% of codebase.
    Now: tools/ has been split into three packages:
    ┌───────────────┬───────┬───────────┬───────┐
    │ Package │ Files │ Functions │ Lines │
    ├───────────────┼───────┼───────────┼───────┤
    │ tools/ │ 12 │ 82 │ 2,746 │
    ├───────────────┼───────┼───────────┼───────┤
    │ tools/calls/ │ 28 │ 271 │ 8,242 │
    ├───────────────┼───────┼───────────┼───────┤
    │ tools/import/ │ 6 │ 37 │ 2,181 │
    └───────────────┴───────┴───────────┴───────┘
    This matches the recommended split almost exactly (calls → tools/calls/, import → tools/import/). The
    remaining tools/ package is the CRUD core (cluster, dataset, location, pattern, sql, export, prepend, time).
    The tools/sql sub-package recommendation wasn't adopted — export and sql remain in tools/, which is
    reasonable given their smaller size.
    ### 5. Repetitive cmd/ Boilerplate — 🟡 PARTIALLY IMPROVED
    Was: Every Run* function repeats 9 steps: flag.NewFlagSet, define flags, parse, validate required, SetDBPath,
    initEventLog, build input, call tool, printJSON.
    Now:
    - SetDBPath step is eliminated (DBPath goes into input structs) ✅
    - cmd/common.go was created with checkFlags() and checkNonZeroFlags() helpers that return errors instead of
    calling os.Exit ✅
    - Most Run* functions now return err instead of calling os.Exit directly ✅
    - main.go has a clean dispatch map with a single os.Exit(1) on error ✅
    Still present: 29 flag.NewFlagSet calls across 24 cmd files. Each function still creates its own FlagSet,
    defines flags, and calls checkFlags. The "parseCommand" helper from the recommendation was not implemented.
    The boilerplate is reduced (no SetDBPath, error returns instead of exit), but the flag-parsing pattern is
    still repeated.
    ### 6. MarshalJSON Boilerplate in db/types.go — 🟡 PARTIALLY IMPROVED
    Was: 4 nearly identical MarshalJSON methods, ~80 lines of boilerplate.
    Now: A JSONTime type and jt() helper were introduced — this is exactly "Option A" from the recommendation.
    However, the 4 MarshalJSON methods still exist. Each one creates an anonymous struct with JSONTime fields and
    manually copies all fields. The JSONTime type eliminates the explicit time.Time → string conversion, but the
    struct duplication remains. There are still 11 MarshalJSON references in db/types.go (4 methods + helper + jt
    references).
    Lines saved: Maybe ~20–30 lines reduced from the original ~80, but the fundamental boilerplate of duplicating
    the struct definition persists.
    ### 7. AviaNZ Types — ✅ RESOLVED
    Was: AviaNZMeta, AviaNZLabel, AviaNZSegment defined in calls_from_preds.go but used across files.
    Now: Moved to tools/calls/avianz_types.go — a dedicated file in the calls sub-package. Clean separation,
    honest dependency graph. The calls_from_preds.go file is down to 716 lines (from 733).
    ### 8. Mixed DB Access Patterns — 🟡 PARTIALLY IMPROVED
    Was: 4 different DB access patterns across tools/.
    Now: The CRUD operations in tools/ consistently use db.WithWriteTx for writes and db.WithReadDB for reads ✅.
    tools/import/import_unstructured.go also uses WithWriteTx ✅.
    Still an issue: 4 files in tools/import/ still use OpenWriteableDB + manual Close() without transaction
    logging:
    - import_file.go, import_files.go, import_segments.go, bulk_file_import.go
    These writes bypass the audit/replay logger. The original concern about the "audit gap" remains for imports.
    ### 9. TUI Importing tools for ClassifyState — 🟡 PARTIALLY IMPROVED
    Was: TUI imported tools directly for ClassifyState, ClassifyConfig, KeyBinding, BindingResult, FormatLabels.
    Now: TUI imports tools/calls instead of tools. Since the calls sub-package was created (rec #4), this is the
    "alternative" recommendation — the classify-related types live in tools/calls/ and TUI imports that. However,
    the deeper coupling remains: TUI directly manipulates calls.ClassifyState, calls calls.FormatLabels(), and
    calls.BindingResult. The types weren't extracted to utils/.
    Verdict: The dependency is now on a more focused sub-package rather than the god package, which is an
    improvement. But the TUI↔business-logic coupling remains.
    ### 10. Error Handling: os.Exit in cmd/ — ✅ MOSTLY RESOLVED
    Was: 174 os.Exit calls across cmd/ files.
    Now: Only 14 os.Exit calls remain in cmd/ (down from 174!). Of those:
    - 5 are in cmd/common.go's mustValue/mustIntValue (unrecoverable CLI parse errors)
    - 9 are in calls_classify.go, calls_modify.go, calls_push_certainty.go (TUI-related exit paths)
    Most Run* functions now return error. main.go has a clean dispatch with if err := handler(...); err != nil →
    single os.Exit(1). The ErrHelpRequested sentinel enables clean help exits.
    Remaining: 9 os.Exit calls in the classify/modify/push-certainty commands — these are the TUI-adjacent
    commands where the pattern is slightly different (TUI launch, signal handling). Still improvable but
    dramatically better.
    ────────────────────────────────────────────────────────────────────────────────
    Summary Table
    ┌────┬─────────────────────────┬─────────────┬──────────────────────────────────────────────────────────────┐
    │ # │ Recommendation │ Status │ Notes │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 1 │ Eliminate var dbPath │ ✅ Done │ DBPath in every Input struct, SetDBPath gone │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 2 │ Consolidate │ ✅ Mostly │ db/validation.go expanded; 2 thin wrappers remain in │
    │ │ verify/validate │ done │ cluster.go │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 3 │ Reduce cyclomatic │ 🟡 Marginal │ No functions >14, but 6 at exactly 14; still 28+ in 11–13 │
    │ │ complexity │ │ range │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 4 │ Split tools/ god │ ✅ Done │ Split into tools/, tools/calls/, tools/import/ │
    │ │ package │ │ │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 5 │ Reduce cmd/ boilerplate │ 🟡 Partial │ SetDBPath gone, checkFlags helper added, errors returned; │
    │ │ │ │ but 29 FlagSets still repeated │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 6 │ Deduplicate MarshalJSON │ 🟡 Partial │ JSONTime helper added, but 4 MarshalJSON methods still │
    │ │ │ │ duplicate structs │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 7 │ Move AviaNZ types │ ✅ Done │ Moved to tools/calls/avianz_types.go │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 8 │ Standardize DB access │ 🟡 Partial │ CRUD uses WithWriteTx consistently; 4 import files still use │
    │ │ │ │ OpenWriteableDB │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 9 │ TUI→tools coupling │ 🟡 Partial │ Now imports tools/calls (more focused), but still deeply │
    │ │ │ │ coupled to ClassifyState │
    ├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
    │ 10 │ Eliminate os.Exit in │ ✅ Mostly │ Down from 174→14; most functions return errors; main.go │
    │ │ cmd/ │ done │ dispatches │
    └────┴─────────────────────────┴─────────────┴──────────────────────────────────────────────────────────────┘
    Overall: 3 fully resolved, 4 mostly/partially done with remaining work, 3 with marginal improvement. The
    biggest wins were eliminating the global dbPath (rec #1), splitting the god package (rec #4), and the os.Exit
    cleanup (rec #10). The remaining work is mostly medium-priority: the import audit gap (rec #8), the cmd/
    FlagSet boilerplate (rec #5), and the MarshalJSON struct duplication (rec #6).
    Test Coverage Improvement Plan
    Current State
    ┌──────────────┬──────────┬──────────────────────────────────┐
    │ Package │ Coverage │ Statements at 0% │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ utils │ 71.6% │ 106 │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ tools │ 55.4% │ 181 (mostly calls/clip + export) │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ tools/calls │ 58.4% │ 110 │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ tools/import │ 3.7% │ 35 │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ db │ 51.6% │ 46 (mostly validation + types) │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ cmd │ 0.9% │ ~all │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ tui │ 0.0% │ all │
    ├──────────────┼──────────┼──────────────────────────────────┤
    │ Total │ 37.9% │ │
    └──────────────┴──────────┴──────────────────────────────────┘
    E2E shell scripts (2,200+ lines) cover integration paths already, so we shouldn't duplicate that effort in
    unit tests.
    Guiding Principles
    1. Test at the lowest feasible layer — push logic down so it's testable without DB/IO
    2. Don't test what E2E already covers — cmd/ and DB-integration paths are well-exercised by shell scripts
    3. Pure functions and validation logic first — highest ROI, easiest to maintain
    4. Mock via existing interfaces — db.Querier and utils.DB already exist
    5. Skip UI/IO-heavy code — tui, audio_player, calls_show_images
    ────────────────────────────────────────────────────────────────────────────────
    Priority 1: High-ROI, Zero-Dependency Tests (pure logic)
    These are the easiest wins — pure functions with no DB or filesystem dependencies.
    ### 1a. db/types.go — MarshalJSON tests (0% → ~100%)
    All 5 MarshalJSON methods on Dataset, Location, Cluster, CyclicRecordingPattern, and JSONTime/jt are pure
    serialization. Create structs with known values, marshal, assert JSON output. ~30 lines of test per type.
    ### 1b. utils/location.go — ParseLocation (0% → ~100%)
    Already a simple pure function. Test: valid 2-part, valid 3-part, too few parts, too many, non-numeric
    lat/lng, whitespace trimming.
    ### 1c. utils/placeholders.go — Placeholders (0% → ~100%)
    One function, already depends on nothing. Test n=0, n=1, n=3.
    ### 1d. utils/validation.go — ValidateOptionalShortID, ValidateOptionalStringLength (0% → ~100%)
    Two uncovered pure functions. Test nil pointer, empty string, valid value, invalid value.
    ### 1e. db/utils.go — Placeholders (0% → ~100%)
    Same as utils/placeholders — trivial test.
    ### 1f. db/resolve.go — ResolveDBPath (0% → ~100%)
    Small function, test path resolution logic.
    ### 1g. tools/calls/isnight.go — String() and sunTimeUTC() (0% → ~100%)
    String() is pure formatting. sunTimeUTC() is pure conditional. Easy table-driven tests.
    ### 1h. tools/calls/calls_summarise.go — updateStatsFromLabels (0%)
    Already nearby functions are tested. This one just delegates, so a quick test confirms wiring.
    Expected impact: +~8-10% total coverage for minimal effort (~200 lines of test code total).
    ────────────────────────────────────────────────────────────────────────────────
    Priority 2: Extract-and-Test (moderate refactoring, high value)
    These require extracting pure logic from DB-coupled functions, which improves both testability AND the
    architecture.
    ### 2a. db/validation.go — Test with db.Querier mock (0% → ~80%)
    This is the biggest architectural win. All 11 validation functions already use the db.Querier interface
    (not *sql.DB directly). We can:
    1. Create a mockQuerier struct implementing db.Querier using an in-memory DuckDB
    2. Or simpler: create a test helper that sets up an in-memory DuckDB with the full schema + test data, then
    run validation functions against it
    The pattern already exists in invariants_test.go (setupInvariantsTestDB). We can reuse that helper. Test
    each validation function with:
    - Valid ID → success
    - Nonexistent ID → appropriate error
    - Inactive ID → appropriate error
    - Mismatched hierarchy → appropriate error
    This is the single most impactful change — validation logic is business-critical and currently untested
    except via E2E.
    ### 2b. utils/mapping.go — collectUnmappedCalltypes, collectMappedLabels, validateMappedSpecies,
    validateMappedCalltypes (0%)
    These functions need utils.DB (which is the same interface as db.Querier + Query). The same mock approach
    works. Since these are the DB-validation side of the mapping system and the pure mapping functions are
    already well-tested, this closes the gap.
    ### 2c. tools/export.go — Extract orderByFKDependency and manifest logic
    ExportDataset has cyclomatic complexity 14. The table manifest (datasetTables) and ordering logic are pure
    data. Extract and test:
    - Table manifest completeness (no missing tables)
    - FK ordering correctness (can reuse TestGetFKOrder pattern)
    - checkOutputFile logic (pure path validation)
    ### 2d. tools/calls/calls_clip.go — Extract filterSegments and checkDayNightFilter
    Both are pure logic currently at 0%. filterSegments applies species/certainty/calltype filters to segments.
    checkDayNightFilter checks astronomical filters. Neither needs the DB.
    ────────────────────────────────────────────────────────────────────────────────
    Priority 3: Structural Improvements for Testability
    These are refactoring changes that don't add tests directly but make future testing easier.
    ### 3a. Add db.Querier usage consistently in tools/import/
    The import package has 3.7% coverage because everything takes *sql.DB. The validation functions in
    import_segments.go (validateSegmentHierarchy, validateFiltersExist, loadSpeciesCalltypeIDs) could accept
    db.Querier instead, making them testable with the same mock pattern as 2a.
    This aligns with the existing db.Querier interface pattern already in db/validation.go.
    ### 3b. Extract validation from tools/calls/calls_classify.go
    This 704-line file has many 0% methods: filterByTimeOfDay, NextSegment, PrevSegment, FormatLabels, Save,
    etc. The navigation and filtering logic is pure and could be extracted into testable helpers. However, this
    is lower priority since it's interactive/TUI-adjacent.
    ### 3c. Shared test helper package
    setupInvariantsTestDB in db/invariants_test.go is a pattern that should be available across packages.
    Extract to a shared test helper (e.g., db/testdb.go with build tag, or a testutil internal package). This
    would also benefit the validation tests in 2a.
    ────────────────────────────────────────────────────────────────────────────────
    Priority 4: Explicitly Skip / Document Why Not Tested
    ### 4a. tui/classify.go — Don't test
    Interactive TUI. 0% coverage is expected and acceptable. The underlying tools/calls functions it calls are
    tested.
    ### 4b. utils/audio_player.go — Don't test
    OS-level audio playback. Side-effect only.
  • edit in me.txt at line 777
    [3.24530][3.24530:24585]()
    ### 4c. cmd/*.go — Don't unit test (covered by E2E)
  • replacement in me.txt at line 778
    [3.24586][3.24586:29978](),[3.29978][2.17761:18787]()
    CLI dispatch is thin glue. All logic is in tools/. The existing common_test.go tests the few extractable
    helpers (checkFlags, checkNonZeroFlags). The rest is flag parsing and dispatch — well-covered by shell
    script tests.
    ### 4d. main.go — Don't test
    Entry point, no logic to test.
    ────────────────────────────────────────────────────────────────────────────────
    Summary: Expected Coverage Impact
    ┌─────────────────────────────┬────────┬────────────────────┬─────────────────────────────────────────────┐
    │ Change │ Effort │ Coverage Δ │ Maintainability Value │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 1a-1h: Pure function tests │ Low │ +8-10% │ Catches regressions in core logic │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 2a: db/validation with │ Medium │ +5-7% │ Highest architectural value — validates │
    │ Querier │ │ │ business rules │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 2b: mapping DB validation │ Medium │ +2-3% │ Completes mapping test coverage │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 2c: export │ Medium │ +3-4% │ Reduces ExportDataset complexity │
    │ manifest/ordering │ │ │ │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 2d: clip filter extraction │ Low │ +1-2% │ Pure logic testing │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 3a: import Querier refactor │ Medium │ (enables future │ Makes import testable │
    │ │ │ tests) │ │
    ├─────────────────────────────┼────────┼────────────────────┼─────────────────────────────────────────────┤
    │ 3b-c: shared helpers │ Low │ (enables future │ Reduces test boilerplate │
    │ │ │ tests) │ │
    └─────────────────────────────┴────────┴────────────────────┴─────────────────────────────────────────────┘
    Realistic target: 37.9% → ~55-60% with Priority 1+2, while keeping tests maintainable.
    Recommended Execution Order
    1. 1a → 1b → 1c → 1d → 1e → 1f → 1g → 1h (quick wins, build momentum)
    2. 3c (shared test DB helper — enables 2a)
    3. 2a (db/validation — biggest impact)
    4. 2b (mapping DB validation)
    5. 2d (clip filter extraction)
    6. 2c (export extraction)
    7. 3a (import Querier refactor — enables future coverage)
    Shall I start with Priority 1?
    write a PLAN.md, then begin with 1. we will take stock after that.
    Go line count:
    18737 total
    Test line count:
    11982 total
    go test -cover ./...
    ok skraak 2.548s coverage: 0.0% of statements
    ok skraak/cmd 0.005s coverage: 0.9% of statements
    ok skraak/db 1.269s coverage: 76.4% of statements
    ok skraak/tools 1.663s coverage: 58.4% of statements
    ok skraak/tools/calls 0.172s coverage: 60.7% of statements
    ok skraak/tools/import 0.007s coverage: 3.7% of statements
    skraak/tui coverage: 0.0% of statements
    ok skraak/utils 0.280s coverage: 78.8% of statements
    goreportcard-cli
    2026/05/13 15:49:22 disabling ineffassign on large repo...
    Grade .......... A+ 100.0%
    Files ................ 147
    Issues ................. 0
    go_vet .............. 100%
    gofmt ............... 100%
    ineffassign ......... 100%
    gocyclo ............. 100%
    license ............. 100%
    misspell ............ 100%
    Go line count:
    18864 total
    Test line count:
    13489 total
    [3.24586]
    os.Exit
  • edit in CHANGELOG.md at line 4
    [7.1198010]
    [8.39656]
    ## [2026-05-13] Event logging audit: import_files + bulk_file_import bypass LoggedTx
    Audit of all write paths revealed that `tools/import/import_files.go` and `tools/import/bulk_file_import.go` (file import phase) pass `tx.UnderlyingTx()` to `utils.ImportCluster`, bypassing the `LoggedTx` wrapper. The LoggedTx records 0 queries, so `Commit()` writes no event to `.events.jsonl`. Additionally, `utils.EnsureClusterPath` does an UPDATE on `*sql.DB` outside any transaction. Fix plan documented in `PLAN.md`.