└────────┴────────────┴────────────────────────────┴──────────────┘
Good progress — the leaf files (sql, time, metadata, isnight, prepend, xxhash, export, replay, create,
update, cluster, dataset, location, pattern) are all os.Exit-free in their Run* functions. The remaining 103
are in:
- calls.go (internal functions like runCallsFromPreds, runCallsShowImages etc.)
- calls_*.go (subcommand implementations)
- import.go (internal runImport* functions)
- common.go (the requireFlags function)
This is a good stopping point for Stream 3c. The key architectural change is done: main.go has a single exit
point, and the core CRUD + utility commands return errors. The calls_* and import internal functions can be
converted in a follow-up.
Skraak Codebase Review: Recommendations
Based on thorough exploration with gosymdb, gocyclo, and direct file reading, here are my findings organized
by theme.
────────────────────────────────────────────────────────────────────────────────
1. Global Mutable State: dbPath Package Variable
File: tools/sql.go:16 — var dbPath string
Impact: Every DB-touching tool function implicitly reads this package-level variable. The cmd layer must call
tools.SetDBPath() before invoking any tool function — a hidden temporal coupling. 25 references across 11
files. This is the single biggest architectural smell.
Recommendation: Make dbPath an explicit parameter on every tool function that needs it. This makes data flow
visible, eliminates the SetDBPath function entirely, and makes the tools package side-effect-free and
testable in parallel without mutexes. The Input structs already exist for every tool — add DBPath string to
them (or a dedicated DBConfig struct embedded in each input).
────────────────────────────────────────────────────────────────────────────────
2. Duplicated Validation/Verification Functions
There are 8 local verify/validate functions in tools/ that duplicate or partially overlap with
db/validation.go:
┌─────────────────────────────────────────────────┬────────────────────────────────────┬────────────────────┐
│ Local (tools/) │ db/ equivalent │ Used by │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ verifyDatasetActive(*sql.DB) │ db.DatasetExistsAndActive(Querier) │ 1 caller │
│ │ │ (dataset.go) │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ verifyDatasetExistsAndActive(Querier) │ wraps db.DatasetExistsAndActive │ location.go │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ verifyLocationExistsAndActive(interface{QueryRo │ db.LocationBelongsToDataset(Querie │ location.go │
│ w...}) │ r) │ │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ verifyPatternExistsAndActive(*sql.DB) │ none in db/ │ pattern.go │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ verifyPatternExists(*LoggedTx) │ none │ cluster.go │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ validateClusterActive(*sql.DB) │ none │ cluster.go │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ validateClusterCyclicPattern(*sql.DB) │ none │ cluster.go │
├─────────────────────────────────────────────────┼────────────────────────────────────┼────────────────────┤
│ validateCyclicPattern(*sql.DB) │ none │ cluster.go │
└─────────────────────────────────────────────────┴────────────────────────────────────┴────────────────────┘
Problems:
- Inconsistent parameter types: some take *sql.DB, some take db.Querier, some take inline
interface{QueryRow...} — this is confusing
- verifyDatasetActive in dataset.go duplicates db.DatasetExistsAndActive but with a different signature and
error message
- verifyLocationExistsAndActive defines an anonymous interface instead of using db.Querier
Recommendation: Consolidate all entity-exists-and-active checks into db/validation.go using the db.Querier
interface consistently. Delete the local duplicates in tools/. The existing db.Querier interface is good —
extend it and use it everywhere.
────────────────────────────────────────────────────────────────────────────────
3. High Cyclomatic Complexity (40+ Functions ≥ 11)
Worst offenders (complexity 14):
- RunCalls, ExportDataset, CallsPropagateFolder, bulkReadCSV, loadSpeciesCalltypeIDs, normalizeFlat,
Model.View, Model.handleSwitchKey
Average per-package: tools and cmd both have many functions in the 11–14 range. The lint_test threshold is
14, so these are just barely passing.
Recommendation:
- Extract flag-parsing helpers in cmd/ — nearly every Run* function follows the same pattern: fs :=
flag.NewFlagSet(...), parse, validate required flags, call tools.SetDBPath, call tool function, printJSON.
The "validate required flags" block alone is duplicated ~20 times.
- Extract early-return guard clauses in the tools functions — many if err != nil { return ... } chains can be
collapsed.
- Break up the TUI's View() (complexity 14, 700+ lines) into sub-render methods (which is already partially
done, but the switch dispatch is the bottleneck).
────────────────────────────────────────────────────────────────────────────────
4. tools/ Package is a God Package (265 non-test functions, 28 source files)
556 symbols — 52% of the entire codebase's symbols live in tools/. It mixes:
- DB CRUD operations (cluster, dataset, location, pattern, import, export)
- File-processing pipelines (calls_from_preds, calls_classify, calls_clip, calls_clip_labels)
- AviaNZ format I/O (data file reading/writing)
- Audio processing (spectrogram generation, playback)
- Validation logic
- SQL execution
Recommendation: Split tools/ into 3–4 focused sub-packages:
- tools/calls — all calls_* files (from_preds, from_birda, from_raven, classify, clip, clip_labels, modify,
propagate, push_certainty, detect_anomalies, show_images, summarise)
- tools/crud — cluster, dataset, location, pattern create/update
- tools/import — import_file, import_files, import_segments, import_unstructured, bulk_file_import
- tools/sql — sql, export (or merge into db/)
This would also reduce merge conflicts, as these are largely independent domains.
────────────────────────────────────────────────────────────────────────────────
5. Repetitive cmd/ Boilerplate
Every Run* function in cmd/ repeats this pattern:
1. Create flag.NewFlagSet
2. Define flags (always --db)
3. Parse args
4. missing := []string{} + check required flags
5. tools.SetDBPath(*dbPath)
6. defer initEventLog(*dbPath)()
7. Build tools.XxxInput{...}
8. Call tools.Xxx(input)
9. printJSON(output) or error to stderr + os.Exit(1)
Recommendation: Create a cmd-internal helper:
```go
type CommandConfig struct {
DBPath string
EventLog bool
}
func parseCommand(args []string, requiredFlags ...string) (CommandConfig, *flag.FlagSet) { ... }
```
This would eliminate ~60% of the boilerplate in cmd/ files.
────────────────────────────────────────────────────────────────────────────────
6. MarshalJSON Boilerplate in db/types.go
Four types (Dataset, Location, Cluster, CyclicRecordingPattern) each have a nearly identical MarshalJSON
method that just reformats time.Time fields as RFC3339 strings. The pattern is: create anonymous struct
identical to the receiver but with string instead of time.Time, copy all fields, convert timestamps, marshal.
Recommendation: Use a helper or custom type:
- Option A: Define type JSONTime time.Time with a MarshalJSON that outputs RFC3339, and use it in the struct
tags.
- Option B: Use json.Marshal with a single helper that reflects over the struct and converts time.Time
fields.
- This would remove ~80 lines of boilerplate.
────────────────────────────────────────────────────────────────────────────────
7. AviaNZ Types Defined in calls_from_preds.go but Used Across Files
AviaNZMeta, AviaNZLabel, AviaNZSegment are defined in calls_from_preds.go but used by calls_from_birda.go,
calls_from_raven.go, and the shared writeDotDataFileSafe function. They're AviaNZ format concerns, not
prediction-specific.
Recommendation: Move these types to a dedicated file (e.g., tools/avianz_types.go or utils/data_file.go since
the rest of the AviaNZ parsing already lives there). This makes the dependency graph honest and reduces the
cognitive load of calls_from_preds.go (733 lines, one of the largest files).
────────────────────────────────────────────────────────────────────────────────
8. Mixed DB Access Patterns in tools/
The tools layer uses 4 different DB access patterns, sometimes within the same file:
┌───────────────────────────────────────┬───────────────────────────────────────────┬───────────────────────┐
│ Pattern │ Example │ Used in │
├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
│ db.WithWriteTx(ctx, dbPath, toolName, │ cluster.go, dataset.go, location.go │ CRUD tools │
│ fn) │ │ │
├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
│ db.OpenReadOnlyDB(dbPath) + manual │ bulk_file_import.go, export.go, │ Read-heavy tools │
│ db.Close() │ import_files.go │ │
├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
│ db.OpenWriteableDB(dbPath) + manual │ bulk_file_import.go, import_file.go, │ Write-heavy without │
│ │ import_segments.go │ tx logging │
├───────────────────────────────────────┼───────────────────────────────────────────┼───────────────────────┤
│ db.WithReadDB(dbPath, fn) │ import_files.go, import_unstructured.go │ Simple read queries │
└───────────────────────────────────────┴───────────────────────────────────────────┴───────────────────────┘
Some tools open a writeable DB but don't use WithWriteTx, meaning their mutations aren't logged for replay.
import_segments.go and import_file.go are notable — they open *sql.DB directly and do inserts without the
transaction logger.
Recommendation: Standardize on db.WithWriteTx for all mutations (it's already there for exactly this purpose)
and db.WithReadDB for reads. The manual Open*DB + defer Close() pattern should be the exception, not the
norm. This also fixes the audit-gap where some writes aren't replayable.
────────────────────────────────────────────────────────────────────────────────
9. tui/ Directly Imports tools for ClassifyState
The TUI package imports tools specifically for ClassifyState, ClassifyConfig, KeyBinding, BindingResult, and
FormatLabels. This creates a dependency from the presentation layer to the business-logic layer. The TUI
calls tools.FormatLabels() and directly manipulates tools.ClassifyState.
Recommendation: If tools/ is split per recommendation #4, the classify-related types should move to a
tools/calls sub-package, which the TUI can import. Alternatively, extract the types (ClassifyConfig,
KeyBinding, BindingResult) into utils/ since they're pure data with no DB dependency — ClassifyState is the
only one that has real logic (and it should stay in tools/calls).
────────────────────────────────────────────────────────────────────────────────
10. Error Handling: Inconsistent os.Exit in cmd/
There are 174 os.Exit calls across cmd/ files, with no structured error types. Every function exits directly
on any error. This makes testing cmd/ functions impossible and prevents composable error handling.
Recommendation: Return errors from Run* functions instead of calling os.Exit. Have a single top-level
dispatcher in main.go that handles os.Exit(1) based on the returned error. This would make the cmd/ layer
testable and allow future programmatic use of the CLI.
────────────────────────────────────────────────────────────────────────────────
Summary: Priority-Ordered Changes
┌───────────┬────────────────────────────────────────────────┬─────────────────────────────────────┬────────┐
│ Priority │ Change │ Impact │ Effort │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🔴 High │ Eliminate var dbPath — make it explicit │ Testability, parallel-safety, │ Medium │
│ │ parameter │ clarity │ │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🔴 High │ Consolidate duplicated verify/validate into │ Consistency, DRY │ Low │
│ │ db/validation.go │ │ │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟡 Medium │ Split tools/ into sub-packages │ Maintainability, merge conflicts, │ High │
│ │ │ cognitive load │ │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟡 Medium │ Standardize DB access patterns (use │ Auditability, consistency │ Medium │
│ │ WithWriteTx/WithReadDB) │ │ │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟡 Medium │ Reduce cmd/ boilerplate with helper functions │ DRY, readability │ Low │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟢 Low │ Deduplicate MarshalJSON in db/types.go │ DRY, ~80 lines removed │ Low │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟢 Low │ Move AviaNZ types out of calls_from_preds.go │ Dependency clarity │ Low │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟢 Low │ Return errors from cmd/ instead of os.Exit │ Testability │ Medium │
├───────────┼────────────────────────────────────────────────┼─────────────────────────────────────┼────────┤
│ 🟢 Low │ Reduce cyclomatic complexity in worst │ Maintainability │ Medium │
│ │ functions │ │ │
└───────────┴────────────────────────────────────────────────┴─────────────────────────────────────┴────────┘
#: 3
Item: Reduce cyclomatic complexity
Status: ⚠️ Partial
Notes: Still ~15 functions at 13–14 (e.g. ExportDataset, CallsPropagateFolder, bulkReadCSV,
loadSpeciesCalltypeIDs, RunCalls, handleSwitchKey). Same offenders as before — not yet broken up.
#: 8
Item: Standardize DB access patterns
Status: ⚠️ Partial
Notes: WithWriteTx / WithReadDB are used in CRUD tools, but tools/import/*.go (bulk_file_import, import_file,
import_files, import_segments) still call db.OpenWriteableDB directly without transaction logging. The
audit-gap recommendation #8 flagged is still present.
Evaluation: Recommendations Implementation Status
### 1. Global Mutable State: var dbPath — ✅ RESOLVED
Was: var dbPath package-level variable in tools/sql.go with SetDBPath() function, 25 references across 11
files.
Now: SetDBPath and var dbPath are completely eliminated. Every Input struct now has an explicit DBPath string
field (e.g., ClusterInput.DBPath, DatasetInput.DBPath, ExecuteSQLInput.DBPath). A small resolveDBPath()
helper delegates to db.ResolveDBPath(). The cmd layer passes *dbPath into each input struct directly. This is
exactly what was recommended — data flow is now visible, no temporal coupling, parallel-safe.
### 2. Duplicated Validation/Verification Functions — ✅ MOSTLY RESOLVED
Was: 8 local verify/validate functions in tools/ with inconsistent parameter types.
Now: db/validation.go has been significantly expanded with consistent db.Querier-based functions:
- DatasetExistsAndActive, LocationBelongsToDataset, LocationExistsAndActive, ClusterExistsAndActive,
ClusterBelongsToLocation, PatternExistsAndActive, ValidateLocationBelongsToDataset, plus type-specific
validators.
The old local duplicates in dataset.go, location.go, pattern.go are gone — those files now call db.*
directly.
Remaining issue: tools/cluster.go still has two local wrappers:
- validateClusterActive(database *sql.DB, ...) — just delegates to db.ClusterExistsAndActive
- validateClusterCyclicPattern(database *sql.DB, ...) — adds nil/empty check before calling
db.PatternExistsAndActive
These are thin wrappers (3–10 lines each) with a bit of business logic in the cyclic pattern one. Not a major
smell, but the first one is a pure pass-through that could be inlined.
### 3. High Cyclomatic Complexity — 🟡 PARTIALLY IMPROVED
Was: Worst offenders at complexity 14, many functions in the 11–14 range. Lint threshold at 14.
Now: No functions exceed 14 (the lint threshold), but 6 functions still sit at complexity 14 (the max):
- RunCalls (cmd), ExportDataset (tools), CallsPropagateFolder (calls), loadSpeciesCalltypeIDs (import),
bulkReadCSV (import), Model.handleSwitchKey (tui)
Plus ~28 functions in the 11–13 range. The TUI's View() was previously at complexity 14 and 700+ lines — now
the TUI is 874 lines but View no longer appears in the top offenders (it was likely broken into sub-methods
like handleSwitchKey, handleClipKey, etc.). So some progress on the TUI front.
Verdict: The ceiling hasn't changed (still 14), but there's been some redistribution. The lint threshold is
acting as a ceiling rather than a target — functions cluster just below/at it.
### 4. tools/ God Package — ✅ MOSTLY RESOLVED
Was: 265 non-test functions, 28 source files, 556 symbols — 52% of codebase.
Now: tools/ has been split into three packages:
┌───────────────┬───────┬───────────┬───────┐
│ Package │ Files │ Functions │ Lines │
├───────────────┼───────┼───────────┼───────┤
│ tools/ │ 12 │ 82 │ 2,746 │
├───────────────┼───────┼───────────┼───────┤
│ tools/calls/ │ 28 │ 271 │ 8,242 │
├───────────────┼───────┼───────────┼───────┤
│ tools/import/ │ 6 │ 37 │ 2,181 │
└───────────────┴───────┴───────────┴───────┘
This matches the recommended split almost exactly (calls → tools/calls/, import → tools/import/). The
remaining tools/ package is the CRUD core (cluster, dataset, location, pattern, sql, export, prepend, time).
The tools/sql sub-package recommendation wasn't adopted — export and sql remain in tools/, which is
reasonable given their smaller size.
### 5. Repetitive cmd/ Boilerplate — 🟡 PARTIALLY IMPROVED
Was: Every Run* function repeats 9 steps: flag.NewFlagSet, define flags, parse, validate required, SetDBPath,
initEventLog, build input, call tool, printJSON.
Now:
- SetDBPath step is eliminated (DBPath goes into input structs) ✅
- cmd/common.go was created with checkFlags() and checkNonZeroFlags() helpers that return errors instead of
calling os.Exit ✅
- Most Run* functions now return err instead of calling os.Exit directly ✅
- main.go has a clean dispatch map with a single os.Exit(1) on error ✅
Still present: 29 flag.NewFlagSet calls across 24 cmd files. Each function still creates its own FlagSet,
defines flags, and calls checkFlags. The "parseCommand" helper from the recommendation was not implemented.
The boilerplate is reduced (no SetDBPath, error returns instead of exit), but the flag-parsing pattern is
still repeated.
### 6. MarshalJSON Boilerplate in db/types.go — 🟡 PARTIALLY IMPROVED
Was: 4 nearly identical MarshalJSON methods, ~80 lines of boilerplate.
Now: A JSONTime type and jt() helper were introduced — this is exactly "Option A" from the recommendation.
However, the 4 MarshalJSON methods still exist. Each one creates an anonymous struct with JSONTime fields and
manually copies all fields. The JSONTime type eliminates the explicit time.Time → string conversion, but the
struct duplication remains. There are still 11 MarshalJSON references in db/types.go (4 methods + helper + jt
references).
Lines saved: Maybe ~20–30 lines reduced from the original ~80, but the fundamental boilerplate of duplicating
the struct definition persists.
### 7. AviaNZ Types — ✅ RESOLVED
Was: AviaNZMeta, AviaNZLabel, AviaNZSegment defined in calls_from_preds.go but used across files.
Now: Moved to tools/calls/avianz_types.go — a dedicated file in the calls sub-package. Clean separation,
honest dependency graph. The calls_from_preds.go file is down to 716 lines (from 733).
### 8. Mixed DB Access Patterns — 🟡 PARTIALLY IMPROVED
Was: 4 different DB access patterns across tools/.
Now: The CRUD operations in tools/ consistently use db.WithWriteTx for writes and db.WithReadDB for reads ✅.
tools/import/import_unstructured.go also uses WithWriteTx ✅.
Still an issue: 4 files in tools/import/ still use OpenWriteableDB + manual Close() without transaction
logging:
- import_file.go, import_files.go, import_segments.go, bulk_file_import.go
These writes bypass the audit/replay logger. The original concern about the "audit gap" remains for imports.
### 9. TUI Importing tools for ClassifyState — 🟡 PARTIALLY IMPROVED
Was: TUI imported tools directly for ClassifyState, ClassifyConfig, KeyBinding, BindingResult, FormatLabels.
Now: TUI imports tools/calls instead of tools. Since the calls sub-package was created (rec #4), this is the
"alternative" recommendation — the classify-related types live in tools/calls/ and TUI imports that. However,
the deeper coupling remains: TUI directly manipulates calls.ClassifyState, calls calls.FormatLabels(), and
calls.BindingResult. The types weren't extracted to utils/.
Verdict: The dependency is now on a more focused sub-package rather than the god package, which is an
improvement. But the TUI↔business-logic coupling remains.
### 10. Error Handling: os.Exit in cmd/ — ✅ MOSTLY RESOLVED
Was: 174 os.Exit calls across cmd/ files.
Now: Only 14 os.Exit calls remain in cmd/ (down from 174!). Of those:
- 5 are in cmd/common.go's mustValue/mustIntValue (unrecoverable CLI parse errors)
- 9 are in calls_classify.go, calls_modify.go, calls_push_certainty.go (TUI-related exit paths)
Most Run* functions now return error. main.go has a clean dispatch with if err := handler(...); err != nil →
single os.Exit(1). The ErrHelpRequested sentinel enables clean help exits.
Remaining: 9 os.Exit calls in the classify/modify/push-certainty commands — these are the TUI-adjacent
commands where the pattern is slightly different (TUI launch, signal handling). Still improvable but
dramatically better.
────────────────────────────────────────────────────────────────────────────────
Summary Table
┌────┬─────────────────────────┬─────────────┬──────────────────────────────────────────────────────────────┐
│ # │ Recommendation │ Status │ Notes │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 1 │ Eliminate var dbPath │ ✅ Done │ DBPath in every Input struct, SetDBPath gone │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 2 │ Consolidate │ ✅ Mostly │ db/validation.go expanded; 2 thin wrappers remain in │
│ │ verify/validate │ done │ cluster.go │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 3 │ Reduce cyclomatic │ 🟡 Marginal │ No functions >14, but 6 at exactly 14; still 28+ in 11–13 │
│ │ complexity │ │ range │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 4 │ Split tools/ god │ ✅ Done │ Split into tools/, tools/calls/, tools/import/ │
│ │ package │ │ │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 5 │ Reduce cmd/ boilerplate │ 🟡 Partial │ SetDBPath gone, checkFlags helper added, errors returned; │
│ │ │ │ but 29 FlagSets still repeated │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 6 │ Deduplicate MarshalJSON │ 🟡 Partial │ JSONTime helper added, but 4 MarshalJSON methods still │
│ │ │ │ duplicate structs │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 7 │ Move AviaNZ types │ ✅ Done │ Moved to tools/calls/avianz_types.go │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 8 │ Standardize DB access │ 🟡 Partial │ CRUD uses WithWriteTx consistently; 4 import files still use │
│ │ │ │ OpenWriteableDB │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 9 │ TUI→tools coupling │ 🟡 Partial │ Now imports tools/calls (more focused), but still deeply │
│ │ │ │ coupled to ClassifyState │
├────┼─────────────────────────┼─────────────┼──────────────────────────────────────────────────────────────┤
│ 10 │ Eliminate os.Exit in │ ✅ Mostly │ Down from 174→14; most functions return errors; main.go │
│ │ cmd/ │ done │ dispatches │
└────┴─────────────────────────┴─────────────┴──────────────────────────────────────────────────────────────┘
Overall: 3 fully resolved, 4 mostly/partially done with remaining work, 3 with marginal improvement. The
biggest wins were eliminating the global dbPath (rec #1), splitting the god package (rec #4), and the os.Exit
cleanup (rec #10). The remaining work is mostly medium-priority: the import audit gap (rec #8), the cmd/
FlagSet boilerplate (rec #5), and the MarshalJSON struct duplication (rec #6).