// Package prompts defines the MCP prompt definitions and handlers that guide
// SQL-based exploration of the skraak database.
package prompts
import (
"context"
"fmt"
"github.com/modelcontextprotocol/go-sdk/mcp"
)
// GetQueryDatasetsPrompt returns the prompt definition for the query_active_datasets workflow.
func GetQueryDatasetsPrompt() *mcp.Prompt {
return &mcp.Prompt{
Name: "query_active_datasets",
Description: "Guide for using SQL to query and summarize datasets by type (organise/test/train)",
}
}
// QueryDatasetsPromptHandler returns a guide for querying and summarizing active datasets with SQL.
func QueryDatasetsPromptHandler(ctx context.Context, req *mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
return &mcp.GetPromptResult{
Messages: []*mcp.PromptMessage{
{
Role: "user",
Content: &mcp.TextContent{
Text: `# Query Active Datasets Workflow
This workflow helps you query and summarize datasets using SQL.
## Step 1: Query All Active Datasets
Use the execute_sql tool with this query:
` + "```sql" + `
SELECT id, name, type, active, created_at, last_modified, description
FROM dataset
WHERE active = true
ORDER BY type, name;
` + "```" + `
This returns all active datasets with their metadata.
## Step 2: Count Datasets by Type
To summarize dataset distribution by type:
` + "```sql" + `
SELECT type, COUNT(*) as count
FROM dataset
WHERE active = true
GROUP BY type
ORDER BY type;
` + "```" + `
## Step 3: Find Most Recently Modified
Get the 5 most recently modified datasets:
` + "```sql" + `
SELECT name, type, last_modified
FROM dataset
WHERE active = true
ORDER BY last_modified DESC
LIMIT 5;
` + "```" + `
## Example Analysis Output
After running these queries, present a summary like:
"""
Dataset Summary:
- Total: 8 active datasets
- Organise: 8 datasets
- Test: 0 datasets
- Train: 0 datasets
Most Recently Modified:
1. "Twenty Four Seven" (organise, 2024-06-05)
2. "Pomona - Kiwi" (organise, 2024-06-05)
3. "Pure Salt - Kiwi" (organise, 2024-05-15)
"""
## SQL Tips
- Use WHERE active = true to exclude inactive datasets
- GROUP BY type to count by category
- ORDER BY to sort results
- LIMIT to restrict result count`,
},
},
},
}, nil
}
// GetExploreSchemaPrompt returns the prompt definition for explore_database_schema, which accepts an optional focus_area argument.
func GetExploreSchemaPrompt() *mcp.Prompt {
return &mcp.Prompt{
Name: "explore_database_schema",
Description: "Interactive guide for exploring the database schema using resources. Optional focus_area argument: overview, dataset, locations, files, labels, or taxonomy",
Arguments: []*mcp.PromptArgument{
{
Name: "focus_area",
Description: "Area to focus on: overview, dataset, locations, files, labels, or taxonomy",
Required: false,
},
},
}
}
// ExploreSchemaPromptHandler returns a schema exploration guide for the requested focus_area (defaulting to "overview").
func ExploreSchemaPromptHandler(ctx context.Context, req *mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
focusArea := "overview"
if req.Params.Arguments != nil {
if fa, ok := req.Params.Arguments["focus_area"]; ok && fa != "" {
focusArea = fa
}
}
var promptText string
switch focusArea {
case "overview":
promptText = `# Database Schema Overview
Explore the skraak database schema to understand its structure.
## Step 1: Read Full Schema
Use the schema://full resource to see the complete database structure:
- Resource URI: schema://full
- Returns: Complete SQL schema with all tables
## Step 2: Identify Major Components
The database has these main areas:
1. **Datasets & Organization**: dataset, location, cluster
2. **Audio Files**: file, file_dataset, moth_metadata
3. **Selections & Labels**: selection, label, label_subtype
4. **Taxonomy**: ebird_taxonomy, species, call_type
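As a quick sanity check, you can count rows in the core tables with execute_sql (a minimal sketch using the table names above):
` + "```sql" + `
SELECT 'dataset' as table_name, COUNT(*) as row_count FROM dataset
UNION ALL SELECT 'location', COUNT(*) FROM location
UNION ALL SELECT 'file', COUNT(*) FROM file
UNION ALL SELECT 'selection', COUNT(*) FROM selection
UNION ALL SELECT 'label', COUNT(*) FROM label;
` + "```" + `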
## Step 3: Explore Relationships
Look for:
- Foreign key relationships between tables
- Junction tables (many-to-many relationships)
- Enum types (dataset_type, gain_level)
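For example, the dataset_type values actually in use can be read directly from the data:
` + "```sql" + `
SELECT DISTINCT type FROM dataset ORDER BY type;
` + "```" + `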
## Next Steps
Re-run this prompt with a specific focus_area:
- "dataset" - Dataset and location structure
- "files" - Audio file organization
- "labels" - Labeling and taxonomy system
- "taxonomy" - eBird taxonomy integration`
case "dataset":
promptText = `# Dataset & Location Schema
Explore how datasets, locations, and clusters are organized.
## Step 1: Read Dataset Table
Use: schema://table/dataset
Key fields:
- id, name, description
- type: ENUM('organise', 'test', 'train')
- active: BOOLEAN
## Step 2: Read Location Table
Use: schema://table/location
Key fields:
- dataset_id: Links to dataset
- latitude, longitude: Geographic coordinates
- timezone_id: IANA timezone identifier
## Step 3: Read Cluster Table
Use: schema://table/cluster
Represents a collection of files (e.g., one SD card):
- location_id: Where recordings were made
- sample_rate: Audio sample rate
- cyclic_recording_pattern_id: Recording schedule
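To see these links in practice, a sketch join over the foreign keys listed above:
` + "```sql" + `
SELECT d.name as dataset, l.name as location, COUNT(c.id) as clusters
FROM dataset d
LEFT JOIN location l ON d.id = l.dataset_id
LEFT JOIN cluster c ON l.id = c.location_id
WHERE d.active = true
GROUP BY d.name, l.name
ORDER BY d.name, l.name;
` + "```" + `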
## Relationships
dataset (1) -> (many) location (1) -> (many) cluster (1) -> (many) file`
case "locations":
promptText = `# Location & Cluster Schema
Explore geographic and recording organization.
## Step 1: Location Details
Use: schema://table/location
- Geographic coordinates with validation checks
- Timezone support for accurate timestamps
- Links to parent dataset
## Step 2: Recording Patterns
Use: schema://table/cyclic_recording_pattern
Defines recording schedules:
- record_s: Recording duration in seconds
- sleep_s: Sleep duration between recordings
## Step 3: Cluster Organization
Use: schema://table/cluster
Groups files from one deployment:
- Linked to specific location
- Has recording pattern
- Consistent sample rate
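A sketch that lists each location's deployments with their recording schedule (assumes the pattern table's primary key is id):
` + "```sql" + `
-- assumption: cyclic_recording_pattern has an id primary key
SELECT l.name as location, c.sample_rate, p.record_s, p.sleep_s
FROM location l
JOIN cluster c ON l.id = c.location_id
LEFT JOIN cyclic_recording_pattern p ON c.cyclic_recording_pattern_id = p.id
WHERE l.active = true
ORDER BY l.name;
` + "```" + `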
## Use Case
Locations organize multiple recording deployments (clusters) at geographic coordinates.`
case "files":
promptText = `# Audio File Schema
Explore how audio files are stored and organized.
## Step 1: File Table Structure
Use: schema://table/file
Core fields:
- file_name, path: File identification
- xxh64_hash: Content hash for deduplication
- timestamp_local: Recording time (timezone-aware)
- duration, sample_rate: Audio properties
- maybe_solar_night, maybe_civil_night: Night detection
- moon_phase: Lunar phase (0.00-1.00)
## Step 2: File-to-Dataset Junction
Use: schema://table/file_dataset
Many-to-many relationship:
- Files can belong to multiple datasets
- Datasets can contain many files
## Step 3: AudioMoth Metadata
Use: schema://table/moth_metadata
Hardware-specific data:
- recorder_id: Device identifier
- gain: Recording gain level (ENUM)
- battery_v, temp_c: Environmental conditions
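A sketch that joins files to their AudioMoth metadata (assumption: moth_metadata links to file via a file_id column; adjust to the actual schema):
` + "```sql" + `
-- assumes moth_metadata.file_id references file.id
SELECT f.file_name, f.timestamp_local, m.recorder_id, m.gain, m.battery_v, m.temp_c
FROM file f
LEFT JOIN moth_metadata m ON f.id = m.file_id
WHERE f.active = true
ORDER BY f.timestamp_local
LIMIT 50;
` + "```" + `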
## Workflow
file (many) -> (1) cluster -> (1) location
file (many) <-> (many) dataset (via file_dataset junction)
file (1) -> (many) selection -> (many) label`
case "labels":
promptText = `# Selection & Label Schema
Explore how audio segments are labeled for species identification.
## Step 1: Selection Table
Use: schema://table/selection
Defines time/frequency regions:
- file_id: Source audio file
- start_time, end_time: Temporal bounds (seconds)
- freq_low, freq_high: Frequency bounds (Hz)
- dataset_id: Context for this selection
## Step 2: Label Table
Use: schema://table/label
Species identification:
- selection_id: The labeled region
- species_id: Identified species
- certainty: Confidence (0-100)
- filter_id: Optional processing filter
## Step 3: Label Subtype (Call Types)
Use: schema://table/label_subtype
Optional call classification:
- label_id: Parent label
- calltype_id: Type of call (from call_type table)
- certainty: Subtype confidence
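For example, a sketch that counts labels per species within each dataset context (joins follow the foreign keys listed above):
` + "```sql" + `
SELECT se.dataset_id, sp.label as species, COUNT(*) as label_count, AVG(lb.certainty) as avg_certainty
FROM label lb
JOIN selection se ON lb.selection_id = se.id
JOIN species sp ON lb.species_id = sp.id
GROUP BY se.dataset_id, sp.label
ORDER BY label_count DESC;
` + "```" + `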
## Workflow
file (1) -> (many) selection (1) -> (many) label (1) -> (0-many) label_subtype
Labels are specific to a dataset context (selection.dataset_id).`
case "taxonomy":
promptText = `# Taxonomy Schema
Explore eBird taxonomy integration and species management.
## Step 1: eBird Taxonomy Table
Use: schema://table/ebird_taxonomy
Immutable reference data:
- species_code: eBird identifier
- taxonomy_version: Year version
- primary_com_name, sci_name: Names
- bird_order, family: Classification
## Step 2: Species Table (Mutable)
Use: schema://table/species
User-managed species list:
- label: Display name
- ebird_code: Links to eBird taxonomy
- Can be customized per project
## Step 3: Call Type Table
Use: schema://table/call_type
Call classifications for each species:
- species_id: Parent species
- label: Call type (e.g., "male", "female", "duet")
## Step 4: Materialized View
Use: schema://table/ebird_taxonomy_v2024
Fast access to 2024 taxonomy:
- Pre-filtered for current version
- Full-text search enabled
- Used for species lookup
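To see how these pieces connect, a sketch join (link columns as listed above; assumes species has an id primary key):
` + "```sql" + `
SELECT sp.label as species, t.primary_com_name, t.sci_name, ct.label as call_type
FROM species sp
LEFT JOIN ebird_taxonomy_v2024 t ON sp.ebird_code = t.species_code
LEFT JOIN call_type ct ON ct.species_id = sp.id
ORDER BY sp.label;
` + "```" + `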
## Relationships
ebird_taxonomy (reference) <- species (mutable) (1) -> (many) call_type`
default:
return nil, fmt.Errorf("unknown focus_area: %s. Valid options: overview, dataset, locations, files, labels, taxonomy", focusArea)
}
return &mcp.GetPromptResult{
Messages: []*mcp.PromptMessage{
{
Role: "user",
Content: &mcp.TextContent{
Text: promptText,
},
},
},
}, nil
}
// GetExploreLocationHierarchyPrompt returns the prompt definition for explore_location_hierarchy, which accepts an optional dataset_id argument.
func GetExploreLocationHierarchyPrompt() *mcp.Prompt {
return &mcp.Prompt{
Name: "explore_location_hierarchy",
Description: "Guide for navigating the dataset→location→cluster→file data hierarchy using SQL JOINs. Optional starting point: dataset_id to focus on a specific dataset",
Arguments: []*mcp.PromptArgument{
{
Name: "dataset_id",
Description: "Optional dataset ID to focus the exploration (e.g., 'vgIr9JSH_lFj')",
Required: false,
},
},
}
}
// ExploreLocationHierarchyPromptHandler returns a SQL JOIN guide for the dataset→location→cluster→file hierarchy, optionally focused on a single dataset.
func ExploreLocationHierarchyPromptHandler(ctx context.Context, req *mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
datasetID := ""
if req.Params.Arguments != nil {
if id, ok := req.Params.Arguments["dataset_id"]; ok && id != "" {
datasetID = id
}
}
var promptText string
if datasetID != "" {
promptText = fmt.Sprintf(`# Explore Location Hierarchy (Dataset: %s)
This workflow helps you explore the complete data hierarchy using SQL JOINs.
## Step 1: Get Dataset Information
` + "```sql" + `
SELECT id, name, type, active, created_at, last_modified
FROM dataset
WHERE id = '%s' AND active = true;
` + "```" + `
## Step 2: Get Locations in Dataset
` + "```sql" + `
SELECT id, name, latitude, longitude, timezone_id
FROM location
WHERE dataset_id = '%s' AND active = true
ORDER BY name;
` + "```" + `
Review the geographic distribution and timezone information.
## Step 3: Get Complete Hierarchy with Counts
Use a JOIN query to see the full hierarchy:
` + "```sql" + `
SELECT
d.name as dataset_name,
l.name as location_name,
l.latitude,
l.longitude,
COUNT(DISTINCT c.id) as cluster_count,
COUNT(f.id) as file_count
FROM dataset d
LEFT JOIN location l ON d.id = l.dataset_id
LEFT JOIN cluster c ON l.id = c.location_id
LEFT JOIN file f ON c.id = f.cluster_id
WHERE d.id = '%s' AND d.active = true
GROUP BY d.name, l.name, l.latitude, l.longitude
ORDER BY l.name;
` + "```" + `
This shows how many clusters and files exist at each location.
## Step 4: Examine Files in a Specific Cluster
First, find an interesting cluster from Step 3, then:
` + "```sql" + `
SELECT
file_name,
timestamp_local,
duration,
maybe_solar_night,
maybe_civil_night,
moon_phase
FROM file
WHERE cluster_id = ? AND active = true
ORDER BY timestamp_local
LIMIT 100;
` + "```" + `
Use a parameterized query with the cluster_id you want to explore.
## Example Summary Output
"""
Dataset: [name] (type: organise)
├── Locations: 1 active location
│ └── "call site 1 1.2 test" at (-40.826344, 172.585079)
│ ├── Clusters: X recording deployments
│ │ └── Files: Y audio recordings
"""
## Data Hierarchy
dataset (1) → (many) locations → (many) clusters → (many) files
This structure allows:
- Multiple recording locations per dataset
- Multiple recording deployments (clusters) per location
- Multiple audio files per deployment`, datasetID, datasetID, datasetID, datasetID)
} else {
promptText = `# Explore Location Hierarchy
This workflow helps you explore the complete data hierarchy using SQL.
## Step 1: Start with Datasets
` + "```sql" + `
SELECT id, name, type, active
FROM dataset
WHERE active = true
ORDER BY type, name;
` + "```" + `
Pick a dataset_id for focused exploration, or re-run this prompt with the dataset_id parameter.
## Step 2: Understand the Hierarchy
The database organizes data in four levels:
1. **Dataset** - A project or collection (e.g., "Summer Survey 2024")
2. **Location** - Geographic recording site with GPS coordinates
3. **Cluster** - A recording deployment (e.g., one SD card's recordings)
4. **File** - Individual audio recording files
## Step 3: Query the Full Hierarchy with JOINs
Get an overview of all datasets with location/cluster/file counts:
` + "```sql" + `
SELECT
d.name as dataset,
d.type,
COUNT(DISTINCT l.id) as location_count,
COUNT(DISTINCT c.id) as cluster_count,
COUNT(f.id) as file_count
FROM dataset d
LEFT JOIN location l ON d.id = l.dataset_id
LEFT JOIN cluster c ON l.id = c.location_id
LEFT JOIN file f ON c.id = f.cluster_id
WHERE d.active = true
GROUP BY d.name, d.type
ORDER BY d.name;
` + "```" + `
## Step 4: Filter by Specific Dataset
To explore a specific dataset:
` + "```sql" + `
SELECT
l.name as location,
l.latitude,
l.longitude,
COUNT(DISTINCT c.id) as clusters,
COUNT(f.id) as files
FROM location l
LEFT JOIN cluster c ON l.id = c.location_id
LEFT JOIN file f ON c.id = f.cluster_id
WHERE l.dataset_id = ? AND l.active = true
GROUP BY l.name, l.latitude, l.longitude
ORDER BY l.name;
` + "```" + `
Use parameterized query: ` + "```json" + `{"parameters": ["your_dataset_id"]}` + "```" + `
## Use Case Examples
- **Count recordings per location**: Use GROUP BY with COUNT
- **Analyze temporal coverage**: Query file timestamps across clusters
- **Geographic analysis**: Select latitude/longitude with aggregates
- **Quality assessment**: Check sample rates and night detection flags
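For instance, a temporal-coverage sketch per cluster:
` + "```sql" + `
SELECT
c.name as cluster,
MIN(f.timestamp_local) as first_recording,
MAX(f.timestamp_local) as last_recording,
COUNT(f.id) as file_count
FROM cluster c
LEFT JOIN file f ON c.id = f.cluster_id
WHERE c.active = true
GROUP BY c.name
ORDER BY first_recording;
` + "```" + `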
## SQL Tips
- Use LEFT JOIN to include locations even if they have no clusters
- Use COUNT(DISTINCT) to avoid double-counting
- Use GROUP BY to aggregate data at different levels
- Use parameterized queries (?) for safe filtering`
}
return &mcp.GetPromptResult{
Messages: []*mcp.PromptMessage{
{
Role: "user",
Content: &mcp.TextContent{
Text: promptText,
},
},
},
}, nil
}
// GetQueryLocationDataPrompt returns the prompt definition for the query_location_data workflow.
func GetQueryLocationDataPrompt() *mcp.Prompt {
return &mcp.Prompt{
Name: "query_location_data",
Description: "Workflow for finding and analyzing recording locations using SQL queries",
}
}
// QueryLocationDataPromptHandler returns a guide for finding and analyzing recording locations with SQL.
func QueryLocationDataPromptHandler(ctx context.Context, req *mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
return &mcp.GetPromptResult{
Messages: []*mcp.PromptMessage{
{
Role: "user",
Content: &mcp.TextContent{
Text: `# Query Location Data Workflow
This workflow helps you analyze recording locations using SQL.
## Step 1: Get All Locations
` + "```sql" + `
SELECT id, name, latitude, longitude, timezone_id, dataset_id
FROM location
WHERE active = true
ORDER BY name;
` + "```" + `
This returns all active locations with coordinates and timezones.
## Step 2: Analyze Location Distribution by Dataset
Group locations by their parent dataset:
` + "```sql" + `
SELECT
d.name as dataset,
COUNT(l.id) as location_count,
AVG(l.latitude) as avg_latitude,
AVG(l.longitude) as avg_longitude
FROM dataset d
LEFT JOIN location l ON d.id = l.dataset_id
WHERE d.active = true
GROUP BY d.name
ORDER BY location_count DESC;
` + "```" + `
This shows which datasets have the most recording sites.
## Step 3: Find Locations Within Geographic Bounds
Filter by latitude/longitude ranges:
` + "```sql" + `
SELECT name, latitude, longitude, timezone_id
FROM location
WHERE active = true
AND latitude BETWEEN -42.0 AND -40.0
AND longitude BETWEEN 172.0 AND 174.0
ORDER BY latitude, longitude;
` + "```" + `
Adjust the BETWEEN ranges to match your area of interest.
## Step 4: Get Recording Counts by Location
Use JOINs to count clusters and files at each location:
` + "```sql" + `
SELECT
l.name as location,
l.latitude,
l.longitude,
d.name as dataset,
COUNT(DISTINCT c.id) as clusters,
COUNT(f.id) as total_files
FROM location l
LEFT JOIN dataset d ON l.dataset_id = d.id
LEFT JOIN cluster c ON l.id = c.location_id
LEFT JOIN file f ON c.id = f.cluster_id
WHERE l.active = true
GROUP BY l.name, l.latitude, l.longitude, d.name
ORDER BY total_files DESC
LIMIT 20;
` + "```" + `
This shows the 20 most productive recording locations.
## Step 5: Analyze Specific Location
Deep dive on a specific location using parameterized query:
` + "```sql" + `
SELECT
c.name as cluster,
c.sample_rate,
COUNT(f.id) as file_count,
MIN(f.timestamp_local) as first_recording,
MAX(f.timestamp_local) as last_recording,
SUM(f.duration) as total_duration_seconds
FROM cluster c
LEFT JOIN file f ON c.id = f.cluster_id
WHERE c.location_id = ? AND c.active = true
GROUP BY c.name, c.sample_rate
ORDER BY first_recording;
` + "```" + `
Use: ` + "```json" + `{"parameters": ["location_id_here"]}` + "```" + `
## Example Analysis Output
"""
Location Analysis:
Total Locations: 139 active sites
Distribution by Dataset:
- Pomona - Kiwi: 48 locations
- Twenty Four Seven: 35 locations
- Manu o Kahurangi - Kiwi: 23 locations
Top Recording Sites:
1. "Homer Point" - 5 clusters, 12,450 files
2. "Kahurangi Ridge" - 3 clusters, 8,230 files
3. "Cobb Valley" - 4 clusters, 7,890 files
"""
## SQL Tips
- Use LEFT JOIN to include locations even without recordings
- Use COUNT(DISTINCT) to avoid counting duplicates
- Use BETWEEN for geographic bounding boxes
- Use GROUP BY for aggregations at location level
- Use ORDER BY with LIMIT for top-N queries`,
},
},
},
}, nil
}
// GetAnalyzeClusterFilesPrompt returns the prompt definition for analyze_cluster_files, which requires a cluster_id argument.
func GetAnalyzeClusterFilesPrompt() *mcp.Prompt {
return &mcp.Prompt{
Name: "analyze_cluster_files",
Description: "Guide for examining audio files within a recording cluster using SQL aggregates. Requires cluster_id parameter",
Arguments: []*mcp.PromptArgument{
{
Name: "cluster_id",
Description: "Cluster ID to analyze (e.g., 'oNI9jqszP4Bk')",
Required: true,
},
},
}
}
// AnalyzeClusterFilesPromptHandler returns a SQL aggregate guide for the files in the requested cluster.
func AnalyzeClusterFilesPromptHandler(ctx context.Context, req *mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
clusterID := ""
if req.Params.Arguments != nil {
if id, ok := req.Params.Arguments["cluster_id"]; ok && id != "" {
clusterID = id
}
}
if clusterID == "" {
return nil, fmt.Errorf("cluster_id argument is required. Find cluster IDs using SQL: SELECT id, name FROM cluster WHERE active = true")
}
promptText := fmt.Sprintf(`# Analyze Cluster Files (Cluster: %s)
This workflow helps you examine audio files using SQL aggregates.
## Step 1: Get All Files in Cluster
` + "```sql" + `
SELECT
file_name,
timestamp_local,
duration,
sample_rate,
maybe_solar_night,
maybe_civil_night,
moon_phase,
xxh64_hash
FROM file
WHERE cluster_id = '%s' AND active = true
ORDER BY timestamp_local
LIMIT 100;
` + "```" + `
Review individual file details (first 100 files).
## Step 2: Get Summary Statistics
Use aggregate functions to summarize the cluster:
` + "```sql" + `
SELECT
COUNT(*) as total_files,
SUM(duration) as total_duration_seconds,
AVG(duration) as avg_duration,
MIN(timestamp_local) as first_recording,
MAX(timestamp_local) as last_recording,
SUM(CASE WHEN maybe_solar_night THEN 1 ELSE 0 END) as night_files,
SUM(CASE WHEN NOT maybe_solar_night THEN 1 ELSE 0 END) as day_files,
AVG(moon_phase) as avg_moon_phase,
COUNT(DISTINCT sample_rate) as unique_sample_rates
FROM file
WHERE cluster_id = '%s' AND active = true;
` + "```" + `
This provides an overview of recording coverage and characteristics.
## Step 3: Analyze Temporal Distribution
Group files by hour to see recording pattern:
` + "```sql" + `
SELECT
DATE_TRUNC('hour', timestamp_local) as recording_hour,
COUNT(*) as file_count,
SUM(duration) as total_seconds
FROM file
WHERE cluster_id = '%s' AND active = true
GROUP BY recording_hour
ORDER BY recording_hour
LIMIT 50;
` + "```" + `
This shows when recordings were made throughout the deployment.
## Step 4: Moon Phase Analysis
Analyze distribution across lunar cycle:
` + "```sql" + `
SELECT
ROUND(moon_phase, 1) as moon_phase_bin,
COUNT(*) as file_count,
AVG(moon_phase) as avg_phase
FROM file
WHERE cluster_id = '%s' AND active = true AND moon_phase IS NOT NULL
GROUP BY moon_phase_bin
ORDER BY moon_phase_bin;
` + "```" + `
Shows recording coverage across moon phases (0.0 = new moon, 1.0 = full moon).
## Step 5: Check for Data Quality Issues
Detect duplicates and gaps:
` + "```sql" + `
-- Find duplicate hashes (potential duplicate files)
SELECT xxh64_hash, COUNT(*) as count
FROM file
WHERE cluster_id = '%s' AND active = true
GROUP BY xxh64_hash
HAVING COUNT(*) > 1;
` + "```" + `
` + "```sql" + `
-- Check sample rate consistency
SELECT sample_rate, COUNT(*) as file_count
FROM file
WHERE cluster_id = '%s' AND active = true
GROUP BY sample_rate;
` + "```" + `
## Example Analysis Output
"""
Cluster Analysis: %s
Recording Period:
- Start: 2023-12-10 20:00:00
- End: 2023-12-11 10:00:00
- Duration: 14 hours
Files: 840 recordings
- Night: 650 files (77.4%%)
- Day: 190 files (22.6%%)
Audio Properties:
- Sample Rate: 250kHz (consistent)
- Avg Duration: 60s per file
- Total Audio: 14.0 hours
Moon Phase:
- Range: 0.92-0.95 (near full moon)
- Average: 0.93
Data Quality:
- Unique hashes: 840 (no duplicates)
- Sample rates: 1 (consistent)
"""
## SQL Tips for Analysis
- Use COUNT(*) to count files
- Use SUM(duration) for total recording time
- Use CASE WHEN for conditional counts (night vs day)
- Use DATE_TRUNC to group by time periods
- Use ROUND() to bin continuous values like moon_phase
- Use HAVING with GROUP BY to filter aggregated results
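To go one step further, a sketch that cross-references species detections for this cluster (uses the selection, label, and species tables; pass the cluster_id via the parameters argument of execute_sql):
` + "```sql" + `
SELECT sp.label as species, COUNT(*) as detections
FROM file f
JOIN selection se ON se.file_id = f.id
JOIN label lb ON lb.selection_id = se.id
JOIN species sp ON sp.id = lb.species_id
WHERE f.cluster_id = ? AND f.active = true
GROUP BY sp.label
ORDER BY detections DESC;
` + "```" + `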
## Next Steps
- Cross-reference with selection/label data for species detections
- Compare temporal patterns across different clusters
- Use file_name and path to locate actual audio files`, clusterID, clusterID, clusterID, clusterID, clusterID, clusterID, clusterID, clusterID)
return &mcp.GetPromptResult{
Messages: []*mcp.PromptMessage{
{
Role: "user",
Content: &mcp.TextContent{
Text: promptText,
},
},
},
}, nil
}
// GetSystemStatusPrompt returns the prompt definition for the system_status_check workflow.
func GetSystemStatusPrompt() *mcp.Prompt {
return &mcp.Prompt{
Name: "system_status_check",
Description: "Comprehensive workflow to verify MCP server health using tools, resources, and prompts",
}
}
// SystemStatusPromptHandler returns an end-to-end health-check workflow covering tools, resources, and prompts.
func SystemStatusPromptHandler(ctx context.Context, req *mcp.GetPromptRequest) (*mcp.GetPromptResult, error) {
return &mcp.GetPromptResult{
Messages: []*mcp.PromptMessage{
{
Role: "user",
Content: &mcp.TextContent{
Text: `# System Status Check Workflow
This workflow verifies all MCP server primitives are functioning correctly.
## Step 1: Verify Tools
Test both available tools:
### 1a. Time Tool
- Call: get_current_time
- Expected: Current system time with timezone
- Validates: Tool execution, time handling
### 1b. Generic SQL Tool
Test with a simple query:
` + "```json" + `
{
"name": "execute_sql",
"arguments": {
"query": "SELECT COUNT(*) as dataset_count FROM dataset WHERE active = true"
}
}
` + "```" + `
Expected: Row count result
Validates: Database connectivity, SQL execution
### 1c. Parameterized Query Test
` + "```json" + `
{
"name": "execute_sql",
"arguments": {
"query": "SELECT id, name FROM dataset WHERE id = ? AND active = true",
"parameters": ["vgIr9JSH_lFj"]
}
}
` + "```" + `
Expected: Filtered dataset result
Validates: Parameterized query support
### 1d. JOIN Query Test
` + "```json" + `
{
"name": "execute_sql",
"arguments": {
"query": "SELECT d.name, COUNT(l.id) as locations FROM dataset d LEFT JOIN location l ON d.id = l.dataset_id WHERE d.active = true GROUP BY d.name",
"limit": 10
}
}
` + "```" + `
Expected: Aggregated results with JOINs
Validates: Complex SQL support
### 1e. Security Test (Should Fail)
` + "```json" + `
{
"name": "execute_sql",
"arguments": {
"query": "INSERT INTO dataset (id, name) VALUES ('test', 'test')"
}
}
` + "```" + `
Expected: Error about forbidden keywords
Validates: Security validation working
## Step 2: Verify Resources
Test schema resources:
### 2a. Full Schema Resource
- Read: schema://full
- Expected: Complete SQL schema (~348 lines)
- Validates: File I/O, resource serving
### 2b. Table Template Resource
- Read: schema://table/dataset
- Expected: Dataset table CREATE statement with indexes
- Validates: Template parsing, SQL extraction
### 2c. Additional Table Templates
Test a few more tables:
- Read: schema://table/location
- Read: schema://table/cluster
- Read: schema://table/file
- Expected: Individual table schemas
- Validates: Template system works for all tables
### 2d. Invalid Resource (Error Handling)
- Read: schema://table/invalid_table
- Expected: Error with list of valid table names
- Validates: Error handling, validation
## Step 3: Verify Prompts
Test all 6 prompt types:
### 3a. Dataset Query Prompt
- Get: query_active_datasets
- Expected: SQL-based workflow guide for dataset querying
- Validates: Basic prompt retrieval
### 3b. Schema Exploration Prompt
- Get: explore_database_schema (with focus_area: "overview")
- Expected: Schema exploration guide
- Validates: Parameterized prompts, context switching
### 3c. Location Hierarchy Prompt
- Get: explore_location_hierarchy
- Expected: SQL JOIN-based hierarchy navigation guide
- Validates: New SQL workflow prompts
### 3d. Location Data Prompt
- Get: query_location_data
- Expected: SQL location analysis workflow
- Validates: Location-focused SQL prompts
### 3e. Cluster Analysis Prompt
- Get: analyze_cluster_files (with cluster_id)
- Expected: SQL aggregate-based file analysis workflow
- Validates: Required parameter prompts
### 3f. System Status Prompt
- Get: system_status_check
- Expected: This current workflow
- Validates: Meta-prompt functionality
## Step 4: Query All Major Tables
Verify database access across all tables:
` + "```sql" + `
-- Datasets
SELECT COUNT(*) as count FROM dataset WHERE active = true;
-- Locations
SELECT COUNT(*) as count FROM location WHERE active = true;
-- Clusters
SELECT COUNT(*) as count FROM cluster WHERE active = true;
-- Files
SELECT COUNT(*) as count FROM file WHERE active = true;
` + "```" + `
All queries should return counts without errors.
## Step 5: Summary Report
Generate a comprehensive status report:
### Health Check Results
"""
✓ Tools: 2/2 operational
- get_current_time: OK
- execute_sql: OK
✓ Simple queries work
✓ Parameterized queries work
✓ JOIN queries work
✓ Aggregates work
✓ Security validation active
✓ Resources: 2 types operational
- schema://full: OK (348 lines)
- schema://table/{name}: OK (tested: dataset, location, cluster, file)
✓ Prompts: 6/6 operational
- query_active_datasets: OK (SQL-based)
- explore_database_schema: OK
- explore_location_hierarchy: OK (SQL JOIN-based)
- query_location_data: OK (SQL-based)
- analyze_cluster_files: OK (SQL aggregate-based)
- system_status_check: OK (current)
✓ Database: Read-only mode verified
- INSERT/UPDATE/DELETE blocked
- All tables accessible
System Status: HEALTHY
Architecture:
- Generic SQL tool (flexible read-only querying)
- Schema resources (context for LLM)
- Workflow prompts (teach SQL patterns)
- Read-only database (security enforced)
"""
## Troubleshooting
If any check fails:
- **Tools**: Check database path and connectivity
- **Resources**: Verify schema.sql file exists and is readable
- **Prompts**: Check prompt handler registration in main.go
- **SQL errors**: Check query syntax and table names
## Next Steps
After verifying system health:
1. Use explore_location_hierarchy to understand data structure
2. Use query_location_data to analyze recording sites
3. Use analyze_cluster_files to examine specific recordings
4. Construct custom SQL queries for your analysis needs`,
},
},
},
}, nil
}