data-sql-optimization
npx skills add https://github.com/vasilyu1983/ai-agents-public --skill data-sql-optimization
Agent 安装分布
Skill 文档
SQL Optimization â Comprehensive Reference
This skill provides actionable checklists, patterns, and templates for transactional (OLTP) SQL optimization: measurement-first triage, EXPLAIN/plan interpretation, balanced indexing (avoiding over-indexing), performance monitoring, schema evolution, migrations, backup/recovery, high availability, and security.
Supported Platforms: PostgreSQL, MySQL, SQL Server, Oracle, SQLite
For OLAP/Analytics: See data-lake-platform (ClickHouse, DuckDB, Doris, StarRocks)
Quick Reference
| Task | Tool/Framework | Command | When to Use |
|---|---|---|---|
| Query Performance Analysis | EXPLAIN ANALYZE | EXPLAIN (ANALYZE, BUFFERS) SELECT ... (PG) / EXPLAIN ANALYZE SELECT ... (MySQL) |
Diagnose slow queries, identify missing indexes |
| Find Slow Queries | pg_stat_statements / slow query log | SELECT * FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10; |
Identify performance bottlenecks in production |
| Index Analysis | pg_stat_user_indexes / SHOW INDEX | SELECT * FROM pg_stat_user_indexes WHERE idx_scan = 0; |
Find unused indexes, validate index coverage |
| Schema Migration | Flyway / Liquibase | flyway migrate / liquibase update |
Version-controlled database changes |
| Backup & Recovery | pg_dump / mysqldump | pg_dump -Fc dbname > backup.dump |
Point-in-time recovery, disaster recovery |
| Replication Setup | Streaming / GTID | Configure postgresql.conf / my.cnf | High availability, read scaling |
| Safe Tuning Loop | Measure -> Explain -> Change -> Verify | Use tuning worksheet template | Reduce latency/cost without regressions |
Decision Tree: Choosing the Right Approach
Query performance issue?
ââ Identify slow queries first?
â ââ PostgreSQL -> pg_stat_statements (top queries by total_exec_time)
â ââ MySQL -> Performance Schema / slow query log
â
ââ Analyze execution plan?
â ââ PostgreSQL -> EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
â ââ MySQL -> EXPLAIN FORMAT=JSON or EXPLAIN ANALYZE
â ââ SQL Server -> SET STATISTICS IO ON; SET STATISTICS TIME ON;
â
ââ Need indexing strategy?
â ââ PostgreSQL -> B-tree (default), GIN (JSONB), GiST (spatial), partial indexes
â ââ MySQL -> BTREE (default), FULLTEXT (text search), SPATIAL
â ââ Check: Table >10k rows AND selectivity <10% AND 10x+ speedup verified
â
ââ Schema changes needed?
â ââ New database -> template-schema-design.md
â ââ Modify schema -> template-migration.md (Flyway/Liquibase)
â ââ Large tables (MySQL) -> gh-ost / pt-online-schema-change (avoid locks)
â
ââ High availability setup?
â ââ PostgreSQL -> Streaming replication (template-replication-ha.md)
â ââ MySQL -> GTID-based replication (template-replication-ha.md)
â
ââ Backup/disaster recovery?
â ââ template-backup-restore.md (pg_dump, mysqldump, PITR)
â
ââ Analytics on large datasets (OLAP)?
ââ See data-lake-platform (ClickHouse, DuckDB, Doris, StarRocks)
When to Use This Skill
Codex should invoke this skill when users ask for:
Query Optimization (Modern Approaches)
- SQL query performance review and tuning
- EXPLAIN/plan interpretation with optimization suggestions
- Index creation strategies with balanced approach (avoiding over-indexing)
- Troubleshooting slow queries using pg_stat_statements or Performance Schema
- Identifying and remediating SQL anti-patterns with operational fixes
- Query rewrite suggestions or migration from slow to fast patterns
- Statistics maintenance and auto-analyze configuration
Database Operations
- Schema design with normalization and performance trade-offs
- Database migrations with version control (Liquibase, Flyway)
- Backup and recovery strategies (point-in-time recovery, automated testing)
- High availability and replication setup (streaming, GTID-based)
- Database security auditing (access controls, encryption, SQL injection prevention)
- Lock analysis and deadlock troubleshooting
- Connection pooling (pgBouncer, Pgpool-II, ProxySQL)
Performance Tuning (Modern Standards)
- Memory configuration (work_mem, shared_buffers, effective_cache_size)
- Automated monitoring with pg_stat_statements and query pattern analysis
- Index health monitoring (unused index detection, index bloat analysis)
- Vacuum strategy and autovacuum tuning (PostgreSQL)
- InnoDB buffer pool optimization (MySQL)
- Partition pruning improvements (PostgreSQL 18+)
Resources (Best Practices Guides)
Find detailed operational patterns and quick references in:
- SQL Best Practices: references/sql-best-practices.md
- Query Tuning Patterns: references/query-tuning-patterns.md
- Indexing Strategies: references/index-patterns.md
- EXPLAIN/Analysis: references/explain-analysis.md
- SQL Anti-Patterns: references/sql-antipatterns.md
- External Sources: data/sources.json â vendor docs and reference links
- Operational Standards: references/operational-patterns.md â Deep operational checklists, database-specific guidance, and template selection trees
Each file includes:
- Copy-paste ready checklists (e.g., “query review”, “index design”, “explain review”)
- Anti-patterns with operational fixes and alternatives
- Query rewrite and indexing strategies with examples
- Troubleshooting guides (step-by-step)
Templates (Copy-Paste Ready)
Templates are organized by database technology for precision and clarity:
Cross-Platform Templates (All Databases)
- assets/cross-platform/template-query-tuning.md – Universal query optimization
- assets/cross-platform/template-explain-analysis.md – Execution plan analysis
- assets/cross-platform/template-performance-tuning-worksheet.md – NEW 4-step tuning workflow (Measure -> Explain -> Change -> Verify)
- assets/cross-platform/template-index.md – Index design patterns
- assets/cross-platform/template-slow-query.md – Slow query triage
- assets/cross-platform/template-schema-design.md – Schema modeling
- assets/cross-platform/template-migration.md – Database migrations
- assets/cross-platform/template-backup-restore.md – Backup/DR planning
- assets/cross-platform/template-security-audit.md – Security review
- assets/cross-platform/template-diagnostics.md – Performance diagnostics
- assets/cross-platform/template-lock-analysis.md – Lock troubleshooting
PostgreSQL Templates
- assets/postgres/template-pg-explain.md – PostgreSQL EXPLAIN analysis
- assets/postgres/template-pg-index.md – PostgreSQL indexing (B-tree, GIN, GiST)
- assets/postgres/template-replication-ha.md – Streaming replication & HA
MySQL Templates
- assets/mysql/template-mysql-explain.md – MySQL EXPLAIN analysis
- assets/mysql/template-mysql-index.md – MySQL/InnoDB indexing
- assets/mysql/template-replication-ha.md – MySQL replication & HA
Microsoft SQL Server Templates
- assets/mssql/template-mssql-explain.md – SQL Server EXPLAIN/SHOWPLAN analysis
- assets/mssql/template-mssql-index.md – SQL Server indexing and tuning
Oracle Templates
- assets/oracle/template-oracle-explain.md – Oracle EXPLAIN plan review and tuning
SQLite Templates
- assets/sqlite/template-sqlite-optimization.md – SQLite optimization and pragma guidance
Related Skills
Infrastructure & Operations:
- ../ops-devops-platform/SKILL.md â Infrastructure, backups, monitoring, and incident response
- ../qa-observability/SKILL.md â Performance monitoring, profiling, and metrics
- ../qa-debugging/SKILL.md â Production debugging patterns
Application Integration:
- ../software-backend/SKILL.md â API/database integration and application patterns
- ../software-architecture-design/SKILL.md â System design and data architecture
- ../dev-api-design/SKILL.md â REST API and database interaction patterns
Quality & Security:
- ../qa-resilience/SKILL.md â Resilience, circuit breakers, and failure handling
- ../software-security-appsec/SKILL.md â Database security, auth, SQL injection prevention
- ../qa-testing-strategy/SKILL.md â Database testing strategies
Data Engineering:
- ../ai-ml-data-science/SKILL.md â SQLMesh, dbt, data transformations
- ../ai-mlops/SKILL.md â Data pipelines, ETL, and warehouse loading (dlt)
- ../ai-ml-timeseries/SKILL.md â Time-series databases and forecasting
Navigation
Resources
- references/explain-analysis.md
- references/query-tuning-patterns.md
- references/operational-patterns.md
- references/sql-antipatterns.md
- references/index-patterns.md
- references/sql-best-practices.md
Templates
- assets/cross-platform/template-slow-query.md
- assets/cross-platform/template-backup-restore.md
- assets/cross-platform/template-schema-design.md
- assets/cross-platform/template-explain-analysis.md
- assets/cross-platform/template-performance-tuning-worksheet.md
- assets/cross-platform/template-security-audit.md
- assets/cross-platform/template-diagnostics.md
- assets/cross-platform/template-index.md
- assets/cross-platform/template-migration.md
- assets/cross-platform/template-lock-analysis.md
- assets/cross-platform/template-query-tuning.md
- assets/oracle/template-oracle-explain.md
- assets/sqlite/template-sqlite-optimization.md
- assets/postgres/template-pg-index.md
- assets/postgres/template-replication-ha.md
- assets/postgres/template-pg-explain.md
- assets/mysql/template-mysql-explain.md
- assets/mysql/template-mysql-index.md
- assets/mysql/template-replication-ha.md
- assets/mssql/template-mssql-index.md
- assets/mssql/template-mssql-explain.md
Data
- data/sources.json â Curated external references
Operational Deep Dives
See references/operational-patterns.md for:
- End-to-end optimization checklists and anti-pattern fixes
- Database-specific quick references (PostgreSQL, MySQL, SQL Server, Oracle, SQLite)
- Slow query troubleshooting workflow and reliability drills
- Template selection decision tree and platform migration notes
Do / Avoid
GOOD: Do
- Measure baseline before any optimization
- Change one variable at a time
- Verify results match after query changes
- Update statistics before concluding “needs index”
- Test with production-like data volumes
- Document all optimization decisions
- Include performance tests in CI/CD
BAD: Avoid
- Adding indexes without checking if they’ll be used
- Using SELECT * in production queries
- Optimizing for test data (use representative volumes)
- Ignoring write performance impact of indexes
- Skipping EXPLAIN analysis before changes
- Multiple simultaneous changes (can’t attribute improvement)
- N+1 query patterns in application code
Anti-Patterns Quick Reference
| Anti-Pattern | Problem | Fix |
|---|---|---|
| **SELECT *** | Reads unnecessary columns | Explicit column list |
| N+1 queries | Multiplied round trips | JOIN or batch fetch |
| Missing WHERE | Full table scan | Add predicates |
| Function on indexed column | Can’t use index | Move function to RHS |
| Implicit type conversion | Index bypass | Match types explicitly |
| LIKE ‘%prefix’ | Leading wildcard = scan | Full-text search |
| Unbounded result set | Memory explosion | Add LIMIT/pagination |
| OR conditions | Index may not be used | UNION or rewrite |
See references/sql-antipatterns.md for detailed fixes.
OLTP vs OLAP Decision Tree
Is your query for...?
ââ Point lookups (by ID/key)?
â ââ OLTP database (this skill)
â - Ensure proper indexes
â - Use connection pooling
â - Optimize for low latency
â
ââ Aggregations over recent data (dashboard)?
â ââ OLTP database (this skill)
â - Consider materialized views
â - Index common filter columns
â - Watch for lock contention
â
ââ Full table scans or historical analysis?
â ââ OLAP database (data-lake-platform)
â - ClickHouse, DuckDB, Doris
â - Columnar storage
â - Partitioning by date
â
ââ Mixed workload (both)?
ââ Separate OLTP and OLAP
- OLTP for transactions
- Replicate to OLAP for analytics
- Avoid running analytics on primary
Optional: AI/Automation
Note: AI tools assist but require human validation of correctness.
- EXPLAIN summarization â Identify bottlenecks from complex plans
- Query rewrite suggestions â Must verify result equivalence
- Index recommendations â Check selectivity and write impact first
Bounded Claims
- AI cannot determine correct query results
- Automated index suggestions may miss workload context
- Human review required for production changes
Analytical Databases (OLAP)
For OLAP databases and data lake infrastructure, see data-lake-platform:
- Query engines: ClickHouse, DuckDB, Apache Doris, StarRocks
- Table formats: Apache Iceberg, Delta Lake, Apache Hudi
- Transformation: SQLMesh, dbt (staging/marts layers)
- Ingestion: dlt, Airbyte (connectors)
- Streaming: Apache Kafka patterns
This skill focuses on transactional database optimization (PostgreSQL, MySQL, SQL Server, Oracle, SQLite). Use data-lake-platform for analytical workloads.
Related Skills
This skill focuses on query optimization within a single database. For related workflows:
SQL Transformation & Analytics Engineering: -> ai-ml-data-science skill
- SQLMesh templates for building staging/intermediate/marts layers
- Incremental models (FULL, INCREMENTAL_BY_TIME_RANGE, INCREMENTAL_BY_UNIQUE_KEY)
- DAG management and model dependencies
- Unit tests and audits for SQL transformations
Data Ingestion (Loading into Warehouses): -> ai-mlops skill
- dlt templates for extracting from REST APIs, databases
- Loading to Snowflake, BigQuery, Redshift, Postgres, DuckDB
- Incremental loading patterns (timestamp, ID-based, merge/upsert)
- Database replication (Postgres, MySQL, MongoDB -> warehouse)
Data Lake Infrastructure: -> data-lake-platform skill
- ClickHouse, DuckDB, Doris, StarRocks query engines
- Iceberg, Delta Lake, Hudi table formats
- Kafka streaming, Dagster/Airflow orchestration
Use Case Decision:
- Query is slow in production -> Use this skill (data-sql-optimization)
- Building feature pipelines in SQL -> Use ai-ml-data-science (SQLMesh)
- Loading data from APIs/DBs to warehouse -> Use ai-mlops (dlt)
- Analytics on large datasets (OLAP) -> Use data-lake-platform
External Resources
See data/sources.json for 62+ curated resources including:
Core Documentation:
- RDBMS Documentation: PostgreSQL, MySQL, SQL Server, Oracle, SQLite, DuckDB official docs
- Query Optimization: Use The Index, Luke, SQL Performance Explained, vendor optimization guides
- Schema Design: Database Refactoring (Fowler), normalization guides, data type selection
Modern Optimization (Current):
- PostgreSQL: official release notes and “current” docs for planner/optimizer changes
- MySQL: official reference manual sections for EXPLAIN, optimizer, and Performance Schema
- SQL Server / Oracle: official docs for execution plans, indexing, and concurrency controls
Operations & Infrastructure:
- HA & Replication: Streaming replication, GTID-based replication, failover automation
- Migrations: Liquibase, Flyway version control and deployment patterns
- Backup/Recovery: pgBackRest, Percona XtraBackup, point-in-time recovery
- Monitoring: pg_stat_statements, Performance Schema, EXPLAIN visualizers (Dalibo, depesz)
- Security: OWASP SQL Injection Prevention, Postgres hardening, encryption standards
- Analytical Databases: DuckDB extensions, Parquet specification, columnar storage patterns
Use references/operational-patterns.md and the templates directory for detailed workflows, migration notes, and ready-to-run commands.