postgres-optimization

📁 simhacker/moollm 📅 Jan 26, 2026

Total installs: 1
Weekly installs: 1
Site rank: #43478

Install command
npx skills add https://github.com/simhacker/moollm --skill postgres-optimization

Install distribution by agent

mcpjam 1
claude-code 1
windsurf 1
zencoder 1
crush 1
cline 1

Skill documentation

🐘 PostgreSQL Optimization

“Beyond ‘just add an index’ — creative solutions for real performance problems.”

Unconventional optimization techniques for PostgreSQL that go beyond standard DBA playbooks.

Purpose

When conventional approaches fall short — query rewrites, adding indexes, VACUUM, ANALYZE — these techniques offer creative solutions:

  • Eliminate impossible query scans with constraint exclusion
  • Reduce index size with function-based indexes
  • Enforce uniqueness with hash indexes instead of B-Trees

When to Use

  • Ad-hoc query environments where users make mistakes
  • Large indexes approaching table size
  • Uniqueness constraints on large text values (URLs, documents)
  • Timestamp columns queried at coarser granularity

Technique 1: Constraint Exclusion

The Problem

Check constraints prevent invalid data, but PostgreSQL doesn’t use them to optimize queries by default.

CREATE TABLE users (
    id INT PRIMARY KEY,
    username TEXT NOT NULL,
    plan TEXT NOT NULL,
    CONSTRAINT plan_check CHECK (plan IN ('free', 'pro'))
);

An analyst writes:

SELECT * FROM users WHERE plan = 'Pro';  -- Note: capital P

Despite the check constraint making this condition impossible, PostgreSQL scans the entire table.
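For reference, the default plan looks something like this (costs and row counts are illustrative and will vary with your data):

```sql
EXPLAIN SELECT * FROM users WHERE plan = 'Pro';
-- Seq Scan on users  (cost=0.00..18584.00 rows=1 width=40)
--   Filter: (plan = 'Pro'::text)
```

Every row is read and filtered even though the constraint guarantees no row can match.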

The Solution

SET constraint_exclusion TO 'on';

With constraint exclusion enabled:

EXPLAIN ANALYZE SELECT * FROM users WHERE plan = 'Pro';
Result  (cost=0.00..0.00 rows=0 width=0)
  One-Time Filter: false
Execution Time: 0.008 ms

PostgreSQL recognizes the condition contradicts the constraint and skips the scan entirely.

When to Enable

| Environment | Recommendation |
| --- | --- |
| OLTP production | Leave as 'partition' (default) |
| BI / data warehouse | Set to 'on' |
| Ad-hoc query tools | Set to 'on' |
| Reporting databases | Set to 'on' |

Tradeoffs

  • Benefit: Eliminates impossible query scans
  • Cost: Extra planning overhead evaluating constraints against conditions
  • Default: ‘partition’ — only used for partition pruning
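If the setting should apply to a whole reporting database or analyst role rather than one session, it can be persisted with standard ALTER commands (the database and role names below are illustrative):

```sql
-- Per session only:
SET constraint_exclusion TO 'on';

-- Persist for every new connection to one database:
ALTER DATABASE reporting SET constraint_exclusion TO 'on';

-- Or for a specific role, e.g. ad-hoc analysts:
ALTER ROLE analyst SET constraint_exclusion TO 'on';
```

This keeps the extra planning overhead confined to the environments that benefit from it, leaving OLTP connections at the default.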

Technique 2: Function-Based Indexes for Lower Cardinality

The Problem

You have a sales table with timestamps:

CREATE TABLE sale (
    id INT PRIMARY KEY,
    sold_at TIMESTAMPTZ NOT NULL,
    charged INT NOT NULL
);

Analysts query by day:

SELECT date_trunc('day', sold_at AT TIME ZONE 'UTC'), SUM(charged)
FROM sale
WHERE sold_at BETWEEN '2025-01-01 UTC' AND '2025-02-01 UTC'
GROUP BY 1;

You add a B-Tree index on sold_at and it comes out at 214 MB for a 160 MB table. The index is larger than the table itself!

The Solution

Index only what queries need:

CREATE INDEX sale_sold_at_date_ix 
ON sale((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);

| Index | Size |
| --- | --- |
| sale_sold_at_ix (full timestamp) | 214 MB |
| sale_sold_at_date_ix (date only) | 66 MB |

The function-based index is 3x smaller because:

  • Dates are 4 bytes vs 8 bytes for timestamptz
  • Fewer distinct values enable deduplication
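The sizes quoted above can be verified directly from the catalog (index names as created in this section):

```sql
SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname IN ('sale_sold_at_ix', 'sale_sold_at_date_ix');
```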

The Discipline Problem

Function-based indexes require exact expression match:

-- Uses the index ✓
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'

-- Does NOT use the index ✗
WHERE (sold_at AT TIME ZONE 'UTC')::date 
      BETWEEN '2025-01-01' AND '2025-01-31'
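A quick way to check whether a given predicate hits the index is EXPLAIN; the plan shape below is illustrative, and PostgreSQL may prefer a Bitmap Index Scan depending on selectivity:

```sql
EXPLAIN SELECT SUM(charged)
FROM sale
WHERE date_trunc('day', sold_at AT TIME ZONE 'UTC')::date = '2025-01-15';
-- Look for: Index Scan (or Bitmap Index Scan) using sale_sold_at_date_ix.
-- A Seq Scan here means the expression did not match the index definition.
```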

Solution: Virtual Generated Columns (PostgreSQL 18+)

ALTER TABLE sale ADD sold_at_date DATE
GENERATED ALWAYS AS ((date_trunc('day', sold_at AT TIME ZONE 'UTC'))::date);

Now queries use the virtual column:

SELECT sold_at_date, SUM(charged)
FROM sale
WHERE sold_at_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY 1;

Benefits:

  • Smaller index
  • Faster queries
  • No discipline required — column guarantees correct expression
  • No ambiguity about timezones

Limitation: PostgreSQL 18 doesn’t support indexes directly on virtual columns (yet).


Technique 3: Hash Index for Uniqueness

The Problem

You have a table with large URLs:

CREATE TABLE urls (
    id INT PRIMARY KEY,
    url TEXT NOT NULL,
    data JSON
);

You add a unique B-Tree index:

CREATE UNIQUE INDEX urls_url_unique_ix ON urls(url);

| Object | Size |
| --- | --- |
| Table | 160 MB |
| B-Tree index | 154 MB |

The index is almost as large as the table because B-Tree stores actual values in leaf blocks.

The Solution

Use an exclusion constraint with a hash index:

ALTER TABLE urls 
ADD CONSTRAINT urls_url_unique_hash 
EXCLUDE USING HASH (url WITH =);

| Index | Size |
| --- | --- |
| B-Tree | 154 MB |
| Hash | 32 MB |

The hash index is 5x smaller because it stores hash values, not the actual URLs.

Uniqueness Is Enforced

INSERT INTO urls (id, url) VALUES (1000002, 'https://example.com');
-- ERROR: conflicting key value violates exclusion constraint

Queries Still Fast

EXPLAIN ANALYZE SELECT * FROM urls WHERE url = 'https://example.com';
Index Scan using urls_url_unique_hash on urls
Execution Time: 0.022 ms  -- Faster than B-Tree's 0.046 ms!

Limitations

| Feature | B-Tree Unique | Hash Exclusion |
| --- | --- | --- |
| Foreign key reference | ✓ | ✗ |
| ON CONFLICT (column) | ✓ | ✗ |
| ON CONFLICT ON CONSTRAINT | ✓ | ✓ (DO NOTHING only) |
| ON CONFLICT DO UPDATE | ✓ | ✗ |
| MERGE | ✓ | ✓ |

Workaround: Use MERGE

Instead of INSERT ... ON CONFLICT DO UPDATE:

MERGE INTO urls t
USING (VALUES (1000004, 'https://example.com')) AS s(id, url)
ON t.url = s.url
WHEN MATCHED THEN UPDATE SET id = s.id
WHEN NOT MATCHED THEN INSERT (id, url) VALUES (s.id, s.url);
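For plain insert-or-ignore, ON CONFLICT ON CONSTRAINT with DO NOTHING still works against the exclusion constraint, as the limitations table notes (the id value here is illustrative):

```sql
INSERT INTO urls (id, url)
VALUES (1000005, 'https://example.com')
ON CONFLICT ON CONSTRAINT urls_url_unique_hash DO NOTHING;
```

Only DO NOTHING is permitted with an exclusion constraint; for upsert semantics, fall back to the MERGE pattern above.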

Quick Reference

Diagnostic Queries

Check index sizes:

\di+ table_*

Compare index to table size:

SELECT 
    relname AS name,
    pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class 
WHERE relname LIKE 'your_table%'
ORDER BY pg_relation_size(oid) DESC;

Check constraint_exclusion setting:

SHOW constraint_exclusion;

Decision Tree

Is the query scanning for values a constraint rules out?
├── Yes → Enable constraint_exclusion
└── No
    ↓
Is index nearly as large as table?
├── Yes, timestamp column → Function-based index on date
├── Yes, large text column → Hash exclusion constraint
└── No → Standard B-Tree is fine

Commands

| Command | Action |
| --- | --- |
| ANALYZE [table] | Analyze query performance |
| CHECK-CONSTRAINTS | Evaluate constraint exclusion opportunity |
| LOWER-CARDINALITY | Find function-based index opportunities |
| HASH-UNIQUE | Evaluate hash index for large values |
| COMPARE-INDEXES | Compare index sizes and performance |

Integration

| Direction | Skill | Relationship |
| --- | --- | --- |
| ← | debugging | Query debugging leads here |
| → | plan-then-execute | Systematic optimization |