postgres-semantic-search
28
总安装量
10
周安装量
#13252
全站排名
安装命令
npx skills add https://github.com/laguagu/claude-code-nextjs-skills --skill postgres-semantic-search
Agent 安装分布
claude-code
10
gemini-cli
5
opencode
5
antigravity
4
github-copilot
4
Skill 文档
PostgreSQL Semantic Search
Quick Start
1. Setup
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
content TEXT NOT NULL,
embedding vector(1536) -- text-embedding-3-small
-- Or: embedding halfvec(3072) -- text-embedding-3-large (50% memory)
);
2. Basic Semantic Search
SELECT id, content, 1 - (embedding <=> query_vec) AS similarity
FROM documents
ORDER BY embedding <=> query_vec
LIMIT 10;
3. Add Index (> 10k documents)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
Docker Quick Start
# pgvector with PostgreSQL 17
docker run -d --name pgvector-db \
-e POSTGRES_PASSWORD=postgres \
-p 5432:5432 \
pgvector/pgvector:pg17
# Or PostgreSQL 18 (latest)
docker run -d --name pgvector-db \
-e POSTGRES_PASSWORD=postgres \
-p 5432:5432 \
pgvector/pgvector:pg18
# ParadeDB (includes pgvector + pg_search + BM25)
docker run -d --name paradedb \
-e POSTGRES_PASSWORD=postgres \
-p 5432:5432 \
paradedb/paradedb:latest
Connect: psql postgresql://postgres:postgres@localhost:5432/postgres
Cheat Sheet
Distance Operators
embedding <=> query -- Cosine distance (1 - similarity)
embedding <-> query -- L2/Euclidean distance
embedding <#> query -- Negative inner product
Common Queries
-- Top 10 similar (cosine)
SELECT * FROM docs ORDER BY embedding <=> $1 LIMIT 10;
-- With similarity score
SELECT *, 1 - (embedding <=> $1) AS similarity FROM docs ORDER BY 2 DESC LIMIT 10;
-- With threshold
SELECT * FROM docs WHERE embedding <=> $1 < 0.3 ORDER BY 1 LIMIT 10;
-- Preload index (run on startup)
SELECT 1 FROM docs ORDER BY embedding <=> $1 LIMIT 1;
Index Quick Reference
-- HNSW (recommended)
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
-- With tuning
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 200);
-- Query-time recall
SET hnsw.ef_search = 100;
-- Iterative scan for filtered queries (pgvector 0.8+)
SET hnsw.iterative_scan = relaxed_order;
SET ivfflat.iterative_scan = on;
Decision Trees
Choose Search Method
Query type?
ââ Conceptual/meaning-based â Pure vector search
ââ Exact terms/names â Pure keyword search (FTS)
ââ Fuzzy/typo-tolerant â pg_trgm trigram similarity
ââ Autocomplete/prefix â pg_trgm + prefix index
ââ Substring (LIKE/ILIKE) â pg_trgm GIN index
ââ Mixed/unknown â Hybrid search
ââ Simple setup â FTS + RRF (no extra extensions)
ââ Better ranking â BM25 + RRF (pg_search extension)
ââ Full-featured â ParadeDB (Elasticsearch alternative)
Choose Index Type
Document count?
ââ < 10,000 â No index needed
ââ 10k - 1M â HNSW (best recall)
ââ > 1M â IVFFlat (less memory) or HNSW
Choose Vector Type
Embedding model?
ââ text-embedding-3-small (1536) â vector(1536)
ââ text-embedding-3-large (3072) â halfvec(3072) (50% memory savings)
ââ Other models â vector(dimensions)
Operators
| Operator | Distance | Use Case |
|---|---|---|
<=> |
Cosine | Text embeddings (default) |
<-> |
L2/Euclidean | Image embeddings |
<#> |
Inner product | Normalized vectors |
SQL Functions
Semantic Search
match_documents(query_vec, threshold, limit)– Basic searchmatch_documents_filtered(query_vec, metadata_filter, threshold, limit)– With JSONB filtermatch_chunks(query_vec, threshold, limit)– Search document chunks
Fuzzy Search (pg_trgm)
fuzzy_search_trigram(query_text, threshold, limit)– Trigram similarity searchautocomplete_search(prefix, limit)– Prefix + fuzzy autocompletehybrid_search_fuzzy_semantic(query_text, query_vec, limit, rrf_k)– Fuzzy + vector RRFweighted_fts_search(query_text, language, limit)– FTS with title/content weighting
Hybrid Search (FTS)
hybrid_search_fts(query_vec, query_text, limit, rrf_k, language)– FTS + RRFhybrid_search_weighted(query_vec, query_text, limit, sem_weight, kw_weight)– Linear combinationhybrid_search_fallback(query_vec, query_text, limit)– Graceful degradation
Hybrid Search (BM25)
hybrid_search_bm25(query_vec, query_text, limit, rrf_k)– BM25 + RRFhybrid_search_bm25_highlighted(...)– With snippet highlightinghybrid_search_chunks_bm25(...)– For RAG with chunks
Re-ranking (Optional)
Two-stage retrieval improves precision: fast recall â precise rerank.
When to Use
- Results need higher precision
- Using < 50 candidates after initial search
- Have budget for API calls (Cohere) or compute (local models)
Options
| Method | Latency | Quality | Cost |
|---|---|---|---|
| Cohere Rerank v4.0-fast | ~150ms | Excellent | $0.001/query |
| Cohere Rerank v4.0-pro | ~300ms | Best | $0.002/query |
| Zerank 2 | ~100ms | Best | API cost |
| Voyage Rerank 2.5 | ~100ms | Excellent | API cost |
| Cross-encoder (local) | ~500ms | Very Good | Compute |
TypeScript Example (Cohere)
import { CohereClient } from 'cohere-ai';
const cohere = new CohereClient({ token: process.env.COHERE_API_KEY });
async function rerankResults(query: string, documents: string[]) {
const response = await cohere.rerank({
model: 'rerank-v4.0-fast', // or 'rerank-v4.0-pro' for best quality
query,
documents,
topN: 10,
});
return response.results;
}
- reranking.md – Detailed guide
References
- fuzzy-search.md – pg_trgm, fuzzy matching, LIKE/ILIKE, autocomplete, advanced FTS
- paradedb.md – ParadeDB full-text search (Elasticsearch alternative)
- vector-types.md – vector vs halfvec, dimensions, storage
- indexing.md – HNSW, IVFFlat, GIN parameters
- hybrid-search.md – FTS, BM25, RRF algorithms
- performance.md – Cold-start, memory, HNSW vs IVFFlat
Scripts
- setup.sql – Extension and table setup
- semantic_search.sql – Semantic search functions
- hybrid_search_fts.sql – FTS hybrid functions
- hybrid_search_bm25.sql – BM25 hybrid functions
- fuzzy_search.sql – pg_trgm fuzzy search, autocomplete, weighted FTS
- indexes.sql – Index creation scripts
Common Patterns
TypeScript Integration (Supabase)
// Semantic search
const { data } = await supabase.rpc('match_documents', {
query_embedding: embedding,
match_threshold: 0.7,
match_count: 10
});
// Hybrid search
const { data } = await supabase.rpc('hybrid_search_fts', {
query_embedding: embedding,
query_text: userQuery,
match_count: 10,
rrf_k: 60,
fts_language: 'simple'
});
Drizzle ORM
import { sql } from 'drizzle-orm';
const results = await db.execute(sql`
SELECT * FROM match_documents(
${embedding}::vector(1536),
0.7,
10
)
`);
Troubleshooting
| Symptom | Cause | Solution |
|---|---|---|
| Index not used | < 10k rows or planner choice | Normal for small tables, check with EXPLAIN |
| Slow first query (30-60s) | HNSW cold-start | SELECT pg_prewarm('idx_name') or preload query |
| Poor recall | Low ef_search | SET hnsw.ef_search = 100 or higher |
| FTS returns nothing | Wrong language config | Use 'simple' for mixed/unknown languages |
| Memory error on index build | maintenance_work_mem too low | Increase to 2GB+ |
| Cosine similarity > 1 | Vectors not normalized | Normalize before insert or use L2 |
| Slow inserts | Index overhead | Batch inserts, consider IVFFlat |
| Fuzzy search slow | Missing trigram index | CREATE INDEX USING gin (col gin_trgm_ops) |
| ILIKE ‘%x%’ slow | No pg_trgm GIN index | Enable pg_trgm + create GIN trigram index |
% operator error |
pg_trgm not installed | CREATE EXTENSION IF NOT EXISTS pg_trgm |
Version Info (January 2026)
- PostgreSQL 18.1: Latest maintenance release with security fixes (Nov 2025)
- PostgreSQL 17.7: Stable LTS option
- pgvector 0.8.1: Iterative scans, PostgreSQL 18 support, halfvec up to 4000 dims
- pg_search 0.21.2: MVCC visibility, parallel aggregation, varchar[] indexing
- Cohere Rerank v4.0: 32K context, 100+ languages, self-learning (Dec 2025)
External Documentation
- pgvector GitHub – Official extension, latest features
- OpenAI Embeddings Guide – Embedding models and best practices
- Supabase Vector Guide – Supabase-specific integration
- ParadeDB pg_search – BM25 extension documentation
- PostgreSQL FTS – Built-in full-text search