grepai-storage-postgres
npx skills add https://github.com/yoanbernabeu/grepai-skills --skill grepai-storage-postgres
Agent 安装分布
Skill 文档
GrepAI Storage with PostgreSQL
This skill covers using PostgreSQL with the pgvector extension as the storage backend for GrepAI.
When to Use This Skill
- Team environments with shared index
- Large codebases (10K+ files)
- Need concurrent access
- Integration with existing PostgreSQL infrastructure
Prerequisites
- PostgreSQL 14+ with pgvector extension
- Database user with create table permissions
- Network access to PostgreSQL server
Advantages
| Benefit | Description |
|---|---|
| ð¥ Team sharing | Multiple users can access same index |
| ð Scalable | Handles large codebases |
| ð Concurrent | Multiple simultaneous searches |
| ð¾ Persistent | Data survives machine restarts |
| ð§ Familiar | Standard database tooling |
Setting Up PostgreSQL with pgvector
Option 1: Docker (Recommended for Development)
# Run PostgreSQL with pgvector
docker run -d \
--name grepai-postgres \
-e POSTGRES_USER=grepai \
-e POSTGRES_PASSWORD=grepai \
-e POSTGRES_DB=grepai \
-p 5432:5432 \
pgvector/pgvector:pg16
Option 2: Install on Existing PostgreSQL
# Install pgvector extension (Ubuntu/Debian)
sudo apt install postgresql-16-pgvector
# Or compile from source
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Then enable the extension:
-- Connect to your database
CREATE EXTENSION IF NOT EXISTS vector;
Option 3: Managed Services
- Supabase: pgvector included by default
- Neon: pgvector available
- AWS RDS: Install pgvector extension
- Azure Database: pgvector available
Configuration
Basic Configuration
# .grepai/config.yaml
store:
backend: postgres
postgres:
dsn: postgres://user:password@localhost:5432/grepai
With Environment Variable
store:
backend: postgres
postgres:
dsn: ${DATABASE_URL}
Set the environment variable:
export DATABASE_URL="postgres://user:password@localhost:5432/grepai"
Full DSN Options
store:
backend: postgres
postgres:
dsn: postgres://user:password@host:5432/database?sslmode=require
DSN components:
user: Database usernamepassword: Database passwordhost: Server hostname or IP5432: Port (default: 5432)database: Database namesslmode: SSL mode (disable, require, verify-full)
SSL Modes
| Mode | Description | Use Case |
|---|---|---|
disable |
No SSL | Local development |
require |
SSL required | Production |
verify-full |
SSL + verify certificate | High security |
# Production with SSL
store:
backend: postgres
postgres:
dsn: postgres://user:pass@prod.db.com:5432/grepai?sslmode=require
Database Schema
GrepAI automatically creates these tables:
-- Vector embeddings table
CREATE TABLE IF NOT EXISTS embeddings (
id SERIAL PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
start_line INTEGER,
end_line INTEGER,
embedding vector(768), -- Dimension matches your model
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(file_path, chunk_index)
);
-- Index for vector similarity search
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);
Verifying Setup
Check pgvector Extension
-- Connect to database
psql -U grepai -d grepai
-- Check extension is installed
SELECT * FROM pg_extension WHERE extname = 'vector';
-- Check GrepAI tables exist (after first grepai watch)
\dt
Test Connection from GrepAI
# Check status
grepai status
# Should show PostgreSQL backend info
Performance Tuning
PostgreSQL Configuration
For better vector search performance:
-- Increase work memory for vector operations
SET work_mem = '256MB';
-- Adjust for your hardware
SET effective_cache_size = '4GB';
SET shared_buffers = '1GB';
Index Tuning
For large indices, tune the IVFFlat index:
-- More lists = faster search, more memory
CREATE INDEX ON embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- Adjust based on row count
Rule of thumb: lists = sqrt(rows)
Concurrent Access
PostgreSQL handles concurrent access automatically:
- Multiple
grepai searchcommands work simultaneously - One
grepai watchdaemon per codebase - Many users can share the same index
Team Setup
Shared Database
All team members point to the same database:
# Each developer's .grepai/config.yaml
store:
backend: postgres
postgres:
dsn: postgres://team:secret@shared-db.company.com:5432/grepai
Per-Project Databases
For isolated projects, use separate databases:
# Create databases
createdb -U postgres grepai_projecta
createdb -U postgres grepai_projectb
# Project A config
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai_projecta
Backup and Restore
Backup
pg_dump -U grepai -d grepai > grepai_backup.sql
Restore
psql -U grepai -d grepai < grepai_backup.sql
Migrating from GOB
- Set up PostgreSQL with pgvector
- Update configuration:
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai
- Delete old index:
rm .grepai/index.gob
- Re-index:
grepai watch
Common Issues
â Problem: FATAL: password authentication failed
â
Solution: Check DSN credentials and pg_hba.conf
â Problem: ERROR: extension "vector" is not available
â
Solution: Install pgvector:
sudo apt install postgresql-16-pgvector
# Then: CREATE EXTENSION vector;
â Problem: ERROR: type "vector" does not exist
â
Solution: Enable extension in the database:
CREATE EXTENSION IF NOT EXISTS vector;
â Problem: Connection refused â Solution:
- Check PostgreSQL is running
- Verify host and port
- Check firewall rules
â Problem: Slow searches â Solution:
- Add IVFFlat index
- Increase
work_mem - Vacuum and analyze tables
Best Practices
- Use environment variables: Don’t commit credentials
- Enable SSL: For remote databases
- Regular backups: pg_dump before major changes
- Monitor performance: Check query times
- Index maintenance: Regular VACUUM ANALYZE
Output Format
PostgreSQL storage status:
â
PostgreSQL Storage Configured
Backend: PostgreSQL + pgvector
Host: localhost:5432
Database: grepai
SSL: disabled
Contents:
- Files: 2,450
- Chunks: 12,340
- Vector dimension: 768
Performance:
- Connection: OK
- IVFFlat index: Yes
- Search latency: ~50ms