grepai-embeddings-lmstudio
npx skills add https://github.com/yoanbernabeu/grepai-skills --skill grepai-embeddings-lmstudio
GrepAI Embeddings with LM Studio
This skill covers using LM Studio as the embedding provider for GrepAI, offering a user-friendly GUI for managing local models.
When to Use This Skill
- Want local embeddings with a graphical interface
- Already using LM Studio for other AI tasks
- Prefer visual model management over CLI
- Need to easily switch between models
What is LM Studio?
LM Studio is a desktop application for running local LLMs with:
- 🖥️ Graphical user interface
- 📦 Easy model downloading
- 🔌 OpenAI-compatible API
- 🔒 100% private, local processing
Prerequisites
- Download LM Studio from lmstudio.ai
- Install and launch the application
- Download an embedding model
Installation
Step 1: Download LM Studio
Visit lmstudio.ai and download for your platform:
- macOS (Intel or Apple Silicon)
- Windows
- Linux
Step 2: Launch and Download a Model
- Open LM Studio
- Go to the Search tab
- Search for an embedding model:
  - nomic-embed-text-v1.5
  - bge-small-en-v1.5
  - bge-large-en-v1.5
- Click Download
Step 3: Start the Local Server
- Go to the Local Server tab
- Select your embedding model
- Click Start Server
- Note the endpoint (default: http://localhost:1234)
Configuration
Basic Configuration
# .grepai/config.yaml
embedder:
  provider: lmstudio
  model: nomic-embed-text-v1.5
  endpoint: http://localhost:1234
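Because LM Studio exposes an OpenAI-compatible API, any HTTP client can request embeddings directly, which is handy for debugging the setup outside GrepAI. A minimal stdlib-only Python sketch (the function names are illustrative, and it assumes the server from the config above is running):

```python
import json
import urllib.request

def build_embedding_request(model: str, text: str) -> bytes:
    """Build the JSON body for the OpenAI-style /v1/embeddings endpoint."""
    return json.dumps({"model": model, "input": text}).encode("utf-8")

def get_embedding(text: str, model: str = "nomic-embed-text-v1.5",
                  endpoint: str = "http://localhost:1234") -> list[float]:
    """POST to the local LM Studio server and return the embedding vector."""
    req = urllib.request.Request(
        f"{endpoint}/v1/embeddings",
        data=build_embedding_request(model, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # OpenAI-style responses nest the vector under data[0].embedding
    return payload["data"][0]["embedding"]
```

If the call raises a connection error, GrepAI will hit the same problem: the server is not running or is on a different port.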
With Custom Port
embedder:
  provider: lmstudio
  model: nomic-embed-text-v1.5
  endpoint: http://localhost:8080
With Explicit Dimensions
embedder:
  provider: lmstudio
  model: nomic-embed-text-v1.5
  endpoint: http://localhost:1234
  dimensions: 768
Available Models
nomic-embed-text-v1.5 (Recommended)
| Property | Value |
|---|---|
| Dimensions | 768 |
| Size | ~260 MB |
| Quality | Excellent |
| Speed | Fast |
embedder:
  provider: lmstudio
  model: nomic-embed-text-v1.5
bge-small-en-v1.5
| Property | Value |
|---|---|
| Dimensions | 384 |
| Size | ~130 MB |
| Quality | Good |
| Speed | Very fast |
Best for: Smaller codebases, faster indexing.
embedder:
  provider: lmstudio
  model: bge-small-en-v1.5
  dimensions: 384
bge-large-en-v1.5
| Property | Value |
|---|---|
| Dimensions | 1024 |
| Size | ~1.3 GB |
| Quality | Very high |
| Speed | Slower |
Best for: Maximum accuracy.
embedder:
  provider: lmstudio
  model: bge-large-en-v1.5
  dimensions: 1024
Model Comparison
| Model | Dims | Size | Speed | Quality |
|---|---|---|---|---|
| bge-small-en-v1.5 | 384 | 130MB | ⚡⚡⚡ | ⭐⭐⭐ |
| nomic-embed-text-v1.5 | 768 | 260MB | ⚡⚡ | ⭐⭐⭐⭐ |
| bge-large-en-v1.5 | 1024 | 1.3GB | ⚡ | ⭐⭐⭐⭐⭐ |
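Dimension count also drives index size: stored as 32-bit floats, each vector costs dims × 4 bytes. A rough back-of-envelope sketch (the chunk count is illustrative, and a real GrepAI index adds metadata on top of the raw vectors):

```python
def index_size_bytes(num_chunks: int, dims: int, bytes_per_float: int = 4) -> int:
    """Approximate raw vector storage: one float32 per dimension per chunk."""
    return num_chunks * dims * bytes_per_float

# 10,000 code chunks at each model's dimension count
for dims in (384, 768, 1024):
    mb = index_size_bytes(10_000, dims) / (1024 * 1024)
    print(f"{dims} dims: ~{mb:.1f} MB")
# → 384 dims: ~14.6 MB / 768 dims: ~29.3 MB / 1024 dims: ~39.1 MB
```

So moving from bge-small to bge-large roughly 2.7×'s the vector storage for the same codebase.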
LM Studio Server Setup
Starting the Server
- Open LM Studio
- Navigate to Local Server tab (left sidebar)
- Select an embedding model from the dropdown
- Configure settings:
  - Port: 1234 (default)
  - Enable Embedding Endpoint
- Click Start Server
Server Status
Look for the green indicator showing the server is running.
Verifying the Server
# Check server is responding
curl http://localhost:1234/v1/models
# Test embedding
curl http://localhost:1234/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "nomic-embed-text-v1.5",
"input": "function authenticate(user)"
}'
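The response from the curl test above nests the vector under `data[0].embedding`; semantic search then ranks indexed chunks by cosine similarity between the query vector and each chunk's vector. A sketch of both steps, using toy 3-dimensional vectors in place of real 768-dimensional model output:

```python
import json
import math

def parse_embedding(response_body: str) -> list[float]:
    """Extract the vector from an OpenAI-style embeddings response."""
    return json.loads(response_body)["data"][0]["embedding"]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy response standing in for the real server output
sample = '{"data": [{"embedding": [0.1, 0.2, 0.3]}]}'
v = parse_embedding(sample)
print(cosine_similarity(v, v))  # 1.0 (within floating-point error)
```

This is why mixing vectors from different models in one index breaks search: vectors of different dimensions (or from different embedding spaces) have no meaningful cosine similarity, hence the re-index step whenever the model changes.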
LM Studio Settings
Recommended Settings
In LM Studio’s Local Server tab:
| Setting | Recommended Value |
|---|---|
| Port | 1234 |
| Enable CORS | Yes |
| Context Length | Auto |
| GPU Layers | Max (for speed) |
GPU Acceleration
LM Studio automatically uses:
- macOS: Metal (Apple Silicon)
- Windows/Linux: CUDA (NVIDIA)
Adjust GPU layers in settings for memory/speed balance.
Running LM Studio Headless
For server environments, LM Studio supports CLI mode:
# Start the server without the GUI via the `lms` CLI
# (flags vary by version — check the LM Studio docs for exact syntax)
lms server start --port 1234
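When scripting a headless start, it helps to block until the port actually accepts connections before kicking off indexing. A small stdlib sketch (the helper name is illustrative):

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # One short connection attempt; refused/unreachable raises OSError
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# e.g. wait_for_port("localhost", 1234, timeout=60) before running `grepai watch`
```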
Common Issues
❌ Problem: Connection refused
✅ Solution: Ensure the LM Studio server is running:
- Open LM Studio
- Go to Local Server tab
- Click Start Server
❌ Problem: Model not found
✅ Solution:
- Download the model in LM Studio’s Search tab
- Select it in the Local Server dropdown
❌ Problem: Slow embedding generation
✅ Solutions:
- Enable GPU acceleration in LM Studio settings
- Use a smaller model (bge-small-en-v1.5)
- Close other GPU-intensive applications
❌ Problem: Port already in use
✅ Solution: Change the port in LM Studio settings and update your config to match:
embedder:
  endpoint: http://localhost:8080  # Different port
❌ Problem: LM Studio closes and the server stops
✅ Solution: Keep LM Studio running in the background, or consider using Ollama, which runs as a system service.
LM Studio vs Ollama
| Feature | LM Studio | Ollama |
|---|---|---|
| GUI | ✅ Yes | ❌ CLI only |
| System service | ❌ App must run | ✅ Background service |
| Model management | ✅ Visual | ❌ CLI |
| Ease of use | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Server reliability | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
Recommendation: Use LM Studio if you prefer a GUI, Ollama for always-on background service.
Migrating from LM Studio to Ollama
If you need a more reliable background service:
- Install Ollama:
brew install ollama  # macOS; see ollama.com for Windows/Linux installers
ollama serve &
ollama pull nomic-embed-text
- Update config:
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
- Re-index:
rm .grepai/index.gob
grepai watch
Best Practices
- Keep LM Studio running: Server stops when app closes
- Use recommended model: nomic-embed-text-v1.5 for the best balance
- Enable GPU: Faster embeddings with hardware acceleration
- Check server before indexing: Ensure green status indicator
- Consider Ollama for production: More reliable as background service
Output Format
Successful LM Studio configuration:
✅ LM Studio Embedding Provider Configured
Provider: LM Studio
Model: nomic-embed-text-v1.5
Endpoint: http://localhost:1234
Dimensions: 768 (auto-detected)
Status: Connected
Note: Keep LM Studio running for embeddings to work.