infra-manage-ssh-services
npx skills add https://github.com/dawiddutoit/custom-claude --skill infra-manage-ssh-services
Works with SSH commands, Docker remote management, and infrastructure health checks.
Infrastructure SSH Service Management
Quick Start
Discover available infrastructure:
# List all hosts and their status
ping -c 1 -W 1 infra.local && echo "✅ infra.local (primary)" || echo "❌ infra.local"
ping -c 1 -W 1 192.168.68.135 && echo "✅ deus (development)" || echo "❌ deus"
ping -c 1 -W 1 homeassistant.local && echo "✅ homeassistant.local" || echo "❌ homeassistant.local"
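The three ping lines can be folded into one loop so that adding a host is a one-line change. This is a minimal sketch: `check_hosts` is a hypothetical helper name, and the host list and labels are taken from the commands above.

```shell
# Sketch: probe each known host once and print one status line per host.
# check_hosts is a hypothetical helper; it is not part of the skill itself.
check_hosts() (
    for entry in "infra.local|primary" \
                 "192.168.68.135|deus, development" \
                 "homeassistant.local|Home Assistant"; do
        host=${entry%%|*}    # text before the separator
        label=${entry#*|}    # text after the separator
        if ping -c 1 -W 1 "$host" >/dev/null 2>&1; then
            echo "✅ $host ($label)"
        else
            echo "❌ $host ($label)"
        fi
    done
)
```

Call `check_hosts` with no arguments. The function body runs in a subshell, so its loop variables do not leak into the calling shell.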
Check primary infrastructure services:
# View all running Docker services on infra.local
ssh infra "docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'"
# Quick MongoDB health check (MongoDB 4.4 uses 'mongo' not 'mongosh')
ssh infra "docker exec local-infra-mongodb-1 mongo off --quiet --eval 'db.runCommand({ping: 1})'" 2>/dev/null
Before using remote MongoDB (NomNom project):
# Verify MongoDB is accessible
nc -z infra.local 27017 && echo "✅ MongoDB port open" || echo "❌ MongoDB unreachable"
Connection Reference
To connect to infra.local, you have three equivalent options:
# Option 1: Use the connect function (recommended)
connect infra
# Option 2: Use the SSH alias from ~/.ssh/config
ssh infra
# Option 3: Use the full hostname
ssh dawiddutoit@infra.local
All three commands do the same thing:
- Connect to infra.local
- Authenticate as user dawiddutoit
- Use SSH key ~/.ssh/id_ed25519
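The `~/.ssh/config` entry behind the `infra` alias is not shown here. Given the three facts above, it presumably resembles what the following helper prints. `ssh_config_stanza` is a hypothetical name, and the exact contents of the real config file are an assumption.

```shell
# Hypothetical helper: print an ~/.ssh/config stanza for one host alias.
# It only prints; review the output before appending it to ~/.ssh/config.
ssh_config_stanza() {
    # $1 = alias, $2 = hostname, $3 = user, $4 = identity file
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    IdentityFile %s\n' "$4"
}
```

For example, `ssh_config_stanza infra infra.local dawiddutoit '~/.ssh/id_ed25519'` prints a four-line stanza that would make `ssh infra` equivalent to `ssh dawiddutoit@infra.local`.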
First-time setup (if SSH key not yet copied):
connect infra --setup
# This copies your SSH public key to infra.local for passwordless authentication
For other hosts:
connect deus # or: ssh deus # or: ssh dawiddutoit@192.168.68.135
connect ha # or: ssh ha # or: ssh root@homeassistant.local
connect motor # or: ssh motor # or: ssh dawiddutoit@pi4-motor.local
connect armitage # or: ssh unit@armitage.local
Running commands on infra.local (without interactive shell):
# Execute single command
ssh infra "docker ps"
# Execute multiple commands
ssh infra "cd ~/projects/local-infra && docker compose ps"
# Chain commands
ssh infra "docker ps -f name=mongodb && docker logs --tail 10 local-infra-mongodb-1"
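When these commands run from a script rather than a terminal, it may help to make SSH fail fast instead of waiting on a password prompt or a dead host. The wrapper below is a sketch: `remote_run` is a hypothetical name and the five-second timeout is an arbitrary choice. `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
# Hypothetical wrapper: run one command on a host without ever prompting.
# BatchMode=yes disables password prompts; ConnectTimeout bounds the wait in seconds.
remote_run() (
    host=$1
    shift
    # Remaining arguments are joined into a single remote command line
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" "$*"
)
```

Usage mirrors the plain form: `remote_run infra docker ps`. Commands that contain `&&` or pipes should still be passed as one quoted string so the local shell does not interpret them.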
Table of Contents
- When to Use This Skill
- What This Skill Does
- Instructions
- Supporting Files
- Common Workflows
- Expected Outcomes
- Integration Points
- Expected Benefits
- Requirements
- Red Flags to Avoid
When to Use This Skill
Explicit Triggers (User Requests)
- “Check infrastructure status”
- “Connect to infra/deus/ha”
- “View Docker services on infra”
- “Test MongoDB connectivity”
- “What services are running on infra.local?”
- “Troubleshoot remote MongoDB connection”
- “Check Langfuse status”
- “View OTLP collector logs”
Implicit Triggers (Contextual Needs)
- Before using remote MongoDB in NomNom project
- When remote service connection fails (MongoDB, Neo4j, Langfuse)
- Before starting development session that uses remote resources
- When planning to use OpenTelemetry/Langfuse observability
- When investigating service availability for integration work
Debugging/Troubleshooting Triggers
- Connection refused errors to infra.local services
- MongoDB ServerSelectionTimeoutError
- SSH authentication failures
- Docker container not responding
- Service appears running but not accessible
- Neo4j or Infinity in restart loop
What This Skill Does
This skill provides systematic workflows for:
- Service Discovery – Identify available hosts (5 total) and running services (16+ on infra.local)
- Connectivity Testing – Verify network reachability, port availability, SSH access
- Docker Management – View, restart, and monitor remote Docker containers
- Health Verification – Check service health status and logs
- Troubleshooting – Diagnose connection issues and service failures
- Infrastructure Integration – Ensure remote resources (MongoDB, Langfuse, OTLP) are ready for use
Instructions
3.1 Discovery Phase
Step 1: Identify Target Host
Use the connect function to determine which host you need:
# View available hosts
connect
# Output: Hosts: infra, armitage, deus, ha, motor
Infrastructure Inventory:
| Host | Connection | Status | Primary Services |
|---|---|---|---|
| infra.local | connect infra | ✅ Online | MongoDB, Langfuse, OTLP, Jaeger, Neo4j, Infinity, PostgreSQL, Redis, MinIO, Mosquitto, Caddy |
| deus | connect deus | ✅ Online | None detected (development machine) |
| homeassistant.local | connect ha | ✅ Online | Home Assistant (port 8123) |
| pi4-motor.local | connect motor | ❌ Offline | Motor control (Raspberry Pi 4) |
| armitage.local | connect armitage | ❌ Offline | Neo4j, Infinity Embeddings (WSL2 PC) |
Step 2: Test Host Reachability
# Quick network ping test
ping -c 1 -W 1 infra.local
# Test specific port availability
nc -z infra.local 27017 # MongoDB
nc -z infra.local 3000 # Langfuse
nc -z infra.local 4317 # OTLP Collector
nc -z infra.local 7687 # Neo4j (if not in restart loop)
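The individual port probes above can be run in one pass. This is a sketch: `check_ports` is a hypothetical helper name, the service-to-port pairs are the ones listed above, and the two-second `-w` timeout is an assumption chosen to keep a dead host from stalling the loop.

```shell
# Sketch: check the documented infra.local ports in one pass.
# check_ports is a hypothetical name; -w 2 caps each probe at two seconds.
check_ports() (
    host=${1:-infra.local}
    for entry in "MongoDB:27017" "Langfuse:3000" "OTLP Collector:4317" "Neo4j:7687"; do
        name=${entry%:*}     # service name before the last colon
        port=${entry##*:}    # port after the last colon
        if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
            echo "open    $port  $name"
        else
            echo "closed  $port  $name"
        fi
    done
)
```

Run `check_ports` for infra.local, or pass another hostname as the first argument.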
Step 3: Discover Running Services
# View all Docker containers on infra.local
ssh infra "docker ps --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'"
# Count running services
ssh infra "docker ps --format '{{.Names}}' | wc -l"
# Check specific service
ssh infra "docker ps -f name=mongodb"
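Because some containers on infra.local are known to sit in a restart loop, a filter over the `docker ps` output can surface the ones needing attention. The sketch below assumes the status strings Docker normally prints (`Restarting (...)`, `Exited (...)`, `(unhealthy)`). `flag_problem_containers` is a hypothetical name, and the `|` separator is chosen here only to make parsing simple.

```shell
# Hypothetical filter: read "name|status" lines on stdin and print the ones
# that look unhealthy. Healthy, running containers produce no output.
flag_problem_containers() {
    awk -F '|' '$2 ~ /^Restarting/ || $2 ~ /^Exited/ || $2 ~ /\(unhealthy\)/ { print $1 ": " $2 }'
}

# Usage against the real host (-a includes stopped containers):
# ssh infra "docker ps -a --format '{{.Names}}|{{.Status}}'" | flag_problem_containers
```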
3.2 Health Check Phase
Step 1: Verify SSH Connectivity
# Test basic SSH connection
ssh infra "echo 'Connection OK'"
# If SSH fails, check SSH agent
ssh-add -l
# Copy SSH key if needed (first-time setup)
connect infra --setup
Step 2: Check Service Health
# MongoDB health check
ssh infra "docker inspect --format='{{.State.Health.Status}}' local-infra-mongodb-1"
ssh infra "docker exec local-infra-mongodb-1 mongo off --quiet --eval 'db.runCommand({ping: 1})'"
# Langfuse health check (HTTP)
curl -s -o /dev/null -w "%{http_code}" http://infra.local:3000
# OTLP Collector health check
ssh infra "docker inspect --format='{{.State.Status}}' local-infra-otel-collector-1"
# View container logs for errors
ssh infra "docker logs --tail 50 local-infra-mongodb-1"
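The `{{.State.Health.Status}}` template typically fails with a template error when a container defines no healthcheck, which is likely why the OTLP Collector check above uses `.State.Status` instead. One way to cover both cases with a single command is a conditional template. This is a sketch, and `container_health` is a hypothetical helper name.

```shell
# Hypothetical helper: report the health status if the container defines a
# healthcheck, otherwise fall back to the plain container state.
container_health() (
    host=$1
    container=$2
    ssh "$host" "docker inspect --format '{{if .State.Health}}{{.State.Health.Status}}{{else}}{{.State.Status}} (no healthcheck){{end}}' $container"
)
```

For example, `container_health infra local-infra-mongodb-1` would print something like `healthy`, while a container without a healthcheck would print `running (no healthcheck)`.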
Step 3: Verify Application-Level Connectivity
For MongoDB (NomNom project):
# Test from application environment
cd ~/projects/play/nomnom
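# Motor returns a Future, so drive it with run_until_complete on an explicit loop
python -c "import asyncio; from motor.motor_asyncio import AsyncIOMotorClient; loop = asyncio.new_event_loop(); loop.run_until_complete(AsyncIOMotorClient('mongodb://infra.local:27017', io_loop=loop).admin.command('ping'))" && echo "✅ MongoDB reachable"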
For Langfuse:
# Check web UI accessibility
curl -sI http://infra.local:3000 | grep "HTTP"
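The HTTP checks above print a raw status code or header line. A small wrapper can turn the code into a verdict. This is a sketch: `http_status` is a hypothetical name, and the five-second `--max-time` is an assumption. curl reports `000` when no HTTP response was received.

```shell
# Hypothetical helper: classify an HTTP endpoint by its status code.
http_status() (
    url=$1
    code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")
    case "$code" in
        2??|3??) echo "up ($code)" ;;
        ""|000)  echo "unreachable" ;;          # no HTTP response at all
        *)       echo "responding with error ($code)" ;;
    esac
)
```

For example, `http_status http://infra.local:3000` checks the Langfuse web UI.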
3.3 Execution Phase
Service Management Commands:
# Restart single service
ssh infra "cd ~/projects/local-infra && docker compose restart mongodb"
# Restart all services
ssh infra "cd ~/projects/local-infra && docker compose restart"
# Stop service
ssh infra "cd ~/projects/local-infra && docker compose stop mongodb"
# Start service
ssh infra "cd ~/projects/local-infra && docker compose up -d mongodb"
# View Docker Compose configuration
ssh infra "cd ~/projects/local-infra && docker compose config"
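Since logs should be reviewed before a service is restarted, the two steps can be paired in one helper so the evidence is captured first. This is a sketch: `restart_with_logs` is a hypothetical name, and the 50-line tail is an arbitrary choice.

```shell
# Hypothetical helper: show recent logs for a Compose service, then restart it.
restart_with_logs() (
    service=$1
    compose_dir='~/projects/local-infra'   # expanded by the remote shell
    # Capture recent logs first so evidence of the failure survives the restart
    ssh infra "cd $compose_dir && docker compose logs --tail 50 $service"
    ssh infra "cd $compose_dir && docker compose restart $service"
)
```

For example, `restart_with_logs mongodb` prints the last 50 log lines for the `mongodb` service and then restarts it.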
Monitoring Commands:
# Follow logs in real-time
ssh infra "docker logs -f local-infra-mongodb-1"
# View last 100 lines
ssh infra "docker logs --tail 100 local-infra-langfuse-web-1"
# View logs for all services
ssh infra "cd ~/projects/local-infra && docker compose logs -f"
# Check resource usage
ssh infra "docker stats --no-stream"
File Synchronization:
# Push file to infra.local
syncpi push ~/path/to/file
# Pull file from infra.local
syncpi pull ~/path/to/file
# Sync zsh configuration
syncpi zsh push
syncpi zsh pull
Supporting Files
references/infrastructure_guide.md
Complete infrastructure documentation – Read this for:
- Detailed service inventory with ports and URLs
- Environment variable mappings
- Docker Compose management on infra.local
- Troubleshooting guides for specific services
- Security notes and credential locations
When to read: Before performing any infrastructure operations, when troubleshooting connection issues, or when needing detailed service information.
Location: /Users/dawiddutoit/.claude/artifacts/2026-01-01/infrastructure/SSH_INFRASTRUCTURE_GUIDE.md
scripts/health_check.sh
Quick health check script – Automated connectivity and service status checks.
Usage: see Utility Scripts below for invocation examples.
See references/detailed-workflows.md for:
- 7 comprehensive workflows (NomNom setup, connection debugging, service discovery, restart loop diagnosis, SSH setup, OTLP verification, file syncing)
- Expected outcomes (successful/failed health checks, restart loop diagnosis)
- Integration examples (NomNom, observability, Home Assistant, quality gates)
- Troubleshooting guide (connection refused, permission denied, restart loops, slow SSH)
- Advanced techniques (complex commands, real-time monitoring, batch health checks)
Environment variables:
MONGODB_URL=mongodb://infra.local:27017
MONGODB_DATABASE=off
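To test the endpoint a project is actually configured with, rather than a hardcoded one, the host and port can be derived from `MONGODB_URL`. The sketch below handles only the simple `mongodb://host:port` form shown above. It assumes no credentials, replica-set host lists, or query strings in the URL. `mongo_host_port` is a hypothetical name.

```shell
# Hypothetical helper: extract "host port" from a simple mongodb://host[:port] URL.
# Assumes no credentials, replica-set lists, or query string in the URL.
mongo_host_port() (
    url=${1:-$MONGODB_URL}
    rest=${url#mongodb://}     # strip the scheme
    hostport=${rest%%/*}       # drop any /database suffix
    host=${hostport%%:*}
    port=${hostport##*:}
    # No colon means no explicit port: fall back to MongoDB's default
    [ "$port" = "$hostport" ] && port=27017
    echo "$host $port"
)

# Usage: feed the result to the port check
# set -- $(mongo_host_port) && nc -z "$1" "$2"
```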
With Observability Skills
Before using observability skills:
# Verify OTLP Collector is running
ssh infra "docker ps -f name=otel-collector -q" | grep -q . || echo "⚠️ OTLP Collector offline"
# Then use skills:
# - observability-analyze-logs
# - observability-analyze-session-logs
With Home Assistant Skills
Before using HA skills:
# Verify Home Assistant is accessible
curl -s -H "Authorization: Bearer $HA_LONG_LIVED_TOKEN" http://192.168.68.123:8123/api/ | grep -q "message" && echo "✅ HA API accessible"
# Then use skills:
# - ha-dashboard-create
# - ha-custom-cards
# - ha-mushroom-cards
With Quality Gates
Infrastructure verification as quality gate:
# Add to pre-start checks
if ! nc -z infra.local 27017; then
echo "❌ QUALITY GATE FAILED: MongoDB unreachable"
echo "Run: ssh infra 'cd ~/projects/local-infra && docker compose restart mongodb'"
exit 1
fi
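A single probe can fail spuriously while a service is still starting, for example right after a restart. A gate that retries a few times before giving up may be more forgiving. This is a sketch: `wait_for_port` is a hypothetical name, and the default of five attempts two seconds apart is an arbitrary choice.

```shell
# Hypothetical helper: retry a port probe before declaring the gate failed.
# Arguments: host, port, attempts (default 5), delay in seconds (default 2).
wait_for_port() (
    host=$1; port=$2; attempts=${3:-5}; delay=${4:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Succeed as soon as one probe connects
        nc -z -w 2 "$host" "$port" >/dev/null 2>&1 && exit 0
        # Sleep between attempts, but not after the last one
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    exit 1
)

# Usage as a gate:
# wait_for_port infra.local 27017 || { echo "MongoDB unreachable"; exit 1; }
```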
Expected Benefits
| Metric | Before Skill | After Skill | Improvement |
|---|---|---|---|
| Discovery Time | 5-10 min (manual SSH, guessing) | 30 sec (automated checks) | 10-20x faster |
| Troubleshooting Time | 10-30 min (trial and error) | 2-5 min (systematic workflow) | 5-6x faster |
| Connection Failures | 30-40% (no verification) | <5% (proactive health checks) | 6-8x reduction |
| Service Availability Awareness | Unknown until failure | Real-time status | Proactive visibility |
| Documentation Access | Search files, guess locations | Single skill reference | Immediate context |
Success Metrics
- Discovery Success Rate – Can identify all online hosts and services in <30 seconds
- Health Check Coverage – Verify critical services (MongoDB, Langfuse, OTLP) before use
- Troubleshooting Efficiency – Resolve 80% of connection issues within 5 minutes
- Proactive Usage – Check infrastructure before remote operations (NomNom, observability)
- Zero Surprise Failures – No “connection refused” errors due to unchecked infrastructure
Requirements
Tools
- Bash (for SSH commands and connectivity tests)
- Read (for comprehensive infrastructure guide)
Environment
- SSH access to remote hosts (via ~/.ssh/config)
- SSH keys configured (use connect <host> --setup if needed)
- Network connectivity to infra.local (primary), deus, homeassistant.local
- connect function in ~/.zshrc (lines 290-306)
- Optional: syncpi function for file synchronization
Knowledge
- Basic SSH command syntax
- Understanding of Docker and Docker Compose
- Familiarity with port-based service discovery (nc, curl)
- Environment variables for service endpoints
Utility Scripts
scripts/health_check.sh
Purpose: Run comprehensive health checks across all infrastructure hosts
Usage:
# Check all hosts
bash /Users/dawiddutoit/.claude/skills/infra-manage-ssh-services/scripts/health_check.sh
# Check specific host
bash /Users/dawiddutoit/.claude/skills/infra-manage-ssh-services/scripts/health_check.sh infra
# Verbose output
bash /Users/dawiddutoit/.claude/skills/infra-manage-ssh-services/scripts/health_check.sh --verbose
Checks performed:
- Network reachability (ping)
- SSH connectivity
- Docker daemon status
- Container health for critical services
- Port availability for key services
- Service-specific health endpoints
Red Flags to Avoid
- Assuming local MongoDB – MongoDB runs on infra.local, NOT localhost
- Skipping connectivity checks – Always verify before using remote services
- Ignoring offline hosts – armitage.local and pi4-motor.local are offline (environment variables may point to them)
- Missing SSH key setup – Run connect <host> --setup on first use
- Not checking container health – Container “Up” ≠ healthy (use docker inspect for health)
- Hardcoding IPs – Use hostnames (infra.local, homeassistant.local) for mDNS resolution
- Ignoring restart loops – Neo4j and Infinity are restarting on infra.local (check logs)
- Skipping logs when debugging – Always view logs before restarting services
- Not testing ports – Use nc -z to verify port availability before connection attempts
- Missing Docker Compose context – Always cd ~/projects/local-infra before Docker Compose commands
Notes
Key Infrastructure Facts:
- Primary Host: infra.local (16+ services, always online)
- MongoDB: 632K OpenFoodFacts products already imported
- Telemetry: All Claude Code sessions automatically send OTLP to infra.local:4317
- Offline Services: Neo4j and Infinity Embeddings in restart loop on infra.local
- Alternative Endpoints: armitage.local has Neo4j/Infinity but is currently offline
- Home Assistant: Separate host with 16 related skills in ~/.claude/skills/
Environment Variable Locations:
- SSH config: ~/.ssh/config
- Secrets: ~/.zshrc (lines 366-540)
- Project .env: ~/projects/play/nomnom/.env (MongoDB URL)
Related Documentation:
- Complete infrastructure guide: /Users/dawiddutoit/.claude/artifacts/2026-01-01/infrastructure/SSH_INFRASTRUCTURE_GUIDE.md
- NomNom MongoDB setup: /Users/dawiddutoit/projects/play/nomnom/CLAUDE.md (lines 224-235)
- Observability skills: ~/.claude/CLAUDE.md (search “observability-*”)
- Home Assistant skills: ~/.claude/CLAUDE.md (search “ha-*”)