arcanea-performance-tuning

📁 frankxai/arcanea 📅 Jan 27, 2026
Total installs: 2
Weekly installs: 2
Site-wide rank: #63746
Install command:
npx skills add https://github.com/frankxai/arcanea --skill arcanea-performance-tuning

Agent install distribution

codex 2
kilo 2
mcpjam 1
zencoder 1
crush 1
cline 1

Skill documentation

The Performance Tuning Codex

“Premature optimization is the root of all evil. But mature optimization is the root of all delight.”


The Performance Philosophy

The Golden Rules

RULE 1: MEASURE FIRST
Don't guess where the bottleneck is.
Profile. Measure. Prove.

RULE 2: OPTIMIZE THE RIGHT THING
80% of time is spent in 20% of code.
Find that 20%.

RULE 3: SET TARGETS
"Faster" is not a goal.
"Under 200ms" is a goal.

RULE 4: REGRESSION PREVENTION
Performance is easy to lose.
Benchmark continuously.

The Optimization Hierarchy

╔═══════════════════════════════════════════════════════════════════╗
║                    OPTIMIZATION HIERARCHY                          ║
║              (Optimize in this order)                              ║
╠═══════════════════════════════════════════════════════════════════╣
║                                                                    ║
║   1. ALGORITHM         │ O(n²) → O(n log n) = massive wins       ║
║   2. DATA STRUCTURE    │ Right structure for access pattern       ║
║   3. I/O               │ Network, disk, database calls            ║
║   4. MEMORY            │ Allocation, garbage collection           ║
║   5. CPU               │ Hot loops, cache efficiency              ║
║                                                                    ║
║   (Don't optimize #5 if #1-4 are the problem)                     ║
║                                                                    ║
╚═══════════════════════════════════════════════════════════════════╝

Profiling

Types of Profiling

CPU PROFILING:
• What functions take the most time?
• Where are the hot paths?
• What's the call graph?

MEMORY PROFILING:
• Where is memory allocated?
• What's causing garbage collection?
• Are there memory leaks?

I/O PROFILING:
• What queries are slow?
• What network calls are made?
• What files are accessed?

TRACE PROFILING:
• What's the full request lifecycle?
• Where do requests spend time?
• What's the concurrency pattern?

The Profiling Process

┌─────────────────────────────────────────────────────────────────┐
│                    THE PROFILING CYCLE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   1. ESTABLISH BASELINE                                          │
│      Measure current performance                                 │
│      Record metrics: latency, throughput, resource usage         │
│                                                                  │
│   2. SET TARGET                                                   │
│      Define acceptable performance                               │
│      "P95 latency < 200ms"                                       │
│                                                                  │
│   3. PROFILE                                                      │
│      Identify bottlenecks                                        │
│      Focus on top 3 issues                                       │
│                                                                  │
│   4. HYPOTHESIZE                                                  │
│      Why is this slow?                                           │
│      What would make it faster?                                  │
│                                                                  │
│   5. OPTIMIZE                                                     │
│      Make ONE change                                             │
│      Keep it isolated                                            │
│                                                                  │
│   6. MEASURE                                                      │
│      Did it help?                                                │
│      Did it hurt anything else?                                  │
│                                                                  │
│   7. REPEAT                                                       │
│      Until target reached                                        │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
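Step 3 of the cycle can be sketched with a language profiler; a minimal Python example using the stdlib `cProfile`, where `slow_sum` is a made-up stand-in for a hot path:

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately quadratic: the profiler should flag this as the hot path
    total = 0
    for i in range(n):
        for j in range(i):
            total += j
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(500)
profiler.disable()

# Sort by cumulative time so the biggest offenders appear first
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The printed table is the evidence for the "hypothesize" step: optimize the functions at the top, not the ones you suspect.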

Common Performance Patterns

The N+1 Query Problem

BAD: N+1 queries
┌──────────────────────────────────────────────────────────────┐
│ // Get all users (1 query)                                   │
│ users = db.query("SELECT * FROM users")                      │
│                                                              │
│ // For each user, get their orders (N queries)               │
│ for user in users:                                           │
│     orders = db.query(                                       │
│         "SELECT * FROM orders WHERE user_id = ?", user.id)   │
└──────────────────────────────────────────────────────────────┘

GOOD: Eager loading
┌──────────────────────────────────────────────────────────────┐
│ // Single query with JOIN                                    │
│ SELECT users.*, orders.*                                     │
│ FROM users                                                   │
│ LEFT JOIN orders ON orders.user_id = users.id                │
│                                                              │
│ // Or batch loading                                          │
│ SELECT * FROM orders WHERE user_id IN (1, 2, 3, 4, 5)       │
└──────────────────────────────────────────────────────────────┘
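The batch-loading variant can be sketched in Python with the stdlib `sqlite3` module (the schema and data here are invented for illustration):

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (1, 1, 9.99), (2, 1, 4.50), (3, 2, 12.00);
""")

user_ids = [row[0] for row in conn.execute("SELECT id FROM users")]

# One batched query instead of one query per user
placeholders = ",".join("?" * len(user_ids))
rows = conn.execute(
    f"SELECT user_id, id, total FROM orders WHERE user_id IN ({placeholders})",
    user_ids,
)

# Group the flat result set by user in application code
orders_by_user = defaultdict(list)
for user_id, order_id, total in rows:
    orders_by_user[user_id].append((order_id, total))
```

Two queries total, regardless of how many users there are.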

Caching Strategies

╔═══════════════════════════════════════════════════════════════════╗
║                    CACHING STRATEGIES                              ║
╠═══════════════════════════════════════════════════════════════════╣
║                                                                    ║
║   CACHE-ASIDE (Lazy Loading)                                       ║
║   ┌─────────┐                                                      ║
║   │ Request │──┬──▶ Cache Hit ──▶ Return                          ║
║   └─────────┘  │                                                   ║
║                └──▶ Cache Miss ──▶ DB ──▶ Cache ──▶ Return        ║
║                                                                    ║
║   WRITE-THROUGH                                                    ║
║   ┌─────────┐                                                      ║
║   │  Write  │──▶ Cache ──▶ DB ──▶ Confirm                         ║
║   └─────────┘                                                      ║
║                                                                    ║
║   WRITE-BEHIND (Async)                                             ║
║   ┌─────────┐                                                      ║
║   │  Write  │──▶ Cache ──▶ Confirm                                ║
║   └─────────┘      │                                               ║
║                    └──▶ [Later] ──▶ DB                             ║
║                                                                    ║
╚═══════════════════════════════════════════════════════════════════╝

CACHE INVALIDATION:
• TTL (Time To Live) - Simple but may serve stale data
• Event-based - Invalidate on writes
• Tag-based - Group related items
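A minimal cache-aside sketch combining lazy loading with TTL invalidation; `load_from_db` is a hypothetical stand-in for the real backend call:

```python
import time

CACHE = {}           # key -> (value, expires_at)
TTL_SECONDS = 60.0

def load_from_db(key):
    # Hypothetical slow backend call
    return f"value-for-{key}"

def get(key):
    entry = CACHE.get(key)
    now = time.monotonic()
    if entry is not None and entry[1] > now:
        return entry[0]                       # cache hit, still fresh
    value = load_from_db(key)                 # cache miss: go to the DB
    CACHE[key] = (value, now + TTL_SECONDS)   # populate with a TTL
    return value
```

The first `get("user:1")` hits the DB; repeat calls within the next 60 seconds are served from `CACHE` — simple, at the cost of up to 60 seconds of staleness.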

Connection Pooling

WITHOUT POOLING:
┌──────────┐     ┌──────────┐
│ Request  │──▶──│ Connect  │──▶ 50-100ms overhead
└──────────┘     └──────────┘

WITH POOLING:
┌──────────┐     ┌──────────────┐     ┌──────────┐
│ Request  │──▶──│ Pool Manager │──▶──│ Reuse    │──▶ ~0ms
└──────────┘     └──────────────┘     └──────────┘

POOL CONFIGURATION:
• Min connections: Keep warm for base load
• Max connections: Limit to prevent exhaustion
• Idle timeout: Release unused connections
• Connection lifetime: Prevent stale connections
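The reuse idea can be sketched with a bounded queue; `make_conn` below is a stand-in for a real connection factory:

```python
import queue

class ConnectionPool:
    def __init__(self, factory, max_size=5):
        self._pool = queue.Queue(maxsize=max_size)
        # Pre-warm: open all connections up front so requests
        # never pay the connect cost
        for _ in range(max_size):
            self._pool.put(factory())

    def acquire(self, timeout=1.0):
        # Blocks until a connection is free, which also caps total connections
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

# Stand-in factory; a real pool would open DB/network connections here
created = {"count": 0}
def make_conn():
    created["count"] += 1
    return object()

pool = ConnectionPool(make_conn, max_size=1)
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()   # same connection reused: no new connect cost
```

Blocking in `acquire` when the pool is empty is what enforces the "max connections" limit from the configuration list above.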

Lazy Loading

EAGER (Load everything):
┌────────────────────────────────────────────────────────┐
│ class User:                                            │
│     def __init__(self, id):                            │
│         self.profile = load_profile(id)   # Always     │
│         self.orders = load_orders(id)     # Always     │
│         self.preferences = load_prefs(id) # Always     │
└────────────────────────────────────────────────────────┘

LAZY (Load on demand):
┌────────────────────────────────────────────────────────┐
│ class User:                                            │
│     def __init__(self, id):                            │
│         self._id = id                                  │
│         self._orders = None                            │
│                                                        │
│     @property                                          │
│     def orders(self):                                  │
│         if self._orders is None:                       │
│             self._orders = load_orders(self._id)       │
│         return self._orders                            │
└────────────────────────────────────────────────────────┘
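In Python specifically, the stdlib `functools.cached_property` gives the same load-on-first-access behavior without the hand-rolled `None` check:

```python
from functools import cached_property

CALLS = {"load_orders": 0}

def load_orders(user_id):
    CALLS["load_orders"] += 1          # track how often the "DB" is hit
    return [f"order-{user_id}-1"]

class User:
    def __init__(self, user_id):
        self._id = user_id

    @cached_property
    def orders(self):
        # Computed on first access, then cached on the instance
        return load_orders(self._id)

u = User(42)
u.orders   # triggers the load
u.orders   # served from the instance cache
```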

Database Optimization

Index Optimization

WHEN TO INDEX:
✓ Columns in WHERE clauses
✓ Columns in JOIN conditions
✓ Columns in ORDER BY
✓ Columns with high selectivity

WHEN NOT TO INDEX:
✗ Small tables (full scan is faster)
✗ Columns with low selectivity (gender, boolean)
✗ Tables with heavy writes (index maintenance cost)
✗ Columns rarely queried

COMPOSITE INDEX ORDER:
• Equality conditions first
• Range conditions last
• Most selective first

INDEX (status, created_at)  -- status = 'active' AND created_at > ?
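Whether a composite index is actually used can be checked directly; a sqlite3 sketch with table and index names invented to match the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT);
    CREATE INDEX idx_status_created ON orders (status, created_at);
""")

# Ask the planner how it would execute the query
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT * FROM orders
    WHERE status = 'active' AND created_at > '2026-01-01'
""").fetchall()

# The last column of each plan row is a human-readable detail string;
# it should mention the index rather than a full table scan
print(plan[0][-1])
```

The equality condition on `status` comes first in the index, so both predicates can be satisfied by one index search.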

Query Optimization

EXPLAIN ANALYZE:
Always run EXPLAIN ANALYZE before optimizing a query; never guess at the plan.

┌────────────────────────────────────────────────────────────────┐
│ EXPLAIN ANALYZE                                                 │
│ SELECT * FROM orders                                            │
│ WHERE user_id = 123 AND status = 'pending'                      │
│ ORDER BY created_at DESC                                        │
│ LIMIT 10;                                                       │
│                                                                 │
│ Look for:                                                       │
│ • Seq Scan (bad on large tables)                               │
│ • Index Scan (good)                                            │
│ • Sort (expensive if not indexed)                              │
│ • Rows vs estimated rows (accuracy of stats)                   │
└────────────────────────────────────────────────────────────────┘

COMMON FIXES:
• Add missing indexes
• Rewrite subqueries as JOINs
• Use LIMIT for pagination
• Avoid SELECT * in production
• Partition large tables

Frontend Performance

Critical Rendering Path

┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐
│  HTML   │──▶──│  CSS    │──▶──│   JS    │──▶──│ Render  │
│  Parse  │     │  Parse  │     │ Execute │     │  Paint  │
└─────────┘     └─────────┘     └─────────┘     └─────────┘
     │               │               │
     ▼               ▼               ▼
    DOM            CSSOM          Execute
   Build           Build         & Modify

OPTIMIZATION:
1. Minimize critical resources
2. Minimize critical bytes
3. Minimize critical path length

Core Web Vitals

LCP (Largest Contentful Paint):
Target: < 2.5s
• Optimize images
• Preload critical resources
• Use CDN

INP (Interaction to Next Paint; replaced FID as a Core Web Vital in 2024):
Target: < 200ms
• Break up long tasks
• Defer non-critical JS
• Use web workers

CLS (Cumulative Layout Shift):
Target: < 0.1
• Set image dimensions
• Reserve space for ads
• Avoid inserting content above fold

Bundle Optimization

CODE SPLITTING:
// Instead of one large bundle
import { everything } from 'huge-library';

// Load on demand
const HeavyComponent = lazy(() => import('./HeavyComponent'));

TREE SHAKING:
// Bad: imports everything
import _ from 'lodash';

// Good: imports only what's used
import { debounce } from 'lodash-es';

COMPRESSION:
• Gzip: 70-90% reduction
• Brotli: 15-20% better than Gzip
• Enable on server and CDN
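The ballpark ratio is easy to verify with the stdlib `gzip` module; repetitive payloads such as markup or bundled JS compress best:

```python
import gzip

# Repetitive payloads (markup, bundled JS) compress extremely well
payload = b"<div class='item'>hello</div>" * 1000
compressed = gzip.compress(payload)

ratio = 1 - len(compressed) / len(payload)
print(f"{len(payload)} -> {len(compressed)} bytes ({ratio:.0%} smaller)")
```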

Concurrency & Parallelism

Async Patterns

SEQUENTIAL (Slow):
┌────────────────────────────────────────────────────────────────┐
│ result1 = await fetchUser()      // 100ms                      │
│ result2 = await fetchOrders()    // 150ms                      │
│ result3 = await fetchProducts()  // 120ms                      │
│ // Total: 370ms                                                │
└────────────────────────────────────────────────────────────────┘

PARALLEL (Fast):
┌────────────────────────────────────────────────────────────────┐
│ [user, orders, products] = await Promise.all([                 │
│     fetchUser(),                                               │
│     fetchOrders(),                                             │
│     fetchProducts()                                            │
│ ])                                                             │
│ // Total: 150ms (slowest call)                                │
└────────────────────────────────────────────────────────────────┘
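The same parallel pattern in Python's asyncio, with sleeps standing in for the three network calls:

```python
import asyncio
import time

async def fetch_user():
    await asyncio.sleep(0.10)    # simulated 100ms network call
    return "user"

async def fetch_orders():
    await asyncio.sleep(0.15)    # simulated 150ms network call
    return "orders"

async def fetch_products():
    await asyncio.sleep(0.12)    # simulated 120ms network call
    return "products"

async def main():
    start = time.monotonic()
    # All three awaited concurrently: total ~ slowest call, not the sum
    user, orders, products = await asyncio.gather(
        fetch_user(), fetch_orders(), fetch_products()
    )
    elapsed = time.monotonic() - start
    return user, orders, products, elapsed

results = asyncio.run(main())
```

Running sequentially would take ~370ms; gathered, the total tracks the slowest call (~150ms).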

Rate Limiting & Backpressure

RATE LIMITING:
┌────────────────────────────────────────────────────────────────┐
│ Token Bucket Algorithm:                                         │
│                                                                 │
│ • Bucket has capacity (e.g., 100 tokens)                       │
│ • Tokens added at fixed rate (e.g., 10/second)                 │
│ • Each request consumes a token                                │
│ • No tokens = request rejected                                 │
└────────────────────────────────────────────────────────────────┘
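A minimal token-bucket sketch (capacity and refill rate are illustrative, smaller than the numbers above so the rejection is visible):

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # max burst size
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last) * self.refill_rate,
        )
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                      # no tokens: reject the request

bucket = TokenBucket(capacity=3, refill_rate=10)
results = [bucket.allow() for _ in range(5)]   # burst of 5 against capacity 3
```

The first three requests pass on the initial burst; the remaining two are rejected until tokens refill.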

BACKPRESSURE:
┌────────────────────────────────────────────────────────────────┐
│ When producer is faster than consumer:                          │
│                                                                 │
│ Options:                                                        │
│ • Drop: Discard excess (lossy)                                 │
│ • Buffer: Queue until processed (memory risk)                  │
│ • Sample: Process every Nth item                               │
│ • Slow down: Signal producer to wait                           │
└────────────────────────────────────────────────────────────────┘
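The drop and buffer options can be sketched with a bounded queue; `queue.Full` being raised is the backpressure signal:

```python
import queue

buffer = queue.Queue(maxsize=3)   # bounded buffer: caps memory use
dropped = 0

# A fast producer emits 10 items while the consumer is stalled
for item in range(10):
    try:
        buffer.put_nowait(item)
    except queue.Full:
        dropped += 1              # drop strategy: discard the excess (lossy)

print(f"buffered={buffer.qsize()} dropped={dropped}")
```

Swapping `put_nowait` for a blocking `put` turns the same code into the "slow down" option: the producer waits instead of dropping.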

Monitoring & Metrics

Key Metrics

THE FOUR GOLDEN SIGNALS:
┌─────────────────────────────────────────────────────────────┐
│ 1. LATENCY    │ Time to serve a request                    │
│ 2. TRAFFIC    │ Requests per second                        │
│ 3. ERRORS     │ Rate of failed requests                    │
│ 4. SATURATION │ How "full" the service is                  │
└─────────────────────────────────────────────────────────────┘

PERCENTILES:
• P50 (median): Typical experience
• P95: Most users' worst experience
• P99: Tail latency (important!)
• Max: Absolute worst case

Note: Average is misleading.
      A few slow requests hide in the average.
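A quick illustration of why the average hides tail latency (the latency numbers are made up):

```python
from statistics import mean, quantiles

# 99 fast requests and one very slow one (latencies in ms)
latencies = [20] * 99 + [2000]

avg = mean(latencies)               # 39.8 -- the average looks healthy
cuts = quantiles(latencies, n=100)  # 99 cut points; cuts[i-1] ~ i-th percentile
p50, p95, p99 = cuts[49], cuts[94], cuts[98]

print(f"avg={avg}  p50={p50}  p95={p95}  p99={p99}")
```

P50 and P95 look fine too; only P99 exposes the two-second outlier.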

Benchmarking

MICRO-BENCHMARKS:
• Test specific functions
• Isolate from I/O
• Run many iterations
• Beware of JIT warmup
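A micro-benchmark sketch using the stdlib `timeit`, running many iterations and taking the minimum of several repeats to reduce noise:

```python
import timeit

# Compare two ways to build a string: repeated += vs str.join
def concat_plus(parts):
    s = ""
    for p in parts:
        s += p
    return s

parts = ["x"] * 1000

# repeat() runs the measurement several times; take the min (least noisy)
t_plus = min(timeit.repeat(lambda: concat_plus(parts), number=200, repeat=3))
t_join = min(timeit.repeat(lambda: "".join(parts), number=200, repeat=3))

print(f"+=: {t_plus:.4f}s   join: {t_join:.4f}s")
```

The minimum, not the mean, is the conventional statistic here: slower runs measure interference, not the code.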

LOAD TESTING:
• Simulate realistic traffic
• Measure at various loads
• Find the breaking point
• Test failure scenarios

TOOLS:
• k6, Artillery, Locust (load testing)
• wrk, hey (HTTP benchmarking)
• hyperfine (CLI benchmarking)

Quick Reference

Performance Checklist

□ Profiled to find actual bottlenecks
□ Set measurable performance targets
□ Optimized hot paths first
□ Added appropriate caching
□ Minimized I/O operations
□ Used connection pooling
□ Indexed frequently queried columns
□ Implemented lazy loading where appropriate
□ Set up performance monitoring
□ Established performance regression tests

Common Performance Wins

| Problem              | Solution                    |
|----------------------|-----------------------------|
| N+1 queries          | Eager loading, batch        |
| Slow queries         | Add indexes, optimize SQL   |
| Large payloads       | Pagination, compression     |
| Repeated computation | Caching, memoization        |
| Synchronous waits    | Async, parallel execution   |
| Cold starts          | Warmup, connection pools    |
| Large bundles        | Code splitting, tree shake  |
| Slow images          | Lazy load, WebP, CDN        |

The Performance Mantras

"Measure first, optimize second"
"The fastest code is code that doesn't run"
"Cache invalidation is hard; TTL is your friend"
"Profile in production, not just development"
"Optimize for the common case"

“Performance is not about making things fast. It’s about removing what makes things slow.”