algorithm-designer

Install command
npx skills add https://github.com/data-wise/claude-plugins --skill algorithm-designer


Skill Documentation

Algorithm Designer

You are an expert in designing and documenting statistical algorithms.

Algorithm Documentation Standards

Required Components

  1. Purpose: What problem does this solve?
  2. Input/Output: Precise specifications
  3. Pseudocode: Language-agnostic description
  4. Complexity: Time and space analysis
  5. Convergence: Conditions and guarantees
  6. Implementation notes: Practical considerations
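
These six components can be carried straight into code-level documentation. A minimal sketch of the mapping onto a roxygen2-style header (the function described and all values are hypothetical placeholders):

#' Estimate theta via Newton-type iteration            <- Purpose
#'
#' @param data data.frame with columns Y, A, M, X      <- Input
#' @param tol convergence tolerance (default 1e-6)
#' @return list(estimate, vcov, converged, iterations) <- Output
#'
#' @section Pseudocode: see "Pseudocode Conventions" below
#' @section Complexity: O(n p^2 + p^3) per iteration; O(np) space
#' @section Convergence: quadratic near the solution; requires a
#'   nonsingular Hessian and a sufficiently close starting point
#' @section Implementation notes: use solve(H, g), never solve(H) %*% g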

Input/Output Specification

Formal Specification Template

Every algorithm must have precise input/output documentation:

INPUT SPECIFICATION:
- Data: D = {(Y_i, A_i, M_i, X_i)}_{i=1}^n where:
  - Y_i ∈ ℝ (continuous outcome)
  - A_i ∈ {0,1} (binary treatment)
  - M_i ∈ ℝ^d (d-dimensional mediator)
  - X_i ∈ ℝ^p (p covariates)
- Parameters: θ ∈ Θ ⊆ ℝ^k (parameter space)
- Tolerance: ε > 0 (convergence criterion)
- Max iterations: T_max ∈ ℕ

OUTPUT SPECIFICATION:
- Estimate: θ̂ ∈ ℝ^k (point estimate)
- Variance: V̂ ∈ ℝ^{k×k} (covariance matrix)
- Convergence: boolean (did algorithm converge?)
- Iterations: t ∈ ℕ (iterations used)
# R implementation of formal I/O specification
define_algorithm_io <- function() {
  list(
    input = list(
      data = "data.frame with columns Y, A, M, X",
      params = "list(tol = 1e-6, max_iter = 1000)",
      models = "list(outcome_formula, mediator_formula, propensity_formula)"
    ),
    output = list(
      estimate = "numeric vector of parameter estimates",
      se = "numeric vector of standard errors",
      vcov = "variance-covariance matrix",
      converged = "logical indicating convergence",
      iterations = "integer count of iterations"
    ),
    complexity = list(
      time = "O(n * p^2) per iteration",
      space = "O(n * p)",
      iterations = "O(log log(1/epsilon)) for Newton-type (quadratic convergence)"
    )
  )
}

Convergence Criteria

Standard Convergence Conditions

| Criterion | Formula | Use Case |
|-----------|---------|----------|
| Absolute | $\lVert\theta^{(t+1)} - \theta^{(t)}\rVert < \varepsilon$ | Parameter convergence |
| Relative | $\lVert\theta^{(t+1)} - \theta^{(t)}\rVert / \lVert\theta^{(t)}\rVert < \varepsilon$ | Scale-invariant |
| Gradient | $\lVert\nabla L(\theta^{(t)})\rVert < \varepsilon$ | Optimization |
| Function | $\lvert L(\theta^{(t+1)}) - L(\theta^{(t)})\rvert < \varepsilon$ | Objective convergence |
| Cauchy | $\max_i \lvert\theta_i^{(t+1)} - \theta_i^{(t)}\rvert < \varepsilon$ | Component-wise convergence |

Mathematical Formulation

Convergence tolerance: $\varepsilon = 10^{-6}$ (typical default)

Standard tolerances by application:

  • Numerical optimization: $\varepsilon = 10^{-8}$
  • Statistical estimation: $\varepsilon = 10^{-6}$
  • Approximate methods: $\varepsilon = 10^{-4}$

Complexity Formulas

Linear complexity $O(n)$: Operations grow proportionally to input size $$T(n) = c \cdot n + O(1)$$

Quadratic complexity $O(n^2)$: Nested iterations over input $$T(n) = c \cdot n^2 + O(n)$$

Linearithmic complexity $O(n \log n)$: Divide-and-conquer with linear work per level $$T(n) = c \cdot n \log_2 n + O(n)$$

Space-Time Tradeoff: $$\text{Time} \times \text{Space} \geq \Omega(\text{Information Content})$$

Convergence rate analysis:

  • Linear convergence: $|\theta^{(t)} - \theta^*| \leq C \cdot \rho^t$ where $0 < \rho < 1$
  • Quadratic convergence: $|\theta^{(t+1)} - \theta^*| \leq C \cdot |\theta^{(t)} - \theta^*|^2$
  • Superlinear: $\lim_{t \to \infty} \frac{|\theta^{(t+1)} - \theta^*|}{|\theta^{(t)} - \theta^*|} = 0$
# Comprehensive convergence checking
check_convergence <- function(theta_new, theta_old, gradient = NULL,
                              objective_new = NULL, objective_old = NULL,
                              tol = 1e-6, method = "relative") {
  switch(method,
    "absolute" = {
      # |θ^(t+1) - θ^t| < ε
      converged <- max(abs(theta_new - theta_old)) < tol
      criterion <- max(abs(theta_new - theta_old))
    },
    "relative" = {
      # |θ^(t+1) - θ^t| / |θ^t| < ε
      denom <- pmax(abs(theta_old), 1)  # Avoid division by zero
      converged <- max(abs(theta_new - theta_old) / denom) < tol
      criterion <- max(abs(theta_new - theta_old) / denom)
    },
    "gradient" = {
      # |∇L(θ)| < ε
      stopifnot(!is.null(gradient))
      converged <- sqrt(sum(gradient^2)) < tol
      criterion <- sqrt(sum(gradient^2))
    },
    "objective" = {
      # |L(θ^(t+1)) - L(θ^t)| < ε
      stopifnot(!is.null(objective_new), !is.null(objective_old))
      converged <- abs(objective_new - objective_old) < tol
      criterion <- abs(objective_new - objective_old)
    },
    # Fail fast on an unrecognized method (switch would silently return NULL)
    stop("Unknown convergence method: ", method)
  )

  list(converged = converged, criterion = criterion, method = method)
}

# Newton-Raphson with convergence monitoring
newton_raphson <- function(f, grad, hess, theta0, tol = 1e-6, max_iter = 100) {
  theta <- theta0
  history <- list()

  for (t in 1:max_iter) {
    g <- grad(theta)
    H <- hess(theta)

    # Newton step: θ^(t+1) = θ^t - H^(-1) * g
    # Time complexity: O(p^3) for matrix inversion
    delta <- solve(H, g)
    theta_new <- theta - delta

    # Check convergence
    conv <- check_convergence(theta_new, theta, gradient = g, tol = tol)
    history[[t]] <- list(theta = theta, gradient_norm = sqrt(sum(g^2)))

    if (conv$converged) {
      return(list(
        estimate = theta_new,
        iterations = t,
        converged = TRUE,
        history = history
      ))
    }

    theta <- theta_new
  }

  list(estimate = theta, iterations = max_iter, converged = FALSE, history = history)
}
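
A quick usage sketch on a toy quadratic objective (the functions below are hypothetical examples; for a quadratic, a single Newton step lands exactly on the minimizer):

# Minimize f(θ) = (θ₁ - 1)² + (θ₂ + 2)²
f    <- function(th) sum((th - c(1, -2))^2)
grad <- function(th) 2 * (th - c(1, -2))
hess <- function(th) diag(2, 2)   # constant Hessian 2·I

fit <- newton_raphson(f, grad, hess, theta0 = c(0, 0))
fit$estimate    # c(1, -2): exact after the first step
fit$iterations  # 2, since convergence is detected on the following pass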

Complexity and Convergence Relationship

| Algorithm | Convergence Rate | Iterations to $\varepsilon$ |
|-----------|------------------|-----------------------------|
| Gradient Descent | $O(1/t)$ | $O(1/\varepsilon)$ |
| Accelerated GD | $O(1/t^2)$ | $O(1/\sqrt{\varepsilon})$ |
| Newton-Raphson | Quadratic | $O(\log\log(1/\varepsilon))$ |
| EM Algorithm | Linear | $O(\log(1/\varepsilon))$ |
| Coordinate Descent | Linear | $O(p \cdot \log(1/\varepsilon))$ |

Pseudocode Conventions

Standard Format

ALGORITHM: [Name]
INPUT: [List inputs with types]
OUTPUT: [List outputs with types]

1. [Initialize]
2. [Main loop or procedure]
   2.1 [Sub-step]
   2.2 [Sub-step]
3. [Return]

Example: AIPW Estimator

ALGORITHM: Augmented IPW for Mediation
INPUT:
  - Data (Y, A, M, X) of size n
  - Propensity model specification
  - Outcome model specification
  - Mediator model specification
OUTPUT:
  - Point estimate ψ̂
  - Standard error SE(ψ̂)
  - 95% confidence interval

1. ESTIMATE NUISANCE FUNCTIONS
   1.1 Fit propensity score: π̂(x) = P̂(A=1|X=x)
   1.2 Fit mediator density: f̂(m|a,x)
   1.3 Fit outcome regression: μ̂(a,m,x) = Ê[Y|A=a,M=m,X=x]

2. COMPUTE PSEUDO-OUTCOMES
   For i = 1 to n:
     2.1 Compute IPW weight: w_i = A_i/π̂(X_i) + (1-A_i)/(1-π̂(X_i))
     2.2 Compute outcome prediction: μ̂_i = μ̂(A_i, M_i, X_i)
     2.3 Compute augmentation term
     2.4 φ_i = w_i(Y_i - μ̂_i) + [integration term]

3. ESTIMATE AND INFERENCE
   3.1 ψ̂ = n⁻¹ Σᵢ φ_i
   3.2 SE = √(Σᵢ (φ_i - ψ̂)²) / n    (sample SD of the φ_i divided by √n)
   3.3 CI = [ψ̂ - 1.96·SE, ψ̂ + 1.96·SE]

4. RETURN (ψ̂, SE, CI)
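
The integration term in step 2.4 is estimand-specific and left abstract above. To make the three-phase structure concrete, here is a deliberately simplified R sketch for the plain average treatment effect (no mediator; assumes single numeric columns Y, A, X), not the mediation estimator itself:

# Simplified AIPW sketch for the ATE -- illustrates the structure only
aipw_ate <- function(data) {
  # 1. Estimate nuisance functions
  ps_fit  <- glm(A ~ X, data = data, family = binomial)
  out_fit <- lm(Y ~ A + X, data = data)
  pi_hat  <- predict(ps_fit, type = "response")
  mu1 <- predict(out_fit, newdata = transform(data, A = 1))
  mu0 <- predict(out_fit, newdata = transform(data, A = 0))

  # 2. Pseudo-outcomes (influence-function values)
  phi <- (data$A / pi_hat) * (data$Y - mu1) -
         ((1 - data$A) / (1 - pi_hat)) * (data$Y - mu0) +
         (mu1 - mu0)

  # 3. Estimate and inference
  psi <- mean(phi)
  se  <- sd(phi) / sqrt(nrow(data))
  list(estimate = psi, se = se, ci = psi + c(-1.96, 1.96) * se)
}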

Complexity Analysis

Big-O Notation Guide

Formal Definition: $f(n) = O(g(n))$ if $\exists c, n_0$ such that $f(n) \leq c \cdot g(n)$ for all $n \geq n_0$

| Complexity | Name | Example | Operations at n=1000 |
|------------|------|---------|----------------------|
| $O(1)$ | Constant | Array access | 1 |
| $O(\log n)$ | Logarithmic | Binary search | ~10 |
| $O(n)$ | Linear | Single loop | 1,000 |
| $O(n \log n)$ | Linearithmic | Merge sort, FFT | ~10,000 |
| $O(n^2)$ | Quadratic | Nested loops | 1,000,000 |
| $O(n^3)$ | Cubic | Matrix multiplication | 1,000,000,000 |
| $O(2^n)$ | Exponential | Subset enumeration | ~10^301 |

Key Formulas

Master Theorem for recurrences $T(n) = aT(n/b) + f(n)$:

  • If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$
  • If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$
  • If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$ (plus a regularity condition), then $T(n) = \Theta(f(n))$

Example: merge sort has $T(n) = 2T(n/2) + \Theta(n)$, so $a = b = 2$, $\log_b a = 1$, and case 2 gives $T(n) = \Theta(n \log n)$.

Sorting lower bound: Any comparison-based sort requires $\Omega(n \log n)$ comparisons

Matrix operations:

  • Naive multiplication: $O(n^3)$
  • Strassen: $O(n^{2.807})$
  • Matrix inversion: $O(n^3)$ (same as multiplication)
# Complexity analysis helper: estimate the empirical growth exponent
# by regressing log(time) on log(n)
analyze_complexity <- function(f, n_values = c(100, 500, 1000, 5000)) {
  times <- sapply(n_values, function(n) {
    system.time(f(n))[["elapsed"]]
  })
  # Guard against zero timings (log(0) = -Inf); if f is too fast to
  # measure, use larger n_values instead
  times <- pmax(times, .Machine$double.eps)

  # Fit log-log regression: the slope estimates the power of n
  fit <- lm(log(times) ~ log(n_values))
  estimated_power <- unname(coef(fit)[2])

  list(
    times = data.frame(n = n_values, time = times),
    estimated_complexity = paste0("O(n^", round(estimated_power, 2), ")"),
    power = estimated_power
  )
}
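
A usage sketch with a deliberately quadratic toy workload (timings are machine-dependent, so treat the recovered exponent as approximate; n_values must be large enough that each call takes measurable time):

# O(n²) toy workload: n passes over an n-vector of random draws
f_quad <- function(n) for (i in seq_len(n)) sum(rnorm(n))
res <- analyze_complexity(f_quad, n_values = c(1000, 2000, 4000, 8000))
res$estimated_complexity   # expect roughly "O(n^2)"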

Statistical Algorithm Complexities

| Algorithm | Time | Space |
|-----------|------|-------|
| OLS | O(np² + p³) | O(np) |
| Logistic (Newton) | O(np² + p³) per iteration | O(np) |
| Bootstrap (B reps) | O(B × base) | O(n) |
| MCMC (T iters) | O(T × per-iter) | O(n + T) |
| Cross-validation (K-fold) | O(K × base) | O(n) |
| Random forest | O(n log n × B × p) | O(n × B) |

Template for Analysis

TIME COMPLEXITY:
- Initialization: O(...)
- Per iteration: O(...)
- Total (T iterations): O(...)
- Convergence typically in T = O(...) iterations

SPACE COMPLEXITY:
- Data storage: O(n × p)
- Working memory: O(...)
- Output: O(...)
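
As a worked instance, the template filled in for OLS (consistent with the complexity table above):

TIME COMPLEXITY (OLS):
- Initialization: O(np) to assemble the design matrix
- Forming XᵀX and Xᵀy: O(np²); solving the normal equations: O(p³)
- Total: O(np² + p³)
- Convergence: exact in one solve (finite termination, no iterations)

SPACE COMPLEXITY (OLS):
- Data storage: O(n × p)
- Working memory: O(p²) for XᵀX
- Output: O(p) coefficients plus O(p²) covariance matrix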

Convergence Analysis

Types of Convergence

  1. Finite termination: Exact solution in finite steps
  2. Linear: $|x_{k+1} - x^*| \leq c|x_k - x^*|$, $c < 1$
  3. Superlinear: $|x_{k+1} - x^*| / |x_k - x^*| \to 0$
  4. Quadratic: $|x_{k+1} - x^*| \leq c|x_k - x^*|^2$

Convergence Documentation Template

CONVERGENCE:
- Type: [Linear/Superlinear/Quadratic]
- Rate: [Expression]
- Conditions: [What must hold]
- Stopping criterion: [When to stop]
- Typical iterations: [Order of magnitude]
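
Filled in for the Newton-Raphson routine above (typical values, not guarantees):

CONVERGENCE:
- Type: Quadratic
- Rate: |θ^(t+1) - θ*| ≤ C·|θ^(t) - θ*|²
- Conditions: Hessian nonsingular near θ*; starting point sufficiently close
- Stopping criterion: ‖∇L(θ^(t))‖ < 10⁻⁶
- Typical iterations: under 20 for well-conditioned problems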

Optimization Algorithms

Gradient-Based Methods

ALGORITHM: Gradient Descent
INPUT: f (objective), ∇f (gradient), x₀ (initial), η (step size), ε (tolerance)
OUTPUT: x* (minimizer)

1. k ← 0
2. WHILE ‖∇f(xₖ)‖ > ε:
   2.1 xₖ₊₁ ← xₖ - η∇f(xₖ)
   2.2 k ← k + 1
3. RETURN xₖ

COMPLEXITY: O(iterations × gradient_cost)
CONVERGENCE: Linear with rate (1 - η·μ) for μ-strongly convex f
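
A direct R transcription of this pseudocode (fixed step size for simplicity; in practice a line search or a decaying η is more robust):

gradient_descent <- function(grad, x0, eta = 0.1, tol = 1e-6, max_iter = 1e5) {
  x <- x0
  for (k in seq_len(max_iter)) {
    g <- grad(x)
    if (sqrt(sum(g^2)) < tol) break   # WHILE ‖∇f(xₖ)‖ > ε
    x <- x - eta * g                  # xₖ₊₁ ← xₖ - η∇f(xₖ)
  }
  x
}

gradient_descent(function(x) 2 * (x - 3), x0 = 0)   # minimizes (x - 3)²; returns ≈ 3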

Newton’s Method

ALGORITHM: Newton-Raphson
INPUT: f, ∇f, ∇²f, x₀, ε
OUTPUT: x*

1. k ← 0
2. WHILE ‖∇f(xₖ)‖ > ε:
   2.1 Solve ∇²f(xₖ)·d = -∇f(xₖ) for direction d
   2.2 xₖ₊₁ ← xₖ + d
   2.3 k ← k + 1
3. RETURN xₖ

COMPLEXITY: O(iterations × p³) for p-dimensional
CONVERGENCE: Quadratic near solution

EM Algorithm Template

ALGORITHM: Expectation-Maximization
INPUT: Data Y, model parameters θ₀, tolerance ε
OUTPUT: MLE θ̂

1. θ ← θ₀
2. REPEAT:
   2.1 E-STEP: Compute Q(θ'|θ) = E[log L(θ'|Y,Z) | Y, θ]
   2.2 M-STEP: θ_new ← argmax_θ' Q(θ'|θ)
   2.3 Δ ← |θ_new - θ|
   2.4 θ ← θ_new
3. UNTIL Δ < ε
4. RETURN θ

CONVERGENCE: Monotonic increase in likelihood
             Linear rate near optimum
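
A minimal sketch of the template for a two-component Gaussian mixture with known unit variances (an illustrative simplification), estimating the mixing weight and the two means:

em_mixture <- function(y, theta0 = list(pi = 0.5, mu1 = -1, mu2 = 1),
                       tol = 1e-6, max_iter = 500) {
  th <- theta0
  for (t in seq_len(max_iter)) {
    # E-step: responsibilities r_i = P(component 1 | y_i, θ)
    d1 <- th$pi * dnorm(y, th$mu1, 1)
    d2 <- (1 - th$pi) * dnorm(y, th$mu2, 1)
    r  <- d1 / (d1 + d2)
    # M-step: weighted maximum-likelihood updates
    th_new <- list(pi  = mean(r),
                   mu1 = sum(r * y) / sum(r),
                   mu2 = sum((1 - r) * y) / sum(1 - r))
    delta <- max(abs(unlist(th_new) - unlist(th)))   # Δ ← |θ_new - θ|
    th <- th_new
    if (delta < tol) return(list(theta = th, iterations = t, converged = TRUE))
  }
  list(theta = th, iterations = max_iter, converged = FALSE)
}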

Bootstrap Algorithms

Nonparametric Bootstrap

ALGORITHM: Nonparametric Bootstrap
INPUT: Data X of size n, statistic T, B (number of replicates)
OUTPUT: SE estimate, CI

1. FOR b = 1 to B:
   1.1 Draw X*_b by sampling n observations with replacement from X
   1.2 Compute T*_b = T(X*_b)
2. SE_boot ← SD({T*_1, ..., T*_B})
3. CI_percentile ← [quantile(T*, 0.025), quantile(T*, 0.975)]
4. RETURN (SE_boot, CI_percentile)

COMPLEXITY: O(B × cost(T))
NOTES: B ≥ 1000 for SE, B ≥ 10000 for percentile CI
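
A minimal R transcription, with the sample median as the statistic T:

nonparametric_bootstrap <- function(x, T_stat = median, B = 10000) {
  # Draw B resamples with replacement and recompute the statistic
  t_star <- replicate(B, T_stat(sample(x, replace = TRUE)))
  list(se_boot       = sd(t_star),
       ci_percentile = unname(quantile(t_star, c(0.025, 0.975))))
}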

Parametric Bootstrap

ALGORITHM: Parametric Bootstrap
INPUT: Data X, parametric model M, B replicates
OUTPUT: SE estimate

1. Fit θ̂ = MLE(X, M)
2. FOR b = 1 to B:
   2.1 Generate X*_b ~ M(θ̂)
   2.2 Compute θ̂*_b = MLE(X*_b, M)
3. SE_boot ← SD({θ̂*_1, ..., θ̂*_B})
4. RETURN SE_boot
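
A matching sketch under a normal working model, with the mean as the target parameter (the model choice here is an illustrative assumption):

parametric_bootstrap_se <- function(x, B = 1000) {
  mu_hat    <- mean(x)   # 1. fit the model
  sigma_hat <- sd(x)     #    (MLE of σ divides by n, not n-1; sd() is close for large n)
  theta_star <- replicate(B, mean(rnorm(length(x), mu_hat, sigma_hat)))
  sd(theta_star)         # 3. SE_boot
}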

Numerical Stability Notes

Common Issues

  1. Overflow/Underflow: Work on log scale
  2. Cancellation: Reformulate subtractions
  3. Ill-conditioning: Use regularization or pivoting
  4. Convergence: Add damping or line search

Stability Techniques

# Log-sum-exp trick
log_sum_exp <- function(x) {
  max_x <- max(x)
  max_x + log(sum(exp(x - max_x)))
}
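
For example, the naive computation underflows where the trick does not:

log(sum(exp(c(-1000, -1001))))   # -Inf: exp(-1000) underflows to 0
log_sum_exp(c(-1000, -1001))     # ≈ -999.69, the correct value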

# Numerically stable variance (two-pass: center first, then sum squares,
# avoiding the catastrophic cancellation of the E[X^2] - E[X]^2 shortcut)
stable_var <- function(x) {
  n <- length(x)
  m <- mean(x)
  sum((x - m)^2) / (n - 1)
}

Implementation Checklist

Before Coding

  • Pseudocode written and reviewed
  • Complexity analyzed
  • Convergence conditions identified
  • Edge cases documented
  • Numerical stability considered

During Implementation

  • Match pseudocode structure
  • Add convergence monitoring
  • Handle edge cases
  • Log intermediate values (debug mode)
  • Add early stopping

After Implementation

  • Unit tests for components
  • Integration tests for full algorithm
  • Benchmark against reference implementation
  • Profile for bottlenecks
  • Document deviations from pseudocode

Key References

  • Cormen, Leiserson, Rivest, and Stein, Introduction to Algorithms (CLRS)
  • Press, Teukolsky, Vetterling, and Flannery, Numerical Recipes