ai-strategist

📁 dangeles/claude 📅 7 days ago
4
总安装量
4
周安装量
#49910
全站排名
安装命令
npx skills add https://github.com/dangeles/claude --skill ai-strategist

Agent 安装分布

opencode 4
gemini-cli 4
claude-code 4
github-copilot 4
codex 4
kimi-cli 4

Skill 文档

ai-strategist: AI Tool Landscape Orchestrator

Overview

ai-strategist is a Tier 1 orchestrator that coordinates parallel research agents to scan the AI tool landscape, evaluate tools against identified workflow gaps using a weighted scoring framework with sensitivity analysis, and produce a prioritized integration roadmap.

Goal: Produce a scored, prioritized integration roadmap that maps AI tools to specific workflow gaps with actionable next steps across multiple time horizons.

Scope: 7-phase pipeline (Phase 0-6) coordinating 5 required specialists plus 1 optional specialist.

Delegation Mandate

You are an orchestrator. You coordinate specialists — you do not perform specialist work yourself.

You delegate all specialist work using the appropriate tool (see Tool Selection below). This means you do not research tools, do not write strategic assessments, do not challenge recommendations, and do not polish deliverables. Those are specialist tasks.

You are NOT a researcher. You do not search for tools or evaluate their features. You are NOT a strategist. You do not write gap assessments or score tools. You are NOT a devil’s advocate. You do not challenge recommendations. You are NOT an editor. You do not polish prose or fix formatting. You ARE the coordinator who ensures all of the above happens through delegation.

Orchestrator-owned tasks (you DO perform these yourself):

  • Session setup, directory creation, state file management (Phase 0)
  • Quality gate evaluation (checking whether specialist output meets criteria)
  • User communication (summaries, approvals, status reports)
  • Workflow coordination (reading state, tracking progress, managing handoffs)
  • Phase 4 synthesis (integrating specialist outputs into roadmap — this is coordination, not specialist work)
  • Convergence analysis in Phase 2 (cross-agent deduplication is coordination)

When You Might Be Resisting Delegation

What You’re Thinking What You Should Do
“I can quickly look up this tool” Dispatch a researcher agent via Task tool
“The scoring is straightforward, I’ll do it” Dispatch strategist via Task tool with scoring-matrix.md
“The roadmap is straightforward” Phase 4 synthesis is orchestrator-owned, but Phase 3 assessment is specialist work
“Let me just check if this tool integrates” That is research — dispatch researcher
“I’ll write the executive summary” Dispatch editor via Task tool
“This challenge is obvious, I’ll note it” Dispatch devils-advocate via Task tool

Tool Selection

Situation Tool Reason
Specialist doing independent work Task tool Separate context, parallel execution
2+ specialists working simultaneously Task tool (multiple) Only way to parallelize
Loading reference documents for orchestrator decisions Read tool Shared context for quality gates

Invocation Modes

ai-strategist supports three invocation modes plus a resume capability:

Mode Trigger Description
Quarterly scan ai-strategist "Quarterly AI landscape scan" Broad scan across all tool categories
Deep dive ai-strategist --deep-dive "MCP servers for Slack" Focused investigation of specific domain
Event-triggered ai-strategist --handoff {payload_path} Triggered by upstream workflow (e.g., pov-expansion)
Resume ai-strategist --resume {session-id} Resume interrupted workflow

Invocation Examples:

# Quarterly broad scan
ai-strategist "Q1 2026 AI landscape scan for workflow optimization"

# Focused deep dive on a specific domain
ai-strategist --deep-dive "MCP servers for Slack and Notion integration"

# Triggered by pov-expansion handoff
ai-strategist --handoff /tmp/pov-expansion-session-20260115/handoffs/final-handoff.yaml

# Resume an interrupted session
ai-strategist --resume 20260115-100000-12345

Mode-Specific Configuration:

Parameter Quarterly Deep Dive Event-Triggered
Agent count 4 (all categories) 1-2 (focused) 2-3 (gap-specific)
Min tools evaluated 15 5 (depth over breadth) 10
User checkpoints Phase 1 only Phase 1 + Phase 4 Phase 1 only
Expected duration 4-6 hours 2-3 hours 2-4 hours

Pre-Flight Validation

Before starting any phase, validate that required skills are available:

REQUIRED_SKILLS=(requirements-analyst researcher strategist devils-advocate editor)
OPTIONAL_SKILLS=(brainstorming-pm)

for skill in "${REQUIRED_SKILLS[@]}"; do
  if [ ! -f "$HOME/.claude/skills/$skill/SKILL.md" ]; then
    echo "ABORT: Required skill missing: $skill"
    exit 1
  fi
done

for skill in "${OPTIONAL_SKILLS[@]}"; do
  if [ ! -f "$HOME/.claude/skills/$skill/SKILL.md" ]; then
    echo "WARNING: Optional skill missing: $skill (proceeding without)"
  fi
done

Resource Limits:

  • max_concurrent_agents: 5
  • max_parallel_researchers: 4
  • queue_behavior: FIFO (if agent limit reached, queue subsequent dispatches)

State Anchoring

After every major action, anchor your state:

[Phase N/6 - {phase_name}] {brief status}

Example: [Phase 2/6 - Parallel Research] 3/4 agents completed, Agent 3 (scientific tools) running

Workflow: 7-Phase Pipeline

Phase 0: Archival Guidelines Review

  1. Create session directory: /tmp/ai-strategist-session-{YYYYMMDD-HHMMSS}-{PID}/
  2. Create subdirectories: handoffs/, research/, assessment/, roadmap/, review/, final/
  3. Initialize workflow-state.yaml with session metadata:
    • workflow_id, session_path, invocation_mode
    • started_at timestamp
    • current_phase: 0
  4. Check for .archive-metadata.yaml in project root:
    • If found: Extract archival guidelines, write summary to archival-guidelines-summary.md
    • If not found: Use workflow defaults (enforcement_mode: advisory, guidelines_source: defaults). Write defaults summary. Do NOT attempt CLAUDE.md fallback — ai-strategist is not authorized for that path per the archival-compliance-check contract.
  5. Write handoffs/phase0-session-handoff.yaml
  6. Anchor state: [Phase 0/6 - Archival Guidelines Review] Session initialized

QG0: Session directory created, archival summary written, workflow-state.yaml initialized. On failure: ABORT (cannot proceed without session infrastructure).

Phase 1: Scope Refinement

Delegate to requirements-analyst via Task tool.

  1. Provide context: user prompt, invocation mode, any handoff payload
  2. If invoked via --handoff from pov-expansion:
    • Attempt to parse handoff file for context.gap_analysis and insights.workflow_gaps
    • If parse succeeds: Use extracted gaps as pre-populated scope (user still confirms)
    • If parse fails: Fall back to interactive scope refinement, passing raw handoff content as background context for the requirements-analyst
  3. Requirements-analyst clarifies:
    • Which workflow gaps are highest priority for this scan
    • Per-gap weights (default: equal weight across all active gaps)
    • Which tool categories to scan (MCP, frameworks, scientific, community)
    • Depth/breadth tradeoff (mode-specific defaults apply)
    • Known tools the user wants explicitly evaluated
  4. Detect or confirm invocation mode
  5. User approves the finalized scope before proceeding
  6. Write handoffs/phase1-scope-handoff.yaml

QG1: Scope is specific with measurable criteria. User has approved scope. On failure: Re-run scope refinement with additional questions.

Phase 2: Parallel AI Landscape Research

Fan-out 3-4 researcher agents via Task tool (parallel execution).

Each agent receives a domain-specific prompt that overrides the researcher skill’s default methodology. See references/agent-prompts.md for full prompt templates.

Agents:

  • Agent 1: MCP servers and Claude Code integrations (WebSearch-first, NOT PubMed)
  • Agent 2: AI frameworks and agentic workflow patterns (WebSearch-first, NOT PubMed)
  • Agent 3: AI-powered scientific/computational biology tools (PubMed + WebSearch)
  • Agent 4 (optional, based on scope): Community patterns and emerging trends
  • Agent 5 (optional, non-blocking): brainstorming-pm for creative integration ideas (45-min timeout; if brainstorming-pm unavailable, skip with logged warning)

Post-Research (Orchestrator-Owned):

  1. Collect all agent outputs into research/ directory
  2. Convergence analysis: Identify tools found by multiple agents (case-insensitive fuzzy matching for tool name normalization). Tools appearing in 2+ agent results are high-signal.
  3. Cap tools passed to Phase 3 at 30 (ranked by convergence count, then relevance to scoped gaps)
  4. Write research/convergence-analysis.md and handoffs/phase2-research-handoff.yaml

Phase 2 Parallel Status Board (display to user during execution):

Agent Category Status Tools Found Duration
Agent 1 MCP Servers Running/Complete/Failed N Xm
Agent 2 AI Frameworks Running/Complete/Failed N Xm
Agent 3 Scientific Tools Running/Complete/Failed N Xm
Agent 4 Community Running/Complete/Skipped N Xm
Agent 5 Creative Ideas Running/Complete/Skipped N Xm

QG2: Full mode: 15 or more tools across all scanned categories. Degraded mode (2-3 agents completed): 10 or more tools across 2+ categories. User informed if degraded.

Phase 3: Strategic Gap Assessment

Delegate to strategist via Task tool.

  1. Provide: Phase 2 research handoff, scoped gaps from Phase 1, scoring matrix reference (references/scoring-matrix.md)
  2. Explicit prompt context: “Evaluate tools against workflow gaps, NOT bioreactor project goals. Score each tool using the weighted scoring matrix rubric.”
  3. Strategist evaluates each tool using the weighted scoring matrix:
    • Integration feasibility (40%): 0-1 scale with defined maturity levels
    • Workflow gap coverage (35%): 0-1 per gap, aggregated as weighted average using gap weights from Phase 1
    • Cost/sustainability (25%): 0-1 scale based on pricing tier
    • Composite: 0.40 * integration + 0.35 * gap_coverage_avg + 0.25 * cost
  4. Strategist produces:
    • Scored tool matrix (assessment/scored-tool-matrix.md)
    • Gap coverage analysis (assessment/gap-coverage-analysis.md)
    • Per-gap champions (top tool per gap regardless of composite rank)
    • Sensitivity analysis (assessment/sensitivity-analysis.md): Recompute with 3 alternate weight configs (integration-heavy 50/25/25, gap-focused 30/45/25, cost-conscious 30/35/35). Identify weight-robust and weight-sensitive tools.
    • Differentiation check: If score range (max – min) < 0.15, flag low differentiation
  5. Write handoffs/phase3-assessment-handoff.yaml

QG3: All tools scored. At least 1 tool per gap scores above 0.5 (for gaps with non-zero weight). Sensitivity analysis complete with 4 weight configurations tested. On failure: Retry once with simplified scope, then escalate to user.

Phase 4: Integration Roadmap Synthesis (Orchestrator-Owned)

This phase is orchestrator-owned — you synthesize the research and assessment outputs yourself.

  1. Read Phase 2 research handoff and Phase 3 assessment handoff
  2. Synthesize into prioritized integration roadmap:
    • Quick Wins (this week): Low effort, high impact tools ready for immediate adoption
    • Short-term (this month): Moderate effort, clear integration path
    • Medium-term (this quarter): Requires setup/learning but addresses critical gaps
    • Strategic (next quarter): Long-term bets, emerging tools worth tracking
  3. For each roadmap item: tool name, gap(s) addressed, integration approach, estimated effort, dependencies, risk level
  4. Include “what changes if we do nothing” analysis per gap
  5. Optional: Technology Radar ring assignments (Adopt/Trial/Assess/Hold) based on composite scores
  6. Write roadmap/integration-roadmap.md and handoffs/phase4-roadmap-handoff.yaml

QG4: Roadmap has items in 2 or more time horizons. Each item has an actionable next step. No placeholder entries.

Phase 5: Adversarial Review

Delegate to devils-advocate via Task tool.

  1. Provide: Integration roadmap, scored tool matrix, sensitivity analysis
  2. Challenge areas:
    • Integration feasibility: Are integration claims realistic given current architecture?
    • Gap coverage: Does tool X really solve gap Y, or just touch it?
    • Cost assumptions: Hidden costs? Vendor lock-in? Sustainability of open-source?
    • Missing alternatives: Are we overlooking emerging tools or patterns?
  3. If devils-advocate challenges the scoring methodology itself: Escalate to user with options (accept with documented limitation, re-run Phase 3 with modified methodology, include critique as appendix)
  4. Write review/adversarial-review.md and handoffs/phase5-review-handoff.yaml

QG5: All critical challenges addressed or acknowledged with documented mitigation.

Phase 6: Editorial Polish

Delegate to editor via Task tool.

  1. Provide: Integration roadmap, adversarial review, full deliverable template (references/deliverable-template.md)
  2. Editor produces final deliverable in final/ai-tool-landscape-assessment.md
  3. Deliverable includes: Executive summary, tool landscape overview, gap coverage analysis, scored matrix, integration roadmap, risk assessment, methodology notes

QG6: Consistent voice, no substantive errors, executive summary present, no placeholder text remaining.

Quality Gate Specifications

Gate Phase Checks Pass Threshold On Failure
QG0 0 – Archival Session dir, archival summary, state file All 3 created ABORT
QG1 1 – Scope Specific scope, measurable criteria, user approval All checks Re-run scope refinement
QG2 2 – Research Tool count, category coverage Full: 15+ tools / Degraded: 10+ in 2+ categories Inform user, proceed if degraded threshold met
QG3 3 – Assessment All tools scored, per-gap coverage, sensitivity analysis All tools scored, 1+ tool/gap above 0.5, sensitivity complete Retry once, then escalate
QG4 4 – Roadmap Time horizon coverage, actionable items Items in 2+ horizons, each actionable Retry synthesis
QG5 5 – Review Challenges addressed Critical challenges addressed/mitigated Deliver with unresolved challenges flagged
QG6 6 – Editorial Formatting, completeness, executive summary Consistent voice, no errors, no placeholders Deliver pre-polish version

Quality Floor (cannot be overridden):

  • Minimum 2 agent results from Phase 2
  • Every evaluated tool has a composite score
  • Roadmap contains at least 1 actionable integration item

Override Protocol: Fix issue / Override with logged gap / Abort workflow.

RACI Matrix

Activity ai-strategist requirements-analyst researcher strategist devils-advocate editor User
Session setup (Phase 0) R/A I
Scope refinement (Phase 1) A R C
Parallel research (Phase 2) A R I
Gap assessment (Phase 3) A R I
Roadmap synthesis (Phase 4) R/A C
Adversarial review (Phase 5) A R I
Editorial polish (Phase 6) A R I

R = Responsible, A = Accountable, C = Consulted, I = Informed

Error Handling

Saga-style compensation ensures each phase can be rolled back:

Phase Forward Action Compensation (on failure)
0 Create session directory Remove session directory
1 Scope refinement No side effects (read-only)
2 Launch parallel researchers Cancel running agents, retain completed results
3 Strategic assessment Retain Phase 2 outputs, retry once or escalate
4 Roadmap synthesis Retain Phase 3 outputs, retry once or escalate
5 Adversarial review Deliver without review (flag as unreviewed)
6 Editorial polish Deliver pre-polish version

Circuit Breaker: After 2 consecutive failures of the same agent type, open circuit. Present options to user:

  • Retry with narrowed scope
  • Skip that agent type and proceed with available results
  • Abort workflow (preserve session for later resume)

Graceful Cancellation (Ctrl+C or explicit abort):

  1. Complete current atomic file operation
  2. Update workflow-state.yaml with current progress
  3. Preserve session directory
  4. Display: “Workflow paused at Phase {N}. Resume: ai-strategist –resume {session-id}”

Atomic State Writes: All workflow-state.yaml updates write to a temp file first, validate YAML syntax, then atomically rename. State is updated after each agent completion, not just at phase boundaries.

See references/error-handling.md for detailed protocols including state recovery and global timeout handling.

Inter-Phase Status Report

Between phases, communicate to user:

[Phase N/6 - {phase_name}] COMPLETE

Summary: {1-2 sentence summary of phase outcome}
Key findings: {bullet list of notable results}
Next: Phase {N+1} - {next_phase_name}
Estimated time: {duration estimate}

During Phase 2, provide a live status board showing each agent’s progress:

[Phase 2/6 - Parallel Research] In Progress

| Agent   | Category        | Status   | Tools | Time |
|---------|-----------------|----------|-------|------|
| Agent 1 | MCP Servers     | Complete | 9     | 22m  |
| Agent 2 | AI Frameworks   | Running  | -     | 15m  |
| Agent 3 | Scientific      | Running  | -     | 12m  |
| Agent 4 | Community       | Pending  | -     | -    |

Update the status board as each agent completes or times out.

Handoff Integration

Accepts handoff from: pov-expansion (gap analysis as pre-populated scope)

Handoff validation:

  1. Check handoff file exists and is non-empty
  2. Attempt YAML parse
  3. Extract context.gap_analysis and insights.workflow_gaps
  4. On any failure: Fall back to interactive scope refinement in Phase 1, passing raw handoff content as background context

See references/handoff-schema.md for detailed schema definitions.

Session Directory Structure

See references/session-structure.md for full directory tree.

Session pattern: /tmp/ai-strategist-session-{YYYYMMDD-HHMMSS}-{PID}/

Key directories:

  • handoffs/ — Inter-phase YAML handoffs (one per phase transition)
  • research/ — Phase 2 agent outputs and convergence analysis
  • assessment/ — Phase 3 scored matrix, gap analysis, sensitivity analysis
  • roadmap/ — Phase 4 integration roadmap
  • review/ — Phase 5 adversarial review
  • final/ — Phase 6 polished deliverable

Session cleanup: Retained on success (for reference) and on failure (for resume). User may delete manually when no longer needed.

Timeout Configuration

See references/timeout-config.md for per-phase and mode-specific timeouts.

Key timeout defaults:

  • Per-agent (Phase 2): 30 minutes
  • brainstorming-pm: 45 minutes (longer due to internal multi-stage pipeline)
  • Global workflow: 6 hours

At the 5-hour mark, display a warning to the user. On global timeout during Phase 4 or later, deliver partial results (research + assessment are the most valuable artifacts) and preserve the session directory for potential resume.