codebase-context-extractor

📁 lofcz/llmtornado 📅 11 days ago
Install command
npx skills add https://github.com/lofcz/llmtornado --skill codebase-context-extractor

Skill Documentation

Codebase Context Extractor Skill

Overview

This skill extracts context from large codebases. It analyzes code structure, dependencies, and relationships between components, and returns the context needed to understand, debug, or modify the code.

Trigger Words

  • “extract context”
  • “codebase context”
  • “code context”
  • “analyze codebase”
  • “codebase analysis”
  • “code structure”
  • “dependency analysis”
  • “code relationships”
  • “understand codebase”
  • “map codebase”

When to Use This Skill

Use this skill when you need to:

  • Understand the structure and organization of a large codebase
  • Extract relevant context for a specific function, class, or module
  • Analyze dependencies and relationships between code components
  • Generate documentation or summaries of code sections
  • Prepare context for code modifications or debugging
  • Identify entry points and execution flows
  • Map out API surfaces and public interfaces
  • Understand data flow and state management

Instructions

When this skill is triggered, run the context_extractor.py script with the parameters that match the request.

Basic Usage

python /projects/workspace/codebase-context-extractor/context_extractor.py \
  --target-path <path_to_codebase> \
  --mode <extraction_mode> \
  --output <output_file>
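
When the skill is driven from another Python process, the command above can be assembled and run with the standard library. This is a minimal sketch: `build_command` and `run_extractor` are hypothetical helper names, and the script path is simply the one shown above, so adjust it to your installation.

```python
import subprocess
import sys
from typing import List, Optional

# Path copied from the usage line above; assumed to exist on your machine.
SCRIPT = "/projects/workspace/codebase-context-extractor/context_extractor.py"


def build_command(target_path: str, mode: str, output: Optional[str] = None) -> List[str]:
    """Assemble the argument list for one extractor run."""
    cmd = [sys.executable, SCRIPT, "--target-path", target_path, "--mode", mode]
    if output is not None:
        # --output is optional; without it the script is documented to write to stdout.
        cmd += ["--output", output]
    return cmd


def run_extractor(target_path: str, mode: str,
                  output: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the script and capture its output; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(
        build_command(target_path, mode, output),
        check=True,
        capture_output=True,
        text=True,
    )
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.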

Extraction Modes

  1. full – Complete codebase analysis with all components
  2. targeted – Focus on specific files, functions, or classes
  3. dependency – Map dependencies and imports
  4. flow – Trace execution flows and call chains
  5. api – Extract public interfaces and API surfaces
  6. data – Analyze data structures and models
  7. hierarchy – Show class hierarchies and inheritance
  8. summary – Generate high-level overview

Parameters

  • --target-path (required): Path to the codebase to analyze
  • --mode (required): Extraction mode (see above)
  • --output (optional): Output file path (default: stdout)
  • --focus (optional): Specific file, class, or function to focus on
  • --depth (optional): Maximum depth for traversal (default: unlimited)
  • --include-tests (optional): Include test files in analysis (default: false)
  • --language (optional): Programming language (auto-detected if not specified)
  • --format (optional): Output format (markdown, json, yaml, text) (default: markdown)
  • --exclude (optional): Patterns to exclude (comma-separated)
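
The parameter list above maps naturally onto an `argparse` parser. The sketch below is an assumption about how such a command line could be declared, not the actual parser inside context_extractor.py. In particular, treating `--include-tests` as a boolean switch and `--depth` as an integer are guesses based on the documented defaults.

```python
import argparse
from typing import List

MODES = ["full", "targeted", "dependency", "flow", "api", "data", "hierarchy", "summary"]
FORMATS = ["markdown", "json", "yaml", "text"]


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(prog="context_extractor.py")
    parser.add_argument("--target-path", required=True, help="path to the codebase")
    parser.add_argument("--mode", required=True, choices=MODES, help="extraction mode")
    parser.add_argument("--output", default=None, help="output file (None means stdout)")
    parser.add_argument("--focus", default=None, help="file, class, or function to focus on")
    parser.add_argument("--depth", type=int, default=None, help="max depth (None means unlimited)")
    parser.add_argument("--include-tests", action="store_true", help="include test files")
    parser.add_argument("--language", default=None, help="language (None means auto-detect)")
    parser.add_argument("--format", choices=FORMATS, default="markdown", help="output format")
    parser.add_argument("--exclude", default="", help="comma-separated exclusion patterns")
    return parser


def parse_excludes(raw: str) -> List[str]:
    """Split the comma-separated --exclude value, dropping empty entries."""
    return [pattern.strip() for pattern in raw.split(",") if pattern.strip()]
```

For example, `build_parser().parse_args(["--target-path", ".", "--mode", "flow", "--depth", "5"])` yields a namespace with `depth` set to the integer 5 and `format` left at `"markdown"`.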

Examples

  1. Full codebase analysis:
python context_extractor.py --target-path ./my-project --mode full --output context.md
  2. Targeted analysis of a specific class:
python context_extractor.py --target-path ./my-project --mode targeted --focus "UserService" --output user_service_context.md
  3. Dependency mapping:
python context_extractor.py --target-path ./my-project --mode dependency --format json --output dependencies.json
  4. Execution flow analysis:
python context_extractor.py --target-path ./my-project --mode flow --focus "main" --depth 5

Output Structure

The extractor generates structured output including:

For Full/Targeted Mode

  • Project Overview: Language, structure, entry points
  • File Organization: Directory structure and file purposes
  • Key Components: Important classes, functions, modules
  • Dependencies: External and internal dependencies
  • Code Metrics: Lines of code, complexity estimates
  • Context Summary: High-level understanding
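
To make the "Code Metrics" item concrete, here is one way line counts and a rough complexity estimate could be computed for a Python file. It is an illustrative sketch only: the function name `code_metrics`, the choice of branch nodes, and the "1 + number of branches" estimate are assumptions, not the extractor's documented method.

```python
import ast
from typing import Dict

# Node types counted as decision points (an assumed, simplified selection).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def code_metrics(source: str) -> Dict[str, int]:
    """Count lines and estimate complexity as 1 + number of branch nodes."""
    lines = source.splitlines()
    # A code line is any non-blank line that is not a pure comment.
    code_lines = [ln for ln in lines if ln.strip() and not ln.strip().startswith("#")]
    branches = sum(isinstance(node, BRANCH_NODES) for node in ast.walk(ast.parse(source)))
    return {
        "total_lines": len(lines),
        "code_lines": len(code_lines),
        "complexity": 1 + branches,
    }
```

The complexity figure is deliberately coarse. It is meant for ranking files against each other, not as an exact cyclomatic number.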

For Dependency Mode

  • Dependency Graph: Visual representation of dependencies
  • Import Analysis: All imports and their usage
  • Circular Dependencies: Detection and reporting
  • Unused Dependencies: Potential cleanup targets
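
Circular dependency detection is commonly done with a depth-first search over the import graph. The sketch below assumes the graph is already available as a dictionary from module name to the modules it imports; `find_cycles` is a hypothetical helper, and the extractor's own detection method is not documented here.

```python
from typing import Dict, List, Set


def find_cycles(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return one cycle for every back edge met during a depth-first search."""
    cycles: List[List[str]] = []
    done: Set[str] = set()       # nodes whose subtree is fully explored
    stack: List[str] = []        # current DFS path, in order
    on_stack: Set[str] = set()   # same nodes as `stack`, for fast lookup

    def visit(node: str) -> None:
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, []):
            if dep in on_stack:
                # Back edge: the cycle is the path from `dep` to here, closed with `dep`.
                cycles.append(stack[stack.index(dep):] + [dep])
            elif dep not in done:
                visit(dep)
        stack.pop()
        on_stack.discard(node)
        done.add(node)

    for node in graph:
        if node not in done:
            visit(node)
    return cycles
```

For very deep graphs the recursion would need to be replaced with an explicit stack to stay under Python's recursion limit.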

For Flow Mode

  • Call Chains: Function call sequences
  • Entry Points: Main execution paths
  • Exit Points: Return and error handling
  • Branch Analysis: Conditional execution paths
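
For Python sources, call chains can be approximated with the standard `ast` module. The following sketch handles only top-level functions and direct calls by plain name (no methods, no dynamic dispatch). `build_call_graph` and `call_chains` are hypothetical names, and the `depth` argument only mirrors the idea behind the `--depth` parameter.

```python
import ast
from typing import Dict, List, Optional


def build_call_graph(source: str) -> Dict[str, List[str]]:
    """Map each top-level function to the sorted plain names it calls."""
    graph: Dict[str, List[str]] = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            called = {
                child.func.id
                for child in ast.walk(node)
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name)
            }
            graph[node.name] = sorted(called)
    return graph


def call_chains(graph: Dict[str, List[str]], entry: str,
                depth: Optional[int] = None) -> List[List[str]]:
    """List chains starting at `entry`, stopping at leaves, recursion, or the depth limit."""
    chains: List[List[str]] = []

    def walk(chain: List[str]) -> None:
        # Follow only functions defined in this file and not already on the chain.
        callees = [c for c in graph.get(chain[-1], []) if c in graph and c not in chain]
        at_limit = depth is not None and len(chain) - 1 >= depth
        if not callees or at_limit:
            chains.append(chain)
            return
        for callee in callees:
            walk(chain + [callee])

    walk([entry])
    return chains
```

Calls to names that are not defined in the analyzed source, such as built-ins, are ignored, so each chain ends at the last locally defined function.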

For API Mode

  • Public Interfaces: Exported functions and classes
  • API Documentation: Signatures and docstrings
  • Usage Examples: How to use the API
  • Versioning Info: API version and compatibility

Advanced Features

Smart Context Window Management

The extractor automatically manages context size to fit within LLM token limits:

  • Prioritizes most relevant code sections
  • Provides summaries for less critical parts
  • Includes breadcrumb navigation for context
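
The three behaviors above can be sketched as a greedy packing step: sort sections by relevance, include the full text while it fits, fall back to the summary when it does not, and label each piece with its origin. Everything in this example is an assumption made for illustration, including the `Section` type, the relevance scores, and the four-characters-per-token estimate. The extractor's actual strategy is not documented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    name: str         # breadcrumb label, e.g. "src/users/service.py"
    text: str         # full source text
    summary: str      # short replacement used when the full text does not fit
    relevance: float  # higher means more relevant (how it is scored is out of scope)


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token."""
    return max(1, len(text) // 4)


def pack_context(sections: List[Section], budget: int) -> List[str]:
    """Greedily fill the token budget, most relevant sections first."""
    parts: List[str] = []
    remaining = budget
    for section in sorted(sections, key=lambda s: s.relevance, reverse=True):
        # Try the full text first, then fall back to the summary.
        for body in (section.text, section.summary):
            cost = estimate_tokens(body)
            if cost <= remaining:
                parts.append(f"[{section.name}]\n{body}")
                remaining -= cost
                break
    return parts
```

A section whose summary also exceeds the remaining budget is dropped entirely, which keeps the result within the limit at the cost of completeness.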

Multi-Language Support

Supports analysis of:

  • Python
  • JavaScript/TypeScript
  • Java
  • C#
  • Go
  • Rust
  • C/C++
  • Ruby
  • PHP
  • And more (extensible)
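
When `--language` is omitted, the language is auto-detected. One simple way to do that is to count file extensions, as in the sketch below. The extension table and the `detect_language` helper are assumptions for illustration; the extractor may use a different detection method.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, Optional

# Assumed mapping from file extension to language name.
EXTENSIONS = {
    ".py": "Python", ".js": "JavaScript", ".ts": "TypeScript", ".java": "Java",
    ".cs": "C#", ".go": "Go", ".rs": "Rust", ".c": "C", ".h": "C",
    ".cpp": "C++", ".hpp": "C++", ".rb": "Ruby", ".php": "PHP",
}


def detect_language(paths: Iterable[str]) -> Optional[str]:
    """Return the language with the most recognized files, or None if there are none."""
    counts = Counter(
        EXTENSIONS[suffix]
        for suffix in (Path(p).suffix.lower() for p in paths)
        if suffix in EXTENSIONS
    )
    return counts.most_common(1)[0][0] if counts else None
```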

Intelligent Filtering

  • Excludes generated code, build artifacts, and vendor directories
  • Focuses on business logic and core functionality
  • Configurable exclusion patterns
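
Exclusion patterns of this kind are often implemented with shell-style globs. The sketch below matches each path component against a list of patterns using the standard `fnmatch` module. The default pattern list and the helper names are assumptions, and the comma-separated `extra` argument only mirrors the documented `--exclude` format.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Iterable, List

# Assumed defaults covering typical vendor, build, and generated files.
DEFAULT_EXCLUDES = [
    "node_modules", "vendor", "dist", "build", "__pycache__",
    "*.min.js", "*.generated.*",
]


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """True if any component of the path matches any glob pattern."""
    parts = PurePosixPath(path).parts
    patterns = list(patterns)
    return any(fnmatch(part, pattern) for part in parts for pattern in patterns)


def filter_paths(paths: Iterable[str], extra: str = "") -> List[str]:
    """Drop excluded paths; `extra` is a comma-separated list of additional patterns."""
    patterns = DEFAULT_EXCLUDES + [p.strip() for p in extra.split(",") if p.strip()]
    return [p for p in paths if not is_excluded(p, patterns)]
```

Matching per component means a pattern such as `vendor` excludes the directory at any depth, not only at the project root.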

Integration with Other Tools

The context extractor output can be used with:

  • Documentation generators
  • Code review tools
  • Refactoring assistants
  • Bug tracking systems
  • Development environments

Best Practices

  1. Start with Summary Mode: Get a high-level overview before diving deep
  2. Use Targeted Mode for Specific Tasks: Focus on relevant code sections
  3. Combine with Dependency Analysis: Understand impact of changes
  4. Leverage Flow Analysis for Debugging: Trace execution paths
  5. Update Regularly: Re-run the analysis as the codebase evolves

Notes

  • Analysis time grows with codebase size, so large codebases may take a while to process
  • For very large projects, consider limiting traversal with --depth
  • JSON output is best for programmatic processing
  • Markdown output is best for human reading
  • The tool respects .gitignore patterns by default