convert-doc

📁 carlheath/ogmios 📅 Jan 24, 2026

Install command

npx skills add https://github.com/carlheath/ogmios --skill convert-doc

Installs by agent

claude-code 5

gemini-cli 5

antigravity 5

codex 4

opencode 4

Skill documentation

Smart Document Pipeline

Quick Reference

# Convert document (auto-caches; auto-summarizes if the converted output is >100KB)
python ~/.claude/lib/document-converter.py "/path/to/file.pdf"

# Force regenerate
python ~/.claude/lib/document-converter.py "/path/to/file.pdf" --force

# List cached documents
python ~/.claude/lib/document-converter.py --list

# Cleanup old cache (>1 week)
python ~/.claude/lib/document-converter.py --cleanup

Supported Formats

Format	Extension	Tool	Notes
PDF	.pdf	PyMuPDF	Text extraction, page-by-page
Word	.docx, .doc	pandoc/python-docx	Full markdown
PowerPoint	.pptx, .ppt	python-pptx	Slide-by-slide with notes
Excel	.xlsx, .xls	openpyxl	Tables as markdown
RTF	.rtf	pandoc	Rich text

Output Structure

{
  "cache_path": "/path/to/cached/file.md",
  "summary_path": "/path/to/cached/file_summary.md",  // if >100KB
  "from_cache": false,
  "original_size": 26744198,
  "converted_size": 129844,
  "summary_size": 30638,
  "savings_percent": 99.5,
  "recommendation": "summary"  // "summary" or "full"
}
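A minimal sketch of how a caller might act on this result. It assumes the JSON has already been parsed into a dict; the helper names `pick_markdown_path` and `savings_percent` are hypothetical and are not part of the converter. The savings formula shown is one plausible reading: it reproduces the 99.5 in the sample above when applied to `original_size` and `converted_size`.

```python
def pick_markdown_path(result: dict) -> str:
    """Return the file to read, following the converter's recommendation."""
    # Fall back to the full version if no summary was produced.
    if result.get("recommendation") == "summary" and result.get("summary_path"):
        return result["summary_path"]
    return result["cache_path"]


def savings_percent(original_size: int, converted_size: int) -> float:
    """Share of bytes saved by reading the converted file instead of the original."""
    if original_size <= 0:
        return 0.0
    return round((1 - converted_size / original_size) * 100, 1)


sample = {
    "cache_path": "/path/to/cached/file.md",
    "summary_path": "/path/to/cached/file_summary.md",
    "original_size": 26744198,
    "converted_size": 129844,
    "recommendation": "summary",
}
print(pick_markdown_path(sample))
print(savings_percent(sample["original_size"], sample["converted_size"]))
```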

Auto-Summary

Documents whose converted output exceeds 100KB automatically get a summary version (in the results below, only the 127 KB conversion has one):

Version	Purpose	Size Target
Full	Complete content	As converted
Summary	Quick overview	~30KB

The summary preserves:

All headers and structure
First portion of each section
Metadata and source reference
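The three rules above could be approximated as follows. This is a sketch under stated assumptions, not the converter's actual summarizer: it assumes the converted file uses Markdown `#` headers, treats "first portion" as a fixed number of non-empty lines per section, and the function name `summarize_markdown` and its parameters are hypothetical.

```python
def summarize_markdown(text: str, source: str = "", lines_per_section: int = 5) -> str:
    """Keep every header plus the first few non-empty lines that follow it."""
    # Optional source reference at the top (assumed format).
    kept = [f"Source: {source}"] if source else []
    remaining = lines_per_section  # budget for any text before the first header
    for line in text.splitlines():
        if line.startswith("#"):
            kept.append(line)               # headers are always preserved
            remaining = lines_per_section   # reset the budget for the new section
        elif line.strip() and remaining > 0:
            kept.append(line)
            remaining -= 1
    return "\n".join(kept)


doc = "# A\none\ntwo\nthree\n## B\nx\n"
print(summarize_markdown(doc, source="file.pdf", lines_per_section=2))
```

Note that this sketch does not enforce the ~30KB target, and a `#` at the start of a line inside a code block would be mistaken for a header.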

Automatic Integration

The smart-read-interceptor hook automatically triggers when you read:

PDF, Word, PowerPoint, Excel files
Any file >200KB

It will suggest:

Use summary – If summary exists (best for overview)
Use cache – If full cached version exists
Convert first – If no cache exists
Delegate – For very large files, use a subagent
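The trigger conditions and suggestions above can be sketched as a small decision function. Everything here is illustrative: the names are hypothetical, the extension set is taken from the Supported Formats table, the priority order simply follows the list, and the cutoff for "very large" is an assumption because the source does not define it.

```python
from pathlib import Path

DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls"}
LARGE_FILE_BYTES = 200 * 1024        # the ">200KB" trigger from the text
DELEGATE_BYTES = 10 * 1024 * 1024    # assumption: "very large" is not defined in the source


def should_intercept(path: str, size_bytes: int) -> bool:
    """True for document formats or for any file above the size trigger."""
    is_document = Path(path).suffix.lower() in DOCUMENT_EXTENSIONS
    return is_document or size_bytes > LARGE_FILE_BYTES


def suggest_action(size_bytes: int, has_summary: bool, has_cache: bool) -> str:
    """Pick one suggestion, checking the cheapest option first."""
    if has_summary:
        return "use summary"
    if has_cache:
        return "use cache"
    if size_bytes > DELEGATE_BYTES:
        return "delegate"
    return "convert first"


print(should_intercept("report.PDF", 1_000))
print(suggest_action(50_000, has_summary=False, has_cache=True))
```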

Subagent Delegation Pattern

For very large documents, delegate the read to a subagent running in its own isolated context:

Task(
  subagent_type="Explore",
  prompt="Read and summarize key points from: /path/to/large-file.pdf.
         Focus on: [specific topics]. Max 500 words summary."
)

This keeps the large content out of the main context; only the subagent's short summary comes back.

Cache Location

~/.claude/cache/documents/
├── filename_hash.md           # Full converted version
├── filename_hash_summary.md   # Summary (if >100KB)
└── ...

Cache expires after 1 week. Run --cleanup to remove old files.
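To illustrate the `filename_hash.md` naming and the one-week expiry, here is a hedged sketch. The source does not say what is hashed or which algorithm is used, so hashing the resolved path with SHA-256 and truncating to 12 characters are assumptions, as are the helper names `cache_path_for` and `is_expired`.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional

CACHE_DIR = Path.home() / ".claude" / "cache" / "documents"
MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # one week


def cache_path_for(source: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Build a stable cache file name of the form <stem>_<hash>.md."""
    p = Path(source)
    # Assumption: the hash covers the absolute path, not the file contents.
    digest = hashlib.sha256(str(p.resolve()).encode("utf-8")).hexdigest()[:12]
    return cache_dir / f"{p.stem}_{digest}.md"


def is_expired(mtime: float, now: Optional[float] = None) -> bool:
    """True if a cache entry last modified at `mtime` is older than one week."""
    now = time.time() if now is None else now
    return now - mtime > MAX_AGE_SECONDS


print(cache_path_for("/path/to/file.pdf").name)
print(is_expired(mtime=0.0, now=MAX_AGE_SECONDS + 1.0))
```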

Real-World Results

Document	Original	Converted	Summary	Savings
Google AI Guide (PDF)	26.7 MB	127 KB	30 KB	99.9%
Debatt (Word)	206 KB	5.4 KB	–	97%
Övning (PowerPoint)	7.2 MB	3.1 KB	–	99.96%

Workflow Examples

Reading a PDF for research

1. User asks to analyze a PDF
2. Hook detects: "DOCUMENT FILE: .PDF"
3. Convert: python ~/.claude/lib/document-converter.py "file.pdf"
4. Read the summary for overview
5. Read specific sections from full version if needed

Processing multiple documents

1. Convert all documents first (batch):
   for f in *.pdf; do python ~/.claude/lib/document-converter.py "$f"; done

2. Read summaries in main context
3. Delegate deep analysis to subagents
