ebook-extractor
21
Total installs
12
Weekly installs
#17638
Site-wide rank
Install command
npx skills add https://github.com/ratacat/claude-skills --skill ebook-extractor
Installs by agent
claude-code
9
codex
7
gemini-cli
6
antigravity
6
opencode
6
trae
6
Skill documentation
Ebook Text Extractor
Overview
Extract plain text from EPUB, MOBI, and PDF files using Python scripts. No LLM calls – pure text extraction.
Supported Formats
| Format | Tool Used | Notes |
|---|---|---|
| EPUB | ebooklib + BeautifulSoup | Direct parsing, preserves structure |
| MOBI | Calibre ebook-convert | Converts to EPUB first, then extracts |
| PDF | PyMuPDF (fitz) | Fast, handles most PDFs well |
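
The MOBI row describes a two-step path: convert to EPUB with Calibre, then extract. A minimal sketch of the conversion step might look like the following. The function names `build_convert_command` and `convert_mobi_to_epub` are hypothetical and not taken from the skill's scripts; the only external assumption is Calibre's `ebook-convert input_file output_file` invocation.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(mobi_path: Path, epub_path: Path) -> list:
    """Build the argument list for Calibre's `ebook-convert INPUT OUTPUT`."""
    return ["ebook-convert", str(mobi_path), str(epub_path)]


def convert_mobi_to_epub(mobi_path: Path, out_dir: Path) -> Path:
    """Convert a MOBI file to EPUB in out_dir and return the new path.

    Hypothetical helper: the skill's real extract_mobi.py may differ.
    """
    # Fail early with a clear message if Calibre is not on PATH.
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; install Calibre first")
    epub_path = out_dir / (mobi_path.stem + ".epub")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(mobi_path, epub_path), check=True)
    return epub_path
```

The resulting EPUB could then be handed to whatever EPUB extraction path is in use.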
Usage
Unified extractor (auto-detects format):
python3 ~/.claude/skills/ebook-extractor/scripts/extract.py /path/to/book.epub
python3 ~/.claude/skills/ebook-extractor/scripts/extract.py /path/to/book.mobi
python3 ~/.claude/skills/ebook-extractor/scripts/extract.py /path/to/book.pdf
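
Auto-detection of this kind is commonly done by file extension. The sketch below assumes that approach; the `EXTRACTORS` mapping and `pick_extractor` function are illustrative names, and the actual `extract.py` may detect formats differently (for example, by inspecting file contents).

```python
from pathlib import Path

# Assumed mapping from file extension to the format-specific script.
# The script names come from the skill docs; the mapping itself is hypothetical.
EXTRACTORS = {
    ".epub": "extract_epub.py",
    ".mobi": "extract_mobi.py",
    ".pdf": "extract_pdf.py",
}


def pick_extractor(path: str) -> str:
    """Return the script name for a book file, based on its extension."""
    suffix = Path(path).suffix.lower()  # case-insensitive: .EPUB == .epub
    if suffix not in EXTRACTORS:
        raise ValueError(f"Unsupported format: {suffix or '(no extension)'}")
    return EXTRACTORS[suffix]
```

Extension matching is cheap and predictable, but it will misroute a file that has been renamed with the wrong suffix.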
Output options:
# To stdout (default)
python3 scripts/extract.py book.epub
# To file
python3 scripts/extract.py book.epub -o output.txt
python3 scripts/extract.py book.epub > output.txt
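
The stdout-by-default behaviour with an optional `-o` flag could be wired up roughly as follows. This is a sketch under assumptions: `build_parser` and `write_text` are hypothetical names, and the `--output` long form is not documented by the skill.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    """Parser with one positional input and an optional -o output path."""
    parser = argparse.ArgumentParser(description="Extract plain text from an ebook")
    parser.add_argument("input", help="path to an EPUB, MOBI, or PDF file")
    # --output is an assumed long form; the docs only show -o.
    parser.add_argument("-o", "--output", default=None, help="write text to this file")
    return parser


def write_text(text: str, output=None) -> None:
    """Write to the given file path, or to stdout when no path is given."""
    if output is None:
        sys.stdout.write(text)
    else:
        with open(output, "w", encoding="utf-8") as fh:
            fh.write(text)
```

Defaulting to stdout keeps the script composable with shell redirection and pipes, which is why both `-o output.txt` and `> output.txt` can work.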
Format-specific scripts:
python3 scripts/extract_epub.py book.epub
python3 scripts/extract_mobi.py book.mobi
python3 scripts/extract_pdf.py book.pdf
Setup
# One-command setup (installs all dependencies)
~/.claude/skills/ebook-extractor/setup.sh
# Or manually:
pip install -r ~/.claude/skills/ebook-extractor/requirements.txt
brew install calibre # macOS, for MOBI support
Script Location
~/.claude/skills/ebook-extractor/scripts/
Common Issues
| Problem | Solution |
|---|---|
| Missing package | Run setup.sh or pip install -r requirements.txt |
| MOBI fails | Ensure Calibre is installed: brew install calibre |
| PDF garbled | Some PDFs are image-based; OCR needed (not supported) |
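
The first two rows can be diagnosed before running an extraction. A minimal sketch, assuming the import names `ebooklib`, `bs4`, and `fitz` for the packages listed under Supported Formats; `missing_dependencies` is a hypothetical helper, not part of the skill.

```python
import importlib.util
import shutil

# Assumed top-level import names for ebooklib, BeautifulSoup, and PyMuPDF.
PYTHON_MODULES = ("ebooklib", "bs4", "fitz")


def missing_dependencies(modules=PYTHON_MODULES, commands=("ebook-convert",)):
    """Return the names of Python modules and CLI tools that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on PATH.
    missing += [c for c in commands if shutil.which(c) is None]
    return missing


if __name__ == "__main__":
    for name in missing_dependencies():
        print(f"missing: {name}")
```

An empty result means the Python packages import and Calibre's converter is on PATH; it says nothing about whether a given PDF is image-based.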