pdf-skill
Total installs: 37
Weekly installs: 37
Site rank: #5631
Install command
npx skills add https://github.com/404kidwiz/claude-supercode-skills --skill pdf-skill
Installs by agent
- claude-code: 25
- opencode: 25
- codex: 20
- cursor: 18
- trae: 15
Skill documentation
PDF Skill
Purpose
Provides expertise in programmatic PDF generation, parsing, and manipulation. Specializes in creating PDFs from scratch, extracting content, merging/splitting documents, and handling forms using PDFKit, PDF.js, Puppeteer, and similar tools.
When to Use
- Generating PDFs programmatically
- Extracting text or data from PDFs
- Merging or splitting PDF documents
- Filling PDF forms programmatically
- Converting HTML to PDF
- Adding watermarks or annotations
- Parsing PDF structure and metadata
- Building PDF report generators
Quick Start
Invoke this skill when:
- Generating PDFs from code or data
- Extracting content from PDF files
- Merging, splitting, or manipulating PDFs
- Filling or creating PDF forms
- Converting HTML/web pages to PDF
Do NOT invoke when:
- Word document creation → use /docx-skill
- Excel/spreadsheet work → use /xlsx-skill
- PowerPoint creation → use /pptx-skill
- General file operations → use Bash or file tools
Decision Framework
PDF Operation?
├── Generate from scratch
│   ├── Simple → PDFKit (Node) / ReportLab (Python)
│   └── Complex layouts → Puppeteer/Playwright + HTML
├── Parse/Extract
│   ├── Text extraction → pdf-parse / PyPDF2
│   └── Table extraction → Camelot / Tabula
├── Manipulate
│   └── pdf-lib (merge, split, edit)
└── Forms
    └── pdf-lib (fill) / PDFtk (advanced)
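The tree above can also be read as a lookup table. The following is a minimal sketch of that idea; the names `TOOL_CHOICES` and `suggest_tools`, and the operation/variant labels used as keys, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table mirroring the decision tree. The keys are
# illustrative labels, not identifiers from any library.
TOOL_CHOICES: Dict[Tuple[str, str], List[str]] = {
    ("generate", "simple"): ["PDFKit (Node)", "ReportLab (Python)"],
    ("generate", "complex"): ["Puppeteer/Playwright + HTML"],
    ("parse", "text"): ["pdf-parse", "PyPDF2"],
    ("parse", "tables"): ["Camelot", "Tabula"],
    ("manipulate", "any"): ["pdf-lib"],
    ("forms", "fill"): ["pdf-lib"],
    ("forms", "advanced"): ["PDFtk"],
}


def suggest_tools(operation: str, variant: str = "any") -> List[str]:
    """Return the tools the decision tree lists for an operation/variant pair."""
    key = (operation.lower(), variant.lower())
    if key not in TOOL_CHOICES:
        known = ", ".join(f"{op}/{var}" for op, var in TOOL_CHOICES)
        raise ValueError(f"No entry for {operation}/{variant}; known: {known}")
    return TOOL_CHOICES[key]
```

For example, `suggest_tools("generate", "complex")` returns `['Puppeteer/Playwright + HTML']`, and an unknown pair raises `ValueError` rather than guessing.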
Core Workflows
1. PDF Generation with PDFKit
- Install PDFKit (npm install pdfkit)
- Create new PDFDocument
- Add content (text, images, graphics)
- Style with fonts and colors
- Add pages as needed
- Pipe to file or response
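PDFKit is a Node library; the same steps are sketched here in Python with ReportLab, the Python option the decision framework names. Roughly, ReportLab's `Canvas` plays the role of PDFKit's `PDFDocument`, `showPage()` corresponds to adding a page, and `save()` to writing the output. The helper names `layout_lines` and `build_pdf`, the margins, and the font size are assumptions made for this example. ReportLab is imported inside `build_pdf` so the layout helper can be used without the package installed.

```python
from typing import List, Tuple

PAGE_W, PAGE_H = 595.27, 841.89  # A4 in PostScript points
MARGIN = 72                      # one inch
LEADING = 16                     # distance between baselines


def layout_lines(
    lines: List[str],
    page_height: float = PAGE_H,
    margin: float = MARGIN,
    leading: float = LEADING,
) -> List[List[Tuple[float, str]]]:
    """Split lines into pages, returning (y, text) pairs for each page."""
    pages: List[List[Tuple[float, str]]] = [[]]
    y = page_height - margin
    for text in lines:
        if y < margin:  # no room left: start a new page
            pages.append([])
            y = page_height - margin
        pages[-1].append((y, text))
        y -= leading
    return pages


def build_pdf(path: str, title: str, lines: List[str]) -> None:
    """Write a simple text-only PDF to `path` (requires `pip install reportlab`)."""
    from reportlab.pdfgen import canvas  # third-party, imported lazily

    pdf = canvas.Canvas(path, pagesize=(PAGE_W, PAGE_H))
    pdf.setTitle(title)  # document metadata, not visible text
    for page in layout_lines(lines):
        # showPage() resets graphics state, so set font and color per page.
        pdf.setFont("Helvetica", 11)
        pdf.setFillColorRGB(0.1, 0.1, 0.1)
        for y, text in page:
            pdf.drawString(MARGIN, y, text)
        pdf.showPage()  # finish the current page
    pdf.save()          # write the file
```

A call such as `build_pdf("report.pdf", "Monthly report", ["First line", "Second line"])` would produce a one-page document. Helvetica is one of the standard PDF fonts; a custom font would have to be registered and embedded separately.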
2. HTML to PDF Conversion
- Set up Puppeteer/Playwright
- Navigate to HTML content or URL
- Configure page size and margins
- Set print options (headers, footers)
- Generate PDF buffer
- Save or stream result
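The steps above can be sketched with Playwright's Python API. This is a minimal sketch, assuming Playwright and its Chromium build are installed (`pip install playwright`, then `playwright install chromium`); the helper names `pdf_options` and `html_to_pdf`, the default margin, and the footer markup are illustrative choices, not part of the skill.

```python
from typing import Any, Dict


def pdf_options(page_format: str = "A4", margin_mm: int = 20,
                footer: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for Playwright's page.pdf()."""
    margin = f"{margin_mm}mm"
    options: Dict[str, Any] = {
        "format": page_format,
        "margin": {"top": margin, "right": margin,
                   "bottom": margin, "left": margin},
        "print_background": True,  # keep CSS backgrounds in the output
    }
    if footer:
        options["display_header_footer"] = True
        options["header_template"] = "<span></span>"  # empty header
        # pageNumber and totalPages are classes Chromium fills in.
        options["footer_template"] = (
            '<div style="font-size:9px;width:100%;text-align:center;">'
            '<span class="pageNumber"></span> / '
            '<span class="totalPages"></span></div>'
        )
    return options


def html_to_pdf(html: str, **kwargs: Any) -> bytes:
    """Render an HTML string to PDF bytes using headless Chromium."""
    from playwright.sync_api import sync_playwright  # third-party, lazy import

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.set_content(html, wait_until="load")
            return page.pdf(**pdf_options(**kwargs))
        finally:
            browser.close()
```

The returned bytes can be written to a file or streamed in an HTTP response. To render a URL instead of an HTML string, `page.goto(url)` would replace `page.set_content(...)`.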
3. PDF Parsing and Extraction
- Choose parser (pdf-parse, PyPDF2, pdfplumber)
- Load PDF file
- Extract text or structured data
- Handle multi-page documents
- Clean and normalize extracted text
- Output in desired format
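As one possible shape for this workflow, the sketch below extracts text page by page with PyPDF2 and applies a small cleanup pass. The function names `normalize_text` and `extract_pages` are hypothetical, and the cleanup rules are heuristics chosen for illustration: rejoining words split by a hyphen at a line break will also merge genuine hyphenated compounds that happen to wrap.

```python
import re
from typing import List


def normalize_text(raw: str) -> str:
    """Clean text extracted from a PDF page (heuristic, not lossless)."""
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words hyphenated across a line break: "implemen-\ntation".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim whitespace at the start and end of every line.
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Reduce three or more newlines to a single paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def extract_pages(path: str) -> List[str]:
    """Return normalized text for each page (requires `pip install PyPDF2`)."""
    from PyPDF2 import PdfReader  # third-party, imported lazily

    reader = PdfReader(path)
    # extract_text() can yield an empty result for image-only pages.
    return [normalize_text(page.extract_text() or "") for page in reader.pages]
```

Joining the result with `"\n\n".join(extract_pages("input.pdf"))` gives one string for the whole document. PyPDF2's successor package `pypdf` exposes the same `PdfReader` interface, so only the import line would change there.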
Best Practices
- Use vector graphics over raster when possible
- Embed fonts for consistent rendering
- Test PDF output across different readers
- Handle large PDFs with streaming
- Use appropriate library for task complexity
- Consider accessibility (tagged PDFs)
Anti-Patterns
| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Image-only PDFs | Not searchable/accessible | Use text with fonts |
| No font embedding | Rendering issues | Embed required fonts |
| Loading large PDFs entirely into memory | Crashes | Stream processing |
| Ignoring encryption | Security/access issues | Handle encrypted PDFs |
| Wrong tool for job | Over-engineering | Match tool to complexity |