pdf-skill
Install command
npx skills add https://github.com/neversight/skills_feed --skill pdf-skill
PDF Skill
Purpose
Provides expertise in programmatic PDF generation, parsing, and manipulation. Specializes in creating PDFs from scratch, extracting content, merging/splitting documents, and handling forms using PDFKit, PDF.js, Puppeteer, and similar tools.
When to Use
- Generating PDFs programmatically
- Extracting text or data from PDFs
- Merging or splitting PDF documents
- Filling PDF forms programmatically
- Converting HTML to PDF
- Adding watermarks or annotations
- Parsing PDF structure and metadata
- Building PDF report generators
Quick Start
Invoke this skill when:
- Generating PDFs from code or data
- Extracting content from PDF files
- Merging, splitting, or manipulating PDFs
- Filling or creating PDF forms
- Converting HTML/web pages to PDF
Do NOT invoke when:
- Word document creation → use /docx-skill
- Excel/spreadsheet work → use /xlsx-skill
- PowerPoint creation → use /pptx-skill
- General file operations → use Bash or file tools
Decision Framework
PDF Operation?
├── Generate from scratch
│   ├── Simple → PDFKit (Node) / ReportLab (Python)
│   └── Complex layouts → Puppeteer/Playwright + HTML
├── Parse/Extract
│   ├── Text extraction → pdf-parse / PyPDF2
│   └── Table extraction → Camelot / Tabula
├── Manipulate
│   └── pdf-lib (merge, split, edit)
└── Forms
    └── pdf-lib (fill) / PDFtk (advanced)
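The decision tree above can be read as a simple lookup. The sketch below is one hypothetical way to encode it in Python; the names `TOOL_CHOICES` and `suggest_tools`, and the `operation`/`detail` keys, are illustrative assumptions and not part of any library.

```python
# Hypothetical lookup table mirroring the decision tree above.
# Keys are (operation, detail); values are the tools the tree suggests.
TOOL_CHOICES = {
    ("generate", "simple"): ["PDFKit (Node)", "ReportLab (Python)"],
    ("generate", "complex"): ["Puppeteer/Playwright + HTML"],
    ("parse", "text"): ["pdf-parse", "PyPDF2"],
    ("parse", "table"): ["Camelot", "Tabula"],
    ("manipulate", "any"): ["pdf-lib"],
    ("forms", "fill"): ["pdf-lib"],
    ("forms", "advanced"): ["PDFtk"],
}


def suggest_tools(operation: str, detail: str = "any") -> list[str]:
    """Return the tools the decision tree lists for an operation."""
    key = (operation.lower(), detail.lower())
    try:
        return TOOL_CHOICES[key]
    except KeyError:
        raise ValueError(
            f"No suggestion for operation={operation!r}, detail={detail!r}"
        ) from None
```

For example, `suggest_tools("parse", "table")` returns the two table-extraction tools from the tree, and an unknown combination raises `ValueError` instead of guessing.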
Core Workflows
1. PDF Generation with PDFKit
- Install PDFKit (npm install pdfkit)
- Create a new PDFDocument
- Add content (text, images, graphics)
- Style with fonts and colors
- Add pages as needed
- Pipe to file or response
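PDFKit is a Node library, so the sketch below swaps in ReportLab, the Python option named in the decision framework, to illustrate the same steps: create a document, add styled text, add pages as needed, and write the file. The helper names `paginate` and `write_report`, the margins, and the 40-lines-per-page default are assumptions chosen for illustration.

```python
def paginate(lines, lines_per_page):
    """Split a list of text lines into per-page chunks."""
    if lines_per_page < 1:
        raise ValueError("lines_per_page must be at least 1")
    return [
        lines[i:i + lines_per_page]
        for i in range(0, len(lines), lines_per_page)
    ]


def write_report(path, title, lines, lines_per_page=40):
    """Write a simple multi-page text report with ReportLab."""
    # Imported lazily so paginate() works without ReportLab installed.
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    _, height = A4
    pdf = canvas.Canvas(str(path), pagesize=A4)
    pdf.setTitle(title)
    for page_lines in paginate(lines, lines_per_page):
        # Fonts are set on every page because showPage() resets them.
        pdf.setFont("Helvetica-Bold", 16)
        pdf.setFillColorRGB(0.1, 0.2, 0.5)
        pdf.drawString(72, height - 72, title)  # 72 pt = 1 inch margin
        pdf.setFont("Helvetica", 11)
        pdf.setFillColorRGB(0, 0, 0)
        y = height - 108
        for line in page_lines:
            pdf.drawString(72, y, line)
            y -= 16  # line spacing in points
        pdf.showPage()  # finish this page and start the next
    pdf.save()
```

Keeping pagination separate from drawing makes the page-break logic easy to check without producing a file.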
2. HTML to PDF Conversion
- Set up Puppeteer/Playwright
- Navigate to HTML content or URL
- Configure page size and margins
- Set print options (headers, footers)
- Generate PDF buffer
- Save or stream result
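As a rough illustration of the steps above, the following sketch uses Playwright's Python sync API to render an HTML string to a PDF buffer. The helper names `build_pdf_options` and `html_to_pdf` and the default values are assumptions; the option keys (`format`, `margin`, `print_background`, `display_header_footer`, `header_template`, `footer_template`) are those accepted by `page.pdf()`.

```python
def build_pdf_options(page_format="A4", margin_mm=20, footer_html=None):
    """Build keyword arguments for Playwright's page.pdf()."""
    margin = f"{margin_mm}mm"
    options = {
        "format": page_format,
        "margin": {
            "top": margin, "bottom": margin,
            "left": margin, "right": margin,
        },
        "print_background": True,  # keep CSS backgrounds in the output
    }
    if footer_html is not None:
        options["display_header_footer"] = True
        # An empty header suppresses Chromium's default header text.
        options["header_template"] = "<span></span>"
        options["footer_template"] = footer_html
    return options


def html_to_pdf(html, **option_overrides):
    """Render an HTML string to PDF bytes with headless Chromium."""
    # Imported lazily so build_pdf_options() works without Playwright.
    from playwright.sync_api import sync_playwright

    options = build_pdf_options(**option_overrides)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.set_content(html, wait_until="load")
            return page.pdf(**options)  # returns the PDF as bytes
        finally:
            browser.close()
```

A footer such as `<div style="font-size:9px; width:100%; text-align:center"><span class="pageNumber"></span> / <span class="totalPages"></span></div>` can be passed as `footer_html`. The explicit font size matters because Chromium's default size for these templates is very small.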
3. PDF Parsing and Extraction
- Choose parser (pdf-parse, PyPDF2, pdfplumber)
- Load PDF file
- Extract text or structured data
- Handle multi-page documents
- Clean and normalize extracted text
- Output in desired format
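The extraction steps above might look like the following with pdfplumber, one of the parsers listed. The function names `normalize_text` and `extract_pages` and the specific cleanup rules are illustrative assumptions. Real documents often need cleanup tuned to their layout.

```python
import re
import unicodedata


def normalize_text(raw):
    """Clean text extracted from a PDF page."""
    # NFKC folds compatibility characters, e.g. the "fi" ligature.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across line breaks. This is a heuristic
    # and will also join genuine compounds split at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces
    text = re.sub(r" ?\n ?", "\n", text)    # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)  # at most one blank line
    return text.strip()


def extract_pages(path):
    """Return one cleaned text string per page of the PDF."""
    # Imported lazily so normalize_text() works without pdfplumber.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        # extract_text() can return None for pages without a text layer.
        return [normalize_text(page.extract_text() or "") for page in pdf.pages]
```

Returning one string per page keeps multi-page handling explicit, so the caller can decide whether to join pages or process them individually.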
Best Practices
- Use vector graphics over raster when possible
- Embed fonts for consistent rendering
- Test PDF output across different readers
- Handle large PDFs with streaming
- Use appropriate library for task complexity
- Consider accessibility (tagged PDFs)
Anti-Patterns
| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Image-only PDFs | Not searchable/accessible | Use text with fonts |
| No font embedding | Rendering issues | Embed required fonts |
| Loading large PDFs fully into memory | Out-of-memory crashes | Stream processing |
| Ignoring encryption | Security/access issues | Handle encrypted PDFs |
| Wrong tool for job | Over-engineering | Match tool to complexity |
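Two rows of the table, stream processing and encryption handling, can be illustrated together. The sketch below reads a PDF in fixed-size chunks and looks for the `/Encrypt` key that an encrypted PDF carries in its trailer. It is only a heuristic pre-check under stated assumptions: the name `looks_encrypted` is hypothetical, and the literal bytes could also appear in uncompressed content, so a real pipeline should confirm with its parser (PyPDF2, for instance, exposes `PdfReader.is_encrypted` and `decrypt`).

```python
import io

MARKER = b"/Encrypt"


def looks_encrypted(stream, chunk_size=64 * 1024):
    """Heuristically detect an /Encrypt entry without loading the whole file."""
    tail = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False
        # Prepend the end of the previous chunk so a marker that
        # straddles a chunk boundary is still found.
        window = tail + chunk
        if MARKER in window:
            return True
        tail = window[-(len(MARKER) - 1):]


# Usage with an in-memory stream; a real file would be opened in "rb" mode.
sample = io.BytesIO(b"%PDF-1.7 ... trailer << /Encrypt 5 0 R >>")
flag = looks_encrypted(sample, chunk_size=16)
```

Memory use stays bounded by `chunk_size` regardless of file size, which is the point of the stream-processing row.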