pdf-reader
8
总安装量
3
周安装量
#34512
全站排名
安装命令
npx skills add https://github.com/childbamboo/claude-code-marketplace-sample --skill pdf-reader
Installs by agent
claude-code: 3
cursor: 3
opencode: 2
openhands: 1
zencoder: 1
Skill documentation
PDF Reader
A skill that extracts text from PDF files and converts it to Markdown.
Quick Start
Basic usage
# Run the Python script in the WSL environment
wsl python3 scripts/read_pdf.py "/mnt/c/path/to/file.pdf"
Saving as Markdown
- Extract the text with the script
- Save it to a .md file with the Write tool
Prerequisites
The pdfplumber package is required:
wsl pip3 install pdfplumber
Usage examples
Example 1: Read a PDF file and display its contents
User: "Read C:\Users\keita\repos\guideline.pdf"
Assistant:
1. Convert the Windows path to a WSL path: /mnt/c/Users/keita/repos/guideline.pdf
2. Run wsl python3 scripts/read_pdf.py
3. Display the extracted text in Markdown format
Example 2: Convert a PDF to Markdown and save it
User: "Convert ガイドライン.pdf to Markdown and save it"
Assistant:
1. Extract the text with scripts/read_pdf.py
2. Structure it as Markdown (a heading for each page, tables included)
3. Save it to ガイドライン.md with the Write tool
4. Report that the save is complete
Workflow
Reading a single file
- The user specifies the path to a PDF file
- Convert the Windows path to WSL path format (C:\ → /mnt/c/)
- Run wsl python3 scripts/read_pdf.py
- Display or save the extracted text in Markdown format
Batch processing of multiple files
- Find .pdf files with Glob
- Run the script on each file
- Report the combined results
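The batch steps above can be sketched in Python as follows. This is only an illustration: the helper names `find_pdfs` and `convert_all` are hypothetical, and the sketch assumes that scripts/read_pdf.py prints its Markdown to standard output and exits with a non-zero status on failure, which the skill documentation does not state.

```python
import subprocess
import sys
from pathlib import Path


def find_pdfs(root):
    """Return every .pdf file under root, sorted for a stable order."""
    return sorted(p for p in Path(root).rglob("*.pdf") if p.is_file())


def convert_all(root, script="scripts/read_pdf.py"):
    """Run the script once per PDF and map each path to its output or error.

    Assumption: the script writes Markdown to stdout and returns a
    non-zero exit code when it fails.
    """
    results = {}
    for pdf in find_pdfs(root):
        proc = subprocess.run(
            [sys.executable, script, str(pdf)],
            capture_output=True,
            text=True,
        )
        if proc.returncode == 0:
            results[pdf] = proc.stdout
        else:
            results[pdf] = f"ERROR: {proc.stderr.strip()}"
    return results
```

Collecting the results in a dictionary keeps one failing file from stopping the rest of the batch, so the final report can list successes and errors together.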
Output format
Markdown structure
# [PDF file name]
**Total Pages:** 10
---
## Page 1
[Text content of page 1]
### Tables
**Table 1:**
| Column 1 | Column 2 | Column 3 |
| --- | --- | --- |
| Data 1 | Data 2 | Data 3 |
---
## Page 2
[Text content of page 2]
---
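A rough sketch of how page data could be rendered into the structure shown above. The function names `table_to_markdown` and `render_markdown`, the blank-line spacing, and the input shape (a list of `(text, tables)` tuples, with each table a list of rows) are assumptions made for illustration; the real script may assemble its output differently.

```python
def table_to_markdown(rows):
    """Render a list of rows (first row is the header) as a Markdown table."""
    def cell(value):
        # Empty cells may be None; newlines and pipes would break the table.
        return str(value or "").replace("\n", " ").replace("|", "\\|")

    header, *body = rows
    lines = [
        "| " + " | ".join(cell(c) for c in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(cell(c) for c in row) + " |" for row in body]
    return "\n".join(lines)


def render_markdown(title, pages):
    """Build the document; pages is a list of (text, tables), one per page."""
    out = [f"# {title}", "", f"**Total Pages:** {len(pages)}", "", "---", ""]
    for number, (text, tables) in enumerate(pages, start=1):
        out += [f"## Page {number}", "", text, ""]
        if tables:
            out += ["### Tables", ""]
            for index, rows in enumerate(tables, start=1):
                out += [f"**Table {index}:**", "", table_to_markdown(rows), ""]
        out += ["---", ""]
    return "\n".join(out)
```

Pages without tables skip the Tables heading entirely, which matches the Page 2 entry in the sample.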
Script details
The Python script is located at scripts/read_pdf.py.
Main features:
- Per-page text extraction
- Conversion of tables to Markdown
- Structuring of multi-page documents
- Error handling
Usage:
python scripts/read_pdf.py <file_path>
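For orientation, here is a minimal sketch of how a script with these features might be laid out. It is not the actual scripts/read_pdf.py: the function names `extract_pages` and `main`, the exit codes, and the messages are hypothetical. The pdfplumber calls used (`pdfplumber.open`, `pdf.pages`, `page.extract_text()`, `page.extract_tables()`) are part of that library's documented interface.

```python
import sys
from pathlib import Path


def extract_pages(pdf_path):
    """Yield (page_number, text, tables) for every page of the PDF."""
    # Third-party dependency, imported lazily so that a missing file
    # is reported before a missing package.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            # Guard against a None result for pages without a text layer.
            yield number, page.extract_text() or "", page.extract_tables()


def main(argv):
    """Return 0 on success, 1 on a read error, 2 on wrong usage (assumed codes)."""
    if len(argv) != 2:
        print("usage: python scripts/read_pdf.py <file_path>", file=sys.stderr)
        return 2
    path = Path(argv[1])
    if not path.is_file():
        print(f"error: file not found: {path}", file=sys.stderr)
        return 1
    try:
        for number, text, tables in extract_pages(path):
            print(f"## Page {number}\n\n{text}\n")
            if tables:
                print(f"({len(tables)} table(s) found on this page)\n")
    except Exception as exc:  # includes a missing pdfplumber installation
        print(f"error: could not read {path}: {exc}", file=sys.stderr)
        return 1
    return 0
```

A real entry point would end with `sys.exit(main(sys.argv))` so that the shell sees the exit code.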
Supported features
- ✅ Text extraction (all pages)
- ✅ Conversion of tables to Markdown
- ✅ Page numbers preserved
- ✅ Structured output
- ⚠️ Text extraction from images (OCR not supported)
- ⚠️ Complex layouts are simplified
Limitations
- Text cannot be extracted from scanned (image-only) PDFs
- OCR is not included
- Complex layouts are simplified
- Styling such as font information and color is lost
- Embedded objects are not extracted
Troubleshooting
pdfplumber is not installed
wsl pip3 install pdfplumber
No text is extracted
- The PDF may be a scanned image (OCR is required)
- The PDF may be encrypted
- The PDF may have no text layer
Garbled characters
# Check Japanese language support
wsl locale
# Confirm that the output includes UTF-8
Out-of-memory errors
For large PDF files, consider splitting the work and processing it page by page.
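One way to split the work is to walk the document in fixed-size page ranges. The helper below is a hypothetical sketch (the name `page_chunks` and the default chunk size of 20 are arbitrary choices, not part of the skill); each pair it yields can be used as a slice such as `pdf.pages[start:stop]`.

```python
def page_chunks(total_pages, chunk_size=20):
    """Yield (start, stop) index pairs that together cover range(total_pages)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, total_pages, chunk_size):
        yield start, min(start + chunk_size, total_pages)


print(list(page_chunks(45, 20)))  # [(0, 20), (20, 40), (40, 45)]
```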
Path conversion
Converting Windows paths to WSL paths:
- C:\Users\... → /mnt/c/Users/...
- D:\Projects\... → /mnt/d/Projects/...
- Backslashes \ are converted to forward slashes /
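The conversion rules above can be expressed as a small Python helper. This is a sketch under the assumption that drives are mounted at /mnt/<drive letter>, as in the examples; the function name `windows_to_wsl_path` is made up for illustration.

```python
import re


def windows_to_wsl_path(path: str) -> str:
    """Convert a Windows drive-letter path to its /mnt/<drive>/ form."""
    match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
    if not match:
        # Not a drive-letter path: only normalise the separators.
        return path.replace("\\", "/")
    drive, rest = match.groups()
    return f"/mnt/{drive.lower()}/" + rest.replace("\\", "/")


print(windows_to_wsl_path(r"C:\Users\keita\repos\guideline.pdf"))
# /mnt/c/Users/keita/repos/guideline.pdf
```

Inside a WSL shell, the `wslpath` utility performs the same conversion without custom code.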
Related tools
- PyPDF2: a lightweight alternative library
- pdfminer.six: for cases that need finer-grained control
- Camelot: specialized in table extraction
- OCRmyPDF: applies OCR to scanned PDFs
Advanced usage
Extracting specific pages only
You can modify the script to use a slice such as pdf.pages[0:5].
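As an illustration of the slicing approach, the hypothetical helper below turns a 1-based, inclusive page range such as "1-5" into the equivalent Python slice. The name `page_slice` and the range syntax are assumptions; the script itself does not define such an option.

```python
def page_slice(spec):
    """Turn a 1-based, inclusive range such as "1-5" or "3" into a slice."""
    first, _, last = spec.partition("-")
    start = int(first) - 1
    stop = int(last) if last else start + 1
    if start < 0 or stop <= start:
        raise ValueError(f"invalid page range: {spec!r}")
    return slice(start, stop)


pages = ["p1", "p2", "p3", "p4", "p5", "p6"]  # stand-in for pdf.pages
print(pages[page_slice("1-5")])  # ['p1', 'p2', 'p3', 'p4', 'p5']
print(pages[page_slice("3")])    # ['p3']
```

Because `pdf.pages` is an ordinary list, `pdf.pages[page_slice("1-5")]` selects the same pages as `pdf.pages[0:5]`.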
Extracting tables only
Use only the extract_tables() part of the script.
When OCR is needed
Use pytesseract together with pdf2image (creating this as a separate skill is recommended).
Version history
- v1.0.0 (2026-01-06): Initial release
  - Basic text extraction
  - Table-to-Markdown conversion
  - Runs in the WSL environment
  - Per-page structuring