doc2x-ocr-markdown

📁 jysd-ai/skills 📅 7 days ago

Site rank: #50337

Install command:

npx skills add https://github.com/jysd-ai/skills --skill doc2x-ocr-markdown

Installs by agent:

windsurf 1

amp 1

opencode 1

kimi-cli 1

codex 1

Convert a single PDF or image into Markdown and extract its image assets with one local script, scripts/doc2x_ocr.py.

Only one credential is required: the Doc2X API key in the DOC2X_APIKEY environment variable.

Set the API key:

export DOC2X_APIKEY='sk-...'

Run PDF OCR to Markdown + images:

python scripts/doc2x_ocr.py pdf ./input.pdf --outdir ./output

Run image OCR to Markdown + images:

python scripts/doc2x_ocr.py image ./page.png --outdir ./output
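The two commands above differ only in the subcommand, so a small wrapper can choose it from the file suffix. The following is a minimal sketch, not part of the skill itself: `build_command` and `run_ocr` are hypothetical helper names, and the set of image suffixes is an assumption (only `.png` appears in this document).

```python
import os
import subprocess
import sys
from pathlib import Path

SCRIPT = "scripts/doc2x_ocr.py"  # path as used in the commands above
# Assumption: which suffixes count as images; only .png is named in this document.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_command(input_path, outdir):
    """Return the argv list for one OCR run, choosing the subcommand by suffix."""
    suffix = Path(input_path).suffix.lower()
    if suffix == ".pdf":
        mode = "pdf"
    elif suffix in IMAGE_SUFFIXES:
        mode = "image"
    else:
        raise ValueError(f"unsupported input type: {suffix!r}")
    return [sys.executable, SCRIPT, mode, str(input_path), "--outdir", str(outdir)]


def run_ocr(input_path, outdir):
    """Run the script once and return its stdout (the JSON summary text)."""
    if not os.environ.get("DOC2X_APIKEY"):
        raise RuntimeError("DOC2X_APIKEY is not set")
    result = subprocess.run(
        build_command(input_path, outdir),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Calling `run_ocr("./input.pdf", "./output")` would run the same command as the PDF example above, after first checking that the API key is present in the environment.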

Use the asynchronous Doc2X PDF flow:

Useful options:

Use synchronous image layout OCR:

For input <name>.pdf or <name>.png, the script writes:

The script prints a JSON summary with:

Read these files when you need deeper context:

references/api-quick-reference.md for endpoint behavior and limits
references/implementation-notes.md for how this script relates to the copied official doc2x.py

Handle parse_task_limit_exceeded or parse_concurrency_limit by reducing the number of concurrent jobs and retrying later.
Split very large PDFs if parsing times out or a page-limit error occurs.
Keep the poll interval for status APIs between 1 and 3 seconds unless there is a strong reason to change it.
Save outputs promptly, because the official docs state that cloud parse results are temporary.
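The retry and polling advice above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual implementation: `Doc2XError`, `submit_with_retry`, and `poll_until_done` are hypothetical names, the `"success"` and `"failed"` status strings are placeholders, and the backoff delays are arbitrary. Only the two error codes and the 1 to 3 second poll interval come from the notes above.

```python
import time

# Error codes named in the notes above; treated here as "retry later" signals.
RETRYABLE = {"parse_task_limit_exceeded", "parse_concurrency_limit"}


class Doc2XError(Exception):
    """Hypothetical exception carrying an API error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def clamp_poll_interval(seconds, low=1.0, high=3.0):
    """Keep the poll interval inside the recommended 1 to 3 second range."""
    return max(low, min(high, seconds))


def submit_with_retry(submit, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call submit(); on a retryable error, wait with exponential backoff and retry."""
    for attempt in range(max_attempts):
        try:
            return submit()
        except Doc2XError as err:
            # Give up on non-retryable codes or after the final attempt.
            if err.code not in RETRYABLE or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def poll_until_done(get_status, interval=2.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it reports a terminal state or the timeout passes."""
    interval = clamp_poll_interval(interval)
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("success", "failed"):  # placeholder terminal states
            return status
        if clock() >= deadline:
            raise TimeoutError("parse did not finish in time")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the helpers easy to exercise without real delays. In practice `submit` and `get_status` would wrap whatever API calls the script makes.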