doc-to-markdown
Install command
npx skills add https://github.com/sheepmao/doc-to-markdown-skill --skill doc-to-markdown
Doc-to-Markdown (Word → Markdown)
Convert Microsoft Word .doc / .docx into:
- a clean Markdown file (.md)
- plus an optional images folder (*_images/) with relative image links
This is designed to keep Markdown small (good for humans + LLMs) while preserving diagrams.
Quickstart (copy/paste)
# 1) Convert a single file (.docx or .doc)
python3 convert_word_to_markdown.py "path/to/document.docx"
# 2) Embedded mode (single self-contained .md, very large)
python3 convert_word_to_markdown.py --embedded "path/to/document.docx"
# 3) If anything fails, run a dependency check
python3 convert_word_to_markdown.py --check
Batch convert (current folder)
for f in *.doc *.docx; do
  [ -e "$f" ] || continue
  python3 convert_word_to_markdown.py "$f"
done
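The same batch loop can be sketched in Python when you want a list of failures at the end instead of stopping at the first error. This is a minimal sketch, not part of the skill: `word_files` and `convert_all` are hypothetical helper names, the script is assumed to sit in the current directory, and a non-zero exit code is assumed to mean the conversion failed. Unlike the shell glob, the suffix match here is case-insensitive.

```python
import subprocess
import sys
from pathlib import Path


def word_files(folder):
    """Return the .doc and .docx files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in {".doc", ".docx"}
    )


def convert_all(folder, script="convert_word_to_markdown.py", run=subprocess.run):
    """Run the converter on every Word file in folder; return the files that failed.

    Assumption: the converter signals failure with a non-zero exit code.
    """
    failed = []
    for doc in word_files(folder):
        result = run([sys.executable, script, str(doc)])
        if result.returncode != 0:
            failed.append(doc)
    return failed
```

Calling `convert_all(".")` mirrors the shell loop above; the `run` parameter exists only so the function can be exercised without invoking the real converter.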
Outputs
Default (external images):
document.docx
document.md
document_images/
  image1.png
  image2.png
  ...
Embedded mode:
document.docx
document.md # contains base64 images
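The naming convention above can be written down as a small helper, for example to check that a conversion produced what you expected. This is a sketch under one assumption taken from the listings: outputs are written next to the source file. `expected_outputs` is a hypothetical name, not a function in the skill.

```python
from pathlib import Path


def expected_outputs(source, embedded=False):
    """Return (markdown_path, images_dir) for a Word file.

    Assumption: outputs land beside the source, as in the listings above.
    In embedded mode there is no images folder, so images_dir is None.
    """
    src = Path(source)
    markdown = src.with_suffix(".md")
    images = None if embedded else src.with_name(src.stem + "_images")
    return markdown, images
```

For `reports/document.docx` this yields `reports/document.md` and `reports/document_images`; with `embedded=True` the second value is `None`.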
Requirements
- Recommended (most reliable): install markitdown into a local virtualenv in this repo
  - bash setup_venv.sh
  - (manual) python3.11 -m venv .venv + .venv/bin/python -m pip install 'markitdown[all]'
- Alternative: install markitdown globally
  - python3 -m pip install 'markitdown[all]' (requires Python 3.10+ and markitdown on PATH)
- Fallback: uv (provides uvx) so the scripts can run markitdown without pip installs
  - macOS: brew install uv
- For .doc (legacy) support: LibreOffice (brew install --cask libreoffice)
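To see which of the options above are present on a machine, a check along these lines may help. It is only a sketch and not what `--check` actually does: `available_backends` and `has_libreoffice` are hypothetical names, the virtualenv executable is assumed to live at `.venv/bin/markitdown` (POSIX layout), and LibreOffice is assumed to expose a `soffice` command on `PATH`.

```python
import shutil
from pathlib import Path


def available_backends(repo_dir=".", which=shutil.which):
    """List the ways markitdown could be run, in the README's order of preference."""
    found = []
    # Assumption: a POSIX virtualenv puts the console script in .venv/bin/.
    if (Path(repo_dir) / ".venv" / "bin" / "markitdown").exists():
        found.append("venv")
    if which("markitdown"):
        found.append("global")
    if which("uvx"):
        found.append("uvx")
    return found


def has_libreoffice(which=shutil.which):
    """Return True if a soffice command is on PATH (needed for legacy .doc)."""
    return which("soffice") is not None
```

An empty list from `available_backends()` means none of the three install routes is set up yet.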
Environment Overrides (for reliability)
- MARKITDOWN_UVX_PYTHON=3.11 (default): change the Python version used by uvx
- MARKITDOWN_UVX_OFFLINE=0: allow uvx to use network (default: offline)
- MARKITDOWN_CMD="... markitdown": full command override (advanced)
- UV_CACHE_DIR=/tmp/uv-cache: use this if uvx can't write to its cache directory (default: ./.uv-cache/)
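One plausible way these overrides could combine into a command line is sketched below. The skill's real resolution logic is not shown in this document, so treat everything here as an assumption: `markitdown_command` is a hypothetical name, any `MARKITDOWN_UVX_OFFLINE` value other than `0` is treated as offline, and the fallback invocation uses the `uvx` options `--python`, `--offline`, and `--from`.

```python
import os
import shlex


def markitdown_command(env=None):
    """Build an argv list for running markitdown from the override variables."""
    env = os.environ if env is None else env

    # A full command override wins over everything else.
    override = env.get("MARKITDOWN_CMD")
    if override:
        return shlex.split(override)

    # Otherwise fall back to uvx, with the documented default Python version.
    cmd = ["uvx", "--python", env.get("MARKITDOWN_UVX_PYTHON", "3.11")]
    # Assumption: offline unless the variable is explicitly set to "0".
    if env.get("MARKITDOWN_UVX_OFFLINE", "1") != "0":
        cmd.append("--offline")
    cmd += ["--from", "markitdown[all]", "markitdown"]
    return cmd
```

Passing a plain dict instead of the real environment makes it easy to see how each variable changes the result before exporting it in a shell.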
Common Failure Modes
- .doc conversion fails:
  - LibreOffice GUI running → quit LibreOffice (or killall soffice) and retry
  - If you see Abort trap: 6 / exit 134 in a sandboxed tool runner → pre-convert .doc to .docx outside the sandbox, then convert the .docx
- WMF/EMF diagrams don't display: in sandboxed environments the WMF/EMF → PNG step may be skipped; convert those images to PNG outside the sandbox if needed
- markitdown not found: create ./.venv/ (recommended) or install markitdown globally
- Failed to initialize cache at ~/.cache/uv: set UV_CACHE_DIR=/tmp/uv-cache and retry
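For the sandbox case, pre-converting a legacy .doc to .docx outside the sandbox can be done with LibreOffice in headless mode. The sketch below assumes `soffice` is on `PATH` and that LibreOffice names the result after the input's stem; `soffice_convert_command` and `preconvert_doc` are hypothetical helpers, not part of the skill.

```python
import subprocess
from pathlib import Path


def soffice_convert_command(doc_path, out_dir=None):
    """Build the headless LibreOffice command that converts a .doc to .docx."""
    doc = Path(doc_path)
    out = Path(out_dir) if out_dir else doc.parent
    return [
        "soffice", "--headless",
        "--convert-to", "docx",
        "--outdir", str(out),
        str(doc),
    ]


def preconvert_doc(doc_path, out_dir=None, run=subprocess.run):
    """Run the conversion and return the path where the .docx is expected.

    Assumption: LibreOffice writes <stem>.docx into the output directory.
    Raises CalledProcessError if soffice exits with a non-zero status.
    """
    doc = Path(doc_path)
    out = Path(out_dir) if out_dir else doc.parent
    run(soffice_convert_command(doc, out), check=True)
    return out / (doc.stem + ".docx")
```

The returned .docx path can then be passed to convert_word_to_markdown.py inside the sandbox. Quit any running LibreOffice GUI first, as noted above.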
Notes
- convert_word_to_markdown.py is the entrypoint (handles both .doc and .docx).
- convert_with_images.py is an internal helper and only supports .docx.