document-docx

📁 vasilyu1983/ai-agents-public 📅 Jan 23, 2026

总安装量

周安装量

#4500

全站排名

安装命令

npx skills add https://github.com/vasilyu1983/ai-agents-public --skill document-docx

Agent 安装分布

claude-code 38

opencode 28

cursor 24

antigravity 22

codex 18

Skill 文档

Document DOCX Skill – Quick Reference

This skill enables creation, editing, and analysis of .docx files for reports, contracts, proposals, documentation, and template-driven outputs.

Modern best practices (2026):

Prefer templates + styles over manual formatting.
Treat .docx as the editable source; treat PDF as a release artifact.
If distributing externally, include basic accessibility hygiene (headings, table headers, alt text).

Quick Reference

Task	Tool/Library	Language	When to Use
Create DOCX	python-docx	Python	Reports, contracts, proposals
Create DOCX	docx	Node.js	Server-side document generation
Convert to HTML	mammoth.js	Node.js	Web display, content extraction
Parse DOCX	python-docx	Python	Extract text, tables, metadata
Template fill	docxtpl	Python	Mail merge, template-based generation
Review workflow	Word compare, comments/highlights	Any	Human review without OOXML surgery
Tracked changes	OOXML inspection, docx4j/OpenXML SDK/Aspose	Any	True redlines or parsing tracked changes

Tool Selection

Prefer docxtpl when non-developers must edit layout/design in Word.
Prefer python-docx for structural edits (paragraphs/tables/headers/footers) when formatting complexity is moderate.
Prefer docx (Node.js) for server-side generation in TypeScript-heavy stacks.
Prefer mammoth for text-first extraction or DOCX-to-HTML (best effort; may drop some layout fidelity).

Known Limits (Plan Around These)

.doc (legacy) is not supported by these libraries; convert to .docx first (e.g., LibreOffice).
python-docx cannot reliably create true tracked changes; use Word compare or specialized OOXML tooling.
Tables of Contents and many fields are placeholders until opened/updated in Word.

Core Operations

Create Document (Python – python-docx)

from docx import Document
from docx.shared import Inches, Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = Document()

# Title
title = doc.add_heading('Document Title', 0)
title.alignment = WD_ALIGN_PARAGRAPH.CENTER

# Paragraph with formatting
para = doc.add_paragraph()
run = para.add_run('Bold and ')
run.bold = True
run = para.add_run('italic text.')
run.italic = True

# Table
table = doc.add_table(rows=3, cols=3)
table.style = 'Table Grid'
for i, row in enumerate(table.rows):
    for j, cell in enumerate(row.cells):
        cell.text = f'Row {i+1}, Col {j+1}'

# Image
doc.add_picture('image.png', width=Inches(4))

# Save
doc.save('output.docx')

Create Document (Node.js – docx)

import { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell } from 'docx';
import * as fs from 'fs';

const doc = new Document({
  sections: [{
    properties: {},
    children: [
      new Paragraph({
        children: [
          new TextRun({ text: 'Bold text', bold: true }),
          new TextRun({ text: ' and normal text.' }),
        ],
      }),
      new Table({
        rows: [
          new TableRow({
            children: [
              new TableCell({ children: [new Paragraph('Cell 1')] }),
              new TableCell({ children: [new Paragraph('Cell 2')] }),
            ],
          }),
        ],
      }),
    ],
  }],
});

Packer.toBuffer(doc).then((buffer) => {
  fs.writeFileSync('output.docx', buffer);
});

Template-Based Generation (Python – docxtpl)

from docxtpl import DocxTemplate

doc = DocxTemplate('template.docx')
context = {
    'company_name': 'Acme Corp',
    'date': '2025-01-15',
    'items': [
        {'name': 'Widget A', 'price': 100},
        {'name': 'Widget B', 'price': 200},
    ]
}
doc.render(context)
doc.save('filled_template.docx')

Extract Content (Python – python-docx)

from docx import Document

doc = Document('input.docx')

# Extract all text
full_text = []
for para in doc.paragraphs:
    full_text.append(para.text)

# Extract tables
for table in doc.tables:
    for row in table.rows:
        row_data = [cell.text for cell in row.cells]
        print(row_data)

Styling Reference

Element	Python Method	Node.js Class
Heading 1	`add_heading(text, 1)`	`HeadingLevel.HEADING_1`
Bold	`run.bold = True`	`TextRun({ bold: true })`
Italic	`run.italic = True`	`TextRun({ italics: true })`
Font size	`run.font.size = Pt(12)`	`TextRun({ size: 24 })` (half-points)
Alignment	`WD_ALIGN_PARAGRAPH.CENTER`	`AlignmentType.CENTER`
Page break	`doc.add_page_break()`	`new PageBreak()`

Do / Avoid (Dec 2025)

Do

Use consistent heading levels and a table of contents for long docs.
Capture decisions and action items with owners and due dates.
Store docs in a versioned, searchable system.

Avoid

Manual formatting instead of styles (breaks consistency).
Docs with no owner or review cadence (stale quickly).
Copy/pasting without updating definitions and links.

Output Quality Checklist

Structure: consistent heading hierarchy, styles, and (when needed) an auto-generated table of contents.
Decisions: decisions/actions captured with owner + due date (not buried in prose).
Versioning: doc ID + version + change summary; review cadence defined.
Accessibility hygiene: headings/reading order are correct; table headers are marked; alt text for non-decorative images.
Reuse: use assets/doc-template-pack.md for decision logs and recurring doc types.

Optional: AI / Automation

Use only when explicitly requested and policy-compliant.

Summarize meeting notes into decisions/actions; humans verify accuracy.
Draft first-pass docs from outlines; do not invent facts or quotes.

Navigation

Resources

references/docx-patterns.md – Advanced formatting, styles, headers/footers
references/template-workflows.md – Mail merge, batch generation
references/tracked-changes.md – Tracked changes: what is feasible, and what is not
data/sources.json – Library documentation links

Scripts

scripts/docx_inspect_ooxml.py – Dependency-free OOXML inspection (including tracked changes signals)
scripts/docx_extract.py – Extract text/tables to JSON (requires python-docx)
scripts/docx_render_template.py – Render a docxtpl template (requires docxtpl)
scripts/docx_to_html.mjs – Convert .docx to HTML (requires mammoth)

Templates

assets/report-template.md – Standard report structure
assets/contract-template.md – Legal document structure
assets/doc-template-pack.md – Decision log, meeting notes, changelog templates

Related Skills

../document-pdf/SKILL.md – PDF generation and conversion
../docs-codebase/SKILL.md – Technical writing patterns

GitHub 仓库 ↗ ← 返回陌讯 Skills 聚合平台