document-docx

📁 vasilyu1983/ai-agents-public 📅 Jan 23, 2026
Install command
npx skills add https://github.com/vasilyu1983/ai-agents-public --skill document-docx

Skill documentation

Document DOCX Skill – Quick Reference

This skill enables creation, editing, and analysis of .docx files for reports, contracts, proposals, documentation, and template-driven outputs.

Modern best practices (2026):

  • Prefer templates + styles over manual formatting.
  • Treat .docx as the editable source; treat PDF as a release artifact.
  • If distributing externally, include basic accessibility hygiene (headings, table headers, alt text).

Quick Reference

| Task | Tool/Library | Language | When to Use |
| --- | --- | --- | --- |
| Create DOCX | python-docx | Python | Reports, contracts, proposals |
| Create DOCX | docx | Node.js | Server-side document generation |
| Convert to HTML | mammoth.js | Node.js | Web display, content extraction |
| Parse DOCX | python-docx | Python | Extract text, tables, metadata |
| Template fill | docxtpl | Python | Mail merge, template-based generation |
| Review workflow | Word compare, comments/highlights | Any | Human review without OOXML surgery |
| Tracked changes | OOXML inspection, docx4j/OpenXML SDK/Aspose | Any | True redlines or parsing tracked changes |

Tool Selection

  • Prefer docxtpl when non-developers must edit layout/design in Word.
  • Prefer python-docx for structural edits (paragraphs/tables/headers/footers) when formatting complexity is moderate.
  • Prefer docx (Node.js) for server-side generation in TypeScript-heavy stacks.
  • Prefer mammoth for text-first extraction or DOCX-to-HTML (best effort; may drop some layout fidelity).

Known Limits (Plan Around These)

  • .doc (legacy) is not supported by these libraries; convert to .docx first (e.g., LibreOffice).
  • python-docx cannot reliably create true tracked changes; use Word compare or specialized OOXML tooling.
  • Tables of contents and many other fields are not computed by these libraries; they remain placeholders until the document is opened and its fields are updated in Word.
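
The first limit can be checked before any library is involved. The sketch below classifies a file by its leading bytes (legacy .doc files use the OLE2 container signature, .docx files are ZIP packages) and builds a headless LibreOffice conversion command. The function names are hypothetical, and the conversion helper assumes a `soffice` executable on PATH.

```python
import shutil
import subprocess
from pathlib import Path

OLE2_MAGIC = bytes.fromhex('D0CF11E0A1B11AE1')  # legacy .doc container signature
ZIP_MAGIC = b'PK\x03\x04'                        # .docx is a ZIP package


def sniff_word_format(header: bytes) -> str:
    """Classify a file from its first bytes: 'doc', 'docx-candidate', or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return 'doc'
    if header.startswith(ZIP_MAGIC):
        return 'docx-candidate'  # any ZIP matches; confirm by opening it
    return 'unknown'


def libreoffice_convert_command(src: Path, out_dir: Path) -> list[str]:
    """Build (but do not run) a headless LibreOffice conversion command."""
    return ['soffice', '--headless', '--convert-to', 'docx',
            '--outdir', str(out_dir), str(src)]


def convert_legacy_doc(src: Path, out_dir: Path) -> Path:
    """Convert a .doc file to .docx; assumes 'soffice' is installed and on PATH."""
    if shutil.which('soffice') is None:
        raise RuntimeError('LibreOffice (soffice) not found on PATH')
    subprocess.run(libreoffice_convert_command(src, out_dir), check=True)
    return out_dir / (src.stem + '.docx')


print(sniff_word_format(OLE2_MAGIC + b'\x00' * 8))
print(sniff_word_format(b'PK\x03\x04' + b'\x00' * 8))
```

In practice the header would come from `path.open('rb').read(8)`; sniffing bytes is more dependable than trusting the file extension.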

Core Operations

Create Document (Python – python-docx)

from docx import Document
from docx.shared import Inches, Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = Document()

# Title
title = doc.add_heading('Document Title', 0)
title.alignment = WD_ALIGN_PARAGRAPH.CENTER

# Paragraph with formatting
para = doc.add_paragraph()
run = para.add_run('Bold and ')
run.bold = True
run = para.add_run('italic text.')
run.italic = True

# Table
table = doc.add_table(rows=3, cols=3)
table.style = 'Table Grid'
for i, row in enumerate(table.rows):
    for j, cell in enumerate(row.cells):
        cell.text = f'Row {i+1}, Col {j+1}'

# Image (image.png must exist; giving only a width keeps the aspect ratio)
doc.add_picture('image.png', width=Inches(4))

# Save
doc.save('output.docx')

Create Document (Node.js – docx)

import { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell } from 'docx';
import * as fs from 'fs';

const doc = new Document({
  sections: [{
    properties: {},
    children: [
      new Paragraph({
        children: [
          new TextRun({ text: 'Bold text', bold: true }),
          new TextRun({ text: ' and normal text.' }),
        ],
      }),
      new Table({
        rows: [
          new TableRow({
            children: [
              new TableCell({ children: [new Paragraph('Cell 1')] }),
              new TableCell({ children: [new Paragraph('Cell 2')] }),
            ],
          }),
        ],
      }),
    ],
  }],
});

Packer.toBuffer(doc).then((buffer) => {
  fs.writeFileSync('output.docx', buffer);
});

Template-Based Generation (Python – docxtpl)

from docxtpl import DocxTemplate

# template.docx is a Word file containing Jinja2 tags such as {{ company_name }}
doc = DocxTemplate('template.docx')
context = {
    'company_name': 'Acme Corp',
    'date': '2025-01-15',
    'items': [
        {'name': 'Widget A', 'price': 100},
        {'name': 'Widget B', 'price': 200},
    ]
}
doc.render(context)
doc.save('filled_template.docx')

Extract Content (Python – python-docx)

from docx import Document

doc = Document('input.docx')

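# Extract body text (doc.paragraphs skips text inside tables, headers, and footers)
full_text = [para.text for para in doc.paragraphs]

# Extract tables (a merged cell is returned once per grid column it spans)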
for table in doc.tables:
    for row in table.rows:
        row_data = [cell.text for cell in row.cells]
        print(row_data)
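
Rows extracted this way are plain lists of strings, so turning them into JSON is a small post-processing step. The following is a sketch that assumes the first row of each table holds column headers; `rows_to_records` and the sample values are hypothetical, and no Word library is needed once the rows are in hand.

```python
import json


def rows_to_records(rows: list[list[str]]) -> list[dict[str, str]]:
    """Treat the first row as headers and map each later row to a dict."""
    if not rows:
        return []
    headers = [h.strip() for h in rows[0]]
    return [dict(zip(headers, (c.strip() for c in row))) for row in rows[1:]]


# Sample rows shaped like the row_data lists printed above (illustrative values).
sample_rows = [
    ['Item', 'Price'],
    ['Widget A', '100'],
    ['Widget B', '200'],
]
records = rows_to_records(sample_rows)
payload = json.dumps({'tables': [records]}, indent=2)
print(payload)
```

Tables without a header row, or with merged header cells, would need different handling; in that case keeping the raw list-of-lists form is the safer choice.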

Styling Reference

| Element | Python Method | Node.js Class |
| --- | --- | --- |
| Heading 1 | add_heading(text, 1) | HeadingLevel.HEADING_1 |
| Bold | run.bold = True | TextRun({ bold: true }) |
| Italic | run.italic = True | TextRun({ italics: true }) |
| Font size | run.font.size = Pt(12) | TextRun({ size: 24 }) (half-points) |
| Alignment | WD_ALIGN_PARAGRAPH.CENTER | AlignmentType.CENTER |
| Page break | doc.add_page_break() | new PageBreak() |
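
The font-size row mixes two units: python-docx takes points via Pt(), while the Node.js docx `size` option is in half-points. A small conversion sketch (the helper names are hypothetical) makes the relationship explicit and also shows the EMU value that Pt() represents internally.

```python
EMU_PER_POINT = 12700  # English Metric Units per point (914400 per inch / 72)


def points_to_half_points(points: float) -> int:
    """Value for the Node.js docx `size` option (half-points)."""
    return round(points * 2)


def points_to_emu(points: float) -> int:
    """Length that python-docx's Pt(points) stands for, in EMU."""
    return round(points * EMU_PER_POINT)


for pt in (10.5, 12, 14):
    print(pt, points_to_half_points(pt), points_to_emu(pt))
```

So a 12 pt font is `Pt(12)` in Python and `size: 24` in Node.js.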

Do / Avoid (Dec 2025)

Do

  • Use consistent heading levels and a table of contents for long docs.
  • Capture decisions and action items with owners and due dates.
  • Store docs in a versioned, searchable system.

Avoid

  • Manual formatting instead of styles (breaks consistency).
  • Docs with no owner or review cadence (they go stale quickly).
  • Copy/pasting without updating definitions and links.

Output Quality Checklist

  • Structure: consistent heading hierarchy, styles, and (when needed) an auto-generated table of contents.
  • Decisions: decisions/actions captured with owner + due date (not buried in prose).
  • Versioning: doc ID + version + change summary; review cadence defined.
  • Accessibility hygiene: headings/reading order are correct; table headers are marked; alt text for non-decorative images.
  • Reuse: use assets/doc-template-pack.md for decision logs and recurring doc types.
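
Parts of the structure and accessibility items can be checked mechanically. The sketch below inspects the main document XML (the `word/document.xml` part of a .docx package) for skipped heading levels, drawings without alt text, and tables whose first row is not marked as a header row. It assumes English style IDs such as `Heading1`, treats every image as non-decorative, and uses a hand-written sample fragment; `audit_document_xml` is a hypothetical helper, not part of any library.

```python
import re
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
WP = 'http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing'
NS = {'w': W, 'wp': WP}


def audit_document_xml(xml_text: str) -> dict:
    """Report simple structure and accessibility signals from document.xml."""
    root = ET.fromstring(xml_text)

    # Heading levels in document order, read from style IDs such as 'Heading2'.
    levels = []
    for p_style in root.iterfind('.//w:pPr/w:pStyle', NS):
        match = re.fullmatch(r'Heading(\d)', p_style.get(f'{{{W}}}val', ''))
        if match:
            levels.append(int(match.group(1)))
    skipped = [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]

    # Drawings whose wp:docPr has no non-empty 'descr' (alt text) attribute.
    drawings = root.findall('.//wp:docPr', NS)
    missing_alt = sum(1 for d in drawings if not d.get('descr', '').strip())

    # Tables whose first row is not flagged as a header row (w:tblHeader).
    no_header = 0
    for tbl in root.findall('.//w:tbl', NS):
        first_row = tbl.find('w:tr', NS)
        if first_row is None or first_row.find('w:trPr/w:tblHeader', NS) is None:
            no_header += 1

    return {
        'heading_levels': levels,
        'skipped_levels': skipped,
        'images_missing_alt': missing_alt,
        'tables_without_header': no_header,
    }


# Hand-written fragment for illustration only, not a complete Word document.
SAMPLE_XML = f'''<w:document xmlns:w="{W}" xmlns:wp="{WP}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr></w:p>
    <w:p><w:pPr><w:pStyle w:val="Heading3"/></w:pPr></w:p>
    <w:p><w:r><w:drawing><wp:inline>
      <wp:docPr id="1" name="Picture 1"/>
    </wp:inline></w:drawing></w:r></w:p>
    <w:tbl><w:tr><w:trPr><w:tblHeader/></w:trPr></w:tr></w:tbl>
    <w:tbl><w:tr/></w:tbl>
  </w:body>
</w:document>'''

report = audit_document_xml(SAMPLE_XML)
print(report)
```

For a real file, the XML can be read with `zipfile.ZipFile(path).read('word/document.xml')`. A check like this only flags candidates; reading order and the quality of alt text still need human review.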

Optional: AI / Automation

Use only when explicitly requested and policy-compliant.

  • Summarize meeting notes into decisions/actions; humans verify accuracy.
  • Draft first-pass docs from outlines; do not invent facts or quotes.

Navigation

Scripts

  • scripts/docx_inspect_ooxml.py – Dependency-free OOXML inspection (including tracked changes signals)
  • scripts/docx_extract.py – Extract text/tables to JSON (requires python-docx)
  • scripts/docx_render_template.py – Render a docxtpl template (requires docxtpl)
  • scripts/docx_to_html.mjs – Convert .docx to HTML (requires mammoth)
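
To illustrate what dependency-free OOXML inspection can look like, here is a standard-library sketch that counts tracked-change markers (`w:ins`, `w:del`) and TOC field instructions in `word/document.xml`. It is not the repository's scripts/docx_inspect_ooxml.py, whose contents are not shown here; the function names and the in-memory fixture are assumptions, and the fixture holds only the one part the sketch reads, so Word could not open it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def revision_signals(docx_bytes: bytes) -> dict[str, int]:
    """Count insert/delete revision marks and TOC field instructions."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read('word/document.xml'))

    def count(tag: str) -> int:
        return sum(1 for _ in root.iter(f'{{{W}}}{tag}'))

    toc_fields = sum(
        1 for el in root.iter(f'{{{W}}}instrText')
        if (el.text or '').strip().upper().startswith('TOC')
    )
    return {
        'insertions': count('ins'),
        'deletions': count('del'),
        'toc_fields': toc_fields,
    }


def make_package(document_xml: str) -> bytes:
    """Build a ZIP holding only word/document.xml (test fixture, not a valid .docx)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as package:
        package.writestr('word/document.xml', document_xml)
    return buffer.getvalue()


SAMPLE_XML = (
    '<w:document xmlns:w="' + W + '"><w:body>'
    '<w:p><w:ins w:id="1" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="A"><w:r><w:delText>old</w:delText></w:r></w:del></w:p>'
    '<w:p><w:r><w:fldChar w:fldCharType="begin"/></w:r>'
    '<w:r><w:instrText> TOC \\o "1-3" </w:instrText></w:r>'
    '<w:r><w:fldChar w:fldCharType="end"/></w:r></w:p>'
    '</w:body></w:document>'
)

signals = revision_signals(make_package(SAMPLE_XML))
print(signals)
```

Only two revision element types are counted here; moves and formatting changes use other elements, so a zero count is a hint rather than proof that a document has no tracked changes.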
