docx

📁 sherifeldeeb/agentskills 📅 3 days ago

总安装量

周安装量

#48076

全站排名

安装命令

npx skills add https://github.com/sherifeldeeb/agentskills --skill docx

Agent 安装分布

opencode 1

codex 1

claude-code 1

Skill 文档

DOCX Skill

Read, modify, and create Microsoft Word documents with support for converting markdown content into professionally formatted documents using templates.

Capabilities

Read Documents: Extract text, tables, and metadata from DOCX files
Create Documents: Generate new DOCX files from scratch or templates
Modify Documents: Edit existing documents, add content, modify styles
Markdown Conversion: Convert markdown files to DOCX using template styling
Template-Based Generation: Use DOCX templates to maintain consistent formatting

Quick Start

from docx import Document

# Read a document
doc = Document('report.docx')
for para in doc.paragraphs:
    print(para.text)

# Create a document
doc = Document()
doc.add_heading('Report Title', 0)
doc.add_paragraph('Content here.')
doc.save('output.docx')

Usage

Reading Documents

Extract content from existing DOCX files.

Input: Path to a DOCX file

Process:

Open document with python-docx
Iterate through paragraphs, tables, or other elements
Extract text, styles, or metadata

Example:

from docx import Document

doc = Document('input.docx')

# Extract all text
full_text = []
for para in doc.paragraphs:
    full_text.append(para.text)
text = '\n'.join(full_text)

# Extract tables
for table in doc.tables:
    for row in table.rows:
        row_data = [cell.text for cell in row.cells]
        print(row_data)

# Get document properties
props = doc.core_properties
print(f"Title: {props.title}")
print(f"Author: {props.author}")

Creating New Documents

Generate DOCX files from scratch.

Input: Content to include (text, tables, images)

Process:

Create new Document object
Add headings, paragraphs, tables
Apply styles as needed
Save to file

Example:

from docx import Document
from docx.shared import Inches, Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = Document()

# Add title
title = doc.add_heading('Security Assessment Report', 0)
title.alignment = WD_ALIGN_PARAGRAPH.CENTER

# Add metadata paragraph
doc.add_paragraph('Prepared by: Security Team')
doc.add_paragraph('Date: January 2024')

# Add section heading
doc.add_heading('Executive Summary', level=1)

# Add content paragraph
para = doc.add_paragraph()
para.add_run('This assessment identified ').bold = False
para.add_run('3 critical findings').bold = True
para.add_run(' requiring immediate attention.')

# Add a table
table = doc.add_table(rows=1, cols=3)
table.style = 'Table Grid'
header = table.rows[0].cells
header[0].text = 'Finding'
header[1].text = 'Severity'
header[2].text = 'Status'

# Add data rows
data = [
    ('SQL Injection', 'Critical', 'Open'),
    ('XSS Vulnerability', 'High', 'In Progress'),
]
for finding, severity, status in data:
    row = table.add_row().cells
    row[0].text = finding
    row[1].text = severity
    row[2].text = status

doc.save('report.docx')

Modifying Existing Documents

Edit and update existing DOCX files.

Input: Path to existing DOCX file

Process:

Open existing document
Locate content to modify
Make changes
Save (same or new file)

Example:

from docx import Document

doc = Document('template.docx')

# Replace placeholder text
for para in doc.paragraphs:
    if '{{CLIENT_NAME}}' in para.text:
        para.text = para.text.replace('{{CLIENT_NAME}}', 'Acme Corp')
    if '{{DATE}}' in para.text:
        para.text = para.text.replace('{{DATE}}', '2024-01-15')

# Add new section at end
doc.add_heading('Additional Findings', level=1)
doc.add_paragraph('New content added during modification.')

doc.save('modified_report.docx')

Markdown to DOCX Conversion

Convert markdown files to Word documents using template styling.

Input:

Markdown file path
DOCX template file path (optional)
Output file path

Process:

Parse markdown content
Load template document for styles (if provided)
Map markdown elements to Word styles
Generate formatted DOCX

Using the conversion script:

python scripts/md_to_docx.py report.md --template company_template.docx --output final_report.docx

Programmatic usage:

from scripts.md_to_docx import MarkdownToDocx

converter = MarkdownToDocx(template_path='template.docx')
converter.convert('input.md', 'output.docx')

Style Mapping:

Markdown	Word Style
`# Heading 1`	Heading 1
`## Heading 2`	Heading 2
`### Heading 3`	Heading 3
`bold`	Bold run
`italic`	Italic run
`code`	Code character style
`- item`	List Bullet
`1. item`	List Number
Code blocks	Code block style
`> quote`	Quote style
Tables	Table Grid

Template Requirements:

For best results, your template should define these styles:

Heading 1, Heading 2, Heading 3
Normal (body text)
List Bullet, List Number
Quote (for blockquotes)
Code (character style for inline code)

Working with Styles

Apply consistent formatting using document styles.

Example:

from docx import Document
from docx.shared import Pt, RGBColor
from docx.enum.style import WD_STYLE_TYPE

doc = Document()

# Create a custom style
styles = doc.styles
style = styles.add_style('Finding Critical', WD_STYLE_TYPE.PARAGRAPH)
style.font.bold = True
style.font.size = Pt(12)
style.font.color.rgb = RGBColor(255, 0, 0)

# Use the style
doc.add_paragraph('CRITICAL: SQL Injection Found', style='Finding Critical')

doc.save('styled_report.docx')

Adding Images

Insert images into documents.

Example:

from docx import Document
from docx.shared import Inches

doc = Document()
doc.add_heading('Network Diagram', level=1)
doc.add_picture('network_diagram.png', width=Inches(6))
doc.add_paragraph('Figure 1: Current network architecture')

doc.save('report_with_images.docx')

Configuration

Environment Variables

Variable	Description	Required	Default
`DOCX_TEMPLATE_DIR`	Default template directory	No	`./assets/templates`

Script Options

Option	Type	Description
`--template`	path	DOCX template for styling
`--output`	path	Output file path
`--toc`	flag	Generate table of contents
`--verbose`	flag	Enable verbose logging

Examples

Example 1: Security Report from Markdown

Scenario: Convert a penetration test report written in markdown to a professional Word document.

Input (report.md):

# Penetration Test Report

## Executive Summary

The assessment identified **3 critical** and **5 high** severity findings.

## Findings

### Finding 1: SQL Injection

**Severity**: Critical

The application is vulnerable to SQL injection in the login form.

| Parameter | Payload | Result |
|-----------|---------|--------|
| username | `' OR 1=1--` | Auth bypass |

### Remediation

1. Use parameterized queries
2. Implement input validation
3. Apply least privilege

Command:

python scripts/md_to_docx.py report.md --template corporate_template.docx --output pentest_report.docx

Output: Professional DOCX with corporate styling applied.

Example 2: Batch Document Generation

Scenario: Generate multiple reports from a template with different data.

from docx import Document
import json

# Load client data
with open('clients.json') as f:
    clients = json.load(f)

for client in clients:
    doc = Document('assessment_template.docx')

    # Replace placeholders
    for para in doc.paragraphs:
        for key, value in client.items():
            placeholder = f'{{{{{key}}}}}'
            if placeholder in para.text:
                para.text = para.text.replace(placeholder, str(value))

    doc.save(f"reports/{client['name']}_assessment.docx")

Limitations

Complex formatting: Some advanced Word features (SmartArt, embedded objects) may not be fully supported
Track changes: Limited support for revision tracking
Macros: VBA macros are not supported for security reasons
Large tables: Tables with merged cells may have rendering inconsistencies

Troubleshooting

Style Not Applied

Problem: Custom styles from template not appearing in output

Solution: Ensure the style exists in the template and use exact style name:

# Check available styles
for style in doc.styles:
    print(style.name)

Encoding Issues

Problem: Special characters not displaying correctly

Solution: Ensure UTF-8 encoding when reading source files:

with open('input.md', 'r', encoding='utf-8') as f:
    content = f.read()

Missing Images

Problem: Images not appearing in output

Solution: Use absolute paths or verify relative path from script location:

from pathlib import Path
image_path = Path(__file__).parent / 'assets' / 'logo.png'
doc.add_picture(str(image_path))

Related Skills

pdf: Convert DOCX to PDF or extract content from PDFs
xlsx: Work with Excel files for data that feeds into reports
pptx: Create presentations from document content

References

GitHub 仓库 ↗ ← 返回陌讯 Skills 聚合平台