srt enhancer

📁 dean9703111/ai-agent-skill-for-video-workflow 📅 Jan 1, 1970
1
总安装量
0
周安装量
#44239
全站排名
安装命令
npx skills add https://github.com/dean9703111/ai-agent-skill-for-video-workflow --skill SRT Enhancer

Skill 文档

SRT Enhancer

This skill provides an AI-driven workflow for enhancing SRT subtitle files by comparing them with an original reference document (origin.md). The enhancement process corrects typos, standardizes proper nouns, and adds proper spacing around English text and numbers while preserving the original timeline and structure.

Purpose

Enhance SRT subtitle files by:

  • Comparing subtitle content with reference markdown document (origin.md)
  • Correcting typos and transcription errors
  • Standardizing proper nouns and terminology
  • Adding half-width spaces before and after English words and numbers
  • Maintaining exact timestamps and SRT structure
  • Only including content that appears in the SRT (not adding new content from markdown)

When to Use This Skill

Use this skill when:

  • Refining auto-generated subtitles with a reference script
  • Correcting transcription errors in SRT files
  • Standardizing terminology across subtitle files
  • Improving subtitle formatting for Chinese content with English/numbers
  • Ensuring consistency between spoken content and written reference

Enhancement Principles

  1. Timeline Preservation: Never modify timestamps or subtitle numbering
  2. Content Fidelity: Only correct what exists in SRT; don’t add content from markdown
  3. Reference Comparison: Use origin.md as the source of truth for spelling and terminology
  4. Spacing Rules: Add half-width spaces ( ) around English words and numbers in Chinese text
  5. AI-Driven: Use semantic understanding to match content, not simple string matching

Core Workflow

1. Locate Reference Document

Find origin.md in the same directory as the input SRT file:

  • Check for origin.md in the SRT file’s directory
  • If not found, prompt user for the reference document location
  • Read and parse the markdown content

2. Parse SRT File

Load and parse the input SRT file:

  • Extract subtitle number
  • Extract timestamp (start → end)
  • Extract subtitle text
  • Preserve exact formatting and structure

3. Build Reference Knowledge Base

Analyze origin.md to extract:

  • Proper nouns and terminology
  • Correct spellings and phrasings
  • English words and technical terms
  • Numbers and their contexts
  • Common phrases and expressions

Reference references/enhancement-rules.md for detailed extraction strategies.

4. Match and Compare Content

For each subtitle segment:

  • Identify corresponding content in origin.md using semantic matching
  • Compare subtitle text with reference text
  • Identify discrepancies:
    • Typos and misspellings
    • Incorrect proper nouns
    • Missing spaces around English/numbers
    • Terminology inconsistencies

5. Apply Corrections

For each identified issue:

  • Typos: Replace with correct spelling from origin.md
  • Proper Nouns: Standardize to reference document version
  • English/Numbers: Add half-width space before and after
  • Terminology: Use consistent terms from reference

6. Validate Enhancements

Before finalizing each correction:

  • Ensure the content exists in the original SRT
  • Verify no new content is added from markdown
  • Confirm timestamps remain unchanged
  • Check subtitle numbering is preserved
  • Validate spacing follows rules (see Spacing Rules section)

7. Generate Output File

Save the enhanced SRT as enhanced.srt:

  • Maintain all original timestamps
  • Preserve subtitle numbering
  • Keep exact SRT formatting
  • Apply all validated corrections

Spacing Rules

Add Half-Width Spaces Around:

  1. English Words in Chinese Text:

    • Before: 這是一個example範例
    • After: 這是一個 example 範例
  2. Numbers in Chinese Text:

    • Before: 總共有3個步驟
    • After: 總共有 3 個步驟
  3. Mixed English and Numbers:

    • Before: 使用Python3.9版本
    • After: 使用 Python 3.9 版本

Do NOT Add Spaces:

  1. Within English phrases: machine learning (keep as is)
  2. Within numbers: 123,456 (keep as is)
  3. Between punctuation and text: Keep original punctuation spacing
  4. When space already exists: Don’t add duplicate spaces

Reference references/spacing-examples.md for comprehensive examples.

Implementation Guidelines

Semantic Matching Strategy

Use AI to match SRT content with origin.md:

  • Don’t rely on exact string matching
  • Understand context and meaning
  • Account for transcription variations
  • Match concepts, not just words
  • Handle paraphrasing and reordering

Correction Priority

Apply corrections in this order:

  1. Critical typos affecting meaning
  2. Proper nouns and terminology
  3. Spacing around English/numbers
  4. Minor spelling variations

Quality Checks

Before generating output:

  • Verify all timestamps are unchanged
  • Confirm subtitle count matches original
  • Validate no content added from markdown
  • Check spacing rules applied consistently
  • Ensure proper nouns are standardized
  • Validate output filename is enhanced.srt

Example Enhancement

origin.md:

今天要介紹 Python 3.9 的新功能。首先是 match case 語句,這是一個強大的模式匹配工具。

Input SRT:

1
00:00:00,000 --> 00:00:05,000
今天要介紹Python3.9的新功能

2
00:00:05,000 --> 00:00:10,000
首先是match case語句這是一個強大的模式匹配工俱

Output SRT (enhanced.srt):

1
00:00:00,000 --> 00:00:05,000
今天要介紹 Python 3.9 的新功能

2
00:00:05,000 --> 00:00:10,000
首先是 match case 語句,這是一個強大的模式匹配工具

Changes Made:

  • Added spaces around Python, 3.9, match case
  • Corrected 工俱 → 工具 (typo fix from reference)
  • Preserved all timestamps and numbering

Important Constraints

Must NOT:

  • Modify timestamps or subtitle numbering
  • Add content from origin.md that doesn’t exist in SRT
  • Use Python scripts or automation (AI analysis required)
  • Change the meaning or intent of subtitles
  • Remove existing content from SRT
  • Alter SRT structure or formatting

Must DO:

  • Use AI to semantically match content
  • Compare each subtitle with reference document
  • Apply spacing rules consistently
  • Correct typos based on reference
  • Standardize proper nouns and terminology
  • Output to enhanced.srt
  • Preserve exact timestamps and structure

Error Handling

Missing origin.md:

  • Check same directory as SRT file
  • Prompt user for location
  • Cannot proceed without reference document

No Matching Content:

  • If SRT content has no match in origin.md, leave unchanged
  • Don’t guess or invent corrections
  • Apply spacing rules even without reference match

Ambiguous Matches:

  • Use context to determine best match
  • Prefer conservative corrections
  • When uncertain, preserve original text

Additional Resources

Reference Files

  • references/enhancement-rules.md – Detailed rules for typo detection, proper noun extraction, and semantic matching strategies
  • references/spacing-examples.md – Comprehensive examples of spacing rules with edge cases

Workflow Summary

To enhance an SRT file with reference document:

  1. Locate origin.md in the same directory as the SRT file
  2. Read and parse both files
  3. Build knowledge base from origin.md (proper nouns, terminology, correct spellings)
  4. For each SRT subtitle segment:
    • Semantically match with reference content
    • Identify typos, incorrect proper nouns, missing spaces
    • Apply corrections while preserving timeline
  5. Validate all corrections maintain content fidelity
  6. Save output as enhanced.srt
  7. Verify timestamps unchanged and no content added

Focus on semantic understanding and conservative corrections. The goal is to refine existing subtitle content using the reference document as a guide, not to rewrite or add new content. Maintain the integrity of the original SRT structure while improving accuracy and formatting.