npx skills add https://github.com/dean9703111/ai-agent-skill-for-video-workflow --skill SRT Enhancer
SRT Enhancer
This skill provides an AI-driven workflow for enhancing SRT subtitle files by comparing them with an original reference document (origin.md). The enhancement process corrects typos, standardizes proper nouns, and adds proper spacing around English text and numbers while preserving the original timeline and structure.
Purpose
Enhance SRT subtitle files by:
- Comparing subtitle content with the reference markdown document (origin.md)
- Correcting typos and transcription errors
- Standardizing proper nouns and terminology
- Adding half-width spaces before and after English words and numbers
- Maintaining exact timestamps and SRT structure
- Only including content that appears in the SRT (not adding new content from markdown)
When to Use This Skill
Use this skill when:
- Refining auto-generated subtitles with a reference script
- Correcting transcription errors in SRT files
- Standardizing terminology across subtitle files
- Improving subtitle formatting for Chinese content with English/numbers
- Ensuring consistency between spoken content and written reference
Enhancement Principles
- Timeline Preservation: Never modify timestamps or subtitle numbering
- Content Fidelity: Only correct what exists in SRT; don’t add content from markdown
- Reference Comparison: Use origin.md as the source of truth for spelling and terminology
- Spacing Rules: Add half-width spaces (the ASCII space, U+0020) around English words and numbers in Chinese text
- AI-Driven: Use semantic understanding to match content, not simple string matching
Core Workflow
1. Locate Reference Document
Find origin.md in the same directory as the input SRT file:
- Check for origin.md in the SRT file’s directory
- If not found, prompt user for the reference document location
- Read and parse the markdown content
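The lookup in this step can be pictured with a short sketch. The skill itself expects the agent to do this work rather than a script, so the code only illustrates the rule "same directory as the SRT file". The function name find_reference is hypothetical and not part of the skill.

```python
from pathlib import Path
from typing import Optional


def find_reference(srt_path: str) -> Optional[Path]:
    """Return the origin.md next to the SRT file, or None if it is missing."""
    # The reference document is expected in the SRT file's own directory.
    candidate = Path(srt_path).resolve().parent / "origin.md"
    return candidate if candidate.is_file() else None
```

A None result corresponds to the "prompt user for the reference document location" branch above.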
2. Parse SRT File
Load and parse the input SRT file:
- Extract subtitle number
- Extract timestamp (start → end)
- Extract subtitle text
- Preserve exact formatting and structure
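To make the three extracted fields concrete, here is a minimal parsing sketch. It assumes cues are separated by blank lines and that the second line of each cue is the timestamp line. The names Cue and parse_srt are hypothetical. The number and timestamp are kept as raw strings so nothing is normalized or renumbered by accident.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    number: str     # subtitle number, kept as text so it is never renumbered
    timestamp: str  # the full "start --> end" line, kept verbatim
    text: str       # one or more lines of subtitle text


def parse_srt(content: str) -> list[Cue]:
    """Split SRT content into cues without normalizing any field."""
    cues = []
    content = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip malformed blocks rather than guessing
        cues.append(Cue(lines[0].strip(), lines[1].strip(), "\n".join(lines[2:])))
    return cues
```

Keeping the timestamp as a single untouched string, instead of parsing it into numbers, makes it harder to alter the timeline while editing the text.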
3. Build Reference Knowledge Base
Analyze origin.md to extract:
- Proper nouns and terminology
- Correct spellings and phrasings
- English words and technical terms
- Numbers and their contexts
- Common phrases and expressions
See references/enhancement-rules.md for detailed extraction strategies.
4. Match and Compare Content
For each subtitle segment:
- Identify corresponding content in origin.md using semantic matching
- Compare subtitle text with reference text
- Identify discrepancies:
  - Typos and misspellings
  - Incorrect proper nouns
  - Missing spaces around English/numbers
  - Terminology inconsistencies
5. Apply Corrections
For each identified issue:
- Typos: Replace with correct spelling from origin.md
- Proper Nouns: Standardize to reference document version
- English/Numbers: Add half-width space before and after
- Terminology: Use consistent terms from reference
6. Validate Enhancements
Before finalizing each correction:
- Ensure the content exists in the original SRT
- Verify no new content is added from markdown
- Confirm timestamps remain unchanged
- Check subtitle numbering is preserved
- Validate spacing follows rules (see Spacing Rules section)
7. Generate Output File
Save the enhanced SRT as enhanced.srt:
- Maintain all original timestamps
- Preserve subtitle numbering
- Keep exact SRT formatting
- Apply all validated corrections
Spacing Rules
Add Half-Width Spaces Around:
- English Words in Chinese Text:
  - Before: 這是一個example範例
  - After: 這是一個 example 範例
- Numbers in Chinese Text:
  - Before: 總共有3個步驟
  - After: 總共有 3 個步驟
- Mixed English and Numbers:
  - Before: 使用Python3.9版本
  - After: 使用 Python 3.9 版本
Do NOT Add Spaces:
- Within English phrases: machine learning (keep as is)
- Within numbers: 123,456 (keep as is)
- Between punctuation and text: Keep original punctuation spacing
- When space already exists: Don’t add duplicate spaces
See references/spacing-examples.md for comprehensive examples.
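The spacing rules can be approximated with a few regular expressions, which may help when checking results by hand. This is a naive sketch, not the skill's method, since the skill relies on AI analysis. It assumes "Chinese text" means the CJK Unified Ideographs block (U+4E00 to U+9FFF). The name add_spacing is hypothetical. The letter-then-digit rule would also split identifiers such as mp4, so such cases still need judgment.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block counts as Chinese text.
CJK = r"\u4e00-\u9fff"


def add_spacing(text: str) -> str:
    """Insert a half-width space between Chinese characters and English/numbers."""
    # Chinese character directly followed by a letter or digit.
    text = re.sub(rf"([{CJK}])([A-Za-z0-9])", r"\1 \2", text)
    # Letter or digit directly followed by a Chinese character.
    text = re.sub(rf"([A-Za-z0-9])([{CJK}])", r"\1 \2", text)
    # Letter directly followed by a digit, as in Python3.9 (naive rule).
    text = re.sub(r"([A-Za-z])(\d)", r"\1 \2", text)
    return text


print(add_spacing("使用Python3.9版本"))
```

Because each pattern only matches characters that touch each other, text that already has a space, English phrases such as machine learning, and numbers such as 123,456 pass through unchanged.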
Implementation Guidelines
Semantic Matching Strategy
Use AI to match SRT content with origin.md:
- Don’t rely on exact string matching
- Understand context and meaning
- Account for transcription variations
- Match concepts, not just words
- Handle paraphrasing and reordering
Correction Priority
Apply corrections in this order:
1. Critical typos affecting meaning
2. Proper nouns and terminology
3. Spacing around English/numbers
4. Minor spelling variations
Quality Checks
Before generating output:
- Verify all timestamps are unchanged
- Confirm subtitle count matches original
- Validate no content added from markdown
- Check spacing rules applied consistently
- Ensure proper nouns are standardized
- Validate output filename is enhanced.srt
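The first two checks are mechanical and could be confirmed after the fact with a small comparison like the one below. It is only a sketch for verifying a finished enhanced.srt, not a replacement for the AI-driven enhancement itself. The names timeline and check_structure are hypothetical, and the timestamp pattern assumes the standard HH:MM:SS,mmm --> HH:MM:SS,mmm form used in the example.

```python
import re

TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


def timeline(srt: str) -> list[tuple[str, str]]:
    """Return (number, timestamp) pairs in order of appearance."""
    lines = [line.strip() for line in srt.splitlines()]
    return [
        (lines[i - 1], line)
        for i, line in enumerate(lines)
        if i > 0 and TIMESTAMP.match(line)
    ]


def check_structure(original: str, enhanced: str) -> list[str]:
    """List every difference in subtitle count, numbering, or timestamps."""
    problems = []
    before, after = timeline(original), timeline(enhanced)
    if len(before) != len(after):
        problems.append(f"subtitle count changed: {len(before)} -> {len(after)}")
    for (n1, t1), (n2, t2) in zip(before, after):
        if n1 != n2:
            problems.append(f"numbering changed: {n1} -> {n2}")
        if t1 != t2:
            problems.append(f"timestamp changed in subtitle {n1}: {t1} -> {t2}")
    return problems
```

An empty list means the timeline and numbering survived. The remaining checks (no content added, proper nouns standardized) still depend on reading the text against origin.md.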
Example Enhancement
origin.md:
今天要介紹 Python 3.9 的新功能。首先是 match case 語句，這是一個強大的模式匹配工具。
Input SRT:
1
00:00:00,000 --> 00:00:05,000
今天要介紹Python3.9的新功能

2
00:00:05,000 --> 00:00:10,000
首先是match case語句這是一個強大的模式匹配工俱
Output SRT (enhanced.srt):
1
00:00:00,000 --> 00:00:05,000
今天要介紹 Python 3.9 的新功能

2
00:00:05,000 --> 00:00:10,000
首先是 match case 語句,這是一個強大的模式匹配工具
Changes Made:
- Added spaces around Python, 3.9, match case
- Corrected 工俱 → 工具 (typo fix from reference)
- Preserved all timestamps and numbering
Important Constraints
Must NOT:
- Modify timestamps or subtitle numbering
- Add content from origin.md that doesn’t exist in SRT
- Use Python scripts or automation (AI analysis required)
- Change the meaning or intent of subtitles
- Remove existing content from SRT
- Alter SRT structure or formatting
Must DO:
- Use AI to semantically match content
- Compare each subtitle with reference document
- Apply spacing rules consistently
- Correct typos based on reference
- Standardize proper nouns and terminology
- Output to enhanced.srt
- Preserve exact timestamps and structure
Error Handling
Missing origin.md:
- Check same directory as SRT file
- Prompt user for location
- Cannot proceed without reference document
No Matching Content:
- If SRT content has no match in origin.md, leave its wording unchanged
- Don’t guess or invent corrections
- Apply spacing rules even without reference match
Ambiguous Matches:
- Use context to determine best match
- Prefer conservative corrections
- When uncertain, preserve original text
Additional Resources
Reference Files
- references/enhancement-rules.md – Detailed rules for typo detection, proper noun extraction, and semantic matching strategies
- references/spacing-examples.md – Comprehensive examples of spacing rules with edge cases
Workflow Summary
To enhance an SRT file with reference document:
- Locate origin.md in the same directory as the SRT file
- Read and parse both files
- Build knowledge base from origin.md (proper nouns, terminology, correct spellings)
- For each SRT subtitle segment:
  - Semantically match with reference content
  - Identify typos, incorrect proper nouns, missing spaces
  - Apply corrections while preserving timeline
- Validate all corrections maintain content fidelity
- Save output as enhanced.srt
- Verify timestamps unchanged and no content added
Focus on semantic understanding and conservative corrections. The goal is to refine existing subtitle content using the reference document as a guide, not to rewrite or add new content. Maintain the integrity of the original SRT structure while improving accuracy and formatting.