snap-context
npx skills add https://github.com/sohilpandya/skills --skill snap-context
Agent 安装分布
Skill 文档
Screenshot to Structured Markdown
When this skill is invoked, you MUST delegate image analysis to subagents using the Task tool. Do NOT read or process any images in the current context â this keeps image tokens out of the main conversation.
How to invoke
Single image
Use the Task tool with these parameters:
subagent_type:"general-purpose"model:"sonnet"description:"Extract screenshot to markdown"prompt: The full agent prompt below, withIMAGE_PATHreplaced by the actual file path
Multiple images
Spawn one Task per image in parallel (all in a single message with multiple tool calls). Each subagent processes one image independently. Replace IMAGE_PATH in each prompt with the corresponding file path.
Identifying images
Collect all image file paths from:
- Explicit paths in
$ARGUMENTS - Images attached/pasted in the user’s message (these appear as
[Image: source: /path/to/file])
If no images are found, ask the user for the image file path(s) before spawning any agents.
Agent prompt
Pass this exact prompt to each Task agent (replacing IMAGE_PATH with the real path for that image):
You are a screenshot-to-markdown converter. Use the Read tool to open the image at: IMAGE_PATH
Detect the structure type and output clean markdown. Follow these rules exactly.
Detection Priority
Pick the first type that clearly fits. If none fit, use Plain Text.
- Table â grid of aligned columns with repeated rows of data
- Form â labeled fields with values (Label: Value pairs), possibly grouped under section headings
- Card â 2â6 side-by-side content blocks arranged in columns
- Code â monospaced text with syntax patterns (braces, keywords, indentation)
- Dialog â a small, narrow overlay box with a title, message, and action buttons
- Hierarchy â indented/nested list structure (file trees, outlines, task lists)
- Plain Text â paragraphs of text that don’t match any above type
Context Detection
Before formatting the main content, check for:
-
Sidebar navigation: If a left-side nav panel is visible, extract it as:
Context: AppName > ActivePage Sidebar: item1, ActivePage, item3 Bold the active page. Place above main content with a blank line separator.
-
Modal/dialog overlay: If a modal on a dimmed background, focus only on the modal â ignore the background.
Formatting Rules
Table:
- Pipe-delimited markdown table with padded columns (minimum 3 chars)
- Use distinguishable header row (bold, ALL CAPS, or first row)
- Preserve context lines above and footer lines below
Form:
- Bullet list with bolded labels:
- **Label:** Value - Group under
## Section Headingwhen section headers are visible - Omit section headings if none exist
Card:
##for overall title,###for each card title- Subtitle = smaller text below card title
- Action buttons as
**[Label]**
Code:
- Fenced code block with language tag (swift, python, javascript, rust, go, java, bash, html, sql, typescript)
- Omit language tag only if truly unidentifiable
- Preserve indentation exactly
Dialog:
- Everything inside a blockquote
- Title as
> ## Title - Buttons as
> **[OK]** **[Cancel]** - Menus (vertical list, no title/buttons): same blockquote, each item on its own line
Hierarchy:
- 2-space indent per nesting level
- Preserve bullet types:
-unordered,1.numbered,- [x]/- [ ]checkboxes - Convert
â¢and*to-, convertâ/âto- [x], convertâ/âto- [ ]
Plain Text:
- Separate paragraphs with blank lines
- Preserve line breaks within paragraphs
Output Rules
- Output ONLY the formatted markdown â no explanations, no preamble, no commentary
- If sidebar context was detected, include it at the top
- Pick exactly one structure type for the main content
- Be precise â transcribe text exactly as shown, do not paraphrase or summarize
- For ambiguous cases (e.g. a form inside a dialog), prefer the outer container type
After the agent(s) return
Single image
Return the agent’s markdown output directly to the user. Do not add any wrapper text, explanation, or commentary â just the raw markdown result.
Multiple images
Return each agent’s markdown output separated by a horizontal rule (---) and prefixed with the filename in bold for clarity. Example:
screenshot-1.png
(markdown output)
screenshot-2.png
(markdown output)