agent-media
Total installs: 19
Weekly installs: 11
Site rank: #18780
Install command
npx skills add https://github.com/agntswrm/agent-media --skill agent-media
Agent install distribution
- claude-code: 10
- opencode: 9
- gemini-cli: 8
- codex: 8
- windsurf: 7
- cursor: 7
Skill documentation
Agent Media
Agent Media is an agent-first media toolkit that provides CLI-accessible commands for image, video, and audio processing. All commands produce deterministic, machine-readable JSON output.
Available Commands
Image Commands
- agent-media image resize – Resize an image
- agent-media image convert – Convert image format
- agent-media image remove-background – Remove image background
- agent-media image generate – Generate image from text
Audio Commands
- agent-media audio extract – Extract audio from video
- agent-media audio transcribe – Transcribe audio to text
Video Commands
- agent-media video generate – Generate video from text or image
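As a rough illustration of how an agent might assemble one of the commands above, the sketch below validates a media type and action against the list in this document and returns an argument vector. `COMMANDS` and `build_argv` are hypothetical names introduced here, not part of agent-media. Per-command flags are not documented in this section, so they are passed through untouched rather than assumed.

```python
from __future__ import annotations

# Commands listed in this document, keyed by media type.
COMMANDS = {
    "image": {"resize", "convert", "remove-background", "generate"},
    "audio": {"extract", "transcribe"},
    "video": {"generate"},
}


def build_argv(media_type: str, action: str, *extra_args: str) -> list[str]:
    """Return the argument vector for one agent-media call (hypothetical helper).

    extra_args is forwarded as-is; this document does not list per-command
    flags, so none are assumed here.
    """
    actions = COMMANDS.get(media_type)
    if actions is None:
        raise ValueError(f"unknown media type: {media_type!r}")
    if action not in actions:
        raise ValueError(f"unknown {media_type} action: {action!r}")
    return ["agent-media", media_type, action, *extra_args]


print(build_argv("image", "resize"))  # ['agent-media', 'image', 'resize']
```

The returned list can be handed to `subprocess.run` once the real flags for the chosen command are known.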
Output Format
All commands return JSON to stdout:
{
  "ok": true,
  "media_type": "image",
  "action": "resize",
  "provider": "local",
  "output_path": "output_123.webp",
  "mime": "image/webp",
  "bytes": 12345
}
On error:
{
  "ok": false,
  "error": {
    "code": "INVALID_INPUT",
    "message": "input file not found"
  }
}
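Because both shapes share the `ok` flag, a caller can branch on it before touching any other field. The following is a minimal sketch of that handling, assuming the stdout of a command has already been captured as a string. `AgentMediaError` and `parse_result` are hypothetical names, and the `"UNKNOWN"` fallback code is an assumption for results that lack an `error` object.

```python
import json


class AgentMediaError(Exception):
    """Raised when a result has "ok": false (hypothetical helper class)."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_result(stdout: str) -> dict:
    """Parse one JSON result; return it on success, raise on failure."""
    result = json.loads(stdout)
    if result.get("ok") is True:
        return result
    # "UNKNOWN" is an assumed fallback, not a code defined by agent-media.
    error = result.get("error") or {}
    raise AgentMediaError(error.get("code", "UNKNOWN"), error.get("message", ""))


success = parse_result(
    '{"ok": true, "media_type": "image", "action": "resize", '
    '"provider": "local", "output_path": "output_123.webp", '
    '"mime": "image/webp", "bytes": 12345}'
)
print(success["output_path"])  # output_123.webp
```

Raising on `"ok": false` keeps the success path free of error checks. Returning the raw dictionary would work equally well for callers that prefer to inspect `error.code` themselves.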
Providers
- local – Default provider using Sharp (resize, convert) and Transformers.js (remove-background, transcribe)
- fal – fal.ai provider (generate, edit, remove-background, transcribe, video)
- replicate – Replicate API (generate, edit, remove-background, transcribe, video)
- runpod – Runpod API (generate, edit)
- ai-gateway – Vercel AI Gateway (generate, edit)
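The provider list above can be read as a capability table. The sketch below encodes it as a dictionary and looks up which providers advertise a given capability. The capability labels are copied verbatim from the list; `PROVIDER_ACTIONS` and `providers_for` are hypothetical names, and the table reflects only what this document states, not the tool's actual internals.

```python
from __future__ import annotations

# Capabilities per provider, copied from the list in this document.
PROVIDER_ACTIONS = {
    "local": ["resize", "convert", "remove-background", "transcribe"],
    "fal": ["generate", "edit", "remove-background", "transcribe", "video"],
    "replicate": ["generate", "edit", "remove-background", "transcribe", "video"],
    "runpod": ["generate", "edit"],
    "ai-gateway": ["generate", "edit"],
}


def providers_for(action: str) -> list[str]:
    """Return providers that list the given capability, in document order."""
    return [name for name, actions in PROVIDER_ACTIONS.items() if action in actions]


print(providers_for("remove-background"))  # ['local', 'fal', 'replicate']
print(providers_for("resize"))             # ['local']
```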
Provider Selection
- Explicit: --provider <name>
- Auto-detect from environment variables
- Fallback to local provider
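The selection rules above can be sketched as a small function. It is an illustration only: it assumes the rules apply in the order listed, and that when several API keys are set the first match in the order of the Environment Variables section wins. The document does not state how ties are broken, nor whether the real tool also checks that the provider supports the requested action. `PROVIDER_ENV_VARS` and `select_provider` are hypothetical names.

```python
from __future__ import annotations

from typing import Mapping, Optional

# Provider name and the variable that enables it. The ordering is an
# assumption taken from the order of the Environment Variables section.
PROVIDER_ENV_VARS = [
    ("fal", "FAL_API_KEY"),
    ("replicate", "REPLICATE_API_TOKEN"),
    ("runpod", "RUNPOD_API_KEY"),
    ("ai-gateway", "AI_GATEWAY_API_KEY"),
]


def select_provider(explicit: Optional[str], env: Mapping[str, str]) -> str:
    """Pick a provider: explicit flag, then environment, then local."""
    if explicit:
        return explicit
    for name, variable in PROVIDER_ENV_VARS:
        if env.get(variable):  # set and non-empty
            return name
    return "local"


print(select_provider(None, {"REPLICATE_API_TOKEN": "token"}))  # replicate
print(select_provider(None, {}))                                # local
```

In real use, `os.environ` would be passed as `env`. Taking it as a parameter keeps the function easy to test.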
Environment Variables
- AGENT_MEDIA_DIR – Custom output directory
- FAL_API_KEY – Enable fal provider
- REPLICATE_API_TOKEN – Enable replicate provider
- RUNPOD_API_KEY – Enable runpod provider
- AI_GATEWAY_API_KEY – Enable ai-gateway provider
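When agent-media is launched from a script, these variables can be set for the child process alone rather than exported globally. The sketch below builds such an environment with a custom output directory. `child_env` is a hypothetical helper, and the directory path is a placeholder.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional


def child_env(output_dir: str, base: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return a copy of the environment with AGENT_MEDIA_DIR set.

    base defaults to the current process environment; it is never mutated.
    """
    env = dict(os.environ if base is None else base)
    env["AGENT_MEDIA_DIR"] = output_dir
    return env


env = child_env("/tmp/media", {"PATH": "/usr/bin"})
print(env["AGENT_MEDIA_DIR"])  # /tmp/media
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run`. API keys already present in the base environment are carried over unchanged.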