videodb

📁 video-db/skills 📅 3 days ago
Install command
npx skills add https://github.com/video-db/skills --skill videodb

Skill documentation

VideoDB Python Skill

A single API-first video stack for agents. Ingest anything, process server-side, and ship playable streams without FFmpeg glue.

Core capabilities

  • Ingest + Transcode — Accept any format, change codec/bitrate/FPS/resolution, output a playable stream (CDN + hosting)
  • Scene-level Search Engine — Build a searchable index scene by scene, find exact moments and auto-create clips, manage 1000s of hours of footage
  • Generate + Compose — Generate image/audio/video/text assets, overlay text/images/branding/motion captions, dub videos, translate captions
  • Real-time RTSP — Connect live streams, define events and alerts, ideal for security cams and monitoring
  • Desktop Perception — Capture screen/mic/system audio, stream desktop live, define alerts and triggers, store episodic memory and search sessions

Try it now

  • “Ingest this file and give me a playable web stream link”
  • “Generate subtitles, burn them in, and add light background music”
  • “Index this folder and find every scene with people”
  • “Connect this RTSP URL and alert when a person enters the zone”
  • “Start recording and give me an actionable summary when it ends”

Running Python code

scripts/videodb_env.py handles environment loading and SDK validation. Call load_vdb_env() before any VideoDB code — it checks the SDK is installed, loads VIDEO_DB_API_KEY from environment variable → ./.env → ~/.videodb/.env, and exits with clear error messages if anything is missing. All scripts in scripts/ already call this at the top.

Inline usage:

python -c "from scripts.videodb_env import load_vdb_env; load_vdb_env(); print('OK')"

Usage in a script file:

from scripts.videodb_env import load_vdb_env
load_vdb_env()

import videodb
conn = videodb.connect()

videodb.connect() reads VIDEO_DB_API_KEY from the environment automatically.
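
The lookup order described above for load_vdb_env() (environment variable, then ./.env, then ~/.videodb/.env) can be illustrated with a short sketch. This is not the code in scripts/videodb_env.py: the name find_key_source and its simple KEY=value parsing are assumptions made for illustration. It only reports where the key is defined and never returns or prints the key itself.

```python
import os
from pathlib import Path

KEY_NAME = "VIDEO_DB_API_KEY"


def find_key_source(environ=None, env_files=None):
    """Return a label for the first place KEY_NAME is defined, or None.

    Hypothetical helper: it mirrors the documented lookup order but is not
    the real load_vdb_env(). The key value is never returned or printed.
    """
    environ = os.environ if environ is None else environ
    if env_files is None:
        env_files = [Path(".env"), Path.home() / ".videodb" / ".env"]

    # 1. A non-empty environment variable takes precedence.
    if environ.get(KEY_NAME):
        return "environment"

    # 2. Otherwise check each .env file in the documented order.
    for path in env_files:
        path = Path(path)
        if not path.is_file():
            continue
        for line in path.read_text().splitlines():
            name, sep, value = line.strip().partition("=")
            if sep and name.strip() == KEY_NAME and value.strip():
                return str(path)
    return None
```

Calling find_key_source() with no arguments checks the real environment and the two default files. A result of None corresponds to the "VIDEO_DB_API_KEY not found" case, where the user should set the key themselves.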

Do NOT write a script file when a short inline command works.

When writing inline Python (python -c "..."), always use properly formatted code — use semicolons to separate statements and keep it readable. For anything longer than ~3 statements, use a heredoc instead:

python << 'EOF'
from scripts.videodb_env import load_vdb_env
load_vdb_env()

import videodb
conn = videodb.connect()
coll = conn.get_collection()
print(f"Videos: {len(coll.get_videos())}")
EOF

Setup

When the user asks to “setup videodb” or similar:

1. Install SDK

pip install "videodb[capture]" python-dotenv

If videodb[capture] fails on Linux, install without the capture extra:

pip install videodb python-dotenv

2. Verify environment

Run load_vdb_env() to verify the SDK and API key:

python -c "from scripts.videodb_env import load_vdb_env; load_vdb_env(); print('OK')"

If it prints OK, setup is done.

If it fails with VIDEO_DB_API_KEY not found, ask the user to:

  1. Get a free API key at https://console.videodb.io (50 free uploads, no credit card)
  2. Set it up using either of these methods:
    • Set it as an environment variable: export VIDEO_DB_API_KEY=your-key
    • Or save it to ~/.videodb/.env as VIDEO_DB_API_KEY=your-key

Then re-run to confirm.

Do NOT read, write, or handle the API key yourself. Always let the user set it.

Quick Reference

Upload media

# URL
video = coll.upload(url="https://example.com/video.mp4")

# YouTube
video = coll.upload(url="https://www.youtube.com/watch?v=VIDEO_ID")

# Local file
video = coll.upload(file_path="/path/to/video.mp4")

Transcript + subtitle

# force=True skips the error if the video is already indexed
video.index_spoken_words(force=True)
text = video.get_transcript_text()
stream_url = video.add_subtitle()

Search inside videos

from videodb.exceptions import InvalidRequestError

video.index_spoken_words(force=True)

# search() raises InvalidRequestError when no results are found.
# Always wrap in try/except and treat "No results found" as empty.
try:
    results = video.search("product demo")
    shots = results.get_shots()
    stream_url = results.compile()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
        stream_url = None  # nothing to compile when there are no matches
    else:
        raise
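
The try/except pattern above can be wrapped once and reused across searches. The helper below is a sketch: search_shots is a hypothetical name, and it catches Exception and inspects the message only so that the example has no import dependency. In real code, narrow the except clause to InvalidRequestError as in the snippet above.

```python
def search_shots(video, query, **search_kwargs):
    """Return matching shots, or an empty list when nothing matches.

    Hypothetical wrapper around video.search(). Any error other than
    "No results found" is re-raised unchanged.
    """
    try:
        results = video.search(query, **search_kwargs)
        return results.get_shots()
    except Exception as exc:  # narrow to InvalidRequestError in real code
        if "No results found" in str(exc):
            return []
        raise
```

With this in place, shots = search_shots(video, "product demo") yields a list in both the match and no-match cases, so callers can simply test whether the list is empty.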

Scene search

import re
from videodb import SearchType, IndexType, SceneExtractionType
from videodb.exceptions import InvalidRequestError

# index_scenes() has no force parameter — it raises an error if a scene
# index already exists. Extract the existing index ID from the error.
try:
    scene_index_id = video.index_scenes(
        extraction_type=SceneExtractionType.shot_based,
        prompt="Describe the visual content in this scene.",
    )
except Exception as e:
    match = re.search(r"id\s+([a-f0-9]+)", str(e))
    if match:
        scene_index_id = match.group(1)
    else:
        raise

# Use score_threshold to filter low-relevance noise (recommended: 0.3+)
try:
    results = video.search(
        query="person writing on a whiteboard",
        search_type=SearchType.semantic,
        index_type=IndexType.scene,
        scene_index_id=scene_index_id,
        score_threshold=0.3,
    )
    shots = results.get_shots()
    stream_url = results.compile()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
        stream_url = None  # nothing to compile when there are no matches
    else:
        raise
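
The ID-recovery step used for index_scenes() above can be isolated into a small function so it can be checked without calling the API. This is a minimal sketch: extract_scene_index_id is a hypothetical name, and it assumes the error text has the form "Scene index with id <hex> already exists".

```python
import re

# Same pattern as in the snippet above: "id", whitespace, then a lowercase hex string.
_SCENE_INDEX_ID = re.compile(r"id\s+([a-f0-9]+)")


def extract_scene_index_id(error_message):
    """Return the scene index ID embedded in an error message, or None."""
    match = _SCENE_INDEX_ID.search(error_message)
    return match.group(1) if match else None
```

For example, extract_scene_index_id("Scene index with id 0a1b2c3d already exists") returns "0a1b2c3d". The pattern is unanchored, so it can also match inside other words ending in "id" (such as "valid"). Apply it only to this specific error.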

Timeline editing

Important: Always validate timestamps before building a timeline:

  • start must be >= 0 (negative values are silently accepted but produce broken output)
  • start must be < end
  • end must be <= video.length

from videodb.timeline import Timeline
from videodb.asset import VideoAsset, TextAsset, TextStyle

timeline = Timeline(conn)
timeline.add_inline(VideoAsset(asset_id=video.id, start=10, end=30))
timeline.add_overlay(0, TextAsset(text="The End", duration=3, style=TextStyle(fontsize=36)))
stream_url = timeline.generate_stream()
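
The three timestamp rules listed above can be enforced before constructing a VideoAsset. The function below is a minimal sketch: validate_segment is a hypothetical helper, not part of the VideoDB SDK, and it assumes all three values are numbers in seconds.

```python
def validate_segment(start, end, video_length):
    """Raise ValueError unless 0 <= start < end <= video_length."""
    if start < 0:
        raise ValueError(f"start must be >= 0, got {start}")
    if start >= end:
        raise ValueError(f"start ({start}) must be < end ({end})")
    if end > video_length:
        raise ValueError(f"end ({end}) must be <= video length ({video_length})")
    return start, end
```

A typical call would be start, end = validate_segment(10, 30, video.length) immediately before VideoAsset(asset_id=video.id, start=start, end=end). Because negative starts are silently accepted by the service, failing early is the only way to catch them.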

Transcode video (resolution / quality change)

from videodb import TranscodeMode, VideoConfig, AudioConfig

# Change resolution, quality, or aspect ratio server-side
job_id = conn.transcode(
    source="https://example.com/video.mp4",
    callback_url="https://example.com/webhook",
    mode=TranscodeMode.economy,
    video_config=VideoConfig(resolution=720, quality=23, aspect_ratio="16:9"),
    audio_config=AudioConfig(mute=False),
)

Reframe aspect ratio (for social platforms)

Warning: reframe() is a slow server-side operation. For long videos it can take several minutes and may time out. Best practices:

  • Always limit to a short segment using start/end when possible
  • For full-length videos, use callback_url for async processing
  • Trim the video on a Timeline first, then reframe the shorter result

from videodb import ReframeMode

# Always prefer reframing a short segment:
reframed = video.reframe(start=0, end=60, target="vertical", mode=ReframeMode.smart)

# Async reframe for full-length videos (returns None, result via webhook):
video.reframe(target="vertical", callback_url="https://example.com/webhook")

# Presets: "vertical" (9:16), "square" (1:1), "landscape" (16:9)
reframed = video.reframe(start=0, end=60, target="square")

# Custom dimensions
reframed = video.reframe(start=0, end=60, target={"width": 1280, "height": 720})
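
The choice between a short synchronous segment and an async callback can be made explicit in code. The sketch below builds keyword arguments for video.reframe() using only the parameters shown above. reframe_kwargs is a hypothetical helper, and the 60-second cutoff is an assumption chosen for illustration, not a documented VideoDB limit.

```python
def reframe_kwargs(video_length, target="vertical", callback_url=None,
                   max_sync_seconds=60):
    """Build keyword arguments for video.reframe() following the advice above.

    max_sync_seconds is an assumed cutoff, not a documented VideoDB limit.
    """
    if video_length <= max_sync_seconds:
        # Short enough to reframe synchronously as one explicit segment.
        return {"start": 0, "end": video_length, "target": target}
    if callback_url:
        # Long video: hand off to async processing; the result arrives via webhook.
        return {"target": target, "callback_url": callback_url}
    raise ValueError(
        "Video is long; pass callback_url or trim to a shorter segment first"
    )
```

Usage would look like video.reframe(**reframe_kwargs(video.length, callback_url="https://example.com/webhook")), assuming video.length is a number of seconds.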

Generative media

image = coll.generate_image(
    prompt="a sunset over mountains",
    aspect_ratio="16:9",
)

Error handling

from videodb.exceptions import AuthenticationError, InvalidRequestError

try:
    conn = videodb.connect()
except AuthenticationError:
    print("Check your VIDEO_DB_API_KEY")

try:
    video = coll.upload(url="https://example.com/video.mp4")
except InvalidRequestError as e:
    print(f"Upload failed: {e}")

Common pitfalls

| Scenario | Error message | Solution |
|---|---|---|
| Indexing an already-indexed video | Spoken word index for video already exists | Use video.index_spoken_words(force=True) to skip if already indexed |
| Scene index already exists | Scene index with id XXXX already exists | Extract the existing scene_index_id from the error with re.search(r"id\s+([a-f0-9]+)", str(e)) |
| Search finds no matches | InvalidRequestError: No results found | Catch the exception and treat as empty results (shots = []) |
| Reframe times out | Blocks indefinitely on long videos | Use start/end to limit segment, or pass callback_url for async |
| Negative timestamps on Timeline | Silently produces broken stream | Always validate start >= 0 before creating VideoAsset |
| generate_video() / create_collection() fails | Operation not allowed or maximum limit | Plan-gated features: inform the user about plan limits |
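
The message-based pitfalls above can be turned into a lookup so that an agent reacts consistently to known errors. This is a sketch under the assumption that the error text contains the substrings quoted in the table. PITFALL_FIXES and suggest_fix are hypothetical names, not part of the SDK.

```python
# Substring of the error message -> suggested fix, taken from the table above.
PITFALL_FIXES = [
    ("Spoken word index", "Call video.index_spoken_words(force=True)."),
    ("Scene index with id", "Reuse the scene_index_id contained in the error message."),
    ("No results found", "Treat as empty results (shots = [])."),
    ("Operation not allowed", "Plan-gated feature; tell the user about plan limits."),
    ("maximum limit", "Plan-gated feature; tell the user about plan limits."),
]


def suggest_fix(error):
    """Return the suggested fix for a known error, or None if unrecognised."""
    message = str(error)
    for needle, fix in PITFALL_FIXES:
        if needle in message:
            return fix
    return None
```

Errors that return None are not covered by the table and should be surfaced to the user unchanged.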

Additional docs

Reference documentation is in the reference/ directory adjacent to this SKILL.md file. Use the Glob tool to locate it if needed.

Screen Recording (Desktop Capture)

Use scripts/capture_bg.py for screen and audio recording with AI transcription and visual indexing.

Start Recording

Run in background mode:

python scripts/capture_bg.py start &

The recording will capture:

  • Screen video
  • Microphone audio
  • System audio
  • Real-time AI transcription
  • Visual scene descriptions

Check Status

cat /tmp/videodb_capture_state.json

Stop Recording

Create the stop file to signal recording to finish:

touch /tmp/videodb_capture_stop

Or run:

python scripts/capture_bg.py stop

Get Shareable Link

After stopping, the state file contains the video URL:

cat /tmp/videodb_capture_state.json

Returns JSON with player_url for sharing.
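
Instead of running cat repeatedly, the state file can be polled from Python until the link appears. The sketch below assumes the state file is a JSON object and relies only on the player_url field mentioned above. The function names, the timeout, and the polling interval are illustrative choices, not part of scripts/capture_bg.py.

```python
import json
import time
from pathlib import Path

STATE_FILE = Path("/tmp/videodb_capture_state.json")


def read_capture_state(state_file=STATE_FILE):
    """Return the parsed state file, or None if it is missing or half-written."""
    try:
        return json.loads(Path(state_file).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return None


def wait_for_player_url(state_file=STATE_FILE, timeout=60.0, poll_interval=1.0):
    """Poll the state file until player_url appears; return None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = read_capture_state(state_file)
        if isinstance(state, dict) and state.get("player_url"):
            return state["player_url"]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

After signalling the recording to stop, wait_for_player_url() returns the shareable link once it has been written, or None if it has not appeared within the timeout.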

Utility scripts

Ready-to-run scripts are in the scripts/ directory adjacent to this SKILL.md file. Read and execute them directly instead of rewriting the logic.

Do not use ffmpeg, moviepy, or local encoding tools when VideoDB supports the operation. The following are all handled server-side by VideoDB — trimming, combining clips, overlaying audio or music, adding subtitles, text/image overlays, transcoding, resolution changes, aspect-ratio conversion, resizing for platform requirements, transcription, and media generation. Only fall back to local tools for operations listed under Limitations in reference/editor.md (transitions, speed changes, crop/zoom, colour grading, volume mixing).

When to use what

| Problem | VideoDB solution |
|---|---|
| Platform rejects video aspect ratio or resolution | video.reframe() or conn.transcode() with VideoConfig |
| Need to resize video for Twitter/Instagram/TikTok | video.reframe(target="vertical") or target="square" |
| Need to change resolution (e.g. 1080p → 720p) | conn.transcode() with VideoConfig(resolution=720) |
| Need to overlay audio/music on video | AudioAsset on a Timeline |
| Need to add subtitles | video.add_subtitle() or CaptionAsset |
| Need to combine/trim clips | VideoAsset on a Timeline |
| Need to generate voiceover, music, or SFX | coll.generate_voice(), generate_music(), generate_sound_effect() |