agent-browser

📁 itechmeat/llm-code 📅 Jan 28, 2026

#53648

Site-wide rank

Install command

npx skills add https://github.com/itechmeat/llm-code --skill agent-browser

Agent install distribution

windsurf 1

opencode 1

cursor 1

codex 1

claude-code 1

Skill documentation

Agent Browser

Headless browser automation CLI for AI agents. Fast Rust CLI with Node.js fallback.

Works with: Claude Code, Cursor, GitHub Copilot, OpenAI Codex, Google Gemini, opencode.

Quick Navigation

Topic	Reference
Installation	installation.md
Commands	commands.md
Refs	refs.md
Advanced	advanced.md

When to Use

Automating browser tasks in AI agent workflows
Web scraping with AI-friendly output
Testing web applications with LLM agents
Managing multiple browser sessions with isolated auth

Core Concepts

Refs (Element References)

The snapshot command returns an accessibility tree in which each element has a unique ref such as @e1 or @e2. Refs are:

Deterministic – ref points to exact element from snapshot
Fast – no DOM re-query needed
AI-friendly – LLMs can reliably parse and use refs
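
To illustrate why refs are easy for tools to consume, the sketch below pulls every ref out of snapshot text and rewrites it in the @eN form that the action commands accept. The `[ref=eN]` marker it matches is an assumption about how snapshot lines are printed, and `extract_refs` is a hypothetical helper, not part of agent-browser.

```shell
# Hypothetical helper: list the refs found in snapshot text read from stdin.
# Assumes each element line carries a marker like [ref=e1]; adjust the
# pattern if your agent-browser version prints refs differently.
extract_refs() {
  grep -oE '\[ref=e[0-9]+\]' | sed -E 's/^\[ref=(e[0-9]+)\]$/@\1/'
}
```

Under that assumption, piping a snapshot through the helper (agent-browser snapshot -i | extract_refs) would print one ref per line, ready to pass to click, fill, or get text.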

Architecture

Client-daemon architecture:

Rust CLI – parses commands, communicates with daemon
Node.js Daemon – manages Playwright browser instance

Daemon starts automatically and persists between commands.

Quick Example

# Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot                    # Get accessibility tree with refs
agent-browser click @e2                   # Click by ref from snapshot
agent-browser fill @e3 "test@example.com" # Fill input by ref
agent-browser get text @e1                # Get text by ref
agent-browser screenshot page.png         # Save screenshot
agent-browser close

AI Workflow Pattern

Optimal workflow for AI agents:

# 1. Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot -i --json   # AI parses tree and refs

# 2. AI identifies target refs from snapshot

# 3. Execute actions using refs
agent-browser click @e2
agent-browser fill @e3 "input text"

# 4. Get new snapshot if page changed
agent-browser snapshot -i --json
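
Steps 3 and 4 can be combined so that a fresh snapshot always follows an action. The following is a minimal sketch rather than an official pattern: `act_and_snapshot` and the `AGENT_BROWSER_BIN` override are hypothetical names introduced here for illustration, and only the commands already shown above are used.

```shell
# Hypothetical wrapper: run one action, then immediately take a fresh
# interactive snapshot so later steps never reuse stale refs.
# AGENT_BROWSER_BIN is an illustrative override (not an agent-browser
# setting) that lets the sketch target another binary or a stub.
act_and_snapshot() {
  ab="${AGENT_BROWSER_BIN:-agent-browser}"
  "$ab" "$@" || return 1        # the action, e.g. click @e2
  "$ab" snapshot -i --json      # refs in this output replace the old ones
}
```

For example, act_and_snapshot click @e2 would run the click and then print the new snapshot for the agent to parse before it chooses its next ref.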

Headed Mode (Debugging)

agent-browser open example.com --headed

JSON Output

Use --json for machine-readable output:

agent-browser snapshot --json
agent-browser get text @e1 --json
agent-browser is visible @e2 --json

Critical Prohibitions

Do not use CSS/XPath selectors when refs are available (use @e1, @e2, etc.)
Do not forget to close sessions when done
Do not assume element positions without taking a fresh snapshot
Do not use old refs after page navigation or content changes (re-snapshot)
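
One way to avoid leaving a session open is to wrap the work in a function that always closes the browser at the end. This is a sketch under assumptions: `with_browser` and `AGENT_BROWSER_BIN` are hypothetical names introduced for illustration, and it relies only on the open and close commands shown in this document.

```shell
# Hypothetical wrapper: open a URL, run the given command, and always close
# the browser afterwards, even if the command fails.
# AGENT_BROWSER_BIN is an illustrative override, not an agent-browser setting.
with_browser() {
  ab="${AGENT_BROWSER_BIN:-agent-browser}"
  url=$1
  shift
  "$ab" open "$url" || return 1
  status=0
  "$@" || status=$?             # keep the command's exit code
  "$ab" close                   # runs whether or not the command succeeded
  return "$status"
}
```

A call such as with_browser example.com agent-browser screenshot page.png would open the page, take the screenshot, and close the session, returning the screenshot command's exit status.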

Common Commands

# Navigation
agent-browser open <url>
agent-browser back
agent-browser forward
agent-browser reload
agent-browser close

# Interaction (<sel> can be a ref from the latest snapshot, e.g. @e2)
agent-browser click <sel>
agent-browser fill <sel> <text>
agent-browser press <key>
agent-browser hover <sel>
agent-browser select <sel> <val>
agent-browser download <sel> <path>  # v0.7+

# Info
agent-browser get text <sel>
agent-browser get url
agent-browser get title
agent-browser is visible <sel>

# Snapshots & Screenshots
agent-browser snapshot -i --json
agent-browser screenshot [path]