agent-browser
Total installs: 2
Weekly installs: 2
Site rank: #75660
Install command
npx skills add https://github.com/cjkihl/cjkihl --skill agent-browser
Installs by agent
openclaw: 2
gemini-cli: 2
claude-code: 2
github-copilot: 2
codex: 2
kimi-cli: 2
Skill Documentation
Agent Browser Skill
Browser automation CLI for AI agents. Fast Rust CLI with Node.js fallback, optimized for AI agent workflows.
When to Use
Use agent-browser when you need to:
- Automate web interactions (clicking, filling forms, navigating)
- Test user flows or end-to-end scenarios
- Scrape or extract web page content
- Perform browser-based tasks programmatically
- Debug web applications by interacting with them
- Take screenshots or generate PDFs from web pages
Core Workflow
The optimal AI agent workflow:
- Navigate: agent-browser open <url>
- Snapshot: agent-browser snapshot -i --json (get interactive elements with refs)
- Interact: Use refs (@e1, @e2) to click, fill, etc.
- Re-snapshot: After the page changes, take a new snapshot
Key Commands
Navigation
- agent-browser open <url>: Navigate to URL (aliases: goto, navigate)
- agent-browser back: Go back
- agent-browser forward: Go forward
- agent-browser reload: Reload page
Snapshot (Critical for AI)
- agent-browser snapshot: Full accessibility tree with refs
- agent-browser snapshot -i: Interactive elements only (buttons, inputs, links)
- agent-browser snapshot -c: Compact (remove empty structural elements)
- agent-browser snapshot -d 3: Limit depth to 3 levels
- agent-browser snapshot --json: JSON output for agents
Interaction (Using Refs – Recommended)
- agent-browser click @e2: Click element by ref
- agent-browser fill @e3 "text": Fill input by ref
- agent-browser get text @e1: Get text by ref
- agent-browser hover @e4: Hover element by ref
Interaction (Using Selectors)
- agent-browser click "#submit": Click by CSS selector
- agent-browser fill "#email" "test@example.com": Fill by selector
- agent-browser find role button click --name "Submit": Semantic locator
Form Actions
- agent-browser type <sel> <text>: Type into element
- agent-browser fill <sel> <text>: Clear and fill
- agent-browser select <sel> <val>: Select dropdown option
- agent-browser check <sel>: Check checkbox
- agent-browser uncheck <sel>: Uncheck checkbox
Information Retrieval
- agent-browser get text <sel>: Get text content
- agent-browser get html <sel>: Get innerHTML
- agent-browser get value <sel>: Get input value
- agent-browser get title: Get page title
- agent-browser get url: Get current URL
- agent-browser get count <sel>: Count matching elements
State Checks
- agent-browser is visible <sel>: Check if visible
- agent-browser is enabled <sel>: Check if enabled
- agent-browser is checked <sel>: Check if checked
Screenshots & PDFs
- agent-browser screenshot [path]: Take screenshot
- agent-browser screenshot --full: Full page screenshot
- agent-browser pdf <path>: Save as PDF
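The screenshot and PDF commands above can be chained after a navigation step. The following is a minimal sketch, not part of the skill itself: the function name capture_page, the output file names, and the AGENT_BROWSER override variable are all hypothetical, introduced only so the helper can be pointed at a different binary.

```shell
# Hypothetical helper: open a page, then save a screenshot and a PDF of it.
# AGENT_BROWSER is an assumed override for the binary name, not a CLI feature.
capture_page() {
  # $1 = URL to open, $2 = output directory (created if missing)
  ab="${AGENT_BROWSER:-agent-browser}"
  mkdir -p "$2" || return 1
  # Stop at the first failing step so later files are not written blindly
  "$ab" open "$1" &&
    "$ab" screenshot "$2/page.png" &&
    "$ab" pdf "$2/page.pdf"
}
```

Usage would look like: capture_page https://example.com ./out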
Wait Operations
- agent-browser wait <selector>: Wait for element to be visible
- agent-browser wait 1000: Wait for time (milliseconds)
- agent-browser wait --text "Welcome": Wait for text to appear
- agent-browser wait --url "**/dash": Wait for URL pattern
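A common use of the wait commands is to guard an interaction so that it only runs once the target element is visible. A minimal sketch, assuming a hypothetical wrapper named wait_and_click and a hypothetical AGENT_BROWSER variable for overriding the binary name:

```shell
# Hypothetical helper: wait for an element to be visible, then click it.
# AGENT_BROWSER is an assumed override for the binary name, not a CLI feature.
wait_and_click() {
  # $1 = selector to wait for and click
  ab="${AGENT_BROWSER:-agent-browser}"
  # The click only runs if the wait succeeds
  "$ab" wait "$1" && "$ab" click "$1"
}
```

Usage would look like: wait_and_click "#submit"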
Browser Settings
- agent-browser set viewport <w> <h>: Set viewport size
- agent-browser set device <name>: Emulate device (e.g. "iPhone 14")
- agent-browser set headers <json>: Set HTTP headers
- agent-browser set credentials <u> <p>: HTTP basic auth
Sessions (Isolated Instances)
- agent-browser --session agent1 open site-a.com: Use isolated session
- agent-browser session list: List active sessions
- agent-browser session: Show current session
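To keep two agents from sharing browser state, each one can be given its own session name. A minimal sketch, assuming a hypothetical wrapper named open_isolated; the AGENT_BROWSER override variable is likewise an illustration and not part of the CLI:

```shell
# Hypothetical helper: open a URL inside a named, isolated session.
# AGENT_BROWSER is an assumed override for the binary name, not a CLI feature.
open_isolated() {
  # $1 = session name, $2 = URL to open
  "${AGENT_BROWSER:-agent-browser}" --session "$1" open "$2"
}
```

Calling open_isolated agent1 site-a.com and then open_isolated agent2 site-b.com would start two separate sessions, which agent-browser session list should then report.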
Network Control
- agent-browser network route <url>: Intercept requests
- agent-browser network route <url> --abort: Block requests
- agent-browser network route <url> --body <json>: Mock response
- agent-browser network requests: View tracked requests
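Routes are most useful when registered before the page loads. A minimal sketch of blocking one URL and then opening a page, assuming a hypothetical wrapper named open_with_blocked and a hypothetical AGENT_BROWSER override variable; the exact URL or pattern syntax accepted by network route is not specified here, so the arguments are passed through unchanged:

```shell
# Hypothetical helper: block requests to one URL, then open a page.
# AGENT_BROWSER is an assumed override for the binary name, not a CLI feature.
open_with_blocked() {
  # $1 = URL to block, $2 = page to open
  ab="${AGENT_BROWSER:-agent-browser}"
  # Register the blocking route first so it applies to the page load
  "$ab" network route "$1" --abort && "$ab" open "$2"
}
```

Afterwards, agent-browser network requests can be used to inspect which requests were tracked.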
Debugging
- agent-browser console: View console messages
- agent-browser errors: View page errors
- agent-browser highlight <sel>: Highlight element
- agent-browser trace start [path]: Start recording trace
Why Use Refs?
Refs (@e1, @e2, etc.) are recommended for AI agents because:
- Deterministic: Ref points to exact element from snapshot
- Fast: No DOM re-query needed
- AI-friendly: Snapshot + ref workflow is optimal for LLMs
- Stable: Refs persist until page changes
Example Workflow
# 1. Navigate
agent-browser open https://example.com
# 2. Get snapshot with refs
agent-browser snapshot -i --json
# Output includes refs like @e1, @e2, @e3
# 3. Interact using refs
agent-browser click @e2
agent-browser fill @e3 "test@example.com"
agent-browser click @e1
# 4. Re-snapshot after page changes
agent-browser snapshot -i --json
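Because refs are only valid until the page changes, steps 3 and 4 can be folded into one helper that always re-snapshots after an action. A minimal sketch, assuming a hypothetical function named act_and_resnapshot and a hypothetical AGENT_BROWSER variable for overriding the binary name:

```shell
# Hypothetical helper: run one agent-browser action, then re-snapshot.
# AGENT_BROWSER is an assumed override for the binary name, not a CLI feature.
act_and_resnapshot() {
  ab="${AGENT_BROWSER:-agent-browser}"
  # Run the action exactly as given, e.g. click @e2
  "$ab" "$@" || return 1
  # Earlier refs may be stale after the action, so fetch fresh ones
  "$ab" snapshot -i --json
}
```

Usage would look like: act_and_resnapshot click @e2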
Options
- --json: JSON output (for agents)
- --session <name>: Use isolated session
- --headers <json>: Set HTTP headers
- --headed: Show browser window (not headless)
- --debug: Debug output
- --executable-path <path>: Custom browser executable
Installation
Already installed. If needed:
npm install -g agent-browser
agent-browser install # Download Chromium
Best Practices
- Always use snapshots with refs for element selection
- Use the -i flag to filter to interactive elements only
- Use the --json flag for programmatic parsing
- Re-snapshot after page changes to get fresh refs
- Use sessions for isolated browser instances
- Wait for elements before interacting when needed
References
- GitHub: https://github.com/vercel-labs/agent-browser
- Documentation: https://agent-browser.dev