gemini-research-browser-use

📁 grasseed/google-search-browser-use 📅 Jan 23, 2026

Site-wide rank: #9529

Install command

npx skills add https://github.com/grasseed/google-search-browser-use --skill gemini-research-browser-use

Agent install distribution

antigravity 17

gemini-cli 16

codex 15

claude-code 14

cursor 13

Skill documentation

Gemini Research Browser Use

Overview

Perform research or queries using Google Gemini via Chrome DevTools Protocol (CDP). This method reuses the user’s existing Chrome login session to interact with the Gemini web interface (https://gemini.google.com/).

Prerequisites

Python + websockets. Verify:

python3 --version
python3 -m pip show websockets

Install if missing:

python3 -m pip install websockets

Google Chrome. Verify:

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --version

CDP port availability. Verify that Chrome is listening (after launching it in Step 2):
```
curl -s http://localhost:9222/json | python3 -m json.tool
```
Non-default user data directory (required by Chrome). Chrome enables remote debugging only when `--user-data-dir` is not the default profile path, so clone your profile to keep the login state:
```
rm -rf /tmp/chrome-gemini-profile
rsync -a "$HOME/Library/Application Support/Google/Chrome/" /tmp/chrome-gemini-profile/
```
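For scripted setups, the same clone step can be sketched in Python. This is a minimal sketch, not part of the original skill: `clone_profile` is a hypothetical helper, and skipping the `Singleton*` lock files is an assumption about which files are safe to leave behind.

```python
import shutil
from pathlib import Path

# Default Chrome profile location on macOS, as used in the rsync command above.
DEFAULT_PROFILE = Path.home() / "Library/Application Support/Google/Chrome"

def clone_profile(src=DEFAULT_PROFILE, dst="/tmp/chrome-gemini-profile"):
    """Replace dst with a fresh copy of the Chrome profile at src."""
    # Mirror `rm -rf`: remove any previous clone, ignoring a missing directory.
    shutil.rmtree(dst, ignore_errors=True)
    # Mirror `rsync -a`: keep symlinks as symlinks.
    # Assumption: Singleton* lock files belong to the running instance and are skipped.
    shutil.copytree(src, dst, symlinks=True,
                    ignore=shutil.ignore_patterns("Singleton*"))
    return Path(dst)
```

Calling `clone_profile()` with no arguments uses the same source and destination paths as the shell commands.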

Method Comparison

Method	Pros	Cons	Recommended
Chrome Remote Debugging (CDP)	Uses existing login, full automation, reliable	Requires Chrome restart with debugging flag	✅ Yes
`browser-use --browser real`	Simple CLI	Opens new session without login	❌ No
`browser_subagent`	Visual feedback	Rate limited, may fail	❌ No

✅ Recommended Method: Chrome Remote Debugging (CDP)

This is the most reliable of the three methods: it drives your system Chrome and reuses your existing Google login.

Prerequisites

Python 3 with websockets library
Google Chrome installed at /Applications/Google Chrome.app/
User logged into Google in Chrome

Step 1: Install websockets (if needed)

pip3 install websockets
# Or in virtual environment:
python3 -m venv .venv && ./.venv/bin/pip install websockets

Step 2: Launch Chrome with Remote Debugging (Non-default profile)

Important: Close any existing Chrome windows first, or use a different debugging port.

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome-gemini-profile" \
  "https://gemini.google.com/" &

Parameters explained:

--remote-debugging-port=9222: Enables CDP on port 9222
--user-data-dir: Points to the cloned Chrome profile (which carries your login session); Chrome will not enable CDP on the default profile directory
The URL opens Gemini directly
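If the launch needs to happen from Python rather than the shell, the same three parameters can be assembled as an argument list. This is a sketch under stated assumptions: `build_chrome_command` and `launch_chrome` are hypothetical helpers, and the binary path is the macOS one used above.

```python
import subprocess

# macOS Chrome binary, as used in the shell command above.
CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

def build_chrome_command(port=9222,
                         profile_dir="/tmp/chrome-gemini-profile",
                         url="https://gemini.google.com/"):
    """Return the argv list equivalent to the shell launch command."""
    return [
        CHROME,
        f"--remote-debugging-port={port}",   # enables CDP on this port
        f"--user-data-dir={profile_dir}",    # cloned, non-default profile
        url,                                 # page to open on start
    ]

def launch_chrome(**kwargs):
    """Start Chrome in the background with its output discarded."""
    return subprocess.Popen(build_chrome_command(**kwargs),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
```

Passing a list to `subprocess.Popen` avoids shell quoting problems with the space in the Chrome path.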

Step 3: Verify Connection (CDP)

curl -s http://localhost:9222/json | python3 -m json.tool

Look for the Gemini page entry:

{
  "title": "Google Gemini",
  "url": "https://gemini.google.com/app",
  "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/XXXXXXXX"
}

Note: If URL shows /app instead of just /, it means you’re logged in.
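The same check can be done without `curl`. The sketch below uses only the standard library; `fetch_pages` and `find_gemini_page` are hypothetical helper names, and the selection rule is the one used by the scripts in this document (a target of type `page` whose URL contains `gemini.google.com`).

```python
import json
import urllib.request

def fetch_pages(port=9222):
    """Fetch the CDP target list from Chrome's /json endpoint."""
    with urllib.request.urlopen(f"http://localhost:{port}/json", timeout=5) as resp:
        return json.load(resp)

def find_gemini_page(pages):
    """Return the first page target on gemini.google.com, or None."""
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            return page
    return None
```

With Chrome running, `find_gemini_page(fetch_pages())["webSocketDebuggerUrl"]` gives the URL to pass to `websockets.connect`.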

Step 4: Send Query to Gemini

Save this as gemini_query.py or run inline:

import asyncio
import websockets
import json
import subprocess
import sys

async def query_gemini(query_text, wait_seconds=30):
    # Get the Gemini page WebSocket URL
    result = subprocess.run(
        ["curl", "-s", "http://localhost:9222/json"],
        capture_output=True, text=True
    )
    pages = json.loads(result.stdout)
    
    # Find Gemini page
    gemini_page = None
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            gemini_page = page
            break
    
    if not gemini_page:
        print("Error: Gemini page not found. Make sure Chrome is open with Gemini.")
        return None
    
    ws_url = gemini_page["webSocketDebuggerUrl"]
    print(f"Connecting to: {ws_url}")
    
    async with websockets.connect(ws_url) as ws:
        # Step 1: Input the query
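        # json.dumps yields a valid JS string literal, so quotes, backticks,
        # and newlines in the query cannot break the expression.
        # The IIFE keeps `const` names out of the page's global scope, so the
        # script can be re-run against the same tab.
        input_js = f'''
        (() => {{
            const editor = document.querySelector('div[contenteditable="true"]');
            if (!editor) return 'editor not found';
            editor.focus();
            document.execCommand('insertText', false, {json.dumps(query_text)});
            editor.dispatchEvent(new Event('input', {{bubbles: true}}));
            return 'success';
        }})()
        '''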
        
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": input_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Input result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 2: Click send button
        await asyncio.sleep(1)
        click_js = '''
        (() => {
            // The aria-label is the Traditional Chinese UI label ("Send message")
            const btn = document.querySelector('button[aria-label="傳送訊息"]');
            if (!btn) return 'button not found';
            btn.click();
            return 'clicked';
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": click_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Click result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 3: Wait for response
        print(f"Waiting {wait_seconds} seconds for Gemini to respond...")
        await asyncio.sleep(wait_seconds)
        
        # Step 4: Extract the response
        extract_js = '''
        (() => {
            const markdownEls = document.querySelectorAll('.markdown');
            if (markdownEls.length === 0) return 'No response found';
            return markdownEls[markdownEls.length - 1].innerText;
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 3,
            "method": "Runtime.evaluate",
            "params": {"expression": extract_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        content = result.get('result', {}).get('result', {}).get('value', 'No content')
        
        return content

# Main execution
if __name__ == "__main__":
    query = sys.argv[1] if len(sys.argv) > 1 else "Example question: what is a blockchain? Please answer in Traditional Chinese."
    result = asyncio.run(query_gemini(query, wait_seconds=30))
    print("\n" + "="*50)
    print("GEMINI RESPONSE:")
    print("="*50)
    print(result)
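The script treats each `ws.recv()` as the reply to the command it just sent. CDP replies carry the `id` of the command they answer, and the same socket can also deliver event messages, so a slightly more defensive variant matches on `id`. This is a minimal sketch; `cdp_call` is a hypothetical helper, and `ws` is assumed to be any object with async `send` and `recv` methods, such as a `websockets` connection.

```python
import asyncio
import json

async def cdp_call(ws, msg_id, method, params=None):
    """Send one CDP command and return the reply whose id matches msg_id."""
    await ws.send(json.dumps({"id": msg_id, "method": method, "params": params or {}}))
    while True:
        reply = json.loads(await ws.recv())
        if reply.get("id") == msg_id:
            return reply
        # Anything else is an event or a reply to another command; skip it.
```

Inside `query_gemini`, each send/recv pair could then become a single call such as `await cdp_call(ws, 1, "Runtime.evaluate", {"expression": input_js})`.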

Step 5: Run the Query

python3 gemini_query.py "Example question: your query here"

Or inline for simple queries:

python3 << 'EOF'
import asyncio
import websockets
import json

async def send_to_gemini():
    # Get WebSocket URL
    import subprocess
    result = subprocess.run(["curl", "-s", "http://localhost:9222/json"], capture_output=True, text=True)
    pages = json.loads(result.stdout)
    ws_url = next(p["webSocketDebuggerUrl"] for p in pages if "gemini.google.com" in p.get("url", ""))
    
    async with websockets.connect(ws_url) as ws:
        # Input query
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": '''
                (() => {
                    const editor = document.querySelector('div[contenteditable="true"]');
                    editor.focus();
                    document.execCommand('insertText', false, 'Example question: please analyze the future price trend of Bitcoin');
                    editor.dispatchEvent(new Event('input', {bubbles: true}));
                })()
            '''}
        }))
        await ws.recv()
        
        # Click send
        await asyncio.sleep(1)
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": '''document.querySelector('button[aria-label="傳送訊息"]').click()'''}
        }))
        await ws.recv()
        
        # Wait and extract
        await asyncio.sleep(30)
        await ws.send(json.dumps({
            "id": 3,
            "method": "Runtime.evaluate",
            "params": {"expression": '''
                document.querySelectorAll('.markdown')[document.querySelectorAll('.markdown').length - 1].innerText
            '''}
        }))
        response = await ws.recv()
        print(json.loads(response)['result']['result']['value'])

asyncio.run(send_to_gemini())
EOF

Alternative Method: browser-use CLI

This method is simpler but does not use your existing Chrome login. You’ll need to log in manually each time.

Prerequisites

# Create virtual environment
python3 -m venv .venv

# Install browser-use
./.venv/bin/pip install browser-use

Workflow

1) Open Gemini

./.venv/bin/browser-use --browser real open "https://gemini.google.com/"

2) Get Page State

./.venv/bin/browser-use --browser real state

Look for:

The input textbox: contenteditable=true role=textbox
The send button: aria-label=傳送訊息 ("Send message" in the Traditional Chinese UI)

3) Input Text via JavaScript eval

./.venv/bin/browser-use --browser real eval "const editor = document.querySelector('div[contenteditable=\"true\"]'); editor.focus(); document.execCommand('insertText', false, 'YOUR QUERY HERE'); editor.dispatchEvent(new Event('input', {bubbles: true}));"

4) Click Send Button

# Get current state to find button index
./.venv/bin/browser-use --browser real state

# Click the send button (replace INDEX with actual number)
./.venv/bin/browser-use --browser real click INDEX

5) Close Session

./.venv/bin/browser-use close
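Step 3 pastes the query straight into a JavaScript string inside a shell string, which breaks as soon as the query contains a quote. One way around this, sketched here, is to build the expression in Python and let `json.dumps` produce the JavaScript string literal. `build_input_js` and `build_eval_command` are hypothetical helpers; the CLI arguments are exactly those shown in step 3.

```python
import json

def build_input_js(query):
    """Return a JS expression that types `query` into the Gemini editor."""
    return (
        "(() => {"
        "const editor = document.querySelector('div[contenteditable=\"true\"]');"
        "if (!editor) return 'editor not found';"
        "editor.focus();"
        # json.dumps escapes quotes, backslashes, and non-ASCII characters.
        f"document.execCommand('insertText', false, {json.dumps(query)});"
        "editor.dispatchEvent(new Event('input', {bubbles: true}));"
        "return 'success';"
        "})()"
    )

def build_eval_command(query, browser_use="./.venv/bin/browser-use"):
    """Return the argv list for the `eval` call from step 3."""
    return [browser_use, "--browser", "real", "eval", build_input_js(query)]
```

The resulting list can be passed to `subprocess.run`, which hands the expression to the CLI as a single argument with no shell quoting involved.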

Troubleshooting

Chrome Remote Debugging Issues

Problem	Cause	Solution
`curl: (7) Failed to connect`	Chrome not running with debugging	Restart Chrome with `--remote-debugging-port=9222`
WebSocket connection refused	Page ID changed	Re-fetch `/json` to get new WebSocket URL
“editor not found”	Page not fully loaded	Wait a few seconds before running script
“button not found”	Send button not visible	Check if text was actually input first
Login page instead of app	Wrong user-data-dir path	Verify that `--user-data-dir` points at the clone of `"$HOME/Library/Application Support/Google/Chrome"` (`/tmp/chrome-gemini-profile`)
`DevTools remote debugging requires a non-default data directory`	Chrome disallows default profile for CDP	Launch with a cloned profile: `/tmp/chrome-gemini-profile`
`curl` shows connection refused even though Chrome is running	CDP not listening due to profile path	Ensure `--user-data-dir` is not default and the port is free
`No Gemini page found via CDP`	Gemini not loaded or not logged in	Open `https://gemini.google.com/` in the launched Chrome and wait for `/app`
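Several rows of this table can be checked mechanically from the `/json` target list. The function below is a rough sketch of that idea; `diagnose` is a hypothetical helper, `pages` is the parsed `/json` output (or `None` if the request failed), and the `/app` test follows the note in Step 3.

```python
def diagnose(pages):
    """Map the CDP target list to a troubleshooting hint from the table above."""
    if pages is None:
        return ("Chrome is not reachable on the CDP port; "
                "restart it with --remote-debugging-port=9222")
    gemini = [p for p in pages
              if p.get("type") == "page" and "gemini.google.com" in p.get("url", "")]
    if not gemini:
        return "Open https://gemini.google.com/ in the launched Chrome"
    if "/app" not in gemini[0].get("url", ""):
        return "Gemini is open but not logged in; check the cloned profile"
    return "ok"
```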

browser-use Issues

Problem	Cause	Solution
Not logged in	browser-use creates isolated session	Use Chrome Remote Debugging method instead
`Unknown key: "請"` error	CLI doesn’t support Unicode	Use `eval` with JavaScript `execCommand`
Click doesn’t work	Element index changed	Re-run `state` before each click

Best Practices

Always use Chrome Remote Debugging for queries requiring authentication
Wait 30+ seconds for complex queries (Gemini’s “Deep Think” mode takes longer)
Check for .markdown elements, and that their text has stopped changing, before treating a response as complete
Use inline Python for one-off queries; use the full script for automation
Close Chrome debugging session when done to avoid port conflicts
Keep profile cloned in /tmp/chrome-gemini-profile to avoid CDP blocking the default profile
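A fixed wait either wastes time or cuts off long answers. An alternative, sketched below, is to poll the extracted text until it stops changing. `wait_until_stable` is a hypothetical helper, and `fetch_text` is assumed to be an async callable that returns the current text of the last `.markdown` element (for example, a wrapper around the extraction `Runtime.evaluate` call). Treating unchanged text as a finished response is a heuristic, not something Gemini guarantees.

```python
import asyncio

async def wait_until_stable(fetch_text, interval=2.0, stable_rounds=3, timeout=120.0):
    """Poll fetch_text() until it returns the same non-empty text
    stable_rounds times in a row, or until timeout seconds pass."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    last, same = None, 0
    while loop.time() < deadline:
        text = await fetch_text()
        if text and text == last:
            same += 1
            if same >= stable_rounds:
                return text
        else:
            # Text changed (or is still empty): restart the count.
            last, same = text, (1 if text else 0)
        await asyncio.sleep(interval)
    return last  # best effort after the timeout
```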

Complete Example: Crypto Price Analysis

Complete Workflow

# Step 1: Prepare a copy of the Chrome profile (avoids the CDP default-directory restriction)
rm -rf /tmp/chrome-gemini-profile
rsync -a "$HOME/Library/Application Support/Google/Chrome/" /tmp/chrome-gemini-profile/

# Step 2: Launch Chrome in remote debugging mode
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome-gemini-profile" \
  "https://gemini.google.com/" > /dev/null 2>&1 &

# Step 3: Wait for the page to load and verify the connection
sleep 8
curl -s http://localhost:9222/json | python3 -c "import sys, json; pages = json.load(sys.stdin); gemini = [p for p in pages if p.get('type') == 'page' and 'gemini.google.com' in p.get('url', '')]; print(f\"Found Gemini page: {gemini[0]['url'] if gemini else 'not found'}\")"

Method 1: Complete Query Script (query_gemini.py)

Save the following as query_gemini.py:

import asyncio
import websockets
import json
import subprocess
import sys

async def query_gemini(query_text, wait_seconds=60):
    # Get the Gemini page WebSocket URL
    result = subprocess.run(
        ["curl", "-s", "http://localhost:9222/json"],
        capture_output=True, text=True
    )
    pages = json.loads(result.stdout)
    
    # Find Gemini page
    gemini_page = None
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            gemini_page = page
            break
    
    if not gemini_page:
        print("Error: Gemini page not found. Make sure Chrome is open with Gemini.")
        return None
    
    ws_url = gemini_page["webSocketDebuggerUrl"]
    print(f"Connecting to: {ws_url}")
    
    async with websockets.connect(ws_url) as ws:
        # Step 1: Input the query
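        # json.dumps yields a valid JS string literal, so quotes, backticks,
        # and newlines in the query cannot break the expression.
        # The IIFE keeps `const` names out of the page's global scope, so the
        # script can be re-run against the same tab.
        input_js = f'''
        (() => {{
            const editor = document.querySelector('div[contenteditable="true"]');
            if (!editor) return 'editor not found';
            editor.focus();
            document.execCommand('insertText', false, {json.dumps(query_text)});
            editor.dispatchEvent(new Event('input', {{bubbles: true}}));
            return 'success';
        }})()
        '''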
        
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": input_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Input result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 2: Click send button
        await asyncio.sleep(1)
        click_js = '''
        (() => {
            // The aria-label is the Traditional Chinese UI label ("Send message")
            const btn = document.querySelector('button[aria-label="傳送訊息"]');
            if (!btn) return 'button not found';
            btn.click();
            return 'clicked';
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": click_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Click result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 3: Wait for response
        print(f"Waiting {wait_seconds} seconds for Gemini to respond...")
        await asyncio.sleep(wait_seconds)
        
        # Step 4: Extract the response - try to get complete content
        extract_js = '''
        (() => {
            const markdownEls = document.querySelectorAll('.markdown');
            if (markdownEls.length === 0) return 'No response found';
            const lastMarkdown = markdownEls[markdownEls.length - 1];
            // Prefer the rendered text; fall back to the raw text content
            return lastMarkdown.innerText || lastMarkdown.textContent || 'Empty response';
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 3,
            "method": "Runtime.evaluate",
            "params": {"expression": extract_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        content = result.get('result', {}).get('result', {}).get('value', 'No content')
        
        return content

# Main execution
if __name__ == "__main__":
    query = """Example question: please analyze the predicted price trends of BTC and ETH in detail.
Include the relevant indicators, and answer in Traditional Chinese."""
    
    result = asyncio.run(query_gemini(query, wait_seconds=60))
    print("\n" + "="*50)
    print("GEMINI RESPONSE:")
    print("="*50)
    print(result)

How to run:

python3 query_gemini.py
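The script reads `result.result.value` and falls back to a default string, which hides JavaScript errors raised inside the page. In CDP, a failed `Runtime.evaluate` reports them under `exceptionDetails`, and protocol-level failures under a top-level `error`. A small unpacking helper along these lines surfaces both; `extract_value` is a hypothetical name, not part of the original script.

```python
def extract_value(reply):
    """Return the evaluated value from a Runtime.evaluate reply,
    raising if CDP or the page reported an error."""
    if "error" in reply:
        # Protocol-level failure, e.g. an unknown method or bad params.
        raise RuntimeError(f"CDP error: {reply['error'].get('message', reply['error'])}")
    result = reply.get("result", {})
    details = result.get("exceptionDetails")
    if details:
        # The expression threw inside the page.
        raise RuntimeError(f"JavaScript error: {details.get('text', 'unknown')}")
    return result.get("result", {}).get("value")
```

Replacing the chained `.get(...)` calls with `extract_value(json.loads(response))` turns a silent `'unknown'` into an exception that names the failure.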

Method 2: Fetch an Existing Response (get_gemini_response.py)

If the Gemini page already shows a response, this script extracts it directly:

import asyncio
import websockets
import json
import subprocess

async def get_all_gemini_content():
    # Get the Gemini page WebSocket URL
    result = subprocess.run(
        ["curl", "-s", "http://localhost:9222/json"],
        capture_output=True, text=True
    )
    pages = json.loads(result.stdout)
    
    # Find Gemini page
    gemini_page = None
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            gemini_page = page
            break
    
    if not gemini_page:
        print("Error: Gemini page not found.")
        return None
    
    ws_url = gemini_page["webSocketDebuggerUrl"]
    print(f"Connecting to: {ws_url}\n")
    
    async with websockets.connect(ws_url) as ws:
        # Extract all markdown content from the page
        extract_js = '''
        (function() {
            const markdownEls = document.querySelectorAll('.markdown');
            console.log('Found markdown elements:', markdownEls.length);
            
            if(markdownEls.length === 0) {
                return 'No markdown elements found';
            }
            
            // Get the last two markdown elements (user query and AI response)
            const responses = [];
            const startIdx = Math.max(0, markdownEls.length - 2);
            
            for(let i = startIdx; i < markdownEls.length; i++) {
                const text = markdownEls[i].innerText || markdownEls[i].textContent || '';
                if(text.trim()) {
                    responses.push(`[Response ${i+1}]:\\n${text}`);
                }
            }
            
            return responses.join('\\n\\n' + '='.repeat(80) + '\\n\\n');
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": extract_js, "returnByValue": True}
        }))
        response = await ws.recv()
        result = json.loads(response)
        content = result.get('result', {}).get('result', {}).get('value', 'No content')
        
        return content

# Main execution
if __name__ == "__main__":
    result = asyncio.run(get_all_gemini_content())
    print("="*80)
    print("GEMINI CONVERSATION CONTENT:")
    print("="*80)
    print(result)

How to run:

python3 get_gemini_response.py

Practical Usage Example

# Full workflow
rm -rf /tmp/chrome-gemini-profile && \
rsync -a "$HOME/Library/Application Support/Google/Chrome/" /tmp/chrome-gemini-profile/ && \
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome-gemini-profile" \
  "https://gemini.google.com/" > /dev/null 2>&1 &

# Wait, then run the query
sleep 8 && python3 query_gemini.py
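The fixed `sleep 8` is a guess at how long Chrome takes to start. A retry loop that probes the CDP endpoint is one alternative. In this sketch, `wait_for_cdp` and `cdp_ready` are hypothetical helpers; `/json/version` is Chrome's standard CDP HTTP endpoint for version information.

```python
import time
import urllib.request

def cdp_ready(port=9222):
    """Return True if Chrome answers on the CDP HTTP endpoint."""
    with urllib.request.urlopen(f"http://localhost:{port}/json/version", timeout=2) as resp:
        return resp.status == 200

def wait_for_cdp(probe=cdp_ready, timeout=30.0, interval=0.5):
    """Call probe() until it returns True or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused: Chrome is not listening yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A script could call `wait_for_cdp()` right after launching Chrome and continue only if it returns `True`.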

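Cleaning Up Resources

# 1. Close the Chrome debugging session (note: this force-kills every running Chrome process)
pkill -9 "Google Chrome"

# 2. Remove the temporary profile (optional; frees disk space)
rm -rf /tmp/chrome-gemini-profile

# 3. Remove temporary scripts and output files generated during testing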
rm -f query_gemini.py get_gemini_response.py get_all_gemini_content.py
rm -f gemini_response.txt gemini_full_response.txt
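Steps 2 and 3 can also be run from Python, for instance at the end of an automation script. This is a minimal sketch; `cleanup` is a hypothetical helper, and it deliberately leaves Chrome itself alone.

```python
import shutil
from pathlib import Path

def cleanup(profile_dir="/tmp/chrome-gemini-profile", files=()):
    """Delete the cloned profile and any listed temporary files."""
    # Equivalent of `rm -rf`: a missing directory is not an error.
    shutil.rmtree(profile_dir, ignore_errors=True)
    for name in files:
        # Equivalent of `rm -f`: a missing file is not an error.
        Path(name).unlink(missing_ok=True)
```

For example, `cleanup(files=["gemini_response.txt", "gemini_full_response.txt"])` removes the profile clone and the two output files named above.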

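Best practices:

Close Chrome after each use, so port 9222 is not left occupied
Clean up the temporary profile regularly: /tmp/chrome-gemini-profile can take up several hundred MB
Use the full script: save query_gemini.py above as a project file instead of recreating it each time

Notes

Response truncation: if the response is long, you may need to extract it more than once or use the get_all_gemini_content approach (get_gemini_response.py)
Login state: make sure the Chrome profile is signed in to a Google account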
