gemini-websearch

📁 d-oit/do-novelist-ai 📅 7 days ago
4
总安装量
4
周安装量
#53365
全站排名
安装命令
npx skills add https://github.com/d-oit/do-novelist-ai --skill gemini-websearch

Agent 安装分布

opencode 4
gemini-cli 4
github-copilot 4
codex 4
kimi-cli 4
amp 4

Skill 文档

Gemini Web Search

Advanced web search using Gemini CLI in headless mode with tool restriction. All searches use Gemini’s google_web_search tool.

Quick Start

Basic search:

python .claude/skills/gemini-websearch/scripts/search.py "search for React 19 features"

With validation:

python .claude/skills/gemini-websearch/scripts/search.py "search for TypeScript 5.4 new features" --validate

Batch mode:

python .claude/skills/gemini-websearch/scripts/search.py queries.txt --batch --output results/

View analytics:

python .claude/skills/gemini-websearch/scripts/search.py --show-analytics

Search Workflow

Copy this checklist for research tasks:

Research Progress:
- [ ] Step 1: Check cache (1-hour TTL)
- [ ] Step 2: Formulate focused search query (prefix with "search ")
- [ ] Step 3: Execute headless Gemini search
- [ ] Step 4: Parse JSON and extract response content
- [ ] Step 5: Validate quality and relevance
- [ ] Step 6: Review search success and content quality
- [ ] Step 7: Log analytics

Step 1: Check cache MD5-keyed cache with 1-hour TTL. Automatic cleanup on expiry. Tracks cache hit rates.

Step 2: Formulate query Keep queries specific and focused. Break complex questions into multiple targeted searches. Important: Always prefix queries with “search ” for best results (e.g., “search for the React 19 best practice”).

Step 3: Execute headless search

gemini -p "/tool:google_web_search query:\"your query\" raw:true" \
  --yolo --output-format json

The --yolo flag auto-approves tool usage. Returns structured JSON with comprehensive search results.

Step 4: Parse response Extract search results and metadata from JSON output. Note: Gemini CLI doesn’t provide citations/grounding metadata in JSON format.

Step 5: Validate results Score based on:

  • Content completeness and length
  • Search tool success verification
  • Response latency (bonus for faster searches)
  • Relevance to original query

False positive detection prevents low-quality results. Note: Gemini CLI doesn’t provide citations, so validation focuses on content quality metrics.

Step 6: Review sources Verify that the search tool was called successfully and assess content quality directly from the response.

Step 7: Log analytics Track cache hits, latency, quality scores, validation failures, and query patterns.

Advanced Usage

For detailed examples see examples.md

Batch searches with validation:

python .claude/skills/gemini-websearch/scripts/search.py queries.txt \
  --batch \
  --output results/ \
  --validate \
  --min-quality 0.7

Research with retry logic:

python .claude/skills/gemini-websearch/scripts/search.py "complex technical query" \
  --validate \
  --min-quality 0.7 \
  --min-relevance 0.6 \
  --retry-on-fail \
  --max-retries 2

Disable cache for fresh results:

python .claude/skills/gemini-websearch/scripts/search.py "breaking news topic" --no-cache

Clear cache:

python .claude/skills/gemini-websearch/scripts/search.py --clear-cache

Configuration

Required: Tool Restriction

Create/update .gemini/settings.json to restrict Gemini CLI to only google_web_search:

{
  "tools": {
    "exclude": [
      "file_read",
      "file_write",
      "file_search",
      "file_list",
      "web_fetch",
      "run_shell_command",
      "save_memory",
      "code_execution",
      "edit_file",
      "create_file",
      "delete_file",
      "list_directory",
      "move_file",
      "copy_file"
    ]
  }
}

This ensures Gemini ONLY uses the web search tool, not other capabilities.

Optional: Authentication

# Option 1: API key (optional)
export GEMINI_API_KEY="your-api-key"

# Option 2: gcloud authentication
gcloud auth login

# Option 3: Application Default Credentials
gcloud auth application-default login

Environment Variables

export SEARCH_CACHE_DIR=".cache/gemini-searches"
export SEARCH_CACHE_TTL="3600"  # 1 hour
export ANALYTICS_LOG="search_analytics.json"
export GEMINI_MODEL="gemini-2.5-flash"

Config File

Optional ~/.gemini-search/config.json:

{
  "model": "gemini-2.5-flash",
  "cache_enabled": true,
  "cache_ttl": 3600,
  "validation": {
    "enabled": true,
    "min_quality": 0.6,
    "min_citations": 0,
    "min_relevance": 0.5,
    "retry_on_fail": true,
    "max_retries": 2
  },
  "analytics": {
    "enabled": true,
    "log_file": "search_analytics.json",
    "track_cache_hits": true,
    "track_latency": true,
    "track_quality": true
  }
}

Command Reference

Single search:

python .claude/skills/gemini-websearch/scripts/search.py "query" [options]

Batch search:

python .claude/skills/gemini-websearch/scripts/search.py queries.txt --batch [options]

Show analytics:

python .claude/skills/gemini-websearch/scripts/search.py --show-analytics

Clear cache:

python .claude/skills/gemini-websearch/scripts/search.py --clear-cache

Options

  • --model MODEL – Gemini model (default: gemini-2.5-flash)
  • --no-cache – Disable caching
  • --validate – Enable result validation
  • --min-quality FLOAT – Minimum quality score (0-1, default: 0.6)
  • --min-citations INT – Minimum citation count (default: 0, not applicable for Gemini CLI)
  • --min-relevance FLOAT – Minimum relevance score (0-1, default: 0.5)
  • --retry-on-fail – Retry if validation fails
  • --max-retries INT – Maximum retry attempts (default: 2)
  • --output PATH – Output file (single) or directory (batch)
  • --batch – Batch mode: query arg is file path

Best Practices

  • Use headless mode for programmatic searches
  • Restrict tools in settings.json for security
  • Enable caching for repeated/related queries
  • Validate critical searches before using results
  • Monitor analytics to optimize query patterns
  • Use batch mode for multi-query research
  • Set quality thresholds based on use case
  • Always prefix queries with “search ” for optimal results
  • Assess content quality directly from response text and metadata