gemini-websearch
npx skills add https://github.com/d-oit/do-novelist-ai --skill gemini-websearch
Skill Documentation
Gemini Web Search
Advanced web search using Gemini CLI in headless mode with tool restriction. All searches use Gemini’s google_web_search tool.
Quick Start
Basic search:
python .claude/skills/gemini-websearch/scripts/search.py "search for React 19 features"
With validation:
python .claude/skills/gemini-websearch/scripts/search.py "search for TypeScript 5.4 new features" --validate
Batch mode:
python .claude/skills/gemini-websearch/scripts/search.py queries.txt --batch --output results/
View analytics:
python .claude/skills/gemini-websearch/scripts/search.py --show-analytics
Search Workflow
Copy this checklist for research tasks:
Research Progress:
- [ ] Step 1: Check cache (1-hour TTL)
- [ ] Step 2: Formulate focused search query (prefix with "search ")
- [ ] Step 3: Execute headless Gemini search
- [ ] Step 4: Parse JSON and extract response content
- [ ] Step 5: Validate quality and relevance
- [ ] Step 6: Review search success and content quality
- [ ] Step 7: Log analytics
Step 1: Check cache
The cache is keyed by the MD5 hash of the query and uses a 1-hour TTL. Expired entries are cleaned up automatically, and cache hit rates are tracked.
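A minimal sketch of how an MD5-keyed cache with a TTL might work. The function names (`cache_key`, `cache_put`, `cache_get`) and the one-file-per-query layout are assumptions for illustration, not the skill's actual implementation.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Optional

CACHE_TTL_SECONDS = 3600  # 1-hour TTL, as documented above


def cache_key(query: str) -> str:
    """Derive a stable key from the query text (MD5 hex digest)."""
    return hashlib.md5(query.strip().encode("utf-8")).hexdigest()


def cache_put(cache_dir: Path, query: str, result: dict,
              now: Optional[float] = None) -> Path:
    """Store a result together with the time it was written."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(query)}.json"
    entry = {"stored_at": time.time() if now is None else now, "result": result}
    path.write_text(json.dumps(entry), encoding="utf-8")
    return path


def cache_get(cache_dir: Path, query: str, ttl: int = CACHE_TTL_SECONDS,
              now: Optional[float] = None) -> Optional[dict]:
    """Return the cached result, or None if it is missing or expired."""
    path = cache_dir / f"{cache_key(query)}.json"
    if not path.exists():
        return None
    entry = json.loads(path.read_text(encoding="utf-8"))
    current = time.time() if now is None else now
    if current - entry["stored_at"] > ttl:
        path.unlink()  # clean up the expired entry
        return None
    return entry["result"]
```

The `now` parameter exists only so that expiry can be exercised without waiting an hour.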
Step 2: Formulate query
Keep queries specific and focused. Break complex questions into multiple targeted searches. Important: always prefix queries with “search ” for best results (e.g., “search for React 19 best practices”).
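The prefix rule is easy to enforce in code. The helper below is a hypothetical sketch (`ensure_search_prefix` is not part of the skill) that normalizes whitespace and adds the prefix only when it is missing.

```python
def ensure_search_prefix(query: str) -> str:
    """Return the query with a leading "search " if it is not already there."""
    cleaned = " ".join(query.split())  # collapse stray whitespace
    if not cleaned:
        raise ValueError("query must not be empty")
    if cleaned.lower().startswith("search "):
        return cleaned
    return f"search {cleaned}"
```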
Step 3: Execute headless search
gemini -p "/tool:google_web_search query:\"your query\" raw:true" \
--yolo --output-format json
The --yolo flag auto-approves tool usage. The command returns structured JSON containing the search results.
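From Python, the same command could be launched with `subprocess`. This is a sketch only: the function names are hypothetical, the timeout is an arbitrary choice, and replacing double quotes inside the query is an assumption made to keep the prompt well-formed, since how the CLI handles embedded quotes is not documented here.

```python
import subprocess
from typing import List


def build_gemini_command(query: str) -> List[str]:
    """Build the argument list for the headless search command shown above."""
    # Assumption: swap embedded double quotes for single quotes so the
    # query:"..." wrapper stays balanced.
    safe_query = query.replace('"', "'")
    prompt = f'/tool:google_web_search query:"{safe_query}" raw:true'
    return ["gemini", "-p", prompt, "--yolo", "--output-format", "json"]


def run_search(query: str, timeout: float = 120.0) -> str:
    """Run the Gemini CLI and return its raw stdout (expected to be JSON)."""
    completed = subprocess.run(
        build_gemini_command(query),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the arguments as a list avoids the shell entirely, so the backslash escaping needed on the command line is not required here.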
Step 4: Parse response
Extract the response content and metadata from the JSON output. Note: Gemini CLI doesn’t provide citations or grounding metadata in its JSON output.
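A sketch of the parsing step. It assumes the JSON output is an object with a top-level `response` string, an optional `stats` object, and an optional `error` field; verify these key names against the output of your Gemini CLI version before relying on them.

```python
import json


def extract_response(raw_stdout: str) -> dict:
    """Pull the response text and any stats out of the CLI's JSON output."""
    payload = json.loads(raw_stdout)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    if payload.get("error"):  # assumed key for CLI-reported failures
        raise RuntimeError(f"Gemini CLI reported an error: {payload['error']}")
    text = payload.get("response") or ""  # assumed key for the answer text
    return {"text": text.strip(), "stats": payload.get("stats", {})}
```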
Step 5: Validate results
Results are scored on:
- Content completeness and length
- Search tool success verification
- Response latency (bonus for faster searches)
- Relevance to original query
False-positive detection filters out low-quality results. Note: because Gemini CLI doesn’t provide citations, validation focuses on content-quality metrics.
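One way to combine the four criteria into scores is sketched below. The weights, the 1000-character length target, the 5-second latency bonus, and the term-overlap relevance measure are all illustrative assumptions; the skill's actual scoring formula is not given here.

```python
import re

# Words ignored when measuring relevance (hypothetical list).
STOPWORDS = {"search", "for", "the", "a", "an", "of", "in", "to", "and"}


def relevance(query: str, text: str) -> float:
    """Fraction of distinct, non-stopword query terms that appear in the text."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower())) - STOPWORDS
    if not terms:
        return 0.0
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(terms & words) / len(terms)


def quality_score(text: str, tool_succeeded: bool, latency_s: float) -> float:
    """Score in [0, 1] from content length, tool success, and a latency bonus."""
    length_part = min(len(text) / 1000.0, 1.0) * 0.6
    tool_part = 0.3 if tool_succeeded else 0.0
    latency_part = 0.1 if latency_s < 5.0 else 0.0
    return round(length_part + tool_part + latency_part, 3)


def is_valid(query: str, text: str, tool_succeeded: bool, latency_s: float,
             min_quality: float = 0.6, min_relevance: float = 0.5) -> bool:
    """Apply both thresholds; the defaults mirror the documented option defaults."""
    return (quality_score(text, tool_succeeded, latency_s) >= min_quality
            and relevance(query, text) >= min_relevance)
```

Keeping quality and relevance as separate scores matches the separate --min-quality and --min-relevance thresholds.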
Step 6: Review search success and content quality
Verify that the search tool was called successfully and assess content quality directly from the response.
Step 7: Log analytics
Track cache hits, latency, quality scores, validation failures, and query patterns.
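A sketch of what logging and summarizing these metrics could look like. The record fields and the summary shape are assumptions; the real search_analytics.json schema may differ.

```python
import json
import time
from pathlib import Path


def _load(log_file: Path) -> list:
    """Read existing records, or start fresh if the log does not exist yet."""
    if log_file.exists():
        return json.loads(log_file.read_text(encoding="utf-8"))
    return []


def log_search(log_file: Path, query: str, cache_hit: bool,
               latency_s: float, quality: float, valid: bool) -> None:
    """Append one search record to the JSON log."""
    records = _load(log_file)
    records.append({
        "timestamp": time.time(),
        "query": query,
        "cache_hit": cache_hit,
        "latency_s": latency_s,
        "quality": quality,
        "valid": valid,
    })
    log_file.write_text(json.dumps(records, indent=2), encoding="utf-8")


def summarize(log_file: Path) -> dict:
    """Aggregate the log into the headline numbers."""
    records = _load(log_file)
    total = len(records)
    if total == 0:
        return {"searches": 0, "cache_hit_rate": 0.0,
                "avg_latency_s": 0.0, "validation_failures": 0}
    return {
        "searches": total,
        "cache_hit_rate": sum(r["cache_hit"] for r in records) / total,
        "avg_latency_s": sum(r["latency_s"] for r in records) / total,
        "validation_failures": sum(not r["valid"] for r in records),
    }
```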
Advanced Usage
For detailed examples, see examples.md.
Batch searches with validation:
python .claude/skills/gemini-websearch/scripts/search.py queries.txt \
--batch \
--output results/ \
--validate \
--min-quality 0.7
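The layout of queries.txt is not specified here. The sketch below assumes one query per line and treats blank lines and lines starting with `#` as skippable; both conventions are assumptions, as is the `read_queries` name.

```python
from pathlib import Path
from typing import List


def read_queries(path: Path) -> List[str]:
    """Read one query per line, skipping blanks and '#' comment lines."""
    queries = []
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            queries.append(line)
    return queries
```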
Research with retry logic:
python .claude/skills/gemini-websearch/scripts/search.py "complex technical query" \
--validate \
--min-quality 0.7 \
--min-relevance 0.6 \
--retry-on-fail \
--max-retries 2
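The retry behavior can be pictured as a small loop around the search and validation steps. In this sketch `search` and `validate` are caller-supplied stand-ins, and reading --max-retries 2 as "up to three attempts in total" is an assumption about the script's semantics.

```python
from typing import Callable, Optional


def search_with_retry(search: Callable[[str], dict],
                      validate: Callable[[dict], bool],
                      query: str,
                      max_retries: int = 2) -> Optional[dict]:
    """Run the search, retrying while validation fails; None if every attempt fails."""
    for _attempt in range(max_retries + 1):  # first try plus max_retries retries
        result = search(query)
        if validate(result):
            return result
    return None
```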
Disable cache for fresh results:
python .claude/skills/gemini-websearch/scripts/search.py "breaking news topic" --no-cache
Clear cache:
python .claude/skills/gemini-websearch/scripts/search.py --clear-cache
Configuration
Required: Tool Restriction
Create/update .gemini/settings.json to restrict Gemini CLI to only google_web_search:
{
  "tools": {
    "exclude": [
      "file_read",
      "file_write",
      "file_search",
      "file_list",
      "web_fetch",
      "run_shell_command",
      "save_memory",
      "code_execution",
      "edit_file",
      "create_file",
      "delete_file",
      "list_directory",
      "move_file",
      "copy_file"
    ]
  }
}
This ensures Gemini ONLY uses the web search tool, not other capabilities.
Optional: Authentication
# Option 1: API key (optional)
export GEMINI_API_KEY="your-api-key"
# Option 2: gcloud authentication
gcloud auth login
# Option 3: Application Default Credentials
gcloud auth application-default login
Environment Variables
export SEARCH_CACHE_DIR=".cache/gemini-searches"
export SEARCH_CACHE_TTL="3600" # 1 hour
export ANALYTICS_LOG="search_analytics.json"
export GEMINI_MODEL="gemini-2.5-flash"
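A script could read these variables as follows. Treating the values shown above as fallbacks when a variable is unset is an assumption, and `load_env_settings` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def load_env_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect settings from the environment, falling back to the values shown above."""
    env = os.environ if env is None else env
    return {
        "cache_dir": env.get("SEARCH_CACHE_DIR", ".cache/gemini-searches"),
        "cache_ttl": int(env.get("SEARCH_CACHE_TTL", "3600")),  # seconds
        "analytics_log": env.get("ANALYTICS_LOG", "search_analytics.json"),
        "model": env.get("GEMINI_MODEL", "gemini-2.5-flash"),
    }
```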
Config File
Optional ~/.gemini-search/config.json:
{
  "model": "gemini-2.5-flash",
  "cache_enabled": true,
  "cache_ttl": 3600,
  "validation": {
    "enabled": true,
    "min_quality": 0.6,
    "min_citations": 0,
    "min_relevance": 0.5,
    "retry_on_fail": true,
    "max_retries": 2
  },
  "analytics": {
    "enabled": true,
    "log_file": "search_analytics.json",
    "track_cache_hits": true,
    "track_latency": true,
    "track_quality": true
  }
}
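Since the config file is optional, a loader has to cope with a missing file and with files that set only some keys. The sketch below merges a partial file over built-in defaults; this per-key override behavior is an assumption, and `DEFAULTS` lists only a subset of the keys for brevity.

```python
import copy
import json
from pathlib import Path

# Subset of the documented defaults, used when the file omits a key.
DEFAULTS = {
    "model": "gemini-2.5-flash",
    "cache_enabled": True,
    "cache_ttl": 3600,
    "validation": {"enabled": True, "min_quality": 0.6,
                   "min_relevance": 0.5, "max_retries": 2},
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: Path) -> dict:
    """Load the optional config file; a missing file yields the defaults."""
    if not path.exists():
        return deep_merge(DEFAULTS, {})
    return deep_merge(DEFAULTS, json.loads(path.read_text(encoding="utf-8")))
```

For the documented location, the call would be `load_config(Path.home() / ".gemini-search" / "config.json")`.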
Command Reference
Single search:
python .claude/skills/gemini-websearch/scripts/search.py "query" [options]
Batch search:
python .claude/skills/gemini-websearch/scripts/search.py queries.txt --batch [options]
Show analytics:
python .claude/skills/gemini-websearch/scripts/search.py --show-analytics
Clear cache:
python .claude/skills/gemini-websearch/scripts/search.py --clear-cache
Options
- --model MODEL: Gemini model (default: gemini-2.5-flash)
- --no-cache: Disable caching
- --validate: Enable result validation
- --min-quality FLOAT: Minimum quality score (0-1, default: 0.6)
- --min-citations INT: Minimum citation count (default: 0, not applicable for Gemini CLI)
- --min-relevance FLOAT: Minimum relevance score (0-1, default: 0.5)
- --retry-on-fail: Retry if validation fails
- --max-retries INT: Maximum retry attempts (default: 2)
- --output PATH: Output file (single) or directory (batch)
- --batch: Batch mode: query arg is file path
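For readers who want to see how these options fit together, here is a sketch of an `argparse` declaration that mirrors them. It is reconstructed from the option list and command reference, not copied from search.py, so the real parser may differ in details such as help text or argument handling.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options; defaults follow the option list."""
    parser = argparse.ArgumentParser(description="Gemini web search (sketch)")
    parser.add_argument("query", nargs="?",
                        help="search query, or a file path with --batch")
    parser.add_argument("--model", default="gemini-2.5-flash")
    parser.add_argument("--no-cache", action="store_true")
    parser.add_argument("--validate", action="store_true")
    parser.add_argument("--min-quality", type=float, default=0.6)
    parser.add_argument("--min-citations", type=int, default=0)
    parser.add_argument("--min-relevance", type=float, default=0.5)
    parser.add_argument("--retry-on-fail", action="store_true")
    parser.add_argument("--max-retries", type=int, default=2)
    parser.add_argument("--output", help="output file (single) or directory (batch)")
    parser.add_argument("--batch", action="store_true")
    parser.add_argument("--show-analytics", action="store_true")
    parser.add_argument("--clear-cache", action="store_true")
    return parser
```

The positional `query` is optional so that --show-analytics and --clear-cache can run without one.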
Best Practices
- Use headless mode for programmatic searches
- Restrict tools in settings.json for security
- Enable caching for repeated/related queries
- Validate critical searches before using results
- Monitor analytics to optimize query patterns
- Use batch mode for multi-query research
- Set quality thresholds based on use case
- Always prefix queries with “search ” for optimal results
- Assess content quality directly from response text and metadata