bigquery

📁 g1joshi/agent-skills 📅 3 days ago

Install command

npx skills add https://github.com/g1joshi/agent-skills --skill bigquery

BigQuery is Google’s serverless, highly scalable, and cost-effective multi-cloud data warehouse. Because queries run in parallel across many workers, it can scan terabytes of data in seconds for typical analytical workloads.

Serverless Analytics: No infrastructure to manage. Just run SQL.
Real-time Analytics: High-speed streaming ingestion.
ML Integration: CREATE MODEL lets you train ML models using standard SQL (BigQuery ML).
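
As a rough illustration of the ML integration, the sketch below trains a linear regression with CREATE MODEL and then scores one row with ML.PREDICT. The project, dataset, and model names (`my_project.my_dataset.name_trend_model`) are hypothetical, and the choice of features is only for demonstration, not a recommended modeling approach.

```sql
-- Hypothetical model name; replace with your own project and dataset.
-- Trains a linear regression that predicts the `number` column
-- (births per state/gender/year) for a single name.
CREATE OR REPLACE MODEL `my_project.my_dataset.name_trend_model`
OPTIONS (
  model_type = 'linear_reg',
  input_label_cols = ['number']
) AS
SELECT
  year,
  gender,
  state,
  number
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE name = 'Mary';

-- Score one illustrative input row. ML.PREDICT returns the label
-- prefixed with "predicted_", alongside the input columns.
SELECT
  predicted_number,
  year,
  state
FROM ML.PREDICT(
  MODEL `my_project.my_dataset.name_trend_model`,
  (SELECT 2013 AS year, 'F' AS gender, 'CA' AS state)
);
```

Training and prediction both run as ordinary queries, so they consume slots and are billed like any other BigQuery job.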

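-- Standard SQL: the ten names that appear in the most rows.
-- Each row is one (state, gender, year, name) combination, so COUNT(*)
-- counts rows, not births; use SUM(number) for total births.
SELECT name, COUNT(*) AS row_count
FROM `bigquery-public-data.usa_names.usa_1910_2013`
GROUP BY name
ORDER BY row_count DESC
LIMIT 10;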

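A “slot” is a unit of computational capacity that BigQuery uses to execute queries. With on-demand pricing, BigQuery allocates slots for you and bills by bytes scanned. With capacity-based pricing (BigQuery editions, which replaced the older flat-rate plans), you reserve slots, optionally with autoscaling.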

BigQuery stores data in a columnar format, which is optimized for aggregation queries. Reading one column is much cheaper and faster than reading all columns, so SELECT * is expensive.

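Partitioning: Splits a table by a date/timestamp or integer-range column (e.g., daily partitions). Queries that filter on the partition column scan only the matching partitions (partition pruning), which can cut cost substantially.
Clustering: Sorts data within each partition by the clustering columns, so filters on those columns read fewer blocks.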
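
A minimal sketch of how partitioning and clustering might be declared and then used. The table `my_project.my_dataset.app_logs` and its columns are assumptions made up for this example.

```sql
-- Hypothetical log table: one partition per day, clustered by the
-- columns most often used in filters.
CREATE TABLE `my_project.my_dataset.app_logs` (
  event_time TIMESTAMP NOT NULL,
  user_id    STRING,
  severity   STRING,
  message    STRING
)
PARTITION BY DATE(event_time)
CLUSTER BY severity, user_id
OPTIONS (
  -- Reject queries that do not filter on the partition column.
  require_partition_filter = TRUE
);

-- The event_time filter limits the scan to a single daily partition;
-- the severity filter benefits from clustering within that partition.
SELECT severity, COUNT(*) AS events
FROM `my_project.my_dataset.app_logs`
WHERE event_time >= TIMESTAMP '2024-01-01'
  AND event_time <  TIMESTAMP '2024-01-02'
  AND severity = 'ERROR'
GROUP BY severity;
```

Setting require_partition_filter is optional, but it guards against accidental full-table scans on large time-series tables.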

Do:

Partition by Date: Almost mandatory for time-series logs.
Use BigQuery ML: Train models (Regression, K-Means) directly where data lives.
Estimate Cost: Do a dry run of your query (for example, with the bq CLI's --dry_run flag) to see how many bytes it will scan before running it.

Don’t:

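Don’t run SELECT *: On-demand queries are billed by bytes scanned, and every column you reference is read. Select only what you need.
Don’t treat it like an OLTP database: Single-row INSERT statements are slow and quota-limited. For row-level ingestion use the streaming APIs; otherwise load data in batches. BigQuery is built for bulk analytics.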
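
To make the bulk-loading advice concrete, here is a sketch that loads a batch of files from Cloud Storage with a LOAD DATA statement instead of inserting rows one at a time. The bucket path and table name are hypothetical, and the CSV files are assumed to have a header row and columns matching the target table.

```sql
-- Hypothetical bucket and table names.
-- Appends every matching CSV file to the table in a single load job.
LOAD DATA INTO `my_project.my_dataset.app_logs`
FROM FILES (
  format = 'CSV',
  uris = ['gs://my-bucket/logs/2024-01-01/*.csv'],
  skip_leading_rows = 1  -- assumes each file starts with a header row
);
```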