bigquery-analytics

📁 trantuananh-17/product-reviews 📅 Jan 30, 2026

总安装量

周安装量

#58348

全站排名

安装命令

npx skills add https://github.com/trantuananh-17/product-reviews --skill bigquery-analytics

Agent 安装分布

claude-code 3

github-copilot 3

codex 3

cursor 3

opencode 2

gemini-cli 2

Skill 文档

BigQuery Best Practices

Table Design

Partitioning (REQUIRED for tables > 1GB)

CREATE TABLE `project.dataset.events` (
  event_id STRING,
  shop_id STRING,
  event_type STRING,
  created_at TIMESTAMP,
  data JSON
)
PARTITION BY DATE(created_at)
CLUSTER BY shop_id, event_type;

Data Size	Partition By
< 1GB	Not needed
1GB – 1TB	DATE/TIMESTAMP
> 1TB	DATE + consider sharding

Query Patterns

Always Use Partition Filter

-- â BAD: No partition filter (full scan)
SELECT * FROM events WHERE shop_id = 'shop_123';

-- â GOOD: Partition filter included
SELECT * FROM events
WHERE created_at >= '2024-01-01'
  AND created_at < '2024-02-01'
  AND shop_id = 'shop_123';

Select Only Needed Columns

-- â BAD: SELECT *
SELECT * FROM events;

-- â GOOD: Select specific columns
SELECT event_id, event_type, created_at FROM events;

Node.js Integration

Always Batch Inserts

// â GOOD: Single batch insert
await table.insert(batch.map(row => ({
  ...row,
  time: new Date()
})));

// â BAD: Insert one row at a time
for (const row of batch) {
  await table.insert([row]);
}

Scenario	Max Batch Size
Streaming inserts	500-1000 rows
High throughput	Up to 10,000 rows

Cost Control

// Dry run before expensive queries
const [job] = await bigquery.createQueryJob({
  query: sql,
  dryRun: true
});
const estimatedCost = (job.statistics.totalBytesProcessed / 1e12) * 5;

Checklist

â¡ Large tables (>1GB) have partitioning
â¡ Queries include partition column in WHERE
â¡ Tables clustered by frequently filtered columns
â¡ No SELECT * - select specific columns
â¡ Using parameterized queries
â¡ Batch inserts (not row-by-row)

GitHub 仓库 ↗ ← 返回陌讯 Skills 聚合平台