bigquery-analytics

📁 trantuananh-17/product-reviews 📅 Jan 30, 2026
3
总安装量
3
周安装量
#58348
全站排名
安装命令
npx skills add https://github.com/trantuananh-17/product-reviews --skill bigquery-analytics

Agent 安装分布

claude-code 3
github-copilot 3
codex 3
cursor 3
opencode 2
gemini-cli 2

Skill 文档

BigQuery Best Practices

Table Design

Partitioning (REQUIRED for tables > 1GB)

CREATE TABLE `project.dataset.events` (
  event_id STRING,
  shop_id STRING,
  event_type STRING,
  created_at TIMESTAMP,
  data JSON
)
PARTITION BY DATE(created_at)
CLUSTER BY shop_id, event_type;
Data Size Partition By
< 1GB Not needed
1GB – 1TB DATE/TIMESTAMP
> 1TB DATE + consider sharding

Query Patterns

Always Use Partition Filter

-- ❌ BAD: No partition filter (full scan)
SELECT * FROM events WHERE shop_id = 'shop_123';

-- ✅ GOOD: Partition filter included
SELECT * FROM events
WHERE created_at >= '2024-01-01'
  AND created_at < '2024-02-01'
  AND shop_id = 'shop_123';

Select Only Needed Columns

-- ❌ BAD: SELECT *
SELECT * FROM events;

-- ✅ GOOD: Select specific columns
SELECT event_id, event_type, created_at FROM events;

Node.js Integration

Always Batch Inserts

// ✅ GOOD: Single batch insert
await table.insert(batch.map(row => ({
  ...row,
  time: new Date()
})));

// ❌ BAD: Insert one row at a time
for (const row of batch) {
  await table.insert([row]);
}
Scenario Max Batch Size
Streaming inserts 500-1000 rows
High throughput Up to 10,000 rows

Cost Control

// Dry run before expensive queries
const [job] = await bigquery.createQueryJob({
  query: sql,
  dryRun: true
});
const estimatedCost = (job.statistics.totalBytesProcessed / 1e12) * 5;

Checklist

□ Large tables (>1GB) have partitioning
□ Queries include partition column in WHERE
□ Tables clustered by frequently filtered columns
□ No SELECT * - select specific columns
□ Using parameterized queries
□ Batch inserts (not row-by-row)