snowflake

📁 g1joshi/agent-skills 📅 3 days ago
1
总安装量
1
周安装量
#43747
全站排名
安装命令
npx skills add https://github.com/g1joshi/agent-skills --skill snowflake

Agent 安装分布

mcpjam 1
claude-code 1
replit 1
junie 1
zencoder 1

Skill 文档

Snowflake

Snowflake is a cloud-native data warehouse. It separates compute (“Virtual Warehouses”) from storage, allowing them to scale independently.

When to Use

  • Data Warehousing: Central repository for all business data.
  • ELT Workflows: Load raw data (JSON/CSV) then Transform it via SQL.
  • Data Sharing: Securely share live data tables with other companies/accounts without copying.

Quick Start (SQL)

-- Create warehouse (Compute)
CREATE WAREHOUSE my_wh WITH WAREHOUSE_SIZE = 'X-SMALL';

-- Query JSON directly (Variant type)
SELECT src:sales.order_id::integer
FROM raw_data;

Core Concepts

Virtual Warehouses

Compute clusters. You can have an XS warehouse for reporting and a 4XL warehouse for heavy ML training running simultaneously on the same data.

Zero-Copy Cloning

Clone a Multi-Terabyte database in seconds for testing. It points to the same underlying S3 objects until changed.

Snowpark

Allows writing code in Python/Java/Scala that executes inside Snowflake (for ML/Data Engineering).

Best Practices (2025)

Do:

  • Use auto-suspend: Shut down warehouses after X minutes of idleness to save money.
  • Use Variant Type: Load semi-structured data (JSON) as-is into VARIANT columns, then parse on read.
  • Use Clustering Keys: For very large tables (>1TB), manual clustering improves query skipping.

Don’t:

  • Don’t use INSERT INTO ... VALUES: For bulk loading, use COPY INTO from S3/Stage. It is much faster.

References