data-engineering
Total installs: 4
Weekly installs: 2
Site rank: #53162
Install command
npx skills add https://github.com/pluginagentmarketplace/custom-plugin-typescript --skill data-engineering
Installs by agent
windsurf: 2
opencode: 2
codex: 2
antigravity: 2
gemini-cli: 2
Skill documentation
Data Engineering & Analytics Skill
Quick Start – SQL Data Pipeline
-- Create staging table from recent events of the tracked types
CREATE TABLE staging_events AS
SELECT
    event_id,
    user_id,
    event_type,
    event_time,
    properties
FROM raw_events
WHERE event_time >= CURRENT_DATE - INTERVAL '1 day'
  AND event_type IN ('click', 'purchase', 'view');

-- Aggregate metrics: daily event counts per user
-- (alias is event_date because DATE is a keyword in many SQL dialects)
SELECT
    DATE(event_time) AS event_date,
    user_id,
    COUNT(*) AS event_count,
    COUNT(DISTINCT event_type) AS unique_events
FROM staging_events
GROUP BY 1, 2
ORDER BY event_date DESC, event_count DESC;
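To try the quick-start pipeline without a warehouse, the two steps can be sketched against an in-memory SQLite database using only the Python standard library. This is a minimal illustration, not the skill's own implementation. The assumptions: `build_demo_db` and `run_pipeline` are hypothetical helper names, the sample rows are invented, timestamps are stored as ISO-8601 text, and the cutoff is passed in as a parameter because SQLite has no `INTERVAL` syntax. The staging table is declared explicitly and filled with `INSERT ... SELECT` rather than `CREATE TABLE ... AS`, so the step can be rerun.

```python
import sqlite3

# Hypothetical sample data: (event_id, user_id, event_type, event_time, properties)
SAMPLE_EVENTS = [
    (1, "u1", "click", "2024-05-02 09:00:00", "{}"),
    (2, "u1", "purchase", "2024-05-02 09:05:00", "{}"),
    (3, "u2", "view", "2024-05-02 10:00:00", "{}"),
    (4, "u2", "scroll", "2024-05-02 10:01:00", "{}"),  # dropped: untracked type
    (5, "u1", "click", "2024-04-30 08:00:00", "{}"),   # dropped: before cutoff
]

TABLE_COLUMNS = (
    "event_id INTEGER, user_id TEXT, event_type TEXT, "
    "event_time TEXT, properties TEXT"
)


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a populated raw_events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE raw_events ({TABLE_COLUMNS})")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?, ?, ?)", SAMPLE_EVENTS)
    conn.commit()
    return conn


def run_pipeline(conn: sqlite3.Connection, since: str) -> list:
    """Rebuild staging_events from raw_events, then return daily per-user metrics."""
    # Step 1: staging. Drop and recreate so reruns start from a clean table.
    conn.execute("DROP TABLE IF EXISTS staging_events")
    conn.execute(f"CREATE TABLE staging_events ({TABLE_COLUMNS})")
    conn.execute(
        """
        INSERT INTO staging_events
        SELECT event_id, user_id, event_type, event_time, properties
        FROM raw_events
        WHERE event_time >= ?
          AND event_type IN ('click', 'purchase', 'view')
        """,
        (since,),
    )
    conn.commit()

    # Step 2: aggregate event counts per day and user.
    cursor = conn.execute(
        """
        SELECT DATE(event_time) AS event_date,
               user_id,
               COUNT(*) AS event_count,
               COUNT(DISTINCT event_type) AS unique_events
        FROM staging_events
        GROUP BY 1, 2
        ORDER BY event_date DESC, event_count DESC
        """
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for row in run_pipeline(build_demo_db(), "2024-05-01 00:00:00"):
        print(row)
```

Binding the cutoff as a parameter keeps the run reproducible: the same `since` value always selects the same rows, which is harder to guarantee when the filter depends on the current date.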
Core Technologies
Data Processing
- Apache Spark
- Apache Flink
- Pandas / Polars
- dbt (data transformation)
Data Warehousing
- Snowflake
- BigQuery (GCP)
- Redshift (AWS)
- Azure Synapse
ETL/ELT Tools
- dbt
- Airflow
- Talend
- Informatica
Streaming
- Apache Kafka
- AWS Kinesis
- Apache Pulsar
ML & Analytics
- scikit-learn
- TensorFlow
- Tableau / Power BI
Best Practices
- Data Quality – Validation and testing
- Documentation – Clear metadata
- Performance – Query optimization
- Governance – Data security
- Monitoring – Pipeline alerts
- Scalability – Design for growth
- Version Control – Git for code and configs
- Testing – Data and pipeline testing
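As one way to make the data-quality and testing practices above concrete, the sketch below checks a batch of event rows before they reach a staging table. It is an illustration under stated assumptions, not part of the skill: `validate_events` and `ALLOWED_EVENT_TYPES` are hypothetical names, rows are assumed to be dictionaries with `event_id`, `event_type`, and `event_time` keys, and timestamps are assumed to be ISO-8601 strings.

```python
from datetime import datetime

# Assumed set of tracked event types, mirroring the quick-start filter.
ALLOWED_EVENT_TYPES = {"click", "purchase", "view"}


def validate_events(rows: list) -> list:
    """Return a list of data-quality problems; the list is empty if the batch is clean."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        # Check 1: event_id must be present and unique within the batch.
        event_id = row.get("event_id")
        if event_id is None:
            problems.append(f"row {index}: missing event_id")
        elif event_id in seen_ids:
            problems.append(f"row {index}: duplicate event_id {event_id}")
        else:
            seen_ids.add(event_id)

        # Check 2: event_type must be one of the tracked types.
        event_type = row.get("event_type")
        if event_type not in ALLOWED_EVENT_TYPES:
            problems.append(f"row {index}: unexpected event_type {event_type!r}")

        # Check 3: event_time must parse as an ISO-8601 timestamp.
        event_time = row.get("event_time")
        try:
            datetime.fromisoformat(str(event_time))
        except ValueError:
            problems.append(f"row {index}: unparseable event_time {event_time!r}")
    return problems
```

A pipeline step could call `validate_events` on each batch and raise an alert or fail the run when the returned list is non-empty, which also gives the monitoring practice a concrete hook.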