great-expectations
Install command
npx skills add https://github.com/majesticlabs-dev/majestic-marketplace --skill great-expectations
Great Expectations
Audience: Data engineers building validated data pipelines.
Goal: Provide GX patterns for expectation-based validation and monitoring.
Scripts
Import the GX helper functions from scripts/expectations.py:
from scripts.expectations import (
    get_pandas_context,
    add_dataframe_asset,
    create_basic_suite,
    run_validation,
)
Usage Examples
Quick Setup
from scripts.expectations import get_pandas_context, add_dataframe_asset
# df is an existing pandas DataFrame holding the data to validate
context, datasource = get_pandas_context("my_datasource")
batch_request = add_dataframe_asset(datasource, "users", df)
Create Expectation Suite
from scripts.expectations import create_basic_suite
columns_config = {
    'user_id': {'not_null': True, 'unique': True, 'type': 'int'},
    'age': {'min': 0, 'max': 150},
    'status': {'values': ['active', 'inactive', 'pending']},
    'email': {'regex': r'^[\w\.-]+@[\w\.-]+\.\w+$'},
}
suite = create_basic_suite(context, "user_suite", columns_config)
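The source does not show how `create_basic_suite` interprets `columns_config`. As a rough sketch, assuming each config key corresponds to one GX expectation, the translation could look like the following. The function `plan_expectations` and the key-to-expectation mapping are hypothetical illustrations, not the actual contents of `scripts/expectations.py`; the sketch only builds plain dictionaries and does not call GX.

```python
from __future__ import annotations

from typing import Any


def plan_expectations(columns_config: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Translate a columns_config dict into a flat list of expectation specs.

    Hypothetical mapping: the real create_basic_suite may read these keys differently.
    """
    plan: list[dict[str, Any]] = []
    for column, rules in columns_config.items():
        if rules.get("not_null"):
            plan.append({"expectation": "ExpectColumnValuesToNotBeNull", "column": column})
        if rules.get("unique"):
            plan.append({"expectation": "ExpectColumnValuesToBeUnique", "column": column})
        if "type" in rules:
            plan.append({"expectation": "ExpectColumnValuesToBeOfType",
                         "column": column, "type_": rules["type"]})
        if "min" in rules or "max" in rules:
            plan.append({"expectation": "ExpectColumnValuesToBeBetween", "column": column,
                         "min_value": rules.get("min"), "max_value": rules.get("max")})
        if "values" in rules:
            plan.append({"expectation": "ExpectColumnValuesToBeInSet",
                         "column": column, "value_set": list(rules["values"])})
        if "regex" in rules:
            plan.append({"expectation": "ExpectColumnValuesToMatchRegex",
                         "column": column, "regex": rules["regex"]})
    return plan


example_config = {
    "user_id": {"not_null": True, "unique": True, "type": "int"},
    "age": {"min": 0, "max": 150},
    "status": {"values": ["active", "inactive", "pending"]},
    "email": {"regex": r"^[\w\.-]+@[\w\.-]+\.\w+$"},
}
plan = plan_expectations(example_config)
for spec in plan:
    print(spec)
```

Printing the plan before building a suite is a cheap way to confirm that a config produces the expectations you intended.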
Run Validation
from scripts.expectations import run_validation
results = run_validation(
    context,
    checkpoint_name="user_checkpoint",
    batch_request=batch_request,
    suite_name="user_suite",
)
if results['success']:
    print("All expectations passed!")
else:
    for failure in results['failures']:
        print(f"Failed: {failure['expectation']} on {failure['column']}")
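When many expectations fail, a per-column summary can be easier to scan than one line per failure. The sketch below assumes the result shape implied by the example above: a `success` flag and a `failures` list whose items carry `expectation` and `column` keys. The helper `summarize_failures`, the `<table>` label for failures without a column, and the sample data are all hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def summarize_failures(results: dict) -> dict[str, list[str]]:
    """Group failed expectation names by column; return {} when validation passed."""
    if results.get("success"):
        return {}
    by_column: dict[str, list[str]] = defaultdict(list)
    for failure in results.get("failures", []):
        # Assumption: table-level failures have no column, so label them "<table>".
        column = failure.get("column") or "<table>"
        by_column[column].append(failure["expectation"])
    return dict(by_column)


# Hand-written sample in the assumed result shape, not real GX output.
sample_results = {
    "success": False,
    "failures": [
        {"expectation": "ExpectColumnValuesToNotBeNull", "column": "user_id"},
        {"expectation": "ExpectColumnValuesToBeBetween", "column": "age"},
        {"expectation": "ExpectColumnValuesToBeUnique", "column": "user_id"},
    ],
}
summary = summarize_failures(sample_results)
for column, names in summary.items():
    print(f"{column}: {', '.join(names)}")
```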
Common Expectations Reference
| Category | Expectation | Description |
|---|---|---|
| Table | ExpectTableRowCountToBeBetween | Row count range |
| Existence | ExpectColumnToExist | Column must exist |
| Nulls | ExpectColumnValuesToNotBeNull | No null values |
| Range | ExpectColumnValuesToBeBetween | Value bounds |
| Set | ExpectColumnValuesToBeInSet | Allowed values |
| Pattern | ExpectColumnValuesToMatchRegex | Regex match |
| Unique | ExpectColumnValuesToBeUnique | No duplicates |
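To make the column-level entries concrete, the sketch below re-implements their checks in plain Python over a list of values. It illustrates the intended semantics only and is not how GX evaluates expectations. The helper names are hypothetical, and two behaviors are assumptions of this sketch: nulls are skipped by every check except the not-null one, and the regex check uses `re.search`.

```python
import re


def not_null(values):
    """Nulls: every value is present."""
    return all(v is not None for v in values)


def between(values, min_value=None, max_value=None):
    """Range: every non-null value lies within the inclusive bounds."""
    return all(
        (min_value is None or v >= min_value) and (max_value is None or v <= max_value)
        for v in values
        if v is not None
    )


def in_set(values, value_set):
    """Set: every non-null value is one of the allowed values."""
    allowed = set(value_set)
    return all(v in allowed for v in values if v is not None)


def matches_regex(values, pattern):
    """Pattern: every non-null value matches the regular expression."""
    compiled = re.compile(pattern)
    return all(compiled.search(v) is not None for v in values if v is not None)


def unique(values):
    """Unique: no non-null value appears more than once."""
    present = [v for v in values if v is not None]
    return len(present) == len(set(present))


# Hand-written sample columns for illustration.
ages = [34, 0, 150, None]
emails = ["a.b@example.com", "not-an-email"]
email_pattern = r"^[\w\.-]+@[\w\.-]+\.\w+$"

print("ages not null:", not_null(ages))
print("ages in [0, 150]:", between(ages, 0, 150))
print("emails match pattern:", matches_regex(emails, email_pattern))
print("user ids unique:", unique([1, 2, 2]))
```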
Data Docs
# Build and open HTML reports
context.build_data_docs()
context.open_data_docs()
Directory Structure
great_expectations/
├── great_expectations.yml    # Config
├── expectations/             # Expectation suites (JSON)
├── checkpoints/              # Checkpoint definitions
├── plugins/                  # Custom expectations
└── uncommitted/
    ├── data_docs/            # Generated HTML docs
    └── validations/          # Validation results
When to Use Great Expectations
| Use Case | GX | Alternative |
|---|---|---|
| Pipeline monitoring | ✓ | – |
| Data warehouse validation | ✓ | – |
| Automated data docs | ✓ | – |
| Simple DataFrame checks | – | Pandera |
| Record-level API validation | – | Pydantic |
Dependencies
great_expectations>=0.18
pandas