csv-data-visualizer
npx skills add https://github.com/ailabs-393/ai-labs-claude-skills --skill csv-data-visualizer
CSV Data Visualizer
Overview
This skill enables comprehensive data visualization and analysis for CSV files. It provides three main capabilities: (1) creating individual interactive visualizations using Plotly, (2) automatic data profiling with statistical summaries, and (3) generating multi-plot dashboards. The skill is optimized for exploratory data analysis, statistical reporting, and creating presentation-ready visualizations.
When to Use This Skill
Invoke this skill when users request:
- “Visualize this CSV data”
- “Create a histogram/scatter plot/box plot from this data”
- “Show me the distribution of [column]”
- “Generate a dashboard for this dataset”
- “Profile this CSV file” or “Analyze this data”
- “Create a correlation heatmap”
- “Show trends over time”
- “Compare [variable] across [categories]”
Core Capabilities
1. Individual Visualizations
Create specific chart types for detailed analysis using the visualize_csv.py script.
Available Chart Types:
Statistical Plots:
# Histogram - distribution of numeric data
python3 scripts/visualize_csv.py data.csv --histogram column_name --bins 30
# Box plot - show quartiles and outliers
python3 scripts/visualize_csv.py data.csv --boxplot column_name
# Box plot grouped by category
python3 scripts/visualize_csv.py data.csv --boxplot salary --group-by department
# Violin plot - distribution with probability density
python3 scripts/visualize_csv.py data.csv --violin column_name --group-by category
Relationship Analysis:
# Scatter plot with automatic trend line
python3 scripts/visualize_csv.py data.csv --scatter height weight
# Scatter plot with color and size encoding
python3 scripts/visualize_csv.py data.csv --scatter x y --color category --size value
# Correlation heatmap for all numeric columns
python3 scripts/visualize_csv.py data.csv --correlation
Time Series:
# Line chart for single variable
python3 scripts/visualize_csv.py data.csv --line date sales
# Multiple variables on same chart
python3 scripts/visualize_csv.py data.csv --line date "sales,revenue,profit"
Categorical Data:
# Bar chart (counts categories automatically)
python3 scripts/visualize_csv.py data.csv --bar category
# Pie chart for composition
python3 scripts/visualize_csv.py data.csv --pie region
Output Formats: Pass -o with an output file name; the file extension selects the format:
# Interactive HTML (default)
python3 scripts/visualize_csv.py data.csv --histogram age -o output.html
# Static image formats
python3 scripts/visualize_csv.py data.csv --scatter x y -o plot.png
python3 scripts/visualize_csv.py data.csv --correlation -o heatmap.pdf
python3 scripts/visualize_csv.py data.csv --bar category -o chart.svg
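The extension-based format selection described above can be pictured with a small sketch. This is not the code in visualize_csv.py; the function name export_plan and the rule that a missing extension falls back to HTML are assumptions made for illustration. It only decides which format applies and whether kaleido would be needed.

```python
from pathlib import Path

# Static formats named in this document; they rely on kaleido for export.
STATIC_FORMATS = {".png", ".pdf", ".svg"}


def export_plan(output_path):
    """Decide the export format from the output file's extension (hypothetical helper)."""
    suffix = Path(output_path).suffix.lower()
    # Assumption: no extension means the default interactive HTML output.
    if suffix in ("", ".html"):
        return {"format": "html", "needs_kaleido": False}
    if suffix in STATIC_FORMATS:
        return {"format": suffix[1:], "needs_kaleido": True}
    raise ValueError(f"unsupported output extension: {suffix}")


for name in ["output.html", "plot.png", "heatmap.pdf", "chart.svg"]:
    print(name, export_plan(name))
```

Checking the plan before running a batch of exports makes it easier to notice that kaleido is required before the first static image fails.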
2. Automatic Data Profiling
Generate comprehensive data quality and statistical reports using the data_profile.py script.
Text Report (default):
python3 scripts/data_profile.py data.csv
HTML Report:
python3 scripts/data_profile.py data.csv -f html -o report.html
JSON Report:
python3 scripts/data_profile.py data.csv -f json -o profile.json
What the Profiler Provides:
- File information (size, dimensions)
- Dataset overview (shape, memory usage, duplicates)
- Column-by-column analysis (types, missing data, unique values)
- Missing data patterns and completeness
- Statistical summary for numeric columns (mean, std, quartiles, skewness, kurtosis)
- Categorical column analysis (frequency counts, most/least common values)
- Data quality checks (high missing data, duplicate rows, constant columns, high cardinality)
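To make the checks in this list concrete, here is a minimal standard-library sketch of a few of them (missing values, unique counts, constant columns, mean and standard deviation, duplicate rows). It is an illustration only, not the implementation of data_profile.py; the function name profile_rows and the sample data are hypothetical.

```python
import csv
import io
import statistics


def profile_rows(rows):
    """Summarize dict rows per column: missing count, unique count, numeric stats."""
    columns = list(rows[0].keys()) if rows else []
    report = {}
    for col in columns:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None and v.strip() != ""]
        info = {
            "missing": len(values) - len(present),
            "unique": len(set(present)),
            "constant": len(set(present)) <= 1,
        }
        # Treat the column as numeric only if every non-empty value parses as a float.
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None
        if numbers:
            info["mean"] = statistics.mean(numbers)
            info["std"] = statistics.stdev(numbers) if len(numbers) > 1 else 0.0
        report[col] = info
    return report


# Hypothetical sample standing in for a real CSV file.
SAMPLE = "age,department\n30,sales\n40,sales\n,ops\n50,ops\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
duplicates = len(rows) - len({tuple(r.items()) for r in rows})
report = profile_rows(rows)
print("duplicate rows:", duplicates)
for name, info in report.items():
    print(name, info)
```

The real profiler reports more (skewness, kurtosis, memory usage, cardinality checks); the sketch only shows the shape of a column-by-column pass.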
When to Use Profiling: Recommend running data profiling BEFORE creating visualizations when:
- The user is unfamiliar with the dataset or is exploring it for the first time
- Data quality is unknown
- Appropriate visualization types still need to be identified
3. Multi-Plot Dashboards
Create comprehensive dashboards with multiple visualizations using the create_dashboard.py script.
Automatic Dashboard: Analyzes data types and automatically creates appropriate visualizations:
python3 scripts/create_dashboard.py data.csv
Custom output location:
python3 scripts/create_dashboard.py data.csv -o my_dashboard.html
Control number of plots:
python3 scripts/create_dashboard.py data.csv --max-plots 9
Custom Dashboard from Config: Create a JSON configuration file specifying exact plots:
python3 scripts/create_dashboard.py data.csv --config config.json
Dashboard Config Format:
{
"title": "Sales Analysis Dashboard",
"plots": [
{"type": "histogram", "column": "revenue"},
{"type": "box", "column": "revenue", "group_by": "region"},
{"type": "scatter", "column": "advertising", "group_by": "revenue"},
{"type": "bar", "column": "product_category"},
{"type": "correlation"}
]
}
Dashboard Plot Types:
- histogram: Distribution of a numeric column
- box: Box plot, optionally grouped by category
- scatter: Relationship between two numeric columns
- bar: Count of categorical values
- correlation: Heatmap of numeric correlations
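A config file can be checked before it is passed to --config. The sketch below builds a config as a Python dict, checks it against the five plot types listed above, and serializes it to JSON. The validation rules (every plot except correlation needs a "column") are assumptions for illustration; create_dashboard.py may accept or require different keys.

```python
import json

# Plot types listed in this document; the required-key rule below is an assumption.
KNOWN_TYPES = {"histogram", "box", "scatter", "bar", "correlation"}


def validate_config(config):
    """Return a list of problems found in a dashboard config dict (hypothetical helper)."""
    problems = []
    if not isinstance(config.get("plots"), list) or not config["plots"]:
        return ["config needs a non-empty 'plots' list"]
    for i, plot in enumerate(config["plots"]):
        kind = plot.get("type")
        if kind not in KNOWN_TYPES:
            problems.append(f"plot {i}: unknown type {kind!r}")
        elif kind != "correlation" and "column" not in plot:
            problems.append(f"plot {i}: {kind} needs a 'column'")
    return problems


config = {
    "title": "Sales Analysis Dashboard",
    "plots": [
        {"type": "histogram", "column": "revenue"},
        {"type": "box", "column": "revenue", "group_by": "region"},
        {"type": "correlation"},
    ],
}

problems = validate_config(config)
# Save this text as config.json to use it with --config.
config_text = json.dumps(config, indent=2)
print("\n".join(problems) if problems else config_text)
```

Generating the file from a dict avoids hand-written JSON mistakes such as trailing commas or unquoted keys.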
Workflow Decision Tree
Use this decision tree to determine the appropriate approach:
User provides CSV file
│
├─ "Profile this data" / "Analyze this data" / Unfamiliar dataset
│  └─> Run data_profile.py first
│      Then offer visualization options based on findings
│
├─ "Create dashboard" / "Overview of the data" / Multiple visualizations needed
│  ├─ User knows exact plots wanted
│  │  └─> Create JSON config → run create_dashboard.py with config
│  └─ User wants automatic dashboard
│     └─> Run create_dashboard.py (auto mode)
│
└─ Specific visualization requested ("histogram", "scatter plot", etc.)
   └─> Use visualize_csv.py with appropriate flag
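The branches of the decision tree can be expressed as a small routing function. This is a rough sketch under stated assumptions: the keyword matching and the name choose_command are hypothetical, and a real agent would judge the request in context rather than by substring.

```python
def choose_command(request, csv_path, has_config=False):
    """Map a user request to one of the three scripts (hypothetical keyword routing)."""
    text = request.lower()
    if "profile" in text or "analyze" in text:
        return ["python3", "scripts/data_profile.py", csv_path]
    if "dashboard" in text or "overview" in text:
        cmd = ["python3", "scripts/create_dashboard.py", csv_path]
        if has_config:
            # The user knows the exact plots: use a prepared JSON config.
            cmd += ["--config", "config.json"]
        return cmd
    # Everything else is treated as a single-chart request; the caller
    # still has to append the chart flag, such as --histogram column_name.
    return ["python3", "scripts/visualize_csv.py", csv_path]


print(choose_command("Profile this CSV file", "data.csv"))
print(choose_command("Generate a dashboard for this dataset", "data.csv"))
print(choose_command("Create a histogram of age", "data.csv"))
```

Returning the command as a list keeps it ready for subprocess.run without shell quoting issues.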
Best Practices
Starting Analysis
- Always profile first for unfamiliar datasets:
python3 scripts/data_profile.py data.csv
- Review the profiling output to understand:
  - Column data types and ranges
  - Missing data patterns
  - Data quality issues
  - Statistical distributions
Choosing Visualizations
Consult references/visualization_guide.md for detailed guidance. Quick reference:
- Distribution: Histogram, box plot, violin plot
- Relationship: Scatter plot, correlation heatmap
- Time series: Line chart
- Categories: Bar chart (preferred) or pie chart (use sparingly)
- Comparison: Box plot grouped by category
Creating Dashboards
- Automatic dashboard: Good for initial exploration
- Custom dashboard: Better for presentations or specific analysis goals
- Limit plots: Keep to 6-9 plots maximum for readability
- Logical grouping: Group related visualizations together
Output Considerations
- HTML: Best for interactive exploration (zoom, pan, hover tooltips)
- PNG/PDF: Best for reports and presentations
- SVG: Best for publications requiring vector graphics
Dependencies
The scripts require these Python packages:
pip install pandas plotly numpy
For static image export (PNG, PDF, SVG), also install:
pip install kaleido
Example Workflows
Exploratory Data Analysis
# 1. Profile the data
python3 scripts/data_profile.py sales_data.csv -f html -o profile.html
# 2. Create automatic dashboard
python3 scripts/create_dashboard.py sales_data.csv -o dashboard.html
# 3. Dive deeper with specific plots
python3 scripts/visualize_csv.py sales_data.csv --scatter price sales --color region
python3 scripts/visualize_csv.py sales_data.csv --boxplot revenue --group-by product
Report Generation
# Create specific visualizations for report
python3 scripts/visualize_csv.py data.csv --histogram age -o fig1_distribution.png
python3 scripts/visualize_csv.py data.csv --scatter income age -o fig2_correlation.png
python3 scripts/visualize_csv.py data.csv --bar category -o fig3_categories.png
# Generate data summary
python3 scripts/data_profile.py data.csv -f html -o data_summary.html
Interactive Dashboard
# Create custom dashboard for presentation
# 1. First, create config.json with desired plots
# 2. Generate dashboard
python3 scripts/create_dashboard.py data.csv --config config.json -o presentation_dashboard.html
Troubleshooting
“Column not found” errors:
- Run data profiling to see exact column names
- CSV columns are case-sensitive
- Check for leading/trailing spaces in column names
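When a column name is rejected, printing the header with repr makes case differences and stray spaces visible. The sketch below does that with the standard library and suggests the closest existing name; find_column and the sample header are hypothetical and not part of the skill's scripts.

```python
import csv
import difflib
import io


def find_column(header, wanted):
    """Return (exact_match, suggestions) for a requested column name."""
    if wanted in header:
        return wanted, []
    # Compare on stripped, lower-cased names to catch case and whitespace slips.
    normalized = {name.strip().lower(): name for name in header}
    key = wanted.strip().lower()
    if key in normalized:
        return None, [normalized[key]]
    close = difflib.get_close_matches(key, list(normalized), n=3)
    return None, [normalized[c] for c in close]


# Hypothetical header with a padded column name.
SAMPLE = "Name, Salary ,Department\nAda,100,eng\n"
header = next(csv.reader(io.StringIO(SAMPLE)))
print([repr(name) for name in header])
print(find_column(header, "salary"))
```

A suggestion that differs from the request only by spaces or case points to a header that should be cleaned in the CSV, or a column name that should be quoted exactly as it appears.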
Empty or incorrect visualizations:
- Verify data types (numeric vs categorical)
- Check for missing data in plotted columns
- Ensure sufficient non-null values exist
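These three checks can be run on a single column before plotting. The following sketch counts empty, numeric, and non-numeric values using only the standard library; column_summary and the sample data are hypothetical. A column meant to be numeric that reports any "text" values (for example "N/A" or "$7") is a likely cause of an empty or misleading chart.

```python
import csv
import io


def column_summary(rows, column):
    """Count empty, numeric, and non-numeric values in one column."""
    counts = {"empty": 0, "numeric": 0, "text": 0}
    for row in rows:
        value = (row.get(column) or "").strip()
        if not value:
            counts["empty"] += 1
            continue
        try:
            float(value)
            counts["numeric"] += 1
        except ValueError:
            counts["text"] += 1
    return counts


# Hypothetical sample with a placeholder, a blank, and a currency symbol.
SAMPLE = "id,price\n1,10\n2,12.5\n3,N/A\n4,\n5,$7\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = column_summary(rows, "price")
print(summary)
```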
Script execution errors:
- Verify dependencies are installed:
pip list | grep plotly
- Check Python version: Python 3.6+ required
- For image export issues, install kaleido:
pip install kaleido
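The same checks can be done from Python in one pass. This sketch uses importlib.util.find_spec to test whether each package named in the Dependencies section is importable, without actually importing it; the function name check_environment is hypothetical.

```python
import importlib.util
import sys

REQUIRED = ["pandas", "plotly", "numpy"]  # packages from the Dependencies section
OPTIONAL = ["kaleido"]                    # only needed for PNG/PDF/SVG export


def check_environment(min_version=(3, 6)):
    """Report the Python version check and which packages cannot be found."""
    return {
        "python_ok": sys.version_info >= min_version,
        "missing_required": [m for m in REQUIRED if importlib.util.find_spec(m) is None],
        "missing_optional": [m for m in OPTIONAL if importlib.util.find_spec(m) is None],
    }


status = check_environment()
print(status)
```

Run it with the same python3 interpreter that runs the scripts, since packages installed for a different interpreter will not be found.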
Resources
scripts/
- visualize_csv.py: Main visualization script with all chart types
- data_profile.py: Automatic data profiling and quality analysis
- create_dashboard.py: Multi-plot dashboard generator
references/
visualization_guide.md: Comprehensive guide for choosing appropriate chart types, best practices, and common patterns