scientific-publication
npx skills add https://github.com/delphine-l/claude_global --skill scientific-publication
Agent 安装分布
Skill 文档
Scientific Publication Figure Refinement
Expert guidance for systematically improving scientific figures through iterative refinement based on user feedback and publication requirements.
When to Use This Skill
- Improving figures based on reviewer or collaborator feedback
- Optimizing figure clarity and readability
- Ensuring all figure elements fit within bounds
- Deciding between layout alternatives (horizontal vs vertical panels)
- Preparing figures for high-impact publications
Iterative Figure Refinement Workflow
Standard Refinement Sequence
When improving a publication figure, follow this systematic approach:
1. Identify the Core Issue
Examples:
- "Violin plots look distorted on log scale"
- "P-values are cut off at the top"
- "Too much visual clutter, hard to see the data"
- "Text overlaps with data points"
2. Fix the Visualization Type/Method
# Example: Replace inappropriate plot type
# Before: Violin plot on log scale (distorted)
ax.violinplot(data)
ax.set_yscale('log')
# After: Boxplot on log scale (accurate)
ax.boxplot(data)
ax.set_yscale('log')
3. Improve Visual Clarity Systematically adjust element sizes:
# Point sizes: Reduce for dense data
# Start: s=60 (exploratory)
# End: s=25 (publication)
ax.scatter(..., s=25, alpha=0.5)
# Line widths: Thinner reduces clutter
# Start: linewidth=2.5
# End: linewidth=1.5
ax.plot(..., linewidth=1.5)
# Text sizes: Prevent overlap
# Start: fontsize=10-12
# End: fontsize=8-9
ax.text(..., fontsize=8)
# Error bar caps: Keep readable
ax.errorbar(..., capsize=5)
4. Test Layout Alternatives
# Option A: Side-by-side panels
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
# Pros: Direct left-right comparison
# Cons: Smaller individual panels
# Option B: Stacked vertically
fig, axes = plt.subplots(2, 1, figsize=(10, 14))
# Pros: Larger individual panels, easier to read details
# Cons: Harder to compare across panels
# Decision: Let user feedback guide choice
# Generate both, ask which is clearer
5. Optimize Element Positioning Ensure all annotations fit within plot bounds:
# Calculate safe positioning
y_max = max([d.max() for d in data_list])
y_min = min([d.min() for d in data_list])
# Position annotations WITHIN bounds
y_pos = y_max * 0.92 # 92%, not 105% (which goes outside)
# Set explicit limits with headroom
ax.set_ylim(y_min * 0.95 if y_min > 0 else y_min - 5,
y_max * 1.05)
Checklist for Publication Figures
Use this checklist before finalizing figures:
- Plot type appropriate for data distribution (no violin on log scale)
- All text readable at publication size (8-10 pt minimum)
- Statistical annotations visible and within plot bounds
- Legend clear and doesn’t obscure data
- Axis labels descriptive with units
- Color scheme colorblind-friendly
- Line weights balanced (not too thick or thin)
- Point sizes optimized (visible but not overlapping)
- DPI adequate for publication (300 minimum)
- Layout tested (try both horizontal and vertical if applicable)
- File format publication-ready (PNG, PDF, or SVG)
Common Refinement Patterns
Pattern 1: Decluttering Dense Plots
Problem: Too many visual elements competing for attention
Solution sequence:
- Reduce point size (60 â 25)
- Thin line widths (2.5 â 1.5)
- Increase transparency (alpha=0.8 â 0.5)
- Reduce font sizes (10 â 8)
- Remove grid or make it lighter (alpha=0.3)
Before/After test: Generate both versions, compare
Pattern 2: Fixing Overflow Issues
Problem: Annotations, legends, or labels cut off
Solutions:
# 1. Adjust annotation positions
y_pos = y_max * 0.92 # Within bounds
# 2. Use bbox_inches='tight' when saving
plt.savefig('figure.png', dpi=300, bbox_inches='tight')
# 3. Explicitly set limits
ax.set_ylim(min_val * 0.95, max_val * 1.05)
# 4. Move legend outside plot area
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
# 5. Reduce text size
ax.text(..., fontsize=8) # Down from 10
Pattern 3: Multi-Panel Layout Optimization
Try both orientations:
# Version 1: Horizontal (side-by-side)
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
plt.savefig('fig_horizontal.png', dpi=300, bbox_inches='tight')
# Version 2: Vertical (stacked)
fig, axes = plt.subplots(2, 1, figsize=(10, 14))
plt.savefig('fig_vertical.png', dpi=300, bbox_inches='tight')
# Present both to user, ask which is clearer
Decision criteria:
- Horizontal: Better for direct comparison between panels
- Vertical: Better when each panel needs more space
- User context: Journal column width, presentation slides, etc.
Pattern 4: Iterative Statistical Annotation
Common issue: P-values positioned outside plot or overlapping with data
Solution:
# Calculate data range first
all_data = [data_dual, data_prialt] # All datasets in plot
y_max = max([d.max() for d in all_data if len(d) > 0])
# Position relative to actual data, not theoretical maximum
for i, (x_pos, comparison) in enumerate(comparisons):
stat, pval = stats.mannwhitneyu(...)
# Safe positioning
y_annotation = y_max * 0.92 # Below the top
# Format text
if pval < 0.001:
text = 'p < 0.001***'
elif pval < 0.01:
text = 'p < 0.01**'
elif pval < 0.05:
text = 'p < 0.05*'
else:
text = f'p = {pval:.3f} ns'
ax.text(x_pos, y_annotation, text, ha='center', fontsize=9)
# Set explicit limits to ensure annotations fit
ax.set_ylim(0, y_max * 1.05)
Refinement Workflow Example
Real case: VGP Figure 5 improvement sequence
-
Initial version: 4 categories, violin plots on log scale
- Issue: Violin distortion, too complex
-
V1 refinement: Remove violin plots, keep boxplots
- Better, but still issues
-
V2 refinement: Simplify to 3 categories
- Clearer interpretation
-
V3 refinement: Reduce point sizes (60â25), thin lines (2.5â1.5)
- Less clutter
-
V4 refinement: Test vertical vs horizontal layout
- Horizontal clearer for this case
-
V5 refinement: Fix p-value positioning (105%â92% of y_max)
- All elements now visible
-
Final: Smaller text in statistics box (10â8)
- Publication ready
Total iterations: 7 versions over refinement process Result: Clear, accurate, publication-quality figure
Best Practices
1. Version Your Refinements
Keep working versions during major changes:
scripts/
plot_figure.py # Original
plot_figure_v2.py # After major change (layout)
plot_figure_final.py # Publication version
2. Generate Alternatives in Parallel
When testing layout options:
# Save both versions
layouts = [
((1, 2), (16, 7), 'horizontal'),
((2, 1), (10, 14), 'vertical')
]
for (nrows, ncols), figsize, name in layouts:
fig, axes = plt.subplots(nrows, ncols, figsize=figsize)
# ... plot data ...
plt.savefig(f'figure_{name}.png', dpi=300, bbox_inches='tight')
3. Document Each Refinement
"""
Figure 5 - Terminal Telomere Presence
Version history:
- v1: Initial 4-category version with violin plots
- v2: Removed violin plots (distortion on log scale)
- v3: Simplified to 3 categories (terminal only)
- v4: Reduced point/line sizes for clarity
- v5: Fixed p-value positioning
- final: Publication ready
Changes from v4 â v5:
- P-value y-position: 1.05 * y_max â 0.92 * y_max
- Added explicit y-axis limits: (y_min*0.95, y_max*1.05)
- Ensures all annotations visible within plot bounds
"""
4. Get Feedback at Key Milestones
Don’t over-iterate without input:
- After fixing major issues (wrong plot type): Show user
- After layout changes (horizontal vs vertical): Show user
- After final polish: Show user
5. Maintain Consistency Across Figure Set
If refining one figure, check if same improvements apply to others:
# Applied violinâboxplot fix to Figures 2, 7, 10, 11
# Applied size reductions consistently across all figures
# Used same color scheme throughout
Publication Standards
DPI Requirements
- Screen/web: 150 DPI
- Print (standard): 300 DPI
- High-quality print: 600 DPI
File Formats
- Raster: PNG at 300 DPI (most journals accept)
- Vector: PDF or SVG (preferred for line plots, smaller file size, infinite zoom)
- Avoid: JPG (lossy compression, poor for scientific data)
Size Specifications
Check journal requirements:
- Single column: Usually 3.5 inches (89 mm) wide
- Double column: Usually 7 inches (178 mm) wide
- Height: Typically max 9-10 inches
Plan figsize accordingly:
# Single column figure
fig, ax = plt.subplots(figsize=(3.5, 4))
# Double column figure
fig, axes = plt.subplots(1, 2, figsize=(7, 3.5))
Color Accessibility Requirements
Many journals now require accessibility statements for figures, including:
- Confirmation that color schemes are colorblind-safe
- Use of validated palettes (Okabe-Ito, Paul Tol)
- Alternative distinguishing features (patterns, shapes, labels)
Nature journals specifically recommend:
- Okabe-Ito palette for categorical data
- Avoiding red-green combinations
- Testing figures with colorblindness simulators
In Methods section, document your color choices:
“All figures use the Okabe-Ito colorblind-safe palette (Okabe & Ito, 2008) to ensure accessibility for readers with color vision deficiencies.”
Reference: Okabe, M. and Ito, K. (2008) Color Universal Design (CUD): How to make figures and presentations that are friendly to colorblind people. https://jfly.uni-koeln.de/color/
Writing Integrated Results from Multi-Study Analyses
Challenge: When you have multiple parallel analyses (e.g., same metrics across 5 different populations/clades/conditions), how to present findings coherently without overwhelming readers.
Solution: Organize by pattern type first, then by study
Structure Pattern
1. Universal Patterns Section
- Present findings consistent across ALL studies first
- This establishes the “baseline truth” readers can rely on
- Use strong language: “consistently,” “across all,” “universal”
- Provide statistical evidence from multiple studies
2. Study-Specific Patterns Section
- Present deviations and unique findings by study
- Explicitly contrast with universal patterns
- Explain why this study differs (biological/technical context)
3. Cross-Study Comparisons Section
- Tables comparing effect sizes across studies
- Discussion of what drives variation
- Statistical power considerations
Example Structure (from clade-specific genome analysis):
## Universal Patterns Across All Vertebrates
### Gap Density: Architecture Dominates Curation
- Finding: [Universal pattern]
- Evidence: [Stats from all 5 clades]
- Interpretation: [Why this is universal]
### Telomere Detection: Technology-Limited
- [Similar structure]
## Clade-Specific Patterns
### Mammals: Dual Curation Provides Benefits
- Finding: [Unique to this clade]
- Contrast: [How this differs from universal]
- Interpretation: [Biological context]
### Birds: No N50 Benefit from Dual Curation
- [Unique pattern and explanation]
## Cross-Clade Comparisons
- [Table of effect sizes]
- [Discussion of variation]
Benefits of This Structure
- Readers get reliable findings first: Universal patterns are established before introducing complexity
- Reduces cognitive load: Don’t jump between studies repeatedly
- Highlights what’s generalizable: Universal section shows what works everywhere
- Explains variation: Study-specific section explains why some results differ
- Facilitates recommendations: Can give universal advice plus context-specific guidance
Writing Tips
For Universal Patterns:
- Lead with the finding, then provide evidence from multiple studies
- Use consistent statistical reporting across all supporting evidence
- Emphasize the consistency: “across all,” “in every,” “universal”
For Study-Specific Patterns:
- Explicitly state how this differs from universal patterns
- Provide biological/technical context for why this study is unique
- Don’t just report statistics – explain the mechanism
For Statistical Power:
- Be explicit about which studies have sufficient power
- Note limitations in smaller studies
- Don’t over-interpret null results from underpowered studies
Common Pitfalls to Avoid
â Don’t: Report each study sequentially (Study 1 all results, Study 2 all results…) â Do: Report by finding type (Finding A across all studies, Finding B across all studies…)
â Don’t: Hide that some patterns aren’t universal â Do: Explicitly highlight when a pattern is study-specific and explain why
â Don’t: Give equal weight to all findings â Do: Emphasize universal patterns; note study-specific as “interesting variations”
Application Beyond Clade Analysis
This pattern works for any multi-study synthesis:
- Clinical trials across different populations
- Experimental treatments across multiple cell lines
- Algorithm performance across different datasets
- Policy interventions across different regions
Key principle: Organize by what readers need to know (universal vs specific) rather than by how you conducted the studies (study-by-study).
Providing Practical Recommendations from Complex Trade-offs
Challenge: When different methods excel at different outcomes, how to give clear guidance?
Pattern: “Depends on priority” recommendations with decision tree
Structure:
### For [Population/Context]
**Recommended**: [Method A]
- [Metric 1]: [Performance with stats]
- [Metric 2]: [Performance with stats]
- Use when: [Priority/constraint]
**Alternative**: [Method B]
- [Metric 1]: [Performance with stats]
- [Metric 2]: [Performance with stats]
- Use when: [Different priority/constraint]
**Note**: [Important caveat or key difference from other contexts]
Example (from avian genome assemblies):
### For Avian Genomes
**Depends on priority**:
**For gap density minimization**: Phased assembly (dual or single curation)
- Dramatic 75-100Ã reduction in gaps vs Pri/alt
- Strong significance (p=1.87Ã10â»Â¹â°)
**For chromosome assignment**: Pri/alt + Single curation
- Best assignment (98.93% median)
- Significantly better than phased approaches (p<0.001)
**Note**: Dual curation does NOT improve scaffold N50 in birds (p=0.378),
unlike mammals. Initial assemblies are already near-optimal due to
favorable genome characteristics.
Benefits:
- Acknowledges trade-offs honestly
- Provides clear decision criteria
- Gives actionable guidance despite complexity
- Explains when different approaches are optimal
Summary
Systematic refinement workflow:
- Identify issue â 2. Fix visualization â 3. Improve clarity â 4. Test layouts â 5. Optimize positioning
Key principles:
- Iterate based on user feedback
- Test alternatives (show options)
- Document changes
- Apply lessons across figure set
- Meet publication standards
Common adjustments:
- Point sizes: 60 â 25
- Line widths: 2.5 â 1.5
- Font sizes: 10 â 8
- Annotation positions: 105% â 92% of max
- Always set explicit axis limits