scientific-publication

📁 delphine-l/claude_global 📅 Today

Install command

npx skills add https://github.com/delphine-l/claude_global --skill scientific-publication

Skill documentation

Scientific Publication Figure Refinement

Expert guidance for systematically improving scientific figures through iterative refinement based on user feedback and publication requirements.

When to Use This Skill

Improving figures based on reviewer or collaborator feedback
Optimizing figure clarity and readability
Ensuring all figure elements fit within bounds
Deciding between layout alternatives (horizontal vs vertical panels)
Preparing figures for high-impact publications

Iterative Figure Refinement Workflow

Standard Refinement Sequence

When improving a publication figure, follow this systematic approach:

1. Identify the Core Issue

Examples:
- "Violin plots look distorted on log scale"
- "P-values are cut off at the top"
- "Too much visual clutter, hard to see the data"
- "Text overlaps with data points"

2. Fix the Visualization Type/Method

# Example: Replace inappropriate plot type
# Before: Violin plot on log scale (distorted)
ax.violinplot(data)
ax.set_yscale('log')

# After: Boxplot on log scale (accurate)
ax.boxplot(data)
ax.set_yscale('log')
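
A minimal runnable sketch of the "after" case, assuming synthetic log-normal data. The `groups` list, the group labels, and the axis label are placeholders, not values from the original figure. A likely reason for the violin distortion is that the density is estimated on the raw values and then stretched by the log axis, whereas the box and median are quantiles of the data and keep their meaning on either scale.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: three skewed (log-normal) groups, not real measurements
rng = np.random.default_rng(0)
groups = [rng.lognormal(mean=m, sigma=1.0, size=200) for m in (1.0, 2.0, 3.0)]

fig, ax = plt.subplots(figsize=(3.5, 4))
box = ax.boxplot(groups)             # box and median are quantiles of the raw data
ax.set_yscale("log")                 # only the axis is transformed, not the data
ax.set_xticklabels(["A", "B", "C"])  # hypothetical group names
ax.set_ylabel("Value (arbitrary units)")
```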

3. Improve Visual Clarity

Systematically adjust element sizes:

# Point sizes: Reduce for dense data
# Start: s=60 (exploratory)
# End: s=25 (publication)
ax.scatter(..., s=25, alpha=0.5)

# Line widths: Thinner reduces clutter
# Start: linewidth=2.5
# End: linewidth=1.5
ax.plot(..., linewidth=1.5)

# Text sizes: Prevent overlap
# Start: fontsize=10-12
# End: fontsize=8-9
ax.text(..., fontsize=8)

# Error bar caps: Keep readable
ax.errorbar(..., capsize=5)

4. Test Layout Alternatives

# Option A: Side-by-side panels
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
# Pros: Direct left-right comparison
# Cons: Smaller individual panels

# Option B: Stacked vertically
fig, axes = plt.subplots(2, 1, figsize=(10, 14))
# Pros: Larger individual panels, easier to read details
# Cons: Harder to compare across panels

# Decision: Let user feedback guide choice
# Generate both, ask which is clearer

5. Optimize Element Positioning

Ensure all annotations fit within plot bounds:

# Calculate safe positioning
y_max = max([d.max() for d in data_list])
y_min = min([d.min() for d in data_list])

# Position annotations WITHIN bounds
y_pos = y_max * 0.92  # 92%, not 105% (which goes outside)

# Set explicit limits with headroom
ax.set_ylim(y_min * 0.95 if y_min > 0 else y_min - 5,
            y_max * 1.05)
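
The positioning rule above can be wrapped in a small helper. This is a sketch under stated assumptions: the function name `annotation_layout` and its parameters `frac` and `pad` are hypothetical, and it only handles strictly positive data, because multiplying a negative or zero bound by 0.95 or 1.05 does not add headroom.

```python
def annotation_layout(data_list, frac=0.92, pad=0.05):
    """Return (y_annotation, (y_low, y_high)) for strictly positive data.

    frac: annotation height as a fraction of the data maximum (0.92 in the text).
    pad:  relative headroom added below the minimum and above the maximum.
    """
    nonempty = [d for d in data_list if len(d) > 0]
    if not nonempty:
        raise ValueError("no data to position annotations against")
    y_min = min(min(d) for d in nonempty)
    y_max = max(max(d) for d in nonempty)
    if y_min <= 0:
        raise ValueError("this sketch assumes strictly positive data")
    y_annotation = y_max * frac
    limits = (y_min * (1 - pad), y_max * (1 + pad))
    return y_annotation, limits

y_annotation, (y_low, y_high) = annotation_layout([[1.0, 2.0], [4.0, 10.0]])
print(y_low < y_annotation < y_high)  # True: the annotation sits inside the limits
```

The returned values would then be passed to `ax.text(..., y_annotation, ...)` and `ax.set_ylim(y_low, y_high)`.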

Checklist for Publication Figures

Use this checklist before finalizing figures:

Plot type appropriate for data distribution (no violin on log scale)
All text readable at publication size (8 pt minimum at final printed size)
Statistical annotations visible and within plot bounds
Legend clear and doesn’t obscure data
Axis labels descriptive with units
Color scheme colorblind-friendly
Line weights balanced (not too thick or thin)
Point sizes optimized (visible but not overlapping)
DPI adequate for publication (300 minimum)
Layout tested (try both horizontal and vertical if applicable)
File format publication-ready (PNG, PDF, or SVG)
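
Part of this checklist can be checked automatically. The sketch below covers only the text-size item; the helper name `undersized_text` and the 8 pt threshold are assumptions for illustration. It inspects the nominal font size in the figure, so it says nothing about the effective size if the journal rescales the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
from matplotlib.text import Text

def undersized_text(fig, min_pt=8.0):
    """Return visible, non-empty text artists whose nominal size is below min_pt."""
    fig.canvas.draw()  # tick labels only receive their text once the figure is drawn
    return [t for t in fig.findobj(Text)
            if t.get_visible() and t.get_text().strip() and t.get_fontsize() < min_pt]

fig, ax = plt.subplots(figsize=(3.5, 4))
ax.plot([0, 1], [0, 1])
ax.set_xlabel("Time (s)", fontsize=9)
ax.set_title("Too small", fontsize=6)  # deliberately below the threshold
offenders = undersized_text(fig)
print([t.get_text() for t in offenders])
```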

Common Refinement Patterns

Pattern 1: Decluttering Dense Plots

Problem: Too many visual elements competing for attention

Solution sequence:

Reduce point size (60 → 25)
Thin line widths (2.5 → 1.5)
Increase transparency (alpha=0.8 → 0.5)
Reduce font sizes (10 → 8)
Remove grid or make it lighter (alpha=0.3)

Before/After test: Generate both versions, compare
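
One way to run the before/after test is to draw the same data twice with two style presets. This is a sketch on synthetic data: the preset dictionaries, the `draw_panel` helper, and the output filename are hypothetical, and the "before" grid alpha of 1.0 is an assumption since the text only gives the lighter value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
import numpy as np

# Sizes follow the solution sequence above; the "before" grid_alpha is assumed
BEFORE = {"s": 60, "linewidth": 2.5, "alpha": 0.8, "fontsize": 10, "grid_alpha": 1.0}
AFTER = {"s": 25, "linewidth": 1.5, "alpha": 0.5, "fontsize": 8, "grid_alpha": 0.3}

def draw_panel(ax, x, y, style, title):
    """Draw the same scatter plus linear trend line using one style preset."""
    ax.scatter(x, y, s=style["s"], alpha=style["alpha"])
    slope, intercept = np.polyfit(x, y, 1)
    xs = np.array([x.min(), x.max()])
    ax.plot(xs, slope * xs + intercept, linewidth=style["linewidth"], color="black")
    ax.grid(True, alpha=style["grid_alpha"])
    ax.set_title(title, fontsize=style["fontsize"])
    ax.tick_params(labelsize=style["fontsize"])

# Placeholder data, not real measurements
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 500)
y = 2.0 * x + rng.normal(0, 3, 500)

fig, axes = plt.subplots(1, 2, figsize=(7, 3.5), sharey=True)
draw_panel(axes[0], x, y, BEFORE, "Before")
draw_panel(axes[1], x, y, AFTER, "After")
fig.savefig("declutter_before_after.png", dpi=300, bbox_inches="tight")
```

Putting both versions in one image keeps the axes identical, so any difference the user sees comes from the styling alone.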

Pattern 2: Fixing Overflow Issues

Problem: Annotations, legends, or labels cut off

Solutions:

# 1. Adjust annotation positions
y_pos = y_max * 0.92  # Within bounds

# 2. Use bbox_inches='tight' when saving
plt.savefig('figure.png', dpi=300, bbox_inches='tight')

# 3. Explicitly set limits
ax.set_ylim(min_val * 0.95, max_val * 1.05)

# 4. Move legend outside plot area
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

# 5. Reduce text size
ax.text(..., fontsize=8)  # Down from 10

Pattern 3: Multi-Panel Layout Optimization

Try both orientations:

# Version 1: Horizontal (side-by-side)
fig, axes = plt.subplots(1, 2, figsize=(16, 7))
plt.savefig('fig_horizontal.png', dpi=300, bbox_inches='tight')

# Version 2: Vertical (stacked)
fig, axes = plt.subplots(2, 1, figsize=(10, 14))
plt.savefig('fig_vertical.png', dpi=300, bbox_inches='tight')

# Present both to user, ask which is clearer

Decision criteria:

Horizontal: Better for direct comparison between panels
Vertical: Better when each panel needs more space
User context: Journal column width, presentation slides, etc.

Pattern 4: Iterative Statistical Annotation

Common issue: P-values positioned outside plot or overlapping with data

Solution:

from scipy import stats  # provides mannwhitneyu

# Calculate data range first
all_data = [data_dual, data_prialt]  # All datasets in plot
y_max = max([d.max() for d in all_data if len(d) > 0])

# Position relative to actual data, not theoretical maximum
for i, (x_pos, comparison) in enumerate(comparisons):
    stat, pval = stats.mannwhitneyu(...)

    # Safe positioning
    y_annotation = y_max * 0.92  # Below the top

    # Format text
    if pval < 0.001:
        text = 'p < 0.001***'
    elif pval < 0.01:
        text = 'p < 0.01**'
    elif pval < 0.05:
        text = 'p < 0.05*'
    else:
        text = f'p = {pval:.3f} ns'

    ax.text(x_pos, y_annotation, text, ha='center', fontsize=9)

# Set explicit limits to ensure annotations fit
ax.set_ylim(0, y_max * 1.05)
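
The label formatting in the loop above can be factored into a function so that every figure uses the same thresholds. A minimal sketch: the name `format_pvalue` is hypothetical, and the range check is an addition not present in the original code.

```python
def format_pvalue(pval):
    """Format a p-value with the significance stars used in the loop above."""
    if not 0.0 <= pval <= 1.0:
        raise ValueError(f"p-value out of range: {pval}")
    if pval < 0.001:
        return "p < 0.001***"
    if pval < 0.01:
        return "p < 0.01**"
    if pval < 0.05:
        return "p < 0.05*"
    return f"p = {pval:.3f} ns"

print(format_pvalue(0.0004))  # p < 0.001***
print(format_pvalue(0.378))   # p = 0.378 ns
```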

Refinement Workflow Example

Real case: VGP Figure 5 improvement sequence

Initial version: 4 categories, violin plots on log scale
- Issue: Violin distortion, too complex
V1 refinement: Remove violin plots, keep boxplots
- Better, but still issues
V2 refinement: Simplify to 3 categories
- Clearer interpretation
V3 refinement: Reduce point sizes (60→25), thin lines (2.5→1.5)
- Less clutter
V4 refinement: Test vertical vs horizontal layout
- Horizontal clearer for this case
V5 refinement: Fix p-value positioning (105%→92% of y_max)
- All elements now visible
Final: Smaller text in statistics box (10→8)
- Publication ready

Total iterations: 7 versions over refinement process

Result: Clear, accurate, publication-quality figure

Best Practices

1. Version Your Refinements

Keep working versions during major changes:

scripts/
  plot_figure.py          # Original
  plot_figure_v2.py       # After major change (layout)
  plot_figure_final.py    # Publication version

2. Generate Alternatives in Parallel

When testing layout options:

# Save both versions
layouts = [
    ((1, 2), (16, 7), 'horizontal'),
    ((2, 1), (10, 14), 'vertical')
]

for (nrows, ncols), figsize, name in layouts:
    fig, axes = plt.subplots(nrows, ncols, figsize=figsize)
    # ... plot data ...
    plt.savefig(f'figure_{name}.png', dpi=300, bbox_inches='tight')

3. Document Each Refinement

"""
Figure 5 - Terminal Telomere Presence

Version history:
- v1: Initial 4-category version with violin plots
- v2: Removed violin plots (distortion on log scale)
- v3: Simplified to 3 categories (terminal only)
- v4: Reduced point/line sizes for clarity
- v5: Fixed p-value positioning
- final: Publication ready

Changes from v4 → v5:
- P-value y-position: 1.05 * y_max → 0.92 * y_max
- Added explicit y-axis limits: (y_min*0.95, y_max*1.05)
- Ensures all annotations visible within plot bounds
"""

4. Get Feedback at Key Milestones

Don’t over-iterate without input:

After fixing major issues (wrong plot type): Show user
After layout changes (horizontal vs vertical): Show user
After final polish: Show user

5. Maintain Consistency Across Figure Set

If refining one figure, check if same improvements apply to others:

# Applied violin→boxplot fix to Figures 2, 7, 10, 11
# Applied size reductions consistently across all figures
# Used same color scheme throughout

Publication Standards

DPI Requirements

Screen/web: 150 DPI
Print (standard): 300 DPI
High-quality print: 600 DPI
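
The DPI value translates into pixel dimensions by simple multiplication, which helps when a journal states a minimum pixel size. A small sketch; the helper name `pixel_size` is hypothetical. Note that saving with `bbox_inches='tight'` crops or extends the canvas, so the saved file can differ from this nominal size.

```python
def pixel_size(width_in, height_in, dpi):
    """Pixel dimensions of a raster export: inches multiplied by dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)

print(pixel_size(3.5, 4, 300))  # (1050, 1200)
print(pixel_size(7, 3.5, 600))  # (4200, 2100)
```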

File Formats

Raster: PNG at 300 DPI (most journals accept)
Vector: PDF or SVG (preferred for line plots; resolution-independent and often smaller than an equivalent raster file)
Avoid: JPG (lossy compression, poor for scientific data)

Size Specifications

Check journal requirements:

Single column: Usually 3.5 inches (89 mm) wide
Double column: Usually 7 inches (178 mm) wide
Height: Typically max 9-10 inches

Plan figsize accordingly:

# Single column figure
fig, ax = plt.subplots(figsize=(3.5, 4))

# Double column figure
fig, axes = plt.subplots(1, 2, figsize=(7, 3.5))
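
Journals often state column widths in millimetres, while `figsize` takes inches. A minimal conversion sketch; the helper name `figsize_from_mm` and the default aspect ratio are assumptions for illustration.

```python
MM_PER_INCH = 25.4

def figsize_from_mm(width_mm, aspect=0.75):
    """Return (width, height) in inches for a column width given in millimetres."""
    width_in = width_mm / MM_PER_INCH
    return width_in, width_in * aspect

single = figsize_from_mm(89)               # single-column width from the list above
double = figsize_from_mm(178, aspect=0.5)  # double-column width from the list above
print(tuple(round(v, 2) for v in single))  # (3.5, 2.63)
print(tuple(round(v, 2) for v in double))  # (7.01, 3.5)
```

The resulting tuple can be passed straight to `plt.subplots(figsize=...)`.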

Color Accessibility Requirements

Many journals now require accessibility statements for figures, including:

Confirmation that color schemes are colorblind-safe
Use of validated palettes (Okabe-Ito, Paul Tol)
Alternative distinguishing features (patterns, shapes, labels)

Nature journals specifically recommend:

Okabe-Ito palette for categorical data
Avoiding red-green combinations
Testing figures with colorblindness simulators

In Methods section, document your color choices:

“All figures use the Okabe-Ito colorblind-safe palette (Okabe & Ito, 2008) to ensure accessibility for readers with color vision deficiencies.”

Reference: Okabe, M. and Ito, K. (2008) Color Universal Design (CUD): How to make figures and presentations that are friendly to colorblind people. https://jfly.uni-koeln.de/color/
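
A sketch of applying the Okabe-Ito palette to a single Matplotlib axes. The variable names and the toy lines are placeholders; the distinct markers illustrate the "alternative distinguishing features" point above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

# Okabe-Ito palette: orange, sky blue, bluish green, yellow,
# blue, vermillion, reddish purple, black
OKABE_ITO = ["#E69F00", "#56B4E9", "#009E73", "#F0E442",
             "#0072B2", "#D55E00", "#CC79A7", "#000000"]

fig, ax = plt.subplots(figsize=(3.5, 4))
ax.set_prop_cycle(color=OKABE_ITO)  # per-axes color cycle; rcParams stay untouched
markers = ["o", "s", "^"]           # shapes as a second, non-color cue
lines = [
    ax.plot([0, 1, 2], [i, i + 1, i + 2], marker=markers[i], label=f"Group {i + 1}")[0]
    for i in range(3)
]
ax.legend(fontsize=8)
used = [to_hex(line.get_color()) for line in lines]
print(used)
```

Setting the cycle on the axes rather than globally keeps the change local to one figure; a shared style applied across the whole figure set would be the alternative when consistency matters more.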

Writing Integrated Results from Multi-Study Analyses

Challenge: When you have multiple parallel analyses (e.g., same metrics across 5 different populations/clades/conditions), how to present findings coherently without overwhelming readers.

Solution: Organize by pattern type first, then by study

Structure Pattern

1. Universal Patterns Section

Present findings consistent across ALL studies first
This establishes the “baseline truth” readers can rely on
Use strong language: “consistently,” “across all,” “universal”
Provide statistical evidence from multiple studies

2. Study-Specific Patterns Section

Present deviations and unique findings by study
Explicitly contrast with universal patterns
Explain why this study differs (biological/technical context)

3. Cross-Study Comparisons Section

Tables comparing effect sizes across studies
Discussion of what drives variation
Statistical power considerations

Example Structure (from clade-specific genome analysis):

## Universal Patterns Across All Vertebrates

### Gap Density: Architecture Dominates Curation
- Finding: [Universal pattern]
- Evidence: [Stats from all 5 clades]
- Interpretation: [Why this is universal]

### Telomere Detection: Technology-Limited
- [Similar structure]

## Clade-Specific Patterns

### Mammals: Dual Curation Provides Benefits
- Finding: [Unique to this clade]
- Contrast: [How this differs from universal]
- Interpretation: [Biological context]

### Birds: No N50 Benefit from Dual Curation
- [Unique pattern and explanation]

## Cross-Clade Comparisons
- [Table of effect sizes]
- [Discussion of variation]
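
Before drafting these sections, it can help to sort findings into the two groups programmatically. The sketch below uses an assumed criterion that is not from the source: a finding counts as universal when every study is significant at `alpha` with the same effect direction. All names and numbers are placeholders. A finding that fails this test in an underpowered study may simply lack power, so the output is a starting point for the outline, not a conclusion.

```python
def split_patterns(results, alpha=0.05):
    """Split findings into universal ones and study-specific ones.

    results maps finding -> {study: (effect, pval)}.
    Returns (universal, specific), where specific pairs each finding
    with the list of studies that support it.
    """
    universal, specific = [], []
    for finding, by_study in results.items():
        supported = [study for study, (effect, pval) in by_study.items() if pval < alpha]
        directions = {by_study[study][0] > 0 for study in supported}
        if len(supported) == len(by_study) and len(directions) == 1:
            universal.append(finding)
        else:
            specific.append((finding, supported))
    return universal, specific

# Placeholder numbers for illustration only; not results from any real analysis
results = {
    "finding_A": {"study_1": (0.8, 0.001), "study_2": (0.6, 0.004), "study_3": (0.7, 0.010)},
    "finding_B": {"study_1": (0.5, 0.020), "study_2": (0.1, 0.378), "study_3": (0.0, 0.900)},
}
universal, specific = split_patterns(results)
print(universal)  # ['finding_A']
print(specific)   # [('finding_B', ['study_1'])]
```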

Benefits of This Structure

Readers get reliable findings first: Universal patterns are established before introducing complexity
Reduces cognitive load: Don’t jump between studies repeatedly
Highlights what’s generalizable: Universal section shows what works everywhere
Explains variation: Study-specific section explains why some results differ
Facilitates recommendations: Can give universal advice plus context-specific guidance

Writing Tips

For Universal Patterns:

Lead with the finding, then provide evidence from multiple studies
Use consistent statistical reporting across all supporting evidence
Emphasize the consistency: “across all,” “in every,” “universal”

For Study-Specific Patterns:

Explicitly state how this differs from universal patterns
Provide biological/technical context for why this study is unique
Don’t just report statistics – explain the mechanism

For Statistical Power:

Be explicit about which studies have sufficient power
Note limitations in smaller studies
Don’t over-interpret null results from underpowered studies

Common Pitfalls to Avoid

❌ Don’t: Report each study sequentially (Study 1 all results, Study 2 all results…)
✅ Do: Report by finding type (Finding A across all studies, Finding B across all studies…)

❌ Don’t: Hide that some patterns aren’t universal
✅ Do: Explicitly highlight when a pattern is study-specific and explain why

❌ Don’t: Give equal weight to all findings
✅ Do: Emphasize universal patterns; note study-specific as “interesting variations”

Application Beyond Clade Analysis

This pattern works for any multi-study synthesis:

Clinical trials across different populations
Experimental treatments across multiple cell lines
Algorithm performance across different datasets
Policy interventions across different regions

Key principle: Organize by what readers need to know (universal vs specific) rather than by how you conducted the studies (study-by-study).

Providing Practical Recommendations from Complex Trade-offs

Challenge: When different methods excel at different outcomes, how to give clear guidance?

Pattern: “Depends on priority” recommendations with decision tree

Structure:

### For [Population/Context]

**Recommended**: [Method A]
- [Metric 1]: [Performance with stats]
- [Metric 2]: [Performance with stats]
- Use when: [Priority/constraint]

**Alternative**: [Method B]
- [Metric 1]: [Performance with stats]
- [Metric 2]: [Performance with stats]
- Use when: [Different priority/constraint]

**Note**: [Important caveat or key difference from other contexts]

Example (from avian genome assemblies):

### For Avian Genomes
**Depends on priority**:

**For gap density minimization**: Phased assembly (dual or single curation)
- Dramatic 75-100× reduction in gaps vs Pri/alt
- Strong significance (p=1.87×10⁻¹⁰)

**For chromosome assignment**: Pri/alt + Single curation
- Best assignment (98.93% median)
- Significantly better than phased approaches (p<0.001)

**Note**: Dual curation does NOT improve scaffold N50 in birds (p=0.378),
unlike mammals. Initial assemblies are already near-optimal due to
favorable genome characteristics.

Benefits:

Acknowledges trade-offs honestly
Provides clear decision criteria
Gives actionable guidance despite complexity
Explains when different approaches are optimal

Summary

Systematic refinement workflow:

1. Identify issue → 2. Fix visualization → 3. Improve clarity → 4. Test layouts → 5. Optimize positioning

Key principles:

Iterate based on user feedback
Test alternatives (show options)
Document changes
Apply lessons across figure set
Meet publication standards

Common adjustments:

Point sizes: 60 → 25
Line widths: 2.5 → 1.5
Font sizes: 10 → 8
Annotation positions: 105% → 92% of max
Always set explicit axis limits
