folder-organization

📁 delphine-l/claude_global 📅 Jan 24, 2026

Site rank: #8314

Install command

npx skills add https://github.com/delphine-l/claude_global --skill folder-organization

Installs by agent

claude-code 14

opencode 13

gemini-cli 11

codex 11

antigravity 9

cursor 9

Skill documentation

Folder Organization Best Practices

Expert guidance for organizing project directories, establishing file naming conventions, and maintaining clean, navigable project structures for research and development work.

When to Use This Skill

Setting up new projects
Reorganizing existing projects
Establishing team conventions
Creating reproducible research structures
Managing data-intensive projects

Core Principles

Predictability – Standard locations for common file types
Scalability – Structure grows gracefully with project
Discoverability – Easy for others (and future you) to navigate
Separation of Concerns – Code, data, documentation, outputs separated
Version Control Friendly – Large/generated files excluded appropriately

Standard Project Structure

Research/Analysis Projects

project-name/
├── README.md                # Project overview and getting started
├── .gitignore               # Exclude data, outputs, env files
├── environment.yml          # Conda environment (or requirements.txt)
├── data/                    # Input data (often gitignored)
│   ├── raw/                 # Original, immutable data
│   ├── processed/           # Cleaned, transformed data
│   └── external/            # Third-party data
├── notebooks/               # Jupyter notebooks for exploration
│   ├── 01-exploration.ipynb
│   ├── 02-analysis.ipynb
│   └── figures/             # Notebook-generated figures
├── src/                     # Source code (reusable modules)
│   ├── __init__.py
│   ├── data_processing.py
│   ├── analysis.py
│   └── visualization.py
├── scripts/                 # Standalone scripts and workflows
│   ├── download_data.sh
│   └── run_pipeline.py
├── tests/                   # Unit tests
│   └── test_analysis.py
├── docs/                    # Documentation
│   ├── methods.md
│   └── references.md
├── results/                 # Analysis outputs (gitignored)
│   ├── figures/
│   ├── tables/
│   └── models/
└── config/                  # Configuration files
    └── analysis_config.yaml

Development Projects

project-name/
├── README.md
├── .gitignore
├── setup.py                 # Package configuration
├── requirements.txt         # or pyproject.toml
├── src/
│   └── package_name/
│       ├── __init__.py
│       ├── core.py
│       └── utils.py
├── tests/
│   ├── test_core.py
│   └── test_utils.py
├── docs/
│   ├── api.md
│   └── usage.md
├── examples/                # Example usage
│   └── example_workflow.py
└── .github/                 # CI/CD workflows
    └── workflows/
        └── tests.yml

Bioinformatics/Workflow Projects

project-name/
├── README.md
├── data/
│   ├── raw/                 # Raw sequencing data
│   ├── reference/           # Reference genomes, annotations
│   └── processed/           # Workflow outputs
├── workflows/               # Galaxy .ga or Snakemake files
│   ├── preprocessing.ga
│   └── assembly.ga
├── config/
│   ├── workflow_params.yaml
│   └── sample_sheet.tsv
├── scripts/                 # Helper scripts
│   ├── submit_workflow.py
│   └── quality_check.py
├── results/                 # Final outputs
│   ├── figures/
│   ├── tables/
│   └── reports/
└── logs/                    # Workflow execution logs

File Naming Conventions

General Rules

Use lowercase with hyphens or underscores
- ✅ data-analysis.py or data_analysis.py
- ❌ DataAnalysis.py or data analysis.py
Be descriptive but concise
- ✅ process-telomere-data.py
- ❌ script.py or process_all_the_telomere_sequencing_data_from_experiments.py
Use consistent separators
- Choose either hyphens or underscores and stick with it
- Convention: hyphens for file names, underscores for Python modules (a hyphenated name cannot be imported)
Include version/date for important outputs
- ✅ report-2026-01-23.pdf or model-v2.pkl
- ❌ report-final-final-v3.pdf
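The mechanical parts of these rules (case, spaces, separators, length) can be checked automatically. The sketch below is one possible reading of them, not part of the skill itself: `check_filename`, `NAME_PATTERN`, and the 40-character cutoff are hypothetical choices. It cannot judge whether a name is descriptive, so `script.py` still passes, and conventional names such as README.md or dotfiles are out of scope.

```python
import re

# Assumed reading of the rules above: lowercase letters and digits joined by
# hyphens or underscores, followed by one or more extensions (".py", ".tar.gz").
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:[-_][a-z0-9]+)*(?:\.[a-z0-9]+)+$")
MAX_STEM_LENGTH = 40  # hypothetical cutoff for "concise"; adjust per team


def check_filename(name: str) -> list[str]:
    """Return the rule violations found in a file name (empty list means OK)."""
    problems = []
    if " " in name:
        problems.append("contains spaces")
    if name != name.lower():
        problems.append("contains uppercase letters")
    # Only run the stricter pattern when the simpler checks found nothing.
    if not problems and not NAME_PATTERN.match(name):
        problems.append("unexpected characters or separators")
    stem = name.split(".", 1)[0]
    if len(stem) > MAX_STEM_LENGTH:
        problems.append(f"stem longer than {MAX_STEM_LENGTH} characters")
    return problems


for candidate in ["data-analysis.py", "DataAnalysis.py", "data analysis.py"]:
    print(candidate, check_filename(candidate))
```

A check like this could run in a pre-commit hook over scripts and notebooks, with the pattern loosened for data files that carry sample identifiers.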

Numbered Sequences

For sequential files (notebooks, scripts), use zero-padded numbers:

notebooks/
├── 01-data-exploration.ipynb
├── 02-quality-control.ipynb
├── 03-statistical-analysis.ipynb
└── 04-visualization.ipynb
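Zero padding matters because plain string sorting places "10-..." before "2-...". When a sequence needs renumbering, a rename plan can be computed before any file is touched. This is a minimal sketch; `renumber` and its `width` parameter are hypothetical names, and it only returns (old, new) pairs rather than renaming anything.

```python
import re

# Matches an existing numeric prefix such as "1-", "07_" or "10-".
PREFIX = re.compile(r"^\d+[-_]")


def renumber(names: list[str], width: int = 2) -> list[tuple[str, str]]:
    """Map each name, in the given order, to a zero-padded sequential name."""
    plan = []
    for index, name in enumerate(names, start=1):
        base = PREFIX.sub("", name)  # drop any existing numeric prefix
        plan.append((name, f"{index:0{width}d}-{base}"))
    return plan


for old, new in renumber(["1-data-exploration.ipynb", "quality-control.ipynb"]):
    print(f"{old} -> {new}")
```

Reviewing the printed plan first, then applying it with `Path.rename` or `git mv`, keeps the renumbering reversible.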

Data Files

Include metadata in filename when possible:

data/raw/
├── sample-A_hifi_reads_2026-01-15.fastq.gz
├── sample-B_hifi_reads_2026-01-15.fastq.gz
└── reference_genome_v3.fasta
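One benefit of putting metadata in the file name is that scripts can recover it without a separate lookup table. The sketch below assumes the `<sample>_<description>_<YYYY-MM-DD>.<extension>` layout seen in the sample files above. The field names and the `parse_data_filename` helper are hypothetical, and names that do not follow the layout (such as the reference genome) return `None`.

```python
import re
from datetime import date
from typing import Optional

# Assumed layout: <sample>_<description>_<YYYY-MM-DD>.<extension>
DATA_NAME = re.compile(
    r"^(?P<sample>[A-Za-z0-9-]+)_"       # e.g. "sample-A"
    r"(?P<description>[A-Za-z0-9_]+)_"   # e.g. "hifi_reads"
    r"(?P<date>\d{4}-\d{2}-\d{2})"       # ISO 8601 date
    r"\.(?P<extension>[A-Za-z0-9.]+)$"   # e.g. "fastq.gz"
)


def parse_data_filename(name: str) -> Optional[dict]:
    """Split a data file name into its metadata fields, or return None."""
    match = DATA_NAME.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    fields["date"] = date.fromisoformat(fields["date"])
    return fields


print(parse_data_filename("sample-A_hifi_reads_2026-01-15.fastq.gz"))
print(parse_data_filename("reference_genome_v3.fasta"))
```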

Directory Management Best Practices

What to Version Control

DO commit:

Source code
Documentation
Configuration files
Small test datasets (<1MB)
Requirements/environment files
README files

DON’T commit:

Large data files (use .gitignore)
Generated outputs
Environment directories (venv/, conda-env/)
Logs
Temporary files
API keys/secrets
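To see which files would break the size guideline before committing, a short scan of the working tree is enough. This sketch reads the "<1MB" guideline as 1,000,000 bytes, which is an assumption, and `find_large_files` is a hypothetical helper. It skips the .git/ directory but does not consult .gitignore.

```python
from pathlib import Path

SIZE_LIMIT_BYTES = 1_000_000  # the "<1MB" guideline, taken here as 10^6 bytes


def find_large_files(root: str, limit: int = SIZE_LIMIT_BYTES) -> list[Path]:
    """List files under root whose size is at or above the limit."""
    root_path = Path(root)
    large = []
    for path in root_path.rglob("*"):
        # Ignore Git's own object store; it is never committed as content.
        if ".git" in path.relative_to(root_path).parts:
            continue
        if path.is_file() and path.stat().st_size >= limit:
            large.append(path)
    return sorted(large)


if __name__ == "__main__":
    for path in find_large_files("."):
        print(f"{path.stat().st_size:>12,d}  {path}")
```

Anything the scan reports is a candidate for .gitignore or for external storage.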

.gitignore Template

# Python
__pycache__/
*.py[cod]
*$py.class
.venv/
venv/
*.egg-info/

# Jupyter
.ipynb_checkpoints/
*.ipynb_checkpoints

# Data
data/raw/
data/processed/
*.fastq.gz
*.bam
*.vcf.gz

# Outputs
results/
outputs/
*.png
*.pdf
*.html

# Logs
logs/
*.log

# Environment
.env
environment.local.yml

# OS
.DS_Store
Thumbs.db

Data Organization

Raw Data is Sacred

Never modify raw data – Always keep originals untouched
Store in data/raw/ and make it read-only if possible
Document data provenance (where it came from, when downloaded)
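Both the read-only and the provenance advice can be applied in one step after a download. The following is a sketch under stated assumptions: the manifest name `PROVENANCE.json`, its fields, and the `protect_raw_data` helper are hypothetical, and SHA-256 checksums are simply one reasonable way to detect later modification.

```python
import hashlib
import json
import os
import stat
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large data files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def protect_raw_data(raw_dir: str, source: str) -> Path:
    """Write a provenance manifest for raw_dir, then mark its files read-only."""
    raw_path = Path(raw_dir)
    manifest_path = raw_path / "PROVENANCE.json"  # hypothetical file name
    files = sorted(
        p for p in raw_path.rglob("*") if p.is_file() and p != manifest_path
    )
    manifest = {
        "source": source,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "files": {p.relative_to(raw_path).as_posix(): sha256_of(p) for p in files},
    }
    manifest_path.write_text(json.dumps(manifest, indent=2))
    for p in files:
        # Read permission only, for owner, group and others.
        os.chmod(p, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return manifest_path
```

A call such as `protect_raw_data("data/raw", source="<where the data came from>")` would run once per download. Re-hashing the files later and comparing against the manifest shows whether anything changed.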

Processed Data Hierarchy

data/
├── raw/                    # Original, immutable
├── interim/                # Intermediate processing steps
├── processed/              # Final, analysis-ready data
└── external/               # Third-party data

Documentation Standards

README.md Essentials

Every project should have a README with:

# Project Name

Brief description

## Installation

How to set up the environment

## Usage

How to run the analysis/code

## Project Structure

Brief overview of directories

## Data

Where data lives and how to access it

## Results

Where to find outputs
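Whether a README covers the sections in this template can be checked from its headings. This is a minimal sketch, assuming Markdown `##` headings as in the template; `missing_readme_sections` is a hypothetical helper and the comparison is case-insensitive.

```python
import re

# Section titles taken from the README template above.
REQUIRED_SECTIONS = ["Installation", "Usage", "Project Structure", "Data", "Results"]

# A Markdown heading of level 2 or deeper, captured without trailing spaces.
HEADING = re.compile(r"^#{2,}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_readme_sections(readme_text: str) -> list[str]:
    """Return the required section titles that have no matching heading."""
    headings = {match.group(1).lower() for match in HEADING.finditer(readme_text)}
    return [title for title in REQUIRED_SECTIONS if title.lower() not in headings]


example = "# Project Name\n\nBrief description\n\n## Installation\n\n## Usage\n"
print(missing_readme_sections(example))
```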

Code Documentation

Docstrings for all functions/classes
Comments for complex logic
CHANGELOG.md for tracking changes
TODO.md for tracking work (gitignored or removed before merge)

Common Anti-Patterns to Avoid

❌ Flat structure with everything in root

project/
├── script1.py
├── script2.py
├── data.csv
├── output1.png
├── output2.png
└── final_really_final_v3.xlsx

❌ Ambiguous naming

notebooks/
├── notebook1.ipynb
├── test.ipynb
├── analysis.ipynb
└── analysis_new.ipynb

❌ Mixed concerns

project/
├── src/
│   ├── analysis.py
│   ├── data.csv          # Data in source code directory
│   └── figure1.png       # Output in source code directory
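Misplaced data and output files like these can be flagged from a list of paths. The sketch below assumes a mapping from file suffix to the directories where that kind of file belongs. The mapping (`ALLOWED_DIRS`) and the `misplaced` helper are hypothetical and would need adapting to a real project. It relies on `PurePosixPath.is_relative_to`, available from Python 3.9.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: suffix -> directories where such files are expected.
# The first entry doubles as the suggested destination.
ALLOWED_DIRS = {
    ".csv": ("data",),
    ".tsv": ("data", "config"),
    ".xlsx": ("data", "results"),
    ".png": ("results", "notebooks/figures"),
}


def misplaced(paths: list[str]) -> list[tuple[str, str]]:
    """Return (path, suggested directory) for files outside their allowed dirs."""
    findings = []
    for raw in paths:
        path = PurePosixPath(raw)
        allowed = ALLOWED_DIRS.get(path.suffix.lower())
        if allowed is None:
            continue  # no rule for this kind of file
        if not any(path.is_relative_to(directory) for directory in allowed):
            findings.append((raw, allowed[0]))
    return findings


print(misplaced(["src/analysis.py", "src/data.csv", "src/figure1.png"]))
```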

Cleanup and Maintenance

Regular Maintenance Tasks

Prune old branches – Delete merged feature branches
Clean temp files – Remove TODO.md, NOTES.md from completed work
Update documentation – Keep README current with changes
Review .gitignore – Ensure large files aren’t tracked
Organize notebooks – Rename/renumber as project evolves

End-of-Project Checklist

README complete and accurate
Code documented
Tests passing
Large files gitignored
Working files removed (TODO.md, scratch notebooks)
Final outputs in results/
Environment files current
License added (if applicable)

Integration with Other Skills

This skill works well with:

python-environment – Environment setup and management
claude-collaboration – Team workflow best practices
jupyter-notebook-analysis – Notebook organization standards

Templates and Tools

Quick Project Setup

# Create standard research project structure (brace expansion needs bash or zsh)
mkdir -p data/{raw,processed,external} notebooks scripts src tests docs results config
touch README.md .gitignore environment.yml
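Where brace expansion is unavailable (plain sh, Windows shells), the same layout can be created from Python. This sketch mirrors the directories and files of the shell command; the `scaffold` helper is a hypothetical name. It is safe to re-run because existing paths are left alone.

```python
from pathlib import Path

# Same directories and files as the shell command above.
DIRECTORIES = [
    "data/raw", "data/processed", "data/external",
    "notebooks", "scripts", "src", "tests", "docs", "results", "config",
]
FILES = ["README.md", ".gitignore", "environment.yml"]


def scaffold(root: str) -> list[Path]:
    """Create the standard layout under root and return the paths it created."""
    root_path = Path(root)
    created = []
    for relative in DIRECTORIES:
        directory = root_path / relative
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    for name in FILES:
        file_path = root_path / name
        if not file_path.exists():
            file_path.touch()
            created.append(file_path)
    return created


if __name__ == "__main__":
    for path in scaffold("project-name"):
        print("created", path)
```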

Cookiecutter Templates

Consider using cookiecutter for standardized project templates:

cookiecutter-data-science – Data science projects
cookiecutter-research – Research projects
Custom team templates

References and Resources

GitHub repository: https://github.com/delphine-l/claude_global