bio-read-alignment-star-alignment
Install command
npx skills add https://github.com/gptomics/bioskills --skill bio-read-alignment-star-alignment
STAR RNA-seq Alignment
Generate Genome Index
# Basic index generation
STAR --runMode genomeGenerate \
--runThreadN 8 \
--genomeDir star_index/ \
--genomeFastaFiles reference.fa \
--sjdbGTFfile annotation.gtf \
--sjdbOverhang 100 # Read length - 1
Index with Specific Read Length
# For 150bp reads, use sjdbOverhang=149
STAR --runMode genomeGenerate \
--runThreadN 8 \
--genomeDir star_index_150/ \
--genomeFastaFiles reference.fa \
--sjdbGTFfile annotation.gtf \
--sjdbOverhang 149
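If the read length is not known in advance, it can be measured from the FASTQ file before building the index. The following is a minimal sketch, not part of STAR: `sjdb_overhang` is a hypothetical helper name, and it assumes standard four-line FASTQ records, either plain or gzip-compressed. It samples the first reads of the file and prints the longest read length minus 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a STAR command): suggest a value for --sjdbOverhang.
# Usage: sjdb_overhang reads.fq.gz [number_of_reads_to_sample]
sjdb_overhang() {
    local fastq="$1"
    local n_reads="${2:-100000}"   # reads sampled from the top of the file
    local reader="cat"
    case "$fastq" in
        *.gz) reader="gzip -dc" ;;  # decompress on the fly for .gz input
    esac
    # In four-line FASTQ, the sequence is line 2 of every record.
    $reader "$fastq" | head -n "$((n_reads * 4))" |
        awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
             END { if (max > 0) print max - 1; else exit 1 }'
}
```

The result can be passed straight to the index command, for example `--sjdbOverhang "$(sjdb_overhang reads_1.fq.gz)"`. For trimmed reads of mixed length, the longest read is used, which matches the "read length - 1" rule above.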
Basic Alignment
# Paired-end alignment
STAR --runThreadN 8 \
--genomeDir star_index/ \
--readFilesIn reads_1.fq.gz reads_2.fq.gz \
--readFilesCommand zcat \
--outFileNamePrefix sample_ \
--outSAMtype BAM SortedByCoordinate
Single-End Alignment
STAR --runThreadN 8 \
--genomeDir star_index/ \
--readFilesIn reads.fq.gz \
--readFilesCommand zcat \
--outFileNamePrefix sample_ \
--outSAMtype BAM SortedByCoordinate
Two-Pass Mode
# Two-pass mode for better novel junction detection
STAR --runThreadN 8 \
--genomeDir star_index/ \
--readFilesIn r1.fq.gz r2.fq.gz \
--readFilesCommand zcat \
--outFileNamePrefix sample_ \
--outSAMtype BAM SortedByCoordinate \
--twopassMode Basic
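The junctions from a run are written to `sample_SJ.out.tab`, and it is often useful to keep only well-supported ones. The sketch below assumes the tab-separated column layout described in the STAR manual (column 7 = uniquely mapping reads crossing the junction); `filter_junctions` is a hypothetical helper name. It filters on read support rather than on the annotation flag in column 6, because the manual notes that in two-pass mode junctions found in the first pass are reported as annotated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep junctions supported by at least N uniquely mapped reads.
# Usage: filter_junctions sample_SJ.out.tab [min_unique_reads]
filter_junctions() {
    local sj="$1"
    local min_unique="${2:-3}"   # default threshold is an arbitrary illustration
    # Column 7 of SJ.out.tab holds the count of uniquely mapping reads.
    awk -F'\t' -v n="$min_unique" '$7 + 0 >= n + 0' "$sj"
}
```

For example, `filter_junctions sample_SJ.out.tab 5 > sample_SJ.filtered.tab` keeps junctions with five or more unique reads. The right threshold depends on sequencing depth.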
Quantification Mode
# Output per-gene read counts (comparable to htseq-count with default settings; needs a GTF-annotated index)
STAR --runThreadN 8 \
--genomeDir star_index/ \
--readFilesIn r1.fq.gz r2.fq.gz \
--readFilesCommand zcat \
--outFileNamePrefix sample_ \
--outSAMtype BAM SortedByCoordinate \
--quantMode GeneCounts
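Output: sample_ReadsPerGene.out.tab (tab-separated). The first four rows (N_unmapped, N_multimapping, N_noFeature, N_ambiguous) are summary rows, not genes. Columns:
- Gene ID
- Counts for unstranded libraries
- Counts for forward-stranded libraries (read 1 on the same strand as the RNA)
- Counts for reverse-stranded libraries (read 2 on the same strand as the RNA)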
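Choosing the right count column depends on the library's strandedness, which can usually be judged by comparing column totals. The following is a rough sketch with hypothetical helper names (`strand_column_sums`, `extract_gene_counts`); it assumes the four-column layout listed above and skips the `N_*` summary rows at the top of the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: total assigned reads per count column, ignoring N_* rows.
strand_column_sums() {
    awk -F'\t' '$1 !~ /^N_/ { u += $2; f += $3; r += $4 }
                END { printf "unstranded\t%d\nforward\t%d\nreverse\t%d\n", u, f, r }' "$1"
}

# Hypothetical helper: write a two-column gene/count table from one column.
# Usage: extract_gene_counts sample_ReadsPerGene.out.tab [2|3|4]
extract_gene_counts() {
    local file="$1"
    local col="${2:-2}"   # 2 = unstranded, 3 = forward, 4 = reverse
    awk -F'\t' -v c="$col" 'BEGIN { OFS = "\t" } $1 !~ /^N_/ { print $1, $c }' "$file"
}
```

If the forward and reverse totals are similar, the library is probably unstranded and column 2 applies. If one of them is far larger than the other, that column is the likely match, for example `extract_gene_counts sample_ReadsPerGene.out.tab 4 > sample_counts.tsv`.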
ENCODE Options
# ENCODE recommended settings
STAR --runThreadN 8 \
--genomeDir star_index/ \
--readFilesIn r1.fq.gz r2.fq.gz \
--readFilesCommand zcat \
--outFileNamePrefix sample_ \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--outSAMattributes NH HI AS NM MD \
--outFilterType BySJout \
--outFilterMultimapNmax 20 \
--outFilterMismatchNmax 999 \
--outFilterMismatchNoverReadLmax 0.04 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignMatesGapMax 1000000 \
--alignSJoverhangMin 8 \
--alignSJDBoverhangMin 1
Fusion Detection
# For chimeric/fusion detection
STAR --runThreadN 8 \
--genomeDir star_index/ \
--readFilesIn r1.fq.gz r2.fq.gz \
--readFilesCommand zcat \
--outFileNamePrefix sample_ \
--outSAMtype BAM SortedByCoordinate \
--chimSegmentMin 12 \
--chimJunctionOverhangMin 8 \
--chimOutType Junctions WithinBAM SoftClip \
--chimMainSegmentMultNmax 1
Output Files
| File | Description |
|---|---|
| *Aligned.sortedByCoord.out.bam | Sorted BAM file |
| *Log.final.out | Alignment summary statistics |
| *Log.out | Detailed log |
| *SJ.out.tab | Splice junctions |
| *ReadsPerGene.out.tab | Gene counts (if --quantMode GeneCounts) |
| *Chimeric.out.junction | Fusion candidates (if chimeric) |
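The summary in `*Log.final.out` is the quickest check on a finished run. As a small sketch, the hypothetical helper `star_log_value` below pulls one statistic out of that file by its label. It assumes each statistic is written as `label | value` on its own line, which is how the STAR versions I have seen format it; adjust the separator if yours differs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value for one label in a STAR Log.final.out file.
# Usage: star_log_value sample_Log.final.out "Uniquely mapped reads %"
star_log_value() {
    local log="$1"
    local key="$2"
    awk -F'|' -v key="$key" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }          # trim the label
        k == key {
            v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)          # trim the value
            print v; found = 1; exit
        }
        END { if (!found) exit 1 }                          # non-zero if label is absent
    ' "$log"
}
```

A loop such as `for f in *Log.final.out; do printf '%s\t%s\n' "$f" "$(star_log_value "$f" "Uniquely mapped reads %")"; done` gives a per-sample overview of mapping rates.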
Memory Requirements
# Reduce memory for limited systems
# --limitBAMsortRAM is given in bytes (10000000000 = 10 GB for sorting)
STAR --genomeLoad NoSharedMemory \
--limitBAMsortRAM 10000000000 \
...
# For very large genomes, limit during index generation
# --limitGenomeGenerateRAM is given in bytes (31000000000 = 31 GB)
STAR --runMode genomeGenerate \
--limitGenomeGenerateRAM 31000000000 \
...
Shared Memory Mode
# Load genome into shared memory (for multiple samples)
STAR --genomeLoad LoadAndExit --genomeDir star_index/
# Run alignments (faster startup)
STAR --genomeLoad LoadAndKeep --genomeDir star_index/ ...
# Remove from memory when done
STAR --genomeLoad Remove --genomeDir star_index/
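The three steps above can be combined into one batch function. This is a sketch under stated assumptions, not a tested pipeline: `align_batch_shared` and the `STAR_BIN` variable are hypothetical names, and mate files are assumed to follow the `<sample>_1.fq.gz` / `<sample>_2.fq.gz` pattern used earlier. `--limitBAMsortRAM` is set explicitly because STAR typically refuses coordinate-sorted BAM output with `LoadAndKeep` when that limit is left at its default.

```shell
#!/usr/bin/env bash
# Hypothetical batch wrapper: load the genome once, align every sample, unload.
# Usage: align_batch_shared star_index/ sampleA_1.fq.gz sampleB_1.fq.gz ...
# Set STAR_BIN=echo to print the commands instead of running them (dry run).
align_batch_shared() {
    local index="$1"; shift
    local star="${STAR_BIN:-STAR}"
    local r1 r2 sample status=0

    # Load the genome into shared memory and exit.
    "$star" --genomeLoad LoadAndExit --genomeDir "$index" || return 1

    for r1 in "$@"; do
        r2="${r1%_1.fq.gz}_2.fq.gz"            # assumed mate naming convention
        sample="$(basename "$r1" _1.fq.gz)"
        "$star" --runThreadN 8 \
            --genomeLoad LoadAndKeep \
            --genomeDir "$index" \
            --readFilesIn "$r1" "$r2" \
            --readFilesCommand zcat \
            --outFileNamePrefix "${sample}_" \
            --outSAMtype BAM SortedByCoordinate \
            --limitBAMsortRAM 10000000000 || status=1
    done

    # Always release the shared memory, even if a sample failed.
    "$star" --genomeLoad Remove --genomeDir "$index"
    return "$status"
}
```

Running it first with `STAR_BIN=echo` shows the exact commands that would be executed, which is a cheap way to check the file pairing before committing memory and CPU time.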
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| --runThreadN | 1 | Number of threads |
| --sjdbOverhang | 100 | Read length - 1 |
| --outFilterMultimapNmax | 10 | Max number of loci a read may map to |
| --alignIntronMax | 0 | Max intron size (0 lets STAR derive the limit from its window settings) |
| --outFilterMismatchNmax | 10 | Max mismatches per read or read pair |
| --outSAMtype | SAM | Output format |
| --quantMode | - | GeneCounts for counting |
| --twopassMode | None | Basic for two-pass |
Related Skills
- rna-quantification/featurecounts-counting – Alternative counting
- rna-quantification/alignment-free-quant – Salmon/kallisto alternative
- differential-expression/deseq2-basics – Downstream DE analysis
- read-qc/fastp-workflow – Preprocess reads