solublempnn
Install command
npx skills add https://github.com/adaptyvbio/protein-design-skills --skill solublempnn
Skill documentation
SolubleMPNN Solubility-Optimized Design
Prerequisites
| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.8+ | 3.10 |
| CUDA | 11.0+ | 11.7+ |
| GPU VRAM | 8GB | 16GB (T4) |
| RAM | 8GB | 16GB |
How to run
First time? See Installation Guide to set up Modal and biomodals.
Option 1: Modal (recommended)
SolubleMPNN runs through the ProteinMPNN Modal wrapper with the soluble model weights:
cd biomodals
modal run modal_proteinmpnn.py \
--pdb-path backbone.pdb \
--num-seq-per-target 16 \
--sampling-temp 0.1 \
--model-name v_48_020
GPU: T4 (16GB) | Timeout: 600s default
Option 2: Local installation
git clone https://github.com/dauparas/ProteinMPNN.git
cd ProteinMPNN
# Use soluble model weights
python protein_mpnn_run.py \
--pdb_path backbone.pdb \
--out_folder output/ \
--num_seq_per_target 16 \
--sampling_temp "0.1" \
--model_name "v_48_020" \
--use_soluble_model # Load the soluble-trained weights instead of the vanilla ones
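The local invocation can also be assembled from Python, which helps when looping over many backbones. This is a minimal sketch, not part of the skill: `build_mpnn_command` is a hypothetical helper, and it assumes the upstream script's `--use_soluble_model` switch is what selects the soluble-trained weights.

```python
import shlex


def build_mpnn_command(pdb_path, out_folder, num_seq=16, temps=(0.1,),
                       model_name="v_48_020", soluble=True):
    """Return the argv list for protein_mpnn_run.py (hypothetical helper)."""
    cmd = [
        "python", "protein_mpnn_run.py",
        "--pdb_path", str(pdb_path),
        "--out_folder", str(out_folder),
        "--num_seq_per_target", str(num_seq),
        # The script takes temperatures as one space-separated string.
        "--sampling_temp", " ".join(str(t) for t in temps),
        "--model_name", model_name,
    ]
    if soluble:
        # Assumption: this flag switches to the soluble weight set.
        cmd.append("--use_soluble_model")
    return cmd


cmd = build_mpnn_command("backbone.pdb", "output/")
print(shlex.join(cmd))
# To execute from inside the ProteinMPNN checkout:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems with the temperature string.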
Key parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| --pdb_path | required | path | Input structure |
| --num_seq_per_target | 1 | 1-1000 | Sequences per structure |
| --sampling_temp | "0.1" | "0.0001-1.0" | Temperature (string!) |
| --model_name | v_48_020 | string | Soluble model variant |
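Because the temperature is passed as a string, it is easy to hand the script a malformed value. The sketch below validates a temperature string against the range in the table before launching a run. `parse_sampling_temp` is a hypothetical helper, and it assumes the string may hold several space-separated temperatures, as the upstream script accepts.

```python
def parse_sampling_temp(value, low=0.0001, high=1.0):
    """Parse a --sampling_temp string such as "0.1" or "0.1 0.2".

    Returns the temperatures as floats and raises ValueError when the
    string is empty, non-numeric, or outside [low, high].
    """
    temps = [float(token) for token in value.split()]
    if not temps:
        raise ValueError("sampling_temp is empty")
    for temp in temps:
        if not low <= temp <= high:
            raise ValueError(f"temperature {temp} outside [{low}, {high}]")
    return temps


print(parse_sampling_temp("0.1 0.2"))
```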
Model Variants
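| Model | Training noise | Use Case |
|---|---|---|
| v_48_002 | 0.02 Å | Highest native-sequence recovery on accurate backbones |
| v_48_020 | 0.20 Å | Default; general design |
| v_48_030 | 0.30 Å | More tolerant of noisy or approximate backbones |
The suffix in the model name is the backbone noise level used during training, not a solubility level. In the upstream script, the soluble-trained weights are a separate weight set selected with --use_soluble_model.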
Output format
output/
├── seqs/backbone.fa
└── backbone_pdb/backbone_0001.pdb
Sample output
Successful run
$ python protein_mpnn_run.py --pdb_path backbone.pdb --out_folder output/ --model_name v_48_020 --use_soluble_model --num_seq_per_target 8
Loading soluble model weights (v_48_020)...
Designing sequences for backbone.pdb
Generated 8 sequences in 2.1 seconds
output/seqs/backbone.fa:
>backbone_0001, score=1.31, global_score=1.24, seq_recovery=0.78
MKTAYIAKQRQISFVKSHFSRQLE...
>backbone_0002, score=1.28, global_score=1.21, seq_recovery=0.81
MKTAYIAKQRQISFVKSQFSRQLD...
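The header fields can be pulled into Python for ranking. This is a small sketch that assumes headers follow the `>name, key=value, ...` layout of the sample above; real runs may carry extra fields, which the parser keeps as text. `parse_header` and `parse_fasta` are illustrative names, and the sample sequences are kept truncated exactly as shown.

```python
SAMPLE = """\
>backbone_0001, score=1.31, global_score=1.24, seq_recovery=0.78
MKTAYIAKQRQISFVKSHFSRQLE...
>backbone_0002, score=1.28, global_score=1.21, seq_recovery=0.81
MKTAYIAKQRQISFVKSQFSRQLD...
"""


def parse_header(header):
    """Split '>name, key=value, ...' into (name, {key: value})."""
    parts = [p.strip() for p in header.lstrip(">").split(",")]
    fields = {}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        try:
            fields[key] = float(value)
        except ValueError:
            fields[key] = value  # keep non-numeric fields as text
    return parts[0], fields


def parse_fasta(text):
    """Return one dict per record with name, sequence, and header fields."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name, fields = parse_header(line)
            records.append({"name": name, "sequence": "", **fields})
        elif records:
            records[-1]["sequence"] += line
    return records


records = parse_fasta(SAMPLE)
best = min(records, key=lambda r: r["score"])  # lower score = more confident
print(best["name"], best["score"])
```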
What good output looks like:
- Score: 1.0-2.0 (lower = more confident)
- Reduced hydrophobic patches compared to standard MPNN
- Improved charge distribution
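The last two criteria can be approximated with simple sequence statistics. The sketch below is a rough proxy, not the skill's own scoring: the hydrophobic residue set, the charge model (K/R as +1, D/E as -1, histidine ignored), and the "longest hydrophobic run" measure are all assumptions chosen for illustration. The example sequence is the visible prefix of the first sample design.

```python
HYDROPHOBIC = set("AVILMFW")   # assumed residue set
POSITIVE = set("KR")
NEGATIVE = set("DE")


def hydrophobic_fraction(seq):
    """Fraction of residues in the assumed hydrophobic set."""
    return sum(res in HYDROPHOBIC for res in seq) / len(seq)


def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E."""
    return sum(res in POSITIVE for res in seq) - sum(res in NEGATIVE for res in seq)


def longest_hydrophobic_run(seq):
    """Length of the longest stretch of consecutive hydrophobic residues."""
    longest = current = 0
    for res in seq:
        current = current + 1 if res in HYDROPHOBIC else 0
        longest = max(longest, current)
    return longest


prefix = "MKTAYIAKQRQISFVKSHFSRQLE"  # visible part of backbone_0001 above
print(hydrophobic_fraction(prefix), net_charge(prefix), longest_hydrophobic_run(prefix))
```

Comparing these numbers between a SolubleMPNN design and a standard ProteinMPNN design for the same backbone gives a quick sanity check before running a dedicated solubility predictor.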
Decision tree
Should I use SolubleMPNN?
│
├─ What expression system?
│  ├─ E. coli → SolubleMPNN ✓
│  ├─ Mammalian → ProteinMPNN (PTMs matter more)
│  └─ Yeast → Either
│
├─ History of expression problems?
│  ├─ Yes, aggregation → SolubleMPNN ✓
│  ├─ Yes, low yield → SolubleMPNN ✓
│  └─ No → ProteinMPNN is fine
│
├─ What's in the binding site?
│  ├─ Small molecule / ligand → Use LigandMPNN
│  └─ Nothing / protein only → SolubleMPNN ✓
│
└─ Which soluble weights?
   ├─ Noisy or approximate backbone → Use v_48_030 (0.30 Å training noise)
   └─ Standard → Use v_48_020 (0.20 Å training noise)
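The tool-selection branches of the tree can be written as a small function. This is one possible encoding: the tree does not say which question wins when answers conflict, so the precedence here (ligand first, then expression history, then expression system) is an assumption, and `choose_designer` is a hypothetical name.

```python
def choose_designer(expression_system, ligand_in_site=False,
                    expression_problems=False):
    """Suggest a sequence-design tool from the decision tree's questions."""
    if ligand_in_site:
        # A small molecule in the binding site calls for ligand-aware design.
        return "LigandMPNN"
    if expression_problems:
        # Past aggregation or low yield both point to the soluble model.
        return "SolubleMPNN"
    system = expression_system.strip().lower().replace(" ", "")
    if system in ("e.coli", "ecoli"):
        return "SolubleMPNN"
    if system == "mammalian":
        return "ProteinMPNN"
    # Yeast (and anything unlisted) has no strong preference in the tree.
    return "SolubleMPNN or ProteinMPNN"


print(choose_designer("E. coli"))
print(choose_designer("mammalian", expression_problems=True))
```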
Typical performance
| Campaign Size | Time (T4) | Cost (Modal) | Notes |
|---|---|---|---|
| 100 backbones × 8 seq | 15-20 min | ~$2 | Standard |
| 500 backbones × 8 seq | 1-1.5h | ~$8 | Large campaign |
Expected improvement: +15-30% solubility score vs standard ProteinMPNN.
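For campaign sizes between the two rows, a rough planning figure can be interpolated from the table. The sketch below assumes time and cost scale linearly between the 100- and 500-backbone rows at 8 sequences per backbone. That is a simplification rather than a measured behavior, and `estimate_campaign` is a hypothetical helper.

```python
# Reference rows from the table: (backbones, min minutes, max minutes, USD)
SMALL = (100, 15, 20, 2.0)
LARGE = (500, 60, 90, 8.0)


def _interp(x, x0, y0, x1, y1):
    """Linear interpolation (or extrapolation) through two points."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


def estimate_campaign(n_backbones, seqs_per_backbone=8):
    """Rough T4 time and Modal cost; only meaningful near 8 seqs per backbone."""
    n0, lo0, hi0, usd0 = SMALL
    n1, lo1, hi1, usd1 = LARGE
    return {
        "total_sequences": n_backbones * seqs_per_backbone,
        "minutes_low": _interp(n_backbones, n0, lo0, n1, lo1),
        "minutes_high": _interp(n_backbones, n0, hi0, n1, hi1),
        "usd": _interp(n_backbones, n0, usd0, n1, usd1),
    }


print(estimate_campaign(300))
```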
Verify
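grep -c "^>" output/seqs/*.fa # Prints one count per file; each should equal num_seq_per_target (plus 1 if the file also holds the input sequence, which protein_mpnn_run.py writes as the first record)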
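The same check can be scripted so that only the offending files are reported. This is a minimal sketch with hypothetical helper names. The expected count is a parameter because it depends on whether the FASTA file also carries the input sequence as its first record.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count header lines (those starting with '>') in one FASTA file."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def find_incomplete_runs(seq_dir, expected_per_file):
    """Map file name -> record count for every .fa file that is off."""
    mismatches = {}
    for fasta in sorted(Path(seq_dir).glob("*.fa")):
        found = count_fasta_records(fasta)
        if found != expected_per_file:
            mismatches[fasta.name] = found
    return mismatches


# Example: 8 designs per backbone, counting designs only.
# print(find_incomplete_runs("output/seqs", expected_per_file=8))
```

An empty dictionary means every backbone produced the expected number of records.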
Troubleshooting
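- Still insoluble: Confirm the soluble weights were actually loaded (--use_soluble_model for the local script); the model-name suffix sets training noise, not a solubility bias
- Low diversity: Increase temperature to 0.2
- Poor folding: Use standard ProteinMPNN and optimize later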
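Low diversity can be quantified before deciding to raise the temperature. The sketch below computes mean pairwise identity over equal-length designs. The helper names are hypothetical and any threshold for "too similar" is left to the reader. The two example sequences are the visible prefixes of the sample designs.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of positions where two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_identity(seqs):
    """Average identity over all unordered pairs; 1.0 means no diversity."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


designs = ["MKTAYIAKQRQISFVKSHFSRQLE", "MKTAYIAKQRQISFVKSQFSRQLD"]
print(mean_pairwise_identity(designs))
```

Values close to 1.0 across a batch suggest the sampling temperature is too low for the diversity the campaign needs.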
Error interpretation
| Error | Cause | Fix |
|---|---|---|
| RuntimeError: CUDA out of memory | Long protein or large batch | Reduce batch_size |
| FileNotFoundError: v_48_020 | Missing model weights | Download soluble weights |
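The missing-weights error can be caught before a run starts. The sketch below assumes the ProteinMPNN checkout keeps soluble weights at `soluble_model_weights/<model_name>.pt`. That path layout is an assumption about the upstream repository and should be checked against your clone. `require_weights` is a hypothetical helper.

```python
from pathlib import Path


def soluble_weights_path(repo_dir, model_name="v_48_020"):
    """Assumed location of the soluble weights inside a ProteinMPNN clone."""
    return Path(repo_dir) / "soluble_model_weights" / f"{model_name}.pt"


def require_weights(repo_dir, model_name="v_48_020"):
    """Return the weights path, or raise FileNotFoundError with a clear message."""
    path = soluble_weights_path(repo_dir, model_name)
    if not path.is_file():
        raise FileNotFoundError(f"Missing soluble weights: {path}")
    return path


# Example: fail fast before submitting a long campaign.
# require_weights("ProteinMPNN", "v_48_020")
```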
Next: Structure prediction for validation → protein-qc for filtering.