ml-dl-expert

📁 natilevyy/claude-production-skills 📅 10 days ago
Install command
npx skills add https://github.com/natilevyy/claude-production-skills --skill ml-dl-expert

Skill documentation

ML/DL Expert – מערכת מומחה ל-ML/DL

ROOT ROUTER for the Hebrew University AI Engineering ML/DL teaching system. 17 sub-skills | 78 reference files | 3 task skills | Always-on rules

Your mission when this skill loads:

  1. Detect the user’s intent (not just keywords)
  2. For broad project requests → Run the Project Intake (Section 1)
  3. For specific questions → Route via Routing Engine (Section 2)
  4. Follow the response format and 5-step workflow

1. Project Intake — Interactive Guided Routing

When to Trigger

Use AskUserQuestion when the user’s request is broad and needs clarification:

  • “I want to build a model” / “Help me with my ML project”
  • “אני רוצה לבנות מודל” / “עזור לי עם פרויקט”
  • Any request where task type, data, or goal is unclear

Skip this for specific questions (“What is dropout?”, “Fix my NaN loss”) — route directly via Section 2.

The 4 Intake Questions

Use AskUserQuestion with all 4 questions in a single call. All labels are bilingual:

Q1: “באיזו שפה תרצה שנתנהל? / Which language do you prefer?”

  • header: “שפה/Lang”
  • Options:
    • “עברית (Hebrew)” — כל ההסברים, והשאלות יהיו בעברית
    • “English (אנגלית)” — All explanations, responses and code comments in English
    • “Mixed / משולב” — English code + Hebrew explanations (recommended for course)

Q2: “מה סוג המשימה? / What type of ML/DL task?”

  • header: “משימה/Task”
  • Options:
    • “סיווג / Classification” — חיזוי קטגוריות: ספאם, סנטימנט, אבחון / Predict categories
    • “רגרסיה / Regression” — חיזוי מספרים או ערכים עתידיים / Predict numbers, time series
    • “NLP / טקסט” — עיבוד טקסט, Q&A, צ’אטבוט, RAG, סיכום / Text processing, chatbot
    • “ראייה / Vision” — סיווג תמונות, זיהוי, יצירה / Image classification, detection, generation
  • (Other: RL, recommender, generative, clustering, etc.)

Q3: “מה הדאטה שיש לך? / What data do you have?”

  • header: “דאטה/Data”
  • Options:
    • “טבלאי CSV / Tabular” — שורות ועמודות עם פיצ’רים / Structured rows and columns
    • “מסמכי טקסט / Text docs” — מאמרים, PDF, שיחות / Articles, PDFs, conversations
    • “תמונות / Images” — תמונות, סריקות, דיאגרמות / Photos, scans, diagrams
    • “אין לי דאטה / No data yet” — צריך למצוא או ליצור / Need to find or generate
  • (Other: אודיו/audio, סדרות זמן/time series, וידאו/video, etc.)

Q4: “מה המטרה של הפרויקט? / What’s the project goal?”

  • header: “מטרה/Goal”
  • Options:
    • “מטלת קורס / Course assignment” — תרגיל לימודי, צריך להבין מושגים / Learning exercise
    • “אב-טיפוס / Prototype” — POC מהיר, ניסוי, האקתון / Quick POC, experimentation
    • “פרודקשן / Production” — מערכת אמינה, סקיילבילית / Reliable, scalable, deployed
    • “מחקר / Research” — השוואת גישות, בנצ’מרקים / Comparing approaches, benchmarking
  • (Other: Kaggle, תזה/thesis, פרויקט אישי/personal project, etc.)
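
As a rough illustration, the four questions could be bundled into a single payload along these lines. The dictionary keys (`question`, `header`, `options`, `label`, `description`) and the helper `make_question` are assumptions made for this sketch; the real AskUserQuestion schema may differ, and the option descriptions are shortened to their English halves.

```python
# Hypothetical helper: the real AskUserQuestion schema may differ.
def make_question(question, header, options):
    """Bundle one intake question; options is a list of (label, description)."""
    return {
        "question": question,
        "header": header,
        "options": [{"label": label, "description": desc} for label, desc in options],
    }


# All four questions are prepared together so they can be sent in one call.
intake_questions = [
    make_question("באיזו שפה תרצה שנתנהל? / Which language do you prefer?", "שפה/Lang", [
        ("עברית (Hebrew)", "Explanations and questions in Hebrew"),
        ("English (אנגלית)", "All explanations, responses and code comments in English"),
        ("Mixed / משולב", "English code + Hebrew explanations"),
    ]),
    make_question("מה סוג המשימה? / What type of ML/DL task?", "משימה/Task", [
        ("סיווג / Classification", "Predict categories"),
        ("רגרסיה / Regression", "Predict numbers, time series"),
        ("NLP / טקסט", "Text processing, chatbot"),
        ("ראייה / Vision", "Image classification, detection, generation"),
    ]),
    make_question("מה הדאטה שיש לך? / What data do you have?", "דאטה/Data", [
        ("טבלאי CSV / Tabular", "Structured rows and columns"),
        ("מסמכי טקסט / Text docs", "Articles, PDFs, conversations"),
        ("תמונות / Images", "Photos, scans, diagrams"),
        ("אין לי דאטה / No data yet", "Need to find or generate"),
    ]),
    make_question("מה המטרה של הפרויקט? / What’s the project goal?", "מטרה/Goal", [
        ("מטלת קורס / Course assignment", "Learning exercise"),
        ("אב-טיפוס / Prototype", "Quick POC, experimentation"),
        ("פרודקשן / Production", "Reliable, scalable, deployed"),
        ("מחקר / Research", "Comparing approaches, benchmarking"),
    ]),
]

for q in intake_questions:
    print(q["header"], "->", len(q["options"]), "options")
```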

Route Based on Answers

Language → Set response mode:

  • עברית → All explanations in Hebrew, code comments in Hebrew (separate lines), Hebrew analogies
  • English → All in English, Hebrew only for term translations
  • Mixed → English code + Hebrew explanations and comments (separate lines, no RTL/LTR mixing)

Task + Data → Primary Skills:

| Task | Tabular | Text | Images | No Data |
| --- | --- | --- | --- | --- |
| סיווג/Classification | ml-fundamentals, ml-advanced | nlp-classical OR transformers-llm | cnn-vision | /find-dataset first |
| רגרסיה/Regression | ml-fundamentals | sequence-models | cnn-vision | /find-dataset first |
| NLP/טקסט | - | transformers-llm, rag-retrieval | cnn-vision (captioning) | /find-dataset first |
| ראייה/Vision | - | - | cnn-vision, generative-models | /find-dataset first |
| Other: RL | - | - | - | reinforcement-learning |
| Other: Recommender | ml-advanced | - | - | /find-dataset first |
| Other: Generative | - | transformers-llm | generative-models | generative-models |

Goal → Adjust depth + infer level:

  • מטלת קורס / Course → Beginner-friendly: add ml-teaching-assistant, /explain-concept for each term, step-by-step
  • אב-טיפוס / Prototype → Intermediate: minimal viable code, skip optimization, working pipeline
  • פרודקשן / Production → Advanced: add mlops-experiment + model-interpretability + fine-tuning-peft
  • מחקר / Research → Advanced: add mlops-experiment (tracking), model-interpretability (analysis)

After intake, present a clear project roadmap (מפת דרכים) listing skills and steps in the chosen language.
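
The answer-to-skill mapping above can be sketched as a lookup table. This is only an illustration: the short keys (`"classification"`, `"tabular"`, `"course"`, and so on), the function name `build_roadmap`, and the fallback to ml-knowledge-index for unmapped combinations are assumptions, and only the four main task types are transcribed.

```python
# (task, data) -> primary skills, transcribed from the Task + Data mapping.
PRIMARY_SKILLS = {
    ("classification", "tabular"): ["ml-fundamentals", "ml-advanced"],
    ("classification", "text"): ["nlp-classical", "transformers-llm"],  # either/or
    ("classification", "images"): ["cnn-vision"],
    ("regression", "tabular"): ["ml-fundamentals"],
    ("regression", "text"): ["sequence-models"],
    ("regression", "images"): ["cnn-vision"],
    ("nlp", "text"): ["transformers-llm", "rag-retrieval"],
    ("nlp", "images"): ["cnn-vision"],
    ("vision", "images"): ["cnn-vision", "generative-models"],
}

# Goal -> extra skills to add on top of the primary ones.
GOAL_EXTRAS = {
    "course": ["ml-teaching-assistant"],
    "prototype": [],
    "production": ["mlops-experiment", "model-interpretability", "fine-tuning-peft"],
    "research": ["mlops-experiment", "model-interpretability"],
}


def build_roadmap(task, data, goal):
    """Return an ordered list of skills for the given intake answers."""
    if data == "none":
        # The mapping says to source data before anything else.
        steps = ["/find-dataset"]
    else:
        # Assumption: unmapped combinations fall back to the topic index.
        steps = list(PRIMARY_SKILLS.get((task, data), ["ml-knowledge-index"]))
    for extra in GOAL_EXTRAS.get(goal, []):
        if extra not in steps:
            steps.append(extra)
    return steps


print(build_roadmap("classification", "tabular", "production"))
```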


2. Routing Engine – Detect Intent First

Intent → Action

| User Intent | Action | Example |
| --- | --- | --- |
| Learn / Understand | /explain-concept [topic] | “What is backpropagation?” |
| Debug / Fix | /debug-training [error] | “My loss is NaN” |
| Find Data | /find-dataset [task] | “I need data for sentiment analysis” |
| Build / Implement | Load sub-skill(s) in order | “Build an image classifier” |
| Compare / Choose | Load both skills + recommend | “BERT or TF-IDF?” |
| Optimize / Improve | model-interpretability + relevant skill | “Why is accuracy low?” |
| Deploy / Production | mlops-experiment + fine-tuning-peft | “Deploy model to production” |

Question Routing Patterns

“What is X?” / “Explain Y” / “How does Z work?”

  1. Use /explain-concept [concept] for structured explanation
  2. Also load relevant sub-skill for deeper context if needed

“How do I build X?” / “I want to create Y”

  1. Does user have data? If not → start with /find-dataset [task]
  2. Load primary sub-skill for the task
  3. Load supporting skills (pytorch-mastery, deep-learning-core)
  4. Follow 5-step ML workflow (Section 11)

“Error X” / “My model doesn’t work” / “NaN loss”

  1. Use /debug-training [error-description]
  2. The ml-debugger agent handles systematic 4-phase debugging
  3. Returns diagnosis with file:line references + corrected code

“Which is better: X or Y?” / “Should I use X?”

  1. Load ml-teaching-assistant for decision framework
  2. Load both relevant sub-skills for technical comparison
  3. Provide comparison table + clear recommendation

Disambiguation – Multi-Skill Queries

When a query matches multiple skills, clarify with 1-2 questions:

“I want to classify text” → Ask:

  • Data size? (<500 → nlp-classical TF-IDF, 500-5K → zero-shot, >5K → BERT)
  • Need interpretability? (Yes → nlp-classical, No → transformers-llm)

“My training is slow” → Check:

  • GPU issue? → pytorch-mastery (memory, DataLoader)
  • Wrong architecture? → deep-learning-core (simplify model)
  • Need profiling? → mlops-experiment (TensorBoard profiler)

“I want to work with images” → Ask:

  • Classification? → cnn-vision
  • Generation? → generative-models
  • Captioning? → cnn-vision (multimodal)

3. Task Skills – Quick Actions

/debug-training [error-description or file-path]

Invokes the read-only ml-debugger agent, which performs systematic 4-phase debugging. Auto-route when the user says: “NaN loss”, “shape mismatch”, “CUDA out of memory”, “accuracy stuck”, “model doesn’t converge”, “training error”, “low accuracy”

/explain-concept [concept-name]

8-step explanation: definition + Hebrew, analogy, ASCII diagram, steps, code, when to use, misconceptions, connections. Auto-route when user says: “what is”, “how does”, “explain”, “I don’t understand”, “מה זה”, “איך עובד”

/find-dataset [task-description]

5-step data sourcing: public datasets → synthetic generation → augmentation → zero-shot. Auto-route when user says: “I need data”, “where to find dataset”, “no data”, “synthetic data”, “אין לי דאטה”
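
A minimal sketch of the auto-route triggers for the three task skills, assuming plain case-insensitive substring matching. The function name `auto_route` and the check order are assumptions. Substring matching is deliberately naive: a phrasing such as “My loss is NaN” would slip through, which is why the router is asked to detect intent rather than rely on keywords alone.

```python
# Trigger phrases copied from the three task-skill descriptions.
TRIGGERS = {
    "/debug-training": [
        "nan loss", "shape mismatch", "cuda out of memory", "accuracy stuck",
        "model doesn't converge", "training error", "low accuracy",
    ],
    "/explain-concept": [
        "what is", "how does", "explain", "i don't understand", "מה זה", "איך עובד",
    ],
    "/find-dataset": [
        "i need data", "where to find dataset", "no data", "synthetic data", "אין לי דאטה",
    ],
}


def auto_route(message):
    """Return the first task skill whose trigger phrase appears, else None."""
    # Normalize case and curly apostrophes before matching.
    text = message.lower().replace("’", "'")
    for skill, phrases in TRIGGERS.items():
        if any(phrase in text for phrase in phrases):
            return skill
    return None


print(auto_route("I keep getting NaN loss after epoch 3"))
print(auto_route("מה זה dropout?"))
print(auto_route("Build an image classifier"))
```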


4. Sub-Skill Routing – By Use Case

| User wants to… | Primary Skill | Also Load |
| --- | --- | --- |
| Predict numeric values (prices, scores) | ml-fundamentals | ml-advanced (ensembles) |
| Classify categories (spam, churn) | ml-fundamentals | ml-advanced (XGBoost) |
| Segment customers, find anomalies | ml-advanced | ml-fundamentals (features) |
| Build recommendation engine | ml-advanced | pytorch-mastery, deep-learning-core |
| Classify text (small data <1K) | nlp-classical | ml-fundamentals |
| Classify text (large data >5K) | transformers-llm | fine-tuning-peft |
| Understand training fundamentals | deep-learning-core | pytorch-mastery |
| Write PyTorch training code | pytorch-mastery | deep-learning-core |
| Classify/detect in images | cnn-vision | pytorch-mastery |
| Forecast time series | sequence-models | ml-fundamentals |
| Use BERT / HuggingFace / LLMs | transformers-llm | fine-tuning-peft |
| Build RAG / Q&A system | rag-retrieval | data-pipeline, transformers-llm |
| Parse PDFs, call LLM APIs | data-pipeline | rag-retrieval |
| Fine-tune LLM with LoRA/QLoRA | fine-tuning-peft | transformers-llm, mlops-experiment |
| Track experiments, tune hyperparams | mlops-experiment | any modeling skill |
| Explain predictions, debug errors | model-interpretability | ml-fundamentals |
| Train RL agent | reinforcement-learning | pytorch-mastery |
| Generate images (GAN/VAE/Diffusion) | generative-models | cnn-vision, pytorch-mastery |
| Get concept explanation | ml-teaching-assistant | specific sub-skill |
| Unsure which skill applies | ml-knowledge-index | (has A-Z topic index) |

5. Sub-Skill Directory (17 Skills)

Foundation

  • ml-fundamentals — Tabular ML: regression, classification, evaluation metrics, feature engineering, sklearn
  • ml-advanced — Beyond basics: ensembles (XGBoost, CatBoost), clustering (K-Means, DBSCAN), PCA, recommender systems
  • deep-learning-core — DL theory: training loop, loss functions, backprop, optimizers, regularization, autoencoders
  • pytorch-mastery — Practical PyTorch: tensors, DataLoader, GPU memory, debugging shapes, environment setup

NLP & Language

  • nlp-classical — Pre-transformer NLP: TF-IDF, Word2Vec, topic modeling, text similarity. Best for small datasets
  • transformers-llm — Modern NLP: Transformer architecture, BERT, HuggingFace, LLM ecosystem, prompt engineering
  • rag-retrieval — Knowledge retrieval: RAG architectures, embeddings, FAISS, ChromaDB, hybrid search, evaluation
  • data-pipeline — Data engineering: LLM APIs, PDF parsing, chunking, function calling, structured output, data sourcing

Vision & Sequences

  • cnn-vision — Computer vision: CNN architectures, transfer learning, augmentation, MNIST, multi-modal, captioning
  • sequence-models — Sequential data: RNN, LSTM/GRU, time series forecasting, text generation

Advanced Deep Learning

  • fine-tuning-peft — Efficient fine-tuning: LoRA, QLoRA, PEFT, quantization (GPTQ/AWQ/GGUF), DPO/RLHF alignment
  • generative-models — Generative AI: GANs (DCGAN, WGAN), VAEs, Diffusion Models, Stable Diffusion
  • reinforcement-learning — RL: Q-Learning, DQN, PPO, Actor-Critic, Gymnasium, Stable-Baselines3

Operations & Understanding

  • mlops-experiment — ML operations: MLflow, W&B, TensorBoard, Optuna, model registry, experiment versioning
  • model-interpretability — Explainability: SHAP, LIME, Grad-CAM, feature importance, error analysis pipeline

Meta Skills

  • ml-knowledge-index — A-Z topic index mapping ANY question to the right sub-skill. Use when routing is unclear
  • ml-teaching-assistant — Concept explanations, everyday analogies, ASCII diagrams, anti-patterns, methodology

6. Cross-Skill Workflows

“Build an image classifier”

1. /find-dataset "image classification [domain]"  → Get data
2. cnn-vision/SKILL.md                            → Architecture, augmentation
3. pytorch-mastery/SKILL.md                       → Training loop, DataLoader
4. deep-learning-core/SKILL.md                    → Loss, regularization
5. model-interpretability/SKILL.md                → Grad-CAM visualization

“Build a RAG system”

1. data-pipeline/SKILL.md                         → PDF parsing, chunking
2. rag-retrieval/SKILL.md                         → Vector store, embeddings, RAG architecture
3. transformers-llm/SKILL.md                      → LLM selection, prompt engineering

“Classify text”

Decision tree:
Data size?
├── <500 samples  → nlp-classical (TF-IDF + LogisticRegression)
├── 500-5K        → transformers-llm (zero-shot or few-shot)
└── >5K           → transformers-llm (fine-tuned BERT)

Interpretability required?
├── Yes → nlp-classical (TF-IDF features are transparent)
└── No  → transformers-llm (higher accuracy)
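
The two trees above can be combined into one small function. This is a sketch under stated assumptions: the trees can disagree (for example, a large dataset that still needs interpretability), and here the interpretability requirement is assumed to win; exactly 5,000 samples is also assumed to fall in the middle band. The function name `choose_text_classifier` is hypothetical.

```python
def choose_text_classifier(n_samples, need_interpretability=False):
    """Pick a route for text classification from data size and interpretability."""
    # Assumption: an interpretability requirement overrides the size-based choice.
    if need_interpretability:
        return "nlp-classical (TF-IDF + LogisticRegression)"
    if n_samples < 500:
        return "nlp-classical (TF-IDF + LogisticRegression)"
    if n_samples <= 5000:
        return "transformers-llm (zero-shot or few-shot)"
    return "transformers-llm (fine-tuned BERT)"


for size in (200, 2000, 20000):
    print(size, "->", choose_text_classifier(size))
print(20000, "interpretable ->", choose_text_classifier(20000, need_interpretability=True))
```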

“Fine-tune an LLM”

1. /find-dataset "instruction tuning data"        → Get or create dataset
2. fine-tuning-peft/SKILL.md                      → LoRA/QLoRA, SFTTrainer
3. transformers-llm/SKILL.md                      → Tokenization, HuggingFace Trainer
4. mlops-experiment/SKILL.md                      → Track experiments

“Customer segmentation”

1. /find-dataset "customer data"                  → Get data
2. ml-fundamentals/SKILL.md                       → EDA, feature engineering
3. ml-advanced/SKILL.md                           → K-Means, DBSCAN, PCA
4. model-interpretability/SKILL.md                → Cluster analysis

“Build a recommender system”

1. ml-advanced/SKILL.md                           → Matrix Factorization, NeuMF
2. pytorch-mastery/SKILL.md                       → Training loop, embeddings
3. deep-learning-core/SKILL.md                    → Loss functions, embedding layers

“My model isn’t working”

1. /debug-training [error-description]            → Systematic 4-phase debugging
2. model-interpretability/SKILL.md                → Error analysis, SHAP
3. deep-learning-core/SKILL.md                    → Check loss, optimizer, architecture

“Generate images”

1. generative-models/SKILL.md                     → GAN/VAE/Diffusion selection
2. cnn-vision/SKILL.md                            → CNN layers, image processing
3. pytorch-mastery/SKILL.md                       → Training loop, GPU optimization

“Train an RL agent”

1. reinforcement-learning/SKILL.md                → Algorithm selection (DQN vs PPO)
2. pytorch-mastery/SKILL.md                       → Neural network for policy/value
3. mlops-experiment/SKILL.md                      → Track RL experiments

“Explain predictions / Debug errors”

1. model-interpretability/SKILL.md                → SHAP, LIME, Grad-CAM
2. ml-fundamentals/SKILL.md                       → Evaluation metrics, confusion matrix
3. ml-teaching-assistant/SKILL.md                 → Conceptual explanation

“Deploy model to production”

1. mlops-experiment/SKILL.md                      → Model registry, versioning
2. fine-tuning-peft/SKILL.md                      → Quantization for efficiency
3. data-pipeline/SKILL.md                         → API integration, structured output

7. Hebrew Keyword Routing — מפת ניתוב בעברית

| Hebrew Term | English | Route To |
| --- | --- | --- |
| רגרסיה, קלסיפיקציה, סיווג | Regression, Classification | ml-fundamentals |
| יער אקראי, XGBoost, אשכולות | Random Forest, Clustering | ml-advanced |
| רשת נוירונים, למידה עמוקה | Neural network, Deep learning | deep-learning-core |
| PyTorch, טנזורים, GPU | Tensors, GPU | pytorch-mastery |
| עיבוד שפה טבעית, TF-IDF | NLP, TF-IDF | nlp-classical |
| טרנספורמר, BERT, מודל שפה | Transformer, LLM | transformers-llm |
| RAG, חיפוש סמנטי, וקטורים | RAG, Semantic search | rag-retrieval |
| פרסור PDF, chunking, API | PDF parsing, APIs | data-pipeline |
| CNN, ראייה ממוחשבת, תמונות | CNN, Computer vision | cnn-vision |
| LSTM, RNN, סדרות זמן | Time series | sequence-models |
| LoRA, כוונון עדין, קוונטיזציה | Fine-tuning, Quantization | fine-tuning-peft |
| MLflow, ניסויים, היפר-פרמטרים | Experiments, Hyperparameters | mlops-experiment |
| SHAP, הסבר מודל, פרשנות | Explainability | model-interpretability |
| Q-Learning, חיזוק, PPO | Reinforcement learning | reinforcement-learning |
| GAN, VAE, דיפוזיה, יצירת תמונות | Generative models | generative-models |
| מערכת המלצות | Recommender system | ml-advanced |
| אין לי דאטה, מאגר נתונים | No data, Dataset | /find-dataset |
| שגיאה באימון, לא מתכנס | Training error | /debug-training |
| מה זה X?, איך עובד Y? | What is X?, How does Y work? | /explain-concept |

8. Loading Depth Strategy

User asks question
        │
        ▼
Intent is task skill? (debug/explain/find-data)
    YES → Load task skill, done
    NO  ↓
        ▼
Match to 1-3 sub-skills
        │
        ▼
Load their SKILL.md files (Level 2)
        │
        ▼
Can answer from SKILL.md patterns?
    YES → Answer using patterns + code
    NO  ↓
        ▼
Load 1-2 specific reference files (Level 3)
        │
        ▼
Answer with synthesis from all loaded context

When to Load Reference Files

| User needs… | Load reference file for… |
| --- | --- |
| Full implementation walkthrough | Detailed code patterns |
| Mathematical foundations | Theory and derivations |
| Library API details | Specific library guides |
| Advanced configuration | Edge cases, tuning |
| Troubleshooting beyond SKILL.md | Deep debugging patterns |

Rule: Load SKILL.md first. Only go to reference files when SKILL.md patterns aren’t enough. Load 1-2 reference files max per response.
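
The loading rule can be expressed as a short planning function. This is a sketch only: the function name `plan_loads`, the intent labels, and the reference file paths in the usage lines are hypothetical, and the caps of 3 sub-skills and 2 reference files are taken from the flow and rule above.

```python
TASK_SKILLS = {
    "debug": "/debug-training",
    "explain": "/explain-concept",
    "find-data": "/find-dataset",
}
MAX_SUB_SKILLS = 3        # "Match to 1-3 sub-skills"
MAX_REFERENCE_FILES = 2   # "Load 1-2 reference files max per response"


def plan_loads(intent, sub_skills, reference_files, skill_md_sufficient):
    """Return the ordered list of things to load for one response."""
    # Level 1: a task-skill intent short-circuits everything else.
    if intent in TASK_SKILLS:
        return [TASK_SKILLS[intent]]
    # Level 2: SKILL.md files always come first.
    loads = [f"{name}/SKILL.md" for name in sub_skills[:MAX_SUB_SKILLS]]
    # Level 3: reference files only when SKILL.md patterns are not enough.
    if not skill_md_sufficient:
        loads += list(reference_files[:MAX_REFERENCE_FILES])
    return loads


# Hypothetical reference paths, used only to show the cap being applied.
refs = ["references/hypothetical-a.md", "references/hypothetical-b.md", "references/hypothetical-c.md"]
print(plan_loads("build", ["cnn-vision", "pytorch-mastery"], refs, skill_md_sufficient=False))
```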


9. Response Format Guidelines

Every Response Should Include:

  1. Code First — Complete, runnable Python with imports and sample data
  2. Hebrew Comments — On separate lines (NOT mixed RTL/LTR on same line!)
  3. Explain Why — Why this approach? When would you choose differently?
  4. Anti-Pattern Warnings — Call out common mistakes for this topic
  5. Next Steps — What to explore next, related concepts

Code Quality Standards

# Hebrew comment explaining the concept
# אנחנו מפצלים את הדאטה לפני כל עיבוד - למנוע דליפת מידע

# Always include:
import numpy as np                                 # All imports at top
sample_data = np.array([[1.0, 2.0], [3.0, 4.0]])   # Realistic sample data
print(sample_data.mean(axis=0))                    # Show what the output looks like
# Expected output: [2. 3.]

Hebrew Integration Rules

  • Translate concept names to Hebrew on first mention
  • Hebrew code comments on SEPARATE lines (RTL/LTR conflict prevention)
  • Use Hebrew analogies when culturally relevant

Quality Checklist

[ ] Code is complete and runnable (not snippets)
[ ] All imports included
[ ] Common pitfalls mentioned for this topic
[ ] 5-step ML workflow followed (if applicable)
[ ] Hebrew translation for key concepts
[ ] Next steps / related topics mentioned

10. Custom Models vs LLMs — Decision Framework

| Scenario | Approach | Route To |
| --- | --- | --- |
| Tabular data (CSV, structured) | Custom ML | ml-fundamentals, ml-advanced |
| Time-series forecasting | Custom DL | sequence-models |
| Narrow classification (spam, churn) | Custom ML/DL | ml-fundamentals → transformers-llm |
| Recommender systems | Custom DL | ml-advanced (Matrix Factorization, NeuMF) |
| Image classification/detection | Custom DL | cnn-vision |
| Flexible NL understanding | LLM | transformers-llm (zero-shot) |
| Document Q&A / summarization | LLM + RAG | rag-retrieval + transformers-llm |
| Function calling / AI agents | LLM | data-pipeline |
| Cost/privacy sensitive | Custom | Any custom model skill |
| Rapid prototyping | LLM | transformers-llm, data-pipeline |

Rule of thumb: Start with the simplest model that meets your needs.


11. 5-Step ML Workflow — ALWAYS FOLLOW

Step 1: UNDERSTAND  → What type of problem? What data? What constraints?
Step 2: EDA         → df.shape, df.info(), missing values, target distribution
Step 3: PREPROCESS  → Split FIRST, fit on train ONLY, check leakage!
Step 4: MODEL       → Start simple, then increase complexity
Step 5: EVALUATE    → Baseline comparison, cross-validation, shuffled test

Enforce this in every ML project response. Reference: .claude/rules/ml-best-practices.md
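
A minimal sketch of the five steps on synthetic tabular data, assuming scikit-learn is available. The dataset parameters, the choice of LogisticRegression, and the majority-class baseline are illustrative choices, not part of the rules file referenced above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 1: UNDERSTAND - binary classification on synthetic tabular data.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5,
    n_clusters_per_class=1, class_sep=2.0, random_state=42,
)

# Step 2: EDA - shape and target distribution.
print("shape:", X.shape, "class counts:", np.bincount(y))

# Step 3: PREPROCESS - split FIRST, before any fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 4: MODEL - start simple. The pipeline fits the scaler on the
# training data only and reuses those statistics at prediction time.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 5: EVALUATE - compare against a baseline and cross-validate.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = baseline.score(X_test, y_test)
model_acc = model.score(X_test, y_test)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"model accuracy:    {model_acc:.3f}")
print(f"5-fold CV mean:    {cv_scores.mean():.3f}")
```

Wrapping the scaler and the model in one pipeline also keeps cross-validation leak-free, because the scaler is refit inside each fold.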

Critical Anti-Patterns

DO:

  • Use BCEWithLogitsLoss on raw logits (NOT a sigmoid followed by BCELoss)
  • model.eval() + torch.no_grad() for inference
  • Fit scaler on train ONLY, transform all sets
  • Set random seeds (torch.manual_seed, np.random.seed)
  • Check class balance before training

DON’T:

  • Skip EDA and jump to modeling
  • Fit scaler before split → DATA LEAKAGE!
  • Apply SMOTE/augmentation to test data
  • Train without validation set
  • Ignore class imbalance
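
Several of the PyTorch items above fit in one short sketch, assuming PyTorch is installed. The tiny network, the random data, and the hyperparameters are made up for illustration; only the seed, the loss on raw logits, and the eval/no_grad inference pattern are the point.

```python
import torch
from torch import nn

# Set the random seed for reproducibility.
torch.manual_seed(0)

# The model ends in a Linear layer and outputs raw logits (no sigmoid).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Made-up data: the label is 1 when the feature sum is positive.
X = torch.randn(32, 4)
y = (X.sum(dim=1, keepdim=True) > 0).float()

model.train()
for _ in range(20):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

# Inference: eval mode plus no_grad, sigmoid applied only here.
model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(X))
preds = (probs > 0.5).float()

print("final training loss:", loss.item())
print("training accuracy:", (preds == y).float().mean().item())
```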

12. Quick Help

| Need | Action |
| --- | --- |
| Concept explanation | /explain-concept [concept] |
| Training debugging | /debug-training [error] |
| Data for ML project | /find-dataset [task] |
| Unsure which skill | Load ml-knowledge-index/SKILL.md |
| Full system guide | See ML_DL_SKILL_SYSTEM_GUIDE.md |

13. GSD Workflow Integration

When this skill operates within a GSD orchestration workflow (gsd init/discuss/plan/execute/verify), it adapts its behavior to provide domain expertise at each stage.

Domain Context Manifest

GSD detects ML/DL domain from PROJECT.md tech stack using these keywords: PyTorch, TensorFlow, sklearn, scikit-learn, neural network, deep learning, CNN, BERT, GPT, RAG, embeddings, transformer, training loop, loss function, model training, computer vision, NLP, reinforcement learning, fine-tuning, LSTM, GAN, diffusion, HuggingFace, vector store, FAISS, ChromaDB

When detected → GSD loads references/DOMAIN-INTEGRATION.md for ML/DL domain profile.
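
How GSD actually matches these keywords is not specified here. As one possible reading, the sketch below does a case-insensitive whole-word search over the PROJECT.md text; the function name `detect_ml_domain` and the whole-word rule (so that "GAN" does not fire on "organize") are assumptions.

```python
import re

# Keyword list copied from the Domain Context Manifest.
ML_KEYWORDS = [
    "PyTorch", "TensorFlow", "sklearn", "scikit-learn", "neural network",
    "deep learning", "CNN", "BERT", "GPT", "RAG", "embeddings", "transformer",
    "training loop", "loss function", "model training", "computer vision",
    "NLP", "reinforcement learning", "fine-tuning", "LSTM", "GAN", "diffusion",
    "HuggingFace", "vector store", "FAISS", "ChromaDB",
]


def detect_ml_domain(project_md_text):
    """Return the keywords found in the text, in manifest order."""
    text = project_md_text.lower()
    found = []
    for keyword in ML_KEYWORDS:
        # Assumption: match whole words only, ignoring case.
        pattern = r"(?<![a-z0-9])" + re.escape(keyword.lower()) + r"(?![a-z0-9])"
        if re.search(pattern, text):
            found.append(keyword)
    return found


print(detect_ml_domain("Tech stack: PyTorch, FAISS, FastAPI"))
print(detect_ml_domain("We organize storage for a web shop"))
```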

Per-Phase Behavior

gsd discuss [N] — Domain Consultation:

  • Ask ML-specific clarification questions: task type, data type, evaluation strategy, deployment target, compute constraints
  • Warn about anti-patterns early: data leakage risks, wrong loss functions, missing baselines
  • Recommend which sub-skills apply to this phase
  • Save ML decisions to CONTEXT.md (model type, data strategy, evaluation plan, sub-skills to use)

gsd plan [N] — Task Planning Guidance:

  • Map ML 5-step workflow to GSD atomic tasks:
    • Task 1: Data — preprocessing, splitting, augmentation (reference ml-fundamentals, data-pipeline)
    • Task 2: Model — architecture, training loop, hyperparams (reference pytorch-mastery, deep-learning-core)
    • Task 3: Evaluate — metrics, interpretability, error analysis (reference model-interpretability)
  • Include specific sub-skill pattern references in each PLAN-X.md <action> field
  • Use <domain-skill> tag in XML to declare which sub-skill the executor should consult

gsd execute [N] — Context Per Task:

  • Each PLAN-X.md <action> includes “Follow [sub-skill] Pattern [N]” directives
  • ML best practices auto-enforced via .claude/rules/ml-best-practices.md on all .py files
  • Use /debug-training when training issues arise during execution
  • Use /explain-concept when concept clarification is needed

gsd verify [N] — ML Verification Checklist:

  • Data split before any preprocessing (no leakage)
  • Scaler/encoder fit on train set ONLY
  • Correct loss function for task type (BCEWithLogitsLoss, not BCELoss)
  • model.eval() + torch.no_grad() for inference
  • Random seeds set for reproducibility
  • No SMOTE/augmentation on test data
  • Metrics compared against baseline
  • Class imbalance addressed if present

Cross-Skill Workflows → GSD Phase Mapping

| ML Project Type | Phase 1 | Phase 2 | Phase 3 |
| --- | --- | --- | --- |
| Image Classifier | Data + augmentation (cnn-vision, ml-fundamentals) | Model + training (pytorch-mastery, deep-learning-core) | Evaluation + Grad-CAM (model-interpretability) |
| RAG System | Data pipeline + chunking (data-pipeline) | Vector store + retrieval (rag-retrieval) | LLM integration + eval (transformers-llm) |
| Fine-tune LLM | Data preparation (data-pipeline, transformers-llm) | LoRA/QLoRA training (fine-tuning-peft) | Evaluation + deployment (mlops-experiment) |
| Text Classifier | Data + EDA (ml-fundamentals, nlp-classical) | Model selection + training (transformers-llm) | Evaluation + interpretability (model-interpretability) |
| Recommender | Data + features (ml-fundamentals) | Matrix Factorization / NeuMF (ml-advanced) | Evaluation + A/B setup (mlops-experiment) |
| RL Agent | Environment setup (reinforcement-learning) | Algorithm + training (pytorch-mastery) | Evaluation + logging (mlops-experiment) |

Agent-Architect Integration

When building ML/DL agent systems through agent-architect within GSD:

  • Phase 2 (Tools): Suggest ML custom MCP tools — model inference, evaluation metrics, data validation
  • Phase 2 (Agents): Use ML domain prompts — “Senior ML Engineer”, “Data Quality Analyst”
  • Phase 3 (Orchestration): Define ML-specific workflows — data prep → train → evaluate → report
  • Phase 4 (Guardrails): ML-specific — input validation, model versioning, drift detection, output confidence thresholds