ml-dl-expert
npx skills add https://github.com/natilevyy/claude-production-skills --skill ml-dl-expert
ML/DL Expert – מערכות למידה ב-ML/DL (ML/DL learning systems)
ROOT ROUTER for the Hebrew University AI Engineering ML/DL teaching system. 17 sub-skills | 78 reference files | 3 task skills | Always-on rules
Your mission when this skill loads:
- Detect the user’s intent (not just keywords)
- For broad project requests → Run the Project Intake (Section 1)
- For specific questions → Route via the Routing Engine (Section 2)
- Follow the response format and 5-step workflow
1. Project Intake – Interactive Guided Routing
When to Trigger
Use AskUserQuestion when the user’s request is broad and needs clarification:
- “I want to build a model” / “Help me with my ML project”
- "אני רוצה לבנות מודל" / "עזור לי עם פרויקט" (the same requests in Hebrew)
- Any request where task type, data, or goal is unclear
Skip this for specific questions ("What is dropout?", "Fix my NaN loss") → route directly via Section 2.
The 4 Intake Questions
Use AskUserQuestion with all 4 questions in a single call. All labels are bilingual:
Q1: "Which language do you prefer? / באיזו שפה?"
- header: "שפה/Lang"
- Options:
- "עברית (Hebrew)" → All explanations, responses, and code comments in Hebrew
- "English (אנגלית)" → All explanations, responses, and code comments in English
- "Mixed / משולב" → English code + Hebrew explanations (recommended for the course)
Q2: "What type of ML/DL task? / מה סוג המשימה?"
- header: "משימה/Task"
- Options:
- "סיווג / Classification" → Predict categories (spam, sentiment, churn)
- "רגרסיה / Regression" → Predict numbers or future values (time series)
- "NLP / טקסט" → Text processing, Q&A, chatbots, RAG, classification
- "ראייה / Vision" → Image classification, detection, generation
- (Other: RL, recommender, generative, clustering, etc.)
Q3: "What data do you have? / איזה דאטה יש לך?"
- header: "דאטה/Data"
- Options:
- "טבלת CSV / Tabular" → Structured rows and columns with features
- "מסמכי טקסט / Text docs" → Articles, PDFs, conversations
- "תמונות / Images" → Photos, scans, diagrams
- "אין לי דאטה / No data yet" → Need to find or generate data
- (Other: אודיו/audio, סדרות זמן/time series, וידאו/video, etc.)
Q4: "What's the project goal? / מה מטרת הפרויקט?"
- header: "מטרה/Goal"
- Options:
- "מטלת קורס / Course assignment" → Learning exercise; needs step-by-step explanations
- "אב-טיפוס / Prototype" → Quick POC, experimentation, mockups
- "פרודקשן / Production" → Reliable, scalable, deployed system
- "מחקר / Research" → Comparing approaches, benchmarking
- (Other: Kaggle, תזה/thesis, פרויקט אישי/personal project, etc.)
Route Based on Answers
Language → set response mode:
- עברית → All explanations in Hebrew, code comments in Hebrew (on separate lines), Hebrew analogies
- English → All in English, Hebrew only for term translations
- Mixed → English code + Hebrew explanations and comments (separate lines, no RTL/LTR mixing)
Task + Data → primary skills:
| Task | Tabular | Text | Images | No Data |
|---|---|---|---|---|
| סיווג/Classification | ml-fundamentals, ml-advanced | nlp-classical OR transformers-llm | cnn-vision | /find-dataset first |
| רגרסיה/Regression | ml-fundamentals | sequence-models | cnn-vision | /find-dataset first |
| NLP/טקסט | – | transformers-llm, rag-retrieval | cnn-vision (captioning) | /find-dataset first |
| ראייה/Vision | – | – | cnn-vision, generative-models | /find-dataset first |
| Other: RL | – | – | – | reinforcement-learning |
| Other: Recommender | ml-advanced | – | – | /find-dataset first |
| Other: Generative | – | transformers-llm | generative-models | generative-models |
Goal → adjust depth + infer level:
- מטלת קורס / Course → Beginner-friendly: add ml-teaching-assistant, /explain-concept for each term, step-by-step
- אב-טיפוס / Prototype → Intermediate: minimal viable code, skip optimization, working pipeline
- פרודקשן / Production → Advanced: add mlops-experiment + model-interpretability + fine-tuning-peft
- מחקר / Research → Advanced: add mlops-experiment (tracking), model-interpretability (analysis)
After intake, present a clear project roadmap (מפת דרכים) listing skills and steps in the chosen language.
2. Routing Engine – Detect Intent First
Intent → Action
| User Intent | Action | Example |
|---|---|---|
| Learn / Understand | /explain-concept [topic] | "What is backpropagation?" |
| Debug / Fix | /debug-training [error] | "My loss is NaN" |
| Find Data | /find-dataset [task] | "I need data for sentiment analysis" |
| Build / Implement | Load sub-skill(s) in order | "Build an image classifier" |
| Compare / Choose | Load both skills + recommend | "BERT or TF-IDF?" |
| Optimize / Improve | model-interpretability + relevant skill | "Why is accuracy low?" |
| Deploy / Production | mlops-experiment + fine-tuning-peft | "Deploy model to production" |
Question Routing Patterns
“What is X?” / “Explain Y” / “How does Z work?”
- Use /explain-concept [concept] for a structured explanation
- Also load the relevant sub-skill for deeper context if needed
“How do I build X?” / “I want to create Y”
- Does the user have data? If not → start with /find-dataset [task]
- Load the primary sub-skill for the task
- Load supporting skills (pytorch-mastery, deep-learning-core)
- Follow 5-step ML workflow (Section 11)
“Error X” / “My model doesn’t work” / “NaN loss”
- Use /debug-training [error-description]
- The ml-debugger agent handles systematic 4-phase debugging
- Returns diagnosis with file:line references + corrected code
“Which is better: X or Y?” / “Should I use X?”
- Load ml-teaching-assistant for decision framework
- Load both relevant sub-skills for technical comparison
- Provide comparison table + clear recommendation
Disambiguation – Multi-Skill Queries
When a query matches multiple skills, clarify with 1-2 questions:
"I want to classify text" → Ask:
- Data size? (<500 → nlp-classical TF-IDF, 500-5K → zero-shot, >5K → BERT)
- Need interpretability? (Yes → nlp-classical, No → transformers-llm)
"My training is slow" → Check:
- GPU issue? → pytorch-mastery (memory, DataLoader)
- Wrong architecture? → deep-learning-core (simplify model)
- Need profiling? → mlops-experiment (TensorBoard profiler)
"I want to work with images" → Ask:
- Classification? → cnn-vision
- Generation? → generative-models
- Captioning? → cnn-vision (multimodal)
3. Task Skills – Quick Actions
/debug-training [error-description or file-path]
Invokes read-only ml-debugger agent with systematic 4-phase debugging. Auto-route when user says: “NaN loss”, “shape mismatch”, “CUDA out of memory”, “accuracy stuck”, “model doesn’t converge”, “training error”, “low accuracy”
/explain-concept [concept-name]
8-step explanation: definition + Hebrew translation, analogy, ASCII diagram, steps, code, when to use, misconceptions, connections. Auto-route when the user says: "what is", "how does", "explain", "I don't understand", "מה זה", "איך עובד"
/find-dataset [task-description]
5-step data sourcing: public datasets → synthetic generation → augmentation → zero-shot. Auto-route when the user says: "I need data", "where to find dataset", "no data", "synthetic data", "אין לי דאטה"
4. Sub-Skill Routing – By Use Case
| User wants to… | Primary Skill | Also Load |
|---|---|---|
| Predict numeric values (prices, scores) | ml-fundamentals | ml-advanced (ensembles) |
| Classify categories (spam, churn) | ml-fundamentals | ml-advanced (XGBoost) |
| Segment customers, find anomalies | ml-advanced | ml-fundamentals (features) |
| Build recommendation engine | ml-advanced | pytorch-mastery, deep-learning-core |
| Classify text (small data <1K) | nlp-classical | ml-fundamentals |
| Classify text (large data >5K) | transformers-llm | fine-tuning-peft |
| Understand training fundamentals | deep-learning-core | pytorch-mastery |
| Write PyTorch training code | pytorch-mastery | deep-learning-core |
| Classify/detect in images | cnn-vision | pytorch-mastery |
| Forecast time series | sequence-models | ml-fundamentals |
| Use BERT / HuggingFace / LLMs | transformers-llm | fine-tuning-peft |
| Build RAG / Q&A system | rag-retrieval | data-pipeline, transformers-llm |
| Parse PDFs, call LLM APIs | data-pipeline | rag-retrieval |
| Fine-tune LLM with LoRA/QLoRA | fine-tuning-peft | transformers-llm, mlops-experiment |
| Track experiments, tune hyperparams | mlops-experiment | any modeling skill |
| Explain predictions, debug errors | model-interpretability | ml-fundamentals |
| Train RL agent | reinforcement-learning | pytorch-mastery |
| Generate images (GAN/VAE/Diffusion) | generative-models | cnn-vision, pytorch-mastery |
| Get concept explanation | ml-teaching-assistant | specific sub-skill |
| Unsure which skill applies | ml-knowledge-index | (has A-Z topic index) |
5. Sub-Skill Directory (17 Skills)
Foundation
- ml-fundamentals → Tabular ML: regression, classification, evaluation metrics, feature engineering, sklearn
- ml-advanced → Beyond basics: ensembles (XGBoost, CatBoost), clustering (K-Means, DBSCAN), PCA, recommender systems
- deep-learning-core → DL theory: training loop, loss functions, backprop, optimizers, regularization, autoencoders
- pytorch-mastery → Practical PyTorch: tensors, DataLoader, GPU memory, debugging shapes, environment setup
NLP & Language
- nlp-classical → Pre-transformer NLP: TF-IDF, Word2Vec, topic modeling, text similarity. Best for small datasets
- transformers-llm → Modern NLP: Transformer architecture, BERT, HuggingFace, LLM ecosystem, prompt engineering
- rag-retrieval → Knowledge retrieval: RAG architectures, embeddings, FAISS, ChromaDB, hybrid search, evaluation
- data-pipeline → Data engineering: LLM APIs, PDF parsing, chunking, function calling, structured output, data sourcing
Vision & Sequences
- cnn-vision → Computer vision: CNN architectures, transfer learning, augmentation, MNIST, multi-modal, captioning
- sequence-models → Sequential data: RNN, LSTM/GRU, time series forecasting, text generation
Advanced Deep Learning
- fine-tuning-peft → Efficient fine-tuning: LoRA, QLoRA, PEFT, quantization (GPTQ/AWQ/GGUF), DPO/RLHF alignment
- generative-models → Generative AI: GANs (DCGAN, WGAN), VAEs, Diffusion Models, Stable Diffusion
- reinforcement-learning → RL: Q-Learning, DQN, PPO, Actor-Critic, Gymnasium, Stable-Baselines3
Operations & Understanding
- mlops-experiment → ML operations: MLflow, W&B, TensorBoard, Optuna, model registry, experiment versioning
- model-interpretability → Explainability: SHAP, LIME, Grad-CAM, feature importance, error analysis pipeline
Meta Skills
- ml-knowledge-index → A-Z topic index mapping ANY question to the right sub-skill. Use when routing is unclear
- ml-teaching-assistant → Concept explanations, everyday analogies, ASCII diagrams, anti-patterns, methodology
6. Cross-Skill Workflows
“Build an image classifier”
1. /find-dataset "image classification [domain]" → Get data
2. cnn-vision/SKILL.md → Architecture, augmentation
3. pytorch-mastery/SKILL.md → Training loop, DataLoader
4. deep-learning-core/SKILL.md → Loss, regularization
5. model-interpretability/SKILL.md → Grad-CAM visualization
“Build a RAG system”
1. data-pipeline/SKILL.md → PDF parsing, chunking
2. rag-retrieval/SKILL.md → Vector store, embeddings, RAG architecture
3. transformers-llm/SKILL.md → LLM selection, prompt engineering
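The retrieval half of this workflow can be sketched in a few lines. This is a hedged, dependency-light illustration: TF-IDF vectors stand in for neural embeddings, and a brute-force cosine search stands in for a vector store such as FAISS or ChromaDB; the chunk texts are invented toy data.

```python
# Minimal RAG retrieval sketch: chunk -> vectorize -> rank by similarity.
# TF-IDF is a stand-in for sentence embeddings; swap in an embedding model
# and a vector store (FAISS/ChromaDB) for a real system.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

chunks = [
    "The training loop updates weights via backpropagation.",
    "RAG retrieves relevant chunks before calling the LLM.",
    "Dropout randomly zeroes activations to reduce overfitting.",
]

vectorizer = TfidfVectorizer()
chunk_vectors = vectorizer.fit_transform(chunks)

def retrieve(query: str, k: int = 2) -> list[str]:
    # Project the query into the same space and rank chunks by cosine similarity
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, chunk_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [chunks[i] for i in top]

context = retrieve("How does retrieval work in RAG?")
print(context[0])
```

The retrieved chunks would then be pasted into the LLM prompt as context (the transformers-llm step above).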
“Classify text”
Decision tree:
```
Data size?
├── <500 samples → nlp-classical (TF-IDF + LogisticRegression)
├── 500-5K → transformers-llm (zero-shot or few-shot)
└── >5K → transformers-llm (fine-tuned BERT)
Interpretability required?
├── Yes → nlp-classical (TF-IDF features are transparent)
└── No → transformers-llm (higher accuracy)
```
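The small-data branch of the tree can be sketched as follows. The texts and labels below are toy stand-ins; with real data the same Pipeline shape applies.

```python
# nlp-classical baseline for small datasets: TF-IDF features + LogisticRegression.
# A Pipeline keeps vectorizer fitting inside the training step (no leakage).
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great product, loved it", "terrible, waste of money",
         "really happy with this", "awful quality, broke fast"] * 10
labels = [1, 0, 1, 0] * 10  # 1 = positive, 0 = negative

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("model", LogisticRegression(max_iter=1000)),
])
clf.fit(texts, labels)
print(clf.predict(["really great, loved it"]))
```

The coefficients of the fitted LogisticRegression map directly onto TF-IDF terms, which is why this branch wins when interpretability is required.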
“Fine-tune an LLM”
1. /find-dataset "instruction tuning data" → Get or create dataset
2. fine-tuning-peft/SKILL.md → LoRA/QLoRA, SFTTrainer
3. transformers-llm/SKILL.md → Tokenization, HuggingFace Trainer
4. mlops-experiment/SKILL.md → Track experiments
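Step 2 typically centers on a LoRA configuration. The fragment below is an illustrative sketch with the `peft` library, not a tuned recipe: the rank, alpha, and `target_modules` values are common defaults and depend on the base model.

```python
# Illustrative LoRA configuration sketch (peft library). Hyperparameters and
# target_modules are assumptions for demonstration, not recommendations.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # rank of the low-rank update matrices
    lora_alpha=16,       # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
)
# Then: model = get_peft_model(base_model, lora_config)
# and train with TRL's SFTTrainer or the HuggingFace Trainer.
```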
“Customer segmentation”
1. /find-dataset "customer data" → Get data
2. ml-fundamentals/SKILL.md → EDA, feature engineering
3. ml-advanced/SKILL.md → K-Means, DBSCAN, PCA
4. model-interpretability/SKILL.md → Cluster analysis
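Steps 2-3 in miniature: scale, cluster with K-Means, and project with PCA for visualization. The two synthetic "customer" segments below are invented data so the example is self-contained.

```python
# Segmentation sketch: StandardScaler -> KMeans -> PCA (for 2-D plotting).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# Two synthetic segments: low-spend vs high-spend customers ([age, spend])
customers = np.vstack([
    rng.normal([20, 100], 5, size=(50, 2)),
    rng.normal([50, 500], 5, size=(50, 2)),
])

X = StandardScaler().fit_transform(customers)   # scale BEFORE distance-based clustering
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
coords = PCA(n_components=2).fit_transform(X)   # 2-D projection for cluster plots

print(np.bincount(labels))  # cluster sizes
```

Note that scaling matters here: K-Means uses Euclidean distance, so an unscaled "spend" column would dominate "age".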
“Build a recommender system”
1. ml-advanced/SKILL.md → Matrix Factorization, NeuMF
2. pytorch-mastery/SKILL.md → Training loop, embeddings
3. deep-learning-core/SKILL.md → Loss functions, embedding layers
“My model isn’t working”
1. /debug-training [error-description] → Systematic 4-phase debugging
2. model-interpretability/SKILL.md → Error analysis, SHAP
3. deep-learning-core/SKILL.md → Check loss, optimizer, architecture
“Generate images”
1. generative-models/SKILL.md → GAN/VAE/Diffusion selection
2. cnn-vision/SKILL.md → CNN layers, image processing
3. pytorch-mastery/SKILL.md → Training loop, GPU optimization
“Train an RL agent”
1. reinforcement-learning/SKILL.md → Algorithm selection (DQN vs PPO)
2. pytorch-mastery/SKILL.md → Neural network for policy/value
3. mlops-experiment/SKILL.md → Track RL experiments
“Explain predictions / Debug errors”
1. model-interpretability/SKILL.md → SHAP, LIME, Grad-CAM
2. ml-fundamentals/SKILL.md → Evaluation metrics, confusion matrix
3. ml-teaching-assistant/SKILL.md → Conceptual explanation
“Deploy model to production”
1. mlops-experiment/SKILL.md → Model registry, versioning
2. fine-tuning-peft/SKILL.md → Quantization for efficiency
3. data-pipeline/SKILL.md → API integration, structured output
7. Hebrew Keyword Routing – מפת ניתוב בעברית
| Hebrew Term | English | Route To |
|---|---|---|
| רגרסיה, קלסיפיקציה, סיווג | Regression, Classification | ml-fundamentals |
| יער אקראי, XGBoost, אשכולות | Random Forest, Clustering | ml-advanced |
| רשת נוירונים, למידה עמוקה | Neural network, Deep learning | deep-learning-core |
| PyTorch, טנזורים, GPU | Tensors, GPU | pytorch-mastery |
| עיבוד שפה טבעית, TF-IDF | NLP, TF-IDF | nlp-classical |
| טרנספורמר, BERT, מודל שפה | Transformer, LLM | transformers-llm |
| RAG, חיפוש סמנטי, וקטורים | RAG, Semantic search | rag-retrieval |
| פרסור PDF, chunking, API | PDF parsing, APIs | data-pipeline |
| CNN, ראייה ממוחשבת, תמונות | CNN, Computer vision | cnn-vision |
| LSTM, RNN, סדרות זמן | Time series | sequence-models |
| LoRA, כוונון עדין, קוונטיזציה | Fine-tuning, Quantization | fine-tuning-peft |
| MLflow, ניסויים, היפר-פרמטרים | Experiments, Hyperparameters | mlops-experiment |
| SHAP, הסבר מודל, פרשנות | Explainability | model-interpretability |
| Q-Learning, חיזוק, PPO | Reinforcement learning | reinforcement-learning |
| GAN, VAE, דיפוזיה, יצירת תמונות | Generative models | generative-models |
| מערכת המלצות | Recommender system | ml-advanced |
| אין לי דאטה, מאגר נתונים | No data, Dataset | /find-dataset |
| שגיאה באימון, לא מתכנס | Training error | /debug-training |
| מה זה X?, איך עובד Y? | What is X?, How does Y work? | /explain-concept |
8. Loading Depth Strategy
```
User asks question
        │
        ▼
Intent is a task skill? (debug / explain / find-data)
  YES → Load task skill, done
  NO │
        ▼
Match to 1-3 sub-skills
        │
        ▼
Load their SKILL.md files (Level 2)
        │
        ▼
Can answer from SKILL.md patterns?
  YES → Answer using patterns + code
  NO │
        ▼
Load 1-2 specific reference files (Level 3)
        │
        ▼
Answer with synthesis from all loaded context
```
When to Load Reference Files
| User needs… | Load reference file for… |
|---|---|
| Full implementation walkthrough | Detailed code patterns |
| Mathematical foundations | Theory and derivations |
| Library API details | Specific library guides |
| Advanced configuration | Edge cases, tuning |
| Troubleshooting beyond SKILL.md | Deep debugging patterns |
Rule: Load SKILL.md first. Only go to reference files when SKILL.md patterns aren’t enough. Load 1-2 reference files max per response.
9. Response Format Guidelines
Every Response Should Include:
- Code First → Complete, runnable Python with imports and sample data
- Hebrew Comments → On separate lines (NOT mixed RTL/LTR on the same line!)
- Explain Why → Why this approach? When would you choose differently?
- Anti-Pattern Warnings → Call out common mistakes for the topic
- Next Steps → What to explore next, related concepts
Code Quality Standards
```python
# Hebrew comment explaining the concept, on its own line:
# כאן מפצלים את הדאטה לפני כל עיבוד - מניעת דליפת מידע
# (Here we split the data before any processing - to prevent data leakage)

# Always include:
import ...               # All imports at the top
sample_data = ...        # Realistic sample data
expected_output = "..."  # Show what the output looks like
```
Hebrew Integration Rules
- Translate concept names to Hebrew on first mention
- Hebrew code comments on SEPARATE lines (RTL/LTR conflict prevention)
- Use Hebrew analogies when culturally relevant
Quality Checklist
[ ] Code is complete and runnable (not snippets)
[ ] All imports included
[ ] Common pitfalls mentioned for this topic
[ ] 5-step ML workflow followed (if applicable)
[ ] Hebrew translation for key concepts
[ ] Next steps / related topics mentioned
10. Custom Models vs LLMs – Decision Framework
| Scenario | Approach | Route To |
|---|---|---|
| Tabular data (CSV, structured) | Custom ML | ml-fundamentals, ml-advanced |
| Time-series forecasting | Custom DL | sequence-models |
| Narrow classification (spam, churn) | Custom ML/DL | ml-fundamentals → transformers-llm |
| Recommender systems | Custom DL | ml-advanced (Matrix Factorization, NeuMF) |
| Image classification/detection | Custom DL | cnn-vision |
| Flexible NL understanding | LLM | transformers-llm (zero-shot) |
| Document Q&A / summarization | LLM + RAG | rag-retrieval + transformers-llm |
| Function calling / AI agents | LLM | data-pipeline |
| Cost/privacy sensitive | Custom | Any custom model skill |
| Rapid prototyping | LLM | transformers-llm, data-pipeline |
Rule of thumb: Start with the simplest model that meets your needs.
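The rule of thumb can be made concrete with a trivial baseline: any candidate model, custom or LLM-based, should first beat a `DummyClassifier`. The synthetic dataset below is illustrative only.

```python
# "Start simple": compare a real model against the dumbest possible baseline.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print(f"baseline accuracy: {baseline.score(X_test, y_test):.2f}")
print(f"logreg accuracy:   {model.score(X_test, y_test):.2f}")
```

If the complex model barely beats the baseline, the extra complexity (or the LLM API bill) is not earning its keep.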
11. 5-Step ML Workflow – ALWAYS FOLLOW
Step 1: UNDERSTAND → What type of problem? What data? What constraints?
Step 2: EDA → df.shape, df.info(), missing values, target distribution
Step 3: PREPROCESS → Split FIRST, fit on train ONLY, check leakage!
Step 4: MODEL → Start simple, then increase complexity
Step 5: EVALUATE → Baseline comparison, cross-validation, shuffled test
Enforce this in every ML project response. Reference: .claude/rules/ml-best-practices.md
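The five steps in miniature, on a built-in sklearn dataset so the example is self-contained; the key ordering is split FIRST (Step 3), then fit the scaler on the training set only.

```python
# 5-step workflow sketch: understand -> EDA -> split FIRST -> simple model -> evaluate.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer(as_frame=True)          # Step 1: known binary task
df = data.frame
print(df.shape)                                   # Step 2: EDA
print(df["target"].value_counts())                # target distribution

X_train, X_test, y_train, y_test = train_test_split(   # Step 3: split FIRST
    df.drop(columns="target"), df["target"],
    test_size=0.2, stratify=df["target"], random_state=42)

scaler = StandardScaler().fit(X_train)            # fit on train ONLY -> no leakage
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

model = LogisticRegression(max_iter=5000)         # Step 4: start simple
model.fit(X_train_s, y_train)
print(f"test accuracy: {model.score(X_test_s, y_test):.3f}")  # Step 5
```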
Critical Anti-Patterns
DO:
BCEWithLogitsLoss(NOTBCELoss)model.eval()+torch.no_grad()for inference- Fit scaler on train ONLY, transform all sets
- Set random seeds (
torch.manual_seed,np.random.seed) - Check class balance before training
DON’T:
- Skip EDA and jump to modeling
- Fit scaler before split → DATA LEAKAGE!
- Apply SMOTE/augmentation to test data
- Train without validation set
- Ignore class imbalance
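The DO-list in code, on toy tensors (the data and architecture below are invented for illustration): seed first, feed raw logits into `BCEWithLogitsLoss`, and switch to `eval()` + `no_grad()` at inference time.

```python
# PyTorch anti-pattern avoidance sketch: seeds, logits-based loss, eval/no_grad.
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducibility

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.BCEWithLogitsLoss()   # expects raw logits -- do NOT add a final sigmoid
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

X = torch.randn(32, 4)
y = (X.sum(dim=1, keepdim=True) > 0).float()  # toy binary target

model.train()
for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)    # logits go straight into the loss
    loss.backward()
    optimizer.step()

model.eval()                       # freezes dropout/batchnorm behaviour
with torch.no_grad():              # no autograd bookkeeping at inference
    probs = torch.sigmoid(model(X))  # sigmoid only for reporting probabilities
print(probs.shape)
```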
12. Quick Help
| Need | Action |
|---|---|
| Concept explanation | /explain-concept [concept] |
| Training debugging | /debug-training [error] |
| Data for ML project | /find-dataset [task] |
| Unsure which skill | Load ml-knowledge-index/SKILL.md |
| Full system guide | See ML_DL_SKILL_SYSTEM_GUIDE.md |
13. GSD Workflow Integration
When this skill operates within a GSD orchestration workflow (gsd init/discuss/plan/execute/verify), it adapts its behavior to provide domain expertise at each stage.
Domain Context Manifest
GSD detects ML/DL domain from PROJECT.md tech stack using these keywords:
PyTorch, TensorFlow, sklearn, scikit-learn, neural network, deep learning,
CNN, BERT, GPT, RAG, embeddings, transformer, training loop, loss function,
model training, computer vision, NLP, reinforcement learning, fine-tuning,
LSTM, GAN, diffusion, HuggingFace, vector store, FAISS, ChromaDB
When detected → GSD loads references/DOMAIN-INTEGRATION.md for the ML/DL domain profile.
Per-Phase Behavior
gsd discuss [N] → Domain Consultation:
- Ask ML-specific clarification questions: task type, data type, evaluation strategy, deployment target, compute constraints
- Warn about anti-patterns early: data leakage risks, wrong loss functions, missing baselines
- Recommend which sub-skills apply to this phase
- Save ML decisions to CONTEXT.md (model type, data strategy, evaluation plan, sub-skills to use)
gsd plan [N] → Task Planning Guidance:
- Map the ML 5-step workflow to GSD atomic tasks:
  - Task 1: Data → preprocessing, splitting, augmentation (reference ml-fundamentals, data-pipeline)
  - Task 2: Model → architecture, training loop, hyperparams (reference pytorch-mastery, deep-learning-core)
  - Task 3: Evaluate → metrics, interpretability, error analysis (reference model-interpretability)
- Include specific sub-skill pattern references in each PLAN-X.md `<action>` field
- Use the `<domain-skill>` tag in XML to declare which sub-skill the executor should consult
gsd execute [N] → Context Per Task:
- Each PLAN-X.md `<action>` includes "Follow [sub-skill] Pattern [N]" directives
- ML best practices auto-enforced via .claude/rules/ml-best-practices.md on all .py files
- Use /debug-training when training issues arise during execution
- Use /explain-concept when concept clarification is needed
gsd verify [N] → ML Verification Checklist:
- Data split before any preprocessing (no leakage)
- Scaler/encoder fit on train set ONLY
- Correct loss function for task type (BCEWithLogitsLoss, not BCELoss)
- model.eval() + torch.no_grad() for inference
- Random seeds set for reproducibility
- No SMOTE/augmentation on test data
- Metrics compared against baseline
- Class imbalance addressed if present
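The first two checklist items can be enforced structurally rather than by inspection: wrapping preprocessing in an sklearn Pipeline guarantees the scaler is re-fit on the training fold of every split. The dataset below is synthetic and illustrative.

```python
# Leakage-proof verification pattern: preprocessing lives INSIDE the Pipeline,
# so cross_val_score re-fits the scaler on each training fold automatically.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),              # fit on each training fold only
    ("clf", LogisticRegression(max_iter=1000)),
])
scores = cross_val_score(pipe, X, y, cv=5)    # no manual transform -> no leakage
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Fitting the scaler on all of X before calling `cross_val_score` would leak test-fold statistics into training, which is exactly the failure mode the checklist targets.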
Cross-Skill Workflows → GSD Phase Mapping
| ML Project Type | Phase 1 | Phase 2 | Phase 3 |
|---|---|---|---|
| Image Classifier | Data + augmentation (cnn-vision, ml-fundamentals) | Model + training (pytorch-mastery, deep-learning-core) | Evaluation + Grad-CAM (model-interpretability) |
| RAG System | Data pipeline + chunking (data-pipeline) | Vector store + retrieval (rag-retrieval) | LLM integration + eval (transformers-llm) |
| Fine-tune LLM | Data preparation (data-pipeline, transformers-llm) | LoRA/QLoRA training (fine-tuning-peft) | Evaluation + deployment (mlops-experiment) |
| Text Classifier | Data + EDA (ml-fundamentals, nlp-classical) | Model selection + training (transformers-llm) | Evaluation + interpretability (model-interpretability) |
| Recommender | Data + features (ml-fundamentals) | Matrix Factorization / NeuMF (ml-advanced) | Evaluation + A/B setup (mlops-experiment) |
| RL Agent | Environment setup (reinforcement-learning) | Algorithm + training (pytorch-mastery) | Evaluation + logging (mlops-experiment) |
Agent-Architect Integration
When building ML/DL agent systems through agent-architect within GSD:
- Phase 2 (Tools): Suggest ML custom MCP tools → model inference, evaluation metrics, data validation
- Phase 2 (Agents): Use ML domain prompts → "Senior ML Engineer", "Data Quality Analyst"
- Phase 3 (Orchestration): Define ML-specific workflows → data prep → train → evaluate → report
- Phase 4 (Guardrails): ML-specific → input validation, model versioning, drift detection, output confidence thresholds