instrumenting-with-mlflow-tracing

📁 mlflow/skills 📅 11 days ago
31
总安装量
4
周安装量
#12004
全站排名
安装命令
npx skills add https://github.com/mlflow/skills --skill instrumenting-with-mlflow-tracing

Agent 安装分布

amp 3
opencode 3
kimi-cli 3
codex 3
github-copilot 3
gemini-cli 3

Skill 文档

MLflow Tracing Instrumentation Guide

Language-Specific Guides

Based on the user’s project, load the appropriate guide:

  • Python projects: Read references/python.md
  • TypeScript/JavaScript projects: Read references/typescript.md

If unclear, check for package.json (TypeScript) or requirements.txt/pyproject.toml (Python) in the project.


What to Trace

Trace these operations (high debugging/observability value):

Operation Type Examples Why Trace
Root operations Main entry points, top-level pipelines, workflow steps End-to-end latency, input/output logging
LLM calls Chat completions, embeddings Token usage, latency, prompt/response inspection
Retrieval Vector DB queries, document fetches, search Relevance debugging, retrieval quality
Tool/function calls API calls, database queries, web search External dependency monitoring, error tracking
Agent decisions Routing, planning, tool selection Understand agent reasoning and choices
External services HTTP APIs, file I/O, message queues Dependency failures, timeout tracking

Skip tracing these (too granular, adds noise):

  • Simple data transformations (dict/list manipulation)
  • String formatting, parsing, validation
  • Configuration loading, environment setup
  • Logging or metric emission
  • Pure utility functions (math, sorting, filtering)

Rule of thumb: Trace operations that are important for debugging and identifying issues in your application.


Feedback Collection

Log user feedback on traces for evaluation, debugging, and fine-tuning. Essential for identifying quality issues in production.

See references/feedback-collection.md for:

  • Recording user ratings and comments with mlflow.log_feedback()
  • Capturing trace IDs to return to clients
  • LLM-as-judge automated evaluation

Reference Documentation

Production Deployment

See references/production.md for:

  • Environment variable configuration
  • Async logging for low-latency applications
  • Sampling configuration (MLFLOW_TRACE_SAMPLING_RATIO)
  • Lightweight SDK (mlflow-tracing)
  • Docker/Kubernetes deployment

Advanced Patterns

See references/advanced-patterns.md for:

  • Async function tracing
  • Multi-threading with context propagation
  • PII redaction with span processors

Distributed Tracing

See references/distributed-tracing.md for:

  • Propagating trace context across services
  • Client/server header APIs