langsmith-trace-analyzer

📁 lubu-labs/langchain-agent-skills 📅 14 days ago
Install command
npx skills add https://github.com/lubu-labs/langchain-agent-skills --skill langsmith-trace-analyzer

Skill documentation

LangSmith Trace Analyzer

Use this skill to move from raw LangSmith traces to actionable debugging/evaluation insights.

Quick Start

# Install dependencies
uv pip install langsmith langsmith-fetch

# Auth
export LANGSMITH_API_KEY=<your_langsmith_api_key>

Fast workflow

  1. Download traces with scripts/download_traces.py (or scripts/download_traces.ts).
  2. Analyze downloaded JSON with scripts/analyze_traces.py.
  3. Load targeted references only when needed:
    • references/filtering-querying.md for query/filter syntax
    • references/analysis-patterns.md for deeper diagnostics
    • references/benchmark-analysis.md for benchmark-specific workflows

Decision Guide

  1. Known trace IDs
    Use langsmith-fetch trace <id> directly, or --trace-ids in downloader scripts.

  2. Need to discover traces first
    Use LangSmith SDK list_runs/listRuns with filters, then download selected trace IDs.

  3. Need aggregate insights
    Run analyze_traces.py for summary stats, patterns, and passed-vs-failed comparisons.

Core Workflows

1) Download and organize traces

Python:

uv run skills/langsmith-trace-analyzer/scripts/download_traces.py \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces \
  --organize

TypeScript:

ts-node skills/langsmith-trace-analyzer/scripts/download_traces.ts \
  --project "my-project" \
  --filter "job_id=abc123" \
  --last-hours 24 \
  --limit 100 \
  --output ./traces

Output layout:

traces/
├── manifest.json
└── by-outcome/
    ├── passed/
    ├── failed/
    └── error/
        ├── GraphRecursionError/
        ├── TimeoutError/
        └── DaytonaError/

Notes:

  • Python script supports --organize/--no-organize.
  • Both scripts use SDK filtering plus langsmith-fetch for full trace payload export.
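
As a quick sanity check on the layout above, the following minimal sketch counts trace files per outcome bucket. It assumes each trace is stored as a `.json` file somewhere under `by-outcome/`; the function name `count_by_outcome` is hypothetical and not part of the skill's scripts.

```python
from collections import Counter
from pathlib import Path


def count_by_outcome(root: str) -> Counter:
    """Count trace JSON files per bucket under <root>/by-outcome."""
    base = Path(root) / "by-outcome"
    counts: Counter = Counter()
    if not base.is_dir():
        return counts
    for path in base.rglob("*.json"):
        # Bucket is the file's directory relative to by-outcome/,
        # e.g. "passed" or "error/TimeoutError".
        bucket = path.parent.relative_to(base).as_posix()
        counts[bucket] += 1
    return counts


if __name__ == "__main__":
    for bucket, n in sorted(count_by_outcome("./traces").items()):
        print(f"{bucket}: {n}")
```

A large `error/` bucket relative to `passed/` and `failed/` is usually a sign to look at infrastructure problems before agent behavior.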

2) Analyze downloaded traces

# Markdown report
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --output report.md

# JSON output
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --json

# Compare passed vs failed (expects by-outcome folders)
uv run skills/langsmith-trace-analyzer/scripts/analyze_traces.py ./traces --compare --output comparison.md

The analyzer reports:

  • message/tool-call/token/duration summaries
  • top tool usage
  • anomaly patterns (high message count, repeated tools, quick failures)
  • passed-vs-failed metric deltas when comparison is enabled
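
To make the anomaly patterns listed above concrete, here is a minimal sketch of how such heuristics could be expressed. It is not the logic in analyze_traces.py: the thresholds, the summary keys (`message_count`, `tool_calls`, `status`, `duration_seconds`), and the function name are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical thresholds; tune them for your own agent.
MAX_MESSAGES = 50
MAX_SAME_TOOL_CALLS = 5
QUICK_FAILURE_SECONDS = 5.0


def flag_anomalies(summary: dict) -> list[str]:
    """Return anomaly labels for one per-trace summary.

    Assumed (hypothetical) keys: message_count, tool_calls (a list of
    tool names), status, duration_seconds.
    """
    flags = []
    if summary.get("message_count", 0) > MAX_MESSAGES:
        flags.append("high_message_count")
    tool_counts = Counter(summary.get("tool_calls", []))
    if tool_counts and max(tool_counts.values()) > MAX_SAME_TOOL_CALLS:
        flags.append("repeated_tools")
    failed = summary.get("status") == "error"
    # A missing duration is treated as "not quick".
    if failed and summary.get("duration_seconds", float("inf")) < QUICK_FAILURE_SECONDS:
        flags.append("quick_failure")
    return flags
```

Keeping each heuristic as an independent check makes it easy to add or drop one without affecting the others.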

3) Query traces correctly (SDK)

Pass a query in the official LangSmith run filter syntax through the filter argument, and bound the time window with start_time where needed:

from datetime import datetime, timedelta, timezone
from langsmith import Client

client = Client()

start = datetime.now(timezone.utc) - timedelta(hours=24)
filter_query = 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))'

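# list_runs returns a lazy iterator; results are fetched as you iterate.
runs = client.list_runs(
    project_name="my-project",
    is_root=True,  # root runs only, i.e. one per trace
    start_time=start,
    filter=filter_query,
)
for run in runs:
    print(run.id, run.status)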

For TypeScript:

import { Client } from "langsmith";

const client = new Client();
for await (const run of client.listRuns({
  projectName: "my-project",
  isRoot: true,
  filter: 'and(eq(metadata_key, "job_id"), eq(metadata_value, "abc123"))',
})) {
  console.log(run.id, run.status);
}
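
Hand-writing the metadata filter string is error-prone, so a small helper can build it. This is a sketch only: `metadata_filter` is a hypothetical name, and using `json.dumps` to quote the values is an assumption about how the filter parser treats escaped characters.

```python
import json


def metadata_filter(key: str, value: str) -> str:
    """Build a filter string matching one metadata key/value pair."""
    # json.dumps adds the double quotes and escapes embedded quotes
    # (assumption: the filter parser accepts JSON-style escaping).
    quoted_key = json.dumps(key, ensure_ascii=False)
    quoted_value = json.dumps(value, ensure_ascii=False)
    return f"and(eq(metadata_key, {quoted_key}), eq(metadata_value, {quoted_value}))"


if __name__ == "__main__":
    print(metadata_filter("job_id", "abc123"))
```

The result can be passed as the filter argument of list_runs or listRuns in the examples above.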

Accuracy and Schema Notes

  • LangSmith run fields are commonly top-level (status, error, total_tokens, start_time, end_time).
  • Some exported traces also include nested metadata (metadata or extra.metadata) and/or messages.
  • analyze_traces.py is resilient to multiple payload shapes, including raw array payloads.
  • For full conversation content, prefer downloaded trace payloads over bare list_runs results.
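
The shape differences noted above can be handled with a thin normalization layer. The sketch below is an illustration, not the code in analyze_traces.py: the wrapper keys `runs` and `trace` are hypothetical, while the `metadata` and `extra.metadata` locations come from the notes above.

```python
from typing import Any


def extract_runs(payload: Any) -> list[dict]:
    """Return a list of run dicts from a few assumed payload shapes."""
    if isinstance(payload, list):  # raw array payload
        return [r for r in payload if isinstance(r, dict)]
    if isinstance(payload, dict):
        for key in ("runs", "trace"):  # hypothetical wrapper keys
            if isinstance(payload.get(key), list):
                return [r for r in payload[key] if isinstance(r, dict)]
        return [payload]  # treat the dict as a single run object
    return []


def get_metadata(run: dict) -> dict:
    """Prefer top-level metadata, then fall back to extra.metadata."""
    meta = run.get("metadata")
    if isinstance(meta, dict):
        return meta
    extra = run.get("extra")
    if isinstance(extra, dict) and isinstance(extra.get("metadata"), dict):
        return extra["metadata"]
    return {}
```

Downstream analysis can then work with one shape regardless of how a trace was exported.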

Troubleshooting

| Issue | Likely Cause | Action |
| --- | --- | --- |
| LANGSMITH_API_KEY missing | Auth not configured | export LANGSMITH_API_KEY=<your_langsmith_api_key> |
| No runs returned | Wrong project/filter/time range | Verify project name and filter syntax |
| Empty/partial message arrays | Run schema differs or incomplete data | Use downloaded trace JSON and inspect status/error fields |
| JSON parse error on downloaded files | Bad/incomplete export | Re-download trace; use --format raw paths in scripts |
| Re-downloading same traces repeatedly | Existing files in nested folders | Use current scripts (they check existing files across output tree) |
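
For the JSON parse error case, it can help to find which downloaded files are broken before re-downloading everything. The following is a minimal sketch under the assumption that traces are stored as `.json` files anywhere under the output directory; `find_unparseable` is a hypothetical helper, not part of the skill's scripts.

```python
import json
from pathlib import Path


def find_unparseable(root: str) -> list[Path]:
    """Return JSON files under root that fail to parse."""
    bad = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            # Truncated, empty, or mis-encoded export.
            bad.append(path)
    return bad


if __name__ == "__main__":
    for path in find_unparseable("./traces"):
        print(f"re-download candidate: {path}")
```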

Safety for Open Source

  • Do not commit downloaded trace artifacts (manifest.json, trace JSON dumps) unless sanitized.
  • Trace payloads can contain user prompts, outputs, metadata, and other sensitive runtime data.
  • Keep this skill repository focused on scripts/templates, not production trace exports.
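
If a trace does need to be shared, a redaction pass is one possible starting point. The sketch below is an illustration and not a complete sanitizer: the key names in `SENSITIVE_KEYS` are assumptions about where sensitive content tends to live, and sensitive data can also appear under other keys.

```python
from typing import Any

# Hypothetical key names; adjust to the fields your traces actually contain.
SENSITIVE_KEYS = {"inputs", "outputs", "messages", "metadata"}


def redact(value: Any) -> Any:
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if k in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

The function returns a new structure and leaves the input untouched, so the original trace stays available locally for debugging.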

Resources

scripts/

  • scripts/download_traces.py: Python downloader + organizer
  • scripts/download_traces.ts: TypeScript downloader + organizer
  • scripts/analyze_traces.py: Offline analysis and reporting

references/

  • references/filtering-querying.md: LangSmith query/filter examples
  • references/analysis-patterns.md: Diagnostic patterns and heuristics
  • references/benchmark-analysis.md: Benchmark-oriented analysis