datadog-cli
npx skills add https://github.com/ctdio/datadog-cli --skill datadog-cli
Skill Documentation
Datadog CLI Reference
A CLI tool for AI agents to debug and triage using Datadog logs, metrics, and APM traces.
Setup
Running the CLI
# Via npx (no install needed)
npx @ctdio/datadog-cli <command>
# Via bunx
bunx @ctdio/datadog-cli <command>
# Or create an alias for convenience
alias datadog="npx @ctdio/datadog-cli"
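Bash does not expand aliases in non-interactive shells by default, so the alias above may not be available inside scripts. A minimal sketch of a function wrapper that behaves the same way; the name `datadog` simply mirrors the alias:

```shell
# Wrapper function: forwards all arguments to the CLI via npx.
# Unlike an alias, a function is also usable inside non-interactive scripts.
datadog() {
  npx @ctdio/datadog-cli "$@"
}
```

Defining the function only declares it; nothing is executed until `datadog <command>` is called.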
Environment Variables (Required)
export DD_API_KEY="your-api-key"
export DD_APP_KEY="your-app-key"
Get keys from: https://app.datadoghq.com/organization-settings/api-keys
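Since both variables are required, a script may want to confirm they are set before calling the CLI. The helper below is a sketch; `check_dd_env` is a hypothetical name and not part of the CLI:

```shell
# Report which required Datadog variables are missing.
# Returns 0 when both are set and non-empty, 1 otherwise.
check_dd_env() {
  missing=0
  for name in DD_API_KEY DD_APP_KEY; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_dd_env && datadog errors --from 1h
```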
For Non-US Datadog Sites
Use the --site flag:
npx @ctdio/datadog-cli logs search --query "*" --site datadoghq.eu
Commands
Log Search
datadog logs search --query "<query>" [--from <time>] [--to <time>] [--limit <n>] [--sort <order>]
Examples:
datadog logs search --query "status:error" --from 1h
datadog logs search --query "service:api status:error @http.status_code:500" --from 1h
Live Tail (Real-time Streaming)
Stream logs as they arrive. Press Ctrl+C to stop.
datadog logs tail --query "<query>" [--interval <seconds>]
Examples:
datadog logs tail --query "status:error"
datadog logs tail --query "service:api" --interval 5
Trace Correlation
Find all logs for a distributed trace across services.
datadog logs trace --id "<trace-id>" [--from <time>] [--to <time>]
Example:
datadog logs trace --id "abc123def456" --from 24h
Log Context
Get logs before and after a specific timestamp to understand what happened.
datadog logs context --timestamp "<iso-timestamp>" [--before <time>] [--after <time>] [--service <svc>]
Examples:
datadog logs context --timestamp "2024-01-15T10:30:00Z" --before 5m --after 2m
datadog logs context --timestamp "2024-01-15T10:30:00Z" --service api --before 10m
Error Summary
Quick breakdown of errors by service, type, and message.
datadog errors [--from <time>] [--to <time>] [--service <svc>]
Examples:
datadog errors --from 1h
datadog errors --service payment-api --from 24h
Period Comparison
Compare log counts between the current period and the previous period of the same length.
datadog logs compare --query "<query>" --period <time>
Examples:
datadog logs compare --query "status:error" --period 1h
datadog logs compare --query "service:api status:error" --period 6h
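A period comparison boils down to the relative change between two counts. This reference does not show the CLI's output format, so the sketch below only illustrates the arithmetic; `pct_change` is a hypothetical helper that takes the two counts as plain numbers:

```shell
# pct_change CURRENT PREVIOUS
# Prints the relative change from PREVIOUS to CURRENT as a signed percentage.
pct_change() {
  awk -v cur="$1" -v prev="$2" 'BEGIN {
    if (prev == 0) { print "n/a"; exit }   # avoid division by zero
    printf "%+.1f%%\n", (cur - prev) / prev * 100
  }'
}

pct_change 150 100   # prints +50.0%
pct_change 80 100    # prints -20.0%
```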
Log Patterns
Group similar log messages to find patterns (replaces UUIDs, numbers, etc.).
datadog logs patterns --query "<query>" [--from <time>] [--limit <n>]
Examples:
datadog logs patterns --query "status:error" --from 1h
datadog logs patterns --query "service:api" --from 6h --limit 1000
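Pattern grouping relies on replacing the variable parts of each message so that similar messages collapse into one line. The CLI's own replacement rules are not documented here; the sketch below is an assumption that illustrates the idea with `sed`, using the made-up placeholders `<UUID>` and `<NUM>`:

```shell
# Read log messages on stdin, replace UUIDs and numbers with placeholders,
# then count identical results, most frequent first.
normalize_logs() {
  sed -E \
    -e 's/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}/<UUID>/g' \
    -e 's/[0-9]+/<NUM>/g' |
    sort | uniq -c | sort -rn
}
```

Piping the two messages "user 42 timed out after 300ms" and "user 7 timed out after 1200ms" through `normalize_logs` yields a single pattern with a count of 2.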
Service Discovery
List all services with recent log activity.
datadog services [--from <time>] [--to <time>]
Example:
datadog services --from 24h
Log Aggregation
datadog logs agg --query "<query>" --facet <facet> [--from <time>]
Common facets: status, service, host, @http.status_code, @error.kind
Examples:
datadog logs agg --query "*" --facet status --from 1h
datadog logs agg --query "status:error" --facet service --from 24h
Multiple Queries
Run multiple queries in parallel.
datadog logs multi --queries "name1:query1,name2:query2" [--from <time>]
Example:
datadog logs multi --queries "errors:status:error,warnings:status:warn" --from 1h
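When the named queries are assembled in a script, the --queries value can be built by joining name:query pairs with commas. A small sketch; `join_queries` is a hypothetical helper, and how the CLI treats a comma inside a query is not covered by this reference:

```shell
# join_queries NAME:QUERY [NAME:QUERY ...]
# Joins its arguments with commas into a single --queries value.
join_queries() {
  # "$*" joins arguments with the first character of IFS;
  # the subshell keeps the IFS change local.
  ( IFS=,; printf '%s\n' "$*" )
}

# Usage: datadog logs multi --queries "$(join_queries "errors:status:error" "warnings:status:warn")" --from 1h
```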
Metrics Query
datadog metrics query --query "<metrics-query>" [--from <time>] [--to <time>]
Query format: <aggregation>:<metric>{<tags>}
Examples:
datadog metrics query --query "avg:system.cpu.user{*}" --from 1h
datadog metrics query --query "avg:system.cpu.user{service:api}" --from 1h
datadog metrics query --query "sum:trace.http.request.errors{service:api}.as_count()" --from 1h
APM Traces & Spans
Span Search
Search APM spans/traces with query filters.
datadog spans search --query "<query>" [--from <time>] [--to <time>] [--limit <n>] [--min-duration <duration>]
Examples:
datadog spans search --query "env:prod service:api" --from 1h
datadog spans search --query "service:api" --min-duration 1s --from 1h # Slow requests
datadog spans search --query "resource_name:POST" --from 1h
Span Aggregation
Aggregate spans by facet.
datadog spans agg --query "<query>" --facet <facet> [--from <time>]
Common facets: service, resource_name, status, env, operation_name
Examples:
datadog spans agg --query "env:prod" --facet service --from 24h
datadog spans agg --query "service:api" --facet resource_name --from 1h
Trace Hierarchy View
View all spans in a trace with parent-child relationships.
datadog spans trace --id "<trace-id>" [--from <time>]
Example:
datadog spans trace --id "abc123def456" --from 24h
Span Error Summary
Get breakdown of span errors by service and resource.
datadog spans errors [--from <time>] [--service <svc>]
Examples:
datadog spans errors --from 1h
datadog spans errors --service api --from 24h
APM Service Discovery
List all services with APM trace activity.
datadog spans services [--from <time>]
Example:
datadog spans services --from 24h
Global Flags
| Flag | Description |
|---|---|
| --pretty | Human-readable output with colors |
| --output <file> | Export results to JSON file |
| --site <site> | Datadog site (e.g., datadoghq.eu) |
Time Formats
- Relative: 30m, 1h, 6h, 24h, 7d
- ISO 8601: 2024-01-15T10:30:00Z
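The relative formats read as a number followed by a unit suffix. Assuming the usual meaning of the suffixes (m for minutes, h for hours, d for days), which this reference does not spell out, a script can convert them to seconds as sketched below; `to_seconds` is a hypothetical helper:

```shell
# to_seconds DURATION
# Converts a relative duration such as 30m, 6h or 7d to seconds.
to_seconds() {
  num=${1%[mhd]}        # strip a trailing unit suffix, if any
  unit=${1#"$num"}      # whatever remains is the unit
  case "$num" in
    ''|*[!0-9]*) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
  case "$unit" in
    m) echo $((num * 60)) ;;
    h) echo $((num * 3600)) ;;
    d) echo $((num * 86400)) ;;
    *) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

to_seconds 30m   # prints 1800
to_seconds 7d    # prints 604800
```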
Common Workflows
Incident Triage
# 1. Quick error overview (logs and spans)
datadog errors --from 1h
datadog spans errors --from 1h
# 2. Is this new? Compare to previous period
datadog logs compare --query "status:error" --period 1h
# 3. What patterns are we seeing?
datadog logs patterns --query "status:error" --from 1h
# 4. Narrow down by service
datadog logs search --query "status:error service:payment-api" --from 1h
# 5. Find slow requests
datadog spans search --query "service:api" --min-duration 1s --from 1h
# 6. Get context around a specific timestamp
datadog logs context --timestamp "2024-01-15T10:30:00Z" --service api --before 5m --after 2m
# 7. Follow the distributed trace (logs and spans)
datadog logs trace --id "TRACE_ID"
datadog spans trace --id "TRACE_ID"
Real-time Debugging
# Stream errors as they happen
datadog logs tail --query "status:error"
# Watch specific service
datadog logs tail --query "service:api status:error"
Service Health Check
# List services
datadog services --from 24h
# Check error distribution
datadog logs agg --query "service:api" --facet status --from 1h
# Check CPU usage
datadog metrics query --query "avg:system.cpu.user{service:api}" --from 1h
Export for Sharing
# Save search results
datadog logs search --query "status:error" --from 1h --output errors.json
# Save error summary
datadog errors --from 24h --output error-report.json
Datadog Query Syntax
| Operator | Example | Description |
|---|---|---|
| AND | service:api status:error | Both conditions |
| OR | status:error OR status:warn | Either condition |
| - | -status:info | Exclude |
| * | service:api-* | Wildcard |
| >= <= | @http.status_code:>=400 | Numeric comparison |
| [TO] | @duration:[1000 TO 5000] | Range |
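Several of these operators (`*`, `>`, `[`) and the spaces between conditions are also special to the shell, so queries should be wrapped in quotes as in the examples throughout this reference. The sketch below uses `show_args`, a hypothetical stand-in for the CLI that prints each argument it receives, to show that a quoted query arrives as a single argument:

```shell
# Hypothetical stand-in for the CLI: prints each received argument in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Quoted: the whole query is passed through as one argument.
show_args --query "service:api @http.status_code:>=400"
# prints:
# [--query]
# [service:api @http.status_code:>=400]
```

Without the quotes, the shell would split the query at the space and treat `>` as an output redirection.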
Common Attributes
- service – Service name
- status – Log level (error, warn, info, debug)
- host – Hostname
- @http.status_code – HTTP status code
- @error.kind – Error type
- @trace_id / @dd.trace_id – Trace ID