datadog-automation
npx skills add https://github.com/composiohq/awesome-claude-skills --skill datadog-automation
Datadog Automation via Rube MCP
Automate Datadog monitoring and observability operations through Composio’s Datadog toolkit via Rube MCP.
Prerequisites
- Rube MCP must be connected (RUBE_SEARCH_TOOLS available)
- Active Datadog connection via RUBE_MANAGE_CONNECTIONS with toolkit datadog
- Always call RUBE_SEARCH_TOOLS first to get current tool schemas
Setup
Get Rube MCP: Add https://rube.app/mcp as an MCP server in your client configuration. No API keys are needed; just add the endpoint and it works.
- Verify Rube MCP is available by confirming RUBE_SEARCH_TOOLS responds
- Call RUBE_MANAGE_CONNECTIONS with toolkit datadog
- If connection is not ACTIVE, follow the returned auth link to complete Datadog authentication
- Confirm connection status shows ACTIVE before running any workflows
Core Workflows
1. Query and Explore Metrics
When to use: User wants to query metric data or list available metrics
Tool sequence:
- DATADOG_LIST_METRICS - List available metric names [Optional]
- DATADOG_QUERY_METRICS - Query metric time series data [Required]
Key parameters:
- query: Datadog metric query string (e.g., avg:system.cpu.user{host:web01})
- from: Start timestamp (Unix epoch seconds)
- to: End timestamp (Unix epoch seconds)
- q: Search string for listing metrics
Pitfalls:
- Query syntax follows Datadog’s metric query format: aggregation:metric_name{tag_filters}
- from and to are Unix epoch timestamps in seconds, not milliseconds
- Valid aggregations: avg, sum, min, max, count
- Tag filters use curly braces: {host:web01,env:prod}
- Time range should not exceed Datadog’s retention limits for the metric type
2. Search and Analyze Logs
When to use: User wants to search log entries or list log indexes
Tool sequence:
- DATADOG_LIST_LOG_INDEXES - List available log indexes [Optional]
- DATADOG_SEARCH_LOGS - Search logs with query and filters [Required]
Key parameters:
- query: Log search query using Datadog log query syntax
- from: Start time (ISO 8601 or Unix timestamp)
- to: End time (ISO 8601 or Unix timestamp)
- sort: Sort order (‘asc’ or ‘desc’)
- limit: Number of log entries to return
Pitfalls:
- Log queries use Datadog’s log search syntax: service:web status:error
- Search is limited to retained logs within the configured retention period
- Large result sets require pagination; check for cursor/page tokens
- Log indexes control routing and retention; filter by index if known
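The parameters above could be assembled along these lines. This is a minimal sketch: build_log_search_args is a hypothetical helper, and the choice of ISO 8601 strings for from and to is an assumption to verify against the schema returned by RUBE_SEARCH_TOOLS.

```python
from datetime import datetime, timedelta, timezone


def build_log_search_args(terms, end, lookback, limit=50, sort="desc"):
    """Build an argument dict for a log search over [end - lookback, end]."""
    if sort not in ("asc", "desc"):
        raise ValueError("sort must be 'asc' or 'desc'")
    # Space-separated field:value pairs, e.g. "service:web status:error".
    query = " ".join(f"{field}:{value}" for field, value in terms.items())
    start = end - lookback
    return {
        "query": query,
        "from": start.isoformat(),  # ISO 8601 with explicit UTC offset
        "to": end.isoformat(),
        "sort": sort,
        "limit": limit,
    }


end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
args = build_log_search_args(
    {"service": "web", "status": "error"}, end, timedelta(hours=1)
)
print(args)
```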
3. Manage Monitors
When to use: User wants to create, update, mute, or inspect monitors
Tool sequence:
- DATADOG_LIST_MONITORS - List all monitors with filters [Required]
- DATADOG_GET_MONITOR - Get specific monitor details [Optional]
- DATADOG_CREATE_MONITOR - Create a new monitor [Optional]
- DATADOG_UPDATE_MONITOR - Update monitor configuration [Optional]
- DATADOG_MUTE_MONITOR - Silence a monitor temporarily [Optional]
- DATADOG_UNMUTE_MONITOR - Re-enable a muted monitor [Optional]
Key parameters:
- monitor_id: Numeric monitor ID
- name: Monitor display name
- type: Monitor type (‘metric alert’, ‘service check’, ‘log alert’, ‘query alert’, etc.)
- query: Monitor query defining the alert condition
- message: Notification message with @mentions
- tags: Array of tag strings
- thresholds: Alert threshold values (critical, warning, ok)
Pitfalls:
- Monitor type must match the query type; mismatches cause creation failures
- message supports @mentions for notifications (e.g., @slack-channel, @pagerduty)
- Thresholds vary by monitor type; metric monitors need critical at minimum
- Muting a monitor suppresses notifications but the monitor still evaluates
- Monitor IDs are numeric integers
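A possible way to assemble arguments for DATADOG_CREATE_MONITOR from the parameters above is sketched below. The flat layout of the dict (in particular thresholds as a top-level key) is an assumption based on the parameter list, not a confirmed schema, and build_metric_monitor_args is a hypothetical helper.

```python
def build_metric_monitor_args(name, query, message, critical, warning=None, tags=None):
    """Build arguments for a metric alert; critical is always required."""
    if critical is None:
        raise ValueError("metric monitors need a critical threshold")
    thresholds = {"critical": critical}
    if warning is not None:
        thresholds["warning"] = warning
    return {
        "name": name,
        "type": "metric alert",  # must match the kind of query supplied
        "query": query,
        "message": message,      # may contain @mentions
        "tags": list(tags or []),
        "thresholds": thresholds,
    }


monitor_args = build_metric_monitor_args(
    name="High CPU on prod",
    query="avg(last_5m):avg:system.cpu.user{env:prod} > 90",
    message="CPU above 90% @slack-channel",
    critical=90,
    warning=80,
    tags=["env:prod", "team:infra"],
)
print(monitor_args)
```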
4. Manage Dashboards
When to use: User wants to list, view, update, or delete dashboards
Tool sequence:
- DATADOG_LIST_DASHBOARDS - List all dashboards [Required]
- DATADOG_GET_DASHBOARD - Get full dashboard definition [Optional]
- DATADOG_UPDATE_DASHBOARD - Update dashboard layout or widgets [Optional]
- DATADOG_DELETE_DASHBOARD - Remove a dashboard (irreversible) [Optional]
Key parameters:
- dashboard_id: Dashboard identifier string
- title: Dashboard title
- layout_type: ‘ordered’ (grid) or ‘free’ (freeform positioning)
- widgets: Array of widget definition objects
- description: Dashboard description
Pitfalls:
- Dashboard IDs are alphanumeric strings (e.g., ‘abc-def-ghi’), not numeric
- layout_type cannot be changed after creation; must recreate the dashboard
- Widget definitions are complex nested objects; get existing dashboard first to understand structure
- DELETE is permanent; there is no undo
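The "get first, then update" advice can be sketched as a small read-modify-write helper. Here existing stands in for a definition returned by DATADOG_GET_DASHBOARD; its dict shape and the helper name prepare_dashboard_update are assumptions for illustration only.

```python
import copy


def prepare_dashboard_update(existing, title=None, description=None, layout_type=None):
    """Return an updated copy of a dashboard definition, keeping widgets intact."""
    if layout_type is not None and layout_type != existing["layout_type"]:
        # layout_type is fixed at creation time, so refuse instead of sending it.
        raise ValueError("layout_type cannot be changed; recreate the dashboard instead")
    updated = copy.deepcopy(existing)  # never mutate the fetched definition
    if title is not None:
        updated["title"] = title
    if description is not None:
        updated["description"] = description
    return updated


# Hypothetical definition, shaped after the key parameters listed above.
existing = {
    "title": "Web overview",
    "layout_type": "ordered",
    "description": "",
    "widgets": [{"definition": {"type": "note", "content": "hello"}}],
}
updated = prepare_dashboard_update(existing, title="Web overview (prod)")
print(updated["title"], len(updated["widgets"]))
```

Starting from the fetched definition means the nested widget objects are passed back unchanged rather than rebuilt by hand.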
5. Create Events and Manage Downtimes
When to use: User wants to post events or schedule maintenance downtimes
Tool sequence:
- DATADOG_LIST_EVENTS - List existing events [Optional]
- DATADOG_CREATE_EVENT - Post a new event [Required]
- DATADOG_CREATE_DOWNTIME - Schedule a maintenance downtime [Optional]
Key parameters for events:
- title: Event title
- text: Event body text (supports markdown)
- alert_type: Event severity (‘error’, ‘warning’, ‘info’, ‘success’)
- tags: Array of tag strings
Key parameters for downtimes:
- scope: Tag scope for the downtime (e.g., host:web01)
- start: Start time (Unix epoch)
- end: End time (Unix epoch; omit for indefinite)
- message: Downtime description
- monitor_id: Specific monitor to downtime (optional, omit for scope-based)
Pitfalls:
- Event text supports Datadog’s markdown format including @mentions
- Downtimes scope uses tag syntax: host:web01,env:staging
- Omitting end creates an indefinite downtime; always set an end time for maintenance
- Downtime monitor_id narrows to a single monitor; scope applies to all matching monitors
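One way to avoid accidental indefinite downtimes is to derive end from a required duration, as in this sketch. build_downtime_args is a hypothetical helper, and the argument names simply mirror the parameter list above; confirm them against the live tool schema.

```python
def build_downtime_args(scope_tags, start, duration_seconds, message, monitor_id=None):
    """Build downtime arguments with an explicit end time (Unix epoch seconds)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    args = {
        # Comma-separated tag scope, e.g. "host:web01,env:staging".
        "scope": ",".join(f"{key}:{value}" for key, value in scope_tags.items()),
        "start": int(start),
        "end": int(start) + int(duration_seconds),
        "message": message,
    }
    if monitor_id is not None:
        # Narrows the downtime to a single monitor.
        args["monitor_id"] = int(monitor_id)
    return args


downtime = build_downtime_args(
    {"host": "web01", "env": "staging"},
    start=1_700_000_000,
    duration_seconds=2 * 3600,
    message="Planned kernel upgrade",
)
print(downtime)
```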
6. Manage Hosts and Traces
When to use: User wants to list infrastructure hosts or inspect distributed traces
Tool sequence:
- DATADOG_LIST_HOSTS - List all reporting hosts [Required]
- DATADOG_GET_TRACE_BY_ID - Get a specific distributed trace [Optional]
Key parameters:
- filter: Host search filter string
- sort_field: Sort hosts by field (e.g., ‘name’, ‘apps’, ‘cpu’)
- sort_dir: Sort direction (‘asc’ or ‘desc’)
- trace_id: Distributed trace ID for trace lookup
Pitfalls:
- Host list includes all hosts reporting to Datadog within the retention window
- Trace IDs are long numeric strings; ensure exact match
- Hosts that stop reporting are retained for a configured period before removal
Common Patterns
Monitor Query Syntax
Metric alerts:
avg(last_5m):avg:system.cpu.user{env:prod} > 90
Log alerts:
logs("service:web status:error").index("main").rollup("count").last("5m") > 10
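The metric alert pattern above breaks down into a time aggregation over a window, a metric query, a comparator, and a threshold. A small sketch that composes these parts follows; metric_alert_query is a hypothetical helper, not part of the toolkit.

```python
def metric_alert_query(time_aggr, window, metric_query, comparator, threshold):
    """Compose time_aggr(window):metric_query comparator threshold."""
    if comparator not in (">", ">=", "<", "<="):
        raise ValueError(f"unsupported comparator: {comparator}")
    return f"{time_aggr}({window}):{metric_query} {comparator} {threshold}"


alert_query = metric_alert_query(
    "avg", "last_5m", "avg:system.cpu.user{env:prod}", ">", 90
)
print(alert_query)
```

Building the string from the same threshold value that goes into thresholds helps keep the query and the critical value from drifting apart.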
Tag Filtering
- Tags use key:value format: host:web01, env:prod, service:api
- Multiple tags: {host:web01,env:prod} (AND logic)
- Wildcard: host:web*
Pagination
- Use page and page_size or offset-based pagination depending on endpoint
- Check response for total count to determine if more pages exist
- Continue until all results are retrieved
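The page-based variant of this loop might look like the sketch below. The response shape (an items list plus a total count) and the fetch_page callable are assumptions standing in for whichever list tool is being paged; the real field names depend on the endpoint.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect every item from a page/page_size style endpoint."""
    items = []
    page = 0
    while True:
        response = fetch_page(page=page, page_size=page_size)
        batch = response["items"]
        items.extend(batch)
        # Stop on an empty page or once the reported total has been reached.
        if not batch or len(items) >= response["total"]:
            break
        page += 1
    return items


def make_fake_endpoint(records):
    """Stand-in for a real list call, used only to exercise the loop."""
    def fetch_page(page, page_size):
        start = page * page_size
        return {"items": records[start:start + page_size], "total": len(records)}
    return fetch_page


all_items = fetch_all(make_fake_endpoint(list(range(250))), page_size=100)
print(len(all_items))
```

The empty-page check guards against looping forever if the reported total is wrong or changes between calls.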
Known Pitfalls
Timestamps:
- Most endpoints use Unix epoch seconds (not milliseconds)
- Some endpoints accept ISO 8601; check tool schema
- Time ranges should be reasonable (not years of data)
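Because mixing seconds and milliseconds is an easy mistake, a normalizing helper can be applied before any call. This is a heuristic sketch: to_epoch_seconds and the 1e11 cutoff are assumptions of this example, not Datadog behavior.

```python
from datetime import datetime, timezone

# Integers at or above this value are treated as milliseconds
# (as seconds, 1e11 would be a date roughly three thousand years away).
MS_THRESHOLD = 100_000_000_000


def to_epoch_seconds(value):
    """Normalize a datetime, ISO 8601 string, or epoch number to epoch seconds."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)  # naive values assumed UTC
        return int(value.timestamp())
    if isinstance(value, str):
        return to_epoch_seconds(datetime.fromisoformat(value))
    number = int(value)
    return number // 1000 if number >= MS_THRESHOLD else number


print(to_epoch_seconds(1704067200000))
print(to_epoch_seconds("2024-01-01T00:00:00+00:00"))
```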
Query Syntax:
- Metric queries: aggregation:metric{tags}
- Log queries: field:value pairs
- Monitor queries vary by type; check Datadog documentation
Rate Limits:
- Datadog API has per-endpoint rate limits
- Implement backoff on 429 responses
- Batch operations where possible
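Backoff on 429 responses is commonly implemented as exponential delay with jitter. In this sketch, RateLimited is a hypothetical exception standing in for however the calling client surfaces a 429, and the delay values are illustrative defaults.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical signal that the API answered with HTTP 429."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry call() on RateLimited, doubling the delay after each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # retries exhausted, surface the error
            delay = base_delay * 2 ** attempt
            # Random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))


# Demonstration with a call that is rate limited twice, then succeeds.
attempts = {"count": 0}
delays = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimited()
    return "ok"


result = call_with_backoff(flaky_call, sleep=delays.append)
print(result, len(delays))
```

Passing sleep as a parameter lets the demonstration record delays instead of actually waiting.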
Quick Reference
| Task | Tool Slug | Key Params |
|---|---|---|
| Query metrics | DATADOG_QUERY_METRICS | query, from, to |
| List metrics | DATADOG_LIST_METRICS | q |
| Search logs | DATADOG_SEARCH_LOGS | query, from, to, limit |
| List log indexes | DATADOG_LIST_LOG_INDEXES | (none) |
| List monitors | DATADOG_LIST_MONITORS | tags |
| Get monitor | DATADOG_GET_MONITOR | monitor_id |
| Create monitor | DATADOG_CREATE_MONITOR | name, type, query, message |
| Update monitor | DATADOG_UPDATE_MONITOR | monitor_id |
| Mute monitor | DATADOG_MUTE_MONITOR | monitor_id |
| Unmute monitor | DATADOG_UNMUTE_MONITOR | monitor_id |
| List dashboards | DATADOG_LIST_DASHBOARDS | (none) |
| Get dashboard | DATADOG_GET_DASHBOARD | dashboard_id |
| Update dashboard | DATADOG_UPDATE_DASHBOARD | dashboard_id, title, widgets |
| Delete dashboard | DATADOG_DELETE_DASHBOARD | dashboard_id |
| List events | DATADOG_LIST_EVENTS | start, end |
| Create event | DATADOG_CREATE_EVENT | title, text, alert_type |
| Create downtime | DATADOG_CREATE_DOWNTIME | scope, start, end |
| List hosts | DATADOG_LIST_HOSTS | filter, sort_field |
| Get trace | DATADOG_GET_TRACE_BY_ID | trace_id |