worker-image-investigation

Install command
npx skills add https://github.com/jwmossmoz/agent-skills --skill worker-image-investigation

Worker Image Investigation

Investigate Taskcluster task failures by comparing worker images, extracting SBOM info, and debugging Azure VMs.

Set these variables to point at your local skills checkout; the commands in this file rely on them (adjust the path for your machine):

SKILLS_ROOT=/Users/jwmoss/github_moz/agent-skills/skills
WII="$SKILLS_ROOT/worker-image-investigation/scripts/investigate.py"

Prerequisites

  • taskcluster CLI: brew install taskcluster
  • az CLI (for VM debugging): brew install azure-cli && az login
  • uv for running scripts

Usage

# Investigate a failing task - get worker pool, image version, status
uv run "$WII" investigate <TASK_ID>
uv run "$WII" investigate https://firefox-ci-tc.services.mozilla.com/tasks/<TASK_ID>

# Compare two tasks (e.g., passing vs failing on same revision)
uv run "$WII" compare <PASSING_TASK_ID> <FAILING_TASK_ID>

# List running workers in a pool (for Azure VM access)
uv run "$WII" workers gecko-t/win11-64-24h2

# Get SBOM/image info for a worker pool
uv run "$WII" sbom gecko-t/win11-64-24h2

# Get Windows build and GenericWorker version from Azure VM
uv run "$WII" vm-info <VM_NAME> <RESOURCE_GROUP>

Investigation Workflow

1. Initial Task Analysis

# Get task info including worker pool and image
uv run "$WII" investigate <FAILING_TASK_ID>

Output includes: taskId, taskLabel, workerPool, workerId, imageVersion, status.
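Because investigate accepts either a bare task ID or a task URL, scripts that wrap it may want to normalize the input first. The following is a minimal sketch, assuming task URLs have the /tasks/<TASK_ID> shape shown under Usage; task_id_from is a hypothetical helper and not part of investigate.py:

```shell
# Hypothetical helper: print the task ID for a bare ID or a .../tasks/<TASK_ID> URL.
task_id_from() {
  local input="$1"
  # Drop everything up to and including "/tasks/" (a no-op for a bare ID).
  input="${input##*/tasks/}"
  # Drop any trailing path, query string, or fragment (e.g. /runs/0).
  input="${input%%[/?#]*}"
  printf '%s\n' "$input"
}
```

It could then be used as: uv run "$WII" investigate "$(task_id_from "$TASK_URL")", where TASK_URL is a variable you set yourself.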

2. Find Comparison Task

Use Treeherder to find a passing run of the same test on the same revision or a recent revision:

  • Check whether the test passed on an older image version
  • Look for passing runs on mozilla-central vs. autoland

3. Compare Tasks

uv run "$WII" compare <PASSING_TASK_ID> <FAILING_TASK_ID>

Look for differences in image versions (e.g., 1.0.8 vs 1.0.9).
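Once the two image versions are known, it can be useful to tell which task ran on the newer image. A small sketch, assuming versions are dotted numbers like the 1.0.8 and 1.0.9 examples above and that your sort supports -V (GNU coreutils does); newer_image_version is a hypothetical helper:

```shell
# Hypothetical helper: print the newer of two dotted image versions.
newer_image_version() {
  # sort -V compares each numeric component, so 1.0.9 sorts before 1.0.10.
  printf '%s\n' "$1" "$2" | sort -V | tail -n 1
}
```

A plain string comparison would rank 1.0.10 below 1.0.9, which is why the sketch relies on version sort instead.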

4. Debug Running Worker (Azure)

# Find running workers
uv run "$WII" workers gecko-t/win11-64-24h2

# Get VM details - extract VM name from workerId (e.g., vm-xyz...)
uv run "$WII" vm-info vm-xyz RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION

5. Direct Azure VM Commands

For deeper investigation, use Azure CLI directly:

# Get Windows build number
az vm run-command invoke --resource-group RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION \
  --name <VM_NAME> --command-id RunPowerShellScript \
  --scripts "(Get-ItemProperty 'HKLM:\\SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion').CurrentBuild"

# Get GenericWorker version
az vm run-command invoke --resource-group RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION \
  --name <VM_NAME> --command-id RunPowerShellScript \
  --scripts "Get-Content C:\\generic-worker\\generic-worker-info.json"

# Get recent Windows updates
az vm run-command invoke --resource-group RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION \
  --name <VM_NAME> --command-id RunPowerShellScript \
  --scripts "Get-HotFix | Sort-Object InstalledOn -Descending | Select-Object -First 10"

# Check for file system filters (AppLocker, etc.)
az vm run-command invoke --resource-group RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION \
  --name <VM_NAME> --command-id RunPowerShellScript \
  --scripts "fltMC"
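The commands above differ only in the VM name and the PowerShell script, so a small wrapper can cut the repetition. This is a sketch under assumptions: vm_ps is a hypothetical name, the resource group defaults to production, and the --query expression assumes the first entry of the returned value array carries the script's stdout, which is typical for RunPowerShellScript but worth verifying against your own output:

```shell
# Hypothetical wrapper around `az vm run-command invoke` for one-off PowerShell.
# Usage: vm_ps <VM_NAME> <POWERSHELL_SCRIPT> [RESOURCE_GROUP]
vm_ps() {
  local vm="$1" script="$2"
  # Default to the production resource group when none is given.
  local rg="${3:-RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION}"
  az vm run-command invoke \
    --resource-group "$rg" \
    --name "$vm" \
    --command-id RunPowerShellScript \
    --scripts "$script" \
    --query 'value[0].message' \
    --output tsv
}
```

For example, vm_ps <VM_NAME> "fltMC" would run the filter check above and print only the message text instead of the full JSON response.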

Common Resource Groups

  • Production: RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION
  • Staging: RG-TASKCLUSTER-WORKER-MANAGER-STAGING

Common Worker Pools

Pool                         Description
gecko-t/win11-64-24h2        Windows 11 24H2 64-bit production
gecko-t/win11-64-24h2-alpha  Windows 11 24H2 64-bit alpha (os-integration)
gecko-t/win11-32-24h2        Windows 11 24H2 32-bit

Azure Resource Groups to Review

  • RG-TASKCLUSTER-WORKER-MANAGER-PRODUCTION contains all Taskcluster Azure machines running Windows 10, Windows 11, and Windows Server.

Stage Taskcluster

For CI tasks on fxci-config PRs:

uv run "$WII" --root-url https://stage.taskcluster.nonprod.cloudops.mozgcp.net \
  investigate <TASK_ID>
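If you switch between instances often, the root URL can be looked up by name rather than retyped. A minimal sketch, assuming the production root URL is the firefox-ci-tc host used in the Usage example; tc_root_url and the prod/stage labels are hypothetical:

```shell
# Hypothetical helper: map an environment name to its Taskcluster root URL.
tc_root_url() {
  case "$1" in
    prod)  printf '%s\n' 'https://firefox-ci-tc.services.mozilla.com' ;;
    stage) printf '%s\n' 'https://stage.taskcluster.nonprod.cloudops.mozgcp.net' ;;
    *)     printf 'unknown environment: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

Usage would look like: uv run "$WII" --root-url "$(tc_root_url stage)" investigate <TASK_ID>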

Output Format

All commands return JSON for easy parsing with jq:

uv run "$WII" investigate <TASK_ID> | jq '.imageVersion'
uv run "$WII" workers gecko-t/win11-64-24h2 | jq '.workers[0].workerId'

SBOM Encoding Note

Some Windows worker SBOM markdown artifacts are UTF-16LE encoded. If text looks garbled, decode before parsing:

curl -sL <SBOM_URL> | iconv -f UTF-16LE -t UTF-8
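Since only some artifacts are UTF-16LE, a script may prefer to decode conditionally. The sketch below assumes the UTF-16LE files begin with a byte order mark (bytes FF FE), which is common for Windows-generated text but not guaranteed; decode_sbom is a hypothetical helper that works on an already downloaded file:

```shell
# Hypothetical helper: print an SBOM file as UTF-8, decoding UTF-16LE if a BOM is found.
decode_sbom() {
  local file="$1" bom
  # Read the first two bytes as lowercase hex, e.g. "fffe" for a UTF-16LE BOM.
  bom="$(head -c 2 "$file" | od -An -tx1 | tr -d ' \n')"
  if [ "$bom" = "fffe" ]; then
    # "UTF-16" lets iconv read the BOM to pick the byte order and drop it from the output.
    iconv -f UTF-16 -t UTF-8 "$file"
  else
    # No BOM: assume the file is already UTF-8 and pass it through unchanged.
    cat "$file"
  fi
}
```

A UTF-16LE file without a BOM would pass through undecoded here, so fall back to the explicit iconv -f UTF-16LE command if the output still looks garbled.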

Related Skills

  • taskcluster: Query task status, logs, artifacts
  • treeherder: Find tasks by revision and job type
  • os-integrations: Run mach try commands for testing

References

  • Worker image configs: fxci-config/worker-images/
  • SBOM files: Check Azure Shared Image Gallery
  • ronin_puppet: Worker configuration management