ml-model-evaluation

📁 kentoshimizu/sw-agent-skills 📅 1 day ago
Install command
npx skills add https://github.com/kentoshimizu/sw-agent-skills --skill ml-model-evaluation


Skill Documentation

ML Model Evaluation

Overview

Use this skill to evaluate models with decision-grade evidence across aggregate and high-risk segments.

Scope Boundaries

  • Use this skill when the task matches the trigger condition in the skill description.
  • Do not use this skill when the primary task falls outside this skill's domain.

Shared References

  • Threshold and segmentation rules:
    • references/threshold-and-segmentation-rules.md

Templates And Assets

  • Evaluation report template:
    • assets/evaluation-report-template.md

Inputs To Gather

  • Dataset splits and baseline/candidate definitions.
  • Business cost trade-offs for false positives/negatives.
  • Segment definitions for fairness/risk-critical cohorts.
  • Acceptance thresholds and calibration requirements.
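The inputs above can be captured up front in a single configuration object so that nothing is left implicit when the evaluation runs. This is a minimal sketch; all field names, model identifiers, and default values are hypothetical placeholders, not part of the skill itself.

```python
from dataclasses import dataclass, field

@dataclass
class EvalConfig:
    # Dataset split and baseline/candidate definitions (hypothetical names).
    test_split: str = "test"
    baseline_model: str = "baseline-v1"
    candidate_model: str = "candidate-v2"
    # Business cost trade-offs: relative cost of each error type.
    cost_false_positive: float = 1.0
    cost_false_negative: float = 5.0
    # Fairness/risk-critical cohorts to evaluate separately (illustrative).
    segments: list = field(default_factory=lambda: ["new_users", "high_value"])
    # Acceptance gates and calibration requirement.
    min_recall: float = 0.80
    max_calibration_error: float = 0.05

cfg = EvalConfig()
```

Writing these down before evaluation starts makes the later acceptance/rejection rationale traceable: every gate in the report points back to a field in the config.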

Deliverables

  • Evaluation report with thresholds and decision.
  • Segment-level failure analysis.
  • Acceptance/rejection rationale and follow-ups.

Workflow

  1. Build the evaluation report from assets/evaluation-report-template.md.
  2. Apply the threshold and segmentation policy in references/threshold-and-segmentation-rules.md.
  3. Validate calibration and check for error concentration in specific segments.
  4. Compare the baseline and candidate models under identical data and conditions.
  5. Publish the release recommendation along with any unresolved risks.
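Steps 3 and 4 hinge on per-segment metrics: aggregate numbers can hide errors that concentrate in one cohort. The sketch below, using only the standard library, shows one way to surface that; the function names and data layout are assumptions for illustration.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) counts for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def segment_report(y_true, y_pred, segment_ids):
    """Per-segment precision/recall so error concentration is visible.

    Run this for both baseline and candidate on the same split so the
    comparison happens under identical conditions.
    """
    report = {}
    for seg in sorted(set(segment_ids)):
        idx = [i for i, s in enumerate(segment_ids) if s == seg]
        tp, fp, fn, tn = confusion(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
        report[seg] = {
            "precision": tp / (tp + fp) if tp + fp else None,
            "recall": tp / (tp + fn) if tp + fn else None,
            "n": len(idx),
        }
    return report
```

A segment whose recall is far below the aggregate is exactly the kind of evidence the report's failure analysis should cite.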

Quality Standard

  • Thresholds are tied to business risk trade-offs.
  • Critical segments are explicitly evaluated.
  • Decision rationale is traceable to evidence.
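Tying thresholds to business risk trade-offs can be made concrete by choosing the score cutoff that minimizes total expected cost, given the relative costs of false positives and false negatives gathered earlier. This is a sketch under that framing; the cost values and function name are illustrative, not prescribed by the skill.

```python
def pick_threshold(y_true, scores, cost_fp, cost_fn):
    """Choose the score cutoff that minimizes total expected business cost.

    cost_fp / cost_fn encode the business trade-off: e.g. if a missed
    positive is five times as costly as a false alarm, pass cost_fn=5.0.
    """
    best_t, best_cost = 0.5, float("inf")
    for t in sorted(set(scores)):
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < t)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```

Because the chosen cutoff is derived from explicit costs, the decision rationale in the report traces directly back to the stated trade-off rather than to an arbitrary 0.5 default.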

Failure Conditions

  • Stop when the evaluation omits high-risk segments.
  • Stop when acceptance thresholds are undefined.
  • Escalate when model risk is unacceptable for rollout.
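The first two stop conditions are mechanically checkable before any recommendation is published. A minimal gating sketch, assuming a per-segment report dict like the one built earlier and a hypothetical `release_gate` helper:

```python
def release_gate(segment_report, required_segments, thresholds):
    """Return blocking issues per the skill's failure conditions (sketch).

    An empty list means the mechanical checks pass; unacceptable model
    risk still requires human escalation and is out of scope here.
    """
    issues = []
    missing = set(required_segments) - set(segment_report)
    if missing:
        issues.append(f"missing high-risk segments: {sorted(missing)}")
    if not thresholds:
        issues.append("acceptance thresholds undefined")
    return issues
```

Running this gate before publishing makes "stop" a default outcome rather than one that depends on a reviewer noticing the omission.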