eval-relevance

📁 whitespectre/ai-assistant-evals 📅 9 days ago

3 total installs
2 weekly installs
#55102 site-wide rank

Install command

npx skills add https://github.com/whitespectre/ai-assistant-evals --skill eval-relevance

Install distribution by agent

opencode 2
claude-code 2
cursor 2
mcpjam 1
openhands 1
zencoder 1

Skill documentation

Eval Relevance

Use this skill to evaluate how relevant an assistant response is to the user’s request.

Inputs

  • Required: the assistant response text to evaluate.
  • Optional: the user’s original request for comparison.

Internal Rubric (1–5)

5 = Directly addresses the user’s request, stays fully on-topic, and prioritizes what the user actually asked
4 = Mostly relevant, minor digressions or small omissions
3 = Partially relevant, addresses the general topic but misses key parts of the request
2 = Weak relevance, significant digressions or failure to address the core request
1 = Not relevant, does not address the user’s request or answers a different question entirely
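For programmatic use, the rubric above can be encoded as a simple score-to-description mapping. This is an illustrative sketch, not part of the skill itself; the constant name is hypothetical.

```python
# Relevance rubric from the skill doc, keyed by integer score (5 = best).
RELEVANCE_RUBRIC = {
    5: "Directly addresses the user's request, stays fully on-topic, and prioritizes what the user actually asked",
    4: "Mostly relevant, minor digressions or small omissions",
    3: "Partially relevant, addresses the general topic but misses key parts of the request",
    2: "Weak relevance, significant digressions or failure to address the core request",
    1: "Not relevant, does not address the user's request or answers a different question entirely",
}
```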

Workflow

  1. Compare the assistant response to the user’s request (if provided).
  2. Score relevance on a 1-5 integer scale using the rubric only.
  3. Write concise rationale tied directly to rubric criteria.
  4. Produce actionable suggestions that improve relevance.

Output Contract

Return JSON only. Do not include markdown, backticks, prose, or extra keys.

Use exactly this schema:

{ "dimension": "relevance", "score": 1, "rationale": "…", "improvement_suggestions": [ "…" ] }

Hard Rules

  • dimension must always equal "relevance".
  • score must be an integer from 1 to 5.
  • rationale must be concise (max 3 sentences).
  • Do not include step-by-step reasoning.
  • improvement_suggestions must be a non-empty array of concrete edits.
  • Never output text outside the JSON object.
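A consumer of this skill might enforce the hard rules mechanically before accepting a judge's output. A minimal validator sketch (the function name is illustrative; the rationale length limit is not checked here):

```python
import json

def validate_relevance_output(raw: str) -> dict:
    """Parse judge output and enforce the hard rules; raise ValueError on violation."""
    obj = json.loads(raw)  # "JSON only" -> any surrounding prose fails to parse
    expected_keys = {"dimension", "score", "rationale", "improvement_suggestions"}
    if set(obj) != expected_keys:
        raise ValueError("output must contain exactly the four schema keys")
    if obj["dimension"] != "relevance":
        raise ValueError("dimension must always equal 'relevance'")
    if not isinstance(obj["score"], int) or not 1 <= obj["score"] <= 5:
        raise ValueError("score must be an integer from 1 to 5")
    suggestions = obj["improvement_suggestions"]
    if not isinstance(suggestions, list) or not suggestions:
        raise ValueError("improvement_suggestions must be a non-empty array")
    return obj
```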