data-generation-quality-metrics-loop
3
总安装量
3
周安装量
#56129
全站排名
Install command
npx skills add https://github.com/shimo4228/claude-code-learned-skills --skill data-generation-quality-metrics-loop
Data Generation Quality Metrics Loop
Extracted: 2026-02-11 Context: A pattern for iteratively improving the quality of scripts that auto-generate large volumes of data, driven by quantitative metrics.
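Problem
Judging the output quality of a data-generation script from a handful of spot checks lets problems slip through. Example: when generating structured data for 411 questions, all 5 spot-checked questions looked fine, yet 23% of the full output was contaminated with intro text.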
Solution
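1. Define quantitative metrics
Design checks that measure the quality of the generated output as numbers:
# Example: quality metrics for structured explanation data
metrics = {
    "intro_contamination": "Number of points that contain intro text (a known lead-in phrase)",
    "long_points": "Number of points that are 150 characters or longer",
    "empty_points": "Number of points that are 5 characters or shorter",
    "empty_key_fields": "Number of entries with an empty required field",
}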
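The metric definitions above could be turned into counting code along the following lines. This is a minimal sketch, not the original script: the record shape (a dict with a `points` list of strings), the `INTRO_MARKER` placeholder, and the `REQUIRED_FIELDS` names are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical placeholders: substitute the real intro phrase and field names.
INTRO_MARKER = "INTRO"
REQUIRED_FIELDS = ("id", "answer")


def compute_metrics(records: list[dict[str, Any]]) -> dict[str, int]:
    """Count quality problems across all records."""
    counts = {
        "total_points": 0,
        "intro_contamination": 0,
        "long_points": 0,
        "empty_points": 0,
        "empty_key_fields": 0,
    }
    for record in records:
        # A record counts once if any required field is missing or empty.
        if any(record.get(field) in (None, "", [], {}) for field in REQUIRED_FIELDS):
            counts["empty_key_fields"] += 1
        for point in record.get("points", []):
            counts["total_points"] += 1
            if INTRO_MARKER in point:
                counts["intro_contamination"] += 1
            if len(point) >= 150:
                counts["long_points"] += 1
            if len(point) <= 5:
                counts["empty_points"] += 1
    return counts
```

Returning raw counts together with `total_points` leaves the percentage calculation to the reporting step, so the same function can serve every generation version.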
2. Run the generate → measure → fix loop
Generate v1 → measure metrics → identify the problem → analyze the cause → fix → generate v2 → ...
Concrete example (this session):
- v1: intro contamination 373/1641 (23%) → added intro-removal logic
- v2: intro contamination 88/1641 (5%) → extended the regular-expression patterns
- v3: intro contamination 2/1641 (0.1%) → handled text split by OCR
- v4: intro contamination 1/1641 (0.06%) → within tolerance (caused by the source data)
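The percentages in this progression follow directly from the counts. As a small sketch, the counts and the denominator below are taken from the example above, while the variable and function names are illustrative assumptions:

```python
# Denominator used in the example above.
TOTAL = 1641

# Contaminated counts per generation version, from the example above.
contaminated_by_version = {"v1": 373, "v2": 88, "v3": 2, "v4": 1}


def contamination_rate(count: int, total: int) -> float:
    """Return the contaminated share as a percentage."""
    return 100.0 * count / total


for version, count in contaminated_by_version.items():
    rate = contamination_rate(count, TOTAL)
    print(f"{version}: {count}/{TOTAL} ({rate:.2f}%)")
```

Logging one such line per version makes it easy to see whether a fix actually moved the metric before starting the next iteration.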
3. Keep the metrics measurement script independent
Keep the measurement code for quality checks separate from the generation script:
# Run after every generation
python3 -c "
import json
with open('data.json') as f: data = json.load(f)
# ... compute metrics ...
print(f'Issue A: {count_a}/{total} ({pct_a}%)')
print(f'Issue B: {count_b}/{total} ({pct_b}%)')
"
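If the one-liner outgrows `python3 -c`, the same idea might live in a small standalone module. The sketch below assumes the generated file is a JSON array of records; the `measure` function and its `checks` mapping of issue names to predicates are hypothetical and not part of the original skill.

```python
import json
from pathlib import Path
from typing import Callable


def measure(path: Path, checks: dict[str, Callable[[dict], bool]]) -> list[str]:
    """Load generated records and return one report line per check."""
    records = json.loads(path.read_text(encoding="utf-8"))
    total = len(records)
    lines = []
    for name, is_bad in checks.items():
        # Count the records that the predicate flags as problematic.
        count = sum(1 for record in records if is_bad(record))
        pct = 100.0 * count / total if total else 0.0
        lines.append(f"{name}: {count}/{total} ({pct:.1f}%)")
    return lines
```

Because this module only reads the generated file, it can be rerun unchanged after every revision of the generator, which keeps the numbers comparable across versions.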
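4. Have criteria for when to stop
- Do not aim for 0% (problems caused by the source data cannot be fixed)
- Once a metric drops to 1% or below, switch to investigating the remaining cases individually
- Record cases that need manual correction as data-quality issues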
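The stopping rule above can be written down as a small decision helper. This is a sketch under assumptions: the 1% threshold comes from the list above, while the function name and the returned labels are illustrative.

```python
def next_action(count: int, total: int, threshold_pct: float = 1.0) -> str:
    """Suggest the next step from the current issue rate."""
    if total <= 0:
        raise ValueError("total must be positive")
    if count == 0:
        return "done"
    pct = 100.0 * count / total
    if pct <= threshold_pct:
        # Few cases remain: look at each one instead of changing the generator.
        return "investigate remaining cases individually"
    return "fix the generator and regenerate"
```

With the counts from the session example, `next_action(373, 1641)` points back to the generator, while `next_action(2, 1641)` suggests case-by-case inspection.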
When to Use
- Bulk conversion or expansion of JSON data
- Automatic generation of test data
- Verifying the output of migration scripts
- Quality assurance for any "input → transform → output" pipeline