swe-bench
Total installs: 3
Weekly installs: 2
Site rank: #57387
Install command:
npx skills add https://github.com/halflifezyf2680/mpm-vibe-coding --skill swe-bench
Installs by agent:
- mcpjam: 2
- gemini-cli: 2
- claude-code: 2
- junie: 2
- windsurf: 2
- zencoder: 2
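Skill documentation
SWE-Bench Standard Solving Workflow
This skill guides you through resolving a GitHub Issue to the rigorous standard of SWE-Bench. The goal is not only to fix the code but also to prove that the fix is correct and has no side effects.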
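Core Principles
- Reproduction First: Before changing any code, write a reproduction script that proves the bug exists.
- Test Driven: The task is complete only when the reproduction script goes from Fail to Pass and no existing test is broken.
- Minimal Changes: Modify only the files that need to change, and avoid refactoring unrelated code.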
Standard Workflow
Follow these 5 phases strictly, in order:
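Phase 1: Issue Analysis
- Read the Issue: Understand the bug symptoms, environment, and reproduction steps the user reported.
- Call tools: Use code_search for a first pass over the error messages or keywords mentioned in the Issue.
- Output: State the "expected behavior" versus the "actual behavior" explicitly.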
Phase 2: Reproduction (the most critical step)
Never skip this step and go straight to editing code!
- Create the script: Using the scripts/reproduce_template.py template, create reproduce_issue.py in the project root.
- Write assertions: The script must contain assert statements:
  - While the bug exists, the script should raise AssertionError or crash (exit code != 0).
  - After the bug is fixed, the script should exit normally (exit code = 0).
- Verify the reproduction:
  python reproduce_issue.py # Expected output: AssertionError or a traceback
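The contents of scripts/reproduce_template.py are not shown here, so the following is only a minimal sketch of what a reproduce_issue.py could look like. The function `parse_version` and its bug are hypothetical stand-ins defined inline; a real script would import the affected function from the project instead.

```python
"""reproduce_issue.py: a hypothetical reproduction script (illustrative only)."""


def parse_version(text):
    # Hypothetical stand-in for the buggy project function.
    # Bug: a leading "v" is not stripped, so int("v1") raises ValueError.
    return tuple(int(part) for part in text.split("."))


def check(func):
    # Expected behavior taken from the (hypothetical) issue: "v1.2.3" -> (1, 2, 3).
    result = func("v1.2.3")
    assert result == (1, 2, 3), f"expected (1, 2, 3), got {result!r}"


def main(func=parse_version):
    """Return the exit code: 1 while the bug exists, 0 once it is fixed."""
    try:
        check(func)
    except Exception as exc:  # an AssertionError or a crash both count as failure
        print(f"BUG REPRODUCED: {type(exc).__name__}: {exc}")
        return 1
    print("OK: expected behavior observed")
    return 0
```

In a standalone script, ending the file with `raise SystemExit(main())` would turn the return value into the process exit code. Letting the exception propagate uncaught would also give a non-zero exit code along with a traceback; returning the code from `main` is just a choice that keeps the check easy to call from other code.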
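Phase 3: Localization
- AST localization: Use code_search(search_type="function") to find the relevant function definitions.
- Call analysis: Use code_impact(direction="both") to inspect the call chain and determine how far the change reaches.
- Confirm the root cause: Read the source and find the exact location of the logic flaw.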
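The code_search and code_impact tools belong to the skill's environment and their internals are not documented here. As a rough illustration of what AST-based localization means, the sketch below uses Python's standard `ast` module to list every definition of a function name in a source string. The helper `find_function_defs` and the `SAMPLE` source are hypothetical, and this is not how code_search itself is implemented.

```python
import ast


def find_function_defs(source, name):
    """Return (line, qualified_name) for each function called `name` in source."""
    matches = []

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                qualified = f"{prefix}{child.name}"
                # Classes only contribute to the qualified name; functions can match.
                if not isinstance(child, ast.ClassDef) and child.name == name:
                    matches.append((child.lineno, qualified))
                visit(child, qualified + ".")
            else:
                visit(child, prefix)

    visit(ast.parse(source), "")
    return matches


# Hypothetical source with two definitions that share a name.
SAMPLE = '''
class Parser:
    def parse(self, text):
        return text.split(".")

def parse(text):
    return Parser().parse(text)
'''
```

Calling `find_function_defs(SAMPLE, "parse")` reports both the method and the module-level function, which helps when an Issue names a function that exists in more than one place.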
Phase 4: TDD Fixing
- Develop the fix: Modify the code to fix the bug.
- Incremental tests: If needed, add new test case files to the project's test suite (for example, under the tests/ directory).
- Lint check: Make sure the code style follows the project's conventions.
Phase 5: Verification
- Verify the reproduction script (Fail-to-Pass):
  python reproduce_issue.py # Expected output: no errors, normal exit
- Verify the existing tests (Pass-to-Pass): Run the existing tests related to the modified module to make sure there is no regression.
  pytest tests/path/to/relevant_tests.py
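The two checks above can also be scripted so that both exit codes are recorded in one place. This is a minimal sketch using only the standard library. The helper names `run_checks` and `all_passed` are hypothetical, and the test path is the placeholder from the text, not a real file.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order and return a list of (command, exit_code)."""
    results = []
    for command in commands:
        completed = subprocess.run(command, capture_output=True, text=True)
        results.append((command, completed.returncode))
    return results


def all_passed(results):
    """True only if every command exited with code 0."""
    return all(code == 0 for _, code in results)


# The two Phase 5 checks; the pytest path is a placeholder to be replaced.
PHASE5_COMMANDS = [
    [sys.executable, "reproduce_issue.py"],  # Fail-to-Pass
    [sys.executable, "-m", "pytest", "tests/path/to/relevant_tests.py"],  # Pass-to-Pass
]
```

Running `all_passed(run_checks(PHASE5_COMMANDS))` from the project root returns True only when both the reproduction script and the selected tests exit with code 0.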
🛠️ Common Tools
- reproduce_issue.py: The reproduction script, which must be created.
- code_search: Find definitions.
- code_impact: Assess impact.
- run_command: Run test commands.
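⚠️ Pitfalls
- Pitfall 1: Changing code without writing a reproduction script. -> Consequence: You cannot prove that you passed the SWE-Bench evaluation.
- Pitfall 2: Modifying too many unrelated files. -> Consequence: New bugs are introduced and the score drops.
- Pitfall 3: The tests here pass, but other modules are broken. -> Consequence: The relevant regression tests must be run.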