trustworthy-experiments
Total installs: 30
Weekly installs: 27
Site rank: #12295
Install command
npx skills add https://github.com/pmprompt/claude-plugin-product-management --skill trustworthy-experiments
Installs by agent
- gemini-cli: 24
- opencode: 23
- github-copilot: 23
- codex: 23
- kimi-cli: 23
- amp: 23
Skill documentation
Trustworthy Experiments
What It Is
Trustworthy Experiments is a framework for running controlled experiments (A/B tests) that produce reliable, actionable results. The core insight: most experiments fail, and many “successful” results are actually false positives.
The key shift: Move from “Did the experiment show a positive result?” to “Can I trust this result enough to act on it?”
Ronny Kohavi, who built experimentation platforms at Microsoft, Amazon, and Airbnb, found that:
- 66-92% of experiments fail to improve the target metric
- 8% of experiments have invalid results due to sample ratio mismatch alone
- When the base success rate is 8%, a statistically significant result at p < 0.05 still carries about a 26% false positive risk (the chance that the "win" reflects no real effect)
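The 26% figure can be reproduced with Bayes' rule. The sketch below is illustrative only: the function name is made up here, and it assumes 80% statistical power and a two-sided test at 0.05 in which only the favourable tail (0.025) counts as a win. Neither assumption is stated in the text above.

```python
def false_positive_risk(prior_success_rate, alpha=0.05, power=0.80):
    """Probability that a statistically significant 'win' has no real effect.

    Assumptions (not from the source text): 80% power, and a two-sided
    test where only the favourable tail counts as a win.
    """
    one_sided_alpha = alpha / 2
    # Experiments with no real effect that still look like wins.
    false_wins = one_sided_alpha * (1 - prior_success_rate)
    # Experiments with a real effect that are correctly detected.
    true_wins = power * prior_success_rate
    return false_wins / (false_wins + true_wins)


risk = false_positive_risk(prior_success_rate=0.08)
print(f"False positive risk: {risk:.1%}")  # False positive risk: 26.4%
```

Under these assumptions, a lower prior success rate pushes the risk up, which is why a single significant result deserves less trust in domains where wins are rare.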
When to Use It
Use Trustworthy Experiments when you need to:
- Design an A/B test that will produce valid, actionable results
- Determine sample size and runtime for statistical power
- Validate experiment results before making ship/no-ship decisions
- Build an experimentation culture at your company
- Choose metrics (OEC) that balance short-term gains with long-term value
- Diagnose why results look suspicious (Twyman’s Law)
- Speed up experimentation without sacrificing validity
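One common validity check before trusting a result is a sample ratio mismatch (SRM) test: compare the observed user counts per variant with the designed split. The following is a minimal sketch using a chi-square goodness-of-fit test with one degree of freedom. The function name, the example counts, and the 0.001 alert threshold are assumptions for illustration, not part of this skill.

```python
import math


def srm_p_value(control_users, treatment_users, expected_control_share=0.5):
    """p-value of a chi-square test that the observed split matches the design."""
    total = control_users + treatment_users
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = (
        (control_users - expected_control) ** 2 / expected_control
        + (treatment_users - expected_treatment) ** 2 / expected_treatment
    )
    # For 1 degree of freedom, the chi-square survival function equals
    # erfc(sqrt(chi2 / 2)), so no third-party library is needed.
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts for an experiment designed as a 50/50 split.
p = srm_p_value(control_users=50_000, treatment_users=51_200)
if p < 0.001:  # assumed alert threshold
    print("Possible sample ratio mismatch: investigate before reading metrics")
```

A very small p-value here suggests a problem in assignment or logging, so the metric results should not be interpreted until the cause is found.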
When Not to Use It
Don’t use controlled experiments when:
- You don’t have enough users: need tens of thousands minimum
- The decision is one-time: can’t A/B test mergers or acquisitions
- There’s no real user choice: employer-mandated software
- You need immediate decisions: experiments need time
- The metric can’t be measured: no experiment without observable outcomes
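To see why small user bases rule out experiments, a rough sample-size estimate helps. The sketch below uses the widely quoted rule of thumb n = 16 * variance / delta^2 per variant, which approximates 80% power at a 0.05 significance level. The function name, the 5% baseline conversion rate, and the 5% relative lift are hypothetical inputs chosen for illustration.

```python
import math


def users_per_variant(baseline_rate, relative_lift):
    """Approximate users needed per variant for a conversion-rate metric.

    Rule of thumb: n = 16 * variance / delta^2
    (roughly 80% power at a 0.05 significance level).
    """
    variance = baseline_rate * (1 - baseline_rate)  # Bernoulli variance
    delta = baseline_rate * relative_lift  # absolute change to detect
    return math.ceil(16 * variance / delta**2)


# Hypothetical scenario: 5% baseline conversion, detect a 5% relative lift.
n = users_per_variant(baseline_rate=0.05, relative_lift=0.05)
print(f"Roughly {n:,} users per variant")
```

With these inputs the estimate lands above 100,000 users per variant, and halving the detectable lift roughly quadruples it, so products with small audiences can only detect very large effects.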
Resources
Book:
- Trustworthy Online Controlled Experiments by Ronny Kohavi, Diane Tang, and Ya Xu