Design statistically sound experiments with clear hypotheses and sample size calculations.
/ab-test-designer
2 hrs → 20 min compared to doing it manually
Type /ab-test-designer in Claude to run the skill.
Underpowered experiments waste weeks of traffic and produce inconclusive results. Without proper planning, you're flipping coins and calling it data-driven.
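As a rough illustration of the sample size planning mentioned above, here is a minimal sketch for a two-variant conversion test. It assumes a two-sided z-test on proportions with the standard normal-approximation formula; the function name, parameter names, and defaults (5% significance, 80% power) are illustrative assumptions, not the skill's actual implementation.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (hypothetical helper).

    baseline_rate: current conversion rate, e.g. 0.10 for 10%
    relative_lift: smallest relative improvement worth detecting, e.g. 0.10 for +10%
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    # Critical values for a two-sided test at level alpha and the desired power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the per-variant Bernoulli variances (unpooled approximation).
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Detecting a 10% relative lift on a 10% baseline needs far more
    # traffic than detecting a 50% relative lift.
    print(sample_size_per_variant(0.10, 0.10))
    print(sample_size_per_variant(0.10, 0.50))
```

Halving the detectable lift roughly quadruples the required sample, which is why small expected effects on low-traffic pages often cannot be tested in a reasonable time.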
Agent workflows chain multiple skills into one command.
.claude/skills/ folder in your project
/ab-test-designer in Claude to run the skill
/metric-framework-builder: Build comprehensive metrics frameworks using the AARRR pirate metrics or input/output methodology.
/funnel-analyzer: Diagnose conversion funnel problems and generate data-backed improvement hypotheses.
/experiment-designer: Design A/B tests with proper methodology, sample sizes, and success criteria.
/ab-test-analyzer: Interpret experiment results with statistical rigor and clear ship/no-ship recommendations.
A/B test when: you have enough traffic for statistical significance, the change is measurable, and the risk of the change warrants validation. Don't A/B test obvious fixes or low-traffic features.
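Run a test until it has reached the sample size you planned for your significance threshold (usually 95% confidence) AND covered at least one full business cycle (typically 1-2 weeks). Don't stop tests early just because results look good: checking repeatedly and stopping at the first significant reading inflates the false-positive rate.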
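The duration rule above can be sketched as a small calculation: divide the total required sample by daily traffic, then round up to whole business cycles. The function and parameter names are hypothetical, and a 7-day cycle with evenly split, steady traffic is an assumption.

```python
import math


def planned_duration_days(sample_per_variant, variants, daily_visitors, cycle_days=7):
    """Days to run a test, rounded up to full business cycles (hypothetical helper)."""
    total_sample = sample_per_variant * variants
    # Days needed to collect the sample at the assumed steady traffic level.
    raw_days = math.ceil(total_sample / daily_visitors)
    # Always run at least one full cycle so every weekday is represented.
    cycles = max(1, math.ceil(raw_days / cycle_days))
    return cycles * cycle_days


if __name__ == "__main__":
    # 14,749 visitors per variant, 2 variants, 3,000 visitors per day
    print(planned_duration_days(14749, 2, 3000))  # 14
```

Fixing the duration before launch also removes the temptation to stop as soon as the numbers look favorable.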
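Statistical significance at the usual 95% level means that, if the change truly had no effect, a difference at least this large would appear less than 5% of the time through random variation alone. It's not about how big the difference is; it's about how strong the evidence is that the difference is real.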
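To make the distinction between effect size and significance concrete, here is a minimal sketch of a two-sided, pooled two-proportion z-test. This is one common choice for conversion data, not necessarily what the skill uses, and the function name is hypothetical. It assumes samples large enough for the normal approximation and at least one conversion and one non-conversion overall.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (hypothetical helper)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Same observed lift (10.0% vs 11.0%), different sample sizes.
    print(two_proportion_p_value(100, 1000, 110, 1000))      # not significant at 5%
    print(two_proportion_p_value(1000, 10000, 1100, 10000))  # significant at 5%
```

Both calls compare the same observed rates, yet only the larger sample crosses the 5% threshold: significance reflects the amount of evidence, not the size of the lift.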
Run this skill inside your PM Operating System, or download it on its own.
Use all 70 skills, workflows, and sub-agents in a system that knows your company, product, and customers.