The A/B testing industrial complex has sold a generation of marketers on a seductive idea: that growth is mostly a matter of running enough variants. Ship more tests, read more dashboards, iterate your way to greatness. Most of the time, this advice is either misapplied or actively harmful.
Here's the uncomfortable part: if your homepage has fewer than 30,000 unique visitors a month, most of the tests you're running are statistical theatre. The sample sizes are too small, the confidence intervals are too wide, and the "winners" you ship are, more often than not, noise. You are optimising a metric that isn't really moving.
What the math actually says
For a hero headline test to detect a 10% relative lift at 95% confidence and 80% power, with a baseline conversion rate of 2% (so 2.0% against 2.2%), you need roughly 80,000 sessions per variant. Two variants at 2,000 sessions a day: close to three months to hit significance. Three variants: four months. Doing six simultaneously? You're not running an experiment. You're running a seance.
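If you want to check that figure yourself, here's a back-of-envelope sketch using the standard two-proportion formula (normal approximation). The function name and defaults are ours for illustration, not any particular testing tool's API:

```python
from math import ceil
from statistics import NormalDist

def sessions_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    # Rough sample size per variant for a two-sided two-proportion z-test.
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

print(sessions_per_variant(0.02, 0.10))  # roughly 80,000 sessions per variant
```

Halve the lift you're trying to detect and the requirement roughly quadruples. The arithmetic is unforgiving.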
Most mid-market homepages do not come close to that traffic. Which means the teams running the loudest testing programmes are usually the ones generating the least reliable results.
The better question
If the statistics aren't there, what is testing actually for? Two things, honestly:
- Guardrails. You want to know that the thing you're about to ship doesn't crater conversion. Even a small sample will catch a disaster (the sketch below puts a rough number on that).
- Calibration. You want to see which direction your instincts run wrong. Over six months, you'll learn things about your audience that survive across tests — even if no individual test hits significance.
Neither of those justifies a permanent optimisation team of three. Both can be handled by someone with good instincts running one test at a time, one week apart.
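For a sense of what the guardrail point means in numbers, you can flip the same approximation around and ask how big a swing a given sample can reliably detect. Again a sketch with an illustrative function of our own, using the 2% baseline from earlier:

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_relative_change(baseline, n_per_variant, alpha=0.05, power=0.80):
    # Smallest relative change a test of this size can reliably flag,
    # under the same normal approximation as above.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Approximate both variants' variance with the baseline's.
    delta = (z_alpha + z_beta) * sqrt(2 * baseline * (1 - baseline) / n_per_variant)
    return delta / baseline

# A week of small-site traffic: ~1,000 sessions per variant at a 2% baseline.
print(round(min_detectable_relative_change(0.02, 1_000), 2))  # ~0.88, i.e. an ~88% swing
```

In other words, a week of traffic on a small site will only flag changes in the catastrophic range, which is exactly what a guardrail is for.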
The thing you should be doing instead
Here's the trade-off nobody wants to name: the same hour spent designing your seventh hero variant could be spent rewriting the page from scratch with a clearer positioning hypothesis. Testing is local search. Rewriting is global search. If your homepage is underperforming, the answer is almost never "the CTA button should say 'Start free' instead of 'Get started'." The answer is almost always "this page is answering the wrong question."
We've sat in the war room and watched a team kill a three-month testing programme, replace the homepage with a version written in 48 hours by one person who had actually talked to ten customers, and move a lagging metric 40% in a single quarter. That's not an A/B win. That's a decision.
When testing actually earns its keep
- You're at scale: >50k sessions/month, stable traffic, clean analytics.
- You've already done the rewrite work, and you're tuning inside a page you believe in.
- You have a specific hypothesis backed by qualitative research — not a vague "let's try some variants."
- You're willing to kill the programme the moment it stops generating compounding lifts.
If those four don't describe you, stop testing. Go talk to customers. Rewrite the page. Ship it. Look at the number in 90 days. Iterate the positioning, not the button.
Testing is not a substitute for having something to say.