I went to college twice for practically nothing that I’m paid to do now, in part due to my incredible ability to speak about any subject at length regardless of whether or not I actually know anything about it. I was once asked if I knew anything about A/B testing and I pulled on a wealth of knowledge from an undergraduate degree in Political Science to statistic-ify my shoddy answer. But they bought it.
When I actually started running tests, I perceived testing as a clinical, empirical activity, one devoid of personal experience or philosophy. What I realized, however, was that opinions and approaches vary as much as anything else. A personal ethos underlies how you approach testing, and it very much influences how your testing program will perform.
People Love Testing
My experience is that people love the concept of testing. Real-world experimentation smacks of all the curiosity of a middle school science fair project, with all the promise of making any business owner’s dream come true.
A/B testing makes numbers and fairy dust feel like real science.
Imagine it: a process that has some semblance of actual science and is designed to improve practically any business objective. Intrinsically measurable. Often accompanied by a bevy of line and bar charts. All incredibly sexy, but nothing as alluring as knowledge. Testing gives definition to conversations with marketing folks that are often bullshit.
It feels real and for many people it is.
Early Mistakes In A/B Testing
The logic behind A/B testing is easy to understand. The Pepsi Challenge, where random participants in a blind taste test were asked to choose their favorite soft drink, was simple enough for anyone to get: two things are tested and one of them outperforms the other. If you do this enough times and the result is the same, well, that’s the Truth.
The problem wasn’t in the logic, it was with the philosophy. The biggest mistake I made when I started A/B testing was not making enough mistakes.
Fear played a lot into it. I was new, and the pressure to improve conversions through testing was incredibly high. In retrospect, this was unfortunate, since fear—the fear of making mistakes—is the antithesis of what testing is: sheer exploration, luck, and nerve.
The things I tested were bad. Not because they were bad variables, but because they were too small and iterative. Part of this was influenced by the tabloid-like cornucopia of blog posts about how changing the color of a call-to-action results in an unbelievable lift in conversions.
I’m not saying things like this don’t happen, but I also know they make good headlines. The point is that tiny variables produce tiny changes, or changes not significant enough to justify running a test in the first place.
Most people think that when you run a test or experiment, one variation simply has to outperform another; in reality, a variation has to outperform the others by a certain margin. This is part of the methodology behind many A/B testing tools: they use statistical models (e.g. a chi-squared test) that measure how far one variation’s results are from an expected value, and whether that difference is statistically significant.
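To make the “certain margin” point concrete, here is a minimal sketch of the chi-squared test on a 2×2 table (converted vs. not, for two variations). The traffic and conversion numbers are hypothetical, invented for illustration; real tools wrap this same idea in nicer reporting.

```python
def chi_squared_2x2(conv_a, total_a, conv_b, total_b):
    """Chi-squared statistic for a 2x2 contingency table:
    rows = variations A and B, columns = converted / did not convert."""
    observed = [
        [conv_a, total_a - conv_a],
        [conv_b, total_b - conv_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two variations truly performed the same.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Critical value for 1 degree of freedom at p = 0.05.
CRITICAL_95 = 3.841

# Hypothetical test: A converts 200/5000 (4.0%), B converts 260/5000 (5.2%).
chi2 = chi_squared_2x2(200, 5000, 260, 5000)
significant = chi2 > CRITICAL_95  # True: the gap is big enough to call
```

Note that a smaller gap on the same traffic, say 200 vs. 210 conversions, yields a statistic well under 3.841: B “outperformed” A, but not by enough to mean anything.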
A Dark Room: A Framework for A/B Testing
An approach I like better is what I call the Dark Room Theory.
Imagine you’re in a dark room—too dark for you to make out anything but the faintest of details. Your objective is to leave this room, but you have to find the door in order to do that. This is A/B testing.
1. Take Tiny Steps
You could take tiny steps. Methodical micro-movements that map out the layout of the room in a grid. Like a cartographer, you rely on the logic and accuracy of your mapping to eventually find the door.
But time is a factor too. You might physically die of hunger or thirst before leaving this room (e.g. running out of money or time to test with). Plotting every portion of the room is the equivalent of testing every single variable. It’s physically impossible or only possible at an incredible cost.
This is basically the approach I took when I first started A/B testing. It’s logical and structured, but costly in both time and budget. After all, this is marketing, not a double-blind placebo trial for a drug. We’re not proving knowledge; we’re making sketchy marketing campaigns perform better.
2. Run Around The Room At Full Speed
Another approach is to literally run around the room at full speed until you find the door. Since it’s pitch black, you won’t have any clue where to run, so you’re randomly guessing at where the door might be. You’ll probably smack into furniture, walls, and anything else in your way.
It can work if you’re lucky; you may in fact find the door on the first try. If it doesn’t work, it gets increasingly hard to justify continuing to look for it at all. Imagine advocating that a client continue to A/B test after six months of testing without a single win.
It’s hard to derive much value from this philosophy because it isn’t built on logic, which means it also isn’t typically using a hypothesis-centered approach. This is the wild west of testing philosophies.
3. Develop Hypotheses, Rinse, Repeat
I’ve presented some extreme methods for getting out of this hellish dark room, mainly to contrast what sane people do when they test: develop hypotheses based on information, test them, and repeat until the objective is met.
One easy hypothesis is that the door exists on a wall. If that turns out not to be true, it’s possible the door is in the floor or ceiling, which creates an entirely new hypothesis based on the previous experiment.
Unlike aimlessly bouncing around a room (which doesn’t build on previous knowledge or data) or methodically plotting out every A/B testing idea you’ve read about (which is absolutely a waste), this approach balances exploration, speed, and effectiveness.
Good Methods Still Matter
Regardless of your theoretical approach to deciding how and what to focus on in testing, methods and process are critical for any philosophy.
1. Hypothesis Development
Every test needs a hypothesis: a stated objective or theory for the test.
Tests should be evolutionary over time. Good tests are built upon the shoulders of old, bad A/B tests.
Every test should be tracked and documented. If you’re serious about testing, you can run an unbelievable number of them in a short period of time, and keeping the themes straight in your head is critical.
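Tracking doesn’t need to mean fancy tooling. A minimal sketch of a test log, with made-up field names and test names of my own invention, might look like this; the `builds_on` field is what lets new tests stand on the shoulders of old, bad ones:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ABTest:
    """One record in a running log of experiments."""
    name: str
    hypothesis: str          # the stated theory the test evaluates
    started: date
    builds_on: str = ""      # name of the earlier test this one evolves from
    result: str = "running"  # e.g. "won", "lost", "inconclusive"

log = [
    ABTest("cta-copy-v1", "Benefit-led button copy lifts signups",
           date(2014, 1, 6), result="inconclusive"),
    ABTest("cta-copy-v2", "Shorter benefit-led copy lifts signups",
           date(2014, 2, 3), builds_on="cta-copy-v1"),
]

# Keeping themes straight: pull every test in one lineage out of the log.
lineage = [t.name for t in log
           if t.name == "cta-copy-v1" or t.builds_on == "cta-copy-v1"]
```

Even a spreadsheet with these five columns beats relying on memory once the test count climbs.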