The ABCs of A/B Testing

Updated on March 2, 2017.

When reading advice for digital marketing, it’s important to remember that what works for someone else might not work best for you and your audience. While some practices are definite yesses—like keeping subject lines short and attention-grabbing—others may need to be tweaked to best fit your audience and purpose. The best way to figure out what works is to test, test, test.

A. All About A/B Testing

You may have heard of A/B testing (or “split testing”) as a method for testing the success of marketing materials. A/B testing involves creating two versions of an email, landing page, web page, etc. that differ in a single aspect. This might be the subject line of an email, the placement of a CTA, phrasing, color choices, etc. There are many different things you could test, but you should only test one aspect at a time; otherwise, you won’t know which change triggered the difference in audience response.

Once you have created the two versions, make arrangements so half of your audience sees version A, and the other half sees version B. If you are testing something on your website, A/B testing software (such as Optimizely) can randomly distribute site visitors to the two versions for you. The audience distribution must be random so no other factors (like gender or age) come into play and skew the results.
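If you are splitting traffic yourself rather than relying on a tool, the assignment logic itself is simple. Below is a minimal Python sketch; the assign_variant helper and the experiment name are hypothetical, not taken from any particular tool. It buckets each visitor randomly but consistently, so a returning visitor always sees the same version.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "subject-line-test") -> str:
    """Deterministically bucket a visitor into version A or B.

    Hashing the user ID together with the experiment name gives a
    roughly 50/50 random split while guaranteeing that the same
    visitor sees the same version on every visit.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

# Example: bucket a handful of visitors.
for visitor in ["u-1001", "u-1002", "u-1003", "u-1004"]:
    print(visitor, "sees version", assign_variant(visitor))
```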

B. Bigger Sample Sizes Are Better

The larger the audience you have during your A/B testing, the more confident you can be in the results. You can control the sample size by compiling a large email list or by watching the traffic on your website (the longer you keep the test running, the more site visitors you’ll reach). The general consensus is that you should test with at least 1,000 users.

But what about when you want to test an email that’s only sending once? If you want to make sure your audience is getting the most successful version, you can A/B test the email to a smaller portion of the audience, and send the winner to the rest. To do this effectively, you’ll want to A/B test with the smallest audience that still yields statistically significant results. Luckily, a sample size calculator can help you figure that out. If you need a little more guidance, HubSpot has a great article on determining sample size.
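If you prefer to compute the number yourself, the standard power calculation for comparing two conversion rates takes only a few lines. Here is a rough Python sketch using the statsmodels library; the baseline conversion rate, target lift, and other parameters are assumptions for illustration, so plug in your own numbers.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Assumed numbers for illustration: a 10% baseline conversion rate,
# and we want to detect a lift to 12% with 95% confidence and 80% power.
baseline_rate = 0.10
target_rate = 0.12

effect_size = abs(proportion_effectsize(baseline_rate, target_rate))
n_per_version = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,        # 5% chance of declaring a winner that isn't real
    power=0.80,        # 80% chance of detecting a real lift of this size
    alternative="two-sided",
)
print(f"Visitors needed per version: {round(n_per_version)}")
```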

C. Calculate Significance

You’ve pinned down your audience size and users have interacted with your tests, so what do these numbers mean? You may be hesitant to declare a winner if the difference in audience interaction with versions A and B is small, but there’s a tool out there to help you determine if the difference is significant. Visual Website Optimizer has a handy A/B Split Test Significance Calculator on their site. The calculator will tell you if your results are significant based on the sample size and number of conversions for both test versions.
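If you would like to see the math behind a calculator like that, the same comparison can be run as a standard two-proportion z-test. Here is a small Python sketch; the conversion counts below are made up purely for illustration.

```python
from statsmodels.stats.proportion import proportions_ztest

# Made-up results: version A converted 210 of 2,400 recipients,
# version B converted 260 of 2,400 recipients.
conversions = [210, 260]
audience = [2400, 2400]

z_stat, p_value = proportions_ztest(conversions, audience)
print(f"p-value: {p_value:.4f}")
if p_value < 0.05:
    print("The difference is significant at 95% confidence; ship the winner.")
else:
    print("The difference is not significant; treat the test as a tie.")
```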

Generally, you’ll want a higher confidence threshold for smaller, more specific tests, like changing a CTA color, since it’s such a minute change. Radical changes can get by with a lower confidence threshold, as the change is more likely to have a large effect on the audience.

Results Are Inconclusive, Now What?

Inconclusive results from an A/B test may seem frustrating, but there are still things to learn from them. First, look into the segments. Your audience as a whole may show inconclusive results, but after segmenting by device, user type, or other dimensions that make sense for your test, clearer results may emerge.

ConversionXL gave the example of Brian Massey from Conversion Sciences, who found inconclusive results after A/B testing video traffic on an apparel site. After segmenting users, he found that new visitors preferred longer videos while returning visitors preferred shorter videos, among other results. The important thing to remember here is that segments still need to have a significant sample size to count the results as conclusive.
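One straightforward way to dig into segments is to rerun the significance test within each slice of the audience. Here is a rough Python sketch, assuming (hypothetically) that your results are collected in a pandas DataFrame with visitor and conversion counts per segment and version; the numbers are invented.

```python
import pandas as pd
from statsmodels.stats.proportion import proportions_ztest

# Made-up per-segment results for a hypothetical video-length test.
results = pd.DataFrame({
    "segment":     ["new", "new", "returning", "returning"],
    "variant":     ["A",   "B",   "A",         "B"],
    "visitors":    [1200,  1180,  900,         950],
    "conversions": [96,    132,   81,          72],
})

# Run the two-proportion z-test separately for each segment.
for segment, rows in results.groupby("segment"):
    rows = rows.set_index("variant")
    _, p_value = proportions_ztest(
        rows.loc[["A", "B"], "conversions"],
        rows.loc[["A", "B"], "visitors"],
    )
    print(f"{segment}: p = {p_value:.3f}")
```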

Inconclusive results can also tell us which changes aren’t influential. If changing the color of a CTA comes up inconclusive, you can safely assume that any of the tested colors will do just fine. These results can be just as valuable, as you can now focus on testing another aspect that may be more influential.

If you are repeatedly getting inconclusive results, you may need to think about the changes you are testing. Are they big enough? Instead of looking at CTA color, maybe the phrasing or placement could be tested.

A/B testing will help you take the guesswork out of marketing. Think an orange CTA may drive more conversions than a yellow one? Test it! And even if results are inconclusive, you have answered your question and can move on to the next one. The best thing about testing is that YOU are in charge of determining what works for YOUR business. We know your business is unique; it’s time to find out what’s best by your standards.
