What Is A/B Testing and How to Do It Correctly


A/B testing is a method of comparing two versions of a page or application to see which one performs better at a particular task. It is one of the most popular methods for improving the performance of digital products: websites, mobile apps, SaaS products, and newsletters.

With controlled experiments, marketers, product managers and developers can quickly check whether their creative ideas are viable and, based on objective data, flexibly incorporate the most successful ones into the product. In short, you no longer need to speculate about why one version is better than another: the experiment speaks for itself. This method is ideal when you need to increase conversion rates, boost revenue, grow your subscriber base, or attract more clients and leads.

Global companies like Google, Amazon, Netflix and Facebook have built their business processes in such a way that they conduct thousands of experiments a year and put the results into action quickly.

Jeff Bezos once said: “Amazon’s success depends on how many experiments we can do per year, per month, per week, per day.”

Netflix’s technology blog in April 2016 included this phrase: “Thanks to our empirical approach, we can make sure that product changes are dictated not by the tastes and opinions of the company’s most authoritative employees, but by objective data. That is, our viewers themselves tell us what they like.”

And Mark Zuckerberg confessed in an interview that Facebook’s success is due to its unique testing system, of which the entrepreneur is very proud: “At any given second, there is not one version of Facebook running in the world, but about 10,000.”

What is an A/B test?

In a classic A/B test, the first thing we do is define what we are going to test and what our objective is. Then we create one or more variations of the initial (control) web element. Next, we randomly split the traffic between the versions (typically 50/50, i.e. each user is assigned to a version with a fixed probability) and collect data (metrics) on how each version of the page performs. After some time, we analyze the data, keep the version that performed better, and switch off the less successful one.
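To make the random split concrete, here is a minimal sketch (in Python, not tied to any particular testing platform) of deterministic bucketing: the user ID is hashed so that every visitor keeps seeing the same version while traffic is divided roughly in half.

```python
import hashlib

def assign_variation(user_id: str, experiment: str, variations=("A", "B")) -> str:
    """Deterministically assign a user to a variation.

    Hashing (experiment + user_id) gives every visitor a stable bucket,
    so the same person always sees the same version, while traffic is
    split roughly evenly between the variations.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variations)
    return variations[bucket]

# Example: the same user always lands in the same bucket
print(assign_variation("user-42", "homepage-cta"))
```

Hashing rather than flipping a coin on every visit keeps the experience consistent across sessions for the same user, which is what most testing tools do under the hood.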

It’s very important to conduct tests correctly: otherwise you will not only fail to get meaningful, useful results, but may also be led down the wrong path. In general, controlled experiments can help with the following tasks:

  • Fix UX deficiencies and common customer barriers (pain points)
  • Improve the effectiveness of existing traffic (increase conversions and revenue, optimize customer acquisition costs)
  • Raise engagement (reduce the bounce rate, increase the click-through rate)

We need to remember that when we favor a particular option, we are essentially scaling (rolling out) the results we have obtained so far to the entire audience of potential users. That is a real leap of faith, and every such action must be justified. If we implement solutions without solid evidence, sooner or later we will make a wrong move that hurts the product in the long run. The process of gathering this evidence is called hypothesis testing, and the evidence we look for is statistical significance.
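To make “statistical significance” concrete, here is a minimal sketch of a two-proportion z-test on made-up conversion counts; the numbers and the 0.05 threshold are assumptions for illustration, not a prescription.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided p-value
    return z, p_value

# Hypothetical data: 200/4000 conversions on the control, 240/4000 on the variation
z, p = two_proportion_z_test(200, 4000, 240, 4000)
print(f"z = {z:.2f}, p-value = {p:.3f}")  # roll out only if p is below the chosen threshold
```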

Here are a few examples of what is tested through A/B tests:

  • Different types of sorting in site navigation menus (as in this example of a large German electronics retailer)
  • Landing pages (as in this example from a leading European airline passenger protection company)
  • Marketing messages: e.g., newsletters or banners (like this example of an international natural cosmetics retailer)

How an A/B test is born: Formulate a hypothesis

At the heart of any A/B test is a problem that we need to solve or some user behavior that we need to change/reinforce. Having identified the problem or challenge, the marketer formulates a hypothesis – an educated guess that will either be confirmed or disproved by the experiment.

An example of a hypothesis: If we add a social proof icon to a product page, visitors will learn about the popularity of our product and the number of cart additions will increase by 10%.

In this case, when we have identified the problem (low rate of additions to cart, for example) and formulated the hypothesis (displaying the icon with social proof encourages users to add the product to cart more often), we can proceed with testing on the site.
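Before launching such a test, it also helps to estimate how much traffic you need to detect the expected effect. Below is a minimal sketch using the standard sample-size approximation for comparing two proportions; the 5% baseline add-to-cart rate is an assumption, and the 10% relative lift comes from the hypothesis above.

```python
from statistics import NormalDist

def sample_size_per_variation(p_base, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variation to detect a relative lift
    in a conversion rate (two-sided test on two proportions)."""
    p_new = p_base * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return int(((z_alpha + z_beta) ** 2 * variance) / (p_new - p_base) ** 2) + 1

# Assumed 5% baseline add-to-cart rate and the 10% relative lift from the hypothesis
print(sample_size_per_variation(0.05, 0.10))  # roughly 31,000 visitors per variation
```

Small expected lifts on low baseline rates require a lot of traffic, which is why the hypothesis should state the expected effect explicitly.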

Classic approach to A/B testing

In a simple A/B test, traffic is distributed between two versions of content. One version – with the original content and design – is considered the control. The second version is a variation. Variations can take many forms: for example, you can test different headlines, call-to-action buttons, layouts, and designs.

In a classic single-page-level experiment, we don’t even need to make two URLs to test. Most A/B testing solutions allow you to dynamically change the content, layout, and page design.

However, if you want to include two or more sets of pages in the test, you need to run a split test and use multiple URLs.

When to do split testing

Split testing (sometimes called multi-page testing) is generally similar to A/B testing, but allows you to experiment using a separate URL for each variation. In other words, split testing can be done between two existing URLs, which is especially useful if you have dynamic content.

Let’s say you already have two pages and you want to see which one works better. For example, you’re launching a newsletter and have two different versions of a potential landing page. Run a split test, and you’ll see which landing page performs best within that campaign.
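As an illustration of URL-level splitting, here is a minimal Flask sketch (the routes and URLs are hypothetical, and a real testing tool would handle this for you) that sends each visitor to one of the two existing landing pages:

```python
import random

from flask import Flask, redirect

app = Flask(__name__)

# Hypothetical URLs of the two existing landing pages
LANDING_PAGES = {
    "A": "/newsletter/landing-a",
    "B": "/newsletter/landing-b",
}

@app.route("/newsletter")
def newsletter_split_test():
    # A plain random choice is enough to show the idea; in practice you would
    # assign deterministically (e.g. by visitor ID) and log the assignment.
    variation = random.choice(list(LANDING_PAGES))
    return redirect(LANDING_PAGES[variation])
```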

You can include more than two variations in an A/B test

If you want to test more than two variations, conduct an A/B/n test. It compares the effectiveness of three or more variations in a single experiment, rather than testing each variation against the same control in a chain of independent A/B tests. If your site has high traffic, you can use A/B/n testing to test many variations at once, thereby reducing testing time and getting results faster.

However, I don’t recommend making too many changes to a variation. If you make only the most important and meaningful changes, the results of the experiment will make it easier for you to understand possible cause-effect relationships. And if you want to test a number of changes at once, conduct a multivariate test instead.
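For completeness, here is a sketch of how the outcome of an A/B/n test could be checked in a single step rather than with a chain of pairwise tests; it assumes the scipy library and uses invented counts purely for illustration.

```python
from scipy.stats import chi2_contingency

# Hypothetical A/B/n data: conversions and non-conversions per variation
#           converted  did not convert
observed = [
    [200, 3800],   # A (control)
    [215, 3785],   # B
    [260, 3740],   # C
    [205, 3795],   # D
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p-value = {p_value:.3f}")
# A small p-value suggests at least one variation differs from the others;
# follow up with pairwise comparisons to find out which one.
```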

What is multivariate testing?

Multivariate tests allow you to test changes to several sections of a page at once. To understand the principle, run a multivariate test on one of your web pages and change a couple of elements on it. In the first variation, add a feedback form to the page. In the second variation, add a video. The system will then generate another possible variation based on your changes – one with both the video AND the feedback form.

You will have 2 x 2 = 4 versions of the page:

  • V1 – control version (without feedback form and video)
  • V2 – variation with the feedback form
  • V3 – variation with the video
  • V4 – variation with the feedback form + video

Because multivariate testing checks all possible combinations, we do not recommend adding many variables – unless your site has very high traffic. If you run a multivariate test with many variables on a low-traffic site, you risk results that never reach statistical significance and from which you cannot draw any meaningful conclusions. This type of testing requires at least several thousand visits per month.
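Generating the full set of combinations is mechanical; here is a minimal sketch using itertools.product with the two elements from the example above (the element names are illustrative), which also shows how quickly the number of versions grows.

```python
from itertools import product

# Elements being tested and their possible states (from the example above)
elements = {
    "feedback_form": [False, True],
    "video": [False, True],
}

# Every combination of states is one version of the page (2 x 2 = 4 here)
versions = [dict(zip(elements, states)) for states in product(*elements.values())]
for i, version in enumerate(versions, start=1):
    print(f"V{i}: {version}")

# Adding a third element with two states already doubles this to 8 versions,
# which is why multivariate tests need a lot of traffic.
```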

What test to use depending on the situation

An A/B test will help you find answers to questions like, “which of these two variations of the page do visitors respond better to?”

And multivariate tests can help answer such questions:

  • Do visitors respond better to the video or the feedback form?
  • Or does only the feedback form work better, without the video?
  • Or is it better to keep the video and remove the feedback form?

How to evaluate the effectiveness of an A/B testing platform

To evaluate the effectiveness of an A/B testing platform, you can do an A/A test. To do this, you need to create two identical versions of the page and run the A/B test. Ideally, the system should respond that both variations showed approximately the same results. Read more about A/A tests here.
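As a minimal sketch of the idea behind an A/A check (simulated data, not tied to any real platform): give two identical groups the same “true” conversion rate and confirm that the observed difference is not statistically significant.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(7)

# Simulate an A/A test: both groups see the identical page,
# so they share the same "true" conversion rate (5% assumed here).
n = 5000
conv_a = sum(random.random() < 0.05 for _ in range(n))
conv_b = sum(random.random() < 0.05 for _ in range(n))

p_a, p_b = conv_a / n, conv_b / n
p_pool = (conv_a + conv_b) / (2 * n)
se = sqrt(p_pool * (1 - p_pool) * (2 / n))
z = (p_b - p_a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"A/A test: p-value = {p_value:.3f}")
# A healthy setup should report no significant difference most of the time;
# a consistently "significant" A/A result points to a problem in the platform.
```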

The path to successful A/B tests

“I didn’t fail the test, I just found 100 ways to do it wrong” – Benjamin Franklin

When conducting A/B testing, a clear, sound methodology is essential. Only then can we trust the test results and make effective decisions based on them. A/B testing gives us a framework for comparing the reaction of site visitors to different variations of pages and, if one of the variations works better, for establishing the statistical significance of the result and, to some extent, causality.