Demystifying A/B Testing for the Web: a glimpse into the past and a book you can't miss
Intro
Before exploring the intricacies of A/B testing, let's first have a look at its fascinating history and the impact it has had across various industries. I've recently stumbled upon a captivating story about how Google's A/B testing team ventured out to form their own consultancy startup and started helping others run tests like Google's (e.g., changing button colours, font sizes, or more complex tests like search result order).
Their success skyrocketed when they assisted President Obama's tech-driven 2008 campaign with A/B testing. Dan Siroker, who led analytics for that campaign, estimated that A/B testing and website experimentation resulted in over 2.8 million email sign-ups and $60 million in donations. Fast forward to today, and A/B testing has become mainstream, with numerous platforms enabling people to run tests easily.
Optimizely, the platform that played a crucial role in Obama's 2012 re-election campaign, was co-founded by Siroker himself. It has since raised $200 million from top investors, becoming one of the largest A/B testing SaaS platforms available.
Google also offers its own A/B testing solution, Google Optimize, which is being integrated into Google Analytics 4 and due to be sunset in 2023. Many companies have incorporated A/B testing into their product offerings, such as Klaviyo for email automation, Facebook Ads, and more.
In this article, we'll focus on website A/B tests that present different content or pages to users. Running a website test isn't overly complicated: you create your experiment in Google Optimize (or any other tool of your choice), choose the pages you want to test, and create variants using CSS/JS code. The implementation is up to you and your dev team.
A sample implementation might look like this:
<div id="control">
Your original version
</div>
<div id="variant" style="display:none;">
The variant in test
</div>
The A/B test code for your variant script can then be something as simple as:
// hide the control
document.querySelector("#control").style.display = 'none';
// show the variant
document.querySelector("#variant").style.display = 'block';
Once you have tested the control and variant versions, ensuring there are no anomalies that could compromise your results, the next step is choosing a primary goal (and secondary goals, if applicable). This should be the most relevant metric for your test, such as reducing bounce rates on your homepage or increasing average order value if you're testing your upsell strategy.
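If your goal isn't one of the built-in Analytics metrics, you can track it yourself with a custom event. A minimal sketch, assuming gtag.js is installed on the page (the selector, event name, and values below are all illustrative, not part of any standard):
// Record an upsell acceptance as a custom Analytics event, so it can
// serve as the primary (or a secondary) goal for the experiment.
const upsellButton = document.querySelector("#upsell-accept"); // hypothetical selector
if (upsellButton) {
  upsellButton.addEventListener("click", function () {
    gtag("event", "upsell_accepted", {
      currency: "USD",
      value: 19.99 // illustrative order value
    });
  });
}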
Analytics Segments
Creating Google Analytics user segments based on your Optimize experiment can be super helpful for tracking user allocation and segmenting users in various analytics views. This lets you assess test performance for specific user groups and see how your variant affects different metrics, such as iPhone users' behaviour or your paid media conversion rates.
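To make that segmentation easier, you can record which variant each user was served. A minimal sketch using Optimize's documented JavaScript callback, again assuming gtag.js is installed (EXPERIMENT_ID is a placeholder, and the experiment_impression event name and its parameters are illustrative):
// Ask Optimize which variant this user was assigned to.
gtag("event", "optimize.callback", {
  name: "EXPERIMENT_ID", // your Optimize experiment ID
  callback: function (variant) {
    // Forward the assignment to Analytics so you can build
    // segments and reports per experiment/variant later.
    gtag("event", "experiment_impression", {
      experiment_id: "EXPERIMENT_ID",
      variant_id: variant
    });
  }
});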
The test should run long enough for Google Optimize to determine statistical significance between the control and variant, which might be anywhere from two weeks to over a month, so make sure you select the correct goal early in the process.
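Optimize handles the statistics for you, but a rough sample-size estimate up front helps you judge whether a test can realistically reach significance given your traffic. A minimal sketch using the standard two-proportion z-test approximation (the function and the numbers in the example are illustrative):
// Approximate visitors needed per variant to detect a relative lift
// at 95% confidence and 80% power (two-proportion z-test).
function sampleSizePerVariant(baselineRate, relativeLift) {
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeLift);
  const zAlpha = 1.96; // two-sided 95% confidence
  const zBeta = 0.84;  // 80% power
  const pBar = (p1 + p2) / 2;
  const term = zAlpha * Math.sqrt(2 * pBar * (1 - pBar)) +
               zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
  return Math.ceil((term * term) / ((p2 - p1) * (p2 - p1)));
}

// e.g. a 3% baseline conversion rate and a 10% relative lift:
console.log(sampleSizePerVariant(0.03, 0.10)); // ~53,000 visitors per variant
Divide the result by your variant's daily traffic to sanity-check the two-weeks-to-a-month guideline above.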
Safeguarding Data Integrity
Keep a close eye on your test to ensure data accuracy. Misconfigurations or errors can lead to incorrect decisions and choosing the wrong version. For high-stakes tests, it's wise to measure your own metrics alongside Analytics, verifying that the data matches the test's results. For example, you can cross-reference reported orders with your ecommerce platform.
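A minimal sketch of such a cross-check, assuming you can export order IDs from both Analytics and your ecommerce platform (the arrays below are illustrative):
// Order IDs reported by Analytics vs. recorded by the shop backend.
const analyticsOrders = ["1001", "1002", "1003"]; // exported from GA
const platformOrders = ["1001", "1002", "1004"]; // exported from your shop

const analyticsSet = new Set(analyticsOrders);
const platformSet = new Set(platformOrders);

const missingInAnalytics = platformOrders.filter((id) => !analyticsSet.has(id));
const missingInPlatform = analyticsOrders.filter((id) => !platformSet.has(id));

console.log("Missing in Analytics:", missingInAnalytics); // ["1004"]
console.log("Missing in the platform:", missingInPlatform); // ["1003"]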
You can also employ other checks during the test, such as investigating whether there are notable performance differences between organic and paid traffic; a large gap might uncover errors in your test.
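A quick illustrative sketch of that check, comparing the variant's lift per traffic source (all numbers are made up):
// Conversion rates per traffic source for control vs. variant.
const results = {
  organic: { control: 0.030, variant: 0.033 },
  paid: { control: 0.028, variant: 0.012 }
};

for (const [source, { control, variant }] of Object.entries(results)) {
  const lift = (((variant - control) / control) * 100).toFixed(0);
  console.log(`${source}: ${lift}% lift`);
}
// organic: 10% lift, paid: -57% lift -> investigate before trusting the test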
Developing a Test Hypothesis
By this point, you have likely recognised that conducting tests does not have to be arduous, and that monitoring collected data is useful in order to avoid false positives. However, deciding which tests to perform can be the most challenging aspect of A/B testing. It's good to start with straightforward tests, such as minor alterations to the user interface (e.g., modifying a button's colour). You may discover some easy wins, but you may soon realise that subsequent tests do not yield definitive winners.
This is because the most impactful A/B tests can be intricate and require a deep understanding of the business's operations to develop complex hypotheses. So, what steps can you take? The most effective approach is to examine your company's data and assemble a cross-functional team with firsthand experience in the organisation's inner workings:
Product manager: are there any product features that warrant A/B testing due to uncertainty about their impact?
Customer service team: engage with them, as they represent the initial human interaction your clients have with the company.
Data: seek qualitative and quantitative evidence to substantiate your hypotheses (your customer service team may provide valuable insights, or you can analyse your company's data).
Marketing team: what trends do they observe in your organisation's paid media spending?
SEO expert: what are users genuinely searching for?
Dev team: can the proposed test be implemented within a reasonable timeframe?
Learning from Successes and Failures
Despite diligent efforts in developing a well-founded hypothesis, one that appeared to align with customer demands and search volume and was implemented efficiently by the dev team, the test might not work out. It can be disheartening, especially when you add up the associated costs.
However, this is a natural part of the A/B testing process. The key is to not let the invested effort go to waste. Focus on formulating hypotheses to determine the reasons behind the test's failure. Reflect on the lessons learned from this test and identify the next steps to be taken. Consider how you can build upon these learnings to develop and execute new hypotheses. It's also crucial to ensure data integrity at this stage.
Embracing the learnings from A/B tests is arguably the most vital aspect of fostering a testing culture within your organisation. This approach encourages teams to become more data-driven and provides a clear indicator of success and progress in your product development efforts.
Celebrating Your Wins
Congratulations! You now have two possible courses of action. First, you can continue to run the winning variant through your A/B test platform until the development team completes the implementation of the changes.
Second, you may choose to conclude the test and await the live deployment of the changes, incorporating them into your dev team's future sprints. The most suitable option depends on several factors, such as dev timelines, the impact of the change, and dependencies related to further feature development based on the test results.
It's good to always keep in mind that relying on Optimize scripts for an extended period can have negative effects on UX and your website's SEO. Optimize scripts can be resource-intensive, which can negatively impact your site's speed. Also, retaining test content on your site after the conclusion of the testing period can lead to penalties in search engine rankings.
For a fascinating read, I highly recommend Algorithms to Live By by Brian Christian and Tom Griffiths, which explores how computer algorithms can be applied to our everyday lives and served as inspiration for writing this bit about A/B testing.