Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland
This introduction is the first in a multi-part series on how Netflix uses A/B tests to make decisions that continuously improve our products, so that we can deliver more joy and satisfaction to our members. Subsequent posts will cover the basic statistical concepts underpinning A/B tests, the role of experimentation across Netflix, how Netflix has invested in infrastructure to support and scale experimentation, and the importance of the culture of experimentation within Netflix.
Netflix was created with the idea of putting consumer choice and control at the center of the entertainment experience, and as a company we continuously evolve our product offerings to improve on that value proposition. For example, the Netflix UI has undergone a complete transformation over the last decade. Back in 2010, the UI was static, with limited navigation options and a presentation inspired by the displays at a video rental store. Now, the UI is immersive and video-forward, the navigation options are richer but less obtrusive, and the box art presentation takes greater advantage of the digital experience.
Transitioning from that 2010 experience to what we have today required Netflix to make countless decisions. What is the right balance between a large display area for a single title versus showing more titles? Are videos better than static images? How do we deliver a seamless video-forward experience on constrained networks? How do we select which titles to show? Where do the navigation menus belong, and what should they contain? The list goes on.
Making decisions is easy; what is hard is making the right decisions. How can we be confident that our decisions are delivering a better product experience for current members and helping grow the business with new members? There are a number of ways Netflix could decide how to evolve our product to deliver more joy to our members:
- Let the leadership make all the decisions.
- Hire some experts in design, product management, UX, streaming delivery, and other disciplines, and then go with their best ideas.
- Have an internal debate and let the viewpoints of our most charismatic colleagues carry the day.
- Copy the competition.
In each of these approaches, a limited number of viewpoints and perspectives contribute to the decision. The leadership group is small, group debates can only be so big, and Netflix has only so many experts in each domain area where we need to make decisions. And there are only so many streaming or related services that we could use as inspiration. Moreover, these approaches do not provide a systematic way to make decisions or resolve conflicting viewpoints.
At Netflix, we believe there is a better way to decide how to improve the experience we deliver to our members: we use A/B tests. Experimentation scales. Instead of a small group of executives or experts contributing to a decision, experimentation gives all our members the opportunity to vote, with their actions, on how to continue to evolve their joyful Netflix experience.
More broadly, A/B testing, along with other causal inference methods such as quasi-experimentation, is a way that Netflix uses the scientific method to inform decision making. We form hypotheses, gather empirical data, including from experiments, that provide evidence for or against our hypotheses, and then draw conclusions and generate new hypotheses. As explained by my colleague Nirmal Govind, experimentation plays a critical role in the iterative cycle of deduction (drawing specific conclusions from a general principle) and induction (formulating a general principle from specific results and observations) that underpins the scientific method.
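To make "evidence for or against a hypothesis" concrete, here is a minimal sketch of one textbook way to compare two cells of an A/B test, a two-proportion z-test. It is an illustration only, not a description of how Netflix analyzes its tests: the function name `two_proportion_z_test`, the choice of test, and all of the counts are assumptions made up for this example.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: both cells share the same rate.

    Hypothetical helper for illustration; the names and the choice of
    test are assumptions, not taken from the post.
    """
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Under H0 the two cells share one rate, so pool the data to estimate it.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: members in each cell who took some desired action.
z, p = two_proportion_z_test(successes_a=4_800, n_a=10_000,
                             successes_b=5_000, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value says the observed difference would be unlikely if the two experiences performed the same, which counts as evidence against that hypothesis. It does not by itself say the difference is large enough to matter.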
Interested in learning more? Follow the Netflix Tech Blog for future posts that will go into the details of A/B tests and how Netflix uses tests to inform decision making.