Around the world, a small but growing movement of experimenters is bringing fresh ideas to how we solve social and public problems. From crafting better services, to making the back-office of organisations more efficient, new methods and tools are being used to develop and test policy and practice solutions.
An Inventory published today by The Alliance for Useful Evidence and Nesta catalogues the contents of the experimenter’s toolbox. We provide clarity in a confusing space, drawing on lessons from around the world. The Inventory brings together 18 experimental approaches, to simplify jargon and provide a framework for thinking about the choices available to a government, funder or delivery organisation that wants to experiment more effectively.
From the thousands of experiments conducted by Thomas Edison to create the first lightbulb, through to trials in medicine, or the long-running field experiments that underpin modern agriculture, trying ideas out in practice is a cornerstone of scientific and technological discovery. Testing and learning from experience is also central to the history of science and social science, and how best to create useful new knowledge about the world.
Experiments are now critical to sectors where innovation and optimisation are routine, like web development. Experimentation has also caught on in business. Among the largest financial institutions, retailers and restaurants in the US, at least a third are running randomized experiments, with companies like Google and Amazon running tens of thousands of experiments a year. A/B testing is now the standard means through which Silicon Valley improves its online products. However, experimentation in most policy areas remains relatively rare.
When organisations do experiment, the tools useful for testing and learning about policy can get muddled. For some, experimentation is just a synonym for innovation – trying out a new idea, like the introduction of gender quotas in Rwanda, or Citizens’ Assemblies in Ireland. The word ‘experimental’ has come to mean ‘innovative’ or ‘radical’ rather than simply ‘untested’. But genuine experimentation is about committing to some sort of formal shared learning. It’s about putting in place a structure to learn from trying things out in the world. It is not about just doing things differently – and expecting to succeed.
Experimentation necessitates a more mature attitude from leaders and decision-makers, one that allows us to learn positively from ‘good failure’. Rather than pretending we have all the answers, it means admitting that we don’t, and that we need to put our ideas to the test. A more transparent, ‘open-by-default’ approach is being pioneered internationally, by experimenters in Canada, Finland and the US.
The experimental toolbox
In 2019 three pioneers of randomised experiments in economics and international development – Esther Duflo, Abhijit Banerjee and Michael Kremer – won the Nobel Memorial Prize in Economics. They set up the Abdul Latif Jameel Poverty Action Lab (or JPAL), an organisation that has run over 1,000 randomised controlled trials to understand how to reduce poverty – and has championed the use of the method internationally.
The last two decades have witnessed the growth of the use of RCTs in social policy areas. In education, there were only around ten RCTs published each year in the early 2000s, but this had grown to over a hundred a year by 2012. The UK’s Education Endowment Foundation has now conducted over 180 trials. In policing, around 400 trials have now been run – with 40 new studies being added a year. Our inventory catalogues 11 kinds of randomised experiments being used to inform decision-making. These range from multi-arm trials, which test multiple innovations head-to-head, and nimble RCTs, which provide fast answers to operational problems, through to realist trials, which build new theory about the mechanisms that drive change. We highlight adaptive, flexible approaches, like the stepped wedge trial, that can be used during phased policy roll-outs – and may be more politically palatable, as they allow everybody in the trial to receive the new policy innovation.
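To make the stepped wedge idea concrete, here is a minimal sketch in Python of how a phased roll-out might be randomised. It is an illustration under simple assumptions rather than a method taken from the Inventory: the function name `stepped_wedge_schedule`, the site labels and the seed are all hypothetical. Each site is randomly assigned the step at which it switches to the new policy, so every site starts without it and every site has it by the end.

```python
import random


def stepped_wedge_schedule(sites, n_steps, seed=0):
    """Randomly assign each site the step at which it starts the new policy.

    Returns a dict mapping site -> list of 0/1 flags, one per period.
    Period 0 is a baseline in which no site has the policy; by the final
    period every site has it.
    """
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    order = list(sites)
    rng.shuffle(order)  # the randomised element: the order of roll-out
    schedule = {}
    for position, site in enumerate(order):
        # Spread sites as evenly as possible over steps 1..n_steps.
        start = 1 + position * n_steps // len(order)
        schedule[site] = [1 if period >= start else 0
                          for period in range(n_steps + 1)]
    return schedule


if __name__ == "__main__":
    # Hypothetical example: six sites rolled out over three steps.
    plan = stepped_wedge_schedule(["A", "B", "C", "D", "E", "F"], n_steps=3)
    for site, flags in sorted(plan.items()):
        print(site, flags)
```

Because the order of switching is random, comparing sites that have switched with those still waiting in each period gives an estimate of impact, while nobody is permanently left out of the new policy.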
There are also other valuable tools being used to guide public decision-making. Despite their unfashionable status among some policy wonks and evaluators, we also highlight the value and utility of Quasi-Experimental Designs. We explain these methods (usually couched in deadly technical language) in plain English, including newer approaches like Synthetic Control, that use inventive statistical methods to take advantage of the advent of ‘big data’.
We also forefront innovation in how experiments are designed and conducted. Some designs provide insight across places and populations, like the multi-site RCTs that build the evidence base on what works across diverse locations and populations. Other approaches value local knowledge, situated in context, that’s responsive to the questions and concerns of service users and professionals. Some of these rapid-cycle, improvement focused approaches draw on a different ethos of experimenting. They aim to tackle complex problems by raising the quality of systems of service, emphasising agile project management, and iterative trialling and assessment, as central to achieving impact.
It’s important to be clear what different designs can and can’t tell us. Only certain designs are helpful for learning about impact and effectiveness: whether or not our new idea is really making a difference. Here, randomised experiments, or quasi-experimental designs when these aren’t possible, are uniquely valuable. But other approaches may be better suited to finding out different things, at different stages of developing a policy solution. Prototyping, for example, emphasises front-loading risk and creating a solution with a better chance of success, through stakeholder engagement. Using plastic Lego bricks to build a prototype of an engineering product is a low-fi, low-resource way of making early operational or design issues obvious. But it won’t tell you whether or not the new system works in real life, or at scale. As well as getting better at knowing what to use when, there is scope for experimenters to get much better at using different tools in combination with each other, to innovate more effectively – such as using low-cost randomised trials like A/B tests or nimble trials to evaluate prototyped products or services.
Why this inventory?
In 2009 Nesta ran one of the earliest randomised controlled trials on a business support scheme, an innovation voucher called ‘Creative Credits’ that spawned many similar initiatives around the world. A few years later, we launched the Innovation Growth Lab (IGL), the biggest global partnership supporting and running randomised controlled trials on economic, innovation and industrial policy. Our States of Change collective provides training for public servants internationally, with a focus on problem-solving. In 2019 we set up a novel experimental platform, the Edtech Innovation Testbed.
The Alliance for Useful Evidence has made the case that government must rigorously and systematically put policy to the test – or risk stagnation. Over the past 6 years we have worked with partners and collaborators across the UK to advocate for smart decision-making, launching the UK’s What Works Network in 2013. In our work with partners, advising governments and foundations on how to test new ideas, many asked us for a comprehensive guide that brings different experimental approaches together.
So we take a pragmatic approach in this resource, and draw on lessons from a broad range of experimenters – professionals in charities, change-makers in frontline services, researchers in academic institutes, and decision-makers in government. We advocate for an open approach, that’s wise to the options available – and how to navigate them. This might mean creating more mission-oriented, multi-disciplinary teams; for example new collaborations between policy designers and ‘randomistas’. Sharing learning and results will demand working across sectors and areas of expertise, to link up what we know above and beyond niche areas of specialism.
We hope this resource will be useful in clarifying when experiments are useful and why. We have simplified jargon and eliminated double-speak, and do some myth-busting on common misperceptions. As innovators seeking social good, we have a duty to put our ideas to the test – to find out what doesn’t work, and discover what does, to improve people’s lives.
Read The Experimenter’s Inventory here.