Synchronised Revolutions: bringing RCTs and big data together

Conversations about evidence-based policy often treat RCTs and big data as mutually exclusive, but there is no good reason to do so, writes Michael Sanders, Head of Policy Research at the Behavioural Insights Team. By combining the two, they can become greater than the sum of their parts: RCTs become cheaper, attrition rates drop, and big data moves towards becoming a legitimate part of the policymaker’s toolkit. Together, argues Sanders, they have the potential to spark a real revolution in evidence-based policymaking by embedding good evidence in government practice and making it an integrated part of the entire policy arena.

“There is a revolution taking place in evidence based policy….”

These words, or sentences with this general meaning, have become increasingly common in the last few years. Despite the formulaic beginning, it is difficult to predict what will follow. Most often, the author will be talking about either Big Data and its many uses (or misuses), or about Randomised Controlled Trials and their huge potential to revolutionise policymaking (or to slow down the process of policymaking and reduce government to a technocratic machine). Both of these topics have attracted the attention of the media, with Tim Harford, the data-nerd world’s answer to Justin Bieber, writing columns on both in recent months.

Why do we treat RCTs and big data as mutually exclusive?

One interesting thing is that these articles seem to treat the two as mutually exclusive. Authors are either interested in Big Data, or in RCTs, but not both at the same time (even when the author, like Harford, is interested in both separately). This is puzzling, because combining the two can create something that is greater than the sum of its parts.

RCTs: our best method for identifying causal relationships

The arguments against RCTs are well-worn. They are expensive compared to post-hoc evaluation, they are time-consuming, they often suffer from attrition, and they can lack generalisability outside the immediate context of the trial. At the same time, they remain the best way of testing whether some treatment X causes some outcome Y, and to what extent. In other words, if you want to find the precise causal relationship between two variables, an RCT is your best bet.

Big data: focusing on correlation

For big data advocates, causality is not the central concern: prediction is, and hence correlation. Big Data promises plenty of generalisability, but the assumption that it covers an entire population, or covers it representatively, is usually wrong. This limits its usefulness for governments making policy, despite Chris Anderson’s glib assessment that with Big Data “correlation is enough”.

Natural bedfellows

There is no good reason for these two tools to be in competition with each other, and every reason for them to work collaboratively. Widely accessible, large datasets allow larger RCTs to be run more cheaply than ever before. As well as increasing external validity and lowering cost, statistical problems around attrition can be reduced. Attrition from a conventional randomised controlled trial is a passive act: participants can simply fail to respond to surveys or to knocks at their door. With the power of big data, attrition is much rarer. After all, you can’t stop the signal. Nor need this low attrition be confined to a short trial period; it can be extended further into the future, allowing us to track the impact of, for example, education policy over not just months and years but decades.

At the same time, with the addition of truly random variation in assignment to a government policy, big data could move from being a cool thing talked about at TED Talks but rarely used by governments, to a fully legitimatised part of the policymaker’s toolkit.

RCTs and big data: a real revolution in evidence based policy

The Behavioural Insights Team has run dozens of randomised controlled trials across the spectrum at low cost, not just because behavioural interventions are low cost, but because our RCTs have largely made use of administrative data. This allows good evidence to be embedded in government practice right from the start. The growing availability of large government datasets also makes results more transparent and verifiable. With luck and a lot of hard graft by researchers and officials, evidence could become an integrated part of the entire policy arena, from start to finish and from officials to academics to public discourse. Now that would really be a revolution.


Views expressed are the author’s own and do not necessarily represent those of the Alliance for Useful Evidence. Join us (it’s free and open to all) and find out more about how we champion the use of evidence in social policy and practice.