
Making more use of existing data: What Works?

This week, on 27 February 2020, What Works for Children’s Social Care (WWCSC) will launch its report ‘What Works in Education for Children Who Have Had Social Workers?’ at a joint event with the Alliance for Useful Evidence at Nesta. If you’d like to come to the event, you can register here. The report outlines our exploratory findings on how effective educational interventions are for children who had a social worker during the intervention or in the six years before it. The results stem from exploratory analysis of data from 63 of the trials in the Education Endowment Foundation’s (EEF) existing data archive, joined to data from the National Pupil Database.

This project represents the first time one What Works Centre has systematically re-analysed the data from another, and reflects how far we’ve come in terms of designing research in such a way as to allow this kind of analysis later. The EEF’s archive contains pupil-level data from a large number of the trials they’ve funded over the years, and represents a rich resource. The project is also a fairly early use of the ONS’s Secure Research Service, which facilitated the secure use of this data and joining it to the National Pupil Database. We are also among the first organisations to be given remote access to it.

The project has certainly had its highs and lows, and, as is often the case, it proved more complex than we had initially expected. Along the way we’ve had plenty to be grateful for, as well as some challenges.


First, the very fact that the EEF have taken the steps to collect this data, and that with the help of the Fischer Family Trust (FFT) it is available for this type of analysis, demonstrates an impressive commitment to open data and empiricism. The EEF have also been generous with their time in reviewing our analytical strategy and results, and for this we’re very grateful.

Secondly, we are grateful for the wider infrastructure and support that is available to help researchers access the data and conduct the analysis. This included:

  • The FFT staff: Their knowledge of the data is impressive, and they were always willing to answer questions to ensure the nuances of the data were understood. 
  • The ONS’s Secure Research Service: This fantastic resource allows the data to be kept securely, accessed remotely, and linked with other data not used in the original research (with no input required from the researcher). In addition, we greatly appreciated the staff’s continuing dedication both to meeting our needs as researchers and to ensuring that data subjects’ rights were respected.
  • The Children in Need and Looked-after Children statistics teams at the Department for Education: These teams were continually helpful in answering our questions and guiding us in identifying our sub-group of interest.


It is apparent, with the glory of hindsight, that we suffered from some optimism bias when considering the complexity of the project. Given the data had already been collected, all we had to do was quickly merge in a couple of variables and run the analysis, right? Well, it turned out the devil was most certainly in the detail.

The archived data is definitely an amazing resource, and considerable effort has gone into creating a flexible data structure and a standardised data set. However, it remains, perhaps out of necessity, something of a chimera of different datasets. Each was created by a separate evaluator and then combined in the archive. Naturally, not all variables were used in all projects, and in some cases variables were coded inconsistently. In the vast majority of cases these issues were resolvable, but they often took time, whether that meant liaising with FFT or digging through each report separately.
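To illustrate the kind of clean-up this involves, here is a minimal sketch of recoding a variable that different evaluators coded differently. It is not the code we used: the project names, the `fsm` variable, and the codes are all hypothetical, chosen only to show the idea of mapping each project's coding onto a shared one and flagging anything that cannot be mapped.

```python
# Hypothetical per-project recoding rules: none of these project IDs,
# variable names, or codes are taken from the EEF archive.
RECODE_RULES = {
    "project_a": {"fsm": {"Y": 1, "N": 0}},
    "project_b": {"fsm": {"1": 1, "0": 0, "yes": 1, "no": 0}},
}


def harmonise(project_id, rows, rules=RECODE_RULES):
    """Return copies of rows with each coded variable mapped to a shared coding.

    Values with no matching rule are set to None and reported, so they
    can be checked by hand rather than silently guessed.
    """
    project_rules = rules.get(project_id, {})
    cleaned, unresolved = [], []
    for row in rows:
        new_row = dict(row, project_id=project_id)
        for variable, mapping in project_rules.items():
            if variable not in row:
                # Not every variable was collected in every project.
                new_row[variable] = None
                continue
            raw = str(row[variable]).strip()
            # Try the value as written first, then a lower-cased version.
            key = raw if raw in mapping else raw.lower()
            if key in mapping:
                new_row[variable] = mapping[key]
            else:
                new_row[variable] = None
                unresolved.append((project_id, variable, row[variable]))
        cleaned.append(new_row)
    return cleaned, unresolved
```

Returning the unresolved values separately mirrors what happened in practice: most inconsistencies could be fixed by rule, and the remainder needed a question to FFT or a look at the original report.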

Similarly, it was sometimes difficult to understand from the dataset exactly how the analysis had originally been conducted: for example, how missing data was dealt with, or what analytical model was used. This information was usually contained within the trial reports themselves, but the reports sometimes lacked key details. This lack of clarity in some projects meant that we ended up not strictly replicating the original analysis. This doesn’t seem to have had a big effect on the findings, but it did mean the project took longer than we thought.

Looking forward

Overall there’s a great deal of reason to be optimistic about the world of data-archiving and exploratory analysis of existing data-sets. Through this project we’ve given an indication of some of the work in this area that is possible, but we’ve only scratched the surface.  

In the first instance we hope to have laid the groundwork for others to use the EEF’s data archive for similar projects. To this end, in addition to our summary report being published on 27 February, we will also be publishing a full technical report, and will publish our code in full at a later date (giving us a little bit of time to make sure it’s as intelligible as possible!).

Further to this, organisations like WWCSC that are running and funding their own research projects can learn from our experience of this process. Capturing key information about the analysis, as well as the data itself, opens up a range of other projects and makes replication much simpler. Ideally this would include the raw data, the data used for analysis, the full code (thoroughly annotated) used to do the analysis, and any other key information about the project and analysis. This would make projects like ours quicker and more robust, even if it takes a bit more up-front investment from evaluators.
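As one possible way of capturing that information, the sketch below stores a small machine-readable record alongside each archived trial. It is an assumption about what such a record might look like rather than an existing EEF or WWCSC format, and every field name and example value is hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnalysisRecord:
    """Hypothetical per-trial record; all field names are illustrative."""
    project: str
    raw_data_file: str
    analysis_data_file: str
    code_file: str
    analytical_model: str
    missing_data_handling: str
    notes: list = field(default_factory=list)


def to_json(record):
    # Sorted keys keep the file stable when it is kept under version control.
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Example values are placeholders, not details of any real trial.
example = AnalysisRecord(
    project="example_trial",
    raw_data_file="raw.csv",
    analysis_data_file="analysis.csv",
    code_file="analysis_script.R",
    analytical_model="multilevel linear model",
    missing_data_handling="complete-case analysis",
)
```

Even a short record like this would answer the two questions we most often had to dig through reports for: which model was used, and how missing data was handled.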

We hope that, once published, you find our report useful and informative. It aims to be hypothesis-generating and to start an ongoing conversation about what we can do to help narrow the attainment gap between children who have had a social worker and their peers. If you have any questions, either about the specific research or about our experience of working with the EEF dataset, please get in touch.

Views expressed are the authors’ own and do not necessarily represent those of the Alliance for Useful Evidence. Remember you can join us (it’s free and open to all) and find out more about how we champion the use of evidence in social policy and practice.