Collaborating to enhance the coherence of Crown Court data

This is a case study for Principle Q1: Suitable data sources

Criminal court statistics  

The Ministry of Justice’s criminal court statistics cover the type, volume and timeliness of cases in magistrates’ courts and the Crown Court in England and Wales. Less serious offences, such as low-level theft or traffic violations, are handled in magistrates’ courts, while more serious crimes such as robbery or homicide are referred to the Crown Court for sentencing or trial.

Since 2023, all criminal courts in England and Wales have been using Common Platform, a digital administrative system that tracks cases through both magistrates’ courts and the Crown Court. Common Platform replaces the previous systems for the Crown Court (Xhibit) and magistrates’ courts (Libra).

Data discrepancies  

The Ministry of Justice (MoJ) and HM Courts and Tribunals Service (HMCTS) previously maintained separate versions of the administrative data used for operational oversight, reporting, analysing and modelling Crown Court caseloads in England and Wales. There are some legitimate reasons for differences between the management information produced and published by HMCTS and the MoJ official statistics, including HMCTS needing an earlier run of data prior to full quality assurance and some minor definitional differences. Even so, this dual approach often left users confused about which source was best to use and how to interpret the differences.

One Crown: Improving methods and data coherence 

To address this issue, MoJ and HMCTS initiated the One Crown data project. The aim of One Crown was to align HMCTS’s and MoJ’s methodologies to create a single, unified dataset and core pipeline for Crown Court data. This dataset could then be used by HMCTS for its management information reporting and by MoJ for official statistics purposes, resulting in greater coherence, transparency and clarity for users.

Analysts and data engineers from HMCTS and MoJ pooled their knowledge to design a single data model. They worked with operational users of the Common Platform system to understand how data are entered onto the system and processed. This collaboration led to a better shared understanding of Common Platform and of its data quality.

As part of building the single dataset, MoJ and HMCTS jointly decided on the methodology and definitions that would be used for key Crown Court metrics. The initial focus was on Crown Court caseloads, including receipts, disposals and the open caseload. The One Crown Steering Group, comprising senior staff from both MoJ and HMCTS across operations, data, analysis and policy, discussed and agreed 11 definitions focused on Crown Court caseloads, often considering a range of alternatives before finalising the methodology. Having common methodologies and definitions has created greater coherence across Crown Court data and made it easier to quality-assure the data.

Communicating the data quality improvement work to users 

MoJ has been open and transparent with users about this project. It published a consultation on the changes to the statistics from the One Crown project, which explains the reasoning behind key decisions and clearly sets out the impact on the statistics. By being transparent about its decision-making, methodologies and the impact of changes, MoJ has helped users understand how the quality of the Crown Court statistics has improved. In addition, the consultation gave users an opportunity to share views on the planned changes to the statistics and the next steps for the One Crown project.

Building stronger relationships to improve data for all 

The One Crown project represents a significant step towards improving the quality and coherence of Crown Court data. By aligning methodologies and definitions, MoJ and HMCTS have created a more transparent and reliable dataset that benefits all users. MoJ and HMCTS are continuing to work together to further improve the Crown Court data. 

One Crown highlights the benefits gained when statistics producers build and maintain strong relationships with data suppliers and the operational areas responsible for running a service. Close collaboration in this way has allowed MoJ statisticians to gain a more comprehensive understanding of the data they receive and improve the quality of their statistical outputs. It has equally highlighted the importance of accurate statistical reporting of the data to those in the operational area.

Ensuring source data is appropriate for intended uses

This is a case study for Principle Q1: Suitable data sources

Legal aid statistics for England and Wales are published quarterly by the Ministry of Justice (MoJ) and draw on a range of administrative data sources from the Legal Aid Agency (LAA), an executive agency of the MoJ. Legal aid statistics were first published independently as Official Statistics in 2013 and were awarded National Statistics status in 2016.

Legal aid is a complex area and the statistics report on a variety of criminal and civil legal aid schemes, including police station attendance and civil representation. The statistics provide an extensive evidence base on the legal aid system, but the constraints of using administrative data from LAA systems mean that there are some things they do not measure precisely, or at all. To aid user understanding, MoJ publishes a comprehensive Guide to Legal Aid Statistics in England and Wales. The user guide includes considerable detail about the operational context in which the data are recorded, as well as case studies showing the types of cases in which legal aid would be granted and how these would appear in the statistics.

The guide also provides a summary of the team’s professional judgments about the robustness of each data source and, more generally, a clear steer on the sorts of comparisons that the overall statistics allow (e.g. volume and expenditure levels by scheme) or do not permit (e.g. the number of clients or the precise geographic distribution of legal aid clients). The individual data sources used are described further in a separate ‘index of legal aid data’. The index and user guide both include a flow diagram presenting the data sources for each of the legal aid schemes.

Many legal aid data sources are subject to minor revisions within each quarterly update, as new information is added to, or previous information amended on, the underlying systems. These revisions are clearly flagged in the quarterly statistics.

The legal aid statistics team were embedded in the LAA until recent years and maintain close links with LAA colleagues, including those responsible for the management and supply of the administrative datasets. These relationships provide additional insight into the detail of the data sources used and any changes made to them. A recent example was when a new provider contract for telephone advice services led to the discontinuation of a published time series on costs. These changes were explained by LAA colleagues and subsequently reported in the statistical series.

There have been numerous other enhancements to the statistics over time, which are also clearly documented in the user guide timeline, and which have continued to improve the comparability and transparency of the data sources used to produce legal aid statistics. 

This example shows how the legal aid statistics team within MoJ ensure that the LAA data they draw on are appropriate for statistical purposes: they have a thorough understanding of the operational context within which the administrative source data are collected, and they maintain close links with LAA data suppliers. It also shows the considerable lengths that the statisticians go to in explaining the relative strengths and limitations of the various data sources, ensuring the appropriate interpretation of the official statistics, including explaining the impact of changes or revisions to data sources and administrative systems over time.

Archived: Automating statistical production to free up analytical resources

This is a case study for Principle V4: Innovation and improvement.

The Reproducible Analytical Pipeline (RAP) is an innovation initiated by the Government Digital Service (GDS) that combines techniques from academic research and software development. It aims to automate certain statistical production and publication processes – specifically, the narrative, highlights, graphs and tables. Tailor-made functions work raw data up into a statistical release, freeing up resource for further analysis (a sketch of this idea follows the list below). The benefits of RAP include:

  • Auditability – the RAP method provides a permanent record of the process used to create the report; moreover, by using Git for version control, producers have access to all previous iterations of the code. This aids transparency, and the process itself can easily be published.
  • Speed – it is quick and easy to update or reproduce the report, and producers can implement small changes across multiple outputs simultaneously. The statistician, now free from repetitive tasks, has more time to exercise their analytical skills.
  • Quality – producers can build automated validation into the pipeline and produce a validation report, which can be continually augmented. Statisticians can therefore perform more robust quality assurance than would be possible by hand in the time between receiving data and publication.
  • Knowledge transfer – all the information about how the report is produced is embedded in the code and documentation, making handover simple.
  • Upskilling – RAP is an opportunity for individuals to learn new skills or develop existing ones. It also upskills teams by making use of underused coding skills that may already exist within them; coding skills are increasingly common, with many STEM students learning to code at university.
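
As an illustration of the tailor-made functions and built-in validation described above, here is a minimal sketch in R of a RAP-style production function. All names, the input columns and the validation rules are assumptions for illustration, not any department’s actual pipeline.

    # Minimal RAP-style sketch (illustrative assumptions throughout):
    # read a raw administrative extract, validate it, and work it up
    # into the table that would feed a statistical release.
    produce_quarterly_table <- function(raw_csv_path) {
      raw <- read.csv(raw_csv_path, stringsAsFactors = FALSE)

      # Automated validation: fail loudly before anything is published
      stopifnot(
        all(c("quarter", "scheme", "volume") %in% names(raw)),
        !anyNA(raw$volume),
        all(raw$volume >= 0)
      )

      # Aggregate the raw records into the published summary table
      aggregate(volume ~ quarter + scheme, data = raw, FUN = sum)
    }

    # Re-running the whole output is then a single, auditable call:
    # tables <- produce_quarterly_table("raw_extract.csv")

Because the whole journey from raw extract to published table is a function held under version control, re-running or auditing a release is a single call – which is precisely the auditability and speed benefit described above.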

RAP therefore enables departments to develop and share high-quality reusable components of their statistics processes. This ‘reusability’ enables increased collaboration, greater consistency and quality across government, and reduced duplication of effort.
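
A reusable component might be as small as a single well-documented function that several teams can share. The sketch below, with roxygen2-style documentation comments, is a hypothetical example of such a component, not code from any department’s package.

    #' Quarter-on-quarter percentage change for a numeric series.
    #'
    #' @param x Numeric vector ordered by quarter.
    #' @return Numeric vector of percentage changes; first element is NA.
    pct_change <- function(x) {
      stopifnot(is.numeric(x))
      c(NA, diff(x) / head(x, -1) * 100)
    }

    pct_change(c(100, 110, 99))  # NA  10  -10

Publishing such components openly, as the departments below have done, is what allows one team’s work to become another team’s starting point.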

In June 2018, the Department for Transport (DfT) published its RAP debut with the automation of the Search and Rescue Helicopter (SARH) statistical tables. This was closely followed by the publication of Quarterly traffic estimates (TRA25), produced by DfT’s first bespoke Road Traffic pipeline R package. RAP methods are now being adopted across the department, with other teams building on the code already written for these reports. DfT has also started a dedicated RAP User Group to act as a support network for colleagues interested in RAPping.

DfT’s RAP successes have benefited from the early work and community code sharing approach of other departments, including:

  • The Department for Digital, Culture, Media & Sport first published statistics using a custom-made R package, eesectors, in late 2016, with the code itself made freely available on GitHub.
  • The Department for Education (DfE) first published automated statistical tables of initial teacher training census data in November 2016, followed by the automated statistical report of pupil absence in schools in May 2017. DfE is now in the process of rolling out the RAP approach across its statistics publications.
  • The Ministry of Justice, as well as automating its own reports, has made a major contribution with the development of the R package xltabr, which RAPpers can use to easily format tables to meet presentation standards (see the sketch after this list). xltabr has also been made available to all on the Comprehensive R Archive Network.
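
Below is a minimal sketch of programmatic table formatting of the kind xltabr supports. To avoid guessing at xltabr’s exact interface, the sketch uses openxlsx (the lower-level package xltabr builds on); the data and styling are illustrative assumptions.

    library(openxlsx)

    # A toy published table (illustrative data only)
    df <- data.frame(quarter  = c("2018 Q1", "2018 Q2"),
                     receipts = c(120, 135))

    # Write it to a spreadsheet with a bold header row, entirely in code,
    # so the formatting is reproduced identically on every run
    wb <- createWorkbook()
    addWorksheet(wb, "Table_1")
    writeData(wb, "Table_1", df,
              headerStyle = createStyle(textDecoration = "bold"))
    saveWorkbook(wb, "table_1.xlsx", overwrite = TRUE)

Encoding presentation rules in code like this means a table never needs to be re-formatted by hand when the underlying data change.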

Incorporating data science coding skills into the traditional statistical production process, coupled with an online code-sharing approach, lends itself to increased collaboration and improved efficiency, and creates opportunities for government statisticians to provide further insights into their data.