Demonstrating transparency when linking and publishing data

This is a case study for Principle T6: Data governance.

The Scottish Government’s (SG) health and homelessness in Scotland project linked local authority data about homelessness between 2001 and 2016 with NHS data on hospital admissions, outpatient visits, prescriptions and drugs misuse, and with National Records of Scotland information about deaths.

Transparency around the risk assessment process helps to demonstrate a producer’s Trustworthiness to users, suppliers and the public. One of the ways that SG were able to demonstrate this was by conducting and publishing their data privacy impact assessment alongside the main analysis report. SG also published the original application for the data, the public benefit and privacy panel application and the correspondence documenting its approval, and details of how to access the data. This approach is now standard practice for all SG publications based on linked data.

Since SG carried out this work, a new tool for risk assessment, the Data Protection Impact Assessment (DPIA), has been introduced following the 2018 Data Protection Act (DPA), as a requirement of GDPR. DPIAs are mandatory where data are combined from multiple sources, and the Information Commissioner’s Office recommends that they are also conducted on a voluntary basis for any large-scale processing of personal data.

The accountability principle in the DPA requires organisations to have appropriate records in place to demonstrate compliance if required. Departments can meet the DPA accountability principle by conducting a DPIA, and publishing it helps to meet the Code’s requirements for transparency (provided that it is accessibly presented). It isn’t essential to publish a DPIA in full; a summary of the process and the lessons learnt would be sufficient to demonstrate transparency.

Another step producers can take to increase transparency is to publish details of all the data share requests made to them and their outcomes. SG publishes details of the data sharing requests submitted to its Statistics Data Access Panel on its website, which also includes details about past decisions made and the justifications for those decisions.

The Department for Education in England has also been publishing details of the data share requests and outcomes in relation to ad hoc National Pupil Data Sharing for several years. In December 2017, the Department for Education broadened the scope to cover all routine sharing of personal data, and it has recently consulted users about further changes to make this information easier to engage with and understand.

These examples show how statistics producers can demonstrate Trustworthiness by being transparent about how they manage the data linkage process and data shares, and how this transparency relates to some of the current legislation in this area.

Reviewing and amending statistics provisions

This is a case study for Principle V1: Relevance to users.

Skills Development Scotland and the Scottish Government education team both produce statistics on young people in Scotland.

Scottish Government reports on school leaver destinations in its Initial destinations of Senior Phase School Leavers (official statistics) and Summary Statistics for Attainment, Leaver Destinations and Healthy Living (National Statistics) publications.

Skills Development Scotland produces its Annual Participation Measure for 16–19-year-olds in Scotland, which reports on the proportion of the 16–19-year-old cohort, including those at school, who are in learning, training or work. Skills Development Scotland is not an official statistics producer at present; however, its statistics are produced in line with the Code of Practice for Statistics.

The team at Scottish Government recognised what might be perceived as a confusing landscape for users who may not be clear on which statistics to use for what purpose. They reviewed all the available statistics about school leaver destinations, and, with the support of Skills Development Scotland, ran a consultation to find out how the data are used and how users felt about proposals to reduce the duplication of statistics. As a result, they developed a long-term plan to reduce duplication and simplify the statistical landscape, which builds upon the progress made in recent years.

This demonstrates how producers can work collaboratively and engage with users to review and amend statistics provisions.

Working effectively with contractors to deliver a survey

This is a case study for Principle Q1: Suitable data sources.

The Scottish Crime and Justice Survey (SCJS) is an annual, large-scale, continuous survey carried out by the Scottish Government that measures adults’ experiences and perceptions of crime in Scotland.

The Scottish Government appointed a consortium of two contractors, Ipsos MORI and ScotCen, to jointly deliver the SCJS from 2016/17 under a single contract. Fieldwork is shared across both organisations, while Ipsos MORI focuses on the content and questionnaire development, and ScotCen is responsible for data processing, from data cleaning through to delivery of final data sets.

The Scottish Government maintains a close and effective working relationship with the contractor consortium to manage and deliver the survey:

  • It established appropriate safeguards by specifying the roles, data requirements, delivery arrangements and communication channels between the survey team in Scottish Government and the contractor consortium
  • When setting up the contract, the statistical team modernised the data coding and processing arrangements for the ‘offence codes’ used to classify incidents, to ensure a smooth transition to the new contractors

The Scottish Government carried out a range of activities to decide on the survey design, including meeting with users and potential contractors. It also produces a detailed technical report with information about the survey design and delivery, including the sample design and selection and the survey response.

This approach gives the Scottish Government and users of the statistics confidence in the suitability and quality of the data sources and data collection process.