NHS Digital Data Release Register - reformatted

Ipsos MORI projects

2 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).

CQC Adult Inpatient Survey Bespoke HES Extraction — DARS-NIC-407121-Z8K8K

Opt outs honoured: Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: (Research)

Sensitive: Sensitive

When: 2021.06 to 2021.11. DSA runs 2021-03-18 to 2022-03-17

Access method: One-Off

Data-controller type: CARE QUALITY COMMISSION (CQC)

Sublicensing allowed: No


  1. Hospital Episode Statistics Admitted Patient Care

Expected Benefits:

The National Patient Survey Programme (NPSP) provides patient experience data for trusts across England, to facilitate service improvement and to support the Care Quality Commission (CQC) with healthcare regulation. Ensuring that the data provided by this programme is high quality and that any impacts on trends are understood is vital to ensuring that these processes are effective. Therefore, it is important that any change to the process is fully tested.

If it is possible to move to a centralised sampling approach, this could have the following potential benefits:

1) Reduction of burden to trusts
Currently, every NHS trust in England has to provide its own sample for each of the surveys in the NPSP, which is a burden on already busy data processing staff at each trust. As each sample is drawn separately, any data queries also have to be dealt with by the trust, which is an additional burden. Reducing the amount of work required of each trust is a particular benefit in the current situation, and it also means that the survey does not rely on trusts having the staff time available (the Maternity 2020 survey had to be cancelled in part because trusts were not able to provide samples, meaning data from that year is not available for that survey).

2) A consistent approach for data collection
As each trust draws its own sample, there are several opportunities for variation. Trusts select samples at different times and enter field at different times, leading to different fieldwork lengths. There is also a risk that different trusts understand the rules differently; although processes are in place to reduce this risk, selecting centralised samples would eliminate it altogether. Maximising comparability between trusts means the results are better able to measure variation in experience between trusts, with confidence that this is true variation and not the result of sampling differences.

3) Potential cost saving and reducing time from hospital episode to publication of results
By centralising the sample, there is potential to reduce costs and the associated time by cutting the number of samples that need to be individually checked from over 100 to 1. The trusts pay contractors to conduct their surveys, and the CQC pays for the coordination centre to review the samples; both of these costs could fall if the sample is centralised. There is also potential to reduce the time from the hospital episode to the publication of the results, by shortening the sample period (as some trusts take longer than others to produce their sample) and reducing the fieldwork period (which is currently longer to account for some trusts going into field later than others). As part of this piece of work, the potential implications for the survey timings and costs will also be reviewed.

4) Reducing the risk of data breaches
At the moment, as each sample is drawn by the individual trust, each sample must be securely transferred to a contractor and separated from the mailing data, and a pseudonymised version of each sample must then be shared with the co-ordination centre for checking. This large number of individual data transfers increases the risk of data breaches, where data may not be shared securely or may include details that should not be shared with the recipient. Although there are processes in place to reduce this risk, reducing the number of data transfers reduces the risk of these breaches happening.


A written report will be produced by Summer 2021, with aggregated and suppressed comparisons between the trust-provided sample and the NHS Digital sample, as well as details on any potential impact on the process or survey results.

All outputs will be aggregated with small number suppression applied as per the HES Analysis Guide.

This is likely to be published on the National Patient Survey Programme website, and shared with NHS trusts, contractors and stakeholders as part of any decision to move to a centralised sample.


- NHS Digital Data Production would use the CQC Adult Inpatient Survey filters to obtain a random sample of 1,250 individuals from each of the 140/150 Trusts (a total sample size likely to be around 180,000 individuals) with the "most complete" contact information (this will be done by using the Master Patient Service to verify addresses and mobile numbers). Only if the individual has a verified postal address will they be included in the 1,250 random individuals.

- NHS Digital Data Production links the 180,000 individuals to HES APC to obtain only their latest hospital spell between 01/04/2020 and 30/11/2020 (calculated back from 30/11/2020), and provides the fields listed below in the extract:
o Trust code
o Pseudonymised Patient Record Number (PRN)
o Mobile number indicator
o Year of birth
o Gender
o Ethnic category
o Day of Admission
o Month of Admission
o Year of Admission
o Day of Discharge
o Month of Discharge
o Year of Discharge
o Length of Stay
o Treatment Function Code (on discharge)
o ICD-10 Chapter Code - This should be the chapter code (in Roman numerals, e.g. XVIII or V) for the primary diagnosis on discharge, so there should be only one code per person, and the numerals should match those shown here: https://icd.who.int/browse10/2019/en#/
o Treatment Centre Admission
o Admission method
o NHS Site code-Admitted
o NHS Site code-Discharged
o COVID-19 diagnosis
• Trust codes need to be 3 digits; if a code is longer than that, take the first 3 digits only

(COVID-19 diagnosis will be split as follows:
1 - ICD-10 code U071 – if U071 at any time during a patient’s spell in hospital.
2 - ICD-10 code U072 – if no U071 code, but a U072 code at any time during the spell in hospital.
3 - Neither of the above – if neither a U071 code nor a U072 code appears at any point.)

- NHS Digital Data Production will then disseminate this record level pseudonymised data extract to Ipsos MORI via Secure Electronic File Transfer Service (SEFT).
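
The extraction rules described in the steps above (sampling only patients with a verified postal address, keeping the latest spell in the window, truncating trust codes, and deriving the COVID-19 diagnosis category) can be illustrated with a short Python sketch. This is not NHS Digital's implementation. All function names and record keys (`prn`, `discharge_date`, `address_verified`) are hypothetical, the use of discharge date to decide which spell is "latest" is an assumption, and trust codes are treated as character strings even though the text speaks of digits.

```python
import random
from datetime import date

SAMPLE_PER_TRUST = 1250             # sample size per trust, from the text
WINDOW_START = date(2020, 4, 1)     # 01/04/2020
WINDOW_END = date(2020, 11, 30)     # 30/11/2020


def normalise_trust_code(code):
    """Keep only the first three characters of a trust code."""
    return code.strip()[:3]


def covid_category(diagnosis_codes):
    """Return 1 (U071 present), 2 (U072 but no U071) or 3 (neither)."""
    # Assumption: codes may arrive dotted or in lower case, e.g. "u07.1".
    codes = {c.replace(".", "").strip().upper() for c in diagnosis_codes}
    if "U071" in codes:
        return 1
    if "U072" in codes:
        return 2
    return 3


def latest_spell_per_patient(spells):
    """Map each PRN to that patient's most recent spell inside the window."""
    latest = {}
    for spell in spells:
        discharged = spell["discharge_date"]  # assumed ordering field
        if not WINDOW_START <= discharged <= WINDOW_END:
            continue
        kept = latest.get(spell["prn"])
        if kept is None or discharged > kept["discharge_date"]:
            latest[spell["prn"]] = spell
    return latest


def sample_trust(patients, rng, size=SAMPLE_PER_TRUST):
    """Randomly draw up to `size` patients who have a verified postal address."""
    eligible = [p for p in patients if p["address_verified"]]
    return rng.sample(eligible, min(size, len(eligible)))
```

For example, `covid_category(["J189", "U072"])` returns 2, and `normalise_trust_code("RX1AB")` returns `"RX1"`. Passing a seeded `random.Random` instance to `sample_trust` makes a draw reproducible.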

CQC will not have access to any pseudonymised data under this service evaluation. No attempt to re-identify individuals will be made by the Data Processor. All outputs must be aggregated with small number suppression applied as per the HES Analysis Guide. The data will not be linked to any other datasets.

All pseudonymised data will be saved encrypted and password protected on the Ipsos MORI server, with access given to the project team only. All project team members are substantive members of staff employed by Ipsos MORI. They are required to abide by Ipsos MORI policies on information security, data protection and physical security, and have received training on these. Ipsos MORI will process and store the data sets provided by NHS Digital in line with these policies.

In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with HES Analysis Guide. When publishing HES data, you must make sure that:
· Cell values from 1 to 7 are suppressed at a local level to prevent possible identification of individuals from small counts within the table.
· Zeros (0) do not need to be suppressed.
· All other counts are rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
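
The three disclosure rules listed above (suppress 1 to 7, leave zeros, round everything else to the nearest 5) can be expressed as a single function. The Python sketch below is illustrative only: the function name and the `"*"` suppression marker are assumptions and are not taken from the HES Analysis Guide.

```python
SUPPRESSED = "*"  # assumed marker for a suppressed cell


def disclose_count(count):
    """Apply small-number suppression and rounding to one table cell."""
    if count < 0:
        raise ValueError("counts cannot be negative")
    if count == 0:
        return 0                      # zeros do not need to be suppressed
    if count <= 7:
        return SUPPRESSED             # values from 1 to 7 are suppressed
    return ((count + 2) // 5) * 5     # all other counts: nearest multiple of 5


def disclose_table(cells):
    """Apply the rules to every cell in a {label: count} mapping."""
    return {label: disclose_count(n) for label, n in cells.items()}
```

Integer arithmetic is used for the rounding step so that the result does not depend on floating-point behaviour. Under these rules a count of 8 is published as 10 and a count of 13 as 15.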