NHS Digital Data Release Register - reformatted
Department For Education projects
2 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).
ECHILD - Education and Child Health Insights from Linked Data — DARS-NIC-578994-N0J1X
Type of data: information not disclosed for TRE projects
Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)
Legal basis: Health and Social Care Act 2012 - s261(5)(d)
Purposes: No (Agency/Public Body)
Sensitive: Non-Sensitive, and Sensitive
When: DSA runs 2023-02-13 to 2026-02-12
Access method: One-Off
Data-controller type: DEPARTMENT FOR EDUCATION
Sublicensing allowed: No
- Birth Notification Data
- Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
- Civil Registrations of Death
- Community Services Data Set (CSDS)
- Emergency Care Data Set (ECDS)
- HES:Civil Registration (Deaths) bridge
- Hospital Episode Statistics Accident and Emergency (HES A and E)
- Hospital Episode Statistics Admitted Patient Care (HES APC)
- Hospital Episode Statistics Critical Care (HES Critical Care)
- Hospital Episode Statistics Outpatients (HES OP)
- Maternity Services Data Set (MSDS) v1.5
- Maternity Services Data Set (MSDS) v2
- Mental Health and Learning Disabilities Data Set (MHLDDS)
- Mental Health Minimum Data Set (MHMDS)
- Mental Health Services Data Set (MHSDS)
- Mental Health Services Data Set (MHSDS) v5.0
The purpose of this application is for the Department for Education (DfE) to gain access to Education and Child Health Insights from Linked Data (ECHILD), which consists of education records and health records. ECHILD is a pseudonymised longitudinal linked dataset obtained from matching the Hospital Episode Statistics (HES) dataset, including Admitted Patient Care, Outpatient, Critical Care, Accident and Emergency and Emergency Care Data Set data, plus death registration, to administrative data contained in the datasets collectively supplied within the National Pupil Database (NPD), including education, Children in Need (CiN), and Children Looked After (CLA). The ECHILD database also includes the Mental Health Services Data Set (MHSDS), the Community Services Data Set (CSDS) and the Maternity Services Data Set (MSDS) for the age range of the education records to be matched from 01/09/1984. The National Pupil Database does not include children who are home schooled or who are in private education.
The ECHILD data asset was created through a research study led by University College London (UCL) in partnership with the Department for Education and NHS Digital. UCL already have access to the ECHILD asset under DARS-NIC-381972.
ECHILD is stored on the Office for National Statistics (ONS) Secure Research Service (SRS). This agreement will permit DfE to gain access to ECHILD in the ONS SRS for the purpose of fulfilling their own independent crown function of promoting the well-being of children in England.
DfE are the sole data controller and will also process the data under this agreement. UCL will separately be a data controller for the version of ECHILD they access under their own data sharing agreement (reference DARS-NIC-381972). UCL are not listed as a data controller for this agreement (DARS-NIC-578994) as DfE will access the ECHILD data asset for the purpose of policy. The DfE purpose differs from the UCL ECHILD purpose.
Although DfE work closely with the Department of Health and Social Care (DHSC) to determine priorities, DfE determine the purpose and the means of the processing to address the key priorities for health and education and provide the evidence directly to the Secretary of State for Health, government departments and arms-length bodies. ONS are a data processor as the ECHILD asset is hosted in the ONS SRS.
This study aims to assess the impact of educational outcomes on health, social care and wellbeing of children and young adults, mainly the most vulnerable. In addition, it will look at the long-term impact of COVID-19 on both the education and health of children and young people (CYP), examine whether there are differences in the health and educational outcomes of vulnerable CYP when compared to other CYP, and assess the loss of learning suffered by CYP. CYP who are vulnerable due to social welfare or chronic health needs are expected to experience more adverse health outcomes, educational outcomes, and social effects of COVID-19 than other CYP.
Vulnerable groups can only be reliably identified through linkage of longitudinal health, education, and social care data (Children in Need and Children Looked After), as has been done with ECHILD. For the purpose of this agreement, DfE are using DfE's published definition of vulnerable children and young people, which relates to children and young people aged 0-25 years who are assessed as being in need under section 17 of the Children Act 1989 (i.e. have a child in need plan, child protection plan, or are a looked-after child), have an education, health and care (EHC) plan or have been assessed as otherwise vulnerable by educational providers or local authorities (e.g. children on the edge of receiving support or those at risk of becoming not in employment, education or training).
DfE will explore whether children with long-term health conditions such as asthma or poor mental health, and those allocated any special educational needs (as indicators of underlying health or behavioural problems), are at greater risk of adverse impacts of COVID-19, poor health and/or poor educational outcomes.
DfE may use the outcomes from this research to help inform policies which are designed to facilitate more effective interventions, promote better service delivery in both education and health, and improve educational attainment to generate health, social and fiscal benefits, fulfilling their own independent crown function of promoting the well-being of children in England.
The strong interrelationship between health and education services in relation to the health and wellbeing of children is recognised by policy makers, but evidence is lacking on how services complement or compensate for each other and there have been calls for a stronger evidence base to be developed. It is imperative that DfE analysts work to fill this evidence gap to improve the health, wellbeing, education, and safety of children, young people, and families, particularly the most vulnerable.
The ECHILD asset will be used to understand what works to improve the design and delivery of policies and systems which better meet the needs of children and young people, filling some of this gap in evidence by facilitating work that will inform policy-makers and service commissioners about the associations between education risk factors and health outcomes.
Access to ECHILD will enable DfE analysts to map user journeys and to identify meaningful relationships between health and education outcomes. With more information, it should be possible to create richer user journeys, hopefully leading to more effective interventions and better service delivery, generating health, social and fiscal benefits. These insights could facilitate improvements in the effectiveness of DfE's support for vulnerable children.
This programme addresses the following DfE priority areas, which have been set by the Department of Health and Social Care:
Improve, protect and level up the nation's health, including through reducing health disparities.
Improve healthcare outcomes by providing high-quality and sustainable care at the right time in the right place and by improving infrastructure and transforming technology.
Develop policy and strategy to reduce health inequality.
Improve integration of social care and Special Educational Needs Data (SEND).
In support of wider government objectives, deliver programmes of work for vulnerable groups.
Support research and innovation to maximise health and economic productivity.
Support the most disadvantaged and vulnerable children and young people through high-quality local services so that no one is left behind.
Reduce inequalities and the disadvantage gap.
DfE will focus on the following policy questions:
RQ1 What is the impact of educational outcomes on health, social care and wellbeing of children and young adults, mainly the most vulnerable? Assess whether there are differences in the health and educational outcomes of vulnerable CYP when compared to other CYP. Investigate the drivers of health inequalities for Children in Need (CIN) and Children Looked After (CLA) and how these inequalities affect their educational outcomes.
Analyses: The analysts will use the information from NPD (performance, absence, exclusion, indicators of vulnerability such as Free School Meals (FSM), CLA and CIN) and the HES dataset to gain insights into the impact of education on health and vice versa. A variety of statistical analysis methods will be used to identify patterns and test hypotheses. In addition, the analysts will compare CYP with their vulnerable counterparts, compare regions/Middle Layer Super Output Areas (MSOAs) for allocation of funding, and identify areas where health and social services can be improved. By using the linked data, DfE analysts will be able to examine, for example, how the association between school achievement and hospitalisation varies according to local authority, type of school and vulnerability, taking into account factors at the individual level (e.g. chronic conditions in the health record, free school meals, ethnicity in the school record) that might affect both school achievement and emergency use of hospital services (i.e. confounding factors).
Analysts will also examine whether low school achievement is associated with admissions to hospital, and whether health problems are associated with subsequent changes in school achievement. These associations are expected to vary across the country, as local factors, such as type of school and local services, e.g. support for children with chronic conditions, vary between local authorities. Evidence on such risk factors will be important for generating hypotheses about how healthcare and schools can reduce adverse outcomes for children and adolescents. To evaluate variation across the country, and to have sufficient power to evaluate outcomes for children with chronic conditions at different ages and in different services, DfE need to use data for the whole country and for all children (within the age restriction) in the ECHILD.
RQ2 What is the long-term impact of COVID-19 on both the education and health of children and young people (CYP)? Is there any evidence that differences are related to COVID-19 infection or the secondary effects of lockdown? What are the losses of learning suffered by CYP compared with those suffered by vulnerable CYP before, during and after the peak of the COVID pandemic?
Analyses: The analysts will analyse educational outcomes for all CYP, and compare differences in health outcomes (e.g., emergency hospital contacts, deaths) between groups classified as vulnerable or not (stratified by age group), in the years before, compared with during and after, COVID-19 onset peak. Descriptive analyses will explore whether changes post-COVID-19 reflect increased risks of contacts related to infection, mental health, adversity, or acute complications of chronic or complex health conditions for all CYP and compare those with the vulnerable CYP. The analysts will use educational outcomes and absences from NPD together with diagnostic and procedure codes, and types of hospital contact (e.g. emergency/elective admission), recorded in HES, to infer whether hospital contacts are related to underlying chronic conditions.
The analysts will also conduct analyses of all CYP to explore associations between inequalities (using index of multiple deprivation), ethnic group, and vulnerable vs other CYP, and outcomes measured in health care, NPD data (i.e. education and CiN/CLA) in periods before, during and after the COVID-19 pandemic.
The full cohort (longitudinal data for all children and young people in England) is justified for the following reasons:
1) Longitudinal coverage:
Examining health data from the time of birth to current age (up to, but not including age 25 years) is critical for identifying markers of vulnerability in administrative data. Minimisation has been applied to the ECHILD so it only contains the minimum data necessary for answering the policy questions, which reflects the administrative history of the child for a subset of the available fields (e.g., 60% of available inpatient fields, with no sensitive or identifiable fields).
2) Geographical coverage:
The work aims to draw conclusions that are valid for all children and young people in England. Yet the health and education associations, as well as the pandemic, have differential impacts across the country (reflecting both infection rates and public health responses) at different times. For example, surveys (e.g. Royal College of Paediatrics and Child Health (RCPCH)) indicate geographical diversity, including re-routing/re-deployment of healthcare staff and services and uptake of school access by eligible children, which are likely to disproportionately impact on areas with higher levels of overcrowding, less outside space, and greater deprivation. However, many surveys have incomplete coverage by geography or over time, making it difficult to accurately estimate the scale of the problem. Understanding time-varying patterns of change is increasingly important as public health responses shift towards localised management (e.g. local lockdowns) to control spread. The analysts therefore need data that makes it possible to understand local area impacts. The analysts will have access to the minimum granularity possible, for example using MSOA rather than Lower Layer Super Output Area (LSOA).
(LSOAs (Lower-layer Super Output Areas) are small areas designed to be of a similar population size, with an average of approximately 1,500 residents or 650 households. There are 32,844 Lower-layer Super Output Areas (LSOAs) in England. Middle Layer Super Output Areas (MSOAs) are generated automatically using census data to form groupings of LSOAs. They have a minimum size of 5,000 residents and 2,000 households with an average population size of 7,800. They fit within local authority boundaries.)
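As an illustration of working at the coarser MSOA level, the following is a minimal sketch of rolling LSOA-level counts up to their parent MSOA. The lookup table, area codes and counts are hypothetical placeholders, not real geography codes or ECHILD data.

```python
from collections import defaultdict

# Hypothetical LSOA -> MSOA lookup (assumption for illustration only;
# real lookups use official area codes).
LSOA_TO_MSOA = {
    "LSOA_A": "MSOA_1",
    "LSOA_B": "MSOA_1",
    "LSOA_C": "MSOA_2",
}


def aggregate_to_msoa(lsoa_counts, lookup):
    """Sum LSOA-level counts into the MSOA that contains each LSOA."""
    totals = defaultdict(int)
    for lsoa, count in lsoa_counts.items():
        totals[lookup[lsoa]] += count
    return dict(totals)


# Made-up counts per LSOA
example_counts = {"LSOA_A": 3, "LSOA_B": 4, "LSOA_C": 12}
msoa_counts = aggregate_to_msoa(example_counts, LSOA_TO_MSOA)
print(msoa_counts)
```

Because MSOAs are groupings of LSOAs, each LSOA contributes to exactly one MSOA total, and the larger areas are less likely to produce small, potentially disclosive counts.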
The work focuses on the impact of educational outcomes on health, social care and wellbeing of children and young adults and on the impact of the pandemic and loss of learning on vulnerable children and young people (further defined below). Reliable identification of children meeting this definition is not trivial, requiring longitudinal data from birth across health, education, and social care. As a result, there are relatively few robust estimates of the size of this population.
However, there is good evidence that these indicators of vulnerability are common. For example, new research estimates that 25% of all children are ever designated a child in need and that 44% (of this 25%) are ever referred to children's social care before the age of 16 years. A further subset of children will have other indicators of vulnerability reflecting health or educational needs. For the research to draw meaningful conclusions, the analysts wish to draw comparisons between different groups of vulnerable children relative to a series of control children. A high-level comparison will be carried out (e.g., to all other children) relevant to evaluating associations and impacts at national level and for international comparison, as well as detailed comparisons against synthetic control groups to better understand the impacts of vulnerability in the context of related factors such as local environment, access to schools and healthcare needs.
The analysts will therefore need access to data for all children and young people in England, as without these data the comparisons would be incomplete, at greater risk of selection bias and not generalisable.
To enable the analyses to address these questions, the analysts will access the ECHILD asset and examine health data from the time of birth to current age (up to, but not including age 25 years) to identify markers of vulnerability in administrative data. UCL have demonstrated the added value of using the whole longitudinal record and ECHILD has been mentioned as a good practice example in the Office for Statistics Regulation Report on availability and need for statistics about children and young people.
By identifying drivers of health inequalities, it is anticipated that there will be better understanding of how to support vulnerable children. The outcomes are likely to include improvements in the provision of health services and social care, an improved understanding of the impact that health conditions have on children's education, better understanding of social care experiences and how these impact health in childhood, improved understanding of the relationship between children with disabilities and their health and educational outcomes, and a greater understanding of the disadvantages in health and education in deprived areas.
To process personal data, DfE must comply with the UK GDPR and DPA 2018, i.e., DfE must have a legal basis for processing under Article 6 of the UK GDPR (lawfulness of processing).
Section 8 of the DPA 2018 states that Article 6(1)(e) includes processing that is necessary for:
(a) "the exercise of a function conferred on a person by an enactment or rule of law" (Section 8(c)); or
(b) "the exercise of a function of the Crown, a Minister of the Crown or a government department" (Section 8(d)).
As the DfE is a government department it will be able to rely on Article 6(1)(e) provided it is "exercising a function". The Secretary of State for Education has a duty to "promote the well-being of children in England", and this activity falls within that function. In support of this broad function, there are various statutes which give DfE a gateway for this data sharing, for example the Education Act 1996, section 10, which requires the Secretary of State to "promote the education of the people of England and Wales", or the Children and Young Persons Act 2008, section 7, which lays out the duty of the SoS for Education (Secretary of State for Children and Families) to promote the well-being of children in England.
Section 10(3) of the DPA 2018 provides that processing meets the requirements of Article 9(2)(g) of the UK GDPR if it meets a condition in Part 2 of Schedule 1 of the DPA 2018, Paragraph 6 (statutory and government purposes), which requires the processing to be necessary for:
reasons of substantial public interest; and
the exercise of a function either (i) conferred on a person by enactment or rule of law; or (ii) the exercise of a function of the Crown, a minister of the Crown or a government department.
The conditions of substantial public interest are only met if the controllers have a policy document in place (Data Protection Act 2018 s.39 of Part 4 of Schedule 1), which explains the controller's procedures for securing compliance with Article 5 of GDPR and explains the controller's policies as regards the retention and erasure of personal data processed in reliance on the condition, giving an indication of how long such personal data is likely to be retained. DfE has policy documents in place in compliance with Article 5 of GDPR.
The processing of data for this study is a task of public interest as it will provide insights on the effects of education on health and social care and vice versa, as well as evidence on the long-term effect of the COVID-19 pandemic on health outcomes and use of healthcare services among vulnerable children and on learning loss, which will benefit and inform policy makers, service providers, vulnerable children and their families.
Access to ECHILD will have a number of advantages for society, but the over-arching benefit is to improve the understanding of the relationship between health and education, improve service delivery and improve the well-being of children.
Even though substantial public interest is not defined in the DPA 2018 or the UK GDPR, the ICO guidance suggests:
the public interest must be real and of substance: the DSA will have the over-arching benefit of improving the understanding of the relationship between health and education, improving health, education and social service delivery, and improving the well-being of children;
there must be specific and concrete examples of a wider benefit of the processing - this processing will enable DfE to gain insights on the impact of educational outcomes on health, social care and wellbeing of children and young adults, on the impact of long-term COVID-19 infection on the health of children and young people, in particular vulnerable children, and on loss of learning among vulnerable children and young people, characterised by education outcomes and social care indices from the NPD linked datasets. It will provide vital understanding of the strategies on the health and well-being of key population groups, and provide insight into how education outcomes impact on health, social care and wellbeing of children and how government strategies should be developed to better meet the needs of children and young people. With this processing, DfE will be able to identify the drivers of health inequalities, inform social care and health policy making, and enable better understanding of the relationship between children with disabilities and their health and educational outcomes, among others. As a result, DfE expect a reduction in health inequalities, better support for vulnerable children and improvements in the provision of education, health and social care services.
the processing must comply with the data minimisation principle, which has already been applied to ECHILD by restricting the cohort and information. Further minimisation will be applied at the point of access, as the analysts will only have access to data pertinent to the research question.
Analysts will either be DfE analysts or analysts loaned from another department. Where an employee is loaned by the home department to the host department, the loan arrangement is temporary and the employee will continue to be employed by the home department throughout the loan. All analysts will be ONS Accredited Researchers, and project approvals will be managed through the ONS Research Accreditation Service. All access will be in the ONS SRS environment. Prior to submitting an application for access to ONS, all proposed projects will be considered by the DfE Head of Data Sharing and Lead Analyst for ECHILD to ensure that the proposed use is consistent with the purpose within this agreement and addresses the research questions stated above.
ECHILD has complied with the minimisation principle by restricting the age of the cohort (all children and individuals appearing in HES records from (the latest of) birth or April 1997 onwards and born since 01/09/1984) and reducing the number of variables from HES and NPD. Additional minimisation of variables will be carried out at the point of access by analysts, as they will only access anonymised information that is relevant to the research questions listed above and not the whole database.
All operational control and access for data processing and management of de-identified data, including but not limited to data processing and storage, and provision of data to approved users is restricted to a controlled subset of DfE and ONS SRS staff. These named individuals within DfE and ONS staff are the only individuals who can access all data within ECHILD and sources in their raw and processed forms.
Patient and public information groups that have been engaged with to date:
- ECHILD has seen 8 engagements so far, with 2 more planned during 2022.
- This includes advocacy and representative groups that were brought together for the ECHILD public stakeholder event on 29 April 2021. Amongst others, this included the Children's Commissioner for England, NSPCC, SCOPE, NASEN, Contact, Council for Disabled Children, Down's Syndrome Association, MENCAP, GenerationR Alliance, Parentkind (PTA UK).
- A nationwide survey of parents and carers of disabled children and young people is being finalised with SCOPE.
The Data Improvement Across Government (DIAG) programme has designed a high-level communications plan to inform a variety of stakeholders about the work and give them the opportunity to comment on and engage with the programme. DIAG is a 3-year programme, from April 2020 to March 2023, funded by Her Majesty's Treasury Shared Outcome Fund (SOF). The programme is designed to improve data linkage across the civil service to improve outcomes for vulnerable children and families. The mission for DIAG is "to improve the health, wellbeing, education, and safety of children, young people, and families, particularly the most vulnerable, by using data to generate a comprehensive view of the journey through childhood to young adulthood and understand what works to improve the design and delivery of policies and systems which better meet the needs of children and young people". There is already strong support for the ECHILD asset from several key stakeholders, including the Children's Commissioner, NSPCC, Scope and Mencap.
A series of in-person and online events are planned, with a particular focus on hard-to-reach groups such as those with limited English language skills. These events will be easily accessible for all regions of England and will also be open to members of the public.
This project aims to gain insights on the impact of educational outcomes on health, social care and wellbeing of children and young adults, on the impact of long-term COVID-19 infection on the health of children and young people, in particular vulnerable children, and on loss of learning among vulnerable children and young people, characterised by education outcomes and social care indices from the NPD linked datasets. The study should provide vital understanding of the strategies on the health and well-being of key population groups, and provide insight into how education outcomes impact on health, social care and wellbeing of children and how government strategies should be developed to better meet the needs of children and young people. These results (preliminary results in March 2024 and March 2025) are critical for addressing current health needs of CYP, mainly the most vulnerable, and also for informing government policy strategies for education, health and social care, improving the understanding of the infection and how to minimise the impact on CYP.
The study may compare groups of vulnerable and non-vulnerable children and young people, using indicators of vulnerability drawn from health, social care and education histories in administrative data. The analyses will address a priority for DfE in fulfilling their own independent crown function of promoting the well-being of children in England, and aim to inform policy to better support children and young people, and to better understand which types of vulnerability are most affected.
Summary of health benefits:
With identification of drivers of health inequalities, DfE expect reduction in health inequalities, better support to vulnerable children including improvements in the provision of health services and social care.
Inform social care and health policy making and intervention design
Better understanding of variations in the provision of services for children with learning disability and hospital admissions
A better understanding of the variations enabling better service delivery (social and health services)
Inform the design of services for children with disabilities
Better understanding of drivers of health inequalities between vulnerable children, enabling better service delivery (social care and health services)
Ability to understand the impact that health conditions have on children's education
Comparison of service take up pre and post pandemic and how that influences pupils' wellbeing
Comparison of effectiveness of therapies pre, during and post covid peak enabling better service delivery
Comparison of impact of covid between vulnerable children and other children will enable better support and service delivery
Better understanding around social care experiences and how this impacts health both in childhood and across the life course
Better understanding of the relationship between children with disabilities and their health and educational outcomes
Better understanding of disadvantages in health and education in deprived areas
A range of outputs are expected from the study relating to the purpose and objectives of the programme, addressing relevant policy, healthcare and NHS systems questions on the association between child health and education, the impact of educational outcomes on health, social care and wellbeing of CYP, the long-term impact of COVID on education and health of CYP, and the loss of learning by CYP, mainly the most vulnerable.
All outputs will contain aggregate level data only and all small numbers will be suppressed in line with the relevant dataset analysis guide. Outputs will be monitored for compliance with government official statistics output controls and the Analysis Guides. No potentially disclosive outputs will be shared or published.
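The small-number suppression described above can be sketched as follows. This is a minimal illustration only: the threshold of 10 and the `*` marker are assumptions, since the actual rules are set by the relevant dataset analysis guide, and the sketch does not cover secondary suppression (hiding further cells so that suppressed values cannot be recovered from totals).

```python
# Assumed threshold for illustration; the real value comes from the
# relevant dataset analysis guide, not from this sketch.
SUPPRESSION_THRESHOLD = 10


def suppress_small_counts(table, threshold=SUPPRESSION_THRESHOLD, marker="*"):
    """Replace non-zero counts below the threshold with a marker.

    `table` maps a group label to an aggregate count. Zero counts are kept
    here; some guides treat zeros differently (assumption).
    """
    return {
        group: (marker if 0 < count < threshold else count)
        for group, count in table.items()
    }


# Made-up aggregate counts by hypothetical group
counts = {"group_a": 3, "group_b": 25, "group_c": 0}
print(suppress_small_counts(counts))
```

Applying suppression as the last step before export keeps the analysis itself on exact counts inside the secure environment while ensuring only protected aggregates are released.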
Preliminary reports, aggregated with small numbers suppressed, will be shared with DHSC, government departments, Ministers and parliament, NHS England and with UCL through the ECHILD project and a DfE project advisory group. It is hoped preliminary results will be produced for March 2024 (RQ2) and March 2025 (RQ1).
The DfE analysts will produce briefing reports for DHSC, DfE and other public bodies through gov.uk. Findings will also be used in public involvement and engagement events. Study findings will also be disseminated through peer-reviewed Government Social Research (GSR) and Government Statistician Group (GSG) and social media including lay summaries.
DfE would expect to present findings to Ministers and Government Data conferences within two years of obtaining access to the ECHILD.
Relevant findings will be shared with policy makers, clinicians/health professionals, educators and parent groups particularly in accessible formats (e.g. lay summaries, videos or animations).
The data analyses are conducted on the ONS Secure Research Service. Detailed individual level child data cannot leave the ONS Secure Research Service. Results of analyses can be exported by a secure encrypted transfer system, which is audited.
Any outputs from analyses that are published have to meet statistical disclosure controls that prevent the release of small cell sizes, in accordance with NHS England and DfE requirements. Tabulations of aggregate data are assessed for statistical disclosure control and authorised for export by an ONS data scientist not involved in the project.
UCL's data sharing agreement DARS-NIC-381972 permits the NPD data to be linked to the NHS England data to create the ECHILD data asset. The following describes the complete data flow and how the identifiable and ECHILD data asset was created:
Linkage of identifiers from health and NPD has been conducted at NHS England. NHS England have transferred the pseudonymised linkage key to the UCL Data Safe Haven to flag linked records in the UCL-curated HES extract for transfer to the ONS Secure Research Service (SRS). NPD attribute data will only be available in the ONS SRS. Using the pseudonymised linkage key, merging of pseudonymised attribute data (clinical or education characteristics) will then occur separately, at the ONS SRS. The following outline describes the complete data flow for future linkages and details how identifiable and non-identifiable data extracts are handled. In the below, health data refers to the health datasets requested from NHS England (HES, mortality, ECDS, MSDS, CSDS, birth notifications/registrations, mental health data).
1) DfE supply the Trusted Third Party (NHS England) with a list of NPD identifier variables. These identifiers include name, date of birth, full postcode and sex, alongside a study-specific pseudonymised linkage key known as the anonymised Pupil Matching Reference (aPMR). The identifying variables will be used for linkage to the Master Person Service (MPS)/Personal Demographic Service (PDS) (as previously done for DARS-NIC-27404 and DARS-NIC-381972). DfE will transfer the variables for any children and young people (CYP) born on or after cohort inception (01/09/1984).
2) NHS England will match the identifiers from DfE to records held in the MPS/PDS using an algorithm that makes use of the chronology of postcodes in NPD and MPS/PDS. Matching to MPS/PDS data will be done internally within NHS England: no MPS/PDS data will be disseminated to ONS SRS or UCL Data Safe Haven. NHS England will link the NPD pseudonymised linkage key (i.e. anonymised PMR or young person ID) to MPS/PDS, and then to the data.
3) For those children and young people whose NPD identifiers were matched to MPS/PDS, onward linkage to health data will occur within NHS England, linking aPMRs and Token Person IDs. NHS England will then transfer encrypted Token Person IDs, aPMRs, and indicators of match rank (denoting the step at which the match to HES and MPS/PDS was made) for these linked cases to the ONS SRS.
4) NHS England will extract the health data for all children and young people born on or after 01/09/1984, including a pseudonymised mother-baby link and additional HES records of mothers, and link the aPMR and match rank statistics for those children and young people that were linked by NHS England in step (3) from NPD. The deidentified health data will be transferred to the ONS SRS. Only month/year of birth and death will be transferred to the ONS SRS, in order to account for well-established effects of month of birth on school achievement (i.e. research consistently shows that children born in September do better than children born in July/August).
5) DfE will supply ONS SRS with requested de-identified attribute data extracts, with the aPMR for all children and young people born on or after 01/09/1984. The de-identified attribute NPD and HES data will be linked within the ONS SRS by the research team, using the aPMR. Data will only be used by researchers/analysts authorised for the project or those who have been granted access to the data through a sub-licence, with strict output controls applied by ONS SRS staff.
6) The final data set that will be used for analyses will remain within the ONS SRS. The files will not contain any identifiable data. No additional record level data will be gathered or linked to the dataset. The aPMR is the only variable supplied from NPD data that is supplied by NHS England to UCL Data Safe Haven and then to ONS SRS.
7) NHS England will retain the identifier file of all individuals linked in MPS/NPD-PDS and MPS/PDS-HES and all the postcodes used in linkage and postcode dates for 12 months after linkage, to address data queries or potential linkage errors. This data set will not contain any attribute data and will be accessible only to NHS England staff. At the end of the 12 months, NHS England will confirm deletion of the data to DfE. NHS England will not send confidential data to DfE or UCL DSH.
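The attribute-level part of the flow above (steps 4 and 5) can be sketched as follows. This is an illustration under stated assumptions, not the method used by NHS England, DfE or the research team. The field names (`apmr`, `birth_month`, `admissions`, `key_stage`), the function names and all values are hypothetical, and the sketch assumes one NPD record per aPMR.

```python
from typing import Dict, List


def truncate_to_month(date_iso: str) -> str:
    """Keep only the year and month of a YYYY-MM-DD date, as in step 4."""
    return date_iso[:7]


def link_on_apmr(
    health_rows: List[Dict[str, object]], npd_rows: List[Dict[str, object]]
) -> List[Dict[str, object]]:
    """Join de-identified health and NPD records on the aPMR, as in step 5.

    Only records whose aPMR appears in both extracts are returned.
    Assumes at most one NPD record per aPMR (an illustrative simplification).
    """
    npd_by_key = {row["apmr"]: row for row in npd_rows}
    linked = []
    for health in health_rows:
        npd = npd_by_key.get(health["apmr"])
        if npd is not None:
            linked.append({**npd, **health})
    return linked


# Synthetic records, not real data. No names, postcodes or full dates of
# birth are present: the aPMR is the only linking variable.
health_extract = [
    {"apmr": "A001", "birth_month": truncate_to_month("1990-09-14"), "admissions": 2},
    {"apmr": "A003", "birth_month": truncate_to_month("1991-07-02"), "admissions": 1},
]
npd_extract = [
    {"apmr": "A001", "key_stage": "KS2"},
    {"apmr": "A002", "key_stage": "KS4"},
]
linked_records = link_on_apmr(health_extract, npd_extract)
```

The sketch keeps the identifier-based matching (steps 1 to 3) out of scope, since that matching is done inside NHS England and its algorithm is not described in detail here.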
Under this agreement (DARS-NIC-578994) there will be no further flows of data into or out of NHS England. The DfE analysts will access the ECHILD data asset in the ONS SRS and, as with the UCL DSA, the data will only be used by researchers authorised for the project, with strict output controls applied by ONS SRS staff. To confirm, DfE are processing data disseminated under DARS-NIC-381972, not accessing the data via a sub-licence.
DfE will have named analysts / researchers who will undertake analyses (on the ONS SRS) relating to a specific component of the wider research questions. All analysts will have ONS Accredited Researcher Status and project approvals will be managed through the ONS Research Accreditation Service. All access will be in the ONS SRS environment and prior to submitting an application for access to ONS, all proposed projects will be considered by the DfE Head of Data Sharing and Lead Analyst for ECHILD to ensure that the proposed use is consistent with the purpose within this agreement and to address the research questions stated in this application.
The de-identified linked HES-NPD attribute data will be held on the ONS SRS and will only be accessible through a safe setting. Safe settings may be in safe rooms on ONS sites, in safe rooms on other certified sites, or through an organisation which has an Assured Organisational Connectivity Agreement with ONS and which maintains a current certification. No record level data can be removed from the ONS SRS and statistical disclosure controls are applied by ONS staff before removal of any aggregated outputs. Access will be restricted to named users, who are employees of DfE and are accessing the data for the purposes outlined in this DSA. As outlined above, access to the data is only via the ONS SRS environment.
The Office for National Statistics (ONS) and DfE maintain an organisational agreement (memorandum of understanding) to use the ONS Secure Research Service (SRS) for the purposes of secure statistical research, with an indefinite expiry date, which covers the processing of DfE data within the ONS SRS.
For security and resource reasons the SRS is a Managed Service. Equiniti Ltd (based in Belfast) maintains the system, on behalf of the ONS SRS. All Equiniti Ltd administrators are Security Check (SC) cleared and have no access to any data. ONS SRS Research Support Admin staff only have permissions to carry out such tasks as creating users, updating patches, testing and installing software applications etc. along with the SRS environment Admin maintenance.
Neither Equiniti Ltd nor any of its staff process the data. Therefore, Equiniti Ltd is not considered to be a Data Processor.
The ONS SRS environment is an isolated system. It has no connectivity to the internet other than using it as a bearer to pass TLS 1.2 encrypted image packages for a virtual desktop infrastructure (VDI), which runs on an accredited cloud server hosted by UKCloud Ltd in mainland UK. UKCloud Ltd merely hosts the environment and has no access to data. Therefore, UKCloud Ltd is not considered to be a Data Processor.
NHS England Security has provided assurance regarding the use of the Office for National Statistics' Secure Research Service (ONS SRS), hosted by UKCloud Ltd, in this application. The Office for National Statistics has submitted a selection of security documentation to support the use of cloud storage. NHS England Security has reviewed the documentation and provided relevant feedback where necessary. NHS England is satisfied that the documentation demonstrates the level of security and governance in place.
The Office for National Statistics has supplied evidence to support:
- The use of the Data Risk Model to assess the Risk Profile Class.
- Risk Management of the use of the Cloud for this data, taking into consideration Confidentiality, Integrity and Availability.
- The use of Pseudonymisation.
- Board-level involvement in the Risk Management Process, evidenced through meeting minutes.
- Understanding of the Shared Responsibility Model.
The Office for National Statistics has a very good understanding of the security controls available to it for securing data in the Cloud.
Use of the Cloud benefits from inherited controls that cannot practically be replicated locally, such as Physical Controls, Resilience of Systems, Power Supplies, Communications and Geographically dispersed Data Centres within a region.
Elasticity in provisioning is also a consideration that benefits organisations in managing workloads. The Cloud provider, UKCloud, will use UK Data Centres only.
All outputs will contain aggregate level data only and all small numbers will be suppressed in line with the HES Analysis Guide. Outputs will be monitored for compliance with ADRN statistical output controls and the HES Analysis Guide. No potentially disclosive outputs will be shared or published.
DfE maintains an information security governance framework and has policies in place, which may be used as evidence of its compliance with the DPA 2018/UK GDPR. DfE regularly review their information security governance arrangements and policies to ensure that they remain appropriate to the potential risks and to ensure that the processing remains compliant with the DPA 2018/UK GDPR.