NHS Digital Data Release Register - reformatted

Office for National Statistics (ONS)

Project 1 — DARS-NIC-177068-M1P0L

Opt outs honoured: No - data flow is not identifiable (Does not include the flow of confidential data)

Sensitive: Non Sensitive

When: 2018/10 — 2019/04.

Repeats: System Access

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Other - Statistics and Registration Service Act 2007 section 45(a)

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Outpatients

Objectives:

The Digital Economy Act 2017 amended the Statistics and Registration Service Act (SRSA) 2007 such that ONS can request or require that information is shared by a Crown body, other public authority, charity or undertaking, as long as it is for ONS’ functions (essentially statistics and statistical research). Where a request is made, the data controller may disclose the information, and this permissive power overcomes any other duty of confidence the organisation may have, except if sharing the data would contravene the Data Protection Act, relevant parts of the Investigatory Powers Act, or relevant EU legislation. No other legislation is mentioned (e.g. the Care Act). The part of the legislation covering this permissive gateway has been live since July 2017. The power to require that data are shared is not yet live. This part of the legislation requires a code of practice to underpin it, and the code must be approved by Parliament. The code has been drafted and consulted upon. Parliamentary approval of the code is expected by the end of June 2018.
In discussions with NHS Digital (including its Caldicott Guardian), it was agreed that ONS would acquire data for its functions under this latter power to require. ONS therefore cannot require that HES data are shared with it until this power is live. In the meantime, ONS is working with NHS Digital’s analytical experts to better understand HES data and whether it will be fit for the statistical purposes to which ONS wants to put it. Remote access to pseudonymised HES data will be significant in helping with this process.
This Data Sharing Agreement will permit ONS to access pseudonymised HES data via the HES Data Interrogation System (HDIS). ONS will use this data to assess the technical feasibility of using HES data for the purposes outlined below. Getting this right ties in with the current wording of the code of practice that will underpin the power to require that data are shared.
This states that ‘We will only seek access to data for the purposes of fulfilling one or more of our statutory functions, including to produce official statistics and undertake statistical research that meets identifiable user needs for the public good.’ The statement also sets out six principles to which ONS will adhere when requiring data. They state that ONS will:

  • safeguard confidentiality
  • be transparent about what data it is accessing and why
  • ensure that accessing the data is lawful and meets strict ethical standards
  • ensure that accessing the data is in the public interest, for example that the data are fit for purpose for the statistical use to which ONS intends to put them
  • ensure that requiring the data to be supplied is proportionate, for example that ONS will have exhausted possible alternatives
  • seek to collaborate with suppliers at all times

The statistical purposes for which ONS will ultimately require that HES data are shared with it under the SRSA are outlined below. ONS currently believes these would be in the public interest, but the final application for identifiable HES data may not include all of these uses, subject to the data quality and feasibility work enabled by this remote access. The interim use of HDIS will inform what data are necessary to perform the following purposes once identifiable HES data are shared later this year.
There is a range of initial statistical uses to which ONS intends to put identifiable Hospital Episode Statistics (HES) data. Generally, linkage to other sources at a record level will be a prerequisite to success, and therefore identifiers including name, postcode, date of birth, sex and NHS number will be required at that point. The other HES information required varies by purpose, and it is this other information that ONS employees can familiarize themselves with, and assess the quality of, by gaining access to pseudonymised HES data remotely. 
The variables and time periods ONS has requested are those that it believes will potentially support these uses, and may be in the subsequent application.
1. To enable research being conducted by ONS’ Administrative Data Census Project using ‘activity’ and characteristics data
ONS’ Administrative Data Census Project is assessing whether the Government’s stated ambition that ‘censuses after 2021 be conducted using other sources of data’ can be realized. ONS aims to replicate the type of information collected through a Census by using the administrative data already held by government, supplemented by surveys. ONS’ goal is to compare statistical outputs based on administrative data and surveys with the outputs possible using data from the planned traditional Census in 2021, to show that this alternative can meet users’ needs with high quality information at a lower cost, and more frequently. There are two main types of information from the Hospital Episode Statistics dataset that are needed for this project: ‘activity data’ and characteristics data. In both cases, the information will also help with ONS’ research into improving migration statistics, which has significant overlap with the Admin Data Census project.
a. Activity data
ONS already has access to administrative sources with high population coverage, such as GP patient registration and tax records, that provide evidence of how many people live in each area of the country. However, these sources often suffer from over-coverage, where people have actually left the country even though they still appear on the source, and/or address information can be out of date, so that although they are still in the UK, ONS would assign them to the wrong part of the country. Evidence from other administrative sources that an individual is interacting with a service (‘activity’ data), even if these sources only cover a proportion of the population, will provide evidence that they are in the country. 
It may also help determine which address information recorded on the other high population coverage sources is the correct one, when those sources do not agree. In addition to the Administrative Data Census project work, ONS is also researching whether administrative sources can improve its migration statistics, and ‘activity data’ in this context would be useful for the same reasons. For this particular use, ONS only requires information about where and when individuals are interacting with secondary care, not why.
b. Characteristics data
The traditional Census includes questions on ethnicity. It is currently very difficult to estimate ethnicity at a local level between Censuses. Also, very few administrative sources capture ethnicity at all, which would currently make including ethnicity on an Administrative Data Census challenging; Hospital Episode Statistics is one of the few datasets where ethnicity is captured. ONS has worked with NHS Digital data experts to understand the limitations of the ethnicity data, for example coverage and definitions used. ONS can research methodological approaches to mitigating these limitations. Ethnicity and national identity received one of the highest user needs scores from the 2015 Census Topic Consultation, and Census ethnicity information is used by national and local decision makers, for example in equality impact assessments when local authorities make changes to service delivery. The feasibility of producing admin data based ethnicity estimates will be important when deciding whether to move to an Admin Data based Census after 2021. In terms of the framework of uses presented earlier in this section, the Administrative Data Census project work described falls into multiple categories:
2. To conduct a range of Statistical Research and Health Analyses using clinical data
ONS’ Health analysts will use clinical data from HES for a range of initial statistical purposes:
a. Exploring the feasibility of producing robust projections of the future health state of the nation. These projections would need to take into account population projections, morbidity and mortality trends, and other characteristics. It is likely HES can provide some of the information required, although there will be gaps in, for example, morbidity data. The different models that could produce projections (Bayesian modelling, Markov chains, microsimulation) would all rely on linked individual level datasets. HES may provide valuable information on the prevalence of conditions across the population, but there are gaps even with HES: for example, those with conditions or disabilities who interact only with primary and community services, or with private secondary care providers (there are, or may be, other datasets available that could fill these gaps). The State pension age review, 2017, called for more work on healthy life expectancy projections to better inform future decisions about the state pension age. The review also noted their potential value in informing the planning of future health and social care provision at a local and national level.
b. Exploring the use of linked morbidity, mortality, Census, benefits and other data to produce more granular statistics on health inequalities and health state life expectancies. This is potentially more straightforward than a), and involves, for example, linking individuals’ self-assessment of their health and disability state in the 2011 Census, the ONS annual population survey since 2011 (if they were surveyed), and ultimately the 2021 Census (once collected). Where an individual’s self-reported health has transitioned from good to poor, or they indicate for the first time that they have a long term limiting condition, HES data on actual morbidity can be linked in to compare this with their perception of their own health. 
There are limitations in using HES alone for morbidity information, not least that it only covers secondary care and many new conditions will be diagnosed through primary care only. Research would include what methodological techniques could be used to account for these limitations. The ultimate goal would be producing healthy life expectancy estimates that do not rely on survey data, potentially allowing more granular statistics. A decision on the feasibility of removing self-reported health state questions from the Census and surveys may also lead to reductions in cost and respondent burden. ONS healthy life expectancy statistics are central amongst the public health indicators that help guide local decisions by Local Authorities (LAs) about distribution and prioritisation of services. More local level health expectancy statistics, and more breakdowns by other characteristics, would provide insight allowing LAs to better target interventions to reduce health inequalities.
c. Exploring the care pathways in the run up to death. This would allow ONS to add detail to avoidable mortality statistics, and explore any links between health care access and premature mortality, for example suicides and drug related deaths. Linking in Census 2011 data, ONS’ mortality data and HES will also allow ONS to investigate health inequalities at a local level, bringing in information on characteristics such as ethnicity and occupation from the Census. A specific aspect of this research would be analysing inequalities in infant mortality, where ONS would collaborate with NHS Digital to ensure policy makers have evidence to help meet the Secretary of State’s target to halve infant mortality by 2030.
3. Statistical Research into whether ONS can improve its Address Register
This Statistical Research would have a particular focus on identification of communal establishments when using HES data, and would require information in HES about where individuals were admitted from and discharged to. Length of stay will also provide a picture of how many people ONS would expect to be classed as usually resident (> 6 months stay) in hospital at any given time. Sex information may assist with identifying communal establishments that are male or female only.
4. Statistical research to assess the feasibility of creating a better estimate of UK household expenditure on hospital services (inpatient only) and medical and paramedical services (outpatient)
The national accounts framework brings units and transactions together to provide a simple and understandable description of production, income, consumption, accumulation and wealth. The team will conduct Statistical Research into whether HES data can improve estimates of revenue paid by patients, split into outpatient and inpatient activity, private patient episodes split by outpatient and inpatient activity, and outpatient activity split between medical services and paramedical services.
5. To assess the feasibility of HES data enabling the UK to report data or proxy indicator data to measure its progress against the United Nations' Sustainable Development Goals (SDGs). 
Interest in HES is specifically around the feasibility of better estimating the following Sustainable Development indicators:

  • 3.1.1: Maternal mortality ratio
  • 3.1.2: Proportion of births attended by skilled health personnel
  • 3.3.5: Number of people requiring interventions against neglected tropical diseases
  • 3.5.1: Coverage of treatment interventions (pharmacological, psychosocial and rehabilitation and aftercare services) for substance use disorders
  • 3.7.1: Proportion of women of reproductive age (aged 15-49 years) who have their need for family planning satisfied with modern methods
  • 3.8.1: Coverage of essential health services (defined as the average coverage of essential services based on tracer interventions that include reproductive, maternal, newborn and child health, infectious diseases, non-communicable diseases and service capacity and access, among the general and the most disadvantaged population)

While ONS’ SDG team will work with NHS Digital and Public Health England to produce these indicators without the need for data sharing, ONS needs to be able to disaggregate these headline indicators by ethnicity, age, sex, disability and geography. Linking HES data to ONS held data, such as from Census 2011, at an individual level may help ONS to achieve some of these breakdowns.
No sensitive data can be accessed through the HDIS. The data provided would include the standard non-sensitive HES fields.

Expected Benefits:

The main short term benefit is to support ONS in learning more about HES data quality and in determining what HES data it should subsequently require to be shared with it under the Statistics and Registration Service Act 2007, as amended by the Digital Economy Act 2017. The proposed ultimate statistical uses for the data acquired under those powers, and under that application, are detailed in the objectives section. The potential benefits of that statistical research (which will not be possible based on remote access to HES alone) are:
1. Admin Data Census project and improved migration statistics
Population estimates and information on population characteristics are used by a wide range of national and local organisations for numerous purposes, including resource and funding allocation for both local and central Government, service planning and delivery, policy development, monitoring and evaluation, and providing an accurate denominator for other statistics. The Department of Health and its agencies use ONS' population statistics for the planning and provision of health and social care services and the distribution of funds. Throughout government, decisions on the distribution of billions of pounds of funds are made based on population estimates and projections. Respondents to the Census Topic Consultation conducted in June 2015 gave strong evidence of a need for high-quality and more timely population estimates. If it proves feasible, an Admin Data Census approach will deliver more timely statistics. It will potentially also deliver more accurate population statistics, at least in inter-censal periods, if not in the traditional Census year itself. An Admin Data Census approach will also reduce cost and respondent burden. New and more accurate information on international and internal migration is needed to better inform migration system policy making in a post-Brexit era. 
For example, note the 2017 Migration Advisory Committee call for evidence on aspects of migration in response to a Government commission to guide decisions on post-Brexit migration policy.
2. Health analyses
Successful production of robust health projections would support better decision making around where to set the state pension age, and planning of health and social care services. The Cridland Report (2017), which was commissioned by government to independently review the state pension age, made the following statement: “We believe more work is needed to understand healthy life expectancy, as it affects a range of policy areas. Projecting healthy life expectancy into the future is not currently possible, but would be valuable for future Reviews, as well as in work around health and caring.” (Independent Review of State Pension Age: Smoothing the Transition, 2017, pg 35). The report also notes:

  • Developments in Healthy Life Expectancy (HLE) and Health State Transitions (HST) will have a notable impact on the demand for social care and different types of medical care, for instance the number of trained dementia nurses required in 40 years’ time.
  • In order to manage budgets and allocate funding effectively, there is a need to understand what the main patterns of key diseases will be, and what the distribution of these illnesses across the population will look like.
  • It is likely that the prevalence of diseases which affect the oldest old, such as cancer and dementia, will increase.
  • If social care and health care provision needs to be increased, the national budget will need to be changed to reflect this, which may result in other services seeing cuts.

Current healthy life expectancy estimates rely on ONS surveys, where, despite the large sample size, the number of possible breakdowns by geography and by characteristic is limited. Current estimates rely on aggregate figures, i.e. the prevalence of poor health / limiting long term conditions, and mortality rates by age, are calculated independently and then fed into the model. Linking health states and mortality at the individual level over time, and for a greater proportion of the population (which may be possible using HES data), will allow more granular analysis. Linking to Census and other sources to add in other characteristics could inform interventions to support tackling inequalities at the local level.
See here for PHE guidance to local authorities: https://www.gov.uk/government/publications/reducing-health-inequalities-in-local-areas
See here for local authority profiles from PHE which rely on a lot of ONS data: http://fingertipsreports.phe.org.uk/health-profiles/2017/e07000087.pdf
More accurate and new statistics on the characteristics and factors associated with suicides, drug related deaths and infant mortality may provide insight that leads to better targeting of interventions, or the development of new interventions, that could ultimately save lives. Similar to some of the above, the unique benefit ONS can bring in this space is the ability to link the health and mortality sources (which NHS Digital also owns and can link / analyse) with other non-health sources such as Census and DWP/HMRC data on benefits and income. This can also be done for a large proportion of the population (although there will be gaps that need to be understood and assessed).
3. Address Register Research
Research will enhance the Address Register, including the information held on communal establishments (CEs), for which there is currently a recognized data gap. A better Address Register will in turn benefit ONS' other statistics, such as the population statistics described above. 
For example, it will allow ONS to quality assure its local level population statistics (whether from a traditional Census or other method), as local areas with CEs can have unusual demographic profiles, which can cause concern over the accuracy of the statistics unless the location and nature of the CE is known. It will also help with better planning of survey operations and sample design.
4. UK household Expenditure Statistical Research
Household Final Consumption Expenditure is a component of National Accounts; improvements therefore affect estimates of Gross Domestic Product (GDP). This is a key national economic indicator that drives national economic policy making.
5. Sustainable Development Indicator Research
The UK was at the forefront of developing the UN recognized Sustainable Development Goals (SDGs), and ONS aims to fully report on an agenda that it pushed to develop, to continue to show leadership in this space. A key theme of the SDGs is to leave no one behind, and ONS needs to be able to disaggregate the headline indicators so that it can be sure progress occurs across all groups, regardless of ethnicity, age, sex, disability or geography. Subject to feasibility research, linking HES data to ONS held data, such as from Census 2011, at an individual level may help to achieve this goal.

Outputs:

The key output will be evidence to feed into ONS’ full DARS application for identifiable HES data, which will follow later this year (summer 2018 at the earliest, depending on when full Digital Economy Act powers come into force). Other outputs will include internal statistical data quality reports and desk notes to guide ONS researchers working with HES data once an identifiable dataset has been acquired. These can be shared with NHS Digital analytical colleagues if useful. No external publications or statistics are expected to be released based on ONS’ remote access alone. These will come later, after the subsequent application, which will therefore detail the expected external outputs.

Processing:

This Agreement permits online access to the record level HES database via the HDIS system. The system is hosted and audited by NHS Digital, meaning that the need for large transfers of data to on-site servers is reduced and NHS Digital has the ability to audit the use of and access to the data. HDIS is accessed via a two-factor secure authentication method by approved users who are in receipt of an encryption token ID. Users have to attend training before the account is set up, and users are only permitted to access the datasets that are agreed within this agreement. Users log onto the HDIS system and are presented with a SAS software application called Enterprise Guide, which presents the users with a list of available data sets and available reference data tables so that they can return appropriate descriptions to the coded data. The access and use of the system is fully auditable and all users have to comply with the use of the data as specified in this agreement. The software tool also provides users with the ability to perform full data minimisation and filtering of the HES data as part of processing activities. Users are not permitted to upload data into the system.
Users of HDIS are able to produce outputs from the system in a number of formats. The system is able to produce small row count extracts for local analysis in Excel or other local analysis software. Users are also able to produce tabulations, aggregations, reports, charts, graphs and statistical outputs for viewing on screen or export to a local system. Only registered HDIS users will have access to record level data downloaded from the HDIS system. Following completion of the analysis the record level data will be securely destroyed.
In addition to those outlined elsewhere within this Agreement, the Office for National Statistics will:
1. only use the HES data for the purposes as outlined in this Agreement;
2. comply with the requirements of NHS Digital Code of Practice on Confidential Information, the Caldicott Principles and other relevant statutory requirements and guidance to protect confidentiality;
3. not publish the results of any analyses of the HES data unless safely de-identified in line with the anonymisation standard;
4. comply with the guidelines set out in the HES Analysis Guide; and
5. ensure role-based access control is in place to manage access to the HES data within the Office for National Statistics.
As this Agreement permits remote access to HES, this would be limited to analysing the data within the secure environment ONS is given access to, and potentially requesting export of aggregate tables. Work would include:

  • Gaining experience of wrangling such a large dataset, for example linking records for the same individual across years
  • Assessing the quality of key variables such as the ethnicity variable, for example assessing missingness and frequency of ethnic group by other characteristics
  • Summarizing the number of hospital interactions by person, age, sex and geography to give an idea of what proportion of each age-sex group in each area are interacting and, therefore, for what proportion of the population HES will give ONS evidence of their presence in the country and up to date location
  • Summarizing secondary care morbidity using the diagnosis variables to establish how useful this aggregate information would be for ONS’ proposed health statistics purposes. This will guide if and how much clinical data ONS seeks in the subsequent full application for identifiable HES data (it is expected that some uses will require record level linkage to other sources that only ONS holds, and therefore aggregate data may well not suffice if the potential benefits are to be realized).

It may be useful for ONS to export some of these aggregate tables if possible. 
ONS already has good links into the NHS Digital secondary care information team who can help with queries about findings in the data and how best to use it.
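One of the quality checks described above, assessing missingness of a variable such as ethnicity by other characteristics, could look like the following. This is a minimal sketch in Python for illustration only: HDIS users work in SAS Enterprise Guide, and the record layout, the field names (`sex`, `ethnicity`), the function name and the set of values treated as missing are all assumptions, not the HES coding scheme.

```python
from collections import defaultdict

# ASSUMPTION: values counted as "missing"; not the actual HES coding scheme.
MISSING_VALUES = {None, "", "unknown"}


def missingness_by_group(records, group_key, field):
    """Return the share of records with a missing `field` for each value of `group_key`."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record.get(field) in MISSING_VALUES:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Hypothetical, made-up records used only to show the shape of the output.
sample = [
    {"sex": "F", "ethnicity": "A"},
    {"sex": "F", "ethnicity": None},
    {"sex": "M", "ethnicity": ""},
]
rates = missingness_by_group(sample, "sex", "ethnicity")
```

Only aggregate rates like these, rather than record level data, would be candidates for export under the terms described above.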


Project 2 — DARS-NIC-48781-Z5C2L

Opt outs honoured: N

Sensitive: Non Sensitive

When: 2016/12 — 2017/11.

Repeats: Ongoing

Legal basis: Health and Social Care Act 2012

Categories: Anonymised - ICO code compliant

Datasets:

  • MRIS - Bespoke

Objectives:

Provide an assessment of the quality of the informal date of death (iDoD) contained within PDS death notifications, compared with the formal date of death (fDoD). Provide an indication as to whether the informal date of death is sufficiently accurate, and whether there would be benefit in providing this date to researchers in advance of the formal notification. Establish what advantages of timeliness this provision could bring, and whether there is any variation by age, gender and cause of death.

Expected Benefits:

Expected measurable benefits to health and/or social care including target date: More timely notification of deaths to medical researchers will benefit the health and social care system by substantially reducing delays in the discovery potential from record-linkage studies. Currently, research teams may observe that a subject’s series of court appearances or benefits claims has ceased but cannot know with assurance, due to lateness of notification of the formal date of death (fDoD), whether the explanation is that the subject has died or that s/he has been rehabilitated/employed. The research team has to allow about two years to account for late registered deaths, which delays deriving new knowledge, in this example about criminal sanctions or benefits.

Outputs:

The outputs will inform an assessment as to the potential advantages of using the informal date of death (iDoD) to notify research studies of deaths in their study cohorts. At present, research teams need to delay their record-linkage requests [for follow-up to 31 December 2015, say] by at least two years to be almost sure that ONS has been notified of almost all deaths that actually occurred in England and Wales on or before 31 December 2015. If iDoD is substantially accurate, this undesirable delay to record-linkage studies could be avoided by making the informal date of death available to researchers. The research team will not know the ICD-10 chapter for cause of death but, in many studies, only fact-of-death is needed and, in others, imputation of likely cause of death may be technically possible. These are huge advantages for the discovery potential from approved record-linkage studies but are dependent on knowing how reliable iDoD is likely to be. Among those for whom an iDoD exists in 2011 but no formal date of death (fDoD) was notified by 30 June 2016, there may be falsely assigned notifications (e.g. in terms of NHS number), so that the number (%) of C-differences which exceed 4 years provides an upper limit for this error rate.

Processing:

Compare PDS death notifications with GRO death registrations. NHS Digital will extract identifiable data (NHS Number, gender, date of birth, date of registration and ICD-10 primary cause of death code from ONS Mortality data) for persons with a date of death recorded between 1 January 2011 and 30 June 2016. NHS Digital will then link these data, using NHS Number, to data from the Personal Demographics Service (PDS). For linked records NHS Digital will extract the informal date of death and the PDS-system notification date, the formal date of death and the ONS date of registration.
NHS Digital will calculate age at formal date of death (using the informal date if the formal date is not available) and stratify to the following age bands: < 5 years; 5-14 years; 15-44 years; 45-64 years; 65-74 years; 75-84 years; 85+ years.
NHS Digital will calculate:
A. the difference between the ONS date of death and the informal date of death from PDS [ONS-PDS], stratified as follows: zero days; 1-7 days; 8-14 days; 15-28 days; 29-90 days; 91-182 days; 183-365 days; 366-730 days; 731+ days; no ONS death recorded; no PDS death recorded.
B. the difference between the ONS date of registration and the PDS informal death system notification date [ONS registration-PDS notification], stratified as follows: zero days; 1-7 days; 8-14 days; 15-28 days; 29-90 days; 91-182 days; 183-365 days; 366-730 days; 731+ days; no ONS death recorded; no PDS death recorded.
The above difference-records (A and B separately) will be aggregated and tabulated by i) sex, ii) ICD-10 chapter, iii) age group and iv) year of death (from formal death date); and by all pairs of i) to iv). The difference-records (A and B) will also be cross-tabulated, separately for each level of the following four covariates: i) sex, ii) ICD-10 chapter, iii) age group and iv) year of death (from formal death date).
Tabulations will compare the difference between the formal date of death (fDoD) and the informal date of death (iDoD), indicating the quality/accuracy of the informal death date; they will also compare the difference between the date of registration and the date the informal death was recorded in PDS, indicating the days gained in advance notification using the informal date vs the formal date.
In addition, we’d like to know, for each calendar year [2011 to 2015], how many iDoDs were notified in that calendar year for which no fDoD prior to 1 January 2016 had been notified by 30 June 2016. We’d like these counts, if possible, to be provided separately for each level of the following two covariates: i) gender and iii) informal age group, where age group at death is based on iDoD (since, for these cases, fDoD has not been registered).
Moreover, C. we’d like the death registration delay to be computed as [31 December 2015 – iDoD] and stratified as follows: zero days; 1-7 days; 8-14 days; 15-28 days; 29-90 days; 91-182 days; 183-365 days; 366-730 days; 731-1096 days; 1097-1462 days; 1463-1827 days. The difference-records [C.] will be aggregated and tabulated by i) sex, iii) age group and iv) year of iDoD; and by all pairs of these covariates.
Tabulated outputs, with small number suppression applied, will be provided to ONS. No record level or identifiable data will be released by NHS Digital.
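The age and date-difference banding described in this section can be sketched in code. The following is a minimal illustration in Python, not NHS Digital's actual implementation: the function names (`age_in_years`, `age_band`, `diff_band`) and the sample records are hypothetical, and the sketch assumes differences are taken as absolute values in days, since the listed bands are all non-negative and the text does not say how a negative difference would be treated.

```python
from collections import Counter
from datetime import date
from typing import Optional

# Inclusive upper bounds (days) and labels for the difference bands used in A and B.
DIFF_BANDS = [
    (0, "zero days"), (7, "1-7 days"), (14, "8-14 days"), (28, "15-28 days"),
    (90, "29-90 days"), (182, "91-182 days"), (365, "183-365 days"),
    (730, "366-730 days"),
]

# Inclusive upper bounds (completed years) and labels for the age bands.
AGE_BANDS = [
    (4, "< 5 years"), (14, "5-14 years"), (44, "15-44 years"),
    (64, "45-64 years"), (74, "65-74 years"), (84, "75-84 years"),
]


def age_in_years(date_of_birth: date, date_of_death: date) -> int:
    """Completed years of age at the date of death."""
    years = date_of_death.year - date_of_birth.year
    if (date_of_death.month, date_of_death.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1  # birthday not yet reached in the year of death
    return years


def age_band(age: int) -> str:
    for upper, label in AGE_BANDS:
        if age <= upper:
            return label
    return "85+ years"


def diff_band(ons_date: Optional[date], pds_date: Optional[date]) -> str:
    if ons_date is None:
        return "no ONS death recorded"
    if pds_date is None:
        return "no PDS death recorded"
    days = abs((ons_date - pds_date).days)  # ASSUMPTION: absolute difference
    for upper, label in DIFF_BANDS:
        if days <= upper:
            return label
    return "731+ days"


# Hypothetical linked records: (sex, ONS date of death, PDS informal date of death).
records = [
    ("F", date(2015, 3, 10), date(2015, 3, 10)),
    ("F", date(2015, 3, 10), date(2015, 3, 2)),
    ("M", date(2015, 3, 10), None),
]
# Aggregate difference-records by sex and band, as in tabulation A by covariate i).
table = Counter((sex, diff_band(ons, pds)) for sex, ons, pds in records)
```

The C delay bands would follow the same pattern with the upper bounds extended to 1096, 1462 and 1827 days. Small number suppression, which the agreement requires before release, is not shown here.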


Project 3 — DARS-NIC-57592-H7S8B

Opt outs honoured: N

Sensitive: Sensitive, and Non Sensitive

When: 2016/04 (or before) — 2018/05.

Repeats: Ongoing

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistics and Registration Service Act 2007, Health and Social Care Act 2012, Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012)

Categories: Identifiable, Anonymised - ICO code compliant

Datasets:

  • MRIS - Bespoke
  • MRIS - Scottish NHS / Registration

Objectives:

Purpose - The Longitudinal Study (LS) contains data on 1 per cent of the population of England and Wales. It is used for several types of analysis: for example, studies using registration event data as outcomes, or studies using linked census data. The purposes of these studies include linking social, occupational and demographic information to data on vital events. Examples include studies of mortality, cancer incidence and survival, and fertility patterns; studies of environmental effects on health and of inequalities in health; and studies of social mobility and of ageing.