NHS Digital Data Release Register - reformatted
Office for National Statistics (ONS)
Project 1 — DARS-NIC-57592-H7S8B
Opt outs honoured: N
Sensitive: Sensitive, and Non Sensitive
When: 2016/04 (or before) — 2018/05.
Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistics and Registration Service Act 2007, Health and Social Care Act 2012, Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012)
Categories: Identifiable, Anonymised - ICO code compliant
- MRIS - Bespoke
- MRIS - Scottish NHS / Registration
Purpose - The Longitudinal Study (LS) contains data on 1 per cent of the population of England and Wales. It is used for several types of analysis: for example, studies using registration event data as outcomes, or studies using linked census data. These studies include those that link social, occupational and demographic information to data on vital events, such as studies of mortality, cancer incidence and survival, and fertility patterns; those looking at environmental effects on health and inequalities in health; and those investigating social mobility and ageing.
Project 2 — DARS-NIC-48781-Z5C2L
Opt outs honoured: N
Sensitive: Non Sensitive
When: 2016/12 — 2017/11.
Legal basis: Health and Social Care Act 2012
Categories: Anonymised - ICO code compliant
- MRIS - Bespoke
Provide an assessment of the quality of the informal date of death contained within PDS death notifications, compared with formal date of death. Provide an indication as to whether the informal date of death is sufficiently accurate, and whether there would be benefit in providing this date to researchers in advance of the formal notification. To establish what advantages of timeliness this provision could bring, and whether there is any variation by age, gender and cause of death.
Expected measurable benefits to health and/or social care including target date: More timely notification of deaths to medical researchers will benefit the health and social care system by substantially reducing delays in the discovery potential from record-linkage studies. Currently, research teams may observe that a subject’s series of court appearances or benefits claims has ceased but cannot know for certain, owing to late notification of the formal date of death (fDoD), whether the explanation is that the subject has died or that s/he has been rehabilitated/employed. The research team has to allow about two years to account for late-registered deaths, which delays deriving new knowledge, in this example about criminal sanctions or benefits.
The outputs will inform an assessment of the potential advantages of using informal date of death (iDoD) to notify research studies of deaths in their study cohorts. At present, research teams need to delay their record-linkage requests [for follow-up to 31 December 2015, say] by at least two years to be almost sure that ONS has been notified of almost all deaths that actually occurred in England and Wales on or before 31 December 2015. If iDoD is substantially accurate, this undesirable delay to record-linkage studies could be avoided by making informal date of death available to researchers. The research team will not know the ICD-10 chapter for cause of death but, in many studies, only fact of death is needed and, in others, imputation of likely cause of death may be technically possible. These are huge advantages for the discovery potential of approved record-linkage studies but depend on knowing how reliable iDoD is likely to be. Among those for whom iDoD exists in 2011 but no fDoD was notified by 30 June 2016, there may be falsely assigned notifications (e.g. in terms of NHS number), so the number (%) of C-differences (the registration delays computed as [31 December 2015 - iDoD]) which exceed 4 years provides an upper limit for this error rate.
Compare PDS death notifications with GRO death registrations. NHS Digital will extract identifiable data (NHS Number, gender, date of birth, date of registration and ICD-10 primary cause of death code from ONS Mortality data) for persons with a date of death recorded between 1st Jan 2011 and 30th June 2016. NHS Digital will then link these data, using NHS Number, to data from Patient Demographic Service (PDS). For linked records NHS Digital will extract informal date of death and the PDS-system notification date, the formal date of death and ONS date of registration. NHS Digital will calculate age at formal date of death (using informal date if formal date not available) and stratify to the following age bands: < 5 years; 5-14 years; 15-44 years; 45-64 years; 65-74 years; 75-84 years; 85+ years. NHS Digital will calculate A. the difference between the ONS date of death and the informal date of death from PDS [ONS-PDS] and stratify as follows: zero days; 1-7 days; 8-14 days; 15-28 days; 29-90 days; 91-182 days; 183-365 days; 366-730 days; 731+ days; no ONS death recorded; no PDS death recorded. NHS Digital will calculate B. the difference between ONS date of registration and the PDS informal death system notification date [ONS registration-PDS notification], and stratify as follows: zero days; 1-7 days; 8-14 days; 15-28 days; 29-90 days; 91-182 days; 183-365 days; 366-730 days; 731+ days; no ONS death recorded; no PDS death recorded. The above difference-records (A and B separately) will be aggregated and tabulated by i) sex, ii) ICD-10 chapter, iii) age group and iv) year of death (from formal death date); and by all pairs of i) to iv).
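The age and difference stratification described above can be sketched as follows. This is a minimal illustration only, not NHS Digital's implementation: the function names are hypothetical, and taking the absolute value of the difference is an assumption, since the text does not say how negative differences (an informal date later than the formal date) are banded.

```python
from datetime import date

# Upper bounds (exclusive) and labels for the age bands listed in the text.
AGE_BANDS = [
    (5, "< 5 years"), (15, "5-14 years"), (45, "15-44 years"),
    (65, "45-64 years"), (75, "65-74 years"), (85, "75-84 years"),
]

# Upper bounds (inclusive, in days) and labels for difference bands A and B.
DIFF_BANDS = [
    (0, "zero days"), (7, "1-7 days"), (14, "8-14 days"), (28, "15-28 days"),
    (90, "29-90 days"), (182, "91-182 days"), (365, "183-365 days"),
    (730, "366-730 days"),
]


def age_in_years(date_of_birth, date_of_death):
    """Completed years of age at the date of death."""
    years = date_of_death.year - date_of_birth.year
    if (date_of_death.month, date_of_death.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def age_band(age):
    """Map an age in completed years to one of the seven age bands."""
    for upper, label in AGE_BANDS:
        if age < upper:
            return label
    return "85+ years"


def difference_band(ons_date, pds_date):
    """Band the ONS-minus-PDS difference in days.

    Assumption (not stated in the source): the absolute difference is banded,
    and a missing ONS date takes precedence over a missing PDS date.
    """
    if ons_date is None:
        return "no ONS death recorded"
    if pds_date is None:
        return "no PDS death recorded"
    days = abs((ons_date - pds_date).days)
    for upper, label in DIFF_BANDS:
        if days <= upper:
            return label
    return "731+ days"
```

The same `difference_band` function would serve both difference A (dates of death) and difference B (registration and notification dates), since the two use identical bands.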
The difference-records (A and B) will also be cross-tabulated, separately for each covariate level of the following four covariates: i) sex, ii) ICD-10 chapter, iii) age group and iv) year of death (from formal death date). Tabulations will compare the difference between the formal date of death (fDoD) and the informal date of death (iDoD), indicating the quality/accuracy of the informal death date; they will also compare the difference between the date of registration and the date the informal death was recorded in PDS, indicating the days gained in advance notification using the informal date vs the formal date. In addition, we’d like to know, for each calendar year [2011 to 2015], how many iDoDs were notified in that calendar year for whom no fDoD prior to 1 January 2016 had been notified by 30 June 2016. We’d like these counts, if possible, to be provided separately for each covariate level of the following two covariates: i) gender and iii) informal age group, where age group at death is based on iDoD (since, for these cases, fDoD has not been registered). Moreover, C. we’d like the death registration delay to be computed as [31 December 2015 – iDoD] and stratified as follows: zero days; 1-7 days; 8-14 days; 15-28 days; 29-90 days; 91-182 days; 183-365 days; 366-730 days; 731-1096 days; 1097-1462 days; 1463-1827 days. The difference-records [C.] will be aggregated and tabulated by i) sex, iii) age group and iv) year of iDoD; and by all pairs of i) to iv). Tabulated outputs, with small number suppression applied, will be provided to ONS. No record level or identifiable data will be released by NHS Digital.
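As a rough sketch of calculation C and of the suppression step, the delay can be banded and counted as below. The names are hypothetical. The suppression threshold of 5 and the use of `None` to mark a suppressed cell are assumptions made for illustration; the text does not state which small-number rule is applied.

```python
from collections import Counter
from datetime import date

REFERENCE_DATE = date(2015, 12, 31)  # reference date for calculation C, from the text

# Upper bounds (inclusive, in days) and labels for the C delay bands.
C_BANDS = [
    (0, "zero days"), (7, "1-7 days"), (14, "8-14 days"), (28, "15-28 days"),
    (90, "29-90 days"), (182, "91-182 days"), (365, "183-365 days"),
    (730, "366-730 days"), (1096, "731-1096 days"),
    (1462, "1097-1462 days"), (1827, "1463-1827 days"),
]


def delay_band(informal_date, reference=REFERENCE_DATE):
    """Band the delay [reference - informal date of death] in days."""
    days = (reference - informal_date).days
    if days < 0:
        raise ValueError("informal date of death is after the reference date")
    for upper, label in C_BANDS:
        if days <= upper:
            return label
    raise ValueError("delay falls outside the listed bands")


def tabulate_with_suppression(records, threshold=5):
    """Count (sex, delay band) cells; cells below `threshold` become None.

    `records` is an iterable of (sex, informal_date) pairs. The threshold is
    an illustrative assumption, not the rule used by NHS Digital.
    """
    counts = Counter((sex, delay_band(idod)) for sex, idod in records)
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}
```

For example, six records for one sex in the same band would be reported as 6, while two records for the other sex in that band would be suppressed.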
Project 3 — DARS-NIC-177068-M1P0L
Opt outs honoured: No - data flow is not identifiable (Does not include the flow of confidential data)
Sensitive: Non Sensitive
When: 2018/10 — 2019/04.
Repeats: System Access
Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Other - Statistics and Registration Service Act 2007 section 45(a)
Categories: Anonymised - ICO code compliant
- Hospital Episode Statistics Accident and Emergency
- Hospital Episode Statistics Admitted Patient Care
- Hospital Episode Statistics Outpatients
The Digital Economy Act 2017 amended the Statistics and Registration Service Act (SRSA) 2007 such that ONS can request or require that information be shared by a Crown body, other public authority, charity or undertaking, as long as it is for its functions (essentially statistics and statistical research). Where a request is made, the data controller may disclose the information, and this permissive power overcomes any other duty of confidence the organisation may have, except if sharing the data would contravene the Data Protection Act, relevant parts of the Investigatory Powers Act, or relevant EU legislation. No other legislation is mentioned (e.g. care act). The part of the legislation covering this permissive gateway has been live since July 2017. The power to require that data are shared is not yet live. This part of the legislation requires a code of practice to underpin it, and this must be approved by Parliament. The code has been drafted and consulted upon. Parliamentary approval of the code is expected by the end of June 2018. In discussions with NHS Digital (including its Caldicott Guardian), it was agreed that ONS would acquire data for its functions under this latter power to require. Therefore ONS cannot require that HES data are shared with it until this power is live. In the meantime, ONS is working with NHS Digital’s analytical experts to better understand HES data and whether they will be fit for the statistical purposes to which ONS wants to put them. Remote access to pseudonymised HES data will be significant in helping with this process. This Data Sharing Agreement will permit ONS to access pseudonymised HES data via the HES Data Interrogation System (HDIS). ONS will use these data to assess the technical feasibility of using HES data for the purposes outlined below. Getting this right ties in with the current wording of the code of practice that will underpin the power to require that data are shared. 
This states that ‘We will only seek access to data for the purposes of fulfilling one or more of our statutory functions, including to produce official statistics and undertake statistical research that meets identifiable user needs for the public good.’ The statement also sets out six principles to which ONS will adhere when requiring data. They state that ONS will:
• safeguard confidentiality
• be transparent about what data it is accessing and why
• ensure accessing the data is lawful and meets strict ethical standards
• ensure that accessing the data is in the public interest - for example that the data are fit for purpose for the statistical use to which ONS intends to put them
• ensure that requiring the data to be supplied is proportionate – for example, ONS will have exhausted possible alternatives
• seek to collaborate with suppliers at all times
The statistical purposes for which ONS will ultimately require that HES data are shared with it under the SRSA are outlined below. ONS currently believes these would be in the public interest, but the final application for identifiable HES data may not include all of these uses, subject to the data quality and feasibility work enabled by this remote access. The interim use of HDIS will inform what data are necessary to perform the following purposes once identifiable HES data are shared later this year. There is a range of initial statistical uses to which ONS intends to put identifiable Hospital Episode Statistics (HES) data. Generally, linkage to other sources at a record level will be a prerequisite to success, and therefore identifiers including name, postcode, date of birth, sex and NHS number will be required at that point. The other HES information required varies by purpose, and it is this other information with which ONS employees can familiarize themselves, and whose quality they can assess, by gaining access to pseudonymised HES data remotely. 
The variables and time periods ONS have requested are those that they believe will potentially support these uses, and may be in the subsequent application. 1. To enable research being conducted by ONS' Administrative Data Census Project using ‘activity’ and characteristics data ONS’ Administrative Data Census Project is assessing whether the Government’s stated ambition that ‘censuses after 2021 be conducted using other sources of data’ can be realized. ONS aims to replicate the type of information collected through a Census by using the administrative data already held by government, supplemented by surveys. ONS’ goal is to compare statistical outputs based on administrative data and surveys with the outputs possible using data from the planned traditional Census in 2021, to show that this alternative can meet users’ needs with high quality information at a lower cost, and more frequently. There are two main types of information from the Hospital Episodes Statistics dataset that are needed for this project; ‘activity data’ and characteristics data. In both cases, the information will also help with ONS’ research into improving migration statistics which has significant overlap with the Admin Data Census project: a. Activity data ONS already has access to Administrative sources with high population coverage such as GP patient registration and tax records that provide evidence of how many people live in each area of the country. However, these sources often suffer from over coverage where people have actually left the country even though they still appear on the source, and/or address information can be out of date so although they are still in the UK, ONS would assign them to the wrong part of the country. Evidence from other administrative sources that an individual is interacting with a service (‘activity’ data), even if these sources only cover a proportion of the population, will provide evidence that they are in the country. 
It may also help determine which address information recorded on the other high population coverage sources is the correct one, when those sources do not agree. In addition to the Administrative Data census project work, ONS is also researching whether administrative sources can improve its migration statistics and ‘activity data’ in this context would be useful for the same reasons. For this particular use, ONS only require information about where and when individuals are interacting with secondary care, not why. b. Characteristics data The traditional Census includes questions on ethnicity. It is currently very difficult to estimate ethnicity at a local level between Censuses. Also, very few administrative sources capture ethnicity at all, which would currently make including ethnicity on an Administrative Data Census challenging; Hospital Episode Statistics is one of the few datasets where ethnicity is captured. ONS has worked with NHS Digital data experts to understand the limitations of the ethnicity data, for example coverage and definitions used. ONS can research methodological approaches into mitigating these limitations. Ethnicity and national identity received one of the highest user needs scores from the 2015 Census Topic Consultation, and Census ethnicity information is used by national and local decision makers. For example, in equality impact assessments when local authorities make changes to service delivery. The feasibility of producing admin data based ethnicity estimates will be important when deciding whether to move to an Admin Data based Census after 2021. In terms of the framework of uses presented earlier in this section, then the Administrative Data Census project work described falls into multiple categories: 2. To conduct a range of Statistical Research and Health Analyses using clinical data ONS’ Health analysts will use clinical data from HES for a range of initial statistical purposes: a. 
Exploring the feasibility of producing robust projections of the future health state of the nation. These projections would need to take into account population projections, morbidity and mortality trends, and other characteristics. It is likely HES can provide some of the information required, although there will be gaps in, for example, morbidity data. The different models that could produce projections (Bayesian modelling, Markov chain, microsimulation) would all rely on linked individual-level datasets. HES may provide valuable information on the prevalence of conditions across the population, but there are gaps even with HES: for example, those with conditions / disabilities only interacting with primary and community services, or private secondary care providers (there are / may be other datasets available that could fill these gaps). The State Pension age review, 2017, called for more work on healthy life expectancy projections to better inform future decisions about the state pension age. The review also noted their potential value in informing the planning of future health and social care provision at a local and national level. b. Exploring the use of linked morbidity, mortality, Census, benefits and other data to produce more granular statistics on health inequalities and health state life expectancies. This is potentially more straightforward than a), and involves, for example, linking individuals’ self-assessment of their health and disability state in the 2011 Census, the ONS annual population survey since 2011 (if they were surveyed), and ultimately the 2021 Census (once collected). Where an individual’s self-reported health has transitioned from good to poor, or they indicate for the first time that they have a long-term limiting condition, HES data on actual morbidity can be linked in to compare this with the perception of their own health. 
There are limitations in using HES alone for morbidity information, not least that it only covers secondary care and many new conditions will be diagnosed through primary care only. Research would include what methodological techniques could be used to account for these limitations. The ultimate goal would be producing healthy life expectancy estimates that do not rely on survey data, potentially allowing more granular statistics. A decision on the feasibility of removing self-reported health state questions from the Census and surveys may also lead to reductions in cost and respondent burden. ONS healthy life expectancy statistics are central amongst the public health indicators that help guide local decisions by Local Authorities (LAs) about distribution and prioritisation of services. More local level health expectancy statistics, and more breakdowns by other characteristics, would provide insight allowing LAs to better target interventions to reduce health inequalities. c. Exploring the care pathways in the run up to death. This would allow ONS to add detail to avoidable mortality statistics, and explore any links between health care access and premature mortality, for example suicides and drug related deaths. Linking in Census 2011 data, ONS’ mortality data and HES, will also allow ONS to investigate health inequalities at a local level, bringing in information on characteristics such as ethnicity and occupation from the Census. A specific aspect to this research would be analysing inequalities in infant mortality, where ONS would collaborate with NHS Digital to ensure policy makers have evidence to help meet the Secretary of State’s target to halve infant mortality by 2030. 3. 
Statistical Research into whether ONS can improve its Address Register. This Statistical Research would have a particular focus on identification of communal establishments when using HES data, and would require information in HES about where individuals were admitted from and discharged to. Length of stay will also provide a picture of how many people ONS would expect to be classed as usually resident (> 6 months stay) in hospital at any given time. Sex information may assist with identifying communal establishments that are male or female only. 4. Statistical research to assess the feasibility of creating a better estimate of UK household expenditure on hospital services (inpatient only) and medical and paramedical services (outpatient). The national accounts framework brings units and transactions together to provide a simple and understandable description of production, income, consumption, accumulation and wealth. The team will conduct Statistical Research into whether HES data can improve estimates of revenue paid by patients, split into outpatient and inpatient activity, private patient episodes split by outpatient and inpatient activity, and outpatient activity split between medical services and paramedical services. 5. To assess the feasibility of HES data enabling the UK to report data or proxy indicator data to measure its progress against the United Nations' Sustainable Development Goals (SDGs). 
Interest in HES is specifically around the feasibility of better estimating the following Sustainable Development indicators:
- 3.1.1: Maternal mortality ratio
- 3.1.2: Proportion of births attended by skilled health personnel
- 3.3.5: Number of people requiring interventions against neglected tropical diseases
- 3.5.1: Coverage of treatment interventions (pharmacological, psychosocial and rehabilitation and aftercare services) for substance use disorders
- 3.7.1: Proportion of women of reproductive age (aged 15-49 years) who have their need for family planning satisfied with modern methods
- 3.8.1: Coverage of essential health services (defined as the average coverage of essential services based on tracer interventions that include reproductive, maternal, newborn and child health, infectious diseases, non-communicable diseases and service capacity and access, among the general and the most disadvantaged population)
While ONS’ SDG team will work with NHS Digital and Public Health England to produce these indicators without the need for data sharing, ONS needs to be able to disaggregate these headline indicators by ethnicity, age, sex, disability and geography. Linking HES data to ONS-held data, such as from Census 2011, at an individual level may help ONS to achieve some of these breakdowns. No sensitive data can be accessed through the HDIS. The data provided would include the standard non-sensitive HES fields.
The main short-term benefit is to support ONS in learning more about HES data quality and in determining which HES data it should subsequently require to be shared with it under the Statistics and Registration Service Act 2007, as amended by the Digital Economy Act 2017. The proposed ultimate statistical uses for the data acquired under those powers / that application are detailed in the objectives section. The potential benefits of that statistical research (which will not be possible based on remote access to HES alone) are: 1. Admin Data Census project and improved migration statistics. Population estimates and information on population characteristics are used by a wide range of national and local organisations for numerous purposes, including resource and funding allocation for both local and central Government, service planning and delivery, policy development, monitoring and evaluation, and providing an accurate denominator for other statistics. The Department of Health and its agencies use ONS' population statistics for the planning and provision of health and social care services and the distribution of funds. Throughout government, decisions on the distribution of billions of pounds of funds are made based on population estimates and projections. Respondents to the Census Topic Consultation conducted in June 2015 gave strong evidence of the need for high-quality and more timely population estimates. If it proves feasible, an Admin Data Census approach will deliver more timely statistics. It will potentially also deliver more accurate population statistics, at least in inter-censal periods, if not in a traditional Census year itself. An Admin Data Census approach will also reduce cost and respondent burden. New and more accurate information on international and internal migration is needed to better inform migration system policy making in a post-Brexit era. 
For example, note the 2017 Migration Advisory Committee call for evidence on aspects of migration in response to a Government commission to guide decisions on post-Brexit migration policy. 2. Health analyses Successful production of robust health projections would support better decision making around where to set the state pension age, and planning of health and social care services: The Cridland Report (2017) which was commissioned by government to independently review the state pension age made the following statement: “We believe more work is needed to understand healthy life expectancy, as it affects a range of policy areas. Projecting healthy life expectancy into the future is not currently possible, but would be valuable for future Reviews, as well as in work around health and caring.” Independent Review of State Pension Age: Smoothing the Transition, 2017, pg 35 The report also notes: • Developments in Healthy Life Expectancy (HLE) and Health State Transitions (HST) will have a notable impact on the demand for social care and different types of medical care, for instance the number of trained dementia nurses required in 40 years’ time? • In order to manage budgets and allocate funding effectively, there is a need to understand what the main patterns of key diseases will be, and what the distribution of these illnesses across the population will look like. • It is likely that the prevalence of diseases which affect the oldest old such as cancer and dementia will increase. • If social care and health care provision needs to be increased, the national budget will need to be changed to reflect this which may result in other services seeing cuts. Current healthy life expectancy estimates rely on ONS surveys, where despite the large sample size, the number of breakdowns geographically and by characteristic possible is limited by this sample size. Current estimates rely on aggregate figures – i.e. 
the prevalence of poor health / limiting long term conditions, and mortality rates by age are calculated independently and then fed into the model. Linking health states and mortality at the individual level over time, and for a greater proportion of the population (which may be possible using HES data) will allow more granular analysis. Linking to Census and other sources to add in other characteristics, could inform interventions to support tackling inequalities at the local level. See here for PHE guidance to local authorities: https://www.gov.uk/government/publications/reducing-health-inequalities-in-local-areas See here for local authority profiles from PHE which rely on a lot of ONS data: http://fingertipsreports.phe.org.uk/health-profiles/2017/e07000087.pdf More accurate and new statistics on the characteristics and factors associated with suicides, drug related deaths and infant mortality may provide insight that leads to better targeting of interventions, or the development of new interventions, that could ultimately save lives. Similar to some of the above, the unique benefit ONS can bring in this space is the ability to link the health and mortality sources (which NHS Digital also own and can link / analyse), with other non-health sources such as Census and DWP/HMRC data on benefits and income. This can also be done for a large proportion of the population (although there will be gaps that need to be understood and assessed). 3. Address Register Research Research will enhance the Address Register including the information held on communal establishments (CEs), for which there is currently a recognized data gap. A better Address Register will in turn benefit ONS' other statistics, such as the population statistics described above. 
For example, it will allow ONS to quality assure its local level population statistics (whether from a traditional Census or other method) as local areas with CEs can have unusual demographic profiles, which can cause concern over the accuracy of the statistics unless the location and nature of the CE is known. It will also help with better planning of survey operations and sample design. 4. UK household Expenditure Statistical Research. Household Final Consumption Expenditure is a component of National Accounts; improvements therefore affect estimates of Gross Domestic Product (GDP). This is a key national economic indicator that drives national economic policy making. 5. Sustainable Development Indicator Research. The UK was at the forefront of developing the UN-recognized Sustainable Development Goals (SDGs), and ONS aims to report fully on an agenda that it pushed to develop, in order to continue to show leadership in this space. A key theme of the SDGs is to leave no one behind, and ONS needs to be able to disaggregate the headline indicators so that it can be sure progress occurs across all groups, regardless of ethnicity, age, sex, disability or geography. Subject to feasibility research, linking HES data to ONS-held data, such as from Census 2011, at an individual level may help to achieve this goal.
The key output will be evidence to feed into ONS’ full DARS application for identifiable HES data, which will follow later this year (summer 2018 at the earliest, depending on when full Digital Economy Act powers come into force). Other outputs will include internal statistical data quality reports and desk notes to guide ONS researchers working with HES data once an identifiable dataset has been acquired. These can be shared with NHS Digital analytical colleagues if useful. No external publications or statistics are expected to be released based on ONS’ remote access alone; these will come later, after the subsequent application, which will therefore detail the expected external outputs.
This Agreement permits online access to the record level HES database via the HDIS system. The system is hosted and audited by NHS Digital, meaning that large transfers of data to on-site servers are reduced and NHS Digital has the ability to audit the use of and access to the data. HDIS is accessed via a two-factor secure authentication method by approved users who are in receipt of an encryption token ID. Users have to attend training before the account is set up, and users are only permitted to access the datasets that are agreed within this agreement. Users log onto the HDIS system and are presented with a SAS software application called Enterprise Guide, which presents the users with a list of available data sets and available reference data tables so that they can return appropriate descriptions to the coded data. The access and use of the system is fully auditable and all users have to comply with the use of the data as specified in this agreement. The software tool also provides users with the ability to perform full data minimisation and filtering of the HES data as part of processing activities. Users are not permitted to upload data into the system. Users of HDIS are able to produce outputs from the system in a number of formats. The system can produce small row count extracts for local analysis in Excel or other local analysis software. Users are also able to produce tabulations, aggregations, reports, charts, graphs and statistical outputs for viewing on screen or export to a local system. Only registered HDIS users will have access to record level data downloaded from the HDIS system. Following completion of the analysis the record level data will be securely destroyed. In addition to those outlined elsewhere within this Agreement, the Office for National Statistics will: 1. only use the HES data for the purposes as outlined in this Agreement; 2. 
comply with the requirements of NHS Digital Code of Practice on Confidential Information, the Caldicott Principles and other relevant statutory requirements and guidance to protect confidentiality; 3. not publish the results of any analyses of the HES data unless safely de-identified in line with the anonymisation standard; 4. comply with the guidelines set out in the HES Analysis Guide; and 5. ensure role-based access control is in place to manage access to the HES data within the Office for National Statistics. As this Agreement permits remote access to HES, this would be limited to analysing the data within the secure environment ONS is given access to, and potentially requesting export of aggregate tables. Work would include:
- Gaining experience of wrangling such a large dataset, for example linking records for the same individual across years
- Assessing the quality of key variables such as the ethnicity variable, for example assessing missingness and frequency of ethnic group by other characteristics
- Summarizing the number of hospital interactions by person, age, sex and geography to give an idea of what proportion of each age-sex group in each area are interacting, and therefore for what proportion of the population HES will give ONS evidence of their presence in the country and up-to-date location
- Summarizing secondary care morbidity using the diagnosis variables to establish how useful this aggregate information would be for ONS’ proposed health statistics purposes. This will guide if and how much clinical data ONS seeks in the subsequent full application for identifiable HES data (it is expected that some uses will require record level linkage to other sources that only ONS holds, and therefore aggregate data may well not suffice if the potential benefits are to be realized).
It may be useful for ONS to export some of these aggregate tables if possible. 
ONS already has good links into the NHS Digital secondary care information team who can help with queries about findings in the data and how best to use it.
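As an illustration of the exploratory quality checks listed for this project (for example, assessing missingness of the ethnicity variable by other characteristics), the following is a minimal Python sketch rather than the method ONS actually uses. The record layout, the field names (`age_band`, `ethnic_group`) and the values are hypothetical and do not reflect the real HES schema. The agreement describes analysis inside HDIS through SAS Enterprise Guide, so Python is used here purely for illustration.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative only,
# not the real HES schema.
records = [
    {"person_id": "p1", "age_band": "0-17", "ethnic_group": "A"},
    {"person_id": "p2", "age_band": "0-17", "ethnic_group": None},
    {"person_id": "p3", "age_band": "18-64", "ethnic_group": "B"},
    {"person_id": "p4", "age_band": "18-64", "ethnic_group": "A"},
    {"person_id": "p5", "age_band": "18-64", "ethnic_group": None},
    {"person_id": "p6", "age_band": "18-64", "ethnic_group": None},
]


def ethnicity_missingness(rows, group_key="age_band", field="ethnic_group"):
    """Return {group: (total, missing, missing_share)} as an aggregate table."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        # Treat None and empty strings as a missing ethnicity code.
        if row.get(field) in (None, ""):
            missing[group] += 1
    return {
        group: (totals[group], missing[group], missing[group] / totals[group])
        for group in sorted(totals)
    }


table = ethnicity_missingness(records)
for group, (total, n_missing, share) in table.items():
    print(f"{group}: {n_missing}/{total} missing ({share:.0%})")
```

Only the aggregate table (counts and shares per group), not the record-level rows, corresponds to the kind of output the agreement contemplates for export.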
Project 4 — DARS-NIC-175120-W5G2X
Opt outs honoured: No - legal basis permits flow of identifiable data (Statutory exemption to flow confidential data without consent)
When: 2019/08 — 2019/08.
Legal basis: Other - Data dissemination is mandated under section 45c of the Statistics and Registration Service Act (2007) as amended by the Digital Economy Act 2017
- Improving Access to Psychological Therapies Data Set
The Office for National Statistics (ONS), as the executive arm of the UK Statistics Authority (UKSA), requires access to administrative data held by NHS Digital for the production of official statistics. In the past it has been difficult for ONS to access administrative data controlled by other Government departments, information that could potentially transform official statistics, and the impact they have on decision making, for the better. Often, this has been caused by the lack of a clear legal basis under which the data can be shared with ONS. As a result, in 2016, ONS set out why legislation was needed for better access to data: https://www.statisticsauthority.gov.uk/publication/delivering-better-statistics-for-better-decisions-data-access-legislation-march-2016/

Subsequently, the Digital Economy Act in April 2017 amended the Statistics and Registration Service Act 2007 (SRSA) such that ONS can require public authorities to share data with it. See the Digital Economy Act (chapter 7 of part 5): http://www.legislation.gov.uk/ukpga/2017/30/part/5/chapter/7/enacted

More specifically, section 45c of the SRSA 2007 (as inserted by section 80 of the Digital Economy Act 2017) permits the Statistics Board (of which ONS is part) to serve a Notice on a public authority requiring it to disclose information it holds in connection with its functions: http://www.legislation.gov.uk/ukpga/2007/18/section/45C

To do so, the information so disclosed must be required by the Statistics Board for one or more of its functions as set out in the SRSA 2007 and the Census Act 1920. The SRSA (2007) states that the ONS’s objectives include ‘promoting and safeguarding the production and publication of official statistics that serve the public good, where serving public good includes informing the public about social and economic matters, and assisting in the development and evaluation of public policy’.
It also sets out the Board’s functions, which are specifically referred to in section 45c of the amended SRSA. Notably they include, under section 20, that ONS ‘may produce and publish statistics relating to any matter relating to the United Kingdom or any part of it’. Requirements made under section 45 must also be in line with a statistical statement of principles that has been approved by Parliament: https://www.gov.uk/government/publications/digital-economy-act-2017-part-5-codes-of-practice/statistics-statement-of-principles-and-code-of-practice-on-changes-to-data-systems

This states that ‘We will only seek access to data for the purposes of fulfilling one or more of our statutory functions, including to produce official statistics and undertake statistical research that meets identifiable user needs for the public good.’ The statement also sets out six principles to which ONS will adhere when requiring information under section 45; they state that ONS will:
• safeguard confidentiality
• be transparent about what data it is accessing and why
• ensure accessing the data is lawful and meets strict ethical standards
• ensure that accessing the data is in the public interest - for example, that the data are fit for purpose for the statistical use which ONS intends
• ensure requiring that the data be supplied is proportionate – for example, ONS will have exhausted possible alternatives
• seek to collaborate with suppliers at all times

In addition, the following is a useful framework for categorizing ONS’s statistical uses for information such as that covered under this agreement. They are all ultimately related to ONS’s function of producing Official Statistics mentioned earlier:
• Improvements to existing Official Statistics
• Development of new Official Statistics – this may involve testing to investigate whether statistics of sufficient quality can be produced, and may also involve the production of statistics badged as ‘experimental’ while further work is done to improve quality aspects such as accuracy
• Quality assurance of Official Statistics
• Development of commentary around Official Statistics
• Replacement of current survey questions – developing statistics from available data to directly replace the need to collect the information through survey questions
• Improving efficiency or accuracy of sampling – for example, ensuring that a representative sample of the target population is taken when conducting a survey of the public, such that the statistics produced from the survey are the best possible reflection of reality
• Research and development of methodology – for example, using data to develop and test linkage methodology that is ultimately used to help produce statistics based on other data rather than the original data source

Using robust information governance processes, ONS has determined that the conditions associated with requiring data under section 45c of the amended SRSA have been met for the information in this data sharing agreement. This process involved working closely with NHS Digital’s experts to help determine that the data would likely be of good enough quality to meet the proposed statistical purposes. This work guided ONS’s assessment against some of the principles underpinning its legal powers – for example whether sharing the data is in the public interest, and proportionate in terms of burden on the supplier.
In addition, as part of its commitment to transparency, ONS will publish full details of the reasons for acquiring the information, and ONS notes that NHS Digital will also publish this data sharing agreement. In terms of public interest, it is worth noting that the benefits gained from the statistics enabled by this data share do not need to be specific to health and social care when data are flowing under section 45 of the SRSA. For example, some of the data being required will help improve ONS’s population and economic statistics, and in these cases, the improved statistics may not benefit health and social care directly.

The data shared with ONS under this agreement will not be onwardly disseminated or shared, except as disclosure controlled aggregate statistics and/or analysis as aggregated data with small numbers suppressed, in line with the Hospital Episode Statistics Analysis Guide. Any exceptions to this would require additional NHS Digital approval. It would also require an appropriate alternative legal gateway, because section 45c of the SRSA as amended by the Digital Economy Act only enables data to be shared with ONS (not, for example, other Government departments or academic researchers).

The rest of this section will set out the specific purposes for which ONS requires each dataset. Each purpose will be linked to the framework of statistical uses set out above. In future, ONS may decide to put a dataset to new uses not explained below. In these cases, the new use will be in line with ONS’s legally defined functions. ONS will inform NHS Digital and enter into an amended Data Sharing Agreement before proceeding with that new purpose.

Dataset 1: Birth Notifications data
NHS Digital has disseminated birth notifications data to ONS since 2005.
Support under section 251 of the NHS Act 2006 (reference PIAG 4-05(d)/2005) permitted this sharing, but the legal gateway under which the data will continue to flow will change to section 45c of the amended SRSA 2007. The birth notifications data contribute to ONS statistical analyses of births, maternities, and infant mortality outcomes. Analyses are made publicly available as aggregate National Statistics. These statistics help a range of public and other bodies make better decisions (see section 5d). They also feed into the Department of Health's NHS Outcomes Framework for monitoring low birthweight of term babies.

Birth registration data that ONS receives from the General Register Office (GRO) is the primary source for producing these statistics, and ONS becomes the controller of that data under Section 42 of the Statistics and Registration Service Act 2007. However, there are some limitations with the GRO data, including a lack of medical information such as length of gestation, as well as some missing and implausible values in the fields that are available. To mitigate these limitations, the NHS Digital birth notifications data are used to improve and validate the registration data. Before this can be done, the two datasets must be linked at an individual level. Several identifying variables such as NHS number are received to enable this linkage.
In terms of the statistical uses framework set out earlier, the data are used for:
• Improving official statistics – additional information not on the birth registrations data can be added at the record level once the two sources have been linked
• Quality assurance of official statistics – where information is on both sources, the birth notifications data can be used to validate the values contained in the birth registration data, and potentially edit (overwrite) the birth registrations data where that value is missing or implausible

ONS also plans to use birth notifications data to help develop and improve its data linkage methodology. For example, the birth notifications data allows ONS to link siblings born at different times (i.e. not twins) using the NHS number of the mother, which is only available on the notification data. This provides a ‘gold standard’ linkage method. ONS can then attempt to link siblings together using only the data available in the registration data – e.g. mother’s name and date of birth, but not NHS number. ONS can then assess how closely the latter linkage method matches the gold standard. This will inform the best matching methodology to use when seeking to link siblings if NHS number of mother is not available. This is needed to link pre-2005 birth registration data, a period for which the birth notification data is not available to ONS. This purpose would fall under the Research and development of methodology category in the uses framework above.

Dataset 2: Hospital Episode Statistics
There is a range of initial statistical uses to which ONS intends to put Hospital Episode Statistics (HES) data. Generally, linkage to other sources at a record level is a prerequisite to success for all proposed uses, and therefore identifiers including postcode, date of birth, sex and NHS number are required. The other HES information required varies by purpose, broken down below.
The specification of the variables being required has been developed in collaboration with NHS Digital data experts to ensure the data being shared are of sufficient quality (e.g. coverage, accuracy, relevance) to be likely to support the statistical purpose intended. The proposed uses of the HES data are as follows.

2.1. To enable ONS’s Administrative Data Census Project, including placing administrative data at the core of migration statistics, using ‘activity’ and characteristics data from HES
ONS’s Administrative Data Census Project (ADC) is assessing whether the Government’s ambition that ‘censuses after 2021 be conducted using other sources of data’ can be realized. ONS aims to replicate the type of information collected through a census by using administrative data already held by government, supplemented by surveys. This can then be compared with the data collected by the 2021 census itself. This will allow ONS to determine whether this alternative approach can meet users’ needs.

In addition, ONS set out a cross-Government Statistical Service (GSS) programme working with the Home Office (the lead policy department), the devolved administrations and other government departments who have a strong interest in improving the migration evidence base. ONS aims to deliver improvements in migration statistics by putting administrative data at the core of migration statistics as part of the wider transformation to an administrative data-based population statistics system.
The programme also recognises the changing demand from users of migration statistics and the need for more information on the impact migrants have while they are in the UK: https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/internationalmigration/articles/migrationstatisticstransformationupdate/2018-05-24

There are two main types of information from the Hospital Episode Statistics dataset that are needed for these projects: so-called ‘activity data’, and characteristics data.

a. Activity data
ONS has access to administrative sources that include a large proportion of the population, such as GP patient registration information and tax records. These provide evidence of how many people live in each area of the country. However, these sources often suffer from over-coverage. This is because people may have left the country but still appear in the data, creating the risk that the size of the national population is overestimated. Even when someone is still in the country, they may move without updating their address information with relevant services – for example, they may not register with a new GP at their new location until they need to see a doctor. In this case, there is a risk of ONS including them as contributing to the resident population in the wrong part of the country. ONS can mitigate these limitations using other sources such as HES. For example, where these other sources show that an individual is interacting with a service, it provides evidence that they are in the country, and indeed which address information is correct (if the main sources mentioned earlier do not agree on this). For this particular use, ONS only requires information about where and when individuals are interacting with hospital services, not why.

b. Characteristics data
Ethnicity and national identity received one of the highest user needs scores from the 2015 census topic consultation, and the census ethnicity information is used by national and local decision makers; for example, in equality impact assessments when local authorities make changes to service delivery. The traditional census includes questions on ethnicity but it is currently very difficult to estimate ethnicity at a local level between censuses. The feasibility of producing admin data-based ethnicity estimates will be important when deciding whether to move to an admin data-based census after 2021. Very few administrative sources capture ethnicity at all, so including ethnicity on an administrative data census is challenging. However, HES is one of the few sources where ethnicity is captured. ONS has worked with NHS Digital data experts to understand the limitations of the HES ethnicity data and there are several; for example, coverage and differences between the ethnicity categories used on HES and those used on the Census. However, there are methodological approaches that can be used to mitigate these, and ONS is of the view that it is in the public interest that this ethnicity information is acquired from HES.
In terms of the framework of statistical uses presented earlier in this section, the Administrative Data Census project work described (both a and b) falls into multiple categories:
• Improvements to existing Official Statistics - If an Administrative Data Census proves feasible, ONS will be able to produce census-type population and other statistics more often, in more granular detail, produce new analyses not possible using traditional census data, and reduce the cost and burden on the public by avoiding a traditional decennial census
• Development of new Official Statistics - In the short term, ‘activity data’ from HES may contribute to new admin data-based migration statistics
• Quality assurance of Official Statistics - ‘Activity data’ will help ONS quality assure presence and address information from other sources
• Development of commentary around Official Statistics - Identification of interaction by migrants with secondary care will allow ONS to expand on and increase the frequency of commentary on population changes and impacts, meeting user demand and providing better evidence to better inform policy-makers; for example, impact of migrants on public service demand
• Research and development of methodology - Estimating ethnicity at a population level by local area using an Administrative Data Census approach will be challenging. Using HES ethnicity data, methodological teams will gain experience of developing methods to mitigate the statistical weaknesses often found in administrative data. For example, how to adjust for bias in coverage, and also data being collected on a different statistical definition compared to the desired definition

2.2. To conduct a range of Statistical Research and Health Analyses using clinical data from HES
ONS’s health analysts will use information about why people have accessed hospital services, for example diagnosis, for a range of statistical purposes.
This information is clearly more sensitive, and the intended statistical uses will require testing to determine whether official statistics of sufficient quality can be produced using HES data. As such, the volume of this information is being minimised to that absolutely necessary to do this. In practice, this means fewer years’ worth of information about why people have accessed hospital services will be shared with ONS, compared with the information about when and where people have accessed services. a. Exploring the feasibility of producing robust projections of the future health state of the nation. The State pension age review, 2017, called for more work on healthy life expectancy projections to better inform future decisions about the state pension age. The review also noted their potential value in informing planning future health and social care provision at a local and national level. These projections would need to take into account population projections, morbidity and mortality trends, and other characteristics, and HES could provide some of the information required. ONS recognises that there are serious limitations when using healthcare activity data, particularly hospital episodes, to make inferences about the health of the population. However, using the HES data experimentally will allow ONS to investigate the possibilities of this dataset contributing to more complete estimation of selected serious and acute illnesses, in combination with mortality data and other relevant sources. It will be necessary to link the HES data with other data sources to prevent double counting of cases and understand the relative completeness, coverage and quality of each data source, and to enable additional demographic variables to be applied to the HES data, therefore record level identifiable data is required. 
In terms of the framework of statistical uses, this would be Research and Development of Methodology in the first instance, with the ultimate goal of Developing New National Statistics.

b. Exploring the use of linked morbidity, mortality, census, benefits and other data to produce more granular statistics on health inequalities and health state life expectancies.
ONS healthy life expectancy statistics are central amongst the public health indicators that help guide decisions by Local Authorities (LAs) about the distribution and prioritisation of services. More local level health expectancy statistics, and more breakdowns such as ethnicity, educational attainment and occupation-based socioeconomic position to examine interactions, would provide insight allowing LAs to better target interventions to reduce health inequalities. Researching the feasibility of meeting this need will involve linking the HES data to individuals’ self-assessments of their health and disability status as collected by the 2011 Census, the ONS annual population survey since 2011 (for those surveyed), and ultimately the 2021 Census once collected in due course. ONS will explore the relationship between hospital admissions and self-reported health status at both individual and small area levels, and with reference to potentially mediating or confounding demographic and geographic variables. Therefore, identifiable record level data is required, including postcodes. Research will include exploring the feasibility of using actual morbidity data such as HES to supplement or even replace survey data to produce healthy life expectancy estimates, potentially allowing more granular statistics. In terms of the framework of statistical uses, this would be Developing New National Statistics and potentially Replacing current survey questions.

c. Exploring the completeness of death certification and patterns of comorbidities in specific population groups
ONS holds data from the compulsory registration of all deaths in England and Wales. The information recorded about causes of death is sometimes unclear or inadequate for the range of public health, monitoring and research purposes to which the data can be put. The majority of deaths occur in hospital, or following an illness for which the deceased had hospital treatment. Linking the diagnosis data in HES with the registered causes of death will allow exploration of the relationships between them, including:

(i) Understanding multi-morbidity and vulnerability in the elderly. It is well-known that deaths of elderly people tend to mention more health conditions, but also to be less specific in a way which makes identifying the factor(s) which contributed most to death difficult. Terms such as ‘old age’ and ‘frailty’ are often used on death certificates with no specific clinical cause of death. By examining the HES diagnoses and registered causes of death together, ONS will aim to throw more light on the combinations of health conditions in elderly people (multimorbidity), the role and frequency of key conditions such as pneumonia and sepsis in the causal pathways leading to death, and if possible to develop new measures of avoidable mortality in the elderly. This use would require the linkage of HES to deaths at the individual record level. ONS would also link the data to the Census and/or survey data, so as to explore the role of social factors such as living alone in deaths of the elderly along with clinical factors, with the potential to identify at-risk groups and improve targeting of preventive interventions.

(ii) Understanding infant mortality. The causes of death recorded at registration of perinatal deaths in particular are often very broad and not clinically meaningful.
ONS is discussing with clinical and scientific experts ways to improve this information and to determine the underlying cause of death. Linkage of the HES data to registered deaths will provide extra information on the factors underlying the recorded causes of death. ONS will aim to improve the accuracy and completeness of infant mortality statistics, potentially contributing to the government ambition to halve infant mortality by 2025.

In terms of the framework of statistical uses, these projects would contribute to Improvements to existing Official Statistics, Quality Assurance of Official Statistics and Developing New National Statistics.

2.3. Improving ONS’ Address Register
This project will investigate using HES data to identify and/or validate the addresses of communal establishments, and would require information including where individuals were admitted from and discharged to. Also:
• Length of stay information will provide evidence of how many people ONS would expect to be classed as usually resident (> 6 months stay) in hospital at any given time
• Sex information may assist with identifying communal establishments that are male or female only.
In terms of the framework of statistical uses, this research, if successful, would enable Quality Assurance of Official Statistics and Improved efficiency / accuracy of sampling.

2.4. Creating a better estimate of the UK household expenditure on hospital services (inpatient only) and medical and paramedical services (outpatient)
The ONS national accounts framework provides a simple and understandable description of national production, income, consumption, accumulation and wealth. The national accounts research team will investigate whether HES data can improve estimates of revenue paid by patients, split into outpatient and inpatient activity, private patient episodes split by outpatient and inpatient activity, and outpatient activity split between medical services and paramedical services.
The data may also be used to improve the figures on UK healthcare resources, activity and expenditure which are provided regularly to the international institutions (Eurostat, OECD and WHO) for comparative purposes. In terms of the framework of statistical uses, the ultimate aim would be to Improve an existing National Statistic – i.e. UK national accounts.

2.5. Enabling the UK to report data or proxy indicator data to measure its progress against the United Nations’ Sustainable Development Goals (SDGs)
The UK is committed to reporting progress against all of the internationally agreed Sustainable Development Goals (SDGs), and ONS will lead on delivering this. In some cases, new indicators will need to be developed, and/or new uses made of existing data. Interest in HES is specifically around the feasibility of providing data for the following Sustainable Development indicators:
• Maternal mortality ratio
• Proportion of births attended by skilled health personnel
• Number of people requiring interventions against neglected tropical diseases
• Coverage of treatment interventions (pharmacological, psychosocial and rehabilitation and aftercare services) for substance use disorders
• Proportion of women of reproductive age (aged 15-49 years) who have their need for family planning satisfied with modern methods
• Coverage of essential health services (defined as the average coverage of essential services based on tracer interventions that include reproductive, maternal, newborn and child health, infectious diseases, non-communicable diseases and service capacity and access, among the general and the most disadvantaged population)

ONS’s SDGs team are working with NHS Digital and Public Health England (PHE) to produce these indicators without the need for data sharing. However, ONS also needs to disaggregate these headline indicators by ethnicity, age, sex, disability and geography.
In some cases, NHS Digital / PHE will not hold data that would enable this, but linking HES data to ONS held data such as from Census 2011 at an individual level may fill this gap. In terms of the framework of statistical uses, the ultimate aim would be to Develop a new National Statistic. Dataset 3: Improving Access to Psychological Therapies (IAPT) Dataset 3.1. To enable research being conducted by ONS’ Administrative Data Census and Migration Statistics improvement projects using ‘activity’ and characteristics data from IAPT. This first use is essentially the same as described for the uses of HES data within these projects: The IAPT data provides evidence of presence at a particular address, and it also includes information on characteristics including ethnicity. See section 2.1 above (within the HES section) for the full rationale for why this information is needed. 3.2. To conduct a range of Statistical Research and Health Analyses using IAPT data a. Statistical Research to inform Primary Mental Health Service Policy Making This project will focus on common mental health disorders (CMDs) such as anxiety and depression. Using a phased approach, ONS will look first at the mortality risk of people with CMDs, co-morbidities between mental and physical health problems, and investigate inequalities around mental health. In the second phase, ONS will investigate income and employment transitions for patients who have been through mental health treatment. The first phase will address existing evidence gaps on co-morbidities between mental and physical health, improve understanding on the demographics of people with CMDs, and investigate whether some or all people with CMDs are more at risk of death than the general population. The IAPT data for 2012 to 2017 will be linked to the 2011 Census to provide detailed demographic background, and to death registrations from 2012 to 2018. 
The mortality analysis will focus on specific causes of death which may be connected to mental health (suicide, alcohol and drug abuse) as well as overall risk. In addition, the causes of death will be compared to the distribution of causes in the general population to identify any common co-morbidity with life-threatening illnesses. This goes some way towards providing insight into important issues raised by the NHS England Five Year Forward View on mental health: “An important barrier to good care is the lack of appropriate data sharing to enable organisations to identify co-morbidities…People with poor mental health may require primary care, secondary physical care and social care, as well as mental health services, but the lack of linked datasets hinders effective provision.”

IAPT data is estimated to cover over 15% of people with CMDs in England. Because of the service’s large, national scale and focus on people with mild and moderate mental health conditions, it provides a reasonable proxy for patterns and trends in the population of people with diagnosable CMDs. The IAPT data will be compared with the findings of the Adult Psychiatric Morbidity Survey (2007 and 2014) to assess likely issues of representativeness, such as the under-representation of specific population groups in the treatment cohort. People with severe mental health conditions are not typically treated in the IAPT programme.

The three-way linkage will provide an independent and more detailed demographic baseline than the IAPT data could do alone, and allow ONS to investigate if there have been changes in people’s circumstances between the Census and treatment in IAPT (e.g. becoming disabled or living alone). Having the mortality data linked as well allows ONS to see the overall trends in mortality, and to see if there is any relationship between changes in demographics and the cause of death outcomes.
The research is not aiming to look at individual level outcomes or to evaluate the IAPT treatment, but to look for trends in the aggregate data after linkage, to provide population level analysis to inform policy. Entry into IAPT treatment will be used as the main indicator of having a diagnosable CMD. The clinical data will not be analysed except to: • Group the cohort into broad types of CMD • Potentially, link successful/unsuccessful treatment outcome to risk of subsequent death. b. Exploring the feasibility of producing robust projections of the future health state of the nation. The State pension age review, 2017, called for more work on healthy life expectancy projections to better inform future decisions about the state pension age. The review also noted their potential value in informing planning future health and social care provision at a local and national level. These projections would need to take into account population projections, morbidity and mortality trends, and other characteristics, and IAPT data could provide some of the information required. ONS recognises that there are serious limitations when using healthcare activity data to make inferences about the health of the population. However, ONS will investigate the possibilities of IAPT data contributing to more complete estimation of morbidity, to then use alongside mortality data and other relevant sources in health projection modelling. This assessment of morbidity would be in conjunction with other sources such as NHS Digital’s Hospital Episode Statistics (HES) that the Board has already required be shared with ONS. c. Exploring the use of linked morbidity, mortality, census, benefits and other data to produce more granular statistics on health inequalities and health state life expectancies. ONS healthy life expectancy statistics are central amongst the public health indicators that help guide decisions by Local Authorities (LAs) about the distribution and prioritisation of services. 
More local level health expectancy statistics, and more breakdowns such as ethnicity, educational attainment and occupation based socioeconomic position to examine interactions, would provide insight allowing LAs to better target interventions to reduce health inequalities. Researching the feasibility of meeting this need will involve linking IAPT data to individuals’ self-assessments of their health and disability status as collected by the 2011 Census, the ONS annual
As per section 5a, the legal gateway under which data will flow from NHS Digital to ONS will be Section 45c of the SRSA 2007 (as amended by the Digital Economy Act 2017). This means ONS can require that data are shared as long as the data are required for its functions, and the share is in line with the statistical statement of principles that underpins these powers. These considerations include that the purposes to which ONS puts the data must be in the public interest and serve the public good. However, for this legal gateway, the benefits do not need to be to health and social care specifically. This is unlike some other legal gateways under which NHS Digital data can be disseminated, for example section 251 of the NHS Act 2006, when research outcomes must benefit health and social care. In the above context, the following will briefly cover all the potential benefits by dataset.
Dataset 1: Benefits of the ONS births statistics which depend on the Birth Notifications data
Local authorities and other government departments are important users of birth statistics and use the data for planning and resource allocation. For example, local authorities use birth statistics to decide how many school places will be needed in a given area. The Department for Work and Pensions uses detailed birth statistics to feed into statistical models they use for pensions and benefits. The Department of Health uses the data to plan maternity services and inform policy decisions. Other users include academics, demographers and health researchers, who conduct research into trends and characteristics. Lobby groups use birth statistics for their cause, for example, campaigns against school closures or midwife shortages. Special interest groups, such as Birth Choice UK, make the data available to enable comparisons between maternity units to help women choose where they might like to give birth, and work closely with health professionals.
Charities, such as the Twins and Multiple Births Association, provide advice and support to multiple birth parents and use the data to monitor trends. Organisations such as Eurostat and the UN use ONS birth statistics for international comparison purposes. The media also report on trends and statistics. In addition, ONS’ births data is used as a component of its population statistics. Population estimates and projections are also used extensively throughout government and specifically by the Department of Health and their agencies for the planning and provision of health and social care services, and the distribution of funds. Throughout government, decisions on the distribution of billions of pounds of funds are made based on population estimates and projections. In addition, they are used as the denominator in any statistics that are published on a per capita basis. For example, any health data published per capita for a particular level of geography (national, regional, local authority, clinical commissioning group, parliamentary constituency etc) is almost certain to use ONS estimated or projected population as the denominator. Population estimates and projections are published by age and sex. This means that they can also be used to better target age and sex specific health and care services (e.g. maternity, ageing populations etc).
Benefits of Data Linkage Methodology Research that will use the birth notifications data:
Any improvements in data linkage expertise and methodology would be an enabler to other projects and therefore their benefits. This is because the more accurately data can be linked, the more accurate any statistics derived from the linked dataset will be. The benefits of these projects may or may not be relevant to health and social care – for example, many of the projects involving HES data will depend on accurate data linkage.
Dataset 2: Predicted Benefits of the uses for Hospital Episode Statistics data
2.1.
Benefits of the Admin Data Census Project and Improved Migration Statistics
Population estimates and information on population characteristics are used by a wide range of national and local organisations for numerous purposes including resource and funding allocation for both local and central Government, service planning and delivery, policy development, monitoring and evaluation, and providing an accurate denominator for other statistics. The Department of Health and their agencies use ONS’ population statistics for the planning and provision of health and social care services and the distribution of funds. Throughout government, decisions on the distribution of billions of pounds of funds are made based on population estimates and projections. Respondents to the Census Topic Consultation conducted in June 2015 gave strong evidence for high-quality and more timely population estimates. If it proves feasible, an Admin Data Census approach will deliver more timely statistics. It will potentially also deliver more accurate and timely population statistics, at least in inter-censal periods, if not in the traditional census year itself. An Admin Data Census approach will also reduce cost and respondent burden. New and more accurate information on international and internal migration is needed to better inform migration system policy making in a post-Brexit era. Evidence of this includes the 2017 Migration Advisory Committee call for evidence on aspects of migration, in response to a Government commission to guide decisions on post-Brexit migration policy, and the cross-Government Statistical Service (GSS) migration transformation programme.
2.2.
Benefits of the Health Analyses
Successful production of robust health projections would support better decision making around where to set the state pension age, and planning of health and social care services. Evidence of this includes the Cridland report (2017), which was commissioned by government to independently review the state pension age, and made the following statement: (https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/611460/independent-review-of-the-state-pension-age-smoothing-the-transition.pdf) “We believe more work is needed to understand healthy life expectancy, as it affects a range of policy areas. Projecting healthy life expectancy into the future is not currently possible, but would be valuable for future Reviews, as well as in work around health and caring.” The report also notes:
• Developments in Healthy Life Expectancy and Health State Transitions will have a notable impact on the demand for social care and different types of medical care, for instance the number of trained dementia nurses required in 40 years’ time
• In order to manage budgets and allocate funding effectively, there is a need to understand what the main patterns of key diseases will be, and what the distribution of these illnesses across the population will look like
• It is likely that the prevalence of diseases which affect the oldest old such as cancer and dementia will increase
• If social care and health care provision needs to be increased, the national budget will need to be changed to reflect this, which may result in other services seeing cuts.
Current healthy life expectancy estimates rely on ONS surveys where, despite the large sample size, the number of possible breakdowns geographically and by characteristic is limited by that sample size.
Current estimates also rely on aggregate figures, i.e. the prevalence of poor health / limiting long term conditions and the mortality rates by age are calculated independently and then fed into the model. These factors limit the accuracy of the model. Linking health states and mortality at the individual level over time, and for a greater proportion of the population (which may be possible using HES data), will allow more granular analysis. Linking to Census and other sources to add in other characteristics could inform interventions to support tackling inequalities at the local level. Improving understanding of causes of death in vulnerable population groups such as the elderly and infants, by using HES diagnostic data to supplement the registered causes of death, will improve mortality statistics which are currently relied on by government for a wide range of policy and resource allocation purposes and as indicators in the NHS outcomes frameworks. Developing a better understanding of complex causes of death in the elderly will help to address an internationally acknowledged issue which is of growing importance as the average age of the population, and the proportion of deaths which are among elderly people, increases globally. There is international interest in developing new measures of avoidable death in the elderly, and the potential of studies on this to help identify those who are most at risk and target preventive interventions.
2.3. Benefits of Improving the ONS Address Register
This research will enhance the Address Register including the information held on communal establishments (CEs), for which there is currently a recognized data gap. A better Address Register will in turn benefit ONS’ other statistics, such as the population statistics described earlier.
For example, it will allow ONS to quality assure its local level population statistics (whether from a traditional Census or other method) as local areas with CEs can have unusual demographic profiles, which can cause concern over the accuracy of the statistics unless the location and nature of the CE is known. It will also help with better planning of survey operations and sample design.
2.4. Benefits of Improving UK household Expenditure
Household Final Consumption Expenditure is a component of National Accounts; improvements to its accuracy therefore improve estimates of Gross Domestic Product (GDP). GDP is a key national economic indicator that drives national economic policy making, in turn potentially affecting the wellbeing (financial or otherwise) of everyone in the country.
2.5. Benefits of Improving ONS Sustainable Development Indicators
The UK was at the forefront of developing the United Nations recognized Sustainable Development Goals (SDGs). ONS aims to fully report UK progress against these goals (i.e. have data available for the SDG indicators that have been proposed), given the UK was heavily involved in SDG development, and wants to continue to show leadership in this space. A key theme of the SDGs is to leave no one behind and ONS needs to be able to disaggregate the headline indicators so that it can be sure progress occurs across all groups, regardless of ethnicity, age, sex, disability, geography. Subject to feasibility research, linking HES data to ONS held data such as from Census 2011 at an individual level may help to achieve this goal. In some cases, reporting against the SDG indicators will not enable better decision making on UK government policy, but it will encourage other nations to fully report against the indicators, and by extension enable better decision making in those nations.
Dataset 3: Benefits of the uses for IAPT data
3.1.
Benefits of the Admin Data Census project and improved migration statistics
This is essentially the same as described for the uses of HES data within ONS’ Admin Data Census Project. See section 2.1 above (within the HES section) for benefits of these uses.
3.2. Benefits of Statistical Research and Health Analyses using IAPT data
Mental health is a high priority in government policy. Co-morbidities between mental and physical health, as well as inequalities in mental health, are of increasing interest within health policy. In their response to the Five Year Forward View (FYFV), NHS England set an objective for the majority of new common mental health disorder (CMD) services to be integrated with physical healthcare by 2020/21. This is in line with a King’s Fund report which provided evidence for the strong links between mental and physical health. This project will add to the evidence base by:
• providing information on many physical conditions (rather than a focus on only a few key health problems, as in the Adult Psychiatric Morbidity Survey)
• providing a detailed demographic context, including information such as ethnicity, sexual orientation, occupation, marital status
• investigating inequalities
Investigating the links between mental health, mortality, and co-morbidity has clear benefits for the public. By determining physical and mental health conditions that commonly co-occur, the government can target its health services to better meet the needs of patients, resulting in a better patient experience, and could ultimately save lives. For example, it may be that a particular cause of death has an increased prevalence in patients with CMD compared to the general population; by ensuring policy makers and clinical staff are aware of this, prevention and intervention could be more targeted. The second benefit of the project is analysis of inequalities in mental health, in line with the FYFV “focus on tackling inequalities.
Mental health problems disproportionately affect people living in poverty, those who are unemployed and who already face discrimination”. The King’s Fund found that people with long term physical health problems and co-morbid mental illness disproportionately live in deprived areas. This analysis would allow detailed geographical mapping of those with a CMD who died from particular causes, and analysis by deprivation deciles. Other demographic variables could also be used for inequalities analysis to investigate any difference in premature mortality in certain demographic groups of IAPT users (age, sex, or occupation) versus the general population. Obtaining this information will benefit the public by allowing healthcare providers to target groups who may be disproportionately affected by physical and mental health problems, and subsequently reduce premature mortality due to co-morbidities. The benefits of the other health statistics mentioned in section 5a (3.2) to which IAPT data will contribute are already described under the benefits gained from ONS acquiring HES data.
Dataset 1: Birth Notifications
Official Birth Statistics
Annual birth outputs represent births occurring in England and Wales in a given year. A package containing summary tables for the previous calendar year is released in July, with supporting commentary in a statistical bulletin. More detailed figures are then released between August and December in a series of themed packages. Each package consists of a number of data tables; these are generally accompanied by a statistical bulletin. ONS’ tables provide the latest year’s figures with some also showing historical data for comparison, sometimes back to 1837. ONS publishes all its statistics on its website, and also extends its reach through social media, for example its Twitter feed. There are several published packages:
Birth summary tables: includes the number of live births and stillbirths, fertility rates, percentage of live births outside marriage and civil partnership, mean age of mother and percentage of live births to non-UK born mothers for England and Wales as a whole. Live births (number and rate) and the number of stillbirths are also provided down to local authority level. To aid with user interpretation, ONS also publishes an interactive fertility mapping tool, which enables users to analyse trends in fertility by county district and unitary authority; this is contained within the statistical bulletin.
Parents’ country of birth: includes births by country of birth of mother and total fertility rates for UK born and non-UK born women for England and Wales as a whole. Summary figures are also available down to local authority level. ONS publishes detailed analysis on parents’ country of birth because this information is collected at birth registration and does not change over time, while their nationality or ethnicity may change.
Birth characteristics and by area of usual residence: contains statistics on stillbirths and maternities for England and Wales, birthweight data for live and stillbirths by mother's region of usual residence, and live births and stillbirths in hospitals and communal establishments by region of occurrence. These tables also provide figures on month and quarter of occurrence, place of birth, ethnicity, gestational age and multiple births for England and Wales as a whole. The package also provides summary data for live births down to local authority level, including figures by age of mother; figures are published using boundaries in place during the year the birth occurred.
Births by parents’ characteristics: provides live birth, stillbirth and maternity statistics by age of mother and type of registration (within marriage and civil partnership, joint, sole). It also provides data on previous live-born children, National Statistics Socio-economic Classification (NS-SEC), median birth intervals, age-specific fertility rates for men and mean age of fathers. All tables are for England and Wales as a whole with no sub-national breakdown.
Childbearing for women born in different years (formerly known as Cohort fertility): presents data on fertility by year of birth of mother rather than the year of birth of child for England and Wales as a whole; this package includes the average number of live-born children and the proportion of women remaining childless for women born in different years.
Data Linkage Methodology Research: This will result in internal, and potentially external, ONS reports and presentations on how best to link siblings / family units together when linkage based on NHS number is not possible. Any reports or presentations would not include statistics derived from the birth notifications data. They would only include figures comparing the success of various matching strategies compared to one based on linking using mother’s NHS number.
Dataset 2 and dataset 3: Hospital Episode Statistics and Improving Access to Psychological Therapies data
The initial uses to which ONS will put HES and IAPT data are most commonly new or improved official statistics that will enable better decision making (see sections 5a and 5d). To reach this goal, a lot of development work, testing, and quality assurance will be required to determine whether official statistics of sufficient quality can be produced in each case. Generally, this initial work will be disseminated through a range of products and channels, in particular research updates and research outputs. For example, the Admin Data Census project already publishes its research outputs and work involving HES will be reported in similar fashion on this section of the ONS website: https://www.ons.gov.uk/census/censustransformationprogramme/administrativedatacensusproject/administrativedatacensusresearchoutputs
Subsequently, projects will move on to the production of experimental statistics and potentially in due course, National Statistics (a status that can only be gained once certain quality standards are met). Both types are released via the ONS website. By way of illustration, a good example of an experimental statistic is here: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/estimatingsuicideamonghighereducationstudentsenglandandwalesexperimentalstatistics/2018-06-25
This release is based on a project linking information about suicides with information on higher education students to increase the evidence base on suicides by those in higher education. No targets can be given as to if and when experimental or National Statistics will be produced using HES or IAPT data until the initial stage of any given project is complete. All ONS statistical teams engage regularly with users, and will seek to provide frequent updates on these projects during that first stage.
Dataset 1: Birth Notifications data
ONS receives the data in real time through its Spine2 connection from NHS Digital. It arrives as XML files which are converted on a secure WebLogic server before being transferred to another secure server for processing. Here, it is processed ready for ONS use. This server is separate to those which are used for other datasets from NHS Digital, due to the long-standing nature of this data share. The birth notifications data are linked with ONS’s birth registrations data at an individual level. Where possible, the NHS number of the baby and/or mother is used. In some cases this will fail, for example when the same NHS number is used twice in the registrations data in error. Therefore, other demographic variables are used for linkage when required. The majority of this is automated matching with no visual inspection of the identifiable data, but in a small number of cases, clerical matching is required. Only a small number of security cleared, trained, substantive ONS employees are involved with this part of the process. Once linkage is complete, other variables from the birth notifications data are used to either enhance or validate the birth registrations data. Once the enhancement and validation are complete, all additional birth notification data that are not needed for the production of statistics, notably the identifiers, are removed before any more ONS staff can access the data. As suggested in 5a, the resulting de-identified, linked dataset includes additional variables from the birth notifications data that were not on the birth registrations data. This linked de-identified dataset is transferred to another secure server where health analysts can produce the statistics listed in section 5c. No attempt is made to re-identify individuals; ONS is only interested in producing aggregate statistics for the public good. ONS employs strict security procedures to protect confidentiality throughout processing.
These include:
• Only a small number of substantive ONS employees can access the data and all ONS employees who have access to the data have contractual obligations of confidentiality, enforceable via disciplinary procedures, as set out in the ONS Code of Practice
• Relevant staff are Security Check cleared, and have undergone appropriate training and ongoing supervision to maintain confidentiality and integrity
• Data are held on secure servers with restricted access; the data are only held in an identifiable form for the shortest period necessary to enable the data to be used for the stated purposes
The complete Birth Notification dataset is required, rather than just a sample, because the birth registrations data, which is the primary source for ONS’s birth statistics, is (in theory) a Census of all births. This means the statistics are more accurate than statistics based on surveys that suffer from sampling error. If the variables that are appended to the registrations data from the birth notifications were only a sample, this would reduce the accuracy of some of the birth statistics, limiting their use and impact. In addition, the difference in accuracy between statistics based on different variables within the same statistical release would be confusing for users. Some variables that are on the birth registrations have their value overwritten with the equivalent value from the birth notifications data. In this case, only having a sample of birth notifications data, or certain geographic regions, would potentially introduce bias. For example, it would mean ONS’s birth statistics are more accurate for the regions where it has birth notifications data and is therefore able to improve on any implausible values in the birth registrations data, than for those where it would not have been able to do this.
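The linkage steps described above for the birth notifications data (NHS number where possible, other demographic variables where that fails, clerical matching for the remainder, and removal of identifiers afterwards) could be sketched roughly as follows. This is an illustrative sketch only, not ONS's actual process: the field names (`nhs_number`, `dob`, `postcode`, `sex`, `gestation_weeks`), the choice of fallback variables and the function names are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical identifier fields; the real variable names are not given in the text.
IDENTIFIERS = ("nhs_number", "dob", "postcode")


def _index(records, key_fn):
    """Group records by key, skipping records whose key is incomplete."""
    groups = defaultdict(list)
    for rec in records:
        key = key_fn(rec)
        if key is not None:
            groups[key].append(rec)
    return groups


def _nhs_key(rec):
    return rec.get("nhs_number") or None


def _demographic_key(rec):
    # Assumed fallback variables: date of birth, postcode and sex.
    parts = (rec.get("dob"), rec.get("postcode"), rec.get("sex"))
    return parts if all(parts) else None


def link(notifications, registrations):
    """Return (linked_pairs, clerical_queue).

    A notification is linked automatically only when exactly one
    registration matches; anything else is queued for clerical review.
    """
    by_nhs = _index(registrations, _nhs_key)
    by_demo = _index(registrations, _demographic_key)
    linked, clerical = [], []
    for note in notifications:
        candidates = by_nhs.get(_nhs_key(note), [])
        if len(candidates) != 1:
            # NHS number missing, unmatched or duplicated: try demographics.
            candidates = by_demo.get(_demographic_key(note), [])
        if len(candidates) == 1:
            linked.append((note, candidates[0]))
        else:
            clerical.append(note)
    return linked, clerical


def de_identify(note, reg, keep_from_notification=("gestation_weeks",)):
    """Enhance a registration with selected notification variables, then drop identifiers."""
    merged = dict(reg)
    for field in keep_from_notification:
        if field in note:
            merged[field] = note[field]
    for field in IDENTIFIERS:
        merged.pop(field, None)
    return merged
```

In this sketch a duplicated NHS number in the registrations makes the first lookup ambiguous, so the record falls through to the demographic key, mirroring the failure case mentioned in the text. It does not guard against two notifications linking to the same registration, which a real process would need to handle.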
Dataset 2 and 3: Hospital Episode Statistics (HES) and Improving Access to Psychological Therapies (IAPT) data
Data security for storage and linkage of the data will be provided with an assured ONS data analysis environment that includes the following elements of security control:
• Need To Access applied through user account access and management. Access to the data is restricted to individuals granted access on the basis of a justified need to access the data
• Controlled ingest and export of data into/out from the DAP environment
• Controlled account access using unique credentials based on job role
• Logged and monitored access of user activity within the DAP environment
• Secure build configuration for infrastructure
• Vulnerability tested infrastructure with appropriate remediation and patching
• Compliance checks against security enforcing controls
• Architectural review against standards and best practice
• Staff security cleared to the appropriate level based on their supervised and/or unsupervised access to sensitive data in accordance with ONS clearance policies and data access processes
• Education and awareness of environment users covering security policies and secure working practices
• Operational support processes to securely manage the environment
• Risk assessment to identify security risks and mitigation actions to reduce this risk.
Following policy specified by the ONS Chief Security Officer, ONS user access to the data environment is granted only after approval of an application by the Information Asset Owner, including ethical assessment of proposed data use. A list of approved users is available on request. With reasonable notice, periodic written/verbal checks may be conducted by an authorised employee of NHS Digital to confirm compliance with this application. ONS will keep a record of any processing of Personal Data and will provide a copy of such record to NHS Digital on request.
ONS will not transfer or permit the transfer of the Data to any territory outside the UK without the prior written consent of NHS Digital. As described in section 5a, the proposed purposes require linkage of records at the individual level. This is why personal identifiers such as date of birth, postcode and NHS number are required. However, ONS is only interested in producing aggregate statistics and using these to uncover trends and other useful insights based on the non-identifiable ‘attribute’ information. ONS is in the process of enhancing the capabilities of its Data Access Platform to allow for variable-by-variable control of researcher access granting. Until this is completed, any analyst who is granted access to the dataset will technically have access to all variables and identifiers. However, users will not be permitted to access identifiers for the purposes of analysis. In addition to the system protocols above, ONS will therefore keep the number of staff permitted to process identifiers to an absolute minimum, and these staff will have a higher level of clearance. All other staff will only be permitted to access non-identifying data. Inadvertent re-identification is still a risk but ONS will never seek to intentionally re-identify this data. ONS staff are suitably trained; for example ONS’s health analysts in particular are experienced working with sensitive data about deaths (such as individual level data about suicides). Further, only statistical disclosure controlled aggregate outputs will be exportable from the secure data analysis environment. In other words, other than the initial transfer of the data from NHS Digital to ONS, the identifiable data will never be in transit and will always be protected by procedural controls in place now and technical controls to be implemented by 31/12/2019 to enforce the controls as described above. 
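The statement above that only statistical disclosure controlled aggregate outputs will be exportable can be illustrated with a small sketch. The text does not say which disclosure control methods ONS applies, so the example below assumes one common technique, primary suppression of small counts; the threshold of 10, the function name and the field names are hypothetical.

```python
from collections import Counter


def aggregate_with_suppression(records, group_fields, threshold=10):
    """Count records per group and suppress any count below the threshold.

    Suppressed cells are returned as None so that no small count leaves
    the analysis environment. The threshold is an assumed value.
    """
    counts = Counter(tuple(rec[f] for f in group_fields) for rec in records)
    return {
        group: (n if n >= threshold else None)
        for group, n in counts.items()
    }
```

This covers primary suppression only. A real disclosure control process would typically also consider whether suppressed cells can be recovered from row or column totals, which this sketch does not attempt.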
The reasons why complete information on who accessed hospital services, and when and where, is needed for 2009/10 onwards vary across the multiple statistical purposes presented in section 5a. The drivers are largely to do with quality and therefore value of the statistics that can be produced using the full dataset compared with less than this, for example a subset or random sample. There is more on statistical quality on the ONS website including the following: ‘The quality of a statistical product can be defined as the “fitness for purpose” of that product. More specifically, it is the fitness for purpose with regards to the European Statistical System dimensions of quality:
• relevance – is the degree to which a statistical product meets user needs in terms of content and coverage
• accuracy and reliability – is how close the estimated value in the output is to the true result
• timeliness and punctuality – describes the time between the date of publication and the date to which the data refers, and the time between the actual publication and the planned publication of a statistic
• accessibility and clarity – is the ease with which users can access data, and the quality and sufficiency of metadata, illustrations and accompanying advice
• coherence and comparability – is the degree to which data derived from different sources or methods, but that refers to the same topic, is similar, and the degree to which data can be compared over time and domain, for example, geographic level
There are additional characteristics that should be considered when thinking about quality. These include output quality trade-offs, user needs and perceptions, performance cost and respondent burden, and confidentiality, transparency and security.’
The clearest example of the need for the information in this agreement is using so-called activity data to determine where in the country people were/are resident as part of the ONS Administrative Data Census project.
This project is developing new population statistics methods and products that cover the whole of England (and beyond), so complete HES and IAPT coverage is required. In addition, a decision is required post-2021 Census about whether these new methods and statistics can replace the traditional Census. To determine this robustly requires that the new methods and statistics are produced for the whole of the 2011 to 2021 time period. This will allow a robust view to be taken of the level of error and drift of those new statistics during this period, comparing them to the gold standard Census figures available for 2011 and 2021. For the health analysis purposes presented in section 5a that require information on why people interacted with hospital services, similar arguments around quality apply: health projections that rely on HES diagnosis information will require full coverage for an extended time period. However, this and the other health analysis uses presented are more complex than many of the other proposed uses and require more ground work to determine whether statistics of sufficient quality can be produced. In addition, diagnosis information is clearly more sensitive. As a result, ONS has determined that it is proportionate and in the public interest that the number of years’ worth of HES diagnosis information required is minimised for now. It will still be possible to test statistical quality for these uses with this volume of information. However, ONS does expect this work to be successful and, if it is, ONS will require that additional years of diagnosis information be shared at a later date. Access to data held within the Data Access Platform (DAP), which includes HES data, is granted to users on a need-to-know basis depending on their role, through a request process which provides a business justification. Access is authorised on a case-by-case basis by the ONS Information Asset Owner (IAO) responsible for HES data, with advice from Security and Information Management.
Staff requesting access to HES data must be cleared to the appropriate National Vetting level, which is higher than the standard basic clearance required for all ONS staff. Only authorised ONS staff with appropriate security clearance will have access to identifiable HES data, with regular audit and monitoring in place to ensure compliance.