NHS Digital Data Release Register - reformatted

University of Surrey

Project 1 — DARS-NIC-345789-L9Q7J

Opt outs honoured: No - data flow is not identifiable (Does not include the flow of confidential data)

Sensitive: Sensitive, and Non Sensitive

When: 2020/11 — 2021/05.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Categories: Anonymised - ICO code compliant

Datasets:

  • Mental Health Services Data Set
  • Mental Health Minimum Data Set
  • Mental Health and Learning Disabilities Data Set
  • Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care

Objectives:

AIM AND PURPOSE The aim of this project is to investigate the determinants and effects of hospital workforce retention (WFR). Workforce retention refers to the ability of an organisation to retain its employees. This project is led by the University of Surrey (UoS) and funded by The Health Foundation (the Funder). The project is of interest to the research team, the Funder, and the wider community of researchers and healthcare policy-makers, with an expected positive impact on the knowledge of the economics of the healthcare workforce and its effects on hospital performance and patients’ outcomes. The contribution generated by the project is hoped to help improve the sustainability of the English NHS. The lawful basis for processing personal data is Article 6(1)(e), in that processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller. The lawful basis for processing Special Category Data is Article 9(2)(j), in that processing is necessary for scientific research purposes in accordance with Article 89(1). The University of Surrey is a public authority responsible for conducting scientific research for academic and public benefit. Data in the ‘Hospital workforce retention and patient outcomes’ study is processed to enable the University of Surrey to perform its public task. The University of Surrey relies on GDPR Article 6(1)(e) to carry out its public task and, for special categories of data (including health information and pathways, and information concerning ethnicity), on GDPR Article 9(2)(j), for archiving, research and statistics, as the study is a research project which will use data and statistics, in accordance with Article 89(1).
The following data products are requested: 1) Hospital Episode Statistics (HES) Admitted Patient Care: Financial Years 2009/10 to 2021/22; 2) Patient Reported Outcome Measures (PROMs): Financial Years 2009/10 to 2021/22; 3) Civil Registration - Deaths (CR-D): Financial Years 2009/10 to 2021/22; 4) HES Critical Care (HES CC): Financial Years 2017/18 to 2021/22; 5) HES Accidents and Emergencies (HES A&E): Financial Years 2009/10 to 2018/19; 6) Emergency Care Data Set (ECDS): Financial Years 2018/19 to 2021/22; 7) Mental Health Minimum DataSet (MHMDS), Mental Health Learning Disabilities (MHLDDS), Mental Health Services DataSet (MHSDS): Financial Years 2011/12 to 2021/22; 8) Bridge files: Hospital Episode Statistics to Mental Health Minimum Data Set; 9) Bridge files: Hospital Episode Statistics to Mental Health Services Data Set; 10) Mapping files: MHSDS to MHMDS / MHLDDS; 11) Bridge files: PROMs to Hospital Episode Statistics. The data requested will allow UoS to investigate the association of hospital WFR, for different categories of hospital workers (consultants, nurses and ambulance staff), with patients’ outcomes in different types of hospital care: length of stay in acute emergency care, unplanned re-admissions in acute elective care, and unplanned re-admissions to inpatient mental health wards for mental health care. The study will focus on two research questions (RQ). RQ1: What are the determinants of variations in NHS WFR, in both acute care (AC) and mental health (MH) hospitals? (That is, what causes the differences in workforce retention between different hospital settings.) RQ2: What are the causal effects of WFR on admitted patients’ health outcomes (mortality, emergency readmissions, length of stay, waiting times) in emergency, elective and MH care? (That is, what is the impact on care.) RQ1 investigates the sources of variation in hospital WFR for ambulance and clinical staff (doctors and nurses), i.e.
which are the factors that affect hospital WFR within and amongst NHS hospital organizations and Ambulance Trusts, and whether different kinds of factors affect the various categories of hospital workers (i.e. doctors, nurses and ambulance staff) differently. The baseline model will make use of a range of longitudinal data, including NHS Digital data, to uncover the association between WFR and the factors determining its variations over time and across hospitals. Moreover, the variation stemming from a series of policies (i.e. the 2016 new junior doctor contract; the 2018 NHS Improvement Retention Support Programme; the 2018 hospital staff’s new pay/progression contract) and political events (the 2016 Brexit poll and the 2019 EU withdrawal), together with the fact that such changes often affected different groups of hospital workers unequally, will be used to estimate statistical models (e.g. Interrupted Time Series (ITS) and difference-in-difference (DiD) models) to evaluate the effect of such ‘breaks’ on hospital WFR. This analysis is not concerned with the identification of the retention behaviour of individual doctors, but with the identification of average retention behaviour for groups of doctors/nurses/ambulance workers amongst different NHS organizations over time. For RQ2, a simple (linear regression) analysis of the effect of WFR on patients' outcomes can lead to estimates biased by endogeneity due to reverse causality (e.g. bad patients' outcomes leading to poor WFR), so the analysis will exploit the changes in hospital WFR caused either by policy changes and the Brexit shock or by the variation in non-NHS wages (i.e. wages for workers with similar age/qualification profiles as NHS workers, but not working for the NHS) and business turnover in the local labour market around NHS hospitals. The aim is to plausibly identify causal effects.
This analysis is not concerned with the identification of the clinical performance of individual doctors, but with the identification of the average performance amongst NHS organisations with high vs low WFR rates and with the identification of the average performance of groups of doctors staying working in a hospital vs groups of doctors leaving a hospital, the effects of interest are at the very least aggregated at stayers vs leavers level, and never defined or reported for individual doctors. This is a standalone project and has no links to other projects or collaborations. The project consists of two phases (related to each RQ), which will be carried out partly sequentially and partly in parallel. Phase 1. The first phase will model the retention of the NHS hospital workforce. In this phase, the NHSD datasets requested will be used to extract variables of interest that are likely to be factors associated with or causing changes in hospital WFR patterns over time. The data obtained by the Department of Health and Social Care (Electronic Staff Records data, ESR) will be used to define the main output variables for the analysis of the determinants of hospital workforce retention as well as some of the main variables of interest in the same analysis (e.g. average salary / gender pay gap / staff nationality / staff qualification / staff age). The ESR data will also be used to define the main variables of interest (e.g. stability index, number of leavers) in the analysis of the effects of hospital WFR on patient outcomes. The patient level data requested from NHSD will be used to define some of the control variables in the analysis of the determinants of hospital workforce retention (e.g. weekly admissions to hospital, average age/comorbidities (state of having multiple medical conditions at the same time, especially when they interact with each other in some way)/procedures, number of competitors at NHS Trust level). Phase 2. 
The second phase will analyse the effects of hospital WFR on patients’ outcomes. In this phase, the NHSD datasets requested will be used to extract variables that are either the patients’ health / process outcomes of interest (e.g. mortality, readmission, length of stay, waiting time) or characteristics of the patients (e.g. co-morbidities, age, economic deprivation, hospital attended, year, month, day of the week when admitted or treated). The patient level data requested from NHSD will be used to define some of the patient-level control variables in the analysis of the effect of hospital WFR on patient outcomes (e.g. emergency or elective admission to hospital, patient age/comorbidities/procedures, number of competitors) as well as the main outcome variables (e.g. mortality, readmissions, length of stay, waiting times, change in the Oxford Hip/Knee score (a joint-specific, patient-reported outcome measure tool designed to assess disability in patients undergoing total hip replacement)). The most important academic references for this project are the following published studies: Propper, C., & Van Reenen, J. (2010). Can pay regulation kill? Panel data evidence on the effect of labour markets on hospital performance. Journal of Political Economy, 118(2), 222-273. Shields, M. A., & Ward, M. (2001). Improving nurse retention in the National Health Service in England: the impact of job satisfaction on intentions to quit. Journal of Health Economics, 20(5), 677-701. Newman, K., & Maylor, U. (2002). The NHS Plan: nurse satisfaction, commitment and retention strategies. Health Services Management Research, 15(2), 93-105. Other important healthcare policy references on the economics of the NHS healthcare workforce are: Health Education England. (2017). Facing the Facts, Shaping the Future. A draft health and care workforce strategy for England to 2027. Charlesworth, A., & Lafond, S. (2017).
Shifting from Undersupply to Oversupply: Does NHS Workforce Planning Need a Paradigm Shift?. Economic Affairs, 37(1), 36-52. Buchan, J., Charlesworth, A., Gershlick, B., & Seccombe, I. (2017). Rising pressure: the NHS workforce challenge. Nuffield Trust (2017). Creating a sustainable workforce: The long-term sustainability of the NHS. PARTICIPANTS: - all patients hospitalized in English NHS hospitals, from 2009/10 to 2021/22 for acute care, and from 2011/12 to 2021/22 for mental health care; - the hospital consultants treating NHS inpatients for acute care and inpatients and outpatients in mental health (MH) care; - the nurses working in NHS hospitals acute care wards and NHS mental health care hospitals (inpatients and outpatients wards); - the NHS Ambulance Trusts workers. The data requested from NHS Digital is related only to the patients admitted to acute and/or MH care, as well as the hospital consultant codes of the consultants treating the patients in hospitals. For all data products requested, the full datasets are required, including all admissions to hospital care (& community care for MH patients). The data needed to deliver the project cannot be limited to a cohort of patients with a specific condition, procedure or age range and there are several reasons for this. To study both the determinant factors of hospital WFR and its effects on hospital patient outcomes, data is needed to: 1) Create variables to proxy healthcare demand pressure at provider-level (or at department-level within each provider). These variables depend on the sum of all admissions recorded, and not by a single cohort of patients. 2) Define clinically and policy-makers relevant health outcomes like emergency readmissions to hospital following a previous hospital discharge, where the diagnosis for the readmission spell need not be the same as the diagnosis of the index admission spell. This requires having records for all the patients. 
3) Define market concentration variables for non-emergency admissions, e.g. the Herfindahl-Hirschman Index (HHI). The HHI is a commonly accepted measure of market concentration: it measures the size of firms in relation to the industry and indicates the amount of competition among them. Computing it at provider level requires observing the full spectrum of non-emergency admissions in a given year or month for all providers in England, and so it requires the records of all elective patients in England. 4) Define clinically relevant case-mix variables to control for patient severity, like the number of emergency admissions to hospital within a given period (e.g. one year, two years), where the diagnosis for the emergency admission can be of any type. This requires observing records for all the emergency patients. The University of Surrey is the sole data controller and the sole data processor for this agreement. There are co-investigators from the University of Leeds and City University London involved in this project. The University of Leeds and City University London do not have access to or process NHS Digital data; their co-investigators contribute only intellectually and to the writing of the reports and papers, and are not involved in determining the means by which the data are processed. FUNDER The Health Foundation is funding this project and its role is to ensure and facilitate the delivery of this project, but The Health Foundation has no data controlling or data processing roles within the project.
The Health Foundation is a major stakeholder in the project and has funded this research project, along with several other projects from other institutions, under a funding call for its Efficiency Research Programme (https://www.health.org.uk/sites/default/files/ERP%202018%20Call%20for%20applications.pdf), which is targeted at investigating the under-researched themes of labour productivity and workforce retention in health and social care. As such, RQ1 and RQ2 of this project fall under the remit of the funding call issued by the Funder. The Funder organises an Advisory meeting for the research projects, which facilitates the circulation of ideas among researchers, their collaboration and so the development of the research project. The Funder is a well-known think-tank in the UK and has an extensive network of professionals that supports the public good and public health mission. As such, using its network, The Health Foundation is going to help the research team circulate the findings of the research, prior to and along with any other dissemination channel (e.g. peer-reviewed journals, conferences, etc.). The Health Foundation has no control of the data that is released by NHS Digital. The Health Foundation will have access to research outputs, aggregated with small numbers suppressed, in the form of graphs, tables and papers produced by the UoS research team, which cannot be published or used without the UoS research team’s explicit consent. The Health Foundation will act as an additional dissemination channel, e.g. similarly to posting a working paper from the project on the Surrey project website. Ethical Approval. This project received approval from a Research Ethics Committee in February 2020. It was approved subject to the condition that the lay summary provided be revised to be more lay friendly. This was revised, and full ethical approval was granted on August 10th 2020.
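
Illustrative sketch (not part of the application): the Herfindahl-Hirschman Index referred to in the Objectives is the sum of squared market shares of the providers in a market. The short Python example below shows one way it could be computed from admission counts. The function name, the (market, provider) record layout and the choice to express shares as fractions (so the index lies between 0 and 1) are assumptions made for illustration only; the source does not describe how the research team defines markets or implements the calculation.

```python
from collections import Counter, defaultdict


def hhi_by_market(admissions):
    """Return {market: HHI} from an iterable of (market, provider) pairs.

    Each pair stands for one non-emergency admission (hypothetical layout).
    Shares are fractions of admissions, so the HHI ranges from 1/N
    (N equally sized providers) to 1.0 (a single provider).
    """
    counts = defaultdict(Counter)
    for market, provider in admissions:
        counts[market][provider] += 1

    result = {}
    for market, provider_counts in counts.items():
        total = sum(provider_counts.values())
        # Sum of squared market shares within this market.
        result[market] = sum((n / total) ** 2 for n in provider_counts.values())
    return result


# Toy data: market "A" is split evenly between two providers,
# market "B" is served by one provider only.
example = [("A", "P1"), ("A", "P1"), ("A", "P2"), ("A", "P2"), ("B", "P3")]
print(hhi_by_market(example))
```

Because each share depends on the total number of admissions in the market, omitting any provider's or any patient cohort's records would distort every share, which is consistent with the stated need for complete elective admission records.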

Expected Benefits:

The NHS has faced substantial pressures over the past two decades, with services being overstretched due to a prolonged period of financial austerity coupled with demand pressure from population growth and ageing. Despite a recent Governmental pledge to refinance the NHS, it remains clear that substantial efficiency savings are necessary. One area where efficiency gains could be achieved is NHS WFR, described by the Health Education England chief executive as “the biggest workforce challenge facing the NHS”. The research dissemination is of public interest, as a relevant part of the evidence found through this research will be translated into policy recommendations for healthcare policy makers, leaders, managers and workers on the best ways to improve hospital WFR, and through it also patients’ outcomes. One of the team members is in charge of the impact for the project (as well as the impact for the UoS School of Economics REF case studies); as an expert in the communication of research outcomes in layman’s terms, the team member is in charge of drawing out the policy recommendations from the analysis reports and will assist the Principal Investigator in communicating with healthcare policy makers, leaders, managers and workers. Overall, the aim of the research and its dissemination is to uncover mechanisms and generate recommendations that can lead to possible efficiency gains in the hospital healthcare sector, and in the English NHS hospital healthcare system. It is possible that enhancing hospital WFR in the NHS can result in two types of efficiency gains: directly, through larger savings from reduced hiring of temporary staff; and indirectly, through the better utilization of skills, reduced human capital losses and better staff wellbeing. Thus, this research has the potential to improve the lives of both hospital workers and patients, and the working conditions of hospital workers.
The empirical analysis and the policy recommendations stemming from it will be the most substantial part of the research outputs (policy briefs, working papers, peer-reviewed publications) that the research team aims to disseminate within the scientific and healthcare academic communities, the healthcare policy makers and leaders communities, the healthcare professional community and the general public. The policy recommendations arising from the study will be included in policy briefs that will be circulated to healthcare policy makers, leaders and managers via the research team’s as well as the Funder’s research networks. Furthermore, the year 2023 launch event and the ongoing dissemination of the research through seminars, conferences, the research team’s networks and the project Steering Group and the funder’s network will increase the reach and impact of the research team’s work. As a result of the project’s outputs, it is hoped that healthcare policy makers will: - increase the monitoring of the hospital WFR and the factors affecting it, in an effort to improve both WFR and patients’ outcomes in case of the outcomes that this research shows to be positively affected by higher retention’s levels; - possibly develop guidelines to improve the management and retention of hospital workforce, supported by the empirical evidence and by specific case studies that might stem from eventual follow-ups of this research, by the current or different research teams. The impact of the project (including the research outputs and the possible improvements for the NHS) depends unambiguously on the findings of the study. It will be possible to ascertain who will realize the improvements in the management of the hospital workforce and whether and how these improvements can be achieved only when the project analysis is concluded (or at least ongoing at an advanced stage). 
The project research team will make sure that the findings are easily transferable into improvements for the most suitable stakeholders, including NHS England/Improvement, the Care Quality Commission, the Department of Health and Social Care and Clinical Commissioning Groups. Currently, it is impossible to quantify the magnitude of the impact on patients’ outcomes. This will strongly depend on the number of emergency conditions and elective treatments that the research team is able to investigate during the funded period of the grant. The benefits of processing/dissemination will be achieved directly by the data controller and the funder, and indirectly by the project stakeholders and the general public. The efficiency savings that can be achieved by implementing policies that improve hospital WFR will be the object of a cost-benefit analysis stemming from the project’s empirical research. This will lead to estimates, in British pounds, of the monetary gains (or losses) that the NHS can achieve for, say, a 1% increase in the WFR of hospital nurses/consultants/ambulance workers. The cost-benefit analysis will be completed by the end of the project, with its final formulation in the published versions of the study, which are expected to appear after the end of the project funding period (as soon as possible, and possibly within 3 years from 2023). However, the benefits for hospital workers and patients may happen at different times, before or after the end of this study, depending on: the relevance of the findings; the success of the dissemination; and the appetite of policy-makers, politicians and the general public for the findings, their implications and the related recommendations. The study will also support the research of at least one PhD student and a post-doctoral research fellow (both from UoS).
These junior researchers will both contribute to this project with their work and will be an active and fundamental part of the research team, so as to achieve the status of co-authors of the study and its related published and unpublished outputs. This is a research project whose aim is to investigate the economics of hospital workforce retention, its determinants and its associations with patient outcomes, in order to provide policy-makers and hospital managers with recommendations that can improve both the stability and engagement of hospital workers and the quality of care perceived and received by hospital patients. The outcomes of the project will be presented to and discussed with the project’s Steering Group and the Health Foundation’s advisory board committee. A number of experts, healthcare policy leaders and academics take part in both these committees; they will help ensure the rigour of the analysis, provide valuable advice to improve it and offer a network through which to disseminate the findings of the project. Based on the recommendations from both committees, the investigators will discuss and disseminate the results of the analysis to healthcare leaders and policy-makers. The results will be informative for the retention of the hospital workforce and for the way such retention may be correlated with the health and process outcomes (e.g. mortality, readmissions, waiting times) of patients admitted to English hospitals.

Outputs:

The study findings resulting from the data processing will contribute to the production of: 1. Reports to the Funder / working papers; 2. Submissions to peer reviewed journals; 3. Presentations to seminars and conferences / policy briefs reports; 4. Conferences. The research outputs will never be reported at individual patient / worker level. The research outputs will always be reported as aggregate quantities, e.g. the average number of patients with condition X (e.g. heart attack) across NHS hospitals, by year. Categories with small numbers of observations will be suppressed / not reported. All outputs that will be produced using the NHS Digital data will only contain aggregated results with small number suppression applied. To disseminate the results of the research, the research team will: - Draft at least two non-technical briefing papers (one on the factors affecting staff retention and the other on its consequences for patient welfare). The Investigators in research team have considerable experience of presenting research to varied audiences. - Hold a launch event at the end of the 4 years of funding, inviting the project’s key stakeholders and wider networks, including representatives of individual Trusts. This will be timed to coincide with the actions at the point below. - Publication of non-technical papers on the project’s website and through Twitter. They will be accompanied by a blog and animation. - The research team will make use of the University of Surrey media team and press release the research work. This press release team assisted in writing and placing the article on the free entitlement in The Daily Telegraph [2]. The team will also seek opportunities to write for other blogs both general outlets such as The Conversation and those addressed to more specialized audiences such as The Health Foundation’s own blog series. The research team is also willing to provide guidance to organisations, such as Trusts through NHS Improvement. 
One of the co-Is, has written NICE guidelines so he has experience of turning research results into a specific product. Beyond the launch event and associated activities, the research team will seek to publish the project’s findings in high quality journals. This will establish the rigor of the research and ensure its lasting academic influence. Potential target journals are Journal of Human Resources (JHR), Economic Journal (EJ), Journal of Public Economics (JPubEcon), Journal of Health Economics (JHE), Journal of Economic Behaviour and Organization (JEBO), Health Economics (HE), Social Science and Medicine (SSM). [1] Cookson, R. and Moscelli, G. (2018) Are Angioplasty Waiting Times Growing Again. Centre for Health Economics. https://www.york.ac.uk/media/che/documents/policybriefing/Angioplasty.pdf [2] Blanden, J. (2016). X-Factor Over Evidence: The Failure of Early Years’ Education. The Daily Telegraph, 22nd October. https://www.telegraph.co.uk/education/educationopinion/11177381/X-Factor-over-evidence-the-failure-of-early-years-education.html The overall communications objectives are to: 1. Engage key stakeholders with the project at its inception, enabling the research team members to understand their concerns and draw on their specialist knowledge to shape the research strategy. 2. Create awareness of the research project and expertise among a broad range of interested parties. 3. Gain valuable feedback from academics and stakeholders as the project results approach their final version. 4. Disseminate the research findings widely among stakeholders, health researchers and the wider academic community. 5. Present clear and relevant policy implications to both national and local decision-makers. In the setup phase (during the first 2 years of the project) the research team will: - Conduct a scoping exercise to ensure the fully understanding of the key organizations (and individuals within them) interested in HWFR. 
This will help the research team ensure they are inviting exactly the right people to be on the project Steering Group. It will also grow the project wider network by subscribing to the right mailing lists and following the right Twitter feeds to be appraised of relevant events. The research team will also contact the most important individuals by email. [This activity has already taken place over the course of Summer and Fall 2019] - Set up the project website at the University of Surrey, using as potential models the websites of previous research projects like Better for Less (https://www.surrey.ac.uk/better-for-less) and the Centre for Vocational Education Research (http://cver.lse.ac.uk/), which saw the involvement of one team-member of this project. - Establish a Twitter account. The research team can share the blog through this channel as well as using it to provide brief comment and link the project to related activity. The former activities will enable the research team to engage interested parties at the start of the project, and it allows these parties to interact with the research team in the way that suits them best. - Form and hold the first meeting of the project Steering Group. This enables the research team to form deeper connections with key stakeholders and gain valuable feedback about the proposed methodology. Members will encourage the research team to think of ways of addressing their concerns and are likely to be able to provide valuable information about institutions, policy detail and data. The first project Steering Group has already taken place on 15/03/2019. During such meeting, the Investigators have received valuable suggestions how to set up the analysis and which data may be useful for the investigation. After the initial set-up phase, the research team will sustain interest in the research without over-burdening its audience. 
In this phase the research team will: - Continue to use Twitter, an especially helpful tool at this stage of the project. As the project networks are established, Twitter is an effective way of sharing the team’s growing involvement with stakeholders and establishing the research team as an authority on this topic. - Update the website and add blogs if appropriate. - Continue the meetings with the Steering Group. These will provide the research team with valuable feedback as the research findings begin to emerge and will enable the research team to discuss potential robustness checks and extensions. The Steering Group’s input will also be invaluable in helping to understand the results and their policy implications, particularly as the part of the Cost-Benefit analysis is approached. - Begin to present the emerging findings at seminars and conferences. Some of the intended events are specialized in the health field (Health Economists’ Study Group, European Health Econometrics workshop, European Health Economics Association conference) or aimed at policy makers (DHSC analytical lunchtime seminars) while others are more general (Royal Economics Society and International Association for Applied Econometrics conferences) and give scope for gaining wide academic feedback. - The project’s research papers will be made available as Discussion Papers on the project’s website once they are complete. The work on the determinants of staff retention will be completed first, in accordance with the research agenda and scheduled plan. In regards to access to journal articles - Green open access will be provided as the base case. Gold open access will be provided subject to the will of the funder to pay for the Gold open access charges. In any case, the publications pre-prints will always be freely available to the public. Gold open access is where an author publishes their article in an online open access journal. 
In contrast, green open access is where an author publishes their article in any journal and then self-archives a copy in a freely accessible institutional or specialist online archive known as a repository, or on a website. Subject to any third party rights, the data and knowledge generated by the study will belong to the project partners (from UoS, University of Leeds and City University London), who will also: - manage the data and knowledge produced; - administer the access rights to the study and its results; - together with The Health Foundation, arrange the possibility of making the publications of the study available as Open access. With respect to utilization rights, The Health Foundation has requested a license to use the work produced by the project partners for its public benefit purposes. The Health Foundation is under an obligation to ensure that the outputs of the project are applied for the public good. Therefore, the funder has requested a license to use, for its public benefit purposes, the outputs generated by the Recipient under the Project. Subject to any third party rights, the project partners (i.e. the Principal Investigator and co-Investigators) have granted the funder a royalty-free, non-exclusive, world-wide license to use the outputs generated by the project partners under the project for its own charitable public benefit purposes. The funder, where reasonable, will discuss with the Principal Investigator prior to using the outputs for public benefit. All outputs shared with the Funder will be aggregated with small numbers suppressed. Relevant Target Dates: Submission to peer-review of at least one paper related the first research question by December 2021; Submission to peer-review of at least one paper related to the second research question by June 2023; Organization of launch event of the project by June 2023; Publication of at least 60% of the outcomes of the project by December 2023. 
[Please note that publications in Economics have a very long turnaround time, so they can require several years to be peer-reviewed, revised and published. Publication of a paper in a 4-star peer-reviewed journal in Economics can also take several years, due to previous rejections and the time peer-reviewers need to provide comments.] EU funding is not applicable.

Processing:

The following data products are requested: 1) HES Admitted Patient Care (HES APC): Financial Years 2009/10 to 2021/22 (13 years); 2) Patient Reported Outcome Measures (PROMs): Financial Years 2009/10 to 2021/22 (13 years); 3) Civil Registration - Deaths (CR-D): Financial Years 2009/10 to 2021/22 (13 years); 4) HES Critical Care (HES CC): Financial Years 2017/18 to 2021/22 (5 years); 5) HES Accidents and Emergencies (HES A&E): Financial Years 2009/10 to 2018/19 (10 years); 6) Emergency Care Data Set (ECDS): Financial Years 2018/19 to 2021/22 (4 years); 7) Mental Health Minimum DataSet (MHMDS), Mental Health Learning Disabilities (MHLDDS), Mental Health Services DataSet (MHSDS): Financial Years 2011/12 to 2021/22 (11 years). The data flowing from NHS Digital to the UoS will be pseudonymised at patient level for all datasets. The hospital consultant code (GMC code of the hospital consultant in charge of the patient; consult variable) in HES APC data (and possibly also HES A&E, ECDS and MHMDS/MHLDDS/MHSDS) is needed to link the Hospital administrative datasets from NHSD to the Electronic Staff Records data. The GMC consultant code is needed because that is the only way that the project members can link HES and MH data at consultant level to the ESR data provided by DHSC. A pseudonymised GMC code would not find a match in the ESR data provided by DHSC. The GMC consultant code is replaced with a study ID key once the linkage has taken place. The UoS has access to ESR data, supplied by the Department of Health and Social Care (DHSC). UoS will link this ESR data to NHS Digital data at UoS only. The other two organisations participating in this project, The University of Leeds and City University London, will not have access to either NHS Digital data or ESR data. ESR and NHS Digital data will be linked in two ways. The first linkage will be by period (e.g. year) and organization (i.e.
Trust XXX) and the data for such linkage will be accessible and processed by substantive employees and research students at UoS. The second linkage will be by consultant code-period-organization and the data for such linkage will be accessed and processed only by substantive employees at University of Surrey and not by research students at UoS. To mitigate any risk of re-identification, the identity of hospital consultants from the GMC register will never be included in the secure IT system project folders. There will also never be a reporting of results at individual patient or worker level, and when the aggregated number of workers or patients is less than 10 observations within a hospital-time period interval. Other data linkages will happen only at aggregate level, e.g. hospital location using Organisation Data Service (ODS) data postcodes; average wages of workers living in a given hospital catchment area, extracted from Labour Force Survey (LFS) and Annual Survey of Hours and Earnings (ASHE) data (collected by the UK Office for National Statistics and accessed through the UK Data Archive). There will never be reporting of results at individual patient or worker level, when the aggregated number of workers or patients is less than 10 observations within a hospital-time period interval. As above, there will never be any attempt to re-identify individuals, whether patients or hospital consultants. Data will only be accessed and processed by either substantive employees of The UoS or PhD students based at the UoS who are involved as honorary research fellows with a written contract defining their role and duties. The total number of PhD students working on the data will never be higher than 5 students, and their research work will have to fall within the remit of the research purpose stated in this application. The data will not be accessed or processed by any other third parties not mentioned in this agreement. 
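The reporting rule stated above (no results where the aggregated number of workers or patients is less than 10 within a hospital-time period interval) can be illustrated with a minimal sketch. The function name, the trust labels and the counts below are hypothetical and are not taken from the application; the sketch only shows one simple way such a threshold could be applied to aggregated counts.

```python
# Minimal sketch of small-number suppression on aggregated counts.
# All names and values here are hypothetical illustrations.

SUPPRESSION_THRESHOLD = 10  # "less than 10 observations" in the text


def suppress_small_cells(counts, threshold=SUPPRESSION_THRESHOLD):
    """Return a copy of counts where cells below the threshold become None."""
    return {key: (n if n >= threshold else None) for key, n in counts.items()}


# Keys are (hospital, period) pairs; values are aggregated counts.
example_counts = {
    ("Trust A", "2019-01"): 42,
    ("Trust A", "2019-02"): 7,   # below the threshold, so not reported
    ("Trust B", "2019-01"): 10,  # exactly 10 is not "less than 10"
}

suppressed = suppress_small_cells(example_counts)
```

In this sketch a suppressed cell is represented by None, so that a missing value cannot be mistaken for a true count of zero.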
There will be no attempts made by any of the research project team members to re-identify individuals involved in this project as there is no requirement to do so. The two co-investigators based at University of Leeds or City University will not be processing any NHS Digital data, and so they will only contribute intellectually to the project. The data will be accessed through a remote access secure environment, whose technical details are provided below. The data are accessed by any member of the research team involved in data processing activities through a secure IT system, called “Surrey Secure Network (SSN)”. Security of the “Surrey Secure Network” is consistent with the framework of University of Surrey’s Information Security Policy, available here - http://www.surrey.ac.uk/about/corporate/policies/information_security_policy.htm Data will be processed using: - Virtual Desktop sessions that are only connected to the secure network; - All laptops have their local disk encrypted using CESG approved standards. Virtual desktops are used for remote working. All the analysis is always run on virtual desktops, regardless of the applicants working from home or the office. The NHSD data is hosted by a secure server to which the applicants have no physical access. The secure server is encrypted and it is not possible to copy and paste data from the screen when using the virtual desktops. Patient level data is only accessible on the Surrey Secure Network. Research data is retained for 10 years after the completion of the study in accordance with University policy on research data management. The System shall be risk assessed every 12 months, which includes an annual infrastructure penetration test. The UoS uses the ITIL (Information Technology Infrastructure Library) Risk Management framework for its IT policies and management. Services are backed-up and data replicated between 2 data centres on campus.
These are geographically spaced to provide cover for disaster purposes to ensure a copy of the data can be recovered. All systems within the University are bound by the University’s Information Security Policy (http://www.surrey.ac.uk/about/corporate/policies/information_security_policy.pdf). In the first stage (RQ1), the data provided by NHSD will: - be linked to Electronic Staff Record (ESR) data over the years; - aggregated at hospital level by subperiods (e.g. monthly) to create variables that control for time-varying demand and supply factors at hospital level; - used in statistical models to investigate the association of demand and supply factors with hospital WFR in the English NHS (at the mean or over the outcome distribution). In the second stage (RQ2), the data provided by NHSD will: - be used to produce hospital quality (e.g. mortality, readmissions, PROM gains) and hospital process (e.g. length of stay, waiting times) indicators at patient level [Outcome Variables]; - be used to create variables that control for patients’ characteristics (e.g. age, gender, ethnicity), patients’ pathways (e.g. hospital or GP of treatment) or provider characteristics (e.g. NHS or Independent Sector hospital) [Control Variables]; - used in statistical models to investigate the association of hospital WFR with patients’ Outcomes in the English NHS (at the mean or over the outcome distribution), controlling for the demand and supply determinants of healthcare. UoS will not flow any data to NHS Digital. The data flows out of NHS Digital will consist in 3 annual data disseminations. In the first data flow the latest datasets release and the historical datasets will be delivered. In the last two remaining data drops, only the latest datasets release will be delivered. There will be no data linkage undertaken with NHS digital data provided under this agreement that is not already noted in the agreement. 
• Electronic Staff Record (ESR) data The UoS has access to ESR data, supplied by the Department of Health and Social Care (DHSC). UoS will link this ESR data to NHS Digital data at UoS only. Substantive employees at University of Surrey are the only individuals able to access and link NHS Digital data to ESR data. No other organisations, including The University of Leeds and City University London have access to NHS Digital data. ESR and NHS Digital data will be linked via consultant code only. To mitigate any risk of reidentification, the identity of hospital consultants from the GMC register will never be included in the secure IT system project folders. There will also never be a reporting of results at individual patient or worker level, and when the aggregated number of workers or patients is less than 10 observations within a hospital-time period interval. Other data linkages will happen only at aggregate level, e.g. hospital location using Organisation Data Service (ODS) data postcodes; average wages of workers living in a given hospital catchment area, extracted from Labour Force Survey (LFS) and Annual Survey of Hours and Earnings (ASHE) data (collected by the UK Office for National Statistics and accessed through the UK Data Archive). There will never be reporting of results at individual patient or worker level, when the aggregated number of workers or patients is less than 10 observations within a hospital-time period interval. As above, there will never be any attempt to re-identify individuals, whether patients or hospital consultants. DATA MINIMISATION It is sufficient to use pseudonymised data, there is no need to identify any patient. Patient level data is needed as most of the outcome variables, and the effects of interest on such variables will be measured at patient level. This will prevent the risk of ecological fallacy (i.e. aggregation bias) in the results. The risk of ecological fallacy (i.e. aggregation bias) arises by using grouped (i.e. 
aggregated) data. If the outcome of interest is at patient level, aggregating data at hospital level may hide important patterns. For example, the mortality of hospital X for heart attack is found to be 10% of emergency admissions; however, the aggregate figure at hospital level may hide that the hospital mortality is very different by gender (e.g. 5% for male patients; 15% for female patients) or by comorbidities (7% for non-diabetic patients; 13% for diabetic patients). Hence, using the lowest level of data aggregation (i.e. patient level data) allows to uncover patterns that may be related with the relationships of interest and features of healthcare delivery as it varies depending on patients characteristics or the interaction of patients and organization characteristics. The data cannot be identifiable by geography; the projects delivery needs detailed patient level data and aims to explore the heterogeneity of the effects/associations of interest by different geographies of England. Moreover, geographic variation can be a confounder in the analysis that needs to be controlled for. The data cannot be identifiable by demographics; the project aims to explore the heterogeneity of the effects/associations of interest by different demographic characteristics of the patients, but more importantly because such characteristics are possible confounders that needs to be controlled for. The data cannot be identifiable by diagnosis and procedures; the project aims to explore the heterogeneity of the effects/associations of interest by different diagnosis and procedures of the patients, and also because there is a need to know the total number of patients that are admitted to hospital in any given day to compute measures of demand pressure for hospitals. For the latter reason, the creation of a HES cohort is not going to be sufficient for the delivery of the project. 
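The aggregation-bias example above can be checked with a short calculation. The admission and death counts below are hypothetical, chosen only so that the subgroup rates match the 5% and 15% figures quoted in the text; with equal group sizes the pooled rate comes out at the quoted 10%.

```python
# Worked illustration of how a pooled rate can hide subgroup differences.
# Counts are hypothetical; only the resulting rates come from the text.

groups = {
    "male": {"admissions": 1000, "deaths": 50},     # 5% mortality
    "female": {"admissions": 1000, "deaths": 150},  # 15% mortality
}


def mortality_rate(deaths, admissions):
    """Share of admissions that ended in death."""
    return deaths / admissions


# Patient-level data allows the rate to be computed within each group.
rate_by_group = {
    name: mortality_rate(g["deaths"], g["admissions"])
    for name, g in groups.items()
}

# Hospital-level aggregation only exposes the pooled figure.
total_deaths = sum(g["deaths"] for g in groups.values())
total_admissions = sum(g["admissions"] for g in groups.values())
overall_rate = mortality_rate(total_deaths, total_admissions)
```

The pooled figure is an admissions-weighted average of the group rates, so it would move if the mix of patients changed even when the group-level rates stayed the same.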
All fields requested are necessary for the project, as the analysis may otherwise fail to consider important mechanisms or confounders, and so provide wrong recommendations to healthcare leaders and policymakers. It would in principle be possible to replace date of death with mortality flags; however, mortality flags at many time intervals (7, 14, 30, 60, 90 days, 6 months, 1 year, 2, 3, 5 years) and with respect to both date of admission and date of discharge would be needed, so that would increase the sizes of the HES extracts. Access to the Civil Registration Deaths records, including the full date of death and the reason for death, is therefore preferable, also for purposes of cross-validation with the hospital records. The consultant code is needed to allow linkage of HES to ESR and evaluate the effect of time to leave a given hospital on the patients' health outcomes. The request of this variable is motivated by the interest in the average effect of consultants’ time-to-leave a hospital, and not by any interest in the identification of specific consultants and their performances. The postcode outward code and the postcode sector are needed to compute distance measures from patient's residence to GP location and Hospital site location that are more precise than those based on LSOA of patient residence. Given that these variables do not constitute the full postcode, this will prevent patient identifiability. The analysis is at national level, so it will need to cover all England. Moreover, the project plans to investigate regional variations in the associations or effects of interest. • HES Admitted Patient Care (HES APC) HES APC is necessary to investigate the associations of hospital WFR and emergency care patients’ outcomes. HES APC will provide the bulk of acute care data for quality indicators, patients' characteristics, patients’ pathways, and patients' health outcomes that will be used in the research project in relation to acute emergency care.
HES APC years from 2009/10 to 2021/22 are requested to exploit the variation due to several policies happening in the last decade. Some of these policies happened at the start of this decade (e.g. 2012 abolition of PCTs and creation of CCGs in 2013), some have happened more recently (e.g.: the 2016 new junior doctor contract; the 2018 NHS Improvement Retention Support Programme; the 2018 hospital staff’s new pay/progression contract). In any longitudinal analysis (e.g. before-after, difference in difference, interrupted time series) some years before and some years after the policy change are needed to assess the effect of a given event or policy. All patients' episodes are required because a multi-episode spell reports all the information regarding the patient pathway from admission to discharge or death. All elective episodes are required since the number of elective patients treated is in itself an outcome variable and different episode may contain different information that must be included in the analysis. Maternity episodes are required because unborn children and neonatal records will be used as outcome variables for maternity wards, and the retention of midwives is part of this study. The timeframe around the index event (e.g. procedure or diagnosis) is required because waiting times, length of stay (both post and pre-operative) are some of the target outcome variables in the analysis. For reasons spelled out above, the full HES Admitted Patient Care is needed, with bridges files to Civil Registration Deaths, Mental Health Services (or Minimum) data, HES A&E / ECDS, HES CC and Patient Reported Outcome Measures. There are no alternatives or less intrusive ways of achieving the purpose. 
Aggregate data or semi-aggregate data would not serve the project purposes as they would imply incurring the risk of aggregation bias and they could mask specific patient’s pathways or characteristics that are needed to be accounted for in order to estimate the correct effects of interest for the investigation. • HES Critical Care (HES CC) HES CC is necessary to investigate the associations of hospital WFR and emergency care patients’ outcomes related to patients admitted to Critical Care departments. This will allow the research team to also investigate the associations (or effects) of hospital WFR with health outcomes for COVID-19 patients. HES CC years from 2017/18 to 2021/22 are requested to investigate the performance of CC departments before and after the 2020 Covid19 crisis. • HES Accidents and Emergency (HES A&E) HES A&E will provide the bulk of data for quality indicators, patients' characteristics, patients’ pathways, and patients' health outcomes related to ambulance and emergency care. Financial years from 2009/10 to 2018/19 for HES A&E are needed to exploit the variation due to several policies happening in the last decade. • Civil Registration (Deaths) - Secondary Care Cut Civil Registration (Deaths) will provide data for out-of-hospital mortality after discharge, which is one of the main quality indicators in acute healthcare. The bridge file will allow to link the Deaths file to HES APC, HES A&E, ECDS and MHMDS/MHLDDS/MHSDS. Data for financial years from 2009/10 to 2021/22 are needed to exploit the variation due to several policies happening in the last decade. The Original Underlying Cause of Death variable is needed to double-check the medical reason the patient has died. Variables for neonatal mortality are needed to create indicators of care for new-borns. Subsequent Activity and Match Rank variables are needed for data quality checks purposes. 
Access to the Civil Registration Deaths records, including the full date of death and the reason for death, is preferable for purposes of cross-validation with the hospital records. • Patient Reported Outcome Measures (PROMS; Linkable to HES). PROMS will provide important quality indicators, patients' characteristics and patients' health outcomes that will be used in relation to elective care, e.g. health gains (or losses) in terms of a change in the patient Oxford Hip/Knee Score. Data for financial years from 2009/10 to 2021/22 are needed in order to exploit the variation due to several policies happening in the last decade. • Mental Health Minimum Data Set (Linkable to HES). • Mental Health and Learning Disabilities Data Set (Linkable to HES). • Mental Health Services Data Set (Linkable to HES) [packages: MH Community: 1d + add-on package 4 (currencies) and MH Inpatients: 2a + add on package 3 (patients info) + package 4 (currencies; i.e. MHS 801-803)]. • Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set (and other MH datasets). The datasets will provide important quality indicators, patients' characteristics, and patients' health outcomes that will be used in the research project in relation to mental health care. Data for financial years from 2011/12 to 2021/22 are needed to exploit the variation due to several policies happening in the last decade. Data prior to 2011/12 are not requested, given both funding constraints and the availability of a less precise dataset. For all MH datasets, data on both community care and hospital care are needed since only a fraction (about 10%) of MH patients are hospitalized, while the other patients are treated in community services where both MH doctors and MH nurses operate and may operate as a replacement/substitution of MH hospital services. Failing to control for this alternative channel at local area level may imply a bias in the results of the analysis.
• ECDS The ECDS dataset is necessary, as it will provide the data for quality indicators, patients' characteristics, patients' pathways, and patients' health outcomes that will be used in the research project, in relation to patients admitted to emergency care departments. Data for financial years from 2018/19 to 2021/22 are needed in order to exploit longitudinal variation. Data prior to 2018/19 are not requested, given both funding constraints and the concerns related to data quality and completeness. Data for year 2018/19 is requested for both ECDS and HES A&E; this is because in ECDS the 'Token_ID' to link HES datasets like HES APC or HES CC is still not available at the moment, and the project needs to evaluate the associations of Hospital WFR with emergency care outcomes along the patient pathway using the most complete information (e.g. for the evaluation of the effect of retention exploiting the shock due to the Brexit referendum); at the same time, in order to evaluate the associations of Hospital WFR with emergency care outcomes pre and post Covid19, at least about two financial years are needed before and after the Covid19 outbreak, which motivates the request also for year 2018/19 of the ECDS. In order to investigate the effect of hospital workers' proximity to leaving an NHS organization on Emergency care, the 'ProfessionalRegistrationIssuerCode' variable is requested in order to try and link it, for hospital consultants only, to the ESR data through the GMC code. The volume of data in terms of years is needed for several reasons: 1. In order to control for unobservable but time invariant factors, the project in most cases uses organization (e.g. Trust) fixed-effects. The estimation of longitudinal models with fixed-effects requires several yearly data points to assure that the fixed-effects estimates are consistent.
The estimates of interest coming from these models are likely to be incorrect without a sufficient number of data time points – in the case of this project, the points are years of data. Several years of data are usually needed for the fixed-effects to be estimated correctly. This is even truer in presence of breaks due to policy changes over time (see next point below), which would imply a few years before the policy and a few years after it (e.g. two subperiods of 5 years each). For some datasets like HES APC, this can be used as described above as the dataset does not suffer from structural changes / discontinuities over time and retains a similar structure. For some datasets like Mental Health datasets, HES Critical Care and A&E/ECDS this is not possible since either the dataset does not have 10 years available because it did not exist 10 years ago (e.g. HES CC), or the dataset has been discontinued / changed over the years (e.g. MH data with new formats/variables). With the datasets that have only a few years of data the project team will either use longitudinal methods but acknowledge the limitations due to having fewer data points, or it will focus on cross-sectional variations of health outcomes for different organizations within each of the years of data requested. Finally, given the presence of budget restrictions to the project, the years of data requested have been kept to the strict minimum needed to deliver a valid analysis to the Funder and the stakeholders of the project (including the general public). 2. The project needs to exploit several policies and events that act as 'exogenous shifters', i.e. events or policies that will have an association with the health outcomes only because of the impact they might have had on NHS workforce retention. Such events or policies might have contributed in different ways to define the patterns of hospital workforce retention.
Some of these policies are: - the 2012 doctor revalidation policy; - the 2012 introduction of CCGs; - the 2016 Brexit Referendum; - the 2017/19 junior doctors contract reforms; - the 2018 NHS Improvement Workforce Retention Program; - the 2020 EU withdrawal. The analysis of the impact of each policy or event requires several years before and after the time when the policy/event was introduced, in order to estimate effects that are correct and plausible. 3. Moreover, when faced with estimation of dynamic models (which are needed to estimate how changes in some variables lead to changes in the outcome of interest) some variables need to be lagged. Compared with a model using only contemporaneous (existing at or occurring in the same period of time) data, a model including lagged data requires even more yearly data points. This is because, to evaluate the effect of workforce retention in 2008 on patient waiting times in year 2009, data are needed for 2008 and 2009, not just 2009. If the suspected time dependence is longer, the number of required time lags has to be larger. For some statistical models 4 or more years of lags are required for the estimation to be correct, and this further motivates the request for a longer number of years for some of the datasets such as HES APC or HES A&E. The large volume in terms of number of fields is required as well for several reasons. 4. GEOGRAPHIES. The data cannot be narrowed by geography, as the study will explore the heterogeneity of the effects/associations of interest by different geographies of England; moreover geographic variation can be a confounder that needs to be controlled for in the analysis. 5. DEMOGRAPHICS (age, gender, ethnicity, socioeconomic indicators like Index of Multiple Deprivation).
The data cannot be narrowed by demographics, as the study needs to explore the heterogeneity of the effects/associations of interest by different demographic characteristics of the patients, and more importantly because such characteristics are possible confounders that must be controlled for in the analysis. 6. DIAGNOSES and PROCEDURES. The data cannot be narrowed by diagnosis and procedures, as the study requires these fields to: i) compute comorbidity indices; ii) compute different indicators of hospital quality, which vary either by diagnosis, procedure or clinical specialty; iii) compute different indicators of hospital demand pressure depending on all admissions (i.e. for any reason/diagnosis/procedure) to a hospital; iv) investigate the heterogeneity of the effects/associations of interest by different diagnoses and procedures of the patients; v) compute indicators of competition, which are based on all admissions to a hospital (i.e. for any reason/diagnosis/procedure); vi) compute unplanned readmissions after hospital discharge as a widely used quality measure, in which the index spell and the readmission spell are not necessarily due to the same diagnosis or procedure. 7. EPISODES (including: dates of admission and discharge; durations; type, i.e. emergency or not; admission/discharge to/from home or not). All the patients' episodes are required in order to construct hospital spells, which may be made of multiple episodes and include precious information regarding the patient pathway from hospital admission to discharge or death. 8. TRUST-LEVEL AND CCG-LEVEL. These fields are necessary for the project, as they can act as important mechanisms or confounders that need to be controlled for to provide the right recommendations to healthcare leaders and policy-makers. 9. MSOA, LSOA, and POSTCODE information (i.e. postcode outward code and the postcode sector).
These variables are needed to compute distance measures from patient's residence to GP location and Hospital site location at different levels of precision. 10. GP-PRACTICE IDENTIFIER. This field is necessary for two purposes: a) to control for the quality of primary care for the patients, e.g. given by the number of ambulatory care sensitive conditions (derived from HES APC) for patients admitted to hospital but treated by the same GP practice; b) as a geographical/organizational factor, in order to control for the number of elective patients referred by a given GP practice to different hospitals (e.g. to check the GP-hospital market concentration of patients choosing the hospital for elective care). All organisations party to this agreement must comply with the data sharing framework contract requirements, including those regarding the use (and purposes of that use) by “personnel” (as defined within the data sharing framework contract i.e. employees, agents and contractors of the data recipient who may have access to that data).
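The point made under reason 3 above, that lagged variables reduce the number of usable yearly data points, can be sketched as follows. The series names and values are hypothetical illustrations, not project data; the helper simply pairs each outcome year with the predictor value from a given number of years earlier.

```python
# Minimal sketch: pairing an outcome with a lagged predictor shortens the panel.
# All names and values are hypothetical.


def pair_with_lag(outcome, predictor, lag=1):
    """Return (year, outcome value, lagged predictor value) for usable years only."""
    rows = []
    for year in sorted(outcome):
        lagged_year = year - lag
        if lagged_year in predictor:
            rows.append((year, outcome[year], predictor[lagged_year]))
    return rows


# Hypothetical yearly series for one hospital.
retention = {2008: 0.91, 2009: 0.89, 2010: 0.90, 2011: 0.88, 2012: 0.87}
waiting_days = {2008: 61, 2009: 64, 2010: 63, 2011: 67, 2012: 70}

rows_lag1 = pair_with_lag(waiting_days, retention, lag=1)  # 2009 to 2012 usable
rows_lag4 = pair_with_lag(waiting_days, retention, lag=4)  # only 2012 usable
```

With five years of data, a one-year lag leaves four usable years and a four-year lag leaves a single one, which is the arithmetic behind the request for longer series.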


Project 2 — DARS-NIC-21083-B6C5J

Opt outs honoured: Yes - patient objections upheld (Statutory exemption to flow confidential data without consent)

Sensitive: Non Sensitive, and Sensitive

When: 2020/05 — 2020/05.

Repeats: Ongoing

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii)

Categories: Identifiable

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Critical Care
  • Hospital Episode Statistics Admitted Patient Care
  • HES:Civil Registration (Deaths) bridge
  • Civil Registration - Deaths

Objectives:

Public Health England (PHE) holds a contract with the Royal College of General Practitioners (RCGP) who in turn hold a contract with the University of Surrey to deliver information to support surveillance and monitoring of vaccine efficacy on Influenza. PHE, RCGP and University of Surrey are Joint Data Controllers for this request. They require HES and Civil Registration Data (CRD) to look at the outcomes of care, including death, to support surveillance and monitoring of vaccine efficacy on Influenza. Most important health outcomes happen in hospital, and hospital is where the bulk of health care costs are incurred. The focus of the work will be the impact of influenza and other infections on health and the benefit-risk of influenza and other vaccinations. The Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC) is based at the University of Surrey. The University of Surrey will have access to the record level data supplied by NHS Digital under this agreement. The University of Surrey will be the only organisation who accesses and processes the data disseminated under this agreement.
The GDPR Lawful basis for processing the requested data under this agreement are; Public Health England; Article 6(1)(e) (Public Task processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller) and Article 9(2)(h) (processing is necessary for the purposes of preventive or occupational medicine, for the assessment of the working capacity of the employee, medical diagnosis, the provision of health or social care or treatment or the management of health or social care systems and services) and Article 9(2)(i) (processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices) PHE exist to protect and improve the nation's health and wellbeing, and reduce health inequalities. RCGP; Article 6(1)(f) processing is necessary for the purposes of the legitimate interests pursued by a controller, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child. This shall not apply to processing carried out by public authorities in the performance of their tasks. 
Article 9(2)(i) (processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices) University of Surrey; Article 6(1)(e) (Public Task processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller) and Article 9(2)(i) (processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices). Additionally the request for data is supported by PHE as they have authority, as an emanation of the Secretary of State for Health and Social Care, to both self-approve the use of Regulation 3 and to grant this approval to third parties processing confidential patient information without consent for purposes that fall under the scope of Regulation 3. This authority has been in existence since PHE was established in 2013 although the large majority of the Regulation 3 approvals granted since that date have been internal to PHE; only a very small number have been granted by PHE to third parties. Specifically the work being undertaken under Reg 3 in this application is limited to Communicable Disease surveillance and other risks to public health. This secondary care data being requested will be linked at individual level to the Royal College of General Practitioners (RCGP) Research and Surveillance Centre's (RSC) primary care sentinel data for the purposes of infectious and respiratory diseases surveillance in England. These include feeding back to member practices about their quality of care through a practice dashboard.
The key objectives of the work are to: (1) monitor influenza; (2) analyse influenza vaccine effectiveness; (3) understand and predict the impact of influenza and other winter infections on health service utilisation (e.g. older people with co-morbid illness may be more likely to be admitted to hospital). Primary care/general practice data (which is already held) is rich in terms of diagnosis and information about the process of care. However, HES and CRD data provide key information about the outcomes of care (A&E use, hospitalisation and death data). The University of Surrey have an established sentinel GP influenza surveillance scheme in over 270 practices across England that monitors influenza-like illness, with a subset of practices taking virology swabs to confirm infection virologically. The University of Surrey have a great deal of experience in using health-related data to monitor infectious illnesses. Accessing HES and CRD data will allow the University of Surrey to expand their knowledge about the impact of infectious diseases further, both at the level of individual patient risk and in looking at how the University of Surrey could better predict winter pressures on the NHS to support PHE and RCGP. Public Health England (PHE) is involved in this programme of surveillance and quality improvement. PHE is a large organisation whose main aim is to protect and improve the nation’s health and reduce inequalities. The RCGP RSC and PHE have worked together for over 50 years to monitor the progression of infectious illnesses in order to put action plans in place if needed. PHE are funding the surveillance and quality improvement being undertaken through this agreement. Individual patient level data is required because it allows much more precise statistical analyses than comparing aggregate data alone.
The main aim of this project is to build a robust database and reporting system using up-to-date primary and secondary care data at the individual patient level, which can be easily queried and which holds the variables likely to be required for the PHE reports outlined in the specific outputs section. The database will contain the following for each patient (where present):
  • Influenza-like-illness appointments, including information on whether or not a virology swab was taken and the outcome of the swab
  • Data for the other 32 conditions monitored by University of Surrey as contracted by RCGP RSC on behalf of PHE
  • Data to provide national surveillance about an outbreak or pandemic that was not predicted
  • Vaccination status: date of vaccination, type of vaccination
  • Co-morbid conditions
  • Medication which may be associated with better or adverse outcomes
  • A&E visits
  • Inpatient appointments, including critical care
  • Outpatient appointments
  • Mortality data (if applicable)
The database will be used to answer the many associated questions exclusively related to surveillance and monitoring of vaccine efficacy against influenza. For example, gaining access to HES and CRD data means that the University of Surrey can clearly see the rates of patients who access health care because of influenza-related conditions. This will enable the University of Surrey to assess the pressure that is put on the healthcare system during influenza seasons, and devise and test measures to prevent this. Another example relates to comorbidities. Reducing the rates of influenza nationwide is of public health interest, as influenza can be particularly dangerous for those in high-risk groups. HES and CRD data will be used to identify the incidence of flu in those with certain conditions, such as pregnancy or diabetes.
This will enable the University of Surrey to identify whether certain conditions are associated with an increased risk of catching influenza, and may lead to individuals with certain conditions being offered vaccinations in future influenza seasons. A further example relates to vaccine effectiveness. The RCGP RSC system is also used to monitor the effectiveness of influenza vaccine on behalf of PHE each season. PHE make decisions about England’s vaccination programme, and the data the RCGP RSC provides to PHE informs their decisions on future influenza vaccinations. The data provided under this agreement will be used to see whether people with certain conditions who are vaccinated are less likely to use hospital services than those who have not been vaccinated. This will provide further information on vaccine effectiveness in individuals with certain conditions. The data will be used to support University of Surrey, RCGP and PHE in understanding more about the primary and secondary care data at a patient level for the following conditions:
  • URTI: upper respiratory infections
  • LRTI: lower respiratory infections (pneumonia and acute bronchitis)
  • Asthma and COPD
These peak as flu circulates, and not all flu is diagnosed as flu; looking at these conditions will therefore support the overall influenza programme. Both CRD and HES data will be required:
  • HES: Critical Care
  • HES: Outpatients
  • HES: A&E
  • HES: Admitted patient care
  • CRD (mortality) data
Since the outbreak of COVID-19 in Wuhan, China, the surveillance programme has been working closely with and under instruction from Public Health England (PHE) and other national bodies to closely monitor and make plans to deal with any situation that may develop in the UK. A vital part of that will be to monitor the number of suspected COVID-19 cases in the community in a timely way. PHE has commissioned the RCGP Research Surveillance Centre to incorporate monitoring of COVID-19 into its virology surveillance scheme.
RCGP RSC and PHE will be extending the surveillance to include COVID-19. All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Expected Benefits:

The surveillance work conducted by the RCGP RSC on behalf of the Data Controllers is used by Department of Health, NHS England and PHE to monitor trends in a number of infectious conditions. Specifically for influenza, seasonal epidemics are carefully followed, in order to deploy necessary measures as needed to limit the impact on the population. The trends in other conditions inform the development of vaccination programmes or other public health measures. Linking with HES and CRD data can help assess the severity and mortality of a given condition, thereby alerting PHE to whether larger measures should be implemented. This could lead to improved healthcare and reduced mortality from certain conditions. Additionally, the link with the HES and CRD data allows the University of Surrey to identify whether a particular flu season is putting additional pressures on the healthcare system, which means that plans can be put in place in order to prevent or deal with these pressures next season.
Specific benefits: The benefits include improved knowledge of the pressures of certain conditions during the winter period, reduced mortality from influenza, improved vaccine effectiveness and a health system that is more prepared in the event of an influenza outbreak.
Magnitude of benefits: It is expected that these benefits will be nationwide across England, to both patients and staff working in the health care system.
Sequence of events needed to take place in order for benefits to be achieved:
1. Pseudonymised matching takes place, and then HES and CRD mortality data are linked with the data the University of Surrey hold at the RSC.
2. The University of Surrey analyse the data and identify trends in rates of illnesses, hospital use and mortality in certain groups (i.e. pregnant women, older people, and people with co-morbid conditions).
3. The University of Surrey alert PHE and the Chief Medical Officer to the findings; they will then evaluate the evidence and make health care plans that are in the best interest of the nation’s health. For example, from the data provided by HES and CRD, PHE might identify that certain conditions are associated with higher influenza rates, and therefore the possibility of extending the vaccination programme to this condition might be examined.
The RSC and PHE have been working together for many years to improve the nation’s health. The University of Surrey has been an important part of this process since March 2015, when the secure network was established at the University of Surrey. The work is funded by PHE, and the University of Surrey's work has previously been used to influence practice. For example, if high rates of influenza are circulating, the University of Surrey will inform the Chief Medical Officer, who will then make a decision about whether or not to dispense anti-viral medications at hospitals and general practices.

Outputs:

The purpose of linking HES, CRD, and primary care data is to implement a wider and more accurate sentinel surveillance of infectious diseases in England. The main outputs of the RCGP RSC’s surveillance work, which is funded by Public Health England (PHE), are as follows:
  • The RCGP RSC weekly report is circulated to a selected list of recipients on Wednesdays and is publicly available on Thursdays at 2 pm at the RCGP RSC website (http://www.rcgp.org.uk/clinical-and-research/our-programmes/research-and-surveillance-centre.aspx). This report currently covers incidence rates of 37 infectious and respiratory conditions in England. It is expected that, in future, hospitalisation trends will be included. This is incorporated into the syndromic surveillance carried out by PHE on a daily basis, which allows them to determine any urgent priorities for local health protection teams.
  • Similarly, an annual report is published covering the annual trends of the 37 conditions. Each year, this report has a new theme which is explored in a paper submitted to a peer-reviewed journal (usually British Journal of General Practice). Themes explored include demographic disparities in disease presentation, higher rates of consultations for lower respiratory infections for boys, and urban/rural disparities of presentation.
  • In January of every year, the University of Surrey provide a mid-season flu cohort to PHE with data up to the end of December. This is a fully pseudonymised patient-level extract collected by a PHE statistician using a secure drive. This data extract contains details of influenza swabbing, chronic conditions, and vaccination status for each patient. It is hoped to be able to include details of emergency attendances or admissions around influenza, pneumonia, or lower respiratory tract infection events. At the end of the flu season (which varies from March to May), a second extract is provided updating the first, with data recorded after December.
  • The data from both of these extracts is used to estimate seasonal influenza vaccine effectiveness, stratified by comorbidities and demographics. HES data will allow the University of Surrey/PHE and RCGP to include the impact of any changes in effectiveness, assessed through changes in hospital admissions/emergencies due to respiratory conditions. The results are published at the mid-season and end-of-season stages in the peer-reviewed journal Eurosurveillance.
  • Important results from either of these will be further analysed and presented at the RCGP annual conference, the PHE annual conference, and the PHE annual epidemiology conference.
All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
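
The small-number suppression applied to aggregated outputs can be illustrated with a minimal Python sketch. The threshold of 7, the `*` marker and the function name `suppress_small_numbers` are assumptions made for illustration only; the rules that actually apply are those set out in the HES Analysis Guide, which this entry does not reproduce.

```python
from typing import Dict, Union

SUPPRESSED = "*"  # hypothetical marker for a suppressed cell


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = 7
) -> Dict[str, Union[int, str]]:
    """Replace non-zero counts at or below `threshold` with a marker.

    The default threshold is an assumption for illustration, not a
    figure taken from this document.
    """
    result: Dict[str, Union[int, str]] = {}
    for group, n in counts.items():
        if n < 0:
            raise ValueError(f"negative count for group {group!r}")
        # Zero counts and counts above the threshold pass through unchanged.
        result[group] = SUPPRESSED if 0 < n <= threshold else n
    return result


# Made-up aggregate counts, used only to show the behaviour.
example = {"under 5": 3, "5 to 64": 120, "65 and over": 0, "unknown": 7}
print(suppress_small_numbers(example))
```

Suppression is applied to the aggregated table as the last step before release, so that patient-level rows never leave the secure environment.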

Processing:

Flows of data:
  • Data are extracted by Apollo from practices that are members of the Royal College of General Practitioners Research and Surveillance Centre (RCGP RSC) network. The University of Surrey subcontracts with Apollo to do this as part of its contractual responsibilities.
  • The University of Surrey, on behalf of RCGP RSC, will provide NHS Digital with a list of pseudonymised NHS numbers and dates of birth for the cohort each quarter.
  • NHS Digital will provide HES Critical Care, Outpatients, A&E, Admitted Patient Care and CRD data for the cohort to the University of Surrey each quarter, for it to link this information to RCGP RSC data.
  • The University of Surrey will store the data on the secure network.
  • The University of Surrey will process and aggregate pseudonymised data to produce approved reports for surveillance (as part of the national surveillance process) and quality improvement.
Detailed explanation of flows of data:
a) Data flow from RCGP RSC network member practices to University of Surrey: Apollo extract the data from the practices. Patients who have opted out of data sharing do not have their data extracted, unless they have consented to a specific surveillance programme or study. This extract provides the study with information about patients’ visits to general practices, including the date of the appointment, the reason for the visit and any relevant vaccination information. The University of Surrey also receive patients’ NHS numbers and dates of birth, which are pseudonymised using the SHA-512 algorithm. Detailed information about this algorithm is held in a separate location by IT services at the University of Surrey. This extract provides the University of Surrey with a cohort of participants whose data will then be requested from NHS Digital.
b) University of Surrey to NHS Digital: The University of Surrey securely transfers a file of identifiers (pseudonymised NHS number, date of birth, and Unique Study ID) to NHS Digital for all non-opt-out patients who are registered with RCGP RSC general practices.
c) NHS Digital to University of Surrey: NHS Digital returns linked HES and CRD mortality data, including the Unique Study ID and pseudonymised NHS numbers or date of birth, to the University of Surrey.
d) University of Surrey storage and processing of data: The data about patients registered with RCGP RSC general practices is stored on the secure server at the University of Surrey, which can only be accessed from the University of Surrey. The data will be processed within the secure network and dedicated analysis server of the Surveillance Group. The secure network is located behind a firewall within the University’s network; all inbound connections are blocked, but outbound connections are allowed. Patient level data are held in the database server within the RSC Group’s secure network. Pseudonymised data will be stored on the database server within the RSC’s secure network. The pseudonymisation algorithm is held in a separate location by IT services at the University of Surrey.
e) University of Surrey process and aggregate pseudonymised data to produce reports. For example, University of Surrey on behalf of RCGP RSC provide a mid-season flu cohort to PHE with data up to the end of December. This is a fully pseudonymised patient-level extract collected by a PHE statistician using a secure drive. The University of Surrey also produce an end of season report, an annual report and weekly reports that are available to the public and use aggregated data on rates of infectious and allergic conditions.
The RCGP RSC data is controlled and processed by a group of staff who are all based at the University of Surrey; all are mandated to complete information governance training.
The group is made up of analysts, academic fellows, Structured Query Language (SQL) developers, RCGP RSC practice liaison officers, a project manager and a head of department. The team work from secure workstations or secure laptops with encrypted drives within the group’s secure network. Data will only be accessed by authorised individuals within the RSC who are substantive employees of the University of Surrey. The authorisation process includes: (1) a contractual requirement to follow IG principles; (2) using the email address registered with Human Resources to complete IG training and to return the certificate; (3) authorisation of the staff member’s email address by the IT department for one year to access the secure network, with the staff member’s computer configured to allow this; (4) at any point, the project managers or Head can have access to the secure network turned off. Special authorisation is required to access the main database. Only three SQL developers and one senior project manager can access the main database. Surveillance databases are created for approved analyses once they have been agreed by the RCGP RSC approval committee. The agreed protocol includes the list of variables required for the database. The SQL developers create separate databases for individual projects, including only the required variables for the required time interval. The HES and CRD data will be linked with the data that the University of Surrey already receives from the RCGP RSC network practices and PHE reference laboratories. The linkage between secondary and primary care data will be carried out by matching pseudonymised NHS numbers in both sets of data. The University of Surrey have used this process for previous projects linking different sets of data, and the linkage has been successful, provided both parties use the same pseudonymisation algorithm (SHA-512). There will be no requirement nor attempt to re-identify individuals from the data.
The data will not be made available to any third parties other than those specified, except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide. Historic data are needed because longitudinal data better enable the RSC to predict what might happen in the future; even a small increase in the ability to understand flu and its associated morbidity and mortality would offer benefits for patients and the NHS. Both historical and future data are needed in order to build a robust database and reporting system using up-to-date primary and secondary care data at the individual patient level, which can be easily queried. This will enable the study group to answer a wide range of questions which will have an impact on the provision of health care in England. For example, the data will be used to answer questions posed by PHE, who make many decisions about healthcare, such as the vaccination programme or preventative measures. The use of national data is needed as the University of Surrey are a national surveillance centre and the cohort are from across England. Practices are recruited to be nationally representative. Due to the potentially wide variety of adverse events that influenza can cause, it is not seen as appropriate to limit the data to specific health conditions/diagnostic codes or data types. For example, an unexpected rise in scarlet fever and winter outbreaks of scabies are unexpected increases in disease incidence that have been followed. The use of pseudonymised NHS numbers is essential, as the request is to link HES and CRD data to the data that the University of Surrey already receives from the RCGP RSC network general practices and PHE reference laboratories.
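
The linkage described above depends on both parties deriving the same pseudonym from the same NHS number. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions and does not describe the University of Surrey's actual procedure: the shared salt, the whitespace normalisation, the field names and the functions `pseudonymise_nhs_number` and `link_records` are all hypothetical, since the details of the algorithm are held separately by IT services and are not given in this entry.

```python
import hashlib
from typing import Dict, List


def pseudonymise_nhs_number(nhs_number: str, salt: str) -> str:
    """Return the SHA-512 hex digest of a salted, normalised NHS number.

    Salting and stripping non-digits are assumptions for illustration.
    """
    digits = "".join(ch for ch in nhs_number if ch.isdigit())
    if len(digits) != 10:
        raise ValueError("expected a 10-digit NHS number")
    return hashlib.sha512((salt + digits).encode("utf-8")).hexdigest()


def link_records(primary: List[Dict], secondary: List[Dict]) -> List[Dict]:
    """Inner-join two record lists on the hypothetical 'pseudo_id' field."""
    by_id: Dict[str, List[Dict]] = {}
    for row in secondary:
        by_id.setdefault(row["pseudo_id"], []).append(row)
    linked = []
    for row in primary:
        for match in by_id.get(row["pseudo_id"], []):
            linked.append({**row, **match})
    return linked


SALT = "example-shared-secret"  # hypothetical; both parties must use the same value
primary_care = [
    {"pseudo_id": pseudonymise_nhs_number("123 456 7890", SALT), "vaccinated": True}  # dummy value
]
hospital = [
    {"pseudo_id": pseudonymise_nhs_number("1234567890", SALT), "admissions": 1}
]
linked = link_records(primary_care, hospital)
print(len(linked))
```

Because the join key is a digest rather than the NHS number itself, the two sources match only when the algorithm, any salt and the input formatting agree exactly, which is why the text stresses that both parties must use the same pseudonymisation algorithm.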


Project 3 — DARS-NIC-203503-X7K8K

Opt outs honoured: N

Sensitive: Non Sensitive, and Sensitive

When: 2017/03 — 2017/05.

Repeats: One-Off

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Outpatients
  • Office for National Statistics Mortality Data

Objectives:

The Imperial College study team have recorded baseline characterisation of approximately 30,000 Indian Asian men and women aged 35-74 years and free from clinically manifest cardiovascular disease (CVD), in the London Life Sciences Prospective Population (LOLIPOP) study. LOLIPOP aims to precisely calculate the increased vascular risk for British Asians. Health economic analysis of the introduction of the CVD risk prediction calculator for use in Indian Asians will be performed, as well as a qualitative study to evaluate the utility and acceptability to general practitioners and individuals of implementing the CVD risk prediction model in general practice. In parallel, University of Surrey will develop models and conduct an economic evaluation to examine the cost-effectiveness of using the new risk estimator to detect the number of Asian men at risk. This includes the costs of identifying the cohort using the new risk estimator and putting them in a preventative scheme, and the benefit, both in terms of improved health outcomes and associated reduced health care costs.

Expected Benefits:

The high CVD morbidity and mortality amongst the Asian population compared to Europeans represents a significant health inequality which needs to be explored, explained and addressed. Currently the precise risk is not known, so the cost-effectiveness of a possible greater intensity of cholesterol, blood pressure and other interventions can’t be defined. Inclusion of enhanced treatment in national and international guidelines generally requires demonstration of cost-effectiveness. By precisely calculating risk, the University of Surrey will enable the cost-effectiveness of any enhanced intervention to be determined. The current recommended method for risk prediction is NOT adequate for this group; uncertainty of risk generally leads to standard guidelines being applied, and the consequent under-treatment widens the inequalities in CVD outcomes for this population. Some patients may also be inappropriately over-treated where individual clinicians approximate additional risk. This study has received a large investment from the NIHR, through a competitive, peer-reviewed application process, to produce results of the highest standards to ensure this issue is addressed. The results will be used to derive a new model for CVD prediction for British Asians, and this will be disseminated into routine clinical care. This research will result in clinicians being able to make informed decisions on how aggressively to treat this group as a whole, or specific subgroups (e.g. people with diabetes). Preventative treatment will benefit health care both in terms of improved health outcomes and associated reduced health care costs.

Outputs:

All outputs will be aggregate with small numbers suppressed in line with the HES Analysis Guide. The outputs from this research will be published in major scientific journals. Target journals include the Lancet and New England Journal of Medicine. It is anticipated that the outputs will directly impact national guidelines in the preventative management regimes implemented for public health as well as in primary and secondary care. This is likely to be in place within two years of publication. Outputs will also directly impact the treatment of the study participants as well as the needs of the west London community for education and service development. There have already been many publications from the LOLIPOP study team, including:
1. Coronary heart disease in Indian Asians. Tan ST, Scott W, Panoulas V, Sehmi J, Zhang W, Scott J, Elliott P, Chambers J, Kooner JS. Glob Cardiol Sci Pract. 2014 Jan 29;2014(1):13-23. doi: 10.5339/gcsp.2014.4. Collection 2014. PMID: 25054115
2. Prevalence of coronary artery calcium scores and silent myocardial ischaemia was similar in Indian Asians and European whites in a cross-sectional study of asymptomatic subjects from a U.K. population (LOLIPOP-IPC). Jain P, Kooner JS, Raval U, Lahiri A. J Nucl Cardiol. 2011 May;18(3):435-42. doi: 10.1007/s12350-011-9371-2. Epub 2011 Apr 9. PMID: 21479755
3. Ethnicity-related differences in left ventricular function, structure and geometry: a population study of UK Indian Asian and European white subjects. Chahal NS, Lim TK, Jain P, Chambers JC, Kooner JS, Senior R.
4. A replication study of GWAS-derived lipid genes in Asian Indians: the chromosomal region 11q23.3 harbors loci contributing to triglycerides. Braun TR, Been LF, Singhal A, Worsham J, Ralhan S, Wander GS, Chambers JC, Kooner JS, Aston CE, Sanghera DK. PLoS One. 2012;7(5):e37056. doi: 10.1371/journal.pone.0037056. Epub 2012 May 18. PMID: 22623978

Processing:

The University of Surrey are conducting a first full follow-up of the participants in the LOLIPOP study and therefore need access to data for all the patients in the cohort who have been in the study for the past 10 years. Both HES and ONS data will be linked to cohort data to maximise the identification of their CVD outcomes (stroke, advanced coronary artery disease and myocardial infarction) and allow a more rigorous evaluation, particularly as many people may have moved away from northwest London. NHS Digital will use the consented cohort already flagged under MR1143 to link to the requested data. The University of Surrey would receive a pseudonymised output from the HSCIC, which will be encrypted so re-identification cannot take place. No record level data will be provided to third parties and none of the data will be used within any commercial tool or product or for commercial gain. Only substantive employees of the University of Surrey will have access to the data and only for the purposes described in this document.


Project 4 — DARS-NIC-195793-R5Y3H

Opt outs honoured: No - data flow is not identifiable (Does not include the flow of confidential data)

Sensitive: Non Sensitive, and Sensitive

When: 2020/03 — 2020/03.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii)

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Admitted Patient Care
  • Civil Registration - Deaths
  • HES:Civil Registration (Deaths) bridge

Objectives:

The University of Surrey and the Royal College of General Practitioners (RCGP) are working together as joint data controllers to look at preventing venous thromboembolism (VTE). The Royal College of General Practitioners' legitimate purpose in using the data is to provide information and analysis on general practice data, covering both disease and workload. Project specific: to understand the impact of mandatory VTE risk assessment on the incidence and outcomes of VTE after surgery. The benefit to patients is an improved understanding of the impact of mandatory VTE risk assessment on the incidence and outcomes of VTE after surgery (patient care). There are benefits to national bodies through the:
  • Provision of national surveillance information for Public Health England
  • Provision of workload and workforce breakdown (influence policy) for NHS England
  • Project specific: understanding the impact of mandatory VTE risk assessment on the incidence and outcomes of VTE after surgery (policy implications)
The processing of the data will help the study to provide surveillance services based on general practice electronic healthcare records. Project specific: data from general practice is required to fill in the gaps in the current understanding of the effect of mandatory VTE risk assessment on the incidence and outcomes of VTE after surgery, as currently much of the existing information is from secondary care. RSC data is required to identify where VTE has occurred once a patient has left hospital. No additional processing outside of what is required is approved by the RSC, and all amendments have been made to ensure that data is processed in the least intrusive way possible while still enabling the purpose of the RCGP RSC. Data is pseudonymised as close to the source as possible; the RCGP RSC does not hold or process any identifiable personal data.
There are no existing relationships that can be identified with the individuals whose data is processed. All data is pseudonymised. In this application, the University of Surrey will be studying a comparable-sized population of about 3 million individuals from the Royal College of General Practitioners Research & Surveillance Centre (RCGP RSC) database. Over seven years they expect to see about 450 VTE events from approximately 78,000 surgical procedures (i.e. prior to mandatory screening). With 61,506 surgeries before, and 61,506 surgeries after, the introduction of VTE guidelines, there is 80% power to detect a 10% reduction in VTE events from 23.7/1000 years to 21.3/1000 years. The concept of using preventive measures against venous thromboembolism (VTE, also known as 'blood clots') for specific at-risk groups is well established (Haut et al, 2013). There are significant risks to medications that reduce clotting of the blood, and so determining the risk-to-benefit ratio is essential to ensure that prevention is targeted appropriately. The National VTE prevention programme was launched in 2010 with the introduction of mandatory VTE risk assessment of all adults on admission to hospital. This was supported by NICE guidelines. Where patients are at increased risk of VTE (i.e. the risk is NOT outweighed by risk factors for bleeding), NICE recommend mobilisation of the patient as soon as possible, medicines to limit clotting and compression stockings. Patients are also given information on the risk of blood clots, and discharge planning includes relaying this information to other care-givers. The risk of VTE persists for up to 12 months after surgery, and is particularly high in the first three months (Kearon, 2003; Sweetland et al, 2009). This risk was estimated before the era of Enhanced Recovery After Surgery (ERAS), which may change the natural history of the disease. ERAS is a package of care that means a patient’s stay in hospital is much shorter than it once was.
In the modern era, in the UK the length of stay for bariatric surgery is less than three days (Awad et al, 2014) and for thyroidectomy is just two days (Perera et al, 2014), for example. Studies evaluating HES have estimated postoperative VTE rates in varicose vein (Sutton et al, 2012), urological (Dyer et al, 2013) and orthopaedic surgery (Jameson et al, 2010). However, data from hospitals (Hospital Episode Statistics, HES) by itself is limited to capturing in-hospital adverse events and those recorded during readmission. It is evident that the risk of VTE persists well beyond discharge from hospital. For instance, a study by Bouras and colleagues showed that a large proportion of postoperative VTE was detected in primary care (2015). Linkage to primary care electronic health records and mortality data will allow for a more accurate perspective of a patient’s entire postoperative course. Mortality is obviously a key clinical outcome after surgery and would be recorded in hospital-derived data, but if death occurs in the community it has been shown to be poorly recorded through clinical coding in primary care. Linkage between NHS Digital data and the primary care record will be made via a pseudonymised NHS number. No patient identifiable information will be seen or used by Apollo Medical Software Solutions (the company that facilitates data extraction at the GP surgeries) or anyone at University of Surrey and the RCGP. The aim of this study is to examine the impact of mandatory VTE risk assessment (introduced in 2010) on the incidence of VTE after general surgery and major orthopaedic surgery. Patients undergoing one of twelve general surgical procedures will be chosen. Limiting to twelve operations allows the study to standardise for operative duration, likelihood of postoperative immobilisation etc. These procedures represent the majority of emergency and elective general surgical operations in UK hospitals.
In terms of the number of years of data required, it is necessary to have data over such a long period because in the paper by Bouras et al (PLoS ONE 2015) there were 981 VTE events captured within 90 days of surgery, in 168,005 procedures, from a background population of ~2.9 million people over 15 years (23.7/1000 patient-years). The period of time that the Bouras study relates to was 1997 to 2012. Importantly, this crosses the introduction of mandatory VTE risk assessment, and so the paper cannot describe the effect of mandatory screening.
Orthopaedic sub-study: It is hypothesised that an individual’s VTE risk after hip or knee surgery can be modelled with the use of a mathematical prediction model. Study objectives: to develop a model that predicts the risk of VTE in patients who undergo total hip arthroplasty (THA, 'hip replacement') or total knee arthroplasty (TKA, 'knee replacement') surgery. This will be based upon data on the clinical characteristics of the individual as well as data on the operation itself and routinely collected hospital biochemical (laboratory) data. The research question is therefore: what is the optimal prediction model for VTE risk following THA and TKA surgery? Expected results and influence in society: current strategies to prevent blood clots are one-size-fits-all (i.e. the same for all patients who undergo THA and TKA). These are not optimal because patients vary hugely in their tendency to form blood clots. Therefore a new strategy, i.e. advice on an individual basis, is necessary to reduce VTE, bleeding complications and costs. A prediction model should be able to reach a discriminative value (area under the curve) of at least 0.7, with a sensitivity of 75% (in other words, it detects at least 75% of those with blood clots) and a specificity of 50% (in other words, it correctly identifies at least half of people who do not have blood clots).
Ideally, three risk groups could be identified according to the prediction model: a low-risk (60% of the total), an intermediate-risk (30%) and a high-risk (10%) group. These risk groups could consequently be used to optimize strategies to prevent blood clots (thromboprophylaxis). For patients in the low-risk group (VTE risk <0.5%), thromboprophylaxis could be limited to in-hospital preventative treatment only, resulting in lower costs and fewer bleeding events. For patients in the intermediate group (0.5-1.0%), current thromboprophylaxis policies (lasting for 2 to 4 weeks) may be sufficient, while patients in the high-risk group (>1.0%) could potentially benefit from an extended period (or higher dosage) of thromboprophylaxis. However, before such a tailored strategy can be implemented in clinical practice, an additional impact analysis has to be performed that measures the validity of the prediction model and its usefulness in clinical practice. This current study will form the basis for this approach. For the orthopaedic sub study in this application, the two most common elective major orthopaedic surgeries (hip and knee replacement) have been chosen. In England and Wales (population 58 million) there are approximately 160,000 total hip and knee replacement procedures performed each year. From the population of 3 million in the RCGP RSC database, it would be expected that about 8300 surgeries occur per year (in other words, about 17,000 over the two years requested). In patients who undergo total hip arthroplasty (THA) or total knee arthroplasty (TKA), 3.7% and 2.7% of patients will develop symptomatic VTE, respectively, despite use of preventative low-molecular-weight heparin (a drug used to thin the blood). This is considered the minimum data necessary upon which to build a predictive algorithm for postoperative VTE. Sanofi provide a research grant for this study and have no obligation to provide any other support for the study.
University of Surrey is responsible for the initiation, management and conduct of the study. The Parties acknowledge that nothing in the funding agreement is provided as or intended to be an inducement to prescribe, purchase, recommend, use, or dispense any of Sanofi’s or its Affiliates’ products. University of Surrey is performing the study independently of Sanofi. Sanofi will have no control of nor in any way contribute to the conduct of the Study.
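The risk bands and the expected number of procedures described for the orthopaedic sub study can be sketched as follows. This is an illustration under stated assumptions, not the study's method: the function names are hypothetical, and the handling of values exactly on the 0.5% and 1.0% boundaries is an assumption, since the application does not specify it.

```python
def risk_group(predicted_risk: float) -> str:
    """Map a predicted VTE probability to a risk band.

    Thresholds from the text: <0.5% low, 0.5-1.0% intermediate, >1.0% high.
    Assumption: both boundary values fall in the intermediate band.
    """
    if predicted_risk < 0.005:
        return "low"
    if predicted_risk <= 0.010:
        return "intermediate"
    return "high"


def expected_procedures(national_per_year: float, national_population: float,
                        cohort_population: float, years: float = 1) -> float:
    """Scale a national annual procedure count to a smaller cohort, pro rata."""
    return national_per_year * cohort_population / national_population * years


# Figures quoted in the text: 160,000 procedures a year in a population of
# 58 million, scaled to the 3 million people in the RCGP RSC database.
per_year = expected_procedures(160_000, 58_000_000, 3_000_000)
print(round(per_year))  # 8276, quoted in the text as "about 8300"
```

The pro rata scaling assumes that the database population undergoes joint replacement at the national rate.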

Expected Benefits:

The results are expected to inform the evaluation of the NHS policy on VTE risk assessment and thromboprophylaxis in the surgical populations studied. Understanding the impact of the VTE prevention programme and consequent VTE rates following surgical procedures will identify areas with scope for further improvement. Expected benefits will include improvements in length of stay, costs, complication rates after surgery and patient satisfaction. As an example, the orthopaedic substudy will enable VTE risk stratification of patients undergoing joint replacement (currently all are considered high risk). This will enable delivery of thromboprophylaxis only to those at very high risk (anticipated to be 50% of patients). Avoiding thromboprophylaxis in those at low risk will minimise adverse effects such as surgical site bleeding/ooze, which predisposes to infection and can adversely impact patient quality of life both immediately after the operation and in the long term. Additionally, this represents a significant cost saving to the NHS. This will help inform best practice and guideline development in the continuum of care for joint replacement, as well as in general surgery. The data will be made immediately available to the National VTE Programme Board at NHS England via a co-applicant of this study, who is also a Director of the National VTE Exemplar Centres Network.

Outputs:

There are four key audiences for this research:

  A. patients, the public, and health care practitioners
  B. commissioning organisations (such as Clinical Commissioning Groups and NHS England)
  C. external statutory organisations (such as Department of Health, NHS Information Centre, NICE)
  D. academia

The outputs will be in the form of aggregate data tables, graphs, reports and papers for publication, with any small numbers suppressed (in line with the HES Analysis Guide).

  • The University of Surrey will work with the local Academic Health Science Network, who will advise and support routes for dissemination to the public.
  • Outputs to the public will be made via University of Surrey, University of Leiden and King’s College Hospital twitter feeds, Facebook and the media offices. Results of the study will be posted on www.clininfo.eu and University of Surrey webpages.
  • Publications including Full, Executive Summary and Plain English summary reports of the research will be made in peer-reviewed journals and local NHS newsletters. Journals may include, but are not limited to: JAMA Surgery, BMJ, British Journal of Haematology, Journal of Thrombosis and Haemostasis, Thrombosis Research.
  • Wherever possible, publication will be made using a Creative Commons Licence. This will allow the report to be downloaded free of charge. Publications are made available on the University of Surrey library page, academics’ webpages and Researchgate.net.
  • Presentations at national and international haemostasis and perioperative conferences.
  • A report of the study will be written for Sanofi (funding body).
  • There is a website for the National VTE prevention programme, vteengland.org.uk, where the study will promote the findings.
  • Outcomes from the research will be included in future iterations of the guide to achieving CQUIN targets by King’s Thrombosis Centre, in conjunction with VTE Exemplar Centres: http://www.kingsthrombosiscentre.org.uk/kings/Delivering%20the%20CQUIN%20Goal_2ndEdition_LR.pdf
  • A co-applicant is the Director of the King's Thrombosis Centre and a Senior Medical Advisor to the National VTE Prevention Programme in England. Through this channel, the research outcomes will influence Department of Health and NICE guidance for thromboprophylaxis.

Expected Output of Research/Impact

OUTPUTS

  1. An understanding of the effect of mandatory VTE risk assessment, introduced in 2010
  2. A risk prediction tool for VTE after orthopaedic surgery.

IMPACT The approach to research and dissemination will:

  • Potentially reduce NHS costs through better assessment of VTE risk and through more accurate understanding of thrombosis risk after hospitalisation.
  • Provide findings to enhance the current evidence base for quality indicators and commissioning practices, enabling commissioners and providers to make evidence-based decisions to ensure maximum benefit to patients and the NHS.
  • Contribute to national debates on the role of VTE thromboprophylaxis in driving forward improvements in patient care.

Submission of manuscripts will be targeted for the end of 2019 / Spring 2020.
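The small-number suppression mentioned for the aggregate outputs above could look like the following sketch. The threshold of 5 and the "*" marker are assumptions made for illustration; the HES Analysis Guide is the authority on the actual rules, and the function name is hypothetical.

```python
def suppress_small_counts(counts: dict, threshold: int = 5, marker: str = "*") -> dict:
    """Replace counts from 1 to `threshold` inclusive with a marker.

    Zeros and counts above the threshold are returned unchanged.
    Assumption: threshold and marker are placeholders, not the official rule.
    """
    return {
        key: (marker if 1 <= value <= threshold else value)
        for key, value in counts.items()
    }


# Made-up example table of event counts by procedure.
table = {"procedure_a": 0, "procedure_b": 3, "procedure_c": 5, "procedure_d": 42}
print(suppress_small_counts(table))
```

A real release would also need to consider whether suppressed cells can be recovered from row or column totals, which this sketch does not address.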

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data). The study will only use and store pseudonymised information extracted by an approved third party provider, Apollo Medical Software Solutions. Each unique patient will be de-identified using a computer-generated patient ID which could only be retraced by staff of the participating GP practices. The research team at University of Surrey will not view patient identifiable information in any form. Linkage between NHS Digital and the primary care data will be made via the pseudonymised NHS number. Apollo generates the hash key which then de-identifies all the patients in the server. This is passed on to the University of Surrey. University of Surrey will transfer the ‘hash’ algorithm to NHS Digital via Secure Electronic File Transfer (SEFT). The hash algorithm is a one-way encryption and cannot be reversed, so there is no ability for the pseudonymised data to be re-identified by University of Surrey. Record-level HES data, pseudonymised at source using the ‘hash’ algorithm, will be downloaded to the Research Group at the University of Surrey for linkage via SEFT. Pseudonymised record-level HES data will be processed and stored by the Research Group at the University of Surrey. Patient-level databases are held in the database server within the Research Group’s secure network. The Research Group is made up of staff substantially employed by University of Surrey. The Research Group’s dedicated secure network is sited behind a firewall within the University’s network. It is a standalone, independent network: all inbound connections are blocked, but outbound connections are allowed.
All staff members of the Research Group working within the team base work from secure workstations or secure laptops with encrypted drives within the Research Group’s secure network. The secure network is located behind a firewall within the University’s network: all inbound connections are blocked, but outbound connections are allowed. The Research Group has conducted a risk assessment of the physical security of the offices and servers where patient-level data is kept. A copy of the risk assessment can be accessed at: https://clininf.eu/wp-content/uploads/2017/02/Risk-Assessment-of-physical-security-V3.1-2016_18-signed.pdf A more recent review, carried out on 2nd May 2019, will soon be published. The hashed data provided by NHS Digital for this study will be downloaded by the Research Group. The Research Group will not have access to the identifiable data or the SALT key used for encryption. The University of Surrey will make no attempt to re-identify the data extract provided by NHS Digital under this agreement. The GDPR legal basis for the data processing is 'public interest', as medical research. There will be no additional data linkage undertaken with NHS Digital data provided under this agreement that is not already noted in the purpose. Data will only be accessed and processed by substantive employees of the University of Surrey and will not be accessed or processed by any other third parties.
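The linkage on a pseudonymised NHS number described in this section can be illustrated with a minimal sketch. The agreement does not name the hash algorithm, so HMAC-SHA256 is used here purely as an assumption; the function names, the salt value and the NHS numbers are all made up. In the arrangement described above, the salt would be held by the parties that pseudonymise at source and not by the Research Group.

```python
import hashlib
import hmac


def pseudonymise(nhs_number: str, salt: bytes) -> str:
    """Return a one-way keyed hash of an NHS number.

    Assumption: HMAC-SHA256 stands in for the unnamed 'hash' algorithm.
    Spaces are removed so that formatting differences do not break linkage.
    """
    normalised = nhs_number.replace(" ", "").encode("utf-8")
    return hmac.new(salt, normalised, hashlib.sha256).hexdigest()


def link(primary_care: dict, hospital: dict) -> dict:
    """Join two record sets on the pseudonym, keeping only matched patients."""
    shared = primary_care.keys() & hospital.keys()
    return {pid: (primary_care[pid], hospital[pid]) for pid in shared}


# Both sources are hashed with the same salt before the research team sees them.
salt = b"example-salt-not-a-real-key"
gp_records = {pseudonymise("000 000 0001", salt): {"vte_in_primary_care": True}}
hes_records = {
    pseudonymise("0000000001", salt): {"procedure": "THA"},
    pseudonymise("0000000002", salt): {"procedure": "TKA"},
}
linked = link(gp_records, hes_records)
print(len(linked))  # 1
```

Because the same input and salt always give the same pseudonym, records can be matched across sources without either side of the join containing the NHS number itself.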