NHS Digital Data Release Register - reformatted

University Of Surrey projects

133 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).


Models of Child Health Appraised (MOCHA - A study of Primary Care in 30 European Countries): comparing eight exemplar conditions in the UK — DARS-NIC-115590-Q1C7Z

Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii); Other-GDPR does not apply solely to the deceased

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When: DSA runs 2020-01-01 to 2022-01-19

Access method: One-Off

Data-controller type: ROYAL COLLEGE OF GENERAL PRACTITIONERS, UNIVERSITY OF SURREY

Sublicensing allowed: No

Datasets:

  1. Civil Registration (Deaths) - Secondary Care Cut
  2. HES:Civil Registration (Deaths) bridge
  3. Hospital Episode Statistics Accident and Emergency
  4. Hospital Episode Statistics Admitted Patient Care
  5. Hospital Episode Statistics Critical Care
  6. Hospital Episode Statistics Outpatients

Objectives:

This study is being run by the University of Surrey along with the Royal College of General Practitioners (RCGP). This request for data from NHS Digital will contribute to work package 5 of the Models of Child Health Appraised (MOCHA) study. MOCHA is an EU-wide project led by Imperial College and funded through the Horizon2020 Framework Programme for Research and Innovation under grant agreement number 634201.

The overall aim of the MOCHA project is to appraise the existing models of primary child healthcare in Europe.

The MOCHA project is divided into 10 work packages (WP) as follows:

WP 1. Identification of Models of Children’s Primary Health Care
This will coordinate the scientific work, be the interface with the country agents and the scientific analysts, and will identify the core basic models of primary care provision.

WP 2. Interfaces of Models of Primary Health Care with Secondary, Social and Complex Care
Covering both day-to-day referrals, and management of complex conditions, between primary and secondary care, and the collaboration between health and social care.

WP 3. Effective Models of School Health Services and Adolescent Health Services

WP 4. Identification and Application of Innovative Measures of Quality and Outcome
This WP will devise and apply a number of innovative measures of quality and outcome of child primary care models, based on concepts, analysis of available routine statistics.


WP 5 will assess the availability of large data sets, using learning from the TIRRE survey tool https://www.surveymonkey.com/r/tirre2 (developed as part of the 7th Framework TRANSFoRm project to assess the potential to link health databases) to create a set of common data descriptions and case definitions (ontologies). Further details about TIRRE can be found in the following reference:

Jennings E, De Lusignan S, Michalakidis G, Krause P, Sullivan F, Liyanage H, Delaney B. An instrument to identify computerised primary care research networks, genetic and disease registries prepared to conduct linked research: TRANSFoRm International Research Readiness (TIRRE) survey. J Innov Health Inform. 2018 Dec 31;25(4):207-220.

WP 6. Economic and Skill Set Evaluation and Analysis of Models

WP 7. Ensuring Equity for All Children in all Models
Equity across socio-economic, ethnic, and cultural divides, regardless of gender. How different health systems address these challenges will be considered, as will other triggers for inequality such as children in care, children from challenged families, and refugee and undocumented children.

WP 8. The Role of Electronic Records and Data to Support Safe and Efficient Models

WP 9. Validated Optimal Models of Children’s Prevention-Orientated Primary Health Care
This WP is an overarching outcome of the other WPs, drawing on the evidence collected by WP 1 and analysed by the specialist WPs 2-8. It will develop optimal patient-centered and prevention oriented primary child health care models emerging from the analyses in the other WPs, and seek public and stakeholder views.

WP 10. Dissemination
Dissemination will be active throughout the project, involving all Work Packages and many stakeholder interfaces.

The NHS Digital data used for Work Package 5 will not be used for any of the other Work Packages outlined in the application.

Aim and purpose of this application -

This agreement is solely for WP5 of the MOCHA project which is being led by the University of Surrey. WP5 is the only work package that requires access to individual level data.

The aim of this study, which is part of WP5, is to analyse the effect of individual and structural health services factors on the antecedents and outcomes in eight key childhood disease areas between 2003-17. These are: asthma care, epilepsy care, care for children with diarrhoea and vomiting, prevention of rickets, vaccine preventable disease, post-natal care, treatment of depression in teenagers, and treatment of enuresis.

Antecedents are conditions which are present in the decade (i.e. past medical history) before the key conditions under study and may be associated with the key condition under study.

The University of Surrey will examine the following individual determinants using information from Royal College of General Practitioners (RCGP) Research Surveillance Centre (RSC) data
- Demographic characteristics – age, sex, Socio-Economic Status (SES)
- Medications prescribed

The University of Surrey will examine the following individual determinants using information from the HES data
- Antecedent conditions (i.e. past medical history)

The University of Surrey will examine the following structural health service factors using information from general practice websites, the General Medical Council and NHS Choices
- Practice list size
- Number of general practitioners
- Average number of years since general practitioners graduated from medical school
- Average number of qualifications for general practitioners
- Practice QoF/ P4P score
- Practice NHS choices star rating

The Article 6 and article 9 (2) (j) justification for this application relates to the processing of data necessary for the performance of a task carried out in the public interest, namely research to appraise existing models of primary child healthcare as part of the MOCHA project.


The University of Surrey will study the outcomes of these key conditions, including hospitalisations from HES data, health services use from HES and primary care data, and Civil Registration Data (CRD). Linkage between primary care data and HES/CRD data will allow the study to maximise the identification of morbidity and mortality, as these outcomes may not be consistently recorded in the primary care record.


The cohorts being considered are:

1. The primary group of study will be those children under the age of 19 years who have received care in any of the following eight key childhood disease areas since 2003 until 2019 - Asthma care, Epilepsy care, Care for children with diarrhoea and vomiting, Prevention of rickets, Vaccine preventable disease, Post-natal care, Treatment of depression in teenagers, Treatment of enuresis. This will include 644,000 children under 19 years of age who have received care in any of the eight key childhood disease areas.

2. Up to 1.5 million household contacts of children who were identified in cohort 1 above with diarrhoea and vomiting who are registered between 1997 (when computerised records were available in the database) and 2019 with 164 general practices within the Royal College of General Practitioners Research and Surveillance Centre (RCGP RSC) network.

The RCGP RSC database contains a pseudonymised field called the ‘household key’ which identifies all people living within the same household. It does not contain any information about the address. This ‘household key’ will be used to identify household contacts of children with diarrhoea and vomiting (D&V). Members of the same household who are at risk of D&V, namely those within 7 days of the presentation of the child to the GP for D&V, will be identified. It is estimated that there may be up to 1,500,000 household contacts of children with D&V.

3. Women of child bearing age between 10-69 years. It is estimated that there are 900,000 such women in the database. Ante-natal and early post-natal care will be studied in these women to understand the determinants of early preventative care for children in primary care. No attempt will be made to match women of child bearing age to their children; this cohort will not be linked to cohorts 1 or 2 above.
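
The use of the household key for cohort 2 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the at-risk window runs for 7 days after the child's presentation and that household membership is available as a simple mapping; the function name, field layout and identifiers are hypothetical and are not taken from the RCGP RSC database.

```python
from datetime import date, timedelta

def at_risk_contacts(index_cases, household_members, window_days=7):
    """Pair each index child with the other members of the same household.

    index_cases: list of (child_id, household_key, presentation_date).
    household_members: dict mapping household_key -> list of pseudonymised ids.
    Returns a list of (contact_id, child_id, window_start, window_end).
    Assumption: the at-risk window starts at the presentation date and
    lasts `window_days` days.
    """
    rows = []
    for child_id, key, presented in index_cases:
        for member in household_members.get(key, []):
            if member == child_id:
                continue  # the index child is not their own contact
            window_end = presented + timedelta(days=window_days)
            rows.append((member, child_id, presented, window_end))
    return rows

# Hypothetical pseudonymised identifiers and household keys
members = {"hh1": ["c1", "p1", "p2"], "hh2": ["c2"]}
cases = [("c1", "hh1", date(2015, 3, 1)), ("c2", "hh2", date(2015, 3, 5))]
contacts = at_risk_contacts(cases, members)
```

In this sketch the child in household "hh2" has no contacts, because no other person shares that household key.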

Purpose of the request –

The University of Surrey will study the key antecedents and outcomes of these key conditions in children registered with practices in the RCGP RSC network, including hospitalisations from HES data, health services use from HES and primary care data, and Civil Registration Data (CRD). The request therefore covers data for the whole of England and Wales relating to patients registered with RCGP RSC practices.

Linkage between primary care data and HES/CRD data will allow the University of Surrey to identify antecedent conditions which are present in the decade (i.e. past medical history) before the key conditions under study and may be associated with the key condition under study. It will also maximise the identification of morbidity and mortality, as these outcomes may not be consistently recorded in the primary care record. Currently this work is not possible using primary care datasets alone without linkage to NHS Digital datasets. These linked datasets will also help to identify early adverse ante-natal and early child outcomes, such as miscarriage in women of child-bearing age, which are important to capture to understand the effect of preventative care pertaining to children in primary care, such as dietary advice, smoking cessation advice and routine antenatal checks.
The University of Surrey will use multiple regression analysis to look for associations between individual characteristics, structural characteristics and antecedents on the one hand and key childhood disease outcomes on the other. Individual pseudonymised data have therefore been requested.

Organisations involved –

Imperial College London is the lead for the wider MOCHA Programme of research, an integrated programme of interlinked research projects designed to answer the question “how is primary care for children and young people operating in Europe and which aspects lead to best quality care?”. Imperial has no influence over the design or processing activities, and no decision-making responsibility over what data may be used in WP5 outputs or over the dissemination of data from WP5. Imperial will review aggregated data outputs only and oversee the academic outputs in terms of their contribution to the overall programme.


Work Package 5 (this application) is focused on Identification and Use of Derivatives of Large Data Sets and Systems to Measure Quality and had a number of discrete tasks including examining child health specific measures in primary care data sets where available in different countries. University of Surrey are the lead on WP5 alongside the Royal College of GPs (RCGP). For WP5 University of Surrey and RCGP are joint Data Controllers for this application and WP5. University of Surrey and RCGP as leads for WP5 in the MOCHA project will make the final determination of how the data from this agreement is used/analysed and how the results are disseminated. The RCGP Research and Surveillance Centre (RSC) is based at the University of Surrey. The University of Surrey has a contract with the RCGP to provide a surveillance, quality improvement and research platform through the RSC.

Apollo Medical Software Solutions, an approved third party provider has formal service agreements and service specifications with RCGP RSC and with individual participating GP practices to conduct data collection and secure web transfer. (Copies of these formal agreements and technical details were shared with NHS Digital in the last IGTK assessment and were deemed satisfactory, and are available to legitimate requests). Apollo will not have access to any data disseminated under this agreement.

No funders (Horizon 2020) will be involved in the analysis or have access to the data; they have no influence over the outputs or design of the study.

Outputs:

Outputs will include (but are not limited to) the following:

a. Reports
b. Submissions to peer reviewed journals
c. Presentations
d. Conferences
e. Dashboards


There will be a number of outputs by the end of the MOCHA project and for two years beyond, up to March 2022. The target audiences will include academic, scientific and professional groups and individuals; policy makers (both political and professional) involved in deciding future health policies; and bodies representing parents, children and young people.
Much of the dissemination will be at European level and in professional journals, but materials on the project web site (with which other sites will be encouraged to link) will be important, as will targeted national dissemination as recommended by country agents and some publications in selected lay outlets. A list of publications from the MOCHA team, including WP5, is already available on the MOCHA website. The results from WP5, including the results from this study using data from England, will also feed into the MOCHA conclusions in the final report about the existing models of primary child healthcare in Europe.


No data or outputs from this study will be made available to any third party except in the form of aggregated outputs with small numbers suppressed in line with NHS Digital Guidelines.

Dissemination for MOCHA as a whole will be coordinated by WP10 and will be formative (disseminating the MOCHA project’s objectives and methods) as well as summative (disseminating the findings). University of Surrey will lead and control dissemination of results and data from WP5 and WP10 will oversee the academic outputs of WP5 in terms of their contribution to the overall programme.

The MOCHA project sees as its beneficiaries the children and families of Europe including the UK and the professionals who care for them currently through varied historically-based service models.


There will be a number of target populations for the dissemination activities by the end of the MOCHA project on 31/01/2019 and for two years beyond. These will include academic, scientific and professional groups and individuals; policy makers (both political and professional) involved in deciding future health policies; and bodies representing parents, children and young people. Much of the dissemination will be at European level and in professional journals, but materials on the project web site (with which other sites will be encouraged to link) will be important, as will targeted national dissemination as recommended by country agents and some publications in selected lay outlets. A list of publications from the MOCHA team, including WP5, is already available on the MOCHA website.

The MOCHA project and its External Advisory Board include persons directly embedded in a number of key scientific and strategic European organisations including the World Health Organisation, Health Forum Bad Gastein, European Public Health Association, European Health Management Association, European Patients’ Association, European paediatric networks (such as the European Academy of Paediatrics, European Paediatric Associations and European Confederation of Primary Care Paediatrics), Alliance for Childhood (with its network and regular European Parliament meetings) and Eurochild.

Additional key conferences, such as those of nursing associations at European level, will also be targeted, while the European Union for School and University Health and Medicine has offered collaboration.

No data or outputs from this study will be made available to any third party except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guidelines.
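
Small-number suppression of aggregated outputs can be sketched as follows. The threshold of 7 and the "*" marker are assumptions made for illustration only; the actual rules are those set out in the applicable guidelines, and the condition counts are invented.

```python
def suppress_small_numbers(counts, threshold=7, marker="*"):
    """Replace counts between 1 and `threshold` (inclusive) with a marker.

    Zero counts and counts above the threshold are returned unchanged.
    The threshold value is an assumption, not a statement of the guidelines.
    """
    return {
        label: (marker if 0 < value <= threshold else value)
        for label, value in counts.items()
    }

# Invented aggregate counts per condition
admissions = {"asthma": 120, "epilepsy": 4, "rickets": 0, "enuresis": 7}
published = suppress_small_numbers(admissions)
```

Here the counts for "epilepsy" and "enuresis" would be masked before release, while the other two values are published as they are.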

The dates for outputs from this study are from March 2015 until March 2022.

MOCHA has a website and newsletter which are written in lay language, along with press releases which are sent to media outlets including, inter alia, the BBC, the Financial Times, The Guardian, The Independent, The Times, The Daily Telegraph, and the New Statesman. As an example, a recent MOCHA paper was covered by The Sunday Telegraph https://www.telegraph.co.uk/science/2019/08/10/older-parents-may-not-know-child-has-adhd-better-absorbing-rowdy/

Processing:

Only substantive employees of the University of Surrey will have access to the data and only for the purposes described in this agreement. All researchers who have access to the data will also have to have undertaken data governance training in accordance with the NHS Information Governance (IG) toolkit. The data will be used solely for the MOCHA project WP5. Imperial College will only have access to data which has been aggregated with small numbers suppressed in line with NHS Digital Guidelines.

General practices within the RCGP RSC network have been involved in disease surveillance for over 50 years. Over this period practices have had feedback about their data quality and many practices have been computerised since the late 1990s, allowing long-term outcomes to be studied.

Each unique patient within the RCGP RSC database is de-identified at source, using a computer generated patient ID, before data is extracted from individual practices. This de-identification of records includes production of a scrambled NHS number using a pseudonymisation algorithm. Apollo Medical Software Solutions, an approved third party provider, has formal service agreements and service specifications with RCGP RSC and with individual participating GP practices to conduct data collection and secure web transfer. (Copies of these formal agreements and technical details were shared with NHS Digital in the last IGTK assessment and were deemed satisfactory, and are available to legitimate requests).

The University of Surrey will send the hashed NHS numbers to NHS Digital. The following three flows of hashed NHS numbers will be undertaken:

Firstly, from the primary group of study - those children under the age of 19 years who have received care in any of the following eight key childhood disease areas since 2003 until 2019 - Asthma care, Epilepsy care, Care for children with diarrhoea and vomiting, Prevention of rickets, Vaccine preventable disease, Post-natal care, Treatment of depression in teenagers, Treatment of enuresis. This will include 644,000 children under 19 years of age who have received care in any of the eight key childhood disease areas.

Secondly, up to 1.5 million household contacts of children with diarrhoea and vomiting who are registered between 1997 (when computerised records were available in the database) and 2019 with 164 general practices within the Royal College of General Practitioners Research and Surveillance Centre (RCGP RSC) network.

Thirdly, data from women of child bearing age between 10-69 years. It is estimated that there are 900,000 such women in the database. No attempt will be made to match women of child bearing age to their children.

Data from women of child bearing age will also be used to examine the determinants of ante-natal and post-natal care in primary care. Women of child bearing age will be identified in the RCGP RSC database using the ‘pregnancy sliding window’ that has recently been developed and published in a peer-reviewed journal. This algorithm uses information about women between the ages of 10 and 69 years. No attempt will be made to match women of child bearing age to their children in this particular sub-study.

The research team at the University of Surrey will not link with any other record level data and there will be no requirement nor attempt to re-identify individuals from the data. The University of Surrey will undertake the following processing activities:

• University of Surrey will identify the study patients for the three cohorts above from primary care records in the RCGP RSC practices and send the scrambled/hashed NHS numbers of the cohort under study to NHS Digital to link to HES/ Civil registration data. No other GP data will be sent to NHS Digital.
• NHS Digital will hash their NHS numbers using the same pseudonymisation algorithm
• NHS Digital will undertake data linkage via the hashed NHS numbers in both sets of data. This process has been used for previous projects linking different sets of data, and the linkage has been successful
• NHS Digital will extract all HES and CRD records for which there are matched primary care records
• NHS Digital will send the extract of HES and CRD records with the hashed NHS number to the University of Surrey
• University of Surrey will link the HES and CRD records together with GP data from the primary care records from RCGP RSC practices with the same hashed NHS numbers
• The linkages to the cohorts will be done separately so each of the 3 cohorts will be sent in as separate files, linked by NHS Digital as separate cohorts and released as 3 separate cohorts.
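
The linkage steps above can be sketched in a few lines. This is an illustrative sketch only: it assumes SHA-512 over a salt followed by the NHS number, with both parties applying the same salt, and the salt value, record fields and NHS numbers shown are made up. It does not describe the actual implementation used by NHS Digital or the University of Surrey.

```python
import hashlib

def pseudonymise(nhs_number: str, salt: str) -> str:
    """Return the SHA-512 hex digest of salt + NHS number.

    How the salt is combined with the number is an assumption.
    """
    return hashlib.sha512((salt + nhs_number).encode("utf-8")).hexdigest()

def link_extract(cohort_hashes, hes_records, salt):
    """Hash each HES record's NHS number and keep records found in the cohort.

    The clear NHS number is dropped from every released record, so the
    extract carries only the hashed identifier.
    """
    wanted = set(cohort_hashes)
    extract = []
    for record in hes_records:
        hashed = pseudonymise(record["nhs_number"], salt)
        if hashed in wanted:
            released = {k: v for k, v in record.items() if k != "nhs_number"}
            released["hashed_nhs_number"] = hashed
            extract.append(released)
    return extract

SALT = "example-shared-salt"  # hypothetical value
cohort = [pseudonymise("9990000001", SALT)]  # made-up NHS number
hes = [
    {"nhs_number": "9990000001", "admission": "2016-02-01"},
    {"nhs_number": "9990000002", "admission": "2016-05-09"},
]
extract = link_extract(cohort, hes, SALT)
```

Running the three cohorts through such a step separately, one file at a time, would mirror the separate linkage and release described in the final bullet.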

Records for each study participant will when fully linked contain information from HES and CRD, together with information from RCGP RSC primary care practices.

Pseudonymised record-level HES data will be processed and stored at the University of Surrey. Patient level databases are held in the database server within the Research Group’s secure network. The Research Group’s dedicated secure network is sited behind a firewall within the University’s network. It is a standalone, independent network: all inbound connections are blocked, but outbound connections are allowed. All staff members of the research group working within the team base work from secure workstations or secure laptops with encrypted drives. Only substantive employees of the University of Surrey will have access to the data and only for the purposes described in this document. The data will be used solely for the MOCHA project WP5.


Each unique patient within the RCGP RSC database is de-identified at source, using a computer generated patient ID, before data is extracted from individual practices. The University of Surrey holds no identifiable data, only hashed NHS numbers. Linking data from University of Surrey for this project will not lead to or increase the risk of pseudonymised data becoming identifiable data. The data held at the University of Surrey is pseudonymised at the practice level using a non-reversible hash key.

The hashing of identifiable data for the Clinical Informatics and Outcomes Research Group at the University of Surrey, if needed, is conducted by the Salt Service of the University of Surrey Central IT team, so that the holder of the pseudonymised data is separated from the service that holds the non-reversible hash key. This will not lead to a risk of pseudonymised data becoming identifiable data.


Publicly available data about general characteristics of practices will be extracted from individual practice websites, the NHS Choices website and the General Medical Council (GMC) register. This includes the practice list size, the percentage of children in the lowest IMD quintile, the total number of children registered with the practice, the total number of general practitioners (GPs) in the practice, the gender of GPs in the practice, the average number of years since their medical and specialist general practice qualifications, the average number of qualifications of GPs in the practice, and whether the practice was in an urban or rural location. Data about general practitioners will be aggregated and information about general practices will be anonymised to mitigate the risk of re-identification of practices and practitioners. No details of individual general practices (including practice name, address or postcode) or of individual general practitioners will be identified.


NHS Digital will hash their NHS numbers using the same pseudonymisation algorithm (SHA-512). NHS Digital will undertake data linkage via the hashed NHS numbers in both sets of data. This process has been used for previous projects linking different sets of data, and the linkage has been successful.

Records for each study participant containing information from HES and CRD, together with hashed NHS numbers, will be sent to the University of Surrey. The study is interested in any subsequent hospital admissions and deaths of children who present with these conditions in primary care. It is therefore interested in HES records and deaths for these children even if they are over the age of 19 at the time of these subsequent records.


There will be no subsequent flows of data from the University of Surrey.


Each unique patient within the RCGP RSC database is de-identified at source, using a computer generated patient ID, before data is extracted from individual practices. The University of Surrey holds no identifiable data, only hashed NHS numbers. Combining or linking data from University of Surrey for this project will not lead to or increase the risk of pseudonymised data becoming identifiable data. Linkage of two non-confidential datasets does not create a confidential dataset. A data protection impact assessment was completed on 12/11/2018. https://clininf.eu/wp-content/uploads/2018/11/DPIA-2018.11.26-signed.pdf

Publicly available data about practices will be included in this study. This data will be aggregated and anonymised to mitigate the risk of re-identification of practices and individuals. No information from single handed practices within the RCGP RSC network where information about the practice cannot be aggregated will be used for this study.
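
As a rough illustration of aggregating practitioner data while leaving out single-handed practices, the sketch below computes practice-level averages from invented graduation years. The function name, the pseudonymous practice labels and the rule of requiring at least two GPs are assumptions made for the example, not details taken from the study protocol.

```python
from statistics import mean

def practice_summary(gps_by_practice, current_year):
    """Aggregate GP graduation years to practice level.

    gps_by_practice: dict mapping a pseudonymous practice label to a list
    of GP graduation years. Practices with fewer than two GPs are skipped,
    since an "average" over one person would describe an individual.
    """
    summary = {}
    for practice, grad_years in gps_by_practice.items():
        if len(grad_years) < 2:
            continue  # single-handed practice: cannot be aggregated
        summary[practice] = {
            "n_gps": len(grad_years),
            "mean_years_since_graduation": mean(
                current_year - year for year in grad_years
            ),
        }
    return summary

# Invented example data
practices = {"A": [1990, 2000], "B": [1985]}
result = practice_summary(practices, current_year=2019)
```

Only practice "A" appears in the result; practice "B" is excluded because it has a single GP.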


The Research Group has conducted a risk assessment of the physical security of the offices and servers where patient level data is kept.
The Research Group of Department of Clinical and Experimental Medicine at the University of Surrey has worked with routinely collected healthcare data in a number of research and evaluation projects over the last 15 years. The Research Group works within the Research and Information Governance


Data for NHS hospital workforce retention project (determinants and effects on patients' outcomes). — DARS-NIC-345789-L9Q7J

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When: DSA runs 2020-11-06 to 2023-11-05; 2020.11 to 2022.04.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY OF SURREY

Sublicensing allowed: No

Datasets:

  1. Mental Health Services Data Set
  2. Mental Health Minimum Data Set
  3. Mental Health and Learning Disabilities Data Set
  4. Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
  5. Hospital Episode Statistics Accident and Emergency
  6. Hospital Episode Statistics Admitted Patient Care
  7. Civil Registration (Deaths) - Secondary Care Cut
  8. Emergency Care Data Set (ECDS)
  9. HES:Civil Registration (Deaths) bridge
  10. Hospital Episode Statistics Critical Care
  11. Patient Reported Outcome Measures (Linkable to HES)
  12. HES-ID to MPS-ID HES Accident and Emergency
  13. HES-ID to MPS-ID HES Admitted Patient Care

Objectives:

AIM AND PURPOSE
The aim of this project is to investigate the determinants and effects of hospital workforce retention (WFR). (Workforce retention refers to the ability of a workforce to retain its employees.) This project is led by University of Surrey (UoS) and funded by The Health Foundation (the Funder). The project is of interest for the research team, the Funder, and the wider community of researchers and healthcare policy-makers, with an expected positive impact on the knowledge of the economics of healthcare workforce and its effects on hospital performance and patients’ outcomes. The added contribution generated by the project is hoped to help improve the sustainability of the English NHS.

The lawful basis for processing personal data is Article 6(1)(e), in that processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller. The lawful basis for processing Special Category Data is Article 9(2)(j), in that processing is necessary for scientific research purposes in accordance with Article 89(1). The University of Surrey is a public authority responsible for conducting scientific research for academic and public benefit. Data in the ‘Hospital workforce retention and patient outcomes’ study is processed to enable the University of Surrey to perform its public task. The University of Surrey relies on GDPR Article 6(1)(e) to carry out its public task and, for special categories of data (including health information and pathways, and information concerning ethnicity), on GDPR Article 9(2)(j), for archiving, research and statistics, as the study is a research project which will use data and statistics in accordance with Article 89(1).

The following data products are requested:
1) Hospital Episode Statistics (HES) Admitted Patient Care: Financial Years 2009/10 to 2021/22;
2) Patient Reported Outcome Measures (PROMs): Financial Years 2009/10 to 2021/22;
3) Civil Registration - Deaths (CR-D): Financial Years 2009/10 to 2021/22;
4) HES Critical Care (HES CC): Financial Years 2017/18 to 2021/22;
5) HES Accidents and Emergencies (HES A&E): Financial Years 2009/10 to 2018/19;
6) Emergency Care Data Set (ECDS): Financial Years 2018/19 to 2021/22;
7) Mental Health Minimum DataSet (MHMDS), Mental Health Learning Disabilities (MHLDDS), Mental Health Services DataSet (MHSDS): Financial Years 2011/12 to 2021/22;
8) Bridge files: Hospital Episode Statistics to Mental Health Minimum Data Set
9) Bridge files: Hospital Episode Statistics to Mental Health Services Data Set
10) Mapping files: MHSDS to MHMDS / MHLDDS
11) Bridge files: PROMS to Hospital Episode Statistics

The data requested will allow UoS to investigate the association between hospital WFR for different categories of hospital workers (consultants, nurses and ambulance staff) and patients’ outcomes in different types of hospital care: length of stay in acute emergency care, unplanned re-admissions in acute elective care, and unplanned re-admission to inpatient mental health wards for mental health care.

The study will focus on two research questions (RQ).
RQ1: What are the determinants of variations in NHS WFR, in both acute care (AC) and mental health (MH) hospitals? (What causes the differences in workforce retention between different hospital settings?)
RQ2: What are the causal effects of WFR on admitted patients’ health outcomes (mortality, emergency readmissions, length of stay, waiting times) in emergency, elective and MH care? (What is the impact on care?)

RQ1 investigates the sources of variation in hospital WFR for ambulance and clinical staff (doctors and nurses), i.e. which factors affect hospital WFR within and amongst NHS hospital organisations and Ambulance Trusts, and whether different kinds of factors affect the various categories of hospital workers (i.e. doctors, nurses and ambulance staff) differently.

The baseline model will make use of a range of longitudinal data, including NHS Digital data, to uncover the association between WFR and factors determining its variations over time and across hospitals. Moreover, the variation stemming from a series of policies (i.e. the 2016 new junior doctor contract; the 2018 NHS Improvement Retention Support Programme; the 2018 hospital staff’s new pay/progression contract) and political events (the 2016 Brexit poll and the 2019 EU withdrawal), together with the fact that such changes often affected different groups of hospital workers unequally, will be used to estimate statistical models (e.g. Interrupted Time Series (ITS) and difference-in-difference (DiD) models) to evaluate the effect of such ‘breaks’ on hospital WFR. This analysis is not concerned with the identification of the retention behaviour of individual doctors, but with the identification of average retention behaviour for groups of doctors/nurses/ambulance workers amongst different NHS organisations over time.
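
As an illustration of the difference-in-difference logic, the following minimal sketch compares the change in average retention for a staff group affected by a policy with the change for an unaffected group. The function name, the record layout and the figures are hypothetical and are not taken from the project; the models described in this agreement would be estimated by regression on the real data.

```python
from statistics import mean


def did_estimate(records):
    """Return the simple 2x2 difference-in-differences estimate.

    Each record is a tuple (treated, post, retention_rate), where
    `treated` and `post` are booleans. This layout is an assumption
    made for illustration only.
    """
    def cell(treated, post):
        values = [r for t, p, r in records if t == treated and p == post]
        if not values:
            raise ValueError("every group/period cell needs at least one record")
        return mean(values)

    change_treated = cell(True, True) - cell(True, False)
    change_control = cell(False, True) - cell(False, False)
    return change_treated - change_control


# Hypothetical organisation-level retention rates (illustrative only)
example = [
    (True, False, 0.90), (True, False, 0.88),
    (True, True, 0.84), (True, True, 0.82),
    (False, False, 0.91), (False, False, 0.89),
    (False, True, 0.90), (False, True, 0.88),
]
effect = did_estimate(example)
```

In this made-up example the affected group's retention falls by 6 percentage points and the comparison group's by 1, so the estimate is a 5 percentage point fall attributable to the 'break', under the usual assumption that both groups would otherwise have followed parallel trends.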

A simple (linear regression) analysis of the effect of WFR on patients' outcomes can lead to estimates biased by endogeneity due to reverse causality (e.g. bad patients' outcomes leading to poor WFR). RQ2 will therefore rely on the changes in hospital WFR caused either by policy changes and the Brexit shock or by the variation in non-NHS wages (i.e. wages for workers with similar age/qualification profiles as NHS workers, but not working for the NHS) and business turnover in the local labour market around NHS hospitals. The aim is to plausibly identify causal effects.

This analysis is not concerned with the identification of the clinical performance of individual doctors, but with the identification of the average performance amongst NHS organisations with high vs low WFR rates and with the identification of the average performance of groups of doctors staying in a hospital vs groups of doctors leaving a hospital. The effects of interest are at the very least aggregated at stayers vs leavers level, and never defined or reported for individual doctors.

This is a standalone project and has no links to other projects or collaborations. The project consists of two phases (related to each RQ), which will be carried out partly sequentially and partly in parallel.

Phase 1.
The first phase will model the retention of the NHS hospital workforce. In this phase, the NHSD datasets requested will be used to extract variables of interest that are likely to be factors associated with or causing changes in hospital WFR patterns over time. The data obtained from the Department of Health and Social Care (Electronic Staff Records data, ESR) will be used to define the main output variables for the analysis of the determinants of hospital workforce retention as well as some of the main variables of interest in the same analysis (e.g. average salary / gender pay gap / staff nationality / staff qualification / staff age). The ESR data will also be used to define the main variables of interest (e.g. stability index, number of leavers) in the analysis of the effects of hospital WFR on patient outcomes. The patient level data requested from NHSD will be used to define some of the control variables in the analysis of the determinants of hospital workforce retention (e.g. weekly admissions to hospital, average age/comorbidities (the state of having multiple medical conditions at the same time, especially when they interact with each other in some way)/procedures, number of competitors at NHS Trust level).

Phase 2.
The second phase will analyse the effects of hospital WFR on patients’ outcomes. In this phase, the NHSD datasets requested will be used to extract variables that are either the patients’ health / process outcomes of interest (e.g. mortality, readmission, length of stay, waiting time) or characteristics of the patients (e.g. co-morbidities, age, economic deprivation, hospital attended, year, month, day of the week when admitted or treated).

The patient level data requested from NHSD will be used to define some of the patient-level control variables in the analysis of the effect of hospital WFR on patient outcomes (e.g. emergency or elective admission to hospital, patient age/comorbidities/procedures, number of competitors) as well as the main outcome variables (e.g. mortality, readmissions, length of stay, waiting times, change in the Oxford Hip/Knee score (a joint-specific, patient-reported outcome measure tool designed to assess disability in patients undergoing total hip replacement)).

The most important academic references to this project are the following published studies:
Propper, C., & Van Reenen, J. (2010). Can pay regulation kill? Panel data evidence on the effect of labour markets on hospital performance. Journal of Political Economy, 118(2), 222-273.
Shields, M. A., & Ward, M. (2001). Improving nurse retention in the National Health Service in England: the impact of job satisfaction on intentions to quit. Journal of health economics, 20(5), 677-701.
Newman, K., & Maylor, U. (2002). The NHS Plan: nurse satisfaction, commitment and retention strategies. Health Services Management Research, 15(2), 93-105.
Other important healthcare policy references on the economics of the NHS healthcare workforce are:
Health Education England. (2017). Facing the Facts, Shaping the Future. A draft health and care workforce strategy for England to 2027.
Charlesworth, A., & Lafond, S. (2017). Shifting from Undersupply to Oversupply: Does NHS Workforce Planning Need a Paradigm Shift?. Economic Affairs, 37(1), 36-52.
Buchan, J., Charlesworth, A., Gershlick, B., & Seccombe, I. (2017). Rising pressure: the NHS workforce challenge.
Nuffield Trust (2017). Creating a sustainable workforce: The long-term sustainability of the NHS.

PARTICIPANTS:
- all patients hospitalized in English NHS hospitals, from 2009/10 to 2021/22 for acute care, and from 2011/12 to 2021/22 for mental health care;
- the hospital consultants treating NHS inpatients for acute care and inpatients and outpatients in mental health (MH) care;
- the nurses working in NHS hospitals acute care wards and NHS mental health care hospitals (inpatients and outpatients wards);
- the NHS Ambulance Trusts workers.
The data requested from NHS Digital is related only to the patients admitted to acute and/or MH care, as well as the hospital consultant codes of the consultants treating the patients in hospitals.

For all data products requested, the full datasets are required, including all admissions to hospital care (& community care for MH patients). The data needed to deliver the project cannot be limited to a cohort of patients with a specific condition, procedure or age range and there are several reasons for this. To study both the determinant factors of hospital WFR and its effects on hospital patient outcomes, data is needed to:
1) Create variables to proxy healthcare demand pressure at provider-level (or at department-level within each provider). These variables depend on the sum of all admissions recorded, and not on a single cohort of patients.
2) Define clinically relevant and policy-relevant health outcomes like emergency readmissions to hospital following a previous hospital discharge, where the diagnosis for the readmission spell need not be the same as the diagnosis of the index admission spell. This requires having records for all the patients.
3) Define market concentration variables for non-emergency admissions, e.g. the HHI index. The Herfindahl-Hirschman Index (HHI) is a commonly accepted measure of market concentration: a measure of the size of firms in relation to the industry and an indicator of the amount of competition among them. Computation at provider level requires observing the spectrum of all non-emergency admissions in a given year or month for all providers in England, and so it requires the records of all elective patients in England.
4) Define clinically relevant case-mix variables to control for patient severity like the number of emergency admissions to hospital within a given period (e.g. one year, two years), where the diagnosis for the emergency admission can be of any type. This requires observing records for all the emergency patients.
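
To illustrate why the market concentration variable in point 3 needs the admissions of every provider, the sketch below computes the HHI as the sum of squared market shares. The function name, the provider labels and the counts are hypothetical; how a local market is delimited is not specified in this agreement and is left out of the example.

```python
def hhi(admissions_by_provider):
    """Herfindahl-Hirschman Index from admission counts per provider.

    Shares are expressed as fractions, so the index lies between 1/n
    (n equally sized providers) and 1 (a single provider).
    """
    total = sum(admissions_by_provider.values())
    if total <= 0:
        raise ValueError("total admissions must be positive")
    return sum((count / total) ** 2 for count in admissions_by_provider.values())


# Hypothetical non-emergency admissions in one market and year
market = {"Trust A": 500, "Trust B": 300, "Trust C": 200}
concentration = hhi(market)  # 0.5**2 + 0.3**2 + 0.2**2
```

Because each share is a provider's admissions divided by the total across all providers, leaving any provider or patient cohort out of the data would distort every share and therefore the index.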

The University of Surrey is the sole data controller and the sole data processor for this agreement.

There are co-investigators from University of Leeds and City University London involved in this project. The University of Leeds and City University London do not have access to or process NHS Digital data and will only contribute to the writing of the reports and papers. The co-investigators belonging to these organisations are only contributing intellectually and in the writing of reports/papers to the project; they are not involved in determining the means by which the data are being processed.

FUNDER
The Health Foundation is funding this project and their role is to ensure and facilitate the delivery of this project, but The Health Foundation has no data controlling or data processing roles within the project.

The Health Foundation is a major stakeholder in the project and it has funded this research project, along with several other projects from other institutions, under a funding call for their Efficiency Research Programme (https://www.health.org.uk/sites/default/files/ERP%202018%20Call%20for%20applications.pdf) which is targeted to investigate the under-researched themes of labour productivity and workforce retention in health and social care. As such, the remit of RQ1 and RQ2 of this project fall under the remit of the funding call issued by the Funder.

The Funder organises an Advisory meeting for the research projects which facilitates the circulation of ideas among researchers, their collaboration and so the development of the research project. The Funder is a well-known think-tank in the UK and has an extensive network of professionals that supports the public good and public health mission. As such, using its network, The Health Foundation is going to help the research team circulate the findings of the research, prior to and along with any other dissemination channel (e.g. peer-reviewed journals, conferences, etc.).

The Health Foundation has no control of the data that is released by NHS Digital.

The Health Foundation will have access to research outputs, aggregated with small numbers suppressed, in terms of graphs, tables and paper to be produced by the UoS research team, which will not be able to be published or used without the UoS research team’s explicit consent.

The Health Foundation will act as an additional dissemination channel, e.g. similarly to posting a working paper from the project on the Surrey project website.

Ethical Approval.
This project has received approval from a Research Ethics Committee in February 2020. It was approved subject to a condition that the lay summary provided be revised suitably to be more lay friendly. This was revised, and full ethical approval was granted on August 10th 2020.

Expected Benefits:

The NHS has faced substantial pressures over the past two decades, with services being overstretched due to a prolonged financial austerity period coupled with demand pressure from population growth and ageing. Despite a recent Governmental pledge to refinance the NHS, it remains clear that substantial efficiency savings are necessary. One area where efficiency gains could be achieved is NHS WFR, described by the Health Education England chief executive as “the biggest workforce challenge facing the NHS”.

The research dissemination is of public interest as a relevant part of the evidence found through this research will be translated into policy recommendations for healthcare policy makers, leaders, managers and workers on the best ways to improve hospital WFR, and through it also patients’ outcomes. One of the team members is in charge of the impact for the project (as well as the impact for the UoS School of Economics REF case studies); as an expert in the communication of research outcomes in layman’s terms, the team member is in charge of laying the policy recommendations out from the analysis reports and will assist the Principal Investigator in the communication with the healthcare policy makers, leaders, managers and workers.

Overall, the aim of the research and its dissemination are to uncover mechanisms and generate recommendations that can lead to possible efficiency gains in the hospital healthcare sector, and in the English NHS hospital healthcare system. It is possible that enhancing hospital WFR in the NHS can result in two types of efficiency gains: directly, through larger savings from reduced hiring of temporary staff; and indirectly, through the better utilization of skills, reduced human capital losses and better staff wellbeing. Thus, this research has the potential to improve the lives of both hospital workers and patients, and the working conditions of hospital workers.

The empirical analysis and the policy recommendations stemming from it will be the most substantial part of the research outputs (policy briefs, working papers, peer-reviewed publications) that the research team aims to disseminate within the scientific and healthcare academic communities, the healthcare policy makers and leaders communities, the healthcare professional community and the general public. The policy recommendations arising from the study will be included in policy briefs that will be circulated to healthcare policy makers, leaders and managers via the research team’s as well as the Funder’s research networks. Furthermore, the year 2023 launch event and the ongoing dissemination of the research through seminars, conferences, the research team’s networks and the project Steering Group and the funder’s network will increase the reach and impact of the research team’s work.

As a result of the project’s outputs, it is hoped that healthcare policy makers will:
- increase the monitoring of hospital WFR and the factors affecting it, in an effort to improve both WFR and those patients’ outcomes that this research shows to be positively affected by higher retention levels;
- possibly develop guidelines to improve the management and retention of hospital workforce, supported by the empirical evidence and by specific case studies that might stem from eventual follow-ups of this research, by the current or different research teams.

The impact of the project (including the research outputs and the possible improvements for the NHS) depends unambiguously on the findings of the study. It will be possible to ascertain who will realize the improvements in the management of the hospital workforce and whether and how these improvements can be achieved only when the project analysis is concluded (or at least ongoing at an advanced stage). The project research team will make sure to make the findings easily transferrable into improvements for the most suitable stakeholders including NHS England/Improvement, Care Quality Commission, Department for Health and Social Care and Clinical Commissioning Groups.

Currently, it is impossible to quantify the magnitude of the impact on patients’ outcomes. This will strongly depend on the number of emergency conditions and elective treatments that the research team are able to investigate during the funded period of the grant. The benefits of processing/dissemination will be achieved directly by the data controller and the funder, and indirectly by the project stakeholders and the general public.

The efficiency savings that can be achieved by implementing policies that improve hospital WFR will be the object of a cost-benefit analysis stemming from the project’s empirical research. This will lead to British Pound estimates of the monetary gains (or losses) that the NHS can achieve for, say, a 1% increase in the WFR of hospital nurses/consultants/ambulance workers.

The cost-benefit analysis will be completed by the end of the project, with its final formulation in the published versions of the study, which are expected to appear after the end of the project funding period (as soon as possible, and possibly within 3 years from 2023). However, the benefits for hospital workers and patients may happen at different times, before or after the end of this study, depending on: the relevance of the findings; the success of the dissemination; the appetite for the findings, their implications and the related recommendations from policy-makers, politicians and the general public.

The study will also support the research of at least one PhD student (from UoS) and a post-doctoral research fellow (University of Surrey). These junior researchers will both contribute to this project with their work and will be an active and fundamental part of the research team to achieve the status of co-authors of the study and its related published and unpublished outputs.

This is a research project whose aim is to investigate the economics of the hospital workforce retention, its determinants and its associations with patient outcomes, in order to provide policy-makers and hospital managers with recommendations that can improve both on the stability and engagement of hospital workers and on the quality of care perceived and received by hospital patients.

The outcomes of the project will be presented to and discussed with the project’s Steering Group and the Health Foundation’s advisory board committee. A number of experts, healthcare policy leaders and academics take part in both these committees; they will help ensure the rigour of the analysis, advise on how to improve it, and provide a network to disseminate the findings of the project. Based on the recommendations from both committees, the investigators will discuss and disseminate the results of the analysis to healthcare leaders and policy-makers. The results will be informative for the retention of the hospital workforce and for the way such retention may be correlated with the health and process outcomes (e.g. mortality, readmissions, waiting times) of patients admitted to English hospitals.

Outputs:

The study findings resulting from the data processing will contribute to the production of:
1. Reports to the Funder / working papers;
2. Submissions to peer reviewed journals;
3. Presentations to seminars and conferences / policy briefs reports;
4. Conferences.

The research outputs will never be reported at individual patient / worker level. The research outputs will always be reported as aggregate quantities, e.g. the average number of patients with condition X (e.g. heart attack) across NHS hospitals, by year. Categories with small numbers of observations will be suppressed / not reported.
All outputs that will be produced using the NHS Digital data will only contain aggregated results with small number suppression applied.

To disseminate the results of the research, the research team will:
- Draft at least two non-technical briefing papers (one on the factors affecting staff retention and the other on its consequences for patient welfare). The Investigators in the research team have considerable experience of presenting research to varied audiences.
- Hold a launch event at the end of the 4 years of funding, inviting the project’s key stakeholders and wider networks, including representatives of individual Trusts. This will be timed to coincide with the actions at the point below.
- Publication of non-technical papers on the project’s website and through Twitter. They will be accompanied by a blog and animation.
- The research team will make use of the University of Surrey media team to issue a press release on the research work. This media team assisted in writing and placing the article on the free entitlement in The Daily Telegraph [2]. The team will also seek opportunities to write for other blogs, both general outlets such as The Conversation and those addressed to more specialized audiences such as The Health Foundation’s own blog series. The research team is also willing to provide guidance to organisations, such as Trusts through NHS Improvement. One of the co-Is has written NICE guidelines, so he has experience of turning research results into a specific product.

Beyond the launch event and associated activities, the research team will seek to publish the project’s findings in high quality journals. This will establish the rigor of the research and ensure its lasting academic influence. Potential target journals are Journal of Human Resources (JHR), Economic Journal (EJ), Journal of Public Economics (JPubEcon), Journal of Health Economics (JHE), Journal of Economic Behaviour and Organization (JEBO), Health Economics (HE), Social Science and Medicine (SSM).

[1] Cookson, R. and Moscelli, G. (2018) Are Angioplasty Waiting Times Growing Again. Centre for Health Economics. https://www.york.ac.uk/media/che/documents/policybriefing/Angioplasty.pdf
[2] Blanden, J. (2016). X-Factor Over Evidence: The Failure of Early Years’ Education. The Daily Telegraph, 22nd October. https://www.telegraph.co.uk/education/educationopinion/11177381/X-Factor-over-evidence-the-failure-of-early-years-education.html

The overall communications objectives are to:
1. Engage key stakeholders with the project at its inception, enabling the research team members to understand their concerns and draw on their specialist knowledge to shape the research strategy.
2. Create awareness of the research project and expertise among a broad range of interested parties.
3. Gain valuable feedback from academics and stakeholders as the project results approach their final version.
4. Disseminate the research findings widely among stakeholders, health researchers and the wider academic community.
5. Present clear and relevant policy implications to both national and local decision-makers.

In the setup phase (during the first 2 years of the project) the research team will:
- Conduct a scoping exercise to ensure a full understanding of the key organisations (and individuals within them) interested in hospital WFR. This will help the research team ensure they are inviting exactly the right people to be on the project Steering Group. It will also grow the project’s wider network by subscribing to the right mailing lists and following the right Twitter feeds to be apprised of relevant events. The research team will also contact the most important individuals by email. [This activity has already taken place over the course of Summer and Fall 2019]
- Set up the project website at the University of Surrey, using as potential models the websites of previous research projects like Better for Less (https://www.surrey.ac.uk/better-for-less) and the Centre for Vocational Education Research (http://cver.lse.ac.uk/), which saw the involvement of one team-member of this project.
- Establish a Twitter account. The research team can share the blog through this channel as well as using it to provide brief comment and link the project to related activity.

These activities will enable the research team to engage interested parties at the start of the project, and will allow these parties to interact with the research team in the way that suits them best.

- Form and hold the first meeting of the project Steering Group. This enables the research team to form deeper connections with key stakeholders and gain valuable feedback about the proposed methodology. Members will encourage the research team to think of ways of addressing their concerns and are likely to be able to provide valuable information about institutions, policy detail and data. The first project Steering Group meeting already took place on 15/03/2019. During that meeting, the Investigators received valuable suggestions on how to set up the analysis and which data may be useful for the investigation.

After the initial set-up phase, the research team will sustain interest in the research without over-burdening its audience. In this phase the research team will:
- Continue to use Twitter, an especially helpful tool at this stage of the project. As the project networks are established, Twitter is an effective way of sharing the team’s growing involvement with stakeholders and establishing the research team as an authority on this topic.
- Update the website and add blogs if appropriate.
- Continue the meetings with the Steering Group. These will provide the research team with valuable feedback as the research findings begin to emerge and will enable the research team to discuss potential robustness checks and extensions. The Steering Group’s input will also be invaluable in helping to understand the results and their policy implications, particularly as the cost-benefit analysis part of the project is approached.
- Begin to present the emerging findings at seminars and conferences. Some of the intended events are specialized in the health field (Health Economists’ Study Group, European Health Econometrics workshop, European Health Economics Association conference) or aimed at policy makers (DHSC analytical lunchtime seminars) while others are more general (Royal Economics Society and International Association for Applied Econometrics conferences) and give scope for gaining wide academic feedback.
- The project’s research papers will be made available as Discussion Papers on the project’s website once they are complete. The work on the determinants of staff retention will be completed first, in accordance with the research agenda and scheduled plan.

With regard to access to journal articles, Green open access will be provided as the base case. Gold open access will be provided subject to the will of the funder to pay for the Gold open access charges. In any case, the publications’ pre-prints will always be freely available to the public. Gold open access is where an author publishes their article in an online open access journal. In contrast, green open access is where an author publishes their article in any journal and then self-archives a copy in a freely accessible institutional or specialist online archive known as a repository, or on a website.

Subject to any third party rights, the data and knowledge generated by the study will belong to the project partners (from UoS, University of Leeds and City University London), who will also:
- manage the data and knowledge produced;
- administer the access rights to the study and its results;
- together with The Health Foundation, arrange the possibility of making the publications of the study available as Open access. With respect to utilization rights, The Health Foundation has requested a license to use the work produced by the project partners for its public benefit purposes.

The Health Foundation is under an obligation to ensure that the outputs of the project are applied for the public good. Therefore, the funder has requested a license to use, for its public benefit purposes, the outputs generated by the Recipient under the Project. Subject to any third party rights, the project partners (i.e. the Principal Investigator and co-Investigators) have granted the funder a royalty-free, non-exclusive, world-wide license to use the outputs generated by the project partners under the project for its own charitable public benefit purposes. The funder, where reasonable, will discuss with the Principal Investigator prior to using the outputs for public benefit. All outputs shared with the Funder will be aggregated with small numbers suppressed.

Relevant Target Dates:
Submission to peer-review of at least one paper related to the first research question by December 2021;
Submission to peer-review of at least one paper related to the second research question by June 2023;
Organization of launch event of the project by June 2023;
Publication of at least 60% of the outcomes of the project by December 2023. [Please note that publications in Economics have a very long turnaround and so require several years to get peer-reviewed, revised and published. Publication of a paper in a 4-star peer-reviewed journal in Economics can also take several years, due to previous rejections and the time for the peer-reviewers to provide comments].

EU funding is not applicable.

Processing:

The following data products are requested:
1) HES Admitted Patient Care (HES APC): Financial Years 2009/10 to 2021/22 (13 years);
2) Patient Reported Outcome Measures (PROMs): Financial Years 2009/10 to 2021/22 (13 years);
3) Civil Registration - Deaths (CR-D): Financial Years 2009/10 to 2021/22 (13 years);
4) HES Critical Care (HES CC): Financial Years 2017/18 to 2021/22 (5 years);
5) HES Accidents and Emergencies (HES A&E): Financial Years 2009/10 to 2018/19 (10 years);
6) Emergency Care Data Set (ECDS): Financial Years 2018/19 to 2021/22 (4 years);
7) Mental Health Minimum DataSet (MHMDS), Mental Health Learning Disabilities (MHLDDS), Mental Health Services DataSet (MHSDS): Financial Years 2011/12 to 2021/22 (11 years).

The data flowing from NHS Digital to the UoS will be pseudonymised at patient level for all datasets. The hospital consultant code (GMC code of the hospital consultant in charge of the patient; consult variable) in HES APC data (and possibly also HES A&E, ECDS and MHMDS/MHLDDS/MHSDS) is needed to link the Hospital administrative datasets from NHSD to the Electronic Staff Records data.
The GMC consultant code is needed because that is the only way that the project members can link HES and MH data at consultant level to the ESR data provided by DHSC. A pseudonymised GMC code would not find a match in the ESR data provided by DHSC. The GMC consultant code is replaced with a study ID key once the linkage has taken place.
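
The linkage step described above (match on the consultant code, then replace the code with a study ID key) can be sketched as follows. This is a minimal illustration only: the field names, the example codes and the use of a random hexadecimal token as the study ID are assumptions, not details of the UoS pipeline.

```python
import secrets


def link_and_replace(hes_rows, esr_rows):
    """Join HES rows to ESR rows on the consultant code, then drop the code.

    hes_rows / esr_rows: lists of dicts that each carry a
    "consultant_code" key (a hypothetical field name).
    Returns linked rows carrying a random "study_id" in place of the code.
    """
    esr_by_code = {row["consultant_code"]: row for row in esr_rows}
    study_ids = {}  # consultant code -> study ID; not kept after linkage
    linked = []
    for hes in hes_rows:
        code = hes["consultant_code"]
        esr = esr_by_code.get(code)
        if esr is None:
            continue  # no ESR match for this consultant code
        if code not in study_ids:
            study_ids[code] = secrets.token_hex(8)
        merged = {**esr, **hes, "study_id": study_ids[code]}
        del merged["consultant_code"]  # the code is removed once linked
        linked.append(merged)
    return linked


# Made-up records for illustration
hes = [
    {"consultant_code": "C0000001", "episode": 1},
    {"consultant_code": "C0000001", "episode": 2},
    {"consultant_code": "C0000002", "episode": 3},
]
esr = [{"consultant_code": "C0000001", "staff_group": "consultant"}]
linked = link_and_replace(hes, esr)
```

The same consultant receives the same study ID across all linked rows, so group-level analysis remains possible after the original code has been dropped.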

UoS has access to ESR data, supplied by the Department of Health and Social Care (DHSC). UoS will link this ESR data to NHS Digital data at UoS only. The other two organisations participating in this project, The University of Leeds and City University London, will not have access to either NHS Digital data or ESR data. ESR and NHS Digital data will be linked in two ways. The first linkage will be by period (e.g. year) and organisation (i.e. Trust XXX) and the data for such linkage will be accessible and processed by substantive employees and research students at UoS. The second linkage will be by consultant code-period-organisation and the data for such linkage will be accessed and processed only by substantive employees at University of Surrey and not by research students at UoS.

To mitigate any risk of re-identification, the identity of hospital consultants from the GMC register will never be included in the secure IT system project folders. Results will also never be reported at individual patient or worker level, or when the aggregated number of workers or patients is less than 10 observations within a hospital-time period interval.
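
The reporting rule above can be sketched as a simple check applied to aggregated counts before release. The function name and the cell layout (organisation, year) are hypothetical; only the threshold of 10 observations comes from this agreement.

```python
SUPPRESSION_THRESHOLD = 10  # counts below this value are not reported


def suppress_small_cells(counts, threshold=SUPPRESSION_THRESHOLD):
    """Return a copy of `counts` with small cells replaced by None.

    counts: dict mapping a (hospital, period) cell to a number of
    patients or workers. Cells with fewer than `threshold`
    observations are suppressed.
    """
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}


# Made-up aggregated counts for illustration
cells = {("Trust A", 2019): 42, ("Trust B", 2019): 7, ("Trust C", 2019): 10}
reported = suppress_small_cells(cells)
```

This covers primary suppression only. Where totals are published alongside the cells, further suppression may be needed so that a hidden value cannot be recovered by subtraction.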

Other data linkages will happen only at aggregate level, e.g. hospital location using Organisation Data Service (ODS) data postcodes; average wages of workers living in a given hospital catchment area, extracted from Labour Force Survey (LFS) and Annual Survey of Hours and Earnings (ASHE) data (collected by the UK Office for National Statistics and accessed through the UK Data Archive).

Results will never be reported at individual patient or worker level, nor when the aggregated number of workers or patients is fewer than 10 observations within a hospital-time period interval. As above, there will never be any attempt to re-identify individuals, whether patients or hospital consultants.
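
The small-number rule above can be illustrated with a minimal Python sketch. The record layout (`hospital`, `period`) and the choice to report suppressed cells as `None` are assumptions made for this example; only the threshold of 10 comes from the text.

```python
from collections import Counter

THRESHOLD = 10  # minimum cell size stated in the text


def suppressed_counts(records, threshold=THRESHOLD):
    """Count records per (hospital, period) cell and suppress small cells.

    Cells with fewer than `threshold` observations are returned as None
    instead of their count (an assumed convention for this sketch).
    """
    counts = Counter((r["hospital"], r["period"]) for r in records)
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}
```

A cell with 12 observations would be reported as 12, while a cell with 9 observations would be reported as `None`.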

Data will only be accessed and processed by either substantive employees of The UoS or PhD students based at the UoS who are involved as honorary research fellows with a written contract defining their role and duties. The total number of PhD students working on the data will never be higher than 5 students, and their research work will have to fall within the remit of the research purpose stated in this application.

The data will not be accessed or processed by any other third parties not mentioned in this agreement.

There will be no attempts made by any of the research project team members to re-identify individuals involved in this project as there is no requirement to do so. The two co-investigators based at University of Leeds and City University will not be processing any NHS Digital data, and so they will only contribute intellectually to the project.

The data will be accessed through a remote access secure environment, whose technical details are provided below.

The data are accessed by any member of the research team involved in data processing activities through a secure IT system, called “Surrey Secure Network (SSN)”. Security of the “Surrey Secure Network” is consistent with the framework of University of Surrey’s Information Security Policy, available here - http://www.surrey.ac.uk/about/corporate/policies/information_security_policy.htm

Data will be processed using:
- Virtual Desktop sessions that are only connected to the secure network;
- Laptops whose local disks are encrypted using CESG approved standards.

Virtual desktops are used for remote working. All the analysis is always run on virtual desktops, regardless of whether the applicants are working from home or the office. The NHSD data is hosted by a secure server to which the applicants have no physical access. The secure server is encrypted and it is not possible to copy and paste data from the screen when using the virtual desktops.

Patient level data is only accessible on the Surrey Secure Network. Research data is retained for 10 years after the completion of the study in accordance with University policy on research data management. The System shall be risk assessed every 12 months, which includes an annual infrastructure penetration test. The UoS uses the ITIL (Information Technology Infrastructure Library) Risk Management framework for its IT policies and management.

Services are backed-up and data replicated between 2 data centres on campus. These are geographically spaced to provide cover for disaster purposes to ensure a copy of the data can be recovered. All systems within the University are bound by the University’s Information Security Policy (http://www.surrey.ac.uk/about/corporate/policies/information_security_policy.pdf).

In the first stage (RQ1), the data provided by NHSD will:
- be linked to Electronic Staff Record (ESR) data over the years;
- be aggregated at hospital level by subperiods (e.g. monthly) to create variables that control for time-varying demand and supply factors at hospital level;
- be used in statistical models to investigate the association of demand and supply factors with hospital WFR in the English NHS (at the mean or over the outcome distribution).

In the second stage (RQ2), the data provided by NHSD will:
- be used to produce hospital quality (e.g. mortality, readmissions, PROM gains) and hospital process (e.g. length of stay, waiting times) indicators at patient level [Outcome Variables];
- be used to create variables that control for patients’ characteristics (e.g. age, gender, ethnicity), patients’ pathways (e.g. hospital or GP of treatment) or provider characteristics (e.g. NHS or Independent Sector hospital) [Control Variables];
- be used in statistical models to investigate the association of hospital WFR with patients’ Outcomes in the English NHS (at the mean or over the outcome distribution), controlling for the demand and supply determinants of healthcare.

UoS will not flow any data to NHS Digital.

The data flows out of NHS Digital will consist of 3 annual data disseminations. The first data flow will deliver the latest dataset release together with the historical datasets. The two remaining data drops will deliver only the latest dataset release.

There will be no data linkage undertaken with NHS Digital data provided under this agreement that is not already noted in the agreement.

• Electronic Staff Record (ESR) data
The UoS has access to ESR data, supplied by the Department of Health and Social Care (DHSC). UoS will link this ESR data to NHS Digital data at UoS only. Substantive employees at University of Surrey are the only individuals able to access and link NHS Digital data to ESR data. No other organisations, including The University of Leeds and City University London have access to NHS Digital data. ESR and NHS Digital data will be linked via consultant code only.

DATA MINIMISATION
Pseudonymised data are sufficient, as there is no need to identify any patient. Patient level data is needed because most of the outcome variables, and the effects of interest on such variables, will be measured at patient level. This avoids the risk of ecological fallacy (i.e. aggregation bias) in the results, which arises when grouped (i.e. aggregated) data are used.
If the outcome of interest is at patient level, aggregating data at hospital level may hide important patterns. For example, the mortality of hospital X for heart attack is found to be 10% of emergency admissions; however, the aggregate figure at hospital level may hide that the hospital mortality is very different by gender (e.g. 5% for male patients; 15% for female patients) or by comorbidities (7% for non-diabetic patients; 13% for diabetic patients). Hence, using the lowest level of data aggregation (i.e. patient level data) makes it possible to uncover patterns that may be related to the relationships of interest and to features of healthcare delivery as it varies with patients' characteristics or the interaction of patient and organization characteristics.
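
The arithmetic behind the gender example can be checked with a short Python sketch. The admission counts (100 male and 100 female patients) and the field names are hypothetical and chosen only so that the rates match the 5%, 15% and 10% figures quoted above.

```python
def mortality_rate(records):
    """Share of admissions in `records` that ended in death."""
    deaths = sum(1 for r in records if r["died"])
    return deaths / len(records)


# Hypothetical hospital X: 100 male admissions with 5 deaths,
# 100 female admissions with 15 deaths.
admissions = (
    [{"sex": "M", "died": i < 5} for i in range(100)]
    + [{"sex": "F", "died": i < 15} for i in range(100)]
)

# Hospital-level (aggregate) rate: 20 deaths out of 200 admissions.
overall = mortality_rate(admissions)

# Patient-level data allows the same rate to be split by subgroup.
by_sex = {
    sex: mortality_rate([r for r in admissions if r["sex"] == sex])
    for sex in ("M", "F")
}
```

With these assumed counts the aggregate rate is 10%, while the subgroup rates are 5% and 15%; the hospital-level figure alone would not reveal the difference.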

The data cannot be narrowed by geography; the project's delivery needs detailed patient level data and aims to explore the heterogeneity of the effects/associations of interest by different geographies of England. Moreover, geographic variation can be a confounder in the analysis that needs to be controlled for.

The data cannot be narrowed by demographics; the project aims to explore the heterogeneity of the effects/associations of interest by different demographic characteristics of the patients, and, more importantly, such characteristics are possible confounders that need to be controlled for.

The data cannot be narrowed by diagnoses and procedures; the project aims to explore the heterogeneity of the effects/associations of interest by different diagnoses and procedures of the patients, and there is also a need to know the total number of patients admitted to hospital on any given day in order to compute measures of demand pressure for hospitals. For the latter reason, the creation of a HES cohort would not be sufficient for the delivery of the project.

All fields requested are necessary for the project, as the analysis might otherwise fail to consider important mechanisms or confounders, and so provide wrong recommendations to healthcare leaders and policymakers. It would in principle be possible to replace date of death with mortality flags; however, mortality flags at many time intervals (7, 14, 30, 60, 90 days, 6 months, 1 year, 2, 3, 5 years) and with respect to both date of admission and date of discharge would be needed, which would increase the sizes of the HES extracts.
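
To show why a single date of death is more compact than a set of flags, the following minimal Python sketch derives several mortality flags from one date. The function name, the flag labels and the restriction to day-based windows are assumptions made for this example; the month and year windows listed above are omitted for brevity.

```python
from datetime import date

# Day-based windows taken from the list above; labels are hypothetical.
WINDOWS_DAYS = {"7d": 7, "14d": 14, "30d": 30, "60d": 60, "90d": 90}


def mortality_flags(reference_date, date_of_death, windows=WINDOWS_DAYS):
    """Flag whether death occurred within N days on or after the reference date.

    `reference_date` can be the admission date or the discharge date;
    `date_of_death` is None for patients with no recorded death.
    """
    if date_of_death is None:
        return {label: False for label in windows}
    days = (date_of_death - reference_date).days
    return {label: 0 <= days <= n for label, n in windows.items()}


# Hypothetical patient: admitted 1 Jan 2020, discharged 10 Jan, died 21 Jan.
from_admission = mortality_flags(date(2020, 1, 1), date(2020, 1, 21))
from_discharge = mortality_flags(date(2020, 1, 10), date(2020, 1, 21))
```

One date of death yields every flag for either reference date, whereas supplying the flags directly would require one field per window and per reference date.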

Access to the Civil Registration Deaths records, including the full date of death and the reason for death, is therefore preferable, also for purposes of cross-validation with the hospital records. The consultant code is needed to allow linkage of HES to ESR and to evaluate the effect of time to leave a given hospital on the patients' health outcomes. The request for this variable is motivated by the interest in the average effect of consultants’ time-to-leave a hospital, and not by any interest in the identification of specific consultants and their performances. The postcode outward code and the postcode sector are needed to compute distance measures from patient's residence to GP location and Hospital site location that are more precise than those based on LSOA of patient residence. Given that these variables do not constitute the full postcode, this will prevent patient identifiability.

The analysis is at national level, so it will need to cover all England. Moreover, the project plans to investigate regional variations in the associations or effects of interest.

• HES Admitted Patient Care (HES APC)
HES APC is necessary to investigate the associations of hospital WFR and emergency care patients’ outcomes. HES APC will provide the bulk of acute care data for quality indicators, patients' characteristics, patients’ pathways, and patients' health outcomes that will be used in the research project in relation to acute emergency care.

HES APC years from 2009/10 to 2021/22 are requested to exploit the variation due to several policies introduced over the last decade. Some of these policies date from the start of the decade (e.g. the 2012 abolition of PCTs and the creation of CCGs in 2013); others are more recent (e.g. the 2016 new junior doctor contract; the 2018 NHS Improvement Retention Support Programme; the 2018 hospital staff’s new pay/progression contract). In any longitudinal analysis (e.g. before-after, difference-in-differences, interrupted time series), some years before and some years after the policy change are needed to assess the effect of a given event or policy.

All patients' episodes are required because a multi-episode spell reports all the information regarding the patient pathway from admission to discharge or death. All elective episodes are required since the number of elective patients treated is in itself an outcome variable and different episodes may contain different information that must be included in the analysis. Maternity episodes are required because unborn children and neonatal records will be used as outcome variables for maternity wards, and the retention of midwives is part of this study. The timeframe around the index event (e.g. procedure or diagnosis) is required because waiting times and length of stay (both post- and pre-operative) are some of the target outcome variables in the analysis.

For reasons spelled out above, the full HES Admitted Patient Care is needed, with bridge files to Civil Registration Deaths, Mental Health Services (or Minimum) data, HES A&E / ECDS, HES CC and Patient Reported Outcome Measures. There are no alternatives or less intrusive ways of achieving the purpose. Aggregate data or semi-aggregate data would not serve the project purposes, as they would carry the risk of aggregation bias and could mask specific patients’ pathways or characteristics that need to be accounted for in order to estimate the correct effects of interest for the investigation.

• HES Critical Care (HES CC)
HES CC is necessary to investigate the associations of hospital WFR and emergency care patients’ outcomes related to patients admitted to Critical Care departments. This will allow the research team to also investigate the associations (or effects) of hospital WFR with health outcomes for COVID-19 patients. HES CC years from 2017/18 to 2021/22 are requested to investigate the performance of CC departments before and after the 2020 COVID-19 crisis.

• HES Accident and Emergency (HES A&E)
HES A&E will provide the bulk of data for quality indicators, patients' characteristics, patients’ pathways, and patients' health outcomes related to ambulance and emergency care. Financial years from 2009/10 to 2018/19 for HES A&E are needed to exploit the variation due to several policies happening in the last decade.

• Civil Registration (Deaths) - Secondary Care Cut
Civil Registration (Deaths) will provide data for out-of-hospital mortality after discharge, which is one of the main quality indicators in acute healthcare. The bridge file will allow the Deaths file to be linked to HES APC, HES A&E, ECDS and MHMDS/MHLDDS/MHSDS. Data for financial years from 2009/10 to 2021/22 are needed to exploit the variation due to several policies happening in the last decade.

The Original Underlying Cause of Death variable is needed to double-check the medical reason the patient has died. Variables for neonatal mortality are needed to create indicators of care for new-borns.

Subsequent Activity and Match Rank variables are needed for data quality checks. Access to the Civil Registration Deaths records, including the full date of death and the reason for death, is preferable for purposes of cross-validation with the hospital records.

• Patient Reported Outcome Measures (PROMS; Linkable to HES).
PROMS will provide important quality indicators, patients' characteristics and patients' health outcomes that will be used in relation to elective care, e.g. health gains (or losses) in terms of a change in the patient Oxford Hip/Knee Score.

Data for financial years from 2009/10 to 2021/22 are needed in order to exploit the variation due to several policies happening in the last decade.

• Mental Health Minimum Data Set (Linkable to HES).
• Mental Health and Learning Disabilities Data Set (Linkable to HES).
• Mental Health Services Data Set (Linkable to HES) [packages: MH Community: 1d + add-on package 4 (currencies) and MH Inpatients: 2a + add-on package 3 (patients info) + package 4 (currencies; i.e. MHS 801-803)].
• Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set (and other MH datasets).

The datasets will provide important quality indicators, patients' characteristics, and patients' health outcomes that will be used in the research project in relation to mental health care.

Data for financial years from 2011/12 to 2021/22 are needed to exploit the variation due to several policies happening in the last decade. Data prior to 2011/12 are not requested, given both funding constraints and the availability of a less precise dataset.

For all MH datasets, data on both community care and hospital care are needed since only a fraction (about 10%) of MH patients are hospitalized, while the other patients are treated in community services, where both MH doctors and MH nurses operate and may act as a replacement or substitute for MH hospital services. Failing to control for this alternative channel at local area level may imply a bias in the results of the analysis.

• ECDS
The ECDS dataset is necessary, as it will provide the data for quality indicators, patients' characteristics, patients' pathways, and patients' health outcomes that will be used in the research project in relation to patients admitted to emergency care departments.

Data for financial years from 2018/19 to 2021/22 are needed in order to exploit longitudinal variation. Data prior to 2018/19 are not requested, given both funding constraints and concerns related to data quality and completeness. Data for year 2018/19 are requested for both ECDS and HES A&E for two reasons. First, in ECDS the 'Token_ID' needed to link to HES datasets such as HES APC or HES CC is not yet available, and the project needs to evaluate the associations of Hospital WFR with emergency care outcomes along the patient pathway using the most complete information (e.g. for the evaluation of the effect of retention exploiting the shock due to the Brexit referendum). Second, in order to evaluate the associations of Hospital WFR with emergency care outcomes pre and post COVID-19, at least about two financial years are needed before and after the COVID-19 outbreak, which motivates the request also for year 2018/19 of the ECDS.

In order to investigate the effect of hospital workers' proximity to leaving an NHS organization on Emergency care, the 'ProfessionalRegistrationIssuerCode' variable is requested in order to try and link it, for hospital consultants only, to the ESR data through the GMC code.

The volume of data in terms of years is needed for several reasons:
1. In order to control for unobservable but time-invariant factors, the project in most cases uses organization (e.g. Trust) fixed-effects. The estimation of longitudinal models with fixed-effects requires several yearly data points to ensure that the fixed-effects estimates are consistent. The estimates of interest coming from these models are likely to be incorrect without a sufficient number of time points; in the case of this project, the points are years of data. This is even more true in the presence of breaks due to policy changes over time (see next point below), which would imply a few years before the policy and a few years after it (e.g. two subperiods of 5 years each).
For some datasets like HES APC, this approach can be used as described above, as the dataset does not suffer from structural changes or discontinuities over time and retains a similar structure. For other datasets, like the Mental Health datasets, HES Critical Care and A&E/ECDS, this is not possible, since either the dataset does not have 10 years available because it did not exist 10 years ago (e.g. HES CC), or the dataset has been discontinued or changed over the years (e.g. MH data with new formats/variables). With the datasets that have only a few years of data, the project team will either use longitudinal methods while acknowledging the limitations due to having fewer data points, or focus on cross-sectional variations of health outcomes for different organizations within each of the years of data requested.
Finally, given the budget restrictions on the project, the years of data requested have been kept to the strict minimum needed to deliver a valid analysis to the Funder and the stakeholders of the project (including the general public).

2. The project needs to exploit several policies and events that act as 'exogenous shifters', i.e. events or policies that will have an association with the health outcomes only because of the impact they might have had on NHS workforce retention. Such events or policies might have contributed in different ways to define the patterns of hospital workforce retention. Some of these policies are:
- the 2012 doctor revalidation policy;
- the 2012 introduction of CCGs;
- the 2016 Brexit Referendum;
- the 2017/19 junior doctors contract reforms;
- the 2018 NHS Improvement Workforce Retention Program;
- the 2020 EU withdrawal.
The analysis of the impact of each policy or event requires several years before and after the time when the policy/event was introduced, in order to estimate effects that are correct and plausible.

3. Moreover, when estimating dynamic models (which are needed to estimate how changes in some variables lead to changes in the outcome of interest), some variables need to be lagged. Compared with a model using only contemporaneous data (existing at or occurring in the same period of time), a model including lagged data requires even more yearly data points. For example, to evaluate the effect of workforce retention in 2008 on patient waiting times in 2009, data are needed for both 2008 and 2009, not just 2009. If the suspected time dependence is longer, more time lags are required. For some statistical models 4 or more years of lags are required for the estimation to be correct, and this further motivates the request for a larger number of years for some of the datasets such as HES APC or HES A&E.
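
The cost of lagging in terms of usable years can be illustrated with a minimal Python sketch. The panel layout (`org`, `year`, `retention`, `waiting_time`), the values and the helper name `add_lag` are hypothetical and are not taken from the project's actual models.

```python
def add_lag(panel, variable, lags=1):
    """Attach the value of `variable` from `lags` years earlier to each row.

    `panel` is a list of dicts with keys "org", "year" and `variable`.
    Rows with no earlier observation for the same organization are dropped,
    which is why each extra lag costs one year of usable data.
    """
    lookup = {(r["org"], r["year"]): r[variable] for r in panel}
    lagged = []
    for r in panel:
        key = (r["org"], r["year"] - lags)
        if key in lookup:
            lagged.append({**r, f"{variable}_lag{lags}": lookup[key]})
    return lagged


# Hypothetical Trust observed for three years.
panel = [
    {"org": "T1", "year": 2008, "retention": 0.90, "waiting_time": 40},
    {"org": "T1", "year": 2009, "retention": 0.88, "waiting_time": 42},
    {"org": "T1", "year": 2010, "retention": 0.85, "waiting_time": 45},
]
one_lag = add_lag(panel, "retention", lags=1)   # usable years: 2009, 2010
two_lags = add_lag(panel, "retention", lags=2)  # usable year: 2010 only
```

In this sketch three years of data leave two usable observations with one lag and only one with two lags, which is the reason a model with longer lags needs a longer run of years.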
The large volume in terms of number of fields is required as well for several reasons.

4. GEOGRAPHIES. The data cannot be narrowed by geography, as the study will explore the heterogeneity of the effects/associations of interest by different geographies of England; moreover, geographic variation can be a confounder that needs to be controlled for in the analysis.
5. DEMOGRAPHICS (age, gender, ethnicity, socioeconomic indicators like Index of Multiple Deprivation). The data cannot be narrowed by demographics, as the study needs to explore the heterogeneity of the effects/associations of interest by different demographic characteristics of the patients, and more importantly because such characteristics are possible confounders that must be controlled for in the analysis.
6. DIAGNOSES and PROCEDURES. The data cannot be narrowed by diagnoses and procedures, as the study requires these fields to: i) compute comorbidity indices; ii) compute different indicators of hospital quality, which vary either by diagnosis, procedure or clinical specialty; iii) compute different indicators of hospital demand pressure depending on all admissions (i.e. for any reason/diagnosis/procedure) to a hospital; iv) investigate the heterogeneity of the effects/associations of interest by different diagnoses and procedures of the patients; v) compute indicators of competition, which are based on all admissions to a hospital (i.e. for any reason/diagnosis/procedure); vi) compute unplanned readmissions after hospital discharge as a widely used quality measure, in which the index spell and the readmission spell are not necessarily due to the same diagnosis or procedure.
7. EPISODES (including: dates of admission and discharge; durations; type, i.e. emergency or not; admission/discharge to/from home or not). All the patients' episodes are required in order to construct hospital spells, which may be made of multiple episodes and include precious information regarding the patient pathway from hospital admission to discharge or death.
8. TRUST-LEVEL AND CCG-LEVEL. These fields are necessary for the project, as they can act as important mechanisms or confounders that we need to control for to provide the right recommendations to healthcare leaders and policy-makers.
9. MSOA, LSOA, and POSTCODE information (i.e. postcode outward code and the postcode sector). These variables are needed to compute distance measures from patient's residence to GP location and Hospital site location at different levels of precision.
10. GP-PRACTICE IDENTIFIER. This field is necessary for two purposes:
a) to control for the quality of primary care for the patients, e.g. given by the number of ambulatory care sensitive conditions (derived from HES APC) for patients admitted to hospital but treated by the same GP practice;
b) as a geographical/organizational factor, in order to control for the number of elective patients referred by a given GP practice to different hospitals (e.g. to check the GP-hospital market concentration of patients choosing the hospital for elective care).

All organisations party to this agreement must comply with the data sharing framework contract requirements, including those regarding the use (and purposes of that use) by “personnel” (as defined within the data sharing framework contract i.e. employees, agents and contractors of the data recipient who may have access to that data).


Secondary data linked to the Royal College of General Practitioners (RCGP) Research and Surveillance Centre's (RSC) primary care sentinel data for the purposes of infectious and respiratory diseases surveillance in England — DARS-NIC-21083-B6C5J

Opt outs honoured: Yes - patient objections upheld, Identifiable, Anonymised - ICO Code Compliant, Yes, No (Statutory exemption to flow confidential data without consent)

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-03-12 — 2023-03-11 2020.05 — 2022.04.

Access method: Ongoing, One-Off

Data-controller type: PUBLIC HEALTH ENGLAND (PHE), ROYAL COLLEGE OF GENERAL PRACTITIONERS, UNIVERSITY OF SURREY

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Accident and Emergency
  2. Hospital Episode Statistics Outpatients
  3. Hospital Episode Statistics Critical Care
  4. Hospital Episode Statistics Admitted Patient Care
  5. HES:Civil Registration (Deaths) bridge
  6. Civil Registration - Deaths
  7. Civil Registration (Deaths) - Secondary Care Cut
  8. HES-ID to MPS-ID HES Accident and Emergency
  9. HES-ID to MPS-ID HES Admitted Patient Care
  10. HES-ID to MPS-ID HES Outpatients

Objectives:

Public Health England (PHE) holds a contract with the Royal College of General Practitioners (RCGP), who in turn hold a contract with the University of Surrey to deliver information to support surveillance and monitoring of vaccine efficacy on Influenza.

PHE, RCGP and University of Surrey are Joint Data Controllers for this request. They require HES and Civil Registration Data (CRD) to look at the outcomes of care, including death, to support surveillance and monitoring of vaccine efficacy on Influenza. Most important health outcomes happen in hospital, which is also where the bulk of health care costs are incurred. The focus of the work will be the impact of influenza and other infections on health and the benefit-risk of influenza and other vaccinations. The Royal College of General Practitioners (RCGP) Research and Surveillance Centre (RSC) is based at the University of Surrey.

The University of Surrey will have access to the record level data supplied by NHS Digital under this agreement. The University of Surrey will be the only organisation who accesses and processes the data disseminated under this agreement.

The GDPR lawful bases for processing the requested data under this agreement are:

Public Health England;
Article 6(1)(e) (Public Task processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller) and Article 9(2)(h) (processing is necessary for the purposes of preventive or occupational medicine, for the assessment of the working capacity of the employee, medical diagnosis, the provision of health or social care or treatment or the management of health or social care systems and services) and Article 9(2)(i) (processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices)
PHE exist to protect and improve the nation's health and wellbeing, and reduce health inequalities.

RCGP;
Article 6(1)(f) (processing is necessary for the purposes of the legitimate interests pursued by a controller, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child; this shall not apply to processing carried out by public authorities in the performance of their tasks) and Article 9(2)(i) (processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices)

University of Surrey;
Article 6(1)(e) (Public Task processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller) and Article 9(2)(i) (processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices).

Additionally, the request for data is supported by PHE, which, as an emanation of the Secretary of State for Health and Social Care, has the authority both to self-approve the use of Regulation 3 and to grant this approval to third parties processing confidential patient information without consent for purposes that fall under the scope of Regulation 3.

This authority has been in existence since PHE was established in 2013, although the large majority of the Regulation 3 approvals granted since that date have been internal to PHE; only a very small number have been granted by PHE to third parties. Specifically, the work being undertaken under Reg 3 in this application is limited to Communicable Disease surveillance and other risks to public health.

This secondary care data being requested will be linked at individual level to the Royal College of General Practitioners (RCGP) Research and Surveillance Centre's (RSC) primary care sentinel data for the purposes of infectious and respiratory diseases surveillance in England. These purposes include feeding back to member practices about their quality of care through a practice dashboard. The key objectives of the work are to: (1) Monitor influenza; (2) Analyse influenza vaccine effectiveness; (3) Understand and predict the impact of influenza and other winter infections on health service utilisation (e.g. older people with co-morbid illness may be more likely to be admitted to hospital). Primary care/general practice data (which is already held) is rich in terms of diagnosis and information about the process of care. However, HES and CRD data provide key information about the outcomes of care (A&E use, hospitalisation and death data).

The University of Surrey have an established sentinel GP influenza surveillance scheme in over 270 practices across England that monitors Influenza-like-illness, with a subset of practices taking virology swabs in order to confirm infection virologically. The University of Surrey have a great deal of experience in using health related data to monitor infectious illnesses. Accessing HES and CRD data will allow the University of Surrey to expand their knowledge about the impact of infectious diseases further; this will be both at the individual patient risk level and in looking at how the University of Surrey could better predict winter pressures on the NHS to support PHE and RCGP.

Public Health England (PHE) is involved in this programme of surveillance and quality improvement. PHE is a large organisation whose main aim is to protect and improve the nation’s health and reduce inequalities.

The RCGP RSC and PHE have worked together for over 50 years to monitor the progression of infectious illnesses in order to put any action plans in place if needed. PHE are funding this surveillance and quality improvement being undertaken through this agreement.

Individual patient level data is required because this allows much more precise statistical analyses to be made, compared with just comparing aggregate data.


The main aim of this project is to build a robust database and reporting system using up-to-date primary and secondary care data at the individual patient level, which can be easily queried; and has the likely variables required for PHE reports outlined in the specific outputs section. The database will contain the following variables for each patient (where present):
• Influenza-like-illness appointments: including information on whether or not a virology swab was taken and the outcome of the swab
• Data for the other 32 conditions monitored by University of Surrey as contracted by RCGP RSC on behalf of PHE
• To provide national surveillance data about an outbreak or pandemic that was not predicted
• Vaccination status: date of vaccination, type of vaccination
• Co morbid conditions
• Medication which may be associated with better or adverse outcomes.
• A & E visits
• Inpatient appointments, including critical care
• Outpatient appointments
• Mortality data (if applicable).

The database will be used to answer the many associated questions exclusively related to surveillance and monitoring of vaccine efficacy against influenza. For example, gaining access to HES and CRD data means that the University of Surrey can clearly see the rates of patients who access health care because of influenza related conditions. This will enable the University of Surrey to assess the pressure that is put on the healthcare system during influenza seasons, and devise and test measures to prevent this. Another example relates to comorbidities of disease: reducing the rates of influenza nationwide is of public health interest as influenza can be particularly dangerous for those in high risk groups. HES and CRD data will be used to identify the incidence of flu in those with certain conditions, such as pregnancy or diabetes. This will enable the University of Surrey to identify whether certain conditions are associated with an increased risk of catching influenza, and may lead to individuals with certain conditions being offered vaccinations in future influenza seasons. A further example relates to vaccine effectiveness. The RCGP RSC system is also used to monitor the effectiveness of influenza vaccine on behalf of PHE each season. PHE make decisions about England’s vaccination programme, and the data the RCGP RSC provides to PHE informs their decisions on future influenza vaccinations. The data provided under this agreement will be used to see whether people with certain conditions who are vaccinated are less likely to use hospital services than those who have not been vaccinated. This will provide further information on vaccine effectiveness in individuals with certain conditions.

The data will be used to support University of Surrey, RCGP and PHE in understanding more about the primary and secondary care data at a patient level for the following conditions;
URTI – Upper respiratory infections
LRTI – Lower respiratory infections (pneumonia and acute bronchitis)
Asthma and COPD

These peak as flu circulates, and not all flu is diagnosed as flu; therefore, looking at these conditions will support the overall influenza programme.

Both CRD and HES data will be required:
• HES: Critical Care
• HES: Outpatients
• HES: A&E
• HES: Admitted patient care
• CRD (mortality) data


Since the outbreak of COVID-19 in Wuhan, China, the surveillance programme have been working closely with and under instruction from Public Health England (PHE) and other national bodies to closely monitor and make plans to deal with any situation that may develop in the UK. A vital part of that will be to monitor the number of suspected COVID-19 cases in the community in a timely way. PHE has commissioned the RCGP Research Surveillance Centre to incorporate monitoring of COVID-19 into its virology surveillance scheme.

RCGP RSC and PHE will be extending the surveillance to include COVID-19.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Expected Benefits:

The surveillance work conducted by the RCGP RSC on behalf of the Data Controllers is used by Department of Health, NHS England and PHE to monitor trends in a number of infectious conditions. Specifically for influenza, seasonal epidemics are carefully followed, in order to deploy necessary measures as needed to limit the impact on the population. The trends in other conditions inform the development of vaccination programmes or other public health measures. Linking with HES and CRD data can help assess the severity and mortality of a given condition, thereby alerting PHE on whether larger measures should be implemented. This could lead to improved healthcare and reduced mortality of certain conditions. Additionally, the link with the HES and CRD data allows the University of Surrey to identify whether a particular flu season is creating additional pressures, which means that plans can be put in place in order to prevent or deal with these pressures next season.

Specific benefits
The benefits include improved knowledge of the pressures of certain conditions during the winter period, reduced mortality from influenza, improved vaccine effectiveness and a health system that is more prepared in the event of an influenza outbreak.

Magnitude of benefits
It is expected that these benefits will be nationwide across England, to both patients and staff working in the health care system.

Sequence of events needed to take place in order for benefits to be achieved:
1. Pseudonymised matching is carried out, and then HES and CRD mortality data are linked with the data the University of Surrey hold at the RSC
2. The University of Surrey analyse the data and identify trends in rates of illnesses, hospital use and mortality in certain groups (i.e. pregnant women, older people, and people with co morbid conditions)
3. The University of Surrey alert PHE and the Chief Medical Officer of the findings who will then evaluate the evidence and make health care plans that are in the best interest of the nation’s health.
For example, from the data provided by HES and CRD, PHE might identify that certain conditions are associated with higher influenza rates, and therefore the possibility of extending the vaccination programme to this condition might be examined.

The RSC and PHE have been working together for many years to improve the nation’s health. University of Surrey has been important in the process since March 2015, when the secure network was established at University of Surrey. The work is funded by PHE and the University of Surrey's work has previously been used to influence practice. For example, if high rates of influenza are circulating the University of Surrey will inform the Chief Medical Officer who will then make a decision about whether or not to dispense anti-viral medications at hospitals and general practices.

Outputs:

The purpose of linking HES, CRD, and primary care data is to implement a wider and more accurate sentinel surveillance of infectious diseases in England. The main outputs of the RCGP RSC’s surveillance work, which is funded by Public Health England (PHE) are as follows:

• The RCGP RSC weekly report is circulated to a selected list of recipients on Wednesdays and it is publicly available on Thursdays at 2 pm at the RCGP RSC website (http://www.rcgp.org.uk/clinical-and-research/our-programmes/research-and-surveillance-centre.aspx). This report currently covers incidence rates of 37 infectious and respiratory conditions in England. It is expected that, in future, hospitalisation trends will be included. This is incorporated into the syndromic surveillance carried out by PHE on a daily basis, which allows them to determine any urgent priorities for local health protection teams.
• Similar to this, an annual report is published covering the annual trends of the 37 conditions. Each year, this report has a new theme which is explored in a paper submitted to a peer-reviewed journal (usually British Journal of General Practice). Themes explored include demographic disparities in disease presentation, higher rates of consultations for lower respiratory infections for boys, and urban/rural disparities of presentation.
• In January of every year, the University of Surrey provide a mid-season flu cohort to PHE with data up to the end of December. This is a fully pseudonymised patient-level extract collected by a PHE statistician using a secure drive. This data extract contains details of influenza swabbing, chronic conditions, and vaccination status for each patient. It is hoped to be able to include details of emergency attendances or admission around influenza, pneumonia, or lower respiratory tract infection events. At the end of the flu season (varies from March to May), a second extract is provided updating the first, with data recorded after December.
• The data from both of these extracts is used to estimate seasonal influenza vaccine effectiveness, stratified by comorbidities and demographics. HES data will allow the University of Surrey/PHE and RCGP to include the impact of any changes in effectiveness, assessed through changes in hospital admissions/emergencies due to respiratory conditions. The results are published at the mid-season and at the end of season stage, in the peer-reviewed journal Eurosurveillance.
• Important results from either of these will be further analysed and presented at the RCGP annual conference, the PHE annual conference, and the PHE annual epidemiology conference.


All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Processing:

Flows of data:
• Data are extracted by Apollo from practices that are members of the Royal College of General Practitioners Research and Surveillance Centre (RCGP RSC) network. The University of Surrey subcontracts with Apollo to do this as part of its contractual responsibilities.
• The University of Surrey, on behalf of RCGP RSC, will provide NHS Digital with a list of pseudonymised NHS numbers and dates of birth for the cohort each quarter.
• NHS Digital will provide HES Critical Care, Outpatients, A&E, Admitted patient care and CRD data for the cohort to the University of Surrey each quarter for it to link this information to RCGP RSC data.
• University of Surrey will store the data on the secure network.
• University of Surrey will process and aggregate pseudonymised data to produce approved reports for surveillance (as part of the National surveillance process); and quality improvement.

Detailed explanation of flows of data:

a) Data flow from RCGP RSC network member practices to University of Surrey: Apollo extract the data from the practices. Patients who have opted out of data sharing do not have their data extracted, unless they have consented to a specific surveillance programme or study. This extract provides the study with information about patients’ visits to general practices including the date of the appointment, the reason for the visit and any relevant vaccination information. The University of Surrey also receive patients’ NHS numbers and dates of birth, which are pseudonymised using the SHA-512 algorithm. Detailed information about this algorithm is held in a separate location by IT services at the University of Surrey. This extract provides University of Surrey with a cohort of participants whose data will then be requested from NHS Digital.
b) University of Surrey to NHS Digital: The University of Surrey securely transfers a file of identifiers (Pseudonymised NHS Number, date of birth, and Unique Study ID) to NHS Digital for all non-opt-out patients who are registered with RCGP RSC general practices.
c) NHS Digital to University of Surrey: NHS Digital returns linked HES and CRD mortality data including the Unique Study ID and pseudonymised NHS numbers or date of birth to University of Surrey.
d) University of Surrey storage and processing of data: The data about patients registered with RCGP RSC general practices is stored on the secure server at the University of Surrey which can only be accessed from the University of Surrey. The data will be processed within the secure network and dedicated analysis server of the Surveillance Group. The secure network is located behind a firewall within the University’s network; all inbound connections are blocked, but outbound connections are allowed. Patient level data are held in the database server within the RSC Group’s secure network.
Pseudonymised data will be stored on the database server within the RSC’s secure network. The pseudonymisation algorithm is held in a separate location by IT services at the University of Surrey.
e) University of Surrey process and aggregate pseudonymised data to produce reports. For example, University of Surrey on behalf of RCGP RSC provide a mid-season flu cohort to PHE with data up to the end of December. This is a fully pseudonymised patient-level extract collected by a PHE statistician using a secure drive. The University of Surrey also produce an end of season report, an annual report and weekly reports that are available to the public and use aggregated data on rates of infectious and allergic conditions.


The RCGP RSC data is controlled and processed by a group of staff who are all based at the University of Surrey; all are mandated to complete information governance training. The group is made up of analysts, academic fellows, Structured Query Language (SQL) developers, RCGP RSC practice liaison officers, a project manager and a head of department. The team work from secure workstations or secure laptops with encrypted drives within the group’s secure network.

Data will only be accessed by authorised individuals within the RSC who are substantive employees of University of Surrey. The authorisation process includes: (1) Contractual requirement to follow IG principles; (2) Using the email registered with Human Resources to complete IG training and to return the certificate; (3) Staff’s email is authorised by the IT department for one year to access the secure network and staff’s computers are configured to allow this; (4) At any point the project managers or Head can have access to the secure network turned off. There is special authorisation to have access to the main database.

Only three SQL developers and one senior project manager can access the main database. Surveillance databases are created for approved analyses once they have been agreed by the RCGP RSC approval committee. This agreed protocol includes the list of variables required for the database. The SQL developers create separate databases for individual projects only including the required variables, for the required time interval.


The HES and CRD data will be linked with the data that the University of Surrey already receives from the RCGP RSC network practices and PHE reference laboratories. The linkage between secondary and primary care data would happen via linking pseudonymised NHS numbers in both sets of data. The University of Surrey have used this process for previous projects linking different sets of data, and the linkage has been successful, provided both parties use the same pseudonymisation algorithm (SHA-512).
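The linkage step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text confirms that SHA-512 is used and that both parties must apply the same algorithm, but the details of that algorithm are held separately, so the `salt` parameter, the whitespace normalisation, the field names and the sample values below are all hypothetical.

```python
import hashlib


def pseudonymise(nhs_number: str, salt: str = "") -> str:
    """Return a SHA-512 hex digest of an NHS number.

    The optional salt and the removal of spaces are assumptions made for
    this sketch; the real algorithm is not described in the document.
    """
    normalised = nhs_number.replace(" ", "")
    return hashlib.sha512((salt + normalised).encode("utf-8")).hexdigest()


def link_records(primary, secondary):
    """Join two lists of dicts on the shared 'pseudo_id' key."""
    by_id = {}
    for row in secondary:
        by_id.setdefault(row["pseudo_id"], []).append(row)
    linked = []
    for p in primary:
        for s in by_id.get(p["pseudo_id"], []):
            linked.append({**p, **s})
    return linked


# Made-up example values, not real patient data.
primary = [{"pseudo_id": pseudonymise("943 476 5919"), "ili_consultation": True}]
secondary = [{"pseudo_id": pseudonymise("9434765919"), "admitted": True}]
linked = link_records(primary, secondary)
```

The two records link only because both sides produce the same digest for the same identifier; any difference in salt or normalisation between the parties would give different digests and the join would silently return nothing, which is why the text stresses using the same algorithm.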

There will be no requirement nor attempt to re-identify individuals from the data.
The data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
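As a rough illustration of what small-number suppression in aggregated outputs can look like, the sketch below masks low counts before publication. The threshold, the marker and the sample figures are assumptions chosen for the example; the actual rules are those set out in the HES Analysis Guide, which is not reproduced here.

```python
def suppress_small_numbers(counts, threshold=8, marker="*"):
    """Mask any non-zero count below `threshold`.

    The default threshold is an illustrative assumption, not a value
    taken from this document.
    """
    return {
        key: (marker if 0 < value < threshold else value)
        for key, value in counts.items()
    }


# Made-up aggregate counts for illustration only.
weekly_admissions = {"URTI": 124, "LRTI": 57, "Asthma": 3, "COPD": 0}
published = suppress_small_numbers(weekly_admissions)
```

In practice, suppression rules often also involve rounding and checks that a masked cell cannot be recovered from row or column totals; those steps are omitted from this sketch.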

Historic data are needed because longitudinal data better enable the RSC to predict what might happen in the future; even a small increase in the ability to understand flu and its associated morbidity and mortality would offer benefits for patients and the NHS. Both historical and future data are needed in order to build a robust database and reporting system using up-to-date primary and secondary care data at the individual patient level, which can be easily queried. This will enable the study group to answer a wide range of questions which will have an impact on the provision of health care in England. For example, the data will be used to answer questions posed by PHE, who make many decisions about healthcare, such as the vaccination programme, or preventative measures.

The use of national data is needed as the University of Surrey are a national surveillance centre and the cohort are from across England. Practices are recruited to be nationally representative.

Due to the potentially wide variety of adverse events that influenza can cause, it is not seen as appropriate to limit the data to specific health conditions/diagnostic codes or data types. For example, an unexpected rise in scarlet fever and winter outbreaks of scabies are unexpected increases in disease incidence that have been followed.

The use of pseudonymised NHS numbers is essential, as the request is to link HES and CRD data to the data that the University of Surrey already receives from the RCGP RSC network general practices and PHE reference laboratories.


The dynamics of frailty in older people: modelling impact on health care demand and outcomes to inform service planning and commissioning — DARS-NIC-353126-Y1S5F

Opt outs honoured: Identifiable, Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii)

Purposes: Yes (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2021-02-18 — 2024-02-17 2022.01 — 2022.01.

Access method: One-Off

Data-controller type: UNIVERSITY OF OXFORD, UNIVERSITY OF SOUTHAMPTON

Sublicensing allowed: No

Datasets:

  1. Civil Registration - Deaths
  2. Hospital Episode Statistics Accident and Emergency
  3. Hospital Episode Statistics Admitted Patient Care
  4. Hospital Episode Statistics Critical Care
  5. Hospital Episode Statistics Outpatients

Objectives:

STUDY AIMS AND PURPOSE:
Frailty has emerged as a significant issue for the National Health Service (NHS) in recent years. Frailty is associated with outcomes including unplanned admission, transfer to residential care and high levels of service use. As the population ages, frailty becomes more common, and associated demand for health care increases. Where there are limited NHS resources but increasing demand, planning delivery of appropriate services to support people with frailty will be key to providing cost-effective, quality care for older people. However, detailed information about how many people develop frailty over a certain time, how common frailty is in different groups of people, how it progresses and how it impacts on need for health care is still lacking. We need this information to be able to plan, commission and deliver services for older people who are at risk of developing frailty, or who already have frailty.

To be able to do this, we need to explore the trends of frailty development and progression within large populations and understand the impact of frailty on patients and their use of NHS services. A useful tool, the electronic Frailty Index (eFI), has recently been introduced to the NHS. This tool uses data in the patient’s primary care medical record and looks for 36 different ‘deficits’ (e.g., clinical conditions or diagnoses, laboratory tests, limitations to mobility) which are used to calculate a score. A low score indicates that patients are ‘fit’, and higher scores indicate patients may have mild frailty, moderate frailty or severe frailty. The worse the frailty becomes, the more at-risk patients are from poor outcomes, as people are more likely to find it difficult to deal with small changes in their health or circumstances, so the effects of getting ill are worse than for other people.

Until recently, it was difficult for General Practitioners (GPs) to provide care for frail older people because it was hard to identify people who were frail without an assessment by a consultant. This meant that frail older people were not always receiving the care they needed. However, GPs can use the information from the eFI to improve care for patients with moderate and severe frailty.

University of Southampton have access to a pseudonymised extract of all individuals aged 50 years and above between 2006 and 2017 from the primary care records of participating RCGP RSC practices. An eFI score for each individual for each calendar year they are present in the cohort has been calculated from the RCGP RSC primary care data. The eFI score is then categorised into fit, mild, moderate and severe based on the cut-offs provided in Clegg (2016). Given the number of years individuals may be present in the dataset, it is likely the eFI category will vary during their follow-up time; University of Southampton’s preliminary analyses of the existing primary care data (RCGP dataset) confirm this is the case. The data extract requested from NHS Digital will be pseudonymised and linked to the existing pseudonymised primary care data as per the processes described in section 5b. This linked dataset will allow University of Southampton to explore the relationship between frailty transitions and outcomes without having to identify individuals.
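The scoring and categorisation step can be illustrated with a short Python sketch. The text states only that 36 deficits feed a score and that cut-offs from Clegg (2016) are used. The formula (deficits present divided by 36) and the cut-off values below (0.12, 0.24, 0.36) are the ones commonly cited for the eFI and are assumptions here; they should be checked against Clegg (2016) rather than taken from this example. The function names are hypothetical.

```python
TOTAL_DEFICITS = 36  # number of deficits stated in the text


def efi_score(deficits_present: int) -> float:
    """Proportion of the 36 deficits recorded for a patient (assumed formula)."""
    if not 0 <= deficits_present <= TOTAL_DEFICITS:
        raise ValueError("deficits_present must be between 0 and 36")
    return deficits_present / TOTAL_DEFICITS


def efi_category(score: float) -> str:
    """Map a score to a frailty category using assumed cut-offs."""
    if score <= 0.12:
        return "fit"
    if score <= 0.24:
        return "mild"
    if score <= 0.36:
        return "moderate"
    return "severe"


# Made-up yearly deficit counts for one patient, showing a category change.
yearly_deficits = {2006: 2, 2010: 6, 2014: 10, 2017: 14}
trajectory = {year: efi_category(efi_score(n)) for year, n in yearly_deficits.items()}
```

Computing the category per calendar year, as in `trajectory`, is what makes it possible to study transitions between frailty states over follow-up.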

The request is for primary care data from RCGP RSC, including all relevant codes for the 36 variables used in calculating eFI. There is no use of the eFI scores coded within GP systems, which would not have been available for the years being requested. The eFI scores are generated from routinely collected primary care data and use Read or CTV3 codes, as specified in the method by Clegg (2016), which allows calculation of eFI for all ages (Read code algorithm). This method has been applied to retrospective primary care data to generate information on each of the 36 eFI deficits. The requested data will be linked to our pseudonymised primary care records using the processes described. This will allow University of Southampton to utilise the primary-care derived eFI scores to explore the relationship between frailty and secondary and urgent care use.

Within the primary care records, which are used to calculate the eFI score, the eFI uses clinical diagnoses which have already been made and recorded by GPs using Read or CTV3 codes in the patients’ records. No prospective clinical assessments or diagnoses are therefore required for the proposed work. No clinical diagnoses are being carried out for this study. The eFI is not the same as a clinical diagnosis of frailty; its development was based on the recognised cumulative deficit framework devised by Rockwood. The intention is to use eFI scores as a measure of potential frailty or frailty associated burden at population level, not as a clinical diagnostic tool. By diagnoses, it is meant existing diagnostic codes held by RCGP RSC.

Therefore, the overarching aim of this study is to explore trends in development and progression of frailty, and the dynamics of frailty related healthcare demand, outcomes, and costs in the older general practice population, to inform the development of guidelines and tools to facilitate commissioning and service development for this patient group.

SPECIFIC STUDY OBJECTIVES AND RELATED WORKSTREAMS:
The study objectives are:
1. Identification of incidence and prevalence of frailty states in an ageing population (50 years and over)
2. Identification of frailty trajectories and transitions in severity in the older population over time
3. Exploration of drivers of progression of frailty, including clinical, socioeconomic, and demographic factors
4. Examination of the impact of frailty on service use, costs, and pathways of care
5. Exploration of the relationship between frailty status, socio-economic factors, practice factors and service use and outcomes (mortality, unplanned admissions, residential care use)
6. Prediction of trends in frailty, modelling of health and care demand and costs over time and in different service contexts
Workstreams:
To fulfil the above objectives, the project is divided into the following workstreams, of which workstreams 1 and 4 are relevant to this data request. There are two main aims for the project workstream to which this data request relates. They are:
- the identification of key variables capable of predicting frailty development and progression
- assessment of the relationship between frailty status and key clinical outcomes (including mortality and unplanned admissions).
These analyses will then be used to inform the simulation modelling being conducted in Workstream 4.
This application relates to the linkage of HES and mortality data from NHS Digital to primary care data provided by RCGP RSC within Workstream 1 of the study. There is no data linkage between Hospital Episode Statistics (HES) and mortality data and SAIL.

• Workstream 1: statistical modelling of population trends, incidence and prevalence of frailty, stratification of frailty and related outcomes, resource use and costs – this data request specifically relates to providing secondary and urgent care service use data as a component of this workstream.
• Workstream 2: validation of the population model
• Workstream 3: stakeholder engagement.
• Workstream 4: simulation modelling to explore impact of different service and demographic scenarios on population trends, service demand and costs in the future – the analysis of data provided under this data request will inform this workstream, no further data is required for workstream 4.
The data requested under this application will provide the necessary hospital outcomes and mortality data to be able to fulfil Workstream 1, and the results of analyses in Workstream 1 will inform the simulation modelling in Workstream 4.

WHICH DATA IS BEING REQUESTED?
This study will use electronic data which is recorded during the routine care of NHS patients, where explicit consent has not been gained from participants. To be able to fulfil the aims of the study, healthcare data on an ageing cohort over a 12-year period will be needed, so that health outcomes can be explored over the medium to long term. We will use data which has already been collected (‘retrospective data’), as it would not be possible to do a large-scale representative study prospectively.

The dataset requested is minimised to a pre-defined cohort of approximately 2.2 million patients from the Royal College of GPs Research Surveillance Centre (RCGP RSC) dataset. The RCGP RSC dataset is an electronic health record (EHR) that collates routinely recorded primary care data from a population of 3 million nationwide from more than 400 GP practices. We are only requesting variables which are needed for analysis of our defined outcomes, in appropriate formats to minimise potential identifiers and reduce the risk of any inadvertent identification through combinations of variables. The RCGP Research Surveillance Centre team at University of Oxford and the study team at University of Southampton are requesting a unique study identifier (ID) only; NHS Digital data will be pseudonymised using a non-reversible hashing algorithm. The RCGP Research Surveillance Centre team will link the NHS Digital data to their primary care data (RCGP RSC dataset) using the pseudonymised Identifier.

To conduct this component of the research (Workstream 1), the research team from University of Southampton will work on a pseudonymised RCGP RSC data extract, linked to pseudonymised NHS Digital HES and Civil Registration Deaths data/Mortality data to determine the outcomes specified in this application. In total, for this workstream, the University of Southampton will obtain fully de-identified, pseudonymised data extracts from the following databanks:

• Royal College of General Practitioners Research Surveillance Centre (RCGP RSC) dataset: this primary care dataset will include demographic data, residence, long-term conditions diagnoses, frailty index domains, prescriptions, primary care service events

The primary care RCGP RSC dataset comprises the baseline characteristics of the patients and primary healthcare contacts over this period, in addition to frailty scores, the main predictor of interest in this study. Secondary care attendances and their outcomes (outpatient appointments, Accident and Emergency (A&E) visits, hospital admissions, critical care admissions) and deaths are key study outcomes of interest to understand how attendances and healthcare use vary between people with different frailty states. It is therefore important to have individual-level data to be able to analyse changes in healthcare use over the cohort period and examine predictors of secondary care use and deaths, hence the request for pseudonymised Hospital Episode Statistics (HES) and Civil Registration Deaths data/Mortality data, which will be linked to the RCGP RSC primary care dataset only.

For this study, the research team will need to link the de-identified RCGP RSC data extract with data on secondary care Hospital Episode Statistics (HES) and Civil Registration Deaths data/Mortality data. HES and mortality data are therefore being requested in the performance of a task in the public interest - Article 6(1)(e) i.e. processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller, and Article 9(2)(j) with regards to the processing being necessary for achieving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law. The public interest function of the proposed data linkage is evidenced in the acceptance of this study within the RCGP Research Surveillance Centre portfolio and its funding by the NIHR, where it is a part of their established programme of research in relation to management of frailty.

ROLE OF THE FUNDER AND OTHER ORGANISATIONS:
This project has been funded by the National Institute for Health Research (Health Services & Delivery Research funding stream, grant number 16/116/43), which commenced in March 2019 and is due to conclude in February 2022. NIHR are the funding body only; they will not determine the aims and objectives of this project, nor will they have access to NHS Digital data.

As the main study is funded by the NIHR, an independent Study Steering Committee (SSC) comprising academics, service commissioners, public health experts and Public Patient Involvement (PPI) representatives provides oversight on behalf of NIHR. The SSC is independent of the study and their remit is to ensure that the project is delivered in line with the agreed protocol. The National Clinical Director for Older People and Person-Centred Integrated Care at NHS England is a member of the Study Steering Committee; the Director provides the NHS England perspective on commissioning of services for older people and guidance on dissemination and implementation of study findings.

The other organisations involved in the wider project with advisory roles include Southampton University Hospitals NHS Trust, Southern Health Foundation Trust, the University of Oxford, the University of Leeds, and public contributors. No staff from these organisations will have access to NHS Digital data. The Stakeholder Engagement Group (SEG) comprises a wide range of stakeholders, including representatives from service providers, commissioners, clinical experts, health, social care and voluntary organisation and patients/carers. The SEG role is to advise on the development of the simulation model and the scenarios to be tested by the simulation in Workstream 3 of the funded project.

DATA CONTROLLERS AND DATA PROCESSORS:
In this agreement, the University of Oxford and the University of Southampton are the joint Data Controllers. University of Oxford makes decisions about the processes for data processing and access. University of Southampton are data controllers as they are dictating the analysis that is being done.

The University of Oxford and University of Southampton are Data Processors. The RCGP RSC dataset is stored and managed at the University of Oxford. The University of Oxford has a contract with the RCGP to provide this surveillance, quality improvement and research platform. Under the 2018 Data Protection Act, the University of Oxford is identified as a processor of personal data for the Royal College of General Practitioners (RCGP). The RCGP Research Surveillance Centre has its secure data and analytics hub at University of Oxford, who will manage data governance, encryption, and access. NHS Digital data will be released to University of Oxford who will be carrying out the linkage with their primary care dataset before making the linked data available to University of Southampton. University of Southampton are Data Processor because University of Southampton staff will access NHS Digital data and carry out data analysis on the University of Oxford secure remote server.

DATA ANALYSIS METHODS AND USE OF THE RESULTS:
This study will explore the incidence and prevalence, development, and impact of frailty within the population using retrospective data from the RCGP Research Surveillance Centre databank. The eFI tool will be utilised to stratify a cohort of people aged 50 and over present in the database between 2006 and 2017 inclusive into fit, mild, moderate, and severe frailty groups. Data will be extracted on frailty status, health care use, and outcomes for the subsequent years, and the team at University of Southampton will calculate key service use costs from the linked RCGP RSC dataset and NHS Digital HES and mortality data. Outcomes will include mortality, unplanned hospital admission, A&E attendance, and GP appointments.
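The stratification step described above can be illustrated with a minimal sketch. It assumes the eFI is scored as the proportion of 36 deficits present and uses commonly published cut-points (fit up to 0.12, mild up to 0.24, moderate up to 0.36, severe above that); neither the deficit count nor the cut-points are stated in this agreement, and the function names and example cohort are hypothetical.

```python
from collections import Counter

# Assumed cut-points (not stated in this agreement): upper bound of each group.
EFI_CUT_POINTS = [(0.12, "fit"), (0.24, "mild"), (0.36, "moderate")]


def efi_category(deficits_present, deficits_total=36):
    """Return the frailty group for a patient with the given deficit count.

    deficits_total=36 is an assumption about how the eFI is scored.
    """
    if not 0 <= deficits_present <= deficits_total:
        raise ValueError("deficit count out of range")
    score = deficits_present / deficits_total
    for upper_bound, label in EFI_CUT_POINTS:
        if score <= upper_bound:
            return label
    return "severe"


def stratify(cohort):
    """Count patients per frailty group; cohort maps study ID -> deficit count."""
    return Counter(efi_category(count) for count in cohort.values())


# Hypothetical four-patient cohort keyed by study ID.
example_cohort = {"A": 2, "B": 6, "C": 11, "D": 15}
groups = stratify(example_cohort)
```

In this illustration each hypothetical patient falls into a different group; the real analysis would apply the same idea to the full cohort and repeat it for each year of follow-up.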

The RCGP RSC dataset will provide data on socio-economic factors, practice size and location and residence. The research team at University of Southampton will use the eFI to stratify the RCGP RSC dataset cohort by severity of frailty and explore frailty status over time, determining incidence, prevalence, and progression of frailty. The University of Southampton research team will use descriptive statistics to estimate baseline prevalence, burden of frailty and transition rates between frailty states in the population aged 50 and over.
The research team at the University of Southampton will use the RCGP RSC dataset to examine the relationships between factors such as age, deprivation, ethnicity, location, and comorbidities of individuals in relation to development of, and deterioration in, frailty status. The epidemiology of frailty will also be described, calculating prevalence, incidence and describing trajectories of decline. Relationships between demographics, practice characteristics, outcomes, service use and costs will be explored for frailty (eFI score) strata (robust, mild, moderate, and severe). The influence of frailty on outcomes, service use and costs will be explored. With the linked HES and mortality data, the University of Southampton team will also explore the relationship between frailty and secondary care outcomes and costs and mortality. Multi-state models (models which take account of the ‘level’ of frailty a patient has at any one time – i.e. fit, mild, moderate or severe) will be used to determine what clinical, demographic, and socio-economic variables are able to stratify frailty progression. Time-dependent Cox models will be used to examine the relationship between frailty state and key binary clinical outcomes (including mortality and service use). Mixed-effects negative binomial models will be used to examine the relationship between frailty state and count based clinical outcomes, such as the number of A&E attendances and unplanned hospitalisations.
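As a rough illustration of the descriptive step only, transition rates between frailty states could be estimated by counting observed year-to-year moves. This is a simplified sketch, not the multi-state, Cox or negative binomial models named above; the function name and the example histories are hypothetical.

```python
from collections import defaultdict


def transition_proportions(histories):
    """Estimate the proportion of observed year-to-year moves between states.

    histories: list of per-patient state sequences, one entry per year.
    Returns {from_state: {to_state: proportion}} for states that were observed.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sequence in histories:
        # Pair each year's state with the state in the following year.
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    result = {}
    for state, row in counts.items():
        total = sum(row.values())
        result[state] = {target: n / total for target, n in row.items()}
    return result


# Hypothetical frailty histories for three patients over three years.
histories = [
    ["fit", "fit", "mild"],
    ["fit", "mild", "mild"],
    ["mild", "moderate", "severe"],
]
rates = transition_proportions(histories)
```

A crude count like this ignores censoring, death and covariates, which is why the study proposes formal multi-state and time-dependent models for the main analysis.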

The research team at the University of Southampton will use results from these analyses to inform development of guidelines for service commissioners, developed in partnership with experts in service delivery, commissioning and the study PPI representatives through stakeholder engagement. The key clinical, demographic and socio-economic drivers that are identified as significant predictors of frailty progression and/or associated with outcomes or service use patterns of interest will also be used to inform the development of a prototype simulation model. The simulation model will use a System Dynamics (SD) based approach to explore the development and impact of frailty in the population and likely future scenarios over a 12-year timeframe. SD is a computer simulation modelling approach whose purpose is to analyse changes over time in complex, interacting systems and is ideally suited for health and care systems. The statistical analyses will be used to stratify the SD model and to inform potential ‘what if’ scenarios for simulation developed with the Stakeholder Engagement Group.

WHAT WILL THE SIMULATION MODEL DO?
An SD model consists of stocks (accumulations) of material, and flows between them, analogous to a series of water tanks connected by pipes. The rate of flow along each pipe is governed by valves that can be turned up or down. A stock-flow model will be developed, depicting patient transitions between different states. In this case, the “material” is frail patients, and the stocks are the numbers of patients in different health and social care states. These states will be further broken down by those characteristics identified in Workstream 1 as significantly impacting on demand for services, or strongly associated with specific outcomes. Potential candidate characteristics include age, gender, long-term condition (LTC) diagnoses and Index of Multiple Deprivation (IMD) scores.

The model does not follow individual patients, but uses the results obtained in Workstream 1 to calculate monthly transition probabilities between states (stocks). The model will use data from Workstream 1 to capture the key clinical and demographic differences that influence these transitions, as well as information about the costs and outcomes associated with each state. Data from Workstream 1 will be used to populate the simulation model to enable accurate simulation of population trends, service use and costs.
The anticipated time horizon for running the model is ten years (2018-2027). This length of time is required to capture fully the population dynamics and the evolution of frailty. While the demographic predictions thus derived will be robust, it is recognised that any cost calculations more than two or three years into the future can only be indicative, given that service delivery modalities and health and social care organisational structures are unlikely to remain fixed for the whole period.
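The stock-flow idea described above can be sketched as a simple monthly update, in which each stock is redistributed according to transition probabilities. The probabilities, starting stocks and state names below are invented for illustration and are not study results; the sketch also assumes a closed cohort with no new entrants, which a full System Dynamics model would not.

```python
def step(stocks, transitions):
    """Advance all stocks by one month using per-month transition probabilities."""
    new_stocks = {state: 0.0 for state in stocks}
    for state, size in stocks.items():
        for target, probability in transitions[state].items():
            new_stocks[target] += size * probability
    return new_stocks


def simulate(stocks, transitions, months):
    """Return the list of stock levels at month 0, 1, ..., months."""
    history = [dict(stocks)]
    for _ in range(months):
        stocks = step(stocks, transitions)
        history.append(stocks)
    return history


# Hypothetical monthly transition probabilities; each row sums to 1.
TRANSITIONS = {
    "fit": {"fit": 0.990, "mild": 0.009, "dead": 0.001},
    "mild": {"mild": 0.985, "moderate": 0.012, "dead": 0.003},
    "moderate": {"moderate": 0.980, "severe": 0.014, "dead": 0.006},
    "severe": {"severe": 0.985, "dead": 0.015},
    "dead": {"dead": 1.0},
}

# Hypothetical starting population of 10,000 people.
START = {"fit": 6000.0, "mild": 2500.0, "moderate": 1000.0, "severe": 500.0, "dead": 0.0}

# Ten years of monthly steps, matching the anticipated time horizon.
history = simulate(START, TRANSITIONS, months=120)
```

Because the model tracks stock sizes rather than individuals, changing a single probability in the table is enough to test a "what-if" scenario and compare the resulting trajectories.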

Moreover, there is bound to be considerable local variation. The simulation model can easily take this into account by modifying the relevant parameters. The key benefit of using simulation is that a wide range of “what-if” scenarios can be tested and compared, including demographic trends and changes to prevalence and progression rates, in addition to service delivery scenarios developed with the SEG in Workstream 3. The model outputs, which will enable comparison between scenarios, will include:
• The number of patients, and proportion of the population, in each stock over time
• Demand for services over time, aggregated or broken down by patient category
• A range of health outcome measures, aggregated or broken down by patient category
• Mortality, total or cumulative, aggregated or broken down by patient category

To develop the simulation model to allow prediction of future trends and population health burden, retrospective analysis of a number of years of population-level data is required. As frailty is a slowly developing condition and is much more prevalent in the oldest old, this study requires analysis of transitions in frailty states and health outcomes for a large ageing cohort on an individual level over at least a 10-year period. This will allow the study to capture transitions in frailty development and important health outcomes during this period.

In line with the public interest basis for this request, the data requested from NHS Digital is to provide additional longitudinal data required for delivery of this National Institute for Health Research (NIHR) funded project, specifically linked hospital outpatient, emergency department, health economic and mortality data for a cohort of primary care patients identified from the RCGP RSC dataset. The simulation modelling study will be at population level (England), for which a representative cohort has been obtained from RCGP Research Surveillance Centre. The primary care data for this cohort covers a range of geographical areas, urban and rural locations, and the range of deprivation levels. This data request is for linked NHS Digital HES and Civil Registration Deaths data/Mortality data, so necessarily covers the same geographical range as the primary care data.

Expected Benefits:

HOW DOES SHARING THIS DATA BENEFIT HEALTHCARE PROVISION?
The clinical management of frailty will become increasingly important as the population ages, with prevalence of frailty rising from 10% of people aged over 65, to up to 50% in those aged over 85. Despite the scale of this patient group, research indicates that half of patients with frailty are not receiving effective health care interventions. In Fit for Frailty Part 2 (British Geriatrics Society, 2015), it is noted that there is potential for significant harm to frail patients if they receive inappropriate interventions. However, many services across the health and care system do not take adequate account of individuals’ frailty and so opportunities to improve quality of care are missed. Attention to the needs of older people living with frailty could, therefore, be more effective in reducing acute bed use and improving quality of care than focusing on those at high risk of admission. At the individual patient level, guidance for patient management exists and there is general agreement about the features of good quality care. There are, however, gaps in the evidence relating to the organisation and delivery of interventions and services to optimise provision of high-quality individualised patient management across the frail older population. The improved understanding of population needs offered by this study hopes to inform appropriate service planning and delivery, giving direct benefit for patients through provision of timely and appropriate care.

This study, with its emphasis on whole-system population dynamics of frailty, will explore the issues around population need, service configurations and clinical interventions highlighted above. Data from Workstream 1 (including the shared data) will be analysed and the results used to inform the simulation model in Workstream 4. A strength of the simulation modelling approach is that it allows for identification of different trajectories of care and key transition points, projection of future demand and rapid testing of the impact of different service configuration scenarios to aid decision-making. The proposed study hopes to impact on patient care directly, for example, by identifying features of people with frailty who are more likely to have adverse outcomes, identifying risk factors for frailty progression and informing targeted prevention through identification of trajectories of frailty, so enabling better targeting of interventions and services. Indirect impact may also be important, for example, through allowing commissioners to understand different care trajectories, and therefore the likely scale and nature of service demand, or service providers to identify cost-effective approaches for their specific population and facilitating the integration of health and social care.

The study outputs may have a direct benefit for commissioners; commissioning is a complex cycle involving assessment and understanding of population health needs, planning services to meet those needs, procuring appropriate and cost-effective services, and monitoring their delivery and impact. The outputs of this study have the potential to contribute at each of these stages but may have most impact in relation to assessment of population health needs. The planning stage of the commissioning cycle is often limited by a lack of reliable data on demand, particularly data which allows for forward projections; this study will address this need in relation to older people with frailty. The simulation modelling approach proposed in this study is particularly well-positioned to support commissioning, with its recent shifts towards more local commissioning, joint working and context specific (or ‘place-based’) commissioning and a focus on integrated systems of care. Integrated care organisations and commissioners may need to become more focused on needs of patients with multiple morbidity and functional problems (consistent with the presenting problems encountered in frailty) rather than disease-specific approaches.

The study team anticipate that realisable benefits from the proposed work will include guidance for commissioners and service providers on service configurations and the development of a customisable simulation model for local exploration of service demand and configurations.

The immediate project outputs from Workstreams 1 and 4 are the statistical and economic analyses, algorithms and simulation model, which will form the core of guidance for NHS commissioners and planners to aid resource planning in relation to frailty.

SPECIFIC OUTPUTS AND DISSEMINATION:
The study team will use a range of dissemination approaches to reach the various target audiences for this research. The dissemination strategy will be guided by study PPI representatives and other key stakeholders on the SEG, including carer organisations and Age UK. The study team, SEG and collaborators include senior stakeholders relevant to development of frailty services and use of the eFI, including from provider Trusts, NHS England (NHSE) Older People Team and CCGs. The study team will use their established networks to share findings with leaders in implementation and commissioning of frailty services. The team will work with the study PPI lead and SEG to plan dissemination to NHS staff ‘on the ground’ and the local and wider body of patient/carers, with a focus on making the results ‘accessible’ to the wider public, both in writing and verbally, through presentations at workshops/team meetings/to patient groups. The study team will run a dissemination planning event, to which NHS commissioners and frailty leaders will be invited, to review findings, consider their implications and implementation and explore key messages and strategies for dissemination. The study team will use their established formal social media networks to promote project outputs, and for dissemination. The team will share the results of the study with the public and staff in the relevant health, local and third sectors in a public/patient friendly way by use of infographics, using plain English, and via use of local and national media and social media. In addition, the study team will summarise the findings of the work via professional journals (e.g. the Health Service Journal (HSJ)) and health service networks and professional organisations (Health Services Research Network, British Geriatrics Society).

In addition to the above, the core study team will lead on other study outputs, including academic journal papers. They will submit abstracts for oral and poster presentations at a minimum of two national and one international conference focusing on care of older people and aiming for the widest possible audience. They will submit at least two academic papers to high impact open access journals. These will be focused on the dynamics of frailty within the population and the impact of frailty on health care demand and outcomes. The analyses from Workstream 1 will provide data on incidence and prevalence of frailty, stratified by severity, in a typical older, primary care population, and the associated outcomes including emergency department use, hospitalisation and deaths. The long-term impact of frailty on outcomes and service demand and costs will be modelled. The simulation model could allow local and regional service planners and commissioners to explore a range of scenarios relevant to their specific contexts, so aiding decisions on service commissioning and design. The study team will collate the outputs of the study into a commissioning toolkit, comprising guidance on drivers of frailty-related demand and outputs from the Workstream 4 simulation model that can be used for prediction of future demand and exploration of different scenarios. The simulation model could be capable of adaptation for exploration of different service and demographic contexts. The simulation model algorithms may also be transferable to modelling of other chronic conditions that are common within the ageing population.

The study team will produce a final research report for NIHR detailing the work undertaken and results alongside an abstract, executive summary and technical appendices. The executive summary will be suitable for use as a briefing paper for NHS managers and commissioners. In addition, they will prepare a short Powerpoint presentation to present the main findings to NHS organisations. The slides will be made available, alongside the full report, on the HS&DR programme web pages and, where possible, as additional linked material with other publications. They will also work closely with the University communications team and ensure that members of the study team are given appropriate support and training in handling enquiries from the media.

i) Development of guidance and commissioning toolkit for service providers and commissioners to inform planning over a 15 year+ period
ii) Development of a simulation model that may allow service planners and commissioners to explore scenarios and trends tailored to local and regional populations
iii) Future development of the simulation model of population trends into a workforce planning tool
iv) Future adaptation of the simulation model algorithms to explore health care demand and mitigation scenarios in relation to other conditions within the ageing population

Better understanding of the development and dynamics of frailty over time could facilitate service and workforce planning and commissioning. Outputs of the study will include guidance for commissioners, a simulation model to facilitate prediction of service demand associated with frailty and the potential for development of these resources into a workforce planning toolkit. The simulation model architecture, and the know-how relating to populating and operationalising the model may be transferable to prediction of demand for other populations and conditions with a high population prevalence (e.g., dementia, obesity, mental health problems). As the models are based on national-level data, the application of results and the ability to adapt the model to geographical locations means that the impact may be nationwide within the UK, and the information may also be adapted on an international level.

HOW THE BENEFITS WILL BE ACHIEVED, AND TIMELINES
The study team (including researchers at the University of Oxford and Southampton) will achieve the benefit, working together with the NIHR to ensure appropriate dissemination and with third parties such as NHS Commissioners to realise the benefits.

The Stakeholder Engagement Group (SEG) includes representation from local Strategic Transformation Partnership (STP), including from Clinical Commissioning Groups (CCGs), local authorities, and provider organisations, in addition to national commissioning representatives. The SEG also includes the PPI lead, PPI representatives from the Ageing & Dementia PPI panel and representation from third sector organisations, including Age UK. This will ensure that the results from the analyses and simulation model are discussed with the right people to make the appropriate changes to the healthcare system.

The study outputs will be monitored by the independent Study Steering Committee (SSC) according to the study Gantt chart, publication plan and dissemination activities. For example, milestones such as simulation model production, analysis of scenarios, commissioning guidance and toolkit and dissemination and implementation events will all be reviewed by the SSC.

Epidemiological analysis of the primary care and linked HES/mortality data is expected to be complete within 9 months of data delivery, enabling provision of aggregate data to inform the simulation model and scenario development. The simulation model and related outputs, including scenarios, are projected to be complete by the end of 2022.

Outputs:

There will be several outputs by the end of the study. The target groups and individuals for the outputs will include:
• academic
• scientific
• professional
• policy makers (both political and professional) involved in deciding future health policies

The main report will be delivered to the study funder by 30th March 2022, and academic papers and seminars will be delivered in the following year.

The immediate project outputs will be:
• statistical and economic analyses - in the form of aggregate data tables, graphs, reports and submissions to peer reviewed journals, with any small numbers suppressed (in line with the HES Analysis Guide)
• oral and poster presentations at a minimum of two national and one international conference focusing on care of older people and aiming for the widest possible audience. The research team will submit at least two academic papers to high impact open access journals.
• Guidance for providers/commissioners
• Algorithms, simulation model and interactive dashboard for simulation
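The small-number suppression mentioned in the first output above can be sketched as follows. The threshold of 7 and the asterisk marker are assumptions made for illustration; the actual rule should be taken from the HES Analysis Guide, and the function name is hypothetical.

```python
def suppress_small_numbers(table, threshold=7, marker="*"):
    """Replace counts from 1 to threshold (inclusive) with a marker.

    table: mapping of category -> count. Zero counts are left unchanged,
    which is an assumption rather than a rule taken from this agreement.
    """
    return {
        category: marker if 1 <= count <= threshold else count
        for category, count in table.items()
    }


# Hypothetical aggregate table of admissions by frailty group.
admissions = {"fit": 0, "mild": 3, "moderate": 7, "severe": 8}
published = suppress_small_numbers(admissions)
```

This sketch covers primary suppression only; it does not guard against suppressed cells being recovered by subtraction from row or column totals.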

These outputs will form the core of guidance for NHS commissioners and planners to aid resource planning in relation to frailty. Study Patient and Public Involvement (PPI) representatives, drawn from the core study team and the School of Health Sciences Ageing & Dementia Research PPI Panel at the University of Southampton, will advise on dissemination and implementation through the SEG events and a study dissemination planning event.

The study team, Stakeholder Engagement Group and collaborators include senior stakeholders relevant to development of frailty services and use of the electronic Frailty Index (eFI), including from provider Trusts, National Health Service England (NHSE) Older People Team and Clinical Commissioning Groups (CCGs).

The study team will use the established networks to share findings with leaders in implementation and commissioning of frailty services nationally. The team will work with the study PPI lead and SEG (including the Age UK representative) to plan dissemination to NHS staff 'on the ground' and also the local and wider body of patient/carers, with a focus on making the results 'accessible' to the wider public, both in writing and verbally, through presentations at workshops/team meetings/to patient groups. The study team will run a dissemination planning event, to which NHS commissioners and frailty leaders will be invited, to review findings, consider their implications and implementation and explore key messages and strategies for dissemination.

Processing:

As per the definition of ‘controller’ in the General Data Protection Regulation (1), both University of Southampton and the RCGP RSC research group based at the University of Oxford determine the purposes and means of the processing of personal data. The Principal Investigator (PI) at University of Southampton determines the study aims and objectives, the personal data that will be processed and the analyses that will be carried out. The University of Southampton staff working on Workstream 1 will also have access to the pseudonymised, linked primary care, HES and mortality data provided by RCGP RSC. In this case, University of Southampton are controllers in that they determine why the personal data are processed. The RCGP Research Surveillance Centre based at the University of Oxford determines the means of processing the data and approves its purpose; the University of Oxford therefore has oversight of study aims and objectives via the RCGP Research Surveillance Centre. University of Southampton and the University of Oxford are therefore joint Data Controllers.

Data flow into NHS Digital will consist of hashed identifiers for the study cohort (adults aged 50 and above registered with an RCGP practice at any year from 2006 to 2017 inclusive). The hashing of identifiable data for the Clinical Informatics and outcomes Research Group (RCGP Research Surveillance Centre) is conducted by the Salt Service of the University of Oxford Central IT team, so that the holder of the pseudonymised data is separated from the service that holds the non-reversible hash key. This avoids pseudonymised data becoming identifiable data.

NHS Digital will hash their NHS numbers using the same pseudonymisation algorithm (SHA-512). NHS Digital will undertake data linkage via the hashed NHS numbers in both sets of data. This process has been used for previous projects linking different sets of data, and the linkage has been successful. Records for each study participant containing information from HES and mortality data, together with hashed NHS numbers will be sent to the University of Oxford.

All individual-level data will then be stored and analysed at the University of Oxford. There will be no subsequent flows of individual level data from the University of Oxford. Aggregate analyses will be shared with the wider study team and used in dissemination of the research.

Apollo Medical Software Solutions, an approved third-party provider, has formal service agreements and service specifications with RCGP Research Surveillance Centre and with individual participating GP practices to conduct data collection and secure web transfer. (Copies of these formal agreements and technical details were shared with NHS Digital in the last IGTK assessment and were deemed satisfactory and are available to legitimate requests).

Each unique patient within the RCGP RSC databank is de-identified at source before data is extracted from individual practices using a computer-generated patient identifier created by Apollo Medical Software Solutions. This de-identification of records includes production of a hashed NHS number using pseudonymisation algorithm (SHA-512).

Pseudonymised record-level HES data will be processed and stored at the University of Oxford. Patient level databases (such as this study’s RCGP RSC dataset) are held in the database server within the RCGP Research Surveillance Centre Research Group's secure network. The Research Group's dedicated secure network is sited behind a firewall within the University's network. It is a standalone, independent network: all inbound connections are blocked, but outbound connections are allowed. All staff members of the research group working within the team base work from secure workstations or secure laptops with encrypted drives. Only substantive employees of the University of Oxford will have access to the data and only for the purposes described in this document. The data will be used solely for the "Dynamics of frailty in older people" study.

The University of Oxford will send the hashed NHS numbers to NHS Digital. The following flow of hashed NHS numbers will be undertaken.

The study group is the cohort of patients aged 50 years and over registered in RCGP Research Surveillance Centre network practices in any year from 2006 to 2017 inclusive. University of Oxford will identify the study patients for the cohort above from primary care records in the RCGP Research Surveillance Centre practices and send the hashed NHS numbers of the cohort under study to NHS Digital to link to HES/Civil Registration Data (CRD).

No other GP data will be sent to NHS Digital.

The process of linkage is as follows:
• University of Oxford’s senior SQL developer will submit the cohort to NHS Digital, with hashed NHS numbers using the pseudonymisation algorithm SHA-512
• NHS Digital will hash their NHS numbers using the same pseudonymisation algorithm (SHA-512) as used by the RCGP Research Surveillance Centre.
• NHS Digital will undertake data linkage via the hashed NHS numbers in both sets of data. This process has been used for previous projects linking different sets of data, and the linkage has been successful.
• NHS Digital will extract all HES and CRD records for which there are matched primary care records.
• NHS Digital will send the extract of HES and CRD records with the hashed NHS number to the University of Oxford.
• University of Oxford will link the HES and CRD records together with GP data from the primary care records from RCGP Research Surveillance Centre network practices with the same hashed NHS numbers.
• Records for each study participant will when fully linked contain information from HES and CRD, together with information from RCGP Research Surveillance Centre network primary care practices.
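The linkage steps listed above can be sketched in a few lines, assuming both parties apply SHA-512 to the NHS number combined with a shared secret salt. How the salt is combined with the number, the salt value, the record fields and the function names are all assumptions made for illustration; the NHS numbers shown are made up.

```python
import hashlib


def hash_nhs_number(nhs_number, salt):
    """Return the SHA-512 hex digest of a salted, whitespace-stripped NHS number."""
    normalised = nhs_number.replace(" ", "")
    return hashlib.sha512((salt + normalised).encode("utf-8")).hexdigest()


def link(primary_care, hes_crd):
    """Keep only records whose hashed NHS number appears in both datasets."""
    return {
        key: {"primary_care": primary_care[key], "hes_crd": hes_crd[key]}
        for key in primary_care.keys() & hes_crd.keys()
    }


SALT = "example-salt"  # hypothetical; the real key is held by a separate service

# Hypothetical cohort submitted by the university, keyed by hashed NHS number.
primary_care = {
    hash_nhs_number("000 000 0001", SALT): {"efi_group": "mild"},
    hash_nhs_number("000 000 0002", SALT): {"efi_group": "severe"},
}

# Hypothetical records hashed with the same algorithm and salt on the other side.
hes_crd = {
    hash_nhs_number("0000000002", SALT): {"admissions": 3},
    hash_nhs_number("0000000004", SALT): {"admissions": 1},
}

linked = link(primary_care, hes_crd)
```

Only the patient present in both datasets survives the join, and neither side needs to exchange a raw NHS number to achieve it.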

Each unique patient within the RCGP RSC dataset is de-identified at source before data is extracted from individual practices using a computer-generated patient ID. The University of Oxford holds no identifiable data and only hashed NHS number.

Only pseudonymised data with direct patient identifiers removed will be used. The research team will not seek individual patient identifiers; where required, data linkage will be achieved through ‘hashing’ algorithms to generate non-identifiable, unique IDs from identifiable data. As a further protection, non-reversible, pseudonymised ID numbers held by database organisations will be converted to unique study IDs, the keys to which will not be accessible to the research team; and, when using these data, the research team will suppress small numbers in reporting and avoid the presentation of data that can potentially be used to reveal identities. Data extracts and aggregate analyses will be pseudonymised as described.

All data processing will be carried out by staff employed by the University of Oxford. Data analysis will be carried out by staff employed by the University of Southampton and the University of Oxford. All staff receive Information Governance training on an annual basis and have all passed the NHS Information Governance on-line test for the current year.

Access to the data will be limited to either:
1) Researchers with substantive employee contracts with the University of Oxford
2) Senior researchers from the University of Southampton. Southampton researchers will be required to complete training and sign the relevant agreements to be able to access the data on the University of Oxford secure environment. No individual-level study data can leave this environment, and all aggregated results data is reviewed prior to export.

The Research Group at the University of Oxford has conducted a risk assessment of the physical security of the offices and servers where patient level data is kept. The Research Group of Department of Clinical and Experimental Medicine at the University of Oxford has worked with routinely collected healthcare data in several research and evaluation projects over the last 15 years. The Research Group works within the Research and Information Governance team at the University of Oxford.

No data is stored outside of the secure computer system hosted at the University of Oxford.


Establishing the impact of the national VTE prevention programme on post-operative VTE rates in England — DARS-NIC-195793-R5Y3H

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-10-21 — 2022-10-20 2020.03 — 2020.03.

Access method: One-Off

Data-controller type: ROYAL COLLEGE OF GENERAL PRACTITIONERS, UNIVERSITY OF SURREY

Sublicensing allowed: No

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. Civil Registration - Deaths
  3. HES:Civil Registration (Deaths) bridge
  4. Civil Registration (Deaths) - Secondary Care Cut

Objectives:

The University of Surrey and the Royal College of General Practitioners (RCGP) are working together as joint data controllers to look at preventing venous thromboembolism (VTE).

The Royal College of General Practitioners’ legitimate purpose in using the data is to provide information and analysis on general practice data, covering both disease and workload.
Project specific: To understand the impact of mandatory venous thromboembolism (VTE) risk assessment on the incidence and outcomes of VTE after surgery.
The benefit to patients is an improved understanding of the impact of mandatory venous thromboembolism (VTE) risk assessment on the incidence and outcomes of VTE after surgery (Patient Care).
There are benefits to national bodies through the:
• Provision of national surveillance information for Public Health England
• Provision of workload and workforce breakdown (influence policy) for NHS England
• Project specific: To understand the impact of mandatory venous thromboembolism (VTE) risk assessment on the incidence and outcomes of VTE after surgery (Policy implications)

The processing of the data will help the study to provide surveillance services based on general practice electronic healthcare records.
Project specific: data from general practice is required to fill in the gaps in the current understanding of the incidence and outcomes of mandatory VTE risk assessment after surgery as currently much of the existing information is from secondary care. RSC data is required to identify where VTE has occurred once a patient has left hospital. No additional processing outside of what is required is approved by the RSC and all amendments have been made to ensure that data is processed in the least intrusive way possible while still enabling the purpose of the RCGP RSC.

Data is pseudonymised as close to the source as possible; the RCGP RSC does not hold or process any identifiable personal data.
There are no existing relationships that can be identified with the individuals whose data is processed. All data is pseudonymised.

In this application, the University of Surrey will be studying a comparable-sized population of about 3 million individuals from the Royal College of General Practitioners Research & Surveillance Centre (RCGP RSC) database. Over seven years they expect to see about 450 VTE events from approximately 78,000 surgical procedures (i.e. prior to mandatory screening). With 61,506 surgeries before, and 61,506 surgeries after, the introduction of VTE guidelines, there is 80% power to detect a 10% reduction in VTE events from 23.7/1000 patient-years to 21.3/1000 patient-years.
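
The application does not say how the power figure was derived. As a rough plausibility check only, the sketch below uses the standard normal-approximation formula for comparing two proportions. It treats the two rates as simple proportions and assumes a two-sided 5% significance level; both are assumptions, not details taken from the application, and the function name is illustrative.

```python
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-proportion z-test.

    Uses the unpooled-variance normal approximation. The alpha level is an
    assumption; the application only states the power (80%).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2


baseline_rate = 23.7 / 1000  # rate before mandatory risk assessment
reduced_rate = 21.3 / 1000   # roughly a 10% reduction
required = n_per_group(baseline_rate, reduced_rate)
print(round(required))
```

Under these assumptions the result is on the order of 60,000 per group, the same order of magnitude as the 61,506 quoted. The remaining difference may reflect a different formula or correction used by the applicants.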

The concept of using preventive measures against venous thromboembolism (VTE, also known as 'blood clots') in specific at-risk groups is well established (Haut et al, 2013). There are significant risks to medications that reduce clotting of the blood and so determining the risk-to-benefit ratio is essential to ensure that prevention is targeted appropriately. The National VTE prevention programme was launched in 2010 with the introduction of mandatory VTE risk assessment of all adults on admission to hospital. This was supported by NICE guidelines.

Where patients are at increased risk of VTE (i.e. the risk is NOT outweighed by risk factors for bleeding), then NICE recommend mobilisation of the patient as soon as possible, medicines to limit clotting and compression stockings. Patients are also given information on the risk of blood clots and discharge planning includes relaying this information to other care-givers.

The risk of VTE persists for up to 12 months after surgery, and is particularly high in the first three months (Kearon, 2003; Sweetland et al, 2009). This risk was estimated before the era of Enhanced Recovery After Surgery (ERAS), which may change the natural history of the disease. ERAS is a package of care that means a patient’s stay in hospital is much shorter than it once was. In the modern era, in the UK the length of stay for bariatric surgery is less than three days (Awad et al, 2014) and for thyroidectomy is just two days (Perera et al, 2014), for example. Studies evaluating HES have estimated postoperative VTE rates in varicose vein (Sutton et al, 2012), urological (Dyer et al, 2013) and orthopaedic surgery (Jameson et al, 2010). However, data from hospitals (Hospital Episode Statistics, HES) by itself is limited to capturing in-hospital adverse events and those recorded during readmission. It is evident that the risk of VTE persists well beyond discharge from hospital. For instance, a study by Bouras and colleagues (2015) showed that a large proportion of postoperative VTE was detected in primary care. Linkage to primary care electronic health records and mortality data will allow for a more accurate perspective of a patient’s entire postoperative course. Mortality is obviously a key clinical outcome after surgery and would be recorded in hospital-derived data, but when it occurs in the community it has been shown not to be well recorded through clinical coding in primary care.

Linkage between NHS Digital data and the primary care record will be made via a pseudonymised NHS number. No patient identifiable information will be seen or used by Apollo Medical Software Solutions (the company that facilitates data extraction at the GP surgeries) or anyone at University of Surrey and the RCGP.

The aim of this study is to examine the impact of mandatory VTE risk assessment (introduced in 2010) on the incidence of VTE after general surgery and major orthopaedic surgery.

Patients undergoing one of twelve general surgical procedures will be chosen. Limiting to twelve operations allows the study to standardise for operative duration, likelihood of postoperative immobilisation etc. These procedures represent the majority of emergency and elective general surgical operations in UK hospitals. In terms of the number of years of data required, it is necessary to have data over such a long period because in the paper by Bouras et al (PLoS ONE 2015) there were 981 VTE events captured within 90 days of surgery, in 168005 procedures, from a background population of ~2.9 million people over 15 years (23.7/1000 patient-years). The period of time that the Bouras study relates to was 1997 to 2012. Importantly, this crosses the introduction of mandatory VTE risk assessment and so the paper cannot describe the effect of mandatory screening.
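
The figure of 23.7/1000 patient-years can be reproduced from the counts quoted above if each procedure is assumed to contribute 90 days of follow-up. That reading is an assumption about how the rate was derived, not something the application states. A minimal check:

```python
events = 981             # VTE events within 90 days of surgery (Bouras et al)
procedures = 168_005     # procedures in the same study
follow_up_days = 90      # assumed follow-up contributed by each procedure
days_per_year = 365.25   # assumed year length

# Total observation time in patient-years, then events per 1000 patient-years.
patient_years = procedures * follow_up_days / days_per_year
rate_per_1000_py = 1000 * events / patient_years
print(f"{rate_per_1000_py:.1f} per 1000 patient-years")  # prints 23.7 per 1000 patient-years
```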

Orthopaedic sub study
It is hypothesized that an individual’s VTE risk after hip or knee surgery can be modelled with the use of a mathematical prediction model.
Study Objectives: To develop a model that predicts the risk of VTE in patients who undergo total hip arthroplasty (THA, 'hip replacement') or total knee arthroplasty (TKA, 'knee replacement') surgery. This will be based upon data of the clinical characteristics of the individual as well as data of the operation itself and routinely collected hospital biochemical (laboratory) data.

The Research question is therefore: What is the optimal prediction model for VTE risk following THA and TKA surgery?

Expected results and influence in society: Current strategies to prevent blood clots are one-size-fits-all, i.e. applied to all patients who undergo THA and TKA. These are not optimal because patients vary hugely in their ability to form blood clots. Therefore a new strategy, i.e. advice on an individual basis, is necessary to reduce VTE, bleeding complications and costs. A prediction model should be able to reach a discriminative value (area under the curve) of at least 0.7 with a sensitivity of 75% (in other words, it detects at least 75% of those with blood clots) and a specificity of 50% (in other words, it correctly rules out at least half of people who do not have blood clots). Ideally, three risk groups could be identified according to the prediction model: a low- (60% of the total), intermediate- (30%) and high-risk (10%) group. These risk groups could consequently be used to optimize strategies to prevent blood clots (thromboprophylaxis).
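
To make the sensitivity and specificity targets concrete, the sketch below computes both from a table of true/false positives and negatives and checks them against the 75% and 50% thresholds. The counts are invented purely for illustration and the function names are hypothetical. The area-under-the-curve target cannot be checked from a single table like this and is not covered here.

```python
def sensitivity(true_pos, false_neg):
    """Share of patients with VTE that the model flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of patients without VTE that the model correctly leaves unflagged."""
    return true_neg / (true_neg + false_pos)


def meets_targets(true_pos, false_neg, true_neg, false_pos,
                  min_sensitivity=0.75, min_specificity=0.50):
    """Check the two thresholds named in the application."""
    return (sensitivity(true_pos, false_neg) >= min_sensitivity
            and specificity(true_neg, false_pos) >= min_specificity)


# Invented counts, for illustration only.
ok = meets_targets(true_pos=30, false_neg=10, true_neg=500, false_pos=460)
print(ok)
```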

For patients in the low-risk group (VTE risk <0.5%), thromboprophylaxis could be limited to in-hospital preventative treatment only, resulting in lower costs and fewer bleeding events. For patients in the intermediate group (0.5-1.0%), current thromboprophylaxis policies (lasting for 2 to 4 weeks) may be sufficient; while patients in the high risk group (>1.0%) could potentially benefit from an extended period (or higher dosage) of thromboprophylaxis.
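
The three bands above can be written as a simple classification rule. This is a sketch only: the function name is hypothetical, and placing risks of exactly 0.5% and 1.0% in the intermediate group is an assumption, since the text does not say how boundary values are handled.

```python
def risk_group(predicted_risk):
    """Map a predicted VTE risk (as a proportion, e.g. 0.004 for 0.4%) to a band.

    Boundary handling at exactly 0.5% and 1.0% is an assumption.
    """
    if predicted_risk < 0.005:
        return "low"
    if predicted_risk <= 0.010:
        return "intermediate"
    return "high"


for risk in (0.003, 0.007, 0.015):
    print(f"{risk:.1%} -> {risk_group(risk)}")
```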

However, before such a tailored strategy can be implemented in clinical practice, an additional impact-analysis has to be performed that measures the validity of the prediction model, and the usefulness in clinical practice. This current study will form the basis for this approach.

For the orthopaedic sub study in this application, the two most common elective major orthopaedic surgeries (hip and knee replacement) have been chosen. In England and Wales (population 58 million) there are approximately 160,000 total hip and knee replacement procedures performed each year. From the population of 3 million in the RCGP RSC database, it would be expected that about 8300 surgeries occur per year (in other words, about 17,000 over the two years requested). In patients who undergo total hip arthroplasty (THA) or total knee arthroplasty (TKA), 3.7% and 2.7% of patients will develop symptomatic VTE, respectively, despite use of preventative low-molecular-weight heparin (a drug used to thin the blood). This is considered the minimal data necessary upon which to build a predictive algorithm for postoperative VTE.
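
The scaling behind these estimates can be reproduced directly. The sketch assumes the RSC population has the same procedure rate as England and Wales as a whole. The final two lines bracket the number of symptomatic VTE events by applying the TKA and THA percentages to all procedures; that is an illustrative simplification, because the split between hip and knee procedures is not given.

```python
national_population = 58_000_000
national_procedures_per_year = 160_000
rsc_population = 3_000_000
years_requested = 2

# Scale the national procedure rate down to the RSC population.
expected_per_year = national_procedures_per_year / national_population * rsc_population
expected_total = expected_per_year * years_requested

# Illustrative bracket only: all-TKA (2.7%) versus all-THA (3.7%).
low_events = expected_total * 0.027
high_events = expected_total * 0.037

print(round(expected_per_year), round(expected_total))
print(round(low_events), round(high_events))
```

The unrounded results are slightly below the "about 8300" and "about 17,000" quoted above, which appear to be rounded figures.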

Sanofi provide a research grant for this study and have no obligation to provide any other support for the study. University of Surrey is responsible for the initiation, management and conduct of the study. The Parties acknowledge that nothing in the funding agreement is provided as or intended to be an inducement to prescribe, purchase, recommend, use, or dispense any of Sanofi’s or its Affiliates’ products. University of Surrey is performing the study independently of Sanofi. Sanofi will have no control of nor in any way contribute to the conduct of the Study.

Expected Benefits:

The results are expected to inform the evaluation of the NHS policy on VTE risk assessment and thromboprophylaxis in the surgical populations studied. Understanding the impact of the VTE prevention programme and consequent VTE rates following surgical procedures will identify areas with scope for further improvement. Expected benefits will include length of stay, costs, complication rate after surgery and patient satisfaction. As an example, the orthopaedic substudy will enable VTE risk stratification of patients undergoing joint replacement (currently all are considered high risk). This will enable delivery of thromboprophylaxis only to those at very high risk (anticipated to be 50% of patients). Avoiding thromboprophylaxis in those at low risk will minimise adverse effects such as surgical site bleeding/ooze which predisposes to infection, which can adversely impact patient quality of life immediately post operatively and in the long term. Additionally, this represents a significant cost saving to the NHS. This will help inform best practice and guideline development in the continuum of care for joint replacement, as well as in general surgery.

The data will be made immediately available to the National VTE Programme Board at NHS England (via a co-applicant of this study) - who is also a Director of the National VTE Exemplar Centres Network.

Outputs:

There are four key audiences for this research, these are:
A. patients, the public, and health care practitioners
B. commissioning organisations (such as Clinical Commissioning Groups and NHS England)
C. external statutory organisations (such as Department of Health, NHS Information Centre, NICE)
D. academia

The outputs will be in the form of aggregate data tables, graphs, reports and papers for publication, with any small numbers suppressed (in line with the HES Analysis Guide).
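
As an illustration of small-number suppression in an aggregate table, the sketch below masks any non-zero count at or below a threshold. The threshold of 5, the marker and the function name are assumptions made for the example; the actual rules are those set out in the HES Analysis Guide, which may also require steps (such as rounding) not shown here.

```python
def suppress_small_numbers(counts, threshold=5, marker="*"):
    """Replace non-zero counts at or below `threshold` with `marker`.

    The threshold and marker are illustrative assumptions, not the
    HES Analysis Guide's definitive rules.
    """
    return {
        label: (marker if 0 < count <= threshold else count)
        for label, count in counts.items()
    }


# Invented counts, for illustration only.
table = {"Procedure A": 120, "Procedure B": 3, "Procedure C": 0}
print(suppress_small_numbers(table))
```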

• The University of Surrey will work with the local Academic Health Science Network, who will advise and support routes for dissemination to the public.
• Outputs to the public will be made via University of Surrey, University of Leiden and King’s College Hospital Twitter feeds, Facebook and the media offices. Results of the study will be posted on www.clininfo.eu and University of Surrey webpages.
• Publications including Full, Executive Summary and Plain English summary reports of the research will be made in peer-reviewed journals and local NHS newsletters. Journals may include, but are not limited to: JAMA Surgery, BMJ, British Journal of Haematology, Journal of Thrombosis and Haemostasis, Thrombosis Research.
• Wherever possible, publication will be made using a Creative Commons Licence. This will allow the report to be downloaded free of charge. Publications are made available on the University of Surrey library page, academics’ webpages and Researchgate.net.
• Presentations at national and international haemostasis and perioperative conferences.
• A report of the study will be written for Sanofi (funding body).
• There is a website for the National VTE prevention programme: vteengland.org.uk where the study will promote the findings.
• Outcomes from the research will be included in future iterations of the guide to achieving CQUIN targets by King’s Thrombosis Centre, in conjunction with VTE Exemplar Centres. http://www.kingsthrombosiscentre.org.uk/kings/Delivering%20the%20CQUIN%20Goal_2ndEdition_LR.pdf
• A co-applicant is the Director of the King's Thrombosis Centre and a Senior Medical Advisor to the National VTE Prevention Programme in England. Through this channel, the research outcomes will influence Department of Health and NICE guidance for thromboprophylaxis.

Expected Output of Research/Impact
OUTPUTS
1. An understanding of the effect of mandatory VTE risk assessment, introduced in 2010
2. A risk prediction tool for VTE after orthopaedic surgery.

IMPACT
The approach to research and dissemination will:
• Potentially reduce NHS costs through better assessment of VTE risk and through more accurate understanding of thrombosis risk after hospitalisation.
• Provide findings to enhance the current evidence base for quality indicators and commissioning practices enabling commissioners and providers to make evidence based decisions to ensure maximum benefit to patients and the NHS
• Contribute to national debates on the role of VTE thrombo-prophylaxis in driving forward improvements in patient care.

Submission of manuscripts will be targeted for the end of 2019 / Spring 2020.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

The study will only use and store pseudonymised information extracted by an approved third party provider, Apollo Medical Software Solutions.

Each unique patient will be de-identified using a computer generated patient ID which could only be retraced by staff of the participating GP practices.

The research team at University of Surrey will not view patient identifiable information in any form.

Linkage between NHS Digital and the primary care data will be made via the pseudonymised NHS number.

Apollo generates the hash key which then de-identifies all the patients in the server. This is passed onto the University of Surrey.

University of Surrey will transfer the ‘hash’ algorithm to NHS Digital via Secure Electronic File Transfer (SEFT). The hash algorithm is a one-way encryption and cannot be reversed, so there is no ability for the pseudonymised data to be re-identified by University of Surrey.
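
The agreement does not name the hash algorithm used. As a minimal sketch of how a salted one-way hash allows two parties to link records without exchanging NHS numbers, the example below uses HMAC-SHA256 from the Python standard library. The algorithm choice, the salt value, the function name and the NHS numbers are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(nhs_number: str, salt: bytes) -> str:
    """Return a keyed one-way hash (HMAC-SHA256, hex) of an NHS number.

    HMAC-SHA256 is an assumed stand-in; the real algorithm is not specified.
    """
    normalised = nhs_number.replace(" ", "")
    return hmac.new(salt, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical salt and NHS numbers, for illustration only.
example_salt = b"hypothetical-secret-salt"
id_from_gp_record = pseudonymise("000 000 0000", example_salt)
id_from_hes_record = pseudonymise("0000000000", example_salt)
id_other_patient = pseudonymise("0000000001", example_salt)

# The same patient hashes to the same value in both sources, so records can
# be joined on the hash alone; different patients get different values.
print(id_from_gp_record == id_from_hes_record)
print(id_from_gp_record == id_other_patient)
```

Because both sides apply the same function with the same salt, the linkage key matches across datasets, while a party that holds only the hashed values and not the salt cannot recompute the mapping from NHS numbers.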

Record-level HES data, pseudonymised at source using the ‘hash’ algorithm, will be downloaded via SEFT by the Research Group at the University of Surrey for linkage.

Pseudonymised record-level HES data will be processed and stored by the Research Group at the University of Surrey.

Patient level databases are held in the database server within the Research Group’s secure network. The Research Group is made up of staff substantively employed by University of Surrey. The Research Group’s dedicated secure network is sited behind a firewall within the University’s network. It is a standalone, independent network: all inbound connections are blocked, but outbound connections are allowed. All staff members of the Research Group working within the team base work from secure workstations or secure laptops with encrypted drives within this secure network.

The Research Group has conducted a risk assessment of the physical security of the offices and servers where patient level data is kept; a copy of the risk assessment can be accessed at:
https://clininf.eu/wp-content/uploads/2017/02/Risk-Assessment-of-physical-security-V3.1-2016_18-signed.pdf
A more recent review was carried out on the 2nd May 2019 which will soon be published.

The hashed data provided by NHS Digital for this study will be downloaded by the Research Group. The Research Group will not have access to the identifiable data or the SALT key used for encryption.

The University of Surrey will make no attempt to re-identify the data extract provided by NHS Digital under this agreement.

The GDPR legal basis for the data processing is 'public interest', as medical research.

There will be no additional data linkage undertaken with NHS Digital data provided under this agreement that is not already noted in the purpose.

Data will only be accessed and processed by substantive employees of the University of Surrey and will not be accessed or processed by any other third parties.


Project 6 — DARS-NIC-203503-X7K8K

Opt outs honoured: N ()

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC

Purposes: ()

Sensitive: Non Sensitive, and Sensitive

When:2017.03 — 2017.05.

Access method: One-Off

Data-controller type:

Sublicensing allowed:

Datasets:

  1. Hospital Episode Statistics Admitted Patient Care
  2. Hospital Episode Statistics Accident and Emergency
  3. Hospital Episode Statistics Outpatients
  4. Office for National Statistics Mortality Data

Objectives:

The Imperial College study team have recorded baseline characterisation of approximately 30,000 Indian Asian men and women aged 35-74 years and free from clinically manifest cardiovascular disease (CVD), in the London Life Sciences Prospective Population (LOLIPOP) study. LOLIPOP aims to precisely calculate the increased vascular risk for British Asians. Health economic analysis of the introduction of the CVD risk prediction calculator for use in Indian Asians will be performed as well as a qualitative study to evaluate the utility and acceptability to general practitioners and individuals of implementing the CVD risk prediction model in general practice.

In parallel, the University of Surrey will develop models and conduct an economic evaluation to examine the cost-effectiveness of using the new risk estimator to detect the number of Asian men at risk. This includes the costs of identifying the cohort using the new risk estimator and putting them in a preventative scheme, and the benefit, both in terms of improved health outcomes and associated reduced health care costs.

Expected Benefits:

The high CVD morbidity and mortality amongst the Asian population compared to Europeans represents a significant health inequality which needs to be explored, explained and addressed. Currently the precise risk is not known, so the cost-effectiveness of a possible greater intensity of cholesterol, blood pressure and other interventions cannot be defined. Inclusion of enhanced treatment in national and international guidelines generally requires demonstration of cost-effectiveness. By precisely calculating risk the University of Surrey will enable cost-effectiveness of any enhanced intervention to be determined.
The current recommended method for risk prediction is NOT adequate for this group, and uncertainty of risk generally leads to standard guidelines being applied; the consequent under-treatment widens the inequalities in CVD outcomes for this population. Some patients may also be inappropriately over-treated where individual clinicians approximate the additional risk.
This study has received a large investment from the NIHR, through a competitive, peer reviewed application process, to produce results of the highest standards to ensure this issue is addressed. The results will be used to derive a new model for CVD prediction for British Asians and this will be disseminated into routine clinical care.
This research will result in clinicians being able to make informed decisions on how aggressively to treat this group as a whole, or specific subgroups (e.g. people with diabetes). Preventative treatment will benefit health care both in terms of improved health outcomes and associated reduced health care costs.

Outputs:

All outputs will be aggregate with small numbers suppressed in line with the HES Analysis Guide.

The outputs from this research will be published in major scientific journals. Target journals include the Lancet and New England Journal of Medicine. It is anticipated that the outputs will directly impact national guidelines in the preventative management regimes implemented for public health as well as in primary and secondary care. This is likely to be in place within two years of publication.

Outputs will also directly impact the treatment of the study participants as well as the needs of the west London community for education and service development.

There have already been many publications from the LOLIPOP study team, including:
1. Coronary heart disease in Indian Asians. Tan ST, Scott W, Panoulas V, Sehmi J, Zhang W, Scott J, Elliott P, Chambers J, Kooner JS. Glob Cardiol Sci Pract. 2014 Jan 29;2014(1):13-23. doi: 10.5339/gcsp.2014.4. Collection 2014. PMID: 25054115
2. Prevalence of coronary artery calcium scores and silent myocardial ischaemia was similar in Indian Asians and European whites in a cross-sectional study of asymptomatic subjects from a U.K. population (LOLIPOP-IPC). Jain P, Kooner JS, Raval U, Lahiri A. J Nucl Cardiol. 2011 May;18(3):435-42. doi: 10.1007/s12350-011-9371-2. Epub 2011 Apr 9. PMID: 21479755
3. Ethnicity-related differences in left ventricular function, structure and geometry: a population study of UK Indian Asian and European white subjects. Chahal NS, Lim TK, Jain P, Chambers JC, Kooner JS, Senior R.
4. A replication study of GWAS-derived lipid genes in Asian Indians: the chromosomal region 11q23.3 harbors loci contributing to triglycerides. Braun TR, Been LF, Singhal A, Worsham J, Ralhan S, Wander GS, Chambers JC, Kooner JS, Aston CE, Sanghera DK. PLoS One. 2012;7(5):e37056. doi: 10.1371/journal.pone.0037056. Epub 2012 May 18. PMID: 22623978

Processing:

The University of Surrey are conducting a first full follow up of the participants in the LOLIPOP study and therefore need access to data from all the patients from the cohort that have been in the study for the past 10 years.

Both HES and ONS data will be linked to cohort data to maximize the identification of their CVD outcomes (stroke, advanced coronary artery disease and myocardial infarction) to allow a more rigorous evaluation, particularly as many people may have moved away from northwest London.

NHS Digital will use the consented cohort already flagged under MR1143 to link to the requested data. The University of Surrey would receive a pseudonymised output from the HSCIC, which will be encrypted so re-identification cannot take place.

No record level data will be provided to third parties and none of the data will be used within any commercial tool or product or for commercial gain.

Only substantive employees of the University of Surrey will have access to the data and only for the purposes described in this document.