NHS Digital Data Release Register - reformatted

Ignite Data Limited

Project 1 — DARS-NIC-297783-V4P6H

Opt outs honoured: No - data flow is not identifiable (Consent (Reasonable Expectation))

Sensitive: Non Sensitive, and Sensitive

When: 2021/01 — 2021/02.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 – s261(2)(c)

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Critical Care

Objectives:

Healthcare resource utilisation (HCRU) is the quantifiable measure of a person’s use of services for the purpose of preventing and curing health problems and promoting and maintaining health and wellbeing. Through systematic review, the disease burden experienced by both the patient and their healthcare providers can be assessed. There are two objectives for the processing: First objective: to assess the feasibility of using routine healthcare data to collect secondary care healthcare resource utilisation data (all cause and COPD related) in clinical trials using HES data for patients consented into the INTREPID study. The study will describe the recording (available in HES, yes/no) and completeness (% non-missing values) of different components of secondary care HCRU and the ability to apply Healthcare Resource Group (HRG) tariffs to these where possible. Second objective: to use the HES data to summarise HCRU and costs using HRG tariffs for COPD patients (all cause and COPD related) on inhaled triple therapy for patients consented into the INTREPID study. (HRG is analysed using the latest publicly available National Cost Collection: https://improvement.nhs.uk/resources/national-cost-collection/) This study will consider the following HCRU and cost outcomes (all cause and COPD related): • Secondary care outpatient attendance • Hospital admission • ICU admission during hospital admission • A&E attendance • Secondary care costs Background to the request The highly controlled conditions of a randomised clinical trial (RCT) remove factors that influence and differentiate the use of medicines in everyday clinical practice. Effectiveness data generated in the broader population observed in an everyday clinical setting is increasingly being recognised as important in complementing data derived from the pivotal Phase III safety and efficacy studies.
The primary purpose of the wider study is to assess the effectiveness of TRELEGY ELLIPTA relative to non-ELLIPTA Multiple Inhaler Triple Therapies (MITT) for Chronic Obstructive Pulmonary Disease (COPD) control within the usual clinical practice setting. This study was conducted once TRELEGY ELLIPTA had been approved and is available commercially. Primary care Electronic Medical Records (EMR) do not accurately and reliably capture complete information about hospital attendances and admissions. Therefore, to assess the full spectrum of healthcare resource utilisation by individual patients consented into the INTREPID study, additional information is required. The data requested will enable research to be carried out to better understand the service impact on the National Health Service (NHS) for patients; some examples have been provided below: • A better understanding of A&E attendances relating to acute exacerbations of COPD. • Understanding which admissions represent serious events. By understanding the above the data will enhance the understanding of a patient’s experience of the study drug by providing additional information, not readily available elsewhere, on their health during the trial period. In addition, primary and available secondary healthcare resource utilisation data including prescriptions associated with COPD and related medical conditions will be collected in the electronic Case Report Form (eCRF) by the investigator and study-site personnel for all participants and combined with the Hospital Episode Statistics (HES) datasets being applied for in this agreement. Visits and contacts that are due to a moderate or severe COPD exacerbation will be assessed and recorded. The data collected within the eCRF for the wider study will include: • Primary healthcare contacts • Medications • Hospital admissions, outpatient appointments and A&E attendances Organisations involved • GlaxoSmithKline (GSK): The sponsor for the INTREPID study. 
GSK wish to receive the final pseudonymised cut of study data, including HES data provided by NHS Digital. GSK are both a data controller and processor for this study. • Ignite Data (IGNITE): IGNITE will have the explicit consent of participants for receiving their HES data linked to their unique study ID for processing the data on GSK’s behalf and pseudonymisation prior to delivery for research. IGNITE will have access to the record level data to complete this work and this has been made clear to the Participants in the study information sheet and Informed Consent Form (ICF) the Participants complete. IGNITE will destroy the data once data processing has been completed. Cohort Number of participants in England = 835 (this will be the maximum cohort size). The cohort was recruited to the main randomised controlled trial (RCT) element of the study. Each patient has a current diagnosis of Chronic obstructive pulmonary disease (COPD) and was eligible for a triple therapy treatment. Each individual patient was recruited via the study site, which for the most part was their local GP. The GP Practices were selected from across the country. In a small number of cases the patient may have been referred from another local GP Practice or referred to an alternative local site (e.g. a hospital). Each patient in the cohort was then consented and managed by this local study site. The RCT data was then collected at a series of standard study visits at the patient’s specific study site. Data Summary GSK are requesting IGNITE receive access to record-level identifiable data from the following Hospital Episode Statistics (HES) datasets: • HES Admitted Patient Care • HES Accident and Emergency • HES Critical Care • HES Outpatients The data requested will enhance the analysis of Healthcare Resource Utilisation by supplementing what has been located in the patient’s primary care records and what can be feasibly collected via eCRFs.
As detailed above, this will enable research into the primary and secondary objectives. The application is commercially funded by GSK. The data is only being used for the purposes stated above. GSK has a corporate policy to publish results of all research, regardless of whether they reflect positively or negatively on their medicines. https://www.gsk.com/en-gb/research-and-development/trials-in-people/data-transparency/ General Data Protection Regulation Article 6 (1) (f) Legitimate interests as this is for the benefit of science and public health. All patients have been fully informed of the data processing and the organisations carrying out the data processing. Each individual has provided explicit informed consent. The research provides wider public benefit and the risks to the individuals are low. As special category data, the data is further being processed under article 9(2)(j) (processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes). This is because the consented data will be used to enhance the potential outcomes and improve care, which is in the public interest. The patient ICF makes reference to this specific healthcare resource utilisation work in the ‘Use of National Healthcare Data’ section.
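The two objectives above (describing completeness as % non-missing values, and costing HCRU by applying HRG tariffs) can be illustrated with a minimal Python sketch. This is not the study's analysis code: the field names, the HRG codes `HRG_A` and `HRG_B`, and the unit costs are hypothetical placeholders, and real unit costs would be taken from the National Cost Collection.

```python
# Hypothetical episode records; field names and HRG codes are illustrative only.
episodes = [
    {"study_id": "S001", "hrg_code": "HRG_A", "admission_date": "2019-01-04"},
    {"study_id": "S001", "hrg_code": None, "admission_date": "2019-03-11"},
    {"study_id": "S002", "hrg_code": "HRG_B", "admission_date": None},
    {"study_id": "S003", "hrg_code": "HRG_A", "admission_date": "2019-06-20"},
]

# Hypothetical unit costs in GBP (assumption, not real tariff values).
unit_costs = {"HRG_A": 3000.0, "HRG_B": 2000.0}


def completeness(records, field):
    """Return the percentage of records with a non-missing value for `field`."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(field) not in (None, ""))
    return 100.0 * present / len(records)


def total_cost(records, costs):
    """Sum unit costs over records whose HRG code is present and has a known cost."""
    return sum(costs[r["hrg_code"]] for r in records if r.get("hrg_code") in costs)


hrg_completeness = completeness(episodes, "hrg_code")
cohort_cost = total_cost(episodes, unit_costs)
```

Records with a missing or unrecognised HRG code are left out of the cost total in this sketch, which is why completeness would be reported alongside any cost figure.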

Expected Benefits:

A full picture of the real-world implications of COPD and its treatment can only be formed when all interactions with the healthcare services can be appraised. The results from this study will provide insight into how real-world patients interacted with the UK healthcare system whilst they participated in this study. The dissemination of this information will be included into the wider outputs of the INTREPID study, where the objectives are to assess the effectiveness of TRELEGY ELLIPTA relative to non-ELLIPTA Multiple Inhaler Triple Therapies (MITT) for Chronic Obstructive Pulmonary Disease (COPD) control within the usual clinical practice setting. Improved understanding of COPD and its treatment will improve patient outcomes and reduce the burden on the UK healthcare service by: • Improving understanding of the workload of healthcare placed on the patient in this population, for example, the frequency of COPD related outpatient visits and how this relates to total frequency of all-cause outpatient visits in patients taking triple inhaled therapy • Improving understanding of serious events in this population, such as hospitalisation for acute exacerbation of COPD in patients taking triple inhaled therapy • Improving ability to understand if the clinical research community can conduct medical research and answer these types of question more effectively and efficiently in the UK using these types of data. As the second most common reason for emergency medical admission in the UK, COPD represents a substantial cost to the NHS. The datasets provided will be analysed to better understand the healthcare resource utilisation of patients included in this study. The dissemination of this information will expand the understanding of disease burden for patients initiating triple inhaled therapy in the UK. Information on cost and healthcare resource utilisation are used by organisations such as the NHS when making decisions on the organisation and delivery of care. 
The information generated from this analysis will, for example, provide valuable information on which aspects of secondary care represent the highest use of healthcare resource in this population. This information is valuable to the NHS as it can be used to determine which aspects of secondary care, if targeted could yield the greatest reduction in overall healthcare resource utilisation and costs. Reduction of overall healthcare resource utilisation and costs in one area, for example, reduction in unplanned COPD admissions means that more resources could be spent on another area, for example on preventative healthcare services. The outputs are integral to the analysis of the Healthcare Resource Utilisation of patients in the study, as they are the most accurate representation of the patient’s interactions with the UK healthcare service. They are also required so that an assessment of data available within pre-existing datasets can be made against that captured within the eCRF.

Outputs:

The wider INTREPID study will produce a study report, which will be published in the public domain. As part of the wider INTREPID study, further submissions to peer reviewed journals, presentations and conferences may be made. At this stage, it is hard to determine the impact of the data processed as part of this application. If anything of significance is identified as part of the objectives, it would form part of the wider INTREPID study publications. Any published results would contain data which has been minimised and aggregated to ensure privacy is maintained. All data in outputs will be published in line with GSK policies. The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement. All outputs will be restricted to aggregate data with small numbers suppressed in line with HES Analysis Guide. In accordance with GSK’s internal policy, all trial results will be shared, regardless of whether they reflect positively or negatively on GSK’s medicines. GSK posted information about the INTREPID study on a publicly accessible register (ClinicalTrials.gov - https://clinicaltrials.gov/ct2/show/NCT03467425) before it started and will update it with a result summary after the study is finished. GSK seek the publication of all results of all clinical trials in peer-reviewed scientific journals and the INTREPID study and any relevant result found as part of the data processing activities in this application will be no different. The last patient last visit for the INTREPID study was October 2019. The data analysis of the real-world data captured during the study has begun. At present, it is estimated that initial results may be published in mid to late 2021. GSK has published policies for sharing outputs on its website, found at: https://www.gsk.com/media/2946/disclosure-of-clinical-trial-information-policy.pdf.
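The small-number suppression mentioned above can be sketched in Python as follows. The threshold of 8 and the `*` marker are assumptions chosen for illustration; the applicable rule should be taken from the current HES Analysis Guide, and the function name is hypothetical.

```python
def suppress_small_numbers(counts, threshold=8, marker="*"):
    """Replace counts from 1 up to (but not including) `threshold` with `marker`.

    The default threshold is an illustrative assumption, not a value
    stated in this agreement. Zero counts are left unchanged here.
    """
    return {
        key: (marker if 0 < value < threshold else value)
        for key, value in counts.items()
    }


# Hypothetical aggregate counts by dataset.
raw_counts = {"A&E": 3, "APC": 12, "CC": 0, "OP": 8}
published = suppress_small_numbers(raw_counts)
```

Applying suppression as the last step before publication keeps record-level and unsuppressed aggregate values inside the analysis environment.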

Processing:

GSK request access to the data under the data flows described below:
  1. Participating practices collect consent from patients using the ICF and collect patient identifiers including NHS Number, First Name, Surname, Date of Birth and Postcode.
  2. Practices transfer copies of completed ICFs and a file containing patient identifiers to IGNITE via password-protected email. IGNITE store ICFs and patient data using Cloud Storage provided by Microsoft Azure. For this reason Microsoft Azure are included as a Data Processor and the appropriate data centre locations have been included in this request.
  3. IGNITE then apply a unique Study ID to each participant to form a Subject Log of all participants and associated identifiers.
  4. IGNITE submit the Subject Log containing all cohort data including identifiers to NHS Digital for linkage to HES datasets.
  5. NHS Digital provide HES data for all 835 cohort members to IGNITE via Secure Electronic File Transfer (SEFT). This data will be received by IGNITE and stored on Cloud Storage provided by Microsoft Azure.
  6. IGNITE will link NHS Digital data to pseudonymised clinical trial data using Study ID. IGNITE will remove all patients’ identifiers except for Study ID.
  7. IGNITE will transfer the linked, pseudonymised dataset to GSK for analysis where:
     a. The data will be processed to produce descriptive statistics on the patient cohort. Initially, the feasibility of assessing each component of healthcare resource utilisation (HCRU) will be assessed by tabulating missingness of key variables. Where variables have a low degree of missingness the team will summarise HCRU using means, medians and proportions as appropriate both for COPD related and all cause HCRU.
For example, frequency and mean length of stay in hospital for COPD related and all-cause in-patient hospitalisations to understand admitted patient care; average frequency of respiratory related emergency department attendance to better understand COPD exacerbations resulting in A&E attendance; and average frequency of attendance at outpatient clinics to better understand the use of secondary care services in this patient group. The team will summarise HCRU overall and by important subgroups where there are sufficient numbers. In this study, directly identifiable personal and sensitive personal information is determined to be the Participant Surname, Forename, Date of Birth, NHS Number, Address and Post Code. Ignite legitimately requires this data for administrative purposes and to pass to NHS Digital for Participant record identification purposes. IGNITE have the explicit consent of participants for receiving their HES data linked to their unique study ID for processing the data on GSK’s behalf and pseudonymisation prior to delivery for research. The data will be linked to pseudonymised clinical trial data obtained during a patient’s participation using a code (GSK Study ID). This will enhance the understanding of their Healthcare Resource Utilisation during their study participation. The dataset provided by NHS Digital will be pseudonymised using a study specific identifier by IGNITE prior to sending to GSK for analysis. IGNITE will have access to the record level data to complete this work and this has been made clear to the participant in the Patient Information Sheet (PIS) and Informed Consent Form (ICF) the participants have completed. IGNITE will destroy the data once data processing has been completed, in line with GDPR and Good Clinical Practice (GCP) law. IGNITE and GSK are the only organisations involved in this agreement.
At both GSK and IGNITE data access is restricted to specific research teams and is only accessible to members of these teams via role-based access controls to file shares and servers. All members of these teams are substantively employed by IGNITE or GSK. The data from NHS Digital will not be used for any other purpose other than that outlined in this Agreement. All outputs will be restricted to aggregate data with small numbers suppressed in line with HES Analysis Guide. Microsoft Ltd provide Azure Backup Storage Services for IGNITE Data Limited and are therefore listed as a data processor. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data. No record level data disseminated by NHS Digital will leave England and Wales. NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).
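The linkage and pseudonymisation steps of the data flow described above (linking on Study ID, removing all other identifiers, then summarising with means and medians) can be sketched in Python. This is a minimal illustration under stated assumptions: the field names, the contents of `IDENTIFIER_FIELDS` and the sample records are hypothetical and do not describe the actual HES extract or IGNITE's systems.

```python
from statistics import mean, median

# Assumed names for the direct identifiers listed in this agreement.
IDENTIFIER_FIELDS = {"nhs_number", "first_name", "surname", "date_of_birth", "postcode"}


def pseudonymise(record):
    """Drop direct identifiers, keeping Study ID and non-identifying fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}


def link_by_study_id(hes_records, trial_records):
    """Attach trial fields to each HES record that shares a Study ID."""
    trial_by_id = {t["study_id"]: t for t in trial_records}
    linked = []
    for rec in hes_records:
        trial = trial_by_id.get(rec["study_id"])
        if trial is None:
            continue  # no matching participant; skipped in this sketch
        extra = {k: v for k, v in trial.items() if k != "study_id"}
        linked.append({**pseudonymise(rec), **extra})
    return linked


# Hypothetical sample data with dummy identifiers.
hes_records = [
    {"study_id": "S001", "nhs_number": "0000000000", "postcode": "ZZ1 1ZZ", "length_of_stay": 4},
    {"study_id": "S001", "nhs_number": "0000000000", "postcode": "ZZ1 1ZZ", "length_of_stay": 2},
    {"study_id": "S002", "nhs_number": "1111111111", "postcode": "ZZ2 2ZZ", "length_of_stay": 6},
]
trial_records = [{"study_id": "S001", "arm": "A"}, {"study_id": "S002", "arm": "B"}]

linked = link_by_study_id(hes_records, trial_records)
stays = [r["length_of_stay"] for r in linked]
mean_stay, median_stay = mean(stays), median(stays)
```

Filtering identifiers against a fixed set before any record leaves the linkage step means only Study ID travels with the clinical fields, which mirrors the intent of step 6.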


Project 2 — DARS-NIC-115298-L5X4V

Opt outs honoured: No - data flow is not identifiable (Consent (Reasonable Expectation))

Sensitive: Non Sensitive

When: 2021/04 — 2021/04.

Repeats: One-Off

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Categories: Anonymised - ICO code compliant

Datasets:

  • Hospital Episode Statistics Accident and Emergency
  • Hospital Episode Statistics Admitted Patient Care
  • Hospital Episode Statistics Outpatients
  • Hospital Episode Statistics Critical Care

Objectives:

BACKGROUND The Extended-SLS is a follow-on study to the Salford Lung Studies (SLS), two landmark effectiveness trials of fluticasone furoate / vilanterol (an inhaled corticosteroid combined with a long-acting β2-agonist [LABA] in a single inhaler device) in patients with Chronic Obstructive Pulmonary Disease (COPD) (NCT01551758) and asthma (NCT01706198) which ran from March 2012 to December 2016. The SLS subjects represent patient cohorts that are extremely well-characterised over a short period of their COPD/asthma disease experience. Subjects in the SLS originally consented for information relevant to the study to be shared with the sponsor, GlaxoSmithKline Research & Development Limited (referred to in this application as GlaxoSmithKline or GSK), and these data were limited to three years prior to randomisation and the twelve-month interventional treatment period. This finite period of data coverage limits the potential to address scientific questions of clinical interest related to long-term COPD/asthma disease progression and associated outcomes (for example, mortality). Broadened access to patients’ data would allow SLS subjects’ entire disease journey to be researched, presenting a rare opportunity to use epidemiological research to improve scientific and clinical understanding of COPD/asthma disease risk, treatment and progression. The Extended-SLS seeks to broaden capture of SLS patients’ data through the collection of additional subject-level data encompassing past and periodic future (up to 10 years’ duration from the date of consent) demographic, COPD/asthma risk factors and healthcare-related information from both primary and secondary care, resulting in the creation of the Extended-SLS cohort. Additional patient reported data on early life exposures, impact of disease on sleep, smoking history and other information not typically available from electronic medical records (EMR) will be captured using disease-specific questionnaires.
Secondary care data on hospital admissions and attendances available in NHS Digital’s Hospital Episode Statistics (HES) data are crucial to understanding and researching hospitalised exacerbations of COPD/asthma, comorbidity, long-term safety of COPD and asthma therapies, and healthcare resource utilisation. The Extended-SLS has three organisations involved in the work but only GSK and Ignite are involved in the processing of the HES being requested in this application: • GlaxoSmithKline (GSK): The sponsors of the original SLS and the Extended-SLS. GSK wish to receive the final pseudonymised cut of study data, including the HES data provided by NHS Digital. GSK are a data controller and a data processor in this study. GSK is an international pharmaceutical company and the relevant GSK party for this agreement is a company registered in England and Wales. GSK produces drugs directly relevant to the ailments being studied and holds the international patent to one of the drugs specifically referred to in this agreement. • Ignite Data (Ignite): The contracted research company looking after site management, patient consent and data processing. Ignite will have the explicit consent of Participants for bringing together multiple different data sources captured on each individual study volunteer, specifically including the consent of patients to process their HES and mortality data. Note patients provided consent for use of mortality data; however, this application is not requesting ONS mortality data. • Graphnet Health (Graphnet): The general practitioner (GP) extract provider for the study. Graphnet currently have active data sharing agreements with every GP who is participating in the study for direct care and hold Data Security and Protection (DSP) Toolkit. Study specific agreements exist with each individual GP giving Graphnet permission to extract consented Participant data for this study only.
Graphnet will not be in receipt of any HES data provided to Ignite Data by NHS Digital. Graphnet are not directly involved with this application but are a source of the main study data set. Hospital Episode Statistics will be used to form a longitudinal patient record that GSK wish to build over the lifetime of the study. HES data supplied to Ignite Data and GSK are essential to answer questions in the Extended-SLS relating to: • Healthcare resource utilisation and costs (HRG); primary care electronic medical records do not accurately and reliably capture complete information about hospital attendances and admissions and therefore cannot inform on the full spectrum of healthcare resource utilisation. • Severe exacerbations of COPD and asthma; these are defined, according to global standards, as exacerbations requiring hospital admission (asthma and COPD) and exacerbations requiring emergency care (asthma); validation studies demonstrate that severe exacerbations are not adequately recorded in primary care EMR. • Frailty and disease severity defined based on prior hospitalisations (all-cause) and comorbidities not managed in primary care. • Potential, treatment-related adverse events resulting in hospitalisation. • Impact of treatments and/or disease severity/subtypes on all-cause or COPD- and asthma- related mortality; primary care. (HRG is analysed using the latest publicly available National Cost Collection: https://improvement.nhs.uk/resources/national-cost-collection/) The number of study participants in England in the SLS who have consented onto the study is currently 1,183, with those consenting for NHS Digital sharing their data being 1,121. This is an ongoing study and Ignite Data Limited estimate the cohort to be 1,300 individuals. All participants are 18 years or older. Each study participant has a current diagnosis of COPD or asthma and was previously enrolled on to the Randomised Controlled Trial (RCT) known as the Salford Lung Study.
Each individual participant was recruited via the SLS study site, which in every case was their local GP. The GP Practices were selected from the Greater Manchester Region based on the fact they had participated in the original Randomised Control Trial (RCT). GSK and Ignite Data have many questions that can potentially be answered as a result of learnings from the study. Within GSK’s COPD and Asthma disease areas, these data may help to identify other key areas for GSK to focus their medicines development. By linking NHS Digital’s HES data to the data already held from the SLS study for patients in the Extended-SLS cohort (primary care EHR, disease-specific patient-completed questionnaires, and the SLS trial data), GSK and Ignite Data hope to address a range of research questions. Without access to HES data GSK and Ignite Data will not be able to answer any research questions where data on severe asthma or COPD exacerbations are required as these important adverse outcomes of asthma and COPD can only be reliably identified using discharge data from hospitals. Furthermore, GSK and Ignite Data will not be able to describe the full societal burden of these conditions without access to HES Admitted Patient Care (APC), Accident and Emergency (A&E) and Outpatient (OP) data as these are important sources of health-care resource utilization in asthma and COPD. Research questions which GSK and Ignite Data cannot fully answer without access to the HES data include: • Identifying which specific patient and disease traits can be used to guide clinician’s choice of treatment. • Identifying what impact biomarkers (eosinophils, fibrinogen, etc.) have on (a) the risk of asthma and COPD exacerbations and (b) the risk of pneumonia. • How patients progress from diagnosis to particular treatments, and if there are factors associated with this progression, e.g. potential clusters of comorbid conditions. 
• Identifying if there is a relationship between events pre-COPD or pre-asthma diagnosis (respiratory infections, comorbidities, healthcare-resource utilisation patterns, smoking and body mass index [BMI] dynamic changes) and natural history of COPD and asthma disease progression. • Identifying how smoking status modulates or predicts disease trajectory in COPD. • How healthcare resource utilisation differs in patients with and without exacerbations. • Examining the effect of asthma onset (early vs late onset) on longitudinal healthcare utilisation and asthma management. • Long term symptom control, adherence and outcomes (e.g. exacerbations) for asthma patients using specific treatments. • Investigating whether sentinel events (e.g. exacerbations, a prescription of antibiotics or prednisolone) trigger a change in management, and what healthcare follows a sentinel event. DATA MINIMISATION Ignite Data Ltd requires HES Critical Care (CC) data to accurately ascertain the level of Healthcare Resource Utilisation (HRU) during hospitalisation. Although these are a subset of admitted patient care episodes, critical care is associated with a higher cost. These data are required, for example, to ascertain the number of days of advanced respiratory support for patients ventilated in critical care during an admission for COPD exacerbation. HES Accident and Emergency (A&E) data is required to understand any serious events which resulted in attendance at the emergency department but did not result in admission to hospital. Without these data the study team would not be able to, for example, ascertain episodes of treatment associated with attendance at A&E for acute exacerbations of Chronic obstructive pulmonary disease (COPD), where the patient is seen in A&E without admission.
HES Admitted Patient Care (APC) data is required to ascertain any serious events which resulted in admission to hospital – for example – severe COPD or Asthma exacerbations and other comorbidities; HRU/cost. HES Outpatients (OP) data is required to ascertain secondary care treatment and outcomes, for example, outpatient respiratory clinic appointments where the patient has been referred to the respiratory physician by their GP and associated cost/HRU of this treatment. The data provided has been linked against each consented participant’s Randomised Control Trial (RCT) data via their unique Study ID. Only this pseudonymised dataset will be provided to GSK by Ignite Data Ltd for analysis. Ignite Data Ltd will destroy the data they hold after processing has been completed. As this data is being linked to an RCT dataset and then viewed prospectively for the next few years, only the years relevant to the study period are being requested. If the number of years is reduced any further it will not be possible to complete any beneficial analysis as the dataset would no longer match the study timeline or provide the depth of information required to study the desired outcomes. The data has been narrowed by geography due to the fact that the consented patient list only contains patients from specific English research sites. Due to the condition, all patients will live within a reasonable commute of each study site. As the study is a consented and recruited cohort it is not possible to minimise by demographic information any further. The data cannot be narrowed by clinical factors as without data relating to events outside of the main study condition, Ignite Data Ltd and GSK will not be able to understand all cause HRU. Patient episodes are required to achieve the study purpose.
The patients have consented to GSK having access to relevant data in their whole medical record, only then can the study team truly get a picture of their disease history, its treatment and progression historically and during the prospective period of the study. Elective episodes are also required as without data relating to events outside of the main study condition GSK will not be able to understand all cause HRU. Maternity episodes are not required for analysis. There is a timeframe around the index event required because without the dates relating to events, GSK will not be able to determine whether events occur after the patient’s entry into the study and before study end, therefore during that individual patient’s follow up. The study team will also not be able to determine the pattern or treatment and relevant measures contributing to a developing picture of a patient’s disease progression or management. Further, although record identifiers can be used to link episodes of care into spells, associating spells into super-spells of care (when patients may receive care from more than one hospital or trust) requires dates of admission and discharge. Not all the fields within the HES APC, A&E and CC dataset are required. Only the specific fields required, which have been reviewed by GSK epidemiologist have been selected. For linkage, the data will only be linked to the RCT cohort for the analysis set out in this application. As previously described the dataset is minimised by the recruited study cohort and all episodes on the patients are required to determine all cause HRU. The data is being processed under General Data Protection Regulation Article 6 (1) (f) Legitimate interests as this is for the benefit of science and public health. 
Processing the NHS Digital data will enable us to address specific research questions relating to asthma and COPD including advances in general knowledge of COPD and asthma disease risk and treatment that may alter how clinicians care for their patients and raise awareness of the burden of these on patients and the wider health system. Insight into the longer-term burden of COPD and Asthma (which are common chronic respiratory conditions which place a high burden on patients and society) would be beneficial to the wider medical science community including medicines developers and healthcare providers. As special category data, the data is further being processed under article 9(2)(j) (processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes). This is because the consented data will be used to enhance the potential outcomes and improve care, which is in the public interest.
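The data minimisation argument above notes that associating spells into super-spells requires dates of admission and discharge. A minimal Python sketch of that dependency follows. The merge rule (a spell joins the previous one when it starts within `max_gap_days` of the previous discharge) and the default of one day are assumptions made for illustration and are not a definition taken from this agreement. Field names and sample data are hypothetical.

```python
from datetime import date, timedelta


def build_superspells(spells, max_gap_days=1):
    """Merge a patient's spells when one starts within `max_gap_days`
    of the previous discharge (illustrative rule, see assumptions)."""
    ordered = sorted(spells, key=lambda s: (s["study_id"], s["admitted"], s["discharged"]))
    merged = []
    for s in ordered:
        last = merged[-1] if merged else None
        if (
            last is not None
            and last["study_id"] == s["study_id"]
            and s["admitted"] - last["discharged"] <= timedelta(days=max_gap_days)
        ):
            # Extend the current super-spell and record the extra provider.
            last["discharged"] = max(last["discharged"], s["discharged"])
            last["providers"].append(s["provider"])
        else:
            merged.append({
                "study_id": s["study_id"],
                "admitted": s["admitted"],
                "discharged": s["discharged"],
                "providers": [s["provider"]],
            })
    return merged


# Hypothetical spells: S001 is transferred from Trust A to Trust B on 5 January.
spells = [
    {"study_id": "S001", "admitted": date(2019, 1, 1), "discharged": date(2019, 1, 5), "provider": "Trust A"},
    {"study_id": "S001", "admitted": date(2019, 1, 5), "discharged": date(2019, 1, 9), "provider": "Trust B"},
    {"study_id": "S001", "admitted": date(2019, 3, 1), "discharged": date(2019, 3, 2), "provider": "Trust A"},
    {"study_id": "S002", "admitted": date(2019, 1, 6), "discharged": date(2019, 1, 7), "provider": "Trust C"},
]
superspells = build_superspells(spells)
```

Without the admission and discharge dates the comparison in the merge condition cannot be made, so the two Trust A and Trust B spells in January could not be recognised as one continuous period of care.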

Expected Benefits:

Benefits to the provision of health care

The Extended SLS Study will be one of the key resources used to help researchers:

  • Get a better understanding of who is at risk of developing Asthma and COPD and why the progression of the disease varies from person to person;
  • Explore the anatomy of the diseases to help develop new medicines and enable more accurate diagnosis;
  • Look into how existing drugs which are used to treat other conditions might help to treat the progression of Asthma and COPD and improve symptoms.

The Extended-SLS will enable GSK to conduct methodologically-sound research that answers questions important to COPD and asthma Participants, providing further clarity to the scientific and clinical communities around COPD/asthma disease risk, treatment and progression. The data requested will enhance the analysis of who is at risk of developing Asthma and COPD and allow GSK and the broader scientific respiratory community to answer questions which will ultimately be used to benefit the management of Asthma and COPD patients. Processing of the HES secondary care data to address specific research questions relating to asthma and COPD using the Extended SLS cohort will have broad public benefit, including advances in general knowledge of COPD and asthma disease risk and treatment that may alter how clinicians care for their patients and raise awareness of the burden of these on patients and the wider health system. For example, analysis of the Extended-SLS data may improve our understanding of groups of patients at higher risk of developing asthma and COPD who may benefit from increased monitoring, or provide additional information on specific groups of asthma and COPD patients who are more likely to experience severe exacerbations and may therefore require alternative disease management.
More generally, creation of the Extended SLS cohort will demonstrate that extension studies of RCTs are feasible and that important research questions can be addressed using data that has already been collected for other purposes (e.g. NHS Digital HES). This facilitates GPs’ participation in research without the onerous and time-consuming completion of case report forms, which requires patient data to be extracted from patient records into a research database. Insight into the longer-term burden of COPD and asthma (common chronic respiratory conditions that place a high burden on patients and society) would be beneficial to the wider medical science community, including medicines developers and healthcare providers. We might learn how to better manage these conditions. Patients consenting into this study have an expectation that processing their data may lead to improved healthcare for them and others. Furthermore, sourcing data from EHRs held at medical practices and in centralised healthcare records allows an investigational site within the NHS to participate in medical research at a much-reduced resource cost to the site. Reducing the burden of research at sites would help to increase the amount of medical research that could be conducted in the UK, with obvious benefit to both the public and the healthcare system.

Outputs:

Outputs for the Extended-SLS will include a series of reports, conference abstracts (international and UK conferences including European Respiratory Society, British Thoracic Society, ISPOR, etc.) and published manuscripts in peer-reviewed journals on COPD/asthma disease risk, treatment and progression. GSK is committed to sharing results of impactful research with the wider scientific community, and all research evaluating a medicine, providing important scientific knowledge, and/or which has relevance for patient care is published externally (and on the GSK Study Register: https://www.gsk-clinicalstudyregister.com/) in line with GSK policies. The first publication and piece of research GSK intends to conduct using the Extended-SLS cohort is a descriptive analysis comparing participants in the Extended-SLS with those in the original SLS. The analysis using SLS trial data, Extended-SLS GP data and Extended-SLS data from the disease-specific questionnaires has been completed and is being developed as a manuscript. An abstract on the Extended-SLS study, including results from this initial analysis, was presented at the European Respiratory Society congress in September 2020. Once HES data are available, the Extended-SLS cohort will be further described using HES data. Additional outputs of the Extended-SLS, including reports, presentations, manuscripts, etc., will be determined once recruitment into the Extended-SLS study has completed. Three studies that are proposed to begin from 2021 will explore specific questions relating to asthma control, moderate exacerbations of asthma, and use of ‘maintenance and reliever therapy’ regimens in asthma patients in Ext-SLS, with conference abstracts and publications planned. The audiences for outputs of the wider Extended-SLS will predominantly include researchers, scientists, and clinicians. GSK plans to disseminate information to Extended SLS participants throughout the lifecycle of the study via newsletters to GP sites.
Information will need to be disseminated via GPs as GSK does not have access to participants’ names, addresses or email addresses. Any published results would contain data which has been aggregated with small number suppression applied as per the HES Analysis Guide to ensure privacy is maintained. All data in outputs will be published in line with GSK policies. In accordance with GSK’s internal policy, all trial results will be shared, regardless of whether they reflect positively or negatively on GSK’s medicines. GSK posted information about the EX-SLS study on a publicly accessible register (ClinicalTrials.gov) before it started and will update it with a result summary after the study is finished. GSK seeks the publication of all results of all clinical trials in peer-reviewed scientific journals, and the EX-SLS study and any relevant result found as part of the data processing activities in this application will be no different. Within a year of data receipt, and during the lifetime of the study, there will be multiple outputs: GSK’s aim is to provide updates to annual scientific conferences/congresses as the study progresses and to answer further disease-specific questions generated by the research community inside and external to GSK. The outputs of this study aim to support the following benefits for COPD and asthma patients and HCPs:

- Identifying what other factors are associated with diagnosis of early COPD. This will increase understanding of groups at higher risk for developing COPD and who may benefit from increased monitoring, an alternative treatment paradigm or alternative disease management.
- Identifying what triggers a COPD diagnosis, e.g. are there particular events that lead to an initial diagnosis of COPD? This will improve understanding of the presenting events which often suggest a COPD diagnosis, and so improve patient diagnosis.
- Describing patient progression from diagnosis to therapy with a long-acting bronchodilator (LABD) to triple therapy (comprising an ICS, a LABA and a long-acting muscarinic antagonist [LAMA]), and describing factors associated with this progression, including the role of potential clusters of comorbid conditions. This will mean patient treatment pathways can assist in identifying subgroups of patients who may be over- or under-treated based upon their symptoms and COPD profile.

Processing:

DATA SUMMARY

Pseudonymised* record-level historic Hospital Episode Statistics data extracts linked to a consented cohort of approximately 1,300 extended Salford Lung Study participants. Each patient has provided explicit consent for their personal information identifiers [Participant Surname, Forename, Date of Birth, NHS Number, Address and Post Code] to be provided to NHS Digital.

  • Hospital Episode Statistics (HES) Admitted Patient Care (APC) 2009/10 – 2019/20
  • HES Accident and Emergency (A&E) 2009/10 – 2019/20 M12
  • HES Outpatients (OP) for the periods 2009/10 – 2019/20
  • HES Critical Care (CC) for the periods 2009/10 – 2019/20

As per the consent documentation, the study team can only hold 10 full years of prospective data from the date of consent.

METHODOLOGY

  1. [Outside of this agreement scope] Cohort data originated from Graphnet, collected from participating GPs, and sent to Ignite Data Limited.
  2. Ignite Data Limited applies the GSK Study ID to the cohort, and the cohort identifiers [Participant Surname, Forename, Date of Birth, NHS Number, Address and Post Code] are sent to NHS Digital via Secure Electronic File Transfer (SEFT). It is estimated that the cohort will be 1,300 individuals.
  3. NHS Digital will receive the cohort and link it to the periods of HES data listed above. Identifiers will be removed, and pseudonymised data returned to Ignite Data Ltd via Secure Electronic File Transfer (SEFT).
  4. Ignite Data Ltd will link the NHS Digital pseudonymised data to further pseudonymised RCT participant data via the GSK Study ID, and pass it to GSK for further analysis.

*Pseudonymisation is defined as using a code (GSK Study ID) to replace any directly identifiable personal or sensitive personal information that GSK has no legal requirement, or wish, to hold. The use of this code is to allow GSK to link an individual’s data from different sources over time whilst minimising risk incurred by holding unnecessary Personal and Sensitive Personal information.
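The flow in methodology steps 2 to 4 can be sketched in a few lines. This is an illustration only, not the process used by Ignite Data or NHS Digital: the study ID format, the field names, the function names and the matching on NHS number alone are all assumptions, and the records are synthetic.

```python
from collections import defaultdict

# Hypothetical field names for the direct identifiers listed in the agreement.
IDENTIFIER_FIELDS = ("surname", "forename", "date_of_birth",
                     "nhs_number", "address", "postcode")


def assign_study_ids(cohort):
    """Step 2: attach a study ID to each cohort member (ID format is made up)."""
    return [dict(member, study_id=f"S{i:04d}")
            for i, member in enumerate(cohort, start=1)]


def link_and_pseudonymise(cohort_with_ids, hes_records):
    """Step 3 stand-in: match HES records to the cohort, then drop identifiers.

    Matching on NHS number only is a simplification for this sketch.
    """
    id_by_nhs = {m["nhs_number"]: m["study_id"] for m in cohort_with_ids}
    linked = []
    for record in hes_records:
        study_id = id_by_nhs.get(record["nhs_number"])
        if study_id is None:
            continue  # record does not belong to a cohort member
        clean = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
        clean["study_id"] = study_id
        linked.append(clean)
    return linked


def link_to_rct(pseudonymised_hes, rct_records):
    """Step 4: attach each participant's HES records via the study ID."""
    hes_by_id = defaultdict(list)
    for record in pseudonymised_hes:
        hes_by_id[record["study_id"]].append(record)
    return [dict(p, hes=hes_by_id.get(p["study_id"], [])) for p in rct_records]


# Synthetic records (not real data).
cohort = [{"nhs_number": "0000000001", "surname": "Example"},
          {"nhs_number": "0000000002", "surname": "Sample"}]
hes = [{"nhs_number": "0000000001", "dataset": "APC"},
       {"nhs_number": "0000000009", "dataset": "OP"}]
rct = [{"study_id": "S0001", "arm": "A"}, {"study_id": "S0002", "arm": "B"}]

pseudonymised = link_and_pseudonymise(assign_study_ids(cohort), hes)
linked = link_to_rct(pseudonymised, rct)
```

In this sketch the only key that survives into the linked output is the study ID, which mirrors the definition of pseudonymisation given above.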
The pseudonymised NHS Digital HES data will be linked to pseudonymised RCT data obtained during a patient’s participation using the GSK Study ID. This will enhance GSK and Ignite Data’s understanding of a patient’s Healthcare Resource Utilisation during their study participation. IGNITE will have access to the record-level data to complete this work, and this has been made clear to the participants in the Patient Information Sheet (PIS) and Informed Consent Form (ICF) they have completed. IGNITE will destroy the data once data processing has been completed, in line with GDPR and Data Protection Legislation. IGNITE shall transfer the data onto a server within their Microsoft Azure datacentre environment. This environment is heavily access-controlled and is accessible only to trained and authorised IGNITE staff. This access control is managed in line with ISO27001 policy. To safeguard any sensitive data being processed, staff are trained on a regular basis. Yearly ISO27001 training is compulsory and is reviewed by external audit yearly; the NHS DSP Toolkit is also tied into this regime, along with GDPR training. Data within the IGNITE environment shall be accessed exclusively by IGNITE employees. The final audited data is sent via encrypted transfer to GSK, where it is then stored behind a VPN-controlled firewall on a UK-based server with further access right control in line with GSK’s NHS DSP Toolkit policy. The data will not be linked or compared (matched) with other data sets not detailed within this agreement. There will be no attempts to re-identify or re-link to identifiable record-level patient data. The data received by GSK and Ignite Data will not be used for any purpose other than to meet the objectives stated in this Data Sharing Agreement and will not be shared with any other third party or organisation other than in the form of aggregated-level data with small number suppression applied as per the HES Analysis Guide.
Microsoft Ltd provide Azure Backup Storage Services for IGNITE Data Limited and are therefore listed as a data processor. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data.

HES and ECDS DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION

In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with the HES Analysis Guide. When publishing HES data, you must make sure that:

  • Cell values from 1 to 7 are suppressed at a local level to prevent possible identification of individuals from small counts within the table.
  • Zeros (0) do not need to be suppressed.
  • All other counts will be rounded to the nearest 5.

Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
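The three suppression rules above can be expressed as a small function. This is a sketch only: the function name and the use of `None` to mark a suppressed cell are illustrative choices, and the HES Analysis Guide remains the authoritative statement of the rules.

```python
from typing import Optional


def apply_disclosure_control(count: int) -> Optional[int]:
    """Return the publishable value for one aggregate cell count.

    None marks a suppressed cell (an illustrative convention, not one
    prescribed by the guide).
    """
    if count < 0:
        raise ValueError("cell counts must be non-negative")
    if count == 0:
        return 0              # zeros do not need to be suppressed
    if 1 <= count <= 7:
        return None           # small counts from 1 to 7 are suppressed
    # All other counts are rounded to the nearest 5. An integer is never
    # exactly halfway between two multiples of 5, so no tie-break is needed.
    return 5 * ((count + 2) // 5)


# Synthetic counts (not real data).
raw_counts = [0, 3, 7, 8, 12, 13, 100]
published = [apply_disclosure_control(c) for c in raw_counts]
```

A table would be passed through this cell by cell before any aggregate output leaves the secure environment.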