NHS Digital Data Release Register - reformatted

Clinical Practice Research Datalink (CPRD) projects

89 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).


SUS-COVID vaccination datasets for MHRA surveillance through CPRD — DARS-NIC-424723-D5Q9W

Type of data: information not disclosed for TRE projects

Opt outs honoured: Identifiable, Anonymised - ICO Code Compliant (Statutory exemption to flow confidential data without consent, Does not include the flow of confidential data)

Legal basis: CV19: Regulation 3 (4) of the Health Service (Control of Patient Information) Regulations 2002, Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(a)

Purposes: No (Research)

Sensitive: Sensitive

When:DSA runs 2021-02-11 — 2022-02-10

Access method: One-Off

Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE

Sublicensing allowed: No

Datasets:

  1. Minimal SUS Dataset for COVID-19 Surveillance

Objectives:

This agreement is requesting a daily flow of data from NHS Digital to enable the Medicines and Healthcare products Regulatory Agency (MHRA) to carry out essential regulation of the CV19 vaccination roll out. This measure is being put in place until the flow of this data is captured in the routine GP data flows to the MHRA. The data supplied under this agreement will be used solely for this purpose.

The controller for GDPR purposes is the Department of Health and Social Care (DHSC); the legal signatory for this agreement (and for the overarching DSFC) is the Secretary of State for Health and Social Care (acting as part of the Crown), acting through the Clinical Practice Research Datalink centre (hereinafter referred to as CPRD) within the Medicines and Healthcare products Regulatory Agency (the agency); and the licensee is CPRD as a part of the MHRA, not the wider DHSC.

The Clinical Practice Research Datalink (CPRD) is a centre of the Medicines and Healthcare products Regulatory Agency (MHRA), an executive agency of the Department of Health and Social Care (DHSC). The agency regulates medicines (including vaccines), medical devices and blood components for transfusion in the UK and the agency acts as an Executive agency.

The Clinical Practice Research Datalink (CPRD) is a government not for profit research organisation, jointly supported by the Medicines and Healthcare products Regulatory Agency (MHRA) and the National Institute of Health Research (NIHR), supplying anonymised health data for studies to safeguard and improve patient and public health. For more than 30 years, CPRD data have supported vital research into health care delivery, drug safety, effectiveness of medicines and risk factors for disease. It currently receives data and linkage services from NHS Digital under a separate DSA (DARS-NIC-15625-T8K6L) for public health research.

These flows of COVID-related data are likely to be needed for approximately 12 months, possibly longer, though it is hoped not, as flows through GP systems providers should be available by then. The position will be reviewed post Summer 2021 once the overall position is clearer.

The lawful basis for processing data under GDPR has been reviewed against the guidance provided by IGARD and been assessed as acceptable by NHS Digital. As per GDPR Article 6(1)(e), “processing is necessary for the performance of a task in the public interest or in the exercise of official authority vested in the controller”: the Clinical Practice Research Datalink (CPRD) is a function of the Medicines and Healthcare products Regulatory Agency (MHRA), which is a function of the Department of Health and Social Care, which is a public authority as per the FOI Act 2000 Part 1, section 3. Under Section 8 of the Data Protection Act 2018 CPRD are a function of a government department and provide the England-wide NHS observational and interventional research service. NHS Digital are satisfied that this request is appropriate, necessary and proportionate for the performance of the task described in the Purpose statement and that there is no other reasonable means for the data processor to achieve their purpose that is less intrusive to the data subjects.

As CPRD will be processing Health Data, which is a Special Category of Personal Data, GDPR Article 9(2)(i) applies: “processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices, on the basis of Union or Member State law which provides for suitable and specific measures to safeguard the rights and freedoms of the data subject, in particular professional secrecy”. CPRD have confirmed that the processing is necessary for reasons of public interest in the area of public health (to effectively monitor the new COVID-19 vaccines) and will be carried out under the responsibility of health professionals.

Data are being processed under the following functions within their statutory purpose:

• The Human Medicines Regulations 2012: S178c ‘take all appropriate measures to obtain accurate and verifiable data for the scientific evaluation of suspected adverse reaction reports’ and S179 ‘Obligation on licensing authority to operate pharmacovigilance system’
• MHRA as licensing authority: under Medicines and Healthcare products Regulatory Agency Trading Fund Order 2003 (SI 2003/1076), made under the Government Trading Funds Act 1973

The Agency performs the functions of the Secretary of State under UK legislation relating to medicines, medical devices and blood, amongst other things. From 1 April 2013, the Agency also performs the functions of the Secretary of State in relation to biological substances conferred under section 57 of the Health and Social Care Act 2012. These functions, which relate to ensuring the quality of biological medicines, were previously carried out by the Health Protection Agency through the non-statutory body, the National Institute of Biological Standards and Control (NIBSC). The NIBSC continues to deliver these functions as part of the Agency on behalf of the Secretary of State.

Expected Benefits:

Effective monitoring of the safety and quality of new COVID vaccines - until normal dataflows can be used. Avoidance of deaths or hospitalisation by enabling the provision of effective vaccines and detecting possible adverse reactions.

Outputs:

Statutory & effective monitoring of new COVID vaccines - until normal dataflows from GP systems can be used.

This will start as soon as dataflows can be arranged. This is an URGENT requirement.

Processing:

There are to be three distinct dataflows - the first two involve data from NHS England, where NHS Digital provides the data to MHRA acting as a processor on NHS England's behalf, and the third involves data from NHS Digital itself:
1) COVID-19 Vaccination Dataset - from NHS England (not covered by this agreement)
2) COVID-19 Vaccination Adverse Reactions Dataset - from NHS England (Not covered by this agreement)
3) Minimal bespoke extract from SUS Dataset (Minimal SUS Dataset for COVID-19 Surveillance) - for COVID-19 Surveillance (Purpose of this agreement)
These datasets are to be restricted to the standard CPRD GP Cohort as per DARS-NIC-15625-T8K6L and updates are to be processed daily through MESH (see below).

For the first two dataflows, NHS England are already providing daily updates to NHS Digital, which will effect linkage to the CPRD cohort (discarding unmatched entries) and replace identifiers with the usual CPRD GP_ID and GP_Practice_ID (ODS code). The resultant two files are already provided to CPRD via MESH (see below).

For the third dataflow (the purpose of this agreement), NHS Digital will apply the appropriate extract criteria to SUS to generate a daily update, apply the same matching to the CPRD cohort, and replace identifiers with the usual GP_IDs.
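The matching and identifier replacement described for these dataflows can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not NHS Digital's actual process: the `nhs_number` key, the record layout and the `link_to_cohort` helper are hypothetical names introduced here; only the GP_ID and GP_Practice_ID field names come from the text above.

```python
from typing import Dict, Iterable, List

# Hypothetical identifier field; the agreement does not name the matching key.
IDENTIFIER_FIELD = "nhs_number"


def link_to_cohort(
    records: Iterable[Dict[str, str]],
    cohort_lookup: Dict[str, Dict[str, str]],
) -> List[Dict[str, str]]:
    """Keep only records that match the cohort and swap identifiers for pseudonyms.

    cohort_lookup maps an identifier to its GP_ID and GP_Practice_ID.
    """
    linked = []
    for record in records:
        pseudonyms = cohort_lookup.get(record[IDENTIFIER_FIELD])
        if pseudonyms is None:
            # Unmatched entries are discarded, as the text describes.
            continue
        # Copy every field except the identifier, then attach the pseudonyms.
        output = {k: v for k, v in record.items() if k != IDENTIFIER_FIELD}
        output["GP_ID"] = pseudonyms["GP_ID"]
        output["GP_Practice_ID"] = pseudonyms["GP_Practice_ID"]
        linked.append(output)
    return linked


# Illustrative, made-up values only.
example_cohort = {"0000000001": {"GP_ID": "P001", "GP_Practice_ID": "X12345"}}
example_records = [
    {"nhs_number": "0000000001", "event_date": "2021-03-01"},
    {"nhs_number": "0000000002", "event_date": "2021-03-02"},
]
example_linked = link_to_cohort(example_records, example_cohort)
```

In this sketch the identifier never appears in the output, so the receiving party sees only the pseudonymous GP_ID and practice code alongside the event fields.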

CPRD will then link these datasets with CPRD primary-care data already received from GP practices and supply a restricted dataset to the Vigilance and Risk Management of Medicines (VRMM) section of the MHRA - the sole ultimate data recipient for this data flow to support their vaccine surveillance requirements.

There is no requirement to be able to sub-licence this data for research purposes as use will be limited to use within MHRA for its statutory duty to monitor the safety and effectiveness of medicines.

Three MESH accounts are already established, one for each of the three dataflows above, respectively: VACCINATIONS_DAILY_1, ADVERSE_REACTIONS_DAILY_1, and SUS_EXTRACTS.

SUS+ Disclosure Control / small number suppression
In order to protect patient confidentiality, when presenting results calculated from SUS+ record level data, outputs will contain only aggregate level data with small numbers suppressed in line with HES analysis guide. When publishing SUS+ data, you must make sure that cell values from 1 to 7 are suppressed at a local level to prevent possible identification of individuals from small counts within the table. Zeros (0) do not need to be suppressed. All other counts will be rounded to the nearest 5.
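The suppression and rounding rule stated above can be sketched in Python as follows. This is an illustrative reading of the rule as worded here (zeros kept, 1 to 7 suppressed, all other counts rounded to the nearest 5) and not an official implementation of the HES analysis guide; the function name and the use of `None` to mark a suppressed cell are assumptions.

```python
from typing import Optional


def suppress_and_round(count: int, low: int = 1, high: int = 7, base: int = 5) -> Optional[int]:
    """Apply small number suppression to a single cell count.

    Returns 0 for zero, None for a suppressed cell (low..high inclusive),
    and otherwise the count rounded to the nearest multiple of base.
    """
    if count < 0:
        raise ValueError("counts must be non-negative")
    if count == 0:
        # Zeros do not need to be suppressed.
        return 0
    if low <= count <= high:
        # Small counts are suppressed; None stands in for a suppression marker.
        return None
    # Integer arithmetic avoids the round-half-to-even behaviour of round().
    return base * ((count + base // 2) // base)


# Illustrative, made-up counts only.
example_table = {"area_a": 0, "area_b": 4, "area_c": 8, "area_d": 13}
example_published = {area: suppress_and_round(n) for area, n in example_table.items()}
```

A suppressed cell would normally be shown with a marker such as an asterisk in a published table; the sketch leaves that presentation choice to the caller.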

Sungard Availability Services Ltd/Crown Hosting Data Centres Ltd are recorded as data storage addresses for the purposes of this application. Sungard Availability Services Ltd is the primary data centre store and is considered to be the initial back-up and recovery. They are not involved in processing of the data in any way (Sungard Availability Services Ltd provide a facilities management and site management service). CPRD has confirmed that neither Crown Hosting Data Centres Ltd nor Sungard Availability Services Ltd have access to the server (neither administrative nor user rights). Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data.

NTT Data UK Limited supply IT infrastructure for Medicines and Healthcare Products Regulatory Agency and are therefore listed as data processors. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data.


PEARL Study (Prolonged Effects of Assisted reproductive technologies on the health of women and their children: a Record Linkage study for England) (CPRD-HFEA linkage project) — DARS-NIC-113025-X7Z3L

Type of data: information not disclosed for TRE projects

Opt outs honoured: Anonymised - ICO Code Compliant, Identifiable (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7)

Purposes: No (Research)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2019-04-01 — 2020-03-31

Access method: One-Off

Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE

Sublicensing allowed: No

Datasets:

  1. MRIS - Bespoke
  2. MRIS - List Cleaning Report

Objectives:

The data controller is the Department of Health and Social Care, with the Secretary of State for Health and Social Care (acting as part of the Crown), acting through the Clinical Practice Research Datalink centre (hereinafter referred to as CPRD) within the Medicines and Healthcare Products Regulatory Agency. The same arrangement applies to the data processor: it is the Department of Health and Social Care, although CPRD carry out the processing. CPRD are not listed as a data processor because they are not a legal entity.

The data processor is Department of Health and Social Care.

The Clinical Practice Research Datalink (CPRD) is a centre of the Medicines and Healthcare products Regulatory Agency (MHRA), an executive agency of the Department of Health & Social Care (DHSC). The MHRA regulates medicines, medical devices and blood components for transfusion in the UK and acts as an executive agency.

CPRD is the UK’s pre-eminent research service, providing access to primary care data (that has been de-identified) linked by NHS Digital to other similarly pseudonymised health data. This data is provided by NHS Digital and others for the purposes of public health research including the monitoring of drug safety. All such data is linked (in its identifiable form) by NHS Digital only. It is jointly funded by the MHRA and the National Institute for Health Research (NIHR).

CPRD’s aims are to support vital public health research and to inform advances in patient safety in the delivery of patient care pathways. These depend on access to accurate, real-time representative patient data to produce reliable evidence based clinical and drug safety guidance. The legal bases for processing the data provided by NHS Digital are:
• Gathering of GP patient data and collation with other data sets to produce data-sets that have been de-identified: medical research under Article 9(2)(j); drug and device safety under Article 9(2)(i) of the General Data Protection Regulation

CPRD services are designed to maximise the way de-identified NHS clinical data can be used to improve and safeguard public health. For more than 20 years data provided by CPRD have been used in a range of drug safety and epidemiological studies that have impacted on health care and resulted in over 1700 peer-reviewed publications. In addition to supporting high-quality observational research, CPRD is developing world-leading services based on using real world data to support clinical trials and intervention studies. The intention is to continue to link CPRD primary care data to NHS Digital’s secondary care and other datasets, as linkage greatly increases the scale, depth, completeness and therefore value of data available for public health research. The outputs of such research based on linked data in turn improve and protect patient care pathways/treatments and provide clinical benefits for the UK, supporting delivery of CPRD’s core objectives.

CPRD’s research and data services are based on a database of de-identified longitudinal primary care records contributed by consenting GP practices from the four UK nations, and on the ability to link primary care data to secondary care data (and other data sets), from the NHS, Office for National Statistics (ONS) and Public Health England (PHE). One of CPRD’s main priorities is to increase the number of national data sets that are linked to primary care data and made available on a routine basis to the research community.

NHS Digital has been providing secondary and other data for linkage with CPRD primary care data for a number of years. Data linkage is carried out exclusively by NHS Digital as the Trusted Third Party (TTP) for this purpose. Linked data sets currently available include extracts from Civil Registration data; Hospital Episode Statistics (HES), which encompasses Admitted Patient Care, Critical Care, Outpatient and Accident & Emergency data; Patient Reported Outcome Measures (PROMs); Diagnostic Imaging Dataset (DID); Mental Health data; National Cancer Registry; Deprivation data including Townsend Score and Index of Multiple Deprivation. Critical care is supplied as a separate dataset by NHS Digital but is integrated with Admitted Patient Care.

Data can only be used for public health research purposes in research recommended for approval by ISAC for MHRA database research. CPRD make the final decision on access and ensure compliance with NHS Digital’s requirements within the data sharing agreement, e.g. security of the third party. Access to CPRD data and services will not be permitted in circumstances that may result in loss of public trust or for activities that may undermine the integrity of the CPRD database.

This application is to support a research project, which involves linkage of record level data from the Clinical Practice Research Datalink (CPRD), and the Human Fertilisation and Embryology Authority (HFEA).

The study is funded by the Medical Research Council until 31/08/2020.

The study is presented below:
The PEARL study (Prolonged Effects of Assisted reproductive technologies on the health of women and their children: a Record Linkage study for England)

The aim of this project is to create a linked dataset between HFEA infertility data and health data from the Clinical Practice Research Datalink (CPRD) mother-baby track, and to use the linked dataset to assess the effect of assisted reproductive technologies (ART) on the health of women and their children after successful fertility treatment. Specific objectives are:

1. To estimate the effect of subfertility, ovulation induction (OI) and ART on the health and development of children to adolescence [hypothesis: children born after subfertility, OI or ART experience poorer health and developmental outcomes to adolescence than their naturally conceived peers].
2. To examine the impact of subfertility, OI and successful ART on the health and wellbeing of infertile women [hypothesis: Women who have had subfertility, OI and successful ART experience different mental health trajectories to those who conceived naturally].
3. To quantify the additional resources, if any, used by women and their children after successful ART [hypothesis: Mother-baby pairs formed after successful ART make greater use of the health services and incur additional costs, compared to the mother-baby pairs formed after natural conception].
4. To assess the impact of low consent rates after September 2009, on the results of ART studies conducted using the HFEA register, and explore techniques to deal with the effects of the missing data and the impact of the potential bias [hypothesis: low consent rates since Sept 2009 adversely impact the validity of the aetiological research conducted using this dataset].

The study is an observational epidemiological study, which will use a retrospective cohort design. Data for the study will be linked between CPRD and the Human Fertilisation and Embryology Authority (HFEA). PEARL links health data (from the Mother-Baby track of CPRD GOLD) to information collected about all assisted reproductive technology (ART) cycles in England (from the Human Fertilisation and Embryology Authority Register). Approximately 460,000 mother-baby dyads based in English GP practices, with a valid NHS number and consent to link data, are included in the CPRD mother-baby dataset for the period 1991-Sept 2009. Estimates based on 1.5% of babies resulting from ART, 95% successful matching between HFEA and CPRD, and 61% with continuous follow up of 4-22 years give 3933 ART dyads and 262,571 comparison dyads. Different indicators will be used to divide the mothers in the CPRD mother-baby dataset into those with no records for fertility consultations (the fertile comparison group) and those with a record of consulting the doctor about fertility; evidence of consultations for fertility problems or investigations will be used as a means to identify those who may be subfertile. The fertile comparison group includes both unplanned and planned pregnancies.

Methodological work to assess the impact of low consent rates after September 2009, on the results of ART studies conducted using the HFEA register and explore techniques to deal with the effects of the missing data will also be conducted. Data are required for the period 1991-2018.

Mother-baby pairs will be grouped depending on their exposure to ART, based on both primary care and HFEA records, and outcomes will be compared in those children born with and without the use of ART, and their mothers. Multivariable regression analysis will be used, and the role of confounders and effect modifiers in explaining any observed effects between exposure and outcomes will be explored. The cohort design takes maximum advantage of the longitudinal nature of the data, and the available sample, while being most suitable for a relatively rare exposure (ART) and more common outcomes. It is anticipated that the final cohort will be just under 270,000, with 3933 assisted conceptions.

CPRD already hold NHS Digital data disseminated under NIC-15625, which includes HES data, Mental Health data, Diagnostic Imaging data, mortality data and Patient Reported Outcome Measures. This data is linked to the patient identifiers sent to NHS Digital by the GP system suppliers. No patient identifiable data is sent to CPRD, only study ids that enable them to link the NHS Digital data to the GP data. The PEARL study will use the data from HFEA and the Mother-Baby track of CPRD GOLD, this includes mothers that have had fertility treatment and mothers that have no records for fertility consultations, it is this second group that will be used as the fertile comparison group. Routine data will be collected up to 31 December 2017.

The legal basis for the study comprises the following components:

1) The processing of the HFEA data by NHS Digital and CPRD are considered outsourced functions of the HFEA, and thus are covered by Section 8D of the HFE Act (1990), which grants the HFEA the power to contract out functions and disclose information where the function of the HFEA is exercised by others.

2) Study specific Section 251 approval (ref: 16/CAG/0053) allows the flow of CPRD identifiers to NHS Digital for the purposes of linkage, and the linkage of CPRD identifiers to HFEA identifiers without consent (this applies to pre-2009 HFEA data, as consent was not sought prior to 2010).

3) Where broad consent for research use was sought (HFEA data since 2009), and consent was given, it is recognised that they do not fulfil the GDPR consent requirements, however, the consent forms meet the Common Law Duty of Confidentiality. The governance landscape has changed since these consent forms were drafted, but there would be a reasonable expectation among those who signed the consent forms that people who work with the databases (which are mentioned) would use identifiers for the linkage process. It is not possible to contact all those who consented to the use of data to update their consents (nor would it be appropriate, since they consented to non-contact research use of their data). The identifiable data that is used in processing and linkage are held only by the HFEA and by NHS Digital. Identifiable data will not be released to either the CPRD or the study team at the University of Oxford. The use of the data that has been de-identified will provide robust scientific evidence regarding the long term health outcomes for women and children after fertility treatment, which is in the public interest (as evidenced by the provision of Section 251 for the data for which there is not consent available).

4) University of Oxford have support for the study under section 251 of the NHS Act 2006 16/CAG/0053. SD5.2 - SD5.6 provide the detail of the support for name, NHS number, postcode, date of birth/date of death and link to mother baby data already held by HSCIC for 1999-2009 data only for English patients only. The scope of the s251 support is to cover the processing of confidential patient information within NHS Digital. The disclosure of HFEA data to NHS Digital is permissible under the HFEA regulations.
The Human Fertilisation and Embryology Authority (HFEA) are permitted under the Human Fertilisation and Embryology Act 1990, Section 33D to enable the common law duty of confidentiality to be temporarily lifted so that confidential patient information can be transferred from the HFEA to NHS Digital without the disclosure being in breach of the common law duty of confidentiality.
(1) Regulations may—
(a) make such provision for and in connection with requiring or regulating the processing of protected information for the purposes of medical research as the Secretary of State considers is necessary or expedient in the public interest or in the interests of improving patient care, and
(b) make such provision for and in connection with requiring or regulating the processing of protected information for the purposes of any other research as the Secretary of State considers is necessary or expedient in the public interest.
(3) Where regulations under subsection (1) require or regulate the processing of protected information for the purposes of medical research, such regulations may enable any approval given under regulations made under section 251 of the National Health Service Act 2006 (control of patient information) to have effect for the purposes of the regulations under subsection (1) in their application to England and Wales.

The legal basis under GDPR for the processing and storage of personal data for PEARL is that it is ‘a task in the public interest’ (Article 6(1)(e)) and that processing of sensitive personal data is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes (Article 9(2)(j), based on Article 89(1)).

Expected Benefits:

Benefits to couples who are considering fertility treatment: Over 60,000 ART cycles are conducted annually in the UK. The research will provide those couples who are considering treatment better evidence regarding the longer-term health outcomes for their children, and any mental health impacts for the mother, allowing them to make a more informed choice about their care.

Benefits to families formed through ART: For the families of the 250,000 babies born as a result of treatment in the UK since 1991, this research will give better evidence of the potential longer-term health implications of their conception history. It is hoped that this will provide reassurance that few ART babies are seriously affected by their conception history – but if the research findings suggest an excess risk of adverse outcomes the families will be better informed as to what they may expect. This is why a well-written and targeted lay summary is so important.

Benefits to clinicians who provide care to these couples and their children: Scientific journal articles and conference proceedings will also target the clinicians who treat sub-fertile and infertile couples. This will allow them to provide better advice to these individuals, when they are considering treatment options. In addition, it is expected that these papers can contribute to guidelines regarding the safety of treatment – specifically within the UK (for example, the NICE guidance, and the Royal College of Obstetricians and Gynaecologists ‘Scientific Impact Papers’).

Benefits to the NHS, in terms of financial implications and planning: As the NHS continues to face financial constraints difficult decisions are being made about the provision of fertility treatment in the UK. PEARL will provide information about the numbers of consultations in primary care for fertility problems, and model the trend to look at the implications for providing care in the next 10 years. The costs of ongoing care in ART mother-baby pairs will be calculated, and the financial impact of the increasing numbers of ART treatments will be assessed and reported. This is important, because even if the shift from NHS-funded to self-funded treatment cycles continues, the longer term care of individuals who have poor long term outcomes will fall to the NHS.

Benefits to the research community: There have been changes to the way HFEA register data are collected, and in 2009 the HFEA started asking patients if they would be willing to allow their data to be used for research. There is some suggestion that researchers are not using these 'modern' (post 2009) HFEA data because of concerns over low participation rates, lack of representativeness and potential bias (personal communication, HFEA). The PEARL study would provide evidence on the extent of bias in the HFEA dataset collected after the change to consent rules in 2009, and provide methods (such as weights) that can be employed to reduce the effects, and thus render the dataset more reliable. This output would come as a report and academic paper by the end of the study in 2020.

Given the time it takes from completion of a research project to any measurable impact, it is important to formulate a process for documenting any influence that the research has in the wider world. The main output of this project will be academic publications, and we will proactively search online for, and record citations of, work in academic journals, government policy documents, guidelines, or on websites for public sector services, voluntary groups and advocacy groups.

The University of Oxford will keep in contact with key stakeholders, such as the HFEA, CPRD, and Fertility Network UK, and ask them to report any impacts of the research. The University of Oxford will also request that HFEA inform them of usage of the ‘modern’ dataset (post 2009) in light of the research to validate the quality of the data and provide appropriate tools to account for any evidence of bias. The target date for showing measurable impact is 2-5 years after the end of the project.

At the completion of this study the University of Oxford will have new evidence for the health implications of Assisted Reproductive Technologies in the English population. This includes:
- Providing evidence regarding the long term health outcomes of children born after subfertility and ART, and a comparison of whether they are at higher risk than children conceived without ART. This is expected to contribute to policy and information documents such as the Royal College of Obstetricians and Gynaecologists Scientific Impact Paper on the subject, and NICE guidance on infertility care.
- New evidence regarding the mental health and wellbeing of women who had an ART baby, compared to their peers who conceived without treatment.
- The team will establish whether there is increased health service use for the ART mother-baby pairs, which will allow estimates of the costs to the NHS of additional care (if any) and modelled projections for the next 10-20 years based on the changing ART birth rate in the UK.
- Finally, the study will provide evidence of the effects of the HFEA introducing consent forms on the quality and utility of the HFEA register data in research.

Outputs:

Data analysis will commence as soon as the linked datasets are received and is expected to be finished within 24 months of the release of the bespoke dataset. Data will not be used for sales and marketing purposes. All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Research reports will be prepared for the funder (MRC), Research Ethics Committee, Confidentiality Advisory Group and HFEA, with a final report outlining the key findings submitted within one year of study completion. A report will be produced for the HFEA which summarizes the results of the analyses of outcomes in the mothers and their children based on the linked data, and describes the impact of non-consent in the HFEA data and how this can be addressed. University of Oxford will also have a summary of the study results on their website. The final reports are expected as soon as results from the studies are available.

The outputs will be peer-reviewed scientific journal articles, presentations (at scientific conferences and to other audiences) and reports as outlined below. All outputs will contain only aggregate level data with small numbers suppressed in line with HES analysis guide.

It is anticipated that 4-5 peer-reviewed articles will be published between mid-2019 and the end of 2020, including papers that focus on child health outcomes, maternal mental health, the long-term economic costs of fertility treatments, and methodological considerations of using these data. High impact journals will be selected with the target audience in mind; for example, the analysis of economic costs and the analysis of mothers’ mental health will be of interest to different professional and academic groups, and the journals will be selected accordingly.

Target journals include the New England Journal of Medicine, British Medical Journal and International Journal of Epidemiology. All accepted papers will be open access publications to ensure free availability to all readers internationally. The PEARL team will liaise with the University of Oxford departmental communications office regarding the dissemination of findings. This will ensure that the media are informed of the results, so that the widest audience can be reached. This will include the writing of a press release, and a pre-recorded video interview with the study lead. At the end of the study, the video interviews will be collated into a podcast to be made freely available on the PEARL website (early 2021).

Findings from the study will be presented at appropriate scientific conferences (e.g. European Society of Human Reproduction and Embryology, 36th meeting, in June 2020, or World Congress of Epidemiology in 2020). This will include poster presentations and short talks to expert audiences.

A lay summary of the key findings, with links to published papers, will be produced by the end of 2021. This will be circulated to interested parties (e.g. charities, ART clinics involved in the qualitative survey), and will also be posted on the National Perinatal Epidemiology Unit (NPEU) website where it can be accessed by the general public. This summary will also be supported by an infographic communicating the main findings and implications for women and children born through treatment. The infographic will be developed by the NPEU’s experienced in-house design team. The PEARL researchers will work with the University of Oxford departmental communications team to disseminate to a wider audience via blog posts, the university video wall, media interviews etc. The research findings will also be communicated to the wider lay community through social media such as Twitter and Facebook.

In 2020 researchers from the University of Oxford will arrange a one-day workshop for researchers who use, or are interested in using, CPRD fertility or HFEA registry data. University of Oxford will take this opportunity to present key findings as part of the proceedings.

Processing:

A linked HES-primary care dataset already exists and is held by CPRD, with the linked HES and Small Area Level data having previously been provided to CPRD under a Data Sharing Agreement (DSA) with NHS Digital (NIC-15625-T8K6L).
Patient identifiers required for linkage of CPRD primary care data to Human Fertilisation and Embryology Authority (HFEA) data are the NHS number, date of birth, full name and postcode; these are not needed for the research study itself but will be sent by the GP system providers to NHS Digital. The same patient identifiers (NHS number, date of birth, gender, postcode) from HFEA will be sent to NHS Digital by the HFEA.

The bespoke dataset that will be received by the University of Oxford will be pseudonymised data.

This bespoke data linkage requires CPRD and Human Fertilisation and Embryology Authority (HFEA) patient identifiers - namely date of birth, postcode, NHS number and full name - to permit accurate linkage of CPRD and Human Fertilisation and Embryology Authority datasets into a new single dataset for the research study.

No clinical data from the GP system providers or HFEA is sent to NHS Digital, and at no stage do CPRD or University of Oxford receive any patient identifiers. Personal identifiers including full name, date of birth, postcode and NHS number are removed at source by the GP system providers and replaced by pseudonymised system patient and practice identifiers (GP System Practice Key and GP System Patient Key) prior to transfer of data to CPRD. CPRD then replaces the original GP System Practice Key and GP System Patient Key with a CPRD patient pseudonym (CPRD Patient Study ID). Identifiable data fields for CPRD patients flow directly from GP system providers to NHS Digital.

The s251 support obtained for the specific aims of the study (ref: 16/CAG/0053) covers the data flows and linkages. This support permits “the purpose of this project to create a linked dataset between HFEA infertility data and health data from the Clinical Practice Research Datalink (CPRD)”. The University of Oxford has Research Ethics Committee approval (ref: 16/SC/0222) for the study and for this linkage to take place.

CPRD will have a Data Sharing Agreement (DSA) with HFEA and this will permit CPRD to receive and process HFEA pseudonymised patient data.

Under the described legal basis, the steps below will be used to transfer, store and process data as part of this linkage.

Step 1. Transfer of patient Identifiers

Step 1a.
At the request of CPRD, HFEA will provide to NHS Digital, acting as the Trusted Third Party (TTP) for linkage, a study-specific pseudonymised patient identifier for each patient (HFEA pseudonym) together with full name, date of birth, postcode (where available) and NHS number (where available). NHS Digital will use these patient identifiers from the Human Fertilisation and Embryology Authority to create the pseudonymised study IDs required for linkage.

Step 1b.
In parallel, CPRD requests that participating GP system providers securely provide to NHS Digital a file containing information on all patients held in CPRD. The file consists of the four identifiable data fields (NHS Number, Date of birth, Gender and Postcode) and the GP System Practice Key and GP System Patient Key (pseudonymised data fields assigned to each unique individual in CPRD). Transfer of data from GP system providers to NHS Digital will be via secure file transfer protocol (SFTP) servers, which are encrypted to ensure security of electronic data in transit.

Step 2. Matching of HFEA identifiers to NHS Patient Demographic System

NHS Digital matches HFEA identifiers to the details held for every NHS registered patient in the Patient Demographic System (MRIS) to add NHS number where it is missing.

Step 3. Creation and provision of bridging file by the Trusted Third Party

Step 3a. Bridging file to CPRD

Using NHS number, NHS Digital links the identifiable data fields received from HFEA and the participating GP system providers.

NHS Digital supplies CPRD with a bridging file containing pseudonymised patient identifiers (the GP System Practice Key and GP System Patient Key) for each linked patient that can be used to merge the primary care dataset and HFEA dataset. Additionally, NHS Digital generates and supplies a HFEA-specific pseudonymised patient identifier for each linked patient (Study ID). NHS Digital securely releases the bridging file to CPRD via secure file transfer protocol (SFTP), and CPRD will confirm the linkage as valid.

Step 3b. Bridging file to HFEA
NHS Digital also releases a second bridging file in parallel containing HFEA study specific pseudonymised patient identifier for each linked patient (Study ID) and HFEA pseudonym to HFEA. Data supplied by the GP system providers to NHS Digital (Step 1b) is utilised for CPRD routine linkage and will be retained. NHS Digital will delete the HFEA identifiable fields that do not match to the CPRD data. It is emphasised that following data linkage by NHS Digital using patient identifiable fields, there is no further flow or use of identifiable data at any point past this stage.
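
As an illustration only, the linkage in Steps 3a and 3b can be sketched as a join on NHS number that emits two bridging files, neither of which carries patient identifiers. This is a minimal sketch under stated assumptions, not NHS Digital's actual process: the function name `build_bridging_files`, the field names, and the use of a random token as the Study ID are all hypothetical, and a real linkage would also draw on date of birth, postcode and name.

```python
import secrets


def build_bridging_files(hfea_records, gp_records):
    """Link HFEA and GP records on NHS number (hypothetical field names).

    Returns (cprd_bridge, hfea_bridge). Neither list contains NHS number,
    date of birth, postcode or name.
    """
    # Index GP records by NHS number, skipping records without one.
    gp_by_nhs = {r["nhs_number"]: r for r in gp_records if r.get("nhs_number")}

    cprd_bridge, hfea_bridge = [], []
    for rec in hfea_records:
        gp = gp_by_nhs.get(rec.get("nhs_number"))
        if gp is None:
            continue  # unmatched HFEA record: no bridging entry is produced

        # Assumption: the Study ID is an opaque random token.
        study_id = secrets.token_hex(8)

        # Step 3a: bridging file for CPRD (system keys plus Study ID).
        cprd_bridge.append({
            "practice_key": gp["practice_key"],
            "patient_key": gp["patient_key"],
            "study_id": study_id,
        })
        # Step 3b: bridging file for HFEA (Study ID plus HFEA pseudonym).
        hfea_bridge.append({
            "study_id": study_id,
            "hfea_pseudonym": rec["hfea_pseudonym"],
        })
    return cprd_bridge, hfea_bridge
```

The two outputs share only the Study ID, which is what lets CPRD and HFEA later exchange clinical data without either party seeing the other's identifiers.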

Step 4. Extraction of required record-level data and transfer of HFEA data
HFEA will extract the relevant treatment and outcome data from the registry for the individuals identified in the bridging file, and construct a flat file containing the study ID, HFEA pseudonym and the clinical data only. HFEA securely transfers this to CPRD. No other personal identifiable details are included in this dataset.

CPRD will then further pseudonymise the Keys used in the linked dataset extracts to further ensure patient data cannot be identified.

Step 5. Creation of study dataset by CPRD and release to University of Oxford
CPRD receives the bridging file from NHS Digital and de-identified HFEA clinical data from HFEA. CPRD creates a de-identified dataset that contains CPRD primary care data, HFEA data, HES and IMD.

Prior to release of the linked dataset extract, CPRD ensures the University of Oxford researchers have signed a bespoke Dataset Agreement (inclusive of any additional HFEA terms and conditions) which has been previously agreed with HFEA. CPRD then transfers, with approval of HFEA, and via secure file transfer protocol (SFTP), the dataset extracts to the University, and confirms safe receipt of this. This will be a pseudonymised data extract containing HFEA data linked to previously linked CPRD primary care data - IMD, CPRD mother-baby linked data, and HES. The IMD and HES data are part of the established routinely linked dataset which CPRD receive as part of a separate Data Sharing Agreement with NHS Digital.
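
The merge described in Step 5 can be sketched as a lookup through the bridging file. This is a hedged illustration rather than CPRD's actual procedure: `build_study_dataset` and all field names are hypothetical, and dropping the system keys while keeping only the Study ID is a simplification of the further pseudonymisation that CPRD applies.

```python
def build_study_dataset(bridge, primary_care, hfea_clinical):
    """Join primary care rows to HFEA clinical rows via the bridging file.

    All inputs are lists of dicts with hypothetical field names.
    """
    care_by_key = {
        (row["practice_key"], row["patient_key"]): row for row in primary_care
    }
    hfea_by_study_id = {row["study_id"]: row for row in hfea_clinical}

    dataset = []
    for link in bridge:
        care = care_by_key.get((link["practice_key"], link["patient_key"]))
        hfea = hfea_by_study_id.get(link["study_id"])
        if care is None or hfea is None:
            continue  # keep only patients present in both sources

        merged = {"study_id": link["study_id"]}
        # Drop the linkage keys so the extract carries only the Study ID.
        merged.update(
            {k: v for k, v in care.items()
             if k not in ("practice_key", "patient_key")}
        )
        merged.update(
            {k: v for k, v in hfea.items()
             if k not in ("study_id", "hfea_pseudonym")}
        )
        dataset.append(merged)
    return dataset
```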

Analysis undertaken
The linked datasets received by University of Oxford will not be linked again with any other data by University of Oxford. The data will not be made available to any third parties, except in the form of aggregated outputs that comply with the HES analysis guidance on the suppression of small numbers. Statistical analyses will be recorded in a detailed methodology document, which will also include variable information such as data source (i.e. whether the original source is CPRD, HES, HFEA etc.), the coding used, and validation or checks employed. Analyses will be conducted using Stata.

Step 6. Implementation of patient opt-outs

As part of the approval for the study from CAG (ref: 16/SC/0222), University of Oxford will allow an implementation period in which women who wish to exercise their right to opt out during the 6-month period prior to the dataset finalisation will be able to do so, via the University of Oxford and HFEA websites.

If women contact the HFEA and request that they be removed from the study, their name, date of birth and year of treatment will be used to find their HFEA study specific pseudonymised patient identifier for each linked patient (Study ID) and HFEA pseudonym. HFEA will keep a record of all women who choose to opt out, and will securely transfer these to CPRD after a period of six months so that they can be removed from the final study dataset.

Step 7. CPRD will release a list of pseudonyms for patients wishing to opt-out of the study to the University of Oxford, who will delete them prior to running the final analysis.
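
The opt-out removal in Steps 6 and 7 amounts to filtering the study dataset by pseudonym before the final analysis. A minimal sketch, assuming each row carries a `study_id` field (a hypothetical name) and that the opt-out list arrives as a collection of those pseudonyms:

```python
def remove_opt_outs(dataset, opted_out_ids):
    """Return the dataset without rows whose study_id has opted out."""
    opted_out = set(opted_out_ids)  # set membership keeps the filter fast
    return [row for row in dataset if row["study_id"] not in opted_out]
```

Because the filter works on pseudonyms alone, the researchers never need the names or dates of birth that HFEA used to look the women up.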

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).


Bowel Cancer Screening Programme - Data Linkage — DARS-NIC-108098-D2L3V

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, No (Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(7)

Purposes: No (Research)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2019-04-04 — 2020-04-03 2020.03 — 2020.03.

Access method: One-Off

Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE

Sublicensing allowed: No

Datasets:

  1. MRIS - Bespoke

Objectives:

The data controller is the Department of Health and Social Care, with the Secretary of State for Health and Social Care (acting as part of the Crown), acting through the Clinical Practice Research Datalink centre (hereinafter referred to as CPRD) within the Medicines and Healthcare products Regulatory Agency. The same arrangement applies to the data processor: it is CPRD who actually process the data, but they are not listed separately as data processors.

The data processor is Department of Health and Social Care.

The Clinical Practice Research Datalink (CPRD) is a centre of the Medicines and Healthcare products Regulatory Agency (MHRA), an executive agency of the Department of Health & Social Care (DHSC). The MHRA regulates medicines, medical devices and blood components for transfusion in the UK and acts as an executive agency.

CPRD is the UK’s pre-eminent research service, providing access to primary care data (that has been anonymised) linked by NHS Digital to other similarly pseudonymised health data. This data is provided by NHS Digital and others for the purposes of public health research including the monitoring of drug safety. All such data is linked (in its identifiable form) by NHS Digital only. It is jointly funded by the MHRA and the National Institute for Health Research (NIHR).

CPRD’s aims are to support vital public health research and to inform advances in patient safety in the delivery of patient care pathways. These depend on access to accurate, real-time representative patient data to produce reliable evidence based clinical and drug safety guidance. The legal bases for processing the data provided by NHS Digital are:

• Gathering of GP patient data and collation with other data sets to produce data-sets that have been anonymised: medical research under Article 9(2)(j); drug and device safety under Article 9(2)(i) of the General Data Protection Regulation

CPRD services are designed to maximise the way de-identified NHS clinical data can be used to improve and safeguard public health. For more than 20 years data provided by CPRD have been used in a range of drug safety and epidemiological studies that have impacted on health care, and resulted in over 1700 peer-reviewed publications. In addition to supporting high-quality observational research, CPRD is developing world-leading services based on using real world data to support clinical trials and intervention studies. The intention is to continue to link CPRD primary care data to NHS Digital’s secondary care and other datasets, as linkage greatly increases the scale, depth, completeness and therefore value of data available for public health research. The outputs of such research based on linked data in turn improve and protect patient care pathways/treatments and provide clinical benefits for the UK, supporting delivery of CPRD’s core objectives.

CPRD’s research and data services are based on a database of de-identified longitudinal primary care records contributed by consenting GP practices from the four UK nations, and on the ability to link primary care data to secondary care data (and other data sets) from the NHS, Office for National Statistics (ONS) and Public Health England (PHE). One of CPRD’s main priorities is to increase the number of national data sets that are linked to primary care data and made available on a routine basis to the research community. Such collection and linkages occur under the appropriate permissions (ethical and s251), which have been granted to CPRD by the East Midlands & Derby Research Ethics Committee (REC), and the Health Research Authority (HRA).

NHS Digital has been providing secondary and other data for linkage with CPRD primary care data for a number of years. Data linkage is carried out exclusively by NHS Digital as the Trusted Third Party (TTP) for this purpose. Linked data sets currently available include extracts from Civil Registration data; Hospital Episode Statistics (HES), which encompasses Admitted Patient Care, Critical Care, Outpatient and Accident & Emergency data; Patient Reported Outcome Measures (PROMs); Diagnostic Imaging Dataset (DID); Mental Health data; National Cancer Registry; Deprivation data including Townsend Score and Index of Multiple Deprivation. Critical care is supplied as a separate dataset by NHS Digital, but is integrated with Admitted Patient Care.

Data can only be used for public health research purposes in research recommended for approval by ISAC for MHRA database research. CPRD make the final decision on access, and ensure compliance with NHS Digital’s requirements within the data sharing agreement, e.g. security of the third party. Access to CPRD data and services will not be permitted in circumstances that may result in loss of public trust or for activities that may undermine the integrity of the CPRD database.

This application is to support two separate research projects. Both projects involve linkage of record level data from the same databases: the Clinical Practice Research Datalink (CPRD), and the Midlands and North West Bowel Cancer Screening Hub (Public Health England).

Both studies are funded by the National Awareness and Early Diagnosis Initiative (NAEDI) and are administered by Cancer Research UK. No data from either study will be made available to third-parties and no elements of the work will take place outside the UK.

The two study projects are presented below.

1. Project 1: An enhanced role for primary care in bowel cancer screening: an observational study investigating primary care use among bowel screening non-responders.

Purpose – Despite the efficient provision of bowel cancer screening programmes in the UK, low participation remains a problem, especially in lower socio-economic groups. Primary care professionals can have an important role in increasing participation among non-responders, but little is known about how the non-responders use primary care.

The study’s main aim is to explore and describe the utilisation of primary care services by non-responders to bowel cancer screening 25 months after the last invitation to screening, in order to identify opportunities to engage with the non-responders. It also aims to compare responders and non-responders to identify if there are differences in the way they use primary care.

The primary research questions are:

1. How frequently do non-responders to bowel cancer screening consult with primary care and what are their main reasons for consultation (diagnoses, symptoms and procedures)?
2. Which professionals are more frequently involved in the care of non-responders?
3. How are the non-responders characterised in terms of socio-demographic characteristics such as age, gender, marital status, ethnicity and deprivation?
4. How frequently do non-responders engage in health-seeking behaviours such as health screening programmes (i.e. cervical cancer/breast cancer) or other preventative activities?

Secondary research questions for Project 1 are:

1. Amongst non-responders, are socio-demographic characteristics associated with frequency of attendance to consultations (very low/low frequency attenders versus other attenders)?
2. Amongst non-responders, are lifestyle risk factors for Colorectal Cancer (CRC) associated with frequency of attendance (very low/low frequency attenders versus other attenders)?
3. Are lifestyle risk factors for CRC, multimorbidity and poor health status associated with responder status (non-responders versus responders) to bowel cancer screening?
4. Do the identified patterns of consultation (frequency and main reasons for consultation) vary according to responder status (non-responders versus responders) to bowel cancer screening?

Data will be linked between CPRD and the Midlands and North West (NW) Bowel Cancer Screening (BCS) Hub. The study population is composed of patients living in the Midlands and North West area who are eligible for bowel cancer screening (aged 60-74), classified as either responders or non-responders. CPRD data is requested for all patients who received an invitation from the Bowel Cancer Screening Programme from Apr 2014 to Apr 2016. This is limited to those in the Midlands and North West. The estimated cohort size for non-responders is 66,275.

University of Edinburgh (UoE) will extract data from a 25-month period. Using descriptive statistics, UoE will explore and describe reasons for consultation and patterns of attendance according to the non-responders' socio-demographic characteristics (such as age, gender and deprivation) and calculate consultation rates (taking into account the patients’ age and gender). Using multivariate binary logistic regression, UoE will compare responders and non-responders in order to identify groups in need of more support and information, and examine whether patterns of consultation differ among both groups.

A detailed understanding of how bowel screening non-responders use primary care will allow for the identification of optimum opportunities to engage with them. More effective primary care-based strategies can help to improve bowel screening uptake and reduce current disparities. Furthermore, they have the potential to increase the proportion of cancers diagnosed earlier and reduce mortality from the disease in the long-term.

2. Project 2: The influence of a negative Faecal Occult Blood test (FOBt) on the response of screening invitees and healthcare providers to symptoms of colorectal cancer.

Purpose – Bowel cancer screening has the potential to significantly reduce deaths from colorectal (bowel) cancer and has been introduced across the UK. However, approximately 40% of cancers will not be detected by the test, and therefore there is a need for awareness of the symptoms of colorectal cancer in the general population, and for primary care to respond effectively to symptomatic patients. Previous work has shown that significant numbers of invitees believe that a one off test confers long term protection from the disease.

The aim of this study is to determine whether the pattern of symptom presentation to primary care differs between individuals who have accepted offers of bowel screening and received a negative result, and those who have not yet been invited or declined to take part.

The cohort size is 7,800.

The primary research questions are:

1. Does the pattern of bowel-associated symptom presentation to primary care differ between individuals who have accepted offers of FOBt screening and received a negative result, those who declined their invitation to take part in screening, and those who live in an area where roll-out of the screening programme had yet to commence?
2. Does the pattern of GP referral for Colorectal Cancer (CRC) associated investigations differ between individuals who have accepted offers of FOBt screening and received a negative result, those who declined their invitation to take part in screening, and those who live in an area where roll-out of the screening programme had yet to commence?
3. Does the pattern of bowel-associated diagnoses in primary care differ between individuals who have accepted offers of FOBt screening and received a negative result, those who declined their invitation to take part in screening, and those who live in an area where roll-out of the screening programme had yet to commence?

Secondary research questions for Project 2 are:

1. Does the pattern of bowel-associated consultations in primary care differ by socio-economic status?
2. Does the pattern of bowel-associated consultations in primary care differ between different ethnic groups?

This study will use a linked dataset from CPRD and PHE’s Midlands and North West Programme Hub to investigate patterns of bowel symptom presentation in primary care over a six-month period. The study will also utilize established linkages with Hospital Episode Statistics and civil registration death data to complement routinely linked CPRD data. HES will be used to identify bowel-related investigations and diagnoses, and civil registration death data will be used to determine date and cause of death for any patient who died during the six-month follow-up.

At the completion of this study the University of Edinburgh will have a comprehensive picture of the pattern of response to bowel symptoms amongst invitees to FOBt screening in England and unique insights into how this response is moderated by ethnicity and socioeconomic status. Further, this work has tremendous potential to lead to the better integration of effort of early diagnosis and screening activities in colorectal cancer.

Expected Benefits:

Both studies benefit from collaborators who have important roles in bowel screening provision in England and Scotland, and can influence not only the clinical community, but also policy makers.

Summaries of findings from studies will be prepared and disseminated to the UK Bowel Screening Programmes, Health Psychologists and primary care. Summaries will also be shared with other relevant contacts such as the Scottish Coordinator of Screening Programmes and the study funder (Cancer Research UK). A comprehensive research report for each study will be prepared for the study funder and will also help to inform future discussions with Cancer Screening Programmes. The final reports are expected as soon as results from the studies are available (expected to be in 2019).

In order to disseminate results to primary care professionals, policy makers and researchers (and to meet funder requirements), papers from both studies will be published Open Access. The research team will aim for the British Medical Journal, the British Journal of General Practice and the British Journal of Cancer. Presentations at national (such as the National Cancer Research Institute Annual Conference) and international conferences (such as the Annual Cancer and Primary Care Network (Ca-PRI) Conference) are planned.

“Negative FOBt study”:
This study, along with the complementary study components already published, will provide crucial information to help determine the potential impact of a negative test result on how patients and GPs respond when presented with symptoms associated with a colorectal cancer diagnosis following a negative screening test result.

Screening programmes inevitably miss a proportion of cancers and some cancers will develop between screening rounds. Even with a fully implemented programme, approximately 75% of all colorectal cancers will be diagnosed symptomatically in primary care. This study will provide new and unique insights which will inform on-going initiatives in primary care, in collaboration with the national screening programmes, to promote symptom awareness and encourage prompt help-seeking, timely referral and early diagnosis. Furthermore, it will generate a comprehensive picture of how patients respond to symptoms, and provide insights into which patient characteristics moderate this response. Finally, the study will provide a better understanding of the limitations of colorectal cancer screening tests among screening participants, and will generate benchmark data for further analyses of symptom awareness among patients attending screening with the new faecal immunochemical test (FIT).

“Non-responders using primary care study”:
Bowel cancer screening programmes can contribute to reducing mortality from the disease, but increased participation is required for this to happen. Current uptake in England is below 60%, and there are substantial challenges in ensuring equitable uptake, especially among invitees with lower socio-economic status, men and ethnic minorities.

Evidence shows that a personal recommendation from a GP or other health care professional can increase participation in bowel cancer screening. However, despite the important role that primary care can have in promoting screening uptake, information on the profile of non-responders consulting in primary care is scarce. When primary care strategies (which have been increasing over the years) do not have sufficient information on the patients they are trying to reach, they are missing opportunities to engage with them. A detailed understanding of how non-responders use primary care will allow for the identification of optimum opportunities to engage with patients, especially hard to reach groups who consult in primary care. Study findings will comprehensively describe the profile of patients who require more effective support, information and risk assessment, and will inform target populations for future initiatives aiming to increase informed participation in bowel screening.

More effective primary care-based strategies can help to improve bowel screening uptake and reduce current disparities. Furthermore, they have the potential to increase the proportion of cancers diagnosed earlier and reduce mortality from the disease. These wider benefits are expected in the long-term (5-10 years), and should be considered as part of a larger context in which other public health strategies are developed to increase bowel screening uptake, in addition to providing optimum treatment when a cancer is actually diagnosed.

Outputs:

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide. Data analysis for both studies will commence as soon as the linked datasets are received and is expected to be finished before the end of 2019. Data will not be used for sales and marketing purposes.
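
As a rough illustration of small-number suppression in aggregated outputs, the sketch below masks any non-zero count that falls below a threshold. The function name, the `*` marker, the choice to leave zero counts visible, and the threshold passed in by the caller are all assumptions made for the example; the rules that actually apply are those in the HES Analysis Guide, which this sketch does not reproduce.

```python
def suppress_small_numbers(counts, threshold, marker="*"):
    """Mask non-zero counts below `threshold` in a {group: count} mapping.

    `threshold` is supplied by the caller; no value is assumed here.
    Zero counts are left as-is in this sketch (an assumption).
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }
```

For example, with a hypothetical threshold of 5, `suppress_small_numbers({"60-64": 3, "65-69": 12}, threshold=5)` masks the first group and leaves the second unchanged.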

Research reports will be prepared for both studies and will be submitted to the funder (CRUK). Reports will be used to inform discussions with NHS Cancer Screening Programmes and NSD Scotland. CRUK will also have a summary of the study results for their website. The final reports are expected as soon as results from the studies are available.

Disseminating results to primary care professionals, policy makers and researchers is paramount. All papers will be published Open Access as per the funder’s requirements. Manuscript submission for both studies is expected in 2019. Target Journals from both studies include the British Medical Journal, the British Journal of General Practice and the British Journal of Cancer. Presentations are planned at the National Cancer Research Institute (NCRI) annual meeting and the 11th Cancer and Primary Care Network (Ca-PRI) Conference.

Specific Outputs are described separately for Project 1 and Project 2:

Project 1: An enhanced role for primary care in bowel cancer screening: an observational study investigating primary care use among bowel screening non-responders

Research outputs will fill a gap by examining patterns of health care utilisation in detail, along with the non-responders’ socio-demographic characteristics. The study will investigate attendance at preventative activities (as a proxy for health-seeking behaviour) and socio-demographics as these are associated with higher uptake. It will compare non-responders’ presentation of lifestyle risk factors for CRC according to different frequencies of attendance in order to identify patients who might require more effective support, health promotion and risk assessment than others. By comparing responders and non-responders the study will identify groups in need of more support and information and examine which (if any) patterns are exclusive to non-responders.

In order to inform the data analysis protocol for this study, a literature review of challenges in analysing routine datasets was prepared by the research team. The output was a comprehensive report which was presented at the SAPC Conference and at the Dealing with Data Conference at the University of Edinburgh (2014).

The study is part of a larger project which has also developed and tested the feasibility of a bowel screening brief intervention in routine practice. Feasibility study results are in press at BMJ Open.

Project 2: The influence of a negative Faecal Occult Blood test (FOBt) on the response of screening invitees and healthcare providers to symptoms of colorectal cancer.

The output data from this data-linkage study will provide a comprehensive picture of the pattern of response to symptoms suggestive of colorectal cancer, following a negative FOBt result. Data will include the presentation and frequency of both colorectal-specific and non-specific symptoms, clinical investigations and GP referrals among screening participants in England, and will provide unique insights into how this response is moderated by socioeconomic status.

The study is part of a larger project exploring the influence of a negative FOBt test result on response to symptoms of colorectal cancer. Complementary qualitative components of this project have already resulted in one published article and a second article which is currently under review with the journal Health Expectations.

Processing:

A linked HES-primary care dataset already exists and is held by CPRD, with the linked HES data having previously been provided to CPRD under a Data Sharing Agreement (DSA) with NHS Digital (NIC-15625-T8K6L).

Patient identifiers required for linkage of CPRD Primary care data to the Midlands and North West Bowel Cancer Screening data are the NHS number, date of birth, gender and postcode; these are not needed for the research study itself but will be sent by the GP system providers to NHS Digital. NHS Digital already hold the Midlands and North West Bowel Cancer Screening data on behalf of Public Health England.

The bespoke dataset that will be received by the University of Edinburgh will be pseudonymised data.

This bespoke data linkage requires CPRD and Midlands and North West Bowel Cancer Screening patient identifiers – namely date of birth, postcode, NHS number and gender – to permit accurate linkage of CPRD and Midlands and North West Bowel Cancer Screening datasets into a new single dataset for the research study.

No clinical data from the GP system providers or PHE is sent to NHS Digital, and at no stage do CPRD or University of Edinburgh receive any patient identifiers. NHS Digital hold the clinical data on behalf of PHE. Personal identifiers including name, date of birth, postcode and NHS number are removed at source by the GP system providers and replaced by pseudonymised system patient and practice identifiers (GP System Practice Key and GP System Patient Key) prior to transfer of data to CPRD. CPRD then replaces the original GP System Practice Key and GP System Patient Key with a CPRD patient pseudonym (CPRD Patient Study ID). Identifiable data fields for CPRD patients flow directly from GP system providers to NHS Digital.

The legal support for the lawful flow of identifiable data is primarily CPRD’s s251 support (ref: ECC 5-05 (a)/2012). This support permits “GP practices and specified others (according to the approved ‘Master Dataset’ list) to [1] transfer confidential patient information to NHS Digital; [2] NHS Digital to receive identifiers, undertake linkages and provide the CPRD a de-identified dataset.”

CPRD has obtained further clarification from CAG (via a s251 amendment in December 2017) that the PHE bowel cancer screening dataset (Midlands and North West) is part of CPRD’s Master Dataset List, and that CPRD has ongoing CAG approval for linkages to this dataset.

CPRD also have Research Ethics Committee (REC) approval (ref: 05/MRE04/87) for the research study and this linkage to take place.

CPRD have a Data Sharing Agreement (DSA) with PHE and this permits CPRD to receive and process BCS pseudonymised patient data.

Under the described legal basis, the steps below will be used to transfer, store and process data as part of this linkage.

Step 1. Transfer of patient identifiers

Step 1a.
The Midlands and North West Bowel Cancer Screening (PHE) dataset is held at NHS Digital and not at PHE itself. At the request of CPRD, PHE provides instructions to NHS Digital as the Trusted Third Party (TTP) for linkages to use patient identifiers from the Midlands and North West Bowel Cancer Screening dataset to create pseudonymised study IDs required for linkage. This means that the flow of PHE patient identifiers will remain within NHS Digital.

Step 1b.
In parallel, CPRD requests that participating GP system providers securely provide to NHS Digital a file containing information on all patients held in CPRD. The file consists of the four identifiable data fields (NHS Number, Date of birth, Gender and Postcode) and the GP System Practice Key and GP System Patient Key (pseudonymised data fields assigned to each unique individual in CPRD).

Transfer of data from GP system providers to NHS Digital will be via secure file transfer protocol (SFTP) servers which are encrypted to ensure security of electronic data in transit.

Step 2. Creation and provision of bridging file by the Trusted Third Party

Step 2a. Bridging file to CPRD

NHS Digital match the identifiable data fields held on behalf of PHE and participating GP system providers.

NHS Digital supply CPRD with a bridging file containing pseudonymised patient identifiers (The GP System Practice Key and GP System Patient Key) for each linked patient that can be used to merge the primary care dataset with the Midlands and North West BCS dataset. Additionally, NHS Digital generate and supply a Midlands and North West BCS specific pseudonymised patient identifier for each linked patient (Study ID). NHS Digital securely releases the bridging file via secure file transfer protocol (SFTP) to CPRD. The bridging file will be supplied to CPRD, and CPRD will confirm the linkage as valid.

Step 2b. Bridging file to Trusted Third Party
NHS Digital also releases a second bridging file in parallel containing a Midlands and North West BCS study specific pseudonymised patient identifier for each linked patient (Study ID), to NHS Digital. This is done since the Midlands and North West BCS dataset is held by NHS Digital on behalf of PHE, and not at PHE itself.

Data supplied by the GP system providers to NHS Digital (Step 1b) is utilised for CPRD routine linkage and will be retained.

It is emphasised that following data linkage by NHS Digital using patient identifiable fields, there is no further flow or use of identifiable data at any point past this stage.
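The matching and bridging-file creation described in Steps 1 and 2 can be sketched as below. This is a minimal illustration only: the agreement does not state how NHS Digital performs the match, so exact matching on the four normalised identifier fields is an assumption, and the function names, the field names and the `screening_record_id` column are hypothetical.

```python
import secrets

# The four identifiable fields named in the agreement.
IDENTIFIER_FIELDS = ("nhs_number", "date_of_birth", "gender", "postcode")


def normalise(record):
    """Build a comparable match key: strip spaces and upper-case each field."""
    return tuple(str(record[f]).replace(" ", "").upper() for f in IDENTIFIER_FIELDS)


def build_bridging_files(gp_records, screening_records):
    """Return (cprd_bridge, screening_bridge) for exactly matched patients.

    Assumption: deterministic exact matching on all four identifiers.
    Neither output contains any of the identifiable fields.
    """
    screening_by_key = {normalise(r): r for r in screening_records}
    cprd_bridge, screening_bridge = [], []
    for gp in gp_records:
        match = screening_by_key.get(normalise(gp))
        if match is None:
            continue  # unlinked patients do not appear in either bridging file
        study_id = secrets.token_hex(8)  # random value, not derived from identifiers
        cprd_bridge.append({
            "study_id": study_id,
            "gp_system_practice_key": gp["gp_system_practice_key"],
            "gp_system_patient_key": gp["gp_system_patient_key"],
        })
        screening_bridge.append({
            "study_id": study_id,
            "screening_record_id": match["screening_record_id"],  # hypothetical field
        })
    return cprd_bridge, screening_bridge
```

Because the Study ID is generated randomly rather than derived from the identifiers, a recipient of either bridging file cannot recompute the identifiers from it.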

Step 3. Extraction of matching Keys by NHS Digital
NHS Digital will match the Study ID for the Midlands and North West data and extract the required clinical information. The Study ID generated by NHS Digital for the specific study is then matched to the clinical information and extracted. The new file will contain no personal identifiable details in the datasets. The file containing study ID and clinical variables is sent to CPRD via secure transfer. NHS Digital will apply opt outs to the PHE data before it is disseminated from NHS Digital.

Step 4. Creation of the linked PHE-CPRD dataset by CPRD

Upon receipt of the clinical variables from NHS Digital, CPRD uses the Study ID to match the file containing CPRD GP System Practice Key, GP System Patient Key and the required clinical information at record level. This linked dataset remains at CPRD and is only released to researchers after further pseudonymisation.

Step 5. Dataset extract

The linked dataset held by CPRD is then used to create two project specific linked datasets, limited to the requested patient cohort and clinical information as approved by Independent Scientific Advisory Committee (ISAC) for each project.

This will involve the creation of a file containing a linked BCS (Midlands and North West)-CPRD patient dataset extract for ‘Project 1’ as explained in the Purpose section above.

For ‘Project 2’, CPRD will repeat the process, additionally adding linked HES and ONS data to the PHE-CPRD patient dataset extract using CPRD IDs, and its existing HES and ONS linked datasets provided by NHS Digital.

CPRD will then further pseudonymise the Keys used in the linked dataset extracts to further ensure patient data cannot be identified.
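One common way to carry out this kind of further pseudonymisation is a keyed hash with a project-specific secret, so that the same patient receives unrelated pseudonyms in different project extracts. The sketch below illustrates that general technique only; the agreement does not describe CPRD's actual method, and the function names, field names and secret handling shown here are hypothetical.

```python
import hashlib
import hmac


def pseudonymise_key(key, project_secret):
    """Map a key to a project-specific pseudonym using HMAC-SHA256.

    project_secret must be bytes and is assumed to be held only by the
    organisation producing the extract.
    """
    digest = hmac.new(project_secret, str(key).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability in this sketch


def pseudonymise_extract(rows, key_fields, project_secret):
    """Return copies of rows with each listed key field replaced by its pseudonym."""
    return [
        {
            field: pseudonymise_key(value, project_secret) if field in key_fields else value
            for field, value in row.items()
        }
        for row in rows
    ]
```

Using a different secret per project means the two extracts cannot be joined to each other on the pseudonymised keys.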

Step 6. Creation of study dataset by CPRD and release to UoE

Prior to release of the linked dataset extract, CPRD ensures the UoE researchers have signed a bespoke Dataset Agreement (inclusive of any additional PHE terms and conditions) which has been previously agreed with PHE. CPRD then transfers, with approval of PHE, and via secure file transfer protocol (SFTP), the two dataset extracts to the University, and confirms safe receipt of them. Project 1 will have a pseudonymised data extract containing the Midlands and North West Bowel Cancer Screening data linked to the CPRD primary care data. Project 2 will have a pseudonymised data extract containing the Midlands and North West Bowel Cancer Screening data linked to CPRD primary care data that has previously been linked to IMD, HES and mortality data. The IMD, HES and mortality data are part of the established routinely linked dataset which CPRD receive as part of a separate Data Sharing Agreement with NHS Digital.

Analysis undertaken
The linked datasets received by Edinburgh researchers will not be linked again with any other data by Edinburgh. In order to answer the research questions, both descriptive statistics and multivariate analysis of data using a conditional logistic regression model will be carried out.

The data linkage taking place under this application will also be available to other researchers subject to a suitable application submitted through CPRD’s ISAC process. The bowel cancer screening data will be linked to the wider CPRD database and will be available to other researchers subject to a suitable application submitted through CPRD’s ISAC process.


Project 4 — DARS-NIC-113074-D9M1C

Type of data: information not disclosed for TRE projects

Opt outs honoured: Yes - patient objections upheld (Section 251, Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7)

Purposes: ()

Sensitive: Non Sensitive

When:2019.02 — 2019.07.

Access method: One-Off

Data-controller type:

Sublicensing allowed:

Datasets:

  1. MRIS - Bespoke

Objectives:

5a. Objective for processing:

The data controller is Department of Health and Social Care, with the Secretary of State for Health and Social Care (acting as part of the Crown), acting through the Clinical Practice Research Datalink centre (hereinafter referred to as CPRD) within the Medicines and Healthcare Products Regulatory Agency. This is the same arrangement for the data processor although it is CPRD who actually process the data but are not listed as data processors.

The data processor is Department of Health and Social Care

The Clinical Practice Research Datalink (CPRD) is a centre of the Medicines and Healthcare products Regulatory Agency (MHRA), an executive agency of the Department of Health & Social Care (DHSC). The MHRA regulates medicines, medical devices and blood components for transfusion in the UK and the MHRA acts as the Executive agency.

CPRD is the UK’s pre-eminent research service, providing access to primary care data (that has been anonymised) linked by NHS Digital to other similarly pseudonymised health data. This data is provided by NHS Digital and others for the purposes of public health research including the monitoring of drug safety. All such data is linked (in its identifiable form) by NHS Digital only. It is jointly funded by the MHRA and the National Institute for Health Research (NIHR).

CPRD’s aims are to support vital public health research and to inform advances in patient safety in the delivery of patient care pathways. These depend on access to accurate, real-time representative patient data to produce reliable evidence based clinical and drug safety guidance. The legal bases for processing the data provided by NHS Digital are:

• Gathering of GP patient data and collation with other data sets to produce data-sets that have been anonymised: medical research under Article 9(2)(j); drug and device safety under Article 9(2)(i) of the General Data Protection Regulation

CPRD services are designed to maximise the way de-identified NHS clinical data can be used to improve and safeguard public health. For more than 20 years data provided by CPRD have been used in a range of drug safety and epidemiological studies that have impacted on health care, and resulted in over 1700 peer-reviewed publications. In addition to supporting high-quality observational research, CPRD is developing world-leading services based on using real world data to support clinical trials and intervention studies. The intention is to continue to link CPRD primary care data to NHS Digital’s secondary care and other datasets, as linkage greatly increases the scale, depth, completeness and therefore value of data available for public health research. The outputs of such research based on linked data in turn improve and protect patient care pathways/treatments and provide clinical benefits for the UK, supporting delivery of CPRD’s core objectives.

CPRD’s research and data services are based on a database of de-identified longitudinal primary care records contributed by consenting GP practices from the four UK nations, and on the ability to link primary care data to secondary care data (and other data sets), from the NHS, Office for National Statistics (ONS) and Public Health England (PHE). One of CPRD’s main priorities is to increase the number of national data sets that are linked to primary care data and made available on a routine basis to the research community. Such collection and linkages occur under the appropriate permissions (ethical and s251), which have been granted to CPRD by the East Midlands & Derby Research Ethics Committee (REC), and the Health Research Authority (HRA).

NHS Digital has been providing secondary and other data for linkage with CPRD primary care data for a number of years. Data linkage is carried out exclusively by NHS Digital as the Trusted Third Party (TTP) for this purpose. Linked data sets currently available include extracts from Civil Registration data; Hospital Episode Statistics (HES), which encompasses Admitted Patient Care, Critical Care, Outpatient and Accident & Emergency data; Patient Reported Outcome Measures (PROMs); Diagnostic Imaging Dataset (DID); Mental Health data; National Cancer Registry; Deprivation data including Townsend Score and Index of Multiple Deprivation. Critical care is supplied as a separate dataset by NHS Digital, but is integrated with Admitted Patient Care.

Data can only be used for public health research purposes in research recommended for approval by ISAC for MHRA database research. CPRD make the final decision on access, and ensure compliance with NHS Digital’s requirements within the data sharing agreement, e.g. security of the third party. Access to CPRD data and services will not be permitted in circumstances that may result in loss of public trust or for activities that may undermine the integrity of the CPRD database.

For this study CPRD will receive the linked data file from NHS Digital. Imperial College London will send nominal pollution codes and English postcodes to NHS Digital. St Georges University of London will receive the final pseudonymised dataset from CPRD.

The legal bases for CPRD processing the data linked by NHS Digital are Article 9(2)(j) and Article 6(1)(e) of the General Data Protection Regulation. The request is not for NHS Digital data but for NHS Digital to carry out a trusted third-party data linkage between the English postcodes sent by Imperial College London and the primary care health records sent by the GP system providers on behalf of CPRD. Section 251 support is in place to cover the linkage.

Associations between long-term concentrations of outdoor air pollution and health have been evaluated using epidemiological cohort studies. Substantial reviews of the epidemiological, toxicological and mechanistic literature have concluded that the evidence is sufficient, or suggestive, to infer causality for a range of health outcomes.

The US Health Effects Institute (HEI) identified in their 2014 Research Agenda the need to improve understanding of the nature of the relationships between pollutants and health at low levels of air pollution currently prevalent in North America, Europe, and other high-income countries.

In response to the HEI’s request for applications, a consortium of European investigators proposed a joint study utilising existing individual and administrative cohorts. The ESCAPE cohort study (European Study of Cohorts for Air Pollution Effects) has analysed previously a number of individual cohorts across Europe. The proposal from the European consortium, now funded by the HEI, aims to develop previous cohort studies by group members by combining a pooled analysis of the ESCAPE cohorts together with local analyses of the administrative cohorts utilising pollution concentrations derived from European scale and local pollution models at a 100m grid resolution.

The Health Effects Institute has funded 14 institutions in a European collaborative study to bring together cohorts from a large number of countries, including England, to study the associations between concentrations of pollutants and mortality and disease incidence. None of the 14 institutions will have a role in this study and the linked dataset. The US Health Effects Institute fund St Georges, University of London to carry out the work; however, they have no control over the outputs of the study.

The aim of the study is to assess associations between long-term average concentrations of particulate matter, nitrogen dioxide, sulphur dioxide, black carbon and ozone and the risk of death and disease incidence in England.

This investigation comprises a survival analysis incorporating measures of air pollutants, patient characteristics such as age, sex, body mass index, smoking status and index of multiple deprivation score for all-cause and cause-specific mortality and the incidence of coronary and cerebrovascular disease, dementia, and lung cancer.

Annual concentrations of pollutants including nitrogen dioxide, particles and ozone will be provided by Imperial College London in pseudonymised form to CPRD, who will construct the final dataset. The pollutant concentrations have been derived from models based upon data from satellites, land utilization data and monitoring stations. Noise levels have also been derived from statistical models based upon measurements and building topology. The pollution data is provided for all postcodes (postcode centroid) in England for 2010 and are also extrapolated to other years.

The data will not be used for commercial purposes, not provided in record level form to any third party and not used for direct marketing.

Expected Benefits:

The expected benefit from this study will be an improved understanding of the nature of the relationships between pollutants and health at low levels of air pollution currently prevalent in the UK. Air pollution, particularly nitrogen dioxide and particles emitted in diesel exhaust, continues to be of concern to government agencies, health organisations, environmental groups and the public. In 2009 the UK Committee on the Medical Effects of Air Pollutants concluded that the available evidence supported a causal association between long-term exposure to particulate air pollution, represented by PM2.5, and mortality. A recent assessment of the consequence of life long exposure to air pollution by the Royal Colleges highlighted the dangerous impact on the nation’s health. Clarification of the nature of the relationships at relatively low concentrations will enable the burden of air pollution at current levels and the impact of any policy scenarios to be evaluated more accurately hence leading to appropriate, cost effective, pollution abatement strategies leading to improved protection of human health and the environment.

The outputs from this study will provide evidence for the association between air pollution and health. These results will be incorporated into evidential assessments by national and international bodies such as the UK Committee on the Medical Effects of Air Pollutants (COMEAP), the WHO and the US Environmental Protection Agency. Such assessments are used in setting guideline and limit values for air pollution and provide inputs to cost benefit modelling exercises such as Defra’s current assessment of mitigation strategies for the UK to meet NO2 limit values as directed by the European Commission and confirmed by UK courts after challenge by ClientEarth.

The outputs will be hazard ratios and confidence intervals for a range of diseases. These measures are incorporated into systematic reviews and meta-analysis undertaken by governments/health organisations. The large English cohort will contribute data to these reviews as well as provide specific evidence for a UK population. A recent example of how these data feed into evidential reviews and into policy and public health benefits is the recently published NO2 review by COMEAP. The summary coefficient for nitrogen dioxide (NO2) and mortality from the review was used by Defra in their cost-benefit analysis to determine strategies to achieve mandatory reductions in concentrations of NO2. The HRs were used to quantify reduction in years of life lost which translates into monetary benefits.

The benefits of the outputs from this study will be improved information for the characterisation of the effects of air pollution on health in the UK. As described, the outputs feed into a process that leads to the formulation of air pollution control strategies that will reduce the risks of long-term exposure to air pollution in the general population. The outputs from this research will be disseminated via conference presentations and publications in the peer review literature. Publication in open-access journals enables the results to reach the widest audience world-wide and ensures the results are included in literature searches as part of systematic reviews. The outputs will provide coefficients for input into cost benefit models in order to formulate appropriate mitigation strategies to reduce air pollution emissions. Examples might include controls on engine emissions, traffic volumes or low emission zones. For example, such plans are detailed in Defra’s consultation for reducing NO2 concentrations.

Air pollution exposure is ubiquitous. The Royal Colleges recently assessed the lifelong burden of air pollution exposure. They estimate that 40,000 deaths per annum were attributed to long-term exposure to outdoor air pollution. The benefits will be achieved by the data controller and third parties as described above. The outputs are an important input to evidential reviews and cost benefit analysis undertaken by Government departments, Health organisations and academics. The benefit will be measured using years of life lost (for mortality) and the attributable number of deaths. The health effects of air pollution are routinely monitored (by COMEAP for example) and reviews routinely undertaken. WHO is currently undertaking a review of the evidence in support of its revision of the air pollution guidelines. The US EPA also regularly updates its assessments. Depending upon the findings from this study these organisations may consider updating their recommendations or they will include them in their next planned assessments.

Outputs:

The outputs from the analyses comprising summary statistics and hazard ratios and associated 95% confidence intervals will be included in a report to the study sponsors (Health Effects Institute). The findings will be published in specialist peer reviewed epidemiological journals to be decided at the end of the study. The findings from the study will be presented at the first International Society for Environmental Epidemiology meeting and the first HEI annual review meeting after completion of the study. Publication in an HEI report and in peer review journals will enable the findings from the study to be included in evidential reviews by organisations and Governmental agencies such as the UK Committee on Air Pollution, World Health Organisation and the US Environmental Protection Agency. These evidential reviews provide the scientific basis for advice to Government Departments in cost/benefit calculations e.g. the recent Air Quality Strategy published by Defra.

Imperial College London and CPRD will also disseminate the study findings on their websites and internal newsletters/ publication. No individual level data will be included in any reports, journal publications or conference abstracts/presentations/posters. The target date for the production of the output is 30/06/2019.

For the pathways of dissemination of the outputs there will be presentations at scientific conferences: the annual HEI conference and the International Society for Environmental Epidemiology both of which are open to stakeholders and the public. The output will also be published in peer reviewed open access journal papers. No specific public / patient engagement activities are currently planned but suitable routes of dissemination will be considered and put in place.

All outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide.
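Small-number suppression of aggregate counts can be sketched as below. The threshold of 7 and the `*` marker are assumptions for illustration; the rule actually applied should be taken from the HES Analysis Guide itself, and the function name is hypothetical.

```python
def suppress_small_numbers(counts, threshold=7, marker="*"):
    """Replace counts between 1 and threshold (inclusive) with a marker.

    counts: dict mapping a category label to a non-negative integer count.
    Zero counts and counts above the threshold are left unchanged.
    """
    return {
        label: (marker if 0 < count <= threshold else count)
        for label, count in counts.items()
    }
```

For example, under these assumed settings a table `{"A": 3, "B": 8, "C": 0}` would be published as `{"A": "*", "B": 8, "C": 0}`.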

Processing:

The only identifier required for the linkage of CPRD primary care data to Imperial College London pollution data is patient postcode; this is not needed for the research study itself but will be sent by the GP system providers and Imperial College London to NHS Digital to generate the bridging file. The GP system providers will not submit any other identifiers to NHS Digital. This bridging file will contain the nominal codes that have successfully been linked to the CPRD primary care data and the patient pseudonyms, which will be used by CPRD to create a linked pseudonymised dataset.

The final dataset that will be sent to St. George’s, University of London will be pseudonymised data.

This data linkage requires the CPRD and Imperial College London identifier (postcode) to permit accurate linkage of CPRD’s primary care health records and Imperial College London’s air pollution datasets for all English practices into a new linked dataset for the research study.

Imperial College London provide only environmental data for all English postcodes to NHS Digital. This data is generated from annual average concentrations of air pollutants (particulate matter, nitrogen dioxide, ozone and black carbon), which were modelled using a state-of-the-art European model that combines information from satellite data and chemical transport models with information on the road network, land use and monitored pollutant concentrations. These models were developed and published by the Swiss Tropical and Public Health Institute, Basel, as part of this project. These air pollution maps (100m x 100m resolution) were sent to Imperial College London. A second set of modelled pollutant concentrations was produced by Imperial College London using UK specific land use data and model specification. Imperial College London then linked each English postcode centroid (x,y coordinate) to these air pollution maps using a geographic information system.

Annual estimates of noise exposures will be assigned to English postcode centroids using a version of the CNOSSOS-EU model developed by Imperial College London.

Imperial College London and St Georges University London do not know which postcodes contain patient data held in CPRD. No clinical data from the GP system providers is sent to NHS Digital, and at no stage do CPRD, Imperial College London or St George’s, University of London receive any patient identifiers. Personal identifiers including name, date of birth, postcode and NHS number are removed at source by the GP system providers and replaced by pseudonymised system patient and practice identifiers (GP System Practice and Patient ID) prior to transfer of data to CPRD. CPRD then replaces the original GP System Practice and Patient ID with a CPRD patient pseudonym (CPRD Patient ID). Identifiable data fields for CPRD patients flow directly from GP system providers to NHS Digital.

The legal basis for the lawful flow of identifiable data is primarily CPRD’s s251 support (ref: ECC 5-05 (a)/2012). This support permits “GP practices and specified others (according to the approved ‘Master Dataset’ list) to [1] transfer confidential patient information to NHS Digital; [2] NHS Digital to receive identifiers, undertake linkages and provide CPRD a de-identified dataset.”

Under the described legal basis, the steps below will be used to transfer, store and process data as part of this linkage.

Step 1. Transfer of identifiers (transfer of data from Imperial College London to NHS Digital, and from GP system providers to NHS Digital, will be via secure file transfer protocol (SFTP) servers which are encrypted to ensure security of electronic data in transit).

Step 1a.
Imperial College London will securely provide to NHS Digital as the Trusted Third Party (TTP) for linkages, a file containing all English postcodes held in Imperial College London, since at this stage, it is not clear which postcodes will link to the CPRD patient data and be relevant to the study. The file consists of one data field (English Postcode) and the Imperial College London nominal pollution code (a pseudonym attached to each English postcode within the Imperial College London dataset and used to link the pollution data). The nominal code sent by Imperial College London is for the creation of the bridging file sent to CPRD, which is explained in step 2.

Step 1b.
In parallel, CPRD requests that participating GP system providers securely provide to NHS Digital, a file containing information on all patients held in CPRD. The file consists of the one identifiable data field (Postcode) and the GP System Practice Key and Patient Key (pseudonymised data fields assigned to each unique individual in CPRD).

Step 2. Creation and provision of bridging file by the Trusted Third Party

NHS Digital match the identifiable data field (Postcode) received from GP system providers to the English postcode and nominal pollution code file from Imperial College London.

NHS Digital supply CPRD with a bridging file containing a pseudonymised patient identifier (Study ID), Imperial College London nominal pollution code and the GP System Practice Key and Patient Key for each linked postcode that can be used to merge the primary care dataset with the second Imperial College London dataset containing nominal pollution codes and postcodes.

Additionally, NHS Digital generate and supply a study specific pseudonymised patient identifier for each linked patient (Study ID). NHS Digital securely releases the bridging file via secure file transfer protocol (SFTP) to CPRD. Once the bridging file has been supplied to CPRD, and CPRD confirm the linkage as valid, NHS Digital will delete the file supplied by Imperial College London (Step 1a). Imperial College London Data that has not been matched from the primary care dataset with the Imperial College London dataset will be deleted by NHS Digital.

Data supplied by the GP system providers to NHS Digital (Step 1b) is utilised for CPRD routine linkage and will be retained.

It is emphasised that following data linkage by NHS Digital using patient identifiable fields, there is no further flow or use of identifiable data at any point past this stage.
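The postcode match in Step 2 can be sketched as below. This is an illustration under stated assumptions: the agreement does not say how postcodes are compared, so normalising by removing whitespace and upper-casing is an assumption, and the function and field names are hypothetical.

```python
import secrets


def normalise_postcode(postcode):
    """Remove all whitespace and upper-case, so 'ab1 2cd' matches 'AB12CD'."""
    return "".join(postcode.split()).upper()


def build_postcode_bridge(gp_rows, pollution_rows):
    """Return bridging rows for patients whose postcode has a nominal pollution code.

    gp_rows: dicts with 'postcode', 'gp_system_practice_key', 'gp_system_patient_key'.
    pollution_rows: dicts with 'postcode' and 'nominal_pollution_code'.
    The postcode itself is deliberately left out of the output.
    """
    code_by_postcode = {
        normalise_postcode(r["postcode"]): r["nominal_pollution_code"]
        for r in pollution_rows
    }
    bridge = []
    for row in gp_rows:
        code = code_by_postcode.get(normalise_postcode(row["postcode"]))
        if code is None:
            continue  # no pollution code for this postcode, so no bridging row
        bridge.append({
            "study_id": secrets.token_hex(8),  # random study-specific pseudonym
            "nominal_pollution_code": code,
            "gp_system_practice_key": row["gp_system_practice_key"],
            "gp_system_patient_key": row["gp_system_patient_key"],
        })
    return bridge
```

Because only the nominal pollution code and the pseudonymised keys leave the match, the recipient of the bridging file never sees a postcode.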

Step 3. Extraction and provision to CPRD

Imperial College London will send to CPRD a file containing all the nominal pollution codes with the pollution data securely by SFTP.

Step 4. Creation of study dataset by CPRD

CPRD use the GP System Practice and Patient IDs in the bridging file (supplied by NHS Digital and initially provided by the GP System Providers) to generate the associated CPRD patient pseudonym (CPRD Record Key) using internal lookup files.

A patient cohort file containing CPRD Record Key is combined with the CPRD Patient Key generated from the bridging file received from NHS Digital (Step 2) to generate a list of Imperial College London nominal pollution codes corresponding to each CPRD patient in the cohort.

The bridging file supplied to CPRD in Step 2, which contains the nominal pollution code, will be used to link the pollution data in the file sent by Imperial College London in Step 3. The air pollution data that has not been linked will be discarded by CPRD. CPRD creates an anonymised study dataset for release to researchers containing Imperial College London nominal pollution codes for all CPRD patients in the cohort. CPRD Record Key and Imperial College London nominal pollution code are not included in the dataset.
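The assembly of the study dataset in Step 4 can be sketched as below: pollution values are attached to each cohort patient through the nominal pollution code, and the linkage keys are then dropped. This is a hedged illustration, not CPRD's implementation; the function name and the data shapes (plain dictionaries keyed by record key and by nominal code) are assumptions.

```python
def build_study_dataset(cohort, bridge, pollution_by_code):
    """Attach pollution values to cohort rows and drop the linkage keys.

    cohort: list of dicts, each with 'cprd_record_key' plus study variables.
    bridge: dict mapping cprd_record_key -> nominal pollution code.
    pollution_by_code: dict mapping nominal pollution code -> dict of values.
    """
    dataset = []
    for patient in cohort:
        code = bridge.get(patient["cprd_record_key"])
        exposures = pollution_by_code.get(code)
        if exposures is None:
            continue  # patient has no linked pollution data
        # Copy the study variables without the record key; the nominal code
        # is used only for the lookup and is never written to the output row.
        row = {k: v for k, v in patient.items() if k != "cprd_record_key"}
        row.update(exposures)
        dataset.append(row)
    return dataset
```

Pollution entries whose nominal code is never looked up simply do not appear in the result, which mirrors the discarding of unlinked air pollution data.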

Step 5. Release of study dataset to St George’s, University of London

CPRD ensure that the research applicants (St George’s, University of London) have signed a bespoke Dataset Agreement, previously agreed with Imperial College London. This will include any additional terms and conditions required by Imperial College London, before any release of the linked data outside of CPRD.

The study dataset is then sent securely to St George’s, University of London by CPRD, using SFTP. St George’s, University of London researchers use the study dataset under the Dataset Agreement to produce research outcomes as approved under their Independent Scientific Advisory Committee (ISAC) protocol.

CPRD retain a copy of the study dataset for archiving purposes once the data has been successfully transferred to and verified by St George’s, University of London. CPRD also deletes Imperial College London data not included in the study after preparation of the dataset has been undertaken (Step 4).

The resulting dataset will be accessible solely by employees of St George’s, University of London, who will process and analyse the data to obtain findings for research outcomes. The data will be held on St George’s, University of London servers in the UK and will not be stored elsewhere at any time.

The request for this particular type of data linkage has been initiated by St George’s, University of London. After this initial linkage and dissemination, the environmental data provided by Imperial College London will be linked to the wider CPRD database, and the data will also be available to other researchers subject to a suitable application submitted through CPRD’s ISAC process.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).