NHS Digital Data Release Register - reformatted
Medicines And Healthcare Products Regulatory Agency (mhra) projects
- CPRD Data Linkage Scheme (ODR1819_CRPD2019)
- R23 - Clinical Practice Research Datalink (CPRD) Routine Linkages Application
- Pelvic Floor Registry and Surgical Devices and Implant data to support medical device safety vigilance
- Project 4
2714 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).
🚩 Medicines And Healthcare Products Regulatory Agency (mhra) was sent multiple files from the same dataset, in the same month, both with optouts respected and with optouts ignored. Medicines And Healthcare Products Regulatory Agency (mhra) may not have compared the two files, but the identifiers are consistent between datasets, and outside of a good TRE NHS Digital can not know what recipients actually do.
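A minimal sketch of why consistent identifiers matter here, using made-up record IDs (the set names and values are hypothetical, not taken from any released file): if the same pseudonymous IDs appear in both versions of a file, a set difference is enough to isolate the records that the opt-out removed.

```python
# Hypothetical pseudonymous record IDs; all values are invented for illustration.
with_optouts_ignored = {"p001", "p002", "p003", "p004"}
with_optouts_respected = {"p001", "p003"}

# Because the identifiers are consistent across both files, a plain set
# difference isolates the records that were removed by the opt-out.
opted_out = with_optouts_ignored - with_optouts_respected
print(sorted(opted_out))
```

Nothing here identifies a person by name, but it shows that the opt-out status itself becomes recoverable once both files are held together.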
CPRD Data Linkage Scheme (ODR1819_CRPD2019) — DARS-NIC-656848-T9J1Q
Type of data: information not disclosed for TRE projects
Opt outs honoured: Anonymised - ICO Code Compliant (Does not include the flow of confidential data)
Legal basis: Health and Social Care Act 2012 s261(2)(a)
Purposes: No (Agency/Public Body)
Sensitive: Non-Sensitive
When:DSA runs 2022-11-21 — 2023-10-31
Access method: One-Off
Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE
Sublicensing allowed: No
Datasets:
- NDRS Cancer Registrations
Objectives:
Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service, of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital on 1st October 2021.
Clinical Practice Research Datalink (CPRD) is a not-for-profit research data service,
jointly funded by the Medicines and Healthcare products Regulatory Agency (MHRA)
and the National Institute for Health Research (NIHR), to improve public health by
providing observational and interventional research services.
CPRD's research data services are based on patient level primary care electronic
health records (EHRs) extracted from GP practices that have consented to contribute data under the rigorous regulatory, information governance (IG) and security arrangements outlined below. All patient identifiers are removed at source by the GP practice system providers before release of limited fields within EHRs to CPRD, and
each record is assigned a data subject pseudonym before data flows to CPRD. EHR
records sent from GP practice system providers to CPRD are, from that point,
considered de-identified pseudonymised records, which CPRD cannot identify (as
only GPs themselves hold the key to reversing pseudonyms).
A central part of CPRD's observational research services is to provide anonymised
primary care data, linked to other health care data sets, for the purpose of health research leading to clinical benefits for the UK. The ability to link primary care data to
other datasets is reliant on the presence of data subject pseudonyms being allocated by GP system providers, before collection by CPRD.
This Contract (1) supports the lawful disclosure of patient identifiers for the purpose of linkage with primary care data; and (2) the release of
pseudonymised NDRS Data relating to patients represented in both data sources.
Releases of anonymised row level data will be for the purpose of conducting Observational Research for public health benefit. Access to the data by CPRD clients, and
by internal CPRD researchers, will only be granted by CPRD following approval by
the MHRA Independent Scientific Advisory Committee for MHRA Database Research (ISAC). Controls will be placed on use of the data by the client through a licence agreement which mirrors the requirements of NHS Digital. This agreement must be signed by the client before any release of linked health data can occur.
Approval to Access Research Data
CPRD clients include academic institutions, life science companies, governmental
bodies, and research charities. Linked NDRS Data are only ever made available by
CPRD to such clients, for use within research protocols that are approved by ISAC.
Following protocol approval by ISAC, the research organisation supporting the Chief
Investigator enters into a legal contract with CPRD. The contract covers the terms,
conditions and obligations related to access to and use of data for the approved research study.
All researchers are made fully aware they must not use data for any other purposes
than the research specified in the approved protocol. An alteration to the agreed research protocol, an extension to the study or new use of data requires additional or
new ISAC approval and a new legal arrangement to be put in place between CPRD
Researchers are encouraged to publish their findings and to share
these with the regulators when appropriate.
As stated above, CPRD itself does not receive any identifiable patient data (with the
exception of gender) as all personal identifiers are removed at source by GP system
providers before EHR record level data is provided to CPRD.
Scientific Governance
The MHRA Independent Scientific Advisory Committee for MHRA Database Research (ISAC) is a non-statutory expert advisory body that was established in 2006 by the Secretary of State for Health. Its role is to provide expert advice on Observational Research protocols that seek access to data available through CPRD. When new data linkages are established, CPRD works with the data source and Data Controller to establish suitable arrangements for representation on ISAC, or through an alternative streamlined review mechanism.
Applications may be shared with the Health Research Authority's (HRA) Confidentiality Advisory Group (CAG) if they include disclosure-risk categories agreed in
CPRD's Section 251 approval (see below).
Contractual control
CPRD licence agreements control the use of data by individual researchers and their
host organisation. Licence agreements (i.e. sub-licence and dataset agreement),
which are contractually binding in nature, limit the use of data to medical and health
research purposes and impose confidentiality controls to protect against any further
risk of patient identification. CPRD ensures that the use of data by CPRD and their
clients complies with Regulation 2 of The Health Service (Control of Patient Information) Regulations 2002. All research using linked NDRS data requires prior approval by ISAC, on a study-specific protocol basis, and NDRS releases study-specific data for linkage.
Section 251 approval
HRA support is provided to enable NHS Digital, as CPRD's Trusted Third Party
(TTP), to receive and process a defined and minimum number of personal identifiers
(NHS number, full date of birth, postcode, gender), without breaching the Common
Law Duty of Confidentiality. These are securely and directly provided to NHS Digital
by participating GP system providers, and by NDRS.
Identifiers are provided under HRA support to enable the linking of CPRD primary
care records to a wide range of kinds of secondary datasets relating to the provision
of care and public health in England, as set out in CPRD's Master Dataset List.
All observational studies operate under this HRA approval, termed section 251 support, as the appropriate legal gateway to enable the processing of identifiable data
used for the purpose of linkage by the TTP without breaching patient confidentiality.
CPRD must obtain, and maintain, annually renewed approval from CAG to lawfully
undertake linkages by the Trusted Third Party, and residual identifiers that have an
agreed research purpose (e.g. Date of Death). CPRD is required to submit amendment requests to CAG for new linkages not covered by its Master Dataset List. A
new dataset will only be linked once the CAG approval is obtained.
Ethics approval
The HRA's East Midlands Derby Research Ethics Committee (REC) has granted
overarching approval to CPRD to collect and use anonymised primary care data for
purposes of observational public health research studies. Any study that involves an
intervention or interaction with patients requires a separate ethical review and approval before any data can be provided.
Yielded Benefits:
Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such there are some yielded benefits to be observed from the access to the data for the study prior to NHS Digital becoming data controller. These yielded benefits are noted below. CPRD publishes a register of all approved studies (https://cprd.com/approved-studies-using-cprd-data), in which the detail on each study includes a lay and technical summary, the health outcomes to be measured and details on the organisations involved. The 3 case studies presented below highlight how linked CPRD-NHSD data has supported public health research, especially in response to the COVID-19 effort in recent years. Research using linked CPRD data benefits patients in the UK indirectly by contributing to the evidence base for medicine and public health, which in turn informs public health policy, programmes and clinical guidelines.
Case study 1: Higher risks of flu and COVID-19 for cancer survivors (CPRD protocol 20_082). Older individuals and people with certain health conditions are known to be at higher risk of severe illness if they contract viruses such as flu and COVID-19. This includes people who had certain cancers diagnosed recently and are receiving treatments like chemotherapy. In the UK there are more than two million cancer survivors. To investigate whether people who had cancer some time ago are also at higher risk from flu and COVID-19, a study was carried out using CPRD data. CPRD GOLD was linked to the Hospital Episode Statistics Admitted Patient Care (HES APC) database, cancer registrations from the National Cancer Registration and Analysis Service (NCRAS), death registrations from the Office for National Statistics mortality database, and postcode-based Index of Multiple Deprivation data.
Researchers found that survivors from a wide range of cancers are more likely than people in the general population to be hospitalised or die from flu, even several years after their cancer diagnosis. The raised risks were most likely for blood cancer survivors. Because flu and COVID-19 are both respiratory viruses, this suggested that cancer survivors also have a higher risk of severe COVID-19. The study also showed that cancer survivors were more likely to have other diseases that are associated with increased risk of severe COVID-19, such as heart disease, diabetes, respiratory disease and kidney disease. The findings support the UK policy recommendation to include all blood cancer survivors as one of the priority groups to receive the COVID-19 vaccination. The study findings could also support any work by others to prioritise vaccinations and treatments for longer-term cancer survivors.
Reference 1: Carreira H, Strongman H, Peppa M, McDonald H, dos-Santos-Silva I, Stanway S, Smeeth L, Bhaskaran K. Prevalence of COVID-19-related risk factors and risk of severe influenza outcomes in cancer survivors: a matched cohort study using linked UK electronic health records data. EClinicalMedicine, Volume 29, 100656, December 2020. https://doi.org/10.1016/j.eclinm.2020.100656 https://cprd.com/protocol/covid-19-related-risks-cancer-survivors-matched-cohort-study-using-linked-uk-electronic
Case study 2: Vaccine uptake in pregnancy. Vaccination is an effective way to prevent infectious diseases. However, people with certain social characteristics, such as those living in more deprived areas, may be less likely to receive vaccination. Addressing inequalities in vaccine uptake to prevent infections is a key priority for public health. This cohort study examined the social factors that may be associated with lower uptake of flu vaccine and whooping cough vaccine by pregnant women.
It considered a range of social determinants including maternal age, ethnicity, socioeconomic status, number of children in the household and region. The study used linkages between CPRD data, the CPRD GOLD Pregnancy Register, Hospital Episode Statistics (HES) and Office for National Statistics (ONS) small-area-level deprivation data. The researchers concluded that more targeted campaigns have the potential to reduce vaccine-preventable disease among infants and pregnant women, and to reduce health inequalities. Identifying social factors that are associated with lower vaccination rates could help to design programmes to improve vaccine uptake for specific groups of individuals.
Reference 2: Walker et al. Social determinants of pertussis and influenza vaccine uptake in pregnancy: a national cohort study in England using electronic health records. BMJ Open 2021;11:e046545. https://doi.org/10.1136/bmjopen-2020-046545 https://cprd.com/protocol/social-determinants-uptake-maternal-influenza-and-pertussis-vaccine
Case study 3: Health of mothers of children with a life-limiting condition. More than 86,000 children and young people in England are now living with medical conditions that may ultimately shorten their life and cause death in childhood or young adulthood. Mothers of children with a severe health condition or whose child has died are more likely themselves to die earlier than other mothers. The lack of studies quantifying the mental health of mothers of children with a life-limiting condition has been highlighted by the National Institute for Health and Care Excellence. In this first part of a larger research programme, researchers used CPRD data to investigate the types of physical and psychological health conditions diagnosed in mothers of children with a life-limiting condition. The CPRD GOLD Pregnancy Register was used to link mothers and their children's healthcare data.
The study also used linkages to Hospital Episode Statistics (HES), Mental Health Minimum Dataset (MHMDS) and Office for National Statistics (ONS) death certificate data. The study concluded that mothers of children with life-limiting conditions have much higher rates of physical health problems, mental illness, and death. These findings were flagged as an Alert for important research by the NIHR. Prior to this study, little research had explored the health of this group of women. Knowing more about the health problems that the women experience could help in the design of specific healthcare interventions.
Reference 3: Fraser et al. Health of mothers of children with a life-limiting condition: a comparative cohort study. Archives of Disease in Childhood 2021;106:987-993. http://dx.doi.org/10.1136/archdischild-2020-320655 https://cprd.com/protocol/life-limiting-conditions-health-children-and-their-mothers
NIHR Evidence - Mothers of children with life-limiting conditions are at risk of serious health problems - Informative and accessible health and care research https://evidence.nihr.ac.uk/alert/children-life-limiting-conditions-mothers-more-likely-to-die/
Expected Benefits:
Studies using NHS Digital data linked to CPRD primary care database are expected and required to demonstrate likely benefits to patients in England which may be via informing better clinical care or public health policies. Some study results may also be used to support regulatory decision making both within and outside the UK, directly affecting the approval or removal of drugs or devices (and hence their availability) or guidance as to their use. Some examples of expected benefits included by researchers in recent RDG applications include:
Estimates of prevalence, incidence and healthcare burden of specific types of psoriasis in England. This study will further the understanding of the burden of these rare diseases in the UK, particularly highlighting differences between the different presentations of psoriasis. (RDG reference: 21_000421, see: https://www.cprd.com/protocol/prevalence-incidence-and-healthcare-burden-generalised-pustular-psoriasis-palmoplantar)
An understanding of the risks associated with COVID-19 infection in patients with congenital heart disease to inform better clinical management of these patients (RDG reference: 20_000161, see: https://cprd.com/protocol/identifying-clinical-risks-associated-covid-19-patients-congenital-heart-disease-and-0)
An understanding of potential risks associated with various medications prescribed during pregnancy to both mothers and their children, to inform prescribing decisions (RDG reference: 21_000362, see: https://cprd.com/protocol/maternal-prescriptive-drug-use-risks-and-benefits-mothers-and-neonates)
An understanding of the different presentations of Long Covid and risk factors associated with developing Long Covid among non-hospitalised COVID patients with the aim of developing supportive interventions and treatments (RDG reference: 21_000423, see: https://cprd.com/protocol/long-covid-non-hospitalised-individuals-symptoms-risk-factors-and-syndromes-0)
Examining the effects of COVID-19 on primary care management following self-harm in the UK, concluding that despite the challenges experienced by primary healthcare teams during the initial COVID-19 wave, prescribing and consultation patterns following self-harm were broadly similar to pre-pandemic levels (RDG reference 20_001, see: https://cprd.com/protocol/impact-covid-19-primary-care-contact-referrals-follow-care-and-patient-outcomes-after)
Examples of how research using CPRD data benefits public health: https://cprd.com/examples-how-research-using-cprd-data-benefits-public-health
Further examples and other relevant publications resulting from linked data research which have informed clinical practice or public health policy are presented below. Some of the studies referenced are older (published up to 5 years before) as it can take 4-5 years to translate some research findings into clinical guidance or public health policy.
Pearson-Stuttard J, Cheng YJ, Bennett J et al. Trends in leading causes of hospitalisation of adults with diabetes in England from 2003 to 2018: an epidemiological analysis of linked primary care records. The Lancet Diabetes & Endocrinology, Volume 10, Issue 1, 2022. https://doi.org/10.1016/S2213-8587(21)00288-6 November 2021. The number of people with diabetes in the UK has increased substantially over past decades to almost 4 million with increasing costs to the health service.
This research will inform the provision of health services, management and prevention of diabetes and diabetes-related complications.
Masoli JAH, Delgado J, Pilling L et al. Blood pressure in frail older adults: associations with cardiovascular outcomes and all-cause mortality. Age and Ageing, Volume 49, Issue 5, September 2020, Pages 807-813. https://doi.org/10.1093/ageing/afaa028. In a study of 415,980 people, including those often excluded from studies, researchers reported that there was no increased mortality risk with hypertension in adults above 75 years with moderate to severe frailty and all above 85 years. Research supports the move to raise the blood pressure target for frail older people. https://evidence.nihr.ac.uk/alert/new-research-supports-the-move-to-raise-the-blood-pressure-target-for-frail-older-people/
Sheng-Chia Chung, Reecha Sofat, Dionisio Acosta-Mena, Julie A Taylor, Pier D Lambiase, Juan P Casas, Rui Providencia. Atrial fibrillation epidemiology, disparity and healthcare contacts: a population-wide study of 5.6 million individuals. The Lancet Regional Health - Europe, Volume 7, 2021. https://doi.org/10.1016/j.lanepe.2021.100157. From the paper: The study provides comprehensive evidence for the AF burden on population health and healthcare utilisation. We found approximately two in five AF patients had three or more comorbidities at the time of diagnosis.
Outputs:
CPRD clients using these linked data will be producing (on an on-going basis) research publications in peer-reviewed journals and presentations in scientific conferences.
Processing:
Agreed fields from primary care EHRs are collected from consenting GP practices
and sent to CPRD via a Health and Social Care Network (HSCN) secure connection.
The data are verified for integrity and completeness before further processing.
To obtain and provide anonymised patient-level data for use in health research,
CPRD adheres strictly to the UK Information Commissioner's Office (ICO) code of practice, Anonymisation: Managing Data Protection Risk. A key aspect of maintaining confidentiality under the Code is the use of a Trusted Third Party, who takes identifiable personal data and then anonymises this in safe, high-security conditions, to an agreed specification, which allows the subsequent use and linkage of anonymised individual-level data by others.
A significant advantage of using a Trusted Third Party is that it allows scientific research to take place without the organisations involved in the research ever
having access to identifiable personal data themselves.
CPRD uses NHS Digital as its Trusted Third Party for data linkage. It is NHS Digital
that undertakes data linkage for CPRD using actual patient identifiers, as permitted
under strict legal and ethical permissions and scrutiny provided by the HRA. Once
NHS Digital has undertaken this linkage process, it provides anonymised patient level linked data back to CPRD, that (in confidentiality terms) is safe for research
use. CPRD may then itself safely link anonymised datasets together, but only where
NHS Digital has previously provided anonymised linked data to CPRD as explained
above. This physical and logical separation of the flow of de-identified EHR records
from GP system providers to CPRD, from the flow of identifiable patient data to the
TTP direct from GP system providers, is a fundamental tenet of CPRD's governance, ethics and security model.
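The separation of flows described above can be sketched roughly as follows. This is an illustration under stated assumptions, not NHS Digital's actual linkage method: the class and function names are hypothetical, and matching on an exact NHS number alone is a simplification of linkage that in practice also draws on date of birth, postcode and gender.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierRow:
    nhs_number: str  # identifiable; seen only by the trusted third party
    pseudonym: str   # source-specific pseudonym, safe to return


def ttp_link(gp_rows, registry_rows):
    """Match rows on NHS number and return pseudonym pairs only."""
    registry_by_nhs = {row.nhs_number: row.pseudonym for row in registry_rows}
    return [
        (row.pseudonym, registry_by_nhs[row.nhs_number])
        for row in gp_rows
        if row.nhs_number in registry_by_nhs
    ]


# Invented example values; none of these are real identifiers.
gp = [IdentifierRow("9000000001", "gp-a1"), IdentifierRow("9000000002", "gp-b2")]
registry = [IdentifierRow("9000000002", "reg-x9"), IdentifierRow("9000000003", "reg-y7")]
linked = ttp_link(gp, registry)
```

The research organisation receives only the pseudonym pairs in `linked`; the identifiers used for matching never leave the trusted third party.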
CPRD operates to high levels of security to ensure that when data is transmitted
and/or stored, this is done in a way that protects the data. All data in CPRD is stored in a
"Tier 3" data centre that is compliant with Government standards to operate in a way
that meets the full requirements for managing and storing such important data. The
measures are always under review and are subject to audit.
Information Governance
CPRD mitigates the risk of inadvertent disclosure of patient identity through legal
agreements preventing the use of the data in conjunction with other data sources,
which when linked may potentially re-identify an individual.
CPRD also holds the right to audit data recipients to ensure they are adhering to the
terms of data use (including the full terms of the data sharing agreement with the research user), security and confidentiality. On request from NHS Digital, CPRD will provide
to NHS Digital the results of any such audit.
CPRD employees are appropriately trained in information governance and data security processes to ensure they have the necessary understanding of relevant laws
and standards. Staff are aware that any misuse of data may result in disciplinary
procedures and, in the case of a severe breach, may lead to dismissal.
Training covering use of data is mandatory for CPRD staff. All employees responsible for the interaction with contributing GP practices are precluded from access to data. Data are kept on restricted servers and drives accessible only to appropriately trained research staff.
De-sensitising Data
Additional processes are put in place to reduce the likelihood of deductive disclosure
of an individual's identity. A number of variables included in the data are made less
specific. For example, year of birth is provided rather than exact date of birth, and in
studies involving children, the month and year of birth are normally provided. Similarly, the geographical information provided by default is at a regional level, and geographical areas with fewer than one million residents are combined to prevent small
cell counts.
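The two generalisation rules above can be sketched as below. The function names, area names and population figures are hypothetical, and grouping every under-threshold area into one combined label is a simplifying assumption. It does not check that the combined total actually reaches the threshold.

```python
from datetime import date


def generalise_birth_date(dob: date, child_study: bool = False) -> str:
    """Return year only, or year and month for studies involving children."""
    return f"{dob.year}-{dob.month:02d}" if child_study else str(dob.year)


def combine_small_areas(populations: dict, minimum: int = 1_000_000) -> dict:
    """Map each area to itself, or to a combined label if below the minimum."""
    small = sorted(name for name, pop in populations.items() if pop < minimum)
    combined_label = "+".join(small)
    return {name: (combined_label if name in small else name) for name in populations}


# Invented populations for illustration only.
areas = combine_small_areas({"North": 2_500_000, "Isle A": 300_000, "Isle B": 800_000})
adult_value = generalise_birth_date(date(1980, 5, 17))
child_value = generalise_birth_date(date(2015, 3, 9), child_study=True)
```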
CPRD has produced a Policy for Managing Anonymisation and the Risk of Identification in Observational Research, which sets out the requirements that must be
met before release of data to CPRD clients.
Pseudonymisation
Pseudonymisation is applicable in a number of contexts within CPRD to prevent the
identity of the following from being revealed:
- a patient
- an individual recorded in their records (such as clinicians providing clinical care)
- an organisation (such as hospital trusts or general medical practices)
- geospatial identifiers (such as postcode, or grid reference) which could lead to identification of an individual.
Primary Pseudonymisation
CPRD establishes data subject pseudonymisation through the assignment of a compound pseudonym key. The key comprises a practice identifier and a patient identifier (within that practice). This data subject pseudonym is not identifiable within the
data held by the GP.
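As a rough illustration of a compound pseudonym key, the sketch below pairs a practice identifier with a within-practice patient identifier. The type name, field types and the string form of the key are assumptions made for the example; the source does not describe how CPRD encodes the key.

```python
from typing import NamedTuple


class SubjectPseudonym(NamedTuple):
    practice_id: int  # identifies the contributing practice
    patient_id: int   # identifies the patient within that practice

    def as_key(self) -> str:
        """Render the compound key as a single string (format is hypothetical)."""
        return f"{self.practice_id}:{self.patient_id}"


# The same patient number in two practices yields two distinct pseudonyms.
a = SubjectPseudonym(practice_id=101, patient_id=7)
b = SubjectPseudonym(practice_id=202, patient_id=7)
```

The patient identifier is only unique within a practice, so both parts are needed to tell data subjects apart.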
Multiple Pseudonym Layers
CPRD processes data and makes it available internally to CPRD researchers. In doing so, CPRD replaces the original data source pseudonym(s) with a second layer pseudonym or person ID. This creates multiple layers of separation, such that an adversary would need to translate the CPRD person ID back to the data source pseudonym ID and then gain access to the data source patient index, in order to directly identify a data subject. When linked NCRAS data are supplied to third parties, CPRD replaces the person ID again, by adding a third layer pseudonym ID that establishes a further layer of separation.
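The layering described above might look something like the following sketch. The class name, prefixes and use of random tokens are assumptions for illustration; the source does not say how CPRD generates or stores its pseudonym mappings.

```python
import secrets


class PseudonymLayer:
    """Assigns a fresh random ID per input ID and keeps the mapping private."""

    def __init__(self, prefix: str):
        self._prefix = prefix
        self._forward = {}

    def translate(self, source_id: str) -> str:
        # Reuse the existing mapping so the same input always gets the same ID.
        if source_id not in self._forward:
            self._forward[source_id] = f"{self._prefix}-{secrets.token_hex(8)}"
        return self._forward[source_id]


internal_layer = PseudonymLayer("person")   # second layer, for internal research use
release_layer = PseudonymLayer("release")   # third layer, for data supplied to third parties

source_pseudonym = "101:7"  # hypothetical data source pseudonym
person_id = internal_layer.translate(source_pseudonym)
release_id = release_layer.translate(person_id)
```

Someone holding only `release_id` would need both private mappings, plus the data source's own patient index, to get back to an individual.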
Encryption is used for data in transit between secure locations. This applies to both
identifier data for linkage and clinical research data. Although the clinical data is
pseudonymised, there remains the residual risk of re-identification or the risk of inclusion of disclosive content and data is only intended for processing by authorised
recipients. Encryption mitigates this risk and provides assurance. The general default minimum standard for encryption is AES 256 using a complex pass-phrase consisting characters and a mix of upper case, lower case, numeric and special characters
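A minimal sketch of the pass-phrase rule, checking for the four character classes named above. The function name is hypothetical, and the minimum length is left as a required parameter because the source text does not state the number of characters.

```python
import string


def is_complex_passphrase(passphrase: str, min_length: int) -> bool:
    """Check length plus upper case, lower case, numeric and special characters."""
    classes = [
        any(c.isupper() for c in passphrase),
        any(c.islower() for c in passphrase),
        any(c.isdigit() for c in passphrase),
        any(c in string.punctuation for c in passphrase),
    ]
    return len(passphrase) >= min_length and all(classes)


# min_length=12 is an arbitrary value chosen for the example.
strong = is_complex_passphrase("Tr1cky-Passphrase!", min_length=12)
weak = is_complex_passphrase("alllowercase", min_length=12)
```

This only validates the pass-phrase; it does not perform the AES 256 encryption itself.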
R23 - Clinical Practice Research Datalink (CPRD) Routine Linkages Application — DARS-NIC-15625-T8K6L
Type of data: information not disclosed for TRE projects
Opt outs honoured: Y, N, Yes - patient objections upheld, No - data flow is not identifiable, Anonymised - ICO Code Compliant, No, Yes (Section 251, Section 251 NHS Act 2006)
Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(7), Other - Coronavirus (COVID-19) notices under reg 3(4) of the Health Service Control of Patient Information Regulations 2002, National Health Service Act 2006 - s251 - 'Control of patient information'. , Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Other-Coronavirus (COVID-19) notices under reg 3(4) of the Health Service Control of Patient Information Regulations 2002, Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d)
Purposes: No, Yes (Agency/Public Body)
Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive
When:DSA runs 2021-05-01 — 2022-04-30 2017.09 — 2024.09.
Access method: Ongoing, One-Off
Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE
Sublicensing allowed: Yes
Datasets:
- Hospital Episode Statistics Admitted Patient Care
- Hospital Episode Statistics Critical Care
- Hospital Episode Statistics Accident and Emergency
- Hospital Episode Statistics Outpatients
- Diagnostic Imaging Dataset
- Office for National Statistics Mortality Data (linkable to HES)
- Patient Reported Outcome Measures (Linkable to HES)
- Mental Health Minimum Data Set
- Mental Health Services Data Set
- Office for National Statistics Mortality Data
- Bridge file: Hospital Episode Statistics to Diagnostic Imaging Dataset
- Civil Registration (Deaths) - Secondary Care Cut
- Civil Registration - Deaths
- COVID-19 Hospitalization in England Surveillance System
- COVID-19 Second Generation Surveillance System (Beta version)
- Emergency Care Data Set (ECDS)
- Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
- COVID-19 Second Generation Surveillance System
- HES:Civil Registration (Deaths) bridge
- HES-ID to MPS-ID HES Accident and Emergency
- HES-ID to MPS-ID HES Admitted Patient Care
- HES-ID to MPS-ID HES Outpatients
- Mental Health and Learning Disabilities Data Set
- MRIS - Bespoke
- CPRD/UHB linkage file (pseudonymised data only)
- Civil Registrations of Death - Secondary Care Cut
- COVID-19 Second Generation Surveillance System (SGSS)
- Diagnostic Imaging Data Set (DID)
- Hospital Episode Statistics Accident and Emergency (HES A and E)
- Hospital Episode Statistics Admitted Patient Care (HES APC)
- Hospital Episode Statistics Critical Care (HES Critical Care)
- Hospital Episode Statistics Outpatients (HES OP)
- Mental Health and Learning Disabilities Data Set (MHLDDS)
- Mental Health Minimum Data Set (MHMDS)
- Mental Health Services Data Set (MHSDS)
- Civil Registrations of Death
- Maternity Services Data Set (MSDS) v2
- Medicines dispensed in Primary Care (NHSBSA data)
- COVID-19 SGSS First Positives (Second Generation Surveillance System)
- NDRS Cancer Registrations
- NDRS National Radiotherapy Dataset (RTDS)
- NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
- CPRD Deprivation
Objectives:
CPRD is the UK’s pre-eminent research service, providing access to anonymised (in line with the ICO code of anonymisation) primary care data linked by NHS Digital to other similarly anonymised health data provided by NHS Digital and others for the purposes of public health research including the monitoring of drug safety. All such data is linked (in its identifiable form) by NHS Digital only. It is jointly funded by the MHRA and the National Institute for Health Research (NIHR).
CPRD’s aims are to support vital public health research and to inform advances in patient safety in the delivery of patient care pathways. These depend on access to accurate, real-time representative patient data to produce reliable evidence-based clinical and drug safety guidance.
CPRD services are designed to maximise the way anonymised NHS clinical data can be used to improve and safeguard public health. For more than 20 years data provided by CPRD have been used in a range of drug safety and epidemiological studies that have impacted on health care, and resulted in over 1700 peer-reviewed publications. In addition to supporting high-quality observational research, CPRD is developing world-leading services based on using real world data to support clinical trials and intervention studies.
The intention is to continue to link anonymised CPRD primary care data to NHS Digital’s secondary care and other datasets, as linkage greatly increases the scale, depth, completeness and therefore value of data available for public health research. The outputs of such research based on linked data in turn improve and protect patient care pathways/treatments and provide clinical benefits for the UK, supporting delivery of CPRD’s core objectives.
CPRD’s research and data services are based on a database of anonymised longitudinal primary care records contributed by consenting GP practices from the four UK nations, and on the ability to link primary care data to secondary care data (and other data sets) from the NHS, the Office for National Statistics (ONS) and Public Health England (PHE). One of CPRD’s main priorities is to increase the number of national data sets that are linked to primary care data and made available on a routine basis to the research community.
Such collection and linkages occur under the appropriate permissions (ethical and s251), which have been granted to CPRD by the East Midlands – Derby Research Ethics Committee (REC), and the Health Research Authority (HRA).
NHS Digital has been providing secondary and other data for linkage with CPRD primary care data for a number of years. Data linkage is carried out exclusively by NHS Digital as the Trusted Third Party (TTP) for this purpose. Linked data sets currently available include extracts from ONS Death Registration data; Hospital Episode Statistics (HES), which encompasses Admitted Patient Care, Critical Care, Outpatient and Accident & Emergency data; Patient Reported Outcome Measures (PROMs); Diagnostic Imaging Dataset (DID); Mental Health data; National Cancer Registry; Deprivation data including Townsend Score and Index of Multiple Deprivation. Critical care is supplied as a separate dataset by NHS Digital, but is integrated with Admitted Patient Care.
Data can only be used for public health research purposes in research recommended for approval by ISAC for MHRA database research. CPRD make the final decision on access, and ensure compliance with NHS Digital’s requirements within the data sharing agreement, including (e.g.) security of the third party. Access to CPRD data and services will not be permitted in circumstances that may result in loss of public trust or for activities that may undermine the integrity of the CPRD database.
Yielded Benefits:
MMR and risk of autism
The Wakefield study in 1998 suggested a link between mumps-measles-rubella (MMR) vaccination and autism, based on an uncontrolled series of case studies of 12 children. Despite widespread criticism from the scientific community, the study generated media interest and led to a fall in MMR coverage in the UK. Researchers at the London School of Hygiene and Tropical Medicine used CPRD data to investigate the possibility of a link between the vaccine and incident autism. A case-control design was used to determine whether autistic children were more likely to have received the MMR vaccine. The study found no evidence of an association between being vaccinated against MMR and the risk of developing autism. The study was published in the Lancet and was instrumental in restoring public opinion of the vaccine. The Wakefield study was fully retracted, and authors withdrew their association with the original publication.
Pertussis and pregnancy
Whooping cough can cause serious and fatal complications in new-born babies and young children. Babies are routinely vaccinated early, from two months of age, but can still be at risk if a mother catches whooping cough whilst pregnant. After an outbreak of whooping cough in 2012, a national programme was introduced to give pregnant mothers a vaccine to protect their baby. Using the CPRD database, the MHRA took a proactive approach to pharmacovigilance, collecting data on a monthly basis from the start of the programme to identify a large cohort of vaccinated women. The study compared the vaccinated cohort to historical records. There was no evidence of an increased risk of adverse events in women who received the vaccine in the third trimester. CPRD data was vital for checking the safety of the vaccine after the programme was introduced, in as near to real-time as possible. As a result of the research, all GPs and maternity services now routinely vaccinate all pregnant mothers from 20 weeks.
Blood pressure treatment for diabetes
This study evaluated the dogma of ‘lower is better’ when managing hypertension among patients with diabetes. Results showed a greater than three-fold increase in mortality among hypertensive patients with newly-diagnosed diabetes when systolic blood pressure was reduced to levels below 110 mmHg. Risk was further increased in patients with diabetes and hypertension who had previously had a stroke or myocardial infarction. Subsequent guidelines from the European Society of Hypertension and the European Society of Cardiology (2013) now recommend lowering blood pressure to a goal of <140/90 mmHg.
Expected Benefits:
Past and existing studies (on an ongoing basis) use linked data with the CPRD primary care database to generate research results. These studies are expected to produce benefits of clinical importance to the UK public, and to be published in peer-reviewed journals and presented at scientific conferences. Some recent examples and other relevant publications resulting from linked data research which resulted in clinical benefits are presented below.
Case Study 1: The effectiveness of the influenza vaccine against hospital admissions and mortality in individuals with type 2 diabetes
Seasonal influenza accounts for a significant proportion of excess winter mortality. Current policy in the UK and in many countries worldwide recommends annual flu vaccinations for patients with chronic conditions such as diabetes, though evidence to support such policies is limited.
Imperial College London recently investigated the effectiveness of the influenza vaccine at reducing cardiovascular and respiratory hospital admissions and mortality in patients with type 2 diabetes. The study used linkages between CPRD GOLD primary care data, Hospital Episode Statistics (HES) and the Office for National Statistics (ONS) mortality data to look at admissions and death in 125,000 patients over a seven-year period. Influenza vaccination was associated with a reduction in the rate of hospital admissions for acute cardiovascular and respiratory disease and a reduction in all-cause mortality across the seven flu seasons.
The study has been widely reported within healthcare and mainstream media and supports current flu vaccination initiatives in the UK and beyond.
Reference 1:
Vamos EP et al. Effectiveness of the influenza vaccine in preventing admission to hospital and death in people with type 2 diabetes. CMAJ. 2016 Oct 4;188(14):E342-E351.
Case Study 2: Risk associated with the prescription of long-acting β2-agonists (LABA), short-acting β2-agonists (SABA) or inhaled corticosteroids (ICS) for asthma in primary care.
Omalizumab is a recent antibody-based treatment developed to help control moderate to severe allergic asthma, when symptom control with inhaled corticosteroids (ICS) is inadequate. ICS are frequently prescribed alongside long-acting β2-agonists (LABA). A 2010 study using CPRD data (then GPRD) linked with Hospital Episode Statistics investigated the risk of asthma-related death and hospitalisation among patients on ICS or LABA therapy. The study was important to establish the relative risk across commonly-prescribed asthma treatments and concluded that LABA exposure was not associated with an increased risk for all-cause mortality. This study was subsequently incorporated into NICE guidelines released in 2013 outlining evidence-based recommendations for omalizumab use in patients with severe persistent asthma.
Reference 2:
de Vries F, Setakis E, Zhang B, van Staa TP. Long-acting β2-agonists in adult asthma and the pattern of risk of death and severe asthma outcomes: a study using the GPRD. Eur Respir J. 2010 Sep;36(3):494-502.
Additional references describing health benefits of CPRD and linked data.
Example Reference 3:
The CPRD and the RCGP: building on research success by enhancing benefits for patients and practices.
Antonis A Kousoulis, Imran Rafi, and Simon de Lusignan
Br J Gen Pract. 2015 Feb; 65(631): 54–55. Available from:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4325440/?tool=pmcentrez
Example Reference 4:
Quality Improvement's greatest hits of 2016, Hannah Price, Head of Quality Improvement
http://www.rcgp.org.uk/clinical-and-research/clinical-news/quality-improvements-greatest-hits-of-2016.aspx
RCGP and Clinical Practice Research Datalink (CPRD) have joined forces to produce innovative data reports focusing on prescribing and patient safety, which enable benchmarking and case-finding. Over 100 practices from all four nations of the UK have participated in the successful pilot stage of the projects. This phase is now coming to an end and the reports will be rolled out to all practices in the CPRD network during 2017.
Example Reference 5:
A recently published systematic review (Oyinlola et al 2016) identified 43 CPRD studies that have been used in 25 medical guidance documents. The reviewers found that use of data from the CPRD to inform guidelines has increased in recent years and noted the importance of linking data to extend research to medical conditions that are treated in multiple settings (e.g. primary and secondary care).
Reference:
Oyinlola JO, Campbell J, Kousoulis AA. Is real world evidence influencing practice? A systematic review of CPRD research in NICE guidances. BMC Health Serv Res. 2016 Jul 26;16:299.
Example Reference 6:
Following events at the Winterbourne View private hospital, a review covering all aspects of care for patients with learning disabilities (LD) was established. This study used three years of CPRD data alongside HES APC data to describe the level of GP prescribing of psychotropic medication to patients with LD, and explored whether a relevant diagnostic indication was recorded. The results of the study led NHS England to promise rapid and sustained action to tackle over-prescribing, and an urgent letter was sent to professionals urging them to review their own prescribing.
Reference:
Glover, G, Williams R. 'Prescribing of psychotropic drugs to people with learning disabilities and/or autism by general practitioners in England'. Public Health England. June 2015.
Outputs:
CPRD customers using linked data products will be producing (on an on-going basis) research publications in peer-reviewed journals and presentations at scientific conferences. CPRD customers include academic institutions, pharmaceutical companies, Governmental centres and research charities. These all undertake medical and health data research, which may result in formal publications.
All data included in such outputs by CPRD customers will be aggregated, with small numbers suppressed in line with the HES Analysis Guide (or dataset-specific suppression controls).
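As an illustration of small-number suppression in aggregated outputs, the sketch below masks any non-zero count under a threshold. It is a minimal sketch only: the threshold value, the `*` marker and the function name are assumptions for illustration and are not taken from the HES Analysis Guide or from CPRD policy.

```python
from typing import Dict, Union

SUPPRESSED = "*"  # hypothetical marker for a suppressed cell


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = 8
) -> Dict[str, Union[int, str]]:
    """Return a copy of `counts` with small non-zero cells masked.

    `threshold` is an assumed, illustrative value: any count strictly
    between 0 and `threshold` is replaced by the suppression marker.
    """
    masked: Dict[str, Union[int, str]] = {}
    for category, n in counts.items():
        if 0 < n < threshold:
            masked[category] = SUPPRESSED
        else:
            masked[category] = n
    return masked
```

For example, `suppress_small_counts({"a": 3, "b": 0, "c": 12})` keeps the zero and the 12 but masks the 3. Real suppression rules usually also address secondary disclosure (recovering a masked cell from row or column totals), which this sketch does not attempt.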
A selection of recent publications resulting from use of CPRD linked data are presented below.
Moss S, Melia J, Sutton J, Mathews C, Kirby M. (2016) ‘Prostate-specific antigen testing rates and referral patterns from general practice data in England.’ Int J Clin Pract. 2016 Apr;70(4):312-8. doi: 10.1111/ijcp.12784. Epub 2016 Mar 14.
William Hollingworth, Mousumi Biswas, Rachel L Maishman, Mark J Dayer, Theresa McDonagh, Sarah Purdy, Barnaby C Reeves, Chris A Rogers, Rachael Williams, Maria Pufulete. (2016) ‘The healthcare costs of heart failure during the last five years of life: A retrospective cohort study’ International Journal of Cardiology, Volume 224, 1 December 2016, Pages 132–138
Laurence Baril, Dominique Rosillon, Corinne Willame, Maria Genalin Angelo, Julia Zima, Judith H. van den Bosch, Tjeerd Van Staa, Rachael Boggon, Eveline M. Bunge, Sonia Hernandez-Diaz, Christina D. Chambers. (2015) ‘Risk of spontaneous abortion and other pregnancy outcomes in 15–25 year old women exposed to human papillomavirus-16/18 AS04-adjuvanted vaccine in the United Kingdom’ Vaccine, Vol 33, Issue 48, 27 November 2015, Pages 6884–6891
Taylor S, Taylor RJ, Lustig RL, Schuck-Paim C, Haguinet F, Webb DJ, Logie J, Matias G, Fleming DM (2016). ‘Modelling estimates of the burden of respiratory syncytial virus infection in children in the UK.’ BMJ Open. (2016) Jun 2;6(6):e009337. doi: 10.1136/bmjopen-2015-009337.
Wing K, Bhaskaran K, Smeeth L, van Staa TP, Klungel OH, Reynolds RF, Douglas I (2016). ‘Optimising case detection within UK electronic health records: use of multiple linked databases for detecting liver injury.’ BMJ Open. 2016 Sep 2;6(9):e012102. doi: 10.1136/bmjopen-2016-012102.
Alexandre L, Clark AB, Bhutta HY, Chan SS, Lewis MP, Hart AR. (2016) ‘Association Between Statin Use After Diagnosis of Esophageal Cancer and Survival: A Population-Based Cohort Study.’ Gastroenterology. 2016 Apr;150(4):854-65.e1; quiz e16-7. doi: 10.1053/j.gastro.2015.12.039. Epub 2016 Jan 9.
Processing:
CPRD has established agreements with General Practices and agreed contracts with their data processors, the GP clinical IT system providers, enabling the extraction of agreed data from the primary care electronic health record (EHR).
Protecting patient confidentiality is paramount to CPRD. A number of processes and procedures are in place to safeguard the identity and confidentiality of patient data received and supplied by CPRD. An overview of these is presented below, including minimised dataset extraction, data transformation, strong and multiple pseudonymisation, and governance and scrutiny on approvals to use linked data.
The CPRD Policy for Managing Anonymisation and the Risk of Identification in Observational Research sets out the management processes employed to ensure that CPRD appropriately anonymises patient data for observational research purposes, and complies with the Information Commissioner’s Office (ICO) Code on Anonymisation and with Office for National Statistics (ONS) requirements on use of death registration data.
1) Data Collection
Data collected by CPRD includes all coded patient primary care data, including gender, year-of-birth, and year-and-month-of-birth for patients aged 16 and under. CPRD does not receive patient name, address, full date of birth, NHS Number or free text medical notes.
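The date-of-birth rule described above (year only, or year and month for patients aged 16 and under) can be sketched as follows. This is an illustration under stated assumptions: the source does not say where or how the generalisation is performed, and the function name, the reference date argument and the output format are hypothetical.

```python
from datetime import date


def generalise_dob(dob: date, as_of: date, child_age_limit: int = 16) -> str:
    """Reduce a full date of birth to the precision described in the text.

    Patients aged `child_age_limit` or under on `as_of` keep year and
    month ("YYYY-MM"); everyone else keeps the year only ("YYYY").
    """
    # Age in completed years on the reference date.
    had_birthday = (as_of.month, as_of.day) >= (dob.month, dob.day)
    age = as_of.year - dob.year - (0 if had_birthday else 1)
    if age <= child_age_limit:
        return f"{dob.year:04d}-{dob.month:02d}"
    return f"{dob.year:04d}"
```

The day of birth is never retained, so the output cannot be used to reconstruct the full date.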
In order to enable the linking of primary care records to other health related data, GP EHR suppliers provide certain patient identifiers directly to NHS Digital. These are: NHS Number, full date of birth, post code and gender. CPRD does receive gender but does not receive any of the other identifiers. The Trusted Third Party (NHS Digital) provides the linkage service for CPRD.
Data collections are received by CPRD securely through an N3 link. Data arrives as a series of incremental collections of data from practices that have agreed to share data with CPRD. Data collections, once received, are checked for content (data structure and format), completeness (presence of key and optional data files) and continuity. The collection is then archived. Details of each data collection are logged to an administrative database, and data is made available for processing.
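The completeness and continuity checks mentioned above could look something like the sketch below. It is illustrative only: the file names, the idea of a per-practice sequence number and both function names are assumptions, since the source does not describe how CPRD implements these checks.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical set of key files expected in every collection.
REQUIRED_FILES: FrozenSet[str] = frozenset({"patient", "consultation", "clinical"})


def missing_key_files(
    received: Iterable[str], required: FrozenSet[str] = REQUIRED_FILES
) -> List[str]:
    """Completeness: list required files absent from a collection."""
    return sorted(set(required) - set(received))


def continuity_gaps(previous_seq: int, received_seqs: Iterable[int]) -> List[int]:
    """Continuity: list sequence numbers missing after `previous_seq`.

    Assumes each incremental collection carries a consecutive integer
    sequence number (an assumption made for this sketch).
    """
    seqs = set(received_seqs)
    if not seqs:
        return []
    return [n for n in range(previous_seq + 1, max(seqs) + 1) if n not in seqs]
```

A collection would be archived and logged only once both functions return an empty list.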
Database creation or a build process is undertaken on a monthly basis by taking a snapshot of the fully processed data and organising it into a structure which enables tools to query and extract the data for use in observational and interventional research studies.
CPRD retains the data collected up to the point that a GP Practice withdraws from participation. This is to ensure that CPRD can create (if needed) datasets for (e.g.) validation of previous research, or for longitudinal studies. Patient opt-outs remain respected from the point of notification to CPRD.
2) Data transformation
CPRD does not release the same linked data to external researchers that it receives from NHS Digital. The changes made in between data receipt and release are termed ‘data transformation’. This is done to protect patient confidentiality, and also to better facilitate relevant research.
Transformation involves removing the provider codes provided by NHS Digital. Data provided by CPRD is matched to the former Strategic Health Authority boundaries. It is based on matching the address of the GP Practice to the SHA. This ‘blurs’ the link between hospital activity records and other potential identifiers collected and provided (Gender, Year of Birth, Date of Death and Ethnicity).
For example, transformation of the CPRD linked HES Admitted Patient Care (APC) data involves:
(i) The encrypted_”HESID” id field provided to CPRD by NHS Digital is not released to customers. CPRD creates a pseudonym linked to a unique patient activity record in the HES data.
(ii) Encoding of the record level identifier (epikey). The epikey variable has been encoded by the CPRD to minimise the risk of breaching licensing conditions through linkage of these data to other HES data sources containing patient identifiable information. The epikey is encoded with a new key each time data is processed, so that the epikey for the same record differs in every release of CPRD linked HES APC data. This prevents different researchers from linking patients from the same dataset, or from comparison with older release versions of the data.
(iii) Collating data across years, formatting date and diagnosis fields, and dropping fields (mainly provider and geographical based) from standard release of the data. The episode-level data files received from NHS Digital are transformed into a normalised data structure containing the following tables:
1. Hospitalisations
2. Episodes
3. Diagnosis
4. Procedures
5. Augmented Care
6. Critical Care
7. Maternity
8. Health Resource Group
3) Pseudonymisation process
CPRD has agreed pseudonymisation processes with each GP EHR system provider as well as the Trusted Third Party used for data linkage. The overarching process for patient data pseudonymisation comprises the following stages to protect patient confidentiality at all times:
(i) GP system provider – the provider replaces the patient identifiers (NHS number) in each patient record with a system practice ID and system patient ID before its secure transfer to CPRD.
(ii) CPRD – on collection of the patient data, CPRD replaces the original data source patient and practice ID from the GP system provider with a CPRD patient and practice pseudonym.
(iii) Data linkage – where this is undertaken by the Trusted Third Party (using patient identifiers sent directly from GPs), all linked patient record data are anonymised by the TTP before release to CPRD. Similarly, where cancer registry data is received by CPRD from Public Health England for linkage, PHE anonymise patient data before release to CPRD.
(iv) Data release – the linked data is cut by CPRD to minimise data ultimately released to third party researchers, with the linked patient ID replaced again by a further patient ID at release establishing yet another layer of separation.
Record level identifiers (such as epikey, attendkey, aekey) are additionally encoded such that the record level pseudonym differs in every release of the linked data.
Combined with the wider processes and procedures noted in this section, the above pseudonymisation process precludes users of linked data from identifying patients from the data provided.
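One way to obtain record-level pseudonyms that differ in every release is a keyed hash with a fresh secret per release. The sketch below is illustrative only: the source does not state which encoding CPRD uses, so HMAC-SHA256, the 32-byte key length and the function names are all assumptions.

```python
import hashlib
import hmac
import secrets


def new_release_key() -> bytes:
    """Generate a fresh random secret for one data release (assumed 32 bytes)."""
    return secrets.token_bytes(32)


def encode_record_id(record_id: str, release_key: bytes) -> str:
    """Encode a record-level identifier (e.g. an epikey) for one release.

    The same identifier maps to the same pseudonym within a release,
    but to an unrelated pseudonym under a different release key.
    """
    digest = hmac.new(release_key, record_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Because each release key is generated independently and never shared with recipients, two researchers holding different releases cannot join their records on the encoded identifier, which matches the intent described above.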
4) Data use
Access and use of the data is controlled. With the exception of interventional and clinical studies (which require separate Health Research Authority approval), researchers must gain approval for their study protocol from the Independent Scientific Advisory Committee for MHRA Database Research (ISAC). Approved applications to ISAC are published on the CPRD website https://www.cprd.com/ISAC/datause.asp.
CPRD may generate aggregate level linked data without ISAC approval to inform feasibility and design of external research (observational and interventional/clinical) and for assessment of ISAC protocols. CPRD will undertake such assessments on behalf of external researchers with no release of record level linked data permitted outside of CPRD.
ISAC therefore plays a key data governance role. Approval from ISAC is required if access to anonymised patient level data is requested for observational research and there is an intention to publish results, or where the study depends on access to primary care data linked to other health related data. ISAC’s role is to determine whether a research proposal is of public health value and will be conducted by researchers with the appropriate level of expertise, to highlight any ethical or confidentiality issues that may arise in the proposed research, and to consider the scientific merit of the proposed methods and overall study.
5) Data release
Release of patient level linked data to third party researchers will only occur after:
a) All required approvals (including ISAC approval) have been obtained;
b) Data is determined to be anonymised as per CPRD’s Policy for Managing Anonymisation and the Risk of Identification in Observational Research;
c) Additional requirements on anonymisation relating to ONS death registration data have been met, with agreement from ONS (see below);
d) Researchers are provided with robust contracts defining terms of use relating to secure access, retention and destruction of the data; and
e) Access to data that CPRD sub-licenses, having received it from NHS Digital, the Office for National Statistics or any other Data Controller or Custodian, is granted under terms compatible with those under which the data is provided to CPRD.
With regard to release of ONS death registration data:
• ONS death data provided by NHS Digital is stored separately by CPRD and interrogated on a case by case basis to assess the scientific value of research applications;
• Sub-national geographic data are not provided to researchers without additional review and approvals from ISAC and where relevant, HRA CAG;
• As standard, CPRD match each GP practice postcode to a larger geographical area aligned with the historical NHS Strategic Health Authority boundaries, ensuring an underlying population size of at least 2 million persons. The GP practice postcode and the hospital or other institutional identifiers are not released;
• The ISAC review includes a risk assessment of patient re-identification, and if appropriate research applicants are required to outline risk mitigation plans;
• CPRD’s Policy for Managing Anonymisation and the Risk of Identification in Observational Research sets out CPRD’s policy for the release for publication of data relating to small cell counts;
• Restrictions on the number of stratified analyses are imposed in the case of research proposals investigating rare diseases or treatments to minimise the risk of re-identification;
• ISAC approvals only allow exact ONS dates of death for use in calculating the time to death from a given event of interest (e.g. a particular diagnosis) for the purpose of survival analyses and where there is a clear benefit to public health from the proposed research; and
• Researchers are also contractually bound to maintain patient anonymity and prevent inadvertent re-identification of patients.
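The time-to-death calculation for which exact ONS dates of death may be approved can be sketched as below. This is a minimal illustration, assuming both dates are available as calendar dates; the function name and the handling of missing or inconsistent dates are assumptions, not CPRD or ISAC requirements.

```python
from datetime import date
from typing import Optional


def days_to_death(event_date: date, death_date: Optional[date]) -> Optional[int]:
    """Return days from the event of interest to death.

    Returns None when no death is recorded (the patient would be
    censored in a survival analysis). Raises ValueError if the
    recorded death precedes the event, which indicates a data error.
    """
    if death_date is None:
        return None
    if death_date < event_date:
        raise ValueError("death date precedes event date")
    return (death_date - event_date).days
```

Releasing only the derived interval, rather than the exact date itself, is one way an analysis can use the date of death without carrying it into the output dataset.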
In accordance with guidance from the Information Commissioner’s Office (ICO), CPRD does not permit personal data to be processed outside its own servers, and hence any such data is retained within the EU.
6) Data Access Management
CPRD processes patient data and makes it available internally to CPRD researchers. To control third-party access to linked data and minimise data released to third parties, the CPRD Observational Research Team will extract datasets for researchers against a query specification or primary care data defined cohort. The query and its output content will be agreed with the researcher prior to generation of the data sets. This is the only process by which applicants to CPRD may access linked data.
Hewlett Packard and Sungard are recorded as data storage addresses because, for the purposes of this application, Sungard is considered to be the initial back-up and recovery site and Hewlett Packard the 'back-up to the back-up'. They are not involved in processing of the data in any way (Sungard provide a facilities management and site management service). CPRD have confirmed that neither HP nor Sungard has access to the server (neither administrative nor user rights).
7) Information Security Measures
CPRD is part of a wider Government agency (the MHRA) and conforms to the 10 National Data Guardian data security standards as well as to NHS Digital requirements. The MHRA meets NHS Information Governance Toolkit standards on information security, and details on standards and arrangements are set out in CPRD’s approved System Level Security Policy (SLSP).
CPRD operates to a high standard to ensure that data is protected when it is transmitted and/or stored. All data in CPRD is stored in a “Tier 3” data centre that is compliant with Government standards and meets the full requirements for managing and storing such important data. The measures are always under review and are subject to audit. Security measures include:
• Multifactor authentication for access
• Monitoring of access
• Round the clock security staff presence
• Robust firewalls and other access restrictions
A back-up store of the data (provided by named data processors) mirrors the above features but in an alternative location to allow for business continuity.
8) Data destruction and disposal
Data destruction standards (currently NHS Digital ‘Destruction and Disposal of Sensitive Data’ guidelines v3.2) will be met through planned implementation in MHRA of a Blancco LUN Eraser tool, to guarantee that sensitive data is properly erased and sanitized securely and permanently. This tool ensures compliance with industry standards and regulations, including PCI DSS, HIPAA, SOX, ISO 27001 and the EU General Data Protection Regulation, and the tool will be in place by August 2017.
9) Encryption
Encryption is used for data in transit between secure locations. This will apply to both identifier data for linkage and clinical / research data. Although the clinical data is pseudonymised, there remains the residual risk of re-identification or the risk of inclusion of disclosive content and data is only intended for processing by authorised recipients. Encryption mitigates the risk and provides assurance.
The default minimum standard for encryption will be AES 256 using a complex pass-phrase consisting of 12 characters and a mix of upper case, lower case, numeric and special characters.
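The pass-phrase rule stated above can be checked mechanically. The sketch below is illustrative only: it treats 12 characters as a minimum length and uses Python's `string.punctuation` as the definition of "special characters", both of which are assumptions not confirmed by the source.

```python
import string


def meets_passphrase_policy(passphrase: str, min_length: int = 12) -> bool:
    """Check a pass-phrase against the character-mix rule in the text.

    Requires at least `min_length` characters and at least one each of
    upper case, lower case, numeric and special characters.
    """
    return (
        len(passphrase) >= min_length
        and any(c.isupper() for c in passphrase)
        and any(c.islower() for c in passphrase)
        and any(c.isdigit() for c in passphrase)
        and any(c in string.punctuation for c in passphrase)
    )
```

This only validates the composition of the pass-phrase; deriving an AES-256 key from it and performing the encryption are separate steps not shown here.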
10) Training
All CPRD staff and licensed data users are appropriately trained and have the necessary understanding of the governance processes pertaining to relevant laws. They will also be aware that any misuse of data may result in disciplinary procedures and, in the case of a severe breach, dismissal and immediate removal from the premises.
Training covering use of data is mandatory for CPRD staff and licence-holders prior to accessing data. CPRD staff who are responsible for the collection of data and interaction with site staff are precluded from access to data. Data is kept on restricted servers and drives accessible only to appropriately trained research staff.
NHS Digital permits CPRD sub-licensees to share data with third parties subject to the third parties collaborating on the same research as the sub-licensee, and subject to the terms, checks and controls carried out by CPRD in relation to sub-licences. Details of such licences will be published and shared with NHS Digital.
Pelvic Floor Registry and Surgical Devices and Implant data to support medical device safety vigilance — DARS-NIC-694500-C0W0K
Type of data: information not disclosed for TRE projects
Opt outs honoured: Anonymised - ICO Code Compliant, No (Does not include the flow of confidential data)
Legal basis: Health and Social Care Act 2012 s261(2)(a)
Purposes: No (Agency/Public Body)
Sensitive: Non-Sensitive
When:DSA runs 2023-08-01 — 2024-07-31 2024.05 — 2024.05.
Access method: One-Off
Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE
Sublicensing allowed: No
Datasets:
- Surgical Devices and Implants Data Set (SDIDS)
Objectives:
Medicines and Healthcare Products Regulatory Agency (MHRA), an executive agency of the Department of Health and Social Care (DHSC), requires access to the NHS England dataset Surgical Devices and Implants Data Set (SDIDS) to conduct exploratory data analyses for the purposes of assessing the value in the safety vigilance of pelvic mesh or tape devices.
This project explores the feasibility of the use of this data set to support MHRA’s vigilance and surveillance of pelvic mesh products. The results of this project will provide insight into the strengths and limitations of the data within the SDIDS and Pelvic Floor Registry for the purposes of supporting device vigilance.
The SDIDS was set up to address Recommendation 7 from the Independent Medicines and Medical Devices Safety (IMMDS) Review First Do No Harm in relation to the Review’s investigation of the medical device, pelvic mesh. The pelvic mesh was used to manage pelvic organ prolapse (POP) and stress urinary incontinence (SUI). The pelvic mesh was used to support pelvic organs and each indication of use might involve a different type of device and operation to fit the device. Concerns were raised around the use of transvaginal POP mesh, leading to restrictions of use both in the UK and internationally. In 2017, all transvaginal POP mesh surgeries were stopped and have subsequently been restricted to use in research trials. In July 2018, a halt to mesh procedures for SUI was recommended by the IMMDS Review and this recommendation was implemented by NHS England and DHSC. The use of SUI mesh was restricted to use in exceptional circumstances and under high vigilance.
A key point identified by the Review was the limited information on the number of women fitted with a pelvic mesh product and missing information around the patient outcomes relating to these procedures, meaning the extent of complications is not fully known. Under the recommendation of the Review, NHS England undertook a retrospective experimental audit of pelvic mesh procedures conducted between April 2008 and March 2017 using Hospital Episode Statistics (HES). The audit included follow-up of patients who had undergone these procedures, but it was acknowledged that this data was likely incomplete and did not include private healthcare.
The following NHS England data will be accessed:
· Surgical Devices and Implants Data Set (SDIDS) including the Pelvic Floor Registry module embedded within the SDIDS dataset - necessary to identify the number of surgical procedures conducted using pelvic mesh or tape devices [and/or the number of alternative non-surgical procedures] for the indication of Stress Urinary Incontinence (SUI) and Pelvic Organ Prolapse (POP). The number of procedures recorded will enable the study investigators to assess the feasibility of conducting safety vigilance of pelvic mesh or tape devices in this dataset.
The Pelvic Floor Registry derived from SDIDS should provide valuable data on the usage of pelvic mesh and its alternatives, and on clinical outcomes associated with the use of such devices. This data source should provide a detailed record of the number of patients who have undergone pelvic mesh procedures and support monitoring of any clinical complications associated with the use of the pelvic mesh product.
The level of the data will be:
· Pseudonymised
The data will be minimised to all pseudonymised data.
Department of Health and Social Care (DHSC) is the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.
The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller
The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(i) - processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices, on the basis of Union or Member State law which provides for suitable and specific measures to safeguard the rights and freedoms of the data subject, in particular professional secrecy.
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.
The data are processed for two different purposes: Effective regulation of medicines and medical devices (Article 9(2)(i)) and Exploratory Analysis for research purposes (Article 9(2)(j)).
MHRA (an executive agency of DHSC) is a processor acting under the instructions of DHSC. MHRA is responsible for the operation and management of the exploratory analyses described above.
NTT Data UK Limited provides IT infrastructure for MHRA. NTT Data UK Limited manages the Microsoft Azure Cloud technology at MHRA on which the data will be stored. They supply support to the system and therefore have access to the data. NTT Data UK Limited involvement in processing the data is limited to support.
Microsoft Azure Cloud provides IT backup services to MHRA and will store copies of the data as contracted by MHRA.
The data will only be accessed by permanent employees who work in the Epidemiology Unit within the Safety & Surveillance division at MHRA.
Expected Benefits:
The analyses conducted under this project would provide MHRA with knowledge and understanding of the data captured in the SDIDS and specifically the Pelvic Floor Registry data module. Based on the evaluation of the data set, the SDIDS and the Pelvic Floor Registry could potentially be used in the MHRA's routine vigilance activities to monitor the use and the safety of pelvic mesh devices and their alternatives and as such will inform ongoing refinement of the MHRA's wider vigilance strategy.
The potential vigilance activities introduced include monitoring the number of pelvic mesh and tape procedures conducted, to put into context the adverse incident reports received via the Yellow Card Scheme (YCS); strengthening new potential safety signals identified in the MHRA's routine signal detection activities; and using SDIDS as an additional data source in which to conduct signal detection activities. The analyses seek to explore whether the SDIDS can meet the needs of the MHRA's vigilance responsibilities in monitoring the safety of pelvic mesh usage in patients. The outcome of the study would assist MHRA in informing the decision-making process for identifying whether the conditions outlined in the IMMDS review are met to allow the pause in SUI operations using pelvic mesh to be lifted. If the conditions were not met, it would assist in identifying the gaps in information captured in SDIDS and Pelvic Floor Registry from a regulatory perspective.
It is hoped that the overall impact of this project will provide long-term insights for monitoring the use and safety of other surgical medical devices and implants used in the UK and therefore the potential for strengthening the MHRA's post-marketing vigilance strategy for medical devices.
Outputs:
The expected outputs of the processing will be:
A data specification document from a researcher's perspective will be drafted as a guide for the Epidemiology team in conducting analyses and interpreting the outputs of these analyses in future research projects using SDIDS data
Submissions to peer reviewed journals
Presentations to Devices Expert Advisory Group
The outputs will not contain NHS England data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived
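As an illustration only, small-number suppression of aggregated counts might be sketched as below. The threshold, the suppression marker and the choice to leave zero counts visible are all assumptions made for this sketch; the actual rules are whatever the relevant disclosure guidance for each dataset specifies.

```python
from typing import Dict, Union


def suppress_small_counts(
    counts: Dict[str, int],
    threshold: int,
    marker: str = "*",
) -> Dict[str, Union[int, str]]:
    """Return a copy of `counts` with small non-zero values masked.

    `threshold` and `marker` are hypothetical parameters; the real
    disclosure rules for a dataset may differ (for example, they may
    also require rounding or secondary suppression).
    """
    suppressed: Dict[str, Union[int, str]] = {}
    for category, n in counts.items():
        if 0 < n < threshold:
            # Non-zero but below the threshold: hide the exact value.
            suppressed[category] = marker
        else:
            suppressed[category] = n
    return suppressed


if __name__ == "__main__":
    example = {"procedure_a": 120, "procedure_b": 3, "procedure_c": 0}
    print(suppress_small_counts(example, threshold=8))
```

The threshold is deliberately a required argument so that no particular value is implied by the sketch.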
The outputs will be communicated to relevant recipients through the following dissemination channels:
A data specification document from a researcher's perspective will be drafted as a guide for the Epidemiology team in conducting analyses and interpreting the outputs of these analyses in future research projects using SDIDS data
Devices Expert Advisory Group, the statutory committee providing the MHRA with advice on the safety, performance, benefits and risks of devices in the UK healthcare system.
Patient Safety Commissioner
The proposed target date for conducting all analyses, the dissemination and communication of the results of the analyses (including recommendations for the use of the Pelvic Floor Registry and overall SDIDS) to the specified target audience is 12 months from receiving the data.
Processing:
No data will flow to NHS England for the purposes of this Agreement
NHS England will provide the relevant records from the SDIDS and Pelvic Floor Registry dataset to MHRA. The data will contain no direct identifying data items. The data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.
The data will not be transferred to any other location.
The data will be stored on servers at MHRA
MHRA stores data on the Cloud provided by Microsoft Azure Limited
The data will be accessed by authorised personnel within MHRA via remote access. The data will remain on the servers at MHRA at all times.
Personnel are prohibited from downloading or copying data to local devices.
The data will not leave the UK at any time.
Access is restricted to employees within the Epidemiology Unit of the Medicines and Healthcare Products Regulatory Agency (MHRA). The Principal Investigator is also an employee within the Epidemiology Unit at the MHRA.
All personnel accessing the data have been appropriately trained in data protection and confidentiality.
The data will not be linked with any other data.
There will be no requirement and no attempt to reidentify individuals when using the data.
Analysts from the MHRA will analyse the data for the purposes described above.
Project 4 — DARS-NIC-08477-H7S0Z
Type of data: information not disclosed for TRE projects
Opt outs honoured: Y, N
Legal basis: Health and Social Care Act 2012, Section 251 approval is in place for the flow of identifiable data, Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012)
Purposes: ()
Sensitive: Sensitive, and Non Sensitive
When:2017.06 — 2017.05.
Access method: Ongoing, One-Off
Data-controller type:
Sublicensing allowed:
Datasets:
- Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
- Hospital Episode Statistics Admitted Patient Care
- Patient Reported Outcome Measures (Linkable to HES)
- Office for National Statistics Mortality Data
- Hospital Episode Statistics Accident and Emergency
- Hospital Episode Statistics Critical Care
- Diagnostic Imaging Dataset
- Hospital Episode Statistics Outpatients
- Mental Health Minimum Data Set
- Mental Health and Learning Disabilities Data Set
Objectives:
Clinical Practice Research Datalink (CPRD) is a governmental centre, jointly funded by the Medicines and Healthcare Products Regulatory Agency (MHRA) and the National Institute for Health Research (NIHR), the remit of which is to enable medical research through sharing pseudonymised clinical data. The CPRD in-house teams add value to the data sources through building tools and methodologies, conducting data characterisation studies and facilitating verification and validation services. CPRD wishes to be in position to share with its customers linked pseudonymised data so they can undertake approved research projects looking to improve public health and patient outcomes. Researchers using CPRD data can undertake, among others, drug safety, disease epidemiology, healthcare utilisation and health economics and outcomes studies that often directly inform clinical guidelines or regulatory decisions. An additional incentive is to provide pseudonymised data in the context of clinical drug safety or randomised controlled trials that can help build a more complete picture for the enrolled patients and increase transparency and efficiency of study results and design. These linkages will open the door to a large number of research questions, given that the granularity and detail of this information is not found in primary care data alone.
Primary care data, in the main, forms the core of the CPRD service and linkage to other data sets. Each individual practice has the control over the extent of their participation with the CPRD. When the extract is initiated the practice has the choice whether to “consent” to additional linkages - and hence the extract to the Trusted Third Party (TTP). At any time subsequently, they can revoke the consent to linkage, or the consent to the CPRD extract.
The GP Electronic Healthcare Records (EHR) registration options allow for selection of an individual patient and for that patient to be flagged as opting out of the CPRD pseudonymised extract. In the event that this option is selected, the patient’s data will not be extracted from the GP EHR for CPRD or for linkage (to the TTP). The Vision system has an inbuilt process for this but CPRD also reviews and respects READ codes that flag patient objections to their data being used for various purposes by not collecting this data. This number is between one and two per cent of the total patient population.
CPRD promotes the right of patients to exercise this opt out by the provision of posters and information leaflets to practices. CPRD most recently provided this information between December 2014 and January 2015. All practices sending data for inclusion in CPRD were sent two posters and an initial batch of 30 patient leaflets. More patient leaflets, without a limit on numbers are available to practices on request.
Expected Benefits:
As evidenced by existing studies using linked data to the CPRD primary care database (on an ongoing basis), results derived from publications undertaken by CPRD customers have frequently impacted public health and informed clinical guidelines.
Notable examples and relevant publications include the following:
A CPRD study on the prevalence and management of atrial fibrillation (AF) in the general population led to initiatives to improve the care of women and older people with AF [1].
The recently published and very important NICE guidance consultation document “Suspected cancer: Recognition of suspected cancer in children, young people and adults” derived its evidence for management, investigation and referral for most of the cancers discussed largely from GPRD and CPRD studies and, for some of them, entirely from CPRD research [2].
In a large cohort of patients seen in general practice with irritable bowel syndrome, the very considerable extent of psychological co-morbidity was accurately described using data from the CPRD [3], and other research in CPRD has highlighted changing diagnostic "fashions" in the recognition of chronic fatigue syndrome [4]. Both of these studies had implications for the management of patients with long-term physical and psychological problems in general practice.
An important CPRD study detailed the epidemiology of gastrointestinal (GI) bleeding, leading to modification of NICE guidance on GI bleeding [5], while a series of studies on the complications of coeliac disease and inflammatory bowel disease led to a modification of the risk estimates of these conditions in clinical practice [6-8].
Publications:
[1] Majeed A, K Moser, K Carroll (2001) Trends in the prevalence and management of atrial fibrillation in general practice in England and Wales, 1994–1998: analysis of data from the general practice research database. Heart (3) 284 – 288
[2] National Institute for Health and Care Excellence. Suspected cancer (update). Anticipated publication date: May 2015. Suspected cancer: recognition and management of suspected cancer in children, young people and adults (update). http://www.nice.org.uk/guidance/indevelopment/gidcgwave0618.
[3] Jones R, Latinovic R, Charlton J, Gulliford M. Physical and psychological co-morbidity in irritable bowel syndrome: a matched cohort study using the General Practice Research Database. Aliment Pharmacol Ther. 2006 Sep1;24(5):879-86. PubMed PMID: 16918893.
[4] Gallagher AM, Thomas JM, Hamilton WT, White PD Incidence of fatigue symptoms and diagnoses presenting in UK primary care from 1990 – 2001. JRSM (2004); 97: 571-575.
[5] Crooks CJ, West J, Card TR. Comorbidities affect risk of nonvariceal upper gastrointestinal bleeding. Gastroenterology 2013;144:1384-1393, 1393 e1381-1382; quiz e1318-1389.
[6] Crooks CJ, Card TR, West J. Defining upper gastrointestinal bleeding from linked primary and secondary care data and the effect on occurrence and 28 day mortality. BMC Health Serv Res 2012;12:392.
[7] Crooks CJ, Card TR, West J. Excess long-term mortality following non-variceal upper gastrointestinal bleeding: a population-based cohort study. PLoS Med 2013;10:e1001437.
[8] West J, Logan RF, Smith CJ, Hubbard RB, Card TR. Malignancy and mortality in people with coeliac disease: population based cohort study. BMJ 2004;329:716-719.
The same principles apply to both UK-based and international customers; through using UK healthcare data provided by CPRD, researchers are expected to produce results of clinical importance to the UK public. Making this data available to a range of international customers opens up the resource to a broader base of talent and expertise that can run data analytics to provide measurable benefits to UK public health and the NHS. Linking DIDs to CPRD GOLD would enable, among others, further validation of diagnostic records and more accurate phenotypic definition of diseases; linking PROMS to CPRD GOLD would enable quality of life information to be introduced into the clinical records; linking MHLDDSMD to CPRD GOLD would, among others, be a step towards addressing the major health and social challenges the country faces and responding to national research priorities, including the Dementia Challenge sponsored by the Cabinet Office.
Third Party (Research) Licence – Risk Mitigation
In recognition of the fact that there may still be a small residual risk of re-identification in specific contexts (e.g. where rare patterns or combinations of data items occur in the data and may be recognisable to the recipient of the data), CPRD establishes a legal licence with the end user that obliges them not to re-identify, or to attempt to re-identify, a data subject.
This is particularly a risk where an organisation that controls identifiable data wishes to link its data with other health care data. Extra licence controls are established in these circumstances.
Outputs:
CPRD customers using these linked data will be producing (on an ongoing basis) research publications in peer-reviewed journals and presentations in scientific conferences. CPRD customers include academic institutions, pharmaceutical companies, governmental centres, and research charities, undertaking medical data research.
Trialviz: Trialviz is a feasibility and protocol optimisation tool created by CPRD that generates numbers of eligible patients based on inclusion/exclusion criteria. Access to the tool is currently restricted to internal researchers. No decision has yet been made regarding whether Trialviz will be released to external customers. Outputs are restricted to feasibility numbers and no identifiable or medical information is provided. Per internal policy, Trialviz suppresses small numbers in line with the HES analysis guide.
Processing:
CPRD receives and supplies only pseudonymised datasets. Customers may receive more than one data linkage if they are licensed to do so. CPRD will use a licence agreement which mirrors the requirements of HSCIC.
The CPRD enables linkage of NHS and other health related data for research projects approved by the Independent Scientific Advisory Committee (ISAC) of the MHRA. CPRD stores no data carrying anything other than a pseudonymised ID, as all coded data has its personal identifiers removed before entering the CPRD data domain. Data are only ever made available for medical and health research projects, approved by ISAC, and not for any other purpose.
All researchers are made fully aware they must not use data outside of the approved protocol and that an extension study or new use of data requires additional or new ISAC approval and a new legal arrangement. CPRD and ISAC require that research results need to be published in the scientific literature or shared with the regulators.
Following protocol approval by ISAC, CPRD will enable research by researchers who can demonstrate, through peer review, previous experience of undertaking the proposed research and who work for an organisation with which CPRD has a legal contract that covers all the obligations expected by the release of a research dataset.
Scientific Governance
The Independent Scientific Advisory Committee (ISAC) is a non-statutory expert advisory body established in 2006 by the Secretary of State to provide advice on research related requests to access data from the MHRA Yellow Card Scheme and the General Practice Research Database. The ISAC provides expert advisory support for all studies seeking access to data available under the Clinical Practice Research Datalink (CPRD).
ISAC consists of an appointed Chair (0.5 FTE), a Senior Scientific Officer and a multi-disciplinary appointed panel. The generalist members of the panel cover statistics, primary care, secondary care, patient and public involvement, drug safety and outcomes. They may be supplemented by specialist members drawn from the communities represented in linked data sets (e.g. cancer and MINAP)
Applications to ISAC are screened and categorised for risk according to a high, medium and low risk assessment. Low risk protocols are subject to Chair approval; medium risk protocols are assessed by the Chair to determine whether they are submitted to the panel; and high risk protocols (all drug safety studies, for instance) are subject to committee review. Protocols can be approved, rejected, or conditionally approved subject to review and / or addressing any comments made by ISAC. ISAC will be reinstating publication of meeting minutes shortly, and these are to include summary information describing the nature of the studies reviewed.
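The routing of applications by risk category can be sketched roughly as follows. This is an illustrative model only: the `Risk` enum, the function name, the returned labels and the `chair_refers_to_panel` flag are hypothetical, and the outcome for a medium risk protocol that the Chair does not refer to the panel is assumed here to be a Chair decision.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def review_route(risk: Risk, chair_refers_to_panel: bool = False) -> str:
    """Return the review path for a protocol in the given risk category.

    `chair_refers_to_panel` only matters for medium risk protocols,
    where the Chair decides whether the panel should see the protocol.
    """
    if risk is Risk.LOW:
        return "chair approval"
    if risk is Risk.MEDIUM:
        # Assumption: if not referred, the Chair decides alone.
        return "panel review" if chair_refers_to_panel else "chair decision"
    # High risk, which includes all drug safety studies.
    return "committee review"


if __name__ == "__main__":
    for risk in Risk:
        print(risk.value, "->", review_route(risk))
```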
CPRD controls the use to which its data may be put by licence agreements, which reference ISAC approved protocols. Licence agreements, which are contractually binding in nature, limit the use of data to medical and health research purposes. CPRD will ensure that the use of data will comply with the Care Act and its application by HSCIC. Within this limit, research requires the approval by ISAC of a protocol that meets stringent standards. Guidance for submission of a protocol is attached. This provides detailed guidance on the requirements of ISAC, including the expectation that findings of scientific and public health importance are disseminated by publication.
Applications will also be streamed according to the disclosure-risk categories agreed in the CPRD Section 251 application. A summary report of applications is shared with the Confidentiality Advisory Group (CAG) as a condition of the former Ethics and Confidentiality Committee (ECC) approval subject to 12 months disclosure reporting and a half year review.
As and when new linkages are established, CPRD will work with the data source and Data Controller to establish suitable arrangements for representation on ISAC, or parallel approval mechanisms. This will ensure that suitable expertise in the understanding of specific data sets and their utility is incorporated into the approval process.
Information Governance
All studies operate under an appropriate legal gateway to enable the processing of identifiable data used for the purpose of linkage by the Trusted Third Party. This may be Section 251 approval (without consent) through the Confidentiality Advisory Group (CAG) of the Health Research Authority or under consent.
CPRD has approval from the ECC / CAG for linkage by the trusted third party and residual identifiers that have an agreed research purpose (e.g. Date of Death).
CPRD will make further applications to the CAG for any processing that goes outside of the existing permissions, for example in terms of additional linkages. A new dataset will only be linked once CAG approval is obtained.
Ethics approval
CPRD has an overarching approval provided by East Midlands (Derby) Research Ethics Committee for observational research using primary care and linked data provided by CPRD. Any study that involves interaction or intervention with patients requires a separate ethical review and approval before any data will be provided.
Data are collected from consenting practices. On acceptance as a CPRD contributor, an initial data collection of all available historic EHR data is taken from the practice computer. Subsequent data collections are performed on an incremental basis approximately every four to six weeks, transmitted electronically to the CPRD via the secure NHS intranet. The data are verified for integrity and completeness before further processing. If a collection fails these checks a re-collection is requested.
The physical and logical separation of identifier data services from the clinical research data services is a fundamental tenet of the CPRD security model that is built around the Health and Social Care Information Centre acting as a Trusted Third Party for linkage and CPRD itself acting as the research data processor.
Direct identifiers are streamed separately to the data linkage service (HSCIC) by data sources. This approach will be maintained even where the HSCIC holds the data (e.g. HES).
It is important to note that CPRD recognises that data that has been de-identified (had the direct identifiers removed) can still disclose identity, whether through the rarity of information expressed in a record, through the association of the data in a record with other records accessible to the recipient, or through other knowledge possessed by the recipient. It is recognised that combinations of data, particularly within complex, granular or linked data sets, provide a potential for disclosing identity or personal information, especially where a researcher possesses a local clinical data source that they are attempting to supplement for research purposes. CPRD mitigates this risk through the application of legal agreements with researchers that will prevent the use of the data in conjunction with any other data from any other source for the purpose of re-identifying or attempting to re-identify an individual.
CPRD also holds the right of audit with research data recipients to establish that relevant terms of use (including the full terms of the data sharing agreement with the research user), security and confidentiality conditions are being adhered to. On request from HSCIC, CPRD will provide to HSCIC the results of any such audit.
Physical Security Measures
CPRD operates to a high standard to ensure that when data is transmitted and/or stored, this is done in a way that protects the data. All data in CPRD is stored in a “Tier 3” data centre that is compliant with Government standards and meets the full requirements for managing and storing such important data. The measures are always under review and are subject to audit.
Security measures include:
• Multifactor authentication for access
• Monitoring of access
• Round the clock security staff presence
• Robust firewalls and other access restrictions
A back-up store of the data mirrors the above features but in an alternative location to allow for business continuity.
Data processing
Collection loading takes collections as received, extracted and pre-loaded by CPRD, and links them to a collection so that they can be processed.
Stage 1 identifies whether there is a viable collection to process. This involves checking for the presence of key and optional data files, and checking the structure within each of the files to ensure that it is correct. The collection is then archived. Files or fields that are not required in the processing are then stripped away. The resulting data files are archived for merging in subsequent collections. Data from all data collections is combined in order to enable the identification and appropriate processing of updated records. The latest version of each updated record is retained as the current version. Deleted records are removed from all data by referencing a special mandatory collection file which contains logs of records deleted since the last collection. During the next stage the text data in the data files are encoded, replacing them with numeric lookups, reducing the size of the database and rendering it easier to manipulate computationally. Quality assurance analysis then takes place, resulting in patient acceptability criteria and practice up-to-standard dates. Feedback reports are also generated and sent to practices to help them identify problems and encourage better and more standardized recording. The final processing stage reorganises the data into a consistent column order, and sorts by patient identifier, ready for use in the query tools.
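The merge, deletion, encoding and sorting stages described above can be sketched in a few lines. This is a simplified illustration, not CPRD's actual pipeline: the record layout and the field names `record_id`, `patient_id` and `term` are invented for the example, and collections are assumed to arrive oldest first.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple

Record = Dict[str, Any]


def merge_collections(
    collections: Iterable[List[Record]], deleted_ids: Set[str]
) -> List[Record]:
    """Combine collections (oldest first), keep the latest version of
    each record, and drop records listed in the deletion log."""
    current: Dict[str, Record] = {}
    for collection in collections:
        for rec in collection:
            # A later collection overwrites an earlier version.
            current[rec["record_id"]] = rec
    return [r for rid, r in current.items() if rid not in deleted_ids]


def encode_text(
    records: List[Record], field: str
) -> Tuple[List[Record], Dict[str, int]]:
    """Replace text values in `field` with numeric lookup codes."""
    lookup: Dict[str, int] = {}
    encoded: List[Record] = []
    for rec in records:
        code = lookup.setdefault(rec[field], len(lookup) + 1)
        encoded.append({**rec, field: code})
    return encoded, lookup


def sort_for_query(records: List[Record]) -> List[Record]:
    """Sort by patient identifier, ready for the query tools."""
    return sorted(records, key=lambda r: r["patient_id"])


if __name__ == "__main__":
    first = [{"record_id": "r1", "patient_id": 2, "term": "asthma"}]
    second = [{"record_id": "r1", "patient_id": 2, "term": "asthma review"}]
    merged = merge_collections([first, second], deleted_ids=set())
    print(sort_for_query(encode_text(merged, "term")[0]))
```

Keeping the lookup table separate from the encoded records mirrors the idea of replacing text with compact numeric codes while retaining the ability to decode them.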
All CPRD staff are appropriately trained and have the necessary understanding of the governance processes pertaining to relevant laws. They will also be aware that any misuse of data may result in disciplinary procedures and, in the case of a severe breach, dismissal and immediate removal from the premises.
Training covering use of data is mandatory for CPRD staff and licence-holders prior to accessing data. Operational staff who are responsible for collection of data and interaction with site staff are precluded from access to data. Data is kept on restricted servers and drives accessible only to appropriately trained research staff.
Online access – Primary care data are available online via a secure portal. A purpose-built query tool allows customers to define patient cohorts and an “extract” tool then enables cuts of the data as specified, against a cohort or control group.
Flat files – Flat files allow licensed customers the same access as online data but supplied on an encrypted hard drive that is sent by courier.
Datasets – The CPRD Data Team will extract datasets for researchers against a query specification. The query and its output content will be agreed with the researcher prior to generation of the data sets.
In accordance with guidance from the ICO, CPRD does not permit personal data to be processed outside its own servers, and hence any such data is retained within the EU.
De-sensitising Data
Where there are potentially identifying characteristics that are indicated by the research, the data is made less specific. For example, where the age of data subjects is relevant to the research, the researchers are normally provided with year of birth rather than exact date of birth. In a study involving children, the month of birth is normally provided to the researcher. Similar processes are in place for geographic data such as where derivations are required from postcode. The standard identifiable geography for the researcher is the region and is restricted to a minimum population of about a million.
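A minimal sketch of the date-of-birth coarsening described above is given below. The function name, the `child_study` flag and the output keys are hypothetical, and it is assumed that month of birth is supplied alongside year of birth in studies involving children.

```python
from datetime import date
from typing import Dict


def coarsen_birth_date(dob: date, child_study: bool = False) -> Dict[str, int]:
    """Reduce an exact date of birth to the precision released to researchers.

    By default only the year is kept. For studies involving children the
    month is kept as well (assumed here to accompany the year).
    The exact day is never returned.
    """
    released = {"year_of_birth": dob.year}
    if child_study:
        released["month_of_birth"] = dob.month
    return released


if __name__ == "__main__":
    print(coarsen_birth_date(date(1980, 5, 17)))
    print(coarsen_birth_date(date(2012, 11, 3), child_study=True))
```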
Pseudonymisation
Pseudonymisation is applicable in a number of contexts within CPRD to enable recognition of the fact that different records relate to the same individual whilst not revealing the identity of that individual. Whilst it is primarily applicable to the data subject identity, it may also be applied to other individuals recorded in their records (such as clinicians providing clinical care), organisations (such as hospital trusts or general medical practices) and geospatial identifiers (such as postcode, or grid reference).
Primary Pseudonymisation
CPRD establishes data subject pseudonymisation through the establishment of a compound pseudonym key that comprises a practice identifier and a patient identifier (within that practice). This compound key is not identifiable within the source GP EHR.
Multiple Pseudonym Layers
CPRD processes data and makes it available internally to CPRD researchers. In doing so, CPRD replaces the original data source pseudonym(s) with a second layer pseudonym or person ID. This creates multiple layers of separation, such that an adversary would need to translate the CPRD person ID back to the data source pseudonym ID and then gain access to the data source patient index, in order to directly identify a data subject.
When linked data is supplied to third parties, the person ID may be replaced by a third layer pseudonym ID that establishes a further layer of separation.
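The layering of pseudonyms can be illustrated with a short sketch. The text does not say how replacement identifiers are generated, so the use of a keyed hash (HMAC-SHA256) here is an assumption chosen for illustration; a lookup table of random IDs would serve equally well. All function names, keys and identifier formats are hypothetical.

```python
import hashlib
import hmac


def keyed_pseudonym(value: str, key: bytes) -> str:
    """Derive a pseudonym from `value` using a secret key (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def source_pseudonym(practice_id: str, patient_id: str) -> str:
    """Layer 1: compound key of practice and within-practice patient IDs."""
    return f"{practice_id}:{patient_id}"


def person_id(source_key: str, cprd_key: bytes) -> str:
    """Layer 2: internal person ID replacing the source pseudonym."""
    return keyed_pseudonym(source_key, cprd_key)


def release_id(internal_id: str, recipient_key: bytes) -> str:
    """Layer 3: recipient-specific ID used when data leaves CPRD."""
    return keyed_pseudonym(internal_id, recipient_key)


if __name__ == "__main__":
    layer1 = source_pseudonym("P0042", "000731")
    layer2 = person_id(layer1, cprd_key=b"example-internal-key")
    layer3 = release_id(layer2, recipient_key=b"example-recipient-key")
    print(layer1, layer2[:12], layer3[:12])
```

With this construction, a recipient holding only the third-layer ID would need both secret keys and the source patient index to work back to an individual, which is the separation the layering is meant to provide.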
Encryption
Encryption is used for data in transit between secure locations. This will apply to both identifier data for linkage and clinical / research data. Although the clinical data is pseudonymised, there remains the residual risk of re-identification or the risk of inclusion of disclosive content and data is only intended for processing by authorised recipients. Encryption will mitigate the risk and provide assurance.
The general default minimum standard for encryption will be AES 256 using a complex pass-phrase consisting of 12 characters and a mix of upper case, lower case, numeric and special characters.
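The pass-phrase rule above could be checked with something like the following sketch. It treats 12 characters as a minimum length and uses Python's `string.punctuation` as the set of special characters; both are assumptions, since the text does not define them precisely.

```python
import string


def meets_passphrase_policy(passphrase: str, min_length: int = 12) -> bool:
    """Check length and character-class mix for an encryption pass-phrase.

    Requires at least `min_length` characters and at least one each of
    upper case, lower case, numeric and special characters.
    """
    return (
        len(passphrase) >= min_length
        and any(c.isupper() for c in passphrase)
        and any(c.islower() for c in passphrase)
        and any(c.isdigit() for c in passphrase)
        and any(c in string.punctuation for c in passphrase)
    )


if __name__ == "__main__":
    print(meets_passphrase_policy("Abcdef1!ghij"))
    print(meets_passphrase_policy("abcdefghijkl"))
```

A check like this only validates the form of the pass-phrase; it says nothing about how the AES 256 key is derived from it.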