NHS Digital Data Release Register - reformatted

Carnall Farrar Limited projects

374 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).

Application for Carnall Farrar to access NHS Digital data, to permit more detailed insights into the needs of the population and the challenges facing the system when shaping clinically and financially sustainable health and social care services across England. — DARS-NIC-243790-Y8K8C

Type of data: information not disclosed for TRE projects

Opt outs honoured: No - data flow is not identifiable, Anonymised - ICO Code Compliant, Identifiable, No (Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 – s261(2)(a), Health and Social Care Act 2012 - s261(5)(a)

Purposes: Yes (Consultancy)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When: DSA runs 2020-04-01 to 2021-03-31; 2020.05 to 2024.04.

Access method: Ongoing, One-Off

Data-controller type: CARNALL FARRAR LIMITED

Sublicensing allowed: No


  1. Hospital Episode Statistics Admitted Patient Care
  2. Hospital Episode Statistics Outpatients
  3. Hospital Episode Statistics Critical Care
  4. Hospital Episode Statistics Accident and Emergency
  5. Secondary Uses Service Payment By Results Outpatients
  6. Secondary Uses Service Payment By Results Spells
  7. Secondary Uses Service Payment By Results Episodes
  8. Mental Health Services Data Set
  9. Mental Health and Learning Disabilities Data Set
  10. Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
  11. Secondary Uses Service Payment By Results Accident & Emergency
  12. Emergency Care Data Set (ECDS)
  13. Secondary Uses Service Payment By Results Accident & Emergency
  14. HES-ID to MPS-ID HES Admitted Patient Care
  15. HES-ID to MPS-ID HES Outpatients
  16. Hospital Episode Statistics Accident and Emergency (HES A and E)
  17. Hospital Episode Statistics Admitted Patient Care (HES APC)
  18. Hospital Episode Statistics Critical Care (HES Critical Care)
  19. Hospital Episode Statistics Outpatients (HES OP)
  20. Mental Health and Learning Disabilities Data Set (MHLDDS)
  21. Mental Health Services Data Set (MHSDS)
  22. Community Services Data Set (CSDS)
  23. Diagnostic Imaging Data Set (DID)


Carnall Farrar Ltd (CF) is a management consultancy and analytics company whose mission is to improve healthcare. CF works with a number of purchasers and providers across NHS organisations in England, Wales and Scotland, including NHS Trusts, Foundation Trusts, Clinical Commissioning Groups and Commissioning Support Units. CF has strong relationships with the Department of Health, NHS England, NHS Improvement and Public Health England. These organisations commission CF to work on projects procured both through direct commissioning agreements, and competitive procurement processes as part of access to over six public sector specific frameworks.

CF’s client base consists primarily of NHS organisations. As of February 2020, CF had active contracts with the following NHS clients:

• Barking, Havering and Redbridge University NHS Trust
• Lancashire and South Cumbria NHS STP
• Bedford, Luton and Milton Keynes NHS ICS
• Harrogate and District NHS Foundation Trust
• Central London Community Healthcare NHS Trust

Over the past few years, CF has worked with over 15 health and social care systems in England, to support health system strategy and strengthening, service transformation, financial sustainability, and integrated care delivery. Over the past five years CF has handled large volumes of data which has been obtained directly from NHS clients. CF has been successful in using this data to provide meaningful analytics which have positively impacted NHS organisations. Unfortunately, the data available directly from each NHS organisation is aggregate data and does not include diagnostic (clinical) codes. The results, although useful, are limited.

Carnall Farrar is requesting data from NHS Digital to expand the products currently on offer for NHS Clients. The record level data disseminated under this Agreement, although pseudonymised, is richer and more accurate than the data held by individual NHS organisations. Using the data under this Agreement, CF will provide more in-depth tools and products for NHS clients, therefore producing more accurate insight into the areas of the NHS requiring improvement and how these improvements can be made.

CF is requesting NHS Digital data in order to provide additional benefits to NHS clients by expanding the capabilities of existing products and reducing costs to the NHS. The three products that the data under this Agreement will be used for are:
- CF Foresight
- CF Capacity
- CF Benchmarks

1) CF Foresight:

This tool predicts demand and patient flow through a hospital over the short to medium term (1-4 months). Hospitals can identify “pinch” points in upcoming demand compared to their current capacity and model the impact of a variety of interventions on demand and flow.

Current users:
A basic version of the tool is currently used by Northampton General Hospital NHS Trust to model the impact of increasing discharge rates within the Trust. CF also works with NHS Improvement and NHS England’s Data and Analytical Product team, to co-design the next version of the tool in order to provide a national overview of demand and flow.

Data required:
A five-year extract of the following data sets:
- Mental Health Services Data Set
- Mental Health Minimum Data Set

Why that data is required?
The tool incorporates a machine learning model to predict future performance. The more years of data that are available for training, the better the accuracy of the model. Five years of data is considered sufficient to account for annual variation. The HES APC, CC and A&E data and the ECDS data are required to train patient level models for predicting attendance, admission, four-hour breaches and discharges within the hospital. The Mental Health Services Data Set and the Mental Health Minimum Data Set are required to model the impact of mental health on A&E performance over time. The row level data is also required to model the specific interventions that trusts plan on using to improve performance. These interventions often require clinical coding to identify the impacted patients, e.g. increasing discharge rates in the frail elderly population within orthopaedics. CF requires the latest available data-sets to ensure that NHS commissioners and providers, who will receive CF analytics products, can make effective decisions based on the most up-to-date information.

2) CF Capacity:

This tool predicts in which year demand in an acute NHS organisation will exceed capacity based on demographic and non-demographic growth. It allows a system to model a variety of clinical configurations across sites, to maximise the operational effectiveness whilst minimising the negative impacts on patient safety, access and hospital performance.

Current users:
In 2019 CF used this tool on three projects: in Guernsey, to ensure that the new hospital will be correctly sized; for the Barking, Havering and Redbridge University NHS Trust, to reconfigure services to optimise current capacity across two sites; and at Whipps Cross Hospital, where CF assisted in the design of the hospital to ensure it met future capacity requirements.

Data required:
A five-year extract of the following data sets:
- Mental Health Services Data Set
- Mental Health Minimum Data Set

Why that data is required?
This tool predicts up to 20 years into the future and relies on historical activity data to predict the trend and changes in activity over and above that expected from demographic changes. Five years of historic data is the minimum amount of data CF requires to capture longer term trends. Activity is categorised by point of delivery (outpatient, inpatient, emergency department), age, and HRG code (where possible). This activity data is then mapped to capacity within the hospital. In order to capture the value from the capacity available, CF models scenarios that improve efficiency or reconfigure services. The outputs from the reconfiguration analysis will feed into a separate model, which predicts the effect of different service configurations on finance, workforce, estates and capital. This requires row level data including clinical codes. Access to mental health data-sets will allow CF to analyse the acute patient pathways for mental health service users, and enable a better understanding of the clinical needs now and in the future.
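The projection logic described above can be illustrated with a minimal sketch. It is not CF's model: it assumes demand compounds annually at a demographic rate and a non-demographic rate and that capacity stays fixed. The function name, parameters and all figures are hypothetical.

```python
from typing import Optional


def first_year_demand_exceeds_capacity(
    baseline_demand: float,
    capacity: float,
    demographic_growth: float,
    non_demographic_growth: float,
    start_year: int,
    horizon_years: int = 20,
) -> Optional[int]:
    """Return the first year in which projected demand exceeds capacity.

    Assumption: demand compounds each year by both growth rates and
    capacity is constant. Returns None if capacity is not exceeded
    within the horizon.
    """
    annual_factor = (1 + demographic_growth) * (1 + non_demographic_growth)
    demand = baseline_demand
    for offset in range(horizon_years + 1):
        if demand > capacity:
            return start_year + offset
        demand *= annual_factor
    return None


# Hypothetical figures, for illustration only.
year = first_year_demand_exceeds_capacity(
    baseline_demand=100_000,
    capacity=120_000,
    demographic_growth=0.01,
    non_demographic_growth=0.015,
    start_year=2020,
)
print(year)
```

In practice the growth rates would be estimated from the five years of historical activity, and capacity would vary by scenario rather than being held constant.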

3) CF Benchmarks:

This product provides an automated set of reports that benchmark CCGs across hospital demand, projected demand, life expectancy, deprivation, hospital performance, QOF, expenditure, workforce and population segmentation. It allows for a rapid diagnostic of an NHS organisation or geographic area, freeing up time to focus on improvement rather than number analysis.
The report topics that are included are:
- Population Segmentation
- Mental Health Toolkit
- Drivers of the deficit
- Inpatient activity
- A&E performance
- Outpatient activity

Current users:
In 2019 CF employed an alpha version of this product across 6 NHS organisations:
- Lancashire and South Cumbria NHS STP
- Bedford, Luton and Milton Keynes NHS ICS
- Harrogate and District NHS Foundation Trust
- Barking, Havering and Redbridge University NHS Trust
- Newham CCG
- Tower Hamlets CCG

Data required:
- SUS PbR Spells
- SUS PbR Episodes
- Mental Health Services Data Set
- Mental Health Minimum Data Set

Why that data is required?
Payment by Results (PbR) data is required for the drivers of the deficit analysis. CF will use these datasets to enable system financial leadership and regulators to understand the strategic financial challenge of a local area and benchmark against national peers. Mental health data is required to enable clients to monitor performance against internal targets and benchmark against national peers, whilst the bridge file is required to support planning for mental health patients based on the benchmarked opportunities. The HES and mental health data are required for benchmarking inpatient activity, outpatient activity, A&E performance and population segmentation. Five years of data is required for time-series analysis to assess trends in performance, expenditure, utilisation and demand, including seasonal patterns in utilisation as well as directional trends in performance.

The data is required at the patient level as CF’s benchmarking uses national data at the patient level to create explanatory models for each report. These explanatory models are required to compare performance at the regional, local, site and specialty levels based on time of day and patient movements within the healthcare system, e.g. quantifying the differential impact across sites that the number of people in an ED has on the probability of a patient breaching the 4 hour target.
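As a rough illustration of why record level timestamps matter here, the sketch below tabulates the breach rate by ED occupancy band. It is a descriptive tabulation, not the explanatory model described above. The record layout, the band width and the definition of a breach as more than four hours between arrival and departure are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FOUR_HOURS = timedelta(hours=4)


def breached(arrival: datetime, departure: datetime) -> bool:
    """Assumed definition: more than four hours from arrival to departure."""
    return departure - arrival > FOUR_HOURS


def breach_rate_by_occupancy(attendances, band_width=10):
    """Tabulate the share of attendances that breached, per occupancy band.

    attendances: iterable of (arrival, departure, occupancy_on_arrival),
    a hypothetical record layout. Bands are labelled by their lower bound.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [attendances, breaches]
    for arrival, departure, occupancy in attendances:
        band = (occupancy // band_width) * band_width
        counts[band][0] += 1
        counts[band][1] += breached(arrival, departure)
    return {band: b / n for band, (n, b) in sorted(counts.items())}


# Synthetic attendances, for illustration only.
t0 = datetime(2020, 1, 1, 9, 0)
example = [
    (t0, t0 + timedelta(hours=3), 12),
    (t0, t0 + timedelta(hours=5), 15),
    (t0, t0 + timedelta(hours=4), 31),
]
print(breach_rate_by_occupancy(example))
```

None of this can be computed from aggregate counts, because each attendance has to be paired with the occupancy at the moment the patient arrived.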

Lawful basis
The lawful bases for processing this data under GDPR are as follows:
• Article 6(1)(f) – it is necessary for the legitimate interests in being able to provide tools and services that will benefit healthcare organisations
• Article 9(2)(j) – it is necessary for reasons that are in the public interest in the area of public health. CF provides tools and services to public healthcare organisations that help them to monitor and improve the standards and quality of care that they offer. The processing is designed to benefit patients and society as a whole, through facilitating better healthcare in the UK.

To determine the lawfulness of processing the data for these legitimate interests, Carnall Farrar has undertaken a Legitimate Interests Assessment (LIA) and determined that:

i. The processing is necessary for the purpose:
The processing of patient data is required to provide accurate analysis of patient pathways, outcomes and performance. Aggregate data does not allow the patient level predictions and evaluations required to perform the detailed analysis and modelling required for:
CF Foresight (1): Predicting patient admission and discharge data, modelling patient level intervention
CF Capacity (2): Mapping individual patients to activity groups, identifying opportunities within care pathways
CF Benchmarks (3): Creating patient segmentation groups, modelling A&E performance, identifying financial opportunities for clinical pathways

ii. The processing is proportionate to the purpose:
The processing of this data will relate solely to the improvement of the health and social care system in England and Wales through providing detailed recommendations and initiatives that aggregate data cannot provide due to the unconnected nature of the data-sets.
The negative impact on individual patients is minimal and CF predicts that patients will benefit from the enhanced service provided by the tools and analysis this data processing produces. This data processing will not identify individuals and the output of this processing will not lead to data-sets that can be used to identify individuals.

iii. The purpose cannot be achieved by processing the data in another more obvious or less intrusive way:
The correlations between behaviours at the level of the individual patient are lost when using aggregate data. These correlations are crucial for capturing the true impact of initiatives and identifying the underlying causes for health system under-performance. Summary statistics mask these relationships and can lead to biased analysis especially when modelling system performance across multiple points of delivery.

iv. The interests of the individual data subjects do not override the legitimate interest:
Carnall Farrar has considered that the data is health data, including data about children or other vulnerable people, which the data subjects are likely to consider particularly private and that some individuals would feel uncomfortable about the processing of their health data outside of the NHS. However, the data sets are pseudonymised and therefore, the possible impacts of the processing will be minimal as Carnall Farrar is not able to de-pseudonymise the data nor will Carnall Farrar attempt to identify patients using any means. Carnall Farrar has put in place policy safeguards to ensure the security and privacy of patient records and limit the impact on patients to positive enhancements of care.

Carnall Farrar believes that the above provides sufficient legitimate interests for lawful processing of the data requested under this Agreement. Carnall Farrar has determined that there is unlikely to be any moral or ethical issue arising from the processing of this data as the data requested is pseudonymised.

Carnall Farrar is the sole Data Controller for the purposes of this Agreement and Carnall Farrar will be directly processing the data. Amazon Web Services (AWS) is a Data Processor for Carnall Farrar. Data processing and storage will take place within Carnall Farrar's secure cloud platform which is based on an AWS UK environment. The servers that store and process the data under this Agreement will be within UK sovereignty (London). The data under this Agreement will be solely used for the purposes stated above and only substantive employees of Carnall Farrar will have access to this data. No employee of AWS will access this data.

Benefits to the Health and Social Care System:
CF will support access of the data to NHS clients, enabling more effective use of the data, validation of data through feedback and, crucially, supporting more effective integration of data. The analytical products that CF builds will help NHS clients to make better decisions. A crucial component of the analytical work with NHS clients is to ensure a legacy of embedded analytical capability. CF aims to improve the flow of data across the NHS, closing the gap between action and feedback. This will help reduce data latency and improve data quality, two key components to effective use of data to create a health system that learns from every patient.

The data disseminated under this Agreement will only be used to provide services to NHS organisations within England and Wales. CF does not currently work with any non-NHS organisations; however, CF has previously worked with non-NHS organisations such as charities (Healthy living UK), think tanks (IPPR) and life sciences projects. For CF’s non-NHS clients, only publicly available data is used, and the data is stored in a different physical location within CF's data warehouse. The data provided under this Agreement will only be used for CF’s NHS Clients and will not be shared between NHS and non-NHS projects/organisations.

Expected Benefits:

The aim for all of CF’s work is to help NHS providers and commissioners identify areas of opportunity in performance or efficiency and work with them to improve. The anticipated benefits of the three analytical products described above to the health and social care system are outlined below:

1) CF Foresight:

Benefits to the NHS:
- Enable providers and commissioners to visualise and understand past pathways of activity;
- Predict future activity;
- Quantify the impact of plans on future occupancy;
- Develop plans to change activity over a selected period of time;
- Track activity against baseline to assess real-time efficacy of the new plan.

Benefit to the NHS from the data under this Agreement:
Accurate prediction of a hospital system requires modelling each patient as they enter a hospital, where they go to within the hospital and when they leave. Training the machine learning models that allow these predictions requires a large volume of historical record level data. The benefit of this is two-fold: the prediction is more accurate, so the tool can be trusted by NHS organisations, and the interventions that hospitals design to improve performance can be modelled. Without the ability to model individual patient journeys, many interventions cannot be assessed for their impact on the healthcare system.

Case study:
Over the past 18 months CF has worked with three systems across the country, each of which was in the bottom 10 for urgent and emergency care performance. There were persistent A&E issues that no-one understood, and thus could do nothing to remedy. The team were tasked with understanding the current levels of performance, diagnosing the drivers, assessing the system as a whole, and developing solutions. A major part of this work was the development of a demand, capacity and flow (DCF) model. Using local HES A&E data, a cloud-based dashboard was created which enabled visualisation of three key analyses:
1) Key drivers of historical performance,
2) Predictive models to assess future performance based on ‘do nothing’ and ‘do something’ criteria,
3) Assess current performance vs. plan and understand the drivers.
The analyses revealed that while demand has remained relatively constant, capacity has fallen significantly, and length of stay has increased. This has led to a bed occupancy rate of 95%-100% across the three systems. The ability to visualise these data as a live-feed, and in an intuitive format, enables a rapid assessment of the key drivers of A&E performance.

2) CF Capacity

Benefits to the NHS:
- Allows NHS organisations to extend their current capacity through re-configuration and productivity improvements;
- Allows NHS organisations to create the optimal configuration of services given the constraints of workforce, finance, clinical effectiveness and patient access;
- Allows new developments and new hospitals to be developed based on an accurate prediction of future activity and quantification of the efficiency of new capacity.

Benefit to the NHS from the data under this Agreement:
Accurate models of future activity rely on historical record level data. This is required to estimate the utilisation of capacity at present levels and to associate activity from individual clinical pathways and patients moving between pathways. Without record level data, CF would have only aggregate figures, which prevent it from estimating the distributions of patient activity in both spells per patient and length of stay per patient; these distributions are required for both the activity projection and the modelling of future opportunities.

3) CF Benchmarks:

Population Segmentation:
Enable providers and commissioners to identify population segments of greatest activity and spend. For example, segmenting by age and condition could reveal the degree of activity and spend taken up by the older population suffering from multiple long-term conditions, enabling the development of future plans based on the needs of the population. Population health management (PHM) was highlighted in the NHS Long Term Plan as a key component of integrated care systems. CF believes that an appropriate data infrastructure is fundamental to PHM. CF has the capacity to create this data infrastructure for NHS clients and the population segmentation explorer would be central to this.

Benefit to the NHS from the data under this Agreement:
Aggregate data does not allow patient segments to be calculated. Without row level data, the insight into how individuals interact with services across multiple spells is lost, as is the ability to track the changes in patient segments over time.

Case study:
Over the past 12 months, CF has worked with another STP to address the paucity of information regarding the needs of patients suffering from mental illness. Through use of an integrated data-set containing 1.8m patients, CF set out to segment the population into discrete groups. The team sought to understand the impact of co-morbidity and to do so CF grouped people based on physical health (mostly healthy, 1 long-term condition, 2 long-term conditions, 3+ long-term conditions) and mental health status (mostly mentally healthy, depression or anxiety, severe and enduring mental illness, dementia). Based on the population and spend in each segment CF could then calculate the spend per head. In addition to this segmentation, it is possible to use the integrated data-set to understand spend and activity, by segment, in much more detail, e.g. bed days, A&E attendances, number of mental health contacts and number and type of social care contacts. The results show that mental health is an even bigger driver of cost than age or chronic disease. A person with 3 physical long-term conditions requires 10 times the spend of someone who is mostly physically and mentally healthy. But a person suffering severe and enduring mental illness (SEMI) requires 40 times the spend of someone who is mostly physically and mentally healthy. People with dementia and SEMI account for less than 2% of the population but one seventh of spend. More broadly, people over 16 with a mental health condition account for a sixth of the population, but almost two fifths of system spend. This form of per capita spend analysis will enable providers and commissioners to develop forward plans based on the key needs of the local population.
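The segmentation and spend-per-head calculation in this case study can be sketched as follows. This is only an illustration of the grouping logic, using the segment labels quoted above. The record layout (one row per person with a long-term condition count, a mental health status and a total spend) and all figures are hypothetical.

```python
from collections import defaultdict


def physical_segment(ltc_count: int) -> str:
    """Map a count of physical long-term conditions to a segment label."""
    if ltc_count == 0:
        return "mostly healthy"
    if ltc_count >= 3:
        return "3+ long-term conditions"
    return f"{ltc_count} long-term condition" + ("s" if ltc_count > 1 else "")


def spend_per_head(records):
    """Return spend per head for each (physical, mental) segment.

    records: iterable of dicts with the hypothetical keys
    'ltc_count', 'mental_health' and 'spend', one dict per person.
    """
    totals = defaultdict(lambda: [0, 0.0])  # segment -> [people, spend]
    for r in records:
        key = (physical_segment(r["ltc_count"]), r["mental_health"])
        totals[key][0] += 1
        totals[key][1] += r["spend"]
    return {key: spend / people for key, (people, spend) in totals.items()}


# Synthetic records, for illustration only.
people = [
    {"ltc_count": 0, "mental_health": "mostly mentally healthy", "spend": 100.0},
    {"ltc_count": 0, "mental_health": "mostly mentally healthy", "spend": 300.0},
    {"ltc_count": 4, "mental_health": "dementia", "spend": 5000.0},
]
for segment, value in spend_per_head(people).items():
    print(segment, value)
```

The calculation needs one row per person, which is why it cannot be reproduced from aggregate returns. Any published output would still need small numbers suppressed.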

Drivers of the deficit:
- Visualise historical financial performance and predict future activity;
- Compare activity relative to peers to identify potential opportunity;
- Set a plan against the baseline to target specific improvements and track performance of the plan over time

Benefit to the NHS from the data under this Agreement:
- Detailed financial analysis requires HRG and patient level data to diagnose which clinical activities are responsible for poor financial performance and where opportunities for improvement can be found.

Case study:
In mid-2015, the combined NHS commissioner and provider economy in a large health economy in England was forecasting a deficit of £40m for the year. This was expected despite the region being 2% over their capitation funding target. Further, the system was failing to deliver many key performance standards. At the time, it was the only such system in the country facing this scale of challenge. CF was appointed to support the system in developing a plan for sustainable change. CF's initial analysis used data from a wide range of sources, including local PbR data-sets, and focused on understanding the drivers of the current deficit and a ‘do nothing’ deficit of £400m by 2020/21. Moreover, CF developed plans for improvement by identifying 20 major opportunities to deliver clinically and financially sustainable care; this equated to £100m of savings by 2016/17. CF developed an overall strategic financial framework to demonstrate how the system could achieve financial balance, without affecting quality of care, within five years.

Inpatient Activity, A&E performance and Outpatient Activity
Benefit to the NHS:
– Enable both commissioners and providers to assess performance across the system;
– See trends in their key activity and quality metrics, compare performance to peers, and understand the scale of opportunity to improve performance.

Benefit to the NHS from the data under this Agreement:
Identifying the performance and the variables associated with poor performance can be done crudely at the aggregate level but only by building record level explanatory models can you quantify the explicit impact of each variable given all the other variables that may also impact the patient experience. This multivariate approach provides detailed operationally actionable insights based on the historical impact on patients.

Case study:
CF was commissioned in October 2015 by an STP for an initial three-month project, where advanced analytics were employed to support a population health management based approach to the design of health and care systems. Over 1000 beds were occupied with people who could be better cared for outside of hospital; this created a significant strain on capacity in the STP. The project team focused on understanding the local population at various levels: 1) system wide, 2) CCG, 3) a locality of 30k-50k population, 4) GP practice. Using an integrated data-set containing 1.8m patients, the population was segmented by both conditions and age, understanding the size of the cohort, the service specific and total health and social care spend in each segment. Multivariate regression analysis allowed determination of the drivers of spend in a given geography. Different cohorts of the population require very different support from the health and social care system. This helped CF to identify which cohorts the STP should focus on; in this case, adults and older people with complex needs, with an average spend of £7,000 per head. The system leaders across the STP signed up to an investment case of £192m over four years to fund the transformation to local care, with expected savings of £490m over the same four-year period. The trusts broadly delivered on their savings. The planned investment in local care was not implemented due to financial pressure and organisational changes, although some areas that prioritised investment displayed the predicted return on investment for the area. The end result was that the predicted £490m was not achieved, but ~£300m was.

Mental Health Toolkit:
– Enable providers and commissioners to understand population needs with respect to mental health in the system
– Enable understanding of the link between mental and physical health in the system using clear and insightful visualisations
– Compare performance to peers using benchmarking analysis
– Create a forward plan based on population needs with respect to mental health
– Track performance of the plan against a pre-defined baseline

Benefit to the NHS from the data under this Agreement:
Without row level data CF cannot model the impact over time on individual patients and so cannot optimise the clinical pathways for mental health patients. It is crucially important to be able to quantify the interactions these patients have with the system, and to do so requires timestamped row level data.

Case study:
CF was commissioned in February 2016 by an STP for a three-month project to demonstrate that by improving mental health services, significant financial savings could be made across the system; the STP had not been able to demonstrate this robustly. Using a local mental health data-set, CF developed a mental health toolkit to analyse population needs, outcomes and complexity. The area had both an elderly population and high levels of deprivation, resulting in a large number of people suffering from dementia or serious mental illness. In general, more complex localities were associated with poorer outcomes and higher clustered spend per head. Moreover, high need groups account disproportionately for secondary care resource consumption. CF identified steps to reduce variability, increase reliability and therefore improve the efficacy of spend by £6.4m-8.4m, equivalent to ~5% of mental health spend over and above 2% provider efficiency. Up to £18m could be saved in 2020/21 from other settings of care (acute, primary, community) by better meeting the needs of physical and mental health co-morbidities. Achieving this saving would require system-wide investment of £9m.


CF works on multiple projects at any one time for a number of different national, regional and local organisations across the NHS. Therefore, it is not possible to provide full details of all outputs, as these are highly specific to the requirements for each client. CF will only share aggregated analysis with its NHS clients in presentations, reports or cloud-based visualisation tools, in full compliance with the small numbers guidance. In general, all outputs can be grouped into one of several categories detailed below:

• CF provides detailed reports to clients, which contain data in table format containing aggregated, non-patient identifiable data with small numbers suppressed in line with the HES Analysis Guide;
• These reports may also contain visualisations created using data based on aggregated, non-patient identifiable results of quantitative analysis;
• CF presents the aggregated, non-patient identifiable results with small numbers suppressed, in the form of tables and visualisations, at meetings with NHS client stakeholders;
• CF provides interactive visualisations to NHS clients in the form of cloud-based tools; the software tools that CF builds will allow some analysis to be done on aggregated data (with small numbers suppressed), in house by the providers/commissioners. CF will not allow the providers and commissioners to directly access the patient level data.
• Benchmarking applies across all services. National benchmarks will be derived from the national data. The outputs from queries against these data will be transferred to Excel, R, Python or visualisation software for communication to CF clients.
All outputs will only contain results in highly aggregated format and as statistical summaries and measures of association. Small numbers will be suppressed. Record level information will not be released to any third party.
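Small number suppression of the kind described above can be sketched as below. The threshold, the marker and the treatment of zero counts are assumptions for illustration. The rule actually applied, including any rounding of the remaining counts, is the one set out in the HES Analysis Guide.

```python
def suppress_small_numbers(counts, threshold=8, marker="*"):
    """Replace small non-zero counts with a marker before release.

    Counts from 1 up to threshold - 1 are replaced by the marker; zero and
    counts at or above the threshold are left unchanged. The default
    threshold and marker are assumptions, not taken from the Agreement.
    """
    return {
        key: (marker if 0 < n < threshold else n)
        for key, n in counts.items()
    }


# Synthetic aggregated counts, for illustration only.
table = {"Site A": 0, "Site B": 3, "Site C": 8, "Site D": 120}
print(suppress_small_numbers(table))
```

Suppression is applied to the aggregated table, after the record level data has been summarised, so that only the suppressed table leaves the secure environment.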

Examples of the specific tool outputs are set out below:

1) CF Foresight
Output: A browser based tool that enables providers and commissioners to view predictions of future attendances, admissions, discharges and occupancy, as well as to model the impact of interventions to improve performance
Timelines: An alpha version of this product has been available for the past year and CF is co-designing a new version with Northampton and NHS England and NHS Improvement for release in Spring 2020.

2) CF Capacity
Output: An Excel based tool which visualises and models the impact of reconfiguration and efficiency improvements.
Timelines: CF has this tool deployed at two sites currently and is releasing the advanced version of the tool in Spring 2020.

3) CF Benchmark
Population segmentation:
Output: A browser-based interactive explorer of the population segments.
Timelines: v1 is available now but only for one geographic area (Kent and Medway).
All other reports:
Output: An automated PowerPoint presentation accessed via the browser.
Timelines: An aggregated version has been used for the past year.

All outputs will only contain results in highly aggregated format and as statistical summaries and measures of association. Small numbers will be suppressed in line with the HES Analysis Guide. Record level information will not be released to any third party.


Carnall Farrar will store data in the Amazon Web Services (AWS) cloud, which hosts a secure PostgreSQL data warehouse into which data from NHS Digital will be uploaded. Under the agreement with AWS, data will be encrypted using AES-256 both in transit and at rest, and the service meets the standards set out in the Health and Social Care Cloud Security Good Practice Guide. CF has completed the Health and Social Care Data Risk Model and has provided NHS Digital with a justification for each point set out in the cloud good practice guide relative to the risk class.

Amazon Web Services is named as a data processor in this Agreement. Amazon Web Services is, strictly, a data processor in the sense that the data are hosted and manipulated on their infrastructure. By design, AWS themselves cannot access or read any of the data that are hosted on their infrastructure, nor can anyone else who is not specifically granted individual access to the data (including Carnall Farrar employees).

Amazon Web Services UK are compliant with many standard security frameworks, including ISO 9001, 27001, 27017, 27018; the Cloud Security Alliance certification and UK Cyber Essentials Plus.

The data under this Agreement will only be linked with anonymous data and no attempts will be made to re-identify the data.

A secure SQL server will be used to subset and extract specific portions of the data according to the project demands. Further processing and analysis will be performed by CF using a variety of techniques. National benchmarks, for example day case rates or UEC performance, will be derived from the national data and stored on the same servers as the raw data with the same level of security. The aggregated outputs from queries against these data will be transferred to Excel or visualisation software for communication to CF colleagues and clients.
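As an illustration of how a national benchmark such as a day case rate might be derived and compared against individual providers, here is a minimal Python sketch. The definition used (day cases divided by elective admissions), the provider names and all figures are assumptions invented for the example; they are not taken from CF's tools or from NHS data.

```python
def day_case_rate(day_cases: int, elective_admissions: int) -> float:
    """Share of elective admissions performed as day cases (assumed definition)."""
    if elective_admissions <= 0:
        raise ValueError("elective_admissions must be positive")
    return day_cases / elective_admissions


# Made-up aggregated activity: provider -> (day cases, elective admissions).
provider_activity = {
    "Provider X": (450, 600),
    "Provider Y": (300, 500),
    "Provider Z": (720, 900),
}

# The national benchmark is computed from pooled national totals,
# not as an average of the individual provider rates.
national_day_cases = sum(d for d, _ in provider_activity.values())
national_admissions = sum(t for _, t in provider_activity.values())
national_benchmark = day_case_rate(national_day_cases, national_admissions)

# Gap between each provider's rate and the national benchmark.
gap_to_benchmark = {
    name: round(day_case_rate(d, t) - national_benchmark, 3)
    for name, (d, t) in provider_activity.items()
}
```

Pooling the totals weights each provider by its volume of activity, which is one reasonable design choice among several for a national figure.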

CF will restrict access to the database containing data under this Agreement to CF staff only, and for the specific purposes described in this document. All CF staff will receive training to ensure they meet the security and processing standards set out by NHS Digital, both in the HES Analysis Guide and in the Data Sharing Framework Contract and subsequent Data Sharing Agreement. In addition, CF staff are subject to the client confidentiality policy, which outlines the responsibilities of CF staff with regard to confidential information. CF staff will be informed that any misuse of the data will result in formal disciplinary procedures.

There will be two types of users:
1. Primary users will have access to the data under this Agreement:
Primary users are CF permanent staff members. The users will be limited to 10 and will be able to access the data warehouse via a secure, encrypted SQL server. Authorisation controls will be in place to ensure that named users have permissions which restrict them to access only the data designated for their access. A log and audit trail of access and data downloads will be maintained and regularly monitored.

2. Secondary users will have access to datasets derived from the data warehouse:
Secondary users are CF staff members who do not have access to the raw record-level data but can access datasets that have been aggregated (with small numbers suppressed) by primary users. Primary users derive these datasets by aggregating, cleaning and mapping the raw record-level data. An example would be a report giving the length of stay by site for non-elective stays over 10 years: secondary users would be able to access this data via the primary users, change how it is viewed and perform analysis on it so as to make it most useful to the client.
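The length-of-stay example above could look something like the following Python sketch, in which a primary user turns record-level stays into a per-site summary before it is passed to secondary users. The field names, the minimum group size of 8 and the sample records are hypothetical and are chosen only to make the example runnable.

```python
from collections import defaultdict
from statistics import mean

# Assumption: sites with fewer stays than this are suppressed in the summary.
MIN_GROUP_SIZE = 8

# Made-up record-level data standing in for non-elective stays.
records = [
    {"site": "Site A", "length_of_stay_days": days} for days in range(1, 11)
] + [
    {"site": "Site B", "length_of_stay_days": days} for days in (2, 4, 6)
]


def mean_length_of_stay_by_site(rows, min_group_size=MIN_GROUP_SIZE):
    """Aggregate record-level stays to one summary row per site."""
    stays_by_site = defaultdict(list)
    for row in rows:
        stays_by_site[row["site"]].append(row["length_of_stay_days"])

    summary = {}
    for site, stays in stays_by_site.items():
        if len(stays) < min_group_size:
            # Too few records: suppress both the count and the mean.
            summary[site] = {"stays": None, "mean_los_days": None}
        else:
            summary[site] = {
                "stays": len(stays),
                "mean_los_days": round(mean(stays), 1),
            }
    return summary


summary = mean_length_of_stay_by_site(records)
```

Only the summary dictionary, never the list of records, would be made available to secondary users under this arrangement.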

CF will host an internal Data Security Committee. This panel will comprise a senior team including a partner, the head of analytics, the quality assurance and information governance leads, and senior primary users. The primary purpose of this committee is to provide oversight of who has access to the data warehouse. Its secondary purpose is to approve the requests of secondary users. The approval criteria are based on the Data Sharing Agreement and the Data Sharing Framework Contract agreed between NHS Digital and CF. The committee is responsible for ensuring that all requests align with the agreed constraints set out above. Membership of the committee will be reviewed annually, and all requests will be set out in writing. All decisions made by the committee will be logged for audit. CF may ask for clarification if it is felt a specific request requires disambiguation.

All data downloaded from the data warehouse will be aggregated with small numbers suppressed by Primary Users. Secondary users can request access to aggregate data for specific analyses; these requests will be considered by the Data Security Committee.
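The access model described in this section (a capped list of named primary users, no record-level access for anyone else, and a log of every access attempt) can be sketched in Python as follows. The class, method and field names are hypothetical; only the limit of 10 primary users comes from the text, and a real deployment would rely on database roles and the warehouse's own audit facilities rather than application code like this.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

MAX_PRIMARY_USERS = 10  # limit stated in this Agreement


@dataclass
class AccessRegister:
    """Hypothetical register of primary users with an audit trail."""

    primary_users: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def grant_primary(self, user: str) -> None:
        """Add a named primary user, enforcing the cap on their number."""
        if user not in self.primary_users and len(self.primary_users) >= MAX_PRIMARY_USERS:
            raise PermissionError("primary user limit reached")
        self.primary_users.add(user)

    def request_record_level(self, user: str, dataset: str) -> bool:
        """Decide a record-level access request and log it either way."""
        allowed = user in self.primary_users
        self.audit_log.append(
            {
                "user": user,
                "dataset": dataset,
                "allowed": allowed,
                "at": datetime.now(timezone.utc).isoformat(),
            }
        )
        return allowed


register = AccessRegister()
register.grant_primary("analyst_1")
primary_ok = register.request_record_level("analyst_1", "example_dataset")
secondary_ok = register.request_record_level("consultant_1", "example_dataset")
```

Refused requests are logged as well as granted ones, so that the audit trail shows attempted as well as actual access.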

No data processing will take place outside of England and Wales. Only high-level analytical outputs (aggregated), never patient-level data, will be shared with third parties. The tools that CF will build will be expressly for the NHS organisations. Aggregated outputs will be used for national and international benchmarking and research in the public interest. Data will only be processed or held at the addresses set out in this Agreement.

Specific processing activities for each product:

CF Foresight:
- Mental health data and HES/ECDS data will be processed to train the machine learning model predicting admissions, discharges and occupancy.
- Mental health data and HES/ECDS data will be processed to create patient cohorts for simulation of interventions.

CF Capacity:
- HES data will be processed to develop estimates of baseline activity and capacity for commissioners and providers, aggregated at service line level. The effect of different service configurations will then be compared to baseline in terms of demand, capacity, activity and travel times.
- HES data will be processed to train machine learning models to predict future activity.

CF Benchmark:
- HES data will be processed to analyse outcome, quality and activity metrics, as well as create benchmarks to compare both within and between commissioner and provider peer groups.
- HES data will be processed to segment the population of a local area by both conditions and age.
- PbR data will be processed to calculate baseline activity, spend and occupied bed days.
- The Mental Health Dataset will be processed to create benchmarks of operational performance and utilisation rates in mental health care.
- The Mental Health Dataset will be processed to create patient segments.
- All three data sets will be processed to train patient level models to predict performance.

All organisations party to this Agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract - i.e. employees, agents and contractors of the Data Recipient who may have access to that data).