Scripts for generating and analysing fake HES-APC NHS England EHR data.

georgemelrose/Dummy-HES-APC-Data-Work

Dummy-HES-APC-Data-Work

Scripts for generating and analysing fake HES-APC NHS England EHR data, used for a presentation in an interview for the post of "Research Assistant in Health Data Science" at a Russell Group institution.

From Herbert et al., 2017 (doi: 10.1093/ije/dyx015):

"Hospital Episode Statistics Admitted Patient Care (HES APC) data are collected on all admissions to National Health Service (NHS) hospitals in England. HES APC also covers admissions to independent sector providers (private or charitable hospitals) paid for by the NHS. It is estimated that 98–99% of hospital activity in England is funded by the NHS. A hospital admission includes any secondary care-based activity that requires a hospital bed, thus including both emergency and planned admissions, day cases, births and associated deliveries."

On the structure of these files, also from Herbert et al., 2017:

"HES APC data files are structured according to financial years. Each row in HES APC indicates a ‘Finished Consultant Episode’ (FCE). An FCE represents a continuous period of care under one consultant, and each is specified with a start and an end date. Episodes are labelled as ‘finished’ and entered in HES APC according to the financial year in which they end. Consequently, episodes that start in one financial year and end in another will be classified as unfinished in the starting financial year, and finished in the ending financial year. Unfinished episodes need to be removed before analysis to prevent double counting.

A hospital admission in HES APC is referred to as a ‘spell’, defined as an uninterrupted inpatient stay at one hospital. A spell may include several FCEs if the patient was seen by multiple consultants during the same stay, but does not include transfers between hospitals. If a patient is transferred to a different hospital, a new spell begins."
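The two rules quoted above (drop unfinished episodes, then group FCEs into spells) can be sketched in a few lines. This is a minimal illustration, not code from this repository: the repository's scripts are R Markdown, and Python is used here only for the example. The `Episode` fields, the provider codes and the choice of (patient, provider, admission date) as the spell key are all simplifying assumptions. The financial year is taken to run from 1 April to 31 March.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Episode:
    # All field names are hypothetical, chosen for this sketch only.
    patient_id: str
    provider: str                 # hospital (provider) code
    admission_date: date          # date the hospital stay began
    episode_start: date
    episode_end: Optional[date]   # None while the episode is unfinished


def financial_year(d: date) -> str:
    """Label the financial year (1 April to 31 March) that contains d."""
    start = d.year if d.month >= 4 else d.year - 1
    return f"{start}/{(start + 1) % 100:02d}"


def finished_only(episodes):
    """Drop unfinished episodes so that no episode is counted twice."""
    return [e for e in episodes if e.episode_end is not None]


def group_into_spells(episodes):
    """Group finished episodes by (patient, provider, admission date).

    Assumption: one key identifies one uninterrupted stay at one hospital.
    A transfer changes the provider, so it starts a new spell.
    """
    spells = defaultdict(list)
    for e in finished_only(episodes):
        spells[(e.patient_id, e.provider, e.admission_date)].append(e)
    for fces in spells.values():
        fces.sort(key=lambda e: e.episode_start)
    return dict(spells)


episodes = [
    # Two consultants during one stay at HOSP_A: one spell, two FCEs.
    Episode("P1", "HOSP_A", date(2020, 3, 28), date(2020, 3, 28), date(2020, 3, 30)),
    Episode("P1", "HOSP_A", date(2020, 3, 28), date(2020, 3, 30), date(2020, 4, 2)),
    # Transfer to HOSP_B: a new spell begins.
    Episode("P1", "HOSP_B", date(2020, 4, 2), date(2020, 4, 2), date(2020, 4, 5)),
    # Unfinished episode: removed before analysis.
    Episode("P2", "HOSP_A", date(2020, 5, 1), date(2020, 5, 1), None),
]

spells = group_into_spells(episodes)
for key, fces in spells.items():
    years = [financial_year(e.episode_end) for e in fces]
    print(key[0], key[1], len(fces), years)
```

Note that the two FCEs in the first spell end in different financial years, so each would appear as finished in a different year's file.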

I do not have access to the real data, and it would be inappropriate to share health data, or even a derivative of it, publicly, so I generated the data completely from scratch. The dummy patient data approximate NHS England HES-APC data: 100K rows (each row representing one FCE) and 38 variables. Of these, 26 variables are based on the 4C_mortality_score/01_data_prep.R script, for example dialysis, age.factor and hypertension_mhyn. The remaining 12 variables, such as the dates and causes of first readmission, were generated from memory and with the ICD package.
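To give a flavour of what generating such data from scratch involves, here is a minimal sketch in Python. It is not the method used in dummy_data_generation.RMD, which is written in R. The function name, the column subset, the value ranges, the probabilities and the levels of `hypertension_mhyn` are assumptions made for illustration, and the ICD-10 categories are a small hand-picked sample.

```python
import random
from datetime import date, timedelta


def make_dummy_episodes(n_rows, seed=42):
    """Return n_rows synthetic episode records as dictionaries.

    Every distribution here is an arbitrary assumption; nothing is
    derived from real patient data.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    # Illustrative ICD-10 categories only (MI, pneumonia, AKI, type 2 diabetes).
    icd_codes = ["I21", "J18", "N17", "E11"]
    rows = []
    for i in range(n_rows):
        start = date(2020, 4, 1) + timedelta(days=rng.randrange(365))
        length_of_stay = rng.randint(0, 30)  # days; 0 means a day case
        rows.append({
            "episode_id": i + 1,
            "patient_id": f"P{rng.randrange(1, n_rows + 1):06d}",
            "age": rng.randint(18, 100),
            "dialysis": rng.random() < 0.05,
            "hypertension_mhyn": rng.choice(["YES", "NO", "Unknown"]),
            "episode_start": start,
            "episode_end": start + timedelta(days=length_of_stay),
            "admission_cause_icd10": rng.choice(icd_codes),
        })
    return rows


rows = make_dummy_episodes(1_000)
print(rows[0])
```

Fixing the seed means the same dummy dataset can be regenerated on demand, so the data file itself never needs to be shared.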

Files used:

  • dummy_data_generation.RMD to generate dummy HES-APC data with ICD codes for causes of admission and readmission.

  • health_data_analysis.RMD to perform survival analysis on the dummy data and generate an ioslides presentation.

  • custom_styles.css to customise the appearance of the ioslides presentation.
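This README does not say which survival method health_data_analysis.RMD uses. As one common starting point, the sketch below computes a Kaplan-Meier estimate in plain Python (the repository's own analysis is in R). The function name and the toy inputs are hypothetical. `durations` could be, for instance, days from discharge to first readmission, with `events` set to False for patients who were never readmitted during follow-up (censored).

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier estimate of the survival function.

    durations: follow-up time for each subject.
    events: True if the event was observed at that time, False if censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        leaving = 0   # subjects leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data: five patients, two of them censored.
durations = [5, 8, 8, 12, 20]
events = [True, True, False, True, False]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects count towards the number at risk up to their censoring time but never trigger a drop in the curve, which is what separates this estimator from a simple proportion of subjects without the event.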
