CFDE Biomarkers partnership project
For the Biomarkers project, the UNM-IDG Team is developing a dataset of clinically relevant molecular biomarkers using the Oracle HealthFacts and Real-World Data databases, which contain deidentified EHR data, including LOINC codes for laboratory tests. Named entity recognition (NER) associates LOINC terms with biomolecules, particularly genes and proteins, the initial focus of this study. A complementary effort focuses on the FDA Clinical Laboratory Improvement Amendments (CLIA), an FDA regulatory process under which lab tests are approved and categorized for clinical use; many of these tests measure analytes that are molecular biomarkers.
- Download LOINC db from loinc.org
- relatednames_table.py - Split the Loinc.csv RELATEDNAMES2 column into a separate table.
- Go_loinc_DbCreate.sh - Build PgSql db from Loinc.csv and relatename.tsv.
- Go_loinc_GetData.sh - Query db for chemicals with names, relatednames.
- Go_loinc_NER_tagger_gene.sh - NER for genes using JensenLab Tagger.
- Go_loinc_NER_leadmine_gene.sh - NER for genes using NextMove Leadmine.
- Go_hf_labs.sh, hf_lab_loinc_counts.sql - Query db for labs.
- biomarkers_loinc_hf.Rmd
  - Generate list of clinically relevant molecular biomarker candidates.
  - Count encounters and patients for all LOINC codes (chemical).
  - Group lab procedures into list; aggregate on LOINC codes.
  - Group protein synonyms; aggregate on LOINC codes.
  - Sort LOINC codes by occurrence, as a proxy for clinical relevance.
- biomarkers_loinc_hf_out.tsv
- biomarkers_loinc_hf.html
- biomarkers_loinc_rwd_out.tsv
- biomarkers_loinc_rwd.html
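
Two of the steps above (splitting related names into their own table, and ranking LOINC codes by occurrence) can be sketched in Python as follows. This is only an illustrative sketch, not the code in relatednames_table.py or biomarkers_loinc_hf.Rmd. It assumes that Loinc.csv has `LOINC_NUM` and `RELATEDNAMES2` columns, that related names are semicolon-delimited, and that lab records are available as (LOINC code, encounter ID, patient ID) tuples. The function names `split_related_names` and `rank_by_occurrence` are hypothetical, and the sample rows are toy data.

```python
import csv
import io
from collections import defaultdict


def split_related_names(loinc_csv, delimiter=";"):
    """Yield (loinc_num, related_name) pairs, one per related name.

    loinc_csv: open text file (or file-like object) with a header row.
    Column names and the delimiter are assumptions about Loinc.csv.
    """
    for row in csv.DictReader(loinc_csv):
        for name in (row.get("RELATEDNAMES2") or "").split(delimiter):
            name = name.strip()
            if name:  # skip empty fragments such as those from ";;"
                yield row["LOINC_NUM"], name


def rank_by_occurrence(lab_rows):
    """Count distinct encounters and patients per LOINC code.

    lab_rows: iterable of (loinc_num, encounter_id, patient_id) tuples
    (an assumed layout). Returns (loinc_num, n_encounters, n_patients)
    tuples, most frequently occurring codes first.
    """
    encounters = defaultdict(set)
    patients = defaultdict(set)
    for loinc_num, encounter_id, patient_id in lab_rows:
        encounters[loinc_num].add(encounter_id)
        patients[loinc_num].add(patient_id)
    counts = [(code, len(encounters[code]), len(patients[code]))
              for code in encounters]
    # Descending by encounters, then patients; code ascending breaks ties.
    return sorted(counts, key=lambda c: (-c[1], -c[2], c[0]))


if __name__ == "__main__":
    # Toy rows for illustration only; not copied from Loinc.csv.
    sample = io.StringIO(
        "LOINC_NUM,RELATEDNAMES2\n"
        '1742-6,"ALT; SGPT"\n'
        '2345-7,"Glu; Gluc"\n'
    )
    for loinc_num, name in split_related_names(sample):
        print(loinc_num, name, sep="\t")

    labs = [("2345-7", "e1", "p1"), ("2345-7", "e2", "p1"), ("1742-6", "e3", "p2")]
    for code, n_enc, n_pat in rank_by_occurrence(labs):
        print(code, n_enc, n_pat, sep="\t")
```

Counting distinct encounters and patients rather than raw rows keeps repeated results within one encounter from inflating a code's rank. The project's own workflow performs the equivalent aggregation in SQL and R.
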
- CFDE Biomarkers Project
- LOINC | Learn | Downloads
- Oracle Health real-world data
- BEST (Biomarkers, EndpointS, and other Tools) Resource
- FDA Clinical Laboratory Improvement Amendments (CLIA)