Paper | Disease/Area | Date | Sample Size | Tool & Technology | Algorithms | Data Processing | Cohort | Utility |
---|---|---|---|---|---|---|---|---|
[18] | Prostate cancer | 2010–2018 | 5,461 patients | NLTK [33] | SVM [34], Rule-based algorithms, ConText [35], NegEx [36] | Imputation, Vectorization | Early-stage cancer patients | Clinical & pathological TNM staging |
[19] | Ophthalmology | 2013–2016 | 286 visits | | Bespoke Algorithms | Data drop | Ophthalmology outpatient | Clinical workflow analysis |
[20] | Ophthalmology | 2015–2016 | 8,703 visits | R 3.4.3 [37] | Linear regression [39], RF [40] | Rule And Condition | Pediatric ophthalmology outpatient | Outpatient visit length |
[21] | Non-small cell lung cancer | 2010–2018 | 794 patients | Scikit-Learn 0.24.1, SciPy 1.6.2 [43], BERT [44] | Logistic regression [45], Deep neural network [46] | NER, Rule-Based, NLP Relation Classification, Postprocessing Modules | CT-scanned non-small cell lung cancer patients | Preoperative prediction of lymph node metastasis |
[22] | Type II diabetes | - | 997 patients | Python 3.6, PyTorch 1.0 [47], NVIDIA Titan X GPU | ADAM [50], 3D UNet [51], Fuzzy C-means [52], Convolutional neural network | Segmentation & Slicing, Feature Extraction & Normalization, Annotation | CT-scanned patients with and without diabetes | Early detection of type II diabetes mellitus |
[23] | Acute ischemic stroke | 1992–2019 | 6,136, 686 patients | | Lasso logistic regression [55] | Rule-based processing | Patients aged 45+ with first ischemic stroke | Early prediction of symptomatic intracerebral hemorrhage |
[24] | Nasopharyngeal cancer | 2008–2018 | 54,703 patients | - | - | ETL, Data Structurization & Normalization | Nasopharyngeal carcinoma patients receiving treatment | Platform development for retrospective clinical studies |
[25] | No specific disease | 1980–2014 | 704,587 patients | NCBO BioPortal [56], Open Biomedical Annotator [57] | K-Means, ICA [60], Multi-Layer Neural Network [46], LDA [61], SDA [62], NegEx [36] | Denoising, Topic Modelling, Negation | Patients with one recorded ICD code | Onset of disease based on EHRs |
[26] | Cancer | 1996–2012 | 7,000 reports | Weka Software 3.6.11 [63], Perl Lingua Stem module [64] | Logistic regression [45], Naive Bayes [67], k-NN [68], RF [40], J48 decision tree [69], NegEx [36] | Kullback-Leibler [70], NER, Dictionary and Non-dictionary approach, Rule-based classifier | Patients with a recorded clinical note | Detect cancer cases using plaintext medical data |
[27] | Inpatient Accidental Falling | 2010–2014 | 46,241 patients | Ubuntu 14.04 LTS [71], R 3.1.2 [37], lme4 package [72], Epi [73] | Multilevel Logistic Regression [74] | Transformation, Mapping Values | Hospitalized inpatients with recorded data | Predict fall risk to prevent injury |
[28] | Pediatric Care | 2008–2013 | 149,604 visits | - | Statistical Analysis, Correlation, Interpolation | | Pediatric physician visits | Compute physician & departmental performance |
[29] | No specific: Evaluated in Colorectal Cancer | - | *20,346 visits | Semantic tool [80], Saxon [81], OWL [82], NCBO BioPortal [56], Protégé [83], HermiT reasoner | Bespoke phenotyping algorithm, Ontology mapping, Semantic Reasoning | Semantic Representation, Standardization | Colorectal cancer patients | Identification of patient cohorts |
[30] | No specific: Evaluated in HIV, hepatitis C, lab measurements | - | **Multiple | CogStack [89], Bio-YODIE [90], Elasticsearch [91], UMLS [92, 93], SPARQL [88], SNOMED CT [87] | Bidirectional recurrent neural network [94] | NER, Normalization, Semantic Indexing & Computation, Negation, Indexing | Pertinent clinical notes for target use cases | Customized care, trial recruitment, and research |