Explainable machine learning model for predicting the risk of significant liver fibrosis in patients with diabetic retinopathy
BMC Medical Informatics and Decision Making volume 24, Article number: 332 (2024)
Abstract
Background
Diabetic retinopathy (DR), a prevalent complication in patients with type 2 diabetes, has attracted increasing attention. Recent studies have explored a plausible association between retinopathy and significant liver fibrosis. The aim of this investigation was to develop a sophisticated machine learning (ML) model, leveraging comprehensive clinical datasets, to forecast the likelihood of significant liver fibrosis in patients with retinopathy and to interpret the ML model by applying the SHapley Additive exPlanations (SHAP) method.
Methods
This study was based on data from the National Health and Nutrition Examination Survey (NHANES) 2005–2008 cohort. Liver fibrosis was graded (F0-F4) using the Fibrosis-4 index (FIB-4). The severity of retinopathy was determined from retinal imaging and divided into four grades. Eight ML methods were used to predict significant liver fibrosis: Extreme Gradient Boosting, Random Forest, multilayer perceptron, Support Vector Machine, Logistic Regression (LR), Naive Bayes, Decision Tree, and k-nearest neighbors. Models were trained with ten-fold cross-validation, and their performance was assessed using metrics such as the area under the curve (AUC). The SHAP method was used to rank feature importance and to interpret the ML model.
Results
The analysis included 5,364 participants, of whom 2,116 (39.45%) had significant liver fibrosis. Following random allocation, 3,754 individuals were assigned to the training set and 1,610 to the validation set. Nine variables were selected for inclusion in the ML model. Among the eight ML models, the LR model achieved the highest AUC (0.867, 95% CI: 0.855–0.878) and F1 score (0.749, 95% CI: 0.732–0.767). In internal validation, it again outperformed all other ML models, with an AUC of 0.850 and an F1 score of 0.736. SHAP importance ranking identified the most influential predictors.
Conclusion
ML models built from clinical data can identify the risk of significant liver fibrosis in patients with retinopathy and thereby support early intervention.
Practice implications
Improved early detection of liver fibrosis risk in retinopathy patients enhances clinical intervention outcomes.
Introduction
Cirrhosis currently ranks as the 11th most prevalent cause of mortality worldwide, claiming an estimated one million lives annually owing to its complications, thereby imposing a substantial economic burden on numerous nations [1,2,3]. Significant liver fibrosis is commonly recognized as a precursor to cirrhosis, marking the initial pathological progression following liver injury [4, 5]. Although liver damage may be reversible during the early phases of fibrosis [6], persistent injury exacerbates the accumulation of fibrous tissue, ultimately culminating in cirrhosis [7,8,9]. Evidence suggests that significant liver fibrosis is correlated with liver-related morbidity and serves as a pivotal prognostic indicator in individuals with liver ailments [10, 11]. Thus, the early identification of liver fibrosis in patients and efficacious intervention prior to progression towards significant fibrosis/cirrhosis are of paramount importance [12, 13]. Timely recognition and efficacious interventions have the potential to reduce the likelihood of complications, enhance longevity, and improve the standard of living [14, 15]. Significantly, the current diagnosis of liver fibrosis predominantly hinges on Transient Elastography or hepatic biopsy, presenting persistent challenges in primary care contexts [16,17,18,19]. The exploration of noninvasive screening rooted in clinical manifestations continues to be a focal point of research within this domain.
Diabetic retinopathy (DR) is a severe ocular complication that affects patients with type 2 diabetes mellitus (T2DM) and has profound ramifications for global health [20,21,22]. Retinopathy manifests in two distinct forms: non-proliferative and proliferative [23]. Macular edema may manifest at any juncture in the disease trajectory, posing a formidable threat to visual acuity [24]. Furthermore, the prolonged duration of diabetes mellitus and suboptimal management of blood glucose and arterial blood pressure are the principal drivers of the onset and progression of retinopathy [24]. Recent studies have explored the robust correlation between retinopathy and significant liver fibrosis [25]. Specifically, evidence suggests that non-proliferative diabetic retinopathy (NPDR) may also be linked to liver fibrosis, as the underlying mechanisms of insulin resistance and metabolic dysregulation present in patients with NPDR can similarly affect liver health. Additionally, emerging evidence has highlighted a broader association between liver fibrosis and ocular manifestations in patients with diabetes, reinforcing the shared pathogenic mechanisms between these conditions [26,27,28]. Although the precise mechanisms remain elusive, shared pathogenic pathways have been identified in patients with T2DM and hepatic fibrosis who experience DR [29]. These pathways include insulin resistance, metabolic inflammation arising from disturbances in glucolipid metabolism, and oxidative stress [30,31,32]. Significant liver fibrosis may exacerbate systemic insulin resistance and hyperglycemia [33, 34], potentially contributing to the progression of retinopathy [35, 36]. This phenomenon is likely due to the augmentation of hepatic and systemic insulin resistance by liver fibrosis, the promotion of dyslipidemia, and the initiation of the synthesis of diverse pro-inflammatory mediators, which could contribute to the occurrence of chronic vascular complications of diabetes [37]. 
The evaluation of retinopathy severity involves retinal imaging and is typically classified into five grades: no diabetic retinopathy (NDR), mild non-proliferative diabetic retinopathy (NPDR), moderate NPDR, severe NPDR, and proliferative diabetic retinopathy (PDR) [38]. Recent studies have found that retinopathy may independently indicate significant liver fibrosis in a T2DM cohort [25].
The use of machine-learning (ML) methodologies in clinical research has steadily increased in recent years, propelled by the burgeoning availability of intricate clinical datasets [39]. This has garnered considerable interest and acclaim from a broad spectrum of clinicians and researchers. Unlike conventional statistical approaches, ML confers notable advantages in terms of prognostic accuracy and identification of latent patient subgroups characterized by distinct physiological profiles and prognostic trajectories [39]. Moreover, ML exhibits adeptness in navigating intricate interactions and nonlinear relationships, often beyond the purview of conventional clinical study methodologies, which are invariably influenced by multifaceted interrelationships among myriad factors [40, 41]. Consequently, an increasing number of clinical investigations have focused on adopting ML techniques [42]. The burgeoning interest in delineating the association between retinopathy and significant liver fibrosis underscores this frontier of inquiry. No previous studies have investigated predictive models to elucidate the risk of significant liver fibrosis in individuals with retinopathy. Thus, it is imperative to develop a predictive model that amalgamates diverse risk factors to appraise the propensity towards significant liver fibrosis onset in patients with retinopathy.
The primary objective of this investigation was to develop an interpretable ML framework, leveraging clinical data to predict the likelihood of significant liver fibrosis onset in patients with DR. Additionally, we endeavored to gauge the practical viability of such an interpretable ML model in clinical settings.
Materials and methods
Data sources
This study harnessed data sourced from the National Health and Nutrition Examination Survey (NHANES) repository, a publicly accessible online database renowned for its comprehensive, longitudinal, and multidimensional nature [43]. This invaluable resource facilitated the development of predictive models tailored specifically for individuals afflicted with retinopathy. Noteworthy for its representation and scope, the NHANES database furnishes vital insights essential for the formulation of nutrition and public health strategies. It is imperative to underscore that all participants enrolled in this database have provided explicit informed consent or have had proxies duly authorized to do so.
Study population
The data used in this study were extracted from the NHANES database, covering the years 2005 to 2008. Our inclusion criteria required that participants have complete data, including key descriptors such as retinal imaging outcomes, age, gender, and relevant medical history. We excluded individuals who (1) had incomplete or missing experimental data, (2) refused retinal imaging, or (3) lacked essential descriptors. These criteria were applied in addition to the original NHANES study design to meet the requirements of our analysis and to ensure the integrity and relevance of the data. After this selection process, a total of 5,364 participants were included in the final analysis (Fig. 1).
Retinopathy grading
Retinopathy severity was graded from retinal imaging according to the NHANES grading protocol [21]. We categorized retinopathy into four grades: Grade 1, no retinopathy; Grade 2, mild non-proliferative diabetic retinopathy (NPDR); Grade 3, moderate to severe NPDR; and Grade 4, proliferative retinopathy (PR), characterized by neovascularization at the optic disc.
Liver fibrosis classification
The Fibrosis-4 index (FIB-4) is a widely used non-invasive biomarker for the assessment of advanced liver fibrosis [44]. We used FIB-4 to determine the degree of liver fibrosis. The index is calculated from age, aspartate aminotransferase (AST), alanine aminotransferase (ALT), and platelet count (PLT), and is an effective tool for distinguishing stages of liver fibrosis and identifying patients with significant fibrosis. It is computed as follows: FIB-4 index = (age [years] × AST [IU/L]) / (platelet count [× 10^9/L] × √(ALT [IU/L])). This computation classifies liver fibrosis into three categories: FIB-4 < 1.45 signifies absent or mild fibrosis (F0-F1), 1.45–3.25 corresponds to moderate fibrosis (F2), and > 3.25 denotes advanced fibrosis/cirrhosis (F3-F4). In this investigation, participants in the F0-F1 category were classified as having absent/mild fibrotic lesions, whereas those in the F2-F4 range were classified as having significant liver fibrosis/cirrhosis [45].
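The FIB-4 calculation and the cut-offs described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function names and the example patient are hypothetical, and a value of exactly 1.45 is assumed to fall in the F2 category.

```python
import math

def fib4_index(age_years, ast_iu_l, alt_iu_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelet count x sqrt(ALT))."""
    if alt_iu_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_iu_l) / (platelets_10e9_l * math.sqrt(alt_iu_l))

def fibrosis_category(fib4):
    """Map a FIB-4 value to the three categories used in the text."""
    if fib4 < 1.45:
        return "F0-F1"   # absent or mild fibrosis
    if fib4 <= 3.25:
        return "F2"      # moderate fibrosis
    return "F3-F4"       # advanced fibrosis / cirrhosis

def has_significant_fibrosis(fib4):
    """Binary modelling outcome: F2-F4 counts as significant fibrosis."""
    return fib4 >= 1.45

# Hypothetical patient: 60 years, AST 40 IU/L, ALT 25 IU/L, platelets 200 x 10^9/L
example = fib4_index(60, 40, 25, 200)  # (60 * 40) / (200 * 5) = 2.4
```

For this hypothetical patient the index is 2.4, which falls in the F2 band and therefore counts as significant fibrosis under the binary definition used for modelling.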
Data preprocessing and feature selection
Records with missing values were excluded, which ensured complete data for all included participants and removed the need for imputation. For feature selection, we used the Boruta algorithm, a robust method known for its stability in identifying and retaining relevant variables [46]. Its core principle is to compare the importance of the actual predictor variables with that of artificially generated counterparts, termed 'shadow variables', through statistical tests over multiple iterations of the Random Forest (RF) algorithm [47]. Variables deemed non-essential, together with all shadow variables, are then removed. Through the Boruta algorithm, this study ultimately included the following variables: demographic characteristics (age, sex, and race/ethnicity); examination data (height [cm], weight [kg], and body mass index [BMI, kg/m²]); and disease history (hepatitis C virus [HCV] infection, T2DM, and DR).
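The shadow-variable idea can be illustrated with a deliberately simplified sketch. It is not the Boruta algorithm itself: absolute Pearson correlation stands in for Random Forest importance, and a fixed hit-rate threshold stands in for Boruta's statistical test. All names, parameters, and the synthetic data are hypothetical.

```python
import random

def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5

def shadow_feature_selection(features, target, n_iter=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled 'shadow' in most iterations."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_iter):
        # Shadows: each real column shuffled, which breaks its link to the target.
        best_shadow = 0.0
        for column in features.values():
            shadow = column[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_corr(shadow, target))
        # A real feature scores a hit when it outperforms every shadow.
        for name, column in features.items():
            if abs_corr(column, target) > best_shadow:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_iter >= min_hit_rate]

# Synthetic data: 'signal' determines the outcome, 'noise' is unrelated to it.
data_rng = random.Random(42)
signal = [data_rng.gauss(0, 1) for _ in range(300)]
noise = [data_rng.gauss(0, 1) for _ in range(300)]
outcome = [1 if s > 0 else 0 for s in signal]
selected = shadow_feature_selection({"signal": signal, "noise": noise}, outcome)
```

The design point carried over from Boruta is that a feature is judged against randomized copies of the real data rather than against an arbitrary importance cut-off.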
Machine learning models
We adopted a random sampling approach, selecting 70% of participants from the NHANES database spanning 2005 to 2008 for the training set, while allocating the remaining 30% for internal validation. Subsequently, we embarked on the construction and assessment of eight supervised ML models, encompassing diverse methodologies such as extreme gradient boosting (XGBoost), RF, multilayer perceptron (MLP), support vector machine (SVM), logistic regression (LR), Naive Bayes (NB), decision tree (DT), and K-nearest neighbor (KNN). To mitigate the risk of overfitting, each model underwent rigorous tenfold cross-validation. To gauge the alignment between clinical utility and predictive accuracy, we conducted Decision Curve Analysis (DCA). Furthermore, the reserved 30% of data facilitated internal validation, offering insights into the predictive model’s efficacy. Subsequently, leveraging the SHapley Additive exPlanations (SHAP) method, we discerned the most salient risk factors underpinning the propensity towards significant liver fibrosis. SHAP values provided a visual depiction elucidating the significance of each feature and its contribution to the risk of significant liver fibrosis [48]. Implementation of the SHAP methodology was executed using R, specifically leveraging SHAPVIZ version 0.9.3.
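The 70/30 split followed by tenfold cross-validation on the training portion can be sketched with the standard library alone. The helper names and the seed are hypothetical, and the sketch performs no stratification by outcome, which the source does not specify.

```python
import random

def train_validation_split(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and validation parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)  # rounds down: 3,754 of 5,364
    return indices[:n_train], indices[n_train:]

def k_fold_indices(indices, k=10):
    """Yield (fit, held_out) index lists; each sample is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out

train_idx, valid_idx = train_validation_split(5364)
folds = list(k_fold_indices(train_idx, k=10))
```

With 5,364 participants this reproduces the reported set sizes of 3,754 and 1,610. The validation indices never enter the cross-validation loop, so they remain available for the internal validation step.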
Model performance evaluation
The delineation of the ML model’s efficacy is quantitatively assessed through the computation of several pivotal metrics: the area under the receiver operating characteristic curve (AUC), the optimal threshold, Accuracy, sensitivity, specificity, precision, F1 score, Brier score, and the confidence intervals (CI). Among these, the AUC and F1 score stand as paramount indicators for appraising the model’s performance [49, 50]. The AUC provides a comprehensive encapsulation of the ROC curve’s performance, furnishing a solitary figure that aggregates the classifier’s efficacy. This metric is particularly advantageous when juxtaposing various ROC curves—especially under circumstances where these curves may intersect—because it allows for the ranking of models based on their aggregate performance, thus simplifying the evaluation process. The F1 score emerges as a pivotal metric for further scrutiny of the ML models’ effectiveness. By amalgamating precision and recall, it offers a robust measure of accuracy particularly suited for handling datasets with imbalanced classes, thereby establishing a judicious equilibrium between these two elements. Precision measures the accuracy of positive predictions, while recall elucidates the model’s adeptness at identifying true positive instances, thereby mirroring the model’s sensitivity.
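To make these metrics concrete, the sketch below computes precision, recall, F1 score, Brier score, and AUC for a small set of made-up labels and predicted probabilities. The function names, the 0.5 threshold, and the toy data are assumptions for illustration. The AUC is computed as the fraction of positive-negative pairs that are ranked correctly, which is equivalent to the area under the ROC curve.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) when predicting positive at or above threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def precision_recall_f1(y_true, y_prob, threshold=0.5):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # recall equals sensitivity
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def auc(y_true, y_prob):
    """Share of positive-negative pairs ranked correctly (ties count as half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data (not from the study): three positives, three negatives.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
precision, recall, f1 = precision_recall_f1(y_true, y_prob)
```

On this toy data the threshold of 0.5 gives two true positives, one false positive, and one false negative, so precision, recall, and F1 are all 2/3, while eight of the nine positive-negative pairs are ordered correctly, for an AUC of 8/9.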
Statistical analysis
Data conforming to a normal distribution were depicted as mean ± standard deviation (SD), while median and interquartile range (IQR) were employed to characterize non-normally distributed data. Categorical data were articulated in terms of frequencies and percentages. Continuous variables underwent analysis via the Wilcoxon rank sum test, whereas categorical variables were scrutinized using the chi-square test (Table 1). All p-values are two-sided, with p-values < 0.05 considered statistically significant. All statistical analyses were performed using SPSS software (Version 23.0), R software (Version 3.3.2), and Python software (Version 3.10.4).
Results
Demographic characteristics of participants
In accordance with the inclusion and exclusion criteria, 5,364 individuals were included in this study and partitioned into a training set (n = 3,754) and a validation set (n = 1,610) in a 7:3 ratio (Fig. 1). Within the training set, 1,472 participants (39.2%) presented with significant liver fibrosis/cirrhosis, similar to the validation set, in which 644 participants (40.0%) did. The median age (IQR) of individuals with significant liver fibrosis/cirrhosis was 70 (62.00–78.00) years, markedly higher than that of individuals with absent or mild fibrotic lesions (52.00 [46.00–60.00] years, P < 0.001). The distributions of height and HCV status showed no discernible differences between the training and validation sets. Sex, race, weight, BMI, PLT, ALT, AST, retinopathy, and T2DM were associated with significant liver fibrosis. Notably, T2DM prevalence showed no such difference in the validation set, which suggests that the data distributions of the two sets may differ. Moreover, participants with retinopathy were more likely to have significant liver fibrosis than those without retinopathy, in both the training and validation sets (P < 0.05) (Table 1).
Feature selection
AST, PLT, and ALT were excluded from the analysis prior to the Boruta algorithm due to their role in calculating the FIB-4 index. Including them would introduce redundancy and multicollinearity, which would distort the model’s accuracy and violate key assumptions of statistical modeling. Following their exclusion, the Boruta algorithm identified 9 variables with the strongest association with significant liver fibrosis (Fig. 2).
Among these, age was retained in the model despite being a component of FIB-4. This decision was made based on the independent and well-established clinical relevance of age in liver fibrosis progression. Age has been consistently recognized as a key determinant of liver disease severity due to its association with the cumulative exposure to various risk factors (e.g., metabolic disorders, viral hepatitis) and the declining regenerative capacity of the liver with advancing age. Moreover, age has been shown to modulate immune response and influence fibrosis development through age-related inflammatory processes and cellular senescence, which are independent of the specific contributions captured by the FIB-4 index.
By excluding age, the model would risk underestimating the impact of time-dependent factors on liver fibrosis, which are critical for accurate patient stratification and prognosis. Thus, while AST, PLT, and ALT were excluded to avoid circularity, age remains an indispensable variable due to its direct clinical implications beyond the scope of its role in FIB-4 calculation. As a result, 9 variables were selected for the final ML model development.
Model evaluation
The LR algorithm emerged with the highest AUC of 0.867 (95% CI: 0.855–0.878) among the ML models employed for significant liver fibrosis prediction (Fig. 3A; Table 2). Notably, the LR model garnered a noteworthy F1 score of 0.749 (95% CI: 0.732–0.767). In the validation cohort, both the LR and RF models displayed the highest AUC values among the eight ML models (LR: 0.850, 95% CI: 0.829–0.869; RF: 0.837, 95% CI: 0.817–0.857) alongside respectable F1 scores (LR: 0.736, 95% CI: 0.709–0.763; RF: 0.732, 95% CI: 0.704–0.761) (Fig. 3B; Table 3). To further illustrate the classification performance of the LR model, confusion matrices were plotted for both the training and validation sets (Figures S1A and S1B, respectively). These matrices provide a granular view of the model’s true positives, false positives, true negatives, and false negatives. The results demonstrated that the LR model correctly classified a high number of true positives and true negatives, reinforcing its ability to accurately distinguish between patients with and without significant liver fibrosis. Importantly, the relatively low rate of false positives and false negatives underscores the model’s reliability in clinical risk prediction, minimizing the potential for both over-diagnosis and missed diagnoses. This analysis adds further weight to the LR model’s favorable performance metrics, complementing the high AUC and F1 scores. These findings affirm the LR model’s robustness and suitability for real-world clinical implementation. Furthermore, the pairwise comparisons of the AUCs between Logistic Regression and other models, as presented in Table S1, provide additional evidence of the statistical significance of these performance differences.
Detailed DCA of the training dataset showed that the LR model performed well among the ML models, supporting its clinical utility (Fig. 3C). The DCA results for the validation set likewise show that using the LR model for risk prediction yields a substantial positive net benefit (Fig. 3D). Panels A and B of Fig. 4 depict the calibration curves for the models in the training and validation sets, respectively. The LR model is well calibrated in both datasets: its calibration curves lie close to the ideal 45-degree line, indicating that its predicted event rates closely match the observed event rates. In contrast, the KNN and MLP models show marked calibration bias in both the training and validation sets, especially in the higher probability intervals, where their predictions deviate more from the actual observations.
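The net benefit plotted in a decision curve can be written down directly. The sketch below uses the standard definition, net benefit = TP/n - (FP/n) × pt/(1 - pt) at threshold probability pt, and compares a model with the "treat all" strategy. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight given to a false positive
    return tp / n - (fp / n) * odds

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds

# Toy data (not from the study): three positives, three negatives.
dca_labels = [1, 1, 1, 0, 0, 0]
dca_probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
model_nb = net_benefit(dca_labels, dca_probs, threshold=0.5)
treat_all_nb = net_benefit_treat_all(dca_labels, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds. A model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero, the net benefit of treating no one.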
Moreover, the analysis of precision-recall (PR) curves across the models (Fig. 4C-D) further highlights the superior performance of the LR model. PR curves are especially informative when evaluating models on imbalanced datasets, as they focus on the balance between precision (positive predictive value) and recall (sensitivity). The LR model demonstrates high precision and recall, reflecting its ability to minimize false positives while maintaining a strong sensitivity to true positives. These results underscore the LR model’s superior discriminatory power and reliability for clinical risk prediction in imbalanced clinical datasets, making it the optimal choice for this application.
Visualization by SHAP
Figure 5A illustrates the SHAP feature importance for the LR model. The features are arranged in descending order of their influence on the predicted outcome, as measured by the mean absolute SHAP value. The five most important features were age, gender, BMI, height, and weight. The SHAP summary plot of the LR model shows the effects of these features on the predictions (Fig. 5B). A higher SHAP value for a feature indicates a greater predicted risk of significant liver fibrosis. For instance, older individuals are more likely than younger individuals to have significant liver fibrosis.
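For a logistic regression model, SHAP values on the log-odds scale have a simple closed form if the features are treated as independent: each feature contributes its coefficient times its deviation from the background mean. The sketch below illustrates that special case and the mean-absolute-SHAP ranking. The coefficients, intercept, and patient rows are invented placeholders, not the fitted model from this study.

```python
def log_odds(intercept, coefs, row):
    """Linear predictor of a logistic regression model."""
    return intercept + sum(coefs[name] * row[name] for name in coefs)

def linear_shap_values(coefs, row, background):
    """SHAP values on the log-odds scale, assuming independent features."""
    means = {
        name: sum(r[name] for r in background) / len(background) for name in coefs
    }
    return {name: coefs[name] * (row[name] - means[name]) for name in coefs}

def rank_by_mean_abs_shap(coefs, rows):
    """Order features by mean absolute SHAP value, largest first."""
    totals = {name: 0.0 for name in coefs}
    for row in rows:
        for name, value in linear_shap_values(coefs, row, rows).items():
            totals[name] += abs(value)
    return sorted(coefs, key=lambda name: totals[name] / len(rows), reverse=True)

# Placeholder model and data, for illustration only.
intercept = -4.0
coefs = {"age": 0.08, "bmi": -0.03}
background = [{"age": 50, "bmi": 30}, {"age": 70, "bmi": 26}]
patient = {"age": 70, "bmi": 26}

phi = linear_shap_values(coefs, patient, background)
ranking = rank_by_mean_abs_shap(coefs, background)
```

The values are additive: for any patient they sum to the difference between that patient's log-odds and the average log-odds over the background data. This property is what allows the per-patient plots to decompose a single prediction into feature contributions.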
Interpreting machine learning models at the patient level
We employed the SHAP method to elucidate the individual predictions for the patients and evaluated the impact of the LR model on individual patient features (Figure S2). The contribution of each feature is depicted in color, with red indicating a positive contribution and blue indicating a negative contribution. The length of the color bar represents the magnitude of the contribution. For patient A (classified in the “true positive” group), the LR model indicated a higher likelihood of significant liver fibrosis. Conversely, for patient B (classified in the “true negative” group), the LR model inferred a relatively low probability of developing significant liver fibrosis.
Construction and clinical application of an online prediction tool
Based on the logistic regression model described above, we developed an online calculator (https://lalalaanjila.shinyapps.io/Logistics_app/) for predicting the probability that a patient has significant liver fibrosis. The calculator takes several clinical variables as input and converts an individual's data into a predicted probability of significant liver fibrosis through the logistic (inverse logit) transformation, providing clinicians with a convenient platform for individualized risk assessment (Figure S3). Compared with traditional scoring systems, the calculator is intended to improve the accuracy of risk prediction and to simplify the clinical process, allowing individualized treatment decisions to be made more efficiently. Preliminary validation showed that the tool's performance on the validation set was consistent with that expected from the model, further supporting its potential value in clinical practice.
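The transformation at the core of such a calculator is the logistic function applied to a weighted sum of the inputs, p = 1 / (1 + exp(-z)) with z = b0 + b1·x1 + ... + bk·xk. A minimal sketch follows. The coefficients and the two example patients are placeholders chosen for illustration, not the values used by the published calculator.

```python
import math

def predicted_probability(intercept, coefs, patient):
    """Convert the linear predictor z into a probability with the logistic function."""
    z = intercept + sum(coefs[name] * patient[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))

# Placeholder coefficients, NOT the fitted model from the study.
demo_intercept = -5.0
demo_coefs = {"age": 0.08, "bmi": -0.02, "retinopathy_grade": 0.15}

younger = {"age": 45, "bmi": 27.0, "retinopathy_grade": 1}
older = {"age": 75, "bmi": 27.0, "retinopathy_grade": 1}

p_younger = predicted_probability(demo_intercept, demo_coefs, younger)
p_older = predicted_probability(demo_intercept, demo_coefs, older)
```

Because the logistic function is monotonic, a positive coefficient means that increasing the corresponding input always raises the predicted probability, which is why the older placeholder patient receives the higher risk estimate here.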
Discussion
The principal aim of this investigation was to craft interpretable ML models using clinical data to predict the likelihood of significant liver fibrosis in patients with DR. Our primary dataset was sourced from the NHANES database, a comprehensive repository renowned for its extensive utilization across diverse research domains in recent years. In our study, we found a positive correlation between the severity of DR and significant liver fibrosis. Both retinopathy and liver fibrosis pose significant global health challenges, often manifesting subtly, with many patients exhibiting no overt symptoms [51,52,53]. These two maladies are intertwined, elevating the risk of cirrhosis and cancer development in individuals with DR [54]. Consequently, there is a pressing need for a user-friendly, non-invasive modality for early stage liver fibrosis detection in patients with DR. We investigated the relationship between retinopathy and significant liver fibrosis, and our findings indicate that retinopathy could serve as a pivotal indicator of significant liver fibrosis progression in diabetic cohorts.
In the clinical setting, the coexistence of DR and significant liver fibrosis often manifests subtly, with many patients displaying no overt symptoms, making the simultaneous management of these conditions challenging [51]. Given the prevalence of T2DM and its complications, it is critical to recognize the often-overlooked progression of liver diseases, such as significant liver fibrosis, which may lead to compensatory or decompensatory chronic liver diseases [54]. Unfortunately, due to the generally asymptomatic nature of liver fibrosis in its early stages, patients with DR or T2DM frequently underestimate the severity or presence of liver conditions. This recognition underscores the urgent need for innovative approaches in clinical practice to facilitate the early detection and management of liver fibrosis among these patients. Our study advocates the integration of routine liver health assessments into diabetes care protocols by leveraging common clinical features and biomarkers to construct predictive models. Such models can stratify patients with DR according to the risk of liver fibrosis, enabling targeted early screening, prevention, and intervention strategies. This proactive approach aims not only to manage retinopathy, but also to preempt and mitigate the progression of liver fibrosis, thereby improving overall patient outcomes and survival rates.
This study used ML algorithms to predict the risk of significant liver fibrosis in individuals with DR. We compared eight ML methods and found that the LR model performed best, with the highest AUC of 0.867 (95% CI: 0.855–0.878) and an F1 score of 0.749 (95% CI: 0.732–0.767). To assess the validity and applicability of our findings, we evaluated the LR model on a validation dataset. In this internal validation cohort, the LR model again outperformed the other ML models, with an AUC of 0.850 and an F1 score of 0.736. Furthermore, to interpret the LR model, we used SHAP summary and dependence plots to reveal the principal predictors of significant liver fibrosis in the DR population. The SHAP analysis highlights clinical measures that are readily assessable in practice, such as age, gender, BMI, height, and weight, all of which were important features in the ML model for predicting significant liver fibrosis. Additionally, the presence of DR emerged as a noteworthy factor in the LR model, further supporting its prognostic value.
Furthermore, the online calculator we developed provides an accessible, user-friendly tool that fits seamlessly into clinical workflows, potentially allowing for broader adoption in outpatient settings without the need for complex testing. Predicting liver fibrosis risk in DR patients has implications for both hepatologic and ophthalmologic management, as early fibrosis identification facilitates timely intervention, possibly mitigating further progression of fibrosis and related comorbidities [55, 56]. By bridging these clinical domains, this model aligns with personalized medicine principles, offering a streamlined pathway to integrate hepatic risk management within the care continuum for diabetic retinopathy patients.
Numerous studies have investigated the relationship between DR and significant liver fibrosis [25, 29, 57]. Although the precise mechanism remains elusive, a close correlation has been identified. These connections include insulin resistance, inflammation stemming from glucolipid metabolic disorders, and oxidative stress [37]. Studies have indicated that liver fibrosis can exacerbate retinopathy by intensifying systemic insulin resistance and hyperglycemia [37]. Furthermore, mounting evidence suggests that dysregulation of the microbiome and its metabolites may foster the onset and progression of hepatocellular steatosis, inflammation, and fibrosis in non-alcoholic fatty liver disease and DR [58, 59]. Cumulatively, the robust correlation between DR and significant liver fibrosis suggests that DR could serve as a clinical biomarker for the advancement of substantial liver fibrosis. For the first time, we harnessed the power of ML to elucidate the precise incidence of significant liver fibrosis in patients with DR, culminating in a gratifying outcome. This study highlighted the importance of screening for hepatic ailments in patients with diabetes. In particular, we emphasize the importance of early screening and treatment of liver disease in patients with retinopathy who present to the eye clinic of a healthcare facility with ocular discomfort. This endeavor not only holds promise for ameliorating anticipated survival rates, but also provides patients with superior avenues for health stewardship. Moreover, we advocate the use of transient elastography to corroborate the diagnoses in patients with retinopathy.
The FIB-4 index offers a convenient, non-invasive means to assess liver fibrosis, but it does have limitations. Initially validated for chronic hepatitis C, FIB-4’s performance can vary across different populations, such as patients with diabetes or obesity, where the progression and nature of fibrosis may be influenced by multiple factors [60, 61]. Furthermore, while histopathology and elastography techniques (like transient or magnetic resonance elastography) provide detailed, quantitative insights into liver stiffness [62], FIB-4 gives an estimation rather than a direct measure, which may lead to misclassification, especially in patients with intermediate fibrosis stages [63]. Nevertheless, FIB-4’s accessibility and ease of use make it a valuable screening tool in settings where advanced imaging is unavailable.
In our study, FIB-4 was chosen for its proven utility in large-scale, population-based assessments, which allowed us to integrate it into a machine learning framework that incorporates additional clinical variables for a more comprehensive evaluation. This approach provides a way to enhance risk stratification in patients with diabetic retinopathy, in whom liver fibrosis is a critical concern, and underscores the practical value of a multimodal predictive model.
Despite the large sample size and the agreement of the results with our hypotheses, this study has several limitations. First, training and testing our ML model on a single database (NHANES) may introduce racial bias and constrain its generalizability across diverse populations. Datasets from a wider range of sources are needed for comprehensive training and validation. Second, because the data were extracted from a publicly available database, bias stemming from missing data is unavoidable; although we took steps to mitigate its impact, residual bias may persist. Third, owing to its retrospective design, the study is subject to selection bias, and its reliance on NHANES alone underscores the need for multicenter, large-scale clinical investigations. Fourth, fibrosis severity was defined exclusively by serological markers, without biopsy or FibroScan confirmation, which limits how accurately the extent of fibrosis can be gauged. Future studies could combine multiple diagnostic modalities for a more nuanced and dependable fibrosis assessment. Finally, prospective studies are warranted to investigate a possible causal link between retinopathy and T2DM-associated liver fibrosis.
Conclusion
ML models employing readily available clinical data can identify patients with DR who are prone to significant liver fibrosis. The SHAP methodology facilitates the interpretation of the ML model predictions, making them comprehensible enough for clinical use. As a result, clinicians can intervene early and reduce the risk of complications before patients develop serious conditions such as significant liver fibrosis and cirrhosis.
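As an illustration of how SHAP can explain a logistic regression prediction, the sketch below uses the closed-form result for linear models: when features are treated as independent, the SHAP value of feature i in log-odds space is w_i × (x_i − E[x_i]). The feature names, coefficients, intercept, and patient values are hypothetical placeholders, not the fitted model or the variables selected in this study, and the code is a simplified stand-in for a full SHAP library workflow.

```python
def log_odds(intercept, coefs, x):
    """Linear predictor (log-odds) of a logistic regression model."""
    return intercept + sum(w * xi for w, xi in zip(coefs, x))


def linear_shap_log_odds(coefs, x, background_means):
    """SHAP values in log-odds space for a linear model.

    Assumes independent features, so phi_i = w_i * (x_i - mean_i).
    """
    return [w * (xi - mu) for w, xi, mu in zip(coefs, x, background_means)]


# Hypothetical standardized features and coefficients (illustration only).
feature_names = ["age", "AST", "PLT"]
intercept = -0.4
coefs = [0.8, 0.5, -0.6]
background_means = [0.0, 0.0, 0.0]  # standardized features have mean 0
patient = [1.2, 0.5, -1.0]

shap_values = linear_shap_log_odds(coefs, patient, background_means)

# Additivity check: contributions sum to the difference between this
# patient's log-odds and the log-odds at the background mean.
base_value = log_odds(intercept, coefs, background_means)
prediction = log_odds(intercept, coefs, patient)
assert abs(sum(shap_values) - (prediction - base_value)) < 1e-9

# Rank features by absolute contribution, largest first.
ranking = sorted(zip(feature_names, shap_values), key=lambda p: -abs(p[1]))
for name, value in ranking:
    print(f"{name}: {value:+.2f}")
```

Each contribution shows how far one feature pushes this patient's log-odds above or below the baseline, which is the kind of per-patient reading that makes the ranking clinically interpretable.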
Data availability
The datasets generated and/or analyzed during the current study are available in the GitHub repository, accessible via the following link: https://github.com/lalalaanjila/NHANES_Data_Analysis.git.
Abbreviations
- ALT: Alanine Aminotransferase
- AST: Aspartate Aminotransferase
- AUC: Area Under the Receiver Operating Characteristic Curve
- BMI: Body Mass Index
- CI: Confidence Intervals
- DCA: Decision Curve Analysis
- DR: Diabetic Retinopathy
- DT: Decision Tree
- FIB-4: The Fibrosis-4 Index
- HCV: Hepatitis C Virus
- IQR: Interquartile Range
- KNN: K-Nearest Neighbor
- LR: Logistic Regression
- ML: Machine Learning
- MLP: Multilayer Perceptron
- NB: Naive Bayes
- NHANES: National Health and Nutrition Examination Survey
- NPDR: Non-proliferative Diabetic Retinopathy
- PDR: Proliferative Diabetic Retinopathy
- PLT: Platelet
- RF: Random Forest
- SD: Standard Deviation
- SHAP: SHapley Additive exPlanations
- SVM: Support Vector Machine
- T2DM: Type 2 Diabetes Mellitus
- XGBoost: Extreme Gradient Boosting
References
Asrani SK, Devarbhavi H, Eaton J, Kamath PS. Burden of liver diseases in the world. J Hepatol. 2019;70(1):151–71.
Mokdad AH, Forouzanfar MH, Daoud F, Mokdad AA, El Bcheraoui C, Moradi-Lakeh M, et al. Global burden of diseases, injuries, and risk factors for young people’s health during 1990–2013: a systematic analysis for the global burden of disease study 2013. Lancet. 2016;387(10036):2383–401.
Vos T, Lim SS, Abbafati C, Abbas KM, Abbasi M, Abbasifard M, et al. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the global burden of disease study 2019. Lancet. 2020;396(10258):1204–22.
Casari M, Siegl D, Deppermann C, Schuppan D. Macrophages and platelets in liver fibrosis and hepatocellular carcinoma. Front Immunol. 2023;14:1277808.
Zhou WC, Zhang QB, Qiao L. Pathogenesis of liver cirrhosis. World J Gastroenterol. 2014;20(23):7312–24.
Roehlen N, Crouchet E, Baumert TF. Liver fibrosis: mechanistic concepts and therapeutic perspectives. Cells. 2020;9(4).
Caligiuri A, Gentilini A, Pastore M, Gitto S, Marra F. Cellular and molecular mechanisms underlying liver fibrosis regression. Cells. 2021;10(10).
Khurana A, Sayed N, Allawadhi P, Weiskirchen R. It’s all about the spaces between cells: role of extracellular matrix in liver fibrosis. Ann Transl Med. 2021;9(8):728.
Pinzani M, Rombouts K, Colagrande S. Fibrosis in chronic liver diseases: diagnosis and management. J Hepatol. 2005;42(Suppl1):S22–36.
Taylor RS, Taylor RJ, Bayliss S, Hagström H, Nasr P, Schattenberg JM, et al. Association between fibrosis stage and outcomes of patients with nonalcoholic fatty liver disease: a systematic review and meta-analysis. Gastroenterology. 2020;158(6):1611–25. e12.
Angulo P, Kleiner DE, Dam-Larsen S, Adams LA, Bjornsson ES, Charatcharoenwitthaya P, et al. Liver fibrosis, but no other histologic features, is associated with long-term outcomes of patients with nonalcoholic fatty liver disease. Gastroenterology. 2015;149(2):389–97. e10.
Day JW, Rosenberg WM. The enhanced liver fibrosis (ELF) test in diagnosis and management of liver fibrosis. Br J Hosp Med (Lond). 2018;79(12):694–9.
Kugelmas M, Noureddin M, Gunn N, Brown K, Younossi Z, Abdelmalek M, et al. The use of current knowledge and non-invasive testing modalities for predicting at-risk non-alcoholic steatohepatitis and assessing fibrosis. Liver Int. 2023;43(5):964–74.
Lambrecht J, van Grunsven LA, Tacke F. Current and emerging pharmacotherapeutic interventions for the treatment of liver fibrosis. Expert Opin Pharmacother. 2020;21(13):1637–50.
Sharma S, Khalili K, Nguyen GC. Non-invasive diagnosis of advanced fibrosis and cirrhosis. World J Gastroenterol. 2014;20(45):16820–30.
Honda Y, Yoneda M, Imajo K, Nakajima A. Elastography techniques for the assessment of liver fibrosis in non-alcoholic fatty liver disease. Int J Mol Sci. 2020;21(11).
EASL-ALEH Clinical Practice Guidelines. Non-invasive tests for evaluation of liver disease severity and prognosis. J Hepatol. 2015;63(1):237–64.
Roulot D, Costes JL, Buyck JF, Warzocha U, Gambier N, Czernichow S, et al. Transient elastography as a screening tool for liver fibrosis and cirrhosis in a community-based population aged over 45 years. Gut. 2011;60(7):977–84.
Friedrich-Rust M, Poynard T, Castera L. Critical comparison of elastography methods to assess chronic liver disease. Nat Rev Gastroenterol Hepatol. 2016;13(7):402–11.
Wilkinson-Berka JL. Angiotensin and diabetic retinopathy. Int J Biochem Cell Biol. 2006;38(5–6):752–65.
Vujosevic S, Aldington SJ, Silva P, Hernández C, Scanlon P, Peto T, et al. Screening for diabetic retinopathy: new perspectives and challenges. Lancet Diabetes Endocrinol. 2020;8(4):337–47.
Yau JW, Rogers SL, Kawasaki R, Lamoureux EL, Kowalski JW, Bek T, et al. Global prevalence and major risk factors of diabetic retinopathy. Diabetes Care. 2012;35(3):556–64.
Song P, Yu J, Chan KY, Theodoratou E, Rudan I. Prevalence, risk factors and burden of diabetic retinopathy in China: a systematic review and meta-analysis. J Glob Health. 2018;8(1):010803.
Kollias AN, Ulbig MW. Diabetic retinopathy: early diagnosis and effective treatment. Dtsch Arztebl Int. 2010;107(5):75–83.
Zhang GH, Yuan TH, Yue ZS, Wang L, Dou GR. The presence of diabetic retinopathy closely associated with the progression of non-alcoholic fatty liver disease: a meta-analysis of observational studies. Front Mol Biosci. 2022;9:1019899.
Kang KH, Shin D, Ryu IH, Kim JK, Lee IS, Koh K, et al. Association between cataract and fatty liver diseases from a nationwide cross-sectional study in South Korea. Sci Rep. 2024;14(1):77.
Chen C, Wei L, He W, Zhang Y, Xiao J, Lu Y, et al. Associations of severe liver diseases with cataract using data from UK Biobank: a prospective cohort study. EClinicalMedicine. 2024;68:102424.
Patel R, Nair S, Choudhry H, Jaffry M, Dastjerdi M. Ocular manifestations of liver disease: an important diagnostic aid. Int Ophthalmol. 2024;44(1):177.
Yuan TH, Yue ZS, Zhang GH, Wang L, Dou GR. Beyond the liver: liver-eye communication in clinical and experimental aspects. Front Mol Biosci. 2021;8:823277.
Sheka AC, Adeyi O, Thompson J, Hameed B, Crawford PA, Ikramuddin S. Nonalcoholic steatohepatitis: a review. JAMA. 2020;323(12):1175–83.
Capitão M, Soares R. Angiogenesis and inflammation crosstalk in diabetic retinopathy. J Cell Biochem. 2016;117(11):2443–53.
Marušić M, Paić M, Knobloch M, Liberati Pršo A-M. NAFLD, insulin resistance, and diabetes mellitus type 2. Can J Gastroenterol Hepatol. 2021;2021.
Marchesini G, Marzocchi R, Agostini F, Bugianesi E. Nonalcoholic fatty liver disease and the metabolic syndrome. Curr Opin Lipidol. 2005;16(4):421–7.
McCullough AJ. Pathophysiology of nonalcoholic steatohepatitis. J Clin Gastroenterol. 2006;40:S17–29.
Brownlee M. The pathobiology of diabetic complications: a unifying mechanism. Diabetes. 2005;54(6):1615–25.
Groop PH, Forsblom C, Thomas MC. Mechanisms of disease: pathway-selective insulin resistance and microvascular complications of diabetes. Nat Clin Pract Endocrinol Metab. 2005;1(2):100–10.
Targher G, Lonardo A, Byrne CD. Nonalcoholic fatty liver disease and chronic vascular complications of diabetes mellitus. Nat Rev Endocrinol. 2018;14(2):99–114.
Wilkinson CP, Ferris FL 3rd, Klein RE, Lee PP, Agardh CD, Davis M, et al. Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales. Ophthalmology. 2003;110(9):1677–82.
Stevens LM, Mortazavi BJ, Deo RC, Curtis L, Kao DP. Recommendations for reporting machine learning analyses in clinical research. Circ Cardiovasc Qual Outcomes. 2020;13(10):e006556.
Chung H, Ko Y, Lee IS, Hur H, Huh J, Han SU, et al. Prognostic artificial intelligence model to predict 5 year survival at 1 year after gastric cancer surgery based on nutrition and body morphometry. J Cachexia Sarcopenia Muscle. 2023;14(2):847–59.
Shi H, Yang D, Tang K, Hu C, Li L, Zhang L, et al. Explainable machine learning model for predicting the occurrence of postoperative malnutrition in children with congenital heart disease. Clin Nutr. 2022;41(1):202–10.
Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, Asadi H. eDoctor: machine learning and the future of medicine. J Intern Med. 2018;284(6):603–19.
Wu WT, Li YJ, Feng AZ, Li L, Huang T, Xu AD, et al. Data mining in clinical big data: the frequently used databases, steps, and methodological models. Mil Med Res. 2021;8(1):44.
Bedossa P, Poynard T. An algorithm for the grading of activity in chronic hepatitis C. The METAVIR Cooperative Study Group. Hepatology (Baltimore MD). 1996;24(2):289–93.
So-Armah KA, Lim JK, Lo Re V, Tate JP, Chang CH, Butt AA, et al. FIB-4 stage of liver fibrosis predicts incident heart failure among HIV-infected and uninfected patients. Hepatology (Baltimore MD). 2017;66(4):1286–95.
Kursa MB, Rudnicki WR. Feature selection with the Boruta package. J Stat Softw. 2010;36:1–13.
Degenhardt F, Seifert S, Szymczak S. Evaluation of variable selection methods for random forests and omics data sets. Brief Bioinform. 2019;20(2):492–503.
Lundberg S. A unified approach to interpreting model predictions. arXiv Preprint. 2017;arXiv:1705.07874.
Bekkar M, Djemaa HK, Alitouche TA. Evaluation measures for models assessment over imbalanced data sets. 2013.
Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manag. 2009;45(4):427–37.
Zimmet P, Alberti KG, Magliano DJ, Bennett PH. Diabetes mellitus statistics on prevalence and mortality: facts and fallacies. Nat Rev Endocrinol. 2016;12(10):616–22.
Gregg EW, Sattar N, Ali MK. The changing face of diabetes complications. Lancet Diabetes Endocrinol. 2016;4(6):537–47.
Caballería L, Pera G, Arteaga I, Rodríguez L, Alumà A, Morillas RM, et al. High prevalence of liver fibrosis among European adults with unknown liver disease: a population-based study. Clin Gastroenterol Hepatol. 2018;16(7):1138–45. e5.
Chalasani N, Younossi Z, Lavine JE, Diehl AM, Brunt EM, Cusi K, et al. The diagnosis and management of non-alcoholic fatty liver disease: practice guideline by the American Gastroenterological Association, American Association for the Study of Liver Diseases, and American College of Gastroenterology. Gastroenterology. 2012;142(7):1592–609.
Chalasani N, Younossi Z, Lavine JE, Charlton M, Cusi K, Rinella M, et al. The diagnosis and management of nonalcoholic fatty liver disease: practice guidance from the American Association for the study of Liver diseases. Hepatology (Baltimore MD). 2018;67(1):328–57.
Younossi ZM, Koenig AB, Abdelatif D, Fazel Y, Henry L, Wymer M. Global epidemiology of nonalcoholic fatty liver disease-meta-analytic assessment of prevalence, incidence, and outcomes. Hepatology (Baltimore MD). 2016;64(1):73–84.
Yang W, Xu H, Yu X, Wang Y. Association between retinal artery lesions and nonalcoholic fatty liver disease. Hepatol Int. 2015;9(2):278–82.
Mouzaki M, Loomba R. Insights into the evolving role of the gut microbiome in nonalcoholic fatty liver disease: rationale and prospects for therapeutic intervention. Therap Adv Gastroenterol. 2019;12:1756284819858470.
Liu W, Wang C, Xia Y, Xia W, Liu G, Ren C, et al. Elevated plasma trimethylamine-N-oxide levels are associated with diabetic retinopathy. Acta Diabetol. 2021;58(2):221–9.
Vallet-Pichard A, Mallet V, Nalpas B, Verkarre V, Nalpas A, Dhalluin-Venier V, et al. FIB-4: an inexpensive and accurate marker of fibrosis in HCV infection. Comparison with liver biopsy and fibrotest. Hepatology (Baltimore MD). 2007;46(1):32–6.
Sterling RK, Lissen E, Clumeck N, Sola R, Correa MC, Montaner J, et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology (Baltimore MD). 2006;43(6):1317–25.
Castéra L, Vergniol J, Foucher J, Le Bail B, Chanteloup E, Haaser M, et al. Prospective comparison of transient elastography, fibrotest, APRI, and liver biopsy for the assessment of fibrosis in chronic hepatitis C. Gastroenterology. 2005;128(2):343–50.
Mózes FE, Lee JA, Selvaraj EA, Jayaswal ANA, Trauner M, Boursier J, et al. Diagnostic accuracy of non-invasive tests for advanced fibrosis in patients with NAFLD: an individual patient data meta-analysis. Gut. 2022;71(5):1006–19.
Acknowledgements
The authors would like to express gratitude to all participants and contributors of NHANES, as well as acknowledge the clinical and research staff from various research centers.
Funding
The Opening Project of Jiangsu Key Laboratory of Xuzhou Medical University (XZSYSKF2021030). The Basic Research Fund, First Affiliated Hospital of Gannan Medical University (QD095). Xuzhou Key Research and Development Program under Grant (KC23273). Hospital-level Scientific Research Project of Affiliated Hospital of Xuzhou Medical University (2022ZL26).
Author information
Contributions
Gf. Z.: Performed statistical analyses and took the lead in writing the manuscript; N. Y. and Q. Y.: Managed the collection and arrangement of the data; R. X. and Lj. Z.: Provided technical support throughout the research process; Yl. Z. and Jy. L.: Engaged in the clinical practices associated with this study; J. C. and Cx. C.: Reviewed and made significant revisions to the initial draft; Zh.L. and L. H.: Were responsible for primary data collection; Y. X. and Tl. Z.: Oversaw the overall direction and planning of the project, and designed the research topic. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhu, G., Yang, N., Yi, Q. et al. Explainable machine learning model for predicting the risk of significant liver fibrosis in patients with diabetic retinopathy. BMC Med Inform Decis Mak 24, 332 (2024). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-024-02749-z
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-024-02749-z