
Development and validation of a machine learning model to predict hemostatic intervention in patients with acute upper gastrointestinal bleeding

Abstract

Background

Acute upper gastrointestinal bleeding (UGIB) is common in clinical practice and has a wide range of severity. Along with medical therapy, endoscopic intervention is the mainstay treatment for hemostasis in high-risk rebleeding lesions. Predicting the need for endoscopic intervention would be beneficial in resource-limited areas for selective referral to an endoscopic center. The proposed risk stratification scores had limited accuracy. We developed a machine learning model to predict the need for endoscopic intervention in patients with acute UGIB.

Methods

A prospectively collected database of UGIB patients from 2011 to 2020 was retrospectively reviewed. Patients aged 18 years or older diagnosed with UGIB who underwent endoscopy were included. Data comprised demographic characteristics, clinical presentation, and laboratory parameters. The cleaned data were used for model development and validation in Python. The data were split 80%–20% into training and test sets. The training set was used for supervised learning of 15 models with stratified 5-fold cross-validation. The model with the highest area under the receiver operating characteristic curve (AUROC) was then internally validated on the test set to evaluate performance.

Results

Of 1389 patients, 615 (44.3%) received endoscopic intervention (293 variceal and 336 nonvariceal interventions, including 14 patients who received both). Eighteen features, including demographic characteristics, clinical presentation, and laboratory parameters, were selected as input for 15 machine learning models. The linear discriminant analysis model achieved the highest AUROC, 0.74, in cross-validation for predicting endoscopic intervention. When the model was validated on the test set, the AUROC was 0.81. Finally, the model was deployed as a web application using Streamlit.

Conclusions

Our machine learning model can identify patients with acute UGIB who need endoscopic intervention with good performance. This may help primary care physicians prioritize patients who need referrals and optimize resource allocation in resource-limited areas. Further development and identification of more specific features might improve prediction performance.

Trial Registration

None (Retrospective cohort study)

Patient & Public Involvement

None


Background

Acute upper gastrointestinal bleeding (UGIB) remains a prevalent and significant medical condition encountered in routine clinical practice, with an annual incidence of 80–150 per 100,000 population and mortality rates spanning from 2% to 15% [1,2,3]. Given a considerable spectrum of severity, multiple international guidelines recommended pre-endoscopic risk stratification to assess and triage patients according to severity. Esophagogastroduodenoscopy (EGD) should be performed within a suggested time frame of 12–24 h from the onset of presentation [4,5,6,7]. Endoscopy is the most effective tool for diagnosing UGIB, and endoscopic therapy is indicated for lesions with high-risk stigmata to control bleeding and prevent rebleeding. For risk stratification, various scoring systems have been developed during the past two decades, such as the pre-endoscopic Rockall score (RS), Glasgow-Blatchford score (GBS), and AIMS65 score [8,9,10]. These scores aim to classify patients into low-risk or high-risk groups, thereby guiding the subsequent treatment strategy. Patients classified as a high-risk group carry higher mortality and rebleeding rates and, therefore, require in-hospital management. The pre-endoscopic Rockall score and the AIMS65 score were designed to predict mortality, while the GBS aimed to predict the likelihood of in-hospital management, including endoscopy, transfusion support, or surgery. Studies comparing the performance of these scoring systems revealed that the GBS has the highest performance and very high sensitivity [11,12,13]. However, the specificity was still limited [14, 15]. Furthermore, these scoring systems did not focus mainly on predicting hemostatic intervention.

The role of artificial intelligence (AI) in the medical profession is undergoing rapid expansion since machine learning can effectively handle large, complex, and heterogeneous datasets, extract correlations of parameters in detail, and accurately predict the outcome with experience [16,17,18]. As evidenced by numerous studies, the application of AI has yielded satisfying results in risk stratification, endoscopic findings, and mortality rate in UGIB [19,20,21,22,23,24,25,26]. In 2020, Seo et al. created a machine learning model to predict adverse events such as mortality, hypotension, and rebleeding in patients with initially stable nonvariceal bleeding. The model showed a higher ability to detect adverse events than conventional scores [27]. The other two models that focus on predicting blood transfusion or mortality for UGIB in the intensive care unit exhibited impressive performance, with an area under receiver operating characteristic (AUROC) exceeding 0.80 [22, 28]. Moreover, Shung et al. developed an available online machine-learning model from multicenter patient data to stratify low-risk patients who can be treated in an outpatient setting [24]. A multicenter study proved that the model performs better than the GBS, with great sensitivity and high specificity [29].

From our perspective, predicting the need for endoscopic intervention in patients with UGIB is one of the key decisions in the management flow, and it would benefit physicians in resource-limited areas. Such a prediction could optimize the selective referral of patients from the primary healthcare center to the endoscopic center. Currently, only a few machine learning models specifically predict the need for endoscopic intervention in UGIB [30]. We therefore developed a simplified machine-learning model based on structured data, intended for clinical decision support, to indicate the need for endoscopic intervention in patients with acute UGIB.

Material and methods

Patients

Prospectively collected data from adult patients who presented with acute overt UGIB at Siriraj Hospital, Bangkok, Thailand, from January 2011 to December 2020 were retrospectively reviewed. The UGIB management protocol in our hospital followed international guidelines [4,5,6,7]. However, the final treatment decision rested with the attending physician and the bleeding team, including the endoscopist, interventional radiologist, and surgeon, who were available on a 24–7 basis. An expert endoscopist or trainees under close supervision performed the endoscopy. Endoscopic findings and hemostatic interventions were recorded using the Endosmart program. Hemostatic interventions for ulcers with high-risk bleeding stigmata [Forrest classification Ia (spurting bleeding), Ib (oozing bleeding), and IIa (non-bleeding visible vessel)] included one or a combination of adrenaline injection, hemostatic clip, thermal hemostasis, and hemostatic powder. For adherent clots, an attempt was made to remove the clot and examine the character of the underlying lesion. No intervention was applied to low-risk ulcers, including clean-base ulcers and hematin spots. For variceal bleeding, rubber band ligation or glue injection was usually used for hemostasis of actively bleeding lesions; for nonbleeding esophageal varices with high-risk stigmata, prophylactic band ligation was performed at the endoscopist’s discretion. The study included patients aged 18 years and older with acute UGIB who underwent EGD with comprehensive documentation of endoscopic findings. Patients were retrieved from the Siriraj GI endoscopic center database using the search term “upper GI bleeding or UGIB” in the indication for EGD. The exclusion criteria were in-hospital onset of UGIB, onset of UGIB more than 72 hours before the hospital visit, and missing data on baseline characteristics, bleeding presentation, laboratory findings, endoscopic result, or intervention. Eligible patients who met the defined criteria were analyzed and used to develop and validate the machine-learning model.

Data collection and outcome

The data from the chart and endoscopic record were extracted manually by well-trained GI fellows. They were divided into three categories: patient characteristic data, bleeding characteristic data, and laboratory data. All the data must be available before endoscopy. The baseline characteristics were extracted from the patient’s history, which was documented before the hospital visit. The bleeding presentation was noted by the first physician who encountered the patient, and laboratory tests were the initial test after the patient visited the hospital and before the GI consultation. Characteristic data of the patient included age, sex, co-morbidities documented in the medical file before the bleeding event, such as heart disease, stroke, chronic kidney disease, cirrhosis, and active malignancy, use of antithrombotic agents [conventional non-steroidal anti-inflammatory drugs (NSAIDs), cyclooxygenase-(COX) 2 inhibitors, aspirin, clopidogrel, and oral anticoagulants], previous episodes of UGIB and duration of bleeding before hospitalization documented by the first physician who encounters the patient. The definition of each comorbidity was described in Supplementary Material Appendix 1.

The bleeding characteristics comprised the clinical manifestations of bleeding, such as type of vomitus (red or coffee-ground emesis) and type of stool (melena, maroon stool, red stool), the presence of syncope, altered consciousness (defined as a Glasgow Coma Scale score below 13 or documented as the chief complaint on presentation to the hospital), systolic blood pressure (SBP), heart rate, and need for resuscitation (fluid therapy at a rate higher than 500 ml/h without vasopressor, or vasopressor needed after optimal fluid therapy).

Initial laboratory test data included hemoglobin, platelet count, blood urea nitrogen (BUN), creatinine, serum albumin, and international normalized ratio (INR). The BUN and creatinine were analyzed as BUN/creatinine ratio as it would diminish the falsely high BUN from renal causes such as pre-renal azotemia or chronic kidney disease. These parameters are used in previous scoring systems associated with hospital intervention and mortality [8,9,10]. The endpoint selected to develop the machine learning model was the requirement of endoscopic hemostatic intervention for variceal or non-variceal procedures, including epinephrine injection, hemostatic clip, thermal hemostasis, rubber band ligation, glue injection therapy, and hemostatic powder. All the data was cleaned and rechecked for exclusion criteria.

For data pre-processing, we applied One-Hot Encoding to the categorical variables because most machine learning algorithms cannot handle categorical data without encoding. We also normalized the numeric columns with the z-score, which rescales their values to a common scale without distorting differences in the ranges of values.
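The two pre-processing steps can be illustrated with a minimal standard-library sketch. The helper names (`one_hot`, `z_score`), the `is_` column prefix, and the toy values are hypothetical, and the use of the population standard deviation is an assumption; the study itself relied on its Python tooling, whose exact settings are not reported here.

```python
from statistics import mean, pstdev

def one_hot(values):
    """Turn one categorical column into 0/1 indicator columns, one per level."""
    levels = sorted(set(values))
    return [{f"is_{level}": int(v == level) for level in levels} for v in values]

def z_score(values):
    """Rescale a numeric column to zero mean and unit standard deviation.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical toy records, not study data.
stool = ["melena", "maroon", "red", "melena"]
hemoglobin = [7.0, 9.0, 11.0, 13.0]

encoded = one_hot(stool)      # three indicator columns per patient
scaled = z_score(hemoglobin)  # centered on 0 with unit spread
```

Each level of a categorical variable becomes its own binary column, which is why the number of model inputs grows after encoding.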

Model development

For this study, we built machine learning models in Python (version 3.10, 64-bit) to predict the need for endoscopic intervention. The patient data set was randomly divided into training and test sets in an 80%–20% ratio. To reduce the risk of overfitting and ensure that the model performs well across samples, we applied stratified 5-fold cross-validation as the validation technique. The training data were then fed to all models available in the model library, using cross-validation to train and validate them. These 15 supervised learning models were Linear Discriminant Analysis, Logistic Regression, Naïve Bayes, CatBoost Classifier, Extra Trees Classifier, Quadratic Discriminant Analysis, Random Forest Classifier, Gradient Boosting Classifier, Ada Boost Classifier, Light Gradient Boosting Machine, Extreme Gradient Boosting, K-Neighbors Classifier, Decision Tree Classifier, Dummy Classifier, and Lasso regression. They are available in the Python library PyCaret (version 3.1.0). Under stratified 5-fold cross-validation, the training data were randomly divided into five subsets, or folds. Each model was trained and evaluated five times, each time using a different fold as the validation set. The performance metrics from each fold were then averaged to estimate the model’s performance, reported as the average area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, and specificity on the validation sets across the five folds. Based on this comparison, we selected the model with the highest AUROC as the optimal model for predicting the need for endoscopic intervention. Because all models were compared with default hyperparameters, the best model was subsequently tuned with the hyperparameter-tuning method in PyCaret to find the best prediction performance. For hyperparameter tuning, we used a random search over a pre-defined grid. In addition, we increased the number of iterations from 100 to 1000 in steps of 100 to find the best performance. However, the results did not change, so the minimum of 100 iterations was used in this experiment. The tuned model was then internally validated on the test set to evaluate its performance and to analyze the features most important for prediction. Finally, the model was deployed on a local host with the Python library Streamlit. The output was shown as the need, or lack of need, for endoscopic intervention. The prediction probability was displayed as a percentage to help the user decide on further management.
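The splitting scheme described above can be sketched without any library, which makes the meaning of "stratified" explicit: each class is shuffled and distributed separately, so every subset keeps roughly the same proportion of intervention cases. The functions `stratified_split` and `stratified_folds`, the seeds, and the toy labels are hypothetical; this is an illustration of the idea, not the procedure PyCaret runs internally.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed):
    """Return (train_idx, test_idx) with similar class proportions in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def stratified_folds(indices, labels, k, seed):
    """Deal the indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

# Hypothetical labels: 1 = intervention, 0 = no intervention (44 of 100 positive).
labels = [1] * 44 + [0] * 56
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
folds = stratified_folds(train_idx, labels, k=5, seed=0)

for held_out in range(5):
    val_idx = folds[held_out]
    fit_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A model would be fitted on fit_idx and scored on val_idx here;
    # the five validation scores are then averaged.
```

The test set is set aside before any fold is formed, so it plays no part in model selection or tuning.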

Statistical analysis

Qualitative data were summarized as frequency and percentage, while quantitative data were summarized as mean and standard deviation. Differences in variables between the two groups were analyzed using Fisher’s exact test or the Mann–Whitney U test. Prediction performance was measured by AUROC curve analysis, sensitivity, specificity, and accuracy. The negative predictive value (NPV) and positive predictive value (PPV) of the model were calculated. AUROC values were interpreted using predefined categories: acceptable threshold (≥ 0.7), fair performance (≥ 0.7 but < 0.8), good performance (≥ 0.8 but < 0.9), and excellent performance (≥ 0.9). The AUROC of the proposed model was compared with that of the conventional scores using a paired permutation test. A P-value < 0.05 was considered statistically significant. All statistical analyses were performed using Python (version 3.10, 64-bit).
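As an illustration of the performance measures named above, the following sketch computes the AUROC as a rank statistic, maps it to the predefined categories, and runs one possible form of a paired permutation test. The exact permutation scheme used in the study is not described, so the swap-within-patient design, the conversion of both scores to mid-ranks, and all names and toy values here are assumptions.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def band(value):
    """Map an AUROC to the categories predefined in the statistical analysis."""
    if value >= 0.9:
        return "excellent"
    if value >= 0.8:
        return "good"
    if value >= 0.7:
        return "fair"
    return "below acceptable"

def midranks(scores):
    """Put scores on a common 0-1 scale; AUROC is unchanged by this transform."""
    n = len(scores)
    return [sum((t < s) + 0.5 * (t == s) for t in scores) / n for s in scores]

def paired_permutation_p(labels, scores_a, scores_b, n_perm=2000, seed=0):
    """Two-sided p-value for the AUROC difference of two scores on the same patients."""
    rng = random.Random(seed)
    ranks_a, ranks_b = midranks(scores_a), midranks(scores_b)
    observed = abs(auroc(labels, ranks_a) - auroc(labels, ranks_b))
    extreme = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for ra, rb in zip(ranks_a, ranks_b):
            if rng.random() < 0.5:  # swap the two scores for this patient
                ra, rb = rb, ra
            perm_a.append(ra)
            perm_b.append(rb)
        if abs(auroc(labels, perm_a) - auroc(labels, perm_b)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical toy data, not study data.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
model_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.4]
baseline_scores = [3, 1, 2, 2, 3, 1, 2, 2]
p_value = paired_permutation_p(labels, model_scores, baseline_scores, n_perm=500)
```

Because both scores come from the same patients, swapping them within a patient preserves the pairing while simulating the null hypothesis that neither score discriminates better.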

The sample size recruitment strategy was designed to include as many patients as possible to achieve a more efficient machine-learning model. This study followed the TRIPOD-AI reporting guideline and the ethical guidelines of the Declaration of Helsinki and was approved by the Siriraj Institutional Review Board (COA No. Si 1028/2021). The checklist of the TRIPOD-AI reporting guideline is provided in Supplementary Material Appendix 2. Since this was a retrospective analysis, informed consent was not obtained from the patients.

Results

Patient characteristic

The database of 2,201 patients with acute UGIB was reviewed. Among these, 635 patients with missing data, 170 patients with delayed hospitalization, and seven patients with in-hospital UGIB were excluded, leaving 1,389 patients eligible for model development. All patients underwent upper endoscopy within 120 h, and 615 (44.3%) of the cohort received endoscopic intervention: 293 received variceal interventions and 336 received nonvariceal interventions, including 14 patients who received both. The baseline characteristics, bleeding characteristics, and laboratory findings are presented in Table 1.

Table 1 Baseline characteristics, clinical presentation, and laboratory findings

For the total cohort, the mean age was 64.3 years, with a male predominance of 65%. Patients in the intervention group were younger, and male sex, coexisting cirrhosis, and active malignancy were more prevalent in this group. Furthermore, patients in the intervention group came to the hospital earlier. They required resuscitation more often, which was consistent with a significantly lower systolic blood pressure, a higher heart rate, more red emesis, a lower platelet count, a lower albumin level, and a higher INR level. For medication, the use of NSAIDs, COX-2 inhibitors, and clopidogrel was comparable between the two groups, but the nonintervention group included more aspirin and anticoagulant users.

Model parameters

Analysis of the data for model development showed that some independent parameters were correlated with each other. For example, patients with a history of stroke or heart disease tend to use antiplatelet agents; creatinine levels could reflect chronic kidney disease; syncope and altered mental status can be evaluated as unstable vital signs. These categorical parameters, namely the history of stroke, heart disease, chronic kidney disease, syncope, and alteration of consciousness, were dropped because of their probable co-linearity by logical assumption, as they captured equivalent properties of the subjects and could cause unstable coefficient estimates or overfitted models. We calculated the variance inflation factor (VIF) to evaluate the co-linearity of the numerical parameters. The result showed that hemoglobin, systolic blood pressure, heart rate, and albumin had a high VIF. However, they were crucial parameters used in many previous prediction models [31,32,33,34,35,36,37]. Therefore, these parameters were retained in the models. The several types of antithrombotic drugs were grouped because the preliminary models showed similar precision whether these medications were entered as one grouped parameter or as distinct parameters.
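For reference, the variance inflation factor of a numerical parameter is conventionally defined from a regression of that parameter on all the others. The symbols below follow the standard textbook definition rather than anything specified in this study, and the cutoff that was treated as "high" is not reported.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Here $R_j^{2}$ is the coefficient of determination obtained when parameter $j$ is regressed on the remaining numerical parameters. A value of 1 means the parameter is uncorrelated with the others, and the value grows without bound as $R_j^{2}$ approaches 1.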

Finally, 18 parameters were used as input for the machine learning models: age, sex, presence of cirrhosis, active malignancy, use of antithrombotic drugs, previous history of UGIB, vomitus character (red emesis or coffee-ground emesis), stool character (melena, maroon stool, or red stool), duration of UGIB before hospitalization, resuscitation requirement, systolic blood pressure, heart rate, hemoglobin level, platelet count, serum albumin level, blood urea nitrogen level, creatinine level, and INR level. Of these, 8 were categorical and 10 were numerical, and all categorical features were transformed by the One-Hot Encoding method. Each categorical level became a separate feature in the dataset containing binary values, either 0 or 1. In doing so, the total of eighteen features was extended to twenty-one features. The blood urea nitrogen level and creatinine level were combined as a ratio for machine learning analysis.

As mentioned above, 80% of the cohort (1,111 patients) was used as a training set for machine learning models. The baseline characteristics of the training set and the test set are shown in Table 2. There were no significant differences in patient profile, bleeding presentation, laboratory results, and endoscopic hemostatic intervention between these two sets.

Table 2 Clinical characteristics and intervention of the training set and the test set

Model performance

According to the comparison results in Table 3, the linear discriminant analysis model demonstrated the highest AUROC, accuracy, and specificity. The AUROC of this model was 0.74, with a sensitivity of 57%, a specificity of 80%, and an accuracy of 70%. The results of the 5-fold cross-validation of the linear discriminant analysis model are shown in Supplementary Material Appendix 3. The Naïve Bayes model had the highest sensitivity, 63%, but its AUROC was only 0.73. The linear discriminant analysis model was therefore chosen for hyperparameter tuning, after which its AUROC increased slightly to 0.75. For performance evaluation, the model was internally validated on the test set, where it achieved an AUROC of 0.81, as shown in Fig. 1. The NPV and PPV of our model, calculated from the confusion matrix of the test set, were 0.75 and 0.74, respectively, at a prevalence of endoscopic intervention of 0.44.
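The relationship between a confusion matrix and the reported metrics can be made concrete with a short sketch. The function name and the four counts below are hypothetical and chosen only to give round numbers; they are not the study's test-set confusion matrix.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive the usual diagnostic metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # interventions correctly flagged
        "specificity": tn / (tn + fp),   # non-interventions correctly cleared
        "ppv": tp / (tp + fp),           # flagged patients who needed intervention
        "npv": tn / (tn + fn),           # cleared patients who needed none
        "accuracy": (tp + tn) / total,
        "prevalence": (tp + fn) / total,
    }

# Hypothetical counts for illustration only.
metrics = confusion_metrics(tp=80, fp=20, tn=120, fn=40)
```

Unlike sensitivity and specificity, PPV and NPV depend on the prevalence of the outcome, so they should be read together with the prevalence in the population where the model is applied.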

Table 3 Comparison of the average prediction performances of 15 models on the validation sets
Fig. 1

ROC curve for predicting endoscopic intervention based on Linear Discriminant Analysis model with the test set

Feature importance was analyzed and is shown in Fig. 2. Cirrhosis, red emesis, and the need for resuscitation were the three most important features for predicting the need for endoscopic intervention. The probability thresholds used to classify patients into the intervention or non-intervention group are plotted in Fig. 3. Changing the threshold shifts the balance between sensitivity and specificity but leaves the AUROC unchanged. At the default threshold of 0.5, our model has a sensitivity of 74.5% and a specificity of 81.4%. As the probability threshold decreases, the model’s sensitivity increases, but its specificity decreases. For example, at a probability threshold of 0.17, the sensitivity reaches 99.2% with a specificity of 17.5%. Accuracy in the figure was high when the probability threshold ranged from 0.4 to 0.6, especially around 0.5. Therefore, this study used 0.5 as the threshold for classifying patients into groups.
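The trade-off described above can be reproduced on a toy example by sweeping the threshold over a fixed set of predicted probabilities. The function name, the labels, and the probabilities are hypothetical, and classifying a patient as "intervention" when the probability is greater than or equal to the threshold is an assumption about the decision rule.

```python
def sens_spec_at(labels, probs, threshold):
    """Sensitivity and specificity when probability >= threshold means intervention."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predictions for eight patients (1 = intervention needed).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.7, 0.55, 0.3, 0.6, 0.4, 0.2, 0.1]

for t in (0.2, 0.5, 0.8):
    sens, spec = sens_spec_at(labels, probs, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy data, lowering the threshold from 0.8 to 0.2 raises sensitivity from 0.25 to 1.00 while specificity falls from 1.00 to 0.25; the ranking of patients, and therefore the AUROC, is the same at every threshold.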

Fig. 2

Feature importance plot for predicting endoscopic intervention based on the linear discriminant analysis model

Fig. 3

The threshold optimization plot of predicting endoscopic intervention based on Linear Discriminant Analysis model

The developed Linear Discriminant Analysis model, GBS, pre-endoscopic RS, and AIMS65 score were applied to the test set cohort, and the AUROC of each was analyzed, as shown in Fig. 4. Our model was superior to the conventional scoring systems in predicting the need for endoscopic intervention (AUROC: developed model 0.81 [95% CI 0.76–0.87] vs GBS 0.55 [95% CI 0.48–0.61], p < 0.001; pre-endoscopic RS 0.60 [95% CI 0.53–0.67], p < 0.001; AIMS65 score 0.54 [95% CI 0.47–0.61], p < 0.001).

Fig. 4

Comparison of the ROC curve for predicting endoscopic intervention among the developed model, Glasgow-Blatchford score, Rockall score, and AIMS-65 score

After we achieved the optimal prediction model based on linear discriminant analysis, we implemented this model in a local web application using Streamlit (Python Library), as demonstrated in Fig. 5. The panel on the left side will be used for data input. After inserting all the data, the result and probability of prediction will be instantly shown on the right side. This finalized program can be used on a local host computer.

Fig. 5

An application of endoscopic intervention prediction based on linear discriminant analysis (LDA) with an example of 18 features and their corresponding results

Discussion

Our study successfully developed a machine learning model to predict the need for endoscopic hemostatic intervention in patients with acute UGIB. The model takes 18 simple parameters as input: 6 demographic, 6 bleeding characteristic, and 6 initial laboratory parameters. As confirmed by a large validation cohort, the accuracy of this model ranged from fair to good, with an AUROC of 0.81, an accuracy of 70%, a sensitivity of 57%, and a specificity of 80%. Linear discriminant analysis (LDA) outperformed the other machine-learning algorithms. This might be because the relationships in our data allow LDA to exploit its advantages. Specifically, LDA maximizes the separation between classes and helps reduce the overlap between classes in the reduced-dimensional space, which makes it highly effective for classification. Additionally, it helps prevent overfitting by reducing dimensionality while maintaining class-related information.

Prior studies have shown that the characteristics of active bleeding, such as red emesis and bloody gastric content, physical signs of hypovolemia, such as the presence of syncope and lower mean arterial pressure, and laboratory findings of low hemoglobin level, prolonged prothrombin time, and higher BUN level were correlated with the performance of therapeutic intervention [31,32,33]. Combining those factors predicted endoscopic intervention with an AUROC comparable to that obtained with all the parameters of the GBS [33]. Subsequently, several scores with parameters less complex than the GBS were developed specifically to predict endoscopic intervention in UGIB (Table 4). The MAP(ASH) score was created in Spain and validated in a large multiethnic cohort. The AUROC of the MAP(ASH) score was similar to that of the GBS but significantly higher than that of the AIMS65 score [34]. Two Japanese scores have been described, the Nagoya University score and the H3B2 score [35, 36], both of which demonstrated an AUROC higher than that of the GBS. The London Haemostat Score (LHS) also showed higher accuracy than the GBS [37]. Although these scoring systems provide accuracy similar to or better than that of the GBS with less complex parameters, their performance is still limited.

Table 4 Summary of studies that evaluated factors and scoring system in predicting endoscopic intervention and outcome in UGIB

An artificial intelligence model focusing on endoscopic intervention prediction has been developed to overcome the limited performance of scoring systems. Veisman et al. developed a machine learning tool in Israel using 34 parameters to assess the possible correlation between baseline characteristics and endoscopic intervention in 883 patients [30]. Their Random Forest model had a sensitivity, specificity, and AUROC of 0.55, 0.71, and 0.68, respectively. The AUROC of that model was higher than those of the GBS (0.54) and pre-endoscopic RS (0.56), which is similar to the result in our population. Their analysis plot showed that syncope, cirrhosis, and erythromycin use were correlated with the risk of intervention. Our model achieved higher sensitivity, specificity, and AUROC than the model of Veisman et al. while using fewer parameters. The better performance of our model could be because we used a larger training population for model derivation and because we built a variety of models and selected the best performer among them. Nevertheless, the parameters most important for prediction were consistent between the two models. Cirrhosis is a common risk feature in both models, as it involves various mechanisms that increase the need for endoscopic intervention, including esophagogastric varices, thrombocytopenia, and coagulopathy. UGIB patients with cirrhosis generally have varices and usually need endoscopic intervention for both therapeutic and prophylactic purposes. The presence of red emesis or syncope and the requirement for fluid resuscitation may reflect the severity of bleeding from an underlying lesion with high-risk stigmata that leads to endoscopic intervention. For erythromycin, the prokinetic effect is thought to improve endoscopic visualization, so that endoscopic intervention can then be performed. However, the proportion of patients who received erythromycin in that study was small. In our center, intravenous erythromycin is unavailable, so our model did not include pre-endoscopic prokinetic use as a parameter.

In this study, the hemoglobin level was not an influential factor in predicting endoscopic intervention. The explanation could be that it was a single static measurement that did not represent the severity of bleeding. Many patients might have baseline chronic anemia from other medical conditions. In the previous models predicting endoscopic intervention by Veisman et al. [30] and Ito et al. [35], hemoglobin was not correlated with an increased risk of endoscopic intervention either. The change in hemoglobin level from baseline or from the first test should relate more accurately to bleeding severity. However, this information was not available in our population.

Although numerous models were developed, the best model with great precision could not be achieved to predict the UGIB outcome. This could be explained by the hypothesis that UGIB is a complex disease with dynamic conditions, and the current risk scores are not dynamic [38]. The clinical severity could change from the first hospital visit to the endoscopy time due to the etiology of bleeding, patient comorbidity, anticoagulation, and resuscitation process. Pre-endoscopic medication, such as proton pump inhibitors, can downstage the endoscopic stigmata and reduce the need for endoscopic intervention [39]. A study comparing urgent endoscopy (<6 h) and early endoscopy (<24 h) in UGIB patients showed more high-risk stigmata lesions in the urgent group (66.4 vs. 47.8%). However, the mortality outcome was not different in both groups [40]. Therefore, static parameters at one time may not accurately predict the need for hemostatic intervention in UGIB. The intervention performed during index EGD may not represent the culprit lesion of that bleeding episode, and different physicians may also decide differently on the same lesion in each situation. For example, a cirrhotic patient with melena from a peptic ulcer, which at the time of endoscopy showed a clean base ulcer and a column of large esophageal varices with high-risk stigmata, may undergo rubber band ligation for primary prophylaxis as recommended in the clinical algorithm [41]. This could cause cirrhosis as the most crucial factor for model prediction. Furthermore, in real-life practice, patients who did not require hemostatic intervention still need endoscopic evaluation to diagnose the underlying etiology. Referral to an endoscopist is still mandatory but less urgent than a patient who needs hemostatic intervention.

If the model were applied to the primary medical center, the patients with the initial diagnosis of UGIB would be evaluated and stabilized regularly. After inputting all parameters, the model will predict the endoscopic intervention probability of each patient. The physician would use this result to prioritize the urgency of an endoscopic intervention consultant or referral. Patients with low risk for intervention could be admitted for medical management and consulted for endoscopy later as elective or OPD cases. The proposed algorithm is shown in Fig. 6. For example, in our UGIB population, with a prevalence of 0.44 and NPV of 0.75, patients with predicted results as no need for intervention would have a 75% probability that they did not truly require urgent endoscopy. However, the physician should not miss the 25% risk. Other in-hospital evaluations, such as vital sign monitoring, serial hemoglobin level, and proper elevation of hemoglobin after blood transfusion, are still crucial for referral judgment. This protocol may reduce unnecessary urgent endoscopic procedures and optimize resource allocation in resource-limited settings.

Fig. 6

An algorithm for model implementation in primary care unit with the patients presented with upper GI bleeding

The strength of our model is that it was built on a large database of more than 1000 patients with UGIB, including both variceal and nonvariceal bleeding. We generated 15 models and compared them to select the best-performing model for prediction. Our model provided accuracy superior to that of the GBS. However, our model has some limitations. First, this study included 10 years of data collection, and changes in practice over that decade may have affected physicians’ decisions. However, the resuscitation and endoscopic treatment recommendations have not changed significantly since 2010 [42]. There are updates on restrictive blood transfusion strategies and new options for rescue therapeutic intervention, such as the over-the-scope clip and hemostatic spray, but these do not affect the core concept of acute UGIB management. Second, about 30% of the patients were excluded because of missing data, mainly the onset of symptoms and the INR. The characteristics of the excluded patients were compared with those of the eligible patients and are shown in Supplementary Material Appendix 4. The onset of symptoms and INR were not significantly different between the two groups, but there were significant differences in some other parameters, which might lead to bias. Also, some groups of patients were not included in the model computation, for example, patients who did not undergo EGD because of low risk on GBS calculation, early death, or limitation of care. Moreover, medical history that was not documented was considered absent, such as unrecognized underlying disease and undisclosed use of over-the-counter NSAIDs or other medication. Third, certain critical factors that could affect the management decision were not included in the model, such as other antiplatelet agents in the P2Y12 antagonist class. They have been available in our center for the past few years. However, with high costs and reimbursement limitations, only a few patients received this medication. 
In our cohort, no patients were prescribed these medications. Other crucial factors are the time to endoscopy, pre-endoscopic medication, dynamic changes in bleeding characteristics or hemodynamic status after initial management, and the difference between the presenting and baseline hemoglobin levels, which helps assess the chronicity and severity of bleeding. In practice, these parameters are essential for the physician’s decision but require multiple data input steps and may not be practical in the primary care setting. Further studies that include multiple steps of data collection would help maximize the model’s performance. Fourth, the 95% confidence intervals (CIs) for the AUROC, sensitivity, and specificity of the model were not presented because they could not be derived from the results computed by PyCaret; the results were instead reported in a manner similar to that of a previous study [30]. Lastly, the management of UGIB patients relies on physician judgment. In our center, there were no hospital-specific protocols for physicians in the emergency department. Different physicians, including endoscopists, may act differently when encountering similar situations. The threshold for resuscitation can vary between cases; for example, dynamic changes in clinical and laboratory parameters could lead to further resuscitation. Even though endoscopists’ decisions on intervention are based on the current guidelines mentioned in the Methods, some interventions depend on the endoscopist’s judgment, such as prophylactic EV ligation in cirrhotic patients with nonvariceal bleeding. In real-life applications, regardless of the accuracy of the models, physicians must combine these prediction results with other dynamic factors to determine the most appropriate management for each patient.
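The fourth limitation concerns confidence intervals that were not available from the modeling library. One common way to obtain them independently is a percentile bootstrap over held-out predictions. The following is a minimal sketch of that general technique, not the analysis performed in this study: the function names `auroc` and `bootstrap_auroc_ci` are hypothetical, the data are synthetic, and it assumes that test-set labels (1 = intervention, 0 = none) and predicted probabilities can be exported from the fitted model.

```python
import random


def auroc(y_true, y_score):
    """AUROC as the probability that a random positive case outranks a random
    negative case (Mann-Whitney formulation); tied scores count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) from a percentile bootstrap."""
    point = auroc(y_true, y_score)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(labels, [y_score[i] for i in idx]))
    stats.sort()
    # Approximate percentile cut-offs of the bootstrap distribution.
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    # Synthetic stand-in for a held-out test set (NOT study data).
    rng = random.Random(42)
    y_true = [1] * 40 + [0] * 60
    y_score = ([rng.gauss(0.6, 0.2) for _ in range(40)]
               + [rng.gauss(0.4, 0.2) for _ in range(60)])
    point, lower, upper = bootstrap_auroc_ci(y_true, y_score, n_boot=500)
    print(f"AUROC {point:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The same resampling loop could, in principle, yield intervals for sensitivity and specificity by replacing `auroc` with the corresponding metric at a fixed probability threshold. The pairwise comparison is quadratic in sample size, so a rank-based AUROC would be preferable for large test sets.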

Conclusions

Predicting the need for endoscopic intervention in patients with acute UGIB is complex and dynamic. In response to this challenge, our machine learning model, which uses simple clinical parameters, performed fairly well in identifying UGIB patients who need endoscopic intervention. The practical implication of this study is that physicians in primary care units could prioritize patients who need referral to endoscopic centers. Further development and external validation, including the identification of more specific features, could improve prediction performance.

Data availability

The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

AUROC: Area under the receiver operating characteristic curve

BP: Blood pressure

BUN: Blood urea nitrogen

COX-2: Cyclooxygenase-2

Cr: Creatinine

EGD: Esophagogastroduodenoscopy

GBS: Glasgow-Blatchford score

INR: International normalized ratio

NSAIDs: Non-steroidal anti-inflammatory drugs

RS: Rockall score

UGIB: Upper gastrointestinal bleeding

References

  1. Antunes C, Copelin IE. Upper gastrointestinal bleeding. Petersburg: StatPearls; 2023.

  2. Charatcharoenwitthaya P, Pausawasdi N, Laosanguaneak N, Bubthamala J, Tanwandee T, Leelakusolvong S. Characteristics and outcomes of acute upper gastrointestinal bleeding after therapeutic endoscopy in the elderly. World J Gastroenterol. 2011;17(32):3724–32. https://doiorg.publicaciones.saludcastillayleon.es/10.3748/wjg.v17.i32.3724.

  3. Kamboj AK, Hoversten P, Leggett CL. Upper gastrointestinal bleeding: etiologies and management. Mayo Clin Proc. 2019;94(4):697–703. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.mayocp.2019.01.022.

  4. Alali AA, Barkun AN. An update on the management of non-variceal upper gastrointestinal bleeding. Gastroenterol Rep (Oxf). 2023;11:goad011. https://doiorg.publicaciones.saludcastillayleon.es/10.1093/gastro/goad011.

  5. Laine L, Barkun AN, Saltzman JR, Martel M, Leontiadis GI. ACG clinical guideline: upper gastrointestinal and ulcer bleeding. Am J Gastroenterol. 2021;116(5):899–917. https://doiorg.publicaciones.saludcastillayleon.es/10.14309/ajg.0000000000001245.

  6. Gralnek IM, Stanley AJ, Morris AJ, et al. Endoscopic diagnosis and management of nonvariceal upper gastrointestinal hemorrhage (NVUGIH): European Society of Gastrointestinal Endoscopy (ESGE) guideline – update 2021. Endoscopy. 2021;53(3):300–32. https://doiorg.publicaciones.saludcastillayleon.es/10.1055/a-1369-5274.

  7. Siau K, Hearnshaw S, Stanley AJ, et al. British Society of Gastroenterology (BSG)-led multisociety consensus care bundle for the early clinical management of acute upper gastrointestinal bleeding. Frontline Gastroenterol. 2020;11(4):311–23. https://doiorg.publicaciones.saludcastillayleon.es/10.1136/flgastro-2019-101395.

  8. Rockall TA, Logan RF, Devlin HB, Northfield TC. Risk assessment after acute upper gastrointestinal haemorrhage. Gut. 1996;38(3):316–21. https://doiorg.publicaciones.saludcastillayleon.es/10.1136/gut.38.3.316.

  9. Blatchford O, Murray WR, Blatchford M. A risk score to predict need for treatment for upper-gastrointestinal haemorrhage. Lancet. 2000;356(9238):1318–21. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/S0140-6736(00)02816-6.

  10. Saltzman JR, Tabak YP, Hyett BH, Sun X, Travis AC, Johannes RS. A simple risk score accurately predicts in-hospital mortality, length of stay, and cost in acute upper GI bleeding. Gastrointest Endosc. 2011;74(6):1215–24. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.gie.2011.06.024.

  11. Stanley AJ, Laine L, Dalton HR, et al. Comparison of risk scoring systems for patients presenting with upper gastrointestinal bleeding: international multicentre prospective study. BMJ. 2017;356:i6432.

  12. Wang R, Wang Q. Comparison of risk scoring systems for upper gastrointestinal bleeding in patients after renal transplantation: a retrospective observational study in Hunan, China. BMC Gastroenterol. 2022;22(1):353. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12876-022-02426-3.

  13. Renukaprasad AK, Narayanaswamy S, Vinay R. A comparative analysis of risk scoring systems in predicting clinical outcomes in upper gastrointestinal bleed. Cureus. 2022;14(7):e26669. https://doiorg.publicaciones.saludcastillayleon.es/10.7759/cureus.26669.

  14. Ramaekers R, Mukarram M, Smith CA, Thiruganasambandamoorthy V. The predictive value of preendoscopic risk scores to predict adverse outcomes in emergency department patients with upper gastrointestinal bleeding: a systematic review. Acad Emerg Med. 2016;23(11):1218–27. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/acem.13101.

  15. Chandnani S, Rathi P, Sonthalia N, et al. Comparison of risk scores in upper gastrointestinal bleeding in western India: a prospective analysis. Indian J Gastroenterol. 2019;38(2):117–27. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s12664-019-00951-w.

  16. Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Future Healthc J. 2019;6(2):94–98. https://doiorg.publicaciones.saludcastillayleon.es/10.7861/futurehosp.6-2-94.

  17. Yen HH, Wu PY, Chen MF, Lin WC, Tsai CL, Lin KP. Current status and future perspective of artificial intelligence in the management of peptic ulcer bleeding: a review of recent literature. J Clin Med. 2021;10(16):3527. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/jcm10163527.

  18. Alowais SA, Alghamdi SS, Alsuhebany N, et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ. 2023;23(1):689. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12909-023-04698-z.

  19. Shung DL. Advancing care for acute gastrointestinal bleeding using artificial intelligence. J Gastroenterol Hepatol. 2021;36(2):273–78. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/jgh.15372.

  20. Yen HH, et al. Performance comparison of the deep learning and the human endoscopist for bleeding peptic ulcer disease. J Med Biol Eng. 2021;4:504–13.

  21. Klang E, Barash Y, Levartovsky A, Barkin Lederer N, Lahat A. Differentiation between malignant and benign endoscopic images of gastric ulcers using deep learning. Clin Exp Gastroenterol. 2021;14:155–62. https://doiorg.publicaciones.saludcastillayleon.es/10.2147/CEG.S292857.

  22. Deshmukh F, Merchant SS. Explainable machine learning model for predicting GI bleed mortality in the intensive care unit. Am J Gastroenterol. 2020;115(10):1657–68. https://doiorg.publicaciones.saludcastillayleon.es/10.14309/ajg.0000000000000632.

  23. Rotondano G, Cipolletta L, Grossi E, et al. Artificial neural networks accurately predict mortality in patients with nonvariceal upper GI bleeding. Gastrointest Endosc. 2011;73(2):218–26. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.gie.2010.10.006.

  24. Shung D, Simonov M, Gentry M, Au B, Laine L. Machine learning to predict outcomes in patients with acute gastrointestinal bleeding: a systematic review. Dig Dis Sci. 2019;64(8):2078–87. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s10620-019-05645-z.

  25. Hou Y, Yu H, Zhang Q, et al. Machine learning-based model for predicting the esophagogastric variceal bleeding risk in liver cirrhosis patients. Diagn Pathol. 2023;18(1):29. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13000-023-01293-0.

  26. Herrin J, Abraham NS, Yao X, et al. Comparative effectiveness of machine learning approaches for predicting gastrointestinal bleeds in patients receiving antithrombotic treatment. JAMA Network Open. 2021;4(5):e2110703. https://doiorg.publicaciones.saludcastillayleon.es/10.1001/jamanetworkopen.2021.10703.

  27. Seo DW, Yi H, Park B, et al. Prediction of adverse events in stable non-variceal gastrointestinal bleeding using machine learning. J Clin Med. 2020;9(8):2603. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/jcm9082603.

  28. Levi R, Carli F, Arevalo AR, et al. Artificial intelligence-based prediction of transfusion in the intensive care unit in patients with gastrointestinal bleeding. BMJ Health Care Inform. 2021;28(1):e100245. https://doiorg.publicaciones.saludcastillayleon.es/10.1136/bmjhci-2020-100245.

  29. Shung DL, Au B, Taylor RA, et al. Validation of a machine learning model that outperforms clinical risk scoring systems for upper gastrointestinal bleeding. Gastroenterology. 2020;158(1):160–67. https://doiorg.publicaciones.saludcastillayleon.es/10.1053/j.gastro.2019.09.009.

  30. Veisman I, Oppenheim A, Maman R, et al. A novel prediction tool for endoscopic intervention in patients with acute upper gastro-intestinal bleeding. J Clin Med. 2022;11(19):5893. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/jcm11195893.

  31. Lee CH, Yoon H, Choi YJ, et al. Predictive factors of therapeutic intervention in on-call endoscopy for suspected gastrointestinal bleeding. Scand J Gastroenterol. 2018;53(8):958–63. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/00365521.2018.1493533.

  32. Kim SS, Kim KU, Kim SJ, et al. Predictors for the need for endoscopic therapy in patients with presumed acute upper gastrointestinal bleeding. Korean J Intern Med. 2019;34(2):288–95. https://doiorg.publicaciones.saludcastillayleon.es/10.3904/kjim.2016.406.

  33. Acehan F, Karsavuranoglu B, Kalkan C, Aslan M, Altiparmak E, Ates I. Three simple parameters on admission to the emergency department are predictors for endoscopic intervention in patients with suspected nonvariceal upper gastrointestinal bleeding. J Emerg Med. 2024;66(2):64–73. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jemermed.2023.08.016.

  34. Redondo-Cerezo E, Vadillo-Calles F, Stanley AJ, et al. MAP(ASH): a new scoring system for the prediction of intervention and mortality in upper gastrointestinal bleeding. J Gastroenterol Hepatol. 2020;35(1):82–89. https://doiorg.publicaciones.saludcastillayleon.es/10.1111/jgh.14811.

  35. Ito N, Funasaka K, Furukawa K, et al. A novel scoring system to predict therapeutic intervention for non-variceal upper gastrointestinal bleeding. Intern Emerg Med. 2022;17:423–30.

  36. Sasaki Y, Abe T, Kawamura N, et al. Prediction of the need for emergency endoscopic treatment for upper gastrointestinal bleeding and new score model: a retrospective study. BMC Gastroenterol. 2022;22(1):337. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12876-022-02413-8.

  37. Marks I, Janmohamed IK, Malas S, et al. Derivation and validation of a novel risk score to predict need for haemostatic intervention in acute upper gastrointestinal bleeding (London Haemostat Score). BMJ Open Gastroenterol. 2023;10(1):e001008. https://doiorg.publicaciones.saludcastillayleon.es/10.1136/bmjgast-2022-001008.

  38. Orpen-Palmer J, Stanley AJ. A review of risk scores within upper gastrointestinal bleeding. J Clin Med. 2023;12(11):3678. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/jcm12113678.

  39. Sreedharan A, Martin J, Leontiadis GI, et al. Proton pump inhibitor treatment initiated prior to endoscopic diagnosis in upper gastrointestinal bleeding. Cochrane Database Syst Rev. 2010;2010(7):CD005415. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/14651858.CD005415.pub3.

  40. Lau JYW, Yu Y, Tang RSY, et al. Timing of endoscopy for acute upper gastrointestinal bleeding. N Engl J Med. 2020;382(14):1299–308. https://doiorg.publicaciones.saludcastillayleon.es/10.1056/NEJMoa1912484.

  41. Pfisterer N, Unger LW, Reiberger T. Clinical algorithms for the prevention of variceal bleeding and rebleeding in patients with liver cirrhosis. World J Hepatol. 2021;13(7):731–46. https://doiorg.publicaciones.saludcastillayleon.es/10.4254/wjh.v13.i7.731.

  42. Barkun AN, Bardou M, Kuipers EJ, et al. International consensus recommendations on the management of patients with nonvariceal upper gastrointestinal bleeding. Ann Intern Med. 2010;152(2):101–13. https://doiorg.publicaciones.saludcastillayleon.es/10.7326/0003-4819-152-2-201001190-00009.

Acknowledgments

We thank James Mark Simmerman, Ph.D., Siriraj Hospital Faculty of Medicine at Mahidol University, Bangkok, Thailand, for his assistance in correcting the language for this paper.

Funding

Open access funding provided by Mahidol University

Author information

Contributions

Conceptualization: Uayporn Kaosombatwattana; Data curation: UK, Onuma Sattayalertyanyong; Formal analysis: OS, Kajornvit Raghareutai; Investigation: OS; Methodology: KR, Watcharaporn Tanchotsrinon; Project administration: UK; Resource: OS; Software: WT; Supervision: UK; Validation: KR, WT; Visualization: KR, WT; Writing—original draft: KR; Writing—review & editing: UK, WT

Corresponding author

Correspondence to Uayporn Kaosombatwattana.

Ethics declarations

Ethics approval and consent to participate

This study followed the ethical guidelines of the Declaration of Helsinki. Because this was a retrospective analysis, the requirement for informed consent to participate was waived. The study was approved by the Siriraj Institutional Review Board, Siriraj Hospital, Mahidol University (COA No. Si 1028/2021).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Raghareutai, K., Tanchotsrinon, W., Sattayalertyanyong, O. et al. Development and validation of a machine learning model to predict hemostatic intervention in patients with acute upper gastrointestinal bleeding. BMC Med Inform Decis Mak 25, 145 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-025-02969-x

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-025-02969-x

Keywords