
Presenting a prediction model for HELLP syndrome through data mining

Abstract

Background

The HELLP syndrome represents three complications: hemolysis, elevated liver enzymes, and low platelet count. Since the causes and pathogenesis of HELLP syndrome are not yet fully understood, distinguishing it from other pregnancy-related disorders is complicated. Furthermore, late diagnosis delays treatment, which challenges disease management. The present study aimed to present a machine learning (ML) approach for diagnosing HELLP syndrome based on non-invasive parameters.

Method

This cross-sectional study was conducted on 384 patients in Tajrish Hospital, Tehran, Iran, during 2010–2021 in four stages. In the first stage, data elements were identified using a literature review and Delphi method. Then, patient records were gathered, and in the third stage, the dataset was preprocessed and prepared for modeling. Finally, ML models were implemented, and their evaluation metrics were compared.

Results

A total of 21 variables were included in this study after the first stage. Among all the ML algorithms, multi-layer perceptron and deep learning performed the best, with an F1 score of more than 99%. In all three evaluation scenarios (holdout, 5-fold, and 10-fold cross-validation), the K-nearest neighbors (KNN), random forest (RF), AdaBoost, XGBoost, and logistic regression (LR) models had an F1 score of over 0.95, while this value was around 0.90 for the support vector machine (SVM) and below 0.90 for the decision tree (DT). According to the modeling output, variables such as platelet count, gestational age, and alanine aminotransferase (ALT) were the most important in diagnosing HELLP syndrome.

Conclusion

The present work indicated that ML algorithms can be used successfully in the development of HELLP syndrome diagnosis models. All algorithms except DT achieved an F1 score above 0.90. In addition, this study demonstrated that, among all features, biomarker features have the most significant impact on the diagnosis of HELLP syndrome.


Introduction

The HELLP syndrome, first reported in 1982, is a rare and sudden complication occurring in women during pregnancy or after childbirth [1,2,3]. This syndrome represents three complications of hemolysis, increased liver enzymes, and low platelet count, affecting 0.2–0.6% of pregnancies worldwide [1, 4]. Since the causes and pathogenesis of HELLP syndrome are not yet fully known and well understood, distinguishing it from other pregnancy-related disorders is complicated [1, 5, 6]. Furthermore, late diagnosis leads to a delay in treatment, which challenges disease management [5, 6]. In addition to its side effects, including intravascular coagulation, placental abruption, pulmonary edema, and retinal detachment, this syndrome causes perinatal complications like death [2, 4]. Consequently, timely diagnosis of HELLP syndrome is vital and life-saving to prevent complications in the mother and fetus [7].

The related studies usually included a limited number of HELLP patients and did not provide comprehensive information regarding the prognostic factors for adverse outcomes in patients [2]. While some scholars have described HELLP syndrome as an advanced type of pregnancy poisoning (preeclampsia), others considered it of a different nature [6, 8]. Given such discrepancy in the findings, HELLP syndrome needs to be recognized as a distinct entity from other related complications during pregnancy [9]. Although some algorithms use the mother’s biochemical and clinical changes in early pregnancy to predict pregnancy poisoning, no accurate algorithm has been found to predict HELLP syndrome. Regarding the controversial relationship between this syndrome and preeclampsia, HELLP syndrome is a hypertensive pregnancy disorder with a more severe inflammatory reaction than preeclampsia [8]. HELLP syndrome can differ from preeclampsia because 15–20% of patients with HELLP do not have antecedent hypertension or proteinuria [10].

The occurrence prediction and early diagnosis of pregnancy-related diseases enable physicians to take preventive measures and adopt more effective, risk-based pregnancy care pathways [11, 12]. Clinical decisions are often based on the intuition and experience of practitioners rather than on the rich information found in scientific databases. This practice leads to unwanted bias, errors, and excessive medical costs, which affect the quality of services provided to patients [13]. Patient records are among the main sources for conducting medical research [9]. Given the exponential growth of medical databases and resources, extracting knowledge from all available data using traditional processing and analysis methods is time-consuming. Informatics tools, such as data mining, play a significant role in analyzing these data for diagnosis, prognosis, and treatment purposes. Data mining uses various techniques to extract valuable information or knowledge from data, and these techniques can be applied to data from all fields of science, including medicine [14,15,16]. It consists of identifying and analyzing hidden information to discover useful knowledge. In this vein, discovering hidden patterns through data mining has significantly improved our understanding of disease diagnosis, progression, management, quality of care, and clinical decision-making by medical professionals (as the main factor of success in the healthcare process) [16,17,18,19]. Insights obtained from data mining indicate that maintaining a high level of care can affect cost, income, and operational efficiency [1]. The purpose of developing a model in data mining projects is to discover knowledge and achieve results that are practical in the future [18].

Since the causes of HELLP syndrome are not well understood, the present study employed a data mining process to discover the required knowledge in preventing and diagnosing this syndrome in time. Five data-mining algorithms were used to investigate the retrospective data (including the demographic, clinical, and molecular factors affecting the diagnosis of HELLP syndrome) collected from a population of mothers within 25–37 weeks of pregnancy who showed evidence of hemolysis, low platelets, and abnormal liver tests. In addition, the present study aimed to compare and discover the patterns effective in the development of this syndrome compared to pregnancy poisoning. Given the difficulty of the diagnosis of HELLP syndrome, using ML for prediction can assist in easier detection of this condition and increase the accuracy of diagnoses.

Background

Moreira et al. proposed a model using artificial neural networks (ANNs) and fuzzy logic to predict HELLP syndrome in high-risk pregnancies. In the model, the learning capacity of ANNs was combined with the reasoning ability of fuzzy systems. The model was designed with mobile cloud computing in mind, avoiding diffuse inference, which requires considerable computational effort. This study reported an F1 score of 70.5% [1].

In another study, Melinte-Popescu et al. (2023) predicted the severity of HELLP syndrome using machine learning (ML) algorithms. They evaluated and compared the predictive performances of four ML-based models (decision tree [DT], random forest [RF], K-nearest neighbor [KNN], and Naïve Bayes [NB]) to predict HELLP syndrome and its subtypes according to the Mississippi classification. In this study, all clinical and paraclinical features were used, including mother’s age, number of pregnancies, being a smoker, mother’s history of unsuccessful pregnancy, constant blood pressure and chronic kidney diseases, edema, gender, infant death, mother’s death, headache, nausea, epigastric pain, platelet, aspartate aminotransferase, and lactate dehydrogenase. The results indicated that HELLP syndrome was best predicted by DT (F1 Score = 94%) and KNN (F1 Score = 94%) [20].

Moreover, Melinte-Popescu et al. published a paper in 2023 entitled “Predictive Performance of Machine Learning-Based Methods For the Prediction of Preeclampsia—A Prospective Study.” This study aimed to evaluate and compare the predictive performances of ML-based models for the prediction of preeclampsia and its subtypes, such as HELLP syndrome. This prospective case-control study evaluated pregnancies in women who attended a tertiary maternity hospital in Romania between November 2019 and September 2022. The patients’ clinical and paraclinical characteristics were evaluated in the first trimester and were included in four ML-based models: DT, NB, support vector machine (SVM), and RF, and their predictive performance was assessed. Early-onset preeclampsia (EO-PE) was best predicted by DT (accuracy: 94.1%) and SVM (accuracy: 91.2%) models, while NB (accuracy: 98.6%) and RF (accuracy: 92.8%) models had the highest performance when used to predict all types of preeclampsia. The predictive performance of these models was modest for moderate and severe types of preeclampsia, with accuracies ranging from 70.6 to 82.4% [4]. The ML-based models could be useful tools for EO-PE prediction and could differentiate patients who will develop preeclampsia as early as the first trimester of pregnancy [21].

Furthermore, Villalaín et al. published a paper to predict the delivery within seven days after diagnosis of EO-PE using ML models. They aimed to develop a prediction model using ML tools for the need for delivery within seven days of diagnosis (model D) and the risk of developing HELLP syndrome or abruptio placentae. Maternal basal characteristics and data gathered during EOPE diagnosis: gestational age, blood pressure, platelets, creatinine, transaminases, angiogenesis biomarkers (soluble fms-like tyrosine kinase-1, placental growth factor), and ultrasound data were pooled for analysis. The most relevant variables were selected by bio-inspired algorithms. They developed basal models that solely included demographic characteristics of the patient (D1, HA1) and advanced models, adding information available upon diagnosis of EOPE (D2, HA2). First, they developed a predictive model of the need for delivery within seven days of diagnosis (model D), considering this as the window of the effect of antenatal corticosteroids on fetal maturation. Second, they developed a model to calculate the risk of developing HELLP syndrome or abruptio placentae at any point after EOPE diagnosis (model HA), as these are the most acute and harder-to-predict complications. In their case, they tried SVM, KNN algorithm, Gaussian Naïve Bayes (GNB), and DT models and selected them relying on the F1 score metric. At the time of diagnosis of EOPE, SVM with evolutionary feature selection process provided good predictive information of the need for delivery within seven days and development of HELLP/abruptio placentae, using maternal characteristics and markers that can be obtained routinely [22].

In 2022, Zheng et al. conducted a retrospective study to compare ML and logistic regression (LR) as predictive models for adverse maternal and neonatal outcomes of preeclampsia. The objective of this study was to evaluate the performance of ML and LR in developing short-term predictive models for binary maternal or neonatal outcomes involving preeclampsia, such as HELLP syndrome. The models were generated by common clinical indicators. They employed LR and six ML methods as binary predictive models for a dataset containing 733 women diagnosed with preeclampsia. The participants were grouped by four different pregnancy outcomes. After the imputation of missing values, statistical description and comparison were conducted preliminarily to explore the characteristics of the documented 73 variables. Sequentially, correlation analysis and feature selection were performed as preprocessing steps to filter contributing variables for developing models. In addition, the models were evaluated by multiple criteria. The RF classifier, multi-layer perceptron (MLP), and SVM demonstrated better discriminative power for prediction by comparing the area under the receiver operating characteristic (ROC) curve. However, the DT classifier, RF, and LR yielded better calibration ability, as verified by the calibration curve [23].

Chen et al. published a paper to predict adverse outcomes in de novo hypertensive disorders of pregnancy (HDP). A multitude of ML statistical methods were employed to develop two prediction models, one for maternal complications, including HELLP syndrome, and the other for perinatal deterioration. The maternal model using the RF algorithm produced an area under the curve (AUC) of 0.984 (95% CI [0.978, 0.991]). The best predictor variables selected by the model were platelet count, fetal head/abdominal circumference ratio, and gestational age at the diagnosis of de novo HDP. The perinatal model using the boosted tree algorithm yielded an AUC of 0.925 (95% CI [0.907, 0.945]). The most robust predictor variables selected were gestational age at the diagnosis of de novo HDP, fetal femur length, and fetal head/abdominal circumference ratio. These prediction models can help identify de novo HDP patients at increased risk of complications who might need intense maternal or perinatal care [24].

Moreover, in 2022, Huang et al. conducted a study that predicted preeclampsia complicated by fetal growth restriction (FGR) and its perinatal outcome based on an ANN model. In this study, the authors adopted an ANN to assess the effect and predictive value of changes in maternal peripheral blood parameters and clinical indicators on pregnancy outcomes, such as HELLP syndrome, in patients with preeclampsia complicated by FGR. A total of 15 factors were correlated with preeclampsia complicated by FGR: maternal age, pre-pregnancy body mass index, inflammatory markers (neutrophil-to-lymphocyte ratio and platelet-to-lymphocyte ratio), coagulation parameters (prothrombin time and thrombin time), lipid parameters (high-density lipoprotein, low-density lipoprotein, and triglyceride counts), platelet parameters (mean platelet volume and platelet crit), uric acid, lactate dehydrogenase, and total bile acids. A total of six ANNs were constructed with the adoption of these parameters. The accuracy, sensitivity, and specificity of predicting the occurrence of the following diseases and adverse outcomes were respectively as follows: 84.3%, 97.7%, and 78% for preeclampsia complicated by FGR; 76.3%, 97.3%, and 68% for provider-initiated preterm births; 81.9%, 97.2%, and 51% for predicting the severity of FGR; 80.3%, 92.9%, and 79% for premature rupture of membranes; 80.1%, 92.3%, and 79% for postpartum hemorrhage; and 77.6%, 92.3%, and 76% for fetal distress [25].

Ejiwale et al. published a paper in 2021 entitled “Prediction of Concurrent Hypertensive Disorders in Pregnancy and Gestational Diabetes Mellitus Using ML Techniques.” This retrospective study sought to investigate, construct, evaluate, compare, and isolate a supervised ML predictive model for the binary classification of co-occurring HDP and gestational diabetes mellitus (GDM) in a cohort of otherwise healthy pregnant women. A total of 33 models were constructed with the following six supervised ML algorithms: LR, RF, DT, SVM, Stacking Classifier (SC), and Keras Classifier (a deep learning [DL] classification algorithm). All the algorithms were evaluated using the Stratified K-fold cross-validation (k = 10) method. The findings of this study indicated that the use of readily available routine prenatal attributes and appropriate ML methods can reliably predict the co-existence of HDP and GDM [26].

In another investigation in 2020, Marik et al. aimed to predict pregnancy toxicity through ML. This study sought to employ all clinical and laboratory data available during prenatal visits in early pregnancy, including HELLP syndrome, and use them to develop a predictive model for pregnancy poisoning. A total of 16,370 records were used in this study, and 67 variables that were examined in different models included the mother’s characteristics, medical history, usual laboratory results before birth, and drug use. In this research, a set of significant features for prediction was identified, and the use of usual information on the risk of pregnancy poisoning was predicted. In this work, two algorithms, gradient boosting and elastic net, were employed, and the highest performance was an AUC of 89% [27].

Furthermore, in 2019, Moreira et al. published a paper entitled “Data Analytics in Mobile Health Environments For High-risk Pregnancy Outcome Prediction.” They proposed the development, performance evaluation, and comparison of ML algorithms based on Bayesian networks capable of identifying at-risk pregnancies based on the symptoms and risk factors presented by the patients. The performance comparison of several Bayes-based ML algorithms determined the best-suited algorithm for predicting, identifying, and accompanying HDP. The contribution of this study focused on finding a smart classifier for the development of novel mobile devices, which presented reliable results in the identification of problems related to pregnancy. Through the well-known cross-validation method, this proposal was evaluated and compared with other recent approaches. The averaged one-dependence estimators presented better results on average than the other approaches. These findings are key to improving the health monitoring of women suffering from high-risk pregnancies around the world. Therefore, this study can contribute to a reduction in both maternal and fetal deaths [28].

Table 1 presents a summary of the reviewed literature.

Table 1 Summary of the reviewed literature

Materials and methods

The present research is a descriptive cross-sectional study conducted in four stages. In the first stage, data elements were identified. Then, patient records were gathered, and in the third stage, the dataset was preprocessed and prepared for modeling. Finally, ML models were implemented and evaluated.

Data element identification

Given the nature of this study, the first step was to identify the effective data elements in the diagnosis of HELLP syndrome, based on which data collection should be conducted. In order to identify the data elements, first, a literature review was conducted. According to the minimum clinical datasets, this literature review ensures that the set of data elements is considered for inclusion in the comprehensive set of elements [29]. The literature review was conducted using electronic databases, including the Scientific Information Database (SID, in Persian), PubMed, Scopus, Web of Science, Medline, and the Google Scholar search engine, to identify appropriate related resources. All the full texts were assessed, and data elements were extracted after excluding the irrelevant papers. Then, an interview was conducted with a group of experts in the field of obstetrics. These interviews were face-to-face, and each interview was conducted for one hour at a maximum. After five interviews and a thematic analysis of the data obtained using MAXQDA (version 12) software, data elements were extracted.

Data collection

In the second stage, patients’ records were collected at Shohadaye Tajrish Hospital in Tehran, Iran. These data were collected from patients between 2010 and 2021. In this study, the data of 384 pregnant mothers were analyzed, including 376 with pre-eclampsia (129 with severe pre-eclampsia, 175 with moderate pre-eclampsia, and 72 with mild pre-eclampsia), 2 with eclampsia, and 6 with HELLP syndrome. It is worth mentioning that the data were collected without identifying variables, such as name, surname, and national ID number.

Data preparation

In the third stage, the developed dataset was prepared for modeling. The dataset did not contain missing data; however, because it was imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was employed to balance it. The validity of data and labels was rechecked by an expert, and any outlier was replaced with the correct value from the patient’s medical record. Considering that the main objective of the present study was the diagnosis of HELLP syndrome and determining the factors affecting the diagnosis of this disease, patients with HELLP syndrome were labeled as 1 and other cases were labeled as 0.
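
The core idea of SMOTE can be sketched in a few lines: each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The following is a minimal illustration only, not the implementation used in this study; the function name `smote_sketch`, the two features, and all numeric values are hypothetical.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class rows: [platelet count, gestational age in weeks]
minority = [[85.0, 31.0], [60.0, 29.0], [95.0, 33.0],
            [70.0, 30.0], [50.0, 28.0], [90.0, 34.0]]
new_points = smote_sketch(minority, n_new=10)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range already covered by that class.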

Modeling and evaluation

In the final stage, multiple algorithms were studied and compared based on their initial results on the data elements. Ultimately, the algorithms with acceptable initial results were chosen for use in this study. These nine ML algorithms included network-based algorithms (MLP and DL), ensemble algorithms (RF, XGBoost, and AdaBoost), and classic algorithms (DT, SVM, LR, and KNN). The ML models were implemented using the Python programming language. For validation, the holdout method was used by dividing the dataset into a training set (70%) and a test set (30%), followed by the k-fold cross-validation method with k = 5 and k = 10. The evaluation indices calculated for each implemented model were accuracy, precision, sensitivity, F1 score, and AUC. In order to select the best algorithm, the implemented models were compared based on the F1 score. Finally, according to the best model obtained, the importance of the variables in the model as effective features in the diagnosis of HELLP syndrome was reported.
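
For reference, the threshold-based indices listed above can be computed directly from the four cells of a confusion matrix. The sketch below uses standard definitions; the function name and the example counts are hypothetical and are not results of this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall), and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and sensitivity
    if precision + sensitivity:
        f1 = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "f1": f1}

# Hypothetical counts, chosen only to illustrate the arithmetic
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```

The F1 score ignores true negatives, which is one reason it is often preferred over accuracy when the positive class is rare.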

Multi-layer Perceptron (MLP)

The MLP algorithm is a type of feedforward ANN that consists of multiple layers of nodes, each connected to nodes in the adjacent layers. It is a supervised learning algorithm that uses a backpropagation algorithm to update the weights of the connections between nodes to minimize the error between the predicted output and the actual output. The MLP is a versatile algorithm that can be applied for various tasks, including classification, regression, and pattern recognition. It is known for its ability to learn complex patterns in data and is commonly used in applications such as image and speech recognition.
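
The weight update described above can be illustrated with a single backpropagation step on a very small network (two inputs, two hidden nodes, one output). This is a sketch under stated assumptions: sigmoid activations, a squared-error loss, and arbitrary starting weights, none of which are taken from the model configuration used in this study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, w2, b2):
    """Forward pass: input -> hidden layer -> single output node."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output

def backprop_step(x, target, W1, b1, w2, b2, lr=0.5):
    """One gradient-descent step on the loss 0.5 * (output - target) ** 2."""
    hidden, output = forward(x, W1, b1, w2, b2)
    # Error signal at the output node (chain rule through the sigmoid)
    delta_out = (output - target) * output * (1.0 - output)
    # Error signal propagated back to each hidden node
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w2, hidden)]
    new_w2 = [w - lr * delta_out * h for w, h in zip(w2, hidden)]
    new_b2 = b2 - lr * delta_out
    new_W1 = [[w - lr * d * xi for w, xi in zip(row, x)]
              for row, d in zip(W1, delta_hidden)]
    new_b1 = [b - lr * d for b, d in zip(b1, delta_hidden)]
    return new_W1, new_b1, new_w2, new_b2

# Arbitrary illustrative input, target, and starting weights
x, target = [0.5, -1.0], 1.0
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.0]
w2, b2 = [0.2, -0.1], 0.0
_, before = forward(x, W1, b1, w2, b2)
W1, b1, w2, b2 = backprop_step(x, target, W1, b1, w2, b2)
_, after = forward(x, W1, b1, w2, b2)
```

After the step, the output moves closer to the target; training repeats this update over many samples and iterations.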

Deep learning (DL)

The DL is a subset of ML that involves ANNs with multiple layers (hence the term “deep”). These networks are capable of learning complex patterns and relationships in data by automatically extracting and transforming features at different levels of abstraction. The DL algorithms have been successfully applied to various tasks, such as image and speech recognition, natural language processing, and autonomous driving. Some popular DL architectures include convolutional neural networks (CNNs) for image recognition, recurrent neural networks (RNNs) for sequence prediction, and generative adversarial networks (GANs) for generating realistic images.

Random Forest (RF)

The RF is an ensemble learning algorithm that combines multiple DTs to create a more robust and accurate model. Each DT in the RF is built using a subset of the training data and a random selection of features, which helps to reduce overfitting and improve generalization. The final prediction is made by aggregating the predictions of all the individual trees, either through a majority voting mechanism for classification tasks or averaging for regression tasks. The RF is known for its high accuracy, scalability, and ability to handle large datasets with high dimensionality. It is also resistant to overfitting and noise in the data, making it a popular selection for various ML tasks.
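
The two ingredients named above, bootstrap sampling with random feature selection and majority voting, can be sketched with one-level trees (stumps) standing in for full DTs. This simplification, the function names, and the toy rows are assumptions made for illustration; a real RF grows deeper trees and samples features at every split.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    values = sorted({r[feature] for r in rows})
    # Candidate thresholds: midpoints between consecutive distinct values
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best = None
    for threshold in thresholds:
        for below in (0, 1):  # class predicted when value <= threshold
            errors = sum(
                (below if r[feature] <= threshold else 1 - below) != y
                for r, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, below)
    return best[1:]

def predict_stump(stump, row):
    feature, threshold, below = stump
    return below if row[feature] <= threshold else 1 - below

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))           # random feature
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical rows: [platelet count, ALT]; label 1 marks the positive class
rows = [[60, 210], [75, 180], [90, 150], [240, 25], [210, 30], [260, 20]]
labels = [1, 1, 1, 0, 0, 0]
forest = fit_forest(rows, labels)
```

An individual stump trained on an unlucky bootstrap sample may be wrong, but the vote across many stumps is much more stable, which is the motivation for the ensemble.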

XGBoost

The XGBoost is a powerful and efficient ML algorithm known for its speed and performance in handling large datasets. It belongs to the ensemble learning method of boosting, where multiple weak learners are combined to create a strong learner. The XGBoost utilizes a gradient boosting framework, which focuses on minimizing the loss function by adding new models that complement the shortcomings of existing models. It is highly customizable, allowing users to tune parameters, such as learning rate, maximum depth of trees, and the number of boosting rounds to optimize performance. The XGBoost is frequently employed in various ML competitions and has been proven to achieve state-of-the-art results in various applications.
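
The boosting principle described above, where each new model is fitted to the shortcomings of the current ensemble, can be sketched with plain gradient boosting on a squared-error loss. This is not the XGBoost algorithm itself, which additionally uses regularization and second-order gradient information; the stump learner, function names, and toy data are assumptions for illustration.

```python
def fit_regression_stump(xs, residuals):
    """One-split regression tree: best threshold and the mean residual on each side."""
    values = sorted(set(xs))
    best = None
    for a, b in zip(values, values[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    base = sum(ys) / len(ys)          # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_regression_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Toy one-feature data, chosen only to show the error shrinking
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
base, stumps, preds = boost(xs, ys)
```

The `learning_rate` and `n_rounds` arguments correspond to the learning rate and number of boosting rounds mentioned above as tunable parameters.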

AdaBoost

The AdaBoost is a popular ensemble learning algorithm that combines multiple weak classifiers to create a strong classifier. The algorithm works by sequentially training a series of weak learners on the training data, with each subsequent learner focusing on the instances that were misclassified by the previous learners. The predictions of each weak learner are then combined through a weighted sum to make the final prediction. The AdaBoost is particularly effective in dealing with complex classification problems and has been successfully applied in a wide range of domains, including computer vision, speech recognition, and bioinformatics.
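
The reweighting step that makes each subsequent learner focus on misclassified instances can be written out for one round. The sketch follows the classic discrete AdaBoost update and assumes the weak learner's weighted error lies strictly between 0 and 1; the function name and the five-sample example are hypothetical.

```python
import math

def adaboost_round(weights, correct):
    """One AdaBoost round: return the learner weight alpha and updated sample weights."""
    # Weighted error of the weak learner (assumed to satisfy 0 < error < 1)
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Shrink weights of correctly classified samples, grow the misclassified ones
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

weights = [0.2] * 5                          # uniform starting weights
correct = [True, True, True, True, False]    # the weak learner misses one sample
alpha, new_weights = adaboost_round(weights, correct)
```

After normalization, the misclassified sample carries half of the total weight, so the next weak learner cannot ignore it. The value `alpha` is the weight this learner receives in the final weighted sum.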

Decision tree (DT)

The DT algorithm is a popular supervised ML technique used for classification and regression tasks. It works by recursively splitting the dataset into subsets based on the value of a certain attribute, with the purpose of developing a tree-like structure where each internal node represents a decision based on an attribute, and each leaf node represents the class label or predicted value. The DTs are easy to interpret and visualize, making them useful for understanding the underlying patterns in the data. However, they can be prone to overfitting if the tree is too deep or complex and may not perform well on datasets with high dimensionality or noisy data. Various extensions and ensemble methods, such as RF and Gradient Boosting, have been developed to address these limitations and improve the performance of DTs.
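
A single splitting decision of the kind described above can be sketched as a search for the threshold that minimizes the weighted impurity of the two resulting subsets. Gini impurity is assumed here as the splitting criterion; the source does not state which criterion was used, and the values below are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)  # equals 1 - p**2 - (1 - p)**2 for two classes

def best_split(values, labels):
    """Threshold on one attribute that minimizes the weighted Gini impurity."""
    pairs = list(zip(values, labels))
    unique = sorted(set(values))
    best_threshold, best_impurity = None, gini(labels)
    for a, b in zip(unique, unique[1:]):
        threshold = (a + b) / 2
        left = [y for v, y in pairs if v <= threshold]
        right = [y for v, y in pairs if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity

# Hypothetical platelet counts with class labels (1 = positive class)
platelets = [60, 75, 90, 210, 240, 260]
labels = [1, 1, 1, 0, 0, 0]
threshold, impurity = best_split(platelets, labels)
```

A full tree applies this search recursively to each subset, over all attributes, until a stopping rule such as maximum depth is reached.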

Support Vector Machine (SVM)

The SVM is a robust supervised learning algorithm commonly used for classification and regression tasks. The main objective of SVM is to find the hyperplane that best separates the data points into different classes by maximizing the margin between the classes. It works by mapping the input data into a higher-dimensional space and finding the optimal hyperplane separating the classes with the maximum margin. The SVM effectively handles high-dimensional data and is known for its ability to generalize well to unseen data. In addition, SVM can handle non-linear data by using kernel functions to map the data into a higher-dimensional space.
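
Three quantities from the description above can be made concrete: the decision value of a linear SVM, the margin width 2/||w|| that training maximizes, and a kernel function that compares points in an implicit higher-dimensional space. The weight vector, bias, and `gamma` value below are arbitrary illustrative numbers, not fitted parameters.

```python
import math

def decision_value(w, b, x):
    """Signed score w . x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the two margin hyperplanes w . x + b = +1 and -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel, a common choice for non-linear data."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

w, b = [3.0, 4.0], -1.0                      # hypothetical hyperplane
score = decision_value(w, b, [1.0, 1.0])
width = margin_width(w)
similarity = rbf_kernel([1.0, 2.0], [1.0, 2.0])
```

A smaller norm of `w` gives a wider margin, which is why SVM training minimizes that norm subject to the classification constraints.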

Logistic regression (LR)

The LR is a statistical model employed for binary classification tasks where the output variable is categorical and includes two classes. It works by estimating the probability that a given input belongs to a particular class, using a logistic function to map the input features to the output. The model calculates the log-odds of the probability that the input belongs to the positive class and then applies a sigmoid function to convert this into a probability score between 0 and 1. During training, the algorithm adjusts the weights of the input features to minimize the error between the predicted probabilities and the actual class labels. The LR is a simple yet powerful algorithm frequently used in various fields, such as healthcare, finance, and marketing, for its interpretability and ease of implementation.
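
The mapping from input features to a probability can be written out explicitly: a weighted sum gives the log-odds, and the sigmoid function converts the log-odds into a probability. The weights, bias, and input values in this sketch are hypothetical and serve only to illustrate the calculation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability of the positive class for one input row."""
    log_odds_value = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(log_odds_value)

def log_odds(p):
    """Inverse of the sigmoid: the logit of a probability."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for two features
weights, bias = [-0.03, 0.02], 1.0
x = [60.0, 200.0]
probability = predict_proba(weights, bias, x)
```

Because the log-odds are linear in the features, each weight can be read as the change in log-odds per unit change of its feature, which underlies the interpretability mentioned above.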

K-Nearest Neighbor (KNN)

The KNN algorithm is a simple and intuitive ML algorithm for classification and regression tasks. In KNN, the algorithm classifies a new data point based on the majority class of its k nearest neighbors in the training dataset. The value of k is a hyperparameter that can be tuned to optimize the model’s performance. The KNN is a non-parametric algorithm, meaning it makes no assumptions regarding the underlying data distribution. This makes KNN a versatile algorithm that can be applied to a wide range of datasets and is particularly useful for datasets with non-linear relationships. However, KNN can be computationally expensive, especially with large datasets, as it requires calculating the distance between each new data point and all data points in the training set.
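
The classification rule can be sketched directly: compute the distance from the query point to every training point, keep the k closest, and take a majority vote. Euclidean distance and k = 3 are assumed here, and the rows are hypothetical; in practice, features are usually scaled first so that no single feature dominates the distance.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Majority class among the k training rows closest to the query."""
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical rows: [platelet count, gestational age in weeks]
train_rows = [[60, 31], [75, 30], [90, 33], [210, 36], [240, 37], [260, 35]]
train_labels = [1, 1, 1, 0, 0, 0]
prediction_low = knn_predict(train_rows, train_labels, [80, 32])
prediction_high = knn_predict(train_rows, train_labels, [230, 36])
```

The sort over all training rows makes the cost of each prediction grow with the size of the training set, which is the computational drawback noted above.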

Results

Identified data elements

Table 2 indicates the list of data elements that were identified in this study. Variables were classified into four categories: demographics, medical history, test results, and outcome. In this research, the diagnosis feature in the outcome category was the target feature.

Table 2 The dataset features and their descriptions

Data gathering and Preparation

The number of samples gathered in this study was 384. The descriptive statistics of the variables of this dataset are presented in Tables 3 and 4, which describe the nominal and quantitative variables, respectively. Preliminary pre-processing was applied to this dataset, such as converting some variables from strings to numbers. Because the dataset was imbalanced, the Random Under-Sampling, Near Miss Under-Sampling, Adaptive Synthetic Sampling (ADASYN), and SMOTE methods were tried for balancing. Among these methods, SMOTE performed best based on the final results. Using the SMOTE method, the number of samples increased from 384 to 757. Before balancing, the positive class (HELLP syndrome) accounted for 1.5% of the samples, and after applying SMOTE, this share increased to 48%.

Table 3 Description table of nominal and rank variables of the study
Table 4 Description table of quantitative variables of the study

Modeling

Several algorithms were applied for modeling, among which nine were selected for reporting in this work based on the characteristics of the dataset and the evaluation results of the models. These algorithms included DT, MLP neural network, KNN, SVM, DL, RF, AdaBoost, XGBoost, and LR. Holdout and k-fold (k = 5, k = 10) modes were used for validation. Therefore, three different models were created and validated for each algorithm using these three validation methods. For a better comparison of the algorithms, the evaluation results are presented in Tables 5, 6, and 7.
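
The three validation modes differ only in how sample indices are partitioned into training and test sets. The sketch below shows a 70/30 holdout split and a k-fold split over 757 samples (the dataset size after SMOTE). It is a simplified, unstratified illustration with hypothetical function names, not the splitting code used in this study.

```python
import random

def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle the indices once and reserve a fixed fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

train_idx, test_idx = holdout_split(757)
five_fold = list(k_fold_splits(757, k=5))
ten_fold = list(k_fold_splits(757, k=10))
```

In k-fold validation, the reported metric is typically the average over the k test folds, whereas holdout yields a single estimate from one split.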

Table 5 Performance of data mining models using holdout (70 train-30 test)
Table 6 Performance of data mining models using 5-fold
Table 7 Performance of data mining models using 10-fold

According to the results, the MLP algorithm achieved the best performance among all algorithms in the holdout cross-validation method with F1 Score = 0.994. The ROC diagram of this model is indicated in Fig. 1. Moreover, Table 8 presents the confusion matrix for the best model.

Table 8 Confusion matrix for MLP model
Fig. 1 ROC diagram of MLP using holdout cross-validation

In both 5-fold and 10-fold cross-validation methods, the DL algorithm with F1 Score = 0.989 and F1 Score = 0.993, in respective order, outperformed the other algorithms. The ROC diagram of these two models is indicated in Figs. 2 and 3.

Fig. 2 ROC diagram of DL using 5-fold

Fig. 3 ROC diagram of DL using 10-fold

In Fig. 4, the F1 score comparison of all studied algorithms for three validation methods is presented.

Fig. 4 Comparison of the performance of models in different holdout, 5-fold, and 10-fold situations

Figure 5 indicates the importance of the included features in modeling. In this study, platelet count, gestational age, and aspartate transaminase (AST) were the most important features in the modeling, which is consistent with many medical studies, whereas the number of abortions, twins, and blood pressure were the least important features in the diagnosis of HELLP syndrome.

Fig. 5 Feature importance based on the modeling output

Discussion

In the present investigation, the medical records of 384 patients referred to Shohadaye Tajrish Hospital in Tehran, Iran, were analyzed, and after applying pre-processing, the diagnosis model of HELLP syndrome was implemented using ML algorithms. Among all the implemented algorithms, those based on neural networks outperformed the others. Although there was not much difference between the high-performance models, the best model in this study was implemented with the ANN algorithm and the holdout validation method.

ANNs have rarely been used in other studies. The exception is Huang [25], who reported an F1 score of 0.88, whereas our study achieved an F1 score of 0.995 for MLP. Zheng [23] also employed MLP, but it did not achieve the best performance among the algorithms tested. Furthermore, all the reviewed studies applied only cross-validation: 10-fold cross-validation in every case except Melinte-Popescu [20], who used 5-fold cross-validation to predict the severity of HELLP syndrome. None of these studies employed holdout validation or compared it with cross-validation.

The results indicate that the ML models reported in this study for the diagnosis of HELLP syndrome performed reliably compared to previous investigations, with the best model reaching an F1 score between 0.989 and 0.994 under each of the three validation methods.

For instance, in Moreira’s study, the F1 score of the proposed neuro-fuzzy algorithm was 0.705 [1]. In the Melinte-Popescu study, the highest F1 score for predicting the severity of HELLP syndrome, obtained with the DT, was 0.94 [20]. In addition, Melinte-Popescu’s study on preeclampsia prediction reported that the highest F1 score for predicting all cases of preeclampsia, obtained with the NB algorithm, was 0.98 [21].

In Moreira’s study, the F1 Scores for all Bayesian machine learning models predicting delivery outcomes for pregnant women and fetuses in cases of HDP, including HELLP syndrome, varied between 0 and 1, indicating different performance levels across the classes [28].

In addition, the features that are effective and important for the diagnosis of HELLP syndrome were determined in this study. Among the baseline features used in the modeling, gestational age was the most important, followed by the mother’s age. The other two features in this category, namely the number of pregnancies and the mother’s BMI, were equally important but less so than the previous two. According to the results presented in Fig. 5, the number of births is a contributing factor to HELLP syndrome. However, as noted previously [30], unlike in preeclampsia, nulliparity is not considered a risk factor for HELLP, and mothers with a history of childbirth account for 50% of affected patients. Finally, among the baseline features, twin pregnancy and the number of miscarriages did not affect the prediction model of HELLP syndrome.

Among the clinical features, nausea and diastolic blood pressure had the most significant impact on modeling, but headache, epigastric pain, and systolic blood pressure did not affect the modeling results. According to previous medical studies [31,32,33], blood pressure abnormalities are present in nearly 85% of cases of HELLP syndrome; however, this symptom may not be observed in severe cases of HELLP syndrome.

Among the biomarker features, platelet count was the most important in the modeling and also ranked first across all categories. This result is consistent with the definition of HELLP syndrome in all references. Other features in this category, including AST, FBS, direct bilirubin (Bili D), creatinine (CR), LDH, and total bilirubin (Bili T), also had a significant impact. Only ALT had no impact on the modeling. In general, the features of this category were much more effective in diagnosing the disease than those of the other categories.

In Marić’s study [27], blood pressure had the highest coefficient in the preeclampsia prediction model (which included patients with HELLP syndrome) fitted with an elastic net. This is not consistent with the results of our study, which considered HELLP syndrome separately from preeclampsia.
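For context, an elastic net combines the L1 (lasso) and L2 (ridge) penalties, so that some coefficients shrink exactly to zero while correlated predictors are retained together. A common parameterization for a penalized logistic model is given below. It is stated as an assumption about the general technique, not as the exact formulation used in [27], and the symbols ($\beta_0$, $\beta$, $\lambda$, $\alpha$, $x_i$, $y_i$, $n$) are introduced here for illustration only.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda\Big[\, \alpha\,\lVert\beta\rVert_1
  + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \Big]
```

Here $\lambda \ge 0$ sets the overall penalty strength and $\alpha \in [0, 1]$ mixes the two penalties ($\alpha = 1$ gives the lasso, $\alpha = 0$ gives ridge). Coefficient magnitudes are comparable across variables only when the predictors are standardized.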

In Zheng’s study [23], feature selection identified 15 features among various factors, including demographics, complications, delivery characteristics, neonate features, physical examinations, and laboratory examinations. Most of these features, and the most significant ones, belonged to the laboratory examination category, FBS among them, which aligns with the results of the present study.

Some previous investigations have addressed HELLP syndrome using ML. Among them, Melinte-Popescu et al. [20] predicted the severity of HELLP syndrome in three severity groups using the data of 81 patients. In that study, four machine learning algorithms were employed and compared, with DT showing the best performance. Other studies, such as the one conducted by Moya et al., have reported results that were significantly lower than those obtained in the present study.

The rarity of HELLP syndrome cases makes such studies challenging, and this issue is evident in all of them. What distinguishes this study to some extent is the number of patients included, the quality of the collected data, and its clinical approach, in which the results were used to determine the features that are effective in the diagnosis of HELLP syndrome.

Conclusion

Considering the impact of data mining techniques on the diagnosis and prediction of diseases, the present study used these techniques to develop a prediction model for HELLP syndrome. The evaluation of the models revealed that data mining algorithms can be used successfully for this purpose. All algorithms other than decision trees (DTs) achieved F1 scores above 0.90, and their performance was consistently high across all three evaluation methods: 5-fold cross-validation, 10-fold cross-validation, and hold-out validation with a 70/30 split. Despite some small differences among the algorithms, their performances were closely aligned. Moreover, this study indicated that biomarker features have the most significant impact on the diagnosis of HELLP syndrome. Although the accuracy of the obtained results was considerably high, more detailed investigations are necessary to assess the validity and generalizability of the findings and ultimately improve the quality of care for pregnant women.

Further studies involving larger groups of HELLP syndrome patients are recommended. Research on the differential diagnosis of HELLP syndrome from other pregnancy-related conditions, such as preeclampsia and eclampsia, is also suggested, and clustering methods could be used for this purpose. Interpretable DL models should be applied to assess the significance of the features used in the present work, and the models should be externally validated on other datasets. Limited access to data on HELLP syndrome and restricted access to data from Shohadaye Tajrish Hospital, which arose because this research was conducted during the COVID-19 pandemic, can be considered limitations of the present research.

Data availability

All of the material is owned by the authors and the dataset is accessible via email to the corresponding author.

References

  1. Moreira MW, Rodrigues JJ, Al-Muhtadi J, Korotaev VV, de Albuquerque VHC. Neuro‐fuzzy model for HELLP syndrome prediction in mobile cloud computing environments. Concurrency Computation: Pract Experience. 2021;33(7):1.

  2. Erkılınç S, Eyi EGY. Factors contributing to adverse maternal outcomes in patients with HELLP syndrome. J Maternal-Fetal Neonatal Med. 2018;31(21):2870–6.

  3. Langarizadeh M, Nadjarzadeh A, Maghsoudi B, Fatemi Aghda SA. The nutritional content required to design an educational application for infertile women. BMC Womens Health. 2023;23(1):22.

  4. Moradi M, Khorsandi B, Motaharinezhad M. A case report of a patient with postpartum HELLP syndrome. J Clin Basic Res. 2019;3(3):12–7.

  5. Wallace K, Harris S, Addison A, Bean C. HELLP syndrome: pathophysiology and current therapies. Curr Pharm Biotechnol. 2018;19(10):816–26.

  6. Çoşkun B, Erkilinç S, Kara Ö, Çoşkun B, Elmas B, Şahin D. The investigation of prenatal screening test parameters in predicting HELLP syndrome. Eastern J Med. 2020;25(1):114–7.

  7. Ansar MJ, Sangma OJS, Tanwar K, Tandon M. HELLP syndrome with surgical complication. Indian J Obstet Gynecol Res. 2019;6(4):560–1.

  8. Sisti G, Faraci A, Silva J, Upadhyay R. Neutrophil-to-lymphocyte ratio, platelet-to-lymphocyte ratio and complete blood count components in the first trimester do not predict HELLP syndrome. Medicina. 2019;55(6):219.

  9. Firouzi Jahantigh F, Nazarnejad R, Firouzi Jahantigh M. Investigating the risk factors for low birth weight using data mining: a case study of Imam Ali hospital, Zahedan, Iran. J Mazandaran Univ Med Sci. 2016;25(133):171–82.

  10. Available from: https://www.uptodate.com/

  11. Kazemi AFN, Sehhatie F, Sattarzade N, Mameghani M. The predictive value of urinary calcium to creatinine ratio, roll-over test and BMI in early diagnosis of pre-eclampsia. 2010.

  12. Kenny LC, Thomas G, Poston L, Myers JE, Simpson NA, McCarthy FP, et al. Prediction of preeclampsia risk in first time pregnant women: metabolite biomarkers for a clinical test. PLoS ONE. 2020;15(12):e0244369.

  13. Lashari SA, Ibrahim R, Senan N, Taujuddin NSA, editors. Application of data mining techniques for medical data classification: a review. MATEC Web of conferences; 2018: EDP Sciences.

  14. Dhillon A, Singh A. Machine learning in healthcare data analysis: a survey. J Biology Today’s World. 2019;8(6):1–10.

  15. Oskouei RJ, Kor NM, Maleki SA. Data mining and medical world: breast cancers’ diagnosis, treatment, prognosis and challenges. Am J cancer Res. 2017;7(3):610.

  16. Ghorbani R, Ghousi R. Predictive data mining approaches in medical diagnosis: a review of some diseases prediction. Int J Data Netw Sci. 2019;3(2):47–70.

  17. Pendyala VS, Figueira S, editors. Automated medical diagnosis from clinical data. 2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService); 2017: IEEE.

  18. Mazaheri S, Ashoori M, Bechari Z. A model to predict Heart disease treatment using data mining. Payavard Salamat. 2017;11(3):287–96.

  19. Salari R, Langarizadeh M, Bahaaddin Beigi K, Akramizadeh A, Kashanian M. Detecting of Preeclampsia by expert system: a case study in Tehran university of medical sciences hospitals. Payavard Salamat. 2016;9(6):556–65.

  20. Melinte-Popescu M, Vasilache I-A, Socolov D, Melinte-Popescu A-S. Prediction of HELLP syndrome severity using machine learning algorithms—results from a retrospective study. Diagnostics. 2023;13(2):287.

  21. Melinte-Popescu A-S, Vasilache I-A, Socolov D, Melinte-Popescu M. Predictive performance of machine learning-based methods for the prediction of preeclampsia—a prospective study. J Clin Med. 2023;12(2):418.

  22. Villalaín C, Herraiz I, Domínguez-Del Olmo P, Angulo P, Ayala JL, Galindo A. Prediction of delivery within 7 days after diagnosis of early onset preeclampsia using machine-learning models. Front Cardiovasc Med. 2022;9:910701.

  23. Zheng D, Hao X, Khan M, Wang L, Li F, Xiang N, et al. Comparison of machine learning and logistic regression as predictive models for adverse maternal and neonatal outcomes of preeclampsia: a retrospective study. Front Cardiovasc Med. 2022;9:959649.

  24. Chen J, Ji Y, Su T, Jin M, Yuan Z, Peng Y, et al. editors. Prediction of adverse outcomes in DE Novo hypertensive disorders of pregnancy: development and validation of maternal and neonatal prognostic models. Healthcare: MDPI; 2022.

  25. Huang K-H, Chen F-Y, Liu Z-Z, Luo J-Y, Xu R-L, Jiang L-L, et al. Prediction of pre-eclampsia complicated by fetal growth restriction and its perinatal outcome based on an artificial neural network model. Front Physiol. 2022;13:992040.

  26. Ejiwale MO. Prediction of Concurrent Hypertensive disorders in pregnancy and gestational diabetes Mellitus using machine learning techniques. The University of Wisconsin-Milwaukee; 2021.

  27. Marić I, Tsur A, Aghaeepour N, Montanari A, Stevenson DK, Shaw GM, et al. Early prediction of preeclampsia via machine learning. Am J Obstet Gynecol MFM. 2020;2(2):100100.

  28. Moreira MW, Rodrigues JJ, Carvalho FH, Chilamkurti N, Al-Muhtadi J, Denisov V. Biomedical data analytics in mobile-health environments for high-risk pregnancy outcome prediction. J Ambient Intell Humaniz Comput. 2019;10:4121–34.

  29. Svensson-Ranallo PA, Adam TJ, Sainfort F. A framework and standardized methodology for developing minimum clinical datasets. AMIA Summits on Translational Science Proceedings. 2011;2011:54.

  30. Audibert F, Friedman SA, Frangieh AY, Sibai BM. Clinical utility of strict diagnostic criteria for the HELLP (hemolysis, elevated liver enzymes, and low platelets) syndrome. Am J Obstet Gynecol. 1996;175(2):460–4.

  31. Sibai BM, Ramadan MK, Usta I, Salama M, Mercer BM, Friedman SA. Maternal morbidity and mortality in 442 pregnancies with hemolysis, elevated liver enzymes, and low platelets (HELLP syndrome). Am J Obstet Gynecol. 1993;169(4):1000–6.

  32. Reubinoff B, Schenker J. HELLP syndrome—a syndrome of hemolysis, elevated liver enzymes and low platelet count—complicating preeclampsia-eclampsia. Int J Gynecol Obstet. 1991;36(2):95–102.

  33. Sibai BM. The HELLP syndrome (hemolysis, elevated liver enzymes, and low platelets): much ado about nothing? Am J Obstet Gynecol. 1990;162(2):311–6.

Funding

This research received no external funding.

Author information

Contributions

Conceptualization, B.F., MJ.S., and M.L.; Methodology, MJ.S. and B.F.; Validation, M.L. and L.A.; Formal analysis, B.F. and MJ.S.; Investigation, L.A. and M.L.; Data curation, B.F. and MJ.S.; Writing (original draft), B.F. and MJ.S.; Supervision, L.A. and M.L. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Mohammadjavad Sayadi.

Ethics declarations

Ethical approval and consent to participate

This study was conducted in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The ethics committee of biomedical research of Iran University of Medical Sciences reviewed and approved the research plan and its experimental protocols. The ethical approval code issued by this committee is IR.IUMS.REC.1400.592. All methods were carried out in accordance with relevant guidelines and regulations. Informed consent was obtained from all individual participants included in the study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Farajollahi, B., Sayadi, M., Langarizadeh, M. et al. Presenting a prediction model for HELLP syndrome through data mining. BMC Med Inform Decis Mak 25, 135 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-025-02904-0

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-025-02904-0

Keywords