Association between serum hypertriglyceridemia and hematological indices: data mining approaches
BMC Medical Informatics and Decision Making volume 24, Article number: 410 (2024)
Abstract
Background
High triglyceride (TG) levels affect, and are affected by, other hematological factors. Fasting serum triglyceride concentrations, determined as part of a lipid profile, are a crucial key point among hematological factors and significantly affect various systemic diseases. This study was carried out to assess the potential relation between the concentration of TG and hematological factors.
Method
Our sample comprised 9704 participants aged between 35 and 65 years from the MASHAD cohort (northeastern Iran), studied from 2007 to 2020. Machine learning methods, specifically logistic regression, decision tree, and random forest algorithms, were used to analyze the data of individuals with normal and high TG levels.
Results
The highest Gini scores belonged to RLR (red cell distribution width/lymphocyte) (236.10), RPR (red cell distribution width/platelets) (215.78), and PHR (platelets/high-density lipoprotein) (273.66). We also found that factors such as age are statistically associated with the level of TG in women, probably due to the drop in estrogen at menopause. The RF model showed higher accuracy in predicting the TG level in both males and females.
Conclusion
Our model assessed the association between serum TG and several hematological factors such as RLR, RPR, and PHR. Other hematological factors have also been reported to be related to the TG level. These results give us new insights into the association of TG with various hematological factors and their possible interactions with each other. Future studies are needed to provide sufficient data on the mechanism and pathophysiology of the findings.
Background
Dyslipidemia is an abnormality in the lipid profile. Serum triglycerides (TG), low-density lipoproteins cholesterol (LDL-C), and high-density lipoproteins cholesterol (HDL-C) are the most commonly measured lipid factors [1]. It is estimated that around 10% of the total adult population has hypertriglyceridemia (HTG) [2]. Several studies have shown the relation between serum TG and cardiovascular disease (CVD), including myocardial infarction, ischemic heart disease, and heart failure, as well as type 2 diabetes mellitus [2]. Additionally, the role of high TG has previously been indicated in chronic and acute systemic diseases such as renal dysfunction, liver disease, and pancreatitis [3].
It has been reported that higher hematological parameters such as white blood cells (WBC), red blood cells (RBC), platelets (PLT), and mean corpuscular volume (MCV) are found in patients with HTG [4]. Recent studies indicate a higher risk of RBC aggregation in hyperlipidemic patients. The mechanism may involve the stimulation of hepatic fibrinogen production [5]. The TG-glucose (TyG) index is also a useful marker that is associated with the neutrophil count and other hematological parameters [6]. Previous studies have indicated that a high TyG index is associated with the development of diabetes and coronary atherosclerosis [7]. As a secondary cause of HTG, insulin resistance is known to be related to the TyG index. HTG has also been shown to affect PLT responses to aspirin in ischemic stroke patients by reducing PLT membrane fluidity, which hinders aspirin from entering the PLT through the more rigid membrane [8]. Therefore, blood parameters such as impaired fasting blood glucose (FBG) may play a significant role in the development of HTG [9]. TG-lowering medications have been shown to reduce the risk of cardiac events such as acute coronary syndrome (ACS), both by reducing the risk factor of high TG and through their anti-inflammatory properties.
Random forest (RF) and decision tree (DT) models are machine learning models that are widely used in medicine [10,11,12]. RF is a tree-based algorithm that can reach high accuracy by using bootstrap aggregation and predictor randomization. RF requires little hyperparameter tuning and has been reported to achieve higher accuracy than other models [13]. DT, on the other hand, has attracted growing interest due to its simplicity, comprehensibility, and ability to handle mixed-type data [14].
Although some interactions have been found between blood TG and hematological, physical, and other parameters, the subject has not been studied comprehensively. Because HTG is prevalent in today's society, it is crucial to understand the relations, early signs, and interfering factors of hyper- and hypo-triglyceridemia. Understanding these relations can help physicians predict and interpret the effect of high TG in affected populations. New indices based on hematological factors can open new avenues of understanding and prevention for physicians. Therefore, in this study, we aimed to evaluate the independent relationship between serum TG level and hematological and demographic factors in an Iranian cohort to better understand the interactions between TG and other factors. RF and DT models were also applied to predict the TG level from other significantly related factors.
Method
Study population
All individuals enrolled in this study were selected from the initial phase of the MASHAD cohort investigation (N = 9704), a ten-year longitudinal study conducted in Mashhad, northeastern Iran. Among the participants, 3270 were identified with elevated TG levels, comprising 1425 (44%) males and 1845 (56%) females. The initial survey was conducted in 2010, with follow-up assessments performed in 2019 [15]. Prior to participation, all individuals provided their informed consent by signing written consent forms. Exclusion criteria were coronary artery disease, stroke, and peripheral arterial disease or prevalent CVD, cancer, and chronic kidney disease. The study protocol was reviewed and approved by the Ethics Committee of MUMS, approval number IR.MUMS.REC.1386.250.
Laboratory analysis
The tests conducted for this study took place in Mashhad, Iran. Highly skilled professionals performed all examinations and analyses. The TG measuring instrument used was manufactured by Alfa Classic, Iran. Moreover, a cell counter SYSMEX-KX21 was employed for whole blood differential analysis. The TG determination kit was supplied by Pars Azmoun, Iran.
Baseline examination
Blood and urine specimens (mid-stream) were obtained between 8 and 10 a.m. following a 14-h fasting period. As per the established protocol, the samples were drawn into 20 ml (10 ml serum and 10 ml plasma) vacuum tubes and promptly centrifuged at room temperature within 30–45 min. Subsequently, the serum and plasma samples were dispatched to the Bu Ali Research Institutes in Mashhad, where they were stored at −80°C for subsequent analysis. The calculation of triglyceride levels was performed in mg/dl utilizing the Friedewald formula [16]. Fasting blood samples were drawn with BD syringes and collected into EDTA tubes, which were then sent for whole blood analysis. This included the RBC and WBC counts, the differential white blood cell count including NEUT and LYM, hemoglobin, hematocrit, red blood cell indices such as MCV, MCH, MCHC, and RDW, and platelet indices including the PLT count. The measurement of hematological factors was based on the following formulas defined by SI [17]:
MCV is calculated from HCT and the RBC count.
MCV (fL) = HCT (%) × 10 / RBC count (10^12/L)
MCHC is calculated from the Hb concentration and HCT.
MCHC (%) = Hb concentration (g/dL) / HCT (%) × 100
Hematocrit is calculated from the volume of red cells and the plasma volume.
HCT (%) = Red blood cell volume × 100 / (Red blood cell volume + plasma volume)
RDW is calculated as the standard deviation (SD) of red cell volume divided by MCV.
RDW (%) = SD RBC × 100 / MCV
PDW is calculated analogously from the SD of platelet volume and the mean platelet volume (MPV).
PDW (%) = SD PLT × 100 / MPV
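The index formulas above translate directly into code. The following is a minimal sketch of the MCV, MCHC, and RDW calculations only; the function names and the example values are hypothetical and are not taken from the study data.

```python
def mcv_fl(hct_percent: float, rbc_count: float) -> float:
    """MCV (fL) = HCT (%) x 10 / RBC count (10^12/L)."""
    return hct_percent * 10 / rbc_count


def mchc_percent(hb_g_dl: float, hct_percent: float) -> float:
    """MCHC (%) = Hb concentration (g/dL) / HCT (%) x 100."""
    return hb_g_dl / hct_percent * 100


def rdw_percent(sd_rbc: float, mcv: float) -> float:
    """RDW (%) = SD of red cell volume x 100 / MCV."""
    return sd_rbc * 100 / mcv


# Hypothetical example values for one participant (assumptions).
hct, rbc, hb, sd_rbc = 45.0, 5.0, 15.0, 11.7
mcv = mcv_fl(hct, rbc)
print(mcv, round(mchc_percent(hb, hct), 2), round(rdw_percent(sd_rbc, mcv), 2))
```

Keeping each index in its own function makes the units explicit in one place, which matters because the same ratio can be reported in different units by different analyzers.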
Statistical analysis
Data analysis was conducted using the R Statistical Software (v4.1.2; R Core Team 2021) and IBM SPSS Statistics (Version 27). Continuous and normally distributed variables were presented as mean ± standard deviation (SD), while continuous variables with non-normal distribution were expressed as median (lower and upper quartiles), and categorical variables were represented as frequency (%). P < 0.05 was considered statistically significant. To compare mean, median, and percentage values between subjects with TG levels < 150 and TG levels ≥ 150, the t-test was used for continuous and normally distributed variables, the Mann–Whitney U test for continuous variables with non-normal distribution, and the chi-square test for categorical variables.
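As an illustration of the descriptive summaries described above, the sketch below computes mean ± SD, median with lower and upper quartiles, and frequency (%) using only the Python standard library. It does not reproduce the authors' R or SPSS workflow; the helper names and the TG values are hypothetical, and the quartile method is the default of `statistics.quantiles`, which may differ from the one SPSS uses.

```python
import statistics


def mean_sd(values):
    """Summary for normally distributed variables: (mean, sample SD)."""
    return statistics.mean(values), statistics.stdev(values)


def median_quartiles(values):
    """Summary for non-normal variables: (median, lower quartile, upper quartile)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return q2, q1, q3


def frequency_percent(flags):
    """Summary for categorical variables: (count, percentage)."""
    count = sum(1 for flag in flags if flag)
    return count, 100 * count / len(flags)


# Hypothetical TG values in mg/dl (assumption, not study data).
tg = [90, 120, 149, 150, 180, 240, 310, 95]
high_tg = [value >= 150 for value in tg]  # binary outcome: TG >= 150

print(mean_sd(tg))
print(median_quartiles(tg))
print(frequency_percent(high_tg))
```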
To evaluate multicollinearity among independent variables, we calculated the variance inflation factor (VIF) and correlation coefficients. Variables with a correlation exceeding 0.95 were identified as highly correlated, indicating the presence of potential multicollinearity.
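The correlation screen can be sketched as follows. This is a simplified illustration, not the authors' code: it flags variable pairs whose absolute Pearson correlation exceeds 0.95 and, for the special case of exactly two predictors, computes the VIF as 1 / (1 − r²). The column values are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.95):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson_r(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


def vif_two_predictors(r):
    """VIF when the model has only two predictors, so that R^2 equals r^2."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical measurements (assumption, not study data).
columns = {
    "HGB": [12.1, 13.4, 14.0, 15.2, 16.1],
    "HCT": [36.5, 40.1, 42.0, 45.8, 48.2],
    "WBC": [6.1, 5.2, 7.4, 5.9, 6.6],
}
print(highly_correlated_pairs(columns))
```

With more than two predictors, the VIF of a variable requires regressing it on all the others; the two-predictor form is shown only because it needs no linear algebra.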
Logistic regression (LR) was employed to calculate odds ratios (OR) along with their 95% confidence intervals using three models. These models included the variables Sex, Age, WBC, RBC, Hematocrit (HCT), Mean Corpuscular Hemoglobin Concentration (MCHC), Neutrophil/Lymphocyte (NLR), RDW/Lymphocyte (RLR), log of RPR (RDW/PLT), and PLT/HDL (PHR), with the outcome variable being binary TG (model A). Furthermore, separate analyses were conducted for male and female participants (models B and C) within the same model framework.
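The composite predictors entering the LR models are simple ratios of routine blood values. A minimal sketch of their calculation is given below. The function name and the input values are hypothetical, the units of each input are assumptions (the text does not state them), and the natural logarithm is assumed for log RPR because the base is not specified.

```python
import math


def derived_indices(neut, lym, rdw, plt, hdl):
    """Compute the ratio indices used as predictors of binary TG."""
    rpr = rdw / plt
    return {
        "NLR": neut / lym,         # Neutrophil/Lymphocyte
        "RLR": rdw / lym,          # RDW/Lymphocyte
        "RPR": rpr,                # RDW/PLT
        "log_RPR": math.log(rpr),  # natural log assumed
        "PHR": plt / hdl,          # PLT/HDL
    }


# Hypothetical values for one participant (assumptions, not study data).
indices = derived_indices(neut=4.0, lym=2.0, rdw=40.0, plt=250.0, hdl=50.0)
print(indices)
```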
Random forest model
The data were analyzed with a data mining methodology, and an RF model was used to construct a predictive model for TG measurements. An RF is a machine learning algorithm that uses an ensemble of decision trees to make predictions. The goal of an RF is to reduce overfitting and improve generalization performance by combining the predictions of multiple decision trees. To create an RF, a set of decision trees is trained on randomly sampled subsets of the data. Each tree is trained independently of the others and is allowed to use only a subset of the available features. This process is known as feature bagging, and it helps to reduce the correlation between the trees and improve the overall performance of the forest. When making a prediction, each decision tree in the forest independently produces a prediction, and the final prediction is made by taking the majority vote of all the trees. Training the trees on resampled data and combining their predictions in this way is known as bagging (bootstrap aggregation), and it helps to reduce the variance of the predictions and improve the accuracy of the model.
RF models have several advantages over other machine learning algorithms. They are highly accurate, even when dealing with noisy or missing data, and they can handle large datasets with many features. Additionally, they are resistant to overfitting, which makes them a popular choice for many applications. In summary, RF is a powerful machine learning algorithm that combines the predictions of multiple decision trees to make accurate predictions. By using feature bagging and bagging to reduce the variance of the predictions and improve generalization performance, RF models can handle large datasets and are resistant to overfitting.
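The two ideas described above, training each tree on a resampled subset with a restricted set of features and combining the trees by majority vote, can be illustrated with a deliberately small sketch. It is not the RF implementation used in the study: each "tree" here is a one-split stump on a single randomly chosen feature, whereas a real random forest grows full trees and randomizes features at every split. All names and data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, feature):
    """Pick the threshold with the fewest errors; predict 1 when value > threshold."""
    best_errors, best_threshold = None, None
    for threshold in sorted({x[feature] for x, _ in rows}):
        errors = sum((x[feature] > threshold) != bool(y) for x, y in rows)
        if best_errors is None or errors < best_errors:
            best_errors, best_threshold = errors, threshold
    return feature, best_threshold


def fit_forest(rows, n_trees, n_features, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        feature = rng.randrange(n_features)
        forest.append(fit_stump(sample, feature))
    return forest


def predict(forest, x):
    """Majority vote over the predictions of all stumps."""
    votes = Counter(int(x[feature] > threshold) for feature, threshold in forest)
    return votes.most_common(1)[0][0]


# Hypothetical rows: ((feature_0, feature_1), label); not study data.
rows = [((i, 2 * i), 0) for i in range(1, 11)] + [((i, 2 * i), 1) for i in range(11, 21)]
forest = fit_forest(rows, n_trees=25, n_features=2)
print(predict(forest, (3, 6)), predict(forest, (18, 36)))
```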
In this paper, we split the data into training and test sets at a ratio of 75% to 25%, respectively. The RF was implemented in R, and its confusion matrix was used to evaluate accuracy, precision, specificity, and other metrics.
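The split and the confusion-matrix metrics can be sketched in a few lines. This is an illustrative stand-in for the R workflow, with hypothetical function names, an arbitrary random seed, and made-up labels; class 1 denotes high TG.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def confusion_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


train, test = train_test_split(list(range(100)))
print(len(train), len(test))

# Hypothetical true and predicted labels (assumption, not study results).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(confusion_metrics(y_true, y_pred))
```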
Decision tree model
The TG level was also predicted using a DT developed with a data mining method. DT is a non-parametric method, used in this study to predict TG levels from different predictors. CART, ID3, C4.5, and CHAID are examples of algorithms for building DTs. The CHAID (Chi-squared Automatic Interaction Detector) technique, a useful tool for prediction and classification, was used in the DT framework. CHAID is useful for the detection of interactions between variables. As in the CHAID algorithm, chi-square tests were used to identify the most effective feature in the DT. The chi-square test formula is:
$$\chi^2 = \sum \frac{(y - y')^2}{y'}$$
where y represents the actual values and y' the anticipated values. The significance of the predictor variables is determined by the classification procedure in the model. SPSS software (version 27) was used to assess the accuracy, precision, and sensitivity of the DT algorithm from its confusion matrix. Finally, the results obtained from the DT's confusion matrix were interpreted.
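The chi-square statistic that CHAID uses to rank candidate splits can be computed as in the sketch below, assuming the usual Pearson form in which y is the observed count in a cell and y' is the count expected under independence. The contingency table is hypothetical: rows are the two sides of a candidate split and columns are TG < 150 and TG ≥ 150.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, y in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # y' under independence
            statistic += (y - expected) ** 2 / expected
    return statistic


# Hypothetical counts (assumption, not study data).
split_table = [[30, 10], [20, 40]]
print(round(chi_square(split_table), 3))
```

A larger statistic indicates a stronger association between the split and the outcome, which is why the predictor with the most significant chi-square test is chosen for the next split.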
Results
Characteristics of the study population
Our study population comprised 59% females and 41% males (N = 9704 in total). The baseline blood factors of the participants are summarized in Table 1, which presents the TG blood concentrations of both sexes, divided into two groups by binary TG (TG ≥ 150 and TG < 150). As shown in the table, a higher percentage of males than females had a high level of TG (37% of males and 32% of females). Table 1 also gives evidence of significant differences between high and low levels of TG in the gender subgroups. Some factors were statistically significant in both groups but might not be clinically significant, such as WBC, Hemoglobin (HGB), HCT, PLT, RDW, Neutrophil (NEUT), RLR, PHR, HGB/Lymphocyte (HLR), RBC, MCHC, Lymphocyte (LYM), LHR, NHR, RPR, and HPR (P-value < 0.05). Some variables behaved differently by gender: NLR and MCH were insignificant only in male participants, whereas Age and MPV were statistically significant in women. Among subjects with high levels of TG, most variables, such as HGB, HCT, RBC, and MCV, had a higher mean or median in the male group than in the female group.
The association between blood factors and TG using LR model
The models include the effects of Sex, Age, WBC, RBC, HCT, MCHC, NLR, RLR, log RPR, and PHR. In Table 2, model A presents the results of the LR model and the corresponding ORs for these variables. Similarly, models B and C show the ORs with the same predictors for males and females, respectively. The accuracy of the models varies from 70 to 75%. In model A, all variables were statistically significant (P-value < 0.05). Adjusted for all other variables in model A, the odds of high TG (binary outcome, TG < 150 vs TG ≥ 150) in females were 27% higher than in males (OR = 1.27 (1.12, 1.27)). For each unit increase in Age, RBC, HCT, MCHC, NLR, and PHR, the odds of high TG increased by 4% (OR = 1.04 (1.03, 1.04)), 21% (OR = 1.21 (1.02, 1.21)), 4% (OR = 1.04 (1.02, 1.04)), 6% (OR = 1.06 (1.02, 1.06)), 54% (OR = 1.54 (1.32, 1.54)), and 38% (OR = 1.38 (1.33, 1.38)), respectively. Moreover, WBC and RLR decreased the odds of high TG by 16% and 11%, respectively (OR = 0.84 (0.79, 0.84) and OR = 0.89 (0.87, 0.89)). Similar results are shown in models B and C, except that RBC and MCHC for males and WBC for females were not statistically significant.
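The percentages quoted above follow from the odds ratios by simple arithmetic: an OR of 1.04 per unit corresponds to a (1.04 − 1) × 100 = 4% increase in odds, and the effect of several units compounds multiplicatively. The sketch below illustrates this with the reported ORs for Age and WBC; the function names are hypothetical, and the ten-unit example is an illustration rather than a result reported in the study.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in odds per one-unit increase in the predictor."""
    return (odds_ratio - 1) * 100


def odds_ratio_over_units(odds_ratio, units):
    """Odds ratio for an increase of several units, assuming a constant per-unit OR."""
    return odds_ratio ** units


print(round(percent_change_in_odds(1.04), 1))     # Age: about +4% per year
print(round(percent_change_in_odds(0.84), 1))     # WBC: about -16% per unit
print(round(odds_ratio_over_units(1.04, 10), 2))  # Age: compounded over ten years
```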
The association between blood factors and TG using RF
In this part, all variables selected in the LR model were entered into the RF model. The results of the RF model, with an accuracy of 65% for males and 70% for females, indicate that variables such as PHR, RLR, and RPR have the highest influence on TG, meaning that changes in these variables have the greatest effect on the likelihood of individuals developing high TG. The importance of each variable for TG, according to the mean decrease Gini measure from the DTs grown on each sample in the RF model, is shown in Table 3. The variables ranked by importance and the partial dependence of the first and second most important variables for RF are shown in Fig. 1. Details of the confusion matrix in the test and train sets are shown in Table 4.
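The mean decrease Gini measure is built from the reduction in Gini impurity achieved by each split on a variable, accumulated over the splits in a tree and averaged over the trees of the forest. The sketch below shows only the single-split building block for a binary outcome; the function names and labels are hypothetical, and the node-size weighting used by a particular RF implementation is not reproduced here.

```python
def gini_impurity(labels):
    """Gini impurity of a node with binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)  # equals 1 - p^2 - (1 - p)^2


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


# Hypothetical node: 1 = TG >= 150, 0 = TG < 150 (assumption, not study data).
parent = [1, 1, 1, 1, 0, 0, 0, 0]
print(gini_decrease(parent, [1, 1, 1, 0], [1, 0, 0, 0]))  # partial separation
print(gini_decrease(parent, [1, 1, 1, 1], [0, 0, 0, 0]))  # perfect separation
```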
Random forest model for males and females
According to the results of the RF model, shown in Fig. 1, PHR and RLR emerged as the most influential variables for males, with PHR being particularly important in both groups. Based on the partial dependence plot, the likelihood of developing high TG decreased when the value of PHR was below approximately 2.5 and increased when it was above 2.5. RLR was the second most important variable: when it was below 33, the likelihood of developing high TG decreased drastically, and for RLR above 33 this likelihood increased and then remained constant.
Similarly, for women, PHR and RLR were the first and second most important variables, respectively. When PHR was below approximately 4, the likelihood of developing high TG was increasing; above 4 this likelihood continued to increase, and above 10 it remained constant. For RLR, the second most important blood factor in women, the likelihood of developing high TG decreased drastically below approximately 27 and increased above 27.
The association between blood factors and TG using decision tree models
Binary TG using decision tree models for males and females
Figures 2 and 3 demonstrate the results of the training on TG levels concerning blood factors in male and female subjects. The DT algorithm was employed to identify binary risk factors (TG < 150 vs TG ≥ 150) and organize them into three layers. In the DT model, the primary variable (root) holds the greatest significance in data classification, while subsequent variables possess lesser importance. Figure 2 indicates that in males, the variable RPR yielded the most significant impact on the risk of TG development, followed by RLR, PHR, and WBC. Within the subgroup exhibiting 0.18 < RPR ≤ 0.24, PHR > 6.3, and RLR ≤ 20.1, 57.1% of participants were classified as having the highest risk of experiencing elevated TG levels (≥ 150). Conversely, among individuals with RPR ≤ 0.18 and RLR > 28, 94.4% were identified as having the lowest risk of elevated TG levels (< 150).
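The two male subgroups described above can be written as explicit rules. The sketch below encodes only those two leaves, with the thresholds and percentages quoted in the text; it is not the complete tree of Fig. 2 or Table 5, and the function name is hypothetical. Any participant who falls outside both subgroups returns `None`.

```python
def male_dt_leaf(rpr, rlr, phr):
    """Return (predicted class, percentage of the leaf in that class) for two leaves."""
    if 0.18 < rpr <= 0.24 and phr > 6.3 and rlr <= 20.1:
        return ("TG >= 150", 57.1)  # highest-risk subgroup described in the text
    if rpr <= 0.18 and rlr > 28:
        return ("TG < 150", 94.4)   # lowest-risk subgroup described in the text
    return None                     # other branches of the tree are not encoded here


print(male_dt_leaf(rpr=0.20, rlr=18.0, phr=7.0))
print(male_dt_leaf(rpr=0.15, rlr=30.0, phr=5.0))
```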
For females, Fig. 3 shows that RPR had the most crucial effect on the risk of developing high TG, followed by Age and PHR. In the subgroup with 0.15 < RPR ≤ 0.2, PHR > 6.32, and RLR ≤ 15.9, 62.5% of females had elevated TG levels (≥ 150), the highest-risk group, while in the subgroup with RPR > 0.2, Age ≤ 42, and WBC ≤ 4.2, 86.7% of subjects had TG < 150, the lowest-risk group. Detailed rules for TG in males and females created by the DT model are shown in Table 5. Details of the confusion matrix in the test and train sets are shown in Table 6. We summarize the paper in Fig. 4.
Discussion
HTG is a major risk factor in coronary artery disease (CAD), thyroid disorders, and kidney and liver dysfunction [18], and is known to be related to them both dependently and independently [19, 20]. HTG is a common clinical finding that is associated with both inflammatory and non-inflammatory processes that promote atherosclerosis [21, 22]. However, the role of HTG in CVD has been the subject of debate, since some meta-analyses attribute the risk to HDL rather than HTG after adjustment, while others argue against this [23,24,25]. On the other hand, there are numerous findings on the crucial role of HTG in non-alcoholic fatty liver disease (NAFLD) [26]. Given the importance of HTG in the prognosis of systemic diseases such as CVD, CAD, and NAFLD, we designed this study to better understand the mechanism of disease and the hematological factors altered by HTG. This cohort analysis investigated hematological and demographic factors associated with the level of TG. TG is a known risk factor in CVD and can depend heavily on a person's lifestyle and diet. Most baseline characteristics were significantly associated with the level of TG when the male and female groups were divided into those below and above 150 mg/dl TG. Factors like WBC, HGB, HCT, PLT, RDW, neutrophil, RLR, PHR, and RBC were significantly related to the level of TG. Among the demographic factors, the analysis showed that age can be a determining factor in the level of TG in women but not in men. This might be due to hormonal differences, as studies indicate that menopause and aging in women lead to an adverse lipid profile and can potentially increase cholesterol and TG through the drop in estrogen, which regulates lipid metabolism more dramatically in women than in men [27].
As the regression model summary in Table 2 shows, model B (males) had a higher accuracy than the other two models (model A with 74% and model C with 70%). As expected and as discussed further in this section, gender was a determining factor in the level of TG, as it was for other hematological factors. WBC in women and RBC and MCHC in men were not significantly associated with TG level in these models.
As reported earlier, age was a significant factor in TG levels in women, unlike in men. The results in Table 3 also indicate that the mean decrease Gini score (MDGS) for age is higher in females than in males, further confirming this (200.78 in women vs 117.76 in men). The results also showed that RLR, RPR, and PHR have the highest MDGS. While age is next in line after PHR, RLR, and RPR in females, RBC is the fourth determining factor in males according to MDGS. Testosterone, known to be the main sex hormone in men, increases the RBC count and also lowers the TG level through multiple mechanisms [28]. Additionally, red blood cell distribution width (RDW) has long been used in determining the type of anemia, but it has recently been found that RDW can act as an inflammation-associated marker predicting mortality [29]. Although the mechanism is not fully elucidated, researchers have discussed considering RDW as an inflammatory factor like C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR), as RDW is related to all-cause mortality (both cardiac and non-cardiac mortality) [30]. It has also been reported to be associated with various cancers and hematological malignancies [31]. A previous study found that the level of TG is inversely related to RDW [32]. PLT is another routine factor in clinical laboratory tests. One study indicated that one of three platelet activation markers, GP53, showed increased expression in HTG patients [33]. Another study indicated that an increased level of TG is associated with lower platelet reactivity [8]. It is also already known that HDL is inversely related to the TG level, which makes PHR the indicator with the highest MDGS for association with the TG level.
As the specific rules developed by the DT model in Table 5 indicate, RPR was the most important factor in determining the level of TG. For females, age above or below 42 and 49 was also a determining variable in TG level, as this is the common age range of menopause in women. PHR was also a key variable in both genders, as HDL and platelets were both significantly associated with the level of TG in the baseline characteristics. There is also evidence suggesting that HTG can increase the measured hemoglobin level and blood turbidity, but the mechanism is yet to be discussed [34].
We compared the performance of two machine learning models, RF and DT. RF has less redistribution compared to DT but takes more time than DT to execute on the dataset [35, 36]. The RF model had a higher accuracy than DT in predicting the TG level from the other factors in training (accuracy 99% in RF for both genders vs 66% for males and 70% for females in the DT model). In testing, the accuracy of the RF model was higher for females (71% vs 67% for the DT model) and lower for males (65% vs 66% for the DT model). On the other hand, the precision and sensitivity of the RF models were higher than those of the DT models in training, whereas in testing precision and sensitivity were higher in the DT model.
One of the major strengths of our study is its large sample size. Our study enrolled 9704 people, mostly from the Persian population, with a wide age range of 35 to 65 years. Because a wide range of hematological variables was included in the study, some relations between TG and hematological factors were found that have not been discussed or discovered before.
Despite the notable strengths of our study, some limitations should be taken into account. Although a wide range of variables was obtained from patients to find their possible associations with TG, some factors such as CRP, ESR, B12 level, and iron level might help to better understand the mechanism of these findings. Furthermore, investigating all-cause mortality and cardiac-related mortality after follow-up in these individuals could more confidently verify the role of TG in CVD and all-cause mortality.
Our results might aid future research in designing and executing applications, algorithms, and artificial intelligence tools to understand, interpret, and ultimately manage patients with HTG. Our results also have the potential to guide authorities and regulators to better understand the side effects of HTG by screening some simple blood factors on a worldwide scale.
Conclusion
Our research identified RLR, RPR, and PHR as critical, previously unrecognized factors related to triglyceride (TG) levels. Additionally, we found associations between TG levels and various hematological factors, including WBC, HCT, PLT, neutrophils, RBC, HGB, and MCHC. We also developed models with satisfactory accuracy for predicting TG levels in the study population. The Random Forest model outperformed the decision tree model in predicting TG levels, demonstrating higher accuracy and sensitivity. However, the decision tree model had higher precision in the testing data. The decision tree analysis highlighted RLR and RPR as the most influential factors. Our findings also revealed the significance of gender and age, particularly for females, in relation to TG levels. These factors could prove valuable in clinical healthcare settings due to their cost-effectiveness, ease of use, and routine availability in clinical practice.
Data availability
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
Abbreviations
- TG: Triglycerides
- LDL-C: Low-density lipoproteins cholesterol
- HDL-C: High-density lipoproteins cholesterol
- CVD: Cardiovascular disease
- WBC: White blood cells
- RBC: Red blood cells
- PLT: Platelet
- MCV: Mean corpuscular volume
- TyG: TG-glucose
- FBG: Fasting blood glucose
- ACS: Acute coronary syndrome
- RF: Random forest
- DT: Decision tree
- SD: Standard deviation
- VIF: Variance inflation factor
- LR: Logistic regression
- OR: Odds ratios
- HCT: Hematocrit
- MCHC: Mean Corpuscular Hemoglobin Concentration
- NLR: Neutrophil/Lymphocyte
- RLR: RDW/Lymphocyte
- RPR: RDW/PLT
- PHR: PLT/HDL
- CHAID: Chi-squared Automatic Interaction Detector
- HGB: Hemoglobin
- NEUT: Neutrophil
- HLR: HGB/Lymphocyte
- LYM: Lymphocyte
- CAD: Coronary artery disease
- NAFLD: Non-alcoholic fatty liver disease
- MDGS: Mean decrease Gini score
- RDW: Red blood cell distribution width
- CRP: C-reactive protein
- ESR: Erythrocyte sedimentation rate
References
Dipiro JT, Talbert RL, Yee GC, Matzke GR, Wells BG, Posey LM. Pharmacotherapy: a pathophysiologic approach, ed. Connecticut: Appleton and Lange. 2014;4:141–2.
Toth PP, Granowitz C, Hull M, Liassou D, Anderson A, Philip S. High triglycerides are associated with increased cardiovascular events, medical costs, and resource use: a real-world administrative claims analysis of statin-treated patients with high residual cardiovascular risk. J Am Heart Assoc. 2018;7(15): e008740.
Chait AJE, Clinics M. Hypertriglyceridemia. Endocrinol Metab Clin. 2022;51(3):539–55.
Hashemi SN, Saatian M, Hatamzadeh P, Poursadry P. The effects of hyperglycemia and hyperlipidemia on blood indices. Journal of Advanced Pharmacy Education & Research. 2020;10(S4):109–12.
Cicha I, Suzuki Y, Tateishi N, Maeda N. Enhancement of red blood cell aggregation by plasma triglycerides. Clin Hemorheol Microcirc. 2001;24(4):247–56.
Mansoori A, Sahranavard T, Hosseini ZS, Soflaei SS, Emrani N, Nazar E, et al. Prediction of type 2 diabetes mellitus using hematological factors based on machine learning approaches: a cohort study analysis. Sci Rep. 2023;13(1):1–11.
Lin H-Y, Zhang X-J, Liu Y-M, Geng L-Y, Guan L-Y, Li X-H. Comparison of the triglyceride glucose index and blood leukocyte indices as predictors of metabolic syndrome in healthy Chinese population. Sci Rep. 2021;11(1):10036.
Karepov V, Tolpina G, Kuliczkowski W, Serebruany V. Plasma triglycerides as predictors of platelet responsiveness to aspirin in patients after first ischemic stroke. Cerebrovasc Dis. 2008;26(3):272–6.
Laufs U, Parhofer KG, Ginsberg HN, Hegele RA. Clinical review on triglycerides. Eur Heart J. 2020;41(1):99–109c.
Ghazizadeh H, Shakour N, Ghoflchi S, Mansoori A, Saberi-Karimiam M, Rashidmayvan M, et al. Use of data mining approaches to explore the association between type 2 diabetes mellitus with SARS-CoV-2. BMC Pulm Med. 2023;23(1):1–14.
Mansoori A, Hosseini ZS, Ahari RK, Poudineh M, Rad ES, Zo MM, et al. Development of Data Mining Algorithms for Identifying the Best Anthropometric Predictors for Cardiovascular Disease: MASHAD Cohort Study. High Blood Pressure & Cardiovascular Prevention. 2023;30(3):1–11.
Poudineh M, Mansoori A, Sadooghi Rad E, Hosseini ZS, Salmani Izadi F, Hoseinpour M, et al. Platelet distribution widths and white blood cell are associated with cardiovascular diseases: data mining approaches. Acta Cardiologica. 2023;78(9):1033–44.
Rigatti SJ. Random forest. J Insur Med. 2017;47(1):31–9.
Su J, Zhang H. A fast decision tree learning algorithm, AAAI'06: In Proceedings of the 21st national conference on Artificial intelligence. 2006;1:500–5.
Ghayour-Mobarhan M, Moohebati M, Esmaily H, Ebrahimi M, Parizadeh SMR, Heidari-Bakavoli AR, et al. Mashhad stroke and heart atherosclerotic disorder (MASHAD) study: design, baseline characteristics and 10-year cardiovascular risk estimation. Int J Public Health. 2015;60:561–72.
Castelli WP, Garrison RJ, Wilson PW, Abbott RD, Kalousdian S, Kannel WB. Incidence of coronary heart disease and lipoprotein cholesterol levels: the Framingham Study. JAMA. 1986;256(20):2835–8.
Harrison P, Goodall AH. Studies on mean platelet volume (MPV)-new editorial policy. Platelets. 2016;27(7):605–6.
Harzandi A, Lee S, Bidkhori G, Saha S, Hendry BM, Mardinoglu A, et al. Acute kidney injury leading to CKD is associated with a persistence of metabolic dysfunction and hypertriglyceridemia. iScience. 2021;24(2):102046.
Saadatagah S, Pasha AK, Alhalabi L, Sandhyavenu H, Farwati M, Smith CY, et al. Coronary heart disease risk associated with primary isolated hypertriglyceridemia; a population-based study. J Am Heart Assoc. 2021;10(11):e019343.
Nordestgaard BG, Varbo A. Triglycerides and cardiovascular disease. Lancet. 2014;384(9943):626–35.
Hansen SE, Varbo A, Nordestgaard BG, Langsted AJCC. Hypertriglyceridemia-associated pancreatitis: new concepts and potential mechanisms. Clin Chem. 2023;69(10):1132–44.
Hu H, Han Y, Liu Y, Guan M, Wan Q. Triglyceride: a mediator of the association between waist-to-height ratio and non-alcoholic fatty liver disease: a second analysis of a population-based study. Front Endocrinol (Lausanne). 2022;13:973823.
Collaboration APCS. Serum triglycerides as a risk factor for cardiovascular diseases in the Asia-Pacific region. Circulation. 2004;110(17):2678–86.
Hokanson JE, Austin MA. Plasma triglyceride level is a risk factor for cardiovascular disease independent of high-density lipoprotein cholesterol level: a metaanalysis of population-based prospective studies. J Cardiovasc Risk. 1996;3(2):213–9.
Liu J, Zeng FF, Liu ZM, Zhang CX, Ling WH, Chen YM. Effects of blood triglycerides on cardiovascular and all-cause mortality: a systematic review and meta-analysis of 61 prospective studies. Lipids Health Dis. 2013;12:1–11.
Meroni M, Longo M, Paolini E, Tria G, Ripolone M, Napoli L, et al. Expanding the phenotypic spectrum of non-alcoholic fatty liver disease and hypertriglyceridemia. Front Nutr. 2022;9:967899.
Gouni-Berthold I, Ulrich L. Special aspects of cholesterol metabolism in women. Dtsch Arztebl Int. 2024;121(12):401.
Agledahl I, Skjærpe P-A, Hansen J-B, Svartberg J. Low serum testosterone in men is inversely associated with non-fasting serum triglycerides: the Tromsø study. Nutr Metab Cardiovasc Dis. 2008;18(4):256–62.
Hu L, Li M, Ding Y, Pu L, Liu J, Xie J, et al. Prognostic value of RDW in cancers: a systematic review and meta-analysis. Oncotarget. 2017;8(9):16027.
Lippi G, Targher G, Montagnana M, Salvagno GL, Zoppini G, Guidi GC. Relation between red blood cell distribution width and inflammatory biomarkers in a large cohort of unselected outpatients. Arch Pathol Lab Med. 2009;133(4):628–32.
Ilhan A, Gurler F, Yilmaz F, Eraslan E, Dogan M. The relationship between hemoglobin-RDW ratio and clinical outcomes in patients with advanced pancreas cancer. Eur Rev Med Pharmacol Sci. 2023;27(5):2060.
Vayá A, Sarnago A, Fuster O, Alis R, Romagnoli M. Influence of inflammatory and lipidic parameters on red blood cell distribution width in a healthy population. Clin Hemorheol Microcirc. 2015;59(4):379–85.
de Man FH, Nieuwland R, van der Laarse A, Romijn F, Smelt AH, Leuven JAG, et al. Activated platelets in patients with severe hypertriglyceridemia: effects of triglyceride-lowering therapy. Atherosclerosis. 2000;152(2):407–14.
Zeng SG, Zeng TT, Jiang H, Wang LL, Tang SQ, Sun YM, et al. A simple, fast correction method of triglyceride interference in blood hemoglobin automated measurement. J Clin Lab Anal. 2013;27(5):341–5.
Prajwala T. A comparative study on decision tree and random forest using R tool. Int J Adv Res Comput Commun Eng. 2015;4(1):196–9.
Sun Z, Wang G, Li P, Wang H, Zhang M, Liang X. An improved random forest based on the classification accuracy and correlation measurement of decision trees. Expert Syst Appl. 2024;237:121549.
Acknowledgements
The authors thank everyone who helped them carry out this work and prepare the manuscript.
Funding
The present study did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
SG, MG, and MGM contributed to the conception, design, and preparation of the manuscript. SG, AM, and AK conducted the data collection and analysis and contributed to data acquisition and interpretation. AK, SG, MH, HE, and GF made substantial contributions to drafting the manuscript and revising it critically for important intellectual content. All authors have read and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
This study protocol was reviewed and approved by the Ethics Committee of MUMS, approval number IR.MUMS.REC.1386.250. Informed consent to participate was obtained from all of the participants in the study.
Consent for publication
Consent for publication was obtained from all authors.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ghiasi Hafezi, S., Mansoori, A., Kooshki, A. et al. Association between serum hypertriglyceridemia and hematological indices: data mining approaches. BMC Med Inform Decis Mak 24, 410 (2024). https://doi.org/10.1186/s12911-024-02835-2
DOI: https://doi.org/10.1186/s12911-024-02835-2