Method | Description | Advantages | Disadvantages |
---|---|---|---|
Logistic Regression | Models the probability of stroke from risk factors, fitting a sigmoid function by gradient descent; performance is evaluated with and without regularization (see the first sketch after the table) | High accuracy (above 95%), can be improved through regularization, easy to implement | Limited accuracy on nonlinear relationships; sensitive to parameter selection |
Decision Tree | Recursively splits the data by feature thresholds, producing a tree of interpretable decision rules (covered, along with the other classical methods, in the combined sketch after the table) | Visualizes decisions, easy to interpret | Prone to overfitting without pruning or regularization |
Random Forest | Aggregates the results of multiple decision trees, each trained on a different subset of the data, by voting, which reduces the risk of overfitting | High accuracy (96%) and robustness | Can be slow on large datasets |
Naive Bayes | Classifies under the assumption that features are independent | Simple and efficient in many classification tasks | Lower accuracy (82%); limited with complex relationships between features |
k-Nearest Neighbors (k-NN) | Classifies new observations by the majority class of their nearest neighbors in the training set | Simple and intuitive | Scales poorly to large datasets; sensitive to the choice of k |
Support Vector Machine (SVM) | Finds a maximum-margin decision boundary, using kernel functions to handle nonlinear distributions | Highly effective on high-dimensional data | Sensitive to parameter selection; struggles with large datasets |
Deep Learning | Applies convolutional neural networks (CNNs) to medical image analysis (see the CNN sketch after the table) | High accuracy in detecting complex patterns | Requires large datasets and substantial computational resources |
Artificial Neural Networks (ANN) | A neural-network pipeline that combines resampling, data-leakage avoidance, feature selection, and interpretability techniques (permutation importance, LIME) for stroke prediction (see the ANN sketch after the table) | Interpretable via LIME; effective resampling and feature selection; high prediction accuracy (95%) | Depends on external-dataset validation and ongoing tuning for better performance |
XGBoost | A gradient boosting algorithm that combines an ensemble of decision trees for stronger results (see the final sketch after the table) | High predictive performance and interpretability; accuracy above 97% | Hyperparameters are difficult to tune; computationally demanding |
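
For concreteness, the sketches below illustrate the listed methods. They are minimal, self-contained examples on synthetic data, not the implementations from the reviewed studies. The first sketch shows logistic regression trained by batch gradient descent on the sigmoid model, compared with and without L2 regularization; the synthetic risk-factor data and every hyperparameter are assumptions for illustration.

```python
# Minimal NumPy sketch: logistic regression via gradient descent,
# with optional L2 regularization. All values are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=1000, l2=0.0):
    """Gradient descent on the (optionally L2-regularized) log loss."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)                  # predicted stroke probability
        grad_w = X.T @ (p - y) / n + l2 * w     # dL/dw plus L2 penalty term
        grad_b = np.mean(p - y)                 # dL/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                   # stand-ins for risk factors
true_w = np.array([1.5, -2.0, 1.0, 0.0])
y = (sigmoid(X @ true_w) > rng.random(500)).astype(float)

for l2 in (0.0, 0.1):                           # without / with regularization
    w, b = fit_logistic(X, y, l2=l2)
    acc = np.mean((sigmoid(X @ w + b) > 0.5) == y)
    print(f"l2={l2}: train accuracy = {acc:.3f}")
```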
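The next sketch covers the classical methods from the table (decision tree, random forest, naive Bayes, k-NN, SVM) in one place, using scikit-learn with 5-fold cross-validation. The synthetic imbalanced dataset and the hyperparameters shown are assumptions, not settings from the cited works.

```python
# Compare the table's classical classifiers on a synthetic, imbalanced
# stand-in for a stroke dataset (about 10% positive class).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Naive Bayes":   GaussianNB(),
    # k-NN and SVM are distance/margin based, so their features are scaled.
    "k-NN (k=5)":    make_pipeline(StandardScaler(), KNeighborsClassifier(5)),
    "SVM (RBF)":     make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:15s} ROC-AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```

ROC-AUC is used instead of raw accuracy because, on a 90/10 split like this one, a classifier that always predicts "no stroke" already scores 90% accuracy.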
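For the deep-learning row, a minimal CNN sketch, assuming Keras (TensorFlow) and 128x128 grayscale slices; the architecture, input shape, and random placeholder images are assumptions, not the networks used in the cited work.

```python
# Small CNN for binary classification of brain-scan slices (illustrative only).
import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),        # e.g. a grayscale CT/MRI slice
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # P(stroke)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Random placeholder images stand in for a real imaging dataset.
X = np.random.rand(32, 128, 128, 1).astype("float32")
y = np.random.randint(0, 2, size=(32,))
model.fit(X, y, epochs=1, batch_size=8, verbose=0)
```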
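For the ANN row, a sketch of two of the listed pipeline pieces: SMOTE resampling applied only to the training split (to avoid data leakage) and permutation importance for interpretability. It assumes the `imbalanced-learn` package; the row's LIME step (from the separate `lime` package) is omitted for brevity, and all hyperparameters are placeholders.

```python
# ANN pipeline sketch: leakage-safe scaling and resampling, MLP training,
# and permutation importance on the held-out split.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=8,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_tr)           # fit on train only: no leakage
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)  # train only

mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                    random_state=0).fit(X_res, y_res)
print("test accuracy:", mlp.score(X_te, y_te))

# Permutation importance ranks features by how much shuffling each one
# degrades held-out performance.
imp = permutation_importance(mlp, X_te, y_te, n_repeats=10, random_state=0)
print("feature importances:", imp.importances_mean.round(3))
```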
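Finally, an XGBoost sketch of gradient boosting for binary classification; the hyperparameters shown are assumptions, not tuned values from the reviewed study.

```python
# Gradient-boosted decision trees with XGBoost on synthetic imbalanced data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = XGBClassifier(
    n_estimators=300,       # number of boosting rounds (trees)
    learning_rate=0.05,     # shrinkage applied to each tree's contribution
    max_depth=4,            # depth of the individual trees
    scale_pos_weight=9,     # rough imbalance correction (neg/pos ratio)
    eval_metric="logloss",
)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```

The sensitivity to hyperparameters noted in the table shows up here directly: `n_estimators`, `learning_rate`, and `max_depth` interact, and in practice they are chosen by cross-validated search rather than set by hand as above.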