Predicting the efficacy of microwave ablation of benign thyroid nodules from ultrasound images using deep convolutional neural networks

Agyekum, Enock Adjei; Wang, Yu-guo; Issaka, Eliasu; Ren, Yong-zhen; Tan, Gongxun; Shen, Xiangjun; Qian, Xiao-qin

doi:10.1186/s12911-025-02989-7

Research
Open access
Published: 11 April 2025

Enock Adjei Agyekum ORCID: orcid.org/0000-0003-0021-8555^1,2^na1,
Yu-guo Wang³^na1,
Eliasu Issaka⁴,
Yong-zhen Ren¹,
Gongxun Tan¹,
Xiangjun Shen² &
…
Xiao-qin Qian^5,6,7

BMC Medical Informatics and Decision Making volume 25, Article number: 161 (2025)

Abstract

Background

Thyroid nodules are frequent in clinical settings, and their diagnosis in adults is growing, with some persons experiencing symptoms. Ultrasound-guided thermal ablation can shrink nodules and alleviate discomfort. Because the degree and rate of lesion absorption vary greatly between individuals, there is no reliable model for predicting the therapeutic efficacy of thermal ablation.

Methods

Five convolutional neural network models (VGG19, ResNet50, EfficientNetB1, EfficientNetB0, and InceptionV3), all pre-trained on ImageNet, were compared for predicting the efficacy of ultrasound-guided microwave ablation (MWA) for benign thyroid nodules using ultrasound data. The patients were randomly assigned to one of two data sets: training (70%) or validation (30%). Accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve (AUC) were all used to assess predictive performance.

Results

In the validation set, fine-tuned EfficientNetB1 performed best, with an AUC of 0.85 and an ACC of 0.79.

Conclusions

The study found that our deep learning model accurately predicts nodules with VRR < 50% after a single MWA session. When thermal therapies compete with surgery, anticipating which nodules will be poor responders provides useful information that may help physicians and patients determine whether thermal ablation or surgery is the preferable option. This was a preliminary study of deep learning, and a gap remains between it and actual clinical application. More in-depth studies should therefore be undertaken to develop deep-learning models that can better support clinical practice. Prospective studies are expected to generate high-quality evidence and improve clinical performance in subsequent research.

Key points

Thermal ablation presents a less invasive approach for addressing thyroid nodules.

Deep learning can predict the efficacy of microwave ablation in patients with thyroid nodules.

The volume reduction rate holds significant importance in the application of microwave ablation.

Background

Thyroid nodules are common in clinical settings, and their detection in adults has improved due to the use of ultrasound (US) [1]. Benign thyroid nodules can induce subjective symptoms or cosmetic issues that are related to nodular volume [1]. Large thyroid nodules can induce local pain and swallowing difficulties [2] due to esophageal and tracheal compression [3], necessitating surgery. While the majority of thyroid nodules are not malignant, surgery remains the preferred therapeutic option [3,4,5]. Thyroidectomy and surgery are the standard, first-line therapies for benign and malignant thyroid nodules, according to the 2015 American Thyroid Association guidelines and the 2016 Chinese expert consensus and guidelines [5]. Patients frequently refuse surgery because of the related complications, such as permanent scarring, recurrent laryngeal nerve injury, and long-term dependency on levothyroxine [5, 6].

US-guided thermal ablation (TA), which includes methods such as microwave ablation (MWA), laser ablation (LA), and radiofrequency ablation (RFA), is a less invasive approach that has been shown to shrink nodules and preserve thyroid function [7] and is recommended in relevant treatment guidelines [8,9,10,11,12]. These techniques work by generating tissue necrosis with heat and are less expensive than traditional treatments like surgery [13,14,15]. The volume reduction ratio (VRR) is a direct metric for assessing the clinical efficacy of ablation [16,17,18]. Individuals experience varying degrees and rates of lesion absorption following ablation. Various factors, including the morphology of the nodules, may impact the effectiveness of US-guided percutaneous TA. Moreover, as indicated in literature reports, between 5% and 30% of patients with nodules have unsatisfactory outcomes, characterized by a VRR below 50% following initial TA and by nodule regrowth [19,20,21,22,23]. Additionally, effective predictive models that analyze the factors influencing the efficacy of benign thyroid nodule ablation are limited. Nodules that decrease to a lesser degree are more prone to recurrence [24]. If thermal therapies are viewed as a viable alternative to surgery, then VRR becomes an important factor in their adoption [25]. Being able to predict the efficacy of thermal treatments could help identify the best candidates for a successful outcome in a single session. Nodules that do not respond adequately will be removed surgically or treated further.

US imaging is chosen over other medical imaging modalities as the initial screening test because it can often reveal thyroid abnormalities [26, 27]. Even for an experienced radiologist, it is sometimes difficult to distinguish the overlapping texture patterns of thyroid tumors in US images [28, 29]. Deep learning (DL) approaches can be used to train a model to recognize statistical patterns and to make data-driven classifications and predictions [30,31,32,33]. A subset of these techniques, known as convolutional neural networks (CNNs), has increased in prominence.

CNNs improve early disease detection by enhancing feature extraction, making early stage identification more sensitive. They do detailed assessments using multimodal inputs, such as images, and employ transfer learning to perform better with less data [34, 35]. Real-time image processing is now possible due to recent advances in computational efficiency, which has accelerated diagnosis. CNNs outperform conventional approaches in image-based applications by automatically learning hierarchical features and capturing complicated patterns, resulting in enhanced sensitivity and accuracy [34, 35]. They are less susceptible to differences in image quality since they require little preprocessing and are computationally efficient due to parameter sharing. Medical imaging modalities have been thoroughly studied with pre-trained DL networks [33,34,35,36,37]. The DL image characterization is split into two groups: pre-trained models and self-design models [33, 38].

Pre-trained models are frequently used to illustrate transfer learning techniques. Transfer learning begins by utilizing patterns discovered while solving a specific problem, as opposed to beginning from scratch [33, 39]. CNNs are increasingly popular because of their enhanced performance and ease of training, and they have shown encouraging results in a number of computer vision tasks [33,34,35,36,37, 40]. Two prior studies demonstrated that classical machine learning models based on clinical and US features of benign thyroid nodules may predict RFA and MWA efficacy [25, 41] and help clinicians develop appropriate treatment strategies. In the preceding investigations, clinical characteristics and US features were first extracted, and feature selection was performed before developing the models. The extraction and selection of features are time-consuming and difficult, and the extracted and selected features degrade classification performance if they are not chosen judiciously. The DL approach removes the need for manual feature extraction.

In this study, we aimed to develop and assess CNN algorithms based on transfer learning utilizing US images of benign thyroid lesions to predict the efficacy of MWA in patients with benign thyroid nodules before surgery. To our knowledge, no research has been undertaken on predicting the efficacy of MWA for patients with benign thyroid nodules utilizing DL models, specifically CNN. We trained our model directly on the US images without first extracting features, which we hope physicians would comprehend and utilize.

Methods

Patients

From March 2021 to May 2023, 168 patients were retrospectively selected at Jiangsu Hospital of Integrated Traditional Chinese and Western Medicine. The enrolment process is illustrated in Fig. 1A. All patients underwent routine 2-dimensional US examinations and US-guided MWA. Based on the 3-month follow-up examination results after US-guided MWA, which served as the ground truth, cases were categorized into VRR ≥ 50% and VRR < 50% groups. The patients were randomly assigned to one of two data sets: training (70%; n = 118 patients with 184 US images) or validation (30%; n = 50 patients with 80 US images). Inclusion and exclusion criteria are detailed in Table 1. This retrospective study was approved by Jiangsu Hospital of Integrated Traditional Chinese and Western Medicine.
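The random assignment described above can be sketched as a split at the patient level, so that all images from one patient land in the same set. This is only an illustration under stated assumptions: the patient IDs, the seed, and the function name `split_patients` are hypothetical, and the source does not describe how the randomization was actually implemented.

```python
import random

def split_patients(patient_ids, train_fraction=0.7, seed=42):
    """Randomly assign patients (not individual images) to training or validation."""
    ids = sorted(patient_ids)      # fixed starting order so the shuffle is reproducible
    rng = random.Random(seed)      # the seed is an arbitrary illustrative choice
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 168 hypothetical patient identifiers, matching the cohort size in the text
patients = [f"P{i:03d}" for i in range(1, 169)]
train_ids, val_ids = split_patients(patients)
print(len(train_ids), len(val_ids))  # 118 50
```

Splitting by patient rather than by image avoids the same nodule appearing in both sets, which would otherwise inflate validation performance.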

Table 1 Inclusion and exclusion criteria

Ultrasound examination

Before MWA, all patients had a routine US examination performed by well-trained technicians using a Philips Q5, Philips iU22 (both Healthcare, Eindhoven, the Netherlands) or a GE LOGIQ s8, LOGIQ E20, LOGIQ E9 (GE Medical Systems, American General) US system with a 5–12 MHz linear array transducer. The patients were placed in the supine position, and longitudinal and transverse continuous scanning were performed to obtain longitudinal and transverse images of the thyroid nodules. This enabled the measurement of thyroid tumor initial volume, shape (regular or irregular), nodule type (cystic, solid, mixed, predominantly cystic), internal echo pattern (even or uneven), and tumor boundary (clear, unclear). The target nodules were identified, and images of the maximum transverse and longitudinal sections of thyroid nodules were saved.

Ultrasound-guided MWA procedure

The MWA device comprised a microwave generator, a flexible coaxial cable with low loss, and an internally cooled antenna akin to a 14 G to 16 G needle. Operating at 2450 MHz, the generator could emit 1–100 W of power either in pulse or continuous mode. This energy caused water molecules within the targeted tissue to oscillate, raising its temperature to cytotoxic levels, and resulting in cell death. Subsequently, the induced coagulative necrosis is degraded by the patient’s immune system. When deciding on the best puncture path and ablation, we considered the size, location, and vascularity of thyroid cystic nodules.

Patients were positioned supine with the neck extended during the procedure. To mitigate pain from needle puncture, lidocaine (2%) (5 ml) was administered into the skin puncture site and thyroid capsule before ablation. Before performing MWA on cystic lesions, the fluid inside the nodules was removed using a 20 ml syringe. The cystic fluid was then mixed with normal saline solution (0.9%) to facilitate suction. MWA was conducted after the nodule had largely depleted its fluid content and had significantly shrunk in size. In this procedure, known as the “barrier” technique, a mixture of normal saline and sodium hyaluronate was injected between the thyroid capsule and the surrounding tissues. This effectively separated the thyroid capsule from the adjacent tissues, creating a liquid barrier zone approximately 5–10 mm wide. This step aims to prevent inadvertent thermal damage to vital neighboring structures such as the laryngeal nerve, carotid artery, and trachea. Additionally, the “artery-first” and “marginal venous ablation” techniques were employed. It is crucial to prioritize ablating the feeding artery to restrict the tumor’s blood supply. Draining veins are commonly present around the periphery of thyroid nodules, which during MWA, contribute to a heat-sink effect, impeding complete ablation of the nodule’s borders.

To tackle this concern, the marginal vein was treated by ablation after its perforation by the electrode tip. This approach reduces bleeding during ablation and mitigates the risk of incomplete ablation of the tumor caused by the heat-sink effect, in which adjacent blood vessels carry thermal energy away from the targeted tissues. The microwave needle was precisely positioned within the nodules under US guidance. The MWA treatment commenced with a power output ranging from 25 to 40 W. The ablation process was conducted under dynamic US monitoring. Special attention was paid to the region of the capsule wall and the solid portion of the nodules during MWA. Before and immediately after US-guided MWA, contrast-enhanced US (CEUS) was performed. The lesion was deemed completely ablated when the defect area covered the edge of the primary lesion by 1–2 mm, the surrounding thyroid parenchyma was filled with a contrast agent, and multi-angle scanning revealed a significant change in echo post-ablation. After treatment, all patients were observed for over two hours while receiving a 30-minute local neck compression.

Follow-up after ultrasound-guided MWA

US assessments were conducted before and three months after ablation. Measurements of nodules’ size and echogenicity were performed, and their volume was calculated. The largest diameter of each nodule was noted (a), along with the other two perpendicular diameters (b and c). Nodule volume was computed using the formula: V = πabc/6. The VRR was determined using the following formula:

$$\text{VRR}(\%) = \frac{\text{pretreatment volume} - \text{follow-up volume}}{\text{pretreatment volume}} \times 100\%$$
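As a worked illustration of the two formulas above, the following minimal sketch computes the ellipsoid volume, the VRR, and the resulting response group. The diameters are hypothetical values in centimetres, and the function names are illustrative rather than taken from the study's code.

```python
import math

def ellipsoid_volume(a, b, c):
    """Nodule volume from three perpendicular diameters: V = pi * a * b * c / 6."""
    return math.pi * a * b * c / 6.0

def vrr_percent(pre_volume, followup_volume):
    """Volume reduction ratio in percent, relative to the pretreatment volume."""
    if pre_volume <= 0:
        raise ValueError("pretreatment volume must be positive")
    return (pre_volume - followup_volume) / pre_volume * 100.0

def response_group(vrr):
    """Binary label used in this study: VRR >= 50% versus VRR < 50%."""
    return "VRR >= 50%" if vrr >= 50.0 else "VRR < 50%"

# Hypothetical diameters (cm) before ablation and at the 3-month follow-up
pre = ellipsoid_volume(3.0, 2.0, 2.0)
post = ellipsoid_volume(2.0, 1.5, 1.5)
vrr = vrr_percent(pre, post)
print(f"{vrr:.1f}", response_group(vrr))  # 62.5 VRR >= 50%
```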

Data preprocessing and data augmentation

The imaging repository was reviewed to locate all thyroid US images, and the outcomes of MWA were correlated with the US imaging data. Before pre-processing, grayscale conversion was applied to all B-mode US images to eliminate unnecessary image channels. In this study, the region of interest (ROI) of the primary thyroid lesion for each US image was segmented by a radiologist with over 8 years of thyroid US examination experience. The rectangular ROIs were cropped from raw US images according to the tumor segmentation mask, resized to 224 × 224 pixels, and normalized. Previous research has demonstrated the benefits of this frame selection strategy, which is why it was chosen [42, 43]. There are several benefits of using rectangular cropped ROIs as a frame selection strategy as opposed to traditional methods. First of all, it provides targeted focus, enabling the exact isolation of important features within an image. This targeted strategy reduces computing load by processing only the essential data, which leads to shorter training times and less resource consumption.

Model performance is enhanced by cropped ROIs because they increase learning accuracy in tasks like classification. When sized appropriately, they can maintain contextual relevance, which is beneficial for tasks requiring spatial dependence. They are also adaptable, enabling dynamic alterations in reaction to specific objects of interest. Moreover, cropped ROIs enhance input data, which enhances CNN performance and aids models in focusing on critical areas. Comparably simple to apply, this method can be combined with data augmentation techniques to improve training datasets. Compared to full image analysis, feature extraction, and segmentation techniques, cropped ROIs simplify the preprocessing stage and provide targeted and effective input, improving model performance and computing efficiency in image analysis applications. Contextual information, which may be crucial for interpreting spatial relationships in an image, may be lost when employing rectangular cropped ROIs. The fixed size and shape of ROIs may not be sufficient for objects with irregular shapes, and the procedure relies heavily on the efficacy of ROI selection, which can be subjective and time-consuming. Moreover, the model may not be able to generalize to new data due to overfitting from training on certain ROIs. To avoid insufficient mass extraction and to capture some of the surrounding area near the thyroid lesion, which could yield important information, a pixel border was added around the lesion zone in this study. These images were then employed as input for the DL models.

The training set was utilized to optimize the model’s parameters. Given the limited training data available in this study and the need to mitigate overfitting and sample imbalances, data augmentation was employed [44, 45] (Fig. 1B and C). This involved randomly flipping the input image horizontally and vertically and randomly adjusting the image height, width, zoom, and rotation, each by a factor of 0.2. This process ensures that the model focuses on identifying thyroid lesions amidst potential noise sources [45]. We also utilized pre-trained models to leverage learned features from balanced datasets, which enhances model performance on our datasets. Additionally, all augmented images were resized to 224 × 224 pixels to standardize the scale. This strategy, proven effective, helps prevent network overfitting and the memorization of exact training image details [46].

All preprocessing steps were conducted in Python (version 3.10.12) using the Keras preprocessing utilities (https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing).

Model construction

A CNN comprises various layers, such as input and output layers, along with convolution, pooling, and fully connected layers. The input layer receives raw data, while convolution layers extract features through convolution operations. Pooling layers further reduce parameter scale, and fully connected layers amalgamate all features for classification. For constructing models to predict the efficacy of MWA for benign thyroid nodules using US images, CNN architectures including VGG19, ResNet50, EfficientNetB1, EfficientNetB0, and InceptionV3 [47,48,49,50] with pre-training on ImageNet (http://www.image-net.org/) were employed in this study.

VGG19, ResNet50, EfficientNetB1, EfficientNetB0, and InceptionV3 used in this study are all CNN models, and numerous studies have confirmed their ability to efficiently perform classification tasks using US images [42, 43, 51, 52]. These pre-trained models were obtained from an open-access library (Keras Applications, available at https://www.tensorflow.org/api_docs/python/tf/keras/applications). Transfer learning was utilized to help fine-tune all weights and biases and to reduce training time significantly. In our models, parameters pre-trained on the ImageNet dataset were loaded, and the models were then retrained on our dataset. Finally, the original classifier for the ImageNet classes was replaced by a binary classifier so that the output was a class probability vector ranging from 0 to 1 as the prediction result for each patient.

The VRR group of each thyroid lesion three months after MWA treatment (VRR < 50% or VRR ≥ 50%) was represented in one-hot encoding as the label. Starting from the ImageNet pre-trained weights, images of rectangular ROIs were fed into the network as input during the training phase to update the model parameters via backpropagation. The network’s outputs, the predicted VRR group, served as the classification results, and the loss function was determined by computing the cross-entropy between the outputs and labels. With a batch size of 32, the Adam optimizer was employed to adjust the model parameters, with a learning rate set at 0.001. We utilized TensorFlow version 2.10.0 and Keras version 2.10.0 for the implementation of the training and validation code. Training was limited to 10 epochs to reduce the risk of overfitting. A flowchart of the study is shown in Fig. 2.
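For reference, the cross-entropy loss mentioned above can be written out for the two-class, one-hot case. This is the standard textbook form and is given here as an assumption, since the text states only that cross-entropy between outputs and labels was used; the symbols N, y, and p are introduced for illustration.

$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k \in \{0,1\}} y_{i,k}\,\log p_{i,k}$$

Here N is the number of images in a batch (32 in this study), y_{i,k} is the one-hot label of image i for class k (VRR ≥ 50% or VRR < 50%), and p_{i,k} is the predicted probability of that class.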

Statistical analysis

Statistical analysis was conducted utilizing Python (version 3.10.12) and IBM SPSS Statistics for Windows version 26.0 (Armonk, New York, USA). To compare differences in categorical characteristics, either Pearson’s chi-square or Fisher’s exact test was employed. For continuous factors with a normal distribution, the independent sample t-test was utilized, while for those without a normal distribution, the Mann-Whitney U test was applied. A two-sided P-value < 0.05 was considered indicative of statistically significant differences.

The diagnostic capability of the DL models in distinguishing nodules with VRR < 50% or VRR ≥ 50% was demonstrated using the receiver operating characteristic (ROC) curve. This curve plots the true positive rate (sensitivity) against the false positive rate (1 – specificity), and the AUC was also calculated. Sensitivity, specificity, accuracy, negative predictive value (NPV), and positive predictive value (PPV) of each prediction model were computed using Scikit-learn version 1.2 [53]. Decision curve analysis (DCA) was performed using R software (version 3.6.1, https://www.r-project.org).
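To make the ROC quantities concrete, the sketch below computes one ROC point at a given threshold and the AUC through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case). It is a plain-Python illustration with hypothetical labels and scores, not the Scikit-learn routine used in the study; label 1 is assumed to mean VRR < 50%.

```python
def roc_point(labels, scores, threshold):
    """Return (false positive rate, true positive rate) at one decision threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return fp / (fp + tn), tp / (tp + fn)

def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = VRR < 50%) and predicted probabilities
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
fpr, tpr = roc_point(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(round(fpr, 3), round(tpr, 3), round(auc, 3))  # 0.333 0.667 0.889
```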

Evaluation metrics employed in this study

The models’ performance on both the training and validation datasets was evaluated using metrics such as the AUC, confusion matrix, sensitivity, specificity, NPV, PPV, accuracy, DCA, and other standard clinical statistics. In the prediction of VRR < 50%, the confusion matrix provides a concise overview of the model’s performance, juxtaposing its predictions against the real outcomes. The ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 – specificity), and the AUC can be calculated. AUC is a metric that is better suited to an imbalanced dataset. Accuracy, in the context of VRR < 50% prediction, gauges the model’s overall correctness in predicting VRR < 50% or VRR ≥ 50% in individuals with thyroid nodules undergoing TA. The accuracy can be calculated from Eq. 1 [54]. Recall or sensitivity (Eq. 2) for VRR < 50% prediction represents the ratio of correctly predicted cases of VRR < 50% out of all actual cases of VRR < 50%, that is, the true positive rate, showing a model’s capacity to correctly identify all patients who have VRR < 50% [55].

In the context of VRR < 50% prediction, specificity (Eq. 3) is the true negative rate: a model’s capacity to correctly identify those who do not have VRR < 50%. These evaluation metrics play a crucial role in assessing the efficiency and effectiveness of DL models. In Eqs. 1–5, TP represents the number of true positive predictions (nodules with VRR < 50% correctly predicted as VRR < 50%), FP represents the number of false positive predictions (nodules with VRR ≥ 50% incorrectly predicted as VRR < 50%), TN represents the number of true negative predictions (nodules with VRR ≥ 50% correctly predicted as VRR ≥ 50%), and FN represents the number of false negative predictions (nodules with VRR < 50% incorrectly predicted as VRR ≥ 50%). These metrics provide valuable insights into the accuracy and performance of VRR predictions.

$$\text{accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1}$$

$$\text{sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{specificity} = \frac{TN}{TN + FP} \tag{3}$$

$$\text{PPV} = \frac{TP}{TP + FP} \tag{4}$$

$$\text{NPV} = \frac{TN}{TN + FN} \tag{5}$$
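The five metrics can be computed directly from the confusion-matrix counts, as in the minimal sketch below. The counts are hypothetical and are not taken from Table 4; NPV is computed with its standard definition, TN / (TN + FN), and the function name is illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Return NaN instead of raising when a denominator is empty
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Hypothetical counts for 80 validation images (positive class = VRR < 50%)
metrics = classification_metrics(tp=28, fp=12, tn=34, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```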

We chose these evaluation metrics because they have proven useful in most clinical research involving the application of artificial intelligence models [41, 42, 56, 57].

Results

Clinical characteristics

A total of 168 patients (with 264 US images) with benign thyroid nodules were enrolled, with an average age of 42.26 ± 12.66 years, a range of 13–85 years, and a male-to-female ratio of 43:125. At 3 months following treatment, 98 patients with 150 images (58.3%) achieved VRR ≥ 50% (mean reduction 80%), and 70 patients with 114 images (41.7%) achieved VRR < 50% (mean reduction 21.39%).

There were 80 patients with solid nodules, with 55 (68.75%) having VRR < 50% and 25 (31.25%) having VRR ≥ 50%. Of the 15 patients with cystic nodules, 7 (46.67%) had VRR < 50%, and 8 (53.3%) had VRR ≥ 50%. Out of 69 individuals with cystic nodules, 7 (10.1%) had VRR < 50%, while 62 (89.9%) had VRR ≥ 50%. There were four patients with mixed nodules, with one (25%) having VRR < 50% and three (75%) having VRR ≥ 50%. Most individuals with cystic, predominantly cystic, or mixed nodules had a VRR of ≥ 50%, in contrast to those with solid nodules. Table 2 displays the clinical data for the VRR ≥ 50% and VRR < 50% groups. The initial volume before ablation, shape, and nodule type differed significantly between the two groups (P < 0.05).

Table 2 Characteristics of patients and treated nodules divided by 3-month reduction ≥ or < 50%

Diagnostic performance of the models

In the validation cohort, the AUCs for InceptionV3, ResNet50, VGG19, EfficientNetB0, and EfficientNetB1 were 0.59, 0.76, 0.81, 0.86, and 0.84, respectively. Table 3 displays detailed information about the prediction performance of the DL models. Figure 3 shows the ROC curves of the five DL models. In the validation cohort, the accuracies were 0.54 for the InceptionV3 model, 0.63 for the ResNet50 model, 0.73 for the VGG19 model, 0.74 for the EfficientNetB0 model, and 0.79 for the EfficientNetB1 model; the sensitivities were 0.69, 0.80, 0.78, 0.84, and 0.87; and the specificities were 0.34, 0.40, 0.66, 0.60, and 0.67, respectively. Table 4 displays the confusion matrices depicting the counts of true-positive, false-positive, true-negative, and false-negative outcomes for the classification models. Figures 4 and 5 exhibit the accuracy and loss curves for the five DL models. EfficientNetB1 achieved the highest accuracy, sensitivity, and specificity among the five DL models, although EfficientNetB0 had a marginally higher AUC (0.86 vs. 0.84). Additionally, Fig. 6 illustrates the use of DCA to assess the clinical utility of these models.

Table 3 Predictive performance of the deep learning models for the training and validation cohorts

Table 4 Confusion Matrices for the deep learning models

To further improve the performance of the EfficientNetB1, a small number of layers within EfficientNetB1 were fine-tuned. To adjust the model, its trainable attribute was initially set to true, allowing all previously frozen layers to be unfrozen. Given the relatively limited training dataset, all layers except the last 15 were then frozen again to facilitate training. Subsequently, the model underwent recompilation to integrate these layer modifications. Considering the fine-tuning process, a learning rate 10 times lower (1e-4) was applied to prevent excessive alterations to previously learned weights. The fine-tuned model underwent an additional 10 epochs of training.
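The freeze-and-unfreeze logic described above can be sketched without any deep-learning dependency. In the sketch below, `StubLayer` is a hypothetical stand-in for a Keras layer (only its `trainable` flag matters), and the 40-layer list is illustrative; it is assumed that the same loop would be applied to the `layers` list of the real EfficientNetB1 model, which has many more layers.

```python
from dataclasses import dataclass

@dataclass
class StubLayer:
    """Hypothetical stand-in for a Keras layer; only `trainable` is modelled."""
    name: str
    trainable: bool = False

def unfreeze_last_layers(layers, n_trainable=15):
    """Unfreeze every layer, then re-freeze all but the last `n_trainable` layers."""
    for layer in layers:
        layer.trainable = True
    cutoff = max(len(layers) - n_trainable, 0)
    for layer in layers[:cutoff]:
        layer.trainable = False
    return [layer.name for layer in layers if layer.trainable]

# 40 illustrative layers, all frozen at the start
backbone = [StubLayer(f"layer_{i}") for i in range(40)]
trainable = unfreeze_last_layers(backbone, n_trainable=15)
print(len(trainable), trainable[0])  # 15 layer_25

# Learning rate for fine-tuning: 10 times lower than the initial 0.001
fine_tune_lr = 1e-3 / 10
```

With a real Keras model, the model has to be recompiled after the `trainable` flags change, as the text notes, so that the optimizer only updates the unfrozen layers at the lower learning rate.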

In the validation cohort, the AUC of EfficientNetB1 after fine-tuning was found to be 0.85 (Table 3 and Fig. 3), the accuracy was 0.79; the sensitivity was 0.82; and the specificity was 0.74. The classification confusion matrices that report the number of true-positive, false-positive, true-negative, and false-negative results for the model are shown in Fig. 7 and Table 4. The accuracy and loss trends of the EfficientNetB1 DL model post fine-tuning are depicted in Fig. 8, while the DCA results are illustrated in Fig. 6. Post fine-tuning, the DCA showed that the EfficientNetB1 model yielded substantial overall net benefit, surpassing both the treat-all and treat-none approaches (Fig. 6). Furthermore, compared to InceptionV3, VGG19, ResNet50, EfficientNetB0, and EfficientNetB1, the fine-tuned EfficientNetB1 model demonstrated superior overall net benefit, making it more advantageous than other DL models (Fig. 6).

Discussion

Patients frequently refuse surgery due to the potential risks. As a result, less invasive methods, such as TA including MWA, are increasingly used to treat thyroid nodules. According to research, ablation techniques carry a lower risk of complications and adverse effects compared to conventional surgical methods [1, 58,59,60,61,62]. When it comes to applying TA, the VRR holds significant importance. In a study involving 104 participants by [60], it was found that after 12 months, 31 individuals (29.8%) exhibited a VRR of less than 50.0%, and 39 patients (37.5%) experienced nodule regrowth. The research indicated that a lower VRR following initial ablation was associated with a higher probability of nodule recurrence.

DL has made significant progress in recent times, facilitating automatic description and understanding of complex data [63,64,65]. In this study, five deep CNNs pre-trained with ImageNet (http://www.image-net.org/) were compared to identify the best-performing approach: VGG19 (ACC: 0.73; AUC: 0.81), ResNet50 (ACC: 0.63; AUC: 0.76), EfficientNetB1 (ACC: 0.79; AUC: 0.84), EfficientNetB0 (ACC: 0.74; AUC: 0.86), and InceptionV3 (ACC: 0.54; AUC: 0.59). The EfficientNetB1 DL algorithm performed best in the validation cohort and was therefore chosen for fine-tuning.

The EfficientNetB1’s higher performance in this research may be attributed to its compound scaling, which properly balances depth, width, and resolution. It uses depthwise separable convolutions to minimize parameters while retaining effective feature extraction [49]. The advanced architecture, including squeeze-and-excitation layers, enhances representational power [66], and its pre-training on large datasets aids in generalization for specific tasks [49]. These elements work together to enable the efficient and successful processing of complicated medical images such as medical ultrasonography. The fine-tuned EfficientNetB1 DL algorithm had an ACC of 0.79 and AUC of 0.85, suggesting a good performance for identifying those nodules that will have VRR < or ≥ 50% at 3 months after one MWA treatment session.

DL-based pre-trained models that perform automatic extraction of underlying features provide a more efficient technique for capturing underlying tissue properties for various initial diagnoses and assessments. Our DL model for predicting the efficacy of MWA for benign thyroid nodules is useful for real-time medical systems since it provides rapid diagnostics, delivering results in minutes and reducing turnaround times when compared to traditional methods. With an AUC of 0.85 and an accuracy of 0.79, the model improves diagnostic confidence and may assist in identifying the patients with thyroid lesions who are best suited to a successful MWA session. Thyroid nodules that do not respond well will be surgically removed or treated further.

The DL models developed in this study can be readily integrated into existing clinical procedures, improving efficiency by allowing clinicians to obtain predictions during patient evaluations. This method can make fast identification of thyroid nodules suitable for ablation feasible, enabling timely interventions, which are a crucial component of effective treatment and better patient outcomes. Because of its ability to learn continuously, the model can also be updated with new data, ensuring that it remains effective. DL models require fewer resources to implement and are simpler to use; the clinical physician only needs to input the patient’s US image. The method automatically completes the process from DL feature extraction to classification, which is convenient, highly reproducible, and easy to promote, and it has promising applications. Overall, this DL model can considerably improve the efficiency and effectiveness of thyroid nodule evaluations in real-time medical settings.

Negro et al. [25] used a dataset of 402 cytologically benign thyroid nodules treated with RFA from six Italian institutions to train a machine-learning algorithm. They reported an accuracy of 0.85, whereas our top-performing DL algorithm had an ACC of 0.79 and an AUC of 0.85. Their larger patient cohort may have contributed to the higher ACC. However, their research focuses on RFA treatment, which differs slightly from MWA. Also, rather than US images, they predict VRR < or ≥ 50% using clinical parameters such as baseline nodule volume, echo structure, macrocalcifications, and vascularity.

Li et al. [41] employed six machine-learning algorithms to build models that predicted the therapeutic effect of benign thyroid nodule ablation. With an accuracy of 0.79 and an AUC of 0.86, XGBoost outperformed the others. Our top-performing DL algorithm, the fine-tuned EfficientNetB1, had an accuracy of 0.79 and an AUC of 0.85, which is consistent with that study. However, in their study, features were first extracted from the US image before feature selection and model construction, and feature extraction and selection are both time-consuming and complex. We trained our model directly on US images without first extracting features, which we hope clinicians will find easier to understand and use. The DL technique used in this study excels at image-based tasks because it automatically learns hierarchical features and captures complicated patterns, improving sensitivity and accuracy. CNNs require little pre-processing and are computationally efficient owing to parameter sharing, and they are relatively robust to fluctuations in image quality.
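
The point about parameter sharing can be made concrete with a small parameter count. This is a sketch under stated assumptions rather than a description of the networks used here: the 240 × 240 RGB input size and the 32 filters of size 3 × 3 are illustrative choices, and `conv2d_params` and `dense_params` are hypothetical helper names.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights plus one bias per output channel; independent of image size."""
    return (kernel_h * kernel_w * in_channels + 1) * out_channels


def dense_params(in_features, out_features):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_features + 1) * out_features


# Assumed sizes for illustration: a 240x240 RGB input and 32 filters of size 3x3.
height, width, channels, filters = 240, 240, 3, 32

# The same 3x3 kernels are reused at every image position.
conv = conv2d_params(3, 3, channels, filters)

# A fully connected layer producing an output of the same size (a 240x240 map
# with 32 channels) needs a separate weight for every input-output pair.
dense = dense_params(height * width * channels, height * width * filters)

ratio = dense / conv
```

The convolutional layer needs fewer than a thousand parameters, whereas the equivalent fully connected layer would need hundreds of billions, which illustrates why convolutional architectures are practical for image inputs.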

Moreover, DCA was employed to assess the clinical usefulness of these DL algorithms (Fig. 6). The solid black line represents the negative scenario, in which no patient is assumed to have a VRR < 50% and no patient receives intervention or treatment, so the net benefit remains at zero. Conversely, the solid grey line represents the positive scenario, showing the net benefit when all patients are assumed to have a VRR < 50% and all undergo treatment or intervention. Based on the prevalence of VRR < 50% among patients subjected to TA, the reasonable range of threshold probabilities was set at 0.3 to 0.99. Across this entire range, all DL-based algorithms showed higher net benefit than the two extreme scenarios (negative and positive lines). Throughout almost the entire range of threshold probabilities, the fine-tuned EfficientNetB1 DL algorithm exhibited the highest net benefit in both the training and validation cohorts (Fig. 6).
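
For readers unfamiliar with DCA, the quantity on the vertical axis of a decision curve can be sketched as follows. This uses the standard net-benefit definition, TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability; it is an assumption that the study's curves were computed this way. The labels, probabilities, and function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every case with predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every case regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical labels (1 = VRR < 50%) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

# Each row: (threshold, model net benefit, treat-all net benefit).
# The treat-none strategy has a net benefit of zero at every threshold.
curve = [
    (pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
    for pt in (0.3, 0.5, 0.7)
]
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies, which is the comparison that a decision curve displays.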

In DL, the loss function is crucial for model learning, and the primary aim of training is to minimize the loss. If the loss fluctuates instead of diminishing, the model may not be learning effectively. Additionally, if the loss decreases in the training set but stagnates or rises in the validation set, this can signify overfitting. In this study, we used categorical cross-entropy as the loss for the DL models. Figures 4, 5, and 8 show that in both the training and validation cohorts, accuracy increased while loss decreased, indicating that the DL models learned effectively.
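
As a reminder of what categorical cross-entropy measures, the sketch below computes it by hand for a two-class problem. It is illustrative only: the class order, the probabilities, and the helper name `categorical_cross_entropy` are assumptions, and in practice the loss would be supplied by the DL framework.

```python
import math


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean over samples of -sum_k y_k * log(p_k); y_true is one-hot encoded."""
    total = 0.0
    for truth, probs in zip(y_true, y_pred):
        # eps guards against log(0) when a predicted probability is exactly zero.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(truth, probs))
    return total / len(y_true)


# Two classes (assumed order: [VRR >= 50%, VRR < 50%]); values are illustrative.
y_true = [[1, 0], [0, 1], [0, 1]]
confident = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
uncertain = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

loss_confident = categorical_cross_entropy(y_true, confident)
loss_uncertain = categorical_cross_entropy(y_true, uncertain)
```

Predictions that place more probability on the correct class give a lower loss than uninformative 50/50 predictions, for which the loss equals ln 2. Tracking this value on the training and validation sets across epochs gives the loss curves discussed above.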

This study has some limitations. First, because it was retrospective, case selection bias could have influenced the findings. Second, the DL models for predicting the efficacy of ablation were developed and validated in a single hospital using only grayscale US images. A small sample size and single-center data can reduce model generalizability: small datasets with limited variability may not accurately reflect the population, and single-center data may carry biases related to demographics and clinical procedures, limiting applicability in other settings. Third, the lack of external validation limits the assessment of performance across settings. Together, these factors limit confidence in the models’ clinical usefulness in practice. A multicenter study with a larger sample size is planned. In future investigations, we also plan to examine how the accuracy of the CNN models’ predictions differs between nodule types.

Conclusion

The development of DL-based algorithms has attracted much attention for the analysis and classification of medical images. A DL model was developed to predict the efficacy of benign thyroid nodule ablation. The present study shows that our DL model can identify nodules that will have a VRR < 50% after one session of MWA. Indeed, where thermal therapies compete with surgery, predicting which nodules will be poor responders provides valuable information that may help physicians and patients decide whether TA or surgery is the better option. This was a preliminary exploration of DL, and a gap remains before actual clinical application. Therefore, more in-depth research should be conducted to develop DL models that can serve clinics more accurately.

Data availability

The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding authors.

Abbreviations

ACC: Accuracy
AI: Artificial intelligence
AUC: Area under the curve
DL: Deep learning
DCA: Decision curve analysis
CNN: Convolutional neural network
LA: Laser ablation
MWA: Microwave ablation
NPV: Negative predictive value
PPV: Positive predictive value
ROI: Regions of interest
ROC: Receiver operating characteristic
SEN: Sensitivity
SPEC: Specificity
TA: Thermal ablation
US: Ultrasound
VRR: Volume reduction rate

References

Zheng B, Wang J, Ju J, Wu T, Tong G, Ren J. Efficacy and safety of cooled and uncooled microwave ablation for the treatment of benign thyroid nodules: a systematic review and meta-analysis. Endocrine. 2018;62(2):307–17. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s12020-018-1693-2.
Korkusuz Y, et al. Thermal ablation of thyroid nodules: are radiofrequency ablation, microwave ablation and high intensity focused ultrasound equally safe and effective methods? Eur Radiol. 2018;28(3):929–35. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00330-017-5039-x.
Jin H, Fan J, Liao K, He Z, Li W, Cui M. A propensity score matching study between ultrasound-guided percutaneous microwave ablation and conventional thyroidectomy for benign thyroid nodules treatment. Int J Hyperthermia. 2018;35(1):232–38. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/02656736.2018.1492028.
Sorrenti S, et al. Iodine: its role in thyroid hormone biosynthesis and beyond. Nutrients. 2021;13(12):4469. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/nu13124469.
Bo X-W, Lu F, Xu H-X, Sun L-P, Zhang K. Thermal ablation of benign thyroid nodules and papillary thyroid microcarcinoma. Front Oncol. 2020;10:580431. https://doiorg.publicaciones.saludcastillayleon.es/10.3389/fonc.2020.580431.
Xiaoyin T, et al. Risk assessment and hydrodissection technique for radiofrequency ablation of thyroid benign nodules. J Cancer. 2018;9(17):3058–66. https://doiorg.publicaciones.saludcastillayleon.es/10.7150/jca.26060.
Ding J, Wang D, Zhang W, Xu D, Wang W. Ultrasound-guided radiofrequency and microwave ablation for the management of patients with benign thyroid nodules: systematic review and meta-analysis. Ultrasound Q. 2023;39(1):61. https://doiorg.publicaciones.saludcastillayleon.es/10.1097/RUQ.0000000000000636.
Kim J, et al. 2017 thyroid radiofrequency ablation guideline: Korean society of thyroid radiology. Korean J Radiol. 2018;19(4):632–55. https://doiorg.publicaciones.saludcastillayleon.es/10.3348/kjr.2018.19.4.632.
Papini E, Monpeyssen H, Frasoldati A, Hegedüs L. 2020 European thyroid association clinical practice guideline for the use of image-guided ablation in benign thyroid nodules. Eur Thyroid J. 2020;9(4):172–85. https://doiorg.publicaciones.saludcastillayleon.es/10.1159/000508484.
Gharib H, et al. American association of clinical endocrinologists, American college of endocrinology, and associazione medici endocrinologi medical guidelines for clinical practice for the diagnosis and management of thyroid nodules–2016 update. Endocr Pract Off J Am Coll Endocrinol Am Assoc Clin Endocrinol. 2016;22(5):622–39. https://doiorg.publicaciones.saludcastillayleon.es/10.4158/EP161208.GL.
Cesareo R, et al. Efficacy of radiofrequency ablation in autonomous functioning thyroid nodules. A systematic review and meta-analysis. Rev Endocr Metab Disord. 2019;20(1):37–44. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s11154-019-09487-y.
Aysan E, Idiz UO, Akbulut H, Elmas L. Single-session radiofrequency ablation on benign thyroid nodules: a prospective single center study: radiofrequency ablation on thyroid. Langenbecks Arch Surg. 2016;401(3):357–63. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s00423-016-1408-1.
Chung SR, Baek JH, Choi YJ, Lee JH. Management strategy for nerve damage during radiofrequency ablation of thyroid nodules. Int J Hyperth Off J Eur Soc Hyperthermic Oncol North Am Hyperth Group. 2019;36(1):204–10. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/02656736.2018.1554826.
Feng B, et al. Ultrasound-guided percutaneous microwave ablation of benign thyroid nodules: experimental and clinical studies. Eur J Endocrinol. 2012;166(6):1031–37. https://doiorg.publicaciones.saludcastillayleon.es/10.1530/EJE-11-0966.
Bernardi S, et al. Radiofrequency ablation compared to surgery for the treatment of benign thyroid nodules. Int J Endocrinol. 2014;2014:934595.
Mauri G, et al. Image-guided thyroid ablation: proposal for standardization of terminology and reporting criteria. Thyroid Off J Am Thyroid Assoc. 2019;29(5):611–18. https://doiorg.publicaciones.saludcastillayleon.es/10.1089/thy.2018.0604.
Sinclair CF, et al. General principles for the safe performance, training, and adoption of ablation techniques for benign thyroid nodules: an american thyroid association statement. Thyroid Off J Am Thyroid Assoc. 2023;33(10):1150–70. https://doiorg.publicaciones.saludcastillayleon.es/10.1089/thy.2023.0281.
La O, et al. Radiofrequency ablation and related ultrasound-guided ablation technologies for treatment of benign and malignant thyroid disease: an international multidisciplinary consensus statement of the American Head and Neck Society Endocrine Surgery Section with the Asia Pacific Society of Thyroid Surgery, Associazione Medici Endocrinologi, British Association of Endocrine and Thyroid Surgeons, European Thyroid Association, Italian Society of Endocrine Surgery Units, Korean Society of Thyroid Radiology, Latin American Thyroid Society, and Thyroid Nodules Therapies Association. Head Neck. 2022;44(3). https://doiorg.publicaciones.saludcastillayleon.es/10.1002/hed.26960.
Negro R, Salem TM, Greco G. Laser ablation is more effective for spongiform than solid thyroid nodules. A 4-year retrospective follow-up study. Int J Hyperth Off J Eur Soc Hyperthermic Oncol North Am Hyperth Group. 2016;32(7):822–28. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/02656736.2016.1212279.
Deandrea M, et al. Long-term efficacy of a single session of RFA for benign thyroid nodules: a longitudinal 5-year observational study. J Clin Endocrinol Metab. 2019;104(9):3751–56. https://doiorg.publicaciones.saludcastillayleon.es/10.1210/jc.2018-02808.
Papini E, et al. Long-term efficacy of ultrasound-guided laser ablation for benign solid thyroid nodules. Results of a three-year multicenter prospective randomized trial. J Clin Endocrinol Metab. 2014;99(10):3653–59. https://doiorg.publicaciones.saludcastillayleon.es/10.1210/jc.2014-1826.
Valcavi R, Riganti F, Bertani A, Formisano D, Pacella CM. Percutaneous laser ablation of cold benign thyroid nodules: a 3-year follow-up study in 122 patients. Thyroid Off J Am Thyroid Assoc. 2010;20(11):1253–61. https://doiorg.publicaciones.saludcastillayleon.es/10.1089/thy.2010.0189.
Magri F, et al. Laser photocoagulation therapy for thyroid nodules: long-term outcome and predictors of efficacy. J Endocrinol Invest. 2020;43(1):95–100. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s40618-019-01085-8.
Negro R, Greco G. Unfavorable outcomes in solid and spongiform thyroid nodules treated with laser ablation. A 5-year follow-up retrospective study. Endocr Metab Immune Disord Drug Targets. 2019;19(7):1041–45. https://doiorg.publicaciones.saludcastillayleon.es/10.2174/1871530319666190206123156.
Negro R, et al. Machine learning prediction of radiofrequency thermal ablation efficacy: a new option to optimize thyroid nodule selection. Eur Thyroid J. 2020;9(4):205–12. https://doiorg.publicaciones.saludcastillayleon.es/10.1159/000504882.
Yadav N, Dass R, Virmani J. A systematic review of machine learning based thyroid tumor characterisation using ultrasonographic images. J Ultrasound. 2024;27(2):209–24. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s40477-023-00850-z.
Zahir ST, Vakili M, Ghaneei A, Sharahjin NS, Heidari F. Ultrasound assistance in differentiating malignant thyroid nodules from benign ones. J Ayub Med Coll Abbottabad JAMC. 2016;28(4):644–49.
Yadav N, Dass R, Virmani J. Despeckling filters applied to thyroid ultrasound images: a comparative analysis. Multimed Tools Appl. 2022;81(6):8905–37. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s11042-022-11965-6.
Dass R. Speckle noise reduction of ultrasound images using BFO cascaded with wiener filter and discrete wavelet transform in homomorphic region. Procedia Comput Sci. 2018;132:1543–51. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.procs.2018.05.118.
Verma A, Singh VP. Design, analysis and implementation of efficient deep learning frameworks for brain tumor classification. Multimed Tools Appl. 2022;81(26):37541–67. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s11042-022-13545-0.
Cheng X, Kadry S, Meqdad MN, Crespo RG. CNN supported framework for automatic extraction and evaluation of dermoscopy images. J Supercomput. 2022;78(15):17114–31. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s11227-022-04561-w.
Verma A, Singh VP. HSADML: hyper-sphere angular deep metric based learning for brain tumor classification. In: Mudenagudi U, Nigam A, Sarvadevabhatla RK, Choudhary A, editors. Proceedings of the Satellite Workshops of ICVGIP 2021, vol. 924. Lecture Notes in Electrical Engineering, vol. 924. Singapore: Springer Nature Singapore; 2022. p. 105–20. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/978-981-19-4136-8_8.
Yadav N, Dass R, Virmani J. Deep learning-based CAD system design for thyroid tumor characterization using ultrasound images. Multimed Tools Appl. 2023;83(14):43071–113. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s11042-023-17137-4.
Litjens G, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88.
Rastegari M, Ordonez V, Redmon J, Farhadi A. XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe B, Matas J, Sebe N, Welling M, editors. Computer Vision – ECCV 2016. Cham: Springer International Publishing; 2016. p. 525–42. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/978-3-319-46493-0_32.
Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL. Artificial intelligence in radiology. Nat Rev Cancer. 2018;18(8):500–10. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41568-018-0016-5.
Jin KH, McCann MT, Froustey E, Unser M. Deep convolutional neural network for inverse problems in imaging. IEEE Trans Image Process. 2017;26(9):4509–22. https://doiorg.publicaciones.saludcastillayleon.es/10.1109/TIP.2017.2713099.
Bianco S, Celona L, Napoletano P, Schettini R. On the use of deep learning for blind image quality assessment. Signal Image Video Process. 2018;12(2):355–62. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s11760-017-1166-8.
Fu Y, Aldrich C. Froth image analysis by use of transfer learning and convolutional neural networks. Miner Eng. 2018;115:68–78. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.mineng.2017.10.005.
Anwar SM, Majid M, Qayyum A, Awais M, Alnowami M, Khan MK. Medical image analysis using convolutional neural networks: a review. J Med Syst. 2018;42(11):226. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s10916-018-1088-1.
Li Z, et al. A prognostic model for thermal ablation of benign thyroid nodules based on interpretable machine learning. Front Endocrinol. 2024;15:1433192.
Liu H, et al. Deep learning radiomics based prediction of axillary lymph node metastasis in breast cancer. NPJ Breast Cancer. 2024;10(1):1–9. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41523-024-00628-4.
Zheng X, et al. Deep learning radiomics can predict axillary lymph node status in early-stage breast cancer. Nat Commun. 2020;11(1):1236. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41467-020-15027-z.
Özdemir Ö, Sönmez EB. Attention mechanism and mixup data augmentation for classification of COVID-19 computed tomography images. J King Saud Univ Comput Inf Sci. 2022;34(8):6199–207. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jksuci.2021.07.005.
Roth HR, et al. Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE Trans Med Imaging. 2016;35(5):1170–81. https://doiorg.publicaciones.saludcastillayleon.es/10.1109/TMI.2015.2482920.
Kayalibay B, Jensen G, van der Smagt P. CNN-based segmentation of medical imaging data. 2017. arXiv:1701.03056. http://arxiv.org/abs/1701.03056. Accessed 28 May 2024.
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2015. arXiv:1409.1556. http://arxiv.org/abs/1409.1556. Accessed 28 May 2024.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016; p. 770–78. https://doiorg.publicaciones.saludcastillayleon.es/10.1109/CVPR.2016.90.
Tan M, Le Q. EfficientNet: rethinking model scaling for convolutional neural networks. In: Proceedings of the 36th International Conference on Machine Learning. PMLR; 2019. p. 6105–14. https://proceedings.mlr.press/v97/tan19a.html. Accessed 28 May 2024.
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016, p. 2818–26. https://doiorg.publicaciones.saludcastillayleon.es/10.1109/CVPR.2016.308.
Yu J, et al. Lymph node metastasis prediction of papillary thyroid carcinoma based on transfer learning radiomics. Nat Commun. 2020;11(1):4807. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41467-020-18497-3.
Deng C, Li D, Feng M, Han D, Huang Q. The value of deep neural networks in the pathological classification of thyroid tumors. Diagn Pathol. 2023;18:95. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s13000-023-01380-2.
Pedregosa F, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30.
Islam MA, Majumder MZH, Hussein MA. Chronic kidney disease prediction based on machine learning algorithms. J Pathol Inform. 2023;14:100189. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.jpi.2023.100189.
Ziaul Hasan Majumder M, Abu Khaer M, Nayeen Mahi MJ, Shaiful Islam Babu M, Aditya SK. Decision support technique for prediction of acute lymphoblastic leukemia subtypes based on artificial neural network and adaptive neuro-fuzzy inference system. In: Suma V, Chen JI-Z, Baig Z, Wang H, editors. Inventive Systems and Control, vol. 204. Lecture Notes in Networks and Systems, vol. 204. Singapore: Springer Singapore; 2021. p. 539–54. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/978-981-16-1395-1_40.
Agyekum EA, et al. Evaluation of cervical lymph node metastasis in papillary thyroid carcinoma using clinical-ultrasound radiomic machine learning-based model. Cancers. 2022;14(21):5266. https://doiorg.publicaciones.saludcastillayleon.es/10.3390/cancers14215266.
Agyekum EA, et al. Predicting BRAFV600E mutations in papillary thyroid carcinoma using six machine learning algorithms based on ultrasound elastography. Sci Rep. 2023;13(1):12604. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/s41598-023-39747-6.
Zhou W, Ni X, Xu S, Zhang L, Chen Y, Zhan W. Ultrasound-guided laser ablation versus microwave ablation for patients with unifocal papillary thyroid microcarcinoma: a retrospective study. Lasers Surg Med. 2020;52(9):855–62. https://doiorg.publicaciones.saludcastillayleon.es/10.1002/lsm.23238.
Wu W, Gong X, Zhou Q, Chen X, Chen X. Ultrasound-guided percutaneous microwave ablation for solid benign thyroid nodules: comparison of MWA versus control group. Int J Endocrinol. 2017;2017:1–7. https://doiorg.publicaciones.saludcastillayleon.es/10.1155/2017/9724090.
Negro R, Greco G, Deandrea M, Rucco M, Trimboli P. Twelve-month volume reduction ratio predicts regrowth and time to regrowth in thyroid nodules submitted to laser ablation: a 5-year follow-up retrospective study. Korean J Radiol. 2020;21(6):764–72. https://doiorg.publicaciones.saludcastillayleon.es/10.3348/kjr.2019.0798.
Liu Y-J, Qian L-X, Liu D, Zhao J-F. Ultrasound-guided microwave ablation in the treatment of benign thyroid nodules in 435 patients. Exp Biol Med. 2017;242(15):1515–23. https://doiorg.publicaciones.saludcastillayleon.es/10.1177/1535370217727477.
Wang B, et al. Factors related to recurrence of the benign non-functioning thyroid nodules after percutaneous microwave ablation. Int J Hyperthermia. 2017;33(4):459–64. https://doiorg.publicaciones.saludcastillayleon.es/10.1080/02656736.2016.1274058.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44. https://doiorg.publicaciones.saludcastillayleon.es/10.1038/nature14539.
Ding Y, et al. A deep learning model to predict a diagnosis of alzheimer disease by using 18F-FDG PET of the brain. Radiology. 2019;290(2):456–64. https://doiorg.publicaciones.saludcastillayleon.es/10.1148/radiol.2018180958.
Ha R, et al. Prior to initiation of chemotherapy, can we predict breast tumor response? Deep learning convolutional neural networks approach using a breast MRI tumor dataset. J Digital Imaging. 2019;32(5):693–701. https://doiorg.publicaciones.saludcastillayleon.es/10.1007/s10278-018-0144-1.
Hu J, Shen L, Albanie S, Sun G, Wu E. Squeeze-and-excitation networks. 2019. arXiv:1709.01507. http://arxiv.org/abs/1709.01507. Accessed 30 Sep 2024.

Acknowledgements

Not applicable.

Funding

This study was financially supported by the National Natural Science Foundation of China (Project No. 82471987) and the 2023 Clinical Research Project of Zhenjiang First People’s Hospital (YL2023001) and the Special Project for Cross Cooperation at the Hospital Level of Northern Jiangsu People’s Hospital (SBJC23001). The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Author information

Enock Adjei Agyekum and Yu-Guo Wang contributed equally to this work.

Authors and Affiliations

Department of Ultrasound, Affiliated People’s Hospital of Jiangsu University, Zhenjiang, 212002, China
Enock Adjei Agyekum, Yong-zhen Ren & Gongxun Tan
School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu Province, China
Enock Adjei Agyekum & Xiangjun Shen
Department of Ultrasound, Jiangsu Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing, China
Yu-guo Wang
College of Engineering, Birmingham City University, Birmingham, B4 7XG, UK
Eliasu Issaka
Northern Jiangsu People’s Hospital Affiliated to Yangzhou University, Yangzhou, China
Xiao-qin Qian
Northern Jiangsu People’s Hospital, Yangzhou, Jiangsu Province, China
Xiao-qin Qian
The Yangzhou Clinical Medical College of Xuzhou Medical University, Yangzhou, Jiangsu Province, China
Xiao-qin Qian

Contributions

Conceptualization: E.A.A. and X.Q.; methodology: E.A.A.; software: E.A.A. and E.I.; validation: Y.G.W. and Y.Z.R.; formal analysis: E.A.A. and E.I.; investigation: E.A.A.; resources: E.A.A. and Y.G.W.; data curation: E.A.A. and Y.Z.R.; writing (original draft preparation): E.A.A.; writing (review and editing): E.A.A., X.Q. and G.T.; visualization: E.A.A. and Y.Z.R.; supervision: X.Q. and X.S.; project administration: E.A.A., X.S. and X.Q.; funding acquisition: X.Q.

Corresponding authors

Correspondence to Xiangjun Shen or Xiao-qin Qian.

Ethics declarations

Ethics approval and consent to participate

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Jiangsu Hospital of Integrated Traditional Chinese and Western Medicine. The ethics committee waived the requirement for patient consent because of the retrospective nature of the study.

Consent for publication

Not applicable.

Conflicts of interest

The authors declare no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Agyekum, E., Wang, Yg., Issaka, E. et al. Predicting the efficacy of microwave ablation of benign thyroid nodules from ultrasound images using deep convolutional neural networks. BMC Med Inform Decis Mak 25, 161 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-025-02989-7

Received: 08 August 2024
Accepted: 26 March 2025
Published: 11 April 2025
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12911-025-02989-7
