This study compared the diagnostic performance of physicians and a deep convolutional neural network (CNN) in predicting malignancy from ultrasonography (US) images of thyroid nodules with atypia of undetermined significance (AUS)/follicular lesion of undetermined significance (FLUS) results on fine-needle aspiration (FNA). The study included 202 patients with 202 nodules ≥ 1 cm that showed AUS/FLUS on FNA and who underwent surgery at one of 3 institutions. Diagnostic performance was compared between 8 physicians (4 radiologists, 4 endocrinologists) with varying experience levels and the CNN, and the AUS and FLUS subgroups were analyzed separately. Interobserver variability was assessed among the 8 physicians. Of the 202 nodules, 158 were AUS and 44 were FLUS; 86 were benign and 116 were malignant. The areas under the curve (AUCs) of the 8 physicians and the CNN were 0.680–0.722 and 0.666, respectively, without significant differences (P > 0.05). In the subgroup analysis, the AUCs of the 8 physicians and the CNN were 0.657–0.768 and 0.652 for AUS, and 0.469–0.674 and 0.622 for FLUS, respectively. Interobserver agreement was moderate (κ = 0.543), substantial (κ = 0.652), and moderate (κ = 0.455) among the 8 physicians, the 4 radiologists, and the 4 endocrinologists, respectively. For thyroid nodules with AUS/FLUS cytology, the diagnostic performance of the CNN in differentiating malignancy on US images was comparable to that of physicians with variable experience levels.
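The agreement labels quoted above ("moderate", "substantial") match the widely used Landis and Koch bands for kappa. The abstract does not say which scale was applied, so the following is only a minimal sketch under that assumption; the function name `landis_koch_label` is hypothetical and is not taken from the study.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to its Landis and Koch agreement label.

    Assumption: the study used the Landis and Koch bands; the abstract
    does not state this explicitly.
    """
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("unreachable for kappa <= 1")


# Kappa values reported in the abstract for each reader group.
reported = {
    "all 8 physicians": 0.543,
    "4 radiologists": 0.652,
    "4 endocrinologists": 0.455,
}
labels = {group: landis_koch_label(k) for group, k in reported.items()}
```

Under this assumption, the three reported values fall into the same bands the abstract names for each reader group.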
Publisher Copyright: © 2021, The Author(s).