open access publication

Article, 2024

Differentiation of COVID-19 pneumonia from other lung diseases using CT radiomic features and machine learning: A large multicentric cohort study

International Journal of Imaging Systems and Technology, ISSN 0899-9457, Volume 34, Issue 2, DOI 10.1002/ima.23028

Contributors

Shiri I. 0000-0002-5735-0736 [1] Salimi Y. 0000-0002-1233-9576 [1] Saberi A. [1] Pakbin M. [2] Hajianfar G. 0000-0001-5359-2407 [1] Avval A.H. [3] Sanaat A. 0000-0001-8437-2060 [1] Akhavanallaf A. 0000-0002-1486-4702 [1] Mostafaei S. [4] Mansouri Z. 0000-0003-2087-9721 [1] Askari D. [5] Ghasemian M. [2] Sharifipour E. [2] Sandoughdaran S. [6] Sohrabi A. Sadati E. [7] Livani S. [8] Iranpour P. [9] Kolahi S. [10] Khosravi B. [11] Khateri M. [12] Bijari S. [7] Atashzar M.R. [13] Shayesteh S.P. [14] Babaei M.R. [15] Jenabi E. [16] Hasanian M. [17] Shahhamzeh A. [2] Ghomi S.Y.F. [2] Mozafari A. [18] Shirzad-Aski H. [19] Movaseghi F. [18] Bozorgmehr R. [20] Goharpey N. [20] Abdollahi H. 0000-0003-0761-1309 [21] [22] Geramifar P. [16] Radmard A.R. [23] Arabi H. 0000-0001-8437-2060 [1] Rezaei-Kalantari K. [24] Oveisi M. 0000-0002-8100-5609 [21] [25] Rahmim A. 0000-0002-9980-2403 [21] [22] Zaidi H. 0000-0001-7559-5297 (Corresponding author) [1] [26] [27] [28]

Affiliations

  1. [1] Geneva University Hospital [NORA names: Switzerland; Europe, Non-EU; OECD];
  2. [2] Qom University of Medical Sciences [NORA names: Iran; Asia, Middle East];
  3. [3] Mashhad University of Medical Sciences [NORA names: Iran; Asia, Middle East];
  4. [4] Karolinska Institutet [NORA names: Sweden; Europe, EU; Nordic; OECD];
  5. [5] Shahid Beheshti University of Medical Sciences [NORA names: Iran; Asia, Middle East];

Abstract

To derive and validate an effective machine learning and radiomics-based model to differentiate COVID-19 pneumonia from other lung diseases using a large multi-centric dataset. In this retrospective study, we collected 19 private and five public datasets of chest CT images, totaling 26 307 images (15 148 COVID-19; 9657 other lung diseases including non-COVID-19 pneumonia, lung cancer, pulmonary embolism; 1502 normal cases). We tested 96 machine learning-based models by cross-combining four feature selectors (FSs) and eight dimensionality reduction techniques with eight classifiers. We trained and evaluated our models using three different strategies: #1, the whole dataset (15 148 COVID-19 and 11 159 other); #2, a new dataset after excluding healthy individuals and COVID-19 patients who did not have RT-PCR results (12 419 COVID-19 and 8278 other); and #3, only non-COVID-19 pneumonia patients and a random sample of COVID-19 patients (3000 COVID-19 and 2582 other) to provide balanced classes. The best models were chosen by the one-standard-deviation rule in 10-fold cross-validation and evaluated on the hold-out test sets for reporting. In strategy #1, the Relief FS combined with the random forest (RF) classifier resulted in the highest performance (accuracy = 0.96, AUC = 0.99, sensitivity = 0.98, specificity = 0.94, PPV = 0.96, and NPV = 0.96). In strategy #2, the Recursive Feature Elimination (RFE) FS and RF classifier combination resulted in the highest performance (accuracy = 0.97, AUC = 0.99, sensitivity = 0.98, specificity = 0.95, PPV = 0.96, NPV = 0.98). Finally, in strategy #3, the ANOVA FS and RF classifier combination resulted in the highest performance (accuracy = 0.94, AUC = 0.98, sensitivity = 0.96, specificity = 0.93, PPV = 0.93, NPV = 0.96). Lung radiomic features combined with machine learning algorithms can enable the effective diagnosis of COVID-19 pneumonia in CT images without the use of additional tests.
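The model count, the reported metrics, and the selection rule in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The names `FS1`, `DR1`, `CLF1`, `binary_metrics`, and `one_sd_rule` are hypothetical, and the one-standard-deviation rule is implemented under one common reading (pick the simplest candidate whose mean cross-validation score is within one standard deviation of the best mean), which the abstract does not spell out.

```python
from itertools import product
from statistics import mean, stdev

# Placeholder labels: the abstract gives only the counts (4 selectors,
# 8 dimensionality reducers, 8 classifiers), not the full lists.
feature_selectors = [f"FS{i}" for i in range(1, 5)]
dim_reducers = [f"DR{i}" for i in range(1, 9)]
classifiers = [f"CLF{i}" for i in range(1, 9)]

# Each candidate pairs one feature-reduction step (a selector or a reducer)
# with one classifier, giving (4 + 8) * 8 candidates.
pipelines = list(product(feature_selectors + dim_reducers, classifiers))


def binary_metrics(tp, fp, tn, fn):
    """Metrics named in the abstract, computed from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def one_sd_rule(cv_scores):
    """Pick a model from per-fold cross-validation scores.

    cv_scores maps a model name to its list of fold scores. Assumption:
    the dict is ordered from simplest to most complex model, and each
    list holds at least two scores.
    """
    means = {name: mean(scores) for name, scores in cv_scores.items()}
    best = max(means, key=means.get)
    # Sample standard deviation of the best model's fold scores.
    threshold = means[best] - stdev(cv_scores[best])
    for name in cv_scores:  # dicts preserve insertion order
        if means[name] >= threshold:
            return name
```

Here `len(pipelines)` reproduces the 96 combinations stated in the abstract. In a real workflow the fold scores passed to `one_sd_rule` would come from 10-fold cross-validation on the training data, and `binary_metrics` would be applied once to the hold-out test set.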

Keywords

COVID-19, computed tomography, differential diagnosis, machine learning, radiomics

Funders

  • Universite de Geneve
  • Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung

Data Provider: Elsevier