Artificial Intelligence May Help Differentiate Colon Carcinoma From Acute Diverticulitis

January 27, 2023

Colon carcinoma and acute diverticulitis have similar computed tomography imaging features that can make differential diagnosis difficult for radiologists, but a novel artificial intelligence assistance model was shown to improve diagnostic accuracy.

A new study suggests that a novel artificial intelligence (AI) system has the potential to help radiologists differentiate between colon carcinoma (CC) and acute diverticulitis (AD) on computed tomography (CT) images.

Patients with AD experience a significant disease burden, making AD a common cause of hospital admission related to gastrointestinal issues. The standard imaging modality for AD diagnosis is a CT scan, with typical signs including bowel wall thickening, fat stranding, enlarged local lymph nodes, and the presence of diverticula. However, these are not unique to AD. Differentiating AD from CC is especially crucial because they are managed differently, but it is often difficult due to their similar imaging features.

The sensitivity and specificity of CC differentiation have been estimated at 40%-95.5%, and those of AD differentiation at 66%-93.3%, according to the study authors. But in clinical practice, the upper ends of these ranges are often not reached. AI systems have been shown to improve the diagnostic accuracy of radiologists in various imaging settings, they added.

The new study, published in JAMA Network Open, aimed to develop an AI system that could assist radiologists in accurately diagnosing AD versus CC in routinely acquired CT scans. Especially in settings where there may not be an expert radiologist available, such as primary emergency care, an accurate AI support tool may help improve diagnostic accuracy in AD and CC.

A total of 585 patients were included in the study, all of whom had histopathologic confirmation of their conditions as determined by a board-certified pathologist following bowel resection. Of the overall cohort, 318 patients had CC and 267 had AD. A total of 130 patients had external imaging, and 445 had internal scans.

The AI program was developed with a training set of 435 cases, a validation set of 90 cases, and a testing set of 60 cases. AD and CC cases were equally distributed in the test set. A total of 10 readers also analyzed the scans: 3 radiology residents with less than 3 years of experience, 4 radiology residents with more than 3 years of experience, and 3 board-certified radiologists. Two of the board-certified radiologists specialized in gastrointestinal imaging.

Readers were shown deidentified scans in random order without further clinical information and asked to classify them as either AD or CC without AI support. Then, they were told the algorithm’s prediction and were given the option to either change or keep their original answer after receiving AI support. They were not told what the model’s sensitivity or specificity was, and they did not receive feedback on their decisions.

The AI program achieved a sensitivity of 98% and a specificity of 92% on the training set at a decision threshold of 0.5. On the validation set, it achieved a sensitivity of 94% and a specificity of 94%. On the test set, the model's sensitivity and specificity were 83.3% and 86.6%, respectively. The board-certified reader group performed similarly to the model on the test set, with a sensitivity of 85.5% and a specificity of 86.6%. Across all readers, the mean sensitivity was 77.6% and the mean specificity was 81.6%. With AI support, both were higher, at 85.6% and 91.3%, respectively.
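To illustrate what a decision threshold of 0.5 means in practice, the minimal sketch below turns a model's predicted probability into a label. It assumes the model outputs a probability of CC for each scan and that a score exactly at the threshold is labeled CC; neither detail is stated in the article, and the function name and example scores are hypothetical.

```python
def classify(probability_cc, threshold=0.5):
    """Label a scan CC when the predicted CC probability reaches the threshold.

    Assumption: the model emits a probability of CC; ties go to CC.
    """
    return "CC" if probability_cc >= threshold else "AD"


# Hypothetical model outputs for three scans (not data from the study).
scores = [0.12, 0.50, 0.93]
labels = [classify(p) for p in scores]
print(labels)
```

Raising the threshold would generally trade sensitivity for specificity, which is why the reported figures are tied to a specific threshold.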

The negative predictive value (NPV) of the AI program was 83.8%, and the positive predictive value (PPV) was 86.2%. The NPV of readers overall was 78.5%, and the PPV of readers overall was 80.9%. Compared with radiology residents, board-certified radiologists had 11.3% higher sensitivity, the authors noted. AI support reduced the number of false-negative and false-positive readings compared with unassisted reading, raising the readers' NPV to 86.4% and PPV to 90.8%.
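The model's test-set figures can be tied together with a small worked example. The sketch below relies on a confusion matrix that is not given in the article: assuming CC is the positive class and the 60-case test set holds 30 CC and 30 AD cases, counts of 25 true positives, 5 false negatives, 26 true negatives, and 4 false positives are consistent with the reported percentages. The function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # share of CC cases correctly flagged
        "specificity": tn / (tn + fp),  # share of AD cases correctly cleared
        "ppv": tp / (tp + fp),          # share of CC calls that are truly CC
        "npv": tn / (tn + fn),          # share of AD calls that are truly AD
    }


# Hypothetical counts inferred from the reported percentages,
# assuming CC is the positive class (not stated in the article).
metrics = diagnostic_metrics(tp=25, fn=5, tn=26, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the values come to roughly 83.3%, 86.7%, 86.2%, and 83.9%. The small gaps from the reported 86.6% and 83.8% would be explained if those figures were truncated rather than rounded.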

Improvements were seen across experience levels, with AI support reducing the false-negative rate from 22% to 14.3% in the overall group, 26% to 16.1% for the residents, and 14% to 10% for the board-certified radiologists.
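The false-negative rate is the complement of sensitivity, so the overall figures can be cross-checked against the mean reader sensitivities reported earlier (77.6% without and 85.6% with AI support). A minimal sketch, with an illustrative helper name:

```python
def false_negative_rate(sensitivity):
    """FNR = FN / (TP + FN) = 1 - sensitivity."""
    return 1.0 - sensitivity


# Mean reader sensitivities as reported in the article.
fnr_unassisted = false_negative_rate(0.776)
fnr_assisted = false_negative_rate(0.856)
print(f"without AI: {fnr_unassisted:.1%}, with AI: {fnr_assisted:.1%}")
```

This gives about 22.4% and 14.4%, close to the reported 22% and 14.3%. The small differences likely reflect rounding or how values were averaged across readers.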

While the AI model alone performed similarly to average human readers alone, the addition of AI to typical reading significantly improved performance regardless of reader experience levels. Given the importance of differentiating AD and CC, these findings suggest AI support could be useful in the clinical setting. Broader studies simulating real-world integration and validating the findings on a larger cohort are needed, but the study shows potential for AI support in this setting.

Reference

Ziegelmayer S, Reischl S, Havrda H, et al. Development and validation of a deep learning algorithm to differentiate colon carcinoma from acute diverticulitis in computed tomography images. JAMA Netw Open. Published online January 27, 2023. doi:10.1001/jamanetworkopen.2022.53370