Machine Learning Predicts 5-Year Survival in Stage III CRC

The study aims to build an accurate, interpretable model for stage III colorectal cancer prognosis.

A new study leveraged machine learning (ML) to predict 5-year postoperative survival in patients with stage III colorectal cancer (CRC), identifying key clinical and demographic factors that influence outcomes.1 The model achieved strong predictive performance and could support more personalized treatment strategies, according to the researchers.

CRC surgery | Image credit: MedicalWorks - stock.adobe.com

The analysis was published in Frontiers in Oncology.

“Although the current TNM staging system provides a fundamental framework for cancer prognosis, it does not fully account for all factors that may affect patients’ survival,” wrote the researchers of the study. “Therefore, prognostic prediction models based on individualized factors, particularly those constructed using ML methods, are essential for improving the accuracy of predicting the 5-year survival status of stage III CRC patients.”

Previous research has highlighted how ML can uncover patterns in postoperative outcomes in patients with CRC.2 By applying unsupervised learning to 12 years of surgical data, researchers identified 3 distinct trends in adverse events. Notably, month grouping emerged as an independent risk factor for anastomotic leak, alongside male sex and longer operation time. These insights suggest ML may help enhance risk stratification and optimize surgical planning in CRC postoperative care.

In this study, data from 13,855 patients with stage III CRC who underwent surgical treatment were extracted from the Surveillance, Epidemiology, and End Results (SEER) database.1 A set of clinical and sociodemographic variables was collected for analysis, including marital status, gender, tumor location, histological type, T stage, chemotherapy status, age, tumor size, and lymph node ratio. The dataset was randomly divided into training and validation cohorts at a 7:3 ratio. Optimal cutoff values for age, tumor size, and lymph node ratio were determined to enhance predictive accuracy. Independent prognostic factors for 5-year postoperative survival were identified through univariate and multivariate logistic regression as well as least absolute shrinkage and selection operator (LASSO) regression. These key variables were then incorporated into multiple ML models.
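For readers who want to see what a 7:3 random split looks like in practice, the following is a minimal sketch using only the Python standard library. The function name, the fixed seed, and the choice to round the training cohort down are illustrative assumptions; the study's own splitting procedure and software are not described in this article.

```python
import random


def split_train_validation(n_patients, train_fraction=0.7, seed=0):
    """Randomly partition patient indices into training and validation cohorts.

    Hypothetical helper for illustration only; not the authors' code.
    """
    indices = list(range(n_patients))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(indices)
    # Assumption: the training cohort size is rounded down.
    n_train = int(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]


# 13,855 is the SEER sample size reported in the study.
train_idx, valid_idx = split_train_validation(13855)
```

Every patient lands in exactly one cohort, so the validation results are computed on records the models never saw during fitting.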

Optimal cutoff values for key prognostic indicators were identified as 65 and 80 years for age, 29 mm and 74 mm for tumor size, and 0.11 and 0.49 for lymph node ratio. The analyses consistently identified several independent prognostic factors for 5-year postoperative survival, including marital status, tumor location, histological type, T stage, chemotherapy, radiotherapy, age, tumor size, lymph node ratio, serum carcinoembryonic antigen level, perineural invasion, and tumor differentiation (P < .05).
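Two cutoffs per variable imply three groups for each. The sketch below shows one way such cutoffs could be applied, using the values reported above. How a value that falls exactly on a cutoff is assigned is an assumption here (it goes to the higher group), as are the function and dictionary names. The lymph node ratio is computed with its standard definition, positive nodes divided by nodes examined.

```python
from bisect import bisect_right

# Cutoff values as reported in the study; the dictionary keys are hypothetical names.
CUTOFFS = {
    "age_years": (65, 80),
    "tumor_size_mm": (29, 74),
    "lymph_node_ratio": (0.11, 0.49),
}


def lymph_node_ratio(positive_nodes, examined_nodes):
    """Standard definition: positive lymph nodes divided by nodes examined."""
    if examined_nodes <= 0:
        raise ValueError("examined_nodes must be positive")
    return positive_nodes / examined_nodes


def cutoff_group(variable, value):
    """Return 0, 1, or 2 for the low, middle, or high group.

    Assumption: a value equal to a cutoff is placed in the higher group.
    """
    return bisect_right(CUTOFFS[variable], value)


# Example: a 72-year-old patient with 3 positive nodes out of 12 examined.
age_group = cutoff_group("age_years", 72)
lnr_group = cutoff_group("lymph_node_ratio", lymph_node_ratio(3, 12))
```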

ML models incorporating these variables demonstrated strong predictive performance, with area under the curve values ranging from 0.766 to 0.791 in the validation cohort. Among all variables, age, lymph node ratio, chemotherapy status, and T stage emerged as the most influential factors. External validation using data from Shanxi Bethune Hospital confirmed the accuracy and clinical relevance of the prediction models.
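As a rough guide to what these values mean, and assuming they refer to the area under the receiver operating characteristic curve, the metric can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. The sketch below computes it by direct pairwise comparison; the function name and the toy data are illustrative and do not come from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half. Illustrative only.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive-negative pairs are ranked correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which puts the reported range of 0.766 to 0.791 in context.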

However, the researchers noted several limitations. First, SEER data are US-based, which may limit the generalizability of the findings to other populations, and the external validation cohort was relatively small. The database also lacked details on surgical techniques, treatment regimens, and complications. Because the analysis was retrospective, prospective multicenter studies with larger, more diverse datasets are needed to enhance model accuracy and clinical applicability, the researchers added.

“[ML] and artificial intelligence technologies continue to make breakthroughs,” wrote the researchers of the study. “The integration of imaging features and biomarkers into multimodal frameworks will optimize CRC prognosis evaluation and provide reliable evidence for precision medicine. This technological pathway demonstrates significant potential in improving the efficiency of developing individualized treatment plans, particularly in addressing the challenges posed by tumor heterogeneity and therapeutic efficacy variations.”

References

1. Zhang W, Li Y, Jia J, et al. Prediction of 5-year postoperative survival and analysis of key prognostic factors in stage III colorectal cancer patients using novel machine learning algorithms. Front Oncol. 2025;15:1604386. doi:10.3389/fonc.2025.1604386

2. Martín-Arévalo J, Moro-Valdezate D, García-Botello S, et al. Seasonal and cyclical variations in short-term postoperative outcomes of colorectal cancer: a time series analysis. Sci Rep. February 12, 2025. Accessed July 29, 2025. https://www.nature.com/articles/s41598-025-88782-y

© 2025 MJH Life Sciences
AJMC®
All rights reserved.