Toward an Optimized Staging System for Pancreatic Ductal Adenocarcinoma: A Clinically Interpretable, Artificial Intelligence-Based Model

Dimitris Bertsimas, Georgios Antonios Margonis, Yifei Huang, Nikolaos Andreatos, Holly Wiberg, Yu Ma, Caitlin McIntyre, Alessandra Pulvirenti, Doris Wagner, J. L. Van Dam, Francesca Gavazzi, Stefan Buettner, Katsunori Imai, Georgios Stasinos, Jin He, Carsten Kamphues, Katharina Beyer, Hendrik Seeliger, Matthew J. Weiss, Martin Kreis, John L. Cameron, Alice C. Wei, Peter Kornprat, Hideo Baba, Bas Groot Koerkamp, Alessandro Zerbi, Michael D'Angelica, Christopher L. Wolfgang

Research output: Contribution to journal › Article › peer-review


PURPOSE: The American Joint Committee on Cancer (AJCC) eighth edition schema for pancreatic ductal adenocarcinoma treats T and N stage as independent factors and uses the number of positive lymph nodes (PLNs) to define N stage, despite data favoring lymph node ratio (LNR). We used artificial intelligence-based techniques to compare PLN with LNR and to investigate interactions between tumor size and nodal status.

METHODS: Patients who underwent resection of pancreatic ductal adenocarcinoma between 2000 and 2017 at six institutions were identified. LNR and PLN were compared through Shapley additive explanations (SHAP) analysis, with the better predictor used to define nodal status. We trained optimal classification trees (OCTs) to predict 1-year and 3-year risk of death, incorporating only tumor size and nodal status as variables. The OCTs were compared with the AJCC schema and with similarly trained XGBoost models. Variable interactions were explored via SHAP.

RESULTS: Two thousand eight hundred seventy-four patients comprised the derivation cohort and 1,231 the validation cohort. SHAP identified LNR as the superior predictor. The OCTs outperformed the AJCC schema in both cohorts (1-year area under the curve [AUC]: 0.681 v 0.603 in the derivation cohort and 0.638 v 0.586 in the validation cohort; 3-year AUC: 0.682 v 0.639 and 0.675 v 0.647, respectively) and performed comparably with the XGBoost models. We identified interactions between LNR and tumor size, suggesting that a negative prognostic factor partially overrides the effect of a concurrent favorable factor.

CONCLUSION: Our findings highlight the superiority of LNR and the importance of interactions between tumor size and nodal status. These results, and the potential of the OCT methodology to combine them into a powerful, visually interpretable model, can help inform future staging systems.
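As a rough illustration of the quantities named in the abstract, the sketch below computes LNR under its usual definition (positive nodes divided by nodes examined) and a rank-based AUC for a single predictor. It is not the authors' pipeline, which relied on SHAP, OCTs, and XGBoost; the function names and the toy patient records are hypothetical and carry no clinical meaning.

```python
from typing import Sequence


def lymph_node_ratio(positive_nodes: int, examined_nodes: int) -> float:
    """LNR = positive lymph nodes / lymph nodes examined (usual definition)."""
    if examined_nodes <= 0:
        raise ValueError("examined_nodes must be positive")
    if not 0 <= positive_nodes <= examined_nodes:
        raise ValueError("positive_nodes must lie between 0 and examined_nodes")
    return positive_nodes / examined_nodes


def auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """Rank-based AUC: the probability that a patient who died has a higher
    score than a patient who survived, with ties counted as one half."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy records, invented for illustration only:
# (positive nodes, nodes examined, died within 1 year)
patients = [(0, 12, 0), (1, 30, 0), (2, 8, 1), (3, 40, 0), (4, 10, 1), (6, 15, 1)]

pln = [p for p, _, _ in patients]
lnr = [lymph_node_ratio(p, e) for p, e, _ in patients]
died = [d for _, _, d in patients]

print("AUC using PLN:", round(auc(pln, died), 3))
print("AUC using LNR:", round(auc(lnr, died), 3))
```

The toy records are arranged so that a patient with three positive nodes out of 40 examined ranks below one with two positive out of eight once the ratio is used, which is the kind of reordering that separates LNR from a raw PLN count.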

Original language: English (US)
Pages (from-to): 1220-1231
Number of pages: 12
Journal: JCO Clinical Cancer Informatics
State: Published - 2021
Externally published: Yes

ASJC Scopus subject areas

  • General Medicine

