TY - JOUR
T1 - Benchmarking mouse contamination removing protocols in patient-derived xenografts genomic profiling
AU - Bhandari, Mukund
AU - He, Funan
AU - Rogojina, Anna
AU - Li, Fuyang
AU - Zou, Yi
AU - Jiang, Jing
AU - Lai, Zhao
AU - Houghton, Peter
AU - Kurmasheva, Raushan T.
AU - Chen, Yidong
AU - Wang, Xiaojing
AU - Zheng, Siyuan
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
AB - Patient-derived xenograft (PDX) models are widely used in cancer research. Genomic and transcriptomic profiling of PDXs is inevitably contaminated by sequencing reads originating from mouse cells. Here, we examine the impact of mouse read contamination on RNA sequencing (RNAseq), Whole Exome Sequencing (WES), and Whole Genome Sequencing (WGS) data of 21 PDXs. We also systematically benchmark the performance of 12 computational protocols for removing mouse reads from PDXs. We find that mouse read contamination increases the expression of immune- and stromal-related genes and inflates the number of somatic mutations. However, detection of gene fusions and copy number alterations is minimally affected by mouse read contamination. Using gold standard datasets, we find that pseudo-alignment protocols often demonstrate better prediction performance and computing efficiency. The best-performing tool is Xengsort, a relatively new tool. Our results emphasize the importance of removing mouse reads from PDXs and the need to adopt new tools in PDX genomic studies.
UR - https://www.scopus.com/pages/publications/105003291892
U2 - 10.1038/s41698-025-00902-z
DO - 10.1038/s41698-025-00902-z
M3 - Article
C2 - 40247091
AN - SCOPUS:105003291892
SN - 2397-768X
VL - 9
JO - npj Precision Oncology
JF - npj Precision Oncology
IS - 1
M1 - 113
ER -