Improved prediction of breast cancer outcome by identifying heterogeneous biomarkers

Jonghwan Choi, Sanghyun Park, Youngmi Yoon, Jaegyoon Ahn

Research output: Contribution to journal › Article › peer-review

30 Citations (Scopus)


Motivation: Identifying genes that can be used to predict prognosis in patients with cancer is important because it can lead to improved therapy and can advance our understanding of tumor progression at the molecular level. A common and fundamental problem that makes identifying prognostic genes and predicting cancer outcomes difficult is the heterogeneity of patient samples.

Results: To reduce the effect of sample heterogeneity, we clustered data samples using the K-means algorithm and applied a modified PageRank to functional interaction (FI) networks weighted using the gene expression values of the samples in each cluster. Hub genes among the resulting prioritized genes were selected as biomarkers to predict the prognosis of samples. This process outperformed traditional feature selection methods as well as several network-based prognostic gene selection methods when applied to Random Forest. We were able to find many cluster-specific prognostic genes for each dataset. A functional study showed that distinct biological processes were enriched in each cluster, which seems to reflect different aspects of tumor progression or oncogenesis among distinct patient groups. Taken together, these results support the hypothesis that our approach can effectively identify heterogeneous prognostic genes, and that these genes are complementary to each other, improving prediction accuracy.

Supplementary information: Supplementary data are available at Bioinformatics online.
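The network-ranking step described in the Results can be illustrated with a minimal sketch for a single cluster. It is not the authors' implementation. The abstract does not specify how edges are weighted or how PageRank is modified, so this sketch assumes edge weights equal to the absolute Pearson correlation of the two genes' expression within the cluster, and it uses standard weighted PageRank by power iteration. The cluster's samples are assumed to have been selected by K-means upstream. All function names, gene names, and expression values are hypothetical toy inputs.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def weight_edges(edges, expr):
    """Weight each undirected FI edge by |Pearson r| within one cluster.

    Assumption: the paper's actual weighting scheme is not given in the abstract.
    expr maps gene -> expression values over the cluster's samples.
    """
    return {(u, v): abs(pearson(expr[u], expr[v])) for u, v in edges}


def weighted_pagerank(weights, damping=0.85, iters=100, tol=1e-10):
    """Standard weighted PageRank (power iteration) on an undirected graph."""
    nodes = sorted({g for edge in weights for g in edge})
    # Symmetric adjacency: gene -> {neighbor: weight}; zero-weight edges are dropped.
    adj = {g: {} for g in nodes}
    for (u, v), w in weights.items():
        if w > 0:
            adj[u][v] = w
            adj[v][u] = w
    strength = {g: sum(adj[g].values()) for g in nodes}
    n = len(nodes)
    rank = {g: 1.0 / n for g in nodes}
    for _ in range(iters):
        # Rank held by genes with no weighted edges is spread uniformly.
        dangling = sum(rank[g] for g in nodes if strength[g] == 0)
        new = {}
        for g in nodes:
            incoming = sum(rank[h] * w / strength[h] for h, w in adj[g].items())
            new[g] = (1 - damping) / n + damping * (incoming + dangling / n)
        delta = sum(abs(new[g] - rank[g]) for g in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy FI network and toy expression values for one cluster (hypothetical).
edges = [("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"), ("geneC", "geneD")]
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

ranks = weighted_pagerank(weight_edges(edges, expr))
# Genes ordered from highest to lowest rank; top entries are hub candidates.
ranking = sorted(ranks, key=ranks.get, reverse=True)
print(ranking)
```

In a pipeline like the one the abstract outlines, this ranking would be repeated per cluster, and the top-ranked genes from each cluster would be passed as features to a classifier such as Random Forest.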

Original language: English
Pages (from-to): 3619-3626
Number of pages: 8
Issue number: 22
Publication status: Published - 2017 Nov 15

Bibliographical note

Publisher Copyright:
© The Author 2017. Published by Oxford University Press. All rights reserved.

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

