Prioritizing candidate disease genes by network-based boosting of genome-wide association data

Insuk Lee, U. Martin Blom, Peggy I. Wang, Jung Eun Shim, Edward M. Marcotte

Research output: Contribution to journal, Article, peer-review

523 Citations (Scopus)


Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the GWAS study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
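As a rough illustration of the idea in the abstract, the sketch below runs a PageRank-style random walk with restart over a small weighted network, using graded seed weights instead of a binary seed list so that uncertain GWAS evidence still contributes. It is a minimal sketch under stated assumptions, not the authors' implementation: the gene names, edge weights, seed weights, the `restart` value, and the function name `propagate_scores` are all hypothetical, and the toy network is not HumanNet.

```python
def propagate_scores(edges, seed_weights, restart=0.3, iterations=100):
    """Random walk with restart over a weighted, undirected network.

    edges: iterable of (gene_a, gene_b, weight) tuples (hypothetical data).
    seed_weights: dict mapping gene -> non-negative evidence weight.
    Returns a dict mapping gene -> propagated score (scores sum to 1).
    """
    # Build a symmetric adjacency map: node -> {neighbor: weight}.
    adjacency = {}
    for a, b, w in edges:
        adjacency.setdefault(a, {})[b] = w
        adjacency.setdefault(b, {})[a] = w
    for node in seed_weights:
        adjacency.setdefault(node, {})

    # Normalize the seed evidence into a restart distribution.
    total = sum(seed_weights.values())
    prior = {n: seed_weights.get(n, 0.0) / total for n in adjacency}

    scores = dict(prior)
    for _ in range(iterations):
        # Every node receives its share of the restart mass.
        updated = {n: restart * prior[n] for n in adjacency}
        for node, neighbors in adjacency.items():
            out_weight = sum(neighbors.values())
            if out_weight == 0:
                # Isolated node: hand its walking mass back to the prior.
                for n in adjacency:
                    updated[n] += (1 - restart) * scores[node] * prior[n]
                continue
            # Spread the remaining mass to neighbors by edge weight.
            for nb, w in neighbors.items():
                updated[nb] += (1 - restart) * scores[node] * w / out_weight
        scores = updated
    return scores


# Hypothetical toy network: one chain A-B-C-D and a separate pair E-F.
toy_edges = [
    ("GENE_A", "GENE_B", 1.0),
    ("GENE_B", "GENE_C", 1.0),
    ("GENE_C", "GENE_D", 1.0),
    ("GENE_E", "GENE_F", 1.0),
]
# Graded evidence: GENE_A is strongly implicated, GENE_C only weakly.
toy_seeds = {"GENE_A": 0.9, "GENE_C": 0.4}

scores = propagate_scores(toy_edges, toy_seeds)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}\t{score:.4f}")
```

In this toy setup the unseeded GENE_B, which sits between two seeds, picks up score from both, while GENE_E and GENE_F, which have no path to any seed, stay at zero. Using graded seed weights rather than a hard cutoff is the design choice that mirrors the abstract's point about carrying association uncertainty into the GBA framework.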

Original language: English
Pages (from-to): 1109-1121
Number of pages: 13
Journal: Genome Research
Issue number: 7
Publication status: Published - 2011 Jul

All Science Journal Classification (ASJC) codes

  • Genetics
  • Genetics (clinical)

