Systematic comparison of variant calling pipelines using gold standard personal exome variants

Sohyun Hwang, Eiru Kim, Insuk Lee, Edward M. Marcotte

Research output: Contribution to journal › Article › peer-review

211 Citations (Scopus)

Abstract

The success of clinical genomics using next-generation sequencing (NGS) requires the accurate and consistent identification of personal genome variants. Assorted variant calling methods have been developed, but their calls show low concordance with one another. Hence, a systematic comparison of the variant callers could give important guidance to NGS-based clinical genomics. Recently, a set of high-confidence variant calls for one individual (NA12878) was published by the Genome in a Bottle (GIAB) consortium, enabling performance benchmarking of different variant calling pipelines. Based on the gold standard reference variant calls from GIAB, we compared the performance of thirteen variant calling pipelines, testing combinations of three read aligners (BWA-MEM, Bowtie2, and Novoalign) and four variant callers (Genome Analysis Tool Kit HaplotypeCaller (GATK-HC), Samtools mpileup, Freebayes, and Ion Proton Variant Caller (TVC)), on twelve data sets for the NA12878 genome sequenced on different platforms, including Illumina2000, Illumina2500, and Ion Proton, with various exome capture systems and exome coverage. We observed different biases toward specific types of SNP genotyping errors by the different variant callers. The results of our study provide useful guidelines for reliable variant identification from deep sequencing of personal genomes.
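As a rough illustration of what benchmarking a pipeline against a gold standard call set involves, the sketch below compares a set of called variants with a truth set and reports sensitivity and precision. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the abstract does not name the metrics used, and the `Variant` key, the `benchmark` function, and the toy data are all hypothetical. Exact matching on (chromosome, position, ref, alt) is a simplification.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class BenchmarkResult(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    sensitivity: float
    precision: float


def benchmark(calls: Set[Variant], truth: Set[Variant]) -> BenchmarkResult:
    """Compare one pipeline's calls with a gold standard set by exact match."""
    tp = len(calls & truth)   # called and present in the gold standard
    fp = len(calls - truth)   # called but absent from the gold standard
    fn = len(truth - calls)   # in the gold standard but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return BenchmarkResult(tp, fp, fn, sensitivity, precision)


# Toy data for illustration only; these are not real NA12878 variants.
truth = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")}
calls = {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")}

result = benchmark(calls, truth)
print(result)
```

In practice, a comparison like this is usually restricted to the high-confidence regions of the reference call set, and variant representations are normalized before matching, since the same indel can be written in more than one way.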

Original language: English
Article number: 17875
Journal: Scientific Reports
Volume: 5
Publication status: Published - 2015 Dec 7

Bibliographical note

Funding Information:
We thank Novocraft Company for providing a trial version of Novoalign. This work was supported by grants from the National Institutes of Health, National Science Foundation, Cancer Prevention Research Institute of Texas, U.S. Army Research (58343–MA), and the Welch Foundation (F-1515) to E.M.M., and by grants from the National Research Foundation of Korea (2012M3A9B4028641, 2012M3A9C7050151) to I.L.

All Science Journal Classification (ASJC) codes

  • General
