Phylogeny-based Contamination Detection in Mitochondrial and Whole-Genome Sequencing Studies

genepi/haplocheck

haplocheck

Haplocheck detects in-sample contamination in mtDNA or WGS sequencing studies by analyzing the mitochondrial content. To run haplocheck, you can either use our cloud web service or install it locally.

The main features of haplocheck are:

  • A fast tool to detect in-sample contamination by analyzing the mitochondrial content of sequencing data.
  • Works on both VCF and BAM input files.
  • It estimates contamination by detecting polymorphic sites in the mtDNA data and classifying them into mitochondrial haplogroups using haplogrep.
  • It can be used as a proxy tool to estimate nDNA contamination levels. Our results show high concordance with the 1000G contamination levels (estimated using Verifybamid2), although concordance can vary in samples with large differences in mtDNA copy number (e.g. due to tissue or cell type).
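
To illustrate the idea in the third bullet, here is a minimal Python sketch. It is not haplocheck's actual code. It assumes a hypothetical input in which each mtDNA position maps to its observed alleles and their read fractions, and it splits those sites into a major and a minor profile. In a contaminated sample, the two profiles would be expected to fall into different haplogroups. The function name `split_profiles`, the data layout, and the example values are all assumptions made for illustration, and the haplogroup classification itself (done by haplogrep) is not reproduced here.

```python
from statistics import median


def split_profiles(sites):
    """Split sites into major and minor allele profiles.

    `sites` maps position -> {allele: read fraction}. This layout is a
    hypothetical one chosen for the sketch, not a haplocheck format.
    """
    major, minor, minor_levels = {}, {}, []
    for pos, alleles in sorted(sites.items()):
        # Rank the alleles at this position by read fraction, highest first.
        ranked = sorted(alleles.items(), key=lambda kv: kv[1], reverse=True)
        major[pos] = ranked[0][0]
        if len(ranked) > 1:
            # Polymorphic site: the second allele goes to the minor profile.
            minor[pos] = ranked[1][0]
            minor_levels.append(ranked[1][1])
        else:
            # Single-allele site: both profiles share the same allele.
            minor[pos] = ranked[0][0]
    # Rough summary of the minor component: median minor-allele fraction.
    level = median(minor_levels) if minor_levels else 0.0
    return major, minor, level


# Made-up example values, not real data.
example = {
    73: {"G": 0.90, "A": 0.10},
    263: {"G": 1.00},
    16519: {"C": 0.88, "T": 0.12},
}
major, minor, level = split_profiles(example)
print(major)
print(minor)
print(level)
```

Each profile could then be passed to a haplogroup classifier. Using the median of the minor-allele fractions as a summary is a design choice of this sketch only.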

Quick Start (VCF input)

 mkdir haplocheck
 cd haplocheck
 wget https://github.com/genepi/haplocheck/releases/download/v1.3.3/haplocheck.zip
 unzip haplocheck.zip
 ./haplocheck --out <out-file> <input-vcf>

Quick Start (BAM input)

curl -s install.cloudgene.io | bash -s 2.3.3
./cloudgene install https://github.com/genepi/haplocheck/releases/download/v1.3.2/haplocheck.zip 

Documentation

Full documentation for haplocheck can be found here.

Citation

Weissensteiner H, Forer L, Fendt L, Kheirkhah A, Salas A, Kronenberg F, Schoenherr S. 2021. Contamination detection in sequencing studies using the mitochondrial phylogeny. Genome Research. http://dx.doi.org/10.1101/gr.256545.119.

Contact

See here.

mtDNA Blog

Check out our blog regarding mtDNA topics.

Data Simulation

The script on how to create in-silico mixtures of two input samples can be found here.
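
As a rough illustration of what such a mixture implies, the following Python sketch computes the allele fractions expected when two samples are combined at a given proportion. It is not the linked script and does not operate on sequencing data. It only shows the arithmetic, assuming that both samples cover the same positions and that the contaminating sample contributes a fixed share of the reads. The function name `mix_allele_fractions`, the data layout, and the example values are hypothetical.

```python
def mix_allele_fractions(sample_a, sample_b, proportion_b):
    """Expected allele fractions when sample_b contributes `proportion_b` of reads.

    Both samples map position -> {allele: fraction}. This layout is a
    hypothetical one chosen for the sketch.
    """
    if not 0.0 <= proportion_b <= 1.0:
        raise ValueError("proportion_b must be between 0 and 1")
    if set(sample_a) != set(sample_b):
        raise ValueError("both samples must cover the same positions")
    mixed = {}
    for pos in sorted(sample_a):
        fractions_a = sample_a[pos]
        fractions_b = sample_b[pos]
        # Weighted average of the two samples' fractions for every allele seen.
        mixed[pos] = {
            allele: (1.0 - proportion_b) * fractions_a.get(allele, 0.0)
            + proportion_b * fractions_b.get(allele, 0.0)
            for allele in sorted(set(fractions_a) | set(fractions_b))
        }
    return mixed


# Made-up example: sample B differs from sample A at position 73 only.
sample_a = {73: {"A": 1.0}, 150: {"C": 1.0}}
sample_b = {73: {"G": 1.0}, 150: {"C": 1.0}}
mixture = mix_allele_fractions(sample_a, sample_b, proportion_b=0.05)
print(mixture)
```

In this toy mixture, only the position where the two samples differ becomes polymorphic, which is the kind of signal a contamination check looks for.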
