Nucleic Acids Res. 2016 Nov 29;45(Database issue):D139–D144. doi: 10.1093/nar/gkw1064

© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Figure 1. SNP2TFBS data generation pipeline. The rectangular boxes represent data files. Duplicated frames indicate multiple files of the same type, one for each position weight matrix (PWM). Encircled numbers refer to procedures: (1) Generation of the alternate human genome (hg19a) from the reference genome (hg19). (2) Conversion of the genome coordinates of the reference single nucleotide polymorphism (SNP) catalogue (VCF format) to the alternate genome. (3) Whole-genome scan of both genomes with PWMs from JASPAR CORE 2014 at a P-value threshold of 10⁻⁵. (4) Extraction of SNPs overlapping PWM matches on the respective genomes. (5) Extraction of SNPs that disrupt, create or change the score of overlapping PWM sites between the two genomes. (6) Merging of essential information from the single-PWM files into a master file, and generation of gene-annotated and reformatted versions (e.g. BED) from the primary data files. Variant annotation is carried out using an ANNOVAR input file (humandb/hg19_refGene.txt).
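
The comparison in step (5) can be illustrated with a minimal sketch. This is not the SNP2TFBS implementation: the matrix `TOY_PWM`, the score cutoff of 25 and the function names are hypothetical, chosen only for illustration. The actual pipeline scans whole genomes with JASPAR CORE 2014 matrices at a P-value threshold rather than a raw score cutoff. The sketch also scans one strand only and compares the best-scoring window around the SNP on each allele, which simplifies a site-by-site comparison.

```python
from typing import Dict, List

# Hypothetical 4-column integer PWM (log-odds-like scores); NOT a JASPAR matrix.
TOY_PWM: List[Dict[str, int]] = [
    {"A": 10, "C": -10, "G": -10, "T": -10},
    {"A": -10, "C": 10, "G": -10, "T": -10},
    {"A": -10, "C": -10, "G": 10, "T": -5},
    {"A": -10, "C": -10, "G": -10, "T": 10},
]


def score_site(pwm: List[Dict[str, int]], site: str) -> int:
    """Sum the per-column scores of one window as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_score(pwm: List[Dict[str, int]], seq: str) -> int:
    """Best PWM score over all windows of seq (forward strand only)."""
    width = len(pwm)
    return max(
        score_site(pwm, seq[i:i + width]) for i in range(len(seq) - width + 1)
    )


def classify_snp(pwm: List[Dict[str, int]], ref_seq: str, alt_seq: str,
                 cutoff: int) -> str:
    """Compare the best match on the reference and alternate allele.

    ref_seq and alt_seq are assumed to be the SNP plus (PWM width - 1)
    flanking bases on each side, so every window overlaps the SNP.
    """
    ref = best_score(pwm, ref_seq)
    alt = best_score(pwm, alt_seq)
    ref_hit, alt_hit = ref >= cutoff, alt >= cutoff
    if ref_hit and not alt_hit:
        return "disrupt"   # match present on the reference genome only
    if alt_hit and not ref_hit:
        return "create"    # match present on the alternate genome only
    if ref_hit and alt_hit and ref != alt:
        return "change"    # match on both genomes, with different scores
    return "none"          # no match on either genome, or identical scores


if __name__ == "__main__":
    # The SNP sits at index 3 of each 7-base window.
    print(classify_snp(TOY_PWM, "GGACGTC", "GGAAGTC", cutoff=25))
    print(classify_snp(TOY_PWM, "GACGTCC", "GACTTCC", cutoff=25))
```

Under these assumptions, a SNP is reported as disrupting a site when only the reference allele reaches the cutoff, as creating one when only the alternate allele does, and as changing the score when both alleles match with different scores.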