Author manuscript; available in PMC: 2020 Aug 4.
Published in final edited form as: J Vis Exp. 2019 Aug 15;(150):10.3791/59542. doi: 10.3791/59542

Table 1. List of Data Sources for MARRVEL.

All databases from which MARRVEL obtains data are listed in this table. For each database, we list the type of database, the URL/link, the rationale for including it in MARRVEL, and the primary references.

| Type of Database | Name of Database | URL/Link to Database | Rationale for Inclusion | Reference |
| --- | --- | --- | --- | --- |
| Human Genetics | OMIM | https://omim.org | The three main pieces of information that we draw from OMIM are gene function, associated phenotypes, and reported alleles. It is helpful to know whether a gene is associated with a known Mendelian phenotype whose molecular basis is known (# entries). Genes without such an association are candidates for novel gene discovery. For genes that do have one, if the patient’s phenotype does not match the reported disease and phenotype, or those of the patients in the literature, there is an increased opportunity to provide a phenotypic expansion. | PMID: 28654725 |
| Human Genetics | ExAC | exac.broadinstitute.org/ | ExAC contains more than 60,000 exomes and is, other than gnomAD (http://gnomad.broadinstitute.org/), the largest public collection of exomes that have been selected against individuals with severe early-onset Mendelian phenotypes. For MARRVEL’s purposes, ExAC serves as the best source of control-population minor allele frequencies. We are interested in two sets of outputs from ExAC. The first is the gene-centric overview of the expected versus observed number of missense and loss-of-function alleles. A metric called pLI (probability of loss-of-function intolerance) ranges between 0 and 1 and is likely related to how essential both copies of a gene are before reproductive age. A pLI score of 1 means that the gene is very intolerant of any loss-of-function variants and is under selective constraint. The second is the ExAC data that pertain to the specific variant. | PMID: 27535533 |
| Human Genetics | gnomAD | http://gnomad.broadinstitute.org | gnomAD contains a total of 123,136 exome sequences and 15,496 whole-genome sequences from unrelated individuals sequenced as part of various disease-specific and population genetic studies. In MARRVEL we display the population frequencies that pertain to the specific variant. | PMID: 27535533 |
| Human Genetics | ClinVar | https://www.ncbi.nlm.nih.gov/clinvar/ | ClinVar is a public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. Variants with interpretations reported by researchers and clinicians are valuable for analyzing how likely a variant is to be pathogenic. | PMID: 29165669 |
| Human Genetics | Geno2MP | http://geno2mp.gs.washington.edu/Geno2MP/ | Geno2MP is a collection of samples from the University of Washington Center for Mendelian Genetics. It contains ~9,650 exomes of affected individuals and unaffected relatives. This database links phenotypic and mode-of-inheritance information to specific alleles. For phenotype, we focus on comparing the affected organ system of the patient with that of the affected individuals in Geno2MP. A match in allele, mode of inheritance, and phenotype increases the probability that the variant is pathogenic. However, because of the small sample size, a negative association does not necessarily decrease a variant’s pathogenic priority. | http://geno2mp.gs.washington.edu/Geno2MP/#/ |
| Human Genetics | DGV | http://dgv.tcag.ca/dgv/app/home | To our knowledge, DGV is the largest public-access collection of structural variants, drawn from more than 54,000 individuals. The database includes samples from individuals who were reportedly healthy at the time of ascertainment, from up to 72 different studies. Possible limitations of these data include variation in the source and method of data acquisition, the lack of information regarding incomplete penetrance of pathogenic CNVs, and uncertainty about whether individuals will develop associated diseases after data collection. | PMID: 24174537 |
| Human Genetics | DECIPHER | https://decipher.sanger.ac.uk/ | The data displayed on MARRVEL include common variants from the control population. The data displayed include structural variants that cover the genomic location of the input variant. DECIPHER also contains variant and phenotypic information for affected individuals, but this can only be accessed through the DECIPHER database itself. | PMID: 19344873 |
| Integration | DIOPT | https://www.flyrnai.org/cgi-bin/DRSC_orthologs.pl | DIOPT provides a multiple protein sequence alignment of the best predicted orthologs in six model organisms against the protein sequence of the human gene of interest. The alignment provides information on the conservation of specific amino acids as well as functional protein domains. | PMID: 21880147 |
| Gene Function | GO Central | http://www.geneontology.org/ | For each gene, MARRVEL displays only gene ontology terms (Molecular Function, Cellular Component, and Biological Process) derived from experimental evidence. Terms are filtered by “experimental evidence codes”; GO terms based on “computational analysis evidence codes” and “electronic annotation evidence codes” (predictions) are excluded. | PMID: 10802651, 25428369 |
| Model Organism | SGD | https://www.yeastgenome.org/ | We collected data from multiple model organism databases and provide a summary of the biological and genetic functions of the predicted orthologs derived by DIOPT. | PMID: 22110037 |
| Model Organism | PomBase | https://www.pombase.org/ | Same rationale as SGD. | PMID: 22039153 |
| Model Organism | WormBase | http://wormbase.org | Same rationale as SGD. | PMID: 26578572 |
| Model Organism | FlyBase | http://flybase.org | Same rationale as SGD. | PMID: 26467478 |
| Model Organism | ZFIN | https://zfin.org/ | Same rationale as SGD. | PMID: 26097180 |
| Model Organism | MGI | http://www.informatics.jax.org/ | Same rationale as SGD. | PMID: 25348401 |
| Model Organism | RGD | https://rgd.mcw.edu/ | Same rationale as SGD. | PMID: 25355511 |
| Model Organism | GTEx | https://gtexportal.org/home/ | MARRVEL displays both mRNA and protein expression patterns in human tissues for each gene. The expression pattern can add insight into the phenotypes observed in patients and/or model organisms. | PMID: 29019975, 23715323 |
| Model Organism | The Human Protein Atlas | https://www.proteinatlas.org/ | Same rationale as GTEx. | PMID: 21752111 |
| Gene Function | IMPC | http://www.mousephenotype.org/ | MARRVEL provides a link to the mouse gene page on IMPC. If IMPC has made a knockout mouse, an exhaustive list of assays and their results is publicly available and can provide insight into the phenotype when a gene is lost. | PMID: 27626380 |
| Gene Function | Monarch Initiative | https://monarchinitiative.org/ | MARRVEL provides a link to the Phenogrid of a human gene on Monarch Initiative. This grid provides comparisons between the phenotypes of model organisms and known human diseases. | PMID: 27899636 |
| Integration | Ensembl | https://useast.ensembl.org/index.html | Ensembl gene IDs are used to link the different databases. | PMID: 29155950 |
| Integration | HGNC | https://www.genenames.org/ | HGNC official gene symbols are used for MARRVEL searches. | PMID: 27799471 |
| Integration | Mutalyzer | https://mutalyzer.nl/ | MARRVEL uses Mutalyzer’s API to convert different variant nomenclatures to genomic location. | PMID: 18000842 |
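
The evidence-code filtering described in the GO Central entry above can be illustrated with a minimal sketch. It is not MARRVEL's implementation. The set of codes below is the Gene Ontology Consortium's standard experimental group, but which codes MARRVEL actually keeps is not stated in the table, so that choice is an assumption. The `GoAnnotation` record, the `experimental_only` helper, and the example records are hypothetical names and placeholder data introduced only for illustration.

```python
from dataclasses import dataclass

# GO Consortium experimental evidence codes. Which codes MARRVEL itself
# retains is not given in the table, so this exact set is an assumption.
EXPERIMENTAL_CODES = frozenset({"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"})


@dataclass(frozen=True)
class GoAnnotation:
    """Hypothetical minimal record for one GO annotation of a gene."""
    gene: str
    go_id: str
    aspect: str    # "F" (function), "P" (process), or "C" (component)
    evidence: str  # GO evidence code, e.g. "IDA" or "IEA"


def experimental_only(annotations):
    """Return only annotations backed by an experimental evidence code.

    Computational-analysis codes (e.g. ISS) and electronic annotation
    (IEA) are dropped because they are not in EXPERIMENTAL_CODES.
    """
    return [a for a in annotations if a.evidence in EXPERIMENTAL_CODES]


# Placeholder records for illustration only; not real curated data.
EXAMPLE = [
    GoAnnotation("GENE1", "GO:EXAMPLE1", "F", "IDA"),
    GoAnnotation("GENE1", "GO:EXAMPLE2", "P", "IEA"),
    GoAnnotation("GENE1", "GO:EXAMPLE3", "C", "ISS"),
    GoAnnotation("GENE1", "GO:EXAMPLE4", "P", "IMP"),
]

if __name__ == "__main__":
    for ann in experimental_only(EXAMPLE):
        print(ann.go_id, ann.aspect, ann.evidence)
```

Using an allow-list of experimental codes, rather than a block-list of predicted ones, means that any evidence code not explicitly recognised is excluded by default, which matches the table's stated aim of avoiding predictions.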