This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2024 Feb 11:2024.02.08.579330. [Version 1] doi: 10.1101/2024.02.08.579330

The Genotype and Phenotypes in Families (GPF) platform manages the large and complex data at SFARI

Liubomir Chorbadjiev, Murat Cokol, Zohar Weinstein, Kevin Shi, Chris Fleisch, Nikolay Dimitrov, Svetlin Mladenov, Simon Xu, Jake Hall, Steven Ford, Yoon-ha Lee, Boris Yamrom, Steven Marks, Adriana Munoz, Alex Lash, Natalia Volfovsky, Ivan Iossifov
PMCID: PMC10871337  PMID: 38370639

Abstract

Identifying genotypic variants that affect phenotypes is a cornerstone of genetics research. The emergence of vast collections of deeply genotyped and phenotyped families has made it possible to search for variants associated with complex diseases. However, managing these large-scale datasets requires specialized computational tools tailored to organizing and analyzing such extensive data. GPF (Genotypes and Phenotypes in Families) is an open-source platform ( https://github.com/iossifovlab/gpf ) that manages genotypes and phenotypes derived from collections of families. The GPF interface supports interactive exploration of genetic variants, enrichment analysis for de novo mutations, and phenotype/genotype association analysis. In addition, GPF allows researchers to share their data securely with the broader scientific community. GPF is used to disseminate two large-scale family collection datasets (SSC, SPARK) for the study of autism funded by the SFARI foundation. GPF is nonetheless versatile and can manage genotypic data from other family collections, small or large. Our GPF instance, GPF-SFARI ( https://gpf.sfari.org/ ), provides protected access to comprehensive genotypic and phenotypic data for the SSC and SPARK collections. GPF-SFARI also provides public access to an extensive collection of de novo mutations identified in individuals with autism and related disorders, and to gene-level statistics derived from the protected datasets that characterize the genes’ roles in autism. Here, we highlight the primary features of GPF within the context of GPF-SFARI.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
