CNV-Profile Regression: A New Approach for Copy Number Variant Association Analysis in Whole Genome Sequencing Data

Yaqin Si; Wenbin Lu; Shannon Holloway; Hui Wang; Albert A Tucci; Amanda Brucker; Yuhuan Cheng; Li-San Wang; Gerard Schellenberger; Wan-Ping Lee; Jung-Ying Tzeng

doi:10.1101/2024.11.23.624994

Abstract

Copy number variants (CNVs) are DNA gains or losses involving >50 base pairs. Assessing CNV effects on disease risk requires consideration of several factors. First, there are no natural definitions for CNV loci. Second, CNV effects can depend on dosage and length. Third, CNV effects can be estimated more accurately when all CNV events in a genomic region are analyzed together to assess their joint effects. We propose a new framework for association analysis that directly models an individual’s entire CNV profile within a genomic region. This framework represents an individual’s CNVs using a CNV profile curve, which captures variations in CNV length and dosage and bypasses the need to predefine CNV loci. CNV effects are estimated at each genome position, making the results comparable across different studies. To jointly estimate the effects of all CNVs, we use a Lasso penalty to select CNVs associated with the trait and integrate a weighted L2-fusion penalty to encourage similar effects of adjacent CNVs when supported by the data. Simulations show that, across different settings, the proposed model identifies causal CNVs more effectively than baseline methods, maintains comparable false positive rates, and yields more precise effect-size estimates. When applied to CNVs derived from whole genome sequencing data of the Alzheimer’s Disease Sequencing Project, the proposed methods identify additional CNVs associated with Alzheimer’s Disease (AD). These identified CNVs overlap with several known AD-risk genes and are significantly enriched for biological processes related to neuron structures and functions crucial in AD development.
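The two ideas in the abstract, a CNV profile curve and a Lasso penalty combined with a weighted L2-fusion penalty, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `cnv_profile` and `penalized_loss`, the squared-error fit term, the segment breakpoints, and the fusion weights are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def cnv_profile(events, breakpoints):
    """Piecewise-constant dosage curve for one individual (illustrative only).

    events: list of (start, end, dosage) tuples, e.g. -1 for a one-copy
            deletion and +1 for a one-copy duplication (assumed coding).
    breakpoints: sorted positions that split the region into segments;
            assumed to include every event boundary.
    Returns one dosage value per segment.
    """
    profile = np.zeros(len(breakpoints) - 1)
    for start, end, dosage in events:
        for k in range(len(profile)):
            seg_lo, seg_hi = breakpoints[k], breakpoints[k + 1]
            # A segment receives the event's dosage if the event covers it.
            if start <= seg_lo and seg_hi <= end:
                profile[k] += dosage
    return profile

def penalized_loss(beta, X, y, lam1, lam2, weights):
    """Squared-error fit plus Lasso and weighted L2-fusion penalties.

    beta: one effect per segment, ordered by genomic position.
    weights: one non-negative weight per pair of adjacent segments
             (hypothetical; how they are chosen is not given here).
    """
    resid = y - X @ beta
    fit = 0.5 * np.mean(resid ** 2)
    lasso = lam1 * np.sum(np.abs(beta))                    # selects segments
    fusion = lam2 * np.sum(weights * np.diff(beta) ** 2)   # smooths neighbors
    return fit + lasso + fusion

# Two hypothetical individuals in a region split at positions 0, 100, 250, 400.
breakpoints = [0, 100, 250, 400]
deletion_carrier = cnv_profile([(0, 250, -1)], breakpoints)
duplication_carrier = cnv_profile([(100, 400, 1)], breakpoints)
```

Stacking one profile per individual gives a design matrix whose columns correspond to genomic segments rather than predefined CNV loci, so the Lasso term can drop segments with no effect while the fusion term pulls the effects of adjacent segments toward each other.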

This is a preprint.
