Figure - PMC

2023 Nov 28;14:7805. doi: 10.1038/s41467-023-43651-y

© The Author(s) 2023

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Fig. 1. a SV annotation. A coding SV that is a deletion or a duplication, fully containing gene B and partially encompassing gene C, is segmented into a sequence of six genome segments: two affected genes, two intergenic noncoding regions, and two zero-padding segments. A noncoding SV that is a deletion or a duplication may affect genes A, B, and C based on distance or TAD annotations (triangle shaded area). Its genomic segment sequence has eleven segments: three candidate target genes, five intergenic noncoding regions, a noncoding SV region, and two zero-padding segments.

b SV interpretation. The annotated SV from (a), with shape 6 × 238 or 11 × 238, is fed into the PhenoSV architecture. Each MHA (multi-head attention) block has two types of attention heads to model indirect and direct effects on genes. The pathogenicity of the overall SV (PhenoSV scores, $p_{sv}$) and of individual genes (PhenoSV gene scores, $p_{sv\text{-}gene}$) can be inferred from SV-level and gene-level embeddings, respectively. Prior phenotype information (HPO terms) can further be used to infer phenotype-related pathogenicity for the overall SV (phenotype-aware PhenoSV scores, $p_{sv}^{phen}$) and for individual genes (phenotype-aware PhenoSV gene scores, $p_{sv\text{-}gene}^{phen}$).
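The segment counts and input shapes in panel a can be checked with a minimal sketch. It is not PhenoSV's code: the function name `build_segment_matrix`, the placeholder feature values, and the choice to place one zero-padding row at each end of the sequence are all assumptions made for illustration. Only the feature width of 238 and the totals of 6 and 11 segments come from the caption.

```python
from typing import List

# Per-segment feature width, taken from the 6 x 238 / 11 x 238 shapes in the caption.
N_FEATURES = 238


def build_segment_matrix(segment_features: List[List[float]]) -> List[List[float]]:
    """Add two zero-padding rows around the annotated segments.

    Assumption: one padding row goes at each end. The caption only states
    that two zero-padding segments exist, not where they sit.
    """
    for row in segment_features:
        if len(row) != N_FEATURES:
            raise ValueError(f"expected {N_FEATURES} features, got {len(row)}")
    pad = [0.0] * N_FEATURES
    return [list(pad)] + [list(row) for row in segment_features] + [list(pad)]


# Coding SV: 2 affected genes + 2 intergenic noncoding regions = 4 annotated segments.
# The 1.0 values are placeholders, not real annotations.
coding = build_segment_matrix([[1.0] * N_FEATURES for _ in range(4)])

# Noncoding SV: 3 candidate genes + 5 intergenic regions + 1 noncoding SV region = 9.
noncoding = build_segment_matrix([[1.0] * N_FEATURES for _ in range(9)])

coding_shape = (len(coding), len(coding[0]))
noncoding_shape = (len(noncoding), len(noncoding[0]))
print(coding_shape, noncoding_shape)  # (6, 238) (11, 238)
```

With the two padding rows added, four annotated segments give the 6 × 238 input for a coding SV and nine give the 11 × 238 input for a noncoding SV, matching the shapes quoted in panel b.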