Abstract
Viruses deploy genetically encoded strategies to coopt host machinery and support viral replicative cycles. Here, we use protein structure similarity to scan for molecular mimicry, manifested by structural similarity between viral and endogenous host proteins, across thousands of cataloged viruses and hosts spanning broad ecological niches and taxonomic range, including bacteria, plants and fungi, invertebrates, and vertebrates. This survey identified over 6,000,000 instances of structural mimicry; more than 70% of viral mimics cannot be discerned through protein sequence alone. We demonstrate that the manner and degree to which viruses exploit molecular mimicry varies by genome size and nucleic acid type and identify 158 human proteins that are mimicked by coronaviruses, providing clues about cellular processes driving pathogenesis. Our observations point to molecular mimicry as a pervasive strategy employed by viruses and indicate that the protein structure space used by a given virus is dictated by the host proteome.
A record of this paper’s transparent peer review process is included in the Supplemental Information.
Keywords: virology, mimicry, pathogen-host interaction, evolution, protein structure, modeling, COVID-19
Graphical Abstract
Highlights
-
•
Structural mimicry of host proteins is a pervasive strategy across earth’s virome
-
•
Majority of the >6,000,000 cataloged mimics cannot be discerned through sequence
-
•
Protein structure space utilized by viruses is dictated by host proteomes
-
•
Knowledge of viral mimics provides clues about pathophysiology of COVID-19
Virally encoded factors resembling host factors hijack host molecular machines in a phenomenon called molecular mimicry. Lasso et al. use 3D protein structural information to identify over 6,000,000 instances of structural mimicry, >70% of which cannot be discerned through protein sequence alone. Their work provides the first systematic analysis of molecular mimicry across the earth's virome.
Introduction
Viruses deploy an array of genetically encoded strategies to coopt host machinery and support viral replicative cycles. Among the strategies, protein-protein interactions (PPIs), mediated by promiscuous, multifunctional viral proteins are widely documented. Targeted discovery tools focused largely on viruses of public-health importance and have experimentally mapped thousands of virus-host protein complexes, and structure-informed prediction algorithms have allowed discovery of such interactions across all fully sequenced human-infecting viruses (Lasso et al., 2019). Molecular mimicry, manifested by structural similarity between viral and endogenous host proteins, allows viruses to harness or disrupt cellular functions including nucleic acid metabolism and modulation of immune responses. Yet, while examples of this latter strategy pepper the literature (Wimmer and Schreiner, 2015; Alcami, 2003; Guven-Maiorov et al., 2016), most have focused on human-infecting viruses (Elde and Malik, 2009; Felix and Savvides, 2017), and a systematic analysis of pathogen-encoded molecular mimics has not been performed.
The vast genomic landscape occupied by viruses hampers the discovery of evolutionary relationships between viral proteins and their hosts. As is well known, however, since three-dimensional (3D) protein structure is much better conserved than sequence, structural information can be used to interrogate evolutionary relationships (Lasso et al., 2019; Bamford et al., 2005) as well as uncover virus-encoded structural mimics that cannot be detected by sequence relationships (see STAR Methods). Here, we use protein structure similarity to identify virally encoded mimics of host proteomes. Briefly, we first employ sequence-based methods to identify proteins that have similar structures to queried viral proteins and then use structural alignment to find “structural neighbors” of viral proteins (Figure 1 A). We refer to the corresponding viral proteins as mimics of their host-encoded neighbors. We applied the approach to a set of 337,493 viral proteins representing 7,486 viruses across a broad host taxonomic range, including bacteria, plants and fungi, invertebrates, and vertebrates. Our survey identified over 6,000,000 instances of structural mimicry, the vast majority of which (>70%) cannot be discerned through protein sequence alone (see below). Our results point to molecular mimicry as a pervasive strategy employed by viruses and indicate that the protein structure space used by a given virus is dictated, at least in part, by the host proteome.
We further observe that the manner and degree to which viruses exploit molecular mimicry varies by genome size and nucleic acid type. For example, while human-infecting and arthropod-infecting viruses occupy a structure space most similar to their host proteome, arboviruses, which are transmitted to humans by insect vectors, encode promiscuous proteins that mimic both human and insect proteins. In addition, we find that, relative to their proteome size, single-stranded RNA (ssRNA) viruses, including coronaviruses (CoVs), have circumvented the limitations of their small genomes by mimicking human proteins to a greater extent than their large dsDNA counterparts like Pox and Herpes viruses. Interrogation of proteins mimicked by human-infecting viruses points to broad diversification of cellular pathways targeted via structural mimicry, identifies biological processes that may underly autoimmune disorders, and reveals virally encoded mimics that may be leveraged to engineer synthetic metabolic circuits or may serve as targets for therapeutics. Finally, we identified over 150 cellular proteins (including members of the complement activation pathway and critical regulators of innate and adaptive immunity) that are mimicked by CoV, providing clues about the cellular processes underlying the pathogenesis driving the ongoing COVID-19 pandemic.
Results
Mining the Virome-wide Structure Space to Identify Host Proteins Mimicked by Viruses
In order to identify mimicry relationships, we implemented the strategy summarized in Figure 1A. Sequence-based searches (see STAR Methods) of 337,493 viral proteins (269,077 unique sequences) against the PDB identified structural templates for 92,868 (34.5%), with large disparities in coverage: structural templates were identified for 64.7% of the proteins from human-infecting viruses, whereas the coverage for non-vertebrate-infecting viruses ranged from 31% to 38% (Figure S1A; Table S2). As shown in Figure S1B, structural templates for vertebrate and human-infecting viruses came primarily from other viruses while, for example, bacterial proteins provided most of the templates for bacteria-infecting viruses. Most of the templates for invertebrate-infecting viruses came from vertebrate and bacterial proteins, likely reflecting the limited structural coverage of non-vertebrate viruses and invertebrate hosts in the PDB. In order to measure structure similarity between protein structures we utilized Ska, an extensively utilized and validated tool for inference of structure-based functional relationships even in the absence of detectable sequence similarity (Garzón et al., 2016; Hwang et al., 2017; Lasso et al., 2019; Zhang et al., 2012; Petrey et al., 2003, 2009; Yang and Honig, 2000). In addition to Ska, we employed a conservative global structural similarity criteria (structural alignment score, SAS < 2.5 Å) (Budowski-Tal et al., 2010; Kolodny et al., 2005; Subbiah et al., 1993) to infer structural mimics and minimize biases imposed by local structural similarities (see STAR Methods). As shown in Figure S2, this approach enables rediscovery of known virus-encoded protein structure mimics. Structural alignment of the 92,868 templates with PDB proteins identified 6,083,167 mimicry relationships, involving 88,715 viral proteins and 26,542 host protein structures present in the PDB (data are available for download at https://github.com/gorkaLasso/VirusHost_mimicry). The vast majority of these mimicry relationships (72.2%) could not be retrieved through sequence similarity alone based on the requirement that no e value obtained from any one or three sequence-based methods was greater than 1 × 10−6 (see STAR Methods).
Host Range and Taxonomic Division Drives Viral Protein Structure Space and Mimicry
Taxonomic enrichment analysis (Figures 1B and 1C) reveals a clear correlation between the extent of virus structural mimicry and the identity of the host. Since their proteomes are well represented in the PDB, we first focused on the proteomes of E. coli-infecting phages, their natural host E. coli, the budding yeast Saccharomyces cerevisiae, and human. Significant structure similarity is only observed between E. coli-infecting phages and their natural host (Figure 1B). As shown in Figure 1C, with the exception of invertebrate-infecting viruses, which constitute a special case (see below), a broader analysis across all 7,486 cataloged viruses and 38,363 proteins from 4,045 putative hosts demonstrates that every virus group exhibits significant protein structural similarity to the taxonomic division of their hosts. In addition, viruses exhibit structural similarity to proteomes in other taxonomic divisions that are close in evolutionary distance to the host taxonomic division and reduced structural similarity to more distant taxonomic divisions (e.g., the structural space of human-infecting viruses are enriched for human, vertebrates, and invertebrate proteins but not bacteria or plant proteins, Figure 1C). Such similarities are reflected in evolutionary relationships between different taxonomic divisions. For instance, accumulating evidence describes some compartments of the phytosphere as environmental host spots for horizontal gene transfer (HGT) that facilitate genetic exchange between plants and bacteria (Pontiroli et al., 2009), underscoring the significant structural relationship between plant and fungi-infecting viruses and phages (Figure 1C). Moreover, in agreement with the results obtained for the broader virus categories, we found the protein structure space of human-infecting viruses to be enriched with structures found in human and non-human vertebrate proteomes (Figure 1C)—pointing to a possible path through which zoonotic infections may jump from animal reservoirs to cause human disease.
Unlike viruses of bacteria and vertebrates, we do not observe significant structural similarity between proteomes of invertebrate-infecting viruses and their invertebrate hosts—perhaps owing to the fact that invertebrates account for the taxonomic group with the lowest structural coverage in the PDB (containing just 2,687 unique invertebrate proteins). To bridge this knowledge gap, we modeled the proteome of the mosquito Ae. Aegypti (see STAR Methods) and compared the protein structure space of mosquito with each group of viruses. As shown in Figure 1D, we find that mosquito-infecting viruses occupy a significantly larger mosquito structural space than viruses that infect hosts in other taxonomic divisions. Moreover, we observe that mosquito-borne arboviruses, human-infecting viruses that use mosquitoes as vectors, engage a larger space of mosquito protein mimics than human-infecting viruses that do not use mosquitoes as vectors (p value 2.3 × 10−3; Figure 1D). Together, these data further highlight the evolutionary pressures imposed on viruses to utilize a protein structure space defined by their host’s range.
Diversification of Mimicry Strategies among Human-Infecting Viruses
While we do not observe recursive structural mimicry of human proteins across the human virome, we find that viruses belonging to the same viral family (including astroviruses, coronaviruses, filoviruses, orthomyxoviruses, paramyxoviruses, rhabdoviruses, and picornaviruses; data available for download at https://github.com/gorkaLasso/VirusHost_mimicry) tend to share a large fraction of mimics (where >75% of viruses within the family share >75% structural mimics). Figure 2 A displays the number of human structural mimics per virus for different viral families. A number of families, including Hepadnaviridae, Bunyaviridae, Papillomaviridae, Parvoviridae, Retroviridae, Arenaviridae, and Polyomaviridae, utilize structural mimicry far less than others. On the other hand, poxviruses and herpesviruses, large dsDNA viruses that are less constrained by genome size (Figure 2B) and more likely to retain horizontally transferred genes (Alcami, 2003; Elde and Malik, 2009), encode many human protein structure mimics (Figure 2A). Since viral genome sizes vary greatly both between and within viral families (Figure 2B), we computed the number of human mimics relative to viral proteome sizes in an effort to assess the relative utilizations of mimicry (Figure 2C). We find that relative to their proteome size, small RNA viruses, which are constrained by genome size, display enhanced structural promiscuity, with a larger number of human protein mimics per residue. So, while viruses constrained by genome size may be restricted in their ability to expand their genomes through HGT, they have circumvented this limitation by encoding multifunctional proteins that mimic domains across multiple host proteins. These observations underscore the role of structural mimicry in evolving expanded molecular functionality in size constrained genomes.
Pathways Targeted by Viral Mimicry
Human-Infecting DNA-Encoded Viruses
The critical nature of virus-host PPIs in engaging host molecular machinery to fulfill viral life cycles has led to convergence of evolutionarily unrelated viruses around key cellular pathways (e.g., human-infecting DNA, RNA and retro-transcribing viruses converge, via different sets of PPIs, on multiple cellular metabolic pathways) (Lasso et al., 2019). In contrast, as shown in Figure 2D; Table S3, structurally mimicked proteins revealed little such convergence. Moreover, when convergence was observed, it often involved 2 or 3 viral families belonging to the same Baltimore classification. For example, owing to their large genomes and in agreement with their propensity to acquire host genes through HGT, Herpesviridae, Adenoviridae, and Poxviridae, families of dsDNA viruses are enriched for the largest number of mimicked biological pathways, 113, 38, and 35, respectively (Figure 2D; Table S3). Yet, while these viruses harbor a large number of human protein mimics, they do not mimic cellular components of RNA and DNA metabolism—likely reflecting the fact that they encode polymerases (as is the case of Poxviridae) and/or integrate into the cellular genome and are therefore bystander participants in cellular DNA replication.
We observe an enrichment of cytokine-related pathways in poxvirus- and herpesvirus-mimicked proteins and chemokine-related pathways in proteins mimicked by herpesviruses (Table S3; Figure 2D). Indeed, poxviruses and herpesviruses have evolved a repertoire of immune evasion strategies by acquiring immunomodulatory host genes through genetic recombination (often involving cytokines and cytokine receptors). Undetectable sequence similarity with the host has complicated past attempts to identification virus-host relationships (Odom et al., 2009; Felix and Savvides, 2017). However, while many of the relationships between herpesvirus and chemokine or chemokine receptors can be discerned from sequence, we have expanded the list of known mimics to include 61 herpesviral structure mimics not discernible from sequence alone (Figure 2D; complete list of mimics is available at https://github.com/gorkaLasso/VirusHost_mimicry). For example, as shown in Figure 2D, we observed that US27D protein from Cercopithecine betaherpesvirus 5, a cytomegalovirus, mimics human CXCR4 (sequence identity: 10.4%), a chemokine receptor that regulates T cell inflammatory programs and can serve as a co-receptor for HIV (Busillo and Benovic, 2007). It is unclear if the structural mimics encoded by DNA viruses that we have identified are a result of sequence divergence of products acquired by HGT or through convergent evolution. Yet, they highlight the pervasive use of mimicry by DNA viruses and point to host factors that are targeted by structural mimicry.
Human-Infecting RNA-Encoded Viruses
As highlighted above, RNA viruses have circumvented limitations imposed on their small proteomes by encoding multifunctional and structurally promiscuous proteins that mimic domains across multiple host proteins (Figure 2C). In accordance with known phenotypes (like fibrinolysis and hemorrhagic fever) associated with Dengue virus (DENV; Flaviridae) infection (Chuang et al., 2014; Lin et al., 2011), we find that proteins mimicked by these RNA viruses are enriched for functions and pathways related to blood coagulation and the complement pathway—an observation that is mirrored by PPIs mediated by proteins encoded by these viruses (we previously found that PPI networks mediated by proteins of 51 of 56 flaviviruses were enriched for “complement and coagulation cascades”) (Lasso et al., 2019). While examples of such mimicry have been identified through sequence homology between DENV C, prM, E, and NS1 proteins and human coagulation factors (Lin et al., 2011; Chuang et al., 2014), we find similar mimicry, which cannot be discerned through sequence, employed by four other positive-sense ssRNA ((+)ssRNA) virus families (168, 166, 1,376, and 1,293 structural relationships between coagulation factors and astroviridae, coronaviruses, caliciviruses, and picornaviruses, respectively; sequence identity <20%, Table S3, data available at https://github.com/gorkaLasso/VirusHost_mimicry). So, while DENV immunization or exposure elicits antibodies that cross-react with multiple coagulation factor-associated pathological states (Chuang et al., 2014; Lin et al., 2011), our observations suggest that infections with viruses belonging to these other families may also result in similar imbalances in immune responses. Indeed, viruses within these families (e.g., SARS-Coronaviridae, hepatitis A-Picornaviridae, and hepatitis C-Flaviridae) are known to be associated with coagulation disorders such as thrombocytopenia, thrombosis, and hemorrhage (Goeijenbier et al., 2012).
CoVs
Knowledge of the precise regulatory programs that control the viral life cycle and mediate immune pathology can provide valuable clues about disease determinants. A closer look at coronaviruses underscores the level of diversification and structural promiscuity of protein mimics encoded by these viruses. As highlighted in Figure 2C, relative to proteome size, (+)ssRNA viruses including CoVs utilize structural mimicry more so than other human-infecting viruses. We find that the majority (∼96%) of the 158 structural mimics identified are shared by three or more coronaviruses (data available at https://github.com/gorkaLasso/VirusHost_mimicry). We also observe that seasonal CoVs tend to share greater global sequence identity with the human proteins they mimic than their more pathogenic CoV counterparts (including MERS, SARS-1, and SARS-CoV-2). However, amino acid identities at key functional sites ultimately shape the extent to which host proteins and pathways can be intervened through structural mimicry. To that end, further computational and experimental analysis and interrogation of CoV-encoded structure mimics will be necessary to determine the biological implications of sequence conservation at precise amino acid residues.
Among CoV-encoded structural mimics, we have identified a number that target key regulators of anti-viral immune programs. First, in the context of CoV infection, intracellular sensing of viral RNA through DDX58 (also known as RIG-I; retinoic acid-inducible gene I), a pathogen recognition receptor, triggers a signaling cascade that results in the production and secretion of type I interferons (IFNα/β) (Abe and Shapira, 2019; Jensen and Thomsen, 2012). IFNα/β in turn binds to cell surface receptors (IFNRs) on nearby cells and, following a signal-transduction cascade, leads to STAT1-mediated (signal transducer and activator of transcription 1) upregulation of hundreds of IFN-stimulated genes (ISGs) that govern cellular responses to infection and render cells refractory to viral infection. These responses are indispensable for control of viral replication and initiation of long-term immune responses (Abe and Shapira, 2019; Jensen and Thomsen, 2012). As shown in Figure 3 , we find that NSP13 of CoV mimics DDX3X and RIG-I as well as other helicase proteins, critical cellular components that cooperate through both direct and indirect interactions to initiate immune responses to viral infection. RIG-I ligands include ssRNAs that contain a 5′-triphosphate moiety (Abe and Shapira, 2019; Jensen and Thomsen, 2012), and previous biochemical characterizations have attributed multiple enzymatic activities to NSP13, including hydrolysis of NTPs and dNTPs and RNA 5′-triphosphatase activity (Ivanov et al., 2004; Adedeji and Lazarus, 2016). So, mimicry of RIG-I may serve to reduce the amount of viral RNA that is available to be sensed in infected cells. In addition, the CoV replicase complex has recently been demonstrated to interact with host cell DDX proteins (V'Kovski et al., 2019; Chen et al., 2009), suggesting that coronaviruses may utilize both PPIs and structural mimicry to manipulate functions of these proteins.
In addition to mimicking critical components of the cellular nucleic acid sensing machinery, we have identified CoV-encoded structural mimics that target key regulators of downstream anti-viral programs mediated by STAT1. As shown in Figure 3, we observe significant structural similarity between CoV NSP3 protein and human PARP9 and PARP14, ADP-ribosyltransferases with known cross-regulatory roles in controlling STAT1-mediated signaling (Iwata et al., 2016). Though the precise mode of action has remained unresolved, CoV ORF3, which encodes NSP3, has indeed been shown to be an antagonist of IFNα/β signaling and interferes with STAT1 nuclear translocation (Kopecky-Bromberg et al., 2007). Our results point to structural mimicry as a molecular explanation underlying these observations.
Lastly, as discussed above, we find that structural mimicry of complement components is a feature shared across all coronaviruses that were part of this study (including SARS-CoV-2). The complement system is a critical regulator of immune responses to microbial threats, but when dysregulated by age-related effects or excessive acute and chronic tissue damage, complement activation can contribute to pathologies mediated by inflammation. Our data implicate virally encoded structural mimics that may contribute to SARS (Coronaviridae)-associated coagulation disorders like thrombocytopenia, thrombosis, and hemorrhage (Goeijenbier et al., 2012)—disease outcomes shared with hepatitis A (Picornaviridae) and hepatitis C (Flaviridae) infections (which also encode coagulation and complement cascade mimics). In addition, while the age-related differences in susceptibility to CoV (including SARS-CoV-2) are likely a consequence of multiple underlying variables, one possibility is that prior exposure to coronavirus may potentiate complement-mediated responses to subsequent coronavirus infections. Moreover, a corollary of these observations is that complement-associated syndromes (including complement deficiencies or hyper-activation phenotypes like those associated with macular degeneration) may impact clinical outcome of SARS-CoV-2 infection. Indeed, as part of a separate retrospective clinical study, we have demonstrated that genetic and functional dysregulation of coagulation and complement functions (including history of macular degeneration), are associated with morbidity and mortality in SARS-CoV-2-infected patients—effects that could not be explained by age or sex (Ramlall et al., 2020).
Discussion
The mimicry of host protein structures is a strategy used by viruses to harness or disrupt host cellular functions (Wimmer and Schreiner, 2015; Alcami, 2003; Guven-Maiorov et al., 2016; Elde and Malik, 2009; Felix and Savvides, 2017). Structural mimicry can occur at the level of entire protein domains or in the form of “interface mimicry,” where the structure of host protein residues involved in PPIs is mimicked on the surface of a viral protein (Franzosa and Xia, 2011; Guven-Maiorov et al., 2020, 2019). High-throughput experimental methods, focused largely on viruses of public-health importance, have experimentally mapped thousands of virus-host protein complexes while structure-based computational tools based on interface mimicry have reported notable successes in the prediction of PPIs (Franzosa and Xia, 2011; Guven-Maiorov et al., 2020, 2019). Recently, we reported a structure-informed prediction algorithm, P-HIPSTer, that exploits domain-level similarity and interface properties to predict virus/host PPIs across all fully sequenced human-infecting viruses (Lasso et al., 2019). Studies such as these have revealed valuable information about the molecular mechanisms that underlie viral infection. Yet, while most of these experimental and computational studies have focused on PPIs, a systematic analysis of viral mimicry of host proteins has not been reported. Here, rather than considering PPIs, we explore the extent to which viral mimicry of host proteins is a common phenomenon and ask whether the “structure space” occupied by a particular virus is related to that of the host it infects. Moreover, by analyzing the host proteins that are mimicked, we are able to gain functional insights about infection mechanisms that are complementary to those gained by predicting PPIs.
Structure enables identification of mimics between viral and host proteins that cannot be observed from sequence alone. Mimicry relationships have been detected through sequence similarity and linear motif co-occurrence. The method presented here relies on sequence similarity (albeit distant) to proteins with experimentally solved structures used to build structural relationships. This limitation is reflected in the lower structural coverage of non-vertebrate-infecting viruses (Figure S1A). However, we bypass limitations of sequence-based approaches by leveraging 3D protein structure to identify virally encoded proteins with significant structural similarity to host proteins. Although examples of pathogen-encoded protein mimicry have been reported in the literature, to our knowledge, this is the first comprehensive and systematic analysis of molecular mimicry across the earth’s virome—spanning 337,493 viral proteins from 7,486 viruses (bacteria-, plant-fungi-, invertebrate-, and vertebrate-infecting viruses) and 38,363 proteins from 4,045 putative hosts.
Our pipeline uses a conservative global structural similarity criteria (SAS < 2.5 Å) to infer structural mimics (Subbiah et al., 1993; Kolodny et al., 2005; Budowski-Tal et al., 2010). Consequently, our approach does not consider local structural similarities defined by small protein regions and linear motifs (frequently found in disordered regions) (Franzosa and Xia, 2011; Guven-Maiorov et al., 2016). It is reasonable to consider that global structural mimicry relationships denote a common ancestral origin. However, we cannot rule out convergent evolution as a source of structural mimicry of smaller domains (e.g., <100 residues long) that would nevertheless meet the structural similarity criteria we have imposed. Importantly, structural mimicry does not necessarily imply functional mimicry. Furthermore, some of the mimics that we identify may represent spurious structural relationships that stem from underlying functional class of a given pair of proteins (e.g., virus and host-encoded proteases). Future experimental and functional interrogation will be necessary to demonstrate functional implications for the molecular mimics discovered as part of this work. Nevertheless, the results illustrate the ability of protein-structure-based analysis to infer structural, functional, and evolutionary relationships between viruses and their hosts.
Our results demonstrate that regardless of genome size, replicative cycle, or ecological niche, the evolutionary pressures imposed on viruses are reflected in the structure space they occupy and illustrate that mimicry may both constrain and enable host range. Of note, while structural similarity between viral and host proteins serves as a scaffold enabling viral proteins to coopt host pathways, amino acid identities at key functional sites will ultimately shape the extent to which host pathways can be intervened. Our observations offer a unique first step to investigating the role of amino acid variability at mimicked functional sites that might help explain differences in phenotypic outcome associated with viral infections.
Finally, the repertoire of structural mimics we discover opens new opportunities to identify potential mechanisms underlying autoimmune disorders of viral origin and new protein-based immune-modulatory therapeutics. For example, leveraging mimicry relationships between coronaviruses and human proteins informs about cellular targets and signaling cascades that are tuned during infection. Such knowledge can provide important clues about pathways that mediate pathology associated with infection and may point the way for designer therapeutics directed at these pathways. In addition, though beyond the scope of the current study, such knowledge may be useful in vaccine development as well as engineering of synthetic cellular operations where viral proteins are used as modules to build circuits with robust and predictable outputs. In the short term, and as highlighted by recent discovery of SARS-CoV-2 risk factors (Ramlall et al., 2020), information about virus-encoded structural mimics can help refine large-scale clinical studies and reveal determinants of immunity, susceptibility, and clinical outcome associated with human infection.
STAR★Methods
Key Resources Table
Resource Availability
Lead Contact
Further information for resources should be directed to and will be fulfilled by the Lead Contact, Sagi Shapira (ss4197@cumc.columbia.edu).
Materials Availability
This study did not generate new materials.
Data and Code Availability
All data and code generated as part of this study is available at GitHub (https://github.com/gorkaLasso/VirusHost_mimicry)
-
•
This paper analyzes existing, publicly available data. Appropriate links to these datasets is provided in the Key Resources Table
-
•
This paper does not report original code.
-
•
The scripts used to generate the figures presented in this paper are available at https://github.com/gorkaLasso/VirusHost_mimicry
-
•
Any additional information required to reproduce this work is available from the Lead contact
Experimental Model and Subject Details
This section does not apply to our computational study
Method Details
Viral Dataset Assembly
We compiled a dataset of 7,486 viruses, represented by a total of 337,493 viral protein amino acid sequences together with annotations of their corresponding hosts, from virus-hostDB as of October, 2016 (Mihara et al., 2016). Viruses were classified according to the taxonomic divisions of their corresponding host based on NCBI taxonomic divisions (Federhen, 2012) (Table S1).
Structural Neighbor Search
The 337,493 viral protein amino acid sequences were reduced to 269,077 non-redundant sequences filtered at 100% sequence identity with Cd-hit (Li and Godzik, 2006). Sequence-based methods were then used to identify experimentally determined protein structures that, based on their sequence relationship to the viral protein, are expected to have similar structures (referred to here as “structural templates” - Figure 1A). First, viral proteins were parsed into domains using CD-search (Marchler-Bauer et al., 2017). Next, to identify structural templates, full sequences and parsed domains were queried against the PDB (Berman et al., 2000). Sequence homology search was performed in three steps, where each step is run only if the preceding step fails to report a structural template with a conservative E-value < 1x10-12: i) Blast-based search (Altschul et al., 1990); ii) HHblits-based search (Remmert et al., 2011) and; iii) the third step runs five iterations of PSI-Blast. The pipeline reports the best structural template mapping to non-overlapping sequence segments for each query protein. When multiple structural templates map to the same segment in a query protein, only the structural template(s) with the lowest E-value are reported. The set of structural templates, describing the structural space of viral proteins, is further refined in order to minimize the number of redundant templates. To this end, structural templates derived from the same pdb chain and with their start and end positions within 15 residues are clustered together, using the longest structural template as the cluster representative. Finally, the set of structural templates is searched against the PDB database with Ska (Petrey et al., 2003; Yang and Honig, 2000) using a structural alignment score (SAS) < 2.5Å as a cut-off to identify structurally similar proteins (Kolodny et al., 2005). The ska algorithm focuses on alignment of secondary structure elements and uses only C-alpha coordinates of residues. Hence any differences in loop or side-chain conformations that might be expected because of differences in resolution would have little to no effect on a ska alignment.
Sequence Homology Assessment between Viral Proteins and Their Corresponding Structural Neighbors
Sequence homology among a pair of structurally similar proteins was considered significant when neither a Blast, HHBlits or PSI-Blast alignment yielded an E-value < 1x10-6. In addition, for Figures 2 and 3, sequence identities for pairs of structurally similar proteins, were computed using the Needleman-Wunsch algorithm implemented in the EMBOSS package (considering only the corresponding protein segments sharing structural similarity) (Rice et al., 2000).
Calculating Overlap between Structural Space of a Virus and Proteins in a Given Taxonomic Division
Taxonomic division enrichment (Figures 1B and 1C) was computed with a hypergeometric test that describes the significance of having k structural neighbors belonging to a particular taxonomic division (out of n total structural neighbors for a group of viruses) given the entire set of structurally solved proteins in the PDB of size N that contains K proteins from the same taxonomic division (Johnson et al., 1992). To minimize experimental bias of multiple PDB entries for the same protein, structurally solved proteins were mapped to their Uniprot accession codes (Apweiler et al., 2004) using SIFTS mapping (Velankar et al., 2013).
Protein Modeling and Structural Neighbor Search for Aedes aegypti Proteome
Our validated, inhouse modeling pipeline (Garzón et al., 2016; Lasso et al., 2019) was used to model the 16,652 protein sequences (obtained from the Uniprot database (Apweiler et al., 2004)) in the Aedes aegypti proteome. Three-dimensional models for full-length proteins and protein domains, as defined by the Conserved Domain Database (Marchler-Bauer et al., 2017), were either taken directly from the PDB (Berman et al., 2000) or built by homology modeling as described previously (Garzón et al., 2016; Zhang et al., 2012). Protein structure models were built using NEST (Petrey et al., 2003; Xiang and Honig, 2001). Structural neighbor search was run on the set of modeled mosquito proteins using Ska (Petrey et al., 2003; Yang and Honig, 2000) and a structural alignment score (SAS) < 2.5Å was used as a cut-off to identify structurally similar proteins (Kolodny et al., 2005). The modeling and structural neighbor search pipeline reported structural neighbors for 10,895 (65%) mosquito proteins.
Identifying the Mosquito Structural Neighbor Space for Each Virus in the Dataset
The unique set of mosquito structural neighbors was computed for across all viruses. Structural relationships between viral and mosquito proteins was inferred by identifying common structural neighbors shared between a given viral and mosquito protein pair. Poorly annotated viruses with less than 4 annotated viral proteins were discarded. Invertebrate-infecting viruses were further sub-classified into i) invertebrate-infecting viruses (including mosquito infecting viruses); ii) viruses infecting both human and mosquitoes and; iii) viruses infecting mosquitoes but not humans. We applied the non-parametric Mann-Whitney test to compare the size of the mosquito structural neighbor space per virus (normalized by the total number of annotated viral proteins in each virus) between different groups of viruses (Figure 1C), and outliers were removed with the Rout method (Q = 1%) implemented in Prism (GraphPad, 2019).
Human Protein Structural Neighbors of Human-Infecting Viral Families
To estimate the number of human protein structural neighbors per virus, viruses with 4 or more annotated proteins with structural template for at least 30% of the viral proteins were considered (Figures 2A and 2C). The total number of human structural neighbors per virus was corrected by the corresponding structural coverage (e.g. for a virus where 70% of its proteins have a structural template and shows structural similarity to 30 human proteins, the corrected number of human structural neighbors will be 30 / 0.7 = 42.9, Figure 1A). Viruses were grouped into viral families and families with less than 5 viruses were not plotted. In order to plot the number of human structural neighbors per residue for a given family of human-infecting viruses (Figure 2C), we normalized the total number of human mimics per virus (Figure 2A) by the corresponding proteome size (Figure 2B).
Functional Enrichment Analysis
Enrichment of biological ontologies (molecular function, biological pathways and diseases) was determined using David (Huang et al., 2009a, 2009b), background corrected using the 5,841 human protein structures extracted from PDB. A Bonferroni corrected p-value < 0.01 was used to identify enriched biological ontologies. Viral families and biological ontologies enriched in at least one viral family were clustered using R and the heatmap.2 function within gplots package (values in original matrix: -log10 P-valueBonferroni; distance metric: Euclidean, method: complete) (R Core Team, 2016; Warnes et al., 2016).
Quantification and Statistical Analysis
Statistical parameters, including the definition of center, dispersion and associated significance, are reported in the main text, Figures and Figure legends. We have applied hypergeometric test and Mann-Whitney test to calculate significance. P values for pathway enrichment analysis were adjusted for multiple comparison. The section entitled “Method Details” describes the statistical analyses performed. Data are judged to be statistically significant when p < 0.05 in applied statistical analyses. In Figures 1 and 2 asterisks denote statistical significance (Figure 1D: ∗∗, Pvalue < 0.0001; Figure 2C: ∗, Pvalue < 0.05, ∗∗∗, Pvalue < 0.001, ∗∗∗∗, Pvalue < 0.0001
Additional Resources
All data generated as part of this study is available at GitHub (https://github.com/gorkaLasso/VirusHost_mimicry)
Acknowledgments
We thank Donald Petrey for technical support. This work was funded by NIH grants 5R01GM109018-05 and 5U54CA209997-03 to S.D.S., 5R01GM030518-39 to B.H., and equipment grants S10OD012351 and S10OD021764 to the Columbia University Department of Systems Biology.
Author Contributions
Conceptualization, S.D.S.; Methodology, G.L., B.H., and S.D.S.; Validation, G.L.; Formal Analysis, G.L.; Resources, B.H. and S.D.S.; Data Curation, G.L.; Writing, G.L., B.H., and S.D.S.; Visualization, G.L. and S.D.S.; Supervision, B.H. and S.D.S.; Project Administration, S.D.S.; Funding Acquisition, B.H. and S.D.S.
Declaration of Interests
The authors declare no competing interests.
Published: October 13, 2020
Footnotes
Supplemental Information can be found online at https://doi.org/10.1016/j.cels.2020.09.006.
Supplemental Information
References
- Abe T., Shapira S.D. Negative regulation of cytosolic sensing of DNA. Int. Rev. Cell Mol. Biol. 2019;344:91–115. doi: 10.1016/bs.ircmb.2018.09.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Adedeji A.O., Lazarus H. Biochemical characterization of Middle East respiratory syndrome coronavirus helicase. mSphere. 2016;1 doi: 10.1128/mSphere.00235-16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alcami A. Viral mimicry of cytokines, chemokines and their receptors. Nat. Rev. Immunol. 2003;3:36–50. doi: 10.1038/nri980. [DOI] [PubMed] [Google Scholar]
- Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2. [DOI] [PubMed] [Google Scholar]
- Apweiler R., Bairoch A., Wu C.H., Barker W.C., Boeckmann B., Ferro S., Gasteiger E., Huang H., Lopez R., Magrane M. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2004;32:D115–D119. doi: 10.1093/nar/gkh131. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bamford D.H., Grimes J.M., Stuart D.I. What does structure tell us about virus evolution? Curr. Opin. Struct. Biol. 2005;15:655–663. doi: 10.1016/j.sbi.2005.10.012. [DOI] [PubMed] [Google Scholar]
- Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Budowski-Tal I., Nov Y., Kolodny R. FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately. Proc. Natl. Acad. Sci. USA. 2010;107:3481–3486. doi: 10.1073/pnas.0914097107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Burg J.S., Ingram J.R., Venkatakrishnan A.J., Jude K.M., Dukkipati A., Feinberg E.N., Angelini A., Waghray D., Dror R.O., Ploegh H.L., Garcia C.K. Structural biology. Structural basis for chemokine recognition and activation of a viral G protein-coupled receptor. Science. 2015;347:1113–1117. doi: 10.1126/science.aaa5026. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Busillo J.M., Benovic J.L. Regulation of CXCR4 signaling. Biochim. Biophys. Acta. 2007;1768:952–963. doi: 10.1016/j.bbamem.2006.11.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen J.Y., Chen W.N., Poon K.M., Zheng B.J., Lin X., Wang Y.X., Wen Y.M. Interaction between SARS-CoV helicase and a multifunctional cellular protein (Ddx5) revealed by yeast and mammalian cell two-hybrid systems. Arch. Virol. 2009;154:507–512. doi: 10.1007/s00705-009-0323-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chuang Y.C., Lin Y.S., Liu H.S., Yeh T.M. Molecular mimicry between dengue virus and coagulation factors induces antibodies to inhibit thrombin activity and enhance fibrinolysis. J. Virol. 2014;88:13759–13768. doi: 10.1128/JVI.02166-14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Elde N.C., Malik H.S. The evolutionary conundrum of pathogen mimicry. Nat. Rev. Microbiol. 2009;7:787–797. doi: 10.1038/nrmicro2222. [DOI] [PubMed] [Google Scholar]
- Federhen S. The NCBI taxonomy database. Nucleic Acids Res. 2012;40:D136–D143. doi: 10.1093/nar/gkr1178. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Felix J., Savvides S.N. Mechanisms of immunomodulation by mammalian and viral decoy receptors: insights from structures. Nat. Rev. Immunol. 2017;17:112–129. doi: 10.1038/nri.2016.134. [DOI] [PubMed] [Google Scholar]
- Fernandez E.J., Wilken J., Thompson D.A., Peiper S.C., Lolis E. Comparison of the structure of vMIP-II with eotaxin-1, RANTES, and MCP-3 suggests a unique mechanism for CCR3 activation. Biochemistry. 2000;39:12837–12844. doi: 10.1021/bi001166f. [DOI] [PubMed] [Google Scholar]
- Franzosa E.A., Xia Y. Structural principles within the human-virus protein-protein interaction network. Proc. Natl. Acad. Sci. USA. 2011;108:10538–10543. doi: 10.1073/pnas.1101440108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Garzón J.I., Deng L., Murray D., Shapira S., Petrey D., Honig B. A computational interactome and functional annotation for the human proteome. eLife. 2016;5:e18715. doi: 10.7554/eLife.18715. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goeijenbier M., Van Wissen M., Van De Weg C., Jong E., Gerdes V.E., Meijers J.C., Brandjes D.P., Van Gorp E.C. Review: viral infections and mechanisms of thrombosis and bleeding. J. Med. Virol. 2012;84:1680–1696. doi: 10.1002/jmv.23354. [DOI] [PMC free article] [PubMed] [Google Scholar]
- GraphPad Software . La Jolla; California, USA: 2019. GraphPad Software.https://www.graphpad.com/ [Google Scholar]
- Guven-Maiorov E., Hakouz A., Valjevac S., Keskin O., Tsai C.J., Gursoy A., Nussinov R. HMI-PRED: a web server for structural prediction of host-microbe interactions based on interface mimicry. J. Mol. Biol. 2020;432:3395–3403. doi: 10.1016/j.jmb.2020.01.025. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Guven-Maiorov E., Tsai C.-J., Ma B., Nussinov R. Interface-based structural prediction of novel host-pathogen interactions. Methods Mol. Biol. 2019;1851:317–335. doi: 10.1007/978-1-4939-8736-8_18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Guven-Maiorov E., Tsai C.J., Nussinov R. Pathogen mimicry of host protein-protein interfaces modulates immunity. Semin. Cell Dev. Biol. 2016;58:136–145. doi: 10.1016/j.semcdb.2016.06.004. [DOI] [PubMed] [Google Scholar]
- Huang D.W., Sherman B.T., Lempicki R.A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37:1–13. doi: 10.1093/nar/gkn923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huang D.W., Sherman B.T., Lempicki R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009;4:44–57. doi: 10.1038/nprot.2008.211. [DOI] [PubMed] [Google Scholar]
- Hwang H., Dey F., Petrey D., Honig B. Structure-based prediction of ligand-protein interactions on a genome-wide scale. Proc. Natl. Acad. Sci. USA. 2017;114:13685–13690. doi: 10.1073/pnas.1705381114. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ivanov K.A., Thiel V., Dobbe J.C., Van Der Meer Y., Snijder E.J., Ziebuhr J. Multiple enzymatic activities associated with severe acute respiratory syndrome coronavirus helicase. J. Virol. 2004;78:5619–5632. doi: 10.1128/JVI.78.11.5619-5632.2004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Iwata H., Goettsch C., Sharma A., Ricchiuto P., Goh W.W., Halu A., Yamada I., Yoshida H., Hara T., Wei M. PARP9 and PARP14 cross-regulate macrophage activation via STAT1 ADP-ribosylation. Nat. Commun. 2016;7:12849. doi: 10.1038/ncomms12849. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jensen S., Thomsen A.R. Sensing of RNA viruses: a review of innate immune receptors involved in recognizing RNA virus invasion. J. Virol. 2012;86:2900–2910. doi: 10.1128/JVI.05738-11. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Johnson N.L., Kemp A.W., Kotz S. Wiley; 1992. Univariate Discrete Distributions. [Google Scholar]
- Kolodny R., Koehl P., Levitt M. Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures. J. Mol. Biol. 2005;346:1173–1188. doi: 10.1016/j.jmb.2004.12.032. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kopecky-Bromberg S.A., Martínez-Sobrido L., Frieman M., Baric R.A., Palese P. Severe acute respiratory syndrome coronavirus open reading frame (ORF) 3b, ORF 6, and nucleocapsid proteins function as interferon antagonists. J. Virol. 2007;81:548–557. doi: 10.1128/JVI.01782-06. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kvansakul M., Van Delft M.F., Lee E.F., Gulbis J.M., Fairlie W.D., Huang D.C., Colman P.M. A structural viral mimic of prosurvival Bcl-2: a pivotal role for sequestering proapoptotic Bax and Bak. Mol. Cell. 2007;25:933–942. doi: 10.1016/j.molcel.2007.02.004. [DOI] [PubMed] [Google Scholar]
- Lasso G., Mayer S.V., Winkelmann E.R., Chu T., Elliot O., Patino-Galindo J.A., Park K., Rabadan R., Honig B., Shapira S.D. A structure-informed atlas of human-virus interactions. Cell. 2019;178:1526–1541.e16. doi: 10.1016/j.cell.2019.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li M., Lee H., Yoon D.W., Albrecht J.C., Fleckenstein B., Neipel F., Jung J.U. Kaposi's sarcoma-associated herpesvirus encodes a functional cyclin. J. Virol. 1997;71:1984–1991. doi: 10.1128/jvi.71.3.1984-1991.1997. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li W., Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–1659. doi: 10.1093/bioinformatics/btl158. [DOI] [PubMed] [Google Scholar]
- Lin Y.S., Yeh T.M., Lin C.F., Wan S.W., Chuang Y.C., Hsu T.K., Liu H.S., Liu C.C., Anderson R., Lei H.Y. Molecular mimicry between virus and host and its implications for dengue disease pathogenesis. Exp. Biol. Med. (Maywood) 2011;236:515–523. doi: 10.1258/ebm.2011.010339. [DOI] [PubMed] [Google Scholar]
- Marchler-Bauer A., Bo Y., Han L., He J., Lanczycki C.J., Lu S., Chitsaz F., Derbyshire M.K., Geer R.C., Gonzales N.R. CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 2017;45:D200–D203. doi: 10.1093/nar/gkw1129. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mihara T., Nishimura Y., Shimizu Y., Nishiyama H., Yoshikawa G., Uehara H., Hingamp P., Goto S., Ogata H. Linking virus genomes with host taxonomy. Viruses. 2016;8:66. doi: 10.3390/v8030066. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ramlall V., Thangaraj P.M., Meydan C., Foox J., Butler D., Kim J., May B., de Freitas J.K., Glicksberg B.S., Mason C.E., Tatonetti N.P., Shapira S.D. Immune complement and coagulation dysfunction in adverse outcomes of SARS-CoV-2 infection. Nat. Med. 2020 doi: 10.1038/s41591-020-1021-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Odom M.R., Hendrickson R.C., Lefkowitz E.J. Poxvirus protein evolution: family wide assessment of possible horizontal gene transfer events. Virus Res. 2009;144:233–249. doi: 10.1016/j.virusres.2009.05.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petrey D., Fischer M., Honig B. Structural relationships among proteins with different global topologies and their implications for function annotation strategies. Proc. Natl. Acad. Sci. USA. 2009;106:17377–17382. doi: 10.1073/pnas.0907971106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petrey D., Xiang Z., Tang C.L., Xie L., Gimpelev M., Mitros T., Soto C.S., Goldsmith-Fischman S., Kernytsky A., Schlessinger A. Using multiple structure alignments, fast model building, and energetic analysis in fold recognition and homology modeling. Proteins. 2003;53:430–435. doi: 10.1002/prot.10550. [DOI] [PubMed] [Google Scholar]
- Pontiroli A., Rizzi A., Simonet P., Daffonchio D., Vogel T.M., Monier J.M. Visual evidence of horizontal gene transfer between plants and bacteria in the phytosphere of transplastomic tobacco. Appl. Environ. Microbiol. 2009;75:3314–3322. doi: 10.1128/AEM.02632-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
- R Core Team . R Foundation for Statistical Computing; 2016. R: a language and environment for statistical computing.https://www.R-project.org [Google Scholar]
- Remmert M., Biegert A., Hauser A., Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods. 2011;9:173–175. doi: 10.1038/nmeth.1818. [DOI] [PubMed] [Google Scholar]
- Rice P., Longden I., Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16:276–277. doi: 10.1016/s0168-9525(00)02024-2. [DOI] [PubMed] [Google Scholar]
- Subbiah S., Laurents D.V., Levitt M. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Curr. Biol. 1993;3:141–148. doi: 10.1016/0960-9822(93)90255-m. [DOI] [PubMed] [Google Scholar]
- Velankar S., Dana J.M., Jacobsen J., Van Ginkel G., Gane P.J., Luo J., Oldfield T.J., O'Donovan C., Martin M.J., Kleywegt G.J. SIFTS: structure integration with function, taxonomy and sequences resource. Nucleic Acids Res. 2013;41:D483–D489. doi: 10.1093/nar/gks1258. [DOI] [PMC free article] [PubMed] [Google Scholar]
- V'Kovski P., Gerber M., Kelly J., Pfaender S., Ebert N., Braga Lagache S., Simillion C., Portmann J., Stalder H., Gaschen V. Determination of host proteins composing the microenvironment of coronavirus replicase complexes by proximity-labeling. eLife. 2019;8:e42037. doi: 10.7554/eLife.42037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Warnes G.R., Bolker B., Bonebakker L., Gentleman R., Huber W., Liaw A., Lumley T., Maechler M., Magnusson A., Moeller S. gplots: various R programming tools for plotting data. 2016. https://github.com/talgalili/gplots
- Wimmer P., Schreiner S. Viral mimicry to usurp ubiquitin and SUMO host pathways. Viruses. 2015;7:4854–4872. doi: 10.3390/v7092849. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xiang Z., Honig B. Extending the accuracy limits of prediction for side-chain conformations. J. Mol. Biol. 2001;311:421–430. doi: 10.1006/jmbi.2001.4865. [DOI] [PubMed] [Google Scholar]
- Yang A.S., Honig B. An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments. J. Mol. Biol. 2000;301:691–711. doi: 10.1006/jmbi.2000.3975. [DOI] [PubMed] [Google Scholar]
- Zhang Q.C., Petrey D., Deng L., Qiang L., Shi Y., Thu C.A., Bisikirska B., Lefebvre C., Accili D., Hunter T. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature. 2012;490:556–560. doi: 10.1038/nature11503. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
All data and code generated as part of this study is available at GitHub (https://github.com/gorkaLasso/VirusHost_mimicry)
-
•
This paper analyzes existing, publicly available data. Appropriate links to these datasets is provided in the Key Resources Table
-
•
This paper does not report original code.
-
•
The scripts used to generate the figures presented in this paper are available at https://github.com/gorkaLasso/VirusHost_mimicry
-
•
Any additional information required to reproduce this work is available from the Lead contact