Poultry Science. 2024 Apr 17;103(7):103775. doi: 10.1016/j.psj.2024.103775

Codon usage bias of goose circovirus and its adaptation to host

Quanming Xu *, Jie Cao *, Kul Raj Rai , Binling Zhu §, Dan Liu ¶,1, Chunhe Wan †,1,2
PMCID: PMC11091504  PMID: 38713985

Abstract

Goose circovirus (GoCV), a potential immunosuppressive virus possessing a circular single-stranded DNA genome, is widely distributed in both domesticated and wild geese. This virus infection causes significant economic losses in the waterfowl industry. The codon usage patterns of viruses reflect the evolutionary history and genetic architecture, allowing them to adapt quickly to changes in the external environment, particularly to their hosts. In this study, we retrieved the coding sequences (Rep and Cap) and the genome of GoCV from GenBank, conducting comprehensive research to explore the codon usage patterns in 144 GoCV strains. The overall codon usage of the GoCV strains was relatively similar and exhibited a slight bias. The effective number of codons (ENC) indicated a low overall extent of codon usage bias (CUB) in GoCV. Combined with the base composition and relative synonymous codon usage (RSCU) analysis, the results revealed a bias toward A- and G-ending codons in the overall codon usage. Analysis of the ENC-GC3s plot and neutrality plot suggested that natural selection plays an important role in shaping the codon usage pattern of GoCV, with mutation pressure having a minor influence. Furthermore, the correlations between ENC and relative indices, as well as correspondence analysis (COA), showed that hydrophobicity and geographical distribution also contribute to codon usage variation in GoCV, suggesting the possible involvement of natural selection. In conclusion, GoCV exhibits comparatively slight CUB, with natural selection being the major factor shaping the codon usage pattern of GoCV. Our research contributes to a deeper understanding of GoCV evolution and its host adaptation, providing valuable insights for future basic studies and vaccine design related to GoCV.

Key words: goose circovirus, codon usage bias, natural selection, mutation, host adaptability

INTRODUCTION

Circoviruses are non-enveloped, icosahedral viruses with a diameter ranging from 15 to 25 nm. They possess circular single-stranded DNA (ssDNA) genomes spanning 1,700 to 2,100 nucleotides (Ball et al., 2004; Rosario et al., 2017). The circovirus genome contains 2 major open reading frames (ORFs): the replication-associated protein (Rep) and capsid protein (Cap) genes. Rep facilitates viral genome replication and the regulation of gene expression, while Cap is an immunogenic protein that forms the protective viral shell and mediates viral entry into host cells (Chen et al., 2003). These genes are transcribed bidirectionally: Rep is encoded on the virion-sense strand and Cap on the complementary-sense strand (Stenzel et al., 2018). Circoviruses can infect a diverse range of bird, mammal, and fish species (Todd, 2004; Decaro et al., 2014; Feher et al., 2022). In avian hosts, circovirus infections primarily destroy lymphoid tissues, causing immunosuppression. This immunosuppression in turn makes circovirus-infected birds more susceptible to secondary infections by other pathogenic microorganisms, including bacteria, fungi, and viruses (Todd, 2004; Guo et al., 2011; Cui et al., 2023).

Goose circovirus (GoCV), belonging to the Circovirus genus in the Circoviridae family, was first identified by Soike et al. in 1999. The virus has been associated with growth retardation and feathering disorders in commercial geese (Soike et al., 1999; Chen et al., 2006). Subsequently, GoCV has been detected in domestic geese across various European and Asian regions (Chen et al., 2003; Ball et al., 2004; Kozdrun et al., 2012; Yu et al., 2007). Notably, GoCV has also been identified in wild geese (Stenzel et al., 2018). GoCV infection can impair the immune system, causing varying degrees of lymphocyte depletion and histiocytosis in the bursa of Fabricius, spleen, and thymus, ultimately leading to more severe secondary infections (Ball et al., 2004). Published data indicate an overall prevalence of GoCV infection in domesticated geese ranging from 20 to 56% (Stenzel et al., 2018). While previous studies have primarily focused on virus identification, epidemiology, and histopathology, a comprehensive examination of GoCV's evolutionary strategy is needed to offer valuable insights for future basic studies and vaccine design.

Generally, the degeneracy or redundancy of codons provides an opportunity for evolution to enhance translation efficiency while maintaining the same sequence of amino acids (McFeely et al., 2022). The non-random use of synonymous codons is called codon usage bias (CUB) (Grantham et al., 1980). Mutation pressure and natural selection are the key factors influencing CUB. In addition, other factors, such as dinucleotide abundance, tRNA levels, GC content, protein properties, and geographic location, can also influence viral CUB (Rahman et al., 2018; Deb et al., 2021). Unlike prokaryotes and eukaryotes, viruses, as obligate intracellular parasites, rely entirely on the host cellular machinery for replication and protein synthesis. The relationship between the codon usage of viruses and that of their hosts affects virus survival, virulence, evolution, and evasion of the host immune system (Lauring et al., 2012; Costafreda et al., 2014; Bera et al., 2017). Therefore, studying the CUB of viruses could further clarify the relationship between viruses and their hosts and also illuminate molecular evolution and viral gene regulation (Rahman et al., 2018; Yao et al., 2020).

To date, more than a hundred genomic sequences of GoCV strains are available in the NCBI database. However, a comprehensive analysis of the CUB in GoCV is still lacking. In this paper, we employed various methods to analyze the codon usage pattern of GoCV and determine the main factor affecting its CUB. This study will strengthen our understanding of the evolution of GoCV and provide insight into controlling GoCV transmission.

MATERIALS AND METHODS

Sequence Selection

A total of 144 complete genomes of GoCV available until January 2023 were retrieved from the NCBI GenBank database (http://www.ncbi.nlm.nih.gov/). The ORF V1s (Reps) and ORF C1s (Caps) of all the GoCV strains were extracted via the MEGA-X program (version 10.1.8). After removing the stop codon of ORF V1 and the start codon of ORF C1 (Supplementary Table 1), the extracted ORFs were concatenated in the following order: ORFV1-ORFC1 (ORFVC, complete sequence).

Codon Usage Bias Analysis

The nucleotide contents (A%, U%, C%, and G%), GC content, the frequency of each nucleotide at the third position of synonymous codons (U3s, C3s, A3s, and G3s), relative synonymous codon usage (RSCU), effective number of codons (ENC), Gravy (hydrophobicity), and Aroma (aromaticity) were calculated using BioEdit (version 7.0.9) and CodonW (version 1.4.2) (http://codonw.sourceforge.net/). Additionally, the GC values for each codon position (GC1, GC2, and GC3) of each sequence were calculated using the online cusp program (http://emboss.toulouse.inra.fr/cgi-bin/emboss/cusp). ENC values range between 20 and 61, and a lower ENC value indicates a greater extent of codon bias within a gene (Zhou et al., 2023). The codon adaptation index (CAI) is another effective measure of synonymous codon usage preferences in a coding sequence, with a low CAI value indicating low codon usage bias. CAI and expected CAI (E-CAI) values of the GoCV coding sequences were calculated using the CAIcal web-server (http://genomes.urv.cat/CAIcal/), with the codon usage tables of A. cygnoides and A. anser (https://www.kazusa.or.jp/codon/) used as reference sets (Puigbo et al., 2008a). If the CAI value is higher than the E-CAI value, it suggests that the virus adaptation to its host might be due to translational selection (Puigbo et al., 2008b; Ismail et al., 2019). Isoacceptor tRNA data were collected from the genome annotation of A. cygnoides (https://www.ncbi.nlm.nih.gov/datasets/gene/taxon/8845/?gene-type=Small%20RNAs). The relative abundances of the 16 dinucleotides in the GoCV coding sequences were calculated using DAMBE (v5.3.19) software (Karlin and Burge, 1995). By convention, a dinucleotide XY is inferred to be overrepresented when Pxy > 1.23 and underrepresented when Pxy < 0.78 (Li et al., 2018).
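As an illustration of the dinucleotide measure, the odds ratio Pxy = fxy / (fx fy) can be sketched in a few lines of Python. This is a minimal sketch and not the DAMBE implementation: the function names and the toy sequence are hypothetical, dinucleotides are counted in overlapping windows along a single strand, and the 1.23 and 0.78 cut-offs are the ones quoted above.

```python
import math
from collections import Counter
from itertools import product


def dinucleotide_odds_ratios(seq):
    """Return P_xy = f_xy / (f_x * f_y) for the 16 dinucleotides of one sequence."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    mono = Counter(seq)
    # Overlapping windows: positions (0,1), (1,2), ... (assumption of this sketch).
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for x, y in product("ACGU", repeat=2):
        f_x, f_y = mono[x] / n, mono[y] / n
        f_xy = di[x + y] / (n - 1)
        # The ratio is undefined when one of the bases is absent.
        ratios[x + y] = f_xy / (f_x * f_y) if f_x > 0 and f_y > 0 else float("nan")
    return ratios


def classify(p_xy, over=1.23, under=0.78):
    """Label a dinucleotide using the thresholds quoted in the text."""
    if math.isnan(p_xy):
        return "undefined"
    if p_xy > over:
        return "over-represented"
    if p_xy < under:
        return "under-represented"
    return "within expected range"


toy = "AUGGCAGAUCUGAGAUCCUGA"  # hypothetical sequence, not a GoCV ORF
for dinuc, value in sorted(dinucleotide_odds_ratios(toy).items()):
    print(dinuc, round(value, 3), classify(value))
```

In practice the function would be applied to each concatenated ORF, and the per-strain values summarized as a range and mean ± SD.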

Similarity Analysis of Codon Usage Patterns Between GoCV and its Host

The RSCU represents the ratio of the observed value to the expected value of the specific codon for the same amino acid (Sharp and Li, 1986), removing the effects of amino acid composition and coding sequence length. It is an important measure for analyzing the overall synonymous codon usage bias in a given coding sequence. Usually, codons with RSCU values >1.6 and <0.6 are considered overrepresented and underrepresented, respectively. If a particular codon has the highest RSCU in both the virus and its host, it can be considered evidence of shared codon preference (Roy et al., 2021).
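The RSCU definition above (observed count of a codon divided by the count expected under equal use of all synonymous codons) can be sketched as follows. This is an illustrative sketch rather than the CodonW code; the names `CODON_TABLE`, `SYNONYMOUS`, and `rscu`, and the toy input, are assumptions introduced here. The standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code in UCAG order (first base varies slowest).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("UCAG", repeat=3), AMINO)}

# Synonymous families, excluding Met, Trp, and stop codons (59 codons, 18 amino acids).
SYNONYMOUS = {}
for codon, aa in CODON_TABLE.items():
    if aa not in "*MW":
        SYNONYMOUS.setdefault(aa, []).append(codon)


def rscu(cds):
    """Return {codon: RSCU} for the codons of amino acids present in `cds`."""
    cds = cds.upper().replace("T", "U")
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU undefined for this family
        for c in family:
            # observed / expected, where expected = total / family size
            values[c] = counts[c] * len(family) / total
    return values


toy = "UUUUUCUUCUUC"  # hypothetical input: one UUU and three UUC codons
print(rscu(toy))      # UUU -> 0.5, UUC -> 1.5
```

The RSCU values of one synonymous family always sum to the family size, which is why the two Phe values in Table 1 add up to 2.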

The similarity index (SiD or D [A, B]) is an indicator to assess the potential influence of a host's codon usage pattern on virus codon usage. The SiD values were calculated according to previous study methods (Xu, et al., 2021). The SiD value ranges from 0 to 1, with a higher SiD value indicating a greater influence of the host on the virus codon usage (Zhou, et al., 2013).
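The text defers to earlier work for the SiD formula. The sketch below assumes the commonly used definition, in which R(A, B) is the cosine similarity between the virus and host RSCU vectors and D(A, B) = (1 - R(A, B)) / 2; whether the authors used exactly this form is an assumption. The function name and the two-codon example are hypothetical.

```python
import math


def similarity_index(virus_rscu, host_rscu):
    """Return D(A, B) = (1 - R(A, B)) / 2 over the codons shared by both tables."""
    codons = sorted(set(virus_rscu) & set(host_rscu))
    dot = sum(virus_rscu[c] * host_rscu[c] for c in codons)
    norm = math.sqrt(
        sum(virus_rscu[c] ** 2 for c in codons)
        * sum(host_rscu[c] ** 2 for c in codons)
    )
    r = dot / norm  # cosine similarity R(A, B)
    return (1.0 - r) / 2.0


# Phe values for the complete GoCV coding sequence and A. cygnoides from Table 1,
# used only to show the call; a real analysis would use all 59 codons.
gocv = {"UUU": 0.75, "UUC": 1.25}
host = {"UUU": 1.07, "UUC": 0.93}
print(round(similarity_index(gocv, host), 4))
```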

The Effect of Mutation Pressure and Natural Selection on Codon Usage Bias in GoCV

An ENC-plot (ENC vs. GC3s) was constructed to determine the factors influencing CUB, such as mutation pressure (Wright, 1990; He et al., 2019). When codon usage is primarily influenced by mutation pressure, the ENC values lie on or around the expected curve, which gives the expected ENC as a function of GC3s. If the observed ENC values fall significantly below this curve, it suggests that other factors, such as natural selection, play a major role in shaping codon usage bias.
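For reference, the expected curve is usually drawn from Wright's (1990) approximation, ENC_expected = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC3s fraction. A minimal sketch follows; the function name and the sample values of s are illustrative assumptions.

```python
def expected_enc(gc3s):
    """Expected ENC under GC3s compositional constraint alone (Wright, 1990).

    `gc3s` is a fraction between 0 and 1, not a percentage.
    """
    s = gc3s
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# The curve peaks near s = 0.5 and falls toward both extremes.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, round(expected_enc(s), 2))
```

An observed ENC well below `expected_enc(s)` at the same GC3s is the pattern the text interprets as evidence of factors other than mutation pressure.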

A Neutrality Plot (GC12 vs. GC3) was generated to identify and compare the neutral degree of mutation pressure and natural selection during the evolution of genes (Sueoka, 1999a). The GC contents at the first, second, and third positions (GC1, GC2, and GC3) were computed using the EMBOSS cusp online tool (http://emboss.toulouse.inra.fr/cgi-bin/emboss/cusp), followed by correlation analysis and regression. A slope of the regression curve close to 1 indicates a greater influence of mutation pressure on the gene, while a slope close to 0 implies a greater role of natural selection (Das and Roy, 2021).
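The regression step can be sketched with an ordinary least-squares fit of GC12 on GC3. This is an illustrative sketch only: the function name is hypothetical and the three data points are invented to show the call, not values from the GoCV dataset.

```python
def neutrality_regression(gc3, gc12):
    """Least-squares fit gc12 = slope * gc3 + intercept; returns (slope, intercept)."""
    n = len(gc3)
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented values for three hypothetical strains.
gc3_values = [0.40, 0.50, 0.60]
gc12_values = [0.47, 0.48, 0.49]
slope, intercept = neutrality_regression(gc3_values, gc12_values)
print(round(slope, 3), round(intercept, 3))
```

Under the interpretation given above, a slope near 0.1 would be read as mutation pressure accounting for roughly 10% of the variation, with the remainder attributed to natural selection and other constraints.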

The Parity Rule 2 (PR2) plot illustrates the AU bias [A3/(A3+U3)] versus the GC bias [G3/(G3+C3)] of 4-fold degenerate codon families (alanine, glycine, proline, threonine, valine, arginine [CGA, CGU, CGG, and CGC], leucine [CUA, CUU, CUG, and CUC], and serine [UCA, UCU, UCG, and UCC]) within entire genes. Generally, the midpoint is 0.5 (x = 0.5 and y = 0.5, i.e., A=U and G=C). A point falling on the midpoint indicates an equal influence of mutation pressure and natural selection (Sueoka, 1995; Sueoka, 1999b).
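The PR2 coordinates for a single coding sequence can be sketched as below. The sketch assumes the standard genetic code, in which the eight 4-fold degenerate families listed above are identified by their first two bases; the names `FOURFOLD_PREFIXES` and `pr2_point` and the toy sequence are hypothetical.

```python
from collections import Counter

# First two bases of the 4-fold degenerate families:
# Ala (GC), Gly (GG), Pro (CC), Thr (AC), Val (GU), Arg (CG), Leu (CU), Ser (UC).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GU", "CG", "CU", "UC"}


def pr2_point(cds):
    """Return (GC bias, AU bias) = (G3/(G3+C3), A3/(A3+U3)) over 4-fold codons."""
    cds = cds.upper().replace("T", "U")
    third = Counter()
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGU":
            third[codon[2]] += 1
    gc_total = third["G"] + third["C"]
    au_total = third["A"] + third["U"]
    gc_bias = third["G"] / gc_total if gc_total else float("nan")
    au_bias = third["A"] / au_total if au_total else float("nan")
    return gc_bias, au_bias


# Hypothetical input: codons GCA, GCA, GCU, GGG, GGC.
print(pr2_point("GCAGCAGCUGGGGGC"))
```

A point at (0.5, 0.5) corresponds to A = U and G = C at the third position of these codons; displacement from the center along either axis shows which base of the pair is favored.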

Correspondence Analysis of Codon Usage

Correspondence analysis (COA) is a widely used multivariate statistical method for detecting relationships between variables and samples (Yao et al., 2020; Grantham et al., 1980). Here, each coding sequence of GoCV was represented as a 59-dimensional vector, with each dimension corresponding to the RSCU value of one codon (excluding ATG, TGG, and the stop codons). The COA results directly reflect trends in codon usage variation among strains. The COA was carried out using CodonW 1.4.2 based on the RSCU values.

Statistical Analysis

In this study, the sequence information of the GoCV strains was statistically sorted and analyzed in Microsoft Excel, and DNAstar software was used to organize the coding region sequences of the GoCV strains. The statistical analyses were conducted using the GraphPad Prism 9 and OriginPro software. A Kolmogorov‒Smirnov test was carried out for the E-CAI. Correlation analyses were performed using Spearman's rank correlation test. P-values of < 0.05 were considered statistically significant.

RESULTS

Characteristics of the Nucleotide Composition Within the GoCV Coding Region

The composition of the coding region in GoCV was examined to assess how compositional constraints might influence codon usage bias. Analysis revealed the percentages of A, C, G, and U to be 29.46 ± 0.36 (mean ± SD), 23.17 ± 0.47, 25.75 ± 0.28, and 21.61 ± 0.45, respectively. Additionally, the mean GC content was found to be 49.01 with an SD of 0.58, indicating nearly equal proportions of GC and AU. Further examination showed average GC12 and GC3 contents of 48.25 ± 0.26 and 50.27 ± 1.52, respectively. The nucleotide composition at the third position of synonymous codons (A3s, C3s, G3s, and U3s) was also determined, with A3s being the most frequent (38.54 ± 1.23), followed by C3s (31.85 ± 1.60), G3s (29.04 ± 1.01), and U3s (26.30 ± 1.60). Similar trends were observed in Cap coding sequences (A3s > C3s > G3s > U3s), while Rep coding sequences displayed different trends (U3s > G3s > C3s > A3s) (Table S1).

Codon Usage Pattern of GoCV Coding Region

ENC values are used as a quantitative measure of the level of CUB within genes; a higher ENC value corresponds to a lower degree of CUB. Across the 144 GoCV strains, the mean ENC value for the complete coding sequences was 59.20 ± 1.05 (mean ± SD). The mean ENC values for the Rep and Cap coding regions were 58.22 ± 1.57 and 44.67 ± 1.97, respectively. This disparity suggests that the Cap region displays a more pronounced CUB than the Rep region. In summary, the high ENC value of the complete coding sequences (>45) indicates a low CUB across the GoCV genome.

Analysis of Codon Adaptation to the Host

The codon adaptation index (CAI) indicates the degree of host adaptation exhibited by a gene. For the GoCV isolates, the mean CAI values with reference to A. cygnoides and A. anser were 0.687 ± 0.007 and 0.665 ± 0.005, respectively (Table S2). To mitigate the potential impact of extreme G+C content and/or amino acid composition and to minimize compositional biases, Puigbo et al. devised an algorithm termed expected CAI (E-CAI) (Puigbo et al., 2008b). Accordingly, the E-CAI and the ratio of CAI to E-CAI (the normalized CAI) for GoCV with respect to these hosts were computed (Table S2). The normalized CAI values with respect to A. cygnoides and A. anser were 1.042 and 1.015, respectively. Moreover, the normalized CAI values for Rep and Cap also exceeded 1 (Table S2), indicating that the influence of natural selection surpasses that of mutational bias. This selection diminishes antagonism with the host and enhances the translation efficiency of Rep and Cap, thereby optimizing their codon adaptation to their hosts, viz., A. cygnoides and A. anser.

Comparative Examination of Codon Usage Patterns Between GoCV and its Host

To examine the codon usage pattern of GoCV in more detail, RSCU values were calculated for 59 codons corresponding to 18 amino acids (excluding Met, Trp, and stop codons). The results indicate a prevalence of A/U-ended codons in the complete coding sequences of GoCV, with 10 out of 18 preferred codons ending with A/U, while the remaining 8 end with G/C (Table 1). Moreover, the Cap and Rep genes displayed distinct patterns, with 14 over-represented codons (mean RSCU value > 1.6) in the Cap gene and 4 in the Rep gene. Conversely, 24 codons were under-represented (mean RSCU value < 0.6) in the Cap gene and 8 in the Rep gene. Notably, only 3 codons in the complete coding sequences were over-represented, while 4 codons were under-represented (Table 1). These observations suggest a more pronounced codon usage bias in the Cap gene compared to the complete coding sequences of GoCV.

Table 1.

The relative synonymous codon usage (RSCU) analysis of GoCV and its hosts.

AA(1) Codon RSCU(2): Complete(3) Rep(4) Cap(5) A. cygnoides A. anser C. coscoroba A. albifrons A. fabalis A. platyrhynchos
Phe UUU 0.75 1.11 0.37 1.07 0.81 0.27 0.46 0.16 1.02
UUC 1.25(7) 0.89 1.63 0.93 1.19 1.73 1.54 1.84 0.98
Leu UUA 0.66 0.96 0.06 0.64 0.39 0.03 0.28 0.21 0.6
UUG 0.57 0.62 0.48 0.95 0.74 0.39 0.55 0.14 0.9
CUU 0.89 1.1 0.49 1.01 0.75 0.36 0.47 0.14 0.95
CUC 0.65 0.71 0.55 0.94 1 1.71 1.9 2.41 0.98
CUA 1.06 0.07 3.01 0.52 0.42 0.17 0.31 0.14 0.49
CUG(6) 2.17 2.55 1.41 1.95 2.7 3.34 2.5 2.97 2.09
Ile AUU 0.87 1.13 0.67 1.18 0.97 0.17 0.65 0.23 1.16
AUC 0.91 0.66 1.11 1.11 1.51 2.62 2 2.31 1.17
AUA 1.22 1.21 1.23 0.71 0.52 0.21 0.35 0.46 0.67
Val GUU 0.94 1.39 0.21 1.01 0.68 0.35 0.56 0.18 0.98
GUC 0.82 0.85 0.76 0.82 1.08 1.4 1.12 1.24 0.83
GUA 1.23 0.46 2.51 0.66 0.41 0 0.41 0.18 0.63
GUG 1.01 1.3 0.52 1.5 1.84 2.26 1.92 2.4 1.56
Ser UCU 1.01 2.01 0.14 1.39 1.31 0.92 1.07 1.33 1.34
UCC 1.09 1.54 0.7 1.12 1.47 2.4 1.6 1.71 1.17
UCA 1.13 0.11 2.01 1.15 0.8 0.37 0.93 0.57 1.12
UCG 0.38 0.48 0.29 0.33 0.42 0.31 0.39 0.38 0.37
AGU 0.7 0.76 0.64 0.87 0.64 0.33 0.54 0.13 0.83
AGC 1.69 1.09 2.21 1.13 1.36 1.67 1.46 1.88 1.17
Pro CCU 0.93 1.46 0.46 1.32 1 1.02 1.05 0.56 1.23
CCC 0.82 0.91 0.74 0.94 1.53 2.41 1.45 2.56 1.04
CCA 1.14 0.5 1.7 1.35 1.05 0.38 1.05 0.67 1.27
CCG 1.12 1.13 1.1 0.4 0.43 0.19 0.44 0.22 0.47
Thr ACU 0.53 0.97 0.29 1.14 0.88 0.69 0.88 0.48 1.1
ACC 1.52 1.57 1.5 1.01 1.72 2.67 1.83 2.79 1.05
ACA 1.41 0.67 1.83 1.38 1 0.37 0.75 0.36 1.35
ACG 0.53 0.79 0.39 0.47 0.4 0.27 0.55 0.36 0.49
Ala GCU 0.76 1.09 0.26 1.34 1.25 1.06 1.07 0.98 1.25
GCC 0.63 0.75 0.45 1.04 1.57 2.33 2.07 2.24 1.14
GCA 1.52 0.74 2.71 1.3 0.72 0.18 0.59 0.34 1.22
GCG 1.08 1.41 0.58 0.31 0.46 0.43 0.27 0.44 0.39
Tyr UAU 0.92 1.1 0.68 0.93 0.78 0.3 0.48 0.15 0.89
UAC 1.08 0.9 1.32 1.07 1.22 1.7 1.52 1.85 1.11
His CAU 0.76 0.03 0.94 0.96 0.74 0.66 0.5 0.3 0.93
CAC 1.24 1.97 1.06 1.04 1.26 1.34 1.5 1.7 1.07
Gln CAA 0.93 0.01 1.31 0.67 0.56 0.2 0.56 0.48 0.64
CAG 1.07 1.99 0.69 1.33 1.44 1.8 1.44 1.52 1.36
Asn AAU 0.62 1.05 0.17 1 0.82 0.41 0.64 0.31 0.98
AAC 1.38 0.95 1.83 1 1.18 1.59 1.36 1.69 1.02
Lys AAA 1.11 0.85 1.63 1.03 0.8 0.33 0.9 0.52 1.01
AAG 0.89 1.15 0.37 0.97 1.2 1.67 1.1 1.48 0.99
Asp GAU 1.13 1.46 0.17 1.11 0.83 0.34 0.84 0.33 1.08
GAC 0.87 0.54 1.83 0.89 1.17 1.66 1.16 1.67 0.92
Glu GAA 1.18 1.06 1.59 1.05 0.89 0.6 0.94 0.94 1.01
GAG 0.82 0.94 0.41 0.95 1.11 1.4 1.06 1.06 0.99
Cys UGU 1.12 1.12 0 0.96 0.78 0.51 0.59 0.31 0.92
UGC 0.88 0.88 2 1.04 1.22 1.49 1.41 1.69 1.08
Arg CGU 0.92 1.32 0.59 0.85 1.01 2 0.69 1.2 0.76
CGC 0.69 0.74 0.64 1.06 1.29 1.18 1.93 2.4 1.17
CGA 1.22 1.11 1.31 0.96 0.58 0.12 0.49 0.2 0.86
CGG 0.78 1.19 0.44 1.13 1.12 0.71 0.88 0.2 1.22
AGA 1.37 0.63 1.97 1.11 0.94 0.35 0.92 0.67 1.09
AGG 1.02 1.01 1.03 0.89 1.06 1.65 1.08 1.33 0.91
Gly GGU 0.67 1.03 0.26 0.81 0.69 0.83 0.55 0.8 0.77
GGC 0.7 0.7 0.7 1.01 1.4 2 1.41 2.13 1.1
GGA 1.74 1.12 2.46 1.33 0.99 0.44 1.08 0.67 1.23
GGG 0.88 1.14 0.58 0.85 0.92 0.73 0.95 0.4 0.9

Anser cygnoides: A. cygnoides; Anser anser: A. anser; Coscoroba coscoroba: C. coscoroba; Anser albifrons: A. albifrons; Anser fabalis: A. fabalis; Anas platyrhynchos: A. platyrhynchos.

(1) Represents amino acid.

(2) Represents the relative synonymous codon usage.

(3) Represents the complete coding sequence of GoCV.

(4) Represents the nonstructural (Rep) protein coding sequence of GoCV.

(5) Represents the structural (Cap) protein coding sequence of GoCV.

(6) The preferentially used codons co-existing in 6 hosts are bold and underlined.

(7) The RSCU values of preferentially used codons for each amino acid are shown in italic and bold.

To explore whether the CUB of GoCV is influenced by its hosts, the codon usage patterns of GoCV were compared with those of its hosts, including A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis, and A. platyrhynchos (Table 1). The findings indicated that a substantial proportion of synonymous codons were equivalently or identically selected across different host species. Specifically, for A. cygnoides, 55 out of 59 synonymous codons showed equivalent selection, while for A. anser, 51 out of 59 codons were identically selected. Similarly, for A. albifrons and A. platyrhynchos, 46 and 55 out of 59 synonymous codons, respectively, exhibited equivalent selection. Interestingly, certain codons such as UUC for Phe, CUG for Leu, AUU for Ile, and others exhibited similarity between GoCV and its hosts, indicating potential correlations with virulence in the Anser species. Conversely, codons like CUA for Leu, CCG for Pro, and GCG for Ala exhibited significant differences between GoCV and its hosts. Furthermore, specific codons such as UCC for Ser, UUA for Leu, CCA for Pro, ACG for Thr, and others showed similarity between GoCV and certain host species including A. cygnoides, A. anser, A. albifrons, or A. fabalis.

Additionally, a similarity index analysis was performed to assess the influence of the overall codon usage pattern of the hosts (A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis, and A. platyrhynchos) on the formation and evolution of the overall codon usage in GoCV (Figure 1). The D (A, B) values of GoCV polyproteins were highest for C. coscoroba or A. fabalis vs. GoCV, followed by A. albifrons or A. anser vs. GoCV, and lowest for A. cygnoides or A. platyrhynchos vs. GoCV. This suggests that C. coscoroba and A. fabalis may exert a more significant impact on the CUB of GoCV compared to A. cygnoides or A. platyrhynchos.

Figure 1. Similarity analysis for the complete coding sequences, non-structural polyprotein (Rep) coding sequences, and structural polyprotein (Cap) coding sequences of GoCV in relation to different hosts (A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis and A. platyrhynchos).

Correlation Between Dinucleotide Composition and Codon Usage Bias in GoCV

To evaluate whether dinucleotide compositional constraints influence the codon usage pattern of GoCV, the relative abundance of 16 dinucleotides in GoCV coding regions was computed. GA, AU, and UU were relatively over-represented, while GU, GC, UA, and AG were relatively under-represented, although no mean value in Table 2 crosses the 1.23 or 0.78 thresholds; this indicates a non-random distribution of relative dinucleotide abundance in GoCV coding regions (Table 2). To further explore the impact of dinucleotide usage on codon usage bias, we compared the over-represented and under-represented dinucleotides with the preferred and under-represented codons. Analysis of RSCU demonstrated that GA-ending codons, such as AGA and GGA, were preferentially utilized in GoCV complete coding sequences. However, among the 8 GU- or GC-ending codons, 6 codons (AGU, CGU, GGU, UGC, CGC, and GGC) were not preferentially utilized (Table 1). Similar trends in dinucleotide occurrence were observed in the Rep and Cap coding regions (Table S3). Specifically, GA was over-represented, whereas CA was under-represented in the Rep coding sequence. In Cap coding sequences, over-represented dinucleotides included GG and CA, while UG was under-represented. Overall, the biased dinucleotide abundance could potentially influence the CUB of GoCV.

Table 2.

The dinucleotide composition analysis of GoCV.

Dinucleotides Range Means ± SD
AA 0.944–1.044 0.982 ± 0.021
AC 0.958–1.111 1.035 ± 0.031
AG 0.857–0.982 0.907 ± 0.021
AU 1.017–1.152 1.092 ± 0.029
CA 0.850–1.068 0.944 ± 0.041
CC 0.990–1.169 1.075 ± 0.038
CG 0.962–1.093 1.024 ± 0.026
CU 0.847–1.025 0.970 ± 0.028
GA 1.087–1.209 1.143 ± 0.028
GC 0.824–0.969 0.895 ± 0.034
GG 0.961–1.114 1.048 ± 0.041
GU 0.811–0.997 0.863 ± 0.028
UA 0.810–0.990 0.906 ± 0.028
UC 0.918–1.087 1.001 ± 0.040
UG 0.978–1.134 1.047 ± 0.034
UU 1.003–1.150 1.074 ± 0.028

Assessment of the Impact of Mutation Pressure and Natural Selection on Shaping GoCV's Codon Usage Bias

The relationship between ENC values and GC3s contents was analyzed to discern the predominant factor influencing the codon usage bias in GoCV. If the points on the plot cluster around or closely follow the expected curve, mutation pressure primarily dictates the CUB (Jenkins and Holmes, 2003). Conversely, if the points consistently fall below the expected curve, natural selection influences the CUB. In Figure 2, the majority of points, across all groups, clustered around the middle of the horizontal axis. The ENC values for both the complete coding sequences and the Rep gene lay on or slightly below the expected ENC curve (Figure 2A and 2B), indicating that their CUB is predominantly influenced by mutation pressure. However, for the Cap gene, the ENC values were consistently below the expected curve (Figure 2C), implying that natural selection plays a significant role in shaping the CUB of the Cap gene. Overall, the ENC-GC3s plots suggest that both mutational pressure and natural selection contribute to shaping the CUB of GoCV.

Figure 2. ENC-GC3s plot analysis (ENC plotted against GC3s) for the complete (A), Rep (B), and Cap (C) coding sequences of GoCV. The red solid curve plots the relationship between GC3s and ENC in the absence of selection pressure (left panel). The GoCV strains isolated from A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis and A. platyrhynchos are shown in red dot, black solid square, green solid triangle, black diamond, black inverted triangle, black circle, green solid star, black sperm respectively (right panel).

A neutrality plot analysis, commonly utilized to elucidate the impact of mutation pressure and natural selection on the CUB of genes, was conducted to ascertain the principal factor shaping the CUB of the GoCV Rep gene, Cap gene, and complete coding sequences. In the case of GoCV complete coding sequences, a significant positive correlation between GC12 and GC3 was identified, indicating that natural selection primarily influenced the CUB (Figure 3A). Similarly, for the GoCV Rep gene, a notable positive correlation between GC12 and GC3 suggested a significant role of natural selection (Figure 3B). Conversely, in the case of the GoCV Cap gene, a significant negative correlation between GC12 and GC3 indicated that natural selection was the primary determinant of the CUB (Figure 3C). Hence, natural selection emerged as the predominant force shaping the CUB of the GoCV Rep gene, Cap gene, and complete coding sequences, with mutation pressure exerting a lesser influence.

Figure 3. Neutrality plots of the complete (A), Rep (B), and Cap (C) coding sequences, with GC12 plotted against GC3. The slope value represents the ratio of mutation pressure in the total variation. GC12 refers to the GC contents at the first and second codon positions while GC3 means the GC contents at the third codon position. The GoCV strains isolated from A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis and A. platyrhynchos are shown in red dot, black solid square, green solid triangle, black diamond, black inverted triangle, black circle, green solid star, black sperm respectively.

We also conducted a PR2 bias plot analysis, an additional method to discern the influence of mutational pressure and natural selection on the CUB of genes. The disproportionate utilization of nucleobases across degenerate codon groups within a gene indicates the combined impact of mutational pressure and natural selection on CUB (Wang et al., 2022). In our analysis, we used the values of A3/(A3+U3)|4 and G3/(G3+C3)|4 as the ordinate and abscissa, respectively. As depicted in Figure 4, all sequences were situated away from the center point (0.5, 0.5), suggesting that both mutational pressure and natural selection contributed to the CUB observed in the GoCV Rep gene, Cap gene, and complete coding sequences, although their effects varied. Moreover, our analysis revealed that the nucleotide A was preferentially used over U among the 4-fold degenerate codons in complete coding sequences and the Cap gene, whereas U was favored over A in the Rep gene. These findings underscore that the determinants of CUB are multifaceted, and their impacts vary in shaping the CUB of GoCV.

Figure 4. Parity rule 2 (PR2) plot showing the AU bias [A3/(A3+U3)|4] and GC bias [G3/(G3+C3)|4] of the complete (A), Rep (B), and Cap (C) coding sequences of GoCV. The center of the plot, where the values of both coordinates are 0.5, indicates a location where there is no bias. The GoCV strains isolated from A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis and A. platyrhynchos are shown in red dot, black solid square, green solid triangle, black diamond, black inverted triangle, black circle, green solid star, black sperm respectively.

Importantly, investigating whether the codons predominantly favored by GoCV correspond to the most prevalent isoacceptor tRNAs in its hosts aids in understanding translational selection. Analysis of the most favored codons in GoCV and their compatibility with the most abundant isoacceptor tRNAs in A. cygnoides indicated that 5 out of the 18 most favored codons in GoCV perfectly matched the corresponding most prevalent isoacceptor tRNAs in A. cygnoides (Table 3).

Table 3.

tRNA count by anticodon in Anser cygnoides for most preferred codons of GoCV.

AA(a) Most preferred codon in GoCV tRNA count by anticodon in A. cygnoides (anticodon:count) Total
Phe UUC AAA:0 GAA:10 10
Leu CUG AAG:6 GAG:0 CAG:7 UAG:2 CAA:3 UAA:2 20
Ile AUA AAU:8 GAU:0 CAU:0 UAU:1 9
Val GUA AAC:6 GAC:1 CAC:9 UAC:3 19
Ser AGC AGA:7 GGA:0 CGA:0 UGA:4 ACU:0 GCU:7 18
Pro CCA AGG:5 GGG:0 CGG:3 UGG:2 10
Thr ACC AGU:6 GGU:0 CGU:2 UGU:4 12
Ala GCA AGC:10 GGC:0 CGC:3 UGC:2 15
Tyr UAC AUA:0 GUA:11 11
His CAC AUG:0 GUG:9 9
Gln CAG CUG:3 UUG:4 7
Asn AAC AUU:0 GUU:14 14
Lys AAA CUU:8 UUU:5 13
Asp GAU AUC:0 GUC:8 8
Glu GAA CUC:8 UUC:8 16
Cys UGU ACA:0 GCA:10 10
Arg AGA ACG:9 GCG:0 CCG:2 UCG:2 CCU:4 UCU:3 20
Gly GGA ACC:0 GCC:13 CCC:5 UCC:7 25
Trp UGG CCA:5 5
Met AUG CAU:12 12
(a) AA represents amino acid. The preferred tRNAs used by both the host and GoCV are shown in italic and bold. Non-optimal codon-anticodon base pairs used in GoCV are double underlined.

Correspondence Analysis

To explore variations in synonymous codon usage among coding sequences of GoCV strains isolated from different hosts, COA was conducted individually on the Rep gene, Cap gene, and complete coding sequences, using RSCU values. The coordinates of all protein-coding regions of GoCV along the first axis (Axis 1) and second axis (Axis 2) were plotted. As depicted in Figure 5, for complete coding sequences, Axis 1 and Axis 2 accounted for 23.92% and 16.11% of the total variation of GoCV, respectively. Similarly, for the Rep gene, the values of the first 2 axes were 26.96% and 15.84%, while for the Cap gene, they were 25.78% and 21.72%, respectively. Additionally, in the correspondence analysis, the points representing the Cap gene exhibited a broader distribution along Axis 1 compared to those of the Rep gene, indicating greater variability in the Cap gene. This observation implies a potential association between the variability of the Cap gene, which encodes the capsid protein that interacts directly with the host, and the virus's adaptation to various hosts.

Figure 5. Correspondence analysis (COA) based on RSCU in the complete (A), Rep (B), and Cap (C) coding sequences of GoCV. The first axis accounts for 23.93, 26.96, 25.78% of total variation, and the second axis accounts for 16.11, 15.84, 21.72% of total variation for the complete, Rep and Cap coding sequences, respectively.

Correlation Analysis of Base Composition of GoCV Coding Regions

To elucidate the influence of mutational pressure on GoCV's codon usage, we analyzed the correlations among nucleotide compositions (A, U, G, C, and GC), third codon position compositions (A3s, U3s, G3s, C3s, and GC3s), and ENC values. As shown in Figure 6, most third codon position compositions showed significant correlations with overall nucleotide compositions in GoCV's complete coding sequences. Specifically, A3s content exhibited a significant negative correlation with C, G, GC, and ENC, but a positive correlation with A. Similarly, U3s content had a significant negative correlation with C, G, GC, and ENC, yet a positive correlation with U. C3s content showed a significant negative correlation with A, U, and ENC, but a positive correlation with C and GC. G3s and GC3s content displayed a significant negative correlation with A, U, and ENC, yet a positive correlation with C, G, and GC. These trends were consistent in the Rep and Cap genes (Figures S1 and S2), indicating a common relationship. Overall, these results indicate that GoCV's codon usage bias was influenced by nucleotide composition, supporting a role for mutational pressure. Correlation analysis further revealed significant negative correlations between Axis 1 and GC3s, GC, C, G, C3s, and G3s, and positive correlations with ENC, A, U, A3s, and U3s in GoCV's complete coding sequences. Axis 2 displayed significant negative correlations with GC, ENC, and G, yet positive correlations with A and A3s (Figure 6). In Rep, Axis 1 correlated significantly with all other indices except ENC, while Axis 2 correlated significantly with all other indices except U, U3s, and C3s (Figure S1). Similarly, in Cap, both Axis 1 and Axis 2 showed significant correlations with all other indices except ENC and A (Figure S2). Moreover, of the 16 dinucleotides, 14 in the complete coding sequences were significantly correlated with Axis 1 in COA, the exceptions being GA and UU (Table 4). Likewise, 12 dinucleotides in Rep were significantly correlated with Axis 1 (all except AC, CA, GU, and UU), while 13 dinucleotides in Cap were significantly correlated with Axis 1 (all except AU, GU, and UA) (Table S4). Therefore, these data indicate that mutation pressure from the base composition affects the CUB of GoCV.
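The pairwise correlations described above can be sketched as follows. This excerpt does not state whether Pearson or Spearman coefficients were used, so the example assumes Pearson's r; the function names and the toy input vectors are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical per-strain values, e.g. A3s content against ENC.
a3s = [0.30, 0.32, 0.35, 0.31, 0.36]
enc = [57.1, 56.4, 55.0, 56.9, 54.8]
print(round(pearson_r(a3s, enc), 3))
```

With 144 strains, even a modest coefficient yields a large t statistic, which is why weak correlations such as r = 0.169 in Table 4 can still reach P < 0.05.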

Figure 6.

Correlation analysis among different indices of GoCV. Dark blue indicates a negative correlation, and dark red indicates a positive correlation; a higher value indicates a more significant correlation. Notably, “*” and “**” denote significance levels of p ≤ 0.05 and p ≤ 0.01, respectively.

Table 4.

Correlation analysis between 16 dinucleotides and axis in GoCV.

Dinucleotide: AA AC AG AU CA CC CG CU GA GC GG GU UA UC UG UU
Axis 1, r: 0.728¹ -0.602¹ 0.223¹ -0.284¹ -0.769¹ 0.676¹ 0.242¹ 0.287¹ 0.071 0.501¹ -0.645¹ 0.169² 0.472¹ -0.655¹ 0.528¹ -0.154
Axis 1, P: 0.000 0.000 0.007 0.001 0.000 0.000 0.003 0.000 0.397 0.000 0.000 0.042 0.000 0.000 0.000 0.065

¹ Represents P-value < 0.01.

² Represents 0.01 < P-value < 0.05.

Other Factors Potentially Affecting the Codon Usage Bias of GoCV

To further explore the potential influence of natural selection on the codon usage of GoCV, correlation analysis was conducted between amino acid properties (Aroma and Gravy) and codon bias indices (Axis 1, Axis 2, ENC, GC, and GC3s) (Figure 6). Aroma correlated positively with Axis 1 and ENC, but negatively with GC and GC3s. Gravy, in contrast, correlated negatively with Axis 1 and Axis 2, but positively with GC (Figure 6). For the Cap gene, Gravy correlated significantly only with GC3s, GC, and ENC. For the Rep gene, Gravy correlated significantly with all codon bias indices (Axis 1, Axis 2, ENC, GC, and GC3s), while Aroma correlated significantly with Axis 1, Axis 2, GC3s, and GC (Figures S1 and S2). Overall, these results suggest that the aromaticity and hydrophobicity of amino acids contribute to the codon usage bias of GoCV, indicating the involvement of translational selection in shaping the codon usage pattern.
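For readers unfamiliar with the two indices, Gravy is conventionally the mean Kyte-Doolittle hydropathy of a protein and Aroma is the fraction of aromatic residues (Phe, Tyr, Trp). The following is a minimal sketch under that conventional definition; the function names are illustrative, and the exact software settings used in the study are not given in this excerpt.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FYW")


def _residues(protein):
    """Keep only standard residues; stop symbols and unknowns are dropped."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return residues


def gravy(protein):
    """Grand average of hydropathy: mean Kyte-Doolittle score per residue."""
    residues = _residues(protein)
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aroma(protein):
    """Aromaticity: fraction of residues that are Phe, Tyr or Trp."""
    residues = _residues(protein)
    return sum(aa in AROMATIC for aa in residues) / len(residues)


print(gravy("IR"), aroma("FWYA"))  # 0.0 0.75
```

Positive Gravy values indicate a more hydrophobic protein on average, and negative values a more hydrophilic one.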

Finally, to explore the potential impact of geographic factors on the codon usage patterns of the Rep gene, Cap gene, and complete coding sequences, a COA based on geographical origin was conducted. As illustrated in Figure 7, distinct divergence among sequences isolated from various geographical regions was observed, particularly between sequences originating from China-Mainland and Poland, for both the complete coding sequences and the Rep gene. Intriguingly, sequences from the same geographic area did not cluster together, with notable dispersion observed among China-Mainland sequences, suggesting significant variability in codon usage among strains even within the same geographic region. These findings imply that geographic diversity may influence the codon usage patterns of GoCV.

Figure 7.

COA of the complete (A), Rep (B), and Cap (C) coding sequences of GoCV against geographical distribution. Different geographical distributions are represented by different colors and shapes.

DISCUSSION

GoCV, a circovirus found in waterfowl, is known to cause immunosuppression in domestic geese, making infected birds more vulnerable to secondary infections from various pathogens (Yang et al., 2020). Despite prior research on GoCV variability and evolution (Stenzel et al., 2018), significant knowledge gaps persist, warranting a comprehensive analysis. Understanding codon usage bias is crucial not only for designing vaccines but also for comprehending cross-species transmission. Because vaccine development for other circoviruses has largely focused on the capsid protein (Cap) (Cao et al., 2023; Kim et al., 2023; Neef et al., 2024), the GoCV Cap protein could also be a promising candidate for vaccine development. To date, no GoCV vaccines are commercially available. The CUB of GoCV's capsid protein could provide a valuable theoretical basis for the development of a GoCV vaccine. However, the CUB of the GoCV Cap protein during evolution remains unexplored. Thus, this study aimed to conduct a thorough analysis of GoCV evolution, considering codon usage and host adaptation for the first time, to provide valuable insights for controlling cross-species transmission and developing antiviral treatments.

Previous studies have highlighted compositional constraints as a major factor influencing codon usage patterns. In our research, we observed higher A and G contents compared to C and U contents, consistent with findings in porcine circovirus 3 (PCV3) (Li et al., 2018). Similarly, analysis of RSCU values revealed a preference for A/G-ended codons. Moreover, the first principal axis (Axis 1) showed significant correlations with the percentages of A, U, C, and G, aligning with previous reports suggesting that compositional constraints depend not only on C + G contents but also on A and/or U contents (Xu et al., 2015; Li et al., 2018). Additionally, each nucleotide content showed significant correlations with nucleotide contents at the third codon position, indicating the influence of compositional constraints on GoCV codon usage patterns.

Nucleotide composition analysis of the complete coding region of GoCV genomes revealed A as the most frequent mononucleotide, consistent with the A-rich genomes observed in the majority of circular single-stranded DNA viruses (Li et al., 2018; Feng et al., 2022). Furthermore, the prevalence of A-ended codons among preferentially used codons suggested the existence of codon bias in GoCV. However, despite this bias, we detected a high ENC value, indicating low CUB. Similarly low CUB has been reported in other circoviruses, such as duck circovirus (ENC ranging from 58.12 to 60.98) (Xu et al., 2015), PCV1 (56.53), PCV2 (54.42), PCV3 (55.57), and PCV4 (54.48) (Feng et al., 2022). Low CUB is thought to help a virus overcome host defenses and to facilitate viral survival and replication within the host (Tsai et al., 2007; Zhou et al., 2023). The low CUB observed in GoCV may enhance its adaptive fitness to the natural host (Anser sp.) through a complex adaptive evolution process, potentially leading to further global transmission.

RSCU analysis revealed some similarities in codon choice between GoCV and its hosts, including A. cygnoides, A. anser, C. coscoroba, A. albifrons, A. fabalis, and A. platyrhynchos. This phenomenon could represent an adaptive mechanism used by GoCV to resist host immunity during long-term co-evolution. Combined with the A. cygnoides tRNA pool analysis, the results showed that, of the 18 amino acids, all except Phe, Leu, Tyr, His, and Asn used non-optimal codon-anticodon base pairs. A similar pattern of suboptimal tRNA isotype usage has been reported for SARS-CoV-2 (Roy et al., 2021) and Nipah virus (Khandia et al., 2019). Additionally, analysis of suboptimal codon-anticodon base pair usage suggested the use of suboptimal isoacceptor host tRNAs during the initial phase of infection, facilitating slow but precise translation, which yields accurate and properly folded viral proteins (Khandia et al., 2019).

While RSCU analysis provides insights into codon usage patterns, it has limitations in revealing the forces that affect overall codon usage (Butt et al., 2014). Therefore, we further analyzed dinucleotide usage in all GoCV coding sequences. Dinucleotide usage diverged between the Rep and Cap genes of GoCV, with some dinucleotides being either overrepresented or underrepresented. No dinucleotide was over- or underrepresented in the GoCV complete coding sequence. However, in the Rep gene, dinucleotides CC and GA were overrepresented, while dinucleotide CA was underrepresented. In the Cap gene, dinucleotide GG was overrepresented. The abundant CC dinucleotide in the Rep gene has a higher thermodynamic stacking energy than G/C-free dinucleotides, which may lower transcription and replication efficiency, consistent with a previous finding on atypical porcine pestivirus (Pan et al., 2020). Indeed, GoCV is not a lethal pathogen but exhibits subclinical features (Guo et al., 2011), which may be related to the low replication efficiency caused by the high content of dinucleotide CC. These findings suggest that biased dinucleotide abundance could influence GoCV's CUB.
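Dinucleotide over- and underrepresentation is usually judged from the odds ratio ρxy = fxy / (fx · fy), where fxy is the observed dinucleotide frequency and fx and fy are the frequencies of its two bases (Karlin and Burge, 1995). The sketch below assumes the conventional cut-offs of ρxy > 1.23 for overrepresentation and ρxy < 0.78 for underrepresentation; these thresholds and the function names are assumptions, since the study's own settings are not given in this excerpt.

```python
from collections import Counter


def dinucleotide_odds(seq):
    """Return {dinucleotide: odds ratio} for a sequence of A, C, G, U/T."""
    seq = seq.upper().replace("T", "U")
    if len(seq) < 2 or set(seq) - set("ACGU"):
        raise ValueError("sequence must contain at least two unambiguous bases")
    n = len(seq)
    mono = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    di = Counter(seq[i:i + 2] for i in range(n - 1))

    odds = {}
    for x in "ACGU":
        for y in "ACGU":
            fx, fy = mono[x] / n, mono[y] / n
            if fx == 0 or fy == 0:
                continue  # ratio undefined when a base is absent
            odds[x + y] = (di[x + y] / (n - 1)) / (fx * fy)
    return odds


def classify(rho, low=0.78, high=1.23):
    """Label an odds ratio using the assumed conventional cut-offs."""
    if rho > high:
        return "overrepresented"
    if rho < low:
        return "underrepresented"
    return "within expected range"


odds = dinucleotide_odds("ACAC")
print(round(odds["AC"], 3), classify(odds["AC"]))  # 2.667 overrepresented
```

On real coding sequences the ratio would be computed per gene (Rep, Cap, and the complete coding sequence) and averaged or compared across strains.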

To determine the key factors influencing GoCV codon usage bias, we used multiple methods, including the ENC-plot, neutrality plot, and Parity Rule 2 analysis. The results suggested that codon usage was moderately biased, with natural selection and mutation pressure playing unequal roles in shaping the codon usage pattern of GoCV. Natural selection appeared to have a dominant effect compared to mutational pressure, similar to other viruses such as porcine circovirus 2 (PCV2) (He et al., 2019), PCV3 (Yu et al., 2021), and human bocavirus (Hussain et al., 2019). However, in some DNA viruses, mutation pressure has been shown to play the dominant role in shaping the codon usage pattern (Dave et al., 2019; Zhou et al., 2023). Selection pressure in virus evolution refers to the various factors and conditions, including the host immune response, the use of antivirals and vaccines, and environmental factors, that drive changes in viral populations over time. These pressures can influence the survival and replication of viral strains, with some variants becoming more successful owing to their specific adaptations. Selection pressure drives the process of natural selection, which leads to the emergence of new viral strains with improved fitness in the given environment (Day et al., 2022; Guo et al., 2023). Therefore, the selective pressure on GoCV warrants attention, and further monitoring of the virus is needed to assess its evolutionary status. In fact, viruses with a high degree of CUB shaped by natural selection may exhibit reduced susceptibility to antiviral drugs targeting specific codons or viral proteins (Villanueva et al., 2016; Franzo et al., 2021). Understanding the underlying mechanisms of CUB can aid in the development of more effective antiviral strategies.
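Two of the diagnostics named above reduce to short formulas. In an ENC-plot, observed ENC values are compared with the curve expected if GC3s alone determined codon usage (Wright, 1990); in a neutrality plot, the slope of GC12 regressed on GC3 is commonly read as the relative weight of mutation pressure. The sketch below assumes these standard definitions; the function names and sample values are hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENC when only GC3s (a fraction in [0, 1]) shapes codon usage."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("GC3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def regression_slope(x, y):
    """Least-squares slope of y on x, e.g. GC12 (y) regressed on GC3 (x)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


print(expected_enc(0.5))  # 60.5

# Hypothetical per-strain GC3 and GC12 fractions for a neutrality plot.
gc3 = [0.48, 0.50, 0.52, 0.55]
gc12 = [0.470, 0.472, 0.475, 0.478]
print(round(regression_slope(gc3, gc12), 3))
```

Points falling below the expected curve, or a neutrality slope close to 0, are usually taken as evidence that selection rather than mutation pressure dominates, whereas a slope close to 1 points to mutation pressure.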

In a comparative examination of codon usage patterns between GoCV and its hosts, a significant proportion of synonymous codons exhibited similar selection patterns across different host species. Specifically, in A. cygnoides, A. anser, A. albifrons, and A. platyrhynchos, a considerable number of synonymous codons displayed equivalent selection. Notably, certain codons, such as UUC for Phe and CUG for Leu, showed similarities between GoCV and its hosts, potentially indicating associations with virulence in Anser species. Conversely, codons like CUA for Leu and CCG for Pro exhibited notable differences between GoCV and its hosts. Additionally, specific codons such as UCC and UUA for Ser exhibited similarity between GoCV and certain host species, including A. cygnoides, A. anser, A. albifrons, or A. fabalis. Therefore, it would be intriguing to investigate whether these shared codons are associated with the virulence of GoCV in Anser species.

Correlation analysis between amino acid properties (aromaticity and hydrophobicity) and codon bias indices supported the involvement of natural selection in CUB. Notable correlations between ENC and hydrophobicity indicated that natural selection contributes to the CUB of GoCV. Aromaticity and hydrophobicity have also been found to be related to CUB in other viruses, such as Zika virus (Wang et al., 2016), porcine circovirus (Li et al., 2018), and African swine fever virus (Wang et al., 2024), suggesting potential roles in viral capsid assembly (Kegel and Schoot, 2004). Therefore, additional investigation into the impact of mutations on hydrophobic and aromatic amino acid residues of GoCV, and their influence on capsid assembly, is warranted in future studies.

Geographical distribution and evolutionary divergence have been found to influence viral CUB (Wang et al., 2020; Si et al., 2021). Our analysis revealed that GoCV isolates from different geographical regions were distributed separately, with significant divergence observed, especially among sequences from China-Mainland, China-Taiwan, and Poland. These findings may underscore the importance of geography as a determinant of CUB in GoCV, particularly as animal activities expand globally. However, most GoCV sequences available in the NCBI database are from China (n = 106), with only limited representation from other countries such as Poland (n = 34), Hungary (n = 3), and Germany (n = 1), which may raise concerns about the generalizability of the findings. Of note, studies show a distinct genetic variance between European geese and Chinese geese, which may also contribute to the differences in CUB (Shi et al., 2006; Wang et al., 2017). More importantly, recombinant GoCV is circulating in domesticated and wild geese in Poland (Stenzel et al., 2018), which warrants a comprehensive study of geographical distribution and evolutionary divergence in the future. From a scientific standpoint, the geographic distribution of virus samples is crucial due to potential regional variations in virus strains, host populations, and environmental factors. Viruses can exhibit genetic diversity due to factors such as host immunity, host range, viral evolution, and transmission dynamics. Therefore, a dataset that primarily consists of sequences from one region, such as China, may not capture the full spectrum of genetic diversity present in GoCV populations worldwide. Unfortunately, no GoCV data representing the global population are available in public databases; this is a limitation of the study.

CONCLUSIONS

In conclusion, our study revealed that the overall codon usage of the GoCV strains was relatively similar and exhibited a slight bias, with natural selection playing a major role in shaping its codon usage pattern. Additionally, parameters like Aroma and Gravy, which respectively measure the presence of aromatic amino acids and the overall hydrophobicity of a protein sequence, may also contribute to codon usage bias in GoCV. Importantly, our analysis indicated that the codon usage pattern of GoCV shares similarities with that of its host, suggesting potential co-evolution between GoCV and its host. This underscores the importance of enhancing global surveillance of GoCV's CUB. Overall, our findings contribute to a better understanding of GoCV evolution and provide valuable insights for the prevention and treatment of GoCV infections, including the design of drugs and vaccines.

ACKNOWLEDGMENTS

This study was funded by grants from the Natural Science Foundation of Fujian (grant no. 2023J05081, 2020J06029), Fujian Province Science and Technology Plan Project (grant no. 2023Y4014), the Program of Fujian Provincial Key Laboratory for Avian Diseases Control and Prevention (grant no. FKADL-2022-02), the National Key Research and Development Program of China (grant no. 2023YFD1800603), the Research and Technology Program of Fujian Academy of Agricultural Sciences (grant no. YCZX202412, DWHZ2024-13, CXTD2021005), and the Start-up Fund for new teachers in Fujian Police College. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Ethical Approval and Consent to Participate: This article does not contain any studies with human participants or animals performed by any of the authors.

Consent for Publication: The authors consent for publication.

Authors Contributions: Quanming Xu: Conceptualization, Formal analysis, Data curation, Writing-original draft, reviewed and edited the paper. Jie Cao: Formal analysis, Data curation. Kul Raj Rai and Binling Zhu: Conceptualization, Methodology, Writing-review & editing. Dan Liu and Chunhe Wan: Conceptualization, Writing-review & editing, conceived and designed the experiments, reviewed and edited the paper. All the authors have read and approved the final manuscript.

Data Availability: The datasets supporting the conclusions of this article are included within the article. All datasets are available from the corresponding author upon reasonable request.

DISCLOSURES

The authors declare no conflicts of interest.

Footnotes

Supplementary material associated with this article can be found in the online version at doi:10.1016/j.psj.2024.103775.

Appendix. Supplementary materials

mmc1.pptx (331.1KB, pptx)
mmc2.pptx (290.8KB, pptx)
mmc3.xlsx (69.5KB, xlsx)
mmc4.xlsx (11.4KB, xlsx)
mmc5.xlsx (11.1KB, xlsx)
mmc6.xlsx (11.8KB, xlsx)

REFERENCES

  1. Ball N.W., Smyth J.A., Weston J.H., Borghmans B.J., Palya V., Glavits R., Ivanics E., Dan A., Todd D. Diagnosis of goose circovirus infection in Hungarian geese samples using polymerase chain reaction and dot blot hybridization tests. Avian Pathol. 2004;33:51–58. doi: 10.1080/03079450310001610613.
  2. Bera B.C., Virmani N., Kumar N., Anand T., Pavulraj S., Rash A., Elton D., Rash N., Bhatia S., Sood R., Singh R.K., Tripathi B.N. Genetic and codon usage bias analyses of polymerase genes of equine influenza virus and its relation to evolution. BMC Genomics. 2017;18:652. doi: 10.1186/s12864-017-4063-1.
  3. Butt A.M., Nasrullah I., Tong Y. Genome-wide analysis of codon usage and influencing factors in chikungunya viruses. PLoS One. 2014;9:e90905. doi: 10.1371/journal.pone.0090905.
  4. Cao X., Huang M., Wang Y., Chen Y., Yang H., Quan F. Immunogenicity analysis of PCV3 recombinant capsid protein virus-like particles and their application in antibodies detection. Int. J. Mol. Sci. 2023;24:10377. doi: 10.3390/ijms241210377.
  5. Chen C.L., Chang P.C., Lee M.S., Shien J.H., Ou S.J., Shieh H.K. Nucleotide sequences of goose circovirus isolated in Taiwan. Avian Pathol. 2003;32:165–171. doi: 10.1080/0307945021000071614.
  6. Chen C.L., Wang P.X., Lee M.S., Shien J.H., Shien H.K., Ou S.J., Chen C.H., Chang P.C. Development of a polymerase chain reaction procedure for detection and differentiation of duck and goose circovirus. Avian Dis. 2006;50:92–95. doi: 10.1637/7435-090705R1.1.
  7. Costafreda M.I., Perez-Rodriguez F.J., D’Andrea L., Guix S., Ribes E., Bosch A., Pinto R.M. Hepatitis A virus adaptation to cellular shutoff is driven by dynamic adjustments of codon usage and results in the selection of populations with altered capsids. J Virol. 2014;88:5029–5041. doi: 10.1128/JVI.00087-14.
  8. Cui X., Zhu Y., Wu Q., He D., Mao M., Wei F., Wu B., Zhu S., Cui Y., Han Q., Wang D., Wu M., Zhao Y., Ren H., Wei X., Zhang M., Tang Y., Diao Y. Pathogenicity of duck circovirus 1 in experimentally infected specific pathogen-free ducks. Poult Sci. 2023;103:103301. doi: 10.1016/j.psj.2023.103301.
  9. Das J.K., Roy S. Comparative analysis of human coronaviruses focusing on nucleotide variability and synonymous codon usage patterns. Genomics. 2021;113:2177–2188. doi: 10.1016/j.ygeno.2021.05.008.
  10. Dave U., Srivathsan A., Kumar S. Analysis of codon usage pattern in the viral proteins of chicken anaemia virus and its possible biological relevance. Infect. Genet. Evol. 2019;69:93–106. doi: 10.1016/j.meegid.2019.01.002.
  11. Day T., Kennedy D.A., Read A.F., Gandon S. Pathogen evolution during vaccination campaigns. PLoS Biol. 2022;20 doi: 10.1371/journal.pbio.3001804.
  12. Deb B., Uddin A., Chakraborty S. Analysis of codon usage of Horseshoe Bat Hepatitis B virus and its host. Virology. 2021;561:69–79. doi: 10.1016/j.virol.2021.05.008.
  13. Decaro N., Martella V., Desario C., Lanave G., Circella E., Cavalli A., Elia G., Camero M., Buonavoglia C. Genomic characterization of a circovirus associated with fatal hemorrhagic enteritis in dog, Italy. PLoS One. 2014;9 doi: 10.1371/journal.pone.0105909.
  14. Feher E., Kaszab E., Bali K., Hoitsy M., Sos E., Banyai K. Novel circoviruses from birds share common evolutionary roots with fish origin circoviruses. Life (Basel) 2022;12:368. doi: 10.3390/life12030368.
  15. Feng H., Segales J., Wang F., Jin Q., Wang A., Zhang G., Franzo G. Comprehensive analysis of codon usage patterns in chinese porcine circoviruses based on their major protein-coding sequences. Viruses. 2022;14:81. doi: 10.3390/v14010081.
  16. Franzo G., Tucciarone C.M., Legnardi M., Cecchinato M. Effect of genome composition and codon bias on infectious bronchitis virus evolution and adaptation to target tissues. BMC Genomics. 2021;22:244. doi: 10.1186/s12864-021-07559-5.
  17. Grantham R., Gautier C., Gouy M. Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Res. 1980;8:1893–1912. doi: 10.1093/nar/8.9.1893.
  18. Grantham R., Gautier C., Gouy M., et al. Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 1980;8:r49–r62. doi: 10.1093/nar/8.1.197-c.
  19. Guo J., Tian J., Tan X., Yu H., Ding S., Sun H., Yu X. Pathological observations of an experimental infection of geese with goose circovirus. Avian Pathol. 2011;40:55–61. doi: 10.1080/03079457.2010.538371.
  20. Guo X., Zhang Y., Pan Y., Yang K., Tong X., Wang Y. Phylogenetic analysis and codon usage bias reveal the base of feline and canine chaphamaparvovirus for cross-species transmission. Animals (Basel) 2023;13:2617. doi: 10.3390/ani13162617.
  21. He W., Wang N., Tan J., et al. Comprehensive codon usage analysis of porcine deltacoronavirus. Mol. Phylogenet. Evol. 2019;141 doi: 10.1016/j.ympev.2019.106618.
  22. He W., Zhao J., Xing G., et al. Genetic analysis and evolutionary changes of Porcine circovirus 2. Mol. Phylogenet. Evol. 2019;139 doi: 10.1016/j.ympev.2019.106520.
  23. Hussain S., Rasool S.T., Asif A.H. A detailed analysis of synonymous codon usage in human bocavirus. Arch. Virol. 2019;164:335–347. doi: 10.1007/s00705-018-4063-8.
  24. Ismail S., Baharum S.N., Fazry S., Low C.F. Comparative genome analysis reveals a distinct influence of nucleotide composition on virus-host species-specific interaction of prawn-infecting nodavirus. J. Fish Dis. 2019;42:1761–1772. doi: 10.1111/jfd.13093.
  25. Jenkins G.M., Holmes E.C. The extent of codon usage bias in human RNA viruses and its evolutionary origin. Virus Res. 2003;92:1–7. doi: 10.1016/s0168-1702(02)00309-x.
  26. Karlin S., Burge C. Dinucleotide relative abundance extremes: a genomic signature. Trends Genet. 1995;11:283–290. doi: 10.1016/s0168-9525(00)89076-9.
  27. Kegel W.K., Schoot Pv P. Competing hydrophobic and screened-coulomb interactions in hepatitis B virus capsid assembly. Biophys J. 2004;86:3905–3913. doi: 10.1529/biophysj.104.040055.
  28. Khandia R., Singhal S., Kumar U., Ansari A., Tiwari R., Dhama K., Das J., Munjal A., Singh R.K. Analysis of Nipah virus codon usage and adaptation to hosts. Front Microbiol. 2019;10:886. doi: 10.3389/fmicb.2019.00886.
  29. Kim K., Choi K., Shin M., Hahn T.W. A porcine circovirus type 2d-based virus-like particle vaccine induces humoral and cellular immune responses and effectively protects pigs against PCV2d challenge. Front. Microbiol. 2023;14:1334968. doi: 10.3389/fmicb.2023.1334968.
  30. Kozdrun W., Wozniakowski G., Samorek-Salamonowicz E., Czekaj H. Viral infections in goose flocks in Poland. Pol. J. Vet. Sci. 2012;15:525–530. doi: 10.2478/v10181-012-0080-9.
  31. Lauring A.S., Acevedo A., Cooper S.B., Andino R. Codon usage determines the mutational robustness, evolutionary capacity, and virulence of an RNA virus. Cell Host Microbe. 2012;12:623–632. doi: 10.1016/j.chom.2012.10.008.
  32. Li G., Wang H., Wang S., Xing G., Zhang C., Zhang W., Liu J., Zhang J., Su S., Zhou J. Insights into the genetic and host adaptability of emerging porcine circovirus 3. Virulence. 2018;9:1301–1313. doi: 10.1080/21505594.2018.1492863.
  33. McFeely C.A.L., Dods K.K., Patel S.S., Hartman M.C.T. Expansion of the genetic code through reassignment of redundant sense codons using fully modified tRNA. Nucleic Acids Res. 2022;50:11374–11386. doi: 10.1093/nar/gkac846.
  34. Neef A., Nath B.K., Das T., Luque D., Forwood J.K., Raidal S.R., Das S. Recombinantly expressed virus-like particles (VLPs) of canine circovirus for development of an indirect ELISA. Vet. Res. Commun. 2024;48:1121–1133. doi: 10.1007/s11259-023-10290-z.
  35. Pan S., Mou C., Wu H., Chen Z. Phylogenetic and codon usage analysis of atypical porcine pestivirus (APPV) Virulence. 2020;11:916–926. doi: 10.1080/21505594.2020.1790282.
  36. Puigbo P., Bravo I.G., Garcia-Vallve S. E-CAI: a novel server to estimate an expected value of Codon Adaptation Index (eCAI) BMC Bioinformatics. 2008;9:65. doi: 10.1186/1471-2105-9-65.
  37. Puigbo P., Bravo I.G., Garcia-Vallve S. CAIcal: a combined set of tools to assess codon usage adaptation. Biol. Direct. 2008;3:38. doi: 10.1186/1745-6150-3-38.
  38. Rahman S.U., Yao X., Li X., Chen D., Tao S. Analysis of codon usage bias of Crimean-Congo hemorrhagic fever virus and its adaptation to hosts. Infect Genet Evol. 2018;58:1–16. doi: 10.1016/j.meegid.2017.11.027.
  39. Rosario K., Breitbart M., Harrach B., Segales J., Delwart E., Biagini P., Varsani A. Revisiting the taxonomy of the family Circoviridae: establishment of the genus Cyclovirus and removal of the genus Gyrovirus. Arch. Virol. 2017;162:1447–1463. doi: 10.1007/s00705-017-3247-y.
  40. Roy A., Guo F., Singh B., Gupta S., Paul K., Chen X., Sharma N.R., Jaishee N., Irwin D.M., Shen Y. Base Composition and host adaptation of the SARS-CoV-2: insight from the codon usage perspective. Front. Microbiol. 2021;12:548275. doi: 10.3389/fmicb.2021.548275.
  41. Sharp P.M., Li W.H. An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol. 1986;24:28–38. doi: 10.1007/BF02099948.
  42. Shi X.W., Wang J.W., Zeng F.T., Qiu X.P. Mitochondrial DNA cleavage patterns distinguish independent origin of Chinese domestic geese and Western domestic geese. Biochem Genet. 2006;44:237–245. doi: 10.1007/s10528-006-9028-z.
  43. Si F., Jiang L., Yu R., Wei W., Li Z. Study on the Characteristic codon usage pattern in porcine epidemic diarrhea virus genomes and its host adaptation phenotype. Front Microbiol. 2021;12:738082. doi: 10.3389/fmicb.2021.738082.
  44. Soike D., Kohler B., Albrecht K. A circovirus-like infection in geese related to a runting syndrome. Avian Pathol. 1999;28:199–202. doi: 10.1080/03079459994939.
  45. Stenzel T., Dziewulska D., Muhire B.M., Hartnady P., Kraberger S., Martin D.P., Varsani A. Recombinant goose circoviruses circulating in domesticated and wild geese in Poland. Viruses. 2018;10:107. doi: 10.3390/v10030107.
  46. Sueoka N. Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J Mol Evol. 1995;40:318–325. doi: 10.1007/BF00163236.
  47. Sueoka N. Two aspects of DNA base composition: G+C content and translation-coupled deviation from intra-strand rule of A = T and G = C. J Mol Evol. 1999;49:49–62. doi: 10.1007/pl00006534. [DOI] [PubMed] [Google Scholar]
  48. Sueoka N. Translation-coupled violation of Parity Rule 2 in human genes is not the cause of heterogeneity of the DNA G+C content of third codon position. Gene. 1999;238:53–58. doi: 10.1016/s0378-1119(99)00320-0. [DOI] [PubMed] [Google Scholar]
  49. Todd D. Avian circovirus diseases: lessons for the study of PMWS. Vet. Microbiol. 2004;98:169–174. doi: 10.1016/j.vetmic.2003.10.010. [DOI] [PubMed] [Google Scholar]
  50. Tsai C.T., Lin C.H., Chang C.Y. Analysis of codon usage bias and base compositional constraints in iridovirus genomes. Virus Res. 2007;126:196–206. doi: 10.1016/j.virusres.2007.03.001. [DOI] [PubMed] [Google Scholar]
  51. Villanueva E., Marti-Solano M., Fillat C. Codon optimization of the adenoviral fiber negatively impacts structural protein expression and viral fitness. Sci Rep. 2016;6:27546. doi: 10.1038/srep27546. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Wang H., Liu S., Zhang B., Wei W. Analysis of synonymous codon usage bias of Zika virus and its adaption to the hosts. PLoS One. 2016;11:e0166260. doi: 10.1371/journal.pone.0166260.
  53. Wang Q., Lyu X., Cheng J., Fu Y., Lin Y., Abdoulaye A.H., Jiang D., Xie J. Codon usage provides insights into the adaptive evolution of mycoviruses in their associated fungi host. Int. J. Mol. Sci. 2022;23:7441. doi: 10.3390/ijms23137441.
  54. Wang X., Xu W., Fan K., Chiu H.C., Huang C. Codon usage bias in the H gene of canine distemper virus. Microb. Pathog. 2020;149:104511. doi: 10.1016/j.micpath.2020.104511.
  55. Wang Y., Chi C., Zhang J., Zhang K., Deng D., Zheng W., Chen N., Meurens F., Zhu J. Systematic analysis of the codon usage patterns of African swine fever virus genome coding sequences reveals its host adaptation phenotype. Microb. Genom. 2024;10:001186. doi: 10.1099/mgen.0.001186.
  56. Wang Y., Hu Y., He D., Chen S., Li S., Lan D., Ren P., Lin Z., Liu Y. Contribution of both positive selection and relaxation of selective constraints to degeneration of flyability during geese domestication. PLoS One. 2017;12:e0185328. doi: 10.1371/journal.pone.0185328.
  57. Wright F. The 'effective number of codons' used in a gene. Gene. 1990;87:23–29. doi: 10.1016/0378-1119(90)90491-9.
  58. Xu Q., Chen H., Sun W., Zhu D., Zhang Y., Chen J.L., Chen Y. Genome-wide analysis of the synonymous codon usage pattern of Streptococcus suis. Microb. Pathog. 2021;150:104732. doi: 10.1016/j.micpath.2021.104732.
  59. Xu Y., Jia R., Zhang Z., Lu Y., Wang M., Zhu D., Chen S., Liu M., Yin Z., Cheng A. Analysis of synonymous codon usage pattern in duck circovirus. Gene. 2015;557:138–145. doi: 10.1016/j.gene.2014.12.019.
  60. Yang K.K., Yin D.D., Xu L., Liang Y.Q., Tu J., Song X.J., Shao Y., Liu H.M., Qi K.Z. A TaqMan-based quantitative real-time PCR assay for identification of the goose circovirus. Mol. Cell. Probes. 2020;52:101564. doi: 10.1016/j.mcp.2020.101564.
  61. Yao X., Fan Q., Yao B., Lu P., Rahman S.U., Chen D., Tao S. Codon usage bias analysis of bluetongue virus causing livestock infection. Front. Microbiol. 2020;11:655. doi: 10.3389/fmicb.2020.00655.
  62. Yu X., Gao K., Pi M., Li H., Zhong W., Li B., Ning Z. Phylogenetic and codon usage analysis for replicase and capsid genes of porcine circovirus 3. Vet. Res. Commun. 2021;45:353–361. doi: 10.1007/s11259-021-09816-0.
  63. Yu X., Zhu C., Zheng X., He S., Liu X. Genome analysis and epidemiological investigation of goose circovirus detected in eastern China. Virus Genes. 2007;35:605–609. doi: 10.1007/s11262-007-0112-1.
  64. Zhou J., Wang X., Zhou Z., Wang S. Insights into the evolution and host adaptation of the Monkeypox virus from a codon usage perspective: focus on the ongoing 2022 outbreak. Int. J. Mol. Sci. 2023;24:11524. doi: 10.3390/ijms241411524.
  65. Zhou J.H., Zhang J., Sun D.J., Ma Q., Chen H.T., Ma L.N., Ding Y.Z., Liu Y.S. The distribution of synonymous codon choice in the translation initiation region of dengue virus. PLoS One. 2013;8:e77239. doi: 10.1371/journal.pone.0077239.

Supplementary Materials

mmc1.pptx (331.1KB, pptx)
mmc2.pptx (290.8KB, pptx)
mmc3.xlsx (69.5KB, xlsx)
mmc4.xlsx (11.4KB, xlsx)
mmc5.xlsx (11.1KB, xlsx)
mmc6.xlsx (11.8KB, xlsx)

Articles from Poultry Science are provided here courtesy of Elsevier
