Abstract
Codon usage depends on mutation bias, tRNA-mediated selection, and the need for high efficiency and accuracy in translation. One codon in a synonymous codon family is often strongly over-used, especially in highly expressed genes, which often leads to a high dN/dS ratio because dS is very small. Many different codon usage indices have been proposed to measure codon usage and codon adaptation. Sense codons can be misread by release factors and stop codons by tRNAs, which also contributes to codon usage in rare cases. This chapter outlines the conceptual framework of codon evolution, illustrates codon-specific and gene-specific codon usage indices, and presents their applications. A new index for codon adaptation that accounts for background mutation bias (Index of Translation Elongation, ITE) is presented and contrasted with the codon adaptation index (CAI), which does not consider background mutation bias. Both indices are used to reanalyze data from a recent paper claiming that translation elongation efficiency matters little in protein production. The reanalysis disproves the claim.
Introduction
We will first learn a few key definitions and notations on tRNA, its anticodon, and codon families. We will then outline the conceptual framework of codon adaptation, mediated by mutation and selection. This brings us to indices of codon usage bias, their calculation and interpretations, and factors that may confound their interpretations. There are codon-specific indices such as relative synonymous codon usage (RSCU, Sharp et al. 1986) and gene-specific indices such as the index of translation elongation (ITE, Xia 2015) and the codon adaptation index (CAI, Sharp and Li 1987; Xia 2007c). All these indices are implemented in DAMBE (Xia 2013, 2017d).
ITE takes background mutation bias into consideration, while CAI does not. ITE is reduced to CAI if there is no background mutation bias. I will illustrate the applications of these indices in practical research. Keep in mind that a codon adaptation index is just one variable which will not be particularly interesting until you relate it to other variables and understand their relationships.
Two additional topics are dealt with close to the end of the chapter. The first involves how to discriminate between selection for translation efficiency and selection for translation accuracy (Akashi 1994). The second is the effect of amino acid usage on translation elongation efficiency. The general prediction concerning amino acid usage is that highly expressed proteins should maximize the use of amino acids that are abundant, energetically cheap to make (Akashi and Gojobori 2002), and carried by many tRNAs (Xia 1998a). The same argument has been used for transcription, i.e., an mRNA with many A nucleotides will be transcribed faster than one with many C nucleotides because A is in general far more abundant than C and it takes extra ATP to make CTP (Xia 1996; Xia et al. 2006).
Basic Notations, Definitions, and Abbreviations
Notations, definitions, and abbreviations are essential in science. We are lucky enough to have almost all of them unambiguous. If you were studying social sciences, you would have to come to define what is man and what is woman, and the debate on a proper definition will last forever, eventually with all debaters losing their mind and being called jerks.
tRNA Notation and Identification of tRNA Anticodon
The simplest notation of a tRNA is tRNAAA, where AA is a specific amino acid. For example, tRNAGly refers to all tRNAs that can be charged with amino acid glycine (Gly). A slightly more complicated notation is tRNAAA/AC, where AC refers to the tRNA anticodon. For example, tRNAGly/GCC refers specifically to tRNAGly with a GCC anticodon. The general notation of a tRNA is AA2-tRNAAA1/AC, where AA1 is the amino acid the tRNA is supposed to carry, AA2 is the amino acid that is actually carried by the tRNA, and AC is the anticodon. In most cases, AA1 and AA2 are the same. However, there are two cases where AA1 and AA2 can be different. The first is modification of AA2 by a biochemist. The second occurs naturally in a number of species across all three domains of life (Sheppard et al. 2008; Yuan et al. 2008), where Gln-tRNAGln, Asn-tRNAAsn, Cys-tRNACys, and Sec-tRNASec are formed indirectly in two steps. Take Gln-tRNAGln and Asn-tRNAAsn, for example. Glu is first misacylated to tRNAGln, and Asp to tRNAAsn, to form Glu-tRNAGln and Asp-tRNAAsn, respectively. The resulting misacylated tRNAs are then converted to Gln-tRNAGln and Asn-tRNAAsn by a group of tRNA-dependent modifying enzymes.
Isoacceptor tRNA is a somewhat confusing term as it may carry two slightly different meanings. It could refer to a single tRNA decoding different synonymous codons, e.g., tRNAGly/GCC decoding GGC and GGU codons. Alternatively, it could refer to a set of different tRNAs that carry the same amino acid but decode different synonymous codons. For example, tRNAGly/GCC, tRNAGly/CCC, and tRNAGly/UCC are isoacceptor tRNAs. They all carry amino acid Gly but with different anticodons decoding different synonymous Gly codons. Different isoacceptor tRNAs could decode the same codon. For example, tRNAGly/CCC decodes GGG, but tRNAGly/UCC decodes both GGA and GGG, so GGG is decoded by both tRNAGly/CCC and tRNAGly/UCC. Thus, isoacceptor tRNA refers to (1) one tRNA decoding different synonymous codons or (2) a set of tRNAs that carry the same amino acid but decode different sets of synonymous codons. The intersection of different sets of synonymous codons may not be empty. For example, the set of codons decoded by tRNAGly/CCC is {GGG}, and the set of codons decoded by tRNAGly/UCC is {GGA, GGG}. The intersection of the two sets is {GGG}.
Related to isoacceptor tRNA is another potentially confusing concept, near-cognate tRNA, which is defined in two ways. The first is based on empirical evidence. If codon XYZ encoding amino acid AA1 can be misread by a tRNA carrying amino acid AA2 (AA1 ≠ AA2), then that tRNA is a near-cognate tRNA for codon XYZ. The second definition is based on nucleotide similarity among codons. A codon XYZ has nine XYZ-like codons which differ from XYZ by a single nucleotide. Some of these XYZ-like codons are synonymous to XYZ and some are not. The tRNAs that can decode any of those nonsynonymous XYZ-like codons are near-cognate tRNAs for codon XYZ because they can “potentially” misread codon XYZ. For example, tRNAAsp is a near-cognate for codons GAA and GAG because Asp is encoded by GAC and GAU which are GAA-like and GAG-like codons.
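The second (similarity-based) definition can be made concrete with a few lines of Python. This is a minimal sketch, assuming the standard genetic code; the helper names `one_step_neighbors` and `near_cognate_amino_acids` are hypothetical and chosen only for illustration. Stop codons are reported as `*`; they are read by release factors rather than by tRNAs.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def one_step_neighbors(codon):
    """Return the nine codons that differ from `codon` at exactly one site."""
    out = []
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                out.append(codon[:pos] + base + codon[pos + 1:])
    return out

def near_cognate_amino_acids(codon):
    """Amino acids ('*' = stop) encoded by the nonsynonymous one-step neighbors."""
    aa = CODE[codon]
    return {CODE[n] for n in one_step_neighbors(codon) if CODE[n] != aa}

if __name__ == "__main__":
    for codon in ("GAA", "GAG"):
        print(codon, sorted(near_cognate_amino_acids(codon)))
```

For both Glu codons the resulting set contains `D` (Asp), in line with the tRNAAsp example in the text.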
Genetic Codes and Associated Concepts and Definitions
It is through the genetic code that the 64 codons are interpreted as encoding amino acids or translation stop. Nature is prolific in her creation of genetic codes. There are now 24 known genetic codes, numbered from 1 to 31 with gaps (Table 9.1). The standard genetic code was shown previously in Table 2.7.
Table 9.1.
The 24 genetic codes named after representative species and corresponding translation tables (TT)
| Name | TT |
|---|---|
| Standard | 1 |
| Vertebrate mitochondrial | 2 |
| Yeast mitochondrial | 3 |
| Mold, protozoan, and coelenterate mitochondrial code and the Mycoplasma/Spiroplasma | 4 |
| Invertebrate mitochondrial | 5 |
| Ciliate, Dasycladacean, and Hexamita nuclear | 6 |
| Echinoderm and flatworm mitochondrial | 9 |
| Euplotid nuclear | 10 |
| Bacterial, archaeal, and plant plastid | 11 |
| Alternative yeast nuclear | 12 |
| Ascidian mitochondrial | 13 |
| Alternative flatworm mitochondrial | 14 |
| Chlorophycean mitochondrial | 16 |
| Trematode mitochondrial | 21 |
| Scenedesmus obliquus mitochondrial | 22 |
| Thraustochytrium mitochondrial | 23 |
| Pterobranchia mitochondrial | 24 |
| Candidate division SR1 and Gracilibacteria | 25 |
| Pachysolen tannophilus nuclear | 26 |
| Karyorelict nuclear | 27 |
| Condylostoma nuclear | 28 |
| Mesodinium nuclear | 29 |
| Peritrich nuclear | 30 |
| Blastocrithidia nuclear | 31 |
Some codons do not change their meanings, e.g., Phe (UUY), Tyr (UAY), and Pro (CCN), whereas some others change their meaning frequently. Table 9.2 lists those codons with different meanings in different genetic codes. These codons tend to end with a purine, except for CUN. However, even within the CUN codon family, CUR codons are involved in recoding more often than CUY codons (Table 9.2).
Table 9.2.
Codons with different meanings in different translation tables (TT)
| TT | UUA | UCA | UAA | UAG | UGA | CUU | CUC | CUA | CUG | AUA | AAA | AGA | AGG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | L | S | * | * | * | L | L | L | L | I | K | R | R |
| 2 | L | S | * | * | W | L | L | L | L | M | K | * | * |
| 3 | L | S | * | * | W | T | T | T | T | M | K | R | R |
| 4 | L | S | * | * | W | L | L | L | L | I | K | R | R |
| 5 | L | S | * | * | W | L | L | L | L | M | K | S | S |
| 6 | L | S | Q | Q | * | L | L | L | L | I | K | R | R |
| 9 | L | S | * | * | W | L | L | L | L | I | N | S | S |
| 10 | L | S | * | * | C | L | L | L | L | I | K | R | R |
| 11 | L | S | * | * | * | L | L | L | L | I | K | R | R |
| 12 | L | S | * | * | * | L | L | L | S | I | K | R | R |
| 13 | L | S | * | * | W | L | L | L | L | M | K | G | G |
| 14 | L | S | Y | * | W | L | L | L | L | I | N | S | S |
| 16 | L | S | * | L | * | L | L | L | L | I | K | R | R |
| 21 | L | S | * | * | W | L | L | L | L | M | N | S | S |
| 22 | L | * | * | L | * | L | L | L | L | I | K | R | R |
| 23 | * | S | * | * | * | L | L | L | L | I | K | R | R |
| 24 | L | S | * | * | W | L | L | L | L | I | K | S | K |
| 25 | L | S | * | * | G | L | L | L | L | I | K | R | R |
| 26 | L | S | * | * | * | L | L | L | A | I | K | R | R |
| 27 | L | S | Q | Q | w | L | L | L | L | I | K | R | R |
| 28 | L | S | q | q | w | L | L | L | L | I | K | R | R |
| 29 | L | S | Y | Y | * | L | L | L | L | I | K | R | R |
| 30 | L | S | E | E | * | L | L | L | L | I | K | R | R |
| 31 | L | S | e | e | W | L | L | L | L | I | K | R | R |
A small-case letter, such as q in translation table 28, means that the corresponding codon can mean either amino acid Q or a stop codon
We can build a distance tree from Table 9.2 by counting the pairwise number of reassignment events (i.e., when a codon for one amino acid is reassigned to a different amino acid or a stop codon). The only problem is how to treat reassignment between a sense codon and a stop. Such a change probably should occur less frequently than reassignments involving two sense codons. All pairwise comparisons among the 24 rows (24 genetic codes) generate 609 reassignments involving 2 sense codons and 445 reassignments between a sense codon and a stop codon. However, during the long evolutionary time, the more frequent reassignments will erase each other and the frequencies of their occurrences will be underestimated. So the actual difference between the two numbers must be much greater. If we count each reassignment between a sense codon and a stop codon as equivalent to four reassignments between two sense codons, we obtain a distance-based tree in Fig. 9.1. The topology remains the same if we treat each reassignment between a sense codon and a stop codon as equivalent to two, three, or five reassignments involving two sense codons.
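The counting scheme can be sketched as follows. This is a minimal illustration, not the procedure used to produce Fig. 9.1: it uses only four rows of Table 9.2 (translation tables 1, 2, 3, and 5), it does not handle the lowercase (ambiguous) entries, and the function names are hypothetical. The weight of four for sense/stop reassignments follows the text.

```python
# Column order of Table 9.2; each string gives one row ('*' marks a stop codon).
CODONS = ["UUA", "UCA", "UAA", "UAG", "UGA", "CUU", "CUC",
          "CUA", "CUG", "AUA", "AAA", "AGA", "AGG"]
TABLES = {
    1: "LS***LLLLIKRR",
    2: "LS**WLLLLMK**",
    3: "LS**WTTTTMKRR",
    5: "LS**WLLLLMKSS",
}

def count_reassignments(tt_a, tt_b):
    """Return (sense_sense, sense_stop) differences between two translation tables."""
    sense_sense = sense_stop = 0
    for x, y in zip(TABLES[tt_a], TABLES[tt_b]):
        if x == y:
            continue
        if x == "*" or y == "*":
            sense_stop += 1
        else:
            sense_sense += 1
    return sense_sense, sense_stop

def weighted_distance(tt_a, tt_b, stop_weight=4):
    """Distance in which one sense/stop reassignment counts as `stop_weight` events."""
    sense_sense, sense_stop = count_reassignments(tt_a, tt_b)
    return sense_sense + stop_weight * sense_stop

if __name__ == "__main__":
    ids = sorted(TABLES)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            print(a, b, count_reassignments(a, b), weighted_distance(a, b))
```

A full pairwise matrix over all 24 rows, fed to any distance-based tree-building method, would be the natural extension; how to score the ambiguous lowercase entries is a choice the sketch leaves open.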
Fig. 9.1.
“Phylogenetic tree” of 24 genetic codes with their differences shown in Table 9.2, based on pairwise number of codon reassignments. A reassignment between a sense codon and a stop codon is treated as equivalent to four codon reassignment events between two nonsynonymous sense codons. Leaves labeled with a “MT”-ending are mitochondrial genetic codes
Most bacteria use genetic code 11, which is the same as the standard code except for the difference in start codon usage. The wall-less bacteria including Mycoplasma and Spiroplasma use genetic code 4, which is identical to the mitochondrial genetic code used in a number of fungal lineages, red algae, and protozoa. The use of the same genetic code 4 by bacteria and mitochondria in eukaryotic lineages suggests two alternative hypotheses. First, it is convergence. Second, the ancestor of mitochondrial lineages in Cluster 3 (Fig. 9.1) is a Mycoplasma-like bacterium. This would imply multiple origins of mitochondrial lineages.
The main arguments for a single origin of mitochondria are (1) extensive phylogenetic reconstruction with rRNA sequences from a diverse array of mitochondrial and bacterial lineages appears to recover mitochondrial lineages as a monophyletic taxon, with its closest phylogenetic relative being in Alphaproteobacteria lineages, especially Rickettsiales (Williams et al. 2007), and (2) all diverse mitochondrial genomes appear to represent reduced forms of the mitochondrial genome from Reclinomonas americana (Lang et al. 1997). In particular, the closest phylogenetic relative for the mitochondrial genome from R. americana among bacterial lineages is Ehrlichia muris strain AS145 within Rickettsiales. These lines of evidence, taken together, represent compelling evidence for the single-origin hypothesis of mitochondria.
Genetic codes also differ in start codons (Table 9.3). While AUG is used universally and dominantly as a start codon, other codons are used as well, although there has been no species in which a non-AUG codon is used as a start codon more frequently than AUG. For eukaryotic species where AUG is part of translation initiation signal such as in the Kozak consensus RxxAUGG, non-AUG codons are rarely used. In bacterial species where start codon is localized by pairing of Shine-Dalgarno (SD) sequences and anti-SD sequences, the requirement for AUG as a start codon is less stringent.
Table 9.3.
The 24 translation tables (TT) differ in start codon usage
| TT | TTA | TTG | CTG | ATT | ATC | ATA | ATG | GTG |
|---|---|---|---|---|---|---|---|---|
| 1 | – | M | M | – | – | – | M | – |
| 2 | – | – | – | M | M | M | M | M |
| 3 | – | – | – | – | – | M | M | – |
| 4 | M | M | M | M | M | M | M | M |
| 5 | – | M | – | M | M | M | M | M |
| 6 | – | – | – | – | – | – | M | – |
| 9 | – | – | – | – | – | – | M | M |
| 10 | – | – | – | – | – | – | M | – |
| 11 | – | M | M | M | M | M | M | M |
| 12 | – | – | M | – | – | – | M | – |
| 13 | – | M | – | – | – | M | M | M |
| 14 | – | – | – | – | – | – | M | – |
| 16 | – | – | – | – | – | – | M | – |
| 21 | – | – | – | – | – | – | M | M |
| 22 | – | – | – | – | – | – | M | – |
| 23 | – | – | – | M | – | – | M | M |
| 24 | – | M | M | – | – | – | M | M |
| 25 | – | M | – | – | – | – | M | M |
| 26 | – | – | M | – | – | – | M | – |
| 27 | – | – | – | – | – | – | M | – |
| 28 | – | – | – | – | – | – | M | – |
| 29 | – | – | – | – | – | – | M | – |
| 30 | – | – | – | – | – | – | M | – |
| 31 | – | – | – | – | – | – | M | – |
A synonymous codon family refers to all codons coding the same amino acid. For example, GGA, GGC, GGG, and GGU codons all code Gly and are collectively referred to as the Gly codon family or just Gly family. I may use “family” for “synonymous codon family” when there is no confusion. A codon family whose codons differ only at the third codon position, such as the Gly family, is a simple family. In contrast, a codon family whose codons differ not only at the third codon position but also at other codon positions is a compound codon family. For example, in the standard genetic code, Leu is coded by UUR (where R stands for purine) and CUN (where N stands for any nucleotide) codons. Therefore, the Leu codon family is a compound family. Other compound families in the standard code include Ser (coded by UCN and AGY, where Y stands for pyrimidine) and Arg (coded by CGN and AGR). Compound families are often divided into subfamilies. For example, the Ser family is broken into a UCN subfamily and an AGY subfamily.
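The distinction between simple and compound families can be checked mechanically. The sketch below assumes the standard genetic code and treats a family as simple when all of its codons share the first two nucleotides; single-codon families (Met, Trp) are trivially simple under this rule. Function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_families():
    """Group sense codons by the amino acid they encode."""
    families = defaultdict(list)
    for codon, aa in CODE.items():
        if aa != "*":
            families[aa].append(codon)
    return dict(families)

def is_simple(codons):
    """A family is simple if its codons differ only at the third position."""
    return len({codon[:2] for codon in codons}) == 1

def subfamilies(codons):
    """Split a family into subfamilies sharing the first two nucleotides."""
    groups = defaultdict(list)
    for codon in codons:
        groups[codon[:2]].append(codon)
    return dict(groups)

if __name__ == "__main__":
    for aa, codons in sorted(codon_families().items()):
        kind = "simple" if is_simple(codons) else "compound"
        print(aa, kind, subfamilies(codons))
```

For the standard code, only Leu, Ser, and Arg come out as compound, matching the text.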
The phenomenon that one amino acid may be encoded by multiple codons is called codon degeneracy. This gives rise to 4-fold, 3-fold, 2-fold, and 1-fold (0-fold is a misnomer) degenerate sites. An n-fold site is one that can be occupied by n different nucleotides without changing the encoded amino acid. For example, the third site in the four Gly codons above is fourfold degenerate. In the standard code, AUA, AUC, and AUU all encode amino acid Ile, so that the third codon site is threefold degenerate. AAA and AAG both encode amino acid Lys, so that the third codon site is twofold degenerate. We may also have a twofold degenerate site at the first codon site. For example, both CUA and UUA encode amino acid Leu, so the first codon site is twofold degenerate. The second codon site of Gly codons is onefold degenerate because replacing it by any other nucleotide will change the encoded amino acid.
A synonymous mutation refers to the replacement of a codon by a synonymous codon. A nonsynonymous mutation refers to a codon replacement that changes the encoded amino acid. A substitution is a mutation that has spread to all individuals in the population. Synonymous substitutions occur often, but nonsynonymous substitutions occur rarely.
Throughout the text, we will abbreviate highly and lowly expressed genes as HEGs and LEGs. Unless specified otherwise, HEGs and LEGs in this chapter pertain to protein expression, not mRNA expression. One may rank all proteins according to experimentally measured abundance and take the top and bottom 1/3 as HEGs and LEGs, respectively. Non-HEGs are simply all genes from a genome that are not included in HEGs. Protein abundance values for most model species may be found in PaxDb (Wang et al. 2012).
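The n-fold degeneracy of a site can be computed directly from the genetic code. The following is a minimal sketch assuming the standard code; `degeneracy` is a hypothetical helper that takes a codon and a 1-based site number.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def degeneracy(codon, site):
    """Number of nucleotides that can occupy `site` (1, 2, or 3) of `codon`
    without changing the encoded amino acid (the original nucleotide included)."""
    pos = site - 1
    aa = CODE[codon]
    return sum(
        1
        for base in BASES
        if CODE[codon[:pos] + base + codon[pos + 1:]] == aa
    )

if __name__ == "__main__":
    for codon, site in [("GGA", 3), ("AUA", 3), ("AAA", 3), ("CUA", 1), ("GGA", 2)]:
        print(codon, "site", site, "->", degeneracy(codon, site), "fold")
```

Because the original nucleotide is always counted, the smallest possible value is 1, which is why "0-fold" is a misnomer.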
Elongation Efficiency Depends on Amino Acid and Codon Usage
Many unicellular organisms, especially bacterial species, need to grow and replicate rapidly in order not to be outcompeted by others. For example, an E. coli cell replicates once every 20 min with unlimited nutrients. To replicate a cell, not only does the genome need to be replicated, but a large number of proteins have to be produced, with some proteins produced in nearly half a million copies in an E. coli cell. For such highly expressed proteins, it is very important for their coding genes to have an efficient coding strategy to maximize the rate of translation. Translation involves three sub-processes: initiation, elongation, and termination. The previous chapter illustrates how natural selection can drive evolution toward more efficient translation initiation. This chapter addresses the question of how translation elongation can be improved through codon adaptation.
There are two obvious ways of increasing translation elongation efficiency for mass-produced proteins. The first is to optimize amino acid usage, i.e., to use energetically cheap and typically abundant amino acids as building blocks (Akashi and Gojobori 2002). The second is to maximize the usage of codons that match the anticodon of the most abundant cognate tRNA (Gouy and Gautier 1982; Ikemura 1992; Xia 1998a, 2005, 2009, 2015). For example, the amino acid glycine (Gly) can be coded by GGA, GGC, GGG, and GGU codons, but tRNAGly species that decode GGY codons are more abundant than tRNAGly species that decode GGR codons in E. coli cells. What codons should E. coli use to code glycine? Obviously natural selection should favor GGY codons over GGR codons given the differential tRNA availability. However, selection and mutation may go in opposite directions, so any study of codon adaptation would be incomplete without considering both selection and mutation.
Empirical Illustration of Codon-Anticodon Adaptation
Ikemura’s pioneering works established the relationship between differential tRNA abundance and its effect on codon usage in rapidly replicating bacterial species and unicellular eukaryotes (Ikemura 1981a, b, 1982, 1992). Many studies have since demonstrated a strong relationship not only between codon adaptation and gene expression (Coghlan and Wolfe 2000; Comeron and Aguade 1998; Duret and Mouchiroud 1999; Gouy and Gautier 1982; Xia 2007c) but also between experimentally modified codon usage and protein production (Haas et al. 1996; Ngumbela et al. 2008; Robinson et al. 1984; Sorensen et al. 1989). These results have led to the explicit formulation of codon-anticodon coevolution and adaptation theory (e.g., Akashi 1994; Moriyama and Powell 1997; Ran and Higgs 2012; Xia 1998a, 2008) which states that (1) protein production is rate-limited by both translation initiation and elongation efficiency; (2) codon usage and tRNA anticodon coevolve to adapt to each other, resulting in increased production of correctly translated proteins; and (3) the increased elongation efficiency and accuracy represent the driving force for the HEGs to acquire a high degree of codon-anticodon adaptation.
Empirical Illustration of Codon-Anticodon Adaptation in Yeast
The baker’s yeast, Saccharomyces cerevisiae, replicates rapidly and is expected to use codons with many decoding tRNAs and avoid codons with few decoding tRNAs. The earliest association between tRNA and codon usage was empirically demonstrated by Ikemura (1981a, b, 1992). Tables 9.4 and 9.5 show the association between tRNA gene copy number (T in Tables 9.4 and 9.5) in the genome and codon usage in highly expressed yeast genes (F in Tables 9.4 and 9.5). T is a good proxy for tRNA abundance (Percudani et al. 1997).
Table 9.4.
Copy number of tRNA genes in the yeast genome (T) and codon counts (F) in highly expressed yeast protein-coding genes, compiled in the Eyeastcai.cut file distributed with EMBOSS (Rice et al. 2000)
| AA^a | Codon^b | T | F | AA^a | Codon^b | T | F |
|---|---|---|---|---|---|---|---|
| Arg | AGA | 11 | 314 | His | CAC | 7 | 102 |
| Arg | AGG | 1 | 1 | His | CAU | 0 | 25 |
| Asn | AAC | 10 | 208 | Leu | UUA | 7 | 42 |
| Asn | AAU | 0 | 11 | Leu | UUG | 10 | 359 |
| Asp | GAC | 16 | 202 | Lys | AAA | 7 | 65 |
| Asp | GAU | 0 | 112 | Lys | AAG | 14 | 483 |
| Cys | UGC | 4 | 3 | Phe | UUC | 10 | 168 |
| Cys | UGU | 0 | 39 | Phe | UUU | 0 | 19 |
| Gln | CAA | 9 | 153 | Ser | AGC | 2 | 6 |
| Gln | CAG | 1 | 1 | Ser | AGU | 0 | 4 |
| Glu | GAA | 14 | 305 | Tyr | UAC | 8 | 141 |
| Glu | GAG | 2 | 5 | Tyr | UAU | 0 | 10 |
Only twofold codon families are included
^a Amino acid carried by tRNA
^b Codons forming Watson-Crick base pair with the anticodon of tRNA
Table 9.5.
Copy number of tRNA genes in the yeast genome (T) and codon counts (F) in highly expressed yeast protein-coding genes, compiled in the Eyeastcai.cut file distributed with EMBOSS (Rice et al. 2000)
| AA | Codon | T | F | AA | Codon | T | F |
|---|---|---|---|---|---|---|---|
| Ala | GCA | 5 | 6 | Pro | CCA | 10 | 211 |
| Ala | GCG | 0 | 0 | Pro | CCG | 0 | 0 |
| Ala | GCC | 0 | 130 | Pro | CCC | 0 | 2 |
| Ala | GCU | 11 | 411 | Pro | CCU | 2 | 10 |
| Arg | CGA | 0 | 0 | Ser | UCA | 3 | 7 |
| Arg | CGG | 1 | 0 | Ser | UCG | 1 | 1 |
| Arg | CGC | 0 | 0 | Ser | UCC | 0 | 133 |
| Arg | CGU | 6 | 43 | Ser | UCU | 11 | 192 |
| Gly | GGA | 3 | 1 | Thr | ACA | 4 | 2 |
| Gly | GGG | 2 | 2 | Thr | ACG | 1 | 1 |
| Gly | GGC | 16 | 9 | Thr | ACC | 0 | 164 |
| Gly | GGU | 0 | 459 | Thr | ACU | 11 | 151 |
| Ile | AUA | 2 | 0 | Val | GUA | 2 | 0 |
| Ile | AUC | 0 | 181 | Val | GUG | 2 | 5 |
| Ile | AUU | 13 | 149 | Val | GUC | 0 | 231 |
| Leu | CUA | 3 | 14 | Val | GUU | 14 | 278 |
| Leu | CUG | 0 | 1 | ||||
| Leu | CUC | 1 | 1 | ||||
| Leu | CUU | 0 | 2 |
Only threefold and fourfold codon families are included. Symbols as in Table 9.4
The association between T and F is obvious in Tables 9.4 and 9.5. Take the two Arg codons AGA and AGG in Table 9.4, for example. There are 11 tRNAArg/UCU genes in the yeast genome that form perfect Watson-Crick base pairs with AGA but only one tRNAArg/CCU gene that does so with AGG. So we expect yeast genes, especially highly expressed ones, to use AGA and avoid AGG, which is true (Table 9.4). The same applies to all other synonymous codon families or subfamilies, except for the Cys codon family. Why the rarely used Cys codon family should be exceptional remains unknown. It is possible that Cys codon UGC may happen to be followed by a GNN codon, leading to methylation of C at the third codon position which then changes to T via spontaneous deamination. Whether the yeast genome has cytosine methylation remains controversial, with evidence both for (Tang et al. 2012) and against (Capuano et al. 2014) the existence of methylation in S. cerevisiae. However, there is significant CpG deficiency and TpG and CpA surplus in the yeast genome, which is consistent with CpG-specific DNA methylation.
One can obtain tables similar to Tables 9.4 and 9.5 by downloading the yeast genome from GenBank and then using DAMBE to compile the data in three steps. First, read the GenBank files for yeast chromosome sequences into DAMBE (Xia 2013, 2017d) to extract the coding sequences (CDSs) and tRNA genes. Second, compute ITE (Xia 2015) as a proxy of gene expression, and choose a subset of CDSs with the highest ITE as HEGs. Third, use DAMBE to obtain codon usage of these HEGs. In this way, a table similar to Table 9.4 can be generated in minutes.
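The claim about the twofold families can be verified directly from the numbers in Table 9.4. The sketch below copies the (T, F) pairs from that table and lists the families in which the codon with the most tRNA genes is not the most frequently used codon; the function name is hypothetical.

```python
# (T, F) = (tRNA gene copy number, codon count in HEGs), copied from Table 9.4.
TABLE_9_4 = {
    "Arg": {"AGA": (11, 314), "AGG": (1, 1)},
    "Asn": {"AAC": (10, 208), "AAU": (0, 11)},
    "Asp": {"GAC": (16, 202), "GAU": (0, 112)},
    "Cys": {"UGC": (4, 3), "UGU": (0, 39)},
    "Gln": {"CAA": (9, 153), "CAG": (1, 1)},
    "Glu": {"GAA": (14, 305), "GAG": (2, 5)},
    "His": {"CAC": (7, 102), "CAU": (0, 25)},
    "Leu": {"UUA": (7, 42), "UUG": (10, 359)},
    "Lys": {"AAA": (7, 65), "AAG": (14, 483)},
    "Phe": {"UUC": (10, 168), "UUU": (0, 19)},
    "Ser": {"AGC": (2, 6), "AGU": (0, 4)},
    "Tyr": {"UAC": (8, 141), "UAU": (0, 10)},
}

def discordant_families(table):
    """Families where the codon with the highest T is not the one with the highest F."""
    out = []
    for aa, codons in table.items():
        best_by_t = max(codons, key=lambda c: codons[c][0])
        best_by_f = max(codons, key=lambda c: codons[c][1])
        if best_by_t != best_by_f:
            out.append(aa)
    return sorted(out)

if __name__ == "__main__":
    print("Discordant families:", discordant_families(TABLE_9_4))
```

With these data the only discordant family is Cys, as stated in the text.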
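Once codon counts for HEGs are in hand, the classical indices follow with little code. The sketch below is not DAMBE's implementation and does not cover ITE; it assumes the standard definitions of RSCU (Sharp et al. 1986) and of the CAI weights and geometric mean (Sharp and Li 1987), and it uses only the Lys and Glu counts from Table 9.4 as a toy reference set. Codons with zero counts would need special handling (e.g., a pseudocount), which is omitted here.

```python
import math

# Codon counts in highly expressed yeast genes (Table 9.4), two families only.
REFERENCE = {
    "Lys": {"AAA": 65, "AAG": 483},
    "Glu": {"GAA": 305, "GAG": 5},
}

def rscu(counts):
    """RSCU: observed count divided by the mean count within the family."""
    mean = sum(counts.values()) / len(counts)
    return {codon: n / mean for codon, n in counts.items()}

def relative_adaptiveness(reference):
    """CAI weight w: count divided by the count of the most used codon in the family."""
    w = {}
    for counts in reference.values():
        top = max(counts.values())
        for codon, n in counts.items():
            w[codon] = n / top
    return w

def cai(codons, w):
    """Geometric mean of w over the codons of a gene that have a weight."""
    logs = [math.log(w[c]) for c in codons if c in w]
    return math.exp(sum(logs) / len(logs))

if __name__ == "__main__":
    print(rscu(REFERENCE["Lys"]))
    w = relative_adaptiveness(REFERENCE)
    print(cai(["AAG", "GAA", "AAG"], w))   # uses only the preferred codons
    print(cai(["AAA", "GAG", "AAG"], w))   # mixes in rarely used codons
```

A gene using only the preferred codons gets a CAI of 1, and every rarely used codon pulls the geometric mean down.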
Codon Usage Changes When tRNA Abundance Changes
An evolutionary change in tRNA composition or relative abundance is expected to alter codon-anticodon adaptation. This is not controversial theoretically, but empirically difficult to demonstrate. However, recent studies (Xia 2012c; Xia et al. 2007) have documented that changes in tRNAMet genes (where Met is the amino acid carried by the tRNA) in animal mitochondrial DNA (mtDNA) are associated with changes in Met codon usage.
In mtDNA of most animal species, Met is coded by AUA and AUG codons. In some animal species, e.g., vertebrates, these two codons are translated by a single tRNAMet/CAU species (where CAU is the anticodon in the 5′ to 3′ orientation) with a modified C (i.e., f5C) at the first anticodon position (Grosjean et al. 2010) to allow C/A pairing. In other animal species, e.g., tunicates, an additional tRNAMet/UAU gene is present in the mtDNA. One would expect that, when tRNAMet/UAU is absent, Met should be preferably coded by AUG with a reduced AUA usage. The gain of tRNAMet/UAU would favor more Met to be coded by AUA.
In addition to tunicates, mtDNA in bivalve species also has two tRNAMet genes. In some bivalve species (e.g., Acanthocardia tuberculata, Crassostrea gigas, C. virginica, Hiatella arctica, Placopecten magellanicus, and Venerupis philippinarum), both tRNAMet genes have a CAU anticodon forming Watson-Crick base pairs with codon AUG. In some other bivalve species (e.g., Mytilus edulis, Mytilus galloprovincialis, and Mytilus trossulus), one tRNAMet has a CAU anticodon, and the other has a UAU anticodon forming Watson-Crick base pairs with the AUA codon. One would predict that the latter should be more likely to code Met by AUA than the former, i.e., the proportion of AUA codon within the AUR codon family, designated PAUA, should be greater in the latter with both a tRNAMet/CAU and a tRNAMet/UAU gene than in the former with tRNAMet/CAU genes only (Xia et al. 2007).
One complication in testing the prediction is that AUA usage will increase with genomic AT%. To control for this effect, one may use another A-ending codon, such as UUA, as a reference. Thus, given the same PUUA (the proportion of UUA codon in the UUR codon family), PAUA in the three Mytilus mtDNAs with both a tRNAMet/CAU and a tRNAMet/UAU gene should be higher than that in the six bivalve species without a tRNAMet/UAU gene. This is supported by empirical evidence (ANCOVA test, p = 0.0111, Fig. 9.2a). Thus, the presence of tRNAMet/UAU increases AUA usage significantly.
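The two proportions are simple to compute from a coding sequence. The following is a minimal sketch with hypothetical helper names and a made-up toy sequence (not real mtDNA data); it assumes an in-frame sequence and accepts either DNA or RNA letters.

```python
from collections import Counter

def codon_counts(cds):
    """Count in-frame codons of a coding sequence (T is converted to U)."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))

def proportion(counts, codon, family):
    """Proportion of `codon` within `family`; None if the family is absent."""
    total = sum(counts[c] for c in family)
    return counts[codon] / total if total else None

def p_aua(counts):
    """P_AUA: proportion of AUA within the AUR (AUA + AUG) family."""
    return proportion(counts, "AUA", ("AUA", "AUG"))

def p_uua(counts):
    """P_UUA: proportion of UUA within the UUR (UUA + UUG) family."""
    return proportion(counts, "UUA", ("UUA", "UUG"))

if __name__ == "__main__":
    toy = "ATAATGATATTATTGTTG"  # AUA AUG AUA UUA UUG UUG
    counts = codon_counts(toy)
    print("P_AUA =", p_aua(counts), "P_UUA =", p_uua(counts))
```

Computing both values for the concatenated CDSs of each mitochondrial genome gives one point per species in a plot such as Fig. 9.2.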
Fig. 9.2.
Relationship between PAUA and PUUA, highlighting the observation that PAUA is greater when both a tRNAMet/CAU and a tRNAMet/UAU are present than when only tRNAMet/CAU is present in the mtDNA, for bivalve species (a) and chordate species (b). The filled squares are for mtDNA containing both tRNAMet/CAU and tRNAMet/UAU genes, and the open triangles are for mtDNA without a tRNAMet/UAU gene
A similar comparison can be performed between the urochordates (tunicates, with both tRNAMet/CAU and tRNAMet/UAU genes in their mtDNA) and cephalochordates (lancelets, with only a tRNAMet/CAU gene in their mtDNA). Figure 9.2b shows that PAUA is much smaller in lancelets than in tunicates at the same PUUA level. Thus, AUA usage is consistently increased by the gain of a tRNAMet/UAU gene (or consistently decreased by the loss of a tRNAMet/UAU gene) in animal mtDNA.
A gain of a tRNAMet/UAU gene is also associated with a surplus of AUG→AUA substitutions in animal mitochondrial coding sequences (results not shown). Similar associations can also be observed with other gains/losses of tRNA genes in animal mitochondria. In contrast, a gain/loss of tRNA genes in plant mtDNA appears to have little effect on nucleotide substitutions or codon usage, presumably because such gain/loss events do not significantly alter the tRNA pool in plant cells where nuclear tRNAs are mass-imported into plant mitochondria.
Effect of Biased Mutation on Codon Usage and Some Misconceptions
Biased mutation has long been known to affect codon usage (Muto and Osawa 1987; Sueoka 1964; Xia and Yuen 2005; Xia et al. 2002). The third codon position is the most amenable to mutation bias (Fig. 9.4) because most nucleotide substitutions at the third codon position are synonymous. Nucleotide substitutions are synonymous at some first codon positions but nonsynonymous at all second codon positions. Furthermore, nucleotide substitutions at the second codon position typically involve rather different amino acids and therefore should be subject to strong purifying selection (Xia 1998b; Xia and Li 1998). One would therefore predict that GC% at the third codon position should increase more rapidly with genomic GC% than GC% at the first codon position, which in turn should increase more rapidly with genomic GC% than GC% at the second codon position. The empirical results (Fig. 9.3) strongly support the prediction (Muto and Osawa 1987).
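The quantities compared in this prediction, GC% at each of the three codon positions, can be computed as follows. This is a minimal sketch with a hypothetical function name and a made-up toy sequence; for a genome one would pool all annotated CDSs before computing the three values.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3): the fraction of G or C at codon positions 1, 2, 3.

    Trailing nucleotides that do not form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence contains no complete codon")
    n_codons = usable // 3
    fractions = []
    for offset in range(3):
        gc = sum(1 for i in range(offset, usable, 3) if seq[i] in "GC")
        fractions.append(gc / n_codons)
    return tuple(fractions)

if __name__ == "__main__":
    toy = "AAGAACGAG"  # codons AAG AAC GAG
    gc1, gc2, gc3 = gc_by_codon_position(toy)
    print(f"GC1={gc1:.3f} GC2={gc2:.3f} GC3={gc3:.3f}")
```

Plotting these three values against genomic GC% across species reproduces the kind of comparison shown in Fig. 9.3.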
Fig. 9.4.
Base pairs between nucleotides at the first anticodon site (which can have I, G, C, U but rarely A) and the third codon site. The inset shows the site numbering system of codon and anticodon, with codon sites subscripted with 1, 2, and 3 and anticodon sites subscripted with I, II, and III, which is illustrated by the pairing of I_I/C_3, G_II/C_2, and C_III/G_1.
Fig. 9.3.
Correlation of GC% between genomic DNA and first, second, and third codon positions (Muto and Osawa 1987). While the actual position of the points may be substantially revised with new genomic data (e.g., the GC% for the first, second, and third codon positions for Mycoplasma capricolum is 35.8%, 27.4%, and 8.8% based on all annotated CDSs in the genomic sequence), the general trend remains the same
However, the pattern in Fig. 9.3, while consistent with the mutation hypothesis, has resulted in two misconceptions. First, the pattern shown by the third codon position is often interpreted to reflect mutation bias alone. This interpretation is incorrect because the third codon position is also subject to selection by differential availability of tRNA species (Carullo and Xia 2008; Xia 1998a, 2005, 2008; Xia et al. 2007). We may contrast a GC-rich Streptomyces coelicolor and a GC-poor Mycoplasma capricolum as an illustrative example. M. capricolum has no tRNA with a C or G at the wobble site for fourfold codon families (Ala, Gly, Pro, Thr, and Val), i.e., the translation machinery would be inefficient in translating C-ending or G-ending codons. This implies selection in favor of A-ending or U-ending codons and will consequently reduce GC% at the third codon position. This most likely has contributed to the low GC% at the third codon position in M. capricolum. In contrast, most of the tRNA genes translating the five fourfold codon families in the GC-rich S. coelicolor have G or C at the wobble site, and should favor the use of C-ending or G-ending codons. This most likely has contributed to the high GC% at the third codon position in S. coelicolor. In these two cases, mutation bias and tRNA-mediated selection act in the same direction to drive GC% at the third codon position up or down. The same pattern is observed for twofold codon families. The most conspicuous one is the Gln codon family (CAA and CAG). There is only one tRNAGln gene in M. capricolum, with a UUG anticodon favoring the CAA codon. In contrast, there are two tRNAGln genes in S. coelicolor, both with a CUG anticodon favoring the CAG codon. Thus, the high slope for the third codon position in Fig. 9.3 is at least partially attributable to tRNA-mediated selection.
Relative contribution of mutation and tRNA-mediated selection to codon usage has been evaluated in several recent studies (Carullo and Xia 2008; Xia 2005, 2008; Xia et al. 2007).
The second misconception arising from Fig. 9.3 is that the frequency of G-ending and C-ending codons will increase and A-ending and U-ending codons decrease, with genomic GC% or GC-biased mutation (Kliman and Bernal 2005). This is not generally true (Palidwor et al. 2010). Take the arginine codons, for example. Given the transition probability matrix for the six synonymous codons shown in Table 9.6, the equilibrium frequencies (π) for the six codons are
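$$
\pi_{AGA} = \frac{1}{2k^2 + 3k + 1}, \qquad
\pi_{AGG} = \pi_{CGA} = \pi_{CGT} = \frac{k}{2k^2 + 3k + 1}, \qquad
\pi_{CGC} = \pi_{CGG} = \frac{k^2}{2k^2 + 3k + 1}
\tag{9.1}
$$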
The three solutions correspond to the number of G or C nucleotides in the codon, with AGA having one, AGG, CGA, and CGT having two, and CGC and CGG having three. One may note that the G-ending codon AGG has the same equilibrium frequency as the A-ending CGA and the T-ending CGT. Thus, we should not expect A-ending or T-ending codons to always decrease, or G-ending and C-ending codons to always increase, with increasing genomic GC% or GC-biased mutation. In fact, according to the solutions in Eq. (9.1), πAGG, πCGA, and πCGT will first increase with k until k reaches √2/2 (≈ 0.71) and will then decrease with k when k > √2/2 (Palidwor et al. 2010).
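The equilibrium frequencies can also be checked numerically. The following is a minimal sketch, not taken from the cited work: it rebuilds the matrix of Table 9.6 from its stated rules (α for transitions, β for transversions, a factor k for any change that replaces A or T with G or C), iterates the chain until it settles, and compares the result with the closed form that follows from detailed balance for this matrix. The function names and the particular values of α and β are assumptions chosen for illustration.

```python
CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]
TRANSITIONS = {frozenset("CT"), frozenset("AG")}


def rate(src, dst, k, alpha, beta):
    """One-step probability of src -> dst under the rules of Table 9.6."""
    diffs = [(a, b) for a, b in zip(src, dst) if a != b]
    if len(diffs) != 1:  # identical codons, or more than one nucleotide change
        return 0.0
    old, new = diffs[0]
    p = alpha if frozenset((old, new)) in TRANSITIONS else beta
    if old in "AT" and new in "GC":  # changes that gain a G or C are scaled by k
        p *= k
    return p


def transition_matrix(k, alpha=0.1, beta=0.05):
    """Rows are source codons; the diagonal makes each row sum to 1."""
    rows = []
    for i, src in enumerate(CODONS):
        row = [rate(src, dst, k, alpha, beta) for dst in CODONS]
        row[i] = 1.0 - sum(row)
        rows.append(row)
    return rows


def equilibrium(k, n_iter=5000):
    """Approximate the stationary distribution by repeated pi <- pi P."""
    P = transition_matrix(k)
    pi = [1.0 / 6] * 6
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(6)) for j in range(6)]
    return dict(zip(CODONS, pi))


def closed_form(k):
    """Equilibrium frequencies proportional to 1/k, 1, and k by GC content."""
    d = 2 * k * k + 3 * k + 1
    return {"AGA": 1 / d, "AGG": k / d, "CGA": k / d, "CGT": k / d,
            "CGC": k * k / d, "CGG": k * k / d}


for k in (0.5, 1.0, 2.0):
    numeric = equilibrium(k)
    print(k, {c: round(numeric[c], 4) for c in CODONS})
```

The default α and β keep every row sum below 1 only for moderate k, so larger values of k would need smaller rates.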
Two Hypotheses on Translation Elongation Efficiency
The degree to which protein production is limited by translation elongation is controversial. Early theoretical considerations (Andersson and Kurland 1983; Bulmer 1990, 1991; Liljenstrom and von Heijne 1987) tended to favor the argument that translation initiation, not translation elongation, is rate-limiting in protein production. This hypothesis does not deny the existence of codon adaptation, but it asserts that codon-anticodon adaptation and increased elongation efficiency are not related to protein production. Instead, the benefit of codon adaptation and increased elongation efficiency is to increase ribosomal availability for global translation. This hypothesis was explicitly formulated and empirically tested only recently (Kudla et al. 2009).
We thus have two alternative hypotheses attributing different benefits to codon-anticodon adaptation. The first assumes that protein production is rate-limited by both initiation and elongation and that codon-anticodon adaptation would result in higher elongation efficiency and more efficient and accurate protein production, especially for HEGs. The second claims that protein production is rate-limited only by initiation efficiency but that improved codon adaptation, and consequently increased elongation efficiency, has the benefit of increasing ribosomal availability for global translation.
Table 9.6.
Transition probability matrix for the six synonymous arginine codons, with α for transitions (C↔T and A↔G), β for transversions, and k modeling AT-biased mutation (0 ≤ k ≤ 1) or GC-biased mutation (k > 1)
| | CGT | CGC | CGA | CGG | AGA | AGG |
|---|---|---|---|---|---|---|
| CGT | | kα | β | kβ | 0 | 0 |
| CGC | α | | β | β | 0 | 0 |
| CGA | β | kβ | | kα | β | 0 |
| CGG | β | β | α | | 0 | β |
| AGA | 0 | 0 | kβ | 0 | | kα |
| AGG | 0 | 0 | 0 | kβ | α | |
We ignore nonsynonymous substitutions because nonsynonymous substitution rate is often negligibly low compared to synonymous rate. The diagonal is constrained by the row sum equal to 1
How should we go about testing these two hypotheses? Note that the two hypotheses make different predictions about the relationship among three variables: (1) translation initiation efficiency, (2) translation elongation efficiency, and (3) protein production. Before we can test these two hypotheses, we need to understand how these variables can be measured. The previous chapter outlines a few factors contributing to translation initiation efficiency. Here we first learn a few indices of codon usage bias as a proxy for translation elongation efficiency and then include them in the test of the two hypotheses in the section illustrating the application of index of translation elongation (Xia 2015).
Wobble Hypothesis and Its Extensions
The wobble hypothesis was proposed to explain how a limited set of tRNA molecules can decode the much larger number of sense codons. The wobble-pairing rules are specified in Fig. 9.4, together with the numbering system used here for individual codon and anticodon sites that is more precise than, but different from, the conventional one. The original wobble hypothesis (Crick 1966), with its extended codon-anticodon base pairs (Fig. 9.4), played a crucial role in understanding the working of the translation machinery. It explains why tRNAIle/IAU, where I in IAU is inosine derived from A, is able to translate all three Ile codons (AUC, AUU, and, albeit inefficiently, AUA), why a tRNA with a GI can translate Y-ending codons (where Y stands for C or U), and why a tRNA with a UI can translate R-ending codons (where R stands for A or G). The hypothesis also explains the lack of AI in tRNA genes for decoding twofold Y-ending codon families, because such a tRNA, when its AI is modified to II, would misread the near-cognate R-ending codons.
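The pairing rules described above can be written down as a small lookup table. The sketch below is a deliberate simplification that I offer only for illustration: it assumes Watson-Crick pairing at anticodon sites II and III, uses only the wobble pairs named in the text for anticodon site I (the site written first when the anticodon is given 5' to 3'), and ignores nucleotide modifications other than inosine. The names `WOBBLE` and `codons_read` are hypothetical.

```python
# Watson-Crick partners, assumed for anticodon sites II and III.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Nucleotides at codon site 3 that each nucleotide at anticodon site I can read,
# following the original wobble hypothesis as summarized in the text.
WOBBLE = {
    "I": "UCA",  # inosine reads U, C and, inefficiently, A
    "G": "CU",   # reads Y-ending codons
    "U": "AG",   # reads R-ending codons
    "C": "G",
    "A": "U",
}


def codons_read(anticodon):
    """Return the codons (5'->3') decoded by an anticodon written 5'->3'."""
    site_i, site_ii, site_iii = anticodon
    first = WATSON_CRICK[site_iii]   # codon site 1 pairs with anticodon site III
    second = WATSON_CRICK[site_ii]   # codon site 2 pairs with anticodon site II
    return [first + second + third for third in WOBBLE[site_i]]


print(codons_read("IAU"))  # the three Ile codons
print(codons_read("GCA"))  # the two Cys codons
print(codons_read("UUG"))  # the two Gln codons
```

A table like this makes it easy to see why a single tRNA with a GI suffices for a Y-ending family, whereas reading a fourfold family with one tRNA requires either inosine plus a second tRNA or an unmodified UI with relaxed pairing, which this sketch does not model.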
Wobble pairing reduces the number of tRNAs needed for translation and simplifies the translation machinery. As an example of parsimonious tRNA usage, the Y-ending codons, be they in twofold or fourfold codon families, are decoded by tRNAs with either a II or a GI, but never both. This rule is obeyed in all three kingdoms of life. Almost all fourfold codon families in Mycoplasma pulmonis (including the Ser UCN codon family and Leu CUN codon family) are decoded by a single tRNA species with a UI, except for the Thr ACN and Arg CGN codon families, which are each decoded by two tRNA species, one with a UI and the other with a GI. The most dramatic simplification of tRNome is observed in vertebrate mitochondrial genomes, which contain only 22 tRNA genes, with each tRNA species decoding a codon family. Instead of separate initiation tRNAiMet/CAU and elongation tRNAeMet/CAU present in all nuclear genomes, a single tRNAMet/CAU, with a modified CI, decodes both the initiation AUG codon and internal Met AUR codons. Each Y-ending codon family is decoded by a single tRNA species with a wobble GI and each R-ending codon family by a single tRNA with a wobble UI which is modified to prevent its pairing with U or C. All fourfold codon families are decoded by a tRNA with a wobble UI which is not modified.
Wobble pairing is not without cost as it often reduces translation efficiency and accuracy and is generally avoided (Xia 2008). For example, an II/A3 pair is bulky because it involves two purines (Fig. 9.4) in contrast to other base pairs which typically involve a large purine and a small pyrimidine. For this reason, Ile is rarely coded by AUA except for certain viruses with a strong A-biased mutation (van Weringh et al. 2011). Among a set of highly expressed genes in the yeast (Saccharomyces cerevisiae), AUA is not used at all (Table 9.5). Similarly, a tRNA with a UI can translate A-ending codons better than G-ending codons (Grosjean et al. 2010; Xia 2008). Most of the yeast tRNAArg have a UI, and only one AGG codon is found in contrast to 314 AGA codons in highly expressed yeast genes (Table 9.4). Yeast genomic data also suggest that a tRNA with a GI can translate C-ending codons better than U-ending codons. For example, the yeast tRNAAsn genes translating the Asn AAY codon family all have a GI. Among 219 Asn codons in highly expressed yeast genes, only 11 are AAU codons, suggesting strong selection against AAU codons in favor of AAC codons (Table 9.4). Note that the yeast genome is strongly AT-biased. If there were no selection against AAU codons, we would expect more AAU codons than AAC codons, which is contrary to the observed frequencies. However, the selection against GI/U3 pair is in general much weaker than that against UI/G3 pair. In fungal mitochondrial genomes, there is no avoidance of GI/U3 pair in favor of GI/C3 pair, although UI/G3 pair is strongly avoided in favor of UI/A3 pair (Xia 2008).
The weak, or lack of, selection against GI/U3 can explain several puzzling counterexamples against the codon-anticodon adaptation theory (Bulmer 1991; Ikemura 1981b; Xia 1998a) which states that the most frequently used codon in each synonymous codon family should form Watson-Crick base pairing with the anticodon of the most abundant tRNA species to reduce translation error and increase translation efficiency. For example, Cys codons (UGY) are translated by tRNACys/GCA in both cytoplasm and mitochondria in the yeast, yet most Cys codons have U3. If there is little selection against GI/U3 pair (i.e., GI/U3 pair is as efficient and accurate as GI/C3 pair), then the frequencies of UGC and UGU will be mostly determined by AT-bias. Because the yeast nuclear and mitochondrial genomes are both AT-rich, we have more UGU codons than UGC codons, in spite of GI in tRNACys. The weak selection against GI/U3 but strong selection against UI/G3 also explains why Y-ending codons are typically translated by a tRNA with a GI, whereas R-ending codons are typically translated by two different tRNAs, one with a UI and the other with a CI (Xia 2008).
The wobble hypothesis points to the necessity of nucleotide modification in tRNA to either increase or decrease the wobble versatility to improve accuracy and efficiency of translation. The observation that an unmodified UI can pair with all N3 in many mitochondrial genomes suggests that UI in tRNA for twofold R-ending codon families needs to be modified to restrict its wobble versatility to avoid misreading the near-cognate Y-ending codons. Chemical modification of UI to restrict its pairing versatility to R3 in twofold R-ending codon families is universal in all three kingdoms of life and in organelles (Grosjean et al. 2010; Lim 1994). On the other hand, the tRNAMet/CAU in vertebrate mitochondria needs to read both the initiation AUG codon and the internal AUG and AUA codons, and its CI is modified to f5CI to increase its wobble versatility so as to form a f5CI/A3 pairing between the anticodon and the AUA codon. Nucleotide modification in tRNA has been extensively reviewed (Grosjean et al. 2010) and chemically detailed in MODOMICS (Czerwoniec et al. 2009).
Wobble pairing implies the theoretical possibility of adding new base pairs of novel nucleotides to protein-coding genes to increase the coding capacity (Hirao and Kimoto 2010). A single novel base pair, involving two novel nucleotides, would increase the number of codons from 64 to 216 (= 6³), and one can then use these extra codons, together with engineered tRNAs to recognize these codons and to carry new amino acid analogs, to produce novel proteins.
The wobble hypothesis can be extended to explain the lack of UCG anticodon in Arg CGN codon family in a large number of evolutionary lineages. A tRNA species with a wobble UI is almost always present among tRNA species decoding fourfold codon families and twofold R-ending codon families, with most exceptions observed in the Arg CGN codon family. In the mitochondrial genomes of Caenorhabditis elegans (metazoan), Marchantia polymorpha (plant), Pichia canadensis (fungus), and (fungus), there is no tRNAArg/UCG, and Arg CGN codon family is decoded by tRNAArg/ACG (Xia 2005). The lack of tRNAArg/UCG in the mitochondrial genome of these diverse taxa suggests that the lack is an ancestral state and that the presence of tRNAArg/UCG in vertebrate mitochondria is a derived state. This is substantiated by the fact that almost all eubacterial species, from which the mitochondrion was originally derived, lack tRNAArg/UCG (Grosjean et al. 2010).
The expanded wobble hypothesis for the lack of tRNAArg/UCG requires an extension of the wobble hypothesis by invoking wobble pairing between the third anticodon site (NIII) and the first codon site (N1), conditional on a CII/G2 or GII/C2 with three hydrogen bonds. Thus, the anticodon UCG would wobble-pair with stop codon UGA through a wobble GIII/U1 pair and should therefore be strongly selected against (Carullo and Xia 2008). This explains not only the absence of tRNAArg/UCG in diverse evolutionary lineages but in particular why tRNAArg/UCG is absent in most eubacterial species and ancestral mitochondrial lineages where UGA is used as a stop codon and why it is present in derived mitochondrial lineages such as vertebrate mitochondrial genomes where UGA is no longer used as a stop codon.
Commonly Used Codon Usage Indices
There are two key factors contributing to codon usage bias: the mutation bias (Osawa et al. 1987) and the tRNA-mediated selection (Ikemura 1981a, 1982, 1992; Xia 1998a, 2015). There are also two types of codon usage indices, but they do not correspond to the two factors shaping codon usage. The first type of codon usage indices is codon-specific, best represented by relative synonymous codon usage (RSCU, Sharp et al. 1986), which measures deviation of codon usage from equal usage. The second type of codon usage indices is gene-specific, with several well-known representatives including effective number of codons (ENC, Sun et al. 2013; Wright 1990), codon adaptation index (CAI, Sharp and Li 1987; Xia 2007c), codon bias index (CBI, Bennetzen and Hall 1982), frequency of optimal codons (Fop, Ikemura 1985), tRNA adaptation index (tAI, dos Reis et al. 2004), and index of translation elongation (ITE, Xia 2015).
ENC aims to measure deviation of codon usage from equal usage and may be considered as the gene-specific equivalent of the codon-specific RSCU. They are both descriptive and do not distinguish between mutation bias and tRNA-mediated selection in their contribution to codon usage bias. All other gene-specific indices aim to measure the intensity of the tRNA-mediated selection on codon usage bias. A gene encoding a mass-produced (highly expressed) protein is expected to be under stronger selection to optimize its codon usage corresponding to differential tRNA availability than a gene encoding a lowly expressed protein, and we expect CAI, CBI, tAI, and ITE to be greater for the highly expressed gene than the lowly expressed gene. However, CAI, CBI, and tAI ignore background mutation bias. ITE is a generalization of CAI, by incorporating background mutation, and is reduced to CAI when there is no background mutation bias (Xia 2015).
Codon indices that aim to measure tRNA-mediated selection (i.e., CAI, CBI, Fop, tAI, and ITE) all define a translationally optimal codon (TOC) within each codon family, and the codon usage index value will be the highest if all codons in a gene are TOCs. However, TOC is defined differently among these indices. CBI, Fop, and tAI define a TOC mainly as one that corresponds to the most abundant isoacceptor tRNA, with CBI incorporating gene expression information as well. CAI defines a TOC as one in its codon family that is used most frequently in HEGs. ITE defines a TOC as one in its codon family that is used most frequently in HEGs after adjustment of mutation bias reflected in LEGs. Comparative studies (Coghlan and Wolfe 2000; Comeron and Aguade 1998) suggest that CAI is better than ENC, CBI, and Fop in predicting gene expression levels, tAI is better than CAI (dos Reis et al. 2004; Tuller et al. 2010), and ITE is better than CAI and tAI (Xia 2015). However, such comparison depends not only on the methods but also on the quality of the software that implements the methods. A method could be conceptually sound but implemented erroneously and generate poor results. Moreover, the same index could be implemented differently. For example, one implementation could group all synonymous codons into one family so that some families could have six or even eight synonymous codons (trematode mitochondrial code has eight Ser codons: UCN and AGN), whereas another implementation would break all compound codon families, such as Leu, Ser, and Arg codon families, into separate fourfold and twofold codon families.
RSCU (Relative Synonymous Codon Usage)
RSCU measures codon usage bias for each codon within each codon family. It is essentially a normalized codon frequency so that the expectation is 1 when there is no codon usage bias. A codon is overused if its RSCU value is greater than 1 and underused if its RSCU value is less than 1. It is computed directly from input sequences.
Calculation of RSCU
The general equation for computing RSCU is
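$$
\mathrm{RSCU}_{ij} = \frac{\mathrm{CodFreq}_{ij}}{\left( \sum_{j=1}^{\mathrm{NumCodon}_i} \mathrm{CodFreq}_{ij} \right) / \mathrm{NumCodon}_i}
\tag{9.2}
$$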
where i refers to a codon family and j to a specific codon within the family. For example, i may refer to the alanine codon family with four codons (GCU, GCC, GCA, and GCG) and j to a specific codon such as GCU. In this case, the numerator is the frequency of GCU, and the denominator is the summation of the four codon frequencies divided by the number of codons in the codon family, i.e., 4.
For biology students, it is always easier to learn by numerical examples. Suppose we counted the codon frequencies of one particular protein-coding sequence and have obtained the codon frequencies (Table 9.7). The RSCU for the GCU codon is computed, according to Eq. (9.2), as
$$
\mathrm{RSCU}_{GCU} = \frac{52}{(52 + 91 + 103 + 2)/4} = \frac{52}{62} \approx 0.84
\tag{9.3}
$$
which is displayed in Table 9.7. Biology students are recommended to cover up the last column in Table 9.7 and finish the computation of the rest of the RSCU values.
Table 9.7.
Data for illustrating the calculation of RSCU
| Codon | AA | N | RSCU |
|---|---|---|---|
| GCU | Ala | 52 | 0.84 |
| GCC | Ala | 91 | 1.47 |
| GCA | Ala | 103 | 1.66 |
| GCG | Ala | 2 | 0.03 |
| GAA | Glu | 78 | 1.64 |
| GAG | Glu | 17 | 0.36 |
| … | … | … | … |
AA amino acid, N codon frequency
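Readers who prefer to check their hand calculation by computer may find the following minimal sketch useful. It computes RSCU as described above (observed codon frequency divided by the mean frequency within the codon family) for the Ala and Glu rows of Table 9.7. The function name `rscu` and the way codon families are passed in are illustrative assumptions, not the interface of DAMBE or any other program.

```python
def rscu(counts, families):
    """Relative synonymous codon usage.

    counts:   dict mapping codon -> observed count
    families: dict mapping family name -> list of synonymous codons
    """
    result = {}
    for codons in families.values():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # RSCU is undefined for a family absent from the sequence
        expected = total / len(codons)  # count expected under equal usage
        for c in codons:
            result[c] = counts.get(c, 0) / expected
    return result


# Codon counts from Table 9.7 (only the rows shown there).
counts = {"GCU": 52, "GCC": 91, "GCA": 103, "GCG": 2, "GAA": 78, "GAG": 17}
families = {"Ala": ["GCU", "GCC", "GCA", "GCG"], "Glu": ["GAA", "GAG"]}

for codon, value in rscu(counts, families).items():
    print(codon, round(value, 2))
```

Within each family the RSCU values sum to the number of codons in the family, which is a convenient sanity check.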
Illustration of RSCU Applications
As I mentioned earlier, a variable such as RSCU is often not interesting by itself, but it becomes more interesting when you relate the variable to some other variables. Figure 9.5 shows the correlation between RSCU for E. coli genes and that for the double-stranded DNA (dsDNA) phage TLS. This strong and positive correlation suggests adaptation of phage codon usage to the host tRNA pool. The adaptation of the phage genes and the host genes to the same tRNA pool in E. coli cells, and the resulting evolution of very similar codon usage patterns, is an example of convergent evolution, i.e., phylogenetically remote organisms evolving similar features not due to coancestry, but in response to the same selection regime induced by the same environment.
Fig. 9.5.
Correlation in RSCU between E. coli and its double-stranded DNA phage TLS
What explanation would you offer if we find little correlation in RSCU between a phage and its host? There are in fact a large number of cases in which a virus and its host share little similarity in codon usage. Will such cases invalidate our convergent evolution explanation for the strong and positive correlation between phage TLS and its host? Science thrives in questions, and such questions immediately drive us to search for answers, and the answers enrich our explanatory conceptual framework. Ronald Fisher once said that “No aphorism is more frequently repeated in connection with field trials, than that we must ask Nature few questions, or ideally, one question at a time. The writer is convinced that this view is wholly mistaken. Nature, he suggests, will respond to a logical and carefully thought-out questionnaire; indeed, if we ask her a single question, she will often refuse to answer until some other topic has been discussed” (Fisher 1926).
There are at least six factors that will weaken the correlation in RSCU between a virus and its host. First, some dsDNA phages carry many tRNA genes on their own genomes, and the transcription of these tRNA genes would modify the host tRNA pool. For example, another dsDNA phage, enterobacteria phage WV8, carries 20 tRNA genes on its genome. In such cases, the phage genes would adapt to the modified tRNA pool which may be different from the tRNA pool where E. coli mRNAs are translated normally (i.e., without phage infection). Partly for this reason, the correlation in RSCU between enterobacteria phage WV8 and its E. coli host is much weaker than that shown in Fig. 9.5 (Chithambaram et al. 2014a). Phage TLS (Fig. 9.5) happens to have a genome that does not encode any tRNA genes of its own, so it depends entirely on the host tRNA pool to decode the codons of its genes.
Second, codon usage adaptation takes time. If a phage having adapted to one host has switched to a new host, and if the original host and the new host differ in their tRNA pools, then the phage codon usage will be more similar to that of the original host than the new host. This may be applicable to phage PRD1 which belongs to the peculiar Tectiviridae family with members parasitizing both gram-negative and gram-positive bacteria. Phage PRD1 is the only species in the family known to parasitize gram-negative bacteria, with other members of the family, i.e., phages PR3, PR4, PR5, L17, and PR772, parasitizing gram-positive bacteria (Bamford et al. 1995; Grahn et al. 2006). It is reasonably safe to assume that the phage PRD1 lineage has switched host from gram-positive to gram-negative bacteria. Furthermore, there is only one amino acid difference in the coat protein between phages PRD1 and PR4 (Bamford et al. 1995). This suggests that PRD1 is phylogenetically close to its relatives parasitizing gram-positive bacteria, i.e., the host switch may have occurred quite recently. In fact, codon usage in phage PRD1 is more similar to that in gram-positive bacteria than in gram-negative bacteria (Chithambaram et al. 2014b). Among 87 bacterial genomes covering major groups of bacterial species, the host species with codon usage most similar to that of phage PRD1 are strains in the gram-positive Geobacillus (NC_014206, NC_012793, NC_014650, NC_014915, NC_013411).
Third, a phage with a wide range of host species may face diverse tRNA pools that would represent fluctuating selection with different optima. Phage PRD1 mentioned above does have a variety of gram-negative bacteria as hosts, including Salmonella, Pseudomonas, Escherichia, Proteus, Vibrio, Acinetobacter, and Serratia species (Bamford et al. 1995; Grahn et al. 2006). However, this diverse array of hosts actually has rather similar codon usage, so host variability is not a good explanation for the lack of similarity in codon usage between PRD1 and E. coli (Chithambaram et al. 2014b).
Fourth, the tRNA-mediated selection differs in its effectiveness between temperate phages (i.e., those with lysogeny) and virulent phages (i.e., those without lysogeny). The lysogenic phase effectively hides protein-coding genes of the phage from tRNA-mediated selection, and the phage codon usage will be at the mercy of mutation bias in the host genome. In contrast, virulent phages have their codon usage under tRNA-mediated selection every time they enter the host cell. For this reason, one would expect better codon usage adaptation in virulent phages than in temperate phages, which is true (Prabhakaran et al. 2015).
Fifth, mass translation of phage mRNA often occurs in the late infection phase when the host cellular environment has already been dramatically altered, presumably with a quite different tRNA pool in the late phase from that in the early phase. In vaccinia virus, the degradation of host mRNA appears nearly complete 6 h after the viral infection as no host poly(A) mRNA is detectable at/after this time (Katsafanas and Moss 2007). Shutdown or drastic alteration of host protein and RNA expression implies that many tRNA species are no longer sequestered for host translation, which would dramatically alter availability of different tRNA species. Many other viruses, including hepatitis C (Chan and Egan 2009), SARS (Minakshi et al. 2009), Japanese encephalitis virus (Su et al. 2002), and coxsackie B2 virus (Zhang et al. 2010), can induce stress responses such as the UPR (unfolded protein response) in late phase. UPR often results in the shutdown of transcription of ribosomal RNAs as well as repression of translation via phosphorylation of eukaryotic translation initiation factor eIF-2α (DuRose et al. 2009). All these suggest that the tRNA pool in the late phase differs from that in the normal cell. If codon usage of phage genes adapts to the altered tRNA pool in the late phase, whereas that of host genes adapts to the tRNA pool in normal cells, then we should not expect the parasite and the host to share high similarity in codon usage. Interestingly, HIV-1 early genes have RSCU positively correlated with RSCU of human genes, but HIV-1 late genes have RSCU values negatively correlated with RSCU of human genes (van Weringh et al. 2011).
Sixth, if mutation bias is in a different direction from tRNA-mediated selection, e.g., if tRNA-mediated selection favors Y-ending codons whereas mutation bias favors R-ending codons (where Y and R stand for pyrimidine and purine, respectively), then strong mutation bias will disrupt selection. This may well be the case for the poor codon adaptation in HIV-1. According to a recent compilation of tRNAs in human genome (Chan and Lowe 2009), the AUC codon can be translated by 17 tRNAIle species (14 tRNAIle/IAU and 3 tRNAIle/GAU) and AUU can be translated by 14 tRNAIle/IAU species, whereas AUA can be translated by only 5 tRNAIle/UAU species. In agreement with the tRNA-mediated selection, human genes code Ile mostly by AUC and least by AUA. In contrast, HIV-1 genes code Ile mostly by AUA and least by AUC (Haas et al. 1996; Nakamura et al. 2000). The poor codon adaptation of HIV-1 (Fig. 9.6a) reduces the translation efficiency of HIV-1 genes. Modifying HIV-1 codon usage according to host codon usage has been shown to increase the production of viral proteins (Haas et al. 1996; Ngumbela et al. 2008). The high frequency of maladaptive AUA codons in HIV-1 genes is due to high A-biased mutation at the third codon position of HIV-1 genes (Jenkins and Holmes 2003). The A-bias is mediated by the error-prone reverse transcriptase (Martinez et al. 1994; Vartanian et al. 2002) and the human APOBEC3 protein (Yu et al. 2004). The frequency of A can reach up to 40% in some HIV-1 genomes (Vartanian et al. 2002), resulting in a preponderance of A-ending codons which are typically rarely used in the human HEGs (Kypr and Mrazek 1987; Sharp 1986).
Fig. 9.6.
Relative synonymous codon usage (RSCU) of HIV-1 (a) and HTLV-1 (b) plotted against RSCU of highly expressed human genes. Modified from van Weringh et al. (2011)
Without such strong A-biased mutation, one would predict a better correlation in RSCU between viral genes and highly expressed human genes. One viral species that may shed light on this prediction is HTLV-1 which infects the same type of host cell as HIV-1. Both HIV-1 and HTLV-1 are retroviruses with RNA genomes, but HTLV-1 is exceptional in that it does not have a strong A-biased mutation (Van Dooren et al. 2004; van Hemert and Berkhout 1995). HTLV-1 relies for the most part on the host polymerase to replicate through clonal expansion of infected cells rather than undergoing iterative replication cycles like HIV-1 (Strebel 2005). The substitution rate of HTLV-1 is consequently lower, about 5.2 × 10⁻⁶ substitutions/site/year (Hanada et al. 2004; Van Dooren et al. 2004), whereas that of HIV-1 is around 2.5 × 10⁻³ substitutions/site/year (Hanada et al. 2004). Thus, although HTLV-1 infects the same cells as HIV-1, i.e., human CD4+ T cells (Rimsky et al. 1988), and both viruses are therefore subject to the same selective pressures on codon usage by the host tRNA pool, mutations are less likely to disrupt codon-anticodon adaptation in HTLV-1 than in HIV-1 as they occur at a lower rate in the former. The positive correlation in RSCU between HTLV-1 and highly expressed human genes (Fig. 9.6b) is highly significant (Pearson r = 0.4982, p < 0.0001, Spearman r = 0.4688, p = 0.0002).
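Comparisons such as those in Figs. 9.5 and 9.6 reduce to correlating two RSCU vectors over the codons they share. The sketch below shows one way this could be done with Pearson and Spearman coefficients in plain Python. It is an illustration under stated assumptions: the function names are hypothetical, the RSCU values are invented toy numbers rather than data from any of the cited studies, and no p-values are computed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            r[order[m]] = mean_rank
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rscu_correlation(rscu_virus, rscu_host):
    """Correlate two codon -> RSCU dicts over the codons present in both."""
    shared = sorted(set(rscu_virus) & set(rscu_host))
    x = [rscu_virus[c] for c in shared]
    y = [rscu_host[c] for c in shared]
    return pearson(x, y), spearman(x, y)


# Invented toy values, for illustration only.
virus = {"GCU": 0.9, "GCC": 1.3, "GCA": 1.5, "GCG": 0.3, "GAA": 1.4, "GAG": 0.6}
host = {"GCU": 0.7, "GCC": 1.6, "GCA": 1.2, "GCG": 0.5, "GAA": 1.1, "GAG": 0.9}
print(rscu_correlation(virus, host))
```

In a real analysis one would typically exclude single-codon families and stop codons before computing the correlation, because their RSCU values carry no information about codon preference.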
CAI (Codon Adaptation Index)
CAI has been used extensively in biological research. Other than its primary use for measuring the efficiency of translation elongation, it has contributed to the finding that functionally related genes are conserved in their expression across different microbial species (Lithwick and Margalit 2005), to the prediction of protein production (Futcher et al. 1999; Gygi et al. 1999), and to the optimization of DNA vaccines (Ruiz et al. 2006).
Calculation of CAI
While RSCU characterizes codon usage bias in each codon family, CAI quantifies the codon usage bias in one gene. It is based on (1) the codon frequencies of the gene and (2) the codon frequencies of a set of known HEGs (often referred to as the reference set). The reference set of genes is used to generate a column of w values computed as
$$
w_{ij} = \frac{\mathrm{RefCodFreq}_{ij}}{\mathrm{RefCodFreq}_{i.\max}}
\tag{9.4}
$$
where RefCodFreqij is the frequency of codon j in synonymous codon family i and RefCodFreqi.max is the maximum codon frequency in synonymous codon family i. For example, if the four alanine codons GCA, GCC, GCG, and GCU have frequencies 20, 4, 4, and 2, respectively, then their associated w values are 1, 0.2, 0.2, and 0.1, respectively. The codon whose frequency is RefCodFreqi.max is often referred to as the major codon (whose w is 1), and the other codons in the synonymous codon family are referred to as minor codons. The major codon is assumed to be the translationally optimal codon.
It is easy to see the relationship between wij and RSCU. The former is obtained by dividing each RSCU by the largest RSCU value within each codon family. With the w values for a particular species, we can now compute the CAI value of any protein-coding sequence from the species by using the following equation:
$$
\mathrm{CAI} = \exp\left( \frac{\sum_{i=1}^{n} \mathrm{CodFreq}_i \ln(w_i)}{\sum_{i=1}^{n} \mathrm{CodFreq}_i} \right)
\tag{9.5}
$$
where CodFreqi is the frequency of sense codon i in the gene, wi is its w value from Eq. (9.4), and n is the number of sense codons (excluding codon families with a single codon, e.g., AUG for methionine and UGG for tryptophan in the standard genetic code). Note that the exponent is simply a weighted average of ln(w). Because the maximum of w is 1, ln(w) will never be greater than 0. Consequently, the exponent will never be greater than 0. Thus, the maximum CAI value is 1. The minimum CAI depends on the w values for minor codons in each codon family. If the minor codons all have w values close to zero, then the minimum CAI will also be very close to zero.
The calculation of CAI is numerically illustrated in Table 9.8 for a gene whose observed codon frequencies are in column “ObsFreq.” The codon frequency of the highly expressed reference set is in column “RefCodFreq.” The column “w” is obtained by dividing RefCodFreq values by the largest value in the codon family. For example, the first w value in the table, 0.606, is obtained by dividing RefCodFreq value 195 by the largest RefCodFreq value in the alanine codon family, i.e., 322. We take a weighted average of ln(w) as shown in Eq. (9.5) and then exponentiate it to obtain CAI.
Table 9.8.
Illustration of CAI calculation for a gene whose observed codon frequencies are in column “ObsFreq”
| Codon | AA | ObsFreq | RefCodFreq | w |
|---|---|---|---|---|
| GCA | A | 1 | 195 | 0.606 |
| GCU | A | 15 | 322 | 1.000 |
| GCG | A | 0 | 81 | 0.252 |
| GCC | A | 8 | 242 | 0.752 |
| UGC | C | 3 | 123 | 1.000 |
| UGU | C | 3 | 112 | 0.911 |
| GAU | D | 9 | 69 | 1.000 |
| GAC | D | 11 | 40 | 0.580 |
| GAG | E | 11 | 289 | 0.863 |
| GAA | E | 14 | 335 | 1.000 |
| UUU | F | 3 | 118 | 0.554 |
| UUC | F | 9 | 213 | 1.000 |
| … | … | … | … | … |
The codon frequency of the highly expressed reference set is in column “RefCodFreq.” The column “w” is obtained by dividing RefCodFreq values by the largest value in the codon family
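The two steps of the CAI calculation, deriving w from the reference set and then taking the exponentiated weighted average of ln(w), can be sketched as follows. This is only an illustration using the rows shown in Table 9.8; because the table is truncated, the resulting value describes these five codon families alone and is not the CAI of the full gene. The function names are hypothetical, and a used codon whose w is zero is not handled.

```python
from math import exp, log


def w_values(ref_freq, families):
    """w = reference frequency divided by the family maximum (cf. Eq. 9.4)."""
    w = {}
    for codons in families.values():
        if len(codons) < 2:
            continue  # single-codon families are excluded from CAI
        top = max(ref_freq[c] for c in codons)
        for c in codons:
            w[c] = ref_freq[c] / top
    return w


def cai(obs_freq, w):
    """Exponentiated frequency-weighted mean of ln(w) (cf. Eq. 9.5)."""
    num = den = 0.0
    for codon, n in obs_freq.items():
        if n > 0 and codon in w:
            num += n * log(w[codon])  # raises ValueError if w[codon] == 0
            den += n
    return exp(num / den)


# Rows shown in Table 9.8.
FAMILIES = {
    "A": ["GCA", "GCU", "GCG", "GCC"],
    "C": ["UGC", "UGU"],
    "D": ["GAU", "GAC"],
    "E": ["GAG", "GAA"],
    "F": ["UUU", "UUC"],
}
REF = {"GCA": 195, "GCU": 322, "GCG": 81, "GCC": 242, "UGC": 123, "UGU": 112,
       "GAU": 69, "GAC": 40, "GAG": 289, "GAA": 335, "UUU": 118, "UUC": 213}
OBS = {"GCA": 1, "GCU": 15, "GCG": 0, "GCC": 8, "UGC": 3, "UGU": 3,
       "GAU": 9, "GAC": 11, "GAG": 11, "GAA": 14, "UUU": 3, "UUC": 9}

W = w_values(REF, FAMILIES)
print({c: round(v, 3) for c, v in W.items()})
print(round(cai(OBS, W), 3))
```

Working on the log scale and skipping codons with zero observed count keeps the calculation stable when a minor codon has a very small w.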
The way w is calculated implies that, if a protein contains only methionine and tryptophan, both encoded by a single codon (AUG and UGG, respectively, in standard code), then the gene will have the highest CAI value of 1 because w values are 1 for such codons. Similarly, a gene with many AUG and UGG codons would have high CAI values even if it is not under any tRNA-mediated selection. For this reason, a good implementation of CAI should exclude single-member codon families from CAI calculation.
I have previously mentioned that codon usage indices such as CAI can be implemented differently with different classifications of codon families, so gene A could have a higher CAI value than gene B in one program but a lower one in another. I wish to illustrate this so that readers can better interpret their results.
In highly expressed yeast genes (e.g., compiled in the Eyeastcai.cut in EMBOSS distribution), CGU is by far the most frequent codon in the CGN (coding for arginine) codon family. The overuse of CGU and the avoidance of CGG, CGA, and CGC codons in highly expressed yeast genes make sense because the yeast genome contains six tRNAArg genes with anticodon ACG forming Watson-Crick base pairing with the CGU codon, but no other tRNAArg gene forming Watson-Crick base pairing with the other three CGN codons (the nucleotide A in anticodon ACG is modified to inosine but still pairs with U better than with other nucleotides). While this illustrates codon-anticodon adaptation well, it causes practical problems with computing CAI.
Suppose we now use a sequence consisting entirely of CGU codons and expect the resulting CAI to be 1 when using the Eyeastcai.cut reference set. The resulting CAI value from the EMBOSS.cai program is 0.140 instead of 1. It turns out that arginine is coded by two codon subfamilies, the CGN codon family we have mentioned and the AGR codon family. The largest codon frequency among these six codons is 314 (for the AGA codon) in Eyeastcai.cut. So the w value for CGU is not 1 (43/43) as we might have thought but only 0.1369 (= 43/314). For this reason, some CAI-calculating programs, e.g., DAMBE (Xia 2013, 2017d), separate compound codon families such as the arginine family into two families, one twofold and one fourfold.
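The effect of the codon family classification on w can be sketched as follows. Only the two reference counts quoted above (43 for CGU and 314 for AGA) are used; the other four arginine codons are left out because their counts are not given here, and the text indicates that none of them exceeds the maximum of its family. The helper name `w_value` is hypothetical.

```python
# Reference counts quoted in the text for Eyeastcai.cut.
# Counts for CGC, CGA, CGG, and AGG are not given here and are omitted;
# they are assumed not to exceed the family maxima used below.
REF = {"CGU": 43, "AGA": 314}

def w_value(codon, family, ref=REF):
    """Relative adaptiveness of a codon within a family (a set of codons)."""
    return ref[codon] / max(ref[c] for c in family if c in ref)

compound = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}  # one sixfold family
cgn_only = {"CGU", "CGC", "CGA", "CGG"}                # fourfold subfamily

w_compound = w_value("CGU", compound)
w_split = w_value("CGU", cgn_only)
print(round(w_compound, 4), w_split)
```

Because a gene made only of CGU codons has a CAI equal to the w of CGU, the choice between the two classifications alone moves its CAI from about 0.14 to 1.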
Illustration of CAI Applications
The most obvious application of CAI or related codon usage indices is to optimize codon usage in order to increase protein expression. Many experiments have demonstrated increased protein production after codon optimization and decreased protein production when codons are replaced by rarely used ones (Haas et al. 1996; Kaishima et al. 2016; Ngumbela et al. 2008; Robinson et al. 1984; Sorensen et al. 1989). There are claims that codon optimization does not increase protein production (e.g., Kudla et al. 2009), but these claims were found to be due to wrong data analysis (Tuller et al. 2010; Xia 2015) and will be dealt with in a later section on ITE (Xia 2015). Below I list two less obvious applications of CAI.
Does High Mutation Rate Prevent HIV-1 Genes from Evolving Codon Adaptation?
I have mentioned in the section on RSCU that the lack of concordance in codon usage between HIV-1 and human genes was conventionally explained by the high mutation rate in HIV-1, based on the observations that (1) the HIV-1 genome is known to experience strongly A-biased mutations, (2) usage of A-ending codons in HIV-1 genes is particularly different from that of the host genes, and (3) HTLV-1, which parasitizes the same human CD4+ T cells but has a lower mutation rate, does have codon usage similar to human genes (Fig. 9.6b). Thus, the lack of concordance in codon usage between HIV-1 and human genes is interpreted as poor codon adaptation caused by a high mutation rate disrupting codon adaptation.
However, van Weringh et al. (2011) objected to this interpretation. They argued that the lack of concordance in codon usage between HIV-1 and human genes is not due to poor codon adaptation on the part of HIV-1 genes, but because HIV-1 genes, especially the late genes, have adapted to a tRNA pool that is fundamentally different from that in a normal human CD4+ T cell. What originally prompted them to formulate this hypothesis is the observation that CAI for HIV-1 early genes is significantly greater than CAI for HIV-1 late genes when highly expressed human genes are used as reference genes. These late genes encode mass-translated HIV-1 structural proteins and are typically expected to have higher CAI than the relatively lowly expressed early genes. It is thus a surprise to see late genes having smaller CAI than early genes, unless the mass-translated late genes adapt to a tRNA pool different from the one that translates the early genes.
van Weringh et al. (2011) investigated experimentally measured tRNA abundance in the human cell when the late HIV-1 genes are translated and HIV-1 virions are produced. The tRNA pool for the late genes is indeed different in the expected direction, supporting their hypothesis that the lack of concordance in codon usage between HIV-1 and human genes is not due to poor codon adaptation in HIV-1 genes but because HIV-1 genes, especially the late genes, have adapted to a tRNA pool different from the one with which highly expressed human genes are translated (van Weringh et al. 2011).
Detecting Horizontally Transferred Genes
CAI has also been used jointly with a reformulated effective number of codons (Nc, Sun et al. 2013) to detect horizontally transferred genes. Genes with a strong codon usage bias typically have high CAI values. However, three genes (yagF, yagG, and yagH) from the defective CP 4–6 prophages of E. coli (Wang et al. 2010) have strongly biased codon usage (small Nc values) but relatively small CAI values. This codon usage pattern sets the three genes apart from the rest of E. coli genes (Fig. 9.7), which highlights the value of using the “Nc versus CAI” plot to detect recently horizontally transferred genes. These genes have been “naturalized” in the E. coli genome and contribute to E. coli survival and growth (Wang et al. 2010).
Fig. 9.7.
Plot of CAI against a reformulated effective number of codons (Nc, Sun et al. 2013) for E. coli genes facilitates the detection of newly “immigrant” genes that exhibit codon usage bias different from the “native” genes. Three E. coli genes (yagF, yagG, and yagH) from the defective CP 4–6 prophages of E. coli (Wang et al. 2010) have strongly biased codon usage (relatively small Nc) but relatively poor codon adaptation (mediocre CAI values). The red points represent 179 annotated E. coli pseudogenes (NC_000913) that have not accumulated frameshifting mutations
The largest mucin gene (mucin 14A) in Drosophila melanogaster also exhibits strong codon usage bias (Nc = 38.6), but in the direction opposite to that of highly expressed D. melanogaster genes. Its CAI value is 0.1277, which is the second smallest among all D. melanogaster genes. It is unknown how and why the gene has evolved such a peculiar feature.
The distribution of CAI values for the 179 annotated pseudogenes is indicated in red. These pseudogenes have not accumulated frameshifting mutations and presumably were pseudogenized only recently. They tend to be clustered at the lower end of the CAI distribution, suggesting that genes with high CAI values require tRNA-mediated selection to maintain them.
The gene with the smallest CAI is mgtL, which has only 17 sense codons and is a bacterial mRNA leader that controls the expression of the downstream mgtA (Park et al. 2010). The low CAI is not a stochastic fluctuation caused by the small number of codons but arises because almost all codons used are minor codons. This may represent a real case of a gene preferring minor codons to facilitate its regulatory function.
Problems with CAI and Other Gene-Specific Codon Usage Indices
There are major problems with CAI and other commonly used codon usage indices. While some minor problems have been addressed before (Xia 2007c), the key issue of properly inferring translationally optimal codons (TOCs) remains unresolved. These gene-specific codon usage indices all need to infer TOCs, by using two types of information. The first, represented by tAI (dos Reis et al. 2004), uses the most abundant tRNA and its anticodon to infer TOC within each codon family, i.e., the codon that base-pairs best with the most abundant tRNA is the TOC. The second, represented by CAI, considers the most frequent codon in HEGs as the TOC within each codon family. I will outline the problems to pave the way for the presentation of a new index of translation elongation in the next section (ITE , Xia 2015).
Problem with Codon Usage Indices Using tRNA Abundance to Infer TOCs
For indices such as tAI that use tRNA abundance information to define TOCs, the main problem is that TOCs cannot be inferred reliably from tRNA gene copy numbers or experimentally measured tRNA abundance. For example, inosine is expected to pair best with C and U, less with A (partly because of the bulky I/A pairing involving two purines), and not with G. However, tRNAVal/IAC from rabbit liver pairs better with GUG codon than with other synonymous codons (Jank et al. 1977; Mitra et al. 1977). No one would have identified GUG as the best codon for tRNAVal/IAC without actually seeing the experimental result.
Similarly, the E. coli genome encodes tRNAAla/GGC for decoding GCY codons. One would have thought that the GCC codon, which forms Watson-Crick base pairing with the anticodon, would be translationally more optimal than GCU. However, relative to GCC, GCU is used much more frequently in HEGs than in LEGs in E. coli. We have encountered a similar example in Table 9.4 involving Cys codon usage in HEGs. There are four tRNACys genes with the same anticodon GCA forming Watson-Crick base pairs with the UGC codon, but no tRNACys gene with an anticodon forming Watson-Crick base pairs with the alternative UGU codon. We would have taken UGC as the TOC. However, UGU is used far more frequently than UGC in highly expressed yeast genes relative to LEGs. In short, in all these cases we would be wrong to use the most abundant tRNA species and its matching codon to infer the TOC.
There is one more reason why tRNA abundance cannot reliably predict TOCs. What matters in translation elongation is not the abundance of transcribed tRNAs but the availability of charged tRNAs. It is tedious to determine the level of charged tRNAs, and researchers typically use transcriptionally determined tRNAs or even the number of tRNA genes in the genome as a proxy for charged tRNAs. Unfortunately, the abundance of tRNAs often does not reflect the abundance of charged tRNAs (Elf et al. 2003).
Furthermore, codon-anticodon base pairing is known to be context-dependent (Lustig et al. 1989). For example, a wobble cmo5U in the anticodon of tRNAPro, tRNAAla, and tRNAVal can read all four synonymous codons in the respective codon family, but the same cmo5U in tRNAThr cannot read C-ending codons (Nasvall et al. 2007). For this reason, the optimal codon usage is likely better approximated by the codon usage of HEGs than by what we can infer from codon-anticodon pairing. Consistent with this proposition, CAI, which is based on the codon usage of HEGs, performs better in predicting protein production or abundance than other indices based on tRNAs (Coghlan and Wolfe 2000; Comeron and Aguade 1998; Duret and Mouchiroud 1999).
Problem with Using Codon Usage of HEGs to Infer TOCs
Codon usage indices such as CAI that use codon usage of HEGs to infer TOCs also have problems. Other than those previously outlined (Xia 2007c), this approach often leads to wrong interpretation of tRNA-mediated selection. I illustrate this problem here with the Ala codon subfamily GCR (where R stands for either A or G). The frequencies of GCA and GCG in E. coli HEGs, as compiled and distributed with EMBOSS (Rice et al. 2000), are 1973 and 2654, respectively, which may lead one to think that the E. coli translation machinery prefers GCG over GCA. However, the codon frequencies of GCA and GCG for E. coli non-HEGs are 25,511 and 43,261, respectively. Thus, GCA is relatively more frequent in E. coli HEGs than in E. coli non-HEGs. This suggests that mutation bias favors GCG, but tRNA-mediated selection favors GCA. The battle between mutation bias and tRNA-mediated selection leads to increased usage of GCA in E. coli HEGs relative to LEGs, although GCA is still not as frequent as GCG in HEGs. This interpretation is corroborated by the E. coli genome encoding three tRNAAla genes for GCR codons, all with a UGC anticodon forming perfect Watson-Crick base pairs with codon GCA.
The example above illustrates the point that mutation bias is reflected in the codon usage of lowly expressed genes. This is what has driven the formulation, development, and implementation of a new codon usage index, ITE (Xia 2015).
ITE (Index of Translation Elongation)
Illustration of ITE Calculation
ITE is implemented in DAMBE (Xia 2013, 2017d). There are in fact four different implementations of ITE in DAMBE, depending on how one would classify codons into codon families. The first implementation is the most extreme (unconventional) and classifies all sense codons into NNR or NNY codon families or subfamilies. For example, the fourfold alanine codon is broken into GCR and GCY subfamilies. For such an NNR or NNY codon family or subfamily i, we first define Pi.HEG and Pi.non-HEG as the proportion of codon i within its R-ending or Y-ending family for HEGs and non-HEGs. Take data for codons GCA and GCG in Table 9.9, for example:
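$$
\begin{aligned}
P_{GCA.HEG} &= \frac{N_{GCA.HEG}}{N_{GCA.HEG} + N_{GCG.HEG}} = \frac{1973}{1973 + 2654} = 0.42641, & P_{GCG.HEG} &= \frac{2654}{1973 + 2654} = 0.57359 \\
P_{GCA.non\text{-}HEG} &= \frac{N_{GCA.non\text{-}HEG}}{N_{GCA.non\text{-}HEG} + N_{GCG.non\text{-}HEG}} = \frac{25511}{25511 + 43261} = 0.37095, & P_{GCG.non\text{-}HEG} &= \frac{43261}{25511 + 43261} = 0.62905
\end{aligned}
\tag{9.6}
$$

$$
S_{GCA} = \frac{P_{GCA.HEG}}{P_{GCA.non\text{-}HEG}} = 1.1495, \qquad S_{GCG} = \frac{P_{GCG.HEG}}{P_{GCG.non\text{-}HEG}} = 0.9118
\tag{9.7}
$$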
where SGCA and SGCG may be viewed as relative codon frequencies of HEGs corrected for the “background” non-HEGs. Codon i is considered selected for if Si > 1 and selected against if Si < 1. Thus, codon GCA is considered selected for because, according to Eq. (9.7), SGCA > 1. This insight would be obscured if we used codon frequency data from HEGs only, which would have suggested that codon GCA is selected against. The Si values for the four alanine codons in E. coli are listed in Table 9.9.
Table 9.9.
Codon frequency (CF) for highly expressed genes (HEGs ) and non-HEGs, as well as the computed Si values according to Eq. (9.7)
| AA | Codon | CFHEG | CFnon-HEG | Si |
|---|---|---|---|---|
| A | GCA | 1973 | 25,511 | 1.1495 |
| A | GCG | 2654 | 43,261 | 0.9118 |
| A | GCC | 1306 | 33,463 | 0.5646 |
| A | GCU | 2288 | 18,526 | 1.7865 |
| … | … | … | … | … |
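The Si values in Table 9.9 can be reproduced with a short Python sketch. The code below is an illustration under stated assumptions, not the DAMBE implementation: it keys each NNR or NNY subfamily by the first two codon positions plus R or Y, which is sufficient for the alanine codons but would need an additional split by amino acid for pairs such as AUA/AUG. It also computes the relative weights obtained by dividing each Si by the largest Si in its subfamily. All function names are invented for this example.

```python
# Codon counts from Table 9.9: codon -> (count in HEGs, count in non-HEGs).
COUNTS = {
    "GCA": (1973, 25511), "GCG": (2654, 43261),  # GCR subfamily
    "GCC": (1306, 33463), "GCU": (2288, 18526),  # GCY subfamily
}

def subfamily(codon):
    """Key an NNR/NNY subfamily by the first two bases plus R or Y."""
    return codon[:2] + ("R" if codon[2] in "AG" else "Y")

def s_values(counts):
    """S_i = (proportion of codon i in HEGs) / (proportion in non-HEGs)."""
    totals = {}
    for codon, (heg, non_heg) in counts.items():
        t = totals.setdefault(subfamily(codon), [0, 0])
        t[0] += heg
        t[1] += non_heg
    s = {}
    for codon, (heg, non_heg) in counts.items():
        heg_total, non_heg_total = totals[subfamily(codon)]
        s[codon] = (heg / heg_total) / (non_heg / non_heg_total)
    return s

def w_values(s):
    """Scale each S_i by the largest S_i in its subfamily."""
    best = {}
    for codon, value in s.items():
        key = subfamily(codon)
        best[key] = max(best.get(key, 0.0), value)
    return {codon: value / best[subfamily(codon)] for codon, value in s.items()}

S = s_values(COUNTS)
W = w_values(S)
for codon in COUNTS:
    print(codon, round(S[codon], 4), round(W[codon], 4))
```

GCA and GCU come out with S greater than 1 and therefore a weight of 1 in their respective subfamilies, even though GCA is the less frequent codon of the GCR pair in HEGs.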
We now compute wi as follows:
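$$
w_i = \frac{S_i}{\max(S)}, \quad \text{e.g.,} \quad w_{GCA} = \frac{S_{GCA}}{\max(S_{GCA}, S_{GCG})} = 1, \qquad w_{GCG} = \frac{S_{GCG}}{\max(S_{GCA}, S_{GCG})} = 0.7932
\tag{9.8}
$$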
The index of translation elongation (ITE ) is then calculated in the same way as CAI except that, in this particular codon family classification, the computation is applied to NNR and NNY codon subfamilies:
$$
I_{TE} = \exp\left( \frac{\sum_{i=1}^{N_s} F_i \ln w_i}{\sum_{i=1}^{N_s} F_i} \right)
\tag{9.9}
$$
where Fi is the frequency of codon i and Ns is the number of sense codons (excluding those in single-codon families). For example, AUG for methionine, AUA for isoleucine, and UGG for tryptophan in the standard genetic code are excluded from computing ITE. Just like CAI, tAI, and Nc, ITE is a gene-specific index of codon usage bias.
One may note that CAI is a special case of ITE when there is absolutely no codon usage bias in non-HEGs in any codon subfamily, that is, when NGCA.non-HEG = NGCG.non-HEG, NGCC.non-HEG = NGCU.non-HEG, and so on. The range of ITE is the same as that of CAI, i.e., between 0 and 1.
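The reduction of ITE to CAI under a flat background can be checked numerically. The sketch below uses the alanine counts from Table 9.9 and a gene whose codon counts are invented purely for illustration; the function names and the restriction to the two alanine subfamilies are likewise assumptions of this example rather than features of DAMBE.

```python
import math

HEG = {"GCA": 1973, "GCG": 2654, "GCC": 1306, "GCU": 2288}          # Table 9.9
NON_HEG = {"GCA": 25511, "GCG": 43261, "GCC": 33463, "GCU": 18526}  # Table 9.9
SUBFAMILIES = [("GCA", "GCG"), ("GCC", "GCU")]  # NNR and NNY alanine subfamilies

def weights(heg, background):
    """w_i = S_i / max(S) per subfamily, with S_i = P_i.HEG / P_i.background."""
    w = {}
    for family in SUBFAMILIES:
        heg_total = sum(heg[c] for c in family)
        bg_total = sum(background[c] for c in family)
        s = {c: (heg[c] / heg_total) / (background[c] / bg_total) for c in family}
        top = max(s.values())
        w.update({c: s[c] / top for c in family})
    return w

def index(gene_counts, w):
    """exp of the codon-count-weighted mean of ln(w)."""
    total = sum(gene_counts.values())
    log_sum = sum(n * math.log(w[c]) for c, n in gene_counts.items())
    return math.exp(log_sum / total)

# Hypothetical gene, invented purely for illustration.
gene = {"GCA": 10, "GCG": 5, "GCC": 2, "GCU": 8}

ite = index(gene, weights(HEG, NON_HEG))

# A flat background (equal counts within each subfamily) removes the
# correction, so each weight becomes HEG count / largest HEG count in
# the subfamily, which is how CAI weights are defined.
flat = {codon: 1 for codon in HEG}
cai_like = index(gene, weights(HEG, flat))
print(round(ite, 4), round(cai_like, 4))
```

With the real background, GCA receives a weight of 1; with the flat background, GCG does, which is the reversal discussed in the text.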
Readers may demand a justification for the extreme classification of all sense codons into NNR and NNY codon families. The main reason is that, for genes encoded by the nuclear genome, the R-ending codons are typically decoded by two types of tRNA species (one with a wobble C and the other with a wobble U), whereas the Y-ending codons are decoded typically by a single type of tRNA species with either a wobble G or a wobble A modified to inosine, but never by both (Grosjean et al. 2007; Marck and Grosjean 2002). For this reason, the R-ending and Y-ending codons, even within a single fourfold codon family, are subject to different tRNA-mediated selection and therefore should be treated separately. Such implementation is also relevant for certain experimental settings that induce mutation almost exclusively in NNY codons, which is the case in Kudla et al. (2009). However, for comparative purposes, I have included two alternative ITE implementations in DAMBE (Xia 2013, 2017d): (1) with compound sixfold and eightfold codon families broken into twofold and fourfold codon families and (2) lumping all synonymous codons into one codon family. One may access the function by clicking “Seq.Analysis|Codon usage|Index of translation elongation” and then choosing the desired implementation.
A Major Controversy Resolved by the Application of ITE
Highly expressed genes in bacteria and unicellular eukaryotes overuse codons that match the anticodon of the most abundant tRNA (Ikemura 1981a, b, 1982, 1992). When such codons are replaced by rarely used codons, protein production is reduced (Robinson et al. 1984; Sorensen et al. 1989). Similarly, when codon usage is optimized, protein production is increased (Haas et al. 1996; Kaishima et al. 2016; Ngumbela et al. 2008). However, the degree to which translation elongation is rate-limiting has been controversial. Early theoretical considerations (Andersson and Kurland 1983; Bulmer 1990, 1991; Liljenstrom and von Heijne 1987) tend to favor the argument that translation initiation, not translation elongation, is rate-limiting in protein production. This hypothesis states that codon-anticodon adaptation and increased elongation efficiency are not related to protein production. Instead, the benefit of codon adaptation and increased elongation efficiency is to increase ribosomal availability for global translation and timely response to environmental perturbations.
To test these two alternative hypotheses, Kudla et al. (2009) engineered a synthetic library of 154 genes, all encoding the same green fluorescent protein in E. coli, but differing in synonymous sites (and consequently in the degree of codon adaptation, as measured by the codon adaptation index or CAI). All sequences share an identical 5′ UTR that is 144 nt long, so there is no variation in the Shine-Dalgarno sequence. Because the engineered genes all encode the same protein, it is justifiable to use protein abundance as a proxy for protein production (assuming that protein molecules sharing the same amino acid sequence have the same degradation rate).
Kudla et al. (2009) used minimum folding energy (MFE), computed from sites −4 to +37 (where ribosomes position themselves at the initiation codon), as a proxy for initiation efficiency. The rationale for using MFE as a measure of translation initiation is that an initiation codon would be inaccessible if it is embedded in a strong secondary structure and that accessibility of the initiation codon is a key determinant of translation initiation efficiency (Nakamoto 2006). Stable secondary structure in sequences positioned at or before the start codon has been experimentally shown to inhibit translation initiation (Osterman et al. 2013), presumably because it embeds the SD sequence and the start codon in a structural stem and consequently hides these signals from ribosomes. The previous chapter on translation initiation has already highlighted the point that mRNAs in bacteria and unicellular eukaryotes tend to have much weaker secondary structure near the start codon than elsewhere, especially those from highly expressed genes.
Kudla et al. interpreted CAI as a proxy of translation elongation. If both translation initiation and elongation contribute to translation efficiency, then protein production is expected to depend on both MFE and CAI. If only translation initiation is important, then protein production will depend on MFE only. They found that MFE accounts for 44% of the variation in protein production but CAI is essentially unrelated to protein production. They concluded consequently that “translation initiation, not elongation, is rate-limiting for gene expression.”
The conclusion by Kudla et al. (2009), however, is based on two critical assumptions. First, MFE and CAI are good proxies of translation initiation and elongation efficiencies, respectively. Second, the effect of translation elongation is independent of translation initiation. The problem with the second assumption has been pointed out by Supek and Smuc (2010) and Tuller et al. (2010), who reanalyzed the data in addition to providing an overwhelming amount of additional empirical evidence to demonstrate the joint effect of both translation initiation and elongation on protein production. In short, protein production rate is expected to increase with elongation efficiency only when translation initiation is efficient. If translation initiation is slow, then increasing elongation rate is not expected to increase protein production. Kudla et al. (2009) ignored the dependence of the elongation effect on translation initiation.
Xia (2015) reanalyzed the experimental data in Kudla et al. (2009) with two improvements, by replacing CAI with ITE and by incorporating translation initiation and elongation into one model. Three points are worth highlighting in Fig. 9.8a. First, in contrast to a nonsignificant relationship between protein abundance and CAI, protein abundance and ITE are highly significantly correlated (p = 0.0001, Fig. 9.8a). Second, when ITE is small (e.g., ITE < 0), protein abundance is generally low, suggesting that translation elongation is limiting. Third, a large ITE (efficient translation elongation) does not imply high protein production, e.g., when translation initiation is very slow. One expects a large ITE to be associated with increased protein production only when translation initiation is efficient.
Fig. 9.8.
Relationship between protein abundance (measured by GFP normalized fluorescence; data kindly provided by Dr. Plotkin) and translation elongation efficiency (ITE). (a) Without considering translation initiation. (b) The relationship between protein abundance and ITE is characterized separately for four groups of data, with MFE1, MFE2, MFE3, and MFE4 corresponding to groups of genes with increasing translation initiation efficiency. (Modified from Xia 2015)
Xia (2015) binned MFE into four categories, from strong to weak secondary structure: (−15.3, −11), (−10.9, −9), (−8.7, −6.2), and (−6, −3.5). These represent translation initiation efficiency from the lowest to the highest and are designated MFE1-MFE4 (Fig. 9.8b). The intervals are chosen in such a way that all MFE values fall into four roughly equal-sized groups with within-group variation in MFE being as small as possible. The benefit of binning is that one can exclude the MFE variable so that the effect of ITE can be modeled more explicitly. It is for the same reason that Tuller et al. (2010) also used binned analysis for this data set.
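The binning step can be sketched as a small lookup. The interval bounds are copied from the text; treating both endpoints as inclusive, returning None for values that fall between intervals, and the record layout (MFE, ITE, protein abundance) are assumptions made for this illustration only.

```python
# Interval bounds for the four MFE categories, copied from the text.
# Endpoints are treated as inclusive (an assumption).
MFE_BINS = [
    ("MFE1", -15.3, -11.0),
    ("MFE2", -10.9, -9.0),
    ("MFE3", -8.7, -6.2),
    ("MFE4", -6.0, -3.5),
]

def mfe_group(mfe):
    """Return the MFE category label, or None if no interval contains mfe."""
    for label, low, high in MFE_BINS:
        if low <= mfe <= high:
            return label
    return None

def group_by_mfe(records):
    """Split (mfe, ite, protein) tuples into per-category lists of (ite, protein)."""
    groups = {label: [] for label, _, _ in MFE_BINS}
    for mfe, ite, protein in records:
        label = mfe_group(mfe)
        if label is not None:
            groups[label].append((ite, protein))
    return groups
```

Once the records are grouped this way, protein abundance can be regressed on ITE separately within each category, which is the analysis summarized in Fig. 9.8b.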
In the MFE1 group, translation initiation efficiency is the lowest, and we should expect little increase of protein production with translation elongation efficiency (ITE). This is consistent with the empirical result (Fig. 9.8b), where the relationship between ITE and protein abundance is not statistically significant in the MFE1 group (b = 67.545, p = 0.4213, Fig. 9.8b), with ITE accounting for only 2% of total variation in ranked protein abundance (rProt). In contrast, when translation initiation is more efficient in groups MFE2-MFE4, rProt increases significantly with ITE, with the simple linear model consistently accounting for about 17% of the total variation in rProt (Fig. 9.8b, with b varying from 216.60 to 263.87). Thus, the contribution of translation elongation (ITE) to protein production is much greater than previously documented for this data set, i.e., absent (Kudla et al. 2009) or less than 3% of the total variation in protein production (Tuller et al. 2010). Readers may consult Xia (2015) for more explicit modeling of the dependence of protein abundance on translation initiation and elongation.
One might wonder why previous studies, although not taking translation initiation into consideration, almost always show a positive relationship between translation efficiency and codon adaptation. There are two explanations. First, previous experimental studies were typically carried out on highly expressed genes with efficient translation initiation. Such studies are equivalent to excluding the MFE1 group in Fig. 9.8b. Second, for correlational studies, nature generally does not generate bacterial genes with high translation initiation efficiency but poor codon adaptation, or low translation initiation efficiency with high codon adaptation. However, the experiment by Kudla et al. (2009) generated both of these unnatural associations, leading to a lack of positive association between protein production and codon adaptation. This example highlights the point that a well-intended and well-done experiment can mislead us. It represents another illustration of Simpson’s Paradox, in which a wrong conclusion is reached when one omits a contributing variable.
Translation Elongation Efficiency and Accuracy
Given a fixed translation initiation efficiency, our conceptual model for the relationship between codon adaptation (CA) and tRNA-mediated selection, in its simplest form, is
$$
CA = a + b S_E
\tag{9.10}
$$
where CA is tRNA-mediated codon adaptation, often measured by CAI or ITE (Xia 2015), and SE is selection for translation efficiency (in units of protein produced per mRNA molecule). The slope b is typically positive, i.e., stronger selection for translation efficiency leads to better codon adaptation. Many studies have demonstrated a strong relationship between codon adaptation and gene expression (Coghlan and Wolfe 2000; Duret and Mouchiroud 1999; Gouy and Gautier 1982).
One key deficiency in Eq. (9.10) is that it does not distinguish between selection due to translation efficiency and that due to translation accuracy (Akashi 1994). Take Asn codons AAC and AAU in E. coli, for example. AAC is a major codon (heavily used by highly expressed genes and decoded by the most abundant isoacceptor tRNA), whereas AAU is a rarely used minor codon. A major codon is typically translated faster than a minor codon, and highly expressed E. coli genes use AAC almost exclusively to code for Asn, so one could argue that the overuse of AAC is driven by SE. However, AAC and AAU also differ in misreading rate, in particular by tRNALys, which ideally should decode only AAA and AAG codons but does misread AAC and AAU, leading to Asn being replaced by Lys. This misreading error rate is six times greater for AAU than for AAC, with the error ratio maintained in both Asn-starved and Asn-non-starved conditions (Johnston et al. 1984) and with streptomycin used to inhibit translation (Johnston and Parker 1985). Thus, the overuse of AAC could be driven by selection for increased translation efficiency, for increased translation accuracy, or for both. Designating SA as selection for translation accuracy, we have three alternative hypotheses expressed, in the simplest form, as
$$
CA = a + b S_E
\tag{9.11}
$$

$$
CA = a + c S_A
\tag{9.12}
$$

$$
CA = a + b S_E + c S_A
\tag{9.13}
$$
Akashi (1994) classified amino acid sites into conserved sites (assumed to be functionally important with high SA) and variable sites (assumed to experience low SA). He reasoned that, if codon adaptation is due to selection for translation efficiency, then all codons in the gene should be subject to similar selection regardless of whether the codon is in a functionally important or unimportant site. In contrast, if codon adaptation is driven by selection for translation accuracy, then the selection is stronger in functionally important sites than in functionally unimportant sites, so we should observe greater codon usage bias in functionally important codon sites than in functionally unimportant codon sites. He found greater codon adaptation in conserved amino acid sites than in variable amino acid sites and concluded that this difference between the conserved and variable sites resulted from selection for accuracy.
There is a problem with the conclusion. Take lysine codons (AAA and AAG) and glutamate codons (GAA and GAG), for example. Suppose that the AAA codon is favored by selection in the lysine codon family and GAG is favored in the glutamate codon family. Also suppose that an ancestral gene has good codon adaptation, with lysine coded by AAA and glutamate coded by GAG. Now some lysine sites experience nonsynonymous substitutions from AAA to GAA. These sites are now designated as variable sites and are occupied by a minor codon, GAA. This would result in an association between “poor codon adaptation” and variable sites that has little to do with translation accuracy. Akashi (1994) was aware of this problem but did not provide a definitive solution.
Amino Acid Usage and Translation Elongation Efficiency
There are at least four factors contributing to amino acid usage. The first two are related to selection for translation elongation efficiency, the third related to number of synonymous codons, and the fourth related to genomic mutation bias.
Factors Related to Selection for Translation Elongation Efficiency
Some amino acids are abundant and energetically cheap to make, i.e., consuming few ATPs in their production, whereas others are rare and energetically expensive, so mass-produced proteins should maximize the use of abundant and cheap amino acids (Akashi and Gojobori 2002). However, such a hypothesis, without considering other factors, often does not produce easily testable predictions. For example, we expect highly expressed proteins to maximize the use of energetically cheap amino acids and avoid the use of the expensive ones. However, many ribosomal proteins are highly expressed, yet the need for many of them to bind to the negatively charged mRNA demands the usage of positively charged amino acids such as Lys and Arg that are typically energetically expensive to make in the cell. This would lead to an association between high expression and energetically expensive amino acids, thus confounding the prediction that highly expressed genes should maximize the use of cheap amino acids. Furthermore, amino acid availability changes with environment, and the same amino acid may be manufactured differently with different energy consumption in different organisms. So it is not easy to measure the energetic cost of amino acids in different organisms. One could, however, turn the question around and ask how one can characterize energetic costs of amino acids by bioinformatic means. For example, in the ideal situation when all other factors affecting amino acid usage have been controlled for, we may infer that the avoided amino acid is perhaps rare or energetically expensive to make. This type of inference is of course not very satisfactory and is often derogatively termed the backdoor smuggling approach because one does not present direct evidence for energetic cost.
The other factor related to translation elongation is tRNA abundance, and one expects mass-produced proteins to use amino acids with many tRNAs to carry them. Designating the proportion of tRNAs carrying amino acid i as Pi, and the frequency of amino acid i in highly expressed genes as Ni, Xia (1998a) analytically derived an equation with Pi linearly increasing with the square root of Ni. The relationship was well substantiated with data from Escherichia coli, Salmonella typhimurium, and Saccharomyces cerevisiae (Xia 1998a).
Single-stranded DNA (ssDNA) bacteriophages do not carry their own tRNA and depend entirely on the host tRNA pool for decoding their codons. So one would predict that amino acid usage in these phages should be correlated with the abundance of tRNAs in the host cell. This prediction was tested in a study (Chithambaram et al. 2014b) of phages infecting E. coli, by using tRNA gene copy number in E. coli as a proxy for tRNA abundance (Fig. 9.9). An amino acid carried by more tRNAs is used more frequently than one carried by fewer tRNAs.
Fig. 9.9.
Amino acid usage in single-stranded DNA phages infecting E. coli increases with the abundance of isoaccepting tRNA
Number of Synonymous Codons
In the absence of any selection, we would expect amino acid usage to increase with the number of synonymous codons (Fig. 9.10). However, this relationship is confounded with the number of tRNAs carrying each amino acid in the cell. If we designate the number of tRNAs carrying amino acid i as Ni.tRNA and the number of synonymous codons for amino acid i as Ni.syn codon, then amino acid usage depends on both. Ni.tRNA and Ni.syn codon are also positively correlated.
Fig. 9.10.
Amino acid count in all coding sequences in E. coli K12 (NC_000913) increases with the number of synonymous codons
Genomic Mutation Bias
E. coli genomes have roughly equal nucleotide frequencies. A more AT-rich or GC-rich genome would tend to have more AT-rich or GC-rich codons and their encoded amino acids. For example, AT-rich genomes in bacterial pathogens tend to have many more lysine residues (encoded by AAA and AAG) than less AT-rich genomes (Xia and Palidwor 2005). This is highly visible even with a mild difference in genomic AT content. For example, the yeast (Saccharomyces cerevisiae) genome is only mildly AT-rich (0.3090, 0.1917, 0.1913, and 0.3080 for A, C, G, and T, respectively), but the yeast clearly uses more amino acids encoded by AT-rich codons and fewer amino acids encoded by GC-rich codons (Table 9.10).
Table 9.10.
Amino acid usage in E. coli K12 (NC_000913) and Saccharomyces cerevisiae (NC_001133-NC_001148) coding sequences
| Amino acid | Codons | E. coli count | Yeast count | E. coli (%) | Yeast (%) |
|---|---|---|---|---|---|
| Ala | GCT,GCC,GCA,GCG | 125,332 | 160,810 | 9.5527 | 5.4966 |
| Arg | CGT,CGC,CGA,CGG,AGA,AGG | 72,502 | 130,068 | 5.5260 | 4.4458 |
| Asn | AAT,AAC | 51,075 | 179,836 | 3.8929 | 6.1469 |
| Asp | GAT,GAC | 67,349 | 171,072 | 5.1333 | 5.8473 |
| Cys | TGT,TGC | 15,188 | 37,093 | 1.1576 | 1.2679 |
| Gln | CAA,CAG | 58,360 | 115,741 | 4.4481 | 3.9561 |
| Glu | GAA,GAG | 75,786 | 191,267 | 5.7763 | 6.5376 |
| Gly | GGT,GGC,GGA,GGG | 96,701 | 145,433 | 7.3705 | 4.9710 |
| His | CAT,CAC | 29,751 | 63,505 | 2.2676 | 2.1706 |
| Ile | ATT,ATC,ATA | 78,845 | 191,677 | 6.0095 | 6.5516 |
| Leu | TTG,TTA,CTT,CTC,CTA,CTG | 140,571 | 277,988 | 10.7142 | 9.5017 |
| Lys | AAA,AAG | 57,620 | 214,842 | 4.3917 | 7.3434 |
| Met | ATG | 37,093 | 60,672 | 2.8272 | 2.0738 |
| Phe | TTT,TTC | 51,131 | 129,516 | 3.8972 | 4.4269 |
| Pro | CCT,CCC,CCA,CCG | 58,293 | 128,177 | 4.4430 | 4.3811 |
| Ser | TCT,TCC,TCA,TCG,AGT,AGC | 75,661 | 263,096 | 5.7668 | 8.9927 |
| Thr | ACT,ACC,ACA,ACG | 70,494 | 173,084 | 5.3730 | 5.9161 |
| Trp | TGG | 20,060 | 30,387 | 1.5290 | 1.0386 |
| Tyr | TAT,TAC | 37,134 | 98,746 | 2.8303 | 3.3752 |
| Val | GTT,GTC,GTA,GTG | 93,061 | 162,642 | 7.0930 | 5.5592 |
Amino acids encoded by AT-rich codons are in bold, and those encoded by GC-rich codons are italicized
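The contrast in Table 9.10 can be summarized by pooling amino acids into an AT-rich and a GC-rich group. The sketch below recomputes the percentages from the counts in the table. The two groupings (`AT_RICH` and `GC_RICH`) are an assumption made here for illustration, based on the first two codon positions, and are not necessarily the amino acids marked in bold and italics in the original table; the function names are likewise illustrative.

```python
# Amino acid counts transcribed from Table 9.10: (E. coli K12, yeast).
COUNTS = {
    "Ala": (125332, 160810), "Arg": (72502, 130068), "Asn": (51075, 179836),
    "Asp": (67349, 171072), "Cys": (15188, 37093), "Gln": (58360, 115741),
    "Glu": (75786, 191267), "Gly": (96701, 145433), "His": (29751, 63505),
    "Ile": (78845, 191677), "Leu": (140571, 277988), "Lys": (57620, 214842),
    "Met": (37093, 60672), "Phe": (51131, 129516), "Pro": (58293, 128177),
    "Ser": (75661, 263096), "Thr": (70494, 173084), "Trp": (20060, 30387),
    "Tyr": (37134, 98746), "Val": (93061, 162642),
}

# Assumed groupings: amino acids whose codons carry only A/T (or mostly G/C)
# at the first two codon positions. These sets are an illustrative choice.
AT_RICH = ("Phe", "Tyr", "Met", "Ile", "Asn", "Lys")
GC_RICH = ("Gly", "Ala", "Arg", "Pro")


def percent_usage(species_index):
    """Return {amino acid: percentage of all residues} for one species."""
    total = sum(pair[species_index] for pair in COUNTS.values())
    return {aa: 100.0 * pair[species_index] / total
            for aa, pair in COUNTS.items()}


def group_share(usage, group):
    """Summed percentage of the amino acids in a group."""
    return sum(usage[aa] for aa in group)


ecoli = percent_usage(0)
yeast = percent_usage(1)
for label, group in (("AT-rich", AT_RICH), ("GC-rich", GC_RICH)):
    print(f"{label}: E. coli {group_share(ecoli, group):.2f}%, "
          f"yeast {group_share(yeast, group):.2f}%")
```

With these groupings the pooled share of the AT-rich group is higher in yeast than in E. coli, and the pooled share of the GC-rich group is lower, in line with the mutation-bias argument in the text.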
In summary, amino acid usage (U) is a function of four factors:
$$U = f\left(E,\ N_{\mathrm{tRNA}},\ N_{\mathrm{syncodon}},\ \mathrm{GC\%}\right) \tag{9.14}$$
where E is energetic cost, NtRNA and Nsyncodon have been defined before, and GC% is genomic GC% reflecting mutation bias. One needs to include all these factors in a model in order to reach a reasonable understanding of the determinants of amino acid usage.
Postscript
I usually share Simpson’s Paradox with students after lecturing on the joint effect of translation initiation and elongation on protein production. If we do not take translation initiation into consideration, we may arrive at the wrong conclusion that codon usage bias contributes little to the rate of protein synthesis, as did Kudla et al. (2009). Simpson’s Paradox, illustrated with data in Table 9.11, presents a similar case in which one would reach a wrong conclusion when one factor is ignored.
Table 9.11.
Success rate (in percentage) of two surgical treatments for removing kidney stones: “all open procedure” (AOS) or percutaneous nephrolithotomy (PN), taken from Table 2 of Charig et al. (1986)
| Size | AOS | PN |
|---|---|---|
| Small stones | 93% (81/87) | 87% (234/270) |
| Large stones | 73% (192/263) | 69% (55/80) |
| Pooled | 78% (273/350) | 83% (289/350) |
Values in parentheses are in the format of “Number of successes/number of patients treated.” Kidney stone size (Size) is discretized into two categories as in the original paper
Charig et al. (1986) summarized their findings in the abstract on the basis of the last row of “Pooled” data, stating that “Success was achieved in 273 (78%) patients after open surgery, 289 (83%) after percutaneous nephrolithotomy.” A reader would conclude that AOS is worse (78% success rate) than PN (83% success rate). However, taking kidney stone size into consideration immediately leads to the opposite (and correct) conclusion, i.e., AOS is better than PN for both small stones (93% vs. 87%) and large stones (73% vs. 69%). We also note that both AOS and PN have a much higher success rate for small stones than for large stones. Patients treated with PN had mostly small stones (270 of 350) and patients treated with AOS had mostly large stones (263 of 350). It is this association between PN and small stones that leads to the misleading conclusion that PN is better than AOS when kidney stone size is ignored.
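The reversal can be verified directly from the counts in Table 9.11. The following minimal sketch computes the stratum-specific and pooled success rates; the names `DATA`, `rate`, and `pooled` are introduced here for illustration only.

```python
# (successes, patients treated) transcribed from Table 9.11.
DATA = {
    "AOS": {"small": (81, 87), "large": (192, 263)},
    "PN": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    """Success rate as a fraction between 0 and 1."""
    return successes / patients


def pooled(treatment):
    """Total (successes, patients) for a treatment, ignoring stone size."""
    successes = sum(s for s, _ in DATA[treatment].values())
    patients = sum(n for _, n in DATA[treatment].values())
    return successes, patients


# Within each stone-size stratum, compare the two treatments.
for size in ("small", "large"):
    aos = rate(*DATA["AOS"][size])
    pn = rate(*DATA["PN"][size])
    print(f"{size} stones: AOS {aos:.1%} vs. PN {pn:.1%}")

# Pooling over stone size reverses the ranking.
aos_pooled = rate(*pooled("AOS"))
pn_pooled = rate(*pooled("PN"))
print(f"pooled: AOS {aos_pooled:.1%} vs. PN {pn_pooled:.1%}")
```

AOS has the higher rate within each stratum, yet the lower rate after pooling, because the two treatments were applied to very different mixtures of small and large stones. The same logic applies when translation initiation is ignored in assessing the effect of codon usage on protein production.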
References
- Abdel-Hameed EA, Ji H, Shata MT. HIV-induced epigenetic alterations in host cells. Adv Exp Med Biol. 2016;879:27–38. doi: 10.1007/978-3-319-24738-0_2.
- Abolbaghaei A, Silke JR, Xia X. How changes in anti-SD sequences would affect SD sequences in Escherichia coli and Bacillus subtilis. G3 (Bethesda, Md). 2017;7(5):1607–1615. doi: 10.1534/g3.117.039305.
- Abraham EP, Chain E. An enzyme from bacteria able to destroy penicillin. Rev Infect Dis. 1940;10(4):677–678.
- Abraham EP, Chain E, Fletcher CM, Florey HW, Gardner AD, Heatley NG, Jennings MA. Further observations on penicillin. Lancet. 1941;238(6155):177–189. doi: 10.1016/S0140-6736(00)72122-2.
- Abraham JM, Feagin JE, Stuart K. Characterization of cytochrome c oxidase III transcripts that are edited only in the 3′ region. Cell. 1988;55(2):267–272. doi: 10.1016/0092-8674(88)90049-9.
- Adamski FM, McCaughan KK, Jorgensen F, Kurland CG, Tate WP. The concentration of polypeptide chain release factors 1 and 2 at different growth rates of Escherichia coli. J Mol Biol. 1994;238(3):302–308. doi: 10.1006/jmbi.1994.1293.
- Aerts S, Van Loo P, Thijs G, Mayer H, de Martin R, Moreau Y, De Moor B. TOUCAN 2: the all-inclusive open source workbench for regulatory sequence analysis. Nucleic Acids Res. 2005;33(Web Server):W393–W396. doi: 10.1093/nar/gki354.
- Aerts S, van Helden J, Sand O, Hassan BA. Fine-tuning enhancer models to predict transcriptional targets across multiple genomes. PLoS One. 2007;2(11):e1115. doi: 10.1371/journal.pone.0001115.
- Ahn BY, Jones EV, Moss B. Identification of the vaccinia virus gene encoding an 18-kilodalton subunit of RNA polymerase and demonstration of a 5′ poly(A) leader on its early transcript. J Virol. 1990;64(6):3019–3024. doi: 10.1128/jvi.64.6.3019-3024.1990.
- Aird WC, Parvin JD, Sharp PA, Rosenberg RD. The interaction of GATA-binding proteins and basal transcription factors with GATA box-containing core promoters. A model of tissue-specific gene expression. J Biol Chem. 1994;269(2):883–889.
- Akaike H. Information theory and an extension of maximum likelihood principle. In: Petrov BN, Csaki F, editors. Second international symposium on information theory. Budapest: Akademiai Kiado; 1973. pp. 267–281.
- Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19:716–723. doi: 10.1109/TAC.1974.1100705.
- Akashi H. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics. 1994;136(3):927–935. doi: 10.1093/genetics/136.3.927.
- Akashi H, Gojobori T. Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proc Natl Acad Sci USA. 2002;99(6):3695–3700. doi: 10.1073/pnas.062526999.
- Alatortsev VS, Cruz-Reyes J, Zhelonkina AG, Sollner-Webb B. Trypanosoma brucei RNA editing: coupled cycles of U deletion reveal processive activity of the editing complex. Mol Cell Biol. 2008;28(7):2437–2445. doi: 10.1128/MCB.01886-07.
- Alderwick LJ, Seidel M, Sahm H, Besra GS, Eggeling L. Identification of a novel arabinofuranosyltransferase (AftA) involved in cell wall arabinan biosynthesis in Mycobacterium tuberculosis. J Biol Chem. 2006;281(23):15653–15661. doi: 10.1074/jbc.M600045200.
- Allen A, Flemstrom G, Garner A, Kivilaakso E. Gastroduodenal mucosal protection. Physiol Rev. 1993;73(4):823–857. doi: 10.1152/physrev.1993.73.4.823.
- Alm RA, Trust TJ. Analysis of the genetic diversity of Helicobacter pylori: the tale of two genomes. J Mol Med. 1999;77(12):834–846. doi: 10.1007/s001099900067.
- Alm RA, Ling LS, Moir DT, King BL, Brown ED, Doig PC, Smith DR, Noonan B, Guild BC, deJonge BL, et al. Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature. 1999;397(6715):176–180. doi: 10.1038/16495.
- Alm RA, Bina J, Andrews BM, Doig P, Hancock RE, Trust TJ. Comparative genomics of Helicobacter pylori: analysis of the outer membrane protein families. Infect Immun. 2000;68(7):4155–4168. doi: 10.1128/IAI.68.7.4155-4168.2000.
- Althaus E, Caprara A, Lenhof HP, Reinert K. Multiple sequence alignment with arbitrary gap costs: computing an optimal solution using polyhedral combinatorics. Bioinformatics. 2002;18(Suppl 2):S4–S16. doi: 10.1093/bioinformatics/18.suppl_2.S4.
- Altschul SF. Local alignment statistics. Meth Enzymol. 1996;274:460–480. doi: 10.1016/S0076-6879(96)66029-7.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
- Anderson KP, Crable SC, Lingrel JB. Multiple proteins binding to a GATA-E box-GATA motif regulate the erythroid Kruppel-like factor (EKLF) gene. J Biol Chem. 1998;273(23):14347–14354. doi: 10.1074/jbc.273.23.14347.
- Andersson DI, Kurland CG. Ram ribosomes are defective proofreaders. Mol Gen Genet. 1983;191(3):378–381. doi: 10.1007/BF00425749.
- Arava Y, Wang Y, Storey JD, Liu CL, Brown PO, Herschlag D. Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae. Proc Natl Acad Sci USA. 2003;100(7):3889–3894. doi: 10.1073/pnas.0635171100.
- Arbibe L, Sansonetti PJ. Epigenetic regulation of host response to LPS: causing tolerance while avoiding toll errancy. Cell Host Microbe. 2007;1(4):244–246. doi: 10.1016/j.chom.2007.05.011.
- Arnqvist G. Sensory exploitation and sexual conflict. Philos Trans R Soc Lond Ser B Biol Sci. 2006;361(1466):375–386. doi: 10.1098/rstb.2005.1790.
- Arvaniti E, Moulos P, Vakrakou A, Chatziantoniou C, Chadjichristos C, Kavvadas P, Charonis A, Politis PK. Whole-transcriptome analysis of UUO mouse model of renal fibrosis reveals new molecular players in kidney diseases. Sci Rep. 2016;6:26235. doi: 10.1038/srep26235.
- Ast G. How did alternative splicing evolve? Nat Rev Genet. 2004;5(10):773–782. doi: 10.1038/nrg1451.
- Auch AF, Henz SR, Holland BR, Goker M. Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences. BMC Bioinform. 2006;7:350. doi: 10.1186/1471-2105-7-350.
- Awan AR, Manfredo A, Pleiss JA. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans. Proc Natl Acad Sci USA. 2013;110(31):12762–12767. doi: 10.1073/pnas.1218353110.
- Axon AT. Are all helicobacters equal? Mechanisms of gastroduodenal pathology and their clinical implications. Gut. 1999;45(Suppl 1):I1–I4. doi: 10.1136/gut.45.2008.i1.
- Bablanian R, Banerjee AK. Poly(riboadenylic acid) preferentially inhibits in vitro translation of cellular mRNAs compared with vaccinia virus mRNAs: possible role in vaccinia virus cytopathology. Proc Natl Acad Sci USA. 1986;83(5):1290–1294. doi: 10.1073/pnas.83.5.1290.
- Bablanian R, Coppola G, Masters PS, Banerjee AK. Characterization of vaccinia virus transcripts involved in selective inhibition of host protein synthesis. Virology. 1986;148(2):375–380. doi: 10.1016/0042-6822(86)90334-X.
- Bablanian R, Goswami SK, Esteban M, Banerjee AK. Selective inhibition of protein synthesis by synthetic and vaccinia virus-core synthesized poly(riboadenylic acids). Virology. 1987;161(2):366–373. doi: 10.1016/0042-6822(87)90129-2.
- Bablanian R, Scribani S, Esteban M. Amplification of polyadenylated nontranslated small RNA sequences (POLADS) during superinfection correlates with the inhibition of viral and cellular protein synthesis. Cell Mol Biol Res. 1993;39(3):243–255.
- Bag J. Feedback inhibition of poly(A)-binding protein mRNA translation. A possible mechanism of translation arrest by stalled 40 S ribosomal subunits. J Biol Chem. 2001;276(50):47352–47360. doi: 10.1074/jbc.M107676200.
- Bag J, Bhattacharjee RB. Multiple levels of post-transcriptional control of expression of the poly(A)-binding protein. RNA Biol. 2010;7(1):5–12. doi: 10.4161/rna.7.1.10256.
- Baik SC, Kim KM, Song SM, Kim DS, Jun JS, Lee SG, Song JY, Park JU, Kang HL, Lee WK, et al. Proteomic analysis of the sarcosine-insoluble outer membrane fraction of Helicobacter pylori strain 26695. J Bacteriol. 2004;186(4):949–955. doi: 10.1128/JB.186.4.949-955.2004.
- Bailey TL, Williams N, Misleh C, Li WW. MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res. 2006;34(Web Server issue):W369–W373. doi: 10.1093/nar/gkl198.
- Baird SD, Turcotte M, Korneluk RG, Holcik M. Searching for IRES. RNA. 2006;12(10):1755–1785. doi: 10.1261/rna.157806.
- Baird SD, Lewis SM, Turcotte M, Holcik M. A search for structurally similar cellular internal ribosome entry sites. Nucleic Acids Res. 2007;35(14):4664–4677. doi: 10.1093/nar/gkm483.
- Baldi P, Brunak S. Bioinformatics: the machine learning approach. Cambridge, MA: The MIT Press; 2001.
- Bamford DH, Caldentey J, Bamford JK. Bacteriophage PRD1: a broad host range DSDNA tectivirus with an internal membrane. Adv Virus Res. 1995;45:281–319. doi: 10.1016/S0065-3527(08)60064-0.
- Bao J, Bedford MT. Epigenetic regulation of the histone-to-protamine transition during spermiogenesis. Reproduction. 2016;151(5):R55–R70. doi: 10.1530/REP-15-0562.
- Baron D, Cocquet J, Xia X, Fellous M, Guiguen Y, Veitia RA. An evolutionary and functional analysis of FoxL2 in rainbow trout gonad differentiation. J Mol Endocrinol. 2004;33:705–715. doi: 10.1677/jme.1.01566.
- Bastianelli G, Bouillon A, Nguyen C, Crublet E, Petres S, Gorgette O, Le-Nguyen D, Barale JC, Nilges M. Computational reverse-engineering of a spider-venom derived peptide active against Plasmodium falciparum SUB1. PLoS One. 2011;6(7):e21812. doi: 10.1371/journal.pone.0021812.
- Bauerfeind P, Garner R, Dunn BE, Mobley HL. Synthesis and activity of Helicobacter pylori urease and catalase at low pH. Gut. 1997;40(1):25–30. doi: 10.1136/gut.40.1.25.
- Baumgartner HK, Montrose MH. Regulated alkali secretion acts in tandem with unstirred layers to regulate mouse gastric surface pH. Gastroenterology. 2004;126(3):774–783. doi: 10.1053/j.gastro.2003.11.059.
- Beier H, Grimm M. Misreading of termination codons in eukaryotes by natural nonsense suppressor tRNAs. Nucleic Acids Res. 2001;29(23):4767–4782. doi: 10.1093/nar/29.23.4767.
- Bell D, Bell AH, Bondaruk J, Hanna EY, Weber RS. In-depth characterization of the salivary adenoid cystic carcinoma transcriptome with emphasis on dominant cell type. Cancer. 2016;122(10):1513–1522. doi: 10.1002/cncr.29959.
- Ben-Gal I, Shani A, Gohr A, Grau J, Arviv S, Shmilovici A, Posch S, Grosse I. Identification of transcription factor binding sites with variable-order Bayesian networks. Bioinformatics. 2005;21(11):2657–2666. doi: 10.1093/bioinformatics/bti410.
- Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57(1):289–300.
- Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple hypothesis testing under dependency. Ann Stat. 2001;29:1165–1188. doi: 10.1214/aos/1013699998.
- Bennetzen JL, Hall BD. Codon selection in yeast. J Biol Chem. 1982;257(6):3026–3031.
- Benoit G, Lemaitre C, Lavenier D, Drezen E, Dayris T, Uricaru R, Rizk G. Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph. BMC Bioinform. 2015;16:288. doi: 10.1186/s12859-015-0709-7.
- Benzer S, Champe SP. A change from nonsense to sense in the genetic code. Proc Natl Acad Sci USA. 1962;48:1114–1121. doi: 10.1073/pnas.48.7.1114.
- Berg JM, Tymoczko JL, Stryer L. Biochemistry. New York: W. H. Freeman and Co; 2002.
- Berger MF, Levin JZ, Vijayendran K, Sivachenko A, Adiconis X, Maguire J, Johnson LA, Robinson J, Verhaak RG, Sougnez C, et al. Integrative analysis of the melanoma transcriptome. Genome Res. 2010;20(4):413–427. doi: 10.1101/gr.103697.109.
- Bergsten E, Uutela M, Li X, Pietras K, Ostman A, Heldin CH, Alitalo K, Eriksson U. PDGF-D is a specific, protease-activated ligand for the PDGF beta-receptor. Nat Cell Biol. 2001;3(5):512–516. doi: 10.1038/35074588.
- Bertholet C, Van Meir E, ten Heggeler-Bordier B, Wittek R. Vaccinia virus produces late mRNAs by discontinuous synthesis. Cell. 1987;50(2):153–162. doi: 10.1016/0092-8674(87)90211-X.
- Besemer J, Borodovsky M. GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res. 2005;33(Web Server issue):W451–W454. doi: 10.1093/nar/gki487.
- Bestor TH, Coxon A. The pros and cons of DNA methylation. Curr Biol. 1993;6:384–386. doi: 10.1016/0960-9822(93)90209-7.
- Betney R, de Silva E, Krishnan J, Stansfield I. Autoregulatory systems controlling translation factor expression: thermostat-like control of translational accuracy. RNA. 2010;16(4):655–663. doi: 10.1261/rna.1796210.
- Beznoskova P, Gunisova S, Valasek LS. Rules of UGA-N decoding by near-cognate tRNAs and analysis of readthrough on short uORFs in yeast. RNA. 2016;22(3):456–466. doi: 10.1261/rna.054452.115.
- Bhagwat M, Aravind L. PSI-BLAST tutorial. Methods Mol Biol. 2007;395:177–186. doi: 10.1007/978-1-59745-514-5_10.
- Bhatia B, Ponia SS, Solanki AK, Dixit A, Garg LC. Identification of glutamate ABC-transporter component in Clostridium perfringens as a putative drug target. Bioinformation. 2014;10(7):401–405. doi: 10.6026/97320630010401.
- Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, Le JM, Delano D, Zhang L, Schroth GP, Gunderson KL, et al. High density DNA methylation array with single CpG site resolution. Genomics. 2011;98(4):288–295. doi: 10.1016/j.ygeno.2011.07.007.
- Bickel DR. Robust cluster analysis of microarray gene expression data with the number of clusters determined biologically. Bioinformatics. 2003;19(7):818–824. doi: 10.1093/bioinformatics/btg092.
- Bierne H, Hamon M, Cossart P. Epigenetics and bacterial infections. Cold Spring Harb Perspect Med. 2012;2(12):a010272. doi: 10.1101/cshperspect.a010272.
- Bigaud E, Corrales FJ. Methylthioadenosine (MTA) regulates liver cells proteome and methylproteome: implications in liver biology and disease. Mol Cell Proteomics. 2016;15(5):1498–1510. doi: 10.1074/mcp.M115.055772.
- Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447(7146):799–816. doi: 10.1038/nature05874.
- Bjorkholm B, Lundin A, Sillen A, Guillemin K, Salama N, Rubio C, Gordon JI, Falk P, Engstrand L. Comparison of genetic divergence and fitness between two subclones of Helicobacter pylori. Infect Immun. 2001;69(12):7832–7838. doi: 10.1128/IAI.69.12.7832-7838.2001.
- Bjornsson A, Isaksson LA. Accumulation of a mRNA decay intermediate by ribosomal pausing at a stop codon. Nucleic Acids Res. 1996;24(9):1753–1757. doi: 10.1093/nar/24.9.1753.
- Blackburne BP, Whelan S. Class of multiple sequence alignment algorithm affects genomic analysis. Mol Biol Evol. 2013;30(3):642–653. doi: 10.1093/molbev/mss256.
- Blakqori G, van Knippenberg I, Elliott RM. Bunyamwera orthobunyavirus S-segment untranslated regions mediate poly(A) tail-independent translation. J Virol. 2009;83(8):3637–3646. doi: 10.1128/JVI.02201-08.
- Blanchet S, Cornu D, Argentini M, Namy O. New insights into the incorporation of natural suppressor tRNAs at stop codons in Saccharomyces cerevisiae. Nucleic Acids Res. 2014;42(15):10061–10072. doi: 10.1093/nar/gku663.
- Blanchette M, Tompa M. Discovery of regulatory elements by a computational method for phylogenetic footprinting. Genome Res. 2002;12(5):739–748. doi: 10.1101/gr.6902.
- Blanchette M, Bataille AR, Chen X, Poitras C, Laganiere J, Lefebvre C, Deblois G, Giguere V, Ferretti V, Bergeron D, et al. Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression. Genome Res. 2006;6(5):656–668. doi: 10.1101/gr.4866006.
- Boehringer D, Thermann R, Ostareck-Lederer A, Lewis JD, Stark H. Structure of the hepatitis C virus IRES bound to the human 80S ribosome: remodeling of the HCV IRES. Structure. 2005;13(11):1695. doi: 10.1016/j.str.2005.08.008.
- Bogenhagen DF, Clayton DA. The mitochondrial DNA replication bubble has not burst. Trends Biochem Sci. 2003;28(7):357–360. doi: 10.1016/S0968-0004(03)00132-4.
- Bolden JE, Peart MJ, Johnstone RW. Anticancer activities of histone deacetylase inhibitors. Nat Rev Drug Discov. 2006;5(9):769–784. doi: 10.1038/nrd2133.
- Borodovsky M, McIninch J. GENMARK: parallel gene recognition for both DNA strands. Comput Chem. 1993;17:123–133. doi: 10.1016/0097-8485(93)85004-V.
- Bossi L. Context effects: translation of UAG codon by suppressor tRNA is affected by the sequence following UAG in the message. J Mol Biol. 1983;164(1):73–87. doi: 10.1016/0022-2836(83)90088-8.
- Bossi L, Ruth JR. The influence of codon context on genetic code translation. Nature. 1980;286(5769):123–127. doi: 10.1038/286123a0.
- Brauch H, Weirich G, Brieger J, Glavac D, Rodl H, Eichinger M, Feurer M, Weidt E, Puranakanitstha C, Neuhaus C, et al. VHL alterations in human clear cell renal cell carcinoma: association with advanced tumor stage and a novel hot spot mutation. Cancer Res. 2000;60(7):1942–1948.
- Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, et al. Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet. 2001;29(4):365–371. doi: 10.1038/ng1201-365.
- Britten RJ. Rates of DNA sequence evolution differ between taxonomic groups. Science. 1986;231:1393–1398. doi: 10.1126/science.3082006.
- Brooks DR, McLennan DA. Phylogeny, ecology and behavior: a research program in comparative biology. Chicago: University of Chicago Press; 1991.
- Brown CM, Stockwell PA, Trotman CN, Tate WP. Sequence analysis suggests that tetra-nucleotides signal the termination of protein synthesis in eukaryotes. Nucleic Acids Res. 1990;18(21):6339–6345. doi: 10.1093/nar/18.21.6339.
- Brown M, Hughey R, Krogh A, Mian IS, Sjolander K, Haussler D. Using Dirichlet mixture priors to derive hidden Markov models for protein families. Proc Int Conf Intell Syst Mol Biol. 1993;1:47–55.
- Brown TA, Cecconi C, Tkachuk AN, Bustamante C, Clayton DA. Replication of mitochondrial DNA occurs by strand displacement with alternative light-strand origins, not via a strand-coupled mechanism. Genes Dev. 2005;19(20):2466–2476. doi: 10.1101/gad.1352105.
- Brumme ZL, Dong WW, Yip B, Wynhoven B, Hoffman NG, Swanstrom R, Jensen MA, Mullins JI, Hogg RS, Montaner JS, et al. Clinical and immunological impact of HIV envelope V3 sequence variation after starting initial triple antiretroviral therapy. AIDS. 2004;18(4):F1–F9. doi: 10.1097/00002030-200403050-00001.
- Bucklew JA. Large deviation techniques in decision, simulation, and estimation. New York: Wiley; 1990.
- Bulmer M. The effect of context on synonymous codon usage in genes with low codon usage bias. Nucleic Acids Res. 1990;18(10):2869–2873. doi: 10.1093/nar/18.10.2869.
- Bulmer M. The selection-mutation-drift theory of synonymous codon usage. Genetics. 1991;129:897–907. doi: 10.1093/genetics/129.3.897.
- Bumann D, Aksu S, Wendland M, Janek K, Zimny-Arndt U, Sabarth N, Meyer TF, Jungblut PR. Proteome analysis of secreted proteins of the gastric pathogen Helicobacter pylori. Infect Immun. 2002;70(7):3396–3403. doi: 10.1128/IAI.70.7.3396-3403.2002.
- Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997;268:78–94. doi: 10.1006/jmbi.1997.0951.
- Burge CB, Karlin S. Finding the genes in genomic DNA. Curr Opin Struct Biol. 1998;8(3):346–354. doi: 10.1016/S0959-440X(98)80069-9.
- Burnham KP, Anderson DR. Model selection and multimodel inference: a practical information-theoretic approach. New York: Springer; 2002.
- Bury-Mone S, Skouloubris S, Labigne A, De Reuse H. The Helicobacter pylori UreI protein: role in adaptation to acidity and identification of residues essential for its activity and for acid activation. Mol Microbiol. 2001;42(4):1021–1034. doi: 10.1046/j.1365-2958.2001.02689.x. [DOI] [PubMed] [Google Scholar]
- Calderone TL, Stevens RD, Oas TG. High-level misincorporation of lysine for arginine at AGA codons in a fusion protein expressed in Escherichia coli. J Mol Biol. 1996;262(4):407–412. doi: 10.1006/jmbi.1996.0524. [DOI] [PubMed] [Google Scholar]
- Cao Y, Janke A, Waddell PJ, Westerman M, Takenaka O, Murata S, Okada N, Paabo S, Hasegawa M. Conflict among individual mitochondrial proteins in resolving the phylogeny of eutherian orders. J Mol Evol. 1998;47(3):307–322. doi: 10.1007/PL00006389. [DOI] [PubMed] [Google Scholar]
- Capecchi MR. Polypeptide chain termination in vitro: isolation of a release factor. Proc Natl Acad Sci USA. 1967;58(3):1144–1151. doi: 10.1073/pnas.58.3.1144. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Capuano F, Mulleder M, Kok R, Blom HJ, Ralser M. Cytosine DNA methylation is found in Drosophila melanogaster but absent in Saccharomyces cerevisiae, Schizosaccharomyces pombe, and other yeast species. Anal Chem. 2014;86(8):3697–3702. doi: 10.1021/ac500447w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cardon LR, Burge C, Clayton DA, Karlin S. Pervasive CpG suppression in animal mitochondrial genomes. Proc Natl Acad Sci USA. 1994;91:3799–3803. doi: 10.1073/pnas.91.9.3799. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Carlini DB. Context-dependent codon bias and messenger RNA longevity in the yeast transcriptome. Mol Biol Evol. 2005;22(6):1403–1411. doi: 10.1093/molbev/msi135. [DOI] [PubMed] [Google Scholar]
- Carroll J, Fearnley IM, Shannon RJ, Hirst J, Walker JE. Analysis of the subunit composition of complex I from bovine heart mitochondria. Mol Cell Proteomics. 2003;2(2):117–126. doi: 10.1074/mcp.M300014-MCP200. [DOI] [PubMed] [Google Scholar]
- Carullo M, Xia X. An extensive study of mutation and selection on the wobble nucleotide in tRNA anticodons in fungal mitochondrial genomes. J Mol Evol. 2008;66(5):484–493. doi: 10.1007/s00239-008-9102-8. [DOI] [PubMed] [Google Scholar]
- Censini S, Lange C, Xiang Z, Crabtree JE, Ghiara P, Borodovsky M, Rappuoli R, Covacci A. Cag, a pathogenicity island of Helicobacter pylori, encodes type I-specific and disease-associated virulence factors. Proc Natl Acad Sci USA. 1996;93(25):14648–14653. doi: 10.1073/pnas.93.25.14648. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cesar Sanchez J, Padron G, Santana H, Herrera L. Elimination of an HuIFN alpha 2b readthrough species, produced in Escherichia coli, by replacing its natural translational stop signal. J Biotechnol. 1998;63(3):179–186. doi: 10.1016/S0168-1656(98)00073-X. [DOI] [PubMed] [Google Scholar]
- Chakrabarti S, Lanczycki CJ. Analysis and prediction of functionally important sites in proteins. Protein Sci. 2007;16(1):4–13. doi: 10.1110/ps.062506407. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chakraborty R. Estimation of time of divergence from phylogenetic studies. Can J Genet Cytol. 1977;19:217–223. doi: 10.1139/g77-024. [DOI] [PubMed] [Google Scholar]
- Chambaud I, Heilig R, Ferris S, Barbe V, Samson D, Galisson F, Moszer I, Dybvig K, Wroblewski H, Viari A, et al. The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 2001;29(10):2145–2153. doi: 10.1093/nar/29.10.2145. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chan S-W, Egan P. Effects of hepatitis C virus envelope glycoprotein unfolded protein response activation on translation and transcription. Arch Virol. 2009;154(10):1631–1640. doi: 10.1007/s00705-009-0495-5. [DOI] [PubMed] [Google Scholar]
- Chan PP, Lowe TM. GtRNAdb: a database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res. 2009;37(Database issue):D93–D97. doi: 10.1093/nar/gkn787. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chang SY, McGary EC, Chang S. Methionine aminopeptidase gene of Escherichia coli is essential for cell growth. J Bacteriol. 1989;171(7):4071–4072. doi: 10.1128/jb.171.7.4071-4072.1989. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Charig CR, Webb DR, Payne SR, Wickham JE. Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy. Br Med J (Clin Res Ed) 1986;292(6524):879–882. doi: 10.1136/bmj.292.6524.879. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen JJ, Peck K, Hong TM, Yang SC, Sher YP, Shih JY, Wu R, Cheng JL, Roffler SR, Wu CW, et al. Global analysis of gene expression in invasion by a lung cancer model. Cancer Res. 2001;61(13):5223–5230.
- Chen Q, Yan M, Cao Z, Li X, Zhang Y, Shi J, Feng GH, Peng H, Zhang X, Qian J, et al. Sperm tsRNAs contribute to intergenerational inheritance of an acquired metabolic disorder. Science. 2016;351(6271):397–400. doi: 10.1126/science.aad7977.
- Chilingaryan A, Gevorgyan N, Vardanyan A, Jones D, Szabo A. Multivariate approach for selecting sets of differentially expressed genes. Math Biosci. 2002;176(1):59–69. doi: 10.1016/S0025-5564(01)00105-5.
- Chithambaram S, Prabhakaran R, Xia X. Differential codon adaptation between dsDNA and ssDNA phages in Escherichia coli. Mol Biol Evol. 2014;31(6):1606–1617. doi: 10.1093/molbev/msu087.
- Chithambaram S, Prabhakaran R, Xia X. The effect of mutation and selection on codon adaptation in Escherichia coli bacteriophage. Genetics. 2014;197(1):301–315. doi: 10.1534/genetics.114.162842.
- Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, et al. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell. 1998;2(1):65–73. doi: 10.1016/S1097-2765(00)80114-8.
- Chou PY, Fasman GD. Empirical predictions of protein conformation. Annu Rev Biochem. 1978;47:251–276. doi: 10.1146/annurev.bi.47.070178.001343.
- Chou PY, Fasman GD. Prediction of the secondary structure of proteins from their amino acid sequence. Adv Enzymol Relat Areas Mol Biol. 1978;47:45–148. doi: 10.1002/9780470122921.ch2.
- Chu C, Qu K, Zhong FL, Artandi SE, Chang HY. Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol Cell. 2011;44(4):667–678. doi: 10.1016/j.molcel.2011.08.027.
- Chu C, Quinn J, Chang HY. Chromatin isolation by RNA purification (ChIRP). J Vis Exp. 2012;61:e3912. doi: 10.3791/3912.
- Chuang SE, Daniels DL, Blattner FR. Global regulation of gene expression in Escherichia coli. J Bacteriol. 1993;175(7):2026–2036. doi: 10.1128/jb.175.7.2026-2036.1993.
- Clark AT. DNA methylation remodeling in vitro and in vivo. Curr Opin Genet Dev. 2015;34:82–87. doi: 10.1016/j.gde.2015.09.002.
- Claverie JM. Some useful statistical properties of position-weight matrices. Comput Chem. 1994;18(3):287–294. doi: 10.1016/0097-8485(94)85024-0.
- Claverie JM, Audic S. The statistical significance of nucleotide position-weight matrix matches. Comput Appl Biosci. 1996;12(5):431–439. doi: 10.1093/bioinformatics/12.5.431.
- Clayton DA. Replication of animal mitochondrial DNA. Cell. 1982;28(4):693–705. doi: 10.1016/0092-8674(82)90049-6.
- Clayton DA. Transcription and replication of mitochondrial DNA. Hum Reprod. 2000;15(Suppl 2):11–17. doi: 10.1093/humrep/15.suppl_2.11.
- Cocquet J, De Baere E, Gareil M, Pannetier M, Xia X, Fellous M, Veitia RA. Structure, evolution and expression of the FOXL2 transcription unit. Cytogenet Genome Res. 2003;101:206–211. doi: 10.1159/000074338.
- Coessens B, Thijs G, Aerts S, Marchal K, De Smet F, Engelen K, Glenisson P, Moreau Y, Mathys J, De Moor B. INCLUSive: a web portal and service registry for microarray and regulatory sequence analysis. Nucleic Acids Res. 2003;31(13):3468–3470. doi: 10.1093/nar/gkg615.
- Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast. 2000;16(12):1131–1145. doi: 10.1002/1097-0061(20000915)16:12<1131::AID-YEA609>3.0.CO;2-F.
- Comeron JM, Aguade M. An evaluation of measures of synonymous codon usage bias. J Mol Evol. 1998;47(3):268–274. doi: 10.1007/PL00006384.
- Correa P. Helicobacter pylori as a pathogen and carcinogen. J Physiol Pharmacol. 1997;48(Suppl 4):19–24.
- Cottrell JS. Protein identification by peptide mass fingerprinting. Pept Res. 1994;7(3):115–124.
- Cottrell JS, Sutton CW. The identification of electrophoretically separated proteins by peptide mass fingerprinting. Methods Mol Biol. 1996;61:67–82. doi: 10.1385/0-89603-345-7:67.
- Covacci A, Falkow S, Berg DE, Rappuoli R. Did the inheritance of a pathogenicity island modify the virulence of Helicobacter pylori? Trends Microbiol. 1997;5(5):205–208. doi: 10.1016/S0966-842X(97)01035-4.
- Covell DG, Wallqvist A, Rabow AA, Thanki N. Molecular classification of cancer: unsupervised self-organizing map analysis of gene expression microarray data. Mol Cancer Ther. 2003;2(3):317–332.
- Cox SS, van der Giezen M, Tarr SJ, Crompton MR, Tovar J. Evidence from bioinformatics, expression and inhibition studies of phosphoinositide-3 kinase signalling in Giardia intestinalis. BMC Microbiol. 2006;6:45. doi: 10.1186/1471-2180-6-45.
- Craigen WJ, Caskey CT. Expression of peptide chain release factor 2 requires high-efficiency frameshift. Nature. 1986;322(6076):273–275. doi: 10.1038/322273a0.
- Craigen WJ, Caskey CT. The function, structure and regulation of E. coli peptide chain release factors. Biochimie. 1987;69(10):1031–1041. doi: 10.1016/0300-9084(87)90003-4.
- Craigen WJ, Cook RG, Tate WP, Caskey CT. Bacterial peptide chain release factors: conserved primary structure and possible frameshift regulation of release factor 2. Proc Natl Acad Sci USA. 1985;82(11):3616–3620. doi: 10.1073/pnas.82.11.3616.
- Craigen WJ, Lee CC, Caskey CT. Recent advances in peptide chain termination. Mol Microbiol. 1990;4(6):861–865. doi: 10.1111/j.1365-2958.1990.tb00658.x.
- Crick FH. Codon—anticodon pairing: the wobble hypothesis. J Mol Biol. 1966;19(2):548–555. doi: 10.1016/S0022-2836(66)80022-0.
- Curran JF, Yarus M. Use of tRNA suppressors to probe regulation of Escherichia coli release factor 2. J Mol Biol. 1988;203(1):75–83. doi: 10.1016/0022-2836(88)90092-7.
- Czerwoniec A, Dunin-Horkawicz S, Purta E, Kaminska KH, Kasprzak JM, Bujnicki JM, Grosjean H, Rother K. MODOMICS: a database of RNA modification pathways. 2008 update. Nucleic Acids Res. 2009;37(Database issue):D118–D121. doi: 10.1093/nar/gkn710.
- Danchin A. The Delphic boat: what genomes tell us. Cambridge, MA: Harvard University Press; 2002.
- David E, Tramontin T, Zemmel R. Pharmaceutical R&D: the road to positive returns. Nat Rev Drug Discov. 2009;8(8):609–610. doi: 10.1038/nrd2948.
- Davies J, Jones DS, Khorana HG. A further study of misreading of codons induced by streptomycin and neomycin using ribopolynucleotides containing two nucleotides in alternating sequence as templates. J Mol Biol. 1966;18(1):48–57. doi: 10.1016/S0022-2836(66)80075-X.
- Dayhoff MO, Schwartz RM, Orcutt BC. A model of evolutionary change in proteins. In: Dayhoff MO, editor. Atlas of protein sequence and structure. Washington, DC: National Biomedical Research Foundation; 1978. pp. 345–352.
- Delorenzi M, Speed T. An HMM model for coiled-coil domains and a comparison with PSSM-based predictions. Bioinformatics. 2002;18(4):617–625. doi: 10.1093/bioinformatics/18.4.617.
- Deng R, Huang M, Wang J, Huang Y, Yang J, Feng J, Wang X. PTreeRec: phylogenetic tree reconstruction based on genome BLAST distance. Comput Biol Chem. 2006;30(4):300–302. doi: 10.1016/j.compbiolchem.2006.04.003.
- Deng W, Lee J, Wang H, Miller J, Reik A, Gregory PD, Dean A, Blobel GA. Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell. 2012;149(6):1233–1244. doi: 10.1016/j.cell.2012.03.051.
- Deng Q, Ramskold D, Reinius B, Sandberg R. Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science. 2014;343(6167):193–196. doi: 10.1126/science.1245316.
- Deng W, Rupon JW, Krivega I, Breda L, Motta I, Jahn KS, Reik A, Gregory PD, Rivella S, Dean A, et al. Reactivation of developmentally silenced globin genes by forced chromatin looping. Cell. 2014;158(4):849–860. doi: 10.1016/j.cell.2014.05.050.
- Desper R, Gascuel O. Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J Comput Biol. 2002;9(5):687–705. doi: 10.1089/106652702761034136.
- Dewey CN, Rogozin IB, Koonin EV. Compensatory relationship between splice sites and exonic splicing signals depending on the length of vertebrate introns. BMC Genomics. 2006;7:311. doi: 10.1186/1471-2164-7-311.
- Diehn M, Eisen MB, Botstein D, Brown PO. Large-scale identification of secreted and membrane-associated gene products using DNA microarrays. Nat Genet. 2000;25(1):58–62. doi: 10.1038/75603.
- Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21. doi: 10.1093/bioinformatics/bts635.
- Dobzhansky T. Nothing in biology makes sense except in the light of evolution. Am Biol Teach. 1973;35:125–129. doi: 10.2307/4444260.
- Donly BC, Edgar CD, Adamski FM, Tate WP. Frameshift autoregulation in the gene for Escherichia coli release factor 2: partly functional mutants result in frameshift enhancement. Nucleic Acids Res. 1990;18(22):6517–6522. doi: 10.1093/nar/18.22.6517.
- Doolittle RF, Hunkapiller MW, Hood LE, Devare SG, Robbins KC, Aaronson SA, Antoniades HN. Simian sarcoma virus onc gene, v-sis, is derived from the gene (or genes) encoding a platelet-derived growth factor. Science. 1983;221(4607):275–277. doi: 10.1126/science.6304883.
- Dorokhov YL, Skulachev MV, Ivanov PA, Zvereva SD, Tjulkina LG, Merits A, Gleba YY, Hohn T, Atabekov JG. Polypurine (A)-rich sequences promote cross-kingdom conservation of internal ribosome entry. Proc Natl Acad Sci USA. 2002;99(8):5301–5306. doi: 10.1073/pnas.082107599.
- dos Reis M, Savva R, Wernisch L. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 2004;32(17):5036–5044. doi: 10.1093/nar/gkh834.
- Doudna JA, Sarnow P. Translation initiation by viral internal ribosome entry sites. In: Mathews MB, Sonenberg N, Hershey J, editors. Translational control in biology and medicine. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 2007. pp. 129–154.
- Drews J, Ryser S. The role of innovation in drug development. Nat Biotechnol. 1997;15(13):1318–1319. doi: 10.1038/nbt1297-1318.
- Drouin G, Daoud H, Xia J. Relative rates of synonymous substitutions in the mitochondrial, chloroplast and nuclear genomes of seed plants. Mol Phylogenet Evol. 2008;49(3):827–831. doi: 10.1016/j.ympev.2008.09.009.
- Drummond A, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7(1):214. doi: 10.1186/1471-2148-7-214.
- Drummond A, Rodrigo AG. Reconstructing genealogies of serial samples under the assumption of a molecular clock using serial-sample UPGMA. Mol Biol Evol. 2000;17(12):1807–1815. doi: 10.1093/oxfordjournals.molbev.a026281.
- Drummond A, Forsberg R, Rodrigo AG. The inference of stepwise changes in substitution rates using serial sequence samples. Mol Biol Evol. 2001;18(7):1365–1371. doi: 10.1093/oxfordjournals.molbev.a003920.
- Drummond AJ, Pybus OG, Rambaut A, Forsberg R, Rodrigo AG. Measurably evolving populations. Trends Ecol Evol. 2003;18(9):481–488. doi: 10.1016/S0169-5347(03)00216-7.
- Drummond A, Pybus OG, Rambaut A. Inference of viral evolutionary rates from molecular sequences. Adv Parasitol. 2003;54:331–358. doi: 10.1016/S0065-308X(03)54008-8.
- Durbin R. Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge: Cambridge University Press; 1998.
- Duret L, Mouchiroud D. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis. Proc Natl Acad Sci USA. 1999;96(8):4482–4487. doi: 10.1073/pnas.96.8.4482.
- DuRose JB, Scheuner D, Kaufman RJ, Rothblum LI, Niwa M. Phosphorylation of eukaryotic translation initiation factor 2alpha coordinates rRNA transcription and translation inhibition during endoplasmic reticulum stress. Mol Cell Biol. 2009;29(15):4295–4307. doi: 10.1128/MCB.00260-09.
- Duval M, Korepanov A, Fuchsbauer O, Fechter P, Haller A, Fabbretti A, Choulier L, Micura R, Klaholz BP, Romby P, et al. Escherichia coli ribosomal protein S1 unfolds structured mRNAs onto the ribosome for active translation initiation. PLoS Biol. 2013;11(12):e1001731. doi: 10.1371/journal.pbio.1001731.
- Eckhardt F, Lewin J, Cortese R, Rakyan VK, Attwood J, Burger M, Burton J, Cox TV, Davies R, Down TA, et al. DNA methylation profiling of human chromosomes 6, 20 and 22. Nat Genet. 2006;38(12):1378–1385. doi: 10.1038/ng1909.
- Eddy SR. Hidden Markov models. Curr Opin Struct Biol. 1996;6(3):361–365. doi: 10.1016/S0959-440X(96)80056-X.
- Eddy SR. Profile hidden Markov models. Bioinformatics. 1998;14(9):755–763. doi: 10.1093/bioinformatics/14.9.755.
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–1797. doi: 10.1093/nar/gkh340.
- Edgar RC, Batzoglou S. Multiple sequence alignment. Curr Opin Struct Biol. 2006;16(3):368–373. doi: 10.1016/j.sbi.2006.04.004.
- Efron B. The jackknife, the bootstrap and other resampling plans. Philadelphia: Society for Industrial and Applied Mathematics; 1982.
- Ehnman M, Missiaglia E, Folestad E, Selfe J, Strell C, Thway K, Brodin B, Pietras K, Shipley J, Ostman A, et al. Distinct effects of ligand-induced PDGFRalpha and PDGFRbeta signaling in the human rhabdomyosarcoma tumor cell and stroma cell compartments. Cancer Res. 2013;73(7):2139–2149. doi: 10.1158/0008-5472.CAN-12-1646.
- Ehrenberg M, Tenson T. A new beginning of the end of translation. Nat Struct Biol. 2002;9(2):85–87. doi: 10.1038/nsb0202-85.
- Einstein A, Russell B, Dewey J, Millikan RA, Dreiser T, Wells HG, Nansen F, Jeans SJ, Babbitt I, Keith SA, et al. Living philosophies. New York: Simon and Schuster; 1931.
- Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998;95(25):14863–14868. doi: 10.1073/pnas.95.25.14863.
- Elf J, Nilsson D, Tenson T, Ehrenberg M. Selective charging of tRNA isoacceptors explains patterns of codon usage. Science. 2003;300(5626):1718–1722. doi: 10.1126/science.1083811.
- Elroy-Stein O, Merrick W. Translation initiation via cellular internal ribosome entry sites. In: Mathews MB, Sonenberg N, Hershey J, editors. Translational control in biology and medicine. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 2007. pp. 155–172.
- Engel E, Peskoff A, Kauffman GL Jr, Grossman MI. Analysis of hydrogen ion concentration in the gastric gel mucus layer. Am J Physiol. 1984;247(4 Pt 1):G321–G338. doi: 10.1152/ajpgi.1984.247.4.G321.
- Engelberg-Kulka H. UGA suppression by normal tRNA Trp in Escherichia coli: codon context effects. Nucleic Acids Res. 1981;9(4):983–991. doi: 10.1093/nar/9.4.983.
- Epstein CB, Butow RA. Microarray technology – enhanced versatility, persistent challenge. Curr Opin Biotechnol. 2000;11(1):36–41. doi: 10.1016/S0958-1669(99)00065-8.
- Eswarappa SM, Potdar AA, Koch WJ, Fan Y, Vasu K, Lindner D, Willard B, Graham LM, DiCorleto PE, Fox PL. Programmed translational readthrough generates antiangiogenic VEGF-Ax. Cell. 2014;157(7):1605–1618. doi: 10.1016/j.cell.2014.04.033.
- Evans T, Felsenfeld G, Reitman M. Control of globin gene transcription. Annu Rev Cell Biol. 1990;6:95–124. doi: 10.1146/annurev.cb.06.110190.000523.
- Eyre-Walker A. The close proximity of Escherichia coli genes: consequences for stop codon and synonymous codon use. J Mol Evol. 1996;42(2):73–78. doi: 10.1007/BF02198830.
- Eyre-Walker A, Bulmer M. Reduced synonymous substitution rate at the start of enterobacterial genes. Nucleic Acids Res. 1993;21:4599–4603. doi: 10.1093/nar/21.19.4599.
- Ezzell C. Proteins rule. Sci Am. 2002;286(4):40–47. doi: 10.1038/scientificamerican0402-40.
- Farazi TA, Waksman G, Gordon JI. The biology and enzymology of protein N-myristoylation. J Biol Chem. 2001;276(43):39501–39504. doi: 10.1074/jbc.R100042200.
- Farnham PJ, Platt T. Rho-independent termination: dyad symmetry in DNA causes RNA polymerase to pause during transcription in vitro. Nucleic Acids Res. 1981;9(3):563–577. doi: 10.1093/nar/9.3.563.
- Fasman GD, Chou PY. Prediction of protein conformation: consequences and aspirations. In: Blout ER, Bovey FA, Goodman M, Lotan N, editors. Peptides, polypeptides and proteins. New York: Wiley; 1974. pp. 114–125.
- Fatemi M, Hermann A, Pradhan S, Jeltsch A. The activity of the murine DNA methyltransferase Dnmt1 is controlled by interaction of the catalytic domain with the N-terminal part of the enzyme leading to an allosteric activation of the enzyme after binding to methylated DNA. J Mol Biol. 2001;309(5):1189–1199. doi: 10.1006/jmbi.2001.4709.
- Felsenstein J. Maximum-likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters. Syst Zool. 1973;22:240–249. doi: 10.2307/2412304.
- Felsenstein J. Cases in which parsimony and compatibility methods will be positively misleading. Syst Zool. 1978;27:401–410. doi: 10.2307/2412923.
- Felsenstein J. The number of evolutionary trees. Syst Zool. 1978;27:27–33. doi: 10.2307/2412810.
- Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981;17:368–376. doi: 10.1007/BF01734359.
- Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–791. doi: 10.1111/j.1558-5646.1985.tb00420.x.
- Felsenstein J. Inferring phylogenies. Sunderland: Sinauer; 2004.
- Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996;13(1):93–104. doi: 10.1093/oxfordjournals.molbev.a025575.
- Feng DF, Doolittle RF. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol. 1987;25(4):351–360. doi: 10.1007/BF02603120.
- Feng DF, Doolittle RF. Progressive alignment and phylogenetic tree construction of protein sequences. Methods Enzymol. 1990;183:375–387. doi: 10.1016/0076-6879(90)83025-5.
- Fernandez-Pinar R, Lo Sciuto A, Rossi A, Ranucci S, Bragonzi A, Imperi F. In vitro and in vivo screening for novel essential cell-envelope proteins in Pseudomonas aeruginosa. Sci Rep. 2015;5:17593. doi: 10.1038/srep17593.
- Fickett JW. Quantitative discrimination of MEF2 sites. Mol Cell Biol. 1996;16(1):437–441. doi: 10.1128/MCB.16.1.437.
- Figeys D. Adapting arrays and lab-on-a-chip technology for proteomics. Proteomics. 2002;2(4):373–382. doi: 10.1002/1615-9861(200204)2:4<373::AID-PROT373>3.0.CO;2-I.
- Figeys D. Novel approaches to map protein interactions. Curr Opin Biotechnol. 2003;14(1):119–125. doi: 10.1016/S0958-1669(02)00005-8.
- Figeys D. Proteomics in 2002: a year of technical development and wide-ranging applications. Anal Chem. 2003;75(12):2891–2905. doi: 10.1021/ac030142m.
- Fisher RA. The arrangement of field experiments. J Minist Agric. 1926;33:503–513.
- Fisher RA. The use of multiple measurements in taxonomic problems. Ann Eugenics. 1936;7:179–188. doi: 10.1111/j.1469-1809.1936.tb02137.x.
- Fitch WM. Toward defining the course of evolution: minimum change for a specific tree topology. Syst Zool. 1971;20:406–416. doi: 10.2307/2412116.
- Fitch WM, Margoliash E. Construction of phylogenetic trees. Science. 1967;155:279–284. doi: 10.1126/science.155.3760.279.
- Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995;269(5223):496–512. doi: 10.1126/science.7542800.
- Fong TC, Emerson BM. The erythroid-specific protein cGATA-1 mediates distal enhancer activity through a specialized beta-globin TATA box. Genes Dev. 1992;6(4):521–532. doi: 10.1101/gad.6.4.521.
- Forde CE, McCutchen-Maloney SL. Characterization of transcription factors by mass spectrometry and the role of SELDI-MS. Mass Spectrom Rev. 2002;21(6):419–439. doi: 10.1002/mas.10040.
- Forrester WC, Epner E, Driscoll MC, Enver T, Brice M, Papayannopoulou T, Groudine M. A deletion of the human beta-globin locus activation region causes a major alteration in chromatin structure and replication across the entire beta-globin locus. Genes Dev. 1990;4(10):1637–1649. doi: 10.1101/gad.4.10.1637.
- Frank C, Makkonen H, Dunlop TW, Matilainen M, Vaisanen S, Carlberg C. Identification of pregnane X receptor binding sites in the regulatory regions of genes involved in bile acid homeostasis. J Mol Biol. 2005;346(2):505–519. doi: 10.1016/j.jmb.2004.12.003.
- Fraser CM, Gocayne JD, White O, Adams MD, Clayton RA, Fleischmann RD, Bult CJ, Kerlavage AR, Sutton G, Kelley JM, et al. The minimal gene complement of Mycoplasma genitalium. Science. 1995;270(5235):397–403. doi: 10.1126/science.270.5235.397.
- Frederico LA, Kunkel TA, Shaw BR. A sensitive genetic assay for the detection of cytosine deamination: determination of rate constants and the activation energy. Biochemistry. 1990;29(10):2532–2537. doi: 10.1021/bi00462a015.
- Frishman D, Mironov A, Mewes HW, Gelfand M. Combining diverse evidence for gene recognition in completely sequenced bacterial genomes. Nucleic Acids Res. 1998;26(12):2941–2947. doi: 10.1093/nar/26.12.2941.
- Frolova LY, Tsivkovskii RY, Sivolobova GF, Oparina NY, Serpinsky OI, Blinov VM, Tatkov SI, Kisselev LL. Mutations in the highly conserved GGQ motif of class 1 polypeptide release factors abolish ability of human eRF1 to trigger peptidyl-tRNA hydrolysis. RNA. 1999;5(8):1014–1020. doi: 10.1017/S135583829999043X.
- Frottin F, Martinez A, Peynot P, Mitra S, Holz RC, Giglione C, Meinnel T. The proteomics of N-terminal methionine cleavage. Mol Cell Proteomics. 2006;5(12):2336–2349. doi: 10.1074/mcp.M600225-MCP200.
- Furukawa R, Hachiya T, Ohmomo H, Shiwa Y, Ono K, Suzuki S, Satoh M, Hitomi J, Sobue K, Shimizu A. Intraindividual dynamics of transcriptome and genome-wide stability of DNA methylation. Sci Rep. 2016;6:26424. doi: 10.1038/srep26424.
- Futcher B, Latter GI, Monardo P, McLaughlin CS, Garrels JI. A sampling of the yeast proteome. Mol Cell Biol. 1999;19(11):7357–7368. doi: 10.1128/MCB.19.11.7357.
- Gaasterland T, Bekiranov S. Making the most of microarray data [news]. Nat Genet. 2000;24(3):204–206. doi: 10.1038/73392.
- Gallie DR, Tanguay R. Poly(A) binds to initiation factors and increases cap-dependent translation in vitro. J Biol Chem. 1994;269(25):17166–17173.
- Gal-Mor O, Finlay BB. Pathogenicity islands: a molecular toolbox for bacterial virulence. Cell Microbiol. 2006;8(11):1707–1719. doi: 10.1111/j.1462-5822.2006.00794.x.
- Galtier N, Lobry JR. Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J Mol Evol. 1997;44(6):632–636. doi: 10.1007/PL00006186.
- Gao L, Qi J. Whole genome molecular phylogeny of large dsDNA viruses using composition vector method. BMC Evol Biol. 2007;7:41. doi: 10.1186/1471-2148-7-41.
- Gapp K, Jawaid A, Sarkies P, Bohacek J, Pelczar P, Prados J, Farinelli L, Miska E, Mansuy IM. Implication of sperm RNAs in transgenerational inheritance of the effects of early trauma in mice. Nat Neurosci. 2014;17(5):667–669. doi: 10.1038/nn.3695.
- Gascuel O, Steel M. Neighbor-joining revealed. Mol Biol Evol. 2006;23(11):1997–2000. doi: 10.1093/molbev/msl072.
- Ge Y, Sealfon SC, Speed TP. Some step-down procedures controlling the false discovery rate under dependence. Stat Sin. 2008;18(3):881–904.
- Geller AI, Rich A. A UGA termination suppression tRNATrp active in rabbit reticulocytes. Nature. 1980;283(5742):41–46. doi: 10.1038/283041a0.
- Geman S, Geman D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell. 1984;6:721–741. doi: 10.1109/TPAMI.1984.4767596.
- Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, Dephoure N, O’Shea EK, Weissman JS. Global analysis of protein expression in yeast. Nature. 2003;425(6959):737–741. doi: 10.1038/nature02046.
- Gibbs JB. Mechanism-based target identification and drug discovery in cancer research. Science. 2000;287(5460):1969–1973. doi: 10.1126/science.287.5460.1969.
- Giglione C, Vallon O, Meinnel T. Control of protein life-span by N-terminal methionine excision. EMBO J. 2003;22(1):13–23. doi: 10.1093/emboj/cdg007.
- Giglione C, Boularot A, Meinnel T. Protein N-terminal methionine excision. Cell Mol Life Sci. 2004;61(12):1455–1474. doi: 10.1007/s00018-004-3466-8.
- Gilbert WV. Alternative ways to think about cellular internal ribosome entry. J Biol Chem. 2010;285(38):29033–29038. doi: 10.1074/jbc.R110.150532.
- Gilbert WV, Zhou K, Butler TK, Doudna JA. Cap-independent translation is required for starvation-induced differentiation in yeast. Science. 2007;317(5842):1224–1227. doi: 10.1126/science.1144467.
- Gillespie JH. The causes of molecular evolution. Oxford: Oxford University Press; 1991.
- Gojobori T, Li WH, Graur D. Patterns of nucleotide substitution in pseudogenes and functional genes. J Mol Evol. 1982;18(5):360–369. doi: 10.1007/BF01733904.
- Gonzalez B, Ceciliani F, Galizzi A. Growth at low temperature suppresses readthrough of the UGA stop codon during the expression of Bacillus subtilis flgM gene in Escherichia coli. J Biotechnol. 2003;101(2):173–180. doi: 10.1016/S0168-1656(02)00340-1.
- Gorodkin J, Heyer LJ, Brunak S, Stormo GD. Displaying the information contents of structural RNA alignments: the structure logos. Comput Appl Biosci. 1997;13(6):583–586. doi: 10.1093/bioinformatics/13.6.583.
- Goto M, Washio T, Tomita M. Causal analysis of CpG suppression in the Mycoplasma genome. Microb Comp Genomics. 2000;5(1):51–58. doi: 10.1089/10906590050145267.
- Gotoh O. An improved algorithm for matching biological sequences. J Mol Biol. 1982;162(3):705–708. doi: 10.1016/0022-2836(82)90398-9.
- Gould SJ, Vrba ES. Exaptation – a missing term in the science of form. Paleobiology. 1982;8:4–15. doi: 10.1017/S0094837300004310.
- Gouy M. Codon contexts in enterobacterial and coliphage genes. Mol Biol Evol. 1987;4(4):426–444. doi: 10.1093/oxfordjournals.molbev.a040450.
- Gouy M, Gautier C. Codon usage in bacteria: correlation with gene expressivity. Nucleic Acids Res. 1982;10:7055–7064. doi: 10.1093/nar/10.22.7055.
- Gowri-Shankar V, Rattray M. A reversible jump method for Bayesian phylogenetic inference with a nonhomogeneous substitution model. Mol Biol Evol. 2007;24(6):1286–1299. doi: 10.1093/molbev/msm046.
- Grahn AM, Butcher SJ, Bamford JKH, Bamford DH. PRD1: dissecting the genome, structure and entry. In: Calendar R, editor. The bacteriophages. Oxford: Oxford University Press; 2006. pp. 176–185.
- Gramm J, Niedermeier R. Breakpoint medians and breakpoint phylogenies: a fixed-parameter approach. Bioinformatics. 2002;18(Suppl 2):S128–S139. doi: 10.1093/bioinformatics/18.suppl_2.S128.
- Grantham R. Amino acid difference formula to help explain protein evolution. Science. 1974;185:862–864. doi: 10.1126/science.185.4154.862.
- Graveley BR. Mutually exclusive splicing of the insect Dscam pre-mRNA directed by competing intronic RNA secondary structures. Cell. 2005;123(1):65–73. doi: 10.1016/j.cell.2005.07.028.
- Grech B, Maetschke S, Mathews S, Timms P. Genome-wide analysis of chlamydiae for promoters that phylogenetically footprint. Res Microbiol. 2007;158(8–9):685–693. doi: 10.1016/j.resmic.2007.08.005.
- Grigg GW. Sequencing 5-methylcytosine residues by the bisulphite method. DNA Seq. 1996;6(4):189–198. doi: 10.3109/10425179609008443.
- Grigg G, Clark S. Sequencing 5-methylcytosine residues in genomic DNA. BioEssays. 1994;16(6):431–436. doi: 10.1002/bies.950160612.
- Grosjean H, Marck C, de Crecy-Lagard V. The various strategies of codon decoding in organisms of the three domains of life: evolutionary implications. Nucleic Acids Symp Ser (Oxf). 2007;51:15–16. doi: 10.1093/nass/nrm008. [DOI] [PubMed] [Google Scholar]
- Grosjean H, de Crecy-Lagard V, Marck C. Deciphering synonymous codons in the three domains of life: co-evolution with specific tRNA modification enzymes. FEBS Lett. 2010;584(2):252–264. doi: 10.1016/j.febslet.2009.11.052. [DOI] [PubMed] [Google Scholar]
- Grossi de Sa MF, Standart N, Martins de Sa C, Akhayat O, Huesca M, Scherrer K. The poly(A)-binding protein facilitates in vitro translation of poly(A)-rich mRNA. Eur J Biochem. 1988;176(3):521–526. doi: 10.1111/j.1432-1033.1988.tb14309.x. [DOI] [PubMed] [Google Scholar]
- Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59(3):307–321. doi: 10.1093/sysbio/syq010. [DOI] [PubMed] [Google Scholar]
- Gumbel EJ. Statistics of extremes. New York: Columbia University Press; 1958. [Google Scholar]
- Gupta SK, Kececioglu JD, Schaffer AA. Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment. J Comput Biol. 1995;2(3):459–472. doi: 10.1089/cmb.1995.2.459. [DOI] [PubMed] [Google Scholar]
- Gusfield D. Algorithms on strings, trees, and sequences: computer science and computational biology. Cambridge: Cambridge University Press; 1997. [Google Scholar]
- Gygi SP, Rochon Y, Franza BR, Aebersold R. Correlation between protein and mRNA abundance in yeast. Mol Cell Biol. 1999;19(3):1720–1730. doi: 10.1128/MCB.19.3.1720. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Haas J, Park E-C, Seed B. Codon usage limitation in the expression of HIV-1 envelope glycoprotein. Curr Biol. 1996;6(3):315–324. doi: 10.1016/S0960-9822(02)00482-7. [DOI] [PubMed] [Google Scholar]
- Hacker J, Kaper JB. Pathogenicity islands and the evolution of microbes. Annu Rev Microbiol. 2000;54:641–679. doi: 10.1146/annurev.micro.54.1.641. [DOI] [PubMed] [Google Scholar]
- Hacker J, Blum-Oehler G, Muhldorfer I, Tschape H. Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Mol Microbiol. 1997;23(6):1089–1097. doi: 10.1046/j.1365-2958.1997.3101672.x. [DOI] [PubMed] [Google Scholar]
- Hamajima N, Goto Y, Nishio K, Tanaka D, Kawai S, Sakakibara H, Kondo T. Helicobacter pylori eradication as a preventive tool against gastric cancer. Asian Pac J Cancer Prev. 2004;5(3):246–252. [PubMed] [Google Scholar]
- Hanada K, Suzuki Y, Gojobori T. A large variation in the rates of synonymous substitution for RNA viruses and its relationship to a diversity of viral infection and transmission modes. Mol Biol Evol. 2004;21(6):1074–1080. doi: 10.1093/molbev/msh109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hartigan JA. Clustering algorithms. New York: Wiley; 1975. [Google Scholar]
- Hasegawa M, Kishino H. Heterogeneity of tempo and mode of mitochondrial DNA evolution among mammalian orders. Jpn J Genet. 1989;64(4):243–258. doi: 10.1266/jjg.64.243. [DOI] [PubMed] [Google Scholar]
- Hasegawa M, Kishino H, Yano T. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol. 1985;22(2):160–174. doi: 10.1007/BF02101694. [DOI] [PubMed] [Google Scholar]
- Haustead DJ, Stevenson A, Saxena V, Marriage F, Firth M, Silla R, Martin L, Adcroft KF, Rea S, Day PJ, et al. Transcriptome analysis of human ageing in male skin shows mid-life period of variability and central role of NF-kappaB. Sci Rep. 2016;6:26846. doi: 10.1038/srep26846. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hayes WS, Borodovsky M. How to interpret an anonymous bacterial genome: machine learning approach to gene identification. Genome Res. 1998;8(11):1154–1171. doi: 10.1101/gr.8.11.1154. [DOI] [PubMed] [Google Scholar]
- Heath JR, Ribas A, Mischel PS. Single-cell analysis tools for drug discovery and development. Nat Rev Drug Discov. 2016;15(3):204–216. doi: 10.1038/nrd.2015.16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hein J. A unified approach to phylogenies and alignments. Methods Enzymol. 1990;183:625–644. doi: 10.1016/0076-6879(90)83041-7. [DOI] [PubMed] [Google Scholar]
- Hein J. TreeAlign. Methods Mol Biol. 1994;25:349–364. doi: 10.1385/0-89603-276-0:349. [DOI] [PubMed] [Google Scholar]
- Hendy MD, Penny D. Branch and bound algorithms to determine minimal evolutionary trees. Math Biosci. 1982;60:133–142. doi: 10.1016/0025-5564(82)90125-0. [DOI] [Google Scholar]
- Hendy MD, Penny D. A framework for the quantitative study of evolutionary trees. Syst Zool. 1989;38:297–309. doi: 10.2307/2992396. [DOI] [Google Scholar]
- Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 1992;89:10915–10919. doi: 10.1073/pnas.89.22.10915. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Henz SR, Huson DH, Auch AF, Nieselt-Struwe K, Schuster SC. Whole-genome prokaryotic phylogeny. Bioinformatics. 2005;21(10):2329–2335. doi: 10.1093/bioinformatics/bth324. [DOI] [PubMed] [Google Scholar]
- Herman JL, Challis CJ, Novak A, Hein J, Schmidler SC. Simultaneous Bayesian estimation of alignment and phylogeny under a joint model of protein sequence and structure. Mol Biol Evol. 2014;31(9):2251–2266. doi: 10.1093/molbev/msu184. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hernández G. Was the initiation of translation in early eukaryotes IRES-driven? Trends Biochem Sci. 2008;33(2):58. doi: 10.1016/j.tibs.2007.11.002. [DOI] [PubMed] [Google Scholar]
- Hernandez G, Vazquez-Pianzola P, Sierra JM, Rivera-Pomar R. Internal ribosome entry site drives cap-independent translation of reaper and heat shock protein 70 mRNAs in Drosophila embryos. RNA. 2004;10(11):1783–1797. doi: 10.1261/rna.7154104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Herniou EA, Luque T, Chen X, Vlak JM, Winstanley D, Cory JS, O’Reilly DR. Use of whole genome sequence data to infer baculovirus phylogeny. J Virol. 2001;75(17):8117–8126. doi: 10.1128/JVI.75.17.8117-8126.2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hertz GZ, Stormo GD. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics. 1999;15(7–8):563–577. doi: 10.1093/bioinformatics/15.7.563. [DOI] [PubMed] [Google Scholar]
- Hertz GZ, Hartzell GW, 3rd, Stormo GD. Identification of consensus patterns in unaligned DNA sequences known to be functionally related. Comput Appl Biosci. 1990;6(2):81–92. doi: 10.1093/bioinformatics/6.2.81. [DOI] [PubMed] [Google Scholar]
- Hertzberg L, Izraeli S, Domany E. STOP: searching for transcription factor motifs using gene expression. Bioinformatics. 2007;23(14):1737–1743. doi: 10.1093/bioinformatics/btm249. [DOI] [PubMed] [Google Scholar]
- Hiard S, Maree R, Colson S, Hoskisson PA, Titgemeyer F, van Wezel GP, Joris B, Wehenkel L, Rigali S. PREDetector: a new tool to identify regulatory elements in bacterial genomes. Biochem Biophys Res Commun. 2007;357(4):861–864. doi: 10.1016/j.bbrc.2007.03.180. [DOI] [PubMed] [Google Scholar]
- Hickson RE, Simon C, Perrey SW. The performance of several multiple-sequence alignment programs in relation to secondary-structure features for an rRNA sequence. Mol Biol Evol. 2000;17(4):530–539. doi: 10.1093/oxfordjournals.molbev.a026333. [DOI] [PubMed] [Google Scholar]
- Higashi K, Kashiwagi K, Taniguchi S, Terui Y, Yamamoto K, Ishihama A, Igarashi K. Enhancement of +1 frameshift by polyamines during translation of polypeptide release factor 2 in Escherichia coli. J Biol Chem. 2006;281(14):9527–9537. doi: 10.1074/jbc.M513752200. [DOI] [PubMed] [Google Scholar]
- Higgins DG. CLUSTAL V: multiple alignment of DNA and protein sequences. Methods Mol Biol. 1994;25:307–318. doi: 10.1385/0-89603-276-0:307. [DOI] [PubMed] [Google Scholar]
- Higgs PG, Attwood TK. Bioinformatics and molecular evolution. Malden: Blackwell; 2005. [Google Scholar]
- Higgs PG, Ran W. Coevolution of codon usage and tRNA genes leads to alternative stable states of biased codon usage. Mol Biol Evol. 2008;25(11):2279–2291. doi: 10.1093/molbev/msn173. [DOI] [PubMed] [Google Scholar]
- Hiller K, Grote A, Scheer M, Munch R, Jahn D. PrediSi: prediction of signal peptides and their cleavage positions. Nucleic Acids Res. 2004;32(Web Server issue):W375–W379. doi: 10.1093/nar/gkh378. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hirao I, Kimoto M. Expansion of the genetic alphabet in nucleic acids by creating new base pairs. In: Mayer G, editor. The chemical biology of nucleic acids. Chichester: Wiley; 2010. pp. 39–62. [Google Scholar]
- Hirsh D, Gold L. Translation of the UGA triplet in vitro by tryptophan transfer RNA’s. J Mol Biol. 1971;58(2):459–468. doi: 10.1016/0022-2836(71)90363-9. [DOI] [PubMed] [Google Scholar]
- Hirst JD, Sternberg MJ. Prediction of ATP/GTP-binding motif: a comparison of a perceptron type neural network and a consensus sequence method [corrected]. Protein Eng. 1991;4(6):615–623. doi: 10.1093/protein/4.6.615. [DOI] [PubMed] [Google Scholar]
- Hoagland MB, Stephenson ML, Scott JF, Hecht LI, Zamecnik PC. A soluble ribonucleic acid intermediate in protein synthesis. J Biol Chem. 1958;231(1):241–257. [PubMed] [Google Scholar]
- Hobolth A, Christensen OF, Mailund T, Schierup MH. Genomic relationships and speciation times of human, chimpanzee, and gorilla inferred from a coalescent hidden Markov model. PLoS Genet. 2007;3(2):e7. doi: 10.1371/journal.pgen.0030007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hofacker IL. Vienna RNA secondary structure server. Nucleic Acids Res. 2003;31(13):3429–3431. doi: 10.1093/nar/gkg599. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hofacker IL, Fekete M, Stadler PF. Secondary structure prediction for aligned RNA sequences. J Mol Biol. 2002;319(5):1059–1066. doi: 10.1016/S0022-2836(02)00308-X. [DOI] [PubMed] [Google Scholar]
- Hofer A, Steverding D, Chabes A, Brun R, Thelander L. Trypanosoma brucei CTP synthetase: a target for the treatment of African sleeping sickness. Proc Natl Acad Sci U S A. 2001;98(11):6412–6416. doi: 10.1073/pnas.111139498. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hogeweg P, Hesper B. The alignment of sets of sequences and the construction of phylogenetic trees: an integrated method. J Mol Evol. 1984;20:175–186. doi: 10.1007/BF02257378. [DOI] [PubMed] [Google Scholar]
- Holmes I, Bruno WJ. Evolutionary HMMs: a Bayesian approach to multiple alignment. Bioinformatics. 2001;17(9):803–820. doi: 10.1093/bioinformatics/17.9.803. [DOI] [PubMed] [Google Scholar]
- Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, Green MR, Golub TR, Lander ES, Young RA. Dissecting the regulatory circuitry of a eukaryotic genome. Cell. 1998;95(5):717–728. doi: 10.1016/S0092-8674(00)81641-4. [DOI] [PubMed] [Google Scholar]
- Hou C, Zhao H, Tanimoto K, Dean A. CTCF-dependent enhancer-blocking by alternative chromatin loop formation. Proc Natl Acad Sci U S A. 2008;105(51):20398–20403. doi: 10.1073/pnas.0808506106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hua S, Sun Z. Support vector machine approach for protein subcellular localization prediction. Bioinformatics. 2001;17(8):721–728. doi: 10.1093/bioinformatics/17.8.721. [DOI] [PubMed] [Google Scholar]
- Hudson RR. Gene trees, species trees and the segregation of ancestral alleles. Genetics. 1992;131(2):509–513. doi: 10.1093/genetics/131.2.509. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huelsenbeck JP, Larget B, Alfaro ME. Bayesian phylogenetic model selection using reversible jump Markov chain Monte Carlo. Mol Biol Evol. 2004;21(6):1123–1133. doi: 10.1093/molbev/msh123. [DOI] [PubMed] [Google Scholar]
- Hughes D. Mutant forms of tufA and tufB independently suppress nonsense mutations. J Mol Biol. 1987;197(4):611–615. doi: 10.1016/0022-2836(87)90467-0. [DOI] [PubMed] [Google Scholar]
- Hui A, de Boer HA. Specialized ribosome system: preferential translation of a single mRNA species by a subpopulation of mutated ribosomes in Escherichia coli. Proc Natl Acad Sci U S A. 1987;84(14):4762–4766. doi: 10.1073/pnas.84.14.4762. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hunt RH. Will eradication of Helicobacter pylori infection influence the risk of gastric cancer? Am J Med. 2004;117(Suppl 5A):86S–91S. doi: 10.1016/j.amjmed.2004.07.030. [DOI] [PubMed] [Google Scholar]
- Hurst LD, Merchant AR. High guanine-cytosine content is not an adaptation to high temperature: a comparative analysis amongst prokaryotes. Proc R Soc Lond B. 2001;268:493–497. doi: 10.1098/rspb.2000.1397. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huynen M, Dandekar T, Bork P. Differential genome analysis applied to the species-specific features of Helicobacter pylori. FEBS Lett. 1998;426(1):1–5. doi: 10.1016/S0014-5793(98)00276-2. [DOI] [PubMed] [Google Scholar]
- Hwang S, Gou Z, Kuznetsov IB. DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins. Bioinformatics. 2007;23(5):634–636. doi: 10.1093/bioinformatics/btl672. [DOI] [PubMed] [Google Scholar]
- Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinform. 2010;11:119. doi: 10.1186/1471-2105-11-119. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Igarashi K, Kashiwagi K. Polyamine modulon in Escherichia coli: genes involved in the stimulation of cell growth by polyamines. J Biochem. 2006;139(1):11–16. doi: 10.1093/jb/mvj020. [DOI] [PubMed] [Google Scholar]
- Ikemura T. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes. J Mol Biol. 1981;146:1–21. doi: 10.1016/0022-2836(81)90363-6. [DOI] [PubMed] [Google Scholar]
- Ikemura T. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J Mol Biol. 1981;151:389–409. doi: 10.1016/0022-2836(81)90003-6. [DOI] [PubMed] [Google Scholar]
- Ikemura T. Correlation between the abundance of yeast transfer RNAs and the occurrence of the respective codons in protein genes. Differences in synonymous codon choice patterns of yeast and Escherichia coli with reference to the abundance of isoaccepting transfer RNAs. J Mol Biol. 1982;158(4):573–597. doi: 10.1016/0022-2836(82)90250-9. [DOI] [PubMed] [Google Scholar]
- Ikemura T. Codon usage and tRNA content in unicellular and multicellular organisms. Mol Biol Evol. 1985;2:13–34. doi: 10.1093/oxfordjournals.molbev.a040335. [DOI] [PubMed] [Google Scholar]
- Ikemura T. Correlation between codon usage and tRNA content in microorganisms. In: Hatfield DL, Lee BJ, Pirtle RM, editors. Transfer RNA in protein synthesis. Boca Raton: CRC Press; 1992. pp. 87–111. [Google Scholar]
- Ilkow CS, Mancinelli V, Beatch MD, Hobman TC. Rubella virus capsid protein interacts with poly(A)-binding protein and inhibits translation. J Virol. 2008;82(9):4284–4294. doi: 10.1128/JVI.02732-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ingolia NT. Genome-wide translational profiling by ribosome footprinting. Methods Enzymol. 2010;470:119–142. doi: 10.1016/S0076-6879(10)70006-9. [DOI] [PubMed] [Google Scholar]
- Ingolia NT. Ribosome profiling: new views of translation, from single codons to genome scale. Nat Rev Genet. 2014;15(3):205–213. doi: 10.1038/nrg3645. [DOI] [PubMed] [Google Scholar]
- Ingolia NT. Ribosome footprint profiling of translation throughout the genome. Cell. 2016;165(1):22–33. doi: 10.1016/j.cell.2016.02.066. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324(5924):218–223. doi: 10.1126/science.1168978. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ingolia NT, Lareau LF, Weissman JS. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell. 2011;147(4):789–802. doi: 10.1016/j.cell.2011.10.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ingolia NT, Brar GA, Stern-Ginossar N, Harris MS, Talhouarne GJ, Jackson SE, Wills MR, Weissman JS. Ribosome profiling reveals pervasive translation outside of annotated protein-coding genes. Cell Rep. 2014;8(5):1365–1379. doi: 10.1016/j.celrep.2014.07.045. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ingram VM. A specific chemical difference between the globins of normal human and sickle-cell anaemia haemoglobin. Nature. 1956;178(4537):792–794. doi: 10.1038/178792a0. [DOI] [PubMed] [Google Scholar]
- Ingram VM. Gene mutations in human haemoglobin: the chemical difference between normal and sickle cell haemoglobin. Nature. 1957;180(4581):326–328. doi: 10.1038/180326a0. [DOI] [PubMed] [Google Scholar]
- Ingrosso D, Perna AF. Epigenetics in hyperhomocysteinemic states. A special focus on uremia. Biochim Biophys Acta. 2009;1790(9):892–899. doi: 10.1016/j.bbagen.2008.11.010. [DOI] [PubMed] [Google Scholar]
- Ingrosso D, Cimmino A, Perna AF, Masella L, De Santo NG, De Bonis ML, Vacca M, D’Esposito M, D’Urso M, Galletti P, et al. Folate treatment and unbalanced methylation and changes of allelic expression induced by hyperhomocysteinaemia in patients with uraemia. Lancet. 2003;361(9370):1693–1699. doi: 10.1016/S0140-6736(03)13372-7. [DOI] [PubMed] [Google Scholar]
- Ink BS, Pickup DJ. Vaccinia virus directs the synthesis of early mRNAs containing 5′ poly(A) sequences. Proc Natl Acad Sci U S A. 1990;87(4):1536–1540. doi: 10.1073/pnas.87.4.1536. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Insinga A, Minucci S, Pelicci PG. Mechanisms of selective anticancer action of histone deacetylase inhibitors. Cell Cycle. 2005;4(6):741–743. doi: 10.4161/cc.4.6.1717. [DOI] [PubMed] [Google Scholar]
- Insinga A, Monestiroli S, Ronzoni S, Gelmetti V, Marchesi F, Viale A, Altucci L, Nervi C, Minucci S, Pelicci PG. Inhibitors of histone deacetylases induce tumor-selective apoptosis through activation of the death receptor pathway. Nat Med. 2005;11(1):71–76. doi: 10.1038/nm1160. [DOI] [PubMed] [Google Scholar]
- Ito T, Bulger M, Pazin MJ, Kobayashi R, Kadonaga JT. ACF, an ISWI-containing and ATP-utilizing chromatin assembly and remodeling factor. Cell. 1997;90(1):145–155. doi: 10.1016/S0092-8674(00)80321-9. [DOI] [PubMed] [Google Scholar]
- Ito K, Uno M, Nakamura Y. A tripeptide ‘anticodon’ deciphers stop codons in messenger RNA. Nature. 2000;403(6770):680–684. doi: 10.1038/35001115. [DOI] [PubMed] [Google Scholar]
- Jackson RJ, Hellen CU, Pestova TV. The mechanism of eukaryotic translation initiation and principles of its regulation. Nat Rev Mol Cell Biol. 2010;11(2):113–127. doi: 10.1038/nrm2838. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jacob F. The possible and the actual. Seattle: University of Washington Press; 1982. p. 70. [Google Scholar]
- Jacob F. The statue within: an autobiography. New York: Basic Books, Inc.; 1988. [Google Scholar]
- Jacob F, Monod J. Genetic regulatory mechanisms in the synthesis of proteins. J Mol Biol. 1961;3:318–356. doi: 10.1016/S0022-2836(61)80072-7. [DOI] [PubMed] [Google Scholar]
- Jacobson A, Favreau M. Possible involvement of poly(A) in protein synthesis. Nucleic Acids Res. 1983;11(18):6353–6368. doi: 10.1093/nar/11.18.6353. [DOI] [PMC free article] [PubMed] [Google Scholar]
- James P, Quadroni M, Carafoli E, Gonnet G. Protein identification in DNA databases by peptide mass fingerprinting. Protein Sci. 1994;3(8):1347–1350. doi: 10.1002/pro.5560030822. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jan E, Sarnow P. Factorless ribosome assembly on the internal ribosome entry site of cricket paralysis virus. J Mol Biol. 2002;324(5):889–902. doi: 10.1016/S0022-2836(02)01099-9. [DOI] [PubMed] [Google Scholar]
- Jan E, Thompson SR, Wilson JE, Pestova TV, Hellen CU, Sarnow P. Initiator Met-tRNA-independent translation mediated by an internal ribosome entry site element in cricket paralysis virus-like insect viruses. Cold Spring Harb Symp Quant Biol. 2001;66:285–292. doi: 10.1101/sqb.2001.66.285. [DOI] [PubMed] [Google Scholar]
- Janin L, Schulz-Trieglaff O, Cox AJ. BEETL-fastq: a searchable compressed archive for DNA reads. Bioinformatics. 2014;30(19):2796–2801. doi: 10.1093/bioinformatics/btu387. [DOI] [PubMed] [Google Scholar]
- Jank P, Shindo-Okada N, Nishimura S, Gross HJ. Rabbit liver tRNA1Val: I. Primary structure and unusual codon recognition. Nucleic Acids Res. 1977;4(6):1999–2008. doi: 10.1093/nar/4.6.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jayaswal V, Jermiin LS, Robinson J. Estimation of phylogeny using a general Markov model. Evol Bioinform Online. 2005;1:62–80. doi: 10.1177/117693430500100005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jenkins GM, Holmes EC. The extent of codon usage bias in human RNA viruses and its evolutionary origin. Virus Res. 2003;92(1):1–7. doi: 10.1016/S0168-1702(02)00309-X. [DOI] [PubMed] [Google Scholar]
- Jensen JL, Hein J. Gibbs sampler for statistical multiple alignment. Stat Sin. 2005;15:889–907. [Google Scholar]
- Jia W, Higgs PG. Codon usage in mitochondrial genomes: distinguishing context-dependent mutation from translational selection. Mol Biol Evol. 2008;25(2):339–351. doi: 10.1093/molbev/msm259. [DOI] [PubMed] [Google Scholar]
- Jin P, Alisch RS, Warren ST. RNA and microRNAs in fragile X mental retardation. Nat Cell Biol. 2004;6(11):1048–1053. doi: 10.1038/ncb1104-1048. [DOI] [PubMed] [Google Scholar]
- Jin VX, Leu YW, Liyanarachchi S, Sun H, Fan M, Nephew KP, Huang TH, Davuluri RV. Identifying estrogen receptor alpha target genes using integrated computational genomics and chromatin immunoprecipitation microarray. Nucleic Acids Res. 2004;32(22):6627–6635. doi: 10.1093/nar/gkh1005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jin VX, O’Geen H, Iyengar S, Green R, Farnham PJ. Identification of an OCT4 and SRY regulatory module using integrated computational and experimental genomics approaches. Genome Res. 2007;17(6):807–817. doi: 10.1101/gr.6006107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Johnston TC, Parker J. Streptomycin-induced, third-position misreading of the genetic code. J Mol Biol. 1985;181(2):313–315. doi: 10.1016/0022-2836(85)90094-4. [DOI] [PubMed] [Google Scholar]
- Johnston TC, Borgia PT, Parker J. Codon specificity of starvation induced misreading. Mol Gen Genet MGG. 1984;195(3):459–465. doi: 10.1007/BF00341447. [DOI] [PubMed] [Google Scholar]
- Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8:275–282. doi: 10.1093/bioinformatics/8.3.275. [DOI] [PubMed] [Google Scholar]
- Jorgensen F, Adamski FM, Tate WP, Kurland CG. Release factor-dependent false stops are infrequent in Escherichia coli. J Mol Biol. 1993;230(1):41–50. doi: 10.1006/jmbi.1993.1124. [DOI] [PubMed] [Google Scholar]
- Josse J, Kaiser AD, Kornberg A. Enzymatic synthesis of deoxyribonucleic acid VII. Frequencies of nearest neighbor base-sequences in deoxyribonucleic acid. J Biol Chem. 1961;236:864–875. [PubMed] [Google Scholar]
- Jukes TH, Cantor CR. Evolution of protein molecules. In: Munro HN, editor. Mammalian protein metabolism. New York: Academic; 1969. pp. 21–123. [Google Scholar]
- Kaishima M, Ishii J, Matsuno T, Fukuda N, Kondo A. Expression of varied GFPs in Saccharomyces cerevisiae: codon optimization yields stronger than expected expression and fluorescence intensity. Sci Rep. 2016;6:35932. doi: 10.1038/srep35932. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kamalakaran S, Radhakrishnan SK, Beck WT. Identification of estrogen-responsive genes using a genome-wide analysis of promoter elements for transcription factor binding sites. J Biol Chem. 2005;280(22):21491–21497. doi: 10.1074/jbc.M409176200. [DOI] [PubMed] [Google Scholar]
- Kanehisa M. Molecular network analysis of diseases and drugs in KEGG. Methods Mol Biol. 2013;939:263–275. doi: 10.1007/978-1-62703-107-3_17. [DOI] [PubMed] [Google Scholar]
- Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2016;44(D1):D457–D462. doi: 10.1093/nar/gkv1070. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kaneko T, Tanaka A, Sato S, Kotani H, Sazuka T, Miyajima N, Sugiura M, Tabata S. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. I. Sequence features in the 1 Mb region from map positions 64% to 92% of the genome. DNA Res. 1995;2(4):153–166. doi: 10.1093/dnares/2.4.153. [DOI] [PubMed] [Google Scholar]
- Kaneko T, Sato S, Kotani H, Tanaka A, Asamizu E, Nakamura Y, Miyajima N, Hirosawa M, Sugiura M, Sasamoto S, et al. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 1996;3(3):109–136. doi: 10.1093/dnares/3.3.109. [DOI] [PubMed] [Google Scholar]
- Karlin S, Burge C. Dinucleotide relative abundance extremes: a genomic signature. Trends Genet. 1995;11(7):283–290. doi: 10.1016/S0168-9525(00)89076-9. [DOI] [PubMed] [Google Scholar]
- Karlin S, Mrazek J. What drives codon choices in human genes. J Mol Biol. 1996;262:459–472. doi: 10.1006/jmbi.1996.0528. [DOI] [PubMed] [Google Scholar]
- Kass RE, Raftery AE. Bayes factors. J Am Stat Assoc. 1995;90(430):773–795. doi: 10.1080/01621459.1995.10476572. [DOI] [Google Scholar]
- Katoh K, Toh H. Recent developments in the MAFFT multiple sequence alignment program. Brief Bioinform. 2008;9(4):286–298. doi: 10.1093/bib/bbn013. [DOI] [PubMed] [Google Scholar]
- Katoh K, Toh H. Parallelization of the MAFFT multiple sequence alignment program. Bioinformatics. 2010;26(15):1899–1900. doi: 10.1093/bioinformatics/btq224. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33(2):511–518. doi: 10.1093/nar/gki198. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Katoh K, Asimenos G, Toh H. Multiple alignment of DNA sequences with MAFFT. Methods Mol Biol. 2009;537:39–64. doi: 10.1007/978-1-59745-251-9_3. [DOI] [PubMed] [Google Scholar]
- Katsafanas GC, Moss B. Colocalization of transcription and translation within cytoplasmic poxvirus factories coordinates viral expression and subjugates host functions. Cell Host Microbe. 2007;2(4):221. doi: 10.1016/j.chom.2007.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kawashima T, Douglass S, Gabunilas J, Pellegrini M, Chanfreau GF. Widespread use of non-productive alternative splice sites in Saccharomyces cerevisiae. PLoS Genet. 2014;10(4):e1004249. doi: 10.1371/journal.pgen.1004249. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kazan K. Alternative splicing and proteome diversity in plants: the tip of the iceberg has just emerged. Trends Plant Sci. 2003;8(10):468–471. doi: 10.1016/j.tplants.2003.09.001. [DOI] [PubMed] [Google Scholar]
- Keeling PJ, Doolittle WF. A non-canonical genetic code in an early diverging eukaryotic lineage. EMBO J. 1996;15(9):2285–2290. doi: 10.1002/j.1460-2075.1996.tb00581.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kersulyte D, Chalkauskas H, Berg DE. Emergence of recombinant strains of Helicobacter pylori during human infection. Mol Microbiol. 1999;31(1):31–43. doi: 10.1046/j.1365-2958.1999.01140.x. [DOI] [PubMed] [Google Scholar]
- Kim H, Park H. Prediction of protein relative solvent accessibility with support vector machines and long-range interaction 3D local descriptor. Proteins. 2004;54(3):557–562. doi: 10.1002/prot.10602. [DOI] [PubMed] [Google Scholar]
- Kim DW, Lee KH, Lee D. Detecting clusters of different geometrical shapes in microarray gene expression data. Bioinformatics. 2005;21(9):1927–1934. doi: 10.1093/bioinformatics/bti251. [DOI] [PubMed] [Google Scholar]
- Kimura M. Evolutionary rate at the molecular level. Nature. 1968;217:624–626. doi: 10.1038/217624a0. [DOI] [PubMed] [Google Scholar]
- Kimura M. Preponderance of synonymous changes as evidence for the neutral theory of molecular evolution. Nature. 1977;267:275–276. doi: 10.1038/267275a0. [DOI] [PubMed] [Google Scholar]
- Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol. 1980;16:111–120. doi: 10.1007/BF01731581. [DOI] [PubMed] [Google Scholar]
- Kimura M. The neutral theory of molecular evolution. Cambridge: Cambridge University Press; 1983. [Google Scholar]
- Kimura M, Ohta T. On the stochastic model for estimation of mutational distance between homologous proteins. J Mol Evol. 1972;2:87–90. doi: 10.1007/BF01653945. [DOI] [PubMed] [Google Scholar]
- King MC, Jukes TH. Non-Darwinian evolution. Science. 1969;164:788–798. doi: 10.1126/science.164.3881.788. [DOI] [PubMed] [Google Scholar]
- Kingsford C, Patro R. Reference-based compression of short-read sequences using path encoding. Bioinformatics. 2015;31(12):1920–1928. doi: 10.1093/bioinformatics/btv071. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kioussis D, Vanin E, deLange T, Flavell RA, Grosveld FG. Beta-globin gene inactivation by DNA translocation in gamma beta-thalassaemia. Nature. 1983;306(5944):662–666. doi: 10.1038/306662a0. [DOI] [PubMed] [Google Scholar]
- Kishino H, Hasegawa M. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J Mol Evol. 1989;29:170–179. doi: 10.1007/BF02100115. [DOI] [PubMed] [Google Scholar]
- Kishino H, Hasegawa M. Converting distance to time: application to human evolution. Methods Enzymol. 1990;183:550–570. doi: 10.1016/0076-6879(90)83036-9. [DOI] [PubMed] [Google Scholar]
- Kjer KM. Use of ribosomal-RNA secondary structure in phylogenetic studies to identify homologous positions – an example of alignment and data presentation from the frogs. Mol Phylogenet Evol. 1995;4(3):314–330. doi: 10.1006/mpev.1995.1028. [DOI] [PubMed] [Google Scholar]
- Kliman RM, Bernal CA. Unusual usage of AGG and TTG codons in humans and their viruses. Gene. 2005;352:92. doi: 10.1016/j.gene.2005.04.001. [DOI] [PubMed] [Google Scholar]
- Kobayashi H, Akitomi J, Fujii N, Kobayashi K, Altaf-Ul-Amin M, Kurokawa K, Ogasawara N, Kanaya S. The entire organization of transcription units on the Bacillus subtilis genome. BMC Genomics. 2007;8:197. doi: 10.1186/1471-2164-8-197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kodama Y, Shumway M, Leinonen R. The sequence read archive: explosive growth of sequencing data. Nucleic Acids Res. 2012;40(Database issue):D54–D56. doi: 10.1093/nar/gkr854. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kohonen T. Self-organizing maps. Berlin: Springer; 2001. [Google Scholar]
- Komar AA, Hatzoglou M. Internal ribosome entry sites in cellular mRNAs: mystery of their existence. J Biol Chem. 2005;280(25):23425–23428. doi: 10.1074/jbc.R400041200. [DOI] [PubMed] [Google Scholar]
- Korenke GC, Fuchs S, Krasemann E, Doerr HG, Wilichowski E, Hunneman DH, Hanefeld F. Cerebral adrenoleukodystrophy (ALD) in only one of monozygotic twins with an identical ALD genotype. Ann Neurol. 1996;40(2):254–257. doi: 10.1002/ana.410400221. [DOI] [PubMed] [Google Scholar]
- Korkmaz G, Holm M, Wiens T, Sanyal S. Comprehensive analysis of stop codon usage in bacteria and its correlation with release factor abundance. J Biol Chem. 2014;289(44):30334–30342. doi: 10.1074/jbc.M114.606632. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kornblihtt AR. Promoter usage and alternative splicing. Curr Opin Cell Biol. 2005;17(3):262–268. doi: 10.1016/j.ceb.2005.04.014. [DOI] [PubMed] [Google Scholar]
- Kozak M. How do eucaryotic ribosomes select initiation regions in messenger RNA? Cell. 1978;15(4):1109–1123. doi: 10.1016/0092-8674(78)90039-9. [DOI] [PubMed] [Google Scholar]
- Kozak M. Evaluation of the “scanning model” for initiation of protein synthesis in eucaryotes. Cell. 1980;22(1 Pt 1):7–8. doi: 10.1016/0092-8674(80)90148-8. [DOI] [PubMed] [Google Scholar]
- Kozak M. Influence of mRNA secondary structure on binding and migration of 40S ribosomal subunits. Cell. 1980;19(1):79–90. doi: 10.1016/0092-8674(80)90390-6. [DOI] [PubMed] [Google Scholar]
- Kozak M. Possible role of flanking nucleotides in recognition of the AUG initiator codon by eukaryotic ribosomes. Nucleic Acids Res. 1981;9(20):5233–5252. doi: 10.1093/nar/9.20.5233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kozak M. Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell. 1986;44(2):283–292. doi: 10.1016/0092-8674(86)90762-2. [DOI] [PubMed] [Google Scholar]
- Kozak M. Effects of long 5′ leader sequences on initiation by eukaryotic ribosomes in vitro. Gene Expr. 1991;1(2):117–125. [PMC free article] [PubMed] [Google Scholar]
- Kozak M. Recognition of AUG and alternative initiator codons is augmented by G in position +4 but is not generally affected by the nucleotides in positions +5 and +6. EMBO J. 1997;16(9):2482–2492. doi: 10.1093/emboj/16.9.2482. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kozak M. Initiation of translation in prokaryotes and eukaryotes. Gene. 1999;234(2):187–208. doi: 10.1016/S0378-1119(99)00210-3. [DOI] [PubMed] [Google Scholar]
- Kozak M. A second look at cellular mRNA sequences said to function as internal ribosome entry sites. Nucleic Acids Res. 2005;33(20):6593–6602. doi: 10.1093/nar/gki958. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kozak M. Some thoughts about translational regulation: forward and backward glances. J Cell Biochem. 2007;102(2):280–290. doi: 10.1002/jcb.21464. [DOI] [PubMed] [Google Scholar]
- Krasemann EW, Meier V, Korenke GC, Hunneman DH, Hanefeld F. Identification of mutations in the ALD-gene of 20 families with adrenoleukodystrophy/adrenomyeloneuropathy. Hum Genet. 1996;97(2):194–197. doi: 10.1007/BF02265264. [DOI] [PubMed] [Google Scholar]
- Kreutzer DA, Essigmann JM. Oxidized, deaminated cytosines are a source of C → T transitions in vivo. Proc Natl Acad Sci U S A. 1998;95(7):3578–3582. doi: 10.1073/pnas.95.7.3578. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Krogh A, Mian IS, Haussler D. A hidden Markov model that finds genes in E. coli DNA. Nucleic Acids Res. 1994;22(22):4768–4778. doi: 10.1093/nar/22.22.4768. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kudla G, Murray AW, Tollervey D, Plotkin JB. Coding-sequence determinants of gene expression in Escherichia coli. Science. 2009;324(5924):255–258. doi: 10.1126/science.1170160. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kullback S. Information theory and statistics. New York: Wiley; 1959. [Google Scholar]
- Kullback S. The Kullback-Leibler distance. Am Stat. 1987;41:340–341. [Google Scholar]
- Kullback S, Leibler RA. On information and sufficiency. Ann Math Stat. 1951;22:79–86. doi: 10.1214/aoms/1177729694. [DOI] [Google Scholar]
- Kumar S, Filipski A. Multiple sequence alignment: in pursuit of homologous DNA positions. Genome Res. 2007;17(2):127–135. doi: 10.1101/gr.5232407. [DOI] [PubMed] [Google Scholar]
- Kumar KK, Shelokar PS. An SVM method using evolutionary information for the identification of allergenic proteins. Bioinformation. 2008;2(6):253–256. doi: 10.6026/97320630002253. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33(7):1870–1874. doi: 10.1093/molbev/msw054. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kungulovski G, Jeltsch A. Epigenome editing: state of the art, concepts, and perspectives. Trends Genet. 2016;32(2):101–113. doi: 10.1016/j.tig.2015.12.001. [DOI] [PubMed] [Google Scholar]
- Kurland CG. Strategies for efficiency and accuracy in gene expression. Trends Biochem Sci. 1987;12:126. doi: 10.1016/0968-0004(87)90060-0. [DOI] [Google Scholar]
- Kutlar A. Sickle cell disease: a multigenic perspective of a single gene disorder. Hemoglobin. 2007;31(2):209–224. doi: 10.1080/03630260701290233. [DOI] [PubMed] [Google Scholar]
- Kuznetsov IB, Gou Z, Li R, Hwang S. Using evolutionary and structural information to predict DNA-binding sites on DNA-binding proteins. Proteins. 2006;64(1):19–27. doi: 10.1002/prot.20977. [DOI] [PubMed] [Google Scholar]
- Kypr J, Mrazek J. Unusual codon usage of HIV. Nature. 1987;327(6117):20. doi: 10.1038/327020a0. [DOI] [PubMed] [Google Scholar]
- Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157:105–132. doi: 10.1016/0022-2836(82)90515-0. [DOI] [PubMed] [Google Scholar]
- Lacerda R, Menezes J, Romao L. More than just scanning: the importance of cap-independent mRNA translation initiation for cellular stress response and cancer. Cell Mol Life Sci. 2016;74(9):1659–1680. doi: 10.1007/s00018-016-2428-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Laemmli UK. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature. 1970;227:680–685. doi: 10.1038/227680a0. [DOI] [PubMed] [Google Scholar]
- Lake JA. Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances. Proc Natl Acad Sci U S A. 1994;91:1455–1459. doi: 10.1073/pnas.91.4.1455. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lamendola DE, Duan Z, Yusuf RZ, Seiden MV. Molecular description of evolving paclitaxel resistance in the SKOV-3 human ovarian carcinoma cell line. Cancer Res. 2003;63(9):2200–2205. [PubMed] [Google Scholar]
- Lamond AI. RNA editing and the mysterious undercover genes of trypanosomatid mitochondria. Trends Biochem Sci. 1988;13(8):283–284. doi: 10.1016/0968-0004(88)90117-X. [DOI] [PubMed] [Google Scholar]
- Lanave C, Preparata G, Saccone C, Serio G. A new method for calculating evolutionary substitution rates. J Mol Evol. 1984;20(1):86–93. doi: 10.1007/BF02101990. [DOI] [PubMed] [Google Scholar]
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921. doi: 10.1038/35057062. [DOI] [PubMed] [Google Scholar]
- Lang BF, Burger G, O’Kelly CJ, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Gray MW. An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature. 1997;387(6632):493–497. doi: 10.1038/387493a0. [DOI] [PubMed] [Google Scholar]
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–359. doi: 10.1038/nmeth.1923. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langmead B, Schatz MC, Lin J, Pop M, Salzberg SL. Searching for SNPs with cloud computing. Genome Biol. 2009;10(11):R134. doi: 10.1186/gb-2009-10-11-r134. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Langmead B, Hansen KD, Leek JT. Cloud-scale RNA-sequencing differential expression analysis with Myrna. Genome Biol. 2010;11(8):R83. doi: 10.1186/gb-2010-11-8-r83. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lawrence CE, Altschul SF, Boguski MS, Liu JS, Neuwald AF, Wootton JC. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science. 1993;262(5131):208–214. doi: 10.1126/science.8211139. [DOI] [PubMed] [Google Scholar]
- Lee C, Wang Q. Bioinformatics analysis of alternative splicing. Brief Bioinform. 2005;6(1):23–33. doi: 10.1093/bib/6.1.23. [DOI] [PubMed] [Google Scholar]
- Leinonen R, Sugawara H, Shumway M. The sequence read archive. Nucleic Acids Res. 2011;39(Database):D19–D21. doi: 10.1093/nar/gkq1019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lemay DG, Hwang DH. Genome-wide identification of peroxisome proliferator response elements using integrated computational genomics. J Lipid Res. 2006;47(7):1583–1587. doi: 10.1194/jlr.M500504-JLR200. [DOI] [PubMed] [Google Scholar]
- Lesk AM. Introduction to protein science: architecture, function and genomics. New York: Oxford University Press; 2004. [Google Scholar]
- Li CC. First course in population genetics. Pacific Grove: The Boxwood Press; 1976. [Google Scholar]
- Li W-H. Evolution of duplicate genes and pseudogenes. Sunderland: Sinauer; 1983. [Google Scholar]
- Li W-H. Molecular evolution. Sunderland: Sinauer; 1997. [Google Scholar]
- Li X, Chang YH. Amino-terminal protein processing in Saccharomyces cerevisiae is an essential function that requires two distinct methionine aminopeptidases. Proc Natl Acad Sci U S A. 1995;92(26):12357–12361. doi: 10.1073/pnas.92.26.12357. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li GL, Leong TY. Feature selection for the prediction of translation initiation sites. Genomics Proteomics Bioinformatics. 2005;3(2):73–83. doi: 10.1016/S1672-0229(05)03012-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li W-H, Tanimura M. The molecular clock runs more slowly in man than in apes and monkeys. Nature. 1987;326:93–96. doi: 10.1038/326093a0. [DOI] [PubMed] [Google Scholar]
- Li WH, Wu CI. Rates of nucleotide substitution are evidently higher in rodents than in man. Mol Biol Evol. 1987;4(1):74–82. doi: 10.1093/oxfordjournals.molbev.a040423. [DOI] [PubMed] [Google Scholar]
- Li WH, Gojobori T, Nei M. Pseudogenes as a paradigm of neutral evolution. Nature. 1981;292(5820):237–239. doi: 10.1038/292237a0. [DOI] [PubMed] [Google Scholar]
- Li W-H, Wolfe KH, Sourdis J, Sharp PM. Reconstruction of phylogenetic trees and estimation of divergence times under nonconstant rates of evolution. Cold Spring Harb Symp Quant Biol. 1987;52:847–856. doi: 10.1101/SQB.1987.052.01.092. [DOI] [PubMed] [Google Scholar]
- Li F, Ge P, Hui WH, Atanasov I, Rogers K, Guo Q, Osato D, Falick AM, Zhou ZH, Simpson L. Structure of the core editing complex (L-complex) involved in uridine insertion/deletion RNA editing in trypanosomatid mitochondria. Proc Natl Acad Sci U S A. 2009;106(30):12306–12310. doi: 10.1073/pnas.0901754106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liang KC, Wang X, Anastassiou D. A profile-based deterministic sequential Monte Carlo algorithm for motif discovery. Bioinformatics. 2008;24(1):46–55. doi: 10.1093/bioinformatics/btm543. [DOI] [PubMed] [Google Scholar]
- Liberman N, Gandin V, Svitkin YV, David M, Virgili G, Jaramillo M, Holcik M, Nagar B, Kimchi A, Sonenberg N. DAP5 associates with eIF2beta and eIF4AI to promote Internal Ribosome Entry Site driven translation. Nucleic Acids Res. 2015;43(7):3764–3775. doi: 10.1093/nar/gkv205. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, Amit I, Lajoie BR, Sabo PJ, Dorschner MO, et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009;326(5950):289–293. doi: 10.1126/science.1181369. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liebler DC. Introduction to proteomics: tools for the new biology. Totowa: Humana Press; 2002. [Google Scholar]
- Liljenstrom H, von Heijne G. Translation rate modification by preferential codon usage: intragenic position effects. J Theor Biol. 1987;124(1):43–55. doi: 10.1016/S0022-5193(87)80251-5. [DOI] [PubMed] [Google Scholar]
- Lim VI. Analysis of action of wobble nucleoside modifications on codon-anticodon pairing within the ribosome. J Mol Biol. 1994;240(1):8–19. doi: 10.1006/jmbi.1994.1413. [DOI] [PubMed] [Google Scholar]
- Lin JP, Aker M, Sitney KC, Mortimer RK. First position wobble in codon-anticodon pairing: amber suppression by a yeast glutamine tRNA. Gene. 1986;49(3):383–388. doi: 10.1016/0378-1119(86)90375-6. [DOI] [PubMed] [Google Scholar]
- Lin HC, Tsai K, Chang BL, Liu J, Young M, Hsu W, Louie S, Nicholas HB, Jr, Rosenquist GL. Prediction of tyrosine sulfation sites in animal viruses. Biochem Biophys Res Commun. 2003;312(4):1154–1158. doi: 10.1016/j.bbrc.2003.11.047. [DOI] [PubMed] [Google Scholar]
- Lin GN, Cai Z, Lin G, Chakraborty S, Xu D. ComPhy: prokaryotic composite distance phylogenies inferred from whole-genome gene sets. BMC Bioinform. 2009;10(Suppl 1):S5. doi: 10.1186/1471-2105-10-S1-S5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lindahl T. Instability and decay of the primary structure of DNA. Nature. 1993;362:709–715. doi: 10.1038/362709a0. [DOI] [PubMed] [Google Scholar]
- Lipman DJ, Pearson WR. Rapid and sensitive protein similarity searches. Science. 1985;227(4693):1435–1441. doi: 10.1126/science.2983426. [DOI] [PubMed] [Google Scholar]
- Lipman DJ, Altschul SF, Kececioglu JD. A tool for multiple sequence alignment. Proc Natl Acad Sci U S A. 1989;86(12):4412–4415. doi: 10.1073/pnas.86.12.4412. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lipscombe D. Neuronal proteins custom designed by alternative splicing. Curr Opin Neurobiol. 2005;15(3):358–363. doi: 10.1016/j.conb.2005.04.002. [DOI] [PubMed] [Google Scholar]
- Lithwick G, Margalit H. Relative predicted protein levels of functionally associated proteins are conserved across organisms. Nucleic Acids Res. 2005;33(3):1051–1057. doi: 10.1093/nar/gki261. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu J, Louie S, Hsu W, Yu KM, Nicholas HB, Jr, Rosenquist GL. Tyrosine sulfation is prevalent in human chemokine receptors important in lung disease. Am J Respir Cell Mol Biol. 2008;38(6):738–743. doi: 10.1165/rcmb.2007-0118OC. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu X, Jiang H, Gu Z, Roberts JW. High-resolution view of bacteriophage lambda gene expression by ribosome profiling. Proc Natl Acad Sci U S A. 2013;110(29):11928–11933. doi: 10.1073/pnas.1309739110. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Livesey R. Genome Biol. 2002;3(9):comment2009.1. doi: 10.1186/gb-2002-3-9-comment2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lobry JR. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol Biol Evol. 1996;13(5):660–665. doi: 10.1093/oxfordjournals.molbev.a025626. [DOI] [PubMed] [Google Scholar]
- Lockhart PJ, Steel MA, Hendy MD, Penny D. Recovering evolutionary trees under a more realistic model of sequence evolution. Mol Biol Evol. 1994;11:605–612. doi: 10.1093/oxfordjournals.molbev.a040136. [DOI] [PubMed] [Google Scholar]
- Lodish HF, Nathan DG. Regulation of hemoglobin synthesis. Preferential inhibition of α and β globin synthesis. J Biol Chem. 1972;247(23):7822–7829. [PubMed] [Google Scholar]
- Lopez P, Philippe H, Myllykallio H, Forterre P. Identification of putative chromosomal origins of replication in Archaea. Mol Microbiol. 1999;32(4):883–886. doi: 10.1046/j.1365-2958.1999.01370.x. [DOI] [PubMed] [Google Scholar]
- Lowry JA, Atchley WR. Molecular evolution of the GATA family of transcription factors: conservation within the DNA-binding domain. J Mol Evol. 2000;50(2):103–115. doi: 10.1007/s002399910012. [DOI] [PubMed] [Google Scholar]
- Lu C, Bablanian R. Characterization of small nontranslated polyadenylylated RNAs in vaccinia virus-infected cells. Proc Natl Acad Sci U S A. 1996;93(5):2037–2042. doi: 10.1073/pnas.93.5.2037. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lunter G, Rocco A, Mimouni N, Heger A, Caldeira A, Hein J. Uncertainty in homology inferences: assessing and improving genomic sequence alignment. Genome Res. 2008;18(2):298–309. doi: 10.1101/gr.6725608. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lustig F, Boren T, Guindy YS, Elias P, Samuelsson T, Gehrke CW, Kuo KC, Lagerkvist U. Codon discrimination and anticodon structural context. Proc Natl Acad Sci U S A. 1989;86(18):6873–6877. doi: 10.1073/pnas.86.18.6873. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ma B, Nussinov R. Release factors eRF1 and RF2: a universal mechanism controls the large conformational changes. J Biol Chem. 2004;279(51):53875–53885. doi: 10.1074/jbc.M407412200. [DOI] [PubMed] [Google Scholar]
- Ma P, Xia X. Factors affecting splicing strength of yeast genes. Comp Funct Genomics. 2011:212146. [DOI] [PMC free article] [PubMed]
- Ma S, Musa T, Bag J. Reduced stability of mitogen-activated protein kinase kinase-2 mRNA and phosphorylation of poly(A)-binding protein (PABP) in cells overexpressing PABP. J Biol Chem. 2006;281(6):3145–3156. doi: 10.1074/jbc.M508937200. [DOI] [PubMed] [Google Scholar]
- MacKay VL, Li X, Flory MR, Turcott E, Law GL, Serikawa KA, Xu XL, Lee H, Goodlett DR, Aebersold R, et al. Gene expression analyzed by high-resolution state array analysis and quantitative proteomics: response of yeast to mating pheromone. Mol Cell Proteomics. 2004;3(5):478–489. doi: 10.1074/mcp.M300129-MCP200. [DOI] [PubMed] [Google Scholar]
- Madden SL, Galella EA, Zhu J, Bertelsen AH, Beaudry GA. SAGE transcript profiles for p53-dependent growth regulation. Oncogene. 1997;15(9):1079–1085. doi: 10.1038/sj.onc.1201091. [DOI] [PubMed] [Google Scholar]
- Maher CA, Kumar-Sinha C, Cao X, Kalyana-Sundaram S, Han B, Jing X, Sam L, Barrette T, Palanisamy N, Chinnaiyan AM. Transcriptome sequencing to detect gene fusions in cancer. Nature. 2009;458(7234):97–101. doi: 10.1038/nature07638. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mannella CA, Neuwald AF, Lawrence CE. Detection of likely transmembrane beta strand regions in sequences of mitochondrial pore proteins using the Gibbs sampler. J Bioenerg Biomembr. 1996;28(2):163–169. doi: 10.1007/BF02110647. [DOI] [PubMed] [Google Scholar]
- Marck C, Grosjean H. tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA. 2002;8(10):1189–1232. doi: 10.1017/S1355838202022021. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Marin A, Xia X. GC skew in protein-coding genes between the leading and lagging strands in bacterial genomes: new substitution models incorporating strand bias. J Theor Biol. 2008;253(3):508–513. doi: 10.1016/j.jtbi.2008.04.004. [DOI] [PubMed] [Google Scholar]
- Martinez MA, Vartanian J-P, Wain-Hobson S. Hypermutagenesis of RNA using human immunodeficiency virus type 1 reverse transcriptase and biased dNTP concentrations. Proc Natl Acad Sci U S A. 1994;91(25):11787–11791. doi: 10.1073/pnas.91.25.11787. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Matin A, Zychlinsky E, Keyhan M, Sachs G. Capacity of Helicobacter pylori to generate ionic gradients at low pH is similar to that of bacteria which grow under strongly acidic conditions. Infect Immun. 1996;64(4):1434–1436. doi: 10.1128/iai.64.4.1434-1436.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McNulty DE, Claffee BA, Huddleston MJ, Porter ML, Cavnar KM, Kane JF. Mistranslational errors associated with the rare arginine codon CGG in Escherichia coli. Protein Expr Purif. 2003;27(2):365–374. doi: 10.1016/S1046-5928(02)00610-1. [DOI] [PubMed] [Google Scholar]
- McPherson DT. Codon preference reflects mistranslational constraints: a proposal. Nucleic Acids Res. 1988;16(9):4111–4120. doi: 10.1093/nar/16.9.4111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Medawar PB, Medawar JS. Aristotle to zoos: a philosophical dictionary of biology. Cambridge, MA: Harvard University Press; 1983. [Google Scholar]
- Meinnel T, Mechulam Y, Blanquet S. Methionine as translation start signal: a review of the enzymes of the pathway in Escherichia coli. Biochimie. 1993;75(12):1061–1075. doi: 10.1016/0300-9084(93)90005-D. [DOI] [PubMed] [Google Scholar]
- Melo EO, de Melo Neto OP, Martins de Sa C. Adenosine-rich elements present in the 5′-untranslated region of PABP mRNA can selectively reduce the abundance and translation of CAT mRNAs in vivo. FEBS Lett. 2003;546(2–3):329–334. doi: 10.1016/S0014-5793(03)00620-3. [DOI] [PubMed] [Google Scholar]
- Melo EO, Dhalia R, Martins de Sa C, Standart N, de Melo Neto OP. Identification of a C-terminal poly(A)-binding protein (PABP)-PABP interaction domain: role in cooperative binding to poly (A) and efficient cap distal translational repression. J Biol Chem. 2003;278(47):46357–46368. doi: 10.1074/jbc.M307624200. [DOI] [PubMed] [Google Scholar]
- Menaker RJ, Sharaf AA, Jones NL. Helicobacter pylori infection and gastric cancer: host, bug, environment, or all three? Curr Gastroenterol Rep. 2004;6(6):429–435. doi: 10.1007/s11894-004-0063-9. [DOI] [PubMed] [Google Scholar]
- Mendz GL, Hazell SL. The urea cycle of Helicobacter pylori. Microbiology. 1996;142(Pt 10):2959–2967. doi: 10.1099/13500872-142-10-2959. [DOI] [PubMed] [Google Scholar]
- Meng SY, Hui JO, Haniu M, Tsai LB. Analysis of translational termination of recombinant human methionyl-neurotrophin 3 in Escherichia coli. Biochem Biophys Res Commun. 1995;211(1):40–48. doi: 10.1006/bbrc.1995.1775. [DOI] [PubMed] [Google Scholar]
- Metropolis N. The beginning of the Monte Carlo method. Los Alamos Sci. 1987;15(Special issue):125–130. [Google Scholar]
- Meyer IM, Durbin R. Gene structure conservation aids similarity based gene prediction. Nucleic Acids Res. 2004;32(2):776–783. doi: 10.1093/nar/gkh211. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Miller JH, Albertini AM. Effects of surrounding sequence on the suppression of nonsense codons. J Mol Biol. 1983;164(1):59–71. doi: 10.1016/0022-2836(83)90087-6. [DOI] [PubMed] [Google Scholar]
- Miller CG, Kukral AM, Miller JL, Movva NR. pepM is an essential gene in Salmonella typhimurium. J Bacteriol. 1989;171(9):5215–5217. doi: 10.1128/jb.171.9.5215-5217.1989. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Milman G, Goldstein J, Scolnick E, Caskey T. Peptide chain termination. 3. Stimulation of in vitro termination. Proc Natl Acad Sci U S A. 1969;63(1):183–190. doi: 10.1073/pnas.63.1.183. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Min Jou W, Haegeman G, Ysebaert M, Fiers W. Nucleotide sequence of the gene coding for the bacteriophage MS2 coat protein. Nature. 1972;237(5350):82–88. doi: 10.1038/237082a0. [DOI] [PubMed] [Google Scholar]
- Minakshi R, Padhan K, Rani M, Khan N, Ahmad F, Jameel S. The SARS coronavirus 3a protein causes endoplasmic reticulum stress and induces ligand-independent downregulation of the type 1 interferon receptor. PLoS One. 2009;4(12):e8342. doi: 10.1371/journal.pone.0008342. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mine T, Muraoka H, Saika T, Kobayashi I. Characteristics of a clinical isolate of urease-negative Helicobacter pylori and its ability to induce gastric ulcers in Mongolian gerbils. Helicobacter. 2005;10(2):125–131. doi: 10.1111/j.1523-5378.2005.00300.x. [DOI] [PubMed] [Google Scholar]
- Mitra SK, Lustig F, Akesson B, Lagerkvist U. Codon-anticodon recognition in the valine codon family. J Biol Chem. 1977;252(2):471–478. [PubMed] [Google Scholar]
- Miura F, Kawaguchi N, Sese J, Toyoda A, Hattori M, Morishita S, Ito T. A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proc Natl Acad Sci U S A. 2006;103(47):17846–17851. doi: 10.1073/pnas.0605645103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Miyata T, Yasunaga T. Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitutions from homologous nucleotide sequences and its application. J Mol Evol. 1980;16(1):23–36. doi: 10.1007/BF01732067. [DOI] [PubMed] [Google Scholar]
- Miyata T, Miyazawa S, Yasunaga T. Two types of amino acid substitutions in protein evolution. J Mol Evol. 1979;12(3):219–236. doi: 10.1007/BF01732340. [DOI] [PubMed] [Google Scholar]
- Mlera L, Lam J, Offerdahl DK, Martens C, Sturdevant D, Turner CV, Porcella SF, Bloom ME. Transcriptome analysis reveals a signature profile for tick-borne Flavivirus persistence in HEK 293T cells. mBio. 2016;7(3):e00314-16. doi: 10.1128/mBio.00314-16. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mobley HL, Hu LT, Foxall PA. Helicobacter pylori urease: properties and role in pathogenesis. Scand J Gastroenterol. 1991;187(Supplement):39–46. doi: 10.3109/00365529109098223. [DOI] [PubMed] [Google Scholar]
- Moerschell RP, Hosokawa Y, Tsunasawa S, Sherman F. The specificities of yeast methionine aminopeptidase and acetylation of amino-terminal methionine in vivo. Processing of altered iso-1-cytochromes c created by oligonucleotide transformation. J Biol Chem. 1990;265(32):19638–19643. [PubMed] [Google Scholar]
- Moffat JG, Rudolph J, Bailey D. Phenotypic screening in cancer drug discovery – past, present and future. Nat Rev Drug Discov. 2014;13(8):588–602. doi: 10.1038/nrd4366. [DOI] [PubMed] [Google Scholar]
- Moi P, Loudianos G, Lavinha J, Murru S, Cossu P, Casu R, Oggiano L, Longinotti M, Cao A, Pirastu M. Delta-thalassemia due to a mutation in an erythroid-specific binding protein sequence 3′ to the delta-globin gene. Blood. 1992;79(2):512–516. [PubMed] [Google Scholar]
- Monteiro PT, Mendes ND, Teixeira MC, d’Orey S, Tenreiro S, Mira NP, Pais H, Francisco AP, Carvalho AM, Lourenco AB, et al. YEASTRACT-DISCOVERER: new tools to improve the analysis of transcriptional regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 2008;36(Database issue):D132–D136. doi: 10.1093/nar/gkm976. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mora L, Heurgue-Hamard V, Champ S, Ehrenberg M, Kisselev LL, Buckingham RH. The essential role of the invariant GGQ motif in the function and stability in vivo of bacterial release factors RF1 and RF2. Mol Microbiol. 2003;47(1):267–275. doi: 10.1046/j.1365-2958.2003.03301.x. [DOI] [PubMed] [Google Scholar]
- Mora L, Heurgue-Hamard V, de Zamaroczy M, Kervestin S, Buckingham RH. Methylation of bacterial release factors RF1 and RF2 is required for normal translation termination in vivo. J Biol Chem. 2007;282(49):35638–35645. doi: 10.1074/jbc.M706076200. [DOI] [PubMed] [Google Scholar]
- Morin R, Bainbridge M, Fejes A, Hirst M, Krzywinski M, Pugh T, McDonald H, Varhol R, Jones S, Marra M. Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing. BioTechniques. 2008;45(1):81–94. doi: 10.2144/000112900. [DOI] [PubMed] [Google Scholar]
- Morin RD, O’Connor MD, Griffith M, Kuchenbauer F, Delaney A, Prabhu AL, Zhao Y, McDonald H, Zeng T, Hirst M, et al. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res. 2008;18(4):610–621. doi: 10.1101/gr.7179508. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morita M, Shimozawa N, Kashiwayama Y, Suzuki Y, Imanaka T. ABC subfamily D proteins and very long chain fatty acid metabolism as novel targets in adrenoleukodystrophy. Curr Drug Targets. 2011;12(5):694–706. doi: 10.2174/138945011795378577. [DOI] [PubMed] [Google Scholar]
- Moriyama EN, Powell JR. Codon usage bias and tRNA abundance in Drosophila. J Mol Evol. 1997;45(5):514–523. doi: 10.1007/PL00006256. [DOI] [PubMed] [Google Scholar]
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5(7):621–628. doi: 10.1038/nmeth.1226. [DOI] [PubMed] [Google Scholar]
- Mottagui-Tabar S, Isaksson LA. Only the last amino acids in the nascent peptide influence translation termination in Escherichia coli genes. FEBS Lett. 1997;414(1):165–170. doi: 10.1016/S0014-5793(97)00978-2. [DOI] [PubMed] [Google Scholar]
- Moult J, Hubbard T, Fidelis K, Pedersen JT. Critical assessment of methods of protein structure prediction (CASP): round III. Proteins. 1999;37(Suppl 3):2–6. doi: 10.1002/(SICI)1097-0134(1999)37:3+<2::AID-PROT2>3.0.CO;2-2. [DOI] [PubMed] [Google Scholar]
- Muller HJ, Altenburg E. The frequency of translocations produced by X-rays in Drosophila. Genetics. 1930;15(4):283–311. doi: 10.1093/genetics/15.4.283. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murphy J, Mahony J, Ainsworth S, Nauta A, van Sinderen D. Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl Environ Microbiol. 2013;79(24):7547–7555. doi: 10.1128/AEM.02229-13. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murtagh F. Complexities of hierarchic clustering algorithms: state of the art. Comput Stat Q. 1984;1:101–113. [Google Scholar]
- Muto A, Osawa S. The guanine and cytosine content of genomic DNA and bacterial evolution. Proc Natl Acad Sci U S A. 1987;84:166–169. doi: 10.1073/pnas.84.1.166. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nachman MW, Crowell SL. Estimate of the mutation rate per nucleotide in humans. Genetics. 2000;156(1):297–304. doi: 10.1093/genetics/156.1.297. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nakai K, Horton P. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem Sci. 1999;24(1):34–36. doi: 10.1016/S0968-0004(98)01336-X. [DOI] [PubMed] [Google Scholar]
- Nakamoto T. A unified view of the initiation of protein synthesis. Biochem Biophys Res Commun. 2006;341(3):675–678. doi: 10.1016/j.bbrc.2006.01.019. [DOI] [PubMed] [Google Scholar]
- Nakamura Y, Ito K, Matsumura K, Kawazu Y, Ebihara K. Regulation of translation termination: conserved structural motifs in bacterial and eukaryotic polypeptide release factors. Biochem Cell Biol. 1995;73(11–12):1113–1122. doi: 10.1139/o95-120. [DOI] [PubMed] [Google Scholar]
- Nakamura Y, Ito K, Isaksson LA. Emerging understanding of translation termination. Cell. 1996;87(2):147–150. doi: 10.1016/S0092-8674(00)81331-8. [DOI] [PubMed] [Google Scholar]
- Nakamura Y, Gojobori T, Ikemura T. Codon usage tabulated from international DNA sequence databases: status for the year 2000. Nucleic Acids Res. 2000;28(1):292. doi: 10.1093/nar/28.1.292. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nakashima H, Fukuchi S, Nishikawa K. Compositional changes in RNA, DNA and proteins for bacterial adaptation to higher and lower temperatures. J Biochem (Tokyo). 2003;133(4):507–513. doi: 10.1093/jb/mvg067. [DOI] [PubMed] [Google Scholar]
- Nasvall SJ, Chen P, Bjork GR. The wobble hypothesis revisited: uridine-5-oxyacetic acid is critical for reading of G-ending codons. RNA. 2007;13(12):2151–2164. doi: 10.1261/rna.731007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Needleman SB, Wunsch CD. A general method applicable to the search of similarities in the amino acid sequence of two proteins. J Mol Biol. 1970;48:443–453. doi: 10.1016/0022-2836(70)90057-4. [DOI] [PubMed] [Google Scholar]
- Nei M. Phylogenetic analysis in molecular evolutionary genetics. Annu Rev Genet. 1996;30:371–403. doi: 10.1146/annurev.genet.30.1.371. [DOI] [PubMed] [Google Scholar]
- Nei M, Kumar S. Molecular evolution and phylogenetics. New York: Oxford University Press; 2000. [Google Scholar]
- Neuwald AF, Liu JS, Lawrence CE. Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Sci. 1995;4(8):1618–1632. doi: 10.1002/pro.5560040820. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ngumbela KC, Ryan KP, Sivamurthy R, Brockman MA, Gandhi RT, Bhardwaj N, Kavanagh DG. Quantitative effect of suboptimal codon usage on translational efficiency of mRNA encoding HIV-1 gag in intact T cells. PLoS One. 2008;3(6):e2356. doi: 10.1371/journal.pone.0002356. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nicholas HB, Jr, Chan SS, Rosenquist GL. Reevaluation of the determinants of tyrosine sulfation. Endocrine. 1999;11(3):285–292. doi: 10.1385/ENDO:11:3:285. [DOI] [PubMed] [Google Scholar]
- Nichols T, Hayasaka S. Controlling the familywise error rate in functional neuroimaging: a comparative review. Stat Meth Med Res. 2003;12(5):419–446. doi: 10.1191/0962280203sm341ra. [DOI] [PubMed] [Google Scholar]
- Nicolae M, Pathak S, Rajasekaran S. LFQC: a lossless compression algorithm for FASTQ files. Bioinformatics. 2015;31(20):3276–3281. doi: 10.1093/bioinformatics/btv384. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nishimura S, Takahashi S, Kuroha T, Suwabe N, Nagasawa T, Trainor C, Yamamoto M. A GATA box in the GATA-1 gene hematopoietic enhancer is a critical element in the network of GATA factors and sites that regulate this gene. Mol Cell Biol. 2000;20(2):713–723. doi: 10.1128/MCB.20.2.713-723.2000. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nissen P, Kjeldgaard M, Thirup S, Polekhina G, Reshetnikova L, Clark BF, Nyborg J. Crystal structure of the ternary complex of Phe-tRNAPhe, EF-Tu, and a GTP analog. Science. 1995;270(5241):1464–1472. doi: 10.1126/science.270.5241.1464. [DOI] [PubMed] [Google Scholar]
- Noedl H, Se Y, Schaecher K, Smith BL, Socheat D, Fukuda MM. Evidence of artemisinin-resistant malaria in western Cambodia. N Engl J Med. 2008;359(24):2619–2620. doi: 10.1056/NEJMc0805011. [DOI] [PubMed] [Google Scholar]
- Noedl H, Socheat D, Satimai W. Artemisinin-resistant malaria in Asia. N Engl J Med. 2009;361(5):540–541. doi: 10.1056/NEJMc0900231. [DOI] [PubMed] [Google Scholar]
- Noedl H, Se Y, Sriwichai S, Schaecher K, Teja-Isavadharm P, Smith B, Rutvisuttinunt W, Bethell D, Surasri S, Fukuda MM, et al. Artemisinin resistance in Cambodia: a clinical trial designed to address an emerging problem in Southeast Asia. Clin Infect Dis. 2010;51(11):e82–e89. doi: 10.1086/657120. [DOI] [PubMed] [Google Scholar]
- Nomenclature Committee of the International Union of Biochemistry Nomenclature for incompletely specified bases in nucleic acid sequences. Recommendations 1984. Eur J Biochem. 1985;150:1–5. doi: 10.1111/j.1432-1033.1985.tb08977.x. [DOI] [PubMed] [Google Scholar]
- Notredame C, O’Brien EA, Higgins DG. RAGA: RNA sequence alignment by genetic algorithm. Nucleic Acids Res. 1997;25(22):4570–4580. doi: 10.1093/nar/25.22.4570. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Numanagic I, Bonfield JK, Hach F, Voges J, Ostermann J, Alberti C, Mattavelli M, Sahinalp SC. Comparison of high-throughput sequencing data compression tools. Nat Methods. 2016;13(12):1005–1008. doi: 10.1038/nmeth.4037. [DOI] [PubMed] [Google Scholar]
- Nur I, Szyf M, Razin A, Glaser G, Rottem S, Razin S. Procaryotic and eucaryotic traits of DNA methylation in spiroplasmas (mycoplasmas) J Bacteriol. 1985;164(1):19–24. doi: 10.1128/jb.164.1.19-24.1985. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Nussinov R. Doublet frequencies in evolutionary distinct groups. Nucleic Acids Res. 1984;12(3):1749–1763. doi: 10.1093/nar/12.3.1749. [DOI] [PMC free article] [PubMed] [Google Scholar]
- O’Brien JD, She ZS, Suchard MA. Dating the time of viral subtype divergence. BMC Evol Biol. 2008;8:172. doi: 10.1186/1471-2148-8-172. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Obenauer JC, Cantley LC, Yaffe MB. Scansite 2.0: proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res. 2003;31(13):3635–3641. doi: 10.1093/nar/gkg584. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ohta T, Gray TA, Rogan PK, Buiting K, Gabriel JM, Saitoh S, Muralidhar B, Bilienska B, Krajewska-Walasek M, Driscoll DJ, et al. Imprinting-mutation mechanisms in Prader-Willi syndrome. Am J Hum Genet. 1999;64(2):397–413. doi: 10.1086/302233. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ordway JM, Fenster SD, Ruan H, Curran T. A transcriptome map of cellular transformation by the fos oncogene. Mol Cancer. 2005;4(1):19. doi: 10.1186/1476-4598-4-19. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Orkin SH. Globin gene regulation and switching: circa 1990. Cell. 1990;63(4):665–672. doi: 10.1016/0092-8674(90)90133-Y. [DOI] [PubMed] [Google Scholar]
- Orkin SH. GATA-binding transcription factors in hematopoietic cells. Blood. 1992;80(3):575–581. [PubMed] [Google Scholar]
- Osawa S, Jukes TH, Muto A, Yamao F, Ohama T, Andachi Y. Role of directional mutation pressure in the evolution of the eubacterial genetic code. Cold Spring Harb Symp Quant Biol. 1987;52:777–789. doi: 10.1101/SQB.1987.052.01.087. [DOI] [PubMed] [Google Scholar]
- Osterman IA, Evfratov SA, Sergiev PV, Dontsova OA. Comparison of mRNA features affecting translation initiation and reinitiation. Nucleic Acids Res. 2013;41(1):474–486. doi: 10.1093/nar/gks989. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ostrin EJ, Li Y, Hoffman K, Liu J, Wang K, Zhang L, Mardon G, Chen R. Genome-wide identification of direct targets of the Drosophila retinal determination protein Eyeless. Genome Res. 2006;16(4):466–476. doi: 10.1101/gr.4673006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ota S, Li WH. NJML: a hybrid algorithm for the neighbor-joining and maximum-likelihood methods. Mol Biol Evol. 2000;17(9):1401–1409. doi: 10.1093/oxfordjournals.molbev.a026423. [DOI] [PubMed] [Google Scholar]
- Ota S, Li WH. NJML+: an extension of the NJML method to handle protein sequence data and computer software implementation. Mol Biol Evol. 2001;18(11):1983–1992. doi: 10.1093/oxfordjournals.molbev.a003740. [DOI] [PubMed] [Google Scholar]
- Otu HH, Sayood K. A new sequence distance measure for phylogenetic tree construction. Bioinformatics. 2003;19(16):2122–2130. doi: 10.1093/bioinformatics/btg295. [DOI] [PubMed] [Google Scholar]
- Palidwor GA, Perkins TJ, Xia X. A general model of codon bias due to GC mutational bias. PLoS One. 2010;5(10):e13431. doi: 10.1371/journal.pone.0013431. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Palstra RJ, Tolhuis B, Splinter E, Nijmeijer R, Grosveld F, de Laat W. The beta-globin nuclear compartment in development and erythroid differentiation. Nat Genet. 2003;35(2):190–194. doi: 10.1038/ng1244. [DOI] [PubMed] [Google Scholar]
- Pandey RR, Mondal T, Mohammad F, Enroth S, Redrup L, Komorowski J, Nagano T, Mancini-Dinardo D, Kanduri C. Kcnq1ot1 antisense noncoding RNA mediates lineage-specific transcriptional silencing through chromatin-level regulation. Mol Cell. 2008;32(2):232–246. doi: 10.1016/j.molcel.2008.08.022. [DOI] [PubMed] [Google Scholar]
- Pappin DJ, Hojrup P, Bleasby AJ. Rapid identification of proteins by peptide-mass fingerprinting. Curr Biol. 1993;3(6):327–332. doi: 10.1016/0960-9822(93)90195-T. [DOI] [PubMed] [Google Scholar]
- Park SY, Cromie MJ, Lee EJ, Groisman EA. A bacterial mRNA leader that employs different mechanisms to sense disparate intracellular signals. Cell. 2010;142(5):737–748. doi: 10.1016/j.cell.2010.07.046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Parker J. Errors and alternatives in reading the universal genetic code. Microbiol Rev. 1989;53(3):273–298. doi: 10.1128/mr.53.3.273-298.1989. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Patel GP, Bag J. IMP1 interacts with poly(A)-binding protein (PABP) and the autoregulatory translational control element of PABP-mRNA through the KH III-IV domain. FEBS J. 2006;273(24):5678–5690. doi: 10.1111/j.1742-4658.2006.05556.x. [DOI] [PubMed] [Google Scholar]
- Patel GP, Ma S, Bag J. The autoregulatory translational control element of poly(A)-binding protein mRNA forms a heteromeric ribonucleoprotein complex. Nucleic Acids Res. 2005;33(22):7074–7089. doi: 10.1093/nar/gki1014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pauling L, Itano HA, Singer SJ, Wells IC. Sickle cell anemia a molecular disease. Science. 1949;110(2865):543–548. doi: 10.1126/science.110.2865.543. [DOI] [PubMed] [Google Scholar]
- Pazin MJ, Kamakaka RT, Kadonaga JT. ATP-dependent nucleosome reconfiguration and transcriptional activation from preassembled chromatin templates. Science. 1994;266(5193):2007–2011. doi: 10.1126/science.7801129. [DOI] [PubMed] [Google Scholar]
- Pazin MJ, Sheridan PL, Cannon K, Cao Z, Keck JG, Kadonaga JT, Jones KA. NF-kappa B-mediated chromatin reconfiguration and transcriptional activation of the HIV-1 enhancer in vitro. Genes Dev. 1996;10(1):37–49. doi: 10.1101/gad.10.1.37. [DOI] [PubMed] [Google Scholar]
- Pazin MJ, Hermann JW, Kadonaga JT. Promoter structure and transcriptional activation with chromatin templates assembled in vitro. A single Gal4-VP16 dimer binds to chromatin or to DNA with comparable affinity. J Biol Chem. 1998;273(51):34653–34660. doi: 10.1074/jbc.273.51.34653. [DOI] [PubMed] [Google Scholar]
- Peabody MA, Laird MR, Vlasschaert C, Lo R, Brinkman FS. PSORTdb: expanding the bacteria and archaea protein subcellular localization database to better reflect diversity in cell envelope structures. Nucleic Acids Res. 2016;44(D1):D663–D668. doi: 10.1093/nar/gkv1271. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pearson WR. Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 1990;183:63–98. doi: 10.1016/0076-6879(90)83007-V. [DOI] [PubMed] [Google Scholar]
- Pearson WR. Using the FASTA program to search protein and DNA sequence databases. Methods Mol Biol. 1994;24:307–331. doi: 10.1385/0-89603-246-9:307. [DOI] [PubMed] [Google Scholar]
- Pearson WR. Empirical statistical estimates for sequence similarity searches. J Mol Biol. 1998;276(1):71–84. doi: 10.1006/jmbi.1997.1525. [DOI] [PubMed] [Google Scholar]
- Pearson WR, Lipman DJ. Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A. 1988;85:2444–2448. doi: 10.1073/pnas.85.8.2444. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pei J, Kim BH, Grishin NV. PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 2008;36(7):2295–2300. doi: 10.1093/nar/gkn072. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Percudani R, Pavesi A, Ottonello S. Transfer RNA gene redundancy and translational selection in Saccharomyces cerevisiae. J Mol Biol. 1997;268(2):322–330. doi: 10.1006/jmbi.1997.0942. [DOI] [PubMed] [Google Scholar]
- Pereira SL, Baker AJ. A mitogenomic timescale for birds detects variable phylogenetic rates of molecular evolution and refutes the standard molecular clock. Mol Biol Evol. 2006;23(9):1731–1740. doi: 10.1093/molbev/msl038. [DOI] [PubMed] [Google Scholar]
- Pestova TV, Shatsky IN, Fletcher SP, Jackson RJ, Hellen CU. A prokaryotic-like mode of cytoplasmic eukaryotic ribosome binding to the initiation codon during internal translation initiation of hepatitis C and classical swine fever virus RNAs. Genes Dev. 1998;12(1):67–83. doi: 10.1101/gad.12.1.67. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pestova TV, Lomakin IB, Hellen CU. Position of the CrPV IRES on the 40S subunit and factor dependence of IRES/80S ribosome assembly. EMBO Rep. 2004;5(9):906–913. doi: 10.1038/sj.embor.7400240. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petronis A. The origin of schizophrenia: genetic thesis, epigenetic antithesis, and resolving synthesis. Biol Psychiatry. 2004;55(10):965–970. doi: 10.1016/j.biopsych.2004.02.005. [DOI] [PubMed] [Google Scholar]
- Petronis A. Epigenetics and twins: three variations on the theme. Trends Genet. 2006;22(7):347–350. doi: 10.1016/j.tig.2006.04.010. [DOI] [PubMed] [Google Scholar]
- Petronis A, Gottesman II, Kan P, Kennedy JL, Basile VS, Paterson AD, Popendikyte V. Monozygotic twins exhibit numerous epigenetic differences: clues to twin discordance? Schizophr Bull. 2003;29(1):169–178. doi: 10.1093/oxfordjournals.schbul.a006988. [DOI] [PubMed] [Google Scholar]
- Petrullo LA, Gallagher PJ, Elseviers D. The role of 2-methylthio-N6-isopentenyladenosine in readthrough and suppression of nonsense codons in Escherichia coli. Mol Gen Genet. 1983;190(2):289–294. doi: 10.1007/BF00330653. [DOI] [PubMed] [Google Scholar]
- Petry S, Brodersen DE, FVt M, Dunham CM, Selmer M, Tarry MJ, Kelley AC, Ramakrishnan V. Crystal structures of the ribosome in complex with release factors RF1 and RF2 bound to a cognate stop codon. Cell. 2005;123(7):1255–1266. doi: 10.1016/j.cell.2005.09.039. [DOI] [PubMed] [Google Scholar]
- Pevzner PA. Computational molecular biology: an algorithmic approach. Cambridge, MA: The MIT Press; 2000. [Google Scholar]
- Pielou EC. The interpretation of ecological data: a primer on classification and ordination. New York: Wiley; 1984. [Google Scholar]
- Pietras K, Sjoblom T, Rubin K, Heldin CH, Ostman A. PDGF receptors as cancer drug targets. Cancer Cell. 2003;3(5):439–443. doi: 10.1016/S1535-6108(03)00089-8. [DOI] [PubMed] [Google Scholar]
- Pinheiro JC, Bates DM. Mixed-effects models in S and S-PLUS. Berlin/Heidelberg: Springer; 2000. [Google Scholar]
- Pleiss JA, Whitworth GB, Bergkessel M, Guthrie C. Rapid, transcript-specific changes in splicing in response to environmental stress. Mol Cell. 2007;27(6):928–937. doi: 10.1016/j.molcel.2007.07.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pobre V, Arraiano CM. Next generation sequencing analysis reveals that the ribonucleases RNase II, RNase R and PNPase affect bacterial motility and biofilm formation in E. coli. BMC Genomics. 2015;16:72. doi: 10.1186/s12864-015-1237-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Poole ES, Brown CM, Tate WP. The identity of the base following the stop codon determines the efficiency of in vivo translational termination in Escherichia coli. EMBO J. 1995;14(1):151–158. doi: 10.1002/j.1460-2075.1995.tb06985.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Poole ES, Major LL, Mannering SA, Tate WP. Translational termination in Escherichia coli: three bases following the stop codon crosslink to release factor 2 and affect the decoding efficiency of UGA-containing signals. Nucleic Acids Res. 1998;26(4):954–960. doi: 10.1093/nar/26.4.954. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Popa A, Lebrigand K, Barbry P, Waldmann R. Pateamine A-sensitive ribosome profiling reveals the scope of translation in mouse embryonic stem cells. BMC Genomics. 2016;17:52. doi: 10.1186/s12864-016-2384-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Poulos MG, Batra R, Charizanis K, Swanson MS. Developments in RNA splicing and disease. Cold Spring Harb Perspect Biol. 2011;3(1):a000778. doi: 10.1101/cshperspect.a000778. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Povolotskaya IS, Kondrashov FA, Ledda A, Vlasov PK. Stop codons in bacteria are not selectively equivalent. Biol Direct. 2012;7:30. doi: 10.1186/1745-6150-7-30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Prabhakaran R, Chithambaram S, Xia X. Escherichia coli and Staphylococcus phages: effect of translation initiation efficiency on differential codon adaptation mediated by virulent and temperate lifestyles. J Gen Virol. 2015;96(Pt 5):1169–1179. doi: 10.1099/vir.0.000050. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Prensner JR, Iyer MK, Balbin OA, Dhanasekaran SM, Cao Q, Brenner JC, Laxman B, Asangani IA, Grasso CS, Kominsky HD, et al. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotechnol. 2011;29(8):742–749. doi: 10.1038/nbt.1914. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Press WH, Teukolsky SA, Tetterling WT, Flannery BP. Numerical recipes in C: the art of scientifi computing. Cambridge: Cambridge University Press; 1992. [Google Scholar]
- Prival MJ. Isolation of glutamate-inserting ochre suppressor mutants of Salmonella typhimurium and Escherichia coli. J Bacteriol. 1996;178(10):2989–2990. doi: 10.1128/jb.178.10.2989-2990.1996. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ptashne M. A genetic switch: gene control and phage lambda. Cambridge, MA: Cell Press and Blackwell Scientific; 1986. [Google Scholar]
- Pure GA, Robinson GW, Naumovski L, Friedberg EC. Partial suppression of an ochre mutation in Saccharomyces cerevisiae by multicopy plasmids containing a normal yeast tRNAGln gene. J Mol Biol. 1985;183(1):31–42. doi: 10.1016/0022-2836(85)90278-5. [DOI] [PubMed] [Google Scholar]
- Pyronnet S, Pradayrol L, Sonenberg N. A cell cycle-dependent internal ribosome entry site. Mol Cell. 2000;5(4):607–616. doi: 10.1016/S1097-2765(00)80240-3. [DOI] [PubMed] [Google Scholar]
- Qin ZS, McCue LA, Thompson W, Mayerhofer L, Lawrence CE, Liu JS. Identification of co-regulated genes through Bayesian clustering of predicted regulatory binding sites. Nat Biotechnol. 2003;21(4):435–439. doi: 10.1038/nbt802. [DOI] [PubMed] [Google Scholar]
- Qu K, McCue LA, Lawrence CE. Bayesian protein family classifier. Proc Int Conf Intell Syst Mol Biol. 1998;6:131–139. [PubMed] [Google Scholar]
- Raaum RL, Sterner KN, Noviello CM, Stewart C-B, Disotell TR. Catarrhine primate divergence dates estimated from complete mitochondrial genomes: concordance with fossil and nuclear DNA evidence. J Hum Evol. 2005;48(3):237. doi: 10.1016/j.jhevol.2004.11.007. [DOI] [PubMed] [Google Scholar]
- Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE. 1989;77(2):257–286. doi: 10.1109/5.18626. [DOI] [Google Scholar]
- Rahi SJ, Pecani K, Ondracka A, Oikonomou C, Cross FR. The CDK-APC/C oscillator predominantly entrains periodic cell-cycle transcription. Cell. 2016;165(2):475–487. doi: 10.1016/j.cell.2016.02.060. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rambaut A, Bromham L. Estimating divergence dates from molecular sequences. Mol Biol Evol. 1998;15(4):442–448. doi: 10.1093/oxfordjournals.molbev.a025940. [DOI] [PubMed] [Google Scholar]
- Ran W, Higgs PG. Contributions of speed and accuracy to translational selection in bacteria. PLoS One. 2012;7(12):e51652. doi: 10.1371/journal.pone.0051652. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rannala B, Yang Z. Inferring speciation times under an episodic molecular clock. Syst Biol. 2007;56(3):453–466. doi: 10.1080/10635150701420643. [DOI] [PubMed] [Google Scholar]
- Rashid M, Saha S, Raghava GP. Support Vector Machine-based method for predicting subcellular localization of mycobacterial proteins using evolutionary information and motifs. BMC Bioinformatics. 2007;8:337. doi: 10.1186/1471-2105-8-337. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Razin A, Razin S. Methylated bases in mycoplasmal DNA. Nucleic Acids Res. 1980;8(6):1383–1390. doi: 10.1093/nar/8.6.1383. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Regier JC, Shultz JW, Zwick A, Hussey A, Ball B, Wetzer R, Martin JW, Cunningham CW. Arthropod relationships revealed by phylogenomic analysis of nuclear protein-coding sequences. Nature. 2010;463(7284):1079–1083. doi: 10.1038/nature08742. [DOI] [PubMed] [Google Scholar]
- Reinert K, Stoye J, Will T. An iterative method for faster sum-of-pairs multiple sequence alignment. Bioinformatics. 2000;16(9):808–814. doi: 10.1093/bioinformatics/16.9.808. [DOI] [PubMed] [Google Scholar]
- Rektorschek M, Buhmann A, Weeks D, Schwan D, Bensch KW, Eskandari S, Scott D, Sachs G, Melchers K. Acid resistance of Helicobacter pylori depends on the UreI membrane protein and an inner membrane proton barrier. Mol Microbiol. 2000;36(1):141–152. doi: 10.1046/j.1365-2958.2000.01835.x. [DOI] [PubMed] [Google Scholar]
- Rice P, Longden I, Bleasby A. EMBOSS: the European molecular biology open software suite. Trends Genet. 2000;16(6):276–277. doi: 10.1016/S0168-9525(00)02024-2. [DOI] [PubMed] [Google Scholar]
- Rideout WMI, Coetzee GA, Olumi AF, Jones PA. 5-Methylcytosine as an endogenous mutagen in the human LDL receptor and p53 genes. Science. 1990;249:1288–1290. doi: 10.1126/science.1697983. [DOI] [PubMed] [Google Scholar]
- Rimsky L, Hauber J, Dukovich M, Malim MH, Langlois A, Cullen BR, Greene WC. Functional replacement of the HIV-1 rev protein by the HTLV-1 rex protein. Nature. 1988;335(6192):738–740. doi: 10.1038/335738a0. [DOI] [PubMed] [Google Scholar]
- Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, Goodnough LH, Helms JA, Farnham PJ, Segal E, et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129(7):1311–1323. doi: 10.1016/j.cell.2007.05.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ritland K, Clegg M. Molecular evolution. New York: Alan R. Liss; 1990. Optimal DNA sequence divergence for testing phylogenetic hypotheses; pp. 289–296. [Google Scholar]
- Roberts A, Pachter L. Streaming fragment assignment for real-time analysis of sequencing experiments. Nat Methods. 2013;10(1):71–73. doi: 10.1038/nmeth.2251. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 2011;12(3):R22. doi: 10.1186/gb-2011-12-3-r22. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Roberts A, Feng H, Pachter L. Fragment assignment in the cloud with eXpress-D. BMC Bioinform. 2013;14:358. doi: 10.1186/1471-2105-14-358. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Roberts A, Schaeffer L, Pachter L. Updating RNA-Seq analyses after re-annotation. Bioinformatics. 2013;29(13):1631–1637. doi: 10.1093/bioinformatics/btt197. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods. 2007;4(8):651–657. doi: 10.1038/nmeth1068. [DOI] [PubMed] [Google Scholar]
- Robinson M, Lilley R, Little S, Emtage JS, Yarranton G, Stephens P, Millican A, Eaton M, Humphreys G. Codon usage can affect efficiency of translation of genes in Escherichia coli. Nucleic Acids Res. 1984;12(17):6663–6671. doi: 10.1093/nar/12.17.6663. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rodgers AB, Morgan CP, Leu NA, Bale TL. Transgenerational epigenetic programming via sperm microRNA recapitulates effects of paternal stress. Proc Natl Acad Sci U S A. 2015;112(44):13699–13704. doi: 10.1073/pnas.1508347112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rogers MF, Thomas J, Reddy AS, Ben-Hur A. SpliceGrapher: detecting patterns of alternative splicing from RNA-Seq data in the context of gene models and EST data. Genome Biol. 2012;13(1):R4. doi: 10.1186/gb-2012-13-1-r4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rogozin IB, Managadze D, Shabalina SA, Koonin EV. Gene family level comparative analysis of gene expression in mammals validates the ortholog conjecture. Genome Biol Evol. 2014;6(4):754–762. doi: 10.1093/gbe/evu051. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rosenberg MS, Kumar S. Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference. Mol Biol Evol. 2003;20(4):610–621. doi: 10.1093/molbev/msg067. [DOI] [PubMed] [Google Scholar]
- Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev. 1958;65(6):386–408. doi: 10.1037/h0042519. [DOI] [PubMed] [Google Scholar]
- Ross S, Giglione C, Pierre M, Espagne C, Meinnel T. Functional and developmental impact of cytosolic protein N-terminal methionine excision in Arabidopsis. Plant Physiol. 2005;137(2):623–637. doi: 10.1104/pp.104.056861. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Roth JR. UGA nonsense mutations in Salmonella typhimurium. J Bacteriol. 1970;102(2):467–475. doi: 10.1128/jb.102.2.467-475.1970. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rouchka EC (1997) A brief overview of Gibbs Sampling. IBC Statistics Study Group, Washington University, Institute for Biomedical Computing
- Ruiz LM, Armengol G, Habeych E, Orduz S. A theoretical analysis of codon adaptation index of the Boophilus microplus bm86 gene directed to the optimization of a DNA vaccine. J Theor Biol. 2006;239(4):445–449. doi: 10.1016/j.jtbi.2005.08.009. [DOI] [PubMed] [Google Scholar]
- Ryan MJ, Fox JH, Wilczynski W, Rand AS. Sexual selection for sensory exploitation in the frog Physalaemus pustulosus. Nature. 1990;343:66–67. doi: 10.1038/343066a0. [DOI] [PubMed] [Google Scholar]
- Ryden SM, Isaksson LA. A temperature-sensitive mutant of Escherichia coli that shows enhanced misreading of UAG/A and increased efficiency for some tRNA nonsense suppressors. Mol Gen Genet. 1984;193(1):38–45. doi: 10.1007/BF00327411. [DOI] [PubMed] [Google Scholar]
- Rzhetsky A, Nei M. Unbiased estimates of the number of nucleotide substitutions when substitution rate varies among different sites. J Mol Evol. 1994;38(3):295–299. doi: 10.1007/BF00176091. [DOI] [PubMed] [Google Scholar]
- Rzhetsky A, Nei M. Unbiased estimates of the number of nucleotide substitutions when substitution rate varies among different sites. J Mol Evol. 1994;38(3):295–299. doi: 10.1007/BF00176091. [DOI] [PubMed] [Google Scholar]
- Rzhetsky A, Nei M. Tests of applicability of several substitution models for DNA sequence data. Mol Biol Evol. 1995;12(1):131–151. doi: 10.1093/oxfordjournals.molbev.a040182. [DOI] [PubMed] [Google Scholar]
- Saadatpour A, Lai S, Guo G, Yuan GC. Single-cell analysis in cancer genomics. Trends Genet. 2015;31(10):576–586. doi: 10.1016/j.tig.2015.07.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sachs AB, Davis RW, Kornberg RD. A single domain of yeast poly(A)-binding protein is necessary and sufficient for RNA binding and cell viability. Mol Cell Biol. 1987;7(9):3268–3276. doi: 10.1128/MCB.7.9.3268. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sachs G, Meyer-Rosberg K, Scott DR, Melchers K. Acid, protons and Helicobacter pylori. Yale J Biol Med. 1996;69(3):301–316. [PMC free article] [PubMed] [Google Scholar]
- Sachs G, Weeks DL, Melchers K, Scott DR. The gastric biology of Helicobacter pylori. Annu Rev Physiol. 2003;65(1):349–369. doi: 10.1146/annurev.physiol.65.092101.142156. [DOI] [PubMed] [Google Scholar]
- Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ, Vogelstein B, Kinzler KW, Velculescu VE. Using the transcriptome to annotate the genome. Nat Biotechnol. 2002;20(5):508–512. doi: 10.1038/nbt0502-508. [DOI] [PubMed] [Google Scholar]
- Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. doi: 10.1093/oxfordjournals.molbev.a040454. [DOI] [PubMed] [Google Scholar]
- Sakaluk SK. Sensory exploitation as an evolutionary origin to nuptial food gifts in insects. Proc Biol Sci. 2000;267(1441):339–343. doi: 10.1098/rspb.2000.1006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Salzberg SL, Delcher AL, Kasif S, White O. Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 1998;26(2):544–548. doi: 10.1093/nar/26.2.544. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sambrook JF, Fan DP, Brenner S. A strong suppressor specific for UGA. Nature. 1967;214(5087):452–453. doi: 10.1038/214452a0. [DOI] [PubMed] [Google Scholar]
- Samso M, Palumbo MJ, Radermacher M, Liu JS, Lawrence CE. A Bayesian method for classification of images from electron micrographs. J Struct Biol. 2002;138(3):157–170. doi: 10.1016/S1047-8477(02)00001-1. [DOI] [PubMed] [Google Scholar]
- Sancar A, Sancar GB. DNA repair enzymes. Annu Rev Biochem. 1988;57:29–67. doi: 10.1146/annurev.bi.57.070188.000333. [DOI] [PubMed] [Google Scholar]
- Sanderson MJ. A nonparametric approach to estimating divergence times in the absence of rate constancy. Mol Biol Evol. 1997;14:1218–1232. doi: 10.1093/oxfordjournals.molbev.a025731. [DOI] [Google Scholar]
- Sankoff D. Minimal mutation trees of sequences. J SIAM Appl Math. 1975;28:35–42. doi: 10.1137/0128004. [DOI] [Google Scholar]
- Sankoff D, Morel C, Cedergren RJ. Evolution of 5S RNA and the non-randomness of base replacement. Nat New Biol. 1973;245(147):232–234. doi: 10.1038/newbio245232a0. [DOI] [PubMed] [Google Scholar]
- Sankoff D, Cedergren RJ, Lapalme G. Frequency of insertion-deletion, transversion, and transition in the evolution of 5S ribosomal RNA. J Mol Evol. 1976;7(2):133–149. doi: 10.1007/BF01732471. [DOI] [PubMed] [Google Scholar]
- Sawa T, Ohno-Machado L. A neural network-based similarity index for clustering DNA microarray data. Comput Biol Med. 2003;33(1):1–15. doi: 10.1016/S0010-4825(02)00032-X. [DOI] [PubMed] [Google Scholar]
- Schena M. Genome analysis with gene expression microarrays. BioEssays. 1996;18(5):427–431. doi: 10.1002/bies.950180513. [DOI] [PubMed] [Google Scholar]
- Schena M. Microarray analysis. New York: Wiley-Liss; 2003. [Google Scholar]
- Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270(5235):467–470. doi: 10.1126/science.270.5235.467. [DOI] [PubMed] [Google Scholar]
- Schena M, Heller RA, Theriault TP, Konrad K, Lachenmeier E, Davis RW. Microarrays: biotechnology’s discovery platform for functional genomics. Trends Biotechnol. 1998;16(7):301–306. doi: 10.1016/S0167-7799(98)01219-0.
- Schmucker D, Clemens JC, Shu H, Worby CA, Xiao J, Muda M, Dixon JE, Zipursky SL. Drosophila Dscam is an axon guidance receptor exhibiting extraordinary molecular diversity. Cell. 2000;101(6):671–684. doi: 10.1016/S0092-8674(00)80878-8.
- Schneider TD, Stephens RM. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990;18(20):6097–6100. doi: 10.1093/nar/18.20.6097.
- Schuler M, Connell SR, Lescoute A, Giesebrecht J, Dabrowski M, Schroeer B, Mielke T, Penczek PA, Westhof E, Spahn CM. Structure of the ribosome-bound cricket paralysis virus IRES RNA. Nat Struct Mol Biol. 2006;13(12):1092–1096. doi: 10.1038/nsmb1177.
- Schwartz S, Silva J, Burstein D, Pupko T, Eyras E, Ast G. Large-scale comparative analysis of splicing signals and their corresponding splicing factors in eukaryotes. Genome Res. 2008;18(1):88–103. doi: 10.1101/gr.6818908.
- Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–464. doi: 10.1214/aos/1176344136.
- Schwer B, Stunnenberg HG. Vaccinia virus late transcripts generated in vitro have a poly(A) head. EMBO J. 1988;7(4):1183–1190. doi: 10.1002/j.1460-2075.1988.tb02929.x.
- Schwer B, Visca P, Vos JC, Stunnenberg HG. Discontinuous transcription or RNA processing of vaccinia virus late messengers results in a 5′ poly(A) leader. Cell. 1987;50(2):163–169. doi: 10.1016/0092-8674(87)90212-1.
- Scolnick EM, Caskey CT. Peptide chain termination. V. The role of release factors in mRNA terminator codon recognition. Proc Natl Acad Sci U S A. 1969;64(4):1235–1241. doi: 10.1073/pnas.64.4.1235.
- Scolnick E, Tompkins R, Caskey T, Nirenberg M. Release factors differing in specificity for terminator codons. Proc Natl Acad Sci U S A. 1968;61(2):768–774. doi: 10.1073/pnas.61.2.768.
- Scott D, Weeks D, Melchers K, Sachs G. The life and death of Helicobacter pylori. Gut. 1998;43(Suppl 1):S56–S60. doi: 10.1136/gut.43.2008.S56.
- Scott DR, Marcus EA, Weeks DL, Sachs G. Mechanisms of acid resistance due to the urease system of Helicobacter pylori. Gastroenterology. 2002;123(1):187–195. doi: 10.1053/gast.2002.34218.
- Seetharam R, Heeren RA, Wong EY, Braford SR, Klein BK, Aykent S, Kotts CE, Mathis KJ, Bishop BF, Jennings MJ, et al. Mistranslation in IGF-1 during over-expression of the protein in Escherichia coli using a synthetic gene containing low frequency codons. Biochem Biophys Res Commun. 1988;155(1):518–523. doi: 10.1016/S0006-291X(88)81117-3.
- Segurel L, Bon C. On the evolution of lactase persistence in humans. Annu Rev Genomics Hum Genet. 2017;18:297–319. doi: 10.1146/annurev-genom-091416-035340.
- Sendler E, Johnson GD, Mao S, Goodrich RJ, Diamond MP, Hauser R, Krawetz SA. Stability, delivery and functions of human sperm RNAs at fertilization. Nucleic Acids Res. 2013;41(7):4104–4117. doi: 10.1093/nar/gkt132.
- Seo EY, Namkung JH, Lee KM, Lee WH, Im M, Kee SH, Tae Park G, Yang JM, Seo YJ, Park JK, et al. Analysis of calcium-inducible genes in keratinocytes using suppression subtractive hybridization and cDNA microarray. Genomics. 2005;86(5):528–538. doi: 10.1016/j.ygeno.2005.06.013.
- Serero A, Giglione C, Sardini A, Martinez-Sanz J, Meinnel T. An unusual peptide deformylase features in the human mitochondrial N-terminal methionine excision pathway. J Biol Chem. 2003;278(52):52953–52963. doi: 10.1074/jbc.M309770200.
- Shadel GS, Clayton DA. Mitochondrial DNA maintenance in vertebrates. Annu Rev Biochem. 1997;66:409–435. doi: 10.1146/annurev.biochem.66.1.409.
- Sharma U, Conine CC, Shea JM, Boskovic A, Derr AG, Bing XY, Belleannee C, Kucukural A, Serra RW, Sun F, et al. Biogenesis and function of tRNA fragments during sperm maturation and fertilization in mammals. Science. 2016;351(6271):391–396. doi: 10.1126/science.aad6780.
- Sharp PM. What can AIDS virus codon usage tell us? Nature. 1986;324(6093):114. doi: 10.1038/324114a0.
- Sharp PM, Bulmer M. Selective differences among translation termination codons. Gene. 1988;63(1):141–145. doi: 10.1016/0378-1119(88)90553-7.
- Sharp PM, Li WH. The codon adaptation index – a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987;15(3):1281–1295. doi: 10.1093/nar/15.3.1281.
- Sharp PM, Tuohy TM, Mosurski KR. Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res. 1986;14(13):5125–5143. doi: 10.1093/nar/14.13.5125.
- Sheppard K, Yuan J, Hohn MJ, Jester B, Devine KM, Soll D. From one amino acid to another: tRNA-dependent amino acid biosynthesis. Nucleic Acids Res. 2008;36(6):1813–1825. doi: 10.1093/nar/gkn015.
- Sheridan PL, Sheline CT, Cannon K, Voz ML, Pazin MJ, Kadonaga JT, Jones KA. Activation of the HIV-1 enhancer by the LEF-1 HMG protein on nucleosome-assembled DNA in vitro. Genes Dev. 1995;9(17):2090–2104. doi: 10.1101/gad.9.17.2090.
- Sheth N, Roca X, Hastings ML, Roeder T, Krainer AR, Sachidanandam R. Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Res. 2006;34(14):3955–3967. doi: 10.1093/nar/gkl556.
- Shimodaira H, Hasegawa M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol. 1999;16(8):1114–1116. doi: 10.1093/oxfordjournals.molbev.a026201.
- Shine J, Dalgarno L. The 3′-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc Natl Acad Sci U S A. 1974;71(4):1342–1346. doi: 10.1073/pnas.71.4.1342.
- Shine J, Dalgarno L. Identical 3′-terminal octanucleotide sequence in 18S ribosomal ribonucleic acid from different eukaryotes. A proposed role for this sequence in the recognition of terminator codons. Biochem J. 1974;141(3):609–615. doi: 10.1042/bj1410609a.
- Shine J, Dalgarno L. Determinant of cistron specificity in bacterial ribosomes. Nature. 1975;254(5495):34–38. doi: 10.1038/254034a0.
- Shirokikh NE, Spirin AS. Poly(A) leader of eukaryotic mRNA bypasses the dependence of translation on initiation factors. Proc Natl Acad Sci U S A. 2008;105(31):10738–10743. doi: 10.1073/pnas.0804940105.
- Shoemaker RH. The NCI60 human tumour cell line anticancer drug screen. Nat Rev Cancer. 2006;6(10):813–823. doi: 10.1038/nrc1951.
- Shoemaker DD, Schadt EE, Armour CD, He YD, Garrett-Engele P, McDonagh PD, Loerch PM, Leonardson A, Lum PY, Cavet G, et al. Experimental annotation of the human genome using microarray technology. Nature. 2001;409(6822):922–927. doi: 10.1038/35057141.
- Shoemaker R, Deng J, Wang W, Zhang K. Allele-specific methylation is prevalent and is contributed by CpG-SNPs in the human genome. Genome Res. 2010;20(7):883–889. doi: 10.1101/gr.104695.109.
- Shpaer EG. Constraints on codon context in Escherichia coli genes. Their possible role in modulating the efficiency of translation. J Mol Biol. 1986;188(4):555–564. doi: 10.1016/S0022-2836(86)80005-5.
- Siavoshi F, Malekzadeh R, Daneshmand M, Smoot DT, Ashktorab H. Association between Helicobacter pylori infection in gastric cancer, ulcers and gastritis in Iranian patients. Helicobacter. 2004;9(5):470. doi: 10.1111/j.1083-4389.2004.00256.x.
- Siepel A, Haussler D. Combining phylogenetic and hidden Markov models in biosequence analysis. J Comput Biol. 2004;11(2–3):413–428. doi: 10.1089/1066527041410472.
- Siepel A, Haussler D. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol. 2004;21(3):468–488. doi: 10.1093/molbev/msh039.
- Siepel A, Haussler D. Phylogenetic hidden Markov models. In: Nielsen R, editor. Statistical methods in molecular evolution. New York: Springer; 2005. pp. 325–351.
- Sim J, Kim SY, Lee J. PPRODO: prediction of protein domain boundaries using neural networks. Proteins. 2005;59(3):627–632. doi: 10.1002/prot.20442.
- Simpson RM, Bruno AE, Bard JE, Buck MJ, Read LK. High-throughput sequencing of partially edited trypanosome mRNAs reveals barriers to editing progression and evidence for alternative editing. RNA. 2016;22(5):677–695. doi: 10.1261/rna.055160.115.
- Sloane AJ, Duff JL, Wilson NL, Gandhi PS, Hill CJ, Hopwood FG, Smith PE, Thomas ML, Cole RA, Packer NH, et al. High throughput peptide mass fingerprinting and protein macroarray analysis using chemical printing strategies. Mol Cell Proteomics. 2002;1(7):490–499. doi: 10.1074/mcp.M200020-MCP200.
- Smircich P, Eastman G, Bispo S, Duhagon MA, Guerra-Slompo EP, Garat B, Goldenberg S, Munroe DJ, Dallagiovanna B, Holetz F, et al. Ribosome profiling reveals translation control as a key mechanism generating differential gene expression in Trypanosoma cruzi. BMC Genomics. 2015;16:443. doi: 10.1186/s12864-015-1563-8.
- Smit AF. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr Opin Genet Dev. 1999;9(6):657–663. doi: 10.1016/S0959-437X(99)00031-3.
- Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol. 1981;147(1):195–197. doi: 10.1016/0022-2836(81)90087-5.
- Smith AB, Pisani D, Mackenzie-Dodds JA, Stockley B, Webster BL, Littlewood DT. Testing the molecular clock: molecular and paleontological estimates of divergence times in the Echinoidea (Echinodermata). Mol Biol Evol. 2006;23(10):1832–1851. doi: 10.1093/molbev/msl039.
- Smyth RP, Davenport MP, Mak J. The origin of genetic diversity in HIV-1. Virus Res. 2012;169(2):415–429. doi: 10.1016/j.virusres.2012.06.015.
- Smyth RP, Schlub TE, Grimm AJ, Waugh C, Ellenberg P, Chopra A, Mallal S, Cromer D, Mak J, Davenport MP. Identifying recombination hot spots in the HIV-1 genome. J Virol. 2014;88(5):2891–2902. doi: 10.1128/JVI.03014-13.
- Sneath PHA. The construction of taxonomic groups. In: Ainsworth GC, Sneath PHA, editors. Microbial classification. Cambridge: Cambridge University Press; 1962. pp. 289–332.
- Sokal RR, Michener CD. A statistical method for evaluating systematic relationships. Univ Kans Sci Bull. 1958;28:1409–1438.
- Solnick JV, Hansen LM, Salama NR, Boonjakuakul JK, Syvanen M. Modification of Helicobacter pylori outer membrane protein expression during experimental infection of rhesus macaques. Proc Natl Acad Sci U S A. 2004;101(7):2106–2111. doi: 10.1073/pnas.0308573100.
- Sommerer N, Centeno D, Rossignol M. Peptide mass fingerprinting: identification of proteins by maldi-tof. Methods Mol Biol. 2006;355:219–234. doi: 10.1385/1-59745-227-0:219.
- Sonenberg N, Meerovitch K. Translation of poliovirus mRNA. Enzyme. 1990;44(1–4):278–291. doi: 10.1159/000468765.
- Sorensen MA, Kurland CG, Pedersen S. Codon usage determines translation rate in Escherichia coli. J Mol Biol. 1989;207:365–377. doi: 10.1016/0022-2836(89)90260-X.
- Staden R. Computer methods to locate signals in nucleic acid sequences. Nucleic Acids Res. 1984;12(1 Pt 2):505–519. doi: 10.1093/nar/12.1Part2.505.
- Stamm S, Ben-Ari S, Rafalska I, Tang Y, Zhang Z, Toiber D, Thanaraj TA, Soreq H. Function of alternative splicing. Gene. 2005;344:1–20. doi: 10.1016/j.gene.2004.10.022.
- Steinberg MH, Rodgers GP. Pathophysiology of sickle cell disease: role of cellular and genetic modifiers. Semin Hematol. 2001;38(4):299–306. doi: 10.1016/S0037-1963(01)90023-X.
- Steitz JA, Jakes K. How ribosomes select initiator regions in mRNA: base pair formation between the 3′ terminus of 16S rRNA and the mRNA during initiation of protein synthesis in Escherichia coli. Proc Natl Acad Sci U S A. 1975;72(12):4734–4738. doi: 10.1073/pnas.72.12.4734.
- Stepankiw N, Raghavan M, Fogarty EA, Grimson A, Pleiss JA. Widespread alternative and aberrant splicing revealed by lariat sequencing. Nucleic Acids Res. 2015;43(17):8488–8501. doi: 10.1093/nar/gkv763.
- Stingl K, Uhlemann EM, Deckers-Hebestreit G, Schmid R, Bakker EP, Altendorf K. Prolonged survival and cytoplasmic pH homeostasis of Helicobacter pylori at pH 1. Infect Immun. 2001;69(2):1178–1180. doi: 10.1128/IAI.69.2.1178-1181.2001.
- Stingl K, Altendorf K, Bakker EP. Acid survival of Helicobacter pylori: how does urease activity trigger cytoplasmic pH homeostasis? Trends Microbiol. 2002;10(2):70–74. doi: 10.1016/S0966-842X(01)02287-9.
- Stingl K, Uhlemann E-M, Schmid R, Altendorf K, Bakker EP. Energetics of Helicobacter pylori and its implications for the mechanism of urease-dependent acid tolerance at pH 1. J Bacteriol. 2002;184(11):3053–3060. doi: 10.1128/JB.184.11.3053-3060.2002.
- Stormo GD, Schneider TD, Gold L, Ehrenfeucht A. Use of the ‘Perceptron’ algorithm to distinguish translational initiation sites in E. coli. Nucleic Acids Res. 1982;10(9):2997–3011. doi: 10.1093/nar/10.9.2997.
- Stormo GD, Schneider TD, Gold LM. Characterization of translational initiation sites in E. coli. Nucleic Acids Res. 1982;10(9):2971–2996. doi: 10.1093/nar/10.9.2971.
- Stormo GD, Schneider TD, Gold L. Quantitative analysis of the relationship between nucleotide sequence and functional activity. Nucleic Acids Res. 1986;14(16):6661–6679. doi: 10.1093/nar/14.16.6661.
- Stoye J, Moulton V, Dress AW. DCA: an efficient implementation of the divide-and-conquer approach to simultaneous multiple sequence alignment. Comput Appl Biosci. 1997;13(6):625–626. doi: 10.1093/bioinformatics/13.6.625.
- Strebel K. APOBEC3G & HTLV-1: inhibition without deamination. Retrovirology. 2005;2(1):37. doi: 10.1186/1742-4690-2-37.
- Strigini P, Brickman E. Analysis of specific misreading in Escherichia coli. J Mol Biol. 1973;75(4):659–672. doi: 10.1016/0022-2836(73)90299-4.
- Su HL, Liao CL, Lin YL. Japanese encephalitis virus infection initiates endoplasmic reticulum stress and an unfolded protein response. J Virol. 2002;76(9):4162–4171. doi: 10.1128/JVI.76.9.4162-4171.2002.
- Sueoka N. On the evolution of informational macromolecules. New York: Academic; 1964.
- Suerbaum S, Smith JM, Bapumia K, Morelli G, Smith NH, Kunstmann E, Dyrek I, Achtman M. Free recombination within Helicobacter pylori. Proc Natl Acad Sci U S A. 1998;95(21):12619–12624. doi: 10.1073/pnas.95.21.12619.
- Suerbaum S, Josenhans C, Sterzenbach T, Drescher B, Brandt P, Bell M, Droge M, Fartmann B, Fischer HP, Ge Z, et al. The complete genome sequence of the carcinogenic bacterium Helicobacter hepaticus. Proc Natl Acad Sci U S A. 2003;100(13):7901–7906. doi: 10.1073/pnas.1332093100.
- Sun XY, Yang Q, Xia X. An improved implementation of effective Number of Codons (Nc). Mol Biol Evol. 2013;30:191–196. doi: 10.1093/molbev/mss201.
- Sund J, Ander M, Aqvist J. Principles of stop-codon reading on the ribosome. Nature. 2010;465(7300):947–950. doi: 10.1038/nature09082.
- Supek F, Smuc T. On relevance of codon usage to expression of synthetic and natural genes in Escherichia coli. Genetics. 2010;185(3):1129–1134. doi: 10.1534/genetics.110.115477.
- Sutton CW, Pemberton KS, Cottrell JS, Corbett JM, Wheeler CH, Dunn MJ, Pappin DJ. Identification of myocardial proteins from two-dimensional gels by peptide mass fingerprinting. Electrophoresis. 1995;16(3):308–316. doi: 10.1002/elps.1150160151.
- Sved J, Bird A. The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model. Proc Natl Acad Sci U S A. 1990;87:4692–4696. doi: 10.1073/pnas.87.12.4692.
- Svitkin YV, Imataka H, Khaleghpour K, Kahvejian A, Liebig HD, Sonenberg N. Poly(A)-binding protein interaction with eIF4G stimulates picornavirus IRES-dependent translation. RNA. 2001;7(12):1743–1752.
- Swofford D. Phylogenetic analysis using parsimony. Champaign: Illinois Natural History Survey; 1993.
- Tajima F. Unbiased estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol. 1993;10(3):677–688. doi: 10.1093/oxfordjournals.molbev.a040031.
- Tajima F, Nei M. Estimation of evolutionary distance between nucleotide sequences. Mol Biol Evol. 1984;1(3):269–285. doi: 10.1093/oxfordjournals.molbev.a040317.
- Takezaki N, Nei M. Inconsistency of the maximum parsimony method when the rate of nucleotide substitution is constant. J Mol Evol. 1994;39(2):210–218. doi: 10.1007/BF00163810.
- Takezaki N, Rzhetsky A, Nei M. Phylogenetic test of the molecular clock and linearized trees. Mol Biol Evol. 1995;12(5):823–833. doi: 10.1093/oxfordjournals.molbev.a040259.
- Tamai I, Sai Y, Kobayashi H, Kamata M, Wakamiya T, Tsuji A. Structure-internalization relationship for adsorptive-mediated endocytosis of basic peptides at the blood-brain barrier. J Pharmacol Exp Ther. 1997;280(1):410–415.
- Tamura K, Kumar S. Evolutionary distance estimation under heterogeneous substitution pattern among lineages. Mol Biol Evol. 2002;19(10):1727–1736. doi: 10.1093/oxfordjournals.molbev.a003995.
- Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 1993;10:512–526. doi: 10.1093/oxfordjournals.molbev.a040023.
- Tamura K, Nei M, Kumar S. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc Natl Acad Sci U S A. 2004;101(30):11030–11035. doi: 10.1073/pnas.0404206101.
- Tamura K, Dudley J, Nei M, Kumar S. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Evol. 2007;24(8):1596–1599. doi: 10.1093/molbev/msm092.
- Tanabe M, Kanehisa M. Using the KEGG database resource. Curr Protoc Bioinformatics. 2012;Chapter 1:Unit 1.12.
- Tanaka M, Ozawa T. Strand asymmetry in human mitochondrial DNA mutations. Genomics. 1994;22(2):327–335. doi: 10.1006/geno.1994.1391.
- Tang N, Tornatore P, Weinberger SR. Current developments in SELDI affinity technology. Mass Spectrom Rev. 2004;23(1):34–44. doi: 10.1002/mas.10066.
- Tang Y, Gao XD, Wang Y, Yuan BF, Feng YQ. Widespread existence of cytosine methylation in yeast DNA measured by gas chromatography/mass spectrometry. Anal Chem. 2012;84(16):7249–7255. doi: 10.1021/ac301727c.
- Taniguchi T, Weissmann C. Inhibition of Qbeta RNA 70S ribosome initiation complex formation by an oligonucleotide complementary to the 3′ terminal region of E. coli 16S ribosomal RNA. Nature. 1978;275(5682):770–772. doi: 10.1038/275770a0.
- Tao H, Bausch C, Richmond C, Blattner FR, Conway T. Functional genomics: expression analysis of Escherichia coli growing on minimal and rich media. J Bacteriol. 1999;181(20):6425–6440. doi: 10.1128/jb.181.20.6425-6440.1999.
- Taramelli R, Kioussis D, Vanin E, Bartram K, Groffen J, Hurst J, Grosveld FG. Gamma delta beta-thalassaemias 1 and 2 are the result of a 100 kbp deletion in the human beta-globin cluster. Nucleic Acids Res. 1986;14(17):7017–7029. doi: 10.1093/nar/14.17.7017.
- Tate WP, Brown CM. Translational termination: “stop” for protein synthesis or “pause” for regulation of gene expression. Biochemistry. 1992;31(9):2443–2450. doi: 10.1021/bi00124a001.
- Tate WP, Mannering SA. Three, four or more: the translational stop signal at length. Mol Microbiol. 1996;21(2):213–219. doi: 10.1046/j.1365-2958.1996.6391352.x.
- Tate WP, Mansell JB, Mannering SA, Irvine JH, Major LL, Wilson DN. UGA: a dual signal for ‘stop’ and for recoding in protein synthesis. Biochemistry (Mosc). 1999;64(12):1342–1353.
- Tavaré S. Some probabilistic and statistical problems in the analysis of DNA sequences. In: Miura RM, editor. Some mathematical questions in biology – DNA sequence analysis. Providence: American Mathematical Society; 1986. pp. 57–86.
- Team GE. Closure of the NCBI SRA and implications for the long-term future of genomics data storage. Genome Biol. 2011;12(3):402. doi: 10.1186/gb-2011-12-3-402.
- Tech M, Merkl R. YACOP: enhanced gene prediction obtained by a combination of existing methods. In Silico Biol. 2003;3(4):441–451.
- Terasaki T, Deguchi Y, Sato H, K-i H, Tsuji A. In vivo transport of a Dynorphin-like analgesic peptide, E-2078, through the blood–brain barrier: an application of brain microdialysis. Pharm Res. 1991;8(7):815. doi: 10.1023/A:1015882924470.
- Terenin IM, Dmitriev SE, Andreev DE, Royall E, Belsham GJ, Roberts LO, Shatsky IN. A cross-kingdom internal ribosome entry site reveals a simplified mode of internal ribosome entry. Mol Cell Biol. 2005;25(17):7879–7888. doi: 10.1128/MCB.25.17.7879-7888.2005.
- Thijs G, Lescot M, Marchal K, Rombauts S, De Moor B, Rouze P, Moreau Y. A higher-order background model improves the detection of promoter regulatory elements by Gibbs sampling. Bioinformatics. 2001;17(12):1113–1122. doi: 10.1093/bioinformatics/17.12.1113.
- Thijs G, Marchal K, Lescot M, Rombauts S, De Moor B, Rouze P, Moreau Y. A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes. J Comput Biol. 2002;9(2):447–464. doi: 10.1089/10665270252935566.
- Thijs G, Moreau Y, De Smet F, Mathys J, Lescot M, Rombauts S, Rouze P, De Moor B, Marchal K. INCLUSive: integrated clustering, upstream sequence retrieval and motif sampling. Bioinformatics. 2002;18(2):331–332. doi: 10.1093/bioinformatics/18.2.331.
- Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
- Thompson W, Rouchka EC, Lawrence CE. Gibbs recursive sampler: finding transcription factor binding sites. Nucleic Acids Res. 2003;31(13):3580–3585. doi: 10.1093/nar/gkg608.
- Thompson W, Palumbo MJ, Wasserman WW, Liu JS, Lawrence CE. Decoding human regulatory circuits. Genome Res. 2004;14(10A):1967–1974. doi: 10.1101/gr.2589004.
- Thorne JL, Kishino H. Freeing phylogenies from artifacts of alignment. Mol Biol Evol. 1992;9(6):1148–1162. doi: 10.1093/oxfordjournals.molbev.a040783.
- Thorne JL, Kishino H. Estimation of divergence times from molecular sequence data. In: Nielsen R, editor. Statistical methods in molecular evolution. New York: Springer; 2005. pp. 233–256.
- Tinn O, Oakley TH. Erratic rates of molecular evolution and incongruence of fossil and molecular divergence time estimates in Ostracoda (Crustacea). Mol Phylogenet Evol. 2008;48(1):157–167. doi: 10.1016/j.ympev.2008.03.001.
- Tjaden B. De novo assembly of bacterial transcriptomes from RNA-seq data. Genome Biol. 2015;16:1. doi: 10.1186/s13059-014-0572-2.
- Tolhuis B, Palstra RJ, Splinter E, Grosveld F, de Laat W. Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol Cell. 2002;10(6):1453–1465. doi: 10.1016/S1097-2765(02)00781-5.
- Tomatsu S, Orii KO, Bi Y, Gutierrez MA, Nishioka T, Yamaguchi S, Kondo N, Orii T, Noguchi A, Sly WS. General implications for CpG hot spot mutations: methylation patterns of the human iduronate-2-sulfatase gene locus. Hum Mutat. 2004;23(6):590–598. doi: 10.1002/humu.20046.
- Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 1997;388(6642):539–547. doi: 10.1038/41483.
- Toronen P, Kolehmainen M, Wong G, Castren E. Analysis of gene expression data using self-organizing maps. FEBS Lett. 1999;451(2):142–146. doi: 10.1016/S0014-5793(99)00524-4.
- Trapnell C. Defining cell types and states with single-cell genomics. Genome Res. 2015;25(10):1491–1498. doi: 10.1101/gr.190595.115.
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–1111. doi: 10.1093/bioinformatics/btp120.
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–515. doi: 10.1038/nbt.1621.
- Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–578. doi: 10.1038/nprot.2012.016.
- Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 2013;31(1):46–53. doi: 10.1038/nbt.2450.
- Trudel MV, Vincent AT, Attere SA, Labbe M, Derome N, Culley AI, Charette SJ. Diversity of antibiotic-resistance genes in Canadian isolates of Aeromonas salmonicida subsp. salmonicida: dominance of pSN254b and discovery of pAsa8. Sci Rep. 2016;6:35617. doi: 10.1038/srep35617.
- Trutschl M, Dinkova TD, Rhoads RE. Application of machine learning and visualization of heterogeneous datasets to uncover relationships between translation and developmental stage expression of C. elegans mRNAs. Physiol Genomics. 2005;21(2):264–273. doi: 10.1152/physiolgenomics.00307.2004.
- Tuller T, Waldman YY, Kupiec M, Ruppin E. Translation efficiency is determined by both codon bias and folding energy. Proc Natl Acad Sci U S A. 2010;107(8):3645–3650. doi: 10.1073/pnas.0909910107.
- Valenzuela M, Cerda O, Toledo H. Overview on chemotaxis and acid resistance in Helicobacter pylori. Biol Res. 2003;36(3–4):429–436. doi: 10.4067/s0716-97602003000300014.
- Van de Peer Y, Neefs JM, De Rijk P, De Wachter R. Reconstructing evolution from eukaryotic small-ribosomal-subunit RNA sequences: calibration of the molecular clock. J Mol Evol. 1993;37(2):221–232. doi: 10.1007/BF02407359.
- Van Dooren S, Pybus OG, Salemi M, Liu HF, Goubau P, Remondegui C, Talarmin A, Gotuzzo E, Alcantara LC, Galvao-Castro B, et al. The low evolutionary rate of human T-cell lymphotropic virus type-1 confirmed by analysis of vertical transmission chains. Mol Biol Evol. 2004;21(3):603–611. doi: 10.1093/molbev/msh053.
- Van Esch H, Devriendt K. Transcription factor GATA3 and the human HDR syndrome. Cell Mol Life Sci. 2001;58(9):1296–1300. doi: 10.1007/PL00000940.
- van Hemert FJ, Berkhout B. The tendency of lentiviral open reading frames to become A-rich: constraints imposed by viral genome organization and cellular tRNA availability. J Mol Evol. 1995;41(2):132–140. doi: 10.1007/BF00170664.
- van Weringh A, Ragonnet-Cronin M, Pranckeviciene E, Pavon-Eternod M, Kleiman L, Xia X. HIV-1 modulates the tRNA pool to improve translation efficiency. Mol Biol Evol. 2011;28(6):1827–1834. doi: 10.1093/molbev/msr005.
- Vartanian J-P, Henry M, Wain-Hobson S. Sustained G→A hypermutation during reverse transcription of an entire human immunodeficiency virus type 1 strain Vau group O genome. J Gen Virol. 2002;83(4):801–805. doi: 10.1099/0022-1317-83-4-801.
- Vasilescu J, Figeys D. Mapping protein-protein interactions by mass spectrometry. Curr Opin Biotechnol. 2006;17(4):394–399. doi: 10.1016/j.copbio.2006.06.008.
- Vazquez-Pianzola P, Hernandez G, Suter B, Rivera-Pomar R. Different modes of translation for hid, grim and sickle mRNAs in Drosophila. Cell Death Differ. 2007;14(2):286–295. doi: 10.1038/sj.cdd.4401990.
- Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270(5235):484–487. doi: 10.1126/science.270.5235.484.
- Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA, Bassett DE, Jr, Hieter P, Vogelstein B, Kinzler KW. Characterization of the yeast transcriptome. Cell. 1997;88(2):243–251. doi: 10.1016/S0092-8674(00)81845-0.
- Velculescu VE, Madden SL, Zhang L, Lash AE, Yu J, Rago C, Lal A, Wang CJ, Beaudry GA, Ciriello KM, et al. Analysis of human transcriptomes. Nat Genet. 1999;23(4):387–388. doi: 10.1038/70487.
- Velculescu VE, Vogelstein B, Kinzler KW. Analysing uncharted transcriptomes with SAGE. Trends Genet. 2000;16(10):423–425. doi: 10.1016/S0168-9525(00)02114-4.
- Vellanoweth RL, Rabinowitz JC. The influence of ribosome-binding-site elements on translational efficiency in Bacillus subtilis and Escherichia coli in vivo. Mol Microbiol. 1992;6(9):1105–1114. doi: 10.1111/j.1365-2958.1992.tb01548.x.
- Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science. 2001;291(5507):1304–1351. doi: 10.1126/science.1058040.
- Vert JP. Support vector machine prediction of signal peptide cleavage site using a new class of kernels for strings. Pac Symp Biocomput. 2002;7:649–660. doi: 10.1142/9789812799623_0060.
- Vestergaard B, Van LB, Andersen GR, Nyborg J, Buckingham RH, Kjeldgaard M. Bacterial polypeptide release factor RF2 is structurally distinct from eukaryotic eRF1. Mol Cell. 2001;8(6):1375–1382. doi: 10.1016/S1097-2765(01)00415-4.
- Vestergaard B, Sanyal S, Roessle M, Mora L, Buckingham RH, Kastrup JS, Gajhede M, Svergun DI, Ehrenberg M. The SAXS solution structure of RF1 differs from its crystal structure and is similar to its ribosome bound cryo-EM structure. Mol Cell. 2005;20(6):929–938. doi: 10.1016/j.molcel.2005.11.022.
- Viterbi AJ. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans Inf Theory. 1967;13(2):260–269. doi: 10.1109/TIT.1967.1054010.
- Vlasschaert C, Xia X, Coulombe J, Gray DA. Evolution of the highly networked deubiquitinating enzymes USP4, USP15, and USP11. BMC Evol Biol. 2015;15:230. doi: 10.1186/s12862-015-0511-1.
- Vlasschaert C, Xia X, Gray DA. Selection preserves Ubiquitin Specific Protease 4 alternative exon skipping in therian mammals. Sci Rep. 2016;6:20039. doi: 10.1038/srep20039.
- Vlasschaert C, Cook D, Xia X, Gray DA. The evolution and functional diversification of the deubiquitinating enzyme superfamily. Genome Biol Evol. 2017;9(3):558–573. doi: 10.1093/gbe/evx020.
- Voelter-Mahlknecht S. Epigenetic associations in relation to cardiovascular prevention and therapeutics. Clin Epigenetics. 2016;8:4. doi: 10.1186/s13148-016-0170-0.
- Waddell PJ, Steel MA. General time-reversible distances with unequal rates across sites: mixing gamma and inverse Gaussian distributions with invariant sites. Mol Phylogenet Evol. 1997;8(3):398–414. doi: 10.1006/mpev.1997.0452. [DOI] [PubMed] [Google Scholar]
- Waddell PJ, Steel MA. General time-reversible distances with unequal rates across sites: mixing lambda and inverse Gaussian distributions with invariant sites. Mol Phylogenet Evol. 1997;8(3):398–414. doi: 10.1006/mpev.1997.0452. [DOI] [PubMed] [Google Scholar]
- Wade PA, Wolffe AP. ReCoGnizing methylated DNA. Nat Struct Biol. 2001;8(7):575–577. doi: 10.1038/89593. [DOI] [PubMed] [Google Scholar]
- Walsh D, Arias C, Perez C, Halladin D, Escandon M, Ueda T, Watanabe-Fukunaga R, Fukunaga R, Mohr I. Eukaryotic translation initiation factor 4F architectural alterations accompany translation initiation factor redistribution in poxvirus-infected cells. Mol Cell Biol. 2008;28(8):2648–2658. doi: 10.1128/MCB.01631-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang HC, Hickey DA. Evidence for strong selective constraint acting on the nucleotide composition of 16S ribosomal RNA genes. Nucleic Acids Res. 2002;30(11):2501–2507. doi: 10.1093/nar/30.11.2501. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang G, Humayun MZ, Taylor DE. Mutation as an origin of genetic variability in Helicobacter pylori. Trends Microbiol. 1999;7(12):488–493. doi: 10.1016/S0966-842X(99)01632-7. [DOI] [PubMed] [Google Scholar]
- Wang J, Delabie J, Aasheim H, Smeland E, Myklebost O. Clustering of the SOM easily reveals distinct gene expression patterns: results of a reanalysis of lymphoma study. BMC Bioinform. 2002;3:36. doi: 10.1186/1471-2105-3-36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang HC, Xia X, Hickey DA. Thermal adaptation of ribosomal RNA genes: a comparative study. J Mol Evol. 2006;63(1):120–126. doi: 10.1007/s00239-005-0255-4. [DOI] [PubMed] [Google Scholar]
- Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57–63. doi: 10.1038/nrg2484. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang X, Kim Y, Ma Q, Hong SH, Pokusaeva K, Sturino JM, Wood TK. Cryptic prophages help bacteria cope with adverse environments. Nat Commun. 2010;1:147. doi: 10.1038/ncomms1146. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang M, Weiss M, Simonovic M, Haertinger G, Schrimpf SP, Hengartner MO, von Mering C. PaxDb, a database of protein abundance averages across all three domains of life. Mol Cell Proteomics. 2012;11(8):492–500. doi: 10.1074/mcp.O111.014704. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Washburn MP, Wolters D, Yates JR., 3rd Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol. 2001;19(3):242–247. doi: 10.1038/85686. [DOI] [PubMed] [Google Scholar]
- Waterfield MD, Scrace GT, Whittle N, Stroobant P, Johnsson A, Wasteson A, Westermark B, Heldin CH, Huang JS, Deuel TF. Platelet-derived growth factor is structurally related to the putative transforming protein p28sis of simian sarcoma virus. Nature. 1983;304(5921):35–39. doi: 10.1038/304035a0. [DOI] [PubMed] [Google Scholar]
- Waterman MS, Vingron M. Rapid and accurate estimates of statistical significance for sequence data base searches. Proc Natl Acad Sci U S A. 1994;91(11):4625–4628. doi: 10.1073/pnas.91.11.4625. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Webster J, Oxley D. Peptide mass fingerprinting: protein identification using MALDI-TOF mass spectrometry. Methods Mol Biol. 2005;310:227–240. doi: 10.1007/978-1-59259-948-6_16. [DOI] [PubMed] [Google Scholar]
- Weeks DL, Eskandari S, Scott DR, Sachs G. A H+−gated urea channel: the link between Helicobacter pylori urease and gastric colonization. Science. 2000;287(5452):482–485. doi: 10.1126/science.287.5452.482. [DOI] [PubMed] [Google Scholar]
- Wei Y, Xia X. The role of +4U as an extended translation termination signal in bacteria. Genetics. 2017;205(2):539–549. doi: 10.1534/genetics.116.193961. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wei Y, Wang J, Xia X. Coevolution between stop codon usage and release factors in bacterial species. Mol Biol Evol. 2016;33(9):2357–2367. doi: 10.1093/molbev/msw107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wei Y, Silke JR, Xia X (2017) Elucidating the 16S rRNA 3′ boundaries and defining optimal SD/aSD pairing in Escherichia coli and Bacillus subtilis using RNA-Seq data. Sci Rep. 10.1038/s41598-017-17918-6 [DOI] [PMC free article] [PubMed]
- Weigert MG, Garen A. Base composition of nonsense codons in E. coli. evidence from amino-acid substitutions at a tryptophan site in alkaline phosphatase. Nature. 1965;206(988):992–994. doi: 10.1038/206992a0. [DOI] [PubMed] [Google Scholar]
- Weiner AM, Weber K. A single UGA codon functions as a natural termination signal in the coliphage q beta coat protein cistron. J Mol Biol. 1973;80(4):837–855. doi: 10.1016/0022-2836(73)90213-1. [DOI] [PubMed] [Google Scholar]
- Weir BS. Genetic data analysis. Sunderland: Sinauer Associates; 1990. [Google Scholar]
- Weiss RB, Dunn DM, Dahlberg AE, Atkins JF, Gesteland RF. Reading frame switch caused by base-pair formation between the 3′ end of 16S rRNA and the mRNA during elongation of protein synthesis in Escherichia coli. EMBO J. 1988;7(5):1503–1507. doi: 10.1002/j.1460-2075.1988.tb02969.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wen Y, Marcus EA, Matrubutham U, Gleeson MA, Scott DR, Sachs G. Acid-adaptive genes of Helicobacter pylori. Infect Immun. 2003;71(10):5921–5939. doi: 10.1128/IAI.71.10.5921-5939.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wenthzel AM, Stancek M, Isaksson LA. Growth phase dependent stop codon readthrough and shift of translation reading frame in Escherichia coli. FEBS Lett. 1998;421(3):237–242. doi: 10.1016/S0014-5793(97)01570-6. [DOI] [PubMed] [Google Scholar]
- Wilks SS. The large-sample distribution of the likelihood ratio for testing composite hypotheses. Annals Math Stat. 1938;9:60–62. doi: 10.1214/aoms/1177732360. [DOI] [Google Scholar]
- Williams CL, Preston T, Hossack M, Slater C, McColl KE. Helicobacter pylori utilises urea for amino acid synthesis. FEMS Immunol Med Microbiol. 1996;13(1):87–94. doi: 10.1111/j.1574-695X.1996.tb00220.x. [DOI] [PubMed] [Google Scholar]
- Williams KP, Sobral BW, Dickerman AW. A robust species tree for the alphaproteobacteria. J Bacteriol. 2007;189(13):4578–4586. doi: 10.1128/JB.00269-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wilson DS, Nock S. Functional protein microarrays. Curr Opin Chem Biol. 2002;6(1):81–85. doi: 10.1016/S1367-5931(01)00281-2. [DOI] [PubMed] [Google Scholar]
- Wilson KS, von Hippel PH. Transcription termination at intrinsic terminators: the role of the RNA hairpin. Proc Natl Acad Sci U S A. 1995;92(19):8793–8797. doi: 10.1073/pnas.92.19.8793. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Winston F, Botstein D, Miller JH. Characterization of amber and ochre suppressors in Salmonella typhimurium. J Bacteriol. 1979;137(1):433–439. doi: 10.1128/jb.137.1.433-439.1979. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wolfe KH, Li WH, Sharp PM. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast and nuclear DNAs. Proc Natl Acad Sci U S A. 1987;84:9054–9058. doi: 10.1073/pnas.84.24.9054. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wong KM, Suchard MA, Huelsenbeck JP. Alignment uncertainty and genomic analysis. Science. 2008;319(5862):473–476. doi: 10.1126/science.1151532. [DOI] [PubMed] [Google Scholar]
- Wright F. The ‘effective number of codons’ used in a gene. Gene. 1990;87(1):23–29. doi: 10.1016/0378-1119(90)90491-9. [DOI] [PubMed] [Google Scholar]
- Wright GL., Jr SELDI proteinchip MS: a platform for biomarker discovery and cancer diagnosis. Expert Rev Mol Diagn. 2002;2(6):549–563. doi: 10.1586/14737159.2.6.549. [DOI] [PubMed] [Google Scholar]
- Wu J, Bag J. Negative control of the poly(A)-binding protein mRNA translation is mediated by the adenine-rich region of its 5′-untranslated region. J Biol Chem. 1998;273(51):34535–34542. doi: 10.1074/jbc.273.51.34535. [DOI] [PubMed] [Google Scholar]
- Wu CI, Li WH. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci U S A. 1985;82(6):1741–1745. doi: 10.1073/pnas.82.6.1741. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu J, Tzanakakis ES. Deconstructing stem cell population heterogeneity: single-cell analysis and modeling approaches. Biotechnol Adv. 2013;31(7):1047–1062. doi: 10.1016/j.biotechadv.2013.09.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. Maximizing transcription efficiency causes codon usage bias. Genetics. 1996;144:1309–1320. doi: 10.1093/genetics/144.3.1309. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. How optimized is the translational machinery in Escherichia coli, Salmonella typhimurium and Saccharomyces cerevisiae? Genetics. 1998;149(1):37–44. doi: 10.1093/genetics/149.1.37. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. The rate heterogeneity of nonsynonymous substitutions in mammalian mitochondrial genes. Mol Biol Evol. 1998;15:336–344. doi: 10.1093/oxfordjournals.molbev.a025930. [DOI] [PubMed] [Google Scholar]
- Xia X. Phylogenetic relationship among horseshoe crab species: the effect of substitution models on phylogenetic analyses. Syst Biol. 2000;49:87–100. doi: 10.1080/10635150050207401. [DOI] [PubMed] [Google Scholar]
- Xia X. Data analysis in molecular biology and evolution. Boston: Kluwer Academic Publishers; 2001. [Google Scholar]
- Xia X. DNA methylation and mycoplasma genomes. J Mol Evol. 2003;57:S21–S28. doi: 10.1007/s00239-003-0003-6. [DOI] [PubMed] [Google Scholar]
- Xia X. Mutation and selection on the anticodon of tRNA genes in vertebrate mitochondrial genomes. Gene. 2005;345(1):13–20. doi: 10.1016/j.gene.2004.11.019. [DOI] [PubMed] [Google Scholar]
- Xia X. Topological bias in distance-based phylogenetic methods: problems with over- and underestimated genetic distances. Evol Bioinforma. 2006;2:375–387. doi: 10.1177/117693430600200034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. The +4G site in Kozak consensus is not related to the efficiency of translation initiation. PLoS One. 2007;2:e188. doi: 10.1371/journal.pone.0000188. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. Bioinformatics and the cell: modern computational approaches in genomics, proteomics and transcriptomics. New York: Springer US; 2007. [Google Scholar]
- Xia X. An improved implementation of codon adaptation index. Evol Bioinforma. 2007;3:53–58. doi: 10.1177/117693430700300028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. The cost of wobble translation in fungal mitochondrial genomes: integration of two traditional hypotheses. BMC Evol Biol. 2008;8:211. doi: 10.1186/1471-2148-8-211. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. Information-theoretic indices and an approximate significance test for testing the molecular clock hypothesis with genetic distances. Mol Phylogenet Evol. 2009;52:665–676. doi: 10.1016/j.ympev.2009.04.017. [DOI] [PubMed] [Google Scholar]
- Xia X. DNA replication and strand asymmetry in prokaryotic and mitochondrial genomes. Curr Genomics. 2012;13(1):16–27. doi: 10.2174/138920212799034776. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X (2012b). Position Weight Matrix, Gibbs Sampler, and the associated significance tests in motif characterization and prediction. Scientifica 2012: Article ID 917540, 15 pp [DOI] [PMC free article] [PubMed]
- Xia X. Rapid evolution of animal mitochondria. In: Singh RS, Xu J, Kulathinal RJ, editors. Evolution in the fast lane: rapidly evolving genes and genetic systems. Oxford: Oxford University Press; 2012. pp. 73–82. [Google Scholar]
- Xia X. DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol. 2013;30:1720–1728. doi: 10.1093/molbev/mst064. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. Phylogenetic bias in the likelihood method caused by missing data coupled with among-site rate variation: an analytical approach. In: Basu M, Pan Y, Wang J, editors. Bioinformatics research and applications. New York: Springer; 2014. pp. 12–23. [Google Scholar]
- Xia X. A major controversy in codon-anticodon adaptation resolved by a new codon usage index. Genetics. 2015;199:573–579. doi: 10.1534/genetics.114.172106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. PhyPA: phylogenetic method with pairwise sequence alignment outperforms likelihood methods in phylogenetics involving highly diverged sequences. Mol Phylogenet Evol. 2016;102:331–343. doi: 10.1016/j.ympev.2016.07.001. [DOI] [PubMed] [Google Scholar]
- Xia Xuhua. ARSDA: A New Approach for Storing, Transmitting and Analyzing Transcriptomic Data. G3: Genes|Genomes|Genetics. 2017;7(12):3839–3848. doi: 10.1534/g3.117.300271. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. Bioinformatics and drug discovery. Curr Top Med Chem. 2017;17(15):1709–1726. doi: 10.2174/1568026617666161116143440. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. DAMBE6: new tools for microbial genomics, phylogenetics and molecular evolution. J Hered. 2017;108(4):431–437. doi: 10.1093/jhered/esx033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X. Self-organizing map for characterizing heterogeneous nucleotide and amino acid sequence motifs. Computation. 2017;5(4):43. doi: 10.3390/computation5040043. [DOI] [Google Scholar]
- Xia X, Holcik M. Strong eukaryotic IRESs have weak secondary structure. PLoS One. 2009;4(1):e4136. doi: 10.1371/journal.pone.0004136. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X, Kumar S. Codon-based detection of positive selection can be biased by heterogeneous distribution of polar amino acids along protein sequences. In: Markstein P, Xu Y, editors. Computational systems bioinformatics: proceedings of the conference CSB 2006. London: Imperial College Press; 2006. pp. 335–340. [PMC free article] [PubMed] [Google Scholar]
- Xia X, Lemey P. Assessing substitution saturation with DAMBE. In: Lemey P, Salemi M, Vandamme AM, editors. The phylogenetic handbook. 2. Cambridge: Cambridge University Press; 2009. pp. 615–630. [Google Scholar]
- Xia X, Li WH. What amino acid properties affect protein evolution? J Mol Evol. 1998;47(5):557–564. doi: 10.1007/PL00006412. [DOI] [PubMed] [Google Scholar]
- Xia X, Palidwor G. Genomic adaptation to acidic environment: evidence from Helicobacter pylori. Am Nat. 2005;166(6):776–784. doi: 10.1086/497400. [DOI] [PubMed] [Google Scholar]
- Xia X, Xie Z. AMADA: analysis of microarray data. Bioinformatics. 2001;17:569–570. doi: 10.1093/bioinformatics/17.6.569. [DOI] [PubMed] [Google Scholar]
- Xia X, Xie Z. DAMBE: software package for data analysis in molecular biology and evolution. J Hered. 2001;92(4):371–373. doi: 10.1093/jhered/92.4.371. [DOI] [PubMed] [Google Scholar]
- Xia X, Xie Z. Protein structure, neighbor effect, and a new index of amino acid dissimilarities. Mol Biol Evol. 2002;19(1):58–67. doi: 10.1093/oxfordjournals.molbev.a003982. [DOI] [PubMed] [Google Scholar]
- Xia X, Yang Q. A distance-based least-square method for dating speciation events. Mol Phylogenet Evol. 2011;59(2):342–353. doi: 10.1016/j.ympev.2011.01.017. [DOI] [PubMed] [Google Scholar]
- Xia X, Yuen KY. Differential selection and mutation between dsDNA and ssDNA phages shape the evolution of their genomic AT percentage. BMC Genet. 2005;6(1):20. doi: 10.1186/1471-2156-6-20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X, Hafner MS, Sudman PD. On transition bias in mitochondrial genes of pocket gophers. J Mol Evol. 1996;43:32–40. doi: 10.1007/BF02352297. [DOI] [PubMed] [Google Scholar]
- Xia XH, Wei T, Xie Z, Danchin A. Genomic changes in nucleotide and dinucleotide frequencies in Pasteurella multocida cultured under high temperature. Genetics. 2002;161(4):1385–1394. doi: 10.1093/genetics/161.4.1385. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X, Xie Z, Kjer KM. 18S ribosomal RNA and tetrapod phylogeny. Syst Biol. 2003;52(3):283–295. doi: 10.1080/10635150390196948. [DOI] [PubMed] [Google Scholar]
- Xia X, Xie Z, Salemi M, Chen L, Wang Y. An index of substitution saturation and its application. Mol Phylogenet Evol. 2003;26(1):1–7. doi: 10.1016/S1055-7903(02)00326-3. [DOI] [PubMed] [Google Scholar]
- Xia X, Wang H, Xie Z, Carullo M, Huang H, Hickey D. Cytosine usage modulates the correlation between CDS length and CG content in prokaryotic genomes. Mol Biol Evol. 2006;23(7):1450–1454. doi: 10.1093/molbev/msl012. [DOI] [PubMed] [Google Scholar]
- Xia X, Huang H, Carullo M, Betran E, Moriyama EN. Conflict between translation initiation and elongation in vertebrate mitochondrial genomes. PLoS One. 2007;2:e227. doi: 10.1371/journal.pone.0000227. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xia X, MacKay V, Yao X, Wu J, Miura F, Ito T, Morris DR. Translation initiation: a regulatory role for poly(A) tracts in front of the AUG codon in saccharomyces cerevisiae. Genetics. 2011;189(2):469–478. doi: 10.1534/genetics.111.132068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xiao L, Wang K, Teng Y, Zhang J. Component plane presentation integrated self-organizing map for microarray data analysis. FEBS Lett. 2003;538(1–3):117–124. doi: 10.1016/S0014-5793(03)00156-X. [DOI] [PubMed] [Google Scholar]
- Xu Z, Hao B. CVTree update: a newly designed phylogenetic study platform using composition vectors and whole genomes. Nucleic Acids Res. 2009;37(Web Server):W174–W178. doi: 10.1093/nar/gkp278. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yamaoka Y, Kita M, Kodama T, Imamura S, Ohno T, Sawai N, Ishimaru A, Imanishi J, Graham DY. Helicobacter pylori infection in mice: role of outer membrane proteins in colonization and inflammation. Gastroenterology. 2002;123(6):1992–2004. doi: 10.1053/gast.2002.37074. [DOI] [PubMed] [Google Scholar]
- Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995;139:993–1005. doi: 10.1093/genetics/139.2.993. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yang Z. Computational molecular evolution. Oxford: Oxford University Press; 2006. [Google Scholar]
- Yang Z, Yoder AD. Comparison of likelihood and Bayesian methods for estimating divergence times using multiple gene Loci and calibration points, with application to a radiation of cute-looking mouse lemur species. Syst Biol. 2003;52(5):705–716. doi: 10.1080/10635150390235557. [DOI] [PubMed] [Google Scholar]
- Yang Z, O’Brien JD, Zheng X, Zhu HQ, She ZS. Tree and rate estimation by local evaluation of heterochronous nucleotide data. Bioinformatics. 2007;23(2):169–176. doi: 10.1093/bioinformatics/btl577. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yang Z, Bruno DP, Martens CA, Porcella SF, Moss B. Simultaneous high-resolution analysis of vaccinia virus and host cell transcriptomes by deep RNA sequencing. Proc Natl Acad Sci U S A. 2010;107(25):11513–11518. doi: 10.1073/pnas.1006594107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yates JR. Mass spectral analysis in proteomics. Annu Rev Biophys Biomol Struct. 2004;33:297–316. doi: 10.1146/annurev.biophys.33.111502.082538. [DOI] [PubMed] [Google Scholar]
- Yates JR. Mass spectrometry as an emerging tool for systems biology. BioTechniques. 2004;36(6):917–919. doi: 10.2144/04366TE01. [DOI] [PubMed] [Google Scholar]
- Yip TT, Lomas L. SELDI ProteinChip array in oncoproteomic research. Technol Cancer Res Treat. 2002;1(4):273–280. doi: 10.1177/153303460200100408. [DOI] [PubMed] [Google Scholar]
- Yoder AD, Yang Z. Estimation of primate speciation dates using local molecular clocks. Mol Biol Evol. 2000;17(7):1081–1090. doi: 10.1093/oxfordjournals.molbev.a026389. [DOI] [PubMed] [Google Scholar]
- Yoon JH, De S, Srikantan S, Abdelmohsen K, Grammatikakis I, Kim J, Kim KM, Noh JH, White EJ, Martindale JL, et al. PAR-CLIP analysis uncovers AUF1 impact on target RNA fate and genome integrity. Nat Commun. 2014;5:5248. doi: 10.1038/ncomms6248. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yoshinaka Y, Katoh I, Copeland TD, Oroszlan S. Murine leukemia virus protease is encoded by the gag-pol gene and is synthesized through suppression of an amber termination codon. Proc Natl Acad Sci U S A. 1985;82(6):1618–1622. doi: 10.1073/pnas.82.6.1618. [DOI] [PMC free article] [PubMed] [Google Scholar]
- You J, Cohen RE, Pickart CM. Construct for high-level expression and low misincorporation of lysine for arginine during expression of pET-encoded eukaryotic proteins in Escherichia coli. BioTechniques. 1999;27(5):950–954. doi: 10.2144/99275st01. [DOI] [PubMed] [Google Scholar]
- Young JA, Johnson JR, Benner C, Yan SF, Chen K, Le Roch KG, Zhou Y, Winzeler EA. In silico discovery of transcription regulatory elements in Plasmodium falciparum. BMC Genomics. 2008;9:70. doi: 10.1186/1471-2164-9-70. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yu KM, Liu J, Moy R, Lin HC, Nicholas HB, Jr, Rosenquist GL. Prediction of tyrosine sulfation in seven-transmembrane peptide receptors. Endocrine. 2002;19(3):333–338. doi: 10.1385/ENDO:19:3:333. [DOI] [PubMed] [Google Scholar]
- Yu Q, Chen D, König R, Mariani R, Unutmaz D, Landau NR. APOBEC3B and APOBEC3C are potent inhibitors of simian immunodeficiency virus replication. J Biol Chem. 2004;279(51):53379–53386. doi: 10.1074/jbc.M408802200. [DOI] [PubMed] [Google Scholar]
- Yu Y, Sweeney TR, Kafasla P, Jackson RJ, Pestova TV, Hellen CU. The mechanism of translation initiation on Aichivirus RNA mediated by a novel type of picornavirus IRES. EMBO J. 2011;30(21):4423–4436. doi: 10.1038/emboj.2011.306. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yuan ZC, Zaheer R, Morton R, Finan TM. Genome prediction of PhoB regulated promoters in Sinorhizobium meliloti and twelve proteobacteria. Nucleic Acids Res. 2006;34(9):2686–2697. doi: 10.1093/nar/gkl365. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Yuan J, Sheppard K, Soll D. Amino acid modifications on tRNA. Acta Biochim Biophys Sin Shanghai. 2008;40(7):539–553. doi: 10.1111/j.1745-7270.2008.00435.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhang S, Ryden-Aulin M, Isaksson LA. Functional interaction between release factor one and P-site peptidyl-tRNA on the ribosome. J Mol Biol. 1996;261(2):98–107. doi: 10.1006/jmbi.1996.0444. [DOI] [PubMed] [Google Scholar]
- Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B, Kinzler KW. Gene expression profiles in normal and cancer cells. Science. 1997;276(5316):1268–1272. doi: 10.1126/science.276.5316.1268. [DOI] [PubMed] [Google Scholar]
- Zhang HM, Ye X, Su Y, Yuan J, Liu Z, Stein DA, Yang D. Coxsackievirus B3 infection activates the unfolded protein response and induces apoptosis through downregulation of p58IPK and activation of CHOP and SREBP1. J Virol. 2010;84(17):8446–8459. doi: 10.1128/JVI.01416-09. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zharkikh A. Estimation of evolutionary distances between nucleotide sequences. J Mol Evol. 1994;39:315–329. doi: 10.1007/BF00160155. [DOI] [PubMed] [Google Scholar]
- Zheng CL, Fu XD, Gribskov M. Characteristics and regulatory elements defining constitutive splicing and different modes of alternative splicing in human and mouse. RNA. 2005;11(12):1777–1787. doi: 10.1261/rna.2660805. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhou J, Korostelev A, Lancaster L, Noller HF. Crystal structures of 70S ribosomes bound to release factors RF1, RF2 and RF3. Curr Opin Struct Biol. 2012;22(6):733–742. doi: 10.1016/j.sbi.2012.08.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhu C, Byrd RH, Lu P, Nocedal J. Algorithm 778: L-BFGS-B: fortran subroutines for large-scale bound-constrained optimization. ACM Trans Math Softw. 1997;23(4):550–560. doi: 10.1145/279232.279236. [DOI] [Google Scholar]
- Zhu J, Liu JS, Lawrence CE. Bayesian adaptive sequence alignment algorithms. Bioinformatics. 1998;14(1):25–39. doi: 10.1093/bioinformatics/14.1.25. [DOI] [PubMed] [Google Scholar]
- Zhu Z, Li L, Zhang Y, Yang Y, Yang X. CompMap: a reference-based compression program to speed up read mapping to related reference sequences. Bioinformatics. 2015;31(3):426–428. doi: 10.1093/bioinformatics/btu656. [DOI] [PubMed] [Google Scholar]
- Zhu Z, Zhang Y, Ji Z, He S, Yang X. High-throughput DNA sequence data compression. Brief Bioinform. 2015;16(1):1–15. doi: 10.1093/bib/bbt087. [DOI] [PubMed] [Google Scholar]
- Zid BM, Rogers AN, Katewa SD, Vargas MA, Kolipinski MC, Lu TA, Benzer S, Kapahi P. 4E-BP extends lifespan upon dietary restriction by enhancing mitochondrial activity in Drosophila. Cell. 2009;139(1):149–160. doi: 10.1016/j.cell.2009.07.034. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zien A, Ratsch G, Mika S, Scholkopf B, Lengauer T, Muller KR. Engineering support vector machine kernels that recognize translation initiation sites. Bioinformatics. 2000;16(9):799–807. doi: 10.1093/bioinformatics/16.9.799. [DOI] [PubMed] [Google Scholar]
- Zon LI, Gurish MF, Stevens RL, Mather C, Reynolds DS, Austen KF, Orkin SH. GATA-binding transcription factors in mast cells regulate the promoter of the mast cell carboxypeptidase A gene. J Biol Chem. 1991;266(34):22948–22953. [PubMed] [Google Scholar]
- Zuckerkandl E, Pauling L. Evolutionary divergence and convergence in proteins. In: Bryson V, Vogel HJ, editors. Evolving genes and proteins. New York: Academic; 1965. pp. 97–166. [Google Scholar]










