Nucleic Acids Research. 2000 Jul 1;28(13):2597–2604. doi: 10.1093/nar/28.13.2597

Theoretical design of antisense genes with statistically increased efficacy

Maik Jörg Lehmann 1, Volker Patzel 1,a, Georg Sczakiel 1,2
PMCID: PMC102702  PMID: 10871411

Abstract

Endogenous expression of antisense RNA represents one major way of applying antisense nucleic acids. To express antisense RNA intracellularly, recombinant antisense genes have to be designed and introduced into cells where the target RNA is encountered. Efficient annealing between the antisense RNA and the target RNA is crucial for efficacy and is strongly influenced by RNA structure. Here we extend structural rules for the design of in vitro transcribed antisense RNAs to the design of recombinant antisense genes. Intracellularly expressed antisense RNA transcripts contain a central antisense portion and additional flanking vector-derived sequences. A computer algorithm was generated to compose large sets of antisense genes, to calculate secondary structures of the transcribed sequences and to select for favorable structures of antisense RNA in terms of annealing and efficacy. The biological test system to measure efficiency of antisense genes was human immunodeficiency virus type 1 (HIV-1) replication in 293T cells. When considering the lower intracellular steady-state levels of favorably structured endogenous transcripts, an antisense effect against HIV-1 replication was observed that was up to 60-fold stronger than that measured for predicted unfavorable species. The computational selection was successful for antisense portions of 300 nt but not 100 nt in length. This theoretical design of antisense genes supports their improved application under time- and labor-saving conditions.

INTRODUCTION

Accumulating evidence suggests that the efficacy of long chain antisense RNA in living cells is related to fast annealing of the antisense RNA and the target RNA in vitro and, presumably, also in vivo (1,2). Recent insights into the relationship between structural elements and annealing kinetics of artificial antisense RNAs indicate that global flexibility and a high number of external nucleotides of in vitro transcribed HIV-1 gag-directed antisense RNA without additional non-complementary sequences favor both RNA–RNA annealing in vitro and efficacy in living cells (3). Recombinant antisense genes, rather than in vitro transcribed antisense RNA, represent one of the major forms for the application of antisense inhibitors. The relationship between RNA structure and efficacy in this case is complicated by stretches of vector-derived transcribed sequences that usually flank the central antisense portion of endogenously expressed antisense RNA. As a major consequence it seems to be extremely difficult to establish experimental procedures to identify effective antisense genes out of the space of all possible antisense genes against a given target at sufficient reliability and at appropriate expense. Conversely, computational approaches may serve as an alternative tool to select effective antisense genes provided that parameters can be defined that are accessible to computer calculation and that are related to the efficacy of antisense genes in living cells.

To test this possibility, we developed an algorithm that generates all possible endogenous antisense transcripts against a given target sequence. Target sequences and their structures are usually defined by the biological system of interest and the chosen target system whereas antisense species can be selected out of the complete antisense sequence space. To minimize the influence of the target sequence on the efficacy of antisense nucleic acids tested here, we chose HIV-1 gag sequences for several reasons. First, gag target sequences are relatively homogeneous with respect to the local folding potential (4), i.e. within the RNA encoding gag, the occurrence of extremely stable or unstable sub-structures is unlikely. This is especially true for target regions longer than 100 nt. Secondly, gag-directed antisense RNA has already been shown to efficiently inhibit HIV-1 replication in human cells (5). Thirdly, despite extensive splicing within transcripts of HIV-1, the gag target sequence used here occurs only on the unspliced RNA in vivo, i.e. the target RNA in infected living cells is defined.

Here, transcribed plasmid-derived sequences and all possible consecutive 100 or 300 nt long antisense sequences directed against HIV-1 gag were combined by computer and the secondary structures of all antisense transcripts were calculated and recorded. The selection criteria for favorable antisense transcripts were, in principle, those that had recently been shown to be relevant (3,6). The highest ranking was assigned to structures with the highest global flexibility, reflected in high numbers of external unpaired nucleotides and structural components. To average over individual properties, we cloned a total of 12 favorably predicted HIV-1 gag-directed antisense genes and seven antisense constructs predicted to behave less effectively. To compare the anti-HIV-1 efficacy of different antisense genes, we considered intracellular RNA steady-state levels of antisense transcripts. These were used to calculate relative antisense effects. The results indicate that the length of the antisense stretch is a critical parameter. The computational design described in this work can significantly and reliably improve the efficiency of antisense genes.

MATERIALS AND METHODS

Computer-aided design

The computational selection was performed using a modification of the algorithm Foldanalyze (3), which is based on the program Mfold v.2.0 (7) and which is included in the Heidelberg UNIX Sequence Analysis Resources (HUSAR) (8). The program Space was developed to characterize the antisense structure spaces.

Construction and preparation of plasmid DNA

For construction of plasmids harboring antisense genes we used plasmid pEGFP-C1 (Clontech, Palo Alto, CA). The gfp coding sequence was deleted between the Eco47III and SmaI sites. The selected PCR fragments from HIV-1 proviral DNA clone pNL4-3 (EMBL accession no. REHIVNL4) were subcloned into the MluI and NdeI restriction sites of the polylinker. Plasmid DNA was amplified in Escherichia coli strain XL1-blue, isolated and purified by cesium chloride gradient centrifugation. The purity was monitored by agarose gel electrophoresis (1% agarose). The DNA concentration was measured by UV absorption at 260 nm.

Cell lines and cell culture

Cell line 293T was cultured in DMEM medium (Life Technologies, Karlsruhe, Germany) supplemented with 10% fetal calf serum, L-glutamine (2 mmol/l), penicillin (100 IU/ml) and streptomycin (100 µg/ml) at 37°C and passaged twice a week. For transfection experiments the cells were seeded in 48-well culture plates at a density of 2 × 10⁴ cells/well and grown to half confluence (12 h in culture).

Transfection and inhibition of HIV-1 replication

HIV-1 replication was measured 24 h after calcium phosphate co-transfection (9) of 150 ng HIV-1 proviral DNA (clone pNL4-3) and 250 ng plasmid DNA per well. HIV-1-specific antigen concentrations in the culture supernatants were determined by ELISA (Organon Teknika, Boxtel, The Netherlands). Values were standardized to the effect of the cloning vector (100% p24 expression).

Determination of intracellular RNA steady-state levels

293T cells were transfected as described above without HIV-1 proviral DNA and incubated for 24 h at 37°C. The cells were removed from the culture plates by trypsinization and RNA was isolated using the RNeasy protocol (Qiagen, Hilden, Germany). Isolated RNA was reverse transcribed (Reverse Transcription Reagents; PE Biosystems, Foster City, CA) and amplified by quantitative PCR (SYBR Green PCR Core Reagents, Gene Amp Sequence Detector 5700; PE Biosystems) using forward primer 5′-GATCCGCTAGCGGGATCC-3′ and reverse primer 5′-CCTCTACAAATGTGGTATGGCTGA-3′ for the HIV-1 gag-directed antisense RNA and forward primer 5′-GAAGGTGAAGGTCGGAGTC-3′ and reverse primer 5′-GAAGATGGTGATGGGATTTC-3′ for GAPDH, which was used as an internal standard.

RESULTS

Theoretical basis of the design of antisense genes

Recent observations on the relationship between predicted secondary structures of 100 nt long antisense RNA and their efficacy in mammalian cells suggested that two specific structural features are related to effectiveness (3). The first is a high number of external bases, i.e. nucleotides that are not involved in base pairing and that do not belong to structural elements; external bases thus include free ends and joint sequences but not unpaired nucleotides within loops and bulges. The second is a high number of structural components, i.e. structural folding units that are linked to each other via flexible joint sequences. These observations cannot simply be transferred to endogenously expressed fusion transcripts, whose 5′- and 3′-terminal non-antisense sequences are likely to affect folding of the central antisense sequence.
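The two parameters can be made concrete with a short sketch. It assumes that a predicted structure is available in dot-bracket notation (an assumption; the source does not name an output format) and it approximates a component as one outermost helix domain, which simplifies the definition given above. The function name is hypothetical.

```python
def structure_parameters(dot_bracket: str) -> tuple[int, int]:
    """Return (external_bases, components) for a dot-bracket structure.

    Assumptions (not from the source): '(' and ')' mark paired bases,
    '.' marks unpaired bases, and a component is approximated as one
    outermost helix domain.
    """
    depth = 0            # number of base pairs enclosing the current position
    external_bases = 0
    components = 0
    for symbol in dot_bracket:
        if symbol == "(":
            if depth == 0:
                components += 1      # an outermost helix opens a new folding unit
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: unexpected ')'")
        elif symbol == ".":
            if depth == 0:
                external_bases += 1  # free end or joint, not inside a loop or bulge
        else:
            raise ValueError(f"unexpected symbol {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: missing ')'")
    return external_bases, components


# Toy structure: two hairpins joined by a 2 nt linker, with 2 nt free ends.
print(structure_parameters("..((...))..((..)).."))
```

In the toy structure the unpaired nucleotides inside the two hairpin loops are not counted, because they are enclosed by base pairs.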

For the computational design of antisense genes the following algorithm was used: (i) a given target sequence of L nucleotides in length is used to generate the complete relevant antisense sequence space [complexity = L × (L + 1)/2]; (ii) constant vector-encoded sequences that are co-transcribed within the antisense sequence-containing expression cassette, such as promoter element-derived sequences or termination signals, are attached to the 5′- and 3′-end, respectively; (iii) the five lowest free energy secondary structures are predicted for all species of the resulting virtual space of antisense transcripts; (iv) all resulting secondary structures which are conserved among the five lowest free energy foldings are ranked according to given selection rules for efficiently annealing structures such as those described above. The highest ranking was given to structures with, primarily, the highest numbers of external bases and, additionally, high numbers of components. No cut-off was used.

Computational selection of HIV-1 gag-directed antisense RNA structures

The parental expression plasmid used in this work was plasmid pEGFP-C1. Its gfp coding sequences were replaced by the selected antisense sequences. In the resulting antisense expression cassettes central antisense sequences were flanked by terminal plasmid-derived constant sequences, 33 nt of the CMV promoter at the 5′-end and 177 nt of the SV40 polyadenylation signal [without the poly(A) tail] at the 3′-end. In order to eliminate possible effects of the length of antisense transcripts on their efficiency in living cells and to be able to analyze a variety of constructs in a comparable way, we restricted the computational analysis to molecules containing either a 100 or a 300 nt antisense stretch, which restricts the analyzed sequence spaces to a diversity of (L – 100) or (L – 300) different molecules, respectively. The distributions of external bases and components and their variances along the HIV-1 gag gene are shown in Figure 1. Two sets of antisense genes were selected according to the selection scheme described above. The first was a set of 10 antisense genes, each resulting in an RNA transcript of 310 nt total length containing a central antisense portion of 100 nt (antisense 100mers; Fig. 2A). Six of these structures (F1–F6) were regarded as favorable structures in terms of effectiveness according to the selection parameters (Figs 1 and 3A) and four structures (U1–U4) were predicted to be unfavorable. The second was a set of nine antisense genes, each resulting in an RNA transcript of 510 nt total length containing a central antisense portion of 300 nt (antisense 300mers; Fig. 2B), comprising favorable structures F7–F12 and unfavorable structures U5–U7. As shown in Figure 3A, the favorable structures contain 10-fold (7-fold) higher numbers of external bases and 2-fold (3-fold) higher numbers of structural components in the case of the antisense 100mers (300mers).

Figure 1.

Distributions and variances of selection parameters of endogenously transcribed antisense RNA 100mers and 300mers along the HIV-1 gag gene. Calculated numbers of external bases (A) and components (B) are plotted versus the reverse complement (first position of the analyzed antisense window) of the HIV-1 primary transcript. For example, position 9255 corresponds to position 1 of the HIV-1 primary transcript or to position 455 of HIV-1 clone pNL4-3 (EMBL accession no. REHIVNL4), respectively. Parameters corresponding to selected constructs are indicated. Mean values are indicated by red lines.

Figure 2.

Computer-predicted minimum free energy secondary structures of HIV-1 gag-directed antisense transcripts with (A) a 100 or (B) a 300 nt antisense sequence portion. Antisense sequence stretches of favorable structures (F1–F12) are colored green; antisense sequence stretches of unfavorable structures (U1–U7) are colored red; black represents non-complementary vector-derived sequences identical for all RNA transcripts. Structural elements relevant for the computational selection are specified in structure F1. Components are shaded; external bases are not shaded.

Figure 3.

Relationship between predictable RNA structure parameters, the numbers of external bases and components (A), and the inhibition of HIV-1 replication in 293T cells by theoretically selected antisense genes (B). Predicted structures were selected for high numbers of external bases and high numbers of components. Values for HIV-1 replication are mean values of 5 × 3 co-transfection experiments. The endogenously transcribed antisense RNAs are 310 or 510 nt in length [without a poly(A) tail] and contain a 100 or 300 nt antisense portion respectively. Green bars (solid or hatched) represent favorable structures (F1–F12) and red bars (solid or hatched) unfavorable structures (U1–U7); black bars represent the effect of the cloning vector itself (100% HIV-1 replication). Letters with bars indicate the average values of favorable or unfavorable RNA structures. Mean values are additionally indicated by horizontal lines.

Intracellular steady-state levels of HIV-1 gag-directed endogenous antisense RNA transcripts

Steady-state levels of endogenously transcribed antisense RNA were measured by quantitative RT–PCR 24 h after transfection of the corresponding antisense genes into 293T cells. All steady-state levels of antisense RNA (Fig. 4A) were standardized to the corresponding levels of GAPDH mRNA and were compared to the level of RNA transcribed from the insert-free vector backbone. The average values indicate ~2-fold lower steady-state levels of the favorable structures compared to unfavorable structures (Fig. 4A). Individual transcripts, however, such as F7 and F8, showed up to 10-fold lower steady-state levels compared to the average of all favorable antisense 300mers. Interestingly, higher levels were observed in the case of the shorter antisense transcripts.
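The source does not state how the quantitative PCR readout was converted into relative levels. One common approach, assumed here purely for illustration, is the comparative threshold-cycle calculation: the antisense signal is first related to GAPDH within each sample and then to the insert-free vector control. The function name, the default amplification efficiency of 2 and all cycle values below are hypothetical.

```python
def relative_level(ct_antisense: float, ct_gapdh: float,
                   ct_antisense_vector: float, ct_gapdh_vector: float,
                   efficiency: float = 2.0) -> float:
    """Relative steady-state level by the comparative Ct approach.

    Assumption (not from the source): both amplicons amplify with the
    same per-cycle efficiency, by default a perfect doubling.
    """
    delta_sample = ct_antisense - ct_gapdh                  # normalize to GAPDH
    delta_vector = ct_antisense_vector - ct_gapdh_vector    # same for the vector control
    return efficiency ** -(delta_sample - delta_vector)


# Invented cycle values: the construct crosses the threshold one cycle later
# than the vector transcript at equal GAPDH signal, i.e. half the level.
print(relative_level(24.0, 18.0, 23.0, 18.0))
```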

Figure 4.

(A) Steady-state levels of HIV-1 gag-directed antisense RNA species in 293T cells determined by quantitative RT–PCR. Values are standardized to levels of GAPDH. Values are mean values of three experiments. (B) Antisense effects of selected antisense genes according to the equation: antisense effect = 100/[(HIV-1 replication [p24]) × (steady-state level)]. For the color coding see the legend to Figure 3. The values obtained for the cloning vector (lane C) are set to one.
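The equation in the legend can be written as a small function. The sketch assumes that HIV-1 replication is given in percent of the vector control and that the steady-state level is given relative to the vector transcript, so that the vector itself yields an antisense effect of one, as stated in the legend. The function name and the second pair of input values are hypothetical.

```python
def antisense_effect(replication_percent: float, steady_state_level: float) -> float:
    """antisense effect = 100 / (HIV-1 replication [p24] x steady-state level)."""
    if replication_percent <= 0 or steady_state_level <= 0:
        raise ValueError("both inputs must be positive")
    return 100.0 / (replication_percent * steady_state_level)


# Cloning vector: 100% replication at a relative level of 1.
print(antisense_effect(100.0, 1.0))
# Invented construct: 50% replication at half the vector transcript level.
print(antisense_effect(50.0, 0.5))
```

Dividing by the steady-state level means that a construct which achieves the same inhibition at a lower intracellular concentration receives a proportionally higher antisense effect.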

Efficacy of HIV-1 gag-directed antisense genes in 293T cells

To measure antisense RNA-mediated inhibition in mammalian cells, we used a transient co-transfection assay for HIV-1 replication which represents a valid system: human 293T cells were co-transfected with a mixture of antisense RNA expression plasmid and the cloned infectious proviral HIV-1 DNA pNL4-3 (EMBL accession no. REHIVNL4) by the calcium phosphate co-precipitation protocol (9). Virus was released into the cell culture supernatant and quantified by HIV-1 p24 antigen ELISA. The ratio of both types of DNA was chosen such that levels of inhibition were in a range that allowed comparison of all constructs. All constructs harboring a 100 nt antisense stretch showed significant inhibition, though no significant difference was observed between the group of favorably predicted antisense genes and those predicted to be less effective (Fig. 3B). For the constructs containing a 300 nt stretch of antisense sequences, a statistically significant 2-fold difference in efficacy was measured between the two groups of theoretically selected antisense genes (Fig. 3B). All positively selected species showed stronger inhibition of HIV-1 compared to all negatively selected structures. The probability of observing this relation by chance is 0.012. It is reasonable to assume that intracellular concentrations of antisense RNA are important for efficacy. Thus, in order to compare the antiviral efficacy, we defined an antisense effect of the different antisense genes that takes the measured intracellular RNA steady-state levels into account. As shown in Figure 4B, the dynamic range between individual species as well as between positively and negatively selected antisense structures became significantly greater. On average, the positively selected antisense structures showed an 11-fold higher antisense effect compared to the negatively selected species in the case of the antisense 300mers, and even for the antisense 100mers, for which no difference was observed in virus replication, a 2- to 3-fold higher antisense effect was calculated (Fig. 4B).
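The source does not say how the probability of 0.012 was obtained. A value of this size follows from a simple counting argument, sketched here as an assumption: if the six favorable and three unfavorable 300mer constructs were ranked at random, only one of the possible ways of choosing the three weakest inhibitors would consist of exactly the three unfavorable constructs. The function name is hypothetical.

```python
from math import comb


def perfect_separation_probability(n_favorable: int, n_unfavorable: int) -> float:
    """Chance that all favorable constructs outrank all unfavorable ones
    when every ranking of the constructs is equally likely."""
    return 1 / comb(n_favorable + n_unfavorable, n_unfavorable)


# Six favorable (F7-F12) and three unfavorable (U5-U7) antisense 300mers.
p = perfect_separation_probability(6, 3)
print(round(p, 3))
```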

Relationship between secondary structures of HIV-1 gag-directed antisense transcripts and efficacy

The results presented here indicate that, with some limitations, the previously identified selection parameters (3) can be used for improved design of endogenous HIV-1-directed antisense transcripts or antisense genes. Structures with higher numbers of external bases and components are stronger inhibitors of HIV-1 replication in the case of a 300 nt but not a 100 nt antisense stretch. However, the correlation between the theoretically calculated structural parameters of antisense RNA and their efficacy in living cells is of a qualitative and not a quantitative nature.

DISCUSSION

The theoretical approach presented here seems to be suitable to significantly improve the design of antisense genes directed against HIV-1. In the case of the constructs with a 300 nt antisense portion all favorably predicted structures (F7–F12) showed improved inhibition of HIV-1 replication. Conversely, without consideration of intracellular RNA steady-state levels, no influence of computational selection on viral replication was observed in the case of structures with a 100 nt antisense portion. The average inhibitory potential of the antisense 100mers was found to be between the efficacy of positively and negatively selected antisense 300mers. The failure of the computational prediction in the case of the shorter antisense stretches could be attributed to a structure-dominating effect of the relatively long constant non-complementary vector-derived sequences at the termini of these molecules. This idea is supported by the observation that the vector-derived portions fold into identical structural sub-domains within many positively and negatively selected species (Fig. 2A). For example, an identical stem–loop is predicted for the CMV promoter-derived 5′-terminal sequences within structures F6, U1 and U2. Another extended stem–loop structure is predicted within the SV40 poly(A)-derived sequence portion of structures F1–F6, U1 and U2 (Fig. 2A). The ratio of non-complementary to complementary bases is 210/100 = 2.1 in the case of the antisense 100mers (310 nt in length) and 210/300 = 0.7 in the case of the antisense 300mers (510 nt in length). These constant vector-derived terminal sequences significantly reduced the diversity of structures concerning the two selection parameters, the numbers of external bases and components compared to completely complementary sequences of identical chain length. This becomes obvious when regarding the variability of the two selection parameters of different HIV-1 gag-directed antisense structure spaces (Fig. 5). 
As expected, the structural diversity, with respect to external bases and components, was significantly higher in the case of antisense 510mers compared to antisense 310mers (fully complementary sequences each), which is due to the higher potential of alternative base pairing within the longer sequences. However, when substituting 33 nt located at the 5′-ends and 177 nt located at the 3′-ends by invariable vector sequences, this diversity was reduced more dramatically in the case of the 310mers (Fig. 5). Further, the frequency of structures with the maximum number of external bases was much higher in the case of completely complementary sequences and was more strongly reduced by terminal vector-derived sequences in the case of 310mers compared to 510mers (Fig. 5). These considerations are compatible with the previous observation that the computational design of HIV-1 gag-directed antisense RNA of 100 nt in length can dramatically enhance inhibition of HIV-1 replication if the RNAs are transcribed in vitro with a maximum of six non-complementary bases at the 5′-end prior to transfection of mammalian cells (3). Terminal vector-derived sequences prevent complementary external bases from being located in terminal positions. Complementary terminal external bases, in particular, were found to be critical for annealing and antiviral activity of HIV-1-directed antisense RNA.
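The ratio of non-complementary to complementary bases quoted above follows directly from the flank lengths. The short sketch below recomputes it for the two construct classes; the function and constant names are hypothetical, and the last call uses the six non-complementary 5′-terminal bases of the in vitro transcripts mentioned above as an upper bound with no 3′ flank, which is an illustrative assumption.

```python
CMV_FLANK_NT = 33      # transcribed CMV promoter-derived sequence at the 5' end
SV40_FLANK_NT = 177    # SV40 polyadenylation signal-derived sequence at the 3' end


def flank_ratio(antisense_nt: int,
                five_prime_nt: int = CMV_FLANK_NT,
                three_prime_nt: int = SV40_FLANK_NT) -> float:
    """Ratio of non-complementary (vector-derived) to complementary bases."""
    return (five_prime_nt + three_prime_nt) / antisense_nt


for antisense_nt in (100, 300):
    total_nt = antisense_nt + CMV_FLANK_NT + SV40_FLANK_NT
    print(total_nt, round(flank_ratio(antisense_nt), 2))

# In vitro transcript with at most six non-complementary 5' bases (assumed no 3' flank).
print(round(flank_ratio(100, five_prime_nt=6, three_prime_nt=0), 2))
```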

Figure 5.

Characterization of hypothetical HIV-1 gag-directed antisense RNA structure spaces for the schematically depicted sequence constructs of different chain length. Variable antisense portions are represented by single lanes, black boxes represent transcribed nucleotides of the CMV promoter and hatched boxes represent the SV40 poly(A) tail. A set of theoretically determined parameters was calculated with the program Space. ext., external; max., maximum; no., number. Diversity of components and diversity of ext. bases reflect the dynamic range of these parameters within the analyzed sequence space. Max. no. of ext. bases reflects the highest number of external bases observed for a single species. Highest amount of ext. bases indicates the percentage of external bases within the species with the maximum number of external bases. Frequency of the max. no. of ext. bases indicates the number of structures with the maximum number of external bases found during each analysis.

On a statistical basis long antisense RNA molecules anneal faster with the target strand than shorter ones and, thus, are potentially stronger inhibitors (10). It is reasonable to assume that, analogous to in vitro or in vivo selection strategies, the success of computational selection is also correlated with the structural diversity of the initial pool. However, an arbitrary elongation of the antisense stretch up to the length of the target sequence would reduce the sequence and structure diversity to one. This would result in a statistically potent but probably sub-optimal molecule. The probability of identifying favorable structures in terms of annealing and efficacy decreases with the length of the antisense molecule (Fig. 5). When regarding the fully complementary sequences, the highest proportion of external bases among all HIV-1 gag-directed antisense species was 80% in the case of 100mers, 43% in the case of 300mers, 42% for 310mers and only 30% for 500mers. Similarly, in the case of endogenous transcripts, the highest proportion of external bases was 39% for 310mers and 33% for 510mers. The longer the antisense molecules were, the closer to the average they became in terms of structural features. Thus, in order to improve the computational design of endogenously transcribed antisense RNA and the efficacy of selected species, it seems promising to reduce the ratio of non-complementary to complementary sequences, particularly by minimizing the non-complementary sequence portions rather than by elongating the antisense stretch as one might expect when extrapolating the results of the analyzed antisense 100mers and 300mers. On the basis of these interpretations it seems attractive to use the polymerase III promoter, which generates a minimum of transcribed promoter sequences, for the future design of antisense genes.

Despite a potentially stronger influence of vector-derived sequences in the case of the antisense 100mers, the question has to be addressed of why the computational selection does not improve the design of antisense genes in terms of efficacy, although from the theoretical point of view the secondary structure prediction should be more reliable in the case of shorter sequences. One explanation could have been a thermodynamically more stable local target region in the case of the positively selected HIV-1-directed antisense 100mers. A strong influence of this so-called local folding potential of the target sequence on annealing and inhibition of gene expression has been investigated previously (4). However, in the case of the HIV-1-directed antisense 100mers there is no hint that local thermodynamic target stability might be responsible for the lack of correlation between RNA secondary structure prediction and efficacy in human cells (Table 1). From a statistical point of view, thermodynamics should rather support duplex formation between the target RNA and favorably structured antisense molecules. A more trivial explanation would assume that biochemical parameters other than annealing, such as intracellular stability (11), subcellular localization (12) or intracellular transcription rate, are important for the efficacy of the gag-directed antisense 100mers. The sum of all these anabolic and catabolic processes is reflected in the intracellular RNA steady-state levels determined in the absence of target RNA, which are significantly lower in the case of the favorable antisense structures (Fig. 4A). This is compatible with the previous observation that such selected potentially fast annealing structures are less stable in cell extracts (3). Low intracellular RNA stability would necessarily result in low RNA steady-state levels. 
During the infectious assay HIV-1 transcripts were present and annealing with the target sequence could compete with RNA degradation, probably resulting in RNA levels lower than those determined in the absence of target sequences. While fast annealing is related to higher efficiency of antisense RNA, lower RNase resistance might just have equalized this effect in the case of the favorable antisense 100mers but not the 300mers. The annealing rate constants of the individual species are not known. The 3-fold lower RNA steady-state levels of the antisense 300mers might be a hint of an endonucleolytic rather than an exonucleolytic degradation pathway of endogenously transcribed artificial RNA.

Table 1. ΔG values of intramolecular structure formation of local targets and antisense sequences.

| Antisense RNA | Target position (a) | ΔG target (kcal/mol) | Average ΔG targets (kcal/mol) | ΔG antisense species (kcal/mol) | Average ΔG antisense species (kcal/mol) |
|---|---|---|---|---|---|
| HIV 100 | | | | | |
| F1 | 2617–2716 | –5.5 | –9.4 | –36.1 | –37.5 |
| F2 | 2587–2686 | –10.3 | | –43.2 | |
| F3 | 2302–2401 | –8.8 | | –37.4 | |
| F4 | 1768–1867 | –15.4 | | –43.5 | |
| F5 | 814–913 | –6.9 | | –36.5 | |
| F6 | 785–884 | –9.2 | | –28.5 | |
| U1 | 2748–2847 | –12.6 | –11.9 | –40.2 | –41.4 |
| U2 | 2457–2556 | –10.8 | | –45.4 | |
| U3 | 1635–1734 | –12.3 | | –38.5 | |
| U4 | 1094–1193 | –12.0 | | –41.4 | |
| HIV 300 | | | | | |
| F7 | 2601–2900 | –35.3 | –47.6 | –63.0 | –68.9 |
| F8 | 2374–2673 | –44.7 | | –65.5 | |
| F9 | 2173–2472 | –51.0 | | –78.9 | |
| F10 | 1440–1739 | –50.0 | | –77.1 | |
| F11 | 628–927 | –51.7 | | –63.4 | |
| F12 | 607–906 | –53.0 | | –65.2 | |
| U5 | 1699–1998 | –48.0 | –58.0 | –68.5 | –80.3 |
| U6 | 1047–1346 | –44.3 | | –72.7 | |
| U7 | 275–574 | –81.8 | | –99.8 | |

(a) Sequence published as EMBL accession no. REHIVNL4.
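The group averages in Table 1 can be recomputed from the individual ΔG values. The sketch below copies the values from the table; the dictionary layout and the function name are hypothetical choices made for illustration.

```python
from statistics import mean

# (ΔG target, ΔG antisense species) in kcal/mol, copied from Table 1.
TABLE_1 = {
    "100mers favorable": [(-5.5, -36.1), (-10.3, -43.2), (-8.8, -37.4),
                          (-15.4, -43.5), (-6.9, -36.5), (-9.2, -28.5)],
    "100mers unfavorable": [(-12.6, -40.2), (-10.8, -45.4),
                            (-12.3, -38.5), (-12.0, -41.4)],
    "300mers favorable": [(-35.3, -63.0), (-44.7, -65.5), (-51.0, -78.9),
                          (-50.0, -77.1), (-51.7, -63.4), (-53.0, -65.2)],
    "300mers unfavorable": [(-48.0, -68.5), (-44.3, -72.7), (-81.8, -99.8)],
}


def group_averages(table):
    """Mean ΔG of the local targets and of the antisense species per group."""
    return {
        group: (mean([target for target, _ in rows]),
                mean([antisense for _, antisense in rows]))
        for group, rows in table.items()
    }


for group, (target_avg, antisense_avg) in group_averages(TABLE_1).items():
    print(f"{group}: target {target_avg:.2f}, antisense {antisense_avg:.2f}")
```

In both length classes the local targets of the favorable constructs are, on average, slightly less stable than those of the unfavorable constructs.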

There are controversial opinions in the antisense field about the usefulness of single secondary structure predictions for the evaluation of favorable antisense sequences. This and previous work (3,6) indicate that a more extensive computational secondary structure analysis can significantly support the design of antisense compounds. We present a first approach to improve the design of antisense genes compared with a purely statistical basis. We suggest that when using a low ratio of vector-derived to complementary sequence portions, the computer-aided design of endogenously expressed antisense RNA significantly improves the efficacy of antisense genes. The potential of computational selection is more dramatically reflected in the definition of an antisense effect which considers the lower intracellular steady-state levels and, hence, lower nuclease resistance of favorable antisense RNA molecules. Thus, it seems promising to consider RNA-stabilizing structural domains during computational selection to design fast annealing and more stable antisense molecules. Similarly, RNA transport signals (13,14) considered during the selection protocol might help to direct endogenous antisense transcripts to subcellular locations where they encounter their target. The method presented here is time- and labor-saving and, hence, may be applied prior to any kind of experimental antisense gene design.

ACKNOWLEDGEMENTS

We cordially thank K.-H. Glatting and S. Suhai from the Steinbeis-Transferzentrum für Genominformatik and the Deutsches Krebsforschungszentrum for computational help. This work was supported by the Deutsche Forschungsgemeinschaft, grant Sc14/1-3.

REFERENCES


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
