Author manuscript; available in PMC: 2007 Nov 8.
Published in final edited form as: Genomics. 2005 Dec 1;87(3):399–409. doi: 10.1016/j.ygeno.2005.10.003

An expansion of the dual clip-domain serine proteinase family in Manduca sexta: Gene organization, expression, and evolution of prophenoloxidase-activating proteinase-2, hemolymph proteinase 12, and other related proteinases

Yang Wang 1,1, Zhen Zou 1,1, Haobo Jiang 1,*
PMCID: PMC2071929  NIHMSID: NIHMS30634  PMID: 16324822

Abstract

Prophenoloxidase-activating proteinases (PAPs) take part in insect defense responses including melanotic encapsulation and wound healing. To understand their gene structure and regulation, we screened a genomic library and isolated overlapping λ clones for Manduca sexta PAP-2, hemolymph proteinase 12 (HP12), and HP24. Complete nucleotide sequence analysis indicated that all three genes encode polypeptides with two regulatory clip domains at the amino terminus, a linker region, and a catalytic serine proteinase domain at the carboxyl terminus. Each gene contains eight exons, with introns located at equivalent positions. Similar sequences are present in introns as well as exons, indicating that these genes arose from recent gene duplication and sequence divergence. We analyzed their 5′ flanking sequences and identified putative immune and hormone responsive elements. Reverse transcription-polymerase chain reactions confirmed that PAP-2 and HP12 mRNA levels in the larval fat body and hemocytes increased after a bacterial challenge. However, HP24 expression was barely detected. PAP-2 transcripts in cultured fat body became less abundant after 20-hydroxyecdysone treatment. Thus, PAP-2, HP12, and HP24 mRNA levels are differentially regulated by immune and developmental signals. Comparison with HP15, HP23, and PAP-3 sequences suggested an evolutionary pathway of the dual clip-domain serine proteinases in M. sexta.

Keywords: Insect immunity, Hemolymph protein, Melanization, Gene cluster, Tobacco hornworm, Clip domain, Ecdysone

Introduction

Insects are capable of defending themselves against pathogen/parasite infection [7,8,15]. Their immune systems consist of physical barriers (e.g., integument, gut), cellular responses (e.g., phagocytosis, nodulation, and encapsulation), and reactions mediated by plasma proteins [e.g., coagulation, prophenoloxidase (proPO) activation, and effect of antimicrobial peptides]. Serine proteinase cascades coordinate some of these defense mechanisms [10,14]. For instance, the proteolytic activation of proPO yields active phenoloxidase (PO), which generates quinones for melanin synthesis, wound healing, and encapsulation [1,3,18].

In the tobacco hornworm Manduca sexta, proPO activation is catalyzed by proPO-activating proteinases (PAP-1, PAP-2, and PAP-3) in the presence of two serine proteinase homologues (SPHs) [9,11,12,25]. The SPHs contain an amino-terminal clip domain and a carboxyl-terminal proteinase-like domain that is catalytically inactive (due to the lack of one or more catalytic residues in a serine proteinase). Precursors of PAPs and SPHs are activated by a serine proteinase cascade triggered on recognition of pathogens. PAPs, also known as PPAEs for proPO-activating enzymes, have been isolated and cloned from other arthropod species, including Holotrichia diomphalia, Bombyx mori, and Pacifastacus leniusculus [16,21,23]. B. mori PPAE, M. sexta PAP-2, and M. sexta PAP-3 contain two clip domains at the amino terminus, but the other PAPs only have one. M. sexta SPH-1 and SPH-2 associate with immulectin-2, a C-type lectin that binds lipopolysaccharide and localizes PO to the bacterial surface. Additionally, serine proteinase inhibitors of the serpin super-family negatively regulate PAPs [14].

Our knowledge of the transcriptional regulation of PAP gene expression is limited. We recently published the gene structure and expression pattern of PAP-1 and PAP-3 [26,27]. In this paper, we report the discovery of a cluster of dual clip-domain serine proteinase genes in the M. sexta genome. We have investigated the structure and evolution of these genes. A search for conserved sequence motifs uncovered putative regulatory elements in their 5′ flanking regions. We also surveyed the tissue specificity, immune inducibility, developmental profile, and hormone responsiveness of their expression.

Results

Isolation of cDNA and genomic clones for PAP-2, HP12, HP23, and HP24

We screened the M. sexta genomic library using a full-length PAP-2 cDNA probe [11] and isolated eight positive clones. λ11 and λ12 were selected for subcloning and sequence determination. λ11 contained exons 3–8 of the PAP-2 gene, whereas λ12 included exons 3–8 of HP24 and exon 1 of PAP-2 (Fig. 1A). To fill the gap in the PAP-2 sequences, we used two pairs of primers in exon 1 and exon 2 to amplify genomic DNA by nested PCR and obtained a 1-kb fragment (G1) containing the 3′ end of exon 1, the entire intron 1, and the 5′ end of exon 2. Similarly, a seminested PCR resulted in a 1.4-kb genomic fragment (G1.4) containing the 3′ end of exon 2, the entire intron 2, and the 5′ end of exon 3 of PAP-2. Perfect matches with the ends of λ11 and λ12 indicated that the PCR-derived fragments represented the gap between the λ clones.

Fig. 1.

Structures of M. sexta HP12, HP24, PAP-2, PAP-3, and B. mori PPAE. (A) Exon–intron organization, cloning, and restriction map of a 34.4-kb M. sexta genomic fragment containing HP12, HP24, and PAP-2. Locations of exons 1–8 (vertical bars) and introns 1–7 (horizontal lines) are indicated. Thin lines denote the genomic clones (λ1, λ3, λ11, and λ12) and PCR products (G1 and G1.4) analyzed by restriction digestion, subcloning, and sequencing. S, SalI; X, XbaI; E, EcoRI; H, HindIII. (B) M. sexta PAP-3 and silkworm PPAE structures [24,26]. The same scale is used for both panels. The 5′- and 3′-untranslated regions in exons 1 and 8 are marked by open boxes. Dashed lines represent unidentified genomic regions.

The corresponding exons in λ12 were 67% identical in sequence to those in λ11. Based on a sequence comparison with PAP-2 cDNA, we predicted the exon–intron junctions of HP24. A BLASTX search of GenBank with the assembled exon sequences indicated that this partial gene encoded a serine proteinase with two clip domains. We designed a pair of primers in exons 3 and 8 of HP24 and amplified a 960-bp cDNA fragment of HP24 and, thus, demonstrated that HP24 is an active gene. Comparison of the genomic and cDNA sequences confirmed the predicted exon-intron structure. To isolate a full-length HP24 cDNA, we screened the cDNA libraries via RecA-mediated homologous pairing and biotin-streptavidin interaction [13]. Among the captured positive clones, 14 were PAP-2, 1 was HP12, and none were HP24.

We used the HP15 probe [13] and isolated a full-length cDNA clone for HP15 and for HP23. Highly similar sequences of HP15 and HP23 (Table 1) seem to be responsible for the cross-hybridization.

Table 1.

Percentage identity (upper triangle) and similarity (lower triangle) of amino acid sequences of the dual clip-domain serine proteinases including signal peptide

        HP12   HP24   PAP-2  HP15   HP23   PAP-3  PPAE
HP12    -      80     63     54     50     46     47
HP24    85     -      64     52     48     46     46
PAP-2   73     75     -      58     54     52     50
HP15    66     68     72     -      69     49     49
HP23    64     65     68     80     -      47     46
PAP-3   59     61     65     61     60     -      51
PPAE    63     62     65     65     63     64     -

To isolate genomic clones for HP12 and the rest of HP24, we amplified an HP12 cDNA fragment corresponding to exons 1 and 2 of PAP-2, labeled the product with [32P]dCTP, screened the genomic library, and isolated four positives. Based on the Southern blot analysis, λ1 and λ3 were selected for subcloning and sequence determination. λ3 contained the entire HP12 and exons 1–3 of HP24, whereas λ1 included exons 2 through 8 of HP24 and exon 1 of PAP-2 (Fig. 1A). Thus far, we have determined the complete nucleotide sequence of a 35-kb region containing the HP12–HP24–PAP-2 gene cluster.

We searched the silkworm genome databases and the only dual clip-domain serine proteinase gene found was PPAE. The first exon of PPAE is located at the 3′ end of one scaffold, whereas exons 2–8 are present at the 5′ end of another scaffold (Fig. 1B). A large intron was also identified between exons 1 and 2 of M. sexta PAP-3 [26]. In summary, seven dual clip-domain proteinase genes (M. sexta HP12, HP15, HP23, HP24, PAP-2, PAP-3, and B. mori PPAE) have been discovered thus far.

Transcription initiation

We determined the transcription start site of PAP-2 by primer extension. Reverse transcriptase extended the primer (derived from nucleotides 93–114 of PAP-2 cDNA) by 111 bases and yielded a 133-nucleotide product (Fig. 2, lane 3). Therefore, RNA synthesis begins at an A (nucleotide +1) in the middle of TCAGA. This sequence is similar to TCAGT, a motif typically found within 10 nucleotides before or after the transcription start site in arthropod genes [4]. Additionally, we identified an A/T-rich region (AATATATTTATA) between nucleotides −30 and −19. This sequence contains motifs that are similar to a TATA box (TATATA or TATAAA), which resides in the “−30 region” of many eukaryotic genes.

Fig. 2.

Primer extension of M. sexta HP12, HP24, and PAP-2 transcripts. Oligonucleotides complementary to a region near the 5′ end of the mRNAs were individually labeled and separately annealed to total RNA of fat body from the bacteria-challenged larvae. The primers were extended using M-MLV reverse transcriptase. The set of sequencing reaction (ACGT) on the left for use as a sizing ladder was from dideoxynucleotide sequencing of single-stranded M13mp18 DNA using “−40” primer. Numbers on the right indicate sizes of the extension products (*) and underlining denotes predominant bands. Lane 1, HP24; lane 2, HP12; lane 3, PAP-2.

Primer extension revealed heterogeneous initiation sites in HP12: a series of products (135–212 nucleotides long) were synthesized from an oligonucleotide complementary to nucleotides 124–146 of the cDNA (Fig. 2, lane 2). Based on the strongest band at 184 bp, we deduced that the transcription initiation site is a T (nucleotide +1) in the context of TAGT (−1 to +4). This sequence closely resembles TCAGT, the arthropod pentanucleotide capsite. A perfect TATA box (TATAAA) was found between nucleotides −29 and −24. Two other TATA boxes (TATATA), located −78 to −73 and −63 to −58, may cause transcription initiation at A (−28) and A (−15), as inferred from the 212- and 199-nucleotide extension products. However, we did not find a perfect capsite between nucleotides −38 and −5 – the closest analogs, CTGGT (−41 to −37) and CTAGT (−14 to −10), contain two or three mismatches.
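The arithmetic behind these primer-extension assignments can be sketched in a few lines of Python. This is only an illustration of the bookkeeping described above, not the authors' analysis procedure; the function names are hypothetical, and the numbers are taken directly from the text (primer coordinates on the cDNA, bases added, and product sizes).

```python
def primer_length(first_nt: int, last_nt: int) -> int:
    """Length of a primer complementary to cDNA nucleotides first_nt..last_nt (inclusive)."""
    return last_nt - first_nt + 1


def extension_product_length(first_nt: int, last_nt: int, bases_added: int) -> int:
    """Size of the extension product: primer length plus the bases added by reverse transcriptase."""
    return primer_length(first_nt, last_nt) + bases_added


def relative_start(product_len: int, reference_len: int) -> int:
    """Position of an alternative start site relative to the +1 site.

    The +1 site is defined by the reference (predominant) product. A longer
    product means initiation further upstream, hence a negative position.
    """
    return reference_len - product_len


# PAP-2: primer from nucleotides 93-114 of the cDNA, extended by 111 bases.
pap2_product = extension_product_length(93, 114, 111)

# HP12: the 184-nucleotide band defines +1; the 212- and 199-nucleotide
# products correspond to the upstream sites reported in the text.
hp12_alternative_starts = [relative_start(n, 184) for n in (212, 199)]

print(pap2_product)             # 133
print(hp12_alternative_starts)  # [-28, -15]
```

The results agree with the 133-nucleotide PAP-2 product and with the A (−28) and A (−15) initiation sites inferred for HP12.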

We did not detect any extension product using an HP24-specific primer under the same conditions (Fig. 2, lane 1), probably because HP24 was barely expressed in induced larval fat body (Fig. 3). For the purpose of sequence comparison, we assigned tentative transcription initiation sites for the HP24 and PPAE genes based on the nucleotide sequence alignment (Fig. 4A).

Fig. 3.

Inducibility, developmental, and tissue-specific expression of HP12, HP24, and PAP-2 genes. (A) Immune responsiveness: cDNA samples from control and induced hemocytes (CH, IH) and fat body (CF, IF) were normalized with M. sexta ribosomal protein S3 (rpS3) and subjected to PCR analysis using primers specific for HP12, HP24, or PAP-2. (B) Developmental profile: fat body total RNA samples were isolated from M. sexta at different developmental stages for the expression analysis. (C) Tissue specificity: total RNA samples were isolated from nerve tissue (N), salivary gland (S), Malpighian tubule (Mt), tracheae (T), midgut (Mg), hemocytes (H), fat body (F), integument (I), and muscle (Mu) for RT-PCR analysis. (D) Hormonal regulation: changes in PAP-2 mRNA level after the cultured fat body tissues had been treated with 0, 1, or 5 µg/ml of 20-hydroxyecdysone for 48 h. The PCR product and cycle number are indicated on the right. These experiments were repeated at least once and, in most cases, total RNA samples from another batch of insects were analyzed to confirm the results.

Fig. 4.

Alignment of the upstream (A) and similar (B–D) regions in M. sexta HP12, HP24, PAP-2, PAP-3, and B. mori PPAE genes. (A) The 5′ flanking sequences were compared using the ClustalW program [22]. Positions identical in all five sequences are indicated by an asterisk. GATA boxes (6-nucleotide, marked G) and ISRE sites (13-nucleotide, marked I), boldface and double-underlined; NF-κB motifs (10-nucleotide, marked N), boldface and single-underlined; TATA boxes (6-nucleotide), boldface italic and double-underlined. Mismatches are shown in lowercase letters. The highly similar region in HP12 and HP24 is marked by an inline symbol in the figure. Transcription initiation sites (including the putative ones) are indicated by “┌”, whereas translational start sites “ATG” by “▶”. (B–D) Asterisks indicate nucleotides identical to the top sequence; dashes mark the gaps.

Organization, features, and variations of the exons and introns

A comparison between the genomic and cDNA sequences indicated that the structures of M. sexta HP12, HP24, PAP-2, PAP-3, and B. mori PPAE genes are identical: all contain 8 exons and 7 introns (Fig. 1, Fig. 5A, and Table 2). Exon 1 includes a 5′-untranslated region (UTR) followed by a sequence encoding the signal peptide. Exon 2 and the first 9 nucleotides of exon 3 code for clip domain 1, whereas the rest of exon 3 and the 5′ end of exon 4 encode clip domain 2. The remainder of exon 4 encompasses the linker region as well as the first two residues of a serine proteinase catalytic domain. Most of the catalytic domain is encoded by exons 5–7 and the 5′ end of exon 8.

Fig. 5.

Multiple sequence alignment (A) and phylogenetic analysis (B) of the serine proteinases with two clip domains. (A) Completely conserved residues are indicated by asterisks and conservative substitutions are indicated by periods underneath the sequences. The exon–exon junctions, identical in all of the genes with known structures (Fig. 1), are located right after “♥”. The proteolytic activation site is marked by “∥”. The catalytic residues (His, Asp, and Ser) are marked by boxes, whereas determinants of the specificity pocket by “@”. Cys residues in the mature proteins are underlined, and the absolutely conserved ones in each clip domain are numbered 1 through 6. The paired letters (a–a, b–b, c–c) in the catalytic domains indicate the disulfide linkages conserved in the S1 family of serine proteinases. The two unique Cys residues in most group 2 proteinases are shown by “+” [10]. The Cys, marked with “#”, is probably involved in the interdomain disulfide bond with its partner (one of the Cys residues marked “?”) in the linker region. (B) Based on the sequence alignment in A, a distance tree was constructed by the neighbor-joining method. The horizontal branch lengths are proportional to the minimum number of amino acid substitutions necessary for the evolution of the sequence differences observed. The numbers in boldface indicate bootstrap values (%), whereas the lightface numbers represent branch lengths.

Table 2.

Exons and introns in the genes of dual clip-domain serine proteinases

Exon      1           2        3        4        5        6        7        8
HP15(a)   >57/17(b)   155/52   149/50   158/52   166/56   206/68   170/57   >280/89
HP23(a)   >862/17     155/52   149/50   161/53   172/58   206/68   167/56   >280/89
HP12      112/19      155/52   149/50   170/56   163/55   209/69   170/57   >340/97
HP24      >100/19     155/52   149/50   161/53   163/55   209/69   170/57   >340/97
PAP-2     103/19      155/52   149/50   161/53   163/55   206/68   170/57   1204/87
PAP-3     127/19      155/52   137/46   152/50   163/55   206/68   155/52   1299/85
PPAE      >107/21     155/52   128/43   179/59   163/55   206/68   167/56   >300/87

Intron    1           2        3        4        5        6        7
HP12      1231(c)     946      1977     859      369      413      715
HP24      2492        899      2797     1613     693      332      755
PAP-2     1386        649      1002     677      798      234      328
PAP-3     >6900       2611     1250     352      367      675      334
PPAE      >18,700     991      1090     1081     471      334      491

(a) Intron positions are predicted based on the sequence alignment and phylogenetic relationships shown in Fig. 5.
(b) Size of exon (bp) / encoded amino acid sequence (residues).
(c) Intron size in bp.

The 3′-UTRs in exon 8 of PAP-2 and PAP-3 are 1.2 and 1.3 kb long, whereas those in HP12, HP15, and HP23 are shorter (~0.3 kb) (Table 2). The HP23 cDNA contains a 5′-UTR of 862 bp, much longer than the counterparts in the other proteinase genes. The coding regions in individual exon groups have similar sizes and their phases are identical. For instance, exon 6 is 206 or 209 nucleotides long, encoding 68 or 69 amino acid residues. In contrast, the sizes of the introns in each group are more variable. The alignment of intron sequences pointed to many similar regions in individual groups (data not shown). We further compared the 5′ and 3′ ends of the intron sequences and identified the consensus as 5′-GTR …YAG-3′.
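As a consistency check on the exon data, the residues assigned to each exon in Table 2 can be summed and compared with the protein lengths in Table 4. The Python sketch below does this, and also compares each open reading frame with three nucleotides per residue plus one stop codon. The stop-codon convention is an assumption made for this illustration, not something stated in the tables, and the dictionary and function names are hypothetical.

```python
# Residues encoded by exons 1-8 (second number of each "bp/residues" entry in Table 2).
EXON_RESIDUES = {
    "HP15":  [17, 52, 50, 52, 56, 68, 57, 89],
    "HP23":  [17, 52, 50, 53, 58, 68, 56, 89],
    "HP12":  [19, 52, 50, 56, 55, 69, 57, 97],
    "HP24":  [19, 52, 50, 53, 55, 69, 57, 97],
    "PAP-2": [19, 52, 50, 53, 55, 68, 57, 87],
    "PAP-3": [19, 52, 46, 50, 55, 68, 52, 85],
    "PPAE":  [21, 52, 43, 59, 55, 68, 56, 87],
}

# Protein length (residues) and open reading frame (bp) from Table 4.
PROTEIN_LENGTH = {"HP12": 455, "HP24": 452, "PAP-2": 441, "HP15": 441,
                  "HP23": 443, "PAP-3": 427, "PPAE": 441}
ORF_BP = {"HP12": 1368, "HP24": 1359, "PAP-2": 1326, "HP15": 1326,
          "HP23": 1332, "PAP-3": 1284, "PPAE": 1326}


def check_gene(gene: str) -> tuple[bool, bool]:
    """Return (exon residues sum to protein length, ORF equals 3 x (residues + stop))."""
    residues_ok = sum(EXON_RESIDUES[gene]) == PROTEIN_LENGTH[gene]
    # Assumption: the reported ORF length includes one stop codon.
    orf_ok = 3 * (PROTEIN_LENGTH[gene] + 1) == ORF_BP[gene]
    return residues_ok, orf_ok


results = {gene: check_gene(gene) for gene in EXON_RESIDUES}
for gene, (residues_ok, orf_ok) in results.items():
    print(f"{gene:6s} exon residues match: {residues_ok}  ORF matches: {orf_ok}")
```

Under these assumptions, all seven genes pass both checks, so the per-exon residue counts and the protein and ORF lengths are mutually consistent.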

The comparison of cDNA and gene sequences revealed some minor discrepancies. The full-length HP12 and PAP-2 cDNA sequences and the 960-bp HP24 cDNA fragment contain 5, 5, and 21 nucleotide substitutions, compared to the genome sequences. These minor differences are most likely caused by allelic variations. Among them, only 10 give rise to amino acid residue changes: P196A and Q197K in HP12, N144K, K194Q, L261I, D266E, E267D, A317P, and N363K in HP24, as well as K273E in PAP-2.

Putative regulatory elements and repeated sequences in 5′-UTRs

Computer-based sequence analysis revealed motifs for key transcription factors in the 5′ flanking regions of HP12, HP24, PAP-2, PAP-3, and PPAE (Fig. 4A and Table 3). HP12 has three TATA boxes that may lead to transcription initiation at multiple sites (Fig. 2). Two overlapping TATA boxes are located 60–67 nucleotides upstream of the predicted transcription start site in HP24. PAP-2 does not contain a typical TATA box.

Table 3.

Putative regulatory elements in the 5′ flanking region of the dual clip-domain serine proteinase genes(a)

TATA box (consensus TATAT or TATAAA)
  HP12:     +TATATA (−79); +TATATA (−64); +TATAAA (−29)
  HP24(b):  +TATATA (−67); +TATATA (−65)
  PAP-2:    none listed
  PAP-3:    +TATAAA (−88)
  PPAE(b):  +TATAAA (−39)

GATA (consensus WGATAA)
  HP12:     +AGATAA (−928); −AGATAA (−323)
  HP24(b):  −AGATAA (−641)
  PAP-2:    +AGATAA (−880); +TGATAA (−869); +AGATAA (−537); +TGATAA (−375); +TGATAA (−205)
  PAP-3:    +TGATAA (−536); +AGATAA (−145)
  PPAE(b):  +AGATAA (−699); −TGATAA (−562)

ISRE (consensus GGAAANNGAAANN)
  HP12:     +aGAtAGAGAAAC (−877); +GgcAAGGGAcAG (−566); +GGAAAGGGctAA (−421); +GGAAACAGtAcG (−384)
  HP24(b):  −AGAAAATGAgAAT (−725); −TcAAATCGAAAAA (−478); +GGAAtACcAAACA (−432); −GgtAAGCGAtACG (−360); +GtAAAACcAAATT (−310); +GcAAAATtAAAAA (−182)
  PAP-2:    −GtAAAGAcAAAAA (−810); +GGAAAATGAccTC (−441); +GGAAcCGGAtATA (−334); +GGAAtGCGAcACA (−307)
  PAP-3:    −GtAAACTGAtACA (−944); −TGAAtTTGAAATT (−292); +GGgcACGGAAAAG (−213); +aGAAgGAGAAACT (−193)
  PPAE(b):  +GGAAATTaAACG (−871); +GGAAAAAcAAAA (−315); −GaAAATAGAAAT (−293); −GGAAAATGAtAC (−125)

NF-κB (consensus GGGRAYYYYY)
  HP12:     +GGGAAaTCTT (−944); +GGGtATTTTT (−807); +tGGGATCTCC (−350)
  HP24(b):  none listed
  PAP-2:    −GcGGACTCTC (−781); +GGGGATTaCT (−190); +aGGGACTTCC (−152)
  PAP-3:    −GGGGACTTCT (−1765); +GGcAATTCCC (−1018); +GGaGATTCTC (−427)
  PPAE(b):  −GaGAATCCTT (−68)

EcRE (consensus RRGKTCANTGAMCYY)
  HP12:     none listed
  HP24(b):  none listed
  PAP-2:    +cGGGaaAATGACCTC (−443); −GAGGTCATTttCCCg (−431)
  PAP-3:    +AGtTTgAATGcACTT (−291); +AGGTTCAGTGcgCgT (−6)
  PPAE(b):  none listed

(a) Mismatches are shown in lowercase letters.
(b) Nucleotide positions (in parentheses) are relative to the predicted transcriptional initiation site.

Most of these proteinase genes contain putative GATA boxes, NF-κB motifs, and interferon-stimulated response elements (ISREs) (Table 3). A perfect NF-κB motif is found between nucleotides −1765 and −1756 of PAP-3, and the other nine match 9 of the 10 positions in the consensus sequence (GGGRAYYYYY). HP24 lacks an NF-κB motif. We have identified 12 GATA boxes and 22 ISREs on the positive or negative strands of the genes. Each putative ISRE differs from the consensus (GGAAANNGAAANN) at two sites. The sequence between nucleotides −443 and −429 of the PAP-2 gene matches 12 of the 15 positions in the ecdysone-response element (EcRE) on both strands of the DNA.
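Counting mismatches against a degenerate consensus, as done for the motifs in Table 3, can be illustrated with a short Python sketch. It is a minimal example assuming the standard IUPAC nucleotide ambiguity codes (R = A/G, Y = C/T, K = G/T, M = A/C, W = A/T, N = any base); it is not the motif-search software used in the study, and the function names are hypothetical. The example sequences are copied from Table 3.

```python
# Standard IUPAC ambiguity codes needed for the consensus sequences in Table 3.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "W": "AT", "S": "CG",
    "N": "ACGT",
}


def mismatches(consensus: str, site: str) -> int:
    """Count positions where a site disagrees with a degenerate consensus."""
    site = site.upper()  # Table 3 marks mismatches in lowercase; case is ignored here
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base not in IUPAC[code] for code, base in zip(consensus, site))


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (returned in uppercase)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


NFKB = "GGGRAYYYYY"
ECRE = "RRGKTCANTGAMCYY"

print(mismatches(NFKB, "GGGGACTTCT"))       # PAP-3 site at -1765: 0
print(mismatches(NFKB, "GGGAAaTCTT"))       # HP12 site at -944: 1
print(mismatches(ECRE, "cGGGaaAATGACCTC"))  # PAP-2 plus-strand site: 3
print(mismatches(ECRE, "GAGGTCATTttCCCg"))  # PAP-2 minus-strand site: 3
print(reverse_complement("cGGGaaAATGACCTC"))  # GAGGTCATTTTCCCG
```

The two PAP-2 EcRE entries are reverse complements of each other and each matches 12 of the 15 consensus positions, which is how a single region can carry a putative element on both strands.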

A 178-bp region in intron 7 of HP12 is 90% identical to the reverse complement of a 180-bp fragment in intron 3 of the same gene (Fig. 4B). Another nearly perfect repeat resides in nucleotides −2675 to −2423 and nucleotides −1664 to −1413 of HP12. This 253-bp sequence is also highly similar to nucleotides 1300 to 1553 in HP24 (Fig. 4C). The overall sequence identity among these sequences is 89%. Part of this repeat is also similar to a fragment in M. sexta PAP-1, PAP-3, juvenile hormone-binding protein, eclosion hormone, serpin-1, and arylphorin genes [26,27].

Other than these repeats, highly similar sequences are also present in the 5′-UTR of HP12 and HP24. The sequence between a putative TATA box (starting at nucleotide −65) and the transcription start site of HP12 is nearly identical to the same region in HP24 (Fig. 4A). Further upstream, a 247-bp fragment (nucleotides −961 to −718) of HP12 is 84% identical to the region between nucleotides −1295 and −1049 of HP24 (Fig. 4D). These similar sequences indicate that HP12 and HP24 might share some regulatory elements and, hence, comparable expression patterns. Nevertheless, the RT-PCR results failed to support this hypothesis (Fig. 3).

Expression patterns of HP12, HP24, and PAP-2 genes

To test whether transcription of HP12, HP24, and PAP-2 is up-regulated after an immune challenge, we examined total RNA samples of hemocytes and fat body from M. sexta larvae injected with water or Micrococcus luteus. HP12 mRNA was not detected in either hemocyte sample. In contrast, induced fat body yielded a major PCR product of the expected size after 25 cycles, and the band intensity was markedly higher than that of the control (Fig. 3A). The HP24 mRNA level was below the detection limit in all four samples. On the other hand, up-regulation of PAP-2 transcription was observed in both fat body and hemocytes, a result consistent with previous Northern blot analysis [11].

After method validation, we inspected the mRNA levels of HP12, HP24, and PAP-2 in fat body at different developmental stages (Fig. 3B). HP12 mRNA was present in fat body on day 5 of the fifth instar and day 3 of the wandering stage. HP24 signal was not detected in any of the RNA samples. PAP-2 mRNA was not detected in the larval fat body until the wandering stage started. Its level remained high from day 2 to day 5 of the wandering stage and then gradually decreased in the pupal stage. PAP-2 transcripts resurged in the adult.

In addition, we checked the tissue-specific expression of the three genes. HP12 was expressed mainly in tracheae and integument of the fifth-instar larvae, and slightly in the midgut of wandering stage (Fig. 3C). Again, we did not detect HP24 transcripts. PAP-2 mRNA was present at low to intermediate levels in several larval and pupal tissues. The highest level was observed in fat body of the wandering larvae.

The finding of putative ecdysone response elements in PAP-2 led us to test whether 20-hydroxyecdysone affects the PAP-2 expression in cultured fat body. At a low concentration of 1 µg/ml, 20-hydroxyecdysone slightly reduced the PAP-2 mRNA level. The decrease became more severe after the tissue had been treated with the steroid at 5 µg/ml (Fig. 3D).

Structural properties and evolutionary relationships

As expected from their similar gene structures, M. sexta HP12, HP24, PAP-2, HP15, HP23, PAP-3, and B. mori PPAE share common features at the protein level (Table 4). They all contain a secretion peptide ending with a Gly or Ala. The mature proenzymes almost all start with Gln, which may form pyroGlu, as shown in PAP-2, PAP-3, and PPAE [11,12,21]. The first clip domain contains 53 residues, slightly longer than the second clip domain (51 residues in most cases). The clip domains are separated by 1 residue in PPAE, by 3 residues in PAP-3, and by 7 residues in the five other proteins. In contrast, the linker region between the second clip domain and the catalytic domain is much longer, containing 3 conserved Cys residues. The putative activation site of HP12 and HP24 is VSD(K/R)*IIGG, different from FDNK*ILGG in PAP-2 and VG(D/N)K*I(I/V)GG in HP15, PAP-3, and PPAE. HP23 contains an unusual activation site of EENK*LLAT. Sequences near the catalytic triads are highly conserved: KYVLLTA(G/A)HCV…DI(A/G)LIRL…GKDSCKGDSGGPLMY. With Asp, Gly, and Gly being determinants of the primary specificity pocket, these seven proteinases are predicted or shown to be trypsin-like by cleavage after Arg or Lys residues. We identified potential N-linked and O-linked glycosylation sites in all these proteins, some of which were confirmed by sequence determination and chromatographic analysis [11,12,21].

Table 4.

Structural features of cDNA/protein sequences of the dual clip-domain serine proteinases

Feature                                HP12     HP24     PAP-2    HP15     HP23     PAP-3    PPAE
cDNA length, bp                        1428     1416     2299     1376     2177     1381     1447
Open reading frame, bp                 1368     1359     1326     1326     1332     1284     1326
Protein length, residues               455      452      441      441      443      427      441
Signal peptidase cutting site          G19|Q    G19|Q    G19|Q    G17|Q    G17|E    G19|Q    A21|Q
Isoelectric point of mature protein    5.27     5.47     6.02     5.03     7.40     6.53     9.10
Calculated Mr of mature protein        47,780   48,146   45,797   46,811   47,395   44,174   45,637
N-linked glycosylation sites           3        1        2        6        5        2        2
O-linked glycosylation sites           1        2        1        1        1        3        3

A striking feature of these proteins is the absolute conservation of 24 Cys residues in the sequences (Fig. 5A), which stabilize the enzymes with a common disulfide network. Their overall sequence identity and similarity range from 46 to 80% and from 59 to 85%, respectively (Table 1). The groups of the first or second clip domains have the same length and contain similar sequences (Fig. 6A). This is somewhat surprising because clip domains and linkers are typically hypervariable in their sizes and sequences [10]. Moreover, sequence conservation is found in the linker between the second clip domain and the catalytic domain.
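The identity and similarity ranges quoted above can be recovered from Table 1 by splitting the matrix into its upper triangle (identity) and lower triangle (similarity). The Python sketch below is an illustration only. It assumes the diagonal sits where a gene is compared with itself, which is how the flattened table was read here, and the variable names are hypothetical.

```python
GENES = ["HP12", "HP24", "PAP-2", "HP15", "HP23", "PAP-3", "PPAE"]

# Table 1: upper triangle = % identity, lower triangle = % similarity.
# None marks the diagonal (assumed position of the self-comparison).
TABLE1 = [
    [None, 80, 63, 54, 50, 46, 47],
    [85, None, 64, 52, 48, 46, 46],
    [73, 75, None, 58, 54, 52, 50],
    [66, 68, 72, None, 69, 49, 49],
    [64, 65, 68, 80, None, 47, 46],
    [59, 61, 65, 61, 60, None, 51],
    [63, 62, 65, 65, 63, 64, None],
]

n = len(GENES)
# Pair (i, j) with i < j reads identity above the diagonal and similarity below it.
identity = {(GENES[i], GENES[j]): TABLE1[i][j] for i in range(n) for j in range(i + 1, n)}
similarity = {(GENES[i], GENES[j]): TABLE1[j][i] for i in range(n) for j in range(i + 1, n)}

identity_range = (min(identity.values()), max(identity.values()))
similarity_range = (min(similarity.values()), max(similarity.values()))
closest_pair = max(identity, key=identity.get)

print(identity_range)    # (46, 80)
print(similarity_range)  # (59, 85)
print(closest_pair)      # ('HP12', 'HP24')
```

Read this way, the table reproduces the 46 to 80% identity and 59 to 85% similarity ranges, with HP12 and HP24 as the most similar pair.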

Fig. 6.

Sequence comparison (A) and phylogenetic relationships (B) among the clip domains of M. sexta HP12, HP15, HP23, HP24, PAP-2, PAP-3, and B. mori PPAE. Parameters for multiple sequence alignment and tree construction are described under Materials and Methods. Absolutely conserved residues are indicated underneath the sequences by asterisks and conservative substitutions are indicated by periods. The Cys residues in each clip domain are numbered 1 through 6. In the horseshoe crab proclotting enzyme, three disulfide bonds form between Cys1–Cys5, Cys2–Cys4, and Cys3–Cys6 [17]. The boldface and lightface numbers in B indicate bootstrap values (%) and branch lengths, respectively.

HP12, HP24 and PAP-2 form a separate branch on a phylogenetic tree of the dual clip-domain proteinases (Fig. 5B). The proximity in the M. sexta genome is consistent with their close evolutionary relationships. Although structures and genomic locations of HP15 and HP23 are unknown at this moment, they probably arose from a gene duplication and have an ancestor in common with the HP12–HP24–PAP-2 cluster. This evolutionary pathway is well supported by the bootstrap analysis. On the other hand, PAP-3 and PPAE are less closely related to the five other proteinases. In the analysis of individual clip domains (Fig. 6), it is clear that duplication of the clip-domain region occurred prior to duplication and divergence of this family of genes in Lepidoptera.

Discussion

The discovery of the HP12–HP24–PAP-2 gene cluster and comparison with HP15, HP23, PAP-3 and silkworm PPAE provided useful insights into the evolution of dual clip-domain serine proteinases in M. sexta. First of all, individual gene copies apparently evolved as independent units following gene duplications. Although exon shuffling cannot be ruled out as a mechanism to generate two clip domains at the time of family founding, this does not seem to have occurred during subsequent family expansion (Fig. 6).

Assuming that selection favored the sequence/function divergence of this protein family, why did exon shuffling (another efficient mechanism for introducing diversity and novelty into existing genes) not happen? Based on the clip-domain structure, we suggest that the exon–intron organization of these genes may have prevented exon shuffling (Fig. 5A): exon 2 encodes most of clip domain 1, exon 3 encodes Val–Cys–Cys of clip domain 1 and clip domain 2 before Cys4, and exon 4 encodes the rest of clip domain 2 as well as the linker. In comparison, M. sexta PAP-1 and D. melanogaster SP4, SP7, and SP10 contain a single clip domain encoded by an entire exon [19,27].

The lepidopteran dual clip-domain proteinases differ in domain subgroup and gap length from Drosophila serine proteinases with multiple clip domains [19]. They probably evolved after Lepidoptera and Diptera diverged from their common ancestor ~330 million years ago. In contrast, M. sexta PAP-3 and B. mori PPAE are probably orthologous to each other by sharing similar sequences, biochemical properties, and physiological functions (Table 1 and Table 4). Although PPAE may be the only dual clip-domain enzyme in the silkworm, gene duplication and sequence divergence gave rise to at least six such genes in the tobacco hornworm. With HP12, HP24, PAP-2, and PAP-3 structures determined, sequences of HP15, HP23, as well as other unknown genes for dual clip-domain enzymes become necessary for understanding how these genes arose in the genome of M. sexta.

In addition to the evolutionary relationships, we examined the expression patterns of HP12, HP24, and PAP-2. A search for conserved sequence motifs uncovered putative regulatory elements in the 5′ flanking regions of HP12, HP24, PAP-2, PAP-3, and silkworm PPAE (Table 3). Consistent with the presence of putative NF-κB motifs in HP12, PAP-2, and PAP-3, their transcription was up-regulated after bacterial injection (Fig. 3A) [26]. Under the experimental conditions, we did not detect HP24 expression before or after the bacterial injection—a possible consequence of not having any NF-κB motif. It is unclear whether silkworm PPAE, containing a putative NF-κB motif, is immune responsive or not.

The putative EcREs on both DNA strands of the same region (Table 3) could be responsible for the down-regulation of PAP-2 transcription by 20-hydroxyecdysone. M. sexta PAP-1 gene expression was also suppressed by the ecdysteroid [27]. Interestingly, that putative EcRE is a perfect inverted repeat that may allow the binding of a hormone-associated transcription factor dimer [5]. In contrast, PAP-3 gene transcription was unaffected by 20-hydroxyecdysone [26]. Although there are two putative EcREs in the gene, both of them are located on the negative strand.

The high levels of HP12 and PAP-2 transcripts in the larval trachea and cuticular epidermal cells (Fig. 3C) suggest that they protect against pathogen invasion via the respiratory system or integument. The cuticle of B. mori larvae contains proPO and its activating system [2] and the PPAE mRNA level in the larval integument is much higher than that in hemocytes or fat body [21]. Whereas PAP-2 mRNA is also present at low levels in hemocytes and fat body of naive M. sexta, we detected a high level of the transcript in the nerve tissue. The roles of PAP-1 [27] and PAP-2 (Fig. 3C) in the central nervous system are unknown.

The cloning and sequencing of dual clip-domain serine proteinase genes/cDNAs led us to propose an evolutionary pathway for M. sexta PAP-3, HP15, HP23, PAP-2, HP12, and HP24. Identification of putative regulatory elements and RT-PCR expression analysis revealed a complex picture of temporal and spatial regulation of their transcription. Although these results form a foundation for future studies, further investigations are needed to elucidate the functions of these enzymes in different physiological processes, such as host defense and metamorphosis.

Materials and Methods

Insects, RNA, and genomic DNA

M. sexta eggs were originally purchased from Carolina Biological Supplies and the larvae were reared on an artificial diet [6]. Hemocytes were collected from three fifth-instar larvae (day 2), three wandering larvae (day 3), and three pupae (day 2) [9]. Fat body, salivary gland, midgut, Malpighian tubule, tracheae, muscle, nerve tissue (thoracic ganglia), and integument were then dissected from the bled insects. The cells and tissues were rinsed with 137 mM NaCl, 2.7 mM KCl, 4.3 mM Na2HPO4, 1.4 mM KH2PO4, pH 7.4, and RNA samples were extracted from the combined cells or tissues using the Micro-to-mid Total RNA Purification System (Invitrogen). In RT-PCR analysis of bacteria-induced gene expression, three larvae (fifth instar, day 2) were injected with 50 µl of H2O (control) or M. luteus (Sigma, 1 mg/ml in H2O). Hemocyte and fat body RNA samples were prepared 24 h later. Genomic DNA isolated from M. sexta larvae [26] was used as a template for PCR amplification of a gap in the PAP-2 gene (Fig. 1).

Genomic library screening

An M. sexta genomic DNA library in λ-gem-11, constructed by Dr. Yucheng Zhu at the Southern Insect Management Research Unit of USDA-ARS, was screened with ³²P-labeled M. sexta PAP-2 cDNA as described by Sambrook and Russell [20]. From a total of 3 × 10⁵ recombinant plaques, positive plaques were purified and DNA samples were prepared from plate lysates using a λ DNA purification kit from Qiagen. The bacteriophage DNAs were digested with restriction enzymes and the fragments were separated by agarose gel electrophoresis and transferred to nitrocellulose membranes. After Southern blot analysis, fragments of PAP-2 and related genes were subcloned into pBluescript-KS (Stratagene). The resulting recombinant plasmids were sequenced by primer walking using a BigDye Terminator Cycle Sequencing Ready Reaction Kit (PE Applied Biosystems). Sequence editing and assembly were performed using MacVector Sequence Analysis Software.

Cloning a gap between the λ clones

PCRs were performed to obtain the genomic fragment between exons 1 and 3 of the PAP-2 gene. Using oligonucleotide primers designed based on the cDNA sequence (j332: 5′-GAGACGGTCGCGTGGTGTC-3′; and j333: 5′-TTGTCCATCAAATCCGCAAG-3′), the region between exons 1 and 2 was amplified from M. sexta genomic DNA (1 µg) by Pfu DNA polymerase (Stratagene). The thermal cycling conditions were as follows: 94°C for 30 s; 45°C for 30 s; 72°C for 5 min; for 35 cycles. The reaction mixture (1 µl) was directly used as a template for a second, nested PCR. Taq DNA polymerase (Promega), primer j714 (5′-CGAACAGTAAACATGAA-3′), primer j715 (5′-CAGATTCCCTCAGGAAC-3′), and the same cycling conditions were employed. The region between exons 2 and 3 of PAP-2 was amplified in a similar manner. Primers j716 (5′-AAGCCTGCACATTGCCA-3′) and j717 (5′-AGGTGTGTACACTTCTT-3′) were used in the first PCR and primers j718 (5′-ACGACAAAGGGACTTGC-3′) and j717 were used in a second, semi-nested PCR. The reaction products were separated by 1% agarose gel electrophoresis, recovered from the gel, and cloned into pGem-T (Promega) prior to sequence determination.
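The 17-nt nested primers are short, which fits the low (45°C) annealing temperature used here. The sketch below estimates a melting temperature for two of the primers listed above with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) °C. The choice of this rule is an assumption made for illustration; the text does not state how the annealing temperature was selected.

```python
# Rough Tm estimate for short oligonucleotides using the Wallace rule.
# Assumption: this rule is used purely for illustration; the paper does not
# say how the 45 degC annealing temperature was chosen.

NESTED_PRIMERS = {
    "j714": "CGAACAGTAAACATGAA",
    "j715": "CAGATTCCCTCAGGAAC",
}


def wallace_tm(primer: str) -> int:
    """Approximate Tm in degC: 2 per A/T plus 4 per G/C."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in NESTED_PRIMERS.items():
        print(f"{name}: {len(seq)} nt, approx. Tm {wallace_tm(seq)} degC")
```

The rule is only a coarse guide for oligonucleotides of roughly this length; salt-adjusted or nearest-neighbor estimates would give different values.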

Cloning of HP12, HP23, and HP24 cDNAs

As indicated in Fig. 1, the HP24 gene was first identified in λ12, which cross-hybridized with the PAP-2 probe. To amplify HP24 cDNA, a mixture of λ phage DNA (20 ng) isolated from the bacteria-induced fat body and hemocyte cDNA libraries was used as a template in a PCR. Primers j314 (5′-AATGAAGAGGAGAAGGTGTT-3′) and j315 (5′-CTCCCACATTTGCTCGAC-3′) were designed based on regions most different from the corresponding sequences in PAP-2 cDNA. In the first PCR, primer j315 was used along with a vector-specific T3 primer. The thermal cycling conditions were as follows: 35 cycles of 94°C for 30 s; 54°C for 40 s; and 72°C for 90 s. The reaction product (1 µl) was used as a template for the semi-nested PCR using primers j314 and j315. After 1% agarose gel electrophoresis, a PCR product of the expected size (960 bp) was recovered and inserted into pGem-T by T/A cloning. For isolating full-length HP24 cDNA, biotin-21-dUTP was incorporated into a 576-bp region within the 960-bp sequence by PCR amplification. The bacteria-induced fat body and hemocyte cDNA libraries were screened using the biotin-labeled probe and ClonCapture cDNA Selection Kit (BD Biosciences). Although no HP24 clone was isolated using this probe, a full-length cDNA for HP12 was obtained [13]. HP23 cDNA was serendipitously cloned during the library screening with the HP15 probe.

Determination of transcription initiation sites

Primer extension was carried out using primers reverse complementary to the cDNAs [28]. Primers j730 (5′-TTTGTCGCGGCATCACACTCTGT-3′), j733 (5′-TTTTCACCGCCTCGCACTCGCC-3′), and j736 (5′-GTTGTTTGGCAATGTGCAGGCT-3′) correspond to nucleotides 124–146 of HP12 cDNA, nucleotides 146–167 of the predicted HP24 cDNA, and nucleotides 93–114 of PAP-2 cDNA, respectively. Terminal labeling of the oligonucleotides, primer annealing, cDNA synthesis, and product analysis were performed as described previously [27].
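As a consistency check on the primer coordinates given above, each primer's length should equal the length of the cDNA interval it is said to match (end − start + 1). The sketch below performs that check and also prints the reverse complement of each primer, that is, the sense-strand sequence it would anneal to. The primer sequences and coordinates are copied from the text; the helper names are illustrative only.

```python
# Check that each primer-extension primer is as long as its stated cDNA span.
# Sequences and coordinates are taken from the Methods text; function names
# are hypothetical helpers introduced for this illustration.

PRIMERS = {
    # name: (sequence 5'->3', first nucleotide, last nucleotide, target cDNA)
    "j730": ("TTTGTCGCGGCATCACACTCTGT", 124, 146, "HP12"),
    "j733": ("TTTTCACCGCCTCGCACTCGCC", 146, 167, "HP24 (predicted)"),
    "j736": ("GTTGTTTGGCAATGTGCAGGCT", 93, 114, "PAP-2"),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive 1-based interval."""
    return end - start + 1


def reverse_complement(seq: str) -> str:
    """Sense-strand sequence that an antisense primer pairs with."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def lengths_consistent() -> bool:
    """True if every primer length equals its stated interval length."""
    return all(len(seq) == span_length(start, end)
               for seq, start, end, _ in PRIMERS.values())


if __name__ == "__main__":
    for name, (seq, start, end, target) in PRIMERS.items():
        print(f"{name} ({target} nt {start}-{end}): {len(seq)} nt, "
              f"sense strand {reverse_complement(seq)}")
    print("all lengths consistent:", lengths_consistent())
```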

Nucleotide and amino acid sequence analyses

Multiple sequence alignment and phylogenetic analysis of PAP-2, HP12, HP24, HP15, HP23, PAP-3, and silkworm PPAE were performed using the ClustalW program [22]. The 950-bp sequences immediately upstream of the start codon in the HP12, HP24, PAP-2, PAP-3, and PPAE genes were aligned under default conditions. Putative regulatory elements were identified by MacVector. Blosum 30 matrix, an open gap penalty of 10, an extension gap penalty of 0.1, and a gap separation distance of 8 were used for aligning the clip domain and entire protein sequences by ClustalW. Phylogenetic trees were constructed by the neighbor-joining method.

Expression analysis by RT-PCR

The RNA sample (2–4 µg), oligo(dT) (0.5 µg), and dNTPs (10 mM each, 1 µl) were mixed with diethyl pyrocarbonate-treated H2O in a final volume of 12 µl, denatured at 65°C for 5 min, and quickly chilled on ice for 3 min. cDNA was synthesized by M-MLV reverse transcriptase (Invitrogen, 200 U/µl, 1 µl) in the presence of 5× buffer (4 µl), 0.1 M dithiothreitol (2 µl), RNase OUT (Invitrogen, 40 U/µl, 1 µl), and the denatured RNA sample (12 µl) at 37°C for 50 min, followed by 70°C for 15 min. The M. sexta ribosomal protein S3 mRNA was used as an internal control to normalize the cDNA samples in a preliminary PCR using primers 501 (5′-GCCGTTCTTGCCCTGTT-3′) and 504 (5′-CGCGAGTTGACTTCGGT-3′). Fragments of PAP-2, HP12, and HP24 cDNA were amplified using primer pairs specific for individual proteinases under the cycling conditions empirically chosen to avoid saturation. The primers were as follows: j737 (5′-GACAAGCCTGCACATTG-3′) and j317 (5′-CATCAGACGGCACGCAA-3′) for PAP-2; j731 (5′-CGTCAGTAGATGTGGATAGTCT-3′) and j732 (5′-TCCGATACCACATTCTCCCAG-3′) for HP12; j734 (5′-GAAGTATCATCAGCAGTTCC-3′) and j315 (5′-CTCCCACATTTGCTCGAC-3′) for HP24. The cycling conditions were as follows: 94°C for 30 s; 55°C for 30 s; 72°C for 60 s; for 25, 30, and 35 cycles. The relative cDNA levels of PAP-2, HP12, and HP24 in the normalized samples were determined by 1% agarose gel electrophoresis. To examine the effect of 20-hydroxyecdysone on the PAP-2 mRNA level, fat body was dissected from M. sexta larvae and cultured as described by Zou et al. [27]. After hormone treatment for 18 h at 0, 1, or 5 µg/ml, total RNA samples were isolated from the tissue for RT-PCR analysis.
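The reverse transcription mix described above can be checked arithmetically: the listed volumes should add up to a final volume in which the 5× buffer is diluted to 1×. The sketch below sums the components and derives final concentrations using C_final = C_stock × V_added / V_total. The component volumes and stock concentrations are those given in the text; the assumption that nothing else is added to the reaction is ours.

```python
# Arithmetic check of the reverse transcription reaction described in the text.
# Assumption: the reaction contains only the listed components.

COMPONENT_VOLUMES_UL = {
    "denatured RNA/oligo(dT)/dNTP mix": 12.0,
    "5x buffer": 4.0,
    "0.1 M dithiothreitol": 2.0,
    "RNase OUT (40 U/ul)": 1.0,
    "M-MLV reverse transcriptase (200 U/ul)": 1.0,
}


def total_volume_ul() -> float:
    """Sum of all component volumes in microliters."""
    return sum(COMPONENT_VOLUMES_UL.values())


def final_concentration(stock: float, volume_ul: float) -> float:
    """Dilute a stock (any unit) added at volume_ul into the full reaction."""
    return stock * volume_ul / total_volume_ul()


if __name__ == "__main__":
    print("total volume (ul):", total_volume_ul())
    print("buffer strength (x):", final_concentration(5.0, 4.0))
    print("DTT (mM):", final_concentration(100.0, 2.0))
    print("each dNTP (mM):", final_concentration(10.0, 1.0))
```

Under that assumption the components sum to 20 µl, which is the volume at which 4 µl of 5× buffer gives a 1× working strength.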

Acknowledgments

This work was supported by National Institutes of Health Grant GM58634. This article was approved for publication by the Director of Oklahoma Agricultural Experimental Station and supported in part under Project OKLO2450. We thank Drs. Kanost, Melcher, and Dillwith for their helpful comments on the manuscript.

Footnotes

Sequence data from this article have been deposited with the GenBank/EBI Data Libraries under Accession Nos. DQ115323 and DQ115324.

References

  • 1.Ashida M, Brey PT. Recent advances on the research of the insect prophenoloxidase cascade. In: Brey PT, Hultmark D, editors. Molecular Mechanisms of Immune Responses in Insects. London: Chapman & Hall; 1998. pp. 135–172.
  • 2.Ashida M, Brey PT. Role of the integument in insect defense: pro-phenol oxidase cascade in the cuticular matrix. Proc. Natl. Acad. Sci. USA. 1995;92:10698–10702. doi: 10.1073/pnas.92.23.10698.
  • 3.Cerenius L, Söderhäll K. The prophenoloxidase-activating system in invertebrates. Immunol. Rev. 2004;198:116–126. doi: 10.1111/j.0105-2896.2004.00116.x.
  • 4.Cherbas L, Cherbas P. The arthropod initiator: the capsite consensus plays an important role in transcription. Insect Biochem. Mol. Biol. 1993;23:81–90. doi: 10.1016/0965-1748(93)90085-7.
  • 5.Cherbas L, Cherbas P. Molecular aspects of ecdysteroid hormone action. In: Gilbert LI, Tata JR, Atkinson BG, editors. Metamorphosis: Postembryonic Reprogramming of Gene Expression in Amphibian and Insect Cells. San Diego: Academic Press; 1996. pp. 175–221.
  • 6.Dunn P, Drake D. Fate of bacteria injected into naive and immunized larvae of the tobacco hornworm, Manduca sexta. J. Invert. Pathol. 1983;41:77–85.
  • 7.Gillespie JP, Kanost MR, Trenczek T. Biological mediators of insect immunity. Annu. Rev. Entomol. 1997;42:611–643. doi: 10.1146/annurev.ento.42.1.611.
  • 8.Hoffmann JA. The immune response of Drosophila. Nature. 2003;426:33–38. doi: 10.1038/nature02021.
  • 9.Jiang H, Wang Y, Kanost MR. Pro-phenol oxidase activating proteinase from an insect, Manduca sexta: a bacteria-inducible protein similar to Drosophila easter. Proc. Natl. Acad. Sci. USA. 1998;95:12220–12225. doi: 10.1073/pnas.95.21.12220.
  • 10.Jiang H, Kanost MR. The clip-domain family of serine proteinases in arthropods. Insect Biochem. Mol. Biol. 2000;30:95–105. doi: 10.1016/s0965-1748(99)00113-7.
  • 11.Jiang H, Wang Y, Yu XQ, Kanost MR. Prophenoloxidase-activating proteinase-2 (PAP-2) from hemolymph of Manduca sexta: a bacteria-inducible serine proteinase containing two clip domains. J. Biol. Chem. 2003;278:3552–3561. doi: 10.1074/jbc.M205743200.
  • 12.Jiang H, Wang Y, Yu X-Q, Zhu Y, Kanost MR. Prophenoloxidase-activating proteinase-3 (PAP-3) from Manduca sexta hemolymph: a clip-domain serine proteinase regulated by serpin-1J and serine proteinase homologs. Insect Biochem. Mol. Biol. 2003;33:1049–1060. doi: 10.1016/s0965-1748(03)00123-1.
  • 13.Jiang H, Wang Y, Gu Y, Guo X, Zou Z, Scholz F, Trenczek TE, Kanost MR. Molecular identification of a bevy of serine proteinases in Manduca sexta hemolymph. Insect Biochem. Mol. Biol. 2005;35:931–943. doi: 10.1016/j.ibmb.2005.03.009.
  • 14.Kanost MR, Jiang H, Yu X. Innate immune responses of a lepidopteran insect, Manduca sexta. Immunol. Rev. 2004;198:97–105. doi: 10.1111/j.0105-2896.2004.0121.x.
  • 15.Lavine MD, Strand MR. Insect hemocytes and their role in immunity. Insect Biochem. Mol. Biol. 2002;32:1295–1309. doi: 10.1016/s0965-1748(02)00092-9.
  • 16.Lee SY, Kwon TH, Hyun JH, Choi JS, Kawabata S, Iwanaga S, Lee BL. In vitro activation of pro-phenol-oxidase by two kinds of pro-phenoloxidase-activating factors isolated from hemolymph of coleopteran, Holotrichia diomphalia larvae. Eur. J. Biochem. 1998;254:50–57. doi: 10.1046/j.1432-1327.1998.2540050.x.
  • 17.Muta T, Hashimoto R, Miyata T, Nishimura H, Toh Y, Iwanaga S. Proclotting enzyme from horseshoe crab hemocytes: cDNA cloning, disulfide locations, and subcellular localization. J. Biol. Chem. 1990;265:22426–22433.
  • 18.Nappi AJ, Vass E. Cytotoxic reactions associated with insect immunity. Adv. Exp. Med. Biol. 2001;484:329–348. doi: 10.1007/978-1-4615-1291-2_33.
  • 19.Ross J, Jiang H, Kanost MR, Wang Y. Serine proteases and their homologs in the Drosophila melanogaster genome: an initial analysis of sequence conservation and phylogenetic relationship. Gene. 2003;304:117–131. doi: 10.1016/s0378-1119(02)01187-3.
  • 20.Sambrook J, Russell DW. Molecular cloning: A Laboratory Manual. 3rd ed. NY: Cold Spring Harbor Laboratory Press, Cold Spring Harbor Laboratory; 2001. pp. 2.90–2.100.
  • 21.Satoh D, Horii A, Ochiai M, Ashida M. Prophenoloxidase-activating enzyme of the silkworm, Bombyx mori: purification, characterization and cDNA cloning. J. Biol. Chem. 1999;274:7441–7453. doi: 10.1074/jbc.274.11.7441.
  • 22.Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
  • 23.Wang R, Lee SY, Cerenius L, Söderhäll K. Properties of the prophenoloxidase activating enzyme of the freshwater crayfish, Pacifastacus leniusculus. Eur. J. Biochem. 2001;268:895–902. doi: 10.1046/j.1432-1327.2001.01945.x.
  • 24.Xia Q, Zhou Z, Lu C, Cheng D, Dai F, et al. A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science. 2004;306:1937–1940. doi: 10.1126/science.1102210.
  • 25.Yu X, Jiang H, Wang Y, Kanost MR. Nonproteolytic serine proteinase homologs are involved in phenoloxidase activation in the tobacco hornworm, Manduca sexta. Insect Biochem. Mol. Biol. 2003;33:197–208. doi: 10.1016/s0965-1748(02)00191-1.
  • 26.Zou Z, Jiang H. Gene structure and transcriptional regulation of Manduca sexta prophenoloxidase-activating proteinase-3 (PAP-3), an immune protein containing two clip domains. Insect Mol. Biol. 2005;14:433–442. doi: 10.1111/j.1365-2583.2005.00574.x.
  • 27.Zou Z, Wang Y, Jiang H. Manduca sexta prophenoloxidase activating proteinase-1 (PAP-1) gene: organization, expression, and regulation by immune and hormonal signals. Insect Biochem. Mol. Biol. 2005;35:627–636. doi: 10.1016/j.ibmb.2005.02.004.
  • 28.Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, Struhl K. Current Protocols in Molecular Biology. NY: Greene and Wiley-Interscience; 1987. pp. 4.8.1–4.8.5.
