Applications in Plant Sciences. 2013 Jan 2;1(1):apps.1200179. doi: 10.3732/apps.1200179

Annotation and re-sequencing of genes from de novo transcriptome assembly of Abies alba (Pinaceae)

Anna M Roschanski 2,4, Bruno Fady 3, Birgit Ziegenhagen 2, Sascha Liepelt 2
PMCID: PMC4105350  PMID: 25202477

Abstract

Premise of the study: We present a protocol for the annotation of transcriptome sequence data and the identification of candidate genes therein using the example of the nonmodel conifer Abies alba.

Methods and Results: A normalized cDNA library was built from an A. alba seedling. The sequencing on a 454 platform yielded more than 1.5 million reads that were de novo assembled into 25149 contigs. Two complementary approaches were applied to annotate gene fragments that code for (1) well-known proteins and (2) proteins that are potentially adaptively relevant. Primer development and testing yielded 88 amplicons that could successfully be resequenced from genomic DNA.

Conclusions: The annotation workflow offers an efficient way to identify potential adaptively relevant genes from the large quantity of transcriptome sequence data. The primer set presented should be prioritized for single-nucleotide polymorphism detection in adaptively relevant genes in A. alba.

Keywords: Abies alba, adaptation, annotation, candidate genes, de novo sequencing, Pinaceae


To gain insight into adaptation at the molecular level, attention has turned to the investigation of adaptively relevant genes (candidate genes). For nonmodel organisms, access to candidate genes is limited, and the transfer of primers, e.g., from expressed sequence tag (EST) libraries, if available, is labor-intensive. For instance, the resequencing of 800 genes selected from more than 7000 ESTs from Pinus taeda L. yielded only 70 candidate genes for Abies alba Mill. (Mosca et al., 2012). Because sequencing costs are decreasing rapidly, de novo sequencing in nonmodel organisms is now achievable. Candidate genes in de novo–sequenced organisms can be identified by differential expression profiling (e.g., Street et al., 2006; Huang et al., 2012), but this requires the sequencing of several samples. The sequencing of a single transcriptome, in contrast, is very cost-effective, but reducing the resulting data to a manageable set of candidates remains challenging. Searching the sequences against available databases with BLAST is the standard method; because it produces very large outputs, it is mainly used for annotation only (e.g., Parchman et al., 2010). Here, we present a protocol for the efficient reduction of transcriptomic data down to 283 candidate gene sequences that were used for immediate primer development. The protocol is applicable to species that lack genomic resources. It combines a standard and a specific annotation approach and led to the resequencing of 88 gene fragments in A. alba.

METHODS AND RESULTS

A normalized transcriptome of a 1-yr-old A. alba seedling from the Black Forest (Forest District Calw, Germany; voucher MB-P-001007, Herbarium Marburgense, University of Marburg) was sequenced on a 454 GS FLX Titanium platform (cDNA library preparation: Vertis Biotechnology AG, Freising, Germany; sequencing: Genoscreen, Lille, France). The 454 run yielded 1521698 reads with an average length of 359 nucleotides (nt). Trimming and de novo assembly of the raw reads into contigs using Newbler software version 2.3 (454 Life Sciences, Branford, Connecticut, USA) resulted in 25149 contigs consisting of 381808 complete and 619615 partially assembled reads. The contig length was between 100 nt and 2394 nt, with an average length of 498 nt. A total of 484576 reads remained as singletons (Table 1). Contigs were submitted to the Transcriptome Shotgun Assembly database (TSA) at the National Center for Biotechnology Information (NCBI) (accession no.: JV134525–JV157085).

Table 1.

Statistics of the 454 transcriptome sequencing run and metrics of the Newbler assembly software.

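Size (nt) in quantiles: columns 0% to 100%.

| Sequence type | Number | % | Nucleotides | Average (nt) | 0% | 25% | 50% | 75% | 100% |
|---|---|---|---|---|---|---|---|---|---|
| Reads trimmed | 1521698 | 100 | 546346058 | 359.0 | <21 | <303 | <395 | <444 | <1088 |
| Reads assembled | 381808 | 25.1 | | | | | | | |
| Reads partial | 619615 | 40.7 | | | | | | | |
| Reads singleton | 484576 | 31.8 | 175198711 | 361.6 | <50 | <307 | <397 | <443 | <876 |
| Reads repeat | 1617 | 0.1 | | | | | | | |
| Reads outlier | 20389 | 1.3 | | | | | | | |
| Reads too short | 13693 | 0.9 | | | | | | | |
| Contigs | 25149 | | 12511848 | 498 | <100 | <365 | <468 | <601 | <2394 |

N50 contig (a): 704 nt

a Half of all bases are assembled in contigs of this size or longer.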
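The N50 definition in the footnote can be made concrete with a short calculation. The following is a minimal sketch in Python, not the routine used by the Newbler software; the function name `n50` and the toy contig lengths are assumptions made for illustration only.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in nt).

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy lengths for illustration only; these are not the A. alba contigs.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```

Applied to the real contig lengths (for example, read from the submitted TSA sequences), the same function would be expected to return a value comparable to the reported N50.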

In the specific approach (Fig. 1), we tested a novel annotation protocol. After a literature survey in the Web of Science database using the key words “adaptation,” “candidates,” “drought,” “evolution,” “RT-PCR,” and “selection” in various combinations, we selected 5349 unique proteins and downloaded them from UniProt or NCBI (November 2011). The proteins were subsequently searched against the contigs from the de novo transcriptome assembly, which were formatted as the reference database using the BLAST+ 2.2.24 toolkit (tBLASTn parameters: softmasking = threshold 15 max_target_seqs 10000). To increase the reliability of alignments and to avoid amplicons that are too short, only alignments with a length of at least 100 amino acids and an identity of at least 90% were considered further. From the contigs that passed the filter, 185 were selected for primer design.

In the standard approach (Fig. 1), contigs were searched against the refseq_protein database (downloaded from NCBI 14 June 2011) with strict BLAST settings (BLASTx parameters: threshold 999, window-size 4, gapopen 32767, gapextend 32767, E-value 1e−20) (Altschul et al., 1990). Gene ontologies (Ashburner et al., 2000) were assigned to contig-protein hits using Blast2GO 2.5.0 (Conesa et al., 2005) and subsequently filtered as described above. To select for well-described proteins, contig sequences were used for primer design only if they could be assigned to enzyme IDs with the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Ogata et al., 1999) in the final annotation step. Primers were developed with the default standard PCR settings of PerlPrimer (version 1.1.12; Marshall, 2004), with the amplified range specified according to the contig-protein alignment boundaries.
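The alignment filter of the specific approach (length of at least 100 amino acids, identity of at least 90%) can be sketched as a small script. This is an illustration under assumptions, not the authors' code: it assumes the tBLASTn hits were exported in BLAST+ tabular format (`-outfmt 6`, where the first four columns are query, subject, percent identity, and alignment length), and the function name and example identifiers are hypothetical.

```python
import csv
import io

MIN_ALIGNMENT_LENGTH = 100  # amino acids, cutoff stated in the text
MIN_IDENTITY = 90.0         # percent, cutoff stated in the text


def passing_contigs(tabular_hits):
    """Return the set of contig IDs with at least one alignment passing both cutoffs.

    `tabular_hits` is a file-like object in BLAST tabular layout. Proteins are
    the queries and contigs are the database, so the contig is column 2.
    """
    kept = set()
    for row in csv.reader(tabular_hits, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, identity, length = row[1], float(row[2]), int(row[3])
        if length >= MIN_ALIGNMENT_LENGTH and identity >= MIN_IDENTITY:
            kept.add(subject_id)
    return kept


# Hypothetical hits for illustration only (not data from the study).
example = io.StringIO(
    "protA\tcontig00001\t95.2\t130\t6\t0\t1\t130\t10\t399\t1e-70\t250\n"
    "protA\tcontig00002\t91.0\t80\t7\t0\t1\t80\t10\t249\t1e-30\t120\n"
    "protB\tcontig00003\t72.5\t210\t57\t1\t1\t210\t4\t633\t1e-60\t220\n"
)
print(sorted(passing_contigs(example)))
```

In this example only the first hit passes: the second alignment is too short and the third falls below the identity cutoff.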
Primers were tested in a 30 μL PCR reaction with 17.28 μL double-distilled water, 3 μL 10× PCR buffer with MgCl2 (20 mM), 1.2 μL MgCl2 (25 mM), 3 μL primer mix (forward and reverse each 2 μM), 1.44 μL dNTPs (each 5 mM), 0.24 μL bovine serum albumin (BSA) (20 mg mL−1), 0.24 μL Dream Taq polymerase (5 U μL−1, Fermentas, St. Leon-Rot, Germany), and 3.6 μL DNA (10 ng μL−1). The PCR was performed with 5 min initial denaturation at 94°C followed by 35 cycles of 45 s denaturation at 94°C, 45 s annealing at 52–59°C, and 45 s elongation at 72°C, and a 10 min final elongation at 72°C. For the amplification test, four samples were randomly chosen for each gene from a set of 80 different silver fir trees that were sampled in May 2011 on Mont Ventoux (44°10′44.35″N, 5°14′32.29″E, France). Amplification was evaluated by electrophoresis in 1% agarose gels. When amplification was too weak, the volume of MgCl2 was increased to 1.8 μL. When faint ancillary bands appeared, no additional magnesium was added to the master mix. If PCR products occurred as a single band, one sample was chosen for sequence analysis in each case to ensure that the region of interest was amplified (LGC Genomics GmbH, Berlin, Germany). Gene sequences were aligned to the corresponding contigs using the CodonCode Aligner software (default large gap settings) to reveal the location of the introns. The gene sequences were searched against the nr nucleotide database of NCBI (default discontiguous megaBLAST settings, web application).
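The reaction recipe above lends itself to a quick arithmetic check and to scaling for several samples. The sketch below copies the per-reaction volumes from the text; the helper `master_mix` and the 10% pipetting overage are assumptions for illustration and are not part of the published protocol.

```python
# Per-reaction volumes in microlitres, copied from the recipe in the text.
PER_REACTION_UL = {
    "double-distilled water": 17.28,
    "10x PCR buffer with MgCl2": 3.0,
    "MgCl2 (25 mM)": 1.2,
    "primer mix": 3.0,
    "dNTPs": 1.44,
    "BSA": 0.24,
    "Dream Taq polymerase": 0.24,
    "DNA": 3.6,
}


def master_mix(n_reactions, overage=0.1):
    """Scale every component except template DNA to n reactions.

    `overage` is an assumed safety margin for pipetting losses (10% here).
    Template DNA is left out because it is added to each tube separately.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != "DNA"
    }


# The components should add up to the stated reaction volume.
total = round(sum(PER_REACTION_UL.values()), 2)
print(total)

# Master mix for the four samples tested per gene.
print(master_mix(4))
```

The summed volume confirms that the listed components account for the full 30 μL reaction.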

Fig. 1.


Workflow of the annotation protocol. Numbers of the output after each step are given. The standard approach starts with 25149 contigs. The specific approach uses them as the reference database for the tBLASTn step.

In the specific approach, tBLASTn and subsequent sorting led to 321 contigs. For primer development, 185 contigs were picked. In the standard approach, the initial number of contigs was decreased to one third after the BLASTx step. Approximately half of the hits could be further annotated with Gene Ontologies. After filtering, 126 contigs were successfully assigned to enzyme IDs and used for primer design (Fig. 1). In combination, 283 different contigs were annotated, of which only 28 were annotated with both approaches.

Primer testing and sequencing resulted in 88 gene sequences (Table 2). Fifty-seven genes were annotated using the specific approach and 42 using the standard approach; 11 of these were annotated by both approaches. The assembly of the gene sequences and the corresponding cDNA contigs revealed 43 introns in 26 genes. The length of the gene sequences ranged from 262 to 1486 nt. All gene sequences aligned to sequences from the nr nucleotide database (NCBI), where the highest E-value was 5.00e−32. Twelve gene sequences hit organelle DNA (10 chloroplast, one mitochondrial, and one ribosomal). The remaining 76 are involved in the biosynthesis of different compounds (21), regulation (20), primary metabolism (14), growth (11), stress response (8), and water transport (2). In the biosynthesis group, enzymes from the auxin, phenylpropanoid, and tetrapyrrole pathways were dominant. With the exception of the primary metabolism group, all groups included candidates for the analysis of adaptation at the gene level that had been investigated in previous studies of conifers (e.g., González-Martínez et al., 2006).
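The combined counts reported above follow from the sizes of the two sets and their overlap. A minimal sketch of that bookkeeping, using only numbers stated in the text (the helper name `union_size` is an assumption for illustration):

```python
def union_size(n_a, n_b, n_both):
    """Size of the union of two sets, given both sizes and their overlap."""
    return n_a + n_b - n_both


# Contigs used for primer design: specific (185), standard (126), both (28).
contigs_total = union_size(185, 126, 28)

# Resequenced genes: specific (57), standard (42), both (11).
genes_total = union_size(57, 42, 11)

# Functional groups of the genes that did not hit organelle or ribosomal DNA.
functional_groups = {
    "biosynthesis": 21,
    "regulation": 20,
    "primary metabolism": 14,
    "growth": 11,
    "stress response": 8,
    "water transport": 2,
}

print(contigs_total, genes_total, sum(functional_groups.values()))
```

The totals agree with the 283 annotated contigs, the 88 gene sequences, and the 76 genes remaining after the 12 organelle and ribosomal hits are set aside.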

Table 2.

Primers for resequencing of annotated gene fragments in Abies alba (see footnote a).

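Columns, in row order: Gene locus ID | Primer sequences (5′–3′; F = forward, R = reverse, given on the following line) | Ta (°C) | No. of introns | Intron length (nt; omitted in rows without introns) | Total length (nt) | Annotation approach | BLASTn of gene sequences against nr nucleotide database (E-value) | GO-ID biological process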
95 F: ACAGAAACTAAAGCTAGTGTCG 57 0 696 1 Keteleeria davidiana chloroplast DNA, complete sequence (0) reductive pentose-phosphate cycle, photorespiration, oxidation reduction
R: CCTTAATTTCACCCGTCTCAG
215 F: CCAAGGACTCTGATCGAATCC 56 2 411 1486 2 Abies firma clone 1 4-coumarate:CoA ligase (4CL) gene, partial cds (0) response to UV, response to wounding, phenylpropanoid metabolic process, response to fungus
R: GAAGCCAGCATTCAAAGACTC
241 F: AACGTCCGTTAATACTTCGG 56 3 256 1370 2 Arabidopsis thaliana fructose-bisphosphate aldolase, class I (FBA1) mRNA, complete cds (1E-125) glycolysis
R: AGTAAGTGTAGCCCTTCACG
323 F: AAGCAAGCTTCTGAAATTCC 53 2 278 804 1 A. thaliana plasma membrane H+-ATPase gene, complete cds (1E-90) auxin biosynthetic process, ATP biosynthetic process, proton transport
R: TGGTAGAGTCTACCAAATGAG
1362 F: GAAGAGGTAGCTGCATTGGT 59 0 871 1 Ricinus communis processing-splicing factor, putative, mRNA (0.0) response to hypoxia, sucrose biosynthetic process, nuclear mRNA splicing, via spliceosome
R: GGGCTTATACCGTAAATATACCCA
1704 F: CAACTACTTCAGAGACAGAC 52 2 327 858 2 Pinus taeda mitogen-activated protein kinase 13 (MAPK13) mRNA, complete cds (2E-84)
R: AAAGATTCCCTCCAAATCAG
2387 F: TAAATGGCTCAATTCCTCCTACTG 61 1 128 624 1 Medicago truncatula Alpha-1,4-galacturonosyltransferase (MTR_7g075840) mRNA, complete cds (8E-99)
R: GTTCCAAGCTTCCACAATACTC
2565 F: GTGTCTGGAAGGGAATACAAGG 58 0 432 1 PREDICTED: Vitis vinifera adenosylhomocysteinase-like, transcript variant 1 (LOC100253872), mRNA (1E-109) embryonic development ending in seed dormancy, one-carbon metabolic process, posttranscriptional gene silencing, methylation-dependent chromatin silencing
R: CCTTGACTCCTTCATGGATCAG
2774 F: GTTACAGGAAGCCTTTCTGG 55 0 502 2 Citrus sinensis pectinesterase mRNA, complete cds (5E-32) cell wall modification
R: GCGGGATGAATTATCTTGTC
2937 F: TGAGCTGATTGCTAATGCGG 58 0 622 2 Solanum tuberosum clone 154D06 fructose-bisphosphate aldolase-like mRNA, complete cds (5E-120) glycolysis
R: GGACATGGTGGTCATTGAGG
2986 F: CTGTCTGTGACGGATCTAGC 57 0 355 1 Populus trichocarpa arogenate/prephenate dehydratase (PDT1), mRNA (1E-52) l-phenylalanine biosynthetic process
R: TGAGGATGGCTTACAACACG
3421 F: CTCATCTCTGCCAGAAAGAC 55 0 324 2 Picea sitchensis isolate CR201 phenylalanine ammonia lyase-like protein mRNA, partial cds (0.0) phenylpropanoid metabolic process, biosynthetic process, l-phenylalanine catabolic process
R: GTAGAGCTTCATCTACGAGG
3593 F: AGGACCTGAAATACCTTGCT 56 0 337 2 Abies firma chloroplast, partial genome (6E-170) transport, respiratory electron transport chain, photosynthesis
R: TCCGTGTTTATCTCACAGGT
3689 F: CGATTGCATCTCTGTACGCC 58 0 619 2 Pseudotsuga menziesii var. menziesii haplotype Pm-TBE_412m2 thiazole biosynthetic enzyme (TBE) gene, complete cds (0.0) thiamin biosynthetic process
R: GCTCTTGAGCCTCTTGACAC
3918 F: TTCCAAGGTCTTCTCAAGGT 55 0 400 2 Pinus taeda cellulose synthase catalytic subunit (CesA1) mRNA, complete cds (0.0) cellulose biosynthetic process, cellular cell wall organization, secondary cell wall biogenesis, rhamnogalacturonan I side chain metabolic process
R: TGAAGAGTAGGAGTTTCGGT
3942 F: GTATGATACCGATGTGACGA 55 0 273 2 Ricinus communis proteasome subunit alpha type, putative, mRNA (8E-48) ubiquitin-dependent protein catabolic process
R: TTTGTAATGGATGCACTCGG
3981 F: GGAGAAGTCTACAGTTCCAG 54 0 918 1 Pinus radiata UDP-glucose dehydrogenase gene, partial sequence (0.0) oxidation reduction
R: ATAGTCCAGTGTCTTGAACTC
4103 F: ATGGCCACCTTACTAAGAAGC 57 0 841 1 Pinus pinaster mRNA for S-adenosylmethionine synthase 1 (sams1 gene) (0.0) auxin biosynthetic process, one-carbon metabolic process
R: CCACTTAAGGACCTTTACAGTCTC
4492 F: TGGGTGCAACTGAAGATAGAG 57 0 698 1 Medicago truncatula magnesium-chelatase subunit chlI (MTR_2g015390) mRNA, complete cds (4E-160) auxin biosynthetic process, chlorophyll biosynthetic process, photosynthesis
R: TTTCTACAACTAGCAAGCCTGAG
4921 F: GAAGGTCGGCTATATCAGGT 56 0 664 2 PREDICTED: Glycine max proteasome subunit alpha type-4-like, transcript variant 1 (LOC100786457), mRNA (2E-147) response to cadmium ion, ubiquitin-dependent protein catabolic process
R: AGCTTAGACAGAGACTCAGG
5004 F: CAGATGTGAGCCATTACTTTGAC 57 0 461 1 Picea sitchensis isolate VD401 magnesium chelatase H-like protein mRNA, partial cds (0) chlorophyll biosynthetic process
R: CAACCTCTGAATATAGCTGCCT
5823 F: TGCTTGATATACGTCCTGGG 57 0 293 2 Picea sitchensis isolate VD401 phytochrome A-like protein mRNA, partial cds (0) regulation of transcription, photomorphogenesis, tryptophan biosynthetic process
R: CTAGACAGTGTTGCTCCACG
5945 F: CTGTCACTCAGATCTTCAGC 55 0 339 2 P. abies (L.) Karst. Lhcb1*2-2 mRNA for light-harvesting chlorophyll a/b-binding protein (0.0) photosynthesis, light harvesting, protein-chromophore linkage
R: AGATGATCAGCGAGATTCTC
6119 F: AGAGGATGTTGGGCATTATGG 57 0 567 1 Picea mariana pyruvate dehydrogenase E1 beta subunit (Sb68) mRNA, partial cds (0.0) pollen tube development, oxidation reduction
R: CATCACATGGTATCTCATCCGA
6594 F: TGGCTTTATCTTGGAGACTTCAC 58 1 348 712 1 Ricinus communis phosphatidylinositol 4-kinase, putative, mRNA (5E-51) phosphoinositide biosynthetic process, phosphoinositide phosphorylation, signal transduction, phosphoinositide-mediated signaling
R: GAATAAGGTCATAGCCTGCCG
6757 F: TATCATGCCCTGAAAGCGTC 58 5 177 939 1 Arabidopsis thaliana ribonucleoside-diphosphate reductase subunit M1 (RNR1) mRNA, complete cds (1E-39)
R: ACTTCCACAAGCAAGACACTC
7098 F: CTTTACTGTTGGAGGTAGATCAG 55 0 782 1 Arabidopsis thaliana UDP-glucuronic acid decarboxylase (AUD1) mRNA, complete cds (1E-153) dTDP-rhamnose biosynthetic process, d-xylose metabolic process
R: GTTTGTTTGTCTTTGTACTCCC
7208 F: GTTACATTCGTAAGTAGCTTGG 54 0 326 1 Pinus thunbergii NADH dehydrogenase subunit 5 (nad5) gene, partial cds; mitochondrial (0) transport, ATP synthesis coupled electron transport
R: AAATGGTCGAGAAGTCTACTG
7324 F: ATTGGAGATGGAGCCATGAC 57 0 471 1 Picea abies 1-deoxy-d-xylulose 5-phosphate synthase type I (DXS1) mRNA, complete cds (0) terpenoid biosynthetic process, thiamin biosynthetic process
R: TCTCTGCATATGGGTAACCC
8248 F: CAAGTATTCCGAAAGGCAGC 57 1 601 1128 2 P. abies mRNA for porin Mip1 (3E-154) response to water deprivation, water transport, transmembrane transport, response to salt stress
R: ACAAAGGTGCCCACAATCTC
8583 F: TCTCCTACATTGACGATCCC 56 0 393 2 Picea sitchensis isolate VA301 phenylalanine ammonia lyase-like protein mRNA, partial cds (5E-162) phenylpropanoid metabolic process, biosynthetic process, l-phenylalanine catabolic process
R: CCATCCAAGCACTTGAAGAG
8855 F: TATTTGCTGGTCGGGATTCG 58 2 275 926 2 P. sylvestris Lhca4*1-2 mRNA encoding Lhca4 protein (type 4 protein of light-harvesting complex of photosystem I) (partial) (7E-179) photosynthesis, light harvesting
R: CTGCACTAGGTTCTCGAACG
9366 F: AGTGAAAGCAACAACTTAGG 53 0 598 1 Tamarix hispida peroxiredoxin 2 (Prx2) mRNA, complete cds (5E-139) cell redox homeostasis, oxidation reduction
R: TCTGGCTTCATTGATTTGTC
9512 F: GTACTGGAGTAGCTGCACGA 59 1 99 415 2 Cycas revoluta class III HD-Zip protein HDZ32 gene, partial cds (4E-52) regulation of transcription, DNA-dependent
R: TACAAAGTGCTGCACAGCAG
9652 F: TGCAAAGAAAGTCAAGGCGA 58 2 418 913 2 Pinus pinaster COBRA-like protein gene, partial cds (0)
R: CCCATACGGTGTTAATGGCT
11301 F: GATGTTGTTCGTGCAAAGAC 54 0 490 2 Pinus pinaster mRNA for malate dehydrogenase (MDH gene) (0.0) malate metabolic process, oxidation reduction, tricarboxylic acid cycle, glycolysis
R: GCGAACTTAATTCCCTTCTC
13329 F: GATATGTGCCCAAGAACATTCTG 57 0 350 1 PREDICTED: Glycine max probable rhamnose biosynthetic enzyme 1-like (LOC100789909), mRNA (7E-87)
R: CCTTGCATGCTTCAAGAAGG
13536 F: CTGCTGATTCTGATCAGTCC 56 0 368 2 Pinus thunbergii PtANTL1 mRNA for AINTEGUMENTA-like protein, complete cds (0.0) regulation of transcription, DNA-dependent
R: TCCACAATGCAAACATAGGC
14455 F: GAACAAGATCGACTACTGCC 56 0 834 2 Pinus taeda mRNA for alpha-1,6-xylosyltransferase (x34.1 gene) (0.0) root hair elongation, xyloglucan biosynthetic process
R: TTTGATGGCCTTGAAAGCAG
14479 F: CCACTCCCAAGTACTCAAAGG 57 0 588 1 Picea abies mRNA for translation elongation factor-1 alpha, partial (0.0) translational elongation
R: CAAGTGTGGCAATCCAACAC
14514 F: GGGTTCTGATTCTCCAAAGG 56 0 322 2 Metasequoia glyptostroboides fructose-1,6-diphosphate aldolase mRNA, complete cds (2E-74) pentose-phosphate shunt, response to salt stress, glycolysis, response to cadmium ion
R: CTGCATACTTGGCCAAAGTG
14585 F: TCTTGAATTCTTCCTATGTCCCAG 57 1 193 915 1 PREDICTED: Vitis vinifera galacturonosyltransferase 8-like (LOC100258818), mRNA (6E-119) homogalacturonan biosynthetic process
R: AATTGCACATCTGCACAAACTC
14887 F: GGTTAGACCAGTTCATAACC 53 0 1156 2 PREDICTED: Glycine max elongation factor 2-like (LOC100788357), mRNA (0.0)
R: GTCTTCAAACTCTGACAAGG
15135 F: TTGCAGGACTTCTTTAATGG 53 0 657 2 Ricinus communis heat shock protein, putative, mRNA (0.0) oxidation reduction, response to stress, auxin biosynthetic process
R: TCTTCTTGTCAGATGGATCC
15337 F: TTTATTGTATTCCTCCTAGGCCAG 57 1 232 1086 1 Picea glauca isolate D8411049-162 cellulose synthase family protein gene, partial sequence (0.0) cellulose biosynthetic process, cellular cell wall organization
R: CACAATCTAAGCCACATTCTTCC
15484 F: TTCGACGCCAACGTTATCTG 58 0 663 2 Pinus pinaster phenylalanine ammonia-lyase (pal2) mRNA, complete cds (0.0) phenylpropanoid metabolic process, biosynthetic process, l-phenylalanine catabolic process
R: GGCCCAGAGAATTGACATCC
15727 F: CACTGAAGGTTGTGGACGAG 58 0 325 2 Pinus pinaster mRNA for cytosolic serine hydroxymethyltransferase (cshmt gene) (2E-138) l-serine metabolic process, one-carbon metabolic process, glycine metabolic process
R: GTTCAGAAGGGCTGTGTAGG
15811 F: TTCGAGATCATCTGGACTGC 57 0 438 2 Abies alba genotype Lamacce 1 chalcone synthase (CHS) gene, CHS-A8 allele, complete cds (0) biosynthetic process
R: CGACTGTTTCGACAGTGAGG
15969 F: GGAAACCTTCTTGTTCACATCTG 57 0 990 1 Pinus contorta S-adenosylmethionine synthetase (sams2) mRNA, complete cds (0.0) auxin biosynthetic process, one-carbon metabolic process
R: CTTGTCTGGAATCCTCCCTG
16727 F: GGTGACTGTGAAGGCAATGG 58 0 331 2 Populus EST from severe drought-stressed opposite wood (0.000000003) lipid transport
R: TCCCACATTTCTTTCCAGCT
16816 F: CATCTTGGCTTCGTGATTGTC 57 3 132 562 2 Pseudotsuga menziesii class III homeodomain-leucine zipper (C3HDZ1) gene, complete cds (0)
R: TGCAATTTGGCGTAATCGAC
16883 F: CTCACAGAGGTCAGAAAGAATGG 58 0 710 1 Pinus contorta S-adenosylmethionine synthetase (sams2) mRNA, complete cds (0.0) auxin biosynthetic process, one-carbon metabolic process
R: CTGCTTCAAAGGTTTGACAATCTC
16979 F: CCTGGATAGTGAAATTGGAGG 55 0 535 1 Keteleeria davidiana chloroplast DNA, complete sequence (0) auxin biosynthetic process
R: ATCCTTCTCTGAATGAGTTTCG
17340 F: CTTGGTTAATTTCCGTCCTG 54 0 281 2 Abies firma chloroplast, partial genome (0) transport, photosynthesis, electron transport chain
R: CAGCTCCTACATTTAAACCC
17637 F: TGCTGAGAAAGTTGATTCTTCC 56 0 424 1 Ricinus communis transferase, transferring glycosyl groups, putative, mRNA (2E-69)
R: GTATTCGAGGTGTAGATTGCTG
17975 F: CAAACATTGCTGCAAAGCTC 56 2 94 547 1 Ricinus communis cysteine synthase, putative, mRNA (1E-65) cysteine biosynthetic process from serine
R: CCTATTCCAGCAACCAATATGTC
18135 F: GAGACTTTGGATTCGATCCC 55 1 132 683 2 Picea abies mRNA for putative chlorophyll A-B binding protein, (pPA0001 gene) (0) photosynthesis, light harvesting in photosystem I
R: AGAAGGCCGCAAATATAGTG
18444 F: ATTAATCTTTGCAGGGAAGC 54 0 313 2 P. sylvestris mRNA for polyubiquitin (3E-116)
R: AGACGAGATGAAGTGTAGAC
18599 F: GGAATGCATGATCCATTTCTG 55 0 678 1 S. tuberosum mRNA for NADH dehydrogenase, NADH-binding subunit (complex I) (0.0) oxidation reduction
R: TACCTGAATTGTTCTTGCGA
18680 F: CTGCGATGGATAAACTACCT 55 1 214 465 2 Picea glauca isolate D761009-28 myb family protein gene, partial sequence (1E-140) regulation of transcription
R: GCTAGTGTTGCTATTGTGGG
19005 F: GGAGATTGAGCAACGAAGAG 56 0 368 1 Abies firma chloroplast, partial genome (0) auxin biosynthetic process
R: TTTGAATCCCTGAAATCCTGG
19173 F: AGAACCAATCCCTGTTACAC 55 0 343 2 Ricinus communis proteasome subunit alpha type, putative, mRNA (2E-84) defense response to bacterium, ubiquitin-dependent protein catabolic process, response to zinc ion
R: GATCAGTTCCAATCACACCT
19540 F: ACCAATTCTCTTGTTCTCGG 55 0 634 1 Cedrus deodara chloroplast DNA, complete sequence (0) plasma membrane ATP synthesis coupled proton transport, auxin biosynthetic process
R: CGAACCATGTAAAGATCATTCC
20156 F: ATGGATCCCTGGAATTTATGC 55 0 386 1 Picea sitchensis isolate VD401 magnesium chelatase H-like protein mRNA, partial cds (3E-110) RNA processing, chlorophyll biosynthetic process
R: ATACTCTACCTACTACAGAATCCC
20318 F: ACAGCTCCCATTAATCTGAC 55 0 356 1 PREDICTED: Glycine max cellulose synthase-like protein D3-like (LOC100785985), mRNA (6E-69) root hair elongation, cellulose biosynthetic process, response to cold, cellular cell wall organization, plant-type cell wall biogenesis
R: CCAGAATTGTTCATTTCTCCAC
20694 F: GTCGAACAATGAAGACGAGG 56 0 346 1 PREDICTED: Vitis vinifera zinc finger CCCH domain-containing protein 49-like (LOC100259323), mRNA (6E-43) cell wall modification, regulation of transcription
R: TGTGAGCGAAGAAACAAACC
21136 F: AGACTGGTGTTACATTTGCGT 57 1 229 535 1 P. taeda gene for protochlorophyllide reductase (3E-168) oxidation reduction, chlorophyll biosynthetic process, photosynthesis
R: CCAACAAGCTTCTCACTAATTTCC
21165 F: ATGCACGATGTTCTTGATGC 57 2 204 644 1 PREDICTED: Glycine max premRNA-processing-splicing factor 8-like (LOC100804026), mRNA (4E-46) response to hypoxia, sucrose biosynthetic process
R: GGTGTCATGTTTATATGACAGTGG
21173 F: ACATTGTTGCTAACGATCCG 56 0 333 2 Picea sitchensis isolate VA100 basic endochitinase-like protein mRNA, partial cds (1E-138) cell wall macromolecule catabolic process, chitin catabolic process
R: AGACGAGGTAGAGATTGAGC
21890 F: GAAAGCTTACAGGGAAGCAG 55 1 358 607 2 Picea sitchensis isolate VD401 SWAP domain-containing protein-like protein mRNA, partial cds (2E-116) RNA processing
R: ACGATATCCAAGCATCATCC
21957 F: AACAACTTCACAGTTTCTCC 54 0 292 2 Abies firma chloroplast, partial genome (2E-157) auxin biosynthetic process, chlorophyll biosynthetic process, oxidation reduction, photosynthesis, dark reaction
R: GGAATCGGTAAATCAACGAC
22174 F: GATGATCCGGTTCGAATACC 55 0 334 1 Abies firma chloroplast, partial genome (6E-157) regulation of apoptosis, transcription, DNA-dependent
R: AAACGTAAGATACAAGTGGGTG
23660 F: AGGAAGATGTTAGGCTCGGG 58 1 781 1232 2 P. abies mRNA for porin Mip1 (6E-157) response to water deprivation, water transport, transmembrane transport
R: GAAGCCCTTCACAACTCCAG
23809 F: ATGCGCTCTATGTTAGAACG 55 0 1058 2 Abies firma chloroplast, partial genome (9E-168) oxidation reduction, chlorophyll biosynthetic process, photosynthesis, dark reaction
R: AATCTCAAGACGTTTACCGA
23850 F: GAAGATTTATTCGGCAACTG 52 1 449 695 2 Pinus taeda mitogen-activated protein kinase 6 (MAPK6) mRNA, complete cds (4E-65) auxin biosynthetic process, protein amino acid phosphorylation, conjugation, mitosis, cell division
R: ATCTGATCCTCTGTTAAGGT
23982 F: TGAGACTTGCTTGGGAAGAG 57 2 586 921 1 Pisum sativum nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase (gapN) gene, complete cds (4E-33) metabolic process
R: AGCCCATTGTAAACGAAGGA
24523 F: TTCAGACTCGAACGTTTGCA 58 0 449 2 Ginkgo biloba catalase mRNA, complete cds (3E-98) hydrogen peroxide catabolic process, oxidation reduction
R: AAGCTTTCATTCCCAGACGG
24699 F: AAGATAAGCAGTTTGCTGCA 56 0 262 2 Ageratina adenophora heat shock protein 70.58 mRNA, complete cds (2E-81) auxin biosynthetic process, response to stress
R: AACATTCTTCTCGCCAACAG
24902 F: CCCTCTCAATCTTGAGGATGC 58 1 240 662 1 Arabidopsis thaliana ferredoxin–NADP+ reductase (RFNR2) mRNA, complete cds (3E-54) electron transport chain
R: CAGATGGACCTGTAATTTGAACCT
25060 F: CTGCAAGATACTTCAAAGATGCAC 58 2 163 624 1 PREDICTED: Glycine max ATP-citrate synthase beta chain protein 1-like (LOC100800904), mRNA (2E-74) acetyl-CoA biosynthetic process, cellular carbohydrate metabolic process
R: ATTTGGTGTAGAGAACATCTTCCC
26089 F: GATTATTGATTCTACCACCGGA 55 1 281 1233 1 Rosa multiflora elongation factor 1-alpha mRNA, complete cds (0.0) translational elongation
R: TTTCTCAACAGCCTTGATGAC
26764 F: GGGAATTGGCTCGTATCTGG 58 0 359 1 Cucumis sativus 6-phosphogluconate dehydrogenase (6PGDH) mRNA, complete cds (7E-55) response to glucose stimulus, response to sucrose stimulus, response to fructose stimulus, response to salt stress, pentose-phosphate shunt, oxidation reduction, response to cadmium ion
R: GTTCTGCTTAGCAATCTTTGTCC
27033 F: TTTACTCCACCATTACGAGG 55 0 948 2 Medicago truncatula heat shock protein (MTR_7g024390) mRNA, complete cds (0) response to virus, auxin biosynthetic process, protein folding, response to heat, response to bacterium, response to cadmium ion, response to high light intensity, response to hydrogen peroxide, protein amino acid phosphorylation
R: TTCGCAATGATAGGATTGCA
27963 F: TAGGCCCATAGCTAACAAACC 57 0 318 1 Keteleeria davidiana chloroplast DNA, complete sequence (0) transcription, DNA-dependent
R: TCGAATTGTTTCATCCTCCCA
28203 F: TGTGGACGAGGAGATATTCG 56 0 315 2 Pinus pinaster mRNA for cytosolic serine hydroxymethyltransferase (cshmt gene) (1E-128) l-serine metabolic process, one-carbon metabolic process, glycine metabolic process
R: TTCAGAAAGGGCTGTGTAGG
28456 F: GATTTCGAGAGCTGGTATCCC 58 0 853 1 Ricinus communis oligosaccharyl transferase, putative, mRNA (7E-169) protein amino acid glycosylation
R: AGCTGTCGGTTGATGTTCTG
28639 F: GTAGAATAAGTGGGAGCCGT 57 0 438 2 Abies fabri 26S ribosomal RNA gene, partial sequence (0)
R: ATAGGAAGAGCCGACATCGA
29437 F: CTTCAGGTGCTCGATATCGT 56 0 403 2 Populus trichocarpa argonaute protein group (AGO911), mRNA (2E-99)
R: TCAACTGGAAACGTTAGCTC

Note: — = not available; Ta = annealing temperature.

a Values are based on the sequence of one sample randomly chosen from a sample set of 80 trees from a population at Mont Ventoux (France).
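The intron counts in Table 2 can be tallied to cross-check the summary given in the text (43 introns in 26 genes). The sketch below lists only the rows with a nonzero "No. of introns" entry, keyed by gene locus ID as transcribed from the table; the variable names are assumptions for illustration.

```python
# Gene locus ID -> number of introns, for all Table 2 rows with introns.
INTRONS_BY_LOCUS = {
    215: 2, 241: 3, 323: 2, 1704: 2, 2387: 1, 6594: 1, 6757: 5,
    8248: 1, 8855: 2, 9512: 1, 9652: 2, 14585: 1, 15337: 1, 16816: 3,
    17975: 2, 18135: 1, 18680: 1, 21136: 1, 21165: 2, 21890: 1,
    23660: 1, 23850: 1, 23982: 2, 24902: 1, 25060: 2, 26089: 1,
}

genes_with_introns = len(INTRONS_BY_LOCUS)
total_introns = sum(INTRONS_BY_LOCUS.values())

# Locus with the most introns among the resequenced fragments.
most_introns_locus = max(INTRONS_BY_LOCUS, key=INTRONS_BY_LOCUS.get)

print(genes_with_introns, total_introns, most_introns_locus)
```

The tally matches the figures reported in the results, which suggests the table rows and the text summary are consistent.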

CONCLUSIONS

The two approaches of the workflow are complementary, each contributing approximately half of the annotations in the final set of sequences. The standard approach can be run rapidly but targets only well-known genes. The specific approach, which is based on a review of the relevant literature, is novel and provided a substantial number of nonredundant annotations. A further advantage is that it can easily be adjusted and extended to suit the researcher’s interest. The quality-tested primers can be used for assessing the degree of gene polymorphism in ecological genetics studies.

LITERATURE CITED

  1. Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J. 1990. Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
  2. Ashburner M., Ball C. A., Blake J. A., Botstein D., Butler H., Cherry J. M., Davis A. P., et al. 2000. Gene Ontology: Tool for the unification of biology. Nature Genetics 25: 25–29.
  3. Conesa A., Götz S., García-Gómez J. M., Terol J., Talón M., Robles M. 2005. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21: 3674–3676.
  4. González-Martínez S. C., Ersoz E., Brown G. R., Wheeler N. C., Neale D. B. 2006. DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L. Genetics 172: 1915–1926.
  5. Huang H.-R., Yan P.-C., Lascoux M., Ge X.-J. 2012. Flowering time and transcriptome variation in Capsella bursa-pastoris (Brassicaceae). New Phytologist 194: 676–689.
  6. Marshall O. 2004. PerlPrimer: Cross-platform, graphical primer design for standard, bisulphite, and real-time PCR. Bioinformatics 20: 2471–2472.
  7. Mosca E., Eckert A. J., Liechty J. D., Wegrzyn J. L., La Porta N., Vendramin G. G., Neale D. B. 2012. Contrasting patterns of nucleotide diversity for four conifers of Alpine European forests. Evolutionary Applications 5: 762–775.
  8. Ogata H., Goto S., Sato K., Fujibuchi W., Bono H., Kanehisa M. 1999. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research 27: 29–34.
  9. Parchman T. L., Geist K. S., Grahnen J. A., Bankman C. W., Buerkle C. A. 2010. Transcriptome sequencing of an ecologically important tree species: Assembly, annotation, and marker discovery. BMC Genomics 11: 180.
  10. Street N. R., Skogström O., Sjödin A., Tucker J., Rodríguez-Acosta M., Nilsson P., Jansson S., Taylor G. 2006. The genetics and genomics of the drought response in Populus. Plant Journal 48: 321–341.

Articles from Applications in Plant Sciences are provided here courtesy of Wiley
