Abstract
The Gulf killifish, Fundulus grandis, is a small teleost fish that inhabits marshes of the Gulf of Mexico and demonstrates high tolerance of environmental variation, making it an excellent subject for the study of physiological and molecular adaptations to environmental stress. In the present study, two-dimensional (2D) gel electrophoresis and matrix-assisted laser desorption/ionization time-of-flight tandem mass spectrometry were used to resolve and identify proteins from five tissues: skeletal muscle, liver, brain, heart, and gill. Of 864 protein features excised from 2D gels, 424 proteins were identified, corresponding to a 49% identification rate. For any given tissue, several protein features were identified as the same protein, resulting in a total of 254 nonredundant proteins. These nonredundant proteins were categorized into a total of 11 molecular functions, including catalytic activity, structural molecule, binding, and transport. In all tissues, catalytic activity and binding were the most highly represented molecular functions. Comparing across the tissues, proteome coverage was lowest in skeletal muscle, due to a combination of a low number of gel spots excised for analysis and a high redundancy of identifications among these spots. Nevertheless, the identification of a substantial number of proteins with high statistical confidence from other tissues suggests that F. grandis may serve as a model fish for future studies of environmental proteomics and ultimately help to elucidate proteomic responses of fish and other vertebrates to environmental stress.
Introduction
Proteomic approaches are being increasingly applied to comparative and integrative biology, especially to address the question of the effects of environmental stress on patterns of protein expression in a variety of organisms (Martyniuk and Denslow 2009; Forné et al. 2010; Karim et al. 2011; Sanchez et al. 2011; Tomanek 2011). Generally, proteomic analyses such as these face several challenges including accurately identifying the proteins from “nonmodel” organisms; understanding the interactions among these proteins; and, ultimately, elucidating the links among changes in protein expression, biological function, and a specific environmental insult. In the present study, we address some of these challenges in a proteomic analysis of multiple tissues of the Gulf killifish, Fundulus grandis. This small teleost fish occurs in estuarine habitats where it faces dramatic variation in temperature, salinity, oxygen, and other abiotic factors. Remarkable tolerance of environmental variation makes this species and the closely related F. heteroclitus excellent models for the study of physiological and molecular adaptations to environmental stress (Burnett et al. 2007).
The standard proteomic workflow requires the separation of complex mixtures of proteins or peptides, followed by their identification by mass spectrometry (MS) and database searching. The main separation techniques are liquid chromatography (LC) and two-dimensional (2D) gel electrophoresis (Gygi et al. 2000; Frohlich and Arnold 2006). Regardless of the separation technique, the method of choice for protein identification is MS, using soft-ionization techniques such as electrospray ionization (ESI) (Fenn et al. 1989) or matrix-assisted laser-desorption ionization (MALDI) (Karas and Hillenkamp 1988). In tandem MS (MS/MS), selected peptides are fragmented into a series of ions whose masses provide information about the amino-acid sequence of the parent peptide (Aebersold and Mann 2003). Proteins are then identified by searching the masses of peptides or fragment ions against protein or translated nucleotide databases using search engines, such as Sequest (Eng et al. 1994), ProteinProspector (Clauser et al. 1999), and MASCOT (Perkins et al. 1999). The MASCOT algorithm calculates the probability that a match between an experimental set of data and a sequence in the database is due to random chance, with the match having the lowest probability of being random assigned the highest score. The MASCOT score is compared against a threshold score that depends upon the size of the database and other search parameters, and matches with scores equal to, or greater than, the threshold are considered identified.
With any large-scale database approach to searching, which might include hundreds or even thousands of individual searches, false-positive identification is a major concern (Paterson 2003). A false-positive identification occurs when a match of MS data returns a score equal to or greater than the threshold and ranks first among all matches, but is due to random chance and is an incorrect assignment. An application within MASCOT was developed to assess the scoring accuracy for a given database search strategy. The “target-decoy search” script creates a “randomized sequence database,” in which the amino-acid sequences of all proteins in the database are randomized, creating new protein sequences. If MS or MS/MS spectra match a randomized sequence better than a true sequence, then this is considered a false positive (FP). The number of random matches in a large-scale experiment is then used to determine the false-positive rate, which in turn is a gauge of the reliability of the identifications of proteins (Elias and Gygi 2007).
Recently, proteomic approaches have been employed to address questions of physiology, toxicology, development, and infection in fishes (Martyniuk and Denslow 2009; Forné et al. 2010; Karim et al. 2011; Sanchez et al. 2011; Tomanek 2011). Zebrafish are popular experimental models for a variety of reasons (Westerfield 2007), not the least of which is the presence of a complete genome sequence (Postlethwait et al. 1999). Accordingly, the zebrafish has been used to examine patterns of protein expression in cells, embryos, and adults (Bosworth et al. 2005; De Souza et al. 2009; Forné et al. 2010). In addition, the proteome has been studied in commercially important species (Martin et al. 2007; Forné et al. 2010) and in fish that serve as environmental reporters (Martyniuk et al. 2009; Karim et al. 2011). While these studies have led to important insights into the proteomic responses of fish to various biological and environmental changes, they generally focus their analyses on a single tissue. It has been argued that to understand the integrated organismal response to physiological, developmental, or pathological change, a multi-tissue approach to proteomics is needed (Capitanio et al. 2009; Dowd et al. 2010; Dowd 2012).
In the genus Fundulus, F. heteroclitus and F. grandis are widespread estuarine species with pronounced tolerance of variation in their physiochemical environment (Nordlie 2006). This tolerance, combined with ease of collection and of maintenance in the laboratory, has led to the promotion of Fundulus as models for environmental genomic studies (Burnett et al. 2007). The goal of the current study was to apply 2D electrophoresis and MALDI–TOF/TOF (time-of-flight) MS to characterize patterns of protein expression in multiple tissues of the Gulf killifish, F. grandis. In particular, we wanted to assess the rate and reliability of protein identification in a species without a sequenced genome as a step toward developing this species for studies of environmental proteomics.
Materials and methods
Maintenance of fish and preparation of samples
Fundulus grandis were purchased from a bait store and kept in 40 L aquaria at room temperature (22–26°C) in dechlorinated tap water adjusted to ∼5 psu with Instant Ocean Sea Salt. Aquarium water was aerated and filtered through charcoal and biological filters. Fish were fed ad libitum with TetraMin Tropical Flake fish food (Blacksburg, VA, USA) once a day and maintained in the laboratory at least 2 weeks prior to sampling their tissues. Fish were fasted 24 h prior to use and euthanized on the day of tissue sampling with an overdose of tricaine methanesulfonate (MS 222, 1 g/L, pH buffered with 4 g NaHCO3). Only male fish were used in this study, and all research conformed to national and institutional guidelines for research on vertebrate animals (protocol no. UNO-10-001).
White skeletal muscle, liver, brain, and heart were harvested by dissection from six or seven fish and frozen immediately in liquid nitrogen. Skeletal muscle samples were taken from the epaxial musculature dorsal to the lateral line, between the head and the dorsal fin. The white skeletal muscles of fish are composed mainly of fast-twitch glycolytic-type fibers (Johnston 1981). Gills from the same fish were dissected and immersed in RNAlater® (Ambion, Grand Island, NY, USA) at room temperature (Abbaraju et al. 2011). All tissue samples were stored at −80°C until prepared for electrophoresis. For muscle, liver, brain, and gill, samples of 20–50 mg were homogenized in glass–glass homogenizers (Kontes, Vineland, NJ, USA) in 500 µL lysis buffer containing 7 M urea, 2 M thiourea, 2% (w/v) CHAPS, 1% (w/v) ASB-14, 40 mM dithiothreitol (DTT). Hearts (5–10 mg total mass) were homogenized in 200 µL lysis buffer. Tissue homogenates were made on ice and centrifuged at 2400 g for 15 min at 8°C. Tissue supernatants were stored at −80°C until further analysis.
2D gel electrophoresis
Protein concentrations of the supernatant solutions were determined using Amersham Biosciences 2D Quant kit (GE Health Care, Piscataway, NJ, USA). Samples of ∼1 mg protein per tissue were obtained by pooling tissue supernatants from all fish. The use of pooled protein samples allowed separation and characterization of proteins from several individuals in a single 2D electrophoresis experiment; however, we could not address individual variation in protein expression in this study. Proteins were precipitated by trichloroacetic acid–acetone (Damerval et al. 1986) and redissolved in rehydration buffer containing 7 M urea, 2 M thiourea, 2% (w/v) CHAPS, 40 mM DTT, 0.5% (v/v) IPG buffer, and 0.001% bromophenol blue. Protein concentration was determined again, and samples of 600 µg protein were used for 2D electrophoresis as described below. Two protein samples (technical replicates) were prepared and analyzed for all tissues except heart, which yielded enough protein for only one pooled sample.
First dimension electrophoresis, isoelectric focusing (IEF), was performed using immobilized pH gradient (IPG) strips with Ettan IPGphor II Isoelectric Focusing Unit (GE Health Care). For tissues other than skeletal muscle, samples were loaded on 13 cm 3–10 NL IPG strips by active rehydration overnight in a total volume of 250 µL. IEF was performed using a five-step protocol: 30 V for 10 h (active rehydration); 500 V for 1 h (linear); 1000 V for 1 h (gradient); 8000 V for 2 h 30 min (gradient); and 8000 V for 55 min (linear). Total electrophoresis was for 20 kVh and the current was limited to 50 µA per strip. For skeletal muscle, protein was cup-loaded for first dimension electrophoresis. The IPG strips were rehydrated overnight in 250 µL of rehydration buffer containing 1.2% (v/v) destreak solution (GE Health Care) without DTT. The sample was applied and IPG strips were focused in IEF following a three-step protocol: 500 V for 1 min (gradient); 4000 V for 1 h 30 min (gradient); and 8000 V for 1 h 50 min (linear). Total electrophoresis was for 8.8 kVh. Current was limited to 50 µA per strip. The IPG strips were frozen at −80°C until further analysis.
Prior to second dimension electrophoresis, the IPG strips were incubated for 10 min at room temperature in equilibration buffer (6 M urea, 75 mM Tris–HCl pH 8.8, 29.3% [v/v] glycerol, 2% [w/v] SDS) containing 1% (w/v) DTT followed by another 10 min in equilibration buffer containing 2.5% (w/v) iodoacetamide. Second dimension was sodium dodecyl sulfate polyacrylamide gel electrophoresis according to Laemmli (1970). Gels (16 cm × 18 cm × 1 mm) were 12.5% (v/v) polyacrylamide (37.5:1, acrylamide: bisacrylamide). The equilibrated IPG strips were fixed on the top of the gels with 0.5% (w/v) agarose in 25 mM Tris base, 192 mM glycine, 0.1% (w/v) SDS, and 0.002% (w/v) bromophenol blue. Running buffer was 25 mM Tris base, 192 mM glycine, and 0.1% (w/v) SDS buffer. Gels were electrophoresed at 15 mA per gel for 15 min followed by 60 mA per gel for 3 h at 25°C.
Imaging of gels, spot excision, and trypsin digestion
After 2D electrophoresis, gels were stained using modified Neuhoff's colloidal Coomassie protocol (Peisker 1988). Gels were fixed overnight in 100 mL 50% (v/v) ethanol: 3% (v/v) phosphoric acid. Gels were then washed in three 30 min changes of 500 mL distilled water followed by staining in 300 mL of 34% (v/v) methanol, 17% (w/v) ammonium sulfate, 3% (v/v) phosphoric acid, and 0.0066% (w/v) Coomassie brilliant blue G-250 (SERVA, New York, NY, USA). Gels were de-stained in water for 3 days and imaged using a GS 700 densitometer (Bio-Rad Laboratories, Hercules, CA, USA). Protein spots were detected by PDQUEST™ 2D analysis software (Bio-Rad Laboratories) and ranked from highest to lowest staining intensity. The 96 most intense spots for skeletal muscle and 192 most intense spots for liver, brain, heart, and gill were excised from the gels using EXQuest™ spotcutter (Bio-Rad Laboratories) and collected in 96-well plates. Proteins were digested using 130 ng sequencing-grade trypsin (Promega, Madison, WI, USA) per sample by the Investigator Progest™ automated digester (Genomic Solutions, Ann Arbor, MI, USA).
MS and database searching
A saturated solution of α-cyano-4-hydroxycinnamic acid was prepared in 50:50 acetonitrile:water containing 0.1% (v/v) trifluoroacetic acid (matrix solution). The tryptic digests from 96-well plates were mixed with an equal volume of matrix solution and samples of 0.6 µL were spotted in duplicate on MALDI plates. The MS and MS/MS spectra of the tryptic digests were acquired with an Applied Biosystems 4800 MALDI–TOF/TOF mass spectrometer (Foster City, CA, USA) in the positive reflectron mode. The total accelerating voltage was (+) 20,000 V with a delay time of 510 ns. Peptide mass fingerprints were acquired using 400 laser shots in a mass range between m/z 800 and 4000. Two trypsin autolysis peaks at m/z 842 and 2211 were used as internal standards for MS calibration. The ten most intense peptide precursors were selected and MS/MS spectra were acquired using 1000 laser shots.
The MS and MS/MS spectra were transferred to GPS Explorer™ software, version 3.6, that used an underlying search algorithm of a locally installed copy of MASCOT, version 2.1 (Matrix Science; http://www.matrixscience.com). The data were searched against the Actinopterygii fishes subset of the nonredundant sequences deposited in the NCBI database (updated on July 14, 2010). Parameters for MASCOT searches were set as follows: fixed modification = carbamidomethylation of cysteine residues; variable modification = oxidation of methionine residues; maximum missed cleavages = 1; mass tolerance = 100 ppm; and MS/MS fragment tolerance = 0.5 Da. Using these parameters, significant matches had MASCOT scores of 67 and higher. When the top match was a protein of unknown function, its sequence was compared to other fish sequences using BLAST (Altschul et al. 1990) to determine potential homologs.
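To make the 100 ppm mass tolerance concrete, the following is a minimal sketch of how a relative (parts-per-million) mass error is usually computed and compared against a tolerance. The function names and the example m/z values are hypothetical illustrations; this is not the internal matching logic of MASCOT or GPS Explorer.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed peak, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 100.0) -> bool:
    """True if the observed peak lies within +/- tol_ppm of the theoretical value."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical peptide with a theoretical m/z of 1500.00.
# At 100 ppm the acceptance window is about +/- 0.15 m/z units.
print(within_tolerance(1500.10, 1500.00))  # error of about 67 ppm
print(within_tolerance(1500.20, 1500.00))  # error of about 133 ppm
```

Because the tolerance is relative, the absolute window widens with m/z: roughly ±0.08 at m/z 800 and ±0.4 at m/z 4000, the limits of the acquisition range used here.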
Gene ontology (GO) annotations were accomplished using PANTHER (Mi et al. 2005), QuickGO (Binns et al. 2009), and manually searching for homologies. The list of total identified proteins was first reduced to a list of nonredundant protein for each tissue. The GI accession numbers of these proteins were uploaded to PANTHER and the proteins were categorized based on molecular function in all five tissues. The unannotated proteins from PANTHER were further analyzed by QuickGO and by manual searching of the GO website (Ashburner et al. 2000).
All chemicals were of reagent or electrophoresis grade and, unless specified above, were purchased from GE Health Care, Bio-Rad Laboratories, Sigma Chemical Company (St Louis, MO, USA), or Fisher Scientific (Pittsburgh, PA, USA).
Results and discussion
Tissue sampling and 2D electrophoresis
Central to accurate proteomic analyses is the ability to capture the complement of proteins present in a tissue in a state as close as possible to the in vivo condition. Recently, we used one-dimensional gel electrophoresis to show that rapid freezing of tissues in liquid nitrogen preserved a broad range of proteins for many tissues, but failed to recover specific proteins of high-molecular weight in gills of F. grandis (Abbaraju et al. 2011). Accordingly, we evaluated recovery of protein in samples preserved in RNAlater®, a reagent developed to stabilize RNA, and found that rapid immersion of gills in RNAlater® improved the recovery, especially of high-molecular weight species. In the current 2D electrophoresis study, therefore, we sampled muscle, liver, brain, and heart by snap-freezing in liquid nitrogen, but we sampled gill in RNAlater®.
Furthermore, in 2D electrophoresis-based proteomics, conditions for electrophoresis need to be optimized to ensure good resolution of proteins. For skeletal muscle, the abundance of myofibrillar proteins can cause vertical and horizontal streaking, thereby interfering with resolution (Ohlendieck 2011). This problem was partially mitigated by rehydrating IPG strips in the presence of destreak reagent prior to cup-loading muscle proteins and first-dimensional electrophoresis. This led to an artifact in the low pH range, where samples were applied to IPG strips, but allowed good resolution of muscle proteins in the rest of the gel (Fig. 1). For other tissues, active rehydration of IPG strips overnight was used prior to first-dimensional electrophoresis, and the resulting 2D gels were characterized by well-distributed protein profiles with minimal artifacts (Fig. 1). Colloidal Coomassie staining and image analysis detected approximately 200 protein spots in skeletal muscle and between 400 and 600 protein spots in the other tissues. The lower number of proteins detected in muscle is due to a few abundant proteins comprising a larger amount of the total protein loaded on the gels, as observed in skeletal muscle in other species (Bosworth et al. 2005; Gelfi et al. 2006; Ohlendieck 2011).
Fig. 1.
Representative 2D gel images of five tissues from Fundulus grandis: skeletal muscle, liver, brain, heart, and gill. Protein equivalent to 600 μg was loaded on each gel. The gels were stained with colloidal Coomassie blue.
Identification of proteins
A total of 864 protein spots were excised from 2D separations of the five F. grandis tissues, digested, and analyzed by MALDI–TOF/TOF MS. The resultant mass spectra were searched against the Actinopterygii protein data of the NCBI and led to the identification of 424 proteins (Tables S1–S6 of Supplementary Data). The results of database searching are summarized in Table 1 and Fig. 2. MASCOT scores of the 424 protein assignments ranged from 67 (the threshold for a positive match) to over 700 (Fig. 2A). The median MASCOT score for matches varied from 125 (liver) to 256 (muscle) and the median score was 145 across all tissues (Table 1). MASCOT scores are logarithmically related to the probability that matches are random, meaning that high MASCOT scores reflect exceedingly low probabilities that these matches occurred by chance. Proteins were identified by matching anywhere from 1 to 30 peptides (Fig. 2B), with ∼75% of the assignments of proteins based upon six or more peptides matched. The median number of peptides matched varied from 7 (liver) to 11 (muscle and gill) and the median value was nine across all tissues (Table 1). When specific peptides yielded ion fragments in MS/MS that matched the database, the ion scores were included in the MASCOT score, meaning that fewer peptides were necessary to get a score above the threshold for a positive identification. Matched peptides accounted for anywhere from 5% to 83% sequence coverage (Fig. 2C). The median value for sequence coverage ranged from 24% (liver) to 38% (muscle) and the median coverage was 27% across all tissues (Table 1).
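The logarithmic relationship between score and probability can be illustrated with a short sketch. It assumes the relation given in the Matrix Science documentation, score = −10·log10(P), where P is the probability that the observed match is a random event; the text above states only that the relationship is logarithmic, so the exact formula and the function names below are assumptions for illustration.

```python
import math


def score_to_probability(score: float) -> float:
    """Probability that a match is random, assuming score = -10 * log10(P)."""
    return 10 ** (-score / 10)


def probability_to_score(p: float) -> float:
    """Inverse of score_to_probability under the same assumed relation."""
    return -10 * math.log10(p)


# Threshold score and median scores reported in Table 1
for label, score in [("threshold", 67), ("liver median", 125),
                     ("all-tissue median", 145), ("muscle median", 256)]:
    print(f"{label}: score {score} -> P = {score_to_probability(score):.2e}")
```

Under this assumption, the threshold score of 67 corresponds to a probability of roughly 2 × 10⁻⁷, and each additional 10 score units lowers the probability by a further factor of 10.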
Table 1.
Summary of MASCOT database searching of F. grandis proteins separated by 2D electrophoresis and identified by MALDI–TOF/TOF MS.
| Tissue | Protein assignments | Percent identified | MASCOT score (median) | Peptides matched (median) | Sequence coverage (median) (%) |
|---|---|---|---|---|---|
| Muscle | 49 | 51 | 256 | 11 | 38 |
| Liver | 93 | 48 | 125 | 7 | 24 |
| Brain | 118 | 62 | 143 | 8 | 26 |
| Heart | 93 | 48 | 147 | 9 | 26 |
| Gill | 71 | 37 | 172 | 11 | 33 |
| All tissues | 424 | 49 | 145 | 9 | 27 |
Median values are given for MASCOT score, peptides matched, and the percent coverage of the corresponding amino acid sequences. The threshold MASCOT score for a significant match was ≥67. The distributions of these values are shown in Fig. 2 and all values can be found in Tables S1–S5 of Supplementary Data.
Fig. 2.
Frequency distributions of (A) MASCOT scores, (B) number of matched peptides, and (C) percent sequence coverage for 424 protein identifications from tissues of Fundulus grandis. The threshold MASCOT score for a significant match was ≥67.
Positive assignment of 424 out of 864 protein spots corresponded to an identification rate of 49%. The identification rate varied from 37% (gill) to 62% (brain), and is in the range of values reported in proteomic studies of other fish species, including the model species zebrafish (Forné et al. 2010). More protein identifications were based upon matches to zebrafish homologs (123 out of 424) than any other species, a result that can be attributed to the presence of a complete zebrafish genome sequence. Matches to homologs in Atlantic salmon (Salmo salar; 40 out of 424) and the mummichog (Fundulus heteroclitus; 35 out of 424) accounted for the next highest number of positive identifications, with the remainder of the identifications based upon matches to homologs from a variety of other fishes. These results indicate that procedures for MS analysis and database searching are robust enough to allow reasonably high rates of identification of proteins in F. grandis, despite the lack of a genome sequence.
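As a check on the arithmetic, the identification rates can be recomputed from the number of spots excised per tissue (96 for skeletal muscle and 192 for each of the other tissues, as given in the methods) and the protein assignments in Table 1. The following is a minimal sketch; the variable and function names are illustrative only.

```python
# Spots excised per tissue (from the methods) and assignments (from Table 1)
spots_excised = {"Muscle": 96, "Liver": 192, "Brain": 192, "Heart": 192, "Gill": 192}
assignments = {"Muscle": 49, "Liver": 93, "Brain": 118, "Heart": 93, "Gill": 71}


def identification_rate(identified: int, excised: int) -> float:
    """Percentage of excised spots that returned a significant match."""
    return 100.0 * identified / excised


rates = {tissue: identification_rate(assignments[tissue], spots_excised[tissue])
         for tissue in spots_excised}
overall = identification_rate(sum(assignments.values()), sum(spots_excised.values()))

for tissue, rate in rates.items():
    print(f"{tissue}: {rate:.1f}%")
print(f"All tissues: {overall:.1f}%")
```

The recomputed values agree with Table 1 to within rounding; the brain value works out to about 61.5%, which is reported as 62%.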
Still, about half of the protein samples submitted to MS analysis were not identified, an observation that may have several explanations. First, some samples yielded poor spectra with few peaks and low signal-to-noise ratios. As expected, the occurrence of such samples increased as the abundance of protein decreased, as reflected by Coomassie staining. Indeed, when two rounds of spot excision and digestion were performed (all tissues except muscle), the identification rate was uniformly higher for the more abundant proteins in the first round (∼60%) than it was for the less abundant proteins in the second round (20–30%). For skeletal muscle, the low staining intensities of proteins in the second round of spot excision prevented reliable MS analyses of those samples (N. V. Abbaraju, M. N. Boutaghou, R. B. Cole, and B. B. Rees, unpublished data). Second, samples with good MS spectra will not return a match if a homolog is not in the database. More likely, a homologous protein occurs in the database, but it differs enough in sequence that it will not match the peptide masses or ion fragments from F. grandis. Third, the protein submitted to MS analysis may have been posttranslationally modified, altering its mass and resulting in a failure to match a database of unmodified protein sequences. These concerns can be addressed by increasing the amount of sample, improving analytical sensitivity, and continuing to develop databases of nucleic acid and proteins for “nonmodel” organisms.
Evaluating the false identification rate
With large data sets, hundreds of searches, and multiple identifications, there is the concern that some of the identifications are incorrect assignments due to random chance. To address this concern, we submitted each sample to a target-decoy search to evaluate the rate of false-positive identifications (Elias and Gygi 2007). A randomized database was created from the original protein database by shuffling the amino-acid sequence of each protein; the randomized database, therefore, has the same number of proteins, composed of the same amino acids, but with random sequences. This database was then concatenated with the original database, and the MS and MS/MS spectra were searched against this target-decoy database.
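The construction of the target-decoy database can be sketched as follows. This is an illustrative reimplementation of the idea (shuffle each sequence, then concatenate decoys with the targets), not the MASCOT script itself; the `DECOY_` prefix, the function names, and the two example sequences are hypothetical.

```python
import random


def shuffle_sequence(seq: str, rng: random.Random) -> str:
    """Return a random permutation of the residues in seq (same composition)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def build_target_decoy(targets: dict, seed: int = 0) -> dict:
    """Concatenate the target database with one shuffled decoy per target entry."""
    rng = random.Random(seed)  # fixed seed so the decoy set is reproducible
    combined = dict(targets)
    for name, seq in targets.items():
        combined["DECOY_" + name] = shuffle_sequence(seq, rng)
    return combined


# Two made-up sequences standing in for database entries
targets = {"protA": "MKTAYIAKQRQISFVK", "protB": "GSHMLEDPVKLTEEQW"}
db = build_target_decoy(targets)
for name, seq in db.items():
    print(name, seq)
```

Each decoy has the same length and amino-acid composition as its target, so the concatenated database contains twice as many entries, and any significant top-ranked hit to a `DECOY_` entry can be counted as a random match.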
We considered a match to be a false identification when the MASCOT score exceeded the threshold and when the match ranked first among all identifications. The most important result from this process was that none of the 424 protein assignments (Table 1 and Supplementary Data) matched a random sequence with a higher score than the match against an actual protein sequence. The second observation was that for the unmatched protein spots, only two matched random sequences. For gill, one random match had a MASCOT score of 68, while for liver one random match had a MASCOT score of 72. Thus, out of a total of 864 database searches, only two resulted in hits against a randomized protein sequence, corresponding to 0.23% of all searches.
Elias and Gygi (2007) presented an approach for evaluating the false-positive rate that expresses the random matches as a proportion of the correct assignments of proteins, rather than as a proportion of all searches. According to this approach, the number of FPs is the number of hits against the decoy database multiplied by two in order to account for random matches that would have occurred when searching the real database, but which were masked by real identifications with higher MASCOT scores. The number of true positives (TPs) is equal to the total number of assignments (424) minus the FPs (4), or a total of 420. Finally, the false-positive rate is equal to FP/TP (4/420 = 0.00952), or ∼1%. By this measure, as many as 4 out of the 424 identified proteins might be incorrectly assigned.
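The two ways of expressing the error rate can be reproduced with a few lines of code. This sketch simply encodes the calculation as described in the text (FP = 2 × decoy hits, TP = assignments − FP, rate = FP/TP); the function name is illustrative.

```python
def false_positive_rate(decoy_hits: int, total_assignments: int):
    """Return (FP, TP, FP/TP) following the calculation described in the text."""
    fp = 2 * decoy_hits              # doubled to account for masked random matches
    tp = total_assignments - fp      # assignments presumed correct
    return fp, tp, fp / tp


decoy_hits = 2          # one in gill, one in liver
total_searches = 864
total_assignments = 424

fp, tp, rate = false_positive_rate(decoy_hits, total_assignments)
print(f"FP = {fp}, TP = {tp}, FP/TP = {rate:.5f}")                        # FP = 4, TP = 420, FP/TP = 0.00952
print(f"Decoy hits per search = {100 * decoy_hits / total_searches:.2f}%")  # 0.23%
```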
In summary, 0.23% of all database searches matched random sequences with MASCOT scores exceeding the threshold of 67. Using the approach described by Elias and Gygi (2007), our false-positive rate was ∼1%. Both approaches suggest that ≥99% of the 424 gel spots were correctly identified, confirming the view that the vast majority of the protein assignments made by database searching with MASCOT are correct.
From protein lists to biological function
Among the 424 protein spots identified in five tissues, several were identified as the same protein. For example, in skeletal muscle, 12 spots were annotated as muscle-type creatine kinase and another 6 were identified as skeletal muscle α-actin. This multiplicity of protein identifications has been noted for structural proteins, metabolic enzymes, and other abundant proteins in previous 2D electrophoretic studies of fish tissues (Bosworth et al. 2005; Lucitt et al. 2008), and may represent different alleles, splice variants, posttranslational modifications, or degradation products. This multiplicity tends to inflate the representation of these proteins in subsequent characterization of tissue function. Therefore, to reduce redundancy in the data set, all spots identified as a given protein were considered to represent one nonredundant protein. Accordingly, the numbers of nonredundant identifications of proteins were 20 in skeletal muscle, 59 in liver, 61 in brain, 55 in heart, and 59 in gill, for a total of 254 nonredundant protein identifications (Table 2).
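The collapse of redundant spots into nonredundant proteins amounts to grouping spot identifications by protein name. The following is a minimal sketch using the two muscle examples given above; the third entry and the exact name strings are hypothetical placeholders, and in practice the grouping key might be an accession number rather than a name.

```python
from collections import Counter

# One entry per identified spot: 12 creatine kinase spots and 6 alpha-actin
# spots (as in the text), plus one made-up single-spot protein for illustration.
spot_identifications = (
    ["creatine kinase, muscle-type"] * 12
    + ["alpha-actin, skeletal muscle"] * 6
    + ["hypothetical single-spot protein"]
)

spots_per_protein = Counter(spot_identifications)
nonredundant = sorted(spots_per_protein)

print(f"{len(spot_identifications)} spots -> {len(nonredundant)} nonredundant proteins")
for protein, n_spots in spots_per_protein.most_common():
    print(f"{protein}: {n_spots} spot(s)")
```

Retaining the per-protein spot counts alongside the nonredundant list preserves the information about multiplicity, which may itself be biologically meaningful.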
Table 2.
Summary of GO molecular function searching of F. grandis proteins separated by 2D electrophoresis and identified by MALDI–TOF/TOF MS.
| Tissue | Protein assignments | Nonredundant assignments | Mapped genes | Molecular function categories |
|---|---|---|---|---|
| Muscle | 49 | 20 | 12 | 3 |
| Liver | 93 | 59 | 49 | 9 |
| Brain | 118 | 61 | 56 | 8 |
| Heart | 93 | 55 | 52 | 8 |
| Gill | 71 | 59 | 45 | 9 |
| All tissues | 424 | 254 | 214 | 11 |
We then used the web-based annotation tool PANTHER to assign GO molecular functions to these proteins (Mi et al. 2005). Of the 254 proteins submitted for analysis on PANTHER, however, GO annotations were retrieved for only 123 proteins (48% of the nonredundant proteins). Lucitt et al. (2008) achieved a similar rate of GO annotation (41–54%) for proteins from zebrafish embryos. These low rates are presumably due to incomplete GO annotations for zebrafish (King et al. 2003). We supplemented this list with another 78 (31%) proteins annotated using QuickGO on UNIPROT (Binns et al. 2009) and 13 (5%) annotated by manually searching human and mouse homologs (Ashburner et al. 2000). Thus, a total of 214 of the nonredundant proteins were annotated with GO molecular functions, or >80% of the total nonredundant identifications (Table 2).
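The contribution of each annotation route to the overall coverage can be tallied as follows. The counts are taken from the text; the dictionary keys and variable names are illustrative.

```python
nonredundant_total = 254

# Number of nonredundant proteins annotated by each route, in the order applied
annotated_by = {
    "PANTHER": 123,
    "QuickGO": 78,
    "manual homolog search": 13,
}

total_annotated = sum(annotated_by.values())

for route, n in annotated_by.items():
    print(f"{route}: {n} ({100 * n / nonredundant_total:.0f}%)")
print(f"Total annotated: {total_annotated} of {nonredundant_total} "
      f"({100 * total_annotated / nonredundant_total:.0f}%)")
```

The total of 214 matches the sum of the mapped genes listed per tissue in Table 2 and corresponds to about 84% of the nonredundant identifications.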
The resulting annotations fell into a limited number of categories of molecular function (Table 2), which are shown for each tissue in Fig. 3. Catalytic activity and binding activity were the predominant molecular functions represented in all tissues, consistent with the majority of the proteins identified being enzymes or other proteins with binding activity. Other molecular functions aligned with known functions of these tissues, for example a relatively high proportion of ion-channel and transporter activity in brain. Also, the results suggest that the tissues differed in terms of functional diversity. Out of the five tissues, muscle proteins represented only three molecular functions, whereas in the other tissues, eight or nine molecular functions were returned. The smaller number of molecular functions for muscle is consistent with a less complex protein profile observed in 2D gel images (Fig. 1). This conclusion should be viewed cautiously, however, because the low number of molecular functions represented in muscle can be partially attributed to a smaller number of identified proteins (49) and to a higher level of redundancy among these identifications leading to a limited number of nonredundant proteins with assigned molecular functions (12) in muscle compared to other tissues (Table 2).
Fig. 3.
GO molecular function terms represented by proteins identified in five tissues of Fundulus grandis: skeletal muscle, liver, brain, heart, and gill.
Conclusions and perspectives
In this study, 2D electrophoresis and MALDI–TOF/TOF MS were used to describe the proteome of skeletal muscle, liver, brain, heart, and gill of the Gulf killifish, F. grandis. The overall identification rate was similar to that observed in both model and nonmodel fish species, arguing that F. grandis is an appropriate fish for investigating protein responses to environmental changes. Moreover, the rate of false-positive identification of proteins was estimated to be <1%, supporting a high confidence in the accuracy of these identifications. Identified proteins were assigned GO molecular functions, a process that was impeded by the incomplete annotation of the sequence databases of fish, highlighting the need for improved bioinformatic tools and resources for proteomics research on fish. Nevertheless, the distribution of assigned molecular functions among tissues demonstrates similarities (e.g., the high proportion of catalytic activity and binding activity), as well as differences (e.g., the number of molecular functions represented) among tissues.
This study focused on the most abundant proteins in F. grandis tissues. As analytical sensitivities improve, future studies will mine deeper into the proteome, enabling a more complete picture of the tissue proteome. In addition, the development and application of “gel-free” approaches will complement and extend 2D gel electrophoretic studies (Lucitt et al. 2008; De Souza et al. 2009; Martyniuk and Denslow 2009). Another area of advancement is bioinformatics; it will be necessary to develop more robust and complete resources, especially for nonmodel organisms, to gain a better understanding of the functional relationships of identified proteins, especially in the context of environmental proteomics. Another observation made in this and other studies that deserves more attention is that annotated proteins frequently migrate in 2D gels as multiple forms. This multiplicity may represent an important feature of biological regulation, which will likely differ among proteins (e.g., splice variants versus posttranslational modification).
Finally, this study and others in this symposium demonstrate both the feasibility and importance of proteomic analyses in “nonmodel” organisms. The techniques for the separation, identification, and statistical analysis of proteins are generally applicable, with the caveat that identification, in particular, is limited by the available sequence databases. By applying these techniques to examine protein expression in “real organisms” in “real world settings,” environmental proteomics has great potential to illuminate the physiological and biochemical responses of an array of organisms to an ever-changing world.
Supplementary Data
Supplementary Data are available at ICB online.
2DPAGE Database
Images of 2D gels annotated with protein identifications and corresponding MS data have been deposited with the World-2DPAGE Repository (http://world-2dpage.expasy.org/repository/0050/).
Funding
This work was supported by the National Science Foundation (OCE-0308777); the Board of Regents Support Fund via the Louisiana Experimental Program to Stimulate Competitive Research (NSF(2010)-PFUND-222); and the National Institutes of Health Research Centers at Minority Institutions (5G12RR026260-02).
Acknowledgments
The authors thank the Society for Integrative and Comparative Biology for supporting the symposium in which this paper was presented.
References
- Abbaraju NV, Cai Y, Rees BB. Protein recovery and identification from the gulf killifish, Fundulus grandis: comparing snap-frozen and RNAlater® preserved tissues. Proteomics. 2011;11:4257–61. doi: 10.1002/pmic.201100328.
- Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature. 2003;422:198–207. doi: 10.1038/nature01511.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10. doi: 10.1016/S0022-2836(05)80360-2.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25:25–9. doi: 10.1038/75556.
- Binns D, Dimmer E, Huntley R, Barrell D, O’Donovan C, Apweiler R. QuickGO: a web-based tool for Gene Ontology searching. Bioinformatics. 2009;25:3045–6. doi: 10.1093/bioinformatics/btp536.
- Bosworth CA, IV, Chou CW, Cole RB, Rees BB. Protein expression patterns in zebrafish skeletal muscle: initial characterization and the effects of hypoxic exposure. Proteomics. 2005;5:1362–71. doi: 10.1002/pmic.200401002.
- Burnett KG, Bain LJ, Baldwin WS, Callard GV, Cohen S, Di Giulio RT, Evans DH, Gómez-Chiarri M, Hahn ME, Hoover CA, et al. Fundulus as the premier teleost model in environmental biology: opportunities for new insights using genomics. Comp Biochem Physiol D. 2007;2:257–86. doi: 10.1016/j.cbd.2007.09.001.
- Capitanio D, Vasso M, Fania C, Moriggi M, Viganò A, Procacci P, Magnaghi V, Gelfi C. Comparative proteomic profile of rat sciatic nerve and gastrocnemius muscle tissues in ageing by 2-D DIGE. Proteomics. 2009;9:2004–20. doi: 10.1002/pmic.200701162.
- Clauser KR, Baker P, Burlingame AL. Role of accurate mass measurement (±10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal Chem. 1999;71:2871–2. doi: 10.1021/ac9810516.
- Damerval C, De Vienne D, Zivy M, Thiellement H. Technical improvements in two-dimensional electrophoresis increase the level of genetic variation detected in wheat-seedling proteins. Electrophoresis. 1986;7:52–4.
- De Souza AG, MacCormack TJ, Wang N, Li L, Goss GG. Large-scale proteome profile of the zebrafish (Danio rerio) gill for physiological and biomarker discovery studies. Zebrafish. 2009;6:229–38. doi: 10.1089/zeb.2009.0591.
- Dowd WW. Challenges for biological interpretation of environmental proteomics data in non-model organisms. Integr Comp Biol. 2012;52:705–20. doi: 10.1093/icb/ics093.
- Dowd WW, Renshaw GMC, Cech JJ, Kültz D. Compensatory proteome adjustments imply tissue-specific structural and metabolic reorganization following episodic hypoxia or anoxia in the epaulette shark (Hemiscyllium ocellatum). Physiol Genom. 2010;42:93–114. doi: 10.1152/physiolgenomics.00176.2009.
- Elias JE, Gygi SP. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat Methods. 2007;4:207–14. doi: 10.1038/nmeth1019.
- Eng JK, McCormack AL, Yates JR. An approach to correlate tandem mass spectral data of peptides with amino-acid-sequences in a protein database. J Am Soc Mass Spectrom. 1994;5:976–89. doi: 10.1016/1044-0305(94)80016-2.
- Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM. Electrospray ionization for mass-spectrometry of large biomolecules. Science. 1989;246:64–71. doi: 10.1126/science.2675315.
- Forné I, Abián J, Cerdà J. Fish proteome analysis: model organisms and non-sequenced species. Proteomics. 2010;10:858–72. doi: 10.1002/pmic.200900609.
- Frohlich T, Arnold GJ. Proteome research based on modern liquid-chromatography tandem mass spectrometry: separation, identification and quantification. J Neural Transm. 2006;113:973–94. doi: 10.1007/s00702-006-0509-3.
- Gelfi C, Viganò A, De Palma S, Ripamonti M, Begum S, Cerretelli P, Wait R. 2-D protein maps of rat gastrocnemius and soleus muscles: a tool for muscle plasticity assessment. Proteomics. 2006;6:321–40. doi: 10.1002/pmic.200501337.
- Gygi SP, Corthals GL, Zhang Y, Rochon Y, Aebersold R. Evaluation of two-dimensional gel electrophoresis-based proteome analysis technology. Proc Natl Acad Sci USA. 2000;97:9390–5. doi: 10.1073/pnas.160270797.
- Johnston IA. Structure and function of fish muscles. In: Day MH, editor. Symp. Zool. Soc. Vertebrate locomotion. Vol. 48. London: Academic Press; 1981. pp. 71–113.
- Karas M, Hillenkamp F. Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Anal Chem. 1988;60:2299–301. doi: 10.1021/ac00171a028.
- Karim M, Puiseux-Dao S, Edery M. Toxins and stress in fish: proteomic analyses and response network. Toxicon. 2011;57:959–69. doi: 10.1016/j.toxicon.2011.03.018.
- King OD, Foulger RE, Dwight SS, White JV, Roth FP. Predicting gene function from patterns of annotation. Genome Res. 2003;13:896–904. doi: 10.1101/gr.440803.
- Laemmli UK. Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature. 1970;227:680–5. doi: 10.1038/227680a0.
- Lucitt MB, Price TS, Pizarro A, Wu W, Yocum AK, Seiler C, Pack MA, Blair IA, FitzGerald GA, Grosser T. Analysis of the zebrafish proteome during embryonic development. Mol Cell Proteomics. 2008;7:981–94. doi: 10.1074/mcp.M700382-MCP200.
- Martin SAM, Mohanty BP, Cash P, Houlihan DF, Secombes CJ. Proteome analysis of the Atlantic salmon (Salmo salar) cell line SHK-1 following recombinant IFN-gamma stimulation. Proteomics. 2007;7:2275–86. doi: 10.1002/pmic.200700020.
- Martyniuk CJ, Alvarez S, McClung S, Villeneuve DL, Ankley GT, Denslow ND. Quantitative proteomic profiles of androgen receptor signaling in the liver of fathead minnows (Pimephales promelas). J Proteome Res. 2009;8:2186–200. doi: 10.1021/pr800627n.
- Martyniuk CJ, Denslow ND. Towards functional genomics in fish using quantitative proteomics. Gen Comp Endocrinol. 2009;164:135–41. doi: 10.1016/j.ygcen.2009.01.023.
- Mi H, Lazareva-Ulitsky B, Loo R, Kejariwal A, Vandergriff J, Rabkin S, Guo N, Muruganujan A, Doremieux O, Campbell MJ, et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 2005;33:D284–D288. doi: 10.1093/nar/gki078.
- Nordlie FG. Physiochemical environments and tolerances of cyprinodontoid fishes found in estuaries and salt marshes of eastern North America. Rev Fish Biol Fish. 2006;16:51–106.
- Ohlendieck K. Skeletal muscle proteomics: current approaches, technical challenges and emerging techniques. Skelet Muscle. 2011;1:6. doi: 10.1186/2044-5040-1-6.
- Patterson SD. Data analysis—the Achilles heel of proteomics. Nat Biotechnol. 2003;422:233–7. doi: 10.1038/nbt0303-221.
- Peisker K. Application of Neuhoff's optimized Coomassie brilliant blue G-250/ammonium sulfate/phosphoric acid protein staining to ultrathin polyacrylamide gels on polyester films. Electrophoresis. 1988;5:236–8. doi: 10.1002/elps.1150090510.
- Perkins DN, Pappin DJC, Creasy DM, Cottrell JS. Probability-based protein identification by searching sequence database using mass spectrometry data. Electrophoresis. 1999;20:3551–67. doi: 10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2.
- Postlethwait J, Amores A, Force A, Yan YL. The zebrafish genome. Methods Cell Biol. 1999;60:149–63.
- Sanchez BC, Ralston-Hooper K, Sepúlveda MS. Review of recent proteomic applications in aquatic toxicology. Environ Toxicol Chem. 2011;30:274–82. doi: 10.1002/etc.402.
- Tomanek L. Environmental proteomics: changes in the proteome of marine organisms in response to environmental stress, pollutants, infection, symbiosis, and development. Ann Rev Mar Sci. 2011;3:373–99. doi: 10.1146/annurev-marine-120709-142729.
- Westerfield M. The zebrafish book. A guide for the laboratory use of zebrafish (Danio rerio). 5th ed. Eugene: University of Oregon Press; 2007.