Author manuscript; available in PMC: 2020 Apr 2.
Published in final edited form as: Anal Chem. 2019 Mar 18;91(7):4797–4805. doi: 10.1021/acs.analchem.9b00345

Microsampling Capillary Electrophoresis Mass Spectrometry Enables Single-cell Proteomics in Complex Tissues: Developing Cell Clones in Live Xenopus laevis and Zebrafish Embryos

Camille Lombard-Banek 1, Sally A Moody 2, M Chiara Manzini 3, Peter Nemes 1,2,*
PMCID: PMC6688183  NIHMSID: NIHMS1041538  PMID: 30827088

Abstract

Label-free single-cell proteomics by mass spectrometry (MS) is currently incompatible with complex tissues without requiring cell culturing, single-cell dissection, or tissue dissociation. We here report the first example of label-free single-cell MS-based proteomics directly in single cells in live vertebrate embryos. Our approach integrates optically guided in situ subcellular capillary microsampling, one-pot extraction/digestion of the collected proteins, peptide separation by capillary electrophoresis, ionization by an ultrasensitive electrokinetically pumped nano-electrospray, and detection by high-resolution MS (Orbitrap). With a 700-zmol (420,000 copies) lower limit of detection, this trace-sensitive technology confidently identified and quantified ~750–800 protein groups (<1% false discovery rate) by analyzing just ~5 ng of protein digest, viz. <0.05% of the total protein content from individual cells in the 16-cell Xenopus laevis (frog) embryo. After validating the approach by recovering animal-vegetal pole proteomic asymmetry in the frog zygote, the technology was used to uncover proteomic reorganization as the dorsal-animal (D11) cell of the 16-cell embryo gave rise to its neural-tissue fated clone in the embryo developing to the 32-, 64-, and 128-cell stage. In addition to enabling proteomics on smaller cells in X. laevis, we also demonstrated this technology to be scalable to single cells in live zebrafish embryos. Microsampling single-cell MS-based proteomics raises exciting opportunities to study cell and developmental processes directly in complex tissues and whole organisms at the level and sublevel of the building block of life: the cell.

Keywords: Single cell analysis, Proteomics, Microsampling, Mass spectrometry, Capillary electrophoresis, Xenopus


INTRODUCTION

Single-cell proteomics by mass spectrometry (MS) promises to revolutionize our understanding of cell biology. However, MS-based single-cell proteomics has so far been limited to cells in cultures or suspensions, typically obtained by dissection from tissues or dissociation of tissues or organisms. Additionally, these workflows present drawbacks that complicate results interpretation for biological investigations, such as the loss of cellular locational information that is important for cell identity, physiology, and/or the cellular microenvironment. Dissociation, dissection, or flow sorting can also perturb dynamic molecular processes via exposure to physical or chemical forces, as was recently evidenced at the level of the metabolome.1, 2 These strategies are also prone to contamination from neighboring cells and can be difficult to scale to smaller cells and complex tissues and organisms. Thus far, no single-cell MS technologies have been developed to enable the label-free detection of large numbers of proteins directly in complex tissues and whole organisms.

Single-cell mass spectrometry (MS) is a new frontier in cell and developmental biology and health studies by enabling the direct measurement of proteins, peptides, and metabolites in single cells. This field of single-cell MS was the focus of several reviews lately.3–14 At present, proteomics by MS pools together large numbers, usually thousands to millions, of cells to obtain deep to nearly complete coverage of proteomes.15 However, cell averaging inherently loses potentially important information specific to each cell, such as differential fates that embryonic cells assume during development of tissues, organs, and the organism. Without molecular amplification available for the whole proteome (or metabolome), technologies of single-cell MS must deliver exquisite detection sensitivity. Detection of α and β globulins in individual erythrocytes by capillary electrophoresis (CE)-MS16–18 pioneered MS proteomics for single cells. By using rare earth-metal tagged antibodies, the multiplexity of protein-targeted single-cell measurements was recently augmented to 23 different proteins using mass cytometry (CyTOF)19. Extension of single-cell proteomics to label-free operation would enable the detection of a large number of proteins, thus opening new potentials for cell and developmental biological studies.

We and others recently developed sufficiently sensitive label-free MS to enable proteomics on single isolated cells. Considerably sized cells in early developing embryos of the South African clawed frog (X. laevis) were amenable to manual dissection, followed by discovery (untargeted) proteomic measurement of each isolated cell using custom-built CE-MS instruments20, 21 and liquid chromatography (LC) MS22, 23. For example, microanalytical CE-MS has enabled the detection of ~1,600 different protein groups between single cells that were dissected from 16-cell embryos of the frog, allowing us to discover previously unknown proteomic cell heterogeneity at such an early stage of development.20, 21 To process smaller cells, nanoPOTS (nanodroplet processing in one pot for trace samples platform) was recently introduced, enabling the identification of ~3,000 proteins from ~10 mammalian cells in culture,24 ~670 protein groups from single HeLa cells and human lung primary cells,25 and 164–607 protein groups from 1–5 LNCaP cells isolated from whole blood.26

However, cell dissection, tissue dissociation, and culturing pose inherent limitations for many experiments in biology and health studies. Dissection is challenging to adapt to developing systems, such as developing embryos in X. laevis, zebrafish, or mouse, in which cell division leads to rapidly shrinking cell dimensions, cell adhesion is strengthened in later stages of development, and morphogenic movements can cause translocation of cells over considerable distances. Dissection/dissociation is also prone to contamination by materials from neighboring cells, calling for careful validation of the results. As already mentioned earlier, physical/chemical processing can also alter molecular states, complicating results interpretation for single cells.2 Another challenge for biological studies arises from collateral damage to other cells and tissues, which in turn hinders the study of developing tissues and organs. Because the relationship between the transcriptome and proteome is complex during development and posttranslational modifications vastly expand protein heterogeneity27, there is a high need for the direct characterization of the single-cell proteome to complement single-cell transcriptomics data during vertebrate embryonic development. At present, there exists no label-free single-cell proteomics MS technology capable of characterizing hundreds to thousands of proteins in single cells directly in complex tissues and whole organisms, including developing vertebrate embryos.

To fill this technological gap, we here report MS-based single-cell proteomics of identified cells in live vertebrate embryos of important model organisms. This report follows our initial disclosure of the technology in 2017 (Ref. 28), with the optimized workflow disseminated in 2018 (Ref. 29). Instead of averaging cells to boost detection sensitivity, as is classical to MS, or dissecting cells, as currently possible in single-cell MS by us and others,20–22 we microfabricated capillaries to aspirate a portion of identified cells directly from live embryos. Proteins from the collected aspirate were characterized label-free using a custom-built CE-ESI-MS system capable of ultrasensitive detection. As an example, we demonstrated the confident quantification of 450–800 different protein groups (<1% false discovery rate) in single cells in cleavage- and blastula-staged Xenopus laevis and zebrafish embryos, two popular vertebrate models in cell and developmental biology. Furthermore, for the first time in MS and developmental biology, we also quantified changes in the expression of hundreds of proteins as a single identified cell divided to give rise to its neural-tissue fated cell clone in the live frog embryo. The single-cell proteomics data from this work complement those already accessible by single-cell transcriptomics, thus opening the door to systems biology studies of cell developmental processes.

METHODS

Reagents

Chemicals, solvents, and TPCK-modified trypsin were purchased in reagent grade or higher purity from Fisher Scientific (Pittsburgh, PA). Solutions were prepared in LC-MS grade methanol, acetonitrile, water, formic or acetic acids. Bare fused silica capillaries (40/110 μm inner/outer diameter) were obtained from Polymicro Technologies (Phoenix, AZ) and used without modification. Microcapillaries for microsampling and nanoESI were fabricated from borosilicate glass capillaries (0.75/1 mm inner/outer diameter) from Sutter Instrument (Novato, CA).

Solutions

For culturing X. laevis embryos, 100% Steinberg’s solution (pH 7.4) was prepared following an established protocol.30 For culturing zebrafish embryos, the E2 media was prepared following an established protocol.31 To extract proteins, the lysis buffer was prepared to contain 1% (w/v) SDS, 20 mM Tris-HCl, 5 mM EDTA, and 150 mM NaCl in reverse-osmosis water. The CE background electrolyte contained 25% (v/v) MeOH and 1 M formic acid. The CE-nanoESI sheath solution was prepared from 10% (v/v) MeOH with 0.5% (v/v) AcOH.

Animal Care and Embryo Handling

All protocols for the maintenance and handling of Xenopus laevis or zebrafish (Danio rerio) embryos were approved by the Institutional Animal Care and Use Committee of the University of Maryland (IACUC #R-DEC-17–57) or the George Washington University (IACUC #A311). Details regarding the processing of the embryos and the capillary microsampling and dissection of identified cells are provided in the SI document.

CE-nanoESI-HRMS

Digested proteins were measured in a custom-built CE platform that we recently described in detail.20 In this study, ~10 nL of cell digest were electrophoretically separated in a ~90-cm fused silica capillary filled with the background electrolyte. The CE separation potential was +18 kV for 15 min and +15 kV thereafter until separation was stopped at 80 min. Most peptides were separated within a 30-min migration time window. Peptides were ionized in an electrokinetically pumped co-axial low-flow CE-nanoESI interface, which was constructed based on a recent design32. In this study, the electrospray was maintained in the cone-jet regime for efficient ion generation33, which was verified by directly imaging the liquid meniscus at the emitter tip using a long-working distance objective (Mitutoyo Plan Apo, Edmund Optics, Barrington, NJ) equipped with a CCD camera (EO-2018C, Edmund Optics). Further technical details are available in the SI document.
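The two-step separation program described above can be written down as a small lookup, which also makes the nominal field strength easy to check. This is only an illustrative sketch: the function names are hypothetical, the ~90 cm length is the approximate value quoted in the text, the potential is assumed to drop across the full capillary length, and the choice to assign the instant t = 15 min to the +15 kV step is an assumption.

```python
# Illustrative sketch of the CE voltage program quoted in the text.
# Names are hypothetical; the capillary length is approximate (~90 cm).
CAPILLARY_LENGTH_CM = 90.0


def separation_potential_kv(t_min: float) -> float:
    """Return the applied separation potential (kV) at time t_min (minutes)."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min < 15:
        return 18.0   # first step: +18 kV for 15 min
    if t_min < 80:
        return 15.0   # second step: +15 kV until the run is stopped at 80 min
    return 0.0        # separation stopped


def field_strength_v_per_cm(potential_kv: float,
                            length_cm: float = CAPILLARY_LENGTH_CM) -> float:
    """Nominal field strength, assuming the full potential drops over the capillary."""
    return potential_kv * 1000.0 / length_cm


if __name__ == "__main__":
    for t in (5, 30, 80):
        kv = separation_potential_kv(t)
        print(t, "min:", kv, "kV,", round(field_strength_v_per_cm(kv), 1), "V/cm")
```

Under these assumptions the two steps correspond to roughly 200 V/cm and 167 V/cm.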

Safety

Standard chemical and biological safety protocols were followed. Electrospray emitters and capillaries, which pose puncture hazard, were handled with care. Electrically conductive parts of the CE-EK-nanoESI platform, which pose electrical shock hazard, were grounded or shielded in an interlock-enabled enclosure to prevent users from accidental exposure.

Data Analysis

To identify and quantify proteins (<1% false discovery rate), the MS1–MS2 data were analyzed using the Andromeda search engine34 in MaxQuant version 1.5.5.1 (Ref. 35) against the Xenopus proteome (downloaded from UniProt in October 2017) or zebrafish proteome (downloaded in January 2018 from Uniprot.org). Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were performed in MetaboAnalyst version 3.0 (Ref. 36). Cluster analysis was performed in GProX.37 Gene ontology and pathway annotations were performed using PANTHER38, 39.

Data Repository

All RAW and processed MS–MS/MS data and the concatenated Xenopus reference proteome have been uploaded to the ProteomeXchange Consortium40 via PRIDE under dataset identifier PXD006905.

RESULTS AND DISCUSSION

Technology Development

Our sampling strategy began with culturing embryos to a desired developmental stage. Cells were identified under a high-resolution stereomicroscope based on morphological cues in this study. Alternatively, endogenous/exogenous molecular markers may be used for cell identification. For example, Figure 1 highlights the neural-tissue fated midline animal-dorsal (D11) cell in the 16-cell X. laevis embryo based on cell size, cleavage pattern, and pigmentation in accord with established cell fate maps41 (see also electronic Supplementary Information, SI). To sample proteins from these cells, we adapted an optically guided capillary microsampling platform2 to a remote-controlled precision three-axis translation stage, allowing for the membrane of the desired cell to be pierced with a borosilicate capillary (pulled to an ~20 μm diameter tip here). A pre-calibrated volume was aspirated from the cell using an online-hyphenated microinjector delivering negative pressure pulses to the capillary (−40 psi). While capillary microsampling has been utilized for metabolites in single cells in culture or tissues (reviewed in Refs. 3–14) and also recently in the X. laevis embryo by us2, this work demonstrates the first example of capillary microsampling of single cells in a vertebrate embryo for MS-based proteomics. As an example, we extracted ~10 nL of the D11 cell, viz. ~10% of the total cell volume in this study. Based on protein assay data for whole embryos20, this sample contains ~1 μg of total protein. However, at this stage of development, abundant yolk proteins (vitellogenins) account for ~90% of the cell’s proteome, which extrapolates to ~100 ng yolk-free protein in the aspirate. These protein amounts are ~10–100-times lower than those typically required for UPLC MS.22, 23 To enable the proteomic analysis of smaller cells that form at later stages in the embryo, we purposefully measured only ~5 ng from the aspirated protein amount, or <0.05% of the total protein content of each identified cell in the 16-cell embryo. With scalable dimensions by capillary microsampling and ultrasensitive detection by CE-nanoESI-HRMS, our single-cell proteomic strategy should be applicable to other types of cells and models, including human cells.
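The protein budget in this paragraph can be retraced with a few lines of arithmetic. The sketch below is not part of the original workflow; the function and key names are hypothetical, the inputs are the approximate values quoted in the text, and it assumes that protein is distributed uniformly in the cell so that 10% of the volume carries 10% of the protein.

```python
# Back-of-the-envelope protein budget for one aspirate (hypothetical helper).
# Inputs are the approximate figures quoted in the text.
def aspirate_budget(aspirate_protein_ng: float = 1000.0,      # ~1 ug in the aspirate
                    aspirate_fraction_of_cell: float = 0.10,  # ~10% of cell volume
                    yolk_fraction: float = 0.90,              # ~90% yolk proteins
                    measured_ng: float = 5.0) -> dict:        # ~5 ng analyzed
    # Assumption: protein scales with sampled volume (uniform distribution).
    cell_total_ng = aspirate_protein_ng / aspirate_fraction_of_cell
    yolk_free_ng = aspirate_protein_ng * (1.0 - yolk_fraction)
    measured_percent_of_cell = 100.0 * measured_ng / cell_total_ng
    return {
        "cell_total_ng": cell_total_ng,
        "yolk_free_ng": yolk_free_ng,
        "measured_percent_of_cell": measured_percent_of_cell,
    }


if __name__ == "__main__":
    for key, value in aspirate_budget().items():
        print(f"{key}: {value:.3g}")
```

With these rounded inputs the cell holds about 10 μg of protein, the aspirate about 100 ng of yolk-free protein, and the measured 5 ng corresponds to roughly 0.05% of the cell's protein content, in line with the figures quoted above.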

Figure 1.

Label-free single-cell proteomics of an identified single cell in a live Xenopus laevis embryo. The example shows optical identification of the midline animal-dorsal cell (D11, white arrow) in the 16-cell embryo, whence ~20 nL of cell content were aspirated using capillary microsampling (microprobe). Proteins in the collected aspirate were extracted and digested. Peptides were separated in a microanalytical capillary electrophoresis (CE) platform and ionized in a custom-built electrokinetically pumped nano-electrospray ionization (EK-nanoESI) source that was monitored using a long-working distance camera for stable operation in the cone-jet regime (see Taylor cone, Tc) for efficient ionization. The resulting peptide ions were analyzed using a high-resolution Orbitrap mass spectrometer. Scale bars = 200 μm (black), 1 mm (white).

To analyze the trace amounts of proteins collected, microprobe sampling was integrated with ultrasensitive CE-ESI-MS (Fig. 1a). Although this instrument recently enabled the detection of broad types of proteins in whole dissected X. laevis cells,20 the minuscule amounts of proteins that were collected in this study required optimization of the bottom-up proteomic workflow. Microaspirated samples were pressure-ejected into individual microtubes, in which proteins were extracted and trypsin-digested in 5 μL of buffer solution. To minimize/avoid potential peptide losses during sample handling, we here eliminated the typical bottom-up proteomic steps of reduction, alkylation, and protein purification. If desired, these steps can be implemented using limited amounts of reagents, as we recently demonstrated for dissected cells.20, 21 The digests were dried and reconstituted in 2 μL of 60% acetonitrile (ACN) with 0.05% acetic acid (AcOH), chosen to preconcentrate peptides on-column via field-amplified sample stacking during CE. We deposited 500 nL of the resulting digest into a sample-loading microvial (see (1) in Fig. 1), whence ~10 nL, containing ~5 ng of total peptides, or ~0.05% of the total single-cell proteome, were loaded into the microanalytical CE platform (Fig. 1) following our established protocols.20 Peptides were electrophoretically separated over an ~30-min window before ionization in an ultra-sensitive CE nano-flow ESI interface. This interface supplied a low-flow co-axial sheath flow around the CE separation capillary (Fig. 1) using an electrokinetic pump following a recent design.32 To further enhance ionization efficiency in this work, we used a long working-distance camera and employed ion current measurements to ensure that the electrospray was operated in the cone-jet spraying regime for efficient ionization33 (see Tc in inset, Fig. 1). This approach enabled a lower limit of detection of ~700 zmol (420,000 copies) for model peptides (see Supplementary Figure 1, Fig. S1). Generated peptide ions were fragmented using a high-resolution tandem mass spectrometer equipped with a higher-energy collision induced dissociation cell (Q-Exactive Plus, Thermo Scientific) to identify proteins based on proteotypic peptides against the proteome of the species.
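Two numbers in this paragraph follow from simple unit conversions: the copy number behind the ~700 zmol detection limit and the ~5 ng loaded from a 10 nL injection. The sketch below reproduces both; the function names are hypothetical, and the loading estimate assumes that the ~1 μg of aspirated protein is fully recovered as peptides in the 2 μL reconstitution volume.

```python
# Unit conversions behind the quoted detection limit and loading amount.
# Function names are hypothetical; inputs are approximate values from the text.
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def zmol_to_copies(amount_zmol: float) -> float:
    """Convert zeptomoles (1e-21 mol) to a number of molecules."""
    return amount_zmol * 1e-21 * AVOGADRO


def injected_mass_ng(digest_mass_ng: float,
                     reconstitution_volume_nl: float,
                     injected_volume_nl: float) -> float:
    """Mass loaded on-column, assuming a homogeneous digest with no losses."""
    return digest_mass_ng * injected_volume_nl / reconstitution_volume_nl


if __name__ == "__main__":
    print(f"700 zmol = {zmol_to_copies(700):,.0f} copies")
    # ~1 ug digest in 2 uL (2000 nL), ~10 nL injected
    print(f"loaded = {injected_mass_ng(1000.0, 2000.0, 10.0):.1f} ng")
```

Rounded to two significant figures, 700 zmol corresponds to about 420,000 molecules, and a 10 nL injection from a 2 μL digest of ~1 μg carries about 5 ng.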

Benchmarking Microprobe CE-MS to Whole-cell Dissection

We streamlined and then benchmarked microprobe CE-ESI-MS against whole-cell dissection (Fig. 2), which is the closest neighboring single-cell proteomics technology for the X. laevis embryo.20–22 To enhance identification and quantification, different buffers were tested to extract and digest proteins: 50 mM ammonium bicarbonate (AmBic) containing no additive, 10% ACN, or 0.05% RapiGest (Waters Corp.) were compared to the reference approach, viz. whole-cell dissection with digestion in 1% SDS20. To assess technical and biological variability, each experiment was repeated for n = 3 different D11 cells, each from a different embryo, with each sample analyzed in technical duplicate.

Figure 2.

Benchmarking and optimization of technology performance against whole-cell dissection. (A) Comparison of the number of identified protein groups between microprobe (μP) sampling in 50 mM ammonium bicarbonate (AmBic) containing 10% (v/v) acetonitrile (ACN), 0.05% (v/v) RapiGest, and no additive (AmBic) vs. whole-cell dissection (Diss). (B) Comparison of quantification based on the number of quantified proteins (left panel) and repeatability (relative standard deviation, STD, right panel) between the approaches. Protein numbers are summarized in Table S1 with proteins listed in Tables S2A–D for identification and Tables S2E–H for quantification. The reproducibility of the CE-MS platform is shown for reference (“CE-MS”). Box-whisker plots compare the expression of representative proteins. Key: *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001.

Microprobe single-cell CE-ESI-MS identified significantly more proteins than cell dissection. As shown in Figure 2A, the microprobe approach was most efficient with digestion in AmBic without additives. The number of identified proteins is tabulated in Supplementary Table 1 (Table S1). Identified proteins are tabulated in Tables S2A–D. Analysis of ~5 ng total peptides from each cell by the microprobe approach yielded ~15,000 tandem mass spectra, including ~3,500 peptide spectral matches, which were confidently assigned to 360 different protein groups (<1% false discovery rate) per cell on average (426 proteins cumulative). Notably, in spite of consuming about 4 times the amount of protein digest (20 ng), whole-cell dissection still identified fewer proteins (282 proteins/cell by dissection, P = 0.008, Fig. 2A). Additionally, microprobe CE-MS exclusively identified twice as many proteins as dissection (compare Venn diagrams in Fig. 2). Gene ontology annotation of molecular processes and subcellular localization revealed no known differences between proteins that were identified by either of these approaches (see Fig. S2).

Furthermore, microprobe sampling also significantly improved protein quantification, despite an almost 10-fold reduction in starting protein amounts used in this workflow. As shown in Figure 2B (see numbers in Table S1), quantification was optimal using AmBic in the absence of additives: 341 proteins/cell (391 proteins cumulative) were quantifiable by optimized microprobe vs. 271 proteins/cell by cell dissection (P = 0.004) using label-free quantification (MaxLFQ42). Quantified proteins are tabulated in Tables S2E–H. Based on calculated LFQ intensity values, the reproducibility of quantification was indistinguishable between microprobe CE-MS and whole-cell dissection (P = 0.21). As anticipated, biological variability significantly increased the variability in the number of proteins that were quantified between cells (22.6% by microprobe vs. 16.6% RSD by CE-MS-only, P = 5.7 × 10⁻⁸, Fig. 2B). The observed significant sensitivity improvements may be credited to a combination of factors: Compared to cell dissection, microprobe CE-MS enhanced peptide separation by providing ~55% more theoretical plates (183,000 vs. 118,000 for dissection) and enhanced peptide ionization by efficiently reducing interferences from abundant salts2 contained in the embryo culture media.
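The two summary statistics used in this comparison, the relative gain in theoretical plates and the percent relative standard deviation (RSD), are straightforward to compute. The following sketch uses hypothetical function names; the plate numbers are those quoted in the text, reading the larger value as the microprobe result, and the list passed to the RSD function is an arbitrary toy example rather than measured data.

```python
# Minimal helpers for the two statistics quoted above (hypothetical names).
import statistics


def relative_gain(new: float, reference: float) -> float:
    """Fractional increase of `new` over `reference`."""
    return (new - reference) / reference


def percent_rsd(values) -> float:
    """Relative standard deviation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Plate numbers quoted in the text: microprobe vs. whole-cell dissection.
plate_gain = relative_gain(183_000, 118_000)

if __name__ == "__main__":
    print(f"plate gain: {100 * plate_gain:.0f}%")
    # Toy values only, to show the call; not data from the study.
    print(f"RSD of [10, 12, 14]: {percent_rsd([10, 12, 14]):.1f}%")
```

The plate-number ratio evaluates to about 0.55, consistent with the ~55% improvement stated in the text.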

Technology Validation

To authenticate microprobe CE-ESI-MS, we asked whether the technology was able to find proteomic differences between the animal and vegetal poles of the Xenopus laevis zygote, viz. the fertilized 1-cell embryo (see Fig. 3A), which harbor known molecular differences at the level of transcripts and some proteins. An ~10 nL volume was aspirated at random positions in each hemisphere from n = 4 different zygotes with each sample analyzed in technical duplicate. A total of 752 proteins were identified and 485 proteins were reproducibly quantified using LFQ between each sample using CE-MS (see identification numbers in Table S3). Identified and quantified proteins are listed in Tables S4A and S4B, respectively.

Figure 3.

Technology validation using the X. laevis zygote. (A) Microprobe CE-HRMS of random areas in the animal (AN, top panel) and vegetal (VG, bottom panel) pole. Scale bars = 200 μm. (B) Unsupervised principal component (PC) analysis of the quantitative data. Scores plot (top panel) revealing systematic proteomic differences between the poles (each data point marks a different cell). The loadings plot (bottom panel) measuring the contribution of each quantified protein to the observed differences (each data point marks a different protein). Representative proteins (gene names) are highlighted. Gradients illustrate the observed abundance trends between the poles. Horizontal and vertical dashed lines mark PC1 = PC2 = 0 coordinates. (C) Statistical analysis revealing significant proteomic differences between the poles. Box-whisker plots highlight representative proteins (see text). Key: *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001.

These quantitative metadata were used to compare proteomic profiles between the poles. Figure 3B shows results from the unsupervised principal component analysis (PCA) of the calculated LFQ intensities, considering the two most important principal components (PC), which accounted for ~55% of total variance in the data. PCA loadings are listed for each quantified protein in Table S4C. Data points corresponding to the animal and vegetal samples formed clusters that were clearly distinguishable from each other, indicating systematic proteomic differences between the two poles (see scores plot in top panel). The PCA loadings plot informed us of proteins that were comparably expressed between the poles (proteins with PC1 ≈ 0) as well as those that were enriched in the vegetal (PC1 >> 0) or animal (PC1 << 0) pole (see gradients for guidance in bottom panel). Proteins with finer differences in protein expression may be recognized along PC2.
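To illustrate how scores and loadings separate two groups of samples, the sketch below extracts the first principal component of a small toy matrix by power iteration on the covariance matrix. It is not the MetaboAnalyst procedure used in the study: the data are invented placeholders (three "animal-like" and three "vegetal-like" samples by three features), the function name is hypothetical, and the sign of a principal component is arbitrary, so only the separation between groups is meaningful.

```python
# Toy PCA (first component only) via power iteration; pure standard library.
# All numbers below are invented placeholders, not data from the study.
import math


def first_principal_component(rows, iterations: int = 200):
    """Return (loadings, scores, explained_variance_fraction) for PC1."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Deterministic, non-uniform start vector to avoid being orthogonal
    # to a contrast-type leading eigenvector such as (1, -1, 0).
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    total_variance = sum(cov[j][j] for j in range(p))
    return v, scores, eigenvalue / total_variance


animal = [[10.0, 2.0, 5.0], [11.0, 1.0, 5.0], [10.5, 1.5, 5.2]]
vegetal = [[2.0, 10.0, 5.0], [1.0, 11.0, 5.1], [1.5, 10.5, 4.9]]

if __name__ == "__main__":
    loadings, scores, explained = first_principal_component(animal + vegetal)
    print("loadings:", [round(x, 3) for x in loadings])
    print("scores:", [round(s, 2) for s in scores])
    print("explained variance fraction:", round(explained, 3))
```

In this toy case the first two features carry large loadings of opposite sign (they differ between groups), while the third has a loading near zero (it is comparable between groups), mirroring how the loadings plot is read in the text.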

Furthermore, the observed protein differences were statistically significant for several gene products, including some with known intracellular asymmetry in the zygote. As shown in Figure 3C, enrichment was statistically (P < 0.05) and biologically (fold change ≥ 1.3) significant for 115 proteins. The results of statistical analysis are tabulated in Table S4D. For example, basic transcription factor 3l4 (Btf3l4), elongation factor 1-gamma (Eef1g), glyceraldehyde 3-phosphate dehydrogenase (Gapdh), and ~40 different ribosomal proteins (e.g., Rpl4) were more abundant in the animal hemisphere (see bottom panel); these proteins are known to be enriched in the animal cap of the embryo.43 Conversely, perilipin 2 (Plin2) and vitellogenin chains (Vtg a1, a2, b1, and b2) were enriched in the vegetal pole (see bottom panel) in concordance with previous studies on transcriptomic or proteomic heterogeneity in the zygote.43 These results demonstrate the ability of microprobe CE-ESI-MS to recover known or expected subcellular proteomic heterogeneity in the zygote, essentially validating the technology. In principle, the technology is therefore scalable to smaller cells and to the three-dimensionally complex structure of the embryo as its cells divide, which we demonstrate below for a cell clone.
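The double criterion used here (P < 0.05 together with a fold change of at least 1.3) amounts to a simple filter over per-protein statistics. The sketch below shows one way to apply it; the function name, the tuple layout, and the placeholder protein names and values are all hypothetical, and it assumes that the fold change is taken symmetrically between the poles on linear mean intensities.

```python
# Hypothetical filter applying the P-value and fold-change thresholds from the text.
def enriched_proteins(records, p_max: float = 0.05, fc_min: float = 1.3) -> dict:
    """records: iterable of (name, p_value, animal_mean, vegetal_mean).

    Returns {name: "animal" | "vegetal"} for proteins passing both thresholds.
    """
    hits = {}
    for name, p_value, animal_mean, vegetal_mean in records:
        ratio = animal_mean / vegetal_mean
        fold_change = max(ratio, 1.0 / ratio)  # symmetric fold change (assumption)
        if p_value < p_max and fold_change >= fc_min:
            hits[name] = "animal" if ratio > 1.0 else "vegetal"
    return hits


# Placeholder records, invented for illustration only.
toy_records = [
    ("protein_A", 0.01, 2.0, 1.0),  # significant, higher in animal pole
    ("protein_B", 0.02, 1.0, 1.5),  # significant, higher in vegetal pole
    ("protein_C", 0.01, 1.1, 1.0),  # significant P, but fold change too small
    ("protein_D", 0.20, 3.0, 1.0),  # large fold change, but P too high
]

if __name__ == "__main__":
    print(enriched_proteins(toy_records))
```

Requiring both thresholds keeps proteins whose difference is reproducible and large enough to be of biological interest.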

Single-cell Proteomics in a Developing X. laevis Cell Clone

Next, we used microprobe CE-MS to ask whether the protein signature of a single identified cell changes as the cell divides to form its descendant cells (called a clone). Detection of such changes would be intriguing, because currently there is no information available on the proteomic state of cells in the early clone, before zygotic transcription begins with the midblastula transition. As a model, we selected the clone that descends from the midline animal-dorsal cell (D11) in the 16-cell X. laevis embryo41 through 4 consecutive cell divisions (see Fig. 4A); this group of cells reproducibly gives rise to neural tissue in the frog.41, 44 We microprobed the left D11 cell in n = 5 different 16-cell embryos and its vegetally located daughter cell (left D112) in n = 7 different 32-cell embryos, which we were able to accurately identify by fate mapping criteria (see earlier and Methods in the SI document). To test for cell heterogeneity over time, we also microanalyzed daughter cells of D112 in n = 6 different 64-cell embryos and grand-daughter cells of D112 in n = 7 different 128-cell embryos. Because the cleavages become irregular at these stages, the sampled cells were randomly chosen from the position of D112. Each analyzed cell was assigned an identifier (#1 to #25, Fig. 4B) in the order of sampling, although this information and cell identity were blinded during initial data analysis and revealed only to help results interpretation. About 1‒5 ng of protein digest, or 0.006–0.25% of the total protein content, was analyzed in triplicate from each cell in the clone.

Figure 4.

Discovery single-cell proteomics of a neural-tissue fated clone in developing X. laevis embryos. (A) Optical imaging of microprobed cells in the left dorsal-animal cell (D11) lineage. Scale bar = 200 μm. (B) Hierarchical cluster analysis-heat map of quantitative protein abundances considering the 75 most significant features (ANOVA). Embryonic developmental stages (top axis) and individual cell identifiers (bottom axis) are shown. Asterisks (*) mark examples of proteins with quantifiable cell heterogeneity in the 16-cell and 128-cell clone. (C) GProX cluster analysis of protein profiles showing 103 proteins in Cluster #1, 107 proteins in Cluster #2, and 109 proteins in Cluster #3 (see data in Table S6C). (D) Relative quantification of protein production between microprobed cell types revealing significant enrichment differences (see data in Table S6D). Dashed lines mark statistical (P ≤ 0.05) and biological significance (fold change, FC ≥ 1.5).

About 791 protein groups were identified between the single cells, providing rich information on the proteomic state of the cells upon formation of the clone. Identification numbers are summarized for each cell type in Table S5. Identified proteins are tabulated in Table S6A. Based on gene ontology annotation (PantherDB), 26 of these proteins are related to developmental process, such as the homeobox protein C13 (HoxC13), stathmin 1 (STMN1), and adenylate cyclase associated proteins 1 (Cap1). Moreover, our data contained 14 transcription factors, such as high-mobility group protein b2 (Hmgb2), homeobox cut-like 1 (Cux1), and HoxC13. These results highlight notable sensitivity improvements for cell and developmental biology studies using single-cell MS, which so far required ~100 ng protein digest to identify ~450 protein groups from human oocytes23 and several micrograms of proteins to identify ~4,000 proteins between multiple whole cells dissected from X. laevis embryos22.

We used label-free quantification (LFQ) to compare the proteomic state of the cells. About 460 protein groups had quantifiable reads using LFQ among at least half of the biological replicates at every developmental stage from the 16- to the 128-cell stage. The number of quantified proteins is compared in Table S5. Proteins that were identified and quantified are tabulated in Table S6A and Table S6B, respectively. For each identified protein, we calculated the LFQ intensity (MaxLFQ), which serves as a quantitative proxy for protein concentration.20 These quantitative data were median-normalized and log-transformed for hierarchical cluster analysis (HCA). Figure 4B presents the HCA-heat map calculated for the 75 statistically most significant proteins. Intriguingly, the dendrogram clusters the samples (Cells #1 to #25) based on cell type and developmental stage (see horizontal axis); early cleavage-stage embryos (16- and 32-cells), in which all cells are on the surface, were distinguished from late blastula stages (64- and 128-cells), in which cell division generates internal cells. Likewise, the proteins are categorized based on expression profiles across the cell types (see vertical axis). For example, compared to the mean, expression levels were higher for proteins in group A in the 16- and 32-cell embryo, in group B in the 32- and 128-cell embryo, and in group C in the 64- and 128-cell embryo, whereas proteins in group D exhibited complex enrichment profiles across the clone.
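The preprocessing step mentioned above (median normalization followed by log transformation) can be sketched as follows. The exact procedure is not spelled out in the text, so this is one plausible reading rather than the authors' implementation: each cell's LFQ intensities are log2-transformed and centered on that cell's median, and zero intensities are treated as missing values. The function name and the toy intensities are hypothetical.

```python
# One plausible reading of "median-normalized and log-transformed" (assumption).
import math
import statistics


def log2_median_normalize(sample: dict) -> dict:
    """sample: {protein_name: LFQ intensity}; zeros are treated as missing.

    Returns log2 intensities centered on the sample median, which is
    equivalent to dividing by the median intensity before taking log2.
    """
    logged = {name: math.log2(value) for name, value in sample.items() if value > 0}
    center = statistics.median(logged.values())
    return {name: value - center for name, value in logged.items()}


# Toy intensities for a single cell, invented for illustration only.
toy_cell = {"protein_A": 2.0, "protein_B": 8.0, "protein_C": 32.0, "protein_D": 0.0}

if __name__ == "__main__":
    print(log2_median_normalize(toy_cell))
```

Centering each cell on its own median makes profiles comparable between cells that differ in the total amount of peptide loaded, which matters when the analyzed amount varies between samples.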

In addition to revealing protein profiles for the cells, these data also informed us of detectable differences between cells of the D11 clone at the same stage of embryonic development. For example, closer inspection of the HCA plot revealed D11 cells #1, #11, and #16 of the 16-cell embryo to be different from the other cells for select proteins (see asterisks). The frequency of cell-to-cell variance over the clone appears to increase with development, as exemplified for cells #4, #5, #20, #22, and #24 over their siblings in the 128-cell embryo (see asterisks). Future experiments may address a potential link between heterogeneity within a cell type (same developmental stage here) and divergent tissue specification from cells of the same clone (e.g., the D11 clone gives rise to the retina, brain, spinal cord, cement gland, olfactory placode41).

Proteomic profiles were extracted for each cell-stage. High Pearson correlations (ρ = 0.81–0.93) between the cells in the 16-cell embryo (Fig. S3) and 128-cell embryo (Fig. S4) confirmed sufficient quantitative repeatability across the biological replicates to explore protein patterns. Figure 4C groups proteins with similar expression trends into four clusters based on fuzzy c-means cluster analysis of the quantitative metadata. Exact cluster assignments are listed for each protein in Table S6C. Of the 456 different proteins quantified, 319 proteins presented with unique trends across the stages: Some proteins had monotonically increasing (cluster 1) or decreasing (see cluster 3) abundance in the measured cells of the clone, whereas others presented with transient enrichment only at the 32-cell (cluster 2) or 64-cell (cluster 4) stage. These results provide previously unavailable data on proteomic reorganization as cells begin their eventual restriction to neural-fated tissue.
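The repeatability check described here rests on the Pearson correlation between the quantitative profiles of two cells. A minimal, dependency-free version is sketched below; the function name is hypothetical and the example vectors are toy values, not measured intensities.

```python
# Minimal Pearson correlation between two equally long profiles.
import math


def pearson(x, y) -> float:
    """Pearson correlation coefficient of two sequences of equal length."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy log-intensity profiles for two cells (invented values).
    cell_1 = [20.1, 22.4, 18.9, 25.0, 21.3]
    cell_2 = [19.8, 22.9, 19.5, 24.1, 21.0]
    print(round(pearson(cell_1, cell_2), 3))
```

In practice the coefficient would be computed on log-transformed LFQ intensities over the proteins quantified in both cells, since a few very abundant proteins can otherwise dominate the correlation.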

The proteomic changes were evaluated for statistical significance. As the reference, we selected the D11 cell, the precursor cell of the clone, and calculated fold changes for the 456 proteins that were reproducibly quantified from the 128-cell clone. Calculated ratios and corresponding statistical significance (P value, Student’s t-test) are plotted in Figure 4D. Statistical significance and fold change values are reported for each quantified protein in Table S6D. While protein abundance did not change significantly for 365 protein groups (P > 0.05), expression differences were significant for 91 proteins (P ≤ 0.05). For example, the D11 cell rather than its 128-cell descendant cell was enriched in fascin actin-bundling protein 1 (Fscn1) and stathmin 1 (Stmn1), both of which promote disassembly of microtubules, and the transcription factors high mobility group box 2 (Hmgb2) and parkinson protein 7 (Park7).43 In contrast, compared to the D11 cell, the level of expression increased in the analyzed cell of the clone for the microtubule-associated protein family member 1 (Mapre1, promotes cytoplasmic microtubule nucleation and elongation), the proliferating cell nuclear antigen (Pcna, involved in DNA replication), and the macrophage migration inhibitory factor (Mif) and prohibitin 1 (Phb1), both of which are required for normal neural tissue development.43 Combined, these results established reproducible and significant proteomic differences between cells as the clone developed. Detection of proteomic changes over development of the clone is surprising, because the D11 clone arises from the same founder cell, thus inheriting similar maternal cytoplasmic components, and little transcription is known to occur at such an early stage of development. This information was obtainable only using the quantitative microanalytical technology that we developed here, which in principle is scalable to any other clone in the embryo.
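
The two quantities plotted for each protein, a fold change against the reference cell and a Student's t-test, can be sketched as follows. This is an illustrative calculation under stated assumptions, not the analysis code used here: the inputs are taken to be untransformed abundances, the t-test is the equal-variance (pooled) form, and the function names and replicate values are hypothetical. The P value would then follow from the t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, variance

def log2_fold_change(reference, test):
    """log2 of the ratio of mean abundances (test over reference)."""
    return math.log2(mean(test) / mean(reference))

def student_t(reference, test):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    n1, n2 = len(reference), len(test)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(reference) + (n2 - 1) * variance(test)) / df
    t = (mean(test) - mean(reference)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical replicate abundances for one protein (not data from this study).
d11_cell = [1.0, 2.0, 3.0]      # reference: 16-cell stage
descendant = [2.0, 4.0, 6.0]    # cell of the clone at the 128-cell stage
fold = log2_fold_change(d11_cell, descendant)
t_stat, dof = student_t(d11_cell, descendant)
```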

Scaling Single-cell Proteomics to Other Models: Live Zebrafish Embryos

Last, we demonstrated that microprobe CE-ESI-MS is scalable to smaller single cells and a different model system, the zebrafish (Danio rerio) embryo (Fig. 5A). Unlike in X. laevis, the cleavage-stage zebrafish embryo lacks visual indicators of the primary body axes; hence, it is not possible to confidently identify the same cell type or track the clone of the same precursor cell type in different embryos. Without cell identification, it is still unknown whether cells of the zebrafish embryo execute different gene regulatory programs at the cleavage stage. Detection of protein expression in cells of zebrafish embryos would provide a direct window to assess potential cell-to-cell differences.

Figure 5.


Scalable single-cell proteomics for the zebrafish embryo at the 2-cell stage. (A) Microsampling of 5 nL (~5 ng protein) from a cell of interest directly in the living embryo by piercing through the vitelline membrane (Vit.) and the yolk sac. Scale bar = 200 μm. (B) Pearson correlation analysis (correlation coefficient, ρ) of detected protein abundances between n = 3 different single cells, each from a different embryo, indicating proteomic differences between the aspirated samples. Each cell was marked with an identifier number (#) for reference.

To tackle this question, we scaled microprobe CE-ESI-MS to measure protein expression in single cells of the 2-cell stage embryo. Targeted analysis of cells is challenging because the zebrafish embryo rotates freely inside the vitelline membrane. As a solution, we immobilized the embryo by inserting the capillary into the cell of interest through the vitelline membrane and then the yolk sac, essentially pinning the embryo in place. We used the microprobe platform to aspirate ~5 nL, or 5 ng protein content, from one of the cells using a capillary pulled to ~5 μm tip diameter (vs. 10 ng collected using a 20 μm capillary in X. laevis). Although this provided a sufficient amount of protein for our microanalytical CE-ESI-MS instrument, we deliberately analyzed only 33 pg of the resulting protein digest (or 0.13% of the total cell proteome) to demonstrate scalability to even smaller cells.

Even with this trace amount of material analyzed, ~330 proteins were identified (Table S7A) and ~320 proteins were quantified by LFQ (Table S7B). Following annotation for GO term enrichment, we found that most proteins identified were involved in translation, transcription, and cell division, as expected during early embryonic development closely following fertilization.45, 46 Roughly 25% of the proteins identified were structural components of the ribosome (n = 69), while another significant group regulated protein translation and folding (n = 22), such as translation elongation factors 1α, 1β, 1δ, 1γ, and 2b (eef1a, eef1b2, eef1da, eef1g, eef2b) and chaperone heat shock protein 90 (hsp90ab1, hsp90b1). Critical enzymes involved in energy production in both glycolysis and the citric acid cycle were also enriched (n = 28 proteins with oxidoreductase activity), including pyruvate carboxylase (pox), phosphoglycerate kinase (pgk1), NADH dehydrogenase 1 alpha (ndufa13), and aconitase 2 (aco2). Finally, multiple proteins were involved in cytoskeletal regulation during cell division [e.g., actin regulators non-muscle cell myosin (myh9a), cofilin 1 (cfl1), and profilin 2 (pfn2)] and DNA replication [e.g., helicase (mcm9)]. Identification of proteins with developmental relevance showcases the potential of microprobe CE-ESI-MS for the zebrafish embryo.

The quantitative metadata were used to compare protein expression between the cells. As proof of concept, we analyzed n = 3 cells, each from a different embryo. With fast and scalable sample collection afforded by microprobe sampling, these measurements can be readily scaled up to more cells for biologically motivated studies. Based on mean-normalized log-transformed LFQ intensities, protein abundances varied by ~47% RSD, which far exceeds the technical quantitative repeatability of our technology (~7% RSD, Fig. 2B) and the biological quantitative variability observed between the same cell type in X. laevis (~23% RSD, Fig. 2B). Therefore, these data suggest quantitative proteomic differences between the cells that were analyzed in the embryo.
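
The relative standard deviation (RSD) used for this comparison is the standard deviation expressed as a percentage of the mean. A minimal sketch is given below. How the per-protein values were summarized into a single figure is not stated in the text, so the use of the median across proteins is an assumption, and the function names and abundances are hypothetical.

```python
from statistics import mean, median, stdev

def percent_rsd(values):
    """Relative standard deviation in percent (sample standard deviation)."""
    return 100.0 * stdev(values) / mean(values)

def median_rsd(abundances):
    """Median of per-protein RSDs.

    abundances maps a protein to its positive abundances across cells.
    Assumption: the median is used as the summary statistic.
    """
    return median(percent_rsd(values) for values in abundances.values())

# Hypothetical abundances of three proteins in three cells.
cells = {
    "P1": [90.0, 100.0, 110.0],
    "P2": [80.0, 100.0, 120.0],
    "P3": [100.0, 100.0, 100.0],
}
summary = median_rsd(cells)
```

Comparing such a summary against the technical and same-cell-type RSDs is what supports attributing the excess variation to differences between the cells.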

Protein expression was compared between the cells using Pearson cross-correlation analysis. As shown in Figure 5B, the protein composition of cell #1 was highly correlated with that of #2 (Pearson correlation coefficient, ρ = 0.91). Correlation was lower with cell #3 (ρ = 0.84), and notably lower between cells #2 and #3 (ρ = 0.72), revealing marked proteomic differences between the cells. Proteins underlying the observed differences are readily identified outside the 95% confidence prediction band in the plots. For example, among the poorly correlated proteins were three proteins of the ovostatin homolog-1-related signaling molecules (Zgc:171445, Zgc:165518, and Zgc:171426) that are involved in cell signaling (PantherDB). While the biological origin of these differences is as yet unknown to us, future studies may use our data to design targeted experiments to test the developmental significance of stably or variably expressed proteins.
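
The pairwise comparison described above can be sketched as follows. The code computes the Pearson correlation coefficient for every pair of cells from abundance vectors that list the same proteins in the same order. It is an illustration only: the function names and abundance values are hypothetical, and the 95% prediction band used to flag individual proteins is not reproduced here.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairwise_pearson(cells):
    """Correlation for every pair of cells; keys are (cell_a, cell_b)."""
    return {
        (a, b): pearson(cells[a], cells[b])
        for a, b in combinations(sorted(cells), 2)
    }

# Hypothetical log-transformed abundances of the same five proteins per cell.
cells = {
    "#1": [20.1, 22.4, 25.0, 18.3, 21.7],
    "#2": [20.4, 22.0, 24.6, 18.9, 21.2],
    "#3": [21.5, 20.9, 24.9, 19.8, 20.1],
}
correlations = pairwise_pearson(cells)
```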

CONCLUSIONS

In conclusion, we here developed capillary microsampling single-cell CE-ESI-MS to enable, for the first time, label-free proteomic characterization of (identified) single cells directly in live embryos. With scalability to diverse cell sizes and adaptability to different animal models, the technology is able to identify and quantify hundreds of proteins in cells, capturing previously unknown reorganization of the proteome as cells give rise to clones in developing embryos. In principle, the analyzed protein amounts approximate those afforded by larger mammalian cells47, raising exciting technical capabilities to study the proteomic aspects of cell and developmental processes in other cell types (e.g., germ cells, neurons, glial cells). Finally, it does not escape our attention that the microsampling approach should be adaptable to other MS technologies, including liquid chromatography and matrix-assisted laser desorption ionization, to measure the proteomic state of cells.

Supplementary Material

SI
SITable

Acknowledgements

We thank Aparna B. Baxi and Reem Al-Shabeeb for help with data processing. We also thank the peer reviewers for valuable comments and suggestions on this manuscript.

Funding. This work was partially funded by the National Institutes of Health Award 1R35GM124755 (to P.N.), the Arnold and Mabel Beckman Foundation Young Investigator Grant (to P.N.), the American Society for Mass Spectrometry Research Award (to P.N.), and the DuPont Company Young Professor Award (to P.N.). The content of the reported work is the sole responsibility of the authors and does not necessarily represent the official views of the funding sources.

Abbreviations

CE: capillary electrophoresis

HRMS: high-resolution mass spectrometry

ESI: electrospray ionization

LC: liquid chromatography

LFQ: label-free quantification

Footnotes

Competing Financial Interests. The authors have no competing financial interests.

Associated Content. The Supporting Information is available free of charge on the ACS Publications website at DOI: TBD. Protein identifications and quantification, results from Pearson correlation analysis, Student’s t-test, and GProX analysis.

References
