Abstract
Resolving lineage relationships between cells in an organism is a fundamental interest of developmental biology. Furthermore, investigating lineage can drive understanding of pathological states, including cancer, as well as understanding of developmental pathways that are amenable to manipulation by directed differentiation. Although lineage tracking through the injection of retroviral libraries has long been the state of the art, a recent explosion of methodological advances in exogenous labelling and single-cell sequencing has enabled lineage tracking at larger scales, in more detail, and in a wider range of species than was previously considered possible. In this Review, we discuss these techniques for cell lineage tracking, with attention both to those that trace lineage forwards from experimental labelling, and those that trace backwards across the life history of an organism.
Deriving lineage relationships between cells in a developing organism, and between an early dividing cell of unknown potential and its descendants, have been long-standing interests in developmental biology. Understanding these lineage relationships illuminates the fundamental mechanisms underlying normal development, and can provide insight into pathologies of development and cancer. Lineage relationships are experimentally revealed through fate-mapping methods, and when fate mapping is carried out at single-cell resolution it is known as lineage tracing (also known as lineage tracking).
Fundamental questions of lineage have been addressed since the earliest days of embryology, with technical sophistication increasing over time. Initially, embryologists were limited to visual observation of development in organisms that are small enough to be transparent, such as Caenorhabditis elegans, which enabled the discovery of genes that control cell proliferation, cell fate and cell death1,2. In species with larger numbers of cells, genetic mosaicism was leveraged to investigate cell fate, by creating chimeric embryos from mouse strains with differing coat colour genes3,4 or by grafting quail cells into chicken embryos5. With the development of radioactive, enzymatic and fluorescent cellular labels, it became possible to selectively label one or more cells by direct injection and trace developmental potential directly6–9, although most available labels were subject to dilution with successive cell division10.
In recent years, many new methods have emerged to enable cell lineage tracking with increasing resolution, leading to substantial biological insights. In model organisms, novel cellular labels, such as barcoded retroviral libraries11 and a rainbow of available fluorescent proteins12, have increased the number of founder cells that can be uniquely labelled and tracked. Labels can be delivered at different stages of development using various methods, including viral infection and in utero electroporation. Unlike most early cellular tracers, labels that are inserted into the genome can permanently mark lineages in a variety of experimental organisms without being diluted by cell division, and these modifications are facilitated by genome-editing technologies, such as the CRISPR–Cas9 system13. Furthermore, recent advances in sequencing enable naturally occurring somatic mosaic mutations to be used as lineage marks in cancerous tissue14,15 and normal tissue16,17, illuminating a future in which lineage tracing moves from experimental organisms into humans.
In this Review, we present both historical and recently developed methods for lineage tracing. Following the common division of genetic approaches into ‘forward’ and ‘reverse’ genetics, we discuss methods according to whether they prospectively introduce lineage tracers and follow traced cells forwards in development (prospective lineage analysis), or whether they retrospectively identify lineage-specific tracers and use them to infer past developmental relationships (retrospective lineage analysis) (FIG. 1). We highlight technologies and methods that can make important contributions to the execution and the interpretation of lineage tracing experiments. We conclude with a discussion of systems and organs that present promising or challenging prospects for lineage tracing.
Figure 1. Prospective and retrospective lineage tracing.
Prospective lineage tracing entails experimentally applying a lineage mark (grey rectangle on the blue timeline), then following cells forward to read its output at some later time. By contrast, retrospective lineage tracing follows cells backwards to read endogenous marks (multiple grey rectangles on the blue timeline) that have accumulated over the lifetime of an organism. Compared with retrospective lineage tracing, prospective lineage tracing generally requires greater experimental intervention at the onset of development (left), but less intervention to read the result of lineage tracing (right). In both experimental designs, cells are placed in a dendrogram according to their inferred relationships with each other.
Prospective methods of lineage tracing
A classic approach to cell lineage analysis is to label a single founder cell and trace its progeny over time. This prospective method has been used since biological dyes mapped the fate of cells within chicken and mouse embryos in early observational studies, and continues to be used in current lineage tracking experiments18,19. Early developmental studies hoped to achieve clonal labelling by microinjecting small amounts of dye into an area of interest, whereas advances in genetic tools for prospective lineage tracing now allow for far greater cell and tissue specificity, recombinase-based intersectional analyses and single-cell resolution (FIG. 2; TABLE 1).
Figure 2. Highlighted genetic methods and strategies for prospective lineage tracing in vertebrate animal models and cell culture.
Early observational lineage studies used biological dyes for cell labelling and analysis, whereas advances in recombinant DNA technology, transgenesis and genome-editing platforms have revolutionized prospective lineage tracing. Although not mutually exclusive, these featured techniques are commonly used for the tracking of cell lineage and cell fate in animal models and cell culture. a | Sparse retroviral labelling integrates a reporter transgene and a short DNA barcode tag into the genome of the host cell. After propagation to progeny, cells derived from a common progenitor share the same barcode, whereas clonally unrelated cells harbour different barcodes. b | In a transposon plasmid vector system, such as piggyBac, a helper plasmid expressing a transposase excises (‘cut’) and integrates (‘paste’) a reporter transgene from a donor plasmid into the genome of a cell. Once the transgene is integrated, all daughter cells within that lineage will express the reporter. c | Genetic recombination systems, such as Cre-loxP, leverage the expression of recombinase enzymes to activate the expression of reporter genes in a cell-specific or tissue-specific manner. Once Cre is activated within a cell, all progeny will express the exogenous reporter gene. d | Much like single-colour reporters, multicolour mosaic systems harness recombination to label lineages with multiple unique colours. In the schematic, stochastic recombination at various loxP sites allows for the combinatorial expression of multiple fluorophore colour combinations. e | Genome-editing systems express a lineage barcode with a CRISPR target array that progressively and stably accumulates mutations over cellular divisions. Much like retrospective tracing, lineage relationships are reconstructed on the basis of the pattern of shared mutations among cells. CFP, cyan fluorescent protein; OFP, orange fluorescent protein; RFP, red fluorescent protein; YFP, yellow fluorescent protein.
Table 1. Lineage reconstruction techniques for prospective tracing
| Lineage marking method | Reconstruction strategy | Requirements | Refs |
|---|---|---|---|
| Microscopy or live imaging | |||
| Retroviral infection | After sparse viral infection, all progeny of that lineage will carry the reporter gene. Lineages expressing the reporter are visualized using microscopy. | Virus and microscope | 22–27 |
| Plasmid transfection | After transfection, progeny of the lineage will carry the reporter gene until cellular divisions dilute the episomal plasmid. Transposon systems can be used to integrate the transgene into the genome of the host to mitigate plasmid loss. Lineages expressing the reporter are visualized using microscopy. | Transfection agent or electroporator and microscope | 36–39, 43–47 |
| Tissue-specific genetic recombination | After Cre-based recombination, all progeny of that lineage will carry the reporter gene. Lineages expressing the reporter are visualized using microscopy. | Genetically modifiable lines and microscope | 51–53 |
| Multicolour mosaics | After Cre recombination, all progeny of that lineage will carry a combination of fluorophores. Lineages expressing a certain hue are visualized using microscopy. | Genetically modifiable lines and microscopes capable of resolving multiple colours | 12, 54–62 |
| Microfluidic capture | A single founder cell is cultured on a microfluidic chip and the next five generations of progeny are captured downstream using a hydrodynamic trap. Time-lapse imaging is used to visualize cellular divisions in real time. | Microfluidic chip and microscope | 63 |
| Sequencing of viral barcodes | |||
| Retroviral library infection | After viral infection, all progeny of that lineage will carry the reporter gene and a unique barcode. Once cells are isolated and sequenced, clones will harbour the same barcode. If LCM is used, the precise anatomical position of the cell within a clone may be recovered. Using sequenced barcodes, lineages can be clustered and plotted into dendrograms. | Constructed barcoded library, method for isolating cells for sequencing (FACS or LCM) and sequencing analysis | 28–35 |
| Sequencing of edited barcodes | |||
| CRISPR–Cas9 genome-editing systems | After viral infection, the lineage barcode will incrementally accumulate mutations in progeny over cellular divisions. Once cells are isolated and sequenced, lineage barcode hierarchies can be determined using maximum parsimony methods and plotted into dendrograms. | Virus with target CRISPR array barcode, Cas9, guide RNAs and sequencing analysis | 64–66 |
FACS, fluorescence-activated cell sorting; LCM, laser-capture microdissection.
Sparse retroviral labelling for lineage tracing
Since the advent of recombinant DNA technology in the late 1980s, retroviral libraries that contain reporter transgenes such as β-galactosidase (β-gal) and green fluorescent protein (GFP) have been used for cell labelling and lineage tracing in vertebrate animal models20,21. Retroviral vector-mediated gene transfer allows viruses to introduce recombinant DNA into the genome of a host cell. Viruses are applied at limiting dilutions with the goal of labelling single founder cells. The integrated exogenous DNA is then inherited by all the descendants of the infected cell. The DNA encodes a histochemical or fluorescent protein that can be easily assayed to label cells of a ‘clone’ and to elucidate cell fate choices within that clone. Histological and morphological analyses of the progeny of virally infected cells allow for post hoc fate mapping within a clonally related cell population.
Sparse retroviral infection has also been used in live-cell imaging of progenitors and their progeny in organotypic slice culture. Mouse, ferret, chimpanzee and human progenitors have all been analysed using time-lapse imaging. Individual progenitors that have been labelled with fluorescent reporter genes are visualized using confocal microscopy for multiple cellular divisions. At the end of the imaging experiment, immunohistochemistry and cellular morphology can then be used to analyse cell fate within the imaged clone22–26. Although ex vivo organotypic culturing conditions closely mimic the in vivo cellular environment, it is important to consider that progenitor divisions and behaviour may differ from those observed in vivo. Such experiments can usually be carried out for only a few days at most, and thus cannot typically relate clonal relationships to adult structure.
Initial studies using sparse retroviral labelling inferred clonality on the basis of the proximity of cells that express a reporter gene. Early studies in the cerebral cortex soon showed that sibling cells dispersed widely from one another in some clones27. To analyse such widespread clones, the first retroviral libraries were developed, encoding the lacZ gene as a reporter, but also using short DNA fragments to function as barcode tags28. Clonal relationships were then directly revealed through PCR amplification of the integrated barcode tags from cells dissected from tissue sections, rather than being inferred on the basis of proximity alone (FIG. 2a). Cells derived from a common progenitor share the same DNA tag at the vector integration site (IS) regardless of their patterns of migration, whereas clonally unrelated cells harbour different barcodes (FIG. 3). The first library of 100 tags soon expanded to 1,000 tags29,30 and then to essentially unlimited complexity using random oligonucleotide barcodes of identical size but with distinct sequences31,32.
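The barcode logic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the cell identifiers, barcode strings and function names are hypothetical, and the collision estimate assumes that every barcode in the library is equally likely to be drawn, which real libraries only approximate.

```python
from collections import defaultdict


def group_cells_by_barcode(cell_barcodes):
    """Group cell IDs into putative clones by their recovered barcode.

    cell_barcodes: dict mapping a cell ID to its barcode sequence
    (hypothetical data; real input would come from PCR or sequencing).
    """
    clones = defaultdict(list)
    for cell_id, barcode in cell_barcodes.items():
        clones[barcode].append(cell_id)
    return {barcode: sorted(cells) for barcode, cells in clones.items()}


def prob_shared_barcode(n_founders, library_size):
    """Probability that at least two of n independently labelled founders
    carry the same barcode, ASSUMING barcodes are drawn uniformly at random."""
    p_all_distinct = 1.0
    for i in range(n_founders):
        p_all_distinct *= (library_size - i) / library_size
    return 1.0 - p_all_distinct


# Hypothetical recovered barcodes: cells far apart may still share a tag.
recovered = {
    "cell_A": "ACGTTGCA",
    "cell_B": "ACGTTGCA",
    "cell_C": "TTGACCGT",
}
clones = group_cells_by_barcode(recovered)
```

Under the uniform assumption, two founders drawn from a 100-tag library share a tag with probability 0.01, and this risk grows with the number of labelled founders. This is one way to see why increasing library complexity matters when many founders are labelled in the same tissue.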
Figure 3. Prospective and retrospective lineage tracing of brain development.
This figure illustrates studies for assessing questions of neuronal migration and lineage, including whether neurons that share a common origin are physically adjacent to each other in the brain, and whether closely related cells are more likely to be adjacent than more distantly related cells, by prospective and retrospective lineage tracing. a | Using a prospective method in a model organism (ferret), cortical cells are traced using the injection of a tagged retroviral library, revealing two clonal lineages that are widely distributed across the brain (blue and green). A sagittal section of parietal cortex is shown following analysis by microscopy, demonstrating that blue-lineage neurons migrate into the cortex and spread laterally in a cone-shaped structure136,137. b | A similar, but retrospective, analysis carried out in human brain identifies somatic long interspersed nuclear element 1 (L1) retrotransposition events by sequencing and digital droplet PCR, revealing a widespread clone resulting from an early retrotransposition event (green) and a smaller clone that is restricted to a small region of frontal cortex, resulting from a later retrotransposition event (blue)81. Results from both approaches are consistent, leading to the conclusion that lineages that are marked early in mammalian neuronal development are spread across the brain and intermingled with other lineages, but those that are marked late in neuronal development are more spatially restricted and physically coherent. Part a is adapted from REF. 136, Ware, M. L., Tavazoie, S. F., Reid, C. B. & Walsh, C. A. Coexistence of widespread clones and large radial clones in early embryonic ferret cortex, Cereb. Cortex, 1999, 9 (6), 636–645, by permission of Oxford University Press, and from REF. 137, republished with permission of The Company of Biologists Ltd, from Clonal dispersion and evidence for asymmetric cell division in ferret cortex, Reid, C. B., Tavazoie, S. F. & Walsh, C. A. 124 (12), 1997. 
Part b is adapted with permission from REFS 17,81, AAAS and Elsevier, respectively.
Advances in transgenic animal lines have also extended the applications of retroviral genetic tagging and fate mapping. Cell type specificity can now be achieved using transgenic mouse lines that express virus receptors under the control of a cell type-specific promoter33,34. Only cells that contain the virus receptor can be infected and express the reporter gene or barcode, allowing for more precise viral targeting in vivo. Barcode tags can then be recovered using fluorescence-activated cell sorting (FACS) with the fluorescent reporter transgene or using laser capture microdissection (LCM) techniques that can preserve cellular position within the infected tissue for future reconstruction and analysis.
Although retroviral library labelling is an advantageous method for determining lineage relationships both in vivo and ex vivo, this technique does have some considerations and limitations: only cells with the capacity to divide will propagate the barcode to progeny; retroviral vectors can spontaneously silence, such that many retrovirally transduced cells are no longer histochemically labelled even though their DNA can be detected in the tissue; and barcode tag recovery from single cells can be challenging32,33. However, experiments can be designed to mitigate these drawbacks, and current technological advances may circumvent some of these limitations. Retroviral silencing is thought to be a stochastic event, thus overall clonal size or complexity may be underestimated but should not otherwise skew experimental results. To address the challenge of barcode recovery, new studies have been combining retroviral library labelling with high-throughput next-generation sequencing. This advance, which has been used to track mouse haematopoietic stem cells in vivo, allows not only for more precise barcode identification and quantification compared with Sanger sequencing but also for single-cell sensitivity35.
Plasmid transfection labelling for lineage tracing
In addition to viral infection, reporter transgenes for cell labelling and fate mapping can be introduced into cells by DNA plasmid transfection. Lipofection, which is a common lipid-based system, has been used to transfect the developing Xenopus laevis retina and to trace retinal cell fate in vivo36. Lipofection continues to be a popular method for both in vivo and in vitro lineage studies. Electroporation, which is an alternative non-viral delivery method, uses electrical fields to increase cell membrane permeability to recombinant DNA. Electroporation has also been used to deliver reporter transgenes that encode fluorescent proteins to track cells both in vitro and in various vertebrate animal models37,38. To introduce recombinant DNA plasmids into neural progenitors in vivo, in utero electroporation (IUE) has proved to be an efficient technique38. Reporter gene plasmids can be injected into the ventricles of the developing brain and then introduced into neural progenitors that line the ventricular wall by electrical pulses. A reporter transgene, such as GFP, is then carried episomally by the progenitor cell and passed on to subsequent daughter cells. Unlike retroviral labelling, however, plasmid DNA is not integrated into the genome of the progenitor and becomes diluted or inactivated in progeny after serial cellular divisions. Plasmid electroporation techniques, therefore, are transient and fail to label the entire lineage39.
A solution to plasmid loss and inactivation is a DNA transposon system, which stably integrates the reporter transgene into the genome of the progenitor (FIG. 2b). Transposon systems include Mos1, Tol2, Sleeping Beauty (SB) and piggyBac (PB), which mobilize through a cut-and-paste mechanism40–42. The typical transposon system is used in a dual plasmid format, in which a donor plasmid contains the reporter transgene of interest and a helper plasmid expresses the transposase. The donor plasmid includes terminal repeats that flank the transgene, which allows for random genomic integration by the transposase. The transgene is then propagated to all progeny within the lineage but with limited further transpositional mobility because the helper plasmid that encodes the transposase, like any episomal plasmid, will be diluted over cellular divisions. Expression from donor and helper plasmids can be driven by different promoters, allowing for cell type specificity and genetic intersectional analyses. Compared with other transposon systems, piggyBac has a more precise cut-and-paste mechanism, higher transposition efficiency and a larger cargo capacity43. These attributes have made the piggyBac transposon system particularly popular. In addition, piggyBac transposase can be co-electroporated with multiple fluorescent reporter constructs, each of which is driven by a cell type-specific promoter. In this experimental design, multiple lineages can be examined in a single animal44. PiggyBac has been successfully used in multiple mammalian cell lines and in combination with in utero electroporation to track and manipulate cell lineages in animal models44–47.
The piggyBac transposon plasmid system allows for remarkable flexibility and cell type specificity, but as with any random genomic insertion event, the precise location and number of transposition occurrences introduce a risk of confounded results owing to mutagenesis. Transposition of the reporter transgene may cause endogenous genes at or near the insertion site to become unintentionally dysregulated. One study, however, found no evidence of mutagenesis by transposon insertion in cells that were labelled using the piggyBac IUE method43. Transposase plasmid systems are a remarkable tool for transgenesis and cell lineage tracking in both classically genetically modifiable animal models, such as mice, and in otherwise non-genetically tractable animals, such as the ferret.
Genetic recombination for lineage tracing
Cell lineage tracing by genetic recombination leverages the expression of recombinase enzymes in a cell-specific or tissue-specific manner to activate the expression of a conditional reporter gene. Two genetically encoded, site-specific recombination systems are Cre-loxP and FLP-FRT. In the Cre-loxP system, mice are engineered to express Cre recombinase under the control of a chosen promoter, limiting Cre expression to a specific tissue or cell type48 (FIG. 2c). These lines are then crossed with a second line in which a reporter transgene, such as lacZ or GFP, is preceded by a loxP-flanked transcriptional stop (loxP-STOP-loxP) cassette. In cells that express Cre recombinase, the STOP sequence is excised and the reporter transgene is expressed. Temporal control of recombination can be gained by using an inducible Cre system, which selectively activates Cre under promoters that may also be active at undesired time points, such as embryogenesis. In an inducible system, Cre recombinase is fused to the human oestrogen or progesterone receptor and activated only in the presence of an anti-oestrogen, such as tamoxifen, or an anti-progestin, respectively. A pulse of tamoxifen administration with an inducible Cre system can be used to determine lineage relationships49. Leakiness is a common problem of inducible Cre systems50 but, nonetheless, these inducible systems have been used for lineage tracing in many adult tissues.
To gain even more cell type specificity, an intersectional approach with the Split-Cre or a combination of the Cre-loxP and FLP-FRT site-specific recombination systems may be used. The Split-Cre system expresses two cleaved, inactive Cre fragments with each driven by a different promoter. Only when both promoters are concurrently expressed within the same cells of a population will the Cre enzyme be reconstituted, the STOP cassette be excised and the reporter transgene expressed48. Cre-loxP can also be combined with the FLP-FRT system for higher-resolution fate mapping and a reduction in background leakage51–53. In this intersectional method, both site-specific recombinases are required to excise two STOP cassettes and activate the reporter transgene. Once the STOP cassettes are removed, all progeny will also express the reporter transgenes.
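The intersectional Cre-loxP and FLP-FRT design amounts to a heritable logical AND, which a small Python model can make explicit. This is a minimal sketch under simplifying assumptions: the `Cell` class and its attribute names are hypothetical, excision is treated as complete and irreversible, and leakiness and mosaic recombination are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Toy genomic state of a dual-STOP intersectional reporter (hypothetical model)."""
    lox_stop_present: bool = True  # loxP-flanked STOP cassette still in the genome
    frt_stop_present: bool = True  # FRT-flanked STOP cassette still in the genome

    def expose(self, cre_active, flp_active):
        """Return the cell after recombinase exposure; an excised cassette never returns."""
        return Cell(
            lox_stop_present=self.lox_stop_present and not cre_active,
            frt_stop_present=self.frt_stop_present and not flp_active,
        )

    def divide(self):
        """Both daughters inherit the parental genomic state unchanged."""
        return (self, self)

    @property
    def reporter_on(self):
        """The reporter is expressed only once both STOP cassettes are gone."""
        return not (self.lox_stop_present or self.frt_stop_present)


founder = Cell()
only_cre = founder.expose(cre_active=True, flp_active=False)
both = founder.expose(cre_active=True, flp_active=True)
daughter_1, daughter_2 = both.divide()
```

In this model `only_cre.reporter_on` is false, whereas `both` and its daughters report true even after the recombinases are no longer active, which mirrors the permanent marking of progeny described above.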
In addition to single reporter transgene recombination mouse lines, dual or multicolour reporter lines have become increasingly popular for tracking cell lineage relationships. Mosaic analysis with double markers (MADM) uses a Cre-loxP system to express GFP and red fluorescent protein (RFP) in cell populations of interest54. Before recombination, no reporter transgene is expressed, but after Cre recombinase is activated, one or both of the transgenes are reconstituted. Green, red or double-labelled yellow cells are generated depending on the recombination and the chromosomal segregation type. MADM can be used with cell type-specific and inducible Cre systems to provide single-cell resolution and to more precisely examine progenitor division patterns33,54–56. Multicolour lineage tracing is also possible with recent mouse reporter lines, including Brainbow and Confetti57,58 (FIG. 2d). The Brainbow mouse lines harness stochastic Cre-mediated recombination using incompatible loxP sites to drive the combinatorial expression of fluorescent reporter transgenes. The Brainbow mouse can label individual cells with as many as 90 distinguishable colours through the stochastic expression of several fluorescent reporter transgenes. Cells that express a particular colour share a common lineage. A modified line, the Confetti mouse, ubiquitously expresses Cre from the Rosa26 locus and has been used to track individual stem cell lineages in the mouse intestinal crypt58. With the expression of a multitude of unique colours, co-staining with antibodies to determine protein expression within Brainbow or Confetti mice is nearly impossible. Endogenous fluorescence of the reporter genes, however, can be used for imaging clones. Advances in microscopy, such as the two-photon microscope, continue to make these lines an attractive choice for in vivo cell lineage tracing.
It is important to note that these lineage-marking strategies are not mutually exclusive. Viral vectors can carry Cre recombinase to sparsely activate transgene recombination, and the Cre-loxP system can be used to drive the conditional expression of viral receptors, adding a greater level of cell type specificity59,60. Viral libraries may combine exogenous DNA barcodes with multicolour reporters to trace cell lineage. Using this type of marking strategy, clonal relationships of murine hepatocytes and leukaemic cells were recently investigated61. Other multicolour mosaic constructs such as Brainbow can also be expressed in the form of a viral vector library for random colour expression in infected cells. This method has been used successfully to visualize multiple clones in a single developing mouse embryo12. The multiaddressable genome-integrative colour (MAGIC) marker toolkit, which is a recent transposon-based Brainbow transgene method, has been used to track progenitors in both the embryonic mouse brain and spinal cord62. As new transgenic animal models and genetic labelling tools are developed, experiments that harness multiple technologies in concert will continue to remain a powerful approach for cell-specific and tissue-specific tracing in model organisms and culture.
Recent methodological advances in prospective lineage tracing
Innovations in both microfluidic platforms and genome-editing strategies have also recently been used to prospectively track cell lineage63,64 (TABLE 1). Advances in microfluidic technologies now allow for the capture and culture of single progenitor cells and up to five generations of their progeny on a single chip. In vitro time-lapse imaging for both division kinetics and the identification of lineage relationships can be coupled with on-chip immunohistochemistry to assess cell fate within the captured clones. Clones can also be retrieved after culturing for single-cell transcriptomics with known lineage relationships. Kimmerling et al.63 used this microfluidic trap array technology, paired with single-cell RNA sequencing (RNA-seq), to look at both interclonal and intraclonal variability in activated CD8+ T cells; they demonstrated that lineage-dependent transcriptional profiles corresponded to functional cellular phenotypes. This study was the first to link single-cell transcriptomics with cell lineage history63. Combining prospective lineage tracing with RNA-seq allows for the overlay of phenotypic cell identity with genetic lineage information for a more comprehensive view of clonal relationships. Moving forwards, the layering of ‘omics’ technologies such as transcriptomics and proteomics with genetic-based tracking will allow for the deeper analysis of identity and lineage within a cell population.
Recently, the CRISPR–Cas9 genome-editing technology has been used to track and to synthetically reconstruct cell lineage relationships in complex, multicellular organisms (FIG. 2e). McKenna et al.64 developed genome editing of synthetic target arrays for lineage tracing (GESTALT), a highly multiplexed method that uses barcodes that consist of multiple CRISPR–Cas9 target sites64. These barcodes progressively and stably accumulate unique mutations over cellular divisions and can be recovered using targeted sequencing. Cell lineage relationships are determined on the basis of the pattern of shared mutations among analysed cells. Although prospective in the sense that the barcode is introduced at the start of the experiment, the GESTALT method also parallels retrospective, somatic-mutation-based tracking (discussed below). The incrementally edited barcodes from thousands of cells were then used in large-scale reconstructions of multiple cell lineages within cell culture and zebrafish. Although the precise anatomical position and cell type of each assayed cell cannot be determined using this method, this study and other emerging studies demonstrate the potential for cumulative and combinatorial barcode editing in prospective lineage tracing of whole organisms64–68. Advances during the past 30 years, since the advent of genetic barcoding and recombinase-based transgenic animals, have allowed prospective cell lineage tracking experiments not only to uncover clonal relationships at the single-cell level but also to map cell fate choices in a wide variety of cells, tissues and model organisms.
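The idea of reconstructing lineage from patterns of shared barcode edits can be illustrated with a short Python sketch. It is not the maximum parsimony approach listed in Table 1 and is not taken from the GESTALT study; it is a simpler nesting procedure that assumes every edit arises only once and is never lost or overwritten. The cell names, edit labels and function names are hypothetical.

```python
def clades_from_edits(cell_edits):
    """Return the set of clades, one per edit: the cells that carry that edit."""
    carriers = {}
    for cell, edits in cell_edits.items():
        for edit in edits:
            carriers.setdefault(edit, set()).add(cell)
    return {frozenset(cells) for cells in carriers.values()}


def is_nested(clades):
    """True if every pair of clades is disjoint or nested (tree-compatible)."""
    groups = list(clades)
    for i, a in enumerate(groups):
        for b in groups[i + 1:]:
            if a & b and not (a <= b or b <= a):
                return False
    return True


def build_tree(cells, clades):
    """Nest cells into tuples: the largest clades inside a group become its children."""
    cells = frozenset(cells)
    if len(cells) == 1:
        return next(iter(cells))
    # Proper sub-clades of this group, largest first, with a deterministic tie-break.
    inner = sorted((c for c in clades if c < cells),
                   key=lambda c: (-len(c), sorted(c)))
    children, covered = [], set()
    for clade in inner:
        if not clade & covered:
            children.append(build_tree(clade, clades))
            covered |= clade
    # Cells not placed in any sub-clade attach directly to this node.
    children.extend(sorted(cells - covered))
    return tuple(children)


# Hypothetical edited barcodes: each cell maps to the set of edits it carries.
cell_edits = {
    "c1": {"e1", "e2"},
    "c2": {"e1", "e2"},
    "c3": {"e1", "e3"},
    "c4": {"e4"},
}
clades = clades_from_edits(cell_edits)
tree = build_tree(cell_edits.keys(), clades)
```

Here `tree` is `((("c1", "c2"), "c3"), "c4")`: cells c1 and c2 share the most edits and are grouped first, c3 joins them through the earlier shared edit e1, and c4 is unrelated at the level of the recorded edits. Real data contain recurrent edits and dropout, which is why parsimony or likelihood methods are used in practice.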
Retrospective methods of lineage tracing
It has only recently become possible to harness naturally occurring mutations to infer cell lineage information retrospectively, mostly owing to advances in the genome sequencing of single cells. Similar to prospective lineage tracers in model organisms, somatic mutations indelibly mark the progeny of the dividing cell in which they occurred, and the cells bearing these naturally occurring lineage marks can be later analysed to reconstruct the genealogy of organs and cell types69. To use naturally occurring somatic mutations for lineage tracking, it is first necessary to discover the mutations that are shared between multiple cells from that individual, but because somatic mutations are, by their nature, low frequency, they are difficult to identify through the sequencing of a mixed population of cells at conventional depths. Innovations in next-generation sequencing, such as the declining cost of deep next-generation genome sequencing and the advent of single-cell genome sequencing, have made it possible to discover rare mutations that mark minority lineages within a larger cellular population70. These variants, from the least frequently somatically mutated to the most frequently mutated, include retrotransposons, copy-number variants (CNVs), single-nucleotide variants (SNVs) and microsatellites (FIG. 4). The different rates at which these variants occur in somatic tissues allow lineage tracing experiments to be conducted at different levels of granularity according to the types of variants, tissue and disease state selected (TABLE 2), although precise frequency estimates for each type of variant have not been measured in a consistent way, and are therefore not provided here. 
Single-cell genome sequencing promises to revolutionize lineage tracking in humans; however, whole-genome sequencing currently requires considerably more DNA than the 6 picograms that are present in a single cell, necessitating pre-sequencing genome amplification, which can introduce technical artefacts and complications to a lineage-tracing experiment71,72.
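The difficulty of finding low-frequency somatic variants in a mixed population at conventional depth can be made concrete with a simple sampling model. The sketch below is an assumption-laden illustration rather than a variant-calling method: it treats reads as independent binomial draws, ignores sequencing and amplification error, and assumes a heterozygous variant in a diploid region. The function names and the example numbers are hypothetical.

```python
from math import comb


def expected_allele_fraction(cell_fraction, ploidy=2, mutant_copies=1):
    """Expected fraction of reads carrying the variant in a bulk sample,
    given the fraction of cells that carry it (diploid, heterozygous by default)."""
    return cell_fraction * mutant_copies / ploidy


def prob_at_least_k_reads(depth, allele_fraction, k):
    """Binomial probability of sampling at least k variant reads at one site."""
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


# Hypothetical scenario: a variant present in 2% of cells, requiring 3 supporting reads.
af = expected_allele_fraction(0.02)
p_conventional = prob_at_least_k_reads(depth=30, allele_fraction=af, k=3)
p_deep = prob_at_least_k_reads(depth=300, allele_fraction=af, k=3)
```

Comparing `p_conventional` with `p_deep` shows how strongly the chance of observing a minority lineage depends on depth under this model, which is the motivation for deep sequencing or for sequencing single cells, in which a heterozygous variant is expected in roughly half of the reads at that site.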
Figure 4. Somatic mutation in the genome.
Somatic mutations in the genome include (in order of increasing frequency): long interspersed nuclear element 1 (L1) retrotransposition events, copy-number variation, single-nucleotide variants, microsatellite (short tandem repeat) variants and single-strand lesions. Each class of mutation arises through a different mechanism, such as DNA polymerase slippage for microsatellites and cytosine deamination for single-nucleotide lesions. Furthermore, each class of mutation has different functional consequences for the genome of the cell in which it occurs, such as gene or enhancer disruption (L1 retrotransposition) and increased protein production (copy-number variation).
Table 2.
Examples of somatic mosaic mutations identified by sequencing
| Tissue | Genome source | Sequencing method | Approximate depth of coverage | Genome amplification method | Refs |
|---|---|---|---|---|---|
| L1 retrotransposition event | |||||
| Normal human brain | Single cell | L1 insertion profiling (L1-IP) | Low (0.35x) | MDA and MALBAC | 78 |
| Normal human brain | Single cell | WGS | High (40x) | MDA | 80 |
| Normal human brain | Single cell | Retrotransposon capture sequencing (RC-seq) | Low (0.35x) | MALBAC | 79 |
| Copy-number variant | |||||
| Human breast cancer | Single cell | WGS | Low (0.05x) | In vivo amplification and DOP-PCR | 14 |
| Normal human skin | Single cell | WGS | High (20x) | In vivo amplification | 82 |
| Human cancer cell line | Single cell | WGS | High (25x) | MALBAC | 119 |
| Normal human brain | Single cell | WGS | Low (0.04x) | MDA and GenomePlex | 84 |
| Human breast cancer | Single cell | WGS | High (50x) | In vivo amplification and DOP-PCR | 15 |
| Normal human brain | Single cell | WGS | Low (0.2x) | GenomePlex | 85 |
| Human breast cancer line | Single cell | WGS | Low (0.05x) | DOP-PCR | 86 |
| Normal rat brain | Single cell | WGS | Low (1x) | MALBAC and GenomePlex | 118 |
| Normal human brain, keratinocytes | Single cell | WGS | Low (0.1x) | DOP-PCR | 83 |
| Single-nucleotide variant | |||||
| Human leukaemia | Tumour and relapse samples | WGS | High (25x) | None | 105 |
| Human myeloproliferative neoplasm | Single cell | WES | High (15x) | MDA | 108 |
| Human kidney tumour | Single cell | WES | High (15x) | MDA | 100 |
| Human breast cancer | Single cell | WGS | High (50x) | In vivo amplification and DOP-PCR | 15 |
| Human bladder cancer | Single cell | WES | High (40x) | MDA | 99 |
| Normal mouse gut | Single cell | WGS | High (30x) | In vivo amplification | 16 |
| Human leukaemia | Single cell | Targeted sequencing | Not provided | MDA | 109 |
| Normal human skin | Biopsy | Targeted sequencing | Very high (500x) | None | 106 |
| Normal human brain | Single cell | WGS | High (40x) | MDA | 17 |
| Normal mouse brain | Single cell | WGS | High (40x) | In vivo amplification | 88 |
| Microsatellite | |||||
| Mutant mouse tumour | Single cell | Targeted genotyping | NA | MDA | 103 |
| Normal mouse (various tissues) | Single cell | Targeted genotyping | NA | In vivo amplification | 69 |
| Mutant mouse colon | Single cell | Targeted genotyping | NA | MDA | 93 |
| Mutant mouse oocytes | Single cell | Targeted genotyping | NA | MDA | 94 |
| Human leukaemia | Single cell | Targeted genotyping | NA | MDA | 104 |
| Normal human brain | Single cell | WGS | High (40x) | MDA | 81 |
Recent work identifying somatic mutations is organized by type of variant interrogated and date of publication. Each study’s source tissue and source genome are listed, together with the genome amplification and sequencing approach used and approximate depth of coverage (if given). DOP-PCR, degenerate oligonucleotide priming PCR; L1, long interspersed nuclear element 1; MALBAC, multiple annealing and looping-based amplification cycles; MDA, multiple displacement amplification; NA, not applicable; WES, whole-exome sequencing; WGS, whole-genome sequencing.
Somatic mutations for lineage tracing in normal tissue
Endogenous retroelements, which principally include long interspersed nuclear element 1 (L1; also known as LINE-1) elements, constitute much of the human genome; L1 elements alone constitute nearly one-fifth of the genome73. A very small number of these L1 elements retain the ability to mobilize in humans and can insert into a new genomic location during somatic cell division74, which has raised substantial interest in their potential contribution to somatic diversity, especially within complex tissues, such as the brain75. Initial experiments using quantitative PCR (qPCR)76 or DNA sequencing77 of bulk human brain suggested large numbers of apparent somatic L1 mobilization events. However, more precise estimates, derived by sorting single neurons, amplifying the whole genome and analysing L1 retrotransposition at the single-cell level, suggest fewer than one somatic insertion per neuronal genome on average78. A second study suggests higher rates (10–15 somatic insertions per genome)79, but this study is subject to criticism for the inclusion of sequencing and other technical artefacts, the removal of which reduces the estimated rate to <1 somatic insertion per neuron80. A single-neuron whole-genome sequencing study81 confirms the low rate of L1 retrotransposition events but also illustrates the striking spatial distribution patterns of clonal retrotransposition events, providing strong proof of principle for the use of spontaneous somatic L1 events for lineage tracing. Using a digital droplet PCR assay, one somatic L1 insertion was found across the cortex, whereas a second somatic L1 insertion was restricted to a small region of prefrontal cortex, indicating that the first L1 insertion occurred early in brain development, with the second occurring later81 (FIG. 3).
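The droplet-based quantification described above can be illustrated with a short sketch. Digital droplet PCR counts positive and negative droplets and applies a Poisson correction to estimate the mean number of template copies per droplet; comparing an insertion-specific assay with a reference assay then gives an estimate of the fraction of cells that carry the insertion. The function names, the droplet counts and the assumption of a heterozygous insertion measured against a two-copy reference locus are illustrative choices, not details taken from ref. 81.

```python
import math

def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of template copies per droplet."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1 - positive / total)

def mosaic_cell_fraction(insertion_positive, reference_positive, total):
    """Estimated fraction of cells carrying the insertion.

    Assumption: the reference locus is present in two copies per cell and
    the insertion in one copy per carrying cell (heterozygous), so the
    cell fraction is twice the insertion/reference copy ratio.
    """
    ratio = (copies_per_droplet(insertion_positive, total)
             / copies_per_droplet(reference_positive, total))
    return min(1.0, 2 * ratio)

# Hypothetical droplet counts for two insertions assayed in the same region.
widespread = mosaic_cell_fraction(insertion_positive=400,
                                  reference_positive=6000, total=15000)
restricted = mosaic_cell_fraction(insertion_positive=20,
                                  reference_positive=6000, total=15000)
print(f"widespread insertion: {widespread:.3f}")
print(f"restricted insertion: {restricted:.3f}")
```

Under these assumptions, an insertion that is detected at a higher cell fraction and across more regions is interpreted as having arisen earlier in development than one confined to a small region.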
Subchromosomal somatic copy-number variation is common in human tissues, and somatic CNVs are potentially useful lineage-tracing tools owing to the relative ease with which they can be detected from single-cell sequencing data. Large subchromosomal somatic CNVs can be detected in normal skin82,83 and brain83–85, and these studies report large proportions of skin cells and neurons, approximately 30–70%, that contain at least one somatic CNV, including a small number of shared CNVs that arose during development85. Furthermore, the analysis of clonal CNVs can also illuminate genes and lineages that are responsible for disease; for example, brain tissue from patients with hemimegalencephaly contains neurons with somatic copy-number gains of chromosome 1q (containing the growth-promoting gene AKT3)85. CNVs are particularly promising as lineage-marking somatic variants; unlike other types of somatic mutations, they can be identified from low-coverage (<1×) sequencing, given sufficiently even genome amplification (see ‘Methodological considerations for retrospective lineage tracing’, below), making the sequencing of many single cells for variant discovery a cost-effective strategy86.
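To make the point about low-coverage sequencing concrete, the following is a minimal sketch of read-depth-based CNV calling. It assumes that reads have already been counted in fixed genomic bins for one cell and for a matched reference profile; the bin counts, thresholds and function names are hypothetical, and published pipelines use more elaborate normalization and segmentation than this simple run-length rule.

```python
from statistics import median

def copy_number_profile(cell_counts, reference_counts, ploidy=2):
    """Per-bin copy-number estimate from binned read counts.

    Assumes every reference bin has a non-zero count.
    """
    ratios = [c / r for c, r in zip(cell_counts, reference_counts)]
    scale = median(ratios)  # the median bin is assumed to be at normal ploidy
    return [ploidy * ratio / scale for ratio in ratios]

def call_segments(profile, min_bins=3, tolerance=0.5, ploidy=2):
    """Return (start, end, state) for runs of at least min_bins consecutive
    bins that deviate from the expected ploidy by more than tolerance."""
    calls, start, state = [], None, None
    # A neutral sentinel bin closes any run that reaches the last real bin.
    for i, cn in enumerate(list(profile) + [float(ploidy)]):
        if cn > ploidy + tolerance:
            current = "gain"
        elif cn < ploidy - tolerance:
            current = "loss"
        else:
            current = None
        if current != state:
            if state is not None and i - start >= min_bins:
                calls.append((start, i, state))
            start, state = i, current
    return calls

# Hypothetical counts: bins 3 to 6 carry one extra copy in this cell.
cell = [100, 100, 100, 150, 150, 150, 150, 100, 100, 100, 100, 100]
reference = [100] * 12
print(call_segments(copy_number_profile(cell, reference)))
```

Because each bin pools many reads, a gain or loss spanning several bins remains visible even when per-base coverage is far below 1×, provided that amplification is even enough for the bin counts to be comparable.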
SNVs are a major source of evolutionary and disease-causing mutations, although they can also occur very frequently in non-coding portions of the genome without functional effects on somatic cells87. Thus, somatic SNVs represent a rich source of lineage-marking mutations, as they are abundant and can be expected to be functionally neutral in many cases. Indeed, pioneering work in mouse stomach, intestine and prostate16, and mouse brain88 and human brain17, suggests that somatic SNVs can be identified from single cells or clones and used to reconstruct developmental lineages. These works disagree as to the precise rates of mutation, which is potentially attributable to differences in species and methodology, as two studies amplified mouse single-cell genomes in vivo by organoid cell culture16 or somatic cell nuclear transfer88, estimating approximately 100–600 somatic SNVs per cell, and one study amplified human single-cell genomes from post-mortem tissue in vitro by multiple displacement amplification (MDA)17, estimating approximately 1,500 somatic SNVs per cell. These discrepancies could be resolved by the development of new algorithms that are specifically designed for the interpretation of single-cell genome-sequencing data89, and also by increasing the variety of cell types and tissues subjected to single-cell genome sequencing. Regardless of the precise rate of SNV mutation in somatic tissues, it is clear that somatic SNVs can be used as endogenous lineage tracers; in one study, 9 of 16 sequenced neurons, and 136 of 226 total neurons from the same area of cortex, could be placed in a lineage tree with four independent clades that diverged before gastrulation. One clade contained a nested set of 11 somatic mutations, which were progressively regionally restricted across the brain and were present in progressively decreasing frequency in bulk tissue17, suggesting that the analysis of such nested mutations might allow the examination of the progressively branching lineage trees that characterize the developing embryo.
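The logic of nested mutations described above can be sketched in a few lines. If each somatic mutation arises once and is never lost, the sets of cells carrying any two mutations must be either nested or disjoint, and within a nested series the most widely shared mutation is the earliest. The genotypes and function names below are hypothetical, and the sketch ignores the allelic dropout and false-positive calls that real single-cell data would contain.

```python
def carriers(genotypes):
    """Map each mutation to the set of cells in which it was called.

    genotypes: dict of cell name -> set of mutation identifiers.
    """
    by_mutation = {}
    for cell, mutations in genotypes.items():
        for mutation in mutations:
            by_mutation.setdefault(mutation, set()).add(cell)
    return by_mutation

def is_tree_compatible(genotypes):
    """True if every pair of mutations is nested or disjoint, which is the
    condition for placing all cells on a single lineage tree."""
    sets = list(carriers(genotypes).values())
    return all(a <= b or b <= a or not (a & b)
               for i, a in enumerate(sets) for b in sets[i + 1:])

def nested_order(genotypes):
    """Mutations ordered from most widely shared (inferred earliest)
    to most restricted (inferred latest)."""
    by_mutation = carriers(genotypes)
    return sorted(by_mutation, key=lambda m: (-len(by_mutation[m]), m))

# Hypothetical neurons: A is shared by three cells, B by two, C by one.
neurons = {
    "n1": {"A", "B", "C"},
    "n2": {"A", "B"},
    "n3": {"A"},
    "n4": {"D"},
}
print(is_tree_compatible(neurons), nested_order(neurons))
```

In this toy example, mutations A, B and C form a nested series within one clade, whereas D marks a separate clade.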
The most frequently mutated somatic loci are likely to be microsatellites, as DNA polymerase slipping makes them highly variable both between and within individuals90. Owing to the instability of microsatellite repeats, the analysis of all microsatellite locations in the genome is predicted to be capable of reconstructing the entire cell lineage tree of an organism91, using methods adapted from organism-level phylogenetic analysis92. Microsatellites have been used to reconstruct the cell lineage decisions that lead to the development of colonic crypts93 and the female germ line94; in the female germ line study, 81 microsatellite loci were analysed in mismatch repair-deficient mice (which have elevated rates of microsatellite instability), allowing the oocyte lineage to be reconstructed in comparison to cells from the bone marrow and ovarian cumulus cells. The oocytes formed a lineage that was distinct from both bone marrow and cumulus cells, but oocytes from the left and right ovaries did not form distinct sub-clusters, demonstrating that oocytes were generated at a time before the segregation of somatic cells on the left and right sides of the body, and thus were not generated during postnatal life94. Similar to microsatellites, the polyadenylated tracts following somatic L1 retrotransposition events are subject to frequent DNA polymerase slippage during replication and, therefore, lineages that are defined by a somatic L1 retrotransposition event can be further delineated by analysing poly(A) tail polymorphisms81.
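As a rough illustration of how microsatellite genotypes can be turned into a lineage tree, the sketch below computes a pairwise distance from repeat-number differences and joins cells by average-linkage clustering. This is a simplified stand-in for the phylogenetic methods used in the cited studies; the loci, repeat numbers and cell names are hypothetical.

```python
from itertools import combinations

def repeat_distance(a, b):
    """Mean absolute difference in repeat number over loci typed in both cells."""
    shared = [locus for locus in a if locus in b]
    if not shared:
        raise ValueError("cells share no genotyped loci")
    return sum(abs(a[locus] - b[locus]) for locus in shared) / len(shared)

def average_linkage_tree(cells):
    """Join the closest clusters repeatedly; returns nested tuples of cell names.

    cells: non-empty dict of cell name -> {locus: repeat number}.
    """
    dist = {frozenset(pair): repeat_distance(cells[pair[0]], cells[pair[1]])
            for pair in combinations(cells, 2)}
    # Each cluster is (subtree, list of member cell names).
    clusters = [(name, [name]) for name in sorted(cells)]

    def linkage(pair):
        (_, members_a), (_, members_b) = pair
        total = sum(dist[frozenset((x, y))] for x in members_a for y in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(((left[0], right[0]), left[1] + right[1]))
    return clusters[0][0]

# Hypothetical genotypes at two loci.
genotypes = {
    "oocyte_left": {"locus1": 12, "locus2": 20},
    "oocyte_right": {"locus1": 12, "locus2": 21},
    "bone_marrow": {"locus1": 15, "locus2": 17},
}
print(average_linkage_tree(genotypes))
```

In this toy data set the two oocytes join each other before either joins the bone marrow cell, which mirrors the kind of grouping by lineage rather than by side of the body that is described above.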
Somatic mutations for tracing cancer evolution
Cell lineage tracing is useful for describing the natural history of a tumour, as lineage analysis can identify the source of a metastasis or the accumulation of mutations that lead to unchecked growth. Although frequently mutated microsatellite loci were identified in cancer cells two decades ago and used to mark tumour lineages95,96, tracing complex mutational paths in cancerous tissue required the advent of more rapid and comprehensive methods, especially single-cell sequencing and bioinformatic analysis, to identify minority clones with multiple progressive mutations97,98. Lineage tracing in cancer tissue has several advantages compared with lineage tracing in normal tissue, including the ability to compare a tumour sample with a paired normal sample99,100, and the availability of tumour samples from living individuals owing to surgical resection15. Furthermore, the rapid mutation rate and genomic instability of cancer cells generate large numbers of clonal mutations and rearrangements, which themselves facilitate lineage-tracing analyses101, enabling cancer biologists to draw important biological insights even in the few years since single-cell sequencing has become possible.
Lineage tracking has been powerfully applied to investigate tumour evolution over time through the comparison of initial tumour samples with metastases or relapse samples102. In one pioneering study, 37 cells from a primary tumour, two secondary metastases and surrounding normal tissue were removed from a mismatch repair-deficient mouse using laser-capture microdissection (LCM). Cellular genomes were subjected to whole-genome amplification and genotyped at 100 microsatellite loci. Tumour cells formed a coherent clade and phylogenetically clustered away from normal surrounding cells, with physically adjacent tumour cells tending to be more closely related by lineage than non-adjacent tumour cells, indicating that the tumour mostly grew in place without substantial cellular migration103. Using similar methods, microsatellite-based lineage tracking was applied to paired original and relapse samples from patients with leukaemia, identifying some relapses that resulted from slowly dividing cells that were present in the original sample, and others that resulted from the enrichment of particular subclones that were present in the original sample or from a lineage that was almost entirely distinct from the original sample104. Although functionally neutral microsatellite mutations in single cells are likely to provide a more unbiased survey of lineage variation, mutational burden in leukaemia is high enough that sequencing candidate somatically mutated genes in original and relapse samples, even in bulk rather than in single cells, can also establish a detailed picture of clonal evolution in relapse. In one study, two major patterns of evolution were identified: in the first case, a fairly homogeneous initial clone remained the dominant clone in relapse; and in the second case, a minor subclone from the initial sample became the dominant clone in relapse, with the other original subclones lost following initial treatment. In both cases, the dominant subclone in relapse tended to acquire further mutations, possibly as a result of the treatment itself105.
As the preceding studies make clear, a more comprehensive understanding of cellular heterogeneity is crucial for understanding both the development of cancerous lesions and their resistance to treatment. Deep sequencing of small skin biopsy samples demonstrates that normal skin, which is an organ that is exposed to considerable environmental mutagens, carries a heavy burden of mutation, and this heterogeneity in normal tissue provides ample raw material for the development of malignancies, as oncogenic mutations are frequently found and positively selected even in healthy skin tissue106. Similarly, primary pancreatic tumours contain substantial numbers of deleterious mutations up to a decade before the origination of subclones that are capable of giving rise to metastases107. The expansion of mutation-carrying subclones is not strictly necessary for the development of some malignancies, such as kidney clear cell renal cell carcinoma, which seem to be genetically heterogeneous when subjected to single-cell exome sequencing, with little evidence of dominant subpopulations by principal component analysis or phylogenetic clustering100. By contrast, as suggested by the relapse studies discussed above, leukaemia samples generally demonstrate a clear clonal structure when analysed using single-cell sequencing, with dominant oncogenic clones108 and clonal structural variation occurring before the acquisition of oncogenic point mutations109.
As with species-level evolution by natural selection, single-cell lineage analyses strongly suggest that tumours evolve irregularly, with periods of mutational stasis followed by punctuated expansions. In breast cancer tissue, 100 single genomes were analysed from two tumours for ploidy and copy-number variation, and investigators identified a few distinct primary subclones in each case, rather than a large number of more closely related subclones. This indicates that large-scale genomic aberrations accumulated in punctuated bursts, with mutated cell populations rapidly emerging to dominate the cancer cell population14. By contrast, the analysis of point mutations in breast cancer suggests that these primary subclones are established early and remain stable through later tumour evolution, but are quite genetically heterogeneous at the single-nucleotide level, with each cell carrying a unique mutational burden15. Similarly, single-cell exome sequencing of bladder cancer cells demonstrated the presence of two late-occurring subclones that constituted approximately 70% of the tumour, indicating that continuing single-nucleotide mutation can generate highly proliferative clones that can be positively selected and that can come to dominate the tumour population in a short time period99.
Methodological considerations for retrospective lineage tracing
Retrospective lineage tracing based on the analysis of somatic mutation often entails analysing the genomes of single cells or small groups of cells, and so the DNA must be amplified to generate enough material for next-generation sequencing. The process of amplification, like cell division itself, is inherently error-prone, and can create amplicons that contain sequence or structural errors, which produce false-positive mosaic structural variants, microsatellite variability and SNVs. In addition, uneven amplification across the genome can produce false-positive CNV calls, as well as false-negative sequence calls, in the case of allelic dropout72,110. When designing single-cell sequencing experiments, it is therefore important to consider the frequencies and types of errors that are introduced and to select an approach that best balances signal and noise for the experiment at hand71 (TABLE 2).
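The trade-off between signal and noise can be explored with a simple calculation at the design stage. The sketch below estimates the probability that a mutation marking a clone is detected in at least two sequenced cells, which is the minimum needed to recognize it as shared, given the clone's frequency and a per-cell allelic dropout rate. It assumes a binomial model in which cells are sampled independently and dropout affects each carrying cell independently; the parameter values and function name are illustrative only.

```python
from math import comb

def prob_shared_detection(n_cells, clone_fraction, dropout_rate=0.0, min_cells=2):
    """Probability that a clonal mutation is called in at least min_cells
    of n_cells sequenced cells.

    clone_fraction: fraction of cells in the tissue that carry the mutation.
    dropout_rate: probability that the mutant allele is missed in a carrying
                  cell (for example, through allelic dropout).
    """
    if min_cells > n_cells:
        return 0.0
    p = clone_fraction * (1 - dropout_rate)  # per-cell detection probability
    p_fewer = sum(comb(n_cells, k) * p**k * (1 - p)**(n_cells - k)
                  for k in range(min_cells))
    return 1 - p_fewer

# Hypothetical design question: 5% clone, 20% dropout, varying cell numbers.
for n in (16, 50, 100, 200):
    print(n, round(prob_shared_detection(n, 0.05, dropout_rate=0.2), 3))
```

A calculation of this kind makes explicit that higher dropout rates must be offset by sequencing more cells if minority lineages are to be recovered.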
One broad class of whole-genome amplification strategies is based on amplifying the genome in vitro using highly processive DNA polymerases, and another is based on amplifying the genome in vivo, in cells or whole organisms, by cloning and cell culture (TABLE 2). The earliest in vitro approach to be developed, MDA, takes advantage of the high processivity of Φ29 DNA polymerase to generate long linear amplicons. Secondary priming and extension occur from newly synthesized amplicons, increasing amplification efficiency111. MDA generates 15–20 μg of DNA from a single nucleus78, and MDA-amplified single-cell DNA is of sufficiently high quality for calling somatic retrotransposition events81 and SNVs17,112. Several groups have recently described methods for carrying out MDA reactions in nanolitre-sized droplets113–115 or in microfluidic devices116, which increase the uniformity of amplification and reduce reagent costs. A second approach, degenerate oligonucleotide priming PCR (DOP-PCR), involves the fragmentation of the genome into small pieces, followed by amplification with random priming117. This method amplifies the genome more evenly than MDA, and is thus particularly well-suited to studying copy-number variation14,85,86,118. Hybrid methods, including multiple annealing and looping-based amplification cycles (MALBAC) and PicoPlex, include pre-amplification with a tagged primer. Full amplicons contain complementary sequences at each end, which form hairpins or loops that protect them from overamplification119. MALBAC-based amplification is more even across the genome than MDA-based amplification, but error rates are higher72. With this more even amplification, hybrid methods are appropriate for investigating CNVs118, structural variants120 and retrotransposition events79, although chimeric amplification products that occur in the MALBAC and MDA reactions are of particular concern for interpreting retrotransposition events and structural variants80.
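The scale of amplification implied by these figures can be worked out directly. Taking the approximately 6 picograms of DNA in a single cell and the 15–20 μg yield quoted for MDA, the short calculation below gives the fold amplification and the equivalent number of perfect doublings. The doubling figure is only an intuition aid, as MDA is an isothermal reaction rather than a cycled one.

```python
import math

SINGLE_CELL_DNA_G = 6e-12  # approximately 6 pg of DNA per cell

def fold_amplification(yield_g, input_g=SINGLE_CELL_DNA_G):
    """Ratio of amplified DNA mass to starting DNA mass."""
    return yield_g / input_g

def equivalent_doublings(fold):
    """Number of perfect doublings that would give the same fold increase."""
    return math.log2(fold)

for yield_ug in (15, 20):
    fold = fold_amplification(yield_ug * 1e-6)
    print(f"{yield_ug} ug: {fold:.2e}-fold, "
          f"about {equivalent_doublings(fold):.1f} doublings")
```

An increase of more than a million-fold helps to explain why even low per-base polymerase error rates and modest amplification biases can become visible in the final sequencing data.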
As sequence errors introduced by DNA polymerase, as well as chimeric amplification products, can create difficulties for interpreting sequencing from DNA amplified in vitro, some groups have developed approaches that use cell division to amplify genomic DNA. Even selecting cells in G2/M phase for sequencing, after they have replicated their genomes, leads to remarkable improvement in dropout rates and false-positive calls15,121, although this approach is not applicable to non-dividing cell populations. Other groups have turned to selecting single cells (and reproducing their genomes by somatic cell nuclear transfer if they are terminally differentiated), and then growing clonal populations in induced pluripotent stem cell (iPSC) or organoid culture16,82, or in a cloned experimental organism88, and sequencing in bulk. These methods are an interesting solution to the problem of errors occurring due to DNA polymerase, although it is not clear to what degree the single-stranded lesions that exist in the genomes of terminally differentiated cells are stable after re-activating cell cycle-dependent repair processes122, which would tend to deflate the number of somatic mutations recovered.
A further methodological consideration in single-cell sequencing is selecting single cells for analysis. For many cell populations, fluorescence-activated cell sorting (FACS) can be used to sort single cells or nuclei78,123,124, but this method partly depends on finding an antibody or cellular characteristic that is specific to the cell population under study. Alternatively, cells or nuclei can be triturated and manually selected under microscope guidance100,108,125, or FACS-purified then subjected to manual selection for single cells79. Manual selection is less expensive than FACS, and more broadly applicable to cells that lack a specific antibody marker, but it requires finer motor control on the part of the operator. Laser-capture microdissection allows the selection of single cells in their native tissue context, although it is difficult to ensure the capture of precisely one nucleus without leaving chromosomes behind126,127.
As retrospective lineage tracing by single-cell sequencing matures, it will be crucial to develop methods that allow the visualization of mutations in situ, thus maintaining the tissue context of a mutation-carrying cell. CNVs and retrotransposition events are large enough for detection by traditional fluorescence in situ hybridization (FISH), but SNVs are more challenging to detect in situ128. Because coding SNVs are present in many copies of mRNA, they can be analysed using a modified FISH protocol, with mRNAs amplified by rolling circle amplification and detected with padlock probes129; SNVs in the mitochondrial genome can be detected using a similar method130, and SNVs in genes that are expressed at high levels in a given tissue are amenable to detection by fluorescent in situ sequencing (FISSEQ)131–133. Fortunately for investigators using SNVs that occur in the large proportion of the genome that is non-coding or poorly expressed, and for those using archival fixed tissue, a new FISH method that is sensitive to SNVs in genomic DNA has been developed recently using allele-specific PCR134.
Perspectives
When designing a lineage tracing experiment, it is important to consider the strengths and weaknesses of either a prospective or a retrospective approach (FIG. 3). For success in prospective lineage tracing, there must be genetic access to the population in question, whether through a regionally directed method such as viral injection and electroporation, or by using population-specific marker lines and promoters. Because prospective lineage tracing depends on labelling and follow-up analysis, its use is restricted to experimental organisms and cell culture systems, whereas retrospective lineage tracing can investigate lineage directly in human tissue. This unprecedented access to human lineage information provides investigators with a wealth of data relevant to human development and disease. However, investigators must carefully select subjects and cells to identify pools of informative variants that differ between the experimental populations in question. Retrospective lineage tracing heavily relies on sequencing, often of single cells, and is therefore currently lower throughput and more expensive than most prospective methods. Although emerging prospective lineage systems engineer revolutionary ways to investigate lineage in model organisms, it will always be necessary to retrospectively map lineage in a naturally occurring tissue without engineered lineage marks.
Whether one chooses a prospective or a retrospective lineage tracing method, the choice of organ or cell population to investigate has an effect on the ease of lineage tracing and the questions that can be asked. Organs that are relatively homogenous in terms of cellular composition, such as the liver, will require less information about specific genetic or protein population markers to investigate than will more diverse organs, such as the immune system, and a diverse organ is likely to have a more complex lineage structure, which could be more difficult to fully investigate. If the tissue is primarily composed of post-mitotic cells, as in the kidney, lineage information from development will be preserved for a retrospective lineage analysis, but the lack of proliferative capacity means that forward lineage tracing is restricted beyond a certain developmental point. In a tissue with continuously proliferating progenitors, such as the skin, prospective lineage tracing is possible throughout life, but the loss and replenishment of post-mitotic cells removes important sources of retrospective lineage data. Organs that are structured with proliferating populations located adjacent to a luminal space, such as the intestine or embryonic brain, are more accessible to injection and infection than those without a lumen. Finally, for human studies, the accessibility of the tissue in question from patients or donors is crucial. Some specimens can be obtained from routine biopsies, such as skin, or from minimally invasive procedures, such as blood draws, whereas others cannot be obtained from living subjects. Blood and tissue banks have been established for certain tissues, particularly the brain, and several disease-specific post-mortem tissue banks allow researchers to study lineage in pathological conditions.
Several major recent funding initiatives aim to trace lineage in whole organs or organisms, and scalable methods supported by these initiatives are beginning to bear fruit. Notably, the Paul G. Allen Foundation issued a call for proposals in 2014 for strategies that tracked lineage by barcoding large numbers of cells, which funded the development of an innovative whole-organism approach to lineage tracing using genome-editing-based barcodes64. The US National Institutes of Health (NIH) BRAIN Initiative also solicited applications in 2014 for proposals that generated a census of cell types in the brain, and several awardees proposed the identification of lineage relationships between cell types in addition to enumerating the cell types themselves. One project funded by this initiative has produced a method for RNA sequencing of single cells in nanolitre-sized droplets, enabling the sequencing of many cells with reduced cost and preparation time compared with other single-cell RNA-seq methods. As a proof of principle, this sequencing was carried out on 44,000 retinal cells, and the data were used to derive classes of cells and the gene expression relationships between them, which may relate to their lineage relationships135. As funding initiatives promote the development of new large-scale lineage-tracing methods, it will increasingly become possible to trace the lineage of organs and organisms at a scale that the early developmental biologists could never have imagined.
No longer limited to observing the development of transparent organisms or tracking a small number of cells with serially diluted dyes, biologists can now access a variety of methods for tracing lineage forwards from the application of a genetic label. In addition, recent advances in sequencing, particularly genome sequencing of single cells, allow lineage tracing to be carried out retrospectively, reconstructing lineage decisions that occurred months or years before sequencing. Retrospective lineage tracing can be carried out in normal tissue, examining developmental relationships between cells, and in pathological states such as cancer, enabling the reconstruction of tumour evolution. In both prospective and retrospective lineage tracing experiments, biological differences between tissues and experimental organisms inform appropriate choices in experimental approach. Furthermore, within a broad experimental strategy, the choice of amplification, sequencing and visualization methods must be adapted to the biological question under study. One hundred years after the first investigations of cell lineage, developmental biologists have built a tremendously enriched genetic toolkit for examining the developmental fate decisions that construct a whole organism.
Acknowledgments
The authors thank members of the Walsh laboratory, especially M. Lodato, for helpful comments. This work was supported by the Manton Center for Orphan Disease Research and grants from the National Institute of Neurological Disorders and Stroke (NINDS) (R01 NS032457, R01 NS079277 and U01 MH106883) to C.A.W. M.B.W. is supported by the Leonard and Isabelle Goldenson Research Fellowship. K.M.G. is supported by US National Institutes of Health (NIH) grant T32 MH20017. C.A.W. is a Distinguished Investigator of the Paul G. Allen Family Foundation and an Investigator of the Howard Hughes Medical Institute.
Glossary
- Fate-mapping methods
Approaches that apply a heritable mark to a given progenitor or class of progenitors, then use the inheritance of the mark to define the progeny of that cell or class.
- Genetic mosaicism
The state of containing more than one distinct genome within a single organism, whether achieved by experimental means (by combining early-stage embryos from different individuals or species) or by natural means (by considering differences in DNA from cell to cell).
- Prospective lineage analysis
An approach that applies an experimental label to cells, which is then examined at some point in the future to construct a lineage tree looking forwards from development.
- Retrospective lineage analysis
An approach that uses naturally occurring labels (for example, somatic mutations) to construct a lineage tree looking backwards at development.
- Intersectional analyses
Using two attributes of a cell population (for example, the expression from two different promoters) to select only cells that display both attributes.
- Organotypic slice culture
A culture system in which a slice of tissue is cultured, rather than a collection of dissociated cells, to more closely mimic the biological context of an organ.
- Cut-and-paste mechanism
Method of mobilization by class II DNA transposable elements, in which the transposon excises itself from its genomic location using transposase protein and integrates into a new target site.
- Cre-loxP
A genetic system derived from P1 bacteriophage and adapted for use in genetically modifiable organisms. The site-specific recombinase Cre inverts or recombines any sequence located between 34 bp loxP sites, depending on their orientation.
- FLP-FRT
A genetic system derived from Saccharomyces cerevisiae and adapted for use in genetically modifiable organisms. The site-specific recombinase FLP inverts or recombines any sequence located between 34 bp FRT sites, depending on their orientation.
- loxP-STOP-loxP
A DNA element containing a transcription termination sequence flanked by loxP sequences, allowing the transcription termination sequence to be removed by the activity of Cre recombinase.
- Pulse
An experiment in which a brief bolus of label is followed by a period with no label, allowing events that occurred within a specific time window to be marked.
- Leakiness
Activity in the absence of inducing signal.
- Microsatellites
(Also known as short tandem repeats). Short genomic repeats consisting of a set of tandem nucleotides, with repeat numbers varying between different alleles.
- Digital droplet PCR
A polymerase chain reaction in which the reaction is divided into thousands of small droplets, allowing absolute quantification of PCR products.
- Organoid
A three-dimensional culture model of a whole or partial organ or tissue.
- Clade
A group on a dendrogram (tree diagram) that is separate from another group.
- Nested
To have a set fully contained within a broader set.
- Allelic dropout
The failure of one of two alleles at a genomic locus to amplify, so that it is not recovered in sequencing data. Compare with locus dropout, in which both alleles at a given locus fail to amplify.
- Chimeric amplification
In whole-genome amplification, when one amplicon misprimes another locus, leading to a hybrid DNA product with sequences from the original amplicon adjacent to those from the second locus.
- Induced pluripotent stem cell (iPSC)
A cell that is capable of giving rise to daughter cells of many or all lineages, derived by reprogramming of an adult cell using pluripotency factors.
- Triturated
Made into a homogeneous suspension by mixing or grinding, such as by pipetting cells up and down to create a uniform suspension.
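Several of the terms above (retrospective lineage analysis, clade, nested and allelic dropout) fit together in a simple way, which the following minimal Python sketch tries to illustrate. It is not a method from any of the studies discussed here; the cell names, mutation names and the helper functions `clades_from_mutations` and `is_nested` are hypothetical. Each somatic mutation is taken to define a clade (the set of cells carrying it), and under the simplifying assumption that each mutation arises only once and is never lost, any two clades should be either disjoint or nested.

```python
from itertools import combinations


def clades_from_mutations(genotypes):
    """Map each somatic mutation to the set of cells that carry it."""
    clades = {}
    for cell, mutations in genotypes.items():
        for mutation in mutations:
            clades.setdefault(mutation, set()).add(cell)
    return clades


def is_nested(clades):
    """Return True if every pair of clades is disjoint or one contains the other."""
    for a, b in combinations(clades.values(), 2):
        overlapping = bool(a & b)
        contained = a <= b or b <= a
        if overlapping and not contained:
            return False
    return True


# Hypothetical genotypes: cell name -> somatic mutations detected in that cell.
genotypes = {
    "cell1": {"m1", "m2"},
    "cell2": {"m1", "m2"},
    "cell3": {"m1", "m3"},
    "cell4": {"m4"},
}
clades = clades_from_mutations(genotypes)
consistent = is_nested(clades)

# Simulated allelic dropout: mutation m1 is not recovered in cell2.
with_dropout = dict(genotypes, cell2={"m2"})
consistent_after_dropout = is_nested(clades_from_mutations(with_dropout))
```

In this toy data set the clade defined by m2 sits inside the clade defined by m1, so the mutations are compatible with a single lineage tree. When m1 is missed in cell2, the two clades overlap without either containing the other, which is one way in which allelic dropout can confound retrospective lineage reconstruction.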
Footnotes
Competing interests statement
The authors declare no competing interests.