Author manuscript; available in PMC: 2018 Oct 5.
Published in final edited form as: Cell Stem Cell. 2017 Oct 5;21(4):547–555.e8. doi: 10.1016/j.stem.2017.07.015

Multiplex CRISPR-Cas9 Based Genome Editing in Human Hematopoietic Stem Cells Models Clonal Hematopoiesis and Myeloid Neoplasia

Zuzana Tothova 1,2,3, John M Krill-Burger 3, Katerina D Popova 2,3, Catherine C Landers 2,3, Quinlan L Sievers 2,3,4, David Yudovich 2,3, Roger Belizaire 2,3,5, Jon C Aster 5, Elizabeth A Morgan 5, Aviad Tsherniak 3, Benjamin L Ebert 1,2,3,6,*
PMCID: PMC5679060  NIHMSID: NIHMS897727  PMID: 28985529

SUMMARY

Hematologic malignancies are driven by combinations of genetic lesions that have been difficult to model in human cells. We used CRISPR/Cas9 genome engineering of primary adult and umbilical cord blood CD34+ human hematopoietic stem and progenitor cells (HSPCs), the cells of origin for myeloid pre-malignant and malignant diseases, followed by transplantation into immunodeficient mice, to generate genetic models of clonal hematopoiesis and neoplasia. Human hematopoietic cells bearing mutations in combinations of genes observed in myeloid malignancies, including cohesin complex genes, generated immunophenotypically defined neoplastic clones capable of long-term, multi-lineage reconstitution and serial transplantation. Employing these models to investigate therapeutic efficacy, we found that TET2 and cohesin-mutated hematopoietic cells were sensitive to azacitidine treatment. These findings demonstrate the potential for generating genetically defined models of human myeloid diseases that are suitable for examining the biological consequences of somatic mutations and for testing therapeutic agents.


INTRODUCTION

Hematopoietic malignancies are genetically complex diseases in which the serial acquisition of somatic mutations results in clonal diversity with distinct responses to therapy (Bejar et al., 2014; Walter et al., 2012; Welch et al., 2012). While tremendous progress has been made in defining the genetic basis of hematologic malignancies through large-scale sequencing studies, models are now needed that reflect the specific combinations of mutations identified and the clonal complexity of human disease. Such models would be powerful tools to probe the biology of malignant transformation and to identify genetic subtypes that are sensitive or resistant to therapeutic agents.

Genome engineering technologies have enabled the generation of murine models that reflect the genetics of human cancer (Heckl et al., 2014; Sanchez-Rivera et al., 2014). However, there are fundamental differences in tumorigenesis in human versus mouse cells (Rangarajan and Weinberg, 2003), related both to the biology of oncogenes and tumor suppressor genes and to the effects of genetic background and target cells of transformation. The differences between mouse and human cells may be particularly important in the study of somatic mutations in cohesin genes, which regulate long-range DNA looping interactions, due to sequence variation between the species in regulatory elements and chromosome architecture (Mazumdar et al., 2015; Mullenders et al., 2015; Viny et al., 2015). As such, use of genetically engineered mouse models or retroviral transduction of mouse bone marrow cells may not fully capture disease-relevant biology, and these differences suggest the need to develop genetically engineered in vivo models of human myeloid malignancies.

Clonal hematopoiesis of indeterminate potential (CHIP) is a pre-malignant state in which HSPCs bear somatic mutations in genes that are recurrently mutated in hematologic malignancies, including DNMT3A, TET2 and ASXL1 (Genovese et al., 2014a; Jaiswal et al., 2014; Xie et al., 2014). Over time, in a subset of cases, acquisition of additional cooperating mutations may cause progression to overt hematologic malignancies. No experimental models currently exist to study human CHIP in vivo.

A number of different in vivo models have been used to study hematologic malignancies. Patient-derived xenograft (PDX) and humanized scaffold models share the advantage of human disease context and support expansion of some human myeloid malignancies, but they are not customizable to genetic lesions of interest and do not capture the initial steps of clonal expansion (Klco et al., 2014). Genome editing methods, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and RNA-guided nucleases (CRISPR/Cas9), have been successfully employed to edit human HSPCs (Buechele et al., 2015; Dever et al., 2016; Genovese et al., 2014b; Gundry et al., 2016; Mandal et al., 2014). These studies have focused on modifying one gene at a time, and with the exception of MLL translocations, they have been mainly directed at therapeutic gene correction rather than in vivo disease model building. Over 85% of adult patients with acute myeloid leukemia (AML) have mutations in two or more driver genes (Papaemmanuil et al., 2016), and new models that reflect such combinations of mutations are needed.

We endeavored to develop genetic models of CHIP and human myeloid neoplasia using CRISPR/Cas9 engineering of human HSPCs with combinations of established leukemia driver mutations. Our goal was to develop models that would target and expand human HSPCs in vivo, mimic the genetic complexity of human disease in an easily customizable fashion, allow the study of clonal dynamics and evolution, and be amenable to pharmacologic testing in a genotype-specific fashion.

RESULTS

Genetic editing of a single leukemia driver in human HSPC in vitro

To examine the efficiency of engineering human HSPCs with mutations that are common in leukemia, we first needed approaches to genetically edit human CD34+ cells in vitro with leukemia-associated mutations and to quantify allelic fractions of the insertions and deletions (indels) generated. We began by introducing mutations into human umbilical cord blood (UCB) CD34+ cells by transient transfection of plasmids coding for S. pyogenes Cas9 and green fluorescent protein (GFP), as well as a chimeric small guide RNA (sgRNA) (Figure S1A) (Mandal et al., 2014). Targeting of UCB CD34+ cells with a single sgRNA for STAG2, encoding a core member of the cohesin complex, yielded an editing efficiency of 53%, with the majority of the genetic alterations introducing a frameshift mutation, similar to mutations observed in patients with myelodysplastic syndrome (MDS) and AML (Figure S1B).

To identify and track the allelic fractions of specific indels introduced by CRISPR/Cas9 in multiple genes and across multiple samples, we developed a sequencing and computational strategy utilizing PCR-based amplification of the sgRNA target site and next-generation sequencing (NGS) (Figure S1C–F, Table S1A, CRISPR-Seq (http://crispr-seq.com/)). This approach enables annotation and quantitation of unique indels, which can then be used as genetic barcodes to track the abundance of individual clones over time.
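The tallying step of such a strategy can be illustrated with a minimal Python sketch. This is not the CRISPR-Seq implementation: the read labels (`WT` for an unedited read, or a label such as `D1:112341796` for a 1-bp deletion at a given coordinate), the function name, and the read counts are assumptions made purely for illustration.

```python
from collections import Counter

def indel_allele_fractions(read_calls):
    """Return (overall edited fraction, per-indel fractions) for one target site.

    read_calls: one label per aligned read; "WT" marks an unedited read and
    any other string is a hypothetical indel label (type, size, coordinate).
    """
    counts = Counter(read_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    # Each unique indel label acts as a barcode with its own allele fraction.
    per_indel = {label: n / total for label, n in counts.items() if label != "WT"}
    edited = sum(per_indel.values())
    return edited, per_indel

# Toy input: 10 reads at one sgRNA target site (illustrative values only).
reads = ["WT"] * 6 + ["D1:112341796"] * 3 + ["D2:112341794"]
edited, per_indel = indel_allele_fractions(reads)
```

Repeating the same tally on samples taken at different time points would give the per-indel trajectories that serve as clone barcodes.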

Efficient multiplex editing of human HSPC in vitro

Having demonstrated efficient single gene editing and quantification of allelic fractions of specific indels, we next sought to model the combinatorial genetic complexity of human myeloid malignancies. We developed sgRNAs targeting additional genetic drivers of leukemia and focused on nine recurrently mutated genes in MDS with predicted loss-of-function (LOF) mutations in greater than 5% of patients: TET2, ASXL1, DNMT3A, RUNX1, TP53, NF1, EZH2, and the two cohesin genes STAG2 and SMC3. As negative controls, we targeted U2AF1 and SRSF2 to introduce LOF mutations, distinct from the recurrent nucleotide alterations observed in these genes in patients with myeloid malignancies. We designed sgRNAs targeting each gene, tested their editing efficiency in a fluorescence based reporter assay (Figure S1G) and selected the sgRNA with the most efficient editing for each target gene (Figure S1H, Table S1A).

We next sought to determine whether we could achieve multiplex targeting of human UCB CD34+ cells in vitro by introducing a pool of sgRNAs targeting 11 independent leukemia drivers into UCB CD34+ cells using nucleofection. We cultured cells in vitro and isolated genomic DNA from the GFP+ pool as well as from single cell-derived colonies (Figure S2A). Analysis of the bulk population at 24 and 144 hours after transfection demonstrated the presence of indels in all 11 genes targeted, with the majority of genome editing occurring by 24 hours, and absence of substantial clonal selection during this short-term in vitro culture (Figure 1A).

Figure 1. Genetic editing of a single and multiple leukemia drivers in CD34+ cells in vitro and in vivo.


A: Multiplex targeting of UCB CD34+ cells leads to efficient indel formation across all genes targeted without evidence of major selection for a dominant clone during 5 days of in vitro culture (p=1.0, Shannon entropy). B: Analysis of single cell-derived colonies demonstrates multiplex gene targeting in a single cell and suggests distinct patterns of zygosity (SMC3, p=0.02; ASXL1, p=0.04; RUNX1, p=0.04; Barnard’s exact test). C: Indel type analysis for SMC3 gene pre-injection (AF=0.05) and 5 months post-transplantation in mouse ZM178 (AF=0.02, recipient #1 for adult CD34 donor ADT1 in Table S1B) shows stable representation of major LOF clones with dropout of predicted non-LOF clones. Corresponding colors denote the same indel type. D=deletion, I=insertion, followed by size and hg19 genomic coordinate in brackets. n=3; mean ± SD indel VAF amongst edited alleles for “D1: 112341796”, “D2: 112341794”, and “D8: 112341797” are 0.379 ± 0.032, 0.165 ± 0.046 and 0.091 ± 0.046, respectively, with pre-injection indel VAF amongst edited alleles of 0.363, 0.193, and 0.042, respectively. D: Multiplex targeting of UCB CD34+ cells with LOF mutations in 11 leukemia drivers followed by in vivo expansion shows evidence of mutated clone expansion in 10/17 transplanted mice. Incidence of CRISPR/Cas9 induced indels in experimental genes is statistically significantly different from control genes SRSF2 and U2AF1 (p=0.01, Fisher’s exact test, two-tailed). Clones with AF>0.01 at 5 months in the sorted bone marrow CD45+ cells are shown. Pre-injection AF: SMC3(0.17), STAG2(0.28), U2AF1(0.05), TET2(0.16), NF1(0.14), DNMT3A(0.10), EZH2(0.22), SRSF2(0.04), TP53(0.60), RUNX1(0.20). See also Figure S1, S2, and Table S1, S3, S4.
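Panel A cites Shannon entropy as the measure used to assess clonal selection. The following Python sketch shows only how that quantity is computed from a set of indel fractions; how a p-value was derived from it is not described in this section, and the input fractions are invented for illustration.

```python
import math

def shannon_entropy(fractions):
    """Shannon entropy (in bits) of a set of clone or indel fractions.

    Fractions are renormalised to sum to 1; zero entries are ignored.
    """
    total = sum(fractions)
    if total <= 0:
        raise ValueError("fractions must have a positive sum")
    probs = [f / total for f in fractions if f > 0]
    return -sum(p * math.log2(p) for p in probs)

# Illustrative values: an even mixture of four indels versus one dominant indel.
even = shannon_entropy([0.25, 0.25, 0.25, 0.25])
skewed = shannon_entropy([0.97, 0.01, 0.01, 0.01])
```

An even mixture gives the highest entropy, whereas dominance of a single clone drives the entropy toward zero, so a stable entropy between time points is consistent with the absence of major selection.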

To demonstrate the presence of multiplex targeting within a single CD34+ cell, we sequenced single cell-derived colonies grown on methylcellulose. Out of 88 colonies analyzed, 37 (42%) had targeted editing of at least one gene (Figure 1B). Of these edited colonies, 26/37 (70%) had targeted editing of more than one gene in a single clone (Figure S2B). Our data demonstrate that multiplex targeting of HSPC is both feasible and efficient.

Genetic editing of human HSPC in vitro predicts driver lesion zygosity observed in patients

Genetic analysis of single colonies enabled the assignment of heterozygous, compound heterozygous, or homozygous editing. In this analysis, we found that the zygosity observed for each gene was concordant with that seen in patients. For example, we never observed targeting of more than a single allele of SMC3, ASXL1 and RUNX1, consistent with the absence of acquired homozygous inactivating mutations in these genes in patients, and suggesting that homozygous loss of these tumor suppressor genes is not tolerated in human CD34+ cells. In contrast, we did observe bi-allelic editing of TET2, DNMT3A, EZH2, TP53 and NF1, in agreement with the presence of bi-allelic LOF mutations in these genes in patients. Similarly, we observed frequent inactivating mutations in STAG2, an X-linked gene frequently mutated in males with MDS, in male UCB CD34+ cells (Bejar et al., 2011; Cancer Genome Atlas Research, 2013; Haferlach et al., 2014; Kon et al., 2013; Papaemmanuil et al., 2013). These studies demonstrate that patterns of CRISPR editing predict heterozygous versus homozygous inactivation of cohesin genes and other leukemia drivers in primary HSPCs.
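The logic of assigning zygosity from a single cell-derived colony can be sketched as follows. The 0.2 noise threshold, the allele labels, and the separate "hemizygous" call for an X-linked gene in male cells are assumptions introduced for illustration and are not taken from the methods of this study.

```python
def call_zygosity(allele_fractions, min_af=0.2, x_linked_male=False):
    """Classify editing at one gene in a single cell-derived colony.

    allele_fractions: dict mapping an allele label ("WT" or an indel label)
    to its read fraction in the colony. min_af is a hypothetical noise
    threshold below which an allele is ignored.
    """
    present = {a for a, f in allele_fractions.items() if f >= min_af}
    indels = present - {"WT"}
    if not indels:
        return "unedited"
    if x_linked_male:
        # Only one copy exists (e.g. an X-linked gene in male donor cells).
        return "hemizygous"
    if "WT" in present:
        return "heterozygous"
    if len(indels) == 1:
        return "homozygous"
    return "compound heterozygous"

# Illustrative colonies (made-up read fractions).
het = call_zygosity({"WT": 0.5, "D1": 0.5})
compound = call_zygosity({"D1": 0.52, "I1": 0.48})
hom = call_zygosity({"D1": 0.98, "WT": 0.02})
```

Under this scheme, a gene for which only the heterozygous call is ever returned across many colonies would match the pattern described above for SMC3, ASXL1 and RUNX1.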

Modeling human clonal hematopoiesis in vivo

Genetically engineered CD34+ cells bearing mutations in driver genes did not grow as immortalized cell lines, suggesting a requirement for an in vivo microenvironment for ongoing propagation. We first sought to model CHIP by engineering mutations in a single gene per mouse, focusing on DNMT3A, TET2, and ASXL1. We initially performed CRISPR/Cas9 genome editing in adult CD34+ cells, rather than UCB CD34+ cells, as these mutations are rare in children but increase in frequency with age. We transplanted unsorted cells into sublethally irradiated NSGS mice, previously shown to support engraftment of human myeloid leukemias (Wunderlich et al., 2010). Five months after transplantation, a time point at which human hematopoiesis is expected to be derived from long-term HSCs (LT-HSCs), we observed 18–35% human cell engraftment and detected mean indel frequencies of 4–36% in flow-sorted human CD45+ cells. We noted highest clonal expansion of TET2, DNMT3A and ASXL1 exon 12 mutated clones over the course of 5 months (Figure S2C–D, Table S1B).

Over five months, we observed expansion of predicted LOF mutations, demonstrating clonal selection for inactivation of these tumor suppressor genes. The complexity of indel types for each gene from the time of transplantation was highly preserved through in vivo expansion. The stable spectrum of unique indels, acting as genetic barcodes, indicates that multiple LT-HSCs with distinct but functionally equivalent LOF mutations stably support hematopoiesis to the same extent over five months post-transplantation (Figure 1C). Consistent with the asymptomatic nature of CHIP in humans, mice transplanted with CD34+ cells modified at a single genetic locus did not have any evidence of pathologic changes in the bone marrow or spleen, or changes in blood counts or spleen size, as compared to control mice transplanted with CD34+ cells infected with Cas9 and a non-targeting control sgRNA. We obtained similar results for cohesin genes, STAG2 and SMC3, which can be mutated early in the pathogenesis of myeloid malignancies and more rarely in CHIP (Figure 1C and S2D) (Corces-Zimmerman et al., 2014; Jaiswal et al., 2014). These studies demonstrate that introduction of mutations into CD34+ cells followed by transplantation into immunodeficient mice can model CHIP, providing a system to examine the biology of these mutations in isolation in human cells and to evaluate response to therapeutic interventions, using mutant allele fractions to track clonal dynamics.

Efficient multiplex editing of human HSPC in vivo

Since overt myeloid malignancies are generally associated with the acquisition of somatic mutations in multiple driver genes in a single clone, we next attempted to generate genetic models of myeloid malignancies by sequential acquisition of leukemia driver mutations using serial transplantation (Figure S2E). However, poor engraftment of CHIP clones in the secondary and tertiary transplant recipients was limiting in this approach (Table S2A–B). Therefore, we performed multiplex genome editing in vivo using our previously tested pool of sgRNAs targeting nine genes with recurrent LOF mutations in myeloid malignancies. In addition, we expressed commonly mutated oncogenes, including FLT3-ITD and NPM1 among others, in CD34+ cells using lentiviral transduction 24 hours prior to transfection with the sgRNA pool (Figure S2F). Of 17 mice transplanted with engineered CD34+ cells, 10 had expansion of genetically modified human cells with a mutant indel fraction of at least 0.01 five months after transplantation and 80% of mice harbored an expanded clone with at least two mutated genes (Figure 1D and S2G–H). Mice transplanted with Cas9 and control sgRNA did not show indels at any of the targeted loci (Figure S2G). As expected, genes with recurrent LOF mutations in human malignancies were most commonly mutated in expanded clones in vivo, including RUNX1, STAG2, SMC3, NF1 and DNMT3A. In contrast, genes that do not associate with LOF mutations in human disease, SRSF2 and U2AF1, and would therefore not be predicted to cause clonal dominance when knocked out, were almost never mutated (Figure S2H; the sole U2AF1 mutation may have been present as a passenger mutation in a cell with other driver mutations). These studies demonstrate that human HSPCs with multiple genetic lesions have the capacity to engraft and expand in vivo, and that clonal selection can be tracked in vivo.

Morphologic and genetic characterization of models generated using multiplex editing of human HSPC

The genetic lesions introduced into human HSPCs generated diverse morphologic phenotypes. For example, we observed a clonal expansion of immature human myeloperoxidase (MPO)+ myeloid forms in mouse 1783, which had an SMC3 indel fraction of 0.45 and a FLT3-ITD cell fraction of 0.92 in the bone marrow, consistent with a major clone characterized by a heterozygous SMC3 mutation and a FLT3-ITD integration (Figure 2A). This mouse demonstrated increased human cell engraftment as well as expansion of a CD34+ CD38-CD33-CD11B- population, relative to a mouse transplanted with mock-infected CD34+ cells (mouse 1764) or a mouse that expanded a FLT3-ITD only mutant clone (mouse 1782) (Figure 2A, C–D, S3A and S2G). The emergence of this particular genetic combination, among others, in vivo is consistent with the observation of this combination of mutations in patients with AML (Cancer Genome Atlas Research, 2013).

Figure 2. Functional and morphologic characterization of in vivo models generated by targeting CD34+ cells.


A: IHC shows expansion of immature CD45+ MPO+ myeloid cells in the bone marrow of mouse 1783 characterized by a major SMC3/FLT3-ITD mutated clone. Comparison is made to the control mouse 1764 transplanted with mock-infected CD34+ cells and mouse 1782 characterized by a FLT3-ITD only mutated clone. B: Expansion of CD45+ CD163+ histiocytic/macrophage cells in the bone marrow of mouse 1785, characterized by LOF mutations in 7 genes and an NPM1 mutation, as demonstrated by IHC. Comparison is made to the control mouse 1764 and mouse 1787 characterized by an NPM1 only mutated clone. All images are 40X magnification (scale bar = 0.125 mm) with H&E insets at 100X magnification (scale bar = 0.025 mm). C–D: Human engraftment over time in peripheral blood (C) and bone marrow at 5 months (D) of mouse 1783, wildtype (WT) control mice (1763 and 1764) and mouse 1782 with a FLT3-ITD only mutant clone, as determined by flow cytometry using hCD45 staining. Mean % ± SD of peripheral blood (PB) and bone marrow (BM) engraftment is shown for control WT mice 1763 and 1764. See also Figure S2 and Table S3.

Other models, such as mouse 1785 with eight genetic lesions (i.e. TP53, NF1, RUNX1, TET2, EZH2, ASXL1, U2AF1, and NPM1), had clonal expansion of immature myeloid cells, and human CD68+CD163+ macrophage/histiocytic populations (Figure 2B). Immunophenotypic analysis confirmed expansion of CD45 intermediate CD34-CD38+CD33 intermediate CD11B+CD68+ population in mouse 1785, relative to the control mouse 1764 or mice with one or two of the eight genetic lesions (mouse 1787 and 1786) (Figure 2B and S3B). Yet another model, mouse ZM46, characterized by clonal expansion of RUNX1 and NF1 mutated clones showed expansion of CD45+ CD68+CD163+TNFα+ macrophage/histiocytic cells, which was associated with pancytopenia and bone marrow failure (Figure S3C), and expansion of CD34+ CD38+ CD33+ population with a subset of CD11B+CD68+ cells (Figure S3D) not seen in the control mouse. In total, we have generated genetic models characterized by the combinations of somatic lesions observed in patients using multiplex targeting of both adult and UCB CD34+ cells in a total of 50 mice, with sgRNA pools of 3–11 sgRNAs and 59–100% targeting efficiency (Table S3C).

To examine in vivo clonal dynamics of human cells with multiplex editing, we compared indel sequences at the time of injection and at 5 months in vivo. In mouse 1785, which carried mutations in 8 leukemia drivers, a single clone emerged with uniform dominance of a single indel for each mutated gene at 5 months (Figure 3A and S3E). Indels that were not predicted to cause a frameshift were selectively lost over 5 months of in vivo expansion, and 100% of all major clones had out-of-frame indels (Figure 3A and S3E). We observed the same preferential loss of non-LOF indels across this entire mouse cohort in 7/8 experimental genes (Table S4A). Furthermore, analysis of clonal dynamics and clonal expansion of all major indel types detected 5 months after transplantation showed clonal selection for 88% of indels, 77% of which expanded over wildtype cells (Tables S4B–C). Our in vivo model utilizing multiplex targeting of CD34+ cells therefore recapitulates the genetic complexity frequently observed in patients and models selective dominance of an individual genetic clone.
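The distinction between frameshift and in-frame indels drawn above rests on a simple rule: an indel in coding sequence shifts the reading frame when its length is not a multiple of three. The Python sketch below applies that rule to a list of indels and is a simplification, since in-frame indels can still disrupt function and the position within the gene also matters. The function names and the allele fractions are hypothetical.

```python
def is_frameshift(indel_size_bp):
    """An indel shifts the reading frame when its size is not a multiple of 3."""
    if indel_size_bp <= 0:
        raise ValueError("indel size must be a positive number of base pairs")
    return indel_size_bp % 3 != 0

def frameshift_fraction(indels):
    """Fraction of edited alleles carried by out-of-frame indels.

    indels: list of (size_bp, allele_fraction) pairs for one gene.
    """
    edited = sum(af for _, af in indels)
    if edited == 0:
        return 0.0
    shifted = sum(af for size, af in indels if is_frameshift(size))
    return shifted / edited

# Made-up example: a 3-bp (in-frame) indel present before injection is lost in vivo.
pre_injection = [(1, 0.10), (2, 0.05), (3, 0.05)]
five_months = [(1, 0.30), (2, 0.12), (3, 0.00)]
pre_fs = frameshift_fraction(pre_injection)
post_fs = frameshift_fraction(five_months)
```

A rise in the out-of-frame fraction between the two time points is the kind of signal that would indicate preferential loss of non-LOF indels.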

Figure 3. Clonal dynamics, multi-lineage reconstitution and serial transplantability of mutant clones.


A: Emergence of a single dominant clone in mouse 1785 at 5 months post-transplantation as assessed by NGS and indel type analysis. D=deletion, I=insertion, followed by size of indel and hg19 genomic coordinate in parentheses. Indel fractions observed at the time of pre-injection and at 5 months: TP53(0.06; 0.43), NF1(0.14; 0.32), RUNX1(0.20; 0.32), TET2(0.17; 0.23), EZH2(0.22; 0.24), ASXL1(0.12; 0.24), and U2AF1(0.05; 0.13). B: Sorting of B-(CD45+CD19+CD33-CD3-), T-(CD45+CD3+CD19-CD33-), myeloid-(CD45+CD33+CD19-CD3-) and immature (CD45+CD34+CD19-CD3-CD33-) cells followed by NGS supports presence of the mutant clone in all lineages of mouse 1783. Representative flow analysis panels are shown; gating as outlined in each panel, % viable CD45+ cells shown. C: Patterns of genetic lesions among primary and secondary transplant recipients of 3 independent donor mice, 1783, 1785 and 1786. Clones with AF>0.01 in the sorted CD45+ bone marrow are shown. Mutant clones identified in the primary transplant are more likely to be detected in the secondary transplant recipients than mutant clones not identified in the primary transplant (p<0.0001, Fisher’s exact test). See also Figure S3 and Table S4.

To confirm the presence of specific indels and to examine whether the phenotypes observed were caused by additional mutations acquired through off-target gene editing or a pre-existing clone in the donor cells, we performed whole exome sequencing of two independent genetic models described above. The indel fractions determined using sgRNA target site PCR amplification and whole exome sequencing were highly concordant (Table S3B). More importantly, we did not identify any additional mutations in genes known to be recurrently mutated in hematologic malignancies (Jaiswal et al., 2014), indicating that the gene editing strategy did not result in oncogene or tumor suppressor mutations in genes that were not directly targeted. In addition, we examined predicted off-target sites of the sgRNAs (Hsu et al., 2013), but did not find any evidence of off-target mutations.

Edited clones are capable of long-term multilineage reconstitution and serial transplantation

HSCs, like many malignant stem cells, have the capacity for self-renewal and multi-lineage differentiation (Jordan et al., 2006). To investigate whether our models have the capacity for multi-lineage differentiation, we sorted cells from different lineages and assessed them for presence of the introduced genetic mutations. The FLT3-ITD/SMC3 clone was present in myeloid, B- and T- lymphoid lineages, as well as in immature lineage negative CD34+ cells sorted from mouse 1783, indicating that gene editing occurred in a stem cell capable of long-term multi-lineage reconstitution (Figure 3B).

We next examined whether the genetically modified cells are serially transplantable, which would enable propagation of the models and the ability to study clonal dynamics following therapeutic interventions at a larger scale. We selected 3 mice with unique genetic clones and transplanted their CD45+ sorted bone marrow into NSGS mice. We observed that the malignant clones were serially transplantable across multiple different genetic backgrounds, with the same dominant clone present in all mice transplanted (Figure 3C and S3F). Therefore, multiple lines of evidence, including maintenance of hematopoiesis 5 months after transplantation, serial transplantability and multi-lineage reconstitution, support the generation of neoplastic stem cells.

Genotype specific responses of edited human HSPC in vitro and in vivo

The genetic engineering of human HSPCs, and the generation of models of hematologic neoplasia, creates the opportunity to examine the sensitivity of different genotypes to a therapeutic intervention. The hypomethylating agents (HMAs), azacitidine and decitabine, are effective therapies for the treatment of MDS, although it remains unclear which patients are most likely to respond. The presence of TET2-mutant/ASXL1-wildtype clones in MDS patients has been previously shown to predict response to treatment with HMAs, in contrast to ASXL1-mutant clones which predict resistance (Bejar et al., 2014). As a proof of concept, we engineered CD34+ cells with LOF mutations in TET2 or ASXL1, and treated them in vitro for 72 hours with varying doses of azacitidine (Figure 4A and S4A). We observed increased sensitivity of TET2 mutant cells to azacitidine across a number of doses as compared to ASXL1 mutant or wild-type cells (Figure S4B).

Figure 4. Genotype-specific responses to treatment with azacitidine in vitro and in vivo.


A: In vitro 72-hour treatment of TET2 and SMC3 mutant clones shows increased sensitivity to the hypomethylating agent azacitidine (250 nM). Mean of 3 technical replicates with SD is plotted for each condition. % viability is normalized to DMSO-treated controls for each genetic condition. * p<0.05 (two-tailed unpaired t-test). B: Schedule of azacitidine dosing in vivo. After engraftment of TET2 and ASXL1 mutant clones, 2.5 mg/kg azacitidine or PBS (vehicle) was administered intraperitoneally once daily on Days 1–5 of a 14-day cycle. Mice were treated for a total of 12 weeks (6 cycles). C: Bone marrow analysis of TET2 and ASXL1 indel fraction by NGS demonstrates a genotype-specific response of TET2 mutant clones (n=3–4 per group). * p<0.05 (two-tailed unpaired t-test). D: Analysis of TET2 indel fraction in sorted populations from azacitidine- and vehicle-treated TET2 mice (n=3 per group). * p<0.05 (two-tailed unpaired t-test). See also Figure S4.
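The dosing schedule in panel B (once daily on Days 1–5 of a 14-day cycle, for 6 cycles) can be laid out as study days with a short Python sketch. The function name and the day-numbering convention (Day 1 is the first dosing day) are assumptions made for illustration.

```python
def dosing_days(cycles=6, cycle_length=14, dosing_window=5):
    """Return the 1-based study days on which a dose is given."""
    days = []
    for cycle in range(cycles):
        start = cycle * cycle_length  # number of days elapsed before this cycle
        days.extend(start + d for d in range(1, dosing_window + 1))
    return days

days = dosing_days()
total_doses = len(days)        # 5 doses per cycle over 6 cycles
total_weeks = 6 * 14 / 7       # 6 cycles of 14 days, expressed in weeks
```

Under these defaults the schedule spans 12 weeks, in agreement with the treatment duration stated in the caption.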

Having generated in vitro models of cohesin mutations in human cells, we next sought to determine their response to treatment with HMAs. We found that adult CD34+ cells engineered with SMC3 LOF mutations exhibited increased sensitivity to varying doses of azacitidine (Figure 4A and S4B). These data suggest that CHIP, in addition to MDS, may be sensitive to HMA treatment in the cases of TET2 and SMC3 mutations.

We next examined the genotype-specific effects of azacitidine in vivo. After confirming engraftment of TET2 or ASXL1 mutated clones in NSGS mice 3 months after transplantation, we treated recipient mice with azacitidine for 3 months, harvested bone marrow and examined the effect on TET2 and ASXL1 indel fraction (Figure 4B). We observed an approximately 4-fold decrease in the TET2 indel fraction but not in the ASXL1 indel fraction, which was evident in sorted CD34+, CD33+, CD11B+, CD19+ and CD3+ populations (Figure 4C–D and S4C–D). These data indicate that engineered human cells reflect pharmacologic sensitivities observed in patients with the same genotypes.

DISCUSSION

We describe an approach for creating preclinical models of human myeloid diseases by simultaneously modifying multiple genes in human HSPCs with subsequent expansion in vivo. We developed human models of CHIP by CRISPR/Cas9 editing of human HSPC and demonstrated mutant clone expansion over time with contribution to both myeloid and lymphoid, as well as stem and progenitor compartments. Furthermore, our approach achieved efficient multiplex editing of human HSPC in vitro and in vivo and recapitulated zygosity of genetic lesions observed in patients in a short-term in vitro culture assay. Expansion of multiplex edited clones in vivo led to clonal selection, multilineage reconstitution and serial transplantation of specific genetic combinations observed in patients. Finally, we demonstrated feasibility of genotype-specific pharmacologic testing of such models in long-term transplant assays in vivo.

We have used genetics rather than morphology to evaluate clonal dynamics in vivo. Clinical trials for hematologic malignancies are being performed in which changes in variant allelic fraction (VAF) are a primary end-point (Welch et al., 2016), and our approach provides human genetic models tailored to such studies. We provide evidence to support the feasibility of therapeutic testing of the human in vivo models, and demonstrate genotype-specific effects of azacitidine. These models, which can recapitulate the genetic complexity of human myeloid malignancies and pre-malignant diseases, may more accurately predict responses of investigational agents in clinical trials than previously used models. This approach enables the creation of systematic, genetically-defined hematopoietic clones to determine the genotypes most sensitive to a pharmacologic intervention.

None of the models we generated developed frank leukemia based on morphologic criteria, for which there are several possible explanations. First, many murine cytokines do not bind to human cytokine receptors, and the NSGS model does not recapitulate the full set and expression levels of cytokines that are present in human bone marrow and that regulate cellular differentiation. Second, we have primarily focused on the study of acquired somatic lesions in clonal hematopoiesis and MDS, which have not previously been shown to lead to transformation of hematopoietic cells in the manner in which translocations involving potent oncogenes do. Third, there are well-known limitations of any xenograft model, namely the lack of contribution from human bone marrow stroma and a functional immune system, which may be necessary for development of fully penetrant disease driven by some somatic mutations. Future studies using immunodeficient mouse strains that express human cytokines at more physiologic levels or human stroma-based xenograft systems could potentially overcome these challenges.

The approach described here allows efficient editing of both adult and UCB CD34+ cells in vivo and can be used to address a number of biologically relevant questions. For example, pediatric and adult myeloid malignancies have different cells of origin and distinct patterns of genetic lesions (Schuback et al., 2013). We now have the means to manipulate each cell type with the same genetic lesions to ask whether the cell of origin may alter the transforming potential of particular genetic lesions.

In summary, our approach of generating genetic models of human myeloid neoplasia using multiplex CRISPR/Cas9 engineering provides a highly customizable approach to generate models that recapitulate the genetics, clonal evolution and dynamics, and therapeutic sensitivities observed in patients. With the ongoing improvements of efficiency of CRISPR/Cas9 genomic engineering technologies, we believe that these models will prove valuable in future studies of biology and therapeutic vulnerabilities of specific genetic lesions found in patients with hematologic malignancies.

STAR METHODS

CONTACT FOR REAGENT AND RESOURCE SHARING

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Dr. Benjamin Ebert (bebert@partners.org).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Human CD34+ cells

De-identified umbilical cord blood (UCB) or adult peripheral blood CD34+ specimens were obtained from (1) New York Blood Bank (pooled female and male UCB donors, collected under a New York Blood Bank IRB-approved protocol, used for pilot experiments), (2) Lonza 2C-101 (commercially obtained UCB CD34+ enriched cells, male donors only, unknown age, collected from tissue recovery agencies under IRB-approved protocols), and (3) the Fred Hutchinson Hematopoietic Cell Processing and Repository (de-identified adult peripheral blood CD34+ cells, male donors only, unknown ages, collected under a Fred Hutchinson Cancer Center IRB-approved protocol). Informed consent, legal authorization and protection of human subjects considerations were followed during all steps of the tissue acquisition process. All specimens were de-identified, obtained from pre-existing inventories collected for generic research purposes and qualify under a non-human subjects exemption. Each experiment was performed using a single donor wherever possible, with 1–40 million CD34+ cells from each donor used per experiment. Cells from the same donor were used for the different arms of an experiment.

Mice

6–8 week old male and female NSGS mice (NOD-SCID; IL2Rγ null; Tg(IL3, CSF2, KITL)) were obtained from The Jackson Laboratory (strain 013062). All mice were housed in a pathogen-free animal facility in microisolator cages and experiments were conducted with the ethical approval of the Harvard Medical Area Standing Committee on Animals and according to an IACUC approved protocol at Children’s Hospital Boston.

METHOD DETAILS

CRISPR/Cas9 vector construction

3–8 sgRNAs per gene of interest were designed using the Zhang lab off-target prediction tool (http://crispr.mit.edu/) and tested for on-target activity using a fluorescence based reporter system previously described (Addgene #61395) (Heckl et al., 2014). In brief, a stable reporter cell line was first established for each targeted gene using lentiviral transduction of TF1 cells (multiplicity of infection (MOI) of 1) with a construct containing the concatemer of sequences of up to five tested sgRNA+PAM sites which were cloned in frame with the RFP657 gene (Figure S1G). Expression of the RFP657 reporter fusion protein was driven by the SFFV (spleen focus forming virus) promoter and cells were sorted for RFP657 positivity. Each sgRNA was then individually tested by transducing the reporter cell line with a lentiviral vector containing Cas9-P2A-GFP driven by EFS promoter and a single sgRNA of interest driven by U6 promoter (Addgene #57818). Cells were analyzed for GFP and RFP657 positivity on Day 12–14 after transduction using BD LSRII flow cytometer. % Editing efficiency for a particular sgRNA was determined using the following formula:

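$$\%\,\text{editing efficiency} = \left(\%\,\text{RFP}^{+}\ \text{untransduced cells}\right) - \left(\%\,\text{RFP}^{+}\ \text{of GFP}^{+}\ \text{transduced cells}\right)$$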
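As an illustration of how the reporter read-out could be turned into a ranking of candidate guides, the following minimal sketch applies the formula above to per-sgRNA flow cytometry percentages. The function names and the example percentages are hypothetical and are not taken from the study.

```python
def editing_efficiency(pct_rfp_untransduced: float,
                       pct_rfp_of_gfp_transduced: float) -> float:
    """Percent editing efficiency: loss of RFP657 signal among GFP+ cells.

    Both arguments are percentages in the range 0-100.
    """
    for value in (pct_rfp_untransduced, pct_rfp_of_gfp_transduced):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie between 0 and 100")
    return pct_rfp_untransduced - pct_rfp_of_gfp_transduced


def rank_sgrnas(pct_rfp_untransduced: float, rfp_by_sgrna: dict) -> list:
    """Return (sgRNA name, efficiency) pairs, most efficient first."""
    scored = [
        (name, editing_efficiency(pct_rfp_untransduced, pct))
        for name, pct in rfp_by_sgrna.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical read-out for three candidate guides against one gene.
    example = {"sgRNA_1": 20.0, "sgRNA_2": 55.0, "sgRNA_3": 35.0}
    for name, eff in rank_sgrnas(95.0, example):
        print(f"{name}: {eff:.1f}% editing efficiency")
```

Under this reading, the guide that leaves the smallest RFP657+ fraction among GFP+ cells is the one carried forward.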

3–5 unique sgRNAs were tested for each gene of interest. sgRNAs with the highest editing efficiency in the reporter assay were cloned into appropriate constructs (below) and used in the study. A complete list of all sgRNAs used in this study is provided in Supplemental Table 1A. sgRNAs for specific genes were cloned into a plasmid containing a U6 promoter (Addgene #41824) using a BbsI restriction enzyme site, and lentiviral plasmid containing a U6 promoter and Cas9-P2A-GFP (Addgene #57818) using a BsmBI restriction enzyme site.

CRISPR/Cas9 targeting of CD34+ cells

We tested genome engineering using CD34+ cells from umbilical cord blood (UCB) and adult peripheral blood, which were nucleofected with equimolar amounts of CAG-Cas9-2A-GFP plasmid (Addgene #44719) and cloned U6-sgRNA plasmid(s) (Addgene #41824) using Nucleofector II (Lonza, program U-008) and the CD34+ nucleofector kit (Lonza VPA-1003). As controls, we included CD34+ cells nucleofected with equimolar amounts of CAG-Cas9-2A-GFP plasmid (Addgene #44719) and cloned U6-sgRNA plasmid (Addgene #41824) containing a non-targeting sgRNA. Following nucleofection, cells were cultured in cytokine supplemented medium (DMEM/F-12 + 10% FCS + rh-SCF (100ng/mL), rh-TPO (50ng/mL), rh-FLT3L (40ng/mL), rh-IL3 (20ng/mL), rh-GM-CSF (15ng/mL), rh-IL6 (10ng/mL)) for 16–24 hours prior to follow-up in vitro analysis or transplantation in vivo. For single cell clone analysis, a fraction of sorted GFP+ cells was cultured for an additional 5 days to maximize Cas9 cutting. All cytokines were obtained from Miltenyi. 200,000–500,000 cells were transplanted per mouse. Transfection efficiency was estimated by GFP expression, assessed using flow cytometry or fluorescent microscopy, and was generally within 60–80% across different experiments. Viability was determined using trypan blue exclusion and was generally around 50% at 24 hrs after nucleofection. Transfected cells were either cultured in cytokine supplemented medium (as listed above), or grown as colonies on fully supplemented methylcellulose medium (StemCell Technologies MethoCult H4034 Optimum) for 14 days. We used CD34+ cells from male donors to enhance our ability to target STAG2, which is an X-linked gene and frequently mutated in male patients with MDS and secondary AML. In vitro targeting with pools of 11 leukemia drivers with follow-up single cell clone analysis was done in two replicate experiments. Figure 1A shows a representative experiment of two independent replicates. Figure 1B shows pooled data from two independent experiments.

Lentiviral transduction

UCB CD34+ cells were transduced with the following lentiviral constructs, all generated by cloning the following cDNAs into pRSF91 (kind gift of C. Baum, Hanover Medical School): FLT3-ITD(W51), NPM1mut (frameshift), HOXA9, NRAS(G12D), IDH1(R132H) or AML1/ETO. CD34+ cells were spin-infected with lentivirus in retronectin coated plates (Clontech) at 37°C for 90 min at 1500rpm, with subsequent nucleofection of CRISPR reagents 24 hours later. Transduction efficiency was confirmed using fluorescence microscopy and was within the range of 3–5% across different cDNAs. Three mice were transplanted per cDNA/CRISPR pool group, and mock transduced cells transfected with a non-targeting sgRNA were used as a control.

Transplantation assays and mouse analysis

8–10 week old male and female NSGS mice (NOD-SCID; IL2Rγ null; Tg(IL3, CSF2, KITL); obtained from The Jackson Laboratory, strain 013062) were sublethally irradiated using a Gamma-cell irradiator (Best Theratronics) at a dose of 250 Rads, and retro-orbitally injected with 200,000–500,000 cells/mouse. 3–5 recipients were injected per arm, and a total of at least 6 adult and 4 UCB donors were used for experiments described here. For CHIP transplants, at least two unique donors in two independent experiments were used to target each gene. For smaller pool experiments, two independent adult CD34+ donors were used, and 6–9 recipients were transplanted in independent experiments. Animals were monitored daily for presence of disease, and were sacrificed at designated times after transplantation or when moribund. Peripheral blood was collected from the retro-orbital cavity using a heparinized glass capillary and automated total and differential blood cell counts were determined using the ADVIA Hematology system (Bayer). Collected blood was also used to prepare blood smears, which were stained with May Gruenwald and Modified Giemsa (Sigma Aldrich). Following sacrifice, mice were examined for presence of tumors, enlarged lymph nodes or other abnormalities, and organs were collected for further cell and histopathologic analysis. Single cell suspensions were made from bone marrow and spleen and washed, red blood cells were lysed, and samples were frozen in 10% dimethylsulfoxide/90% FCS until analysis. Peripheral blood and bone marrow from mice at the time of sacrifice were analyzed for contribution of human CD45+ hematopoiesis by flow cytometry (CD45-PE-Cy7 antibody (BD Biosciences, clone HI30)) as well as immunohistochemistry (Dako, clone 2B11 + PD7/26).
Additional immunohistochemistry analysis was performed using the following antibodies: anti-MPO (Dako, polyclonal rabbit), anti-CD20 (Dako, clone L26), anti-CD68 (Dako, clone PG-M1), anti-CD163 (Vector Labs, clone 10D6), anti-TNFa (Santa Cruz, clone F-6), anti-CD117 (Dako, rabbit polyclonal). For sequential transplant experiments, 500,000–1 million bone marrow CD45+ cells sorted from primary (CHIP) mice were nucleofected with Cas9 and a single targeting or NT sgRNA and transplanted into sublethally irradiated NSGS mice. 6–8 recipient mice were transplanted per gene across two independent experiments. CD45+ bone marrow sorted from each individual mouse was transplanted separately into a single secondary recipient. For tertiary sequential transplantation, 500,000–1 million bone marrow CD45+ cells sorted from secondary sequential transplant mice were transduced with lentiviral vectors coding FLT3-ITD or NRAS-G12D and transplanted into sublethally irradiated NSGS mice. CD45+ bone marrow from up to two secondary sequential transplant mice was pooled to obtain 500,000–1 million bone marrow CD45+ cells for transduction and transplantation. Serial transplantation experiments with engineered neoplastic clones were performed using 300,000–1 million CD45+ viable bone marrow cells, which were retro-orbitally injected into sublethally irradiated NSGS mice as described above. Cells were transplanted into 1–5 secondary recipient mice depending on the cell numbers available.

Flow cytometry

CD45+ bone marrow populations were analyzed and sorted with a FACSAriaII instrument (Becton Dickinson, Mountain View, CA) using anti-human CD45-PE-Cy7 antibody (BD Biosciences, clone HI30). Further characterization of mutant clones was carried out using the following antibodies: CD3-FITC (Life Technologies, clone S4.1), CD19-PE (BD Biosciences, clone HIB19), CD33-APC (BD Biosciences, clone WM53), CD34-BV421 (BD Biosciences, clone 581), CD38-BV711 (BD Biosciences, clone HIT2), CD11B-PerCP/Cy5.5 (BioLegend, clone M1/70), CD68-APC/Cy7 (BioLegend, clone Y1/82A), LIVE/DEAD Fixable Near-IR Dead Cell Stain Kit (Life Technologies), and LIVE/DEAD Fixable Aqua Dead Cell Stain Kit (Life Technologies).

Colony assays

Single cell-derived colonies were grown for 14 days in fully cytokine supplemented methylcellulose based medium MethoCult H4034 Optimum (Stem Cell Technologies, Vancouver, BC Canada). Single non-overlapping colonies of different morphologies were picked on Day 14 of plating and DNA extraction was performed using QuickExtract DNA extraction solution (Epicentre, Madison, WI) in two independent experiments. Figure 1B shows pooled data from two independent experiments.

In vitro azacitidine treatment

Azacitidine was purchased from Sigma Aldrich, dissolved in DMSO and serially diluted in culture medium. CD34+ cells were nucleofected with Cas9 and sgRNA containing plasmids as described above, transfection efficiency was confirmed using fluorescent microscopy, and cells were treated with a range of concentrations of azacitidine for 72 hours in triplicate. Cell counts and viability were determined using trypan blue exclusion. Three technical replicates were performed in two independent experiments across 6 different concentrations.

In vivo azacitidine treatment

Azacitidine was purchased from Sigma Aldrich, freshly dissolved in PBS at a concentration of 0.5mg/mL prior to each dosing, and administered once a day at a dose of 2.5mg/kg using intraperitoneal (i.p.) injection. Drug treatment was initiated 3 months after transplantation of engineered CD34+ cells and confirmation of engraftment by flow cytometry of hCD45+ staining of peripheral blood. Genomic DNA was isolated from peripheral blood at the time of engraftment and only mice with TET2 or ASXL1 indel fraction >0.01 in peripheral blood at the time of engraftment were included in the subsequent analysis. Azacitidine was administered on Days 1–5 of a 14 day cycle. Mice were treated for a total of 12 weeks (6 cycles), and bulk bone marrow and sorted populations of cells were analyzed for TET2 and ASXL1 indel fraction using NGS. 3–4 mice were included in each experimental arm.
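The dosing arithmetic in this regimen can be made explicit with a short sketch. It assumes that the injection volume is simply dose divided by stock concentration, and the 25 g body weight in the usage example is a hypothetical value; the function names are illustrative only.

```python
def injection_volume_ml(body_weight_g: float,
                        dose_mg_per_kg: float = 2.5,
                        stock_mg_per_ml: float = 0.5) -> float:
    """Volume of azacitidine stock to inject for one daily i.p. dose."""
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0  # g -> kg
    return dose_mg / stock_mg_per_ml


def dosing_days(n_cycles: int = 6,
                cycle_length_days: int = 14,
                dosing_days_per_cycle: int = 5) -> list:
    """Study days (1-based) on which drug is given: Days 1-5 of each cycle."""
    return [
        cycle * cycle_length_days + day
        for cycle in range(n_cycles)
        for day in range(1, dosing_days_per_cycle + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 25 g mouse: 0.0625 mg per dose, i.e. 0.125 mL of stock.
    print(f"volume per dose: {injection_volume_ml(25.0):.3f} mL")
    days = dosing_days()
    print(f"{len(days)} doses over {6 * 14} days; last dose on day {days[-1]}")
```

With the defaults stated in the text, six 14-day cycles span 84 days (12 weeks) and comprise 30 doses.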

QUANTIFICATION AND STATISTICAL ANALYSIS

CRISPR/Cas9 Indel Detection

Primary PCR amplicons spanning ~180–220 bp of genomic sequence and centered on the predicted Cas9 cut site were designed for each targeted gene and amplified using a sequential PCR (Figure S1C). All PCR primers used are listed in Table S1A. Next-generation sequencing was performed on the MiSeq desktop sequencer (Illumina), and 300 bp single-end reads were used to identify indels as described below. Depth of sequencing was designed to match the number of human CD45+ cells sequenced and was generally between 50,000 and 200,000 reads per sgRNA target site per gene. Each sequencing run included PCR amplicons generated from wild-type samples in parallel with amplicons from samples in which targeting had occurred. The wild-type samples were used to determine the accuracy of the indel calling pipeline as well as the significance threshold for indel calling (described below).

DNA-seq reads were mapped against human genome build Hg19, downloaded from the UCSC genome browser, using the affine gap Smith-Waterman algorithm as implemented by BWA-MEM version 0.7.10. Reads were filtered for those mapping unambiguously to the targeted sequencing region of ASXL1, DNMT3A, EZH2, NF1, RUNX1, SRSF2, SMC3, STAG2, TET2, TP53, or U2AF1. For each sequencing run, a test set was constructed for assessing indel detection accuracy by sampling 100 wild-type reads per gene and modifying each read to create 300 variations of the read, including 1 to 150 base deletions and 1 to 150 base insertions. For the purposes of sampling, wild-type reads were defined as those with an indel-free alignment spanning the entire length of the PCR amplicon and BWA-MEM MAPQ of 60. The deletion variations of each wild-type read were created by removing 1 to 150 bases centered at the cut site while leaving the original distribution of quality scores. The insertion variations were created by sampling 1 to 150 bases from the original read, inserting them at the cut site, and extending the quality scores by the average quality score across all reads in the sequencing run for the position of the last base in the amplicon. For a sequencing run with 12 targeted gene regions, the overall test set consists of 361,200 reads.
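The construction of the test set can be sketched as follows. This is a simplified illustration, not the pipeline code: it handles sequences only (quality scores are omitted), it assumes that inserted bases are a contiguous segment drawn at random from the read, and all function names are hypothetical. Counting the unmodified read together with its 300 variants reproduces the stated total of 12 × 100 × 301 = 361,200 reads.

```python
import random


def deletion_variant(read: str, cut: int, size: int) -> str:
    """Remove `size` bases roughly centered on `cut` (0-based index)."""
    if size > len(read):
        raise ValueError("deletion longer than read")
    start = min(max(0, cut - size // 2), len(read) - size)
    return read[:start] + read[start + size:]


def insertion_variant(read: str, cut: int, size: int,
                      rng: random.Random) -> str:
    """Insert `size` bases, sampled as one segment of the read, at `cut`."""
    if size > len(read):
        raise ValueError("insertion longer than read")
    start = rng.randrange(0, len(read) - size + 1)
    return read[:cut] + read[start:start + size] + read[cut:]


def build_variants(read: str, cut: int, max_size: int = 150,
                   seed: int = 0) -> list:
    """Return (signed size, sequence) pairs for one wild-type read.

    Size 0 is the unmodified read, negative sizes are deletions and
    positive sizes are insertions.
    """
    rng = random.Random(seed)
    variants = [(0, read)]
    for size in range(1, max_size + 1):
        variants.append((-size, deletion_variant(read, cut, size)))
        variants.append((size, insertion_variant(read, cut, size, rng)))
    return variants


if __name__ == "__main__":
    wild_type = "ACGT" * 50  # hypothetical 200 bp amplicon read
    per_read = build_variants(wild_type, cut=100)
    print(len(per_read), "reads per sampled wild-type read")
    print(12 * 100 * len(per_read), "reads for 12 regions x 100 reads")
```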

Insertions and deletions were characterized in both the test reads and the unmodified experimental reads using two complementary procedures. The first procedure is applied to reads that have a BWA-MEM alignment that spans the CRISPR/Cas9 cut site. The second procedure is applied to reads that have multiple non-spanning alignments. For spanning alignments, the first procedure detects small indels by simply parsing the CIGAR string of the alignment using the Pysam module (https://pypi.python.org/pypi/pysam). A target region is defined as +/− 25 nucleotides from the cut site and reads are labeled positive or negative for CRISPR/Cas9 editing depending on whether an indel is observed in the target region. For multiple non-spanning alignments, the second procedure searches amongst a read’s multiple alignments to find an aligned segment that maps prior to the cut site (prefix), and an aligned segment that maps post cut site (suffix). If both a prefix and suffix alignment exist, the read is labeled positive for editing and indel position is approximated by the distance between the prefix end position and suffix start position. The second procedure does not label any reads negative as prefix or suffix alignments are individually uninformative.
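The two procedures can be illustrated with a dependency-free sketch. The study parsed alignments with Pysam; here, as a stand-in, the CIGAR string is parsed directly with a regular expression, and alignment segments are passed as plain (start, end) reference coordinates. The function names and the coordinate conventions (0-based, half-open segments) are assumptions made for this example.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_from_cigar(ref_start: int, cigar: str) -> list:
    """Return (kind, reference position, length) for each I or D operation."""
    pos = ref_start
    found = []
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op == "I":
            found.append(("I", pos, n))  # insertions consume no reference
        elif op == "D":
            found.append(("D", pos, n))
            pos += n
        elif op in "MN=X":
            pos += n
        # S, H and P consume no reference bases
    return found


def is_edited(ref_start: int, cigar: str, cut_site: int,
              window: int = 25) -> bool:
    """Procedure 1: is there an indel within +/- `window` nt of the cut site?"""
    lo, hi = cut_site - window, cut_site + window
    for kind, pos, n in indels_from_cigar(ref_start, cigar):
        last = pos + n - 1 if kind == "D" else pos
        if pos <= hi and last >= lo:
            return True
    return False


def split_read_call(segments: list, cut_site: int):
    """Procedure 2: approximate indel position from prefix/suffix segments.

    `segments` holds (ref_start, ref_end) pairs for one read's alignments.
    Returns the reference distance between the prefix end and the suffix
    start, or None when the read is uninformative.
    """
    prefix_ends = [end for start, end in segments if end <= cut_site]
    suffix_starts = [start for start, end in segments if start >= cut_site]
    if prefix_ends and suffix_starts:
        return min(suffix_starts) - max(prefix_ends)
    return None


if __name__ == "__main__":
    print(is_edited(100, "50M3D47M", cut_site=150))   # deletion at cut site
    print(is_edited(100, "100M", cut_site=150))       # no indel
    print(split_read_call([(100, 140), (165, 200)], cut_site=150))
```

Returning None rather than a negative call in the second function mirrors the statement that prefix or suffix alignments alone are uninformative.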

Accuracy of the indel calling procedure was assessed based on the percent of test reads correctly classified as wild-type versus target region indel. Figure S1D shows the percent accuracy of each gene for a representative sequencing run containing 11 targeted genes. Deletions and insertions are represented as negative and positive lengths, respectively. If we define a high accuracy interval as the largest interval for which all indel sizes within the interval are detected at over 95% accuracy, the average high accuracy interval across genes is [−71,56]. Sequence specific characteristics result in variation of indel detection accuracy between genes, ranging from U2AF1 with the largest high accuracy interval, [−150,94], to NF1 with the smallest, [−34,44]. However, over 99% of the distribution of observed indel sizes in experimental reads for both U2AF1 and NF1 are contained within the interval [−12,1] (Figure S1E), suggesting the high accuracy interval for all 11 genes is large enough to detect the vast majority of indels induced by CRISPR/Cas9.
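One way to compute a high accuracy interval from per-size accuracy values is sketched below. It assumes the interval is the longest run of consecutive indel sizes that all exceed the 95% threshold; the function name and the toy accuracy values are hypothetical.

```python
def high_accuracy_interval(accuracy: dict, threshold: float = 0.95):
    """Longest run of consecutive indel sizes with accuracy > threshold.

    `accuracy` maps signed indel size (negative = deletion, positive =
    insertion) to the fraction of test reads classified correctly.
    Returns (smallest size, largest size), or None if no size qualifies.
    """
    best = None
    run_start = None
    previous = None
    for size in sorted(accuracy):
        passes = accuracy[size] > threshold
        contiguous = previous is not None and size == previous + 1
        if passes and not (run_start is not None and contiguous):
            run_start = size  # start a new run
        elif not passes:
            run_start = None
        if passes and (best is None
                       or size - run_start > best[1] - best[0]):
            best = (run_start, size)
        previous = size
    return best


if __name__ == "__main__":
    # Toy example: accuracy dips below threshold at sizes -6 and 8.
    toy = {size: 0.99 for size in range(-10, 11)}
    toy[-6] = 0.50
    toy[8] = 0.90
    print(high_accuracy_interval(toy))
```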

Allelic fractions represent insertions and deletions of any size that overlap a gene’s target region as detected by the two indel detection procedures. For each sample/gene pair, the count of reads with target region indels and the count of reads without target region indels were compared to the average corresponding counts of the wild-type control present in each sequencing run using a one-sided Fisher’s exact test. False discovery rate was calculated using the Benjamini & Hochberg method (Benjamini and Hochberg, 1995) and only fractions satisfying a 5% FDR cutoff were considered to be above zero. Presence of indels was confirmed using the Integrative Genomics Viewer (IGV) (Figure S1F).
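The significance call can be sketched with the standard library alone. This is an illustration under stated assumptions: averaged control counts are taken to be rounded to integers, the one-sided test asks whether the sample has more indel reads than the control, and the exact hypergeometric sum shown here is only practical for small counts. At real sequencing depth a library routine such as `scipy.stats.fisher_exact` with `alternative="greater"` would be the more sensible choice.

```python
from math import comb


def fisher_one_sided(sample_indel: int, sample_wt: int,
                     control_indel: int, control_wt: int) -> float:
    """P(indel reads in sample >= observed) under the hypergeometric null."""
    row = sample_indel + sample_wt           # reads in the sample
    col = sample_indel + control_indel       # indel reads overall
    total = row + control_indel + control_wt
    upper = min(row, col)
    numerator = sum(
        comb(col, k) * comb(total - col, row - k)
        for k in range(sample_indel, upper + 1)
    )
    return numerator / comb(total, row)


def benjamini_hochberg(pvalues: list) -> list:
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):             # largest p-value first
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical (indel, wild-type) read counts for three sample/gene pairs.
    control = (2, 998)
    samples = {"TET2": (40, 960), "NF1": (3, 997), "STAG2": (15, 985)}
    pvals = [fisher_one_sided(a, b, *control) for a, b in samples.values()]
    for gene, q in zip(samples, benjamini_hochberg(pvals)):
        print(gene, "above zero" if q <= 0.05 else "not significant", q)
```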

To facilitate the use of our indel detection method, the analysis components have been compiled into the automated CRISPR-Seq workflow and provided as a publicly available method on FireCloud (http://www.firecloud.org), a cloud-based genomic analysis platform. Detailed instructions for using the CRISPR-Seq method via FireCloud are provided in the CRISPR-Seq documentation (http://crispr-seq.com/). In brief, users upload FASTQ data and annotation files to secure Google Cloud storage (HIPAA compliant) and use the FireCloud web interface to launch their own CRISPR-Seq analysis. Source code is also provided in a Docker image for computational scientists interested in modifying analyses or using other compute resources.

Point Mutation (IDH1, NRAS) and FLT3-ITD Insertion Detection

For point mutation detection, reads were aligned with BWA-MEM and the allele fraction was determined using pileup statistics from the R Rsamtools package. For FLT3-ITD detection, two independent methods, end matching and comparative alignment, were used to verify the results. In the end matching method, raw FASTQ reads were queried for those matching both the first (s1) and last (s2) 22 nt of the FLT3 amplicon using dictionary pattern matching from the R Biostrings package. The matching parameters allow up to 10% mismatches per string, but zero indels. The distance between the s1 and s2 mapping positions within each read was calculated and used to bin the reads into wild-type (102 nt), FLT3-ITD (123 nt), or ambiguous (other distance) groups. In the comparative alignment method, reads were aligned to a custom reference consisting of only two reference sequences (FLT3 and FLT3-ITD hg19 sequences) using BWA-MEM and filtered for primary alignments. Pearson correlation between FLT3-ITD counts across 45 samples demonstrates that the two methods yield consistent results (r = 0.99, p-value < 2.2e-16). Mutant cell fractions reported for mutations inserted as a third allele are calculated according to the following:

$$\text{mutant cell fraction} = 2 \times \frac{\#\,\text{mutant reads}}{\#\,\text{wild-type reads}}$$
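A minimal sketch of the end matching idea and of the cell fraction formula follows. The study used the R Biostrings package; this Python stand-in makes several assumptions that are not stated in the text: matching is a simple leftmost Hamming-distance scan, the s1 to s2 distance is measured from the start of s1 to the start of s2, and the factor of 2 is read as reflecting two wild-type alleles per cell plus one inserted mutant copy per transduced cell. All names are hypothetical.

```python
def find_approx(read: str, pattern: str, max_mismatch_frac: float = 0.10):
    """Leftmost position where `pattern` matches with few mismatches.

    Substitutions only (no indels); returns None if there is no match.
    """
    limit = int(len(pattern) * max_mismatch_frac)
    for i in range(len(read) - len(pattern) + 1):
        window = read[i:i + len(pattern)]
        mismatches = sum(1 for x, y in zip(window, pattern) if x != y)
        if mismatches <= limit:
            return i
    return None


def classify_read(read: str, s1: str, s2: str,
                  wt_distance: int = 102, itd_distance: int = 123):
    """Bin a read by the distance between its s1 and s2 matches."""
    p1, p2 = find_approx(read, s1), find_approx(read, s2)
    if p1 is None or p2 is None:
        return None  # read does not contain both amplicon ends
    distance = p2 - p1
    if distance == wt_distance:
        return "wild_type"
    if distance == itd_distance:
        return "FLT3-ITD"
    return "ambiguous"


def mutant_cell_fraction(mutant_reads: int, wild_type_reads: int) -> float:
    """Cell fraction for a mutation carried as a third allele."""
    return 2 * mutant_reads / wild_type_reads


if __name__ == "__main__":
    # Placeholder 22-nt ends, not the real FLT3 amplicon sequences.
    s1 = "ACGTTGCAAGCTTAGGCTAACG"
    s2 = "TTGACCGGTATCGATCCGTAAG"
    wt_read = s1 + "N" * (102 - len(s1)) + s2
    print(classify_read(wt_read, s1, s2))
    print(mutant_cell_fraction(mutant_reads=100, wild_type_reads=1000))
```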

Visualization of indel type analysis for monitoring clonal dynamics over time

All reads supporting unique indel types (as characterized by type of indel (insertion versus deletion), size of indel (in bp) and genomic locus of the start site (Hg19 coordinate)) were graphically represented as a separate pie chart for each gene. Only indel types supported by at least 3 reads were shown. For in vitro experiments, pie charts representing indel types at 24hr versus 144hr in vitro culture are shown. For in vivo experiments, pie charts representing indel types at 24hr (pre-injection) and in vivo (usually 5 months or at the time of analysis) are shown. Subsequent analysis of LOF and non-LOF indel types was performed, with non-LOF indels considered to be those that are < 18bp long and multiples of 3bp.
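The indel type bookkeeping described above can be sketched as follows. Each read's indel is represented as a tuple of (type, size in bp, Hg19 start coordinate); the tuple layout, the function names and the example coordinates are assumptions made for illustration.

```python
from collections import Counter


def is_predicted_non_lof(indel_size_bp: int) -> bool:
    """Non-LOF by the stated rule: shorter than 18 bp and a multiple of 3."""
    size = abs(indel_size_bp)
    return 0 < size < 18 and size % 3 == 0


def indel_types_for_pie(read_indels: list, min_reads: int = 3) -> dict:
    """Count reads per unique indel type, keeping types with enough support."""
    counts = Counter(read_indels)
    return {indel: n for indel, n in counts.items() if n >= min_reads}


if __name__ == "__main__":
    # Hypothetical reads for one gene: (type, size, start coordinate).
    reads = (
        [("deletion", 7, 1000)] * 12
        + [("deletion", 3, 1002)] * 4
        + [("insertion", 1, 1001)] * 2   # dropped: fewer than 3 reads
    )
    for (kind, size, start), n in indel_types_for_pie(reads).items():
        label = "non-LOF" if is_predicted_non_lof(size) else "LOF"
        print(kind, size, start, n, label)
```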

Exome sequencing and off-target analysis

Whole exome sequencing (WES) was performed on human CD45+ sorted bone marrow cells (without matched normal controls). DNA was extracted using the QIAamp DNA Micro kit (Qiagen). WES was performed at the Broad Institute (Human WES Express (standard coverage) product). Libraries were prepared according to previously published methods using Illumina’s in-solution DNA probe based hybrid selection method. The Illumina exome specifically targets approximately 37.7Mb of mainly exonic territory made up of all targets from Agilent exome design (Agilent SureSelect All Exon V2), all coding regions of Gencode V11 genes, and all coding regions of RefSeq gene and KnownGene tracks from the UCSC genome browser (http://genome.ucsc.edu). Pooled libraries were normalized to 2nM and denatured using 0.2 N NaOH prior to sequencing. Flowcell cluster amplification and sequencing were performed according to the manufacturer’s protocols using either the HiSeq 2000 v3 or HiSeq 2500. Each run was a 76bp paired-end with a dual eight-base index barcode read. Data was analyzed using the Broad Picard Pipeline, which includes de-multiplexing and data aggregation. Mean bait coverage was 98 and 66 for samples 1783 and ZM46, respectively. Mutations were called with MuTect and Indelocator and annotated to genes with Oncotator. Visual inspection with IGV was used to confirm presence of NF1 indels, which were not picked up by Indelocator. Indels and SNPs were first filtered for coverage of at least 20X, coding variants, and allelic fractions of at least 0.10. Likely germline SNPs were removed by filtering against the Exome aggregation consortium data set (ExAC, http://exac.broadinstitute.org/) and 1000 genomes data set (http://www.1000genomes.org/), but all SNPs and indels in DNMT3A, TET2, JAK2 and ASXL1, which are genes most commonly mutated in CHIP, were included in the final analysis even if filtered out by ExAC or 1000 genomes. 
Indels and SNPs found in any of the 160 hematopoiesis and leukemia genes (Jaiswal et al., 2014) were cross-referenced with known functional variants reported in the tumor portal dataset (http://www.tumorportal.org/) as well as the COSMIC dataset (http://cancer.sanger.ac.uk/cosmic). Predicted off-target sites were determined using the Zhang laboratory online tool (http://tools.genome-engineering.org/) based on a published algorithm (Hsu et al., 2013), and no indels were found within 50bp of the 131 predicted off-target sites.
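The exome variant filters described above can be expressed as a single predicate. This sketch assumes each variant is a dictionary with the hypothetical keys shown, and that the exemption for DNMT3A, TET2, JAK2 and ASXL1 applies only to the germline database filter, not to the coverage, coding or allelic fraction filters.

```python
CHIP_GENES = {"DNMT3A", "TET2", "JAK2", "ASXL1"}


def keep_variant(variant: dict, in_germline_db: bool) -> bool:
    """Apply the coverage, coding, allelic fraction and germline filters.

    `in_germline_db` is True when the variant is present in ExAC or
    1000 Genomes.
    """
    if variant["depth"] < 20:
        return False
    if not variant["coding"]:
        return False
    if variant["allele_fraction"] < 0.10:
        return False
    is_snp = variant["kind"] == "SNP"
    if is_snp and in_germline_db and variant["gene"] not in CHIP_GENES:
        return False  # likely germline SNP outside the CHIP gene set
    return True


if __name__ == "__main__":
    example = {"gene": "TET2", "kind": "SNP", "depth": 60,
               "coding": True, "allele_fraction": 0.45}
    print(keep_variant(example, in_germline_db=True))   # kept: CHIP gene
    example["gene"] = "NF1"
    print(keep_variant(example, in_germline_db=True))   # dropped as germline
```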

Statistical analysis

In vitro targeting with pools of 11 leukemia drivers with follow-up single cell clone analysis was done in two independent experiments. Statistical significance of differences among mono- versus bi-allelic LOF mutations induced by multiplex targeting of CD34+ cells in vitro was determined using Barnard’s exact test (Figure 1B). Statistical significance supporting non-randomness of indel distribution in the pre-injection samples and 5 months after transplantation (experiment in Figures 1D and S2G) between control and experimental sgRNAs was determined using a two-tailed Fisher’s exact test. Statistical significance of differences in viability between vehicle and azacitidine treatment conditions in vitro was determined using a two-tailed unpaired Student’s t-test. Three technical replicates were performed in two independent experiments using two independent CD34+ donors (Figure 4A). Statistical significance of differences between indel fractions of bulk bone marrow and sorted populations isolated from vehicle and azacitidine treated mice was determined using a two-tailed unpaired Student’s t-test. Cells isolated from 3–4 mice were included in each experimental arm (Figures 4C–D). For CHIP transplants, at least two unique donors in two independent experiments were used to target each gene. A paired Student’s t-test was performed on a composite data set of pre-injection indel fractions and mean indel fractions for TET2, DNMT3A and ASXL1 exon 12 to show significant clonal expansion. Data for adult and umbilical cord donors were pooled for this analysis (Figure 4A). A one-tailed paired Student’s t-test was performed to examine the tendency for specific major indels to expand over time, and for specific predicted non-LOF indels to be lost over time (Table S4A and B). A one-tailed Mann-Whitney U test was used to examine relative expansion of mutant clones in experimental genes versus control genes (Figure S2H).

Shannon entropy analysis

The indel distribution for a sample can be represented in bits by assigning each read a binary value according to its indel status. Each sample consists of sequencing reads from 11 genes, requiring some binning of binary read values into gene groups to encode the total distribution information in bits. Shannon entropy was calculated for each sample based on within-bin indel probabilities using the R entropy package (Hausser and Strimmer, 2009). To determine whether there are significant changes in entropy between the pre-injection distribution, Y, and in vivo expansion (or a later in vitro timepoint), Xi, the change in entropy is calculated for each expansion sample i and used as a test statistic, ti = H(Xi) − H(Y), for random permutation tests. Each permutation consists of pooling reads for Xi and Y and shuffling the sample labels while keeping the gene labels intact, so that the overall indel frequencies remain fixed per gene. The p-value is the proportion of sampled permutations in which the change in entropy is more negative than the observed ti. Shannon entropy was calculated for Figures 1A and S2G.
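The permutation test can be sketched as below. The study used the R entropy package; this Python version is one possible reading of the description, in which each read is a (gene, indel status) pair and entropy is computed over the empirical distribution of those pairs. The binning choice, function names and default number of permutations are assumptions.

```python
import math
import random
from collections import Counter


def shannon_entropy_bits(reads: list) -> float:
    """Entropy (bits) of the empirical distribution of (gene, indel) pairs."""
    counts = Counter(reads)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_permutation_test(expansion: list, pre_injection: list,
                             n_perm: int = 1000, seed: int = 0):
    """Return (observed H(X) - H(Y), permutation p-value).

    Sample labels are shuffled within each gene, so gene labels and the
    per-gene indel frequencies of the pooled reads stay fixed.
    """
    rng = random.Random(seed)
    observed = (shannon_entropy_bits(expansion)
                - shannon_entropy_bits(pre_injection))
    by_gene = {}
    for label, reads in (("X", expansion), ("Y", pre_injection)):
        for gene, has_indel in reads:
            by_gene.setdefault(gene, []).append((label, has_indel))
    more_negative = 0
    for _ in range(n_perm):
        perm_x, perm_y = [], []
        for gene, items in by_gene.items():
            labels = [label for label, _ in items]
            rng.shuffle(labels)
            for label, (_, has_indel) in zip(labels, items):
                target = perm_x if label == "X" else perm_y
                target.append((gene, has_indel))
        delta = shannon_entropy_bits(perm_x) - shannon_entropy_bits(perm_y)
        if delta < observed:
            more_negative += 1
    return observed, more_negative / n_perm


if __name__ == "__main__":
    # Hypothetical reads: one gene expands in vivo, the other does not.
    pre = [("TET2", True)] * 50 + [("TET2", False)] * 50 \
        + [("NF1", True)] * 50 + [("NF1", False)] * 50
    post = [("TET2", True)] * 95 + [("TET2", False)] * 5 \
        + [("NF1", True)] * 50 + [("NF1", False)] * 50
    print(entropy_permutation_test(post, pre, n_perm=200))
```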

Supplementary Material

1
2. Table S4. Related to Figures 1 and 3.

Clonal dynamics and expansion of predicted non-LOF and mutant clones over a 5 month-long in vivo expansion.

Acknowledgments

We thank P.K. Mandal and D. J. Rossi for sharing reagents and technical advice, Dr. D. Neuberg for help with statistical analysis, S. Lazo for flow cytometry assistance, E. Haydu and M. Donahue for technical assistance, D. Heckl for sharing reagents, U. Ben-David, J. Doench, and M. Hegde for valuable discussion, and P. Bandopadhayay and N. Greenwald for assistance with exome data analysis. This work was supported by the NIH(R01HL082945 and P01CA108631), the Edward P. Evans Foundation, the Gabrielle’s Angel Foundation, LLS Scholar Award, and Broad Ignite early career grant to B.L.E. from the Broad Institute of MIT and Harvard; NIH(5K12CA087723-12), LLS Special Fellow Award, ASCO Young Investigator Award and ASH/EHA Translational Research in Hematology Award to Z.T.; NIH(T32GM007753) to Q.L.S., and NIDDK grant (DK106829) to S. Heimfeld of the SCCA Cellular Therapy Laboratory.

Footnotes

AUTHOR CONTRIBUTIONS:

Z.T., J.C.A., A.T. and B.L.E. designed and supervised the research; Z.T., J.M.K.-B., K.D.P., C.C.L., Q.L.S., and D.Y. performed the research; Z.T., J.M.K.-B., K.D.P., C.C.L., Q.L.S., D.Y., R.B., J.C.A., E.A.M. and B.L.E. analyzed the data; Z.T. and B.L.E. wrote the manuscript.

DATA AND SOFTWARE AVAILABILITY

Supplemental Data: URL/Tothova_et_al_Supplemental_Data

Supplemental Table 4: link

ADDITIONAL RESOURCES

CRISPR-SEQ analytical pipeline for CRISPR indel detection and all associated documentation is available at: http://crispr-seq.com/

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Bejar R, Lord A, Stevenson K, Bar-Natan M, Perez-Ladaga A, Zaneveld J, Wang H, Caughey B, Stojanov P, Getz G, et al. TET2 mutations predict response to hypomethylating agents in myelodysplastic syndrome patients. Blood. 2014;124:2705–2712. doi: 10.1182/blood-2014-06-582809.
  2. Bejar R, Stevenson K, Abdel-Wahab O, Galili N, Nilsson B, Garcia-Manero G, Kantarjian H, Raza A, Levine RL, Neuberg D, et al. Clinical effect of point mutations in myelodysplastic syndromes. The New England journal of medicine. 2011;364:2496–2506. doi: 10.1056/NEJMoa1013343.
  3. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B. 1995:289–300.
  4. Buechele C, Breese EH, Schneidawind D, Lin CH, Jeong J, Duque-Afonso J, Wong SH, Smith KS, Negrin RS, Porteus M, et al. MLL leukemia induction by genome editing of human CD34+ hematopoietic cells. Blood. 2015;126:1683–1694. doi: 10.1182/blood-2015-05-646398.
  5. Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. The New England journal of medicine. 2013;368:2059–2074. doi: 10.1056/NEJMoa1301689.
  6. Corces-Zimmerman MR, Hong WJ, Weissman IL, Medeiros BC, Majeti R. Preleukemic mutations in human acute myeloid leukemia affect epigenetic regulators and persist in remission. Proceedings of the National Academy of Sciences of the United States of America. 2014;111:2548–2553. doi: 10.1073/pnas.1324297111.
  7. Dever DP, Bak RO, Reinisch A, Camarena J, Washington G, Nicolas CE, Pavel-Dinu M, Saxena N, Wilkens AB, Mantri S, et al. CRISPR/Cas9 beta-globin gene targeting in human haematopoietic stem cells. Nature. 2016;539:384–389. doi: 10.1038/nature20134.
  8. Genovese G, Kahler AK, Handsaker RE, Lindberg J, Rose SA, Bakhoum SF, Chambert K, Mick E, Neale BM, Fromer M, et al. Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. The New England journal of medicine. 2014a;371:2477–2487. doi: 10.1056/NEJMoa1409405.
  9. Genovese P, Schiroli G, Escobar G, Di Tomaso T, Firrito C, Calabria A, Moi D, Mazzieri R, Bonini C, Holmes MC, et al. Targeted genome editing in human repopulating haematopoietic stem cells. Nature. 2014b;510:235–240. doi: 10.1038/nature13420.
  10. Gundry MC, Brunetti L, Lin A, Mayle AE, Kitano A, Wagner D, Hsu JI, Hoegenauer KA, Rooney CM, Goodell MA, et al. Highly Efficient Genome Editing of Murine and Human Hematopoietic Progenitor Cells by CRISPR/Cas9. Cell reports. 2016;17:1453–1461. doi: 10.1016/j.celrep.2016.09.092.
  11. Haferlach T, Nagata Y, Grossmann V, Okuno Y, Bacher U, Nagae G, Schnittger S, Sanada M, Kon A, Alpermann T, et al. Landscape of genetic lesions in 944 patients with myelodysplastic syndromes. Leukemia. 2014;28:241–247. doi: 10.1038/leu.2013.336.
  12. Heckl D, Kowalczyk MS, Yudovich D, Belizaire R, Puram RV, McConkey ME, Thielke A, Aster JC, Regev A, Ebert BL. Generation of mouse models of myeloid malignancy with combinatorial genetic lesions using CRISPR-Cas9 genome editing. Nature biotechnology. 2014 doi: 10.1038/nbt.2951.
  13. Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O, et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nature biotechnology. 2013;31:827–832. doi: 10.1038/nbt.2647.
  14. Jaiswal S, Fontanillas P, Flannick J, Manning A, Grauman PV, Mar BG, Lindsley RC, Mermel CH, Burtt N, Chavez A, et al. Age-related clonal hematopoiesis associated with adverse outcomes. The New England journal of medicine. 2014;371:2488–2498. doi: 10.1056/NEJMoa1408617.
  15. Jordan CT, Guzman ML, Noble M. Cancer stem cells. The New England journal of medicine. 2006;355:1253–1261. doi: 10.1056/NEJMra061808.
  16. Klco JM, Spencer DH, Miller CA, Griffith M, Lamprecht TL, O’Laughlin M, Fronick C, Magrini V, Demeter RT, Fulton RS, et al. Functional heterogeneity of genetically defined subclones in acute myeloid leukemia. Cancer cell. 2014;25:379–392. doi: 10.1016/j.ccr.2014.01.031.
  17. Kon A, Shih LY, Minamino M, Sanada M, Shiraishi Y, Nagata Y, Yoshida K, Okuno Y, Bando M, Nakato R, et al. Recurrent mutations in multiple components of the cohesin complex in myeloid neoplasms. Nature genetics. 2013;45:1232–1237. doi: 10.1038/ng.2731.
  18. Mandal PK, Ferreira LM, Collins R, Meissner TB, Boutwell CL, Friesen M, Vrbanac V, Garrison BS, Stortchevoi A, Bryder D, et al. Efficient ablation of genes in human hematopoietic stem and effector cells using CRISPR/Cas9. Cell stem cell. 2014;15:643–652. doi: 10.1016/j.stem.2014.10.004.
  19. Mazumdar C, Shen Y, Xavy S, Zhao F, Reinisch A, Li R, Corces MR, Flynn RA, Buenrostro JD, Chan SM, et al. Leukemia-Associated Cohesin Mutants Dominantly Enforce Stem Cell Programs and Impair Human Hematopoietic Progenitor Differentiation. Cell stem cell. 2015;17:675–688. doi: 10.1016/j.stem.2015.09.017.
  20. Mullenders J, Aranda-Orgilles B, Lhoumaud P, Keller M, Pae J, Wang K, Kayembe C, Rocha PP, Raviram R, Gong Y, et al. Cohesin loss alters adult hematopoietic stem cell homeostasis, leading to myeloproliferative neoplasms. The Journal of experimental medicine. 2015;212:1833–1850. doi: 10.1084/jem.20151323.
  21. Papaemmanuil E, Gerstung M, Bullinger L, Gaidzik VI, Paschka P, Roberts ND, Potter NE, Heuser M, Thol F, Bolli N, et al. Genomic Classification and Prognosis in Acute Myeloid Leukemia. The New England journal of medicine. 2016;374:2209–2221. doi: 10.1056/NEJMoa1516192. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Papaemmanuil E, Gerstung M, Malcovati L, Tauro S, Gundem G, Van Loo P, Yoon CJ, Ellis P, Wedge DC, Pellagatti A, et al. Clinical and biological implications of driver mutations in myelodysplastic syndromes. Blood. 2013;122:3616–3627. doi: 10.1182/blood-2013-08-518886. quiz 3699. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Rangarajan A, Weinberg RA. Opinion: Comparative biology of mouse versus human cells: modelling human cancer in mice. Nature reviews Cancer. 2003;3:952–959. doi: 10.1038/nrc1235. [DOI] [PubMed] [Google Scholar]
  24. Sanchez-Rivera FJ, Papagiannakopoulos T, Romero R, Tammela T, Bauer MR, Bhutkar A, Joshi NS, Subbaraj L, Bronson RT, Xue W, et al. Rapid modelling of cooperating genetic events in cancer through somatic genome editing. Nature. 2014;516:428–431. doi: 10.1038/nature13906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Schuback HL, Arceci RJ, Meshinchi S. Somatic characterization of pediatric acute myeloid leukemia using next-generation sequencing. Semin Hematol. 2013;50:325–332. doi: 10.1053/j.seminhematol.2013.09.003. [DOI] [PubMed] [Google Scholar]
  26. Viny AD, Ott CJ, Spitzer B, Rivas M, Meydan C, Papalexi E, Yelin D, Shank K, Reyes J, Chiu A, et al. Dose-dependent role of the cohesin complex in normal and malignant hematopoiesis. The Journal of experimental medicine. 2015;212:1819–1832. doi: 10.1084/jem.20151317. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Walter MJ, Shen D, Ding L, Shao J, Koboldt DC, Chen K, Larson DE, McLellan MD, Dooling D, Abbott R, et al. Clonal architecture of secondary acute myeloid leukemia. The New England journal of medicine. 2012;366:1090–1098. doi: 10.1056/NEJMoa1106968. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Welch JS, Ley TJ, Link DC, Miller CA, Larson DE, Koboldt DC, Wartman LD, Lamprecht TL, Liu F, Xia J, et al. The origin and evolution of mutations in acute myeloid leukemia. Cell. 2012;150:264–278. doi: 10.1016/j.cell.2012.06.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Welch JS, Petti AA, Miller CA, Fronick CC, O’Laughlin M, Fulton RS, Wilson RK, Baty JD, Duncavage EJ, Tandon B, et al. TP53 and Decitabine in Acute Myeloid Leukemia and Myelodysplastic Syndromes. The New England journal of medicine. 2016;375:2023–2036. doi: 10.1056/NEJMoa1605949. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Wunderlich M, Chou FS, Link KA, Mizukawa B, Perry RL, Carroll M, Mulloy JC. AML xenograft efficiency is significantly improved in NOD/SCID-IL2RG mice constitutively expressing human SCF, GM-CSF and IL-3. Leukemia. 2010;24:1785–1788. doi: 10.1038/leu.2010.158. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Xie M, Lu C, Wang J, McLellan MD, Johnson KJ, Wendl MC, McMichael JF, Schmidt HK, Yellapantula V, Miller CA, et al. Age-related mutations associated with clonal hematopoietic expansion and malignancies. Nature medicine. 2014;20:1472–1478. doi: 10.1038/nm.3733. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

1
2. Table S4. Related to Figures 1 and 3.

Clonal dynamics and expansion of predicted non-LOF and mutant clones over a 5 month-long in vivo expansion.
