Abstract
Short H2A (sH2A) histone variants are primarily expressed in the testes of placental mammals. Their incorporation into chromatin is associated with nucleosome destabilization and modulation of alternative splicing. Here, we show that sH2As innately possess features similar to recurrent oncohistone mutations associated with nucleosome instability. Through analyses of existing cancer genomics datasets, we find aberrant sH2A upregulation in a broad array of cancers, which manifest splicing patterns consistent with global nucleosome destabilization. We posit that short H2As are a class of “ready-made” oncohistones, whose inappropriate expression contributes to chromatin dysfunction in cancer.
Subject terms: Cancer epigenetics, Cancer genomics
Short H2A variants are testis-specific histones that destabilize nucleosomes during spermatogenesis. In this study, the authors show that these variants are expressed in an array of different cancers and identify splicing changes associated with nucleosome instability in these malignancies.
Introduction
Nucleosomes, the fundamental subunits of chromatin, consist of octamers of histones (H2A, H2B, H3, and H4) that wrap 147 bp of DNA1. Single-allele mutations in histones, termed “oncohistones”, are found in many different malignancies2,3. Oncohistones comprise small percentages of the total histone pool4,5 and rarely cause cancer by themselves2,6. Instead, they synergize with other oncogenes to facilitate the development of neoplastic chromatin landscapes2,6. Recent large-scale cancer genome analyses identified recurrent mutations in histones within the highly conserved histone fold domain (HFD) in many common cancers2,3. These HFD mutations, including the well-characterized H2B-E76K substitution, reduce nucleosome stability in vitro and perturb chromatin in vivo2,7. Similarly, most H2A HFD oncohistone mutations disrupt either contact sites with DNA (R29Q) or inter-nucleosomal interactions (acidic patch) (Fig. 1a). Cells co-expressing H2B-E76K and a PIK3CA oncogene showed increased transformation capacity, consistent with nucleosome instability enhancing the carcinogenic potential of other oncogenes2.
Short histone H2A variants (sH2A) are a class of histone variants expressed during mammalian spermatogenesis8–11. Regulation of sH2A expression in normal testis is unknown12. Unlike other histone variants, sH2As are rapidly evolving and possess highly divergent HFDs, mutated acidic patches, and truncated C-termini, all of which impact nucleosome stability8,13–16. The best characterized of these variants, H2A.B, forms unique nucleosomes that wrap ~120 bp of DNA both in vitro and in vivo15–17. In testis, H2A.B is incorporated into nucleosomes during meiosis and has been shown to interact with splicing factors at actively transcribed genes17–20. Germline disruption of H2A.B-encoding genes in mice revealed that H2A.B loss is associated with chromatin dysfunction and splicing changes in testis18.
Though a role for sH2As in cancer has yet to be determined21, the emerging literature on nucleosome instability as a cancer driver2,3, along with H2A.B’s potent ability to destabilize nucleosomes13–16, prompted us to investigate whether sH2As may contribute to cancer. Previous work showed that expression of H2A.B causes increased sensitivity to DNA damaging agents, shortens S-phase22, and alters splicing17,19,22, each of which is associated with oncogenesis. Additional evidence for a role for H2A.B in cancer comes from Hodgkin’s lymphoma (HL), where H2A.B transcripts have been detected23 and HL cells expressing H2A.B grow faster than H2A.B-negative cells22. Here, through comparative analyses of germline short H2A sequences and oncohistone mutations in canonical H2A, we show that short H2As inherently possess oncohistone features. We explore several cancer data sets and find H2A.B expression in a diverse array of malignancies. We also show that many of these cancers possess unique splicing signatures. We propose that the nucleosome-destabilizing characteristics sH2As evolved for their role in testis result in oncohistone activity in other tissues.
Results
sH2As have evolved oncohistone features
There are five X-linked sH2A genes in humans: H2A.B.1.1 (H2AFB2), H2A.B.1.2 (H2AFB3), H2A.B.2 (H2AFB1), H2A.P (HYPM), and H2A.Q (unannotated)8. We compared the amino-acid sequences of sH2As to canonical H2A to assess whether their rapid evolution resulted in oncohistone-like changes. This analysis revealed that many of the most common cancer-associated mutations in canonical H2A are already present in all wild-type sH2A sequences (Fig. 1a, Supplementary Fig. 1a, b). This includes R29Q/F substitutions that correspond to the second most frequent mutation in canonical H2A (Fig. 1a, Supplementary Fig. 1a, b)2,3. In addition, all wild-type sH2As have a C-terminal truncation that removes E121, the most common mutation in canonical H2A (Fig. 1a, Supplementary Fig. 1a, b)2,3. Phylogenetic analyses in primates showed that despite their rapid evolution, these oncohistone-like changes are highly conserved (Fig. 1b, Supplementary Fig. 1a–c)8. This conservation implies functional consequences as many of these residues are critical contact points for histone-DNA or histone-histone interactions1,13–16. These data show that sH2As contain oncohistone features similar to canonical H2A mutations in cancers.
H2A.Bs are reactivated in a broad array of cancers
The oncohistone properties inherent in sH2As indicate that they may play a role in cancer simply through upregulation. We focused on the expression of H2A.B paralogs, as they are well annotated and have been shown to impact both nucleosome stability and cell cycle progression22. To investigate whether H2A.Bs are reactivated in different cancers, we first used transcriptomic data from The Cancer Genome Atlas (TCGA). This analysis showed that H2A.B paralogs are activated (at a threshold of >1.5 transcripts per million (TPM)) in numerous individual tumors across cancer types (Fig. 2a, Supplementary Data 1, Supplementary Data 2), but never in adjacent normal tissue (Supplementary Data 1), and very rarely (<1.5%) in non-testes tissue samples from the Genotype-Tissue Expression database (Supplementary Table 1). The range of expression varies widely, with H2A.B-encoding transcripts present at >100 TPM in two specimens (Supplementary Data 2). Although many tumors reactivate H2AFB1 alone, most tumors that express H2AFB2 also express H2AFB3 (Fig. 2a). This finding may result from transcriptional co-regulation due to their genomic proximity (Supplementary Fig. 2b) or from an inability to distinguish these near-identical paralogs by short-read mapping8. Despite their similarity, we were able to distinguish these two genes in a few tumor samples (Fig. 2a).
Across the TCGA data set, diffuse large B-cell lymphomas (DLBCLs) showed the highest frequency of aberrant H2A.B expression at 50% (Fig. 2a). A recent analysis of DLBCL genomes identified five distinct molecular subtypes24, including a favorable prognosis-germinal center (FP-GC) subtype associated with histone mutations. We investigated whether H2A.B expression was restricted to the FP-GC subtype. We queried the 37 DLBCL samples for mutations associated with the FP-GC subtype including linker H1 and core histones, immune evasion genes, PI3K, NF-κB, and JAK/STAT/RAS pathway components24. Twenty-five samples had a mutation in at least one of these genes, including 15 different samples with histone mutations (Supplementary Data 3). H2A.B was expressed in 13 samples with any FP-GC mutation, and in 6 of the 10 FP-GC samples without histone mutations (Supplementary Data 3). To contrast this with another DLBCL subtype, we analyzed H2A.B expression in the poorer prognosis-germinal center subtype associated with mutations in chromatin modifiers EZH2, CREBBP, EP300, KMT2D, and BCL11A24. Though we did not identify any EZH2 mutations, 15 samples had mutations in at least one chromatin modifier gene. Nine of these samples also had H2A.B upregulation. These analyses suggest that H2A.B expression occurs in multiple germinal center DLBCL subtypes.
Other cancers in the TCGA data set with aberrant H2A.B expression include uterine corpus endometrial carcinomas (UCEC) (9.5%), urothelial bladder carcinomas (BLCA) (4.7%), and cervical squamous cell carcinomas and endocervical carcinomas (4.5%) (Fig. 2a). These same cancers were previously identified as having the highest frequencies of core histone mutations in the TCGA data set, ranging from 5 to 8%2. We found a few specimens with both recurrent H2A mutations and H2A.B expression (Supplementary Data 4); however, the low number of specimens that share both of these features hinders meaningful correlative analyses.
Upregulation of H2A.B in HL23 and DLBCLs (Fig. 2a) prompted us to analyze data sets from other lymphoid lineage-derived, low mutation cancers for aberrant H2A.B expression. We queried four separate B-acute lymphoblastic leukemia (B-ALL) data sets and found 6–7% of specimens with H2A.B-encoding transcripts at >1.5 TPM (Fig. 2b) in three of the data sets25–27, and 13% in the fourth (Supplementary Fig. 2c)28. Because of the diversity of liquid and solid cancers with H2A.B expression, we searched the Cancer Cell Line Encyclopedia (CCLE) database29 for cell lines with H2A.B expression at >1.5 TPM. Consistent with high-frequency H2A.B expression in TCGA DLBCLs, lymphomas demonstrated the highest percentage of H2A.B-positive cell lines (Fig. 2c), with 70% of HL and 25% of non-Hodgkin’s lymphoma cell lines expressing H2A.B. The spectrum of H2A.B expression across other cancers was also similar between CCLE and TCGA data sets (Fig. 2c). We conclude that H2A.B is aberrantly expressed in a broad array of cancers.
We investigated the potential causes of H2A.B induction in cancer. Although little is known about the transcriptional regulation of H2A.B-encoding loci in testis, changes in X-chromosome ploidy are associated with increased fitness in cancer cells30. We investigated whether H2A.B expression in cancer may result from global derepression of large domains, amplifications, or gain of an additional X chromosome in these samples. We compared levels of X- to autosome-linked transcripts in H2A.B-expressing and silent samples and found no significant differences (Supplementary Fig. 2a). We also investigated the expression profiles of individual H2AFB loci and their surrounding regions and found that upregulation was limited to each individual H2A.B-encoding locus without upregulation of neighboring loci (Supplementary Fig. 2b). These results are consistent with our findings in the TCGA data set, where median H2A.B expression for the 232 H2A.B-positive samples is ~3 TPM (Supplementary Data 2), corresponding to the 49th percentile of all expressed genes. This level of expression is more likely the result of local, specific activation of individual H2AFB paralogues than recurrent amplifications or broader X-chromosome dysfunction.
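One way to make the X-to-autosome comparison concrete is sketched below in Python. The per-sample summary used here (median X-linked TPM divided by median autosomal TPM) is an assumption for illustration, since the text does not state how the comparison was computed. The function name and the gene-to-chromosome table are hypothetical.

```python
from statistics import median


def x_to_autosome_ratio(tpm_by_gene, chrom_by_gene):
    """Ratio of median X-linked expression to median autosomal expression.

    tpm_by_gene:   dict mapping gene name -> TPM in one sample
    chrom_by_gene: dict mapping gene name -> chromosome label ("1".."22", "X", ...)
    The median-of-TPM summary is an illustrative assumption, not the published method.
    """
    x_values = [tpm for gene, tpm in tpm_by_gene.items()
                if chrom_by_gene.get(gene) == "X"]
    # Autosomes are taken to be chromosomes with purely numeric labels.
    autosome_values = [tpm for gene, tpm in tpm_by_gene.items()
                       if chrom_by_gene.get(gene, "").isdigit()]
    if not x_values or not autosome_values:
        raise ValueError("need at least one X-linked and one autosomal gene")
    autosome_median = median(autosome_values)
    if autosome_median == 0:
        raise ValueError("autosomal median is zero; ratio is undefined")
    return median(x_values) / autosome_median
```

Under this sketch, a gain of an additional active X chromosome or a broad X-linked derepression would be expected to raise the ratio in H2A.B-expressing samples relative to silent ones; similar ratios in both groups would argue against those explanations.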
H2A.Bs are associated with cancer-specific, rather than pan-cancer gene expression programs
H2A.B proteins encoded by H2AFB1 and H2AFB2/3 are nearly identical in sequence. Nevertheless, the independent reactivation of H2AFB1 and H2AFB2/3 in different cancer specimens raised the possibility that these closely related paralogues may be associated with distinct global gene expression programs. To explore this, we compared transcriptomes from H2AFB1-reactivated samples versus those from H2AFB2/3-reactivated samples within the same cancer types. We found thousands of genes that were commonly up- or downregulated in UCEC, HNSC, LUSC, and BLCA (Fig. 3a), suggesting that different H2A.B paralogues operate in similar gene expression contexts.
We investigated whether expression of other genes was consistently associated with H2A.B expression. We found 146 genes were upregulated and 90 downregulated across H2A.B-positive cancers (Supplementary Data 5). We did not identify co-upregulation of other testis-specific histone variants such as H2A.1 (TH2A) or H2B.1 (TH2B)20 (Supplementary Data 5) in H2A.B-expressing cancers. Three histone variants with broad tissue distributions, H2A.Z, H2A.X, and H3.3, also did not show consistent differences between H2A.B-positive and negative cancers, except for lower H2A.Z and H2A.X in UCEC (Fig. 3b). We note that median H2A.X levels are similar to maximum values for H2A.B (Fig. 3b, Supplementary Data 1). We also examined expression of the histone chaperone NAP1 (NAP1L1), which can assemble H2A.B-containing nucleosomes15,21. We detected NAP1-encoding transcripts in all TCGA cancers, with DLBCLs expressing the highest levels (Fig. 3b, Supplementary Table 2). The chromatin consequences of this correlation, i.e., whether higher NAP1 levels result in increased incorporation of H2A.B into chromatin, are unknown.
We noted that 12/146 of the commonly upregulated genes are Cancer-Testis Antigens. As H2AFB1 was previously shown to be co-expressed with a subset of CTAs in HL23, we determined whether H2A.B-reactivated cancers are generally associated with CTA upregulation. We summarized the expression of individual CTAs into a composite “CTA score” for each tumor and compared scores between H2A.B-reactivated and silent samples (Fig. 3c). Although H2A.B-expressing HNSCs, LUSCs, and UCECs showed statistically significant CTA enrichment, DLBCLs and SARCs did not (Fig. 3c). We also examined the four B-ALL data sets and found that H2A.B expression was associated with CTA upregulation (Fig. 3c). However, individual CTAs such as NY-ESO-1 (CTAG1B) and CT45A5 were variably expressed across cancers (Supplementary Data 5), consistent with well-recognized transcriptional heterogeneity of this class of genes31,32. These data indicate that H2A.B expression is associated with CTA expression in several cancer types.
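The composite score is not defined in this section, so the following Python sketch shows only one plausible construction: the mean of log2(TPM + 1) across a chosen CTA gene list. The formula, the function name, and the gene list passed in are all assumptions made for illustration, not the scoring used for Fig. 3c.

```python
from math import log2


def cta_score(tpm_by_gene, cta_genes):
    """Mean log2(TPM + 1) over a list of cancer-testis antigen genes.

    tpm_by_gene: dict mapping gene name -> TPM in one tumor
    cta_genes:   list of CTA gene names (hypothetical; supplied by the caller)
    Genes missing from tpm_by_gene are treated as not expressed (0 TPM).
    """
    if not cta_genes:
        raise ValueError("cta_genes must not be empty")
    log_values = [log2(tpm_by_gene.get(gene, 0.0) + 1.0) for gene in cta_genes]
    return sum(log_values) / len(log_values)
```

Scores computed this way for H2A.B-reactivated and silent tumors within one cancer type could then be compared with a rank-based test. Averaging on a log scale keeps one very highly expressed CTA from dominating the score, which matters given the heterogeneity of individual CTAs noted above.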
CTAs are subject to endogenous immunosurveillance mechanisms23 and TCGA tumor samples are known to contain variable amounts of immune infiltrates33. We investigated whether H2A.B expression was associated with immune infiltrates, as this could confound our transcriptome analyses. We found that transcript levels for markers of B-cells, T-cell subsets, NK cells, monocytes, and activated macrophages did not show consistent enrichment across H2A.B-expressing tumors (Supplementary Fig. 3). In fact, UCEC displayed a statistically significant decrease in expression of PRF1 as well as of several macrophage and neutrophil markers. Several sH2A-derived peptides are predicted to bind human leukocyte antigen (HLA) molecules34,35 (Supplementary Data 6), suggesting an immunosuppressive microenvironment may contribute to sustained H2A.B expression in UCEC. The lack of excess immune infiltrates in H2A.B-positive TCGA specimens and the identification of H2A.B-positive cancer cell lines (Fig. 2c) support H2A.B upregulation in cancer cells, though a contribution from surrounding stroma in patient specimens cannot be excluded.
H2A.B-expressing cancers have distinct splicing patterns
H2A.B has been shown to directly bind RNA and to interact with splicing factors, and H2A.B expression impacts alternative splicing patterns17–20,22. To determine if H2A.B expression is associated with splicing dysregulation, we annotated and quantified all constitutive and alternative splicing events in the transcriptomes of H2A.B-reactivated and silent tumors from the TCGA data set. We uncovered thousands of altered splicing events between these cancers (Fig. 4a, b). We found that H2A.B expression is associated with reduced utilization of alternative “cassette exons” (“se”) and proximal alternative 3′ polyadenylation (APA) sites (Supplementary Fig. 4a, b). These features were particularly prominent in BLCA, SARC, and UCEC (Fig. 4a, Supplementary Fig. 4a). Although the changes are individually modest (Supplementary Fig. 4c–f, Supplementary Data 7), they are widespread, i.e., we observe significant changes at thousands of sites across multiple cancer types (Fig. 4a, b). These patterns are not H2A.B paralogue-specific, as similar patterns were observed in specimens expressing either H2AFB1 or H2AFB2/3 (Supplementary Fig. 4a, b).
We also explored splicing in the four B-ALL data sets. Unlike myelodysplastic syndromes and acute myelogenous leukemias36, B-ALLs are not associated with mutations in splicing factors, and global splicing dysregulation is not thought to be a major driver of these leukemias. When we compared splicing patterns in the H2A.B-reactivated and silent samples within each data set, we observed aberrant splicing at a scale similar to that seen in H2A.B-positive TCGA cancers, with reductions in alternative exon and APA usage. However, the most notable feature is a consistent decrease in retained introns (“ri”) in all four data sets (Fig. 4c). We conclude that H2A.B expression is associated with splicing dysfunction, with some features common among many cancers while others occur in a context-specific manner.
Discussion
The discovery of oncohistone mutations has revealed new insights into the biology of cancer. We show that all mammalian genomes already encode sH2A histone variants that have evolved nucleosome-destabilizing features without any additional coding mutations. These features are important for sH2As’ roles in normal testis physiology but result in oncohistone properties when expressed out of context. In this manner, they are similar to CATACOMB/EZHIP37,38, another testis-specific oncohistone mimic that inhibits EZH2 in a subset of rare malignancies. Unlike CATACOMB/EZHIP, however, H2A.B expression occurs in many common cancers. The diversity of H2A.B-expressing cancer types suggests that pathological histone dynamics play a more significant role in neoplasia than previously appreciated.
The precise molecular targets of H2A.B expression in cancers are not known. Relatively few genes are commonly dysregulated across H2A.B-positive malignancies (Supplementary Data 5), implying that H2A.B impacts different genes in different cancers. As nucleosomes protect DNA from inappropriate transcription factor binding, nucleosome instability may allow oncogenic TFs access to different regulatory elements depending on cancer type2,39. Nucleosome destabilization also hastens RNA pol II elongation, which in turn reduces transcription-coupled splicing efficiency40. Alternative exons and proximal polyadenylation sequences are preferentially impacted by inefficient splicing owing to their weaker splice signals, resulting in a splicing phenotype similar to those observed in several H2A.B-positive cancers40. As some alternative exons promote mRNA degradation by targeting them for nonsense-mediated decay, even modest reductions in alternative splicing can increase oncogene expression41. H2A.B may operate at the nexus of several processes that cooperate to drive oncogenesis.
The relationship between histone dynamics, transcription, and splicing may also explain our inability to detect a splicing phenotype in DLBCLs despite high-frequency H2A.B expression. Many chromatin proteins are deranged in DLBCLs including Myc, p300, H1 linker, and core histones, each of which can also impact alternative splicing24,42–44. Whether potential similarities between histone mutant cancers and H2A.B-expressing cancers extend to prognoses and vulnerabilities merits further investigation, particularly in the context of DLBCL where larger data sets are needed to dissect these relationships. Several cell lines show sensitivity to H2AFB1-gRNAs in the Sanger Cancer Dependency Map, with lymphoma-derived cell lines SU-DHL-8 and IM9 being among the most sensitive to H2AFB1 disruption45. Better characterization of histone mutations and H2A.B expression across cancer cell lines is also needed in order to probe for similarities between H2A.B-expressing cancers and histone mutant cancers. Finally, sH2A-derived short peptides that bind HLA molecules (Supplementary Data 6) may be useful immunotherapy targets, and global splicing dysregulation can also generate highly immunogenic neoantigens46. Thus, our discovery of sH2A-expressing cancers may open new avenues of study and treatment for hundreds of thousands of cancer cases worldwide.
Methods
Alignments of sH2A sequences
sH2A and other H2A sequences were retrieved from Histone DB v247 and previously published work8. Predicted protein sequences from annotated CDS were aligned using ClustalW and manually curated. For primate alignments, sequences were arranged according to the accepted species phylogeny.
Genome annotation, RNA-seq read mapping, and gene and isoform expression estimation
RNA-seq reads from TCGA were downloaded from CGHub. Reads were processed for gene expression and splice isoform ratio quantification as previously described48. In brief, read alignment and expression estimation were performed with RSEM v1.2.44349, Bowtie v1.0.04450, and TopHat v2.1.14551, using the hg19/GRCh37 assembly of the human genome with a gene annotation that merges the UCSC knownGene gene annotation52, Ensembl v71.1 gene annotation53, and MISO v2.0 isoform annotation48. MISO v2.038 was used to quantify isoform ratios. The trimmed mean of M values method54, as applied to coding genes, was used to normalize gene expression estimates across all of TCGA.
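To illustrate what an isoform ratio measures, the Python sketch below uses a plain read-count ratio for a single splicing event and then summarizes the shift between two sample groups. This is a deliberately simplified stand-in: MISO estimates isoform ratios with a probabilistic model rather than a raw count ratio, and the function names and inputs here are hypothetical.

```python
from statistics import median


def isoform_ratio(inclusion_reads, exclusion_reads):
    """Fraction of informative reads that support the inclusion isoform.

    Returns a value between 0 and 1, or None when the event has no
    informative reads. A simplified count ratio, not the MISO estimator.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


def median_ratio_shift(ratios_positive, ratios_negative):
    """Median isoform ratio in H2A.B-positive samples minus H2A.B-negative samples.

    Samples without a usable ratio (None) are skipped.
    """
    positive = [r for r in ratios_positive if r is not None]
    negative = [r for r in ratios_negative if r is not None]
    if not positive or not negative:
        raise ValueError("each group needs at least one usable ratio")
    return median(positive) - median(negative)
```

A negative shift for a cassette exon would correspond to reduced exon usage in the H2A.B-positive group, which is the direction of change described in the Results.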
Data analysis and visualization
Data analysis was performed in the R programming environment and relied on Bioconductor55, dplyr56, and ggplot257.
RNA-seq coverage plots
RNA-seq coverage plots (i.e., Fig. S2c–f) were made using the ggplot2 package in R, and represent reads normalized by the number of reads mapping to all coding genes in each sample (per million).
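The normalization described above amounts to a per-sample scaling factor, sketched here in Python. The function name and argument layout are hypothetical; only the scaling rule (reads per million reads mapped to coding genes) comes from the text.

```python
def normalize_coverage(per_position_reads, coding_gene_reads):
    """Scale raw coverage to reads per million coding-gene reads.

    per_position_reads: list of raw read counts along the plotted region
    coding_gene_reads:  total reads mapping to all coding genes in the sample
    """
    if coding_gene_reads <= 0:
        raise ValueError("coding_gene_reads must be positive")
    scale = 1_000_000 / coding_gene_reads
    return [count * scale for count in per_position_reads]
```

Scaling by coding-gene reads rather than by all sequenced reads makes coverage tracks comparable between samples that differ in depth or in the share of reads lost to non-coding or unmapped sequence.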
Somatic mutation analysis
TCGA somatic mutation calls from the Mutect pipeline58 were obtained using the GDCquery_Maf function from TCGAbiolinks59. Mutations in the following canonical histones were collated by their class: H2A: HIST1H2A(A/B/C/D/E/G/H/I/J/K/L/M), HIST2H2A(B/C), HIST3H2A; H2B: HIST1H2B(A/B/C/D/E/F/G/H/I/J/K/L/M/N/O), HIST2H2B(E/F), HIST3H2BB. Recurrent mutations (Supplementary Data 3) are defined as occurring at least five times across all cancer types in TCGA (e.g., 10 instances of E121Q mutations are found in various H2As across all TCGA samples).
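The recurrence rule can be sketched in Python as a pooled count per histone class. The regular expression, function names, and input layout are assumptions for illustration; only the class pooling and the threshold of five occurrences come from the text.

```python
import re
from collections import Counter

MIN_RECURRENCE = 5  # from the text: at least five occurrences across all cancer types

# Hypothetical helper pattern: matches gene symbols such as HIST1H2AB or HIST3H2BB.
_HISTONE_GENE = re.compile(r"^HIST\dH2([AB])")


def histone_class(gene_symbol):
    """Return 'H2A' or 'H2B' for a canonical histone gene symbol, else None."""
    match = _HISTONE_GENE.match(gene_symbol)
    return "H2" + match.group(1) if match else None


def recurrent_mutations(calls, min_count=MIN_RECURRENCE):
    """Count mutations pooled by histone class and keep the recurrent ones.

    calls: iterable of (gene_symbol, protein_change) pairs, one per mutation call
    Returns a dict mapping (histone_class, protein_change) -> count.
    """
    counts = Counter()
    for gene_symbol, protein_change in calls:
        cls = histone_class(gene_symbol)
        if cls is not None:  # ignore genes outside the H2A/H2B lists
            counts[(cls, protein_change)] += 1
    return {key: n for key, n in counts.items() if n >= min_count}
```

Pooling by class rather than by gene reflects the example in the text, where E121Q instances found in various H2A genes are counted together.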
Differential gene expression and splice event analyses
For the purposes of differential analyses, a threshold of >1.5 TPM was used to determine whether H2A.B was expressed in a sample, whereas a threshold of <0.5 TPM was used to determine whether H2A.B was not expressed; samples with an intermediate expression of H2A.B were not used in differential analyses. Statistical significance in differential expression or splicing in H2A.B-positive versus H2A.B-negative cancer samples was determined with a Mann–Whitney U test, as implemented in wilcox.test in R.
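The grouping rule and the test can be sketched in Python as follows. The analysis itself used wilcox.test in R; the function below is a from-scratch stand-in that uses the normal approximation with a tie correction and no continuity correction, so its p-values can differ slightly from R's, especially for small samples where R computes an exact p-value. Function names are hypothetical.

```python
from math import erfc, sqrt

POSITIVE_TPM = 1.5  # from the text: > 1.5 TPM counts as H2A.B expressed
NEGATIVE_TPM = 0.5  # from the text: < 0.5 TPM counts as H2A.B not expressed


def classify_sample(h2ab_tpm):
    """Return 'positive', 'negative', or None for intermediate expression."""
    if h2ab_tpm > POSITIVE_TPM:
        return "positive"
    if h2ab_tpm < NEGATIVE_TPM:
        return "negative"
    return None  # intermediate samples are left out of differential analyses


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    pooled = sorted(list(x) + list(y))
    rank_of = {}
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1..j and share their average (midrank).
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        ties = j - i
        tie_term += ties ** 3 - ties
        i = j
    u1 = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2.0) / sqrt(variance)
    return u1, erfc(abs(z) / sqrt(2.0))
```

In use, each tumor is first assigned a group from its H2A.B TPM, and the expression value (or isoform ratio) of a gene of interest is then compared between the two groups. Leaving a gap between the two thresholds keeps borderline samples from blurring the contrast.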
Prediction of H2A variant candidate T-cell epitopes
The amino acid sequences of human H2A.B.1.1 (H2AFB2), H2A.B.1.2 (H2AFB3), H2A.B.2 (H2AFB1), H2A.Q, and H2A.P (HYPM) were examined for short peptides with the potential to bind to common HLA alleles34,35. Specifically, the NetPanMHCBA4.0 algorithm of the Immune Epitope Database and Analysis Resource (IEDB) was used to identify peptides 8, 9, 10, or 11 amino acids long that are predicted to bind with strong affinity (IC50 < 300 nM) to HLA-A*0101, A*0201, A*0301, A*1101, A*2402, B*0702, B*0801, B*1501, B*1502, B*4001, B*4002, B*4402 or B*4403. Additional IEDB algorithms were employed to confirm predicted HLA binding: for inclusion in Supplementary Table 5, binding had to be predicted by NetPanMHCBA4.0 and by at least one other method (artificial neural network, stabilized matrix method, or PickPocket).
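The selection logic can be sketched in Python in two steps: enumerate candidate peptides, then keep peptide and allele pairs supported by the primary predictor plus at least one other method. No IEDB call is made here; predictions are passed in as plain tuples, the method labels are placeholders, and applying the same 300 nM cutoff to the confirming methods is an assumption, since the text states the cutoff only for the primary algorithm.

```python
PEPTIDE_LENGTHS = (8, 9, 10, 11)  # from the text
IC50_CUTOFF_NM = 300.0            # from the text: strong affinity is IC50 < 300 nM


def enumerate_peptides(protein_sequence, lengths=PEPTIDE_LENGTHS):
    """Return the set of all contiguous peptides of the requested lengths."""
    peptides = set()
    for length in lengths:
        for start in range(len(protein_sequence) - length + 1):
            peptides.add(protein_sequence[start:start + length])
    return peptides


def select_candidates(predictions, primary_method, cutoff_nm=IC50_CUTOFF_NM):
    """Keep (peptide, allele) pairs confirmed by the primary and one other method.

    predictions: iterable of (peptide, allele, method, ic50_nm) tuples
    primary_method: label of the primary predictor in the tuples (placeholder)
    Using cutoff_nm for the confirming methods as well is an assumption.
    """
    supporting_methods = {}
    for peptide, allele, method, ic50_nm in predictions:
        if ic50_nm < cutoff_nm:
            supporting_methods.setdefault((peptide, allele), set()).add(method)
    return sorted(
        pair for pair, methods in supporting_methods.items()
        if primary_method in methods and len(methods) >= 2
    )
```

Requiring agreement between two predictors trades sensitivity for specificity, which suits a short list of candidate epitopes intended for follow-up.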
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Acknowledgements
We would like to thank R. Eisenman and members of the Malik, Henikoff, and Bradley labs for helpful discussions and comments on this manuscript. J.S. is supported by a Damon Runyon Cancer Research Foundation/Sohn Foundation Pediatric Cancer Research Fellowship, an Alex’s Lemonade Stand Foundation (ALSF) Young Investigator Award and a Northwestern Mutual/ALSF Award for Data Sharing. A.M. is supported by the Damon Runyon Cancer Research Foundation (DRG:2192-14) and by the NIH (R01 GM074108). G.-L.C. is a Mahan Fellow. M.B. is supported by a Stand Up To Cancer Innovative Research Grant, Grant Number SU2C-AACR-IRG 14-17. Stand Up To Cancer is a division of the Entertainment Industry Foundation. R.K.B. is a Scholar of The Leukemia and Lymphoma Society (1344-18). H.S.M. and S.H. are investigators of the Howard Hughes Medical Institute. The results shown here are in part based upon data generated by the TCGA Research Network: https://cancergenome.nih.gov/.
Author contributions
G.C., A.M., and J.S. designed the study and conceived the analyses. G.C., A.M., and M.B. performed the analyses. R.B., H.M., and S.H. provided guidance. A.M. and J.S. provided project leadership. G.C., A.M., and J.S. wrote the paper.
Data availability
All data sets used in this study are publicly available or previously published. RNA-seq reads from TCGA were downloaded from CGHub, but are now available from the NIH NCI Genomic Data Commons [https://portal.gdc.cancer.gov/]. RNA-seq reads from B-ALL samples were obtained from the Japanese Genotype–Phenotype Archive (accession number JGAS000047; ref. 27). This data set is available under restricted access, and step-by-step instructions for obtaining access, including Form 2 submission, are available at [https://humandbs.biosciencedbc.jp/en/data-use]. For questions regarding this data set, please contact Dr. Hiroyuki Mano at hmano@m.u-tokyo.ac.jp. RNA-seq reads from B-ALL samples were also obtained from the European Genome-phenome Archive (accession numbers EGAD00001002112 (ref. 28) and EGAD00001002151 (ref. 26)) and the Chinese Genotype–phenotype Archive (data are available under restricted access; access can be obtained by contacting Dr. Sai-Juan Chen: sjchen@stn.sh.cn)25. RNA-seq quantifications of CCLE cell lines were obtained from the Broad Institute CCLE Portal [https://portals.broadinstitute.org/ccle/data] (02-Jan-2019 release). TCGA mutation data were obtained through the GenomicDataCommons Bioconductor package. The remaining data are available within the Article, Supplementary Information or from the authors upon request.
Competing interests
The authors declare no competing interests.
Footnotes
Peer review information Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Antoine Molaro, Email: antoine.molaro@uca.fr.
Jay Sarthy, Email: jsarthy@fredhutch.org.
Supplementary information
Supplementary information is available for this paper at 10.1038/s41467-020-20707-x.
References
- 1.Luger K, Mader AW, Richmond RK, Sargent DF, Richmond TJ. Crystal structure of the nucleosome core particle at 2.8 A resolution. Nature. 1997;389:251–260. doi: 10.1038/38444. [DOI] [PubMed] [Google Scholar]
- 2.Bennett, R. L. et al. A mutation in histone H2B represents a new class of oncogenic driver. Cancer Discov.9, 1438–1451 (2019). [DOI] [PMC free article] [PubMed]
- 3.Nacev BA, et al. The expanding landscape of ‘oncohistone’ mutations in human cancers. Nature. 2019;567:473–478. doi: 10.1038/s41586-019-1038-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Lehnertz B, et al. H3(K27M/I) mutations promote context-dependent transformation in acute myeloid leukemia with RUNX1 alterations. Blood. 2017;130:2204–2214. doi: 10.1182/blood-2017-03-774653. [DOI] [PubMed] [Google Scholar]
- 5.Lewis PW, et al. Inhibition of PRC2 activity by a gain-of-function H3 mutation found in pediatric glioblastoma. Science. 2013;340:857–861. doi: 10.1126/science.1232245. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Funato K, Tabar V. Histone mutations in cancer. Annu. Rev. Cancer Biol. 2018;2:337–351. doi: 10.1146/annurev-cancerbio-030617-050143. [DOI] [Google Scholar]
- 7.Arimura Y, et al. Cancer-associated mutations of histones H2B, H3.1 and H2A.Z.1 affect the structure and stability of the nucleosome. Nucleic Acids Res. 2018;46:10007–10018. doi: 10.1093/nar/gky661. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Molaro A, Young JM, Malik HS. Evolutionary origins and diversification of testis-specific short histone H2A variants in mammals. Genome Res. 2018;28:460–473. doi: 10.1101/gr.229799.117. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Shaytan AK, Landsman D, Panchenko AR. Nucleosome adaptability conferred by sequence and structural variations in histone H2A-H2B dimers. Curr. Opin. Struct. Biol. 2015;32:48–57. doi: 10.1016/j.sbi.2015.02.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Govin J, et al. Pericentric heterochromatin reprogramming by new histone variants during mouse spermiogenesis. J. Cell Biol. 2007;176:283–294. doi: 10.1083/jcb.200604141. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Soboleva TA, et al. A unique H2A histone variant occupies the transcriptional start site of active genes. Nat. Struct. Mol. Biol. 2011;19:25–30. doi: 10.1038/nsmb.2161. [DOI] [PubMed] [Google Scholar]
- 12.Jiang, X., Soboleva, T. A. & Tremethick, D. J. Short histone H2A variants: small in stature but not in function. Cells9, 867 (2020). [DOI] [PMC free article] [PubMed]
- 13.Kohestani, H. & Wereszczynski, J. Effects of H2A.B incorporation on nucleosome structures and dynamics. bioRxiv, 10.1101/2020.06.25.172130 (2020). [DOI] [PMC free article] [PubMed]
- 14.Peng J, Yuan C, Hua X, Zhang Z. Molecular mechanism of histone variant H2A.B on stability and assembly of nucleosome and chromatin structures. Epigenetics Chromatin. 2020;13:28. doi: 10.1186/s13072-020-00351-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Bao Y, et al. Nucleosomes containing the histone variant H2A.Bbd organize only 118 base pairs of DNA. EMBO J. 2004;23:3314–3324. doi: 10.1038/sj.emboj.7600316.
- 16.Doyen CM, et al. Dissection of the unusual structural and functional properties of the variant H2A.Bbd nucleosome. EMBO J. 2006;25:4234–4244. doi: 10.1038/sj.emboj.7601310.
- 17.Tolstorukov MY, et al. Histone variant H2A.Bbd is associated with active transcription and mRNA processing in human cells. Mol. Cell. 2012;47:596–607. doi: 10.1016/j.molcel.2012.06.011.
- 18.Anuar ND, et al. Gene editing of the multi-copy H2A.B gene and its importance for fertility. Genome Biol. 2019;20:23. doi: 10.1186/s13059-019-1633-3.
- 19.Soboleva TA, et al. A new link between transcriptional initiation and pre-mRNA splicing: the RNA binding histone variant H2A.B. PLoS Genet. 2017;13:e1006633. doi: 10.1371/journal.pgen.1006633.
- 20.Hoghoughi N, Barral S, Vargas A, Rousseaux S, Khochbin S. Histone variants: essential actors in male genome programming. J. Biochem. 2018;163:97–103. doi: 10.1093/jb/mvx079.
- 21.Martire S, Nguyen J, Sundaresan A, Banaszynski LA. Differential contribution of p300 and CBP to regulatory element acetylation in mESCs. BMC Mol. Cell Biol. 2020;21:55. doi: 10.1186/s12860-020-00296-9.
- 22.Sansoni V, et al. The histone variant H2A.Bbd is enriched at sites of DNA synthesis. Nucleic Acids Res. 2014;42:6405–6420. doi: 10.1093/nar/gku303.
- 23.Winkler C, et al. Hodgkin’s lymphoma RNA-transfected dendritic cells induce cancer/testis antigen-specific immune responses. Cancer Immunol. Immunother. 2012;61:1769–1779. doi: 10.1007/s00262-012-1239-z.
- 24.Chapuy B, et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat. Med. 2018;24:679–690. doi: 10.1038/s41591-018-0016-8.
- 25.Liu YF, et al. Genomic profiling of adult and pediatric B-cell acute lymphoblastic leukemia. EBioMedicine. 2016;8:173–183. doi: 10.1016/j.ebiom.2016.04.038.
- 26.Qian M, et al. Whole-transcriptome sequencing identifies a distinct subtype of acute lymphoblastic leukemia with predominant genomic abnormalities of EP300 and CREBBP. Genome Res. 2017;27:185–195. doi: 10.1101/gr.209163.116.
- 27.Yasuda T, et al. Recurrent DUX4 fusions in B cell acute lymphoblastic leukemia of adolescents and young adults. Nat. Genet. 2016;48:569–574. doi: 10.1038/ng.3535.
- 28.Lilljebjorn H, et al. Identification of ETV6-RUNX1-like and DUX4-rearranged subtypes in paediatric B-cell precursor acute lymphoblastic leukaemia. Nat. Commun. 2016;7:11790. doi: 10.1038/ncomms11790.
- 29.Barretina J, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003.
- 30.Xu, J. et al. Free-living human cells reconfigure their chromosomes in the evolution back to uni-cellularity. eLife 6, e2807 (2017).
- 31.Fratta E, et al. The biology of cancer testis antigens: putative function, regulation and therapeutic potential. Mol. Oncol. 2011;5:164–182. doi: 10.1016/j.molonc.2011.02.001.
- 32.Whitehurst AW. Cause and consequence of cancer/testis antigen activation in cancer. Annu. Rev. Pharmacol. Toxicol. 2014;54:251–272. doi: 10.1146/annurev-pharmtox-011112-140326.
- 33.Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).
- 34.Lundegaard C, et al. NetMHC-3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8-11. Nucleic Acids Res. 2008;36:W509–W512. doi: 10.1093/nar/gkn202.
- 35.Nielsen M, et al. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Sci. 2003;12:1007–1017. doi: 10.1110/ps.0239403.
- 36.Dvinge H, Kim E, Abdel-Wahab O, Bradley RK. RNA splicing factors as oncoproteins and tumour suppressors. Nat. Rev. Cancer. 2016;16:413–430. doi: 10.1038/nrc.2016.51.
- 37.Piunti A, et al. CATACOMB: An endogenous inducible gene that antagonizes H3K27 methylation activity of Polycomb repressive complex 2 via an H3K27M-like mechanism. Sci. Adv. 2019;5:eaax2887. doi: 10.1126/sciadv.aax2887.
- 38.Jain SU, et al. PFA ependymoma-associated protein EZHIP inhibits PRC2 activity through a H3 K27M-like mechanism. Nat. Commun. 2019;10:2146. doi: 10.1038/s41467-019-09981-6.
- 39.Sarthy JF, Henikoff S, Ahmad K. Chromatin bottlenecks in cancer. Trends Cancer. 2019;5:183–194. doi: 10.1016/j.trecan.2019.01.003.
- 40.Jimeno-Gonzalez S, et al. Defective histone supply causes changes in RNA polymerase II elongation rate and cotranscriptional pre-mRNA splicing. Proc. Natl Acad. Sci. USA. 2015;112:14840–14845. doi: 10.1073/pnas.1506760112.
- 41.Thomas JD, et al. RNA isoform screens uncover the essentiality and tumor-suppressor activity of ultraconserved poison exons. Nat. Genet. 2020;52:84–94. doi: 10.1038/s41588-019-0555-z.
- 42.Koh CM, et al. MYC regulates the core pre-mRNA splicing machinery as an essential step in lymphomagenesis. Nature. 2015;523:96–100. doi: 10.1038/nature14351.
- 43.Siam A, et al. Regulation of alternative splicing by p300-mediated acetylation of splicing factors. RNA. 2019;25:813–824. doi: 10.1261/rna.069856.118.
- 44.Glaich O, Leader Y, Lev Maor G, Ast G. Histone H1.5 binds over splice sites in chromatin and regulates alternative splicing. Nucleic Acids Res. 2019;47:6145–6159. doi: 10.1093/nar/gkz338.
- 45.Behan FM, et al. Prioritization of cancer therapeutic targets using CRISPR-Cas9 screens. Nature. 2019;568:511–516. doi: 10.1038/s41586-019-1103-9.
- 46.Shen L, Zhang J, Lee H, Batista MT, Johnston SA. RNA transcription and splicing errors as a source of cancer frameshift neoantigens for vaccines. Sci. Rep. 2019;9:14184. doi: 10.1038/s41598-019-50738-4.
- 47.Draizen, E. J. et al. HistoneDB 2.0: a histone database with variants–an integrated resource to explore histones and their variants. Database (Oxford) 2016, baw014 (2016).
- 48.Katz Y, Wang ET, Airoldi EM, Burge CB. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods. 2010;7:1009–1015. doi: 10.1038/nmeth.1528.
- 49.Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011;12:323. doi: 10.1186/1471-2105-12-323.
- 50.Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- 51.Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–1111. doi: 10.1093/bioinformatics/btp120.
- 52.Meyer LR, et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 2013;41:D64–D69. doi: 10.1093/nar/gks1048.
- 53.Flicek P, et al. Ensembl 2013. Nucleic Acids Res. 2013;41:D48–D55. doi: 10.1093/nar/gks1236.
- 54.Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11:R25. doi: 10.1186/gb-2010-11-3-r25.
- 55.Huber W, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods. 2015;12:115–121. doi: 10.1038/nmeth.3252.
- 56.Wickham, H., Francois, R., Henry, L., Muller, K. dplyr: A Grammar of Data Manipulation. R package version 0.7.6. https://CRAN.R-project.org/package=dplyr (2018).
- 57.Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, New York, NY, 2016).
- 58.Cibulskis K, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 2013;31:213–219. doi: 10.1038/nbt.2514.
- 59.Colaprico A, et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016;44:e71. doi: 10.1093/nar/gkv1507.
Data Availability Statement
All data sets used in this study are publicly available or previously published. RNA-seq reads from TCGA were downloaded from CGHub, but are now available from the NIH NCI Genomic Data Commons [https://portal.gdc.cancer.gov/]. RNA-seq reads from B-ALL samples were obtained from the Japanese Genotype–Phenotype Archive (accession number JGAS000047; ref. 27). This data set is available under restricted access, and step-by-step instructions for obtaining access, including Form 2 submission, are available at [https://humandbs.biosciencedbc.jp/en/data-use]. For questions regarding this data set, please contact Dr. Hiroyuki Mano at hmano@m.u-tokyo.ac.jp. RNA-seq reads from B-ALL samples were also obtained from the European Genome-phenome Archive (accession numbers EGAD00001002112, ref. 28, and EGAD00001002151, ref. 26) and from the Chinese Genotype–phenotype Archive (ref. 25; data are available under restricted access, which can be obtained by contacting Dr. Sai-Juan Chen: sjchen@stn.sh.cn). RNA-seq quantifications of CCLE cell lines were obtained from the Broad Institute CCLE Portal [https://portals.broadinstitute.org/ccle/data] (02-Jan-2019 release). TCGA mutation data were obtained through the GenomicDataCommons Bioconductor package. The remaining data are available within the Article, Supplementary Information or from the authors upon request.