ABSTRACT
Centromeres are constricted chromosomal regions that are essential for cell division. In eukaryotes, centromeres display a remarkable architectural and genetic diversity. The basis of accelerated centromere evolution remains elusive. Here, we focused on Pneumocystis species, a group of mammalian-specific fungal pathogens that form a sister taxon to Schizosaccharomyces pombe, an important genetic model for centromere biology research. Methods allowing reliable continuous culture of Pneumocystis species do not currently exist, precluding genetic manipulation. CENP-A, a variant of histone H3, is the epigenetic marker that defines centromeres in most eukaryotes. Using heterologous complementation, we show that the Pneumocystis CENP-A ortholog is functionally equivalent to CENP-ACnp1 of S. pombe. Using organisms from a short-term in vitro culture or infected animal models and chromatin immunoprecipitation (ChIP)-Seq, we identified CENP-A bound regions in two Pneumocystis species that diverged ~35 million years ago. Each species has a unique short regional centromere (<10 kb) flanked by heterochromatin in 16–17 monocentric chromosomes. The centromeres span active genes and lack conserved DNA sequence motifs and repeats. These features suggest an epigenetic specification of centromere function. Analysis of centromeric DNA across multiple Pneumocystis species suggests vertical transmission dating back at least 100 million years. The common ancestry of Pneumocystis and S. pombe centromeres is untraceable at the DNA level, but the overall architectural similarity could be the result of functional constraint for successful chromosomal segregation.
IMPORTANCE
Pneumocystis species offer a suitable genetic system to study centromere evolution in pathogens because of their phylogenetic proximity to the non-pathogenic yeast S. pombe, a popular model for cell biology. We used this system to explore how centromeres have evolved since the divergence of the two clades ~460 million years ago. To address this question, we established a protocol combining short-term culture and ChIP-Seq to characterize centromeres in multiple Pneumocystis species. We show that Pneumocystis have short epigenetic centromeres that function differently from those in S. pombe.
KEYWORDS: evolution, chromosome segregation, genome organization, genetics, opportunistic fungi
INTRODUCTION
Centromeres (CENs) are genomic locations where spindles attach during chromosomal segregation. They are essential for cell division. Errors in the chromosomal segregation can result in aneuploidy and cell division defects with disastrous consequences such as cancer. In fungi, centromeres are fragile sites that are often involved in karyotype variability (1, 2) and drug resistance (3). In most eukaryotes, centromeres are defined by the presence of CENP-A nucleosomes, a centromere-specific histone H3 variant that is essential for the localization of known kinetochore components (4). There is a remarkable diversity in centromere structures ranging from sequence-dependent (point) centromeres to epigenetically regulated (regional) centromeres.
Point centromeres are exemplified by Saccharomyces cerevisiae with genetically defined 125-bp long centromeres containing conserved DNA elements (I, II, and III) that serve as anchors for the recruitment of the centromeric DNA binding factor 3 (CBF3) complex (5). Regional centromeres are much longer, lack universally conserved DNA patterns, and are often made up of repetitive DNA (6). Regional centromeres are not strictly defined by DNA but are rather epigenetically regulated; however, which factors contribute to this epigenetic specification of centromeres remains elusive. In fungi, regional centromeres are classified as short [<20 kb, e.g., Candida albicans (7)], intermediate [40–110 kb, e.g., Schizosaccharomyces pombe (8)], and long [150–300 kb, e.g., Cryptococcus (9)].
Pneumocystis is a genus of pathogenic fungi that exclusively infect mammals and remain unculturable. They belong to the Taphrinomycotina subphylum and form a monophyletic clade with fission yeasts Schizosaccharomyces, with a separation time estimated at ~460 million years ago (10, 11). In addition to these two classes, there are other classes, all of which are plant or soil-adapted organisms. Compared with S. pombe, which is a tractable organism that is widely used as a model for chromosome and cell biology, Pneumocystis species have streamlined genomes due to substantial gene losses during their transition to animal parasitism. Recently, the genomes of multiple Pneumocystis species have been sequenced (11–14). However, centromeres were not defined in these because they could not be predicted bioinformatically without reference for synteny-based detection. Alternative methods such as circular chromosome conformation capture assay (4C) that have been used to predict centromere loci in fungal genomes (15) are not applicable to Pneumocystis because they require large quantities of pure DNA from synchronized cell cultures. No long-term culture system exists, and during purification of Pneumocystis organisms from infected lungs, it is virtually impossible to completely eliminate host cell DNA. Pneumocystis have retained CENP-A (11, 14), which presumably binds to centromeres, and most of the kinetochore proteins. Some data suggest that Pneumocystis centromeres differ structurally from those of S. pombe. For example, the heterochromatin protein Swi6/HP1 (16), which is required for the activation of replication origins at the centromere flanking regions (pericentromeres), has not been found in Pneumocystis genomes (17). The RNA interference (RNAi) pathway, which is essential to centromere function, has been lost in Pneumocystis (18), further supporting mechanistic difference with S. pombe.
In this work, we set out to characterize Pneumocystis centromeres and compare them to Schizosaccharomyces pombe centromeres. We found that Pneumocystis have small regional centromeres that are defined by CENP-A and flanked by heterochromatin, resembling those of S. pombe. However, Pneumocystis species lack orthologs for genes required for centromere function and maintenance in S. pombe, suggesting that their centromeres may function differently.
RESULTS
Survey of genes related to centromere function in Taphrinomycotina fungi
To understand the kinetochore evolution in Taphrinomycotina fungi, we screened the predicted proteomes of 14 species spanning six orders, Pneumocystis (n = 7), Schizosaccharomyces (n = 3), Taphrina (n = 1), Saitoella (n = 1), Protomyces (n = 1), and Neolecta (n = 1), as well as other fungi to have a broad overview of kinetochore evolution.
Using a combination of protein domain hidden Markov models and BLAST scans followed by phylogenetic assignment (see Materials and Methods), we searched for homologs of CENP-ACnp1, CENP-CCnp3, CCAN (Constitutive Centromere Associated Network), and KMN (KNL-1/Mis12 complex/Ndc80 complex) proteins. We extended our searches to pathways required for centromere function and chromosomal segregation (heterochromatin, RNAi, and DNA methylation) (Fig. S1).
CENP-A, CENP-C, and outer kinetochore proteins are conserved across Taphrinomycotina. The CENP-A histones from Pneumocystis species, henceforth referred to as pnCENP-A, display high overall conservation (89% average protein identity). Two genes of the inner kinetochore (CCAN) that are not found in Pneumocystis, fta2 and fta3, are innovations in S. pombe. The fta2 and fta3 gene products associate with the central core (cnt) and innermost repeat (imr) regions of the centromere in S. pombe (19). No ortholog of the mis17 gene was found in Pneumocystis. In S. pombe, mis17 encodes a member of the Mis6-Mal2-Sim4 multiprotein complex required for CENP-A recruitment (20). The S. pombe centromere-associated protein B genes (cbp1, cbh1, and cbh2), which are derived from the domestication of pogo-like transposases (21), have no identified homologs in other fungi.
The chromosomal passenger complex is a heterotetrameric complex composed of the Aurora B kinase Ark1 and three regulatory components, Pic1 (inner centromere protein), Bir1p (Survivin), and Nbl1 (Borealin), which mediates chromosome segregation and cytokinesis (22). This pathway is conserved in Pneumocystis and S. pombe.
The CENP-A recruiting complex, which includes Mis18 and Mis16 (23), and the CENP-A histone chaperone Scm3 required for CENP-A loading at the centromere (24), are conserved in Pneumocystis, as are the monopolin complex genes that coordinate the kinetochore microtubule attachment during mitosis (25).
The network for heterochromatin formation in Pneumocystis seems intact with the presence of the Clr4 methyltransferase complex (Clr4/SUV39H). Pneumocystis species have a single chromo domain containing protein that bears similarity with S. pombe Swi6 and Chp2 which are HP1 inparalogs (Fig. S2).
DNA methyltransferases (DMTs) are divided into five classes (DNMT1, DNMT2, DNMT3/DRM, Masc1/RID, and DNMT5)(26). Canonical DMTs are not detected in any of the seven sequenced Pneumocystis species (Table S1 at https://doi.org/10.5281/zenodo.10574230), which is consistent with previous studies (27, 28).
Overall, the Pneumocystis kinetochore (CCAN) protein catalog is relatively conserved and similar to that of other Taphrinomycotina fungi.
pnCENP-ACnp1 localizes to the nucleus
Centromeres are identified by tracking CENP-A binding regions. Precise nuclear localization of CENP-A is required for accurate chromosomal segregation. To determine if pnCENP-A localizes to the nucleus, we used an anti-Pneumocystis CENP-A antibody, in conjunction with an anti-Pneumocystis antibody (29) and DAPI. Using indirect confocal immunofluorescence microscopy, we found that pnCENP-A displays a nuclear peripheral localization in replicating organisms from infected lung tissues and cultures (Fig. 1A and B).
Fig 1.
Nuclear peripheral localization of CENP-A in Pneumocystis cells. (A) pnCENP-A (red) localizes at the nuclear periphery (blue) inside Pneumocystis carinii organisms (green). A cluster of organisms from an infected rat lung tissue section is presented and labeled with an anti-CENP-A antibody (red), an anti-Pneumocystis carinii antibody 7C4 (green), and 4′,6-diamidino-2-phenylindole (DAPI, blue). The image is a maximum projection of a 5-µm thick z-stack. Bottom panels 1–8 are magnified fields. Most anti-pnCENP-A labels overlap or are close in proximity with DAPI (represented by a pink color produced by the overlap between red and blue colors). Some organisms are not labeled by 7C4 antibody (e.g., box 5), which suggests that the epitope is not expressed. This field does not contain any host cells (rat). Scale bar = 2 µm. (B) In plane and orthogonal views of immunofluorescence labeling of P. carinii organisms co-cultured with mammalian cells for 7 days. Organisms were triple labeled using an anti-CENP-A antibody (red), a different anti-Pneumocystis antibody (RAE7) targeting Pneumocystis major surface glycoproteins (green), and DAPI (blue). Panels 1–3 are magnified fields. CENP-A is located inside Pneumocystis cells at the periphery of the nucleus. Few organisms are only labeled with the DAPI and do not express detectable levels of CENP-A and major surface glycoprotein. A rat cell nucleus is labeled with the letter “H.” Scale bar = 1 µm.
pnCENP-A is functional in S. pombe and supports viability and centromere loading
To confirm that the putative cenp-a gene from P. murina (Pmcenp-a) encodes a functional CENP-A protein, we expressed Pmcenp-a in S. pombe. P. carinii CENP-A has only one amino acid difference with P. murina CENP-A (Fig. 2A). Pmcenp-a was codon-optimized, expressed under the endogenous S. pombe cnp1+, the CENP-A ortholog, gene promoter, and subcloned into a standard LEU2-based multicopy plasmid. In addition, the gene product was tagged with GFP at its N-terminus since it was shown that C-terminal tagging (i.e., cnp1+-GFP) displays growth retardation (30). We first assessed the ability of Pmcenp-a to rescue the lethality of cnp1∆ cells. We performed plasmid shuffling assays, in which cnp1∆ cells were initially kept viable by supplying a copy of cnp1+ on a plasmid carrying the ura4+ selection marker. A rescue plasmid carrying Pmcenp-a and the LEU2 selection marker was then transformed into the cells, followed by growth on medium containing 5-Fluoroorotic acid (FOA), which selects for loss of the ura4+-marked cnp1+ plasmid (Fig. 2B). As seen with control cells expressing the S. pombe GFP-CENP-A gene, we found that cells expressing PmCENP-A also fully support viability of cnp1∆ cells as they grew upon FOA selection (Fig. 2C). We also observed no differences in growth between cnp1∆ cells expressing either S. pombe or P. murina CENP-A (Fig. S3). We next asked if PmCENP-A can also rescue lethality of cnp1-76 cells, a thermosensitive mutant, whose mutation T74M lies at the CATD (CENP-A targeting domain) region (31). As previously reported (32), we found that the S. pombe CENP-A tagged with GFP restored growth of cnp1-76 cells at restrictive temperatures (i.e., 34°C). Importantly, the P. murina counterpart was also able to rescue the thermosensitive phenotype (Fig. 2D) proving the functionality of this gene. 
In interphase, fission yeast displays the Rabl configuration, in which centromeres cluster at the spindle pole body while telomeres associate with each other near the nuclear periphery (33, 34). As expected, expressing GFP-CENP-A from S. pombe resulted in cells displaying single fluorescence foci that colocalized with a tetO array inserted at cen2 (NB: a TetR-tdTomato fusion is expressed to bind to the tetO array and this locus is referred to as cen2-tetO-tdTomato) (Fig. 2E). Consistent with the rescue of cnp1-ts and cnp1∆ cells described above and further supporting its role as a functional CENP-A protein, P. murina CENP-A also colocalized with cen2-tetO-tdTomato and exhibited single foci, although these foci were dimmer than those of the S. pombe counterpart (Fig. 2F). Finally, to confirm the specific localization of PmCENP-A to centromeres, we performed ChIP-qPCR (Fig. 2G). Notably, PmCENP-A localizes to the central core regions (i.e., cc1&3 and cc2:ura4+) similar to the fission yeast counterpart, although the enrichment was lower, consistent with the results from live-cell imaging. Additionally, PmCENP-A localization appears to be specific, as no detectable enrichment was found at pericentromeric outer repeats coated with heterochromatin (i.e., dg) or at euchromatic locations (i.e., fbp1). Together, these results confirm that the P. murina cenp-a gene encodes a bona fide CENP-A protein.
Fig 2.
Pneumocystis CENP-A supports viability and centromere loading in a heterologous system. (A) Protein sequence alignment of full-length CENP-Acnp-1 orthologs in Schizosaccharomyces pombe, Pneumocystis murina, and P. carinii. Each Pneumocystis species only infects a single mammalian host: P. murina (infecting mice) and P. carinii (rats). The ~25 amino acid insert in Pneumocystis CENP-A N-terminal regions is unique to this genus. (B) Schematic of the plasmid-shuffling assay used in C. The inviability of cnp1∆ cells is masked by the presence of cnp1+ on a ura4+-marked plasmid. After introducing the LEU2-marked rescue plasmid, cells that have lost the ura4+-marked plasmid are selected on medium containing FOA, leaving only those with the rescue plasmid. Note: S. cerevisiae LEU2 gene complements the S. pombe leu1-32 mutant. (C) Rescue experiments using the plasmid-shuffling assay described in B to test the ability of multicopy plasmids expressing GFP-tagged CENP-A transgenes to rescue cnp1∆ cells. Growth was assayed at indicated media using 10-fold serial dilutions. (D) Rescue experiment of fission yeast cnp1-76 thermosensitive cells using same plasmids as in C. Growth was assayed at indicated temperatures using 10-fold serial dilutions. The transgene encoding S. pombe CENP-A serves as a non-temperature-sensitive control. (E) Representative images of cen2-tetO-TdTomato strains expressing the indicated GFP-tagged CENP-A transgenes in multicopy plasmids. GFP and TdTomato fluorescence as well as brightfield images are merged. Magnified views of boxed regions are shown at the bottom. (F) Integrated fluorescence intensity of GFP foci was measured and plotted for the strains used in panel E. The median (bar) and interquartile range (error bars) are shown. n > 409. A.U., arbitrary units. (G) Anti-GFP ChIP-qPCR analysis of indicated loci for strains expressing GFP-tagged CENP-A transgenes as single-copy integrations. 
%IP represents the percentage of input that was immunoprecipitated. Error bars denote the SD (n = 3). For each strain, comparison of %IP between loci was performed by one-way ANOVA and Holm-Sidak test for multiple comparisons (P < 0.001). Mean values marked with letters (a or b) indicate results that are significantly different from each other. cc1&3, S. pombe centromere central core 1 & 3; cc2:ura4+, ura4+ insertion at central core 2; dg, a class of heterochromatic outer repeats within pericentromeric regions (control); fbp1, euchromatic locus (control).
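As an illustration of how %IP is commonly derived from qPCR data, the following minimal sketch applies the widely used percent-input calculation. The function name, the Ct values, and the 1% input fraction are assumptions made for illustration; they are not taken from this study, and the calculation assumes roughly 100% amplification efficiency.

```python
import math

def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.01) -> float:
    """Return %IP from qPCR Ct values, assuming ~100% amplification efficiency.

    input_fraction is the share of chromatin saved as input
    (hypothetical default of 1%, not a value from the study).
    """
    # Adjust the input Ct to what 100% of the chromatin would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    # Each Ct cycle corresponds to a two-fold difference in template amount.
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)

# Hypothetical Ct values for one locus (not measured data).
example_pct = percent_input(ct_ip=26.0, ct_input=25.0)
print(round(example_pct, 3))
```

With a 1% input and an IP that amplifies one cycle later than the input, the recovered fraction is half of 1%, which is how values well below 100% arise even at enriched loci.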
pnCENP-A binds to single genomic foci in Pneumocystis-replicating cells
In S. pombe, CENP-ACnp1 is deposited at the centromeres during the G2 phase of the cell cycle (30). Although Pneumocystis cell replication is not fully understood, these organisms take 5–8 days to replicate in vivo (35). To determine pnCENP-A bound genome regions, we established a short-term co-culture system for P. murina and P. carinii (Fig. 3A). We assessed organism growth using quantitative PCR targeting the single-copy gene dihydrofolate reductase (dhfr); a typical growth curve shows a decline from day 0 to 7 followed by a gradual increase (Fig. 3B and C). Population growth is further supported by the presence of mitotic or fusing cells 7 days post culture, though some cells do not appear to be dividing (Fig. 3D).
Fig 3.
Pneumocystis CENP-A binds to single genomic foci in replicating cells. (A) Study workflow and sample preparation. P. murina and P. carinii were obtained from CD40 ligand knockout female mice and immunosuppressed Sprague-Dawley male rats, respectively, and cultured on a co-culture of human lung adenocarcinoma cells (A549) and immortalized murine lung epithelial type 1 cells (Let-1) for 14 days. Species phylogeny and estimated speciation timing are presented. Animal icons were obtained from http://phylopic.org under creative commons licenses https://creativecommons.org/licenses/by/3.0/: mouse (Anthony Caravaggi; license CC BY-NC-SA 3.0) and rat (by Rebecca Groom; license CC BY-NC-SA 3.0). (B) P. murina population growth in duplicate wells were measured by quantitative PCR targeting a single-copy dihydrofolate reductase (dhfr) gene over 14 days. Error bars represent the standard deviation (n = 2). (C) P. carinii growth measurement by qPCR targeting dhfr gene. Error bars represent the standard deviation (n = 2). (D) Electron micrograph showing a possibly dividing P. murina trophic form 7 days post culture (black arrow). The field also includes a non-dividing trophozoite (white arrow). Scale bar, 2 µm.
We mapped CENP-A bound genomic regions in all P. carinii and P. murina chromosomes using ChIP-Seq from cultured organisms collected days 0 (normalized initial inoculum), 7 and 14.
In P. murina, each of the 17 chromosome-level scaffolds displays a single region containing peaks of CENP-A at all three time points (Fig. 4A and B; Fig. S4). We confirmed CENP-A enrichment by ChIP-qPCR (Fig. S5A). CENP-A enrichment has a bimodal distribution, suggesting two distinct populations of CENP-A in chromosome arms. The two CENP-A peaks are separated by a 1-kb non-coding region that lacks a conserved DNA motif. Although the significance of two peaks of CENP-A in Pneumocystis is unclear, in S. pombe, CENP-A cores are interspersed with H3K4me (36). CENP-A enrichment levels fluctuate according to the growth curve, that is, the highest level is observed at day 0, followed by a decline at day 7 and a recovery at day 14. In S. pombe, CENP-ACnp1 enrichment at the centromeres correlates with a depletion of histone H3 (37). At day 7, the H3 occupancy (expressed as the ratio between H3 and H4) is significantly reduced at centromeres relative to flanking regions (30 kb) as estimated by ChIP-Seq (Wilcoxon test; P < 0.0001; Fig. S6), which suggests that CENP-A replaces canonical H3. However, H3 depletion at the centromeres does not persist after 14 days of culture (Fig. S7A).
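The H3 occupancy comparison described above can be sketched as follows. This is a minimal illustration only: the per-bin coverage values, the bin layout, and the pseudocount are invented assumptions, and the helper name `h3_h4_ratio` is hypothetical rather than part of the analysis pipeline used here.

```python
from statistics import median

def h3_h4_ratio(h3, h4, pseudocount=1e-6):
    """Per-bin H3 occupancy expressed as the ratio of H3 to H4 coverage."""
    if len(h3) != len(h4):
        raise ValueError("H3 and H4 tracks must have the same number of bins")
    # The pseudocount (an assumption) avoids division by zero in empty bins.
    return [a / (b + pseudocount) for a, b in zip(h3, h4)]

# Toy coverage values per bin (invented for illustration only).
# Bins with 0-based indices 3-5 stand in for a centromere; the rest are flanks.
h3 = [10.0, 9.0, 11.0, 4.0, 3.0, 5.0, 10.0, 12.0, 9.0]
h4 = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
cen_bins = range(3, 6)

ratio = h3_h4_ratio(h3, h4)
cen = [ratio[i] for i in cen_bins]
flank = [r for i, r in enumerate(ratio) if i not in cen_bins]

print("median H3/H4 at centromere:", round(median(cen), 2))
print("median H3/H4 in flanks:", round(median(flank), 2))
```

A lower ratio in the centromere bins than in the flanks is the pattern expected if CENP-A occupies positions that canonical H3 would otherwise fill; a rank-based test would then be applied to the two sets of ratios.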
Fig 4.
Pneumocystis displays 17 centromeres. (A) Scaled ideogram of 17 chromosomal-level scaffolds (gray) of P. murina genome showing CENP-A binding regions (constricted areas). (B) CENP-A binding regions delineate putative CENs in P. murina genome. Color-coded peaks represent enrichment of immunoprecipitated DNA (IP DNA) relative to controls (Input DNA) in P. murina organisms (input subtracted). Only ChIP-Seq data from Pneumocystis cocultured with A549 and Let-1 cells for 7 days are presented. Data for the full experiment covering days 0, 7, and 14 are presented in Supplementary material. The 20-kb windows of the enrichment peak are presented. Each scaffold displays two peaks labeled M (Major) and m (minor) according to the enrichment level. (C) Scaled ideogram of 17 chromosomal-level scaffolds (gray) of P. carinii genome showing CENP-A binding regions (constricted areas). (D) CENP-A binding regions delineate putative CENs in P. carinii genome. Color-coded peaks represent enrichment of IP DNA relative to controls (Input DNA) in P. carinii organisms cocultured with A549 and Let-1 cells for 7 days (input subtracted). Data for the full experiment covering days 0, 7, and 14 are presented in Fig. S6 and supplemental data at https://doi.org/10.5281/zenodo.10574230. (E) A DNA dot plot of CENP-A binding regions in P. murina showing regional self-similarity. The main diagonal represents the sequence alignment with itself. Lines off the main diagonal which are repetitive patterns within the sequences are not observed. The plot shows that each CEN is unique within the genome. (F) DNA dot plot of CENP-A binding regions in P. carinii showing that each CEN is unique and repeat free.
P. carinii also displays monocentric chromosomes enriched with two CENP-A peaks at the three time points by ChIP-Seq (Fig. 4C and D; Fig. S8) and ChIP-qPCR (Fig. S5B). H3 depletion is observed at day 14 (Fig. S7B). In contrast to P. murina, we found a significant reduction of CENP-A enrichment at day 14 despite a population growth (Fig. 3C). Also not observed in P. murina, at day 14, low levels of CENP-A are present outside the primary CENP-A binding region in several chromosomes (Fig. S8), which correspond to a non-specific displacement of CENP-A. In budding yeasts and chicken cells, low levels of CENP-A molecules have been shown outside the core domain (CENP-A cloud), which contribute to centromere plasticity (38).
In both species, CENP-A bound regions span 4.8–8.0 kb, with an average length of 6.7 kb, and have an average of 1.03% lower GC content than the rest of the genome (Table 1). Using dot plot analysis of a concatemer of different centromeres, each CENP-A binding region appears unique within the genome (Fig. 4E and F) and lacks a shared conserved DNA motif (see Materials and Methods). Because most centromeres in fungi are associated with repetitive DNA (e.g., DNA transposons and retrotransposons), we searched for known signatures of repeats and found no significant overlap of repeats with centromeres (Fig. S6 and S9).
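To make the dot plot criterion concrete, the sketch below reports off-diagonal k-mer matches in a self-comparison of a sequence. It is a simplified stand-in for dedicated dot plot software; the function name, the word size of 8, and the toy sequences are assumptions for illustration and do not correspond to the parameters or data used in this study.

```python
def dotplot_matches(seq, k=8):
    """Return (i, j) pairs with i < j where the same k-mer occurs at both positions.

    Pairs with i == j (the main diagonal) are excluded, so an empty result
    means no repeated k-mers, i.e., no off-diagonal signal in a self dot plot.
    """
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    matches = []
    for hits in positions.values():
        for a in range(len(hits)):
            for b in range(a + 1, len(hits)):
                matches.append((hits[a], hits[b]))
    return sorted(matches)

# Toy sequences (invented): one without repeats, one with a duplicated block.
unique_seq = "ACGTTGCAAGCTTAGGCTAACCGT"
repeat_seq = "ACGTTGCAAG" + "TTT" + "ACGTTGCAAG"

print(dotplot_matches(unique_seq, k=8))  # no off-diagonal matches expected
print(dotplot_matches(repeat_seq, k=8))  # matches along one off-diagonal line
```

Applied to a concatemer of centromere sequences, an empty result at a given word size would correspond to the absence of off-diagonal lines described above.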
TABLE 1.
Coordinates, length, and GC content (in %) of Pneumocystis centromeres identified by direct ChIP
| Species | NCBI accession no. | Chromosome length (kb) | Chromosome GC (%) | Centromere ID | Centromere length (kb) | Centromere start | Centromere end | Centromere GC (%) | Difference GC (%) | Synteny |
|---|---|---|---|---|---|---|---|---|---|---|
| P. carinii | NW_017264713.1 | 635.4 | 22.9 | PcCEN1 | 6.5 | 233,731 | 240,226 | 21.4 | −1.5 | PmCEN5 |
| P. carinii | NW_017264714.1 | 590.5 | 23.1 | PcCEN2 | 6.1 | 121,692 | 127,752 | 24.4 | 1.3 | PmCEN3 |
| P. carinii | NW_017264715.1 | 543.2 | 26.2 | PcCEN3 | 7.2 | 84,071 | 91,273 | 24.2 | −2 | PmCEN9 |
| P. carinii | NW_017264716.1 | 564.5 | 21.4 | PcCEN4 | 7.3 | 506,597 | 513,868 | 19 | −2.4 | PmCEN11 |
| P. carinii | NW_017264717.1 | 566.0 | 24.5 | PcCEN5 | 4.3 | 78,025 | 82,304 | 22.5 | −2 | PmCEN10 |
| P. carinii | NW_017264718.1 | 476.3 | 23.5 | PcCEN6 | 5.9 | 430,075 | 436,005 | 21 | −2.5 | PmCEN17 |
| P. carinii | NW_017264719.1 | 465.1 | 25.7 | PcCEN7 | 6.4 | 176,750 | 183,172 | 23.3 | −2.4 | PmCEN7 |
| P. carinii | NW_017264720.1 | 416.9 | 23.7 | PcCEN8 | 5.6 | 50,911 | 56,559 | 21.3 | −2.4 | PmCEN16 |
| P. carinii | NW_017264721.1 | 419.0 | 25.9 | PcCEN9 | 7.1 | 86,466 | 93,542 | 24.9 | −1 | PmCEN14 |
| P. carinii | NW_017264722.1 | 437.4 | 21.8 | PcCEN10 | 6.3 | 277,837 | 284,141 | 19.8 | −2 | PmCEN2 |
| P. carinii | NW_017264723.1 | 377.5 | 25.8 | PcCEN11 | 6.1 | 241,326 | 247,471 | 22.7 | −3.1 | PmCEN4 |
| P. carinii | NW_017264724.1 | 370.5 | 24.3 | PcCEN12 | 5.3 | 148,081 | 153,334 | 22.7 | −1.6 | PmCEN6 |
| P. carinii | NW_017264725.1 | 359.5 | 23.8 | PcCEN13 | 7.1 | 240,705 | 247,764 | 23.4 | −0.4 | PmCEN1 |
| P. carinii | NW_017264726.1 | 281.9 | 25.1 | PcCEN14 | 5.3 | 206,934 | 212,243 | 22.5 | −2.6 | PmCEN13 |
| P. carinii | NW_017264727.1 | 446.5 | 23.7 | PcCEN15 | 4.6 | 110,888 | 115,452 | 22.5 | −1.2 | PmCEN12 |
| P. carinii | NW_017264728.1 | 274.9 | 23.6 | PcCEN16 | 6.5 | 45,069 | 51,578 | 20.9 | −2.7 | PmCEN8 |
| P. carinii | NW_017264729.1 | 268.2 | 24.5 | PcCEN17 | 7.8 | 61,164 | 68,971 | 20.1 | −4.4 | PmCEN15 |
| P. murina | NW_006920852 | 535.3 | 23.9 | PmCEN1 | 5.7 | 458,129 | 463,864 | 22.6 | −1.34 | |
| P. murina | NW_006920853 | 584.1 | 23.2 | PmCEN2 | 6.3 | 463,956 | 470,213 | 25.3 | 2.13 | |
| P. murina | NW_006920854 | 584.1 | 23.7 | PmCEN3 | 8.0 | 126,184 | 134,213 | 20.8 | −2.91 | |
| P. murina | NW_006920855 | 545.3 | 26.1 | PmCEN4 | 6.7 | 450,901 | 457,584 | 24.5 | −1.65 | |
| P. murina | NW_006920856 | 532.6 | 23.4 | PmCEN5 | 7.4 | 197,805 | 205,216 | 23.0 | −0.36 | |
| P. murina | NW_006920857 | 536.0 | 21.3 | PmCEN6 | 7.5 | 18,391 | 25,875 | 20.5 | −0.82 | |
| P. murina | NW_006920858 | 446.5 | 23.8 | PmCEN7 | 4.8 | 331,196 | 335,979 | 22.7 | −1.1 | |
| P. murina | NW_006920859 | 432.0 | 23.4 | PmCEN8 | 6.6 | 359,990 | 366,546 | 22.2 | −1.24 | |
| P. murina | NW_006920860 | 378.7 | 25.4 | PmCEN9 | 8.1 | 236,499 | 244,613 | 24.0 | −1.37 | |
| P. murina | NW_006920861 | 376.0 | 23.2 | PmCEN10 | 5.1 | 244,652 | 249,709 | 23.6 | 0.38 | |
| P. murina | NW_006920862 | 385.6 | 23.5 | PmCEN11 | 6.3 | 221,796 | 228,057 | 24.6 | 1.06 | |
| P. murina | NW_006920863 | 491.3 | 24.9 | PmCEN12 | 6.3 | 310,161 | 316,424 | 22.7 | −2.18 | |
| P. murina | NW_006920864 | 333.4 | 23.7 | PmCEN13 | 8.3 | 39,073 | 47,361 | 23.5 | −0.16 | |
| P. murina | NW_006920865 | 318.1 | 22.9 | PmCEN14 | 6.8 | 38,876 | 45,692 | 21.6 | −1.35 | |
| P. murina | NW_006920866 | 412.2 | 26.5 | PmCEN15 | 7.8 | 311,726 | 319,558 | 26.8 | 0.26 | |
| P. murina | NW_006920867 | 290.2 | 23.5 | PmCEN16 | 5.5 | 49,567 | 55,084 | 20.0 | −3.46 | |
| P. murina | NW_006920868 | 292.7 | 25.6 | PmCEN17 | 7.5 | 208,752 | 216,295 | 24.5 | −1.09 | |
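The derived columns of Table 1 can be reproduced from the coordinates and GC values, as in the short sketch below for three P. carinii rows. It assumes that coordinates are in base pairs, that centromere length is the end coordinate minus the start coordinate, and that the GC difference is the centromere GC minus the chromosome GC; the function name is hypothetical.

```python
# Three rows copied from Table 1 (P. carinii):
# (centromere ID, chromosome GC %, start, end, centromere GC %)
rows = [
    ("PcCEN1", 22.9, 233_731, 240_226, 21.4),
    ("PcCEN2", 23.1, 121_692, 127_752, 24.4),
    ("PcCEN3", 26.2, 84_071, 91_273, 24.2),
]

def derived_columns(chrom_gc, start, end, cen_gc):
    """Return (length in kb, GC difference in percentage points)."""
    length_kb = round((end - start) / 1000, 1)  # assumes coordinates in bp
    gc_diff = round(cen_gc - chrom_gc, 1)       # negative = AT-richer centromere
    return length_kb, gc_diff

for cen_id, chrom_gc, start, end, cen_gc in rows:
    length_kb, gc_diff = derived_columns(chrom_gc, start, end, cen_gc)
    print(cen_id, length_kb, gc_diff)
```

For these rows, the computed lengths (6.5, 6.1, and 7.2 kb) and GC differences (−1.5, 1.3, and −2.0) agree with the tabulated values.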
Pneumocystis centromeres contain active genes
By cross referencing annotated gene locations with those of pnCENP-A bound regions, we found that P. murina and P. carinii centromeres encode 74 and 58 genes, respectively; all of them are conserved in the genomes of both species except a prefoldin gene that is lost in P. carinii (Table S2 at https://doi.org/10.5281/zenodo.10574230). Of the 74 P. murina centromeric genes, 73 were expressed based on RNA-seq and four were further detected by protein mass spectrometry mapping (LC-MS); of the 58 P. carinii genes, 56 are expressed and 53 were detected by LC-MS. Analysis of the predicted function of these genes revealed housekeeping functions without major differences compared with randomly sampled genes.
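Cross-referencing gene annotations with CENP-A bound regions amounts to an interval overlap query. The sketch below shows one simple way to do this; the gene names and coordinates are hypothetical placeholders, whereas the region boundaries are those of PcCEN1 in Table 1. Treating any partial overlap as "centromeric" is an assumption of this example.

```python
def genes_in_region(genes, region_start, region_end):
    """Return names of genes whose span overlaps [region_start, region_end]."""
    return [
        name
        for name, g_start, g_end in genes
        if g_start <= region_end and g_end >= region_start
    ]

# Hypothetical gene coordinates on one scaffold (not real annotations).
genes = [
    ("geneA", 230_000, 232_500),
    ("geneB", 233_000, 235_200),
    ("geneC", 236_100, 239_900),
    ("geneD", 241_000, 243_000),
]

# PcCEN1 coordinates from Table 1.
print(genes_in_region(genes, 233_731, 240_226))
```

A stricter criterion, such as requiring the entire gene to lie within the CENP-A bound region, would change the counts and can be obtained by replacing the overlap condition with a containment condition.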
Centromeres are flanked by heterochromatin
In S. pombe, H3K9me2/3 marking of chromatin is associated with the repression of centromeres, subtelomeres, ribosomal DNA, and the mating locus (36). There are two types of chromatin: euchromatin, the lightly packed form of chromatin enriched with H3K4me2 (di-methylation of the fourth lysine residue of histone H3), which is associated with active transcription, and heterochromatin, which is enriched with H3K9me2 and H3K9me3 (di- and tri-methylation of the ninth lysine residue of histone H3). To test if this feature is shared in Pneumocystis, we performed ChIP-Seq with antibodies targeting histone H3 modifications (H3K9me2/3 and H3K4me2). In Pneumocystis, we detected broadly distributed peaks of H3K9me2 and H3K9me3 bordering the putative centromeres delineated by CENP-A narrow peaks (Fig. 5A and B; Fig. S6 and S9). This configuration of chromatin markers is specific to centromeric regions. H3K4me2 is correlated with H3K9me2 in the centromeres (Pearson rho = 0.72) but not at the whole genome level (rho = 0.05). However, research has shown that the histone modifications H3K4me2 and H3K9me2 decorate euchromatin and heterochromatin regions, respectively (39). Therefore, the correlation observed between H3K4me2 and H3K9me2 may be due to differences in growth kinetics for the Pneumocystis cell population used in the ChIP analysis. Furthermore, similar to S. pombe (36), a small amount of H3K4me2 is present at the centromeres in Pneumocystis (Fig. 5A and B; Fig. S6 and S9). These results suggest that Pneumocystis centromeres are flanked by high levels of heterochromatin.
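The correlation between two histone mark tracks can be computed bin by bin, as in the minimal sketch below. The function is a plain implementation of the Pearson correlation coefficient; the per-bin signal values are invented and are not derived from the ChIP-Seq data reported here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length tracks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two tracks of equal length (>= 2 bins)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    # Note: a constant track (zero variance) would raise ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)

# Toy per-bin signals (invented for illustration).
h3k4me2 = [1.0, 2.0, 3.0, 4.0, 5.0]
h3k9me2 = [2.0, 4.1, 5.9, 8.2, 9.8]
print(round(pearson_r(h3k4me2, h3k9me2), 3))
```

Restricting the bins to centromeric intervals versus using all genomic bins is what distinguishes the two coefficients contrasted in the text.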
Fig 5.
Centromeres are flanked by heterochromatin and contain active genes. (A) Genomic view of chromosome 2 of P. murina genome subsequently showing annotated genes (directed gray boxes), DNA repeats (not present), percent GC content (blue), ChIP-Seq read coverage distribution [bins per million mapped reads (BPM) normalized over bins of 50 bp; input subtracted] of CENP-A, histone H3 and H4 ratio, heterochromatin-associated modifications (H3K9me2 and H3K9me3), and euchromatin (H3K4me2) and gene expression (RNA-seq) in relation with centromeres. (B) Genomic view of chromosome 2 of P. carinii genome with the same features presented as P. murina. A duplicated copy of copia-retrotransposon is presented (red arrows), which is present in syntenic regions in other Pneumocystis genomes. The two presented chromosomes are syntenic.
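The BPM normalization and input subtraction mentioned in the legend can be illustrated with the sketch below. It follows the usual definition of BPM (reads in a bin divided by the total reads across all bins, in millions) and subtracts the normalized input from the normalized IP. The function names and read counts are assumptions for illustration, not the actual processing code or data.

```python
def bpm(counts):
    """Normalize per-bin read counts to bins per million mapped reads (BPM)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("track has no reads")
    return [c * 1_000_000 / total for c in counts]

def input_subtracted(ip_counts, input_counts):
    """BPM-normalize both tracks, then subtract input from IP bin by bin."""
    if len(ip_counts) != len(input_counts):
        raise ValueError("tracks must have the same number of bins")
    return [i - c for i, c in zip(bpm(ip_counts), bpm(input_counts))]

# Toy read counts over five 50-bp bins (invented for illustration).
ip_counts = [10, 10, 60, 10, 10]
input_counts = [20, 20, 20, 20, 20]
print(input_subtracted(ip_counts, input_counts))
```

Positive values mark bins where the IP is enriched over input, and negative values mark bins where it is depleted, which is why an input-subtracted track can dip below zero outside peaks.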
DNA methylation is frequently associated with centromeres in fungi, where it silences repeats (40, 41). Similar to S. pombe, DNA methylation has been predicted to be absent in Pneumocystis based on the absence of DNA methyltransferases (DMTs) (27, 28). To determine whether DNA methylation plays a role in Pneumocystis centromeres, we performed bisulfite sequencing (5-methylcytosine, 5mC) in P. carinii. The overall level of 5mC DNA methylation, as measured by the average weighted methylation percentage, is 0.6% for P. carinii at CG dinucleotides (Fig. S10). These levels are in the range of those reported for other fungi, e.g., Verticillium (0.4%) (42). The presence of 5mC methylated DNA bases despite the absence of recognizable DNA methyltransferases in Pneumocystis requires further investigation. To assess the potential role of DNA methylation in centromere function in Pneumocystis, we analyzed the DNA methylation patterns over different genomic features (genes, intergenic spacers, and centromeres). However, there is no significant difference between centromeric or pericentromeric regions (defined as 30 kb flanking the centromeres) and randomly selected genomic regions (genomic background) (Mann-Whitney U test P-value >0.3). These results suggest that 5mC DNA methylation is not required for centromere function in Pneumocystis. The absence of repeats in the centromeres and the presence of actively transcribed genes suggest that DNA methylation does not induce epigenetic silencing in Pneumocystis centromeres.
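For readers unfamiliar with the weighted methylation percentage, the sketch below shows the commonly used definition: methylated read counts summed over all sites, divided by total read counts summed over the same sites. The function name and the per-site counts are invented for illustration and do not reproduce the bisulfite data from this study.

```python
def weighted_methylation_pct(sites):
    """Weighted methylation: 100 * sum(methylated reads) / sum(total reads)."""
    methylated = sum(m for m, _ in sites)
    total = sum(t for _, t in sites)
    if total == 0:
        raise ValueError("no read coverage at the given sites")
    return 100.0 * methylated / total

# Toy CG sites as (methylated reads, total reads); values are invented.
cg_sites = [(0, 50), (1, 40), (0, 60), (2, 50)]
print(weighted_methylation_pct(cg_sites))
```

Because the sums are taken before dividing, deeply covered sites contribute more than shallow ones, which makes the measure less sensitive to noisy low-coverage positions than a simple mean of per-site fractions.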
Centromere conservation during speciation
Pneumocystis murina and P. carinii diverged ~35 Mya (11). Yet, orthologous regions act as centromeres in both species, which suggests vertical transmission (Fig. 6A). Syntenic regions can be found across five additional Pneumocystis species whose speciation spans at least 100 million years (Fig. S11 at https://doi.org/10.5281/zenodo.10574230).
Fig 6.
Centromere locations, not sequences, are evolutionarily constrained by positive selection. (A) Genome organization and synteny in Pneumocystis. Circos plots depicting pairwise P. carinii and P. murina genome synteny. Rodent-infecting Pneumocystis (P. carinii and P. murina) have 17 chromosomes. Colored connectors indicate regions of synteny between species. Centromeres that overlap with recent chromosomal breakpoints are indicated (*). The square highlights P. carinii centromere 4 displayed in panel B. (B) Genome view of centromere 4 in the P. carinii genome. Tracks show genes as directed boxes (gray for protein-coding genes and cyan for polymorphic major surface glycoprotein genes), the pnCENP-A binding region (centromere), and sequence conservation scores (phastCons) calculated from whole genome alignments of P. carinii, P. murina, and P. wakefieldiae. The phastCons scores represent probabilities of negative selection and range between 0 (no conservation) and 1 (total conservation). (C) Boxplot of conservation scores per genomic context summarized for centromere 4 in P. carinii genome, 30-kb regions flanking the centromeres (Cenflk), major surface glycoproteins encoding regions (Msg), and random genomic background (Bckg). Msgs are fast-evolving proteins potentially involved in antigenic variation. Background data were obtained from randomly selected intervals (n = 1 × 10⁶) from genomic regions excluding above-mentioned regions (CEN, Cenflk, and Msg). Statistical differences for the indicated comparisons were obtained using one-sided non-parametric Mann-Whitney test; ****P < 0.0001, ***P < 0.001, **P < 0.01, *P < 0.05. Data for all 17 centromeres are presented in Supplementary material.
To investigate footprints of selection acting on centromeres, we computed conservation scores (phastCons) from whole genome alignments. Conservation scores range from 0 to 1 and represent probabilities of negative selection. Pneumocystis genomes encode a large multicopy major surface glycoprotein (MSG) gene family. MSGs are highly polymorphic and evolve rapidly due to their role in antigenic variation during mammalian host infection [reviewed in reference (43)]. Using MSGs as a control for evolutionary speed, we found that centromeres and flanking regions are substantially more conserved than MSGs and similar to the genomic background (Fig. 6B and C). The levels of centromeric conservation are variable across the chromosomes, with the flanking regions being more conserved than the cores (defined as a 1–2-kb region in the center of each centromere) (Fig. S12 at https://doi.org/10.5281/zenodo.10574230). Given that macrosynteny at the centromeric regions is conserved across multiple Pneumocystis species, this suggests that centromere locations are maintained by positive selection. This would be consistent with the hypothesis that centromere positioning tends to be conserved in obligate sexual fungi due to their role in meiosis (44).
Centromeres contribute to karyotypic diversity in some fungi (2). However, we found little support for this hypothesis here because most centromeres do not overlap with chromosomal breaks (only 3 of 17 in the P. carinii versus P. murina pairwise comparison). This is reminiscent of other fungi such as Verticillium species where centromeres do not account for most of the karyotype variation (45).
DISCUSSION
In the current study, we have identified centromeres of two Pneumocystis species by showing that the centromeric histone CENP-A binds to gene-rich genomic regions that are flanked with heterochromatin (Fig. 7).
Fig 7.
Model of regional centromere structures in Pneumocystis and other representative fungi. On the left is the phylogeny of selected fungi inferred from maximum likelihood phylogenetic analysis of shared core protein orthologs. The overall centromere structures for Pneumocystis carinii (details supporting our model are provided in the text), Schizosaccharomyces pombe (46), Candida albicans (7), Zymoseptoria tritici (44), and Cryptococcus neoformans (9) are used as representative to showcase the diversity of regional centromeres in fungi. Animal pathogens are highlighted in pale-olive, and the plant fungal pathogen Zymoseptoria tritici is in pale-green. For the sake of brevity, only Pneumocystis carinii is presented. In the middle are presented DNA structures of centromeres. P. carinii has 17 centromeres (one for each of its 17 chromosomes) that share the same overall architecture. Centromeres are delineated by a localized enrichment of the centromeric histone CENP-A, which overlaps with a reduction of the canonical histone H3 (inverted grey triangle). Centromeres are flanked by heterochromatin H3K9me (here stands for both H3K9me2 and H3K9me3). All 17 Pneumocystis carinii centromeres span active genes (dark-gray boxes). Each centromere sequence is different and lacks shared DNA sequence motif. S. pombe has three centromeres that share the same overall structure in which a central core (cnt) domain is surrounded by innermost repeats (imr) and outer repeats (otr). The imr repeats incorporate clusters of transfer RNAs (tRNAs) that play a role in restricting CENP-A spread. S. pombe centromeres are flanked by heterochromatin (H3K9me). Genes are found 0.75–1.5 kb beyond the limits of the centromeres. C. albicans has eight unique and different centromeres that are gene free and lack shared sequence motifs. Z. tritici has 21 centromeres ranging from 6 to 14 kb in size that partially overlap with genes. C. neoformans has 14 centromeres that are gene free and enriched with Tcn transposons.
This configuration suggests the presence of small epigenetically regulated centromeres (regional). Within each genome, each chromosome has a unique non-repetitive centromere. Syntenic regions for each centromere are present in all seven currently sequenced Pneumocystis species. Cell replication and chromosomal segregation pathways are unexplored in these species. Central to these pathways are the centromeres.
Because no suitable reference for synteny analyses existed before this study, centromeres could not be predicted bioinformatically, and the locations and characteristics of Pneumocystis centromeres had not been reported. Moreover, there are several roadblocks to characterizing centromeres in Pneumocystis, the most significant being the lack of continuous culture and transfection tools. Here, we utilized two Pneumocystis species in a short-term culture system to determine where pnCENP-A, the epigenetic marker for centromeres, binds in the genome.
As a first step, we demonstrated that PnCENP-A functions similarly to CENP-ACnp1 in S. pombe, though less efficiently, validating its use in characterizing Pneumocystis centromeres. We then developed a PnCENP-A-based ChIP-Seq assay that showed a reproducible efficiency in defining centromeric regions of Pneumocystis species.
Unlike the phylogenetically related S. pombe, in Pneumocystis, these regions overlap with active genes, but like S. pombe, these centromeres are flanked by heterochromatin. Our work also provides the first experimental evidence that DNA methylation occurs in these species, although it may not be involved in centromere function.
Despite their close phylogenetic relationship, P. murina and P. carinii have slightly different growth kinetics in our short-term culture system, which could explain the variations in the PnCENP-A and H3 DNA binding profiles. The non-specific binding of PnCENP-A observed in P. carinii at day 14 is likely the result of a random dissociation of CENP-A molecules from the centromeres.
In P. murina, 5 out of the 17 chromosomes showed secondary CENP-A enrichment at non-centromeric sites (Fig. S4). These secondary CENP-A peaks are likely ChIP-Seq artefacts because they were only detected at day 7 post culture and mostly overlap with highly expressed loci (Fig. S8), which are susceptible to misleading signals in ChIP-Seq experiments (47).
The presence of genes within centromeres is rare and has only been described in rice (48) and the plant fungal pathogen Zymoseptoria tritici (44). Centromeres bearing genes are interpreted as young centromeres (neocentromeres), in which the genes are progressively inactivated. This hypothesis lacks support here because there is no sign of pseudogenization in these genes: nearly all are transcribed and translated, they are conserved in all species and many are involved in housekeeping cellular pathways that are presumably critical for organism survival.
Findings in Pneumocystis cannot be generalized to other Taphrinomycotina species. The determination of ancestral traits between Pneumocystis and fission yeast centromeres will require characterizing the centromeres in additional Taphrinomycotina species. The loss of RNAi is often associated with shortening of centromeres (9); consistent with this, Pneumocystis centromeres are much smaller than Schizosaccharomyces centromeres.
Our study would benefit from a microscopy validation of CENP-A loading kinetics once a reliable long-term culture system becomes available. Live imaging would help to determine at which cell cycle stage CENP-A is loaded. A direct genetic confirmation of Pneumocystis centromeres will also be required when a long-term culture system and genetic manipulation tools become available.
In summary, we have identified short regional centromeres in genetically intractable micro-organisms. Our results provide insights into the formation of centromeres in host-adapted fungal pathogens. Our ultimate goal is to use Pneumocystis centromeres to stabilize plasmids for genetic manipulation. This is the first step along this path, which should lead to better understanding of the biology of Pneumocystis and facilitate the discovery of novel interventions to effectively control and prevent the disease caused by this pathogen.
MATERIALS AND METHODS
Ethics and organism source
Studies involving mouse and rat samples were approved by the NIH Clinical Center Animal Care and Use Committee (protocol CCM 19-05), and rat Pneumocystis studies by the Animal Care and Use Committees of the Cincinnati VA Medical Center (protocol 20-11-08-01).
P. carinii organisms were collected from heavily infected lungs of corticosteroid-treated Sprague-Dawley male rats and P. murina from heavily infected lungs of CD40 ligand knockout mice. The list of samples is presented in Table S3 at https://doi.org/10.5281/zenodo.10574230.
Pneumocystis culture and growth quantification
P. murina and P. carinii were partially purified by Ficoll-Hypaque density gradient centrifugation (49) and frozen at −80°C in cell recovery media (GIBCO). For short-term Pneumocystis cultures, A549 (ATCC) and LET1 (a gift from Dr. Paul Thomas, St. Jude Children’s Research Hospital) cells were cultured in F12 culture medium with 2.5% or 5% heat-inactivated fetal bovine serum (GIBCO) and Penicillin-Streptomycin (GIBCO), plated to approximately 60%–80% confluency, and incubated for 24 hours at 37°C. The next day, frozen Pneumocystis vials were thawed, washed in 50 mL 1× PBS, and centrifuged at 2,000 × g for 20 minutes. Pneumocystis cell pellets were resuspended in culture medium and added to the plated cells. Media were partially changed every 3 or 4 days. At the set time points (days 7 and 14), wells were scraped for collection. Cell/organism suspensions were centrifuged at 10,000 × g for 3 minutes, the supernatant was removed, and the pellets were frozen at −80°C. Genomic DNA was extracted using the QIAamp DNA Extraction Kit (Qiagen). Quantitative PCR targeting the single copy dhfr gene was performed as described previously (50).
P. murina CENP-ACnp1 complementation and ChIP-qPCR in S. pombe
Standard procedures were used for fission yeast growth, genetics, and manipulations (51). The sequence encoding the P. murina CENP-A gene was codon optimized for S. pombe codon bias, synthesized commercially (GenScript, New Jersey), and subcloned into the pS2 vector (52), generating pPmCENP-A. NdeI-SphI fragments from both pS2 and pPmCENP-A were subcloned into the same sites of the pREP81 vector to obtain LEU2-based multicopy plasmids that were used in experiments presented in Fig. 2B through F. Alternatively, plasmids were linearized with NsiI and introduced by transformation at the lys1+ locus of the cen2-tetO-tdTomato strain (53). These strains bearing single-copy integrations were grown overnight at 30°C in rich medium YEA and used for ChIP with 2 µL of anti-GFP (ab290, Abcam). PCR oligonucleotides detecting central core (cc1&3; ura4), heterochromatic dg, and euchromatin control fbp1 were previously reported (54).
Ortholog identification
To find orthologs for kinetochore proteins, we constructed a database of 21 fungal proteomes: 15 Taphrinomycotina (including 7 Pneumocystis species) and 6 representative distantly related fungi (Table S1 at https://doi.org/10.5281/zenodo.10574230). We queried the database with S. pombe proteins as seeds using BLASTp (55) with an e-value threshold of 1.0 × 10⁻⁵.
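The e-value filtering step can be sketched in Python as follows. This is a minimal illustration that assumes the BLASTp hits were written in the standard 12-column tabular layout, in which the e-value is the eleventh column; the output format actually used is not stated here, and the function name and the sequence identifiers are hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-5):
    """Keep (query, subject, evalue) for tabular hits at or below max_evalue."""
    kept = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        evalue = float(fields[10])  # eleventh column of the tabular layout
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical hits: one strong match and one weak local similarity.
example_lines = [
    "cnp1\tprotA\t62.5\t120\t45\t0\t1\t120\t5\t124\t3e-40\t150",
    "cnp1\tprotB\t30.0\t60\t42\t1\t10\t69\t200\t259\t0.002\t35.1",
]
hits = filter_blast_hits(example_lines)
```

Only the first hypothetical hit passes the threshold; the second would be set aside as a weak local similarity.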
When many potential homologs were detected in Pneumocystis species, we separated orthologs from out-paralogs by searching in orthologous groups inferred from the clustering of the whole proteome data set using OrthoFinder v2.5.2 (56). A sequence was considered an ortholog if it was assigned to the same orthologous group as the S. pombe protein query. We verified that the assignment was consistent with the EggNOG database v6.0 (57), which contains all S. pombe proteins and a limited number of Pneumocystis proteins. If a Pneumocystis sequence was detected by BLASTp but not assigned as an ortholog, we first identified conserved domains in all sequences using InterProScan version 5.64–96.0 (58). Then, we constructed a multiple sequence alignment containing all sequences including the S. pombe query using MAFFT v7.467 (59) with either the E-INS-i or the L-INS-i option, depending on the protein domain architecture. Alignments were manually curated using MacVector version 18.6.1 (https://macvector.com/) to remove spuriously aligned regions and used to infer an approximately maximum likelihood gene tree using FastTree v2.1.10 (60). When necessary, a maximum likelihood phylogeny was inferred using IQ-TREE version 1.5.5 (61). For histones, we used protein sequences identified by reference (62) to guide the classification. In case of inconclusive phylogenetic classification, we BLASTed Pneumocystis protein sequences against NCBI nr using an e-value threshold of 1.0 × 10⁻⁵. Retrieved proteins were added to the existing alignment, and the phylogenetic inference was repeated.
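The orthologous-group criterion can be expressed as a simple membership test. The Python sketch below assumes that the clustering has already been loaded into a dictionary mapping group identifiers to sets of protein identifiers; this in-memory representation, the function name, and all identifiers are hypothetical and do not reflect the actual OrthoFinder output files.

```python
def assign_orthologs(query, candidates, orthogroups):
    """Split candidates into orthologs (same group as the query) and others."""
    # Groups that contain the query protein.
    query_groups = {og for og, members in orthogroups.items() if query in members}
    orthologs, others = [], []
    for cand in candidates:
        shared = any(cand in orthogroups[og] for og in query_groups)
        (orthologs if shared else others).append(cand)
    return orthologs, others


# Hypothetical clustering with two orthologous groups.
example_groups = {
    "OG0001": {"Sp_cnp1", "Pm_0001", "Pc_0001"},
    "OG0002": {"Sp_hht1", "Pm_0002", "Pc_0002"},
}
orthologs, others = assign_orthologs(
    "Sp_cnp1", ["Pm_0001", "Pm_0002", "Pc_0001"], example_groups
)
```

In this toy case, the candidates that land in the `others` list would be the ones passed on to domain annotation and phylogenetic inference.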
If no homologs were found in Pneumocystis using the S. pombe query, we used a different seed from another Taphrinomycotina if available (e.g., Saitoella). Alternatively, proteins from Saccharomyces cerevisiae, Neurospora crassa, and Cryptococcus neoformans were used as seeds. If no homologs were detected, we used PSI-BLASTp (55). In parallel, if the S. pombe query contained a conserved protein domain, we screened our fungal database with the corresponding Pfam domain using hmmsearch (HMMER version 3.3.2; http://hmmer.org) (63) with a cut off e-value of 0.1.
Because many genes could not be found in Pneumocystis proteomes, we scanned the genomes using TBLASTn (BLAST+ 2.13.0) (64) with an e-value of 0.1 as threshold and Exonerate (65) using the protein-to-genome option. Protein-to-genome alignments were translated into protein sequences and aligned, and phylogenetic inference was performed as described above.
To identify DNA methylases within the Taphrinomycotina, we identified annotated proteins with a DNA methylase domain [Pfam domain PF00145 (66)] using hmmsearch (HMMER version 3.3.2; http://hmmer.org) (63) with a cut off e-value of 0.01. Although no hits were found in Pneumocystis proteomes, DMTs were detected in other fungi including Neolecta irregularis and Saitoella complicata, which is consistent with a previous study (27). Pneumocystis proteomes were queried using sequences from N. irregularis and S. complicata using BLASTp (55) with an e-value of 1 × 10⁻⁵ as cut off. This strategy identified 52 Pneumocystis proteins with local similarities, which were combined with all DMTs identified, aligned using Muscle v3.8.31 (67), and used for phylogenetic inference using FastTree. Then, we BLASTed Pneumocystis proteins against NCBI nr database (last accessed Thu Oct 12 12:07:56 EDT 2023), retrieved the top hit sequences, and added them to the multiple alignment. All 52 Pneumocystis sequences belong to related but distinct protein families (e.g., DNA helicases and SNF2 chromatin remodeling ATPases).
Electron microscopy
P. murina and A549-LET1 cells described above were grown on Thermanox coverslips (Ted Pella, Redding, CA, USA), harvested at day 7, fixed with 2% paraformaldehyde/2.5% glutaraldehyde in 0.1 M Sorenson’s phosphate buffer, then post fixed with 1.0% osmium tetroxide/0.8% potassium ferricyanide in 0.1 M sodium cacodylate buffer for 1 hour, washed with buffer, and then stained with 1% tannic acid in dH2O for 1 hour. After additional buffer washes, the samples were further osmicated with 2% osmium tetroxide in 0.1 M sodium cacodylate and then washed with dH2O. Specimens were then stained overnight at 4°C with 1% aqueous uranyl acetate. The cells were then washed with dH2O and dehydrated with a graded ethanol series, prior to embedding in Spurr’s resin. Thin sections were cut with a Leica UC7 ultramicrotome (Buffalo Grove, IL) prior to viewing at 120 kV on a FEI BT Tecnai transmission electron microscope (Thermo Fisher/FEI, Hillsboro, OR). Digital images were acquired with a Gatan Rio camera (Gatan, Pleasanton, CA).
Antibodies
To obtain an antibody against pnCENP-A, we created an alignment containing CENP-A orthologs from Rattus norvegicus (rat), Mus musculus (mouse), Schizosaccharomyces pombe, and multiple Pneumocystis species (Fig. S13A at https://doi.org/10.5281/zenodo.10574230). We deliberately avoided the C-terminus, which contains the highly conserved histone fold, to minimize the risk of cross reactivity with the host histones in partially purified Pneumocystis protein preparations. Immunogenic peptides were selected from the N-terminal region. Peptides with sequence similarities with rat and/or mouse proteins (NCBI accession nos. GCF_015227675.2 and GCF_000001635.27, respectively) were excluded using BLASTp (55) with a word size of 6 and a minimum e-value of 1 as cut off as well as with pattern matches using Perl regular expressions. Additional searches were performed online at https://www.uniprot.org/peptide-search. We selected one immunogenic peptide conserved in both P. carinii and P. murina. Two rabbits were immunized with 20 µg of the affinity-purified peptide. The polyclonal antibody was purified by affinity column (GenScript). No cross reactivity against the host cell lysates was seen by Western blots (Fig. S13B at https://doi.org/10.5281/zenodo.10574230). The following commercial antibodies targeting histones were used at 1:100 dilution for ChIP-Seq: H3 (ab1791, Abcam), H3K4me2 (39142, Active Motif), H3K9me2 (ab1220, Abcam), H3K9me3 (ab8898, Abcam), and H4 (ab1015, Abcam). We verified that peptides used for immunization for commercial antibodies were conserved in Pneumocystis proteins.
Western blots
Pneumocystis proteins (antigens) were prepared using glass beads. Partially purified organism pellets were resuspended in 1× PBS buffer. An aliquot of 0.65 mL of 0.5 mm glass beads (Biospec Products Inc.) was added to the suspensions and vortexed for 5 minutes at 4°C. Beads were allowed to settle for less than 30 seconds, and the supernatant was transferred to a new tube for sonication. The samples were centrifuged at 20,000 × g for 10 minutes, and the supernatants were transferred to a new tube. The pellet was resuspended in 20 µL PBS.
For Western blots, cell lysates and formalin-fixed chromatin preparations were run on Tris-Glycine SDS 4–20% WedgeWell gel (Thermo Fisher Scientific) and transferred to nitrocellulose membranes, followed by 1 hour incubation in blocking buffer (1× PBS + 5% milk + 0.05% Tween 20) with shaking. Nitrocellulose blots were incubated with primary antibodies (1:100) diluted in blocking buffer for an hour with shaking, followed by washing and then 1 hour of incubation with a secondary HRP conjugated goat anti-rabbit or a goat anti-mouse IgG antibody (1:2,000) (Jackson ImmunoResearch) and developed with Pierce 1-Step Ultra TMB-Blotting solution (Pierce).
Immunofluorescence microscopy
For culture, cells were plated in chamber well slides (Millicell EZ slide, Millipore). When cells were ready, the medium was removed, and the cells were washed with cold 1× PBS, fixed with 4% formaldehyde for 10 minutes at room temperature, and washed three times. Cells were permeabilized with 0.2% Triton X-100 for 30 minutes and then washed with 1× PBST (PBS with 0.05% Tween 20).
Pneumocystis carinii-infected lung tissues fixed in HistoChoice (VWR Life Science) were embedded in paraffin. Five-micrometer sections were labeled with rabbit anti-Pneumocystis CENP-A antibody (1:200) and mouse monoclonal anti-Pneumocystis carinii antibodies (7C4 and RAE7) (68, 69). The 7C4 antibody reacts with one or two uncharacterized antigens of 40,000 daltons, and RAE7 reacts with the major surface glycoproteins of P. carinii (RAE7 antibody is a gift from Drs. Michael Linke and Peter Walzer, University of Cincinnati College of Medicine). Slides were mounted with Vector TrueView autofluorescence quenching kit with DAPI (Vector Laboratories). Alexa Fluor 555 (1:200)-labeled goat anti-rabbit IgG (A270039, Invitrogen) and Alexa Fluor 488 (1:200)-labeled goat anti-mouse (A11029, Thermo Fisher Scientific) were used as secondary antibodies. Images were acquired using either a Leica SP8 confocal microscope with 63× (1.4 NA) HC PL APO objective or a Zeiss 880 confocal microscope with a 63× (1.4 NA) PL APO objective. Images from the Zeiss were taken using 405-, 488-, and 561-nm excitation with emission bandwidths of 415–480 nm, 500–550 nm, and 565–700 nm. Images from the Leica were taken using 405-, 488-, and 555-nm excitation with emission bandwidths of 409–491 nm, 504–550 nm, and 560–721 nm. All images were taken with a format of 1,024 × 1,024 pixels with pixel size ranging from 69 to 73 nm and a line average of 2. Interslice distance for z-stack imaging was set to 160 nm and 250 nm with pinhole settings of 0.7 airy units and 1 airy unit for images acquired on the Leica and Zeiss microscopes, respectively. Image deconvolution was performed assuming an idealized point spread function using the Huygens software program (Scientific Volume Imaging, Netherlands). The software program Imaris (Oxford Instruments, Abingdon UK) was used for image visualization.
S. pombe cells were grown overnight at 30°C in minimal medium PMG lacking leucine until logarithmic phase and then mounted in PMG 2% agarose as described (70). Images were acquired on a Delta Vision Elite microscope (Applied Precision) with a 100 × 1.35 NA oil lens (Olympus). Twenty 0.35-µm z sections were acquired generating maximum intensity projections. Further image processing including background-subtracted centromere foci intensity measurements was performed using ImageJ (National Institutes of Health).
ChIP-Seq
Five independent culture experiments with three time points (days 0, 7, and 14) were performed for each species. Purified cells (~5 × 10⁶ cells) and tissue preparations were fixed by 37% formaldehyde treatment. Chromatin preparation, immunoprecipitation (IP), and Illumina sequencing libraries were prepared using the Low Cell ChIP-Seq Kit (Active Motif, Carlsbad, CA) according to the manufacturer's instructions. Pre-immune sera served as negative controls (Input). ChIP-Seq libraries were sequenced commercially (Psomagen, Rockville, Maryland, USA) using Illumina NovaSeq 6000 (150-base paired end reads).
RNA-Seq
Total RNA was extracted using the RNeasy Mini Kit (Qiagen). RNA quality and integrity were estimated using Bioanalyzer RNA 6000 Pico Assay (Agilent) and sequenced commercially using Illumina HiSeq 4000 (150-base paired end libraries, Novogene Inc., USA). Reads were mapped using STAR v.2.7.9a (71) and normalized to bins per million mapped reads (BPM). Transcript counts were quantified using kallisto 0.44.0 (72).
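As an illustration of BPM normalization, the Python sketch below bins read positions and rescales the bins so that they sum to one million. It is a simplified sketch, not the pipeline used here: each read is assigned only to the bin containing its start coordinate, and the function names and the example coordinates are hypothetical.

```python
def bin_counts(read_starts, chrom_length, bin_size=50):
    """Count reads per fixed-size bin using 0-based read start positions."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def bpm(counts):
    """Scale bin counts so that all bins sum to one million."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c * 1e6 / total for c in counts]


# Hypothetical read starts on a 150-bp toy chromosome.
toy_counts = bin_counts([0, 10, 49, 50, 120], chrom_length=150)
toy_bpm = bpm(toy_counts)
```

Because every track is rescaled to the same total, BPM values can be compared between libraries of different sequencing depth.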
ChIP-Seq peak calling and sequence analysis
Raw reads were quality checked with MultiQC (73). We removed host DNA contamination from raw sequence reads by mapping against the rat and mouse genomes using Bowtie2 (74) with default parameters. Unmapped reads (0.3–5% of total reads) were aligned to the Pneumocystis genomes [P. murina (GenBank accession number GCA_000349005.2) and P. carinii (GCA_001477545.1)] using the Burrows-Wheeler Aligner BWA-MEM (75). Alignments were filtered and sorted using SAMTools (76), and duplicates were marked using PICARD MarkDuplicates (http://broadinstitute.github.io/picard). Quality controls were performed using deepTools plotFingerprint (77) and phantompeakqualtools (78), which estimates the normalized strand cross-correlation coefficient and the relative strand cross-correlation. DeepTools fingerprint plots showed that most reads are mapped within a small fraction of the genome (~3% of the total 7–8 Mb), which indicates that the antibodies demonstrated a restricted binding (Fig. S13C at https://doi.org/10.5281/zenodo.10574230). ChIP-Seq coverage data were calculated in 50-bp bins and normalized per genomic content (1× normalization). Chromosome ideograms were generated using chromoMap v0.3 (79). Enriched regions following IP were inspected using the IGV (Integrative Genomics Viewer) genome browser.
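The idea behind 1× normalization is to rescale coverage so that the genome-wide average depth equals one. The Python sketch below shows the scaling factor under simplifying assumptions: each mapped read is treated as one fragment of fixed length, the function names are hypothetical, and the example read count and the 7.5 Mb genome size are illustrative round numbers within the 7–8 Mb range mentioned above.

```python
def one_x_scale_factor(mapped_reads, fragment_length, effective_genome_size):
    """Factor that brings the average genome-wide coverage to 1x."""
    current_coverage = mapped_reads * fragment_length / effective_genome_size
    return 1.0 / current_coverage


def scale_coverage(coverage_per_bin, factor):
    """Apply the scaling factor to per-bin coverage values."""
    return [value * factor for value in coverage_per_bin]


# Hypothetical library: 400,000 fragments of 150 bp on a 7.5 Mb genome (8x depth).
factor = one_x_scale_factor(400_000, 150, 7_500_000)
scaled = scale_coverage([8.0, 16.0, 0.0], factor)
```

After scaling, a bin with the average depth has a value of 1, so a value of 2 reads directly as a twofold excess over the genome-wide mean.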
Because CENP-A is expected to produce sharp peaks, CENP-A ChIP-Seq peak calling was performed using MACS3 v3.0.0a5 (Model-Based Analysis for ChIP-Sequencing) (80) with the “callpeak” function with estimated genome sizes and sorted by fold enrichment (cut off 1.2) and FDR values (broad cutoff 0.1; FDR < 0.05) (raw data are provided in Supplementary data set 4). Enriched peaks were further inspected using deepTools bamCompare versus shuffled non-centromeric genomic regions. Peaks were further examined in conjunction with other data (e.g., conservation score, RNA-seq, methylation data, GC content, and repeats) with IGV (81), ensuring that peaks are present in all IP samples and absent in controls. Because H3K9me2/3 and H3K4me2 are histone modifications with broad distribution, we used SICER (Spatial-clustering method for Identification of ChIP-Enriched Regions) (82) with default parameters for peak calling (raw data are provided in Supplementary data set 4).
De novo DNA motif searches were performed using MEME v5.1.0 (83) and Homer v4.9 (84) with a P value cut off of 10⁻¹⁰. GC content across windows (bin) was computed using BEDTools (85). Normalized coverage and annotations across chromosomes were visualized using pyGenomeTracks (86).
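The windowed GC content calculation can be illustrated with a short Python sketch. It is a stand-in written for illustration rather than the BEDTools procedure itself; the function name, the window size, and the example sequence are hypothetical, and ambiguous bases such as N are excluded from the denominator by assumption.

```python
def gc_percent_windows(seq, window=50):
    """Return (start, end, GC percent) for consecutive non-overlapping windows."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        called = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        # Windows without any called base get None instead of a percentage.
        percent = 100.0 * gc / called if called else None
        results.append((start, start + len(chunk), percent))
    return results


# Hypothetical 8-bp sequence split into two 4-bp windows.
toy_gc = gc_percent_windows("GGCCAATT", window=4)
```

The half-open coordinates follow the BED convention, so the output can be compared directly with interval-based tracks.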
Repeated elements (DNA transposons, retrotransposons, and low complexity repeats) were identified using RepeatMasker (87) and RepBase (88).
Genes located in centromeres were identified from NCBI-annotated GFF3 files, which were converted to BED format using custom scripts. Overlaps between gene regions and centromere regions in BED format were identified using BEDtools intersect.
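The overlap test behind this step can be sketched in a few lines of Python. This is an illustrative stand-in for the interval intersection, not the custom scripts used here; it assumes BED-style half-open coordinates, and the function name, chromosome names, and coordinates are hypothetical.

```python
def overlapping_genes(genes, centromeres):
    """Return (gene, centromere) name pairs whose intervals overlap.

    Each interval is (chrom, start, end, name) with 0-based, half-open coordinates.
    """
    hits = []
    for g_chrom, g_start, g_end, g_name in genes:
        for c_chrom, c_start, c_end, c_name in centromeres:
            # Two half-open intervals overlap when each starts before the other ends.
            if g_chrom == c_chrom and g_start < c_end and c_start < g_end:
                hits.append((g_name, c_name))
    return hits


# Hypothetical annotations on one chromosome.
toy_genes = [
    ("chr2", 1000, 2500, "geneA"),
    ("chr2", 9000, 9800, "geneB"),
    ("chr2", 4000, 5000, "geneC"),
]
toy_centromeres = [("chr2", 2000, 8000, "CEN2")]
toy_hits = overlapping_genes(toy_genes, toy_centromeres)
```

The nested loop is adequate for a few thousand genes and 17 centromeres; larger inputs would call for sorted intervals or an interval tree.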
Genome synteny and conservation
DNA alignments were generated using Satsuma2 (89) and visualized with Circos (90). To estimate the sequence conservation scores, we performed one-to-one pairwise whole genome alignments using LAST (91) with the MAM4 seeding scheme (92). We considered the trio P. carinii/P. murina/P. wakefieldiae. We generated a neutral model using phyloFit (93) from PHAST, which fits the tree model to multiple sequence alignments by maximum likelihood using the REV substitution model. Sequence conservation scores were estimated using phastCons (94) (--target-coverage 0.25 --expected-length 20 --estimate-trees). Wig files and Bedgraph files were converted using “wigToBigWig” and “bigWigToBedGraph” tools (95).
Bisulfite sequencing
Pneumocystis DNA (5 µg) was extracted by a Pneumocystis DNA enrichment protocol (14). Bisulfite conversion, library preparation, and sequencing (Illumina NovaSeq 150 base paired end reads) were performed commercially (Novogene). Bisulfite conversion of non-methylated DNA was performed using the EZ DNA Methylation Kit. Data analysis was performed using BSMAP and methratio script (96). Only cytosine positions with more than five-fold coverage were considered. We used weighted methylation percentage (42), which is calculated as the number of reads supporting methylation over the number of cytosines sequenced to quantify methylation levels. Methylation data were partitioned over different genomic compartments using BEDTools (97). Statistical comparisons were performed using R (https://www.R-project.org) to compute non-parametric Mann-Whitney U test and Bonferroni correction to adjust P values for multiple comparisons.
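The weighted methylation percentage defined above can be written out as a short Python sketch. The function name and the read counts are hypothetical; the coverage filter keeps cytosine positions with more than five reads, following the criterion stated in the text.

```python
def weighted_methylation_percent(sites, min_coverage=5):
    """Weighted methylation percentage over cytosine positions.

    sites: iterable of (methylated_reads, total_reads), one pair per cytosine.
    Positions with total_reads <= min_coverage are ignored.
    """
    methylated = 0
    total = 0
    for meth_reads, all_reads in sites:
        if all_reads > min_coverage:
            methylated += meth_reads
            total += all_reads
    if total == 0:
        return float("nan")
    return 100.0 * methylated / total


# Hypothetical counts; the third site is dropped for low coverage.
toy_sites = [(0, 10), (1, 20), (2, 4), (0, 70)]
toy_level = weighted_methylation_percent(toy_sites)
```

Summing reads before dividing gives deeply covered positions more weight than shallow ones, which is what distinguishes this measure from a simple mean of per-site percentages.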
ChIP-qPCR
CENP-A enrichment and H3 depletion relative to H4 within the centromeres were evaluated using primers specific to centromeric and non-centromeric loci (control region taken 200 kb away from centromere 1 of each species) (see Table S5 at https://doi.org/10.5281/zenodo.10574230 for primers). Primers were designed using primer3 (98) from regions visualized using the IGV genome browser. Primers with potential matches against mammalian genomes were excluded using NCBI BLASTn. Dilutions of 1:10 were used based on preliminary test runs. Each reaction contained 5 µL of iTaq universal Sybr green supermix (Bio-Rad), 1.25 µL of each primer (500 nM), and 2.5 µL of DNA. All assays were run on a CFX96 thermocycler (Bio-Rad). Three technical replicates were run for each assay, and the standard errors of the mean were calculated. The PCR program was as follows: initial denaturation for 2 min at 95°C, followed by 30 cycles of 95°C for 30 seconds, 60°C for 30 seconds, and 72°C for 30 seconds. Locus ΔCt values were normalized using the following formula: ΔΔCt = ΔCt ChIP − ΔCt Input (https://www.sigmaaldrich.com/US/en/technical-documents/protocol/genomics/qpcr/chip-qpcr-data-analysis), and the fold enrichment of the ChIP DNA over the input was computed as log2(2^ΔΔCt). The plots and statistical analyses were performed with GraphPad Prism 8.
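The ΔΔCt calculation can be sketched in Python as follows. Sign conventions for ΔCt vary between protocols, so this sketch makes an explicit assumption: ΔCt is taken as the control-locus Ct minus the centromeric-locus Ct, which makes positive values correspond to enrichment. The function names and the Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean across technical replicates."""
    return stdev(values) / sqrt(len(values))


def delta_ct(ct_control_locus, ct_target_locus):
    """Assumed orientation: control minus target, so larger means more target DNA."""
    return mean(ct_control_locus) - mean(ct_target_locus)


def log2_fold_enrichment(chip_target, chip_control, input_target, input_control):
    """ddCt = dCt(ChIP) - dCt(Input); log2(2 ** ddCt) simplifies to ddCt."""
    return delta_ct(chip_control, chip_target) - delta_ct(input_control, input_target)


# Hypothetical Ct triplicates for a centromeric locus and a control locus.
chip_cen = [22.0, 22.1, 21.9]
chip_ctl = [26.0, 26.1, 25.9]
input_cen = [24.0, 24.0, 24.0]
input_ctl = [24.0, 24.0, 24.0]
enrichment = log2_fold_enrichment(chip_cen, chip_ctl, input_cen, input_ctl)
```

With these toy values, the centromeric locus amplifies about four cycles earlier than the control locus in the ChIP sample but not in the input, which corresponds to a log2 enrichment of about 4.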
ChIP-LC-MS
Formalin-fixed purified chromatin preparations were separately co-immunoprecipitated with anti-CENP-A, H4, and pre-immune serum (control) using the Pierce Direct Magnetic IP/Co-IP Kit (Thermo Fisher). Each IP reaction was performed separately on aliquots from the same chromatin preparation. Elution was performed using 1% formic acid. Extracts were frozen in dry ice, lyophilized for 1 hour, digested using trypsin, diluted, and injected to a Thermo Orbitrap Fusion Liquid Chromatography with tandem mass spectrometry (LC-MS/MS) to identify unique peptides. LC-MS/MS data were searched against the proteomes using Proteome Discoverer 2.4 (Thermo Fisher Scientific, Waltham, MA). Label-free quantification was analyzed based on the peak intensity of the precursor ion.
ACKNOWLEDGMENTS
We thank Rene Costello for animal care and Drs. Keytam Awad, Salina Gairhe, and Shuibang Wang for the discussion about ChIP-Seq. This work has been funded in whole or in part with federal funds from the Intramural Research Program of the US National Institutes of Health (NIH) Clinical Center, the National Institute of Allergy and Infectious Diseases (NIAID), and the National Cancer Institute. The views expressed in this work are those of the authors and do not necessarily represent the official positions of the NIH or U.S. Government.
This work was supported in part by VA I01 BX004441 (MTC). MTC is a Senior Research Career Scientist supported by IK6BX005232 Department of Veterans Affairs.
This study used the Office of Cyber Infrastructure and Computational Biology (OCICB) High Performance Computing (HPC) cluster at the National Institute of Allergy and Infectious Diseases (NIAID), Bethesda, MD. This study also utilized the high-performance computational capabilities of the Biowulf Linux cluster at the NIH, Bethesda, MD (http://biowulf.nih.gov).
Contributor Information
Ousmane H. Cissé, Email: ousmane.cisse@nih.gov.
Joseph A. Kovacs, Email: jkovacs@mail.nih.gov.
Louis M. Weiss, Albert Einstein College of Medicine, Bronx, New York, USA
DATA AVAILABILITY
ChIP-Seq, RNA-Seq, and methylation sequencing raw data and associated files generated in this study are available at NCBI GEO Series GSE255275. Code generated for this project is available at GitHub: https://github.com/ocisse/pneumo-cens
SUPPLEMENTAL MATERIAL
The following material is available online at https://doi.org/10.1128/mbio.03185-23.
Figures S1 to S13
ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.
REFERENCES
- 1. Gordon JL, Byrne KP, Wolfe KH. 2011. Mechanisms of chromosome number evolution in yeast. PLoS Genet 7:e1002190. doi: 10.1371/journal.pgen.1002190
- 2. Sankaranarayanan SR, Ianiri G, Coelho MA, Reza MH, Thimmappa BC, Ganguly P, Vadnala RN, Sun S, Siddharthan R, Tellgren-Roth C, Dawson TL, Heitman J, Sanyal K. 2020. Loss of centromere function drives karyotype evolution in closely related Malassezia species. Elife 9:e53944. doi: 10.7554/eLife.53944
- 3. Selmecki A, Forche A, Berman J. 2006. Aneuploidy and isochromosome formation in drug-resistant Candida albicans. Science 313:367–370. doi: 10.1126/science.1128242
- 4. McKinley KL, Cheeseman IM. 2016. The molecular basis for centromere identity and function. Nat Rev Mol Cell Biol 17:16–29. doi: 10.1038/nrm.2015.5
- 5. Clarke L, Carbon J. 1980. Isolation of a yeast centromere and construction of functional small circular chromosomes. Nature 287:504–509. doi: 10.1038/287504a0
- 6. Kursel LE, Malik HS. 2016. Centromeres. Curr Biol 26:R487–R490. doi: 10.1016/j.cub.2016.05.031
- 7. Sanyal K, Baum M, Carbon J. 2004. Centromeric DNA sequences in the pathogenic yeast Candida albicans are all different and unique. Proc Natl Acad Sci U S A 101:11374–11379. doi: 10.1073/pnas.0404318101
- 8. Folco HD, Pidoux AL, Urano T, Allshire RC. 2008. Heterochromatin and RNAi are required to establish CENP-A chromatin at centromeres. Science 319:94–97. doi: 10.1126/science.1150944
- 9. Yadav V, Sun S, Billmyre RB, Thimmappa BC, Shea T, Lintner R, Bakkeren G, Cuomo CA, Heitman J, Sanyal K. 2018. RNAi is a critical determinant of centromere evolution in closely related fungi. Proc Natl Acad Sci U S A 115:3108–3113. doi: 10.1073/pnas.1713725115
- 10. Shen X-X, Steenwyk JL, LaBella AL, Opulente DA, Zhou X, Kominek J, Li Y, Groenewald M, Hittinger CT, Rokas A. 2020. Genome-scale phylogeny and contrasting modes of genome evolution in the fungal phylum Ascomycota. Sci Adv 6:eabd0079. doi: 10.1126/sciadv.abd0079
- 11. Cissé OH, Ma L, Dekker JP, Khil PP, Youn J-H, Brenchley JM, Blair R, Pahar B, Chabé M, Van Rompay KKA, Keesler R, Sukura A, Hirsch V, Kutty G, Liu Y, Peng L, Chen J, Song J, Weissenbacher-Lang C, Xu J, Upham NS, Stajich JE, Cuomo CA, Cushion MT, Kovacs JA. 2021. Genomic insights into the host specific adaptation of the Pneumocystis genus. Commun Biol 4:305. doi: 10.1038/s42003-021-01799-7
- 12. Cissé OH, Pagni M, Hauser PM. 2012. De novo assembly of the Pneumocystis jirovecii genome from a single bronchoalveolar lavage fluid specimen from a patient. mBio 4:e00428-12. doi: 10.1128/mBio.00428-12
- 13. Slaven BE, Meller J, Porollo A, Sesterhenn T, Smulian AG, Cushion MT. 2006. Draft assembly and annotation of the Pneumocystis carinii genome. J Eukaryot Microbiol 53:S89–S91. doi: 10.1111/j.1550-7408.2006.00184.x
- 14. Ma L, Chen Z, Huang DW, Kutty G, Ishihara M, Wang H, Abouelleil A, Bishop L, Davey E, Deng R, et al. 2016. Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts. Nat Commun 7:10740. doi: 10.1038/ncomms10740
- 15. Varoquaux N, Liachko I, Ay F, Burton JN, Shendure J, Dunham MJ, Vert J-P, Noble WS. 2015. Accurate identification of centromere locations in yeast genomes using Hi-C. Nucleic Acids Res 43:5331–5339. doi: 10.1093/nar/gkv424
- 16. Almeida J, Cissé OH, Fonseca Á, Pagni M, Hauser PM. 2015. Comparative genomics suggests primary homothallism of Pneumocystis species. mBio 6:e02250-14. doi: 10.1128/mBio.02250-14
- 17. Hayashi MT, Takahashi TS, Nakagawa T, Nakayama J, Masukata H. 2009. The heterochromatin protein Swi6/HP1 activates replication origins at the pericentromeric region and silent mating-type locus. Nat Cell Biol 11:357–362. doi: 10.1038/ncb1845
- 18. Cissé OH, Pagni M, Hauser PM. 2014. Comparative genomics suggests that the human pathogenic fungus Pneumocystis jirovecii acquired obligate biotrophy through gene loss. Genome Biol Evol 6:1938–1948. doi: 10.1093/gbe/evu155
- 19. Liu X, McLeod I, Anderson S, Yates JR, He X. 2005. Molecular analysis of kinetochore architecture in fission yeast. EMBO J 24:2919–2930. doi: 10.1038/sj.emboj.7600762
- 20. Shiroiwa Y, Hayashi T, Fujita Y, Villar-Briones A, Ikai N, Takeda K, Ebe M, Yanagida M. 2011. Mis17 is a regulatory module of the Mis6-Mal2-Sim4 centromere complex that is required for the recruitment of CenH3/CENP-A in fission yeast. PLoS One 6:e17761. doi: 10.1371/journal.pone.0017761
- 21. Casola C, Hucks D, Feschotte C. 2008. Convergent domestication of pogo-like transposases into centromere-binding proteins in fission yeast and mammals. Mol Biol Evol 25:29–41. doi: 10.1093/molbev/msm221
- 22. Leverson JD, Huang H, Forsburg SL, Hunter T. 2002. The Schizosaccharomyces pombe aurora-related kinase Ark1 interacts with the inner centromere protein Pic1 and mediates chromosome segregation and cytokinesis. Mol Biol Cell 13:1132–1143. doi: 10.1091/mbc.01-07-0330
- 23. Hayashi T, Fujita Y, Iwasaki O, Adachi Y, Takahashi K, Yanagida M. 2004. Mis16 and Mis18 are required for CENP-A loading and histone deacetylation at centromeres. Cell 118:715–729. doi: 10.1016/j.cell.2004.09.002
- 24. Williams JS, Hayashi T, Yanagida M, Russell P. 2009. Fission yeast Scm3 mediates stable assembly of Cnp1/CENP-A into centromeric chromatin. Mol Cell 33:287–298. doi: 10.1016/j.molcel.2009.01.017
- 25. Corbett KD, Yip CK, Ee L-S, Walz T, Amon A, Harrison SC. 2010. The monopolin complex crosslinks kinetochore components to regulate chromosome-microtubule attachments. Cell 142:556–567. doi: 10.1016/j.cell.2010.07.017
- 26. Ponger L, Li WH. 2005. Evolutionary diversification of DNA methyltransferases in eukaryotic genomes. Mol Biol Evol 22:1119–1128. doi: 10.1093/molbev/msi098
- 27. Bewick AJ, Hofmeister BT, Powers RA, Mondo SJ, Grigoriev IV, James TY, Stajich JE, Schmitz RJ. 2019. Diversity of cytosine methylation across the fungal tree of life. Nat Ecol Evol 3:479–490. doi: 10.1038/s41559-019-0810-9
- 28. Freitag M. 2017. Histone methylation by SET domain proteins in fungi. Annu Rev Microbiol 71:413–439. doi: 10.1146/annurev-micro-102215-095757
- 29. Kovacs JA, Halpern JL, Lundgren B, Swan JC, Parrillo JE, Masur H. 1989. Monoclonal antibodies to Pneumocystis carinii: identification of specific antigens and characterization of antigenic differences between rat and human isolates. J Infect Dis 159:60–70. doi: 10.1093/infdis/159.1.60
- 30. Lando D, Endesfelder U, Berger H, Subramanian L, Dunne PD, McColl J, Klenerman D, Carr AM, Sauer M, Allshire RC, Heilemann M, Laue ED. 2012. Quantitative single-molecule microscopy reveals that CENP-A(Cnp1) deposition occurs during G2 in fission yeast. Open Biol 2:120078. doi: 10.1098/rsob.120078
- 31. Castillo AG, Mellone BG, Partridge JF, Richardson W, Hamilton GL, Allshire RC, Pidoux AL. 2007. Plasticity of fission yeast CENP-A chromatin driven by relative levels of histone H3 and H4. PLoS Genet 3:e121. doi: 10.1371/journal.pgen.0030121
- 32. Folco HD, Campbell CS, May KM, Espinoza CA, Oegema K, Hardwick KG, Grewal SIS, Desai A. 2015. The CENP-A N-tail confers epigenetic stability to centromeres via the CENP-T branch of the CCAN in fission yeast. Curr Biol 25:348–356. doi: 10.1016/j.cub.2014.11.060
- 33. Funabiki H, Hagan I, Uzawa S, Yanagida M. 1993. Cell cycle-dependent specific positioning and clustering of centromeres and telomeres in fission yeast. J Cell Biol 121:961–976. doi: 10.1083/jcb.121.5.961
- 34. Mizuguchi T, Fudenberg G, Mehta S, Belton J-M, Taneja N, Folco HD, FitzGerald P, Dekker J, Mirny L, Barrowman J, Grewal SIS. 2014. Cohesin-dependent globules and heterochromatin shape 3D genome architecture in S. pombe. Nature 516:432–435. doi: 10.1038/nature13833
- 35. Vestereng VH, Bishop LR, Hernandez B, Kutty G, Larsen HH, Kovacs JA. 2004. Quantitative real-time polymerase chain-reaction assay allows characterization of Pneumocystis infection in immunocompetent mice. J Infect Dis 189:1540–1544. doi: 10.1086/382486
- 36. Cam HP, Sugiyama T, Chen ES, Chen X, FitzGerald PC, Grewal SIS. 2005. Comprehensive analysis of heterochromatin- and RNAi-mediated epigenetic control of the fission yeast genome. Nat Genet 37:809–819. doi: 10.1038/ng1602
- 37. Thakur J, Talbert PB, Henikoff S. 2015. Inner kinetochore protein interactions with regional centromeres of fission yeast. Genetics 201:543–561. doi: 10.1534/genetics.115.179788
- 38. Scott KC, Bloom KS. 2014. Lessons learned from counting molecules: how to lure CENP-A into the kinetochore. Open Biol 4:140191. doi: 10.1098/rsob.140191
- 39. Yasuhara JC, Wakimoto BT. 2008. Molecular landscape of modified histones in Drosophila heterochromatic genes and euchromatin-heterochromatin transition zones. PLoS Genet 4:e16. doi: 10.1371/journal.pgen.0040016
- 40. Miura A, Yonebayashi S, Watanabe K, Toyama T, Shimada H, Kakutani T. 2001. Mobilization of transposons by a mutation abolishing full DNA methylation in Arabidopsis. Nature 411:212–214. doi: 10.1038/35075612
- 41. Smith KM, Galazka JM, Phatale PA, Connolly LR, Freitag M. 2012. Centromeres of filamentous fungi. Chromosome Res 20:635–656. doi: 10.1007/s10577-012-9290-3
- 42. Cook DE, Kramer HM, Torres DE, Seidl MF, Thomma B. 2020. A unique chromatin profile defines adaptive genomic regions in a fungal plant pathogen. Elife 9:e62208. doi: 10.7554/eLife.62208
- 43. Ma L, Cissé OH, Kovacs JA. 2018. A molecular window into the biology and epidemiology of Pneumocystis spp. Clin Microbiol Rev 31:e00009-18. doi: 10.1128/CMR.00009-18
- 44. Schotanus K, Soyer JL, Connolly LR, Grandaubert J, Happel P, Smith KM, Freitag M, Stukenbrock EH. 2015. Histone modifications rather than the novel regional centromeres of Zymoseptoria tritici distinguish core and accessory chromosomes. Epigenetics Chromatin 8:41. doi: 10.1186/s13072-015-0033-5
- 45. Seidl MF, Kramer HM, Cook DE, Fiorin GL, van den Berg GCM, Faino L, Thomma B. 2020. Repetitive elements contribute to the diversity and evolution of centromeres in the fungal genus Verticillium. mBio 11:e01714-20. doi: 10.1128/mBio.01714-20
- 46. Pidoux AL, Allshire RC. 2004. Kinetochore and heterochromatin domains of the fission yeast centromere. Chromosome Res 12:521–534. doi: 10.1023/B:CHRO.0000036586.81775.8b
- 47. Teytelman L, Thurtle DM, Rine J, van Oudenaarden A. 2013. Highly expressed loci are vulnerable to misleading ChIP localization of multiple unrelated proteins. Proc Natl Acad Sci U S A 110:18602–18607. doi: 10.1073/pnas.1316064110
- 48. Nagaki K, Cheng Z, Ouyang S, Talbert PB, Kim M, Jones KM, Henikoff S, Buell CR, Jiang J. 2004. Sequencing of a rice centromere uncovers active genes. Nat Genet 36:138–145. doi: 10.1038/ng1289
- 49. Kovacs JA, Halpern JL, Swan JC, Moss J, Parrillo JE, Masur H. 1988. Identification of antigens and antibodies specific for Pneumocystis carinii. J Immunol 140:2023–2031.
- 50. Liu Y, Davis AS, Ma L, Bishop L, Cissé OH, Kutty G, Kovacs JA. 2020. MUC1 mediates Pneumocystis murina binding to airway epithelial cells. Cell Microbiol 22:e13182. doi: 10.1111/cmi.13182
- 51. Sabatinos SA, Forsburg SL. 2010. Molecular genetics of Schizosaccharomyces pombe. Methods Enzymol 470:759–795. doi: 10.1016/S0076-6879(10)70032-X
- 52. Takayama Y, Sato H, Saitoh S, Ogiyama Y, Masuda F, Takahashi K. 2008. Biphasic incorporation of centromeric histone CENP-A in fission yeast. Mol Biol Cell 19:682–690. doi: 10.1091/mbc.e07-05-0504
- 53. Sakuno T, Tada K, Watanabe Y. 2009. Kinetochore geometry defined by cohesion within the centromere. Nature 458:852–858. doi: 10.1038/nature07876
- 54. Folco HD, McCue A, Balachandran V, Grewal SIS. 2019. Cohesin impedes heterochromatin assembly in fission yeast cells lacking Pds5. Genetics 213:127–141. doi: 10.1534/genetics.119.302256
- 55. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402. doi: 10.1093/nar/25.17.3389
- 56. Emms DM, Kelly S. 2019. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol 20:238. doi: 10.1186/s13059-019-1832-y
- 57. Hernández-Plaza A, Szklarczyk D, Botas J, Cantalapiedra CP, Giner-Lamia J, Mende DR, Kirsch R, Rattei T, Letunic I, Jensen LJ, Bork P, von Mering C, Huerta-Cepas J. 2023. eggNOG 6.0: enabling comparative genomics across 12 535 organisms. Nucleic Acids Res 51:D389–D394. doi: 10.1093/nar/gkac1022
- 58. Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, Quinn AF, Sangrador-Vegas A, Scheremetjew M, Yong S-Y, Lopez R, Hunter S. 2014. InterProScan 5: genome-scale protein function classification. Bioinformatics 30:1236–1240. doi: 10.1093/bioinformatics/btu031
- 59. Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30:3059–3066. doi: 10.1093/nar/gkf436
- 60. Price MN, Dehal PS, Arkin AP. 2009. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 26:1641–1650. doi: 10.1093/molbev/msp077
- 61. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol 32:268–274. doi: 10.1093/molbev/msu300
- 62. Postberg J, Forcob S, Chang WJ, Lipps HJ. 2010. The evolutionary history of histone H3 suggests a deep eukaryotic root of chromatin modifying mechanisms. BMC Evol Biol 10:259. doi: 10.1186/1471-2148-10-259
- 63. Eddy SR. 2011. Accelerated profile HMM searches. PLoS Comput Biol 7:e1002195. doi: 10.1371/journal.pcbi.1002195
- 64. Gertz EM, Yu Y-K, Agarwala R, Schäffer AA, Altschul SF. 2006. Composition-based statistics and translated nucleotide searches: improving the TBLASTN module of BLAST. BMC Biol 4:41. doi: 10.1186/1741-7007-4-41
- 65. Slater GSC, Birney E. 2005. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6:31. doi: 10.1186/1471-2105-6-31
- 66. El-Gebali S, Mistry J, Bateman A, Eddy SR, Luciani A, Potter SC, Qureshi M, Richardson LJ, Salazar GA, Smart A, Sonnhammer ELL, Hirsh L, Paladin L, Piovesan D, Tosatto SCE, Finn RD. 2019. The Pfam protein families database in 2019. Nucleic Acids Res 47:D427–D432. doi: 10.1093/nar/gky995
- 67. Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797. doi: 10.1093/nar/gkh340
- 68. Kovacs JA, Halpern JL, Lundgren B, Swan JC, Parrillo JE, Masur H. 1989. Monoclonal antibodies to Pneumocystis carinii: identification of specific antigens and characterization of antigenic differences between rat and human isolates. J Infect Dis 159:60–70. doi: 10.1093/infdis/159.1.60
- 69. Linke MJ, Sunkin SM, Andrews RP, Stringer JR, Walzer PD. 1998. Expression, structure, and location of epitopes of the major surface glycoprotein of Pneumocystis carinii f. sp. carinii. Clin Diagn Lab Immunol 5:50–57. doi: 10.1128/CDLI.5.1.50-57.1998
- 70. Tran PT, Paoletti A, Chang F. 2004. Imaging green fluorescent protein fusions in living fission yeast cells. Methods 33:220–225. doi: 10.1016/j.ymeth.2003.11.017
- 71. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. 2013. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21. doi: 10.1093/bioinformatics/bts635
- 72. Bray NL, Pimentel H, Melsted P, Pachter L. 2016. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 34:525–527. doi: 10.1038/nbt.3519
- 73. Ewels P, Magnusson M, Lundin S, Käller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32:3047–3048. doi: 10.1093/bioinformatics/btw354
- 74. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. doi: 10.1038/nmeth.1923
- 75. Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997. doi: 10.48550/arXiv.1303.3997
- 76. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079. doi: 10.1093/bioinformatics/btp352
- 77. Ramírez F, Ryan DP, Grüning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dündar F, Manke T. 2016. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44:W160–W165. doi: 10.1093/nar/gkw257
- 78. Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, Bernstein BE, Bickel P, Brown JB, Cayting P, et al. 2012. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res 22:1813–1831. doi: 10.1101/gr.136184.111
- 79. Anand L, Rodriguez Lopez CM. 2022. chromoMap: An R package for interactive visualization and annotation of Chromosomes. BMC Bioinformatics 23:33. doi: 10.1186/s12859-021-04556-z
- 80. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM, Brown M, Li W, Liu XS. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Biol 9:R137. doi: 10.1186/gb-2008-9-9-r137
- 81. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. 2011. Integrative genomics viewer. Nat Biotechnol 29:24–26. doi: 10.1038/nbt.1754
- 82. Zang C, Schones DE, Zeng C, Cui K, Zhao K, Peng W. 2009. A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25:1952–1958. doi: 10.1093/bioinformatics/btp340
- 83. Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. 2009. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37:W202–W208. doi: 10.1093/nar/gkp335
- 84. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. 2010. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38:576–589. doi: 10.1016/j.molcel.2010.05.004
- 85. Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841–842. doi: 10.1093/bioinformatics/btq033
- 86. Lopez-Delisle L, Rabbani L, Wolff J, Bhardwaj V, Backofen R, Grüning B, Ramírez F, Manke T. 2021. pyGenomeTracks: reproducible plots for multivariate genomic datasets. Bioinformatics 37:422–423. doi: 10.1093/bioinformatics/btaa692
- 87. Smit AFA, Hubley R, Green P. 2013-2015. RepeatMasker open-4.0.
- 88. Bao W, Kojima KK, Kohany O. 2015. Repbase update, a database of repetitive elements in eukaryotic genomes. Mob DNA 6:11. doi: 10.1186/s13100-015-0041-9
- 89. Grabherr MG, Russell P, Meyer M, Mauceli E, Alföldi J, Di Palma F, Lindblad-Toh K. 2010. Genome-wide synteny through highly sensitive sequence alignment: Satsuma. Bioinformatics 26:1145–1151. doi: 10.1093/bioinformatics/btq102
- 90. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA. 2009. Circos: an information aesthetic for comparative genomics. Genome Res 19:1639–1645. doi: 10.1101/gr.092759.109
- 91. Kiełbasa SM, Wan R, Sato K, Horton P, Frith MC. 2011. Adaptive seeds tame genomic sequence comparison. Genome Res 21:487–493. doi: 10.1101/gr.113985.110
- 92. Frith MC, Noé L. 2014. Improved search heuristics find 20,000 new alignments between human and mouse genomes. Nucleic Acids Res 42:e59. doi: 10.1093/nar/gku104
- 93. Siepel A, Haussler D. 2004. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol 21:468–488. doi: 10.1093/molbev/msh039
- 94. Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, Haussler D. 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15:1034–1050. doi: 10.1101/gr.3715005
- 95. Kent WJ, Zweig AS, Barber G, Hinrichs AS, Karolchik D. 2010. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26:2204–2207. doi: 10.1093/bioinformatics/btq351
- 96. Xi Y, Li W. 2009. BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics 10:232. doi: 10.1186/1471-2105-10-232
- 97. Quinlan AR. 2014. BEDTools: the Swiss-army tool for genome feature analysis. Curr Protoc Bioinformatics 47:11. doi: 10.1002/0471250953.bi1112s47
- 98. Kõressaar T, Lepamets M, Kaplinski L, Raime K, Andreson R, Remm M. 2018. Primer3_Masker: integrating masking of template sequence with primer design software. Bioinformatics 34:1937–1938. doi: 10.1093/bioinformatics/bty036