mBio. 2021 Apr 20;12(2):e00586-21. doi: 10.1128/mBio.00586-21

Intraspecies Transcriptional Profiling Reveals Key Regulators of Candida albicans Pathogenic Traits

Joshua M Wang a,*,#, Andrew L Woodruff a,#, Matthew J Dunn a, Robert J Fillinger a, Richard J Bennett b, Matthew Z Anderson a,c
Editor: Michael Lorenz d
PMCID: PMC8092256  PMID: 33879584

Infectious fungal species are often treated uniformly despite clear evidence of genotypic and phenotypic heterogeneity being widespread across strains. Identifying the genetic basis for this phenotypic diversity is extremely challenging because of the tens or hundreds of thousands of variants that may distinguish two strains.

KEYWORDS: Candida, coexpression networks, gene expression, transcriptional networks, variation

ABSTRACT

The human commensal and opportunistic fungal pathogen Candida albicans displays extensive genetic and phenotypic variation across clinical isolates. Here, we performed RNA sequencing on 21 well-characterized isolates to examine how genetic variation contributes to gene expression differences and to link these differences to phenotypic traits. C. albicans adapts primarily through clonal evolution, and yet hierarchical clustering of gene expression profiles in this set of isolates did not reproduce their phylogenetic relationship. Strikingly, strain-specific gene expression was prevalent in some strain backgrounds. Association of gene expression with phenotypic data by differential analysis, linear correlation, and assembly of gene networks connected both previously characterized and novel genes with 23 C. albicans traits. Construction of de novo gene modules produced a gene atlas incorporating 67% of C. albicans genes and revealed correlations between expression modules and important phenotypes such as systemic virulence. Furthermore, targeted investigation of two modules that have novel roles in growth and filamentation supported our bioinformatic predictions. Together, these studies reveal widespread transcriptional variation across C. albicans isolates and identify genetic and epigenetic links to phenotypic variation based on coexpression network analysis.

INTRODUCTION

Candida albicans resides within the oral cavity, gastrointestinal tract, and genitourinary tract and on the skin of its human host as a commensal species (1). Development of an immunocompromised state can lead to C. albicans overgrowth of these same niches, producing debilitating mucosal infections and life-threatening bloodstream infections (2, 3). Critical to its success as both a ubiquitous commensal and opportunistic pathogen of multiple body sites is the ability of C. albicans to persist and proliferate in a wide range of physiological temperatures, oxic environments, nutrient availabilities, and pH conditions (4–6).

Clinical isolates of C. albicans represent a genetically diverse collection of heterozygous diploid organisms that can be separated into 17 clades by multilocus sequence typing (MLST), with Clade I making up the majority of typed isolates (7–9). Recent sequencing efforts have examined genomes from across the C. albicans phylogeny (10, 11). Analysis of these genomes supports a primarily clonal lifestyle for C. albicans, with occasional interclade mating generating recombinant genomes in a subset of isolates (10, 12). Thus, C. albicans evolves principally through the acquisition and accumulation of iterative mutations, leading to expanded genotypic diversity over time.

This genotypic diversity contributes to extensive phenotypic variation among C. albicans isolates, including an assortment of alternative cell states associated with distinct colonization and pathogenic traits (11, 13–21). Some phenotypes are biased toward specific C. albicans clades (22, 23). For example, inherent resistance to the antifungal 5-flucytosine (5-FC) is mediated by a single missense mutation in FUR1 found ubiquitously across Clade I strains but absent in those from other clades (24, 25). In contrast, most phenotypes are heterogeneous both within and across C. albicans clades (11, 26–28), suggesting multilocus control of these traits. This incongruence between genetic and phenotypic similarity in C. albicans deviates significantly from other asexual species in which phylogenetic conservation has been used to predict phenotypic traits (29–31). It has also complicated large-scale investigations of the underlying polymorphisms that contribute to C. albicans phenotypic diversity and limited identification of genotype-phenotype relationships (10, 11, 23). Instead, phenotypic diversity may associate more strongly with other molecular signatures such as gene expression and protein abundance (32–34).

The ability to rapidly respond to environmental cues is central to microbial adaptive potential. C. albicans adopts distinct transcriptional profiles in different cell states or when cultured under different physiologically relevant conditions (13, 35, 36). Altered transcriptional states can be detected as early as 5 min following exposure to new environments (37–40). Distinct transcriptional responses in C. albicans are also observed in response to cues in the host, and these may contribute to colonization and pathogenesis in different niches (41–43).

Altered expression of hundreds of genes following environmental shifts complicates distinguishing the regulatory genes that govern these transcriptional changes from downstream effectors. Defining the genetic regulons associated with specific transcription factors or responses has typically relied on a simple model of conditional expression focused on a single gene or environmental condition (44–46), while the broader transcriptional architecture of C. albicans cells remains largely undefined. Concerted efforts to determine the transcriptional regulation of phenotypic switching between the C. albicans “white” and “opaque” states or between planktonic and biofilm communities have revealed the existence of highly interconnected transcription factor networks that collectively control differentiation between these states (47–51). Genes within these circuits encode some of the most well-characterized transcription factors in C. albicans and yet account for only a small fraction of the complete repertoire of transcriptional regulators. Thus, integration of large-scale expression data across C. albicans isolates could aid in elucidating the transcriptional networks underlying the regulatory architecture of this important human pathogen.

Here, we describe transcriptional profiling of 21 C. albicans isolates representing five clades with significant genotypic and phenotypic diversity (11). Gene expression profiles of these strains did not reflect their phylogenetic relationships at either the strain or clade level. Moreover, differential gene expression of up to 35% of the annotated genes was found between any two strains grown under identical conditions, with several strains displaying extensive strain-specific gene expression. Transcriptional differences between strains were associated with specific phenotypes that corroborate previous experimental studies and also predicted new molecular functions related to pathogenesis. Furthermore, unbiased clustering of genes based on correlated gene expression levels revealed a transcriptional map of cellular functions from which coexpression modules were linked to pathogen-associated phenotypes. Experimental investigation of two coexpression modules uncovered new regulators of filamentation and a cell state-specific module and found that these contribute to intraspecies phenotypic variation in C. albicans.

RESULTS

A previous investigation sequenced the genomes of 21 C. albicans isolates and identified widespread genetic and phenotypic variation among the strain set (11). Candidate gene approaches identified one strain with a homozygous nonsense mutation in the transcriptional regulator EFG1 that caused a defect in filamentation and increased commensal fitness while decreasing systemic virulence (11). More recently, loss of EFG1 function was also linked to formation of the “gray” phenotypic state in clinical isolates (13). However, broader attempts to link genetic polymorphisms to phenotypic differences present a significant challenge as multiple loci may regulate a single trait. Consequently, many of the causative polymorphisms contributing to phenotypic variation remain unknown. To gain greater insight into the underlying basis of phenotypic diversity in C. albicans, we transcriptionally profiled the set of 21 isolates with diverse geographical origins, sites of infection, and clade designations within the C. albicans phylogeny (see Fig. S1 in the supplemental material).

FIG S1

Phylogenetic relationship of C. albicans strains used in this study. The phylogenetic relationship of the 21 C. albicans isolates used for transcriptional profiling is shown based on comparison of full-genome sequences. Bootstrap support for each node is indicated. Assignments of isolates to fingerprinting clades are color coded.

Copyright © 2021 Wang et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

Gene expression does not reflect genetic relatedness.

To compare gene expression across the 21 isolates, RNA was harvested from cells cultured in rich medium (yeast extract-peptone-dextrose [YPD], 30°C) in exponential phase. Transcript abundances were averaged between biological duplicates and binned across the 6,468 genes. The largest fraction of transcripts in the SC5314 reference genome were present at low but detectable expression levels (10 to 100 transcripts per million [TPM]), although the number of genes within each expression range fluctuated considerably among strains (Fig. S2A; see also Table S1 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS1/14211179/1). For example, P37037 expressed 25.3% of its genes at fewer than 10 TPM whereas this proportion increased to ∼50% in GC75. Differential binning of gene expression occurred even among strains in the same clade (e.g., compare Clade II strains P57072 and P76067), suggesting that large changes in genome-wide transcript abundance exist even between closely related strains.
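The binning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names, the length-normalized TPM calculation from raw counts, and the bin edges (only the 10 to 100 TPM range is named in the text) are assumptions made for the example.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per kilobase: normalize each gene's count by its length.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    # Rescale so that the values for one sample sum to one million.
    scale = sum(rpk) / 1_000_000.0
    return [r / scale for r in rpk]


# Hypothetical bin edges; the last bin is open-ended (>= 1,000 TPM).
BIN_EDGES = [0, 1, 10, 100, 1000]


def bin_fractions(tpm_values, edges=BIN_EDGES):
    """Return the fraction of genes falling into each TPM bin.

    Bin i covers [edges[i], edges[i + 1]); the final bin has no upper limit.
    Values are assumed to be non-negative.
    """
    counts = [0] * len(edges)
    for value in tpm_values:
        # Index of the last edge that is <= value.
        idx = max(i for i, edge in enumerate(edges) if value >= edge)
        counts[idx] += 1
    total = len(tpm_values)
    return [c / total for c in counts]


if __name__ == "__main__":
    sample = tpm([10, 200, 3000, 0], [1000, 2000, 1500, 500])
    print(bin_fractions(sample))
```

Comparing the per-bin fractions between two strains gives the kind of contrast reported for P37037 and GC75.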

FIG S2

Correlation of gene expression with phylogenetic relationships among the C. albicans isolates. (A) Read counts were calculated for all genes from each strain and binned based on the value of transcripts per million (TPM). The fraction of reads within each bin was then plotted per strain. Clade assignments for each strain are color coded as indicated. (B) Similarity in transcript profiles among the 42 biological samples was assessed by hierarchical clustering of TPM values using Euclidean distance and average linkage. One thousand bootstraps were performed. The resulting bootstrap values are shown in green, and corresponding approximately unbiased (AU) P values are shown in red at each node. (C) A heat map represents the RNA transcripts per million (TPM) of the 50 genes with the greatest difference in expression among the 21 isolates on a log2 scale. The expression for each strain is the average for two biological replicates. The strains are ordered based on their phylogenetic relationships, and their clade assignments are color coded. (D) The 32 genes whose expression significantly correlated with the strain phylogeny are listed. Genes that contributed to enrichment of the gene ontology (GO) terms associated with this list are boldfaced. Significant GO categories are listed.

To determine if gene expression patterns were reflective of genetic relatedness, hierarchical clustering of genome-wide TPM values was performed. Similarity in gene expression profiles failed to reproduce the genetic phylogeny of these strains when averaged between replicates (Fig. 1A) or as individual samples (Fig. S2B). Variability in low-abundance transcripts was not responsible for obscuring phylogenetic similarity as none of the 50 genes with the greatest dynamic range in expression recapitulated the phylogenetic tree (Fig. S2C). In fact, averaged expression of only 0.5% of all genes (31 of 6,468) was associated with phylogenetic similarity, and these genes were functionally enriched for transcriptional regulation by glucose (Fig. S2D).
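The pairwise similarity that feeds such a clustering can be illustrated with a rank-based (Spearman) correlation between expression profiles, the measure named in the legend of Fig. 1. The sketch below is a plain-Python illustration under stated assumptions: the function names and the dictionary layout are hypothetical, and the clustering itself (average linkage, bootstrapping) is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def pairwise_similarity(profiles):
    """profiles: {strain: [TPM per gene, same gene order for all strains]}."""
    names = sorted(profiles)
    return {
        (a, b): spearman(profiles[a], profiles[b])
        for idx, a in enumerate(names)
        for b in names[idx + 1:]
    }
```

A distance such as 1 minus the correlation could then be passed to any average-linkage clustering routine; that conversion is a common convention and is assumed here, not taken from the text.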

FIG 1

Gene expression does not reflect strain phylogeny. (A) Hierarchical clustering of strains by Spearman’s correlation and average distances was performed for transcript abundance of the 6,468 genes across C. albicans strains using averaged values between replicates. Clade designations based on reported fingerprinting clades (FP) are indicated by color. (B) Gene expression was averaged among biological replicates, and the averages were compared between individual strains. Spearman’s correlation values were calculated in all pairwise combinations and visualized as a heat map ordered to reflect phylogenetic relatedness. FP clades are color coded, and clades with strong clustering are outlined in yellow. (C) The genetic similarity between isolates (x axis) was compared to similarity in transcript abundance as defined in panel B (y axis). Pairwise comparisons between all strains are represented as dots and color coded to denote intraclade comparisons (I, red; II, orange; III, blue; SA, dark gray) or marked as light gray for comparison across clades. Two clusters emerged with interclade comparisons showing less nucleotide similarity and a greater range of expression correlation scores (left) that extended below intraclade comparisons (right). Two recombinant isolates, P60002 and P94015 (indicated in purple and magenta, respectively), clustered only within interclade comparisons. (D) Clade gene expression profiles were built using the average from all strains within the clade. The clade-average profiles were compared by Spearman’s correlation and visualized by a heat map.

In a few select cases, averaged gene expression levels among strains within a single clade were similar, such as those within Clade III and among a subset of Clade SA strains (Fig. 1B, outlined in yellow). Indeed, gene expression within this strain set was more similar among intraclade comparisons than interclade comparisons (Wilcoxon test, W = 4,978, P value = 0.022), supporting evidence of clade-associated expression signatures among these isolates. Regardless, intra- and interclade correlations of gene expression largely overlapped (average: 0.783 versus 0.759; range: 0.33 to 0.97 versus 0.54 to 0.96, respectively) (Fig. 1C), and Clade III strains largely drove the differences between intraclade and interclade comparisons, which disappeared when these strains were removed (Brunner-Munzel test [BM] = 0.500, df = 78.9, P = 0.62). The two isolates in this set that have been proposed to harbor recombinant genomes, P60002 and P94015 (12), exhibited divergent genomes consistent with interclade comparisons of nucleotide divergence, and P60002 displayed the most divergent gene expression comparisons of any strain (Fig. 1C). This further supports these isolates as being genetically distinct with unique expression patterns compared to other strains from their assigned clades and is in line with these two isolates having undergone interclade recombination during their evolutionary history (12).

Gene expression patterns were also compared between C. albicans clades, as some phenotypes have been associated with specific clades and clade-level comparisons can reduce the influence of “outlier” strains (22, 23). However, with the exception of Clades II and III, similarities in clade-average expression levels were not enriched among the more closely related clades (Fig. 1D and Fig. S3). Thus, genetic similarity contributes to, but does not strictly determine, similarity among C. albicans gene expression profiles.

FIG S3

Transcriptional profiles are not more similar among genetically similar strains. A distance matrix based on similarity in transcriptional profiles was constructed for all 21 C. albicans isolates. Distances were separated based on comparison between strains within the same clade or between strains in different clades based on fingerprinting analysis and plotted. Intraclade and interclade comparisons were not statistically different.

Gene expression differences between C. albicans isolates span biological traits.

The set of C. albicans strains analyzed here exhibits up to 1.7% nucleotide divergence in pairwise comparisons (12), highlighting the potential for large-scale differences in genetic regulation and gene expression. The number of differentially expressed genes between any two isolates varied considerably, ranging from 43 to 1,457 genes (adjusted P value < 0.05, ≥2-fold change) (see Table S2 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS2/14211218/1 and Table S3 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS3/14211224/1), and increased with greater dissimilarity in overall gene expression (Pearson’s test; r = −0.86, n = 210, P < 2.2E−16) (Fig. S4). Investigation of gene ontologies (GO) associated with differentially expressed genes between isolates returned 147 process terms spanning the full breadth of biology (see Table S4 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS4/14211233/1). The most prevalent GO terms were associated with ribosome biogenesis followed by nucleic acid and aromatic compound metabolism, suggesting that some isolates may have evolved unique growth characteristics, pathways to control nutrient utilization or signaling, and/or preferred nutrient conditions for optimal growth. Conversely, consistent gene expression levels across isolates point to core functions required for basic cellular processes in this diploid yeast. Of the 5,956 complete and intact open reading frames (ORFs) present across all 21 C. albicans isolates, 2,036 genes displayed indistinguishable expression levels among all strains. These include genes for key cellular functions such as amino acid charging of tRNAs, RNA polymerase function, and core translational processes (see Table S5 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS5/14211239/1).

FIG S4

Greater dissimilarity in gene expression correlates with more differential gene expression. The number of differentially expressed genes between any two strains (adjusted P value < 0.05, 2-fold cutoff) and the similarity in overall gene expression between two strains in all pairwise comparisons were plotted. Comparisons were performed in all pairwise combinations for all strains and color coded for comparisons between two strains within the same clade or marked as gray for comparison across clades. These data produced an inverse relationship between expression similarity and the number of differentially expressed genes.

A distinctive class of genes considered were those expressed at unique levels within a single strain compared to all other strains and therefore classified as having strain-specific expression. The number of strain-specific genes varied considerably, ranging from 0 in isolates 12C, 19F, P37005, and P57055 to 171 in GC75 (q ≤ 0.05 and 2-fold change) (Fig. S5). Strain-specific expression was enriched for cellular processes ranging from cell wall organization (GC75) to oxidation-reduction (P78042) to mannosyltransferase activity (P60002) and RNA polymerase I activity (P57072) (see Table S6 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS6/14211242/1). Isolates with the largest number of uniquely expressed genes typically clustered closely with other strains in the phylogenetic tree (Fig. S1), further highlighting the disconnect between genetic relatedness and gene expression.
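The notion of strain-specific expression can be made concrete with a simplified sketch. The version below applies only the 2-fold criterion and omits the significance testing (q ≤ 0.05) used in the study; the function name, the pseudocount, and the input layout are assumptions made for illustration.

```python
def strain_specific_genes(tpm_by_strain, fold=2.0, pseudocount=1.0):
    """Return {strain: [genes]} where the strain's TPM differs by at least
    `fold` (higher or lower) from every other strain.

    tpm_by_strain: {strain: {gene: TPM}}; only genes present in all
    strains are considered. The pseudocount (an assumption) avoids
    division-like artifacts for genes with zero expression.
    """
    strains = list(tpm_by_strain)
    shared = set.intersection(*(set(v) for v in tpm_by_strain.values()))
    result = {s: [] for s in strains}
    for gene in sorted(shared):
        for s in strains:
            own = tpm_by_strain[s][gene] + pseudocount
            others = [
                tpm_by_strain[o][gene] + pseudocount
                for o in strains if o != s
            ]
            if not others:
                continue  # a single strain has nothing to be compared with
            higher = all(own >= fold * v for v in others)
            lower = all(own * fold <= v for v in others)
            if higher or lower:
                result[s].append(gene)
    return result
```

Counting the entries per strain gives a tally comparable in spirit to the one summarized in Fig. S5, although the numbers would differ from a statistically tested analysis.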

FIG S5

Strain-specific gene expression among C. albicans isolates. (A) The number of genes expressed uniquely by one strain compared to all other 20 transcriptionally profiled isolates was plotted for each of the 21 isolates. Isolates that uniquely expressed a greater number of genes beyond 2 standard deviations are labeled. (B) The number of strain-specific genes for each isolate is listed.

Characteristics of functional noncoding RNA elements.

Untranslated regions in C. albicans can serve as regulatory platforms for protein binding to control transcript stability and translation (52, 53). The average 5′ untranslated region (UTR) length for 5,076 genes with detectable expression among all sequenced isolates centered at 1 to 25 bp and decreased in frequency with greater lengths (Fig. S6A). Prior analysis has revealed that some C. albicans transcription factors have extended 5′ UTRs greater than 1 kb in length (36, 52–54). Analysis of all transcription factor genes among the 21 strains showed they encoded significantly longer 5′ UTRs compared to the genome-wide average (286 versus 97 bp, respectively; Wilcoxon test, W = 3.97E5, P value < 2.2E−16) (Fig. S6B and see Table S7 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS7/14211248/1). In contrast, 3′ UTRs were, on average, between 25 and 75 bp for the 5,899 genes with detectable expression. Genes involved in protein translation were found to contain significantly longer 3′ UTRs than the genome average (141 versus 44 bp, respectively; Wilcoxon test, W = 8.63E5, P value < 2.2E−16) (Fig. S6C and see Table S8 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS8/14211251/1), which may also implicate important regulatory functions for these regions through either transcriptional or translational control (55).

FIG S6

Untranslated regions (UTRs) in C. albicans vary in length with gene function. (A) The UTR length for all genes in each isolate was determined by measuring the length of continuous reads extending beyond defined coding sequences on the appropriate strand. Lengths for each gene were plotted with 5′ UTRs above and 3′ UTRs below the x axis. Red vertical lines indicate the 95% cutoff value. (B) The 5′ UTR was detected from aligned transcripts from each of the 21 sequenced isolates. The length of the 5′ UTR for each gene was averaged for all genes with detectable expression in at least 15 strains. The length of all gene 5′ UTRs is plotted alongside those of all C. albicans transcription factors as defined in the Candida Genome Database (http://candidagenome.org). (C) The 3′ UTRs of all genes in the C. albicans genome were similarly determined from transcriptional profiling. The 3′ UTRs of all genes were plotted alongside all genes defined by the gene ontology term “ribosome.”

Mobile genetic elements play an important role in shaping genome evolution through promoting recombination, disrupting gene function, and forming new transcriptional units (56). Previous work has catalogued the transposable elements (TEs) present in the C. albicans genome using their associated long terminal repeats for classification among clinical isolates (11, 57). Transcriptional profiling of the 21 C. albicans isolates revealed active expression of multiple transposon families within C. albicans. The most highly transcribed transposons were flanked by gamma-class long terminal repeat (LTR) sequences, although the abundance of actively transcribed retroelements varied immensely between strains (Fig. S7A). The RNA abundance of TEs did not reflect strain relatedness or changes in genomic copy number among the isolates (Pearson’s test; r = 0.062, df = 19, P = 0.79) (Fig. S7B), suggesting that mechanisms of transposon quiescence or inactivation may contribute to differences in expression among strains.

FIG S7

Retroelement expression does not correlate with copy number. (A) The abundance of each transposon-associated long terminal repeat (LTR) was determined from RNA-Seq for each strain and is shown as a stacked bar and color coded to indicate each LTR class. Strains are color coded by clade. (B) The number of retroelements encoded in the genome of each C. albicans isolate was determined from previous whole-genome sequencing (11) and plotted against the value of total transcripts per million (TPM) for all retroelements. A linear model was fitted to the data to detect a relationship between copy number and expression.

Gene expression does not correlate with chromosomal position.

A previous report suggested that genes found at the chromosome ends could exhibit higher levels of expression plasticity, defined as variable gene expression among cell populations (58). To assess expression plasticity, the coefficient of variation (CV) between biological replicates was calculated for all genes and averaged across the 21 strains. The average CV in 10-kb sliding windows remained fairly constant across the genome, centered at approximately 0.15 (Fig. S8A). Subtelomeric genes in the 15 kb most proximal to the telomeric repeats did not show increased variability compared to the rest of the genome; in fact, the CV decreased slightly in the subtelomeres. Additionally, only two of nine TLO genes with transcript abundance data across all strains showed elevated plasticity compared to the genome average (Student’s t test; P < 0.05) (Fig. S8B). Instead, the majority of genes with significantly elevated expression plasticity were scattered throughout the genome (see Table S9 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS9/14211263/1).
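The plasticity measure can be sketched as follows: a CV per gene from replicate TPM values, then an average per genomic window. This is an illustrative sketch, not the study's code. It uses the sample standard deviation and nonoverlapping 10-kb windows (as in the legend of Fig. S8A); the function names and the input layout are assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV = sample standard deviation / mean for one gene in one strain.

    Returns None when the mean is zero, where the CV is undefined.
    Requires at least two replicate values.
    """
    m = mean(replicates)
    if m <= 0:
        return None
    return stdev(replicates) / m


def windowed_mean_cv(genes, window_bp=10_000):
    """Average CV per nonoverlapping window along one chromosome.

    genes: iterable of (position_bp, cv) pairs; entries with cv None
    are skipped. Returns {window_start_bp: mean CV}.
    """
    windows = {}
    for position, cv in genes:
        if cv is None:
            continue
        start = (position // window_bp) * window_bp
        windows.setdefault(start, []).append(cv)
    return {start: mean(values) for start, values in sorted(windows.items())}
```

Averaging each gene's CV across strains before windowing, as described in the text, would be an additional `mean` over the per-strain values.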

FIG S8

TLO genes do not display increased expression plasticity. (A) The coefficient of variation (CV) for gene expression between biological replicates for each strain was calculated for all genes and averaged across strains. The CV of each gene (gray dots) was plotted across the eight C. albicans chromosomes along with a smoothed average using nonoverlapping 10-kb windows (red lines). Chromosomal positions are depicted below. (B) The coefficient of variation (CV) of gene expression for each gene was calculated between biological replicates for each clinical isolate. The CV was averaged across strains and plotted with a smoothing line (red). Individual TLO genes were plotted against the distribution (arrows) and compared to 2 standard deviations from the mean (dashed black line). The blue arrow indicates the chromosome internal TLO, TLOα34.

Differentially expressed gene sets associate with C. albicans phenotypes.

Previously, the 21 sequenced C. albicans isolates were characterized for a diverse set of in vitro and in vivo phenotypic traits (11). Differentially expressed genes between groups with extreme phenotypes can implicate the causative networks or pathways that are responsible for the divergent traits (Fig. 2A).

FIG 2

Differential expression predicts genes associated with C. albicans phenotypes. (A) The workflow used to identify phenotype-associated genes is depicted. Phenotyping results for 8 traits determined in the work of Hirakawa et al. (11) were used to (1) screen strains to (2) identify strains with extreme phenotypes. (3) Differential gene expression (2-fold change, q < 0.05) was identified among strains with opposing phenotypic groups, and (4) enrichment analysis was performed for biological terms. (B) The fold change in expression between groups with opposing phenotypic measurements as defined in panel A is plotted for all genes and the eight phenotypes investigated. Genes showing significantly different expression levels between the opposing phenotypic groups are color coded by phenotype, and genes without statistically supported differences are in gray. (C) Values of transcripts per million (TPM) are plotted as a heat map on a log2 scale for differentially expressed genes within the enriched gene ontology term “Single-species biofilm formation” between strains that filament poorly (low) or profusely (high) on Spider agar medium at 30°C. Two biological replicates per strain are displayed. (D) The TPM values for each euploid (blue) and aneuploid (red) isolate sample are plotted for the two differentially expressed genes within the enriched GO term for aneuploidy.

To identify genes that associate with quantitative phenotypes, we compared differentially expressed genes between strains that displayed phenotypic extremes in the work of Hirakawa et al. (11). Overall, gene expression profiles between groups for any given phenotype were overwhelmingly similar, with the extreme groups differentially expressing between 2 and 209 genes for each phenotype (>2× change, q ≤ 0.05) (see Table S10 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS10/14211266). Growth phenotypes were associated with the largest number of differentially expressed genes (Fig. 2B), which may reflect the conditions used for RNA isolation (logarithmic-phase growth in YPD medium at 30°C). Genes involved in cell cycle regulation, lipid metabolism, and carbohydrate metabolism were overrepresented among those differentially expressed between strains with high/low growth rates. Surprisingly, phenotypes not directly linked to the growth conditions under which RNA was prepared also showed differential expression of genes enriched for associated biological processes (see Table S11 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS11/14211275/1). For example, strains with contrasting abilities to filament on Spider medium showed differential expression of genes associated with biofilm formation (11 of 129, q = 7.78E−3) and oxidoreductase activity (8 of 129, q = 9.61E−3), even though they were grown as planktonic cells in YPD medium at 30°C (Fig. 2C). Interestingly, strains harboring supernumerary chromosomes differentially expressed genes involved in oxidoreductase activity using NAD+/NADH acceptors compared to their euploid counterparts (2 of 9, q = 3.07E−2; Fig. 2D). Thus, gene expression differences could be connected to a variety of phenotypes, even though cells were isolated from a single experimental condition. 
This analysis was limited to phenotypes with clear opposing differences, however, and suggested that more dynamic models of expression-phenotype relationships could identify additional loci responsible for phenotypic variation.
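The extreme-phenotype comparison can be sketched as two steps: pick the strains at either end of a phenotype ranking, then compute a fold change between the groups for each gene. The group size `k`, the pseudocount, and the function names below are hypothetical, and the statistical testing behind the reported q values is not reproduced.

```python
from math import log2


def extreme_groups(phenotype, k=3):
    """Return (low, high): the k strains with the lowest and highest scores.

    phenotype: {strain: score}. The value of k is an assumption; the
    text does not state how many strains formed each extreme group.
    """
    ranked = sorted(phenotype, key=phenotype.get)
    return ranked[:k], ranked[-k:]


def log2_fold_change(tpm_by_strain, gene, low, high, pseudocount=1.0):
    """log2 ratio of mean TPM in the high group over the low group."""
    low_mean = sum(tpm_by_strain[s][gene] for s in low) / len(low)
    high_mean = sum(tpm_by_strain[s][gene] for s in high) / len(high)
    return log2((high_mean + pseudocount) / (low_mean + pseudocount))


def candidate_genes(tpm_by_strain, genes, low, high, min_abs_log2fc=1.0):
    """Genes with at least a 2-fold difference (|log2FC| >= 1) between groups."""
    return [
        g for g in genes
        if abs(log2_fold_change(tpm_by_strain, g, low, high)) >= min_abs_log2fc
    ]
```

Genes passing the fold-change filter would still need a significance test and multiple-testing correction before being called differentially expressed.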

Linear models link gene expression with variation in simple traits.

The differential gene expression analysis described above relied on categorical definitions (such as phenotypic extremes) and therefore failed to acknowledge that gene expression and quantitative traits often fall along a continuum. To incorporate nondiscrete values, gene expression and phenotypic measurements were fit to a linear model. A generalized least-squares model of regression was used to account for the potential influence of population structure on gene expression among the 21 strains. Expression values for the ∼6,400 genes were plotted for all 21 isolates against a panel of 23 phenotypic measurements spanning growth rates, drug resistance, stress resistance, filamentation, and virulence, and significant associations were identified (see Table S12 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS12/14211278/1). Notably, growth rates correlated strongly with expression of a significant portion of the genome (e.g., expression of 1,879 genes correlated with growth rates in YPD medium at 37°C) (Fig. 3A). Genes connected to growth rates across a range of conditions were often overrepresented for functions related to the cell cycle or cell division (see Table S13 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS13/14211281/1). For example, increased growth rates in minimal, Spider, and synthetic complete dextrose (SCD) media at 30°C displayed a linear relationship with increased expression of genes overrepresented in the mitotic cell cycle (q < 1.40E−4) and spindle assembly (q < 0.05). This analysis also identified core regulatory processes associated with growth rates including expression levels of Mediator, a major transcriptional regulatory complex (59). Expression of Mediator subunits was overrepresented for growth rates in YPD at 30°C [χ²(1, n = 1,320) = 9.48, P = 2.07E−3] (Fig. 3B).
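The per-gene fit of expression against a quantitative trait can be illustrated with an ordinary least-squares line and its Pearson correlation. This is a deliberate simplification: the study used generalized least squares to account for population structure, which is not modeled here, and the function name and return layout are assumptions.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    x: expression values of one gene across strains (e.g., TPM)
    y: phenotype scores for the same strains, in the same order
    Returns (slope, intercept, r), where r is Pearson's correlation.
    Assumes neither x nor y is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    expression = [5.0, 12.0, 20.0, 33.0]
    growth_rate = [0.21, 0.30, 0.38, 0.55]
    print(linear_fit(expression, growth_rate))
```

Repeating the fit for every gene and phenotype pair, then correcting the resulting P values for multiple testing, yields the kind of association table the text describes.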

FIG 3

Linear regression reveals genes correlated with C. albicans phenotypic traits. (A) Expression of each gene and quantitative phenotype scores from all biological replicates were fitted to a linear model and tested for significance using Pearson’s correlation. The correlation score was plotted for each of 23 phenotypes and color coded by phenotype for significantly associated genes. Gray points indicate no significant association. (B) Representative correlation scores for components of the Mediator transcriptional regulator complex with growth in YPD medium at 30°C are indicated on the right. Mediator components significantly associated with these growth conditions are indicated in the Mediator schematic by thick black outlines. (C) The expression levels of four genes previously known to be involved in C. albicans filamentation are plotted for the 21 isolates compared to their filamentation score on Spider solid medium at 30°C. The regulatory relationship of the four genes is indicated by arrows. (D) The value of transcripts per million (TPM) of all annotated ribosomal genes in the C. albicans genome is plotted for the 21 isolates by ascending filamentation scores on solid Spider medium at 30°C. A best-fit line is indicated in red.

In contrast, linear modeling found fewer significant relationships between gene expression and more complex traits such as biofilm formation or virulence. Intriguingly, however, the expression of a large number of genes correlated linearly with the degree of hyphal growth observed under filamentation-inducing conditions. One of these genes, CZF1, is a key transcription factor required for the transition to hyphal growth (60), as well as a member of the core transcriptional network governing biofilm formation (47). Our results revealed that higher expression of CZF1 in clinical isolates (in YPD medium) correlated with increased filamentation when cells were grown on Spider medium (Fig. 3C). Elevated expression of other hypha-regulated genes including RFX2, BRG1, and ROB1 also correlated with increased filamentous growth under these conditions (q = 4.66E−3, 5.23E−3, and 7.08E−4, respectively). Both BRG1 and ROB1 are regulatory targets of Czf1 and Rfx2 (47, 51, 61), demonstrating that multiple members of known regulatory pathways can be uncovered by linear modeling of expression. Additionally, expression of ribosome and mitochondrial genes correlated with the extent of filamentation across a range of conditions (Fig. 3D), consistent with previous reports (62–64). Thus, linear modeling captured expression dependencies of key regulators with simple phenotypes but was less proficient in detecting relationships between gene expression and more complex C. albicans phenotypes.

Construction of gene networks associated with phenotypic traits.

To capture additional cellular pathways and processes associated with both simple and complex traits, we constructed gene expression networks using weighted gene correlation network analysis (WGCNA) (65). Implementation of network construction using transcript abundance of all genes across the set of 21 isolates produced 43 distinct coexpression modules (ME) (Fig. 4A; also see Table S14 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS14/14211287/1).

FIG 4

Coexpression modules reconstruct biological relationships in C. albicans cells. (A) A weighted gene coexpression network analysis (WGCNA) of transcript abundance across all strains resolved 43 modules. A gene dendrogram obtained by average linkage hierarchical clustering is depicted above each associated module. ME8 and ME30 are indicated. (B) The relationship between genes within all modules was visualized using a correlation cutoff of 0.93. Eight of the 10 largest modules formed connections with each other and are color coded as indicated. The relationship between each module is represented spatially, where genes are represented as individual points and their correlated expression is represented by edges.

Spatial organization of the coexpression modules produced a striking arrangement in which transcriptional cross talk between modules was evident (Fig. 4B). Color coding was used to highlight different coexpression modules in which nodes are individual genes and edges have a correlation score of at least 0.93 (Fig. 4B). Surprisingly, we found that eight of the 10 largest modules connect to one another to produce a ring structure, where most modules interact with a limited set of one to three other modules and that collectively incorporates expression of 67% of annotated C. albicans genes (4,377 of 6,468 genes). The two largest modules form the backbone of the ring structure: ME1, which includes the RNA processing and vesicular transport machinery, and ME2, which encompasses the translational machinery (see Table S15 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS15/14211290/1). These processes are connected through ME4, which is enriched for genes involved in RNA binding in the nucleolus and ribosomal genes for RNA processing and translation. Genes required for ubiquitination and the proteasome are enriched in ME3 and connected to ME1, indicative of transcriptional cross talk in protein turnover. ME3 is linked to ME5, which contains the genes for glycerophosphodiester transport and lipid production; to ME9, which is enriched for genes involved in the metabolism of nucleotide sugars and production of biofilm matrix; and finally to ME2, which links back to translation. Thus, our analysis produced a gene expression atlas that delineates the interconnected transcriptional control of core cellular processes in C. albicans.
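The edge definition used for this visualization (a correlation score of at least 0.93 between two genes) can be illustrated with a minimal sketch. It assumes Pearson's correlation of expression values across isolates; the gene names and values are made up for illustration, and the function names are hypothetical rather than part of WGCNA or of the analysis code used in this study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, cutoff=0.93):
    """Return (gene_a, gene_b, r) for every gene pair with r >= cutoff.

    expression: dict mapping gene name -> list of expression values,
    one per isolate, in the same isolate order for every gene.
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if r >= cutoff:
            edges.append((gene_a, gene_b, r))
    return edges


# Toy, made-up expression values for three hypothetical genes in five isolates.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # unrelated pattern
}
edges = coexpression_edges(toy_expression)
```

In this toy example only the geneA-geneB pair passes the cutoff, so the resulting network contains a single edge.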

Gene coexpression modules were subsequently correlated with previously characterized phenotypes (11) to infer potential regulatory links (Fig. S9). Related phenotypes clustered to the same modules in many cases (e.g., growth rates in different media clustered to ME8, and filamentation across multiple conditions clustered to ME30). These module-phenotype links often included previously characterized genotype-phenotype associations. For example, elevated expression of ME30 and ME16 genes correlated with increased filamentation and encompassed known activators of filamentation such as BRG1 (ME30) and SUV3 (ME16) (66, 67). However, most genes in these modules have not been previously linked to filamentation and therefore represent candidates for further investigation.

FIG S9

Module-phenotype relationships for C. albicans isolates. Modules built from C. albicans expression data with weighted gene correlation network analysis (WGCNA) were correlated against each phenotype from the work of Hirakawa et al. (M. P. Hirakawa, D. A. Martinez, S. Sakthikumar, M. Z. Anderson, et al., Genome Res 25:413–425, 2015, https://doi.org/10.1101/gr.174623.114). Significant positive and negative correlations of modules and phenotypes are indicated in red and blue, respectively. Each significant interaction contains the correlation coefficient (top line) and the P value (bottom line).

Copyright © 2021 Wang et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

Identification of a putative state-specific network.

Two phenotypes, growth rates and filamentation, were strongly associated with several gene coexpression modules (Fig. S9). To test WGCNA predictions of module-phenotype associations, we first interrogated the ME8 module, which was linked to growth rates under several conditions (Fig. 5A). Interestingly, a single strain, P37037, expressed genes in ME8 at higher levels than did all other isolates (Fig. 5B), suggesting that ME8 conferred unique attribute(s) to this strain. The elevated expression of ME8 genes in P37037 may be due to coordinated gene regulation and/or interconnectivity, as 17 of the 18 genes within the ME8 network connect to a minimum of 12 other genes within the same network (Fig. 5C).

FIG 5

Identification of a gray-specific module associated with cell state growth differences. (A) Two modules defined by WGCNA, ME8 and ME30, were correlated with phenotypes of the set of 21 C. albicans isolates. Significant associations are indicated by increasingly darker red hues, and gray indicates no association. Each cell provides the Pearson’s correlation statistic (top) and q value (bottom). (B) A heat map represents the transcript-per-million (TPM) gene expression of ME8 genes on a log2 scale ranging from −6 to 6 for biological replicates for three isolates, SC5314, 19F, and P37037. Genes in bold were tested experimentally. (C) Strongly correlated expression of 18 genes from ME8 is depicted where each gene is represented by nodes and correlated expression is shown as edges. Correlation scores are >90%. (D) The white and gray cell states found in P37037 are shown for both colonies and cell images (at ×40 magnification). The EFG1 locus was genotyped by Sanger sequencing from both P37037 cell types. P37037 white cells encoded a heterozygous G/A and gray cells encoded a homozygous A/A at nucleotide 755 in EFG1. (E) Growth rates for P37037 white and gray cell states. The average doubling time during logarithmic phase growth was determined in YPD, SCD, and minimal SD media and plotted as the mean with standard deviations. n = 6. (F) Growth curves during an 18-h window are displayed for wild-type, Δ/Δofi1, and Δ/Δofi1+OFI1 strains in the P37037 background and color coded as indicated. Measurements of optical density were taken in 15-min intervals. (G) Growth rates for white (left) and gray (right) cells in the wild type, three mutant lines (Δ/Δkns1, Δ/Δofi1, Δ/Δzcf31), and their complemented P37037 strains. Significance was determined relative to the wild type (WT). n = 6. ** denotes P < 0.01. *** denotes P < 0.001.

Analysis of P37037 colony sectors revealed two distinct cell types that resembled the previously defined “white” and “gray” states of C. albicans (Fig. 5D). C. albicans is most commonly isolated in the white state, which is considered the default state. In contrast, the gray state represents an efg1/efg1 null state that can readily arise in strains that are EFG1/efg1 heterozygous due to spontaneous loss of the functional allele (13). P37037 is functionally heterozygous for EFG1 as it contains a polymorphism at nucleotide 755 that inactivates one allele via a G252D mutation in the encoded protein (13). Sequencing of the EFG1 locus in P37037 confirmed the heterozygous polymorphic site (G/A) in white populations whereas all assayed gray colonies (4/4) had become homozygous (A/A) to produce cells lacking functional EFG1 (Fig. 5D). Consistent with previous observations of conversion to the gray state (13), gray sectors often arose within white colonies but no white sectors were observed within gray colonies.

We hypothesized that gray cells within the mixed population from P37037 may be responsible for resolving the ME8 network and, potentially, its association with growth. Indeed, transcriptional profiling of gray P37037 cells demonstrated significantly elevated expression of ME8 genes compared to the white state (Fig. S10A). Interestingly, only 9 of these 18 genes displayed differences in expression between white and gray cells in the SC5314 background (Fig. S10B), indicating that strain background also influences white versus gray expression profiles. To test the association between cell state and growth, the doubling times of P37037 white and gray cells were compared in multiple medium types. White cells grew significantly faster than gray cells in both nutrient-rich (YPD and SCD) and nutrient-poor (minimal) media at 30°C (Student’s t test; P < 0.001) (Fig. 5E).

FIG S10

Expression of ME8 genes is unique to P37037 gray cells. (A) qRT-PCR-measured abundance of ME8 genes in white and gray populations of P37037. Abundance was measured for cells in logarithmic growth at 30°C and normalized to ACT1. n = 3 biological replicates. (B) A heat map represents the gene expression in transcripts per million (TPM) of ME8 genes from white, gray, and opaque SC5314 cells taken from the work of Liang et al. (S. H. Liang, M. Z. Anderson, M. P. Hirakawa, J. M. Wang, et al., Cell Host Microbe 25:418–431.e416, 2019, https://doi.org/10.1016/j.chom.2019.01.005) and plotted on a log2 scale ranging from −10 to 10.

Three putative transcription factors in the ME8 module that had no previously described growth phenotypes (KNS1, OFI1, and ZCF31) were individually disrupted in strain P37037 to determine if genes within this module impact growth rates in either the white or gray cell state beyond the influence of cell state alone. Disruption of any of the three genes did not alter growth rates of white cells. In contrast, disruption of OFI1 significantly decreased growth rates in the gray state, although doubling times were challenging to measure due to the lack of a clear logarithmic growth phase for these cells (Wilcoxon test; W = 70, P = 0.017) (Fig. 5F and G; see also Fig. S11 at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_FigS11/14211173/1). Loss of KNS1 also decreased the growth rates of gray cells, but this difference did not reach statistical significance (see Fig. S11 at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_FigS11/14211173/1). Thus, genes in the ME8 module exhibit state-specific expression that reflects differences in growth between white and gray states.

Dissection of a novel network that regulates filamentation.

We also examined a second coexpression module, ME30, given that this module was associated with filamentation, but not growth rates, across a range of conditions (Fig. 5A). In contrast to ME8, this module displayed relatively low interconnectivity and exhibited a range of expression values across isolates (see Fig. S12A at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_figS12_pdf/14211164/1). Expression of genes in the ME30 module was elevated in strains with higher filamentation scores compared to those that filament poorly (e.g., SC5314 versus P37037, respectively) (Fig. 6A). ME30 genes included the previously characterized BRG1 gene that encodes a transcriptional activator of filamentation (66), further suggesting a role for ME30 in promoting hyphal formation. Four genes from ME30 with potential regulatory roles (UME7, transcription factor; FGR2, putative transmembrane transporter; PHO100, putative phosphatase; and orf19.6864, putative ubiquitin ligase), in addition to BRG1, were disrupted in the high-expression strain SC5314 and assessed for filamentation in liquid and on solid media. Loss of each gene reduced filamentation in liquid RPMI medium at 1 h, when hyphal initiation begins in SC5314 (Fig. 6B). Thus, most cells in the Δ/Δbrg1 background remained as yeast whereas loss of the other four ME30 genes produced a heterogeneous mix of yeast cells and cells forming germ tubes. After 4 h in RPMI medium, all ME30 mutant cultures contained mostly hyphae, although significantly fewer filamentous cells were present in the Δ/Δbrg1, Δ/Δfgr2, Δ/Δpho100, and Δ/Δume7 strains (Wilcoxon test; P < 0.05) (Fig. 6B). Many of the mutants that formed filamentous cells remained as pseudohyphae at these later time points, compared to the wild-type background, which grew as a mix of hyphal and pseudohyphal cells (see Fig. S12B at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_figS12_pdf/14211164/1). 
Complementation of each mutant restored the wild-type phenotype at both the 1- and 4-h time points (Fig. 6B; see also Fig. S12B at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_figS12_pdf/14211164/1). Plating cells to single colonies on YPD and Spider media at 30°C produced similar outcomes with reduced filamentation of most ME30 mutants. Strains lacking BRG1, FGR2, and UME7 demonstrated reduced colony filamentation after 7 days on both YPD and Spider media with Δ/Δpho100 colonies also generating less filamentation on Spider medium (Wilcoxon test; P < 0.05) (Fig. 6C). Similar to liquid filamentation, complementation of each mutant with a wild-type copy of the disrupted gene restored filamentation to wild-type levels (Fig. 6C). These results suggest that ME30 genes are responsible for activating filamentation responses in C. albicans and may be particularly important for hyphal initiation. Mutants in ME30 genes did not display any growth phenotypes, consistent with these defects being filamentation specific (Fig. 5A; see also Fig. S12C at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_figS12_pdf/14211164/1). Thus, our collective experimental validation of phenotypes predicted to associate with coexpression modules demonstrates the power of this approach to define gene function across C. albicans strains and to link previously uncharacterized loci to biological processes important for disease.

FIG 6

Genes within a coexpression module promote C. albicans filamentation across conditions. (A) A heat map represents the RNA transcripts per million (TPM) of all ME30 genes on a log2 scale ranging from −6 to 6 for SC5314 and P37037, isolates that filament strongly and poorly across multiple conditions, respectively. Genes in bold were tested experimentally. Colony images were taken following growth on Spider agar medium at 30°C for 7 days. (B) SC5314 wild-type cells, mutants in five genes from the ME30 module, and the complemented mutants were grown for 1 and 4 h in RPMI at 30°C and visualized at ×40 magnification. Bar = 5 μm. The fraction of filamentous cells is plotted for SC5314 wild-type cells, mutants in five genes from the ME30 module, and the complemented mutants. n = 9, 11, 4, 10, 4, 10, 4, 7, 4, 8, and 4 for 1 h and n = 9, 10, 4, 10, 4, 10, 4, 10, 4, 10, and 4 for 4 h in order from left to right. (C) The filamentation score for SC5314 wild type, ME30 mutants, and the complemented mutants following growth on solid YPD (left) or Spider (right) medium for 7 days. n = 14, 14, 6, 14, 7, 13, 7, 9, 6, 19, and 7 for YPD and 17, 12, 6, 12, 6, 8, 7, 8, 7, 14, and 6 for Spider medium for strains from left to right. Significance was determined relative to the wild type. * denotes P < 0.05. ** denotes P < 0.01. *** denotes P < 0.001.

DISCUSSION

A hallmark of C. albicans biology is the extensive genetic and phenotypic plasticity displayed among clinical isolates. This study expands previous observations that considerable transcriptional variation exists between natural isolates of the species (23, 27). We demonstrate that phylogenetic relationships between a set of 21 strains are not mirrored at the transcriptional level, as closely related strains often display contrasting expression profiles under identical growth conditions. Notably, the construction of coexpression modules identified genes and pathways that underlie phenotypic differences between isolates. Furthermore, it permitted the direct evaluation of target genes for their roles in virulence-associated traits, thereby demonstrating the utility of this unbiased approach for delineating genes contributing to phenotypic diversity.

A striking finding in our analyses was the incongruence between constructed phylogenies and transcriptional profiles in C. albicans. Previous work has described transcriptional profiles in bacteria that reflect strain phylogeny and even phenotypic similarity based on shared lifestyle characteristics (68–70). In some eukaryotes such as Saccharomyces cerevisiae, strong selective pressures based on niche specificity may explain incongruence between genetic and transcriptional profiles (34, 71). Here, we show that C. albicans strains express genes largely independently of their genetic similarity and that there is no clear association with the niche of isolation, although we recognize the limited number of multilocus sequence type (MLST) clades represented by these isolates (7 of 17) as well as incomplete clinical information for these strains. The lack of a connection between genotype and gene expression is highlighted by the prevalence of strain-specific expression patterns for several isolates. This indicates that phenotypic variation between C. albicans isolates arises, in large part, from transcriptional differences that cannot be simply predicted by genetic phylogenies or clinical correlates.

Transcriptional differences among the 21 C. albicans isolates provided new insights into functional variation between isolates. Genes involved in metabolic processes were often differentially expressed among strains and may contribute to the range of growth rates seen for these isolates (11). Genes regulating transcriptional activation and hypha formation also showed variable expression and were linked to differences in growth rates and filamentation, respectively. This is despite the fact that all expression profiling involved cells grown under a single culture condition (replete medium at 30°C). Why might cells grown under one condition reflect expression differences that affect function in another? One possibility is that strains express genes in preparation for exposure to a new environment. Such priming can result from epigenetic reprogramming following a previous exposure (72), stochastic expression of regulators that promote bet hedging (73), and/or chromatin remodeling that favors activation of certain promoters (74). Priming of C. albicans cells could promote population fitness during environmental shifts including transitions between different host niches (75). C. albicans strains may also contain subpopulations of cells with distinct expression profiles that favor alternative environmental conditions, with the fraction of these subpopulations varying between strains. Additionally, cell variation in a population can arise due to changes in transcription factor binding that will disproportionately affect gene expression but will not cause general fitness defects (76). Single-cell analysis and transcriptional profiling of large strain sets grown under multiple environmental conditions will help differentiate between these possibilities.

Our expression analysis of the set of 21 C. albicans strains facilitated the construction of a gene expression map of the species and the incorporation of a large proportion of uncharacterized loci into coexpression clusters linked to putative functions. Similar approaches in other systems have revealed the function of uncharacterized genes and their contributions to complex phenotypes (77–79). However, previous systems-level analyses have often skirted direct molecular testing of predicted gene functions. Here, experimental tests of C. albicans genes associated with growth and filamentation revealed functional roles for cell state and transcriptional regulators linked to two coexpression modules, ME8 and ME30. Analysis of genomic sequences could not predict the results described here as no inactivating mutations are present within ME8 and ME30 genetic alleles assayed in our strain set (11). Our study therefore reveals how expression profiling allows for an analysis of genotype-phenotype relationships using a variety of gene expression models instead of only assessing discrete mutation types.

Expression of ME8 module genes was linked to the gray cell state, which was recently shown to arise due to mutations that abolish EFG1 function (13). The EFG1 locus is heterozygous in P37037, and loss-of-heterozygosity (LOH) events can therefore cause cells to become efg1 null and adopt the gray state (13). Unexpectedly, our analysis identified ME8 as a gray-specific coexpression module in P37037, where gray cells grow more slowly than white cells and which produced the expression module-phenotype association. ME8 genes that are upregulated in P37037 gray cells versus white cells are not uniformly upregulated in SC5314 gray cells (see Fig. S10B in the supplemental material). These results further emphasize that C. albicans phenotypes and expression profiles are dependent on their genetic background (11, 23, 26, 27). The existence of an EFG1 heterozygote capable of accessing the gray state is not particularly uncommon (∼2% of assayed clinical isolates), and this hemizygous state may reflect advantages in gray state colonization of the gut or oral cavity compared to white cells (13, 15). Reduced growth rates of gray cells compared to white state cells in our assays could reflect differences from conditions in the host or, more simply, differences between genetic backgrounds. We evaluated the phenotypic consequences of deleting three genes from the highly interconnected ME8 module and showed that loss of OFI1 significantly reduced the growth rates of P37037 gray cells. Thus, we uncovered a novel factor with a cell state-specific phenotype which further validated our approach.

A functional dissection of the ME30 module similarly connected several poorly characterized genes to a key phenotype in C. albicans. In this case, novel regulators of filamentation were discovered despite the wealth of research into filamentation pathways in this species (21, 80–82). Most studies have focused on genetic dissection of filamentation in SC5314 and have relied on candidate gene or transcriptional profiling approaches. We note that our identification of ME30 genes as regulators of filamentation did not rely on the presence of ORF-inactivating mutations but on differential expression across isolates that correlated with filamentation responses. Inclusion of the well-characterized filamentation regulator BRG1 (66) emphasized the potential for other ME30 genes to regulate filamentation. Indeed, all assayed genes in ME30 appear to promote this process, albeit to different degrees, which likely reflects the lack of highly interconnected expression within this module (see Fig. S12 at https://figshare.com/articles/figure/mBio_Wang_etal_2020_supplement_figS12_pdf/14211164/1). All mutants of ME30 genes disrupted hyphal formation at early time points, suggesting that these genes play a critical function in hyphal initiation and operate across multiple conditions, even though the ME30 module was defined using cells grown in the yeast form. The priming of filamentation via ME30 genes is supported by defined roles for Brg1 in recruiting Hda1, a histone deacetylase that remodels chromatin at the promoters of hypha-specific genes, and in occluding Nrg1, a negative regulator of filamentation (66, 83). Elevated expression of BRG1 during rich medium growth could reduce the activation time needed to transcribe UME6 and other genes that promote filamentation, while maintaining a phenotypically yeast state. 
The particularly long 5′ UTR of BRG1 may indicate complex regulation of this gene, including undefined molecular pathways that include other ME30 genes, especially those with clear regulatory capacities (e.g., FGR2, PHO100, and UME7) (54, 84). Thus, our study indicates that ME30 module genes may play broad roles in the regulation of filamentation in C. albicans.

MATERIALS AND METHODS

Media and reagents.

Yeast extract-peptone-dextrose (YPD) and synthetic complete dextrose (SCD) media were prepared as previously described (85). Spider medium was prepared (1% nutrient broth, 1% mannitol, 0.2% K2HPO4) and equilibrated to a pH of 7.4. Minimal medium was prepared as 0.17% yeast nitrogen base, 0.5% ammonium sulfate. YPD containing 200 μg/ml nourseothricin (Werner Bioagents, Jena, Germany) was used to select for nourseothricin-resistant (NATR) strains.

RNA sequencing (RNA-Seq) library preparation.

Two independent cultures for each of the 21 clinical isolates were grown at 30°C in YPD overnight. Cultures were diluted 1:100 into fresh YPD and allowed to grow to an optical density (OD) of 1.0. RNA was harvested from cells using a MasterPure yeast RNA purification kit (Epicentre, Madison, WI) and treated with DNase I (Fisher Scientific, Hampton, NH). RNA quality was measured on an Agilent 2100 Bioanalyzer, and RNA with RNA Integrity Number (RIN) scores of ≥7.5 was used for construction of sequencing libraries.

Poly(A) RNA was isolated and used to construct strand-specific libraries using the dUTP second-strand marking method (86, 87) as previously described (88). The 42 sequencing libraries were pooled and sequenced on the Illumina HiSeq to generate 151-base paired-end reads. To measure gene expression, reads were aligned to the C. albicans SC5314 reference genome. RNA-Seq reads were then mapped to the transcripts with STAR (version 2.0.9) (89). Count tables were generated with HTSeq (version 0.9.0) (90), and differentially expressed genes were identified using EdgeR (version 3.28.1) (91).

FASTQ processing and alignments.

Sequenced reads were returned in FASTQ format, and quality scores were confirmed using FastQC. All 42 samples exceeded the minimum allowed Phred quality score (28) across all bases. An average of 8.1 million reads was obtained per sample. Reads were aligned using the Spliced Transcripts Alignment to a Reference (STAR) with the alignIntronMin and alignIntronMax parameters set to 30 and 1,000 (92). Greater than 90% of reads mapped to defined genes (range, 96 to 98%). All other parameters were executed with default values. For each gene, the number of aligned reads was calculated using HTSeq-count (90). Gene features were defined as those exon regions annotated in the SC5314 Assembly 21 features file (https://tinyurl.com/yt2vmb4c), for a total of 6,468 features. These read counts per feature were normalized into TPM values, which can be publicly accessed at https://goo.gl/PqgGtH. The RNA-sequencing library contained a known defect with strand orientation, where orientation was incorrectly denoted as the opposite of the actual designation. All analyses (including feature counts) took this into account and corrected for it prior to analysis.
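The normalization of read counts into TPM values can be sketched as follows. This is a minimal illustration that assumes the standard TPM definition (counts divided by feature length, then rescaled so that all features sum to one million); the exact implementation used for this data set is not described here, and the feature names, counts, and lengths are made up.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert per-feature read counts to transcripts per million (TPM).

    counts: dict mapping feature -> aligned read count
    lengths_bp: dict mapping feature -> feature length in base pairs
    """
    # Length-normalize first: reads per kilobase of feature.
    rpk = {feature: counts[feature] / (lengths_bp[feature] / 1000.0)
           for feature in counts}
    # Rescale so that the values of all features sum to one million.
    scale = sum(rpk.values()) / 1_000_000.0
    return {feature: value / scale for feature, value in rpk.items()}


# Toy, made-up counts and lengths for three hypothetical features.
toy_counts = {"featX": 100, "featY": 300, "featZ": 600}
toy_lengths = {"featX": 1000, "featY": 1500, "featZ": 3000}
tpm = counts_to_tpm(toy_counts, toy_lengths)
```

Because the length normalization happens before rescaling, featY and featZ receive the same TPM value in this toy example even though featZ has twice as many reads.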

Hierarchical clustering of gene expression.

TPM values for all C. albicans genome features from the Assembly 21 genome feature file were used to build dendrograms of similar gene expression. Hierarchical clustering was performed using Spearman’s correlation and average linkage. To assess reproducibility, trees of similarity between biological duplicates were built and tested with 1,000 bootstraps using the ‘pvclust’ package (version 2.2-0) in R (version 3.5.3). For comparisons across strains, average TPM values were calculated between strains and hierarchical clustering was performed.
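As an illustration of clustering by Spearman’s correlation with average linkage, the sketch below ranks each expression profile, uses 1 minus the rank correlation as the distance, and repeatedly merges the two clusters with the smallest mean pairwise distance. It is a simplified stand-in written for clarity, not the pvclust procedure: it omits bootstrapping, assumes non-constant profiles, and uses hypothetical sample names and made-up values.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranked values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    names = sorted(profiles)
    dist = {frozenset(pair): 1.0 - spearman(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(names, 2)}
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for ca, cb in combinations(clusters, 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            d = sum(dist[frozenset((a, b))] for a in ca for b in cb) / (len(ca) * len(cb))
            if best is None or d < best[0]:
                best = (d, ca, cb)
        d, ca, cb = best
        clusters = [c for c in clusters if c not in (ca, cb)] + [ca + cb]
        merges.append((ca, cb, d))
    return merges


# Toy, made-up TPM profiles (four genes) for three hypothetical samples.
toy_profiles = {
    "s1": [1.0, 2.0, 3.0, 4.0],
    "s2": [1.1, 2.2, 2.9, 4.5],  # same gene ranking as s1
    "s3": [4.0, 3.0, 2.0, 1.0],  # reversed ranking
}
merges = average_linkage(toy_profiles)
```

Here s1 and s2 merge first at distance 0 because their gene rankings are identical, and s3 joins last at distance 2.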

Correlation of expression with strain phylogeny.

Analysis of phylogenetic relatedness among the 21 clinical isolates focused on strains that clustered well within their respective canonical clusters (I, II, III, and SA). To increase the tightness of these well-represented clusters, outlier strains with long branch lengths (P94015, P60002, and P75010) were removed. Based on each gene’s individual transcriptomic profile, we performed unsupervised clustering on each gene’s expression for the remaining 18 strains to bin the strains into 4 groups using the R function kmeans. Hierarchical clustering was then performed on those genes for which these 4 groups contained at least half of the expected strains organized the same as for whole-genome analysis. For each gene’s hierarchical clustering, the number of strains inconsistently assigned was counted, and only 31 genes had at most six incorrectly assigned strains, less than expected by chance. No gene showed perfect congruence with the phylogenetic tree.

5′ UTR and 3′ UTR construction.

The aligned reads in bam file formats for each of the 42 replicates were converted into bed format using bamToBed (https://bedtools.readthedocs.io/en/latest/content/tools/bamtobed.html), such that each individually aligned read is denoted in each row. Next, mergeBed (https://bedtools.readthedocs.io/en/latest/content/tools/merge.html) was applied so that overlapping reads on the same strand are merged together into one contiguous segment. intersectBed (https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html) was used to annotate the respective gene contained with each overlapping segment, with a minimum overlap of 1 bp. The -S flag was used when running intersectBed (https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html) to account for opposite strand orientation. Continuous merged reads that overlap more than one gene feature and those with negative UTR lengths were removed.
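The merging step can be illustrated with a small sketch that collapses overlapping reads on the same strand into contiguous segments. It is an illustrative reimplementation, not bedtools itself: it assumes BED-style 0-based half-open coordinates and treats directly adjacent (book-ended) reads as mergeable, and the coordinates below are made up.

```python
def merge_same_strand(intervals):
    """Merge overlapping or adjacent intervals sharing chromosome and strand.

    intervals: iterable of (chrom, start, end, strand) tuples in 0-based,
    half-open coordinates. Returns merged tuples sorted by chromosome,
    strand, and start position.
    """
    merged = []
    ordered = sorted(intervals, key=lambda iv: (iv[0], iv[3], iv[1], iv[2]))
    for chrom, start, end, strand in ordered:
        if merged:
            last_chrom, last_start, last_end, last_strand = merged[-1]
            same_group = (chrom, strand) == (last_chrom, last_strand)
            if same_group and start <= last_end:
                # Extend the current segment instead of starting a new one.
                merged[-1] = (last_chrom, last_start, max(last_end, end), last_strand)
                continue
        merged.append((chrom, start, end, strand))
    return merged


# Toy, made-up read alignments on a hypothetical chromosome.
toy_reads = [
    ("chr1", 100, 250, "+"),
    ("chr1", 200, 351, "+"),  # overlaps the first read on the same strand
    ("chr1", 300, 400, "-"),  # overlaps in position but on the other strand
    ("chr1", 500, 600, "+"),
]
segments = merge_same_strand(toy_reads)
```

The two overlapping plus-strand reads collapse into one segment spanning 100 to 351, while the minus-strand read stays separate despite overlapping in position.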

Differential gene expression by phenotypic extremes.

Previous phenotyping of these 21 strains was used as the basis for this analysis (11). For each phenotype with categorical extremes, both biological replicates for strains exhibiting traits at the extremes of the distribution for each phenotype were binned into opposing groups and compared against each other for differentially expressed genes as described above using edgeR (91). The following groupings were used for each phenotypic comparison:

Differentially expressed genes were filtered for a minimum log2 fold change of 2 and a q value less than or equal to 0.05 and included only genes that had a minimum of 1 count per million reads in at least two samples. The expression data set was normalized using the default weighted trimmed mean of M-values (TMM) method, and dispersion was estimated using an empirical Bayes method. Because all replicates were collected and sequenced in a single experimental run, no batch effect is expected.
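The filtering thresholds can be expressed as a simple predicate. The Python sketch below assumes that the fold-change cutoff applies to the absolute log2 value (both up- and downregulated genes); the gene names and numbers are hypothetical, and the statistics themselves would come from edgeR.

```python
def passes_filters(log2_fc, q_value, cpm_per_sample,
                   min_abs_log2_fc=2.0, max_q=0.05,
                   min_cpm=1.0, min_samples=2):
    """Apply the expression, fold-change, and q-value thresholds to one gene."""
    expressed = sum(c >= min_cpm for c in cpm_per_sample) >= min_samples
    return (expressed
            and abs(log2_fc) >= min_abs_log2_fc  # assumption: absolute value
            and q_value <= max_q)


# Hypothetical per-gene results: (log2 fold change, q value, CPM per sample).
results = {
    "geneA": (3.1, 0.001, [12.0, 15.5, 0.4, 0.2]),
    "geneB": (-2.4, 0.030, [0.1, 0.3, 5.2, 6.8]),
    "geneC": (2.8, 0.200, [4.0, 3.5, 0.9, 1.1]),   # fails the q-value cutoff
    "geneD": (4.0, 0.010, [1.5, 0.2, 0.1, 0.0]),   # expressed in one sample only
    "geneE": (1.2, 0.001, [9.0, 8.0, 4.0, 3.5]),   # fold change too small
}

significant = [gene for gene, (fc, q, cpm) in results.items()
               if passes_filters(fc, q, cpm)]
print(significant)
```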

Gene ontology annotation.

Enrichment for gene ontology terms was conducted through the Candida Genome Database (93). To complement this, we introduce an R library (CAlbicansR [https://github.com/joshuamwang/CAlbicansR/]) to facilitate nonbrowser analysis of Candida genomic data sets. Its functionality includes an offline database for converting orf19 identifiers into gene names and vice versa. The library also provides a function for automated searches of the Gene Ontology Term Finder. Results are printed to the R console.

Linear regression of phenotype on gene expression.

The strength of a linear association between a gene’s expression and phenotypic score was assessed for all genes in all phenotypes using each sequencing set as a single data point (42 data points in all). To account for existing phylogenetic relationships, the covariance structure between strains was calculated based on a Brownian motion process of evolution, using the R phytools package. Phylogenetic generalized least-square regression was fitted while accounting for within-group correlation structure as defined previously. For each gene, the x axis represented the strain’s expression of that gene and the y axis indicated the corresponding strain’s phenotype score, and a linear least-squares equation was calculated. The F statistic was used to assess statistical significance, with a Bonferroni correction applied to each set of phenotype tests. Only genes with a corrected P value less than 0.05 were retained.
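The per-gene test and multiple-testing correction can be sketched as below. Note that this simplified Python example uses ordinary least squares rather than the phylogenetic generalized least-squares fit described above, so it ignores the covariance between related strains. Converting the F statistic to a P value requires an F distribution function that the standard library lacks, so the P values passed to the Bonferroni step are hypothetical inputs.

```python
def slope_and_f(x, y):
    """Ordinary least-squares slope and F statistic (1 and n-2 degrees of freedom)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    ss_regression = slope * sxy          # variation explained by the line
    ss_residual = syy - ss_regression    # variation left unexplained
    return slope, ss_regression / (ss_residual / (n - 2))


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values)
    return {gene: min(1.0, p * m) for gene, p in p_values.items()}


# Hypothetical expression (x) and phenotype scores (y) for one gene.
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
phenotype = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, f_stat = slope_and_f(expression, phenotype)

# Hypothetical uncorrected P values for all genes tested against one phenotype.
raw_p = {"geneA": 0.0001, "geneB": 0.02, "geneC": 0.5}
corrected = bonferroni(raw_p)
retained = [gene for gene, p in corrected.items() if p < 0.05]
print(round(slope, 2), retained)
```

Because the correction is applied separately to each set of phenotype tests, the multiplier is the number of genes tested for that phenotype, not the total number of tests across all phenotypes.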

WGCNA construction.

The recommended default settings were used from the tutorial section 2.a.2 (https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/FemaleLiver-02-networkConstr-auto.pdf) for WGCNA of all 42 sequenced samples (2 replicates each from 21 isolates). Specifically, beta was set to 20 to achieve scale-free topology (first value for which R² exceeded 0.80) as recommended previously (94). In addition, the networkType and TOMType both were set to signed, minModuleSize was set to 10, and mergeCutHeight was set to 0.15.
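To show what the soft-thresholding power does in a signed network, the Python sketch below applies the signed adjacency transform, ((1 + r) / 2)^beta, to a few correlation values and selects the first power whose scale-free fit exceeds a threshold. The fit values in the table are hypothetical; in practice they would come from the WGCNA soft-threshold analysis.

```python
def signed_adjacency(correlation, beta=20):
    """Signed-network adjacency: rescale r from [-1, 1] to [0, 1], then raise to beta."""
    return ((1 + correlation) / 2) ** beta


def first_power_above(fit_by_power, threshold=0.80):
    """Return the lowest power whose scale-free fit R^2 exceeds the threshold."""
    for power in sorted(fit_by_power):
        if fit_by_power[power] > threshold:
            return power
    return None


# With beta = 20, only strong positive correlations keep appreciable weight.
for r in (0.9, 0.5, 0.0, -0.9):
    print(r, signed_adjacency(r))

# Hypothetical scale-free topology fit (R^2) for a range of candidate powers.
fit_by_power = {12: 0.61, 14: 0.68, 16: 0.74, 18: 0.79, 20: 0.83, 22: 0.86}
chosen_power = first_power_above(fit_by_power)
print(chosen_power)
```

In a signed network, strongly negative correlations map to adjacencies near zero, so anticorrelated genes are not grouped into the same module.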

Identification of bimodal networks.

To identify genes with expression values that follow a multimodal distribution, we used a Gap Statistic method (95) implemented through the R function clusGap (https://stat.ethz.ch/R-manual/R-devel/library/cluster/html/clusGap.html) and used hclust (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/hclust.html) to identify clusters. Only genes meeting a minimum expression threshold (TPM ≥ 5) were considered. A gene was considered to operate via a bimodal response if its maximized gap statistic exceeded 0.9 and corresponding k value exceeded a minimum of 2. Specifically, this analysis identified a subset of genes within ME8 that have significantly higher expression only in P37037.

Strain and plasmid construction.

Strains, oligonucleotides, and plasmids described in this paper are provided in Table S16 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS16/14211296/1, Table S17 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS17/14211299/1, and Table S18 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS18/14211305/1, respectively. Gene disruption was performed using long oligonucleotide-mediated targeting of OFI1, ZCF31, and KNS1 in P37037 through amplification of the SAT1-FLP cassette from pSFS2A (deletion oligonucleotides listed in pairs as “Round 1 KO” or “Round 2 KO” in Table S17 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS17/14211299/1) and integration by lithium acetate transformation (96, 97). Integration of deletion cassettes (Deletion Chk) and complementation plasmids (Addback Chk), as well as the presence or absence of open reading frames for each gene (ORF Chk), was confirmed with PCR using the oligonucleotides listed in Table S17 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS17/14211299/1. The SAT1-FLP cassette was recycled by plating to 100 colonies on yeast extract-peptone-maltose (YPM) solid medium top-spread with either 10 μg/ml or 20 μg/ml NAT. Small colonies were then patched to YPD with or without 200 μg/ml NAT to screen for nourseothricin-sensitive (NATS) colonies.

Construction of the OFI1 complementation plasmid p41 was performed by cloning PCR-amplified OFI1 from P37037 genomic DNA (including the promoter, coding sequence, and downstream sequence) into pSFS2A using restriction enzymes ApaI and BamHI. The resulting plasmid was linearized in the promoter of OFI1 using HpaI for transformation into C. albicans. Construction of plasmids p50, p52, and p53 was performed using gap-repair cloning as described in the work of Jacobus and Gross (98) to generate ZCF31_A, ZCF31_B, and KNS1 complementation plasmids, respectively. Briefly, ZCF31 from P37037 genomic DNA (including the promoter, coding sequence, and downstream sequence) was PCR amplified with oligonucleotides encoding 20-bp ends homologous to pSFS2A, and pSFS2A was linearized via PCR amplification with oligonucleotides containing 20 bp of homology to ZCF31, generating 40 bp of overlap. After digestion of the residual plasmid template using DpnI, each PCR product was gel purified and cotransformed into chemically competent DH5α to be assembled into an intact plasmid. This assembly yielded two plasmids containing different ZCF31 alleles, listed as p50 (ZCF31-P37037_A) and p52 (ZCF31-P37037_B). p50 and p52 were linearized in the promoter of ZCF31 using PacI for lithium acetate transformation into C. albicans. The KNS1 complementation plasmid p53 was generated in a similar manner, but the genomic amplification was split into two fragments to introduce a novel MluI restriction site into the promoter region. p53 was linearized in the promoter of KNS1 using MluI for C. albicans transformation.

Pure populations of P37037 white and gray state cells were isolated from the mixed P37037 stock by streaking MAY3 onto YPD and growing at 30°C for 5 days until individual white and gray colonies could be differentiated. Independent colonies were inoculated into liquid YPD and grown overnight at 30°C for storage and sequencing of EFG1 to determine the allelic makeup of this locus.

Gray state cells from P37037-derived mutant strains were obtained by streaking white state strains onto YPD, followed by growth at room temperature. After 5 days of growth, gray sectors were identified, struck out onto YPD, and grown at room temperature once again to obtain isolated gray state colonies. After 3 days of growth, streaks were examined at a cellular and colony level to confirm gray state morphologies.

CRISPR-mediated deletion of SC5314 BRG1, UME7, orf19.6864, PHO100, and FGR2 was performed as previously described using a modified lithium acetate transformation protocol (99). Colonies were screened for gene deletions by PCR for the presence of a band using oligonucleotides flanking the excised locus (Up/Dwn Check) and for the loss of the target gene (ORF Chk) using the oligonucleotides listed in Table S17 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS17/14211299/1.

Complementation plasmids for BRG1, UME7, orf19.6864, PHO100, and FGR2 mutants were constructed by amplifying the wild-type locus from the background strains for all CRISPR-based deletions using primers listed in Table S17 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS17/14211299/1 and cloning them into pSFS2A as described above using gap-repair cloning. All plasmids were cloned in two pieces with the exception of UME7, which required a three-piece cloning to include an MluI site for linearization prior to transformation (plasmids listed in Table S18 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS18/14211305/1). Genes were confirmed to be identical to the expected sequence by Sanger sequencing and then linearized using PacI, MluI, PacI, AgeI, and CspCI for BRG1, UME7, orf19.6864, PHO100, and FGR2, respectively, for lithium acetate transformation. Cells were selected on 200 μg/ml NAT and confirmed to contain the gene integrated at the native locus by PCR using primers listed in Table S17 at https://figshare.com/articles/dataset/mBio_Wang_etal_2020_supplement_TableS17/14211299/1.

Filamentation.

For liquid filamentation assays, cells were grown overnight in YPD at 30°C. The next day, cultures were spun down, washed in phosphate-buffered saline (PBS), inoculated 1:100 into RPMI 1640 liquid medium, and allowed to grow for either 1 or 4 h before imaging. Images were captured at ×40 magnification across 6 fields of view per sample to include at least 50 cells. At least four biological replicates were performed per genotype.

For solid medium filamentation, cells were taken from YPD solid medium, counted by hemocytometer, and plated to Spider or YPD medium at 100 cells per plate. Plates were incubated at 30°C for 7 days and imaged. Filamentation was measured using MIPAR as previously described (100). At least six biological replicates were performed per genotype.

Data availability.

The data sets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. The transcriptional profiling data generated in this study have been submitted to the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/) under accession number PRJNA630085. Tools developed to aid in gene ontology analysis are available from https://github.com/joshuamwang/CAlbicansR.

ACKNOWLEDGMENTS

We thank the entire Anderson lab for helpful discussions and feedback during the production of this work. We also thank the lab of Chad Rappleye for comments and critique of this work and Kou-San Ju, Christina Cuomo, and Lara Sucheston-Campbell for feedback on network analysis and visualization.

This work was supported by National Institutes of Health grants R01AI148788 to M.Z.A. and R01AI141893/R01AI081704 to R.J.B. R.J.F. was supported by an NIH F31 fellowship (1F31DE029409-01). This work was also supported by American Heart Association grant 20PRE35200201 (2020) to M.J.D.

Footnotes

Citation Wang JM, Woodruff AL, Dunn MJ, Fillinger RJ, Bennett RJ, Anderson MZ. 2021. Intraspecies transcriptional profiling reveals key regulators of Candida albicans pathogenic traits. mBio 12:e00586-21. https://doi.org/10.1128/mBio.00586-21.

REFERENCES

  • 1. Neville BA, d’Enfert C, Bougnoux ME. 2015. Candida albicans commensalism in the gastrointestinal tract. FEMS Yeast Res 15:fov081. doi: 10.1093/femsyr/fov081.
  • 2. Horn DL, Neofytos D, Anaissie EJ, Fishman JA, Steinbach WJ, Olyaei AJ, Marr KA, Pfaller MA, Chang CH, Webster KM. 2009. Epidemiology and outcomes of candidemia in 2019 patients: data from the prospective antifungal therapy alliance registry. Clin Infect Dis 48:1695–1703. doi: 10.1086/599039.
  • 3. Yapar N. 2014. Epidemiology and risk factors for invasive candidiasis. Ther Clin Risk Manag 10:95–105. doi: 10.2147/TCRM.S40160.
  • 4. Danhof HA, Vylkova S, Vesely EM, Ford AE, Gonzalez-Garay M, Lorenz MC. 2016. Robust extracellular pH modulation by Candida albicans during growth in carboxylic acids. mBio 7:e01646-16. doi: 10.1128/mBio.01646-16.
  • 5. Ene IV, Cheng SC, Netea MG, Brown AJ. 2013. Growth of Candida albicans cells on the physiologically relevant carbon source lactate affects their recognition and phagocytosis by immune cells. Infect Immun 81:238–248. doi: 10.1128/IAI.01092-12.
  • 6. Zeuthen ML, Howard DH. 1989. Thermotolerance and the heat-shock response in Candida albicans. J Gen Microbiol 135:2509–2518. doi: 10.1099/00221287-135-9-2509.
  • 7. Odds FC, Bougnoux ME, Shaw DJ, Bain JM, Davidson AD, Diogo D, Jacobsen MD, Lecomte M, Li SY, Tavanti A, Maiden MC, Gow NA, d’Enfert C. 2007. Molecular phylogenetics of Candida albicans. Eukaryot Cell 6:1041–1052. doi: 10.1128/EC.00041-07.
  • 8. Lott TJ, Holloway BP, Logan DA, Fundyga R, Arnold J. 1999. Towards understanding the evolution of the human commensal yeast Candida albicans. Microbiology 145:1137–1143. doi: 10.1099/13500872-145-5-1137.
  • 9. Schmid J, Herd S, Hunter PR, Cannon RD, Yasin MS, Samad S, Carr M, Parr D, McKinney W, Schousboe M, Harris B, Ikram R, Harris M, Restrepo A, Hoyos G, Singh KP. 1999. Evidence for a general-purpose genotype in Candida albicans, highly prevalent in multiple geographical regions, patient types and types of infection. Microbiology 145:2405–2413. doi: 10.1099/00221287-145-9-2405.
  • 10. Ropars J, Maufrais C, Diogo D, Marcet-Houben M, Perin A, Sertour N, Mosca K, Permal E, Laval G, Bouchier C, Ma L, Schwartz K, Voelz K, May RC, Poulain J, Battail C, Wincker P, Borman AM, Chowdhary A, Fan S, Kim SH, Le Pape P, Romeo O, Shin JH, Gabaldon T, Sherlock G, Bougnoux M-E, d’Enfert C. 2018. Gene flow contributes to diversification of the major fungal pathogen Candida albicans. Nat Commun 9:2253. doi: 10.1038/s41467-018-04787-4.
  • 11. Hirakawa MP, Martinez DA, Sakthikumar S, Anderson MZ, Berlin A, Gujja S, Zeng Q, Zisson E, Wang JM, Greenberg JM, Berman J, Bennett RJ, Cuomo CA. 2015. Genetic and phenotypic intra-species variation in Candida albicans. Genome Res 25:413–425. doi: 10.1101/gr.174623.114.
  • 12. Wang JM, Bennett RJ, Anderson MZ. 2018. The genome of the human pathogen Candida albicans is shaped by mutation and cryptic sexual recombination. mBio 9:e01205-18. doi: 10.1128/mBio.01205-18.
  • 13. Liang SH, Anderson MZ, Hirakawa MP, Wang JM, Frazer C, Alaalm LM, Thomson GJ, Ene IV, Bennett RJ. 2019. Hemizygosity enables a mutational transition governing fungal virulence and commensalism. Cell Host Microbe 25:418–431.e416. doi: 10.1016/j.chom.2019.01.005.
  • 14. Pande K, Chen C, Noble SM. 2013. Passage through the mammalian gut triggers a phenotypic switch that promotes Candida albicans commensalism. Nat Genet 45:1088–1091. doi: 10.1038/ng.2710.
  • 15. Tao L, Du H, Guan G, Dai Y, Nobile CJ, Liang W, Cao C, Zhang Q, Zhong J, Huang G. 2014. Discovery of a “white-gray-opaque” tristable phenotypic switching system in Candida albicans: roles of non-genetic diversity in host adaptation. PLoS Biol 12:e1001830. doi: 10.1371/journal.pbio.1001830.
  • 16. Takagi J, Singh-Babak SD, Lohse MB, Dalal CK, Johnson AD. 2019. Candida albicans white and opaque cells exhibit distinct spectra of organ colonization in mouse models of infection. PLoS One 14:e0218037. doi: 10.1371/journal.pone.0218037.
  • 17. Moyes DL, Runglall M, Murciano C, Shen C, Nayar D, Thavaraj S, Kohli A, Islam A, Mora-Montes H, Challacombe SJ, Naglik JR. 2010. A biphasic innate immune MAPK response discriminates between the yeast and hyphal forms of Candida albicans in epithelial cells. Cell Host Microbe 8:225–235. doi: 10.1016/j.chom.2010.08.002.
  • 18. Xie J, Tao L, Nobile CJ, Tong Y, Guan G, Sun Y, Cao C, Hernday AD, Johnson AD, Zhang L, Bai FY, Huang G. 2013. White-opaque switching in natural MTLa/alpha isolates of Candida albicans: evolutionary implications for roles in host adaptation, pathogenesis, and sex. PLoS Biol 11:e1001525. doi: 10.1371/journal.pbio.1001525.
  • 19. Peters BM, Palmer GE, Nash AK, Lilly EA, Fidel PL, Jr, Noverr MC. 2014. Fungal morphogenetic pathways are required for the hallmark inflammatory response during Candida albicans vaginitis. Infect Immun 82:532–543. doi: 10.1128/IAI.01417-13.
  • 20. Solis NV, Park YN, Swidergall M, Daniels KJ, Filler SG, Soll DR. 2018. Candida albicans white-opaque switching influences virulence but not mating during oropharyngeal candidiasis. Infect Immun 86:e00774-17. doi: 10.1128/IAI.00774-17.
  • 21. Noble SM, Gianetti BA, Witchley JN. 2017. Candida albicans cell-type switching and functional plasticity in the mammalian host. Nat Rev Microbiol 15:96–108. doi: 10.1038/nrmicro.2016.157.
  • 22. Soll DR, Pujol C. 2003. Candida albicans clades. FEMS Immunol Med Microbiol 39:1–7. doi: 10.1016/S0928-8244(03)00242-6.
  • 23. MacCallum DM, Castillo L, Nather K, Munro CA, Brown AJ, Gow NA, Odds FC. 2009. Property differences among the four major Candida albicans strain clades. Eukaryot Cell 8:373–387. doi: 10.1128/EC.00387-08.
  • 24. Pujol C, Pfaller MA, Soll DR. 2004. Flucytosine resistance is restricted to a single genetic clade of Candida albicans. Antimicrob Agents Chemother 48:262–266. doi: 10.1128/aac.48.1.262-266.2004.
  • 25. Dodgson AR, Dodgson KJ, Pujol C, Pfaller MA, Soll DR. 2004. Clade-specific flucytosine resistance is due to a single nucleotide change in the FUR1 gene of Candida albicans. Antimicrob Agents Chemother 48:2223–2227. doi: 10.1128/AAC.48.6.2223-2227.2004.
  • 26. Wu W, Lockhart SR, Pujol C, Srikantha T, Soll DR. 2007. Heterozygosity of genes on the sex chromosome regulates Candida albicans virulence. Mol Microbiol 64:1587–1604. doi: 10.1111/j.1365-2958.2007.05759.x.
  • 27. Huang MY, Woolford CA, May G, McManus CJ, Mitchell AP. 2019. Circuit diversification in a biofilm regulatory network. PLoS Pathog 15:e1007787. doi: 10.1371/journal.ppat.1007787.
  • 28. Li X, Yan Z, Xu J. 2003. Quantitative variation of biofilms among strains in natural populations of Candida albicans. Microbiology (Reading) 149:353–362. doi: 10.1099/mic.0.25932-0.
  • 29. Goberna M, Verdu M. 2016. Predicting microbial traits with phylogenies. ISME J 10:959–967. doi: 10.1038/ismej.2015.171.
  • 30. Raes J, Letunic I, Yamada T, Jensen LJ, Bork P. 2011. Toward molecular trait-based ecology through integration of biogeochemical, geographical and metagenomic data. Mol Syst Biol 7:473. doi: 10.1038/msb.2011.6.
  • 31. Obenauer JC, Denson J, Mehta PK, Su X, Mukatira S, Finkelstein DB, Xu X, Wang J, Ma J, Fan Y, Rakestraw KM, Webster RG, Hoffmann E, Krauss S, Zheng J, Zhang Z, Naeve CW. 2006. Large-scale sequence analysis of avian influenza isolates. Science 311:1576–1580. doi: 10.1126/science.1121586.
  • 32. Dufour YS, Gillet S, Frankel NW, Weibel DB, Emonet T. 2016. Direct correlation between motile behavior and protein abundance in single cells. PLoS Comput Biol 12:e1005041. doi: 10.1371/journal.pcbi.1005041.
  • 33. Skelly DA, Merrihew GE, Riffle M, Connelly CF, Kerr EO, Johansson M, Jaschob D, Graczyk B, Shulman NJ, Wakefield J, Cooper SJ, Fields S, Noble WS, Muller EG, Davis TN, Dunham MJ, Maccoss MJ, Akey JM. 2013. Integrative phenomics reveals insight into the structure of phenotypic diversity in budding yeast. Genome Res 23:1496–1504. doi: 10.1101/gr.155762.113.
  • 34. Kvitek DJ, Will JL, Gasch AP. 2008. Variations in stress sensitivity and genomic expression in diverse S. cerevisiae isolates. PLoS Genet 4:e1000223. doi: 10.1371/journal.pgen.1000223.
  • 35. Bruno VM, Wang Z, Marjani SL, Euskirchen GM, Martin J, Sherlock G, Snyder M. 2010. Comprehensive annotation of the transcriptome of the human fungal pathogen Candida albicans using RNA-seq. Genome Res 20:1451–1458. doi: 10.1101/gr.109553.110.
  • 36. Tuch BB, Mitrovich QM, Homann OR, Hernday AD, Monighetti CK, De La Vega FM, Johnson AD. 2010. The transcriptomes of two heritable cell types illuminate the circuit governing their differentiation. PLoS Genet 6:e1001070. doi: 10.1371/journal.pgen.1001070.
  • 37. Niemiec MJ, Grumaz C, Ermert D, Desel C, Shankar M, Lopes JP, Mills IG, Stevens P, Sohn K, Urban CF. 2017. Dual transcriptome of the immediate neutrophil and Candida albicans interplay. BMC Genomics 18:696. doi: 10.1186/s12864-017-4097-4.
  • 38. Lorenz MC, Bender JA, Fink GR. 2004. Transcriptional response of Candida albicans upon internalization by macrophages. Eukaryot Cell 3:1076–1087. doi: 10.1128/EC.3.5.1076-1087.2004.
  • 39. Sellam A, van het Hoog M, Tebbji F, Beaurepaire C, Whiteway M, Nantel A. 2014. Modeling the transcriptional regulatory network that controls the early hypoxic response in Candida albicans. Eukaryot Cell 13:675–690. doi: 10.1128/EC.00292-13.
  • 40. Munoz JF, Delorey T, Ford CB, Li BY, Thompson DA, Rao RP, Cuomo CA. 2019. Coordinated host-pathogen transcriptional dynamics revealed using sorted subpopulations and single macrophages infected with Candida albicans. Nat Commun 10:1607. doi: 10.1038/s41467-019-09599-8.
  • 41. Amorim-Vaz S, Tran VDT, Pradervand S, Pagni M, Coste AT, Sanglard D. 2015. RNA enrichment method for quantitative transcriptional analysis of pathogens in vivo applied to the fungus Candida albicans. mBio 6:e00942-15. doi: 10.1128/mBio.00942-15.
  • 42. Bruno VM, Shetty AC, Yano J, Fidel PL, Jr, Noverr MC, Peters BM. 2015. Transcriptomic analysis of vulvovaginal candidiasis identifies a role for the NLRP3 inflammasome. mBio 6:e00182-15. doi: 10.1128/mBio.00182-15.
  • 43. Xu W, Solis NV, Ehrlich RL, Woolford CA, Filler SG, Mitchell AP. 2015. Activation and alliance of regulatory pathways in C. albicans during mammalian infection. PLoS Biol 13:e1002076. doi: 10.1371/journal.pbio.1002076.
  • 44. Wang T, Xiu J, Zhang Y, Wu J, Ma X, Wang Y, Guo G, Shang X. 2017. Transcriptional responses of Candida albicans to antimicrobial peptide MAF-1A. Front Microbiol 8:894. doi: 10.3389/fmicb.2017.00894.
  • 45. Szabo K, Jakab A, Poliska S, Petrenyi K, Kovacs K, Issa LHB, Emri T, Pocsi I, Dombradi V. 2019. Deletion of the fungus specific protein phosphatase Z1 exaggerates the oxidative stress response in Candida albicans. BMC Genomics 20:873. doi: 10.1186/s12864-019-6252-6.
  • 46. McCall AD, Kumar R, Edgerton M. 2018. Candida albicans Sfl1/Sfl2 regulatory network drives the formation of pathogenic microcolonies. PLoS Pathog 14:e1007316. doi: 10.1371/journal.ppat.1007316.
  • 47. Nobile CJ, Fox EP, Nett JE, Sorrells TR, Mitrovich QM, Hernday AD, Tuch BB, Andes DR, Johnson AD. 2012. A recently evolved transcriptional network controls biofilm development in Candida albicans. Cell 148:126–138. doi: 10.1016/j.cell.2011.10.048.
  • 48. Hernday AD, Lohse MB, Nobile CJ, Noiman L, Laksana CN, Johnson AD. 2016. Ssn6 defines a new level of regulation of white-opaque switching in Candida albicans and is required for the stochasticity of the switch. mBio 7:e01565-15. doi: 10.1128/mBio.01565-15.
  • 49. Lohse MB, Johnson AD. 2016. Identification and characterization of Wor4, a new transcriptional regulator of white-opaque switching. G3 (Bethesda) 6:721–729. doi: 10.1534/g3.115.024885.
  • 50. Glazier VE, Murante T, Murante D, Koselny K, Liu Y, Kim D, Koo H, Krysan DJ. 2017. Genetic analysis of the Candida albicans biofilm transcription factor network using simple and complex haploinsufficiency. PLoS Genet 13:e1006948. doi: 10.1371/journal.pgen.1006948.
  • 51. Fox EP, Bui CK, Nett JE, Hartooni N, Mui MC, Andes DR, Nobile CJ, Johnson AD. 2015. An expanded regulatory network temporally controls Candida albicans biofilm formation. Mol Microbiol 96:1226–1239. doi: 10.1111/mmi.13002.
  • 52. Desai PR, Lengeler K, Kapitan M, Janssen SM, Alepuz P, Jacobsen ID, Ernst JF. 2018. The 5′ untranslated region of the EFG1 transcript promotes its translation to regulate hyphal morphogenesis in Candida albicans. mSphere 3:e00280-18. doi: 10.1128/mSphere.00280-18.
  • 53. Guan Z, Liu H. 2015. The WOR1 5′ untranslated region regulates white-opaque switching in Candida albicans by reducing translational efficiency. Mol Microbiol 97:125–138. doi: 10.1111/mmi.13014.
  • 54. Childers DS, Mundodi V, Banerjee M, Kadosh D. 2014. A 5′ UTR-mediated translational efficiency mechanism inhibits the Candida albicans morphological transition. Mol Microbiol 92:570–585. doi: 10.1111/mmi.12576.
  • 55. Mayr C. 2017. Regulation by 3′-untranslated regions. Annu Rev Genet 51:171–194. doi: 10.1146/annurev-genet-120116-024704.
  • 56. Kim JM, Vanguri S, Boeke JD, Gabriel A, Voytas DF. 1998. Transposable elements and genome organization: a comprehensive survey of retrotransposons revealed by the complete Saccharomyces cerevisiae genome sequence. Genome Res 8:464–478. doi: 10.1101/gr.8.5.464.
  • 57. Goodwin TJ, Poulter RT. 2000. Multiple LTR-retrotransposon families in the asexual yeast Candida albicans. Genome Res 10:174–191. doi: 10.1101/gr.10.2.174.
  • 58. Anderson MZ, Gerstein AC, Wigen L, Baller JA, Berman J. 2014. Silencing is noisy: population and cell level noise in telomere-adjacent genes is dependent on telomere position and sir2. PLoS Genet 10:e1004436. doi: 10.1371/journal.pgen.1004436.
  • 59. Moran GP, Anderson MZ, Myers LC, Sullivan DJ. 2019. Role of Mediator in virulence and antifungal drug resistance in pathogenic fungi. Curr Genet 65:621–630. doi: 10.1007/s00294-019-00932-8.
  • 60. Brown DH, Jr, Giusani AD, Chen X, Kumamoto CA. 1999. Filamentous growth of Candida albicans in response to physical environmental cues and its regulation by the unique CZF1 gene. Mol Microbiol 34:651–662. doi: 10.1046/j.1365-2958.1999.01619.x.
  • 61. Hernday AD, Lohse MB, Fordyce PM, Nobile CJ, DeRisi JL, Johnson AD. 2013. Structure of the transcriptional network controlling white-opaque switching in Candida albicans. Mol Microbiol 90:22–35. doi: 10.1111/mmi.12329.
  • 62. Calderone R, Li D, Traven A. 2015. System-level impact of mitochondria on fungal virulence: to metabolism and beyond. FEMS Yeast Res 15:fov027. doi: 10.1093/femsyr/fov027.
  • 63. Grahl N, Demers EG, Lindsay AK, Harty CE, Willger SD, Piispanen AE, Hogan DA. 2015. Mitochondrial activity and Cyr1 are key regulators of Ras1 activation of C. albicans virulence pathways. PLoS Pathog 11:e1005133. doi: 10.1371/journal.ppat.1005133.
  • 64. Kim SW, Joo YJ, Chun YJ, Park YK, Kim J. 2019. Cross-talk between Tor1 and Sch9 regulates hyphae-specific genes or ribosomal protein genes in a mutually exclusive manner in Candida albicans. Mol Microbiol 112:1041–1057. doi: 10.1111/mmi.14346.
  • 65. Langfelder P, Horvath S. 2008. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9:559. doi: 10.1186/1471-2105-9-559.
  • 66. Cleary IA, Lazzell AL, Monteagudo C, Thomas DP, Saville SP. 2012. BRG1 and NRG1 form a novel feedback circuit regulating Candida albicans hypha formation and virulence. Mol Microbiol 85:557–573. doi: 10.1111/j.1365-2958.2012.08127.x.
  • 67. Richard ML, Nobile CJ, Bruno VM, Mitchell AP. 2005. Candida albicans biofilm-defective mutants. Eukaryot Cell 4:1493–1502. doi: 10.1128/EC.4.8.1493-1502.2005.
  • 68. Kordes A, Preusse M, Willger SD, Braubach P, Jonigk D, Haverich A, Warnecke G, Haussler S. 2019. Genetically diverse Pseudomonas aeruginosa populations display similar transcriptomic profiles in a cystic fibrosis explanted lung. Nat Commun 10:3397. doi: 10.1038/s41467-019-11414-3.
  • 69. Hazen TH, Daugherty SC, Shetty AC, Nataro JP, Rasko DA. 2017. Transcriptional variation of diverse enteropathogenic Escherichia coli isolates under virulence-inducing conditions. mSystems 2:e00024-17. doi: 10.1128/mSystems.00024-17.
  • 70. Le Gall T, Darlu P, Escobar-Paramo P, Picard B, Denamur E. 2005. Selection-driven transcriptome polymorphism in Escherichia coli/Shigella species. Genome Res 15:260–268. doi: 10.1101/gr.2405905.
  • 71. Vital M, Chai B, Ostman B, Cole J, Konstantinidis KT, Tiedje JM. 2015. Gene expression analysis of E. coli strains provides insights into the role of gene regulation in diversification. ISME J 9:1130–1140. doi: 10.1038/ismej.2014.204.
  • 72. Jaeger BN, Linker SB, Parylak SL, Barron JJ, Gallina IS, Saavedra CD, Fitzpatrick C, Lim CK, Schafer ST, Lacar B, Jessberger S, Gage FH. 2018. A novel environment-evoked transcriptional signature predicts reactivity in single dentate granule neurons. Nat Commun 9:3084. doi: 10.1038/s41467-018-05418-8.
  • 73. Maamar H, Raj A, Dubnau D. 2007. Noise in gene expression determines cell fate in Bacillus subtilis. Science 317:526–529. doi: 10.1126/science.1140818.
  • 74. Vihervaara A, Mahat DB, Guertin MJ, Chu T, Danko CG, Lis JT, Sistonen L. 2017. Transcriptional response to stress is pre-wired by promoter and enhancer architecture. Nat Commun 8:255. doi: 10.1038/s41467-017-00151-0.
  • 75. Pradhan A, Avelar GM, Bain JM, Childers D, Pelletier C, Larcombe DE, Shekhova E, Netea MG, Brown GD, Erwig L, Gow NAR, Brown AJP. 2019. Non-canonical signalling mediates changes in fungal cell wall PAMPs that drive immune evasion. Nat Commun 10:5315. doi: 10.1038/s41467-019-13298-9.
  • 76. Sorrells TR, Johnson AD. 2015. Making sense of transcription networks. Cell 161:714–723. doi: 10.1016/j.cell.2015.04.014.
  • 77. Amiri A, Coppola G, Scuderi S, Wu F, Roychowdhury T, Liu F, Pochareddy S, Shin Y, Safi A, Song L, Zhu Y, Sousa AMM, The PsychENCODE Consortium, Gerstein M, Crawford GE, Sestan N, Abyzov A, Vaccarino FM. 2018. Transcriptome and epigenome landscape of human cortical development modeled in organoids. Science 362:eaat6720. doi: 10.1126/science.aat6720.
  • 78. Nowakowski TJ, Bhaduri A, Pollen AA, Alvarado B, Mostajo-Radji MA, Di Lullo E, Haeussler M, Sandoval-Espinosa C, Liu SJ, Velmeshev D, Ounadjela JR, Shuga J, Wang X, Lim DA, West JA, Leyrat AA, Kent WJ, Kriegstein AR. 2017. Spatiotemporal gene expression trajectories reveal developmental hierarchies of the human cortex. Science 358:1318–1323. doi: 10.1126/science.aap8809.
  • 79. Argilaguet J, Pedragosa M, Esteve-Codina A, Riera G, Vidal E, Peligero-Cruz C, Casella V, Andreu D, Kaisho T, Bocharov G, Ludewig B, Heath S, Meyerhans A. 2019. Systems analysis reveals complex biological processes during virus infection fate decisions. Genome Res 29:907–919. doi: 10.1101/gr.241372.118.
  • 80. Rossi DCP, Gleason JE, Sanchez H, Schatzman SS, Culbertson EM, Johnson CJ, McNees CA, Coelho C, Nett JE, Andes DR, Cormack BP, Culotta VC. 2017. Candida albicans FRE8 encodes a member of the NADPH oxidase family that produces a burst of ROS during fungal morphogenesis. PLoS Pathog 13:e1006763. doi: 10.1371/journal.ppat.1006763.
  • 81.Schrevens S, Van Zeebroeck G, Riedelberger M, Tournu H, Kuchler K, Van Dijck P. 2018. Methionine is required for cAMP-PKA mediated morphogenesis and virulence of Candida albicans. Mol Microbiol 109:415–416. doi: 10.1111/mmi.14065. [DOI] [PubMed] [Google Scholar]
  • 82.Silao FGS, Ward M, Ryman K, Wallstrom A, Brindefalk B, Udekwu K, Ljungdahl PO. 2019. Mitochondrial proline catabolism activates Ras1/cAMP/PKA-induced filamentation in Candida albicans. PLoS Genet 15:e1007976. doi: 10.1371/journal.pgen.1007976. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Lu Y, Su C, Liu H. 2012. A GATA transcription factor recruits Hda1 in response to reduced Tor1 signaling to establish a hyphal chromatin state in Candida albicans. PLoS Pathog 8:e1002663. doi: 10.1371/journal.ppat.1002663. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Su C, Yu J, Sun Q, Liu Q, Lu Y. 2018. Hyphal induction under the condition without inoculation in Candida albicans is triggered by Brg1-mediated removal of NRG1 inhibition. Mol Microbiol 108:410–423. doi: 10.1111/mmi.13944. [DOI] [PubMed] [Google Scholar]
  • 85.Guthrie C, Fink GR. 1991. Guide to yeast genetics and molecular biology. Academic Press, San Diego, CA. [Google Scholar]
  • 86.Parkhomchuk D, Borodina T, Amstislavskiy V, Banaru M, Hallen L, Krobitsch S, Lehrach H, Soldatov A. 2009. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res 37:e123. doi: 10.1093/nar/gkp596. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Levin JZ, Yassour M, Adiconis X, Nusbaum C, Thompson DA, Friedman N, Gnirke A, Regev A. 2010. Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat Methods 7:709–715. doi: 10.1038/nmeth.1491. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Anderson MZ, Porman AM, Wang N, Mancera E, Huang D, Cuomo CA, Bennett RJ. 2016. A multistate toggle switch defines fungal cell fates and is regulated by synergistic genetic cues. PLoS Genet 12:e1006353. doi: 10.1371/journal.pgen.1006353. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. 2013. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 14:R36. doi: 10.1186/gb-2013-14-4-r36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Anders S, Pyl PT, Huber W. 2015. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31:166–169. doi: 10.1093/bioinformatics/btu638. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–140. doi: 10.1093/bioinformatics/btp616. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. 2013. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Arnaud MB, Costanzo MC, Shah P, Skrzypek MS, Sherlock G. 2009. Gene Ontology and the annotation of pathogen genomes: the case of Candida albicans. Trends Microbiol 17:295–303. doi: 10.1016/j.tim.2009.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Zhang B, Horvath S. 2005. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4:Article17. doi: 10.2202/1544-6115.1128. [DOI] [PubMed] [Google Scholar]
  • 95.Tibshirani R, Walther G, Hastie T. 2001. Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc B 63:411–423. doi: 10.1111/1467-9868.00293. [DOI] [Google Scholar]
  • 96.Reuss O, Vik A, Kolter R, Morschhauser J. 2004. The SAT1 flipper, an optimized tool for gene disruption in Candida albicans. Gene 341:119–127. doi: 10.1016/j.gene.2004.06.021. [DOI] [PubMed] [Google Scholar]
  • 97.Hernday AD, Noble SM, Mitrovich QM, Johnson AD. 2010. Genetics and molecular biology in Candida albicans. Methods Enzymol 470:737–758. doi: 10.1016/S0076-6879(10)70031-8. [DOI] [PubMed] [Google Scholar]
  • 98.Jacobus AP, Gross J. 2015. Optimal cloning of PCR fragments by homologous recombination in Escherichia coli. PLoS One 10:e0119221. doi: 10.1371/journal.pone.0119221. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Nguyen N, Quail MMF, Hernday AD. 2017. An efficient, rapid, and recyclable system for CRISPR-mediated genome editing in Candida albicans. mSphere 2:e00149-17. doi: 10.1128/mSphereDirect.00149-17. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Dunn MJ, Fillinger RJ, Anderson LM, Anderson MZ. 2020. Automated quantification of Candida albicans biofilm-related phenotypes reveals additive contributions to biofilm production. NPJ Biofilms Microbiomes 6:36. doi: 10.1038/s41522-020-00149-5. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

FIG S1

Phylogenetic relationship of C. albicans strains used in this study. The phylogenetic relationship of the 21 C. albicans isolates used for transcriptional profiling is shown based on comparison of full-genome sequences. Bootstrap support for each node is indicated. Assignments of isolates to fingerprinting clades are color coded. Download FIG S1, PDF file, 0.1 MB.

Copyright © 2021 Wang et al.

This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

FIG S2

Correlation of gene expression with phylogenetic relationships among the C. albicans isolates. (A) Read counts were calculated for all genes from each strain and binned based on the value of transcripts per million (TPM). The fraction of reads within each bin was then plotted per strain. Clade assignments for each strain are color coded as indicated. (B) Similarity in transcript profiles among the 42 biological samples was assessed by hierarchical clustering of TPM values using Euclidean distance and average linkage. One thousand bootstraps were performed. The resulting bootstrap values are shown in green, and corresponding approximately unbiased (AU) P values are shown in red at each node. (C) A heat map represents the RNA transcripts per million (TPM) of the 50 genes with the greatest difference in expression among the 21 isolates on a log2 scale. The expression for each strain is the average for two biological replicates. The strains are ordered based on their phylogenetic relationships, and their clade assignments are color coded. (D) The 32 genes whose expression significantly correlated with the strain phylogeny are listed. Genes that contributed to enrichment of the gene ontology (GO) terms associated with this list are boldfaced. Significant GO categories are listed. Download FIG S2, PDF file, 1.4 MB.
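
The clustering described in panel B can be illustrated with a minimal sketch, assuming a small made-up TPM table. The sample names, values, and the helper `average_linkage` are hypothetical and are not taken from the study's code; the bootstrap and AU P value steps are not reproduced here.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    profiles: dict mapping sample name -> list of TPM values.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of sample names.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average linkage: mean of all pairwise sample distances
            # between the members of the two clusters.
            d = sum(
                euclidean(profiles[x], profiles[y]) for x in a for y in b
            ) / (len(a) * len(b))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(a + b)
        merges.append((a, b, d))
    return merges


# Hypothetical toy data: two strains, two replicates each, three genes.
toy_tpm = {
    "strainA_rep1": [10.0, 200.0, 5.0],
    "strainA_rep2": [12.0, 190.0, 6.0],
    "strainB_rep1": [80.0, 20.0, 50.0],
    "strainB_rep2": [75.0, 25.0, 55.0],
}

merges = average_linkage(toy_tpm)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 2))
```

In this toy table, replicates of the same strain join first and the two strains are merged last, which is the pattern one would expect when replicate variation is small relative to strain differences.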

FIG S3

Transcriptional profiles are not more similar among genetically similar strains. A distance matrix based on similarity in transcriptional profiles was constructed for all 21 C. albicans isolates. Distances were separated based on comparison between strains within the same clade or between strains in different clades based on fingerprinting analysis and plotted. Intraclade and interclade comparisons were not statistically different. Download FIG S3, PDF file, 0.7 MB.

FIG S4

Greater dissimilarity in gene expression correlates with more differential gene expression. The number of differentially expressed genes between any two strains (adjusted P value < 0.05, 2-fold cutoff) and the similarity in overall gene expression between two strains in all pairwise comparisons were plotted. Comparisons were performed in all pairwise combinations for all strains and color coded for comparisons between two strains within the same clade or marked as gray for comparison across clades. These data produced an inverse relationship between expression similarity and the number of differentially expressed genes. Download FIG S4, PDF file, 0.8 MB.

FIG S5

Strain-specific gene expression among C. albicans isolates. (A) The number of genes expressed uniquely by one strain compared to all other 20 transcriptionally profiled isolates was plotted for each of the 21 isolates. Isolates that uniquely expressed a greater number of genes beyond 2 standard deviations are labeled. (B) The number of strain-specific genes for each isolate is listed. Download FIG S5, PDF file, 0.5 MB.

FIG S6

Untranslated regions (UTRs) in C. albicans vary in length with gene function. (A) The UTR length for all genes in each isolate was determined by measuring the length of continuous reads extending beyond defined coding sequences on the appropriate strand. Lengths for each gene were plotted with 5′ UTRs above and 3′ UTRs below the x axis. Red vertical lines indicate the 95% cutoff value. (B) The 5′ UTR was detected from aligned transcripts from each of the 21 sequenced isolates. The length of the 5′ UTR for each gene was averaged for all genes with detectable expression in at least 15 strains. The length of all gene 5′ UTRs is plotted alongside those of all C. albicans transcription factors as defined in the Candida Genome Database (http://candidagenome.org). (C) The 3′ UTRs of all genes in the C. albicans genome were similarly determined from transcriptional profiling. The 3′ UTRs of all genes were plotted alongside all genes defined by the gene ontology term “ribosome.” Download FIG S6, PDF file, 0.8 MB.

FIG S7

Retroelement expression does not correlate with copy number. (A) The abundance of each transposon-associated long terminal repeat (LTR) was determined from RNA-Seq for each strain and is shown as a stacked bar and color coded to indicate each LTR class. Strains are color coded by clade. (B) The number of retroelements encoded in the genome of each C. albicans isolate was determined from previous whole-genome sequencing (11) and plotted against the value of total transcripts per million (TPM) for all retroelements. A linear model was fitted to the data to detect a relationship between copy number and expression. Download FIG S7, PDF file, 1.0 MB.
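
The linear model in panel B can be sketched as an ordinary least-squares fit of total retroelement TPM against copy number. This is only an illustration under stated assumptions: the copy numbers, TPM values, and the helper `fit_line` are hypothetical, and the study's actual fitting procedure is not specified in this caption.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r_squared is the fraction of variance in y explained by x.
    r_squared = (sxy ** 2) / (sxx * syy) if syy else 0.0
    return slope, intercept, r_squared


# Hypothetical values for four isolates (not data from the study).
copy_number = [10, 12, 14, 16]
total_tpm = [300.0, 280.0, 310.0, 290.0]

slope, intercept, r_squared = fit_line(copy_number, total_tpm)
print(slope, intercept, r_squared)
```

A slope and r-squared near zero, as in this toy example, correspond to the situation the figure title describes, where expression does not track copy number.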

FIG S8

TLO genes do not display increased expression plasticity. (A) The coefficient of variation (CV) for gene expression between biological replicates for each strain was calculated for all genes and averaged across strains. The CV of each gene (gray dots) was plotted across the eight C. albicans chromosomes along with a smoothed average using nonoverlapping 10-kb windows (red lines). Chromosomal positions are depicted below. (B) The coefficient of variation (CV) of gene expression for each gene was calculated between biological replicates for each clinical isolate. The CV was averaged across strains and plotted with a smoothing line (red). Individual TLO genes were plotted against the distribution (arrows) and compared to 2 standard deviations from the mean (dashed black line). The blue arrow indicates the chromosome internal TLO, TLOα34. Download FIG S8, PDF file, 2.7 MB.
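
The per-gene statistic described here, the CV between biological replicates averaged across strains, can be sketched as follows. The strain names and expression values are hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not state which estimator was used.

```python
from statistics import mean, stdev


def replicate_cv(values):
    """Coefficient of variation (sample SD / mean) across replicates.

    Returns None when the mean is zero, where the CV is undefined.
    """
    m = mean(values)
    if m == 0:
        return None
    return stdev(values) / m


def mean_cv_across_strains(expr_by_strain):
    """Average the replicate CV of one gene over all strains.

    expr_by_strain: dict mapping strain name -> list of replicate values.
    Strains with an undefined CV are skipped.
    """
    cvs = [replicate_cv(reps) for reps in expr_by_strain.values()]
    cvs = [cv for cv in cvs if cv is not None]
    return mean(cvs) if cvs else None


# Hypothetical expression of one gene, two replicates per strain.
gene_expr = {
    "strain1": [10.0, 10.0],  # identical replicates, CV = 0
    "strain2": [8.0, 12.0],   # mean 10, sample SD sqrt(8)
}

gene_cv = mean_cv_across_strains(gene_expr)
print(round(gene_cv, 4))
```

Repeating this for every gene gives the distribution against which individual genes, such as the TLO genes, can be compared to a threshold of 2 standard deviations from the mean.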

FIG S9

Module-phenotype relationships for C. albicans isolates. Modules built from C. albicans expression data with weighted gene correlation network analysis (WGCNA) were correlated against each phenotype from the work of Hirakawa et al. (M. P. Hirakawa, D. A. Martinez, S. Sakthikumar, M. Z. Anderson, et al., Genome Res 25:413–425, 2015, https://doi.org/10.1101/gr.174623.114). Significant positive and negative correlations of modules and phenotypes are indicated in red and blue, respectively. Each significant interaction contains the correlation coefficient (top line) and the P value (bottom line). Download FIG S9, PDF file, 2.3 MB.
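
The idea of correlating a module against a phenotype can be sketched with a Pearson correlation between a per-strain module summary and a per-strain trait value. This is a simplified illustration, not the WGCNA procedure: the module summary here is the mean of z-scored expression, used as a stand-in for the WGCNA module eigengene, and the gene names, expression values, and phenotype scores are hypothetical. The P value calculation is omitted.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def module_summary(module_expr):
    """Per-strain summary of a module: mean z-scored expression.

    module_expr: dict mapping gene -> list of per-strain values.
    Genes with no variation across strains are skipped.
    (Assumption: a stand-in for the WGCNA module eigengene.)
    """
    z_rows = []
    for values in module_expr.values():
        sd = pstdev(values)
        if sd == 0:
            continue
        m = mean(values)
        z_rows.append([(v - m) / sd for v in values])
    return [mean(col) for col in zip(*z_rows)]


# Hypothetical module of two genes measured in four strains.
module_expr = {
    "geneX": [1.0, 2.0, 3.0, 4.0],
    "geneY": [2.0, 4.0, 6.0, 8.0],
}
# Hypothetical phenotype scores for the same four strains.
phenotype = [0.1, 0.2, 0.3, 0.4]

r = pearson(module_summary(module_expr), phenotype)
print(round(r, 3))
```

Running this over every module and phenotype pair would give a matrix of correlation coefficients like the one summarized in the figure.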

FIG S10

Expression of ME8 genes is unique to P37037 gray cells. (A) qRT-PCR-measured abundance of ME8 genes in white and gray populations of P37037. Abundance was measured for cells in logarithmic growth at 30°C and normalized to ACT1. n = 3 biological replicates. (B) A heat map represents the gene expression in transcripts per million (TPM) of ME8 genes from white, gray, and opaque SC5314 cells taken from the work of Liang et al. (S. H. Liang, M. Z. Anderson, M. P. Hirakawa, J. M. Wang, et al., Cell Host Microbe 25:418–431.e416, 2019, https://doi.org/10.1016/j.chom.2019.01.005) and plotted on a log2 scale ranging from −10 to 10. Download FIG S10, PDF file, 0.8 MB.

Data Availability Statement

The data sets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. The transcriptional profiling data generated in this study have been submitted to the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/) under accession number PRJNA630085. Tools developed to aid in gene ontology analysis are available from https://github.com/joshuamwang/CAlbicansR.


Articles from mBio are provided here courtesy of American Society for Microbiology (ASM)
