Abstract
Purpose.
Ovarian cancer is a heterogeneous disease that can be divided into multiple subtypes with variable etiology, pathogenesis, and prognosis. We analyzed DNA methylation profiling data to identify biologic subgroups of ovarian cancer and study their relationship with histologic subtypes, copy number variation, RNA expression data and outcomes.
Experimental design.
A total of 162 paraffin-embedded ovarian epithelial tumor tissues, including the five major epithelial ovarian tumor subtypes (high- and low-grade serous, endometrioid, mucinous and clear cell) and tumors of low malignant potential, were selected from two different sources: the Polish Ovarian Cancer Study and the Surveillance, Epidemiology, and End Results Residual Tissue Repository (SEER RTR). Analyses were restricted to Caucasian women. Methylation profiling was conducted using the Illumina 450K methylation array. Array copy number data were available for 45 tumors. NanoString gene expression data for 39 genes were available for 61 high-grade serous carcinomas (HGSC).
Results.
Consensus non-negative matrix factorization clustering of the 1,000 most variable CpG sites showed four major clusters among all epithelial ovarian cancers. We observed statistically significant differences in survival (log-rank test P=9.1×10⁻⁷) and genomic instability across these clusters. Within HGSC, clustering showed three subgroups with survival differences (log-rank test P=0.002). Comparing models with and without methylation subgroups in addition to previously identified gene expression subtypes suggested that the methylation subgroups added significant survival information (P=0.007).
Conclusion.
DNA methylation profiling of ovarian cancer identified novel molecular subgroups that had significant survival differences and provided insights into the molecular underpinnings of ovarian cancer.
Introduction
Ovarian cancer is the fifth leading cause of cancer death among women in the United States (1). Currently, there is no effective screening strategy, and most cancers are detected at advanced stages, limiting curative therapies. Ovarian cancer is a histologically heterogeneous set of diseases with approximately 90% of cases being of epithelial origin. The major histotypes of malignant epithelial ovarian cancers (EOCs) include serous (high- and low-grade), endometrioid, clear cell and mucinous (2). Serous tumors account for 60%–70% of EOC, endometrioid for approximately 15%, clear cell for 5% and mucinous for approximately 10% (3–7), although these rates vary globally and have changed over time (8). Of all these, high-grade serous carcinomas (HGSC) represent the most common histotype and are associated with poor prognosis (3,5). Ovarian tumors of low malignant potential (LMP) or borderline tumors account for 15% of all epithelial neoplasms, and are generally considered apart from extraovarian metastasis because of differences in proposed pathogenesis and highly favorable clinical outcomes in the absence of invasive implants (9,10).
Several studies have shown that the major histologic subtypes of EOC have different risk factors (11), genetic susceptibilities (12–14) and clinical response to therapy (15,16). These histotypes also differ at the molecular level. Serous tumors, mostly high grade, were the only histological subtypes included in The Cancer Genome Atlas (TCGA) (17). Results from this large-scale genomics analysis showed that these tumors have near ubiquitous TP53 mutations (96% of tumors) and a high level of copy number alterations. TCGA analyses confirmed four transcriptional subtypes that were previously described for HGSC (18). Large-scale integrated characterization of the other subtypes is mostly lacking. Mutations in BRAF, KRAS, PTEN and β-catenin have been shown in endometrioid, clear cell, low-grade serous and mucinous EOC (19).
There is increasing evidence that the different histotypes of EOC have different cells of origin, including that many HGSCs originate from secretory cells in the fimbria of the fallopian tubes, while endometrioid and clear cell carcinomas appear to arise from endometriosis, which itself may represent explanted menstrual endometrium (20). The origin of mucinous tumors is less well understood. It has been suggested that a possible origin is in epithelial nests located in the tubo-peritoneal junction (19). Of note, historically, many tumors classified as ovarian mucinous carcinomas represent metastases to the ovary, often from an occult gastrointestinal primary (21). Large-scale identification of molecular features in tumors may help to identify key molecular characteristics that differ across histotypes, refine tumor classification beyond histology alone and inform about potential treatments and etiology, which could lead to development of new approaches for prevention and early detection of EOC.
DNA methylation is a key epigenetic regulator of gene expression across a variety of cellular processes (22). There is growing interest in the role of DNA methylation in carcinogenesis and in defining molecular subtypes that can assist in elucidating the etiology, clinical characteristics and outcomes of EOC. Genome-wide methylation studies have focused mostly on a single subtype (14,17,23) with different studies using different technologies, experimental conditions and analytical techniques, which makes it difficult to compare results across the different histotypes. Therefore, a more comprehensive study involving epigenome-wide assessment of DNA methylation is needed to better understand methylation status across different EOCs.
In two population-based studies, we analyzed DNA methylation in the major histologies of ovarian cancer to define subtypes of EOC and compare these subtypes with the degree of chromosomal copy number alterations. Furthermore, we compared DNA methylation within HGSC to previously identified subgroups based on gene expression.
Materials and methods
Study population
Subjects included in this analysis were selected from two studies, the Polish Ovarian Cancer Study (POCS) (24) and the National Cancer Institute’s Surveillance Epidemiology and End Results (SEER) Residual Tissue Repository (RTR) study (25). Briefly, POCS was a population-based case-control study that included eligible cases who were diagnosed between June 1, 2001 and December 30, 2003, were residents of Warsaw or Lodz, Poland, and were aged 20–74 at the time of diagnosis. Incident ovarian cancer cases were ascertained through a rapid identification system coordinated by five participating Polish hospitals, which cover approximately 85% of all cases diagnosed in the two cities (24). Additionally, cancer registries in Warsaw and Lodz were used to identify cases missed by the rapid identification system. Medical records were reviewed for pathology, treatment, and outcomes information. An experienced gynecological pathologist (MES) reviewed the hematoxylin and eosin (H&E) slides to determine the pathologic diagnosis used for this analysis. Formalin-fixed paraffin-embedded (FFPE) tissue blocks were available for a subset of ovarian cancer cases (55%). Between two and four 1 mm cores were obtained from these cancer patients. The POCS was reviewed and approved by the institutional review boards (IRBs) of the U.S. National Cancer Institute (NCI), the M. Sklodowska Curie Institute of Oncology and Cancer Center in Warsaw, and the Institute of Occupational Medicine in Lodz. This study is covered by Single Project Assurances (SPAs) in Warsaw (S-009741–04) and Lodz (S-017191–01). All participants provided written informed consent for use of clinical data and archival tissue specimens. FFPE tumor blocks from 32 patients who did not receive chemotherapy were included in the current analysis (Supplemental Table 1).
The SEER RTR study included FFPE tissue blocks from primary tumors in the Hawaii and Iowa Tumor Registries (25). Ovarian cancer cases in the Hawaii Tumor Registry were diagnosed from 1983 through 2004, which represented 38% of all ovarian tumors in the Hawaii catchment area during that period. Ovarian cancer cases in the Iowa Tumor Registry were diagnosed from 1987 through 2003, representing 4% of all ovarian tumors in the Iowa catchment area during that period. Tumors were restricted to those from women of European ancestry. Other demographic characteristics available, in addition to race, were age and year at diagnosis and year of death. Tumor characteristics included histology and stage from the American Joint Committee on Cancer, and grade (25). Because SEER’s RTR data were anonymized, the National Institutes of Health’s Office of Human Subjects Research (OHSR) designated the project as exempt from IRB approval (OHSR #4081); nonetheless, IRB approvals were provided at the Universities of Hawaii and Iowa. We retrieved the available FFPE tissue blocks for primary invasive ovarian carcinomas in the Hawaii and Iowa Discard Tumor Registries. Tumor blocks from 144 Caucasian patients from the SEER RTRs (60 (42%) from the Hawaii Tumor Registry and 84 (58%) from the Iowa Tumor Registry) were included in the current analysis (Supplemental Table 1). H&E slides from all tumors underwent expert pathology review by MES, and the reviewed diagnosis was used in this analysis. Based on review of pathology reports, patients whose tumors were included in the analysis had not received neoadjuvant chemotherapy before tumor collection. Mortality data were coded by year.
Methylation array
DNA extraction and genome-wide DNA methylation assays using the Infinium HumanMethylation450 BeadChips (Illumina) were performed as previously described (26,27). Briefly, FFPE tissue cores were deparaffinized using mineral oil and lysed in Qiagen ATL buffer/proteinase K. NanoDrop UV absorbance and PicoGreen dsDNA fluorometry (Invitrogen) were used to measure DNA levels in the lysates. Fifty microliters of isolated, purified DNA was sodium bisulfite treated following the Zymo Research EZ-96 Methylation Gold kit. Converted DNA was further treated with the Illumina FFPE Restoration solution and 250 ng of restored DNA was then run on the Infinium HumanMethylation450 BeadChip according to the standard protocol. Raw data were extracted using GenomeStudio v2011.1.
Four samples were run in triplicate and one cell line (GIST 28T) was included in quadruplicate across the two methylation plates for quality control. The average correlation across these replicates was 0.984. A plot showing the intraclass correlation coefficient against the variance of the beta values is shown in Supplemental Figure 1. Fourteen samples were excluded because their beta values lacked the bimodal distribution expected for this array. All of these excluded samples were from the SEER Hawaii Tumor Registry and had been collected prior to 1993. Raw intensity data were imported into the R programming software using the minfi Bioconductor package (28). Data were pre-processed for background subtraction and control normalization using the preprocessIllumina function. Subset-quantile within array normalization (SWAN) was performed to correct for type I and II probe bias (29). Probes that were cross-reactive or located at single nucleotide polymorphisms were excluded (30). Probes that had a detection P-value>0.01, which indicated that they were indistinguishable from the background, were also excluded. Analyses were restricted to probes on the 22 autosomes. A total of 362,498 CpG sites remained for analysis.
Array copy number assay
A subset of invasive tumors for which sufficient DNA remained beyond what was required for the methylation arrays (N=47) was analyzed for chromosomal copy number alterations using a commercial 180 K-feature array comparative genomic hybridization (aCGH) assay (Agilent Inc.). We oversampled rare subtypes (clear cell, mucinous and endometrioid) for this analysis (see Supplemental Tables 1 and 2). Briefly, purified DNA, prior to bisulfite conversion, was labeled with Cy5 and Promega Male reference DNA was labeled with Cy3 using the Agilent Genomic DNA ULS Labeling Kit with subsequent purification of DNA on Kreatech genomic purification columns to remove excess Cy3/Cy5. Equal amounts of Cy5-tumor DNA and Cy3-reference DNA were combined and hybridized to Agilent 4X180K CGH arrays for at least 40 hours. Slides were washed and scanned in the Agilent Scanner C. Image data were extracted using the Agilent Feature Extraction software and visual inspection of results was performed using the Nexus 6.0 (Biodiscovery) software.
Background correction and normalization of the aCGH raw signal intensities were done using the Bioconductor limma package with the minimum method for background correction and the print-tip loess normalization, which takes into account signal intensity and spatial position on the array (31). Diagnostic plots were used to ensure that signal biases were not present in the data after normalization. Two samples were excluded based on this analysis, leaving data on 45 samples for analysis. Binary logarithm ratios (log2 ratios) of the two intensity channels were then computed. For the subject with duplicate runs, the log2 ratios of the two runs were averaged for analysis. The log2 ratios were subjected to the circular binary segmentation (CBS) algorithm (32) by means of the Bioconductor package DNAcopy. Copy numbers (double deletions, losses, neutral, gains and amplifications) that were significantly altered across all samples were determined using the GISTIC 2.02 algorithm with the default parameters (33) on the NIH High-Performance Computing Biowulf cluster (http://hpc.nih.gov).
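The log2-ratio and duplicate-averaging steps can be illustrated with a minimal sketch. The function names and the toy intensities below are hypothetical and are not taken from the study, whose analysis was carried out with the limma and DNAcopy packages in R.

```python
import math

def log2_ratios(tumor, reference):
    """Per-probe log2(tumor / reference) for corrected, normalized intensities."""
    if len(tumor) != len(reference):
        raise ValueError("both channels must cover the same probes")
    return [math.log2(t / r) for t, r in zip(tumor, reference)]

def average_runs(run_a, run_b):
    """Average the log2 ratios of two duplicate runs, probe by probe."""
    if len(run_a) != len(run_b):
        raise ValueError("duplicate runs must cover the same probes")
    return [(a + b) / 2 for a, b in zip(run_a, run_b)]

# Toy intensities (Cy5 = tumor, Cy3 = reference); illustrative values only.
run1 = log2_ratios([200.0, 100.0, 50.0], [100.0, 100.0, 100.0])
run2 = log2_ratios([400.0, 100.0, 25.0], [100.0, 100.0, 100.0])
averaged = average_runs(run1, run2)
```

Positive values indicate more tumor than reference signal at a probe (candidate gain) and negative values indicate less (candidate loss); segmentation then operates on these per-probe values.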
NanoString assay
FFPE tumor cores from the HGSC methylation array set (N=70) were sent to the international Ovarian Tumor Tissue Analysis (OTTA) consortium for gene expression assays of 518 genes on the NanoString platform. Tissues were deparaffinized with xylene and processed with the Qiagen miRNeasy FFPE protocol (Qiagen) as per the manufacturer’s recommendation, including DNase I treatment and an extended proteinase K digest (55°C digest period extended to 45 min). RNA was quantitated on a NanoDrop spectrophotometer. Nine samples failed quality control, leaving 61 samples for analysis.
Specimens were processed following NanoString gene expression codeset procedures (NanoString). Briefly, 500 ng of total RNA from each sample was mixed with a custom codeset (NanoString) and hybridization buffer (NanoString). Specimens were hybridized in a Tetrad 2 thermal cycler (Bio Rad) overnight for 16 or 20 hours and then processed on the nCounter Prep Station (NanoString). Cartridges were then imaged on the nCounter Digital Analyzer (NanoString). Data were normalized to housekeeping genes (RPL19, ACTB, PGK1, SDHA, and POLR1B) and pre-processed using 3 pooled ovarian cancer specimens, run on the same codeset lot and using the same protocols, as references in accordance with a single sample classification scheme outlined previously (34).
Statistical analysis
Analyses were performed using the R software (version 3.1.1) and appropriate Bioconductor packages.
The consensus non-negative matrix factorization (NMF) clustering algorithm (35) was used for unsupervised class discovery using the 1,000 most variable methylation probes according to the median absolute deviation (MAD). To identify the number of stable clusters (rank) in our dataset, we performed 100 independent NMF runs for ranks between 2 and 7 using the Lee and Seung method (36). Previous studies have suggested that 30–50 runs are sufficient to assess cluster stability (37,38). For each rank, the algorithm calculates how frequently two samples are grouped together in repeated clustering runs. The resulting pairwise consensus rates, summarized by the cophenetic correlation coefficient, are used for assessment of cluster stability. To estimate the optimal number of clusters, we followed the approach proposed by Brunet et al. (37) by selecting the rank at which the cophenetic correlation coefficient, and therefore the stability of the clusters, starts to decrease. This approach has been previously used in analyses of ovarian cancer samples (17,18,39–41). In addition, we performed 2,000 bootstrap runs on resampled data to compute confidence intervals around each cophenetic coefficient and to assess whether the optimal rank choice was statistically significantly different from other ranks. The choice of optimal rank was robust to the number of independent NMF runs, the number of probes and the NMF algorithm used.
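The probe-selection step that precedes clustering can be sketched as follows. This is an illustration only: the function names and the toy beta values are hypothetical, and the MAD is left unscaled (R's mad function applies a constant scaling factor by default, which does not change the ranking of probes).

```python
from statistics import median

def mad(values):
    """Unscaled median absolute deviation of a list of beta values."""
    center = median(values)
    return median(abs(v - center) for v in values)

def top_variable_probes(betas, k):
    """Return the k probe IDs with the largest MAD across samples.

    betas maps a probe ID to its list of beta values (one per sample).
    """
    ranked = sorted(betas, key=lambda probe: mad(betas[probe]), reverse=True)
    return ranked[:k]

# Toy data with hypothetical probe IDs; the study selected 1,000 probes.
betas = {
    "probe_a": [0.10, 0.12, 0.11, 0.09],
    "probe_b": [0.05, 0.95, 0.10, 0.90],
    "probe_c": [0.40, 0.60, 0.45, 0.55],
}
selected = top_variable_probes(betas, k=2)
```

Ranking by MAD rather than by variance makes the selection less sensitive to a few outlying samples, which is one plausible reason for choosing it with FFPE-derived data.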
Overall survival was computed as the number of years between the year of diagnosis and the year of death from all causes, the date of last follow-up or 5-year censored survival data. Kaplan-Meier curves compared overall survival according to subgroups or histotype and a log-rank test was used to compare the survival distributions across the subgroups. For univariate and multivariate (adjusted for age, stage, grade and study) survival analysis, a Cox proportional hazards regression model was used, and hazard ratios (HR) and 95% confidence intervals (CI) were estimated. Year of diagnosis was not associated with survival.
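One possible reading of the overall survival definition, with administrative censoring at 5 years, is sketched below. The function name, its arguments and the rule that follow-up beyond 5 years is censored at exactly 5 years are assumptions made for illustration; the text does not spell out these details.

```python
def overall_survival(dx_year, death_year=None, last_followup_year=None, max_years=5):
    """Return (time_in_years, event) with censoring at max_years.

    event is 1 for death from any cause and 0 for a censored observation.
    Times are whole years because dates are coded by year.
    """
    if death_year is not None:
        time, event = death_year - dx_year, 1
    elif last_followup_year is not None:
        time, event = last_followup_year - dx_year, 0
    else:
        raise ValueError("need a year of death or a year of last follow-up")
    if time > max_years:
        # Follow-up beyond the window is treated as censored at max_years.
        time, event = max_years, 0
    return time, event

died_within_window = overall_survival(1995, death_year=1998)
died_after_window = overall_survival(1995, death_year=2003)
alive_at_followup = overall_survival(2001, last_followup_year=2004)
```

The resulting (time, event) pairs are the usual inputs to Kaplan-Meier estimation and Cox regression.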
To quantify the level of genomic instability in each sample, we computed the fraction of the genome that was altered. This was defined as the percentage of regions with gains or losses, based on the GISTIC results, compared to the covered genome. We called this measure the genomic instability index (GII) of the sample (42,43). The Wilcoxon rank-sum test was used to compare the distribution of the GII values across different groups.
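As a rough illustration of the GII, the sketch below computes the length-weighted fraction of the covered genome that lies in altered segments. The segment representation, the call labels and the length weighting are assumptions for this example; the study derived its calls from GISTIC output.

```python
def genomic_instability_index(segments, altered_calls=("gain", "loss")):
    """Fraction of covered base pairs that lie in segments called altered.

    segments is an iterable of (start, end, call) tuples with end > start.
    """
    covered = 0
    altered = 0
    for start, end, call in segments:
        length = end - start
        covered += length
        if call in altered_calls:
            altered += length
    if covered == 0:
        raise ValueError("no covered genome")
    return altered / covered

# Toy segments on an arbitrary coordinate scale; not study data.
segments = [
    (0, 600, "neutral"),
    (600, 700, "gain"),
    (700, 950, "neutral"),
    (950, 1000, "loss"),
]
gii = genomic_instability_index(segments)
```

In this toy case 150 of 1,000 covered positions are altered, giving a GII of 0.15, which is on the same 0 to 1 scale as the values reported in the Results.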
NanoString data for 39 genes were used to classify HGSC samples according to previously defined gene expression subgroups (18) using subtype scores (44). Briefly, subtype scores were computed using a weighted score of the genes differentially expressed in a specific subgroup. Genes up-regulated in the subgroup were allocated a weight of 1, while genes down-regulated in the subgroup were allocated a weight of −1. The subtype score was computed as the sum of the subtype-specific genes weighted by the directionality of the gene (44,45). Because the gene expression classification approach we used does not take into account batch variability, we validated the predicted gene expression clusters against previously identified clusters by correlating cluster centroids.
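The weighted subtype score can be written as a short sketch. The gene names, signature labels and expression values are hypothetical placeholders, and assigning each sample to the subtype with the highest score is an assumption made for illustration rather than a rule stated in the text.

```python
def subtype_score(expression, up_genes, down_genes):
    """Sum of expression values weighted +1 (up-regulated) or -1 (down-regulated)."""
    up_total = sum(expression[g] for g in up_genes)
    down_total = sum(expression[g] for g in down_genes)
    return up_total - down_total

def classify(expression, signatures):
    """Score every subtype and return (best_subtype, all_scores).

    signatures maps a subtype label to a (up_genes, down_genes) pair.
    """
    scores = {
        name: subtype_score(expression, up, down)
        for name, (up, down) in signatures.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical normalized expression for one sample.
expression = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5, "GENE_D": 1.5}
signatures = {
    "S1": (["GENE_A"], ["GENE_B"]),
    "S2": (["GENE_C"], ["GENE_D"]),
}
label, scores = classify(expression, signatures)
```

Because the score is a signed sum, it depends on the scale of the normalized expression values, which is consistent with the need to check the predicted clusters against previously defined centroids.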
Finally, we assessed whether DNA methylation subgroups added survival information to the previously defined gene expression subgroups. Using a likelihood ratio test, we compared a model that included gene expression subgroups, age and stage, with the baseline hazard stratified by study, to a model that included the same variables plus the methylation subgroups.
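The likelihood ratio test for two nested models that differ by a single binary term (such as M2 versus M1/M3) can be sketched as below. The log-likelihood values are invented for illustration and the function name is hypothetical; in practice the partial log-likelihoods would come from the fitted Cox models.

```python
import math

def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """LRT statistic and p-value for nested models differing by one parameter."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the nested model")
    # With 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up log-likelihoods for the models without and with the methylation term.
stat, p = likelihood_ratio_test_1df(loglik_reduced=-120.0, loglik_full=-116.4)
```

The closed form used here holds only for one degree of freedom; comparing models that differ by more parameters would require the general chi-square survival function.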
All statistical tests were two-sided and a P value of less than 0.05 was considered statistically significant.
Results
Our analysis included 162 EOC tumors covering the major histotypes of EOC (Supplemental Table 1). The majority of tumors were from the SEER RTR (N=130, 80%). Principal component analysis was performed on the methylation beta values to explore whether methylation clustered according to factors such as study (POCS vs SEER RTR), study site (Warsaw, Lodz, Hawaii, Iowa), year of diagnosis, age at diagnosis or batch (Supplemental Figure 2). None of these factors were related to methylation differences.
Methylation-based subtypes across all EOC tumors
Clustering analysis of the 1,000 most variable methylation probes across all EOC tumors identified 4 stable, methylation-based subtypes based on the decrease of the cophenetic coefficient (C1, C2, C3 and C4; Figure 1A, B; Supplemental Table 3). C1 was predominantly comprised of HGSC tumors (88%; Figure 1C and Supplemental Table 4); C2 was predominantly represented by LMPs (66%) and 27% of invasive serous tumors. Almost all LMP serous tumors (97%) were in C2. C3 was predominantly comprised of clear cell and endometrioid tumors (74%). Finally, C4 was predominantly comprised of HGSC, high-grade endometrioid carcinomas and mucinous tumors. The concordance between methylation subgroups and histological subtypes was approximately 67% (Supplemental Table 5). The mean age at diagnosis of women was 64.0 years in C1, 49.6 years in C2, 55.9 years in C3 and 59.2 years in C4 (C1 vs C2, P=1.42×10⁻⁸; C1 vs C3, P=0.003; C2 vs C3, P=0.029; C2 vs C4, P=0.002; other comparisons were not statistically significant).
Kaplan-Meier curves of the methylation-based subtypes showed significantly different overall survival outcomes for the different subtypes (log-rank test P=9.1×10⁻⁷; Figure 1D). The survival pattern of each of the methylation-based subtypes resembled that of the major histological group predominantly represented in each methylation-based subtype (Figure 1D, E). Compared to women in C2, women in C1 had a 3-fold higher risk of death (HR=3.17, 95% CI: 1.11, 9.02; P=0.031; Table 1). In unadjusted analysis, the risk of mortality of women in cluster C1 was 6-fold higher compared to women in C2 (HR=6.29, 95% CI: 2.78, 14.23; P=9.99×10⁻⁶; Supplemental Table 6). Similarly, women in C3 had higher risk of death compared to those in C2 (HR=3.39, 95% CI: 1.38, 8.31; P=0.008). Women in C3 were also at higher risk compared to patients in cluster C2, but the difference was not statistically significant (Supplemental Table 6). However, when analyses were adjusted for either histology or other clinical characteristics (importantly stage and grade), estimates were greatly reduced and none of the associations remained significant, suggesting that methylation clusters capture several of the histological or clinical phenotypes.
Table 1.
M1/M3 vs M2 | HR | 95% CI | P-value |
---|---|---|---|
Unadjusted | 2.76 | (1.48, 5.12) | 0.001 |
Adjusted* | 2.21 | (1.12, 4.36) | 0.022 |
Sample size of methylation clusters: M1/M3 (N=52) and M2 (N=18).
*Adjusted for age, stage and study.
Genomic instability index (GII) across all EOC tumors
Copy number was measured in a subset of the EOCs (N=45; Supplemental Table 2) comprising serous (N=15), endometrioid (N=20), clear cell (N=3) and mucinous (N=7) tumors (copy number information was not available for serous LMP tumors). We computed the fraction of the genome that was altered using the GII for each sample (see Materials and methods; Figure 2). The GII was associated with grade, with lower-grade samples having lower values of GII and high-grade samples having higher values of GII (mean GII in low-grade tumors: 0.016 (SD=0.018), mean GII in high-grade tumors: 0.050 (0.020); Wilcoxon test P-value=1.03×10⁻⁵). Next, we compared the distribution of the GII values across the methylation-based groups and histologies. The average GII of methylation cluster C1 was 0.061 (SD=0.016), of C2 was 0.042 (0.030), of C3 was 0.021 (0.017) and of C4 was 0.044 (0.021). Compared to the GII values in methylation cluster C3, GII values in the methylation-based subtypes C1 and C4 were significantly higher (C1 vs C3: Wilcoxon test P-value=7.75×10⁻⁵; C4 vs C3: Wilcoxon test P-value=0.0037). The relationship between the GII values and histologies was as expected, with serous tumors having, on average, higher GII values than the other subtypes: mean GII of serous tumors: 0.055 (SD=0.024); mean GII of endometrioid tumors: 0.025 (0.020); mean GII of mucinous tumors: 0.040 (0.021); and mean GII of clear cell tumors: 0.051 (0.009). Compared to the GII values of endometrioid tumors, GII values in serous and clear cell tumors were significantly higher (serous vs endometrioid: Wilcoxon test P-value=8.2×10⁻⁴; clear cell vs endometrioid: Wilcoxon test P-value=0.035).
Methylation-based subtypes in HGSC tumors
Within HGSC, unsupervised clustering of the 1,000 most variable methylation probes found three stable methylation-based subtypes (M1, M2, M3; Figure 3; Supplemental Table 7). Overall survival was statistically significantly different between these methylation subtypes of HGSC, with one subtype having worse survival than the other two (log-rank test P=0.0017). Compared to patients with tumor samples in the HGSC M1/M3 subtypes, patients in the HGSC M2 subtype had significantly worse survival after adjusting for age, stage and study (HR=2.21, 95% CI: 1.12, 4.36; P=0.022; Table 1).
Gene expression subgroups of HGSC
For a subset of HGSC tumors (N=61), expression data for 39 genes were used to classify tumors into previously identified gene expression subtypes of HGSC (Figure 4). Survival differences across the gene expression subgroups were consistent with previous reports (log-rank test P=0.023): the mesenchymal (N1) and proliferative (N5) subtypes had the worst survival and the immunoreactive (N2) subtype had the most favorable survival. We studied the relationship between the 1,000 CpG sites used to define the methylation-based subgroups and expression of the 39 genes (Supplemental Figure 3). None of the 1,000 CpG sites used to define the methylation signatures were located in the 39 genes. However, we observed 5 statistically significant correlations after adjusting p-values by the Bonferroni correction (p-values smaller than 1.28×10⁻⁶; Supplemental Figure 4). They were all trans/distal associations. Next, we investigated the relationship between gene expression signatures and methylation signatures. The survival patterns observed in the methylation subtypes for all the HGSC samples were retained in the subset of tumors with gene expression data (Figure 4). Approximately half of the subjects in the gene expression subtype N5 were in the methylation-based cluster M2. Most subjects in the NanoString-based clusters N1 and N4 were in the methylation-based cluster M3. Although the N2 group contained the highest proportion of subjects with the methylation-based cluster M1, approximately half of those in N2 were in the methylation-based cluster M3 (Figure 4; Supplemental Table 7). We observed that the NanoString and methylation clusters were significantly associated with each other (Fisher’s exact test P value=0.0037). Next, we investigated whether the methylation-based classification could add prognostic information to the gene expression subtypes.
A likelihood ratio test comparing survival models with and without HGSC methylation subgroups (M1/M3 vs M2) suggested that the methylation subgroups added prognostic information (P=0.0072; Table 2).
Table 2.
Subgroup | N | HR (95% CI)* | P-value | HR (95% CI)** | P-value |
---|---|---|---|---|---|
Gene expression but not methylation subtypes in the model | |||||
Gene expression clusters | |||||
N1 | 12 | 2.38 (0.94, 6.05) | 0.068 | 1.60 (0.58, 4.45) | 0.37 |
N2 | 13 | 0.75 (0.24, 2.28) | 0.61 | 0.81 (0.25, 2.59) | 0.72 |
N4 | 17 | 1.00 (ref) | - | 1.00 (ref) | - |
N5 | 19 | 2.43 (1.02, 5.81) | 0.045 | 1.66 (0.62, 4.39) | 0.31 |
Gene expression and methylation subtypes in the model | |||||
Gene expression clusters | |||||
N1 | 12 | 2.21 (0.87, 5.64) | 0.097 | 1.56 (0.57, 4.29) | 0.39 |
N2 | 13 | 0.72 (0.23, 2.20) | 0.56 | 0.71 (0.22, 2.31) | 0.57 |
N4 | 17 | 1.00 (ref) | - | 1.00 (ref) | - |
N5 | 19 | 1.44 (0.54, 3.86) | 0.47 | 1.13 (0.40, 3.13) | 0.82 |
Methylation clusters | |||||
M1/M3 | 47 | 1.00 (ref) | - | 1.00 (ref) | - |
M2 | 14 | 3.05 (1.35, 6.88) | 0.007 | 3.12 (1.40, 6.98) | 0.006 |
Likelihood ratio test comparing the two models | | | 0.0090 | | 0.0072 |
*Unadjusted for other variables.
**Additional adjusting variables in the model: age, stage and study.
Discussion
EOC is a heterogeneous disease with different histotypes exhibiting distinct clinical and genetic features. Most previous studies of DNA methylation in EOC have focused on a single histotype or differences between tumor and putative normal tissues (17,23,46). However, the relationship between methylation patterns and the different pathological histotypes is not well understood (47). In our analysis, we included the major histological subtypes, which led to the identification of four EOC methylation subgroups with shared molecular phenotypes and a statistically significant survival difference. These EOC methylation subgroups may provide an alternative classification of ovarian tumors to histological subtypes. The methylation-based subgroups could reflect the origin of the tumors. In particular, evidence suggests that the majority of HGSCs originate from tubal intra-epithelial carcinoma (TIC) (48), while a subset of HGSCs may have a truly ovarian origin (48,49), as is still presumed for LMP serous tumors (50). We found that one of the methylation subgroups (EOC methylation subgroup C1) was predominantly represented by HGSCs, potentially being tumors of tubal origin, while another subgroup (EOC methylation subgroup C2) was predominantly represented by serous LMP and a small subset of HGSCs, possibly representing tumors of ovarian origin. Thus, while the HGSCs in both groups were histologically similar, the difference in methylation patterns among them could provide additional insights into etiology or carcinogenesis. Endometrioid and clear cell ovarian tumors are thought to have a common origin from orthotopic or ectopic endometrial tissue (6). EOC methylation subgroup C3 was predominantly represented by endometrioid and clear cell ovarian tumors, which, again, could reflect their cell of origin. Methylation subgroups could also reflect other etiologic factors, as endometrioid and clear cell tumors seem to share many risk factors (11).
We related the EOC methylation subgroups to other molecular features. In particular, we looked at the level of genomic instability for each of the methylation-based subgroups. This measure was related to the grade of the tumor. We found that the fraction of the altered genome differed across the methylation subgroups with EOC methylation subgroup C1, which had most of the high-grade tumors, having a higher average fraction of the altered genome compared to other methylation subgroups.
Previous studies have suggested that HGSCs are a heterogeneous group of tumors at the molecular level (17). We found three HGSC methylation-based subgroups with differences in survival. The TCGA also had methylation information to characterize HGSC. However, it used a lower density methylation array (the Illumina Infinium HumanMethylation27 BeadChip with approximately 27,000 CpG sites) and, while the authors found four methylation groups, they concluded that the clusters demonstrated only modest stability, with different clustering approaches yielding varying methylation subgroups (17). We found subgroups that were robust to the number of probes, clustering methods and repeated clustering runs. The majority of the probes used to define the methylation subgroups in our study were part of the Methylation450 BeadChip, but not present on the Methylation27 BeadChip. However, our methylation groups need to be replicated.
In addition to the methylation subgroups, gene expression subtypes were consistent with previous reports (17,18,44). These gene expression subtypes showed a statistically significant difference in survival, with previously identified groups that demonstrated more favorable survival also showing better survival in our study and vice versa. Previous studies that have multiple subgroups based on different genomic characteristics, such as gene expression and methylation, have not formally looked at the relationship between the different genomic-based subgroups. Here, we were interested in understanding what additional survival information methylation subtypes could add to the gene expression subtypes. While methylation subgroups were associated with gene expression subgroups, we observed that methylation further indicated survival differences beyond current clinical and molecular factors. This may be important for the classification of HGSC, a better understanding of their etiology, as well as for finding markers for early detection and/or clinical outcomes.
A recent paper found that hypermethylation of six adjacent CpG loci close to the TAP1 promoter region in 6p21.3 was associated with shorter time to recurrence in HGSC (51). Five of the six CpG sites were included in our analyses (cg02181920, cg06473288, cg24111025, cg25042789 and cg26033526). Hypermethylation at these five sites was associated with higher risk of mortality in our study, in agreement with the results of Wang et al. (51). Hazard ratios ranged from 3.10 to 5.25 (P values ranged from 0.062 to 0.11).
Most previous studies of methylation in EOC have used fresh frozen ovarian tumors (17,23). However, the vast majority of epidemiologic and clinical studies have only archival FFPE tumor specimens; it is therefore important to investigate whether characterization using methylation is robust in FFPE tumors. During formalin fixation, DNA can be damaged or become crosslinked to other biomolecules (DNA to DNA, DNA to proteins, etc.). The varying age of the tumors and the degree or method of fixation can also affect the number of artifacts in FFPE samples (52,53). While these artifacts can affect several technologies, such as gene expression profiling or sequencing, DNA methylation profiling seems to be robust after DNA restoration (54,55) and produces accurate, reproducible results. We used tumors from two distinct studies conducted in two different countries during different periods, and we did not observe differences in methylation by study. This provides evidence that methylation profiling can be run robustly in collaborative efforts, such as consortia, where studies are likely conducted in different locations with potentially variable collection protocols. In our experience, these factors did not affect the methylation signal. This is important because the classification found in our study could be applied to other epidemiologic and clinical studies.
Recent consortial epidemiologic studies have tried to clarify the etiologic heterogeneity of EOC (11,56,57). These studies analyzed risk factors according to histological subtypes. Most of the known risk factors were more strongly associated with non-serous ovarian cancers (11). Understanding the relationship of risk and genetic factors with molecular subgroups could provide additional information that could be crucial for improving prevention, early detection and therapies, especially for HGSC, which is the most fatal subtype. We found that HGSC was not a single group at the methylation level. It will be important to understand how these HGSC methylation subgroups relate to risk and genetic factors.
BRCA status was unknown in our study participants. BRCA carriers tend to be diagnosed with HGSC at a younger age than the population average. We explored whether the women diagnosed with HGSC before the age of 55 (N=21) were grouped in specific clusters. We did not observe any specific grouping of these women among the clusters defined across the different histological subtypes (C1-C4), but 18 of the 21 women were in HGSC cluster M3. In fact, these 18 women represented half of the women in cluster M3. While promoter methylation of BRCA1 was not significantly different across clusters M1-M3, it is possible that these are a set of tumors with other BRCA-like properties. Future studies with BRCA status should explore this further.
Our study has several strengths. It is the most comprehensive study of methylation of ovarian cancer to date, as we included the major histological subtypes of ovarian cancer and examined their shared methylation phenotypes. We used the same laboratory and analytical methods for all EOC tumors, allowing for a valid comparison across subtypes. We also related our methylation-based subgroups to other genomic features, such as copy number and gene expression. To date, this has only been done in TCGA, mostly for HGSC (17) (some low-grade serous carcinomas might have been included as well). However, our study also has a few limitations, including sample size. Ovarian cancer is a rare disease and most individual studies have small sample sizes. We were able to pool data from two distinct studies, and methylation showed robust signals independently of the study and study period. However, larger studies should be conducted in the future to validate our results. While we did not have a replication set, we evaluated several independent genomic platforms in the same specimens, including copy number and NanoString, with strong prior data in relation to ovarian cancer. NanoString-based clustering analysis of HGSC was aligned with previous classifications of gene expression. Another limitation of our study is the lack of treatment information. The majority of the tumors included in our analyses were from SEER population-based registries, for which specific treatment information is not available. It is possible that intrinsic characteristics of the tumors result in a better response to a specific therapy or that, independently of treatment, the molecular characteristics described in our analysis provide a survival advantage to the patient. Future studies with treatment information will be able to study this question in detail.
In conclusion, in this comprehensive analysis of ovarian tumor DNA methylation, we showed that there are four methylation-defined subgroups of EOC with survival differences, similar to histological subtypes. Moreover, within HGSCs, we found methylation subgroups that added significant survival information to expression-derived molecular subgroups.
Supplementary Material
Translational Relevance:
Ovarian cancer is a heterogeneous disease with several histological subtypes that differ in etiology, pathogenesis, and prognosis. Identifying key molecular characteristics that differ across these subtypes could refine tumor classification beyond histology alone, inform potential treatments and etiology, and help in the development of new approaches for prevention and early detection. In this analysis of epithelial ovarian tumor DNA methylation, which included high- and low-grade serous, endometrioid, mucinous and clear cell tumors and tumors of low malignant potential, we showed that there are four methylation-defined subgroups of epithelial ovarian cancer with survival differences, similar to histological subtypes. Within high-grade serous carcinomas, we found methylation subgroups that added significant survival information to previously identified gene expression subtypes.
Acknowledgements
This research was supported by the Intramural Research Program of the National Cancer Institute. In addition, S. J. Ramus acknowledges support by the grant “Identifying Prognostic Markers and Therapeutic Targets for Serous Ovarian Cancer” (R01-CA172404), J. A. Doherty acknowledges support by the grant “Epidemiologic factors and survival by molecular subtypes of ovarian cancer” (R01-CA168758), and M. S. Anglesio acknowledges support by the Canadian Institutes of Health Research (CIHR) Proof of Principle (Phase I) grant “PrOTYPE: An Enabling Technology to Improve Ovarian Cancer Care” and funding from the Janet D. Cottrelle Foundation Scholar Program administered by the BC Cancer Foundation.
Footnotes
Disclaimers
The authors have no conflicts of interest.
References
- 1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2015. CA Cancer J Clin 2015;65:5–29.
- 2. Köbel M, Bak J, Bertelsen BI, Carpen O, Grove A, Hansen ES, et al. Ovarian carcinoma histotype determination is highly reproducible, and is improved through the use of immunohistochemistry. Histopathology 2014;64:1004–13.
- 3. Seidman JD, Horkayne-Szakaly I, Haiba M, Boice CR, Kurman RJ, Ronnett BM. The histologic type and stage distribution of ovarian carcinomas of surface epithelial origin. International journal of gynecological pathology: official journal of the International Society of Gynecological Pathologists 2004;23:41–4.
- 4. Cho KR, Shih Ie M. Ovarian cancer. Annu Rev Pathol 2009;4:287–313.
- 5. Koonings PP, Campbell K, Mishell DR Jr, Grimes DA. Relative frequency of primary ovarian neoplasms: a 10-year review. Obstet Gynecol 1989;74:921–6.
- 6. McCluggage WG. Morphological subtypes of ovarian carcinoma: a review with emphasis on new developments and pathogenesis. Pathology 2011;43:420–32.
- 7. Trabert B, Coburn SB, Mariani A, Yang HP, Rosenberg PS, Gierach GL, et al. Reported Incidence and Survival of Fallopian Tube Carcinomas: A Population-Based Analysis From the North American Association of Central Cancer Registries. J Natl Cancer Inst 2017.
- 8. Sung PL, Chang YH, Chao KC, Chuang CM, Task Force on Systematic Review and Meta-analysis of Ovarian Cancer. Global distribution pattern of histological subtypes of epithelial ovarian cancer: a database analysis and systematic review. Gynecol Oncol 2014;133:147–54.
- 9. Gardner GJ, Birrer MJ. Ovarian Tumors of Low Malignant Potential: Can Molecular Biology Solve This Enigma? JNCI: Journal of the National Cancer Institute 2001;93:1122–3.
- 10. Zanetta G, Rota S, Chiari S, Bonazzi C, Bratina G, Mangioni C. Behavior of borderline tumors with particular interest to persistence, recurrence, and progression to invasive carcinoma: a prospective study. J Clin Oncol 2001;19:2658–64.
- 11. Wentzensen N, Poole EM, Trabert B, White E, Arslan AA, Patel AV, et al. Ovarian Cancer Risk Factors by Histologic Subtype: An Analysis From the Ovarian Cancer Cohort Consortium. Journal of Clinical Oncology 2016;34:2888–98.
- 12. Goode EL, Chenevix-Trench G, Song H, Ramus SJ, Notaridou M, Lawrenson K, et al. A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24. Nat Genet 2010;42:874–9.
- 13. Bolton KL, Tyrer J, Song H, Ramus SJ, Notaridou M, Jones C, et al. Common variants at 19p13 are associated with susceptibility to ovarian cancer. Nat Genet 2010;42:880–4.
- 14. Shen H, Fridley BL, Song H, Lawrenson K, Cunningham JM, Ramus SJ, et al. Epigenetic analysis leads to identification of HNF1B as a subtype-specific susceptibility gene for ovarian cancer. Nat Commun 2013;4:1628.
- 15. Raja FA, Chopra N, Ledermann JA. Optimal first-line treatment in ovarian cancer. Ann Oncol 2012;23 Suppl 10:x118–27.
- 16. Coward JI, Middleton K, Murphy F. New perspectives on targeted therapy in ovarian cancer. Int J Womens Health 2015;7:189–203.
- 17. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 2011;474:609–15.
- 18. Tothill RW, Tinker AV, George J, Brown R, Fox SB, Lade S, et al. Novel Molecular Subtypes of Serous and Endometrioid Ovarian Cancer Linked to Clinical Outcome. Clinical Cancer Research 2008;14:5198–208.
- 19. Kurman RJ, Shih Ie M. Molecular pathogenesis and extraovarian origin of epithelial ovarian cancer--shifting the paradigm. Human pathology 2011;42:918–31.
- 20. Kurman RJ, Shih Ie M. The origin and pathogenesis of epithelial ovarian cancer: a proposed unifying theory. Am J Surg Pathol 2010;34:433–43.
- 21. Zaino RJ, Brady MF, Lele SM, Michael H, Greer B, Bookman MA. Advanced stage mucinous adenocarcinoma of the ovary is both rare and highly lethal. Cancer 2011;117:554–62.
- 22. Jones PA, Baylin SB. The Epigenomics of Cancer. Cell 2007;128:683–92.
- 23. Cicek MS, Koestler DC, Fridley BL, Kalli KR, Armasu SM, Larson MC, et al. Epigenome-wide ovarian cancer analysis identifies a methylation profile differentiating clear-cell histology with epigenetic silencing of the HERG K+ channel. Human Molecular Genetics 2013;22:3038–47.
- 24. Garcia-Closas M, Brinton LA, Lissowska J, Richesson D, Sherman ME, Szeszenia-Dabrowska N, et al. Ovarian cancer risk and common variation in the sex hormone-binding globulin gene: a population-based case-control study. BMC Cancer 2007;7:60.
- 25. Matsuno RK, Sherman ME, Visvanathan K, Goodman MT, Hernandez BY, Lynch CF, et al. Agreement for tumor grade of ovarian carcinoma: analysis of archival tissues from the surveillance, epidemiology, and end results residual tissue repository. Cancer Causes Control 2013;24:749–57.
- 26. Killian JK, Kim SY, Miettinen M, Smith C, Merino M, Tsokos M, et al. Succinate dehydrogenase mutation underlies global epigenomic divergence in gastrointestinal stromal tumor. Cancer Discov 2013;3:648–57.
- 27. Killian JK, Walker RL, Bilke S, Chen Y, Davis S, Cornelison R, et al. Genome-wide methylation profiling in archival formalin-fixed paraffin-embedded tissue samples. Methods Mol Biol 2012;823:107–18.
- 28. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 2014;30:1363–9.
- 29. Maksimovic J, Gordon L, Oshlack A. SWAN: Subset-quantile Within Array Normalization for Illumina Infinium HumanMethylation450 BeadChips. Genome biology 2012;13:R44.
- 30. Chen YA, Lemire M, Choufani S, Butcher DT, Grafodatskaya D, Zanke BW, et al. Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray. Epigenetics 2013;8:203–9.
- 31. Smyth GK, Speed T. Normalization of cDNA microarray data. Methods 2003;31:265–73.
- 32. Venkatraman ES, Olshen AB. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 2007;23:657–63.
- 33. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome biology 2011;12:R41.
- 34. Talhouk A, Kommoss S, Mackenzie R, Cheung M, Leung S, Chiu DS, et al. Single-Patient Molecular Testing with NanoString nCounter Data Using a Reference-Based Strategy for Batch Effect Correction. PLOS ONE 2016;11:e0153844.
- 35. Gaujoux R, Seoighe C. A flexible R package for nonnegative matrix factorization. BMC Bioinformatics 2010;11:1–9.
- 36. Lee DD, Seung HS. Algorithms for non-negative matrix factorization. 2001.
- 37. Brunet J-P, Tamayo P, Golub TR, Mesirov JP. Metagenes and molecular pattern discovery using matrix factorization. Proceedings of the National Academy of Sciences 2004;101:4164–9.
- 38. Hutchins LN, Murphy SM, Singh P, Graber JH. Position-dependent motif characterization using non-negative matrix factorization. Bioinformatics 2008;24:2684–90.
- 39. Konecny GE, Wang C, Hamidi H, Winterhoff B, Kalli KR, Dering J, et al. Prognostic and Therapeutic Relevance of Molecular Subtypes in High-Grade Serous Ovarian Cancer. Journal of the National Cancer Institute 2014;106.
- 40. Way GP, Rudd J, Wang C, Hamidi H, Fridley BL, Konecny GE, et al. Comprehensive Cross-Population Analysis of High-Grade Serous Ovarian Cancer Supports No More Than Three Subtypes. G3 (Bethesda, Md) 2016;6:4097–103.
- 41. Bosquet JG, Marchion DC, Chon H, Lancaster JM, Chanock S. Analysis of Chemotherapeutic Response in Ovarian Cancers Using Publicly Available High-Throughput Data. Cancer Research 2014;74:3902–12.
- 42. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 2012;486:346–52.
- 43. Bodelon C, Vinokurova S, Sampson JN, den Boon JA, Walker JL, Horswill MA, et al. Chromosomal copy number alterations and HPV integration in cervical precancer and invasive cancer. Carcinogenesis 2016;37:188–96.
- 44. Leong HS, Galletta L, Etemadmoghadam D, George J, Australian Ovarian Cancer Study, Kobel M, et al. Efficient molecular subtype classification of high-grade serous ovarian cancer. J Pathol 2015;236:272–7.
- 45. Helland Å, Anglesio MS, George J, Cowin PA, Johnstone CN, House CM, et al. Deregulation of MYCN, LIN28B and LET7 in a Molecular Subtype of Aggressive High-Grade Serous Ovarian Cancers. PLOS ONE 2011;6:e18064.
- 46. Kolbe DL, DeLoia JA, Porter-Gill P, Strange M, Petrykowska HM, Guirguis A, et al. Differential analysis of ovarian and endometrial cancers identifies a methylator phenotype. PLoS One 2012;7:e32941.
- 47. Earp MA, Cunningham JM. DNA methylation changes in epithelial ovarian cancer histotypes. Genomics 2015;106:311–21.
- 48. Erickson BK, Conner MG, Landen CN Jr. The role of the fallopian tube in the origin of ovarian cancer. Am J Obstet Gynecol 2013;209:409–14.
- 49. Przybycin CG, Kurman RJ, Ronnett BM, Shih I-M, Vang R. Are All Pelvic (Nonuterine) Serous Carcinomas of Tubal Origin? The American Journal of Surgical Pathology 2010;34:1407–16.
- 50. Shih Ie M, Kurman RJ. Molecular pathogenesis of ovarian borderline tumors: new insights and old challenges. Clin Cancer Res 2005;11:7273–9.
- 51. Wang C, Cicek MS, Charbonneau B, Kalli KR, Armasu SM, Larson MC, et al. Tumor Hypomethylation at 6p21.3 Associates with Longer Time to Recurrence of High-Grade Serous Epithelial Ovarian Cancer. Cancer Research 2014;74:3084–91.
- 52. Kerick M, Isau M, Timmermann B, Sultmann H, Herwig R, Krobitsch S, et al. Targeted high throughput sequencing in clinical cancer settings: formaldehyde fixed-paraffin embedded (FFPE) tumor tissues, input amount and tumor heterogeneity. BMC Med Genomics 2011;4:68.
- 53. Schweiger MR, Kerick M, Timmermann B, Albrecht MW, Borodina T, Parkhomchuk D, et al. Genome-wide massively parallel sequencing of formaldehyde fixed-paraffin embedded (FFPE) tumor tissues for copy-number- and mutation-analysis. PLoS One 2009;4:e5548.
- 54. Dumenil TD, Wockner LF, Bettington M, McKeone DM, Klein K, Bowdler LM, et al. Genome-wide DNA methylation analysis of formalin-fixed paraffin embedded colorectal cancer tissue. Genes, chromosomes & cancer 2014;53:537–48.
- 55. de Ruijter TC, de Hoon JPJ, Slaats J, de Vries B, Janssen MJFW, van Wezel T, et al. Formalin-fixed, paraffin-embedded (FFPE) tissue epigenomics using Infinium HumanMethylation450 BeadChip assays. Lab Invest 2015;95:833–42.
- 56. Ose J, Poole EM, Schock H, Lehtinen M, Arslan AA, Zeleniuch-Jacquotte A, et al. Androgens are differentially associated with ovarian cancer subtypes in the Ovarian Cancer Cohort Consortium. Cancer Res 2017.
- 57. Cuellar-Partida G, Lu Y, Dixon SC, Australian Ovarian Cancer Study, Fasching PA, Hein A, et al. Assessing the genetic architecture of epithelial ovarian cancer histological subtypes. Hum Genet 2016;135:741–56.