Abstract
Context
Hierarchical clustering (HC) identifies subtypes of polycystic ovary syndrome (PCOS).
Objective
This work aimed to identify clinically significant subtypes in a PCOS cohort diagnosed with the Rotterdam criteria and to further characterize the distinct subtypes.
Methods
Clustering was performed using the variables body mass index (BMI), luteinizing hormone (LH), follicle-stimulating hormone, dehydroepiandrosterone sulfate, sex hormone–binding globulin (SHBG), testosterone, insulin, and glucose. Subtype characterization was performed by analyzing the variables estradiol, androstenedione, dehydroepiandrosterone, cortisol, anti-Müllerian hormone (AMH), total follicle count (TFC), lipid profile, and blood pressure. Study participants were girls and women who attended our university hospital for reproductive endocrinology screening between February 1993 and February 2021. In total, 2502 female participants of European ancestry, aged 13 to 45 years with PCOS (according to the Rotterdam criteria), were included. A subset of these (n = 1067) fulfilled the National Institutes of Health criteria (ovulatory dysfunction and hyperandrogenism). Main outcome measures included the identification of distinct PCOS subtypes using cluster analysis. Additional clinical variables associated with these subtypes were assessed.
Results
Metabolic, reproductive, and background PCOS subtypes were identified. In addition to high LH and SHBG levels, the reproductive subtype had the highest TFC and levels of AMH (all P < .001). In addition to high BMI and insulin levels, the metabolic subtype had higher low-density lipoprotein levels and higher systolic and diastolic blood pressure (all P < .001). The background subtype had lower androstenedione levels and features of the other 2 subtypes.
Conclusion
Reproductive and metabolic traits not used for subtyping differed significantly in the subtypes. These findings suggest that the subtypes capture distinct PCOS causal pathways.
Keywords: PCOS, subtypes, cluster analysis, reproductive, metabolic
Polycystic ovary syndrome (PCOS) is a complex genetic disorder reflecting the interaction of susceptibility genes and environmental factors (1). It is among the most common endocrine disorders of reproductive-aged girls and women, affecting 5% to 15% of this population worldwide, depending on the diagnostic criteria applied (2-4). PCOS is characterized by the presence of 2 or more of the following features: ovulatory dysfunction (OD), hyperandrogenism (HA), and polycystic ovarian morphology (PCOM). All of the diagnostic criteria for PCOS are based on expert opinion. The first diagnostic criteria, known as the National Institutes of Health (NIH) criteria, required the presence of both OD and HA; PCOM was not included in these diagnostic criteria (5, 6). In 2003, PCOM was added as a diagnostic criterion, and the diagnosis of PCOS required 2 of the following 3 features: OD, HA, or PCOM. The application of these so-called Rotterdam criteria resulted in 4 phenotypes, which have been designated phenotype A, HA + OD + PCOM; phenotype B, HA + OD; phenotype C, HA + PCOM; and phenotype D, OD + PCOM. The phenotypes including HA + OD with or without PCOM, which do not differ biochemically, are often designated as the NIH phenotype or classic PCOS. The combinations HA + PCOM and OD + PCOM are known as the non-NIH Rotterdam phenotypes. The Androgen Excess Society criteria include only the phenotypes with HA. A meta-analysis of genome-wide association studies (GWAS) had adequate power to formally compare NIH PCOS, non-NIH Rotterdam PCOS, and self-reported PCOS. In this meta-analysis, no significant differences were found in effect sizes across the cases stratified by phenotype for 13 of 14 PCOS-associated loci (7). These findings imply that the current diagnostic criteria do not identify biologically distinct phenotypes.
In contrast, data-driven approaches to PCOS classification using unsupervised hierarchical clustering of quantitative traits identified 3 reproducible subtypes in cases with the NIH phenotype of OD and HA (8). These subtypes were designated 1) “reproductive,” characterized by higher luteinizing hormone (LH) and sex hormone–binding globulin (SHBG) levels with relatively low body mass index (BMI) and insulin levels; 2) “metabolic,” characterized by increased BMI, glucose, and insulin levels with lower SHBG and LH levels; and 3) “background,” for the cases that demonstrated no distinguishable pattern regarding their relative phenotypic trait distributions. Each subtype was associated with unique genome-wide significant loci, suggesting that the subtypes had distinct genetic architecture. Further, these significant genetic associations provided orthogonal validation that the subtypes captured biologically distinct groups.
We undertook this study to investigate whether these subtypes were present in more broadly ascertained PCOS cases using the Rotterdam criteria. Further, we investigated whether the subtypes thus identified had differences in additional PCOS-related clinical variables not used for clustering and whether the differences aligned with distinct biologic pathways.
Materials and Methods
Study Population
Girls and women of European ancestry, aged 13 to 45 years, who attended our outpatient clinic of Reproductive Endocrinology and Infertility at the Erasmus University Medical Center Rotterdam between February 1993 and February 2021, were included. Before 2003, girls and women were diagnosed with PCOS when they met the criteria for class 2 anovulation of the World Health Organization classification in combination with PCOM or HA or both (9, 10). From 2003 to 2018, girls and women were diagnosed with PCOS using the 2003 Rotterdam criteria, and from 2018 onward, the criteria from the 2018 International Guideline for PCOS were used (11, 12). Both guidelines state that PCOS can be diagnosed when at least 2 of the following 3 features are present: OD, HA, and PCOM. In addition, girls and women were screened to exclude the presence of adrenal gland disorders, pituitary gland dysregulation, and/or ovarian diseases. OD was defined as oligomenorrhea (menstrual cycle <21 days, >35 days, or <8 cycles per year) or amenorrhea (interval of vaginal bleeding >182 days). For adolescents, the specific adolescent criteria were used: for those more than 1 to less than 3 years post menarche, cycles of less than 21 or more than 45 days; for those more than 1 year post menarche, more than 90 days for 1 cycle; or primary amenorrhea by age 15 years or more than 3 years post thelarche.
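The adult OD definition and the "at least 2 of 3 features" rule above can be sketched in a few lines of Python. This is a minimal illustration only: the function and parameter names are hypothetical, and the adolescent criteria are deliberately not modelled.

```python
from typing import Optional


def has_ovulatory_dysfunction(cycle_length_days: Optional[float] = None,
                              cycles_per_year: Optional[float] = None,
                              bleeding_interval_days: Optional[float] = None) -> bool:
    """Adult OD rule as stated in the text (adolescent criteria not modelled)."""
    # Amenorrhea: interval of vaginal bleeding > 182 days.
    if bleeding_interval_days is not None and bleeding_interval_days > 182:
        return True
    # Oligomenorrhea: menstrual cycle < 21 days or > 35 days ...
    if cycle_length_days is not None and (cycle_length_days < 21 or cycle_length_days > 35):
        return True
    # ... or fewer than 8 cycles per year.
    if cycles_per_year is not None and cycles_per_year < 8:
        return True
    return False


def meets_rotterdam(od: bool, ha: bool, pcom: bool) -> bool:
    """PCOS can be diagnosed when at least 2 of the 3 features are present."""
    return sum([od, ha, pcom]) >= 2
```

For example, `meets_rotterdam(has_ovulatory_dysfunction(cycle_length_days=40), False, True)` evaluates to `True` under these assumptions.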
Until 2018, clinical HA was defined as a Ferriman Gallwey score of 8 or greater, and from 2018 onward, clinical HA was defined as a modified Ferriman Gallwey score (mFGs) of 5 or greater (13, 14). Until August 2012, biochemical HA was diagnosed as a total serum testosterone (T) greater than 3.0 nmol/L and/or a free androgen index greater than 4.5%. Since the introduction of liquid chromatography–tandem mass spectrometry (August 2012), a free androgen index cutoff above 2.9% and/or a serum total T greater than 2.0 nmol/L has been used (15). PCOM was defined as 12 or more follicles (2-9 mm in diameter) and/or increased ovarian volume (>10 cm3) in at least 1 ovary, evaluated by transvaginal ultrasound with a transducer frequency of less than 8 MHz. From 2019 onward, a transducer frequency of greater than 8 MHz was used, and the diagnostic cutoff values for PCOM were changed to 20 or more follicles (2-9 mm in diameter) and/or increased ovarian volume (>10 cm3) in at least 1 ovary (16). Girls and women were excluded if they used hormonal contraceptives or received contraceptive (progesterone) injections in the 3 months prior to the screening, and/or were not fasting at the time of screening. For our subgroup analysis, we used the NIH criteria (subset NIH criteria), which include girls and women with OD and clinical and/or biochemical HA.
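Because the HA and PCOM cutoffs changed over the inclusion period, it may help to see them written as period-dependent rules. The sketch below is illustrative only: the function names are hypothetical, the exact switch-over day within August 2012 is not stated in the text and is assumed to be the first of the month, and the year 2018 is assumed to fall under the "from 2018 onward" rule.

```python
from datetime import date

# Assumption: the text gives only "August 2012", not an exact day.
LCMS_START = date(2012, 8, 1)


def biochemical_ha(total_t_nmol_l: float, fai_percent: float, screening_date: date) -> bool:
    """Biochemical HA using the cutoffs in force at the screening date."""
    if screening_date < LCMS_START:
        t_cutoff, fai_cutoff = 3.0, 4.5   # before LC-MS/MS
    else:
        t_cutoff, fai_cutoff = 2.0, 2.9   # since LC-MS/MS
    return total_t_nmol_l > t_cutoff or fai_percent > fai_cutoff


def clinical_ha(hirsutism_score: int, screening_year: int) -> bool:
    """Clinical HA: score >= 8 until 2018, modified score >= 5 from 2018 onward."""
    cutoff = 5 if screening_year >= 2018 else 8
    return hirsutism_score >= cutoff


def pcom(max_follicles_in_one_ovary: int, max_ovarian_volume_cm3: float,
         screening_year: int) -> bool:
    """PCOM: follicle cutoff of 12 before 2019 and 20 from 2019 onward, or volume > 10 cm3."""
    follicle_cutoff = 20 if screening_year >= 2019 else 12
    return max_follicles_in_one_ovary >= follicle_cutoff or max_ovarian_volume_cm3 > 10
```

Under these assumptions, a participant with 15 follicles in one ovary and normal ovarian volume would meet the PCOM definition in 2015 but not in 2020.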
All female participants underwent a standardized screening. Screening took place in the morning after an overnight fast and included assessment of the menstrual cycle, height and weight, calculation of BMI, assessment of hirsutism using mFGs, and blood withdrawal. TFC and ovarian volume were assessed using transvaginal ultrasound. The same day, the following hormone levels were measured: LH, follicle-stimulating hormone (FSH), estradiol (E2), progesterone, 17-hydroxyprogesterone, T, androstenedione (Adion), dehydroepiandrosterone (DHEA), DHEA-sulfate (DHEAS), cortisol, prolactin, thyroid-stimulating hormone (TSH), SHBG, anti-Müllerian hormone (AMH), and fasting levels of insulin (Ins0) and glucose (Glu0). All assay methods and antibodies used are described in Table 1.
Table 1.
Method | Name of assay | Catalog No., RRID |
---|---|---|
Testosterone | ||
Method 1 | RIA kits (Diagnostic Products Corp) | Catalog No. DSL-4000, RRID:AB_3096130 |
Method 2 | RIA (Siemens DPC) | Catalog No. TKTT5, RRID:AB_2905660 |
Method 3 | PerkinElmer CHS MSMS Steroids Kit | NA |
Method 4 | Self-developed LC-MS/MS | NA |
Method 5 | Self-developed LC-MS/MS | NA |
SHBG | ||
Method 1 | RIA kits (Diagnostic Products Corp) | Catalog No. DSL-6300, RRID:AB_3096132 |
Method 2 | Siemens Immulite 2000XPi | NA |
Method 3 | Immunoassay, Immunodiagnostic Systems-iSYS | Catalog No. IS-5600, RRID:AB_3096143 |
DHEAS | ||
Method 1 | RIA kits (Diagnostic Products Corp) | Catalog No. TKDS1, RRID:AB_3096136 |
Method 2 | Immulite 2000, platform assay | NA |
Method 3 | LC-MS/MS | NA |
Method 4 | LC-MS/MS | NA |
Glu0 | ||
Method 1 | Unicell DxC 800 assay (Beckman Coulter) | NA |
Method 2 | Roche Hitachi 917 | NA |
Method 3 | Roche Modular E170 | NA |
Method 4 | COBAS 8000 Modular Analyzer (Roche Diagnostics GmbH) | NA |
Ins0 | ||
Method 1 | Assay not specified | NA |
Method 2 | Immulite 1000 assay (Diagnostic Products Corp) | Catalog No. LKIN1, RRID:AB_2750939 |
Method 3 | Immulite 2000 | Catalog No. L2KIN2, RRID:AB_2756390 |
Method 4 | Lumipulse G1200 | Catalog No. 292938, RRID:AB_3096140 |
LH | ||
Method 1 | Immulite 1000, platform assay | Catalog No. LKLH1, RRID:AB_3096145 |
Method 2 | Siemens Immulite 2000XPi | Catalog No. L2KLH2, RRID:AB_2756388 |
Method 3 | Lumipulse G1200 (Fujirebio) | Catalog No. 292822, RRID:AB_3096138 |
FSH | ||
Method 1 | Immulite 1000, platform assay | Catalog No. LKFS1, RRID:AB_3096144 |
Method 2 | Siemens Immulite 2000XPi | Catalog No. L2KFS2, RRID:AB_2756389 |
Method 3 | Lumipulse G1200 (Fujirebio) | Catalog No. 230923, RRID:AB_3096137 |
E2 | ||
Method 1 | Immulite (Diagnostic Products Corp) | Siemens Catalog No. LKE21, RRID:AB_2800400 |
Method 2 | Siemens RIA | Catalog No. DSL-4800, RRID:AB_3096131 |
Method 3 | Roche Cobas E | NA |
Method 4 | Lumipulse G1200 (Fujirebio) | Catalog No. 296011, RRID:AB_3096142 |
Adion | ||
Method 1 | RIA kits (Diagnostic Products Corp) | Catalog No. TKAN1, RRID:AB_3096134 |
Method 2 | Immulite 2000 platform assay | Siemens Catalog No. LKAO1, RRID:AB_2895713 |
Method 3 | LC-MS/MS | NA |
Method 4 | Self-developed UPLC-MS/MS test | NA |
DHEA | ||
Method 1 | RIA kits (Diagnostic Products Corp) | Catalog No. TKDH1, RRID:AB_3096135 |
Method 2 | Immulite 2000 platform assay | NA |
Method 3 | LC-MS/MS | NA |
Method 4 | LC-MS/MS | NA |
Cortisol | ||
Method 1 | Immulite 2000 | Siemens Catalog No. LKCO2, RRID:AB_2810257 |
Method 2 | UPLC-MS/MS by means of Beckman Access II | NA |
TSH | ||
Method 1 | Amerlite TSH assay | NA |
Method 2 | Immulite 2000XPi | NA |
Method 3 | Lumipulse G1200 (Fujirebio) | Catalog No. 294604, RRID:AB_3096141 |
AMH | ||
Method 1 | Immulon 2 plates (Dynatech Corp) | NA |
Method 2 | Immunotech-Coulter | NA |
Method 3 | Beckman Coulter Inc, AMH Gen II assay | Beckman Coulter Catalog No. 79765, RRID:AB_2800500 |
Method 4 | Lumipulse G1200 (Fujirebio) | NA |
Prolactin | ||
Method 1 | Siemens Immulite 2000XPi | Catalog No. L2KPR2, RRID:AB_2827375 |
Method 2 | Lumipulse G1200 (Fujirebio) | Catalog No. 292839, RRID:AB_3096139 |
Method 3 | Siemens Atellica IM1300 | Catalog No. 10995656, RRID:AB_3096296 |
Abbreviations: Adion, androstenedione; AMH, anti-Müllerian hormone; DHEA, dehydroepiandrosterone; DHEAS, dehydroepiandrosterone sulfate; E2, estradiol; FSH, follicle-stimulating hormone; Glu0, glucose; Ins0, insulin; LC-MS/MS, liquid chromatography–tandem mass spectrometry; LH, luteinizing hormone; NA, not available; RIA, radioimmunoassay; RRID, Research Resource Identifier (http://antibodyregistry.org/); SHBG, sex hormone–binding globulin; TSH, thyroid-stimulating hormone; UPLC, ultra-performance liquid chromatography.
Before 2019, lipid levels were measured occasionally and since 2019 lipid measurements are part of the standardized screening. Measured lipid levels included triglycerides (TG), low-density lipoprotein (LDL), high-density lipoprotein (HDL), and total cholesterol (Chol). Furthermore, systolic blood pressure (SBP) and diastolic blood pressure (DBP) were assessed. The medical ethical review board of the Erasmus University Medical Center Rotterdam approved retrospective studies within this patient population, which includes girls and women with ovulatory dysfunction (MEC-2020-0534).
Cluster Analysis
Cluster analysis and subtype naming were performed as we have previously reported (8). Unsupervised hierarchical cluster analysis was applied using the following 8 quantitative variables: BMI, T, SHBG, DHEAS, LH, FSH, Ins0, and Glu0. First, we performed the cluster analysis on our total cohort (Rotterdam criteria) and then repeated the analysis in the subset fulfilling the NIH criteria (subset NIH criteria). Individuals with a glucose level above 7 mmol/L were excluded. The quantitative variables were first natural log (ln)-normalized and adjusted for age and assay method. Next, an inverse normal transformation was applied to each variable to ensure equal scaling. The resulting residuals were then clustered using hierarchical clustering (HC) as described previously (8). The subtypes were designated 1) “reproductive,” characterized by higher LH and SHBG levels with relatively low BMI and insulin levels; 2) “metabolic,” characterized by increased BMI, glucose, and insulin levels with lower SHBG and LH levels; and 3) “background” (previously labeled indeterminate), for the cases that demonstrated no distinguishable pattern regarding their relative phenotypic trait distributions (17). The contribution of each variable to the first 3 dimensions of the clustering was further quantified.
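The preprocessing and clustering steps described above can be sketched as follows. This is a standard-library illustration under stated assumptions, not the pipeline used in the study: all function names are hypothetical, the age and assay adjustment is modelled as an ordinary least-squares fit with one intercept per assay method and a pooled age slope, the rank offset of 0.5 in the inverse normal transformation is an assumption, and average linkage with Euclidean distance stands in for the clustering settings, which are specified only in the cited reference (8).

```python
import math
from statistics import NormalDist, mean


def residualize(values, ages, assays):
    """ln-transform, then remove assay-specific means and a pooled linear age effect."""
    logs = [math.log(v) for v in values]
    groups = {}
    for i, assay in enumerate(assays):
        groups.setdefault(assay, []).append(i)
    y_c, x_c = [0.0] * len(values), [0.0] * len(values)
    for idx in groups.values():
        y_mean = mean([logs[i] for i in idx])
        x_mean = mean([ages[i] for i in idx])
        for i in idx:
            y_c[i], x_c[i] = logs[i] - y_mean, ages[i] - x_mean
    # Within-assay centring followed by a pooled slope equals OLS with assay intercepts.
    sxx = sum(x * x for x in x_c)
    slope = sum(x * y for x, y in zip(x_c, y_c)) / sxx if sxx > 0 else 0.0
    return [y - slope * x for x, y in zip(x_c, y_c)]


def inverse_normal_transform(values, offset=0.5):
    """Rank-based inverse normal transformation; ties receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tied block
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


def agglomerative_clusters(rows, k):
    """Naive average-linkage agglomerative clustering; returns one label per row."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        return sum(math.dist(rows[a], rows[b]) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(rows)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels
```

In use, each of the 8 variables would be passed through `residualize` and `inverse_normal_transform`, the transformed columns assembled into one row per participant, and the rows passed to `agglomerative_clusters` with `k=3`. The naive implementation is cubic or worse in the number of rows and is meant only to make the steps concrete.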
Comparison Between Subtypes
Additional clinical variables were compared between the subtypes, including E2, Adion, DHEA, cortisol, TSH, AMH, TFC, prolactin, mFGs, TG, Chol, LDL, HDL, SBP, and DBP. These variables were first transformed using log10 transformation, and subsequently Z scores were calculated to account for the use of different assays. Analysis of variance was used to compare the clinical variables between the 3 subtypes. Pair-wise comparison between different subtypes was adjusted for multiple testing using Bonferroni correction.
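The comparison steps in this paragraph (log10 transformation, Z scores to account for different assays, analysis of variance, and Bonferroni correction) can be illustrated with a short standard-library sketch. The function names are hypothetical, and computing Z scores within each assay method is an assumption about how the assay differences were handled. Only the F statistic is computed here; obtaining its P value requires an F distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev


def zscore_within_assay(values, assays):
    """log10-transform, then standardize within each assay method.

    Each assay group needs at least 2 distinct values for the
    standard deviation to be defined and nonzero.
    """
    logs = [math.log10(v) for v in values]
    groups = {}
    for i, assay in enumerate(assays):
        groups.setdefault(assay, []).append(i)
    z = [0.0] * len(values)
    for idx in groups.values():
        group_logs = [logs[i] for i in idx]
        m, s = mean(group_logs), stdev(group_logs)
        for i in idx:
            z[i] = (logs[i] - m) / s
    return z


def one_way_anova_f(groups):
    """F statistic of a one-way analysis of variance across groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With 3 subtypes there are 3 pairwise comparisons, so an unadjusted P value of .01 becomes .03 after correction.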
Distribution of the Phenotypes Within Subtypes
We compared the distribution of each phenotype from the Rotterdam criteria over the 3 subtypes. The results are provided in Supplementary Tables S1 and S2 (18). The Rotterdam diagnostic criteria include 4 phenotypes: phenotype A (OD + HA + PCOM); phenotype B (OD + HA); phenotype C (HA + PCOM); and phenotype D (OD + PCOM). Whenever one of the clinical characteristics (OD, HA, or PCOM) was missing, we considered the phenotype of that participant as missing. We performed this analysis for both the total cohort and the subset fulfilling the NIH criteria. Finally, we stratified the total cohort into the groups “classic NIH” (phenotype A + B) and “non-NIH Rotterdam” (phenotype C + D) and compared the distribution of the 3 subtypes within both groups by performing a chi-square test.
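The phenotype assignment, the handling of missing characteristics, and the final chi-square comparison can be sketched as below. The function names are hypothetical and the counts in the usage note are made up for illustration. For a 2 × 3 table the test has 2 degrees of freedom, for which the chi-square survival function has the closed form exp(−x/2), so no statistics library is needed.

```python
import math


def rotterdam_phenotype(od, ha, pcom):
    """Return 'A' to 'D'; None if any characteristic is missing (None)
    or if fewer than 2 features are present."""
    if od is None or ha is None or pcom is None:
        return None
    if od and ha and pcom:
        return "A"
    if od and ha:
        return "B"
    if ha and pcom:
        return "C"
    if od and pcom:
        return "D"
    return None


def nih_group(phenotype):
    """Map a phenotype to 'classic NIH' (A + B) or 'non-NIH Rotterdam' (C + D)."""
    if phenotype in ("A", "B"):
        return "classic NIH"
    if phenotype in ("C", "D"):
        return "non-NIH Rotterdam"
    return None


def chi_square_2x3(table):
    """Pearson chi-square statistic and P value for a 2 x 3 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    p_value = math.exp(-stat / 2)  # exact for 2 degrees of freedom
    return stat, p_value
```

For instance, with hypothetical counts `[[20, 10, 10], [10, 10, 20]]` (rows: classic NIH and non-NIH Rotterdam; columns: the 3 subtypes), the statistic is 20/3 and the P value is exp(−10/3).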
Results
Descriptive Statistics
In total, 2510 girls and women with PCOS, diagnosed using the Rotterdam criteria, were included (total cohort). Eight participants were excluded because of a serum glucose greater than 7 mmol/L. Of the remaining 2502 girls and women, 1067 met the NIH criteria (subset NIH criteria). Tables 2 and 3 show the 8 quantitative variables used for the cluster analysis, stratified by assay method, for the total cohort and the subset NIH criteria, respectively.
Table 2.
Variable | N | Metabolic median (25-75) | N | Reproductive median (25-75) | N | Background median (25-75) | P total | P (met – rep) | P (met – back) | P (rep – back) |
---|---|---|---|---|---|---|---|---|---|---|
Age, y | 1026 | 29.0 (25.2-32.4) | 450 | 29.4 (26.1-32.1) | 1026 | 29.1 (25.9-32.2) | .455 | .628 | ≥.999 | ≥.999 |
BMI | 1026 | 30.1 (25.2-34.9) | 450 | 21.5 (19.8-24.0) | 1026 | 22.5 (20.5-25.3) | <.001 | <.001 | <.001 | <.001 |
Testosterone | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 52 | 3.0 (2.1-3.5) | 18 | 3.0 (2.7-3.3) | 28 | 2.0 (1.5-2.4) | ||||
Method 2 | 488 | 2.0 (1.5-2.6) | 235 | 2.0 (1.6-2.7) | 498 | 1.2 (0.8-1.6) | ||||
Method 3 | 259 | 1.4 (1.1-1.9) | 108 | 1.5 (1.3-2.1) | 257 | 1.1 (0.8-1.4) | ||||
Method 4 | 53 | 1.4 (1.1-1.8) | 24 | 1.8 (1.5-2.1) | 55 | 1.0 (0.9-1.2) | ||||
Method 5 | 174 | 1.5 (1.2-1.9) | 65 | 1.7 (1.4-2.4) | 188 | 1.0 (0.8-1.3) | ||||
SHBG | | | | | | | <.001 | <.001 | <.001 | .002 |
Method 1 | 51 | 32.8 (23.4-51.1) | 18 | 65.0 (47.2-93.5) | 25 | 65.8 (50.2-93.0) | ||||
Method 2 | 877 | 31.3 (22.6-43.2) | 401 | 61.9 (48.1-76.8) | 884 | 56.0 (42.8-73.7) | ||||
Method 3 | 98 | 30.6 (22.2-41.4) | 31 | 69.1 (47.7-82.0) | 117 | 61.6 (42.8-75.7) | ||||
DHEAS | | | | | | | <.001 | <.001 | <.001 | .201 |
Method 1 | 51 | 7.3 (4.7-10.2) | 18 | 6.8 (4.9-9.5) | 25 | 5.8 (4.0-8.6) | ||||
Method 2 | 530 | 5.8 (4.1-7.4) | 261 | 4.5 (3.5-5.8) | 541 | 4.5 (3.1-6.0) | ||||
Method 3 | 218 | 5.0 (3.6-6.4) | 82 | 3.9 (2.9-5.4) | 217 | 3.9 (2.6-5.6) | ||||
Method 4 | 227 | 5.5 (4.3-7.2) | 89 | 3.8 (2.6-5.7) | 243 | 4.1 (3.0-5.6) | ||||
Glu0 | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 21 | 4.6 (4.1-4.9) | 12 | 2.8 (2.3-2.9) | 11 | 4.3 (4.0-4.7) | ||||
Method 2 | 290 | 4.2 (3.9-4.5) | 113 | 3.6 (3.4-3.8) | 239 | 4.0 (3.7-4.2) | ||||
Method 3 | 254 | 5.0 (4.7-5.2) | 139 | 4.5 (4.3-4.7) | 302 | 4.7 (4.5-4.9) | ||||
Method 4 | 461 | 5.2 (4.9-5.5) | 186 | 4.7 (4.5-4.9) | 474 | 4.9 (4.7-5.1) | ||||
Ins0 | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 200 | 94.0 (72.0-141.6) | 50 | 43.5 (29.0-58.0) | 167 | 49.0 (36.0-64.1) | ||||
Method 2 | 107 | 83.0 (62.5-111.0) | 75 | 34.0 (26.0-47.0) | 83 | 44.0 (29.5-57.0) | ||||
Method 3 | 548 | 82.0 (56.0-123.0) | 260 | 27.0 (15.8-40.0) | 589 | 35.0 (20.0-51.0) | ||||
Method 4 | 171 | 94.0 (63.0-127.5) | 65 | 34.0 (30.0-41.0) | 187 | 44.0 (32.5-56.0) | ||||
LH | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 51 | 6.9 (4.9-10.4) | 18 | 10.5 (7.7-11.5) | 25 | 5.2 (3.5-6.1) | ||||
Method 2 | 804 | 8.7 (5.8-12.4) | 367 | 12.8 (8.7-18.1) | 814 | 4.9 (3.3-6.9) | ||||
Method 3 | 171 | 9.1 (6.3-13.0) | 65 | 12.0 (8.8-16.2) | 187 | 4.6 (3.2-6.7) | ||||
FSH | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 51 | 4.7 (3.6-6.1) | 18 | 4.7 (3.6-5.0) | 25 | 4.9 (3.8-5.7) | ||||
Method 2 | 804 | 5.9 (4.5-7.2) | 367 | 7.0 (5.8-8.4) | 814 | 4.8 (3.3-6.2) | ||||
Method 3 | 171 | 5.6 (4.7-6.7) | 65 | 6.8 (6.0-7.6) | 187 | 5.0 (3.5-6.1) |
Values are medians with 25th and 75th percentiles for each cluster. For pairwise comparisons, Z scores were calculated to account for the use of different assays and a post hoc Bonferroni correction was used to adjust for multiple testing.
Abbreviations: back, background; BMI, body mass index; DHEAS, dehydroepiandrosterone sulfate; FSH, follicle-stimulating hormone; Glu0, glucose; Ins0, insulin; LH, luteinizing hormone; met, metabolic; rep, reproductive; SHBG, sex hormone–binding globulin; y, year.
Table 3.
Variable | N | Metabolic median (25-75) | N | Reproductive median (25-75) | N | Background median (25-75) | P total | P (met – rep) | P (met – back) | P (rep – back) |
---|---|---|---|---|---|---|---|---|---|---|
Age, y | 652 | 28.1 (24.3-32.0) | 199 | 29.0 (25.9-31.8) | 216 | 28.2 (24.1-32.0) | .092 | .104 | ≥.999 | .232 |
BMI | 652 | 30.6 (26.7-35.0) | 199 | 22.4 (20.4-24.8) | 216 | 22.8 (20.5-25.9) | <.001 | <.001 | <.001 | .845 |
Testosterone | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 5 | 3.0 (2.9-3.3) | — | — | — | — | ||||
Method 2 | 342 | 2.2 (1.7-2.8) | 80 | 3.1 (2.6-3.7) | 113 | 1.6 (1.3-2.1) | ||||
Method 3 | 180 | 1.5 (1.2-1.9) | 74 | 2.2 (2.0-2.7) | 62 | 1.3 (0.9-1.5) | ||||
Method 4 | 39 | 1.4 (1.2-2.0) | 15 | 2.1 (1.7-2.8) | 15 | 1.1 (1.0-1.5) | ||||
Method 5 | 86 | 1.4 (1.1-1.8) | 30 | 2.2 (1.9-2.8) | 26 | 1.5 (1.1-1.7) | ||||
SHBG | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 3 | 30.7 (25.7-40.0) | — | — | — | — | ||||
Method 2 | 627 | 29.2 (21.4-37.5) | 188 | 58.4 (48.5-74.7) | 206 | 41.0 (31.8-52.6) | ||||
Method 3 | 22 | 26.7 (21.1-35.3) | 11 | 67.1 (56.5-78.5) | 10 | 46.3 (35.1-62.3) | ||||
DHEAS | | | | | | | <.001 | <.001 | .316 | <.001 |
Method 1 | 3 | 12.6 (9.8-12.8) | — | — | — | — | ||||
Method 2 | 380 | 5.8 (4.2-7.5) | 94 | 4.9 (3.6-6.3) | 123 | 6.3 (4.8-7.6) | ||||
Method 3 | 269 | 5.2 (4.0-6.8) | 105 | 4.0 (3.0-5.7) | 93 | 5.95 (4.3-7.6) | ||||
Glu0 | | | | | | | <.001 | <.001 | <.001 | .106 |
Method 1 | 2 | 3.7 (3.7-3.8) | — | — | — | — | ||||
Method 2 | 182 | 4.2 (3.8-4.4) | 38 | 3.9 (3.6-4.0) | 48 | 3.8 (3.5-4.0) | ||||
Method 3 | 183 | 4.9 (4.7-5.2) | 50 | 4.6 (4.4-4.7) | 72 | 4.6 (4.2-4.8) | ||||
Method 4 | 285 | 5.1 (4.9-5.4) | 111 | 4.9 (4.6-5.0) | 96 | 4.7 (4.6-5.0) | ||||
Ins0 | | | | | | | <.001 | <.001 | <.001 | 1.0 |
Method 1 | 94 | 93.5 (72.0-139.3) | 16 | 51.0 (42.1-68.7) | 11 | 36.1 (31.5-59.0) | ||||
Method 2 | 88 | 86.5 (61.8-110.0) | 22 | 29.5 (24.5-35.8) | 37 | 38.0 (30.0-44.0) | ||||
Method 3 | 386 | 86.0 (62.0-131.0) | 131 | 33.0 (17.5-48.0) | 143 | 32.0 (18.0-43.0) | ||||
Method 4 | 84 | 98.5 (60.8-141.0) | 30 | 33.0 (31.0-44.8) | 25 | 40.0 (35.0-48.0) | ||||
LH | | | | | | | <.001 | <.001 | .004 | <.001 |
Method 1 | 3 | 4.8 (4.6-9.3) | — | — | — | — | ||||
Method 2 | 565 | 8.4 (5.4-11.7) | 169 | 13.3 (10.0-18.8) | 191 | 7.1 (4.8-10.4) | ||||
Method 3 | 84 | 8.1 (5.4-10.8) | 30 | 12.8 (10.4-17.3) | 25 | 6.6 (5.7-8.6) | ||||
FSH | | | | | | | <.001 | <.001 | .248 | <.001 |
Method 1 | 3 | 5.6 (5.2-5.8) | — | — | — | — | ||||
Method 2 | 565 | 5.6 (4.0-7.0) | 169 | 6.4 (5.3-7.9) | 191 | 5.6 (4.5-7.0) | ||||
Method 3 | 84 | 5.4 (4.0-6.3) | 30 | 6.4 (5.6-7.3) | 25 | 5.1 (4.7-5.6) |
Values are medians with 25th and 75th percentiles for each cluster. For pairwise comparisons, Z scores were calculated to account for the use of different assays and a post hoc Bonferroni correction was used to adjust for multiple testing.
Abbreviations: back, background; BMI, body mass index; DHEAS, dehydroepiandrosterone sulfate; FSH, follicle-stimulating hormone; Glu0, glucose; Ins0, insulin; LH, luteinizing hormone; met, metabolic; rep, reproductive; SHBG, sex hormone–binding globulin; y, year.
Cluster Analysis
In our total cohort, the 3 previously defined subtypes (8) were distributed as follows: (1) metabolic subtype (41.0%, 1026/2502), characterized by higher BMI, Glu0, and Ins0 levels with relatively low LH and SHBG levels; (2) reproductive subtype (18.0%, 450/2502), characterized by higher FSH, LH, and SHBG levels with relatively low BMI and Ins0 levels; and (3) background subtype (41.0%, 1026/2502), which had no distinguishable pattern in the phenotypic trait distributions (see Table 2, Fig. 1). Age did not differ between the 3 subtypes (P = .455). All other variables differed significantly across the 3 subtypes (all P < .001); in the pairwise comparisons, DHEAS levels were significantly higher only in the metabolic subtype (P < .001) and did not differ between the reproductive and background subtypes (see Table 2). These results are also shown in the principal component analysis (PCA) plot and box plot, which demonstrate that the reproductive subtype is mainly driven by LH and SHBG, contributing 18.7% and 14.6%, respectively, to the first 2 principal components, whereas the metabolic subtype is driven by BMI and Ins0, contributing 16.8% and 15.9%, respectively (Fig. 1A and 1B). These results are supported by heat map visualization (Fig. 1C), which reflects the similarity of individual subjects in a row-based dendrogram compared to the trait Z scores of the 3 cluster groups.
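As a point of reference for reading the contribution percentages, one common convention is shown below. It is an assumption here, because the text does not state which formula was used. In this convention the contribution of variable j to component k is its squared loading as a share of all squared loadings on that component, and the contribution to the first 2 components combined is the eigenvalue-weighted average:

```latex
C_{jk} = 100 \times \frac{\ell_{jk}^{2}}{\sum_{i=1}^{p} \ell_{ik}^{2}},
\qquad
C_{j}^{(1,2)} = \frac{\lambda_{1}\, C_{j1} + \lambda_{2}\, C_{j2}}{\lambda_{1} + \lambda_{2}}
```

Here \ell_{jk} denotes the loading of variable j on component k, \lambda_{k} the eigenvalue of component k, and p = 8 the number of clustering variables. If all 8 variables contributed equally, each would account for 100/8 = 12.5%, so the values reported above for LH, SHBG, BMI, and Ins0 all lie above this uniform level.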
Our results are in line with our previously published study on hierarchical clustering in women with PCOS diagnosed with the NIH criteria (8). Indeed, after subsetting our data based on the NIH criteria and repeating the cluster analysis, 3 distinct subtypes could again be identified (see Table 3, Fig. 2). However, compared to the total cohort, the percentage of women with the metabolic subtype was higher (61.1%, 652/1067), while the percentage of women with the background subtype was lower (20.2%, 216/1067) (see Table 3). The percentage of women with the reproductive subtype was similar to that in the total cohort (18.7%, 199/1067). Age was again not significantly different among the 3 subtypes (P = .09) (see Table 3). In line with the analysis in the total cohort, the PCA plot, box plot, and heat map showed that the reproductive subtype is mainly driven by LH, SHBG, and FSH, contributing 19.6%, 15.3%, and 13.0%, respectively, to the first 2 principal components, whereas the metabolic subtype is mainly driven by BMI and Ins0, contributing 17.4% and 18.2%, respectively (see Fig. 2). The background subtype has no distinguishable pattern in the phenotypic trait distributions (see Fig. 2).
Comparison of Additional Clinical Variables Between Subtypes
We compared clinical variables not used for clustering to determine whether the subtypes captured additional distinctive biologic features related to reproductive or metabolic pathways (Tables 4 and 5). In the reproductive subtype, girls and women had significantly higher AMH levels and higher TFC compared to participants in the metabolic subtype (all P < .001). Participants in the metabolic subtype had significantly higher TG and LDL levels and significantly lower HDL levels, compared to those in the reproductive subtype (all P < .001). SBP and DBP were also significantly higher in the metabolic subtype compared to both the reproductive and background subtypes (P < .001). These results were similar when PCOS was defined by the NIH or Rotterdam diagnostic criteria.
Table 4.
Variable | N | Metabolic median (25-75) | N | Reproductive median (25-75) | N | Background median (25-75) | P total | P (met – rep) | P (met – back) | P (rep – back) |
---|---|---|---|---|---|---|---|---|---|---|
E2 | | | | | | | <.001 | <.001 | .071 | <.001 |
Method 1 | 52 | 237.5 (195.0-303.3) | 18 | 263.0 (219.3-352.0) | 28 | 200.5 (127.0-263.8) | ||||
Method 2 | 592 | 219.5 (164.0-294.0) | 275 | 243.0 (17.0-371.0) | 600 | 201.5 (123.3-354.0) | ||||
Method 3 | 265 | 184.0 (139.0-297.5) | 117 | 229.0 (165.0-345.5) | 262 | 196.5 (123.8-400.3) | ||||
Method 4 | 116 | 184.0 (138.5-323.5) | 40 | 213.0 (164.8-354.5) | 136 | 190.5 (122.8-416.0) | ||||
Adion | | | | | | | <.001 | ≥.999 | <.001 | <.001 |
Method 1 | 51 | 15.0 (10.3-20.2) | 18 | 13.3 (9.4-20.3) | 25 | 10.8 (6.3-13.7) | ||||
Method 2 | 489 | 12.1 (9.3-15.4) | 236 | 11.3 (8.9-14.3) | 500 | 8.5 (6.5-11.4) | ||||
Method 3 | 259 | 6.1 (4.8-7.8) | 106 | 6.1 (4.7-8.3) | 255 | 4.3 (3.4-5.5) | ||||
Method 4 | 227 | 6.3 (4.7-7.9) | 89 | 6.9 (5.7-8.8) | 243 | 4.3 (3.4-5.4) | ||||
DHEA | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | — | — | — | — | — | — | ||||
Method 2 | 528 | 42.4 (29.3-60.4) | 260 | 36.1 (25.4-50.5) | 538 | 30.7 (20.4-44.6) | ||||
Method 3 | 218 | 22.0 (15.0-29.9) | 82 | 18.8 (12.8-25.7) | 217 | 18.1 (12.9-25.6) | ||||
Method 4 | 227 | 20.3 (14.9-28.1) | 89 | 18.5 (14.2-25.8) | 243 | 16.4 (11.8-23.6) | ||||
Cortisol | | | | | | | ≥.999 | ≥.999 | ≥.999 | ≥.999 |
Method 1 | 939 | 315.0 (243.0-405.0) | 420 | 309.0 (252.0-400.8) | 926 | 317.0 (251.0-400.0) | ||||
Method 2 | 81 | 264.0 (180.5-339.0) | 28 | 293.0 (234.8-330.0) | 99 | 265.0 (210.0-332.0) | ||||
TSH | | | | | | | .013 | .410 | .011 | 1.0 |
Method 1 | 50 | 1.2 (0.8-2.0) | 18 | 1.4 (0.8-1.7) | 24 | 1.1 (0.7-1.5) | ||||
Method 2 | 815 | 1.8 (1.2-2.4) | 373 | 1.5 (1.2-2.2) | 828 | 1.5 (1.1-2.3) | ||||
Method 3 | 160 | 1.9 (1.4-2.5) | 59 | 1.8 (1.2-2.7) | 173 | 1.8 (1.3-2.3) | ||||
AMH | | | | | | | <.001 | <.001 | .029 | <.001 |
Method 1 | 1 | 51.0 (51.0-51.0) | — | — | — | — | ||||
Method 2 | 246 | 14.4 (8.0-23.7) | 149 | 17.3 (10.4-30.4) | 239 | 11.8 (8.1-18.2) | ||||
Method 3 | 378 | 7.0 (4.3-11.0) | 167 | 10.0 (5.7-15.4) | 420 | 5.7 (3.8-9.1) | ||||
Method 4 | 151 | 6.8 (4.3-9.8) | 51 | 9.7 (6.9-19.3) | 161 | 6.0 (4.3-8.7) | ||||
TFC | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 770 | 38.0 (27.0-53.0) | 375 | 43.0 (30.0-61.0) | 783 | 33.0 (25.0-43.0) | ||||
Method 2 | 139 | 52.0 (39.0-69.0) | 56 | 64.0 (45.5-87.0) | 162 | 51.0 (37.0-65.0) | ||||
Prolactin | | | | | | | .535 | ≥.999 | ≥.999 | ≥.999 |
Method 1 | 197 | 0.2 (0.1-0.2) | 50 | 0.2 (0.1-0.2) | 167 | 0.2 (0.1-0.2) | ||||
Method 2 | 655 | 0.2 (0.2-0.3) | 334 | 0.2 (0.1-0.3) | 672 | 0.2 (0.1-0.3) | ||||
Method 3 | 171 | 0.3 (0.2-0.4) | 65 | 0.3 (0.2-0.4) | 187 | 0.3 (0.2-0.4) | ||||
mFGs | 818 | 3.0 (1.0-7.0) | 386 | 1.0 (0.0-4.0) | 839 | 1.0 (0.0-4.0) | <.001 | <.001 | <.001 | ≥.999 |
TG | 497 | 1.2 (0.9-1.7) | 177 | 0.7 (0.6-0.9) | 450 | 0.8 (0.6-1.1) | <.001 | <.001 | <.001 | .13 |
Chol | 497 | 4.8 (4.2-5.4) | 177 | 4.7 (4.1-5.2) | 450 | 4.4 (3.9-5.0) | <.001 | .89 | <.001 | <.001 |
LDL | 497 | 3.1 (2.6-3.6) | 177 | 2.8 (2.4-3.3) | 450 | 2.6 (2.2-3.1) | <.001 | <.001 | <.001 | .045 |
HDL | 497 | 1.1 (0.9-1.4) | 177 | 1.6 (1.4-1.9) | 450 | 1.4 (1.2-1.7) | <.001 | <.001 | <.001 | <.001 |
SBP | 780 | 120.0 (110.0-126.0) | 375 | 110.0 (105.0-120.0) | 822 | 110.0 (105.0-120.0) | <.001 | <.001 | <.001 | .464 |
DBP | 780 | 80.0 (70.0-84.0) | 375 | 70.0 (65.0-80.0) | 822 | 70.0 (70.0-80.0) | <.001 | <.001 | <.001 | ≥.999 |
Values are medians with 25th and 75th percentiles for each cluster. For pairwise comparisons, Z scores were calculated to account for the use of different assays and a post hoc Bonferroni correction was used to adjust for multiple testing.
Abbreviations: Adion, androstenedione; AMH, anti-Müllerian hormone; back, background; Chol, cholesterol; DBP, diastolic blood pressure; DHEA, dehydroepiandrosterone; E2, estradiol; HDL, high-density lipoprotein; LDL, low-density lipoprotein; met, metabolic; mFGs, modified Ferriman Gallwey score; rep, reproductive; SBP, systolic blood pressure; TFC, total follicle count; TG, triglycerides; TSH, thyroid-stimulating hormone.
Table 5.
Variable | N | Metabolic median (25-75) | N | Reproductive median (25-75) | N | Background median (25-75) | P total | P (met – rep) | P (met – back) | P (rep – back) |
---|---|---|---|---|---|---|---|---|---|---|
E2 | | | | | | | <.001 | .04 | <.001 | <.001 |
Method 1 | 5 | 242.0 (199.5-337.5) | — | — | — | — | ||||
Method 2 | 322 | 228.0 (117.0-335.3) | 107 | 240.0 (187.0-338.0) | 133 | 191.0 (137.0-290.0) | ||||
Method 3 | 186 | 183.0 (138.8-355.0) | 77 | 237.0 (165.5-386.5) | 69 | 158.0 (104.0-276.0) | ||||
Method 4 | 38 | 211.5 (160.5-278.0) | 15 | 261.0 (186.0-356.0) | 14 | 201.0 (123.0-357.5) | ||||
Adion | | | | | | | <.001 | <.001 | <.001 | <.001 |
Method 1 | 3 | 20.2 (15.2-20.2) | — | — | — | — | ||||
Method 2 | 344 | 12.8 (9.7-16.0) | 82 | 15.2 (11.3-18.1) | 113 | 10.8 (8.8-13.8) | ||||
Method 3 | 180 | 6.5 (5.0-8.2) | 72 | 8.3 (6.5-9.6) | 62 | 5.8 (4.1-7.2) | ||||
Method 4 | 125 | 6.3 (4.7-8.1) | 45 | 8.4 (6.9-10.8) | 41 | 5.7 (4.6-7.7) | ||||
DHEA | | | | | | | .028 | .024 | ≥.999 | .397 |
Method 1 | — | — | — | — | — | — | ||||
Method 2 | 380 | 43.4 (31.4-61.6) | 93 | 35.9 (26.2-47.3) | 123 | 42.7 (29.4-60.7) | ||||
Method 3 | 144 | 23.0 (17.0-31.6) | 60 | 21.1 (16.3-23.0) | 52 | 21.7 (15.4-29.5) | ||||
Method 4 | 125 | 21.0 (15.2-29.5) | 45 | 18.5 (14.7-24.1) | 41 | 21.6 (16.1-29.2) | ||||
Cortisol | | | | | | | .787 | ≥.999 | ≥.999 | ≥.999 |
Method 1 | 639 | 310.0 (237.0-394.0) | 192 | 312.0 (242.8-411.8) | 209 | 309.0 (247.5-393.5) | ||||
Method 2 | 13 | 218.0 (178.5-354.5) | 6 | 235.5 (179.3-287.8) | 7 | 217.0 (203.0-250.0) | ||||
TSH | .086 | ≥.999 | .148 | .137 | ||||||
Method 1 | 3 | 1.1 (0.6—X) | — | — | — | — | ||||
Method 2 | 578 | 1.7 (1.2-2.3) | 171 | 1.7 (1.2-2.3) | 192 | 1.5 (1.1-2.1) | ||||
Method 3 | 71 | 1.9 (1.5-2.7) | 28 | 2.1 (1.2-3.0) | 24 | 1.7 (1.2-1.9) | ||||
AMH | <.001 | <.001 | ≥.999 | <.001 | ||||||
Method 1 | 193 | 14.3 (7.5-24.0) | 51 | 25.7 (13.4-38.9) | 72 | 12.9 (8.9-23.3) | ||||
Method 2 | 267 | 7.5 (4.7-11.9) | 100 | 12.1 (7.8-18.9) | 97 | 7.0 (4.5-11.8) | ||||
Method 3 | 62 | 6.6 (4.4-9.8) | 25 | 16.8 (12.6-23.0) | 20 | 8.8 (5.9-11.7) | ||||
TFC | <.001 | <.001 | 1.0 | <.001 | ||||||
Method 1 | 509 | 41.0 (29.0-57.0) | 161 | 55.0 (37.5-75.0) | 169 | 38.0 (29.0-52.0) | ||||
Method 2 | 57 | 54.0 (37.5-77.5) | 27 | 75.0 (63.0-89.0) | 21 | 52.0 (42.0-71.5) | ||||
Prolactin | .537 | .877 | 1.0 | ≥.999 | ||||||
Method 1 | 94 | 0.2 (0.1-0.2) | 16 | 0.2 (0.2-0.3) | 11 | 0.2 (0.2-0.4) | ||||
Method 2 | 474 | 0.2 (0.1-0.3) | 152 | 0.2 (0.2-0.3) | 180 | 0.2 (0.1-0.3) | ||||
Method 3 | 84 | 0.3 (0.2-0.4) | 30 | 0.3 (0.2-0.3) | 25 | 0.3 (0.2-0.4) | ||||
mFGs | 647 | 3.0 (0.0-7.0) | 197 | 2.0 (0.0-6.0) | 216 | 5.0 (1.0-7.0) | .002 | .012 | .70 | .003 |
TG | 320 | 1.3 (0.9-1.9) | 90 | 0.7 (0.6-1.0) | 95 | 0.8 (0.6-1.1) | <.001 | <.001 | <.001 | .305 |
Chol | 320 | 4.8 (4.2-5.4) | 90 | 4.5 (4.1-5.0) | 95 | 4.7 (4.0-5.2) | .154 | .311 | .498 | ≥.999 |
LDL | 320 | 3.1 (2.6-3.6) | 90 | 2.6 (2.3-3.2) | 95 | 2.8 (2.4-3.5) | <.001 | <.001 | .083 | .576 |
HDL | 320 | 1.1 (0.8-1.3) | 90 | 1.5 (1.2-1.8) | 95 | 1.4 (1.1-1.7) | <.001 | <.001 | <.001 | .084 |
SPB | 523 | 120.0 (110.0-126.0) | 174 | 110.0 (105.0-120.0) | 191 | 110.0 (105.0-120.0) | <.001 | <.001 | <.001 | ≥.999 |
DBP | 522 | 80.0 (70.0-85.0) | 174 | 70.0 (70.0-80.0) | 191 | 70.0 (68.0-80.0) | <.001 | <.001 | <.001 | ≥.999 |
Values are medians with 25th and 75th percentiles for each cluster. For pairwise comparisons, Z scores were calculated to account for the use of different assays and a post hoc Bonferroni correction was used to adjust for multiple testing.
Abbreviations: Adion, androstenedione; AMH, antimüllerian hormone; back, background; Chol, cholesterol; DBP, diastolic blood pressure; DHEA, dehydroepiandrosterone; E2, estradiol; HDL, high-density lipoprotein; LDL, low-density lipoprotein; met, metabolic; mFGs, modified Ferriman Gallwey score; rep, reproductive; SBP, systolic blood pressure; TFC, total follicle count; TG, triglycerides; TSH, thyrotropin.
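The table footnote describes two steps: converting measurements to Z scores so that values from different assays can be pooled, and applying a Bonferroni correction to the pairwise comparisons. The following is a minimal Python sketch of those two steps only. It assumes that standardization uses the mean and sample SD of all values measured with the same assay method, which the footnote does not specify. The function names and the numbers are hypothetical and are not study data, and the statistical test that produced the raw P values is not shown because the source does not state it here.

```python
from statistics import mean, stdev


def zscores_by_method(values, methods):
    """Standardize each value against the mean and sample SD of its own assay method.

    Assumption: every method has at least 2 measurements (stdev needs 2 points).
    """
    groups = {}
    for value, method in zip(values, methods):
        groups.setdefault(method, []).append(value)
    stats = {m: (mean(g), stdev(g)) for m, g in groups.items()}
    return [(v - stats[m][0]) / stats[m][1] for v, m in zip(values, methods)]


def bonferroni(p_values):
    """Multiply each raw P value by the number of comparisons, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Illustrative measurements from two hypothetical assays with different scales.
values = [10.0, 12.0, 14.0, 5.0, 6.0, 7.0]
methods = ["assay_1"] * 3 + ["assay_2"] * 3
z = zscores_by_method(values, methods)

# Three pairwise comparisons (met-rep, met-back, rep-back) with made-up raw P values.
adjusted = bonferroni([0.004, 0.02, 0.5])
```

Because the adjusted value is capped at 1, large raw P values collapse to the upper bound, which is consistent with the many entries reported as ≥.999 in the pairwise columns.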
Cases assigned to the background subtype showed a distinctive pattern of clinical variables. AMH and LDL levels as well as TFC were significantly lower compared to the metabolic and reproductive subtypes (P = .03; P < .001; and P < .001, respectively). Adion levels were significantly lower compared to the other 2 subtypes. E2 levels were lower compared to the reproductive subtype (all P < .001), and TSH levels were significantly lower in the background subtype compared to the metabolic subtype in the total cohort (P = .011).
Distribution of the Phenotypes Within the Subtypes
We assessed which phenotypic features used for the diagnosis of Rotterdam PCOS (OD, HA, and PCOM) were captured by the subtypes (Supplementary Table S1) (18). In the total cohort, data to determine the phenotype were missing for 258 participants. Based on the data from the remaining 2244 girls and women, the metabolic subtype predominantly had phenotype A (OD + HA + PCOM) (72.0%), while 9.4% had phenotype B (OD + HA), 4.2% phenotype C (HA + PCOM), and 13.0% phenotype D (OD + PCOM). In the reproductive subtype 52.9% had phenotype A, 2.4% phenotype B, 2.2% phenotype C, and 41.0% phenotype D. The background subtype had predominantly phenotype D (63.6%), while 26.0% had phenotype A, 4.6% phenotype B, and 2.3% phenotype C. Additionally, after performing the cluster analysis, we divided the total cohort into “classic NIH” and “non-NIH Rotterdam” based on the diagnostic criteria (Supplementary Table S2) (18). Comparison of the 2 subsets showed higher prevalence of the metabolic subtype in the classic NIH subset compared to the non-NIH Rotterdam subset (61.7% vs 18.3%; P < .001), whereas in the non-NIH Rotterdam subset the background subtype was the most prominent subtype (62.4% vs 20.8%; P < .001) (see Supplementary Table S2) (18).
After clustering the subset fulfilling the NIH criteria, data from 16 participants were missing in the resulting data set (see Supplementary Table S1) (18). In this subset, by design, 100% of the participants had OD in combination with HA, as defined by the NIH criteria, and PCOM was also present in more than 85% of all participants. This resulted in a large percentage of girls and women with phenotype A (87.8% in the metabolic, 93.9% in the reproductive, and 92% in the background subtype) and only a small percentage of participants with phenotype B (12.2% in the metabolic, 6.1% in the reproductive, and 8.0% in the background subtype).
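The phenotype definitions used in this section (A: OD + HA + PCOM; B: OD + HA; C: HA + PCOM; D: OD + PCOM) and the NIH criterion (OD with HA) amount to a simple rule on three yes/no features. A minimal Python sketch of that rule follows. The function name, the tuple layout, and the example participants are hypothetical and are not study data, and the handling of missing diagnostic data is left out.

```python
from collections import Counter


def rotterdam_phenotype(od, ha, pcom):
    """Map the three Rotterdam features to phenotype A-D.

    od: ovulatory dysfunction, ha: hyperandrogenism,
    pcom: polycystic ovarian morphology (all booleans).
    Returns None when fewer than 2 features are present.
    """
    if od and ha and pcom:
        return "A"
    if od and ha:
        return "B"
    if ha and pcom:
        return "C"
    if od and pcom:
        return "D"
    return None


# Hypothetical participants as (OD, HA, PCOM) tuples.
participants = [
    (True, True, True),
    (True, True, False),
    (False, True, True),
    (True, False, True),
    (True, True, True),
]

# Tally phenotypes, as one would do within each subtype.
counts = Counter(rotterdam_phenotype(*p) for p in participants)

# NIH criteria require OD together with HA, that is, phenotypes A and B.
nih_count = sum(1 for od, ha, _ in participants if od and ha)
```

Under this rule the NIH subset can contain only phenotypes A and B, which matches the distribution reported for the NIH analysis above.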
Discussion
It has long been recognized that PCOS is a heterogeneous disorder with a spectrum of clinical presentations. The current diagnostic criteria, which are based on expert opinion, do not capture this heterogeneity: the phenotypes they define were genetically similar in the largest PCOS GWAS meta-analysis published to date (7). In contrast, a data-driven approach using unsupervised HC analysis of phenotypic traits identified reproducible reproductive, metabolic, and background subtypes in a European ancestry cohort of NIH PCOS cases from the United States (8). These subtypes were associated with unique genetic loci, suggesting that they did capture biologically distinct causal pathways (8). Our study has replicated these subtypes in a Dutch European ancestry PCOS cohort fulfilling the broader Rotterdam diagnostic criteria, despite the fact that the Dutch cohort was substantially leaner (mean BMI 26) than the US cohort (mean BMI 35). Extending the previous study, we were able to further characterize the different subtypes by comparing additional phenotypic traits not used for clustering. We have shown significant differences in these traits that align with these distinct causal pathways, for example, higher AMH and TFC in the reproductive subtype, and higher TG, Chol, LDL, SBP, and DBP in the metabolic subtype.
Our cluster analysis was based on 8 important phenotypic traits, but subsequent analysis of additional variables aligned with the pathways implicated. The reproductive subtype, which is characterized by higher levels of LH and SHBG, also had higher AMH levels and TFC compared to the other subtypes. This suggests that this subtype represents girls and women with PCOS with alterations in folliculogenesis. The metabolic subtype, which is characterized by increased BMI, Glu0, and Ins0 levels with lower SHBG and LH levels, also had unfavorable lipid profiles, suggesting an increased risk for cardiovascular diseases (CVDs). Therefore, girls and women in this subtype might indeed need to be screened for CVD throughout their lives, whereas those with the reproductive subtype may be at lower risk for CVD. While longitudinal studies are needed, subtype-specific differences in disease risk may account for conflicting results of studies investigating long-term health outcomes in PCOS (19, 20).
The background subtype had no distinguishable pattern in the phenotypic trait distributions. Nevertheless, this subtype had distinctive phenotypic features. The additional variables, AMH, Adion, TFC, and LDL, were significantly decreased compared to the 2 other subtypes. These findings, taken together with the significant association of the background subtype with FSHB in our previous study (8), which we have recently replicated in a transethnic meta-analysis (21), support considering this subtype as etiologically distinct. The FSHB locus, which encodes the FSH β polypeptide, is associated not only with PCOS status but also with multiple fertility parameters, including twinning, age at menarche and menopause, and circulating FSH levels in GWAS (7, 22-24).
The distributions of the phenotypes in the total cohort indicated that phenotype A was highly prevalent in the metabolic and reproductive subtypes, whereas phenotype D was highly prevalent in the reproductive and background subtypes. To further investigate the effect of the diagnostic criteria, we repeated the cluster analysis limited to cases with the NIH phenotype. The percentage of cases in the metabolic subtype increased, whereas the percentage of cases in the background subtype decreased compared to the total cohort. We also statistically compared the prevalence of the phenotypes in the clusters stratified by diagnostic criteria. This analysis confirmed a significant increase in the prevalence of the metabolic subtype and a significant decrease in the background subtype in the NIH compared to the non-NIH Rotterdam cases. Taken together, these findings suggest that NIH or classic PCOS (phenotype A) predominates in the metabolic subtype, consistent with the well-established greater metabolic risk in this phenotype (3, 25, 26). The NIH (phenotype A) and non-NIH Rotterdam (phenotype D) contribute almost equally to the reproductive subtype; phenotype D predominates in the background subtype. GWAS have shown that the PCOS phenotypes are genetically similar (7). In our United States–based NIH cohort (which was included in the meta-analysis), we found that performing cluster analysis followed by GWAS identified subtypes that were associated with unique genetic loci (8). These findings suggest that the clusters captured biologically meaningful differences.
The present study included only European-ancestry PCOS cases. However, we have reported that the PCOS subtypes are present in regionally and ethnically diverse NIH PCOS cohorts, including Greek and Korean, in addition to US and Dutch (27). Further, 2 recent studies (28, 29) have replicated our subtypes using our clustering algorithm in Han Chinese PCOS cohorts. Other groups (30, 31), using different clustering approaches, have confirmed that there are reproductive and metabolic subsets of PCOS that are associated with distinct PCOS GWAS variants and risk scores. There have been previous attempts to resolve the heterogeneity of PCOS with PCA (32) or cluster analysis (33) of phenotypic traits. However, there has been no orthogonal validation with uncorrelated biomarkers, such as GWAS variants, to confirm that the resulting subtypes captured discrete biologic pathways (34).
It has frequently been hypothesized that comorbidities associated with PCOS change throughout the lifespan. One of the concepts proposed is that women with PCOS start with reproductive problems in their early reproductive years and that these problems improve over the course of life, while metabolic problems become more pronounced (35). In our cohort, age did not differ among the 3 clusters, implying that reproductive and metabolic features are already present from an early age. Indeed, this is supported by other studies showing that metabolic issues are already present in adolescents with PCOS (36, 37). A recent study indicates that there is already evidence for discrete reproductive and metabolic subsets in adolescents with PCOS (38).
Strengths of our study include the availability of a large, deeply and consistently phenotyped PCOS cohort. Accordingly, we were able to assess the effect of subtyping on a number of additional important reproductive and metabolic traits not used for clustering. We were also able to investigate the distribution of Rotterdam PCOS phenotypes in the subtypes. Limitations of our study include the potential effect of referral bias in our academic medical center–based PCOS cohort, so our findings may not accurately reflect the general PCOS population (39). In addition, we did not include prospective data; therefore, long-term health outcomes could not be assessed. Finally, we did not validate the clusters using uncorrelated biomarkers, as we did in our original publication (8). However, we plan GWAS to assess whether the subtypes remain associated with distinct genetic loci in the current cohort ascertained by Rotterdam criteria.
In conclusion, we were able to replicate the 3 PCOS subtypes, reproductive, metabolic, and background, in a large cohort of girls and women with PCOS fulfilling the Rotterdam criteria. Importantly, we show that additional traits not used for clustering differ significantly among the subtypes and align with the reproductive and metabolic pathways implicated. Our findings suggest that these PCOS subtypes have different underlying etiologies and clinical characteristics. The applicability of our findings is 2-fold. First, clustering will enable the data-driven diagnosis of PCOS. Second, the identification of mechanistically distinct subtypes will allow precision-medicine approaches to screening, therapy, and prevention of adverse health outcomes.
Acknowledgments
The authors would like to acknowledge the girls and women who participated in the study.
Abbreviations
- Adion: androstenedione
- AMH: anti-Müllerian hormone
- BMI: body mass index
- Chol: cholesterol
- CVD: cardiovascular disease
- DBP: diastolic blood pressure
- DHEA: dehydroepiandrosterone
- DHEAS: dehydroepiandrosterone sulfate
- E2: estradiol
- FSH: follicle-stimulating hormone
- Glu0: glucose
- GWAS: genome-wide association study
- HA: hyperandrogenism
- HC: hierarchical clustering
- HDL: high-density lipoprotein
- Ins0: insulin
- LC-MS/MS: liquid chromatography–tandem mass spectrometry
- LDL: low-density lipoprotein
- LH: luteinizing hormone
- mFGs: modified Ferriman Gallwey score
- NIH: National Institutes of Health
- OD: ovulatory dysfunction
- PCA: principal component analysis
- PCOM: polycystic ovarian morphology
- PCOS: polycystic ovary syndrome
- SBP: systolic blood pressure
- SHBG: sex hormone–binding globulin
- T: testosterone
- TFC: total follicle count
- TG: triglycerides
- TSH: thyroid-stimulating hormone
Contributor Information
Kim van der Ham, Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Erasmus MC, Erasmus University Medical Center, 3015 GD, Rotterdam, the Netherlands.
Loes M E Moolhuijsen, Department of Internal Medicine, Erasmus MC, Erasmus University Medical Center, 3015 GD, Rotterdam, the Netherlands.
Kelly Brewer, Division of Endocrinology, Diabetes and Bone Disease, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Ryan Sisk, Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA.
Andrea Dunaif, Division of Endocrinology, Diabetes and Bone Disease, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Joop S E Laven, Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Erasmus MC, Erasmus University Medical Center, 3015 GD, Rotterdam, the Netherlands.
Yvonne V Louwers, Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Erasmus MC, Erasmus University Medical Center, 3015 GD, Rotterdam, the Netherlands.
Jenny A Visser, Department of Internal Medicine, Erasmus MC, Erasmus University Medical Center, 3015 GD, Rotterdam, the Netherlands.
Funding
This work was supported by the foundation for the National Institutes of Health (reference No. R01 HD100812).
Disclosures
A.D. is a consultant for Quest Diagnostics, Inc and AcaciaBio, Inc. J.S.E.L. reports grants from Ansh Labs, Ferring, Roche Diagnostics, and Merck; personal fees from Ferring, Titus Healthcare, Gedeon Richter, Ansh Labs, and Roche Diagnostics; and is an unpaid board member and president of the AE-PCOS Society and a member of the ASRM, outside the submitted work. J.A.V. has received royalties from AMH assays, paid to the institute/laboratory with no personal financial gain. Y.V.L. received an internal research grant from the Erasmus MC (The Synergy grant) and received fees from Ferring and Merck for presentations. The other authors do not have any conflicts of interest to declare.
Data Availability
Some or all data sets generated during and/or analyzed during the current study are not publicly available but are available from the corresponding author on reasonable request.
References
- 1. Dapas M, Dunaif A. Deconstructing a syndrome: genomic insights into PCOS causal mechanisms and classification. Endocr Rev. 2022;43(6):927‐965.
- 2. Bozdag G, Mumusoglu S, Zengin D, Karabulut E, Yildiz BO. The prevalence and phenotypic features of polycystic ovary syndrome: a systematic review and meta-analysis. Hum Reprod. 2016;31(12):2841‐2855.
- 3. Lizneva D, Suturina L, Walker W, Brakta S, Gavrilova-Jordan L, Azziz R. Criteria, prevalence, and phenotypes of polycystic ovary syndrome. Fertil Steril. 2016;106(1):6‐15.
- 4. Neven ACH, Laven J, Teede HJ, Boyle JA. A summary on polycystic ovary syndrome: diagnostic criteria, prevalence, clinical manifestations, and management according to the latest international guidelines. Semin Reprod Med. 2018;36(1):5‐12.
- 5. Diamanti-Kandarakis E, Dunaif A. Insulin resistance and the polycystic ovary syndrome revisited: an update on mechanisms and implications. Endocr Rev. 2012;33(6):981‐1030.
- 6. Zawadski JK, Dunaif A. Diagnostic criteria for polycystic ovary syndrome; towards a rational approach. In: Dunaif A, Givens JR, and Haseltine F, eds. Polycystic Ovary Syndrome. Blackwell Scientific; 1992:377‐384.
- 7. Day F, Karaderi T, Jones MR, et al. Large-scale genome-wide meta-analysis of polycystic ovary syndrome suggests shared genetic architecture for different diagnosis criteria. PLoS Genet. 2018;14(12):e1007813.
- 8. Dapas M, Lin FTJ, Nadkarni GN, et al. Distinct subtypes of polycystic ovary syndrome with novel genetic associations: an unsupervised, phenotypic clustering analysis. PLoS Med. 2020;17(6):e1003132.
- 9. Rowe PJ, Comhaire FH, Hargreave TB, et al. Female partner. In: Rowe PJ, Comhaire FH, and Hargreave TB, et al. eds. WHO Manual for the Standardized Investigation and Diagnosis of the Infertile Couple. Press Syndicate of the University of Cambridge; 2000:40‐67.
- 10. van Santbrink EJ, Hop WC, Fauser BC. Classification of normogonadotropic infertility: polycystic ovaries diagnosed by ultrasound versus endocrine characteristics of polycystic ovary syndrome. Fertil Steril. 1997;67(3):452‐458.
- 11. Rotterdam ESHRE/ASRM-Sponsored PCOS consensus workshop group. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome (PCOS). Hum Reprod. 2004;19(1):41‐47.
- 12. Teede HJ, Tay CT, Laven JJE, et al. Recommendations from the 2023 international evidence-based guideline for the assessment and management of polycystic ovary syndrome. J Clin Endocrinol Metab. 2023;108(10):2447‐2469.
- 13. Ferriman D, Gallwey JD. Clinical assessment of body hair growth in women. J Clin Endocrinol Metab. 1961;21(11):1440‐1447.
- 14. Zhao X, Ni R, Li L, et al. Defining hirsutism in Chinese women: a cross-sectional study. Fertil Steril. 2011;96(3):792‐796.
- 15. Bui HN, Sluss PM, Hayes FJ, et al. Testosterone, free testosterone, and free androgen index in women: reference intervals, biological variation, and diagnostic value in polycystic ovary syndrome. Clin Chim Acta. 2015;450:227‐232.
- 16. Teede HJ, Misso ML, Costello MF, et al. Recommendations from the international evidence-based guideline for the assessment and management of polycystic ovary syndrome. Fertil Steril. 2018;110(3):364‐379.
- 17. Hennig C. Cluster-wise assessment of cluster stability. Comput Stat Data Anal. 2007;52(1):258‐271.
- 18. van der Ham K, Moolhuijsen LME, Brewer K, et al. Supplementary data for “Clustering identifies subtypes with different phenotypic characteristics in women with polycystic ovary syndrome”. Figshare. Deposited 6 December 2023. doi: 10.6084/m9.figshare.24720762
- 19. Meun C, Gunning MN, Louwers YV, et al. The cardiovascular risk profile of middle-aged women with polycystic ovary syndrome. Clin Endocrinol (Oxf). 2020;92(2):150‐158.
- 20. Ollila MM, Arffman RK, Korhonen E, et al. Women with PCOS have an increased risk for cardiovascular disease regardless of diagnostic criteria-a prospective population-based cohort study. Eur J Endocrinol. 2023;189(1):96‐105.
- 21. Brewer K, Lee H, Moolhuijsen LME, et al. Trans-Ethnic analysis of PCOS subtype genomewide association signals reveals 3 shared subtype-specific loci. J Endocr Soc. 2023;7(Supplement_1):bvad114.1654.
- 22. Day FR, Hinds DA, Tung JY, et al. Causal mechanisms and balancing selection inferred from genetic associations with polycystic ovary syndrome. Nat Commun. 2015;6(1):8464.
- 23. Hayes MG, Urbanek M, Ehrmann DA, et al. Genome-wide association of polycystic ovary syndrome implicates alterations in gonadotropin secretion in European ancestry populations. Nat Commun. 2015;6(1):7502.
- 24. Mbarek H, Steinberg S, Nyholt DR, et al. Identification of common genetic variants influencing spontaneous dizygotic twinning and female fertility. Am J Hum Genet. 2016;98(5):898‐908.
- 25. Diamanti-Kandarakis E, Panidis D. Unravelling the phenotypic map of polycystic ovary syndrome (PCOS): a prospective study of 634 women with PCOS. Clin Endocrinol (Oxf). 2007;67(5):735‐742.
- 26. Kim JJ, Hwang KR, Choi YM, et al. Complete phenotypic and metabolic profiles of a large consecutive cohort of untreated Korean women with polycystic ovary syndrome. Fertil Steril. 2014;101(5):1424‐1430.e3.
- 27. Dapas M, Diamanti-Kandarakis E, Dunaif A, et al. Replication of PCOS reproductive and metabolic subtypes in diverse cohorts—towards a rationale approach to PCOS classification. J Endocr Soc. 2022;5(Supplement_1):A711.
- 28. Cai J, Yue J, Lu N, et al. Association of fat mass and skeletal muscle mass with cardiometabolic risk varied in distinct PCOS subtypes: a propensity score-matched case-control study. J Clin Med. 2024;13(2):483.
- 29. Chen H, Zeng R, Zeng X, Qin L. Cluster analysis reveals a homogeneous subgroup of PCOS women with metabolic disturbance associated with adverse reproductive outcomes. Chin Med J (Engl). 2023;137(5):604‐612.
- 30. Stamou MI, Smith KT, Kim H, Balasubramanian R, Gray KJ, Udler M. Polycystic ovarian syndrome physiologic pathways implicated through clustering of genetic loci. J Clin Endocrinol Metab. 2023;108(4):897‐908.
- 31. Zhang Y, Movva VC, Williams MS, Lee MTM. Polycystic ovary syndrome susceptibility loci inform disease etiological heterogeneity. J Clin Med. 2021;10(12):2688.
- 32. Dewailly D, Pigny P, Soudan B, et al. Reconciling the definitions of polycystic ovary syndrome: the ovarian follicle number and serum anti-Mullerian hormone concentrations aggregate with the markers of hyperandrogenism. J Clin Endocrinol Metab. 2010;95(9):4399‐4405.
- 33. Tzeng CR, Chang YC, Chang YC, Wang CW, Chen CH, Hsu MI. Cluster analysis of cardiovascular and metabolic risk factors in women of reproductive age. Fertil Steril. 2014;101(5):1404‐1410.e1.
- 34. Gerszten RE, Accurso F, Bernard GR, et al. Challenges in translating plasma proteomics from bench to bedside: update from the NHLBI clinical proteomics programs. Am J Physiol Lung Cell Mol Physiol. 2008;295(1):L16‐L22.
- 35. Fauser BC, Tarlatzis BC, Rebar RW, et al. Consensus on women's health aspects of polycystic ovary syndrome (PCOS): the Amsterdam ESHRE/ASRM-sponsored 3rd PCOS Consensus Workshop Group. Fertil Steril. 2012;97(1):28‐38.e25.
- 36. Fazleen NE, Whittaker M, Mamun A. Risk of metabolic syndrome in adolescents with polycystic ovarian syndrome: a systematic review and meta-analysis. Diabetes Metab Syndr. 2018;12(6):1083‐1090.
- 37. Li L, Feng Q, Ye M, He Y, Yao A, Shi K. Metabolic effect of obesity on polycystic ovary syndrome in adolescents: a meta-analysis. J Obstet Gynaecol. 2017;37(8):1036‐1047.
- 38. Chen-Patterson A, Bernier A, Burgert T, et al. Distinct reproductive phenotypes segregate with differences in body weight in adolescent polycystic ovary syndrome. J Endocr Soc. 2024;8(2):bvad169.
- 39. Ezeh U, Yildiz BO, Azziz R. Referral bias in defining the phenotype and prevalence of obesity in polycystic ovary syndrome. J Clin Endocrinol Metab. 2013;98(6):E1088‐E1096.