Nucleic Acids Res. 2021 Jul 6;49(17):e98. doi: 10.1093/nar/gkab552

© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 3. Breast cancer subtyping performance assessment on bulk gene expression data. Comparison of how well each method sorts breast cancer Pam50 subtypes and receptor genotypes (ER, PR and HER2 status) in two bulk gene expression data sets, METABRIC and TCGA. An aggregate three-gene genotype status, formed by combining the individual genotypes, was also included. Performance was assessed as the reduction in entropy as the estimated number of clusters, K, increased through tree cutting. K2Taxonomer was run only on the full set of features, whereas both agglomerative methods, average linkage and Ward's, were also run on three additional feature subsets. (A) Illustration of the results generated by K2Taxonomer and Ward's method for the METABRIC data set. The Ward's method results shown were run on 5% of the total number of features, which gave the best performance among the agglomerative methods. (B) Entropy measurements for each method as K increased across the METABRIC and TCGA data sets.
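
The entropy-versus-K assessment described in the caption can be illustrated with a small sketch. The caption does not give the exact entropy formula, so this assumes a size-weighted mean Shannon entropy of the known labels within each cluster. The function name `cluster_entropy`, the toy subtype labels and the hand-written cluster assignments standing in for tree cuts are all hypothetical and are not taken from the paper or from the K2Taxonomer package.

```python
from collections import Counter, defaultdict
from math import log2


def cluster_entropy(labels, clusters):
    """Size-weighted mean Shannon entropy (in bits) of labels within clusters.

    Assumed metric for illustration only; lower values mean purer clusters.
    """
    if len(labels) != len(clusters):
        raise ValueError("labels and clusters must have the same length")

    # Group the known labels by the cluster each sample was assigned to.
    members = defaultdict(list)
    for label, cluster in zip(labels, clusters):
        members[cluster].append(label)

    n = len(labels)
    total = 0.0
    for group in members.values():
        size = len(group)
        counts = Counter(group)
        # Shannon entropy of the label distribution inside this cluster.
        h = -sum((c / size) * log2(c / size) for c in counts.values())
        # Weight each cluster by the fraction of samples it contains.
        total += (size / n) * h
    return total


# Toy labels standing in for Pam50 subtypes (hypothetical data).
subtypes = ["LumA", "LumA", "LumB", "LumB", "Basal", "Basal", "Her2", "Her2"]

# Hand-written nested assignments standing in for cutting a tree at K clusters.
cuts = {
    1: [0, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 0, 0, 0, 1, 1, 1, 1],
    4: [0, 0, 1, 1, 2, 2, 3, 3],
}

entropy_by_k = {k: cluster_entropy(subtypes, assignment)
                for k, assignment in cuts.items()}

for k in sorted(entropy_by_k):
    print(f"K={k}: entropy={entropy_by_k[k]:.3f} bits")
```

In this toy case the entropy falls as K increases because each cut separates the labels further. Under this assumed metric, a method that sorts subtypes well would show a steeper drop at small K.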