Journal of Pathology Informatics. 2026 Jan 2;20:100539. doi: 10.1016/j.jpi.2025.100539

Biological feature-based machine learning in histopathological images: a systematic review

Stéphane Treillard a,b,c, Robin Schwob a,c,d, Sandrine Mouysset c,e, Pierre Brousset h, Sylvain Cussat-Blanc b,c,f, Camille Franchet g
PMCID: PMC12858367  PMID: 41623734

Abstract

Digital pathology has recently driven significant advances in microscopic image analysis, particularly through the increasing use of Deep Learning methods. These models represent the state of the art in histopathological slide analysis, but Deep Learning features remain difficult to interpret, despite recent developments in post hoc explainability frameworks. In contrast, features extracted from biological objects—such as nuclei, cells, or tissues—are expected to be more grounded in pathologists' a priori knowledge. Accordingly, Machine Learning based on handcrafted features represents another paradigm of explainability and may serve as a complementary approach to Deep Learning to assist pathologists.

In order to perceive how biological features have been used in hematoxylin & eosin microscopic images to address medical questions, we conducted a systematic review of articles published from January 2005 to May 2025, adhering to PRISMA guidelines.

A total of 97 articles were analyzed from the PubMed, IEEE, and ACM databases. Three primary categories of features—texture/color, morphology, and topology—were identified and described in detail. These features were most frequently derived from segmented cells and tissues, in 80 and 28 studies, respectively. They were used to address seven types of medical questions: “normal vs diseased”, disease subtyping, tumor grading, phenotyping, object detection, prognosis and treatment-response prediction.

We discussed methodological and reporting limitations of these studies, highlighting the difficulty of assessing the potential impact of such methods. Among the most common concerns, we found difficult-to-interpret features, data leakage, and inadequate sample sizes. Nevertheless, we also highlighted promising domain-inspired feature engineering that provides better explainability and specificity. Such features, combined with greater methodological rigor, may increase the relevance and reliability of AI models and open new research avenues in pathology.

Keywords: Histopathology, Feature extraction, Machine learning, Microscopy, Image processing, Explainability

Highlights

  • The main types of features extracted in the 97 reviewed articles were texture/color, morphological and topological.

  • The features were used for seven types of medical tasks.

  • Several studies raised concerns regarding sample sizes, potential data leakage and overall lack of transparency.

  • Articles leveraging domain-inspired features are the most promising in terms of specificity and explainability.

1. Introduction

In the late 1990s, Whole Slide Image (WSI) scanners were introduced to acquire pathology slides in a pyramidal format, allowing tissue sample examination at different levels, mimicking multiple microscope magnifications. The constitution of large WSI and image patches datasets fueled the development of robust Artificial Intelligence (AI) models (Pantanowitz et al.1).

In the 1990s and early 2000s, Machine Learning (ML) emerged as the method of choice to analyze histopathological images, making predictions from a set of handcrafted features that were often extracted manually or semi-automatically. These features can be either directly extracted from raw patches (e.g., mean intensity of an entire patch) (Hwang et al.2), or from previously segmented biological objects (e.g., mean intensity of a segmented nucleus) (Demir et al.3). We refer to the latter as biological features in the present review.

In the 2010s, Deep Learning (DL) was introduced in histopathology (Cireşan et al.4). Unlike conventional ML models, DL models automatically learn the features required for classification tasks. DL has since represented the state of the art in histopathological slide analysis. Both ML and DL studies most frequently focus on Hematoxylin & Eosin (H&E), which is the standard staining in pathology.

WSIs raise many challenges in AI model development, from both a technical and a medical standpoint. They typically present a wide variety of biological entities, whose appearance is also influenced by preanalytical factors such as sample fixation, slice thickness, and staining protocol (Brixtel et al.5). Designing AI models able to extract relevant characteristics from this biological diversity while preventing overfitting on technical artifacts is still a key challenge in pathology (Franchet et al.6). In this context of intertwined sources of variability, model transparency is particularly relevant to validate these methods for clinical use, as recommended by Articles 13 and 14 of the 2024 European Union AI Act7 and by the 2021 AI/ML-Based Software as a Medical Device Action Plan of the U.S. Food and Drug Administration.8

Interpretability and explainability are both crucial considerations when deploying AI in critical domains, in particular medical image analysis. While often used interchangeably, these terms refer to distinct concepts. Interpretability generally describes the extent to which a human can directly understand the internal mechanics of a model (e.g., how inputs and all internal processing steps influence the outputs), whereas explainability refers to the ability to describe a model's behavior in a human-understandable manner, often through post hoc analyses (Gilpin et al.9).

In this sense, biological features may provide an inherently explainable relationship between input data and model output. For example, in the classification of lymphomas, one would expect the model to rely on cell area, since cell size is a key diagnostic criterion. Being able to verify that the model effectively uses such biologically meaningful features helps ensure alignment with established medical knowledge.

Several post hoc explainability methods have been developed for both end-to-end DL and conventional ML models such as Gradient-weighted Class Activation Mapping (Selvaraju et al.10), Local Interpretable Model-agnostic Explanations (Ribeiro et al.11) or Shapley Additive Explanations (Lundberg and Lee12). Despite being able to make black-box models more explainable, these frameworks may be subject to confirmation bias and only provide indirect insights into a model's mechanics, which may be incorrect or inaccurate (Rudin13). On top of that, visualization methods such as saliency maps may highlight unexpected regions that are clinically irrelevant (Saporta et al.14).

Some literature reviews have already been published on computer-aided histopathological analysis with emphasis on biological feature extraction. Both Li et al.15 and Al-Thelaya et al.16 reviewed methods for handling WSIs, from preprocessing to classification, using both conventional ML and DL techniques. Bera et al.17 compared the performance of domain-agnostic features, domain-inspired features, and DL-based approaches in WSIs. Domain-agnostic features are non-specific characteristics that can be extracted from a large range of diseases, such as cellular shape or texture, whereas domain-inspired ones are designed to distinguish patterns specific to some diseases, such as gland angularity in prostate cancers (Bera et al.17). Regarding both the risks of bias and the applicability of AI models, McGenity et al.18 used QUADAS-2 (Whiting et al.19) to examine studies assessing diagnostic test accuracy. The authors evaluated 100 studies conducted on WSIs and found that 99% of them raised concerns about bias or applicability.

The aforementioned reviews thoroughly studied AI in histopathology, from feature extraction to applicability in routine practice. However, these studies barely mentioned explainability, which is essential to guarantee the trustworthiness of a model's predictions. In an attempt to select articles that present more clinically explainable models, we decided to systematically review those that extracted biological features to train an AI model. We also limited the scope to articles leveraging H&E images to address medical questions, published over the past two decades (2005–2025).

In this review, we aim to provide an overview of the use of biological features in histopathological images by addressing the following research questions:

  • RQ1: What types of features are commonly extracted from H&E slides? How are they subsequently processed?

  • RQ2: What are the medical tasks tackled by the models trained on these features?

  • RQ3: What are the most common pitfalls encountered when extracting and processing features from H&E slides?

  • RQ4: To what extent are the reported results robust and transparent?

  • RQ5: Which approaches are the most promising in terms of relevance and interpretability/explainability?

2. Biological features

Biological features, i.e., features extracted from biological objects segmented either manually or automatically, include the appearance of the segmented objects, their geometry, as well as the way objects are organized within the slides. We propose a biological feature taxonomy in Fig. 1, and all features are described more extensively in the Supplementary Materials (section Features).

Fig. 1. Feature taxonomy. LBP: Local Binary Patterns; FeDeG: Feature Driven Local Cell Graph (defined in Lu et al.20).

2.1. Texture and color features

Texture features can be particularly relevant in studies using H&E-stained slides. For instance, hematoxylin highlights biologically relevant aspects in nuclei, as vesicular and salt & pepper chromatin are examples of distinct patterns identified by pathologists. In this sense, various subtypes of texture features were used, all of which rely on a grayscale image, represented as a matrix (Fig. 2).

Fig. 2. Common examples of texture features. 2a: Gray Level Co-occurrence Matrix (GLCM) features are extracted from a matrix based on neighboring pixel values; 2b: Gray Level Run Length Matrix (GLRLM) features are derived from a matrix representing consecutive pixels with the same value; 2c: Local Binary Pattern (LBP) features are obtained by comparing each pixel to its neighbors and assigning an LBP value based on whether each neighbor's value is higher or lower; features are then extracted from the resulting histogram; 2d: Gabor features are extracted by convolving the image with a bank of Gabor filters with different angles and wavelengths; 2e: Wavelet features are derived by convolving the image with a combination of two 1-dimensional wavelets, yielding four images showing the horizontal, vertical, and diagonal detail as well as an approximation of the original image; wavelet features are then extracted from the resulting images.

Statistics-based features are either derived directly from a channel of the input image or from a matrix capturing spatial relationships of pixels such as the gray-level co-occurrence matrix (GLCM) (Haralick et al.21) (Fig. 2a) or the gray-level run length matrix (GLRLM) (Galloway22) (Fig. 2b). These statistics-based features are referred to as first-order and second-order, respectively.

Transform-based texture features are computed after applying a transformation to input images to unveil patterns of interest. Gabor filtering and wavelet transform both highlight oriented patterns within images.
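To illustrate how Gabor filtering highlights oriented patterns, the hedged sketch below (using scikit-image on a synthetic striped image) shows that the filter whose orientation matches the stripes yields a much stronger mean response.

```python
# Gabor-filter response on a synthetic striped image (scikit-image).
import numpy as np
from skimage.filters import gabor

# Stripes whose intensity varies along the horizontal (x) axis.
x = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * x / 8), (64, 1))

# theta=0 probes variation along x; theta=pi/2 probes variation along y.
real0, _ = gabor(stripes, frequency=1 / 8, theta=0)
real90, _ = gabor(stripes, frequency=1 / 8, theta=np.pi / 2)

energy0 = float(np.mean(real0 ** 2))
energy90 = float(np.mean(real90 ** 2))
print(energy0 > energy90)  # the aligned filter responds more strongly
```

A feature vector is typically built from such responses (e.g., mean and variance) over a bank of frequencies and orientations.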

Other common texture features include Local Binary Patterns (LBP), fractal features and Tamura features (Tamura et al.23). LBP describe the frequency of specific pixel arrangements through a histogram, which reveals local structures within images. Variants such as rotation-invariant, completed and uniform LBP, as well as the local gray-level difference pattern, offer additional robustness. Fractal features, such as fractal dimension and multifractal spectrum features, assess pattern self-similarity at different scales, quantitatively describing complex and irregular shapes in histopathological images. Tamura et al.23 introduced features that highly correlate with human visual perception of texture, including coarseness, contrast, directionality, line-likeness, regularity and roughness.
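The LBP histogram computation can be sketched as follows with scikit-image; the random patch is purely illustrative. With the "uniform" method, P neighbors yield P + 2 distinct codes, and the normalized histogram of codes is the texture descriptor.

```python
# Uniform LBP histogram on a toy patch (scikit-image assumed available).
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)

P, R = 8, 1  # 8 neighbors sampled on a circle of radius 1
lbp = local_binary_pattern(patch, P, R, method="uniform")

# "uniform" produces P + 2 = 10 possible codes; the normalized
# histogram over these codes is the feature vector.
hist, _ = np.histogram(lbp, bins=np.arange(P + 3), density=True)
print(hist.shape)
```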

Color features can be considered as a subcategory of texture features since they are essentially texture descriptors extracted from a specific color channel, treated as grayscale input. Color information is encoded in the form of vectors in a color space. RGB is the most widely used but others such as HSV or Lab may be used for digital slide analysis. Furthermore, color deconvolution—involving mapping image colors into computed hematoxylin (H) and eosin (E) channels—is another relevant approach in histopathology.
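Color deconvolution is available out of the box in scikit-image via `rgb2hed`, which maps RGB pixels into hematoxylin, eosin, and DAB channels using a standard stain matrix. The sketch below uses two synthetic pixels and checks the round trip back to RGB; it is an illustration, not the pipeline of any reviewed study.

```python
# H&E color deconvolution with scikit-image's built-in stain matrix.
import numpy as np
from skimage.color import rgb2hed, hed2rgb

# A 1x2 RGB "image" in [0, 1]: a purple-ish (nucleus-like) pixel and a
# pink-ish (eosinophilic) pixel.
rgb = np.array([[[0.35, 0.22, 0.55],
                 [0.90, 0.55, 0.65]]])

hed = rgb2hed(rgb)          # channels: hematoxylin, eosin, DAB
recovered = hed2rgb(hed)    # inverse transform

# Texture features can then be computed on hed[:, :, 0] (hematoxylin)
# treated as a grayscale image.
print(np.allclose(recovered, rgb, atol=1e-4))
```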

2.2. Morphological features

Morphological features are extracted from biological objects, such as cells or tissue structures, and are represented in Fig. 3.

Fig. 3. Common examples of morphological features: object area and perimeter (3a): measures of the size and boundary length of the segmented object; equivalent diameter (3b): the diameter of a circle with the same area as the object; Feret diameter (3c): the maximum and minimum distances between two parallel lines tangent to the object's contour; elliptic features (3d): the lengths of the minor and major axes of an ellipse fitted to the object, along with its orientation; convex hull features (3e): the area and perimeter of the convex hull, the smallest convex shape enclosing the object; bounding box dimensions (3f): the height and width of the smallest rectangle that fully contains the object.

Size and shape features describe the geometry of biological objects. They were extracted either from the object itself or from a shape fitted on the object (ellipse, convex hull or bounding box). Ratios of these features (e.g., minor axis length to major axis length ratio) may also convey information about the objects' shapes. Cardinality-based features include object counts and densities in different contexts.
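Most of these size and shape features are exposed directly by scikit-image's `regionprops`; the sketch below measures a toy binary mask standing in for a segmented nucleus.

```python
# Size/shape features from a toy segmentation mask (scikit-image).
import numpy as np
from skimage.measure import label, regionprops

mask = np.zeros((20, 20), dtype=np.uint8)
mask[5:15, 4:16] = 1  # a 10x12 rectangular "nucleus"

props = regionprops(label(mask))[0]
area = int(props.area)                          # pixel count
perimeter = float(props.perimeter)              # boundary length
aspect_ratio = props.minor_axis_length / props.major_axis_length
solidity = float(props.solidity)                # area / convex-hull area
equiv_diam = float(props.equivalent_diameter)   # circle of equal area

print(area, round(solidity, 3))
```

For a convex rectangle, solidity is 1; irregular or spiculated objects score lower, which is one reason such ratios are popular shape descriptors.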

Other examples of extracted morphological features included skeleton-based, cell orientation, Zernike and Fourier features. Skeleton features represent the shape and structure of the object by analyzing its skeleton or medial axis. Cell orientation features are derived from the angle between the major axis of an object and a horizontal line (Fig. 3d). Zernike features are derived from the projection of a shape onto a basis of Zernike polynomials.

Since biological objects are closed shapes, plotting their boundary coordinates yields a periodic function that can be decomposed into a Fourier series. The resulting features describe object geometry more precisely than the usual size and shape features.
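One common formulation treats the boundary points as complex numbers x + iy and takes their discrete Fourier transform; the magnitudes of the coefficients, normalized by the first harmonic, give a translation- and scale-invariant descriptor. The NumPy-only sketch below uses a synthetic circle, for which all higher harmonics vanish.

```python
# Fourier boundary descriptor on a synthetic circular contour.
import numpy as np

t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
boundary = np.cos(t) + 1j * np.sin(t)  # complex x + iy coordinates

coeffs = np.fft.fft(boundary)

# Dropping coefficient 0 removes translation; dividing by the first
# harmonic removes scale. For a perfect circle the rest is ~zero.
descriptor = np.abs(coeffs[2:8]) / np.abs(coeffs[1])
print(np.round(descriptor, 6))
```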

2.3. Topological features

Topological features aim to characterize cell spatial organization, assessing whether cells are closely packed or sparsely distributed. Cell organization and distribution can serve as a biomarker for specific diseases, reflecting pathological changes in tissue architecture.

Voronoi tessellation (Fig. 4a) divides the image into polygons (Voronoi cells), each surrounding a nucleus centroid such that all points within the polygon are closer to that centroid than to any other. Features commonly derived from Voronoi tessellation include polygon area, perimeter, and side length.
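A minimal Voronoi-feature sketch with SciPy is shown below; the centroids are a jittered synthetic grid rather than real nuclei, and only bounded cells (those not touching the image border) are measured.

```python
# Voronoi cell areas from synthetic nucleus centroids (SciPy).
import numpy as np
from scipy.spatial import Voronoi, ConvexHull

rng = np.random.default_rng(1)
grid = np.stack(np.meshgrid(np.arange(4), np.arange(4)),
                axis=-1).reshape(-1, 2).astype(float)
centroids = grid + rng.uniform(-0.2, 0.2, size=grid.shape)

vor = Voronoi(centroids)

# Keep bounded regions only (those without the infinite vertex -1);
# ConvexHull.volume is the polygon area in 2D.
areas = [ConvexHull(vor.vertices[region]).volume
         for region in vor.regions
         if region and -1 not in region]
print(len(areas) > 0)
```

Statistics of these areas (mean, variance, disorder ratios) are the typical features fed to a classifier.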

Fig. 4. Topological features: dots represent cell centroids and are used as seeds for the Voronoi tessellation and as vertices in the Delaunay triangulation and the minimum spanning tree.

Delaunay triangulation (Fig. 4b) connects neighboring cell centroids to form triangles such that no centroid is inside the circumcircle of any triangle. This results in a mesh of triangles covering the image. Features from the triangles and from the edges connecting neighboring cells were extracted to characterize cell arrangements. As Delaunay triangulation can be obtained from Voronoi tessellation, features extracted using both methods may be highly correlated.
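Edge-length statistics from a Delaunay triangulation can be sketched as follows with SciPy; the centroids are random synthetic points.

```python
# Delaunay edge lengths from synthetic centroids (SciPy).
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(2)
centroids = rng.uniform(0, 100, size=(30, 2))
tri = Delaunay(centroids)

# Collect unique edges from the triangle (simplex) vertex indices.
edges = set()
for a, b, c in tri.simplices:
    for i, j in ((a, b), (b, c), (a, c)):
        edges.add((min(i, j), max(i, j)))

lengths = [float(np.linalg.norm(centroids[i] - centroids[j]))
           for i, j in edges]
print(len(edges), round(np.mean(lengths), 2) > 0)
```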

The minimum spanning tree (Fig. 4c) connects all cell centroids such that the total edge length is minimal. Edge lengths from this tree can be used to describe the overall connectivity and proximity of cells.
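SciPy's sparse-graph routines give the minimum spanning tree directly from a pairwise distance matrix; for n connected centroids the tree keeps n − 1 edges, whose lengths summarize cell proximity.

```python
# Minimum spanning tree over synthetic centroids (SciPy).
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import distance_matrix

rng = np.random.default_rng(3)
centroids = rng.uniform(0, 100, size=(25, 2))

dist = distance_matrix(centroids, centroids)
mst = minimum_spanning_tree(dist)

edge_lengths = mst.data  # the n - 1 edges kept by the tree
print(len(edge_lengths), round(float(edge_lengths.mean()), 2) > 0)
```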

Other graph-based approaches were also studied. Some articles explored localized graphs or graphs built directly using distances between cells, providing additional descriptors of cell organization. These approaches can include features from local cell clusters or more complex graph metrics derived from spatial relationships.

2.4. Domain-agnostic and domain-inspired features

Bera et al.17 introduced the notions of domain-agnostic and domain-inspired features. On the one hand, domain-agnostic features are non-specific characteristics that can be extracted from every image for generic classification tasks. Most features described hereinabove belong to this category, as they tend to be extracted to address multiple medical questions. On the other hand, domain-inspired ones are designed to distinguish specific patterns of histopathological images or a definite range of diseases. For instance, Lewis et al.24 introduced anaplasia and cellular multinucleation as poor prognosis factors in oropharyngeal carcinomas (Fig. 5).

Fig. 5. Example of a domain-inspired feature in head & neck tumors. Anaplasia is defined as any high-power field with ≥3 tumor nuclei with diameters ≥5 lymphocyte nuclei (≈25 μm) (red outline). Multinucleation is defined as any high-power field with ≥3 tumor cells with multiple nuclei (green outline). Both anaplasia and multinucleation are associated with poor prognosis of Human Papilloma Virus-associated oropharyngeal squamous cell carcinoma (Lewis et al.24). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3. Results

3.1. Study selection

Articles included in the study met the following eligibility criteria:

  • Non-DL feature extraction from segmented or annotated biological objects (e.g., cells, tissue) to address a medical question;

  • Model training on the basis of extracted features;

  • H&E slides;

  • Human research;

  • Publication in a scientific journal or presented at an A or A* conference according to the 2023 CORE ranking (Knoth et al.25).

We also defined exclusion criteria:

  • Non-bright field images;

  • Cytology samples (e.g., blood smears or fine-needle aspiration);

  • Feature extraction without addressing a specific medical question (e.g., segmentation without classification, image retrieval, or image registration);

  • End-to-end DL approaches (see Fig. 6).

Fig. 6. Article selection process.

3.2. Medical information

The 97 articles selected are listed in Table 1. Eighty-four articles focused on various forms of cancer, while 13 examined other diseases. The most frequently studied organs were breast (n = 22) and prostate (n = 14), followed by brain (n = 12), colorectal (n = 9) and kidney (n = 7). Additionally, a few articles investigated other conditions, including oral submucous fibrosis (n = 4), dental cysts (n = 1), inflammatory bowel disease (n = 1), colonic tubular adenoma (n = 1), and preeclamptic placentas (n = 1).

Table 1.

Reviewed studies.

| Ref. | Task | Biological object(s) | Types of features used | Dataset accessibility | Number of patients |
|---|---|---|---|---|---|
| 3 | normal vs diseased | cell | topological | private | 60 |
| 26 | tumor grading | nucleus | texture, morphological | private | 80 |
| 27 | tumor grading, normal vs diseased | nucleus, gland, tissue | texture, color, morphological | private | Not specified |
| 28 | tumor grading | gland | morphological | private | 78 |
| 29 | prognosis, disease subtyping | nucleus | texture, morphological, other | private | 104 |
| 30 | tumor grading | tissue, nucleus | topological, morphological | private | Not specified |
| 31 | tumor grading | nucleus, cytoplasm, tissue | texture, color | private | 36 |
| 32 | object detection | cell | texture, color, morphological | private | 53 |
| 33 | object detection | cell | texture | private | Not specified |
| 34 | tumor grading | nucleus | topological, morphological | private | 12 |
| 35 | prognosis | nucleus | texture, morphological, other | private | 71 |
| 36 | object detection | gland | morphological | private | 20 |
| 37 | disease subtyping | nucleus | texture, other | private | 20 |
| 38 | prognosis | nucleus | morphological | private | 56 |
| 39 | normal vs diseased, disease subtyping | tissue | texture, other | private | 52 |
| 40 | normal vs diseased, disease subtyping | nucleus, tissue, gland | texture, color, morphological | private | 96 |
| 41 | normal vs diseased, disease subtyping | tissue | texture, morphological | private | 52 |
| 42 | tumor grading | gland | morphological | private | 125 |
| 43 | normal vs diseased | nucleus, tissue | texture, morphological | private | 22 |
| 44 | normal vs diseased | nucleus, tissue | texture, morphological | private | >20 |
| 45 | tumor grading, normal vs diseased | gland | texture, color, morphological | private | 17 |
| 46 | normal vs diseased, object detection | gland, nucleus | color, morphological | private | 20 |
| 47 | disease subtyping | nucleus | texture | private | 1050 |
| 48 | tumor grading | nucleus, tissue | texture, topological, other | private | Not specified |
| 49 | prognosis | nucleus | topological, morphological | private | 39 |
| 50 | normal vs diseased, disease subtyping | nucleus | morphological | public | Not specified |
| 51 | normal vs diseased, tumor grading | gland | morphological | private | 58 |
| 52 | normal vs diseased, disease subtyping | nucleus, gland | texture, color, morphological | private | Not specified |
| 53 | phenotyping | nucleus | morphological | public | 146 |
| 54 | phenotyping | nucleus | texture, morphological | public | 117 |
| 55 | tumor grading | nucleus | texture, color, topological | private | 126 |
| 56 | normal vs diseased | nucleus, gland, tissue | morphological | private | 8 |
| 57 | disease subtyping | tissue, nucleus | topological | private | 160 |
| 58 | disease subtyping | nucleus | texture, morphological | private | 20 |
| 59 | normal vs diseased | nucleus | texture, color, morphological | private | 111 |
| 60 | tumor grading | nucleus | texture, color, morphological | private | 132 |
| 61 | normal vs diseased | tissue | morphological | private | 75 |
| 62 | prognosis | gland | texture, topological, morphological | private | 40 |
| 63 | disease subtyping | gland, nucleus | morphological | private | 3 |
| 64 | normal vs diseased | tissue, nucleus | morphological | private | Not specified |
| 65 | normal vs diseased | tissue, gland, cell | texture, morphological | private | Not specified |
| 66 | object detection | nucleus | color | public | Not specified |
| 67 | prognosis | tissue, nucleus | topological, morphological | private | 230 |
| 68 | tumor grading, normal vs diseased | gland | morphological | private | Not specified |
| 69 | tumor grading | nucleus | texture, color, topological, morphological | private | 40 |
| 70 | tumor grading | nucleus | topological, morphological | public | Not specified |
| 71 | normal vs diseased, object detection | tissue, nucleus, duct | texture, other | private | 193 |
| 72 | normal vs diseased | nucleus | texture, color, morphological | private | Not specified |
| 73 | normal vs diseased | nucleus | texture | private | 12 |
| 74 | prognosis | nucleus | morphological | public | 209 |
| 75 | prognosis, normal vs diseased, disease subtyping | nucleus, cytoplasm | texture, morphological, other | public | 1311 |
| 76 | normal vs diseased, tumor grading | nucleus, cell, tissue | texture, color, topological, morphological | private | Not specified |
| 77 | prognosis | nucleus | texture, morphological, other | public | 1034 |
| 78 | object detection | cell, nucleus | other | both | 25 |
| 79 | normal vs diseased | nucleus | texture, morphological | private | 12 |
| 80 | tumor grading, normal vs diseased | gland | morphological | private | 38 |
| 81 | disease subtyping | tissue | texture | private | 30 |
| 82 | object detection | nucleus | texture, color, morphological | public | 5 |
| 83 | normal vs diseased | nucleus | texture, morphological | private | Not specified |
| 84 | tumor grading | nucleus, tissue | topological, morphological | both | Not specified |
| 85 | normal vs diseased, disease subtyping | Fallopian tube, nucleus | texture, morphological | private | 50 |
| 86 | prognosis | nucleus | texture, color, topological, morphological | public | 190 |
| 87 | normal vs diseased | nucleus, glomeruli | morphological | private | Not specified |
| 88 | disease subtyping | cell, tissue, nucleus | morphological | private | 41 |
| 89 | prognosis | tissue | morphological | public | 539 |
| 20 | prognosis | nucleus | topological | private | 434 |
| 90 | disease subtyping | nucleus | texture, color, morphological | public | 30 |
| 91 | normal vs diseased, disease subtyping | nucleus | texture, color, morphological, other | private | Not specified |
| 92 | normal vs diseased | nucleus | texture, color, topological, morphological, other | public | Not specified |
| 93 | prognosis | nucleus | texture, color, topological, morphological | private | 160 |
| 94 | treatment response prediction | nucleus | texture, morphological | private | 71 |
| 95 | normal vs diseased | nucleus | morphological | public | Not specified |
| 96 | disease subtyping | nucleus | texture, color, topological | both | 148 |
| 97 | disease subtyping | nucleus, tissue, cytoplasm | morphological, other | private | Not specified |
| 98 | object detection | nucleus | texture, color, morphological | public | 5 |
| 99 | object detection | nucleus | texture, topological | public | >20 |
| 100 | disease subtyping | gland | morphological | private | 7 |
| 101 | object detection | nucleus | texture, color, morphological | both | Not specified |
| 102 | tumor grading, normal vs diseased | tissue, nucleus | texture | private | Not specified |
| 103 | phenotyping | tissue | morphological, other | both | >2634 |
| 104 | normal vs diseased | nucleus | texture, morphological | public | Not specified |
| 105 | normal vs diseased | nucleus | texture, color | public | Not specified |
| 106 | tumor grading | nucleus, gland | morphological | public | Not specified |
| 107 | prognosis | nucleus | texture, color, morphological, other | private | 107 |
| 108 | prognosis | nucleus, tissue, vessel, red cell | texture, topological, morphological | private | 125 |
| 109 | prognosis | tissue, nucleus | topological | private | 127 |
| 110 | tumor grading | nucleus | texture, morphological | public | Not specified |
| 111 | normal vs diseased | nucleus | topological | public | 82 |
| 112 | prognosis | nucleus | topological, morphological, other | public | 1824 |
| 113 | prognosis | tissue | morphological, other | public | 78 |
| 114 | prognosis | tissue, nucleus | morphological, other | both | 3177 |
| 115 | normal vs diseased | nucleus | texture, color, morphological | public | 82 |
| 116 | object detection | nucleus | texture, morphological, other | public | Not specified |
| 117 | prognosis | nucleus | morphological | private | 36 |
| 118 | prognosis | vessel, cell | texture, morphological | both | 371 |
| 119 | normal vs diseased, disease subtyping | nucleus | texture, morphological | private | 53 |
| 120 | tumor grading | gland, nucleus | color, morphological, other | private | 151 |

Features were extracted to address a range of medical questions grouped into seven distinct types: “normal vs diseased”, disease subtyping, tumor grading, phenotyping, object detection, prognosis and treatment response prediction. The most frequently segmented biological objects were cells and nuclei (n = 80), glands (n = 18), and tissues (n = 28). The medical tasks were defined as follows:

  • “Normal vs diseased”: predicting whether an image is normal or diseased. Diseased means either malignant or abnormal in articles about cancer or other conditions, respectively (n = 37);

  • Disease subtyping: predicting whether an image belongs to a specific subtype of the studied disease (n = 21);

  • Tumor grading (n = 22);

  • Phenotyping: correlating image features with immunophenotype, molecular or omics data (n = 3);

  • Object detection: identifying specific objects in the image (n = 12);

  • Prognosis: predicting clinical outcomes (n = 21);

  • Treatment response: predicting patients' sensitivity to a specific treatment (n = 1).

Some articles addressed multiple tasks (n = 19). For instance, the study by Tabesh et al.,27 which proposed a method to classify prostate images into benign versus low and high Gleason grade,1 was counted under both the “normal vs diseased” and “tumor grading” tasks.

3.3. Study material

To tackle these medical questions, 28 articles used WSIs, 66 focused on patches extracted from WSIs, and 5 used Tissue Microarray (TMA) data, i.e., sections from tissue cores punched from paraffin blocks. Most studies relied exclusively on private datasets (n = 65), while 25 articles leveraged only public datasets and 7 used both. When the data source was not specified (n = 16), we assumed it originated from a private dataset. The public datasets used in the different studies are reported in Table 2, and cohorts from large WSI repositories such as The Cancer Genome Atlas (TCGA) or the National Lung Screening Trial (NLST) are reported in Table 3. Patient and sample counts were reported strictly as stated in the articles reviewed and might not correspond to the current state of the databases.

Table 2.

Public datasets used in the reviewed studies.

| Dataset | Organ(s) | WSIs or patches | Samples | Patients | Used in |
|---|---|---|---|---|---|
| AMIDA-13 | breast | patches | 585 | unclear | 116 |
| BACH (ICIAR2018) | breast | patches and WSIs | 400 patches, 20 WSIs | unclear | 105 |
| Bioimage2015 | breast | patches | 249 | unclear | 105 |
| BreCaHAD | breast | patches | 162 | unclear | 95 |
| BreakHis | breast | patches | 7909 | 82 | 111,115 |
| MITOS | breast | patches | 50 | 5 | 78,82,98,101 |
| MITOS-ATYPIA-14 | breast | patches | 480–2127 | unclear | 66,116 |
| TUPAC | breast | WSIs | 821 | 821 | 116 |
| UCSB Bio-segmentation benchmark database | breast | patches | 58 | unclear | 92 |
| CCH | colon | patches | 5000 | unclear | 99 |
| CRC100K | colon | patches | 100,000 | 86 | 99 |
| CRC-VAL-HE-7K | colon | patches | 7000 | 50 | 99 |
| LC25000 | lung, colon | patches | unclear | unclear | 115 |
| IICBU lymphoma dataset | lymph node | patches | 374 | unclear | 90 |

Table 3.

Public cohorts used in the reviewed studies.

| Dataset | Cohort | Organ | Samples | Patients | Used in |
|---|---|---|---|---|---|
| TCGA | KIRC | kidney | 28–148 | 28–148 | 50,96,112 |
| TCGA | KIRP | kidney | 172–190 | 172–190 | 86,112 |
| TCGA | GBM | brain | 10–599 | unclear | 53,54,70 |
| TCGA | LGG | brain | 10–515 | unclear | 53,70,74 |
| TCGA | LUAD | lung | 457–1581 | 389–523 | 75,77,89 |
| TCGA | LUSC | lung | 446–1625 | 412–511 | 75,77 |
| TCGA | BRCA | breast | 141–1119 | 141–1044 | 103,112,114 |
| TCGA | SKCM | skin | 358 | 327 | 103 |
| TCGA | STAD | stomach | 407 | 407 | 103 |
| TCGA | LIHC | liver | 225 | 225 | 118 |
| TCGA | unclear | prostate | 58 | unclear | 84 |
| PLCO | Breast luminal-like | breast | 296 | 296 | 114 |
| PLCO | Breast HER2+ | breast | 65 | 65 | 114 |
| PLCO | Breast TNBC | breast | 100 | 100 | 114 |
| NLST | LUAD | lung | 267 | 150 | 89 |
| Stanford TMA | LUAD | lung | 227 | 227 | 75 |
| Stanford TMA | LUSC | lung | 67 | 67 | 75 |

3.4. Features

3.4.1. Texture and color features

Texture features were commonly extracted in the articles reviewed (n = 56). First-order statistics (n = 20) were frequently used as they are simple to compute and to comprehend. For example, mean intensity was identified as a statistically significant feature in Krishnan et al.43 to distinguish healthy nuclei from nuclei of oral submucous fibrosis dysplasia, which appear darker in severe cases. Second-order statistics (n = 32) are both more complex and harder to interpret, but remained frequently used in the articles reviewed. In Krishnan et al.,44 the same authors used four statistically significant second-order texture features derived from a GLCM to classify nuclei from normal and oral submucous fibrosis basal cells, with slightly better results than in Krishnan et al.43

Gabor features (n = 6) were used in Bejnordi et al.71 to classify ductal carcinoma in situ and normal breast tissue, and wavelet features (n = 6) were used in Tabesh et al.27 to grade prostate cancer. In both articles they were combined with multiple other features, making it difficult to measure their individual impact. In addition, due to the transformation process, these features tend to be harder to interpret than features directly extracted from images.

LBP (n = 8), among many other texture features, were used to detect mitoses in Sayed and Hassanien.82 Fractal features were used in 9 articles: fractal dimension was used as a prognostic biomarker in melanoma in Bedin et al.,35 and multifractal spectrum features were used in Vasiljevic et al.47 to identify bone metastasis images among different types of carcinomas. Tamura features (Tamura et al.23) were used among other features to classify medulloblastoma images in Das et al.91 Although these features are designed to distinguish complex patterns, their mathematical formulae are difficult to grasp.

Among the 27 articles in the present review that relied on color, color features were variously extracted from histograms (n = 5), correlograms (n = 2), or as ratios of pixel intensities in different channels (n = 2). These features were studied to address a range of medical questions, from classifying renal carcinoma (Kothari et al.52) to predicting neuroblastoma prognosis (Liu et al.107).

RGB was the most widely used color space (n = 17), followed by HSV (n = 8) and Lab (n = 7). The other color spaces used in the studies were YCbCr (n = 3), HSI (n = 2), blue-ratio, Luv, YIQ and XYZ (n = 1 each). Furthermore, H&E deconvolution was employed in 3 articles.

In short, texture and color features can be hard to interpret, with the exception of first-order statistics. Texture and color may also be influenced by technical factors such as compression rate, image format or stain provider. However, they may also reveal biologically relevant patterns that would be hard to describe otherwise.

3.4.2. Morphological features

Among the reviewed articles, morphological features were the most frequently used (n = 76), especially size and shape features (n = 62). These features are easy both to compute and to interpret, which justifies their widespread use. Cardinality-based features (n = 11) were particularly relevant when studying different populations of objects (tumor-infiltrating lymphocytes and epithelial cells in Amgad et al.114), or location within tissues (cells in cancer tissue and in cancer-associated stroma in Diao et al.103).

Skeleton-based features were used in 3 articles. Since slides have no canonical orientation, absolute individual or mean cell orientations are uninformative; features based on the distribution of cell orientations, however, proved relevant in two papers. Indeed, orientation entropy features were used alone to predict biochemical recurrence in prostate cancer (Lee et al.49), and in combination with other features to predict recurrence in node-negative gastric adenocarcinoma (Ji et al.93). Zernike features (n = 3) were used among other features in Yu et al.75 to identify, subtype and predict the prognosis of lung adenocarcinoma and squamous cell cancer. Fourier features were used in combination with second-order texture features in Rathore et al.65 to identify images with colorectal cancer.

Morphological features are usually easy to interpret for pathologists and are simple to extract once objects have been segmented. However, they depend on the segmentation quality and are more relevant when there is a clear rationale justifying their use (e.g., using tumor cell area to classify small cell lung cancer against non-small cell lung cancer).
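As a minimal sketch of how size and shape descriptors can be computed once objects have been segmented (the masks, feature names and moment-based elongation measure below are illustrative choices, not drawn from any reviewed study):

```python
import numpy as np

def shape_features(mask):
    """Size and shape descriptors from a binary segmentation mask,
    using the covariance of pixel coordinates (image moments)."""
    ys, xs = np.nonzero(mask)
    area = len(xs)  # object area in pixels
    # Covariance of pixel coordinates gives the object's principal axes.
    cov = np.cov(np.stack([xs, ys]))
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    # Ratio of major to minor axis spread: 1.0 for round objects.
    elongation = np.sqrt(eigvals[0] / eigvals[1])
    return {"area": area, "elongation": elongation}

square = np.zeros((40, 40)); square[10:30, 10:30] = 1
bar = np.zeros((40, 40)); bar[18:22, 5:35] = 1
assert shape_features(square)["area"] == 400
assert shape_features(bar)["elongation"] > shape_features(square)["elongation"]
```

As the last assertion shows, an elongated object scores higher than a compact one, which is the kind of contrast such features are meant to capture; their validity still hinges on segmentation quality, as noted above.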

3.4.3. Topological features

Topological features were another frequently used category of features (n = 23). Voronoi features (n = 11) were used among other topological features in Xie et al.109 to predict intrahepatic cholangiocarcinoma survival. Delaunay triangulation (n = 17) was used to grade cervical intraepithelial neoplasia, in combination with other features (De et al.48). All reviewed studies relying on minimum spanning tree features (n = 10) also used either Delaunay triangulation or Voronoi tessellation, or both. For instance, Fukuma et al.70 used all the aforementioned topological features to grade gliomas.
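As a minimal sketch of how such graph features can be derived from segmented nuclei (the centroids here are random synthetic points, and the feature names are illustrative rather than taken from any reviewed study):

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

def graph_features(centroids):
    """Topological features from nuclear centroids: Delaunay edge lengths
    and minimum-spanning-tree (MST) edge statistics."""
    tri = Delaunay(centroids)
    # Collect the unique edges of the triangulation.
    edges = set()
    for simplex in tri.simplices:
        for i in range(3):
            a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
            edges.add((a, b))
    lengths = [np.linalg.norm(centroids[a] - centroids[b]) for a, b in edges]
    # MST over the Delaunay graph (the Euclidean MST is a subgraph of it).
    n = len(centroids)
    w = lil_matrix((n, n))
    for (a, b), l in zip(edges, lengths):
        w[a, b] = l
    mst = minimum_spanning_tree(w.tocsr())
    assert mst.nnz == n - 1  # the Delaunay graph is connected
    return {
        "delaunay_mean_edge": float(np.mean(lengths)),
        "mst_mean_edge": float(np.mean(mst.data)),
        "mst_edge_std": float(np.std(mst.data)),
    }

rng = np.random.default_rng(0)
feats = graph_features(rng.random((50, 2)))
```

Statistics such as mean and standard deviation of edge lengths summarize how regularly the nuclei are spaced, which is the intuition behind these architectural features.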

Other graph-based approaches were also studied (n = 11). For example, Lu et al.20 used a first-order texture feature to cluster neighboring nuclei into local cell graphs to predict lung cancer survival.

Although somewhat more abstract than morphological features, topological features remain relatively easy to interpret and can highlight differences in cellular organization. However, they depend on segmentation quality, since false-positive and false-negative cells affect the resulting graphs, regardless of the method. In addition, features extracted from large graphs may miss local variations containing relevant information for classification.

3.4.4. Other types of features

Most articles used the extracted features directly to train a classifier. However, a few studies employed a different approach by utilizing these features to create a cell or tissue dictionary, also referred to as “bag of features” models (n = 3). These articles extracted features from biological objects, and then built a dictionary using a clustering algorithm (O'Hara and Draper121). Built on the same concept, spatial pyramid matching approaches incorporating positional information of objects were also used (n = 2).

Finally, some studies used a combination of biological features and features outside the review's scope, as the latter were extracted from raw patches rather than from biological objects. Examples include gradient-orientation methods such as Histogram of Oriented Gradients (HOG) (n = 1), Scale-Invariant Feature Transform (SIFT) (n = 1), and Speeded Up Robust Features (SURF) (n = 1), which were employed in some studies to identify the most descriptive points or regions in histopathological images. Tools like CellProfiler (Stirling et al.122) were used to automatically extract a large number of features (up to 9879). Among them, some features were extracted from raw patches, such as image quality and granularity characteristics (n = 3). A mix of deep and biological features was utilized in a couple of articles (n = 2). Additionally, clinical data were rarely incorporated (n = 2).

3.4.5. Domain-agnostic and domain-inspired features

Among the reviewed articles, 6 relied on domain-inspired features: both Sparks and Madabhushi51 and Niazi et al.84 graded prostate tumors, by deriving features from the medial axis of prostate glands and by utilizing the average distance of nuclei to their closest lumen, respectively. Similarly, Lee et al.49 created features derived from cell orientation entropy to predict biochemical recurrence in prostate cancers, that were also used by Ji et al.93 to prognose gastric adenocarcinomas. Rathore et al.68 applied features derived from gland lumen geometry to identify and grade colorectal cancer. Amgad et al.114 also used many domain-inspired features such as chromatin clumping of epithelial nuclei or circularity of epithelial nests to predict breast cancer survival. Finally, Diao et al.103 used Human Interpretable Features directly inspired from pathology practice, such as the density of plasma-cells in cancer tissue or the mean cluster size of fibroblasts in cancer-associated stroma. This latter article relied on features that would not be considered as domain-inspired according to Bera et al.17 but that were nonetheless grounded in the field of histopathology.

3.5. Feature selection and dimensionality reduction

Feature selection involves identifying an optimal subset of features, while dimensionality reduction transforms features to efficiently represent data in fewer dimensions (Jia et al.123). Both techniques are crucial in ML, as too many features can diminish a model's performance and generalizability, a phenomenon known as the “curse of dimensionality” (Bellman124). Among the articles reviewed, up to 9879 features were extracted, with a mean of 319 and a median of 48. After selection, the number of features used dropped markedly, with a maximum of 1054, a mean of 60, and a median of 15.

Finding the optimal subset of features requires testing every feature combination, which is computationally expensive. Feature selection methods are heuristics designed to find a relevant subset of features in a reasonable amount of time. Some articles in this study compared different feature selection approaches to determine which one reached the highest score for their data and model. The most frequently used method was minimum redundancy-maximum relevance (n = 9), followed by Wilcoxon's rank sum test (n = 5) and univariate survival analysis (n = 5). Many other methods were used in the reviewed articles, including evolutionary algorithms (n = 2), graph embeddings (n = 2) and sequential floating forward search (n = 3).
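As an illustration of a univariate filter in the spirit of the Wilcoxon rank-sum selection mentioned above (synthetic data; the function name and the choice of k are assumptions for this sketch):

```python
import numpy as np
from scipy.stats import ranksums

def wilcoxon_filter(X, y, k=5):
    """Univariate filter selection: rank each feature by its Wilcoxon
    rank-sum p-value between the two classes and keep the k most
    discriminative features."""
    pvals = np.array([ranksums(X[y == 0, j], X[y == 1, j]).pvalue
                      for j in range(X.shape[1])])
    return np.argsort(pvals)[:k]  # indices of the k smallest p-values

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))            # 100 samples, 20 candidate features
y = (rng.random(100) < 0.5).astype(int)   # binary labels
X[y == 1, 3] += 2.0                       # make feature 3 class-dependent
selected = wilcoxon_filter(X, y, k=5)
assert 3 in selected
```

Like other filter methods, this ranks features independently, so it is fast but ignores feature interactions; wrapper and embedded methods (e.g., minimum redundancy-maximum relevance) trade speed for such interactions.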

Dimensionality reduction approaches project the features into a space of lower dimensionality, and thus directly affect the feature values. The most popular was principal component analysis (PCA), used in 2 articles. Unlike feature selection, dimensionality reduction affects the explainability of the features, as they lose their physical interpretation when projected into a lower-dimensional space. Both feature selection and dimensionality reduction diminish the redundancy of the features while maintaining or even improving their predictive abilities. Therefore, these methods do not necessarily decrease a model's accuracy and may even enhance it (Jia et al.123).
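A brief PCA sketch with scikit-learn illustrates the trade-off (synthetic data with three underlying factors; not taken from any reviewed study): a few components capture most of the variance, but each component is a linear mixture of the original features, which is precisely why physical interpretability is lost.

```python
import numpy as np
from sklearn.decomposition import PCA

# 50 correlated features generated from 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 50))

pca = PCA(n_components=3).fit(X)
X_reduced = pca.transform(X)  # shape (200, 3)

# The 3 components recover nearly all the variance of the 50 features,
# but pca.components_ mixes every original feature into each axis.
assert pca.explained_variance_ratio_.sum() > 0.95
```

Inspecting `pca.components_` shows that every original feature contributes to each projected axis, unlike feature selection, which keeps a readable subset.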

3.6. Classification algorithms

A wide variety of classification methods were used in the articles. Support Vector Machines (SVMs) (Boser et al.125) were the most popular algorithm by a substantial margin (n = 30). Random Forest (Breiman126) (n = 7) and k-Nearest Neighbors (Fix and Hodges127) (n = 5) were also commonly used. In some studies, classifier ensembles performed best (Kong et al.,31 Kruk et al.32). These versatile methods were employed in a broad range of tasks. Conversely, Cox proportional hazards models (Cox128) were specifically used in prognosis tasks to separate high-risk from low-risk populations (n = 12). Besides these well-established ML methods introduced many years ago, it should also be noted that more modern ML algorithms such as XGBoost (Chen and Guestrin129), CatBoost (Dorogush et al.130) or LightGBM (Ke et al.131) were only considered in 4 studies (Bejnordi et al.,71 Kanber et al.,115 Krithiga and Geetha,116 Illarionova et al.120).
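As a minimal illustration of the dominant SVM workflow (scikit-learn's breast-cancer dataset, whose features are themselves nuclear morphology measurements, stands in for handcrafted features; the kernel and C value are illustrative defaults, not choices from any reviewed study):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Tabular morphology-like features with binary labels.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Scaling inside the pipeline ensures test statistics never leak into
# training, and SVMs are sensitive to feature scales.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

Wrapping the scaler and classifier in a single pipeline is also a simple guard against the preprocessing leakage discussed later in this review.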

3.7. Methods' evolution

This review included articles from 2005 to 2025. Articles published prior to 2010 had already introduced most of the features described in section Features. Indeed, before 2010, topological features (Demir et al.,3 Wang et al.30), texture features (Chapman et al.,26 Tabesh et al.,27 Cheretis et al.,29 Kong et al.,31 Kruk et al.32) and morphological features (Tabesh et al.,27 Chapman et al.,26 Wittke et al.,28 Cheretis et al.,29 Wang et al.,30 Kruk et al.32) had already been studied. These domain-agnostic features remained widely used in more recent articles (n = 90). Domain-inspired features, however, stand out as being both more original and more explainable than their generic counterparts, as they are tailored to a specific task (n = 6). Despite the emergence of more effective gradient-boosted ML models in the 2010s, such models remained rarely used, even in reviewed articles published in the 2020s. Indeed, the popular models used in more recent studies, such as SVMs, Cox regression models, multi-layer perceptrons and ensemble models, had already been used in articles written prior to 2010.

The progress of segmentation methods has been more notable: all articles using manual annotation to segment biological objects were written prior to 2013 (Chapman et al.,26 Wittke et al.,28 Bedin et al.,35 de Andrea Carlos E. et al.,38 Sparks and Madabhushi51). Although some common methods such as k-means segmentation were used in articles published both before 2010 (Demir et al.,3 Kruk et al.32) and after 2020 (Tan et al.101), DL segmentation has become increasingly common over the years (Awan et al.,80 Wang et al.,89 Yu et al.,92 Falkenstein et al.,95 Javed et al.,99 Diao et al.,103 Liu et al.,107 Wang et al.,108 Xie et al.,109 Amgad et al.,114 L'Imperio et al.119), as it has tremendously improved the segmentation quality of biological objects. The overall progress of the methods reviewed herein seems to have been driven mainly by better data and segmentation quality rather than by an improvement in features or in non-DL classification methods.

4. Discussion

We reviewed 97 articles on extracting features from biological objects in H&E histopathological images to address various medical questions. We now examine the research questions raised in the introduction:

RQ1: What types of features are commonly extracted from H&E slides for medical applications? How are they subsequently processed?

Our review highlighted key trends in the field of AI applied to digital pathology and revealed that the extracted features generally fall into three main categories: texture and color, morphology, and topology. Regarding feature selection, methods such as minimum redundancy-maximum relevance and Wilcoxon's rank sum test were commonly used, and SVM was by far the most prevalent classifier (n = 30). The scarcity of more modern ML models in recent studies and the predominance of SVM models were other unexpected findings. To verify whether the relative absence of recent models was due to a bias in our study selection, we checked the use of XGBoost—the most popular conventional ML model developed in the 2010s—in other literature reviews. Abdelsamea et al.132 reviewed a single study employing XGBoost with deep features, which is out of this review's scope (Makarchuk et al.133). McGenity et al.18 reported a single study using XGBoost that did fit our inclusion criteria but was not included since it lacked MeSH terms (Yu et al.134). We found only one additional study using XGBoost as a classifier that fitted the review's inclusion criteria but would have been filtered out for having the term “deep” in its title (Sung135).

One possible explanation for the dominance of SVMs in the reviewed literature lies in their favorable properties in low- to medium-dimensional settings. Another benefit of linear SVMs is their relative simplicity: they can be interpreted in terms of feature weights, allowing researchers and clinicians to identify which extracted features contribute most to the classification task. This must be weighed against the fact that only 4 articles included in this review used linear kernels as their classification approach. The others either used non-linear kernels, as was the case in 16 articles (sometimes comparing their performance to a linear kernel), or did not specify which kernel they used (n = 10). Non-linear kernels are unfortunately harder to interpret than their linear counterparts. More recent approaches such as XGBoost often achieve higher predictive performance, especially when capturing non-linear relationships or handling heterogeneous feature sets, but they sacrifice interpretability. These models aggregate multiple weak learners, making it difficult to directly understand the decision-making process.
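To make the weight-based interpretation concrete, the following sketch fits a linear SVM on standardized features and ranks them by absolute weight. Scikit-learn's breast-cancer dataset (whose features are nuclear morphology measurements) serves purely as an illustration; the model is fitted on the full dataset only to inspect weights, not to estimate performance, and does not reproduce any reviewed study.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

data = load_breast_cancer()
# Standardize so that weight magnitudes are comparable across features.
X = StandardScaler().fit_transform(data.data)
clf = LinearSVC(C=0.1, max_iter=10000).fit(X, data.target)

# |coef_| indicates how strongly each feature drives the decision boundary.
weights = np.abs(clf.coef_.ravel())
top = np.argsort(weights)[::-1][:5]
for name, w in zip(data.feature_names[top], weights[top]):
    print(f"{name}: {w:.3f}")
```

This kind of ranking is exactly what non-linear kernels and boosted ensembles give up: their decision functions cannot be summarized as one weight per feature.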

Due to the study design of this review, many approaches relying on both biological features and deep features were filtered out. A few articles leveraging both features were nonetheless included (Javed et al.,99 Illarionova et al.120). Such hybrid approaches may provide additional explainability to regular DL approaches, as highlighted in Kapse et al.136. In this review, our focus was on providing an overview of the use of biological features in histopathology, hence the limited inclusion of these methods.

RQ2: What are the medical tasks tackled by the models trained on these features?

From a medical perspective, the reviewed studies covered a range of tasks including “normal vs diseased”, disease subtyping, tumor grading, phenotyping, object detection, prognosis and treatment-response prediction. Comparing the results obtained by the different articles was difficult because: (1) the problems tackled were very dissimilar, (2) different datasets were used and (3) the medical tasks reviewed do not present the same level of complexity. Indeed, most “normal vs diseased” classifications can be performed at first glance by experienced pathologists using H&E staining alone. Object detection tasks such as mitosis counting are not complicated per se but can be quite tedious. They can nonetheless be challenging for AI models due to severe data imbalance and to the presence of numerous mimics. Grading and subtyping are usually more complex but are also performed routinely. While the difficulty of these tasks can vary depending on the pathology, AI models are generally expected to achieve high performance scores. In contrast, prognosis, phenotyping, and treatment-response prediction often require additional techniques (e.g., immunohistochemistry or molecular biology), so lower performance scores are expected.

RQ3: What are the most common pitfalls encountered when extracting and processing features from H&E slides?

As medical data are sensitive, most reviewed studies were based on a private dataset (n = 65), while 32 leveraged a publicly available one (7 studies used both). To address the challenge of working with non-disclosed datasets, one effective approach is to validate the model trained on private data on at least one public dataset. This practice facilitates comparisons with other published approaches and helps to assess a model's robustness. Indeed, it has been shown that a model can experience significant drops in accuracy when applied to an external test set, due to different slide preparation protocols across laboratories (Syrykh et al.,137 Li et al.138). Texture and color features in particular are sensitive to H&E staining variability across laboratories. Heterogeneous datasets should thus be used during the training process to improve a model's generalizability. Techniques such as data augmentation or normalization can also be applied to this end (Tellez et al.,139 Franchet et al.6).

Unfortunately, public datasets can be quite scarce and can have major limitations. Several studies indeed pointed out flaws of TCGA, the most widely used public dataset among the reviewed articles. Notably, it has been shown that models trained on TCGA slides were unexpectedly able to recognize site-specific acquisition patterns, leading to a potential risk of bias (Dehkharghanian et al.140). Furthermore, TCGA includes highly specific images that are not representative of cases encountered in daily clinical practice (Yu et al.75). Other public datasets also present limitations, such as data leakage in the MITOS dataset, augmented images in the LC25000 dataset that may cause data leakage if handled incorrectly, small sample sizes or potential biases (Ignatov and Malivenko141). These limitations should be stated systematically when working with such datasets, which appeared to be uncommon in the reviewed articles. Additionally, studies that only focus on image tiles might overlook the challenge of identifying and analyzing regions of interest across entire slides, which is a significant aspect of routine pathology.

In medical image analysis, especially for rare diseases, data scarcity is a significant challenge. Most of the studies reviewed used small sample sizes, with a median of 73 patients. Models trained on small datasets are unable to capture the full variability of a disease: they tend to either underfit or overfit the training data, and may learn noise rather than meaningful patterns, affecting their performance in different clinical settings. To address this issue, researchers should do their best to increase sample sizes, or should at least apply data augmentation methods.

RQ4: To what extent are the reported results robust and transparent?

One of the most concerning findings of this study was the overall lack of transparency in the field. Several guidelines such as APPRAISE-AI (Kwong et al.142) or QUADAS-2 (Whiting et al.19) were published to both assess and improve the quality of studies involving AI and medical data. These guidelines were often not followed by the articles we reviewed. For instance, some reviewed studies insufficiently detailed how they preprocessed their data, raising concerns about potential data leakage. Kapoor and Narayanan143 defined data leakage as a spurious relationship between the target variable and the features provided to a model. They introduced a taxonomy of data leakage, which includes improper separation of training and test datasets [L1], illegitimate features [L2] and improper test dataset design [L3]. Since data leakage inflates performance metrics, studies should be transparent regarding their data preprocessing so as to obtain reliable results. This was not always the case in the reviewed studies, as 56 of them showed potential data leakage, while 9 even lacked an independent testing set (L1.1). In practice, splitting should occur before normalization and augmentation (L1.2), as well as prior to feature selection (L1.3), and at the patient level rather than at the patch level (L3.2). Therefore, authors should explicitly detail their preprocessing steps and specify how they split their data. In this sense, the APPRAISE-AI guidelines recommend using either a held-out validation cohort or a temporal split.
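The splitting discipline described above can be sketched as follows (a minimal illustration on synthetic data; the variable names and the 10-patches-per-patient layout are assumptions, not drawn from any reviewed study):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.preprocessing import StandardScaler

# Synthetic patch-level features: 300 patches from 30 patients.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
patient_ids = np.repeat(np.arange(30), 10)  # 10 patches per patient

# Split at the patient level FIRST, so patches from one patient never
# appear in both sets (avoids leakage of type L3.2).
splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(X, groups=patient_ids))

# Fit the normalizer on the training portion only (avoids L1.2): no test
# set statistics influence the preprocessing.
scaler = StandardScaler().fit(X[train_idx])
X_train = scaler.transform(X[train_idx])
X_test = scaler.transform(X[test_idx])

# No patient contributes patches to both sets.
assert not set(patient_ids[train_idx]) & set(patient_ids[test_idx])
```

The same ordering applies to feature selection (L1.3): selection statistics should be computed on the training fold only, then frozen before touching the test fold.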

Another apparent issue related to the lack of transparency concerned the data itself. Indeed, 27 studies did not disclose the number of patients involved, and 16 did not mention the data's origin. Stating the data's origin and the number of patients is important for several reasons: (1) it reflects the scale and ambition of the study; for instance, research involving a large number of patients from multiple hospitals is likely to be more generalizable than studies involving fewer patients from a single hospital; (2) providing this information encourages authors to report a target population and patients' eligibility criteria, thereby enhancing reproducibility; (3) it aids in identifying potential sources of bias, such as dataset imbalance. APPRAISE-AI quantifies reporting quality and reproducibility criteria, such as data sharing, source code and trained models' accessibility, as well as ground truth and features' description. In fact, very few articles provided access to their code (n = 5), giving an incomplete view of their results and thus compromising their reliability.

RQ5: Which approaches are the most promising in terms of performance and interpretability/explainability?

Despite the aforementioned limitations, promising approaches were presented in some articles, which designed features specifically for given tasks so that they could be more easily explained by pathologists. Handcrafted features are not all equally understandable, due to their varying levels of complexity and abstraction. Many studies extracted a large number of generic features without providing a clear rationale and did not describe how these features might be relevant to their classification tasks. This practice raises concerns about their potential generalizability. In addition, some texture features, such as second-order statistics, may be difficult to grasp. They might also be affected by factors such as slice thickness or image compression and may not directly reflect any biological phenomenon. Morphological and topological features are usually easier to understand, but they can be too simplistic to reveal meaningful biological differences. Furthermore, morphological features may be highly correlated, and topological features may miss subtle variations at a local level.

Relying on a large number of non-specific features also diminishes a study's potential biological insights. Such features were referred to as domain-agnostic in the work of Bera et al.17 and may be extracted from any type of disease. This practice often results in similar studies using comparable features, limiting their originality and innovativeness, as discussed in section Methods' evolution. However, some articles using domain-agnostic features stood out, as they designed features so as to be explainable by pathologists (Diao et al.103). Moreover, some researchers conceived domain-inspired features reflecting observable differences between classes, as seen in other reviewed studies (Sparks and Madabhushi,51 Lee et al.,49 Amgad et al.114). These studies highlighted the potential benefits of collaborations between engineers and medical practitioners, as domain-inspired features appear to be an excellent compromise between interpretability and the ability to reveal biological differences. Interpretability is crucial for the adoption of ML models in pathology, and in this sense, domain-inspired approaches are more likely to be practical and useful in clinical settings.

Simultaneous improvement in both predictive performance and interpretability remains a central challenge in AI model development. In this regard, approaches that remain human-readable by design are promising avenues. Genetic programming, a method that evolves or generates symbolic models (e.g., mathematical expressions, logical rules, or structured programs) is an example of such an approach (Cortacero et al.,144 Bi et al.145). When applied to well-chosen biological features, it can produce models that offer both competitive performance and a high level of interpretability, allowing domain experts to scrutinize the decision logic and assess its validity (De La Torre et al.146). Such approaches hold potential for fostering more collaborative and transparent AI pipelines in pathology, where interpretability is a prerequisite for clinical integration.

One of the main limitations of the present review is the lack of a meta-analysis. Our main objective was to provide an overview of biological feature extraction methods, which led us to compare a wide range of medical tasks. A statistical comparison of results would be irrelevant, since they are based on widely different datasets and objectives that vary greatly in difficulty. On top of that, the reported methodology and results often lacked rigor and consistency, and frequently raised concerns about bias and applicability, making it very difficult to draw conclusions about the most promising features or approaches.

5. Conclusion

Many studies from the last 20 years included herein showed major limitations such as lack of transparency and data leakage, hampering the reliability of both results and approaches. In conventional ML and DL, addressing data leakage and improving both transparency and explainability should be top priorities. Following guidelines for applying AI to medical data, such as APPRAISE-AI (Kwong et al.142) or QUADAS-2 (Whiting et al.19), may also be helpful in this regard. Improving the quality of public histopathology datasets would be another way to promote transparency and reproducibility in the field. Furthermore, having a clear rationale to justify the use of a given feature can help improve the explainability of a model. Recent improvements in segmentation methods have enabled the study of specific populations of biological objects, fueling the extraction of domain-inspired features. Domain-inspired features aim at reproducing pathologists' approach by quantifying specific elements in slides that are known to be critical for a given disease. To implement and design such features, we strongly believe that a collaboration between engineers and pathologists is essential.

6. Methods

This review adheres to the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines147 for the search strategy. The PRISMA checklist can be found in the Supplementary Materials (section PRISMA checklist). The search was conducted on May 16th, 2025, using the PubMed, IEEE, and ACM databases. The search queries for each database are detailed in the Supplementary Materials (section Database queries). For PubMed, the query included the following Medical Subject Headings (MeSH) terms: image processing, computer-assisted, and humans. For IEEE Xplore, the query applied the following filters on keywords: ‘medical image processing’, ‘optical microscopy’, ‘feature extraction’, ‘image segmentation’, ‘cancer’, ‘image classification’, ‘biomedical optical imaging’, ‘biological tissues’, ‘learning (artificial intelligence)’, ‘diseases’, and ‘tumors’. No additional filters were applied to the ACM database. The code used to extract relevant information from bib and csv files, to filter the IEEE Xplore articles, to remove duplicates and to scrape PubMed abstracts is available on GitHub.

This review was not registered and no protocol was written.

Code availability

The code used for this review can be found on the following GitHub: https://github.com/streilla/histoFeaturesReview.

Declaration of generative AI and AI-assisted technologies in the writing process

During the preparation of this work the authors used ChatGPT in order to improve the clarity and readability of specific sections of this manuscript. After using this tool/service, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work has been supported by funds from Institut Carnot CALYM (DIALYM project), the Health Data Hub (APRIORICS project), La Région Occitanie, IA pour la santé and Laboratoire d'Excellence Toulouse Cancer (TouCan). This paper has also benefited from the proofreading of Yuri Lavinas and Ana Inés Darquier.

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jpi.2025.100539.

1

The Gleason grading system is used to categorize prostate cancers based on gland morphology.


Data availability

All data generated or analyzed during this study are available from the corresponding author on reasonable request.

References

  • 1.Pantanowitz L., Sharma A., Carter A.B., Kurc T., Sussman A., Saltz J. Twenty years of digital pathology: an overview of the road travelled, what is on the horizon, and the emergence of vendor-neutral archives. J Pathol Inform. 2018;9 doi: 10.4103/jpi.jpi_69_18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Hwang H.-G., Choi H.-J., Lee B.-I., Yoon H.-K., Nam S.-H., Choi H.-K. Multi-resolution wavelet-transformed image analysis of histological sections of breast carcinomas. Cell Oncol. 2005;27:237–244. doi: 10.1155/2005/526083. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Demir C., Gultekin S.H., Yener B. Learning the topological properties of brain tumors. IEEE/ACM Trans Comput Biol Bioinform. 2005;2:262–269. doi: 10.1109/TCBB.2005.42. [DOI] [PubMed] [Google Scholar]
  • 4.Cireşan D.C., Giusti A., Gambardella L.M., Schmidhube J. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013. Ichiro S., Yoshinobu B., Christian N.N.M., Kensaku Sakuma, editors. Springer; Berlin Heidelberg: 2013. Mitosis detection in breast cancer histology images with deep neural networks; pp. 411–418. [DOI] [PubMed] [Google Scholar]
  • 5.Brixtel R., Bougleux S., Lezoray O., et al. Whole slide image quality in digital pathology: review and perspectives. IEEE Access. 2022;10:131005–131035. doi: 10.1109/ACCESS.2022.3227437. [DOI] [Google Scholar]
  • 6.Franchet C., Schwob R., Bataillon G., et al. Bias reduction using combined stain normalization and augmentation for AI-based classification of histological images. Computers in Biology and Medicine. 2024;171:108130. doi: 10.1016/j.compbiomed.2024.108130. https://www.sciencedirect.com/science/article/pii/S0010482524002142 10.1016/j.compbiomed.2024.108130 URL. doi. [DOI] [PubMed] [Google Scholar]
  • 7.Regulation (eu) 2024/1689 of the european parliament and of the council of 13 june 2024 laying down harmonised rules on artificial intelligence and amending regulations (ec) no 300/2008, (eu) no 167/2013, (eu) no 168/2013, (eu) 2018/858, (eu) 2018/1139 and (eu) 2019/2144 and directives 2014/90/eu, (eu) 2016/797 and (eu) 2020/1828 (artificial intelligence act), Official Journal of the European Union, L 2024/1689 2024. https://eur-lex.europa.eu/ URL. articles 13 and 14 relevant to transparency and human oversight requirements.
  • 8.U.S. Food and Drug Administration. 2021. AI/ML-based software as a medical device (SaMD) action plan. FDA digital health policy document. https://www.fda.gov/media/145022/download URL. Action plan for regulation of AI/ML-enabled medical device software. [Google Scholar]
  • 9.Gilpin L.H., Bau D., Yuan B.Z., Bajwa A., Specter M., Kagal L. 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA) 2018. Explaining explanations: An overview of interpretability of machine learning; pp. 80–89. [DOI] [Google Scholar]
  • 10.Selvaraju R.R., Cogswell M., Das A., Vedantam R., Parikh D., Batra D. 2017 IEEE International Conference on Computer Vision (ICCV) 2017. Grad-CAM: Visual explanations from deep networks via gradient-based localization; pp. 618–626. [DOI] [Google Scholar]
  • 11.Ribeiro M.T., Singh S., Guestrin C. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery. 2016. “Why should I trust you?”: Explaining the predictions of any classifier; pp. 1135–1144. [DOI] [Google Scholar]
  • 12.Lundberg S.M., Lee S.-I. Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc; 2017. A unified approach to interpreting model predictions; pp. 4768–4777. [Google Scholar]
  • 13.Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019;1:206–215. doi: 10.1038/s42256-019-0048-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Saporta A., Gui X., Agrawal A., et al. Benchmarking saliency methods for chest x-ray interpretation. Nat Mach Intell. 2022;4:867–878. doi: 10.1038/s42256-022-00536-x. [DOI] [Google Scholar]
  • 15.Li X., Li C., Rahaman M.M., et al. A comprehensive review of computer-aided whole-slide image analysis: from datasets to feature extraction, segmentation, classification and detection approaches. Artif Intell Rev. 2022;55:4809–4878. doi: 10.1007/s10462-021-10121-0. [DOI] [Google Scholar]
  • 16.Al-Thelaya K., Gilal N.U., Alzubaidi M., et al. Applications of discriminative and deep learning feature extraction methods for whole slide image analysis: a survey. J Pathol Inform. 2023;14:100335. doi: 10.1016/j.jpi.2023.100335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Bera K., Katz I., Madabhushi A. Reimagining T staging through artificial intelligence and machine learning image processing approaches in digital pathology. JCO Clin Cancer Inform. 2020;4:1039–1050. doi: 10.1200/CCI.20.00110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.McGenity C., Clarke E.L., Jennings C., et al. 2024. Artificial intelligence in digital pathology: a systematic review and meta-analysis of diagnostic test accuracy. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Whiting P.F., Rutjes A.W., Westwood M.E., et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155:529–536. doi: 10.7326/0003-4819-155-8-201110180-00009. [DOI] [PubMed] [Google Scholar]
  • 20.Lu C., Wang X., Prasanna P., et al. Medical Image Computing and Computer-Assisted Intervention. vol. 11071. LNCS, Springer Verlag; 2018. Feature driven local cell graph (FeDeG): Predicting overall survival in early stage lung cancer; pp. 407–416. [DOI] [Google Scholar]
  • 21.Haralick R.M., Shanmugam K., Dinstein I. Textural features for image classification. IEEE Trans Syst Man Cybern. 1973;SMC-3:610–621. doi: 10.1109/TSMC.1973.4309314. [DOI] [Google Scholar]
  • 22.Galloway M.M. Texture analysis using gray level run lengths. Comput Graph Image Process. 1975;4:172–179. [Google Scholar]
  • 23.Tamura H., Mori S., Yamawaki T. Textural features corresponding to visual perception. IEEE Trans Syst Man Cybern. 1978;8:460–473. doi: 10.1109/TSMC.1978.4309999. [DOI] [Google Scholar]
  • 24.Lewis J.S., Scantlebury J.B., Luo J., Thorstad W.L. Tumor cell anaplasia and multinucleation are predictors of disease recurrence in oropharyngeal squamous cell carcinoma, including among just the human papillomavirus-related cancers. Am J Surg Pathol. 2012;36:1036–1046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Knoth P., Herrmannova D., Cancellieri M., et al. Core: a global aggregation service for open access papers. Sci Data. 2023;10 doi: 10.1038/s41597-023-02208-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Chapman J.A.W., Miller N.A., Lickley H.L.A., et al. Ductal carcinoma in situ of the breast (DCIS) with heterogeneity of nuclear grade: prognostic effects of quantitative nuclear assessment. BMC Cancer. 2007;7:174. doi: 10.1186/1471-2407-7-174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Tabesh A., Teverovskiy M., Pang H.Y., et al. Multifeature prostate cancer diagnosis and Gleason grading of histological images. IEEE Trans Med Imaging. 2007;26:1366–1378. doi: 10.1109/TMI.2007.898536. [DOI] [PubMed] [Google Scholar]
  • 28.Wittke C., Mayer J., Schweiggert F. On the classification of prostate carcinoma with methods from spatial statistics. IEEE Trans Inf Technol Biomed. 2007;11:406–414. doi: 10.1109/TITB.2006.888703. [DOI] [PubMed] [Google Scholar]
  • 29.Cheretis C., Angelidou E., Dietrich F., Politi E., Kiaris H., Koutselini H. Prognostic value of computer-assisted morphological and morphometrical analysis for detecting the recurrence tendency of basal cell carcinoma. Med Sci Monit. 2008;14:13–19. [PubMed] [Google Scholar]
  • 30.Wang Y., Crookes D., Eldin O.S., Wang S., Hamilton P., Diamond J. Assisted diagnosis of cervical intraepithelial neoplasia (CIN). IEEE J Sel Top Signal Process. 2009;3:112–121. doi: 10.1109/JSTSP.2008.2011157. [DOI] [Google Scholar]
  • 31.Kong J., Sertel O., Shimada H., Boyer K.L., Saltz J.H., Gurcan M.N. Computer-aided evaluation of neuroblastoma on whole-slide histology images: classifying grade of neuroblastic differentiation. Pattern Recogn. 2009;42:1080–1092. doi: 10.1016/j.patcog.2008.10.035. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Kruk M., Osowski S., Koktysz R. Recognition and classification of colon cells applying the ensemble of classifiers. Comput Biol Med. 2009;39:156–165. doi: 10.1016/j.compbiomed.2008.12.001. [DOI] [PubMed] [Google Scholar]
  • 33.Sertel O., Lozanski G., Shana'ah A., Gurcan M.N. Computer-aided detection of centroblasts for follicular lymphoma grading using adaptive likelihood-based cell segmentation. IEEE Trans Biomed Eng. 2010;57:2613–2616. doi: 10.1109/TBME.2010.2055058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Basavanhally A.N., Ganesan S., Agner S., et al. Computerized image-based detection and grading of lymphocytic infiltration in HER2+ breast cancer histopathology. IEEE Trans Biomed Eng. 2010;57:642–653. doi: 10.1109/TBME.2009.2035305. [DOI] [PubMed] [Google Scholar]
  • 35.Bedin V., Adam R.L., de Sá B.C., Landman G., Metze K. Fractal dimension of chromatin is an independent prognostic factor for survival in melanoma. BMC Cancer. 2010;10 doi: 10.1186/1471-2407-10-260. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Monaco J.P., Tomaszewski J.E., Feldman M.D., et al. High-throughput detection of prostate cancer in histological sections using probabilistic pairwise markov models. Med Image Anal. 2010;14:617–629. doi: 10.1016/j.media.2010.04.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Al-Kadi O.S. Texture measures combination for improved meningioma classification of histopathological images. Pattern Recogn. 2010;43:2043–2053. doi: 10.1016/j.patcog.2010.01.005. https://www.sciencedirect.com/science/article/pii/S0031320310000373 URL. [Google Scholar]
  • 38.de Andrea C.E., Petrilli A.S., Jesus-Garcia R. Large and round tumor nuclei in osteosarcoma: good clinical outcome. Int J Clin Exp Pathol. 2011;4:169–174. [PMC free article] [PubMed] [Google Scholar]
  • 39.Krishnan M.M.R., Shah P., Choudhary A., Chakraborty C., Paul R.R., Ray A.K. Textural characterization of histopathological images for oral sub-mucous fibrosis detection. Tissue Cell. 2011;43 doi: 10.1016/j.tice.2011.06.005. [DOI] [PubMed] [Google Scholar]
  • 40.Kalkan H., Nap M., Duin R.P.W., Loog M. Medical Image Computing and Computer-Assisted Intervention. 2012. Automated colorectal cancer diagnosis for whole-slice histopathology. [DOI] [PubMed] [Google Scholar]
  • 41.Krishnan M.M.R., Venkatraghavan V., Acharya U.R., et al. Automated oral cancer identification using histopathological images: a hybrid feature extraction paradigm. Micron. 2012;43:352–364. doi: 10.1016/j.micron.2011.09.016. [DOI] [PubMed] [Google Scholar]
  • 42.Loeffler M., Greulich L., Scheibe P., et al. Classifying prostate cancer malignancy by quantitative histomorphometry. J Urol. 2012;187:1867–1875. doi: 10.1016/j.juro.2011.12.054. [DOI] [PubMed] [Google Scholar]
  • 43.Krishnan M.M.R., Pal M., Paul R.R., Chakraborty C., Chatterjee J., Ray A.K. Computer vision approach to morphometric feature analysis of basal cell nuclei for evaluating malignant potentiality of oral submucous fibrosis. J Med Syst. 2012;36:1745–1756. doi: 10.1007/s10916-010-9634-5. [DOI] [PubMed] [Google Scholar]
  • 44.Krishnan M.M.R., Chakraborty C., Paul R.R., Ray A.K. Hybrid segmentation, characterization and classification of basal cell nuclei from histopathological images of normal oral mucosa and oral submucous fibrosis. Expert Syst Appl. 2012;39:1062–1077. doi: 10.1016/j.eswa.2011.07.107. [DOI] [Google Scholar]
  • 45.Nguyen K., Sabata B., Jain A.K. Prostate cancer grading: gland segmentation and structural features. Pattern Recogn Lett. 2012;33:951–961. doi: 10.1016/j.patrec.2011.10.001. [DOI] [Google Scholar]
  • 46.Nguyen K., Sarkar A., Jain A.K. Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2012. Structure and context in prostatic gland segmentation and classification. [DOI] [PubMed] [Google Scholar]
  • 47.Vasiljevic J., Reljin B., Sopta J., Mijucic V., Tulic G., Reljin I. Application of multifractal analysis on microscopic images in the classification of metastatic bone disease. Biomed Microdevices. 2012;14:541–548. doi: 10.1007/s10544-012-9631-1. [DOI] [PubMed] [Google Scholar]
  • 48.De S., Stanley R.J., Lu C., et al. A fusion-based approach for uterine cervical cancer histology image classification. Comput Med Imaging Graph. 2013;37:475–487. doi: 10.1016/j.compmedimag.2013.08.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Lee G., Ali S., Veltri R., Epstein J.I., Christudass C., Madabhushi A. Lecture Notes in Computer Science, vol. 8151. LNCS; 2013. Cell orientation entropy (CORE): Predicting biochemical recurrence from prostate cancer tissue microarrays; pp. 396–403. [DOI] [PubMed] [Google Scholar]
  • 50.Chang H., Borowsky A., Spellman P., Parvin B. Computer Vision and Pattern Recognition (CVPR) 2013. Classification of tumor histology via morphometric context. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Sparks R., Madabhushi A. Explicit shape descriptors: novel morphologic features for histopathology classification. Med Image Anal. 2013;17:997–1009. doi: 10.1016/j.media.2013.06.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Kothari S., Phan J.H., Young A.N., Wang M.D. Histological image classification using biologically interpretable shape-based features. BMC Med Imaging. 2013;13 doi: 10.1186/1471-2342-13-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Chang H., Han J., Borowsky A., et al. Invariant delineation of nuclear architecture in glioblastoma multiforme for clinical and molecular association. IEEE Trans Med Imaging. 2013;32:670–682. doi: 10.1109/TMI.2012.2231420. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Kong J., Cooper L.A., Wang F., et al. Machine-based morphologic analysis of glioblastoma using wholeslide pathology images uncovers clinically relevant molecular correlates. PloS One. 2013;8 doi: 10.1371/journal.pone.0081049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Basavanhally A., Ganesan S., Feldman M., et al. Multi-field-of-view framework for distinguishing tumor grade in er+ breast cancer from entire histopathology slides. IEEE Trans Biomed Eng. 2013;60:2089–2099. doi: 10.1109/TBME.2013.2245129. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Rashid S., Fazli L., Boag A., Siemens R., Abolmaesumi P., Salcudean S.E. Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2013. Separation of benign and malignant glands in prostatic adenocarcinoma. [DOI] [PubMed] [Google Scholar]
  • 57.Ali S., Lewis J., Madabhushi A. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013. Mori K., Sakuma I., Sato Y., Barillot C., Navab N., editors. Springer; Berlin, Heidelberg: 2013. Spatially aware cell cluster (SpACCl) graphs: Predicting outcome in oropharyngeal p16+ tumors; pp. 412–419. [DOI] [PubMed] [Google Scholar]
  • 58.Fatima K., Arooj A., Majeed H. A new texture and shape based technique for improving meningioma classification. Microsc Res Tech. 2014;77:862–873. doi: 10.1002/jemt.22409. [DOI] [PubMed] [Google Scholar]
  • 59.Dong F., Irshad H., Oh E.Y., et al. Computational pathology to discriminate benign from malignant intraductal proliferations of the breast. PloS One. 2014;9 doi: 10.1371/journal.pone.0114885. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Kruk M., Osowski S., Slodkowska J., Koktysz R. Computer approach to recognition of fuhrman grade of cells in clear-cell renal cell carcinoma. Anal Quant Cytol Histopathol. 2014;36:147–160. https://www.researchgate.net/publication/264989971 URL. [PubMed] [Google Scholar]
  • 61.Mukherjee R. Morphometric evaluation of preeclamptic placenta using light microscopic images. Biomed Res Int. 2014;2014 doi: 10.1155/2014/293690. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Lee G., Singanamalli A., Wang H., et al. Supervised multi-view canonical correlation analysis (smvcca): integrating histologic and proteomic features for predicting recurrent prostate cancer. IEEE Trans Med Imaging. 2015;34:284–297. doi: 10.1109/TMI.2014.2355175. [DOI] [PubMed] [Google Scholar]
  • 63.Pääkkönen J., Päivinen N., Nykänen M., Paavonen T. An automated gland segmentation and classification method in prostate biopsies: an image source-independent approach. Mach Vis Appl. 2015;26:103–113. doi: 10.1007/s00138-014-0650-1. [DOI] [Google Scholar]
  • 64.Lu C., Mandal M. Automated analysis and diagnosis of skin melanoma on whole slide histopathological images. Pattern Recogn. 2015;48:2738–2750. doi: 10.1016/j.patcog.2015.02.023. [DOI] [Google Scholar]
  • 65.Rathore S., Hussain M., Khan A. Automated colon cancer detection using hybrid of novel geometric features and some traditional features. Comput Biol Med. 2015;65:279–296. doi: 10.1016/j.compbiomed.2015.03.004. [DOI] [PubMed] [Google Scholar]
  • 66.Paul A., Mukherjee D.P. Mitosis detection for invasive breast cancer grading in histopathological images. IEEE Trans Image Process. 2015;24:4041–4054. doi: 10.1109/TIP.2015.2460455. [DOI] [PubMed] [Google Scholar]
  • 67.Chen J.M., Qu A.P., Wang L.W., et al. New breast cancer prognostic factors identified by computer-aided image analysis of H&E stained histopathology images. Sci Rep. 2015;5 doi: 10.1038/srep10690. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Rathore S., Hussain M., Iftikhar M.A., Jalil A. Novel structural descriptors for automated colon cancer detection and grading. Comput Methods Programs Biomed. 2015;121:92–108. doi: 10.1016/j.cmpb.2015.05.008. [DOI] [PubMed] [Google Scholar]
  • 69.Ali S., Veltri R., Epstein J.I., Christudass C., Madabhushi A. Selective invocation of shape priors for deformable segmentation and morphologic classification of prostate cancer tissue microarrays. Comput Med Imaging Graph. 2015;41:3–13. doi: 10.1016/j.compmedimag.2014.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Fukuma K., Prasath V.B., Kawanaka H., Aronow B.J., Takase H. In: 20th International Conference on Knowledge Based and Intelligent Information and Engineering Systems. vol. 96. Elsevier B.V; 2016. A study on nuclei segmentation, feature extraction and disease stage classification for human brain histopathological images; pp. 1202–1210. [DOI] [Google Scholar]
  • 71.Bejnordi B.E., Balkenhol M., Litjens G., et al. Automated detection of DCIS in whole-slide H&E stained breast histopathology images. IEEE Trans Med Imaging. 2016;35:2141–2150. doi: 10.1109/TMI.2016.2550620. [DOI] [PubMed] [Google Scholar]
  • 72.Wang P., Hu X., Li Y., Liu Q., Zhu X. Automatic cell nuclei segmentation and classification of breast cancer histopathology images. Signal Process. 2016;122:1–13. doi: 10.1016/j.sigpro.2015.11.011. [DOI] [Google Scholar]
  • 73.Jothi J.A.A., Rajam V.M.A. Effective segmentation and classification of thyroid histopathology images. Appl Soft Comput J. 2016;46:652–664. doi: 10.1016/j.asoc.2016.02.030. [DOI] [Google Scholar]
  • 74.Han J., Wang Y., Cai W., Borowsky A., Parvin B., Chang H. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Ourselin S., Joskowicz L., Sabuncu M.R., Unal G., Wells W., editors. vol. 9900. Springer International Publishing; 2016. Integrative analysis of cellular morphometric context reveals clinically relevant signatures in lower grade glioma; pp. 72–80. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.Yu K.H., Zhang C., Berry G.J., et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat Commun. 2016;7 doi: 10.1038/ncomms12474. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Guo P., Banerjee K., Stanley R.J., et al. Nuclei-based features for uterine cervical cancer histology image analysis with fusion-based classification. IEEE J Biomed Health Inform. 2016;20:1595–1607. doi: 10.1109/JBHI.2015.2483318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Luo X., Zang X., Yang L., et al. Comprehensive computational pathological image analysis predicts lung cancer prognosis. J Thorac Oncol. 2017;12:501–509. doi: 10.1016/j.jtho.2016.10.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Wan T., Zhang W., Zhu M., Chen J., Achim A., Qin Z. Automated mitosis detection in histopathology based on non-gaussian modeling of complex wavelet coefficients. Neurocomputing. 2017;237:291–303. doi: 10.1016/j.neucom.2017.01.008. [DOI] [Google Scholar]
  • 79.Jothi J.A.A., Rajam V.M.A. Automatic classification of thyroid histopathology images using multi-classifier system. Multimed Tools Appl. 2017;76:18711–18730. [Google Scholar]
  • 80.Awan R., Sirinukunwattana K., Epstein D., et al. Glandular morphometrics for objective grading of colorectal adenocarcinoma histology images. Sci Rep. 2017;7:2220–2243. doi: 10.1038/s41598-017-16516-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Florindo J.B., Bruno O.M., Landini G. Morphological classification of odontogenic keratocysts using bouligand–minkowski fractal descriptors. Comput Biol Med. 2017;81:1–10. doi: 10.1016/j.compbiomed.2016.12.003. [DOI] [PubMed] [Google Scholar]
  • 82.Sayed G.I., Hassanien A.E. Moth-flame swarm optimization with neutrosophic sets for automatic mitosis detection in breast cancer histology images. Appl Intell. 2017;47:397–408. doi: 10.1007/s10489-017-0897-0. [DOI] [Google Scholar]
  • 83.Pang W., Jiang H., Li S. Sparse contribution feature selection and classifiers optimized by concave-convex variation for hcc image recognition. Biomed Res Int. 2017;2017 doi: 10.1155/2017/9718386. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Niazi M.K.K., Yao K., Zynger D.L., et al. Visually meaningful histopathological features for automatic grading of prostate cancer. IEEE J Biomed Health Inform. 2017;21:1027–1038. doi: 10.1109/JBHI.2016.2565515. [DOI] [PubMed] [Google Scholar]
  • 85.Amit A., Sabo E., Movsas A., et al. Can morphometric analysis of the fallopian tube fimbria predict the presence of uterine papillary serous carcinoma (UPSC)? PloS One. 2019;14 doi: 10.1371/journal.pone.0211329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Cheng J., Mo X., Wang X., Parwani A., Feng Q., Huang K. Identification of topological features in renal tumor microenvironment associated with patient survival. Bioinformatics. 2018;34:1024–1030. doi: 10.1093/bioinformatics/btx723. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 87.Barros G.O., Navarro B., Duarte A., Dos-Santos W.L. PathoSpotter-K: a computational tool for the automatic identification of glomerular lesions in histological images of kidneys. Sci Rep. 2017;7 doi: 10.1038/srep46769. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 88.Sapkota M., Liu F., Xie Y., Su H., Xing F., Yang L. Aiimds: an integrated framework of automatic idiopathic inflammatory myopathy diagnosis for muscle. IEEE J Biomed Health Inform. 2018;22:942–954. doi: 10.1109/JBHI.2017.2694344. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Wang S., Chen A., Yang L., et al. Comprehensive analysis of lung cancer pathology images to discover tumor shape and boundary features that predict survival outcome. Sci Rep. 2018;8 doi: 10.1038/s41598-018-27707-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.do Nascimento M.Z., Martins A.S., Tosta T.A.A., Neves L.A. Lymphoma images analysis using morphological and non-morphological descriptors for classification. Comput Methods Programs Biomed. 2018;163:65–77. doi: 10.1016/j.cmpb.2018.05.035. [DOI] [PubMed] [Google Scholar]
  • 91.Das D., Mahanta L.B., Ahmed S., Baishya B.K., Haque I. Study on contribution of biological interpretable and computer-aided features towards the classification of childhood medulloblastoma cells. J Med Syst. 2018;42 doi: 10.1007/s10916-018-1008-4. [DOI] [PubMed] [Google Scholar]
  • 92.Yu C., Chen H., Li Y., Peng Y., Li J., Yang F. Breast cancer classification in pathological images based on hybrid features. Multimed Tools Appl. 2019;78:21325–21345. [Google Scholar]
  • 93.Ji M.Y., Yuan L., Jiang X.D., et al. Nuclear shape, architecture and orientation features from H&E images are able to predict recurrence in node-negative gastric adenocarcinoma. J Transl Med. 2019;17 doi: 10.1186/s12967-019-1839-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Lan C., Li J., Huang X., et al. Stromal cell ratio based on automated image analysis as a predictor for platinum-resistant recurrent ovarian cancer. BMC Cancer. 2019;19 doi: 10.1186/s12885-019-5343-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Falkenstein B., Kovashka A., Hwang S.J., Chennubhotla S.C. Lecture Notes in Computer Science, vol. 12535. LNCS, Springer Science and Business Media Deutschland GmbH; 2020. Classifying nuclei shape heterogeneity in breast tumors with skeletons; pp. 310–323. [DOI] [Google Scholar]
  • 96.Cheng J., Han Z., Mehra R., et al. Computational analysis of pathological images enables a better diagnosis of TFE3 Xp11.2 translocation renal cell carcinoma. Nat Commun. 2020;11 doi: 10.1038/s41467-020-15671-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Yin P.N., Kc K., Wei S., et al. Histopathological distinction of non-invasive and invasive bladder cancers using machine learning approaches. BMC Med Inform Decis Mak. 2020;20 doi: 10.1186/s12911-020-01185-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Maroof N., Khan A., Qureshi S.A., Rehman A.U., Khalil R.K., Shim S.O. Mitosis detection in breast cancer histopathology images using hybrid feature space. Photodiagnosis Photodyn Ther. 2020;31 doi: 10.1016/j.pdpdt.2020.101885. [DOI] [PubMed] [Google Scholar]
  • 99.Javed S., Mahmood A., Werghi N., Benes K., Rajpoot N. Multiplex cellular communities in multi-gigapixel colorectal cancer histology images for tissue phenotyping. IEEE Trans Image Process. 2020;29:9204–9219. doi: 10.1109/TIP.2020.3023795. [DOI] [PubMed] [Google Scholar]
  • 100.Lee H., Kim J.O., Shim J., Cho M. Multivariate discriminant analysis for branching classification of colonic tubular adenoma glands. Cytom Part B - Clin Cytom. 2020;98:429–440. doi: 10.1002/cyto.b.21871. [DOI] [PubMed] [Google Scholar]
  • 101.Tan X.J., Mustafa N., Mashor M.Y., Rahman K.S.A. Automated knowledge-assisted mitosis cells detection framework in breast histopathology images. Math Biosci Eng. 2022;19:1721–1745. doi: 10.3934/mbe.2022081. [DOI] [PubMed] [Google Scholar]
  • 102.Mathialagan P., Chidambaranathan M. Computer vision techniques for upper aero-digestive tract tumor grading classification – addressing pathological challenges. Pattern Recogn Lett. 2021;144:42–53. doi: 10.1016/j.patrec.2021.01.002. [DOI] [Google Scholar]
  • 103.Diao J.A., Wang J.K., Chui W.F., et al. Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes. Nat Commun. 2021;12 doi: 10.1038/s41467-021-21896-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Rastghalam R., Danyali H., Helfroush M.S., Celebi M.E., Mokhtari M. Skin melanoma detection in microscopic images using HMM-based asymmetric analysis and expectation maximization. IEEE J Biomed Health Inform. 2021;25:3486–3497. doi: 10.1109/JBHI.2021.3081185. [DOI] [PubMed] [Google Scholar]
  • 105.Hu H., Qiao S., Hao Y., et al. Breast cancer histopathological images recognition based on two-stage nuclei segmentation strategy. PloS One. 2022;17 doi: 10.1371/journal.pone.0266973. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Peregrina-Barreto H., Ramirez-Guatemala V.Y., Lopez-Armas G.C., Cruz-Ramos J.A. Characterization of nuclear pleomorphism and tubules in histopathological images of breast cancer. Sensors. 2022;22 doi: 10.3390/s22155649. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Liu Y., Jia Y., Hou C., et al. Pathological prognosis classification of patients with neuroblastoma using computational pathology analysis. Comput Biol Med. 2022;149 doi: 10.1016/j.compbiomed.2022.105980. [DOI] [PubMed] [Google Scholar]
  • 108.Wang Z., Lu H., Wu Y., et al. Predicting recurrence in osteosarcoma via a quantitative histological image classifier derived from tumour nuclear morphological features. CAAI Trans Intell Technol. 2023;8:836–848. doi: 10.1049/cit2.12175. [DOI] [Google Scholar]
  • 109.Xie J., Pu X., He J., et al. Survival prediction on intrahepatic cholangiocarcinoma with histomorphological analysis on the whole slide images. Comput Biol Med. 2022;146 doi: 10.1016/j.compbiomed.2022.105520. [DOI] [PubMed] [Google Scholar]
  • 110.Brindha V., Jayashree P., Karthik P., Manikandan P. Tumor grading model employing geometric analysis of histopathological images with characteristic nuclei dictionary. Comput Biol Med. 2022;149 doi: 10.1016/j.compbiomed.2022.106008. [DOI] [PubMed] [Google Scholar]
  • 111.Saha P., Das P., Nath N., Bhowmik M.K. Estimation of abnormal cell growth and MCG-based discriminative feature analysis of histopathological breast images. Int J Intell Syst. 2023;2023 doi: 10.1155/2023/6318127. [DOI] [Google Scholar]
  • 112.Shao W., Liu J., Zuo Y., et al. Fam3l: feature-aware multi-modal metric learning for integrative survival analysis of human cancers. IEEE Trans Med Imaging. 2023;42:2552–2565. doi: 10.1109/TMI.2023.3262024. [DOI] [PubMed] [Google Scholar]
  • 113.Duenweg S.R., Brehler M., Lowman A.K., et al. Quantitative histomorphometric features of prostate cancer predict patients who biochemically recur following prostatectomy. Lab Invest. 2023;103 doi: 10.1016/j.labinv.2023.100269. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.Amgad M., Hodge J.M., Elsebaie M.A., et al. A population-level digital histologic biomarker for enhanced prognosis of invasive breast cancer. Nat Med. 2024;30:85–97. doi: 10.1038/s41591-023-02643-7. [DOI] [PubMed] [Google Scholar]
  • 115.Kanber B.M., Smadi A.A., Noaman N.F., Liu B., Gou S., Alsmadi M.K. LightGBM: a leading force in breast cancer diagnosis through machine learning and image processing. IEEE Access. 2024;12:39811–39832. doi: 10.1109/ACCESS.2024.3375755. [DOI] [Google Scholar]
  • 116.Krithiga R., Geetha P. Proliferation score prediction using novel SMHC feature using adaptive XGBoost model. Multimed Tools Appl. 2024;83:11845–11860. doi: 10.1007/s11042-023-15987-6. [DOI] [Google Scholar]
  • 117.Iwamoto R., Nishikawa T., Musangile F.Y., et al. Small sized centroblasts as poor prognostic factor in follicular lymphoma - based on artificial intelligence analysis. Comput Biol Med. 2024;178 doi: 10.1016/j.compbiomed.2024.108774. [DOI] [PubMed] [Google Scholar]
  • 118.Lin S., Yong J., Zhang L., et al. Applying image features of proximal paracancerous tissues in predicting prognosis of patients with hepatocellular carcinoma. Comput Biol Med. 2024;173 doi: 10.1016/j.compbiomed.2024.108365. [DOI] [PubMed] [Google Scholar]
  • 119.L’Imperio V., Coelho V., Cazzaniga G., et al. Machine learning streamlines the morphometric characterization and multiclass segmentation of nuclei in different follicular thyroid lesions: everything in a nutshell. Mod Pathol. 2024;37 doi: 10.1016/j.modpat.2024.100608. [DOI] [PubMed] [Google Scholar]
  • 120.Illarionova S., Hamoudi R., Zapevalina M., et al. A hierarchical algorithm with randomized learning for robust tissue segmentation and classification in digital pathology. Inform Sci. 2025;686 [Google Scholar]
  • 121.O’Hara S., Draper B.A. 2011. Introduction to the bag of features paradigm for image classification and retrieval.http://arxiv.org/abs/1101.3354 URL. [Google Scholar]
  • 122.Stirling D.R., Swain-Bowden M.J., Lucas A.M., Carpenter A.E., Cimini B.A., Goodman A. Cellprofiler 4: improvements in speed, utility and usability. BMC Bioinform. 2021;22 doi: 10.1186/s12859-021-04344-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Jia W., Sun M., Lian J., Hou S. Feature dimensionality reduction: a review. Complex Intell Syst. 2022;8:2663–2693. doi: 10.1007/s40747-021-00637-x. [DOI] [Google Scholar]
  • 124.Bellman R. Princeton University Press; 1972. Dynamic Programming. [Google Scholar]
  • 125.Boser B.E., Guyon I.M., Vapnik V.N. Proceedings of the Fifth Annual Workshop on Computational Learning Theory. 1992. A training algorithm for optimal margin classifiers; pp. 144–152. [Google Scholar]
  • 126.Breiman L. Random forests. Mach Learn. 2001;45:5–32. [Google Scholar]
  • 127.Fix E., Hodges J. USAF School of Aviation Medicine; 1951. Discriminatory Analysis: Nonparametric Discrimination: Consistency Properties.https://books.google.fr/books?id=4XwytAEACAAJ URL. [Google Scholar]
  • 128.Cox D.R. Regression models and life-tables. J R Stat Soc B Methodol. 1972;34:187–202. doi: 10.1111/j.2517-6161.1972.tb00899.x. [DOI] [Google Scholar]
  • 129.Chen T., Guestrin C. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery; 2016. XGBoost: A scalable tree boosting system; pp. 785–794. [DOI] [Google Scholar]
  • 130.Dorogush A.V., Ershov V., Gulin A. CatBoost: Gradient boosting with categorical features support. 2018. arXiv:1810.11363. https://api.semanticscholar.org/CorpusID:26037613 URL.
  • 131.Ke G., Meng Q., Finley T., et al. Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc; 2017. LightGBM: A highly efficient gradient boosting decision tree; pp. 3149–3157. [Google Scholar]
  • 132.Abdelsamea M.M., Zidan U., Senousy Z., Gaber M.M., Rakha E., Ilyas M. A survey on artificial intelligence in histopathology image analysis. Wiley Interdiscip Rev Data Min Knowl Discov. 2022;12 doi: 10.1002/widm.1474. [DOI] [Google Scholar]
  • 133.Makarchuk G., Kondratenko V., Pisov M., Pimkin A., Krivov E., Belyaev M. Image Analysis and Recognition. Springer International Publishing; 2018. Ensembling neural networks for digital pathology images classification and segmentation.http://arxiv.org/abs/1802.00947 URL. [Google Scholar]
  • 134.Yu W.H., Li C.H., Wang R.C., Yeh C.Y., Chuang S.S. Machine learning based on morphological features enables classification of primary intestinal t-cell lymphomas. Cancers. 2021;13 doi: 10.3390/cancers13215463. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135.Sung Y.-N. Interpretable deep learning model to predict lymph node metastasis in early gastric cancer using whole slide images. Am J Cancer Res. 2024;14:3513–3522. doi: 10.62347/RJBH6076. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 136.Kapse S., Pati P., Das S., et al. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024. Si-mil: Taming deep mil for self-interpretability in gigapixel histopathology. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137.Syrykh C., Abreu A., Amara N., et al. Accurate diagnosis of lymphoma on whole-slide histopathology images using deep learning. npj Digit Med. 2020;3 doi: 10.1038/s41746-020-0272-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 138.Li D., Bledsoe J.R., Zeng Y., et al. A deep learning diagnostic platform for diffuse large b-cell lymphoma with high accuracy across multiple hospitals. Nat Commun. 2020;11 doi: 10.1038/s41467-020-19817-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 139.Tellez D., Litjens G., Bándi P., et al. Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology. Med Image Anal. 2019;58 doi: 10.1016/j.media.2019.101544. [DOI] [PubMed] [Google Scholar]
  • 140.Dehkharghanian T., Bidgoli A.A., Riasatian A., et al. Biased data, biased AI: deep networks predict the acquisition site of tcga images. Diagn Pathol. 2023;18 doi: 10.1186/s13000-023-01355-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 141.Ignatov A., Malivenko G. Nct-crc-he: Not All Histopathological Datasets are Equally Useful. 2024. http://arxiv.org/abs/2409.11546 URL.
  • 142.Kwong J.C.C., Khondker A., Lajkosz K., et al. APPRAISE-AI tool for quantitative evaluation of AI studies for clinical decision support. JAMA Netw Open. 2023;6:e2335377. doi: 10.1001/jamanetworkopen.2023.35377.
  • 143.Kapoor S., Narayanan A. Leakage and the reproducibility crisis in machine-learning-based science. Patterns. 2023;4. doi: 10.1016/j.patter.2023.100804.
  • 144.Cortacero K., McKenzie B., Müller S., et al. Evolutionary design of explainable algorithms for biomedical image segmentation. Nat Commun. 2023;14. doi: 10.1038/s41467-023-42664-x.
  • 145.Bi Y., Xue B., Zhang M. Genetic Programming for Image Classification: An Automated Approach to Feature Learning. vol. 24. Springer Nature; 2021.
  • 146.De La Torre C., Nadizar G., Lavinas Y., et al. Evolved and transparent pipelines for biomedical image classification. In: Genetic Programming. Springer Nature Switzerland; 2025. pp. 173–189.
  • 147.Page M.J., McKenzie J.E., Bossuyt P.M., et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary material

mmc1.pdf (262.4KB, pdf)

Data Availability Statement

All data generated or analyzed during this study are available from the corresponding author on reasonable request.


Articles from Journal of Pathology Informatics are provided here courtesy of Elsevier
