Abstract
Deep Learning (DL) and Machine Learning (ML) algorithms are adept at managing and classifying a wide range of data formats, including time series, text, and images, addressing challenges in both supervised and unsupervised learning. However, the practical applications of specific algorithms—particularly convolutional neural networks (CNNs) and vision transformers (VTs)—are often constrained by the need for large datasets, extensive training, and complex parameter tuning, which frequently relies on a trial-and-error approach. Other approaches, such as visibility graphs (VGs), often produce networks with an exceedingly high number of nodes, resulting in significant computational costs related to runtime and memory usage. Recent research has explored alternative feature extraction and classification solutions to address these challenges. One noteworthy innovation is the use of quantile graphs (QGs), initially applied to time series data, which transform data points into a complex network of quantiles. This method effectively identifies key structural patterns while minimizing computational requirements. These graphs have produced promising outcomes in analyzing physiological time series related to brain function and disorders, including Alzheimer’s disease. This research enhances quantile graphs for image identification and introduces a method for feature extraction applicable in ML and DL processes within the domain of computer vision. The novelty of this work is extending the QGs framework from one-dimensional time series to two-dimensional images, introducing a scalable graph-based approach for image classification, and providing an open-source implementation of the method. The study utilized two well-established benchmark datasets: the Modified National Institute of Standards and Technology (MNIST) handwritten digit database and Fashion MNIST. The performance of the proposed QGs was evaluated in comparison to that of CNNs and VTs. 
Our findings reveal that, while CNNs and VTs achieve superior accuracy in certain circumstances, the proposed QGs outperform these methods in other scenarios, particularly when training data is limited. Additionally, QGs yielded more consistent results across all situations, suggesting that the choice of training components has less influence on them than on CNNs and VTs. Moreover, the QGs were applied to a medical imaging dataset to illustrate their relevance to real biological data, indicating their potential for integration into applications that detect brain diseases.
Keywords: Classification, Complex networks, Computer vision, Quantile graphs, Machine learning
Subject terms: Biomedical engineering, Computational science
Introduction
The importance of utilizing computational methods in decision-making has markedly increased across various scientific and technological fields in recent years 1–3. There has been a notable rise in interest in machine learning (ML) and deep learning (DL) techniques for pattern recognition and classification. ML and DL algorithms enhance the processing, analysis, and annotation of diverse types of data 4–6. This data encompasses geological or physiological time series, textual information from the internet or books, and medical and satellite imagery. Machine learning methods are being applied to address various challenges, including those associated with supervised, unsupervised, and reinforcement learning 7–9.
Applying computational tools to tackle decision-making difficulties offers benefits such as increased speed, enhanced precision, and elevated impartiality, characteristics that are more restricted when executed by a human 10–12. Although different ML/DL algorithms for classification employ diverse operations, they all require obtaining strong features that effectively distinguish between the groups to be classified; therefore, the search for useful features remains a field open for further research 13–15. Reliable features can significantly enhance various applications, including sound and emotion recognition, remote sensing, object detection, and disease diagnosis 16–18.
Despite significant advancements in ML/DL techniques, two fundamental gaps persist. First, current methods often require large datasets and substantial computational resources, which may not be accessible in many scientific or industrial settings. Second, although these algorithms demonstrate high accuracy, their vulnerability to overfitting, sensitivity to parameters, and susceptibility to adversarial perturbations reveal critical weaknesses 19,20. These challenges underscore the need to explore alternative feature-extraction frameworks that are less resource-intensive, more robust, and easier to interpret.
Numerous methods for image pattern classification have yielded impressive results 13,21–23. The most commonly employed techniques include support vector machines, k-nearest neighbors, random forests, decision trees, neural networks, and complex networks (CNs) 24–28. Over the years, a variety of models with consistent performance have been established. Specific neural network architectures, particularly convolutional neural networks (CNNs) and vision transformers (VTs), have exhibited remarkable effectiveness in image classification, as highlighted in 13,29. Deep learning models, such as CNNs and VTs, are widely employed for processing and analyzing visual data, excelling in tasks including image recognition and segmentation 30,31.
In contrast, the practical application of these network architectures is constrained by the scarcity of large-scale datasets and the requisite computing power. This limitation arises from the high demands for training data and processing time that these algorithms impose 32–38. CNNs and VTs heavily depend on parameter selection, including the number of layers, neurons, training epochs, and activation functions; however, there are no standardized guidelines for setting these parameters. Most proposed frameworks are based on trial and error, often guided by individual experience in the adjustment process 30. Recent studies have shown that certain robust architectures can become vulnerable when faced with adversarial examples—these are modified data samples engineered to increase the likelihood of misclassifications by the classifier model. For instance, an adversarial example of a stop sign might be slightly altered to make it appear as a yield sign to a classifier, thereby underscoring the limitations of these architectures 19,20. There remains a pressing need for the introduction of new techniques capable of extracting robust features and classifying image patterns, which can drive innovation and advancement in both technological and scientific domains 39,40.
Recent studies, including 41,42, have introduced lightweight deep learning strategies designed to reduce computational costs. However, these approaches still struggle to maintain discriminative features, particularly in scenarios characterized by small sample sizes or noisy data. As a result, despite the progress made, the trade-off between efficiency, accuracy, and robustness remains an unresolved challenge. Our research aims to bridge this gap by exploring an alternative graph-based paradigm for feature extraction.
Researchers have explored alternative methods for extracting key features from images to enhance classification accuracy. Approaches based on complex network theory have shown promising results, as highlighted in numerous studies 43,44. A complex network is represented as a collection of N interconnected nodes, linked by M edges. Each edge denotes an interaction between two nodes, and in cases of self-loops, it indicates an interaction of a node with itself 45–47. In recent years, various methodologies employing CNs have been developed, enabling the analysis of diverse data types through an examination of the structural characteristics associated with these networks. The system or structure in question is represented as a network, with its connections characterized by various topological metrics 48–52.
A recently developed method rooted in the theory of CNs gave rise to visibility graphs (VGs). The method, initially developed for time series analysis, uses the concept of visibility among data points to generate graphs that represent the input data in a network framework, as demonstrated in references 44,51. Several applications of VGs have shown their potential as powerful tools for time series and image classification tasks. In this approach, each pixel in an image is considered a node within a network. Two pixels (nodes) are connected by an edge if a line drawn between them meets a visibility criterion, meaning that no intermediate pixel along this line has a higher intensity value that obstructs their “line of sight”. This process yields a graph that mirrors the spatial and intensity-based organization of the image, reflecting its connectivity structure. By transforming images into graphs, VGs allow the application of complex network theory to extract topological features such as degree distribution, clustering coefficients, and centrality measures. These features can subsequently be utilized for tasks such as classification or segmentation 43,44,53,54.
It is crucial to recognize that computational models of this type can be extremely time-consuming 25,53,54. The number of nodes in networks created by VGs equals the number of data points, making it quite challenging to analyze high-dimensional data. For example, a 512 × 512 image containing 262,144 pixels corresponds to a network of 262,144 nodes, leading to adjacency matrices with 262,144² ≈ 6.9 × 10¹⁰ entries 43. The sheer volume of nodes greatly increases the processing time required for feature computation, ultimately limiting the practicality of the method for large datasets. As a result, VGs were not utilized in our study.
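The scale of the problem is easy to check with a back-of-the-envelope calculation; the short Python snippet below (an illustration, not part of the cited implementation) computes the number of adjacency-matrix entries and their dense-storage cost for a 512 × 512 image.

```python
# Back-of-the-envelope cost of a dense VG adjacency matrix for a 512x512 image.
n_pixels = 512 * 512       # 262,144 nodes, one per pixel
entries = n_pixels ** 2    # dense adjacency-matrix entries
bytes_f64 = entries * 8    # 8 bytes per float64 entry

print(entries)             # 68719476736 entries
print(bytes_f64 / 1e9)     # ~550 GB of memory
```

Even with sparse storage, the per-image node count alone makes feature computation on large datasets impractical.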
In contrast, a newer method based on the distribution quantiles of a time series gives rise to computational models called quantile graphs (QGs). The core idea of mapping a time series to a QG is to divide the amplitude distribution of the series into Q quantiles. Each quantile is then represented as a node in a network, and nodes are linked according to the temporal succession of the corresponding time series points. The process produces a weighted, directed graph with self-loops that contains significantly fewer nodes than the original number of data points 55,56.
Recent studies have used QGs to assess the effects of specific brain disorders, such as epilepsy and Alzheimer’s disease, as well as the impact of aging on heart rate. These investigations underscored the reliability of QGs for identifying changes in normal biological processes from physiological time series 57–59. In one such study, QGs were compared to five commonly used nonlinear time-series analysis techniques, applied to EEG signals from patients with severe Alzheimer’s disease and healthy individuals. The results indicated that QGs achieved higher classification accuracy at lower computational expense than the alternative approaches 60. Although QGs are well suited to time-series analysis, they had not yet been applied to two-dimensional data, such as images.
This study proposes an expansion of the QGs method for characterizing and categorizing various types of images. The QGs will be derived from grayscale images, which will then be employed for feature extraction, as the resulting graphs can be analyzed using a range of topological metrics. The proposed methodology offers an additional approach that complements established techniques in the field of ML/DL applications in computer vision, specifically in feature extraction. The computational implementation of the QGs mapping for images was performed using the Python programming language and is publicly accessible through a repository 61. In summary, the novelty of this work is highlighted by two main aspects. First, it expands the QGs methodology from one-dimensional time series to two-dimensional image data, creating a scalable graph-based framework that effectively captures spatial structures for image classification. Second, it offers an open-source implementation that enhances transparency and reproducibility, while also encouraging further exploration and adaptation of the proposed approach within the research community.
Convolutional neural networks
The established CNNs were used as a reference point for image classification tasks, and their performance served as a benchmark for comparison with the introduced QGs. These deep learning algorithms, specifically designed to handle image processing tasks, have proven highly effective in identifying intricate patterns and complex structures. The core concept of a CNN is extracting features from the input image through a convolution operation using kernel filters. These filters are weighted masks engineered to emphasize visual features, such as corners and textures 62,63. The features obtained during this process are then fed into the neurons and layers of the CNN. Utilizing specific activation functions, these features facilitate the training of the network and are subsequently employed for classification purposes 30. The operation of convolution is illustrated in Figure 1, which shows the feature extraction process using a kernel and an input image. From left to right, the sequence displays the original image, the image with fixed zero boundaries, the kernel mask, and the resulting feature matrix 64. In this study, the CNNs were implemented using the keras programming interface, and the implementation details are available at 61.
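The convolution step described above can be sketched in a few lines of NumPy. The function below is a hypothetical, naive illustration of kernel-based feature extraction with zero-padded boundaries; it is not the Keras implementation used in the study.

```python
import numpy as np

def conv2d_zero_pad(image, kernel):
    """Naive 2-D convolution (correlation form) with zero padding,
    mirroring the feature-extraction step sketched in Fig. 1."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Surround the image with fixed zero boundaries.
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)))
    out = np.empty_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Weighted sum of the kernel-sized window centered at (i, j).
            out[i, j] = (padded[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

For an all-ones 3 × 3 image and kernel, the center of the feature matrix sums all nine pixels, while zero-padded corners only see four, illustrating the border effect of the fixed zero boundaries.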
Fig. 1.
Visual representation of a CNN procedure for extracting features from an input image. The diagram illustrates the sequential processing stages, beginning with the input image and progressing through multiple convolutional and max-pooling layers, which progressively capture spatial features. Convolutional kernels are applied to detect local patterns such as edges and textures, while pooling operations reduce spatial dimensionality 30.
Vision transformers
Neural networks that employ VTs represent an advanced deep learning architecture designed explicitly for visual recognition tasks. Unlike traditional methods that view an image as a grid of pixels, VTs decompose the image into fixed-size patches, which are then linearly embedded into a sequence of tokens. These tokens are enhanced with positional encodings and processed through multiple self-attention layers, allowing the model to learn long-range dependencies and global contextual information throughout the entire image. This architecture enables VTs to effectively capture relationships between distant regions of an image, surpassing the capabilities of CNNs, which rely on local receptive fields. Although VTs have demonstrated remarkable accuracy in image classification and related tasks, they require extensive training datasets and substantial computational resources to fully realize their potential. An illustration of the VT architecture is provided in Fig. 2. In this work, VTs were implemented using PyTorch, and further implementation details can be found at 61.
Fig. 2.
Visual representation of the architecture of VTs. The diagram outlines the primary processing pipeline, where the input image undergoes a linear projection and is enriched with positional encodings to preserve spatial information. These embedded tokens are subsequently fed into layers that utilize multi-head attention, while layer normalization and residual connections contribute to maintaining training stability 65.
Quantile graphs in higher dimensions
QGs were introduced to map a time series X = {x(t)} of length T to a directed, weighted graph with Q nodes 56. Each node stores a particular quantile q_i of the time series values. The weight of the arc connecting nodes q_i and q_j is the frequency with which a time point x(t) in q_i is succeeded by a time point x(t + r) in q_j. Here, r represents the temporal distance between the two time points, with both i and j ranging from 1 to Q, and t ranging from 1 to T − r. The parameter r can be adjusted freely across the length of the time series. Importantly, the accuracy of the mapping–specifically, the graph’s ability to represent the characteristics of the time series–does not depend on the time series being stationary or of a specific length.
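As a rough sketch of this mapping (not the authors’ reference implementation), the following NumPy function bins a series into Q quantiles and accumulates the arc weights for a lag r:

```python
import numpy as np

def quantile_graph_1d(series, Q=4, r=1):
    """Map a 1-D time series to a weighted, directed quantile graph.

    Node i holds the i-th quantile bin; the arc (i, j) counts how often a
    point in quantile i is followed, r steps later, by a point in quantile j.
    """
    series = np.asarray(series, dtype=float)
    # Quantile bin edges; the outer edges are the series min and max.
    edges = np.quantile(series, np.linspace(0, 1, Q + 1))
    # Assign each point to a quantile index in {0, ..., Q-1}.
    labels = np.clip(np.searchsorted(edges, series, side="right") - 1, 0, Q - 1)
    W = np.zeros((Q, Q))
    for t in range(len(series) - r):
        W[labels[t], labels[t + r]] += 1
    return W
```

For a monotonically increasing series, all transitions move to the same or a higher quantile, so the resulting weight matrix is upper triangular, a simple sanity check on the construction.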
We now extend the concept of QGs to higher-dimensional objects, specifically images. To simplify the analysis, we start with a black-and-white image represented as a matrix M of size N_x × N_y. The product N_x · N_y corresponds to the total number of pixels in the image, which can be divided into quantiles using the elements of M, specifically q_1, q_2, …, q_Q. To preserve the relationship between the number of data points and the number of quantiles, Q is chosen as a function of the total number of pixels.
To map spatial proximity to connectivity, we employ a Moore neighborhood with a radius r (see Fig. 3). Figure 3 illustrates the identification of neighboring pixels using the Moore neighborhood for both r = 1 and r = 2. It is important to note that, due to border effects, not all pixels in the image have the same number of neighbors. For instance, with r = 1, the corner pixel (1, 1) has three adjacent pixels: (1, 2), (2, 1), and (2, 2). In contrast, a non-border pixel such as (2, 2) has eight neighbors when considering r = 1: (1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), and (3, 3). In the resulting network, an undirected edge is established between the nodes that represent neighboring pixels. This process of identifying adjacent pixels for a central pixel is iterated until every pixel has been considered as the central pixel exactly once. This iterative approach ensures comprehensive coverage of the entire image, mapping the spatial relationships and structural arrangement of the pixels into the topology of the QG. The proposed method generates undirected, weighted QGs. Adjusting the parameter r results in networks with distinct topologies, which can be freely modified within the constraints of the image.
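A minimal sketch of the two-dimensional mapping is given below; the function name and binning details are illustrative assumptions, not the published implementation 61.

```python
import numpy as np

def quantile_graph_2d(image, Q, r=1):
    """Map a grayscale image to a weighted, undirected quantile graph.

    Pixels are binned into Q quantiles; the weight of edge (i, j) counts how
    often two pixels within Moore-neighborhood radius r fall in quantiles
    i and j.  A sketch of the idea, not the authors' reference code.
    """
    flat = image.astype(float).ravel()
    edges = np.quantile(flat, np.linspace(0, 1, Q + 1))
    lab = np.clip(np.searchsorted(edges, flat, side="right") - 1, 0, Q - 1)
    lab = lab.reshape(image.shape)
    n, m = lab.shape
    W = np.zeros((Q, Q))
    for x in range(n):
        for y in range(m):
            for dx in range(-r, r + 1):
                for dy in range(-r, r + 1):
                    if dx == 0 and dy == 0:
                        continue  # skip the central pixel itself
                    u, v = x + dx, y + dy
                    if 0 <= u < n and 0 <= v < m:  # border effects
                        W[lab[x, y], lab[u, v]] += 1
    # Each neighbor pair is visited once from each endpoint,
    # so W is symmetric by construction.
    return W
```

Note that the graph size is fixed by Q regardless of the image resolution, which is what keeps the method's memory footprint small compared with visibility graphs.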
Fig. 3.
Procedure for identifying neighboring pixels using the Moore neighborhood in an input image, given a radius r. Primary pixels are represented by dark gray squares, while adjacent pixels are shown as light gray squares. When r = 1, the corner pixel (1, 1) has three adjacent pixels: (1, 2), (2, 1), and (2, 2). Similarly, pixel (1, 1) is recognized as a neighbor to the eight pixels (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), and (3, 3) when r = 2 is applied.
The proposed technique is depicted in Fig. 4 and involves transforming a grayscale image with pixel values ranging from 1 to 100 into QGs with Q = 7 nodes, for both the r = 1 and r = 2 cases. Lower pixel values are associated with lower quantile numbers, reflecting darker tones in the original image, whereas higher pixel values are associated with higher quantile numbers, reflecting brighter tones in the input image. Once the pixels are mapped to the nodes of the respective network, the connections can be computed.
Fig. 4.
Illustration of the proposed technique for transforming a grayscale image into QGs with seven nodes. The example demonstrates how pixel intensities are partitioned into seven quantile levels, each corresponding to a node in the resulting graph representation. Edges are established based on spatial proximity, determined by the neighborhood radius parameter r. Two configurations are presented: one with r = 1, which captures only immediate pixel neighborhoods, and another with r = 2, which includes a wider spatial context. This process converts local gray-level distributions into a structured graph topology, preserving both spatial and statistical information for subsequent feature extraction and classification.
For illustration, consider Fig. 4b, which displays the quantile value assigned to each pixel location. Setting r to 1 and focusing on the first pixel, (1, 1), the procedure links the node corresponding to that pixel’s quantile with the nodes corresponding to the quantiles of its adjacent pixels, (1, 2), (2, 1), and (2, 2). When an adjacent pixel falls in the same quantile as the central pixel, the resulting connection is a self-loop. In the resulting network (Fig. 4c), an edge between two nodes carries a weight equal to 3 when the corresponding quantiles were identified as neighbors three times within the image, that is, through three distinct pairs of adjacent pixels. When r is increased to 2, the weight of such an edge can grow, for instance to 4 (Fig. 4d), because the two quantiles are then recognized as neighbors four times within the enlarged neighborhoods. Ultimately, the weights of the edges encode the spatial arrangement of the original image’s pixel values.
Each QG is associated with a weighted adjacency matrix, denoted as W, which can be characterized through various topological metrics. These metrics provide a framework for representing the mapped image within a network structure, offering significant advantages for image processing and classification tasks. Similar to one-dimensional approaches, two-dimensional QGs can effectively represent two-dimensional data types, such as images, without regard to their distribution, scale, or scope. By modeling images as QGs, pixel intensity distributions are transformed into intricate networks. This transformation preserves critical aspects of the original image, including intensity variability and distributional structure, while also capturing underlying spatial patterns through the connectivity of the resulting network. By converting image information into topological descriptors, QGs deliver a compact yet expressive representation that retains essential spatial features.
Methods
To assess the effectiveness of the proposed QGs mapping, we utilized various image datasets and derived QGs from them. This study is organized as follows: (i) we initially conducted a sensitivity analysis of the QG parameters to refine them for the subsequent steps; (ii) we then compared the results obtained from QGs with those from CNNs and VTs, evaluating computational cost and accuracy using benchmark datasets in the field of computer vision; (iii) finally, we applied our proposed approach to an MRI dataset of patients with dementia to illustrate its applicability to real-world biological data. Figure 5 depicts the workflow adopted in this study, which ranges from the theoretical formulation of the QG method for two-dimensional data to its application on real biological data.
Fig. 5.
Block diagram illustrating the workflow of this study. The workflow consists of three main phases, progressing from the theoretical formulation of the QG framework for two-dimensional data to its application on real biological data.
In constructing the CNNs, we utilized two convolutional layers with a fixed kernel size, applying the ReLU activation function to both layers. The learning parameters were set to a batch size of 128 and a total of 3 epochs. These values were selected based on the existing literature to ensure an accurate classification model 66. For the VTs, we employed a batch size of 64, 3 epochs, an embedding dimension of 128, and a depth of 6 67.
Topological measures
In graph analysis, local and global topological metrics play a crucial role in various investigations, including the representation, characterization, and categorization of a network’s features. Following the approach of numerous prior studies on complex networks, we modeled pertinent structures, such as images, as networks and subsequently examined the informative attributes identified through topological measures, as outlined in 46,49. For each resulting graph, we computed six key topological measures: clustering coefficient, Laplacian Estrada index, graph energy, spectral gap, betweenness centrality, and jump length. These specific measures were selected for their effectiveness in extracting essential information from weighted adjacency matrices and their computational efficiency 55,57–60,68–72.
The results from the computation of topological measures serve as feature inputs for an XGBoost classifier. XGBoost is a powerful machine learning model, primarily employed for supervised learning tasks such as classification and regression, and is renowned for its high accuracy and speed when handling a large number of features 73. In all simulations, training time was optimized using the SelectKBest function from sklearn for feature selection, which reduced the total number of features to 100.
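The selection step can be illustrated with a dependency-free stand-in for sklearn’s SelectKBest with the f_classif score; the function below (a hypothetical sketch, with the classifier itself omitted to keep the example self-contained) ranks features by an ANOVA-style F score and keeps the top k:

```python
import numpy as np

def select_k_best(X, y, k):
    """Rank features by a one-way ANOVA-style F score and keep the top k.

    A dependency-free stand-in for sklearn's SelectKBest(f_classif, k=k);
    X is (n_samples, n_features), y holds the class labels.
    """
    classes = np.unique(y)
    overall = X.mean(axis=0)
    # Between-class variance per feature.
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - overall) ** 2
                  for c in classes) / (len(classes) - 1)
    # Within-class variance per feature.
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0)
                 for c in classes) / (len(X) - len(classes))
    scores = between / np.maximum(within, 1e-12)
    keep = np.sort(np.argsort(scores)[::-1][:k])  # indices of top-k features
    return X[:, keep], keep
```

The reduced matrix then feeds the downstream classifier exactly as the full feature matrix would, only with fewer columns.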
Clustering coefficient: Network clustering is characterized by nodes that tend to group together in “triangles”, which are clusters of three interconnected nodes. The formation of connected neighborhoods in a network can be quantified using the clustering coefficient 74. For a node i, the clustering coefficient C_i is defined as the ratio between the weighted triangles actually formed by i and the total number of triangles i could form. For a weighted undirected adjacency matrix W, the clustering coefficient of node i can be computed as:

$$C_i = \frac{\left(\hat{W}^{1/3}\right)^3_{ii}}{k_i\,(k_i - 1)} \qquad (1)$$

for i = 1, …, N. The normalized weighted adjacency matrix \hat{W} is obtained by dividing all elements of W by its maximum value, and k_i is the number of edges incident to node i. The global average clustering coefficient of the network is then:

$$C = \frac{1}{N}\sum_{i=1}^{N} C_i \qquad (2)$$
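As an illustration, the clustering coefficient can be evaluated directly with NumPy; the sketch below assumes the Onnela-style formulation for an undirected weighted graph without self-loops.

```python
import numpy as np

def weighted_clustering(W):
    """Onnela-style clustering coefficient for a weighted undirected graph.

    Assumes W is symmetric with zero diagonal (no self-loops).
    Returns the per-node coefficients and their network-wide average.
    """
    Wn = W / W.max()                                  # normalized weights
    cube = np.linalg.matrix_power(np.cbrt(Wn), 3)     # weighted triangles
    k = (W > 0).sum(axis=1)                           # binary degree
    denom = (k * (k - 1)).astype(float)               # possible triangles
    C = np.divide(np.diag(cube), denom,
                  out=np.zeros(len(W)), where=denom > 0)
    return C, C.mean()
```

On a fully connected triangle with unit weights, every node closes all of its possible triangles, so each coefficient equals 1.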
Laplacian Estrada index: The Laplacian matrix of a given matrix W is obtained by subtracting W from a diagonal matrix D, in which each diagonal entry corresponds to the degree of the associated node 59,75,76. The Laplacian Estrada index LEE can be calculated from the eigenvalues μ_i of the Laplacian matrix, for i = 1, …, N (Eq. 3):

$$LEE = \sum_{i=1}^{N} e^{\mu_i} \qquad (3)$$
Graph energy: In the spectral analysis of complex networks, eigenvalues serve as essential measures that represent the spectrum of the graph. For an N × N adjacency matrix, which yields N eigenvalues, these measures can be linked to the possible paths of random walks over the graph, offering insights into its structural properties 77,78. The graph energy E can be computed as the sum of the absolute values of the eigenvalues λ_i, for i = 1, …, N 79:

$$E = \sum_{i=1}^{N} |\lambda_i| \qquad (4)$$
Spectral gap: The spectral gap is a fundamental property of a graph that measures the difference between the largest and second-largest eigenvalues of a matrix representation of the network. In the context of complex networks, the spectral gap is closely linked to robustness: a larger gap indicates stronger overall connectivity, whereas a smaller gap suggests a more weakly connected structure. Given the eigenvalues of an adjacency matrix, ordered as λ_1 ≥ λ_2 ≥ … ≥ λ_N, the spectral gap Δλ is defined as:

$$\Delta\lambda = \lambda_1 - \lambda_2 \qquad (5)$$
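The three eigenvalue-based measures (graph energy, spectral gap, and Laplacian Estrada index) can be computed together from the adjacency matrix; the following NumPy sketch assumes an undirected weighted graph, so the symmetric eigensolver applies.

```python
import numpy as np

def spectral_measures(W):
    """Graph energy, spectral gap, and Laplacian Estrada index of a
    weighted undirected graph, given its adjacency matrix W."""
    lam = np.sort(np.linalg.eigvalsh(W))[::-1]  # adjacency spectrum, descending
    energy = np.abs(lam).sum()                  # sum of |eigenvalues|
    gap = lam[0] - lam[1]                       # largest minus second-largest
    L = np.diag(W.sum(axis=1)) - W              # graph Laplacian D - W
    mu = np.linalg.eigvalsh(L)                  # Laplacian spectrum
    lee = np.exp(mu).sum()                      # Laplacian Estrada index
    return energy, gap, lee
```

For the complete graph on three nodes (spectrum 2, −1, −1; Laplacian spectrum 0, 3, 3), the energy is 4, the spectral gap is 3, and the Estrada index is 1 + 2e³, which makes a convenient unit test.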
Betweenness centrality: Betweenness centrality is a crucial network measure that quantifies the significance of a node as an intermediary in the flow of information across a graph. In weighted networks, betweenness centrality indicates how often a particular node lies on the shortest paths connecting pairs of other nodes. A high betweenness centrality value suggests that the node plays a vital role in linking different regions of the network, potentially serving as a “hub” for communication. This measure is particularly important for identifying structural or functional nodes that mediate interactions within complex systems 70,80. The betweenness centrality of a node v is defined as:

$$b_v = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}} \qquad (6)$$

where σ_st represents the total number of shortest paths between nodes s and t, and σ_st(v) denotes the number of those paths that pass through node v. The average betweenness centrality can also be computed, serving as a global network measure:

$$\langle b \rangle = \frac{1}{N}\sum_{v=1}^{N} b_v \qquad (7)$$
Jump length: The Markov transition matrix P can be derived from the adjacency matrix W, where each element p_ij signifies the likelihood of transitioning from quantile q_i to q_j. Following the mathematical formulation in 55, the transition probability is computed as:

$$p_{ij} = \frac{w_{ij}}{\sum_{j=1}^{Q} w_{ij}} \qquad (8)$$

Following normalization, each p_ij represents the probability of moving from node q_i to node q_j. As a result, it is possible to perform a random walk on the graph and average over all jumps of length |i − j|. The mean jump length Δ can be quickly calculated from the transition matrix, such that 55:

$$\Delta = \frac{1}{Q}\,\mathrm{tr}\!\left(H P^{T}\right) \qquad (9)$$

where H is a Q × Q matrix with h_ij = |i − j|, Pᵀ is the transpose of P, and tr represents the trace operation.
Data
MNIST-C: The standard Modified National Institute of Standards and Technology (MNIST) dataset comprises 70,000 images, with 60,000 reserved for training and 10,000 for testing. This predefined split facilitates unbiased comparisons between classification methods. Each image consists of 28 × 28 pixels and depicts a handwritten digit from 0 to 9. The MNIST dataset has been employed in numerous image processing studies as a standard reference point for machine learning and deep learning methodologies 81–84. The MNIST-C dataset is an extension of the original MNIST dataset 85,86. It was created by applying 15 different types of corruption to the 70,000 MNIST images, introducing deliberate noise and/or distortions of the original digit shapes to increase the difficulty for predictive models 87–89. The MNIST-C examples in Fig. 6 feature the 15 distinct corruptions, with the “identity” corruption representing no corruption. This study utilizes the MNIST-C dataset to evaluate the proposed QGs method against widely recognized CNNs.
Fig. 6.
Examples from the MNIST-C dataset, which enhances the original MNIST collection by introducing 15 unique types of corruption. The “identity” subset represents the unaltered MNIST digits, while the other variants feature controlled degradations, including noise, blur, contrast reduction, pixelation, and geometric distortions. In this instance, the element is identified as “5”.
Fashion MNIST-C: The Fashion MNIST dataset was created as a more demanding alternative to the traditional MNIST dataset of handwritten digits for testing computer vision capabilities. Like MNIST, it has the same number of training and test samples and the same image resolution; however, it contains images of various clothing items, resulting in greater visual complexity 66,90. We obtained the Fashion MNIST-C dataset by applying the corruptions from the MNIST-C dataset to the Fashion MNIST dataset, as depicted in Fig. 7.
Fig. 7.
Example from the Fashion MNIST-C dataset, which extends the original Fashion MNIST collection by introducing 15 distinct types of corruption. The “identity” subset consists of the unaltered Fashion MNIST clothing items, while the other variants feature controlled degradations, including noise, blur, contrast reduction, pixelation, and geometric distortions. In this instance, the item is labeled as a “T-shirt”.
MRI dataset:
To demonstrate the applicability of the proposed method on a biological dataset, this study utilized a collection of MRI data. The dataset, accessible on the virtual platform Kaggle, comprises 6,400 axial MRI scans of the brain from patients with a range of health conditions related to dementia 91. It is categorized into four groups: 3,200 images from healthy control (HC) individuals, 2,240 from patients with very mild cognitive impairment (VMCI), 896 from those with mild cognitive impairment (MCI), and 64 from patients diagnosed with moderate Alzheimer’s disease (AD). Recent research has analyzed the images within this dataset, leading to the development of precise computational models for disease stage classification and prediction 92–97.
While specific clinical or demographic information about the patient group is not detailed, it is established that participants included both men and women, with no notable age differences observed. The MRI data were acquired using the T1-weighted modality, which effectively emphasizes contrast differences between tissues with significant fluid volumes, such as those found in the brain 98. The brain regions in this dataset are pre-segmented in each image, and all images are standardized to a uniform resolution. Figure 8 displays an example of an axial image from a patient in each group within the dataset 91.
Fig. 8.
Representative T1-weighted MRI scans from patients across each group included in the study. These images exemplify the typical anatomical patterns and intensity variations observed at different diagnostic stages. They emphasize the visual and nuanced differences among individuals experiencing various levels of cognitive impairment.
Results and discussion
Sensitivity analysis
For the sensitivity analysis of the parameters involved in QG mapping, we utilized the MNIST-C and Fashion MNIST-C datasets, with a specific focus on the “identity” group. The study of the number of quantiles Q and the maximum radius r offers valuable insights into the behavior of the proposed QG mapping when applied to these diverse datasets. Figure 9 illustrates the classification accuracy achieved for both the MNIST and Fashion MNIST datasets across various parameter configurations. Notably, accuracy values remain relatively stable across a broad range of Q and r values, with the optimal performance observed at moderate levels for both parameters. The QG representation effectively captures the simple textures and shapes without necessitating fine quantile discretization or extensive neighborhood structures. Practically, this implies that classification with QGs can be conducted efficiently, as increasing the parameter values does not lead to significant enhancements in performance. In light of these findings, for subsequent applications we fixed Q at the value used in the one-dimensional case and selected a moderate r, striking a balance between performance and efficiency.
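The QG mapping studied above can be sketched in a few lines. The function below is an illustrative reconstruction (names and details are our own, not the authors' released code), assuming each pixel is assigned to one of Q intensity quantiles and transitions are counted between pixel pairs within a Chebyshev radius r:

```python
import numpy as np

def quantile_graph(image, Q=4, r=1):
    """Map a grayscale image to a Q x Q quantile transition matrix.

    Illustrative sketch of the QG mapping: each pixel is binned into one
    of Q quantiles of the image's intensity distribution, and a transition
    is counted between the bins of every pixel pair within Chebyshev
    distance r. Rows are normalized to transition probabilities.
    """
    # Interior quantile boundaries from the empirical intensity distribution
    edges = np.quantile(image, np.linspace(0, 1, Q + 1)[1:-1])
    bins = np.digitize(image, edges)  # quantile index (0..Q-1) per pixel
    n_rows, n_cols = image.shape
    W = np.zeros((Q, Q))
    for i in range(n_rows):
        for j in range(n_cols):
            # Visit every neighbor within radius r (Chebyshev neighborhood)
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    if di == 0 and dj == 0:
                        continue
                    ni, nj = i + di, j + dj
                    if 0 <= ni < n_rows and 0 <= nj < n_cols:
                        W[bins[i, j], bins[ni, nj]] += 1
    # Row-normalize to obtain transition probabilities
    row_sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
```

Because the network has only Q nodes regardless of image size, the representation stays compact even as resolution grows, which is the scalability argument made throughout this section.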
Fig. 9.
Classification accuracy as a function of the number of quantiles Q and the maximum radius r for the “identity” corruption of the MNIST-C and Fashion MNIST-C datasets. Observe how the performance of the proposed quantile-graph approach is affected by varying levels of quantile partitioning and adjustments in spatial proximity.
Efficiency analysis
To evaluate the computational efficiency of our proposed method in comparison to the techniques reviewed in the literature, we generated random square images of side length L pixels. Over the course of 100 simulations, we recorded both the runtime and peak memory usage for each approach, including the training and testing phases. The results of this efficiency assessment are presented in Fig. 10, which compares the normalized performance of the classification models.
Fig. 10.
Runtime and peak memory consumption as a function of image size L for the evaluated classification methods, demonstrating how computational cost scales with increasing input resolution and highlighting the differences in algorithmic efficiency and resource requirements among the approaches.
The comparative evaluation of runtime and memory consumption among CNNs, VTs, and QGs reveals that CNNs require less runtime than the other two techniques, while QGs demonstrate better runtime performance than VTs, as illustrated in Fig. 10. This efficiency suggests that QG-based representations can achieve classification with lower computational demands than recent network architectures such as those based on VTs. In terms of memory consumption, Fig. 10 illustrates that CNNs consistently outperform QGs and VTs, requiring significantly less peak memory to process the same set of images, whereas QGs demonstrate a high demand for computational memory. Notably, memory usage can be further reduced by selecting only the most informative topological measures rather than the whole set, as well as by adjusting the parameters Q and r appropriately. These optimizations offer flexibility in balancing accuracy with efficiency, positioning QGs as a scalable tool.
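A minimal way to reproduce this kind of measurement in Python, using only the standard library, is sketched below (this harness is our own illustration, not the benchmarking code used in the paper):

```python
import time
import tracemalloc

def profile(fn, *args, **kwargs):
    """Measure wall-clock runtime and peak traced memory of one call,
    mirroring one of the 100 efficiency simulations described above."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    runtime = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()  # peak bytes since start()
    tracemalloc.stop()
    return result, runtime, peak

# Example: profile a toy workload standing in for a train/test pass
res, secs, peak_bytes = profile(lambda n: sum(i * i for i in range(n)), 10_000)
```

Repeating such a call over the 100 simulations and averaging `secs` and `peak_bytes` yields the normalized curves of the kind shown in Fig. 10.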
MNIST-C
We used images from the MNIST-C dataset to transform grayscale images into QGs using our novel approach. With Q quantiles and r values ranging from 1 to 10, networks comprising Q nodes were generated for each image. For each image, we calculated a total of 960 features, derived from the product of 6 topological measures, the 10 different values of r, and the sum of 18 local measures along with 1 global measure. The transition matrices in Fig. 11 were calculated by applying the QG mapping to the images in Fig. 6 with a radius of 1. Transitions between quantiles occur mostly near the diagonal, indicating connections between neighboring nodes, most notably for corruptions such as “brightness”, “identity”, “rotate”, or “scale”. In contrast, for corruptions such as “glass blur”, “fog”, and “motion blur”, these transition characteristics are disrupted, leading to local smoothing of the transition matrices. These results demonstrate that QGs can effectively capture transitions between pixel values and discern changes in the structural properties of the generated networks when image corruptions are introduced.
Fig. 11.
Transition matrices derived from a representative image of the MNIST-C dataset under various types of corruption, using a neighborhood radius of r = 1. Each matrix illustrates the transition probabilities between quantile levels, highlighting how local gray-level relationships are modified by different forms of image degradation, including noise, blur, and contrast distortion. The resulting structures unveil distinct statistical signatures for each type of corruption, showcasing variations in texture and intensity distribution.
The accuracy values for the MNIST-C dataset are detailed in Table 1, calculated using the complete training set of 60,000 elements. The peak accuracies achieved by CNNs, QGs, and VTs for the “identity” images were 98.50%, 93.90%, and 96.25%, respectively. The methods exhibit differing worst-case performance metrics. For CNNs, the lowest recorded accuracy was 96.28%, occurring on images corrupted by the “translate” manipulation. In comparison, QGs encountered a minimum accuracy of 55.65% on images affected by “fog”, while VTs reached a low of 92.95% on images corrupted with “canny edges”. The CNN and VT architectures used in this study were specifically tailored and fine-tuned for the MNIST dataset; however, it is clear that CNNs consistently achieved higher accuracy rates across all instances presented in Table 1. Notably, QGs demonstrated significant potential in classifying images from this dataset, achieving accuracy levels exceeding 90.00% in various cases, suggesting that this newly proposed technique could be a viable method for image classification and pattern recognition.
Table 1.
Classification accuracy of CNNs, QGs, and VTs on the MNIST-C dataset using the complete training set of 60,000 elements, across all types of corruption, including the “identity” (uncorrupted) subset.
| Corruption | CNNs | QGs | VTs |
|---|---|---|---|
| Brightness | 0.9679 | 0.9401 | 0.9572 |
| Canny edges | 0.9825 | 0.8895 | 0.9295 |
| Dotted line | 0.9819 | 0.8703 | 0.9506 |
| Fog | 0.9703 | 0.5565 | 0.9533 |
| Glass blur | 0.9684 | 0.8735 | 0.9470 |
| Identity | 0.9850 | 0.9390 | 0.9625 |
| Impulse noise | 0.9731 | 0.8588 | 0.9371 |
| Motion blur | 0.9751 | 0.9119 | 0.9638 |
| Rotate | 0.9721 | 0.8700 | 0.9373 |
| Scale | 0.9832 | 0.9340 | 0.9719 |
| Shear | 0.9785 | 0.9166 | 0.9500 |
| Shot noise | 0.9793 | 0.9263 | 0.9431 |
| Spatter | 0.9766 | 0.7817 | 0.9462 |
| Stripe | 0.9809 | 0.9204 | 0.9579 |
| Translate | 0.9628 | 0.8546 | 0.9344 |
| Zigzag | 0.9737 | 0.8350 | 0.9426 |
Figure 12 illustrates the histograms produced during the feature selection phase for each type of image corruption within the MNIST-C dataset. Despite the presence of these corruptions, one family of topological measures consistently emerged as the most significant for distinguishing between the different groups. Other network metrics also exhibited robustness in specific classification scenarios, while the remaining metrics appeared to be less effective in this context. These findings suggest that the dominant measure derived from QGs is a promising variable, as it effectively captures essential spatial patterns with greater resilience than the other metrics.
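As an illustration of how such features can be derived from a QG, the sketch below computes a few simple network measures from a row-stochastic transition matrix. These are hypothetical stand-ins chosen for clarity; the exact set of topological measures used in the paper is defined in its methods:

```python
import numpy as np

def topological_features(P):
    """Compute simple illustrative network measures from a QG transition
    matrix P (row-stochastic, shape Q x Q): weighted out-/in-strengths,
    per-node transition entropy, and the share of self-transitions."""
    out_strength = P.sum(axis=1)   # weighted out-degree per quantile node
    in_strength = P.sum(axis=0)    # weighted in-degree per quantile node
    with np.errstate(divide="ignore", invalid="ignore"):
        # Shannon entropy of each node's outgoing transition distribution
        ent = -np.nansum(np.where(P > 0, P * np.log(P), 0.0), axis=1)
    diag_mass = np.trace(P) / max(P.sum(), 1e-12)  # self-transition share
    return np.concatenate([out_strength, in_strength, ent, [diag_mass]])
```

Concatenating such vectors over all values of r produces a fixed-length feature vector per image, which is what the feature selection phase then ranks.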
Fig. 12.
Histograms of the most significant features selected for classifying the MNIST-C dataset. Each histogram depicts the distribution of topological measures across various types of corruption. The chosen features are those identified as the most discriminative during the feature selection phase, playing a crucial role in enhancing model performance.
The boxplots displayed in Fig. 13 depict the accuracy of each technique across 100 simulated runs. In these simulations, the test set remained constant at 10,000 elements, while a random selection of 30,000 images was drawn from the total training set of 60,000 images. This approach allowed us to evaluate the classification capabilities of CNNs, QGs, and VTs under conditions of limited training data. To ensure consistency in assessment, we utilized the same randomly chosen subset of 30,000 elements for all methods across each simulation. The results illustrated in Fig. 13 are consistent with those shown in Table 1, reinforcing the finding that the “identity” corruption leads to the highest accuracy across all methods, with CNNs generally outperforming QGs and VTs in nearly every instance.
Fig. 13.
Comparison of the classification accuracy obtained by CNNs, QGs, and VTs on the MNIST-C dataset. Each model was trained on an identical randomly selected subset of 30,000 images, and each box illustrates the accuracy calculated from 100 independent simulations to ensure statistical robustness.
Furthermore, the “fog” corruption continues to pose significant challenges for QGs, while both “spatter” and “fog” are particularly difficult for CNNs and VTs, respectively. Notably, the dispersion and interquartile range depicted in the boxplots indicate that although CNNs and/or VTs consistently exceed QGs in most cases, QGs exhibit lower accuracy variance throughout the simulations. This observation is underscored by the fact that the blue boxes representing CNNs and the green boxes for VTs are wider than the red boxes indicative of QGs in each scenario.
Although CNNs and, in some cases, VTs may yield more accurate results, QGs exhibit greater stability and consistency. This suggests that the proposed approach is less susceptible to fluctuations in the selection of training data. Compared to DL architectures, the proposed model’s performance is less reliant on specific training samples, highlighting the potential resilience of QGs in classification tasks.
We used the Wilcoxon signed-rank test to statistically compare the three evaluated techniques across different types of corruption. Based on the values illustrated in the boxplots in Fig. 13, we created Table 2, which presents the p-values obtained from the statistical test for each pair of techniques across various forms of corruption. The analysis indicates that the differences among the methods are not consistently distributed across the corruptions. Notably, the comparison between CNNs and QGs shows no statistically significant differences for a substantial number of corruptions, with p-values remaining above 0.1. Similarly, the comparison between CNNs and VTs reveals no significant statistical differences in most instances.
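The paired comparison described above can be reproduced with `scipy.stats.wilcoxon`. In the sketch below the accuracy lists are toy placeholders standing in for the 100 per-simulation accuracies behind Fig. 13:

```python
from scipy.stats import wilcoxon

# Paired accuracies from the same simulated training subsets
# (toy values; the real inputs are 100 accuracies per method).
acc_cnn = [0.970, 0.968, 0.972, 0.969, 0.971, 0.967, 0.973, 0.970]
acc_qg = [0.935, 0.938, 0.934, 0.939, 0.936, 0.937, 0.933, 0.940]

# Two-sided Wilcoxon signed-rank test on the paired differences;
# a small p-value indicates a systematic accuracy difference.
stat, p_value = wilcoxon(acc_cnn, acc_qg)
```

Because the same random subset is used by every method in each simulation, the samples are properly paired, which is the assumption the signed-rank test requires.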
Table 2.
P-values derived from the Wilcoxon signed-rank test conducted on the MNIST-C dataset, utilizing 30,000 randomly selected samples across 100 independent simulations. The statistical analysis employed to compare the performance distributions of the assessed classification methods—CNNs, QGs, and VTs—under uniform training conditions reveals statistically significant differences in accuracy among paired methods, offering a thorough evaluation of the comparative effectiveness of each approach.
| Corruption | QGs vs VTs | CNNs vs VTs | CNNs vs QGs |
|---|---|---|---|
| Brightness | 0.8712 | 0.5978 | 0.7456 |
| Canny edges | 0.3817 | 0.9838 | 0.5837 |
| Dotted line | 0.0767 | 0.6408 | 0.2054 |
| Fog | 0.4521 | 0.0817 | 0.0635 |
| Glass blur | 0.8078 | 0.6701 | 0.3387 |
| Identity | 0.0001 | 0.0001 | 0.7151 |
| Impulse noise | 0.7610 | 0.0727 | 0.4044 |
| Motion blur | 0.0105 | 0.0164 | 0.9032 |
| Rotate | 0.2366 | 0.9676 | 0.1579 |
| Scale | 0.6120 | 0.4771 | 0.8552 |
| Shear | 0.0001 | 0.1094 | 0.0001 |
| Shot noise | 0.2710 | 0.6701 | 0.5158 |
| Spatter | 0.0919 | 0.3598 | 0.9676 |
| Stripe | 0.3598 | 0.5027 | 0.5425 |
| Translate | 0.8393 | 0.2366 | 0.0840 |
| Zigzag | 0.3085 | 0.7921 | 0.1839 |
Similarly, Fig. 14 displays boxplots showing the accuracy achieved by each technique over 100 simulations using a subset of 10,000 images randomly selected from the MNIST-C training set. As observed in Fig. 13, the boxplots in Fig. 14 reveal a larger spread and interquartile range for CNNs and VTs than for QGs, suggesting that QGs provide more stable and consistent results. Unlike the previous analysis with 30,000 training elements, QGs outperformed CNNs in some cases, including “brightness”, “motion blur”, “scale”, “shear”, and “translate”, indicating that QGs may offer greater robustness than CNNs and VTs when working with limited training data.
Fig. 14.
Comparison of the classification accuracy obtained by CNNs, QGs, and VTs on the MNIST-C dataset. Each model was trained on an identical randomly selected subset of 10,000 images, and each box illustrates the accuracy calculated from 100 independent simulations to ensure statistical robustness.
The statistical analysis shown in Table 3 demonstrates significant differences among all methods across most types of corruption. Overall, the p-values underscore the significance of QGs in comparison to both CNNs and VTs, reinforcing the assertion that QGs exhibit greater robustness in scenarios with limited training data.
Table 3.
P-values derived from the Wilcoxon signed-rank test conducted on the MNIST-C dataset, utilizing 10,000 randomly selected samples across 100 independent simulations. The statistical analysis employed to compare the performance distributions of the assessed classification methods—CNNs, QGs, and VTs—under uniform training conditions reveals statistically significant differences in accuracy among paired methods, offering a thorough evaluation of the comparative effectiveness of each approach.
| Corruption | QGs vs VTs | CNNs vs VTs | CNNs vs QGs |
|---|---|---|---|
| Brightness | 0.0039 | 0.0039 | 0.7456 |
| Canny edges | 0.0136 | 0.0019 | 0.5837 |
| Dotted line | 0.0039 | 0.0039 | 0.2054 |
| Fog | 0.3222 | 0.0018 | 0.0635 |
| Glass blur | 0.0097 | 0.0019 | 0.3387 |
| Identity | 0.4316 | 0.0195 | 0.7151 |
| Impulse noise | 0.0019 | 0.0020 | 0.4044 |
| Motion blur | 0.2324 | 0.0019 | 0.9032 |
| Rotate | 0.0058 | 0.0136 | 0.1579 |
| Scale | 0.0019 | 0.0019 | 0.8552 |
| Shear | 0.0021 | 0.0018 | 0.0001 |
| Shot noise | 0.0195 | 0.0019 | 0.5158 |
| Spatter | 0.0136 | 0.0018 | 0.9676 |
| Stripe | 0.1308 | 0.0021 | 0.5425 |
| Translate | 0.0097 | 0.0018 | 0.0840 |
| Zigzag | 0.1601 | 0.0039 | 0.1839 |
Fashion MNIST-C
The images from the Fashion MNIST-C dataset were mapped into QGs using the same settings as for MNIST-C (networks of Q nodes and r values ranging from 1 to 10), resulting in the calculation of 960 features for each image. Figure 15 illustrates the transition matrices obtained for the images presented in Fig. 7 with a radius of 1. In contrast to the results observed for the MNIST-C dataset, the plots in Fig. 15 exhibit smoother and less concentrated transitions near the diagonal, even in the “identity” case. These outcomes are consistent with the characteristics of the mapped images: while the handwritten digits in MNIST-C exhibit sharp, well-defined transitions between pixel intensities, the clothing items in Fashion MNIST-C feature a wider variety of shapes and more gradual intensity transitions. Once again, corruptions such as “rotate” or “translate” result in minimal visible changes in the transition matrices, whereas corruptions like “fog”, “glass blur”, or “shot noise” significantly disrupt the visual patterns, thereby altering the properties of the generated networks.
Fig. 15.
Transition matrices derived from a representative image of the Fashion MNIST-C dataset under various types of corruption, using a neighborhood radius of r = 1. Each matrix illustrates the transition probabilities between quantile levels, highlighting how local gray-level relationships are modified by different forms of image degradation, including noise, blur, and contrast distortion. The resulting structures unveil distinct statistical signatures for each type of corruption, showcasing variations in texture and intensity distribution.
Table 4 presents the accuracy values obtained for the Fashion MNIST-C dataset using the complete training set of 60,000 samples. As expected, due to the increased complexity of the input images compared to those in MNIST-C, all techniques demonstrated superior performance on the MNIST-C dataset relative to Fashion MNIST-C. Specifically, CNNs, QGs, and VTs achieved peak accuracies of 86.93%, 85.59%, and 85.77%, respectively. Consistent with the results observed for MNIST-C, CNNs achieved their lowest accuracy with the “translate” corruption at 80.93%, whereas QGs reached their lowest accuracy with the “fog” corruption at 65.67%. VTs achieved their lowest accuracy of 78.91% under “impulse noise” corruption. Notably, in several cases, QGs surpassed both CNNs and/or VTs in classifying images from Fashion MNIST-C. The CNN and VT architectures utilized in this study were specifically designed for classifying handwritten digits in the MNIST dataset; however, they reveal limitations when applied to clothing items in the Fashion MNIST dataset. These observations underscore the sensitivity of these network architectures to parameter selection and structural design, ultimately impacting their reliability across diverse datasets.
Table 4.
Classification accuracy of CNNs, QGs, and VTs on the Fashion MNIST-C dataset using the complete training set of 60,000 elements, across all types of corruption, including the “identity” (uncorrupted) subset.
| Corruption | CNNs | QGs | VTs |
|---|---|---|---|
| Brightness | 0.8342 | 0.8508 | 0.8305 |
| Canny edges | 0.8424 | 0.8039 | 0.8061 |
| Dotted line | 0.8568 | 0.8335 | 0.8306 |
| Fog | 0.8239 | 0.6567 | 0.8164 |
| Glass blur | 0.8347 | 0.8216 | 0.8209 |
| Identity | 0.8677 | 0.8559 | 0.8493 |
| Impulse noise | 0.8261 | 0.8119 | 0.7891 |
| Motion blur | 0.8268 | 0.7992 | 0.8412 |
| Rotate | 0.8434 | 0.8051 | 0.8316 |
| Scale | 0.8693 | 0.8387 | 0.8577 |
| Shear | 0.8350 | 0.8146 | 0.8230 |
| Shot noise | 0.8244 | 0.8138 | 0.8041 |
| Spatter | 0.8439 | 0.7881 | 0.8251 |
| Stripe | 0.8627 | 0.8553 | 0.8297 |
| Translate | 0.8093 | 0.8089 | 0.8188 |
| Zigzag | 0.8512 | 0.8049 | 0.8276 |
Figure 16 illustrates the histograms from the feature selection phase for each type of image corruption within the Fashion MNIST-C dataset. Unlike the analysis conducted for MNIST-C, a different subset of topological measures was often identified as the most significant, with the measure that dominated the MNIST-C analysis once again standing out as a noteworthy example. These results indicate that variations in textures and shapes present across images may be more effectively captured by distinct network metrics, underscoring the adaptability of the proposed methodology.
Fig. 16.
Histograms of the most significant features selected for classifying the Fashion MNIST-C dataset. Each histogram depicts the distribution of topological measures across various types of corruption. The chosen features are those identified as the most discriminative during the feature selection phase, playing a crucial role in enhancing model performance.
The boxplots displayed in Fig. 17 depict the accuracy of each technique across 100 simulated runs. In these simulations, the test set remained constant at 10,000 elements, while a random selection of 30,000 images was drawn from a total training set of 60,000 images. The results indicate that QGs consistently outperformed VTs across almost all scenarios, with the medians of the red boxes surpassing those of the green boxes, except in the cases of “fog”, “motion blur”, and “spatter”. Notably, all techniques achieved their highest accuracy with the “identity” corruption, while the lowest accuracy was recorded with “fog” for both QGs and VTs and with “translate” for CNNs. Moreover, QGs demonstrate lower variance and greater consistency throughout the simulations.
Fig. 17.
Comparison of the classification accuracy obtained by CNNs, QGs, and VTs on the Fashion MNIST-C dataset. Each model was trained on an identical randomly selected subset of 30,000 images, and each box illustrates the accuracy calculated from 100 independent simulations to ensure statistical robustness.
The Wilcoxon test yielded the p-values presented in Table 5, indicating significant differences in performance across various corruption conditions. Overall, the analysis demonstrates that CNNs and QGs display notable discrepancies under the majority of corruptions, with p-values frequently falling below 0.01, particularly in the case of “fog”. Regarding “identity” corruption, the values in Table 5 illustrate that QGs outperform CNNs in terms of classification performance. These results suggest a consistent performance advantage of one method over the other in specific scenarios. Similarly, the comparison between QGs and VTs reveals strong statistical differences across nearly all instances, further emphasizing the robustness of the proposed method.
Table 5.
P-values derived from the Wilcoxon signed-rank test conducted on the Fashion MNIST-C dataset, utilizing 30,000 randomly selected samples across 100 independent simulations. The statistical analysis employed to compare the performance distributions of the assessed classification methods—CNNs, QGs, and VTs—under uniform training conditions reveals statistically significant differences in accuracy among paired methods, offering a thorough evaluation of the comparative effectiveness of each approach.
| Corruption | QGs vs VTs | CNNs vs VTs | CNNs vs QGs |
|---|---|---|---|
| Brightness | 0.0098 | 0.1309 | 0.0039 |
| Canny edges | 0.0018 | 0.1309 | 0.0018 |
| Dotted line | 0.0098 | 0.0644 | 0.0019 |
| Fog | 0.0644 | 0.0098 | 0.0017 |
| Glass blur | 0.0039 | 0.1309 | 0.0039 |
| Identity | 0.0644 | 0.1055 | 0.0039 |
| Impulse noise | 0.0039 | 0.9219 | 0.0098 |
| Motion blur | 0.0019 | 0.0644 | 0.0039 |
| Rotate | 0.0019 | 0.6250 | 0.0059 |
| Scale | 0.0195 | 0.8457 | 0.0371 |
| Shear | 0.0488 | 0.0195 | 0.0039 |
| Shot noise | 0.1934 | 0.1602 | 0.0273 |
| Spatter | 0.1309 | 0.1934 | 0.0098 |
| Stripe | 0.0273 | 0.0059 | 0.0273 |
| Translate | 0.0644 | 0.4922 | 0.0039 |
| Zigzag | 0.0098 | 0.1934 | 0.0098 |
Figure 18 presents boxplots that demonstrate the accuracy of each method across 100 simulations, utilizing a randomly selected subset of 10,000 images from the Fashion MNIST-C training set. As before, the boxplots for CNNs and VTs exhibit a greater spread and interquartile range than those for QGs, suggesting that QGs yield more stable and consistent results. Notably, QGs outperform both CNNs and VTs in all cases, indicating that QGs offer enhanced stability, particularly in scenarios with limited training data availability.
Fig. 18.
Comparison of the classification accuracy obtained by CNNs, QGs, and VTs on the Fashion MNIST-C dataset. Each model was trained on an identical randomly selected subset of 10,000 images, and each box illustrates the accuracy calculated from 100 independent simulations to ensure statistical robustness.
Table 6 presents the results of the statistical test. The p-values indicate a strong statistical difference in the performance of all techniques in many instances. This demonstrates that the proposed approach shows improvements compared to the other network architectures.
Table 6.
P-values derived from the Wilcoxon signed-rank test conducted on the Fashion MNIST-C dataset, utilizing 10,000 randomly selected samples across 100 independent simulations. The statistical analysis employed to compare the performance distributions of the assessed classification methods—CNNs, QGs, and VTs—under uniform training conditions reveals statistically significant differences in accuracy among paired methods, offering a thorough evaluation of the comparative effectiveness of each approach.
| Corruption | QGs vs VTs | CNNs vs VTs | CNNs vs QGs |
|---|---|---|---|
| Brightness | 0.0839 | 0.0018 | 0.0042 |
| Canny edges | 0.1024 | 0.0019 | 0.0039 |
| Dotted line | 0.1053 | 0.0117 | 0.0136 |
| Fog | 0.0058 | 0.0039 | 0.0644 |
| Glass blur | 0.0195 | 0.0020 | 0.0839 |
| Identity | 0.1054 | 0.0019 | 0.0371 |
| Impulse noise | 0.0039 | 0.0021 | 0.0058 |
| Motion blur | 0.2753 | 0.0136 | 0.0195 |
| Rotate | 0.0058 | 0.0019 | 0.0029 |
| Scale | 0.0019 | 0.0218 | 0.1933 |
| Shear | 0.0020 | 0.0018 | 0.0097 |
| Shot noise | 0.1054 | 0.0058 | 0.0018 |
| Spatter | 0.6953 | 0.0019 | 0.0019 |
| Stripe | 0.0644 | 0.0019 | 0.0020 |
| Translate | 0.0371 | 0.0020 | 0.0097 |
| Zigzag | 0.4316 | 0.0195 | 0.0058 |
Comparative analysis
We present a summary of the comparative findings obtained from the classification experiments on the benchmark datasets. Table 7 offers an overview of the key characteristics of QGs, CNNs, and VTs in the context of image classification tasks. As illustrated, QGs provide an interpretable and data-efficient alternative, effectively representing image structures without the need for extensive training or significant computational resources. However, their performance may be constrained by sensitivity to noise and complex patterns. Conversely, CNNs and VTs exhibit superior capabilities in learning hierarchical and abstract features, benefitting from extensive training datasets, pre-training, and optimization techniques. Nevertheless, these deep learning approaches require greater computational power and often lack interpretability.
Table 7.
Qualitative comparison among CNNs, QGs, and VTs in image classification tasks, highlighting their main conceptual and practical differences in terms of interpretability, computational efficiency, and data requirements.
| Method | Advantages | Disadvantages |
|---|---|---|
| QGs | Interpretable; data-efficient; no extensive training or significant computational resources required | Performance may be constrained by sensitivity to noise and complex patterns |
| CNNs | Learn hierarchical and abstract features; benefit from large datasets, pre-training, and optimization | Require greater computational power; limited interpretability |
| VTs | Learn hierarchical and abstract features; benefit from large datasets, pre-training, and optimization | Require greater computational power; limited interpretability |
MRI dataset
Based on the relationship between the number of pixels and the number of quantiles, each MRI scan was mapped into a network whose number of nodes is determined by the chosen quantile partitioning. We computed topological measures for each generated network and then performed a feature selection process that retained the 600 most informative features for both training and testing. In this context, we utilized k-fold cross-validation to demonstrate the robustness and generalization capabilities of our model. For each simulation, we used 640 MRI scans for testing and 5,760 images for training. Classification was executed using 10-fold cross-validation, and confusion matrices were produced for each task. To address the significant class imbalance among the groups, we incorporated additional performance metrics—specifically precision, F1-score, and recall—to ensure a more comprehensive evaluation of the classification results. The outcomes of our proposed method for MRI image classification are summarized in Table 8, while Table 9 presents the evaluation metrics derived from this approach.
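A hedged sketch of this evaluation protocol is given below, with synthetic data, an arbitrary classifier, and illustrative dimensions standing in for the real QG features, the 600-feature selection, and the model actually used:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy stand-ins for the QG feature matrix and the four diagnostic labels;
# the real inputs are the topological measures computed per MRI scan.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 4, size=200)

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
accs = []
for train_idx, test_idx in skf.split(X, y):
    # Keep the k most informative features, fitted on the training fold only
    selector = SelectKBest(f_classif, k=20).fit(X[train_idx], y[train_idx])
    clf = LogisticRegression(max_iter=1000).fit(
        selector.transform(X[train_idx]), y[train_idx])
    y_pred = clf.predict(selector.transform(X[test_idx]))
    accs.append(accuracy_score(y[test_idx], y_pred))
mean_acc = float(np.mean(accs))
```

Fitting the feature selector inside each fold, as above, avoids leaking test-fold information into the selection step.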
Table 8.
Confusion matrix for the classification of MRI scans using the proposed QGs-based approach, detailing the number of samples accurately and inaccurately classified across each diagnostic category. The diagonal elements represent correct classifications, and the off-diagonal values highlight instances of misclassification between categories.
| True/Pred | HC | VMCI | MCI | AD |
|---|---|---|---|---|
| HC | 2877 | 300 | 23 | 0 |
| VMCI | 438 | 1772 | 30 | 0 |
| MCI | 135 | 222 | 539 | 0 |
| AD | 6 | 27 | 20 | 11 |
Table 9.
Evaluation metrics for the classification of MRI scans using the proposed QGs-based approach encompass standard indicators such as accuracy, precision, recall, and F1-score. These metrics offer a comprehensive and rigorous assessment of the model’s predictive performance, highlighting its effectiveness in balancing sensitivity and specificity across all diagnostic categories.
| Metric | Value |
|---|---|
| Accuracy | 0.8123 |
| Precision | 0.8167 |
| F1-score | 0.8072 |
| Recall | 0.8123 |
The proposed method for classifying images from the analyzed database yielded notable results, with all of the model’s evaluation metrics exceeding 80%, indicating strong classification performance. An examination of the confusion matrix further underscored the method’s effectiveness, not only in distinguishing between various stages of Alzheimer’s disease in MRI scans but also in differentiating images of healthy individuals from those of patients with very mild dementia. This capability is particularly significant for the early detection of the disease.
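The figures in Table 9 follow directly from the confusion matrix in Table 8. The sketch below recomputes them, assuming support-weighted averaging of the per-class scores (the standard convention for imbalanced multi-class reporting):

```python
import numpy as np

# Confusion matrix from Table 8 (rows: true class, columns: predicted class)
C = np.array([
    [2877,  300,  23,  0],   # HC
    [ 438, 1772,  30,  0],   # VMCI
    [ 135,  222, 539,  0],   # MCI
    [   6,   27,  20, 11],   # AD
])

support = C.sum(axis=1)                    # true samples per class
accuracy = np.trace(C) / C.sum()
recall_c = np.diag(C) / support            # per-class recall (sensitivity)
precision_c = np.diag(C) / C.sum(axis=0)   # per-class precision
f1_c = 2 * precision_c * recall_c / (precision_c + recall_c)

# Support-weighted averages, matching the reporting in Table 9
w = support / support.sum()
precision = float(w @ precision_c)
recall = float(w @ recall_c)
f1 = float(w @ f1_c)
```

Note that with support weighting, the weighted recall coincides with overall accuracy, which is why Table 9 reports the same value (0.8123) for both.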
Conclusions
In this study, we introduce an extension of quantile graphs (QGs), which have traditionally been used to map time series into complex networks, for the purpose of transforming grayscale images into complex networks. The primary contributions of this work include the proposal of a novel methodology for image pattern recognition, which is made available for open use and further investigation. We extracted QGs from two standard benchmark datasets, MNIST and Fashion MNIST, and subsequently utilized topological measures as features for image classification. Our analysis demonstrates that QGs are less computationally intensive in terms of runtime than VT-based architectures, making them more applicable for real-world data applications. We compared the results obtained from QGs against those generated by convolutional neural networks (CNNs) and vision transformers (VTs), both of which are well-known for their outstanding performance in computer vision tasks. While our findings indicate that CNNs and VTs typically achieve higher accuracy, the newly introduced QGs show remarkable performance in specific scenarios, particularly when training data is limited.
In all cases, QGs exhibit a higher level of consistency, indicating they are less affected by the choice of training components compared to CNNs and VTs, making them less prone to overfitting. However, like most models, QGs have their limitations. Potential failure cases can arise in scenarios characterized by high intra-class variability, significant background noise, or overlapping structural features among different classes. In these instances, the quantile partitioning process may struggle to accurately depict the underlying structural or textural distinctions within the data. As a result, the generated graph representations might not effectively capture subtle yet crucial discriminative patterns, leading to decreased separability between classes and, ultimately, an increased risk of misclassification. It is essential to acknowledge these limitations to guide future improvements and the development of hybrid strategies.
While this is a pilot study, and further evaluation of parameters and features is warranted, our findings indicate that the proposed QGs in higher-dimensional spaces provide a robust method for feature extraction from grayscale images, offering an alternative approach to image analysis and classification. Furthermore, the proposed method could be improved by integrating additional topological measures, exploring variations in pixel neighborhood structures, and employing more advanced feature selection techniques. Future research will focus on applying this method to other real-world datasets, as well as adapting the proposed methodology to accommodate RGB-colored images and 3D volumetric images, which can be achieved with minor adjustments to our existing approach.
Acknowledgements
We thank Luís A. N. Amaral and the Amaral lab members at Northwestern University (Evanston, USA) for their valuable comments and suggestions. M. L. Vicchietti acknowledges the support of Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), grant 88887.602913/2021-00. A. S. L. O. Campanharo acknowledges the support of Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), grant 2023/06563-9.
Author contributions
Conceptualization: M.L.V., F.M.R., A.S.L.O.C. Formal analysis: M.L.V., F.M.R., A.S.L.O.C. Funding acquisition: A.S.L.O.C. Investigation: M.L.V., F.M.R., A.S.L.O.C. Methodology: M.L.V., F.M.R., A.S.L.O.C. Resources: A.S.L.O.C. Software: M.L.V., A.S.L.O.C. Supervision: F.M.R., A.S.L.O.C. Validation: M.L.V., F.M.R., A.S.L.O.C. Writing: M.L.V., F.M.R., A.S.L.O.C.
Funding
Funding provided by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP).
Data availability
The datasets analyzed during the current study are freely available for download from the GitHub, Zenodo, and Kaggle repositories at https://github.com/zalandoresearch/fashion-mnist/tree/master/data, https://zenodo.org/records/3239543, and https://kaggle.com/datasets/marcopinamonti/alzheimer-mri-4-classes-dataset, respectively.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Sahoo, S. K. & Goswami, S. S. A comprehensive review of multiple criteria decision-making (MCDM) methods: Advancements, applications, and future directions. Decis. Mak. Adv.1, 25–48 (2023). [Google Scholar]
- 2.Sarker, I. H. Data science and analytics: An overview from data-driven smart computing, decision-making and applications perspective. SN Comput. Sci.2, 377 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Sarker, I. H. Machine learning: Algorithms, real-world applications and research directions. SN Comput. Sci.2, 160 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Gibert, D., Mateu, C. & Planes, J. The rise of machine learning for detection and classification of malware: Research developments, trends and challenges. J. Netw. Comput. Appl.153, 102526 (2020). [Google Scholar]
- 5.Pugliese, R., Regondi, S. & Marini, R. Machine learning-based approach: Global trends, research directions, and regulatory standpoints. Data Sci. Manag.4, 19–29 (2021). [Google Scholar]
- 6.Rana, M. & Bhushan, M. Machine learning and deep learning approach for medical image analysis: diagnosis to detection. Multimed. Tools Appl.82, 26731–26769 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Jiao, Z., Hu, P., Xu, H. & Wang, Q. Machine learning and deep learning in chemical health and safety: a systematic review of techniques and applications. ACS Chem. Health Saf.27, 316–334 (2020). [Google Scholar]
- 8.Mahadevkar, S. V. et al. A review on machine learning styles in computer vision-techniques and future directions. IEEE Access10, 107293–107329 (2022). [Google Scholar]
- 9.Mahmud, M., Kaiser, M. S., McGinnity, T. M. & Hussain, A. Deep learning in mining biological data. Cognit. Comput.13, 1–33 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature521, 436–444 (2015). [DOI] [PubMed] [Google Scholar]
- 11.Ng, K. K., Chen, C.-H., Lee, C. K., Jiao, J. R. & Yang, Z.-X. A systematic literature review on intelligent automation: Aligning concepts from theory, practice, and future perspectives. Adv. Eng. Inform.47, 101246 (2021). [Google Scholar]
- 12.Ventola, C. L. Mobile devices and apps for health care professionals: Uses and benefits. Pharm. Ther.39, 356 (2014). [PMC free article] [PubMed] [Google Scholar]
- 13.Chen, L. et al. Review of image classification algorithms based on convolutional neural networks. Remote Sens.13, 4712 (2021). [Google Scholar]
- 14.Madabhushi, A. & Lee, G. Image analysis and machine learning in digital pathology: Challenges and opportunities. Med. Image Anal.33, 170–175 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Zhou, L., Pan, S., Wang, J. & Vasilakos, A. V. Machine learning on big data: Opportunities and challenges. Neurocomputing237, 350–361 (2017). [Google Scholar]
- 16.Dargan, S., Kumar, M., Ayyagari, M. R. & Kumar, G. A survey of deep learning and its applications: A new paradigm to machine learning. Arch. Comput. Methods Eng.27, 1071–1092 (2020). [Google Scholar]
- 17.Maxwell, A. E., Warner, T. A. & Fang, F. Implementation of machine-learning classification in remote sensing: An applied review. Int. J. Remote Sens.39, 2784–2817 (2018). [Google Scholar]
- 18.Zhang, J., Yin, Z., Chen, P. & Nichele, S. Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review. Inf. Fusion59, 103–126 (2020). [Google Scholar]
- 19.Kurakin, A., Goodfellow, I. J. & Bengio, S. Adversarial examples in the physical world. In Artificial Intelligence Safety and Security. 99–112 (Chapman and Hall/CRC, 2018).
- 20.Zhang, J. & Li, C. Adversarial examples: Opportunities and challenges. IEEE Trans. Neural Netw. Learn. Syst.31, 2578–2593 (2019). [DOI] [PubMed] [Google Scholar]
- 21.Snider, E. J., Hernandez-Torres, S. I. & Boice, E. N. An image classification deep-learning algorithm for shrapnel detection from ultrasound images. Sci. Rep.12, 8427 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Tchito Tchapga, C. et al. Biomedical image classification in a big data architecture using machine learning algorithms. J. Healthc. Eng.2021, 1–11 (2021). [DOI] [PMC free article] [PubMed]
- 23.Zhou, N.-R., Liu, X.-X., Chen, Y.-L. & Du, N.-S. Quantum K-nearest-neighbor image classification algorithm based on KL transform. Int. J. Theor. Phys.60, 1209–1224 (2021). [Google Scholar]
- 24.Khorshid, S. F. & Abdulazeez, A. M. Breast cancer diagnosis based on k-nearest neighbors: A review. PalArch’s J. Archaeol. Egypt/Egyptol.18, 1927–1951 (2021). [Google Scholar]
- 25.Pei, L., Li, Z. & Liu, J. Texture classification based on image (natural and horizontal) visibility graph constructing methods. Chaos Interdiscip. J. Nonlinear Sci.31 (2021). [DOI] [PubMed]
- 26.Rashid, T. & Mokji, M. M. Low-resolution image classification of cracked concrete surface using decision tree technique. In Control, Instrumentation and Mechatronics: Theory and Practice. 641–649 (Springer, 2022).
- 27.Sheykhmousa, M. et al. Support vector machine versus random forest for remote sensing image classification: A meta-analysis and systematic review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.13, 6308–6325 (2020). [Google Scholar]
- 28.Tripathi, M. Analysis of convolutional neural network based image classification techniques. J. Innov. Image Process. (JIIP)3, 100–117 (2021). [Google Scholar]
- 29.Maurício, J., Domingues, I. & Bernardino, J. Comparing vision transformers and convolutional neural networks for image classification: A literature review. Appl. Sci.13, 5521 (2023). [Google Scholar]
- 30.Li, Z., Liu, F., Yang, W., Peng, S. & Zhou, J. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst.33, 6999–7019 (2021). [DOI] [PubMed] [Google Scholar]
- 31.Wang, Y., Deng, Y., Zheng, Y., Chattopadhyay, P. & Wang, L. Vision transformers for image classification: A comparative survey. Technologies13, 32 (2025). [Google Scholar]
- 32.Han, K. et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell.45, 87–110 (2022). [DOI] [PubMed] [Google Scholar]
- 33.Hassanzadeh, T., Essam, D. & Sarker, R. EvoDCNN: An evolutionary deep convolutional neural network for image classification. Neurocomputing488, 271–283 (2022). [Google Scholar]
- 34.Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R. & Bengio, Y. Binarized neural networks. Adv. Neural Inf. Process. Syst.29 (2016).
- 35.Hur, T., Kim, L. & Park, D. K. Quantum convolutional neural network for classical data classification. Quantum Mach. Intell.4, 3 (2022). [Google Scholar]
- 36.Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R. & Bengio, Y. Quantized neural networks: Training neural networks with low precision weights and activations. J. Mach. Learn. Res.18, 1–30 (2018). [Google Scholar]
- 37.Klus, S. & Gelß, P. Tensor-based algorithms for image classification. Algorithms12, 240 (2019). [Google Scholar]
- 38.Taherkhani, A., Cosma, G. & McGinnity, T. M. AdaBoost-CNN: An adaptive boosting algorithm for convolutional neural networks to classify multi-class imbalanced datasets using transfer learning. Neurocomputing404, 351–366 (2020). [Google Scholar]
- 39.Cai, L., Gao, J. & Zhao, D. A review of the application of deep learning in medical image classification and segmentation. Ann. Transl. Med.8 (2020). [DOI] [PMC free article] [PubMed]
- 40.Chandra, M. A. & Bedi, S. Survey on SVM and their application in image classification. Int. J. Inf. Technol.13, 1–11 (2021). [Google Scholar]
- 41.Xu, Y. et al. Graph-based deep learning for image classification: a survey. IMA J. Appl. Math.10.1002/ima.70090 (2023). [Google Scholar]
- 42.Zhang, W. & Li, M. Deep learning approaches for efficient and robust image classification. Int. J. Mach. Learn. Cybern.10.1007/s13042-025-02629-6 (2025). [Google Scholar]
- 43.Iacovacci, J. & Lacasa, L. Visibility graphs for image processing. IEEE Trans. Pattern Anal. Mach. Intell.42, 974–987 (2019). [DOI] [PubMed] [Google Scholar]
- 44.Lacasa, L. & Iacovacci, J. Visibility graphs of random scalar fields and spatial data. Phys. Rev. E96, 012318 (2017). [DOI] [PubMed] [Google Scholar]
- 45.Barabási, A.-L. The Science of Networks (Perseus, 2012). [Google Scholar]
- 46.Newman, M. E. The structure and function of networks. Comput Phys. Commun.147, 40–45 (2002). [Google Scholar]
- 47.Reichardt, J. Introduction to complex networks. In Structure in Complex Networks. 1–11 (Springer, 2009).
- 48.Albert, R. & Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys.74, 47 (2002). [Google Scholar]
- 49.Costa, L. D. F., Rodrigues, F. A., Travieso, G. & Villas Boas, P. R. Characterization of complex networks: A survey of measurements. Adv. Phys.56, 167–242 (2007).
- 50.Gao, Z.-K., Small, M. & Kurths, J. Complex network analysis of time series. Europhys. Lett.116, 50001 (2017). [Google Scholar]
- 51.Lacasa, L., Luque, B., Ballesteros, F., Luque, J. & Nuno, J. C. From time series to complex networks: The visibility graph. Proc. Natl. Acad. Sci.105, 4972–4975 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 52.Zou, Y., Donner, R. V., Marwan, N., Donges, J. F. & Kurths, J. Complex network approaches to nonlinear time series analysis. Phys. Rep.787, 1–97 (2019). [Google Scholar]
- 53.Pal, R., Kumar, S. & Singh, M. K. Topological data analysis and image visibility graph for texture classification. Int. J. Syst. Assur. Eng. Manag. 1–11 (2024).
- 54.Wen, T., Chen, H. & Cheong, K. H. Visibility graph for time series prediction and image classification: a review. Nonlinear Dyn.110, 2979–2999 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Campanharo, A. S. & Ramos, F. M. Hurst exponent estimation of self-affine time series using quantile graphs. Phys. A Stat. Mech. Appl.444, 43–48 (2016). [Google Scholar]
- 56.Campanharo, A. S., Sirer, M. I., Malmgren, R. D., Ramos, F. M. & Amaral, L. A. N. Duality between time series and networks. PLoS ONE 6, e23378 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Campanharo, A. S., Doescher, E. & Ramos, F. M. Application of quantile graphs to the automated analysis of EEG signals. Neural Process. Lett.52, 5–20 (2020). [Google Scholar]
- 58.Pineda, A. M. et al. Analysis of quantile graphs in ECG data from elderly and young individuals using machine learning and deep learning. J. Complex Netw.11, cnad030 (2023). [Google Scholar]
- 59.Pineda, A. M., Ramos, F. M., Betting, L. E. & Campanharo, A. S. Quantile graphs for EEG-based diagnosis of Alzheimer’s disease. PLoS ONE 15, e0231169 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Vicchietti, M. L., Ramos, F. M., Betting, L. E. & Campanharo, A. S. Computational methods of EEG signals analysis for Alzheimer’s disease classification. Sci. Rep.13, 8184 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Vicchietti, M. L. Code from: Pattern and structural detection in grayscale images through the application of quantile graphs in higher-dimensional spaces (2025). https://github.com/MarioVicchietti/2D_QGs.
- 62.Gu, J. et al. Recent advances in convolutional neural networks. Pattern Recognit.77, 354–377 (2018). [Google Scholar]
- 63.Rawat, W. & Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput.29, 2352–2449 (2017). [DOI] [PubMed] [Google Scholar]
- 64.Chollet, F. et al. Building autoencoders in Keras. In The Keras Blog. Vol. 14 (2016).
- 65.Mehrani, P. & Tsotsos, J. K. Self-attention in vision transformers performs perceptual grouping, not attention. Front. Comput. Sci.5, 1178450 (2023). [Google Scholar]
- 66.Kadam, S. S., Adamuthe, A. C. & Patil, A. B. CNN model for image classification on MNIST and Fashion-MNIST dataset. J. Sci. Res.64, 374–384 (2020). [Google Scholar]
- 67.Bbouzidi, S., Hcini, G., Jdey, I. & Drira, F. Convolutional neural networks and vision transformers for Fashion MNIST classification: A literature review. arXiv preprint arXiv:2406.03478 (2024).
- 68.Stanić, Z. Graphs with small spectral gap. Electron. J. Linear Algebra26, 417–432 (2013). [Google Scholar]
- 69.Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature393, 440–442 (1998). [DOI] [PubMed] [Google Scholar]
- 70.White, D. R. & Borgatti, S. P. Betweenness centrality measures for directed graphs. Soc. Netw.16, 335–346 (1994). [Google Scholar]
- 71.Wu, Z., Liu, S., Ding, C., Ren, Z. & Xie, S. Learning graph similarity with large spectral gap. IEEE Trans. Syst. Man Cybern. Syst.51, 1590–1600 (2019). [Google Scholar]
- 72.Zhang, J. & Small, M. Complex network from pseudoperiodic time series: Topology versus dynamics. Phys. Rev. Lett.96, 238701 (2006). [DOI] [PubMed] [Google Scholar]
- 73.Chen, T. & Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 785–794 (2016).
- 74.Fagiolo, G. Clustering in complex directed networks. Phys. Rev. E76, 026107 (2007). [DOI] [PubMed] [Google Scholar]
- 75.Fath-Tabar, G., Ashrafi, A. & Gutman, I. Note on Estrada and L-Estrada indices of graphs. Bull. (Acad. Serbe Sci. Arts Classe Sci. Math. Nat. Sci. Math.) 1–16 (2009).
- 76.Shang, Y. Laplacian Estrada and normalized Laplacian Estrada indices of evolving graphs. PLoS ONE 10, e0123426 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 77.Cvetković, D. Spectral recognition of graphs. Yugoslav J. Oper. Res.22 (2016).
- 78.Qi, Y., Yi, Y. & Zhang, Z. Topological and spectral properties of small-world hierarchical graphs. Comput. J.62, 769–784 (2019). [Google Scholar]
- 79.Nie, C.-X. Topological energy of networks. Chaos Interdiscip. J. Nonlinear Sci.33 (2023). [DOI] [PubMed]
- 80.Bader, D. A., Kintali, S., Madduri, K. & Mihail, M. Approximating betweenness centrality. In International Workshop on Algorithms and Models for the Web-Graph. 124–137 (Springer, 2007).
- 81.Baldominos, A., Saez, Y. & Isasi, P. A survey of handwritten character recognition with MNIST and EMNIST. Appl. Sci.9, 3169 (2019). [Google Scholar]
- 82.Deng, L. The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag.29, 141–142 (2012). [Google Scholar]
- 83.Seng, L. M., Chiang, B. B. C., Salam, Z. A. A., Tan, G. Y. & Chai, H. T. MNIST handwritten digit recognition with different CNN architectures. J. Appl. Technol. Innov.5, 7–10 (2021). [Google Scholar]
- 84.Wu, M. & Zhang, Z. Handwritten digit classification using the MNIST data set. In Course Project CSE802: Pattern Classification & Analysis. Vol. 336 (2010).
- 85.Xiao, H. MNIST and Fashion MNIST Datasets (2017). https://github.com/zalandoresearch/fashion-mnist/tree/master/data.
- 86.Mu, N. & Gilmer, J. MNIST-C Dataset (2019). https://zenodo.org/records/3239543.
- 87.Mu, N. & Gilmer, J. MNIST-C: A robustness benchmark for computer vision. arXiv preprint arXiv:1906.02337 (2019).
- 88.Rusak, E. et al. Increasing the robustness of dnns against image corruptions by playing the game of noise. In ICLR 2020 (2020).
- 89.Rusak, E. et al. A simple way to make neural networks robust against diverse image corruptions. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. 53–69 (Springer, 2020).
- 90.Nocentini, O., Kim, J., Bashir, M. Z. & Cavallo, F. Image classification using multiple convolutional neural networks on the fashion-MNIST dataset. Sensors22, 9544 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Pinamonti, M. Kaggle: Alzheimer MRI dataset (2022). https://kaggle.com/datasets/marcopinamonti/alzheimer-mri-4-classes-dataset.
- 92.Ajagbe, S. A., Amuda, K. A., Oladipupo, M. A., Oluwaseyi, F. A. & Okesola, K. I. Multi-classification of Alzheimer disease on magnetic resonance images (MRI) using deep convolutional neural network (DCNN) approaches. Int. J. Adv. Comput. Res.11, 51 (2021). [Google Scholar]
- 93.Eroglu, Y., Yildirim, M. & Cinar, A. mRMR-based hybrid convolutional neural network model for classification of Alzheimer’s disease on brain magnetic resonance images. Int. J. Imaging Syst. Technol.32, 517–527 (2022). [Google Scholar]
- 94.Jraba, S., Elleuch, M., Ltifi, H. & Kherallah, M. Alzheimer disease classification using deep CNN methods based on transfer learning and data augmentation. Int. J. Comput. Inf. Syst. Indus. Manag. Appl.16, 17–17 (2024). [Google Scholar]
- 95.Khasanah, I. et al. Enhancing Alzheimer’s disease diagnosis with k-NN: A study on pre-processed MRI data. Int. J. Artif. Intell. Med. Issues2, 49–60 (2024). [Google Scholar]
- 96.Sharma, S., Guleria, K., Tiwari, S. & Kumar, S. A deep learning based convolutional neural network model with VGG16 feature extractor for the detection of Alzheimer disease using MRI scans. Meas. Sens.24, 100506 (2022). [Google Scholar]
- 97.Yedavalli, R. & Bair, A. Deep learning-based classification of Alzheimer’s stages: A multiclass approach using MRI data. J. High Sch. Sci.8, 86–102 (2024). [Google Scholar]
- 98.Wei, Y. et al. Comprehensive segmentation of gray matter structures on T1-weighted brain MRI: A comparative study of convolutional neural network, convolutional neural network hybrid-transformer or -Mamba architectures. Am. J. Neuroradiol. (2025). [DOI] [PMC free article] [PubMed]