Scientific Reports. 2025 Jul 19;15:26289. doi: 10.1038/s41598-025-12257-3

The AlexNet HSD model for industrial heritage damage detection and adaptive reuse under artificial intelligence

Huiling Zhang
PMCID: PMC12276325  PMID: 40684049

Abstract

As the importance of preserving and utilizing industrial heritage continues to grow, improving the efficiency and accuracy of damage detection for industrial heritage has become a key research focus. This work optimizes the structure of the traditional AlexNet HSD (Alex Krizhevsky Network Hierarchical Structure Detection) model. By integrating the Convolutional Block Attention Module (CBAM) and Support Vector Machine (SVM), an AlexNet HSD + CBAM + SVM (AlexNet HCS) model is proposed to enhance the performance of industrial heritage damage detection. Experiments are conducted on a comprehensive dataset composed of the xView2 Building Damage Assessment Dataset (xBD) and photos of Third Front construction buildings in Southwest China. The results show that, through structural improvements and the combination of CBAM and SVM, the AlexNet HCS model achieves an accuracy of 95.7%, an increase of 12.2% compared with AlexNet HSD. Its Precision, Recall, and F1 score are 94.8%, 95.7%, and 95.2%, respectively, verifying the effectiveness of the optimization strategy. Ablation experiments confirm the benefit of the network structure improvements and the synergistic gain of CBAM and SVM. CBAM adds only 3.5% Floating Point Operations (FLOPs) and 4 ms of inference latency while improving accuracy by 1.8%, and placing DropBlock at Conv5 further suppresses overfitting. In comparative experiments with other models, AlexNet HCS demonstrates superior classification performance and faster convergence, proving its efficacy in building damage identification. Moreover, based on the findings in damage detection, this work proposes specific pathways for the adaptive reuse of industrial heritage from the Third Front Construction in Southwest China, with the aim of supporting the sustainable development and cultural preservation of industrial heritage. This work intends to provide novel technical support and a theoretical foundation for the protection of industrial heritage, promoting its scientific and sustainable utilization.

Keywords: Industrial heritage, Damage detection, AlexNet HSD, CBAM, Support vector machine

Subject terms: Physiology, Psychology, Engineering, Mathematics and computing

Introduction

Research background and motivations

As the importance of protecting and utilizing industrial heritage grows, particularly for buildings from China’s Southwest Third Front construction period, enhancing the efficiency and accuracy of damage detection for industrial heritage has become a critical research task. During the Third Front construction period, the construction and operation of numerous industrial buildings contributed significantly to local economic development. Over time, however, these buildings have gradually aged and deteriorated, and effective detection methods are required to ensure their safety and sustainable use1–3.

Traditional manual inspection methods are not only inefficient but also prone to oversights and misjudgments4,5. With rapid advancements in computer vision and deep learning (DL), the Convolutional Neural Network (CNN) has demonstrated superior performance in image classification and recognition, positioning it as a promising solution for industrial heritage damage detection. DL models like Alex Krizhevsky Network (AlexNet), renowned for their powerful feature extraction capabilities, have been widely applied across various visual tasks6–8. However, the existing AlexNet Hierarchical Structure Detection (AlexNet HSD) model still faces limitations when handling complex images, particularly in feature extraction and classification accuracy9,10. Therefore, optimizing the AlexNet structure to enhance its performance and strengthen its ability to identify complex damage characteristics in industrial heritage holds both theoretical significance and practical value.

Research objectives

The primary objective of this work is to develop an industrial heritage damage detection model based on AlexNet HSD, with a specific focus on the context of China’s Southwest Third Front construction. The specific objectives are as follows. First, to address the insufficient feature extraction ability of the traditional AlexNet HSD model, a structural optimization scheme is designed and a Convolutional Block Attention Module (CBAM) is introduced to enhance spatial and semantic perception of complex damage features. Second, a Support Vector Machine (SVM) is integrated to enhance classification accuracy and robustness for more precise identification of structural damage. Third, the model is applied to image data of Third Front industrial heritage to mine the spatial distribution patterns and structural weaknesses of the damage, providing a scientific basis for subsequent adaptive reuse design. Finally, based on the detection results, the work proposes pathways for the adaptive reuse of industrial heritage that meet the actual development needs of Southwest China. It also explores a full-chain “detection-evaluation-protection-activation” technology system to provide data and strategic support for the continuation and reuse of industrial heritage, and to promote the transmission of historical and cultural heritage and regional sustainable development.

Literature review

In recent years, as a powerful machine learning technology, DL has made remarkable progress in many visual and time sequence recognition tasks. Wang et al. (2022) proposed a Magnetic Resonance Imaging (MRI) image enhancement method that combined digital twins with deep transfer learning. Using composite metasurfaces and an adaptive fusion algorithm, this method significantly improved image clarity and fusion outcomes in MRI, outperforming traditional algorithms11. Zeng et al. (2023) proposed a hybrid model combining CNN and Gated Recurrent Unit (GRU) for human activity recognition tasks, aiming to process smartphone sensor data and achieve real-time recognition of user activities. Experimental results showed that the model achieved an average accuracy of 91.27% on public datasets, significantly outperforming traditional machine learning and single DL models, demonstrating good generalization performance and practical application value12. This study highlighted the potential of introducing lightweight network structures in temporal structure modeling. This idea provided theoretical support for improving model adaptability and generalization ability in building damage detection, especially in the discrimination of complex structures and temporal damage evolution. Similarly, Dan et al. (2021) applied neural network technology for efficient analysis of asphalt core images, allowing rapid extraction of aggregate particle geometry information. This method proved more convenient and accurate than traditional techniques, supporting intelligent, lightweight image analysis13. Furthermore, Li et al. (2023) developed an optimized You Only Look Once version 8 small (YOLOv8-s) multi-target detection model for drones. The model utilized Bidirectional Path Aggregation Network - Feature Pyramid Network (Bi-PAN-FPN) feature fusion and GhostblockV2 structure. 
This approach enhanced detection accuracy while reducing model parameters, demonstrating notable performance on the VisDrone2019 dataset14.

With increasing emphasis on industrial heritage preservation, researchers are focusing on efficient and accurate methods to detect and assess damage in these structures. Soleymani et al. (2023) reviewed damage detection and monitoring techniques, noting that while traditional methods were effective, they are often costly, time-consuming, and may fail to detect hidden issues. Their study explored the strengths and weaknesses of structural health monitoring systems and outlined directions for future research15. Additionally, Sánchez-Aparicio et al. (2023) illustrated the use of ground-based laser scanning for structural damage detection. They highlighted that point cloud technology not only extracted geometric damage information but also retained environmental data to detect various damage types, such as moisture-related issues16. Rossi et al. (2023) conducted a comprehensive evaluation of structural health monitoring technologies for cultural heritage buildings, covering both traditional and innovative approaches. They also discussed future trends to support efficient, sustainable maintenance of heritage sites17.

To sum up, although existing research has made good progress in deep learning and building damage detection, obvious gaps remain for industrial heritage scenes. First, most models are designed for modern buildings or post-disaster scenes, and there is a lack of systematic research on the damage characteristics of industrial heritage with long history, complex structure, and diverse materials. Second, multi-source data fusion methods are lacking, which makes it difficult to capture both macro-structural and micro-fracture information. Third, deep networks are vulnerable to texture interference and low-quality sampling in industrial heritage images, so their robustness is insufficient. Fourth, existing work usually adopts a single deep classifier, and the complementary advantages of attention mechanisms and traditional machine learning have not been fully explored. To address these gaps, this work proposes an improved model based on AlexNet HSD. On the one hand, CBAM attention and DropBlock are introduced to improve the representation of complex damage and the robustness of the model. On the other hand, an SVM classifier is used to enhance discrimination under small-sample conditions. By fusing xBD satellite images with field photographs of Third Front industrial heritage in Southwest China, a multi-scale, multi-source comprehensive dataset is constructed, with the aim of filling the research gaps in model structure, data fusion, and anti-interference ability for industrial heritage damage detection.

Research model

Analysis of the working principle of AlexNet HSD network

AlexNet HSD is a DL method based on the classic AlexNet architecture. AlexNet consists of eight layers, with the first five being convolutional layers and the last three being fully connected layers18,19, as shown in Fig. 1.

Fig. 1. Structure of the AlexNet.

In the AlexNet, the convolutional layers effectively extract local features through local receptive fields and weight sharing. The input image undergoes a series of convolution operations to generate feature maps, and the Rectified Linear Unit (ReLU) activation function introduces non-linearity so that the network can capture complex feature patterns. Subsequently, the pooling layers reduce the size of the feature maps, which lowers computational complexity and improves robustness to interference by preserving significant features and suppressing noise. In the fully connected layers, the extracted features are flattened and mapped to category labels. Finally, the output is normalized using the Softmax function, which generates a probability distribution over the categories; the category with the highest probability is selected as the prediction result20–22.

AlexNet HSD introduces a mechanism for hierarchical feature detection based on the AlexNet architecture, progressively capturing structural information from coarse to fine levels through multi-level feature extraction. Within the HSD framework, the network first extracts low-level features through initial convolutional layers, and as the network deepens, the convolutional layers begin to capture higher-level features21,23. This hierarchical feature extraction not only improves detection accuracy but also enhances adaptability to various types of damage.

AlexNet is chosen as the basic network architecture for the following reasons. First, AlexNet is relatively lightweight, with a moderate number of parameters, and consumes far fewer computing resources than deep networks such as ResNet and DenseNet. It is therefore well suited to efficient inference in industrial heritage scenes with limited equipment or complex deployment environments. Second, AlexNet has good interpretability and modularity, so attention mechanisms such as CBAM and a back-end SVM classifier can be embedded conveniently while keeping the backbone network stable, which improves the discriminative ability of the model. Third, AlexNet learns well on small and medium-sized datasets and avoids the overfitting risk that overly complex structures bring when data are insufficient. Considering that samples of industrial heritage damage images are limited, AlexNet HSD offers a better compromise between accuracy and efficiency and is easy to adapt and deploy widely.

The structural optimization of AlexNet HSD for industrial heritage damage detection

Optimization of the AlexNet structure

When it comes to detecting damage in industrial heritage, the traditional AlexNet structure faces challenges regarding feature extraction capability and model adaptability. Aiming at the practical issues of complex structures and diverse damage types in heritage buildings, this work conducts multi-faceted optimizations on the AlexNet network structure, including multi-scale design of receptive fields, improvement of convolution kernel structures, introduction of feature fusion mechanisms, and selection of activation functions, to enhance the model’s sensitivity and robustness to damage features.

  1. Introduction of a multi-scale dilated convolution module. To enhance the model’s ability to represent complex structures and damage features at different scales, a multi-scale dilated convolution structure is introduced into the first convolutional layer of the AlexNet HSD network. Specifically, the original convolution operation is replaced by three parallel convolution kernels with dilation rates set to 1, 2, and 3, respectively, to achieve joint modeling of small, medium, and large receptive fields. This structure effectively improves the model’s recognition capability for various damage types in industrial heritage, such as cracks, spalling, and corrosion, by parallelly extracting feature information at different scales. Meanwhile, it maintains the spatial dimensions of output feature maps and does not significantly increase computational burden.

  2. Replacing Traditional Convolutional Kernels: Two 3 × 3 convolutional kernels are used in place of the 5 × 5 convolutional kernel in AlexNet. This structure maintains the same receptive field while increasing the network’s depth and significantly reducing the number of parameters, which lowers computational complexity. This change not only enhances the model’s learning capability but also speeds up training, contributing to more efficient damage detection.

  3. Introducing 1 × 1 Convolutional Kernels: A 1 × 1 convolutional kernel is added between the third and fourth convolutional layers of AlexNet to accelerate network operation and improve performance. The introduction of the 1 × 1 convolutional kernel serves not only to merge features and adjust dimensions but also to increase the depth of the network while preserving the image’s planar structure. This method reduces the number of parameters, thereby lowering the risk of overfitting, while enhancing the network’s computational efficiency.

  4. Using Leaky ReLU Activation Function: The traditional ReLU function can lead to the “dying ReLU” problem under certain conditions; however, using the Leaky ReLU activation function can address this issue24,25. Leaky ReLU allows a certain gradient to flow even when the input is negative, preventing complete information loss26. Its specific form is as follows:
    $$f(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases} \tag{1}$$

$x$ represents the input to the activation function, $f(x)$ is the output of the activation function, and $\alpha$ is a constant less than 1, which is set to 0.15 here. This optimization measure improves the network’s learning ability to some extent, enhancing the model’s performance on complex datasets.
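The arithmetic behind the four optimizations listed above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names and the channel count `C = 192` are assumptions chosen for the example, not values taken from the authors' implementation.

```python
def effective_kernel(k: int, dilation: int) -> int:
    """Side length of the window covered by a k x k kernel with this dilation."""
    return k + (k - 1) * (dilation - 1)


def conv_weights(k: int, c_in: int, c_out: int) -> int:
    """Number of weights in a k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def leaky_relu(x: float, alpha: float = 0.15) -> float:
    """Eq. (1): identity for positive inputs, slope alpha otherwise."""
    return x if x > 0 else alpha * x


# Item 1: 3x3 kernels with dilation 1, 2, 3 cover 3x3, 5x5 and 7x7 windows.
fields = [effective_kernel(3, d) for d in (1, 2, 3)]

# Item 2: two stacked 3x3 layers see a 5x5 window but need fewer weights.
C = 192  # hypothetical channel count, same for input and output
stacked = 2 * conv_weights(3, C, C)
single = conv_weights(5, C, C)

print(fields)            # [3, 5, 7]
print(stacked / single)  # 0.72
print(leaky_relu(-2.0))  # -0.3
```

With equal input and output channels, the stacked 3 × 3 pair uses 18/25 of the weights of a single 5 × 5 kernel, which is the parameter saving described in item 2.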

Introducing CBAM attention mechanism

To more effectively identify and analyze diverse damage features in industrial heritage buildings, this work introduces a universal and lightweight attention mechanism, CBAM. By integrating channel attention and spatial attention modules, CBAM can mine the salient features of key damage areas in images along different dimensions, thereby further enhancing the model’s discriminative ability and overall performance27–29. Unlike channel-only attention mechanisms such as the Squeeze-and-Excitation Network (SENet), CBAM models the spatial and channel dimensions separately, balancing local texture information and global structure perception. This makes it more suitable for processing multi-scale and complex texture damage features in industrial heritage images. The overall structure of CBAM is shown in Fig. 2. It consists of two sequential stages, the channel attention module and the spatial attention module, whose outputs are applied to the input feature maps to enhance feature representation in key areas.

Fig. 2. CBAM structure.

The channel attention module obtains the statistical information of the input feature map on each channel through Global Average Pooling (GAP) and Global Maximum Pooling (GMP). For the input feature map $F \in \mathbb{R}^{C \times H \times W}$, average pooling and maximum pooling are first performed to obtain two channel description vectors:

$$F_{avg}^{c} = \mathrm{GAP}(F) \tag{2}$$
$$F_{max}^{c} = \mathrm{GMP}(F) \tag{3}$$

Then, these two vectors are each passed through a two-layer fully connected network with shared weights and added element by element, and the channel attention map $M_c(F)$ is output through the sigmoid function:

$$M_c(F) = \sigma\left(\mathrm{MLP}(F_{avg}^{c}) + \mathrm{MLP}(F_{max}^{c})\right) \tag{4}$$

MLP stands for multilayer perceptron, and $\sigma$ is the sigmoid activation function. Channel attention can significantly enhance the network's response to important channels.

The channel-weighted feature map $F'$ is then input into the spatial attention module to further focus on the key location areas in the image. The module obtains two spatial descriptors by performing average pooling and maximum pooling along the channel dimension:

$$F_{avg}^{s} = \mathrm{AvgPool}(F') \tag{5}$$
$$F_{max}^{s} = \mathrm{MaxPool}(F') \tag{6}$$

Then, these two maps are concatenated along the channel dimension and passed through a 7 × 7 convolution $f^{7 \times 7}$, and the spatial attention map $M_s(F')$ is obtained through the sigmoid activation function:

$$M_s(F') = \sigma\left(f^{7 \times 7}\left([F_{avg}^{s}; F_{max}^{s}]\right)\right) \tag{7}$$

Finally, the enhanced feature map output by CBAM is calculated by the following equations:

$$F' = M_c(F) \otimes F \tag{8}$$
$$F'' = M_s(F') \otimes F' \tag{9}$$

In the equations, $\otimes$ represents element-wise multiplication, $F'$ is the feature map after channel attention, and $F''$ is the final output after spatial attention. The insertion position of CBAM has an important influence on model performance. If the attention module is added in the initial stage of the network, it may interfere excessively with low-level features, whereas introducing it at a later stage may lead to feature overfitting. Therefore, this work embeds the CBAM module after the fourth convolutional layer (Conv4) of the optimized AlexNet HSD structure. At this depth, the feature map carries richer semantic information, so the attention mechanism can screen out key areas more effectively and improve the model’s ability to identify multiple types of damage. In summary, CBAM improves the spatial perception and semantic understanding of the model while remaining computationally lightweight, and provides more robust deep features for subsequent damage classification and the identification of adaptive reuse pathways.
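The two attention stages described above can be sketched in plain Python on a tiny feature map stored as nested lists `F[c][i][j]`. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the hidden-layer ReLU, the zero padding, and all weight values are hypothetical, and a real model would use a tensor library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def shared_mlp(v, w1, w2):
    """Two-layer perceptron (ReLU hidden layer) applied to a channel descriptor."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(F, w1, w2):
    """Channel attention map: one weight in (0, 1) per channel."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in F]
    mx = [max(map(max, ch)) for ch in F]
    a, m = shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2)
    return [sigmoid(x + y) for x, y in zip(a, m)]


def spatial_attention(F, kernel):
    """Spatial attention map: one weight in (0, 1) per pixel.

    kernel[0] filters the channel-average map and kernel[1] the channel-max
    map; positions outside the map are treated as zero (zero padding).
    """
    C, H, W = len(F), len(F[0]), len(F[0][0])
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    r = k // 2
    out = []
    for i in range(H):
        row = []
        for j in range(W):
            s = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - r, j + dj - r
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[p][di][dj] * plane[y][x]
            row.append(sigmoid(s))
        out.append(row)
    return out


def cbam(F, w1, w2, kernel):
    """Channel attention first, then spatial attention on the reweighted map."""
    Mc = channel_attention(F, w1, w2)
    F1 = [[[Mc[c] * v for v in row] for row in F[c]] for c in range(len(F))]
    Ms = spatial_attention(F1, kernel)
    return [[[Ms[i][j] * v for j, v in enumerate(row)] for i, row in enumerate(ch)]
            for ch in F1]


# Tiny example: 2 channels, 2 x 2 pixels, hypothetical weights.
F = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 1.0]]]
w1 = [[0.5, -0.5]]                                          # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]                                        # 1 hidden unit -> 2 channels
kernel = [[[0.01] * 7 for _ in range(7)] for _ in range(2)]  # 7 x 7, two input planes
out = cbam(F, w1, w2, kernel)
print(len(out), len(out[0]), len(out[0][0]))                # 2 2 2
```

The output has the same shape as the input, which is why the module can be dropped in after Conv4 without changing the layers that follow it.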

Introduction of SVM

To enhance the recognition capability of damage features in industrial heritage buildings, SVM is introduced as the classifier. SVM is an effective supervised learning algorithm designed to efficiently separate different classes of samples by finding the optimal hyperplane30,31. The core idea is to maximize the inter-class distance, thereby improving the model’s classification accuracy and generalization ability32.

In handling binary classification problems, SVM attempts to optimize a hyperplane that maximizes the classification margin33. For given training samples $\{(x_i, y_i)\}_{i=1}^{n}$, where $y_i \in \{-1, +1\}$, the objective of SVM is to solve the following optimization problem:

$$\min_{w, b} \ \frac{1}{2}\|w\|^{2} \quad \text{s.t.} \quad y_i\left(w^{T} x_i + b\right) \ge 1, \quad i = 1, \dots, n \tag{10}$$

$w$ represents the normal vector of the hyperplane, and $b$ is the bias term. Through this optimization, SVM can effectively avoid overfitting in the context of limited samples, which is particularly crucial when dealing with complex features.
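To make the margin objective concrete, the sketch below checks the constraint for two candidate hyperplanes on four toy points and compares their margin widths. The sample points, the candidate weights, and the function names are illustrative assumptions, and no solver is involved.

```python
import math


def decision(w, b, x):
    """Signed score w.x + b of a sample x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def is_feasible(w, b, samples):
    """True if every (x, y) satisfies the constraint y * (w.x + b) >= 1."""
    return all(y * decision(w, b, x) >= 1 for x, y in samples)


def margin_width(w):
    """Distance between the two margin hyperplanes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical, linearly separable toy data with labels +1 / -1.
samples = [((2.0, 2.0), +1), ((3.0, 3.0), +1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Two candidate hyperplanes that both separate the data.
wide = ((0.5, 0.5), -1.0)
narrow = ((1.0, 1.0), -2.0)

for w, b in (wide, narrow):
    print(is_feasible(w, b, samples), round(margin_width(w), 3))
# True 2.828
# True 1.414
```

Both candidates satisfy the constraints, but the one with the smaller norm of `w` has the wider margin, which is why minimizing the squared norm selects it.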

When SVM is combined with AlexNet, AlexNet mainly undertakes the task of feature extraction, and its 4096-dimensional feature vector at the last layer contains the key semantic information in the image. However, AlexNet’s original Softmax classifier is essentially a multi-class logistic regression, so its decision boundary depends on the linear separability of the feature distribution. It is easily affected by redundant features or unbalanced data distribution, which limits the final classification performance. In contrast, introducing SVM as a back-end classifier has the following advantages. First, the optimal separating hyperplane is constructed by the maximum-margin principle, which is more discriminative than Softmax, especially when class boundaries are fuzzy or samples overlap. Second, in industrial heritage image scenes with little data, SVM shows stronger generalization ability by regularizing model complexity. Third, SVM supports kernel methods, which in theory extend it to more complex nonlinear classification tasks. Therefore, this work combines the deep features extracted by AlexNet with an SVM classifier to further improve the overall performance of the building damage identification system and to build a hybrid framework with strong feature expression and good classification generalization.

Introducing DropBlock

In industrial heritage damage detection tasks, image data often suffer from issues such as large quality variations, limited sample sizes, and complex feature regions, making models prone to overfitting, especially during deep feature extraction. To enhance the model’s robustness and generalization ability, the optimized AlexNet HSD structure further incorporates the DropBlock regularization mechanism and image data augmentation strategies, effectively mitigating performance degradation caused by insufficient training samples and inconsistent image quality.

Traditional Dropout mechanisms can suppress neuron co-adaptation in fully connected layers to improve network generalization, but their effectiveness in convolutional neural networks is limited. To better adapt to the spatial structure of convolutional feature maps, this work introduces the DropBlock mechanism. Its core idea is to randomly discard contiguous regions of the feature maps during training, forcing the network to learn more distributed discriminative features and avoid over-reliance on local areas. The occluded region of DropBlock is a square block whose side length is a hyperparameter, and the occlusion process can be formally expressed as:

$$\tilde{A}_{i,j} = \begin{cases} 0, & (i, j) \in B \\ A_{i,j}, & \text{otherwise} \end{cases} \tag{11}$$

$A_{i,j}$ represents the pixel value at position $(i, j)$ in the original feature map, $\tilde{A}_{i,j}$ represents the output after applying DropBlock, and $B$ denotes the set of positions covered by the occluded blocks. This mechanism not only enhances the randomness of training, but also effectively improves the robustness of the model to low-quality or occluded images. In the AlexNet HSD network, the DropBlock module is embedded after the fifth convolutional layer (Conv5). The feature map at this depth has strong semantic representation ability, so applying DropBlock there can effectively break up redundant responses and promote richer feature learning.
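A minimal sketch of the occlusion step on a single feature map is given below. It assumes an odd block size and follows the usual DropBlock recipe of sampling block centres with a small probability and rescaling the surviving activations. The helper names are hypothetical, and sampling centres over the whole map with clipping at the border is a simplification rather than the authors' procedure.

```python
import random


def drop_gamma(drop_prob, block, feat):
    """Centre-sampling rate that drops roughly drop_prob of the activations."""
    return drop_prob / block ** 2 * feat ** 2 / (feat - block + 1) ** 2


def sample_centers(h, w, gamma, rng):
    """Pick each position as a block centre with probability gamma."""
    return [(i, j) for i in range(h) for j in range(w) if rng.random() < gamma]


def dropblock_mask(h, w, block, centers):
    """Binary mask with a block x block square (odd block) zeroed at each centre."""
    mask = [[1] * w for _ in range(h)]
    r = block // 2
    for ci, cj in centers:
        for i in range(max(0, ci - r), min(h, ci + r + 1)):
            for j in range(max(0, cj - r), min(w, cj + r + 1)):
                mask[i][j] = 0
    return mask


def apply_dropblock(fmap, mask):
    """Zero the masked cells and rescale the rest by (total cells / kept cells)."""
    kept = sum(map(sum, mask))
    scale = (len(mask) * len(mask[0])) / kept if kept else 0.0
    return [[v * m * scale for v, m in zip(fr, mr)] for fr, mr in zip(fmap, mask)]


# 13 x 13 map with block = 5 and drop probability 0.2, as in the layer table.
rng = random.Random(0)
centers = sample_centers(13, 13, drop_gamma(0.2, 5, 13), rng)
mask = dropblock_mask(13, 13, 5, centers)
fmap = [[1.0] * 13 for _ in range(13)]
out = apply_dropblock(fmap, mask)
```

At inference time the mask is not applied, so the rescaling during training keeps the expected activation magnitude comparable between the two modes.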

To further expand the training data scale and reduce the model’s dependence on a small number of annotated samples, multiple image augmentation techniques are applied to transform the original images, specifically including operations such as random cropping, horizontal flipping, brightness adjustment, rotation, affine transformation, and noise perturbation. Image augmentation not only simulates diverse damage scenarios in real-world environments but also increases sample diversity while maintaining unchanged semantic information, thereby effectively mitigating the risk of model overfitting. By applying data augmentation strategies in the data preprocessing stage and combining them with the DropBlock mechanism, the model becomes more adaptable to data perturbations and occlusions during the training phase, enhancing the detection accuracy and robustness of the final model in practical complex scenarios.

AlexNet damage detection model based on CBAM and SVM

Building on the previous research, an improved AlexNet HSD structure that integrates CBAM and SVM is proposed, named AlexNet HCS, for the effective identification of building damage. Figure 3 illustrates the architecture of the AlexNet HCS model.

Fig. 3. Architecture of the AlexNet HCS model.

To more accurately extract damage features from industrial heritage buildings, the original AlexNet HSD structure undergoes convolution layer reorganization and channel depth adjustment to adapt to more complex structural types. On this basis, the CBAM attention mechanism is introduced, combining two-level guidance of channel attention and spatial attention to enhance the model’s focusing ability on critical damage areas and effectively improve the expressive efficiency of feature extraction. Meanwhile, to reduce the model’s sensitivity to low-quality images and mitigate overfitting risks, the DropBlock regularization method is further applied to intermediate convolution layers, forcing the network to focus on more robust regional features by randomly masking local areas. After feature extraction, the high-dimensional feature vectors output by the network are fed into an SVM classifier for final classification. The introduction of SVM not only improves the classification stability of the model with small samples but also reduces the dependence on parameters from the fully connected layers of the deep network, enhancing the interpretability and robustness of the overall model. Through the above multi-dimensional structural improvements and strategy integrations, the AlexNet HCS model can more comprehensively identify complex damage types in industrial heritage while achieving a good balance between classification performance and training efficiency.

Experimental design and performance evaluation

Datasets collection

The dataset used in this work consists of two parts: First, the public xBD (xView2 Building Damage Assessment Dataset)34, available at https://workswithcode.com/dataset/xbd. Second, industrial heritage damage images from the Third Front construction period in Southwest China collected by the research team. The mixed dataset contains a total of 10,000 labeled image samples. The xBD dataset accounts for approximately 70%, covering building damage images under various natural disaster scenarios with rich scenes and damage types. The Southwest China-collected images account for 30%, focusing on damage to age-old industrial heritage buildings, including typical damage types such as cracks, erosion, deformation, and mildew spots. The xBD dataset was jointly released by the U.S. Defense Innovation Unit and the MIT Lincoln Laboratory, specifically for building damage assessment after natural disasters. It covers pre- and post-disaster satellite images of 15 countries worldwide, involving natural disasters like earthquakes, floods, and typhoons, with approximately 850,000 labeled building instances in total. Building images are annotated with a five-level damage scale: intact, minor, moderate, severe, and destroyed. This work screens images similar to industrial building structures from xBD for modeling and training. To verify the reliability of the five-level damage annotation, three structural experts with over 10 years of cultural heritage conservation experience are invited to independently review 400 random samples. The Cohen’s κ coefficient is 0.87, indicating “substantial agreement”, which confirms that the annotation quality supports the requirements for model training and classification granularity.
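For readers unfamiliar with the agreement statistic quoted above, the sketch below computes Cohen's κ for one pair of raters. Cohen's κ is defined for two raters, and the text does not state how the labels of the three experts were combined, so the pairwise form and the toy labels here are purely illustrative assumptions.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels on the five-level damage scale (0 = intact ... 4 = destroyed).
rater_1 = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
rater_2 = [0, 0, 1, 1, 2, 2, 3, 3, 4, 3]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 3))  # 0.875
```

Here the raters agree on 9 of 10 items while chance agreement is 0.2, which gives κ = 0.7 / 0.8.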

To enhance the model’s perception of local historical building features, the research team conducted on-site sampling of typical industrial heritage buildings from the Third Front Construction period in Southwest China, capturing high-definition images of different damage types such as wall cracks, surface erosion, and structural deformations. The collected images underwent processing operations including uniform cropping, resolution adjustment to 224 × 224, grayscale normalization, rotation, flipping, and noise perturbation for data augmentation, to improve the diversity and robustness of training samples. Finally, this work integrates partial building images from the xBD dataset with the damage images collected in Southwest China to construct a hybrid dataset covering multiple architectural styles, different damage levels, and multi-source images. In terms of dataset division, a stratified random sampling method is adopted to ensure that the distribution proportions of each damage category in the training set, validation set, and test set are roughly consistent. The dataset is finally divided into a training set of 8,000 images, a validation set of 1,000 images, and a test set of 1,000 images at a ratio of 8:1:1. Table 1 shows the summary information of data composition and category labels.
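The stratified 8:1:1 division described above can be sketched as follows. The function name, the fixed seed, and the use of integer class labels are assumptions made for the illustration. The class sizes match the 2,000 images per damage level listed in Table 1.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(8, 1, 1), seed=0):
    """Split sample indices into train/val/test with equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, val, test = [], [], []
    total = sum(ratios)
    for idxs in by_class.values():
        rng.shuffle(idxs)  # shuffle within each class before cutting
        n_train = len(idxs) * ratios[0] // total
        n_val = len(idxs) * ratios[1] // total
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test


# Five damage levels with 2,000 images each.
labels = [c for c in range(5) for _ in range(2000)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 8000 1000 1000
```

Each class contributes 1,600, 200, and 200 images to the three subsets, so the class proportions are identical across them.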

Table 1. Data composition and category labels.

| Data source | Total number of images | No damage | Mild | Moderate | Serious | Completely damaged |
|---|---|---|---|---|---|---|
| xBD dataset | 7,000 | 1,400 | 1,400 | 1,400 | 1,400 | 1,400 |
| Field images of Southwest Third Front buildings | 3,000 | 600 | 600 | 600 | 600 | 600 |
| Total | 10,000 | 2,000 | 2,000 | 2,000 | 2,000 | 2,000 |

Experimental environment and parameters setting

Table 2 presents the experimental environment and parameter settings used.

Table 2. Experimental environment and parameter settings.

| Hardware/parameter name | Configuration/value |
|---|---|
| Operating system | Windows 10 |
| CPU | AMD R7-5800H |
| Clock speed | 3.2 GHz |
| GPU | NVIDIA GeForce RTX 3090 |
| Memory | 16 GB |
| Storage | 512 GB SSD |
| Programming language | Python 3.8 |
| Deep learning framework | PyTorch 2.0 |
| Initial learning rate | 0.003 |
| Epochs | 100 |
| Batch size | 64 |
| Dropout | 0.5 |
| Optimizer | Adam (Inline graphic) |
| Weight decay | $1 \times 10^{-4}$ |
| Loss function | Cross-entropy |

Table 3 gives the main layer parameters of AlexNet HCS:

Table 3.

AlexNet HCS main layer parameters.

Number | Layer/module | Output channels | Key parameters (kernel, stride, padding, dilation)
1 | Conv1 | 64 | 11 × 11, 4, 2, 1
2 | MaxPool1 | 64 | 3 × 3, 2
3 | Conv2 | 192 | 5 × 5, 1, 2
4 | MaxPool2 | 192 | 3 × 3, 2
5 | Multi-scale Dilated Conv3 | 384 | 3 parallel (3 × 3, d = 1/2/3)
6 | Conv4 | 256 | 3 × 3, 1, 1
7 | CBAM | 256 | Channel + spatial attention
8 | Conv5 | 256 | 3 × 3, 1, 1
9 | DropBlock | 256 | p = 0.2, block = 5
10 | MaxPool3 | 256 | 3 × 3, 2
11 | FC6 | 4096 | Dropout 0.5
12 | FC7 | 4096 | Dropout 0.5
13 | FC8 | 512 | Output feature vector
14 | SVM | - | RBF kernel, C = 1.0
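As a sanity check on the layer parameters in Table 3, the spatial size of the feature map can be traced from the 224 × 224 input to the input of FC6. The sketch below uses the standard convolution output-size formula with floor rounding. Several points are assumptions rather than statements from the table: the dilated Conv3 branches are taken to use stride 1 and padding equal to their dilation, pooling layers use no padding, and CBAM and DropBlock are treated as shape-preserving. The helper names `out_size` and `trace_shapes` are illustrative.

```python
def out_size(size, kernel, stride=1, pad=0, dilation=1):
    """Output width/height of a conv or pooling layer, using floor rounding."""
    return (size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


def trace_shapes(size=224):
    """Return (layer name, channels, spatial size) for the convolutional stages of Table 3."""
    rows = []

    def step(name, channels, new_size):
        rows.append((name, channels, new_size))
        return new_size

    size = step("Conv1", 64, out_size(size, 11, stride=4, pad=2))
    size = step("MaxPool1", 64, out_size(size, 3, stride=2))
    size = step("Conv2", 192, out_size(size, 5, pad=2))
    size = step("MaxPool2", 192, out_size(size, 3, stride=2))

    # Assumption: stride 1 and pad = dilation, so the three branches agree in size.
    branch_sizes = {out_size(size, 3, pad=d, dilation=d) for d in (1, 2, 3)}
    assert len(branch_sizes) == 1
    size = step("Multi-scale Dilated Conv3", 384, branch_sizes.pop())

    size = step("Conv4", 256, out_size(size, 3, pad=1))
    size = step("CBAM", 256, size)       # attention rescales features, shape unchanged
    size = step("Conv5", 256, out_size(size, 3, pad=1))
    size = step("DropBlock", 256, size)  # regularization only, shape unchanged
    size = step("MaxPool3", 256, out_size(size, 3, stride=2))
    return rows


rows = trace_shapes()
for name, channels, size in rows:
    print(f"{name:<26} {channels:>4} x {size} x {size}")

flat_features = rows[-1][1] * rows[-1][2] ** 2
print("FC6 input features:", flat_features)
```

Under these assumptions the map shrinks from 55 × 55 after Conv1 to 6 × 6 after MaxPool3, which gives FC6 an input of 256 × 6 × 6 features, the same geometry as the classic AlexNet.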

The evaluation metrics used include accuracy (Acc), precision (Pre), recall (Rec), and F1 score. The equations for calculating Acc, Pre, Rec, and F1 score are as follows:

$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12}$$
$$\mathrm{Pre} = \frac{TP}{TP + FP} \tag{13}$$
$$\mathrm{Rec} = \frac{TP}{TP + FN} \tag{14}$$
$$\mathrm{F1} = \frac{2 \times \mathrm{Pre} \times \mathrm{Rec}}{\mathrm{Pre} + \mathrm{Rec}} \tag{15}$$

TP refers to True Positives, which is the number of instances correctly classified as positive; TN refers to True Negatives, which is the number of instances correctly classified as negative; FP refers to False Positives, which is the number of instances incorrectly classified as positive but are actually negative; and FN refers to False Negatives, which is the number of instances incorrectly classified as negative but are actually positive.
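The four metrics follow directly from the TP, TN, FP, and FN counts using their standard definitions. The short sketch below computes them for one class treated as "positive" against all others. The counts are hypothetical values chosen for illustration, not results from this study, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score from confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * pre * rec / (pre + rec)
    return acc, pre, rec, f1


# Hypothetical counts for illustration only.
acc, pre, rec, f1 = binary_metrics(tp=90, tn=85, fp=10, fn=15)
print(f"Acc={acc:.3f} Pre={pre:.3f} Rec={rec:.3f} F1={f1:.3f}")
```

For the five-class task, such values would be computed per class in a one-vs-rest fashion and then averaged. The averaging scheme is not stated in the text, so it is left open here.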

Performance evaluation

Comparison of activation function effects

This section compares the performance of the ReLU and Leaky ReLU activation functions within the AlexNet HSD network using the test dataset. Figure 4 illustrates the results.

Fig. 4. Performance comparison of different activation functions.

Figure 4 demonstrates that the AlexNet HSD model utilizing the Leaky ReLU activation function significantly outperforms the model that employs the ReLU activation function in terms of classification accuracy. Specifically, Leaky ReLU exhibits higher accuracy across multiple training epochs, particularly in the later stages of training, where the accuracy exceeds 85%. This indicates that Leaky ReLU can more effectively enhance the model’s classification performance in the task of building damage detection, thereby improving its ability to learn complex features.
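For readers unfamiliar with the two activation functions, the difference is easy to show numerically: ReLU maps every negative input to zero, while Leaky ReLU keeps a small negative slope so that negative inputs still carry a gradient. The sketch below is illustrative only, and the slope of 0.01 is an assumed common default rather than a value taken from this section.

```python
def relu(x):
    """Standard ReLU: negative inputs are clipped to zero."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope instead of clipped."""
    return x if x > 0 else negative_slope * x


for value in (-2.0, -0.5, 0.0, 1.5):
    print(f"x={value:+.1f}  relu={relu(value):+.3f}  leaky_relu={leaky_relu(value):+.3f}")
```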

Ablation experiments

To validate the effectiveness of various components within the AlexNet HCS model, an ablation experiment is conducted. Table 4 summarizes the models involved in the ablation experiments, where “√” indicates the presence of a specific module in the model, and “×” indicates its absence.

Table 4.

Models used in the ablation experiments.

Model Various improved modules
Improved network structure CBAM Module SVM Classifier
AlexNet HSD × × ×
AlexNet HSD-1 × ×
AlexNet HSD-2 × ×
AlexNet HSD-3 × ×
AlexNet HSD-4 ×
AlexNet HSD-5 ×
AlexNet HSD-6 ×
AlexNet HCS √ √ √

Figure 5 illustrates the performance metrics for the models listed in Table 4.

Fig. 5. Ablation experiment results.

Figure 5 shows that the AlexNet HCS model improves classification performance substantially by combining the individual modules. Compared with the original model, improving the network structure alone, introducing the CBAM module alone, and integrating the SVM classifier alone increase accuracy by 1.06%, 1.41%, and 1.17%, respectively. When these optimization methods are combined, the model clearly outperforms every single-strategy variant. The final AlexNet HCS model achieves an accuracy of 95.7%, a 12.2% improvement over the original AlexNet HSD model. Its Precision, Recall, and F1 score reach 94.8%, 95.7%, and 95.2%, respectively, which are 10.9%, 12.5%, and 11.6% higher than those of AlexNet HSD. These results validate each optimization method and show the potential of combining all modules for industrial heritage damage detection.

To further validate the effectiveness of each module and the rationality of parameter settings, a DropBlock position comparison experiment is conducted. DropBlock is placed after Conv3, Conv4, and Conv5 in turn, and the results are shown in Table 5. Model accuracy and F1 score are highest when DropBlock is placed after Conv5. This suggests that applying the regularization at a deeper position suppresses overfitting more effectively while retaining more high-level semantic information, making it the better structural choice.

Table 5.

Ablation results of DropBlock position.

DropBlock position | Acc (%) | F1 (%)
Conv3 | 95.2 | 94.6
Conv4 | 95.5 | 94.9
Conv5 | 95.7 | 95.2
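To give a feel for what the DropBlock setting (p = 0.2, block = 5) implies, the sketch below evaluates the seed-sampling rate γ from the original DropBlock formulation (Ghiasi et al., 2018), which converts a target drop probability into the per-position probability of starting a dropped block. Two assumptions are made here: p is read as the drop probability rather than the keep probability, and the Conv5 feature map is taken to be 13 × 13, as in the standard AlexNet geometry for 224 × 224 inputs. Whether the author's implementation uses exactly this formula is not stated.

```python
def dropblock_gamma(drop_prob, block_size, feat_size):
    """Per-position probability of seeding a dropped block (DropBlock formulation)."""
    valid = feat_size - block_size + 1  # seed positions where a full block fits
    return (drop_prob / block_size ** 2) * (feat_size ** 2 / valid ** 2)


# Assumed values: drop probability 0.2, 5 x 5 blocks, 13 x 13 feature map.
gamma = dropblock_gamma(drop_prob=0.2, block_size=5, feat_size=13)
print(round(gamma, 4))
```

Because each seed removes a whole 5 × 5 region, γ is far smaller than the nominal drop probability of 0.2.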

To evaluate the deployment cost of CBAM, this work measures the Floating Point Operations (FLOPs) and inference latency of the model on a single 224 × 224 image before and after introducing CBAM. The results are shown in Table 6. FLOPs increase by 3.5% and inference time increases by only 4 ms, while model accuracy improves by 1.8%. These results indicate that the CBAM module strengthens the model’s ability to extract key features and improves detection accuracy at a small computational cost. Deploying this module in practical industrial scenarios is therefore cost-effective.

Table 6.

The influence of CBAM on the computational overhead.

Version | FLOPs (G) | Latency (ms) | Acc (%)
AlexNet HCS without CBAM | 1.72 | 88 | 94.0
AlexNet HCS | 1.78 (+3.5%) | 92 (+4) | 95.7 (+1.8%)
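The overhead figures in Table 6 can be re-derived from the tabulated values with a few lines of arithmetic. The sketch below only uses numbers from the table, and the variable names are illustrative.

```python
# Values copied from Table 6 (without CBAM vs. with CBAM).
flops_without, flops_with = 1.72, 1.78        # GFLOPs
latency_without, latency_with = 88, 92        # milliseconds
acc_without, acc_with = 94.0, 95.7            # percent

flops_increase_pct = (flops_with - flops_without) / flops_without * 100
latency_increase_ms = latency_with - latency_without
acc_gain_points = acc_with - acc_without                   # absolute difference
acc_gain_relative_pct = acc_gain_points / acc_without * 100  # relative difference

print(f"FLOPs increase:    {flops_increase_pct:.2f}%")
print(f"Latency increase:  {latency_increase_ms} ms")
print(f"Accuracy gain:     {acc_gain_points:.1f} points "
      f"({acc_gain_relative_pct:.2f}% relative)")
```

With the rounded table values, the reported +1.8% corresponds to the relative accuracy gain, while the absolute difference between the two accuracies is about 1.7 percentage points.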

Comparison of different models

The proposed AlexNet HCS model is compared with representative damage classification models, including You Only Look Once version 5-Classification (YOLOv5-Cls), Faster Region-based Convolutional Neural Network (Faster R-CNN), and Optimized AlexNet (O_Net)35, which is an optimized version of AlexNet. Figure 6 illustrates the results of this comparison.

Fig. 6. Comparison of different model performance.

Figure 6 shows that the AlexNet HCS model demonstrates superior performance over mainstream models in all test metrics. Its classification accuracy reaches 95.7%, significantly outperforming YOLOv5-Cls and Faster R-CNN, and exceeding the highly optimized O_Net model by 1.81%. Although YOLOv5-Cls and Faster R-CNN exhibit strong performance in object detection tasks, and O_Net introduces improved structures and attention mechanisms, they still use Softmax as the classifier, failing to fully leverage feature representation capabilities. In contrast, the AlexNet HCS model enhances the receptive field through multi-scale dilated convolutions, improves the model’s focus on key regions via the CBAM module, and further strengthens classification boundaries by integrating an SVM classifier. As a result, it achieves Precision, Recall, and F1 Score of 94.8%, 95.7%, and 95.2%, respectively. These results fully demonstrate that the AlexNet HCS model offers superior accuracy, robustness, and practical application value in industrial heritage damage recognition tasks.

To reveal the performance differences of the AlexNet HCS model across damage levels, Table 7 provides the row-normalized confusion matrix on the test set. Most errors fall between adjacent damage levels, such as the 3.2% mutual confusion between “mild” and “moderate” damage, and per-class recall (the diagonal of Table 7) ranges from 92% to 96%. This indicates that the AlexNet HCS model maintains high recognition accuracy across all damage levels.

Table 7.

Confusion matrix.

True \ Predicted | No damage | Mild | Moderate | Serious | Completely damaged
No damage | 0.96 | 0.02 | 0.01 | 0.01 | 0.00
Mild | 0.03 | 0.93 | 0.03 | 0.01 | 0.00
Moderate | 0.01 | 0.03 | 0.92 | 0.03 | 0.01
Serious | 0.00 | 0.01 | 0.04 | 0.93 | 0.02
Completely damaged | 0.00 | 0.00 | 0.02 | 0.03 | 0.95
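Since each row of Table 7 sums to 1, the diagonal entries can be read directly as per-class recall. The sketch below reads off those values and locates the largest off-diagonal confusion. It works only with the rounded two-decimal entries of the table, so results are correspondingly coarse, and the label strings are illustrative.

```python
LABELS = ["No damage", "Mild", "Moderate", "Serious", "Completely damaged"]

# Rows are true classes, columns are predicted classes (values from Table 7).
CONFUSION = [
    [0.96, 0.02, 0.01, 0.01, 0.00],
    [0.03, 0.93, 0.03, 0.01, 0.00],
    [0.01, 0.03, 0.92, 0.03, 0.01],
    [0.00, 0.01, 0.04, 0.93, 0.02],
    [0.00, 0.00, 0.02, 0.03, 0.95],
]

row_sums = [sum(row) for row in CONFUSION]

# Recall of a class = correctly predicted share of its own row.
recall = {label: CONFUSION[i][i] / row_sums[i] for i, label in enumerate(LABELS)}

# Largest single confusion between two different classes.
worst = max(
    (CONFUSION[i][j], LABELS[i], LABELS[j])
    for i in range(len(LABELS))
    for j in range(len(LABELS))
    if i != j
)

for label, value in recall.items():
    print(f"{label:<20} recall = {value:.2f}")
print(f"Largest confusion: {worst[1]} predicted as {worst[2]} ({worst[0]:.2f})")
```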

To further validate the adaptability of the AlexNet HCS model, a comparison is made against the original AlexNet HSD model on the test set, as shown in Fig. 7.

Fig. 7. Comparison of two models on the test set.

Figure 7 shows that the accuracy of the AlexNet HCS model is significantly higher than that of the AlexNet HSD model during the early stages of training, with a smaller fluctuation in accuracy and a faster convergence rate. This trend indicates that the improved AlexNet HCS model exhibits superior classification performance, enabling more effective identification of building damage. As training progresses, the accuracy of the AlexNet HCS model gradually stabilizes, demonstrating strong adaptability and robustness. This fully validates the effectiveness of the proposed model in building damage detection.

Pathways for the adaptive reuse of industrial heritage in Southwest China

In Southwest China, industrial heritage from the Third Front construction period carries significant strategic, industrial, and cultural values. The proposed AlexNet HCS deep detection model can accurately identify and locate typical damage types (such as cracks, spalling, and corrosion) and their distribution areas in heritage buildings, providing a technical foundation for structural safety assessment in adaptive reuse.

First, a structural health database should be established based on the model’s high-confidence damage maps. Combined with historical records and current usage conditions, it supports the formulation of scientific repair and reinforcement schemes. The repair strategy should treat specific damaged parts quantitatively, for example by prioritizing reinforcement where crack widths exceed thresholds and replacing severely corroded areas, so that heritage buildings meet structural safety requirements before activation. Second, detection results help divide heritage spaces into functional areas rationally. For instance, areas with minor damage and stable structures should be prioritized for public exhibition or educational spaces, while areas with significant structural defects should be restricted to protected display zones or safety warning areas to preserve the heritage’s authentic historical traces. Third, reasonable pedestrian flow organization and spatial path design should be developed based on detection data and damage hotspot distribution. By avoiding high-risk areas and optimizing traffic guidance, safety and visiting experience can be improved, enhancing the accessibility and functionality of heritage spaces. Finally, it is recommended to establish a technical and policy coordination mechanism centered on “detection and evaluation, repair and reinforcement, spatial reuse”. Local governments should leverage the model’s results to promote a digital archive system for industrial heritage, provide financial support and policy guidance, and encourage universities, enterprises, and social organizations to participate in heritage protection and innovative utilization. Meanwhile, a model-based dynamic monitoring mechanism can be implemented to regularly assess the health status of activated buildings, achieving a virtuous cycle of heritage protection and use.

Discussion

In summary, the AlexNet HCS model, which combines CBAM and SVM, demonstrates significant performance improvements in the detection of industrial heritage damage. The success of this model not only validates its effectiveness in handling complex data features but also emphasizes the crucial role of network structure optimization and attention mechanisms in enhancing model performance. Compared with existing research, the presented model performs well across multiple evaluation metrics. For instance, Giglioni et al. (2024) proposed a domain adaptive transfer learning method to accurately predict new instances in unknown target domains by transferring knowledge from fully labeled bridge structure data. The method acquired natural frequencies through long-term monitoring and used domain adaptation to align damage-sensitive features. It enabled machine learning algorithms to effectively utilize labeled source-domain data and generalize to unlabeled target-domain data. Applied to two bridge case studies, it demonstrated potential to reduce computational workload and handle sparse datasets in bridge network monitoring36. Abubakr et al. (2024) investigated the application of deep learning in reinforced concrete bridge defect classification using Xception and Vanilla models based on CNNs37. Mostofi et al. (2025) classified building damage datasets after the 2023 Kahramanmaraş earthquake in Turkey using machine learning models, evaluating nine algorithms. The random forest model showed optimal performance in predicting earthquake-induced building damage severity, achieving 93% accuracy38. By comparison, the detection accuracy of the AlexNet HCS model reaches 95.7%, indicating that the optimized model possesses strong adaptability and effectiveness in the field of industrial heritage damage recognition. 
This work provides an alternative approach to performance prediction while highlighting the need for further exploration of how to integrate multiple technologies to optimize predictive model performance. Additionally, it serves as a reference for the future application of DL technologies in other fields, underscoring the potential of diversified combinatorial methods.

In other domains, Aher (2023) proposed a Deep Q-Network based on the Political Deer Hunting Optimization Algorithm for heart disease detection. This algorithm combined a political optimizer with a deer hunting optimization algorithm, achieving 93.4% accuracy, 96.2% sensitivity, and 89.2% specificity39. Krishna and Mahboub (2024) developed an AI-driven heuristic framework integrated with medical IoT, which extracted shape and texture features and used machine learning classification techniques to detect and classify abnormal cells in breast cancer images with 98% accuracy40. Both studies demonstrate that introducing intelligent optimization or lightweight frameworks in the pre-task stage can effectively balance accuracy and real-time performance, which aligns closely with the needs of heritage scenarios in this research. Mitra and Koley (2024) compared sound- and vibration-based bearing health diagnosis methods and proposed a model combining wavelet scattering transform with a hybrid 2D CNN architecture and multi-class SVM classifier for automated real-time bearing health diagnosis41. Sahu et al. (2023) presented a computer-aided integration method based on ResNet18 and SVM for breast cancer diagnosis, improving image quality through defogging and histogram-based K-means tumor segmentation42. These studies collectively show that the “CNN features + SVM decision” paradigm maintains excellent few-shot generalization capability across domains. These interdisciplinary achievements provide solid theoretical and practical support for the algorithm selection and performance optimization of the proposed model, further highlighting the methodological innovation and application potential of AlexNet HCS in industrial heritage damage classification. 
Future research will draw on the above ideas of edge computing and adaptive optimization to prune, distill, and hardware-optimize the model, meeting the needs of real-time on-site monitoring and long-term operation and maintenance for heritage sites.

Additionally, to address the requirements for response speed and deployment efficiency in practical applications, this work balances model inference latency and accuracy in structural design. Although introducing the CBAM module increases processing time by approximately 4 ms, it achieves a 1.8% performance gain, with inference time controlled within 92 ms—enabling basic near-real-time processing capabilities suitable for timeliness requirements in most industrial heritage scenarios. Future research will further introduce lightweight attention mechanisms or Transformer variant models and compress parameter scales through model pruning, knowledge distillation, and other methods. It will also reduce computational overhead while maintaining accuracy to implement a more efficient real-time monitoring system. Although this work still focuses on static image classification tasks and does not cover a complete end-to-end real-time detection pipeline, the proposed AlexNet HCS model provides important theoretical foundations and basic modules for constructing intelligent industrial heritage damage recognition systems, demonstrating good scalability. In the future, integrating with object detection frameworks to achieve an integrated pipeline from localization to classification can further enhance its practical value in intelligent detection systems.

In summary, the proposed AlexNet HCS model, which integrates the CBAM attention mechanism and an SVM classifier, not only demonstrates excellent performance in industrial heritage damage detection but also holds significant practical application value. First, the model efficiently identifies complex damage features, significantly improving the accuracy and robustness of industrial heritage building detection. This lays a foundation for promoting relevant technologies in practical conservation work. Through precise damage localization and classification, management departments and restoration units can formulate maintenance plans and reinforcement schemes more scientifically, reducing errors and costs caused by human judgment and enhancing the efficiency and quality of conservation efforts. Second, the model is adapted to industrial heritage from the Third Front construction period in Southwest China, optimized for specific regional contexts and architectural styles. It exhibits strong pertinence and operability, providing a technical demonstration for digital protection of industrial heritage in similar regions. Furthermore, combining detection results, the model provides data support for heritage adaptive reuse decisions, promoting the organic integration of protection and reuse to achieve dual enhancement of cultural and economic values. In the future, with further improvement and cross-domain applications, the model shows broad application potential in industrial heritage digital archive construction, risk early warning, intelligent inspection and other fields. The model plays a crucial role in promoting the intelligent and information-driven progress of industrial heritage conservation.

Conclusion

Research contribution

This work optimizes the structure of AlexNet HSD and proposes the AlexNet HCS model, incorporating CBAM and SVM, to enhance the performance of industrial heritage damage detection. Through experimental validation of the model’s effectiveness, the following conclusions are drawn:

  1. The AlexNet HSD model using the Leaky ReLU activation function achieves a classification accuracy of over 85%, significantly outperforming the AlexNet HSD model using ReLU. This indicates that the Leaky ReLU activation function enhances the model’s ability to learn complex features, thereby improving classification performance more effectively.

  2. Through structural improvements, the CBAM module, and the combination of SVM, the AlexNet HCS model achieves an accuracy of 95.7%, representing a 12.2% improvement over the AlexNet HSD model. Its Precision, Recall, and F1-score are 94.8%, 95.7%, and 95.2% respectively, which are 10.9%, 12.5%, and 11.6% higher than those of AlexNet HSD. These results verify the effectiveness of the optimization strategy.

  3. Ablation experiments show that introducing improved structures, CBAM, or SVM individually yields accuracy gains of 1.06%, 1.41%, and 1.17%, respectively. Placing DropBlock at Conv5 further suppresses overfitting, achieving 95.7% accuracy and 95.2% F1 score. CBAM only increases FLOPs by 3.5% and inference latency by 4 ms, but improves accuracy by 1.8%, verifying its high cost-performance ratio.

  4. In horizontal comparison with YOLOv5-Cls, Faster R-CNN, and O_Net, AlexNet HCS leads in accuracy, convergence speed, and robustness. The confusion matrix shows per-class recall of at least 92% for all damage levels, with the “completely damaged” category reaching 95%, indicating reliable recognition across damage levels.

  5. Combining detection results, a technical closed-loop of “damage identification—repair design—adaptive reuse” is constructed. Quantitative repair schemes, functional zoning, and policy incentive recommendations for industrial heritage of Southwest China’s Third Front construction are proposed, providing an operable path for heritage digital protection and activation.

In summary, this work integrates, for the first time, multi-scale convolution, CBAM, and SVM into a lightweight AlexNet framework for industrial heritage scenarios, achieving dual improvement in accuracy and efficiency. Measured data verify its feasibility in complex damage detection and activation decision-making, laying a solid foundation for subsequent cross-domain engineering applications.

Future works and research limitations

This work has achieved preliminary results in industrial heritage damage detection, but there are still some limitations that need to be addressed in follow-up work. First, the current dataset is relatively small in scale with limited sample size, which may affect the model’s generalization ability. Especially when facing building structures from other regions, different industries, or with more complex damage degrees, the adaptability and robustness of the model still need further improvement. Additionally, although the model achieves high accuracy on the current small-scale dataset, this may hide certain overfitting risks—meaning that the data features learned by the model on the training set may not fully generalize to broader real-world scenarios. Despite using techniques such as DropBlock, CBAM modules, and SVM classifiers to enhance the model’s regularization and discriminative capabilities, the robustness and generalization of model performance still need to be further validated through larger-scale, diversified datasets and external verification tests. Second, the proposed model optimization method is developed based on the AlexNet architecture. Although the combination of CBAM attention mechanism and SVM classifier significantly improves detection accuracy, it does not systematically compare the applicability of other mainstream deep learning models (such as ResNet, EfficientNet, Transformer) in industrial heritage scenarios, limiting the model’s application and promotion in broader contexts. Furthermore, while this work proposes adaptive reuse paths for Third Front industrial heritage buildings based on detection results, the overall approach remains at the methodological and strategic recommendation level, lacking verification and interaction with actual engineering practices. The causal link between model output results and adaptive reuse decisions has not been fully established. 
How to formulate scientific and reasonable repair and reuse schemes based on detection information still requires systematic exploration through on-site engineering cases.

Future research will be dedicated to constructing a larger-scale and more representative industrial heritage damage image dataset, covering industrial buildings from different regions, periods, and types. Multimodal sensing technologies such as infrared, thermal imaging, and laser scanning will be introduced to enhance the model’s environmental adaptability and multi-dimensional feature extraction capabilities. Meanwhile, future research will continue to explore deeper network structures and their integrated optimization mechanisms with traditional classification algorithms to improve the model’s performance in complex damage recognition tasks. In terms of engineering applications, future work will take typical Third Front industrial heritage projects in Southwest China as entry points, collaborating with local cultural heritage protection institutions, design units, and university teams to promote the practical implementation of model detection results in specific practices such as heritage restoration, structural reinforcement, and spatial reuse. It will empirically evaluate the influence and reliability of the detection model on the adaptive reuse decision-making process, constructing a full-process technical closed-loop of “damage identification—repair design—adaptive reuse”. The ultimate goal is to promote the deep integration of AI-based industrial heritage protection methods with local policy systems, achieving digital, scientific, and sustainable development of industrial heritage conservation.

Author contributions

Huiling Zhang: Conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation, writing—original draft preparation, writing—review and editing, visualization, supervision, project administration, funding acquisition.

Funding

This work was supported by Chongqing Municipal Postgraduate Research Innovation Project: Research on the Activation Model and Reconstruction Mechanism of Third-Line Industrial Heritage in Southwest China within the Context of Industrial Archaeology (Project Approval No.: CYB240234).

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author Huiling Zhang on reasonable request via e-mail zz18908361863@163.com.

Declarations

Competing interests

The authors declare no competing interests.

Ethics statement

This article does not contain any studies with human participants or animals performed by any of the authors. All methods were performed in accordance with relevant guidelines and regulations.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Bai, Z. et al. Image-based reinforced concrete component mechanical damage recognition and structural safety rapid assessment using deep learning with frequency information. Autom. Constr. 150, 104839 (2023).
  • 2. Wan, H. et al. A novel transformer model for surface damage detection and cognition of concrete bridges. Expert Syst. Appl. 213, 119019 (2023).
  • 3. Hu, H. et al. A hybrid method for damage detection and condition assessment of hinge joints in Hollow slab bridges using physical models and vision-based measurements. Mech. Syst. Signal Process. 183, 109631 (2023).
  • 4. Rao, A. et al. Earthquake Building damage detection based on synthetic-aperture-radar imagery and machine learning. Nat. Hazards Earth Syst. Sci. 23 (2), 789–807 (2023).
  • 5. Agbaje, T. H. et al. Building damage assessment in aftermath of disaster events by leveraging Geoai (Geospatial artificial Intelligence). World J. Adv. Res. Rev. 23 (1), 667–687 (2024).
  • 6. Kuldashboy, A. et al. Efficient image classification through collaborative knowledge distillation: A novel AlexNet modification approach. Heliyon 10 (14) (2024).
  • 7. Eldem, H., Ülker, E. & Işıklı, O. Y. Alexnet architecture variations with transfer learning for classification of wound images. Eng. Sci. Technol. Int. J. 45, 101490 (2023).
  • 8. Zeng, Y. Enhancing image classification accuracy based on AlexNet. Highlights Sci. Eng. Technol. 85, 879–885 (2024).
  • 9. Singh, I., Goyal, G. & Chandel, A. AlexNet architecture based convolutional neural network for toxic comments classification. J. King Saud University-Computer Inform. Sci. 34 (9), 7547–7558 (2022).
  • 10. Shi, X. et al. An improved bearing fault diagnosis scheme based on hierarchical fuzzy entropy and Alexnet network. IEEE Access 9, 61710–61720 (2021).
  • 11. Wang, J. et al. Deep transfer learning-based multi-modal digital twins for enhancement and diagnostic analysis of brain mri image. IEEE/ACM Trans. Comput. Biol. Bioinf. 20 (4), 2407–2419 (2022).
  • 12. Zeng, X. et al. A Novel Human Activity Recognition Model. In 2023 8th International Conference on Mathematics and Computers in Sciences and Industry (MCSI), 101–106 (IEEE, 2023).
  • 13. Dan, H. C., Bai, G. W. & Zhu, Z. H. Application of deep learning-based image recognition technology to asphalt–aggregate mixtures: Methodology. Constr. Build. Mater. 297, 123770 (2021).
  • 14. Li, Y. et al. A modified YOLOv8 detection network for UAV aerial image recognition. Drones 7 (5), 304 (2023).
  • 15. Soleymani, A., Jahangir, H. & Nehdi, M. L. Damage detection and monitoring in heritage masonry structures: systematic review. Constr. Build. Mater. 397, 132402 (2023).
  • 16. Sánchez-Aparicio, L. J. et al. Detection of damage in heritage constructions based on 3D point clouds. A systematic review. J. Building Eng. 107440 (2023).
  • 17. Rossi, M. & Bournas, D. Structural health monitoring and management of cultural heritage structures: a state-of-the-art review. Appl. Sci. 13 (11), 6450 (2023).
  • 18. Lu, T. et al. Identification, classification, and quantification of three physical mechanisms in oil-in-water emulsions using AlexNet with transfer learning. J. Food Eng. 288, 110220 (2021).
  • 19. Lu, T., Han, B. & Yu, F. Detection and classification of marine mammal sounds using AlexNet with transfer learning. Ecol. Inf. 62, 101277 (2021).
  • 20. Anber, S., Alsaggaf, W. & Shalash, W. A hybrid driver fatigue and distraction detection model using AlexNet based on facial features. Electronics 11 (2), 285 (2022).
  • 21. Chen, H. C. et al. AlexNet convolutional neural network for disease detection and classification of tomato leaf. Electronics 11 (6), 951 (2022).
  • 22. Setiawan, W. et al. Deep convolutional neural network Alexnet and squeezenet for maize leaf diseases image classification. Kinet Game Technol. Inf. Syst. Comput. Netw. Comput. Electron. Control 6 (2021).
  • 23. Rani, S. et al. Efficient 3D AlexNet architecture for object recognition using syntactic patterns from medical images. Comput. Intell. Neurosci. 2022 (1), 7882924 (2022).
  • 24. Maniatopoulos, A. & Mitianoudis, N. Learnable leaky ReLU (LeLeLU): an alternative accuracy-optimized activation function. Information 12 (12), 513 (2021).
  • 25. Lakhdari, K. & Saeed, N. A new vision of a simple 1D convolutional neural networks (1D-CNN) with Leaky-ReLU function for ECG abnormalities classification. Intell.-Based Med. 6, 100080 (2022).
  • 26. Nayef, B. H. et al. Optimized leaky ReLU for handwritten Arabic character recognition using Convolution neural networks. Multimedia Tools Appl. 1–30 (2022).
  • 27. Chen, L. et al. The classification and localization of crack using lightweight convolutional neural network with CBAM. Eng. Struct. 275, 115291 (2023).
  • 28. Wang, B. et al. Hybrid CBAM-EfficientNetV2 fire image recognition method with label smoothing in detecting tiny Targets. Mach. Intell. Res. 21 (6), 1145–1161 (2024).
  • 29. Chen, C., Wu, B. & Zhang, H. An image recognition technology based on deformable and Cbam Convolution resnet50. IAENG Int. J. Comput. Sci. 50 (2023).
  • 30. Tanveer, M. et al. Comprehensive review on twin support vector machines. Ann. Oper. Res. 339 (3), 1223–1268 (2024).
  • 31. Gaye, B., Zhang, D. & Wulamu, A. Improvement of support vector machine algorithm in big data background. Math. Probl. Eng. 2021 (1), 5594899 (2021).
  • 32. Bansal, M., Goyal, A. & Choudhary, A. A comparative analysis of K-nearest neighbor, genetic, support vector machine, decision tree, and long short term memory algorithms in machine learning. Decis. Analytics J. 3, 100071 (2022).
  • 33. Roy, A. & Chakraborty, S. Support vector machine in structural reliability analysis: A review. Reliab. Eng. Syst. Saf. 233, 109126 (2023).
  • 34. Gupta, R. et al. Xbd: A dataset for assessing Building damage from satellite imagery. arXiv preprint arXiv:1911.09296 (2019).
  • 35. Junyue, C. et al. Breast cancer diagnosis using hybrid AlexNet-ELM and chimp optimization algorithm evolved by Nelder-mead simplex approach. Biomed. Signal Process. Control 85, 105053 (2023).
  • 36. Giglioni, V. et al. A domain adaptation approach to damage classification with an application to Bridge monitoring. Mech. Syst. Signal Process. 209, 111135 (2024).
  • 37. Abubakr, M. et al. Application of deep learning in damage classification of reinforced concrete bridges. Ain Shams Eng. J. 15 (1), 102297 (2024).
  • 38. Mostofi, S. et al. A big Data-Enabled decision support model for Post-Earthquake damage classification of RC buildings: A case study on February 6, Kahramanmaraş doublet Earthquakes. J. Earthquake Eng. 1–23 (2025).
  • 39. Aher, C. N. Enhancing Heart Disease Detection Using Political Deer Hunting Optimization-Based Deep Q-Network with High Accuracy and Sensitivity. Medinformatics (2023).
  • 40. Krishna, T. G. & Mahboub, M. A. A. Improving Breast Cancer Diagnosis with AI Mammogram Image Analysis. Medinformatics (2024).
  • 41. Mitra, S. & Koley, C. Real-time robust bearing fault detection using scattergram-driven hybrid CNN-SVM. Electr. Eng. 106 (3), 3615–3625 (2024).
  • 42. Sahu, Y. et al. A CNN-SVM based computer aided diagnosis of breast Cancer using histogram K-means segmentation technique. Multimedia Tools Appl. 82 (9), 14055–14075 (2023).


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
