Scientific Reports. 2025 Sep 26;15:33109. doi: 10.1038/s41598-025-16794-9

Low-rank adaptation for edge AI

Zhixue Wang 1,, Hongyao Ma 2, Jiahui Zhai 2
PMCID: PMC12475476  PMID: 41006599

Abstract

The rapid advancement of edge artificial intelligence (AI) has unlocked transformative applications across various domains. However, it also poses significant challenges in efficiently updating models on edge devices, which are often constrained by limited computational and communication resources. Here, we present a low-rank adaptation method for edge AI (LoRAE). Leveraging low-rank decomposition of convolutional neural network (CNN) weight matrices, LoRAE reduces the number of updated parameters to approximately 4% of traditional full-parameter updates, effectively mitigating the computational and communication challenges associated with model updates. Extensive experiments across image classification, object detection, and image segmentation tasks demonstrate that LoRAE significantly decreases the scale of trainable parameters while maintaining or even enhancing model accuracy. Using the YOLOv8x model, LoRAE achieves parameter reductions of 86.1%, 98.6%, and 94.1% across the three tasks, respectively, without compromising accuracy. These findings highlight the potential of LoRAE as an efficient and precise solution for resource-constrained edge AI systems.

Keywords: Edge AI, Low-rank adaptation, Model update efficiency, Parameter reduction

Subject terms: Computer science, Computational science

Introduction

The rapid advancement of artificial intelligence (AI), the Internet of Things (IoT), and edge computing has significantly expanded the scale and complexity of intelligent devices. Edge AI has emerged as a transformative technology, enabling localized data processing on edge devices and reducing reliance on cloud computing1–3. This approach not only enhances data privacy and real-time responsiveness but also improves system efficiency through decentralization. It supports a wide range of applications, including smart cities and autonomous vehicles, where real-time decision-making is critical.

Edge AI, despite its many advantages, faces significant challenges, particularly in scenarios requiring frequent model updates. The large parameter sizes and high computational demands of deep learning models make on-device training on resource-constrained edge devices infeasible4. Fine-tuning models for specific tasks often involves modifying a substantial number of parameters, resulting in considerable computational and bandwidth costs. These limitations render traditional full-parameter update methods unsuitable for Edge AI systems, especially in dynamic environments characterized by constantly varying computational capabilities, fluctuating network transmission conditions, and evolving data distributions. The key challenge lies in reducing computational and communication costs during model updates while maintaining accuracy and performance5,6. Approximate Computing provides a promising approach to balancing precision and functional correctness, effectively reducing computational demands, energy consumption, and communication latency. However, the adoption of low-rank approximation techniques in Edge AI remains limited7. While early efforts have focused on optimizing neural network size and computational complexity, methods such as Low-Rank Adaptation (LoRA)8, which employ matrix decomposition to reduce redundancy while preserving model performance, represent a nascent and underexplored area of research.

In recent years, LoRA8 has garnered significant attention for its exceptional parameter efficiency and fine-tuning performance in large language models and Transformer architectures. By adding and training low-rank decomposition update matrices alongside pre-trained weights, LoRA substantially reduces the number of trainable parameters without sacrificing performance. However, LoRA’s effectiveness and potential pitfalls in CNNs still require in-depth exploration. Through their convolutional kernel structure, CNNs effectively capture local patterns and hierarchical spatial features in images, making the preservation of spatial sensitivity (i.e., the spatial position and interrelationships of features) critically important49. Directly applying the original form of LoRA to vision models often fails to supply this local spatial-correlation inductive bias. This is primarily because its low-rank decomposition, which is not tailored to convolutional operations, may disrupt or oversimplify this crucial spatial information, thereby weakening CNNs’ ability to capture fine details50.

To tackle these challenges, we propose an innovative low-rank adaptation method for Edge AI (LoRAE), which offers an efficient approach to substantially cut down the number of parameters updated during model adjustments. By harnessing the low-rank decomposition of model weight matrices, LoRAE effectively mitigates communication and computational burdens, updating only a small subset of parameters to efficiently capture the most crucial model variations.

The study’s main contributions are twofold. First, we introduce LoRAE, a low-rank adaptive decomposition method that enhances the efficiency of edge AI. By capitalizing on convolutional properties, LoRAE compresses parameter updates to roughly 4%, significantly reducing both computational and communication costs, thereby providing an innovative solution for achieving efficient model updates in resource-constrained environments. Second, we conduct a systematic evaluation of LoRAE across 5 public datasets and 49 vision models, with the results offering valuable references for optimizing edge AI models.

The paper is structured as follows. Section II reviews related work on model updates and efficient fine-tuning in Edge AI, emphasizing limitations of traditional methods and recent trends. Section III introduces the design and implementation of LoRAE, detailing its algorithmic principles and optimization strategies. Section IV outlines the experimental setup, including datasets, metrics, and comparative analyses. Section V concludes with key contributions and future research directions.

Related work

Edge AI

Edge AI and Cloud AI constitute two prominent paradigms for AI deployment, each possessing distinct characteristics, benefits, and challenges12. Cloud AI leverages centralized data centers to provide robust computing capabilities. This approach is particularly suited for processing extensive datasets and intricate AI models that demand substantial computational resources13. However, the centralized architecture of Cloud AI often leads to higher latency and increased bandwidth requirements. These issues become pronounced when frequent data transfers occur between edge devices and the cloud14.

In contrast, Edge AI performs computations on local devices with relatively limited computational power. This paradigm emphasizes low latency, bandwidth conservation, and enhanced data privacy2. Edge AI is particularly advantageous in applications requiring real-time responses, such as those in the IoT domain, smart devices, autonomous vehicles, and smart cities. In these contexts, it enables devices to process sensor data promptly, achieving millisecond-level response times. This capability significantly enhances user experience and supports critical decision-making tasks, such as traffic management and public safety, without latency interference15–18. However, achieving high detection accuracy in these applications is challenging, especially in dynamic and practical scenarios where detection types change in real-time. Frequent model parameter updates are essential to maintain optimal performance in such conditions.

Despite its advantages, Edge AI faces several significant challenges. A primary limitation is the restricted bandwidth, which severely impacts the efficiency of model updates and data transmission. The relatively low bandwidth between edge devices and the cloud slows down frequent model updates, elongating the time required for training and deployment. Furthermore, the storage capacity of edge devices is typically limited. This restricts their ability to store large-scale models and datasets locally, thereby impeding their capacity to maintain diverse and up-to-date AI models19. Additionally, the computational power of edge devices is often insufficient compared to cloud servers, making complex model training unsuitable for edge environments. Bandwidth constraints further hinder timely parameter updates, especially for deep neural network tasks that demand substantial computational resources20.

To address these challenges, researchers have proposed various optimization techniques. Model compression significantly reduces the size of AI models, facilitating their transfer and deployment10. Incremental updates focus on transmitting only the changes in the model, thereby conserving bandwidth21. Edge-cloud collaboration combines the strengths of edge and cloud computing. In this approach, computationally intensive training tasks are offloaded to the cloud, while edge devices focus on real-time inference and decision-making22.

Parameter-efficient fine-tuning for AI models in resource-constrained environments

Fine-tuning is a pivotal technique in deep learning that adjusts the parameters of pre-trained models for specific tasks. It is particularly valuable when labeled data is limited or computational resources are constrained10. By leveraging the generalized features of pre-trained models, fine-tuning reduces computational costs while enhancing performance on new tasks, making it indispensable for Edge AI applications on resource-limited devices.

To optimize models for resource-constrained environments like TinyML, traditional methods often include model compression techniques such as pruning, knowledge distillation, and quantization. Pruning minimizes model size by removing less critical connections or neurons, enhancing memory and energy efficiency10. Knowledge distillation transfers knowledge from larger teacher models to smaller student models, improving student accuracy while maintaining compactness22. Quantization reduces model precision by converting high-precision floating-point weights to lower-bit integers. This technique significantly lowers storage requirements and computational costs, making it particularly advantageous for deployment on edge devices. Foundational research, such as the work on Quantized Neural Networks by Courbariaux et al.23, demonstrated the feasibility of training models with extremely low-precision weights and activations. Comprehensive surveys, like that by Nagel et al.24, further explore various quantization methods including Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT), highlighting their importance for efficient deployment on resource-limited devices25.
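As a concrete illustration of post-training quantization, the short sketch below applies PyTorch's dynamic quantization utility to the linear layers of a small stand-in network; the toy model and layer choice are our own illustration and are not tied to any of the methods cited above.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained network; any model containing nn.Linear layers works.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

# Post-training dynamic quantization: weights are stored as int8 and dequantized
# on the fly at inference, shrinking storage without any retraining.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(quantized)  # the linear layers are replaced by dynamically quantized counterparts
```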

Recent advancements in large language models (LLMs) have made fine-tuning increasingly resource-intensive26–29, primarily due to the exponential growth in model scale (with billions or even trillions of parameters), the ever-expanding size of training datasets, and the adoption of more complex training paradigms such as large-scale pre-training and instruction tuning. To address this, parameter-efficient fine-tuning (PEFT) methods like LoRA have been proposed. LoRA employs low-rank matrix decomposition to adjust linear layer weights, drastically reducing trainable parameters and improving training efficiency without introducing additional inference overhead8. Variants such as VeRA30, FedPara31, GS-LoRA32, HydraLoRA33, and MTLoRA34, along with integrations with Mixture of Experts (MoE) architectures35, further extend LoRA’s applicability. For instance, Conv-LoRA50 integrates ultra-lightweight convolutional parameters into the LoRA framework, specifically to inject image-related inductive biases into plain Vision Transformer (ViT) encoders (e.g., in the Segment Anything Model, SAM).

While general PEFT methods, including LoRA, have excelled in Transformer-based architectures, their direct application to CNNs often faces challenges due to the unique spatial characteristics of convolutional layers. CNNs inherently capture local patterns and hierarchical spatial features, and a naive low-rank decomposition might disrupt these crucial inductive biases. Consequently, specialized low-rank adaptation techniques and other PEFT methods have been developed for CNNs and vision models, aiming to maintain efficiency while preserving spatial sensitivity.

Motivation and methodology

As the application scenarios for Edge AI become more diverse and the demands increase, the challenge of frequent model updates becomes particularly prominent. Traditional methods rely on transferring and updating large amounts of parameters. However, edge devices have limited computational resources and communication bandwidth. Therefore, it is crucial to minimize the number of parameters that need to be transferred and updated. This must be done while maintaining high performance and adaptability.

According to the Scaling Law, the size of deep learning models has been progressively increasing. However, pre-trained models typically operate within significantly smaller intrinsic dimensions, indicating that only a small portion of the parameter space is essential for effective fine-tuning. Fine-tuning in this low-dimensional space can yield performance comparable to full-parameter updates, a concept that has been validated in large models8. This raises the question of whether AI models with relatively small parameter sizes can achieve competitive performance by fine-tuning within their intrinsic dimensions on edge devices. It also remains to be seen whether parameter updates can be minimized while maintaining high accuracy in high-frequency update scenarios. Although LoRA has shown success in Transformer-based models, its effectiveness in convolutional neural networks (CNNs) remains unclear. Preliminary studies suggest that ConvNets, which focus on local patterns and spatial structures, do not benefit significantly from LoRA’s naive low-rank compression, leading to slower convergence and diminished performance.

To address the challenge of frequently updating on-device models under tight resource constraints, we introduce LoRAE, a low-rank adaptation method tailored to convolutional networks in edge AI. In each convolutional layer, we insert two learnable modules, the LoRAextractor and the LoRAmapper, both based on low-rank decomposition, to extract and map the key update directions, thereby significantly reducing the number of parameters that must be updated. During training, as shown in Fig. 1a, the original weights of the backbone network remain frozen while only these low-rank matrices are updated by backpropagation, compressing the training parameter budget to approximately 4% and greatly enhancing training efficiency. At inference time (see Fig. 1b), the low-rank matrices can be fused once with the backbone weights (reparameterization) before deployment, ensuring that the computation graph and total multiply-accumulate operations are identical to those of a fully fine-tuned model, with no added latency. Alternatively, the low-rank matrices can be loaded dynamically and added at run time without fusion, incurring negligible extra computation compared to the original backbone. Moreover, LoRAE offers deployment flexibility on edge devices: the backbone model is loaded only once, after which the compact LoRAE weights can be switched or loaded on demand, significantly reducing model loading latency and context-switch overhead, an especially valuable advantage in multitask or multi-user scenarios.
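To make this deployment flexibility concrete, the sketch below (ours, not the authors' released code) shows how compact LoRAE weights could be stored and swapped on an edge device while the frozen backbone stays resident in memory; the 'extractor'/'mapper' name filters assume the module naming used in the convolutional sketch given later in this section.

```python
import torch

def save_lorae_adapter(model, path):
    """Persist only the trainable low-rank tensors (a few percent of the full model)."""
    adapter = {name: tensor.cpu() for name, tensor in model.state_dict().items()
               if 'extractor' in name or 'mapper' in name}
    torch.save(adapter, path)

def switch_lorae_adapter(model, path):
    """Swap task-specific LoRAE weights at run time; the backbone is loaded only once."""
    model.load_state_dict(torch.load(path, map_location='cpu'), strict=False)
    return model
```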

Fig. 1. LoRAE training and inference workflow diagram.

Spatially sensitive low-rank adaptation for convolutional neural networks

LoRA has emerged as an effective technique for fine-tuning large-scale models by leveraging low-rank matrix decomposition. This approach reduces the scale of parameter updates, significantly lowering storage and computational requirements while enabling rapid adaptation of pretrained models. The weight update in LoRA is represented as:

$W = W_0 + \Delta W = W_0 + BA$  (1)

where $B \in \mathbb{R}^{d_{\text{out}} \times r}$ and $A \in \mathbb{R}^{r \times d_{\text{in}}}$. Here, $d_{\text{in}}$ and $d_{\text{out}}$ denote the input and output feature dimensions, respectively, and $r \ll \min(d_{\text{in}}, d_{\text{out}})$ represents the rank of the decomposition. The pretrained weight matrix $W_0 \in \mathbb{R}^{d_{\text{out}} \times d_{\text{in}}}$ remains frozen, with only the low-rank matrices $A$ and $B$ being updated.
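For reference, Eq. (1) corresponds to the following minimal PyTorch sketch of a LoRA-adapted linear layer. The zero initialisation of $B$ and the $\alpha/r$ scaling follow common LoRA practice rather than anything specified in this paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Standard LoRA (Eq. 1): y = x W0^T + (alpha / r) * x A^T B^T, with W0 frozen."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # W0 (and its bias) stay frozen
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # A in R^{r x d_in}
        self.B = nn.Parameter(torch.zeros(d_out, r))        # B in R^{d_out x r}; update starts at zero
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```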

However, because this formulation was designed for fully connected layers, applying it directly to convolutions can disrupt the spatial inductive biases discussed above. To address this limitation, LoRAE introduces the LoRAextractor and LoRAmapper, which integrate low-rank decomposition while preserving essential spatial features, as illustrated in Fig. 2. The weight update in LoRAE is refined as:

$W = W_0 + \Delta W = W_0 + W_{\text{map}}\,W_{\text{ext}}$  (2)

where $W_{\text{ext}}$ and $W_{\text{map}}$ are low-rank matrices designed specifically for convolutional layers to preserve spatial sensitivity. The matrix $W_{\text{ext}}$ (the LoRAextractor) captures the spatial structures of the input, while $W_{\text{map}}$ (the LoRAmapper) maps the reduced features back to the full-dimensional output space. This design ensures the model retains its ability to learn local patterns while benefiting from the efficiency of low-rank decomposition.

Fig. 2. Spatially sensitive low-rank adaptation with parameter reconstruction.

LoRAextractor: Low-rank convolutional dimensionality reduction

LoRAextractor replaces traditional low-rank fully connected matrices with a low-rank convolutional approach to achieve dimensionality reduction. By preserving the input channel count and setting the output channel count to the rank $r$, LoRAextractor employs convolutional operations to compress channel dimensions while extracting spatial features:

$X_r = W_{\text{ext}} * X$  (3)

where $X$ is the input feature map, $*$ denotes convolution, $X_r$ represents the dimension-reduced feature map, and $W_{\text{ext}} \in \mathbb{R}^{r \times C_{\text{in}} \times k \times k}$ is the low-rank convolutional kernel. This method effectively compresses the parameter count while maintaining the spatial structure of the input image.

LoRAmapper: Feature reconstruction with low-rank mapping

LoRAmapper reconstructs the dimension-reduced feature map $X_r$ into the output space of the original weight matrix. This is achieved through matrix multiplication with a low-rank matrix:

$Y = W_{\text{map}}\,X_r$  (4)

where $W_{\text{map}} \in \mathbb{R}^{C_{\text{out}} \times r}$ is the low-rank mapping matrix, applied at every spatial position. This ensures the restoration of spatial information and maintains the quality of the original feature map.

By reconfiguring LoRA with spatial sensitivity, LoRAE significantly reduces the scale of parameter updates while retaining spatial information critical to convolutional operations. This design makes it particularly suitable for resource-constrained environments, offering a novel solution for efficient model updates in edge AI applications.
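To show how the two modules fit together, the following PyTorch sketch is our reading of Eqs. (2)–(4), not the authors' reference implementation: the module names, the initialisation, and the use of a 1 x 1 convolution to realise the mapper are assumptions, and the fusion step assumes a standard convolution (groups = 1, default dilation).

```python
import torch
import torch.nn as nn

class LoRAEConv2d(nn.Module):
    """Sketch of a LoRAE-adapted convolution: frozen base conv plus a low-rank side path."""

    def __init__(self, base: nn.Conv2d, r: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():                 # pretrained W0 stays frozen
            p.requires_grad = False
        c_in, c_out = base.in_channels, base.out_channels
        # LoRAextractor: k x k conv with r output channels keeps spatial structure (Eq. 3).
        self.extractor = nn.Conv2d(c_in, r, kernel_size=base.kernel_size,
                                   stride=base.stride, padding=base.padding, bias=False)
        # LoRAmapper: 1 x 1 conv, i.e. a per-pixel r -> C_out matrix multiply (Eq. 4).
        self.mapper = nn.Conv2d(r, c_out, kernel_size=1, bias=False)
        nn.init.kaiming_uniform_(self.extractor.weight, a=5 ** 0.5)
        nn.init.zeros_(self.mapper.weight)               # so the low-rank update starts at zero

    def forward(self, x):
        return self.base(x) + self.mapper(self.extractor(x))

    @torch.no_grad()
    def fuse(self) -> nn.Conv2d:
        """Fold the update into W0 for deployment: dW[o, i] = sum_r M[o, r] * E[r, i]."""
        E = self.extractor.weight                        # (r, C_in, k, k)
        M = self.mapper.weight[:, :, 0, 0]               # (C_out, r)
        self.base.weight.add_(torch.einsum('or,rikl->oikl', M, E))
        return self.base                                 # deploy the plain convolution
```

Because both side branches are linear, calling fuse() after training folds the update into the frozen kernel, so the deployed layer has exactly the shape and cost of the original convolution, matching the reparameterization option described earlier.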

Optimizations in LoRAE

LoRAE introduces optimizations that exploit the redundancy in convolutional operations, limiting weight updates to a low-dimensional subspace. Key benefits include:

Parameter scale reduction

The parameter count for traditional convolution is given by:

$P_{\text{conv}} = C_{\text{in}} \times C_{\text{out}} \times k \times k$  (5)

where $C_{\text{in}}$ and $C_{\text{out}}$ denote the input and output channel counts and $k$ is the kernel size.

LoRAE reduces this to:

$P_{\text{LoRAE}} = C_{\text{in}} \times r \times k \times k + r \times C_{\text{out}}$  (6)

where $r \ll \min(C_{\text{in}}, C_{\text{out}})$.

The reduction ratio is approximately:

$\dfrac{P_{\text{LoRAE}}}{P_{\text{conv}}} \approx \dfrac{r}{C_{\text{out}}} + \dfrac{r}{C_{\text{in}}\,k^{2}}$  (7)
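Plugging representative numbers into Eqs. (5)–(7) makes the reduction concrete; the 256-channel, 3 x 3 layer below is an illustrative configuration of ours, not a layer taken from the evaluated models.

```python
def conv_params(c_in, c_out, k):
    """Eq. (5): parameters of a dense k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def lorae_params(c_in, c_out, k, r):
    """Eq. (6): extractor (c_in * r * k * k) plus mapper (r * c_out)."""
    return c_in * r * k * k + r * c_out

full = conv_params(256, 256, 3)        # 589,824
low = lorae_params(256, 256, 3, 8)     # 18,432 + 2,048 = 20,480
print(low / full)                      # ~0.035, i.e. roughly 3.5% of the original update
```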

Computational complexity reduction

The computational complexity for a layer producing an $H \times W$ output feature map is reduced from:

$O\!\left(H \times W \times C_{\text{in}} \times C_{\text{out}} \times k^{2}\right)$  (8)

to:

$O\!\left(H \times W \times \left(C_{\text{in}} \times r \times k^{2} + r \times C_{\text{out}}\right)\right)$  (9)

These optimizations render LoRAE highly effective for resource-constrained edge AI tasks.

Experiments

Experimental design and configuration

Task definition and datasets

To address challenges in resource-constrained edge AI scenarios, LoRAE is evaluated on image classification, object detection, and image segmentation tasks, encompassing both general vision tasks and domain-specific applications to comprehensively assess its performance and adaptability. The datasets used include ImageNet44 for image classification, VOC45 for object detection, and three domain-specific datasets: GlobalWheat202046 for wheat head detection in smart agriculture, Crack-seg47 for crack detection in buildings, and Carparts-seg48 for car part segmentation in intelligent driving. These datasets provide diverse scenarios to validate LoRAE’s effectiveness under highly dynamic environments and stringent resource constraints.

Models and hardware configuration

To comprehensively evaluate LoRAE, mainstream models across the three tasks were selected, spanning from traditional convolutional neural networks to state-of-the-art algorithms to provide a robust evaluation framework. For image classification, YOLO11-cls, YOLOv8-cls, EfficientNet, and ResNet were chosen; for object detection, YOLO11, YOLOv8, YOLOv5, and Faster R-CNN; and for image segmentation, YOLO11-seg, YOLOv8-seg, U-Net, and DeepLabV3+. Rank values r of 2, 4, 8, 16, 32, and 64 were tested to investigate their impact on performance metrics, including training time, accuracy, and computational efficiency. For reproducibility, the hyperparameter settings for YOLOv8 are summarized in Table 1. The experiments were conducted on a high-performance system with two Intel® Xeon® Gold 6248R processors (24 cores each) and two NVIDIA GeForce RTX 3090 GPUs (24 GB memory each) to ensure reliable and reproducible results.

Table 1.

Hyperparameters of YOLOv8 for three vision tasks.

Config Vision task
Image classification Object detection Image segmentation
Datasets CIFAR-100 VOC Carparts-seg
Optimizer Adam Adam Adam
Training epochs 100 or 300 100 or 300 100 or 300
Initial learning rate 0.001 0.001 0.001
Batch size 1024 16 16
Image size 32 640 640
Weight decay 0.0005 0.0005 0.0005
Momentum 0.937 0.937 0.937
Warmup epochs 3.0 3.0 3.0
Dropout rate 0.0
Box loss gain 7.5
Class loss gain 0.5
IoU threshold 0.7
Max detections 300
Mask ratio 4.0
Overlap mask True
Rank (r) 2, 3, 4, ..., 64 2, 3, 4, ..., 64 2, 3, 4, ..., 64
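For orientation, the object detection column of Table 1 maps onto an Ultralytics training call roughly as sketched below. This assumes the Ultralytics YOLOv8 Python API and its bundled VOC.yaml dataset definition, and it launches a plain run; wrapping the backbone convolutions with LoRAE modules would happen before training and is omitted here.

```python
from ultralytics import YOLO

model = YOLO('yolov8x.pt')              # pretrained detection backbone
model.train(
    data='VOC.yaml',                    # Pascal VOC dataset definition
    epochs=100,                         # 100 or 300 in the paper
    imgsz=640,
    batch=16,
    optimizer='Adam',
    lr0=0.001,
    weight_decay=0.0005,
    momentum=0.937,
    warmup_epochs=3.0,
    box=7.5,                            # box loss gain
    cls=0.5,                            # class loss gain
)
```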

Performance evaluation of LoRAE

This section presents a detailed quantitative and qualitative analysis of the LoRAE method across three tasks: image classification, object detection, and image segmentation. The evaluation systematically examines LoRAE’s effectiveness in enhancing model accuracy and optimizing trainable parameter scales. Key aspects of the analysis include validation accuracy, model parameter scale, loss and accuracy curves.

Tables 2, 3, 4 present a comparative analysis of training with and without LoRAE. In Tables 2, 3, 4 throughout this section, bold values represent the recommended rank (r) settings of LoRAE for corresponding model-task combinations (non-bold values are non-recommended), selected based on balanced parameter reduction and accuracy. The comparison emphasizes validation accuracy and changes in trainable parameters to demonstrate LoRAE’s effectiveness and applicability. The experimental results reveal that LoRAE significantly reduces trainable parameter scales. At the same time, it maintains or surpasses the accuracy levels achieved by conventional training methods across all three tasks. For image classification, LoRAE achieves consistent Top-1 and Top-5 accuracy on the CIFAR-100 dataset. It also significantly reduces the number of updated parameters. In object detection, LoRAE demonstrates substantial parameter reductions on the VOC and GlobalWheat2020 datasets. It maintains high mAP@50(B) and mAP@50-95(B) across both small models (e.g., YOLOv8n) and large models (e.g., YOLO11x), showcasing its adaptability. For image segmentation, LoRAE exhibits resource optimization advantages on the Crack-seg and Carparts-seg datasets. While small models experience minor accuracy drops within acceptable ranges, large models display superior performance.

Table 2.

Comparison of model accuracy and parameter counts in image classification.

Model r CIFAR-100 Upd. Params (M)
Acc-top1 Acc-top5
YOLO11n-cls 0.552 0.821 1.66
64 0.533 0.822 0.99
YOLOv8n-cls 0.540 0.822 1.99
64 0.542 0.813 0.84
EfficientNet-B1 0.561 0.816 7.79
64 0.542 0.813 2.14
ResNet-152 0.461 0.737 58.3
64 0.442 0.703 6.77
YOLO11x-cls 0.729 0.919 28.4
64 0.714 0.922 6.57
YOLOv8x-cls 0.696 0.904 60.4
64 0.684 0.906 8.4

Table 3.

Comparison of model accuracy and parameter counts in object detection.

Model r VOC GlobalWheat2020 Upd. Params (M)
mAP@50(B) mAP@50-95(B) mAP@50(B) mAP@50-95(B)
YOLO11n 0.843 0.647 0.974 0.662 2.59
8 0.867 0.676 0.964 0.621 0.38
YOLOv8n 0.820 0.622 0.969 0.640 3.01
8 0.829 0.627 0.952 0.588 0.29
YOLOv5s 0.842 0.648 0.974 0.663 9.13
8 0.881 0.694 0.964 0.625 0.50
Faster R-CNN 0.851 0.647 0.972 0.668 41.53
8 0.854 0.667 0.962 0.638 9.18
YOLO11x 0.896 0.735 0.985 0.714 56.90
8 0.936 0.795 0.980 0.686 5.62
YOLOv8x 0.881 0.717 0.984 0.708 68.20
8 0.932 0.783 0.976 0.694 1.93

Table 4.

Comparison of model accuracy and parameter counts in image segmentation.

Model r Crack-seg Carparts-seg Upd. Params (M)
mAP@50(M) mAP@50-95(M) mAP@50(M) mAP@50-95(M)
YOLO11n-seg 0.688 0.235 0.690 0.566 2.84
8 0.679 0.218 0.664 0.499 0.75
YOLOv8n-seg 0.660 0.213 0.680 0.559 3.27
8 0.673 0.212 0.555 0.417 0.34
U-Net 0.636 0.201 0.632 0.534 24.44
8 0.621 0.194 0.628 0.505 8.02
DeepLabV3+ 0.667 0.222 0.663 0.574 58.63
8 0.639 0.207 0.652 0.540 12.19
YOLO11x-seg 0.708 0.241 0.675 0.583 62.05
8 0.719 0.244 0.715 0.592 4.09
YOLOv8x-seg 0.704 0.236 0.704 0.607 71.77
8 0.715 0.232 0.693 0.564 2.12

Figures 3 and 4 illustrate the training loss and validation accuracy curves across the three tasks. In Fig. 3a, the Top-5 accuracy stabilizes within approximately 20 epochs for all models. This indicates rapid feature learning. Validation accuracy improves with increased parameter scales, such as from YOLOv8n-cls to YOLO11x-cls. This demonstrates the enhanced feature representation capability of larger models. In Fig. 3b, the validation accuracy curves show that mAP@50 (Box) stabilizes after around 40 epochs. Models using LoRAE, such as the LoRAE-adapted YOLO11n, outperform their counterparts. This highlights LoRAE’s effectiveness in enhancing feature and positional information learning. Furthermore, Fig. 4b demonstrates that the training loss curves exhibit faster convergence and improved stability with LoRAE. Figures 3c and 4c reveal significant fluctuations in validation accuracy and loss curves during the first 100 epochs. These initial fluctuations are often observed in the early stages of training, particularly when adapting to new or complex tasks. They can be attributed to the model’s exploratory learning phase, where it is still adjusting to the specific nuances and distribution characteristics of the new task’s data, sometimes coupled with the dynamics of the learning rate scheduler. As training progresses, the model gradually converges, with accuracy increasing and loss decreasing overall. This demonstrates that LoRAE effectively captures distinguishing features of automotive parts. Despite a substantial reduction in trainable parameters, LoRAE achieves performance comparable to, or even surpassing, that of fully fine-tuned models.

Fig. 3. Accuracy trends across vision tasks.

Fig. 4. Validation loss comparison across vision tasks.

Figure 5 provides further evidence for these findings by illustrating the relationship between updated parameters and accuracy across different model architectures. In image classification (Fig. 5a), LoRAE-enabled models sustain high Acc-top1 scores even with up to 77.0% parameter reduction, while larger models such as YOLO11x-cls exhibit only marginal accuracy degradation. For object detection (Fig. 5b), mAP@50(B) remains stable or even improves (e.g., YOLOv5s achieves gains with 97.3% fewer parameters) across various parameter scales. In image segmentation (Fig. 5c), consistent trends are observed, with models like YOLO11x-seg attaining higher mAP@50(M) while updating substantially fewer parameters. These results collectively confirm LoRAE’s effectiveness in balancing efficiency and performance across diverse tasks.

Fig. 5. Performance evaluation of LoRAE across model architectures and configurations.

Overall, as task complexity increases, the required number of training epochs also grows. The LoRAE method demonstrates particularly strong performance in models with larger parameter scales. It significantly reduces the number of trainable parameters while achieving superior training accuracy.

Rank value analysis

This section investigates the impact of the rank $r$ on model performance, emphasizing the relationship between validation accuracy and the number of updated parameters under different $r$ settings. The variation of validation accuracy with respect to $r$ is also analyzed. As summarized in Tables 5, 6, and 7, the specific effects of $r$ on image classification, object detection, and image segmentation tasks are discussed. In Tables 5, 6, and 7 throughout this section, bold values represent the recommended rank ($r$) settings of LoRAE for corresponding model-task combinations (non-bold values are non-recommended), selected based on balanced parameter reduction and accuracy. This analysis serves as a reference for selecting $r$ to balance model performance and resource efficiency. It provides guidance for determining optimal $r$ configurations for LoRAE across various tasks.

Table 5.

Performance of LoRAE on image classification with varying rank values.

(↑/↓ denote the change relative to the full fine-tuning baseline in the first row of each model block.)
Model r CIFAR-100 Upd. Params (M)
Acc-top1 Acc-top5
YOLO11n-cls 0.552 0.821 1.66
64 0.533 ↓3.4% 0.822 ↑0.1% 1.00 ↓39.9%
32 0.514 ↓6.9% 0.805 ↓2.0% 0.51 ↓69.5%
16 0.508 ↓8.0% 0.807 ↓1.7% 0.26 ↓84.2%
8 0.484 ↓12.3% 0.7946 ↓3.2% 0.14 ↓91.6%
4 0.415 ↓24.8% 0.7267 ↓11.5% 0.08 ↓95.3%
2 0.341 ↓38.2% 0.639 ↓22.2% 0.05 ↓97.1%
YOLOv8n-cls 0.540 0.822 1.99
64 0.542 ↑0.4% 0.813 ↓1.1% 0.84 ↓57.5%
32 0.499 ↓7.6% 0.7972 ↓3.0% 0.42 ↓78.7%
16 0.428 ↓20.7% 0.7323 ↓10.9% 0.21 ↓89.4%
8 0.374 ↓30.6% 0.6733 ↓18.1% 0.10 ↓94.7%
4 0.309 ↓43.0% 0.5894 ↓28.3% 0.05 ↓97.3%
2 0.226 ↓58.2% 0.480 ↓41.5% 0.03 ↓98.7%
EfficientNet-B1 0.561 0.816 7.79
64 0.542 ↓3.3% 0.813 ↓0.4% 2.14 ↓72.4%
32 0.511 ↓8.9% 0.7864 ↓3.7% 1.13 ↓85.5%
16 0.498 ↓11.2% 0.7624 ↓6.6% 0.60 ↓92.4%
8 0.423 ↓24.7% 0.6967 ↓14.7% 0.31 ↓96.0%
4 0.387 ↓31.0% 0.6241 ↓23.6% 0.17 ↓97.9%
2 0.241 ↓57.0% 0.5246 ↓35.8% 0.09 ↓98.9%
ResNet-152 0.461 0.737 58.35
64 0.442 ↓4.1% 0.703 ↓4.7% 6.78 ↓88.4%
32 0.407 ↓11.7% 0.676 ↓8.4% 3.64 ↓93.8%
16 0.361 ↓21.7% 0.596 ↓19.2% 1.98 ↓96.6%
8 0.315 ↓31.6% 0.561 ↓23.9% 1.01 ↓98.3%
4 0.241 ↓47.8% 0.468 ↓36.6% 0.54 ↓99.1%
2 0.148 ↓67.9% 0.346 ↓53.0% 0.30 ↓99.5%
YOLO11x-cls 0.729 0.919 28.48
64 0.714 ↓2.1% 0.922 ↑0.3% 6.57 ↓77.0%
32 0.708 ↓2.9% 0.918 ↓0.1% 3.43 ↓88.0%
16 0.676 ↓7.3% 0.909 ↓1.1% 1.87 ↓93.4%
8 0.650 ↓10.8% 0.893 ↓2.8% 1.08 ↓96.2%
4 0.613 ↓15.9% 0.878 ↓4.5% 0.69 ↓97.6%
2 0.574 ↓21.3% 0.849 ↓7.6% 0.50 ↓98.3%
YOLOv8x-cls 0.696 0.904 60.47
64 0.684 ↓1.7% 0.906 ↑0.2% 8.40 ↓86.1%
32 0.668 ↓4.0% 0.902 ↓0.2% 4.20 ↓93.0%
16 0.644 ↓7.5% 0.892 ↓1.3% 2.10 ↓96.5%
8 0.605 ↓13.1% 0.866 ↓4.2% 1.05 ↓98.3%
4 0.559 ↓19.7% 0.826 ↓8.6% 0.53 ↓99.1%
2 0.480 ↓31.2% 0.763 ↓15.6% 0.26 ↓99.6%

Table 6.

Performance of LoRAE on object detection with varying rank values.

(↑/↓ denote the change relative to the full fine-tuning baseline in the first row of each model block.)
Model r VOC GlobalWheat2020 Upd. Params (M)
mAP@50(B) mAP@50-95(B) mAP@50(B) mAP@50-95(B)
YOLOv5s 0.842 0.648 0.974 0.663 9.13
64 0.885 ↑5.1% 0.690 ↑6.5% 0.967 ↓0.7% 0.640 ↓3.5% 2.01 ↓78.0%
32 0.887 ↑5.3% 0.691 ↑6.6% 0.970 ↓0.4% 0.646 ↓2.6% 1.01 ↓88.9%
16 0.883 ↑4.9% 0.692 ↑6.8% 0.974 0.0% 0.658 ↓0.8% 0.50 ↓94.5%
8 0.881 ↑4.6% 0.694 ↑7.1% 0.964 ↓1.0% 0.625 ↓5.7% 0.25 ↓97.3%
4 0.873 ↑3.7% 0.677 ↑4.5% 0.968 ↓0.6% 0.630 ↓4.9% 0.13 ↓98.6%
2 0.867 ↑3.0% 0.667 ↑3.0% 0.963 ↓1.1% 0.614 ↓7.4% 0.06 ↓99.3%
Faster R-CNN 0.851 0.647 0.972 0.668 41.53
64 0.851 0.0% 0.654 ↑1.1% 0.941 ↓3.2% 0.581 ↓13.0% 38.42 ↓7.5%
32 0.853 ↑0.2% 0.658 ↑1.7% 0.958 ↓1.4% 0.621 ↓7.0% 30.36 ↓26.9%
16 0.867 ↑1.9% 0.672 ↑3.9% 0.970 ↓0.2% 0.649 ↓2.8% 16.24 ↓60.8%
8 0.854 ↑0.4% 0.667 ↑3.1% 0.962 ↓1.0% 0.638 ↓4.5% 9.18 ↓77.9%
4 0.857 ↑0.7% 0.659 ↑1.9% 0.962 ↓0.1% 0.619 ↓7.3% 5.65 ↓86.4%
2 0.829 ↓2.6% 0.628 ↓3.0% 0.924 ↓4.9% 0.554 ↓17.1% 3.88 ↓90.7%
YOLOv8n 0.820 0.622 0.969 0.640 3.01
64 0.840 ↑2.4% 0.642 ↑3.2% 0.962 ↓0.7% 0.619 ↓3.3% 2.30 ↓23.7%
32 0.843 ↑2.8% 0.642 ↑3.2% 0.958 ↓1.1% 0.611 ↓4.5% 1.15 ↓61.7%
16 0.833 ↑1.6% 0.636 ↑2.3% 0.955 ↓1.4% 0.599 ↓6.4% 0.58 ↓80.7%
8 0.829 ↑1.1% 0.627 ↑0.8% 0.952 ↓1.8% 0.588 ↓8.1% 0.29 ↓90.4%
4 0.790 ↓3.7% 0.583 ↓6.3% 0.941 ↓3.0% 0.561 ↓12.3% 0.14 ↓95.4%
2 0.736 ↓10.2% 0.523 ↓15.9% 0.927 ↓4.3% 0.541 ↓15.5% 0.07 ↓97.7%
YOLOv8x 0.881 0.717 0.984 0.708 68.17
64 0.931 ↑5.7% 0.771 ↑7.5% 0.976 ↓0.8% 0.670 ↓5.4% 15.46 ↓77.3%
32 0.930 ↑5.6% 0.778 ↑8.5% 0.974 ↓1.0% 0.675 ↓4.7% 7.73 ↓88.7%
16 0.932 ↑5.8% 0.773 ↑7.8% 0.978 ↓0.6% 0.675 ↓4.7% 3.87 ↓94.3%
8 0.932 ↑5.8% 0.783 ↑9.2% 0.976 ↓0.8% 0.694 ↓2.0% 1.93 ↓97.2%
4 0.932 ↑5.8% 0.782 ↑9.1% 0.974 ↓1.0% 0.656 ↓7.4% 0.97 ↓98.6%
2 0.930 ↑5.6% 0.778 ↑8.5% 0.977 ↓0.8% 0.661 ↓6.6% 0.48 ↓99.3%
YOLO11n 0.843 0.647 0.974 0.662 2.59
64 0.839 ↓0.5% 0.644 ↓0.5% 0.971 ↓0.3% 0.651 ↓1.7% 2.22 ↓14.3%
32 0.841 ↓0.2% 0.644 ↓0.5% 0.970 ↓0.4% 0.643 ↓2.9% 1.17 ↓54.8%
16 0.870 ↑3.2% 0.678 ↑4.8% 0.968 ↓0.6% 0.639 ↓3.5% 0.65 ↓74.9%
8 0.867 ↑2.9% 0.676 ↑4.5% 0.964 ↓1.0% 0.621 ↓6.2% 0.38 ↓85.3%
4 0.866 ↑2.7% 0.675 ↑4.3% 0.960 ↓1.4% 0.610 ↓7.9% 0.25 ↓90.3%
2 0.865 ↑2.6% 0.668 ↑3.2% 0.954 ↓2.1% 0.591 ↓10.7% 0.19 ↓92.7%
YOLO11x 0.896 0.735 0.985 0.714 56.90
64 0.932 ↑4.0% 0.791 ↑7.6% 0.982 ↓0.3% 0.707 ↓1.0% 15.46 ↓72.9%
32 0.936 ↑4.5% 0.792 ↑7.7% 0.982 ↓0.3% 0.692 ↓3.1% 9.15 ↓83.9%
16 0.937 ↑4.6% 0.787 ↑7.1% 0.981 ↓0.4% 0.686 ↓4.0% 5.62 ↓90.1%
8 0.936 ↑4.5% 0.795 ↑8.2% 0.980 ↓0.5% 0.681 ↓4.6% 3.86 ↓93.2%
4 0.925 ↑3.2% 0.762 ↑3.7% 0.978 ↓0.7% 0.671 ↓5.9% 2.54 ↓95.5%
2 0.939 ↑4.8% 0.796 ↑8.3% 0.974 ↓1.1% 0.661 ↓7.4% 1.27 ↓97.8%

Table 7.

Performance of LoRAE on image segmentation with varying rank values.

(↑/↓ denote the change relative to the full fine-tuning baseline in the first row of each model block.)
Model r Crack-seg Carparts-seg Upd. Params (M)
mAP@50(M) mAP@50-95(M) mAP@50(M) mAP@50-95(M)
U-Net 0.636 0.201 0.632 0.534 24.44
64 0.642 ↑0.9% 0.204 ↑1.5% 0.640 ↑1.3% 0.529 ↓0.9% 20.12 ↓17.7%
32 0.645 ↑1.4% 0.205 ↑2.0% 0.648 ↑2.5% 0.524 ↓1.9% 16.06 ↓34.3%
16 0.638 ↑0.3% 0.203 ↑1.0% 0.635 ↑0.5% 0.518 ↓3.0% 12.03 ↓50.8%
8 0.621 ↓2.4% 0.194 ↓3.5% 0.628 ↓0.6% 0.505 ↓5.4% 8.02 ↓67.2%
4 0.577 ↓9.3% 0.182 ↓9.5% 0.620 ↓1.9% 0.498 ↓6.7% 4.01 ↓83.6%
2 0.519 ↓18.4% 0.163 ↓18.9% 0.607 ↓4.0% 0.471 ↓11.8% 2.00 ↓91.8%
DeepLabV3+ 0.667 0.222 0.663 0.574 58.63
64 0.672 ↑0.8% 0.225 ↑1.4% 0.668 ↑0.8% 0.569 ↓0.9% 47.19 ↓19.5%
32 0.678 ↑1.6% 0.228 ↑2.7% 0.672 ↑1.4% 0.563 ↓1.9% 35.75 ↓39.0%
16 0.662 ↓0.8% 0.217 ↓2.3% 0.660 ↓0.5% 0.550 ↓4.2% 24.38 ↓58.4%
8 0.639 ↓4.2% 0.207 ↓6.8% 0.652 ↓1.7% 0.540 ↓6.0% 12.19 ↓79.2%
4 0.621 ↓6.9% 0.200 ↓9.9% 0.643 ↓3.0% 0.531 ↓7.5% 6.09 ↓89.6%
2 0.576 ↓13.6% 0.183 ↓17.6% 0.635 ↓4.2% 0.522 ↓9.1% 3.05 ↓94.8%
YOLOv8n-seg 0.660 0.213 0.680 0.559 3.27
64 0.684 ↑3.6% 0.230 ↑8.0% 0.664 ↓2.3% 0.515 ↓7.9% 2.71 ↓17.0%
32 0.667 ↑1.1% 0.212 ↓0.5% 0.646 ↓5.0% 0.509 ↓9.0% 1.36 ↓58.5%
16 0.653 ↓1.1% 0.212 ↓0.5% 0.628 ↓7.6% 0.481 ↓14.0% 0.68 ↓79.2%
8 0.673 ↑2.0% 0.212 ↓0.5% 0.555 ↓18.4% 0.417 ↓25.4% 0.34 ↓89.6%
4 0.676 ↑2.4% 0.217 ↑1.9% 0.542 ↓20.3% 0.403 ↓28.0% 0.17 ↓94.8%
2 0.669 ↑1.4% 0.206 ↓3.3% 0.415 ↓38.9% 0.310 ↓44.5% 0.08 ↓97.4%
YOLOv8x-seg 0.704 0.236 0.704 0.607 71.77
64 0.673 ↓4.4% 0.219 ↓7.2% 0.726 ↑3.1% 0.602 ↓0.8% 17.00 ↓76.3%
32 0.575 ↓18.3% 0.188 ↓20.3% 0.728 ↑3.4% 0.603 ↓0.7% 8.49 ↓88.2%
16 0.712 ↑1.1% 0.222 ↓5.9% 0.719 ↑2.1% 0.593 ↓2.3% 4.25 ↓94.1%
8 0.715 ↑1.6% 0.232 ↓1.7% 0.693 ↓1.6% 0.564 ↓7.1% 2.12 ↓97.0%
4 0.717 ↑1.8% 0.235 ↓0.4% 0.695 ↓1.6% 0.567 ↓6.6% 1.06 ↓98.5%
2 0.724 ↑2.8% 0.238 ↑0.8% 0.687 ↓2.4% 0.563 ↓7.2% 0.53 ↓99.3%
YOLO11n-seg 0.688 0.235 0.690 0.566 2.84
64 0.671 ↓2.5% 0.216 ↓8.1% 0.708 ↑2.6% 0.559 ↓1.2% 2.85 ↑0.1%
32 0.717 ↑4.2% 0.232 ↓1.3% 0.698 ↑1.2% 0.543 ↓4.1% 2.64 ↓7.3%
16 0.687 ↓0.1% 0.228 ↓3.0% 0.689 ↓0.1% 0.536 ↓5.3% 1.38 ↓51.5%
8 0.679 ↓1.3% 0.218 ↓7.2% 0.664 ↓3.8% 0.499 ↓11.8% 0.75 ↓73.6%
4 0.674 ↓2.0% 0.216 ↓8.1% 0.604 ↓12.5% 0.447 ↓21.0% 0.44 ↓84.7%
2 0.674 ↓2.0% 0.210 ↓10.6% 0.588 ↓14.8% 0.425 ↓24.9% 0.28 ↓90.2%
YOLO11x-seg 0.708 0.241 0.675 0.583 62.05
64 0.668 ↓5.7% 0.221 ↓8.3% 0.706 ↑4.6% 0.587 ↑0.7% 18.03 ↓71.0%
32 0.672 ↓5.1% 0.225 ↓6.6% 0.719 ↑6.5% 0.595 ↑2.1% 10.06 ↓83.8%
16 0.701 ↓1.0% 0.236 ↓2.1% 0.727 ↑7.7% 0.606 ↑3.9% 6.08 ↓90.2%
8 0.719 ↑1.6% 0.244 ↑1.2% 0.715 ↑6.0% 0.592 ↑1.6% 4.09 ↓93.4%
4 0.712 ↑0.6% 0.240 ↓0.4% 0.710 ↑5.2% 0.579 ↓0.7% 3.09 ↓95.0%
2 0.714 ↑0.8% 0.252 ↑4.6% 0.681 ↑1.0% 0.558 ↓4.3% 2.60 ↓95.8%

In image classification, the results on the CIFAR-100 dataset (Table 5) demonstrate a gradual decline in model accuracy as $r$ decreases. This decline is primarily attributed to the reduced feature representation capacity. However, even at $r = 64$, the YOLO11x-cls and YOLOv8x-cls models achieve substantial parameter reductions of 86.1% and 98.3%, respectively. Notably, both models show improvements in Top-5 accuracy by 0.3% and 0.2%, with only minor decreases in Top-1 accuracy. These results highlight the effectiveness of LoRAE in balancing performance and efficiency. For object detection, Table 6 indicates that LoRAE maintains competitive performance even at $r = 8$. The YOLOv5s model achieves mAP@50 and mAP@50-95 improvements of 4.6% and 7.1% on the VOC dataset, respectively, while reducing trainable parameters by 97.3%. Larger models, such as YOLO11x, exhibit stable performance at lower $r$ values. In contrast, smaller models are more sensitive to reductions in $r$, particularly on complex datasets like GlobalWheat2020. In image segmentation, Table 7 summarizes performance trends under varying $r$. For example, at a suitable rank setting, the YOLOv8x-seg model achieves a 0.1% mAP@50 improvement on the Crack-seg dataset, along with a 66.8% parameter reduction. Larger models show greater stability at lower $r$. However, smaller models, such as YOLOv8n-seg, experience more pronounced performance drops.

Overall, the results demonstrate that $r$ has a significant impact on model performance and parameter reduction. While lower $r$ values reduce trainable parameters, they may degrade performance, particularly for smaller models. Medium $r$ values (e.g., $r = 8$ or $r = 16$) strike a balance between performance and resource efficiency. Larger models exhibit greater stability at lower $r$. Additionally, task and dataset characteristics influence sensitivity to $r$. Proper selection of $r$ enables efficient model updates and performance optimization in resource-constrained scenarios.
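One simple way to operationalise this guidance is to pick, for a given accuracy budget, the cheapest rank observed in a short sweep. The helper below is our own illustration rather than a procedure from the paper; the example numbers are the YOLO11n-cls results from Table 5.

```python
# Top-1 accuracy and updated parameters (M) per rank for YOLO11n-cls on CIFAR-100 (Table 5);
# the full fine-tuning baseline is 0.552 top-1 with 1.66 M updated parameters.
yolo11n_cls = {64: (0.533, 1.00), 32: (0.514, 0.51), 16: (0.508, 0.26),
               8: (0.484, 0.14), 4: (0.415, 0.08), 2: (0.341, 0.05)}

def pick_rank(baseline_acc, results, max_drop=0.05):
    """Cheapest rank whose accuracy drop versus full fine-tuning stays within max_drop."""
    ok = {r: params for r, (acc, params) in results.items() if baseline_acc - acc <= max_drop}
    return min(ok, key=ok.get) if ok else max(results, key=lambda r: results[r][0])

print(pick_rank(0.552, yolo11n_cls))   # -> 16 (drop 0.044 within budget, only 0.26 M updated)
```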

Visualization analysis

To further validate the performance of the LoRAE method across tasks, this study presents a visualization analysis of object detection and image segmentation, demonstrating its applicability in edge AI scenarios. As shown in Fig. 6, detailed experimental designs and dataset splits were conducted for both tasks. The object detection task involves multi-object scenes (e.g., cat, dog, person, sofa), while the image segmentation task focuses on vehicle components (e.g., windows, doors, wheels). Red dashed boxes in images highlight detection/segmentation errors for intuitive performance comparison.

Fig. 6. Visualization analysis of LoRAE’s performance in object detection and image segmentation tasks.

In object detection (Fig. 6a ), two methods were compared: Retrain and LoRAE. Quantitative and qualitative analyses focused on detection accuracy, false detection rate, and complex-scene adaptability. Under the 8-15-5 split, LoRAE demonstrated superior detection capability: Retrain failed to detect cats in Image I (indices 4-6), while LoRAE successfully identified them. In Image II (indices 4-6), LoRAE correctly detected sofas, whereas Retrain missed them. Retrain also exhibited cat miss-detections and sofa false-detections in Image II (indices 4, 7), while LoRAE produced precise results. Overall, LoRAE significantly reduced false annotation rates and improved detection accuracy.

In image segmentation (Fig. 6b), LoRAE outperformed Retrain in boundary processing and complex-scene detail segmentation. Retrain failed to identify front mirrors in Image I (index 7), while LoRAE achieved accurate segmentation. In Image II, Retrain showed inaccuracies in wheel segmentation and door differentiation, whereas LoRAE provided precise door and wheel segmentation with only minor wheel-region deviations. Experimental results confirmed that LoRAE reduced class error rates and enhanced segmentation accuracy and reliability, particularly in boundary details.

Combined results show LoRAE outperforming Retrain in both tasks. LoRAE achieved efficient model optimization with minimal parameter updates, reducing false annotation rates in object detection and improving boundary segmentation in image tasks. These findings validate LoRAE’s practicality for resource-constrained edge AI. Compared to Retrain, LoRAE maintains high performance with significantly fewer parameter updates. Future research may explore its application to larger datasets and additional tasks.

Conclusion and future work

This study introduces an innovative low-rank adaptation method for Edge AI (LoRAE), specifically designed to address the challenges of efficient model updates in resource-constrained edge AI scenarios. The approach leverages low-rank decomposition of weight matrices to minimize the number of updated parameters, achieving approximately 4% of the parameter updates required by traditional full-parameter methods. This effectively mitigates computational and communication burdens during model adjustments. LoRAE significantly reduces the scale of trainable parameters while maintaining accuracy comparable to, and in some cases exceeding, that of full-parameter update methods. Extensive experiments across image classification, object detection, and image segmentation tasks validate its performance. For example, in object detection using the YOLOv8x model with $r = 4$, LoRAE reduces parameter updates by 98.6% while improving mAP@50(B) by 5.8%. Even at very low rank settings, LoRAE incurs a mere 1.6% decrease in mAP@50(B) compared to traditional retraining, demonstrating its robustness. Currently, the effectiveness of LoRAE has primarily been validated on 2D visual tasks such as image classification, object detection, and image segmentation. However, its applicability and performance in broader domains, including 3D object recognition, multimodal learning, and other non-visual tasks, require further in-depth investigation and validation. Furthermore, while LoRAE demonstrates excellent performance in efficiently reducing model update parameters, its potential as an aggressive direct model compression method for creating extremely compact models still necessitates more comprehensive exploration and empirical research.

Future work will focus on two primary directions. First, while this study emphasizes visual tasks (e.g., object detection, image segmentation), subsequent research will explore the applicability of LoRAE to other domains, such as natural language processing and multimodal tasks. Second, inspired by the significant intrinsic low-rank characteristics observed in small models, future efforts will investigate leveraging LoRAE for direct model compression. The goal is to develop extremely compact models that surpass original performance, offering a novel optimization approach for resource-constrained environments. These advancements will further solidify LoRAE’s role as an efficient solution for edge AI systems with stringent computational and communication constraints.

Author contributions

Zhixue Wang, the corresponding author, conceived the LoRAE method, designed the low-rank decomposition framework and convolutional layer optimization modules, and led the experimental design and manuscript writing; Hongyao Ma participated in algorithm modeling and multi-model validation across image classification, object detection, and segmentation tasks to analyze rank value impacts; Jiahui Zhai analyzed edge device resource constraints, handled dataset processing and experiment reproduction.

Funding

No funding support was received for this research.

Data availability

The datasets used in our experiments are all public datasets. The CIFAR-100 dataset can be accessed at https://www.cs.toronto.edu/~kriz/cifar.html, the VOC dataset is available at http://host.robots.ox.ac.uk/pascal/VOC/, the Global Wheat 2020 dataset can be found at https://www.kaggle.com/c/global-wheat-detection/data, the Crack-seg dataset is accessible at https://universe.roboflow.com/university-bswxt/crack-bphdr, and the Carparts-seg dataset is available at https://universe.roboflow.com/gianmarco-russo-vt9xr/car-seg-un1pm.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Singh, R. & Gill, S. S. Edge AI: A survey. Internet Things Cyber Phys. Syst. 3, 71–92 (2023).
  • 2. Shi, Y. et al. Communication-efficient edge AI: Algorithms and systems. IEEE Commun. Surv. Tutorials 22(4), 2167–2191 (2020).
  • 3. Martin, J. et al. Embedded vision intelligence for the safety of smart cities. J. Imaging 8(12), 326 (2022).
  • 4. Hyysalo, J. et al. Smart mask-Wearable IoT solution for improved protection and personal health. Internet Things 18, 100511 (2022).
  • 5. Daghero, F., Pagliari, D. J. & Poncino, M. Energy-efficient deep learning inference on edge devices. Adv. Comput. 122, 247–301 (2021).
  • 6. Capra, M. et al. Hardware and software optimizations for accelerating deep neural networks: Survey of current trends, challenges, and the road ahead. IEEE Access 8, 225134–225180 (2020).
  • 7. Damsgaard, H. J. et al. Adaptive approximate computing in edge AI and IoT applications: A review. J. Syst. Architect. 150, 103114 (2024).
  • 8. Hu, E. J., Shen, Y., Wallis, P. et al. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021).
  • 9. Sufian, A. et al. A survey on deep transfer learning to edge computing for mitigating the COVID-19 pandemic. J. Syst. Architect. 108, 101830 (2020).
  • 10. Qi, C. et al. An efficient pruning scheme of deep neural networks for Internet of Things applications. EURASIP J. Adv. Sig. Process. 2021(1), 31 (2021).
  • 11. Gupta, S., Agrawal, A., Gopalakrishnan, K. et al. Deep learning with limited numerical precision. In International Conference on Machine Learning 1737–1746 (PMLR, 2015).
  • 12. Wang, X., Han, Y., Leung, V. C. M. et al. Edge AI: Convergence of Edge Computing and Artificial Intelligence (Springer, 2020).
  • 13. Dastjerdi, A. V. & Buyya, R. Fog computing: Helping the Internet of Things realize its potential. Computer 49(8), 112–116 (2016).
  • 14. Cui, L. et al. A survey on application of machine learning for Internet of Things. Int. J. Mach. Learn. Cybernet. 9, 1399–1417 (2018).
  • 15. Teoh, Y. K., Gill, S. S. & Parlikad, A. K. IoT and fog-computing-based predictive maintenance model for effective asset management in Industry 4.0 using machine learning. IEEE Internet Things J. 10(3), 2087–2094 (2021).
  • 16. Kamruzzaman, M. M. New opportunities, challenges, and applications of edge-AI for connected healthcare in smart cities. In IEEE Globecom Workshops (GC Wkshps) 1–6 (IEEE, 2021).
  • 17. Soro, S. TinyML for ubiquitous edge AI. arXiv preprint arXiv:2102.01255 (2021).
  • 18. Lovén, L., Leppänen, T., Peltonen, E. et al. EdgeAI: A vision for distributed, edge-native artificial intelligence in future 6G networks. In 6G Wireless Summit, March 24–26, 2019, Levi (2019).
  • 19. Sipola, T. et al. In 31st Conference of Open Innovations Association (FRUCT) 320–331 (IEEE, 2022).
  • 20. Marculescu, R., Marculescu, D. & Ogras, U. Edge AI: Systems design and ML for IoT data analytics. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 3565–3566 (2020).
  • 21. Han, S., Pool, J., Tran, J. et al. Learning both weights and connections for efficient neural network. In Advances in Neural Information Processing Systems 28 (2015).
  • 22. Heo, B., Kim, J., Yun, S. et al. A comprehensive overhaul of feature distillation. In Proceedings of the IEEE/CVF International Conference on Computer Vision 1921–1930 (2019).
  • 23. Courbariaux, M., Bengio, Y. & David, J. Quantized neural networks: Training neural networks with low precision weights and activations. J. Mach. Learn. Res. 18, 1–31 (2016).
  • 24. Nagel, M., Fournarakis, M., Amjad, R. A., Bondarenko, Y., van Baalen, M. & Blankevoort, T. A white paper on neural network quantization. arXiv preprint arXiv:2106.08295 (2021).
  • 25. Wang, C. H., Huang, K. Y., Chen, J. C. et al. Heterogeneous federated learning through multi-branch network. In 2021 IEEE International Conference on Multimedia and Expo (ICME) 1–6 (IEEE, 2021).
  • 26. Chowdhery, A. et al. PaLM: Scaling language modeling with pathways. J. Mach. Learn. Res. 24(240), 1–113 (2023).
  • 27. Hoffmann, J., Borgeaud, S., Mensch, A. et al. Training compute-optimal large language models. arXiv preprint arXiv:2203.15556 (2022).
  • 28. Touvron, H., Lavril, T., Izacard, G. et al. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).
  • 29. Touvron, H., Martin, L., Stone, K. et al. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023).
  • 30. Kopiczko, D. J., Blankevoort, T. & Asano, Y. M. VeRA: Vector-based random matrix adaptation. arXiv preprint arXiv:2310.11454 (2023).
  • 31. Hyeon-Woo, N., Ye-Bin, M. & Oh, T. H. FedPara: Low-rank Hadamard product for communication-efficient federated learning. arXiv preprint arXiv:2108.06098 (2021).
  • 32. Zhao, H., Ni, B., Fan, J. et al. Continual forgetting for pre-trained vision models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 28631–28642 (2024).
  • 33. Tian, C., Shi, Z., Guo, Z. et al. HydraLoRA: An asymmetric LoRA architecture for efficient fine-tuning. arXiv preprint arXiv:2404.19245 (2024).
  • 34. Agiza, A., Neseem, M. & Reda, S. MTLoRA: Low-rank adaptation approach for efficient multi-task learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 16196–16205 (2024).
  • 35. Dou, S., Zhou, E., Liu, Y. et al. LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 1932–1945 (2024).
  • 36. Kirkpatrick, J. et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. 114(13), 3521–3526 (2017).
  • 37. Chen, J., Wang, Y., Wang, P. et al. DiffusePast: Diffusion-based generative replay for class incremental semantic segmentation. arXiv preprint arXiv:2308.01127 (2023).
  • 38. Isele, D. & Cosgun, A. Selective experience replay for lifelong learning. In Proceedings of the AAAI Conference on Artificial Intelligence 32(1) (2018).
  • 39. Liu, Y., Schiele, B., Vedaldi, A. et al. Continual detection transformer for incremental object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 23799–23808 (2023).
  • 40. Zhu, F. et al. Class-incremental learning via dual augmentation. Adv. Neural Inf. Process. Syst. 34, 14306–14318 (2021).
  • 41. Hospedales, T. et al. Meta-learning in neural networks: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 44(9), 5149–5169 (2021).
  • 42. Douillard, A., Ramé, A., Couairon, G. et al. DyTox: Transformers for continual learning with dynamic token expansion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 9285–9295 (2022).
  • 43. Liu, Y., Schiele, B. & Sun, Q. Adaptive aggregation networks for class-incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2544–2553 (2021).
  • 44. Deng, J., Dong, W., Socher, R. et al. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).
  • 45. Everingham, M. et al. The Pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88, 303–338 (2010).
  • 46. David, E., Madec, S., Sadeghi-Tehran, P. et al. Global wheat head detection (GWHD) dataset: A large and diverse dataset of high-resolution RGB-labelled images to develop and benchmark wheat head detection methods. Plant Phenomics (2020).
  • 47. Crack-Bphdr Dataset. Open source dataset by University. Roboflow Universe, Roboflow (2022). Retrieved December 2022 from https://universe.roboflow.com/university-bswxt/crack-bphdr
  • 48. Russo, G. Car-seg Dataset [Open source dataset]. Roboflow Universe, Roboflow (2023). Retrieved January 24, 2024 from https://universe.roboflow.com/gianmarco-russo-vt9xr/car-seg-un1pm
  • 49. He, K., Zhang, X., Ren, S. et al. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
  • 50. Zhong, Z., Tang, Z., He, T. et al. Convolution meets LoRA: Parameter efficient finetuning for Segment Anything Model. arXiv preprint arXiv:2401.17868 (2024).
