Abstract
The increasing demand for wind turbines and cost pressures in the wind energy industry have made the Wind Turbine Pultruded Panels Production Scheduling Problem (WTPP-PSP) a critical challenge. To address the production scheduling requirements of WTPP-PSP, an intelligent platform is proposed for wind turbine pultruded panel production systems, leveraging intelligent decision-making to tackle the problem. A multi-objective model based on mixed-integer linear programming is developed, considering sequence-dependent completion and setup time constraints. The model aims to maximize customer satisfaction, minimize total setup time, and reduce deviations in workshop machine loads. To solve this problem, an Adaptive Crayfish Optimization Algorithm (ACOA) is introduced. This algorithm incorporates crossover and mutation operators, making it effective for discrete optimization problems. Furthermore, an improved crowding distance calculation enhances the algorithm’s performance in multi-objective optimization by improving solution distribution. Reinforcement learning is employed to dynamically adjust temperature parameters, improving both exploration and exploitation capabilities and thus enhancing the convergence of the algorithm. The performance comparison using multi-objective metrics such as HV, IGD, GD, and NR demonstrates that ACOA significantly outperforms COA, WOA, and NSGA-II, with average improvements of 76%, 80%, 28%, and 220%, respectively. These results highlight ACOA’s consistent advantages in coverage, convergence, and solution diversity. In the application to WTPP-PSP, the proposed algorithm outperforms COA by approximately 13%, 10%, and 8% in the three objectives.
Keywords: Multi-objective optimization, Crayfish optimization algorithm, Reinforcement learning, Production scheduling, Distributed production workshops
Subject terms: Information technology, Applied mathematics, Mathematics and computing, Computational science
Introduction
Renewable energy sources are environmentally friendly1, contributing to the reduction of greenhouse gas emissions and air pollution2. They also offer energy security by diversifying the energy mix and reducing dependence on fossil fuels3. An efficient energy management system also provides a guarantee for the application of renewable energy4,5. As global demand for sustainable energy rises, renewable sources such as wind power, known for their technological maturity, cost-effectiveness, and broad applicability, play a crucial role in the transition to sustainable energy6. The key factors affecting energy supply should also be thoroughly studied7. Wind turbine blades, which directly impact energy conversion efficiency8, are made using pultruded panels that form the main beams9, offering lightweight and high-strength support10. These panels are produced in distributed workshops, each with different machine processing abilities, making coordination and optimization a significant challenge. Given the geographically dispersed nature of the production facilities, effective scheduling becomes critical to ensure smooth operations. Since production begins after receiving specific customer orders, pultruded panels are produced using a Make-to-Order (MTO) model. This inherently complex process involves continuous multi-roll production within a distributed system, necessitating flexible and efficient scheduling to meet customer demands, ensure timely delivery, and optimize overall production efficiency. Therefore, coordinating and optimizing the scheduling of these distributed workshops is crucial for achieving timely and cost-effective production in the cost-driven wind energy sector.
The production of pultruded panels for wind turbine blades follows a continuous production line11 and can be discretized as a Distributed Unrelated Parallel Machines Scheduling Problem (DUPMSP)12. The solution space grows exponentially with the number of workshops and machines, significantly increasing the problem’s complexity, and making it an NP-hard problem13. This scheduling problem is particularly challenging due to two key constraints. First, pultruded panels have sequence-dependent completion constraints because the laying process of wind turbine blade materials requires strict adherence to a specific process sequence. The packing order of the pultruded panels directly impacts the production efficiency of this step. On the other hand, due to the large size and weight of the panels, moving and adjusting them is extremely labor-intensive. Second, as production must align with corresponding molds, changes in product models lead to increased setup times, creating the constraint of sequence-dependent setup times. Frequent mold changes can further accelerate expensive mold wear, reducing their lifespan. Therefore, the production of pultruded panels requires both sequence-dependent completion and sequence-dependent setup time constraints.
The motivation behind this research arises from the cost-driven nature of the wind energy industry14 and the practical demands of WTPP-PSP, which is one of the widely studied combinatorial optimization challenges. In real production environments, WTPP-PSP involves multiple optimization objectives. For instance, setup time can make delivery deadlines more flexible, but it may also shorten the mold’s lifespan and increase costs, making the minimization of setup time one of the key optimization goals. Delivery deadlines directly impact customer satisfaction15, especially under fuzzy delivery deadline conditions, which in turn plays a crucial role in enhancing the company’s image. In the context of distributed manufacturing, balancing production tasks across workshops to reduce load deviations is also vital. Therefore, this study aims to maximize customer satisfaction (CS), minimize total setup time (TST), and reduce deviations in workshop machine loads (DWML), while considering sequence-dependent completion and setup time constraints. To address this complexity, a mixed-integer linear programming (MILP) model is formulated. The problem is complex, and the objectives are interrelated. For example, improving equipment load balancing may increase setup time, while optimizing customer satisfaction could lead to uneven equipment load distribution, particularly under sequence-dependent completion constraints. Due to the large solution space, we propose the Adaptive Crayfish Optimization Algorithm (ACOA), integrated with Q-learning for temperature parameter tuning, to adaptively adjust the evolutionary strategy and enhance exploration and exploitation capabilities.
The Crayfish Optimization Algorithm (COA), introduced in 2023, has demonstrated exceptional performance across various domains, particularly in tackling complex constraint and multi-objective optimization problems16. As highlighted by Yan et al.17, COA offers distinct advantages in addressing such challenges. Moreover, the integration of metaheuristic algorithms with Reinforcement Learning (RL) enhances the efficiency of solution space exploration and accelerates convergence to high-quality solutions for a broad spectrum of complex optimization tasks18. By incorporating RL-based Q-learning into the metaheuristic framework, the algorithm enables intelligent, autonomous parameter tuning, allowing it to adapt based on previous experiences and the current problem state. This adaptive parameter adjustment mechanism effectively balances exploration and exploitation, compared to the impracticality of manual parameter tuning and the risk of getting trapped in local optima with fixed parameter settings. In this context, the paper introduces an ACOA variant leveraging Q-learning strategies. The ACOA further integrates crossover and mutation operators to ensure robustness and discretization, while using Q-learning to adjust the temperature of COA, balancing exploration and exploitation. These features allow our approach to deliver high-quality solutions efficiently, setting ACOA apart from existing algorithms. The key contributions of this study are outlined as follows.
An intelligent platform for Wind Turbine Pultruded Panel production is proposed, integrating the optimal results obtained from intelligent algorithms. This platform supports intelligent decision-making and serves as an effective tool for production planning and scheduling. A multi-objective MILP model is developed for WTPP-PSP to minimize total setup time and deviations in workshop machine loads and to maximize customer satisfaction, while accounting for sequence-dependent setup times and component sequence-dependent completion constraints, making the problem more realistic.
By incorporating the ideas of genetic algorithms and using crossover and mutation operators, we discretize the continuous COA, making it suitable for solving discrete problems such as production scheduling. To enhance the diversity and distribution of the algorithm in multi-objective optimization tasks, we have improved the crowding distance calculation, better guiding the selection of Pareto solutions.
To improve the algorithm’s convergence, Q-learning is integrated to dynamically adjust the temperature parameter of the COA, optimizing the evolution strategy of the Crayfish individuals and enhancing both exploration and exploitation capabilities.
The remainder of this paper is organized as follows: section “Literature review” provides a detailed literature review; section “Problem description and mathematical model” defines the problem and presents the optimization model; section “Architecture of solution approach” explains the ACOA; section “Computational study” presents computational experiments and results; and section “Conclusions” concludes with a summary of contributions and future research directions.
Literature review
WTPP-PSP and related scheduling problems
As shown in Table 1, DUPMSP, the problem class to which WTPP-PSP belongs, has been studied in recent years. When tackling practical DUPMSP scheduling problems, there are typically multiple objectives that require optimization. Kim et al.19 address the problem of part-grouping and scheduling for non-identical parallel additive manufacturing machines with sequence-dependent setup times, using metaheuristic algorithms for minimizing completion time. Guo et al.20 propose a multi-stage evolutionary algorithm (EMSEA) for solving DUPMSP, demonstrating its effectiveness in minimizing tardiness and production costs compared to existing algorithms. Wang et al.21 propose a tabu search heuristic with diversification to solve the unrelated parallel machine scheduling problem, aiming to minimize total workload and fixed costs. Pan et al.22 propose a knowledge-based dual-population optimization algorithm to minimize energy consumption and tardiness in DUPMSP. Wang et al.23 study DUPMSP and propose a knowledge-based Pareto memetic algorithm to minimize the total tardiness and makespan. Amallynda et al.24 address two main objectives, minimizing mean flow time and reducing the number of tardy jobs, using metaheuristic algorithms. Additionally, the scheduling process involves various complex constraints, such as machine- and job-dependent setups, as well as load constraints. Srinath et al.25 address sequence-dependent setup times in the dyeing process, introducing new perspectives for the field of machine scheduling through multi-objective scheduling methods. Elyasi et al.26 propose an imperialist competitive algorithm to solve UPMSP with sequence-dependent and machine-dependent setups and workload constraints.
Wang et al.27 propose a knowledge and Pareto-based memetic algorithm (KPMA) to solve DUPMSP, aiming to minimize both total tardiness and makespan, by incorporating heuristic initialization, knowledge-based neighborhood structures, and an elite strategy to improve solution convergence and diversity. Lei et al.28 tackle the DUPMSP considering machine eligibility and sequence-dependent setup time constraints, using an improved artificial bee colony algorithm to optimize scheduling. Wang et al.29 introduce reinforcement learning-enhanced swarm algorithms for optimizing robust parallel machine scheduling with sequence-dependent setup times, minimizing mean and worst-case makespan. Wu et al.30 propose a Learning-based Two-phase Cooperative Optimizer (LCTPO) for DUPMSP, demonstrating improved performance in minimizing total weighted tardiness and workload gap through a two-phase evolutionary approach combined with reinforcement learning. In the problem scenario studied here, it is essential not only to address multi-objective optimization but also to account for sequence-dependent setup times and component sequence-dependent completion constraints. Earlier research primarily focused on single objectives, but in recent years, the focus has shifted toward multi-objective optimization. Balancing these multiple objectives under the given constraints has become the central challenge in scheduling problems.
Table 1.
Literature comparison on PMSP.
| Year | Refs. | Env | Constraints | methodology | Objectives | ||
|---|---|---|---|---|---|---|---|
| Dis | SST | SDC | MA | RL | |||
| 2018 | Wang et al.21 | √ | TW, FC | ||||
| 2021 | Lei et al.28 | √ | √ | Cmax, TT | |||
| 2022 | Kim et al. 19 | √ | √ | Cmax | |||
| 2022 | Pan et al.22 | √ | √ | TEC, TT | |||
| 2023 | Wang et al.27 | √ | √ | √ | TT, Cmax | ||
| 2022 | Amallynda et al.24 | √ | MFT, NTJ | ||||
| 2023 | Srinath et al.25 | √ | √ | Cmax, TJPV | |||
| 2023 | Wang et al.29 | √ | √ | √ | MM, WM | ||
| 2024 | Elyasi et al.26 | √ | √ | TET | |||
| 2024 | Wang et al.27 | √ | √ | Cmax, Tardiness | |||
| 2024 | Guo et al.20 | √ | √ | Tardiness, Costs | |||
| 2024 | Wu et al.30 | √ | √ | √ | Tardiness, Workload | ||
| 2025 | This paper | √ | √ | √ | √ | √ | CS, DWML, ST |
Dis: distributed, Unre: unrelated, SST: sequence-dependent setup times, SDC: sequence-dependent completion, MA: metaheuristic algorithm, RL: reinforcement learning.
Algorithms for workshop scheduling optimization
In workshop scheduling problems, metaheuristic algorithms are widely used. Metaheuristic algorithms efficiently explore large solution spaces and can provide feasible approximate solutions within a reasonable time31–34, making them particularly suitable for complex optimization problems35. Wu et al.36 present a two-stage model for single-machine scheduling with maintenance and learning effects, using B&B, GA, and HGTSA algorithms. Heidari et al.37 optimize carbon emissions in machine-piece scheduling and vehicle routing using MOPSO and NSGA-II algorithms, balancing environmental impact and customer satisfaction. Khurshid et al.38 introduce the HES-IG algorithm, a hybrid of evolution strategies and iterated greedy, which outperforms existing methods for the no-wait flow shop scheduling problem. Fatma Betul et al.39 propose a genetic algorithm with simulated annealing to solve the multi-product, multi-period disassembly line balancing problem, addressing multi-manned stations and lot-sizing constraints. Lian et al.40 present an improved NSGA-II algorithm with variable neighborhood search to optimize scheduling in steel plants, addressing machine matching and energy constraints under time uncertainty. Cui et al.41 investigate an improved multi-population genetic algorithm (IMPGA) with greedy job insertion and sub-regional co-evolution to solve the multi-objective distributed hybrid flow-shop scheduling problem. Zhu et al.42 address PMSP with limited and unequal sub-lots, proposing an improved artificial bee colony algorithm to optimize makespan and due time deviation efficiently. Beren Gürsoy et al.43 propose NSGA-II-based algorithms to solve the multi-objective hybrid flowshop scheduling problem with limited waiting time and machine capability constraints. Willy Chandra et al.44 introduce a Particle Swarm Optimization (PSO)-based method to tackle the integrated scheduling problem for additive manufacturing and batch delivery.
Jiang et al.45 introduce a Discrete Whale Optimization Algorithm (DWOA) with heuristic initialization and variable neighborhood search to optimize energy-efficient job shop scheduling. These methods have achieved notable success in solving complex scheduling optimization problems. However, the limitations of metaheuristic algorithms have also become evident, particularly when dealing with highly complex constraints. The algorithms tend to get trapped in local optima, making it difficult to escape, which impacts their ability to find global solutions. Furthermore, the performance of metaheuristic algorithms heavily relies on parameter settings (such as mutation rate and crossover rate), but these often require manual tuning, which is both time-consuming and labor-intensive. Additionally, parameter configurations can vary significantly across different scheduling environments, reducing the algorithms’ general applicability.
To overcome these limitations, researchers have started integrating machine learning techniques46–49, such as Q-learning, with metaheuristic algorithms to improve their global search capabilities and adaptability. Jiang et al.50 propose a Q-learning-based biology migration algorithm to optimize energy-saving flexible job shop scheduling with speed-adjustable machines and transporters. Chen et al.51 propose an improved spider monkey optimization algorithm combined with Q-learning to solve multi-objective planning and scheduling problems in PCB assembly. Zhang et al.52 combine PSO and Q-learning-based local search to minimize makespan and total energy consumption in the energy-efficient distributed heterogeneous hybrid flow-shop scheduling problem, showing significant improvements in both global search efficiency and solution quality. Q-learning intelligently adjusts algorithm parameters, enabling metaheuristic algorithms to better balance exploration and exploitation during the search process, enhancing global search capabilities and avoiding local optima. Additionally, Q-learning dynamically adapts the algorithm’s parameters based on changes in the scheduling environment, reducing the need for manual tuning and improving the algorithm’s generality and adaptability.
Research gap
Although previous studies have made significant progress in the field of DUPMSP, providing various methods and models to optimize the production process and reduce costs, there are still noticeable gaps in the existing literature. To date, few studies have addressed DUPMSP with sequence-dependent setup times and component sequence-dependent completion constraints, and there is a lack of research specifically focused on the practical issues of wind turbine pultruded panels.
The studies above show that various metaheuristic algorithms incorporating RL have been applied to solve scheduling problems. By adaptively adjusting the parameters of the meta-algorithms, these approaches have achieved better performance in scheduling tasks. However, COA, which includes foraging and summer resort stages, relies on a purely random evolutionary strategy that fails to effectively balance exploration and exploitation. In the foraging stage, the algorithm exploits promising solutions, but this can lead to getting stuck in local optima without adaptive adjustment. Meanwhile, the summer resort stage focuses on exploration of the solution space, but this can lead to inefficiencies. Therefore, integrating reinforcement learning with COA is essential for optimizing the balance between exploration and exploitation, improving the algorithm’s performance in scheduling problems.
Problem description and mathematical model
Problem description
Wind turbine pultruded panels are multi-layered components, with product models strictly corresponding to the blade models. A single set of pultruded panel products consists of several coil components, all with varying lengths. The production process is a continuous automated pultrusion molding process, which mainly includes steps such as fiber supply, resin impregnation, mold heating and curing, traction, cutting, and coiling. By adapting to the mold specifications in Step 11 of Fig. 1, flexible production of different product models can be achieved. In the WTPP-PSP, there are two layers of decisions: the first layer is at the factory level, where products are allocated to the appropriate workshops, and the second layer involves assigning components to equipment, which includes sequencing and allocating the components. Key assumptions underlying the model include: (1) only existing orders are considered, excluding order disturbances; (2) at t = 0, both parallel machines and components are in a processable state without requiring setup times; (3) machine and quality disturbances are excluded during processing; (4) sequential kitting constraints exist only among the components of a single product set; (5) at most one component of a product can be assigned to a machine; (6) each machine can process only one component of a product at any given time; (7) for each product, components completed earlier must wait until the last component is completed, after which all components of the product are delivered to the customer as a whole.
Fig. 1.
Problem description diagram.
Notations and definitions
| Notation | Description |
|---|---|
| Oi | Order i |
| Piq | Product set q of Oi |
| Piqj | Component j of set q in Oi |
| Mg | Machine line in workshop g |
| Mge | Machine line with number e in Mg |
| Ty | Product model |
| | Component j of |
| Di | Fuzzy delivery time for order i |
| | Switching time from the product model of order i to the product model of order i′ |
| | Processing time of Piqj on Mge |
| | Load time of Mg |
| | 0–1 variable, takes 1 when switching from Oi to Oi′ requires setup time, otherwise 0 |
| | 0–1 variable, takes 1 if Piqj is produced on Mge, otherwise 0 |
| | 0–1 variable, takes 1 if product q of order i is processed on Mg, otherwise 0 |
| | 0–1 variable, takes 1 if one component is processed immediately after another on Mge, otherwise 0 |
| tiqj,gef | Completion time of Piqj on Mge |
| | Start time of Piqj on Mge |
| | Customer satisfaction of order i |
| Vge | Production speed |
| | Length of Piqj |
Model formulation
The mathematical model is as follows:
[Equations (1) to (5): objective functions and supporting definitions; equation images not available.]
S.t.
[Equations (6) to (14): constraints; equation images not available.]
Equation (1) describes the maximization of customer satisfaction. Equation (2) defines the customer satisfaction of Piq, where the delivery time window is represented by Di(dia, dib, dic, did). Equations (3) and (4) describe the minimization of setup time and of the deviation in workshop machine loads, respectively. The workshop machine load is defined in Eq. (5). Equation (6) defines the workshop production constraint: all components of a product set in any order must be processed on parallel machines within the same workshop. Equation (7) defines the sequence integrity constraint: components with smaller numbers in a product set have lower completion priority. Equation (8) outlines the exclusive component constraint: each component must be assigned to a single parallel machine and processed only once. Equation (9) describes the continuous production constraint: the processing of any component must be uninterrupted, and activities on parallel machines must continue without interruption. Equation (10) establishes the reconfiguration adjustment constraint: adjustment time is needed on a parallel machine only when switching between two consecutively processed components of different products, so the start time of the next component equals the completion time of the previous component plus the reconfiguration adjustment time. Equation (11) states the exclusive equipment constraint: each parallel machine can process only one component at any given time. Equation (12) specifies that all parameter variables are positive. Equation (13) indicates that tiqf is the latest completion time among the components of Piq, whose individual completion times are tiqj,gef. Equation (14) indicates that the component processing time is obtained by dividing the component length by the equipment processing speed.
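As a rough illustration of the fuzzy delivery window Di(dia, dib, dic, did) used in Eq. (2), the sketch below assumes a trapezoidal satisfaction function: satisfaction is 1 when a product set is completed within [dib, dic], changes linearly between dia and dib and between dic and did, and is 0 outside the window. The trapezoidal shape and the function name `customer_satisfaction` are assumptions made for illustration; the exact form of Eq. (2) may differ.

```python
def customer_satisfaction(t, window):
    """Satisfaction in [0, 1] for completion time t (assumed trapezoidal shape).

    window is the tuple (d_a, d_b, d_c, d_d) of the fuzzy delivery window.
    """
    d_a, d_b, d_c, d_d = window
    if t <= d_a or t >= d_d:
        return 0.0                      # outside the acceptable window
    if t < d_b:
        return (t - d_a) / (d_b - d_a)  # rising edge: too early
    if t <= d_c:
        return 1.0                      # fully satisfactory interval
    return (d_d - t) / (d_d - d_c)      # falling edge: too late


# Delivery window of order O1 in the illustrative example: [25, 30, 40, 48].
window_o1 = (25, 30, 40, 48)
for t in (20, 27.5, 35, 44, 50):
    print(t, customer_satisfaction(t, window_o1))
```

Under this assumption, a product set of O1 completed at t = 35 is fully satisfactory, while one completed at t = 44 lies halfway down the falling edge.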
An illustrative example
In this section, we will explain an example of a scheduling problem for wind power pultruded panels, which involves two workshops and three different types of orders. The parameters of the problem are shown in Table 2.
Table 2.
Information related to order products.
| Order | Product model | Delivery time | Product component | Component length | Workshop 1, M11 (Vge) | Workshop 1, M12 (Vge) | Workshop 2, M21 (Vge) | Workshop 2, M22 (Vge) | Workshop 2, M23 (Vge) |
|---|---|---|---|---|---|---|---|---|---|
| O1 (quantity 2) | Ty1 | [25, 30, 40, 48] | P111 | 244.8 | 24.0 | 21.3 | 19.3 | 18.4 | 21.1 |
| | | | P112 | 255.2 | 23.4 | 18.1 | 21.6 | 20.6 | 18.9 |
| | | | P113 | 341.4 | 25.1 | 32.5 | 22.8 | 23.7 | 33.1 |
| | | | P121 | 244.8 | 24.0 | 21.3 | 19.3 | 18.4 | 21.1 |
| | | | P122 | 255.2 | 23.4 | 18.1 | 21.6 | 20.6 | 18.9 |
| | | | P123 | 341.4 | 25.1 | 32.5 | 22.8 | 23.7 | 33.1 |
| O2 (quantity 3) | Ty2 | [20, 24, 32, 40] | P211 | 252.4 | 24.5 | 21.6 | 20.7 | 18.3 | 23.2 |
| | | | P212 | 335.0 | 23.1 | 31.6 | 29.4 | 26.0 | 22.5 |
| | | | P221 | 252.4 | 24.5 | 21.6 | 20.7 | 18.3 | 23.2 |
| | | | P222 | 335.0 | 23.1 | 31.6 | 29.4 | 26.0 | 22.5 |
| | | | P231 | 252.4 | 24.5 | 21.6 | 20.7 | 18.3 | 23.2 |
| | | | P232 | 335.0 | 23.1 | 31.6 | 29.4 | 26.0 | 22.5 |
| O3 (quantity 2) | Ty3 | [40, 44, 50, 54] | P311 | 329.8 | 26.6 | 24.1 | 29.5 | 22.4 | 31.1 |
| | | | P312 | 286.9 | 21.9 | 24.3 | 22.2 | 26.6 | 19.8 |
| | | | P321 | 329.8 | 26.6 | 24.1 | 29.5 | 22.4 | 31.1 |
| | | | P322 | 286.9 | 21.9 | 24.3 | 22.2 | 26.6 | 19.8 |
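
To make the relationship in Eq. (14) concrete, the following sketch computes processing times as component length divided by production speed, reading the machine columns of Table 2 as speeds Vge as the column header indicates. Only the three components of the first product set of O1 are included; the dictionary layout and the helper names `processing_time` and `fastest_machine` are hypothetical.

```python
# Component lengths and machine speeds copied from Table 2 (order O1, model Ty1).
LENGTH = {"P111": 244.8, "P112": 255.2, "P113": 341.4}
SPEED = {
    "P111": {"M11": 24.0, "M12": 21.3, "M21": 19.3, "M22": 18.4, "M23": 21.1},
    "P112": {"M11": 23.4, "M12": 18.1, "M21": 21.6, "M22": 20.6, "M23": 18.9},
    "P113": {"M11": 25.1, "M12": 32.5, "M21": 22.8, "M22": 23.7, "M23": 33.1},
}


def processing_time(component, machine):
    """Processing time = component length / production speed, as in Eq. (14)."""
    return LENGTH[component] / SPEED[component][machine]


def fastest_machine(component):
    """Machine with the shortest processing time for the given component."""
    return min(SPEED[component], key=lambda m: processing_time(component, m))


for comp in LENGTH:
    best = fastest_machine(comp)
    print(comp, best, round(processing_time(comp, best), 2))
```

Because the fastest machine differs by component and may sit in a different workshop, a greedy per-component choice can conflict with the requirement that all components of a product set stay in one workshop, which is part of what makes the assignment non-trivial.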
The changeover adjustment times between the three product models can be represented by a 3 × 3 matrix:
[3 × 3 changeover time matrix; image not available.]
In this matrix, each element represents the time required to switch from one model to another. For example, it takes 3 time units to switch from Ty1 to Ty2 and 6 time units to switch from Ty2 to Ty3.
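The way such a matrix feeds the total setup time can be sketched as follows. Only the two entries quoted in the text (Ty1 to Ty2 = 3 and Ty2 to Ty3 = 6) come from the source; all other off-diagonal values are hypothetical placeholders, the zero diagonal assumes that consecutive components of the same model need no changeover, and the function name `total_setup_time` is illustrative.

```python
# Changeover times indexed as SETUP[from_model][to_model].
# Only Ty1->Ty2 = 3 and Ty2->Ty3 = 6 are taken from the text; the remaining
# off-diagonal values are hypothetical placeholders for illustration.
SETUP = {
    "Ty1": {"Ty1": 0, "Ty2": 3, "Ty3": 5},
    "Ty2": {"Ty1": 4, "Ty2": 0, "Ty3": 6},
    "Ty3": {"Ty1": 5, "Ty2": 7, "Ty3": 0},
}


def total_setup_time(model_sequence):
    """Sum the changeover times between consecutive models on one machine.

    No setup is charged before the first component, matching the assumption
    that machines are ready to process at t = 0.
    """
    pairs = zip(model_sequence, model_sequence[1:])
    return sum(SETUP[prev][nxt] for prev, nxt in pairs)


# Two Ty1 components, then one Ty2 and one Ty3 component on the same machine.
print(total_setup_time(["Ty1", "Ty1", "Ty2", "Ty3"]))  # 0 + 3 + 6 = 9
```

Grouping components of the same model on a machine removes changeovers, which is why the sequencing decision directly affects the setup-time objective.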
To better illustrate the research problem addressed in this study, Fig. 2 presents an example of an initial scheduling plan.
Fig. 2.
Gantt chart of the initial plan.
Architecture of solution approach
Platform of wind turbine pultruded panel production
The wind turbine pultruded panel production system follows a continuous automated pultrusion process, which involves key steps such as fiber supply, resin impregnation, mold heating and curing, traction, cutting, and rewinding. The production workshop consists of multiple continuous production lines, each connected with IoT devices through sensors that collect essential data such as mold temperature, oven temperature, production length, and traction speed. These data points are integrated to enable intelligent decision-making, supporting the planning and scheduling of customer orders.
The overall wind turbine pultruded panel production platform consists of three layers of modules, i.e., the physical layer (PL), the data management layer (DML), and the application service layer (ASL), as shown in Fig. 3. In the current research, the focus is on the wind turbine pultruded panel production workshop and decision-making driven by real-time production data. The PL refers to the physical elements of the production system, including pultrusion lines, fiber supply mechanisms, resin impregnation units, curing molds, cutting and rewinding stations, and IoT devices. In this layer, real-time production system data, such as mold temperature, oven temperature, production length, and traction speed, can be collected using smart sensors. The collected real-time data in the physical layer is then sent to the DML for storage and further processing. In the DML, the data related to all physical elements is stored in their assigned databases at the first stage. The collected data includes both relevant and irrelevant information. Therefore, the data undergoes further processing in the next stage of the DML. In the data processing stage, the data is classified according to the desired objectives, analyzed by mapping it into different categories, and validated to ensure that only useful data is sent to the data visualization stage. At this stage, the received data is visually inspected using various tools and techniques, which help analyze the data in the ASL.
Fig. 3.
Wind turbine pultruded panel production platform.
This research primarily focuses on the third layer, ASL. In ASL, the refined data related to pultrusion production scheduling and optimization problems are received from the DML. A comprehensive MILP-based mathematical model is developed, taking into account all constraints and assumptions specific to the pultruded panel production process. This model is then solved using the proposed ACOA, which provides the optimal solution for production scheduling. ASL relies on real-time data generated by the physical production system in the PL, ensuring a direct connection between the ASL, DML, and PL. The DML continuously receives real-time data from the PL, keeping status information up-to-date for decision-making in ASL. The optimal results obtained from the ACOA in ASL are then sent back to the DML, where the data is processed and visualized to support final decision-making. Once the proposed solutions are validated and visualized in the DML, the decision-making information is forwarded to the PL for execution in the actual pultrusion production line. The ACOA plays a pivotal role in ASL by providing the optimal solution as quickly as possible. The detailed structure and methodology of the ACOA will be discussed in the following section.
Proposed algorithms
In this study, the proposed ACOA is employed to address the integration problem, leveraging its unique strengths for effective optimization. The ACOA is enhanced with genetic algorithm principles, incorporating four carefully designed neighborhood structures and a Q-learning mechanism for adaptive temperature adjustment. This temperature adjustment plays a crucial role in balancing the exploration and exploitation processes, ensuring that the algorithm navigates the solution space efficiently.
The choice of a crayfish-based algorithm is based on several key advantages. Firstly, the crayfish algorithm is known for its robustness and flexibility, making it particularly suitable for solving complex, multi-dimensional problems, such as those encountered in this study. Its ability to adapt to various problem structures allows it to effectively navigate diverse solution spaces53. Secondly, rather than comparing different heuristic algorithms, the main focus of this study is to fine-tune and optimize the crayfish algorithm to deliver high-quality solutions within a reasonable CPU time. By concentrating on this algorithm, we are able to explore and refine its applications, ensuring that we identify the most effective tactical strategies through comprehensive analysis.
To further enhance the performance of the crayfish algorithm, Q-learning is employed to dynamically adjust parameters, while neighborhood structures are used as local search methods to improve search capabilities. This hybrid approach effectively combines the exploratory power of the crayfish algorithm, the genetic algorithm-inspired crossover and mutation, the efficiency of the four neighborhood structures, and the adaptive nature of Q-learning.
Overall, this approach ensures a balanced and powerful optimization method, leveraging the best aspects of each technique to solve complex scheduling challenges. The flowchart of the proposed algorithm is shown in Fig. 4.
Fig. 4.
Flowchart of the proposed COA-based algorithms.
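The Q-learning component mentioned above can be pictured with a minimal tabular sketch. The state definition, the three temperature actions, the reward, and the hyperparameter values are all hypothetical choices made for illustration; only the general idea of using Q-learning to adjust the COA temperature comes from the text.

```python
import random

# Hypothetical actions: lower, keep, or raise the temperature parameter.
ACTIONS = (-1.0, 0.0, 1.0)


class TemperatureTuner:
    """Tabular Q-learning with epsilon-greedy action selection (sketch)."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * len(ACTIONS) for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        """Return an action index: random with prob. epsilon, else greedy."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max Q(next)."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# One illustrative step: raising the temperature in state 0 earned reward 1.
tuner = TemperatureTuner(n_states=2, epsilon=0.0)
tuner.update(state=0, action=2, reward=1.0, next_state=1)
temperature = 25.0 + ACTIONS[tuner.choose(0)]  # greedy choice now raises it
print(temperature)
```

In a full algorithm the reward would typically reflect whether the population improved after the adjustment, for example through a change in a multi-objective indicator; that design choice is not specified here.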
Chromosome encoding
A two-layer encoding method is used. The first layer holds the workshop index, identifying the workshop assigned to each product of each order; it corresponds to the 0–1 decision variable indicating whether product q of order i is processed on Mg. The second layer holds the equipment index, specifying the machine assigned to each component; it corresponds to the 0–1 decision variable indicating whether a component is produced on a given machine. For the illustrative example in the previous section, with data taken from Table 2, the encoding of a solution involving three orders, seven product sets of three types, and sixteen components is shown in Fig. 5.
Fig. 5.

Encoding example.
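The two-layer structure can be sketched in Python as follows. This is a minimal illustration rather than the authors’ implementation: the function names, the list-based representation, and the instance sizes (eight products with two components each, three workshops with 3, 3, and 2 machines) are assumptions chosen for the example.

```python
import random

def random_chromosome(components_per_product, machines_per_workshop, rng):
    """Return (workshop_layer, equipment_layer) for one random solution.

    workshop_layer[p]  : workshop index assigned to product p
    equipment_layer[c] : machine index (inside that workshop) for component c,
                         with components listed product by product
    """
    n_workshops = len(machines_per_workshop)
    workshop_layer = [rng.randrange(n_workshops) for _ in components_per_product]
    equipment_layer = []
    for product, n_components in enumerate(components_per_product):
        n_machines = machines_per_workshop[workshop_layer[product]]
        equipment_layer.extend(rng.randrange(n_machines) for _ in range(n_components))
    return workshop_layer, equipment_layer

def is_consistent(workshop_layer, equipment_layer,
                  components_per_product, machines_per_workshop):
    """Check that every machine gene exists in the workshop chosen for its product."""
    c = 0
    for product, n_components in enumerate(components_per_product):
        n_machines = machines_per_workshop[workshop_layer[product]]
        for _ in range(n_components):
            if not 0 <= equipment_layer[c] < n_machines:
                return False
            c += 1
    return c == len(equipment_layer)

# Hypothetical instance: 8 products, 16 components, 3 workshops.
components_per_product = [2] * 8
machines_per_workshop = [3, 3, 2]
workshops, machines = random_chromosome(
    components_per_product, machines_per_workshop, random.Random(0))
```

Drawing each machine gene from the machines of the already chosen workshop keeps a freshly generated chromosome consistent; after crossover or mutation of the workshop layer this property can be lost, which is why a repair step is needed during decoding.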
Chromosome decoding
During the decoding process, the algorithm first repairs any infeasible solutions before calculating the three objective values. Algorithm 1 details the decoding method and includes an example.
Algorithm 1.

Decoding
To achieve a valid sequence, it is essential to determine the order of the components assigned to each machine, which corresponds to the sequencing decision variable. Components on the same machine are prioritized by layer number, with higher-layer components scheduled first to meet the sequential completion constraint. However, infeasible solutions may still arise and require adjustment: if a lower-layer component would finish before a higher-layer one within the same product set, its completion time is delayed to restore the correct order. Figure 6 illustrates a feasible solution after these corrections.
Fig. 6.

Gantt chart of the feasible plan.
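The ordering and repair rule described above can be sketched as a single pass over the components. The sketch makes several simplifying assumptions that are not taken from the paper: setup times are ignored, each product set has at most one component per layer, and the tuple layout and the name `decode` are hypothetical.

```python
def decode(components):
    """Compute completion times under the layer-priority rule.

    components: list of (comp_id, product, layer, machine, duration).
    Higher-layer components are scheduled first on each machine, and a
    lower-layer component is delayed so that it never finishes before the
    higher-layer component of the same product set.
    """
    # Sorting by descending layer gives a valid processing order both per
    # machine and per product set.
    order = sorted(components, key=lambda c: (-c[2], c[1], c[0]))
    machine_free = {}   # machine -> time at which it becomes available
    product_done = {}   # product -> completion time of its last scheduled layer
    completion = {}
    for comp_id, product, layer, machine, duration in order:
        finish = machine_free.get(machine, 0.0) + duration
        # Repair: delay completion if the higher layer finishes later.
        finish = max(finish, product_done.get(product, 0.0))
        completion[comp_id] = finish
        machine_free[machine] = finish
        product_done[product] = finish
    return completion

# Hypothetical data: product P1 has layers 2 and 1 on different machines.
plan = decode([
    ("a2", "P1", 2, "M1", 5.0),
    ("a1", "P1", 1, "M2", 3.0),
    ("b1", "P2", 1, "M1", 4.0),
])
```

In this example component `a1` could finish at time 3 on its own machine, but it is delayed to time 5 so that it does not finish before the higher-layer component `a2`.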
ACOA operators
Crossover operators
The crossover strategy in the proposed approach integrates the crossover concept from genetic algorithms and adapts it for discrete environments. The goal of the crossover process is to combine genetic information from different solutions effectively. Two types of crossover methods have been designed: one based on the workshop layer and another based on randomly selected equipment layers within a workshop, as shown in Fig. 7.
Fig. 7.
Crossover operator.
Figure 7(a) illustrates the process of workshop-layer-based crossover. A few random positions are selected in the two parent chromosomes; these positions do not need to be consecutive. In this example, three positions indicated by arrows are selected. First, the corresponding positions in Parent Chromosome 2 are identified based on the selected genes from Parent Chromosome 1. The remaining genes from Parent Chromosome 2 are then used to generate an offspring, ensuring the positions align. The selected genes from Parent Chromosome 1 are then placed sequentially into the remaining positions of the offspring. The other offspring is created in a similar manner. The red dashed lines represent the genes inherited by the offspring from their respective parents, while the remaining genes are inherited from the other parent through crossover.
Figure 7(b) illustrates the process of randomly selecting workshops and performing uniform crossover within the equipment of that workshop. In this example, Workshop 1 is selected, as indicated by the arrows. First, based on the selection of Workshop 1 from Parent Chromosome 1, the corresponding position in Parent Chromosome 2 is identified. The red arrow boxes mark the crossover points of the equipment genes between the two parent chromosomes. The genes within the red boxes represent those inherited by the offspring through the crossover process.
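One possible reading of the workshop-layer crossover in Fig. 7(a) is an exchange of the genes at the selected positions, sketched below. The function name and the example chromosomes are hypothetical, and the figure may encode additional details that this simplified version does not capture.

```python
def position_crossover(parent1, parent2, positions):
    """Exchange the genes at `positions` between two equal-length parents.

    child1 keeps parent2's genes except at the selected positions, where it
    takes parent1's genes; child2 is built symmetrically.
    """
    child1, child2 = list(parent2), list(parent1)
    for p in positions:
        child1[p] = parent1[p]
        child2[p] = parent2[p]
    return child1, child2

# Three non-consecutive positions, as in the example of Fig. 7(a).
c1, c2 = position_crossover([0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [0, 2, 4])
```

The same helper can be applied to the equipment genes of one selected workshop to imitate the uniform crossover of Fig. 7(b), provided the offspring are repaired afterwards.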
Mutation operator
The mutation strategy in the proposed approach incorporates the mutation concept from genetic algorithms. The objective of the mutation process is to introduce diversity into the population and prevent premature convergence. Two types of mutation methods have been designed: one based on the workshop layer and another based on randomly selected equipment layers within a workshop, as shown in Fig. 8.
Fig. 8.
Mutation operator.
Figure 8(a) illustrates the process of multi-point mutation based on workshop-layer positions. Several random positions are selected in the two parent chromosomes, which do not need to be consecutive. In this example, two workshop-layer positions indicated by arrows are chosen. Mutation is then applied to both the workshop-layer encoding and the equipment-layer encoding at the selected positions. The red box highlights the genes generated through the mutation process in the parent chromosomes, while the remaining genes are inherited from the parents.
Figure 8(b) illustrates the process of multi-point mutation based on equipment-layer positions within workshops. Several random positions are selected in the two parent chromosomes, which do not need to be consecutive. In this example, two workshop-layer positions, indicated by arrows, are chosen, and mutation is applied to the corresponding equipment-layer encoding. The red boxes highlight the genes introduced through the mutation process in the parent chromosomes, while the rest are inherited from the parents.
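A multi-point mutation of one chromosome layer can be sketched as follows. The sketch assumes integer genes and resamples each selected gene to a different admissible value; the function name, the parameter `n_values`, and the example data are illustrative assumptions.

```python
import random

def multipoint_mutation(genes, n_values, positions, rng):
    """Resample the genes at `positions`, forcing a value different from the current one.

    genes    : list of integer genes (one layer of the chromosome)
    n_values : n_values[i] is the number of admissible values for gene i
    """
    mutated = list(genes)
    for i in positions:
        if n_values[i] > 1:
            choices = [v for v in range(n_values[i]) if v != genes[i]]
            mutated[i] = rng.choice(choices)
    return mutated

rng = random.Random(1)
parent = [0, 1, 2, 0, 1]
positions = rng.sample(range(len(parent)), 2)   # two random, possibly non-adjacent positions
child = multipoint_mutation(parent, [3] * len(parent), positions, rng)
```

When the workshop layer is mutated, the equipment genes of the affected product generally have to be resampled as well, because the machine indices of the old workshop may not exist in the new one.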
Improved crowding distance calculation
The original crowding distance calculation often suffers from the problem of incorrectly estimating the density of solutions around a given point, particularly when the distribution of solutions is uneven or non-uniform. This can lead to poor diversity maintenance in the population, as certain solutions may be overrepresented or underrepresented during selection.
The formula (15) presented in this paper represents an improvement to the crowding distance calculation. The modifications adjust how the distance is calculated between neighboring solutions, especially when the values of adjacent points are close to each other. This adjustment ensures a more accurate estimation of local densities by considering different conditions for neighboring values, improving the overall diversity of the population and leading to better exploration of the solution space. The improved formula helps to avoid clustering of solutions and supports a more even distribution across the Pareto front, enhancing the performance of the algorithm in multi-objective optimization tasks.
[Eq. (15)]
[Eq. (16)]
where M is the total number of objective functions, j is the individual’s position in the ranking, the per-objective term is the distance ratio of the individual with respect to the mth objective, and the two neighboring terms are the objective function values at the positions adjacent to the individual. The closer the resulting value is to M, the more uniformly the individual is positioned; the smaller the value, the more severe the crowding.
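For reference, the original crowding distance of NSGA-II, which serves as the baseline for the improvement, can be sketched as follows. This is the standard formulation and not the improved formula of this paper; the function name and the sample front are illustrative.

```python
def crowding_distance(front):
    """Standard NSGA-II crowding distance for a list of objective vectors."""
    n = len(front)
    if n == 0:
        return []
    n_objectives = len(front[0])
    distance = [0.0] * n
    for k in range(n_objectives):
        # Rank the individuals along objective k.
        idx = sorted(range(n), key=lambda i: front[i][k])
        low, high = front[idx[0]][k], front[idx[-1]][k]
        # Boundary individuals are always kept.
        distance[idx[0]] = distance[idx[-1]] = float("inf")
        if high == low:
            continue
        for pos in range(1, n - 1):
            gap = front[idx[pos + 1]][k] - front[idx[pos - 1]][k]
            distance[idx[pos]] += gap / (high - low)
    return distance

d = crowding_distance([(0, 4), (1, 3), (2, 2), (4, 0)])
```

The baseline only uses the span between the two neighbors, so a point that sits very close to one neighbor and far from the other can still receive a large distance. This is the kind of uneven spacing that a ratio-based measure is intended to detect.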
Evolution strategy
Foraging stage
In the foraging stage, crossover plays a vital role in combining information from different solutions (parent chromosomes) to generate new, potentially better solutions. The purpose of employing crossover in this stage is to enhance the exploration capability of the algorithm, ensuring that the search does not get trapped in local optima and can explore diverse areas of the solution space.
The crossover method used in Fig. 9(a) involves a crossover with an elite solution. A random probability check determines whether crossover occurs between the current solution and an elite solution, as illustrated in Fig. 7(a). The elite solution represents one of the best solutions identified thus far, guiding the search towards promising regions of the solution space. This method blends high-quality traits from the elite solution with those of the current solution, enhancing the algorithm’s ability to explore new possibilities. The resulting offspring can inherit advantageous features, which contributes to refining the overall solution. A further mutation step is applied to a new solution if another random check passes, based on Fig. 8(a).
Fig. 9.
Evolution strategy of foraging stage.
Figure 9(b), on the other hand, emphasizes exploitation by using a crossover method with the best solution. Here, the decision is also based on a probability, but the crossover occurs between the current solution and solutions from the Pareto front, as shown in Fig. 7(b). This approach ensures that the offspring inherits characteristics from the best solutions identified so far, promoting convergence towards high-quality solutions. At the same time, it maintains some level of diversity by using randomized crossover points, which allows for continued exploration, albeit more focused on refining the known best solutions. A further mutation step is applied to a new solution if another random check passes, based on Fig. 8(b).
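The control flow shared by both foraging variants, a probabilistic crossover followed by a probabilistic mutation, can be sketched as below. The comparison direction of the random checks, the use of pc and pm as thresholds, and the function signature are assumptions; the crossover and mutation operators are passed in as callables.

```python
import random

def foraging_step(current, guide, pc, pm, crossover, mutate, rng):
    """One foraging update of a single solution.

    guide is the elite solution (Fig. 9(a)) or a Pareto-front solution (Fig. 9(b)).
    """
    new = current
    if rng.random() < pc:        # first random check: crossover with the guide
        new = crossover(current, guide)
    if rng.random() < pm:        # second random check: further mutation
        new = mutate(new)
    return new

# Toy operators, used only to make the control flow visible.
take_guide = lambda a, b: list(b)
append_zero = lambda s: list(s) + [0]

only_crossover = foraging_step([1, 2], [9, 9], 1.0, 0.0, take_guide, append_zero, random.Random(0))
only_mutation = foraging_step([1, 2], [9, 9], 0.0, 1.0, take_guide, append_zero, random.Random(0))
```

Passing the operators as arguments lets the same skeleton serve both variants: the position-based crossover with an elite solution and the workshop-restricted crossover with a Pareto-front solution.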
Summer resort stage
In this stage, mutation plays a crucial role in maintaining diversity within the population and allowing the algorithm to explore new regions of the solution space. Mutation is particularly important during the Summer Resort Stage because it prevents the population from becoming too homogeneous, which could limit the exploration of potential optimal solutions.
Figure 10(a) focuses on a mutation-based approach. If a random number satisfies the mutation condition, the current solution undergoes mutation to generate a new candidate solution, as outlined in Fig. 8(a). This type of mutation introduces relatively larger changes to the genes, ensuring that the solution space is explored more thoroughly. If the condition is not met, the new solution is set equal to the current one. A further mutation step is applied to the new solution if another random check passes, based on Fig. 8(b). This two-step mutation process allows for both moderate adjustments and larger deviations, providing flexibility in the search process.
Fig. 10.
Evolution strategy of summer resort stage.
Figure 10(b) integrates crossover with a mutation backup. The first decision involves a crossover, in which the current solution is combined with Pareto front solutions if a random check against the crossover probability passes, as shown in Fig. 7(a). This crossover mixes genes from two different solutions, promoting the combination of beneficial traits and encouraging exploration. If the crossover condition fails, the solution remains unchanged. After the crossover, if a further random check passes, the new solution undergoes mutation, in which a new solution is randomly generated. This ensures that even after crossover, the algorithm can introduce novel solutions to maintain diversity.
Adaptive temperature adjustment based on Q-learning
In the ACOA, temperature is a critical parameter that influences the balance between exploration and exploitation. Simply generating this parameter using random numbers often results in suboptimal solutions. To address this, Q-learning is integrated into the algorithm, enabling dynamic and adaptive adjustment of the temperature parameter. This approach makes the parameter more responsive to changes, enhancing the overall quality of the algorithm. Additionally, the Q-table is pre-trained to ensure that the Q-learning process begins with a reasonable understanding of the state-action pairs, further improving the algorithm’s performance. Implementing Q-learning-based adaptive parameter adjustment involves several key steps.
Design state
In the ACOA, the COA serves as the environment, where the solutions at different time steps represent various states. The RL agent receives these states and uses them to influence the choice of actions in the subsequent time steps. The agent makes action decisions based on changes in the fitness values across past time steps, whether they increase, decrease, or remain stable. Therefore, the state set is defined as follows.
[Eq. (17)]
[Eq. (18)]
where $S_t$ is the state observed by the agent at time step t, calculated using Eq. (17), and $f_t$ denotes the IGD evaluation index at time t, calculated using Eq. (18). In Eq. (18), |P| is the number of solutions in the solution set P, |R| is the number of solutions in the optimal solution set R of the current external archive, and d(p,r) is the distance between solutions p and r. IGD is chosen as the performance indicator because it measures both the convergence and the diversity of solutions in multi-objective optimization54. It evaluates how closely the obtained Pareto front approximates the true Pareto front while also considering the spread of solutions along the front. A lower IGD value indicates better convergence and a more diverse set of solutions. By monitoring changes in IGD, the RL agent can make informed adjustments to the algorithm, aiming to improve both the quality and diversity of solutions over time.
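The IGD indicator and the derivation of a state from its change can be sketched as follows. The sketch uses the standard IGD definition, the mean distance from each reference solution to its nearest obtained solution, whose normalization may differ from Eq. (18); the three state labels and the tolerance `tol` are assumptions.

```python
import math

def igd(obtained, reference):
    """Mean distance from each reference point to its nearest obtained point."""
    return sum(min(math.dist(p, r) for p in obtained) for r in reference) / len(reference)

def state_from_igd(previous, current, tol=1e-9):
    """Map the change in IGD to one of three states (lower IGD is better)."""
    if current < previous - tol:
        return "improved"
    if current > previous + tol:
        return "worsened"
    return "stable"

f_prev = igd([(0.0, 0.0)], [(3.0, 4.0), (0.0, 0.0)])               # (5 + 0) / 2
f_curr = igd([(0.0, 0.0), (3.0, 4.0)], [(3.0, 4.0), (0.0, 0.0)])   # both reference points matched
state = state_from_igd(f_prev, f_curr)
```

Because the reference set is the current external archive rather than a known true front, the indicator measures progress relative to the best solutions found so far.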
Design action
In the ACOA, RL is used to adaptively adjust the temperature parameter, which influences the behavior of the crayfish population during different stages of the algorithm. Three actions are defined to form the action set: increasing temperature, reducing temperature, and maintaining the current temperature, as shown in the modified Eq. (19):
[Eq. (19)]
where $Temp_t$ represents the temperature value at time step t. The initial Temp is randomly generated between 20 and 35. The choice of action allows the RL agent to dynamically adjust the temperature, balancing the exploration and exploitation phases based on the performance of the population over time. The overall adjustment maintains the adaptability of the algorithm without changing the total search space.
Design reward function
When the environment state improves, such as when the fitness of the crayfish population increases, the agent receives a positive reward. Conversely, if the environment state deteriorates—meaning the fitness of the population decreases—the agent is given a negative reward. The agent uses this feedback to adjust its actions at the next time step to maximize long-term performance. The reward function can be defined as Eq. (20):
[Eq. (20)]
where $Reward_t$ represents the reward provided by the environment at time step t.
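The interaction of state, action, and reward can be sketched with a tabular Q-learning update. The update rule is the standard one; the temperature step of 1.0, the clamping to the range 20 to 35, the reward of 1.0, and the values of the learning rate, discount factor, and epsilon are assumptions, since they are not specified in the text above.

```python
import random

STATES = ("improved", "stable", "worsened")
ACTIONS = (+1.0, -1.0, 0.0)   # raise, lower, keep the temperature (assumed step size)

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice over the actions of one state."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])

def adjust_temperature(temp, action, low=20.0, high=35.0):
    """Apply the chosen action and keep the temperature inside [low, high]."""
    return min(high, max(low, temp + ACTIONS[action]))

q = [[0.0] * len(ACTIONS) for _ in STATES]
temp = 30.0
a = choose_action(q[1], epsilon=0.0, rng=random.Random(0))   # greedy choice in state "stable"
temp = adjust_temperature(temp, a)
q_update(q, s=1, a=a, reward=1.0, s_next=0)                   # IGD improved afterwards
```

A pre-trained Q-table, as described above, would simply replace the zero initialization of `q`.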
Computational study
Problem settings and data generation
To evaluate the solution efficiency and quality of the proposed ACOA, we conducted simulation experiments for the complex problem, whose solution space is of order $O(O_i \times Q_i \times B_{nj}! \times G^{O_i \times Q_i} \times M_{ge}^{O_i \times Q_i \times B_{nj}})$. The experimental dataset includes parameters for workshops, equipment, orders, and products. Workshop parameters cover the number of workshops and equipment, while order parameters encompass the number of orders, types of products, number of product components, and component and setup times based on actual enterprise production data. The performance of the ACOA was compared with that of NSGA-II55 and WOA56. Table 3 provides the parameters and their ranges for various problem instances. All simulations were performed using MATLAB R2021a on a system with an Intel(R) Pentium(R) CPU (2.90 GHz, dual-core) and 2 GB of memory.
Table 3.
Parameters range for experimental problems.
| Parameters | Value (Small) | Value (Medium) | Value (Large) | Parameters | Value | Parameters | Value |
|---|---|---|---|---|---|---|---|
| G | [2, 3] | [3, 5] | [4, 5] | Liqj | [216–300] | Tyn | [3, 5] |
| Mge | [2, 4] | [3, 6] | [4, 7] | Vge | [20–24] | Bnj | [6, 9] |
| Oi | [5, 10] | [11, 15] | [18, 25] | Sii’,ge | [3, 8] | Qi | [1, 3] |
Parameters tuning for the algorithm
The proposed ACOA has four key parameters: popsize, maxgen, pc, and pm. The levels of each parameter are based on preliminary experiments and existing literature57, as shown in Table 4. Taguchi analysis is used to obtain the optimal parameter combination for ACOA, with the values of the objective functions CS, DMWL, and TST as response variables. The experimental design follows an L9 (3^4) orthogonal array. Taking medium-scale data as an example, each parameter combination is tested 10 times and the average value is calculated, giving 90 experimental runs in total. In the Taguchi analysis, conducted using Minitab, the goal for the objective function CS is to maximize the value, so the "larger is better" criterion is applied. For the objective functions TST and DMWL, the goal is to minimize the values, so the "smaller is better" criterion is used. Figure 11 illustrates the main effects of the parameters on the responses for the medium-scale problem instance. From the curves, the optimal parameter settings can be identified, and the corresponding optimal parameter values are highlighted in bold in Table 4. The parameter settings for the WOA and NSGA-II are based on values from the reference literature58,59.
Table 4.
Algorithm parameters and values.
| Parameters | Value | Parameters | Value |
|---|---|---|---|
| popsize | 50,100,150 | pc | 0.8,0.85,0.9 |
| maxgen | 100,150,200 | pm | 0.05,0.1,0.15 |
Fig. 11.
The impact of key parameters on different objective values.
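The two Taguchi criteria correspond to the standard signal-to-noise (S/N) ratios sketched below. The formulas are the usual textbook definitions; the function names and the sample responses are illustrative and are not data from the experiments.

```python
import math

def sn_larger_is_better(responses):
    """S/N ratio for a response to be maximized (used here for CS)."""
    return -10.0 * math.log10(sum(1.0 / y ** 2 for y in responses) / len(responses))

def sn_smaller_is_better(responses):
    """S/N ratio for a response to be minimized (used here for TST and DMWL)."""
    return -10.0 * math.log10(sum(y ** 2 for y in responses) / len(responses))

# Hypothetical replicated responses for one parameter combination.
sn_cs = sn_larger_is_better([10.0, 10.0])
sn_tst = sn_smaller_is_better([10.0, 10.0])
```

For both criteria a higher S/N ratio is preferable, so the best level of each parameter is the one with the highest mean S/N ratio in the main-effects plot.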
Comparative experiment and result analysis
Performance Comparison of ACOA with Competitive Algorithms.
In this section, a performance comparison is conducted between the proposed ACOA and other competitive algorithms. The test problem instances are categorized into three sizes: small, medium, and large. Each problem instance was executed 10 times for all algorithms considered independently. Table 5 presents the average values of the HV, IGD, GD and NR metrics, while Table 6 displays the experimental results of the C-measure for each algorithm. The data in bold within the tables indicate the best performance values.
Table 5.
Mean values of performance indicators for ACOA, COA, WOA, and NSGA-II.
| Instance | Metric | ACOA | COA | WOA | NSGA-II |
|---|---|---|---|---|---|
| Small | HV | 0.70 | 0.46 | 0.44 | 0.45 |
| | IGD | 5.36 | 88.70 | 109.56 | 63.05 |
| | GD | 8.53 | 12.02 | 11.75 | 11.52 |
| | NR | 0.52 | 0.14 | 0.14 | 0.20 |
| Medium | HV | 0.65 | 0.43 | 0.41 | 0.44 |
| | IGD | 19.83 | 304.42 | 188.48 | 322.61 |
| | GD | 11.25 | 17.15 | 18.05 | 17.32 |
| | NR | 0.57 | 0.13 | 0.14 | 0.16 |
| Large | HV | 0.60 | 0.24 | 0.25 | 0.35 |
| | IGD | 279.11 | 897.01 | 894.35 | 358.82 |
| | GD | 25.64 | 35.55 | 32.68 | 28.31 |
| | NR | 0.52 | 0.13 | 0.16 | 0.20 |
Table 6.
C-matrix coverage metric comparison results of ACOA with other algorithms.
| Small | ACOA | COA | WOA | NSGA-II |
|---|---|---|---|---|
| ACOA | – | 0.68 | 0.66 | 0.52 |
| COA | 0.00 | – | 0.12 | 0.21 |
| WOA | 0.01 | 0.13 | – | 0.13 |
| NSGA-II | 0.03 | 0.18 | 0.15 | – |
| Medium | ACOA | COA | WOA | NSGA-II |
| ACOA | – | 0.72 | 0.68 | 0.64 |
| COA | 0.00 | – | 0.38 | 0.20 |
| WOA | 0.09 | 0.22 | – | 0.27 |
| NSGA-II | 0.00 | 0.25 | 0.38 | – |
| Large | ACOA | COA | WOA | NSGA-II |
| ACOA | – | 0.63 | 0.63 | 0.56 |
| COA | 0.01 | – | 0.27 | 0.18 |
| WOA | 0.02 | 0.33 | – | 0.22 |
| NSGA-II | 0.03 | 0.27 | 0.33 | – |
To ensure a fair and consistent comparison, every algorithm was run with a popsize of 100 and a maxgen of 150 generations in each of the 10 independent runs.
In Table 5, the average performance values across all instances clearly demonstrate that ACOA outperforms other algorithms in key metrics such as HV, IGD, GD, and NR. ACOA excels in HV by an average of 76% over the other algorithms, outperforms the others by approximately 80% in IGD, shows about 28% better performance in GD, and surpasses the other algorithms by about 220% in NR. These results highlight the consistent advantages of ACOA in terms of coverage, convergence, and solution diversity. By incorporating Q-learning to adaptively adjust the temperature, we improved the algorithm’s NR metric, enabling it to find more Pareto solutions. Additionally, the enhancement of the crowding distance calculation significantly boosts ACOA’s HV metric, leading to a clearer advantage in convergence and diversity, and ensuring better solution distribution.
Figure 12 complements these averages by presenting the distribution of performance across the 10 trials for each algorithm, providing a more granular view of the algorithms’ robustness. While ACOA consistently outperforms the other algorithms, Fig. 12 also highlights the variability in performance across different runs. Notably, ACOA shows significant improvements in both convergence (HV) and solution diversity (NR) in large-scale instances, further underscoring its capability to handle the increased complexity of large-scale multi-objective optimization problems.
Fig. 12.
Robustness of solutions from different algorithms based on IGD, GD, HV, and NR.
Additionally, as shown in Table 6, ACOA consistently achieves the highest C-metric values, meaning that its solution sets cover a larger share of the other algorithms’ solutions than vice versa. For small instances, ACOA achieves coverage values of 0.68, 0.66, and 0.52 over COA, WOA, and NSGA-II, respectively, whereas the coverage of ACOA by COA, WOA, and NSGA-II is only 0.00, 0.01, and 0.03. In medium-sized instances, the coverage values of ACOA rise to 0.72, 0.68, and 0.64, against 0.00, 0.09, and 0.00 in the reverse direction. Even in large instances, ACOA maintains coverage values of 0.63, 0.63, and 0.56, against 0.01, 0.02, and 0.03 in the reverse direction, demonstrating its strong and consistent performance in multi-objective optimization tasks.
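The coverage values in Table 6 follow the usual two-set coverage (C-metric) idea, which can be sketched as below. The sketch assumes that all objectives are written in minimization form, so a maximized objective such as CS would be negated first; the function names and the sample sets are illustrative.

```python
def weakly_dominates(a, b):
    """True if a is no worse than b in every (minimized) objective."""
    return all(x <= y for x, y in zip(a, b))

def coverage(set_a, set_b):
    """C(A, B): fraction of solutions in B covered by at least one solution in A."""
    covered = sum(1 for b in set_b if any(weakly_dominates(a, b) for a in set_a))
    return covered / len(set_b)

A = [(1.0, 1.0)]
B = [(2.0, 2.0), (0.0, 3.0)]
c_ab = coverage(A, B)   # (2, 2) is covered by (1, 1); (0, 3) is not
c_ba = coverage(B, A)   # neither point of B covers (1, 1)
```

The metric is not symmetric, which is why Table 6 reports both C(A, B) and C(B, A) for every pair of algorithms.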
The algorithm incorporates an improved crowding distance selection strategy based on NSGA-II, effectively balancing global and local optimization, thereby enhancing its ability to handle conflicting scheduling objectives. Additionally, the integration of Q-learning improves the algorithm’s adaptability, allowing it to dynamically adjust the temperature parameter based on solution changes, thereby achieving a better balance between exploration and exploitation. Therefore, compared to competing algorithms, the proposed ACOA demonstrates strong convergence and diversity in multi-objective optimization across instances of varying scales.
Statistical analysis
We conducted experimental validation across three sets of instances, using a total of 30 benchmark instances. IGD and HV metrics were chosen to assess performance, as they effectively evaluate the comprehensive capabilities of algorithms60. Table 7 presents the Friedman test results for these two performance metrics. The asymptotic significance (Asymp. Sig.) values for IGD and HV are 9.7791e−12 and 7.0346e−10, respectively, both well below 0.05, indicating a significant difference in performance among the compared algorithms. This difference is attributed to two factors: the Q-learning-based adaptive adjustment of the exploration and exploitation strategy, which improves the algorithm’s convergence, and the enhanced crowding distance calculation, which guides the crayfish individuals toward better-distributed solutions. Figure 13 displays the rankings of the four algorithms based on the Nemenyi post-hoc test.
Table 7.
Friedman test of benchmark instances.
| Metrics | N | Chi-Square | Asymp. Sig | Significance Conclusion |
|---|---|---|---|---|
| HV | 30 | 45.56 | 7.0346e−10 | True |
| IGD | 30 | 54.28 | 9.7791e−12 | True |
From Table 7 and Fig. 13, we can conclude that ACOA demonstrates the best overall performance.
Fig. 13.
Nemenyi post-hoc test of algorithms on benchmark instances.
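The significance values in Table 7 can be checked from the reported chi-square statistics. With four algorithms, the Friedman statistic is referred to a chi-square distribution with 4 − 1 = 3 degrees of freedom, whose upper-tail probability has a closed form. The function name below is illustrative; the formula itself is standard.

```python
import math

def chi2_sf_3df(x):
    """P(X > x) for a chi-square variable with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

p_hv = chi2_sf_3df(45.56)    # Chi-Square for HV in Table 7
p_igd = chi2_sf_3df(54.28)   # Chi-Square for IGD in Table 7
```

The resulting values are close to the 7.0346e−10 and 9.7791e−12 listed in Table 7; small deviations are expected because the chi-square statistics are reported to two decimal places.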
Enterprise case
The simulation example is based on five production orders from a wind power company over a specified period, with the quantities for each order as follows: 2, 4, 4, 4, and 3 units. The corresponding product models are Ty1, Ty2, Ty2, Ty3 and Ty4, and their machining data is provided in Table 8. The delivery deadlines for each order are as follows: [40, 50, 60, 70], [80, 100, 120, 140], [30, 45, 90, 100], [150, 180, 220, 230], and [120, 130, 170, 180]. We assessed the practical performance of ACOA, COA, NSGA-II, and WOA. Each algorithm was independently run using the same population size and number of iterations. Figure 14 displays the Pareto optimal solutions from different perspectives for the four algorithms. Figure 14(a) offers a three-dimensional perspective with three criteria, showing that the proposed ACOA more closely approximates the true Pareto front than COA, NSGA-II, and WOA. Figure 14(b) highlights that the optimal balance between minimizing deviation in equipment region load and maximizing customer satisfaction (CS) is achieved by ACOA. Figure 14(c) presents a two-dimensional view that focuses on the trade-off between maximizing customer satisfaction and minimizing setup time (ST). Figure 14(d) provides a visual representation considering setup time and deviation in machine workshop load (DMWL), demonstrating that ACOA outperforms the other three algorithms in solution quality. This underscores the advantages of the proposed ACOA.
Table 8.
Machining data.
| Product model | Product component | Workshop1 M11 (h) | Workshop1 M12 (h) | Workshop1 M13 (h) | Workshop2 M21 (h) | Workshop2 M22 (h) | Workshop2 M23 (h) | Workshop3 M31 (h) | Workshop3 M32 (h) |
|---|---|---|---|---|---|---|---|---|---|
| Ty1 | 1 | 12.8 | 11.5 | 12.5 | 11.5 | 11.5 | 12.0 | 11.8 | 11.5 |
| | 2 | 12.7 | 11.4 | 9.8 | 11.4 | 10.6 | 12.2 | 11.7 | 11.4 |
| | 3 | 11.3 | 11.1 | 12.0 | 11.1 | 11.1 | 13.3 | 11.4 | 11.1 |
| | 4 | 12.2 | 11.0 | 10.6 | 11.0 | 10.6 | 11.6 | 11.3 | 11.0 |
| | 5 | 14.0 | 12.2 | 12.9 | 12.6 | 12.6 | 11.9 | 12.9 | 12.6 |
| | 6 | 13.2 | 11.8 | 12.2 | 11.8 | 11.9 | 11.3 | 12.2 | 11.9 |
| | 7 | 12.6 | 12.6 | 11.6 | 11.3 | 11.4 | 13.2 | 11.6 | 11.3 |
| | 8 | 11.3 | 10.7 | 13.6 | 11.8 | 11.3 | 11.8 | 11.0 | 10.7 |
| Ty2 | 1 | 10.3 | 11.4 | 11.0 | 11.8 | 11.7 | 11.0 | 11.2 | 12.1 |
| | 2 | 10.2 | 11.7 | 12.4 | 11.7 | 11.4 | 12.2 | 11.1 | 12.0 |
| | 3 | 11.8 | 11.5 | 10.7 | 11.9 | 10.7 | 12.0 | 10.9 | 11.8 |
| | 4 | 10.4 | 11.6 | 10.8 | 11.6 | 11.2 | 12.5 | 11.2 | 12.1 |
| | 5 | 11.0 | 11.5 | 9.8 | 11.2 | 9.8 | 10.7 | 10.9 | 11.6 |
| | 6 | 9.4 | 10.6 | 9.4 | 10.6 | 9.4 | 11.7 | 11.4 | 10.3 |
| | 7 | 11.5 | 11.5 | 12.0 | 13.2 | 12.0 | 12.6 | 12.3 | 10.9 |
| | 8 | 11.2 | 12.9 | 11.7 | 12.9 | 11.7 | 11.4 | 12.0 | 12.5 |
| Ty3 | 1 | 13.6 | 11.7 | 11.0 | 15.1 | 14.7 | 10.7 | 10.5 | 11.5 |
| | 2 | 11.4 | 11.4 | 14.5 | 10.4 | 13.8 | 15.8 | 14.5 | 13.2 |
| | 3 | 11.1 | 14.9 | 10.8 | 10.1 | 14.1 | 10.9 | 11.4 | 15.6 |
| | 4 | 13.0 | 11.9 | 12.0 | 14.2 | 11.2 | 11.7 | 14.3 | 15.3 |
| | 5 | 11.5 | 10.3 | 11.5 | 10.5 | 10.5 | 12.4 | 10.7 | 14.8 |
| | 6 | 12.3 | 12.7 | 12.0 | 11.2 | 11.5 | 11.5 | 12.0 | 12.3 |
| | 7 | 11.7 | 12.6 | 11.8 | 13.4 | 10.4 | 10.6 | 11.4 | 12.5 |
| | 8 | 12.1 | 10.9 | 11.9 | 14.6 | 10.6 | 12.5 | 10.5 | 12.3 |
| | 9 | 11.1 | 10.4 | 10.8 | 13.7 | 9.5 | 10.0 | 10.3 | 10.6 |
| Ty4 | 1 | 12.6 | 11.3 | 12.1 | 12.5 | 11.0 | 11.8 | 10.2 | 11.4 |
| | 2 | 12.4 | 10.8 | 14.3 | 15.1 | 11.3 | 11.3 | 11.5 | 15.4 |
| | 3 | 11.9 | 15.3 | 10.3 | 14.7 | 14.5 | 10.1 | 14.7 | 12.3 |
| | 4 | 13.4 | 9.4 | 10.1 | 15.3 | 9.6 | 15.4 | 15.8 | 15.2 |
| | 5 | 10.4 | 11.1 | 13.5 | 14.3 | 9.4 | 9.9 | 13.9 | 13.5 |
| | 6 | 12.2 | 12.8 | 10.5 | 16.3 | 11.6 | 11.7 | 13.1 | 12.5 |
| | 7 | 11.6 | 10.5 | 10.9 | 14.8 | 11.6 | 11.0 | 12.0 | 11.9 |
| | 8 | 9.7 | 10.9 | 11.8 | 15.9 | 10.2 | 11.4 | 11.2 | 11.0 |
| | 9 | 9.3 | 11.0 | 10.8 | 16.1 | 9.8 | 11.5 | 10.1 | 12.3 |
Fig. 14.
Comparison of Pareto solution sets obtained by three algorithms and mapping.
In actual production, production managers can use a combination of the Analytic Hierarchy Process (AHP) and the Entropy Weight Method (EWM) to rank and select schemes. An objective evaluation matrix is established based on expert scoring, and the resulting AHP weights for the objectives are WAHP = (0.5124, 0.1938, 0.2938). The weight vector of the Pareto solution set indexes calculated by the entropy weight method is WEWM = (0.3636, 0.3975, 0.3908). Combining the AHP and entropy weights with the multiplicative combination method yields the comprehensive weights W = (0.4926, 0.2037, 0.3037) for the evaluation indexes of the Pareto solution set. The solution matrix data is then standardized, and the indexes are aligned in direction by subtraction: for the first (positive) index, the column minimum is subtracted from each entry, and for the latter two (negative) indexes, each entry is subtracted from the column maximum. Under this evaluation, scheme 7 scores higher than the alternatives. The customer satisfaction of this scheme is 6.04, the changeover time is 100, and the region equipment load deviation is 73.14. Figure 15 shows the Gantt chart of this scheduling scheme.
Fig. 15.
Gantt chart of schedule for scheme 7.
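The multiplicative combination of the AHP and entropy weights can be reproduced with a few lines of Python. The sketch assumes the common form of the method, in which the element-wise products of the two weight vectors are normalized to sum to one; the function name is illustrative.

```python
def multiplicative_weights(w_a, w_b):
    """Combine two weight vectors by normalized element-wise products."""
    products = [a * b for a, b in zip(w_a, w_b)]
    total = sum(products)
    return [p / total for p in products]

w_ahp = (0.5124, 0.1938, 0.2938)
w_ewm = (0.3636, 0.3975, 0.3908)
w = multiplicative_weights(w_ahp, w_ewm)
```

The result agrees with the reported comprehensive weights W = (0.4926, 0.2037, 0.3037) to about three decimal places, which supports this reading of the combination step.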
In Fig. 16(a), the temperature variation curve compares scenarios with and without Q-learning. By incorporating reinforcement learning, the ACOA effectively adjusts the temperature parameters based on the reward feedback mechanism, dynamically refining the evolutionary strategy of the crayfish individuals. In the early stages, low temperatures encourage extensive exploration, resulting in significant fluctuations in the objective convergence curve. As the iterations progress and the temperature rises, the algorithm shifts towards exploitation, enabling the crayfish individuals to select better evolutionary strategies based on the feedback. In contrast, the COA generates temperature parameters randomly without adaptive adjustment, which leads to the random selection of evolutionary strategies and results in slower convergence and poorer solution quality. In Fig. 16(b), after normalizing the three objectives, the proposed ACOA outperforms the COA by approximately 13%, 10%, and 8% in the CS, ST, and DMWL objectives, respectively.
Fig. 16.
Objective convergence and temperature variation curves.
Conclusions
This study addresses the multi-objective scheduling challenges in the distributed production workshop of Wind Turbine Pultruded Panels (WTPP). An intelligent platform is proposed, integrating optimal results from ACOA to support decision-making and improve production planning. A multi-objective MILP model is developed to minimize setup time and machine load deviations while maximizing customer satisfaction, taking into account sequence-dependent constraints. The ACOA incorporates crossover and mutation operators inspired by genetic algorithms, making it suitable for solving discrete scheduling problems. Additionally, an improved crowding distance calculation enhances solution diversity, and Q-learning is used to dynamically adjust temperature parameters, optimizing the Crayfish evolution strategy and improving convergence. The performance comparison using multi-objective metrics such as HV, IGD, GD, and NR demonstrates that ACOA significantly outperforms COA, WOA, and NSGA-II, with average improvements of 76%, 80%, 28%, and 220%, respectively. In the application to WTPP-PSP, the proposed algorithm outperforms COA by approximately 13%, 10%, and 8% in the three objectives.
Despite the promising results, it is important to acknowledge the limitations of our study. The computational complexity of the proposed method can be further optimized to enhance scalability for larger and more complex industrial applications. The method does not account for uncertainties in real-world production issues, such as machine failures, rush order demands and transportation issues. Future research should explore a broader range of benchmark problems and real-world industrial applications to further validate and generalize the proposed method. Integrating other advanced metaheuristics and machine learning techniques with the ACOA algorithm could improve its adaptability and performance across different domains. A detailed sensitivity analysis of the algorithm’s parameters would provide deeper insights into their impact on performance and guide more effective parameter-tuning strategies.
Acknowledgements
We would like to express our gratitude to Scientific Reports and editors for providing us with the opportunity to submit our work.
Author contributions
Xin Yang: Conceptualization, Data analysis, Software, Writing – Original Draft; Xiaoying Yang: Supervision, Project administration, Review & Editing; Jinhao Du: Algorithm Optimization and Improvement.
Funding
Funding was provided by Key Research and Development Project of Henan Province, Grant No. 231111222600.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Xin Yang and Xiaoying Yang.
Contributor Information
Xin Yang, Email: yangxinkedahn@163.com.
Xiaoying Yang, Email: lyyxy111@163.com.