Scientific Reports. 2024 Mar 19;14:6521. doi: 10.1038/s41598-024-56957-8

Multi-objective and multi constrained task scheduling framework for computational grids

Sujay N Hegde 1, D B Srinivas 2, M A Rajan 3, Sita Rani 4, Aman Kataria 5, Hong Min 6
PMCID: PMC10948903  PMID: 38499637

Abstract

Grid computing emerged as a powerful computing domain for running large-scale parallel applications. Scheduling computationally intensive parallel applications, such as scientific and commercial workloads, on computational grids is an NP-complete problem. Many researchers have proposed task scheduling algorithms for grids by formulating and solving the problem as an optimization problem with different objective functions such as makespan, cost, and energy. Further, to address the requirements of users (lower cost, lower latency, etc.) and grid service providers (high utilization and high profitability), a task scheduler needs to be designed by solving a multi-objective optimization problem, since there are several trade-offs among the objective functions. In this direction, we propose an efficient multi-objective task scheduling framework to schedule computationally intensive tasks on heterogeneous grid networks. This framework minimizes turnaround time, communication cost, and execution cost while maximizing grid utilization. We evaluated the performance of the proposed algorithm through experiments conducted on standard, random, and scientific task graphs using the GridSim simulator.

Keywords: Grid computing, Directed acyclic graph, Scientific graph, GridSim, TOPSIS

Subject terms: Energy science and technology, Engineering, Mathematics and computing

Introduction

Applications with high computational and data demands, such as climate modelling, drug discovery, genomics, bioinformatics, financial modelling, data analytics, and healthcare informatics, are fueling the demand for computational grids1–10. Computational grids have emerged as powerful computational paradigms, facilitating large-scale, distributed computing through the utilization of interconnected computing and storage resources. The optimal allocation of tasks to resources in computational grids becomes increasingly intricate due to various constraints, including resource heterogeneity, dynamic workload characteristics, system dynamics, and adherence to user Quality of Service (QoS) parameters, such as latency and cost.

Grid service providers typically aim to maximize profits, while users seek to minimize execution costs, communication costs, and turnaround time for their applications. One approach to achieving this is to design efficient task schedulers to schedule user applications on computational grids. Efficient task schedulers play a crucial role in achieving these objectives, enabling intelligent decisions regarding task allocation and resource management within specified constraints. Although task scheduling is an NP-complete problem11, designing efficient task scheduling algorithms for computational grids is essential to meet user-defined QoS requirements.

The design of task scheduling algorithms is based on either single or multiple objective functions. Task scheduling algorithms based on a single objective function are not suitable for scheduling complex real-time applications. Single-objective task scheduling algorithms primarily focus on optimizing a specific objective (minimizing makespan, cost, energy, etc.) using heuristics, metaheuristics, or mathematical optimization techniques to find near-optimal scheduling sequences. A single objective function yields the best solution with respect to that objective alone, corresponding to either its minimum or maximum value. However, such algorithms often fail to consider other objectives, resulting in imbalanced resource utilization, increased energy consumption, and so on. These algorithms are based on meta-heuristics12, greedy methods13, fuzzy models14, game theory15, bio-inspired techniques16, and more. In real-world applications, however, it is necessary to take several conflicting goals into account at once. For instance, maximizing resource utilization, minimizing turnaround time, and minimizing task execution cost are equally crucial for improving system efficiency. Task scheduling algorithms based on multi-objective criteria address these limitations by simultaneously optimizing multiple objectives, offering users more robustness to prioritize one or more criteria over others and a more diverse set of solutions.

Multi-objective optimization involves optimizing multiple conflicting objectives simultaneously. Common heuristic approaches for multi-objective task scheduling include genetic algorithms (NSGA, NSGA-II)17,18, particle swarm optimization (MOPSO)19, simulated annealing (MOSA)20, ant colony optimization (MOACO)21, and other evolutionary algorithms (MOEAs)22. These methods leverage principles inspired by natural processes to explore the solution space and find trade-off solutions among conflicting objectives. In our proposed method, heuristics are used as general problem-solving strategies, employing intuitive, trial-and-error methods to quickly find effective solutions. This systematic approach is designed to identify the best solution with respect to a defined objective function or set of criteria. Heuristics serve as rule-of-thumb methods and are particularly valuable when an exhaustive search or an exact solution is impractical. The objective of incorporating heuristic approaches into our framework is to strike a balance among competing objectives: minimizing turnaround time, execution cost, and communication cost while maximizing resource utilization. Heuristics enable practical and computationally efficient solutions, especially in scenarios where finding an optimal solution is challenging or infeasible. In this article, we propose a task scheduling algorithm based on a multi-objective optimization formulation with objective functions that minimize turnaround time (TAT), task execution cost, and data communication cost between resources, and maximize grid utilization in a heterogeneous multi-grid environment. The proposed framework is plugged into the GridSim architecture as shown in Fig. 1 (green colour). The framework contains five different schedulers:

1. Greedy scheduler: prioritizes minimizing turnaround time, communication cost, and execution cost while maximizing grid utilization.
2. Greedy communication cost scheduler: minimizes communication cost by distributing tasks across computing resources within a single grid.
3. Greedy execution cost scheduler: minimizes execution cost by scheduling each task on the most suitable subset of computing resources based on their cost-to-performance ratio.
4. Greedy no-fragmentation scheduler: treats tasks as non-fragmentable and schedules each task on an individual computing resource.
5. Random scheduler: schedules tasks on a random subset of computing resources.

Figure 1.

Figure 1

Proposed multi-layer architecture.

We summarize our contributions as follows:

(1) Formulating a task scheduling framework with multiple objectives. (2) Integrating the proposed framework with the GridSim simulator and evaluating its performance. (3) Applying the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to solve the proposed multi-objective optimization problem for task scheduling.

The rest of this paper is organized as follows. Section "Related work" describes the related work. Section "System model" describes the system model. In Sect. "Formulation of multi-objective optimization for task scheduling", objective functions are formulated for TAT, execution cost, communication cost and grid utilization. The task scheduling algorithm is presented in Sect. "Proposed task scheduling algorithm". In Sect. "Demonstration of the proposed task scheduling algorithm", the proposed task scheduling algorithm is demonstrated. In Sect. "Results and discussion", the results are discussed. The multi-objective decision-making problem is presented in Sect. "Formulation of the multi-Objective-decision-making problem". Finally, in Sect. "Conclusion and future work", we conclude the paper.

Related work

In this section, we present a brief discussion of existing multi-objective task scheduling frameworks, algorithms, and models. A Grid-based Evolutionary Algorithm (GrEA) is proposed in Ref.23 to tackle multi-objective optimization problems by utilising the grid-based resource capacity to boost selection pressure in the best direction while maintaining a broad and uniform distribution of solutions. A framework based on the Ant Colony Algorithm is designed in Ref.24 to evaluate multiple objective functions (makespan, cost, deadline violation rate, and resource utilization) for scheduling tasks in cloud computing. A new bio-inspired diversity metric, Pure Diversity (PD), is proposed in Ref.25 to assess the diversity performance of multi-objective evolutionary algorithms (MOEAs) when solving many-objective optimization problems (MaOPs). A MATLAB-based platform, PlatEMO, is developed in Ref.26 for performing comparative experiments, embedding new algorithms, creating new test problems, and developing performance indicators; it includes more than 50 multi-objective evolutionary algorithms and more than 100 multi-objective test problems. A multi-objective particle swarm optimizer (NMPSO) with a Balanceable Fitness Estimation (BFE) method was designed in Ref.27 to tackle MaOPs. A multi-objective optimization method based on the non-dominated sorting genetic algorithm (NSGA-II) is applied and tested on an IEEE 17-bus test system28; it simultaneously minimizes two contradicting objective functions, voltage deviation at buses and total line loss. A multi-objective charging framework that incorporates a vehicle-to-grid (V2G) strategy is proposed in Ref.29 to optimally manage the real power dispatch of electric vehicles; its objective functions minimize load fluctuation and the charging costs associated with EVs in residential areas. The Partitional Clustering Method (PCM) and Hierarchical Clustering Method (HCM) are used in clustering-based evolutionary algorithms for tackling MaOPs30. For determining congestion thresholds in low-voltage (LV) grids, the authors in Ref.31 used a multi-objective particle swarm optimisation (MOPSO) approach paired with data analytics via affinity propagation clustering. A virtual machine migration method designed to maximize host release and minimize virtual machine migration is proposed in Ref.32. Task Scheduling for Deadline and Cost Optimization (DCOTS) is presented in Ref.33; this work ensures the fulfilment of user requirements while simultaneously aiming to maximize the profitability of cloud providers. The objective functions for building the multi-objective cloud task scheduling model in Ref.34 include execution time, execution cost, and virtual machine load balancing; the task scheduling problem is then addressed using the multi-factor optimization (MFO) technique, and the characteristics of task scheduling are integrated with the multi-objective multi-factor optimization (MO-MFO) algorithm to formulate an assisted optimization task. A task scheduling technique based on a Hybrid Competitive Swarm Optimization Algorithm (HCSOA-TS) is presented in Ref.35 in the context of the cloud computing platform; the proposed HCSOA-TS efficiently schedules tasks to maximize resource utilization and overall performance. A multi-objective task scheduling model for cloud computing is constructed in Ref.36 using the Cat Swarm Optimization (CSO) model; the task objectives for cloud computing were scrutinized, leading to a multi-objective task scheduling model with execution time and system load as the key scheduling objectives. The study in Ref.37 presents a parallel algorithm for task scheduling in which the priority assignment to tasks and the construction of the heap are executed concurrently. In the edge scheduling stage presented by the authors of Ref.38, tasks are arranged based on the latest start times of their successors instead of their sub-deadlines, with the goal of mitigating lateness in subsequent tasks.

In grid computing, the resource optimisation problem is treated as a multi-objective optimisation problem39, and PSO is used to search the problem space for possible solutions. To find non-dominated solutions for the multi-objective problem and to optimise and search for the best grid resources, the Functional Code Sieve algorithm is used. Similarly, various task scheduling algorithms based on multi-objective optimization have been studied40–47.

Resource management and task scheduling are intricate operations in computational grids. To manage distributed resources and evaluate scheduling algorithms and their performance with different numbers of resources, a toolkit named GridSim has been proposed. GridSim aids in the mapping of user tasks to grid resources. Several task scheduling algorithms have been simulated using GridSim since its introduction48–55.

The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is a method used for multi-criteria decision analysis. It was initially introduced in Refs.56–58. A-TOPSIS, presented in Ref.59, aims to compare the performance of different algorithms based on means and standard deviations; this technique identifies the best and worst algorithms based on user-defined parameters. Another method, D-TOPSIS, presented in Ref.60, is more effective in representing uncertain information than other group decision support systems based on the classical TOPSIS method. Fuzzy TOPSIS61 is a multi-objective decision-making tool used to find a scheduling algorithm that can minimize response time and maximize throughput. In Ref.62, the authors propose a method that combines the Heterogeneous Earliest Finish Time (HEFT) algorithm with the TOPSIS method to solve multi-objective problems. Thus, TOPSIS is a valuable decision-making technique because it provides a systematic and structured approach to evaluate and rank alternatives based on multiple criteria, helping end users make well-justified choices in complex decision scenarios.

System model

Task model

The task scheduling framework consists of a task graph, a task scheduler, and a grid network. A task graph is the input to the task scheduler and is defined as a Weighted Directed Acyclic Graph (WDAG) WTG = (T, E), where T is the set of tasks and E is the set of edges describing the dependencies between tasks. The weight W(Ti) assigned to task Ti represents the size of the i-th task and is expressed in Million Instructions (MI).

Grid model

The grid network consists of a set of grid nodes G = {G1, G2, G3, ..., Gm} interconnected by a high-speed network. Each grid node contains p heterogeneous processing elements Gi = {ri1, ri2, ri3, ..., rip}, which are internally connected by a high-speed communication network. The processing speed (CPU speed) of each processor is expressed in Million Instructions Per Second (MIPS). Each computational grid contains a local scheduler, whose function is to manage the execution of tasks assigned to the grid's resources by the task scheduler. The local scheduler is also responsible for periodically collecting information about the computational resources and communicating with the task scheduler.
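For concreteness, the task and grid models above can be represented as in the following minimal Python sketch; the class and field names (TaskGraph, GridNode, etc.) are illustrative conventions of this sketch rather than part of the framework itself, and the example values echo the small scenario used later in the demonstration section (the dependency pattern shown is only illustrative).

```python
from dataclasses import dataclass, field

@dataclass
class TaskGraph:
    # wt[i]: length of task T_i in Million Instructions (MI)
    wt: list
    # children[i]: indices of tasks that depend on task i (edges of the WDAG)
    children: dict = field(default_factory=dict)

@dataclass
class GridNode:
    # MIPS rating of each processing element r_i1 ... r_ip of this grid
    pe_mips: list
    # price per second for executing on any machine of this grid (PriceE_Gj)
    exec_price: float = 0.0

# Example: two grids, G1 with one 20-MIPS machine and G2 with two 20-MIPS machines,
# and a plausible four-task DAG with 60 MI per task.
grid_network = [GridNode(pe_mips=[20.0]), GridNode(pe_mips=[20.0, 20.0])]
tasks = TaskGraph(wt=[60.0, 60.0, 60.0, 60.0], children={0: [1], 1: [2, 3]})
```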

Simulation model

GridSim66

We have employed a Java-based discrete-event toolkit called GridSim to simulate our multi-objective task scheduling framework. This versatile toolkit offers a comprehensive suite of features for modelling and simulating resources and network connectivity, accommodating various capabilities and configurations. Among its capabilities are primitives for composing applications, information services for resource discovery, and interfaces for task allocation to resources and managing their execution. These capabilities enable us to simulate resource brokers or grid schedulers, facilitating the evaluation of scheduling algorithms’ performance. It’s worth noting that GridSim does not prescribe any specific application model, but in our proposed framework, we have adopted a Directed Acyclic Graph (DAG) as the application model. Within the GridSim environment, individual tasks can exhibit differing processing times and input file sizes. To represent these tasks and their requirements, we utilize Gridlet objects. Each Gridlet encapsulates comprehensive information related to a job, including execution management details such as job length (measured in MI), disk I/O operations, input and output file sizes, and the job’s originator. In the context of GridSim, a Processing Element (PE) stands as the smallest computing unit, configurable with varying capacities denoted in Million Instructions per Second (MIPS). Multiple PEs can be combined to construct a machine, and in a similar fashion, machines can be aggregated to form a grid. Grids can allocate Gridlets in either a time-sharing mode (common in single-processor Grids) or a space-sharing mode (typical for multi-processor Grids).

Existing GridSim architecture

Proposed multi-layer architecture and abstractions are shown in Fig. 1. The layered structure of this system begins with the foundational run-time machinery, known as the JVM (Java Virtual Machine). This JVM is versatile, catering to both single and multiprocessor systems, including clusters. Moving up to the second layer, we encounter a fundamental discrete-event infrastructure that relies on the interfaces offered by the first layer. This infrastructure is actualized through SimJava, a well-regarded Java library for discrete event simulation. The third layer delves into the simulation of essential grid entities, encompassing resources and information services, among others. Here, the GridSim toolkit employs the discrete event services provided by the underlying infrastructure to simulate these core resource entities. Ascending to the fourth layer, our attention turns to the simulation of resource aggregators, often referred to as grid resource brokers or schedulers. Finally, the fifth and topmost layer is dedicated to application and resource modelling across various scenarios. It harnesses the services furnished by the two lower-level layers to evaluate scheduling strategies, resource management policies, heuristics, and algorithms.

Life cycle of a GridSim simulation

Prior to commencing a simulation, we establish the resource entities (including PEs, Machines, and Grids) that will be available throughout the simulation. Upon GridSim’s initiation, these resource entities autonomously enroll themselves with the Grid Information Service (GIS) entity by dispatching relevant events.

Furthermore, at the onset of the simulation, a user initiates the process by submitting their job to a Resource Broker. The resource broker plays a pivotal role in the simulation, encompassing several responsibilities. It first employs information services to identify accessible resources for the user. Subsequently, it performs task-to-resource mapping (scheduling), orchestrates the staging of application components and data for processing (deployment), initiates job execution, and ultimately aggregates the results. Beyond these tasks, the resource broker also takes on the crucial role of monitoring and tracking the progress of application execution.

Our resource broker implementation

All the application models we have explored rely on task inter-dependencies, which are precisely defined using Directed Acyclic Graphs (DAGs). Regrettably, GridSim does not inherently accommodate the execution of tasks that are constrained by these inter-dependencies. In response to this limitation, our Resource Broker implementation extends support for such scenarios by ensuring that the order of task execution adheres to the specified dependency constraints. Our Resource Broker defines a versatile task Scheduler interface, offering seamless integration with various schedulers. This interface serves as a plug-and-play mechanism, enabling the utilization of multiple schedulers introduced in our work (GS, GCPS, GEPS, GNFS), all of which adhere to this common interface. Furthermore, our task scheduling framework introduces an innovative concept called task fragmentation, allowing tasks to be divided for execution across multiple computing resources. To facilitate this, our resource broker incorporates a Gridlet Fragmentation Service. When a gridlet is scheduled to run on more than one Processing Element, it is initially fragmented into multiple smaller virtual gridlets. These virtual gridlets are then individually executed by the allocated Processing Elements. Upon their completion, the Gridlet Fragmentation Service reunites them into the original single gridlet. Another novel concept introduced by our task scheduling framework involves partial dependencies among tasks. However, GridSim does not inherently enable the Resource Broker to monitor task progress during execution. To address this, we have implemented a pinger service within the Resource Broker and individual Processing Elements. This pinger service allows the Broker to stay informed about a gridlet’s execution progress, enabling it to schedule child tasks once a parent task has reached a predefined threshold percentage of execution, as dictated by the parent-child dependency.
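To illustrate the task-fragmentation idea described above (and not the GridSim API itself), the following hedged sketch splits a gridlet's length across the allocated processing elements in proportion to their MIPS ratings and derives the common finish time of the resulting virtual gridlets; the function names are hypothetical.

```python
def fragment_gridlet(length_mi, pe_mips):
    """Split a gridlet of length_mi (MI) across the given PEs proportionally to
    their MIPS ratings, so that all virtual gridlets finish at the same time."""
    total_mips = sum(pe_mips)
    return [length_mi * mips / total_mips for mips in pe_mips]

def fragment_finish_time(length_mi, pe_mips):
    """Finish time (seconds) when the virtual gridlets run in parallel."""
    return length_mi / sum(pe_mips)

# A 60 MI gridlet over three 20-MIPS PEs: three 20 MI virtual gridlets, 1 s each.
fragments = fragment_gridlet(60.0, [20.0, 20.0, 20.0])    # [20.0, 20.0, 20.0]
finish = fragment_finish_time(60.0, [20.0, 20.0, 20.0])   # 1.0
```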

Lastly, we have enhanced the Resource Broker with the capability to gather performance statistics, including Turnaround Time, Resource Utilization, Execution Price, and Communication Price. These statistics provide valuable insights into the system’s performance.

Formulation of multi-objective optimization for task scheduling

We formulate the task scheduling problem as a multi-objective optimization problem whose goal is to minimize TAT, execution price (EP), and communication price (CP) while maximizing grid utilization (GU) for precedence-constrained task graphs; this is represented as argmin(TAT, EP, CP, -GU).

The objective function for TAT is defined and formulated as shown in Eq. (1).

$$TAT = \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{p[j]} X_{ijk} \times \tau_{ijk} \qquad (1)$$

where

$$X_{ijk} = \begin{cases} 1, & \text{if task } T_i \text{ is scheduled on the } k\text{-th resource of the } j\text{-th grid} \\ 0, & \text{otherwise} \end{cases}$$

and $\tau_{ijk}$ is the execution time of task $T_i$ on the $k$-th resource of grid $G_j$.

$$GU = \frac{\sum_{i=1}^{n} WT_i}{\left(\sum_{j=1}^{m}\sum_{k=1}^{p[j]} WG_{jk}\right) \times TAT} \qquad (2)$$

Grid utilization is formulated in Eq. (2).

$$EP = \sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{k=1}^{p[j]} X_{ijk} \times \tau_{ijk} \times PriceE_{G_j} \qquad (3)$$

Task execution price and communication price are defined and formulated in Eqs. (3) and (4), respectively. The rest of the paper uses price and cost interchangeably.

$$CP = \sum_{i=1}^{n} \binom{M_i}{2} \times \max_{j=1}^{m}(\tau_{ij}) \times PriceC \qquad (4)$$

where

$$\tau_{ij} = \sum_{k=1}^{p[j]} X_{ijk}\,\tau_{ijk}$$

and

$$M_i = \sum_{j=1}^{m} X_{ij}, \qquad X_{ij} = \begin{cases} 1, & \text{if task } T_i \text{ is scheduled on any machine of grid } G_j \\ 0, & \text{otherwise} \end{cases}$$
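A minimal sketch of how Eqs. (1)–(4) could be evaluated for a given assignment is shown below, assuming the schedule is available as the indicator variables X[i][j][k] and execution times tau[i][j][k]; the data layout and the reading of the M_i-choose-2 factor in Eq. (4) follow the reconstruction above and are illustrative rather than the authors' implementation.

```python
import math

def evaluate_schedule(X, tau, WT, WG, price_exec, price_comm):
    """X[i][j][k] in {0,1}: task i on machine k of grid j; tau[i][j][k]: its
    execution time; WT[i]: task length (MI); WG[j][k]: machine MIPS;
    price_exec[j]: PriceE_Gj; price_comm: PriceC. Returns (TAT, GU, EP, CP)."""
    n, m = len(X), len(X[0])

    # Eq. (1): turnaround time
    TAT = sum(X[i][j][k] * tau[i][j][k]
              for i in range(n) for j in range(m) for k in range(len(X[i][j])))
    # Eq. (2): grid utilization = total work / (total capacity * TAT)
    GU = sum(WT) / (sum(sum(machines) for machines in WG) * TAT)
    # Eq. (3): execution price
    EP = sum(X[i][j][k] * tau[i][j][k] * price_exec[j]
             for i in range(n) for j in range(m) for k in range(len(X[i][j])))
    # Eq. (4): communication price; M_i = number of grids task i is spread over
    CP = 0.0
    for i in range(n):
        tau_ij = [sum(X[i][j][k] * tau[i][j][k] for k in range(len(X[i][j])))
                  for j in range(m)]
        M_i = sum(1 for j in range(m) if any(X[i][j]))
        CP += math.comb(M_i, 2) * max(tau_ij) * price_comm
    return TAT, GU, EP, CP
```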

Proposed task scheduling algorithm

The proposed multi-objective task scheduling algorithm is described in Algorithm 2. The algorithm generates an optimized schedule sequence (task-id, [grid-ID, machine-ID], execution start-time and end-time) according to multiple objectives (TAT, EC, CC and RU).

The inputs to the algorithm are the number of tasks (n), the task dependency graph (weighted adjacency matrix WTG[1, ..., n][1, ..., n]), the task lengths (WT[1,...,n]), the number of grids (m), the number of machines p[1, ..., m] in each grid, the processing capacity of each grid in terms of MIPS (WG[1,...,m]), and the user's objective optimization criterion (see Table 2 for the choices). The algorithm's output is the optimized task schedule sequence (Steps 1 and 2). Step 3 generates all possible combinatorial subsets of grid-machines that a task can be allocated to, depending on the user's objective optimization criterion, as follows: if the criterion is GS, this step generates all possible subsets of grid-machines; if the criterion is GCPS, it generates combinatorial sets of grid-machines with all the machines in each set belonging to the same grid; if the criterion is GEPS, it generates combinatorial sets of grid-machines that offer the lowest task execution price (other grid-machines are ignored); and if the criterion is GNFS, it generates singleton sets of all the individual grid-machines. A sketch of this subset-generation step is given after Table 2.

Table 2.

Function fg() to generate possible subsets of Grid machines to allocate tasks to.

ObjectiveType  Function fg(WG[1,...,m], p[1,...,m], objectiveType)
GS  Generate all possible combinatorial subsets of grid-machines
GCCS  Generate combinations of grid-machines with all the machines in each subset belonging to the same grid
GECS  Generate combinatorial subsets of all grid-machines that offer the lowest execution price (ignore other grid-machines)
GNFS  Generate singleton subsets of all the grid-machines
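A hedged Python sketch of the subset-generation function fg() of Table 2 is given below; grid-machines are represented as (grid, machine, MIPS) triples, and the GECS branch simplifies the lowest-execution-price rule to "machines of the cheapest grid(s)", which may differ from the exact filter used in the framework.

```python
from itertools import chain, combinations

def all_subsets(items):
    """All non-empty subsets of a list of grid-machine triples."""
    return [list(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def fg(grid_machines, exec_price, objective_type):
    """grid_machines: list of (grid_id, machine_id, mips) triples.
    exec_price[grid_id]: PriceE_Gj. Returns the candidate grid-machine subsets."""
    if objective_type == "GS":        # all combinatorial subsets
        return all_subsets(grid_machines)
    if objective_type == "GCCS":      # subsets confined to a single grid
        grids = {g for g, _, _ in grid_machines}
        return list(chain.from_iterable(
            all_subsets([gm for gm in grid_machines if gm[0] == g]) for g in grids))
    if objective_type == "GECS":      # simplified: only machines of the cheapest grid(s)
        cheapest = min(exec_price[g] for g, _, _ in grid_machines)
        return all_subsets([gm for gm in grid_machines if exec_price[gm[0]] == cheapest])
    if objective_type == "GNFS":      # singleton subsets only
        return [[gm] for gm in grid_machines]
    raise ValueError(objective_type)
```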

The algorithm then executes in a loop (from Step 7) until all tasks have been scheduled. On every iteration of the loop, the algorithm first identifies (in Step 8) the tasks whose parent task dependency constraints have been met and are thus available for scheduling. Step 4 then uses the preference function of Eq. (5) to select the best task and grid-machine combination for scheduling. Steps 11 to 13 append this task-grid-machine allocation to the schedule sequence and update the information about available grid-machines and unscheduled tasks. Finally, Steps 14 and 15 enter a blocking wait until one or more grid-machines become available, after which the algorithm enters another iteration of the Step 7 loop.
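The control flow described above can be summarized by the following sketch. Here fg is the subset-generation helper of Table 2 and fs is the preference function of Eq. (5) (extended to score a task against a whole grid-machine subset); the event handling is simplified to a single clock and a well-formed DAG is assumed, so this should be read as an assumed outline rather than the exact listing of Algorithm 2.

```python
def schedule(tasks, parents, grid_machines, exec_price, objective_type, fg, fs):
    """tasks: dict task_id -> length (MI); parents: dict task_id -> set of parent ids;
    grid_machines: list of (grid_id, machine_id, mips) triples.
    Returns a list of (task_id, subset, start_time, end_time) tuples."""
    candidate_subsets = fg(grid_machines, exec_price, objective_type)     # Step 3
    schedule_seq, finished_at = [], {}
    busy_until = {gm: 0.0 for gm in grid_machines}
    unscheduled, clock = set(tasks), 0.0

    while unscheduled:                                                    # Step 7 loop
        done = {t for t, end in finished_at.items() if end <= clock}
        free = [gm for gm, until in busy_until.items() if until <= clock]
        # Step 8: tasks whose parent dependency constraints have been met
        ready = [t for t in unscheduled if parents.get(t, set()) <= done]
        # Highest-preference (task, subset) pair among fully free subsets
        choices = [(t, s) for t in ready for s in candidate_subsets
                   if all(gm in free for gm in s)]
        if choices:
            task, subset = max(choices, key=lambda ts: fs(ts[0], ts[1]))
            runtime = tasks[task] / sum(mips for _, _, mips in subset)
            schedule_seq.append((task, subset, clock, clock + runtime))   # Steps 11-13
            for gm in subset:
                busy_until[gm] = clock + runtime
            finished_at[task] = clock + runtime
            unscheduled.discard(task)
        else:
            # Steps 14-15: block until the next grid-machine becomes free
            clock = min(t for t in busy_until.values() if t > clock)
    return schedule_seq
```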

Algorithm 1.

Algorithm 1

Multi-objective task scheduler.

Algorithm 2.

Algorithm 2

Generic single-objective task scheduler.

Algorithm 3.

Algorithm 3

Generate processing element combinations.

Function to determine the preference to schedule a task on a set of GridMachines

$$f_s(T_i, G_jM_k) = \frac{WT_i}{\max_{i=1}^{n}(WT_i)} \times \frac{d^{+}(T_i)}{\max_{i=1}^{n}(d^{+}(T_i))} \times \frac{WG_j}{\max_{j=1}^{m}(WG_j)} \qquad (5)$$
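A direct transcription of Eq. (5) for a single grid-machine is shown below; how the per-machine values are aggregated when a task is scored against a whole subset of grid-machines (as in Tables 4, 7, 10 and 13) is not spelled out by the equation itself, so the snippet is illustrative only.

```python
def fs(task, grid, WT, out_degree, WG):
    """Preference of scheduling task T_task on a machine of grid G_grid (Eq. 5).
    WT[t]: task length, out_degree[t]: d+(T_t), WG[g]: machine capacity of grid g."""
    max_deg = max(out_degree.values()) or 1   # guard an all-zero out-degree column
    return ((WT[task] / max(WT.values()))
            * (out_degree[task] / max_deg)
            * (WG[grid] / max(WG.values())))
```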

Demonstration of the proposed task scheduling algorithm

To enhance comprehension of the proposed Algorithm 2, we illustrate its functionality through an example with concise input parameters. The demonstration covers four distinct user objective types (GS, GEPS, GCPS, GNFS).

Consider an application whose workload is characterized by a task graph comprising four tasks, each containing 60 million instructions (MI). This task graph is represented as a Directed Acyclic Graph (DAG), as shown in Fig. 2a. Similarly, a grid network, depicted in Fig. 2b, comprises two grids: G1, housing grid-machine G1M1, and G2, hosting grid-machines G2M1 and G2M2. Each grid-machine possesses a processing capacity of 20 million instructions per second (MIPS). These specifications, listed in Table 1, serve as the inputs for Algorithm 2. In the following subsections, we illustrate the iterations executed by the proposed scheduling algorithm and the corresponding helper functions for each distinct objectiveType.
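For reference, the per-task execution times that drive the schedules in Tables 5, 8, 11 and 14 follow directly from these inputs:

$$\frac{60\ \text{MI}}{3 \times 20\ \text{MIPS}} = 1\ \text{s (all three machines)}, \qquad \frac{60\ \text{MI}}{2 \times 20\ \text{MIPS}} = 1.5\ \text{s (both machines of } G_2), \qquad \frac{60\ \text{MI}}{20\ \text{MIPS}} = 3\ \text{s (a single machine)},$$

which is consistent with the turnaround times of 4 s, 6 s and 9 s reported below for the fragmented, communication-aware and non-fragmented schedules, respectively.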

Figure 2.

Figure 2

A typical scenario for a proposed scheduling algorithm demonstration.

Table 1.

Input parameters to the scheduling algorithm 2.

Input Value
n 4
m 2
p[1, ..., m] [1, 2]
WT[1,...,n] [60, 60, 60, 60]
WG[1,...,m] [20, 20]
WTG[1, ...n][1, ..., n] 01000001000001001000000

Objective type: greedy scheduler

Function fg() (described in Table 2) generates 7 possible combinations of Grid-Machine subsets to allocate tasks for the Greedy Scheduler objectiveType, as illustrated in Table 3.

Table 3.

Grid-Machine subsets generated by fg() for objectiveType=G.

GMSubset
{G1M1}
{G2M1}
{G2M2}
{G1M1,G2M1}
{G1M1,G2M2}
{G2M1,G2M2}
{G1M1,G2M1,G2M2}

Function fs() (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated Grid-Machine subsets, as shown in Table 4. Then, Algorithm 2 computes the schedule sequence of task allocations onto Grid-Machines, as shown in Table 5.

Table 4.

Function fs(task,gmSubset) output for objectiveType=G.

T1 T2 T3 T4
{G1M1} 20.6 21 20.3 20.3
{G2M1} 20.6 21 20.3 20.3
{G2M2} 20.6 21 20.3 20.3
{G1M1,G2M1} 40.6 41 40.3 40.3
{G1M1,G2M2} 40.6 41 40.3 40.3
{G2M1,G2M2} 40.6 41 40.3 40.3
{G1M1,G2M1,G2M2} 60.6 61 60.3 60.3

Table 5.

Schedule sequence of tasks allocations to grid-machines by Greedy scheduler for objectiveType=G.

Time (s) Available tasks freeGMs max(fs()) Generated schedule
G1M1 G2M1 G2M2
0 T1 60.6 T1{G1M1,G2M1,G2M2}
1 T2 61.0 T2{G1M1,G2M1,G2M2}
2 T3,T4 60.33 T3{G1M1,G2M1,G2M2}
3 T4 60.33 T4{G1M1,G2M1,G2M2}
4 Complete; TAT=4s

Objective type: greedy communication price scheduler

Function fg() (described in Table 2) generates 4 possible combinations of grid-machine subsets to allocate tasks for the Greedy Communication Price Scheduler objectiveType, as illustrated in Table 6. Function fs() (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated grid-machine subsets, as shown in Table 7. Algorithm 2 then computes the schedule sequence of task allocations onto grid-machines, as depicted in Table 8.

Table 6.

Grid-Machine subsets generated by fg() for objectiveType=GCP.

GMSubset
{G1M1}
{G2M1}
{G2M2}
{G2M1,G2M2}

Table 7.

Function fs(task,gmSubset) output for objectiveType=GCP.

T1 T2 T3 T4
{G1M1} 30.6 31.0 30.3 30.3
{G2M1} 30.6 31.0 30.3 30.3
{G2M2} 30.6 31.0 30.3 30.3
{G2M1,G2M2} 60.6 61.0 60.3 60.3

Table 8.

Schedule sequence of tasks allocated to Grid-machines for objectiveType=GCP.

Time (s) Available tasks freeGMs max(fs()) Generated schedule
G1M1 G2M1 G2M2
0 T1 60.6 T1{G2M1,G2M2}
1.5 T2 61.0 T2{G2M1,G2M2}
3 T3,T4 60.33 T3{G2M1,G2M2}
3 T4 60.33 T4{G1M1}
6 Complete; TAT=6s

Objective type: greedy no fragmentation scheduler

Function fg() (described in Table 2) generates 3 possible combinations of grid-machine subsets to allocate tasks for the Greedy No Fragmentation Scheduler objectiveType, as illustrated in Table 9. Function fs() (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated grid-machine subsets, as shown in Table 10. Algorithm 2 then computes the schedule sequence of task allocations onto grid-machines, as shown in Table 11.

Table 9.

Grid-machine subsets generated by fg() for objectiveType=GreedyNF.

GMSubset
{G1M1}
{G2M1}
{G2M2}

Table 10.

Function fs(task,gmSubset) output for objectiveType=GNF.

T1 T2 T3 T4
{G1M1} 60.6 61.0 60.3 60.3
{G2M1} 60.6 61.0 60.3 60.3
{G2M2} 60.6 61.0 60.3 60.3

Table 11.

Schedule sequence of tasks allocated to grid-machines for objectiveType=GNF.

Time (s) Available tasks freeGMs max(fs()) Generated schedule
G1M1 G2M1 G2M2
0 T1 60.6 T1{G1M1}
3 T2 61.0 T2{G1M1}
6 T3,T4 60.33 T3{G1M1}
6 T4 60.33 T4{G2M2}
9 Complete; TAT=9s

Objective type: greedy execution price scheduler

Function fg() (described in Table 2) generates 4 possible combinations of grid-machine subsets to allocate tasks for the Greedy Execution Price Scheduler objectiveType, as illustrated in Table 12. Function fs() (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated grid-machine subsets, as shown in Table 13. Algorithm 2 then computes the schedule sequence of task allocations onto grid-machines, as shown in Table 14.

Table 12.

Grid-machine subsets generated by fg() for objectiveType=GreedyEP.

GMSubset
{G1M1}
{G2M1}
{G2M2}
{G2M1,G2M2}

Table 13.

Function fs(task,gmSubset) output for objectiveType=GEP.

T1 T2 T3 T4
{G1M1} 66666.6 66666.6 66666.6 66666.6
{G2M1} 100000.0 100000.0 100000.0 100000.0
{G2M2} 100000.0 100000.0 100000.0 100000.0
{G2M1,G2M2} 200000.0 200000.0 200000.0 200000.0

Table 14.

Schedule sequence of tasks allocated to Grid-Machines for objectiveType=GEP.

Time (s) Available tasks freeGMs max(fs()) Generated schedule
G1M1 G2M1 G2M2
0 T1 200000 T1{G2M1,G2M2}
1.5 T2 200000 T2{G2M1,G2M2}
3 T3,T4 200000 T3{G2M1,G2M2}
3 T4 66666.6 T4{G1M1}
6 Complete; TAT=6s

Results and discussion

Simulation setup

The proposed multi-objective task scheduling framework is simulated using GridSim. The simulation is carried out on three types of task graphs: standard task graphs, random task graphs, and scientific task graphs, on an Ubuntu operating system with an AMD Ryzen 5 processor.

The framework includes five distinct task schedulers, each designed to optimize different target objectives:

1. Greedy scheduler: Prioritizes minimizing turnaround time, communication cost, and execution cost while maximizing grid utilization.
2. Greedy Communication Cost scheduler: Focused on minimizing communication cost by distributing tasks across computing resources within a single Grid.
3. Greedy Execution Cost scheduler: Aims to minimize execution cost by scheduling each task on the most suitable subset of computing resources based on their cost-to-performance ratio.
4. Greedy No Fragmentation scheduler: Aims to schedule tasks on individual computing resources, resulting in zero task fragmentation.
5. Random scheduler: Schedules tasks on a random subset of computing resources.

Table 15 explicates the notations used in the mathematical models and algorithms. Table 16 delineates the symbols representing various scheduling algorithms, while Table 17 furnishes a catalogue of scientific application graphs used in the current study.

Table 15.

Key notation definitions.

Notation Description
n Number of tasks
m Number of Grids present on the grid network
WTi Length (in millions of instructions) of task Ti
WGj Processing Capacity in MIPS (millions of instructions per second) of a single machine belonging to Grid Gj
p[1, ..., m] Number of machines present on Grids G1,...,Gm
PriceEGj Price (cost) incurred per second in executing a task on any machine belonging to Grid Gj
PriceC Price(cost) incurred per second in reserving the network link connecting any two different Grids on the Grid Network
d+(Ti) Out degree of Task Ti on the task dependency graph i.e. the number of child tasks dependent on Task Ti
GS Greedy scheduler - Minimize TAT and maximize Resource Utilization
GCPS Greedy communication price scheduler - Minimize the communication Price (Cost)
GEPS Greedy execution price scheduler - Minimize the execution Price (Cost)
GNFS Greedy No-Fragmentation scheduler - Minimize TAT and maximize Resource Utilization without fragmenting any task across multiple Grid-Machines

R Random scheduler

Table 16.

Schedulers and symbols.

Scheduler name Symbol used
Greedy scheduler
Greedy communication cost scheduler
Random scheduler
Greedy execution cost scheduler
Greedy no fragmentation scheduler

Table 17.

Scientific application graphs.

Scientific application workflow  Brief description
Epigenomics  Created by the USC Epigenome Center and the Pegasus Team to automate various operations in genome sequence processing.
Cybershake  Used by the Southern California Earthquake Center to characterize earthquake hazards in a region.
Gaussian elimination  An algorithm for solving linear equations
LIGO  Used to generate and analyze gravitational wave forms from data collected during the coalescing of compact binary systems.
Montage  Created by NASA/IPAC to stitch together multiple input images to create custom mosaics of the sky
Cascade  User-level library allowing manual parallelization of complex C++ systems such as video game engines

The proposed task scheduling algorithm is evaluated using standard, random and scientific task graphs.

Standard task graphs

Our earlier research, presented in Ref.63, demonstrated theorems for standard unit-size task graphs on a homogeneous grid network for turnaround time. Similarly, in Ref.64, we stated theorems for grid utilization. In this article, we have formulated mathematical models for homogeneous standard-weighted task graphs on a homogeneous grid network for both the fragmented and non-fragmented versions of the task graphs. These formulations are given in Tables 18 and 19, respectively.

Table 18.

TAT for weighted fragmented standard task graphs.

Task graph  TAT
Pipeline  $\frac{WT}{WG \times m \times M} \times n$
Star  $\frac{WT}{WG \times m \times M} + \frac{WT \times (n-1)}{WG \times m \times M}$
Independent  $\frac{WT \times n}{WG \times m \times M}$
Binary  $\sum_{i=0}^{\log_2(n+1)-1} \frac{2^i \times WT}{WG \times m \times M}$
α-ary  $\sum_{i=0}^{\log_\alpha(n(\alpha-1)+1)-1} \frac{\alpha^i \times WT}{WG \times m \times M}$
Fully connected  $\frac{WT}{WG \times m \times M} \times n$

Table 19.

TAT for weighted non-fragmented standard task graphs.

Task graph  TAT
Pipeline  $\frac{WT}{WG} \times n$
Star  $\frac{WT}{WG} + \frac{WT}{WG} \times \left\lceil\frac{n-1}{m \times M}\right\rceil$
Independent  $\left\lceil\frac{n}{m \times M}\right\rceil \times \frac{WT}{WG}$
Binary  $\sum_{i=0}^{\log_2(n+1)-1} \left\lceil\frac{2^i}{m \times M}\right\rceil \times \frac{WT}{WG}$
α-ary  $\sum_{i=0}^{\log_\alpha(n(\alpha-1)+1)-1} \left\lceil\frac{\alpha^i}{m \times M}\right\rceil \times \frac{WT}{WG}$
Fully connected  $\frac{WT}{WG \times m \times M} \times n$

The TAT obtained from the proposed algorithm is tabulated in Table 20. The results include both theoretical and simulated values for various standard task graphs (with and without fragmentation) for a given number of tasks, grids, and processing elements. Here each task contains a uniform number of instructions (WTi = 20000 MI) and each grid contains homogeneous processing elements (WG = 500 MIPS per processing element). The computed TAT is on par with our mathematical formulations. From the results, it is evident that as the number of tasks increases, TAT also increases. Similarly, the computed values of turnaround time, execution cost, communication cost, and resource utilization obtained by the proposed schedulers for pipeline, star, ternary, independent, and fully connected task graphs with varying numbers of task nodes and a given number of grid resources are tabulated in Tables 21, 22 and 23, respectively. From these results, it is found that the greedy scheduler successfully optimizes for the fastest turnaround time along with grid utilization, but the trade-off is a high communication cost. The greedy communication cost scheduler, with a slightly slower TAT, incurs the lowest communication cost. In the absence of task fragmentation, the greedy scheduler achieves optimal grid utilization while incurring zero communication cost. Additionally, the execution cost remains consistent across the different task schedulers when standard graphs are processed on a homogeneous grid network. As the graphs become increasingly independent (such as star graphs or independent graphs), most schedulers yield similar turnaround times due to the reduced dependency constraints. The random scheduler inherently achieves TAT, resource utilization, and communication cost between the extremes achieved by the other schedulers. Another interesting observation is that the greedy scheduler achieves maximum resource utilization and minimum turnaround time, albeit by incurring the highest communication and execution costs.
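As a concrete sanity check, the non-fragmented star entry of Table 19 can be evaluated for the 40-task case with WT = 20000 MI and WG = 500 MIPS, assuming the grid network offers m × M = 8 processing elements in total (the configuration that reproduces the independent-graph rows of Table 20):

$$TAT_{\text{star}} = \frac{WT}{WG} + \frac{WT}{WG}\left\lceil\frac{n-1}{m \times M}\right\rceil = \frac{20000}{500} + \frac{20000}{500}\left\lceil\frac{39}{8}\right\rceil = 40 + 40 \times 5 = 240\ \text{s},$$

which matches the 240 s reported in Table 20 for the star task graph without fragmentation.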

Table 20.

Simulated/computed TAT for standard task graphs.

Standard task graph Number of tasks (n) TAT with fragmentation TAT without fragmentation
TAT (Table 18) in seconds Grid sim TAT in seconds TAT (Table 19) in seconds Grid sim TAT in seconds
Independent task graph 40 200 200.03 200 200.03
121 605 605.1 640 640.1
363 1820 1820.03 1840 1840.04
1093 5465 5465.91 5480 5480.11
Star task Graph 40 200 200.04 240 240
121 605 605.11 640 640.01
364 1820 1820.31 1880 1880.04
1093 5465 5465.92 5520 5520.11

α-ary task graph (α=3) 40 200 200.04 320 280.01
121 605 605.11 760 680.01
364 1820 1820.31 2000 1880.04
1093 5465 5465.92 5680 5560.04
Pipeline task graph 40 200 200.26 1600 1600.03
121 605 605.8 4840 4840.1
364 1820 1822.42 14560 14560.3
1093 5465 5472.28 43720 43720.91
Fully connected task graph 40 200 200.26 200 200.26
121 605 605.8 605 605.8
364 1820 1822.42 1820 1822.42
1093 5465 5472.28 5465 5472.28

Table 21.

Performance of standard task graphs on AWS EC2 Type-1, Type-2, 2 Grids and 11 CPUs.

Number of gridlets (n) Scheduler Standard task graphs
Pipeline task graph Star task graph
Turn around time (seconds) Resource utilization (%) Execution cost ($) Communication cost ($) Turn around time (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
40 Greedy (with fragmentation) scheduler 89.24 99.61 16.54 88.88 84.47 105.23 15.71 64.44
Greedy communication cost scheduler 114.51 77.63 15.87 0 92.87 95.71 16.55 0
Random scheduler 380.68 23.35 17.66 88.29 99.22 89.59 16.53 12.21
Greedy execution cost scheduler 800.03 11.11 15.87 0 120 74.07 16.48 0
Greedy (without-fragmentation) scheduler 800.03 11.11 15.87 0 120 74.07 16.48 0
121 Greedy (with fragmentation) scheduler 269.96 99.6 50.04 268.86 234.48 114.67 43.25 64.44
Greedy communication cost scheduler 346.4 77.62 48.01 0 272.88 98.54 50.04 0
Random scheduler 1335.94 20.13 53.73 272.51 294.01 91.46 50.16 44
Greedy execution cost scheduler 2420.1 11.11 48.02 0 300.01 89.63 50.13 0
Greedy (without-fragmentation) scheduler 2420.1 11.11 48.02 0 300.01 89.63 50.13 0
364 Greedy (with fragmentation) scheduler 812.14 99.6 150.53 808.81 675.89 119.68 125.69 64.44
Greedy communication cost scheduler 1042.07 77.62 144.44 0 812.93 99.5 150.52 0
Random scheduler 3683.09 21.96 161.88 892.11 844.34 95.8 150.68 18.33
Greedy execution cost scheduler 7280.3 11.11 144.44 0 840.02 96.29 150.48 0
Greedy(without-Fragmentation) scheduler 7280.3 11.11 144.44 0 840.02 96.29 150.48 0
1039 Greedy (with fragmentation) scheduler 2318.17 99.6 429.67 2308.66 1904.62 121.23 354.34 64.44
Greedy communication cost scheduler 2974.48 77.62 412.28 0 2313.05 99.82 429.63 0
Random scheduler 11322.52 20.39 462.72 2492.34 2350.67 98.22 429.77 71.66
Greedy execution cost scheduler 20780.86 11.11 412.3 0 2340.05 98.67 429.79 0
Greedy (without-Fragmentation) scheduler 20780.86 11.11 412.3 0 2340.05 98.67 429.79 0

Table 22.

Performance of standard task graphs on AWS EC2 Type-1, Type-2, 2 Grids and 11 CPUs.

Number of gridlets (n) Scheduler Standard task graphs
Independent task graph Ternary task graph
Turn around time (seconds) Resource utilization (%) Execution cost ($) Communication cost ($) Turn around time (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
40 Greedy (with fragmentation) scheduler 84.45 105.25 15.61 62.22 84.47 105.23 15.71 64.44
Greedy communication cost scheduler 90.01 98.76 16.55 0 92.87 95.71 16.55 0
Random scheduler 109 81.55 16.67 25.67 106.67 83.33 16.58 43.11
Greedy execution cost scheduler 100 88.89 16.48 0 140 63.49 16.48 0
Greedy (without-fragmentation) scheduler 100 88.89 16.48 0 140 63.49 16.48 0
121 Greedy (with fragmentation) scheduler 232.25 115.77 43.15 62.22 234.48 114.67 43.25 64.44
Greedy communication cost scheduler 270.02 99.58 50.04 0 272.88 98.54 50.04 0
Random scheduler 284 94.68 50.08 4 294.34 91.35 50.09 15.08
Greedy execution cost scheduler 280.01 96.03 50.13 0 320 84.03 49.98 0
Greedy (without-fragmentation) scheduler 280.01 96.03 50.13 0 320 84.03 49.98 0
364 Greedy (with fragmentation) scheduler 682.29 118.56 125.75 62.22 675.89 119.68 125.69 64.44
Greedy communication cost scheduler 810.06 99.86 150.52 0 812.93 99.5 150.52 0
Random scheduler 821 98.52 150.54 21.33 835.01 96.87 150.58 5
Greedy execution cost scheduler 820.02 98.64 150.48 0 860 94.06 150.48 0
Greedy (without-fragmentation) scheduler 820.02 98.64 150.48 0 860 94.06 150.48 0
1039 Greedy (with fragmentation) scheduler 1904.46 121.24 354.24 62.22 1904.62 121.23 354.34 64.44
Greedy communication cost scheduler 2310.19 99.94 429.63 0 2313.05 99.82 429.63 0
Random scheduler 2332.78 98.98 429.74 17.78 2368.68 97.48 429.8 15.67
Greedy execution cost scheduler 2320.05 99.52 429.79 0 2360 97.83 429.64 0
Greedy (without-fragmentation) scheduler 2320.05 99.52 429.79 0 2360 97.83 429.64 0

Table 23.

Performance of fully connected task graphs on AWS EC2 Type-1, Type-2, 2 Grids and 11 CPUs.

Number of gridlets (n) Scheduler Standard task graphs
Fully connected task graph
Turn around time (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
40 Greedy (with fragmentation) scheduler) 89.24 99.61 16.54 88.88
Greedy communication cost scheduler 114.51 77.63 15.87 0
Random scheduler 341.24 26.05 17.59 98.41
Greedy execution cost scheduler 800.03 11.11 15.87 0
Greedy (without-fragmentation) scheduler 800.03 11.11 15.87 0
121 Greedy (with fragmentation) scheduler 269.96 99.6 50.04 268.86
Greedy communication cost scheduler 346.4 77.62 48.01 0
Random scheduler 1332.88 20.17 53.97 259.55
Greedy execution cost scheduler 2420.1 11.11 48.02 0
Greedy (without-fragmentation) scheduler 2420.1 11.11 48.02 0
364 Greedy (with fragmentation) scheduler 812.14 99.6 150.53 808.81
Greedy communication cost scheduler 1042.07 77.62 144.44 0
Random scheduler 3872.95 20.89 161.64 874.12
Greedy execution cost scheduler 7280.3 11.11 144.44 0
Greedy (without-fragmentation) scheduler 7280.3 11.11 144.44 0
1039 Greedy (with fragmentation) scheduler 2318.17 99.6 429.67 2308.66
Greedy communication cost scheduler 2974.48 77.62 412.28 0
Random scheduler 11190.07 20.63 461.31 2516.51
Greedy execution cost scheduler 20780.86 11.11 412.3 0
Greedy (without-fragmentation) scheduler 20780.86 11.11 412.3 0

Random task graphs

Random task graphs with diverse levels of connectivity (0%, 25%, 50%, 75%, and 100%) are generated using Algorithm 2 of Ref.64. The outcomes of our proposed algorithm, encompassing TAT, resource utilization, execution cost, and communication cost, are depicted in Figs. 3 and 4. From these results, we can conclude that the turnaround time increases with the number of tasks and with the degree of task dependency, as shown in Figs. 5, 6, 7 and 8. Also, when tasks are scheduled without fragmentation, TAT increases compared to tasks scheduled with fragmentation.

Figure 3.

Figure 3

Random task graph with 0% connectivity - TAT and resource utilization.

Figure 4.

Figure 4

Random task graph with 100% connectivity - TAT and execution cost.

Figure 5.

Figure 5

Random task graph with 25% connectivity - TAT and resource utilization.

Figure 6.

Figure 6

Random task graph with 50% connectivity - TAT and resource utilization.

Figure 7.

Figure 7

Random task graph with 75% connectivity - TAT and resource utilization.

Figure 8.

Figure 8

Random task graph with 100% connectivity - TAT and resource utilization.

Resource utilization decreases when scheduling a random task graph with a higher degree of dependency without fragmentation (as seen in Figs. 5, 6, 7, and 8), in contrast to when tasks are fragmented. Additionally, it is noteworthy that all schedulers optimize turnaround time and resource utilization when there is no inter-dependency among the tasks, as illustrated in Fig. 3.

As the inter-dependency between tasks within a task graph increases (0% connectivity in Fig. 9, 25% in Fig. 10, 50% in Fig. 11, 75% in Fig. 12, and 100% in Fig. 13), it becomes evident that the greedy scheduler achieves the lowest turnaround time (TAT). However, this comes at the cost of higher communication expenses due to the fragmentation of tasks. Conversely, the greedy scheduler without task fragmentation incurs zero communication cost in all cases, effectively eliminating this expense from the scheduling process.

Figure 9.

Figure 9

Random task graph with 0% connectivity - TAT and communication cost.

Figure 10.

Figure 10

Random task graph with 25% connectivity - TAT and communication cost.

Figure 11.

Figure 11

Random task graph with 50% connectivity - TAT and communication cost.

Figure 12.

Figure 12

Random task graph with 75% connectivity - TAT and communication cost.

Figure 13.

Figure 13

Random task graph with 100% connectivity - TAT and communication cost.

Figures 4, 14, 15, 16 and 17 show that the greedy execution cost scheduler incurs the least execution cost. The computed values of turnaround time, execution cost, communication cost, and resource utilization obtained by the proposed schedulers for random task graphs with 25%, 50%, 75% and 100% dependency, with varying numbers of task nodes and a given number of grid resources, are tabulated in Tables 24, 25 and 26, respectively.

Figure 14.

Figure 14

Random task graph with 0% connectivity - TAT and execution cost.

Figure 15.

Figure 15

Random task graph with 25% connectivity - TAT and execution cost.

Figure 16.

Figure 16

Random task graph with 50% connectivity - TAT and execution cost.

Figure 17.

Figure 17

Random task graph with 75% connectivity - TAT and execution cost.

Table 24.

Performance of scheduling algorithms on random task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.

Number of gridlets (n) Scheduler Task graph
A task graph with 0% connectivity A task graph with 25% connectivity
TAT (seconds) Resource utilization (%) Execution cost ($) Communication cost ($) TAT (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
100 Greedy (with fragmentation) scheduler 193.97 99.56 36.01 62.22 224.56 98.96 38.51 84.44
Greedy communication cost scheduler 222.87 98 41.33 0 248.6 89.39 41.86 0
Random scheduler 246 90.33 41.37 6.67 550.5 40.47 41.86 194.24
Greedy execution cost scheduler 240 92.59 41.42 0 680.03 32.68 39.68 0
Greedy (without-fragmentation) scheduler 240 92.59 41.42 0 779.01 28.53 39.66 0
300 Greedy (with fragmentation) scheduler 562.28 118.57 103.95 62.22 651.03 102.4 115.09 217.76
Greedy communication cost scheduler 670.05 99.49 124.08 0 781.51 85.31 124.63 0
Random scheduler 690 96.62 124.14 23.33 1794.81 37.14 126.11 493.82
Greedy execution cost scheduler 680.01 98.04 124.17 0 2037.06 32.73 118.98 0
Greedy (without-fragmentation) scheduler 680.01 98.04 124.17 0 2113.01 31.55 118.9 0
500 Greedy (with fragmentation) scheduler 922.55 120.44 171.64 62.22 1133.6 98.02 196.05 275.53
Greedy communication cost scheduler 1111.46 99.97 206.74 0 1224.37 90.75 207.21 0
Random scheduler 1119 99.29 206.76 16.67 2860.33 38.85 210.25 902.54
Greedy execution cost scheduler 1120.02 99.2 206.86 0 3652.14 30.42 198.25 0
Greedy (without-fragmentation) scheduler 1120.02 99.2 206.86 0 3329.08 33.38 198.19 0

Table 25.

Performance of scheduling algorithms on random task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.

Number of gridlets (n) Scheduler Task graph
A task graph with 50% connectivity  A task graph with 75% connectivity
TAT (seconds) Resource utilization (%) Execution cost ($) Communication cost ($) TAT (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
100 Greedy (with fragmentation) scheduler 242.96 91.46 41 159.98 226.63 98.06 41.35 211.09
Greedy communication cost scheduler 320.13 69.42 41.61 0 337.34 65.88 41.33 0
Random scheduler 854.68 26 42.78 189.18 874.23 25.42 43.3 193.28
Greedy execution cost scheduler 1039.06 21.39 39.66 0 1600.08 13.89 39.68 0
Greedy (without-fragmentation) scheduler 1099.05 20.22 39.66 0 1660.08 13.39 39.68 0
300 Greedy (with fragmentation) scheduler 747.19 89.22 121.54 453.29 708.43 94.1 123.34 597.72
Greedy communication cost scheduler 946.2 70.46 124.72 0 966.73 68.96 122.79 0
Random scheduler 2574.16 25.9 129 638.88 2951.95 22.58 130.79 633.81
Greedy execution cost scheduler 3491.15 19.1 118.87 0 4720.23 14.12 119.05 0
Greedy (without-fragmentation) scheduler 3453.12 19.31 118.91 0 4800.23 13.89 119.05 0
500 Greedy (with fragmentation) scheduler 1240.61 89.56 203.32 759.92 1158.96 95.87 205.88 1026.56
Greedy communication cost scheduler 1536.23 72.33 207.32 0 1634.52 67.98 204.31 0
Random scheduler 3821.02 29.08 214.51 1003.02 5088.05 21.84 218.14 1091.33
Greedy execution cost scheduler 5820.31 19.09 198.41 0 7960.38 13.96 198.41 0
Greedy (without-fragmentation) scheduler 5872.29 18.92 198.25 0 7920.38 14.03 198.41 0

Table 26.

Performance of scheduling algorithms on random task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.

Number of gridlets (n) Scheduler A task graph with 100% connectivity
TAT (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
100 Greedy (with fragmentation) scheduler 223.11 99.6 41.35 222.2
Greedy communication cost scheduler 286.28 77.62 39.68 0
Random scheduler 1098 20.24 44.44 243.75
Greedy execution cost scheduler 2000.08 11.11 39.68 0
Greedy (without-fragmentation) scheduler 2000.08 11.11 39.68 0
300 Greedy (with fragmentation) scheduler 669.34 99.6 124.06 666.6
Greedy communication cost scheduler 858.84 77.62 119.04 0
Random scheduler 3232.57 20.62 133.75 728.72
Greedy execution cost scheduler 6000.25 11.11 119.05 0
Greedy (without-fragmentation) scheduler 6000.25 11.11 119.05 0
500 Greedy (with fragmentation) scheduler 1115.57 99.6 206.77 1111
Greedy communication cost scheduler 1431.41 77.62 198.4 0
Random scheduler 5536.25 20.07 222.33 1176.6
Greedy execution cost scheduler 10000.42 11.11 198.41 0
Greedy (without-fragmentation) scheduler 10000.42 11.11 198.41 0

Scientific task graphs

The performance of the proposed algorithm is also evaluated using scientific graphs such as Montage, CyberShake, and LIGO. All workflows are generated by the Pegasus Workflow Generator65. The results shown in Figs. 18, 19, 20, 21 and 22 demonstrate the performance of the proposed algorithm.

Figure 18.

Figure 18

Scientific task graphs with resource utilization and communication cost.

Figure 19.

Figure 19

Scientific task graphs with resource utilization and execution cost.

Figure 20.

Figure 20

Scientific task graphs with resource utilization and TAT.

Figure 21.

Figure 21

Scientific task graphs with resource utilization and communication cost.

Figure 22.

Figure 22

Scientific task graphs with TAT and execution cost.

Our observation reveals that across all application task graphs, the greedy scheduler consistently generates schedules with the most optimal TAT and resource utilization. However, it’s important to note that this optimization is achieved at the expense of incurring the highest communication and execution costs compared to the schedules generated by the other schedulers.

A consistent trend that emerges across all schedulers is the inverse relationship between resource utilization and the extent of parallel execution of tasks, which is dictated by the inter-dependency constraints among tasks. For example, in the case of the Gaussian Elimination and Montage scientific application graphs, where tasks exhibit a high degree of inter-dependency, the scheduling sequences result in the lowest resource utilization. This highlights the influence of task inter-dependency on resource allocation and utilization in the scheduling process.

Similarly, the computed values of turnaround time, execution cost, communication cost, and resource utilization obtained by the proposed schedulers for the different scientific task graphs are tabulated in Table 27.

Table 27.

Performance of scheduling algorithms on scientific task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.

Scientific task graph Scheduler TAT (seconds) Resource utilization (%) Execution cost ($) Communication cost ($)
Cascade Greedy (with fragmentation) scheduler 44.49 99.9 8.27 44.44
Greedy communication cost scheduler 51.46 86.37 8.24 0
Random scheduler 157.33 28.25 8.49 34.68
Greedy execution cost scheduler 160 27.78 7.94 0
Greedy (without-fragmentation) scheduler 160 27.78 7.94 0
Montage Greedy (with fragmentation) scheduler 44.51 99.84 8.27 44.44
Greedy communication cost scheduler 57.18 77.73 8.24 0
Random scheduler 196.53 22.61 8.43 33.37
Greedy execution cost scheduler 180.01 24.69 7.94 0
Greedy (without-fragmentation) scheduler 180.01 24.69 7.94 0
Ligo Greedy (with fragmentation) scheduler 84.47 105.24 15.71 64.44
Greedy communication cost scheduler 92.87 95.71 16.55 0
Random scheduler 177.12 50.19 16.76 43
Greedy execution cost scheduler 140.01 63.49 16.17 0
Greedy (without-fragmentation) scheduler 140.01 63.49 16.17 0
Gaussian elimination Greedy (with fragmentation) scheduler 66.73 99.9 12.41 66.66
Greedy communication cost scheduler 78.62 84.79 12.36 0
Random scheduler 169.69 39.29 12.85 96.21
Greedy execution cost scheduler 300.01 22.22 11.9 0
Greedy (without-Fragmentation) scheduler 300.01 22.22 11.9 0
Cybershake Greedy (with fragmentation) scheduler 44.46 99.97 8.27 44.44
Greedy communication cost scheduler 50 88.88 8.31 0
Random scheduler 93.67 47.45 8.42 6.67
Greedy execution cost scheduler 100 44.44 8.01 0
Greedy (without-fragmentation) scheduler 100 44.44 8.01 0
Epigenomics Greedy (with fragmentation) scheduler 44.49 99.9 8.27 44.44
Greedy communication cost scheduler 51.45 86.38 8.24 0
Random scheduler 106.8 41.61 8.59 48.44
Greedy execution cost scheduler 159 27.95 7.92 0
Greedy (without-Fragmentation) scheduler 159 27.95 7.92 0

Formulation of the multi-objective-decision-making problem

The generic multi-attribute-decision-making (MADM) problem

Scheduling tasks in a grid network can be conceptualized as a MADM problem. In the context of MADM, the goal is to assess and prioritize various alternative solutions denoted as $A_i\ (i = 1, 2, 3, \ldots, I)$, taking into account specific criteria. These criteria, represented as $C_j\ (j = 1, 2, 3, \ldots, J)$, encapsulate the factors that play a role in influencing the ranking of the alternative solutions within the set $A_i$.

Each alternative solution, denoted as $A_i$, undergoes an evaluation against each individual criterion, represented by $C_j$. This evaluation process produces a performance rating matrix $X = (x_{ij})_{I \times J}$.

$$X = \begin{array}{c|cccc} & C_1 & C_2 & \cdots & C_J \\ \hline A_1 & x_{11} & x_{12} & \cdots & x_{1J} \\ A_2 & x_{21} & x_{22} & \cdots & x_{2J} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ A_I & x_{I1} & x_{I2} & \cdots & x_{IJ} \end{array}$$

The user is tasked with specifying a set of weights, denoted as $W = \{w_j\}\ (j = 1, 2, \ldots, J)$, which serve as indicators of the user's individual preferences for each criterion $C_j$.

Modeling task scheduling as an MADM problem

We model the task scheduling problem as an MADM problem as follows:

  1. Considering the schedule sequence output by each scheduler as the set of alternative solutions, i.e. $A = \{a \mid a \in \{\text{GS}, \text{GCCS}, \text{GECS}, \text{GNFS}\}\}$, where the four alternatives are the schedules produced by the greedy (with fragmentation), greedy communication cost, greedy execution cost, and greedy (without fragmentation) schedulers, respectively.

  2. Considering the performance metrics of a schedule sequence as the set of criteria, i.e. $C = \{c \mid c \in \{\text{TAT}, \text{RU}, \text{CC}, \text{EC}\}\}$.

  3. Computing the performance rating of each scheduler (GS, GCCS, GECS, GNFS) against every criterion (TAT, RU, CC, EC),

    i.e.
    $$X = \begin{bmatrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ x_{31} & x_{32} & x_{33} & x_{34} \\ x_{41} & x_{42} & x_{43} & x_{44} \end{bmatrix},$$
    where the rows correspond to GS, GCCS, GECS, and GNFS and the columns to TAT, RU, CC, and EC.
  4. Collecting the user's preferences for each criterion by ranking the criteria in descending order of importance. Weights are then allocated using a geometric progression, with larger weights assigned to the criteria that the user ranks higher in importance (a minimal sketch of this weighting scheme follows this list).
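
The geometric-progression weighting can be sketched as follows. This is a minimal illustration under our own assumptions: the paper does not state the common ratio of the progression, so a ratio of 1/2 is used here and the weights are normalized to sum to 1; the function name gp_weights is hypothetical.

import numpy as np

def gp_weights(ranked_criteria, ratio=0.5):
    """Assign geometric-progression weights to criteria listed in descending
    order of importance and normalize them so that they sum to 1.

    ranked_criteria -- e.g. ["TAT", "RU", "EC", "CC"] encodes TAT > RU > EC > CC.
    ratio           -- common ratio of the progression (an assumption; not fixed by the paper).
    """
    raw = ratio ** np.arange(len(ranked_criteria))
    weights = raw / raw.sum()
    return dict(zip(ranked_criteria, weights))

# Weightage type 1 from the TOPSIS tables (TAT > RU > EC > CC) with ratio 1/2
# yields weights of roughly 0.533, 0.267, 0.133, and 0.067.
print(gp_weights(["TAT", "RU", "EC", "CC"]))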

Solving the MADM problem

The task scheduling MADM problem is addressed using a well-regarded technique within the MADM field known as TOPSIS. TOPSIS operates on the principle that the optimal solution is the one closest to the positive-ideal solution while simultaneously being the farthest from the negative-ideal solution. Alternatives are ranked by computing an overall index based on their proximity to these ideal solutions.

The TOPSIS method comprises a series of steps, as follows:

  1. Normalize the performance rating matrix.

    i.e. $y_{ij} = \dfrac{x_{ij}}{\sqrt{\sum_{i=1}^{I} x_{ij}^{2}}}$

    $$Y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1J} \\ y_{21} & y_{22} & \cdots & y_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ y_{I1} & y_{I2} & \cdots & y_{IJ} \end{bmatrix}$$

  2. Determine the weighted, normalized performance rating matrix.

    i.e.

    $$V = \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1J} \\ v_{21} & v_{22} & \cdots & v_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ v_{I1} & v_{I2} & \cdots & v_{IJ} \end{bmatrix}$$

    where $v_{ij} = w_j \, y_{ij}$ $(i = 1, 2, \ldots, I; \; j = 1, 2, \ldots, J)$.

  3. Compute the positive and negative ideal solutions, $A^{+}$ and $A^{-}$, respectively.

    $$A^{+} = [v_1^{+}, v_2^{+}, \ldots, v_J^{+}], \qquad A^{-} = [v_1^{-}, v_2^{-}, \ldots, v_J^{-}]$$

    where,

    $$v_j^{+} = \begin{cases} \max_{1 \le i \le I} v_{ij} & \text{if } j \text{ is a benefit attribute,} \\ \min_{1 \le i \le I} v_{ij} & \text{if } j \text{ is a cost attribute} \end{cases}$$

    $$v_j^{-} = \begin{cases} \min_{1 \le i \le I} v_{ij} & \text{if } j \text{ is a benefit attribute,} \\ \max_{1 \le i \le I} v_{ij} & \text{if } j \text{ is a cost attribute} \end{cases}$$

  4. Calculate the Euclidean distance from the positive and negative ideal solutions.

    $$S_i^{+} = \sqrt{\sum_{j=1}^{J} \left(v_{ij} - v_j^{+}\right)^{2}}$$

    $$S_i^{-} = \sqrt{\sum_{j=1}^{J} \left(v_{ij} - v_j^{-}\right)^{2}}$$

  5. Calculate the relative closeness of each alternative solution to the ideal solution: $V_i = \dfrac{S_i^{-}}{S_i^{-} + S_i^{+}}$.

  6. Determine the rank order of all alternatives on the basis of their relative closeness to the ideal solution: the larger $V_i$ is, the better the alternative solution $A_i$; the best alternative is the one with the largest $V_i$. A compact implementation of these steps is sketched below.
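
The following Python sketch condenses the six steps above. It is our own illustration rather than the authors' GridSim implementation: the function name topsis, the benefit flags, and the geometric-progression weights (common ratio 1/2, as assumed earlier) are all our choices. The demo feeds the routine the Cascade measurements from Table 27 for the four greedy schedulers (the random scheduler is excluded, consistent with the alternative set A) under weightage type 1 (TAT > RU > EC > CC). Because the common ratio of the weight progression is not reported in the paper, the closeness values and ordering produced by this sketch are illustrative and need not coincide with the rankings in Tables 28, 29, and 30.

import numpy as np

def topsis(X, weights, benefit):
    """Rank alternatives with TOPSIS (steps 1-6 above).

    X       -- (I x J) performance rating matrix, one row per alternative.
    weights -- length-J criterion weights that sum to 1.
    benefit -- length-J booleans: True for benefit criteria, False for cost criteria.
    Returns the closeness coefficients V_i (larger is better).
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)

    # Step 1: vector normalization, y_ij = x_ij / sqrt(sum_i x_ij^2).
    norm = np.sqrt((X ** 2).sum(axis=0))
    norm[norm == 0] = 1.0                      # guard against all-zero criterion columns
    Y = X / norm

    # Step 2: weighted normalized ratings, v_ij = w_j * y_ij.
    V = weights * Y

    # Step 3: positive and negative ideal solutions per criterion.
    v_plus = np.where(benefit, V.max(axis=0), V.min(axis=0))
    v_minus = np.where(benefit, V.min(axis=0), V.max(axis=0))

    # Step 4: Euclidean distances to the two ideal solutions.
    s_plus = np.sqrt(((V - v_plus) ** 2).sum(axis=1))
    s_minus = np.sqrt(((V - v_minus) ** 2).sum(axis=1))

    # Step 5: relative closeness to the ideal solution.
    return s_minus / (s_minus + s_plus)

# Cascade task graph, Table 27: columns are TAT (s), RU (%), EC ($), CC ($).
schedulers = ["GS (with fragmentation)", "GCCS", "GECS", "GNFS"]
X = [
    [44.49,  99.90, 8.27, 44.44],   # Greedy (with fragmentation) scheduler
    [51.46,  86.37, 8.24,  0.00],   # Greedy communication cost scheduler
    [160.00, 27.78, 7.94,  0.00],   # Greedy execution cost scheduler
    [160.00, 27.78, 7.94,  0.00],   # Greedy (without-fragmentation) scheduler
]
weights = [8/15, 4/15, 2/15, 1/15]        # TAT > RU > EC > CC, geometric progression (assumed ratio 1/2)
benefit = [False, True, False, False]     # RU is a benefit attribute; TAT, EC, CC are cost attributes

# Step 6: order the alternatives by decreasing closeness.
for name, v in sorted(zip(schedulers, topsis(X, weights, benefit)), key=lambda p: -p[1]):
    print(f"{v:.3f}  {name}")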

TOPSIS results and discussion

To rank the task schedule sequences produced by the various schedulers, the TOPSIS method is employed. It selects schedules according to the user's prioritized objectives, namely turnaround time (TAT), resource utilization (RU), communication cost (CC), and execution cost (EC). Tables 28, 29, and 30 present the results of the TOPSIS algorithm when applied to standard, random, and scientific task graphs, respectively. We explore the different priority orders that users may assign to these criteria. Notably, we find a consistent ranking pattern for the schedule sequences across all types of graphs: the standard graphs (fully connected, pipeline, star, ternary, and independent), the random graphs, and the scientific graphs. This consistency persists even when the number of tasks varies (40, 121, 364, and 1039).

Table 28.

TOPSIS ranking of scheduling algorithms on standard task graphs.

Standard task graphs: fully connected, pipeline, ternary, star, and independent
Number of gridlets: 40, 121, 364, and 1039

Weightage types 1–4:
1. TAT > RU > EC > CC
2. TAT > RU > CC > EC
3. TAT > EC > RU > CC
4. TAT > EC > CC > RU

TOPSIS solution rank:
1. Greedy (with fragmentation) scheduler
2. Greedy communication cost scheduler
3. Greedy execution cost scheduler
4. Greedy (without-fragmentation) scheduler

Weightage types 5–6:
5. TAT > CC > RU > EC
6. TAT > CC > EC > RU

TOPSIS solution rank:
1. Greedy communication cost scheduler
2. Greedy execution cost scheduler
3. Greedy (without-fragmentation) scheduler
4. Greedy (with fragmentation) scheduler

Weightage types 7–10:
7. RU > TAT > EC > CC
8. RU > TAT > CC > EC
9. RU > EC > TAT > CC
10. RU > EC > CC > TAT

TOPSIS solution rank:
1. Greedy (with fragmentation) scheduler
2. Greedy communication cost scheduler
3. Greedy execution cost scheduler
4. Greedy (without-fragmentation) scheduler

Weightage types 11–12:
11. RU > CC > TAT > EC
12. RU > CC > EC > TAT

TOPSIS solution rank:
1. Greedy communication cost scheduler
2. Greedy execution cost scheduler
3. Greedy (without-fragmentation) scheduler
4. Greedy (with fragmentation) scheduler

Table 29.

TOPSIS ranking of scheduling algorithms on random task graphs.

Number of gridlets: 100, 250, 500, 750, and 1000
Connectivity percentage: 25%, 50%, 75%, and 100%

Weightage types 1–12:
1. TAT > RU > EC > CC
2. TAT > RU > CC > EC
3. TAT > EC > RU > CC
4. TAT > EC > CC > RU
5. TAT > CC > RU > EC
6. TAT > CC > EC > RU
7. RU > TAT > EC > CC
8. RU > TAT > CC > EC
9. RU > EC > TAT > CC
10. RU > EC > CC > TAT
11. RU > CC > TAT > EC
12. RU > CC > EC > TAT

TOPSIS scheduler ranking:
1. Greedy (with fragmentation) scheduler
2. Greedy execution cost scheduler
3. Greedy communication cost scheduler
4. Greedy (without-fragmentation) scheduler

Weightage types 13–18:
13. CC > TAT > RU > EC
14. CC > TAT > EC > RU
15. CC > RU > TAT > EC
16. CC > RU > EC > TAT
17. CC > EC > TAT > RU
18. CC > EC > RU > TAT

TOPSIS scheduler ranking:
1. Greedy communication cost scheduler
2. Greedy (without-fragmentation) scheduler
3. Greedy execution cost scheduler
4. Greedy (with fragmentation) scheduler

Weightage types 19–20:
19. EC > TAT > RU > CC
20. EC > TAT > CC > RU

TOPSIS scheduler ranking:
1. Greedy execution cost scheduler
2. Greedy (with fragmentation) scheduler
3. Greedy communication cost scheduler
4. Greedy (without-fragmentation) scheduler

Table 30.

TOPSIS ranking of scheduling algorithms on scientific task graphs.

Scientific task graphs: Cascade, Montage, LIGO, CyberShake, Epigenomics, and Gaussian elimination
Number of gridlets: 40, 121, 364, and 1039

Weightage types 1–4:
1. TAT > RU > EC > CC
2. TAT > RU > CC > EC
3. TAT > EC > RU > CC
4. TAT > EC > CC > RU

TOPSIS solution rank:
1. Greedy (with fragmentation) scheduler
2. Greedy communication cost scheduler
3. Greedy execution cost scheduler
4. Greedy (without-fragmentation) scheduler

Weightage types 5–6:
5. TAT > CC > RU > EC
6. TAT > CC > EC > RU

TOPSIS solution rank:
1. Greedy communication cost scheduler
2. Greedy execution cost scheduler
3. Greedy (without-fragmentation) scheduler
4. Greedy (with fragmentation) scheduler

Weightage types 7–10:
7. RU > TAT > EC > CC
8. RU > TAT > CC > EC
9. RU > EC > TAT > CC
10. RU > EC > CC > TAT

TOPSIS solution rank:
1. Greedy (with fragmentation) scheduler
2. Greedy communication cost scheduler
3. Greedy execution cost scheduler
4. Greedy (without-fragmentation) scheduler

Weightage types 11–12:
11. RU > CC > TAT > EC
12. RU > CC > EC > TAT

TOPSIS solution rank:
1. Greedy communication cost scheduler
2. Greedy execution cost scheduler
3. Greedy (without-fragmentation) scheduler
4. Greedy (with fragmentation) scheduler

Weightage types 1, 2, 3, and 4, as well as 7, 8, 9, and 10, exemplify situations where the user places the highest importance on turnaround time and resource utilization, while assigning less significance to communication cost and execution cost. In these scenarios, TOPSIS consistently ranks the greedy (with fragmentation) scheduler as the top solution. The second-best alternative is the greedy communication cost scheduler, which outperforms the remaining schedulers in terms of TAT and resource utilization.

However, in the cases corresponding to weightage types 5, 6, 11, and 12, where the user's preference primarily focuses on achieving a low communication cost, TOPSIS identifies the schedule generated by the greedy communication cost scheduler as the best solution. This scheduler reduces communication cost to zero while maintaining TAT and resource utilization levels nearly on par with those achieved by the greedy (with fragmentation) scheduler. In this context, the output schedule sequence of the greedy (with fragmentation) scheduler is ranked last by TOPSIS, as it incurs the highest communication cost and therefore contradicts the user's stated priorities.
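
To make this contrast concrete, the short sketch below (our illustration; the common ratio of 1/2 for the geometric progression is an assumption, as before) compares the weight vectors produced for a TAT-prioritized ordering (weightage type 1) and a CC-prioritized ordering (weightage type 5). Moving communication cost from the last position to the second position raises its weight from 1/15 to 4/15, which is consistent with the reported shift of the greedy communication cost scheduler to the top of the ranking.

import numpy as np

# Normalized geometric-progression weights for four criteria with an assumed ratio of 1/2.
raw = 0.5 ** np.arange(4)
w = raw / raw.sum()                                  # [8/15, 4/15, 2/15, 1/15]

type1 = dict(zip(["TAT", "RU", "EC", "CC"], w))      # type 1: CC receives 1/15 (about 0.067)
type5 = dict(zip(["TAT", "CC", "RU", "EC"], w))      # type 5: CC receives 4/15 (about 0.267)
print(type1)
print(type5)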

Conclusion and future work

In this paper, we presented a multi-objective task scheduling framework for scheduling different types of workflows on computational grids. The main objective of the proposed framework is to minimize execution cost, application turnaround time, and communication cost, while maximizing grid utilization. The proposed scheduling framework is integrated with GridSim and validated through experiments conducted on weighted standard task graphs, weighted random task graphs, and scientific task graphs. Furthermore, we employed a multi-criteria decision-making method, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), to rank the output scheduling sequences based on different objective functions and the requirements of both users and service providers.

As part of future work, we plan to design a multi-objective task scheduling framework based on Large Language Models (LLMs) and to compare its performance with NSGA-II in a cloud computing environment.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1F1A1055408).

Author contributions

Sujay N. Hegde conceived and designed the experiments, analyzed and interpreted the data, and wrote the paper. D B Srinivas conceived and designed the experiments, contributed reagents, materials, analysis tools or data, wrote the paper, and supervised the project. M A Rajan performed the experiments, analyzed and interpreted the data, and contributed reagents, materials, analysis tools or data. Sita Rani analyzed and interpreted the data, contributed reagents, materials, analysis tools or data, and wrote the paper. Aman Kataria conceived and designed the experiments, performed the experiments, and wrote the paper. Hong Min conceived and designed the experiments, contributed reagents, materials, analysis tools or data, wrote the paper, supervised the project, and acquired the funding.

Data availability

The data that support the findings of this study are available from the corresponding author on request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

D. B. Srinivas, Email: srinivas.db@nmit.ac.in

Hong Min, Email: hmin@gachon.ac.kr.

References

  • 1.Casanova, H. & Dongarra, J. Network enabled solvers for scientific computing using the NetSolve system. In Proc. of 3rd International Conference on Algorithms and Architectures for Parallel Processing, Melbourne, VIC, Australia, pp. 17-33 (1998).
  • 2.Goux, J.P., Kulkarni, S., Linderoth, J. & Yoder, M. An enabling framework for master-worker applications on the computational grid. In 9th IEEE Int. Symposium on High Performance Distributed Computing, HPDC'00 (2000).
  • 3.Chervenak A, Foster I, Kesselman C, Salisbury C, Tuecke S. The data grid: Toward an architecture for the distributed management and analysis of large scientific datasets. J. Netw. Comput. Appl. 2000;23:187–200. doi: 10.1006/jnca.2000.0110. [DOI] [Google Scholar]
  • 4.Beynon, M. D., Sussman, A., Catalyurek, U., Kurc, T. & Saltz, J. Performance optimization for data intensive grid applications. In Proc. Third Annual International Workshop on Active Middleware Services, USA, 97–105 (2001).
  • 5.Linderoth L, Wright SJ. Decomposition algorithms for stochastic programming on a computational grid. Comput. Optim. Appl. 2003;24:207–250. doi: 10.1023/A:1021858008222. [DOI] [Google Scholar]
  • 6.Newman HB, Ellisman MH, Orcutt JA. Data-intensive e-Science frontier research. Commun. ACM. 2003;46(11):68–77. doi: 10.1145/948383.948411. [DOI] [Google Scholar]
  • 7.Buyya R, Abramson D, Venugopal S. The grid economy. Proc. IEEE. 2005;93(3):698–714. doi: 10.1109/JPROC.2004.842784. [DOI] [Google Scholar]
  • 8.Paniagua, C., Xhafa, F., Caballé, S. & Daradoumis, T. A parallel grid-based implementation for real time processing of event log data in collaborative applications. In Parallel and Distributed Processing Techniques, PDPT2005, Las Vegas, USA, pp. 1177–1183 (2005).
  • 9.Arbona A, et al. A service-oriented grid infrastructure for biomedical data and compute services. IEEE Trans. NanoBiosci. 2007;6(2):136–141. doi: 10.1109/TNB.2007.897438. [DOI] [PubMed] [Google Scholar]
  • 10.Alonso JM, Ferrero JM, Hernandez V, Molto G, Saiz J, Trenor B. A grid computing-based approach for the acceleration of simulations in cardiology. IEEE Trans. Inf. Technol. Biomed. 2008;12(2):138–144. doi: 10.1109/TITB.2007.907982. [DOI] [PubMed] [Google Scholar]
  • 11.Mishra, Manoj Kumar, Patel, Yashwant Singh, Rout, Yajnaseni & Mund, G.B. A survey on scheduling heuristics in grid computing environment, I.J. Modern Education and Computer Science, pp. 57-83 (2014).
  • 12.Tsai C, Rodrigues J. Meta heuristic scheduling for cloud: A survey. IEEE Syst. J. 2014;8(1):279–291. doi: 10.1109/JSYST.2013.2256731. [DOI] [Google Scholar]
  • 13.Zhou Zhou, Zhigang Hu. Task scheduling algorithm based on greedy strategy in cloud computing. Open Cybern. Syst. J. 2014;8:111–114. [Google Scholar]
  • 14.Kong X, Lin C, Jiang Y, Yan W, Chu X. Efficient dynamic task scheduling in virtualized data centers with fuzzy prediction. J. Netw. Comput. Appl. 2011;34(4):1068–1077. doi: 10.1016/j.jnca.2010.06.001. [DOI] [Google Scholar]
  • 15.Sun W, et al. A game theoretic resource allocation model based on extended second price sealed auction in grid computing. J. Comput. 2012;7(1):65–75. doi: 10.4304/jcp.7.1.65-75. [DOI] [Google Scholar]
  • 16.Grover, R. & Chabbra, A. Bio-inspired optimization techniques for job scheduling in grid computing. In 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), pp. 1902-1906 (2016).
  • 17.Bagchi TP. The nondominated sorting genetic algorithm: NSGA. In: Bagchi TP, editor. Multiobjective Scheduling by Genetic Algorithms. Springer; 1999. [Google Scholar]
  • 18.Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evolut. Comput. 2002;6(2):182–197. doi: 10.1109/4235.996017. [DOI] [Google Scholar]
  • 19.Coello Coello, C. A. & Lechuga, M. S. MOPSO: a proposal for multiple objective particle swarm optimization Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No.02TH8600), Honolulu, HI, USA, pp. 1051-1056 (2002).
  • 20.Li, H. and Landa-Silva, D., An Adaptive Evolutionary Multi-Objective Approach Based on Simulated Annealing, Evolutionary Computation, pp. 561-595, (2011). [DOI] [PubMed]
  • 21.Lopez-Ibanez M, Stutzle T. The automatic design of multiobjective ant colony optimization algorithms. IEEE Trans. Evolut. Comput. 2012;16(6):861–875. doi: 10.1109/TEVC.2011.2182651. [DOI] [Google Scholar]
  • 22.Zhou Aimin, Bo-Yang Qu, Li Hui, Zhao Shi-Zheng. Multiobjective evolutionary algorithms: A survey of the state of the art. Swarm Evolut. Comput. 2011;1(1):32–49. doi: 10.1016/j.swevo.2011.03.001. [DOI] [Google Scholar]
  • 23.Yang S, et al. A grid-based evolutionary algorithm for many-objective optimization. IEEE Trans. Evolut. Comput. 2013;17(5):721–736. doi: 10.1109/TEVC.2012.2227145. [DOI] [Google Scholar]
  • 24.Zuo L, Shu L, Dong S, Zhu C, Hara T. A multi-objective optimization scheduling method based on the ant colony algorithm in cloud computing. IEEE Access. 2015;3:2687–2699. doi: 10.1109/ACCESS.2015.2508940. [DOI] [Google Scholar]
  • 25.Wang H, Jin Y, Yao X. Diversity assessment in many-objective optimization. IEEE Trans. Cybern. 2017;47(6):1510–1522. doi: 10.1109/TCYB.2016.2550502. [DOI] [PubMed] [Google Scholar]
  • 26.Tian Y, Cheng R, Zhang X, Jin Y. PlatEMO: A MATLAB platform for evolutionary multi-objective optimization. IEEE Comput. Intell. Mag. 2017;12(4):73–87. doi: 10.1109/MCI.2017.2742868. [DOI] [Google Scholar]
  • 27.Lin Q, Liu S, Zhu Q, Tang C, Song R, Chen J, Coello Coello CA, Wong K-C, Zhang J. Particle swarm optimization with a balanceable fitness estimation for many-objective optimization problems. IEEE Trans. Evol. Comput. 2018;22(1):32–46. doi: 10.1109/TEVC.2016.2631279. [DOI] [Google Scholar]
  • 28.Sadhukhan, Arindam, & Sivasubramani, S. Multi-objective load scheduling in a smart grid environment. In 20th National Power Systems Conference (NPSC), IEEE (2018).
  • 29.Singh, J. & Tiwari, R. Multi-Objective Optimal Scheduling of Electric Vehicles in Distribution System, 20th National Power Systems Conference (NPSC), 1–6 (2018).
  • 30.Lin Q, Liu S, Wong K-C, Gong M, Coello Coello CA, Chen J, Zhang J. A clustering-based evolutionary algorithm for many-objective optimization problems. IEEE Trans. Evol. Comput. 2019;23(3):391–405. doi: 10.1109/TEVC.2018.2866927. [DOI] [Google Scholar]
  • 31.Leiva J, Pardo RC, Aguado J. Data analytics-based multi-objective particle swarm optimization for determination of congestion thresholds in lv networks. Energies. 2019;12(7):1295. doi: 10.3390/en12071295. [DOI] [Google Scholar]
  • 32.Yuping L. Optimization of multi-objective virtual machine based on ant colony intelligent algorithm. Int. J. Perform. Eng. 2019;15(9):2494. doi: 10.23940/ijpe.19.09.p23.24942503. [DOI] [Google Scholar]
  • 33.Grewal, S. K. & Mangla, N. Deadline and Cost Optimization based Task Scheduling (DCOTS) in Cloud Computing Environment,4th International Conference on Intelligent Engineering and Management (ICIEM), London, United Kingdom, pp. 1-6 (2023).
  • 34.Cui Z, Zhao T, Wu L, Qin AK, Li J. Multi-objective cloud task scheduling optimization based on evolutionary multi-factor algorithm. IEEE Trans. Cloud Comput. 2023;11(4):3685–3699. doi: 10.1109/TCC.2023.3315014. [DOI] [Google Scholar]
  • 35.Shrichandran, G., Tinnaluri, V. S. N., Murugan, J. S., Meeradevi, T., Dwivedi, V. K. & Christal Mary, S. S. Hybrid Competitive Swarm Optimization Algorithm Based Scheduling in the Cloud Computing Environment. In 5th International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, pp. 1013-1018 (2023).
  • 36.Zhang H, Jia R. Application of chaotic cat swarm optimization in cloud computing multi objective task scheduling. IEEE Access. 2023;11:95443–95454. doi: 10.1109/ACCESS.2023.3311028. [DOI] [Google Scholar]
  • 37.Lipsa S, Dash RK, Ivković N, Cengiz K. Task scheduling in cloud computing: A priority-based heuristic approach. IEEE Access. 2023;11:27111–27126. doi: 10.1109/ACCESS.2023.3255781. [DOI] [Google Scholar]
  • 38.Lou J, Tang Z, Zhang S, Jia W, Zhao W, Li J. Cost-effective scheduling for dependent tasks with tight deadline constraints in mobile edge computing. IEEE Trans. Mobile Comput. 2023;22(10):5829–5845. doi: 10.1109/TMC.2022.3188770. [DOI] [Google Scholar]
  • 39.Ajinkya Wagaskar, K. & Chowdhary, G. V. Optimal Resource Search in Grid Computing as a Multi-Objective Problem with Particle Swarm Technique. In International Conference for Emerging Technology (INCET), Belgaum, India, pp. 1-6 (2020).
  • 40.Alsadie D. TSMGWO: Optimizing task schedule using multi-objectives grey wolf optimizer for cloud data centers. IEEE Access. 2021;9:37707–37725. doi: 10.1109/ACCESS.2021.3063723. [DOI] [Google Scholar]
  • 41.Ni L, Sun X, Li X, Zhang J. Gcwoas2: Multiobjective task scheduling strategy based on gaussian cloud-whale optimization in cloud computing. Comput. Intell. Neurosci. 2021;2021:1–17. doi: 10.1155/2021/5546758. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Abualigah L, Diabat A. A novel hybrid antlion optimization algorithm for multiobjective task scheduling problems in cloud computing environments. Clust. Comput. 2021;24(1):205–223. doi: 10.1007/s10586-020-03075-5. [DOI] [Google Scholar]
  • 43.Dutta D, Rath S. Job scheduling on computational grids using multi-objective fuzzy particle swarm optimization. Soft Comput. Theor. Appl. 2022;1380:333–3347. [Google Scholar]
  • 44.Kaur K, Garg S, Aujla GS, Kumar N, Zomaya AY. A multi-objective optimization scheme for job scheduling in sustainable cloud data centers. IEEE Trans. Cloud Comput. 2022;10(1):172–186. doi: 10.1109/TCC.2019.2950002. [DOI] [Google Scholar]
  • 45.Akbar MI, Kazmi SAA, Alrumayh O, Khan ZA, Altamimi A, Malik MM. A novel hybrid optimization-based algorithm for the single and multi-objective achievement with optimal DG allocations in distribution networks’. IEEE Access. 2022;10:25669–25687. doi: 10.1109/ACCESS.2022.3155484. [DOI] [Google Scholar]
  • 46.Moazeni A, Khorsand R, Ramezanpour M. dynamic resource allocation using an adaptive multi-objective teaching-learning based optimization algorithm in cloud. IEEE Access. 2023;11:23407–23419. doi: 10.1109/ACCESS.2023.3247639. [DOI] [Google Scholar]
  • 47.Reddy BPV, Reddy KG. A multi-objective based scheduling framework for effective resource utilization in cloud computing. IEEE Access. 2023;11:37178–37193. doi: 10.1109/ACCESS.2023.3266294. [DOI] [Google Scholar]
  • 48.Dakkak O, Suki A, Arif M, Shahrudin AN. A critical analysis of simulators in grid. J. Teknol. 2015;77(4):111–117. doi: 10.11113/jt.v77.6050. [DOI] [Google Scholar]
  • 49.Wu, R., Wu, M., Mi, X. & An, Q. Task Scheduling Algorithm Based on Triangle Module in Grid Computing. In 8th International Conference on Wireless Communications, Networking and Mobile Computing, 2012, pp. 1-4 (2012).
  • 50.Patel DK, Tripathy CR. An efficient load balancing mechanism with cost estimation on GridSim. Int. Conf. Inf. Technol. (ICIT) 2016;2016:75–80. [Google Scholar]
  • 51.Eng K, Muhammed A, Mohamed MA, Hasan S. Incorporating the range-based method into GridSim for modeling task and resource heterogeneity. IEEE Access. 2017;5:19457–19462. doi: 10.1109/ACCESS.2017.2750209. [DOI] [Google Scholar]
  • 52.Nukarapu D, Tang B, Wang L, Lu S. Data replication in data intensive scientific applications with performance guarantee. IEEE Trans. Parallel Distrib. Syst. 2011;22(8):1299–1306. doi: 10.1109/TPDS.2010.207. [DOI] [Google Scholar]
  • 53.Haider S, Nazir B. Dynamic and adaptive fault tolerant scheduling with QoS consideration in computational grid. IEEE Access. 2017;5:7853–7873. doi: 10.1109/ACCESS.2017.2690458. [DOI] [Google Scholar]
  • 54.Patel, D. K. & Tripathy, C. R. An Effective Selection Method for Scheduling of Gridlets among Heterogeneous Resources with Load Balancing on GridSim. In 2017 3rd International Conference on Computational Intelligence and Networks (CINE), pp. 68-72 (2017).
  • 55.Sheikh, S., Shahid, M. & Nagaraju, A. A novel dynamic task scheduling strategy for computational grid. In 2017 International Conference on Intelligent Communication and Computational Techniques (ICCT), pp. 102-107 (2017).
  • 56.Hwang CL, Yoon K. Multiple Attribute Decision Making: Methods and Applications. Springer-Verlag; 1981. [Google Scholar]
  • 57.Yoon K. A reconciliation among discrete compromise situations. J. Oper. Res. Soc. 1987;38(3):277–286. doi: 10.1057/jors.1987.44. [DOI] [Google Scholar]
  • 58.Hwang CL, Lai YJ, Liu TY. A new approach for multiple objective decision making. Comput. Oper. Res. 1993;20(8):889–899. doi: 10.1016/0305-0548(93)90109-V. [DOI] [Google Scholar]
  • 59.Krohling RA, Pacheco AGC. A-TOPSIS - An approach based on TOPSIS for ranking evolutionary algorithms. Procedia Comput. Sci. 2015;55:308–317. doi: 10.1016/j.procs.2015.07.054. [DOI] [Google Scholar]
  • 60.Fei Liguo, Yong Hu, Xiao Fuyuan, Chen Luyuan, Deng Yong. Modified TOPSIS method based on numbers and its applications in human resources selection. Math. Probl. Eng. 2016;2016(3):1–14. doi: 10.1155/2016/6145196. [DOI] [Google Scholar]
  • 61.Shirvani, M. H., Amirsoleimani, N., Salimpour, S. & Azab, A. Multi-criteria task scheduling in distributed systems based on fuzzy TOPSIS. In IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE), pp. 1-4 (2017).
  • 62.Liu L, Fan Q, Buyya R. A deadline-constrained multi-objective task scheduling algorithm in mobile cloud environments. IEEE Access. 2018;6:52982–52996. doi: 10.1109/ACCESS.2018.2870915. [DOI] [Google Scholar]
  • 63.Srinivas, D.B., Hegde, S. N., Rajan, M. A. & Krishnappa, H. K. A Novel Task Scheduling Scheme for Computational Grids - Greedy Approach. In 2018 IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA), 2018, pp. 1026-1033 (2018).
  • 64.Srinivas DB, Hegde Sujay N, Rajan MA, Krishnappa HK. An efficient greedy task scheduling algorithm for heterogeneous inter-dependent tasks on computational grids. Int. J. Grid Util. Comput. 2020;11(5):587–601. doi: 10.1504/IJGUC.2020.110059. [DOI] [Google Scholar]
  • 65.Pegasus workflow generator: https://confluence.pegasus.isi.edu.
  • 66.Buyya Rajkumar, Murshed Manzur. Gridsim: A toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. Concurr. Comput. Pract. Exp. 2002;14(13–15):1175–1220. doi: 10.1002/cpe.710. [DOI] [Google Scholar]
