Abstract
In our modern digital era, social networks have seamlessly integrated into the fabric of our daily lives. These digital platforms serve as vital channels for communication, exchanging information, and cultivating valuable connections. The propagation of information within these social networks has emerged as a central focus for numerous sectors, including politics, marketing, research, education, and finance. Diverse models have been employed to depict the dynamics of information dissemination across these networks. Nevertheless, the notion of influence holds profound significance for both businesses and individuals. Influence maximization, particularly within the context of social networks, has garnered considerable attention owing to its potential to reach and impact a broad audience. This intricate challenge is commonly referred to as the “influence maximization problem,” a problem well-known for its NP-hard complexity. This paper proposes a cutting-edge technique that leverages the Moth-Flame Optimization Algorithm to enhance influence maximization. Influence maximization is an important issue in network analysis, which widely occurs in social networks. Influence can be seen as a cascading effect, where the actions of a few trigger a chain reaction, ultimately reaching a large portion of the network. Identifying these “influencers” is crucial for efficient resource allocation and information dissemination. One of the important issues in finding the maximum influence is choosing the best vertex among all the vertices in the graph. This research presents a new method to find the maximum influence in social networks based on the Moth-Flame Algorithm (MFA). The proposed method aims to find the maximum influence in the social network graph that has a good fitness degree. The algorithm can identify potential influencers. 
Our simulations across multiple networks show that the algorithm is a scalable solution to the influence maximization problem. The experimental results indicate that the MFA reduces the execution time required to approximate the maximum influence. The proposed technique improved accuracy and execution time by 3.140 % and 12.2 %, respectively, compared to other methods.
Keywords: Maximum influence, Graph, Social network, Moth-flame algorithm
1. Introduction
The user base of online social networks has grown steadily and is expected to keep growing in the coming years [1,2]. This growth places an ever-increasing burden on the service providers that manage and enhance the capabilities of the underlying cloud platforms [3,4,5]. Social networks are complex networks that are now widely used [6,7]. In social network analysis, social relations are analyzed with graph theory: the vertices represent the people in the network, and the edges denote the relationships among them [8]. Various types of edges can exist among vertices [9]. Numerous studies show that social network analysis can be applied at both individual and social levels to identify communities and types of relationships, establish social connections, explore graphs to discern patterns, and achieve various other goals [10,11,12]. Social networks are also integral to the success and growth of businesses, as they offer valuable channels for information gathering, competition management, and collaborative decision-making regarding pricing and policies [13,14,15].
One of the major challenges in mining frequent patterns is that the algorithms used for this task often return a huge set of frequent patterns [16,17]. This issue becomes more apparent when a low min-support threshold is selected. The main reason is that every subset of a frequent itemset is itself frequent. Accordingly, the presence of a single large frequent itemset in the transaction database can lead to exponential growth in the number of frequent itemsets (namely, the number of subsets of the large itemset) [18]. Furthermore, finding related information and knowledge among several users with data mining approaches is time-consuming and often frustrating [19]; it is one of the most important challenges in social networks. One way to address this issue is to use social network analysis to identify the users who interact with each other the most. Hence, it is necessary to discover the influential nodes [20]. Through them, it is possible to present each user with the most appropriate and desirable items from among a large amount of information and various products [21]. In addition, the suggestions and opinions of known users can be propagated through influential nodes, which yields more effective and accurate recommendations [22]. The maximum influence problem focuses on finding a small subset of nodes with maximum influence in a social network; it is an important NP-hard problem. The majority of recent studies have focused on controlling how influence forms, for example by identifying influence agents and developing algorithms to identify the first influential node [23]. However, there are few studies on the mathematical modeling of maximum influence.
This research uses a social computing method, the Moth-Flame Algorithm (MFA), to analyze and model social behaviors in the media and to find the maximum influence in social networks. The MFA is a population-based optimization method inspired by the navigation behavior of moths [24]. A model based on the meta-heuristic MFA is proposed to determine who has the most impact on the social network. Moths go through two significant life stages: caterpillar and adult. Their primary characteristic is nocturnal activity: they fly at night, navigating by moonlight through a mechanism called transverse orientation, in which the moth flies at a constant angle to the moon. Because the moon is very far away, this mechanism yields a nearly straight path over long distances [25]. Around artificial lights, however, moths are observed to fly in spirals. The moth still tries to keep a constant angle to the light, but since the lamp is much closer than the moon, the result is a deadly spiral path that converges on the lamp.
Finding a subgroup of people in a social network who maximize influence makes it possible to maximize the spread of information, trends, or ideas. Social networks, which operate on remote servers, present both opportunities and challenges for influence maximization [26]. Hence, addressing the problem with different algorithms and new solutions remains an active topic. Networks offer scalability and accessibility, but the vast amount of data they generate poses challenges for influence analysis [27,28], and traditional algorithms struggle with the complexity of these networks. The MFA is a metaheuristic inspired by the navigational behavior of moths towards light sources. It mimics the attraction of moths to flames and applies this concept to optimization problems, which makes it suitable for complex scenarios such as influence maximization. In addition, according to the studies conducted in this field, the MFA outperforms other existing algorithms. In this research, we use the MFA, together with a new method for optimizing influence, to find the effective influence in a large set of social network nodes. Adjacency matrices have properties that simplify network analysis: they represent the relationships between nodes efficiently, and applying them to influence maximization helps in quantifying and predicting the spread of influence. Furthermore, the proposed method converges well and can, to some extent, find the best influence even in networks with high dispersion. Influence maximization in social networks also faces numerous challenges, including network size, data noise, and computational complexity, and the MFA offers a unique perspective on overcoming these hurdles. The objectives of this research are as follows:
- Improving the accuracy, precision, and recall in finding the maximum influence using the MFA;
- Decreasing the execution time in finding the maximum influence using the MFA.
The contributions of this research are:
- Enhancing Influence Spread: The MFA excels in identifying influential nodes within a social network. Optimizing the selection of these nodes significantly improves influence spread, reaching a wider audience with minimal effort.
- Reducing Computational Complexity: Unlike some other optimization techniques, the MFA offers a streamlined approach to influence maximization. Its computational efficiency ensures faster results without compromising accuracy.
Section 2 reviews the literature on influence maximization, in particular in social networks. Section 3 describes the MFA and the proposed MFA-based method for maximizing influence. Section 4 presents the results of the proposed method, obtained using MATLAB, and compares them with other methods. Section 5 discusses the conclusions, advantages, and disadvantages of the proposed method and suggests directions for future work on influence maximization.
2. Literature review
Networked systems are pervasive in our daily lives, and there is a growing interest in analyzing their dynamics and properties [29,30,31]. Among these networks, social networks, which are widely studied using graph theory, hold significant importance in modern society. Platforms like Facebook, Twitter, WeChat, and Microblog have become integral to people's lives, offering easy access to the latest news and facilitating communication among users [32,33]. Moreover, the close connections within social networks make information dissemination effortless, turning them into a potent promotional tool for companies and advertisers [34]. The influence maximization problem asks which limited group of nodes in a social network, once activated, yields the maximum spread of influence [35]. In recent research, this problem has garnered growing interest [36,37].
2.1. Related work
Liang, He [38] delved into the challenge of maximizing targeted influence in competitive social networks (TIMC). They established that the objective function of TIMC is both monotone and submodular. To tackle this problem efficiently, they introduced the Reverse Reachable set-based Greedy (RRG) algorithm. They further enhanced the algorithm's efficiency by utilizing marginal influence to prune nodes and established an upper bound for the marginal influence. Their experiments demonstrated the effectiveness of the RRG algorithm, especially in sparse large networks with intense competition.
Beni, Bouyer [39] introduced the CSP (Combined modules for Seed Processing) algorithm, designed to recognize influential nodes. CSP initially identified graph modules based on criteria like clustering coefficient, degree, and common neighbors. It then grouped nodes with the same label into modules using label diffusion. The most influential modules were selected through diffusion capacity filtering. The algorithm subsequently merged neighboring modules and extracted a candidate set of influential nodes with a defined limit. Seed nodes were then chosen from the candidate set utilizing a novel scoring measure. Experimental results showcased CSP's superiority in solution quality and speed on various networks.
Dong, Xu [40] presented the Three-Stage Iterative Framework For Influence Maximization (TSIFIM) for identifying seed spreaders in complex networks. TSIFIM commenced by selecting initial candidate seeds considering global communicability and local network importance. Next, it assigned the remaining nodes to specific communities using a local resource allocation similarity index. Core nodes within each community satisfying local influence threshold conditions were selected as supplementary candidate seeds. TSIFIM demonstrated superior performance in influence spreading, sensitivity analysis, seed dispersion, and statistical testing.
Bouyer, Beni [41] devised a method aimed at reducing the search space and enhancing time complexity. Their approach selected seed nodes based on optimal influence spread, considering community structure, diffusion capability, and global diffusion probability. The FIP algorithm detected overlapping communities, analyzed emotional relationships within communities, and limited the search space by removing insignificant communities. Candidate nodes were generated using the probability of global diffusion. The final seed nodes were selected based on the importance of nodes and diffusion impact within the communities. The FIP algorithm outperformed others in terms of efficiency and runtime.
Bouyer and Beni [42] introduced an efficient solution for the influence maximization problem, known as the Local Maximal Power (LMP) algorithm. This innovative approach employs a local traversal method to label nodes based on their influence power. The LMP algorithm initiates its traversal from a node with the lowest influence power and proceeds to assign ranking labels to this node and its neighboring nodes at each step. These labels are determined by considering factors such as diffusion capability and strategic position. By incorporating node labeling steps, the LMP algorithm significantly reduces the search space involved in the influence maximization problem. Within the proposed algorithm, three ranking labels are utilized, and nodes possessing the highest ranking labels are identified as candidate nodes. This localized and rapid step plays a crucial role in streamlining the search space. Ultimately, the LMP algorithm selects seed nodes by taking into account both the topology features and the strategic position of candidate and connector nodes. To evaluate the performance of this algorithm, comprehensive benchmarking was conducted against well-established and recently introduced seed selection algorithms. Experimental assessments were carried out using real-world and synthetic networks to validate the efficiency and effectiveness of the LMP algorithm. The results unequivocally demonstrated that the proposed algorithm stands out as the fastest among its state-of-the-art counterparts, all while maintaining a linear time complexity. Additionally, the LMP algorithm strikes a commendable balance between efficiency and time complexity, making it an excellent choice for solving the influence maximization problem.
As an illustration, Tang, Zhu [43] introduced an efficacious solution to the influence maximization problem, employing a novel approach known as the Learning-Automata-Driven Discrete Butterfly Optimization Algorithm (LA-DBOA), meticulously tailored to the network's topology. The empirical outcomes of their work unveiled that the proposed algorithm not only achieved a comparable influence spread to that of the CELF algorithm but also outperformed various conventional methodologies. This compelling evidence underscores the efficacy of meta-heuristics grounded in swarm intelligence for tackling the complex challenge of influence maximization.
Furthermore, addressing the inherent inefficiencies of extant greedy algorithms and the limited accuracy of centrality-based heuristics, Fu, Zhang [44] introduced a refined algorithm termed the Improved Differential Evolution Algorithm (IDDE), which leverages network discretization techniques. This innovative algorithm enhanced the variance rule within the differential evolution framework, leveraging discrete parameters such as the number and granularity of the remaining network nodes after the target node removal to assess node importance. It also introduced a fitness function based on network robustness. Comparative experiments conducted across four real datasets of varying sizes unequivocally demonstrated the superior performance of the IDDE algorithm when juxtaposed against its peers.
The remainder of this section delves into the intricacies of the maximum influence problem and its relevance within social networks.
2.2. Maximum influence issue
Influence spread in a target network G must be defined before maximum influence can be defined. The influence spread of a node set S, denoted σ(S), is the expected number of active nodes when S is used as the seed set. Maximum influence is then the problem of selecting k nodes (i.e., the seed set S) of the network G so as to maximize the influence spread σ(S). Formally, it is a constrained optimization problem [45].
Maximum influence: Given a graph G = (N, E, W) and an integer k (k < |N|), select an initial node set S = {s1, s2, …, sk} such that the influence spread σ(S) under a specific diffusion model is maximal, as expressed in Eq. (1):
| $S^{*} = \arg\max_{S \subseteq N,\ \lvert S \rvert = k} \sigma(S)$ | (1) |
The basic objective of the maximum influence problem is to identify a subset of nodes in a social network that spreads influence as widely as possible. It is a fundamental data mining problem that locates k such nodes in a given social network G. Numerous studies have been conducted on this problem, and it has been used in online applications [46] such as viral marketing, which aims to reach as many people as possible with a specific product. For instance, k people receive samples of a product that is superior to its rivals in quality, and they are asked to promote it to their friends. These friends, in turn, are expected to share it with additional friends.
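The influence spread σ(S) defined above is an expectation, so in practice it is usually estimated by repeated simulation. The following is a minimal sketch that assumes the independent cascade model with a single uniform activation probability `p`; the function names, the adjacency-list input format, and the default parameter values are illustrative assumptions and are not taken from the proposed method.

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """Run one independent-cascade simulation and return the number of active nodes.

    graph: dict mapping each node to a list of its out-neighbors (assumed format).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, []):
                # A newly active node gets one chance to activate each inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p=0.1, runs=1000, seed=0):
    """Monte Carlo estimate of sigma(S): the mean number of active nodes over many runs."""
    rng = random.Random(seed)
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs
```

A seed set can then be scored with, for example, `estimate_spread(graph, [0, 5], p=0.05)`. More runs reduce the variance of the estimate at the cost of execution time, which is the cost that metaheuristic approaches try to limit.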
2.3. Maximum influence on social networks
Numerous studies have been conducted using both heuristic and greedy techniques. In large social networks, greedy strategies achieve higher influence spread but are less scalable. Heuristic methods are fast and scalable; however, they do not suit all networks. Finding ways to make greedy techniques more scalable is still a hot topic.
Domingos and Richardson [47] first introduced maximum influence as an algorithmic problem to the data mining community. They modeled the problem with Markov Random Field (MRF) techniques and studied how to find an optimal set of individuals (seeds). They proposed a heuristic greedy hill-climbing method to solve this problem. However, the MRF-based formulation is not nearly as successful as the discrete optimization formulation of [48].
In a directed graph G = (V, E), nodes represent people in a social network, and edges represent relationships among people. Given a positive integer k, the maximum influence task is to find a node set S of size at most k such that targeting these nodes for initial activation maximizes the expected number of activated nodes in the entire social network under a specific random diffusion model. The target set S is often referred to as the seed set. The function σ represents the influence spread: given that S is activated initially, σ(S) is the expected number of active nodes at the end of the spread. By convention, σ(S) is known as the influence spread of S [49].
2.3.1. Solving the maximum influence problem
Two obstacles must be overcome in order to resolve the maximum influence issue:
1) How to assess nodes' significance for spreading influence
2) How to select network nodes with influence
Many methods have been proposed in past works to address these two challenges. Ref. [48] describes a technique based on degree centrality, which assesses a node's influence on diffusion by its degree: a node with many neighbors is expected to reach many nodes during the diffusion phase. To choose the initial seed set for the maximum influence problem, the degree approach selects the k nodes with the highest degree. The degree of node i is calculated using Eq. (2) [50]:
| $d_i = \sum_{j=1}^{n} a_{ij}$ | (2) |
In Eq. (2), n is the number of nodes, aij = 1 indicates that nodes i and j are connected, and aij = 0 indicates that they are not.
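The degree heuristic described above can be sketched in a few lines. This is an illustrative example only: it assumes the graph is given as a zero-one adjacency matrix stored as a list of lists, and the function names and the tie-breaking rule (lower node index first) are assumptions.

```python
def degrees(adj):
    """Degree of every node: the row sums of the zero-one adjacency matrix."""
    return [sum(row) for row in adj]


def top_k_by_degree(adj, k):
    """Return the indices of the k highest-degree nodes (ties broken by lower index)."""
    deg = degrees(adj)
    return sorted(range(len(adj)), key=lambda i: (-deg[i], i))[:k]
```

The heuristic is fast because it needs no diffusion simulation, but it ignores overlap between the neighborhoods of the selected nodes.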
Ref. [48] also describes a closeness-centrality-based (distance-based) approach. This technique assesses a node's capacity to influence other nodes by its average shortest path length to the other nodes. Like the degree centrality approach, it selects seed nodes by ranking them on this measure. Eq. (3) gives the average shortest path length li of node i:
| $l_i = \frac{1}{n-1} \sum_{j \neq i} l_{ij}$ | (3) |
In Eq. (3), lij is the length of the shortest path between nodes i and j. Ref. [45] investigated a solution to the maximum influence problem based on discrete Particle Swarm Optimization (PSO). The authors created an optimization model based on a local influence criterion that can accurately estimate the influence spread in the independent and weighted cascade models. The discrete PSO approach was then used to optimize this influence criterion.
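The distance-based score of Eq. (3) can be computed with a breadth-first search. The sketch below assumes an unweighted graph given as an adjacency list and averages over the nodes reachable from i, which equals the average over all other nodes when the graph is connected. The function name and input format are assumptions made for illustration.

```python
from collections import deque


def average_shortest_path(adj_list, i):
    """Average shortest-path length from node i to every node reachable from it."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj_list[u]:
            if v not in dist:
                # First visit in BFS order gives the shortest hop count.
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for node, d in dist.items() if node != i]
    # An isolated node reaches nobody; report an infinite average distance.
    return sum(others) / len(others) if others else float("inf")
```

Nodes with a smaller average distance are closer to the rest of the network and would be ranked higher by this heuristic.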
In [51], Degree Descent Search Evolution (DDSE) is a suggested approach to enhance the effectiveness of greedy algorithms. They introduced an evolutionary algorithm based on Degree Descent Search (DDS) that was able to greatly outperform greedy algorithms by doing away with their time-consuming simulations. The DDSE algorithm examines the following parts of the efficiency problem:
1. Each node's influence from the seed set is estimated by the Expected Diffusion Value (EDV), a readily calculable metric. Thus, DDSE does not need the iterative simulations frequently employed in greedy techniques.
2. EDV is decided by DDSE. Typically, the candidate seed set's local impact spread is calculated concurrently. Like greedy algorithms, it selects candidates one at a time rather than considering each node's nearby effect.
3. The DDSE method is built on efficient evolutionary operators based on the degree-descent search strategy and needs only a small population; therefore, the evolution procedure requires only a limited number of calculations. There are two noteworthy aspects of the DDSE algorithm. First, it is much more efficient than the greedy algorithm. Second, it is highly accurate and performs better than the current heuristic approaches.
3. Methodology
This section discusses the definition of the algorithm and the full introduction of the parameters used in the proposed algorithm and how to implement it. Furthermore, the required parameters are mentioned to perform the tests required to evaluate the results of the proposed algorithm.
3.1. Moth-Flame Algorithm
Algorithms that draw inspiration from nature are widely employed nowadays in a variety of fields, such as atom optimization [52], Python-based algorithm [53], and swarm-intelligence optimization algorithm [54]. The MFA is a population-based optimization method inspired by the navigation behavior of moths. In this research, the meta-heuristic MFA is used to build a model for finding the maximum influence in the social network. Moths experience two important stages in their life: caterpillar and adult. The most important fact about moths is that they are active at night, navigating by moonlight through a mechanism called transverse orientation, in which the moth flies at a constant angle to the moon. This mechanism is very suitable for flying in a straight line over long distances. Around artificial lights, however, moths are observed to fly in spirals, because transverse orientation is ineffective when the light source is close to the moth. The moth still tries to keep a constant angle to the light, but since the lamp is much closer than the moon, the moth ends up on a deadly spiral path towards the lamp. In the MFA, exploitation of the area around the best solutions found so far is guaranteed for the following reasons.
A. Moths revise their positions within hyperspheres around the most successful solutions found so far.
B. The order of the flames changes in each iteration according to the best solutions, and the moths must update their positions with respect to the updated flames. Hence, the moths may update their positions around different flames.
One issue with this approach is that updating the moths with respect to many different locations in the search space may degrade the exploitation of the most promising solutions.
An adaptive rule for the number of flames is defined in Eq. (4) to address this issue.
| $\text{flame number} = \operatorname{round}\left(N - l \cdot \frac{N - 1}{T}\right)$ | (4) |
In this context, l represents the current iteration number, N signifies the maximum number of flames, and T denotes the upper limit for algorithm iterations. In the initial iteration, the algorithm commences with N flames. As the iterations progress, the number of flames decreases, so that in the final stages the moths refine their positions only relative to the most promising flame. This approach maintains a balance between exploration and exploitation within the search space by systematically diminishing the number of flames. The algorithm's procedural sequence is depicted as a flowchart in Fig. 1.
Fig. 1.
Flowchart of the moth-flame optimization algorithm.
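The flame-reduction schedule of Eq. (4) can be illustrated with a short function. The sketch assumes the standard linear schedule of moth-flame optimization, round(N − l·(N − 1)/T); the function and argument names are chosen for illustration.

```python
def flame_count(l, N, T):
    """Number of flames kept at iteration l, given N initial flames and T iterations."""
    return round(N - l * (N - 1) / T)


# Example schedule for N = 30 flames over T = 100 iterations.
schedule = [flame_count(l, 30, 100) for l in range(0, 101, 25)]
```

The schedule starts at N flames and decreases linearly until a single flame remains at iteration T, which shifts the search from exploration to exploitation.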
To solve the problem of finding active (influence) nodes using the MFA, the adjacency matrix of the graph is built first from the connections of each vertex to the other vertices. Next, the degree of each vertex is computed from its edges, and vertices with the same degree are placed in the same group. Like other population-based algorithms, the MFA starts from a random initial population. Candidate solutions (sub-communities) are generated randomly as binary vectors, one per moth, with a higher selection probability for vertices of higher degree, since such vertices are more likely to be part of the complete subgraph. Each moth is thus a vector of length n, where n is the number of vertices of the graph.
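The degree-biased random initialization described above might look as follows. This is a sketch under stated assumptions: the inclusion probability of a vertex is taken to be its degree divided by the maximum degree, which is only one possible way to favor high-degree vertices, and the function name and parameters are hypothetical.

```python
import random


def init_moths(adj, num_moths, seed=0):
    """Create binary moths of length n; higher-degree vertices are more likely to be 1."""
    rng = random.Random(seed)
    deg = [sum(row) for row in adj]
    max_deg = max(deg) or 1  # avoid division by zero on an edgeless graph
    # Assumed scaling: inclusion probability proportional to degree.
    probs = [d / max_deg for d in deg]
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(num_moths)]
```

With this scaling, a vertex of maximum degree is always selected and an isolated vertex never is; a softer scaling would keep more diversity in the initial population.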
3.2. Suggested method
Efficiency and precision are paramount considerations when detecting connections within social networks [55]. Beyond speed and accuracy, emerging identification methods must possess the capacity to minimize the likelihood of erroneous connection identifications, thereby reducing noise. Moreover, it's crucial to ascertain the optimal quantity and scale of connections while relying on minimal prior knowledge regarding the network's structure and the number and size of connections within it [56,57]. Typically, in algorithms employing optimization techniques, the goal is to address all these aspects collectively, thereby identifying active (influence) nodes with high-quality results at a suitable pace. In many algorithms, all the data, along with their numerous dimensions, are centralized in one location, and community detection and identification are executed under this assumption. However, as data dimensions increase, the efficiency of these algorithms diminishes. In practical applications, factors such as limited storage space or security concerns may render it impossible to centralize all data with their full dimensions in a single location, impeding community detection and identification processes.
To delve into the details of the proposed approach for detecting and identifying active nodes or influencers within social networks, please refer to Fig. 2. The algorithm comprehensively analyzes various components, which are explored below.
Fig. 2.
Flowchart of the proposed method.
3.2.1. Adjacency matrix
Assume that G(V, E) is a simple graph with n vertices, listed in arbitrary order as v1, v2, …, vn. The adjacency matrix A of G with respect to this listing is the n×n zero-one matrix whose entry aij equals 1 if vi and vj are adjacent and 0 otherwise.
The adjacency matrix of a simple graph is symmetric because an edge between vi and vj also connects vj to vi. The adjacency matrix can also represent undirected graphs with loops and multiple edges. In that case, however, it is no longer a zero-one matrix, because an entry counts the number of edges between the two vertices. All undirected graphs, including multigraphs and pseudographs, have a symmetric adjacency matrix.
A zero-one matrix can also represent directed graphs. In the adjacency matrix of a directed graph G(V, E), whose vertices are listed in arbitrary order as v1, v2, …, vn, the entry aij equals 1 when there is an edge from vi to vj and 0 otherwise.
The adjacency matrices of directed graphs are not necessarily symmetric because an edge from vi to vj does not require an edge from vj to vi. Adjacency matrices can also be used for directed multigraphs, in which case they are again no longer zero-one matrices.
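The difference between the undirected and directed cases can be made concrete with a small example. The helper names below are illustrative, and vertices are assumed to be numbered from 0 to n − 1.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build the n-by-n zero-one adjacency matrix of a simple graph from an edge list."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # an undirected edge is recorded in both directions
    return A


def is_symmetric(A):
    """Check whether A equals its transpose."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
```

For the edge list `[(0, 1), (1, 2)]`, the undirected matrix is symmetric, while the directed matrix is not, because the edge from 0 to 1 has no counterpart from 1 to 0.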
3.2.2. Initialization
In this method, moths are the candidate solutions, and the problem variables are their positions in space. Moths can fly in one, two, or more dimensions. Because the MFA is population-based, the set of moths in the general algorithm is a matrix of order n×d (n moths, d variables).
In the initialization stage, this method generates a random solution for each moth Mk (k = 1, 2, …, B), where B is the number of moths. A possible solution, i.e., a set of active (influence) nodes, is represented by an array of length n. The number stored at index i is the ID of the candidate active (influence) node that will execute the task Ti. Because B moths are created in the initialization step, the initial population of solutions is a B×n matrix, as in Eq. (5) [58].
| $M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,n} \\ \vdots & \vdots & \ddots & \vdots \\ m_{B,1} & m_{B,2} & \cdots & m_{B,n} \end{bmatrix}$ | (5) |
3.2.3. Search space
Initialization is followed by repeated execution of the function P until the termination function T returns true. P is the main function that moves the moths through the search space. As explained earlier, this method draws its inspiration from transverse orientation. To model this behavior mathematically, the position of each moth with respect to a flame is updated using Eq. (6) [51].
| $M_i = S(M_i, F_j)$ | (6) |
Here, Fj represents the jth flame, Mi represents the ith moth, and S is the spiral function of the MFA, defined in Eq. (7).
| $S(M_i, F_j) = D_i \cdot e^{bt} \cdot \cos(2\pi t) + F_j$ | (7) |
Here, Di is the distance of the ith moth from the jth flame, b is a constant that determines the shape of the logarithmic spiral, and t is a random number in the range [-1, 1]. Di is obtained using Eq. (8):
| $D_i = \lvert F_j - M_i \rvert$ | (8) |
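The spiral update can be sketched as follows, assuming the standard logarithmic spiral of moth-flame optimization, S(M, F) = D·e^(bt)·cos(2πt) + F with D = |F − M|, applied independently to each dimension. The function names and the default b = 1 are assumptions. The result is a continuous position; how it is mapped back to a discrete set of nodes is not covered by this sketch.

```python
import math
import random


def spiral_position(m, f, b=1.0, t=None, rng=random):
    """Move one coordinate m of a moth around the matching flame coordinate f."""
    if t is None:
        t = rng.uniform(-1.0, 1.0)  # t = -1 is closest to the flame, t = 1 is farthest
    d = abs(f - m)  # distance between moth and flame in this dimension
    return d * math.exp(b * t) * math.cos(2.0 * math.pi * t) + f


def update_moth(moth, flame, b=1.0, rng=random):
    """Apply the spiral update to every dimension, drawing a fresh t for each."""
    return [spiral_position(m, f, b=b, rng=rng) for m, f in zip(moth, flame)]
```

A moth that already coincides with its flame has distance zero in every dimension and therefore stays where it is, while larger distances produce wider spirals.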
3.2.4. Pseudocode
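Based on the above, the pseudocode of the proposed method is given below. In the standard MFA, all newly generated moths are accepted and carried over to the next generation, whereas the proposed method accepts only moths that improve the fitness. This greedy strategy can be expressed as Eq. (9):
| $M_i^{t+1} = \begin{cases} M_i^{new}, & f(M_i^{new}) > f(M_i^{t}) \\ M_i^{t}, & \text{otherwise} \end{cases}$ | (9) |
Here, $M_i^{new}$ is the newly produced moth for the next generation, and $f(M_i^{new})$ and $f(M_i^{t})$ are the fitness of the moths $M_i^{new}$ and $M_i^{t}$, respectively.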
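The greedy acceptance rule can be illustrated with a small sketch. It assumes that a higher fitness value is better, which fits a maximization problem such as influence spread, and the function names are hypothetical; the fitness function is passed in as a parameter because its exact form is not restated here.

```python
def greedy_accept(current, candidate, fitness):
    """Keep the candidate moth only if it is strictly fitter than the current one."""
    return candidate if fitness(candidate) > fitness(current) else current


def next_generation(moths, candidates, fitness):
    """Apply the greedy rule moth by moth to build the next population."""
    return [greedy_accept(m, c, fitness) for m, c in zip(moths, candidates)]
```

Because a moth is never replaced by a worse one, the best fitness in the population cannot decrease from one generation to the next, at the price of a higher risk of premature convergence.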
The pseudocode of the proposed method:
As shown in Fig. 3, the new influence that is deployed in the corresponding graph has five direct neighbor nodes as suggested and indicated by the red arrow. The nodes depicted in blue are the nodes provided in the initialization.
Fig. 3.
Search space for influence detection.
The arrangement of features in the M-dimensional space is illustrated in Fig. 4. To explore the search space with the MFA, the features are encoded as moths, as shown in Fig. 4. Each feature is assigned a distinct code within the entire population of moths (P). Once the parameters are set, the fitness function is computed.
Fig. 4.
The structure of nodes in the graph space.
To avoid becoming trapped in local optima and to widen the search within each graph, we select the τth worst vertex instead of always choosing the single worst vertex at each step. Specifically, the vertex whose rank k is determined by Eq. (10) is designated for replacement.
| (10) |
Here, k refers to the active number (influence) selected from the ordered list of ranks, n is the number of graph vertices, and rand() is a function that generates a random number in the interval [0, 1]. The value of the parameter τ is fixed and was determined by trial and error in this research. Algorithms that draw inspiration from nature are widely employed nowadays in various fields.
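The idea of replacing a randomly ranked poor vertex, rather than always the single worst one, can be sketched as follows. The sketch does not reproduce Eq. (10); it assumes a power-law rank distribution, P(k) proportional to k^(-τ), which is a common choice in τ-based extremal optimization, and the names `pick_rank` and `pick_vertex_to_replace` are hypothetical.

```python
import random


def pick_rank(n, tau, rng=random):
    """Sample a rank k in 1..n with probability proportional to k ** (-tau)."""
    ranks = list(range(1, n + 1))
    weights = [k ** (-tau) for k in ranks]
    return rng.choices(ranks, weights=weights, k=1)[0]


def pick_vertex_to_replace(vertices, fitness, tau, rng=random):
    """Order vertices from worst to best and return the one at the sampled rank."""
    ordered = sorted(vertices, key=fitness)  # rank 1 is the worst vertex
    return ordered[pick_rank(len(ordered), tau, rng) - 1]
```

Under this assumption, larger values of τ concentrate the choice on the worst-ranked vertices, while τ = 0 makes every rank equally likely, so τ controls the balance between greedy replacement and exploration.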
➢ Case Studies and Success Stories
• Real-World Applications: The Moth-Flame Optimization Algorithm has found applications in various domains, including marketing, public health, and social activism. Real-world examples showcase its effectiveness.
• Notable Results: Influence maximization campaigns powered by this algorithm have yielded impressive outcomes. From viral marketing to behavior change initiatives, success stories abound.
4. Simulation and calculation results
This section presents the simulation and assessment of the suggested MFA-based technique. The simulation and the different test procedures are described in the following subsections. All tests were carried out on an Acer laptop with a Core i5 2.67 GHz CPU and 6 GB of main memory.
4.1. Criteria and simulation of the suggested method
The suggested MFA process was simulated and assessed in MATLAB. Because the suggested method is built on the MFA, its outcomes were assessed for convergence, as shown below. The outcomes were also compared with those of PSO [45], GA [23], DDSE [51], and Linear Programming based Diffusion Models (LPDM) [59]. Hao et al. introduced three measures, namely accuracy, sensitivity, and F1 value, to quantitatively evaluate the results of influence detection. The classification outcomes on which these criteria are based are defined in Table 1 [60,61]:
Table 1.
Classification of evaluation criteria.
| Main classes | Classified as N | Classified as P |
|---|---|---|
| N | TN | FP |
| P | FN | TP |
True Positive (TP): Nodes that are correctly detected as influence nodes in the network.
False Positive (FP): Nodes that are incorrectly identified as influence nodes in the network.
True Negative (TN): Nodes that are correctly detected as non-influencing nodes.
False Negative (FN): Nodes that are mistakenly recognized as non-influencing nodes in the network.
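The four outcomes of Table 1 can be counted directly from node sets. The sketch below assumes that a ground-truth set of influence nodes is available for comparison; the function name `confusion_counts` and its arguments are hypothetical.

```python
def confusion_counts(all_nodes, true_influencers, predicted_influencers):
    """Count TP, FP, TN, and FN for a predicted set of influence nodes."""
    universe = set(all_nodes)
    truth = set(true_influencers)
    predicted = set(predicted_influencers)
    return {
        "TP": len(truth & predicted),             # influence nodes found
        "FP": len(predicted - truth),             # wrongly flagged nodes
        "FN": len(truth - predicted),             # influence nodes missed
        "TN": len(universe - truth - predicted),  # correctly left out
    }
```

For ten nodes numbered 0 to 9, true influence nodes {0, 1, 2}, and predicted nodes {1, 2, 3, 4}, this gives TP = 2, FP = 2, FN = 1, and TN = 5.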
4.2. Datasets
Different datasets are used to evaluate the performance of the proposed method. The Email, Facebook, and Twitter graphs have been used for social monitoring.
➢ Email
The email dataset is the graph of an email social network. This graph contains 118 vertices and 200 edges and is available at https://snap.stanford.edu/data/.
➢ Facebook
The Facebook dataset is the graph of the Facebook social network, taken from the Facebook website; the accuracy, sensitivity, and F1 value graphs are based on it. This graph contains 1133 vertices and 5451 edges and is available at https://snap.stanford.edu/data/.
➢ Twitter
The Twitter dataset is the graph of the Twitter social network, taken from the Twitter website. This graph contains 4038 vertices and 88234 edges and is available at https://snap.stanford.edu/data/.
PSO, GA, DDSE, and LPDM are used in this research as comparison methods to assess the efficiency of the proposed method. The proposed algorithm was executed 30 times on 17 graphs. The results of the implementation are given in the next section. The three standard datasets used for the experiments (Email, Facebook, and Twitter) are summarized in Table 2.
Table 2.
Dataset specifications.
| Social network | Nodes | Edges | Society |
|---|---|---|---|
| Email | 118 | 200 | 8 |
| Facebook | 1133 | 5451 | 14 |
| Twitter | 4038 | 88234 | 8 |
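Node and edge counts such as those in Table 2 can be checked with a small loader. The sketch assumes the common plain-text edge-list layout (one whitespace-separated pair of node identifiers per line, with optional `#` comment lines) and treats the graph as undirected; the names `load_edge_list` and `graph_size` are hypothetical.

```python
def load_edge_list(lines):
    """Build an undirected adjacency map from 'u v' lines of an edge list."""
    adjacency = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        u, v = line.split()[:2]
        if u == v:
            continue  # ignore self-loops in this sketch
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def graph_size(adjacency):
    """Return (number of nodes, number of undirected edges)."""
    n_nodes = len(adjacency)
    n_edges = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    return n_nodes, n_edges
```

To use it on a downloaded file, pass the open file object, for example `with open(path) as handle: adjacency = load_edge_list(handle)`, where `path` points to the local copy of the dataset.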
4.3. Accuracy
The suggested technique has been compared with PSO, GA, DDSE, and LPDM. Accuracy is measured as the ratio of correct classifications to all inputs. This experiment was carried out ten times with various inputs, and for each repetition, the accuracy was computed using Eq. (11).
| $\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$ | (11) |
This measure indicates how well the system correctly recognizes the states of users and non-users. The simulation was evaluated in four stages, and the result can be seen in Fig. 5. The suggested strategy outperformed the first three strategies in this criterion.
Fig. 5.
Comparing the accuracy of the proposed method with other algorithms (GA, PSO, DDSE, and LPDM).
4.4. Precision
Precision is one of the crucial criteria for algorithms that find the maximum influence in social networks, and it is determined using Eq. (12).
| $\text{Precision} = \dfrac{TP}{TP + FP}$ | (12) |
Here, TP is the amount of data correctly identified as positive, and FP is the amount of data erroneously identified as positive [62]. Fig. 6 displays the simulation outcome for this criterion.
Fig. 6.
Comparing the precision of the proposed method with other algorithms (GA, PSO, DDSE, and LPDM).
The suggested technique is less effective than the other two methods in the first two phases of the simulation, but it performs well as the amount of data increases.
4.5. Recall
The next criterion is recall, which is calculated using Eq. (13).
| $\text{Recall} = \dfrac{TP}{TP + FN}$ | (13) |
Here, TP is the amount of data correctly recognized as positive, and FN is the amount of data falsely recognized as negative. Fig. 7 presents the simulation outcomes for this criterion. Based on these findings, the proposed approach performs better than its counterparts.
Fig. 7.
Comparing the recall of the proposed method with other algorithms (GA, PSO, DDSE, and LPDM).
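The three criteria of Sections 4.3 to 4.5 can be computed from the counts of Table 1 as sketched below. The sketch uses the standard definitions of accuracy, precision, and recall, adds the F1 value as the harmonic mean of precision and recall, and returns 0.0 when a denominator is zero; the function name `evaluation_metrics` and the zero-division convention are assumptions made for the example.

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall, and F1 from confusion counts."""
    def ratio(numerator, denominator):
        # assumption: report 0.0 instead of failing on an empty denominator
        return numerator / denominator if denominator else 0.0

    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With TP = 2, FP = 2, TN = 5, and FN = 1, the accuracy is 7/10, the precision is 2/4, the recall is 2/3, and the F1 value is 4/7.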
4.6. Comparison to other methods
The proposed algorithm has been compared with the PSO, GA, and DDSE algorithms and with LPDM in solving the problem of finding the maximum influence. The comparison examines three aspects: the value of the Nseed_set parameter, convergence, and stability. The parameter Nseed_set is set to 5, 10, 15, and 20 for all graphs, and the methods are compared in terms of the convergence criterion, which is the main goal of this article.
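A comparison over several seed-set sizes can be organized as a simple sweep. The sketch below is an assumption-laden illustration of such a protocol: `top_degree_seeds` is a stand-in selector based on node degree, not the MFA-based method or any of the compared algorithms, and `sweep_seed_sizes` is a hypothetical helper that records the selected seeds and the wall-clock time for each size.

```python
import time


def top_degree_seeds(adjacency, k):
    """Stand-in selector: the k highest-degree nodes (not the proposed method)."""
    return sorted(adjacency, key=lambda node: len(adjacency[node]), reverse=True)[:k]


def sweep_seed_sizes(adjacency, selector, sizes=(5, 10, 15, 20)):
    """Run a selector for each seed-set size and record seeds and elapsed seconds."""
    results = {}
    for k in sizes:
        start = time.perf_counter()
        seeds = selector(adjacency, k)
        elapsed = time.perf_counter() - start
        results[k] = {"seeds": seeds, "seconds": elapsed}
    return results
```

Any selector with the signature `selector(adjacency, k)` can be passed in, so different methods can be timed under the same conditions.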
➢ Convergence Results to the Optimal Response
To test convergence, the proposed method, PSO, GA, and DDSE were run on the Twitter graph; Fig. 8, Fig. 9, and Fig. 10 show how each converges to the final solution for Nseed_set = 5, 10, and 15, respectively. In these graphs, the horizontal axis shows the iteration number and the vertical axis shows the best fitness at each iteration. The red curve shows the results of the algorithm proposed in this research.
Fig. 8.
Comparison of convergence of algorithms in Nseed_set = 5.
Fig. 9.
Comparison of convergence of algorithms in Nseed_set = 10.
Fig. 10.
Comparison of convergence of algorithms in Nseed_set = 15.
The results indicate that the suggested algorithm converges well to the answer. Even for graphs on which the maximum influence is not found, the average influence size is greater, which shows that the algorithm has a good convergence response. The study's findings show that the suggested approach has a high degree of convergence and identifies an influence that is close to ideal. Additionally, the analysis shows that the suggested technique finds the greatest influence faster than PSO, GA, and DDSE.
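A convergence curve of the kind plotted in Figs. 8 to 10 (iteration on the horizontal axis, best fitness on the vertical axis) can be derived from the fitness values recorded at each iteration. The sketch assumes that higher fitness is better and that the fitness of every moth is logged per iteration; the names `best_per_iteration` and `running_best` are hypothetical.

```python
def best_per_iteration(history, maximize=True):
    """history: one list of population fitness values per iteration."""
    pick = max if maximize else min
    return [pick(population) for population in history]


def running_best(values, maximize=True):
    """Best value seen so far at each iteration (a monotone curve)."""
    curve = []
    best = None
    for value in values:
        if best is None or (value > best if maximize else value < best):
            best = value
        curve.append(best)
    return curve
```

Plotting `running_best(best_per_iteration(history))` against the iteration index gives a non-decreasing curve whose plateau indicates the iteration at which the search has effectively converged.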
➢ Comparison of the Execution Time
Table 3 contrasts the execution time of the suggested strategy with that of comparable approaches. The table shows that, compared with similar methods, the suggested method is fast enough to offer a real-time feature.
Table 3.
Comparison of the results of the proposed method in terms of execution time.
| Method | Time (s), Nseed_set = 5 | Time (s), Nseed_set = 10 | Time (s), Nseed_set = 15 |
|---|---|---|---|
| PSO algorithm | 9.6 | 10.2 | 11.05 |
| DDSE algorithm | 5.7 | 6.8 | 7.2 |
| GA algorithm | 5.10 | 5.95 | 6.03 |
| LPDM | 5.25 | 2.89 | 6.01 |
| Suggested method | 4.63 | 5.36 | 5.89 |
Real-time applications impose time and storage limitations. In the proposed method, a search is made for each influence pattern to find all associated influences before any warning is issued. The test results validate the method's accuracy and efficiency, particularly with respect to processing time. Notably, the method executes in 5.36 s for Nseed_set = 10 (Table 3) on a standard home computer, thereby achieving real-time performance.
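The execution times in Table 3 can be turned into relative reductions with a few lines of arithmetic. The values below are copied as printed in Table 3; the helper `percent_reduction` and the dictionary layout are assumptions made for the example, and a negative result means that the baseline was faster in that cell.

```python
# Execution times in seconds for Nseed_set = 5, 10, 15, as printed in Table 3.
BASELINE_TIMES = {
    "PSO": (9.6, 10.2, 11.05),
    "DDSE": (5.7, 6.8, 7.2),
    "GA": (5.10, 5.95, 6.03),
    "LPDM": (5.25, 2.89, 6.01),
}
SUGGESTED_TIMES = (4.63, 5.36, 5.89)


def percent_reduction(baseline, ours):
    """Relative time saved with respect to a baseline, in percent."""
    return 100.0 * (baseline - ours) / baseline


REDUCTIONS = {
    method: tuple(percent_reduction(b, s) for b, s in zip(times, SUGGESTED_TIMES))
    for method, times in BASELINE_TIMES.items()
}
```

For example, against PSO at Nseed_set = 5 the reduction is 100 × (9.6 − 4.63) / 9.6, which is about 51.8 %.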
Table 4 also compares the number of solutions found by the basic GA article [63], the PSO algorithm, and the proposed method for influence numbers 15 and 30.
Table 4.
Comparison of the number of solutions.
| Influence number | Method | | | |
|---|---|---|---|---|
| 15 | Basic article of GA | 49 | 75 | 118 |
| 15 | PSO algorithm | 53 | 69 | 123 |
| 15 | Suggested method | 60 | 70 | 146 |
| 30 | Basic article of GA | 68 | 10 | 158 |
| 30 | PSO algorithm | 72 | 54 | 162 |
| 30 | Suggested method | 79 | 88 | 179 |
According to Table 4, the greater the influence number, the easier it is to provide a solution and the less time is required; the operation of finding the influence is also less delayed. The proposed method shows an improvement of about 10 % in mode 15 and about 7 % in mode 30 compared with the original article. In this scenario, the duration of program execution and of result extraction was evaluated. The shorter the execution time, the higher the quality of the algorithm.
5. Conclusion and suggestions
Influence maximization remains a crucial objective in the realm of social networks. The incorporation of the Moth-Flame Algorithm (MFA) introduces novel avenues for achieving precision, efficiency, and adaptability in this domain. As society's digital footprint continues to expand, mastering influence becomes an increasingly potent tool for making an impact. This research presents a novel framework for identifying maximum influence in social networks based on the MFA. The primary objective of the method is to identify the maximum influence within a graph while maintaining good fitness. The method's efficacy was demonstrated across various settings, accompanied by the corresponding graphs. Using the MFA reduces the time required to find the maximum influence; it does not yield an exact answer but provides an approximation that closely approaches optimality.
Each new algorithm must be compared with its predecessors to ascertain its performance. Consequently, this study compared the outcomes with those of PSO, GA, DDSE, and LPDM. The comprehensive analysis of the results shows that the proposed method surpasses these approaches in terms of both speed and accuracy. Notably, the algorithm's standout feature is its rapid execution, at 5.36 s on a standard laptop. The MFA also exhibits robust stability, ensuring that the final result closely approximates the ideal solution, and it demonstrates a commendable convergence rate, as evidenced by the convergence graphs. The proposed method was tested in various scenarios involving graphs of differing sizes and numbers, and it consistently demonstrated accurate and reliable performance across diverse social networks.
Variations in edge and vertex numbers were accounted for, and the method exhibited resilience in accommodating these changes. One of the standout advantages of the proposed method is its superior detection accuracy in influence maximization within social networks. It excels in making correct predictions, which significantly enhances its overall utility. Nonetheless, it is essential to acknowledge that no technique is without limitations. Factors such as initial assumptions, data quality, and network dynamics can influence the precision of results.
The proposed method holds substantial promise for uncovering maximum influence in social networks, particularly when the search for influence nodes is carried out in approximate locations using the MFA approach. As technology advances, the integration of AI and machine learning can further improve the accuracy and efficiency of this method. In light of the results and materials presented in this research, the following suggestions for future work are offered:
➢ Defining different objective functions for finding the maximum influence and using multi-objective algorithms.
➢ Finding the maximum influence by integrating the MFA with other methods, such as Kalman filtering [64] and deep learning [65].
➢ Defining different parameters on influence and comparing them based on different evolutionary algorithms.
➢ Using CNNs [66,67] and other meta-heuristic methods to maximize influence.
➢ Using large-scale datasets such as Amazon, YouTube, As022july, and Conamate.
Data availability statement
All data are reported in the paper.
CRediT authorship contribution statement
Qi Cui: Writing – review & editing, Writing – original draft, Methodology, Data curation, Conceptualization. Feng Liu: Writing – review & editing, Visualization, Validation, Supervision, Resources, Project administration, Methodology, Investigation, Formal analysis, Conceptualization.
Declaration of competing interest
I hereby declare that: I have no pecuniary or other personal interest, direct or indirect, in any matter that raises or may raise a conflict with my duties.
References
- 1. Ni Q., et al. Continuous influence-based community partition for social networks. IEEE Transactions on Network Science and Engineering. 2021;9(3):1187–1197.
- 2. Ni Q., Guo J., Wu W., Wang H. Influence-based community partition with sandwich method for social networks. IEEE Transactions on Computational Social Systems. 2022;10(2):819–830.
- 3. Praveena A., Smys S. 2017 International Conference of Electronics, Communication and Aerospace Technology (ICECA) IEEE; 2017. Ensuring data security in cloud based social networks.
- 4. Darbandi M. Proposing new intelligent system for suggesting better service providers in cloud computing based on Kalman filtering. Published by HCTL International Journal of Technology Innovations and Research. 2017;24(1):1–9. ISSN: 2321-1814.
- 5. Darbandi M. Kalman filtering for estimation and prediction servers with lower traffic loads for transferring high-level processes in cloud computing. Published by HCTL International Journal of Technology Innovations and Research. 2017;23(1):10–20. ISSN: 2321-1814.
- 6. Chen W., Wang Y., Yang S. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2009. Efficient influence maximization in social networks.
- 7. Wang S., Tan X. Determining seeds with robust influential ability from multi-layer networks: a multi-factorial evolutionary approach. Knowl. Base Syst. 2022;246.
- 8. Chen J., et al. A review of vision-based traffic semantic understanding in ITSs. IEEE Trans. Intell. Transport. Syst. 2022.
- 9. Bucur D., Iacca G. European Conference on the Applications of Evolutionary Computation. Springer; 2016. Influence maximization in social networks with genetic algorithms.
- 10. Anugerah A.R., Muttaqin P.S., Trinarningsih W.J.H. 2022. Social Network Analysis in Business and Management Research: A Bibliometric Analysis of the Research Trend and Performance from 2001 to 2020.
- 11. Li B., et al. Dynamic event-triggered security control for networked control systems with cyber-attacks: a model predictive control approach. Inf. Sci. 2022;612:384–398.
- 12. Qian L., et al. A new method of inland water ship trajectory prediction based on long short-term memory network optimized by genetic algorithm. Appl. Sci. 2022;12(8):4073.
- 13. Cheng B., et al. Situation-aware dynamic service coordination in an IoT environment. IEEE/ACM Trans. Netw. 2017;25(4):2082–2095.
- 14. Zhang H., et al. Security defense decision method based on potential differential game for complex networks. Comput. Secur. 2023;129.
- 15. Peng Y., Zhao Y., Hu J. On the role of community structure in evolution of opinion formation: a new bounded confidence opinion dynamics. Inf. Sci. 2023;621:672–690.
- 16. Zheng Y., et al. A lightweight ship target detection model based on improved YOLOv5s algorithm. PLoS One. 2023;18(4). doi: 10.1371/journal.pone.0283932.
- 17. Ma X., et al. Real-time assessment of asphalt pavement moduli and traffic loads using monitoring data from Built-in Sensors: optimal sensor placement and identification algorithm. Mech. Syst. Signal Process. 2023;187.
- 18. Sadough S.M.S., Chamideh Z., Khalighi M.A. Efficient signal detection for cognitive radio relay networks under imperfect channel estimation. Transactions on Emerging Telecommunications Technologies. 2016;27(11):1593–1605.
- 19. Cao B., et al. Security-aware industrial wireless sensor network deployment optimization. IEEE Trans. Ind. Inf. 2019;16(8):5309–5316.
- 20. Zareie A., Sheikhahmadi A. A hierarchical approach for influential node ranking in complex social networks. Expert Syst. Appl. 2018;93:200–211.
- 21. Liu X. Real-world data for the drug development in the digital era. Journal of Artificial Intelligence and Technology. 2022;2(2):42–46.
- 22. Sun H., et al. Proceedings of the 10th International Conference on Ubiquitous Information Management and Communication. 2016. Multiple influence maximization in social networks.
- 23. Zhang K., Du H., Feldman M.W. Maximizing influence in a social network: improved results using a genetic algorithm. Phys. Stat. Mech. Appl. 2017;478:20–30.
- 24. Mirjalili S. Moth-flame optimization algorithm: a novel nature-inspired heuristic paradigm. Knowl. Base Syst. 2015;89:228–249.
- 25. Yıldız B.S., Yıldız A.R. Moth-flame optimization algorithm to determine optimal machining parameters in manufacturing processes. Mater. Test. 2017;59(5):425–429.
- 26. Belkhiri L., Kim T.-J. Individual influence of climate variability indices on annual maximum precipitation across the global scale. Water Resour. Manag. 2021;35(9):2987–3003.
- 27. Guo F., Zhou W., Lu Q., Zhang C. Path extension similarity link prediction method based on matrix algebra in directed networks. Comput. Commun. 2022;187:83–92.
- 28. Wu H., Jin S., Yue W. Pricing policy for a dynamic spectrum allocation scheme with batch requests and impatient packets in cognitive radio networks. J. Syst. Sci. Syst. Eng. 2022;31(2):133–149.
- 29. Liu X., et al. Developing multi-labelled corpus of twitter short texts: a semi-automatic method. Systems. 2023;11(8):390.
- 30. Lu S., et al. The multi-modal fusion in visual question answering: a review of attention mechanisms. PeerJ Computer Science. 2023;9:e1400. doi: 10.7717/peerj-cs.1400.
- 31. Bai X., He Y., Xu M. Low-thrust reconfiguration strategy and optimization for formation flying using Jordan normal form. IEEE Trans. Aero. Electron. Syst. 2021;57(5):3279–3295.
- 32. Amiri Z., et al. The personal health applications of machine learning techniques in the internet of behaviors. Sustainability. 2023;15(16).
- 33. Souri A., Nourozi M., Rahmani A.M., Jafari Navimipour N. A model checking approach for user relationship management in the social network. Kybernetes. 2019;48(3):407–423.
- 34. Wang S., Liu J., Jin Y. Finding influential nodes in multiplex networks using a memetic algorithm. IEEE Trans. Cybern. 2019;51(2):900–912. doi: 10.1109/TCYB.2019.2917059.
- 35. Li J., et al. Community-diversified influence maximization in social networks. Inf. Syst. 2020;92.
- 36. Wang S., Tan X. A Memetic algorithm for determining robust and influential seeds against structural perturbances in competitive networks. Inf. Sci. 2023;621:389–406.
- 37. Wang S., Tan X. Solving the robust influence maximization problem on multi-layer networks via a Memetic algorithm. Appl. Soft Comput. 2022;121. doi: 10.3390/s22062191.
- 38. Liang Z., He Q., Du H., Xu W. Targeted influence maximization in competitive social networks. Inf. Sci. 2023;619:390–405.
- 39. Beni H.A., et al. A fast module identification and filtering approach for influence maximization problem in social networks. Inf. Sci. 2023;640.
- 40. Dong C., Xu G., Yang P., Meng L. TSIFIM: a three-stage iterative framework for influence maximization in complex networks. Expert Syst. Appl. 2023;212.
- 41. Bouyer A., et al. FIP: a fast overlapping community-based Influence Maximization Algorithm using probability coefficient of global diffusion in social networks. Expert Syst. Appl. 2023;213.
- 42. Bouyer A., Beni H.A. Influence maximization problem by leveraging the local traveling and node labeling method for discovering most influential nodes in social networks. Phys. Stat. Mech. Appl. 2022;592.
- 43. Tang J., et al. Maximizing the influence spread in social networks: a learning-automata-driven discrete butterfly optimization algorithm. Symmetry. 2022;15(1):117.
- 44. Fu B., et al. A differential evolutionary influence maximization algorithm based on network discreteness. Symmetry. 2022;14(7):1397.
- 45. Gong M., et al. Influence maximization in social networks based on discrete particle swarm optimization. Inf. Sci. 2016;367:600–614.
- 46. Waziri T.A., Yakasai B.M. Assessment of some proposed replacement models involving moderate fix-up. Journal of Computational and Cognitive Engineering. 2023;2(1):28–37.
- 47. Domingos P., Richardson M. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2001. Mining the network value of customers.
- 48. Kempe D., Kleinberg J., Tardos É. Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2003. Maximizing the spread of influence through a social network.
- 49. Xu W., Lu Z., Wu W., Chen Z. A novel approach to online social influence maximization. Social Network Analysis and Mining. 2014;4(1):1–13.
- 50. Wang G., et al. A modified firefly algorithm for UCAV path planning. Int. J. Hospit. Inf. Technol. 2012;5(3):123–144.
- 51. Cui L., et al. DDSE: a novel evolutionary algorithm based on degree-descending search strategy for influence maximization in social networks. J. Netw. Comput. Appl. 2018;103:119–130.
- 52. Chen R., Pu D., Tong Y., Wu M. Image‐denoising algorithm based on improved K‐singular value decomposition and atom optimization. CAAI Transactions on Intelligence Technology. 2022;7(1):117–127.
- 53. Adil E., Mikou M., Mouhsen A. A novel algorithm for distance measurement using stereo camera. CAAI Transactions on Intelligence Technology. 2022;7(2):177–186.
- 54. Basak S., Dey B., Bhattacharyya B. Demand side management for solving environment constrained economic dispatch of a microgrid system using hybrid MGWOSCACSA algorithm. CAAI Transactions on Intelligence Technology. 2022;7(2):256–267.
- 55. Cao B., Zhao J., Lv Z., Yang P. Diversified personalized recommendation optimization based on mobile data. IEEE Trans. Intell. Transport. Syst. 2020;22(4):2133–2139.
- 56. Shi J., et al. Optimal adaptive waveform design utilizing an end-to-end learning-based pre-equalization neural network in an UVLC system. J. Lightwave Technol. 2022;41(6):1626–1636.
- 57. Shi Y., Hu J., Wu Y., Ghosh B.K. Intermittent output tracking control of heterogeneous multi-agent systems over wide-area clustered communication networks. Nonlinear Analysis: Hybrid Systems. 2023;50.
- 58. Goldenberg J., Libai B., Muller E. Using complex systems analysis to advance marketing theory development: modeling heterogeneity effects on new product growth through stochastic cellular automata. Acad. Market. Sci. Rev. 2001;9(3):1–18.
- 59. Zhang P. Linear programming for influence maximization problems in social networks based on diffusion models. Highlights in Science, Engineering and Technology. 2023;38:327–332.
- 60. Masías V.H., et al. Modeling verdict outcomes using social network measures: the watergate and caviar network cases. PLoS One. 2016;11(1). doi: 10.1371/journal.pone.0147248.
- 61. Gruszczyński W., Arabas P. Application of social network inferred data to churn modeling in telecoms. Journal of Telecommunications and Information Technology. 2016;(2):77–86.
- 62. Jia Z., Wang W., Zhang J., Li H. Contact high-temperature strain automatic calibration and precision compensation research. Journal of Artificial Intelligence and Technology. 2022;2(2):69–76.
- 63. Nemhauser G.L., Wolsey L.A., Fisher M.L. An analysis of approximations for maximizing submodular set functions—I. Math. Program. 1978;14(1):265–294.
- 64. Darbandi M. Proposing new intelligence algorithm for suggesting better services to cloud users based on kalman filtering. Published by Journal of Computer Sciences and Applications. 2017;5(1):11–16. (ISSN: 2328-7268)
- 65. Wang X., et al. 2020. Block Switching: a Stochastic Approach for Deep Learning Security. arXiv preprint arXiv:2002.07920.
- 66. Zheng M., et al. A hybrid CNN for image denoising. Journal of Artificial Intelligence and Technology. 2022;2(3):93–99.
- 67. Shakeel N., Shakeel S. Context-free word importance scores for attacking neural networks. Journal of Computational and Cognitive Engineering. 2022;1(4):187–192.