Abstract
During the coronavirus (SARS-CoV2) pandemic, the number of fraudulent transactions has been expanding at an alarming rate (in a dataset of 7,352,421 online transaction records). Additionally, Master Card (MC) usage is increasing. To avoid massive losses, finance companies must constantly improve their management information systems for discovering MC fraud. In this paper, an advanced management information system for discovering MC fraud was developed using sequential data modeling based on intelligent forecasting methods such as deep learning and intelligent supervised machine learning (ISML). The Long Short-Term Memory Network (LSTM), Logistic Regression (LR), and Random Forest (RF) were used. The dataset is separated into two parts, training and testing data, with a ratio of 8:2. The system was evaluated using 10-fold cross-validation based on recall, F1-score, precision, Mean Absolute Error (MAE), Receiver Operating Curve (ROC), and Root Mean Square Error (RMSE). Finally, various resampling techniques were used to forecast whether an MC transaction is genuine or fraudulent. Performance without re-sampling, with under-sampling, and with over-sampling is measured for each algorithm. The highest value without re-sampling was 0.829 (F score, RF). With under-sampling, it was 0.871 (RMSE, LSTM). With over-sampling, it was 0.921 for both RF (precision) and LSTM (F score). The results revealed that using a resampling technique with the deep learning LSTM generated better results than intelligent supervised machine learning.
Keywords: Supervised machine learning, Fraud discovering, Master card, Long short-term memory LSTM, SARS-CoV2
Abbreviations table
| Abbreviation | Definition |
|---|---|
| SARS-CoV2 | Coronavirus |
| MC | Master Card |
| LR | Logistic Regression |
| MAE | Mean Absolute Error |
| RMSE | Root Mean Square Error |
| LSTM | Long Short-Term Memory Network |
| ISML | Intelligent supervised machine learning |
| RF | Random Forest |
| ROC | Receiver Operating Curve |
| OTP | One-Time Password |
1. Introduction
In the wake of the coronavirus outbreak, MC fraud is on the rise because of vulnerabilities that have made it more difficult to prevent (Bandyopadhyay & Dutta, 2020; Sadgali, Sael & Benabbou, 2019). The challenge of preventing fraud has increased dramatically. Customers who make purchases through websites frequently use online transactions. Paying with electronic funds eliminates the time-consuming and expensive practice of collecting cash from customers when buying and selling goods online (Sherstinsky, 2020). COVID-19 has caused significant mortality and led to shutdowns, during which use of the World Wide Web has been extensive. The volume of electronic transactions has also increased significantly (Hu et al., 2021). Even the latest technological developments can be hacked. Therefore, new strategies for detecting fraud must be developed urgently; this is the driving force for this research. Visa, for example, is relying on technical solutions such as artificial intelligence to combat MC fraud (Liu, Jin & Shen, 2019).
The popularity of e-commerce websites for acquiring varied goods at an affordable or fair cost has a favorable effect on the growth of the target market (Gupta et al., 2022; Li & Yan, 2022). Almost any sort of payment can be made with the MC payment system. People who have a phone with an online transaction feature can make whatever kind of payment they choose. A mobile device is necessary to receive a One-Time Password (OTP). MC transactions have become ubiquitous in recent years as technology has advanced and novel e-service payment alternatives, such as mobile and e-commerce payments, have emerged (Jain, Sarupria & Kothari, 2020).
In the current circumstances, where the world is dealing with COVID, credit card transactions are growing more prevalent. Authorities in a number of countries are now urging citizens to avoid using cash wherever feasible, although in practice this is not possible for every transaction. Due to the rise in cashless transactions during the COVID lockdown, the number of fraudulent transactions is also rising. Data on a customer's previous transactions can be used to spot patterns of fraudulent behavior (Shadmi et al., 2020). During COVID-19, banks and credit card firms use a variety of methods, including data mining (DM), decision trees, rule-based mining, neural networks, fuzzy clustering, and ML, in an effort to catch fraudsters. Based on previous activity, these techniques attempt to determine a customer's regular usage pattern (Adday, Shaban, Jawad, Jaleel & Zahra, 2021; Ashtiani & Raahemi, 2021; Khan, Ateeq, Ali & Butt, 2021). The purpose of this research is to propose a mechanism for detecting such fraudulent transactions during an uncontrolled pandemic.
The proposed model functions well, with a low rate of false alarms. MC payment fraud detection can be accomplished with both supervised and unsupervised methods. Unsupervised methods such as K-Means, EM, Farthest First, X-Means, and density-based clustering have been applied to financial data, while SVM, logistic regression, Naïve Bayes, OneR, C4.5, decision trees, random forests, and random trees have been used for financial fraud detection (Abu Alfeilat et al., 2019; Ali et al., 2021). Neural networks, decision trees, and Bayesian belief networks were found to be 72.5 percent, 77.5 percent, and 88.9 percent effective, respectively, in detecting financial statement fraud in a sample of Greek industrial enterprises (Berahmand, Nasiri & Li, 2021). Another analysis used the Classification and Regression Tree to identify fake financial statements (Nasiri, Berahmand, Rostami & Dabiri, 2021). Six machine learning techniques, including LR, SVM, artificial neural networks, C4.5, bagging, and stacking, were compared and explored (Ebadi, Hosseini & Hosseini, 2017). According to the results of an experimental study, LR and SVM outperform the other classifier models considered (Jamali, Sadegheih, Lotfi, Wood & Ebadi, 2021).
Payment card fraud has been detected using a variety of machine learning algorithms, including supervised and unsupervised learning, anomaly detection, and ensemble learning (Popat & Chaudhary, 2018). In particular, supervised classification techniques, which use pre-classified datasets, have proven particularly helpful in dealing with this problem (Khatri, Arora & Agrawal, 2020). Past transactions are used to train a classifier, which in turn provides a detection model that can forecast whether a new transaction is fraudulent; examples include SVM algorithms, hidden Markov models, LR algorithms, decision trees, RF, and k-nearest neighbors (Hammed & Soyemi, 2020; Heydarpour, Abbasi, Ebadi & Karbassi, 2020). Unsupervised classification methods detect odd system behavior and transactions that do not match the model. They can assist in detecting novel fraud trends that have not been recognized previously. Artificial intelligence communities, on the other hand, are interested in MC fraud detection for a number of different reasons, one being that the datasets are skewed. For such skewed datasets, a large number of commonly used classifiers are incapable of identifying items that belong to the minority class.
There are a number of different types of fraud-detection systems, but the most common relies on transactional data such as the amount, time, and location of the transaction. However, comprehensive sequential information defining the customer's profile is ignored (Brezočnik, Fister Jr, & Podgorelec, 2018; Ganji & Mannem, 2012). Such methods are insufficient for detecting MC fraud because they do not evaluate the customer's spending behavior, which is crucial for finding fraud trends that are relevant and that change over time owing to new attack methods and seasonality.
Recurrent neural network based deep learning methods, particularly the LSTM variant, are among the most precise sequence-analysis learning algorithms and have recently been applied to fraud detection. A recurrent neural network is a dynamic ML method that can model the sequential dependence between credit card transactions, since different bank accounts can exhibit dynamic temporal behavior. More recently, the attention mechanism has made it possible to find context-dependent representations.
This method considers dependencies between elements in a sequence regardless of their distance (Carcillo, Le Borgne, Caelen & Bontempi, 2018; Vaughan, 2020). Image captioning and machine translation have both benefited greatly from its use. The attention mechanism takes a weighted mean of the sequence of vectors to construct the context vector (which contains the most significant information), which is then used as input to the next layer (Benchaji, Douzi, El Ouahidi & Jaafari, 2021).
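The weighted mean described above can be sketched in a few lines. This is a minimal illustration, not the attention layer of any cited work: the relevance scores are hypothetical inputs here (in a trained network they would be learned), and `softmax`, `context_vector`, and `hidden_states` are names introduced only for this example.

```python
import math


def softmax(scores):
    """Turn raw relevance scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(vectors, scores):
    """Weighted mean of a sequence of equal-length vectors."""
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


# Three hypothetical hidden states, one per consecutive transaction.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Equal scores give equal weights, so the context is the plain mean.
uniform_context = context_vector(hidden_states, [0.0, 0.0, 0.0])

# A much larger score on the last state pulls the context toward it.
focused_context = context_vector(hidden_states, [0.0, 0.0, 10.0])
```

With equal scores the context vector is the ordinary average of the states; raising one score shifts the weight toward that element regardless of its position in the sequence.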
The objectives of this research include, but are not limited to, the following:
- The model outperforms the competition in terms of detecting fraudulent transactions.
- LSTM algorithms will be used to identify fraudulent transactions during the shutdown period.
- ML will be used to identify the types of transactions that are most likely to be fraudulent.
- AI will be used because it can constantly modify and update its rules for new transactions.
The goal of this study is to develop a fraud detection model for MC transactions that is both efficient and error-free. This model is implemented using LR, RF, and LSTM approaches.
These methods are useful because the financial dataset contains hierarchical features. The precise goals of this study are summarized as follows. First, to identify fraudulent transactions during the COVID-induced shift in MC's work environment. Second, to protect clients from financial loss by identifying fraud and informing them, as well as the bank, so that appropriate action can be taken. Third, to predict fraud using LR, RF, and LSTM. Fourth, to develop a fraud detection model for MC companies that is both efficient and error-free.
The remainder of this paper is organized as follows: Section 2 presents the problem statement, Section 3 describes the methods, and Sections 4 and 5 present the results and discussion.
2. Problem statement
MC fraud is an ever-growing concern in today's banking system. In recent years, the number of fraudulent operations has risen dramatically, causing enormous financial losses for many companies and government agencies (Khan et al., 2022). The numbers are likely to keep rising; as a result, many academics in this field have concentrated their efforts on early detection of fraudulent conduct using powerful machine learning approaches (Singh, Ranjan & Tiwari, 2022). MC fraud detection, however, is not an easy task, for two reasons: (i) each attempt at fraud usually results in a different set of fraudulent behaviors; (ii) the data are substantially skewed, i.e., samples of the majority class (genuine instances) greatly outnumber samples of the minority class (fraudulent cases).
When an input dataset with a very imbalanced class distribution is presented to the forecasting framework, the model tends to favor the majority samples (Kamaruddin & Ravi, 2016). As a result, it is more likely to misclassify a fraudulent transaction as legitimate.
To address this issue, a data-level strategy was used, with multiple resampling approaches such as under-sampling, over-sampling, and hybrid strategies, as well as an algorithmic technique using ensemble methods such as boosting and bagging, on a massive number of transactions in an extremely skewed dataset.
To determine whether a transaction is fraudulent or genuine, predictive models such as LR and RF have been used in conjunction with various resampling strategies (Choi & Lee, 2018). As MC has become the most widely used method of payment, particularly in the internet sector, fraudulent activities involving MC payment technology are on the rise (Nti et al., 2021).
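The two basic data-level strategies can be sketched as follows. This is a minimal illustration of random under-sampling and random over-sampling, assuming a 95:5 class ratio chosen only for the example; the paper does not state which resampling variants or ratios were used, and the function names are hypothetical.

```python
import random


def random_under_sample(majority, minority, seed=0):
    """Shrink the majority class to the minority size (no replacement)."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)


def random_over_sample(majority, minority, seed=0):
    """Grow the minority class to the majority size (with replacement)."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


# Illustrative imbalanced data: 95 genuine records and 5 fraudulent ones.
genuine = [("genuine", i) for i in range(95)]
fraud = [("fraud", i) for i in range(5)]

under_genuine, under_fraud = random_under_sample(genuine, fraud)
over_genuine, over_fraud = random_over_sample(genuine, fraud)
```

Under-sampling discards majority records, while over-sampling duplicates minority records; in both cases the two classes end up the same size. Resampling is normally applied to the training portion only, so that the test data keep the original class distribution.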
3. Methods
3.1. Long short-term memory networks
In the realm of deep learning, LSTM is an artificial recurrent neural network architecture for representing time series data. Unlike traditional feed-forward neural networks, the LSTM has feedback links between hidden units across discrete time steps, so transaction labels can be predicted based on long-term sequence dependencies learned from past transactions (Van Houdt, Mosquera & Nápoles, 2020). LSTM was created to solve the exploding and vanishing gradient problems that can occur during regular recurrent neural network training (Fischer & Krauss, 2018). The LSTM unit is made up of three gates that update a memory cell that stores information: the input, forget, and output gates (Fischer & Krauss, 2018). The three gates control the flow of information into and out of the cell, and the cell retains values over time. Fig. 1 shows the unit structure of LSTM (Jin, Wu & Guo, 2020). Because it addresses the vanishing gradient problem, LSTM is well suited to classifying, analyzing, and predicting time series with temporal lags of unknown duration. Back-propagation is used to train the deep learning model (Mohanty, Seth, Sanjay & Prithvi, 2022).
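To make the role of the three gates concrete, the sketch below computes one forward step of a standard LSTM cell with plain Python lists. It is an illustration under stated assumptions, not the network trained in this paper: the weights are fixed toy values rather than learned ones, and `lstm_step`, `affine`, `toy_gate`, and the `params` layout are names introduced only for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W, U, b, x, h):
    """Compute W @ x + U @ h + b with plain lists."""
    return [
        sum(w * xi for w, xi in zip(W[r], x))
        + sum(u * hi for u, hi in zip(U[r], h))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps a gate name to its (W, U, b)."""
    i = [sigmoid(z) for z in affine(*params["input"], x, h_prev)]
    f = [sigmoid(z) for z in affine(*params["forget"], x, h_prev)]
    o = [sigmoid(z) for z in affine(*params["output"], x, h_prev)]
    g = [math.tanh(z) for z in affine(*params["candidate"], x, h_prev)]
    # Forget part of the old cell state, then add the gated candidate.
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    # The output gate decides how much of the cell state is exposed.
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def toy_gate(n_hidden, n_input, value):
    """Hypothetical constant weights, used only to make the sketch runnable."""
    W = [[value] * n_input for _ in range(n_hidden)]
    U = [[value] * n_hidden for _ in range(n_hidden)]
    return W, U, [0.0] * n_hidden


GATES = ("input", "forget", "output", "candidate")
params = {name: toy_gate(2, 3, 0.1) for name in GATES}

# Feed two transactions (three scaled features each) through the cell.
h, c = [0.0, 0.0], [0.0, 0.0]
for x in [[0.2, 0.5, 0.1], [0.9, 0.1, 0.4]]:
    h, c = lstm_step(x, h, c, params)
```

The cell state `c` is carried from one transaction to the next, which is how information from earlier transactions can influence the prediction for a later one.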
Fig. 1.
R-NN vs LSTM.
3.2. 10-fold cross validation based hyper parameter tuning
A hyper-parameter is a configuration external to the model whose value cannot be estimated from the training data (Shekar & Dagnew, 2019). A model parameter, in contrast, is an internal configuration of the model whose value can be estimated during the training phase. Because the hyper-parameter is not part of the model, the practitioner must specify its value manually (Ozcan & Basturk, 2020).
However, to get the optimum performance from the model, the value must first be fine-tuned. Cross-validation was used to tune the hyper-parameters in this case (Duarte & Wainer, 2017). We employed K-fold cross-validation with K set to 10. In 10-fold cross-validation, the training dataset is separated into 10 folds; in each iteration, the current fold serves as the test set and the remaining folds serve as the training set. The model is then fitted on the training set and evaluated on the test set (Guo et al., 2019). We searched for the best hyper-parameters using scikit-learn's grid search function with cross-validation. Because every ML model has its own set of hyper-parameters, the search was different for each model (Al-Abdaly, Al-Taai, Imran & Ibrahim, 2021). When selecting hyper-parameters, performance is the primary factor in determining which values to use. This is one of the most important conditions for getting useful and reliable outputs from machine learning algorithms in practice (Le, Huynh, Yapp & Yeh, 2019). The model tuning process, the workflow, and things to consider are illustrated in the figure captioned "Model tuning process".
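The procedure can be sketched without any library as a fold generator plus an exhaustive search over candidate values. This is a simplified stand-in for scikit-learn's grid search, not a reproduction of it: the toy data, the 1-D k-nearest-neighbour scorer, and the candidate values `[1, 3, 5]` are assumptions made only for illustration, and the paper does not list the grids it searched.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx); each sample is tested exactly once."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def grid_search_cv(candidates, score_fn, n_samples, k=10):
    """Return (best_candidate, best_mean_score) over k folds."""
    best_value, best_score = None, float("-inf")
    for value in candidates:
        scores = [score_fn(value, tr, te) for tr, te in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_value, best_score = value, mean_score
    return best_value, best_score


# Toy 1-D data: the label is 1 when the feature exceeds 0.5.
X = [i / 40 for i in range(40)]
y = [1 if x > 0.5 else 0 for x in X]


def knn_accuracy(n_neighbors, train_idx, test_idx):
    """Accuracy on one fold of a majority vote among the nearest neighbours."""
    correct = 0
    for t in test_idx:
        nearest = sorted(train_idx, key=lambda j: abs(X[j] - X[t]))[:n_neighbors]
        vote = 1 if 2 * sum(y[j] for j in nearest) > len(nearest) else 0
        correct += int(vote == y[t])
    return correct / len(test_idx)


# The hyper-parameter being tuned is the number of neighbours.
best_k, best_acc = grid_search_cv([1, 3, 5], knn_accuracy, len(X), k=10)
```

Each candidate value is scored on all 10 folds and the candidate with the highest mean score is kept, which mirrors the selection rule described above.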
3.3. Logistic regression
It is one of the most commonly used ML algorithms for classification. Despite the name's inclusion of the word 'regression', it is not a regression algorithm. LR's name is derived from linear regression, the well-known ML method on which it was built and which is used to solve regression problems. LR expresses each outcome in terms of the probability that it will occur. As in linear regression, the weights and input variables of an LR model are combined linearly to produce a real-valued output. For clarity, let's assume there is only one independent and one dependent variable (Omondi-Ochieng, 2020).
LR makes use of a linear equation of this type as well. To limit the predicted real values to a range between 0 and 1, it applies the sigmoid (logistic) function to forecast the probability of each class.
Fig. 2 illustrates the sigmoid function. Suppose we have a classification problem with two variables: a dependent variable 'y' and an independent variable 'x'. By default, LR uses a threshold of 0.5: any probability above 0.5 is classified as class 1 and any probability below 0.5 as class 0. When necessary, this threshold can be modified (Gregova, Valaskova, Adamko, Tumpach & Jaros, 2020; Sadalia, Nasution & Muda, 2020).
Fig. 2.
Model tuning process.
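The single-variable case of logistic regression described in Section 3.3 can be written out directly. The sketch below is illustrative only: the weight and bias are hypothetical values rather than fitted coefficients, and the function names are introduced for this example.

```python
import math


def sigmoid(z):
    """Logistic function; maps any real number into the interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative z
    return e / (1.0 + e)


def predict_proba(x, weight, bias):
    """Probability of class 1 for one independent variable x."""
    return sigmoid(weight * x + bias)


def predict_class(x, weight, bias, threshold=0.5):
    """Class 1 if the probability exceeds the threshold, otherwise class 0."""
    return 1 if predict_proba(x, weight, bias) > threshold else 0


# Hypothetical coefficients, chosen only to show the effect of the threshold.
default_label = predict_class(2.0, weight=1.0, bias=0.0)
strict_label = predict_class(2.0, weight=1.0, bias=0.0, threshold=0.9)
```

For `x = 2.0` the probability is about 0.88, so the default threshold assigns class 1 while a stricter threshold of 0.9 assigns class 0. Moving the threshold in this way trades precision against recall.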
3.4. Random forest
RF is an ensemble learning approach that can be applied to classification and regression problems. It is an extension of bagging. The bagging technique combines a large number of weak learners. In RF, the weak learners are decision trees. So, before delving into the specifics of RF, let's review the fundamentals of decision trees. A decision tree is a supervised learning technique that can be employed for regression and classification; however, classification is its most common use. Its structure is made up of several internal nodes, each representing a test on a certain attribute (e.g., whether the weather will be sunny, overcast, or rainy tomorrow). Each branch represents a different test outcome, and the leaf nodes are the final results (class labels). Building the tree entails segmenting the training set into many subsamples (An & Suh, 2020).
Fig. 3 depicts a simple decision tree that decides whether or not to play golf tomorrow. It begins with a three-way test on the outlook: overcast, rainy, or sunny. If the sky is sunny, check whether it is windy (true or false). If it is windy, we will not play golf that day (Uddin, Chi, Al Janabi & Habib, 2022).
Fig. 3.
Example for RF.
If it is not windy, we choose to play. If the sky is overcast, we decide to play. If the forecast calls for rain, we also evaluate the humidity level: we prefer not to play if the humidity is high, and we play if the humidity is not too high. Here we can plainly see how decision trees are constructed, with each step in the chain leading to the final conclusion. Constructing a decision tree requires partitioning the training data into subsets at every internal node according to some criterion (Borup, Christensen, Mühlbach & Nielsen, 2022; Iwendi et al., 2020).
The decision tree (DT) algorithm uses metrics such as Gini impurity and information gain to identify the appropriate split for each node. Gini impurity is computed from the distribution of labels in a subset and measures the likelihood that a randomly selected element of the set would be mislabeled if it were labeled at random according to that distribution. The information gained at each stage of the tree-building process is used to decide which feature to split on. Splitting proceeds until all samples at a node share the same class label.
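The Gini impurity of a set of labels is one minus the sum of the squared class proportions, and a candidate split is scored by the size-weighted impurity of its two children. The following minimal sketch shows both quantities; the function names and the tiny label lists are introduced only for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum(p_k^2) over the class proportions p_k in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # an empty node is treated as pure
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two children of a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


# A 50/50 mix is maximally impure for two classes.
mixed = gini_impurity(["fraud", "fraud", "genuine", "genuine"])

# A split that separates the classes perfectly has zero impurity.
perfect_split = split_impurity(["fraud", "fraud"], ["genuine", "genuine"])
```

When growing a tree, the split with the lowest weighted impurity among the candidates is the one selected at each node.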
Although DTs are simple to comprehend and perform well on particular data, they have a large variance due to the greedy nature of the algorithm, which always selects the optimal split at each level and cannot look beyond the current level. Thus, over-fitting may occur, where the model performs better on the training set than on the testing set (Borup et al., 2022).
By using the bootstrap idea, the RF algorithm effectively mitigates the problem of over-fitting. A random forest builds many decision trees and combines them into a single model. Bootstrapping means sampling from the training data with replacement, so each decision tree is trained on a different subset of the data. Furthermore, the random forest employs random feature subsets. Suppose there are 50 features in the data and RF is configured to use only a certain number of them per tree, say ten. Each tree will then be trained on ten randomly chosen features, which are used, for example, to determine the best split at every node. The final result is derived by aggregating the outputs of all the DTs (a vote). Because not one but numerous DTs are used to make the decision, and each tree is trained on a different subset of the data, a model trained in this way generalizes better (Zhang, Zhong, & Hu, 2022).
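The three ingredients named above, bootstrap sampling, random feature subsets, and voting, can be sketched independently of any tree learner. This is a minimal illustration that follows the per-tree description in the text, with 50 features and 10 per tree; the number of trees, the sample count, and all function names are assumptions made for the example. Many implementations instead draw a fresh feature subset at every split.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(n_features, n_selected, rng):
    """Choose n_selected distinct feature indices."""
    return sorted(rng.sample(range(n_features), n_selected))


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(42)

# Training plan for a hypothetical forest of 5 trees on 100 rows.
plan = [
    (bootstrap_indices(100, rng), random_feature_subset(50, 10, rng))
    for _ in range(5)
]

# Combine hypothetical per-tree predictions for one transaction.
final_label = majority_vote(["fraud", "genuine", "fraud", "fraud", "genuine"])
```

Each entry of `plan` holds the rows and features one tree would be trained on; because rows are drawn with replacement, some appear several times and others not at all, which is what makes the trees differ from one another.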
3.5. Proposed methodology
MC money transactions are identified using a sample of real transactions in the financial dataset. These transactions were compiled from a month's worth of financial logs from an MC money service used in several nations. During COVID-19, there were 7,352,421 online transaction records, each of which contains a collection of attributes. Non-numeric data in the dataset are converted into numeric values, which are then scaled to the range between 0 and 1. As a result, the proposed classifier can be applied to a cleaner dataset. Cash withdrawals and transfers appear to form the set of suspicious transactions. The dataset is separated into two parts, training and testing data, with a ratio of 8:2. The training data are used to train the LSTM classifier model, which is then used to make predictions for the testing dataset. The attribute 'is Fraud' is kept as the target variable for classification. The proposed model reduces the number of input features by applying feature selection and dimensionality reduction techniques to the MC fraud data before feeding them into the model. To capture the sequential dependency between consecutive MC transactions, the LSTM sequence learner is used as the base dynamic pattern recognition classifier. Following that, a resampling technique is used to give distinct focus to the output of the LSTM's hidden layers, allowing our method to uncover key fraud trends and the variety of transactions within a consumer's purchases. Fig. 4 shows the proposed framework.
Fig. 4.
Proposed methodology.
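The preprocessing steps of Section 3.5 (encoding non-numeric values, scaling to the range 0 to 1, and an 8:2 split) can be sketched as follows. This is an illustration under stated assumptions: the column values are invented, the integer encoding is one of several possible encodings, and the split keeps the original record order because the paper does not say whether records were shuffled.

```python
def encode_categories(values):
    """Map each distinct non-numeric value to an integer code."""
    codes = {name: idx for idx, name in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def min_max_scale(values):
    """Rescale numbers linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column carries no signal
    return [(v - lo) / (hi - lo) for v in values]


def split_8_2(rows):
    """First 80% of the rows for training, the remaining 20% for testing."""
    cut = len(rows) * 8 // 10
    return rows[:cut], rows[cut:]


# Hypothetical columns for ten transactions.
kinds = ["TRANSFER", "CASH_OUT", "TRANSFER", "PAYMENT", "CASH_OUT",
         "PAYMENT", "TRANSFER", "CASH_OUT", "PAYMENT", "TRANSFER"]
amounts = [10.0, 250.0, 130.0, 40.0, 70.0, 190.0, 10.0, 250.0, 100.0, 220.0]

features = list(zip(min_max_scale(encode_categories(kinds)), min_max_scale(amounts)))
train_rows, test_rows = split_8_2(features)
```

After these steps every feature lies between 0 and 1, and the ten example rows are divided into eight training rows and two test rows.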
4. Results
The performance of any prediction model must be assessed, which makes assessment measures important. This section discusses the metrics applied to assess the performance of the classifier models. Accuracy measures the proportion of correct predictions among the total number of instances examined. Because it does not distinguish between the types of incorrect predictions, accuracy may not be an adequate measure of how well a model performs in practice. As a result, precision and recall must also be calculated. Precision is the number of true positives (TP) divided by the number of predicted positives. Recall is the number of true positives divided by the total number of actual positive samples. The F1-score is the harmonic mean of precision and recall. The highest possible value of precision, recall, and F1-score is 1. MAE, which measures the absolute discrepancies between predictions and observations on the test samples, is another evaluation metric. It yields non-negative floating-point values, with values near 0.0 being best. With the processed MC fraud data arranged in the chronological order in which they were collected, we can use our proposed model to forecast the label of each subsequent transaction. The data are divided into three subsets: 70% of the dataset is used to train the models; a 15% validation subset is used to validate the classifiers in order to avoid over-fitting and improve performance; and a 15% test subset is used to determine whether the network generalizes.
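The metrics defined above follow directly from counts of true and false predictions. The sketch below computes them for a small invented example, assuming that label 1 denotes a fraudulent transaction; the function names and data are illustrative and are not taken from the experiments.

```python
import math


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mae(y_true, y_hat):
    """Mean absolute error between observations and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_hat)) / len(y_true)


def rmse(y_true, y_hat):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_hat)) / len(y_true))


# Eight hypothetical transactions: 2 TP, 1 FN, 1 FP, 4 TN.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In this example precision, recall, and F1 all equal 2/3. Because F1 is a harmonic mean, it always lies between precision and recall.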
Our proposed method and the baseline LSTM method are compared using the same training and testing sets of MC data. The accuracy and recall graphs of both models on our labeled dataset are shown in Fig. 5. As can be seen, this work's model outperformed the others in terms of RMSE. This significant improvement is due to the resampling method: for better detection, a data-driven weighted average of all transactions in a sequence can be used to extract more relevant patterns from the transactions. This allows the sequence classifier to focus on the items of information that are most significant to the classification objective.
Fig. 5.
RMSE for LSTM, RF, and LR.
Furthermore, to emphasize the sensitivity of our proposed model's classification performance, we created a visual representation of the confusion matrix for each model tested on our dataset. We show that our proposed methodology is capable of reducing the number of routine fraudulent transactions while also catching uncommon fraudulent transactions, which is critical for financial service providers in the real world. These models were chosen because of their promising results and because they all use the same dataset, as stated in this paper.
Tables 1, 2, and 3 display the precision, recall, MAE, RMSE, and F score values for each model. Fig. 5 shows the RMSE for the three algorithms. The latter parameter is particularly important in fraud detection, because financial institutions are most interested in discovering potential fraud cases in order to protect consumers' interests and prevent the significant yearly financial losses caused by fraud.
Table 1.
Performance measures without re-sampling.
| Algorithms | MAE | RMSE | Precision | Recall | F score |
|---|---|---|---|---|---|
| LR | 0.267 | 0.348 | 0.240 | 0.070 | 0.333 |
| RF | 0.312 | 0.445 | 0.812 | 0.044 | 0.829 |
| LSTM | 0.511 | 0.678 | 0.613 | 0.094 | 0.712 |
Table 2.
Performance measures with over-sampling.
| Algorithms | MAE | RMSE | Precision | Recall | F score |
|---|---|---|---|---|---|
| LR | 0.113 | 0.150 | 0.367 | 0.080 | 0.178 |
| RF | 0.231 | 0.432 | 0.921 | 0.099 | 0.112 |
| LSTM | 0.539 | 0.684 | 0.632 | 0.076 | 0.921 |
Table 3.
Performance measures with under-sampling.
| Algorithms | MAE | RMSE | Precision | Recall | F score |
|---|---|---|---|---|---|
| LR | 0.449 | 0.145 | 0.347 | 0.099 | 0.321 |
| RF | 0.812 | 0.123 | 0.211 | 0.067 | 0.297 |
| LSTM | 0.762 | 0.871 | 0.339 | 0.045 | 0.421 |
Even without resampling, LR was able to classify the valid samples, with the precision, recall, and F1 scores shown in the tables. This was to be expected, given that we are dealing with imbalanced classes. The same does not hold for the fraudulent class: this model's precision and recall were only 0.240 and 0.070, respectively. As illustrated in Figs. 6, 7, and 8, the ROC AUC (area under the receiver operating characteristic curve) values are likewise subpar.
Fig. 6.
ROC for LR.
Fig. 7.
ROC for RF.
Fig. 8.
ROC for LSTM.
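For reference, the ROC AUC can be computed from its probabilistic interpretation: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below is a minimal illustration on invented scores; it compares every positive with every negative, so it suits small examples rather than millions of records.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Four hypothetical transactions with predicted fraud scores.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Here three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75. A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.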
Fig. 9 depicts the LSTM model's confusion matrix when none of the resampling approaches is used. The tables show that when random under-sampling was applied, the LSTM performed extremely well in categorizing the negative class.
Fig. 9.
Confusion matrix for LSTM.
5. Discussion
The LSTM model is assessed using the evaluation metrics described above. The results of this study show that the proposed model outperforms the competition in detecting fraudulent transactions. A loss is obtained for each epoch during training; as the number of epochs grows, the loss decreases until it reaches a minimum. A lower loss indicates a more accurate prediction. Because of the growing popularity of mobile money transfers, it is important to be aware of potential fraud in bank transactions. This has become a pressing reality during COVID-19. Customers will not be troubled by financial disputes if unlawful transactions are discovered and prevented.
This research covers the period from the announcement of COVID-19 by the Indian Government to the first unlock period. The study's major goal is to track down fraudulent transactions and reduce fraud as much as possible. It demonstrates that the strategy is both practical and appropriate for use in the current context. LSTM algorithms can be used to identify fraudulent transactions during a shutdown period. An LSTM model is developed and implemented for this purpose, with hyper-parameter fine-tuning as needed.
By adjusting hyper-parameters, a more precise model can be created, one that performs to its full potential. The testing results indicate that the proposed model can detect suspicious transactions with high precision. The proposed strategy is preferred because it can be applied to big financial datasets. Because clients are to be alerted to fraudulent transactions, an effective and error-free system is needed in the mobile transaction industry.
ML can be employed to identify the classes of transactions most likely to be fraudulent, which can help avoid fraud. Predictive modeling is employed in this paper to identify fraudulent transactions. It can generate rules, models, and analyses to determine whether a certain transaction, conducted in a particular manner or coming from a particular person, is likely to be fraudulent. Risk factors and thresholds were analyzed in real time using the proposed technique, so that businesses can accept or reject individual transactions.
This is a clear indication that the risk factor has been effectively minimized. When a transaction is proven to be fraudulent, the transaction authority can use this approach to help block it. We use ML because it can continually modify and update its rules in response to new transactions, ensuring that the rules remain current.
6. Conclusions
ML algorithms were used to determine whether an MC transaction is fraudulent. A dataset provided by the ML group at ULB was used for this purpose. Because one class accounts for the large majority of the dataset, it is severely imbalanced. When a severely uneven class distribution is used as input to a predictive model, the model is skewed toward the majority samples. As a result, a fraudulent transaction can be passed off as a legitimate one.
To address this issue, a data-level strategy that included a variety of resampling strategies was used. In addition, algorithmic methods such as boosting and bagging were used to address the problem of class imbalance. The LSTM method was also used for comparison with the other methods. Analyses were then conducted on all three methods, both with and without resampling. According to the comparative results, the RF in combination with a re-sampling strategy (connections elimination) outperformed the other models.
- By taking misclassification costs into account, a cost-sensitive learning technique can be used in future work.
- The cost of misclassifying a fraudulent transaction as legitimate (FN) is substantially higher than the cost of misclassifying a legitimate transaction as fraudulent (FP), which is the cost of examining the transaction and contacting the cardholder.
- This method of learning entails assigning an instance to the class with the lowest expected cost.
- The highest value without re-sampling was 0.829 (F score, RF). With under-sampling, it was 0.871 (RMSE, LSTM). With over-sampling, it was 0.921 for both RF (precision) and LSTM (F score).
For future work, it could be interesting to consider other confusion matrices and compare them with this work. Other supervised machine learning methods may also be considered in future studies.
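The cost-sensitive rule outlined in the list above, assigning each instance to the class with the lowest expected cost, can be sketched as follows. The cost values are hypothetical placeholders chosen so that a missed fraud (FN) is far more expensive than a false alarm (FP); the paper does not report actual costs, and the function names are introduced only for this example.

```python
def expected_costs(p_fraud, cost_fn, cost_fp):
    """Expected cost of each possible decision for one transaction."""
    return {
        # Labelling it genuine risks a missed fraud (false negative).
        "genuine": p_fraud * cost_fn,
        # Labelling it fraud risks a false alarm (false positive).
        "fraud": (1.0 - p_fraud) * cost_fp,
    }


def min_cost_label(p_fraud, cost_fn=100.0, cost_fp=1.0):
    """Choose the label with the lowest expected cost."""
    costs = expected_costs(p_fraud, cost_fn, cost_fp)
    return min(costs, key=costs.get)


# With FN 100 times costlier than FP, a 5% fraud probability is already flagged.
flagged = min_cost_label(0.05)
# With equal costs, the same rule reduces to the usual 0.5 threshold.
not_flagged = min_cost_label(0.05, cost_fn=1.0, cost_fp=1.0)
```

Raising the cost of a false negative relative to a false positive lowers the probability at which a transaction is flagged, which is the intended effect when missed fraud is the more expensive error.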
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability
Data will be made available on request.
References
- Abu Alfeilat H.A., Hassanat A.B., Lasassmeh O., Tarawneh A.S., Alhasanat M.B., Eyal Salman H.S., et al. Effects of distance measure choice on k-nearest neighbor classifier performance: A review. Big data. 2019;7(4):221–248. doi: 10.1089/big.2018.0175. [DOI] [PubMed] [Google Scholar]
- Adday B.N., Shaban F.A.J., Jawad M.R., Jaleel R.A., Zahra M.M.A. 2021 International Conference on Engineering and Emerging Technologies (ICEET) IEEE; 2021. Enhanced vaccine recommender system to prevent COVID-19 based on clustering and classification; pp. 1–6. [Google Scholar]
- Al-Abdaly N.M., Al-Taai S.R., Imran H., Ibrahim M. Development of prediction model of steel fiber-reinforced concrete compressive strength using random forest algorithm combined with hyper-parameter tuning and k-fold cross-validation. Eastern-European Journal of Enterprise Technologies. 2021;5(7):113. [Google Scholar]
- Ali N.G., Abed S.D., Shaban F.A.J., Tongkachok K., Ray S., Jaleel R.A. Hybrid of K-Means and partitioning around medoids for predicting COVID-19 cases: Iraq case study. Periodicals of Engineering and Natural Sciences (PEN) 2021;9(4):569–579. [Google Scholar]
- An B., Suh Y. Identifying financial statement fraud with decision rules obtained from Modified Random Forest. Data Technologies and Applications. 2020;54(2):235–255. [Google Scholar]
- Ashtiani M.N., Raahemi B. Intelligent fraud detection in financial statements using machine learning and data mining: A systematic literature review. IEEE access : practical innovations, open solutions. 2021;10:72504–72525. [Google Scholar]
- Bandyopadhyay S.K., Dutta S. Detection of fraud transactions using recurrent neural network during COVID-19: Fraud transaction during COVID-19. Journal of Advanced Research in Medical Science & Technology. 2020;7(3):16–21. ISSN: 2394-6539. [Google Scholar]
- Benchaji I., Douzi S., El Ouahidi B., Jaafari J. Enhanced credit card fraud detection based on attention mechanism and LSTM deep model. Journal of Big Data. 2021;8(1):1–21. [Google Scholar]
- Berahmand K., Nasiri E., Li Y. Spectral clustering on protein-protein interaction networks via constructing affinity matrix using attributed graph embedding. Computers in Biology and Medicine. 2021;138 doi: 10.1016/j.compbiomed.2021.104933. [DOI] [PubMed] [Google Scholar]
- Berahmand K., Nasiri E., Rostami M., Forouzandeh S. A modified DeepWalk method for link prediction in attributed social network. Computing. 2021;103(10):2227–2249. [Google Scholar]
- Borup D., Christensen B.J., Mühlbach N.S., Nielsen M.S. Targeting predictors in random forest regression. International Journal of Forecasting. 2022 [Google Scholar]
- Brezočnik L., Fister Jr, I., Podgorelec V. Swarm intelligence algorithms for feature selection: A review. Applied Sciences. 2018;8(9):1521. [Google Scholar]
- Carcillo F., Le Borgne Y.A., Caelen O., Bontempi G. Streaming active learning strategies for real-life credit card fraud detection: Assessment and visualization. International Journal of Data Science and Analytics. 2018;5(4):285–300. [Google Scholar]
- Choi D., Lee K. Security and Communication Networks; 2018. An artificial intelligence approach to financial fraud detection under IOT environment: A survey and implementation. 2018. [Google Scholar]
- Duarte E., Wainer J. Empirical comparison of cross-validation and internal metrics for tuning SVM hyper-parameters. Pattern Recognition Letters. 2017;88:6–11. [Google Scholar]
- Ebadi M.J., Hosseini A., Hosseini M.M. A projection type steepest descent neural network for solving a class of non-smooth optimization problems. Neurocomputing. 2017;235:164–181. [Google Scholar]
- Fischer T., Krauss C. Deep learning with long short-term memory networks for financial market predictions. European journal of operational research. 2018;270(2):654–669. [Google Scholar]
- Ganji V.R., Mannem S.N.P. Credit card fraud detection using anti-k nearest neighbor algorithm. International Journal on Computer Science and Engineering. 2012;4(6):1035–1039. [Google Scholar]
- Gregova E., Valaskova K., Adamko P., Tumpach M., Jaros J. Predicting financial distress of Slovak enterprises: Comparison of selected traditional and learning algorithms methods. Sustainability. 2020;12(10):3954. [Google Scholar]
- Guo J., Yang L., Bie R., Yu J., Gao Y., Shen Y., et al. An XGBoost-based physical fitness evaluation model using advanced feature selection and Bayesian hyper-parameter optimization for wearable running monitoring. Computer Networks. 2019;151:166–180. [Google Scholar]
- Gupta V., Santosh K.C., Arora R., Ciano T., Kalid K.S., Mohan S. Socioeconomic impact due to COVID-19: An empirical assessment. Information Processing & Management. 2022;59(2) doi: 10.1016/j.ipm.2021.102810. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hammed M., Soyemi J. An implementation of decision tree algorithm augmented with regression analysis for fraud detection in credit card. International Journal of Computer Science and Information Security (IJCSIS) 2020;18(2):79–88. [Google Scholar]
- Heydarpour F., Abbasi E., Ebadi M.J., Karbassi S.M. Solving an optimal control problem of cancer treatment by artificial neural networks. IJIMAI. 2020;6(4):18–25. [Google Scholar]
- Hu T., Liu X., Chen T., Zhang X., Huang X., Niu W., et al. Transaction-based classification and detection approach for Ethereum smart contract. Information Processing & Management. 2021;58(2) [Google Scholar]
- Iwendi C., Bashir A.K., Peshkar A., Sujatha R., Chatterjee J.M., Pasupuleti S., et al. COVID-19 patient health prediction using boosted random forest algorithm. Frontiers in public health. 2020;8:357. doi: 10.3389/fpubh.2020.00357. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jain A., Sarupria A., Kothari A. The impact of COVID-19 on E-wallet's payments in Indian economy. International Journal of Creative Research Thoughts. 2020;8(6):2447–2454. [Google Scholar]
- Jamali N., Sadegheih A., Lotfi M.M., Wood L.C., Ebadi M.J. Estimating the depth of anesthesia during the induction by a novel adaptive neuro-fuzzy inference system: A case study. Neural Processing Letters. 2021;53(1):131–175. [Google Scholar]
- Jin Y., Wu D., Guo W. Attention-based LSTM with filter mechanism for entity relation classification. Symmetry. 2020;12(10):1729. [Google Scholar]
- Kamaruddin S., Ravi V. Proceedings of the international conference on informatics and analytics. 2016. Credit card fraud detection using big data analytics: Use of PSOAANN based one-class classification; pp. 1–8. [Google Scholar]
- Khan A.T., Cao X., Li S., Katsikis V.N., Brajevic I., Stanimirovic P.S. Fraud detection in publicly traded US firms using Beetle Antennae Search: A machine learning approach. Expert Systems with Applications. 2022;191 [Google Scholar]
- Khan F., Ateeq S., Ali M., Butt N. Impact of COVID-19 on the drivers of cash-based online transactions and consumer behavior: Evidence from a Muslim market. Journal of Islamic Marketing. 2021 [Google Scholar]
- Khatri S., Arora A., Agrawal A.P. 2020 10th international conference on cloud computing, data science & engineering (Confluence) IEEE; 2020. Supervised machine learning algorithms for credit card fraud detection: A comparison; pp. 680–683. [Google Scholar]
- Le N.Q.K., Huynh T.T., Yapp E.K.Y., Yeh H.Y. Identification of clathrin proteins by incorporating hyper-parameter optimization in deep learning and PSSM profiles. Computer methods and programs in biomedicine. 2019;177:81–88. doi: 10.1016/j.cmpb.2019.05.016. [DOI] [PubMed] [Google Scholar]
- Li S., Yan Y. DATA-driven shock impact of COVID-19 on the market financial system. Information Processing & Management. 2022;59(1) doi: 10.1016/j.ipm.2021.102768. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu Y., Jin X., Shen H. Towards early identification of online rumors based on long short-term memory networks. Information Processing & Management. 2019;56(4):1457–1467. [Google Scholar]
- Mohanty S., Seth V.K., Sanjay H.S., Prithvi B.S. Assessment of long short-term memory network for quora sentiment analysis. Journal of The Institution of Engineers (India): Series B. 2022;103(2):375–384. [Google Scholar]
- Nasiri E., Berahmand K., Rostami M., Dabiri M. A novel link prediction algorithm for protein-protein interaction networks by attributed graph embedding. Computers in Biology and Medicine. 2021;137 doi: 10.1016/j.compbiomed.2021.104772. [DOI] [PubMed] [Google Scholar]
- Nti, I.K., .Aning, J., Ayawli, B.B.K., Frimpong, K., Appiah, A.Y., .& Nyarko-Boateng, O. (2021). A Comparative Empirical Analysis of 21 Machine Learning Algorithms for Real-World Applications in Diverse Domains.
- Omondi-Ochieng P. Financial performance of the United Kingdom's national non-profit sport federations: A binary logistic regression approach. Managerial Finance. 2020 [Google Scholar]
- Ozcan T., Basturk A. Static facial expression recognition using convolutional neural networks based on transfer learning and hyper-parameter optimization. Multimedia Tools and Applications. 2020;79(35):26587–26604. [Google Scholar]
- Popat R.R., Chaudhary J. 2018 2nd international conference on trends in electronics and informatics (ICOEI) IEEE; 2018. A survey on credit card fraud detection using machine learning; pp. 1120–1125. [Google Scholar]
- Sadalia I., Nasution F.N., Muda I. Logistic regression analysis to know the factors affecting the financial knowledge in decision of investment non Riil assets at university investment gallery. International Journal of Management (IJM) 2020;11(2) [Google Scholar]
- Sadgali I., Sael N., Benabbou F. Performance of machine learning techniques in the detection of financial frauds. Procedia computer science. 2019;148:45–54. [Google Scholar]
- Shadmi E., Chen Y., Dourado I., Faran-Perach I., Furler J., Hangoma P., et al. Health equity and COVID-19: Global perspectives. International journal for equity in health. 2020;19(1):1–16. doi: 10.1186/s12939-020-01218-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shekar B.H., Dagnew G. 2019 second international conference on advanced computational and communication paradigms (ICACCP) IEEE; 2019. Grid search-based hyper-parameter tuning and classification of microarray cancer data; pp. 1–8. [Google Scholar]
- Sherstinsky A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D: Nonlinear Phenomena. 2020;404 [Google Scholar]
- Singh A., Ranjan R.K., Tiwari A. Credit card fraud detection under extreme imbalanced data: A comparative study of data-level algorithms. Journal of Experimental & Theoretical Artificial Intelligence. 2022;34(4):571–598. [Google Scholar]
- Uddin M.S., Chi G., Al Janabi M.A., Habib T. Leveraging random forest in micro-enterprises credit risk modelling for accuracy and interpretability. International Journal of Finance & Economics. 2022;27(3):3713–3729. [Google Scholar]
- Van Houdt G., Mosquera C., Nápoles G. A review on the long short-term memory model. Artificial Intelligence Review. 2020;53(8):5929–5955. [Google Scholar]
- Vaughan G. Efficient big data model selection with applications to fraud detection. International Journal of Forecasting. 2020;36(3):1116–1127. [Google Scholar]
- Zhang, C., Zhong, H., & Hu, A. (2022). Research on early warning of financial crisis of listed companies based on random forest and time series. Mobile Information Systems, 2022, doi: 10.1155/2022/1573966.