A gradient boosting classifier for purchase intention prediction of online shoppers

Abdullah-All-Tanvir; Iftakhar Ali Khandokar; AKM Muzahidul Islam; Salekul Islam; Swakkhar Shatabda

doi:10.1016/j.heliyon.2023.e15163

Heliyon. 2023 Apr 3;9(4):e15163. doi: 10.1016/j.heliyon.2023.e15163


PMCID: PMC10121810 PMID: 37095970

Abstract

Early purchase prediction plays a vital role for an e-commerce website. It enables online retailers to shortlist consumers for product suggestions, discount offers and many other interventions. Several works have already used session logs to analyze customer behavior and determine whether a customer makes a purchase. In most cases, it is difficult to identify customers and offer them a discount once their session has ended. In this paper, we propose a purchase intention prediction model with which online retailers can detect a customer's intention earlier. First, we apply a feature selection technique to select the best features. The selected features are then used to train supervised learning models. Several classifiers, namely support vector machine (SVM), random forest (RF), multilayer perceptron (MLP), decision tree (DT) and XGBoost, have been applied along with an oversampling method for balancing the dataset. The experiments were performed on a standard benchmark dataset. Experimental results show that the XGBoost classifier with feature selection and oversampling achieves significantly higher area under the ROC curve (auROC) and area under the precision-recall curve (auPR) scores, 0.937 and 0.754 respectively. The accuracies achieved by XGBoost and Decision Tree are also significantly improved, at 90.65% and 90.54% respectively. Overall, the performance of the gradient boosting method is significantly improved compared to other classifiers and state-of-the-art methods. In addition, a method for explainable analysis of the problem is outlined.

Keywords: Gradient boosting classifier, Imbalanced dataset, Feature selection, Online shopper's purchase intention, Real time prediction

1. Introduction

Nowadays, online shopping has become part and parcel of daily life. People are more interested in buying products online than in a physical market. Therefore, the use of e-commerce websites has been increasing. However, the increase in the conversion rate is less than expected. In 2015, there were 1.46 billion online buyers, which rose to 1.79 billion in 2018. In 2021, the number of digital buyers reached only 2.14 billion, although relevant technologies improved significantly compared to the previous years [1]. In the physical market, a salesperson can bargain with the customers, explain the benefits of the products and try to read the mind of the customers. Moreover, the salesperson can give a discount or other offers to the customer based on his profit and the nature or intention of the customer. Note that the purchase intention varies from customer to customer and thus a salesperson can dynamically adapt his strategies based on the customer's behavior. For example, a customer may always buy products from the same salesperson because of the discount offered by that salesperson. Therefore, attracting customers is easier for a physical retailer than for an online shop. In online shopping, it is difficult to understand a customer's desire. For this reason, offering promotions to the right customer has been a challenge.

Different types of methods have been introduced in the recent past to solve this problem. Some e-commerce companies use early detection and behavioral prediction so that the behavior of a salesperson can be emulated in the virtual shopping environment [2], [3], [4]. Machine learning and deep learning methods have also been introduced to learn patterns that detect a customer's intention quickly, so that the site can act accordingly to prevent cart abandonment and encourage the customer to complete the purchase [5], [1], [6]. Researchers have also studied the key factors of online shopping [7], [8]. Others attempt to predict user behavior in real time and take appropriate action to reduce shopping cart abandonment and to increase purchase conversion rates [9], [10], [11].

There is a standard benchmark dataset for online shoppers' purchase intention prediction, which was first proposed by Sakar et al. [5]. It has been used extensively in the literature [5], [12], [13], [14], [15], [16]. Most of these works attempt to improve the accuracy of the classifiers. Note that the dataset is imbalanced, and thus the classification methods applied to this problem also deploy oversampling and undersampling techniques [17], [18], [19], [20]. One of the common drawbacks of these methods is that they rely on confusion matrix based metrics such as accuracy. For imbalanced datasets, it is highly recommended [21], [22] to use metrics like the area under the Receiver Operating Characteristic curve (auROC) and the area under the precision-recall curve (auPR). These two metrics, particularly auPR, are not sensitive to the bias or thresholds set by the classifiers. It is observed that although the prediction models proposed in the literature do produce higher accuracies, their performance suffers in terms of F1, MCC, auROC and auPR.

In this paper, a real-time online shopper behavior analysis system is presented. In the proposed system, a visitor's purchasing intention is predicted in real time while the session is in progress. A consumer who starts an interactive session is suggested new content based on the prediction returned by the model. This prediction is based on features that are extracted in real time. The extracted features and the resulting labels are stored, and the model is periodically retrained using them. Data from an online retailer were used. As the dataset is imbalanced, oversampling techniques are applied to upsample the dataset, the best features are selected and, lastly, XGBoost is proposed as the classifier to predict whether a customer will buy a product or not. The main contributions of this paper are as follows:

• Performed extensive feature analysis and selected effective features to predict online shoppers' purchase intention.
• Experimented with different sampling techniques to handle the data imbalance problem.
• Performed experimental analysis using several classification algorithms. Our proposed method significantly outperforms existing works in terms of standard performance metrics.
• Outlined a mechanism for explainable analysis of the problem.

The rest of the paper is organized as follows: Section 2 presents a literature review of the related work; Section 3 delineates our proposed materials and methods; Experimental results and analysis are presented in Section 4 and the paper concludes with a discussion and a direction for future work in Section 5.

2. Related work

There are several works in the literature that study the problem of predicting customers' purchasing intention for online shopping or e-commerce sites. Most of the studies are based on real-time data that were collected from online shopping websites. Several of them have also used session and log data available from the users. A few of the studies focus on finding the key features for online purchase and thus on finding out why a buyer abandons online shopping.

Esmeli et al. [23] have proposed a machine learning model that is able to predict the intention of a customer in a running session. One of their findings is that using session and log data is not very effective for predicting online purchasing intention. To evaluate their model, they have used a utility scoring method, which can predict a customer's intention of buying a product by creating features dynamically in a particular session. They use Decision Tree, Random Forest, Naive Bayes, K-Nearest Neighbor and Bagging classifiers. Among them, Decision Tree performs best with a 97% area under the receiver operating characteristic curve (auROC) score.

Houda et al. [2] investigate why customers reject online shopping. They perform an empirical study with 147 students and conclude that the cognitive variables of perceived usefulness and perceived ease of use have a crucial impact on the attitude towards Internet usage. Chung et al. [3] apply different approaches to explore the underlying connection between online shoppers' decision-making process and the type of input device the shoppers use, in terms of affect driven information processing. The results reflect that shoppers using a touch interface to view products demonstrate significantly higher engagement with their shopping experience in a low involvement setting. Using a touch interface also increases the likelihood that consumers will choose a hedonic over a utilitarian option to make an immediate purchase decision.

Mudaa et al. [24] have analyzed the purchasing behavior of Generation Y in Malaysia and conclude that most of them buy products from online retailers operating via Facebook and Instagram. Their study finds a positive relationship of perceived trust and perceived reputation with the online purchase intention of Generation Y shoppers. Baga et al. [25] have studied the attribute levels of durable goods for predicting the customer's purchase intention. Social perception scores of all brands collected from social media and reviews, together with the polarity calculated using sentiment analysis, are used to build the prediction model. After that, all the satisfactory instances are selected for each attribute with a significant regression analysis to predict the proper product attributes.

By combining the technology acceptance model with additional determinants and incorporating habitual online usage as a new mediator, Law et al. [4] suggest a new online purchase intention model. This model relates the constant use of online connectivity to the intention to purchase. The research helps online marketers and retailers to understand the middle-aged market segment and thus to come up with better strategies. Suchacka et al. [1] use Web server log data to understand the attitude of customers who buy products from e-commerce websites. Because customer behavior has to be characterized from the Web server log data, the authors analyze user sessions based on session features. The main goal is to point out the features that produce a high probability of making a purchase for innovative and traditional customers. An online bookstore website is used for this study, which is hosted on an Apache HTTP server on Linux with PHP and MySQL support. They use association rule mining with the Apriori algorithm for this purpose. In another study [6], the same authors recast online purchase prediction as a classification problem. Here, in the session feature space of an online shop, each user session is represented by a 23-element vector. The authors propose a Support Vector Machine (SVM) classification model that divides the user sessions into two classes, browsing sessions and buying sessions. The data are collected from the online bookshop's history. The model achieves 99% accuracy along with an almost 95% probability of predicting a purchasing session. In another study [26], they introduce a binary classification of user sessions in an online store into two types: buying sessions and browsing sessions. Their proposed method uses the traditional K-NN algorithm. They also use historical data collected from an e-commerce bookstore's log file. After implementing the K-NN classifier with different neighbor sizes, $K = 11$ turns out to be the best number of neighbors for classification. The model achieves 87.5% sensitivity and 99.85% accuracy.

Rita et al. [8] introduce the four characteristics of the e-service quality model that better predicts the consumer behaviors that are the subject of this research. This study focuses not only on the intention of a customer but also on the impact of customer trust. The findings reveal that three aspects of e-service quality, including website design, security or privacy, and fulfillment, have an impact on total e-service quality. Customer service, on the other hand, has no bearing on the overall quality of an e-service. Dabbous et al. [27] address the gap between online and offline shopping. In light of this expanding tendency, this study proposes an empirical model that is evaluated using structural equation modeling to explain the impact of content quality and brand interaction inside social media on consumers' brand awareness and purchase inclinations. The study also looks at whether hedonic motivation, consumer involvement, and brand awareness play a role in the relationship between social media stimuli and offline purchase intent. According to the findings, Millennials place a high value on the quality of information given by companies on social media as well as the involvement of corporate users.

Martins et al. [28] show that smartphone advertisements can play a vital role in convincing customers to shop online. Ducoffe's web advertising model and flow experience theory are combined in their proposed model, and data are collected from 303 Portuguese respondents. The study concludes that the value of advertising, interactive web design, flow experience and brand recognition can define the purchase intention. Xiao et al. [29] describe four cues that influence customers' decisions to buy in cross-border e-commerce: online promotion cues, content marketing cues, personalized recommendation cues, and social review cues. They find that these four cues play a crucial role in customers' purchasing behavior and that their effect is negatively moderated by brand familiarity. In a recent work [5], the authors attempt to predict the purchase intention of a customer by tracking them during a session. They use Random Forest, Multilayer Perceptron and Support Vector Machine with oversampling and feature selection methods. They conclude that the multilayer perceptron using backpropagation along with weight backtracking gives the best accuracy and F1-Score compared to the other classifiers. In another work [12], the authors use the same dataset and apply Naive Bayes, C4.5 and Random Forest classifiers. They also use oversampling techniques to balance the dataset. Their results show that the accuracy and F1-Score of Random Forest are higher compared to the earlier work [5].

The standard benchmark dataset that was proposed by Sakar et al. [5] has been used extensively in the literature [5], [13], [12], [16], [14], [15]. A summary of the literature review is presented in Table 1. However, note that most of these methods perform well only in terms of accuracy and other metrics that depend on the confusion matrix and are thus prone to the threshold or bias set by the classification methods. This is often not recommended for imbalanced datasets. It is observed that the classifiers proposed in the literature do poorly in terms of robust metrics like MCC, F1, auROC and auPR. This paper addresses this gap in the literature and performs an ablation using several classifiers to arrive at a conclusive method.

Table 1.

Summary of the literature review.

Ref.	Dataset	Feature selection	Data Imbalance	ML	Evaluation Metric
Sakar et al. [5]	consumers' purchase intentions	Principal Component Analysis	Oversampling (adding more minority data by uniform distribution)	C4.5, MLP, RF, SVM	Accuracy, TPR, TNR, F1-Score
Karim et al. [12]	consumers' purchase intentions	No Feature Selection	Oversampling (SMOTE)	Naive Bayes, C4.5, RF	Accuracy, Sensitivity, Specificity, F1-Score
Song et al. [13]	consumers' purchase intentions	Pearson correlation coefficient of the “Revenue” feature	No	XGBoost, RF	Accuracy, Positive Precision, Recall and F1-Score, Negative Precision, Recall, F1-Score, AUC
Rizal et al. [14]	consumers' purchase intentions	Information Gain (IG) and Correlation (CORR)	ADASYN	RF	accuracy, precision, recall, and F1-score
Kabir et al. [15]	consumers' purchase intentions	No	No	RF, bagging with RF, Gradient Boosting	accuracy, precision, recall, and F1-score
Hamami et al. [16]	consumers' purchase intentions	Correlation	No	Logistic Regression, KNN, RF, DT	Accuracy
Esmeli et al. [23]	European e-commerce business	No	SMOTE oversampling and random undersampling	NB, RF, Bagging, DT, KNN	AUC, utility scoring method
Suchacka et al. [26]	real data from commercial Web server access logs, recorded in April 2014	No	No	SVM	true and false positives and negatives, error rate, accuracy, and sensitivity.
Suchacka et al. [1]	Custom made dataset from online store	No	No	KNN	Accuracy, error rate, Sensitivity


3. Material and methods

This section provides the details of our proposed model in the real time scenario to predict online shopper's purchase intention followed by the details of the methodology of training and the experimentation.

3.1. Real time prediction

Our real-time customer purchase intention prediction model depends on tracking the customer as he/she browses through the e-commerce site. The situation is depicted in Fig. 1. Whenever the customer enters the site and starts interacting, the system collects information. This may include the products that are being browsed or selected, in addition to the information that is available through sessions and cookies. Based on these interactions, features are extracted and the model is used to predict the intention, based on which new content (a discount, a product, etc.) is suggested to the consumer. In addition, the extracted features are tagged with the purchase activity of the consumer and stored in a database. Periodically, data from this database are used to retrain the model.

Fig. 1. Real Time Consumer Intention Prediction and Model Training.

3.2. Training of the model

The methodology followed in this paper to train our model is depicted in Fig. 2. It starts by collecting a dataset from an e-commerce site, followed by a pre-processing step in which the categorical data are converted into numeric form and normalized. After that, feature selection is performed; this is an optional phase in the machine learning methodology and depends on the performance of the selected features. A balancing technique is then applied to address the imbalance problem. Finally, classifiers are trained and the final prediction model is selected after performance evaluation. The rest of this section describes these steps in detail.

Fig. 2. Workflow for training our proposed model.

3.3. Benchmark dataset

In our study, the problem of online shoppers' purchase intention prediction is formulated as a supervised binary classification. We have collected a dataset that was previously used in [5]. There are different types of features collected from real time user interactions. The account management related page visits are typed into administrative or account management and the duration thereby is accounted in seconds. The features also include information about the number of pages and the duration that the visitor visits or stays within the online shopping site. The product related page visits and the duration is also taken into account. Based on the user interaction or action types like visiting new pages, switching between product types, the feature values are collected in real time. The browsing information for all the pages visited by the shopper is then converted into other features based on the browsing pattern and page success rates. Also the client system information (browser, location, operating systems, etc), timing of the month/year/week are taken into consideration as important features.

In this dataset, there are both categorical and numerical features. A summary of the features is given in Table 2 and Table 3. All the features in Table 2 are numerical, whereas the features in Table 3 are categorical. There are 12,330 records of different customers in a one-year period with different behavior. Among these 12,330 records, 10,422 (i.e., 84.5%) are negative samples, where customers did not complete their shopping, and the remaining 1,908 records are positive samples, where customers completed their transaction.

Table 2.

Summary of numerical features.

Feature	Feature description	Max value	σ
Account Management	Number of pages that the visitor visited related to account management	27	3.32
Account Management Duration	Time spent by the visitor in pages related to account management (in seconds)	3398	176.70
Content Pages	Number of pages that the visitor browsed that are related to the generic information content of the shopping site	24	1.26
Content duration	Time spent by the visitor browsing pages that are related to the generic information content of the shopping site	2549	140.64
Product pages	Number of pages the visitor visited that are about product or items	705	44.45
Product pages duration	Time spent by the visitor browsing pages related to product or items	63,973	1912.25
Bounce rate	Average bounce rate of the pages that are browsed by the shopper	0.2	0.04
Exit rate	Average exit rate value of the pages that are browsed by the shopper	0.2	0.05
Page value	Average page value of the pages that are browsed by the shopper	361	18.55
Special day	Temporal adjacency of the site browsing time to a special day	1.0	0.19


Table 3.

Summary of categorical features.

Feature	Feature description	Levels
Operating Systems	Operating system of the machine of the online shopper	8
Browser	Browser of the online shopper	13
Location	Geographic location from which the browsing session has been started by the online shopper	9
Traffic Type	Traffic source by which the online shopper has reached to the Shopping site	20
Visitor Type	Three types: new_visitor, returning_visitor, and others	3
Weekend	Binary value representing whether the day of the browsing is weekend	2
Month	Month value of the browsing date	12
Revenue	Class label representing whether the visit has been finalized with a purchase or not	2


3.4. Pre-processing

One-hot encoding converts categorical data into a numeric form that can be provided to ML algorithms. The categorical features in the dataset are converted into numerical features using this technique. As the dataset also has numeric features, scaling was also necessary for better performance. The standard scaler transforms each feature so that it has a mean of 0 and a variance of 1. ML models are initialized to starting values and updated gradually on the basis of error estimation, so unscaled inputs with widely varying ranges can make the model learn slowly, and unscaled target variables can distort the gradients and cause the learning to fail altogether. The class label was converted to two distinct labels, 0 and 1, using a simple label encoding method.
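The two transformations described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline (the paper reports using scikit-learn); the helper names `one_hot` and `standardize` and the toy values are assumptions made only for illustration.

```python
from statistics import mean, pstdev

def one_hot(values):
    """Map category labels to 0/1 indicator rows, one column per level."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

def standardize(values):
    """Rescale numbers to zero mean and unit variance (population std)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Toy values (hypothetical), loosely modelled on the Visitor Type
# and duration features of the benchmark dataset.
visitor_type = ["Returning_Visitor", "New_Visitor", "Other", "Returning_Visitor"]
levels, encoded = one_hot(visitor_type)

durations = [0.0, 64.0, 2.0, 10.0]
scaled = standardize(durations)
```

In practice the mean and standard deviation would be estimated on the training split only and then reused to transform the test split.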

3.5. Feature selection

There are two types of features in the benchmark dataset that is used in this work. A summary of the features is given in Table 2 and Table 3. The $χ^{2}$ based feature selection technique is applied to select the best features. The feature selection reveals that out of 28 features 20 are significant. They are ‘VisitorType_New_Visitor’, ‘SpecialDay’, ‘Month_May’, ‘Month_Mar’, ‘BounceRates’, ‘Month_Oct’, ‘ExitRates’, ‘Month_Feb’, ‘VisitorType_Returning_Visitor’, ‘Month_Dec’, ‘Month_Sep’, ‘Weekend’, ‘Month_June’, ‘Region’, ‘Browser’, ‘OperatingSystems’, ‘VisitorType_Other’, ‘TrafficType’, ‘Month_Jul’ and ‘Month_Aug’.

The premise of the $χ^{2}$ based feature selection is to use a univariate statistical test to select the top features based on the association between each feature and the label. For real-time prediction, the focus was on time and model complexity. The main reason for using this method is to keep the model simple, which improves the overall performance. On the other hand, wrapper based feature selection methods [30] and hybrid feature selection methods [31], though sophisticated, have a large time complexity and depend on the model used.
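As a rough illustration of univariate $χ^{2}$ scoring, the sketch below computes the classic Pearson statistic from a contingency table of a discrete feature against the label and keeps the top-scoring features. This is an assumption about the general idea only: the exact variant used by the authors is not stated, and the names `chi2_score`, `select_top_k` and the toy columns are hypothetical.

```python
def chi2_score(feature, label):
    """Pearson chi-square statistic for two equally long lists of discrete values."""
    n = len(feature)
    stat = 0.0
    for f in sorted(set(feature)):
        f_count = sum(1 for x in feature if x == f)
        for c in sorted(set(label)):
            c_count = sum(1 for y in label if y == c)
            observed = sum(1 for x, y in zip(feature, label) if x == f and y == c)
            # Expected count if feature and label were independent.
            expected = f_count * c_count / n
            stat += (observed - expected) ** 2 / expected
    return stat

def select_top_k(columns, label, k):
    """Rank features by chi-square score and return the k best names."""
    scores = {name: chi2_score(col, label) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data (hypothetical): "Weekend" tracks the label, "Browser" does not.
revenue = [1, 1, 0, 0]
columns = {"Weekend": [1, 1, 0, 0], "Browser": [1, 0, 1, 0]}
best = select_top_k(columns, revenue, k=1)
```

Each feature is scored independently of the classifier, which is why this filter approach is much cheaper than wrapper methods that retrain a model for every candidate subset.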

3.6. Handling data imbalance

As discussed earlier, out of 12,330 records, 10,422 belong to the negative class, which creates a class imbalance problem. Because of this imbalance, the prediction would be biased, as the model tends to predict the negative class. Therefore, resampling was necessary to bring the proportion of the positive class in line with that of the negative class. Synthetic Minority Oversampling Technique (SMOTE) [32] and random undersampling [33] were selected as the oversampling and undersampling techniques. Note that the balancing methods were applied only to the training set, and the test set was kept imbalanced in all the experiments reported in this paper. In the train set before balancing, there were 7293 negative samples and 1338 positive samples. After oversampling, the ratio was 1:1. In the case of undersampling, the negative samples were undersampled to 5352.
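The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration rather than the reference implementation or the authors' code; the function name `smote_sketch`, the neighbour count `k`, the seed and the toy points are assumptions.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# With the training counts reported above, a 1:1 ratio needs this many new positives.
n_needed = 7293 - 1338

# Toy minority class (hypothetical): the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_rows = smote_sketch(minority, n_new=6, k=2)
```

Because each synthetic point lies on a segment between two real minority samples, oversampling adds variety without simply duplicating rows.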

3.7. Classification algorithm

XGBoost [34] approximates a function by optimizing a specific loss function to which different regularization methods are applied. The algorithm for XGBoost is given in Algorithm 1. At iteration t, the weak learner $f_{t}$ is learned by minimizing the following objective (loss function plus regularization):

$\mathcal{L}^{(t)} = \sum_{i = 1}^{N} L \left( y_{i}, \hat{y}_{i}^{(t - 1)} + f_{t} (x_{i}) \right) + Ω (f_{t})$

Algorithm 1. XGBoost (training set $D = \{(x_{i}, y_{i})\}_{i = 1}^{N}$, loss function $L (f (x_{i}), y_{i})$, learning rate α, weak learner M).
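To make the objective more concrete, the sketch below works through one ingredient of a boosting step under stated assumptions: the loss is the logistic loss, the regularizer includes an L2 penalty λ on leaf weights (the form given in the XGBoost paper, not spelled out in this article), and all samples fall into a single leaf. Under a second-order approximation of the loss, the optimal leaf weight is $w^{*} = - G / (H + λ)$, where G and H are the sums of first and second derivatives. The function names and toy values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_grad_hess(y_true, raw_pred):
    """First and second derivative of the log loss w.r.t. the raw (log-odds) score."""
    p = sigmoid(raw_pred)
    return p - y_true, p * (1.0 - p)

def optimal_leaf_weight(labels, raw_preds, reg_lambda=1.0):
    """Leaf weight minimizing the second-order approximation of the objective."""
    pairs = [logistic_grad_hess(y, s) for y, s in zip(labels, raw_preds)]
    grad_sum = sum(g for g, _ in pairs)
    hess_sum = sum(h for _, h in pairs)
    return -grad_sum / (hess_sum + reg_lambda)

# Three samples in one leaf, previous-round score 0 for all (probability 0.5).
labels = [1, 1, 0]
raw_preds = [0.0, 0.0, 0.0]
w = optimal_leaf_weight(labels, raw_preds)
```

Here two of the three samples are positive, so the leaf weight is positive and pushes the predicted log-odds upward; a larger λ shrinks the weight toward zero.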

Four classification algorithms were employed in the experiments for comparison with the gradient boosting algorithm. They are: Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF) and Multilayer perceptron (MLP). In this section brief preliminaries and the parameter settings of these algorithms are given.

i. Decision Tree (DT): A decision tree [35] is an explainable classification model that makes decisions based on features set as nodes in the tree. Decisions are taken at the nodes based on the values of the attributes. Different metrics such as entropy, information gain and the Gini index are used for attribute selection in decision trees. In the experiments, entropy was used as the attribute selector and the pre-pruning parameter max_depth was set to 5.
ii. Support Vector Machine (SVM): SVM [36] works by finding an optimal hyperplane separating the data, which is defined as an optimization problem.
$\min \frac{1}{2} \| w \|^{2} + C \sum_{t = 1}^{k} ϵ_{t} \quad \text{subject to} \quad r^{t} (w^{T} x^{t} + w_{0}) \geq 1 - ϵ_{t}$

Here, w is a discriminant weight vector, C is the regularization parameter that controls the complexity of the model fitted to the data, $ϵ = (ϵ_{1}, ϵ_{2}, \dots, ϵ_{k})$ is the vector of slack variables, and $r^{t}$ is the actual value of sample t. In the experiments with SVM, the “rbf” kernel was used with C equal to 7.
iii. Multilayer perceptron (MLP): A multilayer perceptron is an artificial neural network model [37] with feed-forward connections. Our MLP model consists of three hidden layers with 28, 56 and 28 neurons, respectively. Stochastic gradient descent was used as the optimizer with the learning rate set to 0.00005 and the ReLU activation function [38].
iv. Random Forest (RF): A random forest [39] learns a large number of individual decision trees to form an ensemble. It performs bootstrapping on the feature space and the decision is taken based on a voting mechanism. In the experiments, max_depth of the decision trees was set to 20 and the number of estimators to 100.

3.8. Performance evaluation

In the experiments, the train-test split percentage on the dataset was kept the same as in the original paper where the selected dataset was proposed [5]. This is done to enable a fair comparison among the methods. Note that all the experimental results reported in the next section represent performance on the test data after the dataset was split into 70% (train) and 30% (test).

In a typical binary classification task, the metrics are often defined on the confusion matrix of each model. In a confusion matrix, true positives (TP) are the positive instances correctly predicted as positive, true negatives (TN) are the negative instances correctly predicted as negative, false positives (FP) are the negative instances incorrectly predicted as positive, and false negatives (FN) are the positive instances incorrectly predicted as negative. The following metrics are employed in this work based on these terms.

The percentage of the correct prediction is denoted by accuracy. The model will perform better if the accuracy is high (close to 100%). The formula is given in Equation (1).

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (1)

Precision is the ratio of correctly predicted positive instances to the total number of predicted positive instances. Here, 1 is the highest value of precision and 0 is the lowest. The formula is given in Equation (2).

$\text{Precision} = \frac{TP}{TP + FP}$ (2)

Recall, also called sensitivity or True Positive Rate (TPR), is defined as the ratio of true positives to the total number of actual positive instances. The formula is given in Equation (3).

$\text{Recall} = \frac{TP}{TP + FN}$ (3)

Specificity or True Negative Rate (TNR) is the probability that an actual negative will test negative. The value of TNR lies between 0 and 1, where 1 is the highest value. The formula is given in Equation (4).

$TNR = \frac{TN}{TN + FP}$ (4)

F1-Score is the harmonic mean of Precision and Recall. The range of F1-Score is 0 to 1. The formula is given in Equation (5).

$\text{F1-Score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}}$ (5)

Matthews correlation coefficient (MCC) is arguably the most elegant way to assess the quality of a binary classification. A high value (close to 1) means that both classes are predicted well. The formula is given in Equation (6).

$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) (TP + FN) (TN + FP) (TN + FN)}}$ (6)
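The confusion matrix based metrics can be computed together from the four counts, using their standard definitions. The sketch below is illustrative only; the function name `confusion_metrics` and the example counts are assumptions and are not taken from the paper's experiments.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Standard metrics derived from the four confusion matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity or TPR
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "tnr": tn / (tn + fp),  # specificity
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical counts for a 100-sample test set.
m = confusion_metrics(tp=40, tn=50, fp=5, fn=5)
```

A production version would also guard against empty denominators, for example when a classifier predicts no positives at all.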

There are two metrics that are not dependent on the confusion matrix. They are the area under the receiver operating characteristic curve (auROC) and the area under the precision-recall curve (auPR). The ROC curve is a plot of TPR against the false positive rate (FPR), and the precision-recall curve is a plot of precision against recall. Neither of these areas depends on the threshold chosen by the classifier, and they are highly suitable for imbalanced data classification in addition to MCC and F1-score.
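One way to see why auROC is threshold-free is its rank interpretation: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes auROC this way; it is a quadratic-time illustration with a hypothetical function name and toy scores, not the routine used in the experiments.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical scores).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
area = auroc(labels, scores)

# Any order-preserving rescaling of the scores leaves the area unchanged,
# which is why no decision threshold needs to be fixed.
area_rescaled = auroc(labels, [2 * s for s in scores])
```

Only the ordering of the scores matters, so a classifier that ranks positives above negatives scores well even if its default threshold is poorly calibrated.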

4. Experimental analysis

Python 3.7 and the scikit-learn library [40] were used to implement all the algorithms. The algorithms were run in the Jupyter notebook environment. All the experiments are run on the dataset after a 70-30 split to generate train and test data respectively. Feature selection, oversampling and undersampling are performed only on the train set to avoid any possible overfitting.

The experiments started with a baseline, and the components of the proposed method were added one at a time to assess their individual strengths. The following sections present results from each of the experiments. A summary of the ablation is given in Table 4.

Table 4.

Summary of the experiments conducted in ablation study.

experiment	baseline	feature selection	oversampling	undersampling
baseline-1
data sampling-1
data sampling-2
feature selection-1
combined-1
combined-2


4.1. Baseline methods

The first set of experiments is carried out on the selected classifiers with all features present. In this setting, no data sampling was performed to handle the imbalance. The results are reported in Table 5. The results show that Decision Tree provides the highest accuracy and Random Forest the highest precision, F1-score and MCC (0.6089), while XGBoost provides the best TPR, auROC and auPR.

Table 5.

Evaluation of classification with baseline methods.

Model	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
SVM	89.59	88.76	0.8887	0.35	0.92	0.878	0.691	0.5591
RF	90.51	89.93	0.9007	0.74	0.93	0.932	0.748	0.6089
MLP	88.05	88.08	0.8806	0.61	0.93	0.886	0.632	0.543
DT	90.54	89.9	0.8998	0.75	0.92	0.922	0.717	0.6049
XGBoost	89.94	89.14	0.8905	0.76	0.91	0.937	0.749	0.5672


4.2. Baseline with data sampling

For the next set of experiments, data sampling techniques were employed to handle the imbalance in the data. First, oversampling was applied to the dataset using the SMOTE algorithm. The results are reported in Table 6. Here, Random Forest performs better than the other classifiers in terms of TNR, auROC, auPR and MCC. However, XGBoost has similar performance on these metrics and slightly better performance in terms of accuracy, precision and F1-score. Note that the overall TPR is lower in these experiments than in the baseline experiments.
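The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a simplified illustration under stated assumptions (random choice of the base sample, Euclidean distance, hypothetical function and parameter names), not the implementation used in the experiments.

```python
import random
from typing import List, Sequence


def smote_sketch(minority: Sequence[Sequence[float]], n_new: int,
                 k: int = 5, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority points by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(p: Sequence[float], q: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(p, q))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Toy minority class on the unit square (assumed data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(smote_sketch(minority, n_new=3, k=2))
```

Because every synthetic point is a convex combination of two real minority points, the oversampled class stays inside the region already occupied by the minority samples.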

Table 6.

Classification Report with Oversampling.

Model	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
SVM	88.94	88.62	0.8876	0.65	0.93	0.878	0.666	0.5564
RF	89.65	90.05	0.8982	0.65	0.95	0.933	0.731	0.625
MLP	87.21	87.2	0.8721	0.59	0.92	0.9	0.631	0.4931
DT	89.4	89.96	0.8964	0.65	0.95	0.919	0.689	0.6127
XGBoost	90.21	90.11	0.9016	0.69	0.94	0.93	0.724	0.6243


When random undersampling was applied to balance the dataset, XGBoost gave the best overall performance, except for the auPR score, which is better for Random Forest. The results for this set of experiments are reported in Table 7. Here too, note the drop in TPR. However, both oversampling and undersampling show the relative superiority of XGBoost as a classifier over the other methods.
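Random undersampling can be sketched in a few lines: majority-class rows are discarded at random until every class has as many rows as the smallest one. The function name and the toy data are assumptions for illustration; the sketch uses only the standard library.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def random_undersample(X: Sequence[Sequence[float]], y: Sequence[int],
                       seed: int = 0) -> Tuple[List[Sequence[float]], List[int]]:
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)
    n_keep = min(len(ix) for ix in by_class.values())
    kept = []
    for ix in by_class.values():
        kept.extend(rng.sample(ix, n_keep))
    kept.sort()  # preserve the original row order
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy training set with an 8:2 class ratio (assumed data).
X_train = [[float(i)] for i in range(10)]
y_train = [0] * 8 + [1] * 2
X_bal, y_bal = random_undersample(X_train, y_train)
print(y_bal)
```

The minority class is kept in full, while most of the majority class is thrown away, which is one plausible reason why undersampling can trade sensitivity for specificity differently from oversampling.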

Table 7.

Classification Report with Undersampling.

Model	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
SVM	87.51	88.73	0.88	0.58	0.94	0.897	0.663	0.564
RF	88.21	90.4	0.8893	0.58	0.96	0.93	0.734	0.6196
MLP	84.1	88.83	0.8555	0.49	0.96	0.9	0.66	0.5445
DT	87.02	89.88	0.8794	0.55	0.96	0.919	0.696	0.5961
XGBoost	89.32	90.27	0.8969	0.64	0.95	0.93	0.718	0.6235


4.3. Feature selection

In this paper, $χ^{2}$ based feature selection is used to select the best features. The top k features are selected according to the ranking, with k taken from the set $\{10, 15, 20\}$. The results are reported in Table 8. Here, the best accuracy is obtained with the top 20 features and the XGBoost classifier, which also shows the highest precision, TPR, TNR and auPR values. The Decision Tree classifier, however, performs well in terms of F1-Score and MCC when trained with 15 and 20 features.
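A minimal sketch of $χ^{2}$ ranking followed by top-k selection is given below. It computes, for each non-negative feature, the statistic that compares the per-class feature totals with the totals expected if the feature were independent of the class. Whether the experiments used exactly this variant is an assumption, and the function names and toy data are hypothetical.

```python
from typing import List, Sequence


def chi2_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Chi-squared score of each non-negative feature against the class label."""
    n_samples = len(X)
    n_features = len(X[0])
    feature_total = [sum(row[j] for row in X) for j in range(n_features)]
    scores = [0.0] * n_features
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        class_prob = len(rows) / n_samples
        for j in range(n_features):
            observed = sum(row[j] for row in rows)
            expected = class_prob * feature_total[j]
            if expected > 0:  # skip all-zero features
                scores[j] += (observed - expected) ** 2 / expected
    return scores


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, in column order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: feature 0 tracks the label, feature 1 does not (assumed data).
X_train = [[1, 1], [1, 0], [0, 1], [0, 0]]
y_train = [1, 1, 0, 0]
scores = chi2_scores(X_train, y_train)
print(scores, select_top_k(scores, k=1))
```

In the setting described above, the scores would be computed on the train set only, and the same selected columns would then be applied to the test set.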

Table 8.

Classification performance evaluation of feature selection with different top k features selected.

Model	k	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
SVM	20	89.56	88.79	0.8895	0.72	0.92	0.88	0.694	0.5675
	15	89.78	89.03	0.8918	0.72	0.92	0.877	0.701	0.5699
	10	89.51	88.76	0.8895	0.71	0.92	0.85	0.663	0.5619

RF	20	89.1	88.11	0.8787	0.75	0.91	0.932	0.743	0.5898
	15	89.13	88.14	0.8814	0.73	0.91	0.928	0.736	0.6025
	10	89.51	89.09	0.8926	0.68	0.93	0.923	0.707	0.5802

MLP	20	88.97	88.37	0.8859	0.67	0.92	0.902	0.669	0.5516
	15	88.4	87.44	0.877	0.67	0.91	0.898	0.622	0.5113
	10	88.23	87.73	0.8786	0.62	0.92	0.861	0.587	0.5291

DT	20	90.54	89.9	0.8998	0.76	0.93	0.923	0.731	0.6049
	15	90.54	89.9	0.8998	0.76	0.93	0.923	0.731	0.6049
	10	89.89	89.64	0.8975	0.68	0.94	0.926	0.7014	0.6037

XGBoost	20	90.65	90.01	0.898	0.801	0.96	0.937	0.749	0.5992
	15	90.19	89.45	0.8928	0.7804	0.92	0.936	0.742	0.5775
	10	89.83	89.01	0.8889	0.76	0.91	0.935	0.731	0.5611


To demonstrate the effectiveness of feature selection, two bubble graphs are shown in Fig. 3. The plot on the left of Fig. 3 shows MCC vs F1-Score and the plot on the right shows auPR vs auROC, with different bubble shapes denoting the number of features selected and colors indicating the classifiers. The graphs reflect two different aspects: i) how the classifiers behave in terms of metrics that depend on the confusion matrix and metrics that are independent of it, and ii) how they behave for different numbers of selected features. It is evident that the XGBoost and Decision Tree classifiers outperform the other classifiers.

Fig. 3. Plot of MCC vs F1-Score (left) and auPR vs auROC (right) for all algorithms using baseline with feature selection.

Fig. 4 shows a lollipop chart of the importance of all features according to the $χ^{2}$ test. Here, the feature importance values are scaled to show the relative importance.

Fig. 4. Plot of feature ranks from χ² test (importance values scaled).

4.4. Combined evaluation

In the combined evaluation, the feature selection technique is combined with oversampling and undersampling, respectively. Table 9 reports the classification metrics from the experiments with different classifiers using feature selection and oversampling.

Table 9.

Classification Report of Feature Selection with different k values and Oversampling.

Model	k	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
SVM	20	88.86	88.75	0.888	0.64	0.93	0.877	0.665	0.5747
	15	88.61	89.07	0.8882	0.61	0.94	0.869	0.67	0.5769
	10	88.21	88.88	0.885	0.6	0.94	0.854	0.641	0.5756

RF	20	89.48	89.72	0.8959	0.65	0.94	0.933	0.737	0.6069
	15	89.13	89.67	0.8936	0.63	0.94	0.93	0.722	0.6072
	10	88.51	89.31	0.8884	0.61	0.94	0.923	0.701	0.6069

MLP	20	87.92	88.28	0.8808	0.6	0.93	0.881	0.611	0.5347
	15	86.32	88.68	0.8715	0.54	0.95	0.892	0.624	0.5542
	10	87.32	88.27	0.8766	0.47	0.94	0.863	0.647	0.576

DT	20	87.29	90.15	0.882	0.56	0.96	0.919	0.705	0.6009
	15	88.38	89.68	0.8887	0.6	0.95	0.921	0.699	0.6059
	10	89.16	90.32	0.8959	0.62	0.95	0.922	0.693	0.6046

XGBoost	20	90.32	89.82	0.8998	0.72	0.93	0.937	0.754	0.6269
	15	90.21	89.97	0.9008	0.7	0.94	0.935	0.754	0.6069
	10	89.86	89.72	0.8979	0.68	0.94	0.933	0.734	0.6029


Two bubble plots, similar to those in the previous section, are created to show the relative performance of the classifiers. The plots are given in Fig. 5. From the data reported in the table and presented in the plots, it is evident that the XGBoost variants outperform the rest of the classifiers in both cases.

Fig. 5. Plot of MCC vs F1-Score (left) and auPR vs auROC (right) for all algorithms using baseline with feature selection and oversampling.

The next set of experiments was done using feature selection and undersampling combined. The results are reported in Table 10. One interesting finding in the case of undersampling is that TNR improved for all the classifiers compared to the previous experiments. However, MCC is not that high due to poor TPR. The overall comparison is shown using two sets of parameters in Fig. 6. Again, the XGBoost classifiers perform better than all other classifiers for both sets of parameters. However, feature selection with oversampling yields a higher auPR than feature selection with undersampling.

Table 10.

Classification Report of Feature Selection with different k values and Undersampling.

Model	k	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
SVM	20	87.81	88.91	0.8825	0.59	0.94	0.892	0.665	0.5677
	15	87.62	88.83	0.881	0.58	0.94	0.893	0.671	0.5693
	10	87.37	88.84	0.8793	0.57	0.95	0.883	0.651	0.5693

RF	20	87.86	90.11	0.8861	0.58	0.96	0.925	0.71	0.6088
	15	87.62	89.9	0.8838	0.57	0.96	0.924	0.715	0.6009
	10	87.4	89.87	0.8822	0.56	0.96	0.918	0.675	0.5986

MLP	20	86.51	88.41	0.8722	0.55	0.95	0.899	0.66	0.5482
	15	85.78	89.35	0.8691	0.53	0.96	0.911	0.661	0.5717
	10	85.1	88.8	0.863	0.51	0.96	0.892	0.637	0.5508

DT	20	87.13	90.1	0.8807	0.56	0.96	0.921	0.697	0.6033
	15	87.13	90.1	0.8807	0.56	0.96	0.921	0.697	0.60339
	10	87.32	90.16	0.8822	0.56	0.96	0.917	0.687	0.6063

XGBoost	20	88.94	90.04	0.8936	0.62	0.96	0.931	0.712	0.6138
	15	88.81	89.76	0.8918	0.61	0.95	0.93	0.718	0.6042
	10	88.86	89.83	0.8924	0.62	0.95	0.927	0.698	0.6068


Fig. 6. Plot of MCC vs F1-Score (left) and auPR vs auROC (right) for all algorithms using baseline with feature selection and undersampling.

From the overall experiments, it is evident that XGBoost is the best performing classifier. However, the second best performance is given by the very simple Decision Tree classifier. Short parameter-tuning experiments were performed for these two classifiers only. The results are presented in Table 11 and Table 12. For the Decision Tree classifier, max_depth values from $\{5, 10, 15\}$ were tried to find the best outcome. From the results in Table 11, the best results are obtained with max_depth=5. This value was also used for the explainability analysis.

Table 11.

Performance of Decision Tree with Different Max-Depth Value.

max_depth	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
5	90.54	89.9	0.8998	0.76	0.93	0.923	0.731	0.6049
10	88.97	88.21	0.8844	0.68	0.92	0.845	0.626	0.5438
15	87.72	87.29	0.8748	0.61	0.92	0.767	0.594	0.5391


Table 12.

Performance of XGBoost with Different Max-Depth Value.

max_depth	Accuracy	Precision	F1 Score	TPR	TNR	auROC	auPR	MCC
2	90.65	90.01	0.898	0.801	0.96	0.937	0.749	0.5992
5	90.08	89.32	0.8935	0.75	0.92	0.928	0.725	0.5789
10	89.27	88.47	0.8867	0.7	0.92	0.925	0.718	0.5513


The max_depth hyper-parameter of XGBoost was tuned over the values $\{2, 5, 10\}$. The results reported in Table 12 show that max_depth=2 gives the best outcome.

To show the overall performance comparison, two scatter plots are shown in Fig. 7 and Fig. 8. Here, the best version of each classifier is shown for the six ablation experiments. Fig. 7 shows the scatter plots of MCC vs F1-Score. Here, XGBoost is the best performing classifier. The Random Forest classifier is in the next position; however, it does not dominate the other classifiers in any of the metrics.

Fig. 7. Comparison of MCC vs F1-Score for all algorithms. Here all six ablation experiments are shown: B=baseline, FS=Feature Selection, OS=Oversampling and US=Undersampling.

Fig. 8. Comparison of auPR vs auROC for all algorithms. Here all six ablation experiments are shown: B=baseline, FS=Feature Selection, OS=Oversampling and US=Undersampling.

Fig. 8 shows the comparison for all classifiers as scatter plots of auROC vs auPR for the different ablation experiments. Here too, the best performing classifier is XGBoost, followed by Random Forest. In both Fig. 7 and Fig. 8, the best performing variant of XGBoost is the one where feature selection and oversampling were employed along with the baseline. Only in terms of F1-Score are the baseline and feature selection variants slightly better, but they have relatively poor performance in terms of the other metrics, as shown in both figures. The receiver operating characteristic (ROC) curves for the two best classifiers, Decision Tree and XGBoost, are depicted in Fig. 9(a) and Fig. 9(b).

Fig. 9. Receiver operating characteristic curves for (a) Decision Tree and (b) XGBoost classifier.

4.5. Comparison with the existing works

In this section, the proposed method is compared with those in the literature. Three existing works are selected for comparison: Sakar et al. [5], Baati et al. [12] and Song et al. [13]. All of these works were applied to the same benchmark dataset that is used in this paper. The comparison is shown in Table 13. From the results reported in the table, our method achieves the best performance among all the methods in terms of accuracy, TNR and F1-Score. The other methods in the literature did not report auROC, auPR or MCC; therefore, those metrics are not included in this table. Sakar et al. [5] achieve a higher TPR; however, the other metrics suggest that their method might be more biased than the proposed method. Also note that the method proposed by Song et al. [13] performs similarly in terms of accuracy, but does not do as well in terms of F1-Score and TPR.

Table 13.

Comparison with Existing Works.

Metrics	Sakar et al. [5]	Baati et al. [12]	Song et al. [13]	This paper
Accuracy	87.24	86.78	90.15	90.65
True Positive Rate	0.84	0.62	0.73	0.801
True Negative Rate	0.92	0.91	0.93	0.96
F1-Score	0.86	0.60	0.65	0.91


4.6. Explainability analysis

For the explainability analysis, the best decision tree model, with max_depth=5 and feature selection, was selected. The performance of this model is very similar to that of the best model, XGBoost, and yet it provides the opportunity to understand and explain the behaviour of the classifier. The rest of the classifiers are black-box in nature and are not well suited to such analysis.

The best decision tree, with entropy as the attribute selection criterion and max_depth=5, is shown in parts in Fig. 10. The root and its two children are shown in Fig. 10(a). Note that the values on which the decision tree is learned are normalized. The features used at these nodes are PageValues, Month_November and Bounce_rates. The two subtrees rooted at the two children of the root node are elaborated in Fig. 10(b) and Fig. 10(c). Part of the subtree shown in Fig. 10(c) is shown fully in Fig. 10(d). Such trees give an idea of the importance of the features or attributes and also help in taking decisions. Business rules can be generated from these trees by traversing from the root to a leaf. For example, two such rules are shown below.

i. VisitorType_ReturningVisitor = True && Exit_Rates > τ && BounceRates > β && Page_Values > α && administrative > γ && Region > δ ⇒ yes
ii. VisitorType_ReturningVisitor = True && Exit_Rates > τ && BounceRates > β && Page_Values > α && administrative ≤ κ && Page_Values > η ⇒ no

Note that the threshold values are masked for the attributes. Similar rules can be generated from trees of this kind. Also note that association rule mining, which identifies itemsets with high confidence and may provide explainable rules similar to those of the decision tree classifier, was not performed on this dataset.
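The root-to-leaf traversal that produces such rules can be sketched as follows. The nested-dictionary tree format, the function name and the toy tree are hypothetical and serve only to illustrate the traversal; the tree is not the one learned in the experiments, and the thresholds are kept symbolic in the same spirit as the masked values above.

```python
from typing import Dict, List, Tuple, Union

# A node is either a leaf label ("yes"/"no") or a dict with a split.
Node = Union[str, Dict[str, object]]


def extract_rules(node: Node, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Collect one 'antecedent => label' rule per root-to-leaf path."""
    if isinstance(node, str):  # leaf reached: emit the accumulated rule
        antecedent = " && ".join(conditions) if conditions else "True"
        return [f"{antecedent} => {node}"]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


# Hypothetical two-level tree with symbolic thresholds.
toy_tree = {
    "feature": "PageValues", "threshold": "alpha",
    "left": "no",
    "right": {
        "feature": "BounceRates", "threshold": "beta",
        "left": "yes",
        "right": "no",
    },
}

for rule in extract_rules(toy_tree):
    print(rule)
```

Each leaf yields exactly one rule, so a tree limited to max_depth=5 produces at most 32 rules, which keeps the rule set small enough to be reviewed by hand.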

5. Conclusion

In this paper, a novel customer intention prediction method for e-commerce sites is proposed. The intention prediction model is intended to assist e-shoppers in determining the intention of a customer based on features derived from browsing activities. The proposed method works in two steps. In the first step, the features that best represent the customers' activity are selected using the $χ^{2}$ based feature selection technique. In the second step, to address the class imbalance problem, oversampling and undersampling techniques are used so that the model is not affected by bias. Subsequently, the problem is modeled as a binary classification problem, and the selected features are fed into classifiers to classify the customers' intention of buying as either positive or negative. Finally, comprehensive experiments are conducted to verify the performance of the proposed model. Experimental results show that the XGBoost classifier achieves an accuracy of 90.65%, the highest among the compared schemes. Moreover, the proposed model is evaluated in terms of the area under the ROC curve (auROC) and the area under the precision-recall curve (auPR). The XGBoost classifier outperforms the other approaches with an auROC score of 0.937 and an auPR score of 0.754. The experimental results indicate that, with feature selection and sampling techniques, the proposed method achieves high intention prediction accuracy even with imbalanced data.

One of the limitations of this study is the lack of understanding of the other black-box models, which is partly due to the nature of the algorithms. The experimental analysis and the methodology presented here are applicable to real-world problems similar to the one studied here and could be extended further. Also note that further explainability analyses are possible on the dataset using state-of-the-art explainable and interpretable AI models.

Funding statement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Abdullah-All-Tanvir conceived and designed the experiments, contributed reagents, materials, analysis tools or data, performed the experiments and wrote the paper; Iftakhar Ali Khandokar performed the experiments and wrote the paper; A.K.M. Muzahidul Islam analyzed and interpreted data and wrote the paper; Salekul Islam analyzed and interpreted data and wrote the paper; and Swakkhar Shatabda conceived and designed the experiments, analyzed and interpreted data and wrote the paper.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Contributor Information

Abdullah-All-Tanvir, Email: abdullahalltanvir@gmail.com.

Iftakhar Ali Khandokar, Email: iftakharalikhandokar@gmail.com.

A.K.M. Muzahidul Islam, Email: muzahid@cse.uiu.ac.bd.

Salekul Islam, Email: salekul@cse.uiu.ac.bd.

Swakkhar Shatabda, Email: swakkhar@cse.uiu.ac.bd.

Data availability

Data will be made available on request.

References

1. Suchacka Grażyna, Chodak Grzegorz. Using association rules to assess purchase probability in online stores. Inf. Syst. E-Bus. Manag. 2017;15(3):751–780.
2. Zarrad Houda, Debabi Mohsen. Online purchasing intention: factors and effects. Int. Bus. Manag. 2012;4(1):37–47.
3. Chung Sorim, Kramer Thomas, Wong Elaine M. Do touch interface users feel more engaged? The impact of input device type on online shoppers' engagement, affect, and purchase decisions. Psychol. Mark. 2018;35(11):795–806.
4. Law Monica, Kwok Ron Chi-Wai, Ng Mark. An extended online purchase intention model for middle-aged online users. Electron. Commer. Res. Appl. 2016;20:132–146.
5. Sakar C. Okan, Polat S. Olcay, Katircioglu Mete, Kastro Yomi. Real-time prediction of online shoppers' purchasing intention using multilayer perceptron and lstm recurrent neural networks. Neural Comput. Appl. 2019;31(10):6893–6908.
6. Suchacka Grazyna, Skolimowska-Kulig Magdalena, Potempa Aneta. Classification of e-customer sessions based on support vector machine. ECMS. 2015;15:594–600.
7. Ariffin Shaizatulaqma Kamalul, Mohan Thenmoli, Goh Yen-Nee. Influence of consumers' perceived risk on consumers' online purchase intention. J. Res. Interact. Mark. 2018
8. Rita Paulo, Oliveira Tiago, Farisa Almira. The impact of e-service quality and customer satisfaction on customer behavior in online shopping. Heliyon. 2019;5(10) doi: 10.1016/j.heliyon.2019.e02690.
9. Fernandes Ricardo Filipe, Teixeira Costa Magalhães. 2015. Using Clickstream Data to Analyze Online Purchase Intentions.
10. Budnikas Germanas, et al. Computerised recommendations on e-transaction finalisation by means of machine learning. Stat. Trans. New Ser. 2015;16(2):309–322.
11. Awad Mamoun A., Khalil Issa. Prediction of user's web-browsing behavior: application of Markov model. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 2012;42(4):1131–1142. doi: 10.1109/TSMCB.2012.2187441.
12. Baati Karim, Mohsil Mouad. IFIP International Conference on Artificial Intelligence Applications and Innovations. Springer; 2020. Real-time prediction of online shoppers' purchasing intention using random forest; pp. 43–51.
13. Song Peiyi, Liu Yutong. An xgboost algorithm for predicting purchasing behaviour on e-commerce platforms. Teh. Vjesn. 2020;27(5):1467–1471.
14. Prayogo Rizal Dwi, Karimah Siti Amatullah. 2021 8th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA) IEEE; 2021. Feature selection and adaptive synthetic sampling approach for optimizing online shopper purchase intent prediction; pp. 1–5.
15. Kabir Md Rayhan, Ashraf Faisal Bin, Ajwad Rasif. 2019 22nd International Conference on Computer and Information Technology (ICCIT) IEEE; 2019. Analysis of different predicting model for online shoppers' purchase intention from empirical data; pp. 1–6.
16. Hamami Faqih, Muzakki Ahmad. vol. 2329:1. AIP Publishing LLC; 2021. Machine learning pipeline for online shopper intention classification; p. 050014. (AIP Conference Proceedings).
17. Saha Soumitra, Sarker Partho Sarathi, Al Saud Alam, Shatabda Swakkhar, Newton MA Hakim. Cluster-oriented instance selection for classification problems. Inf. Sci. 2022;602:143–158.
18. Azim Sayed Mehedi, Sharma Alok, Noshadi Iman, Shatabda Swakkhar, Dehzangi Iman. A convolutional neural network based tool for predicting protein ampylation sites from binary profile representation. Sci. Rep. 2022;12(1):1–7. doi: 10.1038/s41598-022-15403-3.
19. Chowdhury Kibtia, Shatabda Swakkhar. 2021 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE) IEEE; 2021. Sentiment analysis on bangla financial news; pp. 64–67.
20. Saha Deepita, Haque Mozzammel, Sarkar Akash, Alam Famina, Farid Dewan Md, Rahman Chowdhury Mofizur, Shatabda Swakkhar. 2018 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE) IEEE; 2018. Ieee wiecon-ece 2018 novel class detection in concept drifting data streams using decision tree leaves; pp. 87–90.
21. Rayhan Farshid, Ahmed Sajid, Shatabda Swakkhar, Farid Dewan Md, Mousavian Zaynab, Dehzangi Abdollah, Rahman M. Sohel. idti-esboost: identification of drug target interaction using evolutionary and structural features with boosting. Sci. Rep. 2017;7(1):1–18. doi: 10.1038/s41598-017-18025-2.
22. Mondal Istiak Ahmed, Haque Md. Enamul, Hassan Al-Maruf, Shatabda Swakkhar. 2021 24th International Conference on Computer and Information Technology (ICCIT) IEEE; December 2021. Handling imbalanced data for credit card fraud detection.
23. Esmeli Ramazan, Bader-El-Den Mohamed, Abdullahi Hassana. Electronic Markets. 2020. Towards early purchase intention prediction in online session based retailing systems; pp. 1–19.
24. Muda Mazzini, Mohd Rohani, Hassan Salwana. Online purchase behavior of generation y in Malaysia. Proc. Econ. Finance. 2016;37:292–298.
25. Bag Sujoy, Tiwari Manoj Kumar, Chan Felix TS. Predicting the consumer's purchase intention of durable goods: an attribute-level analysis. J. Bus. Res. 2019;94:408–419.
26. Suchacka Grażyna, Skolimowska-Kulig Magdalena, Potempa Aneta. A k-nearest neighbors method for classifying user sessions in e-commerce scenario. J. Telecommun. Inf. Technol. 2015
27. Dabbous Amal, Barakat Karine Aoun. Bridging the online offline gap: assessing the impact of brands' social network content quality on brand awareness and purchase intention. J. Retail. Consum. Serv. 2020;53
28. Martins José, Costa Catarina, Oliveira Tiago, Gonçalves Ramiro, Branco Frederico. How smartphone advertising influences consumers' purchase intention. J. Bus. Res. 2019;94:378–387.
29. Xiao Liang, Guo Feipeng, Yu Fumao, Liu Shengnan. The effects of online shopping context cues on consumers' purchase intention for cross-border e-commerce sustainability. Sustainability. 2019;11(10):2777.
30. Shatabda Swakkhar, Saha Sanjay, Sharma Alok, Dehzangi Abdollah. iphloc-es: identification of bacteriophage protein locations using evolutionary and structural features. J. Theor. Biol. 2017;435:229–237. doi: 10.1016/j.jtbi.2017.09.022.
31. Dehzangi Iman, Sharma Alok, Shatabda Swakkhar. Computational Methods for Predicting Post-Translational Modification Sites. 2022. iprotgly-ss: a tool to accurately predict protein glycation site using structural-based features; p. 125.
32. Chawla Nitesh V., Bowyer Kevin W., Hall Lawrence O., Kegelmeyer W. Philip. Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002;16:321–357.
33. Hasanin Tawfiq, Khoshgoftaar Taghi. 2018 IEEE International Conference on Information Reuse and Integration (IRI) IEEE; 2018. The effects of random undersampling with simulated class imbalance for big data; pp. 70–79.
34. Friedman Jerome H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 2001:1189–1232.
35. Quinlan J. Ross. Induction of decision trees. Mach. Learn. 1986;1(1):81–106.
36. Cortes Corinna, Vapnik Vladimir. Support-vector networks. Mach. Learn. 1995;20(3):273–297.
37. Rumelhart David E., Hinton Geoffrey E., Williams Ronald J. California Univ San Diego La Jolla Inst for Cognitive Science; 1985. Learning internal representations by error propagation. Technical report.
38. Nair Vinod, Hinton Geoffrey E. Icml. 2010. Rectified linear units improve restricted Boltzmann machines.
39. Breiman Leo. Random forests. Mach. Learn. 2001;45(1):5–32.
40. Pedregosa Fabian, Varoquaux Gaël, Gramfort Alexandre, Michel Vincent, Thirion Bertrand, Grisel Olivier, Blondel Mathieu, Prettenhofer Peter, Weiss Ron, Dubourg Vincent, et al. Scikit-learn: machine learning in python. J. Mach. Learn. Res. 2011;12:2825–2830.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

Data will be made available on request.

[br0010] 1.Suchacka Grażyna, Chodak Grzegorz. Using association rules to assess purchase probability in online stores. Inf. Syst. E-Bus. Manag. 2017;15(3):751–780. [Google Scholar]

[br0020] 2.Zarrad Houda, Debabi Mohsen. Online purchasing intention: factors and effects. Int. Bus. Manag. 2012;4(1):37–47. [Google Scholar]

[br0030] 3.Chung Sorim, Kramer Thomas, Wong Elaine M. Do touch interface users feel more engaged? The impact of input device type on online shoppers' engagement, affect, and purchase decisions. Psychol. Mark. 2018;35(11):795–806. [Google Scholar]

[br0040] 4.Law Monica, Kwok Ron Chi-Wai, Ng Mark. An extended online purchase intention model for middle-aged online users. Electron. Commer. Res. Appl. 2016;20:132–146. [Google Scholar]

[br0050] 5.Sakar C. Okan, Polat S. Olcay, Katircioglu Mete, Kastro Yomi. Real-time prediction of online shoppers' purchasing intention using multilayer perceptron and lstm recurrent neural networks. Neural Comput. Appl. 2019;31(10):6893–6908. [Google Scholar]

[br0060] 6.Suchacka Grazyna, Skolimowska-Kulig Magdalena, Potempa Aneta. Classification of e-customer sessions based on support vector machine. ECMS. 2015;15:594–600. [Google Scholar]

[br0070] 7.Ariffin Shaizatulaqma Kamalul, Mohan Thenmoli, Goh Yen-Nee. Influence of consumers' perceived risk on consumers' online purchase intention. J. Res. Interact. Mark. 2018 [Google Scholar]

[br0080] 8.Rita Paulo, Oliveira Tiago, Farisa Almira. The impact of e-service quality and customer satisfaction on customer behavior in online shopping. Heliyon. 2019;5(10) doi: 10.1016/j.heliyon.2019.e02690. [DOI] [PMC free article] [PubMed] [Google Scholar]

[br0090] 9.Fernandes Ricardo Filipe, Teixeira Costa Magalhães. 2015. Using Clickstream Data to Analyze Online Purchase Intentions. [Google Scholar]

[br0100] 10.Budnikas Germanas, et al. Computerised recommendations on e-transaction finalisation by means of machine learning. Stat. Trans. New Ser. 2015;16(2):309–322. [Google Scholar]

[br0110] 11.Awad Mamoun A., Khalil Issa. Prediction of user's web-browsing behavior: application of Markov model. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 2012;42(4):1131–1142. doi: 10.1109/TSMCB.2012.2187441. [DOI] [PubMed] [Google Scholar]

[br0120] 12.Baati Karim, Mohsil Mouad. IFIP International Conference on Artificial Intelligence Applications and Innovations. Springer; 2020. Real-time prediction of online shoppers' purchasing intention using random forest; pp. 43–51. [Google Scholar]

[br0130] 13.Song Peiyi, Liu Yutong. An xgboost algorithm for predicting purchasing behaviour on e-commerce platforms. Teh. Vjesn. 2020;27(5):1467–1471. [Google Scholar]

[br0140] 14.Prayogo Rizal Dwi, Karimah Siti Amatullah. 2021 8th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA) IEEE; 2021. Feature selection and adaptive synthetic sampling approach for optimizing online shopper purchase intent prediction; pp. 1–5. [Google Scholar]

[br0150] 15.Kabir Md Rayhan, Ashraf Faisal Bin, Ajwad Rasif. 2019 22nd International Conference on Computer and Information Technology (ICCIT) IEEE; 2019. Analysis of different predicting model for online shoppers' purchase intention from empirical data; pp. 1–6. [Google Scholar]

[br0160] 16.Hamami Faqih, Muzakki Ahmad. vol. 2329:1. AIP Publishing LLC; 2021. Machine learning pipeline for online shopper intention classification; p. 050014. (AIP Conference Proceedings). [Google Scholar]

[br0170] 17.Saha Soumitra, Sarker Partho Sarathi, Al Saud Alam, Shatabda Swakkhar, Newton MA Hakim. Cluster-oriented instance selection for classification problems. Inf. Sci. 2022;602:143–158. [Google Scholar]

[br0180] 18.Azim Sayed Mehedi, Sharma Alok, Noshadi Iman, Shatabda Swakkhar, Dehzangi Iman. A convolutional neural network based tool for predicting protein ampylation sites from binary profile representation. Sci. Rep. 2022;12(1):1–7. doi: 10.1038/s41598-022-15403-3. [DOI] [PMC free article] [PubMed] [Google Scholar]

[br0190] 19.Chowdhury Kibtia, Shatabda Swakkhar. 2021 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE) IEEE; 2021. Sentiment analysis on bangla financial news; pp. 64–67. [Google Scholar]

[br0200] 20.Saha Deepita, Haque Mozzammel, Sarkar Akash, Alam Famina, Farid Dewan Md, Rahman Chowdhury Mofizur, Shatabda Swakkhar. 2018 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE) IEEE; 2018. Ieee wiecon-ece 2018 novel class detection in concept drifting data streams using decision tree leaves; pp. 87–90. [Google Scholar]

[br0210] 21.Rayhan Farshid, Ahmed Sajid, Shatabda Swakkhar, Farid Dewan Md, Mousavian Zaynab, Dehzangi Abdollah, Rahman M. Sohel. idti-esboost: identification of drug target interaction using evolutionary and structural features with boosting. Sci. Rep. 2017;7(1):1–18. doi: 10.1038/s41598-017-18025-2. [DOI] [PMC free article] [PubMed] [Google Scholar]

[br0220] 22.Mondal Istiak Ahmed, Haque Md. Enamul, Hassan Al-Maruf, Shatabda Swakkhar. 2021 24th International Conference on Computer and Information Technology (ICCIT) IEEE; December 2021. Handling imbalanced data for credit card fraud detection. [Google Scholar]

[br0230] 23.Esmeli Ramazan, Bader-El-Den Mohamed, Abdullahi Hassana. Electronic Markets. 2020. Towards early purchase intention prediction in online session based retailing systems; pp. 1–19. [Google Scholar]

[br0240] 24.Muda Mazzini, Mohd Rohani, Hassan Salwana. Online purchase behavior of generation y in Malaysia. Proc. Econ. Finance. 2016;37:292–298. [Google Scholar]

[br0250] 25.Bag Sujoy, Tiwari Manoj Kumar, Chan Felix TS. Predicting the consumer's purchase intention of durable goods: an attribute-level analysis. J. Bus. Res. 2019;94:408–419. [Google Scholar]

[br0260] 26.Suchacka Grażyna, Skolimowska-Kulig Magdalena, Potempa Aneta. A k-nearest neighbors method for classifying user sessions in e-commerce scenario. J. Telecommun. Inf. Technol. 2015 [Google Scholar]

[br0270] 27.Dabbous Amal, Barakat Karine Aoun. Bridging the online offline gap: assessing the impact of brands' social network content quality on brand awareness and purchase intention. J. Retail. Consum. Serv. 2020;53 [Google Scholar]

[br0280] 28.Martins José, Costa Catarina, Oliveira Tiago, Gonçalves Ramiro, Branco Frederico. How smartphone advertising influences consumers' purchase intention. J. Bus. Res. 2019;94:378–387. [Google Scholar]

[br0290] 29.Xiao Liang, Guo Feipeng, Yu Fumao, Liu Shengnan. The effects of online shopping context cues on consumers' purchase intention for cross-border e-commerce sustainability. Sustainability. 2019;11(10):2777. [Google Scholar]

[br0300] 30.Shatabda Swakkhar, Saha Sanjay, Sharma Alok, Dehzangi Abdollah. iPHLoc-ES: identification of bacteriophage protein locations using evolutionary and structural features. J. Theor. Biol. 2017;435:229–237. doi: 10.1016/j.jtbi.2017.09.022. [DOI] [PubMed] [Google Scholar]

[br0310] 31.Dehzangi Iman, Sharma Alok, Shatabda Swakkhar. Computational Methods for Predicting Post-Translational Modification Sites. 2022. iProtGly-SS: a tool to accurately predict protein glycation site using structural-based features; p. 125. [DOI] [PubMed] [Google Scholar]

[br0320] 32.Chawla Nitesh V., Bowyer Kevin W., Hall Lawrence O., Kegelmeyer W. Philip. SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002;16:321–357. [Google Scholar]

[br0330] 33.Hasanin Tawfiq, Khoshgoftaar Taghi. 2018 IEEE International Conference on Information Reuse and Integration (IRI) IEEE; 2018. The effects of random undersampling with simulated class imbalance for big data; pp. 70–79. [Google Scholar]

[br0340] 34.Friedman Jerome H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 2001:1189–1232. [Google Scholar]

[br0350] 35.Quinlan J. Ross. Induction of decision trees. Mach. Learn. 1986;1(1):81–106. [Google Scholar]

[br0360] 36.Cortes Corinna, Vapnik Vladimir. Support-vector networks. Mach. Learn. 1995;20(3):273–297. [Google Scholar]

[br0370] 37.Rumelhart David E., Hinton Geoffrey E., Williams Ronald J. California Univ San Diego La Jolla Inst for Cognitive Science; 1985. Learning internal representations by error propagation. Technical report. [Google Scholar]

[br0380] 38.Nair Vinod, Hinton Geoffrey E. ICML. 2010. Rectified linear units improve restricted Boltzmann machines. [Google Scholar]

[br0390] 39.Breiman Leo. Random forests. Mach. Learn. 2001;45(1):5–32. [Google Scholar]

[br0400] 40.Pedregosa Fabian, Varoquaux Gaël, Gramfort Alexandre, Michel Vincent, Thirion Bertrand, Grisel Olivier, Blondel Mathieu, Prettenhofer Peter, Weiss Ron, Dubourg Vincent, et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830. [Google Scholar]
