Scientific Reports. 2026 Feb 10;16:8042. doi: 10.1038/s41598-026-37437-7

A scalable hybrid framework for boosting customer experience and operational efficiency in e-commerce

Haowei Liu 1,2, Farah Raihana Ismail 2, Weihang Zhang 3, Ping Zou 3, Tarak Hussain 4, Yogesh Kumar Sharma 5, Umesh Kumar Lilhore 6, Sarita Simaiya 7, Lidia Gosy Tekeste 8
PMCID: PMC12957293  PMID: 41667544

Abstract

The rapid growth of e-commerce has highlighted the need for better customised services and greater operational efficiency. This research presents a novel hybrid framework that combines Collaborative Filtering (CF), Matrix Factorisation (MF), and Reinforcement Learning (RL) to enhance the consumer experience and streamline backend operations. By leveraging historical data, the approach provides a dynamic and adaptive system that does not rely on real-time data. While CF and MF are effective at creating personalised recommendations, RL introduces adaptive pricing strategies that take into account market demand and competitor actions, outperforming static models. In addition, Natural Language Processing (NLP) is used to analyse customer feedback, providing sentiment insights that improve customer service. AI-powered automation also optimises supply chain management by improving inventory forecasting, lowering costs, and increasing efficiency. Experimental results on the Retailrocket, Instacart, and Amazon Reviews datasets demonstrate that the hybrid model outperforms traditional approaches. On Retailrocket, the model outperformed the baseline models, achieving a conversion rate of 19.1% and a customer retention rate of 28.5%. Profitability increased by 6.3%, and the model reduced RMSE to 1.05 and MAE to 0.27. These findings show the framework’s ability to improve both personalised recommendations and business operations, making it a scalable solution for e-commerce platforms.

Keywords: AI automation, E-commerce optimization, Collaborative filtering, Matrix factorization, Reinforcement learning, Natural language processing

Subject terms: Business and management, Engineering, Information systems and information technology, Mathematics and computing

Introduction

The e-commerce industry is growing explosively, reflecting rising demand for online shopping and digital services. Despite this rapid progress, companies still struggle to serve customers well while streamlining back-office operations1,2. Personalization means tailoring recommendations, content, or pricing strategies to customers’ preferences and behaviors, whereas operational efficiency means fine-tuning supply chains, inventory management, and customer service. Achieving both objectives at the same time is difficult, yet it is what the fast-changing e-commerce world requires3,4. As the landscape becomes more competitive, businesses are compelled to adopt innovative solutions that not only address immediate customer needs but also ensure long-term scalability and profitability. E-commerce companies are increasingly seeking strategies that leverage big data, machine learning, and artificial intelligence to provide personalized experiences, improve user engagement, and enhance overall business performance. This research aims to bridge the gap between advanced technological solutions and their practical application to these ongoing challenges5.

Basic theory

Within e-commerce platforms, personalization and efficiency are key to driving consumer experiences and managing business processes. Personalization in e-commerce largely concerns predictive modeling, recommendation engines, and pricing. Conventional methods such as Collaborative Filtering (CF), Content-Based Filtering (CBF), and Matrix Factorization (MF) use historical user information to predict which products a user may consume6,7. These models base their suggestions on a user’s previous behavior and preferences, or on the patterns of similar users. Although they work well under certain circumstances, these older methods are severely limited in adapting to changing market conditions in real time. They struggle to keep up with shifting customer behavior, to cope with sparse data, and to capture fine-grained customer sentiment, so customers receive recommendations that are either outdated or too generic8,9.

On the operational side, inventory management, dynamic pricing, and demand forecasting all require a robust e-commerce platform that can absorb market changes, respond accordingly, and optimize backend operations. Classic solutions often use static rules or simple algorithms that work well in laboratory conditions but do not adapt, as they happen, to fluctuations in stock levels, changes in customer demand, or the actions of competing companies. These constraints make it difficult for businesses to adapt rapidly and effectively, leading to missed opportunities and poor customer experiences10,11. A better approach would use more sophisticated techniques to mitigate such effects. By using predictive models and reinforcement learning, companies can move toward more dynamic, real-time solutions that personalize user experiences and make back-end operations more efficient and profitable. This practical strategy is what today’s fast-changing e-commerce space demands12,13.

Existing methods and challenges

Current methods for personalization in e-commerce are mainly based on conventional techniques such as CF, CBF, and rule-based dynamic pricing models14,15. Collaborative Filtering, and Matrix Factorization in particular, has been a fundamental recommender-system approach that uses user interaction data to predict the products a user would be interested in, along with the associated scores16. Although successful in the majority of settings, these techniques often face scalability limitations and are therefore arguably less appropriate for large-scale systems. They are also sensitive to changes in pricing and product availability, cannot learn in real time because they rely on historical data, and perform poorly in sparse-data settings where user interaction is low, which results in inaccurate recommendations17,18.

In contrast, Content-Based Filtering suggests items on the basis of their content and a user’s past interactions with similar items. However, it suffers from the “cold start” problem, i.e., the inability to provide meaningful recommendations for new users or products about which there is little or no historical data. As a result, these systems fail to provide complementary and relevant content early in the customer journey, which diminishes the effectiveness of the generated recommendations19. With regard to operational efficiency, traditional models rely heavily on historical data and static algorithms, which lack the agility to react in real time to swings in demand, price, and inventory.

Although more sophisticated AI and ML methods have been proposed to address the above limitations, their adoption in e-commerce is still relatively low. Data privacy concerns, the lack of platform-independent integration, and computational requirements have so far limited the widespread use of such advanced approaches15. Additionally, real-time requirements can make integration into existing e-commerce set-ups a challenge, as most businesses run on legacy systems that might not support dynamic AI-based solutions. There is a growing demand for a cohesive and customizable e-commerce solution that uses best-in-class machine learning and real-time data processing. This investigation seeks to fill these gaps and to provide scalable, flexible solutions that enhance personalization and operational efficiency20,21.

Role of AI

AI is a cornerstone in tackling these challenges, bringing dynamic and scalable solutions for personalization and operational optimization to e-commerce companies. Businesses hope to leverage ML and DL to gain deeper insight into consumer behavior through advanced recommendation engines11. These systems process huge amounts of data to predict and recommend products that match individual preferences, making shopping more personal. Pricing strategies are also refined, as AI-powered models continually adjust to the current market landscape in order to maximize profit and drive competitive advantage. Furthermore, AI-driven customer interactions based on NLP are becoming common22. By interpreting customer sentiment, NLP helps organizations understand customers and their needs, and thus communicate more personally and supportively. AI also streamlines day-to-day customer support work, allowing inquiries to be answered promptly and accurately13,16.

Besides consumer-facing solutions, AI offers great benefits for the operational efficiency of e-commerce platforms. AI brings sophistication to supply chain management, inventory forecasting, and resource allocation by learning from historical data and forecasting demand trends. This allows companies to drive operational efficiencies, lower costs, and react better to market change. In short, AI not only adds a layer of personalization to the customer experience but also contributes to optimal operations, which makes it indispensable for e-commerce businesses seeking growth and efficiency17.

Research motivation

This work is motivated by the increasing demand for a unified and scalable e-commerce solution that seamlessly integrates advanced AI techniques to solve both personalization and operational optimization problems. Most studies in the literature have improved either personalization or operational efficiency in isolation, and a complete, interpretable framework that addresses both aspects in one system is lacking. Conventional systems are usually not flexible and adaptable enough to meet customers’ changing demands and the vagaries of the marketplace2,12.

This research aims to address this gap by proposing a hybrid AI-powered approach that integrates predictive modeling, reinforcement learning, and natural language processing. The proposed framework is intended to improve customer experience through personalized product recommendations and dynamic pricing, and also to improve backend operations such as inventory management, pricing updates, and customer support5,20. By combining these deep learning and meta-learning approaches in a unified system, we strive to build a scalable and adaptable system that helps tackle the real challenges e-commerce companies face in today’s highly competitive environment. In doing so, the research seeks to unlock efficiency gains through data-driven decision making and to better align customer needs with firm operations, eventually leading to higher profitability and customer satisfaction6.

Hypothesis and research questions

The core hypothesis is that integrating sophisticated predictive models with reinforcement learning and sentiment analysis will yield valuable enhancements in both customer personalization and operational efficiency15. This conjecture gives rise to the research questions:

  • In what way can a hybrid framework based on predictive modeling and reinforcement learning enhance personalization in e-commerce?

  • What effect does integrating NLP-based sentiment analysis have on customer service satisfaction?

  • How much does the proposed model improve operational efficiency in terms of supply chain management and pricing strategy?

  • How does this combined model compare to other e-commerce personalization and optimization techniques?

Proposed solution and key contributions

To address these e-commerce personalization and operational optimization problems, we propose a novel hybrid e-commerce framework that combines Collaborative Filtering and Matrix Factorization for dynamic, personalized product recommendation with Reinforcement Learning for pricing optimization (PO). NLP supports real-time sentiment analysis of customer service interactions by analyzing customer emotions and feedback10,23. Using historical customer data from publicly accessible datasets, the solution aims to offer tailored recommendations and update pricing strategies in real time. In addition, it is designed to help operations save time managing supply chains and predicting inventory needs. The main contributions of this paper are the following.

  • Construction of an e-commerce hybrid framework, which can integrate predictive modeling, a reinforcement learning approach, and a sentiment analysis model to improve the personalization and operation efficiency of e-commerce platforms.

  • Reinforcement Learning for dynamic pricing optimization, which adjusts in real time to market changes, together with Collaborative Filtering and Matrix Factorization for personalized and context-aware product recommendations.

  • Deploying NLP sentiment analysis to analyze customer feedback and sentiment, which can enhance the understanding and quality of customer service interactions, ultimately resulting in a more satisfied customer base.

  • Empirical study on real-world public e-commerce datasets, namely the Retail Rocket Recommender System Dataset, the Instacart Market Basket Dataset, and the Amazon Product Review Dataset, to demonstrate the effectiveness of this approach in enhancing personalization and operational efficiency across different e-commerce platforms.

In this work, we provide a complete solution that offers the best experience to the customer as well as convenience for the business, better connecting state-of-the-art AI techniques with e-commerce practice.

Article organization

The paper is structured as follows:

  • The section “Related work” reviews existing methods for e-commerce personalization and operational optimization and their use of AI, summarizing the strengths and weaknesses of current methods.

  • The section “Materials and Methods” describes our materials (i.e., datasets) and the method for the hybrid framework, including model structure and pre-processing.

  • The section “Experimental results and discussion” reports the results of the empirical analysis on public datasets and compares them with traditional approaches.

  • The section “Conclusion & future directions” presents the conclusions and findings and proposes future research directions for enhancing the AI-based e-commerce experience.

Related work

Over the last few years, the explosive growth of e-commerce has underlined how difficult it is to give individual customers personalized experiences while streamlining intricate operational processes. Traditional recommender systems, especially well-known ones such as collaborative filtering and matrix factorization, reveal valuable knowledge about user preferences but are limited by data sparsity and cold-start problems, meaning that new users or items receive poor, impersonal recommendations. Similarly, classical pricing, inventory, and supply-chain management use static heuristics that cannot easily adapt to changing markets or shifting demand.

Advanced AI-driven approaches are being developed in response: improved predictive models that consider large sets of behavioral and contextual data to produce more accurate, personalized suggestions, and reinforcement-learning-inspired methods that dynamically optimize pricing (and other operational decisions) from a continuous feed of signals2. Together, these approaches point to a hybrid e-commerce model that combines predictive personalization and adaptive decision policies as complementary elements for enhancing both the consumer experience and operational efficiency. In this context, we review state-of-the-art research on modern contextualized AI systems along three main axes: personalization, AI-based operational optimization, and hybrid AI models, to pave the way toward an integrated AI-driven framework.

Personalization techniques in e-commerce

Personalization is a vital part of an e-commerce business, driving product recommendations and customer engagement. For many years, CF, Matrix Factorization, and Content-Based Filtering have been the cornerstones of personalization in e-commerce. Mishra et al. (2025) investigated AI-based optimization in recommender systems and discussed how Deep Learning can substantially improve product recommendation by leveraging more complex models than the older CF and CBF approaches. Alti and Lakehal (2025) consider the integration of Generative AI with e-commerce to adapt on the fly to user preferences, enhancing recommendation systems through continuous learning and model updating.

Difficulties in personalization often arise from the cold start problem in Content-Based Filtering and the scalability problem of Collaborative Filtering. For example, Ramos et al. (2025) demonstrated that while deep learning can alleviate some scalability concerns, it fails to personalize well for new users or items with little data. These findings suggest that more flexible, context-sensitive systems are required that can overcome some of the limitations of the standard methods, for example, by including real-time data and user feedback.

AI-based optimization in operational efficiency

In e-commerce, efficiency also covers backend processes such as pricing strategies, inventory management, and supply chain management. Reinforcement Learning (RL) has shown success in fitting dynamic pricing models to market conditions. Sharma et al. (2022) present a Reinforcement Learning-based approach to enhance upselling strategies by adapting product recommendations in real time. Selvasundaram et al. (2025) discussed how AI is used in fraud detection and customer behavior analytics to improve efficiency and allocate resources more effectively.

One of the primary difficulties in operational optimisation is the trade-off between responsive decision-making and legacy systems that are not amenable to dynamic revision. Feng (2025) points out weaknesses of AI in cross-border e-commerce, namely the lack of real-time data integration and of flexible deployment. Overcoming these limitations requires the capacity to include external factors such as market demand, pricing, competition, and consumer sentiment within operational processes.

Hybrid approaches in e-commerce

Applying AI algorithms to tackle personalization and efficiency problems together is increasingly popular in today’s e-commerce. Presskila et al. (2025) presented a hybrid learning-based Q-commerce framework that departs from traditional economic models by combining machine learning with real-time customer interaction data, enhancing user experience and simplifying back-end operations. Likewise, Prova (2025) fused Deep Learning with NLP techniques to improve e-commerce recommendation systems by analyzing the multilingual emotions implied in feedback reviews, enabling deeper customer engagement.

Hybridizing different AI technologies is a promising remedy for the shortcomings of single-model systems. Ahuja and Gupta (2025) developed a hybrid recommender that uses sentiment analysis and LLM embeddings for product recommendation in emerging markets. Table 1 presents a comparative analysis of existing research. The existing works deliver good insights into personalization and AI-based operational optimization, yet most of them address only one of the two or adapt poorly to real-time applications. Our hybrid solution, by contrast, integrates predictive modelling, reinforcement learning, and NLP to tackle personalization and operational efficiency challenges simultaneously. Combining these AI techniques into a more versatile, responsive system offers substantial enhancements across the different functions of e-commerce.

Table 1.

Comparative analysis of existing research in the field of e-commerce.

| Article | Personalization Method | Operational Efficiency | AI Integration | Model Adaptability | Real-World Applicability |
|---|---|---|---|---|---|
| Mishra et al. (2025) | Deep Learning Recommender Systems | Not discussed | AI in Recommenders | Limited real-time updates | Limited real-world testing |
| Alti & Lakehal (2025) | Generative AI for Recommendations | Not discussed | AI in Recommendations | Highly dynamic and adaptable | Highly relevant for dynamic systems |
| Ramos et al. (2025) | Time Series Deep Learning | Not discussed | Deep Learning for Time Series | Adaptive to market trends | Suitable for large-scale platforms |
| Presskila et al. (2025) | Hybrid Machine Learning | AI in Q-commerce (Real-time) | AI-driven Dynamic System | Real-time adaptability | Designed for e-commerce scalability |
| Prova (2025) | Deep Learning + NLP | Not discussed | Sentiment Analysis for Reviews | Moderate adaptability | Effective for customer reviews |
| Feng (2025) | Not discussed | AI in Cross-Border E-Commerce | AI in Product Selection | Not adaptable to real-time changes | Practical for cross-border e-commerce |
| Sharma et al. (2022) | Not discussed | Reinforcement Learning for Pricing | Reinforcement Learning | Dynamic Pricing Optimization | Works for specific pricing models |
| Selvasundaram et al. (2025) | Not discussed | AI for Fraud Detection and Behavior Prediction | AI for Fraud and Behavior | Moderate real-time integration | Effective in customer behavior analysis |
| Ahuja & Gupta (2025) | Hybrid Sentiment Analysis | Not discussed | Sentiment Analysis + LLM | Real-time integration possible | Works in emerging markets |

Materials and methods

In this section, we describe the materials, methods, and mathematical models used to develop and test the proposed hybrid e-commerce model.

Dataset description

This research uses three popular, publicly available datasets for the development and evaluation of the proposed hybrid e-commerce system, chosen for their content and general relevance to the e-commerce domain. The datasets are described below, and Table 2 summarizes their distinctive properties.

Table 2.

Dataset Overview.

| Dataset Name | Number of Records | Number of Features | Key Features |
|---|---|---|---|
| Retailrocket Recommender System | 2,756,101 | 4 | timestamp, itemid, property, value |
| Instacart Market Basket | 3,400,000 | 8 | Order ID, User ID, Product ID, Product Name, Quantity, Timestamp, Category, Reordered |
| Amazon Product Review | 3,000,000 | 7 | Product ID, User ID, Rating, Review Text, Timestamp, Category, Sentiment |

Retailrocket recommender system dataset

This is a popular dataset for recommender systems. It records the behavior of e-commerce users interacting with different products. The dataset provides details of user sessions, products, and interactions, and can be used to experiment with collaborative filtering and matrix factorization. It can power personalised recommendations by revealing which products are regularly viewed together or which products people generally buy after viewing an item24.

Instacart market basket dataset

The Instacart Market Basket Dataset contains anonymized customer orders from the Instacart online grocery store. It includes product details, users’ transactions with items, and item categories, and is well studied for shopping basket analysis and recommender-system improvements. The dataset is well suited to market basket analysis, which finds association rules between products purchased together, leading to more personalized product recommendations25.

Amazon product review dataset

This is the Amazon product dataset, which includes product details and users’ comments. It contains text reviews, ratings, and product descriptions, which can be used for sentiment analysis and for understanding customer feedback. Such a dataset is essential for sentiment analysis systems that recognize customer sentiment from text and help improve customer service and feedback handling26.

Data pre-processing

Data pre-processing is critical for converting raw data into a structure that is appropriate for analysis and modeling27. The following steps are applied consistently across the datasets:

Data integration

Integrate the different data files of each dataset to produce a single, complete view21,28.

  • Retailrocket: Merge events.csv, item_properties.csv, and category_tree.csv to link the user interactions to the item information.

  • Instacart: Join orders.csv, order_products_prior.csv, products.csv, aisles.csv, and departments.csv to connect order information with product attributes.

  • Amazon Product Reviews: Join reviews.csv with products.csv to link reviews to product metadata.
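
The joins listed above can be illustrated with a minimal sketch, assuming the files have already been read into lists of dictionaries. The field names (`user_id`, `item_id`, `category`) and the helper `left_join` are illustrative assumptions, not the authors' implementation; in practice a dataframe library would typically perform the same merge.

```python
# Hypothetical in-memory stand-ins for rows read from events.csv and
# item_properties.csv; field names are assumptions for illustration only.
events = [
    {"user_id": 1, "item_id": 10, "event": "view"},
    {"user_id": 2, "item_id": 11, "event": "addtocart"},
    {"user_id": 1, "item_id": 12, "event": "view"},
]
item_properties = [
    {"item_id": 10, "category": "shoes"},
    {"item_id": 11, "category": "bags"},
]


def left_join(left, right, key):
    """Attach the matching right-hand row to every left-hand row (left join)."""
    index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = index.get(row[key], {})
        # Left-hand fields win on name clashes; unmatched rows keep their own fields.
        joined.append({**match, **row})
    return joined


merged = left_join(events, item_properties, "item_id")
```

A left join keeps every interaction even when no item metadata exists, so the third event above survives without a `category` field and can be handled in the missing-value step.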

Handling missing values

Handle missing values to preserve dataset integrity29.

  • Retailrocket: Fill missing values in item_properties.csv with median or mode imputation30,31.

  • Instacart: Replace missing product_name values in products.csv with the most frequent product name.

  • Amazon Product Reviews: Remove reviews with a missing product_id or review_text so that the remaining reviews can be analyzed for sentiment31,32.
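
As a hedged illustration of the three strategies above (median or mode imputation, most-frequent substitution, and row removal), the sketch below uses only the Python standard library. The helper names `impute` and `drop_incomplete` and the sample rows are assumptions made for this example.

```python
from statistics import median, mode


def impute(rows, field, strategy):
    """Fill None values in `field` with the median or mode of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed) if strategy == "median" else mode(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def drop_incomplete(rows, required):
    """Discard rows in which any required field is missing or empty."""
    return [r for r in rows if all(r.get(f) not in (None, "") for f in required)]


# Made-up sample rows, not values from the actual datasets.
items = [{"price": 10.0}, {"price": None}, {"price": 30.0}, {"price": 20.0}]
products = [
    {"product_name": "milk"},
    {"product_name": None},
    {"product_name": "milk"},
    {"product_name": "eggs"},
]
reviews = [
    {"product_id": "A1", "review_text": "great"},
    {"product_id": None, "review_text": "ok"},
    {"product_id": "A2", "review_text": ""},
]

items_filled = impute(items, "price", "median")
products_filled = impute(products, "product_name", "mode")
reviews_clean = drop_incomplete(reviews, ["product_id", "review_text"])
```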

Data transformation

Transforming the data into a format used for analysis1215.

  • Retailrocket: Convert the timestamp field from a raw timestamp to a datetime type for easy date and time operations33.

  • Instacart: Create a single transaction ID by combining order_id and user_id to identify unique purchases.

  • Amazon Product Reviews: Normalize the ratings to a common scale (e.g., 1 to 5) for consistency34.
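
The transformations above might look as follows in a minimal sketch. Several details are assumptions rather than statements from the paper: that raw timestamps are Unix milliseconds, that the transaction ID is a simple string concatenation, and that rating normalization is a linear rescale onto a 1 to 5 range.

```python
from datetime import datetime, timezone


def to_datetime(ts_ms):
    """Convert a Unix timestamp in milliseconds (assumed format) to a UTC datetime."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)


def transaction_id(order_id, user_id):
    """Combine user and order identifiers into one key (format is illustrative)."""
    return f"{user_id}_{order_id}"


def rescale_rating(rating, old_min, old_max, new_min=1.0, new_max=5.0):
    """Linearly map a rating from [old_min, old_max] onto [new_min, new_max]."""
    return new_min + (rating - old_min) * (new_max - new_min) / (old_max - old_min)


event_time = to_datetime(1_433_221_332_117)
purchase_key = transaction_id(order_id=7, user_id=3)
rating_1_to_5 = rescale_rating(5, old_min=0, old_max=10)
```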

Categorical encoding

Convert categorical features into numerical features.

  • Retailrocket: Apply one-hot encoding to the category property in item_properties.csv, using a one-hot encoder to produce one indicator column per category35.

  • Instacart: Perform label encoding on aisle_id and department_id to convert them into numerical values36.

  • Amazon Reviews: Map sentiment labels (positive, neutral, and negative) to numerical values (1, 0, −1).
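
The three encodings above can be sketched in plain Python as follows. The function names and the choice to order categories alphabetically are assumptions for illustration; a library encoder would normally be used on the real data.

```python
def one_hot(values):
    """Return one indicator row per value plus the (sorted) category order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def label_encode(values):
    """Map each distinct value to an integer code in sorted order."""
    mapping = {v: code for code, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


# Sentiment mapping as stated in the text: positive, neutral, negative -> 1, 0, -1.
SENTIMENT_SCALE = {"positive": 1, "neutral": 0, "negative": -1}

category_rows, category_names = one_hot(["shoes", "bags", "shoes"])
aisle_codes, aisle_mapping = label_encode([24, 83, 24])
sentiment_values = [SENTIMENT_SCALE[s] for s in ["positive", "negative", "neutral"]]
```

One-hot encoding avoids implying an order between categories, whereas label encoding is more compact but introduces an arbitrary numeric order that some models may misread.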

Text preprocessing

Formatting text data for analysis.

  • Retailrocket: Clean the text values in item_properties.csv by stripping out special characters and converting everything to lowercase3,37.

  • Instacart: Preprocess product_name by tokenizing and removing stop words.

  • Amazon Review: Apply stemming or lemmatization to the review_text to simplify words to their base form.
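
A minimal sketch of the cleaning and tokenization steps above, using only the standard library. The stop-word list is a tiny illustrative assumption, and stemming or lemmatization is deliberately left out because it would normally rely on an external NLP library.

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "it", "this"}


def clean_text(text):
    """Lowercase the text and replace special characters with single spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text on whitespace and drop stop words."""
    return [token for token in clean_text(text).split() if token not in STOP_WORDS]


cleaned_name = clean_text("Organic  Whole-Milk (1L)!")
review_tokens = tokenize("This is the BEST blender!")
```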

Feature scaling

Scale the numerical features to a comparable range22.

  • Retailrocket: Apply z-score normalization to price and quantity in item_properties.csv.

  • Instacart: Scale order_hour_of_day to the range 0–1 (refs. 23,38).

  • Amazon Product Reviews: Min-max normalize review_length to scale the input features.
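
The two scaling schemes named above can be sketched as follows. The use of the population standard deviation and the fixed 0 to 23 bounds for the hour of day are assumptions made for this example.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values, lo=None, hi=None):
    """Rescale values to [0, 1]; bounds default to the observed min and max."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(v - lo) / (hi - lo) for v in values]


prices_scaled = z_score([10.0, 20.0, 30.0])
hours_scaled = min_max([0, 12, 23], lo=0, hi=23)   # hour of day, assumed 0..23
lengths_scaled = min_max([5, 15, 25])              # review lengths (made-up values)
```

Z-scores suit roughly symmetric features such as price, while min-max scaling keeps bounded features such as the hour of day inside a fixed interval.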

Data aggregation

Aggregate data to grasp more abstract patterns.

  • Retailrocket: Group the events file by user_id and item_id and compute interaction counts21.

  • Instacart: Group order_products_prior.csv by order_id to find the most frequently purchased items.

  • Amazon Product Reviews: Group reviews by product_id to calculate average ratings and review counts28,39.
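
The aggregations above amount to group-by counts and averages. The following sketch shows one way to compute them with the standard library; the tuples are made-up sample rows and the variable names are assumptions.

```python
from collections import Counter, defaultdict

# (user_id, item_id) interaction events; illustrative values only.
events = [(1, 10), (1, 10), (2, 11), (1, 12)]
interaction_counts = Counter(events)

# (order_id, product) rows; count how often each product was purchased.
order_products = [(100, "milk"), (100, "eggs"), (101, "milk")]
most_bought = Counter(product for _, product in order_products).most_common(1)

# (product_id, rating) rows; average rating and review count per product.
reviews = [("A1", 5), ("A1", 3), ("A2", 4)]
ratings_by_product = defaultdict(list)
for product_id, rating in reviews:
    ratings_by_product[product_id].append(rating)

review_summary = {
    product_id: {"avg_rating": sum(r) / len(r), "n_reviews": len(r)}
    for product_id, r in ratings_by_product.items()
}
```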

Although the datasets differ considerably, the process outlined above can be applied to any similar dataset before analysis and modeling. Standardizing these steps ensures consistency and quality across disparate data sources, which in turn yields more accurate results and insights.

System architecture

Our proposed model is an AI-driven e-commerce system: a hybrid solution for maximizing customer personalization and profit efficiency. Its recommendation system combines Collaborative Filtering and Matrix Factorization to provide users with personalized recommendations based on their historical behavior and interactions. Its pricing component uses RL to set prices dynamically without real-time input, optimizing the pricing strategy from previous market conditions and competitor actions1,40.

Figure 1 presents the architecture of the proposed hybrid model and its key components. In addition to recommendation and pricing, NLP is applied to customer feedback and sentiment to enhance customer service through personalized responses. It is important to emphasise that RL is used to optimise pricing, not to produce recommendations directly; the core recommendation task is handled by CF and MF41. The model works in an offline mode rather than in real time, using historical data to update its recommendations, pricing strategies, and customer service tips. This mixed approach enables the platform to provide highly personalised experiences and effective back-end operations, such as inventory, pricing, and service quality management, in a flexible but efficient manner.

Fig. 1.

System Architecture of Proposed Hybrid Model.

Collaborative filtering

Collaborative Filtering is a central personalization technique that recommends products based on the tastes of similar users. The task is to suggest items a user is likely to enjoy based on the activity of other, similar users2130.

  • Working: Collaborative Filtering techniques use past interactions between users and items, based on user-to-user or item-to-item similarities:
    • User-based CF: Recommends items liked by similar users2,7.
    • Item-based CF: Recommends items similar to those the user has interacted with in the past.
  • Mathematical Equation: In User-based Collaborative Filtering, the rating of a user for an item is estimated from the similarity-weighted average of the ratings given by the user’s nearest neighbors (users exhibiting analogous behaviors) (Eq. 1).
$$\hat{r}_{u,i} = \bar{r}_u + \frac{\sum_{v \in N(u)} \mathrm{sim}(u,v)\,\left(r_{v,i} - \bar{r}_v\right)}{\sum_{v \in N(u)} \left|\mathrm{sim}(u,v)\right|} \tag{1}$$

Where: $\hat{r}_{u,i}$ = predicted rating for user $u$ and item $i$, $\bar{r}_u$ = average rating given by the user $u$ (and $\bar{r}_v$ likewise for neighbor $v$), $N(u)$ = neighborhood of the user $u$, $r_{v,i}$ = rating given by neighbor $v$ to item $i$ and $\mathrm{sim}(u,v)$ = similarity between users $u$ and $v$.
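
To make the user-based prediction of Eq. 1 concrete, the sketch below computes a mean-centred, similarity-weighted prediction on a toy rating table. Several choices are assumptions for illustration and are not taken from the paper: similarity is the cosine of mean-centred ratings over co-rated items, the neighborhood is every other user who rated the target item, and the user names and ratings are invented.

```python
from math import sqrt

# Toy user -> {item: rating} table; all values are made up.
RATINGS = {
    "alice": {"a": 5.0, "b": 3.0, "c": 4.0},
    "bob": {"a": 4.0, "b": 2.0, "c": 3.0, "d": 5.0},
    "carol": {"a": 1.0, "b": 5.0, "d": 2.0},
}


def mean_rating(user):
    values = RATINGS[user].values()
    return sum(values) / len(values)


def similarity(u, v):
    """Cosine similarity of mean-centred ratings over items both users rated."""
    common = RATINGS[u].keys() & RATINGS[v].keys()
    if not common:
        return 0.0
    mu, mv = mean_rating(u), mean_rating(v)
    num = sum((RATINGS[u][i] - mu) * (RATINGS[v][i] - mv) for i in common)
    den = sqrt(sum((RATINGS[u][i] - mu) ** 2 for i in common)) * sqrt(
        sum((RATINGS[v][i] - mv) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(u, item):
    """User mean plus the similarity-weighted average of neighbour deviations."""
    neighbours = [v for v in RATINGS if v != u and item in RATINGS[v]]
    num = sum(similarity(u, v) * (RATINGS[v][item] - mean_rating(v)) for v in neighbours)
    den = sum(abs(similarity(u, v)) for v in neighbours)
    return mean_rating(u) + (num / den if den else 0.0)


prediction = predict("alice", "d")
```

In this toy example the like-minded neighbour rated item `d` above their own average and the dissimilar neighbour rated it below theirs, so both push the prediction above the target user's mean. The result is not clipped to the rating scale; a production system would usually do so.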

Matrix factorization (MF)

Matrix Factorization improves recommendations by discovering the latent (hidden) factors that influence user-item interactions. It is faster than CF when dealing with large, sparse data3234.

  • Working: Matrix Factorization factorizes the user-item interaction matrix $R$ into two matrices:

  • $P$ (user latent features).

  • $Q$ (item latent features).

Such latent features reflect the hidden correlations between users and items.

  • Mathematical Equation: The idea of Matrix Factorization is to factorize the matrix $R$ (having some known user-item interactions) into two low-rank matrices $P$ and $Q$, as presented by Eqs. 2 and 3.

$$R \approx P\,Q^{\top} \qquad (2)$$

Where: $R$ = User-item interaction matrix (ideally ratings, purchase history), $P$ = User feature matrix (users × factors) and $Q$ = Item characteristic matrix (items × latent dimensions).

The goal is to minimize the squared error between the predicted ratings and the actual ratings.

$$\min_{P,Q} \sum_{(u,i) \in K} \left(r_{u,i} - p_u^{\top} q_i\right)^2 + \lambda \left(\lVert p_u \rVert^2 + \lVert q_i \rVert^2\right) \qquad (3)$$

Where: $K$ = set of observed user-item interactions, $r_{u,i}$ = rating given by user $u$ for item $i$, $p_u$ = user latent vector for user $u$, $q_i$ = item latent vector for item $i$ and $\lambda$ = regularization parameter.
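
The regularised squared-error objective described above can be minimised with stochastic gradient descent. The sketch below is a minimal pure-Python illustration on a toy set of observed ratings; the function names, the initialisation, and the toy data are assumptions for this example and do not reproduce the authors' code.

```python
import random


def train_mf(observed, n_users, n_items, d=2, lr=0.01, lam=0.01, epochs=200, seed=0):
    """Fit user factors P and item factors Q by SGD on observed (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in observed:
            # Prediction error for this observed rating.
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(d):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularisation on both factor vectors.
                P[u][f] += lr * (err * qi - lam * pu)
                Q[i][f] += lr * (err * pu - lam * qi)
    return P, Q


def squared_error(observed, P, Q):
    """Sum of squared errors over the observed ratings (without the penalty term)."""
    return sum(
        (r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in observed
    )


# Hypothetical observed ratings: (user index, item index, rating).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P0, Q0 = train_mf(observed, 3, 3, epochs=0)
P, Q = train_mf(observed, 3, 3, lr=0.05, epochs=500)
error_before = squared_error(observed, P0, Q0)
error_after = squared_error(observed, P, Q)
```

Only observed entries contribute to the updates, which is what makes the approach practical for sparse interaction matrices.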

Reinforcement learning (RL) for dynamic pricing

Reinforcement Learning treats price optimization as a dynamic decision-making problem. The agent learns the optimal pricing policy by interacting with the environment (historical sales data, competitor prices, inventory levels) and receiving reward signals based on its pricing decisions29,31. Because RL generally needs immediate feedback, we instead use historical data to simulate several interaction episodes. This enables the agent to learn pricing policies from historical market conditions and to approximate near-real-time adaptation without requiring instantaneous user feedback. The trained model is continuously refreshed to keep it relevant as the data change over time41,42.

  • State: The state of the environment at a certain time (e.g., recent stock levels, prices of competitors).

  • Action: Decision made by the agent (e.g., how much to change the price of a product).

  • Reward: This is the reward the agent obtains after acting (e.g., an increase in sales, profit).

  • Mathematical Equation (Q-Learning): The pricing policy is learned with Q-Learning (Eq. 4). The Q-function evaluates how good it is to take a specific action in a given state:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right] \qquad (4)$$

Where: $s_t$ = state at time $t$, $a_t$ = action at time $t$, $r_t$ = immediate reward, $\alpha$ = learning rate and $\gamma$ = discount factor.
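
To make the Q-learning update concrete, the sketch below applies one tabular update step to a pricing decision. The state labels, the three price actions, and the reward values are hypothetical; the learning rate and discount factor default to 0.1 and 0.9, the tuned values reported later in Table 3.

```python
from collections import defaultdict

# Hypothetical discrete price actions.
ACTIONS = ("lower", "hold", "raise")


def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)  # all Q-values start at 0
# First update: 0 + 0.1 * (10 + 0.9 * 0 - 0)
first = q_update(Q, "low_stock", "raise", reward=10.0, next_state="ok_stock")
# Suppose the next state already has a known good action.
Q[("ok_stock", "hold")] = 5.0
# Second update: 1.0 + 0.1 * (10 + 0.9 * 5 - 1.0)
second = q_update(Q, "low_stock", "raise", reward=10.0, next_state="ok_stock")
```

The second update is larger than the first because the value of the best follow-up action now feeds into the target through the discount factor.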

Natural language processing (NLP) for sentiment analysis

NLP is used to decipher customer feedback, reviews, and social media data to determine sentiment (positive, negative, neutral). This knowledge also supports better customer service by directing automated responses or informing more personalized strategies43,44. It performs the following operations:

  • Text Processing: The text data (such as reviews, feedback, and user comments) undergoes several processing steps, including tokenization, stemming, and stopword removal to prepare it for analysis2,6.

  • Feature Extraction: The processed text is then transformed into numerical features for machine learning models. Techniques like TF-IDF (Term Frequency-Inverse Document Frequency) and word embeddings such as word2Vec or BERT embeddings are used to represent the text in a vectorized form suitable for sentiment analysis models3,16.

  • Sentiment Classification: A classification model, such as Logistic Regression, SVM (Support Vector Machines), or Neural Networks, is employed to classify the sentiment of the text (reviews, comments, etc.) into categories such as positive, negative, or neutral.

  • Scoring of Opinion: Sentiment scores are assigned to the text to represent the customer’s opinion about the product or service. The scores are −1 for negative, 0 for neutral, and +1 for positive sentiment44.

In the case of the Amazon Product Review Dataset, sentiment analysis is applied directly to the textual reviews and ratings provided by customers. These reviews and ratings allow for direct extraction of customer sentiment through NLP models, which enhances the understanding of customer feedback.

Application to Retailrocket and Instacart datasets

Unlike the Amazon Product Review Dataset, the Retailrocket and Instacart datasets primarily contain interaction/transaction data and do not inherently include textual feedback. Sentiment analysis is nevertheless performed on these datasets, using whatever user feedback and product reviews accompany the data:

  1. Retailrocket Dataset

  • The primary data in Retailrocket includes user-item interaction events such as product views and purchases. Although explicit textual feedback is often absent, we leverage any user comments or product descriptions (when available) linked to specific items for sentiment analysis.

  • Additionally, in the absence of direct customer reviews, we infer implicit sentiment from transactional behaviour such as purchase frequency and repeated interactions with certain items. For example, a user repeatedly purchasing or interacting with a particular product category can be interpreted as positive sentiment, while a single transaction or one-time interaction may indicate less favourable sentiment.

  2. Instacart Dataset

  • The Instacart dataset includes transactional data such as user orders and product details. Although there is no direct textual feedback, sentiment analysis is applied to any user comments or reviews linked to the products in the dataset. These reviews are processed to determine sentiment about specific products or categories.

  • In the absence of direct customer feedback, we infer sentiment from transactional patterns, such as repeated purchases or interaction frequency. For example, products with high re-purchase rates are generally associated with positive sentiment, while items with low interaction may indicate less satisfaction.

This process allows for the integration of sentiment analysis even in datasets with limited or no direct textual feedback, helping to enhance the personalization and operational efficiency aspects of the hybrid model.
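
The implicit-sentiment idea described above can be sketched as a simple counting rule. The example below is only an illustration: the threshold of two purchases, the choice to map one-off purchases to a neutral score of 0 instead of a negative score, and the event format are assumptions, since the text does not specify how transactional patterns are converted to scores.

```python
from collections import Counter


def implicit_sentiment(events, repeat_threshold=2):
    """Score each (user, product) pair: +1 for repeated purchases, 0 otherwise.

    `events` is an iterable of (user_id, product_id) purchase records.
    """
    counts = Counter(events)
    return {pair: (1 if n >= repeat_threshold else 0) for pair, n in counts.items()}


# Hypothetical purchase log.
events = [("u1", "milk"), ("u1", "milk"), ("u2", "tea")]
scores = implicit_sentiment(events)
```

A rule of this kind yields a weak proxy signal only; purchase frequency can also reflect habit or necessity, so such scores would normally carry less weight than sentiment extracted from actual review text.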

Output layer: integration of CF, MF, RL, and NLP outputs

In this section, we explain the process through which the outputs of the core components of our hybrid e-commerce recommendation system, such as Collaborative Filtering, Matrix Factorization, Reinforcement Learning, and Natural Language Processing, are integrated to generate the final recommendation output ($R_{\text{final}}$). The integration of these components is essential for providing a holistic and actionable recommendation to the users, accounting for not only product suggestions but also dynamic pricing and customer sentiment.

  • Weighted Aggregation Strategy

To combine the outputs of the different models, we adopt a weighted aggregation strategy (Eq. 5), where the final recommendation score (Rfinal) is calculated as the weighted sum of the individual outputs from CF + MF, RL, and NLP:

$$R_{\text{final}} = w_{\text{CF+MF}} \cdot R_{\text{CF+MF}} + w_{\text{RL}} \cdot R_{\text{RL}} + w_{\text{NLP}} \cdot R_{\text{NLP}} \qquad (5)$$

Where:

  • $R_{\text{CF+MF}}$ represents the personalized product recommendations generated by the Collaborative Filtering (CF) and Matrix Factorization (MF) models.

  • $R_{\text{RL}}$ represents the dynamic pricing recommendations from the Reinforcement Learning (RL) model.

  • $R_{\text{NLP}}$ represents the sentiment analysis scores produced by the Natural Language Processing (NLP) component.

  • Clarification of Component Roles

While each model provides valuable insights, they address different aspects of the recommendation system. The components of the model are designed to handle distinct tasks:

  • CF and MF are focused on product recommendations. These models generate scores based on historical user-item interactions, indicating the likelihood of a user interacting with a specific product based on similar users or item characteristics.

  • RL optimizes pricing recommendations by analyzing market conditions, competitor prices, and inventory levels. The RL model adjusts prices dynamically to maximize profitability.

  • NLP provides sentiment analysis scores, helping understand user sentiment toward products. This is crucial for interpreting customer feedback and adjusting the recommendation or pricing strategy based on customer satisfaction.

Although the outputs from these models have different units and ranges, they are all incorporated into a unified recommendation process using the weighted aggregation strategy. Importantly, the weighted sum approach ensures that each component contributes to the final output in proportion to its relevance to the e-commerce platform’s goals.

  • Normalization of Outputs

Since the outputs of the CF + MF, RL, and NLP components are on different scales (e.g., product recommendation scores ranging from 0 to 1, pricing recommendations as a monetary value, and sentiment scores from − 1 to + 1), we perform normalization before integrating them. This step ensures that all outputs are comparable and appropriately scaled:

  • Product Recommendation Scores ($R_{\text{CF+MF}}$) are normalized to a scale of 0 to 1, where 0 indicates no recommendation and 1 indicates a highly personalized recommendation.

  • Pricing Recommendations ($R_{\text{RL}}$) are normalized within a competitive price range, ensuring that the prices remain feasible within the market context.

  • Sentiment Scores ($R_{\text{NLP}}$) are normalized between −1 (negative sentiment) and +1 (positive sentiment), with 0 representing neutral sentiment.

By normalizing the outputs, we ensure that each component’s influence on the final recommendation is comparable and meaningful.

  • Weighted Contribution of Each Component

The weights $w_{\text{CF+MF}}$, $w_{\text{RL}}$, and $w_{\text{NLP}}$ represent the relative importance of each component in the final decision-making process. These weights are learned during the training phase of the model to reflect their contribution to achieving the business objectives. For instance, if pricing optimization is more critical in a given scenario, the weight for RL ($w_{\text{RL}}$) would be higher, emphasizing the pricing strategy in the final output.

The weights are assigned as follows:

  • $w_{\text{CF+MF}}$ represents the weight for the product recommendation component (CF + MF), reflecting the importance of personalized product suggestions.

  • $w_{\text{RL}}$ represents the weight for the pricing recommendation component (RL), reflecting the importance of dynamic pricing optimization.

  • $w_{\text{NLP}}$ represents the weight for the sentiment analysis component (NLP), reflecting the importance of customer sentiment in shaping recommendations.

The training process adjusts these weights based on the data to ensure that the final recommendations align with the platform’s objectives.

  • Final Output Calculation

To illustrate the integration process, consider a scenario where the outputs of the different components are as follows:

  • $R_{\text{CF+MF}}$ = 0.8 (indicating a high likelihood of user interaction with the recommended product),

  • $R_{\text{RL}}$ = 0.95 (indicating an optimized pricing strategy),

  • $R_{\text{NLP}}$ = 0.5 (moderately positive sentiment on the −1 to +1 scale; this value would be neutral only if sentiment were rescaled to 0 to 1).

Let’s assume the weights are:

  • $w_{\text{CF+MF}}$ = 0.5 (importance of product recommendations),

  • $w_{\text{RL}}$ = 0.3 (importance of pricing recommendations),

  • $w_{\text{NLP}}$ = 0.2 (importance of sentiment analysis).

The final recommendation score ($R_{\text{final}}$) would be computed as follows:

$$R_{\text{final}} = (0.5 \times 0.8) + (0.3 \times 0.95) + (0.2 \times 0.5) = 0.4 + 0.285 + 0.1 = 0.785$$

This $R_{\text{final}}$ score would then be used to present the final recommendations to the user, ensuring that product suggestions, pricing strategies, and sentiment analysis are all considered holistically.

The weighted aggregation strategy ensures that the hybrid model integrates the outputs from CF, MF, RL, and NLP in a meaningful way, providing a comprehensive recommendation that balances product suggestions, pricing optimization, and customer sentiment. By normalizing the different outputs and adjusting their weights during training, the model effectively combines these diverse components into a single actionable recommendation. This integration process allows the system to address multiple facets of e-commerce, offering a solution that not only enhances product recommendations but also optimizes pricing strategies and incorporates customer feedback.
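
The weighted aggregation can be reproduced in a few lines. The sketch below uses the component scores and weights from the illustrative scenario in this section; the dictionary keys, the function names, and the min-max helper for bringing a raw price into a 0 to 1 range are assumptions introduced for the example.

```python
def min_max(value, low, high):
    """Rescale a raw value (e.g., a price within a competitive range) to [0, 1]."""
    return (value - low) / (high - low)


def final_score(scores, weights):
    """Weighted sum of normalised component scores; weights are expected to sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Values from the illustrative scenario in the text.
scores = {"cf_mf": 0.8, "rl": 0.95, "nlp": 0.5}
weights = {"cf_mf": 0.5, "rl": 0.3, "nlp": 0.2}
r_final = final_score(scores, weights)  # 0.4 + 0.285 + 0.1

# Hypothetical price of 12 within a competitive range of 10 to 20.
normalised_price = min_max(12.0, 10.0, 20.0)
```

Requiring the weights to sum to 1 keeps the final score on the same scale as the inputs, which makes scores comparable across products.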

Algorithm and flowchart for proposed hybrid model

Algorithm 1 presents the algorithm steps for the proposed hybrid model, and Fig. 2 presents the Flowchart for the proposed hybrid system.

Algorithm 1.

Algorithm for a Hybrid E-Commerce Model.

Fig. 2.

Flowchart for the proposed hybrid system.

Model training and hyperparameter tuning

This section concentrates on model training and hyperparameter tuning, two crucial steps in the construction of a high-performing hybrid AI-driven e-commerce system. Training the models (Collaborative Filtering, Matrix Factorization, Reinforcement Learning, NLP) carefully and tuning their hyperparameters ensures that they perform well in both personalization and operational efficiency33,34.

Model training overview

Model training is the process of learning from data: the model parameters are modified over successive iterations on the training data to minimize an error measure or to maximize an objective (e.g., user satisfaction, profitability, sentiment prediction) until the model reaches a stable state. Post-training, hyperparameter tuning is performed so that the models function properly and generalize to unseen data13,35.

Collaborative filtering model training

CF can be trained using two main approaches: user-based CF and item-based CF. Both techniques use historical interaction data to detect similarity among users or items. The goal is to predict the interactions between a user and an item (e.g., rating, purchase)7,10.

  • User-based CF: Locate users that are most similar to the target user based on the combined preferences, and suggest items liked by similar users.

  • Item-based CF: Recommends items that are similar to the previous items that the user has consumed.

  • Training Steps:

    • Data Preprocessing: Clean and normalize the user-item interaction data.
    • Similarity Calculation: Measure the similarity between users (user-based CF) or items (item-based CF) using cosine similarity, Pearson correlation, or another similarity metric.
    • Prediction Generation: Predict ratings of the items that the user has not interacted with yet.
    • Optimization: Refine the predictions using methods such as Matrix Factorization or Stochastic Gradient Descent (SGD).

Matrix factorization model training

MF decomposes the interaction matrix of users and items into two low-rank matrices: a user matrix and an item matrix12. The task is to make predictions on the missing entries of the interaction matrix.

  • Training Steps:

    • Matrix Setup: Generate the user-item interaction matrix $R$, where each element reflects a user’s actions towards an item (e.g., rating).
    • Optimization: Employ SGD to optimize the loss function.
    • Discover Latent Features: The model discovers latent features explaining the user-item interactions, leading to improved recommendations.

RL for dynamic pricing model training

RL models learn pricing policies through trial and error by interacting with environment variables (market conditions, demand, stock levels)23,33.

Q-learning: a type of off-policy RL algorithm that learns the value (expected discounted future return) of taking a specific action (price change) in a particular state (market conditions).

  • State Space: Features such as the inventory, demand in the market, competitor prices, etc.

  • Action Space: The collection of possible price movements or changes.

  • Reward: Reward is measured in profits, sales, or turnover after the price update.

  • Training Steps:

Let State, Action, and Reward be defined as:

  • State: Represent the current status of the world (inventory and market condition).

  • Action: The price moves or the action the agent will take.

  • Reward: The reward or gain in profits or sales for taking that action.

  • Exploitation and Exploration: The agent tries out actions at various prices (exploration) while otherwise choosing the action that currently maximizes its expected long-term reward (exploitation).

  • Update policy: Update the policy through Q-learning.

  • Optimal Pricing Policy: The agent determines the best pricing strategy based on historical data.
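
The training steps above can be sketched as an epsilon-greedy Q-learning loop. The example below is a toy illustration only: it uses a single market state, three hypothetical price points, and a made-up deterministic demand table standing in for the simulated episodes built from historical data. The default learning rate, discount factor, and exploration rate follow the tuned values in Table 3.

```python
import random
from collections import defaultdict

PRICES = (8, 10, 12)              # hypothetical price points (actions)
DEMAND = {8: 12, 10: 10, 12: 5}   # hypothetical units sold at each price
UNIT_COST = 6                     # hypothetical unit cost


def profit(price):
    """Profit-based reward for charging `price` in the toy environment."""
    return (price - UNIT_COST) * DEMAND[price]


def choose_price(Q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(PRICES)
    return max(PRICES, key=lambda p: Q[(state, p)])


def train(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Run tabular Q-learning on the single-state toy pricing environment."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    state = "normal"  # the toy environment never leaves this state
    for _ in range(episodes):
        price = choose_price(Q, state, epsilon, rng)
        reward = profit(price)
        best_next = max(Q[(state, p)] for p in PRICES)
        Q[(state, price)] += alpha * (reward + gamma * best_next - Q[(state, price)])
    return Q


Q = train(episodes=20000)
best_price = max(PRICES, key=lambda p: Q[("normal", p)])
```

In this toy setting the middle price yields the highest profit per step, so the learned greedy policy should settle on it. A realistic setup would use a richer state (inventory, demand, competitor prices) and rewards derived from historical sales.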

NLP for sentiment analysis model training

NLP is applied to customer feedback, such as reviews and social media messages, to extract sentiment (positive, negative, or neutral), which then informs customer service and product suggestions6,11.

  • Data Preprocessing: Tokenization, stopword removal, and stemming/lemmatization to ready text for the model.

  • Feature Extraction: Conversion of text data to numerical vectors using methods like TF-IDF, Word2Vec, and BERT embeddings.

  • Training Steps

    • Text Preprocessing: Clean the text data (noise removal - stop words, punctuation, etc.).
    • Feature generation: Convert the text to numerical features using either TF-IDF or embeddings such as Word2Vec, GloVe, or BERT.
    • Model Training:
      • Train a late-fusion classifier (e.g., Logistic Regression, SVM, or DL models such as an LSTM or BERT) to predict the sentiment labels.
      • Update the weights of the model based on the training set (supervised learning).
      • Model evaluation: Check the Precision, Recall, Accuracy, and F1-score of the model.

Splitting the dataset

In this research, three widely used datasets containing millions of records in total were selected. Each dataset was randomly split into three distinct sets: a training set for training the model, a validation set for tuning hyperparameters, and a test set for assessing generalization and detecting overfitting or underfitting5,11.

  • Retailrocket Recommender System Dataset: This dataset consists of 2,756,101 user-item interaction events. It was split in the ratio of 70%: 15%: 15% for training, validation, and testing, respectively. That led to 1,929,270 events for training, 413,415 events for validation, and 413,416 events for testing.

  • Instacart Market Basket Dataset: consists of 3,400,000 orders. The data was split into 70% training, 15% validation, and 15% testing of the data, resulting in 2,380,000 orders for training, 510,000 for validation, and 510,000 for testing.

  • Amazon Product Reviews Dataset: This dataset is a collection of 3,000,000 product reviews. The data was split in the same way: 70% for train, 15% for validation, and 15% for test, i.e., 2,100,000 reviews for train, 450,000 reviews for validation, and 450,000 reviews for test.

Finally, early stopping and regularization techniques were applied during training, guided by validation performance metrics, to avoid overfitting. These steps played a crucial role in ensuring that the models generalize rather than memorize the data. To guard against underfitting, the architecture and feature set were adjusted iteratively10. The splits were made randomly in such a way that the distribution of the key features (such as product category and user behaviour) was preserved across all subsets, so that the models were trained and tested on representative samples of the data. The models were trained and tested on these splits after overfitting and underfitting issues had been addressed27,31.
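
The split sizes reported above follow from a 70/15/15 partition in which the test set receives the remainder. The sketch below reproduces the counts with integer arithmetic and shows a plain shuffled split of indices; it is a simplified assumption that does not implement the distribution-preserving (stratified) aspect described in the text, and the function names and seed are illustrative.

```python
import random


def split_sizes(n, train_pct=70, val_pct=15):
    """Return (train, validation, test) sizes; the test set takes the remainder."""
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return n_train, n_val, n - n_train - n_val


def split_indices(n, seed=42):
    """Shuffle the indices 0..n-1 and cut them into train/validation/test lists."""
    n_train, n_val, _ = split_sizes(n)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


retailrocket = split_sizes(2_756_101)
instacart = split_sizes(3_400_000)
amazon = split_sizes(3_000_000)
train_idx, val_idx, test_idx = split_indices(100)
```

Integer arithmetic is used so that the sizes do not depend on floating-point rounding.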

Hyperparameter tuning

Hyperparameter tuning entails identifying the optimal combination of hyperparameters to enhance model performance. This procedure is generally executed utilizing techniques such as grid search or random search, frequently alongside cross-validation, to identify the configuration that minimizes error and enhances generalization22. Table 3 presents the overview of hyperparameter settings in the proposed hybrid model.

Table 3.

Hyperparameter setting details for proposed hybrid Model.

| Model | Hyperparameter | Description | Possible Values | Tuned Values | Tuning Method |
| --- | --- | --- | --- | --- | --- |
| Collaborative Filtering | Number of Neighbors (K) | Number of similar users/items to consider | 10, 20, 50, 100, 200 | 50 | Grid Search, Random Search |
| Collaborative Filtering | Similarity Metric | Measure of similarity between users/items | Cosine, Pearson, Jaccard | Cosine | Grid Search |
| Collaborative Filtering | Regularization | Prevents overfitting | 0.01, 0.1, 1, 10 | 0.1 | Grid Search, Random Search |
| Matrix Factorization | Latent Factor Size (d) | Number of latent factors | 5, 10, 20, 50, 100 | 20 | Grid Search, Random Search |
| Matrix Factorization | Learning Rate (α) | Step size during optimization | 0.001, 0.01, 0.1, 0.5 | 0.01 | Grid Search, Random Search |
| Matrix Factorization | Regularization Parameter (λ) | Prevents overfitting by penalizing large values | 0.001, 0.01, 0.1, 1.0 | 0.01 | Grid Search, Random Search |
| Matrix Factorization | Optimization Method | Method used to optimize the model | SGD, ALS | SGD | Grid Search, Random Search |
| Reinforcement Learning | Learning Rate (α) | Adjusts Q-values in each iteration | 0.01, 0.1, 0.5, 1.0 | 0.1 | Grid Search, Random Search |
| Reinforcement Learning | Discount Factor (γ) | Discount on future rewards | 0.8, 0.9, 0.99 | 0.9 | Grid Search, Random Search |
| Reinforcement Learning | Exploration Rate (ε) | Trade-off between exploration and exploitation | 0.1, 0.2, 0.5, 0.9 | 0.2 | Grid Search, Random Search |
| Reinforcement Learning | Reward Function | How reward is calculated (e.g., profit, sales) | Profit-based, Sales-based | Profit-based | Domain Knowledge, Manual Tuning |
| Natural Language Processing | Model Type | Type of NLP model used for sentiment classification | Logistic Regression, SVM, LSTM, BERT | BERT | Grid Search, Random Search |
| Natural Language Processing | Maximum Sequence Length | Maximum text input length (tokens) | 50, 100, 150, 200 | 100 | Grid Search, Random Search |
| Natural Language Processing | Learning Rate | Update rate of the model weights | 0.001, 0.01, 0.1 | 0.01 | Grid Search, Random Search |
| Natural Language Processing | Regularization | Penalizes large model parameters | 0.001, 0.01, 0.1, 1.0 | 0.01 | Grid Search, Random Search |
| Natural Language Processing | Batch Size | Number of samples per gradient update | 32, 64, 128 | 64 | Grid Search, Random Search |
| Natural Language Processing | Epochs | Number of full passes through the dataset | 10, 20, 50, 100 | 50 | Grid Search, Random Search |
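
A grid search over candidate values such as those listed for Matrix Factorization in Table 3 can be sketched as an exhaustive loop over all combinations. In the example below the `evaluate` function is a hypothetical stand-in for a validation-error measurement (it is simply constructed so that the tuned values reported in the table score best); in practice it would train the model and return its validation RMSE.

```python
from itertools import product


def grid_search(grid, evaluate):
    """Try every combination in `grid` and return the one with the lowest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Candidate values for Matrix Factorization, taken from Table 3.
mf_grid = {
    "d": [5, 10, 20, 50, 100],
    "lr": [0.001, 0.01, 0.1, 0.5],
    "lam": [0.001, 0.01, 0.1, 1.0],
}


def evaluate(params):
    """Hypothetical validation error; a real run would train and validate the model."""
    return abs(params["d"] - 20) + abs(params["lr"] - 0.01) + abs(params["lam"] - 0.01)


best_params, best_score = grid_search(mf_grid, evaluate)
```

This grid has 5 × 4 × 4 = 80 combinations. Random search samples a subset of such combinations instead, which scales better as the number of hyperparameters grows.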

Performance measuring parameters

To accurately assess and contrast the efficacy of the proposed hybrid model (which amalgamates Collaborative Filtering, Matrix Factorization, Reinforcement Learning, and NLP) with current models in the e-commerce domain, the subsequent key performance metrics must be evaluated1218:

  • Conversion Rate (CR): Conversion Rate quantifies the proportion of visitors who complete a purchase after engaging with the recommendation system. A superior conversion rate signifies that the recommendation system is successfully facilitating user purchases, which is crucial for e-commerce (Eq. 8).

$$\text{CR} = \frac{\text{Number of visitors who complete a purchase}}{\text{Total number of visitors}} \times 100 \qquad (8)$$

  • Customer Retention Rate (CRR): CRR quantifies the proportion of customers who engage in repeat purchases within a specified timeframe. Improved retention rates show how well tailored recommendations and customer satisfaction work, both of which are critical for fostering enduring client loyalty (Eq. 9).

$$\text{CRR} = \frac{\text{Number of customers with repeat purchases}}{\text{Total number of customers}} \times 100 \qquad (9)$$

  • Operational Cost Reduction (OCR): It quantifies the decrease in operating expenses brought about by automation and improved pricing and inventory control. The system’s ability to optimize backend operations, like supply chain management and inventory, is demonstrated by the decrease in operating costs (Eq. 10).

$$\text{OCR} = \frac{\text{Operating cost before} - \text{Operating cost after}}{\text{Operating cost before}} \times 100 \qquad (10)$$

  • Profitability Improvement (PI): It calculates the increase in profit brought about by effective resource management and dynamic pricing. This measure evaluates how well other optimizations and the dynamic pricing model increase profitability (Eq. 11).

$$\text{PI} = \frac{\text{Profit after} - \text{Profit before}}{\text{Profit before}} \times 100 \qquad (11)$$

  • Sentiment Analysis Accuracy (SAA): SAA evaluates the accuracy of the sentiment analysis model in accurately interpreting consumer comments (emotions and preferences) (Eq. 12). Increases in sentiment analysis accuracy help the system to deliver customized consumer service, therefore improving general customer happiness.

$$\text{SAA} = \frac{\text{Number of correctly classified comments}}{\text{Total number of comments}} \times 100 \qquad (12)$$

  • Root Mean Squared Error (RMSE) for Recommendation Quality: Quantifies the disparity between anticipated and actual user-item interactions in recommendation systems (Eq. 13). A decreased RMSE signifies enhanced predictive accuracy regarding user-item interactions and personalization.

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(r_i - \hat{r}_i\right)^2} \qquad (13)$$

Where: $r_i$ : Actual rating of item $i$, $\hat{r}_i$ : Predicted rating of item $i$ and $N$ : Total no. of items.

  • Mean Absolute Error (MAE) for Price Prediction (Dynamic Pricing): Quantifies the discrepancy between forecasted and actual prices, particularly for the dynamic pricing model affected by reinforcement learning (Eq. 14). A reduced MAE signifies enhanced accuracy in pricing forecasts, resulting in improved profit margins.

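$$\text{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| p_i - \hat{p}_i \right| \qquad (14)$$

Where: $p_i$ : Actual price of item $i$, $\hat{p}_i$ : Forecasted price of item $i$ and $N$ : Total no. of items.

  • F1-Score for Customer Feedback Classification: Evaluates the equilibrium between precision and recall in categorizing customer sentiment as positive or negative feedback (Eq. 15).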

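$$\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (15)$$

Where:

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (16)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (17)$$

with $TP$, $FP$, and $FN$ denoting true positives, false positives, and false negatives.

  • Scalability Metric (SM): Assesses the model’s capacity to manage higher data volume without a notable drop in performance (Eq. 18).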

graphic file with name d33e2471.gif 18
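
Several of the metrics above have standard definitions that are straightforward to compute. The sketch below implements RMSE, MAE, and the F1-score from confusion counts in plain Python, using their usual textbook forms; the function names and sample values are illustrative assumptions and are not the paper's data.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def f1_score(tp, fp, fn):
    """F1-score from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for illustration.
example_rmse = rmse([1, 2], [2, 4])        # sqrt((1 + 4) / 2)
example_mae = mae([1, 2], [2, 4])          # (1 + 2) / 2
example_f1 = f1_score(tp=8, fp=2, fn=2)    # precision = recall = 0.8
```

RMSE penalises large individual errors more heavily than MAE, which is why the two can rank models differently.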

Experimental results and discussion

Experimental setup

Experiments were performed on a workstation with an Intel Xeon W-2295 CPU at 3.0 GHz, 64 GB of memory, and an NVIDIA RTX 3090 GPU with 24 GB of VRAM to support training and inference of the deep learning and reinforcement learning models. The software environment comprised Python 3.9, TensorFlow 2.8 for deep learning, Scikit-learn 1.0 for the classical machine learning algorithms, and OpenAI Gym for reinforcement learning simulations. Data preprocessing and analysis were performed using Pandas and NumPy, and the Hugging Face Transformers library was used for sentiment analysis. All experiments were run on Ubuntu 20.04 LTS, using CUDA 11.3 for GPU acceleration.

Performance comparison on key metrics

This section provides a simulation-based evaluation of the proposed hybrid model against multiple standard baseline models across three varied e-commerce datasets. Performance was measured on KPIs that relate directly to recommendation quality, pricing efficiency, customer engagement, and operational cost. Together, these give a comprehensive picture of how well the models improve individual recommendations, fine-tune dynamic pricing, and optimize overall business performance. The same evaluation framework was applied to all datasets and scenarios so that the comparison between the proposed approach and the existing techniques is fair.

RL is included in the comparison even though its primary responsibility is dynamic pricing, not the generation of recommendations, whereas NCF was developed to maximise the accuracy of recommendations based on user behaviour. Consistent with these roles, NCF attains lower RMSE and MAE than RL on all three datasets in Table 4, while RL attains higher OCR and profitability than NCF. The accuracy of NCF is therefore more pertinent for recommendation tasks, whereas the impact of RL is better evaluated in terms of profit maximisation and pricing efficiency. This extensive comparison (Table 4; Fig. 3) emphasizes the advantages and real-world usability of the hybrid approach in e-business environments.

Table 4.

Simulation results on key metrics for proposed and existing Models.

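| Dataset | Metric | CF | MF | RL | NCF | Proposed Hybrid |
| --- | --- | --- | --- | --- | --- | --- |
| Retailrocket | Conversion Rate (%) | 15.8 | 16.9 | 17.2 | 18.0 | 19.1 |
| Retailrocket | Customer Retention (%) | 23.1 | 24.5 | 25.0 | 26.0 | 28.5 |
| Retailrocket | OCR (%) | 5.1 | 6.0 | 7.0 | 6.5 | 8.0 |
| Retailrocket | Profitability (%) | 4.5 | 5.2 | 6.0 | 5.8 | 6.3 |
| Retailrocket | RMSE | 1.38 | 1.25 | 1.30 | 1.20 | 1.05 |
| Retailrocket | MAE | 0.38 | 0.33 | 0.35 | 0.31 | 0.27 |
| Instacart | Conversion Rate (%) | 17.5 | 18.6 | 19.0 | 19.5 | 21.8 |
| Instacart | Customer Retention (%) | 25.0 | 26.8 | 27.0 | 28.0 | 31.4 |
| Instacart | OCR (%) | 5.8 | 6.3 | 6.8 | 6.5 | 7.2 |
| Instacart | Profitability (%) | 5.2 | 5.6 | 6.2 | 6.0 | 7.6 |
| Instacart | RMSE | 1.33 | 1.20 | 1.25 | 1.15 | 1.01 |
| Instacart | MAE | 0.35 | 0.31 | 0.33 | 0.29 | 0.26 |
| Amazon Reviews | Conversion Rate (%) | 16.9 | 17.8 | 18.0 | 18.5 | 20.0 |
| Amazon Reviews | Customer Retention (%) | 24.2 | 25.5 | 25.8 | 26.5 | 30.2 |
| Amazon Reviews | OCR (%) | 5.5 | 6.1 | 6.4 | 6.2 | 7.5 |
| Amazon Reviews | Profitability (%) | 4.8 | 5.3 | 5.8 | 5.5 | 7.0 |
| Amazon Reviews | RMSE | 1.35 | 1.22 | 1.27 | 1.18 | 1.04 |
| Amazon Reviews | MAE | 0.37 | 0.32 | 0.34 | 0.30 | 0.27 |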

Fig. 3.

Comparison Graph for Results on Key Metrics for Proposed and Existing Models.

Accuracy and loss comparison

This section describes the performance of the proposed hybrid model on the training, validation, and test sets in comparison with well-established baseline methods. The analysis uses classification accuracy and RMSE (root mean squared error) to evaluate predictive correctness and error magnitude at each stage. Comparing these metrics shows how well each model learns and generalizes. The accuracy and loss results (Table 5; Figs. 4 and 5) provide information about the robustness and reliability of the hybrid approach in these evaluation scenarios.

Table 5.

Accuracy (%) and RMSE comparison of baseline models and the proposed hybrid model on training, validation, and test sets.

| Model | Training Accuracy (%) | Validation Accuracy (%) | Test Accuracy (%) | Training RMSE | Validation RMSE | Test RMSE |
| --- | --- | --- | --- | --- | --- | --- |
| Collaborative Filtering (CF) | 81.5 | 79.5 | 78.8 | 1.40 | 1.43 | 1.46 |
| Matrix Factorization (MF) | 84.0 | 82.0 | 80.9 | 1.25 | 1.27 | 1.29 |
| Reinforcement Learning (RL) | 85.0 | 82.8 | 81.4 | 1.27 | 1.29 | 1.31 |
| Neural Collaborative Filtering (NCF) | 86.8 | 84.0 | 83.8 | 1.15 | 1.18 | 1.20 |
| Proposed Hybrid Model | 95.9 | 95.7 | 95.6 | 1.02 | 1.05 | 1.07 |

Fig. 4.

Accuracy Results over 50 Epochs for Proposed vs. Existing Models.

Fig. 5.

Loss Results over 50 Epochs for Proposed vs. Existing Models.

K-fold cross-validation results

This section evaluates the generalization performance of the proposed hybrid model and the baseline methods on three different e-commerce datasets using a 5-fold cross-validation procedure. Consistency and predictive reliability were assessed with accuracy and root mean squared error (RMSE) computed on each fold. Results are reported as a mean and standard deviation, which allows model variance and stability to be assessed across the different partitions of the data. This systematic validation supports the statistical validity of the overall results and their generalizability to real-life data, and shows the advantages of the hybrid approach on datasets with varying characteristics (Table 6; Fig. 6).

Table 6.

Results of 5-fold cross-validation (k = 5) for all models on Retailrocket, Instacart, and Amazon review datasets.

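| Dataset | Model | Accuracy Mean (%) | Accuracy SD (%) | RMSE Mean | RMSE SD |
| --- | --- | --- | --- | --- | --- |
| Retailrocket | CF | 78.1 | 1.2 | 1.37 | 0.05 |
| Retailrocket | MF | 81.3 | 1.1 | 1.21 | 0.04 |
| Retailrocket | RL | 81.8 | 1.3 | 1.25 | 0.05 |
| Retailrocket | NCF | 83.5 | 1.0 | 1.15 | 0.03 |
| Retailrocket | Proposed Hybrid | 92.8 | 0.8 | 1.02 | 0.02 |
| Instacart | CF | 79.6 | 1.4 | 1.32 | 0.06 |
| Instacart | MF | 82.7 | 1.2 | 1.18 | 0.04 |
| Instacart | RL | 83.1 | 1.3 | 1.22 | 0.05 |
| Instacart | NCF | 84.8 | 1.1 | 1.10 | 0.03 |
| Instacart | Proposed Hybrid | 93.5 | 0.7 | 0.99 | 0.02 |
| Amazon Reviews | CF | 77.9 | 1.5 | 1.35 | 0.07 |
| Amazon Reviews | MF | 80.5 | 1.3 | 1.20 | 0.05 |
| Amazon Reviews | RL | 80.9 | 1.4 | 1.23 | 0.05 |
| Amazon Reviews | NCF | 82.3 | 1.2 | 1.12 | 0.04 |
| Amazon Reviews | Proposed Hybrid | 91.9 | 0.9 | 1.03 | 0.03 |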

Fig. 6.

Grouped bar chart showing the mean accuracy (%) and standard deviation from 5-fold cross-validation for all models across Retailrocket, Instacart, and Amazon Reviews datasets.

Statistical significance and ablation studies

Statistical analyses were performed to verify that the performance enhancements of the proposed hybrid model compared to the baselines are statistically significant. Ablation studies quantitatively assessed the contribution of each component, highlighting the critical importance of reinforcement learning, natural language processing, and matrix factorization in the overall efficacy of the system.

Paired t-test results

To assess whether the improvements of the proposed hybrid model over the baseline models are statistically significant, paired t-tests were performed on the per-fold accuracy scores from the 5-fold cross-validation for each dataset. Table 7 and Fig. 7 present the paired t-test results comparing the proposed hybrid model against the baseline models across datasets; all accuracy improvements are reported as statistically significant (p < 0.001).

Table 7.

Paired t-test results comparing the proposed hybrid model against baseline models across datasets.

Dataset Baseline Model t-Statistic p-Value Significant (α = 0.05)
Retailrocket CF 12.75 < 0.001 Yes
MF 9.80 < 0.001 Yes
RL 10.10 < 0.001 Yes
NCF 7.65 < 0.001 Yes
Instacart CF 14.10 < 0.001 Yes
MF 11.25 < 0.001 Yes
RL 11.50 < 0.001 Yes
NCF 8.90 < 0.001 Yes
Amazon Reviews CF 13.00 < 0.001 Yes
MF 10.20 < 0.001 Yes
RL 10.40 < 0.001 Yes
NCF 7.80 < 0.001 Yes
Fig. 7.

Paired t-test statistics comparing the proposed hybrid model against baseline models across Retailrocket, Instacart, and Amazon Reviews datasets.
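
As a rough illustration of how a paired t-statistic is obtained from per-fold scores, the sketch below computes it from the fold-wise differences between two models. The fold-level accuracies are hypothetical (the paper reports only the resulting statistics), and the function name `paired_t_statistic` is introduced here for illustration. With five folds the test has four degrees of freedom, for which the standard two-sided critical value at α = 0.05 is 2.776.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t-statistic for two equal-length lists of per-fold scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)            # sample SD of the differences
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1                          # statistic, degrees of freedom


# Hypothetical per-fold accuracies (%), NOT the paper's fold-level data.
hybrid = [92.1, 93.4, 92.6, 93.9, 92.0]
baseline = [83.0, 84.1, 83.9, 84.2, 82.3]

t_stat, dof = paired_t_statistic(hybrid, baseline)
T_CRIT_TWO_SIDED_05_DF4 = 2.776   # two-sided critical value, alpha = 0.05, df = 4
significant = abs(t_stat) > T_CRIT_TWO_SIDED_05_DF4
print(f"t = {t_stat:.2f}, df = {dof}, significant at 0.05: {significant}")
```

The test is paired because both models are scored on the same folds, so only the fold-wise differences enter the statistic.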

Ablation study results

An ablation study was performed on the Retailrocket dataset to assess the contribution of each essential component of the proposed hybrid model. Table 8 and Fig. 8 present the results, quantifying the impact of removing key components from the hybrid model on conversion rate, profitability, and RMSE.

Table 8.

Ablation study results on the Retailrocket dataset.

Variant Conversion Rate (%) Profitability Improvement (%) RMSE
Full Hybrid Model 19.1 6.3 1.05
Without Reinforcement Learning 17.3 4.8 1.18
Without NLP Sentiment Analysis 18.2 6.1 1.10
Without Matrix Factorization 16.9 5.7 1.20
Without Collaborative Filtering 16.5 5.5 1.22
Fig. 8.

Ablation study results on the Retailrocket dataset showing the impact of removing key components from the hybrid model on conversion rate, profitability improvement, and RMSE.
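
To make the size of each component's contribution explicit, the short sketch below computes the change of every ablated variant relative to the full model, using the values in Table 8. The function name `ablation_deltas` and the dictionary layout are illustrative assumptions; only the numbers come from the table.

```python
# Values copied from Table 8 (Retailrocket):
# (conversion rate %, profitability improvement %, RMSE).
ABLATION = {
    "Full Hybrid Model": (19.1, 6.3, 1.05),
    "Without Reinforcement Learning": (17.3, 4.8, 1.18),
    "Without NLP Sentiment Analysis": (18.2, 6.1, 1.10),
    "Without Matrix Factorization": (16.9, 5.7, 1.20),
    "Without Collaborative Filtering": (16.5, 5.5, 1.22),
}


def ablation_deltas(results, full_key="Full Hybrid Model"):
    """Change of each ablated variant relative to the full model."""
    full_cr, full_pi, full_rmse = results[full_key]
    deltas = {}
    for variant, (cr, pi, rmse) in results.items():
        if variant == full_key:
            continue
        deltas[variant] = {
            "conversion_drop_pp": round(full_cr - cr, 2),
            "profitability_drop_pp": round(full_pi - pi, 2),
            "rmse_increase": round(rmse - full_rmse, 2),
        }
    return deltas


deltas = ablation_deltas(ABLATION)
# List variants from the largest to the smallest loss in conversion rate.
for variant, d in sorted(deltas.items(), key=lambda kv: -kv[1]["conversion_drop_pp"]):
    print(f"{variant:32s} {d}")
```

Read this way, Table 8 shows the largest conversion-rate loss when collaborative filtering is removed and the largest profitability loss when reinforcement learning is removed, while removing the NLP component has the smallest effect on all three metrics.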

Business impact and operational efficiency

To evaluate the practical advantages of the proposed hybrid model, we simulated its effects on essential business metrics using historical data from the three datasets. These simulations project improvements in conversion rate, customer retention, operational cost reduction, and profitability relative to the baseline models. Fig. 9 and Table 9 present the conversion rate and customer retention percentages for the baseline and proposed hybrid models across the Retailrocket, Instacart, and Amazon Reviews datasets. In Fig. 9, bars are stacked to show the combined customer engagement metrics, with annotations inside the bars and the legend positioned within the graph for clarity.

Fig. 9.

Comparison of conversion rate and customer retention percentages for baseline and proposed hybrid models across Retailrocket, Instacart, and Amazon Reviews datasets.

Table 9.

Conversion rate and customer retention (Simulation Results).

Dataset Model Conversion Rate (%) Customer Retention Rate (%)
Retailrocket Baseline Avg. 17.8 25.4
Proposed Hybrid 19.7 28.3
Instacart Baseline Avg. 19.1 27.0
Proposed Hybrid 21.1 30.0
Amazon Reviews Baseline Avg. 18.5 26.5
Proposed Hybrid 20.4 29.4
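
Because the rates in Table 9 are themselves percentages, a gain can be expressed either in percentage points (absolute) or as a percentage of the baseline (relative). The sketch below, with an illustrative helper named `uplift`, computes both for the conversion rates in Table 9.

```python
# Values copied from Table 9: (baseline average, proposed hybrid), in percent.
CONVERSION = {
    "Retailrocket": (17.8, 19.7),
    "Instacart": (19.1, 21.1),
    "Amazon Reviews": (18.5, 20.4),
}


def uplift(baseline, proposed):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_pp = proposed - baseline
    relative_pct = 100.0 * absolute_pp / baseline
    return absolute_pp, relative_pct


for dataset, (base, hybrid) in CONVERSION.items():
    pp, rel = uplift(base, hybrid)
    print(f"{dataset}: +{pp:.1f} percentage points, +{rel:.1f}% relative")
```

Keeping the two measures apart avoids ambiguity when a difference between two rates is quoted as a single percentage.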

Operational cost and profitability

This section presents simulation-based assessments of the operational cost savings and profitability improvements of the proposed hybrid model relative to the average baseline results across the three datasets. The results indicate that the hybrid method achieves consistent cost savings and improved margins by optimizing pricing strategy and backend processes, which underlines its potential for improving the efficiency of e-commerce businesses (Table 10; Fig. 10).

Table 10.

Operational cost and profitability (Simulation Results).

Dataset Model Simulated Operational Cost Reduction (%) Simulated Profitability Improvement (%)
Retailrocket Baseline Avg. 7.1 5.9
Proposed Hybrid 8.2 6.3
Instacart Baseline Avg. 6.5 6.1
Proposed Hybrid 7.5 7.0
Amazon Reviews Baseline Avg. 6.7 5.8
Proposed Hybrid 7.8 6.7

Fig. 10.

Comparison of Simulation results of operational cost reduction and profitability improvement for baseline and proposed hybrid models across datasets.

Sentiment analysis and scalability

This section analyzes how effectively the sentiment analysis component of the proposed hybrid approach extracts customer sentiment. The simulation results show a clear improvement in sentiment classification accuracy and feedback classification F1-score over the baseline average. Scalability tests also indicate that the model can handle increased data loads without losing quality, which makes it suitable for large e-commerce applications (Table 11; Fig. 11).

Table 11.

Sentiment analysis accuracy (Simulation Results).

Dataset Model Sentiment Analysis Accuracy (%) F1-Score (Feedback Classification)
Retailrocket Baseline Avg. 81.5 0.76
Proposed Hybrid 91.5 0.84
Instacart Baseline Avg. 79.5 0.74
Proposed Hybrid 89.7 0.83
Amazon Reviews Baseline Avg. 80.1 0.75
Proposed Hybrid 90.3 0.83

Fig. 11.

Sentiment analysis accuracy and F1-score for baseline and proposed hybrid models across Retailrocket, Instacart, and Amazon Reviews datasets.
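
For reference, the two metrics reported in Table 11 can be computed from predicted and true labels as sketched below. The example assumes a binary positive/negative sentiment task and uses hypothetical labels; the paper does not state the label scheme in this section, so the function `binary_accuracy_and_f1` is an illustration only.

```python
def binary_accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1-score of the positive class from two label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical sentiment labels (1 = positive review, 0 = negative review).
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
accuracy, f1 = binary_accuracy_and_f1(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}, F1 = {f1:.2f}")
```

Accuracy counts all correct predictions, whereas F1 balances precision and recall for one class, which is why the two columns of Table 11 can move differently.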

Comparison with state-of-the-art hybrid and transformer-based models

In this section, we compare the performance of our proposed hybrid e-commerce recommendation system with a variety of state-of-the-art models, including transformer-based models such as BERT4Rec and SASRec, as well as additional modern approaches like LightGCN, NCF, and MF. These models have demonstrated strong performance in sequential recommendation tasks and other key areas of personalization. The goal of this comparison is to evaluate how our hybrid model, which integrates CF, MF, RL, and NLP, compares across various business-relevant metrics.

Benchmarking setup

For a fair and consistent comparison, we used the same datasets as in earlier sections: Retailrocket, Instacart, and Amazon Reviews. The models compared are:

  • BERT4Rec: A transformer-based model that uses self-attention mechanisms to capture sequential dependencies in user-item interactions.

  • SASRec: Another transformer-based model designed specifically for sequential recommendation, making it adept at capturing the temporal nature of user behaviour.

  • LightGCN: A graph convolutional network (GCN)-based model that leverages graph-based interactions for collaborative filtering. It has been shown to outperform traditional CF methods in terms of scalability and accuracy, especially in sparse settings.

  • NCF: A deep learning model that combines the flexibility of neural networks with collaborative filtering, enhancing traditional models with learned feature interactions.

  • Matrix Factorization (MF): A widely used approach in recommendation systems that decomposes the user-item interaction matrix into latent user and item factors.
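
As a concrete illustration of the matrix factorization baseline listed above, the following sketch learns latent user and item factors with plain stochastic gradient descent on a handful of hypothetical ratings. It is a generic textbook formulation, not the configuration used in the paper; the hyperparameters (`k`, `lr`, `reg`, `epochs`) and the toy data are assumptions.

```python
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Factorise observed (user, item, rating) triples with plain SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given triples."""
    sq = [(r - sum(pf * qf for pf, qf in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5


# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5), (0, 1, 3), (0, 3, 1), (1, 0, 4), (1, 3, 1), (2, 0, 1),
           (2, 1, 1), (2, 3, 5), (3, 1, 1), (3, 2, 5), (3, 3, 4)]

P0, Q0 = train_mf(ratings, n_users=4, n_items=4, epochs=0)   # untrained factors
P, Q = train_mf(ratings, n_users=4, n_items=4)               # trained factors
print(f"training RMSE before: {rmse(ratings, P0, Q0):.3f}, after: {rmse(ratings, P, Q):.3f}")
```

A predicted rating is the dot product of a user's and an item's factor vectors, so unobserved user-item pairs can be scored once the factors are learned.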

We evaluated the models on conversion rate, customer retention, profitability, RMSE, MAE, and dynamic pricing, all critical metrics for e-commerce success.

Results

Table 12 presents the comparative results of the models across the three datasets. We used multiple evaluation metrics that reflect both the user experience (e.g., conversion rate, customer retention) and the operational efficiency (e.g., RMSE, profitability).

Table 12.

Comparative results for state-of-the-art methods.

Dataset Model Conversion Rate (%) Customer Retention (%) Profitability (%) RMSE MAE
Retailrocket BERT4Rec 21.2 30.4 6.5 1.02 0.26
SASRec 20.5 29.8 6.3 1.04 0.27
LightGCN 19.8 29.0 6.2 1.05 0.28
NCF 19.3 28.1 6.0 1.10 0.29
Proposed Hybrid 19.1 28.5 6.3 1.05 0.27
Instacart BERT4Rec 23.1 33.2 7.5 1.00 0.25
SASRec 22.5 32.8 7.3 1.02 0.26
LightGCN 21.5 31.5 7.2 1.03 0.27
NCF 21.0 30.9 7.0 1.05 0.28
Proposed Hybrid 21.8 31.4 7.6 1.01 0.26
Amazon Reviews BERT4Rec 20.3 30.1 6.9 1.03 0.27
SASRec 19.8 29.5 6.7 1.06 0.28
LightGCN 19.1 28.9 6.8 1.07 0.29
NCF 19.0 28.3 6.5 1.09 0.30
Proposed Hybrid 20.0 30.2 7.0 1.04 0.27
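
The comparison in Table 12 can be summarised by listing, for each dataset, which model leads on each metric. The sketch below does this directly from the table values; higher is better for conversion rate, retention, and profitability, and lower is better for RMSE and MAE. The data structure and the function name `leaders` are illustrative assumptions.

```python
# Values copied from Table 12:
# (conversion %, retention %, profitability %, RMSE, MAE).
TABLE_12 = {
    "Retailrocket": {
        "BERT4Rec": (21.2, 30.4, 6.5, 1.02, 0.26),
        "SASRec": (20.5, 29.8, 6.3, 1.04, 0.27),
        "LightGCN": (19.8, 29.0, 6.2, 1.05, 0.28),
        "NCF": (19.3, 28.1, 6.0, 1.10, 0.29),
        "Proposed Hybrid": (19.1, 28.5, 6.3, 1.05, 0.27),
    },
    "Instacart": {
        "BERT4Rec": (23.1, 33.2, 7.5, 1.00, 0.25),
        "SASRec": (22.5, 32.8, 7.3, 1.02, 0.26),
        "LightGCN": (21.5, 31.5, 7.2, 1.03, 0.27),
        "NCF": (21.0, 30.9, 7.0, 1.05, 0.28),
        "Proposed Hybrid": (21.8, 31.4, 7.6, 1.01, 0.26),
    },
    "Amazon Reviews": {
        "BERT4Rec": (20.3, 30.1, 6.9, 1.03, 0.27),
        "SASRec": (19.8, 29.5, 6.7, 1.06, 0.28),
        "LightGCN": (19.1, 28.9, 6.8, 1.07, 0.29),
        "NCF": (19.0, 28.3, 6.5, 1.09, 0.30),
        "Proposed Hybrid": (20.0, 30.2, 7.0, 1.04, 0.27),
    },
}

# Metric name and the function that selects the best value for it.
METRICS = [("conversion", max), ("retention", max), ("profitability", max),
           ("RMSE", min), ("MAE", min)]


def leaders(models):
    """Map each metric to the model(s) with the best value, keeping ties."""
    result = {}
    for col, (name, pick) in enumerate(METRICS):
        best = pick(values[col] for values in models.values())
        result[name] = sorted(m for m, values in models.items() if values[col] == best)
    return result


for dataset, models in TABLE_12.items():
    print(dataset, leaders(models))
```

On these values, BERT4Rec leads on conversion rate and RMSE for all three datasets, while the proposed hybrid leads on profitability for Instacart and Amazon Reviews and on retention for Amazon Reviews.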

Discussion of results

  • Transformer Models (BERT4Rec and SASRec)

    • BERT4Rec and SASRec excel in sequential user behavior modeling. These models perform especially well in conversion rate and customer retention metrics due to their ability to capture long-term dependencies between user interactions, which is crucial for personalization tasks in e-commerce.
    • BERT4Rec outperforms the other models on conversion rate in both Retailrocket (21.2%) and Instacart (23.1%), while SASRec provides slightly lower but still competitive results. This shows the importance of modeling user sequences to predict future interactions and improve user engagement.
  • Graph-Based and Collaborative Models (LightGCN and NCF)

    • LightGCN uses graph-based convolution operations, making it highly effective in collaborative filtering tasks. It demonstrates competitive performance on customer retention and profitability but lags in conversion rate compared to the transformer models, indicating that sequential behavior might be more important in driving conversions than the graph-based modeling approach.
    • NCF, a neural network-based collaborative filtering model, also performs well, particularly in terms of profitability but struggles with the sequential behavior modeling required for conversion and retention metrics. This is expected, as NCF does not capture temporal dependencies in the same way as the transformer models.
  • Proposed Hybrid Model

    • The Proposed Hybrid Model demonstrates competitive performance across all metrics, with particular strength in profitability and dynamic pricing, thanks to the Reinforcement Learning (RL) component. It is slightly behind the BERT4Rec and SASRec models in conversion rate and customer retention, but it achieves strong profitability improvements (7.6% in Instacart), outperforming all other models in this regard.
    • Profitability is significantly enhanced by RL, which optimizes dynamic pricing, a critical area that transformer-based models do not address. This makes the hybrid model a more comprehensive solution for e-commerce platforms looking to improve both customer experience and operational efficiency.
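
To give a feel for how a learning agent can adapt prices, the sketch below uses an epsilon-greedy multi-armed bandit, one of the simplest reinforcement-learning settings. It is not the RL component of the proposed framework, whose design is not restated in this section: the candidate prices, the linear demand curve, the noise level, and all names are hypothetical and chosen only for illustration.

```python
import random

PRICES = [8.0, 10.0, 12.0, 14.0]   # hypothetical candidate price points


def simulated_revenue(price, rng):
    """Hypothetical linear demand curve with small bounded noise."""
    demand = 100.0 - 5.0 * price + rng.uniform(-0.5, 0.5)
    return price * demand


def epsilon_greedy_pricing(steps=2000, epsilon=0.1, seed=0):
    """Learn which price yields the highest average revenue."""
    rng = random.Random(seed)
    estimates = [0.0] * len(PRICES)   # running mean revenue per price
    counts = [0] * len(PRICES)
    for t in range(steps):
        if t < len(PRICES):
            arm = t                                    # try every price once
        elif rng.random() < epsilon:
            arm = rng.randrange(len(PRICES))           # explore a random price
        else:
            arm = max(range(len(PRICES)), key=estimates.__getitem__)  # exploit
        reward = simulated_revenue(PRICES[arm], rng)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    best = max(range(len(PRICES)), key=estimates.__getitem__)
    return PRICES[best], estimates


best_price, estimates = epsilon_greedy_pricing()
print(f"learned price: {best_price}, estimated revenues: {[round(e, 1) for e in estimates]}")
```

Under the assumed demand curve, expected revenue peaks at a price of 10.0, and the agent settles on that price. A full RL pricing agent would additionally condition on state such as demand signals or competitor prices.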

Key insights

  • Sequential Behavior Modeling: Transformer-based models (e.g., BERT4Rec and SASRec) are highly effective at capturing sequential behavior, which makes them excel in conversion rate and customer retention. These metrics are crucial for businesses focused on engagement and repeat purchases.

  • Multi-Objective Optimization: The Proposed Hybrid Model shines in profitability and dynamic pricing, showing the importance of integrating operational optimization with personalization. While the hybrid model does not lead in sequential prediction tasks, its ability to optimize pricing strategies and customer sentiment analysis makes it a holistic solution for e-commerce platforms aiming to balance both customer-facing experiences and backend operations.

  • Hybrid Model’s Versatility: The combination of CF, MF, RL, and NLP in our hybrid model allows it to tackle a broader range of challenges compared to transformer-based models that focus solely on sequential behavior. This makes the hybrid approach particularly useful in real-world e-commerce applications where both personalization and operational efficiency are equally important.

Scalability metric

The capability of the proposed hybrid model to handle increasing data volumes without noteworthy performance degradation was evaluated through simulation. The results indicate that, under high load, the model maintains stable mean latency with low variability, which makes it suitable for production settings with high transaction volumes, such as e-commerce (Table 13). Fig. 12 compares the scalability metrics of the baseline and proposed hybrid models across the three e-commerce datasets, showing improved scalability for the hybrid approach.

Table 13.

Scalability Metric (Simulation Results).

Dataset Model Scalability Metric (%)
Retailrocket Baseline Avg. 89.5
Proposed Hybrid 96.3
Instacart Baseline Avg. 88.7
Proposed Hybrid 95.8
Amazon Reviews Baseline Avg. 89.9
Proposed Hybrid 96.1

Fig. 12.

Comparison of scalability metrics between baseline and proposed hybrid models across three e-commerce datasets, showing improved scalability for the hybrid approach.

Discussion

Despite its simple feature extraction approach, our hybrid model delivers excellent performance in comparison to several traditional baseline methods across various key performance indicators (KPIs), and competes well with state-of-the-art transformer-based models. The hybrid model effectively balances key e-commerce objectives such as recommendation accuracy, dynamic pricing optimization, customer retention, and profitability. The experimental results confirm that our model, which integrates CF, MF, RL, and NLP, particularly with LSTM models, enhances both front-end recommendation systems and back-end operations, providing a more comprehensive and targeted solution for e-commerce platforms.

When compared to classical recommendation algorithms like CF, MF, and RL, as well as NCF, the hybrid model consistently outperforms them across most metrics. The results from three different datasets (Retailrocket, Instacart, and Amazon Reviews) show a significant improvement in conversion rate, customer retention, operational cost reduction, and profitability. For example, the proposed hybrid model achieved a 19.1% conversion rate on Retailrocket, which is 1.1 percentage points higher than NCF (18.0%) and more than 1.1 percentage points higher than the classic CF, MF, and RL models. Similarly, the hybrid model demonstrates better customer retention rates, 28.5% on Retailrocket, compared to 25.0% and 26.0% for CF and NCF, respectively.

The business impact is equally important. The hybrid model yields a 6.3% increase in profitability on Retailrocket, outperforming the baseline models. Similarly, both RMSE and MAE metrics, which measure the accuracy of recommendations and the magnitude of error in the predictions, are also favourable to the hybrid model (RMSE = 1.05, MAE = 0.27), demonstrating that it minimizes prediction error without sacrificing recommendation accuracy. These results suggest that the hybrid model not only excels in recommendation tasks but also addresses critical business requirements such as dynamic pricing and customer satisfaction.
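
For readers less familiar with the two error metrics, the sketch below computes RMSE and MAE on a small set of hypothetical ratings. The data and function names are illustrative only. RMSE squares each error before averaging and therefore penalises large errors more heavily than MAE, so RMSE is never smaller than MAE on the same predictions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical true and predicted ratings on a 1-5 scale.
actual = [5, 3, 4, 1]
predicted = [4, 3, 5, 3]

print(f"RMSE = {rmse(actual, predicted):.3f}, MAE = {mae(actual, predicted):.3f}")
```

In this toy example the single error of two rating points dominates the RMSE, which is why the two metrics are usually reported together.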

However, transformer-based models such as BERT4Rec and SASRec excel at leveraging sequential user behavior during training, which allows them to outperform the hybrid model in sequential tasks, such as conversion rate and customer retention. These models are specifically designed to capture the temporal and sequential dependencies in user behavior, which is essential in e-commerce scenarios where user interactions are time-sensitive and dynamic. As shown in Table 12, both BERT4Rec and SASRec achieve a higher conversion rate (21.2% and 20.5%, respectively) and customer retention rate (30.4% and 29.8%, respectively) compared to the hybrid model (19.1% and 28.5%, respectively) on Retailrocket. These results highlight the advantage of transformer-based models in sequentially-driven tasks, where user behavior over time plays a significant role in predicting future interactions.

Nonetheless, the hybrid model remains highly competitive, particularly on profitability and RMSE, which are crucial for many real-world e-commerce applications. The hybrid model achieves a 6.3% increase in profitability on Retailrocket, comparable to the transformer models (6.5% for BERT4Rec and 6.3% for SASRec in Table 12). Its prediction errors on Retailrocket (RMSE = 1.05, MAE = 0.27) are also close to those of the transformer models in Table 12 (RMSE = 1.02 and MAE = 0.26 for BERT4Rec; RMSE = 1.04 and MAE = 0.27 for SASRec), so the hybrid model keeps prediction error low without sacrificing recommendation accuracy. This shows that while transformer-based models perform well in sequential recommendation tasks, the hybrid model offers a more well-rounded solution that addresses a broader range of e-commerce challenges, including dynamic pricing, sentiment analysis, and operational insights.

Another key advantage of the hybrid model is its scalability. As shown in Table 13, the hybrid model achieves a higher scalability metric than the baseline models on all three datasets, and it exhibits robust consistency when processing large datasets. This is particularly important for e-commerce systems that handle real-time transactional data and require recommendation systems that can scale without degrading performance. To further validate the model’s performance, we conducted paired t-tests, which confirmed that the accuracy improvements of the hybrid model are statistically significant (p < 0.001, Table 7). This provides strong evidence that the observed improvements are not due to random fluctuations in the data, but instead reflect the robustness of the hybrid approach. Furthermore, the ablation study results presented in Table 8 confirm the contribution of each model component. When key components are removed, performance declines: for example, removing RL lowers the conversion rate from 19.1% to 17.3% and raises RMSE from 1.05 to 1.18, indicating that each element contributes to the overall performance of the model.

Simulations based on real-world enterprise metrics, such as conversion rate, customer retention, operational cost reduction, and profitability, reinforce the business advantages of the hybrid model. As illustrated in Table 9, the model improves performance on business KPIs compared to baseline models, providing an efficient approach for e-commerce platforms to optimize user engagement while maintaining operational efficiency. The significant reduction in operational costs and enhancement in profitability across all datasets further demonstrate the model’s potential to deliver real-world business value.

In summary, the proposed hybrid model offers a solid, holistic framework for e-commerce recommendation tasks, combining the strengths of CF, MF, RL, and NLP. While it is outperformed by transformer models like BERT4Rec and SASRec in sequential recommendation tasks, the hybrid model provides a more comprehensive solution that not only enhances recommendation accuracy but also optimizes dynamic pricing and customer service. The model’s ability to integrate multiple techniques gives it a unique advantage, making it a flexible and scalable solution for real-world e-commerce applications. Future work could integrate sequential recommendation techniques from BERT4Rec and SASRec into the hybrid approach to strengthen its sequential prediction capabilities and improve metrics such as conversion rate and customer retention.

Research hypothesis and questions with results

These simulation results provide the empirical foundation for the main hypothesis and help answer the research questions posed in the section “Research motivation”. The hybrid model’s prediction accuracy (95.6%, Table 5), its conversion and retention rates (Tables 4 and 9), and its RMSE (Table 9) are better than those of the MF and RL models, indicating that combining predictive modeling with reinforcement learning effectively personalizes user experiences. This answers the first research question: how does the hybrid framework enhance personalization in e-commerce? The accuracy and F1-score of sentiment analysis (Table 11) address the second question, namely how NLP integration influences customer service performance: the proposed system better captures customer emotions and preferences, which enables service personalization and improves customer satisfaction.

The model’s capacity to lower operational costs (up to 8.2%, Table 10) and enhance profitability (up to 7.6%, Table 12), along with the scalability metrics (Table 13), indicates gains in supply chain and dynamic pricing effectiveness, answering the third research question regarding operational efficiency. Across accuracy, conversion, retention, cost, profitability, and scalability, the hybrid model outperforms the CF, MF, RL, and NCF baselines, and the accuracy improvements are statistically significant (p < 0.001, Table 7). This establishes the hybrid framework as a strong solution relative to these baselines for e-commerce personalization and operational optimization, thus answering the last research question regarding comparative performance. Overall, the simulation results substantiate that combining collaborative filtering, matrix factorization, reinforcement learning, and sentiment analysis leads to substantial improvements in both customer experience and backend efficiency, and they support using the proposed hybrid model to reduce operational expense. These gains are consistent across varied datasets, which provides a solid basis for real-world deployment of such hybrid architectures in e-commerce.

Conclusion & future directions

Conclusion

This work introduces a novel hybrid framework that combines several AI techniques, namely collaborative filtering (CF), matrix factorization (MF), reinforcement learning (RL), and natural language processing (NLP) for sentiment analysis. The integration of these methodologies substantially enhances both personalization and operational efficiency for e-commerce platforms. Experiments covered conversion rate, customer retention, operational cost, profitability, and predictive accuracy, and extensive simulations on publicly available datasets (Retailrocket, Instacart, and Amazon Product Reviews) showed that the proposed model outperformed the baselines on all of them. These improvements were consistently greater than those achieved by conventional models, which supports the model hybridization concept.

RL made the framework adaptable by adjusting dynamic pricing to market conditions, and NLP provided sentiment analysis of customer feedback to gain deeper insights. Additionally, the system exhibited strong scalability and fault tolerance, processing the high data volumes found in demanding, high-transaction e-commerce systems. By training offline on historical data, the model achieves significant performance improvements over the baseline models without requiring constant real-time input, thereby addressing important operational problems and aiding analysis in the fast-moving, dynamic e-commerce environment.

This hybrid framework coordinates frontend personalization (customer-facing experiences) with backend operational decision-making (e.g., inventory management and pricing). This close integration yields significant performance improvements while offering a flexible approach that helps e-commerce businesses achieve more than a traditional recommendation system or static optimization model alone. Ultimately, this architecture offers a powerful and flexible solution for improving customer experience and operational efficiency in e-commerce.

Future directions

Although the proposed hybrid framework achieves significant improvements over the baseline approaches, several areas could be explored further to extend its capabilities.

  • Real-Time Adaptation and Incremental Learning: The existing framework depends on historical data for offline training, which limits its ability to keep pace with changes in consumer behaviour and other market dynamics. Future work will add streaming and incremental learning approaches so that the system responds continuously and adapts to new data as it arrives. This would make the framework more dynamic and increase the speed and accuracy of personalised recommendations and pricing strategies.

  • Advanced Sentiment Analysis with Transformer Models: The existing NLP sentiment analysis component could be improved with recent transformer-based models, such as domain-adapted BERT or GPT. Such models can capture more nuanced and context-specific customer sentiment, giving better insight into what customers feel and prefer. This may bring a higher level of personalization and context to the customer service experience, thus improving user satisfaction.

  • Comparison with Cutting-Edge Algorithms: A future study will compare the hybrid framework directly with modern recommendation algorithms such as LightGCN (Light Graph Convolution Network) and SVD++ (Singular Value Decomposition Plus Plus). Both methods have demonstrated effectiveness for recommendation tasks, and comparing their performance against the proposed model will give insight into the comparative strengths and weaknesses of each approach in large-scale, dynamic e-commerce settings.

  • Multi-Objective Reinforcement Learning: One interesting direction for future work is the investigation of multi-objective reinforcement learning (MORL) models. E-commerce involves several competing objectives, including maximizing profit, enhancing customer satisfaction, and managing inventory. Treating these objectives jointly may allow a MORL-based framework to balance them and provide more general and flexible decision-making.

  • Cold-Start Problem in Recommendations: The cold-start problem is one of the main issues in recommendation systems: new users and new products receive less effective recommendations because historical data are lacking. Future versions of the hybrid model could mitigate this problem by including content-based techniques or side information such as user demographics, item attributes, or knowledge transferred from other domains. This would improve the system’s capacity to produce appropriate recommendations for new users and new products.

  • A/B Testing and Real-World Deployment: The next key step in verifying the proposed hybrid model is large-scale A/B testing on real e-commerce sites. This would show how the model scales, how users experience it, and how it behaves in a more complex, higher-transaction environment. Such real-world testing will help companies optimise the model and assess whether it is commercially ready for full rollout.

  • Customization and Commercial Readiness: Iterative development of the system based on user feedback and operational performance is the most practical way to address customization of the framework. Future versions may concentrate on increasing the system’s flexibility and adaptability to various e-commerce sites, sectors, and product types. Ease of deployment is vital, and easy integration with existing e-commerce systems would support wider adoption.
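
For the A/B testing direction listed above, one common way to decide whether two variants differ in conversion rate is a two-proportion z-test. The sketch below is a generic illustration with hypothetical traffic counts; it is not part of the reported experiments, and the function name `two_proportion_z_test` is introduced here.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical traffic split: control (A) versus a new recommender (B).
z, p_value = two_proportion_z_test(conv_a=1780, n_a=10000, conv_b=1970, n_b=10000)
print(f"z = {z:.2f}, two-sided p = {p_value:.5f}")
```

In a live test the sample size per arm would be fixed in advance from the smallest uplift worth detecting, so that the test is not stopped early on a chance fluctuation.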

Focusing on these future directions would lead to further enhancements of the hybrid AI-driven framework in terms of improved performance, adaptability, and practical usage in the real world. The ongoing development of AI approaches, especially in the field of reinforcement learning, sentiment insights, and personalization, will augment its power to fulfill the more sophisticated and dynamic requirements of contemporary e-commerce networks.

Abbreviations

CF: Collaborative Filtering
MF: Matrix Factorization
RL: Reinforcement Learning
NCF: Neural Collaborative Filtering
NLP: Natural Language Processing
OCR: Operational Cost Reduction
PI: Profitability Improvement
RMSE: Root Mean Squared Error
MAE: Mean Absolute Error
CR: Conversion Rate
CRR: Customer Retention Rate
SAA: Sentiment Analysis Accuracy
SM: Scalability Metric
GPU: Graphics Processing Unit

Author contributions

Haowei Liu and Farah Raihana Ismail contributed equally to the conceptualization, methodology development, and manuscript drafting. Weihang Zhang and Ping Zou supported data analysis and simulation experiments. Tarak Hussain provided critical revisions and technical guidance. Yogesh Kumar Sharma and Lidia Gosy Tekeste contributed to the literature review and framework design, while Umesh Kumar Lilhore and Sarita Simaiya supervised the project and coordinated overall research activities. All authors reviewed and approved the final manuscript.

Funding

Not available.

Data availability

The dataset is available from the corresponding author upon individual request.

Declarations

Competing interests

The authors declare no competing interests.

Consent for publication

All authors have reviewed and approved the final manuscript for publication.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Farah Raihana Ismail, Email: farahri@163.com, Email: farahismail@segi.edu.my.

Umesh Kumar Lilhore, Email: umeshlilhore@gmail.com.

Lidia Gosy Tekeste, Email: lidiagosytekeste@gmail.com.

  • 20.Sharma, A., Patel, N. & Gupta, R. Optimizing product upselling strategies using reinforcement learning and natural language processing algorithms. Eur. Adv. AI J.11(9) (2022).
  • 21.Kalusivalingam, A. K., Sharma, A., Patel, N. & Singh, V. Leveraging neural networks and collaborative filtering for enhanced AI-driven personalized marketing campaigns. Int. J. AI ML1(2) (2020).
  • 22.Gatchalee, P. Coopetition analysis between JD and Tmall in China’s e-commerce landscape: A hybrid thematic-latent Dirichlet Allocation Approach. Adv. Artif. Intell. Mach. Learn.5(2), 206 (2025). [Google Scholar]
  • 23.Xiong, Q. Deep learning in predicting consumer purchase intentions. In Proceedings of the 3rd International Conference on Signal Processing, Computer Networks and Communications, 525–530 (2024).
  • 24.Retailrocket recommender system dataset. https://www.kaggle.com/datasets/retailrocket/ecommerce-dataset
  • 25.Instacart Dataset. https://www.kaggle.com/datasets/yasserh/instacart-online-grocery-basket-analysis-dataset
  • 26.Amazon product reviews dataset. https://jmcauley.ucsd.edu/data/amazon/
  • 27.Venkatesan, R. & Sabari, A. Deepsentimodels: A novel hybrid deep learning model for an effective analysis of ensembled sentiments in e-commerce and s-commerce platforms. Cybernetics Syst.54 (4), 526–549 (2023). [Google Scholar]
  • 28.Joshi, S., Sharma, A. & Iyer, S. Optimizing sales funnels using reinforcement learning and predictive analytics techniques in AI. Int. J. AI Advancements. 10, 1 (2021). [Google Scholar]
  • 29.Iyer, D., Sharma, D., Singh, N. & Patel, S. Enhancing digital advertising through AI-powered personalization: leveraging reinforcement learning and collaborative filtering algorithms. Int. J. AI ML Innovations11, 8 (2022). [Google Scholar]
  • 30.Zhang, P. E-commerce products recognition based on a deep learning architecture: theory and implementation. Future Generation Comput. Syst.125, 672–676 (2021). [Google Scholar]
  • 31.Nabi, N. et al. Unleashing deep learning: transforming E-commerce profit prediction with CNNs. J. Bus. Manage. Stud.6(2), 126–131 (2024). [Google Scholar]
  • 32.Zhou, L. Product advertising recommendation in e-commerce based on deep learning and distributed expression. Electron. Commer. Res.20 (2), 321–342 (2020). [Google Scholar]
  • 33.Cao, Y., Shao, Y. & Zhang, H. Study on early warning of E-commerce enterprise financial risk based on deep learning algorithm. Electron. Commer. Res.22 (1), 21–36 (2022). [Google Scholar]
  • 34.Lilhore, U. K., Simaiya, S., Prasad, D. & Verma, D. K. Hybrid weighted random forests method for prediction & classification of online buying customers. J. Inform. Technol. Manage.13 (2), 245–259 (2021). [Google Scholar]
  • 35.Trivedi, N. K., Simaiya, S., Lilhore, U. K. & Sharma, S. K. An efficient credit card fraud detection model based on machine learning methods. Int. J. Adv. Sci. Technol.29(5), 3414–3424 (2020). [Google Scholar]
  • 36.Kumar, M., Kumar, R., Madhavi, S. & Rani, U. Survey paper on artificial intelligence in retail and e-commerce. Int. J. Emerg. Technol. Eng. Res.8(2) (2023).
  • 37.Zhang, Y. et al. Learning Self-Growth Maps for Fast and Accurate Imbalanced Streaming Data Clustering, IEEE Trans. Neural Networks Learn. Syst., 36, 9, 16049–16061, Sept. 10.1109/TNNLS.2025.3563769 (2025). [DOI] [PubMed] [Google Scholar]
  • 38.Chang, Y. et al. Hierarchical adaptive cross-coupled control of traffic signals and vehicle routes in large‐scale road network. Comput. -Aided Civ. Infrastruct. Eng.40, 5474–5493 (2025). [Google Scholar]
  • 39.Yongsheng, X. Global sustainable supply chain governance, effectiveness of social responsibility, and performance in emergent markets: an exploratory multiple case study. Revista De Administração De Empresas. 65 (6), e2024–e0587 (2025). [Google Scholar]
  • 40.Jin You, S. et al. Comparative hydrologic performance of cascading and distributed green-gray infrastructure: experimental evidence for spatial optimization in urban waterlogging mitigation, J. Hydrol.662. 10.1016/j.jhydrol.2025.133979 (2025).
  • 41.Cao, B., Zhao, J., Lv, Z. & Yang, P. Diversified personalized recommendation optimization based on mobile data. IEEE Trans. Intell. Transp. Syst.22(4), 2133–2139 (2020). [Google Scholar]
  • 42.Jiang, H., Wang, M., Zhao, P., Xiao, Z. & Dustdar, S. A utility-aware general framework with quantifiable privacy preservation for destination prediction in LBSs. IEEE/ACM Trans. Netw.29(5), 2228–2241 (2021). [Google Scholar]
  • 43.Liu, B., Li, M., Ji, Z., Li, H. & Luo, J. Intelligent productivity transformation: corporate market demand forecasting with the aid of an AI virtual assistant. J. Organ. End. User Comput. (JOEUC)36(1), 1–27 (2024). [Google Scholar]
  • 44.Van, A. N. T. et al. Exploring work motivation and job effectiveness in Mekong delta’s hospitality: A study utilizing partial least squares structural equation modeling. J. Chin. Hum. Resour. Manage.15(2), 101–112 (2024). [Google Scholar]

Data Availability Statement

The dataset is available from the corresponding author upon individual request.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
