Current Research in Food Science. 2023 Mar 25;6:100495. doi: 10.1016/j.crfs.2023.100495

A deep learning-based framework for predicting pork preference

Eunyoung Ko a,1, Kyungchang Jeong b,1, Hongseok Oh b, Yunhwan Park c, Jungseok Choi c,∗∗, Euijong Lee b,
PMCID: PMC10070177  PMID: 37026021

Abstract

Meat consumption per capita in South Korea has steadily increased over the last several years and is predicted to continue increasing. Up to 69.5% of Koreans eat pork at least once a week. Considering pork-related products produced and imported in Korea, Korean consumers have a high preference for high-fat parts, such as pork belly. Managing the high-fat portions of domestically produced and imported meat according to consumer needs has become a competitive factor. Therefore, this study presents a deep learning-based framework for predicting the flavor and appearance preference scores of the customers based on the characteristic information of pork using ultrasound equipment. The characteristic information is collected using ultrasound equipment (AutoFom III). Subsequently, according to the measured information, consumers’ preferences for flavor and appearance were directly investigated for a long period and predicted using a deep learning methodology. For the first time, we have applied a deep neural network-based ensemble technique to predict consumer preference scores according to the measured pork carcasses. To demonstrate the efficiency of the proposed framework, an empirical evaluation was conducted using a survey and data on pork belly preference. Experimental results indicate a strong relationship between the predicted preference scores and characteristics of pork belly.

Keywords: Artificial intelligence, Deep learning, Pork quality, Consumer preference prediction, Precision livestock farming (PLF)

Highlights

  • A deep learning-based approach is proposed that predicts consumer flavor and appearance preferences.

  • A long-term survey of consumer flavor and appearance preferences was conducted based on the characteristics of pork.

  • Experimental results show that a relationship exists between the prediction of preference and the characteristics of pork.

  • This is the first study to apply deep learning to data collected using ultrasound measuring equipment.

1. Introduction

Per capita meat consumption in South Korea increased steadily by a factor of 4.77, from 11.3 kg in 1980 to 53.9 kg in 2018 (South Korea National Statistical Office, 2020), with pork being the most popular meat among Koreans. Up to 69.5% of Koreans eat pork at least once a week (South Korea Rural Development Administration, 2021). In South Korea, domestic pork production and imports are high enough to meet the tremendous increase in meat consumption (South Korea Agricultural Affairs, 2021). Specifically, 45,555 tons of pork were imported in January 2022. This corresponded to an increase of 11.1 percent compared to the previous month (36,518 tons) and 88.2 percent compared to the same period of the previous year (21,547 tons) (South Korea Ministry of Food and Drug Safety, 2021). Considering the pork locally produced and imported, Korean consumers have a high preference for high-fat parts, such as pork belly (Oh and See, 2012). Hence, managing the high-fat portions of domestically produced and imported meat according to consumer needs has become a competitive factor. Therefore, collecting and utilizing data on meat management is crucial (Mertens et al., 2011).

Big data have been collected from substantial scientific research and various industries regarding domain-specific problems such as marketing and medical informatics (Najafabadi et al., 2015). Many recent studies have confirmed that deep learning methodology provides excellent prediction performance by identifying the characteristics (patterns) of big data (LeCun et al., 2015). Therefore, active attempts have been made to predict or classify target data using deep learning methodology for big data in various fields (Emmert-Streib et al., 2020). Predicted data help users make decisions and prepare for future situations (Gulati, 2015). For instance, deep learning methodology has been applied in agriculture to increase crop production through harvest period prediction and to make farm management more convenient (Jha et al., 2019). By predicting agricultural product prices in advance, market participants such as the government, farmers, and consumer markets can obtain advantages, such as business strategy establishment and financial adjustment (Chuluunsaikhan et al., 2020). Deep learning methodology is also used in recommendation systems that estimate user preferences for products and recommend products suitable for users (Adomavicius and Tuzhilin, 2005; Ricci et al., 2015; Zhang et al., 2019). In the online website and mobile application industry, recommender systems play an important role in promoting new product sales and services and in enhancing the user experience (Cheng et al., 2016; Covington et al., 2016; Okura et al., 2017). For example, more than 60% of videos viewed on YouTube were recommended videos (Davidson et al., 2010), and 80% of movies watched on Netflix were recommended products (Gomez-Uribe and Hunt, 2015).

The concept of precision livestock farming (PLF) has recently emerged (Suryawanshi et al., 2017). Decision making in traditional livestock farming is based on producers’ experience. However, owing to global warming and changes in various environmental factors, livestock farming based on experience is unlikely to achieve high performance. Therefore, the PLF system, which helps farmers and producers make decisions using quantitative data, is being applied to farms. Previous PLF studies have focused on production and management, such as pig health and welfare, by incorporating machine learning and have increased pork production and improved convenience for producers (Banhazi et al., 2012; Ochs et al., 2018). Numerous studies have addressed the issue of increased meat consumption. However, it is to be noted that consumer purchasing decisions often depend on taste, highlighting the need to shift the focus toward pork quality characteristics, particularly taste preferences. If we can estimate pork with a high consumer preference, both companies and producers can increase sales.

We assumed that the flavor and appearance preference scores of consumers could be predicted from the characteristic information of pork. Therefore, we propose a deep learning-based framework for predicting the flavor and appearance preference scores based on the characteristic information of pork measured using ultrasound equipment (i.e., AutoFom III). The predicted results can also be utilized in various applications within the framework. To demonstrate the efficiency of the proposed framework, we performed empirical experiments using data from pork belly parts. Additionally, an application example that can utilize the predicted results is presented. We believe our study is the first to use deep learning methods to predict consumer preferences from pork characteristic information. The main contributions of this study are summarized as follows:

  • We conducted consumer flavor (n = 1767) and appearance (n = 5150) preference surveys based on the characteristics of pork over a long period.

  • We initially proposed a deep learning-based framework that predicts and utilizes the consumer flavor and appearance preferences.

  • To the best of our knowledge, this is the first study to apply deep learning to data collected using ultrasound measuring equipment.

Section 2 describes backgrounds and related works. Section 3 explains the framework of the proposed preference prediction system. Also, experimental results are described in Section 4. Section 5 discusses the detailed analysis of the experimental results. Finally, Section 6 concludes the study.

2. Background and related work

This section is divided into two parts. Section 2.1 briefly describes the methodologies and applications. Section 2.2 introduces the related studies on pig carcass analysis systems for precision livestock farming.

2.1. Methodologies and applications

This section briefly introduces deep learning methodology. Deep learning is a machine learning method based on learning latent representations from data, and it is an important subset of artificial intelligence (AI) (Coşkun et al., 2017). In deep learning, an artificial neural network “learns” from the data supplied to a learning algorithm. A deep neural network (DNN) receives the training data through an input layer. The input data are then processed through several hidden layers, and the final result is obtained through the output layer. During training, the model is shown the correct answer and produces an output in the form of a score vector, one score for each category. An objective function then measures the error between the output and the correct scores. To reduce this error (or loss), the parameters, called the weight vector, must be adjusted. The gradient vector of the objective function with respect to the weights is computed through differentiation, and the weight vector is adjusted in the direction opposite to the gradient vector using a method called stochastic gradient descent (SGD) (Bottou, 2012). This process is repeated over the training set until the mean of the objective function stops decreasing. Surprisingly, this simple procedure determines an appropriate set of weights for a deep neural network model. This process is called “training” the deep neural network model (LeCun et al., 2015).
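The weight update described above can be illustrated with a minimal sketch. It assumes a toy one-feature linear model with a squared-error objective rather than a deep network; the function name `sgd_fit`, the learning rate, and the data are hypothetical and chosen only for illustration.

```python
import random

def sgd_fit(xs, ys, lr=0.1, epochs=500, seed=0):
    """Fit y ~ w*x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)              # visit samples in random order
        for x, y in data:
            err = (w * x + b) - y      # output minus correct score
            grad_w = 2.0 * err * x     # derivative of err**2 w.r.t. w
            grad_b = 2.0 * err         # derivative of err**2 w.r.t. b
            w -= lr * grad_w           # step opposite to the gradient
            b -= lr * grad_b
    return w, b

# Hypothetical noiseless toy data generated from y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_fit(xs, ys)
```

A deep network applies the same update to every weight in every layer, with the gradients obtained by backpropagation instead of the two closed-form derivatives used here.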

A trained deep neural network model often has better predictive performance than traditional statistical and machine learning methods. Therefore, many studies have applied deep learning methodologies to domain data in various industries (Schmidhuber, 2015). For example, deep learning methods have exhibited excellent predictive performance in cancer prognosis prediction (Zhu et al., 2020), building energy consumption prediction (Amasyali and El-Gohary, 2018), urban traffic prediction (Liu et al., 2018), and crop yield prediction (Maimaitijiang et al., 2020).

Additionally, many attempts have been made to create a composite model with better performance using ensembles of individual deep neural network models. For example, to solve the time-series classification (TSC) task, individual models (ResNet, FCN, Encoder, MLP, Time-CNN, MCDCNN) were combined with a weighted average to obtain an optimal result (Fawaz et al., 2019). The test accuracy of the single FCN and ResNet models was 30%, whereas the prediction performance of the ensemble model improved significantly to 92%. Additionally, DenseNet and Inception Network were ensembled to improve performance on the task of classifying aerial scenes. The individual Inception model showed 94% prediction accuracy and DenseNet showed 95%, whereas the ensemble model stacking the two improved the accuracy to 97% (Dede et al., 2018).

The ensemble method has the advantage of integrating individual models to improve prediction accuracy and generalization performance. Inspired by this recent success, we used a deep neural network-based ensemble model that stacks four machine learning models and a deep learning model.

2.2. Related work

In this section, related studies on automatic pig carcass analysis equipment and systems are described. Most of the systems use vision-based equipment or ultrasound-based equipment that can measure each carcass quickly.

A 3D camera (i.e., Microsoft Kinect V2) was used to measure the body weight, muscle depth, and back fat of 557 pigs (Fernandes et al., 2020). For the prediction of each individual factor, the deep learning method achieved higher prediction performance than the statistical and machine learning methods. In addition, the VCS2000 equipment can measure lean meat percentage (LMP) from images of half carcasses (Font i Furnols and Gispert, 2009). Lohumi et al. (2018) developed a non-destructive prediction model for the LMP of pig carcasses using VCS2000 in Korea. Data were collected from 175 pigs and analyzed using multilinear regression. The estimate of the whole LMP had a coefficient of determination (R²) of 0.77, and the predictions for specific parts such as ham, belly, and shoulder had R² values ≥ 0.8.

Ultrasound-based equipment can measure important characteristics, such as lean meat percentage, as well as muscle and fat proportions. This equipment can accurately measure the muscle and fat percentage of each part and can be used in various ways. Masferrer et al. (2018) proposed an automatic classification method based on data obtained from AutoFom III and individual data (e.g., sex, breed, and weight) of pigs. A total of 4000 hams were selected and classified according to thickness into thin (0–10 mm), standard (11–15 mm), semi-fat (16–20 mm), and fat (20 mm or more). The authors compared and evaluated the results of automatic classification using decision trees, support vector machines (SVMs), and k-nearest neighbor algorithms. The SVM classification model achieved an accuracy of 73%, showing that the proposed classification method can serve as a useful online tool to classify hams in a slaughterhouse. Subcutaneous fat thickness (SFT) is an important factor for classifying pork ham (Masferrer et al., 2019). That study also proposed a system for automatically classifying hams by predicted SFT values using the SVM model during the slaughter process. Experimental results showed that the prediction accuracy of the SVM model was 75.3%, which is approximately 5.5% higher than that of manual measurement. Automatic ham classification using the SVM model therefore has economic advantages and can be more accurate than human classification.

3. Deep learning based framework for predicting pork preference

In this section, we describe the specific content and process of predicting consumer flavor and appearance preference scores from pork feature information. Fig. 1 presents the overall flow of the proposed deep neural network-based ensemble (DNNE) framework, which consists of the following steps: data acquisition, data pre-processing, training and evaluation, and application. Datasets were obtained from the data acquisition and pre-processing steps. Subsequently, we used a DNNE model that extracts features from the datasets and performs well in predicting consumer preferences based on pork information. Finally, the predicted flavor and appearance preference scores can be used for assistant classification, production management, and evaluation metrics. Details are described in the following sections.

Fig. 1. Overview of the deep neural network-based ensemble framework.

3.1. Data acquisition

Owing to high pork consumption, a large number of pigs are slaughtered. With the modernization of slaughterhouses, more than 300 pigs can be slaughtered per hour. Ultrasound equipment (i.e., AutoFom III (FRONTMATEC)) was used to obtain detailed information about pork characteristics in real time. The ultrasound data contained as many as 40 measurement values of characteristic information, such as pig weight, whole lean meat percentage, trimmed fat percentage, trimmed fat weight, and backfat thickness. In particular, the data were measured separately for the pork belly, blade shoulder, spare rib, loin, tenderloin, picnic, ham, jowl, and skirt meat. For each part, such as the pork belly, the measurement information comprised lean meat weight, lean meat percentage, subcutaneous fat weight, subcutaneous fat percentage, intermuscular fat weight, and intermuscular fat percentage. Two particularly important elements of the measurement information are described here. Subcutaneous fat is fat stored between the skin and muscles, which prevents spoilage by microorganisms and improves long-term storage. Intermuscular fat is the fat between muscles under the subcutaneous fat. Additionally, intermuscular fat is an important factor contributing to meat quality (Choi et al., 2019).

Raw data was composed by combining ultrasound data with preference survey data according to the pork parts (See Section 4.1). Raw data were then pre-processed before training the proposed models. In the following subsections, we describe each step.

3.2. Data pre-processing

To obtain machine learning and deep neural network models with good performance, checking the collected data is crucial. Data pre-processing was performed to check whether the collected data contained any missing parts, errors, or parts that required processing.

In the first process, data cleaning is conducted to maintain completeness, uniqueness, and unity of the raw data (Rahm and Do, 2000). Data selection is the process of selecting the feature information of partial meat and its corresponding preference scores. Since each user has different preferences for the same product, determining a representative preference value for the product is difficult. Past research has solved this problem with a modified content-based filtering method that considers the consumer's personal preference and the influence of other people (Sato et al., 2013).

We used a different approach, weighting the preference score according to the fat ratio of each pork belly. By combining the preference mean (μ) and the standard deviation (σ) given for each belly, a weighting value was computed as shown in Equation (1). In addition, the ratio between the original preference score and the weighting value was controlled by the parameters alpha (α) and beta (β) in Equation (2), where α + β = 1. The ratio of α to β may be adjusted according to the situation. As α increases, the weight of the individual consumer preference score increases, whereas as β increases, the influence of the mean consumer score increases. After conducting various trials with different settings, we set the parameters to α = 0.2 and β = 0.8, which yielded the best performance.

XWeight=μσ (1)
$$X = \alpha \times X_{original} + \beta \times X_{Weight} \quad (2)$$

In the next step, the input features were rescaled, because training may not work well when the features used as input values have different scales. Therefore, we applied data scaling (Evans, 2006) to give all input features the same range. Data scaling was performed using the MinMax Scaler (Patro and Sahu, 2015) to adjust the scale of the feature data to 0–1. Finally, the dataset created through data pre-processing was delivered to the Train & Evaluate step.
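The blending in Equation (2) and the 0–1 scaling can be sketched as follows. This is a minimal illustration, not the authors' code: the weighting value of Equation (1) is taken as an already computed input, and the function names and numbers are hypothetical.

```python
def blend_preference(x_original, x_weight, alpha=0.2, beta=0.8):
    """Equation (2): X = alpha * X_original + beta * X_Weight."""
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta must sum to 1")
    return alpha * x_original + beta * x_weight

def min_max_scale(values):
    """Rescale one feature column linearly to the range 0-1."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant column: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical inputs: an individual score of 4.0 and a weighting value of 3.0.
blended = blend_preference(x_original=4.0, x_weight=3.0)   # about 3.2
scaled = min_max_scale([2.0, 4.0, 6.0])                    # [0.0, 0.5, 1.0]
```

With α = 0.2 and β = 0.8, the blended score stays close to the weighting value, so an individual rating shifts the training target only slightly.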

3.3. Train & Evaluate

This section describes the structure and data flow of the DNN-based ensemble (DNNE) model used in the experiment to predict flavor and appearance preference scores (Fig. 2). The pre-processed dataset is divided into training (80%) and test (20%) datasets. The training data were used to train the four traditional machine learning models (linear (Tranmer and Elliot, 2008), ridge (McDonald, 2009), random forest (Rodriguez-Galiano et al., 2015), and gradient boosting (Ketkar, 2017)) and the DNN model. The proposed DNNE model uses an ensemble technique (i.e., the prediction step) by stacking the trained models. This stacking ensemble technique consists of two major steps (Ribeiro and dos Santos Coelho, 2020).

Fig. 2. Deep neural network-based ensemble model.

First, a prediction result is derived from each of the four trained machine learning models and the deep neural network. In the process of predicting the flavor and appearance preference scores, the same seven pork features (lean meat weight and intermuscular fat percentage of pork belly, etc.) are used as inputs to every individual model. Second, the final consumer preference score is predicted by a meta-regressor (LightGBM regressor (Alzamzami et al., 2020)) that is trained on the derived prediction results. Finally, the predicted final consumer preference score is transmitted to the application and utilized. The prediction performance and results of the proposed model are described in detail in Section 4.
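The two stacking steps can be sketched in plain Python. This is only an illustration under strong simplifications: two stand-in base models (a mean predictor and a one-feature least-squares line) replace the five models of the framework, a two-weight least-squares fit replaces the LightGBM meta-regressor, and the data are hypothetical.

```python
import random

def split_80_20(rows, seed=0):
    """Shuffle rows and split them into 80% training and 20% test data."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]

def fit_mean(train):
    """Stand-in base model: always predicts the mean training score."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Stand-in base model: one-feature least-squares line."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def fit_meta(pred_a, pred_b, targets):
    """Stand-in meta-regressor: least squares on the two base predictions."""
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, targets))
    sby = sum(b * y for b, y in zip(pred_b, targets))
    det = saa * sbb - sab * sab        # 2x2 normal equations, Cramer's rule
    wa = (say * sbb - sby * sab) / det
    wb = (sby * saa - say * sab) / det
    return lambda a, b: wa * a + wb * b

# Hypothetical data: one scaled feature x and a score y = 2x + 1.
rows = [(i / 19, 2.0 * (i / 19) + 1.0) for i in range(20)]
train, test = split_80_20(rows)

# Step 1: train the base models and collect their predictions.
model_a, model_b = fit_mean(train), fit_line(train)
pred_a = [model_a(x) for x, _ in train]
pred_b = [model_b(x) for x, _ in train]

# Step 2: train the meta-regressor on the base predictions.
meta = fit_meta(pred_a, pred_b, [y for _, y in train])

def predict(x):
    """Final score: the meta-regressor combines the base predictions."""
    return meta(model_a(x), model_b(x))

test_errors = [abs(predict(x) - y) for x, y in test]
```

In practice, the meta-regressor is often trained on out-of-fold base predictions to limit leakage; whether the framework does so is not stated in the text.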

3.4. Application

The predicted consumer preference score based on the pig characteristic information can be used in the following applications: (a) An assistant classification system that helps the classifier decide where to ship pork, based on the predicted preference scores, in real time during the slaughtering process. (b) A production management system that tracks information on where and how pigs with high preference scores are produced. (c) An evaluation metric other than the pig carcass grades. Korea's pig carcass grade is an indicator of carcass quality. The carcass grading method used in Korea is divided into the 1st grade, which is based on carcass weight and backfat thickness, and the 2nd grade, which is based on appearance (fat attachment condition) and defect items (fractures, spinal abnormalities). The final grade is divided into 1+, 1, 2, and others by assigning the lowest grade between the 1st and 2nd grades. Ironically, carcass grade and meat quality are commonly used interchangeably in Korea, making it difficult for consumers to choose the desired quality. Hence, the predicted preference score indicates how adequately the meat satisfies consumer preferences. Moreover, predicting consumer preferences in advance is an important tool that supports marketing and distribution and provides guidance to stakeholders when making decisions.
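The final-grade rule described above (taking the lower of the two step grades) can be written as a short sketch. It assumes, for illustration only, that both grading steps use the same labels as the final grade; the function name is hypothetical.

```python
# Grade labels ordered from best to worst (labels taken from the text).
GRADE_ORDER = ["1+", "1", "2", "others"]

def final_grade(first_step, second_step):
    """Return the lower (worse) of the 1st and 2nd step grades."""
    return max(first_step, second_step, key=GRADE_ORDER.index)

example = final_grade("1+", "2")   # the worse of the two grades is "2"
```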

4. Empirical evaluation

Section 4.1 describes the details of the raw datasets, and Section 4.2 describes the experimental results of predicting consumer preference using the raw data.

4.1. Raw datasets

This section explains the type of information contained in the raw data, which comprised ultrasound and preference survey data. The ultrasound data contains characteristic information on pork obtained using ultrasound equipment (AutoFom III) implemented in the slaughter line. The ultrasound data used in the experiment had seven characteristics: commercial meat weight, lean meat weight, lean meat percentage, subcutaneous fat weight, subcutaneous fat percentage, intermuscular fat weight, and intermuscular fat percentage of the pork belly. Given that intermuscular fat has low thermal conductivity, pork with more than 20% intermuscular fat undergoes relatively little water evaporation during heating and is rich in “juiciness” (Hoa et al., 2021). When pork is being chewed, fat stimulates the secretion of saliva, allowing the meat to be deeply savored. Additionally, fat, which is softer than lean meat, gives the meat a softer texture when chewed (Fortin et al., 2005). Therefore, measured intermuscular fat is an important indicator that affects consumer taste and the decision-making process more than other factors.

To determine consumer preference according to the percentage of intermuscular fat in the pork belly, we collected pork according to the measured ratio. Fig. 3 shows the classification of pork with 10–20% intermuscular fat. Preference survey data were acquired by conducting flavor and appearance preference surveys on the classified pork. The flavor preference survey was conducted six times over five months at the company restaurant. A total of 1767 people participated in the survey, including consumers, wholesalers, and food workers. Three pigs were used per survey, and the pork belly was divided into parts so that the same parts could be compared. Each participant sampled two pieces of meat from each fat percentage group; efforts were made to maintain consistent grilling conditions and to ensure that the meat was fully cooked. In total, 18 pigs were used in the flavor survey, and the thickness of the pork belly was standardized to 0.9 cm. The appearance preference survey was conducted eight times over a period of eight months in the conference room. A total of 5150 people participated in the survey, including wholesalers and food workers. One hundred fifty-two pigs were used in the appearance preference survey, and photographs of the same pork belly parts were standardized. Each participant evaluated six photos with different fat percentages using a screen in the conference room. In addition, the survey was conducted using photos with similar fat percentages taken from different pigs. Both flavor and appearance surveys rated preferences on a scale of 1–5.

Fig. 3. Pork classification according to the intermuscular fat of pork belly. A total of 170 pigs with an intermuscular fat ratio between 10 and 20% were selected, and their consumer preferences were investigated. (a)–(f) Representative images of pork according to the percentage of intermuscular fat. As the percentage of fat increases, we observed that the white portion (i.e., intermuscular fat) is more spread out in the meat in between the red portion (i.e., lean meat). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Finally, the collected datasets were checked for distributions and outliers. Both the flavor and appearance datasets had an appropriate distribution and were not biased toward any value. Additionally, data cleaning found no missing data or outliers, and the format was unified (Maletic and Marcus, 2000). The datasets were suitable for machine learning and deep learning training once they had undergone data pre-processing (See Section 3.2).

4.2. Experimental results to predict consumer preference with DNNE model

This section discusses in detail the results of consumer preference prediction using the aforementioned raw data. At this time, the results were compared and evaluated by dividing them into flavor and appearance models according to the type of raw data. Section 4.2.1 describes the details of the DNN model, which is the core model among the components of the DNNE model. Section 4.2.2 describes the predictive performance of consumer preference scores using various figures and tables.

4.2.1. Training result

The DNN model used consisted of an input layer, four hidden layers, and an output layer. The first hidden layer had 256 nodes, the second had 128 nodes, and the third and fourth had 64 nodes each. In the process of predicting the flavor and appearance preference scores, seven pork features (e.g., lean meat weight and intermuscular fat percentage of pork belly) were fed to the input layer of the DNN model. The detailed composition of the DNN model is as follows:
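As a rough illustration of the layer sizes described in this section (a sketch, not the authors' composition table or implementation), the number of trainable parameters of a fully connected network with these sizes can be counted as follows. The single output node and the use of a bias term per node are assumptions.

```python
# Seven input features, four hidden layers (256, 128, 64, 64), and an
# assumed single output node for the preference score.
LAYER_SIZES = [7, 256, 128, 64, 64, 1]

def count_parameters(sizes):
    """Count weights plus biases of a fully connected network."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))

per_layer = [(n_in + 1) * n_out
             for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]
total = count_parameters(LAYER_SIZES)
```

Under these assumptions the network has 47,425 parameters, most of them in the connection between the first and second hidden layers.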

Fig. 4 visualizes how well the flavor and appearance DNN models predicted the preference scores for the test dataset. Using a random sample of 50 test points, the predictions were compared intuitively with the actual scores. Both DNN models predicted values close to the actual preference values of the test dataset.

Fig. 4. Prediction result of the proposed model according to random sampling. The blue points represent the real preference scores, and the yellow points represent the predicted preference scores. For better visual comprehension, prediction results are sorted in ascending order by score value. It can be seen that prediction has been performed well for various data distributions. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

4.2.2. Comparison result

Model evaluation was conducted by comparing the proposed DNNE model with traditional machine learning regression methods. Using the same dataset, we experimented with multiple linear, ridge, random forest, and gradient boosting regression models.

Because the multiple linear regression model predicts the dependent variable from several input characteristics, it can be expected to perform better than simple linear regression. Multiple linear regression has the advantage of being easy to implement and having a short execution time, but analyzing the coefficients is difficult. A regularization technique was applied in the ridge model to increase the accuracy or interpretability of the linear model. Regularization prevents overfitting to a degree controlled by the alpha value, so an appropriate alpha value must be found. Random forest is a model that generates several decision trees and then makes predictions by majority vote or by averaging. That is, it is an ensemble learning method that combines decision trees and bagging. The advantage of the random forest is that it prevents overfitting and is robust against missing values. It can also report the importance of variables; however, it requires a longer execution time. Unlike the random forest model, gradient boosting is an ensemble model that builds each tree to compensate for the errors of the previous tree. Therefore, the gradient boosting model exhibits impressive performance in regression and classification. Finally, Table 1 compares the advantages and disadvantages of each model.

Table 1. Advantages and disadvantages of each model.

Type | Method | Advantages | Disadvantages
Machine Learning | Multiple Linear | Simple to implement; Good interpretation; Prediction time is low | Linear coefficient analysis is difficult
Machine Learning | Ridge | Simple to implement; Good interpretation; Prevent over-fitting | Need to select the perfect hyperparameter
Machine Learning | Random Forest | Can handle missing values; Prevent over-fitting | Prediction time is high; Difficulty in interpreting the results
Machine Learning | Gradient Boosting | Good interpretation; Prevent over-fitting | Difficult to scale up; Prediction time is high
Deep Learning | Proposed model | Non-linear; High predictive performance; High generalization performance | Depends on a lot of data

Fig. 5 and Fig. 6 allow the prediction results of each model to be checked intuitively: the horizontal axis is the actual preference score, and the vertical axis is the predicted preference score. The red line is a trend line that linearly expresses the relationship between actual and predicted data. The proposed flavor and appearance models draw trend lines with slopes closer to 1 than those of the other models. Overall, the two models predicted the preference scores well.

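Fig. 5. Flavor model regplot comparison.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \quad (4)$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \quad (5)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \quad (6)$$
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \quad (7)$$

Fig. 6. Appearance model regplot comparison.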

Additionally, the Pearson correlation coefficients between the actual and predicted preference scores were compared for each model. The Pearson correlation coefficient was obtained using Equation (3). Here, Xi is the actual preference score, Yi is the predicted preference score, and X̄ and Ȳ are the averages of the respective values. Values close to 1 indicate a strong positive linear relationship, and values close to 0 indicate no linear relationship between the two variables.

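$$r_{xy} = \frac{\sum_{i}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i}^{n}\left(X_i - \bar{X}\right)^2}\,\sqrt{\sum_{i}^{n}\left(Y_i - \bar{Y}\right)^2}} \quad (3)$$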
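Equation (3) can be computed directly, as in the following minimal sketch. The function name and the example scores are hypothetical and are not taken from the survey data.

```python
import math

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    ss_x = sum((x - mean_x) ** 2 for x in actual)
    ss_y = sum((y - mean_y) ** 2 for y in predicted)
    return cov / (math.sqrt(ss_x) * math.sqrt(ss_y))

# Hypothetical actual and predicted preference scores.
r = pearson_r([1.0, 2.0, 3.0, 4.0, 5.0], [1.2, 1.9, 3.1, 4.2, 4.8])
```

The coefficient measures only the linear association, so it is read together with the error metrics rather than in place of them.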

Fig. 7 shows that the proposed flavor model has a higher coefficient (0.86) than the other models, although its performance is similar to that of the gradient boosting model. The proposed appearance model was closest to 1, with a coefficient of 0.98, confirming that the predicted score had a strong linear relationship with the actual score.

Fig. 7. Comparison results of Pearson correlation coefficients.

General statistical error measures were used to numerically compare and evaluate the performance of the proposed DNNE model. Specifically, we used mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). MAE in Equation (4), also called L1 loss, is calculated by taking the absolute value of the difference between each actual value and its predicted value, adding these together, and dividing by the number of samples (n). MSE in Equation (5), also called L2 loss, is the average of the squared differences between the actual and predicted values. RMSE in Equation (6) is the square root of MSE, which expresses the error in the same units as the preference score. MAPE in Equation (7) is the average of the absolute errors taken relative to the actual values. Because MAPE expresses the degree of error as a percentage, it makes the performance of a model easy to understand intuitively and makes it relatively easy to compare performance across multiple target variables. The values of the four evaluation indicators represent the overall degree of error between the actual and predicted values, and the model with the lowest error values can be regarded as having the best performance.
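
The four measures in Equations (4) to (7) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the function name `regression_errors` and the example scores are hypothetical, and the MAPE is multiplied by 100 on the assumption that Table 2 reports it in percent.

```python
import math

def regression_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actual)
    errors = [y - y_hat for y, y_hat in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n    # Equation (4)
    mse = sum(e * e for e in errors) / n     # Equation (5)
    rmse = math.sqrt(mse)                    # Equation (6)
    # Equation (7), scaled by 100 to give a percentage (assumption).
    # Actual values must be non-zero for the ratio to be defined.
    mape = 100.0 * sum(abs(e / y) for e, y in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

# Hypothetical preference scores, not data from the study.
actual = [2.0, 4.0, 5.0]
predicted = [2.5, 3.0, 5.0]
metrics = regression_errors(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MAPE divides by the actual score, it is only defined when no actual score is zero, which is worth checking before applying it to a rating scale.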

Fig. 8 and Table 2 show that the proposed model has the lowest error on nearly every measure and that its prediction performance is better overall than that of the other models; the exceptions are the flavor MSE and RMSE, for which gradient boosting is marginally lower (0.121 vs. 0.124 and 0.347 vs. 0.352). Specifically, in terms of MAPE, the error of the proposed appearance model is roughly 4.5 times lower than that of the linear model, 4.6 times lower than that of the ridge model, 3.4 times lower than that of the random forest model, and 2 times lower than that of the gradient boosting model.

Fig. 8. Comparison results of prediction performance.

Table 2.

Flavor and appearance preference prediction performance comparison.

                    Flavor                          Appearance
Method              MAE    MSE    RMSE   MAPE (%)   MAE    MSE    RMSE   MAPE (%)
Linear              0.370  0.210  0.458  15.034     0.561  0.484  0.696  23.011
Ridge               0.456  0.351  0.592  18.714     0.570  0.496  0.704  23.414
Random Forest       0.296  0.139  0.373  12.232     0.429  0.315  0.561  17.143
Gradient Boosting   0.280  0.121  0.347  11.357     0.242  0.087  0.295  10.085
Proposed Model      0.279  0.124  0.352  11.171     0.117  0.021  0.145  5.024
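
The "times lower" figures quoted for the appearance model can be reproduced from the MAPE column of Table 2. The snippet below is a small arithmetic check rather than part of the authors' pipeline; the variable names are illustrative.

```python
# Appearance MAPE values copied from Table 2.
appearance_mape = {
    "Linear": 23.011,
    "Ridge": 23.414,
    "Random Forest": 17.143,
    "Gradient Boosting": 10.085,
    "Proposed Model": 5.024,
}

proposed = appearance_mape["Proposed Model"]

# Ratio of each baseline's MAPE to the proposed model's MAPE.
ratios = {
    name: value / proposed
    for name, value in appearance_mape.items()
    if name != "Proposed Model"
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the proposed model's MAPE")
```

The ratios come out at about 4.58, 4.66, 3.41, and 2.01, so the figures in the text match these values truncated to one decimal place.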

Finally, the actual and predicted preference scores of the flavor and appearance models were compared using a paired t-test to check whether the two sets of scores differ significantly (Equation (8)). In the formula, d denotes the difference between the two values of a pair, that is, the actual score minus the predicted score for the same sample. The t statistic is the sum of the differences divided by the square root of the following quantity: n times the sum of the squared differences, minus the square of the sum of the differences, all divided by n − 1. The null hypothesis (Equation (9)) is defined as "there will be no difference in mean or score between the two groups," and the alternative hypothesis (Equation (10)) is defined as "there will be a difference in mean or score between the two groups." If the significance probability (p-value) is less than 0.05, the null hypothesis is rejected, and the mean difference between the two groups is interpreted as significant. Otherwise, the null hypothesis is not rejected, and the mean difference is interpreted as not significant.

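$$t = \frac{\sum d}{\sqrt{\dfrac{n\left(\sum d^{2}\right) - \left(\sum d\right)^{2}}{n-1}}} \tag{8}$$

$$H_0: \mu_1 = \mu_2 \tag{9}$$

$$H_1: \mu_1 \neq \mu_2 \tag{10}$$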
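
The statistic in Equation (8) can be sketched as follows. The function name `paired_t_statistic` and the example scores are hypothetical. The sketch returns only the statistic; obtaining a p-value additionally requires the t distribution with n − 1 degrees of freedom, which a library routine such as `scipy.stats.ttest_rel` provides.

```python
import math

def paired_t_statistic(actual, predicted):
    """Paired t statistic computed as written in Equation (8)."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(actual)
    # d is the within-pair difference: actual minus predicted.
    d = [a - p for a, p in zip(actual, predicted)]
    sum_d = sum(d)
    sum_d_squared = sum(x * x for x in d)
    variance_term = (n * sum_d_squared - sum_d ** 2) / (n - 1)
    if variance_term <= 0:
        raise ValueError("all differences are identical; t is undefined")
    return sum_d / math.sqrt(variance_term)

# Hypothetical preference scores, not data from the study.
actual = [3.0, 4.0, 5.0, 2.0]
predicted = [2.5, 4.5, 4.0, 2.5]
t_stat = paired_t_statistic(actual, predicted)
print(f"t = {t_stat:.4f} with {len(actual) - 1} degrees of freedom")
```

This form is algebraically the same as the mean difference divided by its standard error, so it should agree with the statistic reported by standard statistical packages.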

The results of the t-tests comparing the actual and predicted preference scores for each model are shown in Table 3. For flavor, every model listed has a p-value greater than 0.05. For appearance, only the random forest and proposed models have p-values greater than 0.05; for these models, the null hypothesis is not rejected, and the mean difference between the actual and predicted scores is interpreted as not significant. Fig. 9 shows the t-test graph, in which individual data points are overlaid on box plots that summarize the data distribution. Fig. 9(a) and (b) show that the proposed model yields a data distribution highly similar to that of the actual preference scores, with similar mean lines and box intervals.

Table 3.

Actual and predicted preference score t-test results for each model.

                    Flavor                 Appearance
Group               Statistic  p-value     Statistic  p-value
Linear              0.0648     0.94        2.5760     0.01
Ridge               −0.1245    0.90        2.6395     0.008
Random Forest       0.7193     0.47        1.7804     0.075
Proposed Model      2.8157     0.99        7.3157     0.99
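
The decision rule described for the t-test (reject the null hypothesis when the p-value is below 0.05) can be applied to the appearance p-values in Table 3 as a quick check. The helper name `decision` is an illustrative assumption.

```python
ALPHA = 0.05  # significance level stated in the text

# Appearance p-values copied from Table 3.
appearance_p_values = {
    "Linear": 0.01,
    "Ridge": 0.008,
    "Random Forest": 0.075,
    "Proposed Model": 0.99,
}

def decision(p_value, alpha=ALPHA):
    """Apply the p < alpha rule for rejecting the null hypothesis."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

decisions = {model: decision(p) for model, p in appearance_p_values.items()}
for model, outcome in decisions.items():
    print(f"{model}: {outcome}")
```

Under this rule, the null hypothesis is rejected for the linear and ridge models only, which matches the reading of Table 3 given in the text.
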

Fig. 9. T-test graph of flavor and appearance models.

Fig. 10. Predicted consumer preference score scatter plot according to the ratio of intermuscular fat. The groups are classified according to the ratio of fat, with each group comprising 100 random pork samples. High-fat, medium-fat, and low-fat meats are indicated by red circles, green triangles, and yellow squares, respectively. The large circle of each color marks the average value of that group; thus, the area in which each group is concentrated can be checked intuitively. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

In summary, the proposed DNNE model extracts the features of the input data well and shows high prediction performance. The results indicate that the proposed model outperformed the other machine-learning models overall, and that both the flavor and appearance models predicted preference accurately, with MSE values below 0.2. Thus, consumer preference scores could be successfully predicted from the characteristics of pork.

4.2.3. Application example

As previously described, the raw dataset and experimental results were analyzed to confirm the relationship between pork feature information and preference scores. Seven pork belly features were analyzed by linking them with the flavor and appearance preference scores. Interestingly, among these features, the preference score varies with the ratio of intermuscular fat. Preference scores were therefore analyzed by classifying samples into low-fat (less than 13%), medium-fat (13% or more and less than 16%), and high-fat (16% or more) groups according to the ratio of intermuscular fat. Fig. 10 presents the predictions of the proposed models regarding consumers' preferences for pork classified by fat percentage. High-fat pork had the highest flavor score, and medium-fat pork had the highest appearance score. The flavor preference score tended to be 14% higher for high-fat meat than for low-fat and medium-fat meat. Moreover, the appearance preference score tended to be 17% higher for medium-fat meat than for low-fat and high-fat meat.
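
The grouping rule above can be sketched as a small classifier followed by a per-group average. The thresholds are those quoted in the text; the function names and the sample values are hypothetical and do not come from the study's dataset.

```python
from collections import defaultdict

def fat_group(intermuscular_fat_pct):
    """Assign a fat group using the thresholds quoted in the text."""
    if intermuscular_fat_pct < 13.0:
        return "low-fat"
    if intermuscular_fat_pct < 16.0:
        return "medium-fat"
    return "high-fat"

def mean_score_by_group(samples):
    """Average a predicted score per fat group.

    `samples` is an iterable of (fat_pct, predicted_score) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for fat_pct, score in samples:
        entry = totals[fat_group(fat_pct)]
        entry[0] += score
        entry[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

# Hypothetical (fat %, predicted flavor score) pairs, not data from the study.
samples = [(11.5, 3.0), (12.9, 3.2), (13.0, 3.4),
           (15.9, 3.6), (16.0, 3.9), (18.2, 4.1)]
group_means = mean_score_by_group(samples)
for group, mean in group_means.items():
    print(f"{group}: {mean:.2f}")
```

Note that the boundaries are half-open: a sample at exactly 13% falls in the medium-fat group, and one at exactly 16% falls in the high-fat group.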

Using the predicted consumer scores as an indicator, pork with a high flavor score can be delivered to restaurants, and pork with a high appearance score can be delivered to the market. Similarly, consumer preference scores can be used as part of pork evaluation metrics and in various applications, such as assistant systems for pork shipping and classification and production management systems for high-preference pork.

5. Discussion

The experimental results show that the proposed model predicts consumer flavor and appearance preferences, which vary with the percentage of intermuscular fat in the pork belly, better than the other machine learning methods. However, for flavor, the prediction performance of the proposed model differed only slightly from that of the gradient boosting model. This suggests that numerically expressing the taste preferences of individual consumers is difficult. The subjective taste and taste preferences experienced by an individual change over time and may also change depending on the individual's condition. In one study, the taste preferences of the same 25 subjects for the same beverage were measured over 29 days, confirming that taste measurements and preferences change over time (Mattes, 1988). Additionally, accurately measuring taste preferences is difficult because they change with individual body weight (Sauer et al., 2017) and with hormonal changes during the menstrual cycle (Kuga et al., 1999). We therefore presume that these factors explain why the prediction of flavor preference was less accurate than the prediction of appearance preference.

In addition, we confirmed that the higher the intermuscular fat, the higher the flavor preference, as argued in a recent study (Hoa et al., 2021). The distribution of appearance preference showed that consumers preferred relatively lower-fat pork, which may reflect the increased interest in diet and health (Font-i Furnols and Guerrero, 2014; Frank et al., 2017), or may indicate that Koreans prefer high-fat pork for eating but pork that looks slightly less fatty. Hence, we were able to characterize recent pork preferences among Koreans. The analysis reaffirmed the claims of past studies using accurate measurement data for both flavor and appearance. Knowing that Koreans prefer high-fat pork for eating and medium-fat pork visually is useful for marketing and for a variety of applications.

6. Conclusion

Previous studies on the use of machine learning in PLF have focused on pig production and management. Most of these studies belong to the image processing field, such as pig weight estimation and pig position detection. Economically, producing as many pigs as possible is valuable; however, modern consumer trends indicate that producing pigs that taste better and are of higher quality is more important.

This study presents a deep learning-based framework for predicting customers' flavor and appearance preference scores based on the characteristic information of pork bellies. Ultrasound-based equipment (AutoFom III) was used to accurately measure pork belly information, and consumer preferences according to the characteristic information were surveyed over a significant period. The empirical results show that the proposed deep learning model performs better than other machine learning methods. Consequently, we expect that the proposed framework and its predicted preference scores can support a company's marketing strategy and the management of high-demand pig production. Companies may increase transaction satisfaction by supplying pork that meets consumer needs; accordingly, producers would have the advantage of earning additional profits. In this study, we collected data and performed experiments using pork belly only; thus, the results have limited applicability. However, they provide valuable insights into the applications of AI in food science. Future research should expand the system to other cuts, such as the neck and ribs, and we will extend the framework using additional information on pork carcasses.

CRediT authorship contribution statement

Eunyoung Ko: Resources, Data curation. Kyungchang Jeong: Writing – original draft, Software. Hongseok Oh: Software. Yunhwan Park: Investigation. Jungseok Choi: Funding acquisition, Project administration. Euijong Lee: Writing – review & editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) through High Value-added Food Technology Development Program, funded by Ministry of Agriculture, Food and Rural Affairs (MAFRA) (321028–5). This work was supported by the Dodram Quality Control Management Team of Dodram Pig Farmers Cooperative.

Contributor Information

Eunyoung Ko, Email: Koe77@dodram.co.kr.

Kyungchang Jeong, Email: rudckd135@cbnu.ac.kr.

Hongseok Oh, Email: hong36367662@cbnu.ac.kr.

Yunhwan Park, Email: yhp056@cbnu.ac.k.

Jungseok Choi, Email: jchoi@cbnu.ac.kr.

Euijong Lee, Email: kongjjagae@cbnu.ac.kr.

Data availability

Data will be made available on request.

References

  1. Adomavicius G., Tuzhilin A. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 2005;17:734–749.
  2. Alzamzami F., Hoda M., El Saddik A. Light gradient boosting machine for general sentiment classification on short texts: a comparative evaluation. IEEE Access. 2020;8:101840–101858.
  3. Amasyali K., El-Gohary N.M. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 2018;81:1192–1205.
  4. Banhazi T.M., Lehr H., Black J., Crabtree H., Schofield P., Tscharke M., Berckmans D. Precision livestock farming: an international review of scientific and commercial aspects. Int. J. Agric. Biol. Eng. 2012;5:1–9.
  5. Bottou L. 2012. Stochastic Gradient Descent Tricks; pp. 421–436.
  6. Cheng H.T., Koc L., Harmsen J., Shaked T., Chandra T., Aradhye H., Anderson G., Corrado G., Chai W., Ispir M., et al. Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. 2016. Wide & deep learning for recommender systems; pp. 7–10.
  7. Choi J., Kwon K., Lee Y., Ko E., Kim Y., Choi Y. Characteristics of pig carcass and primal cuts measured by the autofom III depend on seasonal classification. Food science of animal resources. 2019;39:332. doi: 10.5851/kosfa.2019.e30.
  8. Christoffersen P., Jacobs K. The importance of the loss function in option valuation. J. Financ. Econ. 2004;72:291–318.
  9. Chuluunsaikhan T., Ryu G.A., Yoo K.H., Rah H., Nasridinov A. Incorporating deep learning and news topic modeling for forecasting pork prices: the case of South Korea. Agriculture. 2020;10:513.
  10. Coşkun M., Yildirim Ö., Ayşegül U., Demir Y. An overview of popular deep learning methods. Eur. J. Tech. (EJT) 2017;7:165–176.
  11. Covington P., Adams J., Sargin E. Proceedings of the 10th ACM Conference on Recommender Systems. 2016. Deep neural networks for youtube recommendations; pp. 191–198.
  12. Davidson J., Liebald B., Liu J., Nandy P., Van Vleet T., Gargi U., Gupta S., He Y., Lambert M., Livingston B., et al. Proceedings of the Fourth ACM Conference on Recommender Systems. 2010. The youtube video recommendation system; pp. 293–296.
  13. Dede M.A., Aptoula E., Genc Y. Deep network ensembles for aerial scene classification. Geosci. Rem. Sens. Lett. IEEE. 2018;16:732–735.
  14. Emmert-Streib F., Yang Z., Feng H., Tripathi S., Dehmer M. An introductory review of deep learning for prediction models with big data. Front. Artif. Intell. 2020;3:4. doi: 10.3389/frai.2020.00004.
  15. Evans P. Scaling and assessment of data quality. Acta Crystallogr. Sect. D Biol. Crystallogr. 2006;62:72–82. doi: 10.1107/S0907444905036693.
  16. Fawaz H.I., Forestier G., Weber J., Idoumghar L., Muller P.A. 2019 International Joint Conference on Neural Networks (IJCNN) IEEE; 2019. Deep neural network ensembles for time series classification; pp. 1–6.
  17. Fortin A., Robertson W., Tong A. The eating quality of canadian pork and its relationship with intramuscular fat. Meat Sci. 2005;69:297–305. doi: 10.1016/j.meatsci.2004.07.011.
  18. Frank D., Oytam Y., Hughes J. 2017. Sensory Perceptions and New Consumer Attitudes to Meat; pp. 667–698.
  19. FRONTMATEC Advanced ultrasonic image analysis. https://www.frontmatec.com/en/pork-solutions/unclean-linecarcass-grading-traceability
  20. Fernandes A.F., Dórea J.R., Valente B.D., Fitzgerald R., Herring W., Rosa G.J. Comparison of data analytics strategies in computer vision systems to predict pig body composition traits from 3d images. J. Anim. Sci. 2020;98:skaa250. doi: 10.1093/jas/skaa250.
  21. Font-i Furnols M., Guerrero L. Consumer preference, behavior and perception about meat and meat products: an overview. Meat Sci. 2014;98:361–371. doi: 10.1016/j.meatsci.2014.06.025.
  22. Gomez-Uribe C.A., Hunt N. The netflix recommender system: algorithms, business value, and innovation. ACM Transactions on Management Information Systems (TMIS) 2015;6:1–19.
  23. Gulati H. 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom) IEEE; 2015. Predictive analytics using data mining technique; pp. 713–716.
  24. Hoa V.B., Seol K.H., Seo H.W., Seong P.N., Kang S.M., Kim Y.S., Moon S.S., Kim J.H., Cho S.H. Meat quality characteristics of pork bellies in relation to fat level. Animal Bioscience. 2021;34:1663. doi: 10.5713/ab.20.0612.
  25. i Furnols M.F., Gispert M. Comparison of different devices for predicting the lean meat percentage of pig carcasses. Meat Sci. 2009;83:443–446. doi: 10.1016/j.meatsci.2009.06.018.
  26. Jha K., Doshi A., Patel P., Shah M. A comprehensive review on automation in agriculture using artificial intelligence. Artificial Intelligence in Agriculture. 2019;2:1–12.
  27. Ketkar N. 2017. Stochastic Gradient Descent; pp. 113–132.
  28. Kuga M., Ikeda M., Suzuki K. Gustatory changes associated with the menstrual cycle. Physiol. Behav. 1999;66:317–322. doi: 10.1016/s0031-9384(98)00307-2.
  29. LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521:436–444. doi: 10.1038/nature14539.
  30. Li Y., Yuan Y. Convergence analysis of two-layer neural networks with relu activation. Adv. Neural Inf. Process. Syst. 2017;30.
  31. Liu Z., Li Z., Wu K., Li M. Urban traffic prediction from mobility data using deep learning. IEEE Network. 2018;32:40–46.
  32. Lohumi S., Wakholi C., Baek J.H., Do Kim B., Kang S.J., Kim H.S., Yun Y.K., Lee W.Y., Yoon S.H., Cho B.K. Nondestructive estimation of lean meat yield of south Korean pig carcasses using machine vision technique. Korean J. Food Sci. Anim. Resour. 2018;38:1109. doi: 10.5851/kosfa.2018.e44.
  33. Maimaitijiang M., Sagan V., Sidike P., Hartling S., Esposito F., Fritschi F.B. Soybean yield prediction from uav using multimodal data fusion and deep learning. Rem. Sens. Environ. 2020;237:111599.
  34. Maletic J.I., Marcus A. Data cleansing: beyond integrity analysis., in: Iq. Citeseer. 2000:200–209.
  35. Masferrer G., Carreras R., Font-i Furnols M., Gispert M., Marti-Puig P., Serra M. On-line ham grading using pattern recognition models based on available data in commercial pig slaughterhouses. Meat Sci. 2018;143:39–45. doi: 10.1016/j.meatsci.2018.04.011.
  36. Masferrer G., Carreras R., Font-i Furnols M., Gispert M., Serra M., Marti-Puig P. Automatic ham classification method based on support vector machine model increases accuracy and benefits compared to manual classification. Meat Sci. 2019;155:1–7. doi: 10.1016/j.meatsci.2019.04.018.
  37. Mattes R.D. Reliability of psychophysical measures of gustatory function. Percept. Psychophys. 1988;43:107–114. doi: 10.3758/bf03214187.
  38. McDonald G.C. Ridge regression. Wiley Interdisciplinary Reviews: Comput. Stat. 2009;1:93–100.
  39. Mertens K., Decuypere E., De Baerdemaeker J., De Ketelaere B. Statistical control charts as a support tool for the management of livestock production. J. Agric. Sci. 2011;149:369–384.
  40. Najafabadi M.M., Villanustre F., Khoshgoftaar T.M., Seliya N., Wald R., Muharemagic E. Deep learning applications and challenges in big data analytics. J. big data. 2015;2:1–21.
  41. Ochs D.S., Wolf C.A., Widmar N.J., Bir C. Consumer perceptions of egg-laying hen housing systems. Poultry Sci. 2018;97:3390–3396. doi: 10.3382/ps/pey205.
  42. Oh S.H., See M. Pork preference for consumers in China, Japan and South Korea. Asian-Australas. J. Anim. Sci. 2012;25:143. doi: 10.5713/ajas.2011.11368.
  43. Okura S., Tagami Y., Ono S., Tajima A. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017. Embedding-based news recommendation for millions of users; pp. 1933–1942.
  44. Patro S., Sahu K.K. 2015. Normalization: A Preprocessing Stage. arXiv preprint arXiv:1503.06462.
  45. Rahm E., Do H.H. Data cleaning: problems and current approaches. IEEE Data Eng. Bull. 2000;23:3–13.
  46. Ribeiro M.H.D.M., dos Santos Coelho L. Ensemble approach based on bagging, boosting and stacking for short-term prediction in agribusiness time series. Appl. Soft Comput. 2020;86:105837.
  47. Ricci F., Rokach L., Shapira B. 2015. Recommender Systems: Introduction and Challenges; pp. 1–34.
  48. Rodriguez-Galiano V., Sanchez-Castillo M., Chica-Olmo M., Chica Rivas M. Machine learning predictive models for mineral prospectivity: an evaluation of neural networks, random forest, regression trees and support vector machines. Ore Geol. Rev. 2015;71:804–818.
  49. Sato T., Fujita M., Kobayashi M., Ito K. 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2013) IEEE; 2013. Recommender system by grasping individual preference and influence from other users; pp. 1345–1351.
  50. Sauer H., Ohla K., Dammann D., Teufel M., Zipfel S., Enck P., Mack I. Changes in gustatory function and taste preference following weight loss. J. Pediatr. 2017;182:120–126. doi: 10.1016/j.jpeds.2016.11.055.
  51. Schmidhuber J. Deep learning in neural networks: an overview. Neural Network. 2015;61:85–117. doi: 10.1016/j.neunet.2014.09.003.
  52. South Korea Agricultural Affairs South Korea: livestock and products semi-annual. 2021. https://www.fas.usda.gov/data/south-korea-livestock-and-products-semi-annual-5
  53. South Korea Ministry of Food and Drug Safety Monthly pork imports report in South Korea. 2021. https://www.mfds.go.kr/wpge/m_311/de010603l0001.do
  54. South Korea National Statistical Office Changes in the structure of the livestock industry through statistics. 2020. https://kostat.go.kr/portal/korea/kor_nw/1/1/index.board?bmode=read&aSeq=386478
  55. South Korea Rural Development Administration A study on changes in livestock consumption environment in South Korea. 2021. https://www.korea.kr/common/download.do?fileId=196534137&tblKey=GMN
  56. Suryawanshi K.R., Redpath S.M., Bhatnagar Y.V., Ramakrishnan U., Chaturvedi V., Smout S.C., Mishra C. Impact of wild prey availability on livestock predation by snow leopards. R. Soc. Open Sci. 2017;4:170026. doi: 10.1098/rsos.170026.
  57. Tranmer M., Elliot M. Multiple linear regression. The Cathie Marsh Centre for Census and Survey Research (CCSR) 2008;5:1–5.
  58. Zhang S., Yao L., Sun A., Tay Y. Deep learning based recommender system: a survey and new perspectives. ACM Comput. Surv. 2019;52:1–38.
  59. Zhang Z. 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS) IEEE; 2018. Improved adam optimizer for deep neural networks; pp. 1–2.
  60. Zhu W., Xie L., Han J., Guo X. The application of deep learning in cancer prognosis prediction. Cancers. 2020;12:603. doi: 10.3390/cancers12030603.
