Bahador et al. [23] |
2021 |
Two scenarios: (1) data from three days of wristband device use from a single person, and (2) an open dataset of 10 individuals performing 186 activities (mobility, eating, personal hygiene, and housework) |
Develop a data fusion technique to achieve a more comprehensive insight into human activity dynamics. The authors considered the statistical dependency of multisensor data and explored intramodality correlation patterns for different activities.
Sensor array with temperature, interbeat intervals, electrodermal activity, photoplethysmography, and heart rate (1st dataset). Wristband with a 9-axis inertial measurement unit (2nd dataset)
Deep residual network. |
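As an illustration only (not the authors' implementation), a minimal residual block over multichannel sensor windows might look like the following PyTorch sketch; the channel count and window length are assumed placeholders.

```python
import torch
import torch.nn as nn

class ResidualBlock1D(nn.Module):
    """Minimal 1-D residual block for multichannel sensor windows."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm1d(channels)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm1d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        # Skip connection: output = F(x) + x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)

# Example: a batch of 8 windows, 5 fused sensor channels, 128 samples each.
x = torch.randn(8, 5, 128)
block = ResidualBlock1D(channels=5)
print(block(x).shape)  # torch.Size([8, 5, 128])
```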
Doulah et al. [24] |
2021 |
30 volunteers using the system for 24 h in a pseudo-free-living environment and 24 h in a free-living environment
Food intake detection with a sensor fusion classifier (accelerometer and flex sensor). An image sensor captured data every 15 s to validate the sensor fusion decision.
5 MP camera glasses add-on, with accelerometer and flex sensor in contact with the temporalis muscle
SVM model. |
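A hedged sketch of the general idea of feature-level sensor fusion with an SVM, using scikit-learn and synthetic data; the feature names and dimensions are assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic per-window features, e.g. accelerometer statistics and flex-sensor statistics.
accel_feats = rng.normal(size=(200, 6))   # hypothetical 6 accelerometer features per window
flex_feats = rng.normal(size=(200, 4))    # hypothetical 4 flex-sensor features per window
labels = rng.integers(0, 2, size=200)     # 1 = food intake, 0 = no intake

# Feature-level fusion: concatenate the two modalities before classification.
X = np.hstack([accel_feats, flex_feats])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, labels)
print(clf.predict(X[:5]))
```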
Heydarian et al. [25] |
2021 |
OREBA dataset [26], composed of OREBA-DIS with 100 participants consuming food in discrete portions and OREBA-SHA with 102 participants consuming a communal dish
Data fusion for automatic food intake gesture detection |
No sensors were used directly; the dataset was obtained from video and inertial sensor data
Fusion of inertial and video data with several methods that use deep learning. |
Kyritsis et al. [27] |
2021 |
FIC [28], FreeFIC [29], and FreeFIC held-out datasets containing triaxial acceleration and orientation velocity signals |
A complete framework for automated modeling of in-meal eating behavior and temporal localization of meals
Data from a smartwatch worn on either the right or left wrist (accelerometer and gyroscope)
CNN for feature extraction and LSTM network to model temporal evolution. Both parts are jointly trained by minimizing a single loss function. |
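A minimal PyTorch sketch of the general CNN-then-LSTM pattern trained with a single joint loss; layer sizes and the two-class output are illustrative assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """CNN extracts per-step features from raw inertial windows; an LSTM models their temporal evolution."""
    def __init__(self, in_channels=6, hidden=32, num_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):              # x: (batch, channels, time)
        feats = self.cnn(x)            # (batch, 32, time)
        feats = feats.transpose(1, 2)  # (batch, time, 32) for the LSTM
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])   # classify using the last time step

model = CNNLSTM()
x = torch.randn(4, 6, 100)                 # 4 windows, 6 inertial channels, 100 samples
y = torch.tensor([0, 1, 0, 1])
loss = nn.CrossEntropyLoss()(model(x), y)  # a single loss trains the CNN and LSTM jointly
loss.backward()
print(float(loss))
```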
Lee [30] |
2021 |
8 participants in noisy environments |
Detect eating events and calculate calorie intake |
Ultrasonic Doppler shifts to detect chewing events and a camera placed on the user's neck
Hidden Markov model recognizer to maximize swallow detection accuracy. The relation between chewing counts and the amount of food is modeled through linear regression. A CNN recognizes food items.
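As a rough illustration of one piece of this pipeline, the linear relation between chew counts and food amount, here is a scikit-learn sketch on synthetic numbers; the slope and data are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic example: chew counts per bite vs. ingested mass in grams (invented values).
chew_counts = rng.integers(5, 40, size=50).reshape(-1, 1)
mass_grams = 0.8 * chew_counts.ravel() + rng.normal(0, 2, size=50)

reg = LinearRegression().fit(chew_counts, mass_grams)
print(f"grams per chew ~ {reg.coef_[0]:.2f}, intercept {reg.intercept_:.2f}")
print("predicted mass for 25 chews:", reg.predict([[25]])[0])
```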
Mamud et al. [31] |
2021 |
Not specified; students were used, with emphasis on the acoustic signal
Develop a Body Area Network-based automatic dietary monitoring system to detect food type and volume, nutritional benefit, and eating behavior
Camera on chest with system hub, phones with added microphone and dedicated hardware to capture chewing and swallowing sounds, wrist-worn band with accelerometer and gyroscope |
Emphasis was given to the hardware system and the captured signals, but not to signal processing itself.
Mirtchouk and Kleinberg [32] |
2021 |
6 subjects for 6 h in a total of 59 h of data |
Gain insight into dietary activity, namely chews per minute and reasons for food choices
Custom earbud with 2 microphones—one in-ear and one external |
Stochastic variational deep kernel learning (SVDKL), which uses a deep neural network and multiple Gaussian processes, one per feature, to perform multiclass classification.
Rouast and Adam [33] |
2021 |
Two datasets of annotated intake gestures: OREBA [26] and the Clemson University dataset
A single-stage approach that directly decodes the probabilities learned from sensor data into sparse intake detections (eating and drinking)
Video and inertial data |
Deep neural network with weakly supervised training using a Connectionist Temporal Classification loss, and decoding with an extended prefix beam search algorithm.
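A minimal sketch of how a CTC loss is typically set up in PyTorch for sparse event sequences; the dimensions and the three-class label set (blank, eat, drink) are assumptions for illustration, and the extended prefix beam search decoder is not reproduced here.

```python
import torch
import torch.nn as nn

T, N, C = 50, 2, 3   # time steps, batch size, classes (0 = blank, 1 = eat, 2 = drink)

# Frame-level log-probabilities, e.g. the output of a deep network over sensor frames.
log_probs = torch.randn(T, N, C).log_softmax(dim=2).requires_grad_()

# Weak (sequence-level) supervision: only the ordered list of intake events per recording.
targets = torch.tensor([1, 2, 1, 1])                 # concatenated target sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.tensor([3, 1])                # first sample has 3 events, second has 1

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
print(float(loss))
```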
Fuchs et al. [34] |
2020 |
10,035 labeled product image instances created by the authors |
Detection of diet-related activities to support healthy food choices
Mixed reality headset-mounted cameras |
A comparison of several neural networks was performed based on object detection and classification accuracy.
Heremans et al. [35] |
2020 |
16 subjects for training, and 37 healthy control subjects and 73 patients with functional dyspepsia for testing |
Automatic food intake detection through dynamic analysis of heart rate variability |
Electrocardiogram |
ANN with leave-one-out cross-validation.
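A brief scikit-learn sketch of leave-one-out evaluation of a small neural network on synthetic HRV-style features; the feature set and network size are assumptions, not the study's configuration.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Synthetic stand-in for heart-rate-variability features per segment (e.g. RMSSD, SDNN, ...).
X = rng.normal(size=(40, 5))
y = rng.integers(0, 2, size=40)   # 1 = food intake segment, 0 = no intake

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
)
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("leave-one-out accuracy:", scores.mean())
```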
Hossain et al. [36] |
2020 |
15,343 images (2127 food images and 13,216 non-food images)
Classify images as food or non-food
Wearable egocentric camera |
CNN-based image classifier running on a Cortex-M7 microcontroller.
Rachakonda et al. [37] |
2020 |
1000 images obtained from copyright-free sources—800 used for training and 200 for testing |
Focus on the eating behavior of users, detect normal eating and stress eating, and create awareness about their food intake behaviors
Camera mounted on glasses |
Machine learning models for automatic classification of the food on the plate, automatic object detection from the plate, and automatic calorie quantification.
Sundarramurthi et al. [38] |
2020 |
Food101 dataset [39] (101,000 images with 101 food categories) |
Develop a GUI-based interactive tool |
Mobile device camera |
Convolutional Neural Network for food image classification and detection. |
Ye et al. [40] |
2020 |
COCO2017 dataset [41] |
A method for food smart recognition and automatic dietary assessment on a mobile device |
Mobile device camera |
Mask R-CNN. |
Farooq et al. [42] |
2019 |
40 participants |
Create an automatic ingestion monitor |
Automatic ingestion monitor—hand gesture sensor used on the dominant hand, piezoelectric strain sensor, and a data collection module |
Neural network classifier. |
Johnson et al. [43] |
2019 |
25 min of data divided into 30 s segments, recorded while eating, shaving, and brushing teeth
Development of a wearable sensor system for detection of food consumption |
Two wireless battery-powered sensor assemblies, each with sensors on the wrist and upper arm. Each unit has a 9-axis inertial measurement unit with accelerometer, magnetometer, and gyroscope
Machine learning to reduce false-positive eating detections after a Kalman filter is used to estimate the position of the hand relative to the mouth.
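To make the Kalman-filtering step concrete, here is a generic one-dimensional constant-velocity Kalman filter in NumPy; it is a textbook sketch, not the authors' tracker, and the noise parameters and sample rate are arbitrary.

```python
import numpy as np

# Constant-velocity Kalman filter for one coordinate of hand position (illustrative values).
dt = 0.02                                   # assumed 50 Hz sample period
F = np.array([[1, dt], [0, 1]])             # state transition (position, velocity)
H = np.array([[1, 0]])                      # only position is measured
Q = 1e-3 * np.eye(2)                        # process noise
R = np.array([[0.05]])                      # measurement noise

x = np.zeros((2, 1))                        # initial state
P = np.eye(2)                               # initial covariance

rng = np.random.default_rng(3)
true_pos = np.cumsum(np.full(100, 0.01))    # hand slowly moving toward the mouth
measurements = true_pos + rng.normal(0, 0.05, size=100)

for z in measurements:
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = np.array([[z]]) - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P

print("filtered position estimate:", float(x[0, 0]))
```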
Konstantinidis et al. [44] |
2019 |
85 videos with people eating from a side view |
Detect food bite instances accurately, robustly, and automatically |
Cameras to capture body and face motion videos |
Deep network to extract human motion features from video sequences. A two-stream deep network is proposed to process body and face motion, together with the data from the first deep network, to take advantage of both types of features simultaneously.
Kumari et al. [45] |
2019 |
30 diabetic persons to confirm glucose levels with a glucometer |
Regulate glycemic index through calculation of food size, chewing style and swallow time |
Acoustic sensor placed over the trachea, using MEMS technology
Deep belief network combining a Belief Net and a Restricted Boltzmann Machine.
Park et al. [46] |
2019 |
4000 food images obtained by taking pictures of dishes in restaurants and through Internet searches
Develop Korean food image detection and recognition model for use in mobile devices for accurate estimation of dietary intake |
Camera |
Training with the TensorFlow machine learning framework and a batch size of 64. The authors present a deep convolutional neural network, K-foodNet.
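A minimal Keras sketch showing how a small image CNN would be trained in TensorFlow with a batch size of 64; the layer configuration and synthetic data are placeholders, not K-foodNet itself.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for labelled food images: 640 RGB images, 10 hypothetical classes.
x_train = np.random.rand(640, 64, 64, 3).astype("float32")
y_train = np.random.randint(0, 10, size=640)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Batch size of 64, as reported for the reviewed model's training setup.
model.fit(x_train, y_train, batch_size=64, epochs=1, verbose=1)
```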
Qiu et al. [47] |
2019 |
360° videos and the COCO dataset to train Mask R-CNN
Dietary intake in shared food scenarios: detection of the subject's face, hands, and food
Video camera (Samsung Gear 360)
Mask R-CNN to detect the food class, a bounding box indicating the location, and a segmentation mask for each food item. Predicted food masks could presumably be used to calculate food volume.
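A short torchvision sketch of running an off-the-shelf, COCO-pretrained Mask R-CNN on an image and reading back boxes, labels, scores, and masks; this is generic inference code, not the food-specific model trained in the study.

```python
import torch
import torchvision

# COCO-pretrained Mask R-CNN from torchvision (not fine-tuned on food classes).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)         # placeholder for a camera frame, values in [0, 1]

with torch.no_grad():
    output = model([image])[0]          # dict with 'boxes', 'labels', 'scores', 'masks'

keep = output["scores"] > 0.5
print("detections above threshold:", int(keep.sum()))
print("mask tensor shape:", tuple(output["masks"].shape))  # (N, 1, H, W) soft masks
# A binarized mask's pixel count could serve as a crude proxy when estimating food area or volume.
```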
Raju et al. [48] |
2019 |
Two datasets (food and no food) with 1600 images each |
Minimization of the number of images that need to be processed, either by a human or by a computer vision algorithm, for food image analysis
Automatic Ingestion Monitor 2.0 with camera mounted on glasses frame |
Image processing techniques: lens barrel distortion correction, image sharpness analysis, and face detection and blurring.
Turan et al. [49] |
2018 |
8 participants, 4 male and 4 female, 22–29 years old
Detection of ingestion sounds, namely swallowing and chewing |
Throat microphone with IC recorder |
Captured sounds are transformed into spectrograms using short-time Fourier transforms, and a convolutional neural network is used for the food intake classification problem.
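A brief sketch of the spectrogram step with SciPy, plus the shape a 2-D CNN would consume; the sampling rate and window settings are placeholders rather than the study's parameters.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 16000                                   # assumed sampling rate of the throat microphone
rng = np.random.default_rng(4)
audio = rng.normal(size=fs * 2)              # 2 s of stand-in audio

# Short-time Fourier transform magnitude -> spectrogram (frequency bins x time frames).
f, t, Sxx = spectrogram(audio, fs=fs, nperseg=512, noverlap=256)
log_spec = np.log(Sxx + 1e-10)

print("spectrogram shape (freq, time):", log_spec.shape)
# A CNN for intake-sound classification would take this as a 1-channel image:
cnn_input = log_spec[np.newaxis, np.newaxis, :, :]   # (batch, channel, freq, time)
print("CNN input shape:", cnn_input.shape)
```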
Wan et al. [50] |
2018 |
300 types of Chinese food and 101 kinds of Western food from Food-101
Identify the ingredients of the food to determine if diet is healthy |
Digital camera |
p-Faster R-CNN based on Faster R-CNN with the Zeiler and Fergus model and the Caffe framework.
Lee [51] |
2017 |
10 participants with 6 types of food |
Food intake monitoring, estimating the processes of chewing and swallowing |
Acoustic Doppler sonar |
Analysis of the jaw and its vibration pattern depending on the type of food, followed by feature extraction and classification with an artificial neural network.
Nguyen et al. [52] |
2017 |
10 participants in a lab environment |
Count the number of swallows during food intake to estimate caloric values
Wearable necklace with piezoelectric sensors, accelerometer, gyroscope and magnetometer |
A recurrent neural network framework, named SwallowNet, detects swallows in a continuous data stream after being trained on raw data using automated feature learning methods.
Papapanagiotou et al. [53] |
2017 |
60 h semi-free living dataset |
Design a convolutional neural network for chewing detection |
In-ear microphone |
1-dimensional convolutional neural network. The authors also present leave-one-subject-out results with fusion of acoustic and inertial sensors.
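A compact PyTorch sketch of a 1-D convolutional network applied directly to raw audio frames for binary chewing detection; the filter counts and frame length are assumptions, not the published network.

```python
import torch
import torch.nn as nn

# 1-D CNN operating directly on raw in-ear audio frames (assumed 0.2 s at 2 kHz = 400 samples).
chewing_cnn = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=9, stride=2), nn.ReLU(),
    nn.Conv1d(8, 16, kernel_size=9, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),        # global pooling over time
    nn.Flatten(),
    nn.Linear(16, 2),               # chewing vs. non-chewing logits
)

frames = torch.randn(32, 1, 400)    # batch of 32 single-channel audio frames
print(chewing_cnn(frames).shape)    # torch.Size([32, 2])
```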
Farooq et al. [54] |
2016 |
120 meals from 4 visits by 30 participants, of which 104 meals were analyzed
Automatic measurement of chewing count and chewing rate |
Piezoelectric sensor to capture lower jaw motion |
ANN machine learning to classify epochs as chewing or not chewing. Epochs were derived from sensor data processing. |
Farooq et al. [55] |
2014 |
30 subjects (5 were left out) in a 4-visit experiment |
Automatic detection of food intake |
Electroglottograph, PS3 Eye camera, and miniature throat microphone
Three-layer feed-forward neural network trained with the backpropagation algorithm, using the MATLAB neural network toolbox.
Dong et al. [56] |
2013 |
3 subjects, one female and two males |
Development of a wireless, wearable diet monitoring system to detect solid and liquid swallow events based on breathing cycles
Piezoelectric respiratory belt |
Machine learning for feature extraction and selection. |
Pouladzadeh et al. [57] |
2013 |
Over 200 images of food, 100 for the training set and another 100 for the testing set
Measurement and recording of food calorie intake
Built-in camera of mobile device |
Image processing using color segmentation, k-means clustering, and texture segmentation to separate food items. Food portion identification through SVM, and the caloric value of food obtained from a nutritional table.
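To illustrate the color-clustering step only, here is a small scikit-learn sketch that clusters the pixels of a synthetic image by color with k-means; the cluster count and image are invented for the example, and the texture, SVM, and calorie-table stages are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)

# Synthetic 60x60 RGB "plate" image: two colored regions plus noise.
image = np.zeros((60, 60, 3))
image[:, :30] = [0.9, 0.6, 0.2]     # e.g. a rice-like region
image[:, 30:] = [0.2, 0.7, 0.3]     # e.g. a vegetable-like region
image += rng.normal(0, 0.05, image.shape)

# k-means on pixel colors separates the image into candidate food regions.
pixels = image.reshape(-1, 3)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)
segment_map = kmeans.labels_.reshape(60, 60)

print("pixels per segment:", np.bincount(segment_map.ravel()))
# Each segment could then be passed to texture analysis and an SVM to identify the food portion.
```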