Table 1.

A brief summary of the existing HR estimation networks based on spatiotemporal maps.

Method	Input signal	Feature extraction	Backbone network	Outcomes
Hsu et al. [23]	Green channel signal	Time-frequency representation of an image	VGG15	Pioneering work on real-time rPPG measurement using a deep learning framework

Qiu et al. [9]	RGB	EVM	Regression CNN	Real-time HR estimation with very low processing time

Niu et al. [12]	CHROM	n temporal signals are concatenated row-wise to form a spatiotemporal map	Deep regression model with a Gated Recurrent Unit (GRU)	HR estimation in general conditions such as head movement and poor illumination

PulseGAN [22]	CHROM	Noisy rPPG signal	Conditional GAN	A noise-free, realistic rPPG signal is generated

Song et al. [24]	CHROM	Feature map is constructed by arranging the peaks of the signal in a time-delayed manner	ResNet18	Noise-free feature images are produced, which improves HR prediction accuracy

Wu et al. [25]	RGB	Spatiotemporal feature map generation similar to Song et al., with equivalent padding	ResNet18	Generates spatiotemporal maps that compensate for missing frames in unstable situations

Proposed	RGB	Spatiotemporal feature map generation using wavelets, for better motion estimation	ResNet18	Motion-robust HR estimation under realistic conditions
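
Several rows above build a spatiotemporal map by stacking temporal signals row-wise. The sketch below illustrates only that general idea and does not reproduce the pipeline of any cited method or of the proposed one. The function name `build_spatiotemporal_map`, the toy input values, and the per-row min-max scaling to the 0..255 range are assumptions made for illustration.

```python
from typing import List, Sequence


def build_spatiotemporal_map(signals: Sequence[Sequence[float]]) -> List[List[int]]:
    """Stack n equal-length temporal signals row-wise into an n x T map.

    Each row is min-max scaled to 0..255 so the map can be viewed as an
    image. This scaling is an assumed normalization, not taken from the table.
    """
    if not signals or len(signals[0]) == 0:
        raise ValueError("at least one non-empty signal is required")
    length = len(signals[0])
    if any(len(s) != length for s in signals):
        raise ValueError("all signals must have the same length")

    st_map: List[List[int]] = []
    for signal in signals:
        lo, hi = min(signal), max(signal)
        span = hi - lo
        if span == 0:
            # A constant signal carries no pulse information; map it to zeros.
            row = [0] * length
        else:
            row = [round(255 * (v - lo) / span) for v in signal]
        st_map.append(row)  # one temporal signal becomes one row of the map
    return st_map


# Hypothetical toy input: 3 regions of interest, 4 frames each.
roi_signals = [
    [0.0, 1.0, 3.0, 5.0],
    [5.0, 5.0, 5.0, 5.0],
    [10.0, 0.0, 10.0, 0.0],
]
st_map = build_spatiotemporal_map(roi_signals)
```

In this layout each row corresponds to one region (or channel) and each column to one frame, so a 2D backbone such as ResNet18 can take the map as an image-like input.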