Abstract
Today, we are facing the COVID-19 pandemic. Accordingly, properly wearing face masks has become vital as an effective way to prevent the rapid spread of COVID-19. This research develops an EfficientMask-Net method for low-power devices with low-memory requirements, such as mobile and embedded devices. The method identifies face mask-wearing conditions in two different schemes: I. Correctly Face Mask (CFM), Incorrectly Face Mask (IFM), and Not Face Mask (NFM) wearing; II. Uncovered Chin IFM, Uncovered Nose IFM, and Uncovered Nose and Mouth IFM. The proposed method can also help unmask the face for face authentication based on unconstrained 2D facial images in the wild. In this study, deep convolutional neural networks (CNNs) were employed as feature extractors. Then, the deep features were fed to a recently proposed large margin piecewise linear (LMPL) classifier. In the experimental study, lightweight and powerful mobile implementations of CNN models were evaluated, where the "EfficientNetB0" deep feature extractor with the LMPL classifier outperformed well-known end-to-end CNN models, as well as conventional image classification methods. It achieved high accuracies of 99.53% and 99.64% in the two tasks, respectively.
Keywords: COVID-19, EfficientNet, Face mask-wearing, Face authentication, Large margin classifier, Deep feature extraction
Introduction and motivation
It is necessary to design a model for automatic identification of face mask-wearing conditions and use it as a first step to unmask the faces for face authentication in mobile devices and security systems, such as ATMs, banks, airport security checkpoints, and facial-biometric attendance systems.
Face mask condition identification is a very challenging task because samples from different classes are highly similar, while samples from the same class may differ considerably. In other words, there is a large intra-class variation and a small inter-class variation, which makes it difficult to learn discriminant features. Figure 1 depicts some samples from the three classes.
Fig. 1.

Sample images of the MaskedFace-Net dataset. a Correctly Face Mask (CFM), b Incorrectly Face Mask (IFM), and c Not Face Mask (NFM) wearing. The samples show the challenge of this task: samples of different classes are highly similar, while those of the same class can be very different
In this paper, a new method has been developed for face mask-wearing identification using well-known deep convolutional neural networks (CNNs) as feature extractors and a novel large margin piecewise linear (LMPL) classifier [1].
The proposed method contains four main steps: image preprocessing, deep feature extraction, face mask-wearing classification, and face unmasking. The proposed method showed excellent performance in a computational resource-limited environment, with 99.53% and 99.64% accuracy in the two classification tasks, respectively. Moreover, unmasking the masked faces showed a promising result. It can be concluded that the proposed EfficientMask-Net method is effective in face mask-wearing identification, as well as face unmasking. Therefore, it can be used in many security systems for epidemic prevention and face authentication.
Related work
Masked face detection
Prasad et al. [2] proposed a lightweight model called "MaskedFaceNet" for real-time mask detection using a progressive semi-supervised approach. Fasfous et al. [3] presented BinaryCoP (Binary COVID-mask Predictor) to detect correct face mask-wearing and positioning. BinaryCoP is a low-power binary neural network (BNN) classifier that performs the classification on edge devices, such as embedded FPGA accelerators. They used the MaskedFace-Net dataset with four classes, including IMFD Nose and Mouth, IMFD Nose, IMFD Chin, and CMFD, and balanced the dataset with data augmentation techniques. As a result, an accuracy of up to 98% was obtained for the wearing positioning problem.
Mobile-based face mask detection
Cabani et al. [4] introduced the MaskedFace-Net dataset with 137,016 images. This large-scale dataset includes the Correctly Masked Face Dataset (CMFD) and the Incorrectly Masked Face Dataset (IMFD), in which masked faces are created by applying a deformable model to the Flickr-Faces-HQ (FFHQ) face dataset. Qin et al. [5] proposed an image super-resolution and classification network (SRCNet), in which a super-resolution method was applied to improve the performance on low-quality images. They classified the face mask-wearing situations into three classes, including correct mask-wearing, incorrect mask-wearing, and no mask-wearing, and achieved an accuracy of 98.70%. The training and evaluation were performed on the public Medical Masks Dataset containing 3835 images.
Identification of face mask-wearing conditions
Dey et al. [6] proposed a deep learning and multi-stage face mask detection method called “Mobile-Net Mask.” They used two different datasets with 5200 images to detect Masked or NotMasked faces from still images and video streams. The Mobile-Net Mask reached an accuracy of 93%. Jiang et al. [7] presented a RetinaFaceMask detector based on the one-stage RetinaNet for high-accuracy face mask detection. The introduced model contained ResNet or MobileNet as a backbone, along with a feature pyramid network (FPN) and context attention modules. The authors achieved a 93.4% precision, which was higher than baseline results.
As presented in this section, although researchers have introduced several approaches to identify face mask-wearing conditions, face authentication lacks a unified system. In this study, we developed a unified, efficient method for face mask-wearing identification as well as for unmasking the masked faces, which can be useful in authentication systems.
Materials and methods
This section describes the overall process of the proposed EfficientMask-Net method. Figure 2 demonstrates the diagram of the proposed mask-wearing system.
Fig. 2.
A schematic of the proposed EfficientMask-Net
Image preprocessing
Image preprocessing enhances the visual appearances of images and results in higher accuracy of the detection system.
Resizing face images
The input images were resized to the input size of EfficientNet using bicubic interpolation.
Image adjustment
Real-world images have a considerable variation in contrast and exposure. The images were adjusted by mapping input intensity to the new values to saturate 1% of the pixel values in low and high intensities. Besides, the histogram of images was calculated to determine the adjustment limit automatically.
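The adjustment described above can be sketched as a percentile-based contrast stretch. This is a minimal illustration, not the authors' MATLAB code: it assumes a grayscale image given as a flat list of intensities in [0, 1], and the function names `stretch_limits` and `adjust_intensity` are hypothetical.

```python
def stretch_limits(pixels, tail=0.01):
    """Return (low, high) so that about `tail` of the pixels lie at or below
    `low` and about `tail` lie at or above `high` (taken from the sorted values)."""
    ordered = sorted(pixels)
    n = len(ordered)
    low = ordered[int(round(tail * (n - 1)))]
    high = ordered[int(round((1.0 - tail) * (n - 1)))]
    return low, high


def adjust_intensity(pixels, tail=0.01):
    """Map [low, high] linearly onto [0, 1]; values outside are saturated."""
    low, high = stretch_limits(pixels, tail)
    if high == low:  # flat image: nothing to stretch
        return list(pixels)
    scale = high - low
    return [min(1.0, max(0.0, (p - low) / scale)) for p in pixels]


if __name__ == "__main__":
    ramp = [i / 100 for i in range(101)]  # synthetic intensity ramp
    print(stretch_limits(ramp))
    print(adjust_intensity(ramp)[:3])
```

Because the limits come from the sorted intensities (equivalently, from the image histogram), no manual threshold has to be chosen per image.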
Deep feature extraction
High-level and abstract features can be extracted by deep CNNs. This study focused on a small and computationally efficient network. Transfer learning was used to prevent overfitting and obtain better generalization.
EfficientNet was introduced by Tan and Le [8] in 2019. It is one of the most efficient CNN models among well-known pre-trained networks, with a small number of FLOPs. Compared to other models achieving similar ImageNet accuracy, EfficientNet is much smaller and faster. The authors have shown that the proposed EfficientNet is five times faster for inference on mobile devices [8].
Large margin piecewise linear (LMPL) classifier
The novel large margin piecewise linear (LMPL) classifier [1] works based on a cellular structure. First of all, a grid is considered on feature space. In fact, some random hyper-planes partition feature space into subpartitions called cells. Each cell is labeled by a class label based on covered training instances. The main problem is with tuning of initial hyper-planes.
To tune a hyper-plane, the training samples are divided into three groups with respect to that hyper-plane:
- Normal: Ordinary samples, which are correctly classified on just one side of the hyper-plane. Their loss function is the hinge loss, as defined in (1), where the virtual label of a sample determines on which side of the hyper-plane the sample is correctly classified.
- Negative don't care: These samples are classified incorrectly on both sides of the hyper-plane. Their loss is defined in (2).
- Positive don't care: This group is the opposite of Negative don't care; its samples are classified correctly on both sides of the hyper-plane. Their loss function is defined in (3).
The Positive don't care samples, which are always classified correctly, are ignored in this paper. The main reason is that the loss function in (3) is not convex. Therefore, the objective function is defined as presented in (4).
The two scalar coefficients of the objective function control the balance between the structural and empirical error. In this paper, both coefficients were experimentally tested and set to 1000.
The LMPL classifier optimizes each hyper-plane based on the introduced objective function with a convex optimizer. After some iterations, the model converges to some hyper-planes in order to classify samples of different classes, and extra hyper-planes that are not useful in the classification are removed. Therefore, regarding the distribution and the complexity of the decision boundaries, the complexity of the model is tuned by removing redundant hyper-planes, and an efficient large margin approach is obtained.
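The cellular structure and the hinge loss of the Normal samples can be illustrated with a short sketch. It is not the LMPL implementation of [1]: labelling a cell by majority vote is an assumption (the text only says cells are labelled from the covered training instances), the hinge loss is written in its standard form max(0, 1 - y(w·x + b)), and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def side(w, b, x):
    """+1 or -1 depending on the side of the hyper-plane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def cell_code(planes, x):
    """A cell is identified by the tuple of sides with respect to all planes."""
    return tuple(side(w, b, x) for w, b in planes)


def label_cells(planes, samples, labels):
    """Label every occupied cell with the majority class of its samples
    (majority vote is an assumption made for this sketch)."""
    votes = defaultdict(Counter)
    for x, y in zip(samples, labels):
        votes[cell_code(planes, x)][y] += 1
    return {cell: c.most_common(1)[0][0] for cell, c in votes.items()}


def predict(planes, cell_labels, x, default=None):
    """Return the label of the cell containing x, or `default` if it is empty."""
    return cell_labels.get(cell_code(planes, x), default)


def hinge_loss(w, b, x, virtual_label):
    """Standard hinge loss; virtual_label is +1 or -1 and states on which
    side of the hyper-plane the sample is correctly classified."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - virtual_label * score)


if __name__ == "__main__":
    # Two axis-aligned hyper-planes split a 2-D feature space into four cells.
    planes = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]
    samples = [(1.0, 1.0), (2.0, 1.0), (-1.0, 1.0), (-1.0, -2.0)]
    labels = ["CFM", "CFM", "IFM", "NFM"]
    cells = label_cells(planes, samples, labels)
    print(predict(planes, cells, (3.0, 3.0)))
    print(hinge_loss((1.0, 0.0), 0.0, (0.5, 0.0), 1))
```

In the full method the hyper-planes are not fixed as in this toy example; they are optimized, and redundant ones are removed, as described above.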
Unmasking the face
Given its contactless nature, especially in the pandemic era, the face is a preferred trait in biometric recognition. However, these systems are designed for non-occluded faces [9]. Therefore, the proposed method was designed to work with existing face authentication methods and to avoid retraining them on masked face datasets. In contrast, most recent works have focused exclusively on the eye area [10] or on retraining existing methods on simulated masked faces [11].
Image segmentation
As the first step, faces were segmented into Mask and Non-Mask segments to determine missed parts of the face. Figure 3a illustrates an example of an input masked face and the resulting segmented face.
Generating synthetic faces
Fig. 3.
The steps of unmasking the faces. a A masked face and the resulting segmented face. b 25 generated images by GAN. c The selected synthetic face and extracted facial parts of the mask area. d The unmasked face
A generative adversarial network (GAN) was trained on 15,000 real-world faces without face masks. Then, 25 synthetic faces were generated by the trained GAN to complete the masked faces, as shown in Fig. 3b.
Selecting the matched generated face
The distance between a masked face and generated faces was calculated at pixel level based on the normalized root-mean-square error (NRMSE), which ranges from 0 (identical) to 1 (completely different). The synthetic face with the smallest value was selected to complete the masked face. An example is shown in Fig. 3c.
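The selection step can be sketched as follows. Several assumptions are made for illustration: images are flat lists of intensities in [0, 1], the RMSE is normalized by the intensity range (which keeps the score between 0 and 1), and only pixels outside the mask segment are compared. The names `nrmse` and `select_best` are hypothetical.

```python
import math


def nrmse(masked, candidate, visible, value_range=1.0):
    """Range-normalized RMSE over the visible (non-mask) pixels only."""
    squared = [(m - c) ** 2 for m, c, v in zip(masked, candidate, visible) if v]
    if not squared:
        raise ValueError("no visible pixels to compare")
    return math.sqrt(sum(squared) / len(squared)) / value_range


def select_best(masked, candidates, visible):
    """Return (index, score) of the candidate with the smallest NRMSE."""
    scores = [nrmse(masked, c, visible) for c in candidates]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    masked = [0.2, 0.4, 0.0, 0.0]          # last two pixels are under the mask
    visible = [True, True, False, False]
    generated = [
        [0.9, 0.9, 0.5, 0.5],
        [0.2, 0.5, 0.7, 0.1],
        [0.2, 0.4, 0.3, 0.3],
    ]
    print(select_best(masked, generated, visible))
```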
Face completion
Facial parts of the mask area were extracted from the selected synthetic face to fill missed parts of the masked face. An example of the final output of the proposed method is shown in Fig. 3d.
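The completion step amounts to compositing two images with the segmentation result. The sketch below assumes that the masked face, the selected synthetic face, and the binary mask segment are aligned 2-D lists of the same size; `complete_face` is a hypothetical name, and any blending at the mask boundary is left out.

```python
def complete_face(masked_face, synthetic_face, mask_segment):
    """Copy pixels from the synthetic face wherever mask_segment is True and
    keep the original pixels everywhere else."""
    completed = []
    for face_row, synth_row, mask_row in zip(masked_face, synthetic_face, mask_segment):
        completed.append(
            [s if is_mask else f for f, s, is_mask in zip(face_row, synth_row, mask_row)]
        )
    return completed


if __name__ == "__main__":
    face = [[0.2, 0.3], [0.0, 0.0]]           # bottom row is covered by the mask
    synthetic = [[0.9, 0.9], [0.6, 0.7]]
    segment = [[False, False], [True, True]]
    print(complete_face(face, synthetic, segment))
```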
Algorithm 1 shows the whole process of the proposed EfficientMask-Net method.
Experimental results
Experimental setup
All experiments were implemented using the deep learning and image processing toolboxes of MATLAB R2021a. A Core i7 4.00 GHz CPU with 24 GB of RAM was used to implement the EfficientMask-Net. The Adam optimizer [12] was also used. Moreover, weight decay for L2 regularization was applied to avoid overfitting.
The network was trained for five epochs with a mini-batch size of 64. The learning rate was reduced from its initial value by a drop factor every three epochs to increase the learning speed. Besides, the training dataset was shuffled every epoch.
In this study, two experiments were carried out for two different classification schemes:
- Experiment 1: Correctly Face Mask (CFM), Incorrectly Face Mask (IFM), and Not Face Mask (NFM) wearing
- Experiment 2: Uncovered Chin IFM, Uncovered Nose IFM, and Uncovered Nose and Mouth IFM
MaskedFace dataset
In this study, we combined the novel MaskedFace-Net (see footnote 1) and the well-known Flickr-Faces-HQ (FFHQ; see footnote 2) datasets. FFHQ is an open-access dataset of high-quality PNG images. The original FFHQ was used as the Not mask-wearing dataset. The details of class samples and the related experiments are listed in Table 1. Finally, 14,783 and 4992 face images were used in Experiments 1 and 2, respectively. The complete dataset for each experiment can be found in the Zenodo repository (https://zenodo.org/record/4892677).
Table 1.
Details of face image dataset
| Experiment | Types | No. of face images | Source database |
|---|---|---|---|
| Experiment 1 | Correctly Face Mask (CFM)-wearing | 4792 | MaskedFace-Net, Correctly Masked Face Dataset (CMFD) |
| | Incorrectly Face Mask (IFM)-wearing | 4991 | MaskedFace-Net, Incorrectly Masked Face Dataset (IMFD) |
| | Not Face Mask (NFM)-wearing | 5000 | Flickr-Faces-HQ (FFHQ) |
| Experiment 2 | Uncovered Chin, IFM | 1815 | MaskedFace-Net, Incorrectly Masked Face Dataset (IMFD) |
| | Uncovered Nose, IFM | 1608 | MaskedFace-Net, Incorrectly Masked Face Dataset (IMFD) |
| | Uncovered Nose and Mouth, IFM | 1569 | MaskedFace-Net, Incorrectly Masked Face Dataset (IMFD) |
| Unmasking face | Not Face Mask (NFM)-wearing | 15,000 | Flickr-Faces-HQ (FFHQ) |
Experimental results and analysis
Performance Analysis
Several lightweight deep networks were compared, both as end-to-end networks and as feature extractors with the novel LMPL classifier (denoted CNN+), in terms of different metrics, as shown in Tables 2 and 3 for the two experiments. In both experiments, EfficientNetB0 achieved the best results, both as an end-to-end network and as a feature extractor with the LMPL classifier (EfficientNetB0+).
Table 2.
Comparison of deep CNNs as an end-to-end network and as a feature extractor with the proposed LMPL classifier (CNN+) in Experiment 1: Correctly Face Mask (CFM), Incorrectly Face Mask (IFM), and Not Face Mask (NFM) wearing. Ranks are given in parentheses
| Method | Sensitivity (Recall), % | Specificity (TNR), % | Precision (PPV), % | F1-score, % | Accuracy, % | Average rank |
|---|---|---|---|---|---|---|
| | (1) | (1) | (1) | (1) | (1) | 1.0 |
| | (1) | (1) | (1) | (1) | (1) | 1.0 |
| | (2) | (2) | (2) | (2) | (2) | 2.0 |
| | (2) | (2) | (2) | (2) | (2) | 2.0 |
| | (4) | (4) | (4) | (4) | (4) | 4.0 |
| | (4) | (4) | (4) | (4) | (4) | 4.0 |
| | (3) | (3) | (3) | (3) | (3) | 3.0 |
| | (3) | (3) | (3) | (3) | (3) | 3.0 |
| | (5) | (4) | (5) | (5) | (5) | 4.8 |
| | (5) | (5) | (5) | (5) | (5) | 5.0 |
| Avg. on CNN | | | | | | |
| Avg. on CNN+ | | | | | | |
*Bold numbers indicate the best performance
Table 3.
Comparison of deep CNNs as an end-to-end network and as a feature extractor with the proposed LMPL classifier (CNN+) in Experiment 2: Uncovered Chin IFM, Uncovered Nose IFM, and Uncovered Nose and Mouth IFM. Ranks are given in parentheses
| Method | Sensitivity (Recall), % | Specificity (TNR), % | Precision (PPV), % | F1-score, % | Accuracy, % | Average rank |
|---|---|---|---|---|---|---|
| | (1) | (1) | (1) | (1) | (1) | 1.0 |
| | (1) | (1) | (1) | (1) | (1) | 1.0 |
| | (2) | (2) | (2) | (2) | (2) | 2.0 |
| | (2) | (2) | (2) | (2) | (2) | 2.0 |
| | (5) | (5) | (5) | (5) | (5) | 5.0 |
| | (5) | (5) | (5) | (5) | (5) | 5.0 |
| | (4) | (3) | (4) | (4) | (4) | 3.8 |
| | (3) | (3) | (3) | (3) | (3) | 3.0 |
| | (3) | (4) | (3) | (3) | (3) | 3.2 |
| | (4) | (4) | (4) | (4) | (4) | 4.0 |
| Avg. on CNN | | | | | | |
| Avg. on CNN+ | | | | | | |
*Bold numbers indicate the best performance
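The five metrics reported in the tables can all be derived from a confusion matrix. The sketch below is one common way to do this for a three-class problem; it assumes one-vs-rest counts per class with macro-averaging, which the text does not state explicitly, and the confusion matrix in the example is made up for illustration.

```python
def per_class_counts(cm, k):
    """One-vs-rest counts for class k; cm[i][j] = samples of true class i
    predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(cm[i][k] for i in range(n)) - tp
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def _ratio(num, den):
    return num / den if den else 0.0


def macro_metrics(cm):
    """Macro-averaged sensitivity, specificity, precision, F1, plus accuracy."""
    n = len(cm)
    sums = {"sensitivity": 0.0, "specificity": 0.0, "precision": 0.0, "f1": 0.0}
    for k in range(n):
        tp, fp, fn, tn = per_class_counts(cm, k)
        recall = _ratio(tp, tp + fn)
        precision = _ratio(tp, tp + fp)
        sums["sensitivity"] += recall
        sums["specificity"] += _ratio(tn, tn + fp)
        sums["precision"] += precision
        sums["f1"] += _ratio(2 * precision * recall, precision + recall)
    result = {name: value / n for name, value in sums.items()}
    total = sum(sum(row) for row in cm)
    result["accuracy"] = _ratio(sum(cm[k][k] for k in range(n)), total)
    return result


if __name__ == "__main__":
    # Hypothetical confusion matrix for the classes CFM, IFM, NFM.
    cm = [[8, 1, 1], [2, 7, 1], [0, 0, 10]]
    for name, value in macro_metrics(cm).items():
        print(f"{name}: {100 * value:.2f}%")
```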
The novel LMPL was also compared with well-known classifiers. According to the results, LMPL outperformed all other classifiers in terms of performance metrics. As illustrated in Tables 4 and 5, the LMPL achieved the best classification accuracy in both experiments.
Table 4.
Comparison of well-known classifiers with the proposed LMPL classifier in experiment 1: correctly face mask (CFM), incorrectly face mask (IFM), and not face mask (NFM) wearing
| Method | Sensitivity (Recall), % | Specificity (TNR), % | Precision (PPV), % | F1-score, % | Accuracy, % | Average rank |
|---|---|---|---|---|---|---|
| NaiveBayes | (13) | (13) | (13) | (13) | (13) | 13.0 |
| k-NN (k = 3) | (6) | (6) | (7) | (6) | (6) | 6.2 |
| k-NN (k = 5) | (4) | (4) | (4) | (4) | (4) | 4.0 |
| k-NN (k = 7) | (5) | (5) | (5) | (5) | (5) | 5.0 |
| OvO SVM | (7) | (7) | (6) | (7) | (7) | 6.8 |
| OvA SVM | (8) | (8) | (8) | (8) | (8) | 8.0 |
| Decision Tree | (11) | (11) | (11) | (11) | (11) | 11.0 |
| AdaBoostM2 | (9) | (9) | (9) | (9) | (9) | 9.0 |
| TotalBoost | (10) | (10) | (10) | (10) | (10) | 10.0 |
| LP Boost | (12) | (12) | (12) | (12) | (12) | 12.0 |
| Random Forest | (3) | (2) | (3) | (3) | (3) | 2.8 |
| SoftMax | (2) | (3) | (2) | (2) | (2) | 2.2 |
| Novel LMPL | (1) | (1) | (1) | (1) | (1) | 1.0 |
*Bold numbers indicate the best performance
Table 5.
Comparison of well-known classifiers with the proposed LMPL classifier in experiment 2: uncovered chin IFM, uncovered nose IFM, and uncovered nose and mouth IFM
| Method | Sensitivity (Recall), % | Specificity (TNR), % | Precision (PPV), % | F1-score, % | Accuracy, % | Average rank |
|---|---|---|---|---|---|---|
| NaiveBayes | (11) | (10) | (11) | (10) | (10) | 10.4 |
| k-NN (k = 3) | (9) | (9) | (9) | (9) | (9) | 9.0 |
| k-NN (k = 5) | (7) | (7) | (7) | (7) | (7) | 7.0 |
| k-NN (k = 7) | (6) | (6) | (6) | (6) | (6) | 6.0 |
| OvO SVM | (4) | (4) | (4) | (4) | (4) | 4.0 |
| OvA SVM | (5) | (5) | (5) | (5) | (5) | 5.0 |
| Decision Tree | (13) | (13) | (13) | (13) | (13) | 13.0 |
| AdaBoostM2 | (8) | (8) | (8) | (8) | (8) | 8.0 |
| TotalBoost | (12) | (12) | (12) | (12) | (12) | 12.0 |
| LP Boost | (10) | (11) | (10) | (11) | (11) | 10.6 |
| Random Forest | (3) | (3) | (3) | (3) | (3) | 3.0 |
| SoftMax | (2) | (2) | (2) | (2) | (2) | 2.0 |
| Novel LMPL | (1) | (1) | (1) | (1) | (1) | 1.0 |
*Bold numbers indicate the best performance
Statistical Analysis
The Friedman test is a popular, simple, nonparametric, and safe statistical test for comparing at least three related samples. It makes no assumption about the underlying data distribution. This test ranks the methods for each metric independently; the average rank of a method is the mean of its ranks over the different metrics. Note that in the case of a tie, i.e., identical performance, the same rank is assigned.
As can be seen in Tables 2, 3, 4, and 5, the novel LMPL improved the performance metrics significantly and obtained the best average ranks in all cases. These tables reveal the significant difference between the efficiency of the different methods.
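The average-rank column of the tables can be reproduced with a few lines of code. The sketch below ranks the methods per metric (rank 1 for the highest value) and averages the ranks. Tied methods receive the same rank, as stated above; which rank follows a tie is not specified in the text, so the "competition" convention used here is an assumption. The scores in the example are invented and are not the paper's results.

```python
def rank_descending(scores):
    """Rank 1 for the largest score; tied scores share the same rank."""
    return [1 + sum(1 for other in scores if other > s) for s in scores]


def average_ranks(table):
    """table maps method name -> list of metric values (same order for all).
    Returns method name -> average rank over the metrics."""
    methods = list(table)
    n_metrics = len(table[methods[0]])
    totals = {m: 0 for m in methods}
    for j in range(n_metrics):
        column = [table[m][j] for m in methods]
        for m, r in zip(methods, rank_descending(column)):
            totals[m] += r
    return {m: totals[m] / n_metrics for m in methods}


if __name__ == "__main__":
    # Hypothetical scores for three methods on three metrics.
    scores = {
        "A": [0.99, 0.98, 0.97],
        "B": [0.95, 0.98, 0.90],
        "C": [0.90, 0.91, 0.95],
    }
    print(average_ranks(scores))
```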
Visual Analysis
The gradient-weighted class activation mapping (Grad-CAM) technique [13] was used for detailed visual analysis; it visualizes the deep features extracted by the fine-tuned EfficientNetB0, as shown in Fig. 4. Grad-CAM is a technique to interpret deep CNN predictions and check whether the CNN is focusing on the right parts of the input image. Prediction regions can be investigated using heat maps, which identify the spatial parts with the greatest impact on the network score. The standard jet colormap was used, in which red and yellow indicate regions with high contribution to the right predictions and blue denotes regions with low contribution. As can be seen, the fine-tuned EfficientNetB0 identified the effective regions for the classification predictions well.
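The core of Grad-CAM is a weighted sum of feature maps: each map is weighted by the spatial average of the class-score gradient with respect to that map, and a ReLU keeps only the positive evidence. The sketch below shows this computation on small nested lists. It assumes the activations and gradients of the last convolutional layer have already been obtained from the network; the final rescaling to [0, 1] is a display convenience added here, and `grad_cam` is a hypothetical name.

```python
def grad_cam(activations, gradients):
    """activations, gradients: K feature maps, each an H x W nested list.
    Returns an H x W heat map scaled to [0, 1]."""
    height = len(activations[0])
    width = len(activations[0][0])
    heat = [[0.0] * width for _ in range(height)]
    for a_map, g_map in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        alpha = sum(g for row in g_map for g in row) / (height * width)
        for i in range(height):
            for j in range(width):
                heat[i][j] += alpha * a_map[i][j]
    # ReLU keeps only regions that increase the class score.
    heat = [[max(0.0, v) for v in row] for row in heat]
    peak = max(v for row in heat for v in row)
    if peak == 0.0:
        return heat
    return [[v / peak for v in row] for row in heat]


if __name__ == "__main__":
    acts = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
    grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
    print(grad_cam(acts, grads))
```

In practice the resulting low-resolution map is upsampled to the input size and overlaid on the image with a colormap.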
Comparison with State-of-the-Art Studies
Fig. 4.

Grad-CAM visualization results of different class face images. a Correctly Face Mask (CFM), b Incorrectly Face Mask (IFM), and c Not Face Mask (NFM) wearing. (Original images are shown in Fig. 1)
According to Table 6, the developed method showed superior performance in comparison to several recent studies. It can be concluded that the proposed EfficientMask-Net can be useful in face mask-wearing monitoring systems, especially in public places, to control the spread of coronavirus, as well as in face authentication services on lightweight devices like mobile phones.
Table 6.
Comparison of the proposed method with state-of-the-art deep models in face mask detection (CFM = correctly face mask-, IFM = incorrectly face mask-, NFM = not face mask-wearing)
| Study | No. of cases | Method | Accuracy (%) |
|---|---|---|---|
| Qin et al. [5] | 3030 CFM; 134 IFM; 671 NFM | SRCNet | 98.70 |
| Nagrath et al. [14] | 690 Masked; 686 Not Masked | MobileNetV2 | 92.64 |
| Fasfous et al. [3] | 68,229 CFM; 6,689 Chin; 1608 Nose; 52,175 Nose & Mouth | BinaryCoP | 98.10 |
| Jiang et al. [7] | 34,806 Masked; 393,703 Not Masked | RetinaFaceMask | 93.4 (Precision) |
| | 1916 Masked; 1930 Not Masked | MobileNetV2 | 96.85 |
| Inamdar et al. [21] | 10 CFM; 15 IFM; 10 NFM | Facemasknet | 98.6 |
| Loey et al. [19] | 785 Masked; 785 Not Masked | ResNet-50 | 99.49 |
| Mercaldo and Santone [15] | 2165 Masked; 1930 Not Masked | MobileNetV2 | 98.0 |
| Zhang et al. [20] | 636 CFM; 48 IFM; 3988 NFM | Context-Attention R-CNN | 84.1 (mAP) |
| Batagelj et al. [16] | 29,532 CFM; 1528 IFM; 32,012 NFM | ResNet-152 | 98.93 |
| Jiang et al. [18] | 7695 CFM; 366 IFM; 10,471 NFM | SE-YOLOv3 and DarkNet-53 | 98.6 (AP) |
| Militante and Dionisio [17] | 12,500 Masked; 12,500 Not Masked | CNN | 96.0 |
| Dey et al. [6] | 1916 Masked; 1919 Not Masked | MobileNet Mask | 93.0 |
| EfficientMask-Net (Proposed Method) | 4792 CFM; 4991 IFM; 5000 NFM | EfficientNetB0 | 99.53 |
| EfficientMask-Net (Proposed Method) | 1815 Chin; 1608 Nose; 1569 Nose & Mouth | EfficientNetB0 | 99.64 |
Conclusion and future work
The proposed EfficientMask-Net model is lightweight and requires few power resources. Hence, the method can be useful in real-time face mask-wearing systems to identify mask-wearing conditions in public places for epidemic prevention. Two experiments were conducted to evaluate the proposed method on various deep CNNs. EfficientNetB0 with the novel LMPL classifier showed the best average accuracy in both experiments, equal to 99.53% and 99.64%, respectively. Face unmasking was also performed on masked faces and showed promising results that can be useful in face authentication systems.
In the future, the proposed method can be extended to work on real-world masked face datasets. In order to improve face unmasking, the existing face completion methods under occlusion can be applied to masked faces. Besides, the impact of unmasking on present face recognition methods can be investigated.
Footnotes
1. See "MaskedFace-Net dataset" https://github.com/cabani/MaskedFace-Net
2. See "dataset of face images Flickr-Faces-HQ (FFHQ)" https://github.com/NVlabs/ffhq-dataset
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Neda Azouji, Email: azouji@shirazu.ac.ir.
Ashkan Sami, Email: sami@shirazu.ac.ir.
Mohammad Taheri, Email: motaheri@shirazu.ac.ir.
References
- 1. Azouji N, Sami A, Taheri M, Müller H. A large margin piecewise linear classifier with fusion of deep features in the diagnosis of COVID-19. Comput. Biol. Med. 2021;139:104927. doi: 10.1016/j.compbiomed.2021.104927.
- 2. Prasad, S., Li, Y., Lin, D., Sheng, D.: maskedFaceNet: a progressive semi-supervised masked face detector. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3389–3398 (2021)
- 3. Fasfous, N., Vemparala, M.-R., Frickenstein, A., Frickenstein, L., Stechele, W.: BinaryCoP: Binary neural network-based COVID-19 face-mask wear and positioning predictor on edge devices. arXiv Prepr. arXiv2102.03456 (2021)
- 4. Cabani A, Hammoudi K, Benhabiles H, Melkemi M. MaskedFace-net–a dataset of correctly/incorrectly masked face images in the context of COVID-19. Smart Heal. 2021;19:100144. doi: 10.1016/j.smhl.2020.100144.
- 5. Qin B, Li D. Identifying facemask-wearing condition using image super-resolution with classification network to prevent COVID-19. Sensors. 2020;20:5236. doi: 10.3390/s20185236.
- 6. Dey, S.K., Howlader, A., Deb, C.: MobileNet Mask: A multi-phase face mask detection model to prevent person-to-person transmission of SARS-CoV-2. In: Proceedings of International Conference on Trends in Computational and Cognitive Engineering, pp. 603–613. Springer (2021)
- 7. Jiang, M., Fan, X., Yan, H.: RetinaMask: a face mask detector. arXiv Prepr. arXiv2005.03950 (2020)
- 8. Tan, M., Le, Q.: Efficientnet: Rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)
- 9. Yu J, Hu C-H, Jing X-Y, Feng Y-J. Deep metric learning with dynamic margin hard sampling loss for face verification. Signal Image Video Process. 2020;14:791–798. doi: 10.1007/s11760-019-01612-3.
- 10. Li Y, Guo K, Lu Y, Liu L. Cropping and attention based approach for masked face recognition. Appl. Intell. 2021;51:3012–3025. doi: 10.1007/s10489-020-02100-9.
- 11. Anwar, A., Raychowdhury, A.: Masked face recognition for secure authentication. arXiv Prepr. arXiv2008.11104 (2020)
- 12. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv Prepr. arXiv1412.6980 (2014)
- 13. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
- 14. Nagrath P, Jain R, Madan A, Arora R, Kataria P, Hemanth J. SSDMNV2: a real time DNN-based face mask detection system using single shot multibox detector and MobileNetV2. Sustain. Cities Soc. 2021;66:102692. doi: 10.1016/j.scs.2020.102692.
- 15. Mercaldo, F., Santone, A.: Transfer learning for mobile real-time face mask detection and localization. J. Am. Med. Informatics Assoc. (2021)
- 16. Batagelj B, Peer P, Štruc V, Dobrišek S. How to correctly detect face-masks for COVID-19 from visual information? Appl. Sci. 2021;11:2070. doi: 10.3390/app11052070.
- 17. Militante, S. V, Dionisio, N. V: Real-time facemask recognition with alarm system using deep learning. In: 2020 11th IEEE Control and System Graduate Research Colloquium (ICSGRC), pp. 106–110. IEEE (2020)
- 18. Jiang X, Gao T, Zhu Z, Zhao Y. Real-time face mask detection method based on YOLOv3. Electronics. 2021;10:837. doi: 10.3390/electronics10070837.
- 19. Loey M, Manogaran G, Taha MHN, Khalifa NEM. A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic. Measurement. 2021;167:108288. doi: 10.1016/j.measurement.2020.108288.
- 20. Zhang J, Han F, Chun Y, Chen W. A novel detection framework about conditions of wearing face mask for helping control the spread of COVID-19. IEEE Access. 2021;9:42975–42984. doi: 10.1109/ACCESS.2021.3066538.
- 21. Inamdar, M., Mehendale, N.: Real-time face mask identification using facemasknet deep learning network. Avail. SSRN (2020)


