Abstract
The COVID-19 pandemic is the main reason people must wear face masks in public places. Traditionally, officers monitor the use of face masks in public areas manually. However, manual monitoring is challenging in crowded spots. Thus, we propose a face mask detection model based on Generative Adversarial Networks (GAN) to detect masks accurately and quickly. To construct our detection model, we collect the dataset, conduct pre-processing, and train the model by tuning multiple hyperparameters to obtain the highest accuracy and the lowest loss. The experiments yield D_Loss = 0.0032 and G_Loss = 7.3296. Therefore, the proposed model can be a promising solution for mask detection issues.
Keywords: Face mask, Detection, Deep learning, GAN
Introduction
COVID-19 is an infectious disease whose symptoms include coughing up phlegm, shortness of breath, and fever. It is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). According to the World Health Organization (WHO), the disease first appeared in Wuhan, China, and spread quickly to various countries. One of the efforts to prevent COVID-19 is to use face masks in public places. The use of face masks in public areas is considered sufficient to curb the spread of COVID-19 in any country [1–3].
Using a face mask is a common approach to preventing COVID-19 from spreading. To support this, the Government released rules and laws that require people to comply with face mask regulations. To implement the regulations, officers were placed in public places to monitor the use of face masks. The monitoring process involves detecting who is or is not wearing a mask. However, monitoring becomes more challenging as the area grows. It therefore requires an effective approach, such as a computer vision solution that monitors whether people are wearing a face mask or not [4].
Traditional face mask detection relies on visual inspection or conventional image processing methods, which are suitable for places with a small number of people. However, traditional detection becomes a tedious task in crowded places such as schools, malls, subways, and other public areas. This issue motivates researchers to provide an effective solution to optimize the detection of face masks. Automatic face mask detection has been proposed by other researchers using a transfer learning model based on MobileNet and a Global Pooling Block [5].
COVID-19 can be prevented through social distancing, sanitization, and face masks in public settings. Social distancing, sanitation, and mask wearing have become the new normal. Public service providers support the use of face masks by asking customers to wear them properly in order to use the available services. Machine learning is used to create a model that detects the use of masks. The model locates the face in a given image and then identifies whether the face is covered by a mask or not. It works as a general object detector that identifies the classes of the detected objects. The face classes are then identified categorically, and many groupings can be distinguished [6].
The detection of face masks is divided into two classes: with mask and without mask. Several papers have proposed different methods and models for this task. For example, a study suggested an image-based automatic mask detection method that used classes with and without masks. The features used are the texture of each face. The adjacent evaluation local binary patterns (AELBP) method, which extends the local binary patterns (LBP) method, extracts texture features from each image. The proposed model is tested with face images of various sizes, facial accessories, and facial expressions. Another paper captures high-quality RGB images with a camera and compares them with a dataset using LabVIEW. The model distinguishes whether people are wearing masks from images captured directly by the camera [2, 7].
Manual checking in a large area is a complex task. Thus, the research community has developed various techniques to address the problem, including learning models. Recent papers proposed CNN for detecting the use of face masks. Some research used different CNNs to extract features from facial images and then processed them further with ML classifiers such as SVM and K-NN. First, SVM trains and tests the model on a massive image dataset. Next, K-NN constructs the model with various parameters and predicts face masks. This CNN detection model can detect a face with a mask, a face without a mask, and a face masked by hand [8]. Another paper explored CNN to classify faces with or without a mask using a large, varied, and augmented dataset so that the model can identify and detect face masks in real-time videos [9].
Therefore, we propose a GAN model to detect face masks by analyzing and extracting features. This research makes the following contributions to face mask detection:
- We provide a new method for detecting face masks that uses GAN to learn the features of a face mask dataset gathered according to a benchmark dataset, which yields an effective model. We gathered the experimental dataset from various sources and utilized its features to construct our model.
- We train the detection model by tuning various hyperparameters to gain the best performance. Instead of using a conventional approach to classify faces with and without a mask, we developed a strategy using unsupervised learning to improve the detection accuracy with a large dataset.
- We test the proposed model to measure its performance in detecting face masks based on the attributes of a large dataset. We modify several parameters to obtain the best accuracy and the lowest loss in the training phase.
Organization: The paper is structured as follows. Sect. 2 reviews past findings. Sect. 3 describes the problem studied. Sect. 4 outlines the experimental design, including a feature learning algorithm, a dataset, and pre-processing, while Sect. 5 gives the study's findings and extensive analysis. Finally, Sect. 6 summarizes the research's results.
Related work
Deep learning is one of the most common approaches for image processing, natural language processing, and security protection [10–14]. Several studies have proposed face mask detection with various methods [1, 2, 5, 7, 15]. One study suggests an image-based automatic mask detection method built on classification. This strategy can be implemented in automated systems to improve public awareness of the need to wear masks to prevent the spread of the SARS-CoV-2 virus. The classes used in the classification are with a mask, without a mask, and mask worn incorrectly. The feature used is the texture of each face. The adjacent evaluation local binary patterns (AELBP) approach, which extends the local binary patterns (LBP) method, extracts texture information from each image. The method was tested on 2172 facial pictures of varying sizes, accessories, and facial expressions [2].
A paper proposed face mask detection using a real-time system with a camera and LabVIEW. The system acquires images from a camera in real time to detect whether people are wearing a face mask or not. The proposed method employs high-resolution cameras that produce RGB-format images. The mask is then separated from the rest of the image for examination. A vision assistant pattern matching algorithm compares the image to a custom-made dataset, which aids in detecting the face. The detection result can then be viewed in real time on the images of the person captured by the camera [7].
Another researcher explored a detection model using transfer learning to control COVID-19. The method extracts deep features from photos of faces using various deep CNNs. The extracted features are processed using machine learning classifiers such as SVM and K-NN. Faces are divided into two groups, with masks and without masks. Transfer learning produced significant results despite the small dataset size, with an accuracy rate of 97.11% [8].
Another study proposed a hybrid model for face mask detection that combined deep and traditional ML. The proposed model is divided into two stages. The first stage extracts features using ResNet50, a popular deep transfer learning model. The second stage recognizes face masks with traditional ML methods such as SVM, decision trees, and ensemble algorithms. The model achieved an accuracy of 99.64% on the RMFD dataset, 99.49% on the SMFD dataset, and 100% on the LFW dataset [4].
An article presented a review of face mask detection using CNN. A face mask detection dataset was analyzed using OpenCV to perform real-time face detection through the camera. Keras, Python, TensorFlow, and OpenCV were then used to build a computer vision system that detects people wearing a mask in an image. CNN significantly affects the accuracy of masked and non-masked face detection [15].
Another article proposed the MobileNet and Global Pooling Block approaches for detecting face masks. A color image is fed into the pre-trained MobileNet, which outputs a multi-dimensional feature map. The feature map is flattened into a vector of 64 features using the global pooling block. A fully connected dense layer followed by a softmax layer then performs binary classification using the 64 features. The proposed model was tested using two available datasets. The precision of the proposed model ranges from 99 to 100% [5].
To predict COVID-19 patients, a study discussed using GAN in a DL system based on chest CT scans. Accuracy is limited because DL models such as CNN require a substantial amount of training data to provide high-quality outputs. The research proposed using GAN to create synthetic chest CT scans of COVID-19 patients, both positive and negative. The resulting dataset can then be used to train a CNN-based classifier. As a result, when using GANs, about 40% of the synthetic images are correctly predicted as COVID-19 positive by baseline models [16].
To automatically screen COVID-19, a paper discussed using GAN to generate additional CT images. Using the SARS-CoV-2 CT-Scan dataset, which included COVID-19 and non-COVID-19 pictures, the suggested technique was evaluated and validated against various classification algorithms. The method had a 99.22% accuracy rate, a 97.82% positive predictive value, and a 99.77% negative predictive value [1]. To detect whether a face is using a mask or not, another article proposed a learning method that detects face masks in images. It can also detect a face and a mask in motion, so it can perform surveillance tasks. On two separate datasets, the technique achieves an accuracy of up to 95.77% and 94.58%, respectively [6].
Therefore, we propose a face mask detection model using the GAN algorithm based on a real dataset. We utilize various features of our dataset to construct an effective model for detecting masks accurately and quickly.
Background
This section provides a formal definition of the research problem and some of the concepts used in this paper.
Problem definitions
This study focuses on detecting face masks using the features of the dataset. The dataset is two-dimensional, including diameter size, weight, and average RGB values. The samples have size N × N × 3, and we refer to the space they occupy as $\mathcal{X}$, with values ranging from zero to the maximum detectable pixel intensity in each dimension. The data distribution $p_{\text{data}}(x)$, on the other hand, is supported on the manifold of real data related to a specific problem, which generally takes up only a small portion of the whole space $\mathcal{X}$. Likewise, the samples generated by the generator should only take up a small portion of $\mathcal{X}$ [17].
Proposed method
This study uses GAN to construct a face mask detection model with two networks, namely the generator and the discriminator. The generator aims to make realistic images, and the discriminator tells authentic images apart from fake ones. Both are trained simultaneously and in competition with one another [18].
GAN is categorized as unsupervised learning and builds a model from a generator and a discriminator. In generative modeling, the training examples $x$ are drawn from an unknown distribution $p_{\text{data}}(x)$. The purpose of generative modeling is to learn a $p_{\text{model}}(x)$ that approximates $p_{\text{data}}(x)$ as closely as feasible. A straightforward technique to learn an approximation of $p_{\text{data}}$ is to define an explicit function $p_{\text{model}}(x; \theta)$ with parameters $\theta$ and seek the parameter values that make $p_{\text{data}}$ and $p_{\text{model}}$ as similar as feasible. One example of maximum likelihood estimation is estimating the mean parameter of a distribution by taking the mean of a set of data [17].
GAN training entails finding the discriminator parameters that maximize its classification accuracy and the generator parameters that maximally confuse the discriminator. The cost of training is evaluated with a value function $V(D, G)$ that depends on both the generator and the discriminator. Training involves solving

$$\min_G \max_D V(D, G) \tag{1}$$

where,

$$V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{2}$$
The discriminator D loss function is the standard cross-entropy loss of a binary classifier, described below:

$$L_D = -\mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] - \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{3}$$

The value of the loss depends on the type of input sample: the first term is evaluated on real samples $x$ and the second on generated samples $G(z)$, where $D(\cdot)$ is the predicted probability that a sample is real.
The generator G loss function is intended to counter the discriminator D by generating samples from random noise vectors $z$ that D classifies as real. The loss function of G is represented as $L_G$. Generator loss is described below:

$$L_G = \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{4}$$
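As a rough illustration of how the two losses behave, the following sketch evaluates a binary cross-entropy discriminator loss and two variants of the generator loss on a handful of discriminator outputs. The function names and the sample probabilities are hypothetical, and the non-saturating generator loss is a widely used alternative that is included here as an assumption; the text does not state which variant was implemented.

```python
import math

def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def discriminator_loss(d_real, d_fake):
    """Mean BCE with target 1 on real outputs plus mean BCE with target 0 on fake outputs."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss_minimax(d_fake, eps=1e-12):
    """Minimax form: mean of log(1 - D(G(z))), which the generator minimizes."""
    return sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)

def generator_loss_non_saturating(d_fake):
    """Alternative form (an assumption here): mean BCE of fake outputs against target 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: confident on real and on generated samples.
d_real = [0.99, 0.98, 0.995]
d_fake = [0.01, 0.02, 0.005]

print("D loss:", discriminator_loss(d_real, d_fake))
print("G loss (minimax):", generator_loss_minimax(d_fake))
print("G loss (non-saturating):", generator_loss_non_saturating(d_fake))
```

With a confident discriminator, its loss is close to zero while the non-saturating generator loss is large and positive, whereas the minimax form stays at or below zero.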
During training, while one model's parameters are updated, the other model's parameters remain unchanged. The generator is optimal when the discriminator is maximally confused and cannot tell the difference between real and fake samples. Typically, the discriminator is trained until it is optimal for the current generator, and then the generator is updated again.
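The point at which the discriminator can no longer separate real from fake samples can be made precise with a short derivation. The sketch below follows the standard analysis in [17]; the symbols $p_g$ (the distribution of generated samples) and $D^*$ (the optimal discriminator for a fixed generator) are notation introduced here for illustration.

$$
\begin{aligned}
V(D, G) &= \int_x \big[ p_{\text{data}}(x) \log D(x) + p_g(x) \log(1 - D(x)) \big] \, dx \\
D^*(x) &= \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)} \\
p_g = p_{\text{data}} \;&\Rightarrow\; D^*(x) = \tfrac{1}{2}, \quad V(D^*, G) = \log\tfrac{1}{2} + \log\tfrac{1}{2} = -\log 4
\end{aligned}
$$

The second line holds because, for fixed $a, b > 0$, the expression $a \log y + b \log(1 - y)$ is maximized over $y \in (0, 1)$ at $y = a / (a + b)$.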
Experimental setup
Main idea
The main goal of this paper is to create a mask detection model based on dataset features using GAN. The dataset is two-dimensional, including diameter size, weight, and average RGB value. For this study, we collect a dataset of 2000 images of people wearing and not wearing masks. 80% of the images are used for training, while the remaining 20% are used to evaluate the model. GAN generates more samples from the dataset, and the training and testing data are then organized into the pre-existing labeled categories. In addition, we utilize the unsupervised method to produce better samples and differentiate existing models to achieve a high level of accuracy [16].
Dataset
This study focuses on detecting face masks using the features of the dataset. We construct a model with GAN using a benchmark dataset of 2000 images. The dataset has two categories: with mask and without mask. We use 80% of the dataset for training and 20% for testing. Table 2 shows the distribution of the datasets used in this study.
Table 1.
Mathematical notation of the GAN
| Notation | Description |
|---|---|
| $V(D, G)$ | Value function |
| $D$ | Discriminator |
| $G$ | Generator |
| $\mathbb{E}$ | Denotes the expectation |
| $p_{\text{data}}(x)$ | Input data |
| $p_z(z)$ | Noise variables |
| $x$ | Input vector |
| $z$ | Noise vector |
Table 2.
Details for training and testing
| Dataset | Sample |
| Data Training (80%) | 1600 |
| Data Testing (20%) | 400 |
| Total | 2000 |
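The 80/20 split in Table 2 can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the file names, the fixed seed, and the 1000/1000 class balance are assumptions, since the text only states the total of 2000 images and the two split sizes.

```python
import random

def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into training and testing lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names; the 1000/1000 class balance is an assumption.
samples = [(f"mask_{i:04d}.jpg", "mask") for i in range(1000)]
samples += [(f"unmask_{i:04d}.jpg", "unmask") for i in range(1000)]

train, test = split_dataset(samples)
print(len(train), len(test))  # 1600 400
```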
Figure 1 shows sample images from the dataset that we use in this study. The dataset contains images from two classes, namely mask and unmask.
Fig. 1.
Dataset sample that represents the two classes
Data pre-processing
In this phase, we vectorize the raw 2D image data using label encoding and feature scaling. The images in the dataset have different dimensions and brightness levels, so we scale them down and normalize the brightness with the min-max scaler method. We then resize the images, center the point, and transform the features into 32 × 32 × 3 tensors to minimize errors [5].
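The scaling and resizing steps can be sketched in plain Python as shown below. This is an illustrative sketch under stated assumptions: images are represented as nested lists of shape H × W × 3, nearest-neighbour sampling stands in for whatever resizing method was actually used, the integer label codes are hypothetical, and the centering step is not covered.

```python
def min_max_scale(image):
    """Rescale all pixel values of an H x W x 3 nested list to the range [0, 1]."""
    flat = [v for row in image for pixel in row for v in pixel]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on constant images
    return [[[(v - lo) / span for v in pixel] for pixel in row] for row in image]

def resize_nearest(image, size=32):
    """Resize an H x W x 3 nested list to size x size x 3 by nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [list(image[r * height // size][c * width // size]) for c in range(size)]
        for r in range(size)
    ]

LABELS = {"mask": 1, "unmask": 0}  # label encoding; the integer codes are an assumption

# Hypothetical 64 x 48 RGB image with a horizontal brightness gradient.
raw = [[[x * 5, x * 5, x * 5] for x in range(48)] for _ in range(64)]
tensor = resize_nearest(min_max_scale(raw), size=32)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # 32 32 3
```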
Detection method
To conduct our experiment, we gather a dataset with two labels, mask and unmask. To construct our model, this study collects the 2D dataset and chooses diameter size, weight, and average RGB value as informative features. The dataset consists of 2000 images, of which 80% are used for training and 20% for testing.
The next stage of our experiment is pre-processing, which prepares the raw data and its features for further processing. We adopt vectorization with label encoding and feature scaling to convert the raw image data of the 2D dataset to vectors. We scale down the images and normalize their brightness with the min-max scaler method. At the final pre-processing stage, we resize the images, center the point, and transform the features to 32 × 32 × 3 tensors to minimize errors.
We utilize a generator and a discriminator to construct the detection model using the GAN architecture, with BatchNorm2D as the normalization method. In the GAN concept, the generator produces fake images modeled on the dataset and delivers them to the discriminator. The discriminator then distinguishes whether the data it receives is real or fake. To gain the best result, we tune our model with various hyperparameters. Finally, the proposed approach produces an effective model for detecting face masks.
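To make the normalization step more concrete, the sketch below computes what a 2D batch normalization layer does in training mode: each channel is normalized to zero mean and unit variance over the batch and spatial dimensions. It is a plain-Python illustration under assumptions, not the layer used in the experiment: the learnable scale and shift are fixed at 1 and 0, running statistics are omitted, and the tiny input batch is hypothetical.

```python
def batch_norm_2d(batch, eps=1e-5):
    """Normalize an N x C x H x W nested list per channel to zero mean and unit variance."""
    n_channels = len(batch[0])
    out = [[[list(row) for row in channel] for channel in sample] for sample in batch]
    for c in range(n_channels):
        # Gather every value of channel c across the batch and both spatial axes.
        values = [v for sample in batch for row in sample[c] for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)  # biased variance
        scale = (var + eps) ** -0.5
        for n, sample in enumerate(batch):
            for h, row in enumerate(sample[c]):
                for w, v in enumerate(row):
                    out[n][c][h][w] = (v - mean) * scale
    return out

# Hypothetical batch: 2 samples, 2 channels, 2 x 2 pixels each.
batch = [
    [[[0.0, 1.0], [2.0, 3.0]], [[10.0, 10.0], [10.0, 10.0]]],
    [[[4.0, 5.0], [6.0, 7.0]], [[10.0, 10.0], [10.0, 10.0]]],
]
normalized = batch_norm_2d(batch)
print(normalized[0][0])
```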
Result and analysis
Detection test
This study focuses on detecting face masks using the most informative features. To perform this experiment, we tune various hyperparameters to acquire the best results and construct the model by considering several network indicators. To construct an effective learning model, we build the generator to make fake data and then utilize the discriminator function to detect and classify the actual data. Figure 2 shows a graph of the training loss of the generator and the discriminator in this study.
Fig. 2.

Discriminator and generator Loss during the experiment
In Fig. 2, the X axis is the number of training iterations performed, and the Y axis is the loss value obtained by the model. The figure displays the loss results from the training phase: the blue line indicates the loss of the discriminator, and the yellow line indicates the loss of the generator. The generator produces fake data based on the 2000-image dataset, and the discriminator differentiates the original data from the fake data.
Based on the experimental result, the discriminator obtains a lower loss than the generator. The discriminator can distinguish between real and fake data accurately. On the other hand, the generator produces a higher loss than the discriminator, which means the generator cannot deceive the discriminator. For the generator, more epochs yield a better model. To measure the model capability, we compute the generator loss: the closer it gets to the discriminator loss, the better the model.
To evaluate the accuracy, we analyze the detection outcomes by calculating the discriminator and generator loss values. The discriminator loss value indicates that the proposed model is accurate, with a low error rate. Table 3 shows the discriminator and generator loss values together with the hyperparameters we use in this experiment.
Table 3.
Results for discriminator loss (D_Loss) and generator loss (G_Loss)
| Hyperparameter | D_Loss | G_Loss |
|---|---|---|
| Epoch = 1000, Batch size = 32 | 0.0032 | 7.3296 |
To get the best model performance in the mask detection problem, we set the hyperparameters to epoch = 1000 and batch size = 32 and train with the Adam optimizer. With these settings, the proposed GAN reaches D_Loss = 0.0032 and G_Loss = 7.3296. After several process phases, the detection technique achieves better results with a tiny discriminator loss.
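A back-of-the-envelope calculation helps to read these two numbers. The sketch below is only a rough interpretation under explicit assumptions that the text does not confirm: G_Loss is taken to be a non-saturating cross-entropy loss, -log D(G(z)), D_Loss is taken to be the sum of the real and fake cross-entropy terms, and each loss is treated as a single representative value rather than a batch average.

```python
import math

G_LOSS = 7.3296  # reported generator loss
D_LOSS = 0.0032  # reported discriminator loss

# Assumption: G_LOSS = -log D(G(z)) for a representative generated sample.
implied_d_fake = math.exp(-G_LOSS)
# Contribution of generated samples to the discriminator loss under the same assumption.
fake_term = -math.log(1.0 - implied_d_fake)
# Remainder attributed to real samples, assuming D_LOSS is the sum of both terms.
real_term = D_LOSS - fake_term
implied_d_real = math.exp(-real_term)

print(f"D(G(z)) ~ {implied_d_fake:.5f}")
print(f"D(x)    ~ {implied_d_real:.5f}")
```

Under these assumptions, the discriminator assigns generated samples a probability of being real well below 0.1% and real samples a probability above 99%, which matches the observation that the generator does not deceive the discriminator.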
Based on the experimental result, our proposed model produces a promising performance in detecting face masks. GAN has shown tremendous capability and potential in the machine learning world to create realistic-looking images and videos. Beyond its generative ability, adversarial learning is a framework that, if further explored, could lead to a massive breakthrough in deep learning. However, GAN training remains prone to a common failure known as mode collapse, where the generator discovers and exploits a weakness in the discriminator. Mode collapse occurs when the generator produces similar images regardless of variation in the random input.
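One simple way to watch for mode collapse is to measure how different the generated samples are from one another. The sketch below is a hypothetical diagnostic, not part of the reported experiment: it computes the mean pairwise Euclidean distance between flattened generator outputs, and the two sample sets are made-up values chosen to contrast a diverse and a collapsed generator.

```python
import itertools
import math

def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of flattened samples."""
    pairs = list(itertools.combinations(samples, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Hypothetical flattened generator outputs for four different noise vectors.
diverse = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.5], [0.4, 0.4, 0.9], [0.9, 0.7, 0.1]]
collapsed = [[0.50, 0.50, 0.50], [0.51, 0.50, 0.49], [0.50, 0.49, 0.50], [0.49, 0.50, 0.51]]

print("diverse:", mean_pairwise_distance(diverse))
print("collapsed:", mean_pairwise_distance(collapsed))
```

A value that stays near zero while the noise input varies suggests that the generator is producing nearly identical images.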
Conclusion
Traditional face mask detection techniques rely on visual abilities or conventional image processing methods. However, detecting face masks on a broad scale is more difficult, and the conventional way becomes tedious with a large dataset, especially in crowded places. To solve this problem, we build a detection model using GAN that consists of a generator and a discriminator to identify the use of face masks efficiently. To conduct this experiment, we collect a large dataset, perform pre-processing, and build our model by tuning different hyperparameters to get the highest accuracy.
This paper constructs a novel model to detect face masks using GAN by analyzing features extracted from dataset images. Based on this study, our detection model shows a D_Loss = 0.0032 and a G_Loss = 7.3296, where D_Loss represents the discriminator's loss and G_Loss represents the generator's loss. We calculate discriminator and generator loss values for each prediction outcome to indicate more accurate findings and a lower error rate. Our model can quickly detect whether a person is wearing a mask or not, with high accuracy and a tiny loss. Based on the experimental result, the proposed model can be a promising solution to deal with face mask detection issues using a real and large dataset [19–27].
In future research, other dynamic learning algorithms such as GCN and RBM can be adopted to improve the detection result. For example, the next model can utilize the Graph Convolutional Network (GCN) architecture. GCN performs semi-supervised learning on graph-structured data, so it is expected to produce higher accuracy. Future research can apply GCN to face mask detection with additional hyperparameter settings to get better results while simultaneously reducing power consumption.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Hamzah, Email: hamzah@respati.ac.id.
I. Wayan Ordiyasa, Email: wayanordi@respati.ac.id.
Muhammad Hanif R. Najib, Email: 18220007@respati.ac.id
References
- 1. Goel, T., Murugan, R., Mirjalili, S., Chakrabartty, D. K.: Automatic screening of COVID-19 using an optimized generative adversarial network. Cogn. Comput. (2021)
- 2. Wihandika R. Face mask detection using adjacent evaluation method local binary patterns. RESTI. 2021;5(4):705–712. doi: 10.29207/resti.v5i4.3094.
- 3. Abboah-Offei M, Salifu Y, Adewale B, Bayuo J, Ofosu-Poku R, Opare-Lokko EBA. A rapid review of the use of face mask in preventing the spread of COVID-19. Int. J. Nurs. Stud. Adv. 2021;3:100013. doi: 10.1016/j.ijnsa.2020.100013.
- 4. Loey, M., Manogaran, G., Taha, M. H. N., Khalifa, N.E.M.: A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic. Measur. J. Int. Measur. Confed. 167 (2021)
- 5. Venkateswarlu, I. B., Kakarla, J., Prakash, S.: Face mask detection using MobileNet and global pooling block. In: 4th IEEE Conference on Information and Communication Technology, CICT (2020)
- 6. Das, A., Wasif Ansari, M., Basak, R.: Covid-19 face mask detection using TensorFlow, Keras and OpenCV. In: 2020 IEEE 17th India Council International Conference, INDICON (2020)
- 7. Santhosh, C., Kumar, M. R., Prasanna, J. L., Kumar, I. R., Kumar, U. V., Sri, S. N.: Face mask detection using LabView1. Int. J. Online Biomed. Eng. 17(6). (2021)
- 8. Oumina, A., el Makhfi, N., Hamdi, M.: Control the COVID-19 pandemic: face mask detection using transfer learning. In: 2020 IEEE 2nd International Conference on Electronics, Control, Optimization and Computer Science, ICECOCS (2020)
- 9. Sakshi, S., Gupta, A. K., Singh Yadav, S., Kumar, U.: Face mask detection system using CNN. In: 2021 International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE) (2021)
- 10. Wanda P, Jie HJ. DeepProfile: finding fake profile in online social network using dynamic CNN. J Inf Secur Appl. 2020;52:102465.
- 11. Wanda, P., Marselina, E.H., Jie, H.J.: DeepOSN: bringing deep learning as malicious detection scheme in online social network. IAES Int. J. Artif. Intell. (IJ-AI) 9(1):146 (2020)
- 12. Jie, H. J., Wanda, P.: RunPool: a dynamic pooling layer for convolution neural network. 13(1), 66–76 (2020)
- 13. Wanda P, Jie HJ. DeepFriend: finding abnormal nodes in online social networks using dynamic deep learning. Soc. Netw. Anal. Min. 2021;11:34. doi: 10.1007/s13278-021-00742-2.
- 14. Wanda, P., Huang J.J.: DeepSentiment: finding malicious sentiment in online social network based on dynamic deep learning. (2019)
- 15. Singh, K. R., Kamble, S. D., Kalbande, S. M., Fulzele, P.: A review on COVID-19 face mask detection using CNN. J. Pharm. Res. Int. (2021)
- 16. Mann, P., Jain, S., Mittal, S., Bhat, A.: Generation of COVID-19 chest CT scan images using generative adversarial networks (2021)
- 17. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial networks. Commun. ACM. 2020;63(11):139–144. doi: 10.1145/3422622.
- 18. Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA. Generative adversarial networks: an overview. IEEE Signal Process. Mag. 2018;35(1):53–65. doi: 10.1109/MSP.2017.2765202.
- 19. Tomás, J., Rego, A., Viciano-Tudela, S., Lloret, J.: Incorrect facemask-wearing detection using convolutional neural networks with transfer learning. Healthc. (Switzerl.) 9(8). (2021)
- 20. Kumar, A., Kalia, A., Verma, K., Sharma, A., Kaushal, M.: Scaling up face masks detection with YOLO on a novel dataset. Optik 239 (2021)
- 21. Jiang X, Gao T, Zhu Z, Zhao Y. Real-time face mask detection method based on YOLOv3. Electronics. 2021;10:837. doi: 10.3390/electronics10070837.
- 22. Hussain S, Yu Y, Ayoub M, Khan A, Rehman R, Wahid JA, Hou W. IoT and deep learning based approach for rapid screening and face mask detection for infection spread control of COVID-19. Appl. Sci. 2021;11:3495. doi: 10.3390/app11083495.
- 23. Singh, S., Ahuja, U., Kumar, M., Kumar, K., Sachdeva, M.: Face mask detection using YOLOv3 and faster R-CNN models: COVID-19 environment. Multimed. Tools Appl. 80(13) (2021)
- 24. Sanajalwe, Y., Anbar, M., Al-E’Mari, S.: Covid-19 automatic detection using deep learning. Comput. Syst. Sci. Eng. 39(1) (2021)
- 25. Said, Y.: Pynq-YOLO-Net: an embedded quantized convolutional neural network for face mask detection in COVID-19 pandemic era. Int. J. Adv. Comput. Sci. Appl. 11(9) (2020)
- 26. Zulkifley, M. A., Abdani, S. R., Zulkifley, N. H.: COVID-19 screening using a lightweight convolutional neural network with generative adversarial network data augmentation. Symmetry 12(9) (2020)
- 27. Rabbi, J., Ray, N., Schubert, M., Chowdhury, S., Chao, D.: Small-object detection in remote sensing images with end-to-end edge-enhanced GAN and object detector network. Remote Sens. 9 (2020)

