Abstract
Wearing face masks appears to be a solution for limiting the spread of COVID-19. In this context, efficient recognition systems are expected for checking that people's faces are masked in regulated areas. Hence, a large dataset of masked faces is necessary for training deep learning models to detect people wearing masks and those not wearing masks. Currently, there is no large dataset of masked face images available that makes it possible to check whether faces are correctly masked or not. Indeed, many people do not wear their masks correctly because of bad practices, bad behaviors or the vulnerability of individuals (e.g., children, old people). For these reasons, several mask-wearing campaigns aim to raise awareness of this problem and of good practices. In this sense, this work proposes an image editing approach and three types of masked face detection dataset, namely the Correctly Masked Face Dataset (CMFD), the Incorrectly Masked Face Dataset (IMFD) and their combination for global masked face detection (MaskedFace-Net). Realistic masked face datasets are proposed with a twofold objective: i) detecting whether people's faces are masked or not, and ii) detecting whether masks are worn correctly or incorrectly (e.g., at airport portals or in crowds). To the best of our knowledge, no large dataset of masked faces provides such a granularity of classification for mask-wearing analysis. Moreover, this work outlines the applied mask-to-face deformable model so that other masked face images can be generated, notably with specific masks. Our datasets of masked faces (137,016 images) are available at https://github.com/cabani/MaskedFace-Net. The Flickr-Faces-HQ (FFHQ) dataset of face images, made publicly available online by NVIDIA Corporation, has been used for generating MaskedFace-Net.
Keywords: Public health; Health education; Virus protection; COVID-19; Image editing; Realistic image synthesis; Feature matching; Masked face dataset; Smart health
1. Introduction and motivation
Wearing face masks appears to be a solution for limiting the spread of COVID-19. In this context, efficient recognition systems are expected for checking that people's faces are masked in regulated areas. To perform this task, a large dataset of masked faces is necessary for training deep learning models to detect people wearing masks and those not wearing masks. In this sense, some large datasets of face images with virus-related protection masks are available in the literature, e.g. the MAsked FAces dataset (MAFA) (Ge et al., 2017), the Real-World Masked Face Dataset (RMFD) and a masked face recognition dataset (Wang et al., 2020) composed of the Masked Face Detection Dataset (MFDD), the Real-world Masked Face Recognition Dataset (RMFRD) and the Simulated Masked Face Recognition Dataset (SMFRD).
Besides, many people do not wear their masks correctly because of bad practices, bad behaviors or the vulnerability of individuals (e.g., children, old people). In this sense, several mask-wearing campaigns aim to raise awareness of this problem and of good practices (Africa Centres for Disease Control and Prevention - Africa CDC, African Union, 2020; Bouteiller, 2020; Action Santé-Social Côte d’Ivoire, 2020; Colart, 2020). In Hammoudi et al. (2020), a mobile application, “CheckYourMask”, was designed to let people check whether their mask is correctly worn by taking a selfie, and the creation of a dataset with correctly/incorrectly worn mask classes was suggested. In makeml-mask (2020), a dataset of 853 images containing individual or multiple masked faces was proposed for creating detection models that take improperly masked faces into account. In this paper, we propose a relatively large dataset of 137,016 masked face images that is divided into two masked face categories: correctly worn and incorrectly worn (see samples in Fig. 1a and Fig. 1b, respectively).
Specifically, this work proposes three types of masked face detection dataset, namely the Correctly Masked Face Dataset (CMFD), the Incorrectly Masked Face Dataset (IMFD) and their combination (MaskedFace-Net) for masked face detection (see dataset structure in Fig. 3a). Realistic masked face datasets are proposed with a twofold objective: i) to detect whether people's faces are masked or not, and ii) to detect whether masks are worn correctly or incorrectly (e.g., at airport portals or in crowds). To the best of our knowledge, no large dataset of masked faces provides such a granularity of classification for mask-wearing analysis. Moreover, this work outlines the applied mask-to-face deformable model so that other masked face images can be generated, notably with specific masks.
2. Applied mask-to-face deformable model and data outputs
The Flickr-Faces-HQ (FFHQ) dataset of face images has been selected as a base for creating an enhanced dataset, MaskedFace-Net, composed of correctly and incorrectly masked face images. Indeed, FFHQ contains 70,000 high-quality images of human faces in PNG file format at 1024 × 1024 resolution and is publicly available. The FFHQ dataset offers considerable variety in terms of age, ethnicity, viewpoint, lighting, and image background. It was originally created as a benchmark for generative adversarial networks (GAN) (Karras et al., 2018).
The global data-flow diagram in Fig. 2 shows the major stages of the image editing approach applied for generating the dataset of correctly/incorrectly masked face images, “MaskedFace-Net”. In particular, the MaskedFace-Net dataset has been created by defining a mask-to-face deformable model. Pseudo-code of the global principle for generating MaskedFace-Net is shown in Fig. 3b with respect to the outputs depicted in Fig. 3a. For each face image of FFHQ (e.g. Fig. 4a), Haar feature-based cascade classifiers are applied for detecting a region of interest (detection of the face rectangle). Then, a specific key point detector, “shape predictor 68 face landmarks” (a model derived from Sagonas et al. (2016)), is applied to the detected region of interest and automatically detects 68 landmarks of the facial structure (see sample in the corresponding figure). Besides, an image of a widespread face protection mask (single-use blue face protection mask) has been selected as a reference image for the mapping (see sample in the corresponding figure). On this mask image, 12 key points have been manually annotated to delineate the mask area (polygonal area).
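The two detection steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the `opencv-python` and `dlib` packages, the frontal-face Haar cascade bundled with OpenCV, and a local copy of the dlib landmark model named in the footnotes. The helper `largest_face`, the choice of keeping the largest rectangle, and the `scaleFactor` and `minNeighbors` values are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), as returned by OpenCV


def largest_face(rects: Sequence[Sequence[int]]) -> Optional[Rect]:
    """Return the detection with the largest area, or None if there is none."""
    best: Optional[Rect] = None
    for x, y, w, h in rects:
        if best is None or w * h > best[2] * best[3]:
            best = (int(x), int(y), int(w), int(h))
    return best


def detect_landmarks(image_path: str, predictor_path: str) -> Optional[List[Tuple[int, int]]]:
    """Detect one face rectangle, then its landmarks; None if no face is found."""
    # Imported here so that largest_face stays usable without OpenCV/dlib installed.
    import cv2
    import dlib

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Haar feature-based cascade shipped with the opencv-python package.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    # Illustrative parameter values, not the settings used for MaskedFace-Net.
    rects = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    face = largest_face(rects)
    if face is None:
        return None  # e.g. an occluded face for which no rectangle is detected

    x, y, w, h = face
    predictor = dlib.shape_predictor(predictor_path)
    shape = predictor(gray, dlib.rectangle(x, y, x + w, y + h))
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
```

With the 68-point model, `detect_landmarks` would return 68 `(x, y)` pairs for a detected face, and `None` for images in which the cascade finds no face rectangle.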
At this stage, four types of mask-to-face mapping have been defined with respect to the targeted cases (see Fig. 3a): mask covering the nose, mouth and chin (i.e. mask correctly worn); mask covering only the nose and mouth; mask covering only the mouth and chin; and mask under the mouth (i.e. three cases of incorrectly worn mask). For each type of mask-to-face mapping (CMFD, IMFD1, IMFD2 or IMFD3), a subset of 12 facial key points is retained from the 68 automatically detected landmarks and then matched to the 12 mask key points. In this way, the mask can fit specific areas of the face for each targeted case. Hence, a mask-to-face deformable model has been created to generate MaskedFace-Net. Moreover, in each targeted case, up to 2 of the 12 mask key points can have their locations randomly displaced within a limited perimeter. In particular, this tolerance makes it possible to vary the height of the mask on the face and thus brings more realism to the generated dataset. Therefore, MaskedFace-Net also contains a wide variety of mask positions.
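The random displacement of key points can be illustrated with a short sketch. The paper does not specify how the displacement is sampled, so the function name `jitter_key_points`, the square neighborhood, and the default `radius` of 5 pixels are assumptions made purely for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def jitter_key_points(
    points: Sequence[Point],
    max_moved: int = 2,
    radius: float = 5.0,
    rng: Optional[random.Random] = None,
) -> List[Point]:
    """Return a copy of `points` with at most `max_moved` points displaced.

    Each displaced point moves by at most `radius` pixels along each axis
    (an assumed interpretation of the "limited perimeter").
    """
    if rng is None:
        rng = random.Random()
    moved = list(points)
    # Choose how many points to move (0..max_moved), then which ones.
    count = rng.randint(0, min(max_moved, len(moved)))
    for index in rng.sample(range(len(moved)), count):
        x, y = moved[index]
        moved[index] = (
            x + rng.uniform(-radius, radius),
            y + rng.uniform(-radius, radius),
        )
    return moved
```

Passing a seeded `random.Random` instance makes the generated variations reproducible, which can be convenient when regenerating a dataset.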
Finally, a homography transformation, which relies on the defined point-to-point correspondence between the landmarks of the mask image and those of the face image, is applied for mapping the mask pixels onto the targeted facial areas. Instances of the produced face landmarks and of the corresponding mask-to-face mapping are displayed for each type in Fig. 5a to Fig. 5d and in Fig. 5e to Fig. 5h, respectively.
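To make the homography step concrete, the following sketch estimates a 3×3 homography from point correspondences by linear least squares, fixing the last entry to 1. It is a dependency-free illustration under stated assumptions, not the routine used to build MaskedFace-Net; the function names are hypothetical. For pixel coordinates, normalizing the points first or using a library routine such as OpenCV's `cv2.findHomography` is numerically safer than the normal equations used here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Least-squares homography mapping src points onto dst points (h33 = 1)."""
    if len(src) != len(dst) or len(src) < 4:
        raise ValueError("need at least four point correspondences")
    rows: List[List[float]] = []
    rhs: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence (x, y) -> (u, v).
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    # Normal equations (A^T A) h = A^T b for the 8 unknown entries.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
    h = solve_linear(ata, atb) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(h: List[List[float]], point: Point) -> Point:
    """Map one point through h using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )
```

In the setting described above, `src` would hold the 12 annotated mask key points and `dst` the 12 retained facial key points; each mask pixel is then carried to the face image through `apply_homography`.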
For information, Fig. 3b illustrates a nominal scenario of face-related detection. The performance of the applied face-related detection is shown in Table 1. In particular, some faces of FFHQ (177 images) have not been processed because face occlusions (e.g., arms, hands) made the face detection fail (i.e. no face rectangle was detected). After the face detection, the MaskedFace-Net dataset contained 139,646 images. Moreover, manual filtering has been carried out to delete detected face images whose mask was incorrectly mapped because of failed landmark detection. Indeed, erroneous landmark detection occurs when the visibility of the facial contours is limited (e.g. for profile views of detected faces). Nevertheless, the face-related image detection and editing applied to the FFHQ dataset have been highly effective, since more than 95% of FFHQ images were used for generating the classes of masked faces. Hence, the resulting MaskedFace-Net dataset contains 137,016 masked face images. The proposed MaskedFace-Net dataset is composed of 49% correctly masked faces (67,193 images) and 51% incorrectly masked faces (69,823 images). Of the latter set, approximately 80% are faces with only the mouth and chin masked, 10% with only the nose and mouth masked and 10% with only the chin masked.
Table 1. Results of applied detection techniques.
Considered features | Face | Facial landmarks | Facial landmarks |
---|---|---|---|
Targeted mask-to-face mapping | – | Correct | Incorrect |
Detection rate (over the FFHQ dataset) | 99.75% | 95.99% | – |
Retained images | 69,823 | 67,193 | 69,823 |
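The figures reported in the text and in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the paper. Reading the 139,646 intermediate images as one correct and one incorrect candidate per detected face is an interpretation that happens to match the reported totals, not a statement made by the authors.

```python
FFHQ_IMAGES = 70_000
UNDETECTED_FACES = 177   # FFHQ images with no detected face rectangle
CORRECT = 67_193         # retained correctly masked faces (CMFD)
INCORRECT = 69_823       # retained incorrectly masked faces (IMFD)

detected = FFHQ_IMAGES - UNDETECTED_FACES
# Assumed reading: one correct and one incorrect candidate per detected face.
before_filtering = 2 * detected
total = CORRECT + INCORRECT
removed_by_filtering = before_filtering - total

face_rate = round(100 * detected / FFHQ_IMAGES, 2)
landmark_rate = round(100 * CORRECT / FFHQ_IMAGES, 2)
correct_share = round(100 * CORRECT / total)
incorrect_share = round(100 * INCORRECT / total)

print(detected, before_filtering, total, removed_by_filtering)
# 69823 139646 137016 2630
print(face_rate, landmark_rate, correct_share, incorrect_share)
# 99.75 95.99 49 51
```

Under this reading, the manual filtering removed 2,630 images, and the 95.99% landmark detection rate corresponds to the 67,193 retained correctly masked faces out of 70,000 FFHQ images.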
We emphasize that a raw mask-to-face mapping has been applied to the FFHQ dataset. In particular, no images have been filtered according to specific parameters (e.g. age). However, the file names of MaskedFace-Net include those given by the FFHQ dataset. Hence, the correspondence between FFHQ and MaskedFace-Net can be established to support such filtering.
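As an illustration of such filtering, the sketch below recovers the FFHQ identifier from a MaskedFace-Net file name and keeps only the images derived from a chosen subset of FFHQ. It assumes that each file name begins with the numeric FFHQ identifier; the example names (such as `00042_Mask.jpg`) and both function names are hypothetical and should be checked against the actual dataset.

```python
import re
from pathlib import PurePath
from typing import Iterable, List, Optional, Set

# Assumption: the file name starts with the numeric FFHQ identifier.
_LEADING_ID = re.compile(r"^(\d+)")


def ffhq_id(masked_file: str) -> Optional[str]:
    """Extract the leading numeric FFHQ identifier from a file name, if any."""
    match = _LEADING_ID.match(PurePath(masked_file).name)
    return match.group(1) if match else None


def keep_selected(masked_files: Iterable[str], selected_ids: Set[str]) -> List[str]:
    """Keep only the masked images whose source FFHQ image is in `selected_ids`."""
    return [f for f in masked_files if ffhq_id(f) in selected_ids]
```

For example, given a set of FFHQ identifiers selected by some external attribute (e.g. an age annotation obtained elsewhere), `keep_selected` returns the corresponding masked images.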
It is worth mentioning that the minimum age for mask wearing depends on the applicable laws in the countries concerned. For instance, mask wearing is compulsory from 6 years old in Spain, 11 years old in France and 12 years old in Belgium, under certain conditions (RTBF, 2020). Between 2 and 11 years old, opinions differ (Daclin, 2020). Since FFHQ contains face images of all ages, so does MaskedFace-Net. Such datasets could then be used to detect children in crowds who wear a mask while under the recommended age limit.
Recently, our MaskedFace-Net dataset has been featured online in the COVID-19 section of a major source of computer vision datasets, “VisualData.io”.
3. Conclusion
An image editing approach has been presented for generating masked face images with realistic image synthesis. A large dataset of 137,016 quality masked face images has been produced and made available online. MaskedFace-Net can be seen as a benchmark dataset for creating machine learning models related to mask-wearing analysis, notably for detecting the presence or absence of a mask on detected face images and the correct or incorrect wearing of the mask on detected masked faces. MaskedFace-Net can then be used for enhancing vision-based monitoring systems in several applications, such as checking compliance with laws related to mask wearing or generating crowd statistics. Moreover, the method used for generating MaskedFace-Net has been described so that masked face images can be generated with other types of mask. MaskedFace-Net has been generated for studying behaviors and contamination processes related to COVID-19. In particular, MaskedFace-Net has been generated to help limit the spread of COVID-19 by supporting health education. MaskedFace-Net may also be a base for studying behaviors and contamination phenomena if a new virus with a similar transmission type appears.
Disclaimer
In no case can the contributors of this work be held responsible for any incident arising from the use of the MaskedFace-Net dataset or masks.
Funding statement
The authors received no specific funding for this study.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Footnotes
see “Real-World Masked Face Dataset” https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset.
see “dataset of face images Flickr-Faces-HQ (FFHQ)” https://github.com/NVlabs/ffhq-dataset.
see “Facial point annotations” https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/.
see “shape predictor 68 face landmarks.dat.bz2” https://github.com/davisking/dlib-models/#shape_predictor_68_face_landmarksdatbz2.
References
- Action Santé-Social Côte d'Ivoire. Comment bien mettre son masque. 2020. https://www.facebook.com/110412877115436/photos/comment-bien-mettre-son-masque/154573562699367/
- Africa Centres for Disease Control and Prevention - Africa CDC, African Union. How to wear a face mask correctly. 2020. https://africacdc.org/download/how-to-wear-a-face-mask-correctly/
- Bouteiller J. Coronavirus. Comment bien porter son masque ? Les conseils d’une infirmière de la métropole de Lille. 2020. https://actu.fr/hauts-de-france/lille_59350/coronavirus-comment-bien-porter-masque-conseils-dune-infirmiere_32651335.html
- Colart L. Le port du masque: Les gestes à faire et ne pas faire. 2020. https://plus.lesoir.be/296003/article/2020-04-21/le-port-du-masque-les-gestes-faire-et-ne-pas-faire https://www.lesoir.be/sites/default/files/dpistyles_v2/ena_16_9_in_line/2020/04/21/node_296003/27512244/public/2020/04/21/B9723268640Z.1_20200421182927_000GBLFTKA92.1-0.jpg?itokvge-65yl1587734455
- Daclin C. Coronavirus : Où et à quel âge les enfants doivent-ils porter le masque ? 2020. https://www.rtl.fr/actu/bien-etre/coronavirus-ou-et-a-quel-age-les-enfants-doivent-ils-porter-le-masque-7800688133
- Ge S., Li J., Ye Q., Luo Z. Detecting masked faces in the wild with LLE-CNNs. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2017. pp. 426–434.
- Hammoudi K., Cabani A., Benhabiles H., Melkemi M. Validating the correct wearing of protection mask by taking a selfie: Design of a mobile application “CheckYourMask” to limit the spread of COVID-19. Computer Modeling in Engineering & Sciences; 2020.
- Karras T., Laine S., Aila T. A style-based generator architecture for generative adversarial networks. 2018. ArXiv:1812.04948.
- Mask dataset. 2020. https://makeml.app/datasets/mask
- RTBF. Le masque obligatoire dès 6 ans en Espagne, 11 ans en France, 12 ans en Belgique : Pourquoi tant de différences ? 2020. https://www.rtbf.be/info/societe/detail_le-masque-obligatoire-des-6-ans-en-espagne-11-ans-en-france-12-ans-en-belgique-pourquoi-tant-de-differences?id=10549655
- Sagonas C., Antonakos E., Tzimiropoulos G., Zafeiriou S., Pantic M. 300 faces in-the-wild challenge: Database and results. Image and Vision Computing. 2016;47:3–18. doi: 10.1016/j.imavis.2016.01.002.
- Wang Z., Wang G., Huang B., Xiong Z., Hong Q., Wu H., Yi P., Jiang K., Wang N., Pei Y., Chen H., Miao Y., Huang Z., Liang J. Masked face recognition dataset and application. 2020. ArXiv:2003.09093.