Abstract
The vulnerability of current face recognition systems to presentation attacks significantly limits their application in biometrics. Herein, we present a passive presentation attack detection method based on a complete plenoptic imaging system which can derive the complete plenoptic function of light rays using a single detector. Moreover, we constructed a multi-dimensional face database with 50 subjects and seven different types of presentation attacks. We experimentally demonstrated that our approach outperforms the state-of-the-art methods on all types of presentation attacks.
INDEX TERMS: Biometrics, face recognition, multi-spectral imaging, light-field imaging
I. INTRODUCTION
Fast and non-intrusive, face recognition has been extensively used in a broad spectrum of applications, such as border control, national ID control, and personal device access [1], [2]. However, with the globalization of information, users’ personal data, including face images and videos, can be easily accessed through the Internet. These data can be used to fabricate presentation attack instruments (PAIs) which are then employed to spoof biometric systems. Currently, presentation attacks impose significant challenges on face recognition systems operating under unsupervised conditions [3]. Therefore, presentation attack detection (PAD) has attracted growing attention in the past decade [3]-[8].
Most existing face PAD methods are based on either software or hardware [4]. The software-based PAD methods use algorithms to distinguish attacks from real face presentations (hereinafter referred to as “bona fide presentations”) based on the texture [9]-[11], frequency [12], [13], and/or motion patterns [14]-[17] in the sample images or video replays. Maatta et al. [9] exploited multi-scale Local Binary Patterns (LBPs) to detect presentation attacks. Hadid et al. [10] further developed the LBP-based PAD method by introducing an extra descriptor computed from gray-level co-occurrence matrices (GLCMs) and combining the classification results with a score-level fusion algorithm. Boulkenafet et al. [11] extended the analysis from the grayscale space to the chromatic components and concatenated five texture descriptors to improve the PAD performance. Li et al. [12] proposed a PAD method which extracts the structure and movement information of real faces through the two-dimensional (2D) Fourier spectra of the sample images. Zhang et al. [13] presented a frequency-based PAD method which extracts the high-frequency information from the sample images with multiple Difference of Gaussian (DoG) filters. To resist presentation attacks from high-quality photographs, Kollreider et al. [14] utilized an optical flow estimation method to detect the motion difference between the central and outer face regions. Bao et al. [15] proposed a motion-based PAD method which detects the difference between the optical flow fields generated by a planar photograph and a three-dimensional (3D) face. De Marsico et al. [16] employed 3D projective invariants to distinguish a real face from a photograph. Considering the different acquisition-related noise signatures of bona fide presentations and PAIs, Pinto et al. [17] presented a PAD method based on visual rhythms.
With the improvement in the fidelity of PAIs, the software-based PAD schemes are increasingly vulnerable to spoofing. By contrast, the hardware-based PAD methods deter attack presentations by exploring characteristics such as the involuntary signals and/or challenge-responses of a living person [5]. Nonetheless, such a measurement relies on the responses of the user, making the detection slow and unstable.
The development of cutting-edge imaging modalities has enabled a new class of passive PAD approaches which identify bona fide samples by detecting the intrinsic properties (e.g. profile, reflectance, or thermal signature) of a living person. As representative implementations, the PAD methods based on multi-spectral imaging [18]-[20] and light field imaging [21]-[23] measure the reflectance of living skin and the 3D profile of a bona fide face, respectively. Capable of acquiring multi-dimensional information, these PAD methods outperform most existing PAD approaches which use conventional color cameras. However, the multi-spectral-imaging-based methods (hereafter referred to as “multi-spectral PAD”) are sensitive to illumination changes because the reflectance spectra measured from subjects highly depend on the spectrum of the light source, while the light-field-imaging-based methods (hereafter referred to as “light-field PAD”) are prone to being spoofed by 3D-printed face masks.
To overcome these limitations, herein we present a passive PAD method based on complete plenoptic imaging [24]. Our method, referred to as the plenoptic PAD, combines the two complementary countermeasures, i.e. multi-spectral PAD and light-field PAD, by synergistically integrating a recently developed complete plenoptic imager [25] with a decision level fusion algorithm. Leveraging the multi-dimensional information so obtained, our approach significantly improves the accuracy of anti-spoofing detections. The paper is organized as follows. In Section II, we briefly introduce the complete plenoptic imaging. Then, the proposed PAD method based on the complete plenoptic imaging is elaborated in Section III. We present the experimental results in Section IV. Finally, the conclusions are given in Section V.
II. COMPLETE PLENOPTIC IMAGING
Figure 1 depicts a sketch of the complete plenoptic imaging system, which introduces the paradigm of light field imaging into Fourier transform imaging spectroscopy. The incident light from a subject is first imaged by an objective lens and a relay lens, forming a virtual image (denoted as the second intermediate image in Fig. 1) behind the charge-coupled device (CCD). Next, an elemental image array is formed through a microlens array (MLA) onto the CCD. Unlike focused light field cameras [26], the complete plenoptic imaging system incorporates a birefringent polarization interferometer (BPI) which contains two polarizers, two Nomarski prisms (NPs), and a half-wave plate (HWP). Based on the birefringence of the prisms, the BPI superimposes interference fringes on the raw image captured by the CCD. The raw image thus couples both the light field and the interference of the incoming light rays. We adopted a reconstruction algorithm based on a convolutional neural network (CNN) [25] to derive the seven-dimensional (7D) datacube (x,y,z,λ,θ,φ,t), which gives the spatial (x,y,z) and angular (θ,φ) coordinates, wavelength (λ), and time (t) of the incoming light rays. Simply put, we first trained a CNN architecture comprising two hidden layers and an output layer to decouple the light-field image (x,y,θ,φ) and the interference from a raw image. Then we reconstructed the depth map (x,y,z) from the light-field image using a disparity estimation algorithm based on the scale-depth space transform [27], while a 3D spectral datacube (x,y,λ) of the subject was derived from the interference by Fourier transform [28], [29]. The theoretical model and reconstruction algorithm of the complete plenoptic imaging are thoroughly described in Ref. [25].
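To make the spectral-recovery branch concrete, the following is a minimal Python sketch of the Fourier-transform relation between an interferogram and its spectrum. It covers only the interference branch; the CNN decoupling and the disparity-based depth estimation follow Refs. [25] and [27] and are not reproduced here. The uniform optical-path-difference (OPD) sampling and the array shapes are assumptions for illustration.

```python
import numpy as np

def spectrum_from_interferogram(interferogram: np.ndarray) -> np.ndarray:
    """Recover a spectrum from an interferogram sampled at uniform OPD steps."""
    # Remove the DC bias so only the modulated interference term remains.
    modulated = interferogram - interferogram.mean()
    # In Fourier transform spectroscopy, the spectrum is the magnitude of
    # the Fourier transform of the interferogram over OPD.
    return np.abs(np.fft.rfft(modulated))

# Applying the recovery pixel by pixel over an (x, y, OPD) interference stack
# yields the 3D spectral datacube (x, y, lambda) described in the text.
stack = np.random.rand(64, 64, 200)                       # placeholder stack
datacube = np.apply_along_axis(spectrum_from_interferogram, 2, stack)
print(datacube.shape)                                     # (64, 64, 101)
```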
FIGURE 1.
Sketch of the complete plenoptic imaging system. MLA, microlens array; NP, Nomarski prism; HWP, half-wave plate; CCD, charge-coupled device.
It is noteworthy that we use a BPI configuration different from the one in the previous system [25]: the Wollaston prism is replaced by two NPs and an HWP [28], [29]. As a result, a real interference plane is formed outside the BPI, co-located with the elemental image array formed by the MLA. Additionally, we place the CCD directly at the interference plane without a relay lens. The new imaging system is therefore more compact and robust than the previous setup, which required a relay lens between the BPI and the CCD [25].
III. PRESENTATION ATTACK DETECTION BASED ON COMPLETE PLENOPTIC IMAGING
PAIs can be generally divided into two categories: those with artificial characteristics (e.g. printed photos or videos of faces, 3D face masks, digital photos of faces displayed on a screen) and those with human-based characteristics (e.g. faces of cadavers, mutilations of faces, plastic and cosmetic surgeries, facial expressions, faces of humans who are unconscious or under duress) [30]. In this work, we focus on defending against attacks using PAIs with artificial characteristics, which are more common than those with human-based characteristics because they are easy to implement.
Each PAD approach has its own vulnerabilities. For example, the multi-spectral PAD methods are effective in distinguishing PAIs with artificial characteristics from bona fide faces under known illumination. However, they are vulnerable to presentation attacks based on illumination changes. On the other hand, although the light-field PAD methods can reliably distinguish flat or wrapped PAIs from bona fide faces, they are easily spoofed by 3D face masks. To solve these problems, our plenoptic PAD combines the advantages of the multi-spectral and light-field PAD methods to enhance the robustness of a face recognition system against various types of presentation attacks.
Figure 2 shows the flowchart of the plenoptic PAD method which contains three steps, namely (I) reconstruction of the 7D datacubes (x,y,z,λ,θ,φ,t) of subjects, (II) training of two support vector machine (SVM) classifiers based on the 3D spectral datacubes (x,y,λ) and the light field datacubes (x,y,θ,φ), respectively, (III) fusion of the outputs of the two classifiers. In step I, we first capture the raw images of subjects by the complete plenoptic imaging system and then reconstruct the 7D datacubes (x,y,z,λ,θ,φ,t) of subjects as described in Section II.
FIGURE 2.
Flowchart of the presentation attack detection based on complete plenoptic imaging. (a) Spectral datacube and reconstructed gray image of a subject. (b) Average spectrum in the face area of the subject. (c) Light field image and central sub-aperture image of the subject. (d) Calculated histogram of disparity gradients (HDG) in the face area of the subject. The orange dashed squares indicate the face area of the subject. SVM, support vector machine; PAI, presentation attack instrument.
As shown in Fig. 2, step II can be further divided into two branches. In the upper branch, we first extract a 3D spectral datacube, S(x,y,λ), from the 7D datacube of each subject. Then we detect the face area of the subject and set it as the region of interest (ROI). As shown in previous studies [19], [31], [32], the reflectance of living skin is different from that of the artificial materials used to fabricate PAI, e.g. printed photos and 3D face masks. Therefore, we employ the average spectrum in the face area of each subject as the descriptor [19], which can be calculated by
$$\bar{S}(\lambda)=\frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}S(x,y,\lambda)\tag{1}$$
where M and N are the number of rows and columns in the ROI, respectively. In the complete plenoptic imaging system, we employ an MLA with 13 × 18 microlenses and two NPs made of quartz (apex angle of 3.9°), which yield 100 spectral bands in the average spectrum. Figure 2(b) plots the average spectrum in the face area of a subject. Finally, we use the descriptors to train an SVM model, indicated as SVM classifier I in Fig. 2, and employ the model to classify the input images into two separate groups, i.e. bona fide faces and PAIs.
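A direct implementation of Eq. (1) is straightforward. In the sketch below, the (rows, columns, bands) datacube layout and the rectangular ROI convention are illustrative assumptions.

```python
import numpy as np

def average_spectrum_descriptor(spectral_cube: np.ndarray,
                                roi: tuple) -> np.ndarray:
    """Average spectrum over the face ROI, following Eq. (1).

    spectral_cube: 3D datacube S(x, y, lambda) with shape (rows, cols, bands).
    roi: (row_start, row_stop, col_start, col_stop) of the detected face area.
    """
    r0, r1, c0, c1 = roi
    face = spectral_cube[r0:r1, c0:c1, :]      # M x N x bands face region
    # Averaging over the M*N spatial locations gives one value per band.
    return face.mean(axis=(0, 1))              # shape: (bands,)

# Example with a hypothetical 100-band datacube and face ROI.
cube = np.random.rand(260, 360, 100)
descriptor = average_spectrum_descriptor(cube, (80, 200, 120, 240))
print(descriptor.shape)                        # (100,)
```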
On the other hand, the 3D profile of a bona fide face is different from that of flat or wrapped PAIs. Hence, in the lower branch, we calculate the histograms of disparity gradients (HDG) from the light field images (x,y,θ,φ) and use them as descriptors [23]. Simply put, we first extract a light field image from the 7D datacube of each subject. Then the face area in each sub-aperture image is detected and aligned based on the eye locations. Each sub-aperture image is cropped based on the face area of the central sub-aperture image and resized to 128 × 128 pixels. Next, we calculate the HDG descriptor in the face area of each light field image. The HDG descriptor is an extension of the histogram of oriented gradients (HOG), which has been widely applied in computer vision [33]-[35]. Unlike the HOG descriptor, the horizontal and vertical disparity gradients of the HDG descriptor are calculated as [23]:
$$G_h(x,y)=L(x,y,u_2,v_2)-L(x,y,u_1,v_1),\qquad G_v(x,y)=L(x,y,u_4,v_4)-L(x,y,u_3,v_3)\tag{2}$$
where L(x, y, u, v) denotes the light field image of the subject, while (u1, v1), (u2, v2), (u3, v3), and (u4, v4) are the coordinates of the selected sub-aperture images as indicated by red squares in Fig. 2(c). Then the disparity gradient magnitude and orientation are respectively calculated by
$$G(x,y)=\sqrt{G_h^2(x,y)+G_v^2(x,y)}\tag{3}$$
$$\theta(x,y)=\arctan\!\left(\frac{G_v(x,y)}{G_h(x,y)}\right)\tag{4}$$
We divide the gradient magnitude and orientation images into non-overlapping cells. Each cell contains 8 × 8 pixels which are grouped into nine histogram bins. A histogram is then calculated by accumulating the magnitudes of the pixels in each bin. To normalize the local contrast, we further group each 2 × 2 cells into a block and concatenate the histograms of the four cells into a single vector. The final HDG descriptor, Dl(h1, h2, … , hk), is derived by concatenating all the normalized block vectors. Here, hi is the ith element of the descriptor. To improve the normalization performance, we overlap adjacent blocks [33], which extends the dimensionality of the HDG descriptor to 8100. Finally, we train an SVM classifier model (SVM classifier II) using the HDG descriptors.
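The HDG computation described above can be sketched as follows. The (U, V, X, Y) light-field layout and the particular sub-aperture indices in the example are assumptions; the 8 × 8 cells, nine bins, and overlapping 2 × 2 blocks follow the text. With 128 × 128 face crops this yields 15 × 15 blocks of 36 values each, i.e. the 8100-dimensional descriptor stated above.

```python
import numpy as np

def hdg_descriptor(lf, horiz_pair, vert_pair, cell=8, nbins=9):
    """Histogram of disparity gradients (HDG), following Eqs. (2)-(4).

    lf: light field of shape (U, V, X, Y); lf[u, v] is one sub-aperture image.
    horiz_pair: ((u1, v1), (u2, v2)) sub-apertures for the horizontal gradient.
    vert_pair:  ((u3, v3), (u4, v4)) sub-apertures for the vertical gradient.
    """
    (u1, v1), (u2, v2) = horiz_pair
    (u3, v3), (u4, v4) = vert_pair
    gh = lf[u2, v2] - lf[u1, v1]                    # Eq. (2), horizontal
    gv = lf[u4, v4] - lf[u3, v3]                    # Eq. (2), vertical
    mag = np.hypot(gh, gv)                          # Eq. (3), magnitude
    ori = np.degrees(np.arctan2(gv, gh)) % 180.0    # Eq. (4), unsigned angle

    cx, cy = mag.shape[0] // cell, mag.shape[1] // cell
    hists = np.zeros((cx, cy, nbins))
    bin_w = 180.0 / nbins
    for i in range(cx):
        for j in range(cy):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            o = ori[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            idx = np.minimum((o // bin_w).astype(int), nbins - 1)
            # Accumulate gradient magnitudes into the nine orientation bins.
            hists[i, j] = np.bincount(idx, weights=m, minlength=nbins)

    # Overlapping 2x2-cell blocks, each L2-normalized and concatenated.
    blocks = [hists[i:i+2, j:j+2].ravel()
              for i in range(cx - 1) for j in range(cy - 1)]
    blocks = [b / (np.linalg.norm(b) + 1e-6) for b in blocks]
    return np.concatenate(blocks)

# Example: a 9x9-view light field with 128x128 face crops (as in the text).
lf = np.random.rand(9, 9, 128, 128)
d = hdg_descriptor(lf, ((4, 3), (4, 5)), ((3, 4), (5, 4)))
print(d.size)                                       # 8100
```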
In step III, we derive the final output by fusing the classification results from SVM classifiers I and II. These two classifiers are trained based on a multi-spectral PAD method [19] and a light-field PAD method [23], which are vulnerable to the presentation attacks based on illumination changes and 3D face masks, respectively. To combine the two methods, we adopt a decision-level fusion based on the “AND” rule: we accept only the input samples classified as “bona fide face” by both classifiers and reject all other samples as PAIs. The resultant PAD method is therefore effective against presentation attacks based on either illumination changes or 3D face masks.
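Over a batch of samples, this fusion reduces to an element-wise logical AND of the two classifiers' accept decisions; a minimal sketch (the boolean-decision interface is an assumption):

```python
import numpy as np

def fuse_and_rule(accept_spectral, accept_lightfield):
    """Decision-level "AND" fusion: a sample is labeled bona fide only if
    both SVM classifier I and SVM classifier II accept it."""
    return np.logical_and(accept_spectral, accept_lightfield)

# Classifier I accepts samples 0 and 1; classifier II accepts samples 1 and 2.
print(fuse_and_rule(np.array([True, True, False]),
                    np.array([False, True, True])))   # [False  True False]
```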
IV. EXPERIMENT
A. DATABASE ESTABLISHMENT
In the past decade, several multi-dimensional face databases, e.g. multi-spectral [3], [20], [36], 3D [37], light field [22], and multi-modality (i.e. RGB, infrared, and depth) [38] face databases, have been established. However, none of them contains all of the six-dimensional (6D) information (x,y,z,λ,θ,φ) of subjects. In this work, we built a 6D face database comprising 50 subjects aged from 22 to 34 (17 females and 33 males). Additionally, we generated seven types of presentation attacks:
Printed photo I: 2D images of 50 subjects were captured by a Canon 550D DSLR camera and printed on A4 paper using a color inkjet printer (Epson Stylus Photo 1500W). The printed photos were placed on a flat surface [Fig. 3(b)].
Printed photo II: the 2D images of 50 subjects were printed on A4 paper using a color laser printer (Xerox 700 Digital Color Press). The printed photos were placed on a flat surface [Fig. 3(c)].
Wrapped photo I: the printed photo I was wrapped to mimic the human face profile [Fig. 3(d)].
Wrapped photo II: the printed photo II was wrapped [Fig. 3(e)].
Digital photo I: the 2D images of 50 subjects were displayed by a laptop (Surface Pro 5) [Fig. 3(f)].
Digital photo II: the 2D images of 50 subjects were displayed by an LCD screen (Philips 247E7QHSWP) [Fig. 3(g)].
3D face mask: 3D face masks of 12 subjects were fabricated by a color 3D printer [39] using the facial models reconstructed by a 3D scanner [Fig. 3(h)].
FIGURE 3.
Photos of a bona fide face and its corresponding PAIs captured by a commercial camera. (a) Bona fide face. (b) Printed photo by a color inkjet printer. (c) Printed photo by a color laser printer. (d) Wrapped photo printed by a color inkjet printer. (e) Wrapped photo printed by a color laser printer. (f) Digital photo displayed by a laptop. (g) Digital photo displayed by an LCD screen. (h) 3D face mask. The photos were authorized to disclose under the subject’s consent.
As an example, we show the photos of a bona fide face and the seven types of presentation attacks in Figs. 3(a) and 3(b-h), respectively. We used the complete plenoptic imaging system and a Canon 550D DSLR camera to capture the raw data. In our experiment, all subjects were informed about the study and asked to face the imaging systems with a neutral expression in a laboratory environment. To enrich the database, we employed nine types of illumination: a cold LED (color temperature: 5500 K), a warm LED (color temperature: 2700 K), a fluorescent lamp, a halogen lamp, and the combinations of any two of them (except for the combination of the cold LED and warm LED). A 6D datacube (x,y,z,λ,θ,φ) of each subject or PAI under a given illumination was reconstructed by the CNN-based algorithm [25], while a high-resolution RGB image was captured by the DSLR camera. It is worth noting that no external illumination was used for the digital photos displayed on the two screens. The database contains a total of 2440 datacubes (x,y,z,λ,θ,φ) and 2440 RGB images.
B. EXPERIMENTAL RESULTS
To evaluate the performance of the PAD methods, we divided the 50 subjects from the multi-dimensional face database into three non-overlapping subsets, namely training set (25 subjects), development set (10 subjects), and test set (15 subjects). The training set was used to train SVM classifiers, while the development set and test set were employed to fine-tune parameters in the SVM classification models and evaluate the performance of the PAD methods, respectively. Note that all 3D face masks were distributed to the test set.
According to ISO/IEC 30107–3:2017 [40], we evaluated the performance of the PAD methods using two metrics: i) the attack presentation classification error rate (APCER), which measures the proportion of attack presentations using the same PAI species incorrectly classified as bona fide presentations, and ii) the bona fide presentation classification error rate (BPCER), which measures the proportion of bona fide presentations incorrectly classified as attacks. Additionally, we evaluated the overall performance of a PAD method using the F-score, calculated as:
$$F=\frac{2PR}{P+R}\tag{5}$$
where P and R are the precision and recall of the classification result, respectively. They are calculated as
$$P=\frac{TP}{TP+FP}\tag{6}$$
$$R=\frac{TP}{TP+FN}\tag{7}$$
where TP, FP, and FN are the number of true positive, false positive, and false negative samples, respectively.
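The metrics in Eqs. (5)-(7) translate directly into code. The sketch below treats bona fide as the positive class, consistent with the precision/recall definitions above; per ISO/IEC 30107-3, APCER should be computed separately for each PAI species, so the attack labels passed in are assumed to belong to a single species.

```python
import numpy as np

def pad_metrics(y_true, y_pred):
    """APCER, BPCER, and F-score following Eqs. (5)-(7).

    y_true, y_pred: arrays with 1 for bona fide and 0 for attack.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))   # bona fide kept
    fp = np.sum((y_true == 0) & (y_pred == 1))   # attacks accepted
    fn = np.sum((y_true == 1) & (y_pred == 0))   # bona fide rejected
    apcer = fp / max(np.sum(y_true == 0), 1)     # attack error rate
    bpcer = fn / max(np.sum(y_true == 1), 1)     # bona fide error rate
    p = tp / max(tp + fp, 1)                     # Eq. (6), precision
    r = tp / max(tp + fn, 1)                     # Eq. (7), recall
    f = 2 * p * r / max(p + r, 1e-9)             # Eq. (5), F-score
    return apcer, bpcer, f
```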
In practice, it is infeasible to collect all types of PAIs when training the classification models. Therefore, PAD systems must withstand attacks from “unknown” PAIs. To test the effectiveness of the PAD methods against such attacks, we first trained and cross-validated the SVM classifiers using the training set and development set, respectively, excluding two types of samples: i) the subjects and PAIs illuminated by the warm LED or any hybrid light source involving the warm LED, and ii) the 3D face masks. Then we evaluated the PAD methods on the test set, which contains two types of “unknown” PAIs: i) the printed PAIs illuminated by the warm LED, and ii) the 3D face masks illuminated by all types of illumination except for the warm LED or any hybrid light source involving the warm LED. Under this configuration, we collected 800 training samples in the training set. Considering the dimensionalities of the spectral and HDG descriptors, we used a Gaussian kernel for SVM classifier I and a linear kernel for SVM classifier II.
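Using scikit-learn as a stand-in (the original implementation is not specified in the text), the two classifiers with the kernels named above could be set up as follows; the random placeholder arrays merely reproduce the 800-sample training-set size and the descriptor dimensionalities.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_spec = rng.random((800, 100))     # placeholder average-spectrum descriptors
X_hdg = rng.random((800, 8100))     # placeholder HDG descriptors
y = rng.integers(0, 2, 800)         # 1 = bona fide, 0 = PAI (placeholder)

# Gaussian (RBF) kernel for the low-dimensional spectral descriptor and a
# linear kernel for the high-dimensional HDG descriptor, as in the text.
svm_i = SVC(kernel="rbf").fit(X_spec, y)
svm_ii = SVC(kernel="linear").fit(X_hdg, y)
```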
In this experiment, we compared the plenoptic PAD with several state-of-the-art PAD methods, including the multi-spectral PAD method [19], the light-field PAD method [23], and four software-based PAD methods: i) a frequency-based PAD method using multiple Difference of Gaussian (DoG) filters [13], ii) a PAD approach based on multi-scale LBPs [9], iii) a texture-based spoofing countermeasure using LBPs and gray-level co-occurrence matrices (GLCMs) [10], and iv) a PAD method based on color texture analysis [11]. We denote these software-based methods as “DoG-PAD”, “MTA-PAD”, “T-PAD” and “CTA-PAD”, respectively. Table 1 reports the performance of these methods. The results show that all state-of-the-art PAD methods perform well on “known” PAIs. However, none can distinguish both types of “unknown” PAIs, denoted as “Mask” and “Warm LED” in Table 1, from bona fide faces.
TABLE 1.
Performance (%) of the proposed and state-of-the-art presentation attack detection methods. The first eight columns report the APCER for each PAI species.

Method | Inkjet | W-Inkjet | Laser | W-Laser | Surface | Philips | Mask | Warm LED | BPCER | F-score |
---|---|---|---|---|---|---|---|---|---|---|
MTA-PAD | 8.9 | 0 | 5.6 | 1.1 | 20 | 6.7 | 97.2 | 3.3 | 8.9 | 62.6 |
T-PAD | 0 | 0 | 0 | 0 | 0 | 0 | 48.6 | 0 | 5.6 | 81.0 |
CTA-PAD | 0 | 0 | 0 | 0 | 6.7 | 0 | 73.6 | 0 | 7.8 | 73.1 |
Multi-spectral PAD | 0 | 0 | 0 | 0 | 0 | 0 | 8.3 | 100 | 1.1 | 72.7 |
Light-field PAD | 0 | 0 | 2.2 | 0 | 0 | 0 | 41.7 | 0 | 3.3 | 83.3 |
Plenoptic PAD | 0 | 0 | 0 | 0 | 0 | 0 | 4.2 | 0 | 4.4 | 96.1 |
Acronyms and abbreviations: APCER, attack presentation classification error rate; BPCER, bona fide presentation classification error rate; Inkjet, photos printed by an inkjet printer; W-Inkjet, wrapped photos printed by an inkjet printer; Laser, photos printed by a laser printer; W-Laser, wrapped photos printed by a laser printer; Surface, digital photos displayed by a Surface Pro laptop; Philips, digital photos displayed by a Philips screen; Mask, 3D face masks; Warm LED, printed photos illuminated by a warm LED.
In particular, the multi-spectral PAD incorrectly classified all the printed photos illuminated by the warm LED as bona fide faces (APCER = 100%). This is because the reflectance spectra of printed photos illuminated by a warm LED are close to those of bona fide faces illuminated by a cold LED. On the other hand, the light-field PAD could not distinguish the 3D face masks from bona fide faces (APCER = 41.7%) because the masks have 3D profiles similar to those of real faces. It is worth noting that the multi-spectral and light-field PAD correctly rejected most of the 3D face masks and the warm-LED-illuminated PAIs, respectively. This implies that the multi-spectral PAD is robust to changes in 3D profile while the light-field PAD is insensitive to the illumination. Combining the advantages of the two methods, the plenoptic PAD outperforms both the multi-spectral and light-field PAD in F-score and in the APCER of all types of presentation attacks. Note that the BPCER of the plenoptic PAD is higher than that of the multi-spectral PAD. This is because the plenoptic PAD classifies as bona fide only the samples accepted by both the multi-spectral and light-field PAD, so it is more likely to reject bona fide faces as PAIs. However, since both the multi-spectral and light-field PAD perform well in BPCER, the plenoptic PAD maintains the classification accuracy of bona fide samples while dramatically suppressing the APCER of “unknown” PAIs. Figure 4 shows the detection error tradeoff (DET) curves of the state-of-the-art and proposed PAD methods. The red line indicates the DET curve of the proposed approach, which outperforms the other state-of-the-art methods.
FIGURE 4.
DET curves of the state-of-the-art and proposed PAD methods.
Figure 5 shows the scatter plot of the scores of the multi-spectral and light-field PAD, which visually presents the results given in Table 1. The green and blue dashed lines indicate the decision boundaries of the multi-spectral and light-field PAD, respectively. According to the decision-level fusion rule used in the plenoptic PAD, we draw its decision boundary with orange lines in Fig. 5. The results imply that the plenoptic PAD accepts only the samples in the first quadrant as bona fide faces. Therefore, it is more robust against different types of PAIs, e.g. the PAIs in the second and fourth quadrants, than the multi-spectral PAD and light-field PAD. Furthermore, although the plenoptic PAD incorrectly rejects a few bona fide samples outside the first quadrant as PAIs, it maintains the classification accuracy of bona fide samples well (BPCER = 4.4%).
FIGURE 5.
Scatter plot of the scores of multi-spectral and light-field PAD methods. The green dashed line, blue dashed line, and orange lines indicate the decision boundaries of the multi-spectral PAD, light-field PAD, and plenoptic PAD approaches, respectively.
V. CONCLUSION
We have developed a plenoptic face presentation attack detection method based on a complete plenoptic imaging system which can acquire the complete plenoptic function, P(x,y,z,λ,θ,φ,t), of light rays using a single detector. This is the first time a complete plenoptic imaging system has been used for PAD in face recognition. Combining the advantages of the multi-spectral and light-field PAD methods, our plenoptic PAD approach is robust against various types of presentation attacks. To evaluate the performance of the PAD methods, we constructed a multi-dimensional (x,y,z,λ,θ,φ,t) face database with 50 subjects and seven types of presentation attacks. The experimental results show that the proposed method outperforms the state-of-the-art PAD methods on all types of presentation attacks, especially on the “unknown” attacks. Although our method tends to reject more bona fide samples as PAIs than the multi-spectral and light-field PAD, its BPCER remains low (4.4%).
In summary, our plenoptic PAD approach is effective against various types of presentation attacks while maintaining the classification accuracy of the bona fide samples. The imaging system employed is compact, robust, and low cost, enabling passive and real-time measurement. Given its performance, we anticipate that the plenoptic PAD will open a new avenue of investigation in face recognition.
ACKNOWLEDGMENT
The authors thank all the participants during the database establishment.
This work was supported in part by the National Science Foundation CAREER Award under Grant 1652150, in part by the National Institutes of Health under Grant R35GM128761 and Grant R01EY029397, and in part by the Applied Technology Research and Development Program of Heilongjiang Province under Grant GX16C013.
Biographies
SHUAISHUAI ZHU received the B.S. degree in mechanical engineering from Shandong University, Jinan, China, in 2009. He is currently pursuing the Ph.D. degree in instrument science and technology with the Harbin Institute of Technology, Harbin, China.
From 2016 to 2017, he was a Visiting Scholar with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA. His research interests include the spectral imaging, light field imaging, and their applications in biometrics and biomedicine.
XIAOBO LV received the B.S. degree in physics from the Harbin Institute of Technology, Harbin, China, in 2017, where he is currently pursuing the Ph.D. degree in instrument science and technology.
His research interests include the spectral imaging, light field imaging, polarization imaging, and their applications in biometrics and biomedicine.
XIAOHUA FENG received the B.S. degree in electrical engineering from Xidian University, Xi’an, China, in 2011, and the Ph.D. degree from Nanyang Technological University, Singapore, in 2016. He is currently a Postdoctoral Research Associate with the University of Illinois at Urbana-Champaign. His research interests include ultrafast photography, optical tomography, and computational imaging.
JIE LIN received the Ph.D. degree in physics from the Harbin Institute of Technology, Harbin, China, in 2007. He currently works as an Associate Professor with the School of Instrumentation Science and Engineering, Harbin Institute of Technology. He has led or participated in more than ten projects supported by NSFC, MOE, MOST, and others. He has published more than 30 peer-reviewed articles and applied for more than 30 patents. His research interests include precision measurement, super-resolution focusing, and micro-nano optical elements.
PENG JIN received the B.S. degree in physics from Jilin University, in 1994, and the M.S. and Ph.D. degrees in instrument science and technology from the Harbin Institute of Technology, Harbin, China, in 2001.
From 2002 to 2003, he was a Postdoctoral Research Associate with the University of Birmingham, U.K. From 2004 to 2018, he was a Professor with the School of Electrical Engineering and Automation, Harbin Institute of Technology. Since 2018, he has been a Professor with the School of Instrumentation Science and Engineering, Harbin Institute of Technology. His research interests include fabrication and applications of micro-electro-mechanical systems, spectral imaging, and biomedical imaging.
LIANG GAO received the B.S. and M.S. degrees in physics from Tsinghua University, Beijing, China, in 2007, and the Ph.D. degree in applied physics from Rice University, Houston, USA, in 2011.
From 2011 to 2015, he was a Postdoctoral Research Associate with Washington University in St. Louis. From 2015 to 2016, he worked as an Advisory Research Scientist with Ricoh Research. Since 2016, he has been an Assistant Professor with the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA. His research interests include ultrafast biophotonics, multidimensional imaging, photoacoustic imaging, and near-eye 3D display.
REFERENCES
- [1]. Prabhakar S, Pankanti S, and Jain AK, “Biometric recognition: Security and privacy concerns,” IEEE Secur. Privacy, vol. 1, no. 2, pp. 33–42, March 2003.
- [2]. Zhao W, Chellappa R, Phillips PJ, and Rosenfeld A, “Face recognition: A literature survey,” ACM Comput. Surv., vol. 35, no. 4, pp. 399–458, December 2003.
- [3]. Chingovska I, Erdogmus N, Anjos A, and Marcel S, “Face recognition systems under spoofing attacks,” in Face Recognition Across the Imaging Spectrum. Cham, Switzerland: Springer, 2016, pp. 165–194.
- [4]. Ramachandra R and Busch C, “Presentation attack detection methods for face recognition systems: A comprehensive survey,” ACM Comput. Surv., vol. 50, no. 1, pp. 1–37, April 2017.
- [5]. Galbally J, Marcel S, and Fierrez J, “Biometric antispoofing methods: A survey in face recognition,” IEEE Access, vol. 2, pp. 1530–1552, 2014.
- [6]. Hadid A, “Face biometrics under spoofing attacks: Vulnerabilities, countermeasures, open issues, and research directions,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, Columbus, OH, USA, June 2014, pp. 113–118.
- [7]. Boulkenafet Z et al., “A competition on generalized software-based face presentation attack detection in mobile scenarios,” in Proc. IEEE Int. Joint Conf. Biometrics (IJCB), Denver, CO, USA, October 2017, pp. 688–696.
- [8]. Akhtar Z, Micheloni C, and Foresti GL, “Biometric liveness detection: Challenges and research opportunities,” IEEE Secur. Privacy, vol. 13, no. 5, pp. 63–72, September 2015.
- [9]. Maatta J, Hadid A, and Pietikainen M, “Face spoofing detection from single images using micro-texture analysis,” in Proc. Int. Joint Conf. Biometrics (IJCB), October 2011, pp. 1–7.
- [10]. Hadid A, Evans N, Marcel S, and Fierrez J, “Biometrics systems under spoofing attack: An evaluation methodology and lessons learned,” IEEE Signal Process. Mag., vol. 32, no. 5, pp. 20–30, September 2015.
- [11]. Boulkenafet Z, Komulainen J, and Hadid A, “Face spoofing detection using colour texture analysis,” IEEE Trans. Inf. Forensics Security, vol. 11, no. 8, pp. 1818–1830, August 2016.
- [12]. Li J, Wang Y, Tan T, and Jain AK, “Live face detection based on the analysis of Fourier spectra,” in Proc. Biometric Technol. Hum. Identificat., Orlando, FL, USA, August 2004, pp. 296–303.
- [13]. Zhang Z, Yan J, Liu S, Lei Z, Yi D, and Li SZ, “A face antispoofing database with diverse attacks,” in Proc. 5th IAPR Int. Conf. Biometrics (ICB), New Delhi, India, March 2012, pp. 26–31.
- [14]. Kollreider K, Fronthaler H, and Bigun J, “Non-intrusive liveness detection by face images,” Image Vis. Comput., vol. 27, no. 3, pp. 233–244, February 2009.
- [15]. Bao W, Li H, Li N, and Jiang W, “A liveness detection method for face recognition based on optical flow field,” in Proc. Int. Conf. Image Anal. Signal Process., Taizhou, China, 2009, pp. 233–236.
- [16]. De Marsico M, Nappi M, Riccio D, and Dugelay J-L, “Moving face spoofing detection via 3D projective invariants,” in Proc. 5th IAPR Int. Conf. Biometrics (ICB), New Delhi, India, March 2012, pp. 73–78.
- [17]. Pinto A, Schwartz WR, Pedrini H, and Rocha A, “Using visual rhythms for detecting video-based facial spoof attacks,” IEEE Trans. Inf. Forensics Security, vol. 10, no. 5, pp. 1025–1038, May 2015.
- [18]. Yi D, Lei Z, Zhang Z, and Li SZ, “Face anti-spoofing: Multi-spectral approach,” in Handbook of Biometric Anti-Spoofing: Trusted Biometrics Under Spoofing Attacks. London, U.K.: Springer, 2014, pp. 83–102.
- [19]. Raghavendra R, Raja KB, Venkatesh S, and Busch C, “Face presentation attack detection by exploring spectral signatures,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Honolulu, HI, USA, July 2017, pp. 672–679.
- [20]. Agarwal A, Yadav D, Kohli N, Singh R, Vatsa M, and Noore A, “Face presentation attack with latex masks in multispectral videos,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Honolulu, HI, USA, July 2017, pp. 275–283.
- [21]. Raghavendra R, Raja KB, and Busch C, “Exploring the usefulness of light field cameras for biometrics: An empirical study on face and iris recognition,” IEEE Trans. Inf. Forensics Security, vol. 11, no. 5, pp. 922–936, May 2016.
- [22]. Sepas-Moghaddam A, Malhadas L, Correia PL, and Pereira F, “Face spoofing detection using a light field imaging framework,” IET Biometrics, vol. 7, no. 1, pp. 39–48, January 2018.
- [23]. Sepas-Moghaddam A, Pereira F, and Correia PL, “Light field-based face presentation attack detection: Reviewing, benchmarking and one step further,” IEEE Trans. Inf. Forensics Security, vol. 13, no. 7, pp. 1696–1709, July 2018.
- [24]. Gao L and Wang LV, “A review of snapshot multidimensional optical imaging: Measuring photon tags in parallel,” Phys. Rep., vol. 616, pp. 1–37, February 2016.
- [25]. Zhu S, Gao L, Zhang Y, Lin J, and Jin P, “Complete plenoptic imaging using a single detector,” Opt. Express, vol. 26, no. 20, pp. 26495–26510, October 2018.
- [26]. Zhu S, Lai A, Eaton K, Jin P, and Gao L, “On the fundamental comparison between unfocused and focused light field cameras,” Appl. Opt., vol. 57, no. 1, pp. A1–A11, January 2018.
- [27]. Tosic I and Berkner K, “Light field scale-depth space transform for dense depth estimation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, Columbus, OH, USA, June 2014, pp. 435–442.
- [28]. Kudenov MW and Dereniak EL, “Compact real-time birefringent imaging spectrometer,” Opt. Express, vol. 20, no. 16, pp. 17973–17986, July 2012.
- [29]. Zhu S, Zhang Y, Lin J, Zhao L, Shen Y, and Jin P, “High resolution snapshot imaging spectrometer using a fusion algorithm based on grouping principal component analysis,” Opt. Express, vol. 24, no. 21, pp. 24624–24640, October 2016.
- [30]. Information Technology-Biometric Presentation Attack Detection-Part 1: Framework, Standard ISO/IEC 30107–1, 2016.
- [31]. Kim Y, Na J, Yoon S, and Yi J, “Masked fake face detection using radiance measurements,” J. Opt. Soc. Amer. A, Opt. Image Sci., vol. 26, no. 4, pp. 760–766, April 2009.
- [32]. Angelopoulou E, “Understanding the color of human skin,” Proc. SPIE, vol. 4299, pp. 243–251, June 2001.
- [33]. Dalal N and Triggs B, “Histograms of oriented gradients for human detection,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (CVPR), San Diego, CA, USA, June 2005, pp. 886–893.
- [34]. Zhu Q, Yeh M-C, Cheng K-T, and Avidan S, “Fast human detection using a cascade of histograms of oriented gradients,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit. (CVPR), New York, NY, USA, vol. 2, June 2006, pp. 1491–1498.
- [35]. Déniz O, Bueno G, Salido J, and De la Torre F, “Face recognition using histograms of oriented gradients,” Pattern Recognit. Lett., vol. 32, no. 12, pp. 1598–1603, September 2011.
- [36]. Di W, Zhang L, Zhang D, and Pan Q, “Studies on hyperspectral face recognition in visible spectrum with feature band selection,” IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 40, no. 6, pp. 1354–1361, November 2010.
- [37]. Erdogmus N and Marcel S, “Spoofing face recognition with 3D masks,” IEEE Trans. Inf. Forensics Security, vol. 9, no. 7, pp. 1084–1097, July 2014.
- [38]. Zhang S, Wang X, Liu A, Zhao C, Wan J, Escalera S, Shi H, Wang Z, and Li S, “A dataset and benchmark for large-scale multi-modal face anti-spoofing,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Long Beach, CA, USA, June 2019, pp. 919–928.
- [39]. SimpNeed. Accessed: Mar. 24, 2020. [Online]. Available: http://www.simpneed.com/
- [40]. Information Technology-Presentation Attack Detection-Part 3: Testing, Reporting and Classification of Attacks, Standard ISO/IEC 30107–3, 2017.