Exploring the importance of shape on dynamic recognition of self-face or friend-face

Sogo Yumura; Karen Lander; Miyuki G Kamachi

doi:10.1038/s41598-026-45374-8

Sci Rep. 2026 Mar 28;16:10802. doi: 10.1038/s41598-026-45374-8

PMCID: PMC13039733 PMID: 41904210

Abstract

Our own face is well-known to us, but it is unclear whether we perceive it in the same way as other familiar faces. Unlike others’ faces, self-faces provide limited opportunities to observe facial motion. This study investigated the importance of shape when recognizing self or friend from dynamic face clips. Dynamic sequences were created using Deepfake, manipulating face shape and motion independently. In both experiments, participants observed a visually presented face and identified whether the face motion was self or friend. In Experiment 1, face shape was manipulated to match or mismatch the dynamic parameters of the observed person. In Experiment 2, the contribution of self- and friend- shape was manipulated in a series of stages to match or mismatch the dynamic parameters. The results showed the identification of the friends’ face motion was independent of the observed face shape. However, self-face motion could not be clearly identified until the face shape was judged to be self-face. These results support the prediction that self-face motion identification is more dependent on face shape, compared to another familiar friend face. We propose that self-faces may have specific perceptual characteristics that are distinct from the recognition of other familiar faces.

Keywords: Face recognition, Self-face, Dynamic facial information, Identification, Deepfake

Subject terms: Psychology, Human behaviour, Information technology

Introduction

When we learn the visual representations of familiar people (acquaintances, famous people, etc.), we acquire this information through face-to-face conversations or by seeing them on various media. However, when we learn the visual representation of our own face (self-face), this information is usually obtained through reflection in a mirror or reflective surface. Despite differences in the methods used to acquire facial information from self and other friends’ faces, both are classified as familiar faces¹. Humans are experts in elaborately processing faces from birth to adulthood, and in identifying who the observed face belongs to². The Bruce and Young (1986) model³ is a classic cognitive theory of face perception, which makes a distinction between the processing of familiar and unfamiliar faces (faces that have not been seen before). It has been suggested that the bandwidth of features (image dimensions between shape and texture) required for person identification differs depending on the familiarity of the face⁴.

Focusing on the recognition of familiar faces, it has been reported that the recognition of self-face may be processed by a different brain network than for the recognition of others’ faces⁵. In this fMRI study, it was shown that during face recognition, the posterior temporal regions are activated in common for self- and others’ faces. However, the posterior inferior temporal gyrus (pITG) and supramarginal gyrus (SMG) showed greater activation during self-face recognition compared to the recognition of others’ faces. Additionally, the occipital lobe was also activated during self-face recognition, suggesting its role in processing the synchronization between self-face motion and its mirror image (i.e., contingency cues). In other words, self-face recognition involves not only comparing perceived facial features with stored representations but also integrating dynamic visual feedback to assess temporal consistency of facial motion. In addition, it has been reported that self-face promotes attention pre-reflexively, which is characterized by the processing of self-related information at a stage prior to the conscious level⁶. One possibility is that implicitly pre-processed self-related information may assist in self-face recognition.

In recent years, in order to deepen our understanding of the face recognition process, research has focused on the importance of dynamic information on expression recognition⁷ or person identification⁸,⁹, using a perceptual psychological approach. There are two main hypotheses about how facial motion may influence identification¹⁰,¹¹. First, the ‘representation enhancement hypothesis’ posits that facial motion facilitates recognition by improving the structural representation of a face. Second, the ‘supplemental information hypothesis’ suggests that facial motion provides dynamic, idiosyncratic cues, such as a unique smile or head tilt, which supplement static facial features for identity recognition. From a neurological perspective, it has also been shown that dynamic information about faces is processed via the superior temporal sulcus (STS) in the dorsal stream¹². While numerous studies using dynamic facial stimuli have examined the distinction between familiar¹³ and unfamiliar¹⁴ individuals, recognition mechanisms within subcategories of familiar faces remain insufficiently understood. In particular, the differentiation between self-faces and other familiar faces in dynamic contexts has not yet been clearly explored. Although prior research employing static facial images has addressed this difference¹⁵,¹⁶, findings remain inconclusive and may not generalize to dynamic facial processing. Indeed, previous studies¹⁰ have reported that the more experience with face recognition, the higher the effectiveness of dynamic information in aiding familiar face recognition. However, it is uncertain what the importance of facial motion is for self-face recognition.

Recently, Deepfake technology has been used to explore the role of face motion on face perception¹⁷⁻¹⁹. A ‘Deepfake’ is generally defined as a fake image or video generated using deep neural networks. A typical example is an artificially created video in which the speaker looks like a certain person (here, Mr. A), but the actual speech act in the video is performed by a different person (here, Mr. B). The quality of generated fake videos may be so good and realistic that they are indistinguishable from real ones, unless viewers are explicitly informed of their inauthenticity²⁰,²¹. However, if the Deepfake is slightly ‘off’ then the sequence often looks somewhat uncanny²². Deepfake image sequences are mainly used in the real world for identity theft, prompting the development of protection measures²³. On the other hand, this technology is a potentially very useful tool in face perception research. As in the previous example, a video can be created in which Mr. A’s face shape is paired with Mr. B’s face motion, so that shape and motion can be treated as independent sources of information. Importantly, it is possible to create image sequences where the face shape is kept as similar as possible, with the face motion ‘driven’ by a different individual’s motion parameters.

In the present paper, two experiments are presented which aim to investigate whether the recognition of self-face has different perceptual properties than the recognition of a friend’s face. In both experiments, the face motion in the presented faces is identified by participants as ‘self’ or ‘friend’, and the probability of correctly identifying the motion is compared between self/friend faces. Our focus is on the influence of structurally processed shape information during facial motion identification. In other words, we are interested in whether there are differences in shape dependence during face motion identification between self and friend. In situations where shape-based person identification is difficult for other people’s faces, it has been shown that person identification by motion can be carried out without much dependence on shape²⁴. Structural shape information input is processed independently of the motion of the face; thus, facial motion identification is performed with high sensitivity to the motion itself. According to this theory, the motion identification of a friend’s face in our experiments may not be dependent on face shape and can be expected to show a high identification probability, no matter what face shape (between self and friend) is shown. On the other hand, for self-face we know a lot about facial shape, but we rarely actively observe our own facial motions. Thus, there may be relatively few records of self-face motion stored in memory, meaning participants rely more on shape to determine identity. If this is the case then the motion identification rate of self-face is expected to show shape-dependent results, which is different from the perceptual characteristics of a familiar friend’s face.

Method

Participants

Sixteen Japanese males with a mean age of 21.9 years (SD = 0.89), who had normal or corrected-to-normal vision (contact lenses only) and no abnormalities in facial motor function, participated in these experiments. Each participant was paired with another participant whom they knew (a ‘friend’) through seeing each other daily. Participants took part in both Experiments 1 and 2 in the same pair. The shortest period they had been friends with each other was approximately five months and the longest five years. Thus, participants were recruited in pairs and within each pair participants served as both ‘self’ and ‘friend’. A participant’s face was ‘self’ when they viewed their own face, and ‘friend’ when viewed by their paired participant. All participants were fully informed about the two experiments prior to their participation. Moreover, they gave informed consent to participate in the experiments and to authorize the publication of any images (videos) obtained in the experiments. The two experiments were conducted according to the principles of the Declaration of Helsinki and were approved by the ‘Ethics Review Committee for Research on Human Subjects at Kogakuin University’. An a priori power analysis was conducted using G*Power (version 3.1; Faul et al., 2007) to determine the required sample size for a within-participants ANOVA (1 group; 4 measurements). Assuming a large effect size (Cohen’s f = 0.40), an alpha level of 0.05, and desired statistical power of 0.80, the analysis indicated that a minimum of 10 participants was required to detect significant effects. This calculation assumed a moderate correlation among repeated measures (r = 0.50) and sphericity (ε = 1.0). To ensure adequate power, we recruited 16 participants.

Preparation

This preparation was carried out on the day before Experiment 1. Experiment 1 was then conducted, followed by Experiment 2 a few days later.

Video recording

The recording was conducted in a dark room with the participant sat on a chair, with a white screen behind. A video camera (NEX-VG10) was positioned on a tripod approximately 150 cm in front of the center of the participant’s head. The height of the tripod was adjusted so that the tip of the participant’s nose appeared in the center of the video camera’s preview screen, and the zoom lens was adjusted so that the width of the face on the preview screen (at the base of each auricle) was approximately 1.5 cm. A prompter (X12 Foldable Camera Teleprompter) was placed on the video camera, and a tablet device (XPERIA SGP712) was used for the prompter to display Japanese phrases one by one in sequence. The phrases consisted of 30 different phrases that contained vowels and consonants equally and could be read aloud in about 3.0 s to 6.0 s (e.g., ‘Kama de rubabu wo nemoto kara karitotta’: in English ‘Rhubarb was cut from the root with a sickle’). There was also an additional phrase with a longer reading time of about 8.0 s (‘Ojiisan ha yama he shibakari ni obaasan ha kawa he sentaku ni dekakemashita’: in English ‘Grandpa went to the mountains to mow the lawn and grandma went to the river to do the washing’). The lighting (tripod: RIFA-F, bulb: LDA11N-G/100VIR) was a two-lamp system, which illuminated the participant from a distance of 150 cm at 45 degrees to the front, left and right, and from a height of 170 cm above the participant. To diffuse the light on the participant, the irradiated area was covered with a diffuser.

The participant was signaled to start reading the displayed phrase aloud at the sound of a beep. The beep was played from the prompter control tablet. Eight seconds after the first beep, a second beep was played, and another phrase was visually displayed on the prompter. The participant read out the phrase each time it was updated. Thirty different phrases were read out. After the 30 phrases had been recorded, the longer phrase (mentioned above) was read out three times with a beep interval of 10.0 s. Thus, a total of 270 s of video (called SRC) was saved as the source face video, consisting of 240 s (8.0 s per phrase × 30 phrases) and 30 s (10.0 s × 1 phrase × 3 repetitions). The SRC was used for training the motion and shape of each speaker in Deepfake.
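The recording arithmetic above can be checked with a minimal sketch. The constant and function names are illustrative only, and the 30 fps value is the frame rate stated in the stimulus-generation step that follows, used here just to convert durations into frame counts.

```python
# Recording schedule as described in the text; all names are illustrative.
SHORT_PHRASES = 30        # distinct short phrases
SHORT_INTERVAL_S = 8.0    # beep interval per short phrase (seconds)
LONG_REPEATS = 3          # repetitions of the longer phrase
LONG_INTERVAL_S = 10.0    # beep interval per long phrase (seconds)
FPS = 30                  # frame rate used when frames are exported


def recording_durations():
    """Return (short_part_s, long_part_s, total_s) for the SRC video."""
    short_part = SHORT_PHRASES * SHORT_INTERVAL_S
    long_part = LONG_REPEATS * LONG_INTERVAL_S
    return short_part, long_part, short_part + long_part


short_s, long_s, total_s = recording_durations()
src_frames = int(total_s * FPS)   # frames in the full source video
dst_frames = int(long_s * FPS)    # frames in the final 30 s segment
print(short_s, long_s, total_s, src_frames, dst_frames)
```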

Face stimuli generated by Deepfake

The face stimuli used in the experiments were generated by DeepFaceLab 2.0²⁵, controlled by a desktop PC (SENSE-F069-LC129K-QRX, CPU: Intel Core i9-12900K, GPU: NVIDIA RTX A6000 48 GB GDDR6, memory: 128 GB). The final 30 s of the SRC taken during the video recording was extracted as the destination face video (three 10 s clips). This extracted video, called DST, served as the output face data in face stimulus generation using Deepfake and contained the face to be replaced by the Deepfake. Each participant’s SRC and DST videos were exported as frame images at 30 fps, and facial regions were extracted for all frames. Next, face exchange models were trained on each pair to swap the face shape of the DST (e.g., participant A) for that of the SRC (e.g., participant B) while keeping the face motion of the DST. The training step of this face exchange model was carried out with the aim of generating a high-definition version of the optimal face shape for the retained motion (DST face motion). Finally, the trained face exchange model was used to generate suitable SRC face shapes for each frame of the DST, which were then merged. During the merging process, blurring of the exchanged face regions was reduced by sharpening with a Gaussian filter. In addition, the colour transformation was removed to eliminate the influence of skin colour on self/friend motion identification. The generated face stimulus video was a DST video (motion unaltered) in which the face shape was replaced, with three 10 s videos created. In order to unify the ‘generated face stimuli’ in this experiment, face exchange processing was applied even if the face shape and face motion were identical to those of the participants. During stimulus presentation, background information other than the generated face area was removed (grey background) and the face stimuli were displayed so that the actual measured width of the face stimuli was 5.0 cm on the display.

Results

Experiment 1

Stimuli & conditions

Face stimuli generated by Deepfake (see above) were used for Experiment 1. The experimental design (see Fig. 1) consisted of four conditions combining two factors: face shape (2 levels: Shape-Friend, Shape-Self) and facial motion information (2 levels: Motion-Friend, Motion-Self). The stimuli used in each condition were three patterns showing the same long phrase being read aloud (each stimulus was 10 s). These stimuli were produced by splitting the DST video with replaced face shape into three 10 s segments. In all three segments the same phrase was read aloud, but with slightly different motions. Hence, the three patterns were used to avoid a ceiling effect.

Fig. 1 — Schematic diagram of conditions and procedures for Experiment 1: the stimuli were generated by Deepfake, which combined facial motion (2) and shape (2), creating four combinations of these stimuli. At the start of the experiment, a fixation point was displayed for 1 s, after which the generated dynamic face stimulus was visually presented for 10 s. The observer identified whether the stimulus motion seen was that of a paired friend or self, and responded at the keyboard. The fixation point was again displayed and the trial was repeated. Experiment 1 was completed after a total of 24 trials. Experiment 2 followed a similar procedure, but the conditions (stimuli) were subdivided into 22 conditions, combining motion (2) and shape (11), and the total number of trials was 220 (22 conditions × 10 trials). In addition, the stimulus was visually presented for 2 s in Experiment 2. Individuals depicted in this figure have provided their informed consent for the use of their images.

Procedure

Figure 1 shows the flow of the procedure. In the self/friend motion identification task, stimuli were visually presented on a monitor (GL2480-B) and controlled by a notebook PC (raytrek R5-TA6). Each participant sat on a chair in a dark room with the monitor in front of them. The viewing distance from the monitor was approximately 60 cm. After the space key was pressed, a fixation point was first displayed in the center of the monitor for 1.0 s, and then the face stimulus was displayed for 10 s. After the end of the face stimulus, the participant was asked to identify whether the face they observed had ‘self-motion’ or ‘friend motion’ and responded with the keyboard. If it was the former, the participant pressed ‘J’ on the keyboard; if it was the latter, the participant pressed ‘F’ on the keyboard. After the choice was made, the fixation point was displayed again, and the next trial was repeated in the same way. The order of stimulus presentation was completely randomized, and the total number of trials was 24 (4 conditions × 3 patterns × 2 trials). Moreover, participants were instructed in advance that the shape of the face could not be used to judge self or friend, and that the identification during the task should be related to the observed motion.
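The trial structure described above (4 conditions × 3 patterns × 2 trials, in randomized order, with ‘J’ for self-motion and ‘F’ for friend motion) can be sketched as follows. This is only an illustration of the design, not the presentation software used in the study; the function names, the tuple layout, and the seeding are assumptions.

```python
import random
from itertools import product

SHAPES = ("Shape-Friend", "Shape-Self")
MOTIONS = ("Motion-Friend", "Motion-Self")
PATTERNS = (1, 2, 3)   # three 10 s clips of the same long phrase
REPEATS = 2            # each clip is shown twice per condition


def build_trials(seed=None):
    """Return a fully randomized list of (shape, motion, pattern) trials."""
    trials = [
        (shape, motion, pattern)
        for shape, motion, pattern in product(SHAPES, MOTIONS, PATTERNS)
        for _ in range(REPEATS)
    ]
    random.Random(seed).shuffle(trials)
    return trials


def is_correct(trial, key):
    """Score a key press: 'J' means self-motion, 'F' means friend motion."""
    _, motion, _ = trial
    answer = "Motion-Self" if key.upper() == "J" else "Motion-Friend"
    return answer == motion


trials = build_trials(seed=0)
print(len(trials))
```

Scoring depends only on the motion label of the trial, which mirrors the instruction that the judgement should concern the observed motion and not the face shape.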

Results of self or friends’ motion identification

Figure 2 shows the results of the motion identification probabilities for Experiment 1. The ‘motion identification probability’ is defined as the proportion of correct identifications of the observed face motion. The average motion identification probabilities were calculated for each of the four conditions combining shape (Shape-Friend and Shape-Self) and motion information (Motion-Friend and Motion-Self). A two-factor repeated measures ANOVA was conducted with the factors face ‘Shape’ (two levels) and face ‘Motion’ (two levels). There was no significant main effect of the Shape factor [F(1,15) = 1.489, p = 0.241, ηp² = 0.090] or of the Motion factor [F(1,15) = 0.328, p = 0.575, ηp² = 0.021]. However, a significant interaction was found [F(1,15) = 7.244, p = 0.017, ηp² = 0.326; see Fig. 2] and thus the simple main effects of the Shape factor at each level of the Motion factor were analyzed. The simple main effects showed no significant difference in the Motion-Friend [p = 0.258] comparison but a significant difference in Motion-Self [Shape-Friend < Shape-Self, p = 0.009]. This indicates that a friend’s facial motion can be identified regardless of whether the shape is Self or Friend. On the other hand, for self-face motion, the identification probability was higher when the applied shape was the self-face and significantly lower when the applied shape was the friend’s face. In addition, the simple main effects of the Motion factor at each level of the Shape factor were also analyzed. The simple main effects showed significant differences in Shape-Friend [Motion-Friend > Motion-Self, p = 0.043] and in Shape-Self [Motion-Friend < Motion-Self, p = 0.042] comparisons. These results indicate that when the Shape and the Motion information are from the same person (congruent), the motion can be easily identified. However, when the Shape and the Motion are incongruent, identification becomes relatively difficult.
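The effect sizes reported with the F statistics above are numerically consistent with partial eta squared, which can be recovered from an F value and its degrees of freedom as ηp² = (F × df_effect) / (F × df_effect + df_error). The short check below is a sketch under that assumption; the function name and the dictionary are illustrative, and only the F values and degrees of freedom are taken from the text.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported for Experiment 1, each with df = (1, 15).
reported_f = {
    "Shape": 1.489,
    "Motion": 0.328,
    "Shape x Motion": 7.244,
}

for effect, f_value in reported_f.items():
    print(effect, round(partial_eta_squared(f_value, 1, 15), 3))
```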

Fig. 2 — Results of the motion identification probability in Experiment 1. The bar graph shows the average identification probability for each combination of shape (2 levels: Friend and Self) and motion (2 levels: Friend and Self). Error bars show the standard error, and ** indicates p < .01.

Experiment 2

Stimuli

As in Experiment 1, face stimuli were generated using Deepfake, followed by morphing between shapes. A schematic of the stimuli used in Experiment 2 is shown in Fig. 3. The ‘ratio of face matching to motion’ refers to the proportion in which the face shape was congruent or incongruent with the motion. For the morphing, each video of the facial stimuli with a congruent combination of motion was split into frames at 30 fps. The morphing was then performed frame by frame, on the pair of static images with corresponding timing. The video used here was the part corresponding to the last 10 s of the DST. The face regions of each extracted frame were recognized using MediaPipe’s face recognition algorithm, and information on the mesh consisting of 479 landmarks was stored. The stored mesh was geometrically manipulated by Delaunay triangulation and morphed by deforming the corresponding triangles of the two frames as appropriate. During deformation, the contribution of each triangle was controlled with an alpha value and blended. The generated stimuli were thus alpha blends of the self-face shape and the friend’s face shape. In addition, this allowed participants not only to see stimuli where they could clearly judge self-face (or friend’s face) from the shape, but also to observe, in discrete steps, ambiguous face shapes between the two identities.
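The core of the alpha-controlled morph described above can be illustrated with a simplified sketch. It covers only two pieces: linear interpolation of corresponding landmark coordinates and alpha blending of pixel values. Landmark detection, Delaunay triangulation, and the per-triangle affine warp are deliberately omitted, and every name and the toy triangle coordinates are hypothetical.

```python
def blend_landmarks(points_a, points_b, alpha):
    """Interpolate two landmark sets; alpha=0 returns A, alpha=1 returns B."""
    if len(points_a) != len(points_b):
        raise ValueError("landmark sets must have the same length")
    return [
        ((1 - alpha) * xa + alpha * xb, (1 - alpha) * ya + alpha * yb)
        for (xa, ya), (xb, yb) in zip(points_a, points_b)
    ]


def blend_pixels(pixel_a, pixel_b, alpha):
    """Alpha-blend two RGB pixels sampled from corresponding triangles."""
    return tuple(
        round((1 - alpha) * ca + alpha * cb)
        for ca, cb in zip(pixel_a, pixel_b)
    )


# Eleven morph levels from 0% to 100% in 10% steps.
ALPHAS = [step / 10 for step in range(11)]

# Toy example: one triangle from each face (made-up coordinates).
tri_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
tri_b = [(2.0, 1.0), (14.0, 0.0), (0.0, 12.0)]
halfway = blend_landmarks(tri_a, tri_b, 0.5)
print(halfway)
```

In a full pipeline, each source triangle would be warped onto the interpolated triangle before the pixel blend is applied, once per frame pair.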

Fig. 3 — Image regarding the morphing stimuli in Experiment 2: the top and bottom rows of the figure show the view of ‘ratio of shape matching to motion’ from the perspective of the respective participants (participants A and B). For example, 0% are stimuli with shape incongruent to the motion (Motion-A_Shape-B or Motion-B_Shape-A). Conversely, 100% are stimuli whose shape is perfectly congruent to the motion (Motion-A_Shape-A or Motion-B_Shape-B). Between these 0% and 100% stimuli, morphing was applied in 11 steps of 10% to the shape whilst maintaining the motion information. Therefore, the direction of shape steps between the set of morphing stimuli based on Motion-A (top row) and those based on Motion-B (bottom row) had an opposite relationship. Individuals depicted in this figure have provided their informed consent for the use of their images.

Procedure

The procedure was similar to that of Experiment 1, but some changes were made in the experimental design. In Experiment 1, the face stimuli were visually presented for 10 s, but this was changed so that they were visually presented for 2 s in Experiment 2. This was because it was judged that 2 s of visual presentation was sufficient time to identify the motion shown. The 2 s clips were randomly selected from the 10 s of the original face stimulus video, and the stimuli were presented in such a way that they were always in the middle of a face movement (facial stimuli that did not move for 2 s were not present during the experiment). There were 22 experimental conditions (2 Motions × 11 Shapes) and the total number of trials was 220 (22 conditions × 10 trials).
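The Experiment 2 design (2 Motions × 11 Shapes × 10 trials, each showing a 2 s window from a 10 s clip) can be sketched as below. The uniform draw of the window start, the shuffled order, and all names are assumptions made for illustration; the check that excluded motionless windows is not modelled here.

```python
import random
from itertools import product

MOTIONS = ("Motion-Friend", "Motion-Self")
SHAPE_RATIOS = [step / 10 for step in range(11)]   # 0.0, 0.1, ..., 1.0
REPEATS = 10
CLIP_S, WINDOW_S, FPS = 10.0, 2.0, 30


def build_trials(rng):
    """Return shuffled trials, each with a random 2 s window in the clip."""
    window_frames = int(WINDOW_S * FPS)
    last_start = int((CLIP_S - WINDOW_S) * FPS)
    trials = []
    for motion, ratio in product(MOTIONS, SHAPE_RATIOS):
        for _ in range(REPEATS):
            # Assumption: the start frame is drawn uniformly at random.
            start = rng.randrange(last_start + 1)
            trials.append({
                "motion": motion,
                "shape_ratio": ratio,
                "start_frame": start,
                "end_frame": start + window_frames,
            })
    rng.shuffle(trials)
    return trials


trials = build_trials(random.Random(0))
print(len(trials))
```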

Results of shape influence in steps

The motion identification probability results for Experiment 2 are shown in Fig. 4. The horizontal axis shows the ratio of shape matching. The average of the motion identification probabilities at each ratio of shape matching was calculated. Regression lines were fitted for the Motion-Friend and Motion-Self conditions. Furthermore, the intercepts, slopes and the ratio of shape matching at an identification probability of 75% (the standard by which a motion can be judged to be approximately accurately identified) were compared. The slope of the regression line for Motion-Friend was 0.23 and the intercept was 0.68 [R² = 0.849]. In the same way, the slope of the regression line for Motion-Self was 0.38 and the intercept was 0.52 [R² = 0.929]. To examine whether the relationship between ratio of shape matching and motion identification differed between the two series (Motion-Friend and Motion-Self), we performed an ANCOVA including the series × ratio of shape matching interaction term. A significant main effect would indicate that the intercepts of the regression lines differ between the two series, and a significant interaction would indicate that the slopes differ. This analysis showed that the slopes of the regression lines for the two motions were significantly different [F(1,18) = 10.266, p < 0.01, ηp² = 0.363]. The stepwise change in motion identification for self-faces was steeper than that for friends’ faces. The results also showed a significant intercept difference between self and friends’ face motion [F(1,18) = 31.998, p < 0.001, ηp² = 0.640]. Indeed, the threshold (ratio of shape matching) at which the motion identification probability exceeded 75% (good recognition) was 0.30 for friends’ faces, compared to 0.60 for self-faces. This means that about 30% congruent shape was required for good face motion recognition of a friend’s face, whereas approximately 60% was needed for good recognition of self-face motion.
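The 75% thresholds quoted above follow from the reported regression lines by solving criterion = intercept + slope × ratio for the ratio of shape matching. A minimal sketch, using only the slopes and intercepts given in the text (the function and variable names are illustrative):

```python
def threshold_ratio(slope, intercept, criterion=0.75):
    """Ratio of shape matching at which a fitted line reaches the criterion."""
    return (criterion - intercept) / slope


friend_threshold = threshold_ratio(slope=0.23, intercept=0.68)
self_threshold = threshold_ratio(slope=0.38, intercept=0.52)
print(round(friend_threshold, 2), round(self_threshold, 2))
```

Both values agree with the approximate 30% and 60% figures in the text to within rounding.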

Discussion

The results of Experiment 1 showed that participants can identify both self and friend’s face from facial motion. In the case of identifying a friend’s facial motion, the identification probability was not significantly different between friend’s face and self-face shape. On the other hand, when identifying self-face motion, recognition was significantly higher when the shape was also self than when a friend’s shape was displayed. In Experiment 1, there were only two types of shape (self or friend) and the shape information was salient during observation. In comparison, in Experiment 2 there was more ambiguity in the shape information (due to morphing between shapes). The results of Experiment 2 show that when identifying a friend’s face motion, the motion of the face could be identified even at a stage when the shape of the face was hardly matched with the motion. However, the identification of self-face motion was found to be significantly influenced by shape congruency, with lower identification probabilities when the shapes were not congruent and higher identification probabilities when the shapes were clearly congruent. It was also shown that self-face motion can be clearly identified only when the shape reaches a level at which it can be judged to be one’s own shape (a morphed face containing approximately 60% self-face shape).

These results suggest that the identification of self-face motion is more shape-dependent than the identification of a well-known friend’s motion. This could be due to differences in the perceptual characteristics of self/friend face recognition. Identification using facial motion is likely to be particularly important when identification by face shape is difficult²⁴. In the present experiments, the participants were informed in advance that they could not make decisions based on the shape, and the task was carried out under these conditions. The facial motions of a friend, which were learned relatively well and viewed on a daily basis, were likely congruent with the stored structural information. On the other hand, in the case of self-face motion, there may be a lack of stored information. Hence, our prediction that when we look at our own faces we observe the shape well, but do not observe enough of the facial motion, is supported by the results of the present study. In other words, we don’t usually see our own face moving and thus we don’t encode this information as well as the shape information. This difference in perceptual characteristics contrasts with the traditional idea of processing self-faces as familiar faces. Instead, self-faces could be recognized similarly to unfamiliar faces with respect to dynamic information. However, the present experimental design did not allow for a clear investigation of which processes led to the final identification.

In this experiment, only two types of face motion were used for motion identification, i.e. the face motion of the self and the face motion of a friend. To further deepen our understanding of how we perceive face motion, we believe that modulation of the motion could be explored by using exaggerated dynamic parameters²⁶⁻²⁸, to investigate more specifically the perception of the motion itself. Other possible methods include the inclusion of non-target stimuli (motions of a person other than self and the target friend) as noise, to look at any general benefit of seeing a face move that goes beyond the learned dynamics of familiar faces. A final suggestion is that we could switch the task, asking participants to categorize identity on shape (self or friend) with congruent or incongruent motion. If motion is less important for self-face, then there should be lower (or even no) costs for identity-incongruent motion, compared with friend-face recognition.

Limitations of Deepfake

The present experiments used advanced stimulus creation with Deepfake. It has been reported that smiling face stimuli generated with Deepfake produce more attenuated emotional processing than a real smiling face²⁹. Thus, especially in experiments that require identification tasks (such as people and facial expressions), the information generated by Deepfake stimuli may give rise to beliefs about whether the information is real or fake. This may lead to hesitation during the task and participants responding differently than they would to natural real images. In these experiments, all stimuli were generated using Deepfake, and no real (as-recorded) images were used. Thus, there was no investigation as to whether participants considered the visually presented stimuli real or fake. Given this, the unnatural nature of the stimuli created by Deepfake may have influenced decisions during identification. The influence of this unnaturalness on judgments could not be investigated in the present experiments.

Conclusion

In this paper, we investigated the shape dependence of face motion identification for self and friend’s faces. Deepfake was used to separate the shape and motion of the faces (self and friends) as independent information, and to generate stimuli by swapping (Experiment 1) or manipulating (Experiment 2) the shape information, while maintaining the motion. In Experiment 1, the motion identification probabilities were obtained for four combinations of face shape and motion for self and friends’ faces. The results showed that, compared with friend-face motion, the identification probability of self-face motion differed depending on whether the shape and motion were congruent or incongruent. In Experiment 2, a similar identification task to Experiment 1 was performed using stepwise shape changes (interpolation by spatial morphing) to further investigate the influence of shape. Regression lines were fitted for the self and friend motion to examine the change in identification probabilities with stepwise shape. The slope of the regression line for the self-face motion was steeper than that for the friend-face motion, and the intercept was lower. It was also found that identification was not accurate unless a certain amount of self-shape information could be observed. In other words, it was shown that self-face motion identification depended on shape information, whereas friend-face motion identification did not. These findings support the possibility that the self-face has different perceptual properties from a friend’s face and that ‘familiarity’ in familiar faces exists independently in shape and motion.

Acknowledgements

This work was supported by JSPS KAKENHI (to S.Y.; Grant Number JP25KJ2090, to M.G.K.; Grant Number JP20H00608), and POLA CHEMICAL INDUSTRIES, INC (to M.G.K.).

Author contributions

Yumura, S.: Conceptualization, Methodology, Data collection, Software, Analysis, Writing—original draft. Lander, K.: Analysis, Writing—review & editing. Kamachi, M. G.: Conceptualization, Methodology, Analysis, Writing—review & editing.

Funding

JSPS KAKENHI (Grant Number JP25KJ2090, JP20H00608), and POLA CHEMICAL INDUSTRIES, INC.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Bortolon, C., Lorieux, S. & Raffard, S. Self- or familiar-face recognition advantage? New insight using ambient images. Q. J. Exp. Psychol. 71, 1396–1404 (2018).
2. Carey, S. Becoming a face expert. Philos. Trans. R. Soc. Lond. B Biol. Sci. 335, 95–103 (1992).
3. Bruce, V. & Young, A. Understanding face recognition. Br. J. Psychol. 77, 305–327 (1986).
4. Andrews, T. J. et al. A narrow band of image dimensions is critical for face recognition. Vision Res. 212, 108297 (2023).
5. Sugiura, M. et al. Neural mechanism for mirrored self-face recognition. Cereb. Cortex 25, 2806–2814 (2015).
6. Wójcik, M. J., Nowicka, M. M., Bola, M. & Nowicka, A. Unconscious detection of one’s own image. Psychol. Sci. 30, 471–480 (2019).
7. Kamachi, M. et al. Dynamic properties influence the perception of facial expressions. Perception 42, 1266–1278 (2013).
8. Knight, B. & Johnston, A. The role of movement in face recognition. Vis. Cogn. 4, 265–273 (1997).
9. Lander, K., Bruce, V. & Hill, H. Evaluating the effectiveness of pixelation and blurring on masking the identity of familiar faces. Appl. Cognit. Psychol. 15, 101–116 (2001).
10. Lander, K. & Bruce, V. Recognizing famous faces: Exploring the benefits of facial motion. Ecol. Psychol. 12, 259–272 (2000).
11. O’Toole, A. J., Roark, D. A. & Abdi, H. Recognizing moving faces: A psychological and neural synthesis. Trends Cogn. Sci. 6, 261–266 (2002).
12. Haxby, J. V., Hoffman, E. A. & Gobbini, M. I. The distributed human neural system for face perception. Trends Cogn. Sci. 4, 223–233 (2000).
13. Lander, K., Christie, F. & Bruce, V. The role of movement in the recognition of famous faces. Mem. Cognit. 27, 974–985 (1999).
14. Christie, F. & Bruce, V. The role of dynamic information in the recognition of unfamiliar faces. Mem. Cognit. 26, 780–790 (1998).
15. Alzueta, E., Kessel, D. & Capilla, A. The upside-down self: One’s own face recognition is affected by inversion. Psychophysiology 58, e13919 (2021).
16. Sugiura, M. et al. Self-face recognition in social context. Hum. Brain Mapp. 33, 1364–1374 (2011).
17. Wen, W. et al. Control over self and others’ face: Exploitation and exploration. Sci. Rep. 14, 15473 (2024).
18. Kasahara, S., Kumasaki, N. & Shimizu, K. Investigating the impact of motion visual synchrony on self face recognition using real time morphing. Sci. Rep. 14, 13090 (2024).
19. Shimizu, K., Naruse, S., Nishida, J. & Kasahara, S. Morphing identity: exploring self-other identity continuum through interpersonal facial morphing experience. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. CHI ’23, Article No. 500, 1–15 (Association for Computing Machinery, New York, NY, USA, 2023).
20. Bray, S. D., Johnson, S. D. & Kleinberg, B. Testing human ability to detect ‘deepfake’ images of human faces. J. Cybersecur. 9, tyad011 (2023).
21. Lewis, A., Vu, P., Duch, R. M. & Chowdhury, A. Deepfake detection with and without content warnings. R. Soc. Open Sci. 10, 231214 (2023).
22. Mori, M., MacDorman, K. F. & Kageki, N. The uncanny valley. IEEE Robot. Autom. Mag. 19, 98–100 (2012).
23. Dai, Y., Fei, J., Huang, F. & Xia, Z. Face Omron Ring: Proactive defense against face forgery with identity awareness. Neural Netw. 180, 106639 (2024).
24. Knappmeyer, B., Thornton, I. M. & Bülthoff, H. H. The use of facial motion and facial form during the processing of identity. Vision Res. 43, 1921–1936 (2003).
25. Liu, K. et al. Deepfacelab: Integrated, flexible and extensible face-swapping framework. Pattern Recogn. 141, 109628 (2023).
26. Hill, H. C. H., Troje, N. F. & Johnston, A. Range- and domain-specific exaggeration of facial speech. J. Vis. 5, 793–807 (2005).
27. Westhoff, C. & Troje, N. F. Kinematic cues for person identification from biological motion. Percept. Psychophys. 69, 241–253 (2007).
28. Hill, H. & Pollick, F. E. Exaggerating temporal differences enhances recognition of individuals from point light displays. Psychol. Sci. 11, 223–228 (2000).
29. Eiserbeck, A., Maier, M., Baum, J. & Abdel Rahman, R. Deepfake smiles matter less—The psychological and neural impact of presumed AI-generated faces. Sci. Rep. 13, 16111 (2023).
