Abstract
Automatic facial recognition technology (AFR) is increasingly used in criminal justice systems around the world, yet to date there has not been an international survey of public attitudes toward its use. In Study 1, we ran focus groups in the UK, Australia and China (countries at different stages of adopting AFR), and in Study 2 we collected data from over 3,000 participants in the UK, Australia and the USA using a questionnaire investigating attitudes towards AFR use in criminal justice systems. Our results showed that although participants were broadly aligned in their attitudes and the reasoning behind them, there were some key differences across countries. People in the USA were more accepting of tracking citizens, more accepting of private companies’ use of AFR, and less trusting of the police using AFR than people in the UK and Australia. Our results also showed that support for the use of AFR depends greatly on what the technology is used for and who it is used by. We recommend that vendors and users do more to explain AFR use, including details around accuracy and data protection. We also recommend that governments set legal boundaries around the use of AFR in investigative and criminal justice settings.
Introduction
Biometrics refers to the characteristics of a person which can be used to identify them [1]. The most common forms of biometrics used in law enforcement and other security settings are fingerprints, iris, voice, DNA and face. Over the past decade or more, the use of biometrics has grown rapidly, particularly in investigative and criminal justice settings, often in response to terrorism [2,3]. Facial recognition technology is an increasingly common form of biometrics in use in many different areas of our lives–from unlocking smart devices, to crossing borders, and, increasingly, in security and policing settings.
Automatic facial recognition (AFR) technology is based on algorithms that perform a series of functions, including detecting a face, creating a digital representation–or ‘template’– of the face, and comparing this representation against other images to determine the degree of similarity between them. Here we focus solely on AFR technology which performs two main functions: verification and identification. Verification is an identity confirmation based on a one-to-one comparison of a single stored image, for example on a passport, to another single face image, for example an image taken by an automated border control gate. Identification is a one-to-many (1:N) search of a database, for example a criminal watchlist, to find a match to the target image, which could for example be a CCTV image of someone committing a crime. Verification and identification might be performed by a person or an algorithm or by combinations of one or more persons and algorithms. A detailed discussion of the operational uses of these types of algorithms is provided elsewhere [4,5].
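To make this distinction concrete, the sketch below contrasts the two functions in Python. It is purely illustrative and not a description of any particular vendor's system: it assumes face images have already been converted into fixed-length 'template' vectors by some embedding model, and the similarity measure, threshold value and function names are our own hypothetical choices.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two face templates (higher means more alike)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe: np.ndarray, stored: np.ndarray, threshold: float = 0.6) -> bool:
    """1:1 verification: does the probe match one stored template
    (e.g. a passport photo)? The threshold here is illustrative only."""
    return cosine_similarity(probe, stored) >= threshold

def identify(probe: np.ndarray, watchlist: dict[str, np.ndarray],
             threshold: float = 0.6, top_k: int = 3) -> list[tuple[str, float]]:
    """1:N identification: search a gallery (e.g. a watchlist) and return
    the top-k candidates above threshold, e.g. for a human examiner to review."""
    scores = [(name, cosine_similarity(probe, tmpl)) for name, tmpl in watchlist.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, s) for name, s in scores[:top_k] if s >= threshold]
```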
In this paper we report a study aimed at understanding public opinion towards use of this technology in society, with a focus on how the technology is used in the criminal justice system. Before describing our study, we provide background on: 1) AFR algorithm accuracy; 2) AFR algorithm bias; 3) use and governance of AFR; 4) public opinion of AFR; 5) the current study.
Algorithm accuracy
In recent years, there has been a rapid improvement in the performance of facial recognition algorithms through the use of ‘Deep Convolutional Neural Networks’ (DCNNs; e.g. [6–8]; see [9]). One study tested algorithms made in 2015, 2016 and 2017 and showed a monotonic increase in performance from the oldest (68% accurate) to the newest (96% accurate) [10]. The National Institute of Standards and Technology (NIST) in the USA runs a regular Face Recognition Vendor Test (FRVT), which is a standard test of facial recognition algorithms. The FRVT has consistently reported improvements in algorithm 1:N face identification and now conducts continual testing of algorithms, producing a publicly available ranking of their performance (e.g. [11]). The algorithm currently topping this leaderboard has a false negative rate of around or under 1% in 5 of the 8 tests (and 6.9%, 9.9% and 16.7% in the other three tests), the false negative rate being the percentage of searches that have a match in the system but fail to return that matched image [12]. Overall, false negative rates of the 274 algorithms submitted to this most recent evaluation ranged from 0.15% to 99.99%.
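To make the reported metric concrete, the false negative rate for a 1:N search is the proportion of 'mated' searches (searches whose target really is enrolled in the database) that fail to return the enrolled image. A minimal sketch, with invented counts:

```python
def false_negative_rate(mated_searches: int, missed: int) -> float:
    """FNR = proportion of searches with a true match in the database
    that fail to return that matched image."""
    return missed / mated_searches

# Hypothetical example: 10,000 searches for people who are enrolled,
# of which 90 do not return the enrolled image.
print(false_negative_rate(10_000, 90))  # 0.009, i.e. a 0.9% false negative rate
```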
In the UK, the Data Protection Act 2018 states that any identification ‘decision’ made by an algorithm must be checked by a human. Increasingly, hybrid human-AFR systems are used in 1:N identification settings, where typically the human is used to verify the top matches returned by the algorithm (see [5] section 1.4 for a detailed overview). Combining algorithm and human judgements may yield the highest accuracies for the most challenging conditions, including identification across changes in pose and lighting as well as identification from blurry images and videos [10]. Depending on their design, systems that integrate humans and algorithms can either enhance or reduce accuracy.
Algorithm bias
Demographic biases in AFR have been a cause for concern in recent reports because they contravene the fundamental human right that citizens should be treated equally [13–15]. One might expect algorithms to be free from the biases that humans often show in face recognition. However, it is now known that algorithms also show bias, which can be built in through system design and programming, or through the data and images on which they are trained. For example, face recognition algorithms show an analogue of the Own Race Bias [16–18], whereby humans are typically better at remembering and comparing faces from their own demographic group than faces from another race (see [19] for review).
Another example concerns gender classification. Algorithms trained on datasets which contain mainly lighter-skinned people have been shown to produce gender classification errors of up to 34.7% for darker-skinned females compared to only 0.8% for lighter-skinned males [17]. A preprint, however, tested five commercial facial recognition algorithms and showed that most features used by these algorithms to make identity judgements were unrelated to gender and race [20]. Another study tested four algorithms (one previous generation, and three based on DCNNs) and found that race bias increased with item difficulty, and that equal false acceptance rates for each ethnicity could only be achieved by changing the “decision threshold” for each race [21]. It was therefore possible in this case to eliminate bias, but only by applying a different, and for some groups less strict, acceptance threshold to each race.
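The threshold adjustment described in [21] can be illustrated schematically: given similarity scores for non-matching ('impostor') pairs within each demographic group, a separate decision threshold can be chosen per group so that every group ends up with the same false acceptance rate. The sketch below assumes such score distributions are already available as arrays; the distributions and the 0.1% target rate are invented for illustration.

```python
import numpy as np

def threshold_for_far(impostor_scores: np.ndarray, target_far: float = 0.001) -> float:
    """Choose the decision threshold at which the proportion of non-matching
    pairs scoring above it (the false acceptance rate) equals target_far."""
    return float(np.quantile(impostor_scores, 1.0 - target_far))

# Hypothetical impostor-score distributions for two demographic groups.
rng = np.random.default_rng(0)
group_a = rng.normal(0.30, 0.10, 100_000)   # non-match scores tend to be lower
group_b = rng.normal(0.40, 0.10, 100_000)   # non-match scores tend to be higher

# Equalising false acceptance rates requires a different threshold per group.
print(threshold_for_far(group_a), threshold_for_far(group_b))
```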
In 2019, NIST published the FRVT: Demographic Effects, which describes and quantifies demographic differentials for modern commercially available facial recognition algorithms [18]. This test of over 100 facial recognition algorithms showed large discrepancies between the performance of different algorithms. For example, while some algorithms did not show a race bias, other algorithms falsely identified non-White faces between 10 and 100 times more often than White faces. These results highlight the need for agencies using AFR to know how well their algorithm performs with different faces, and whether the threshold for identification should be kept constant across all faces.
The type of bias described above involves differential accuracy for one demographic group relative to another. However, other types of bias introduced by AFR are also important to consider. For example, when people are presented with prior face identification decisions that have been made by algorithms or humans, this can bias their face matching decisions [22–24]. This suggests that human-algorithm hybrid systems which require the human to verify a decision made by an algorithm may be open to the human biasing their decision in the direction of the algorithm decision. This is likely to amplify any existing biases based on differential accuracy for demographic groups.
Use and governance
AFR has been integrated with CCTV and used by some police forces in the UK, the USA and Australia for a number of years, although to different extents in the different countries [25,26]. There is a lack of reliable information around the first date of AFR use, and the pervasiveness of AFR use in the UK, USA, Australia and China, and so instead of providing such information here, we focus on broad definitions and legal use cases. AFR is typically used by the police to match the digital representations captured by the technology with images present in a database [4]. In theory, this database could contain images of every citizen, or only images of individuals on a ‘watchlist’ [27]. Watchlists are created by authorities and contain information about a person of interest, typically fugitives and those deemed to require close surveillance [27]. Trials of live AFR deployed on city streets by police in the UK have reported high numbers of incorrect matches (i.e., false positives; [27,28]).
The use of images as evidence in legal proceedings tends to be more obviously regulated–by various rules of evidence and procedure (e.g. PACE (1984) in England and Wales)–but these are hardly consistent or principled. Before AFR became quite accurate (in specific conditions) and capable of outperforming humans, most courts allowed jurors to examine images–often of a crime, such as an armed robbery–and to compare them with images of the defendant as well as the appearance of the defendant in court. In many cases this lay comparison was supported by the opinions of police officers–often those investigating the crime–and/or a range of putative experts, sometimes described as facial mappers. These ‘mappers’ originated from a wide range of domains–specialist police, anatomy, IT, photography, military intelligence, art, anthropology–but were unified by their pervasive inattention to validation, accuracy and cognitive bias [29,30].
English courts admitted the opinions of investigating police and those recognised as experts, and allowed the examiners to make claims about similarities as well as categorical identifications [31]. Australian courts, in contrast, prevented police officers from making identifications and recently appear to have deemed the opinions of mappers (at least where offenders are well disguised) inadmissible [32,33]. The disparate jurisdictions of the USA have been influenced by Daubert v Merrell Dow Pharmaceuticals, Inc. [34] and the need for (validity and) reliability, in conjunction with emerging concerns about the forensic sciences [35,36]. However, an express concern with reliability as an admissibility pre-condition has not prevented reliance on a range of mappers. The admissibility of an identification by AFR has yet to be considered by a court in the jurisdictions considered in this survey. With the improved accuracy of the latest generation of algorithms, the admission of the output of an algorithm (as ‘machine testimony’, see [37]) or of combined human/AFR systems can only be a matter of time [38].
The governance of AFR differs greatly across countries. The UK has a surveillance camera commissioner, a government-appointed position, and a surveillance camera code [39]. There is no equivalent in the USA. In Australia, an Identity-matching Services Bill (2019) is being considered by parliament which would allow the use of AFR to assist with identity verification by government and industry for transactions with citizens and customers, and also to identify suspects in criminal investigations. As in other social democracies, human rights and civil liberties organisations have expressed concern about the expanding use of AFR, especially dangers identified in the USA and UK with race and bias [40].
The increased use of AFR, combined with the lack of clear legislation around its use, and the potential for bias and consequential errors, has led to debates around the ethics of gathering face images for training algorithms, and the use of AFR by state and private users [3,41]. There have also been high profile calls for the outright banning of AFR by public interest groups such as banfacialrecognition.com and Big Brother Watch. Recently an independent research institute called for a moratorium on AFR following its survey of public opinion in the UK [13], and The Electronic Privacy Information Center (EPIC) issued an open letter, signed predominantly by organisations in the USA, opposing the use of AFR by private companies as well as governments [42]. As well as calls for bans, there have been several challenges to the use of AFR. In the UK, South Wales Police’s use of AFR was ruled unlawful as a breach of Article 8 of the European Convention on Human Rights [43]. In the USA, a number of cities have placed bans of varying severity on the use of AFR, and legislation currently under consideration (at the time of writing) would prohibit the use of AFR by the Federal Government [44]. In addition, three recent congressional hearings in the USA examined AFR’s impact on civil rights and liberties, transparency in both government and commercial use, and accuracy [45–47].
The London Policing Ethics Panel in 2019 made three recommendations around the use of live AFR: that there should be enhanced ethical governance of policing technology field research trials; that public views on live AFR should be reviewed after the deployment of live AFR; and that there is a need to simplify and strengthen regulation of new identification technologies [15]. Similarly, the UK Information Commissioner gave an opinion on the use of live facial recognition technology by law enforcement in public places, which concluded that the use of live AFR should meet a threshold of strict necessity (for example, to locate a known terrorist, but not to be used indiscriminately to identify suspects of minor crimes), that the government should introduce a code of practice, and that public debate around the use of AFR should be encouraged [14]. Both institutions’ recommendations mention engaging with the public and surveying public opinion, which highlights the need to understand what the public know and think about the use of AFR.
Public opinion
Given the wide use of facial recognition technology in society and recent mass media attention, it is surprising that there are relatively few publicly available surveys of public opinion, and no comparisons of public opinion across different jurisdictions.
A recent survey of public opinion in the UK asked participants about the use of AFR by police, government, and the private sector, in airports, on public transport, in schools, in supermarkets, and by human resources departments in workplaces. This survey showed that 46% of people thought the public should be given the ability to consent to or opt out of the use of AFR, that 55% agreed the government should limit police use of facial recognition technology, and that support for the use of AFR by police (70%) was higher than for use in airports (50%), supermarkets (7%), schools (6%) or at work (4%) [13]. This survey reveals differences in attitudes to the use of AFR by different users and for different use cases. Another survey of Londoners found that support for police use of live AFR was also dependent on the use case, with 81–83% support for serious crimes depending on the nature of the threat, compared to 55% support for minor crimes, and below 50% for nuisance behaviour [15].
In Australia, half of respondents to a public opinion survey believed that the use of AFR in public spaces constituted an intrusion of privacy, but consistent with the UK, there was public support for particular uses of the technology, especially for policing [48]. A recent survey by the Beijing News Think Tank found that over 80% of Chinese people surveyed opposed the use of AFR in commercial zones in Beijing, and 96% were worried about the security of personal information and data [49].
The current study
In the present studies, we sought to explore, understand and compare the attitudes towards AFR of members of the public in Australia, the UK, the USA and China, with emphasis on criminal justice applications of AFR. We focussed particularly on differences in public opinion depending on the people or group that was deploying and using the technology (users) and the specific purpose for which it is being used (use cases). We began with focus groups aimed at finding themes, or common areas of discussion (Study 1) which we then further explored in a large-scale international survey (Study 2).
The conversations in the focus groups (Study 1) fell into three overarching themes: society, technology, and purpose. Participants in all countries (UK, Australia, China) generally spoke about the same things, with notable differences being that people in China spoke more about current uses of AFR, and people in China and Australia thought of AFR as more accurate than people in the UK did. These differences likely reflect different uses of AFR and different reporting in the media across the different countries. Similarly, responses to the questionnaire (Study 2) were broadly consistent across countries (Australia, UK, USA), with some notable differences whereby people in the USA were more accepting of AFR being used to track citizens, and more accepting of use by private companies. Key issues surrounding privacy, trust, and a need for legislative boundaries around the use of AFR came up across both studies, as did differing levels of acceptance depending on who AFR was being used by and for what purpose.
Study 1—Focus groups
Methods
Ethics statement
Both studies presented here were given ethical approval from the University of Lincoln Research Ethics Committee (Project ID 449) in accordance with local and international regulations. All participants gave written or electronic informed consent.
Participants
Focus groups were conducted in the UK, Australia and China. Two focus groups were conducted in each country, with each group comprising between 7 and 11 participants. In total, 58 people took part in the focus groups. The demographic data collected was inconsistent across countries due to a miscommunication within the research team. In Australia, 18 participants (no age data; 10 male, 8 female) took part. All indicated having heard of AFR, and 55% reported feeling they were knowledgeable about AFR prior to the focus groups. In China, 20 participants took part (no age or gender data), 90% had heard of AFR and 20% felt knowledgeable about AFR prior to the focus groups. In the UK, 20 participants took part (mean age 38 years; age range 20–70 years; 3 male, 16 female, 1 no gender response), 90% had heard of AFR and 10% felt knowledgeable about AFR prior to the focus groups. All focus group members were given £30 (or the local equivalent) to compensate them for their time.
Procedure
The focus groups took place between 23rd May 2019 and 12th July 2019. Focus groups were recorded and transcribed. The schedule of questions was translated into Chinese for the two focus groups which were conducted in China. Those focus groups were conducted in Chinese, and the transcripts were translated back into English by the focus group moderators (who are fluent in both Chinese and English) for analysis. Focus group moderators gave prompt information and questions to begin the discussions, and while some of the themes identified, for example ‘accuracy’ and ‘who uses it’, were closely linked to the questions, other themes such as trust and privacy were evident in all focus groups without being explicitly prompted. There were nine prompt questions covering background knowledge of AFR, how participants would feel about it being used in different situations, and accuracy. The full schedule of focus group questions is in S1 File. All sessions lasted no longer than one and a half hours.
Analysis
Two researchers independently coded the transcripts by hand and conducted a collaborative thematic analysis to explore the data and generate key themes. The team followed the six phases of analysis outlined in [50], initially familiarising themselves with the data and generating codes. Potential themes were then identified and reviewed to establish three overarching themes. These overarching themes were identified and named to encompass each of their respective themes and subthemes. All of the themes and subthemes were represented in the analysis, and both researchers were in complete agreement in terms of the themes identified.
Results
The overarching themes, themes and subthemes identified during the thematic analysis are shown in Figs 1–4.
All themes that were identified were common across all countries, but some sub-themes were specific to pairs of countries. Here, we describe all subthemes in turn.
Society—Overarching theme
Fig 2 shows the Society overarching theme with each of its component themes and subthemes.
Privacy theme: Data protection subtheme. Participants were concerned about the storage and sharing of data, including images of their face.
Privacy theme: Big Brother/tracking subtheme. Participants were concerned about the use of AFR to track individuals, and felt that this could normalise surveillance.
Trust theme: Scary/wary subtheme. This subtheme was specific only to the UK and Australia and did not come up in the Chinese focus groups. People were wary of AFR and expressed being scared by it. Participants in the first UK focus group made comments such as “It is just terrifying” and “How it is being used in China I think is absolutely petrifying”, and in one Australian focus group made comments such as “It’s innocent until it’s not”.
Trust theme: Acceptance for use by ‘good’ governments subtheme. Focus group participants in all three countries seemed confident that their own government was ‘good’ and would use the technology responsibly, but were concerned that it should not be used in other countries whose governments they trusted less. One participant in Australia said “It’s like the example that comes to mind is like what’s happening in China there’s obviously a bit of an authoritarian police state emerging or it’s pretty much there. I think in Australia I’d be more comfortable with it because the justice system is (like) just”, and a participant in China said “For example, I think our country may be okay, but for the United States, it may be easier to deepen this kind of judicial bias for those who are [non-White] or marginalized”.
Trust theme: Concerns over misuse by criminals/governments subtheme. Participants in all three countries expressed concern that in the wrong hands, the technology could be misused.
Regulation theme: In court it needs to be used in conjunction with other evidence subtheme. Participants in all three countries felt comfortable with AFR being used as evidence in courts, but all expressed that this should only be used in conjunction with other evidence.
Regulation theme: Cost subtheme. Cost was mentioned in all countries. Participants showed an understanding that the technology would require financial investment from governments/police forces/local authorities to set up and use. There was also concern that without large monetary investment, the systems may not be as accurate as they could be.
Technology—Overarching theme
Fig 3 shows the Technology overarching theme with each of its component themes and subthemes.
Accuracy theme: CCTV image quality subtheme. In all three countries, participants were concerned that AFR would not work well if used with poor quality CCTV images. Particularly in the UK there was a concern that all CCTV images are poor quality. This is not the case; in fact the UK focus groups were conducted in Lincoln, which had recently undergone a CCTV upgrade, with very high quality cameras in use since 2018 [51].
Accuracy theme: Racial/gender bias subtheme. We did ask a specific question about perceptions of any racial bias in AFR (see S1 File), but this topic came up spontaneously before the question was raised with the group. Participants had seen news reports about potential inaccuracies of AFR for non-White people. This was mentioned in the UK and Australian focus groups, but not the Chinese focus groups.
Accuracy theme: Within-person variability subtheme. In all countries, participants were concerned that AFR may not be able to cope with changes in appearance. Makeup and changes of hairstyle were mentioned in all focus groups. This leads on to the next subtheme.
Accuracy theme: Disguise/twins subtheme. Participants in all countries spoke about the use of disguise to evade AFR, and the possibility that AFR would not be able to discriminate between similar looking people, for example twins.
Accuracy theme: More accurate in the future subtheme. This was discussed in both Australian focus groups, but not in the UK or China. This may be because people living in Australia are more attentive to AFR: the proposed Identity-matching Services Bill (2019), which would see AFR used on a national scale, has led to more media coverage of, and perhaps more public interest in, AFR. Participants in Australia, when asked about accuracy, felt that although AFR may not be perfectly accurate at the moment, it would continue to develop and would become more accurate in the future.
Accuracy theme: Dependent on algorithm training subtheme. This was only discussed in our Chinese focus groups, perhaps due to the more widespread use of AFR in China, leading to a greater public knowledge of how the systems work. When asked about accuracy, participants in China were aware that the success of a system depends a great deal on the images with which it is trained.
Public Perception theme: Lack of information subtheme. Participants in all of our focus groups agreed that there is a lack of information in their country about how AFR systems are built, how they are used, and how the data are stored and shared.
Public Perception theme: Negative press subtheme. Focus group participants in the UK and Australia, but not China, stated that they had seen negative press surrounding their country’s use of AFR.
Public Perception theme: Conflation with 1:1 facial recognition subtheme. In all of our focus groups, there was at least one discussion in which participants conflated the 1:N identification use case with 1:1 verification. This is important to note because participants had been given a definition of 1:N identification as opposed to 1:1 identity verification, and so this confusion between the two may reflect a lack of distinction between these two processes in the general public, and in how this technology is described.
Public Perception theme: Fictional subtheme. Perhaps the most surprising subtheme was mentioned only by participants in the UK and Australia, but was mentioned frequently. Some people expressed that they did not really believe AFR was real, and they had only come across it in films or on TV. One UK participant said “I have only really come across it in television programmes…so it has always been something that I kind of almost didn’t really necessarily think was real”, and one participant in Australia asked “Is it already out there?”.
Purpose—Overarching theme
Fig 4 shows the Purpose overarching theme with each of its component themes and subthemes.
Who uses it theme: Acceptance of use to identify criminals subtheme. Participants in all focus groups were mostly accepting of the idea that AFR could be used to identify people who had committed crimes.
Who uses it theme: Variable acceptance of use to identify people irrespective of criminality subtheme. In contrast with the previous subtheme, participants in all focus groups were less accepting of the idea that AFR could be used to identify anyone, irrespective of whether or not they had committed a crime. There was, however, some disagreement on this. In the UK one participant said “You could be innocently out with your family or on your own or anything and then this facial recognition could pick you up”, but another said “I just think if you’re not doing anything wrong then why would you have a problem with it”. These sorts of disagreements came up in all focus groups, and are a regular feature of debates over the expansion of surveillance and crime control in recent decades.
Intentional positives theme: Missing persons/proving innocence subtheme. The intended positives of AFR came up in all focus groups. Most frequently people mentioned that it could be used to locate missing persons, or that if you were accused of a crime at which you were not actually present, it could show that you were somewhere else at the time. These uses, particularly the second, may be of limited value given the constant surveillance of all people in public places that they would require.
Intentional positives theme: Automate police work/free up man hours subtheme. Participants in all focus groups also felt that a positive aspect of AFR is that it could automate some aspects of police work, and that it could free up man hours spent searching for people in CCTV footage.
Current use: It is already in use. Participants in Australia noted that AFR is already in limited use in their country, and that acceptance of use in one scenario could lead to more widespread use, saying things such as “…it’s a slippery slope…”. In China participants spoke of the current widespread use of AFR for payments and for access to University accommodation. Participants in the UK did not speak of current AFR use in their country.
Discussion
In general, the themes covered by members of all six of our focus groups across all three countries were similar. People in all three countries were concerned about privacy, trusting the users of AFR, and thought the use of AFR should be regulated. Notable differences were that participants in China spoke more about current AFR use in their country, and this was not spoken about in the UK. This may reflect differences in use of AFR in these countries (with China reportedly using AFR frequently in many public spaces, and only a small number of UK police forces trialling AFR use), and so shows that our participants were sensitive to AFR use, or perhaps media commentary, in their own country. Other differences were that participants in Australia and China tended to think of AFR systems as being accurate, whereas participants in the UK thought of it as less accurate. This may reflect differences in media reporting, or stories picked up by the press in these countries. Using the themes that had been identified in Study 1, we created specific questions for our large-scale questionnaire. We aimed to survey a large number of people in different countries to gain insight into public attitudes towards the use of AFR in different criminal justice systems.
Study 2—Questionnaire
The questionnaire questions were predominantly derived from the themes that were identified in the focus groups. In addition, we replicated some questions from the Ada Lovelace Institute report [13].
Prior to finalising the questionnaire, we sought feedback from a variety of sources. We initially sought feedback from members of a multidisciplinary group who attended a meeting of academics, police, forensic services and related industries–the Unfamiliar Face Identification Group (UFIG). We surveyed members of UFIG2020 (those attending the 2020 iteration of the group meeting) via an online survey link prior to the meeting. We received 26 responses, with 15 respondents identifying themselves as academics/researchers, 3 as members of the police/forensic services, and 8 as ‘other’, including public servants, biometrics suppliers, and federal government. Respondents were given a list of the intended topics to be covered in the questionnaire and asked to indicate which (if any) they would be interested or uninterested in knowing public responses to. We included space for respondents to make suggestions, but none were made. Our intended topics were looked upon favourably by this group (M = 74% interested in each topic; M = 4% not interested in each topic). The questions were then sent in full to the Ada Lovelace Institute, who acted as an independent body to verify that the specific wording of the questions was unbiased and not leading.
We collected data from Australia, the UK, and the USA. We added the USA here so as to compare three Western, English-speaking countries which all use AFR to different extents. Initially we had intended to collect data from China, but found we could not access participants in China through either of the data collection websites we used (MTurk and Prolific.co). We collected data from participants in India via MTurk, but do not report those data here due to concerns about data quality and internal consistency of responses on reverse-coded questions (specifically questions 13–15).
Methods
Participants
Australia: We collected data from people currently living in Australia via a combination of MTurk (all Mturk participants in all countries were given USD$1.14 or local currency equivalent as compensation for their time), Prolific.co (all Prolific.co participants in all countries were given GBP£1.00 or local currency equivalent as compensation for their time), and a mailing list of interested participants maintained by the University of New South Wales (mailing list participants volunteered without monetary compensation for their time). The final sample consisted of 1001 participants (557 female, 439 male, 5 other or not disclosed; mean age 40 years, age range 16–82 years; 532 White, 174 non-White, 295 not disclosed). Note participants were coded as White or non-White according to their response to a free text entry question asking “What is your ethnicity?” Participants who entered a country of residence, or another response from which their ethnicity could not be ascertained (e.g. “Australian”) were not included in the ethnicity breakdown of results (see data in Supporting Information S4–S6 Files). This was applied to data from all three countries.
UK: We collected data from people currently living in the UK via a combination of MTurk and Prolific.co. The final sample consisted of 1107 participants (620 female, 483 male, 4 other or not disclosed; mean age 34 years, age range 16–82 years; 793 White, 143 non-White, 171 not disclosed).
USA: We collected data from people currently living in the USA via MTurk. The final sample consisted of 1016 participants (432 female, 579 male, 5 other or not disclosed; mean age 38 years, age range 19–74 years; 700 White, 252 non-White, 64 not disclosed).
Procedure
The questionnaire took around ten minutes to complete and was split into five sections: 1) background knowledge; 2) use; 3) trust; 4) use in court; 5) accuracy. Each section contained multiple questions aimed at addressing different aspects of people’s attitudes towards AFR. Question text is given in S2 File, and full questions and data are available in the data in S4–S6 Files. This includes details for each question as to whether multiple responses were allowed. Participants were not given the option to skip any questions, but ‘other’ or ‘I don’t know’ answer options were included. All questions which required participants to rate agreement with a statement or rate trust or comfort etc. used a 6-point scale. Previous research has suggested that including a midpoint in a scale can lead to over-selection of that middle option [52], particularly in non-Western populations [53]. This was particularly important as we had initially intended to have responses from India and China as well as Australia, the UK and the USA. Therefore we did not use a scale midpoint, allowing us to split data into those who did not agree/trust etc (those responding 1, 2 or 3) and those who did agree/trust etc (those responding 4, 5 or 6) without any loss of data. All such questions were presented as a six-point scale with anchoring statements only on the first and last points, for example “To what extent do you agree with facial recognition technology being used by the police in your country in their day to day policing? Please answer using the scale 1 do not agree at all to 6 strongly agree”. We also included data quality/screening checks (see S3 File for full details). All data were collected simultaneously across all countries between 28th December 2019 and 29th January 2020.
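As a concrete illustration of this split (a sketch of the scoring logic only, not the analysis code actually used), each 6-point response can be collapsed into a binary agree/do-not-agree value with no loss of respondents:

```python
def binarise(response: int) -> bool:
    """Collapse a 6-point rating into agree/trust (4-6) vs not (1-3)."""
    if not 1 <= response <= 6:
        raise ValueError("responses must be on the 1-6 scale")
    return response >= 4

# Illustrative responses only: half of these count as agreement.
responses = [1, 2, 3, 4, 5, 6]
agree_share = sum(binarise(r) for r in responses) / len(responses)
print(agree_share)  # 0.5
```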
Results and discussion
The full data can be accessed in S4–S6 Files. There the data are broken down by age group, sex, ethnicity, region (urban/rural), and educational level. Here we present only the total responses across all participants in order to compare across the three countries (Australia, UK, USA). Data for key comparisons were analysed using z tests, which, while appropriate for independent samples (e.g. differences between responses from different countries), give more conservative results when used with related samples (e.g. differences between all participants’ responses to different use cases).
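For reference, a two-proportion z test of the kind used for the between-country comparisons can be sketched as follows; the counts in the example are invented, and the exact implementation used for the analyses reported here is not specified in the text.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """z statistic and two-tailed p value for the difference between two
    independent proportions, using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 320 of 1016 USA respondents vs 202 of 1001 Australian
# respondents agreeing with a given use case.
print(two_proportion_z_test(320, 1016, 202, 1001))
```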
Although there was broad agreement across countries, responses from participants in the UK and Australia were more similar to each other, with responses from the USA differing in some interesting ways. The questionnaire data are broken down into five sections: 1) background knowledge; 2) use; 3) trust; 4) use in court; 5) accuracy. We will address the main points from each of these sections in turn.
Section 1: Background knowledge
Most importantly from this section, a mean of 90.93% of participants stated that they were aware of facial recognition as a method of identity verification (Q1 Australia: M = 95.41%, UK: M = 93.13%, USA: M = 84.25%); 28.39% stated that they currently use facial recognition as a method of identity verification (Q2 Australia: M = 30.37%, UK: M = 32.25%, USA: M = 22.54%); and 42.59% stated that in an ideal world, they would like to rely on facial recognition as a method for identity verification (Q3 Australia: M = 47.55%, UK: M = 42.82%, USA: M = 37.40%). In all three countries, fingerprints (M = 60.52%) and passwords (M = 54.73%) were the most popular forms of identity verification in an ideal world.
Prior to indicating their knowledge of AFR, participants were presented with the following description, adapted from the Ada Lovelace Institute’s questionnaire [13]: “Facial recognition technology is a biometric system which aims to identify or observe individuals by detecting features associated with a human face. A digital representation of the face is created which can then be compared against a database of stored images. This digital representation may be used to infer characteristics of individuals, and can be matched with similar images to verify a person’s identity or uniquely identify individuals.” Importantly, participants did not indicate that they considered themselves expert in their knowledge or awareness of the adoption of facial recognition systems in their country (Q4), with a mean of 31.01% of participants responding that they either are aware but don’t know anything about it, or are not aware of the use and adoption of facial recognition systems at all. Fig 5 shows the percent of respondents from each country who chose each option in response to this question.
Section 2: Use
In this section we asked questions about different actual and potential uses of AFR a) by the police, b) the government, and c) private companies. For each question, participants were asked to think about the police, government or private companies in their own country. Fig 6 shows responses to questions 5–7.
Participants were generally more accepting of police and government use of AFR than use by private companies. One use case–to track citizens–was presented for all three organisations (police, government, private companies), and across all countries agreement with this was low (M = 22.77%), but higher for police (M = 25.31%) and governments (M = 25.80%) than private companies (M = 17.21%; private companies compared to police z = 7.86, p < .001, df = 6,246; private companies compared to governments z = 8.31, p < .001, df = 6,246). Comparing police and government use for the same use cases, agreement was similarly low for police and government using AFR to track citizens, and to search for people irrespective of whether or not they have committed a crime (police: M = 29.61%, government: M = 27.60%), but agreement was greater for police than for government use of AFR both to search for people who have committed a crime (police: M = 88.86%, government: M = 80.42%) and to search for missing persons (police: M = 86.06%, government: M = 80.25%).
Agreement with police use was high for searching for people who have committed a crime, searching for missing persons, and use in criminal investigations (M = 88.42%), and was lower for use to track citizens, to search for people irrespective of whether or not they have committed a crime, in day-to-day policing (which could be akin to trawling; M = 41.69%), and to automate police work (M = 36.88%).
Agreement with government use was low for tracking citizens, and searching for people irrespective of whether or not they have committed a crime. Just over half of participants agreed that the government should be able to use AFR to verify identity when accessing government websites (M = 55.72%). Agreement was higher with all other government uses of AFR: searching for people who have committed a crime, searching for missing persons, to prevent fraud when applying for identity documents (M = 78.99%), as evidence identifying people in CCTV images in criminal trials (M = 78.37%), and as evidence identifying people in other digital images (e.g. social media) in criminal trials (M = 66.86%).
Agreement with private companies’ use of AFR was generally low for all use cases presented: to track citizens, to track people behaving antisocially (M = 31.31%), to blacklist people who have behaved antisocially (M = 36.71%), and to share data between businesses to blacklist people (M = 32.21%).
Interestingly there were clear differences between participants in the USA, and those in both the UK and Australia. More participants in the USA agreed with the use of AFR to track citizens, crucially across all three users (police: USA M = 32.09%, Australia M = 20.18%, z = 6.14, p < .001, df = 2,015; USA, UK M = 23.67%, z = 4.33, p < .001, df = 2,121; government: USA M = 34.94%, Australia M = 18.88%, z = 8.27, p < .001, df = 2,015; USA, UK M = 23.58%, z = 5.77, p < .001, df = 2,121; private companies: USA M = 28.15%, Australia M = 10.39%, z = 10.39, p < .001, df = 2,015; USA, UK M = 13.10%, z = 8.66, p < .001, df = 2,121). More participants in the USA also agreed with the use of AFR to search for people irrespective of whether or not they have committed a crime (police: USA M = 35.24%, Australia M = 27.57%, z = 3.72, p < .001, df = 2,015; USA, UK M = 26.02%, z = 4.62, p < .001, df = 2,121; government: USA M = 37.01%, Australia M = 23.38%, z = 6.74, p < .001, df = 2,015; USA, UK M = 22.40%, z = 7.43, p < .001, df = 2,121).
Section 3: Trust
In this section we asked questions about how comfortable participants felt with police use of AFR, and how much they trusted the police, the government, and private companies in their country to use AFR responsibly.
Prior to answering these questions, participants were given the following description of AFR use: “Facial recognition technology is used by some police forces as a method of identity verification. The police may aim to match the digital representations captured by the technology with images present in a database. This database could contain images of every citizen, or only images of individuals on a ‘watchlist’. Watchlists are created by authorities and contain information about a person of interest, typically those who require close surveillance. Any images stored for this purpose must be accurate, verifiable and held lawfully by the police.” Fig 7 shows responses to questions 8 and 9.
Across all three countries, most people felt comfortable with police using AFR to search for people on a watchlist (M = 75.81%) and a minority felt comfortable for use to search for people who are not on a watchlist (M = 22.13%). Again there were small differences between responses from the USA compared to both Australia and the UK. Fewer participants in the USA were comfortable with the use of AFR to search for people on a watchlist (M = 70.57%) than in Australia (M = 77.72%, z = 3.68, p < .001, df = 2,015) and the UK (M = 79.13%, z = 4.55, p < .001, df = 2,121). Conversely, more people in the USA were comfortable with the use of AFR to search for people who are not on a watchlist (M = 26.67%) than in both Australia (M = 20.58%, z = 3.23, p < .001, df = 2,015) and the UK (M = 19.15%, z = 4.12, p < .001, df = 2,121). This is consistent with data from questions 5–7 on AFR use which showed that participants in the USA were more willing for AFR to be used to track citizens irrespective of whether they had committed a crime. Fig 8 shows responses to questions 10–12.
Across all three countries, trust was highest for the police (M = 58.37%), then the government (M = 42.93%), and lowest for private companies (M = 17.50%; police compared to government z = 12.35, p < .001, df = 6,246; government compared to private companies z = 22.77, p < .001, df = 6,246). Although the majority of people in all countries trusted the police to use AFR responsibly (i.e. more than 50% of respondents gave trusting responses of 4, 5 or 6 on the scale rather than 1, 2 or 3), these numbers are not very high. For governments, in all countries the majority of participants did not trust their government to use AFR responsibly (i.e. the percentages presented in Fig 8 for trusting governments are all under 50%). In all three countries, a similar proportion of respondents indicated that they trust the government to use AFR responsibly (Australia: M = 43.86%; UK: M = 41.82%; USA: M = 43.11%, all z < 1, all p > .05), but responses differed for both the police and private companies. In the USA, trust was lower for the police (M = 53.05%) than in Australia (M = 59.54%, z = 2.94, p = .002, df = 2,015) and the UK (M = 62.51%, z = 4.42, p < .001, df = 2,121), and trust was higher for private companies in the USA (M = 27.36%) than in Australia (M = 12.49%, z = 8.51, p < .001, df = 2,015) and the UK (M = 12.65%, z = 8.55, p < .001, df = 2,121).
Participants were also asked to choose, from a list of reasons, why they did or did not trust each user to use AFR responsibly (full details of all questions and options are given in S2 File). Participants who responded that they did trust the user (responding 4, 5 or 6 on the scale of trust) were only shown options as to why they did trust that user; conversely, respondents who indicated they did not trust that user were only shown options as to why they did not. The most common reasons to trust the police, government, and private companies were “It is beneficial for the security of society”, “I trust [the user] to use the technology ethically” (more common for police and government than private companies), “The benefits to society outweigh any loss of privacy I might experience” (for government and private companies), “It is beneficial to my own personal security” (for government and private companies), and “I generally trust the police” (only for police, across all countries). The most common reasons not to trust these users to use AFR responsibly were “I am concerned about my data being misused” (the most common reason in all countries for all users), “I do not trust that my data will be stored securely”, and “I do not trust [the user] to use the technology ethically” (for government and private companies).
Section 4: Use in court
This section of the questionnaire asked respondents about different use cases of AFR in court. At this point government organisations in the US, UK and Australia use AFR to assist with surveillance, investigations and identification. These background or investigative uses are not always clearly regulated, even though important decisions–such as denial of visas and plea deals–are often based substantially upon them. Different states and even regions in the US and Australia have adopted their own–federally uncoordinated–approaches to the admissibility of expert and ‘machine’ evidence [54].
Fig 9 shows responses to questions 13–15 –“If facial recognition technology were to be used as evidence in court in your country:” Q13 “To what extent do you agree with it being used to secure convictions without other evidence?”, Q14 “To what extent do you agree with it being used to secure convictions in conjunction with other evidence?”, Q15 “To what extent do you agree that it should only be used as a tool to aid investigation and should not be used in court at all?”
Overall agreement was high for the use of AFR in court to secure convictions when it is used in conjunction with other evidence (M = 83.22% across three countries), and lower to secure convictions without other evidence (M = 34.15%) and for use only as a tool to aid investigation and not used in court at all (M = 34.15%; used in conjunction with other evidence compared to both used without other evidence and used only as a tool to aid investigation, both z = 45.42, both p < .001, both df = 6,246). Interestingly, agreement with the last and most conservative statement was higher in the USA (M = 34.15%) than both Australia (M = 29.17%, z = 2.41, p = .008, df = 2,015) and the UK (M = 29.18%, z = 2.46, p = .007, df = 2,121). This may reflect the USA respondents’ relatively lower trust in the use of AFR by the police (as evaluated in Question 10, see Fig 8), and so less acceptance for AFR in court.
Section 5: Accuracy
The final section of the questionnaire asked about public perceptions of the accuracy of AFR. Prior to the first questions in this section, participants were presented with the following statement “Thinking about facial recognition technology which searches for a target person through databases of images containing multiple different people, please answer the following questions.” Fig 10 shows responses to questions 16 and 17.
The majority of respondents in all three countries thought that AFR is accurate at identifying the correct person from a database (M = 74.09%). Only a small majority overall, however, thought AFR is accurate at recognising the same person across changes in appearance (M = 55.62%). Here participants in the UK thought AFR was less accurate (M = 49.68%) than participants in Australia (M = 57.14%, z = 3.44, p < .001, df = 2,106) and the USA (M = 60.04%, z = 4.82, p < .001, df = 2,121).
Question 18 asked participants to indicate how accurate they think AFR is compared to other forms of identification. The data are presented in Fig 11.
The majority of participants in all three countries thought that AFR was less accurate than DNA (M = 82.81%), fingerprints (M = 67.63%) and iris/eye scanning (M = 55.83%). In all three of these comparisons, more participants in the USA than both Australia and the UK thought that AFR was more accurate than the other forms of identification. The majority of participants in all countries thought that AFR was more accurate than eyewitness testimony (M = 63.71%), and the most common response (but not the majority of participants) in each country was to think that AFR and voice recognition are as accurate as each other (M = 48.12%). What may be of note to eyewitness researchers here is that the general public seem to be aware that eyewitness testimony is not always accurate. Also of note is that respondents in the USA seem to see AFR as more in line with DNA and fingerprints as a form of identification. This supports recent research which has shown that both naïve participants and forensic practitioners consider forensic evidence highly reliable, and in fact tend to overestimate its reliability [55].
In response to question 19 which asked “How accurate (in percentage) would this technology need to be in order for you to agree to it being used to identify anyone in society?” (with response choices being given at 10% intervals from 0%-100%), the majority of participants responded 90 or 100% accurate (M = 77.00%, see Fig 12).
Questions 20 and 21 asked participants whether they thought that AFR is equally accurate with people of different genders and races (see Fig 13).
Overall, the most common response to the gender question was “I don’t know” (M = 43.17%), although more participants in the USA responded “Yes” (M = 49.70%) than in either Australia (M = 36.06%, z = 6.25, p < .001, df = 2,015) or the UK (M = 35.59%, z = 6.63, p < .001, df = 2,121). For the ethnicity question, the most common response in Australia was “I don’t know” (M = 35.56%), but the most common response in the UK was “Yes” (M = 37.04%), and this was even more common in the USA (M = 49.11%). The relatively high proportions of “I don’t know” responses here seem to reflect uncertainty about the potential limitations of AFR. This tallies with the data from Question 4, which asked “How aware are you of the use and adoption of facial recognition systems in your country?”, to which the most common response in each country was “I am aware and I know a little about it”.
Finally, as a thought experiment, questions 22 and 23 asked participants “If this technology was more accurate with, for example, White than non-White people, to what extent do you agree:” both “With its use”, and “With accuracy being reduced for White people in order to make it more equal” (see Fig 14). Having different acceptance rates for different races is a legitimate technical solution to the problem of bias (see [21]), and so it is of interest to find out whether the public would see altering the accuracy of the algorithm for different races as an acceptable solution.
The majority of people in all three countries did not agree with the use of AFR if it was more accurate with White than non-White people (M = 37.18%), or with accuracy being reduced for White people in order to make it more equal (M = 24.82%). Interestingly, more people in the USA agreed with accuracy being reduced for White people in order to make it more equal (M = 41.04%) than in Australia (M = 17.98%, z = 11.74, p < .001, df = 2,015) or the UK (M = 20.14%, z = 10.67, p < .001, df = 2,121).
Here we have shown that there is broad agreement between people in Australia, the UK and the USA in their attitudes towards the use of AFR in the criminal justice system, although there are some interesting specific differences whereby people in the UK and Australia tended to have more in common. For example, people in the USA were more accepting of the use of AFR to track citizens, less trusting in the police, and more trusting in private companies to use AFR responsibly. Support for the use of AFR depends on both the user and the use case.
General discussion
Our studies have resulted in four key findings about public attitudes towards the use of AFR in the criminal justice system in different countries:
There is a great deal of overlap in public attitudes to AFR in the UK, Australia and China (Study 1) and in the UK, Australia and the USA (Study 2), but people in the USA and China are generally more positive about AFR than people in the UK and Australia.
Public support for the use of AFR depends on the user and the use case. There is more support for police use than private sector use, but even support for the police depends on the use case. Notably, the relative support for particular uses (and users) varied somewhat even between different Western societies (Study 2).
Use of AFR, especially in criminal justice settings, appears contingent on accuracy. Respondents expressed concern about relying solely on identification by an algorithm. Use of AFR would seem to require attention to validation for specific uses, extending to demographic classes of individuals.
There is some confusion among the public about the accuracy of AFR.
Based on our data, we recommend that developers, system designers, vendors, and users of AFR do more to publicise the use, data privacy, and accuracy of AFR, that it is important for users of AFR to justify their use case and know the capacity of their system, and that governments should provide clear legislation for the use of AFR in criminal justice systems around the world.
The studies presented here have shown that overall, the attitudes of people in the UK, Australia and China (Study 1) and the UK, Australia and the USA (Study 2) are broadly similar when it comes to the use of AFR in criminal justice systems. Some interesting differences between countries did, however, arise at key points. In Study 1, participants in the UK and Australia were more sceptical of the accuracy of AFR, and reported feeling that it was something they see on TV but is not really used in real life. Participants in the UK and Australia, but not China, also had a generally more negative view of AFR and mentioned negative press reports. In addition, participants in the UK and Australia, but again not China, reported being concerned about biases, particularly around the use of AFR with non-White faces, again fuelled by negative press reports.
In Study 2, participants in the USA were more supportive of the use of AFR to track citizens, and to search for people irrespective of whether they had committed a crime, than participants in the UK and Australia. Participants in the USA also indicated less trust in the police, but more trust in private companies using the technology responsibly, than participants in the UK and Australia. Our results are broadly consistent with some other recent surveys of public opinion, but expand on these reports in important ways by asking specific questions about different use cases of AFR, and by comparing public opinion in different countries. To date, other surveys have focused exclusively on a single country [13,15,49,56].
It is worth noting that we of course collected data only from a small subset of people in each country, and that this group of people self-selected to take part. It has been suggested that data collected from online samples may not generalise to the whole population; even so, such results may be useful for indicating the direction, if not the magnitude, of responses [57]. Other research has suggested that MTurk yields high-quality data [58] and that MTurk responses, specifically regarding security and privacy, are more representative of the population of the USA than responses from a census-representative panel [59]. Therefore, although our data may not be representative of the entire population of each country surveyed, we consider extrapolating from these data and making some recommendations based upon them to be an important and worthwhile task.
Our key finding here is that support for the use of AFR depends on the user and the use case. This can be seen clearly in responses to our questions 5–7 in Study 2 (see Fig 6). Support was highest for police use of AFR, then government, then private companies, and within these users, support varied for different use cases. This is consistent with findings from the Ada Lovelace Institute’s review of public perception in the UK, which found that 67% of respondents were comfortable (answering 6–10 on a scale of 1–10) with police using AFR to search for suspects in the national police database, whereas only 19% of people were comfortable with the use of AFR in shops to track customers [13]. The relatively high support for police use is also consistent with Australian public attitudes [48] and with a review of Australians’ attitudes towards the use of artificial intelligence in general (not specifically relating to facial recognition), which showed higher support for the use of artificial intelligence in security and justice for public sector users (66.6%) than for private sector users (60.7%) [56].
In our study, agreement was high for the use of AFR to search for people who have committed a crime (police: M = 88.86%; government: M = 80.42%) and for people on a watchlist (M = 75.81%), and agreement was higher than in the Ada Lovelace Institute study [13] for the use of AFR by private companies to track people behaving antisocially (M = 31.31%). Our results are closer to those of the London Policing Ethics Panel’s survey of Londoners’ attitudes towards the use of live AFR by police, in which 81% agreed that live AFR should be used to scan crowds at train stations to identify people wanted by the police for serious violent crimes [15]. Interestingly, our results showed that acceptance of private company use of AFR was higher in the USA than in Australia or the UK. This may reflect a more general trust in private industry in the USA than in Australia or the UK.
The higher agreement rates in our study and in the London Policing Ethics Panel’s survey [15] are likely due to the specificity of the use cases included in the questionnaires. Our study and the London Policing Ethics Panel’s survey [15] targeted AFR use in the criminal justice system specifically, whereas the Ada Lovelace Institute study [13] asked about facial recognition in all sectors, including schools and workplaces. We therefore had more space to ask more specific questions in our survey. This is important as it allows us to be clear about public support for different use cases. For example, looking at the results of our question 5 (Fig 6, upper panel), which asked about police use of AFR, we can see high agreement for its use in searching for people who have committed a crime, searching for missing persons, and in criminal investigations, but not for tracking citizens, searching for people irrespective of whether they have committed a crime, day-to-day policing, or automating police work. This tallies with the results of our Study 1 ‘intentional positives’ theme within the ‘purpose’ overarching theme, where focus group participants spontaneously brought up the use of AFR to search for missing persons as a positive. Our focus group participants, however, also thought that automating police work would be a benefit of AFR, whereas this was not reflected in the questionnaire responses.
Interestingly, the reasons people gave to justify their trust, or lack thereof, in users of AFR for different use cases were similar here to those reported in other surveys. In our questionnaire (Study 2), among the most common reasons to trust the police, government, and private companies to use AFR responsibly were “It is beneficial for the security of society”, “I trust [the user] to use the technology ethically”, and “The benefits to society outweigh any loss of privacy I might experience”. These sentiments are similar to the two most common views from the London Policing Ethics Panel’s report [15]: “It will make it easier for the police to catch criminals” and “It makes me feel safer”. The implications of mistakes do not appear to be considered here. The most common reasons people in our questionnaire (Study 2) gave for not trusting the use of AFR were “I am concerned about my data being misused”, “I do not trust that my data will be stored securely”, and “I do not trust [the user] to use the technology ethically”. These echo concerns reported in the Ada Lovelace Institute report [13], where 60% of respondents stated “I do not trust them to use the technology ethically”, and in a survey of the public in China which showed that the main reason (over 80%) for concerns about privacy and data security was not knowing who is using the technology [49]. Trust and issues of data protection were key both in our focus groups (Study 1) and in our questionnaire (Study 2).
It is also interesting to note that the Ada Lovelace Institute [13] found that 68% of people were concerned that the use of AFR “normalises surveillance”. Surveillance and the notion of a ‘Big Brother’ state were also frequently mentioned in our focus groups (Study 1), in the ‘privacy’ theme within our overarching theme of ‘society’. From our questionnaire data (Study 2) it is clear that participants did not support the use of AFR for “tracking citizens”, whether used by police, government, or private companies (see Fig 6). The UK’s Data Protection Act 2018 states that the police must only collect biometric data when it is necessary and proportionate, and so AFR is not currently used in the UK to track citizens or to conduct surveillance of the general population [4,27]. It is therefore clear that builders, vendors, and users of AFR should all take responsibility for dispelling popular myths, such as the surveillance myth, by informing the public about the uses of AFR and the ways in which data are stored and shared.
Data from both of the studies presented here show that there is public support for the use of AFR in courts, but reference to the need for other evidence suggests concerns about reliability. This is an important point when considering whether AFR should be admitted, and how it should be presented, in court. There is clearly a need for the use of AFR to be regulated across criminal justice systems. The inclusion of a clear framework for police use of AFR in the codes of practice of the UK’s Police and Criminal Evidence Act (PACE, 1984) would help police forces to make decisions around the use cases in which AFR should be deployed by clarifying what can be seen as necessary and proportionate use. To be clear, it is not the technology itself which should be legislated for, but the use of the technology, and while general governance frameworks provide a useful basis [60], there appears to be a need for specific legislation for the use of AFR in the criminal justice system.
How the various jurisdictional traditions, rules and approaches will apply to AFR is uncertain, though currently unfolding [61]. It seems probable that English courts will be more accommodating than Australian and US courts, following from their acceptance of experts making similarity claims about images. Where AFR and hybrid systems have been formally evaluated, as under the FRVT, and applied to images of reasonable quality, it is likely that similar applications would be admissible in all of these jurisdictions. Issues of design, training, race and bias, as well as validity and reliability, would seem to be matters for the trial and the trier of fact in most jurisdictions. Historically, the British and Americans allowed untested individuals (i.e. mappers) to testify as facial comparison experts [29,30]. It would be curious if formally evaluated AFR systems, with known levels of performance, were not used to assist with identification. It may be that investigative institutions will work around anticipated risks. Mappers and police specialists may, for example, rely on AFR to generate candidate lists and then undertake subjective comparisons, in ways that will produce admissible opinions, at least in the UK and US. There may, though, be obligations to explain the nature of the search and the AFR systems relied upon. Because of the exclusion of mappers, Australian courts would seem set to consider the issue of AFR more directly, and presumably AFR designers and emerging police image specialists will be allowed to testify [62].
Our results also showed some public confusion about the accuracy of AFR. Across our sample, 74.09% of participants thought that AFR is accurate at identifying the correct person from a database, dropping to 55.62% who thought AFR is accurate at recognising the same person across changes in appearance (questions 16 & 17, see Fig 10). This indicates confusion, because an image captured of a suspect and the image of that person in an existing database will themselves differ in appearance. There was also disagreement between participants from different countries on whether AFR is more or less reliable than other types of identification (see Fig 11). This supports recent evidence suggesting that both naïve participants and forensic practitioners consider forensic science evidence highly reliable, and in fact tend to overestimate its reliability [55]. We know from both academic work and standardised testing of algorithms that in the most difficult face recognition situations, and with low-quality images, algorithms can be variable in their ability to correctly identify people [10], but that in ideal conditions most perform very accurately [12]. It is therefore important that users of AFR know the capacity of their system, for example whether it can be reliably used with low-quality images, and it is vital that the capability of systems is communicated to the public by the builders, vendors and users of AFR.
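As a hedged illustration of what “knowing the capacity of the system” could involve in practice, the sketch below estimates a false negative identification rate separately for high- and low-quality probe images from a set of mated searches. This is a hypothetical example: the quality labels, search outcomes, and the split by image quality are assumptions for illustration, not part of our study or of any particular vendor’s tooling.

```python
# Hypothetical sketch: estimating false negative identification rates (FNIR)
# by probe image quality, so an AFR user can see how the system performs
# under its intended conditions of use. The records below are invented;
# in practice each record would be one mated search against a watchlist.
from collections import defaultdict

# Each tuple: (probe image quality, True if the mated identity was returned
# at or above the operating threshold, False if it was missed).
searches = [
    ("high", True), ("high", True), ("high", True), ("high", False),
    ("low", True), ("low", False), ("low", False), ("low", True),
]

totals, misses = defaultdict(int), defaultdict(int)
for quality, returned in searches:
    totals[quality] += 1
    if not returned:
        misses[quality] += 1

for quality in sorted(totals):
    fnir = misses[quality] / totals[quality]
    print(f"{quality}-quality probes: FNIR = {fnir:.1%} "
          f"({misses[quality]}/{totals[quality]})")
```

Reporting error rates broken down in this way, for the image conditions a system will actually face, is one concrete way users could validate and communicate the capacity of their system.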
The capabilities of AFR seem to be decoupled from both public understanding and extant policy. Here we have shown that although there are some specific differences in people’s attitudes towards AFR in different countries, attitudes, as well as the reasoning behind them, are broadly consistent across the UK, Australia, China and the USA. Support for the use of AFR depends on the user and the use case, and there is only broad support for the use of AFR to secure convictions when it is used in conjunction with other evidence. In addition, there is some confusion around the accuracy of AFR. We recommend a more concerted effort by vendors and users to explain AFR capabilities and use cases to the public, as well as how data are stored and shared. We also recommend that users of AFR know the capabilities of their system, and that governments legislate for the use of AFR in the criminal justice system.
Supporting information
Data Availability
All relevant data are within the manuscript and its Supporting Information files.
Funding Statement
This work was supported by the British Academy [IC3\100055] awarded to KLR, RSSK, KG, DW, GE, MSR and KAM (https://www.thebritishacademy.ac.uk/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. The Biometrics Institute. Understanding biometrics. 2018 Sept 18 [Cited 2021 May 25]. Available from: https://www.biometricsinstitute.org/wp-content/uploads/Understanding-Biometrics-Guide-WIP-Sept-2018-1.pdf.
- 2. Lyon D. Biometrics, identification and surveillance. Bioethics. 2008 Nov;22(9):499–508. doi: 10.1111/j.1467-8519.2008.00697.x
- 3. Mann M, Smith M. Automated facial recognition technology: Recent developments and approaches to oversight. UNSWLJ. 2017;40:121.
- 4. Centre for Data Ethics and Innovation. Snapshot series: Facial recognition technology. 2020 May [Cited 2021 May 25]. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/905267/Facial_Recognition_Technology_Snapshot_UPDATED.pdf.
- 5. White D, Towler A, Jeffery L, Kemp R, Palermo R, Ballantyne K, et al. Evaluating face identification expertise: Turning theory into best practice. 2020 August [Cited 2021 May 25]. Available from: https://socialsciences.org.au/workshop/evaluating-face-identification-expertise-turning-theory-into-practice/.
- 6. Cao Q, Shen L, Xie W, Parkhi OM, Zisserman A. VGGFace2: A dataset for recognising faces across pose and age. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018); 2018 May 15. pp. 67–74. IEEE.
- 7. Kemelmacher-Shlizerman I, Seitz SM, Miller D, Brossard E. The MegaFace benchmark: 1 million faces for recognition at scale. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016. pp. 4873–4882.
- 8. Taigman Y, Yang M, Ranzato MA, Wolf L. DeepFace: Closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2014. pp. 1701–1708.
- 9. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015 May;521(7553):436–44. doi: 10.1038/nature14539
- 10. Phillips PJ, Yates AN, Hu Y, Hahn CA, Noyes E, Jackson K, et al. Face recognition accuracy of forensic examiners, superrecognizers, and face recognition algorithms. Proceedings of the National Academy of Sciences. 2018 Jun 12;115(24):6171–6. doi: 10.1073/pnas.1721355115
- 11. Phillips PJ. A cross benchmark assessment of a deep convolutional neural network for face recognition. In: 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017); 2017 May 30. pp. 705–710. IEEE. doi: 10.1109/FG.2017.89
- 12. National Institute of Standards and Technology (NIST). FRVT 1:N Identification. 2021 April 16 [Cited 2021 May 25]. Available from: https://pages.nist.gov/frvt/html/frvt1N.html.
- 13. Ada Lovelace Institute. Beyond face value: Public attitudes to facial recognition technology. 2019 Sept 2 [Cited 2021 May 25]. Available from: https://www.adalovelaceinstitute.org/beyond-face-value-public-attitudes-to-facial-recognition-technology/.
- 14. The Information Commissioner’s Office. Information commissioner’s opinion: The use of live facial recognition technology by law enforcement in public places. 2019 Oct 31 [Cited 2021 May 25]. Available from: https://ico.org.uk/media/about-the-ico/documents/2616184/live-frt-law-enforcement-opinion-20191031.pdf.
- 15. London Policing Ethics Panel. Final report on live facial recognition. 2019 May [Cited 2021 May 25]. Available from: http://www.policingethicspanel.london/uploads/4/4/0/7/44076193/live_facial_recognition_final_report_may_2019.pdf.
- 16. Phillips PJ, Jiang F, Narvekar A, Ayyad J, O’Toole AJ. An other-race effect for face recognition algorithms. ACM Transactions on Applied Perception (TAP). 2011 Feb 2;8(2):1–1. doi: 10.1145/1870076.1870082
- 17. Buolamwini J, Gebru T. Gender shades: Intersectional accuracy disparities in commercial gender classification. In: Conference on Fairness, Accountability and Transparency; 2018 Jan 21. pp. 77–91. PMLR.
- 18. Grother P, Ngan M, Hanaoka K. Face Recognition Vendor Test (FRVT) Part 3: Demographic Effects. National Institute of Standards and Technology (NIST). 2019 December [Cited 2021 May 25]. Available from: https://nvlpubs.nist.gov/nistpubs/ir/2019/NIST.IR.8280.pdf.
- 19. Meissner CA, Brigham JC. Thirty years of investigating the own-race bias in memory for faces: A meta-analytic review. Psychology, Public Policy, and Law. 2001 Mar;7(1):3. doi: 10.1037/1076-8971.7.1.3
- 20. Howard JJ, Sirotin YB, Tipton JL, Vemury AR. Quantifying the extent to which race and gender features determine identity in commercial face recognition algorithms. arXiv:2010.07979 [Preprint]. 2020 [Cited 2021 May 25]. Available from: https://arxiv.org/abs/2010.07979.
- 21. Cavazos JG, Phillips PJ, Castillo CD, O’Toole AJ. Accuracy comparison across face recognition algorithms: Where are we on measuring race bias? IEEE Transactions on Biometrics, Behavior, and Identity Science. 2020 Sep 29;101–111. doi: 10.1109/TBIOM.2020.3027269
- 22. Fysh MC, Bindemann M. Human–computer interaction in face matching. Cognitive Science. 2018 Jul;42(5):1714–32. doi: 10.1111/cogs.12633
- 23. Heyer R. Technology and cognitive bias. Wiley Encyclopedia of Forensic Science. 2009 Sep 15:1–6. doi: 10.1002/9780470061589.fsa1116
- 24. Howard JJ, Rabbitt LR, Sirotin YB. Human-algorithm teaming in face recognition: How algorithm outcomes cognitively bias human decision-making. PLoS ONE. 2020 Aug 21;15(8):e0237855. doi: 10.1371/journal.pone.0237855
- 25. Brey P. Ethical aspects of facial recognition systems in public places. Journal of Information, Communication and Ethics in Society. 2004 May 31;2(2):97–109. doi: 10.1108/14779960480000246
- 26. Garvie C, Bedoya A, Frankle J. The perpetual line-up: Unregulated police face recognition in America. Georgetown Law, Center on Privacy & Technology. 2016 Oct 18 [Cited 2021 May 25]. Available from: https://www.perpetuallineup.org/.
- 27. Fussey P, Murray D. Independent Report on the London Metropolitan Police Service’s Trial of Live Facial Recognition Technology. University of Essex Human Rights Centre. 2019 [Cited 2021 May 25]. Available from: http://repository.essex.ac.uk/24946/.
- 28. Davies B, Innes M, Dawson A. An Evaluation of South Wales Police’s Use of Automated Facial Recognition. Cardiff University. 2018 [Cited 2021 May 25]. Available from: https://crimeandsecurity.org/feed/afr.
- 29. Edmond G, Kemp R, Porter G, Hamer D, Burton M, Biber K, et al. Atkins v The Emperor: the ‘cautious’ use of unreliable ‘expert’ opinion. The International Journal of Evidence & Proof. 2010;14(2):146–166. doi: 10.1350/ijep.2010.14.2.349
- 30. Edmond G, Davis JP, Valentine T. Expert analysis: Facial image comparison. In: Forensic facial identification: Theory and practice of identification from eyewitnesses, composites and CCTV. 2015 Jun 5. pp. 239–62. doi: 10.1002/9781118469538
- 31. Attorney-General’s Reference (No. 2 of 2002). England and Wales Court of Appeal (Criminal Division). 2002 Oct 7 [Cited 2021 May 25]. Available from: https://www.casemine.com/judgement/uk/5b46f1ed2c94e0775e7ee3e9.
- 32. Honeysett v The Queen. 253 CLR 122. 2014 [Cited 2021 May 25]. Available from: https://law.adelaide.edu.au/system/files/media/documents/2019-02/ch11-alr-35-2-buckland.pdf.
- 33. Smith v The Queen. 206 CLR 650. 2001 August 16 [Cited 2021 May 25]. Available from: http://netk.net.au/Australia/Smith.asp.
- 34. Daubert v Merrell Dow Pharmaceuticals Inc. 509 US 579. 1993 [Cited 2021 May 25]. Available from: https://supreme.justia.com/cases/federal/us/509/579/.
- 35. Committee on Identifying the Needs of the Forensic Science Community, National Research Council. Strengthening Forensic Science in the United States: A Path Forward. National Academies Press. 2009 August [Cited 2021 May 25]. Available from: https://www.ojp.gov/pdffiles1/nij/grants/228091.pdf.
- 36. The President’s Council of Advisors on Science and Technology. Report to the President: Forensic Science in Criminal Courts: Ensuring Scientific Validity of Feature-Comparison Methods. Washington, DC: President’s Council of Advisors on Science and Technology. 2016 Sept [Cited 2021 May 25]. Available from: https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/PCAST/pcast_forensic_science_report_final.pdf.
- 37. Roth A. Machine testimony. Yale LJ. 2016;126:1972.
- 38. Kemp RI, Edmond G, White D. A proposed solution to the problem of identifying people from CCTV and other images. In: Methods, Measures, and Theories in Eyewitness Identification Tasks. Routledge; 2021 Feb 25. pp. 13–33. doi: 10.4324/9781003138105
- 39. Home Office. Surveillance camera code of practice. 2013 June [Cited 2021 May 25]. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/282774/SurveillanceCameraCodePractice.pdf.
- 40. Human Rights Commission (Australia). Using artificial intelligence to make decisions: Addressing the problem of algorithmic bias (Sydney). 2020 Nov 24 [Cited 2021 May 25]. Available from: https://humanrights.gov.au/our-work/rights-and-freedoms/publications/using-artificial-intelligence-make-decisions-addressing.
- 41. Norval A, Prasopoulou E. Public faces? A critical exploration of the diffusion of face recognition technologies in online social networks. New Media & Society. 2017 Apr;19(4):637–54. doi: 10.1177/1461444816688896
- 42. Open Letter: Banning government use of facial recognition surveillance is not enough, we must ban corporate and private use as well. 2021 [Cited 2021 May 25]. Available from: https://www.fightforthefuture.org/news/2021-04-13-open-letter-banning-government-use-of-facial/.
- 43. Bridges v The Chief Constable of South Wales Police. EWCA Civ 1058. 2020 August 11 [Cited 2021 May 25]. Available from: https://www.judiciary.uk/wp-content/uploads/2020/08/R-Bridges-v-CC-South-Wales-ors-Judgment.pdf.
- 44. Facial Recognition and Biometric Technology Moratorium Act. S.4084, 116th Cong. 2020 [Cited 2021 May 25]. Available from: https://www.congress.gov/bill/116th-congress/senate-bill/4084.
- 45. House Committee on Oversight and Reform. Facial recognition technology (part 1): Its impact on our civil rights and liberties. 2019 May 22 [Cited 2021 May 25]. Available from: https://oversight.house.gov/legislation/hearings/facial-recognition-technology-part-1-its-impact-on-our-civil-rights-and.
- 46. House Committee on Oversight and Reform. Facial recognition technology (part II): Ensuring transparency in government use. 2019 June 4 [Cited 2021 May 25]. Available from: https://oversight.house.gov/legislation/hearings/facial-recognition-technology-part-ii-ensuring-transparency-in-government-use.
- 47. House Committee on Oversight and Reform. Facial recognition technology (part III): Ensuring commercial transparency & accuracy. 2020 Jan 15 [Cited 2021 May 25]. Available from: https://oversight.house.gov/legislation/hearings/facial-recognition-technology-part-iii-ensuring-commercial-transparency.
- 48. Automated Society Working Group. Australian Attitudes to Facial Recognition: A National Survey. 2020 May [Cited 2021 May 25]. Available from: https://www.monash.edu/__data/assets/pdf_file/0011/2211599/Facial-Recognition-Whitepaper-Monash,-ASWG.pdf.
- 49. Borak M. Facial recognition is used in China for everything from refuse collection to toilet roll dispensers and its citizens are growing increasingly alarmed, survey shows. 2021 Jan 27 [Cited 2021 May 25]. Available from: https://www.scmp.com/tech/innovation/article/3119281/facial-recognition-used-china-everything-refuse-collection-toilet.
- 50. Braun V, Clarke V. Using thematic analysis in psychology. Qualitative Research in Psychology. 2006 Jan 1;3(2):77–101. doi: 10.1191/1478088706qp063oa
- 51. Lincolnshire Live. CCTV ‘safe zone’ with 360 degree vision created in Lincoln city centre thanks to £400,000 camera upgrade. 2018 Feb 16 [Cited 2021 May 25]. Available from: https://www.lincolnshirelive.co.uk/news/lincoln-news/cctv-safe-zone-360-degree-1225166.
- 52. Weems GH, Onwuegbuzie AJ. The impact of midpoint responses and reverse coding on survey data. Measurement and Evaluation in Counseling and Development. 2001 Oct 1;34(3):166–76. doi: 10.1080/07481756.2002.12069033
- 53. Si SX, Cullen JB. Response categories and potential cultural bias: Effects of an explicit middle point in cross-cultural surveys. The International Journal of Organizational Analysis. 1998 Mar 1;6(3):218–230. doi: 10.1108/eb028885
- 54. Edmond G, Cole S, Cunliffe E, Roberts A. Admissibility compared: the reception of incriminating expert evidence (i.e., forensic science) in four adversarial jurisdictions. U. Denv. Crim. L. Rev. 2013;3:31–109.
- 55. Martire KA, Ballantyne KN, Bali A, Edmond G, Kemp RI, Found B. Forensic science evidence: Naive estimates of false positive error rates and reliability. Forensic Science International. 2019 Sep 1;302:109877. doi: 10.1016/j.forsciint.2019.109877
- 56. Selwyn N, Cordoba BG, Andrejevic M, Campbell L. AI for social good? Australian public attitudes toward AI and society. Monash Data Futures Institute, Monash University. 2020 August [Cited 2021 May 25]. Available from: https://researchmgt.monash.edu/ws/portalfiles/portal/335664697/332992540_oa.pdf.
- 57. Thompson AJ, Pickett JT. Are relational inferences from crowdsourced and opt-in samples generalizable? Comparing criminal justice attitudes in the GSS and five online samples. Journal of Quantitative Criminology. 2019 Nov 13:1–26. doi: 10.1007/s10940-019-09436-7
- 58. Buhrmester M, Kwang T, Gosling SD. Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality data? Perspectives on Psychological Science. 2011 Jan;6(1):3–5. doi: 10.1177/1745691610393980
- 59. Redmiles EM, Kross S, Mazurek ML. How well do my results generalize? Comparing security and privacy survey results from MTurk, web, and telephone samples. In: 2019 IEEE Symposium on Security and Privacy (SP); 2019 May 19. pp. 1326–1343. IEEE. doi: 10.1109/SP.2019.00014
- 60. Madzou L, Louradour S. Building a governance framework for facial recognition. Biometric Technology Today. 2020 Jun 1;(6):5–8. doi: 10.1016/S0969-4765(20)30083-7
- 61. Garvie C. Garbage in, garbage out: Face recognition on flawed data. Georgetown Law, Center on Privacy & Technology. 2019 May 16 [Cited 2021 May 25]. Available from: https://www.flawedfacedata.com/.
- 62. Edmond G, White D, Towler A, San Roque M, Kemp R. Facial recognition and image comparison evidence: Identification by investigators, familiars, experts, super-recognisers and algorithms. Melbourne University Law Review. Forthcoming;50.