Abstract
Article impact statement: Machine learning can be used to automatically monitor and assess illegal wildlife trade on social media platforms.
Unsustainable harvesting is one of the major threats driving the global extinction crisis (Maxwell et al. 2016). Wildlife trade is a multibillion‐dollar industry, in which thousands of animals, plants, and associated products are traded globally as food, pets, medicines, clothing, and trophies (Dalberg Global Development Advisors 2012). Wildlife trade escalates into a crisis when an increasing proportion is illegal and unsustainable and thus directly threatens the persistence of many species in the wild (Ripple et al. 2016). High‐profile species, such as rhinoceroses and elephants (Wittemyer et al. 2014; Di Minin et al. 2015a), as well as many lesser‐known species (Rosen & Smith 2010; Phelps & Webb 2015), are threatened by illegal trade. Illegal wildlife trade is among the largest illegitimate businesses (Dalberg Global Development Advisors 2012). Furthered by poverty, poorly monitored borders, corruption, and weak regulations and enforcement, illegal wildlife trade continues to grow (Dalberg Global Development Advisors 2012; UNODC 2016).
In recent years, the scale and nature of illegal wildlife trade have changed dramatically, and the internet has become a major market for wildlife products (Lavorgna 2014). Although law enforcement has been partially successful in controlling illegal wildlife trade on major e‐commerce platforms, the trade appears to have moved to alternative platforms, in particular social media (Yu & Jia 2015). Illegal wildlife trade on the dark web appears to be low (Roberts & Hernandez‐Castro 2017). This may be partly because accessing the dark web and locating illegal wildlife products requires technical skills.
We propose a new research framework in which machine learning is used to investigate illegal wildlife trade on social media platforms (Fig. 1). The framework has 3 stages: mining, filtering, and identifying relevant data on illegal wildlife trade on social media.
Figure 1. Framework to (a) mine, (b) filter, and (c) identify relevant data on the illegal wildlife trade from social media platforms with machine learning. Photo in (c) is from Twitter.
User‐generated content, including images, text, and videos, can be downloaded from several social media platforms, including Facebook, Twitter, Weibo, and Flickr, via an application programming interface (API) (see https://www.programmableweb.com/category/social/apis?category=20087 for a full list). These APIs are publicly available, and researchers can use them to independently collect global‐scale data from the publicly available content that each social media company exposes. Social media data collected via APIs are being used increasingly in conservation (e.g., Di Minin et al. 2015b), but automated classification of such data remains limited. Automated content classification can help filter out information irrelevant to illegal wildlife trade (e.g., “pangolin armoured vehicle” as opposed to pangolin taxa [Fig. 1a]) and render content classification cost‐efficient.
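As a rough illustration of the automated content classification described above, the following sketch trains a small multinomial naive Bayes text classifier to separate posts about pangolin taxa from posts about the similarly named armoured vehicle. It is a minimal sketch under stated assumptions, not the method used in any cited study: the example posts, the labels `relevant` and `irrelevant`, and the class name `NaiveBayesFilter` are all hypothetical, and a real filter would need far more human-verified training data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all words seen during training.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            total_words = sum(self.word_counts[c].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                numerator = self.word_counts[c][word] + 1
                denominator = total_words + len(self.vocab)
                score += math.log(numerator / denominator)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical, hand-labeled example posts (illustration only).
training_texts = [
    "pangolin scales for sale",
    "selling pangolin scales cheap price",
    "fresh pangolin meat available",
    "pangolin armoured vehicle on parade",
    "army buys pangolin armoured vehicle",
    "new armoured vehicle unveiled",
]
training_labels = ["relevant"] * 3 + ["irrelevant"] * 3

clf = NaiveBayesFilter().fit(training_texts, training_labels)
vehicle_label = clf.predict("pangolin armoured vehicle spotted")
trade_label = clf.predict("pangolin scales available for sale")
```

A word-count model of this kind is only a baseline; it illustrates why the shared keyword "pangolin" alone is not enough and why the surrounding words must be taken into account.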
Machine learning and its subfields and components (deep learning, neural networks, and natural language processing) can be used to identify verbal, visual, and audiovisual content pertaining to illegal wildlife trade (e.g., Di Minin et al. 2018) (Fig. 1b). Deep neural networks have architectures that contain multiple layers of neurons, which allow the networks to learn increasingly abstract representations of the data (Krizhevsky et al. 2012; Liao et al. 2013). However, to learn to associate inputs and outputs, such as images and their respective labels, neural networks require large volumes of human‐verified training data. When provided with consistently labeled data and a clearly defined task, neural networks perform at a high level. Norouzzadeh et al. (2018), for example, used neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million‐image Snapshot Serengeti data set. The system they developed can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of citizen scientists, saving >8 years of human labeling effort. Publicly available data sets, such as ImageNet, which includes 14 million images classified in 22,000 classes, can provide initial training data for many species (Deng et al. 2009). However, more specific training data are needed to identify specific wildlife products (e.g., pangolin scales or rhinoceros horn), to determine whether a wildlife product is being traded illegally, to account for the source of the specimens traded (e.g., captive bred or wild sourced) (Hinsley et al. 2016), or to discard scams. For this purpose, citizen scientists, as in the case of the Snapshot Serengeti data set, could be used to label images and associated text via platforms such as Zooniverse (https://www.zooniverse.org/).
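One simple way to combine automated classification with human labeling, sketched below, is to accept a model's label only when its confidence exceeds a threshold and to route the remaining items to human reviewers such as citizen scientists. This is an illustrative assumption rather than a description of the systems cited above; the function `route_by_confidence`, the threshold of 0.9, and the example class probabilities are hypothetical.

```python
def route_by_confidence(predictions, threshold=0.9):
    """Split model outputs into auto-accepted labels and items for human review.

    `predictions` is a list of (item_id, {class_name: probability}) pairs.
    """
    automated = []
    needs_review = []
    for item_id, class_probs in predictions:
        # Pick the class with the highest predicted probability.
        label = max(class_probs, key=class_probs.get)
        if class_probs[label] >= threshold:
            automated.append((item_id, label))
        else:
            needs_review.append(item_id)
    return automated, needs_review


# Hypothetical classifier outputs for three images (illustration only).
predictions = [
    ("img_001", {"rhinoceros": 0.97, "elephant": 0.02, "other": 0.01}),
    ("img_002", {"rhinoceros": 0.55, "elephant": 0.40, "other": 0.05}),
    ("img_003", {"rhinoceros": 0.03, "elephant": 0.01, "other": 0.96}),
]
automated, needs_review = route_by_confidence(predictions)
```

Raising the threshold sends more items to human review and would generally be expected to increase the accuracy of the automated subset; an appropriate value would have to be calibrated on held-out labeled data.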
Advances in machine learning combined with rich training data sets may even allow detecting alternative code words used for selling wildlife products on social media.
Once the original information derived from social media is filtered and data sets are created (Fig. 1c), analyzing data will improve understanding of the trends and patterns of illegal wildlife trade on social media. Because social media data often contain metadata for geographical location and a time stamp indicating when the content was uploaded to the service, they can be used to analyze the spatiotemporal dynamics of illegal trade (e.g., the type and quantity of wildlife products traded, the nodes for trade routes, and the types of routes that exist between trade nodes and how they change over time). Using this information in combination with other biodiversity knowledge products, such as the International Union for Conservation of Nature (IUCN) Red List, can help determine whether the species or products are traded outside the species range or whether the species is coded as threatened on the IUCN Red List (IUCN 2016). Furthermore, through social network analysis techniques, information available on user profiles and the global connections between them can help identify the key exporter, intermediary, and importer countries. Finally, sentiment analysis can be used to identify and categorize opinions expressed in social media content, especially to determine users’ attitudes toward wildlife products. Such information, in turn, can inform campaigns for behavioral change. Sentiment analysis can also be used by law enforcement and security agencies to monitor rapidly developing situations.
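The spatiotemporal analysis mentioned above can start from something as simple as counting filtered posts per month and per grid cell. The sketch below shows one way to do this with the location and time‐stamp metadata; the field names (`timestamp`, `lat`, `lon`), the 1‐degree cell size, and the example posts are assumptions made for illustration and do not correspond to any particular platform's API.

```python
import math
from collections import Counter
from datetime import datetime


def grid_cell(lat, lon, cell_size=1.0):
    """Return the south-west corner of the grid cell containing the point."""
    return (
        math.floor(lat / cell_size) * cell_size,
        math.floor(lon / cell_size) * cell_size,
    )


def count_by_month_and_cell(posts, cell_size=1.0):
    """Count posts per (year-month, grid cell), skipping posts without a geotag."""
    counts = Counter()
    for post in posts:
        if post.get("lat") is None or post.get("lon") is None:
            continue  # posts without coordinates cannot be placed on the grid
        month = datetime.fromisoformat(post["timestamp"]).strftime("%Y-%m")
        cell = grid_cell(post["lat"], post["lon"], cell_size)
        counts[(month, cell)] += 1
    return counts


# Hypothetical filtered posts (illustration only).
posts = [
    {"timestamp": "2018-03-12T08:30:00", "lat": -1.29, "lon": 36.82},
    {"timestamp": "2018-03-20T17:05:00", "lat": -1.10, "lon": 36.95},
    {"timestamp": "2018-04-02T11:00:00", "lat": 22.30, "lon": 114.17},
    {"timestamp": "2018-04-03T09:00:00", "lat": None, "lon": None},
]
counts = count_by_month_and_cell(posts)
```

Comparing such counts across months gives a first view of how activity in each cell changes over time; because posts without coordinates are dropped, the result describes only the geotagged subset of the data.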
Following the framework proposed in Fig. 1, Di Minin et al. (2018) trained a deep neural network to determine whether Twitter posts with the word rhino in 19 different languages contained images of rhinoceros species. With this approach, they were able to discard 94% of the images in tweets or in reference pages as not relevant. In another application, not from social media, Hernandez‐Castro and Roberts (2015) developed an automated system to detect potentially illegal elephant ivory items for sale on eBay.
Although the characteristics of social media data provide a great opportunity to track illegal wildlife trade, there are still challenges and caveats (e.g., spatial inaccuracy and unreliable data related to scams) associated with using social media content for research purposes (Di Minin et al. 2015b; Tsou 2017). In addition, scientists and practitioners have the ethical responsibility to minimize potential harm to people who share illegal wildlife trade content on social media platforms (Zook et al. 2017). For example, the privacy policy and terms of use of each social media platform should be followed strictly, and only publicly available social media data should be used. The anonymity of social media users should be respected and their privacy protected by anonymizing the data so that they cannot be linked to any personal information, such as names or phone numbers. The receipt, storage, processing, and application of social media data should strictly follow all data security and privacy requirements (e.g., the European Union General Data Protection Regulation) of the country where researchers are based. Another problem is that a wealth of relevant data on illegal wildlife trade is currently not open to research via APIs. For this reason, manual observation, filtering, and classification of content, particularly to assess whether content pertains to legal or illegal trade, remain important (Hinsley et al. 2016; Eid & Handal 2017). Still, openly available data, which can be downloaded through the APIs, represent an important sample of all available social media data. Our framework can be applied within the safe environments of social media platforms without breaching privacy fences.
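The step of removing names and phone numbers before storage might look roughly like the sketch below, which replaces user identifiers with a keyed hash, drops display names, and redacts phone‐number‐like strings from the text. The field names, the secret key, and the phone‐number pattern are hypothetical and deliberately simple. Keyed hashing is a form of pseudonymization rather than full anonymization, so data processed this way may still count as personal data under rules such as the General Data Protection Regulation.

```python
import hashlib
import hmac
import re

# Deliberately simple pattern for phone-number-like strings (an assumption;
# real formats vary widely and would need a more careful approach).
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def pseudonymize_user(user_id, secret_key):
    """Replace a user identifier with a keyed SHA-256 hash (HMAC)."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def redact_phone_numbers(text):
    """Replace phone-number-like substrings with a placeholder."""
    return PHONE_PATTERN.sub("[REDACTED]", text)


def anonymize_post(post, secret_key):
    """Keep only a pseudonymous user key and the redacted text of a post."""
    return {
        "user": pseudonymize_user(post["user_id"], secret_key),
        "text": redact_phone_numbers(post["text"]),
    }


# Hypothetical post and key (illustration only; store a real key securely).
secret_key = b"replace-with-a-randomly-generated-key"
post = {
    "user_id": "example_user",
    "display_name": "Jane Doe",
    "text": "Call +358 40 123 4567 for price",
}
cleaned = anonymize_post(post, secret_key)
```

Because the same user always maps to the same pseudonym, posts can still be grouped by user for analysis, while recovering the original identifier would require the secret key.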
Our proposed methods and analyses are relevant for the implementation of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) (e.g., Decisions 17.92 and 17.93 at the Conference of the Parties 17). Given the pressing issue, creating partnerships between CITES parties, social media companies, and scientists working on artificial intelligence will help create the conditions (e.g., by accessing full social media data in full respect of privacy) that will make investigation of the illegal wildlife trade on social media possible. Our framework, with differences related to how data can be downloaded, can also be applied to other online platforms.
Acknowledgments
An earlier version of this paper was published online in Accepted Articles. It was removed and substantially revised to eliminate overlap with an earlier publication by the authors on the same topic. The authors acknowledge an error in judgment that resulted in the need for this revision. E.D.M. thanks the Academy of Finland 2016–2019 (grant 296524) for support. C.F. thanks the University of Helsinki for support via an Early Career Grant to E.D.M. T.H. was funded by the Finnish Cultural Foundation. H.T. thanks the DENVI doctoral program at University of Helsinki for support.
The above article from Conservation Biology, published online on 12 March 2018 in Wiley Online Library (wileyonlinelibrary.com) has been revised to eliminate overlap with a previously published article: Machine Learning for Tracking Illegal Wildlife Trade on Social Media in Nature Ecology and Evolution 2, 406–407 (2018).
The copyright line for this article was changed on 7 August 2019 after original online publication.
Literature Cited
- Dalberg Global Development Advisors. 2012. Fighting illicit wildlife trafficking: a consultation with governments. WWF, Gland, Switzerland.
- Deng J, Dong W, Socher R, Li LJ, Li K, Fei‐Fei L. 2009. ImageNet: a large‐scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition:248–255.
- Di Minin E, Fink C, Tenkanen H, Hiippala T. 2018. Machine learning for tracking illegal wildlife trade on social media. Nature Ecology and Evolution 2:406–407.
- Di Minin E, Laitila J, Montesino‐Pouzols F, Leader‐Williams N, Slotow R, Goodman PS, Conway AJ, Moilanen A. 2015a. Identification of policies for a sustainable legal trade in rhinoceros horn based on population projection and socioeconomic models. Conservation Biology 29:545–555.
- Di Minin E, Tenkanen H, Toivonen T. 2015b. Prospects and challenges for social media data in conservation science. Frontiers in Environmental Science 3:63.
- Eid E, Handal R. 2017. Illegal hunting in Jordan: using social media to assess impacts on wildlife. Oryx 10.1017/S0030605316001629.
- Hernandez‐Castro J, Roberts DL. 2015. Automatic detection of potentially illegal online sales of elephant ivory via data mining. PeerJ Computer Science 1:e10.
- Hinsley A, Lee TE, Harrison JR, Roberts DL. 2016. Estimating the extent and structure of trade in horticultural orchids via social media. Conservation Biology 30:1038–1047.
- IUCN (International Union for Conservation of Nature). 2016. The IUCN Red List of threatened species. IUCN, Gland, Switzerland.
- Krizhevsky A, Sutskever I, Hinton GE. 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 1:1097–1105.
- Lavorgna A. 2014. Wildlife trafficking in the Internet age. Crime Science 3:5.
- Liao H, McDermott E, Senior A. 2013. Large scale deep neural network acoustic modeling with semi‐supervised training data for YouTube video transcription. 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013—Proceedings:368–373.
- Maxwell SL, Fuller RA, Brooks TM, Watson JEM. 2016. The ravages of guns, nets and bulldozers. Nature 536:145–146.
- Norouzzadeh MS, Nguyen A, Kosmala M, Swanson A, Palmer M, Packer C, Clune J. 2018. Automatically identifying, counting, and describing wild animals in camera‐trap images with deep learning. Proceedings of the National Academy of Sciences of the United States of America 10.1073/pnas.1719367115.
- Phelps J, Webb EL. 2015. ‘Invisible’ wildlife trades: Southeast Asia's undocumented illegal trade in wild ornamental plants. Biological Conservation 186:296–305.
- Ripple WJ, et al. 2016. Bushmeat hunting and extinction risk to the world's mammals. Royal Society Open Science 3:160498.
- Roberts DL, Hernandez‐Castro J. 2017. Bycatch and illegal wildlife trade on the dark web. Oryx 51:393–394.
- Rosen GE, Smith KF. 2010. Summarizing the evidence on the international trade in illegal wildlife. EcoHealth 7:24–32.
- Tsou M. 2017. Research challenges and opportunities in mapping social media and big data. Cartography and Geographic Information Science 42:70–74.
- UN Office on Drugs and Crime (UNODC). 2016. World wildlife crime report: trafficking in protected species. UNODC, Vienna.
- Wittemyer G, Northrup JM, Blanc J, Douglas‐Hamilton I, Omondi P, Burnham KP. 2014. Illegal killing for ivory drives global decline in African elephants. Proceedings of the National Academy of Sciences of the United States of America 111:13117–13121.
- Yu X, Jia W. 2015. Moving targets: tracking online sales of illegal wildlife. Traffic, Cambridge.
- Zook M, et al. 2017. Ten simple rules for responsible big data research. PLOS Computational Biology 13:e1005399.