Data in Brief. 2022 Sep 7;45:108573. doi: 10.1016/j.dib.2022.108573

Dataset of vector mosquito images

Reshma Pise, Kailas Patil, Meena Laad, Neeraj Pise
PMCID: PMC9508436  PMID: 36164301

Abstract

Mosquitoes pose a substantial threat to public health, causing on the order of a million deaths worldwide every year. They are the vectors of diseases such as dengue, yellow fever, chikungunya and Zika. The harmful mosquito species belong to the genera Aedes, Anopheles and Culex. Automated species identification of vectors is essential for implementing targeted vector control strategies. The objective of this paper is to present a novel dataset of images of dangerous mosquito species. We have prepared a dataset of images of adult mosquitoes belonging to three species, Aedes aegypti, Anopheles stephensi and Culex quinquefasciatus, stored in two folders. The first folder comprises a total of 2640 augmented images of mosquitoes belonging to the three species. The second folder contains the original images of the three species. The dataset is valuable for training machine learning and deep learning models for automatic species classification.

Keywords: Computer vision, Deep learning, Mosquito classification, Vector control


Specifications Table

Subject: Computer Vision and Pattern Recognition, Machine Learning, Entomology and insect science.
Specific subject area: Morphological classification of mosquito species.
Type of data: Images of Mosquitoes
How data points were acquired: The images were captured with a 48 Mpx One Plus mobile phone camera in daylight conditions.
Data format: Raw images in JPEG file format.
Description of data collection: Photographs of fresh mosquito specimens were taken in daylight using the rear camera of a high-resolution mobile phone.
Data source location: All photos were captured at the Ross Life lab located in the city of Pune, India
ROSS LIFE SCIENCE PVT. LTD
Plot No.96, Sector No.10, PCNTDA, Bonsai, Pune – 411026
Maharashtra, India
Latitude and longitude: 18.6466° N, 73.8306° E
Data accessibility: The dataset of images is available online on the Mendeley Data website.
Repository name: Dataset of Vector Mosquito Images
Data identification number (doi): 10.17632/88s6fvgg2p.4
Direct URL to data: https://data.mendeley.com/datasets/88s6fvgg2p/4

Value of the Data

  • The dataset provides images of three mosquito species: Aedes aegypti, Anopheles stephensi and Culex quinquefasciatus.

  • The dataset can be used to train mosquito species classification and prediction models, and can thereby benefit society by helping to control mosquito-borne diseases.

  • Automated species classification models trained on the dataset are a vital contribution to vector control.

  • Automated genus and species identification can be more efficient than the laborious and time-consuming manual species identification carried out by entomologists.

1. Data Description

Mosquitoes of the genera Aedes, Anopheles and Culex are vectors responsible for spreading diseases such as dengue, yellow fever, chikungunya and Zika [1]. Local governments carry out mosquito vector surveillance to monitor the mosquito population and the species predominant in a geographic area, so that effective vector control plans can be implemented [2]. Automated species classification can make an important contribution to targeting harmful species. Image processing techniques can be combined with machine learning algorithms to train models that classify and predict the genus or species. The availability of a quality dataset is a prerequisite for training such deep learning models [3,4,5].

There are datasets that record the geographical density and distribution of vector mosquito species [6,7]. There are also image datasets that contain images of female mosquitoes belonging to i) the Aedes genus (Aedes aegypti and Aedes albopictus species), ii) the Aedes and Culex genera (Aedes aegypti, Aedes albopictus and Culex quinquefasciatus species) and iii) Aedes, Anopheles and Culex species [8,9,10].

The images in these datasets were acquired using a microscope and a digital camera. Our dataset includes the three important harmful vector species, i.e., Aedes aegypti, Anopheles stephensi and Culex quinquefasciatus, and covers both sexes. In addition, the images were captured with a mobile phone camera.

The proposed dataset comprises two folders. The folder named “Mosquito Images Original” contains the original images of the three species, stored in three subfolders, one for each species: Aedes Aegypti, Anopheles Stephensi and Culex Quinquefasciatus. The folder named “Mosquito Images Augmented” contains a total of 2640 augmented images of the three species, stored in the three corresponding subfolders. The pictures were captured with a One Plus mobile 48 Mpx camera and were saved in JPEG file format. The original pictures are RGB images with dimensions of 3000 × 4000 pixels and 72 dpi. The augmented images have a resolution of 256 × 256 pixels with 96 dpi. Table 1 presents the number of images and a sample picture of a mosquito belonging to each of the three species.

Table 1.

Dataset description

Subfolder | Number of original images | Number of augmented images | Sample image
Aedes Aegypti | 15 | 900 | (sample image)
Anopheles Stephensi | 9 | 540 | (sample image)
Culex Quinquefasciatus | 20 | 1200 | (sample image)
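
The counts in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers printed in the table; the dictionary layout and the helper name `augmentation_factor` are illustrative assumptions and are not part of the published dataset.

```python
# Image counts per species subfolder, copied from Table 1.
TABLE_1 = {
    "Aedes Aegypti": {"original": 15, "augmented": 900},
    "Anopheles Stephensi": {"original": 9, "augmented": 540},
    "Culex Quinquefasciatus": {"original": 20, "augmented": 1200},
}


def augmentation_factor(counts):
    """Return the number of augmented images per original image."""
    return counts["augmented"] / counts["original"]


# Per-subfolder ratio and dataset-wide totals.
factors = {name: augmentation_factor(c) for name, c in TABLE_1.items()}
total_original = sum(c["original"] for c in TABLE_1.values())
total_augmented = sum(c["augmented"] for c in TABLE_1.values())

for name, factor in factors.items():
    print(f"{name}: {factor:.0f} augmented images per original")
print(f"Total: {total_original} original, {total_augmented} augmented")
```

Every subfolder has the same ratio of augmented to original images, and the augmented counts add up to the 2640 images stated in the text.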

2. Experimental Design, Materials and Methods

2.1. Experimental Design

The images were collected during April 2022 at the mosquito colony maintained by the Ross Life lab in the city of Pune, Maharashtra, India. Fig. 1 shows the steps involved in the dataset construction process. Fresh adult mosquito specimens were imaged using a handheld smartphone camera. The species included were Aedes aegypti, Anopheles stephensi and Culex quinquefasciatus. The species of each specimen was confirmed by an entomologist from Ross Life.

Fig. 1.

Data acquisition process

2.2. Materials or Specification of Image Acquisition System

The imaging system consisted of the RGB camera of a 48 Mpx One Plus mobile phone (Table 2).

Table 2.

Specification of image acquisition system

Sr. No. | Particulars | Details
1 | Camera | a) Make and Model: Sony IMX586 and
  |        | b) Sensor: 48 MP
  |        | c) Focus Adjustment: automatic
  |        | d) Aperture: f/1.7
2 | Resolution of augmented images | 256 × 256 pixels
3 | Image Format | JPEG
4 | Original Image Resolution | 3000 × 4000 pixels

2.3. Method

Adult mosquitoes of both sexes belonging to the three species were sampled and photographed. There were 7 Aedes aegypti, 5 Anopheles and 10 Culex specimens. The genus and species of each mosquito were confirmed by an entomologist. The live mosquito specimens were frozen and then placed on a plain white surface. The specimens were reoriented slightly between shots and images were taken at various angles to capture the morphological features specific to each species. This was easy to achieve with fresh specimens.

The images were stored with the file name format genus_species_imagenumber.jpg in the corresponding species subfolder under the “Mosquito Images Original” folder. Data augmentation was used to increase the size of the dataset, because machine learning projects require large datasets. It is a well-known technique in image classification problems for making the model generalize better. The original images were resized to 256 × 256 pixels, a common input resolution for deep neural networks. Image augmentation functions from the Python library “imgaug” were used to augment all the original images. Augmentation functions such as random rotation, scaling, shear, perspective transformation, flips, Gaussian blur, noise and colour space transformations were applied with different parameters. For example, rotation augmentation was performed by rotating the image in 10° increments between 10° and 360°. Table 3 presents the augmentation functions, parameters and parameter values that we used for each function. These parameter values can be changed to generate a different set of new images from the original image dataset.

Table 3.

Image Augmentation details

Augmentation function | Parameter values
Rotation | Degree of rotation: 10° increments, between 10° and 360°
Scaling | Scaling factor: 1.2 and 0.8
Shear | Between -16° and 16°
Perspective | Scale: 0.15
Gaussian Blur | Sigma value = 1
Flip | Horizontal flip for all images
Gaussian Noise | Scale: (0, 0.05 * 255)
GammaContrast | Pixel values in range: 0.5 and 1.44
Linear Contrast | Pixel value: 0.62
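
As a rough illustration of how the parameters in Table 3 map onto the imgaug library, the following sketch builds one augmenter per parameter value. It is not the authors' script. The helper names `build_augmenters` and `augment_image` are hypothetical, the rotation is assumed to run in 10° steps, the shear is assumed to be a symmetric range of -16° to 16°, and the gamma and contrast entries are assumed to be the `gamma` and `alpha` arguments of the corresponding imgaug classes. The sketch yields 46 variants per image, so it does not reproduce the exact number of augmented images in the dataset. imgaug must be installed separately and is imported only when the helpers are called.

```python
# Parameter values transcribed from Table 3 (interpretations noted inline).
ROTATION_ANGLES = list(range(10, 361, 10))  # assumed 10-degree steps: 10, 20, ..., 360
SCALE_FACTORS = [1.2, 0.8]
SHEAR_RANGE = (-16, 16)                     # assumed symmetric range, in degrees
PERSPECTIVE_SCALE = 0.15
BLUR_SIGMA = 1.0
NOISE_SCALE = (0, 0.05 * 255)
GAMMA_VALUES = [0.5, 1.44]                  # assumed to be gamma exponents
LINEAR_CONTRAST_ALPHA = 0.62                # assumed to be the contrast factor
TARGET_SIZE = 256                           # augmented images are 256 x 256 pixels


def build_augmenters():
    """Return a list of imgaug augmenters, one per output image (hypothetical helper)."""
    import imgaug.augmenters as iaa  # lazy import; requires the imgaug package

    augmenters = [iaa.Affine(rotate=angle) for angle in ROTATION_ANGLES]
    augmenters += [iaa.Affine(scale=factor) for factor in SCALE_FACTORS]
    augmenters += [
        iaa.Affine(shear=SHEAR_RANGE),                  # random shear within the range
        iaa.PerspectiveTransform(scale=PERSPECTIVE_SCALE),
        iaa.GaussianBlur(sigma=BLUR_SIGMA),
        iaa.Fliplr(1.0),                                # always flip horizontally
        iaa.AdditiveGaussianNoise(scale=NOISE_SCALE),
        iaa.LinearContrast(alpha=LINEAR_CONTRAST_ALPHA),
    ]
    augmenters += [iaa.GammaContrast(gamma=g) for g in GAMMA_VALUES]
    return augmenters


def augment_image(image):
    """Resize an RGB uint8 array to 256 x 256, then apply each augmenter once."""
    import imgaug.augmenters as iaa

    resize = iaa.Resize({"height": TARGET_SIZE, "width": TARGET_SIZE})
    small = resize.augment_image(image)
    return [aug.augment_image(small) for aug in build_augmenters()]
```

Calling `augment_image` on a NumPy array of shape (4000, 3000, 3) and dtype uint8 would return a list of 256 × 256 variants. Changing the constants at the top is the simplest way to generate a different set of images from the originals.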

The resulting image files are named genus_species_imagenumber.jpg and stored in the respective subfolders of the “Mosquito Images Augmented” folder. The number of images in each category after augmentation is given in Table 1.
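
The genus_species_imagenumber.jpg naming scheme makes it possible to recover a label from a file name alone. The following is a minimal sketch, assuming that the genus and species parts contain only letters and that the image number is a plain integer. The exact casing and numbering of the published files are not stated here, so the example names and the helpers `parse_image_name` and `count_by_species` are hypothetical.

```python
import re
from collections import Counter

# Pattern for the genus_species_imagenumber.jpg naming scheme.
# Letters-only name parts and an integer image number are assumptions.
FILENAME_PATTERN = re.compile(
    r"^(?P<genus>[A-Za-z]+)_(?P<species>[A-Za-z]+)_(?P<number>\d+)\.jpe?g$",
    re.IGNORECASE,
)


def parse_image_name(filename):
    """Split a file name into (genus, species, image number)."""
    match = FILENAME_PATTERN.match(filename.strip())
    if match is None:
        raise ValueError(f"Unexpected file name: {filename!r}")
    return match["genus"], match["species"], int(match["number"])


def count_by_species(filenames):
    """Count images per 'genus species' label."""
    counts = Counter()
    for name in filenames:
        genus, species, _ = parse_image_name(name)
        counts[f"{genus} {species}"] += 1
    return counts


# Hypothetical file names that follow the naming scheme.
example_names = [
    "Aedes_aegypti_1.jpg",
    "Aedes_aegypti_2.jpg",
    "Culex_quinquefasciatus_1.jpg",
]
print(count_by_species(example_names))
```

Raising an error on names that do not match keeps mislabelled files from silently entering a training set. A more lenient loader could skip such files instead.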

Ethics Statement

The data are available in the public domain, and no funding was received for the present work. There is no conflict of interest.

CRediT Author Statement

Reshma Pise: Conceptualization, Writing the original article, Publication of the dataset; Kailas Patil: Data Validation, Supervision, Project administration; Meena Laad: Editing, Formal analysis; Neeraj Pise: Image capture & Augmentation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Contributor Information

Reshma Pise, Email: reshma.pise@vupune.ac.in.

Kailas Patil, Email: kailas.patil@vupune.ac.in.

References

  • 1. World Health Organization. Vector-Borne Diseases Factsheet. Geneva: World Health Organization; 2017. Accessed June 1, 2022.
  • 2. Sasmita HI, Neoh K-B, Yusmalinar S, Anggraeni T, Chang N-T, Bong L-J, et al. Ovitrap surveillance of dengue vector mosquitoes in Bandung City, West Java Province, Indonesia. PLoS Negl Trop Dis. 2021;15(10). doi: 10.1371/journal.pntd.0009896.
  • 3. Pise Reshma, Patil Kailas, Pise Neeraj. Automatic Classification Of Mosquito Genera Using Transfer Learning. Journal of Theoretical and Applied Information Technology. 2022;100(06). doi: 10.5281/zenodo.6417511.
  • 4. Park J., Kim D.I., Choi B., et al. Classification and Morphological Analysis of Vector Mosquitoes using Deep Convolutional Neural Networks. Sci. Rep. 2020;10:1012. doi: 10.1038/s41598-020-57875-1.
  • 5. Okayasu Kazushige, Yoshida Kota, Fuchida Masataka, Nakamura Akio. Vision-Based Classification of Mosquito Species: Comparison of Conventional and Deep Learning Methods. Appl. Sci. 2019;9:3935. doi: 10.3390/app9183935.
  • 6. Atoni Evans, Zhao Lu, Hu Cheng, Ren Nanjie, Wang Xiaoyu, Liang Mengying, et al. A dataset of distribution and diversity of mosquito-associated viruses and their related mosquito vectors in China. figshare. Dataset Pap. Sci. 2020. doi: 10.6084/m9.figshare.12638792.v1.
  • 7. Irish D., Kyalo R.W.Snow, Coetzee M. Anopheles species present in countries in sub-Saharan Africa and associated islands. Harvard Dataverse. 2019. doi: 10.7910/DVN/PHGADL.
  • 8. Pradeep Isawasan. Aedes Mosquitos Dataset. 2020. https://www.kaggle.com/datasets/pradeepisawasan/aedes-mosquitos. Accessed July 20, 2022.
  • 9. Ong SQ., Ahmad H. An annotated image dataset for training mosquito species recognition system on human skin. Sci Data. 2022;9:413. doi: 10.1038/s41597-022-01541-w.
  • 10. Couret Jannelle. Malaria vector mosquito images. Dryad, Dataset. 2020. doi: 10.5061/dryad.z08kprr92.
