Data in Brief
. 2024 Jul 3;55:110701. doi: 10.1016/j.dib.2024.110701

NSTU-BDTAKA: An open dataset for Bangladeshi paper currency detection and recognition

Md Jubayar Alam Rafi 1, Mohammad Rony 1, Nazia Majadi 1
PMCID: PMC11296233  PMID: 39100771

Abstract

One of the most popular and well-established forms of payment in use today is paper money. Handling paper money can be challenging for people with vision impairments. Assistive technology has continually evolved to better serve elderly and disabled people. Image processing techniques and other advanced technologies, such as Artificial Intelligence and Deep Learning, can be used to detect paper currency and extract other useful information from it. In this paper, we present a meticulously curated and comprehensive dataset named ‘NSTU-BDTAKA’ tailored for the simultaneous detection and recognition of a specific object of cultural significance: the Bangladeshi paper currency (called ‘Taka’ in Bengali). This research aims to facilitate the development and evaluation of models for both taka detection and recognition tasks, offering a rich resource for researchers and practitioners alike. The dataset is divided into two distinct components: (i) taka detection and (ii) taka recognition. The taka detection subset comprises 3,111 high-resolution images, each meticulously annotated with rectangular bounding boxes that enclose instances of taka. These annotations serve as ground truth for training and validating object detection models, and we adopt the state-of-the-art YOLOv5 architecture for this purpose. In the taka recognition subset, the dataset is extended to a collection of 28,875 images, each showcasing instances of taka captured in diverse contexts and environments. The recognition subset is designed to address the nuanced task of taka recognition, providing researchers with a comprehensive set of images to train, validate, and test recognition models. It encompasses challenges such as variations in lighting, scale, orientation, and occlusion, further enhancing the robustness of developed recognition algorithms.
The dataset NSTU-BDTAKA not only serves as a benchmark for taka detection and recognition but also fosters advancements in object detection and recognition methods that can be extrapolated to other cultural artifacts and objects. We envision that the dataset will catalyze research efforts in the field of computer vision, enabling the development of more accurate, robust, and efficient models for both detection and recognition tasks.

Keywords: Computer vision, Deep learning, Image analysis, Taka detection, Taka recognition, YOLOv5 model


Specification Table

Subject Deep Learning, Computer Vision, Currency Analysis
Specific subject area Taka detection subset: YOLOv5 object detection, image processing. Taka recognition subset: image classification, deep learning.
Data format Raw: JPG; Converted: JPG; Annotation: TXT
Type of data Taka detection subset: image, text file. Taka recognition subset: image.
Dataset size Taka detection subset: 1,831 raw images; 3,111 images after augmentation; 3,111 annotation text files. Taka recognition subset: 10,161 raw images; 28,875 images after augmentation.
Data collection The taka detection dataset was compiled through a manual data collection process using a readily available iPhone 6s Plus smartphone camera (12 MP, f/2.2, 29 mm, 1/3.0″, 1.22 µm). This choice facilitated the capture of real-world scenarios with the diverse lighting conditions, backgrounds, and surroundings encountered during taka identification. Images were captured in JPG format (640×640 pixels) at various times of day and from various angles and distances to replicate the range of viewing perspectives present in real-world applications. To further enhance model performance and mitigate limitations arising from a potentially limited initial dataset, data augmentation techniques such as random rotation, shearing, and exposure adjustment were applied after the initial image and video capture for detection with a custom YOLOv5 model. Images for the recognition dataset were captured using the same iPhone 6s Plus camera and saved in JPG format with dimensions of 256×256 pixels. These images were taken directly during the manual data collection process, ensuring a collection of authentic, high-quality photographs suitable for taka recognition.
The data collection procedure began with capturing images and videos for detection using a custom YOLOv5 [1] model. Subsequently, the target area of taka was isolated from the images. To enhance the model's performance and expand the dataset, data augmentation techniques such as random rotation, shearing, and exposure adjustments were applied.
This approach ensured a comprehensive dataset encompassing various real-world complexities and improved the model's ability to generalize to unseen scenarios.
Data source location The dataset is not limited to a single location; it encompasses images from diverse locations across Bangladesh, providing wide geographical coverage. For instance:
Dhaka: Latitude 23.8103° N, Longitude 90.4125° E
Noakhali: Latitude 22.8697° N, Longitude 91.0991° E
Kasba, Brahmanbaria: Latitude 23.9633° N, Longitude 91.1240° E
Data accessibility Repository name: NSTU-BDTAKA: NSTU Bangladeshi Paper Currency Dataset
Data identification number: 10.17632/w4y6h723xg.1
Direct URL to data: https://data.mendeley.com/datasets/w4y6h723xg/1
Related research article: https://doi.org/10.1145/3287098.3287152

1. Value of the Data

  • The dataset NSTU-BDTAKA serves as a pivotal asset for advancing the field of currency analysis, particularly in the context of Bangladeshi currency. Researchers and practitioners can leverage this dataset to develop, refine, and evaluate machine learning and deep learning models tailored to the detection and recognition of Bangladeshi currency notes.

  • The utilization of the dataset extends to the creation of automated systems designed to verify and recognize Bangladeshi paper currency. Furthermore, the dataset can serve as a foundational benchmark for training and testing currency recognition models, facilitating research and innovation.

  • Authentic images in the dataset allow real-world testing and validation of currency recognition algorithms, ensuring effectiveness in practical scenarios.

  • The dataset can promote responsible currency handling, minimizing errors and discrepancies in currency-related processes.

  • It can encourage collaboration among researchers, financial institutions, and technology developers to advance currency analysis technologies. Moreover, it can provide an educational resource for skill enhancement in machine learning and image processing, fostering learning and expertise.

  • It can contribute to economic growth by enhancing currency accuracy, security, and transparency in transactions and institutions.

2. Objective

The primary objective of this dataset paper is to present a comprehensive dataset specifically curated for advancing research in taka detection and recognition. With a focus on Bangladeshi currency analysis, the dataset encompasses two distinct facets: taka detection using the YOLOv5 model and taka recognition. The major goals of this research are as follows:

  • Through the provision of meticulously annotated images with rectangular bounding boxes, the dataset aims to empower the development, training, and evaluation of robust taka detection models. The objective is to facilitate the creation of accurate and efficient systems capable of identifying taka instances in real-time scenarios [2,3].

  • The taka recognition subset in the dataset, comprising a substantial collection of 28,875 images, seeks to enable comprehensive taka recognition research. By providing a diverse range of taka instances captured under varying conditions, the dataset serves as a platform for advancing image classification and deep learning techniques in the context of currency recognition.

  • This dataset paper aims to catalyze innovation within the domain of currency analysis. By offering high-quality, real-world data for both detection and recognition tasks [4], the objective is to inspire researchers, practitioners, and technology developers to devise novel algorithms, techniques, and models that elevate the accuracy, efficiency, and security of taka-related processes.

3. Data Description

The data description of NSTU-BDTAKA is shown in Table 1.

Table 1.

Directories and subdirectories of the NSTU-BDTAKA dataset.

Directory Subdirectory Subdirectory Subdirectory Contents
Data Detection Train images 2560
Labels 2560
Test images 186
Labels 186
Validation images 365
Labels 365
Recognition Train 2_taka 2877
5_taka 2442
10_taka 2875
20_taka 2982
50_taka 2895
100_taka 3123
200_taka 2994
500_taka 3008
1000_taka 2901
Test 2_taka 136
5_taka 99
10_taka 134
20_taka 130
50_taka 147
100_taka 127
200_taka 128
500_taka 126
1000_taka 117
Validation 2_taka 200
5_taka 176
10_taka 164
20_taka 187
50_taka 183
100_taka 189
200_taka 192
500_taka 179
1000_taka 164
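As a quick consistency check, the per-class counts in Table 1 can be summed and compared against the totals stated in the Specification Table; a short Python sketch:

```python
# Sanity check: per-split counts from Table 1 should sum to the subset
# totals quoted above (3,111 detection images, 28,875 recognition images).
detection = {"train": 2560, "test": 186, "validation": 365}

recognition = {
    "train": [2877, 2442, 2875, 2982, 2895, 3123, 2994, 3008, 2901],
    "test": [136, 99, 134, 130, 147, 127, 128, 126, 117],
    "validation": [200, 176, 164, 187, 183, 189, 192, 179, 164],
}

assert sum(detection.values()) == 3111
assert sum(sum(counts) for counts in recognition.values()) == 28875
print("Table 1 totals are consistent with the Specification Table")
```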

In the context of the detection dataset, a hierarchical folder structure, referred to as the "Detect" folder, is employed. This structure divides the dataset into three primary subsets: "train," "test," and "validation." Each of these subsets contains two subfolders: "images" and "labels." The "images" subfolder houses the original image files, while the "labels" subfolder stores the corresponding annotation files. These annotation files adopt the YOLO format, providing vital details for each object detection, including the class ID and the precise bounding box coordinates: center X, center Y, width, and height. This meticulous annotation strategy ensures the accurate localization of objects within the images. The structured folder arrangement not only optimizes dataset organization but also plays a pivotal role in facilitating the training and evaluation processes for the YOLOv5 object detection model.
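To make the label format concrete, the following sketch parses YOLO-format annotation lines and converts each normalized center-based box to pixel coordinates. The 640×640 size matches the detection images; the function and file names here are illustrative, not part of the dataset.

```python
# Minimal reader for YOLO-format label files ("labels/<name>.txt"):
# each line is "class cx cy w h" with coordinates normalized to [0, 1].

def yolo_to_pixel_box(line: str, img_w: int, img_h: int):
    """Convert one 'class cx cy w h' line to (cls, x1, y1, x2, y2) in pixels."""
    cls, cx, cy, w, h = line.split()
    cx, cy = float(cx) * img_w, float(cy) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2
    return int(cls), round(x1), round(y1), round(x2), round(y2)


def read_label_file(path: str, img_w: int = 640, img_h: int = 640):
    """Return all boxes of one annotation file as pixel-coordinate tuples."""
    with open(path) as f:
        return [yolo_to_pixel_box(ln, img_w, img_h) for ln in f if ln.strip()]
```

For example, a centered box covering a quarter of each dimension, `"0 0.5 0.5 0.25 0.25"`, maps to the pixel rectangle (240, 240) to (400, 400) in a 640×640 image.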

In the recognition dataset, the initial organization divides it into three primary subsets: "train," "test," and "validation." Within each of these subset folders, the nine currency classes ("2_taka", "5_taka", "10_taka", "20_taka", "50_taka", "100_taka", "200_taka", "500_taka", and "1000_taka") are each represented by a separate class folder. These class folders contain images of the respective currency denominations. This structured hierarchy streamlines the dataset for the training, testing, and validation of recognition models, enabling the precise identification and classification of currency denominations from image data alone.
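Loading the recognition subset follows directly from this layout. The sketch below indexes one split using only the standard library; the directory names follow Table 1, and in practice a framework loader such as torchvision's `ImageFolder` could replace this.

```python
import os

# Class-folder names as they appear in the recognition subset (Table 1).
CLASSES = ["2_taka", "5_taka", "10_taka", "20_taka", "50_taka",
           "100_taka", "200_taka", "500_taka", "1000_taka"]
CLASS_TO_IDX = {name: i for i, name in enumerate(CLASSES)}


def index_split(root: str, split: str):
    """Return (image_path, label_index) pairs for one split folder
    ('Train', 'Test' or 'Validation') under the recognition root."""
    samples = []
    for cls in CLASSES:
        cls_dir = os.path.join(root, split, cls)
        if not os.path.isdir(cls_dir):
            continue
        for fname in sorted(os.listdir(cls_dir)):
            if fname.lower().endswith(".jpg"):
                samples.append((os.path.join(cls_dir, fname), CLASS_TO_IDX[cls]))
    return samples
```

Keeping a fixed `CLASSES` order ensures the label indices stay stable across the train, test, and validation splits.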

4. Experimental Design, Materials and Methods

This section describes in more detail how we acquired the dataset ‘NSTU-BDTAKA’.

  • Data for taka detection:
    • Image collection: The first step for taka detection was to collect the data required to train and test the YOLOv5 model for real-time taka area detection [5]. We used a smartphone camera (iPhone 6s Plus, 12 MP, f/2.2, 29 mm, 1/3.0″, 1.22 µm) to capture real-time images of various taka areas in different lighting conditions and environments. To ensure that our dataset is diverse and representative of real-world scenarios, we captured images of taka areas at different times of day, under different weather conditions, and with varying backgrounds and surroundings [6]. We also captured images of taka areas from different angles and distances to account for various viewing perspectives. In total, 3,111 images were captured. Annotation involved manually highlighting each taka within an image with rectangular bounding boxes and labeling it appropriately; the labeling tool was used to draw bounding boxes around the taka areas in every image. This labeling process was crucial for training the YOLOv5 model accurately, and we ensured that the labeling was consistent across all images. After labeling, the dataset was split into training, validation, and testing sets in a ratio of 80:10:10. The training set was used to train the YOLOv5 model and the testing set to evaluate its performance. Overall, the data collection process was crucial in ensuring that the YOLOv5 model can accurately detect taka areas in real time under various environmental conditions (e.g., different times of day, different weather conditions, and varying backgrounds and surroundings). The data collection architecture is shown in Fig. 1. Fig. 2 illustrates several images that were collected, resized, and preprocessed.
    • Data augmentation: Data augmentation is the practice of making small changes to existing data to broaden its diversity without gathering new data; it is a method for enlarging a dataset. Standard data augmentation methods [7,8] include horizontal and vertical flipping, rotation, cropping, shearing, etc. Data augmentation can also help prevent a neural network from learning irrelevant features. Typical methods include:
      • Flip: Flipping the photos horizontally or vertically broadens the diversity of the training data. Flipping is used as a data augmentation method to expand the training dataset and enhance the performance of machine learning models. The use of flipping in data augmentation is shown in Fig. 3. To further boost the variety and size of the training dataset, flipping can also be combined with other data augmentation methods such as rotation, scaling, and translation.
      • Rotation: Slightly rotating the photos improves the diversity of the training data. Rotation is frequently used as an augmentation method in machine learning and computer vision; it entails rotating an image by a specific angle to produce a new image with a different orientation. Rotation by 15° both clockwise and counterclockwise is shown in Fig. 4.
      • Blurring: Image blurring is an augmentation technique that applies a blur filter to an image to simulate the effect of motion or defocus. In practice, blurring can be achieved using mathematical formulas or pre-built libraries that apply filters such as Gaussian or box blur. The degree of blur can be controlled by adjusting the filter size and strength. Fig. 5 presents the 1 px blurring applied in data augmentation.
      • Shearing: Image shearing is an augmentation technique that slants or tilts an image in a specific direction by applying a transformation matrix that shifts pixel positions along a diagonal line. Fig. 6 presents shearing by 15° along all axes. In practice, shearing is often combined with other augmentation techniques, such as rotation, flipping, and scaling, to generate a diverse range of training images that can improve the performance of computer vision models.
      • Exposure: Exposure augmentation adjusts the brightness and contrast of images. It is commonly used in computer vision to increase the variety of images in a dataset and improve the performance of machine learning models; by adjusting exposure, models generalize better to new images under different lighting conditions. Fig. 7 presents positive and negative exposure adjustments of 25%.
    • Data preprocessing: Preprocessing is necessary to prepare image data for model input; for instance, the fully connected layers of convolutional neural networks require every image to be stored in an array of the same size. The dataset is preprocessed to improve results by removing unused features, instances, etc. [9]. The following are the common steps involved in data preprocessing for real-time Bangladeshi currency detection (refer to Fig. 8):
      • (i)
        Denoising: Denoising is a crucial step in real-time Bangladeshi currency detection, as it minimizes the impact of noise on currency images.
      • (ii)
        CLAHE: CLAHE (Contrast Limited Adaptive Histogram Equalization) is a common image enhancement method frequently applied when analyzing Bangladeshi currency. It is a form of histogram equalization designed to enhance an image's contrast without oversaturating its brighter or darker areas. CLAHE is a helpful technique for preparing Bangladeshi currency data and can help increase the accuracy of subsequent analysis.
      • (iii)
        Thresholding: This approach transforms a grayscale image into a binary image by setting a threshold value. Thresholding can help isolate important characteristics used to identify Bangladeshi currency by reducing background noise in the image.
      • (iv)
        Cropping: Cropping removes parts of an image that are outside the scope of the study. For Bangladeshi currency, cropping can be used to remove black or white borders from the photos, which helps reduce the deep learning model's computation time and memory requirements.
      • (v)
        Resizing: Resizing changes an image to a desired dimension. Images of Bangladeshi currency can be reduced in size while keeping the key details in the image.
    • Image annotation: The Roboflow data annotation tool [10] was used to annotate images specifically for the "Taka" object. Each image is annotated with a rectangular bounding box, and the resulting annotations are saved in an individual text file per image, containing the bounding box coordinates and labels. As shown in Fig. 9, these annotations serve as samples and can be utilized by object detection algorithms such as YOLOv5.
  • Data for taka recognition:
    • Image collection: The first step of taka recognition was to collect the data required to train and test the recognition model. A smartphone camera (iPhone 6s Plus, 12 MP, f/2.2, 29 mm, 1/3.0″, 1.22 µm) was used to capture real-time images of various taka areas in different lighting conditions and environments. To ensure that the dataset is diverse and representative of real-world scenarios, images of taka areas were captured at different times of day, under different weather conditions, and with varying backgrounds and surroundings. Furthermore, images were captured from different angles and distances to account for various viewing perspectives. The process began by capturing images and videos for detection using the custom YOLOv5 model; the target taka area was then extracted from the images. To improve model performance and increase the dataset size, data augmentation techniques such as random cropping, flipping, and rotation were applied.
      After augmentation, the pixel values of the images were normalized to bring them to a common scale, and the dataset was divided into training, validation, and test sets. In total, the taka recognition subset comprises 28,875 images. Fig. 10 illustrates the data collection architecture for the taka recognition subset of NSTU-BDTAKA. Fig. 11 presents several images that were collected, resized, and preprocessed.
    • Data augmentation: Typical methods include:
      • Rotation: Slightly rotating the photos improves the diversity of the training data. Rotation is frequently used as an augmentation method in machine learning and computer vision; it entails rotating an image by a specific angle to produce a new image with a different orientation. Fig. 12 presents rotation in both the clockwise and counterclockwise directions.
      • Shearing: Image shearing is an augmentation technique that slants or tilts an image in a specific direction by applying a transformation matrix that shifts pixel positions along a diagonal line. Shearing can simulate the effect of an object being viewed from a different angle or perspective, which helps increase the robustness and generalization of computer vision models. Fig. 13 shows shearing by 15° along all axes. In practice, shearing is often combined with other augmentation techniques, such as rotation, flipping, and scaling, to generate a diverse range of training images.
      • Exposure: Exposure augmentation adjusts the brightness and contrast of images. It is commonly used in computer vision to increase the variety of images in a dataset and improve the performance of machine learning models; by adjusting exposure, models generalize better to new images under different lighting conditions. Fig. 14 shows positive and negative exposure adjustments of 18%. In practice, exposure augmentation can be achieved by manipulating brightness and contrast with mathematical formulas or pre-built libraries, and it can be combined with other augmentation techniques such as rotation, scaling, and flipping to create a more diverse training set.
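The flip, shear, and exposure operations described above can be sketched in plain NumPy as below. This is an illustrative reimplementation under our own naming, not the exact augmentation code used (the paper's augmentations were produced with the Roboflow tool), and arbitrary-angle rotation, blurring, and CLAHE are normally delegated to a library such as OpenCV:

```python
import numpy as np


def flip(img: np.ndarray, horizontal: bool = True) -> np.ndarray:
    """Mirror the image left-right (horizontal) or top-bottom (vertical)."""
    return img[:, ::-1] if horizontal else img[::-1]


def adjust_exposure(img: np.ndarray, factor: float) -> np.ndarray:
    """Brighten (factor > 1) or darken (factor < 1), clipping to the uint8 range."""
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)


def shear_x(img: np.ndarray, factor: float) -> np.ndarray:
    """Horizontal shear: row y is shifted by factor * y pixels (vacated area stays black)."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)
    for y in range(h):
        off = int(round(factor * y))
        if 0 <= off < w:
            out[y, off:] = img[y, : w - off]
        elif -w < off < 0:
            out[y, : w + off] = img[y, -off:]
    return out
```

A random augmentation pipeline would simply apply each of these with some probability and a randomly drawn factor before writing out the augmented copy.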

Fig. 1. The data collection architecture for the taka detection stage of NSTU-BDTAKA.

Fig. 2. Sample of collected images.

Fig. 3. Flip used in data augmentation.

Fig. 4. Rotation in data augmentation.

Fig. 5. Blurring applied in data augmentation.

Fig. 6. Shearing applied in data augmentation.

Fig. 7. Exposure applied in data augmentation.

Fig. 8. The steps required for data preprocessing.

Fig. 9. Bounding box extraction of Bangladeshi paper currency.

Fig. 10. The data collection architecture for the taka recognition phase of NSTU-BDTAKA.

Fig. 11. Sample of collected images.

Fig. 12. Rotation applied in data augmentation.

Fig. 13. Shearing applied in data augmentation.

Fig. 14. Exposure applied in data augmentation.

Limitations

There are some issues (i.e., inconsistent image quality, object occlusion, changing illumination, the need for rapid processing, varying taka sizes, and a lack of training data) that could be obstacles to the deployment of a real-time taka detection and recognition system. Combined, these factors could reduce the effectiveness and precision of the system, leading to erroneous classifications or false alarms. Moreover, solving these problems requires effective algorithms that can manage such variances, which might demand considerable time and resources. Furthermore, the dataset is not compared with foreign currency datasets in this paper; we would like to address this in future work. In addition, we intend to evaluate our model with and without occluding objects (such as fingers, human-like items, etc.).

Ethics Statement

The authors confirm that the provided dataset and presented work strictly meet the ethics requirements for publication in Data in Brief as mentioned in https://www.elsevier.com/authors/journal-authors/policies-and-ethics. Furthermore, the data collection approach emphasizes privacy protection, consent, and the equitable representation of taka instances, ensuring a diverse and unbiased compilation.

CRediT Author Statement

Md. Jubayar Alam Rafi: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Visualization. Mohammad Rony: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data Curation, Writing - Original Draft, Visualization. Nazia Majadi: Conceptualization, Methodology, Validation, Formal analysis, Writing - Review & Editing, Visualization, Supervision.

Acknowledgment

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability

References

  • 1. Hendrawan A., Gernowo R., Nurhayati O.D., Warsito B., Wibowo A. Improvement object detection algorithm based on YOLOv5 with BottleneckCSP. 2022 IEEE International Conference on Communication, Networks and Satellite (COMNETSAT), 2022.
  • 2. Tasnim R., Pritha S.T., Das A., Dey A. Bangladeshi banknote recognition in real-time using convolutional neural network for visually impaired people. 2021 2nd International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), 2021.
  • 3. Hasanuzzaman F.M., Yang X., Tian Y. Robust and effective component-based banknote recognition for the blind. IEEE Trans. Syst. Man Cybern. Part C. 2012;42:1021–1030. doi: 10.1109/TSMCC.2011.2178120.
  • 4. Murad H., Tripto N., Ali M. Developing a Bangla currency recognizer for visually impaired people. 2019; pp. 1–5.
  • 5. Dande S., Uppunuri G., Raghuvanshi A. YOLOv5 based web application for Indian currency note detection. Int. Res. J. Eng. Technol. 2022:2801–2807.
  • 6. Zhang Q., Yan W.Q. Currency detection and recognition based on deep learning. 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2018.
  • 7. Mittal S., Mittal S. Indian banknote recognition using convolutional neural network. 2018 3rd International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU), 2018.
  • 8. Shorten C., Khoshgoftaar T.M. A survey on image data augmentation for deep learning. J. Big Data. 2019;6:60. doi: 10.1186/s40537-019-0197-0.
  • 9. Sonka M., Hlavac V., Boyle R. Image pre-processing. In: Image Processing, Analysis and Machine Vision. Springer US, Boston, MA; 1993. pp. 56–111.
  • 10. Roboflow data annotation tool. Available: https://roboflow.com/annotate. Last accessed 20 October 2023.

Articles from Data in Brief are provided here courtesy of Elsevier
