Data in Brief. 2025 Aug 12;62:111968. doi: 10.1016/j.dib.2025.111968

RoseLeafInsight: A high-resolution image dataset for rose leaf disease recognition

Arnob Das Shacha 1, Sabbir Hossain Durjoy 1, Md Emon Shikder 1, Md Mostafa Kamal 1, Md Mehedi Hasan Shoib 1, Md Hasan Imam Bijoy 1
PMCID: PMC12396392  PMID: 40896128

Abstract

The rose (genus Rosa) is one of the most attractive and commercially valuable flower genera and has become a significant part of the Bangladeshi flower industry, both for export and for local consumption. However, rose farming in Bangladesh faces serious challenges from leaf diseases, which weaken the plants and result in lower flower yields and financial losses for farmers. Other challenges, such as pesticide resistance and several natural factors, also reduce the quantity and quality of healthy flowers. Most farmers in this industry have limited education, which hinders their ability to identify early-stage rose leaf disease through visual inspection alone. Limited communication with agricultural experts exacerbates the situation, leading to delayed interventions and economic losses. This study presents a rose leaf disease dataset intended to support disease tracking, diagnosis, and research in roses. From October 2024 to January 2025, large-scale field surveys were conducted to capture high-quality images of each rose leaf condition class. The dataset comprises four classes representing different leaf conditions: ‘Black Spot,’ ‘Insect Hole,’ ‘Yellow Mosaic Virus,’ and ‘Healthy.’ There are 3,228 original images, categorized as follows: Black Spot (409), Insect Hole (453), Yellow Mosaic Virus (680), and Healthy (1,686). During pre-processing, the images were resized to 3000×3000 pixels, and low-quality, duplicate, or irrelevant images were removed. We employed various augmentation techniques, including rotation, flipping, contrast adjustment, blurring, shearing, zooming, and noise addition, to increase the dataset size and enhance model generalization.
Datasets like this one are in demand for agricultural research aimed at improved disease management and increased yields. These goals can be pursued through high-accuracy machine-learning models for early disease detection and cause identification, which give farmers more time to act on disease prevention and pest control. This approach combines agriculture with computer science and AI to make precision agriculture more effective and efficient. Our dataset is designed to meet the need for data to train such models and to provide a baseline benchmark for disease detection in roses. Further improvements in models, along with other scientific advances, can raise efficiency further and ultimately result in better, smarter farms. In our initial testing for categorizing rose leaves, we employed two well-known transfer learning models, VGG16 and MobileNetV2. MobileNetV2 performed best, achieving an accuracy of 96.79% in image classification. The dataset could also support innovative farming equipment, such as drones and sensors, that monitors large fields in real time, and it serves as a benchmark for training deep learning models for automated monitoring and decision-making in precision agriculture.

Keywords: genus Rosa, Rose leaf disease, Deep learning, Image dataset, Disease classification, Precision agriculture


Specifications Table

Subject Computer Sciences
Specific subject area Computer Vision, Image Processing, Image Classification, Machine Learning.
Type of data Image (.JPG)
Data collection The RoseLeafInsight dataset was collected from two distinct locations in Bangladesh: Zailla, Singair, Manikganj, and Golap Gram, Sadullapur-Komolapur, Road Birulia Bridge, Dhaka. A total of 3,228 high-resolution images were captured, representing four categories of rose leaves: Healthy, Black Spot, Insect Hole, and Yellow Mosaic Virus. Images were resized to 3000×3000 pixels, backgrounds were removed, and brightness was enhanced to standardize inputs. This dataset offers a four-class collection for developing deep learning models in plant health monitoring and disease detection.
The images were captured using: (i) iPhone 12, (ii) Redmi Note 10 Pro Max.
Data source location
  • 1

    Zailla, Singair, Manikganj

    Latitude: 23°47′46.11"N, Longitude: 90°13′15.73"E

  • 2

    Golap gram, Sadullapur- Komolapur, Road Birulia Bridge, Dhaka 1216

    Latitude: 23°50′6.108″N, Longitude: 90°18′31.5108″E

Data accessibility Repository name: Mendeley Data
Data identification number: 10.17632/8chrjdxn79.1
Direct URL to data: https://data.mendeley.com/datasets/8chrjdxn79/2
The dataset is publicly available and can be accessed via the provided Mendeley Data repository link.
Related research article None

1. Value of the Data

  • This dataset facilitates the identification of several conditions affecting rose plants, such as black spot, insect holes, and yellow mosaic virus. Detecting them at an early stage allows timely intervention, helping keep roses healthier and more robust.

  • Researchers can create, train, and test classification algorithms to distinguish between healthy and diseased leaves, enabling the development of real-time monitoring tools that support informed decisions in precision agriculture.

  • By integrating plant pathology and computer vision, the dataset aims to bring together agricultural experts and technologists in efforts to achieve better sustainable farming and high-tech agricultural advances.

  • The dataset contributes to the creation of educational resources and training materials for students and professionals, offering practical, real-world data for learning and experimentation in plant disease detection.

2. Background

The rose is an ornamental plant of global economic importance [1]. It is used in cosmetics, perfumes, and medicine; rose oils and petals are known for their soothing character [2]. Like many other plants, roses are susceptible to black spot infection, insect holes, and yellow mosaic virus, which impair plant health and reduce growth and flower production [3,4]. These conditions must be diagnosed early and managed promptly to prevent significant damage. Traditional disease detection relies on manual inspection, which is slow, error-prone, and often fails to identify issues at an early stage [5]. The Rose Leaf Disease Dataset was designed with this in view. Its images were collected from different locations in Bangladesh, under different environmental conditions and at different growth phases of the rose plants, to create a representative dataset of disease manifestations. The images fall into four categories: ‘Healthy’, ‘Black Spot’, ‘Insect Hole’, and ‘Yellow Mosaic Virus’. This variety is intended to make the dataset broadly useful for training machine learning models. The dataset thus enables the development of AI-powered tools for the rapid and accurate identification of diseases in rose plants [6]. Such tools can help farmers identify early signs of disease, enabling timely action to prevent crop loss and maintain plant health. The dataset also opens avenues for collaboration between agriculture and technology in the pursuit of better farming solutions [7]. Training computer vision models on this data and developing more advanced disease-management tools can support more sustainable and productive rose farming, with healthier plants, higher yields, and better long-term outcomes for the industry.

3. Data Description

This Rose Leaf Disease Dataset includes 3228 high-resolution images collected from the Golap Gram and Manikganj areas of Bangladesh. The varieties of rose leaves used in the study are Doble White Rose, Macho Man, and Lady Bird. Additionally, the rose plants are approximately five months old. These images have been captured to reflect the real variations in natural rose leaf conditions under changing environmental circumstances. Fig. 1 illustrates the natural environment from which rose plants were collected, providing context for the origin of the dataset and the real-world conditions in which these images were captured.

Fig. 1.

The rose field where the dataset images were captured.

We classified Rose Leaf images into four classes. Each class represents one condition of the leaf: Healthy, Black Spot, Insect Hole, and Yellow Mosaic Virus. Table 1 showcases the distribution of the dataset's classes and the total number of images in each category.

Table 1.

Statistics of the Rose leaf dataset.

Serial No | Classes (Leaf) | Number of Images
1 | Healthy | 1686
2 | Insect Hole | 453
3 | Yellow Mosaic Virus | 680
4 | Black Spot | 409
Total | | 3228
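
The class counts in Table 1 are uneven, which is worth keeping in mind when choosing a sampling or augmentation strategy. As a minimal sketch (the dictionary and variable names are illustrative and not part of the dataset), the share of each class and the ratio between the largest and smallest class can be computed as follows:

```python
# Class counts copied from Table 1.
counts = {
    "Healthy": 1686,
    "Insect Hole": 453,
    "Yellow Mosaic Virus": 680,
    "Black Spot": 409,
}

total = sum(counts.values())

# Percentage share of each class among the original images.
shares = {name: 100 * n / total for name, n in counts.items()}

# Ratio between the most and the least frequent class.
imbalance_ratio = max(counts.values()) / min(counts.values())

for name, share in shares.items():
    print(f"{name}: {counts[name]} images ({share:.1f}%)")
print(f"Total: {total}, largest/smallest class ratio: {imbalance_ratio:.2f}")
```

Healthy leaves make up a little over half of the original images, while Black Spot is the smallest class.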

We took several steps to ensure that our image collection process was straightforward, high-quality, and unbiased. We captured all the images in proper light so that they were clear. To remove distractions, we used a simple white background, making it easy for a model to focus on the subject without visual noise. After image collection, annotation and labeling were carefully carried out by Professor Dr. M. A. Rahim, the Head of the Department of Agricultural Science at Daffodil International University (DIU), Dhaka, Bangladesh. His long experience with plant diseases helped make the dataset precise, reliable, and well-balanced. These processes helped us build a diverse and unbiased dataset. Table 2 shows the details of the dataset, including the number of images for each type, a short description, and sample images to help explain each category.

Table 2.

Dataset summary of Rose leaf.

Class | Total images | Description | Sample Images
Healthy | 1686 | Healthy rose leaves are vital for photosynthesis, nutrient distribution, and overall plant health, helping to prevent diseases and support growth [8]. | (image)
Insect Hole | 453 | Damage caused by small insects such as aphids, caterpillars, or beetles that nibble on rose plant leaves, making small holes [9]. | (image)
Yellow Mosaic Virus | 680 | Characterized by yellow patterns, rings, or mottling on rose leaves. It is caused by Prunus necrotic ringspot virus (PNRSV), Apple mosaic virus (ApMV), and Arabis mosaic virus (ArMV) [10]. | (image)
Black Spot | 409 | Black spot is one of the most common fungal diseases in roses, caused by the fungus Diplocarpon rosae. It causes dark, circular spots on the leaves, generally with a yellow border around them [11]. | (image)

One related dataset, by Sazzad et al. [12], provides images for classifying rose leaf diseases. That dataset is useful, but it has some limitations. Our dataset is larger, with a total of 3,228 images in four categories: Healthy, Black Spot, Insect Hole, and Yellow Mosaic Virus. This should help improve machine learning models for the early detection of rose plant diseases [13] and, in turn, support better health management of rose plants and better diagnostic tools.

We compared our dataset with this existing dataset. Table 3 compares the number of images and the disease categories in each and shows where our dataset provides more extensive information.

Table 3.

Comparison with Available Datasets of Rose leaf.

Class | Our Dataset (number of images) | S. Sazzad et al. [12] (number of images)
Healthy | ✔ (1686) | ✔ (404)
Insect Hole | ✔ (453) | ×
Yellow Mosaic Virus | ✔ (680) | ✔ (200)
Black Spot | ✔ (409) | ✔ (313)

Table 3 shows that our dataset surpasses that of S. Sazzad et al. [12] in several key aspects. First, our dataset is larger, containing a greater number of images for each class. Second, we have introduced a new category, “Insect Hole,” which was not covered by S. Sazzad et al. [12]. This addition makes our dataset more comprehensive, covering a condition that was previously neglected. Lastly, our dataset offers higher image resolution, with images of 3000 × 3000 pixels compared to 512 × 512 pixels in Sazzad et al. [12]. Since each of our images has approximately 34 times as many pixels, it supports better feature extraction, making it easier to detect subtle disease patterns and textures. This can improve the accuracy of disease detection models, making our dataset more valuable for deep learning applications. By providing a larger, higher-resolution, and more diverse dataset, our research improves the scope and precision of rose leaf disease detection.
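
The factor of roughly 34 follows directly from comparing the pixel counts of the two image sizes:

```latex
\frac{3000 \times 3000}{512 \times 512} = \frac{9{,}000{,}000}{262{,}144} \approx 34.3
```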

We collected our data from October 2024 to January 2025. Pictures of leaves were taken on different days and at different times of day. We chose different locations to gather a variety of leaf types. All important information, such as the class of leaf, weather, date, temperature, location, and devices used for taking the pictures, was recorded. This information is given in Table 4.

Table 4.

Collection Details of Rose Leaf Dataset.

Date | Class Name | Weather | Temperature | Camera Device | Location
10 October 2024 | Healthy | Sunny | 32°C | iPhone 12 (60%) and Redmi Note 10 Pro Max (40%) | Golap Gram and Manikganj
7 November 2024 | Insect Hole | Sunny | 30°C | iPhone 12 (30%) and Redmi Note 10 Pro Max (70%) | Manikganj
3 December 2024 | Yellow Mosaic Virus | Sunny | 28°C | iPhone 12 (20%) and Redmi Note 10 Pro Max (80%) | Golap Gram
17 January 2025 | Black Spot | Foggy | 24°C | iPhone 12 (75%) and Redmi Note 10 Pro Max (25%) | Golap Gram and Manikganj
19 January 2025 | Healthy | Sunny | 28°C | iPhone 12 (45%) and Redmi Note 10 Pro Max (55%) | Manikganj
23 January 2025 | Yellow Mosaic Virus | Sunny | 22°C | iPhone 12 (75%) and Redmi Note 10 Pro Max (25%) | Golap Gram and Manikganj

4. Experimental Design, Materials and Methods

4.1. Experimental Design

Our dataset of rose leaf images was collected from Golap Gram and Manikganj between October 2024 and January 2025. We used two smartphone cameras to capture high-quality images of leaves under natural light conditions. The dataset contains pictures of both healthy leaves and leaves affected by black spot, insect holes, and yellow mosaic virus. The images were captured under various weather and lighting conditions to enhance the dataset's diversity. The geographic coordinates of the collection sites are as follows:

  • 1.

    Zailla, Singair, Manikganj

    Latitude: 23°47′46.11"N, Longitude: 90°13′15.73"E

  • 2.

    Golap gram, Sadullapur- Komolapur, Road Birulia Bridge, Dhaka 1216

    Latitude: 23°50′6.108″N, Longitude: 90°18′31.5108″E

After collection, the images were prepared for analysis by replacing the background with white to remove background noise and resizing every image to a uniform 3000×3000 pixels. Further cleaning removed duplicate and poor-quality images, leaving only high-quality data for model training. We split the data into two portions: 80% for training and 20% for validation. In the model generation step, a pre-trained deep learning model, such as MobileNetV2, is then trained to identify and classify the different leaf conditions. Finally, the model evaluation phase measures how accurately the system classifies images, using accuracy, precision, and recall metrics. Fig. 2 shows this workflow, in which a machine-learning model classifies the dataset images into classes for the early detection and management of rose leaf diseases.

Fig. 2.

The method by which the diseases of rose leaves are evaluated.
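
The 80/20 split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the paper does not say whether the split was stratified by class or which tool was used, so the per-class shuffling, the seed, the function name `stratified_split`, and the placeholder file names are all hypothetical.

```python
import random


def stratified_split(files_by_class, train_fraction=0.8, seed=42):
    """Split each class's file list into training and validation subsets."""
    rng = random.Random(seed)
    train, val = {}, {}
    for label, files in files_by_class.items():
        shuffled = list(files)
        rng.shuffle(shuffled)
        # Number of training files for this class, rounded to the nearest integer.
        cut = round(len(shuffled) * train_fraction)
        train[label] = shuffled[:cut]
        val[label] = shuffled[cut:]
    return train, val


# Placeholder file names standing in for the real images (counts from Table 1).
counts = {"Healthy": 1686, "Insect Hole": 453, "Yellow Mosaic Virus": 680, "Black Spot": 409}
files_by_class = {
    label: [f"{label}_{i}.jpg" for i in range(n)] for label, n in counts.items()
}

train, val = stratified_split(files_by_class)
for label in files_by_class:
    print(label, len(train[label]), len(val[label]))
```

Splitting within each class keeps the class proportions of the training and validation sets close to those of the full dataset.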

4.2. Materials (Camera Specification)

Data acquisition was performed using two smartphones: the iPhone 12 and the Redmi Note 10 Pro Max. The iPhone 12's camera system features a dual 12 MP setup, including a primary wide-angle lens with an f/1.6 aperture that enables high-quality image capture even in low-light conditions, preserving fine details and texture variations. Its ultra-wide camera, with an f/2.4 aperture and a 120° field of view, was used for whole-leaf or multi-leaf shots. The Redmi Note 10 Pro Max captured high-resolution images in variable lighting conditions using its 108 MP primary camera, which features an f/1.9 aperture. Its 8 MP ultra-wide lens, with a 118° field of view, framed complete leaf structures, while a 2 MP macro lens captured close-up views of lesions, discoloration, and necrosis. Additionally, a depth sensor enhanced portrait shots, providing a more precise representation of the health of the leaves from all angles.

4.3. Image pre-processing and classification

First, we collected leaves from the field and photographed them. The images were then stored, resized to a uniform size, labeled, and organized into categories to ensure the high quality of our dataset. Together with the augmentation described in Section 4.4, these steps help a model perform effectively on new data. This process ensures that the dataset is ready for training an accurate and reliable machine-learning model for detecting and classifying various conditions of rose leaves, supporting disease management and health monitoring in agriculture. Fig. 3 illustrates the detailed process of cleaning the dataset, from raw data to the final processed data. A well-prepared dataset is essential for subsequent analysis, especially in machine learning and data science work on the detection and classification of rose leaf diseases. We hope that timely and accurate detection will help reduce yield losses and keep crops healthy.

  • 1.

    Rose fields: We first identified the locations of rose fields. To make this dataset comprehensive and representative of the disease manifestations, images were gathered in various geographical locations of Bangladesh under diverse conditions and at different growth phases of the rose plants.

  • 2.

    Leaf Collection: We manually collected rose leaves from plants, ensuring they included both diseased and healthy leaves. The leaves were handled carefully to avoid physical damage.

  • 3.

    Image Capture: We used two smartphone cameras to capture high-quality images of leaves directly from the plant under natural conditions. All images were taken in natural light to minimize shadows and reflections, which could compromise quality and lead to incorrect classification.

  • 4.

    Image Organization: The captured rose images were transferred to a secure storage medium, in a dedicated folder on a computer. Proper organization at this stage of the project enabled efficiency in subsequent steps.

  • 5.

    Image Resizing: Because the images were captured with different cameras, their aspect ratios were not the same. In this step, we resized the images to 3000 × 3000 pixels.

  • 6.

    Image Labeling: Each image was labeled with a category such as “Healthy”, “Black Spot”, “Yellow Mosaic Virus,” or “Insect Hole” to ensure the accuracy of classification for later use.

  • 7.

    Classification: Categorized images were organized into separate folders according to their categories. This organization made the dataset easier to manage and prepare for further processing.

Fig. 3.

Workflow of image pre-processing and classification.

We maintained high quality and consistency throughout the process through background removal, resizing, brightness enhancement (by a factor of 1.2), and labeling. The result is illustrated in Fig. 4. After completing all the processes, the dataset comprised a total of 3228 rose leaf images covering both healthy and unhealthy leaves.

Fig. 4.

Processed images from the Rose Leaf Disease Dataset.
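
To make the brightness step concrete, the sketch below scales each 8-bit colour channel by the stated factor of 1.2 and clips the result at 255. It is an illustration only: the paper does not name the tool used, the function names are hypothetical, and the tiny pixel grid is made up. With an imaging library the same operation is typically a single call (for example, Pillow's `ImageEnhance.Brightness(image).enhance(1.2)`).

```python
def brighten_pixel(rgb, factor=1.2):
    """Scale each 8-bit channel by `factor`, clipping at 255."""
    return tuple(min(255, round(channel * factor)) for channel in rgb)


def brighten_image(pixels, factor=1.2):
    """Apply brighten_pixel to a row-major grid of RGB tuples."""
    return [[brighten_pixel(p, factor) for p in row] for row in pixels]


# Tiny 2x2 stand-in for a leaf image on a white background (made-up values).
sample = [
    [(255, 255, 255), (40, 120, 30)],
    [(100, 200, 50), (230, 240, 220)],
]

for row in brighten_image(sample):
    print(row)
```

White background pixels stay white because they are already at the maximum value, while darker leaf pixels become proportionally brighter.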

4.4. Data augmentation

Data augmentation is crucial for the robustness and generalization of deep learning models trained on the Rose Leaf Disease dataset. Because data availability is limited and models must perform well under diverse real-world conditions, the training dataset needs to be expanded synthetically. In our work, we employed various augmentation techniques, including rotation, flipping, contrast adjustment, blurring, shearing, zooming, and noise addition, to increase image diversity for model training. These augmentations are intended to improve the model's learning and its predictive accuracy. The transformations were applied to the available images, expanding the dataset to a total of approximately 12,000 images. The augmentation process consisted of a series of transformations designed to reflect changes that occur naturally. Horizontal and vertical flipping simulated various orientations and perspectives of the rose leaves, helping the models remain invariant to these changes. Rotation of up to 45 degrees introduced angular variability, while scaling by a factor of 0.8 to 1.2 varied the apparent size of the leaves. We applied Gaussian blur with a radius ranging from 0.5 to 2.5. To accommodate different lighting conditions, we adjusted the contrast randomly by a factor between 0.7 and 1.3, which helps the model cope with images whose brightness differs. We added a slight shear between -0.3 and +0.3 to mimic how rose leaves appear slightly tilted in the real world. We also zoomed in on each picture by a factor of 1.5 and then cropped it back to its original size, allowing the model to capture small details. Finally, we added random noise between -60 and +60 to improve the model's robustness to images that are not particularly clear. Table 5 lists the number of original and augmented images for each class.

Table 5.

Number of original and augmented images in the Rose Leaf Disease Dataset.

Serial Number | Class Name | Number of Original images | Number of Augmented images
1 | Healthy | 1686 | 3000
2 | Insect Hole | 453 | 3000
3 | Yellow Mosaic Virus | 680 | 3000
4 | Black Spot | 409 | 3000
Total | | 3228 | 12,000
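
The parameter ranges listed above can be collected into a small sampling routine. The sketch below is an assumption-laden illustration, not the authors' pipeline: the uniform sampling, the 50% flip probability, the symmetric rotation range of ±45 degrees, and all names (`AugmentParams`, `sample_params`, `add_noise`) are hypothetical. Only the numeric ranges come from the text.

```python
import random
from dataclasses import dataclass


@dataclass
class AugmentParams:
    flip_horizontal: bool
    flip_vertical: bool
    rotation_deg: float
    scale: float
    blur_radius: float
    contrast: float
    shear: float
    zoom: float
    noise_amplitude: int


def sample_params(rng):
    """Draw one set of augmentation parameters from the ranges stated in the text."""
    return AugmentParams(
        flip_horizontal=rng.random() < 0.5,   # assumed probability
        flip_vertical=rng.random() < 0.5,     # assumed probability
        rotation_deg=rng.uniform(-45, 45),    # "up to 45 degrees", sign assumed
        scale=rng.uniform(0.8, 1.2),
        blur_radius=rng.uniform(0.5, 2.5),
        contrast=rng.uniform(0.7, 1.3),
        shear=rng.uniform(-0.3, 0.3),
        zoom=1.5,                             # fixed zoom, then crop to original size
        noise_amplitude=60,
    )


def add_noise(value, rng, amplitude=60):
    """Add uniform integer noise in [-amplitude, amplitude] to one 8-bit channel value."""
    return max(0, min(255, value + rng.randint(-amplitude, amplitude)))


rng = random.Random(0)
print(sample_params(rng))
print(add_noise(128, rng))
```

Each sampled parameter set would then be passed to whichever image library performs the actual transformations.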

All these augmentation methods were intended to make the variations captured by the dataset as broad as possible, so that the dataset can serve as an effective resource for training deep learning models with enhanced generalization capabilities. By presenting the model with more varied training examples, augmentation helps avoid overfitting, a scenario in which a model performs well on the training data but fails to generalize to unseen data. Models trained this way are better prepared to recognize and classify rose leaf diseases, even in completely new or slightly modified images.
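
Because every class is brought to the same size in Table 5, the smaller classes are expanded far more than the Healthy class. The following sketch computes the expansion factor per class, assuming the figure of 3,000 is the total number of images per class in the augmented set (the variable names are illustrative):

```python
# Original counts per class (Table 5) and the common augmented size per class.
original = {
    "Healthy": 1686,
    "Insect Hole": 453,
    "Yellow Mosaic Virus": 680,
    "Black Spot": 409,
}
target_per_class = 3000

# Expansion factor: augmented images per original image in each class.
factors = {name: target_per_class / n for name, n in original.items()}

for name, factor in factors.items():
    print(f"{name}: {original[name]} -> {target_per_class} (x{factor:.2f})")
print("Augmented total:", target_per_class * len(original))
```

The result is a class-balanced augmented set, with each Black Spot original contributing roughly four times as many augmented images as each Healthy original.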

In this respect, data augmentation has become a crucial step in constructing robust and high-performance models for the early and accurate detection of leaf diseases, thereby supporting precision agriculture. This will enable farmers to implement timely and efficient disease management practices.

4.5. Dataset structure

The dataset images are divided into three main folders: Original Image, Processed Image, and Augmented Image. Each contains four different conditions of rose leaves. Fig. 5 provides a visual representation of the Rose Leaf Disease dataset organization. The folder “Original Image” contains the original, unprocessed images of rose leaves as they appear in nature. Inside, they are sorted into four subfolders, namely Healthy for leaves with no disease, Black Spot for leaves infected with black spot disease, Insect Hole for leaves damaged by insects, and Yellow Mosaic Virus for leaves infected with Yellow Mosaic Virus.

Fig. 5.

Rose Leaf Disease dataset organization.

The “Processed Image” folder contains images that have been enhanced for model training. Each image has undergone background removal, resizing to 3000×3000 pixels, and a brightness increase (factor 1.2) for consistency. The “Augmented Image” folder contains images produced by a range of augmentation techniques, including rotation, flipping, contrast adjustment, blurring, shearing, zooming, and adding noise, applied to make models more adaptable and better able to recognize features under various conditions. Like the Original Image folder, the Processed Image and Augmented Image folders are divided into the same four categories, making it easy to compare and use the images for training and evaluation. This organization keeps the raw and pre-processed images separate, making the dataset easy to use in research and modeling.
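
After downloading the dataset, the folder layout described above can be checked with a short script. This is a sketch under assumptions: the folder names are taken from the description in this section and may differ in spelling or case in the actual repository, and the function name `count_images` and the accepted file suffixes are illustrative.

```python
from pathlib import Path

# Folder names as described in this section; the repository may spell them differently.
TOP_LEVEL = ("Original Image", "Processed Image", "Augmented Image")
CLASSES = ("Healthy", "Black Spot", "Insect Hole", "Yellow Mosaic Virus")
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def count_images(root):
    """Return {top_level_folder: {class_name: image_count}} for the layout above."""
    root = Path(root)
    summary = {}
    for top in TOP_LEVEL:
        summary[top] = {}
        for cls in CLASSES:
            folder = root / top / cls
            if folder.is_dir():
                n = sum(
                    1
                    for p in folder.iterdir()
                    if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
                )
            else:
                n = 0  # a missing folder is reported as empty
            summary[top][cls] = n
    return summary
```

Calling `count_images("path/to/dataset")` (a placeholder path) and comparing the result with Tables 1 and 5 gives a quick integrity check on a local copy.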

4.6. Model validation

We applied two transfer learning models, VGG16 and MobileNetV2, to classify rose leaves into four categories: ‘Black Spot’, ‘Insect Hole’, ‘Yellow Mosaic Virus’, and ‘Healthy’. We used a batch size of 32, 30 epochs of training, and an adaptive learning rate of 0.0001 to keep training stable and limit overfitting.
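
As a rough worked example, these settings can be combined with the 12,000 augmented images and the 80/20 split reported for this experiment to estimate the training workload. The sketch assumes one optimizer update per batch and that the final partial batch, if any, is kept; neither detail is stated in the paper.

```python
import math

total_images = 12_000   # augmented dataset size reported for this experiment
train_fraction = 0.8    # 80% training, 20% validation
batch_size = 32
epochs = 30

train_images = round(total_images * train_fraction)
val_images = total_images - train_images

# Batches per pass over the training and validation sets.
steps_per_epoch = math.ceil(train_images / batch_size)
val_steps = math.ceil(val_images / batch_size)

# Total number of weight updates over the whole run.
total_updates = steps_per_epoch * epochs

print(f"train={train_images}, val={val_images}")
print(f"steps/epoch={steps_per_epoch}, val steps={val_steps}, updates={total_updates}")
```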

After augmentation, the Rose Leaf Disease dataset consisted of 12,000 images. We used 80% of the images for training and 20% for model validation. To verify model performance, we plotted accuracy and loss curves for the training and validation processes. These plots helped in examining the learning process and identifying potential problems such as overfitting, where the model performs well on training data but generalizes poorly to validation data, and underfitting, where the model fails to capture significant patterns. Fig. 6 shows a visual summary of the proposed computer-aided system for classifying rose leaves. In this study, we determine accuracy, precision, recall, and F1-score using the following formulas [14]:

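$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}$$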

Fig. 6.

The visual abstract for rose leaf classification.

Here, TP is True Positives, TN is True Negatives, FP is False Positives, FN is False Negatives.
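
For a multi-class problem such as this one, these quantities are usually obtained from a confusion matrix by treating each class in turn as the positive class (one-vs-rest). The sketch below illustrates Eqs. (2) to (4) per class and overall accuracy as the fraction of correctly classified samples. The function names and the small example matrix are made up for illustration and are not results from the paper.

```python
def per_class_metrics(confusion, class_index):
    """Precision, recall and F1 for one class of a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[class_index][class_index]
    fp = sum(confusion[i][class_index] for i in range(n)) - tp  # predicted as class, but wrong
    fn = sum(confusion[class_index]) - tp                       # true class, but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Made-up 2x2 example: rows are true classes, columns are predicted classes.
example = [
    [8, 2],
    [1, 9],
]

precision, recall, f1 = per_class_metrics(example, 0)
print(f"precision={precision:.3f}, recall={recall:.3f}, f1={f1:.3f}")
print(f"accuracy={overall_accuracy(example):.2f}")
```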

The MobileNetV2 model’s performance in classifying rose leaves into four categories — ‘Black Spot’, ‘Insect Hole’, ‘Yellow Mosaic Virus’, and ‘Healthy’ — is presented in Table 6. It achieves an accuracy of 96.79%. Fig. 7 shows the MobileNetV2 accuracy and loss curve, and Fig. 8 showcases the confusion matrix for MobileNetV2.

Table 6.

Performance Metrics for MobileNetV2.

Class Name | Precision (%) | Recall (%) | F1-Score (%)
Healthy Leaf | 99 | 99 | 99
Insect Hole | 99 | 99 | 99
Yellow Mosaic Virus | 93 | 96 | 95
Black Spot | 97 | 93 | 95
Overall accuracy (%): 96.79

Fig. 7.

Accuracy and loss curve for MobileNetV2.

Fig. 8.

Confusion matrix for MobileNetV2.

The performance of the VGG16 model in classifying rose leaves into four categories is summarized in Table 7. It achieves 92.00% accuracy. Fig. 9 shows the VGG16 accuracy and loss curve, and Fig. 10 showcases the confusion matrix for VGG16.

Table 7.

Performance Metrics for VGG16.

Class Name | Precision (%) | Recall (%) | F1-Score (%)
Healthy Leaf | 95 | 98 | 96
Insect Hole | 94 | 94 | 94
Yellow Mosaic Virus | 89 | 89 | 89
Black Spot | 89 | 87 | 88
Overall accuracy (%): 92.00

Fig. 9.

Accuracy and loss curve for VGG16.

Fig. 10.

Confusion matrix for VGG16.

Table 8 lists the accuracy of each model. Comparing the two, MobileNetV2 outperformed VGG16, with 96.79% accuracy against 92.00%.

Table 8.

Overall Model Accuracy.

Model | Accuracy (%)
VGG16 | 92.00
MobileNetV2 | 96.79

4.7. Data annotation protocol

The data annotation and labeling process was carried out by Professor Dr. M. A. Rahim, Head of the Department of Agricultural Science at Daffodil International University (DIU), Dhaka, Bangladesh, an expert agronomist with extensive expertise in plant disease diagnosis and classification. The process involved the following steps:

  • 1.

    Initial Screening: Each image was meticulously reviewed for quality to ensure clarity and adequate representation of disease symptoms. Images with low quality or insufficient details were excluded from the dataset.

  • 2.

    Class Assignment: Based on visible symptoms such as discoloration, necrosis, and deformations, each image was categorized into one of four disease classes: Healthy, Black Spot, Insect Hole, and Yellow Mosaic Virus.

  • 3.

    Verification: Following initial labeling, the annotations were thoroughly reviewed to confirm accuracy and maintain consistency across the dataset.

This expert-driven data annotation and labeling process ensures the creation of a high-quality dataset, suitable for machine learning-based plant disease detection and classification tasks.

Limitations

This study has some limitations. Since the dataset was collected from specific areas, models trained on it may not perform well in other regions. Moreover, changes in lighting and environmental conditions during image capture may also affect a model's real-world performance. Another limitation is that the images were collected against a plain background, whereas real-life scenarios often contain complex and cluttered backgrounds, which could affect model accuracy. Even with augmented images added, the original 3228 images may still not be enough for complex models.

Ethics Statement

We confirm that our study was conducted in full compliance with the relevant ethical guidelines and regulations. No harm was inflicted on plants, animals, or humans during the research. Additionally, no data was obtained from social media platforms. All authors affirm their adherence to the ethical standards required for publication in Data in Brief.

CRediT authorship contribution statement

Arnob Das Shacha: Conceptualization, Data curation, Methodology, Writing – original draft. Sabbir Hossain Durjoy: Conceptualization, Methodology, Writing – original draft, Data curation. Md. Emon Shikder: Conceptualization, Visualization, Writing – original draft. Md Mostafa Kamal: Conceptualization, Data curation, Methodology. Md Mehedi Hasan Shoib: Conceptualization, Methodology, Writing – original draft. Md Hasan Imam Bijoy: Conceptualization, Supervision, Formal analysis, Writing – review & editing.

Acknowledgements

We extend our heartfelt gratitude to Professor Dr. M. A. Rahim, Head of the Department of Agricultural Science at Daffodil International University (DIU), Dhaka, Bangladesh, for his invaluable expertise in data validation. His insightful feedback and unwavering support were instrumental in the successful completion of this project.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability

References

  • 1. S. Poornima, R. Divya, R. S. Krishnan, S. Jegadeesan, G. Yamini, and G. V. Rajkumar, “Advancing Rose Disease Diagnosis: A Deep Learning Framework with EfficientNet-B7,” pp. 1747–1754, Oct. 2024, doi: 10.1109/i-smac61858.2024.10714594.
  • 2. Saidul Islamm Nayan, Syed Mahfuzur Rahman, and Nosin Ibna Mahbub, “RoseVision: An Android Application To Detect Rose Leaf Diseases Using Modified Convolutional Neural Network,” Dec. 2023, doi: 10.1109/iccit60459.2023.10537174.
  • 3. Cut flowers and foliages. 2021. doi: 10.1079/9781789247602.0000.
  • 4. Kebert M., et al. Metabolically Tailored Selection of Ornamental Rose Cultivars through Polyamine Profiling, Osmolyte Quantification and Evaluation of Antioxidant Activities. Horticulturae. Apr. 2024;10(4):401. doi: 10.3390/horticulturae10040401.
  • 5. Hussein M., Abbas A.H. Plant Leaf Disease Detection Using Support Vector Machine. Al-Mustansiriyah J Sci. Aug. 2019;30(1):105. doi: 10.23851/mjs.v30i1.487.
  • 6. L. R. Chaudhari, A. V. Tidke, P. S. Gupta, P. Patel, and Shipla Hudnurkar, “Roses Plant Disease Detection using DenseNet121 Model,” pp. 1–5, Oct. 2024, doi: 10.1109/icisaa62385.2024.10828973.
  • 7. Jalal Syafiq Fauzi Kamarulzaman, Abu Md Arafatur Rahman, Uddin M. A Comprehensive Review on Deep Learning Assisted Computer Vision Techniques for Smart Greenhouse Agriculture. IEEE Access. Jan. 2024;12:4485–4522. doi: 10.1109/access.2024.3349418.
  • 8. Shoib Md.M.H., Saeem S., Tonima A.B.A., Mojumdar M.U. IDBGL: A unique image dataset of black gram (Vigna mungo) leaves for disease detection and classification. Data Brief. Jan. 2025;59. doi: 10.1016/j.dib.2025.111347.
  • 9. Durjoy S.H., Shikder Md.Emon, Mojumdar Mayen Uddin. A Comprehensive Hog Plum Leaf Disease Dataset for Enhanced Detection and Classification. Data Brief. Jan. 2025:111311. doi: 10.1016/j.dib.2025.111311.
  • 10. Talasila S., Rawal K., Sethi G., MSS S. Black gram Plant Leaf Disease (BPLD) dataset for recognition and classification of diseases using computer-vision algorithms. Data Brief. Dec. 2022;45. doi: 10.1016/j.dib.2022.108725.
  • 11. Gachomo E.W., Dehne H.-W., Steiner U. Microscopic evidence for the hemibiotrophic nature of Diplocarpon rosae, cause of black spot disease of rose. Physiolog. Molec. Plant Pathol. Jul. 2006;69(1–3):86–92. doi: 10.1016/j.pmpp.2007.02.002.
  • 12. Sazzad S., Rajbongshi A., Shakil R., Akter B., Kaiser M.S. RoseNet: Rose leave dataset for the development of an automation system to recognize the diseases of rose. Data Brief. Aug. 2022;44. doi: 10.1016/j.dib.2022.108497.
  • 13. Ashine T. Image-Based Rose Leaf Diseases Detection Using Deep Learning. Smuc.edu.et. 2024. http://hdl.handle.net/123456789/7886
  • 14. Mahmud Md.Prince, Ali Md.Alams, Akter S., Hasan M. Lychee Tree Disease Classification and Prediction using Transfer Learning. 2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT). Oct. 2022.


Articles from Data in Brief are provided here courtesy of Elsevier
