Abstract
In the context of the increasing popularity of Big Data paradigms and deep learning techniques, we introduce a novel large-scale hyperspectral imagery dataset, termed Orbita Hyperspectral Images Dataset-1 (OHID-1). It comprises 10 hyperspectral images sourced from diverse regions of Zhuhai City, China, each with 32 spectral bands, a spatial resolution of 10 meters and a spectral range of 400–1000 nanometers. The core objective of this dataset is to elevate the performance of hyperspectral image classification and pose substantial challenges to existing hyperspectral image processing algorithms. Compared to traditional open-source hyperspectral datasets and recently released large-scale hyperspectral datasets, OHID-1 presents more intricate features and a higher degree of classification complexity by providing 7 class labels over a wider area. Furthermore, this study demonstrates the utility of OHID-1 by testing it with selected hyperspectral classification algorithms. This dataset will be useful for advancing cutting-edge research in urban sustainable development science and land use analysis. We invite the scientific community to devise novel methodologies for an in-depth analysis of these data.
Subject terms: Mathematics and computing, Environmental sciences
Background & Summary
Hyperspectral remote sensing captures copious narrow spectral bands and provides a complete image for each band. This gives it numerous advantages over conventional remote sensing with only three RGB bands:
Improved ground object classification: The high spectral resolution of hyperspectral images (HSI) makes it much easier to detect subtle differences in appearance and boundaries between objects.
Better chemical composition analysis: It is much easier to identify various materials through specific light emission and absorption characteristics.
Differential analysis: Differences between images in one band and those in another band can be highly significant.
Quantitative analysis: Working with narrow spectral bands can reduce noise and help overcome interference.
These advantages make HSI useful for many applications, including agricultural analysis1–3, weather forecasting4, land5 and ocean resource mapping6,7, and a variety of others8,9.
Classification of pixels in HSI is of fundamental importance. Classification involves using labeled data to train a model that learns the data features associated with each label by adjusting internal weights until the overall error with respect to the training set is minimized. Repeatedly adjusting these weights to find the combination that minimizes the overall error for the chosen model requires a large amount of computation. However, the huge advances in computing power in recent years have made this approach practical, and it is widely used for many purposes, including HSI analysis.
Deep Neural Networks (DNNs) have been found to be a useful supervised approach for many challenging tasks, including image parsing10–14 and natural language processing15–17. Setting up a DNN relies heavily on the availability of appropriately labelled datasets and in recent years these have been published for various tasks18–21. Examples include CIFAR-10/100 and ImageNet14 for image recognition, Microsoft COCO22, Chinese City Parking Dataset (CCPD)23,24 and PASCAL VOC25 for object detection, and ActivityNet18 and the “something something” video datasets26 for video parsing27. These datasets are large-scale and well-annotated.
Several datasets have also been published in relation to remote sensing images28–39. These can be divided into multiple categories, including standard red-green-blue (RGB) images and HSI. Exploiting HSI30,35,40–44 is popular since HSI contains rich spectral information that can better present the spatial features and material composition of ground objects. For example, HSI has been used in combination with DNNs to obtain better accuracy for object recognition by using information from specific spectral bands45, and for winter wheat analysis3,46. However, the available open-source HSI datasets are limited; the Indian Pines28, Salinas Valley32, and Pavia University29 datasets, for example, are widely used in classification research. Analyzing these datasets, we find that:
① These older datasets contain relatively small amounts of data and are typically used with conventional algorithms or shallow DNNs. Since shallow DNNs have limited learning capabilities, these networks have limited generalization performance and are not suitable for many practical purposes.
② The generalization ability of a classification model is one of the important indicators of model performance and needs to be tested on different scenes and different types of datasets, but the existing available datasets cannot meet this demand.
In the context of the increasing popularity of big data paradigms and deep learning techniques, we built a new set of hyperspectral data with complex characteristics using data from Zhuhai Orbita Aerospace Technology Co., Ltd (Orbita) and named it Orbita Hyperspectral Images Dataset-147 (OHID-1). It describes different types of areas in Zhuhai City, China.
The Necessity of opening the OHID-1 dataset
We compared OHID-147 with other HSI datasets and also used 8 well-known DNNs to estimate the baseline classification difficulty caused by the richer characteristics contained in OHID-147. The results show that the AI algorithms developed for these datasets do not perform well enough on OHID-147, which means OHID-147 provides greater challenges than those associated with the previously available HSI datasets. To mine OHID-147, we invite the scientific community to develop methods to study these data in depth. We believe that OHID-147 can contribute to the study of HSI and help advance the performance of HSI classification.
As mentioned before, an effective classification method should be tested on various kinds of scenes and different types of HSI data. From the current research perspective, many studies use data captured by airborne sensors for experimentation, because most of the public datasets were captured by airborne sensors (Table 1). The Botswana dataset is one of the few publicly available satellite datasets; it has only 1 scene and a spatial resolution of 30 m.
Table 1.
Comparison between the existing public datasets and the OHID-1 dataset.
Dataset | Resolution | Channels | Spectrum | Number of classes | Scenes | Size (pixels) | Sensor |
---|---|---|---|---|---|---|---|
Indian Pines (IP)28 | 20 m | 220 | 400–2500 nm | 16 | 1 | 145 × 145 | AVIRIS (Airborne) |
Pavia University (PU)29 | 1.3 m | 115 | 430–860 nm | 9 | 1 | 610 × 340 | ROSIS (Airborne) |
Kennedy Space Center (KSC)30 | 18 m | 224 | 400–2500 nm | 13 | 1 | 512 × 614 | AVIRIS (Airborne) |
Houston31 | 2.5 m | 144 | 364–1046 nm | 15 | 1 | 349 × 1905 | ITRES CASI-1500 (Airborne) |
Salinas Valley32 | 3.7 m | 224 | 400–2500 nm | 16 | 1 | 512 × 217 | AVIRIS (Airborne) |
Botswana33 | 30 m | 242 | 400–2500 nm | 14 | 1 | 1476 × 256 | Hyperion (Spaceborne) |
Xiongan New Area (Matiwan Village)34 | 0.5 m | 250 | 400–1000 nm | 19 | 1 | 3750 × 1580 | Full spectrum Multi-modal Imaging spectrometer (Airborne) |
ShanDongFeiCheng (SDFC)35 | 10 m | 63 | 400–1000 nm | 19 | 2 | 2000 × 2700, 2100 × 2840 | Gaofen special project aviation hyperspectral spectrometer (Airborne) |
Trento36 | 1 m | 63 | 400–980 nm | 6 | 1 | 600 × 166 | AISA Eagle (Airborne) |
Chikusei37 | 2.5 m | 128 | 343–1018 nm | 19 | 1 | 2517 × 2335 | Headwall Hyperspec-VNIR-C (Airborne) |
WHU-Hi38 | 0.109 m | 274 | 400–1000 nm | 16 | 1 | 1217 × 303 | Headwall Nano-hyperspec (Airborne) |
OHID-1 (from this paper) | 10 m | 32 | 400–1000 nm | 7 | 10 | 512 × 512 | CMOS (spaceborne) |
Compared to airborne data, satellite data also has irreplaceable value in environmental monitoring, agriculture, city management, etc., since satellite data has high timeliness and low labor costs. Although there are also some unresolved issues with satellite data, such as the calibration of radiation values and geometric information, the influence of cloud, and the low spatial resolution, we believe that exploring the application of satellite data has significant value for social development.
The Research value of the OHID-1 dataset
In addition to helping to improve the effectiveness of classification algorithms, OHID-147 can also be used in other areas of computer vision (CV) research, for example super-resolution reconstruction, which is very popular in CV. Super-resolution reconstruction is crucial for the application of HSI data because of its ability to improve spatial resolution. It is well known that there is a trade-off between spatial and spectral resolution in hyperspectral satellite images due to hardware limitations, and the low spatial resolution leads to insufficient ground detail in the image. This limitation restricts the widespread application of hyperspectral images across various domains. Our dataset not only has a spatial resolution of 10 m, which is at an advanced level in this area, but also contains 10 scenes for the examination of different models.
There is now a much stronger call for integrated knowledge about the character and dynamics of cities, social and technological systems and their interfaces. Thus, OHID-147 can make a significant contribution to cutting-edge research on the science of sustainability and development of cities, urbanization, agriculture, contemporary climate change, cyclone forecasting, biodiversity conservation, environmental behavior, environmental degradation, green infrastructure, health and environment, land use, natural resource management, water-soil-waste, and other remote sensing tasks related to the sustainability and management of cities.
Methods
Samples of OHID-1 Dataset
According to the above, we propose a new open-source dataset, OHID-147, which is intended to help address the above limitations. OHID-147 was collected by the “Zhuhai No.1” hyperspectral satellite (OHS) constellation (Fig. 1), designed and produced by Zhuhai Orbita Aerospace Technology Co., Ltd (Orbita). In 2023 the company was renamed Zhuhai Aerospace Microchips Science & Technology Co., Ltd. With eight OHS satellites (labelled A, B, C, D, E, F, G, H), “Zhuhai No.1” can make global observations within two days, and the payload of each satellite has four principal components: (i) the lens, (ii) the focal plane, (iii) the focusing mechanism, and (iv) the hood. “Zhuhai No.1” adopts pushbroom scan imaging technology, with a 150 km image width, a 10 m spatial resolution, a 2.5 nm spectral resolution, and a 400–1000 nm wavelength range. Due to transmission and storage limitations, the number of available data bands is designed to be 32 (programmable, out of a total of 256 bands). A single hyperspectral satellite can orbit the earth about 15 times per day, and the maximum single data acquisition time in one orbit is about 8 minutes. With such high-quality data, the HSI captured by “Zhuhai No.1” has been applied in many fields, such as water body monitoring and land use classification.
Fig. 1.
Space-borne OHS imagery: (a) Space-borne OHS imagery (10 m spatial resolution and 2.5-nm spectral resolution). (b) The intraclass pixels show serious spectral variability, and the interclass pixels show the spectral similarity of Building and Farmland.
The OHID-147 provides 10 hyperspectral images, each with 32 spectral bands, a size of 512 × 512 pixels and 7 types of objects (Fig. 1). This makes OHID-147 suitable for (i) training DNNs for use with hyperspectral images, (ii) increasing the depth of DNNs, and (iii) improving the generalization performance of DNNs. Some sample scenes with annotations of different types of land use are shown in Fig. 2. The scenes shown in Fig. 2 (part I) come from different types of areas, and Fig. 2 (part II) shows part of Zhuhai City, Guangdong Province, China. As can be seen, the different land-use types in these images are well annotated.
Fig. 2.
Sample scenes from the OHID-147 dataset: Part I, different types of areas in Zhuhai City, Guangdong Province, China. Part II, part of Zhuhai City, Guangdong Province, China.
Subfigure (a) of Figure 1 comprises two key components: a line graph illustrating the high spectral resolution and a schematic representation of satellite remote sensing. The line graph displays detailed spectral information at the sensor's spectral resolution of 2.5 nanometers. The horizontal axis represents the wavelength, indicating the range of electromagnetic radiation captured by the sensor, and the vertical axis represents radiance. Subfigure (b) presents samples from the dataset to illustrate the mean spectra curves and mean spectra ± standard deviation plots for the building and farmland categories. In both plots, the horizontal axis represents the wavelength, while the vertical axis represents the radiance values. These visualizations aid in understanding the spectral variability and similarity between the building and farmland categories within the dataset.
Figure 2 presents sample scenes extracted from the OHID-1 dataset. Part I showcases different types of areas (categorized as city, country, and mountain) within the dataset, with their corresponding annotated pseudo-color images, sourced from Zhuhai City, Guangdong Province, China. Annotations refer to the labeled areas within the images. Part II focuses on a specific image from the dataset, zoomed in to provide detailed views of both the image itself and its annotations, featuring a part of Zhuhai City, Guangdong Province, China.
To give a clearer view of the current public datasets, 11 common public HSI datasets and their parameters are listed in Table 1. As can be seen, these 11 datasets differ in spatial resolution, spectral resolution and spectral range. Moreover, only one of them was acquired by a spaceborne sensor (Botswana). This indicates that existing methods need to be applied to many different datasets to improve their generalization ability. For this reason, we propose OHID-147, which was collected by a spaceborne sensor and contains 10 scenes.
Labeling
The OHID-147 uses 7 labels which mainly correspond to basic objects such as buildings and roads, as shown in Fig. 4. There are 10 scenes, all from Zhuhai, a coastal city in China. All the scenes have the same spatial resolution of 10 m/pixel, with 512 × 512 pixels per scene.
Fig. 4.
Schematic diagram of dataset annotation and statistics: (a) Class distribution within a single scene image from the OHID-1. (b) Class distribution within all OHID-1 images.
OHID-147 contains two folders, for images and labels respectively:
images: 10 scenes and 32-band hyperspectral images, “tif” format;
labels: The semantic labels corresponding to the 10 scenes of hyperspectral images, which are single-channel data, “tif” format.
We selected 10 representative regions from the four original images for annotation, each of size 512 × 512 pixels; the regional distribution is shown in Fig. 3. Meanwhile, the dataset samples were classified into 7 categories: building, farmland, forest, road, water, bare land and fish shed, and the sample numbers of these 7 categories can be seen in Fig. 4. Figure 4a presents a detailed illustration of the categories and their distribution within a single scene image from the OHID-147. Figure 4b gives the distribution over all OHID-147 images. It is evident from this figure that the OHID-147 data is characterized by both multi-category presence and data imbalance. Among the seven categories, water and road occupy more than half of the total, whereas bare land and fish shed account for less than 2% each. This presents a challenge for subsequent HSI classification tasks.
Fig. 3.
The regional distribution map.
Figure 4 is divided into two parts, providing a detailed exhibition of the categories and their distribution within the OHID-1 dataset. The dataset samples are categorized into seven classes: building, farmland, forest, road, water, bare land, and fish shed. Subfigure (a) presents a detailed view of the categories and their distribution within a single scene image. This includes a pseudo-color annotated image, a table showing the specific sample counts for the different categories, and a pie chart illustrating the statistical distribution of these categories. Subfigure (b) is a pie chart representing the overall data distribution of the dataset.
To ensure the accuracy of the labels, we obtained high-resolution aerial flight images with similar dates as a reference and conducted on-site surveys with the help of GPS positioning. For buildings, roads, rivers, large areas of woodland, and other easily distinguishable features, we confirmed and manually outlined these by referring to aerial images, while for bare soil, fish shed, and other features that change rapidly over time, we used drones to conduct on-the-spot investigations and used the drone’s GPS to match the OHID-1 plot coordinates to obtain accurate features for the image area. Figure 5 is an example of a reference aerial image for each of the 7 categories, also giving the corresponding colors used in annotated images.
Fig. 5.
Reference aerial images and colours used for annotation: (a) Image cube. (b) Ground-truth image. (c) Typical zone in the study area.
We also analyzed the reflectance of the scenes at different wavelengths, and Fig. 6 presents the average reflectance for each wavelength band for each object class. From this figure, it can be seen that the “Zhuhai No.1” satellites perform well in providing the spectral characteristics of each object class and that it shows their difference well. Buildings and roads, however, are similar in reflectivity.
Fig. 6.
Reflectance of each object.
Figure 5 presents three subplots (a, b, and c) illustrating different aspects of the hyperspectral imagery and ground-truth data. Subplot (a) shows the hyperspectral image cube. Subplot (b) displays the ground-truth image, which provides the true or reference labels for the hyperspectral data. Subplot (c) shows a typical zone in the study area.
In Figure 6, the horizontal axis represents the band, with units in nanometers (nm), and the vertical axis represents the reflectance, with units in percent (%).
Data Records
Data storage format
The OHID-147 is available at the Figshare repository https://figshare.com/articles/online_resource/OHID-1/27966024/8. It describes different types of areas in Zhuhai City, China. This link provides access to the original data, the dataset, the preprocessing and the codes.
Original data
Original data is available at Baidu Netdisk: https://pan.baidu.com/s/1qMtY7ossLwRh0pI2v2bnDg?pwd=bi70, code: bi70.
This link provides access to the raw data and annotations of the OHID-147, which includes two different data formats: “mat” and “tif”. All data have a size of 5056 × 5056 pixels. The raw data consists of 32 bands, while the annotation data consists of 1 band.
Dataset
The “images” folder contains 10 hyperspectral images, each with 32 spectral bands, a size of 512 × 512 pixels, and depicting 7 types of objects. The naming format is “201912_n.tif”, where n ranges from 1 to 10. The “labels” folder contains the labels for the ten images in the “images” folder, with the same naming format “201912_n.tif”. Each label image has values ranging from 0 to 7, and the category represented by each value can be found in “sample_proportion.png”. In addition, “201912_n.png” (n from 1 to 10) shows the bar-chart distribution of each category in “201912_n.tif”, and “201912_n_color.png” (n from 1 to 10) is the visualized pseudo-color map of the labels in “201912_n.tif”.
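A minimal sketch of reading one scene and its label map from these folders is given below; it assumes the files are laid out as described above, and the axis order of the image GeoTIFF is an assumption that should be checked against the actual products.

```python
import numpy as np
import tifffile

scene_id = 1  # n ranges from 1 to 10
image = tifffile.imread(f"images/201912_{scene_id}.tif")   # hyperspectral cube, 32 bands
labels = tifffile.imread(f"labels/201912_{scene_id}.tif")  # single-channel label map, values 0-7

# Some writers store bands first; move them last so the cube is (512, 512, 32).
if image.ndim == 3 and image.shape[0] == 32:
    image = np.moveaxis(image, 0, -1)

print(image.shape, labels.shape)   # expected: (512, 512, 32) (512, 512)
print(np.unique(labels))           # label values 0-7; see sample_proportion.png for the mapping
```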
Preprocessing
The “Preprocessing” folder provides the code for band synthesis and slicing of the original files.
Codes
The “HSI_Classification” folder contains the code for 1D CNN, 2D CNN, 3D CNN and SVM. These codes are built upon the HSI classification algorithms from https://github.com/zhangjinyangnwpu/HSI_Classification, with the primary changes made in the “unit.py” file. The “HyLITE” folder stores the HyLITE code. These codes are built upon the HyLITE algorithms from https://github.com/zhoufangqin/hylite, with the main changes made in the “main.py” file. We have added code for reading some additional files to make it compatible with other datasets.
Data source selection
The original satellite images of the dataset were derived from four OHS images taken by four hyperspectral satellites between 2019 and 2020. The size of each scene is 5056 × 5056 pixels.
The naming rule of “Zhuhai No.1” hyperspectral satellite data product is “Satellite + ID + Receiving Station _ Receiving time _ scene _ level_Band_sensor”, where:
Satellite: denotes the category of remote sensing satellite in the “Zhuhai No.1” constellation; hyperspectral satellites are denoted H;
ID: Satellite number; the hyperspectral satellites are numbered A, B, C, D, E, F, G, H, in which the four satellites A–D belong to the second group launch and the four satellites E–H to the third group launch. The naming convention is shown in Fig. 7.
Fig. 7.
Naming rules for the data products of “Zhuhai No.1” hyperspectral satellite.
Figure 7 illustrates the naming convention for the data products of the “Zhuhai No.1” hyperspectral satellite. It provides an example of a data product name. Each part of the example name is clearly labeled and described to ensure understanding of the naming convention and the data products it represents.
Data Pre-processing
During the data pre-processing, the ENVI software was used to perform radiometric calibration, atmospheric correction, and geometric correction and orthographic correction on the collected data48–50.
Radiometric calibration
Radiometric calibration is the process of converting the digital quantization value DN recorded by the sensor into the radiance value Le. Calibrated hyperspectral remote sensing data is essential for accurately extracting the genuine physical properties of ground objects from the imagery. Additionally, it enables the comparison of hyperspectral data collected from different regions or at different times. In order to compare and analyze hyperspectral remote sensing data from different remote sensors, spectrometers and even system simulation data, the radiometric calibration formula used for OHS hyperspectral satellite images is as follows:
$$L_e = \frac{\mathrm{gain} \times \mathrm{DN}}{\mathrm{TDIStage}} + \mathrm{offset} \qquad (1)$$
In the formula:
Le is apparent radiance;
gain is the absolute gain coefficient of radiometric calibration;
offset indicates the absolute offset coefficient of radiative calibration;
TDIStage indicates the number of TDI integration stages. The TDIStage field can be obtained from the metadata file (XXX_meta.xml) in the hyperspectral data folder.
The gain and offset parameters for each band are automatically obtained by ENVI from the metadata file, and both parameters are calibrated in W/(m²·sr·μm), so that the calculated radiance value Le is also in W/(m²·sr·μm).
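A minimal numpy sketch applying Eq. (1) is shown below; it assumes `dn` is a (bands, H, W) array of digital numbers and that the per-band gain/offset coefficients and the TDI stage have already been parsed from the metadata file (all variable names are illustrative).

```python
import numpy as np

def radiometric_calibration(dn, gain, offset, tdi_stage):
    """Convert digital numbers to apparent radiance Le in W/(m^2*sr*um), following Eq. (1)."""
    dn = dn.astype(np.float64)
    gain = np.asarray(gain, dtype=np.float64)[:, None, None]      # per-band absolute gain
    offset = np.asarray(offset, dtype=np.float64)[:, None, None]  # per-band absolute offset
    return gain * dn / tdi_stage + offset

# Example with hypothetical coefficients for a 32-band cube:
# le = radiometric_calibration(dn_cube, gains_32, offsets_32, tdi_stage=2)
```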
Atmospheric correction
In order to obtain the surface radiation information, it is necessary to eliminate the influence of external factors such as the atmosphere, a process called atmospheric correction. Since the surface radiation information after inversion is generally reflectance, it is also called reflectance inversion. The surface reflectance is obtained mainly on the basis of atmospheric radiative transfer theory, by inversion with an atmospheric radiative transfer model and software. The MODTRAN model and FLAASH software are mainly used in the atmospheric correction of OHS hyperspectral satellite data.
The FLAASH algorithm was used for atmospheric correction. Based on the location of the study area and the time of image acquisition, we selected the tropical atmospheric correction algorithm, set the aerosol level to urban, and converted the image from apparent radiance data to surface reflectance.
Geometric correction
In the process of image acquisition, many factors cause geometric deformation of the image, so that the geometry in the image differs from the geometry under the selected map projection; this is manifested as displacement, rotation, scaling, affine distortion, bending and other deformations. These deformations distort the geometry or position of the image. In order to eliminate these errors and improve the positioning accuracy and usability of remote sensing images, geometric correction is needed.
The key steps in geometric correction of remote sensing image mainly include:
Establish a unified coordinate system and map projection for the distorted image and the reference image. The “CGCS2000_3_degree_Gauss_Kruger_CM_114E” coordinate system is adopted in this paper.
Select the ground control point (GCP), according to the GCP selection principle, find the same position of the ground control point pair on the distorted image and the reference image.
Select the correction model, use the selected GCP data to obtain the parameters of the correction model, and then use the correction model to realize the pixel coordinate transformation between the distorted image and the reference image.
Select an appropriate resampling method to perform grayscale assignment for the output image pixels of the distorted image. A simplified sketch of this GCP-based correction and resampling workflow is given below.
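The following is a minimal stand-in for the ENVI workflow described above, assuming an affine correction model fitted to the GCP pairs by least squares and bilinear resampling; all variable names and the choice of an affine (rather than higher-order polynomial) model are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def fit_affine(ref_pts, dist_pts):
    """Fit an affine model mapping reference (row, col) -> distorted (row, col) from GCP pairs."""
    A = np.column_stack([np.ones(len(ref_pts)), ref_pts])   # design matrix [1, r_ref, c_ref]
    coeffs, *_ = np.linalg.lstsq(A, dist_pts, rcond=None)   # (3, 2) affine coefficients
    return coeffs

def rectify(band, coeffs, out_shape):
    """Resample one distorted band onto the reference grid (bilinear interpolation, order=1)."""
    rows, cols = np.meshgrid(np.arange(out_shape[0]), np.arange(out_shape[1]), indexing="ij")
    grid = np.column_stack([np.ones(rows.size), rows.ravel(), cols.ravel()])
    src = grid @ coeffs                                      # source coordinates in the distorted image
    return map_coordinates(band, [src[:, 0], src[:, 1]], order=1).reshape(out_shape)

# gcp_ref, gcp_dist: (n, 2) arrays of matched (row, col) ground control points
# corrected = rectify(distorted_band, fit_affine(gcp_ref, gcp_dist), out_shape=(512, 512))
```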
Orthographic correction
Orthographic correction is the process of correcting spatial and geometric distortions of an image to produce an orthographic image under a multi-center projection. It not only rectifies geometric distortions caused by systematic factors but also mitigates distortions induced by varying terrain conditions. The RPC coefficients provided with the L1 radiation correction products of the OHS hyperspectral satellite are used to construct rational function models for the orthographic correction of hyperspectral images. The rational function model, as an alternative to the rigorous geometric model of the sensor, can realize the orthographic correction of remote sensing images without control points. The RPC coefficients comply with the NITF 2.1 standard (RPC00B format).
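For the RPC-based step itself, GDAL can consume RPC00B coefficients carried by an L1 product; a hedged one-call sketch follows, in which the file names, target CRS and flat-terrain height are assumptions rather than values prescribed by the OHS products.

```python
from osgeo import gdal

# Orthorectify an L1 scene using its embedded RPC coefficients (hypothetical file names).
gdal.Warp(
    "ohs_scene_ortho.tif",                 # output
    "ohs_scene_L1.tif",                    # input L1 product carrying RPC00B metadata
    dstSRS="EPSG:4490",                    # CGCS2000 geographic; reproject as needed
    rpc=True,                              # force use of the RPC transformer
    transformerOptions=["RPC_HEIGHT=0"],   # or RPC_DEM=<dem.tif> to account for terrain relief
    resampleAlg="bilinear",
)
```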
After processing by the series of steps described above, the data can be used in a variety of applications to assist the industry. Table 2 lists some of these applications and the corresponding spectral bands used. There are differences in the spectral characteristics of different ground objects, which are reflected in different spectral bands. When using hyperspectral remote sensing images, appropriate bands are selected according to the spectral characteristics of the ground objects of interest; a small band-math sketch using these reference bands is given after Table 2. These bands were also taken into account in constructing the OHID-147.
Table 2.
Applications of “Zhuhai No.1” hyperspectral satellite.
Application | Criteria | Wave Length(nm) | OHID-1 Reference Band |
---|---|---|---|
Black and Smelly Water | Chlorophyll Inversion | 566 and 670 | b7,b14 |
Black and Smelly Water | Suspended Matter and Transparency | 670 and 806 | b14,b23 |
Black and Smelly Water | Dissolved Oxygen Inversion | 520 and 566 | b4,b7 |
Black and Smelly Water | Total Nitrogen and Phosphorus Inversion | 566 and 670 | b7,b14 |
Black and Smelly Water | Black and Smelly Water | 550–580 and 626–700 | b6-b8 and b11-b17 |
Water | NDWI | Near-infrared and green bands | b21-b32 and b3-b7 |
Water | Suspended Matter Concentration | 700–850 | b16-b26 |
Water | Yellow Substance | 466 | b1 |
Water | Chlorophyll | 480,536,566 | b2,b5,b7 |
Water | Sediment, CODmn | 500 | b3 |
Water | Red Tide | 520 | b4 |
Water | Cyanobacteria Bloom | 550,626,686 | b6,b11 |
Water | Algae | 610–640 | b10-b12 |
Water | Chlorophyll Absorption | 656 | b13 |
Water | Chlorophyll Fluorescence | 686 | b15 |
Water | Water Plants, Transparency | 716 | b17 |
Forestry | Chlorophyll | 640–660 and 430–450 | b12-b13 |
Forestry | Anthocyanin | 537 | b5 |
Forestry | NDVI | Near-infrared and red bands | b14,b24 |
Forestry | Pests and Diseases | 400–700 and 720–1100 | b1-b16 and b17-b32 |
Forestry | Red Edge Location (Pests and Diseases) | 680–750 | b15-b19 |
Forestry | FHI (Forest Health Evaluation) | 566,606,654,866 | b7,b10,b13,b27 |
Forestry | PRI (Photochemical Vegetation Index) | 531,570 | b5,b7 |
Crop | SAVI (Soil Regulated Vegetation Index), Rice Growth | 700–750 and 725–890 | b16-b19 and b18-b28 |
Crop | RVI (Ratio Vegetation Index) | 465–605 and 860–1000 | b1-b9 and b27-b32 |
Crop | SAVI, Wheat Growth | 695–750 and 735–1000 | b16-b19 and b18-b32 |
Crop | RVI, Wheat Growth | 460–590 and 725–1000 | b1-b9 and b18-b32 |
Ocean | Bleached Coral | 520–580 | b4-b8 |
Ocean | Water and Land Segmentation | 776–940 | b21-b32 |
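As an illustration of such band selection, the sketch below computes NDWI and NDVI from a calibrated OHID-1 cube. The band indices follow the reference bands suggested in Table 2 (green ≈ b7, red ≈ b14, near-infrared ≈ b24) and are assumptions that should be checked against the band list of a given product.

```python
import numpy as np

def normalized_difference(a, b):
    """Generic normalized difference index, (a - b) / (a + b), guarded against division by zero."""
    return (a - b) / np.clip(a + b, 1e-6, None)

def spectral_indices(cube):
    """cube: (512, 512, 32) reflectance array; band numbering b1..b32 maps to indices 0..31."""
    green, red, nir = cube[..., 6], cube[..., 13], cube[..., 23]   # b7, b14, b24 (assumed)
    ndwi = normalized_difference(green, nir)   # water bodies tend toward positive values
    ndvi = normalized_difference(nir, red)     # vegetation tends toward positive values
    return ndwi, ndvi
```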
Technical Validation
Quality of OHID-1 Dataset
Ensuring the image quality of the dataset is critical to guaranteeing that the input data meets the highest standards, going beyond factors like spatial resolution or the number of spectral bands. Here we use five parameters to characterize the quality of the provided data; the following is a synopsis of the quality control (QC) implemented during processing:
① Uncontrolled positioning accuracy: Utilize CE90 to evaluate the uncontrolled positioning accuracy of OHS hyperspectral satellite images, ensuring that the error is less than 500 m.
② Controlled positioning accuracy: Select an appropriate number and distribution of control points in the image, calculate their distance from the true position coordinates, and ensure the error is less than 3 pixels.
③ Full-band image registration accuracy: After the registration process is completed, evaluate the positions of each control point to verify that the absolute and geographic positioning errors relative to other products in the dataset are less than 3 pixels.
④ Relative radiometric calibration error accuracy: Examine the OHS hyperspectral satellite image after radiometric calibration processing, calculate the error between the calibrated radiance values and the true radiance values, and verify that the relative radiometric calibration error is less than 3%.
⑤ Signal-to-noise ratio (SNR): Select suitable images based on the requirements of a solar elevation angle greater than 30 degrees and a ground reflectance greater than 0.2 to calculate their SNR, in order to verify that their SNR falls within the range of 25 to 40 decibels (a simplified sketch of such an SNR estimate is given after this list).
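The following is a simplified homogeneous-area estimate of per-band SNR in decibels (mean divided by standard deviation over a uniform window); it is only a quick check, not the official QC procedure, and the window location is an assumption.

```python
import numpy as np

def snr_db(cube, window):
    """cube: (H, W, bands); window: (row_slice, col_slice) over a radiometrically uniform area."""
    patch = cube[window]                               # (h, w, bands) homogeneous region
    mean = patch.mean(axis=(0, 1))
    std = patch.std(axis=(0, 1)) + 1e-12               # avoid division by zero
    return 20.0 * np.log10(mean / std)                 # per-band SNR in dB

# Example: a 40 x 40 pixel uniform water area (indices are illustrative)
# print(snr_db(reflectance_cube, (slice(100, 140), slice(200, 240))))
```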
Algorithms adopted
Numerous methods have been published for optimizing the performance of DNNs in recent years13,17,51–55. For HSI classification, the following methods have been successfully applied. CDCNN56,57 uses a neural network that is deeper than most others. It fuses the ideas of AlexNet54, DCNN19, ResNet13 and FCN22, and utilizes a residual structure with only convolutional layers used to extract HSI features.
However, CDCNN only deals with 2-D data. To address this problem, SSRN58 adopts 3-D data as input, and improves its structure with ideas from 3DCNN and ResNet. To further improve accuracy, DBMA21 and DBDA59 use a novel structure with a dual network to extract features from the spatial and spectral dimensions of the data separately and merge the features during inference. SSSAN60 differs from the other methods mentioned in this article in that the backbone of its visual feature extraction module uses a transformer structure. A feature of CVSSN61 is that it improves the representation of spectral-spatial characteristics based on the extraction of spatial information. In other respects, FDSSC62 proposes a novel network to improve speed and accuracy. HyLITE63, a vision transformer that incorporates both local and spectral information, demonstrates superior performance compared to the other networks. The classification frameworks of the different DNNs are shown in Figs. 8–14.
Fig. 8.
Flowchart of different DCNNs classification framework: (a) 1DCNN. (b) 2DCNN. (c) 3DCNN.
Fig. 14.
Flowchart of HyLITE framework.
Figure 8 is divided into three parts, each offering a detailed visualization of a different convolutional neural network (CNN) architecture, along with its parameters and components. Part (a) of the image depicts the structure of 1DCNN. Part (b) of the image shows the structure of 2DCNN. Part (c) of the image illustrates the structure of 3DCNN.
1DCNN
CNNs have different implementations of convolutional and max pooling layers and different ways of training the network. As depicted in Fig. 8(a), 1DCNN comprises five weighted layers: the input layer, the convolutional layer C1, the max pooling layer M2, the fully connected layer F3, and the output layer. Assuming θ denotes the complete set of trainable parameters, θ = {θi}, where θi is the parameter set between the (i−1)-th and the i-th layer, with the index i ranging from 1 to 4.
In HSI, each pixel sample can be conceptualized as a 2D image with a height dimension of 1, echoing the representation of 1D audio inputs in speech recognition. This comparison facilitates the application of similar processing techniques to HSI data. Hence, the size of the input layer is simply n1 × 1, where n1 represents the total number of bands. Layer C1 processes the input data by applying 20 kernels of size k1 × 1. Layer C1 comprises 20 × n2 nodes, where n2 is determined by the relationship n2 = n1 − k1 + 1. Between the input layer and layer C1, there exist 20 × (k1 + 1) trainable parameters. Subsequently, layer M2, serving as the second hidden layer, employs a kernel size of k2 × 1. Layer M2 comprises 20 × n3 nodes, where n3 = n2 / k2. Layer F3 has n4 nodes and exhibits (20 × n3 + 1) × n4 trainable parameters in its connection with layer M2. Finally, the output layer contains n5 nodes and possesses (n4 + 1) × n5 trainable parameters in its connection with layer F3. Consequently, the overall architecture of the CNN classifier encompasses a total of 20 × (k1 + 1) + (20 × n3 + 1) × n4 + (n4 + 1) × n5 trainable parameters64.
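As an illustration of this architecture, a minimal PyTorch sketch of the five-layer 1DCNN is given below, using 32 bands as in OHID-1; the kernel sizes k1 and k2 and the hidden width n4 are illustrative choices, not values taken from the cited work.

```python
import torch
import torch.nn as nn

class CNN1D(nn.Module):
    """Five-layer 1D CNN for per-pixel spectra: input -> C1 -> M2 -> F3 -> output."""
    def __init__(self, n_bands=32, n_classes=7, k1=9, k2=2, n4=100):
        super().__init__()
        self.c1 = nn.Conv1d(1, 20, kernel_size=k1)    # C1: 20 kernels of size k1 x 1
        self.m2 = nn.MaxPool1d(kernel_size=k2)        # M2: max pooling with kernel k2
        n2 = n_bands - k1 + 1                         # nodes per kernel after C1
        n3 = n2 // k2                                 # nodes per kernel after M2
        self.f3 = nn.Linear(20 * n3, n4)              # F3: fully connected layer
        self.out = nn.Linear(n4, n_classes)           # output layer (n5 = number of classes)

    def forward(self, x):                             # x: (batch, 1, n_bands)
        x = torch.tanh(self.c1(x))
        x = self.m2(x)
        x = torch.tanh(self.f3(x.flatten(1)))
        return self.out(x)

# Example: classify a batch of 4 OHID-1 pixel spectra (32 bands each)
spectra = torch.randn(4, 1, 32)
logits = CNN1D()(spectra)   # shape (4, 7)
```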
2DCNN
As depicted in Fig. 8(b), the input hyperspectral image data of 2DCNN is represented as a 3D tensor of shape (h, w, c). Here, h and w correspond to the image's height and width, respectively, while c denotes the number of spectral bands or channels present within the image. To align with the unique characteristics of CNNs, it is necessary to decompose the captured hyperspectral image into patches. Each patch encapsulates spectral and spatial information pertaining to a specific pixel, allowing for the extraction of relevant features for classification tasks65. Specifically, to categorize a pixel located at coordinates (x, y) on the image plane while effectively merging spectral and spatial information, 2DCNN utilizes a square patch of size s × s, with its center aligned at (x, y). Here, l(x, y) represents the class label assigned to the pixel situated at location (x, y), while P(x, y) designates the patch centered around that pixel. Subsequently, a dataset D can be constructed, consisting of tuples (P(x, y), l(x, y)). Each tuple contains a patch P(x, y), which is itself a 3D tensor of shape (s, s, c), encapsulating both spectral and spatial information pertaining to the pixel (x, y).
Furthermore, the tensor P(x, y) is decomposed into c matrices, each with shape s × s. These matrices are then input into a CNN, which progressively constructs high-level features that capture both spectral and spatial attributes of the pixel (x, y). Subsequently, these extracted features are passed to a Multi-Layer Perceptron (MLP), which performs the classification task.
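A minimal numpy sketch of this patch construction (building the dataset D of patch-label tuples) follows; the reflect padding mode and the convention that label 0 marks unlabeled pixels are assumptions.

```python
import numpy as np

def extract_patches(image, labels, s=7, ignore=0):
    """Build the dataset D of (patch, label) tuples from an HSI cube.

    image : (H, W, C) array of reflectance values
    labels: (H, W) array, where `ignore` marks unlabeled pixels
    s     : odd patch size; each patch is an s x s x C tensor centred on a labeled pixel
    """
    assert s % 2 == 1
    r = s // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="reflect")
    patches, targets = [], []
    for y, x in zip(*np.nonzero(labels != ignore)):
        patches.append(padded[y:y + s, x:x + s, :])   # s x s x C neighbourhood of (y, x)
        targets.append(labels[y, x] - 1)              # shift class labels to 0..K-1
    return np.stack(patches), np.array(targets)

# Hypothetical usage with an OHID-1 scene loaded as a (512, 512, 32) cube:
# X, y = extract_patches(cube, label_map, s=7)
```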
3DCNN
A combined spatio-spectral model is essential for the comprehensive analysis of both spectral and spatial information within hyperspectral data. The key advantage of a spatio-spectral combined framework lies in the seamless integration of both components from the outset of the process, ensuring they remain inextricably linked throughout. By maximizing the utilization of data information, this solution significantly reduces costs. Unlike prior methods, 3DCNN concurrently processes spatial and spectral components through genuine 3D convolutions. This approach optimizes the limited number of available samples and requires fewer trainable parameters, enhancing overall performance66. This proposal addresses the problem by decomposing it into the processing of a sequence of volumetric representations of the image. Therefore, each pixel is linked to an s × s spatial neighborhood and a defined number b of spectral bands, effectively treating each pixel as a volumetric element of dimension s × s × b. The core idea of this architecture lies in merging the principles of traditional CNN networks with a unique twist: the utilization of 3D convolution operations. This departure from the standard 1D convolution operators, which solely examine the spectral content of the data, allows for a more comprehensive analysis that incorporates both spatial and spectral information.
Figure 8(c) provides an overview of the 3DCNN architecture, which stacks various blocks of CNN layers to ensure deep and efficient representations of the image. Initially, a set of layers based on 3D convolutions is introduced to handle the three-dimensional input voxels. Each of these layers comprises a number of volumetric kernels that concurrently perform convolutions across the width, height, and depth axes of the input. Following this 3D convolution stack, a series of 1 × 1 convolution (1D) layers are employed to discard spatial neighborhood information, followed by a sequence of fully connected layers. Essentially, the proposed architecture treats 3D voxels as input data and initially generates 3D feature maps, which are gradually reduced to 1D feature vectors throughout the layers.
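A minimal PyTorch sketch of such a 3D architecture follows (a small 3D convolution stack, a 1 × 1 convolution, then fully connected layers); the layer widths and kernel sizes are illustrative assumptions rather than the exact configuration of the cited network.

```python
import torch
import torch.nn as nn

class CNN3D(nn.Module):
    """3D conv stack on s x s x b voxels, a 1x1 convolution, then fully connected layers."""
    def __init__(self, n_bands=32, n_classes=7, s=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=(3, 3, 7)), nn.ReLU(),   # joint spatial-spectral kernels
            nn.Conv3d(16, 32, kernel_size=(3, 3, 7)), nn.ReLU(),
        )
        self.reduce = nn.Sequential(
            nn.Conv3d(32, 32, kernel_size=1), nn.ReLU(),          # 1x1 convolution mixes channels only
        )
        flat = 32 * (s - 4) * (s - 4) * (n_bands - 12)            # size after the two 3x3x7 convolutions
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(flat, 128), nn.ReLU(), nn.Linear(128, n_classes),
        )

    def forward(self, x):                                          # x: (batch, 1, s, s, n_bands)
        return self.classifier(self.reduce(self.features(x)))

voxels = torch.randn(2, 1, 5, 5, 32)      # two 5 x 5 x 32 voxels from an OHID-1 scene
print(CNN3D()(voxels).shape)               # torch.Size([2, 7])
```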
CDCNN
Figure 9 shows the overall framework of the CDCNN model, which can be divided into three main modules: the feature vector extraction module, the class center vector extraction module and the prediction module. The feature vector extraction module is located on the left side of Fig. 9 and converts each input image into a vector representing image features. The module for extracting class center vectors through the class center vector extraction algorithm is located in the blue area in the upper right corner of Fig. 9. The prediction module, which predicts image categories based on posterior probability values, is located in the red dashed box in the bottom right corner67.
DBMA
Figure 10 shows the overall framework of the DBMA model. The top branch, composed of dense spectral blocks and channel attention blocks, is the spectral branch used for extracting spectral features. The bottom branch, composed of dense spatial blocks and spatial attention blocks, is the spatial branch used for extracting spatial features21.
DBDA
The whole structure of the DBDA network can be seen in Fig. 11. For convenience, the top branch is called the “spectral branch” and the bottom branch is named the “spatial branch”. Spectral and spatial feature maps can be obtained by feeding the input into the spectral and spatial branches, respectively59.
SSSAN
Figure 12 shows the overall framework of the SSSAN60 model. The spectral module consists of two ReLU layers, two batch normalization layers, and two convolutional layers. The spatial module consists of two attention modules, two ReLU layers, two batch normalization layers, and two convolutional layers. The kernel size in the spatial module is 3 × 3 × q, where q is the channel number of the input features. To maintain a spatial size of 7 × 7 for the convolutional features, padding operations are used in each convolutional layer. At the end of the SSSAN model, two fully connected layers and a softmax layer are utilized for classification.
CVSSN
Figure 13 shows the overall framework of the CVSSN61 model. Based on two similarity measures, an adaptive weighted addition based spectral vector self-similarity module (AWA-SVSS) and Euclidean distance based feature vector self-similarity module (ED-FVSS) have been designed, which can fully mine the central vector oriented spatial relationships. Moreover, a spectral-spatial information fusion module (SSIF) is formulated as a new pattern to fuse the central 1D spectral vector and the corresponding 3D patch for efficient spectral-spatial feature learning of the subsequent modules. Besides, a scale information complementary convolution module (SIC-Conv) and a channel spatial separation convolution module (CSS-Conv) are implemented for efficient spectral space feature learning.
HyLITE
Fig. 9.
Flowchart of CDCNN framework.
Fig. 10.
Flowchart of DBMA framework.
Fig. 11.
Flowchart of DBDA framework.
Fig. 12.
Flowchart of SSSAN framework.
Fig. 13.
Flowchart of CVSSN framework.
Figure 14 is divided into two parts by a dashed line. On the left side, a Vision Transformer (ViT) structure is depicted, with a detailed illustration of the process through which hyperspectral data is input into the ViT structure. On the right side, the specific layer structure within the transformer architecture is shown.
Figure 14 furnishes an illustrative overview of HyLITE. The model rigorously adheres to the protocol specified in SpectralFormer, approaching HSI as a task of image-level classification. For an image-label pair comprising X and y ∈ {1, …, K}, where X is a low-resolution square image featuring spatial dimensions a × a and spectral resolution b, extracted from a high-resolution hyperspectral image via overlapping patchification, the model processes X. Each patch is assigned a label corresponding to the category of its central pixel, selected from among the K possible classes, depending on the dataset. The model's objective is to train a vision transformer, denoted as f with parameters θ, to predict the image label ŷ = f(X; θ), where ŷ represents the model's forecasted label.
Evaluation Metrics
To evaluate HSI classification performance, we utilized the Overall Accuracy (OA), Average Accuracy (AA), and the kappa coefficient as evaluation metrics. The OA can be calculated using formula (2):
$$\mathrm{OA} = \frac{m}{N} \qquad (2)$$
where m denotes the number of accurately classified samples and N denotes the total number of samples.
The AA can be obtained by formula (3):
$$\mathrm{AA} = \frac{1}{n}\sum_{i=1}^{n}\frac{m_i}{c_i} \qquad (3)$$
where n denotes the number of classes, mi denotes the number of accurately classified samples of class i and ci denotes the number of samples of class i.
The kappa coefficient is a parameter to evaluate the consistency between different classes and it can be calculated using formula (4):
$$\kappa = \frac{\mathrm{OA} - p_e}{1 - p_e} \qquad (4)$$
where pe is calculated with formula (5):
$$p_e = \frac{\sum_{i=1}^{n} c_i \times m_i}{N^2} \qquad (5)$$
where ci denotes the sample number of the ith class and mi denotes the number of accurately classified samples of the ith class. OA denotes the Overall Accuracy and N is the total number of samples.
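A small numpy sketch computing OA, AA and kappa following Eqs. (2)–(5) is given below; note that some implementations derive p_e from predicted class counts rather than from the per-class numbers of correct results used here.

```python
import numpy as np

def evaluate_classification(y_true, y_pred, n_classes):
    """Return OA, AA and kappa per Eqs. (2)-(5); y_true/y_pred are 1-D arrays of class indices."""
    N = y_true.size
    oa = np.mean(y_true == y_pred)                        # Eq. (2): m / N
    per_class_acc = []
    pe = 0.0
    for i in range(n_classes):
        c_i = np.sum(y_true == i)                         # samples of class i
        m_i = np.sum((y_true == i) & (y_pred == i))       # accurately classified samples of class i
        per_class_acc.append(m_i / c_i if c_i else 0.0)
        pe += c_i * m_i
    aa = float(np.mean(per_class_acc))                    # Eq. (3)
    pe /= N ** 2                                          # Eq. (5)
    kappa = (oa - pe) / (1 - pe)                          # Eq. (4)
    return oa, aa, kappa
```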
The level of consistency indicated by the value of the kappa coefficient is commonly divided into the classes presented in Table 3.
Table 3.
Classes for the kappa coefficient.
Kappa | −1 | 0 | 0.00∼0.20 | 0.21∼0.40 | 0.41∼0.60 | 0.61∼0.80 | 0.81∼1.00 |
---|---|---|---|---|---|---|---|
Class | Complete inconsistency | Occasional agreement | Slight | Fair | Moderate | Substantial | Almost Perfect |
Experimental settings and results
We conducted experiments on OHID-147 with 9 deep learning methods and a traditional method: (1) 1DCNN64, (2) 2DCNN65, (3) 3DCNN66, (4) CDCNN67, (5) DBMA21, (6) DBDA59, (7) SSSAN60, (8) CVSSN61, (9) SVM68 and (10) HyLITE63. We randomly selected 500 samples from each class as the training set and used the remaining samples as the test set. While training the DNNs, we used the Adam optimizer and the cross-entropy loss, and the data were normalized. Experimentally, the learning rate and batch size were set to 0.0001 and 64, respectively, and 200 epochs were trained at a time. The learning rate was reduced by a factor of 0.1 at the 50th, 90th and 110th epochs, respectively. We trained and tested 10 times and report the mean values of OA, AA and the kappa coefficient as our final results. The main parameters were set by experience and are shown in Table 4; a minimal sketch of this training setup follows Table 4. While training the SVM, we used grid search to find the cost and gamma parameters, setting cost to 9.514 and gamma to 0.03125, and likewise carried out training and testing 10 times.
Table 4.
Main parameters for training DNNs.
Parameters | Max Epochs | Early Stop | Learning Rate | Batch Size |
---|---|---|---|---|
Value | 200 | 30 | 0.0001 | 64 |
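A minimal PyTorch sketch of this training setup (Adam, cross-entropy loss, learning rate 0.0001, batch size 64, 200 epochs, learning rate decayed by 0.1 at epochs 50, 90 and 110) is given below; the model and data tensors are placeholders, and early stopping is omitted for brevity.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train(model, X, y, epochs=200, lr=1e-4, batch_size=64):
    """X: float tensor of training samples; y: long tensor of class indices."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 90, 110], gamma=0.1)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        model.train()
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
        scheduler.step()                 # decay the learning rate at the listed milestones
    return model
```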
The overall experiment results are described in Table 5, and visualization results of the classification on the different datasets are shown in Figs. 15–21.
Table 5.
Experiment results on OHID-1 and other HSI datasets.
Datasets | EVALUATION METRIC | 1DCNN | 2DCNN | 3DCNN | CDCNN | DBMA | DBDA | SVM | SSSAN | CVSSN | HyLITE |
---|---|---|---|---|---|---|---|---|---|---|---|
Indian Pines | OA | 0.861 | 0.917 | 0.84 | 0.623 | 0.932 | 0.954 | 0.694 | 0.956 | 0.981 | 0.983 |
Indian Pines | AA | 0.916 | 0.93 | 0.904 | 0.509 | 0.877 | 0.965 | 0.656 | 0.957 | 0.959 | 0.969 |
Indian Pines | kappa | 0.838 | 0.902 | 0.814 | 0.559 | 0.922 | 0.947 | 0.647 | 0.95 | 0.978 | 0.980 |
Pavia university | OA | 0.918 | 0.968 | 0.971 | 0.877 | 0.947 | 0.96 | 0.843 | 0.998 | 0.999 | 0.991 |
Pavia university | AA | 0.925 | 0.969 | 0.97 | 0.824 | 0.955 | 0.965 | 0.83 | 0.998 | 0.998 | 0.988 |
Pavia university | kappa | 0.891 | 0.957 | 0.961 | 0.836 | 0.93 | 0.947 | 0.788 | 0.997 | 0.998 | 0.989 |
Salinas Valley | OA | 0.911 | 0.936 | 0.882 | 0.778 | 0.954 | 0.975 | 0.881 | 0.985 | 0.997 | 0.964 |
Salinas Valley | AA | 0.964 | 0.974 | 0.951 | 0.799 | 0.963 | 0.98 | 0.915 | 0.993 | 0.998 | 0.984 |
Salinas Valley | kappa | 0.901 | 0.928 | 0.868 | 0.755 | 0.949 | 0.972 | 0.867 | 0.983 | 0.997 | 0.960 |
WHU-Hi-Hanchuan | OA | 0.792 | 0.869 | 0.854 | 0.678 | 0.838 | 0.857 | 0.707 | 0.985 | 0.991 | 0.952 |
WHU-Hi-Hanchuan | AA | 0.754 | 0.856 | 0.833 | 0.429 | 0.78 | 0.821 | 0.506 | 0.983 | 0.986 | 0.889 |
WHU-Hi-Hanchuan | kappa | 0.759 | 0.848 | 0.83 | 0.615 | 0.809 | 0.831 | 0.655 | 0.983 | 0.989 | 0.943 |
WHU-Hi-Longkou | OA | 0.918 | 0.969 | 0.963 | 0.967 | 0.993 | 0.996 | 0.949 | 0.96 | 0.999 | 0.989 |
WHU-Hi-Longkou | AA | 0.925 | 0.973 | 0.958 | 0.924 | 0.979 | 0.989 | 0.885 | 0.886 | 0.997 | 0.977 |
WHU-Hi-Longkou | kappa | 0.891 | 0.96 | 0.951 | 0.956 | 0.991 | 0.995 | 0.934 | 0.948 | 0.998 | 0.986 |
WHU-Hi-Honghu | OA | 0.742 | 0.887 | 0.841 | 0.778 | 0.888 | 0.899 | 0.757 | 0.986 | 0.993 | 0.946 |
WHU-Hi-Honghu | AA | 0.726 | 0.88 | 0.84 | 0.464 | 0.831 | 0.879 | 0.538 | 0.97 | 0.983 | 0.870 |
WHU-Hi-Honghu | kappa | 0.687 | 0.858 | 0.804 | 0.716 | 0.859 | 0.873 | 0.684 | 0.982 | 0.991 | 0.931 |
OHID-1 | OA | 0.592 | 0.663 | 0.615 | 0.724 | 0.736 | 0.74 | 0.691 | 0.895 | 0.903 | 0.869 |
OHID-1 | AA | 0.616 | 0.687 | 0.648 | 0.562 | 0.601 | 0.61 | 0.543 | 0.836 | 0.886 | 0.698 |
OHID-1 | kappa | 0.501 | 0.583 | 0.529 | 0.653 | 0.67 | 0.675 | 0.614 | 0.855 | 0.864 | 0.798 |
Fig. 15.
Visualization of experimental results of the Indian Pines dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Fig. 21.
Visualization of experimental results of the WHU-Hi-HongHu dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Figure 15 contains seven images and a legend. Image (a) displays the true color of the Indian Pines dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the Indian Pines dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN (one-dimensional convolutional neural network), Image (d) to 2DCNN (two-dimensional convolutional neural network), Image (e) to 3DCNN (three-dimensional convolutional neural network), Image (f) to HyLITE, and Image (g) to SVM. The legend provides the color mapping for each category in the Indian Pines dataset within the inference results.
Figure 16 contains seven images and a legend. Image (a) displays the true color of Pavia University dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the Pavia University dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN, Image (d) to 2DCNN, Image (e) to 3DCNN, Image (f) to HyLITE (Hyperspectral Locality-aware Image TransformEr), and Image (g) to SVM. The legend provides the color mapping for each category in the Pavia University dataset within the inference results.
Fig. 16.
Visualization of experimental results of the Pavia University dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Figure 17 contains seven images and a legend. Image (a) displays the true color of OHID-1 dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the OHID-1 dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN, Image (d) to 2DCNN, Image (e) to 3DCNN, Image (f) to HyLITE, and Image (g) to SVM. The legend provides the color mapping for each category in the OHID-1 dataset within the inference results.
Fig. 17.
Visualization of experimental results of the OHID-1 dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Figure 18 contains seven images and a legend. Image (a) displays the true color of the Salinas Valley dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the Salinas Valley dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN, Image (d) to 2DCNN, Image (e) to 3DCNN, Image (f) to HyLITE, and Image (g) to SVM. The legend provides the color mapping for each category in the Salinas Valley dataset within the inference results.
Fig. 18.
Visualization of experimental results of the Salinas Valley dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Figure 19 contains seven images and a legend. Image (a) displays the true color of the WHU-Hi-HanChuan dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the WHU-Hi-HanChuan dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN, Image (d) to 2DCNN, Image (e) to 3DCNN, Image (f) to HyLITE, and Image (g) to SVM. The legend provides the color mapping for each category in the WHU-Hi-HanChuan dataset within the inference results.
Fig. 19.
Visualization of experimental results of the WHU-Hi-HanChuan dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Figure 20 contains seven images and a legend. Image (a) displays the true color of the WHU-Hi-LongKou dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the WHU-Hi-LongKou dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN, Image (d) to 2DCNN, Image (e) to 3DCNN, Image (f) to HyLITE, and Image (g) to SVM. The legend provides the color mapping for each category in the WHU-Hi-LongKou dataset within the inference results.
Fig. 20.
Visualization of experimental results of the WHU-Hi-LongKou dataset: (a) true color. (b) ground truth. (c) 1DCNN. (d) 2DCNN. (e) 3DCNN. (f) HyLITE. (g) SVM.
Figure 21 contains seven images and a legend. Image (a) displays the true color of the WHU-Hi-HongHu dataset. Image (b) shows the ground truth. Images (c) to (g) present the visual results obtained after training five different Deep Neural Networks (DNNs) and a traditional method on the WHU-Hi-HongHu dataset and performing inference. Specifically, Image (c) corresponds to 1DCNN, Image (d) to 2DCNN, Image (e) to 3DCNN, Image (f) to HyLITE, and Image (g) to SVM. The legend provides the color mapping for each category in the WHU-Hi-HongHu dataset within the inference results.
Analysis of experimental results
From Table 5, it can be seen that the performance of the vast majority of methods on OHID-147 decreases compared to their performance on the other datasets, which means that OHID-147 is more difficult to classify correctly for current HSI classification methods. In other words, OHID-147 can provide a solid foundation for the further development of HSI classification.
Figures 15–21 show the ground-truth images of the original scenes and the visualization of the classification results obtained by the various methods across the seven datasets. Upon inspection, it is evident from Figs. 15–21 that all prediction maps achieve approximately correct predictions of the overall regions compared to the ground truth, demonstrating the effective contribution of artificial intelligence techniques to hyperspectral classification. Compared to the other methods, the HyLITE architecture leverages transformers to produce smoother and more accurate classification maps. The transformer's ability to capture long-range dependencies and complex relationships within the data allows it to provide refined predictions that are superior in both smoothness and accuracy.
The OHID-147 focuses on 7 classes of land cover in the area around Zhuhai City. Besides the classification of hyperspectral images, it can also be used for testing general multi-class classification algorithms, both deep learning based and traditional.
Usage Notes
The dataset described here is available from https://figshare.com/articles/online_resource/OHID-1/27966024/8. This dataset offers flexibility for researchers aiming to study HSI classification for environmental monitoring, resource management, zone planning, etc. Subject to policy, the dataset includes no geospatial information.
The OHID-147 focuses on 7 classes of area near Zhuhai City. It can be utilized for evaluating general hyperspectral classification algorithms, encompassing both deep learning and traditional methodologies, as well as for computer vision tasks such as super-resolution.
Its advantage lies in its distinctive spatial resolution, spectral characteristics, extensive data scale, advanced acquisition techniques, and vast application potential, distinguishing OHID-147 from other datasets. Unlike traditional remote sensing datasets confined to color space features, OHID-1’s spectral-spatial features enrich object details. Meanwhile, it covers a broader area and exhibits more complex features, posing greater classification challenges. This presents a substantial challenge to existing hyperspectral image classification algorithms and, compared to other remote sensing hyperspectral datasets, has greater potential to drive improvements in algorithm performance.
The limitations of OHID-1 include the following. Since the terrain labels of OHID-147 do not include desert, glacier, wetland and other terrains that cannot be found in Zhuhai City, terrain similarity should be evaluated before using the dataset. In addition, given the 10 m resolution, it is more suitable for investigating wide-area information than local information such as the details of an individual building or river.
OHID-147 data is characterized by both multi-category presence and data imbalance. This presents a challenge for subsequent HSI classification tasks with imbalanced data.
This dataset not only can be used for land cover classification, but also has a wide range of applications in agricultural management, environmental monitoring, resource management, and other areas.
Mineral Resource Exploration: Hyperspectral technology can identify different types of minerals, and by analyzing the spectral characteristics of these minerals, their types and distributions can be determined. This holds significant guidance for mineral resource exploration and mining.
Geological Structure Analysis: Hyperspectral data can also be used to analyze geological structures, including rock types and fault distributions. This information is of great importance for geological exploration and early warning of geological disasters.
Environmental Monitoring and Protection: Hyperspectral data can identify the characteristic spectra of different pollutants in water bodies, enabling real-time monitoring and early warning of water pollution. Additionally, by analyzing the spectral characteristics of soil, the degree and type of soil erosion can be assessed, providing a scientific basis for soil protection and management.
Precision Agriculture Monitoring and Management: Hyperspectral data can capture the spectral characteristics of crops at different growth stages, allowing for precise monitoring of crop growth conditions, including nutritional status and pest and disease conditions. Meanwhile, by analyzing the spectral reflectance characteristics of crops, their nutritional needs and water status can be determined, enabling precise fertilization and irrigation, which improves the utilization efficiency of agricultural resources. Furthermore, continuous monitoring of crop growth conditions using hyperspectral data allows for the establishment of crop growth models, thereby enabling more accurate yield estimates.
In conclusion, this study aimed to address the challenges associated with the classification of hyperspectral images by introducing a new large-scale dataset, OHID-147, designed to advance the performance of hyperspectral image classification algorithms. Our primary goal was to provide a comprehensive dataset that captures the complex characteristics of different areas in Zhuhai City, China, and to evaluate the effectiveness of various deep neural networks (DNNs) on this dataset. The main results of our research indicate that OHID-1, with its rich spectral diversity and high spatial resolution, poses significant classification challenges for existing AI algorithms, suggesting that there is room for improvement in the development of more accurate and robust classification models.
In future work, we will expand OHID-147 from the 100 MB scale to the 10 GB scale (OHID-2). We will also increase the number of category labels from 7 to more than 10, adding desert, glacier, wetland and other terrain types, and will provide data for regions beyond Zhuhai. In addition, we will develop more efficient and accurate classification and detection algorithms as benchmarks tailored to the characteristics of OHID-147, to meet increasingly diverse application requirements.
Acknowledgements
This research was funded by Guangdong Provincial Key Laboratory of Big Data Processing and Applications of Hyperspectral Remote Sensing Micro/Nano Satellites under Grant 2023B1212020009, supported by Hybrid AI and Big Data Research Center of Chongqing University of Education (2023XJPT02); Science and Technology Research Program of Chongqing Education Commission of China (KJQN202201608); Collaborative QUST-CQUE Laboratory for Hybrid Methods of Big Data Analysis. This research was funded by Science and Technology Project of Social Development in Zhuhai No. 2420004000328.
Author contributions
The original data with quality and data standard control was provided by J.Y., J.D., X.C. and J.W. J.D. and X.J. performed data labeling and presentation. A.M., S.G., L.L. and D.Y. contributed to conceptualization, methodology and formal analysis. J.D., C.D., X.W., J.H., A.D. and S.X. have performed experimental studies and analysis experiments. A.M., A.D., J.D. wrote the manuscript. S.G. and Y.S. contributed to revision of the manuscript to its final version.
Code availability
The code for 1DCNN, 2DCNN, 3DCNN is available at GitHub: https://github.com/eecn/Hyperspectral-Classification (Hyperspectral-Classification Pytorch, nuaa.cf).
The code for CDCNN, DBDA and DBMA is available at GitHub: https://github.com/lironui/Double-Branch-Dual-Attention-Mechanism-Network.
The code for SSSAN is available at GitHub: https://github.com/jizexuan/SSSANet.
The code for CVSSN is available at GitHub: https://github.com/lms-07/CVSSN.
The code for HyLITE is available at GitHub: https://github.com/hifexplo/hylite.
The code for band synthesis and slicing of the original file is available at https://github.com/hrnavy/OHID-1 (Preprocessing).
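For orientation only, the following hypothetical NumPy sketch illustrates the kind of patch slicing such preprocessing performs; the actual scripts in the repository above may differ in patch size, overlap, and file handling.

```python
# Hypothetical illustration of slicing a hyperspectral cube into fixed-size patches.
# Assumes the scene has already been loaded as a NumPy array of shape (bands, H, W).
import numpy as np

def slice_into_patches(cube: np.ndarray, patch: int = 256):
    """Yield non-overlapping (bands, patch, patch) tiles, dropping edge remainders."""
    _, h, w = cube.shape
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            yield cube[:, i:i + patch, j:j + patch]

# patches = list(slice_into_patches(cube, patch=256))
```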
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Sergey Gorbachev, Email: gorbachev@cque.edu.cn.
Jun Yan, Email: yanjun@qust.edu.cn.
Dong Yue, Email: yued@njupt.edu.cn.
References
1. Yang, W., Yang, C., Hao, Z., Xie, C. & Li, M. Diagnosis of plant cold damage based on hyperspectral imaging and convolutional neural network. IEEE Access 7, 118239–118248 (2019).
2. Junttila, S. et al. Close-range hyperspectral spectroscopy reveals leaf water content dynamics. Remote Sensing of Environment 277 (2022).
3. Prey, L., Von Bloh, M. & Schmidhalter, U. Evaluating RGB imaging and multispectral active and hyperspectral passive sensing for assessing early plant vigor in winter wheat. Sensors 18(9), 2931 (2018).
4. Jones, T. A., Koch, S. & Li, Z. Assimilating synthetic hyperspectral sounder temperature and humidity retrievals to improve severe weather forecasts. Atmospheric Research 186, 9–25 (2017).
5. Ghosh, A. & Joshi, P. K. Hyperspectral imagery for disaggregation of land surface temperature with selected regression algorithms over different land use land cover scenes. ISPRS Journal of Photogrammetry and Remote Sensing 96, 76–93 (2014).
6. Foglini, F. et al. Application of hyperspectral imaging to underwater habitat mapping, Southern Adriatic Sea. Sensors 19(10), 2261 (2019).
7. Marcello, J., Eugenio, F., Martín, J. & Marqués, F. Seabed mapping in coastal shallow waters using high resolution multispectral and hyperspectral imagery. Remote Sensing 10(8), 1208 (2018).
8. Rubo, S. & Zinkernagel, J. Exploring hyperspectral reflectance indices for the estimation of water and nitrogen status of spinach. Biosystems Engineering 214, 58–71 (2022).
9. Gao, Y. et al. Hyperspectral and multispectral classification for coastal wetland using depthwise feature interaction network. IEEE Transactions on Geoscience and Remote Sensing 60, 1–15 (2021).
10. Sellami, A., Abbes, A. B., Barra, V. & Farah, I. R. Fused 3-D spectral-spatial deep neural networks and spectral clustering for hyperspectral image classification. Pattern Recognition Letters 138, 594–600 (2020).
11. Bock, C. H., Poole, G. H., Parker, P. E. & Gottwald, T. R. Plant disease severity estimated visually, by digital photography and image analysis, and by hyperspectral imaging. Critical Reviews in Plant Sciences 29(2), 59–107 (2010).
12. Daudt, R. C., Le Saux, B., Boulch, A. & Gousseau, Y. Urban change detection for multispectral earth observation using convolutional neural networks. IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, 2115–2118 (2018).
13. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778 (2016).
14. Russakovsky, O. et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115(3), 211–252 (2015).
15. Xu, R. et al. SDNN: Symmetric deep neural networks with lateral connections for recommender systems. Information Sciences 595, 217–230 (2022).
16. Xiao, S. et al. Complementary or substitutive? A novel deep learning method to leverage text-image interactions for multimodal review helpfulness prediction. Expert Systems with Applications 208, 118138 (2022).
17. Jin, L., Zhang, L. & Zhao, L. Feature selection based on absolute deviation factor for text classification. Information Processing & Management 60, 103251 (2023).
18. Caba Heilbron, F., Escorcia, V., Ghanem, B. & Carlos Niebles, J. ActivityNet: A large-scale video benchmark for human activity understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 961–970 (2015).
19. Chen, Y. et al. Hyperspectral images classification with Gabor filtering and convolutional neural network. IEEE Geoscience and Remote Sensing Letters 14(12), 2355–2359 (2017).
20. Ji, X., Henriques, J. F. & Vedaldi, A. Invariant information clustering for unsupervised image classification and segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision, 9865–9874 (2019).
21. Ma, W., Yang, Q., Wu, Y., Zhao, W. & Zhang, X. Double-branch multi-attention mechanism network for hyperspectral image classification. Remote Sensing 11(11), 1307 (2019).
22. Long, J., Shelhamer, E. & Darrell, T. Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3431–3440 (2015).
23. Zou, Y. et al. License plate detection and recognition based on YOLOv3 and ILPRNET. Signal, Image and Video Processing 16, 473–480 (2022).
24. Zou, Y. et al. A robust license plate recognition model based on Bi-LSTM. IEEE Access 8, 211630–211641 (2020).
25. Everingham, M., Van Gool, L., Williams, C. K., Winn, J. & Zisserman, A. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision 88(2), 303–338 (2010).
26. Goyal, R. et al. The “something something” video database for learning and evaluating visual common sense. Proceedings of the IEEE International Conference on Computer Vision, 5842–5850 (2017).
27. Gu, C. et al. AVA: A video dataset of spatiotemporally localized atomic visual actions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6047–6056 (2018).
28. Baumgardner, M. F., Biehl, L. L. & Landgrebe, D. A. 220 Band AVIRIS Hyperspectral Image Data Set: June 12, 1992 Indian Pine Test Site 3. Purdue University Research Repository. https://purr.purdue.edu/publications/1947/1 (2015).
29. Gamba, P. Pavia University Dataset. https://ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (2016).
30. NASA AVIRIS. Kennedy Space Center Dataset. http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (2016).
31. National Science Foundation Center for Airborne Laser Mapping. Houston Dataset. 2013 IEEE GRSS Data Fusion Contest. https://machinelearning.ee.uh.edu/2013-ieee-grss-data-fusion-contest/ (2013).
32. Salinas Valley Dataset. https://ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (2017).
33. Botswana Dataset. http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes (2017).
34. Cen, Y. et al. Aerial hyperspectral remote sensing classification dataset of Xiongan New Area (Matiwan Village). National Remote Sensing Bulletin 24(11), 1299–1306 (2020).
35. Sun, L., Zhang, J., Li, J., Wang, Y. & Zeng, D. SDFC dataset: a large-scale benchmark dataset for hyperspectral image classification. Optical and Quantum Electronics 55(2), 173 (2023).
36. Trento Dataset. GitHub. https://github.com/pagrim/TrentoData (2019).
37. Yokoya, N. & Iwasaki, A. Airborne hyperspectral data over Chikusei. https://naotoyokoya.com/Download.html (2016).
38. Zhong, Y. et al. WHU-Hi: UAV-borne hyperspectral with high spatial resolution (H2) benchmark datasets and classifier for precise crop identification based on deep convolutional neural network with CRF. Remote Sensing of Environment 250, 112012 (2020).
39. Aerial hyperspectral remote sensing classification dataset of Xiongan New Area (Matiwan Village). 10.11834/jrs.20209065.
40. Jun, Y. et al. Using a hybrid FNN method for image classification of satellite remote sensing data. Artificial Intelligence Impressions 1, 159–180 (2022).
41. Liu, W., Zhang, Y., Yan, J., Zou, Y. & Cui, Z. Semantic segmentation network of remote sensing images with dynamic loss fusion strategy. IEEE Access 9, 70406–70418 (2021).
42. Mohanty, S. P., Hughes, D. P. & Salathé, M. Using deep learning for image-based plant disease detection. Frontiers in Plant Science 7, 1419 (2016).
43. Lee, H. & Kwon, H. Going deeper with contextual CNN for hyperspectral image classification. IEEE Transactions on Image Processing 26(10), 4843–4855 (2017).
44. Jozefowicz, R., Zaremba, W. & Sutskever, I. An empirical exploration of recurrent network architectures. 32nd International Conference on Machine Learning, ICML 3, 2332–2340 (2015).
45. Li, F. et al. Evaluating hyperspectral vegetation indices for estimating nitrogen concentration of winter wheat at different growth stages. Precision Agriculture 11(4), 335–357 (2010).
46. Wang, W., Dou, S., Jiang, Z. & Sun, L. A fast dense spectral–spatial convolution network framework for hyperspectral images classification. Remote Sensing 10(7), 1068 (2018).
47. Deng, J. & Wei, X. Dataset OHID-1: A New Large Hyperspectral Image Dataset for Multi-Classification. Figshare, 10.6084/m9.figshare.27966024.v8 (2024).
48. Rani, N., Mandla, V. R. & Singh, T. Evaluation of atmospheric corrections on hyperspectral data with special reference to mineral mapping. Geoscience Frontiers 8(4), 797–808 (2017).
49. López-Serrano, P. M., Corral-Rivas, J. J., Díaz-Varela, R. A., Álvarez-González, J. G. & López-Sánchez, C. A. Evaluation of radiometric and atmospheric correction algorithms for aboveground forest biomass estimation using Landsat 5 TM data. Remote Sensing 8(5), 369 (2016).
50. Prieto-Amparan, J. A. et al. Atmospheric and radiometric correction algorithms for the multitemporal assessment of grasslands productivity. Remote Sensing 10(2), 219 (2018).
51. Möllenbrok, L., Sumbul, G. & Demir, B. Deep active learning for multi-label classification of remote sensing images. IEEE Geoscience and Remote Sensing Letters 20, 5002405 (2023).
52. Tan, S., Yan, J., Jiang, Z. & Huang, L. Approach for improving YOLOv5 network with application to remote sensing target detection. Journal of Applied Remote Sensing 15(03), 036512 (2021).
53. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 1–14 (2015).
54. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25, 1097–1105 (2012).
55. Webb, G. I., Keogh, E. & Miikkulainen, R. Naive Bayes. Encyclopedia of Machine Learning, 713–714 (Springer, 2010).
56. Shi, W. et al. Landslide recognition by deep convolutional neural network and change detection. IEEE Transactions on Geoscience and Remote Sensing 59(6), 4654–4672 (2020).
57. Du, R. et al. CDCNN-CMR-SV algorithm for robust adaptive wideband beamforming. Signal, Image and Video Processing 17(5), 2137–2143 (2023).
58. Zhong, Z. et al. Spectral–spatial residual network for hyperspectral image classification: A 3-D deep learning framework. IEEE Transactions on Geoscience and Remote Sensing 56(2), 847–858 (2017).
59. Li, R., Zheng, S., Duan, C., Yang, Y. & Wang, X. Classification of hyperspectral image based on double-branch dual-attention mechanism network. Remote Sensing 12(3), 582 (2020).
60. Zhang, X. et al. Spectral–spatial self-attention networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 60, 5512115 (2022).
61. Li, M., Liu, Y., Xue, G., Huang, Y. & Yang, G. Exploring the relationship between center and neighborhoods: Central vector oriented self-similarity network for hyperspectral image classification. IEEE Transactions on Circuits and Systems for Video Technology 33(4), 1979–1993 (2022).
62. Wang, W., Dou, S., Jiang, Z. & Sun, L. A fast dense spectral–spatial convolution network framework for hyperspectral images classification. Remote Sensing 10(7), 1068 (2018).
63. Thiele, S. T. et al. Multi-scale, multi-sensor data integration for automated 3-D geological mapping. Ore Geology Reviews 136, 104252 (2021).
64. Hu, W., Huang, Y., Wei, L., Zhang, F. & Li, H. Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors, 258619 (2015).
65. Makantasis, K., Karantzalos, K., Doulamis, A. & Doulamis, N. Deep supervised learning for hyperspectral data classification through convolutional neural networks. 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 4959–4962 (2015).
66. Hamida, A. B., Benoit, A., Lambert, P. & Amar, C. B. 3-D deep learning approach for remote sensing image classification. IEEE Transactions on Geoscience and Remote Sensing 56(8), 4420–4434 (2018).
67. Zhang, T. et al. CDCNN: a model based on class center vectors and distance comparison for wear particle recognition. IEEE Access 8, 113262–113270 (2020).
68. Cortes, C. & Vapnik, V. Support-vector networks. Machine Learning 20, 273–297 (1995).