Sensors (Basel, Switzerland)
2026 Apr 17;26(8):2474. doi: 10.3390/s26082474

Design of a High Dynamic Range Acquisition System for Airborne VNIR Push-Broom Hyperspectral Camera

Haoyang Feng 1,2,3, Yueming Wang 1,2,3,*, Daogang He 2, Changxing Zhang 2, Chunlai Li 2
Editor: Steve Vanlanduit
PMCID: PMC13120293  PMID: 42076581

Abstract

Achieving a high frame rate and high dynamic range (HDR) under complex illumination remains a significant challenge for airborne push-broom visible-near-infrared (VNIR) hyperspectral cameras. Problematic scenarios typically include high-contrast scenes, such as ocean whitecaps alongside deep water or concurrently sunlit and shadowed urban surfaces. To address this, a real-time HDR acquisition system based on a dual-gain complementary metal–oxide–semiconductor (CMOS) image sensor is proposed. Specifically, a four-pixel HDR fusion method is developed, utilizing an optical calibration setup to accurately determine the fusion parameters and configure the spectral region of interest (ROI) for reduced data volume. The complete workflow, encompassing spectral–spatial four-pixel binning and piecewise dual-gain fusion, is implemented on a field-programmable gate array (FPGA) using a dual-port RAM-based buffering strategy and a low-latency five-stage pipeline. Experimental results demonstrate a minimal processing latency of 0.0183 ms and a maximum frame rate of 290 frames/s. By extending the output bit depth from 11 to 15 bits, the system achieves a final-output digital dynamic range of 2.03 × 10⁴:1, representing a 9.58-fold improvement over the original low-gain data. The fused HDR data maintain high linearity and good spectral fidelity, with spectral angle mapper (SAM) values at the 10⁻³ level. Featuring a compact and low-power design, this system provides a practical engineering solution for efficient airborne VNIR hyperspectral acquisition.

Keywords: airborne hyperspectral, high frame rate, high dynamic range, dual-gain CMOS, FPGA, HDR fusion

1. Introduction

An airborne visible–near-infrared (VNIR) push-broom hyperspectral camera captures one-dimensional spatial information and continuous spectral data across the VNIR bands within a single exposure. By performing push-broom scanning along the flight direction, it reconstructs two-dimensional spatial information and forms a data cube [1]. This imaging modality holds significant application value in fields such as industrial inspection [2], water remote sensing [3], crop analysis [4], and environmental monitoring [5]. Complementary metal–oxide–semiconductor (CMOS) image sensors are widely employed as detectors in such hyperspectral systems due to their high quantum efficiency in the VNIR band, high level of circuit integration, and relatively simple hardware implementation [6].

During practical observations, a single field of view often contains targets with markedly different brightness levels. In ocean scenes, for example, whitecaps and foam can exhibit VNIR reflectance of 40–60% [7], whereas the water-leaving signal from clear deep water in the same spectral range is much weaker, with reflectance typically around 0.6–1% [8]. Shadowed regions further increase the contrast between bright and dark areas. A similar situation occurs in urban scenes, where occlusion by high-rise buildings causes sunlit and shaded surfaces to appear at the same time. This often compresses, or even obscures, details in shadowed regions and can therefore affect subsequent spectral inversion [9]. In such high dynamic range (HDR) scenes, the luminance span may reach roughly four orders of magnitude [10]. After spectral dispersion, the reflected light from the same target also varies considerably from one band to another. As a result, the CMOS area-array data obtained in a single exposure contain both strong and weak signals across the spatial and spectral dimensions. To preserve details in both bright and dark regions for most HDR scenes, the acquisition system therefore needs a dynamic range on the order of 10⁴.

In contrast, the typical dynamic range of a conventional linear CMOS image sensor in a single exposure is approximately 60 dB, corresponding to a span of only about 10³ [11]. As a result, such sensors struggle to directly accommodate the luminance variation encountered in HDR scenes: strong-signal pixels are prone to saturation, while weak-signal pixels are easily buried in noise. This leads to the loss of both bright and dark details in the two-dimensional spatial image and compromises the completeness of spectral information. To enhance the capability of a CMOS acquisition system to capture both bright and dark region details simultaneously, HDR acquisition methods are commonly employed.

Existing studies on HDR acquisition can generally be classified into three categories. The first expands the dynamic range in the temporal dimension by capturing multiple frames under varying exposure times or gain settings. For instance, Nie et al. [12] utilized low-rank representation, adaptive weights, and guided filtering to fuse low-illumination grayscale images, effectively mitigating noise and preserving shadow and highlight details. Building on this, Tan et al. [13] proposed a multi-ghost-suppression, multi-level fusion framework to reduce ghosting artifacts in dynamic scenes. However, for airborne push-broom hyperspectral cameras, multi-frame fusion remains highly susceptible to motion-induced ghosting and registration errors, while the associated fusion algorithms impose a prohibitive computational burden. The second category enhances dynamic range at the hardware level via optical design or device modification—such as spatial modulation or beam splitting—to acquire multiple responses simultaneously. For example, He et al. [14] developed a DMD-based scheme to obtain multiple luminance responses through optical modulation, while Qiao et al. [15] presented an HDR video reconstruction method utilizing variable-ratio beam splitting across multiple sensors. Although these approaches enable parallel multi-channel acquisition, they significantly increase hardware complexity and physical volume. This limitation is particularly detrimental for hyperspectral cameras, where such modules must be seamlessly integrated with intricate dispersive optics. Compared to the aforementioned approaches, the third category acquires multiple complementary responses within a single exposure and fuses them in real time. This effectively mitigates the artifacts introduced by multi-frame fusion and features a straightforward implementation, making it highly suitable for airborne hyperspectral imaging. 
The development of this approach has evolved from empirical piecewise fusion to image mapping-based fusion, and ultimately to optical calibration-based image fusion. Along this line, He et al. [16] proposed a three-segment HDR synthesis method for dual-channel scientific CMOS data. Although this method successfully extends the dynamic range, its fusion parameters are determined solely by ideal gain ratios rather than through rigorous optical calibration, thereby compromising the stringent spectral consistency required for hyperspectral data. Furthermore, He et al. [17] employed a simplified reverse S-shaped mapping to expand the usable image range prior to multiscale fusion. Although this method derives the fusion formulas and parameters through scientific computation, these parameters rely exclusively on the acquired image data itself, thereby compromising spectral fidelity and hindering subsequent data inversion. More recently, You et al. [18] developed a real-time HDR system for large-format area-array imaging based on a dual-gain CMOS sensor, which operates at 24 frames/s and employs a varying-illumination method for parameter calibration and real-time fusion. However, this approach focuses exclusively on area-array imaging while neglecting dark-end noise performance. The system also incurs a substantial computational latency of 35 ms, which severely restricts the achievable frame rate. Moreover, because its calibration relies heavily on stable light intensity variations, it struggles to guarantee the linearity of the fused data. Beyond the three aforementioned categories, deep-learning-based approaches have also been explored, shifting dynamic-range enhancement to the reconstruction stage [19]. However, these methods rely heavily on the scale and quality of training data, demand substantial computational resources, and remain exceedingly difficult to deploy within high-frame-rate, low-latency acquisition systems. 
Consequently, these inherent limitations render such data-driven approaches unsuitable for real-time hyperspectral imaging.

To satisfy the aforementioned requirements and provide a reliable solution for real-time HDR hyperspectral data acquisition, this study addresses the inherent limitations of existing single-exposure fusion methods. Specifically, issues such as compromised spectral fidelity, excessive dark-end noise, unrefined optical calibration procedures, and prohibitive computational latency limit the applicability of current approaches to real-time hyperspectral imaging. Consequently, this paper proposes a four-pixel real-time HDR fusion method based on a dual-gain CMOS sensor. To support this approach, an optical calibration scheme is designed to accurately determine the fusion parameters, and a low-latency field-programmable gate array (FPGA) architecture is developed for real-time implementation. The proposed method is intended to preserve spectral fidelity, suppress dark-end noise, and enhance the digital dynamic range of the final output while supporting high-frame-rate operation. Ultimately, this system provides a practical engineering pathway for airborne push-broom VNIR hyperspectral cameras to achieve high-frame-rate HDR information acquisition in complex scenes.

2. Methods

2.1. Constraint Analysis for Dynamic Range Enhancement

Within the linear operating region of a CMOS sensor, the digital number (DN) value of a single pixel can be approximated as proportional to the equivalent number of photoelectrons accumulated during a single exposure. This relationship is characterized by a linear response with an offset term [20]:

D = k·N_e + b, (1)

where D is the pixel DN value, k is the conversion coefficient from the equivalent number of photoelectrons to the pixel DN, N_e is the equivalent number of photoelectrons accumulated during the exposure, and b is the offset term.

In the single-exposure, single-gain acquisition mode, once the gain setting is determined, k and b can be approximately regarded as constants. For representative pixels selected from the bright and dark regions within the same frame of data, let the equivalent numbers of photoelectrons corresponding to their pixel digital outputs be N_eb and N_ed, respectively. Then the bright-dark span of this frame can be expressed as

C = N_eb / N_ed = (D_b − b) / (D_d − b), (2)

where C is the bright-dark span ratio of the frame, D_b is the DN value of the bright-end pixel, and D_d is the DN value of the dark-end pixel.

In practical applications, to prevent saturation at the bright end of the CMOS sensor and ensure the preservation of highlight details, it is generally necessary to keep the DN value of the bright-end pixel below the saturation threshold, i.e.,

D_b ≤ D_sat, (3)

where D_sat denotes the saturation DN value of the CMOS sensor.

Substituting Equation (1) into the above inequality yields

k·N_eb + b ≤ D_sat, (4)

Accordingly, the equivalent number of photoelectrons accumulated at the bright end during the exposure must satisfy

N_eb ≤ (D_sat − b) / k, (5)

By incorporating Equation (2) into Equation (5), we obtain the following constraint on the equivalent number of photoelectrons at the dark end, which can be expressed as

N_ed ≤ (D_sat − b) / (kC), (6)

This condition can be further translated into a constraint on the effective signal amplitude at the dark end in the digital domain:

D_d − b = (D_b − b) / C ≤ (D_sat − b) / C, (7)

Equation (7) indicates that, under single-exposure, single-gain acquisition, the effective digital amplitude at the dark end is passively reduced as the frame bright-dark span ratio C increases. When the scene to be captured exhibits a large bright-dark span, C increases accordingly. Even if the CMOS bright end remains within the linear region without saturation, the dark end may still stay in a low-DN range for a long time and be submerged by noise. Attempting to recover usable dark-end information by extending exposure time would cause the bright-end pixels to saturate, resulting in significant loss of highlight details. Moreover, during a single exposure, the gain setting and the conversion coefficient k are fixed, restricting the system to a single gain configuration. If a higher gain is selected, Equations (1) and (5) indicate that the dark-end pixel DN value D_d increases, but the maximum allowable accumulated equivalent photoelectrons at the bright end, N_eb, decreases accordingly, rendering bright-end pixels more susceptible to premature saturation. Conversely, if a lower gain is selected, the headroom at the bright end expands, reducing the likelihood of saturation, but the effective amplitude at the dark end is further compressed, making it harder to exceed the noise threshold. Under the high-frame-rate acquisition constraints imposed by push-broom imaging, the exposure time is limited, and dark-end signal accumulation is difficult to compensate by simply extending the exposure. This limitation exacerbates the inherent trade-off, rendering it even more challenging to balance bright- and dark-end performance. Consequently, single-exposure, single-gain acquisition inevitably involves a trade-off in scenes where bright and dark regions coexist.
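The passive compression of the dark end described by Equation (7) can be illustrated with a minimal Python sketch. The function name and the sensor values (an 11-bit sensor with a hypothetical offset) are illustrative assumptions, not values from this paper:

```python
def dark_end_amplitude(d_sat: float, offset: float, contrast: float) -> float:
    """Upper bound on the dark-end effective amplitude, Eq. (7):
    D_d - b <= (D_sat - b) / C."""
    return (d_sat - offset) / contrast

# Hypothetical 11-bit sensor: D_sat = 2047, offset b = 100.
# As the bright-dark span ratio C grows, the usable dark-end
# amplitude shrinks toward the noise floor.
for c in (10, 100, 1000, 10000):
    print(c, dark_end_amplitude(2047, 100, c))
```

At C = 10⁴ the dark-end amplitude falls below a single DN, which is why a single-gain readout cannot resolve both ends of such a scene.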

To enable a consistent system-level evaluation of the final digital output and the dynamic-range extension achieved by the proposed system, this study adopts the digital dynamic range of the final output as the evaluation metric. This relationship is formulated as follows [21]:

R_D = (D_sat − b) / δ_d, (8)

Specifically, the difference between the maximum unsaturated output DN value and the dark offset, D_sat − b, is taken as the maximum usable signal swing of the final output. The parameter δ_d denotes the standard deviation of repeated dark-frame DN outputs under dark conditions and is used to characterize the output noise floor. Accordingly, the ratio describes the span between the maximum usable signal swing and the output noise level of the final digital output.

Equation (8) indicates that the digital dynamic range of the final output in a single frame can be improved in two complementary ways. One is to enlarge the usable range at the bright end so that strong signals are not clipped by saturation while weak signals remain usable [22]. The other is to reduce the output noise floor so that weak signals under dark conditions can be distinguished more clearly. Guided by this idea, the present work develops a four-pixel HDR fusion method that improves both the non-saturated upper limit at the bright end and the discernibility of weak signals at the dark end.
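The metric of Equation (8) is a simple ratio; the short sketch below evaluates it and converts it to decibels. The numbers are illustrative placeholders, not the measured values reported later in the paper:

```python
import math

def digital_dynamic_range(d_sat: float, offset: float, dark_std: float) -> float:
    """Digital dynamic range of the final output, Eq. (8):
    R_D = (D_sat - b) / delta_d."""
    return (d_sat - offset) / dark_std

# Illustrative numbers: a 15-bit fused output with a usable swing of
# 32,000 DN and a dark-frame standard deviation of 2 DN.
rd = digital_dynamic_range(32100, 100, 2.0)
print(f"R_D = {rd:.0f}:1  ({20 * math.log10(rd):.1f} dB)")
```

The two levers named in the text map directly onto the arguments: enlarging the usable swing raises the numerator, while suppressing dark noise lowers the denominator.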

2.2. Principle of the Four-Pixel HDR Fusion Method

This paper proposes a four-pixel HDR fusion method tailored for the real-time acquisition system of an airborne push-broom VNIR hyperspectral camera. The schematic diagram of the method is presented in Figure 1.

Figure 1.

Figure 1

Schematic of the four-pixel HDR fusion method.

Given that hyperspectral data exhibit local continuity in both the spatial and spectral dimensions [23], four-pixel binning is performed in the digital domain along the spectral and spatial dimensions for the high-gain and low-gain data streams separately. This operation aims to reduce dark noise and alleviate the computational burden of subsequent processing and transmission. The binning formula is

D̄_g(u,v) = (1/4) · Σ_{i=0}^{1} Σ_{j=0}^{1} D_g(2u−1+i, 2v−1+j),  g ∈ {lg, hg}, (9)

where D̄_g denotes the binned high-gain or low-gain data, u is the binned row index in the spectral dimension, v is the binned column index in the spatial dimension, and D_g is the original high-gain or low-gain data.
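The 2 × 2 binning of Equation (9) can be sketched in a few lines of NumPy; `bin_four_pixels` and the toy frame are illustrative names for this sketch, not part of the paper's implementation:

```python
import numpy as np

def bin_four_pixels(frame: np.ndarray) -> np.ndarray:
    """Spectral-spatial 2x2 binning, Eq. (9): each output sample is the
    mean of a non-overlapping 2x2 block of the raw frame."""
    rows, cols = frame.shape
    assert rows % 2 == 0 and cols % 2 == 0, "frame dimensions must be even"
    # Split rows and columns into pairs, then average over each pair.
    return frame.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

# Toy 4x4 frame: each 2x2 block collapses to its mean.
raw = np.arange(16, dtype=np.float64).reshape(4, 4)
print(bin_four_pixels(raw))
```

Averaging four adjacent samples also reduces uncorrelated dark noise, which is the noise-floor benefit the text refers to.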

Within the linear response region of the CMOS sensor, under constant incident illumination, the equivalent number of photoelectrons accumulated by a pixel is approximately proportional to the exposure time. Furthermore, due to the local continuity of push-broom VNIR hyperspectral data in the spatial and spectral dimensions, adjacent pixels exhibit only minor response variations. Therefore, after binning adjacent pixels, the output remains consistent with the linear model, which can be written as

D̄_g = s_g · t_exp + b_g,  g ∈ {lg, hg}, (10)

where s_g is the conversion coefficient for the high-gain or low-gain channel, t_exp is the exposure time, and b_g is the offset for the high-gain or low-gain channel.

Since the binned high-gain and low-gain data corresponding to the same pixel share an identical exposure time, eliminating the exposure-time term t_exp yields a linear relationship within the overlapping linear region:

D̄_hg = a·D̄_lg + o,  a = s_hg / s_lg,  o = b_hg − a·b_lg, (11)

The mapping coefficient a and offset o are employed to characterize the linear correspondence between the two gain channels. These parameters can be determined through calibration and linear regression. As shown in Equation (12), this study adopts a pixel-wise piecewise fusion strategy to generate the final HDR output. For each binned high-gain sample, a saturation check is performed. If the binned high-gain data are not saturated, they are directly output as HDR data. When the signal approaches saturation or enters the nonlinear region, the binned low-gain data, which remain within the linear response region, are used instead. In this case, the low-gain value is mapped to the high-gain equivalent domain via Equation (11) as the HDR output for that pixel.

D_hdr = { D̄_hg,         if D̄_hg ≤ T_sat
        { a·D̄_lg + o,   if D̄_hg > T_sat,  (12)

where D_hdr is the HDR data and T_sat is the saturation decision threshold of the binned high-gain channel.

This fusion method involves only a threshold comparison and a single linear-mapping operation, which facilitates pipelined implementation on an FPGA and meets the demands of high-frame-rate real-time processing.
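Because the fusion reduces to one comparison and one linear mapping per pixel, it can be modeled compactly in Python. The parameter values below are placeholders for illustration (not the calibrated a, o, and T_sat determined later in Section 2.3):

```python
import numpy as np

# Illustrative fusion parameters; the paper determines these by
# optical calibration (see Section 2.3).
A, O, T_SAT = 10.0, -100.0, 1940

def fuse_hdr(d_hg: np.ndarray, d_lg: np.ndarray) -> np.ndarray:
    """Pixel-wise piecewise dual-gain fusion, Eq. (12): keep the binned
    high-gain sample while it is below the saturation threshold;
    otherwise map the binned low-gain sample into the high-gain
    equivalent domain via Eq. (11)."""
    return np.where(d_hg <= T_SAT, d_hg, A * d_lg + O)

hg = np.array([500.0, 1800.0, 2040.0])  # last pixel exceeds T_sat
lg = np.array([60.0, 190.0, 300.0])
print(fuse_hdr(hg, lg))  # third pixel replaced by the mapped low-gain value
```

Since only the branch taken changes per pixel, the same structure maps directly onto the threshold-compare and multiply-add stages of the FPGA pipeline described in Section 3.2.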

2.3. Calibration and Computation of Fusion Parameters

The proposed four-pixel HDR fusion involves two types of parameters: (1) the saturation decision threshold T_sat for the binned high-gain channel, used to determine the piecewise switching point, and (2) the mapping coefficient a and offset o, employed to map the binned low-gain data into the high-gain equivalent domain. Given that the linear response region of a CMOS sensor typically spans 5% to 95% of its full scale, signals exceeding this upper limit are prone to nonlinear distortion and saturation overflow. Therefore, the saturation decision threshold T_sat is set to 95% of the saturation digital value of the CMOS sensor. This threshold serves as the criterion for determining whether the binned high-gain data have approached saturation or entered the nonlinear region.

This paper employs an integrating sphere as a uniform light source to calibrate the mapping coefficient a and offset o. The experimental setup is illustrated in Figure 2.

Figure 2.

Figure 2

Experimental platform for fusion parameter calibration.

During calibration, the optical power output of the integrating sphere is maintained constant, and multiple sets of raw high-gain and low-gain data are acquired by varying the exposure time. Consequently, exposure time serves as the sole variable requiring control to fit the gain relationship in the subsequent fusion process. Regulated by a high-precision FPGA counter, this exact timing control significantly improves the fitting accuracy. For each set of raw data, four-pixel binning along the spectral and spatial dimensions is performed to obtain D̄_hg and D̄_lg. To minimize the influence of edge non-uniformity introduced by the integrating sphere, a stable central response region is selected within each binned image. The pixel DN values within this region are averaged to derive representative output values for each exposure time setting:

L̄ = (1/N_Ω) · Σ_{(u,v)∈Ω} D̄_lg(u,v), (13)
H̄ = (1/N_Ω) · Σ_{(u,v)∈Ω} D̄_hg(u,v), (14)

where H̄ and L̄ denote the average DN values within the stable region of the binned high-gain and low-gain data at the same exposure time, Ω is the set of pixels in the selected central stable region, and N_Ω is the number of pixels in Ω.

For each exposure time setting, one frame is acquired and the corresponding H̄ and L̄ are computed. Then, using the (t_exp, H̄) and (t_exp, L̄) pairs as samples, a first-order linear fit is performed for each channel by the least-squares method.
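The per-channel fits and the derivation of the mapping parameters from Equation (11) can be sketched as follows. The function name and the noiseless synthetic samples are assumptions for illustration, with NumPy's `polyfit` standing in for the least-squares fit:

```python
import numpy as np

def calibrate_fusion(t_exp, h_mean, l_mean):
    """Fit D = s * t_exp + b for each gain channel by least squares,
    then derive a = s_hg / s_lg and o = b_hg - a * b_lg (Eq. (11))."""
    s_hg, b_hg = np.polyfit(t_exp, h_mean, 1)
    s_lg, b_lg = np.polyfit(t_exp, l_mean, 1)
    a = s_hg / s_lg
    o = b_hg - a * b_lg
    return a, o

# Synthetic calibration samples: the high-gain channel responds 10x
# faster and has a different offset.
t = np.array([1.0, 2.0, 3.0, 4.0])
lg = 5.0 * t + 20.0   # s_lg = 5,  b_lg = 20
hg = 50.0 * t + 50.0  # s_hg = 50, b_hg = 50
a, o = calibrate_fusion(t, hg, lg)
print(a, o)  # a ≈ 10, o ≈ -150
```

With real calibration data the samples carry noise, which is exactly why the paper averages a stable central region and fits over eight exposure settings rather than using a single pair of frames.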

The system presented in this study is based on the dual-gain CMOS sensor CIS2521F (Fairchild Imaging, San Jose, CA, USA). Its key parameters are summarized in Table 1.

Table 1.

Key parameters of CIS2521F.

Parameters Value
Native resolution 2560 × 2160 (pixels)
Number of data channels 2
High gain ×10, ×30
Low gain ×1, ×2
Data interface type Parallel
Data type Gray code
Output bit width 11-bit

In accordance with the aforementioned procedure, this study conducted a comprehensive optical calibration of the fusion parameters for the CIS2521F. Specifically, a 10× gain was selected for the high-gain channel, whereas a 1× gain was used for the low-gain channel, and the saturation threshold T_sat was set to 1940. A central stable region of 100 × 100 pixels within the binned image was designated for calibration. Subsequently, eight sets of (mean DN, exposure time) pairs drawn from the linear region were used as calibration samples to fit the response functions of the high-gain and low-gain channels, respectively. The fitting results for these two channels are presented in Figure 3.

Figure 3.

Figure 3

Mean DN as a function of exposure time and the corresponding linear fits. (a,b) low-gain channel; (c,d) high-gain channel.

Within the linear overlap region shared by the two channels, the mapping coefficient a and offset o are calculated from the fitting results as follows:

a ≈ 9.2766,  o ≈ −2073.567, (15)

Accordingly, in this implementation, the HDR fusion output formula is ultimately expressed as

D_hdr = { D̄_hg,                      if D̄_hg ≤ 1940
        { 9.2766·D̄_lg − 2073.567,    if D̄_hg > 1940,  (16)

After calibrating the fusion parameters, spectral calibration of the hyperspectral camera was carried out with a monochromator. This calibration established the correspondence between spectral-row index and center wavelength, from which the row-index-to-wavelength mapping in Figure 4 was obtained. The results show that the effective system response in the 400–900 nm range mainly lies between rows 227 and 586 of the spectral dimension, covering 360 rows. To reduce redundant data and improve acquisition efficiency, the spectral-dimension region of interest (ROI) was therefore set to rows 227–586. This setting reduces the spectral readout size while still covering the effective spectral range, which in turn lowers transmission and storage burden and supports high-frame-rate acquisition.

Figure 4.

Figure 4

Relationship between spectral-row index and center wavelength.

3. Hardware Implementation of the System

3.1. Overall System Design

As illustrated in Figure 5, the real-time acquisition system developed in this study is built around a CIS2521F dual-gain CMOS image sensor, utilizing a Xilinx FPGA, specifically the XC6SLX75 from the Spartan-6 series (Xilinx, San Jose, CA, USA), as the main controller. At the hardware level, the system is powered by an external 12 V supply, and the required voltage rails are generated by DC-DC converters and low-dropout linear regulators. These rails provide stable power for both the CMOS sensor and the FPGA, ensuring reliable operation. An external crystal oscillator serves as the reference clock for the FPGA, and multiple working clocks are generated through the FPGA phase-locked loop. These clocks are then distributed to the CMOS sensor and the internal FPGA modules. At the same time, the pixel clock returned by the CMOS sensor is used as the acquisition-chain clock inside the FPGA. At the control level, the FPGA configures the spectral ROI of the CMOS sensor through a low-speed JTAG interface and generates the exposure-control timing signals.

Figure 5.

Figure 5

Block diagram of the HDR acquisition system.

At the data-acquisition level, the raw high-gain and low-gain data are first received by the FPGA and then processed by gray-to-binary decoding, dual-gain line buffering, four-pixel binning along the spectral and spatial dimensions, and pixel-level HDR fusion. Finally, the HDR data are output to the data recorder through the Camera Link interface, enabling continuous acquisition and reliable storage under high-frame-rate conditions.

3.2. FPGA Implementation

To implement four-pixel binning efficiently, we use a two-line buffering scheme based on dual-port RAM. As shown in Figure 6, the architecture adopts a two-line buffer with a single-readout arrangement. Two adjacent lines are buffered together, and pixels from the same column are written into the dual-port RAM with odd-even interleaved addressing. This allows the four pixel values required for binning to be fetched simultaneously in one read cycle. The four-point accumulation, shift-based division, and fixed-point rounding are then completed within the same clock cycle:

Y = (X_1 + X_2 + X_3 + X_4 + 2) >> 2, (17)

where Y denotes the binned pixel value and X_1, X_2, X_3, X_4 are the original DN values of four adjacent pixels.
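The rounding behavior of Equation (17) is easy to verify with a small software model; `bin4_fixed_point` is a hypothetical name mirroring the FPGA datapath, not code from the paper:

```python
def bin4_fixed_point(x1: int, x2: int, x3: int, x4: int) -> int:
    """Hardware-style 2x2 binning, Eq. (17): sum the four samples,
    add a rounding constant of 2, then shift right by 2 (divide by 4)."""
    return (x1 + x2 + x3 + x4 + 2) >> 2

# The +2 turns the truncating right shift into round-to-nearest:
print(bin4_fixed_point(1, 1, 1, 2))  # sum 5, 5/4 = 1.25 -> 1
print(bin4_fixed_point(1, 1, 2, 2))  # sum 6, 6/4 = 1.5  -> 2
```

Adding half the divisor (here 2, for a divisor of 4) before the shift is the standard fixed-point rounding trick, which keeps the whole operation within a single adder and a wire shift on the FPGA.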

Figure 6.

Figure 6

Schematic of the four-pixel binning implementation.

Because the two-line buffer must be filled before the first binned output can be generated, a fixed latency occurs at the beginning of each frame. This latency can be approximated as

t_start ≈ 2·N_col / f_pix, (18)

where t_start is the fixed per-frame buffering latency, N_col is the number of columns in a raw single line, and f_pix is the pixel clock frequency.

With N_col = 2560 and f_pix = 280 MHz in this system, the calculated t_start is approximately 18.3 μs. Once the system enters steady-state operation, a ping-pong read/write mechanism is employed to prevent contention between read and write accesses. This configuration ensures that the output rate matches that of the raw CMOS data stream, thereby providing a stable and uninterrupted input for subsequent real-time processing.
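Equation (18) with the stated system values can be checked in one line; the helper name is illustrative:

```python
def start_latency_us(n_col: int, f_pix_hz: float) -> float:
    """Per-frame two-line buffering latency, Eq. (18):
    t_start ~ 2 * N_col / f_pix, returned in microseconds."""
    return 2 * n_col / f_pix_hz * 1e6

# System values from the paper: 2560 columns per raw line, 280 MHz pixel clock.
print(f"{start_latency_us(2560, 280e6):.1f} us")  # ≈ 18.3 us
```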

After four-pixel binning, the FPGA carries out pixel-level HDR fusion on the binned dual-gain data. To meet the demands of high-frame-rate real-time processing, this computation is organized as a five-stage pipeline. Fusion parameters, including the mapping coefficient and offset, are pre-quantized in fixed-point form and stored in registers for online use. The five stages perform input alignment, high-gain saturation judgment, low-gain mapping multiplication, offset addition with fixed-point rounding, and piecewise output selection, respectively, as shown in Figure 7. This design supports continuous pixel-by-pixel output, and the overall pipeline latency is only five clock cycles. At a fusion-module clock of 80 MHz, the corresponding latency is 62.5 ns, which has negligible effect on the overall frame rate.

Figure 7.

Figure 7

Flowchart of the dual-gain fusion pipeline.

Ultimately, the proposed system achieves real-time HDR acquisition at a maximum frame rate of 290 frames/s with a single-frame resolution of 1280 × 180 pixels. The total processing latency is only 18.3 μs, accompanied by low hardware resource utilization (as shown in Table 2).

Table 2.

Resource utilization of the XC6SLX75 FPGA.

Resource      Slice Registers  LUTs    Occupied Slices  BUFG  Block RAM  DSP48A1s
Available     93,296           46,648  11,662           16    516        132
Used          4451             4889    1952             8     70         15
Utilization   4%               10%     16%              50%   13%        11%

3.3. System Prototype

Based on the proposed method, the acquisition system was designed and implemented. To achieve a compact hardware footprint, a modular architecture was adopted, utilizing FPGA Mezzanine Card (FMC) connectors to interconnect multiple circuit boards. A photograph of the acquisition system circuitry is shown in Figure 8, with overall dimensions of 5.5 cm × 5.5 cm × 3.2 cm. Figure 9 presents the integrated hyperspectral camera prototype following assembly with the spectrometer.

Figure 8.

Figure 8

Photograph of the acquisition system hardware.

Figure 9.

Figure 9

Photograph of the hyperspectral camera prototype.

Furthermore, a multistage power-conversion scheme was employed to generate the required supply rails, thereby minimizing overall power consumption. Operating under a 12 V supply, the system draws an input current of approximately 0.18 A in the standby state and 0.37 A during active imaging, which corresponds to power consumptions of 2.16 W and 4.44 W, respectively. The primary physical and electrical parameters of the hardware system are summarized in Table 3.

Table 3.

Main physical and electrical parameters of the hardware system.

Parameters Value
Hardware system dimensions 5.5 cm × 5.5 cm × 3.2 cm
Supply voltage 12 V
Idle current 0.18 A
Operating current 0.37 A
Idle power consumption 2.16 W
Imaging power consumption 4.44 W

4. Imaging Results and Discussion

4.1. Experimental Setup and HDR Imaging Results

To evaluate the performance of the designed acquisition system and demonstrate its HDR capabilities, a ground-based experimental setup was constructed utilizing a rotating platform. As illustrated in Figure 10, the hyperspectral camera, consisting of a lens, a spectrometer, and the acquisition system, is mounted on an optical breadboard secured to the rotation stage. By providing controlled relative motion between the camera and the outdoor scene, this apparatus reproduces the line-by-line scanning process required for airborne push-broom hyperspectral imaging. The selected outdoor scene, encompassing sky, vegetation, and buildings under natural illumination, guarantees the simultaneous presence of bright and dark targets within a single scan. This configuration is representative of the high-contrast conditions typically encountered in airborne VNIR applications.

Figure 10.

Figure 10

Diagram of the ground experimental apparatus.

Utilizing the aforementioned experimental platform, raw high-gain, raw low-gain, and fused HDR data were acquired for the same scene under identical illumination conditions. During these acquisitions, the system maintained a uniform exposure time of 5 ms to facilitate subsequent presentation and analysis. Figure 11 presents the 2D imaging results of the HDR push-broom data at several representative bands. Specifically, Figure 11a displays the HDR hyperspectral data cube, where the RGB composite image is synthesized from the 660, 550, and 480 nm bands. Meanwhile, Figure 11b–f depict the corresponding HDR images extracted at 550 nm, 650 nm, 720 nm, 810 nm, and 900 nm, respectively.

Figure 11.

Figure 11

2D HDR imaging results at representative VNIR bands. (a) Hyperspectral data cube; (b) 550 nm; (c) 650 nm; (d) 720 nm; (e) 810 nm; (f) 900 nm.

As evident from the visual results, the primary spatial structures of the scene remain clearly discernible across the entire VNIR spectrum. Representative targets, such as buildings, vegetation, and the sky, can be robustly identified at the selected wavelengths. Concurrently, the contrast between sunlit and shadowed regions is well preserved, and the spectral responses across different bands exhibit distinct variations.

To effectively illustrate the enhancements achieved by the HDR fusion process, the 650 nm band was selected as a representative visible wavelength for detailed analysis. Figure 12 compares the low-gain, high-gain, and fused HDR results, presenting both 2D images and the corresponding 3D surface plots of their DN value distributions. Panels (a) and (b) correspond to the low-gain push-broom results, panels (c) and (d) to the high-gain results, and panels (e) and (f) to the HDR output generated by the proposed system. As evident from the 2D images and the 3D DN-distribution surfaces, the high-gain and low-gain images suffer from missing details in highlights and shadows, respectively. In contrast, the HDR push-broom image preserves the highlight detail characteristic of the low-gain channel while retaining clearer texture information from the high-gain channel in dark regions.

Figure 12. 2D spatial-domain push-broom imaging results at 650 nm and 3D surface plots of the DN-value distribution. (a,b) Low-gain push-broom results; (c,d) high-gain push-broom results; (e,f) HDR push-broom results.

Figure 13 provides enlarged views of typical high-dynamic-range urban targets, specifically the sunlit and shadowed regions of buildings and vegetation at the 650 nm and 810 nm wavelengths. For each wavelength, the three columns correspond to the low-gain, high-gain, and fused HDR results, respectively, with the darker and brighter regions clearly marked in the images. Beyond this visual presentation, the histogram distributions of the selected HDR regions are illustrated in Figure 14. For both building and vegetation targets, the pixel responses from the sunlit and shadowed areas span a notably broader DN interval, rather than being compressed into the low-DN noise floor or accumulating near the saturation limit. This quantitative observation aligns with the visual comparisons in Figure 13, further demonstrating that the fused HDR output preserves discernible responses for targets exhibiting a high brightness span within a single frame.

Figure 13. Enlarged views of typical HDR targets.

Figure 14. Histogram distributions of typical HDR targets. (a) Building at 650 nm; (b) building at 810 nm; (c) vegetation at 650 nm; (d) vegetation at 810 nm.

4.2. Spectral Usability Analysis

In hyperspectral imaging systems, data utility fundamentally depends not only on the extension of dynamic range but also on the preservation of spectral fidelity. Figure 15 further plots DN-value versus wavelength curves for representative pixels. In Figure 15a, the responses of sky, vegetation, and buildings are shown. The DN values stay continuous across the fusion transition, without obvious jumps. Figure 15b–d compare the high-gain, low-gain, and fused responses for the corresponding pixels.

Figure 15. DN-value–wavelength curves for representative target pixels. (a) HDR curves for sky, vegetation, and buildings; (b) comparison of the three curves for sky; (c) comparison of the three curves for vegetation; (d) comparison of the three curves for buildings.

For bright targets such as the sky and vegetation, the overall signal level is high, so the high-gain channel is more likely to saturate in some bands and clip local spectral features. After fusion, the HDR curve takes information from the low-gain channel in those saturated bands, which maintains continuity in the spectral variation and reduces spectral-shape distortion caused by saturation. For dark targets such as shaded buildings, the low-gain channel approaches the noise floor, making band-edge spectral differences difficult to resolve. In the HDR result, the effective signal is taken mainly from the high-gain channel, producing a clearer spectral profile and more distinguishable inter-band variation for dark targets. This improves the usefulness of the data for subsequent spectral analysis and target discrimination.

To objectively verify that the proposed fusion method also preserves radiometric linearity, eight sets of HDR samples with varying integration times were acquired under uniform illumination using an integrating sphere. The coefficient of determination, R², was used to quantitatively evaluate the linearity of the fused HDR data. Figure 16a plots the mean DN values of these representative HDR samples, while Figure 16b illustrates their distribution relative to the extended high-gain line. The results indicate that the data maintain a highly linear distribution, with the calculated R² between the samples and the high-gain fitted line reaching 0.997. This shows that the HDR output generated by the proposed method retains good linearity within the effective operating range, thereby supporting the physical validity of the hyperspectral acquisition.
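The R² figure follows the standard least-squares definition. A minimal sketch, using hypothetical stand-in values for the eight integrating-sphere acquisitions (the real mean-DN samples are not reproduced here):

```python
import numpy as np

# Hypothetical mean-DN samples versus integration time (ms), standing in
# for the eight integrating-sphere acquisitions described in the text.
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
dn = np.array([2050.0, 4120.0, 6130.0, 8240.0,
               10210.0, 12330.0, 14290.0, 16400.0])

# Least-squares line through the samples, then the coefficient of
# determination R^2 = 1 - SS_res / SS_tot.
slope, intercept = np.polyfit(t, dn, 1)
pred = slope * t + intercept
r2 = 1.0 - np.sum((dn - pred) ** 2) / np.sum((dn - dn.mean()) ** 2)
```

An R² close to 1 indicates the fused samples fall on a single straight line, i.e., the piecewise takeover does not introduce a radiometric break.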

Figure 16. Linearity test results of the HDR data. (a) Representative test data samples; (b) distribution of the test points and fitting results.

Furthermore, to quantitatively assess the spectral distortion introduced during the fusion process, the spectral angle mapper (SAM) metric is employed. In hyperspectral imaging, SAM measures the shape similarity between a target spectrum and a reference spectrum by computing the angle between them as vectors in a high-dimensional space; a smaller angle indicates greater spectral-shape similarity and is commonly used to judge whether two curves represent the same material. In this evaluation, the unsaturated raw low-gain DN curve serves as the reference, and the fused HDR curve serves as the target. The quantitative results are summarized in Table 4. The calculated average SAM values for the sky, vegetation, buildings, and the entire scene are 0.0021, 0.0056, 0.0045, and 0.0039, respectively. All values remain at the 10⁻³ level, indicating that the piecewise dual-gain fusion introduces limited spectral distortion within the calibrated operating range [24].
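For reference, SAM is simply the angle between the two spectra treated as vectors; a minimal implementation:

```python
import numpy as np

def spectral_angle(target, reference):
    """Spectral angle mapper (SAM) between two spectra, in radians."""
    t = np.asarray(target, dtype=np.float64)
    r = np.asarray(reference, dtype=np.float64)
    cos_angle = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    # Clip guards against tiny floating-point excursions outside [-1, 1].
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
```

Two spectra that differ only by a global scale factor yield an angle of zero, which is why SAM isolates shape distortion from overall brightness changes such as the gain takeover.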

Table 4. SAM results for representative regions.

Scene Category   SAM (rad)
Sky              0.0021
Vegetation       0.0056
Buildings        0.0045
Full scene       0.0039

4.3. Ablation Study

An ablation study was conducted to evaluate six distinct processing schemes: raw low-gain, raw high-gain, low-gain with binning, high-gain with binning, dual-gain fusion without binning, and the proposed comprehensive method incorporating both binning and fusion. To quantitatively assess the digital dynamic range of the final output, an integrating sphere was utilized as a uniform illumination source to determine the maximum unsaturated DN value for each configuration. Concurrently, dark frames were acquired under strictly zero-illumination conditions to calculate the standard deviation of repeated dark-frame DN outputs.

Figure 17 compares the standard deviation of repeated dark-frame DN outputs across the spectral dimension before and after four-pixel binning. The results reveal that the noise levels in both gain channels are significantly attenuated and stabilized following the binning operation. Specifically, in the high-gain branch, the standard deviation decreases from approximately 1.55 to 0.82. Similarly, the standard deviation in the low-gain branch drops from roughly 0.85 to 0.52.
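Assuming the four-pixel binning averages each 2 × 2 block (whether the hardware sums or averages only changes the output scale, not the relative noise reduction), the roughly twofold drop in dark-frame standard deviation follows from averaging four largely independent noise samples (σ/√4). A sketch, using a synthetic dark frame at a plausible pre-binning size of 2560 × 360:

```python
import numpy as np

def bin_2x2(frame):
    """Average each 2x2 block jointly along the spatial and spectral axes
    (a sketch of the four-pixel binning; assumes even frame dimensions)."""
    f = np.asarray(frame, dtype=np.float64)
    return 0.25 * (f[0::2, 0::2] + f[0::2, 1::2]
                   + f[1::2, 0::2] + f[1::2, 1::2])

# Synthetic dark frame with independent Gaussian noise at the high-gain
# level of ~1.55; after 2x2 averaging the standard deviation halves,
# consistent with the observed 1.55 -> 0.82 reduction.
rng = np.random.default_rng(0)
dark = rng.normal(0.0, 1.55, size=(360, 2560))
binned = bin_2x2(dark)
```

The measured reduction factors (1.55 → 0.82 and 0.85 → 0.52) fall slightly short of the ideal factor of two, which is expected when part of the noise is correlated across neighboring pixels.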

Figure 17. Comparison of dark-frame standard deviation across the spectral dimension. (a) High-gain channel; (b) low-gain channel.

The quantitative results of the ablation study are summarized in Table 5. Table 5 demonstrates that while pixel binning alone effectively mitigates noise, it does not alter the inherent 11-bit output ceiling of the system. Under these conditions, the digital dynamic range of the final output increases from 0.21 × 10⁴:1 to 0.35 × 10⁴:1 in the low-gain branch and from 0.10 × 10⁴:1 to 0.22 × 10⁴:1 in the high-gain branch. When dual-gain fusion is applied without pixel binning, the maximum unsaturated output DN value expands to 16,915, substantially elevating the digital dynamic range of the final output to 1.10 × 10⁴:1. This indicates that dual-gain fusion primarily serves to extend the bright-end limit. When pixel binning and dual-gain fusion are combined, the system not only preserves the expanded upper range limit but also further reduces the dark-frame standard deviation to 0.82, thereby achieving a digital dynamic range of the final output of 2.03 × 10⁴:1. These results show that the binning and fusion operations play highly complementary roles within the proposed method.
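The 15-bit output width is consistent with these numbers: scaling the 11-bit low-gain ceiling by the gain ratio implied by the tabulated values lands just under 2¹⁵. A quick check (the ratio here is inferred from the table, not the paper's calibrated coefficient):

```python
import math

# The 15-bit output width follows from scaling the 11-bit low-gain
# ceiling by the dual-gain ratio implied by the tabulated values.
dn_low_max = 2047      # 11-bit unsaturated ceiling
dn_fused_max = 16915   # maximum unsaturated fused DN (Table 5)

ratio = dn_fused_max / dn_low_max               # ~8.26 implied gain ratio
bits_needed = math.ceil(math.log2(dn_fused_max + 1))  # smallest bit width
```

Since 2¹⁴ = 16,384 < 16,915 < 32,768 = 2¹⁵, a 14-bit word would already clip the fused bright end, so 15 bits is the minimal sufficient output width.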

Table 5. Results of the ablation study.

LG  HG  Binning  Fusion  Output Bit Width (bits)  Dark Noise (A.U.)  Max Unsaturated DN (A.U.)  Digital Dynamic Range (10⁴:1)
✓   –   –        –       11                       0.85               2047                       0.21
–   ✓   –        –       11                       1.55               2047                       0.10
✓   –   ✓        –       11                       0.52               2047                       0.35
–   ✓   ✓        –       11                       0.82               2047                       0.22
✓   ✓   –        ✓       15                       1.55               16,915                     1.10
✓   ✓   ✓        ✓       15                       0.82               16,915                     2.03

4.4. Comparison and Discussion

Table 6 summarizes the key performance metrics of the proposed system alongside other representative real-time HDR CMOS acquisition systems. To provide a comprehensive comparison, the table explicitly details the output bit depth, output resolution, maximum frame rate, pixel throughput, and algorithmic latency for each system. Furthermore, to objectively evaluate the actual efficacy of the algorithms employed by the compared systems and their applicability to hyperspectral imaging, the global spectral angle mapper (SAM) of the HDR data generated by each method was evaluated under the same experimental/simulation framework. This calculation utilized the raw high-gain and low-gain data acquired from the aforementioned push-broom experiments. Correspondingly, the improvements in the digital dynamic range of the final output obtained by applying these methods to the proposed system were also estimated.

Table 6. Comparison of different systems.

Method      Output Bit Width (bit)  Resolution (pixels)  Max Frame Rate (frames/s)  Pixel Throughput (Mpixel/s)  Algorithm Latency (ms)  Global SAM (rad)  Digital Dynamic Range (10⁴:1)
This paper  15                      1280 × 180           290                        66.82                        0.0183                  0.0039            2.03
[17]        17                      4096 × 4096          24                         402.65                       9.6                     0.0128            1.07
[18]        14                      2560 × 2160          50                         276.48                       35                      0.3014            1.26
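The pixel-throughput column follows directly from effective resolution multiplied by frame rate; for the proposed system:

```python
# Pixel throughput = effective resolution x maximum frame rate.
width, height, fps = 1280, 180, 290
throughput_mpix_s = width * height * fps / 1e6
# 1280 * 180 * 290 = 66,816,000 pixels/s, i.e. 66.82 Mpixel/s as tabulated
```

The same arithmetic reproduces the other entries, e.g. 4096 × 4096 × 24 ≈ 402.65 Mpixel/s for [17], confirming that throughput and frame rate trade off against resolution across the compared systems.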

The results indicate that the proposed system achieves the lowest algorithmic latency among the compared systems. Additionally, the adopted four-pixel HDR fusion method provides the largest improvement in the digital dynamic range of the final output and the lowest global SAM value for the same scene. These results support its applicability to airborne push-broom VNIR hyperspectral cameras.

5. Conclusions

Airborne push-broom VNIR hyperspectral cameras face a persistent difficulty under complex illumination: they must capture highlight and shadow details within a single exposure while also maintaining a high frame rate. To address this problem, we designed a real-time HDR acquisition system based on a dual-gain CMOS sensor and developed a four-pixel HDR fusion method. The method combines spectral–spatial four-pixel binning with optically calibrated pixel-level dual-gain piecewise fusion and is implemented on an FPGA with low latency. Ground-based push-broom experiments were performed to verify the system. In a typical urban scene, the HDR results retain low-gain information in bright regions while preserving high-gain texture in dark regions, thereby reducing both highlight saturation and shadow detail loss caused by noise. The fused HDR hyperspectral data cube also maintains stable spatial–spectral relationships, with no noticeable band-to-band discontinuities or abnormal artifacts, showing that spectral fidelity is preserved while dynamic range is expanded. The spectral responses of representative target pixels remain continuous at the fusion switching points, without evident abrupt changes. Spectral analyses further showed SAM values on the order of 10⁻³ rad, indicating limited spectral distortion within the calibrated operating range and confirming that the piecewise takeover strategy improves dynamic range while preserving spectral-shape integrity.

The experiments show that the proposed system can continuously acquire and output data at 290 frames per second, with an effective single-frame resolution of 1280 × 180 pixels. The algorithmic latency is only 0.0183 ms. The effective output bit width increases from 11 bits to 15 bits, and the digital dynamic range of the final output reaches about 2.03 × 10⁴:1, which is about 9.58 times that of the original low-gain configuration. Compared with previously reported real-time systems of similar architecture, the proposed system provides clear advantages in both frame rate and processing latency and yields a larger improvement in the digital dynamic range of the final output. These characteristics make it well suited to the engineering requirements of high-frame-rate, low-latency push-broom imaging.

However, while four-pixel binning reduces downstream data throughput, suppresses dark-end noise, and supports high-frame-rate real-time processing, it inevitably decreases the spatial and spectral sampling density of the output data. In the present system, the 2 × 2 binning performed jointly along the spatial and spectral dimensions reduces the number of samples in both dimensions to 50% of the original data. For the representative airborne VNIR tasks targeted in this work, such as water-depth retrieval and urban hyperspectral classification under high-contrast conditions, the main requirement is the simultaneous preservation of bright- and dark-end information within the same scan, i.e., avoiding highlight saturation while improving the detectability of weak signals in dark regions. For such scene-level applications, this capability is generally more important than maximizing the detectability of very small targets or extremely fine textures. Therefore, in the present application context, the sampling-density loss introduced by four-pixel binning represents a reasonable engineering trade-off for achieving lower dark-end noise, a wider usable dynamic range, and more stable real-time output. For future tasks that place greater emphasis on fine spatial structures, subtle narrow-band spectral features, or small-target detection, more flexible strategies could be adopted, such as configurable binning factors, a fusion-only mode without binning, or scene-adaptive switching between operating modes.

Furthermore, it should be noted that real-world airborne imaging environments involve several interfering factors not fully covered in the current experiments, such as temperature variations [25], platform vibrations, attitude disturbances [26], and fluctuations in external radiation conditions. Among these, temperature changes and long-term operation may cause drift in the dual-gain mapping coefficients and switching thresholds, while platform vibrations and attitude disturbances could further compromise data stability and consistency during the push-broom imaging process. Therefore, although the ground-based rotating-platform experiments in this study successfully reproduced the key geometric relationships and real-time processing constraints of push-broom imaging, the long-term robustness of the system under true airborne conditions requires further validation. Future work will focus on temperature-compensated calibration, adaptive parameter updating, and online calibration mechanisms under more realistic flight conditions. By integrating rigorous radiometric calibration with more extensive airborne field data, we aim to conduct a more systematic quantitative evaluation of the system’s dynamic range expansion capability, spectral fidelity, and engineering applicability, thereby providing a more comprehensive foundation for high-dynamic-range acquisition in real-world push-broom VNIR hyperspectral missions.

Author Contributions

Conceptualization, H.F. and Y.W.; methodology, H.F.; software, H.F. and C.Z.; validation, H.F., Y.W. and C.L.; formal analysis, D.H.; investigation, H.F.; resources, Y.W.; data curation, H.F.; writing—original draft preparation, H.F.; writing—review and editing, H.F., Y.W., D.H., C.Z. and C.L.; visualization, C.Z.; supervision, D.H.; project administration, Y.W. and D.H.; funding acquisition, Y.W. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding Statement

This research was funded by the National Key Research and Development Program, grant number [2023YFC3107602].

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1.Audebert N., Saux B.L., Lefevre S. Deep learning for classification of hyperspectral data: A comparative review. IEEE Geosci. Remote Sens. Mag. 2019;7:159–173. doi: 10.1109/MGRS.2019.2912563. [DOI] [Google Scholar]
  • 2.Feng Y.Z., Sun D.W. Application of hyperspectral imaging in food safety inspection and control: A review. Crit. Rev. Food Sci. Nutr. 2012;52:1039–1058. doi: 10.1080/10408398.2011.651542. [DOI] [PubMed] [Google Scholar]
  • 3.Bai X., Wang J., Chen R., Kang Y., Ding Y., Lv Z., Ding D., Feng H. Research progress of inland river water quality monitoring technology based on unmanned aerial vehicle hyperspectral imaging technology. Environ. Res. 2024;257:119254. doi: 10.1016/j.envres.2024.119254. [DOI] [PubMed] [Google Scholar]
  • 4.Khan A., Vibhute A., Mali S., Patil C. A systematic review on hyperspectral imaging technology with a machine and deep learning methodology for agricultural applications. Ecol. Inform. 2022;69:101678. doi: 10.1016/j.ecoinf.2022.101678. [DOI] [Google Scholar]
  • 5.Fang Q., Tian H. A review of hyperspectral remote sensing in vegetation monitoring. Remote Sens. Technol. Appl. 1998;13:62–69. doi: 10.11873/j.issn.1004-0323.1998.1.62. [DOI] [Google Scholar]
  • 6.Yako M., Yamaoka Y., Kiyohara T., Hosokawa C., Noda A., Tack K., Spooren N., Hirasawa T., Ishikawa A. Video-rate hyperspectral camera based on a CMOS-compatible random array of Fabry–Pérot filters. Nat. Photonics. 2023;17:218–223. doi: 10.1038/s41566-022-01141-5. [DOI] [Google Scholar]
  • 7.Dierssen H.M. Hyperspectral measurements, parameterizations, and atmospheric correction of whitecaps and foam from visible to shortwave infrared for ocean color remote sensing. Front. Earth Sci. 2019;7:14. doi: 10.3389/feart.2019.00014. [DOI] [Google Scholar]
  • 8.Kishino M., Furuya K. A new simplified method for the measurement of water-leaving radiance. La Mer. 2015;53:29–38. doi: 10.32211/lamer.53.1-2_29. [DOI] [Google Scholar]
  • 9.Dare P.M. Shadow analysis in high-resolution satellite imagery of urban areas. Photogramm. Eng. Remote Sens. 2005;71:169–177. doi: 10.14358/PERS.71.2.169. [DOI] [Google Scholar]
  • 10.Xiao F., DiCarlo J.M., Catrysse P.B., Wandell B.A. Proceedings of the Color and Imaging Conference. Volume 10. Society of Imaging Science and Technology; Springfield, VA, USA: 2002. High dynamic range imaging of natural scenes; pp. 337–342. [Google Scholar]
  • 11.Bai B.D., Liu J., Fan J.L. Recent research in high dynamic range imaging. J. Xi’an Univ. Posts Telecommun. 2016;21:1–14. doi: 10.13682/j.issn.2095-6533.2016.03.001. [DOI] [Google Scholar]
  • 12.Nie T., Huang L., Liu H., Li X., Zhao Y., Yuan H., Song X., He B. Multi-Exposure Fusion of Gray Images Under Low Illumination Based on Low-Rank Decomposition. Remote Sens. 2021;13:204. doi: 10.3390/rs13020204. [DOI] [Google Scholar]
  • 13.Tan X., Chen H., Zhang R., Wang Q., Kan Y., Zheng J., Jin Y., Chen E. Deep multi-exposure image fusion for dynamic scenes. IEEE Trans. Image Process. 2023;32:5310–5325. doi: 10.1109/TIP.2023.3315123. [DOI] [PubMed] [Google Scholar]
  • 14.He S.W., Wang Y.J., Sun H.H., Zhang L., Wu P. High dynamic range imaging based on DMD. Acta Photonica Sin. 2015;44:0811001. doi: 10.3788/gzxb20154408.0811001. [DOI] [Google Scholar]
  • 15.Qiao Z.C., Yi H.W., Wei D.S., Li X.Y., Hai Y. High dynamic range video reconstruction using adaptive beam-splitting multi-sensor imaging system. Acta Opt. Sin. 2025;45:2110004. doi: 10.3788/aos251002. [DOI] [Google Scholar]
  • 16.He S.W., Wang Y.J., Sun H.H., Zhang L., Wu P. Design of high dynamic scientific CMOS camera system. Chin. J. Liq. Cryst. Disp. 2015;30:729–735. doi: 10.3788/YJYXS20153004.0729. [DOI] [Google Scholar]
  • 17.He L., Chen G., Guo H., Jin W.Q. A dynamic scene HDR fusion method based on dual-channel low-light-level CMOS camera. Infrared Technol. 2020;42:340–347. doi: 10.3724/SP.J.7101502298. [DOI] [Google Scholar]
  • 18.You X., Tang Y., Zhang L., Wang D. Design of high dynamic large-area scientific CMOS real-time system. Acta Opt. Sin. 2025;45:0911001. doi: 10.3788/aos250479. [DOI] [Google Scholar]
  • 19.Ferreti G.d.S., Paixão T., Alvarez A.B. Generative adversarial network-based lightweight high-dynamic-range image reconstruction model. Appl. Sci. 2025;15:4801. doi: 10.3390/app15094801. [DOI] [Google Scholar]
  • 20.Wang Z., Zhao Y., Zhan A., Gao J., Chen L., Hao C., Zhang H., Qiao S. An ultra-low-power CMOS image sensor with a new pixel structure in PWM mode featuring a programmable ramp generator for calibration. Microelectron. J. 2025;162:106726. doi: 10.1016/j.mejo.2025.106726. [DOI] [Google Scholar]
  • 21.Zhang W.G., Xie Z.H., Che C., Hu X.D., Fu Z.Q., Zeng X.Y. Design of star-sensor photoelectric imaging module. Semicond. Optoelectron. 2025;46:360–366. [Google Scholar]
  • 22.Chen J., Chen N., Wang Z., Dou R., Liu J., Wu N., Liu L., Feng P., Wang G. A review of recent advances in high-dynamic-range CMOS image sensors. Chips. 2025;4:8. doi: 10.3390/chips4010008. [DOI] [Google Scholar]
  • 23.Yuan Y., Zang Q., Lin Z., Liu J., Fan Z. Hierarchical pixel-wavelength fusion network for progressive hyperspectral image reconstruction. IEEE Trans. Geosci. Remote Sens. 2025;63:1–16. doi: 10.1109/TGRS.2025.3567469. [DOI] [Google Scholar]
  • 24.Xue Q., Gao X., Lu F., Ma J., Song J., Xu J. Development and application of unmanned aerial high-resolution convex grating dispersion hyperspectral imager. Sensors. 2024;24:5812. doi: 10.3390/s24175812. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Zhang Y.Z., Liu R., Yu Y.W., Zhao D.J., Chen W.L., Li C.X. Research on discriminant method of temperature perturbation in blood glucose sensing by near-infrared spectroscopy. J. Infrared Millim. Waves. 2025;44:762–771. doi: 10.11972/j.issn.1001-9014.2025.05.014. [DOI] [Google Scholar]
  • 26.Jin J.R., Han G.C., Wang C.R., Wu R.F., Wang Y.M. Research on variable-speed scanning method for airborne area-array whisk-broom imaging system based on vertical flight path correction. J. Infrared Millim. Waves. 2025;44:511–519. doi: 10.11972/j.issn.1001-9014.2025.04.005. [DOI] [Google Scholar]


