Abstract
Forest structural complexity is a key element of ecosystem functioning, impacting light environments, nutrient cycling, biodiversity, and habitat quality. Addressing the need for a comprehensive global assessment of actual forest structural complexity, we derive a near-global map of 3D canopy complexity using data from the GEDI spaceborne lidar mission. These data show that tropical forests harbor most of the high complexity observations, while less than 20% of temperate forests reached median levels of tropical complexity. Structural complexity in tropical forests is more strongly related to canopy attributes from lower and middle waveform layers, whereas in temperate forests upper and middle layers are more influential. Globally, forests exhibit robust scaling relationships between complexity and canopy height, but these vary geographically and by biome. Our results offer insights into the spatial distribution of forest structural complexity and emphasize the importance of considering biome-specific and fine-scale variations for ecological research and management applications. The GEDI Waveform Structural Complexity Index data product, derived from our analyses, provides researchers and conservationists with a single, easily interpretable metric by combining various aspects of canopy structure.
Subject terms: Biogeography, Forest ecology, Scientific data
Forest structural complexity plays a crucial role in ecosystem functioning, influencing factors like light, nutrient cycling, and biodiversity. Here, a global map of 3D canopy complexity modeled from spaceborne lidar reveals that highly complex forests are concentrated in the tropics, with significant variations observed across biomes.
Introduction
One of the most important variables emerging in ecological research of forested ecosystems is structural complexity. While definitions of structural complexity vary, they largely converge around its characterization as a function of the heterogeneity of canopy structure in 3D space. This heterogeneity is determined by tree diameter, height, biomass, leaf angle, deadwood, and vertical layering, along with their relative abundance and spatial arrangement, among others1–5. Recent work has explored the development of theoretical frameworks that address both the ecological role of complexity in ecosystem function6 and composition7,8, as well as the environmental and anthropogenic drivers that shape complexity9. These studies have been aided by advances in remote sensing methods, particularly lidar, that can detail the 3D structure of forest canopy elements in ways that were hitherto unobtainable.
Ecosystem productivity, water use strategies, and carbon use efficiency are largely explained by traits of vegetation structure directly linked to structural complexity, including aboveground biomass, tree height, and leaf area index6. Forest structural complexity may also be an effective surrogate to measure and monitor biodiversity and restoration effectiveness, as forest structure regulates microclimate conditions and provides niche space for organisms, thereby enabling functional redundancies that increase resistance and resilience to both natural and anthropogenic disturbances10.
Structural complexity is underpinned by detailed measures of multiple forest structural traits. Some of these may be readily quantified at the field plot level, such as stem diameters and tree heights, but are difficult to map across landscapes given the limited number and distribution of forest inventory plots globally. Other important elements of structure, such as vertical stratification and layering, tree architecture, and the horizontal and vertical distribution of canopy gaps, are more difficult to measure in the field, and the more detailed the field inventory, the smaller the extent it can cover4. Consequently, characterization of complex forest structural traits over large areas often must rely on remote sensing data. While some traits, such as height, may be measured directly, others, such as biomass, must be inferred from relationships between field measurements and the remotely sensed observations. Passive optical and radar remote sensing have expanded the reach of forest structure assessments, but they have been shown to be insufficiently sensitive to important variations in 3D canopy structure10,11. In contrast, lidar remote sensing has emerged as the most effective means for forest structure assessments: owing to its direct 3D mapping capabilities and canopy penetration, it can observe not only the horizontal structural variability but also the vertical variability of the canopy12,13.
There are several different types of lidar used for forest canopy characterization. Terrestrial Laser Scanning (TLS) provides remarkable 3D data at very high spatial resolutions, allowing digital reconstruction of structural elements and gaps at fine scales (on the order of centimeters)14,15, but is time-consuming, restricted to small sites, and suffers from signal saturation in tall canopies. Airborne Laser Scanning (ALS) and drone-based lidar provide structural measurements of forests and are used to map forest heights, canopy cover, plant area index (PAI), foliage height diversity (FHD), and other variables over larger sites13. However, the limited spatial extent, lack of global availability, and fragmentation among ALS and TLS datasets, especially in the tropics, hinder the characterization of complexity and its variability within and among biomes. Thus, the challenge of mapping structural complexity consistently at global scales remains.
Progress in advancing our knowledge of the importance of structural complexity on the functioning and resilience of ecosystems is underpinned by the ability to measure complexity far more widely than currently exists. This is because we do not yet fully understand how complexity develops, how horizontal and vertical complexity are linked, nor how the myriad of environmental and biological factors interact to control its spatial distribution and evolution. Ehbrecht et al.16 explored some of these factors by modeling a stand structural complexity index (SSCIpot) that related climatic and edaphic variables and TLS point clouds at 1 km resolution. The resulting map depicts potential complexity, that is what the complexity could be in a specific location, and not the actual structural complexity that currently exists there. The impacts of disturbance and subsequent recovery could not be assessed, even though these are important determinants of the actual complexity16,17. Furthermore, their models relied on 294 TLS point clouds distributed across 20 sites in different continents, producing a global map extrapolated from a small pool of training samples. Although their models underwent cross-validation, the limited number of samples restricted the extent of validation, which could contribute to the overgeneralization of complexity as a function of climate variables.
One means of obtaining the required data on forest structure as it currently manifests on the land surface is using spaceborne lidar. The Global Ecosystem Dynamics Investigation (GEDI) mission was optimized to monitor ecosystem structure of the Earth’s tropical and temperate forests, collecting data along the orbital track of the International Space Station (ISS), between ±51.6° latitude18. GEDI waveforms record the vertical distribution of leaves and branches from the top of the canopy to the ground over 25 m diameter footprints. In operation from April 2019 to March 2023, GEDI has acquired billions of observations over the land surface. The GEDI observable is a return waveform that provides a direct measure of vertical variability through relative height (RH) metrics19. GEDI waveforms are also used to derive a variety of other structural properties, including canopy height, canopy cover, plant area index, aboveground biomass, and others18 within each footprint. Given these properties, GEDI data are well suited for mapping actual structural complexity on a global scale.
Lidar remote sensing of canopies can produce vast amounts of data leading to many different descriptors of canopy structure. This has led to the creation of indices of complexity that provide compact yet meaningful summaries of structural variability that may be readily mapped and interpreted by ecologists2,20–23. Additionally, the use of a single index facilitates the incorporation of structural complexity into ecological modeling and applications4. While GEDI produces less data over its 25 m footprint than a TLS or ALS survey, its waveforms are nonetheless described by 100 RH metrics at each footprint. Hence, our motivation is to provide an index of complexity from GEDI waveforms. Furthermore, GEDI only records vertical variations in canopy structure, as given by variations in the amplitude of its returned waveform at any given height. There is thus considerable interest to assess the degree to which 3D complexity may be inferred from vertical variations in the waveform alone, under the hypothesis that vertical and horizontal complexity must be related.
Here we develop a footprint-scale Waveform Structural Complexity Index (WSCI). This index is established by modeling the empirical relationships between GEDI RH metrics and an existing metric of 3D structural complexity (CEXYZ)20 derived from ALS point clouds. CEXYZ is an entropy-based measure that captures both horizontal and vertical complexity. Horizontal complexity refers to the spatial distribution of canopy structures within a footprint, while vertical complexity describes the distribution of vegetation layers from the ground to the canopy top. Together, these components provide a comprehensive characterization of the 3D structural complexity of the forest canopy. We create global models using over 800,000 measured values of CEXYZ from ALS and collocated GEDI data to predict WSCI for four different plant functional types (PFTs) at the scale of GEDI footprints. We then apply these models to approximately 2 billion GEDI shots and use the resulting WSCI values to create a data set of existing structural complexity for the Earth’s temperate and tropical forests. We analyze which RH metrics are most important for modeled WSCI, identifying whether these originate from the lower, middle, or upper layers of the waveform, towards understanding how different elements of the canopy impact complexity and how these vary by biome24. Our analysis reveals that highly complex canopies are concentrated in tropical broadleaf forests, and this complexity appears to be more strongly driven by variability in metrics linked to the lower and middle waveform layers than in other forest types. Additionally, we find strong evidence of scaling relationships between structural complexity and canopy height globally, but these vary geographically and by biome. Tropical forests tend towards having smaller scaling exponents, but their complexity in early seral growth stages surpasses that of other biomes; e.g., short tropical forests show higher complexity relative to other forests of similar heights. These results emphasize the value of widespread and consistent estimates of forest structural complexity and their variability while opening pathways for investigations on how forest structural complexity develops as a result of gradients in environmental drivers, seral stage, and disturbance conditions. Our results confirm the complexity of tropical forests, and thus their conservation significance, but also highlight the importance of considering the protection of what GEDI data reveal to be rare yet highly complex forests in temperate biomes.
Results
Model performance and feature importance
We created regression models to predict WSCI using extreme gradient boosted trees (XGBoost)25 for four plant functional type (PFT) models: Evergreen Broadleaf Trees (EBT), Deciduous Broadleaf Trees (DBT), Evergreen Needleleaf Trees (ENT), and Grasslands, Shrublands and Woodlands (GSW). The models were trained using GEDI RH percentiles as independent variables to predict the 3D structural complexity index (CEXYZ)20 derived from ALS point clouds collocated with GEDI footprints. This training approach employed a thorough spatial cross-validation method to strive for broad geographical applicability. Our models explained 68% of the variability of complexity in the training data. As part of our analyses, we also created separate models to predict horizontal and vertical complexity at the PFT level. This approach helped us determine if horizontal canopy structure, which GEDI waveforms do not measure, can be inferred based on its relationship with vertical structure. We found that the vertical complexity was well-predicted (R2 = 0.75, RMSE = 8.8%), with lower performance for horizontal complexity (R2 = 0.40, RMSE = 9.5%) (Fig. 1). GEDI does not measure variability of horizontal structures within individual footprints, but instead integrates the return signal from all leaves and branches at any particular height into one value, which is represented by the cumulative RH metric at that height. Our results suggest that horizontal and vertical canopy structures are linked at the footprint scale, confirmed by the strong relationship between horizontal and vertical complexity observed in our ALS samples (Supplementary Fig. 1), which enables some variation in horizontal complexity to be inferred from vertical structure co-variates. Note that maximum horizontal complexity is limited by footprint size (25 m), hence the sharp cut-off in values at 6.3 shown in Fig. 1. 
In contrast, vertical complexity is limited by maximum tree height, which varies among footprints, and so does not show such behavior.
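The spatial cross-validation mentioned above can be sketched as follows. The text does not describe the exact scheme, so this minimal example assumes a block design: samples are assigned to grid cells of a hypothetical size `block_deg`, and whole cells are held out together so that neighboring footprints never straddle the train/test split. The function names and parameters are illustrative, not the authors' implementation.

```python
from collections import defaultdict
import math


def spatial_block_folds(coords, n_folds=5, block_deg=1.0):
    """Split sample indices into folds by spatial block (assumed design).

    coords: list of (lon, lat) tuples, one per sample.
    Returns a list of n_folds lists of sample indices. All samples that fall
    in the same block_deg x block_deg cell land in the same fold.
    """
    blocks = defaultdict(list)
    for i, (lon, lat) in enumerate(coords):
        key = (math.floor(lon / block_deg), math.floor(lat / block_deg))
        blocks[key].append(i)

    folds = [[] for _ in range(n_folds)]
    # Assign the largest blocks first, each to the currently smallest fold,
    # so fold sizes stay roughly balanced.
    for key in sorted(blocks, key=lambda k: (-len(blocks[k]), k)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(blocks[key])
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices) pairs, one per held-out fold."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

Holding out whole blocks, rather than random footprints, gives a more conservative estimate of how a model transfers to regions it has not seen, which is the stated goal of broad geographical applicability.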
The feature importance of individual model predictions was examined using SHAP values (Shapley Additive Explanations)26 and showed that vertical complexity is largely predicted by metrics related to upper strata features, with 46% of the feature importance accounted for by RH percentiles in the top 10% of the canopy (that is above 90% of total canopy height) (Fig. 1e). In contrast, horizontal complexity was estimated by a larger pool of canopy features at intermediate height layers, with only 12% of the feature importance accumulated in the top 10% RH percentiles (Fig. 1d). The WSCI models (which integrate both vertical and horizontal structure) showed intermediate patterns of feature importance between the horizontal and vertical complexity models, mixing features from both canopy top and intermediate height layers among the strongest predictors, with the top 10% RH percentiles concentrating 35% of the feature importance (Fig. 1f). These results further imply a functional dependency between vertical and horizontal structural attributes in the forest’s overall complexity development. As canopy height increases, vertical layering also develops, such as through the recruitment of new individuals into the understory, diversifying the tree composition, increasing tree density3,27, and consequently affecting the forest’s horizontal and vertical complexity.
Model performance was consistent among the forested PFTs, explaining 65% (RMSE = 6%), 61% (RMSE = 8%), and 58% (RMSE = 8%) of CEXYZ variability for EBT, DBT, and ENT, respectively (Fig. 2). The model fitted for GSW showed lower performance, capturing 36% (RMSE = 13%) of variability. Exploring the model residuals geographically, we observed low errors and little bias in our training samples over regions where forested PFTs are prevalent (Supplementary Fig. 2). Higher errors were consistently observed in samples from regions of sparse tree cover where GSW is prevalent, with overestimation noted in the midwestern United States and northern Spain, and underestimation in central Australia and South Africa (Supplementary Fig. 2). The models for ENT and GSW predicted WSCI mostly from features near the canopy top, with 68% and 48% of the importance concentrated in the top 10% RH percentiles, respectively. The broadleaf PFT models, conversely, place higher importance on features farther down in the waveform for predicting structural complexity, with only 20% of the feature importance coming from the top 10% RH percentiles.
Global patterns of structural complexity and model uncertainty
We applied the WSCI models to the 2 billion high-quality GEDI shots and averaged these to 1 km resolution. Doing so, we observe a latitudinal pattern of high structural complexity over the tropics, decreasing towards higher latitudes (Fig. 3a). Hotspots of high complexity are found in tropical forests, such as the Amazon, Gabon, and the region between Borneo and Papua. The median structural complexity in tropical forests is higher than 82% of the estimates for temperate forests, and 96% of those in other forest types (see Supplementary Fig. 3 for the definition of these forest types). Spatial gradients of complexity decrease from dense tropical forests towards sparse tree cover areas between the Amazon and the Brazilian Cerrado, and between the Congo Basin’s rainforests and African savannahs. At smaller extents, hotspots of structural complexity emerge in other tropical, subtropical, and temperate forest biomes, such as the Atlantic coast rainforest in Brazil; the Himalayan forests between Bhutan and northern Myanmar; the Sierra Nevada in the northwestern United States; the region between Colombia and Costa Rica in Central America; and the southeastern Australian coast. Model uncertainty was quantified using conformal predictors to calculate prediction intervals at a 95% confidence level for all WSCI estimates (detailed in Supplementary Table 1). Prediction intervals were generally inversely related to WSCI estimates (Fig. 3b), reflecting the heteroscedasticity observed in the WSCI model residuals, where the variance of the errors increases as observed complexity decreases. The larger uncertainty of the GSW model relative to the models trained on forest PFTs is also evident (Fig. 2).
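To illustrate how a conformal predictor turns held-out residuals into a prediction interval, the sketch below implements the simplest variant, split conformal prediction with absolute residuals as the nonconformity score. This is an assumption made for illustration: the text does not state which variant was used, and because the reported interval widths vary with WSCI, the actual procedure was presumably locally adaptive rather than constant-width as here.

```python
import math


def conformal_halfwidth(calib_true, calib_pred, alpha=0.05):
    """Half-width of a split-conformal prediction interval.

    Uses absolute residuals on a held-out calibration set as the
    nonconformity score and returns their finite-sample-corrected
    (1 - alpha) quantile.
    """
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee coverage at this alpha.
        return math.inf
    return scores[rank - 1]


def prediction_interval(pred, halfwidth):
    """Symmetric interval around a point prediction."""
    return pred - halfwidth, pred + halfwidth
```

With `alpha=0.05` the interval is designed to cover the true value for about 95% of new samples, provided the calibration and new samples are exchangeable.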
The prediction of structural complexity based on vertical canopy strata also revealed distinct geographical patterns. We analyzed SHAP values derived from all WSCI estimates, dividing the accumulated feature importance into three layers for global visualization: lower (RH ≤ 33), middle (33 < RH ≤ 66), and upper (RH > 66) returns. Averaging the relative feature importance of these layers at 10 km scale (Fig. 4) reinforces that global structural complexity is largely determined by returns from the upper layer. However, in tropical forests, which are dominated by evergreen broadleaf trees, complexity relies more heavily on a combination of features from lower and middle layers. In contrast to these forests, complexity in temperate forests depends more strongly on metrics from the upper layer, but with the increased influence of middle layers in regions dominated by broadleaf trees. Regions with sparse tree cover, where GSW is prevalent, exhibit structural complexity arising from a blend of features from throughout the vertical profile. Examining feature importance profiles as a function of plant functional types sampled from forests in different geographical areas, for example, comparing EBT in the Amazon vs. Southeast Asia, largely illustrates the consistency of these findings, while also allowing for some variation in feature importance by region (Supplementary Fig. 4).
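The three-layer aggregation described above can be sketched as a simple binning of per-percentile importance. The example assumes importance is summarized as the mean absolute SHAP value per RH percentile, which is a common convention but is not confirmed by the text; the function and variable names are hypothetical.

```python
def layer_importance(mean_abs_shap):
    """Aggregate per-percentile importance into lower/middle/upper shares.

    mean_abs_shap: dict mapping an RH percentile (1..100) to its mean
    absolute SHAP value (assumed summary). Returns shares that sum to 1.
    """
    totals = {"lower": 0.0, "middle": 0.0, "upper": 0.0}
    for rh, value in mean_abs_shap.items():
        if rh <= 33:
            totals["lower"] += value
        elif rh <= 66:
            totals["middle"] += value
        else:
            totals["upper"] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("all importances are zero")
    return {layer: v / grand_total for layer, v in totals.items()}
```

For instance, `layer_importance({10: 1.0, 33: 1.0, 34: 2.0, 66: 2.0, 67: 4.0})` assigns 20% of the importance to the lower layer and 40% each to the middle and upper layers.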
Characterizing complexity relative to other structural elements
We expect the WSCI to be strongly associated with other measures of structure derived from waveforms, especially height and FHD; the former reflecting taller canopies’ potential to encompass more 3D space in which elements such as canopy layering may develop, and the latter because it is an entropy-based measure (but in one dimension only). It is of interest to understand deviations from these relationships, e.g. having short canopies and high complexity and vice versa. Furthermore, other measures of the canopy structure, in particular, cover and plant area index (PAI) also may be indicative of complexity in canopies. GEDI derives all of these from the same RH metrics, and a single index of complexity, in our case the WSCI, likely incorporates variation from all of these.
We performed a Principal Components Analysis (PCA) to understand the relationships among WSCI and these other metrics using a 1% random sample (n = 19,041,737) of GEDI footprints (Fig. 5a). The first component (PC1) explains the majority of the variance in the data set (89%) and is aligned with increasing forest density, measured by cover and PAI, noting that these are highly related (PAI is a monotonic transformation of cover, for example). The two clusters along this axis correspond to short, open woodlands at one end (low negative PC1 scores), and denser forests at the other (high positive PC1 scores) (Supplementary Fig. 5a). Geographically, this PC captures broad-scale differences among regions, such as between the Amazon basin and the Brazilian Cerrado to the south, but does not capture variability within these regions, as their PC1 scores are relatively uniform. In contrast, PC2 is loaded heavily on measures explicitly linked to vertical stratification and canopy height (FHD, RH98, and WSCI) and captures variations in structure within these broad regions (Supplementary Fig. 5b). As expected, WSCI is aligned closely with FHD and RH98 in this PC space. WSCI is further strongly (R2 = 0.71) and linearly related to FHD (Fig. 5b), and non-linearly related to RH98 (Fig. 5e). However, WSCI also incorporates some element of horizontal complexity (recalling that our models explained about 40% of the variability in horizontal complexity from ALS), comprehensively integrating structural information from GEDI waveforms that cannot be captured by any other single metric, and explaining between 60% and 80% of the variability of each variable assessed in the PCA (Supplementary Fig. 6). Note that for any given height range, there may be a large range of WSCI values (Fig. 5d) and this range compresses as canopies get taller.
The question arises as to whether WSCI, as an index, captures any variation in complexity that is not already captured with existing standard GEDI metrics (RH98, AGBD, FHD, cover and PAI). The advantage of using a single index is well established as it facilitates the incorporation of structural complexity into ecological modeling and applications4. In addition, because WSCI uses the entirety of the waveform and infers horizontal variability, it is of interest to establish if it derives aspects of structure not captured by these other metrics (Supplementary Fig. 6). To assess this, we used the first three principal components (which accounted for 98.8% of the total variance in the dataset) to perform a principal components regression (PCR) to predict WSCI. The PCR explained 96% of WSCI’s variance, with the unexplained 4% likely due to nonlinear interactions among the variables not accounted for in principal components space. We then removed WSCI from our principal components analysis and found that the resulting first three principal components explained more than 99.5% of the variation in the data set, with a subsequent PCR of these three components explaining 85% of the variance in WSCI. Thus, based on the difference between these two PCR analyses (96% versus 85%), the WSCI captures about 11% of variation that is not included in these other metrics. That said, it is possible that new models trained with these metrics and that account for nonlinear relationships could predict CEXYZ nearly as well. However, some of these metrics, such as AGBD28 and PAI29, rely on empirical calibrations and/or assumptions about canopy and ground reflectivity in their derivations, respectively. Using RH metrics directly avoids such issues and intrinsically provides a means to assess how elements of canopy structure, as captured by the direct, cumulative energy returns at various heights, contribute to complexity.
Scaling of complexity with height
Recent work has explored power law scaling relationships between various measures of structural complexity derived from ALS to canopy height across the United States4. While our results show that for a particular canopy height range, the range of WSCI values may be large, the trend towards higher complexity with height is evident (Fig. 5e). GEDI data enable us to derive scaling relationships globally and to assess how these may vary geographically and by PFT. We used linear regression models developed within 10 km pixels distributed over the GEDI domain to measure the scaling of WSCI as a function of canopy height within those pixels. Our analyses showed that the relationship between WSCI and canopy height may be described by a power law in most regions, with WSCI scaling linearly with log of RH98 (Fig. 6a). Deviations from linear relationships may be found within and among forest biomes, mostly in landscapes of sparse tree cover and transitional zones (Fig. 6c). We found average scaling exponents of 1.04, 1.29 and 1.33 for Tropical, Temperate, and Other biomes, respectively (Supplementary Fig. 7).
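A scaling exponent of the kind reported above can be estimated with an ordinary least-squares fit in log-log space. The sketch below assumes the power-law form WSCI = a · RH98^b, so that log(WSCI) is linear in log(RH98); the text specifies only that linear regressions were fitted within 10 km pixels, so the authors' exact regression form may differ. Function and variable names are illustrative.

```python
import math


def fit_power_law(heights, wsci):
    """Least-squares fit of wsci = a * height**b in log-log space.

    Returns (a, b), where b is the scaling exponent. Inputs must be positive.
    """
    xs = [math.log(h) for h in heights]
    ys = [math.log(w) for w in wsci]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct heights")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx  # slope in log-log space = scaling exponent
    a = math.exp(mean_y - b * mean_x)  # back-transformed intercept
    return a, b
```

Applied to the footprints falling in one pixel, `b` plays the role of the scaling exponent and `a` that of the initial state of complexity inferred from the intercept.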
While the rate of increase in complexity per unit of canopy height is lower in tropical forests relative to other forest biomes, the initial state of structural complexity, inferred from the intercept in the scaling relationship, is higher in tropical forests, particularly in the Amazon (Fig. 6b). This implies that short forests are consistently more complex in tropical biomes, but differences among biomes decrease with increasing height, converging near a height of 45 meters (Supplementary Fig. 7c). This phenomenon can be further observed by weakened latitudinal gradients and a more homogeneous global distribution of structural complexity when mapping complexity of only tall forests (Supplementary Fig. 8), making it harder to identify hotspots. In summary, tall forests tend to be highly complex everywhere, and what boosts structural complexity in tropical forests, relative to other biomes, is the complexity of their short forests.
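The convergence of biomes at a common height follows directly from combining a higher intercept with a lower exponent. As a back-of-envelope sketch, assume two biomes follow pure power laws a1 · h^b1 and a2 · h^b2 (an idealization; the fitted relationships vary from pixel to pixel). Setting them equal gives the crossover height h* = (a1 / a2)^(1 / (b2 − b1)). The intercepts used in the usage note are hypothetical, since the text does not report them.

```python
def crossover_height(a1, b1, a2, b2):
    """Height at which a1 * h**b1 equals a2 * h**b2 (idealized power laws)."""
    if b1 == b2:
        raise ValueError("equal exponents: curves never cross or coincide")
    return (a1 / a2) ** (1.0 / (b2 - b1))
```

With the reported average exponents of 1.04 (tropical) and 1.29 (temperate), a crossover near 45 m would correspond to a tropical-to-temperate intercept ratio of roughly 45^0.25 ≈ 2.6. This is an illustration of the relationship, not a value taken from the analysis.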
Discussion
The observed geographical patterns of structural complexity, height scaling, and associated differences in which parts of the canopy are used to predict it across biomes provide a near-global view of complexity as measured from space at relatively fine spatial scales (25 m GEDI footprints). Our models relied on different configurations of relative height metrics to explain the complexity associated with different canopy vertical layers, which had not been quantified over large scales. While these RH metrics are correlated amongst themselves30, these linkages decrease as the vertical distance among them increases, and thus our three broad strata of upper, middle, and lower waveform returns are likely discriminating actual differences in structure rather than artifacts of the modeling process or the data used to derive it.
Structural complexity is inherently linked to 3D space1,2,4, hence our choice of a metric that separates complexity into vertical and horizontal components20. Metrics such as FHD ignore the horizontal component and are highly dependent on the number of height bins, which is determined by the top height of the canopy. GEDI waveforms only provide a direct measurement of the vertical component, leaving the horizontal component to be inferred indirectly through our model-based approach. This approach is analogous to the modeling of aboveground biomass (AGB), which relies on in situ plot measurements to train prediction models based on measurements (e.g. height) that are indirectly related to AGB28. High-resolution ALS data are widely available and provide sufficiently detailed measurements to quantify structural complexity accurately and precisely, and therefore represent the best source of training data. Advanced machine learning models can also integrate variables from high-dimensional space to capture patterns beyond traditional ecological metrics like FHD or PAI. Our model-based approach has additionally provided insights into the sensitivity of GEDI waveforms to 3D structural complexity and uncertainty estimates for use in ecological inference, highlighting areas to target for improvement (e.g., tropical savannas).
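The dependence of FHD on the number of height bins noted above can be seen from its definition as a Shannon entropy, FHD = −Σ p_i ln p_i, where p_i is the proportion of plant material in height bin i. The sketch below is a generic implementation of that formula; the choice of input profile and bin width is an assumption and is not meant to reproduce the GEDI product exactly.

```python
import math


def foliage_height_diversity(profile):
    """Shannon entropy (natural log) of a vertical vegetation profile.

    profile: non-negative amounts of plant material per height bin.
    Empty bins contribute nothing to the sum.
    """
    total = sum(profile)
    if total <= 0:
        raise ValueError("profile must contain some vegetation")
    fhd = 0.0
    for amount in profile:
        if amount > 0:
            p = amount / total
            fhd -= p * math.log(p)
    return fhd
```

A perfectly even profile over n bins yields the maximum value ln(n), so a taller canopy spanning more bins has a higher attainable FHD regardless of how its material is arranged horizontally.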
On the issue of power-law scaling, previous work attributed differences in such scaling across PFTs to crown architecture, vertical layering, and the degree of suppression driven by species competition31. Our results expand on these findings globally and suggest consistent evidence of scaling, but one that varies with biome. In particular, the rate of increase of complexity with respect to height was variable by biome, largely driven by differences between PFTs captured by the different models. However, even within biomes, differences in these scaling patterns at finer scales were observed as the same model can vary its weights (Fig. 2) to predict complexity based on the characteristics of the input canopy profile. Nonetheless, the degree of scaling and its patterns, as given in Fig. 6, are noteworthy.
Our findings distinguished among needleleaf forests, deciduous broadleaf forests, evergreen broadleaf forests, and grasslands, shrublands, and woodlands, likely reflecting the different growth strategies adopted by trees in these plant functional types. The broadly different variable importance patterns found between broadleaf and needleleaf forests (Fig. 2) may be attributable to differences in their tree communities and competition for light, affecting the strategies of occupation of the canopy 3D space. For example, needleleaf trees tend to suppress emerging trees in the understory, displaying a more homogeneous canopy structure, dominated by few species able to grow tall enough to occupy the canopy top32. Broadleaf forests are generally more diverse in terms of tree species and display a wider array of growth strategies, enabling more competition to occur among individuals in the understory31,33. Forests in GSW have sparse trees mixed with short vegetation of other habits, thus the structural complexity in those areas is partially attributed to low-stature vegetation. However, the GEDI lidar is not optimized to measure the vertical structure of the shortest plants (less than about 2–3 m), so GSW model estimates of structural complexity may be dominated by GEDI returns from taller individuals, with low vegetation indistinguishable from the ground. Furthermore, the observed differences in complexity within similar forest types may reflect differences in growth strategies related to variations in species composition and succession dynamics. Tropical forests, generally the most diverse and dynamic ecosystems, benefit from abundant resources supporting tree growth34,35. In contrast, other forest biomes with more limited resource availability allow fewer dominant tree species to thrive, and these are often under stress induced by competition with individuals that follow similar growth strategies and seek the same resources36.
These properties increase the likelihood of occupying niches in the 3D canopy space in tropical forests, allowing for various resource allocation strategies that collectively boost structural complexity. However, tropical forests across different continents may exhibit distinct mechanisms underlying the development of complexity within the same biome type. For example, African rainforests are known to be less diverse than those in the Amazon37, particularly in the understory38. Conversely, Borneo has a higher proportion of large tree individuals than the Amazon, mainly due to the abundance of species from the Dipterocarpaceae family39. Our findings align with these observations, as we found weaker power-law scaling of complexity with height in South/Central American tropical forests compared to African and Southeast Asian tropical forests (Supplementary Fig. 7f). However, although generally positive associations between tree species diversity and structural complexity have been documented3,5,8,40, the strength of these relationships remains uncertain. Therefore, the availability of high-resolution WSCI estimates may facilitate further exploration to better understand how biodiversity and structure interact.
The WSCI is derived from structural attributes and is influenced by disturbances underlying GEDI observations. As an effective integrator of structural information (Supplementary Fig. 6), the WSCI may be useful to assess forest degradation and structural integrity (the ecosystem’s capacity to maintain its structure, function, and composition relative to its natural range of variation41). The SSCIpot product estimates potential structural complexity16, and its comparison with actual complexity (WSCI) may be informative; for example, large differences between SSCIpot and WSCI where the current complexity of the forest is much less than potential could imply forest degradation and a loss of integrity. However, caution must be exercised when comparing indices that, while designed for the same purpose, may not be equivalent and thus yield unreliable interpretations. In our case, we found the two indices were only weakly correlated, even when limiting comparisons to intact forests42 (Supplementary Fig. 9). This lack of agreement likely reflects the divergent means by which the indices were derived. The WSCI uses structural attributes directly to model complexity at fine scales (GEDI footprints) across the domain of GEDI observations. In contrast, the SSCIpot relies on climate variables at much coarser scales. Additionally, the SSCIpot models are influenced by boreal forests, which are not observed by GEDI but are estimated in the SSCIpot map. Given the strong concentration of intact forests in boreal regions, it is perhaps not surprising that the two indices exhibit a weak correlation (Supplementary Fig. 9). Future research should attempt to derive an equivalent WSCI index from the ICESat-2 mission43, which has excellent coverage in boreal regions; if accurate, such an index could provide the means for assessing deviations from potential as given by SSCIpot.
One limitation of the WSCI is that it is only able to infer horizontal complexity within GEDI footprints through its association with explicitly measured vertical variability within footprints (i.e. the RH metrics). Our results showed that such an association exists, and that horizontal and vertical canopy variabilities are linked. For now, such inferential approaches may be the best that can be hoped for until lidar data with sufficient horizontal resolution and spacing are available over large portions of the Earth. The potential for using high-resolution stereo imagery as a substitute for lidar to resolve canopy features is increasing44 though its efficacy in dense forests or to measure different vertical portions of the canopy is yet to be determined.
More research is also needed to understand how to use data that are spatially sparse, such as GEDI, to quantify horizontal complexity across spatial scales, linking within and between footprint canopy structure variability. The GEDI data themselves may be used to examine inter-footprint variability in complexity along its sampling transects, as well as focusing on areas of dense coverage that occurred during its mission period where shots were concentrated due to variations in the ISS orbit. Alternatively, structural complexity derived from waveforms, say in 1 km or larger cells, could be examined, but at the cost of losing the ability to examine any fine-scale climatic, edaphic, and disturbance gradients that might exist. Another approach would use multi-sensor fusion to train models using wall-to-wall remote sensing data, say from passive optical or radar data, to infer complexity. For example, Qi et al.45,46 have shown how GEDI data may be used to train interferometric SAR data from the TanDEM-X satellites to map canopy structure, biomass, and topography at 30 m resolution. Such fusion could be furthered by using extant ALS data sets by way of both calibration and validation. A fusion approach could also help improve the mapping of structural complexity in domains of sparse tree cover where our model estimates over grasslands, shrublands, and woodlands showed weaker performance and higher uncertainty. This result likely reflects the sparse cover in these landscapes and that their structural complexity relies partially on low-stature vegetation not well observed by GEDI waveforms. Fusion approaches may further enable us to characterize forest structural complexity beyond the GEDI geographical domain47, e.g. including boreal forests. These could include methods just described as well as potentially leveraging ICESat-2 data.
We have presented the current global pattern of forest structural complexity using a waveform structural complexity index (WSCI) designed to estimate 3D canopy structural complexity from GEDI observations. Acting as an integrator of various structural metrics, the WSCI estimates the contribution of horizontal and vertical canopy elements to predict complexity at any GEDI footprint. Tropical forests consistently exhibited a larger proportion of high structural complexity than other biomes, even for forests of the same height. Our findings further suggest that the canopy relative height metrics that are most important for inferring structural complexity vary across plant functional types and biomes. The creation of a global database of structural complexity from GEDI48, along with GEDI’s explicit measurement of collocated canopy heights, enabled us to discover that complexity follows a power law function with respect to height in most forests but with important variations by biome. These high-resolution data were also essential for confirming that, while hotspots of structural complexity are concentrated in tropical forests, some temperate forest sites reached structural complexity levels comparable to tropical forests, and hence further actions to preserve these sites should be accelerated. Such efforts have already begun, notably in the United States, where work is currently underway to identify, monitor, and manage mature and old-growth forests49. We anticipate that maps of structural complexity, such as presented here, will advance these efforts. This latter point emphasizes that much remains to be understood about structural complexity, not only where it occurs, but how it develops, how it may be managed, and what it implies for ecosystem functioning. The widespread availability of WSCI estimates from GEDI is a valuable starting point for developing a quantitative comprehension of these issues.
Methods
We developed a modeling framework to estimate a 3D structural complexity index calculated from ALS point clouds matched to GEDI footprints and modeled using GEDI RH metrics through XGBoost regression, followed by conformal predictors for estimating model uncertainty. We used SHAP feature explainers26 to understand the relative contribution of lower, middle, and upper waveform layers for predicting structural complexity in different models and across biomes. We compared worldwide WSCI estimates to other GEDI structural metrics through Principal Components Analysis and subsequent Principal Components Regression to understand how much structural variability captured by GEDI the WSCI was able to uncover. We used linear regression between WSCI and RH98 in 10 km pixels to map geographical patterns of structural complexity scaling with canopy height. Lastly, we compared the global patterns of WSCI with another product estimating potential structural complexity globally to assess whether the two products matched at coarse and fine scales.
GEDI data
GEDI footprints were filtered to select high-quality, high-sensitivity data (Table 1), so that only high-fidelity GEDI footprints were used for model training and validation. Filtering was performed on metrics available in the GEDI L2A version 2 product19. Setting a high sensitivity threshold increased the likelihood of keeping only waveforms that reached the ground in dense canopy cover, thus capturing the entire vertical profile of the forest. We further filtered shots based on their Plant Functional Types (PFTs), keeping only data from PFTs where tree cover is expected based on the MODIS MCD12Q1 V006 product50. Our models used relative height (RH) percentiles from the GEDI L2A product as predictors of a 3D structural complexity metric from GEDI-intersected Airborne Laser Scanning (ALS) point clouds. Once trained, regression models were applied to the current catalog of GEDI footprints (between April 2019 and March 2023) over the Earth’s land surface to generate WSCI predictions.
Table 1.
Filter | Description |
---|---|
algorithm_run_flag = 1 | L2B algorithms were applied |
degrade_flag = 0 | Low degradation of geolocation performance |
land_cover_data/landsat_water_persistence <10 & land_cover_data/urban_proportion <50 | Non-urban land surface waveforms |
rx_maxamp > 8 * sd_corrected | Maximum waveform amplitude at least 8x its standard deviation |
sensitivity > 0.95 (0.98 at the tropics) | High sensitivity shots |
land_cover_data/pft_class in [1,2,3,4,5,6,11] | Plant Functional Types with tree cover |
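The filters in Table 1 can be read as a single pass/fail predicate per shot. The following is a minimal sketch under stated assumptions, not the processing code used in this study: each shot is assumed to be loaded into a dictionary keyed by shortened field names (without the `land_cover_data/` prefix), and the `is_tropical` flag and the example values are hypothetical.

```python
def passes_quality_filters(shot, is_tropical):
    """Return True if a GEDI shot (dict of Table 1 fields) passes all filters.

    `is_tropical` is a hypothetical flag supplied by the caller; how the
    tropics are delimited is not specified here.
    """
    # Plant Functional Type classes with tree cover, as listed in Table 1.
    tree_pfts = {1, 2, 3, 4, 5, 6, 11}
    # Stricter sensitivity threshold for tropical shots.
    min_sensitivity = 0.98 if is_tropical else 0.95
    return (
        shot["algorithm_run_flag"] == 1
        and shot["degrade_flag"] == 0
        and shot["landsat_water_persistence"] < 10
        and shot["urban_proportion"] < 50
        and shot["rx_maxamp"] > 8 * shot["sd_corrected"]
        and shot["sensitivity"] > min_sensitivity
        and shot["pft_class"] in tree_pfts
    )


# Illustrative shot with made-up values (not real GEDI data).
example_shot = {
    "algorithm_run_flag": 1,
    "degrade_flag": 0,
    "landsat_water_persistence": 0,
    "urban_proportion": 5,
    "rx_maxamp": 400.0,
    "sd_corrected": 10.0,
    "sensitivity": 0.97,
    "pft_class": 2,
}
```

With these made-up values the shot passes outside the tropics but fails the stricter tropical sensitivity threshold.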
We assessed the relationship between WSCI estimates and other relevant forest structural metrics from GEDI on a global scale, using only shots where RH98 was greater than 5 meters, and where tree cover was expected from the European Space Agency (ESA) WorldCover v200 product at 10 m resolution51. The GEDI metrics compared with the WSCI were as follows: canopy cover fraction, canopy height (RH98), plant area index (PAI), foliage height diversity (FHD), and above-ground biomass density (AGBD), extracted from the GEDI L2A19, L2B29 and L4A30 data products.
Airborne lidar data
We used a comprehensive ALS discrete point cloud database gathered by the GEDI mission team, consisting of datasets shared by research partners and open data initiatives around the world (Table 2). Those point clouds were collected over a multitude of forest sites distributed across five continental regions, covering a broad range of structural (height, cover, PFTs) and environmental (topography, climate) conditions18.
Table 2.
ALS project | Continental region | Number of intersected GEDI footprints | Reference |
---|---|---|---|
CSIR | Africa | 17,284 | Li et al.76 |
WWF DRC | Africa | 8713 | Xu et al.77 |
JPL Gabon | Africa | 2918 | Fatoyinbo et al.78 |
TERN | Australia | 24,185 | Quadros & Keysers79 |
G-LiHT Mexico | Central America | 69,880 | Cook et al.80 |
PNOA Spain Leon | Europe | 24,020 | Pascual et al.81 |
PNOA Spain Extremadura | Europe | 18,245 | Pascual & Guerra-Hernandez82 |
JPL Borneo | South East Asia | 4346 | Melendy et al.83 |
NERC ARSF Malaysia | South East Asia | 3433 | NERC84 |
EBA INPE Brazil | South America | 140,923 | Ometto et al.85,86 |
NEON | USA | 502,329 | NEON87 |
Those point clouds were matched to GEDI shots collected between 2019 and 2023 and used to correct systematic geolocation errors in their intersected GEDI orbit sections (crossovers) using the GEDI simulator framework52. This procedure calculated the optimal horizontal and vertical offsets to match an observed GEDI waveform to its ALS-simulated counterpart. Low-quality offsets obtained from fewer than 10 shots with a Pearson correlation lower than 0.75 were removed from further analyses. Low offset correlations may indicate waveform degradation not caught by the standard filters, or substantial land cover change in the period between ALS and GEDI data acquisitions. By filtering out these observations, we aimed to keep only ALS/GEDI crossovers that are consistent over space and time, i.e. registered at the same locations with structural elements unchanged between ALS and GEDI acquisitions.
Reference structural complexity index
Several studies have used canopy height variability within lidar point clouds as a structural complexity indicator21,53,54. Other forest structural complexity indices are implementations of traditional metrics adapted to lidar data. Foremost among these is Foliage Height Diversity. This metric has long been used in field surveys to summarize the information within forest vertical plant profiles55 and has a direct lidar counterpart often used as a surrogate for structural complexity31,56,57. Other examples of such implementations are fractal box-dimension, the occupation of 3D space independent of scale22,58; lacunarity, a measure of structured empty spaces in the vegetation59,60; and rugosity and rumple index, proxies for surface roughness54,61. More recently, structural complexity indices have been designed specifically for lidar point clouds, addressing complexity as a 3D metric explicitly, and include the Stand Structural Complexity Index (SSCI)2, the Structural Heterogeneity Index (SHITLS)23, and the 3D Canopy Entropy Index (CEXYZ)20. The SSCI combines fractal elements with vertical layering to account for canopy shape, and horizontal and vertical complexity simultaneously, but it was designed for single scan TLS point clouds and lacks information that enables complete 3D coverage since all objects are scanned from a single direction, also lacking penetration of upper canopies due to the lidar instrument’s design14,15,62. The SHITLS is calculated by combining explicit tree architectural metrics extracted from Quantitative Structural Models (QSMs), which are digitally reconstructed trees that can only be reliably generated from high resolution and high accuracy multiscan TLS point clouds15,63. SHITLS is a sum of standardized metrics that represent different components of structural variation in the forest canopy, assuming that all metrics have the same weight for explaining the 3D complexity. 
The CEXYZ is a sensor agnostic index that can be applied to both ALS and TLS point clouds and its conceptualization has an adaptive voxel sampling step to account for varying 3D point densities, which makes it a flexible metric that produces comparable measurements from different lidar instruments and scanning setups. Moreover, the CEXYZ has explicit vertical and horizontal complexity components in its formulation, and similarly to other complexity metrics, it relies on entropy, whose definition from information theory is synergistic with ecological definitions of complexity, as both are closely associated with randomness, heterogeneity, and variability within a system64. The SSCI, SHITLS, and CEXYZ have a strong theoretical basis on forest ecology and forest management principles, and all have been empirically validated across forest stands from different ecoregions and under different management regimes2,20,23.
Our models targeted the 3D Canopy Entropy Index (CEXYZ)20 to take advantage of the large training database of paired ALS/GEDI crossovers. Although indices designed for TLS data are desirable as a modeling basis due to their higher level of structural detail, the substantially lower coverage and availability of TLS data makes it harder to have a sufficiently large database of collocated TLS/GEDI crossovers for training models at a global scale that captures sufficient variability in most ecoregions. Moreover, there are currently no GEDI waveform simulation frameworks validated for TLS point clouds, and therefore no reliable way of performing geolocation matching and upsampling using simulated waveforms from TLS data.
The CEXYZ combines entropy measures from 2D probability density planes, estimated by a kernel density function, measuring the amount of information contained in a 3D point cloud scene (Eq. 1). Liu et al.20 demonstrated that CEXYZ translates well to forest structural complexity by testing it under multiple forest conditions, showing a monotonic growth of the index with increasing vertical layering and tree density. Furthermore, the CEXYZ is defined as a function of explicit components of horizontal and vertical canopy entropy (i.e. complexity), enabling us to measure the relative contribution of these components in the WSCI modeling framework.
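[Equations 1 and 2 are not preserved in this text. Eq. 1 defines the 3D canopy entropy from the entropies of the 2D planes; Eq. 2 defines the entropy of a 2D plane from the kernel density estimates.]
Where CEXYZ = 3D Canopy Entropy; CEXY, CEXZ and CEYZ = Canopy Entropy of a 2D plane (XY, XZ or YZ); the density term = estimated probability density kernel of a point i on the 2D plane; s = grid size (regular spacing to measure kernel density, in point cloud units).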
We measured CEXYZ from the point clouds in the ALS/GEDI crossovers, which corresponded to the total 3D entropy of 25 m diameter circular plots, with kernel density calculated in systematic grids of 10 cm spacing. We then modeled CEXYZ as a function of the RH metrics from the GEDI footprints (Fig. 7). Point cloud processing and CEXYZ calculations were done in the R programming language using the packages lidR65 version 4.1.0, TreeLS66 version 2.0.5, ks67 version 1.13.5, sf68 version 1.0.12, nabor69 version 0.5 and trend70 version 1.1.4.
Modeling framework, feature importance, and uncertainty estimation
The high-quality crossovers were split into training and test data sets on a geographical basis, in which crossovers from 80% of the sites (training set) were used for training regression models and the remaining 20% (test set) were used for estimating model prediction intervals. The WSCI models were fitted using the Extreme Gradient Boosting Trees (XGBoost25 Python package version 1.7.4) regression algorithm, as it offers a good balance of computational efficiency, precision, accuracy, and robustness against overfitting25. These XGBoost properties granted us fast processing time with reliable outputs, which enabled us to carry out extensive hyperparameter tuning for better model optimization and apply strict quality control on model selection. We used GEDI RH metrics as predictors in the XGBoost regression models, as the vector of RH metrics for a single GEDI footprint has a fixed length of 101 (0 to 100%) and retrieves information from the full forest vertical profile. Therefore, we expected little information loss relative to training models using the raw GEDI full waveforms, with the added benefit of significantly faster computations on both model fitting and prediction. Global models were fitted for each forest PFT domain, based on GEDI footprint intersections with pixels from the MODIS MCD12Q1 V006 product50.
Hyperparameter tuning was performed through 5-fold grid search with spatial cross-validation (GridSearchCV) using the scikit-learn71 Python package version 1.3.2, minimizing the average root mean squared error (RMSE) cost function over the training set split into folds containing samples from different locations. The hyperparameters optimized by the GridSearchCV were: (1) number of estimators (regression trees), (2) sub-sample size (observations), (3) feature space sample size (variables), (4) maximum regression tree depth and (5) learning rate. Spatial cross-validation folds were split on a geographical basis, thus the data in the validation fold belonged to geographical locations unseen in the folds used for fitting the model on each iteration of the GridSearchCV. To enforce model and geographical generalization, we picked the optimal hyperparameters from models that minimized RMSE on validation folds while maintaining a difference of less than 5% in RMSE and R2 between training and validation folds.
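The geographical fold construction and the generalization check described above can be illustrated with a short standard-library sketch. It is not the scikit-learn GridSearchCV configuration used in this study. The round-robin assignment of sites to folds, the function names, and the reading of the 5% criterion as a relative difference are all assumptions made for illustration.

```python
import math


def spatial_folds(site_ids, n_folds=5):
    """Assign whole sites to folds so no site is split across folds.

    Returns a list of (train_idx, val_idx) tuples of sample indices.
    Sites are dealt round-robin in sorted order; this assignment rule is
    an assumption chosen for simplicity.
    """
    sites = sorted(set(site_ids))
    fold_of_site = {site: k % n_folds for k, site in enumerate(sites)}
    folds = []
    for fold in range(n_folds):
        val = [i for i, s in enumerate(site_ids) if fold_of_site[s] == fold]
        train = [i for i, s in enumerate(site_ids) if fold_of_site[s] != fold]
        folds.append((train, val))
    return folds


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def generalizes(train_rmse, val_rmse, max_rel_diff=0.05):
    """True if validation RMSE differs from training RMSE by less than 5%.

    The difference is taken relative to the training RMSE (an assumption).
    """
    return abs(val_rmse - train_rmse) / train_rmse < max_rel_diff
```

Because every sample from a given site lands in the same fold, each validation fold contains only locations unseen by the model fitted on the remaining folds.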
We used SHAP explainers26, as implemented in the shap26 Python package version 0.43, to extract feature importance from model predictions. SHAP values represent the average marginal contribution of a feature across all possible subsets of features. The approach is derived from game theory and is generalizable to any type of machine learning model, providing a comprehensive and consistent way to understand the impact of individual features on model predictions. We used SHAP explainers on the predictions from all the training GEDI samples to understand the patterns of feature importance across different PFT models. We also extracted feature importance from all GEDI high-quality observations to map the geographical patterns of relative contributions of canopy vertical strata to model predictions worldwide. We defined three waveform layers: lower, as the accumulated feature importance <= RH33; middle, as the accumulated feature importance > RH33 and <= RH66; and upper, as the accumulated feature importance > RH66. We averaged the canopy strata contributions in 10 km pixels and generated an RGB false color composite image representing the feature importance from these three canopy strata in different bands.
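The three-layer aggregation can be sketched as follows. This is a minimal illustration that assumes the per-feature importance values for one footprint are already available as a sequence of 101 numbers ordered from RH0 to RH100. Taking absolute values and normalizing the three sums to fractions are assumptions of this sketch, and the function name is hypothetical.

```python
def layer_contributions(importance):
    """Sum absolute per-feature importance into three waveform layers.

    `importance` holds 101 values, one per RH metric (RH0 to RH100).
    Returns fractions (lower, middle, upper) that sum to 1.
    """
    if len(importance) != 101:
        raise ValueError("expected 101 RH importance values (RH0..RH100)")
    magnitudes = [abs(v) for v in importance]
    lower = sum(magnitudes[:34])     # RH0..RH33
    middle = sum(magnitudes[34:67])  # RH34..RH66
    upper = sum(magnitudes[67:])     # RH67..RH100
    total = lower + middle + upper
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (lower / total, middle / total, upper / total)
```

The three fractions could then be averaged per pixel and written to separate image bands; which layer goes to which color band is left open here.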
We used conformal predictors to calculate prediction intervals in the WSCI models through the crepes72 Python package version 0.6.1. This technique allows the estimation of prediction intervals around individual predictions estimated by a known model73,74, being able to build uncertainty estimators using any kind of pre-trained machine learning model as input, accompanied by a set of observations unseen during model training (our test set), used to calibrate the error model. We trained conformal predictors in Mondrian intervals74 to account for heteroskedasticity in model residuals at different intervals of the modeled variable. WSCI predictions were calculated for the entire GEDI footprint catalog acquired between April 2019 and March 2023, providing complexity estimates accompanied by model prediction intervals for every GEDI footprint over the Earth’s land surface.
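To make the idea of Mondrian conformal intervals concrete, the sketch below implements a basic split-conformal regressor in which calibration residuals are grouped by bins of the predicted value. It is written from scratch with the standard library and does not reproduce the crepes API. The bin edges, the miscoverage level `alpha`, and all function names are hypothetical choices for illustration.

```python
import math
from bisect import bisect_right


def calibrate_mondrian(cal_pred, cal_obs, bin_edges, alpha=0.1):
    """Per-bin interval half-widths from calibration residuals.

    Calibration samples are grouped by the bin their *prediction* falls in
    (bin_edges are the interior cut points). For each bin, the half-width is
    the conformal quantile of absolute residuals at coverage 1 - alpha.
    """
    residuals = [[] for _ in range(len(bin_edges) + 1)]
    for pred, obs in zip(cal_pred, cal_obs):
        residuals[bisect_right(bin_edges, pred)].append(abs(obs - pred))
    half_widths = []
    for res in residuals:
        res.sort()
        n = len(res)
        # Rank of the finite-sample conformal quantile: ceil((n + 1)(1 - alpha)).
        rank = math.ceil((n + 1) * (1 - alpha))
        # Too few calibration points in this bin: the interval is unbounded.
        half_widths.append(res[rank - 1] if 1 <= rank <= n else math.inf)
    return half_widths


def predict_interval(pred, bin_edges, half_widths):
    """Return (lower, upper) bounds for a single point prediction."""
    h = half_widths[bisect_right(bin_edges, pred)]
    return (pred - h, pred + h)
```

Because each bin gets its own residual quantile, interval widths can differ across ranges of the predicted value, which is how this kind of binning accommodates heteroskedastic residuals.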
Horizontal and vertical complexity model contributions
The WSCI models were fitted to predict CEXYZ directly from the GEDI RH metrics, but to assess the amount of structural complexity information explained in the vertical and horizontal directions within GEDI footprints we also trained XGBoost models to predict the different terms of the CEXYZ formula (Eq. 1). We used the same modeling framework described above to estimate horizontal (CEXY) and vertical (CEZ) complexity from GEDI RH metrics on a PFT basis. Since GEDI only measures vertical complexity in a single direction, we defined CEZ as the average of the entropies of the two vertical planes in Eq. (1): CEZ = (CEXZ + CEYZ) / 2. We assessed model performance through the RMSE and R2 statistics to determine the efficacy of GEDI for explaining the variability in structural complexity in both directions. We also assessed the feature importance of each model through SHAP explainers to determine which RH metrics are better predictors of horizontal and vertical complexity.
Mapping global patterns and relationships with other GEDI metrics
We generated a global map of the mean WSCI and mean model prediction interval at 1 km spatial resolution. To remove non-forest observations, we aggregated only observations where RH98 was greater than 5 meters, and where tree cover was expected from the ESA WorldCover v200 product at 10 m resolution51.
We compared WSCI estimates with canopy cover fraction, RH98, PAI, FHD, and AGBD, extracted from the GEDI L2A19, L2B29, and L4A30 data products. A PCA was performed on a random sample of 1% of the forest footprints used for mapping. All metrics were standardized to the [0,1] range before performing the PCA to avoid unequal weighting of the PCA eigenvectors caused by different scales of measurement units from the different variables. We also performed a PCA excluding WSCI as input and then performed a PCR to predict WSCI using the first three PCA components to quantify the structural variability uncovered by the WSCI relative to the combination of other GEDI high-level metrics. Ordinary least squares (OLS) regressions were subsequently carried out to investigate relationships between WSCI and log(RH98), used as a proxy of canopy height, in 10 km pixels for the entire GEDI domain. Both PCA and OLS analyses were performed in the Python programming language using packages scikit-learn71 version 1.3.2 and statsmodels75 version 0.14. To capture biome-wide trends and reduce outlier effects in the relationship between WSCI and canopy height we used robust linear regression, fitting an iteratively reweighted least squares regression algorithm weighted using Tukey’s biweight function in the statsmodels75 Python package version 0.14. We applied robust regression between WSCI and log of RH98 to the 1% GEDI footprints sampled for the PCA, divided into 3 forest biome categories: tropical, temperate or other forest domains, using the WWF terrestrial ecoregions of the world as basis24.
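The robust regression step can be illustrated with a from-scratch sketch of iteratively reweighted least squares using Tukey's biweight for a single predictor, for example WSCI against log(RH98). It is not the statsmodels implementation used in this study. The tuning constant c = 4.685, the scale estimate based on the median absolute deviation, the stopping rule, and the function names are common choices adopted here as assumptions.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def robust_line_fit(x, y, c=4.685, n_iter=50, tol=1e-10):
    """IRLS with Tukey's biweight; returns (intercept, slope)."""
    # Start from the ordinary least-squares solution (all weights equal).
    intercept, slope = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(n_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: normalized median absolute deviation.
        med = median(resid)
        scale = median(abs(r - med) for r in resid) / 0.6745
        if scale == 0:
            break
        # Biweight: smoothly down-weight, and zero out beyond c * scale.
        w = [(1 - (r / (c * scale)) ** 2) ** 2 if abs(r) < c * scale else 0.0
             for r in resid]
        new_intercept, new_slope = weighted_line_fit(x, y, w)
        converged = (abs(new_intercept - intercept) < tol
                     and abs(new_slope - slope) < tol)
        intercept, slope = new_intercept, new_slope
        if converged:
            break
    return intercept, slope
```

In use, `x` would hold the log-transformed RH98 values and `y` the WSCI values for one biome category; observations with very large residuals receive zero weight and therefore do not influence the fitted slope.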
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
The authors gratefully acknowledge the large number of contributors of airborne lidar data that enabled the creation of the empirical WSCI models described in this study. Access to the CSIR Africa ALS data was under a data use agreement with the GEDI Mission Science Team, and we gratefully acknowledge CSIR and the University of Witwatersrand for funding the collection of these data. We thank Adrian Pascual and Matheus Nunes for their support on early quality assessments of the WSCI product and manuscript. We are indebted to Bryan Blair and Michelle Hofton for their invaluable contributions to waveform processing and entropy exploration, which laid the groundwork for this research. We gratefully acknowledge the funding from National Aeronautics and Space Administration (NASA) contract NNL 15AA03C for the development and execution of the GEDI mission, including funding to R.D. and J.A., and NASA FINESST grant 80NSSC22K1543 to T.C.
Author contributions
Conceptualization: T.C., R.D. Methodology: T.C., J.A., R.D. Investigation: T.C., J.A., R.D. Visualization: T.C. Funding acquisition: R.D. Writing - original draft: T.C., R.D. Writing - review & editing: T.C., J.A., R.D.
Peer review
Peer review information
Nature Communications thanks Martin Ehbrecht and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Data availability
GEDI data are openly available and archived on NASA Distributed Active Archive Centers (DAACs). The Waveform Structural Complexity Index (WSCI) data product is openly available in the Oak Ridge National Laboratory (ORNL) DAAC as GEDI04_C Waveform Structural Complexity Index Product under accession code 10.3334/ORNLDAAC/2338. All Airborne Laser Scanning (ALS) datasets used in this study are available under open access and can be obtained directly from the sources or by contacting the principal investigators listed in Table 2 of this study. The GEDI footprint-level RH metrics used to train the WSCI models were taken from the GEDI02_A height and elevation product, available at the Land Processes (LP) DAAC under accession code 10.5067/GEDI/GEDI02_A.002. GEDI’s FHD, PAI and cover metrics were taken from the GEDI02_B canopy cover and vertical profile metrics product also available at the LP DAAC under accession code 10.5067/GEDI/GEDI02_B.002. GEDI’s footprint-level biomass data were taken from the GEDI04_A aboveground biomass density (AGBD) product, available at the ORNL DAAC under accession code 10.3334/ORNLDAAC/2056. The ESA worldcover v200 data product is available at https://worldcover2021.esa.int. The WWF Terrestrial Ecoregions of the World can be obtained at https://www.worldwildlife.org/publications/terrestrial-ecoregions-of-the-world.
Code availability
The code used to generate training data from ALS and apply the WSCI models to GEDI data is available on GitHub at https://github.com/tiagodc/GEDI-WSCI. A permanent reference to the version of the code used in this study is available at 10.5281/zenodo.13351657.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-024-52468-2.
References
- 1.McElhinny, C., Gibbons, P., Brack, C. & Bauhus, J. Forest and woodland stand structural complexity: Its definition and measurement. Ecol. Manag.218, 1–24 (2005). 10.1016/j.foreco.2005.08.034 [DOI] [Google Scholar]
- 2.Ehbrecht, M., Schall, P., Ammer, C. & Seidel, D. Quantifying stand structural complexity and its relationship with forest management, tree species diversity and microclimate. Agric. Meteorol.242, 1–9 (2017). 10.1016/j.agrformet.2017.04.012 [DOI] [Google Scholar]
- 3.Gough, C. M., Atkins, J. W., Fahey, R. T. & Hardiman, B. S. High rates of primary production in structurally complex forests. Ecology100, e02864 (2019). [DOI] [PubMed]
- 4.Atkins, J. W. et al. Integrating forest structural diversity measurement into ecological research. Ecosphere14, e4633 (2023). 10.1002/ecs2.4633 [DOI] [Google Scholar]
- 5.Coverdale, T. C. & Davies, A. B. Unravelling the relationship between plant diversity and vegetation structural complexity: A review and theoretical framework. J. Ecol.111, 1378–1395 (2023). 10.1111/1365-2745.14068 [DOI] [Google Scholar]
- 6.Migliavacca, M. et al. The three major axes of terrestrial ecosystem function. Nature598, 468–472 (2021). 10.1038/s41586-021-03939-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Hakkenberg, C. R. & Goetz, S. J. Climate mediates the relationship between plant biodiversity and forest structure across the United States. Glob. Ecol. Biogeogr.30, 2245–2258 (2021). 10.1111/geb.13380 [DOI] [Google Scholar]
- 8.Hakkenberg, C. R. et al. Inferring alpha, beta, and gamma plant diversity across biomes with GEDI spaceborne lidar. Environ. Res. Ecol.2, 035005 (2023). 10.1088/2752-664X/acffcd [DOI] [Google Scholar]
- 9.Li, W. et al. Human fingerprint on structural density of forests globally. Nat. Sustain.6, 368–379 (2023). 10.1038/s41893-022-01020-5 [DOI] [Google Scholar]
- 10.Camarretta, N. et al. Monitoring forest structure to guide adaptive management of forest restoration: A review of remote sensing approaches. New For.51, 573–596 (2020). 10.1007/s11056-019-09754-5 [DOI] [Google Scholar]
- 11.Lehmann, E. A. et al. SAR and optical remote sensing: Assessment of complementarity and interoperability in the context of a large-scale operational forest monitoring system. Remote Sens. Environ.156, 335–348 (2015). 10.1016/j.rse.2014.09.034 [DOI] [Google Scholar]
- 12.Coops, N. C. et al. Modelling lidar-derived estimates of forest attributes over space and time: A review of approaches and future trends. Remote Sens. Environ. 260, 112477 (2021).
- 13.Tompalski, P. et al. Estimating changes in forest attributes and enhancing growth projections: a review of existing approaches and future directions using airborne 3D Point Cloud Data. Curr. For. Rep. 7, 1–24 (2021).
- 14.Liang, X. et al. Terrestrial laser scanning in forest inventories. ISPRS J. Photogramm. Remote Sens.115, 63–77 (2016). 10.1016/j.isprsjprs.2016.01.006 [DOI] [Google Scholar]
- 15.Calders, K. et al. Terrestrial laser scanning in forest ecology: Expanding the horizon. Remote Sens. Environ. 251, 112102 (2020).
- 16.Ehbrecht, M. et al. Global patterns and climatic controls of forest structural complexity. Nat. Commun. 12, 519 (2021). [DOI] [PMC free article] [PubMed]
- 17.Willim, K. et al. Assessing understory complexity in beech-dominated forests (Fagus sylvatica L.) in central europe—from managed to primary forests. Sens. Switz. 19, 1684 (2019). [DOI] [PMC free article] [PubMed]
- 18.Dubayah, R. et al. The Global Ecosystem Dynamics Investigation: High-resolution laser ranging of the Earth’s forests and topography. Sci. Remote Sens.1, 100002 (2020). 10.1016/j.srs.2020.100002 [DOI] [Google Scholar]
- 19.Dubayah, R. et al. GEDI L2A Elevation and Height Metrics Data Global Footprint Level V002. NASA EOSDIS Land Processes Distributed Active Archive Center 10.5067/GEDI/GEDI02_A.002 (2021).
- 20.Liu, X. et al. A novel entropy-based method to quantify forest canopy structural complexity from multiplatform lidar point clouds. Remote Sens. Environ. 282, 113280 (2022). 10.1016/j.rse.2022.113280
- 21.Coops, N. C. et al. A forest structure habitat index based on airborne laser scanning data. Ecol. Indic. 67, 346–357 (2016). 10.1016/j.ecolind.2016.02.057
- 22.Seidel, D. A holistic approach to determine tree structural complexity based on laser scanning data and fractal analysis. Ecol. Evol. 8, 128–134 (2018). 10.1002/ece3.3661
- 23.Reich, K. F., Kunz, M. & von Oheimb, G. A new index of forest structural heterogeneity using tree architectural attributes measured by terrestrial laser scanning. Ecol. Indic. 133, (2021).
- 24.Olson, D. M. et al. Terrestrial ecoregions of the world: a new map of life on earth. BioScience 51, 933 (2001). 10.1641/0006-3568(2001)051[0933:TEOTWA]2.0.CO;2
- 25.Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (Association for Computing Machinery, New York, NY, USA, 2016). 10.1145/2939672.2939785.
- 26.Lundberg, S. M. et al. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2, 56–67 (2020). 10.1038/s42256-019-0138-9
- 27.Hakkenberg, C. R., Peet, R. K., Urban, D. L. & Song, C. Modeling plant composition as community continua in a forest landscape with LiDAR and hyperspectral remote sensing. Ecol. Appl. 28, 177–190 (2018). 10.1002/eap.1638
- 28.Duncanson, L. et al. Aboveground biomass density models for NASA’s Global Ecosystem Dynamics Investigation (GEDI) lidar mission. Remote Sens. Environ. 270, 112845 (2022). 10.1016/j.rse.2021.112845
- 29.Dubayah, R. et al. GEDI L2B Canopy Cover and Vertical Profile Metrics Data Global Footprint Level V002. NASA EOSDIS Land Processes Distributed Active Archive Center 10.5067/GEDI/GEDI02_B.002 (2021).
- 30.Dubayah, R. O. et al. Global Ecosystem Dynamics Investigation (GEDI) GEDI L4A Footprint Level Aboveground Biomass Density, Version 1. 10.3334/ORNLDAAC/1907 (2021).
- 31.Atkins, J. W., Walter, J. A., Stovall, A. E. L., Fahey, R. T. & Gough, C. M. Power law scaling relationships link canopy structural complexity and height across forest types. Funct. Ecol. 10.1111/1365-2435.13983 (2021).
- 32.Dormann, C. F. et al. Plant species richness increases with light availability, but not variability, in temperate forests understorey. BMC Ecol. 20, 43 (2020). 10.1186/s12898-020-00311-9
- 33.Zambrano, J. et al. Analyses of three‐dimensional species associations reveal departures from neutrality in a tropical forest. Ecology 10.1002/ecy.3681 (2022).
- 34.Gómez-Pompa, A., Vázquez-Yanes, C. & Guevara, S. The tropical rain forest: a nonrenewable resource. Science 177, 762–765 (1972). 10.1126/science.177.4051.762
- 35.Ewel, J. Tropical succession: manifold routes to maturity. Biotropica 12, 2–7 (1980). 10.2307/2388149
- 36.Pierce, S. et al. A global method for calculating plant CSR ecological strategies applied across biomes world‐wide. Funct. Ecol. 31, 444–457 (2017). 10.1111/1365-2435.12722
- 37.Parmentier, I. et al. The odd man out? Might climate explain the lower tree α‐diversity of African rain forests relative to Amazonian rain forests? J. Ecol. 95, 1058–1071 (2007). 10.1111/j.1365-2745.2007.01273.x
- 38.Sabatini, F. M. et al. Global patterns of vascular plant alpha diversity. Nat. Commun. 13, 4683 (2022). 10.1038/s41467-022-32063-z
- 39.Banin, L. et al. Tropical forest wood production: a cross‐continental comparison. J. Ecol. 102, 1025–1037 (2014). 10.1111/1365-2745.12263
- 40.Gough, C. M., Atkins, J. W., Fahey, R. T., Hardiman, B. S. & LaRue, E. A. Community and structural constraints on the complexity of eastern North American forests. Glob. Ecol. Biogeogr. 29, 2107–2118 (2020). 10.1111/geb.13180
- 41.Hansen, A. et al. Global humid tropics forest structural condition and forest structural integrity maps. Sci. Data 6, 232 (2019). 10.1038/s41597-019-0214-3
- 42.Potapov, P. et al. The last frontiers of wilderness: Tracking loss of intact forest landscapes from 2000 to 2013. Sci. Adv. 3, e1600821 (2017). 10.1126/sciadv.1600821
- 43.Abdalati, W. et al. The ICESat-2 Laser Altimetry Mission. Proc. IEEE 98, 735–751 (2010). 10.1109/JPROC.2009.2034765
- 44.Tucker, C. et al. Sub-continental-scale carbon stocks of individual trees in African drylands. Nature 615, 80–86 (2023). 10.1038/s41586-022-05653-6
- 45.Qi, W. et al. Improved forest height estimation by fusion of simulated GEDI Lidar data and TanDEM-X InSAR data. Remote Sens. Environ. 221, 621–634 (2019). 10.1016/j.rse.2018.11.035
- 46.Qi, W., Saarela, S., Armston, J., Ståhl, G. & Dubayah, R. Forest biomass estimation over three distinct forest types using TanDEM-X InSAR data and simulated GEDI lidar data. Remote Sens. Environ. 232, 111283 (2019). 10.1016/j.rse.2019.111283
- 47.Lang, N., Jetz, W., Schindler, K. & Wegner, J. D. A high-resolution canopy height model of the Earth. Nat. Ecol. Evol. 7, 1778–1789 (2023). 10.1038/s41559-023-02206-6
- 48.De Conto, T., Armston, J. & Dubayah, R. O. Global Ecosystem Dynamics Investigation (GEDI) GEDI L4C Footprint Level Waveform Structural Complexity Index, Version 2. 10.3334/ORNLDAAC/2338 (2024).
- 49.Bruening, J. M., Dubayah, R. O., Pederson, N., Poulter, B. & Calle, L. Definition criteria determine the success of old-growth mapping. Ecol. Indic. 159, 111709 (2024). 10.1016/j.ecolind.2024.111709
- 50.Friedl, M. & Sulla-Menashe, D. MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V061. NASA EOSDIS Land Processes Distributed Active Archive Center 10.5067/MODIS/MCD12Q1.061 (2022).
- 51.Zanaga, D. et al. ESA WorldCover 10 m 2021 v200. Zenodo 10.5281/ZENODO.7254221 (2022).
- 52.Hancock, S. et al. The GEDI simulator: A Large‐Footprint Waveform Lidar simulator for calibration and validation of spaceborne missions. Earth Space Sci. 6, 294–310 (2019). 10.1029/2018EA000506
- 53.Zellweger, F., Braunisch, V., Baltensweiler, A. & Bollmann, K. Remotely sensed forest structural complexity predicts multi-species occurrence at the landscape scale. For. Ecol. Manag. 307, 303–312 (2013). 10.1016/j.foreco.2013.07.023
- 54.LaRue, E. A. et al. Compatibility of aerial and terrestrial LiDAR for quantifying forest structural diversity. Remote Sens. 12, (2020).
- 55.MacArthur, R. H. & MacArthur, J. W. On bird species diversity. Ecology 42, 594–598 (1961). 10.2307/1932254
- 56.Adnan, S. et al. Determining maximum entropy in 3D remote sensing height distributions and using it to improve aboveground biomass modelling via stratification. Remote Sens. Environ. 260, 112464 (2021). 10.1016/j.rse.2021.112464
- 57.Crockett, E. T. H. et al. Structural and species diversity explain aboveground carbon storage in forests across the United States: Evidence from GEDI and forest inventory data. Remote Sens. Environ. 295, 113703 (2023). 10.1016/j.rse.2023.113703
- 58.Seidel, D. et al. Deriving stand structural complexity from airborne laser scanning data-what does it tell us about a forest? Remote Sens. 12, (2020).
- 59.Mandelbrot, B. B. The Fractal Geometry of Nature. (Henry Holt and Company, 1983).
- 60.Weishampel, J. F., Blair, J. B., Knox, R. G., Dubayah, R. & Clark, D. B. Volumetric lidar return patterns from an old-growth tropical rainforest canopy. Int. J. Remote Sens. 21, 409–415 (2000). 10.1080/014311600210939
- 61.Kane, V. R. et al. Comparisons between field- and LiDAR-based measures of stand structural complexity. Can. J. For. Res. 40, 761–773 (2010). 10.1139/X10-024
- 62.Liang, X. et al. International benchmarking of terrestrial laser scanning approaches for forest inventories. ISPRS J. Photogramm. Remote Sens. 144, 137–179 (2018). 10.1016/j.isprsjprs.2018.06.021
- 63.Calders, K. et al. Realistic forest stand reconstruction from terrestrial LiDAR for radiative transfer modelling. Remote Sens. 10, 1–15 (2018). 10.3390/rs10060933
- 64.Loke, L. H. L. & Chisholm, R. A. Measuring habitat complexity and spatial heterogeneity in ecology. Ecol. Lett. 10.1111/ele.14084 (2022).
- 65.Roussel, J.-R. et al. lidR: An R package for analysis of Airborne Laser Scanning (ALS) data. Remote Sens. Environ. 251, 112061 (2020). 10.1016/j.rse.2020.112061
- 66.de Conto, T., Olofsson, K., Görgens, E. B., Rodriguez, L. C. E. & Almeida, G. Performance of stem denoising and stem modelling algorithms on single tree point clouds from terrestrial laser scanning. Comput. Electron. Agric. 143, 165–176 (2017). 10.1016/j.compag.2017.10.019
- 67.Duong, T., Wand, M., Chacon, J. & Gramacki, A. ks: Kernel Smoothing. R package version 1.13.5 10.32614/CRAN.package.ks (2024).
- 68.Pebesma, E. et al. sf: Simple Features for R. R package version 1.0.12 10.32614/CRAN.package.sf (2024).
- 69.Mangenat, S. & Jefferis, G. nabor: Wraps ‘libnabo’, a Fast K Nearest Neighbour Library for Low Dimensions. R package version 0.5.0 10.32614/CRAN.package.nabor (2018).
- 70.Pohlert, T. trend: Non-Parametric Trend Tests and Change-Point Detection. R package version 1.1.4 10.32614/CRAN.package.trend (2023).
- 71.Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
- 72.Boström, H. crepes: a Python Package for Generating Conformal Regressors and Predictive Systems. in Proceedings of the Eleventh Symposium on Conformal and Probabilistic Prediction with Applications 24–41 (PMLR, 2022).
- 73.Conformal prediction. in Algorithmic Learning in a Random World (eds Vovk, V., Gammerman, A. & Shafer, G.) 17–51 (Springer US, Boston, MA, 2005). 10.1007/0-387-25061-1_2.
- 74.Boström, H., Johansson, U. & Löfström, T. Mondrian conformal predictive distributions. in Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications 24–38 (PMLR, 2021).
- 75.Seabold, S. & Perktold, J. Statsmodels: Econometric and Statistical Modeling with Python. Proc. 9th Python Sci. Conf. 92–96 10.25080/Majora-92bf1922-011 (2010).
- 76.Li, X. et al. First validation of GEDI canopy heights in African savannas. Remote Sens. Environ. 285, 113402 (2023). 10.1016/j.rse.2022.113402
- 77.Xu, L. et al. Spatial distribution of carbon stored in forests of the Democratic Republic of Congo. Sci. Rep. 7, 15030 (2017). 10.1038/s41598-017-15050-z
- 78.Fatoyinbo, T. et al. The NASA AfriSAR campaign: Airborne SAR and lidar measurements of tropical forest structure and biomass in support of current and future space missions. Remote Sens. Environ. 264, 112533 (2021). 10.1016/j.rse.2021.112533
- 79.Quadros, N. & Keysers, J. Airborne lidar acquisition and validation. In Effective Field Calibration and Validation Practices: A practical handbook for calibration and validation satellite and model-derived terrestrial environmental variables for research and management (A TERN Landscape Assessment Initiative, NCRIS, 2018).
- 80.Cook, B. D. et al. NASA Goddard’s LiDAR, Hyperspectral and Thermal (G-LiHT) airborne Imager. Remote Sens. 5, 4045–4066 (2013). 10.3390/rs5084045
- 81.Pascual, A. et al. Assessing the performance of NASA’s GEDI L4A footprint aboveground biomass density models using National Forest Inventory and airborne laser scanning data in Mediterranean forest ecosystems. For. Ecol. Manag. 538, 120975 (2023). 10.1016/j.foreco.2023.120975
- 82.Pascual, A. & Guerra-Hernández, J. An integrated assessment of carbon emissions from forest fires beyond impacts on aboveground biomass. A showcase using airborne lidar and GEDI data over a megafire in Spain. J. Environ. Manag. 345, 118709 (2023). 10.1016/j.jenvman.2023.118709
- 83.Melendy, L. et al. CMS: LiDAR Data for Forested Sites on Borneo Island, Kalimantan, Indonesia, 2014. 10.3334/ORNLDAAC/1518 (2017).
- 84.NERC Airborne Research Facility (NERC ARF). NERC-ARF 2014 Flights: Airborne remote sensing measurements. https://catalogue.ceda.ac.uk/record/party/2310/ (2020).
- 85.Ometto, J. et al. L1A - Discrete airborne LiDAR transects collected by EBA in the Brazilian Amazon (Roraima e Amapá). Zenodo 10.5281/zenodo.7689693 (2023).
- 86.Ometto, J. P. et al. A biomass map of the Brazilian Amazon from multisource remote sensing. Sci. Data 10, 668 (2023). 10.1038/s41597-023-02575-4
- 87.National Ecological Observatory Network (NEON). Discrete return LiDAR point cloud (DP1.30003.001) 10.48443/HJ77-KF64 (2024).
Data Availability Statement
GEDI data are openly available and archived at NASA Distributed Active Archive Centers (DAACs). The Waveform Structural Complexity Index (WSCI) data product is openly available at the Oak Ridge National Laboratory (ORNL) DAAC as the GEDI04_C Waveform Structural Complexity Index Product under accession code 10.3334/ORNLDAAC/2338. All Airborne Laser Scanning (ALS) datasets used in this study are available under open access and can be obtained directly from the sources or by contacting the principal investigators listed in Table 2 of this study. The GEDI footprint-level RH metrics used to train the WSCI models were taken from the GEDI02_A elevation and height metrics product, available at the Land Processes (LP) DAAC under accession code 10.5067/GEDI/GEDI02_A.002. GEDI’s FHD, PAI and cover metrics were taken from the GEDI02_B canopy cover and vertical profile metrics product, also available at the LP DAAC under accession code 10.5067/GEDI/GEDI02_B.002. GEDI’s footprint-level biomass data were taken from the GEDI04_A aboveground biomass density (AGBD) product, available at the ORNL DAAC under accession code 10.3334/ORNLDAAC/2056. The ESA WorldCover v200 data product is available at https://worldcover2021.esa.int. The WWF Terrestrial Ecoregions of the World can be obtained at https://www.worldwildlife.org/publications/terrestrial-ecoregions-of-the-world.
The code used to generate training data from ALS and apply the WSCI models to GEDI data is available on GitHub at https://github.com/tiagodc/GEDI-WSCI. A permanent reference to the version of the code used in this study is available at 10.5281/zenodo.13351657.