2020 Sep 11;11:4540. doi: 10.1038/s41467-020-18321-y

Fig. 3. Assessment of predictive performance in a spatially structured environment.

a Clustering of field data into 44 spatial folds (bright colors) used in spatial K-fold CV, superimposed over the spatial distribution of moist forests (ref. 61; light gray) and country borders. b At the regional scale (a few hundred km, clipped from a), field data (light gray) are aggregated into dense clusters, leaving large swaths of unsampled areas. The outlined region is further expanded in c. c At the landscape scale (a few tens of km), reference pixel AGB and AGB covariate values are spatially dependent (as seen in Fig. 2a, b), which violates the assumption, made by cross-validation procedures based on random data splitting, that training and test sets are independent. We used circular buffers of increasing radii r to exclude training data (gray) located around the test data (dark gray) in a buffered leave-one-out CV (B-LOO CV) to evaluate how spatial dependence in the data affected CV statistics.
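The buffered leave-one-out splitting described in panel c can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `buffered_loo_splits`, the use of planar Euclidean distance on projected coordinates, and the strict `>` comparison at the buffer edge are all assumptions made for the example.

```python
import numpy as np


def buffered_loo_splits(coords, radius):
    """Yield (train_idx, test_idx) pairs for a buffered leave-one-out CV.

    coords : (n, 2) array of projected x/y coordinates (assumed planar,
             in the same unit as `radius`, e.g. km).
    radius : buffer radius r; training points lying within r of the
             held-out test point are excluded from that split.
    """
    coords = np.asarray(coords, dtype=float)
    for i in range(coords.shape[0]):
        # Euclidean distance from the held-out point to every point.
        dist = np.linalg.norm(coords - coords[i], axis=1)
        # Assumption: points exactly on the buffer edge are excluded too.
        keep = dist > radius
        # The test point itself is never part of the training set.
        keep[i] = False
        yield np.flatnonzero(keep), np.array([i])


# Hypothetical toy coordinates (km) along a line, for illustration only.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [20.0, 0.0]])
for train_idx, test_idx in buffered_loo_splits(coords, radius=3.0):
    print(test_idx, train_idx)
```

With `radius=0` and no duplicated coordinates this reduces to an ordinary leave-one-out split, so repeating the loop over increasing radii gives one set of CV statistics per buffer size.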