Scientific Reports. 2024 Mar 27;14:7213. doi: 10.1038/s41598-024-57588-9

A super SDM (species distribution model) ‘in the cloud’ for better habitat-association inference with a ‘big data’ application of the Great Gray Owl for Alaska

Falk Huettmann 1, Phillip Andrews 1, Moriz Steiner 1, Arghya Kusum Das 2, Jacques Philip 3, Chunrong Mi 4, Nathaniel Bryans 5, Bryan Barker 5
PMCID: PMC10965900  PMID: 38531933

Abstract

The currently available distribution and range maps for the Great Gray Owl (GGOW; Strix nebulosa) are ambiguous, contradictory, imprecise, outdated, and often hand-drawn; thus they are neither quantified nor based on data or science. In this study, we present a proof of concept with a biological application that advances technical and biological workflows using the latest global open access ‘Big Data’ sharing and open-source methods in R and geographic information systems (OGIS and QGIS), assessed with six recent multi-evidence citizen-science sightings of the GGOW. This proposed workflow can be applied for quantified inference in any species-habitat model, such as those typically applied as species distribution models (SDMs). Using Random Forest, an ensemble-type machine learning model that follows Leo Breiman’s approach of inference from predictions, we present a Super SDM for GGOWs in Alaska running on Oracle Cloud Infrastructure (OCI). These Super SDMs were based on the best publicly available data (410 occurrences + 1% new assessment sightings) and over 100 environmental GIS habitat predictors (‘Big Data’). The compiled global open access data and the associated workflow overcome for the first time the limitations of traditionally used PCs and laptops. The workflow breaks new ground and has real-world implications for conservation and land management for GGOW, for Alaska, and for other species worldwide as a ‘new’ baseline. As this research field remains dynamic, Super SDMs can have limits and are not yet the ultimate and final statement on species-habitat associations, but they summarize all publicly available data and information on a topic in a quantified and testable fashion, allowing fine-tuning and improvements as needed. At minimum, they allow for low-cost rapid assessment and a great leap forward toward being more ecological and inclusive of all information at hand.
Using GGOWs, here we aim to correct the perception of this species towards a more inclusive, holistic, and scientifically correct assessment of this urban-adapted owl in the Anthropocene, rather than a mysterious wilderness-inhabiting species (aka ‘Phantom of the North’). Such a Super SDM was never created for any bird species before and opens new perspectives for impact assessment policy and global sustainability.

Keywords: Big data; Machine learning ensemble; Open access; Open source geographic information system (OGIS, QGIS); Great Gray Owl (Strix nebulosa); Alaska; Cloud computing; Oracle cloud infrastructure

Subject terms: Conservation biology, Ecological modelling, Environmental sciences

Introduction

Knowing where animals occur is a crucial component of science-based conservation management and global sustainability in the real industrial world: the Anthropocene and its challenges (e.g.1,2). Methods to obtain such knowledge are commonly neither robust nor very advanced. As per textbook (see for instance3), they are primarily based on inappropriate linear functions4, simplistic use of step-wise coefficients5, frequency statistics and parsimony, unrealistic parametric assumptions, simplistic computing, and the use of relatively few predictors (e.g. < 5 predictor variables) that widely 'underdescribe' and bias ecology; examples are shown in6,7,8. These problems have been well known and described for decades (e.g.4,9–12), and they do not reflect well on a modern science-based management that could employ readily available computer models and capture what complex ecology, with its myriad of linkages, really is about. Required progress has been widely insufficient1,2,12. A good example of dealing better with ecological complexity is telecoupling and spill-over effects13. But although they have been widespread and freely available for over two decades, more holistic methods like machine learning algorithms14,15, ensemble models16–18 and supercomputing based on widely available open access ‘Big Data’ are still widely ignored19–21, underused and not applied to their potential (11 and citations within), e.g., multivariate analysis done with modern methods (22; see23 for a national application in the subarctic). Considering the global environmental crisis12, progress so far in globally relevant fields like conservation policy based on multivariate efforts has been quite insignificant (e.g.1,2,11). For instance, most species management models still remain in the single-species realm, ignoring species clusters and communities (11; see7 for Resource Selection Functions, RSF, and4 for the Habitat Suitability Index, HSI).
Also, telemetry and geolocator data are still missing for most species and are widely biased in sample sizes and animal strata; they are frequently still hand-mined for perceived outliers or processed with ‘an assumed common-sense’ code (example shown here24 and with an application by25). It is clear that the sheer magnitude and complexity of biodiversity cannot be geo-tagged for a solution, nor should it be. Promoting more geo-tagging efforts and mindsets as proper science and conservation remains far away from realistic, natural species distributions and from global realities. Already lacking a relevant consideration of scale and autocorrelation, those approaches do not achieve any modern modeling concepts for the urgently needed population inference in times of the global biodiversity crisis. They just remain in a repetitive ‘me too’ point-and-click science ‘group-think’. Such a low-performing institutional culture, without deeper reflection on progress and with a missing vision, still dominates, e.g., in regular SDMs through the use of just a few predictors and Maximum Entropy (Maxent; a shallow-learning machine learning algorithm26,27). A relevant research design with relevant strata, a mutually accepted taxonomy for sampling, and meaningful absence and availability data linked with socio-economic or higher-precision climate change predictors are all conspicuous by their absence. For mandated biodiversity management this is often widely impossible to achieve.
The codified species-habitat models like HSIs, RSFs, Occupancy Models28 or Species Distribution Models (SDMs;29) widely compete with each other, are often not in mutual agreement and still use methods that are at least 20 years old (11, and citations within), e.g., Maxent as a leading algorithm in regular SDMs (26,27,29; Maxent as an algorithm comes from the 1960s and has not been improved in relevant terms since the 1980s, still remaining in the probability framework based on parametric assumptions, which are dubious to meet in real-life biology, e.g.4,11). Instead, modern ensemble model approaches that are based on J. Friedman’s paradigm of ‘many weak learners make for a strong learner’ are few and far between but powerful (30; see also11). For HSIs, RSFs and Occupancy Models, which are still widely taught and used in the wildlife discipline, its institutions and federal contractors applied for governance policy, the reality is even worse (based on ambiguous parsimony, linearity, few predictors and dubious model fittings for probability requiring a strict but unrealistic and rarely achieved research design;4,11,28 respectively).

In the meantime, with open access data sources on the rise in the Anthropocene, many managed species are now of great concern and the wider ecology is simply left unaddressed, still using an underlying governance understanding and policy that comes from over 100 years ago (see here the dominant legal interpretation of ‘Originalism’31; see32 for a critique and failure). It does not remotely allow for modern or more relevant telecoupling approaches13 and similar (see33 for Deep Ecology and holistic aspects) in the world we actually live in (‘the Anthropocene’), or for massive problems faced by humanity in the future.

Employing best-available methods for confidence of the inference11, being accurate and precise matters for a proper habitat and species management3. That concept applies even more so in areas that are already deeply affected by the Anthropocene20,21, as well as with a human-accelerated climate change where a vast environmental onslaught is predicted to occur. Sophistication matters for a good outcome.

Using a new and best-available large open access global geographic information system (GIS) predictor data set for Alaska, here we introduce and show an example of the improved options available: Super SDMs (34; for regular and latest SDMs see35–37, as well as23,27). Here we apply it to a species paradox, the charismatic and circumpolar but greatly unknown, understudied and misunderstood so-called ‘Phantom of the North’ (https://abcbirds.org/bird/great-gray-owl/;38): the Great Gray Owl (Strix nebulosa). It is a very popular species in the public eye (see for instance its feature in the 'Into the Wild' movie and book for remote Alaska39). This species is likely long-lived and has a circumpolar distribution38. Relevant distribution data for this species are scarce and widely missing in Alaska, though40,41. We introduce here the generic concept of a ‘Super SDM’34 based on a widely extended set of open access predictors and the latest computational methods. We investigate and promote it as a new but readily available science-mandated global baseline for inference in species-habitat associations. Knowing the best-available species-habitat associations is of crucial importance on a finite planet, while consumption patterns, human population, social inequality, habitat fragmentation, sea levels, global temperatures, etc. are greatly on the rise, compromising wilderness and its species.

Methods

We started with the pioneering study approach presented by42 (based on34,35) and applied it as an update to Great Gray Owls (GGOW; taxonomic serial number TSN 177929) for Alaska. It followed the initial work from43 and was then extended with more and fine-tuned predictors and a cloud computing platform to overcome computing limitations. The workflow is described below and visualized in Fig. 1.

Figure 1.


Generic workflow for this study and suggested for Super SDMs. Text in brackets shows adjustable components as they were used in this study.

Data

We compiled what are likely the best-known and publicly available open access occurrence records for GGOWs in Alaska (n = 410), covering the years from 1880 to 2019 (see Fig. 2). Virtually all data points come from visual detections, whereas relevant nest location information is widely unknown in Alaska and unlikely to be contained in those data. The data are in the public domain (see43,44 for citizen science data), were merged from various publicly available sources and do not carry a unifying underlying protocol and research design (details in43; eBIRD citation provided further below). Because we let the algorithm take care of data and outliers for generalization (sensu11), we do not filter the precious data. Still, wrong identifications and erroneous species confusions for GGOW are virtually impossible due to its unique appearance (for more data validity details see43,44,45). GGOWs are not known to occur in clusters and are usually found individually46; thus autocorrelation is not an apparent issue for this species and its data (our model analysis with ‘tree-based algorithms’ is relatively robust to such issues regardless; see11 and citations within). These presence data were merged with the ‘background data’ (pseudo-absence) for all of the study area, resulting in a binary response (presence/absence) for the subsequent data mining and models based on a relative index of occurrence (RIO;11).
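The merge of presence records with background points can be illustrated with a minimal sketch. It assumes both inputs are available as lists of (longitude, latitude) pairs. This is an illustrative Python example, not the R code used in the study; the function name `build_response_table`, the column names and the coordinates are hypothetical.

```python
def build_response_table(presence_points, background_points):
    """Stack presence and background (pseudo-absence) points into one table.

    Each row is a dict with the hypothetical column names 'lon', 'lat' and
    'presence' (1 = owl sighting, 0 = background lattice point).
    """
    rows = []
    for lon, lat in presence_points:
        rows.append({"lon": lon, "lat": lat, "presence": 1})
    for lon, lat in background_points:
        rows.append({"lon": lon, "lat": lat, "presence": 0})
    return rows


# Made-up coordinates for illustration only (not data from the study).
presence = [(-147.7, 64.8), (-149.9, 61.2)]
background = [(-150.0, 65.0), (-151.0, 66.0), (-152.0, 67.0)]

table = build_response_table(presence, background)
n_presence = sum(row["presence"] for row in table)
n_background = len(table) - n_presence
```

In such a table the background rows are not confirmed absences; they only describe the environment available across the study area, which is why the model output is read as a relative index rather than a probability.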

Figure 2.


Great Gray Owl sightings in the study area of Alaska.

In addition, we also compiled the best-available global open access set of GIS layer predictors. Here we used Alaska as the study area, environmentally described by 100+ predictors (‘Big Data’; we currently have an even larger global data set of over 132 and of 230 GIS layers33), but here we focus on Alaska-specific questions and use its continuous predictors (while many other categorical predictors remain unused, still awaiting their use and further assessment). The list of utilized predictors can be seen in Table 1. This dataset exists in the form of ASCII/TIFF files in a WGS 1984 geographic projection of latitude and longitude in decimal degrees (see Data Availability section and Appendix section within). For layer creation of the specific Alaska features we used also the Alaska state NAD1983 projection with coordinates in feet for a slightly higher accuracy of local variables.

Table 1.

List of predictors for Alaska used in this study; the majority of predictors are climate-related (6 datasets with monthly mean metrics; n = 75) with some topographic (n = 5), biological (n = 5) and human-related ones (n = 15). This data set is a dynamic Open Access GIS layer dataset compiled by Sriram and Huettmann (unpublished), Andrews (2019) and Steiner and Huettmann (in review). Overall it lists more than 219 GIS layers for Alaska.

Data set # | Data | Res | Units | Variable type | Specific source | Citations
1–12 | Average temperature by month12 | 60 m | C*100 | Quan | PRISM | Sriram and Huettmann (unpublished)
13–24 | Average precipitation by month12 | 60 m | mm | Quan | PRISM | Sriram and Huettmann (unpublished)
25 | Human population density | 1 km | Humans/km² | Quan | CIESIN | Sriram and Huettmann (unpublished)
26 | NDVI | 1 km | Index | Quan | Website | Sriram and Huettmann (unpublished)
27 | Globcover | 1 km | Categories | Categorical | Website | Sriram and Huettmann (unpublished)
28 | GLC2000 | 1 km | Categories | Categorical | Website | Sriram and Huettmann (unpublished)
29–41 | Cloudcover by month | 60 m | % | Quan | World Clouds | Sriram and Huettmann (unpublished)
42–61 | BIOCLIM 1–19 | 1 km | Indices | Quan | Bioclim | Sriram and Huettmann (unpublished)
62 | Aspect | 300 m | Degrees | Quan | USGS | Sriram and Huettmann (unpublished)
63–75 | Solar radiation by month | 1 km | Kjul | Quan | World Solar Radiation | Sriram and Huettmann (unpublished)
76 | Human Footprint | 2 km | Index | Rank | Assembled | WWF
77 | Mammal density | 2 km | Species number | Quan | Publication | Steiner and Huettmann (in review)
78 | Bird density | 2 km | Species number | Quan | Publication | Steiner and Huettmann (in review)
79 | Proximity to coast | 1 km | Index (km) | Quan | GIS | Andrews (2019)
80 | Lake proximity | 1 km | Index (km) | Quan | GIS | Andrews (2019)
81 | Road proximity | 1 km | Index (km) | Quan | GIS | Andrews (2019)
82 | Proximity to ‘water’ | 1 km | Index (km) | Quan | GIS | Andrews (2019)
83 | Proximity to Airport | 1 km | Index (km) | Quan | GIS | Andrews (2019)
84 | Proximity to Fire | 1 km | Index (km) | Quan | GIS | Andrews (2019)
85 | Proximity to pipeline | 1 km | Index (km) | Quan | GIS | Andrews (2019)
86–98 | Monthly global mean temperatures | 1 km | Deg C | Quan | World Climate | Sriram and Huettmann (unpublished)
99 | World Rodent Diversity | 2 km | Species number | Quan | Publication | Steiner and Huettmann (in review)
100 | Elevation | 300 m | m asl | Quan | USGS | Steiner and Huettmann (in review)
101 | Model1 | 1 km | RIO | Quan | Publication | Zahibi et al. (2091)
101 | X coordinate | | m | Quan | GIS | Not used in models as a predictor
102 | Y coordinate | | m | Quan | GIS | Not used in models as a predictor

We then used a point lattice of 1 km for Alaska, created in Open GIS QGIS (vers. 3.28 Firenze; https://blog.qgis.org/2022/10/25/qgis-3-28-firenze-is-released/). Those lattice points were used as background (pseudo-absence) samples to be compared with presence points in the study area as part of a binary response (see also11,47). The lattice was also used later as a point-prediction grid for the study area for overlays with the predictors (resulting in the ‘data cube’). That way it was also used for scoring the predictions from the model described below to each lattice point (as presented in11). This step is crucial to geo-reference the obtained predictions, allowing for a spatial representation of the model results. The data cube is exported as a stand-alone table in CSV format consisting of 373,423 rows (lattice points) and 105 columns and has a size of 206 MB.
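The reported dimensions of the data cube allow a rough size estimate, worked through in the short Python sketch below. Two assumptions are made that the text does not state: that 1 MB means 10^6 bytes, and that all 105 columns would be held as 8-byte floating point numbers once loaded (R stores numeric values as 8-byte doubles).

```python
ROWS = 373_423             # lattice points, as reported in the text
COLS = 105                 # columns, as reported in the text
CSV_BYTES = 206 * 10**6    # 206 MB, ASSUMING 1 MB = 10**6 bytes
BYTES_PER_DOUBLE = 8       # size of one double-precision number

cells = ROWS * COLS
csv_bytes_per_cell = CSV_BYTES / cells

# ASSUMPTION: every column is numeric and stored as a double in memory.
in_memory_bytes = cells * BYTES_PER_DOUBLE
in_memory_mb = in_memory_bytes / 10**6

print(f"cells: {cells:,}")
print(f"CSV bytes per cell: {csv_bytes_per_cell:.2f}")
print(f"all-numeric table in memory: {in_memory_mb:.1f} MB")
```

Under these assumptions the cube holds about 39.2 million cells, averages roughly 5.25 bytes per cell on disk, and would occupy roughly 314 MB as an all-numeric table in memory.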

Thanks to the machine learning approach used here, one is able to handle all the compiled data, including some potentially uncertain data (aka ‘bad apples’; see11 and citations within). Thus, we did not engage much in specific data cleaning, transformation or correction of the raw data (= GGOW locations and predictors). Being able to use default data speaks to the powerful research design we allow, and here we relied on data sections as received (e.g. openly shared with the global public) and brought together. In this study we actually let the algorithm ‘learn’ the signals in the data and handle all the data realities for generalization (sensu48,49; “inference from predictions” as a core scheme of the approach chosen and promoted by Leo Breiman; see also11 and citations within). We then assess the major predictions with a test using several lines of evidence. Here we apply published and alternative data, e.g. coming from a research design, as well as several citizen science source data for this species overall within Alaska (examples shown in50).

Models and cloud computing

For a proof of concept, we used a basic RandomForest (‘bagging’, a powerful ensemble model classifier;48–51) run in R on the data cube. To run this analysis, we utilized the R package ‘randomForest’ (https://cran.r-project.org/web/packages/randomForest/index.html; see52,53 for further justification of this application). We followed Formula 1 for a RandomForest run. Details of the base code we used in R are shown in Appendix 1 (see Data Availability section).

Formula 1: Presence/Background ~ tmean_1 + tmean_2 + tmean_3 + tmean_4 + tmean_5 + tmean_6 + tmean_7 + tmean_8 + tmean_9 + tmean_10 + tmean_11 + tmean_12 + prec_1 + prec_2 + prec_3 + prec_4 + prec_5 + prec_6 + prec_7 + prec_8 + prec_9 + prec_10 + prec_11 + prec_12 + pdensit1 + ndvi + globcover + glc2000 + cloud1 + cloud2 + cloud3 + cloud4 + cloud5 + cloud6 + cloud7 + cloud8 + cloud9 + cloud10 + cloud11 + bio_1 + bio_2 + bio_3 + bio_4 + bio_5 + bio_6 + bio_7 + bio_8 + bio_9 + bio_10 + bio_11 + bio_12 + bio_13 + bio_14 + bio_15 + bio_16 + bio_17 + bio_18 + bio_19 + aspect + solrad1 + solrad2 + solrad3 + solrad4 + solrad5 + solrad6 + solrad7 + solrad8 + solrad9 + solrad10 + solrad11 + solrad12 + hf + mammals + birds + distcoasta + distlakeri + EucDistTow + EucDstAirp + EucDistFir + DistPipeli + World_MIN1 + World_MIN2 + World_MIn3 + World_MIn4 + World_MIn5 + World_MIN6 + World_MIN7 + World_Min8 + World_Min9 + World_Min10 + World_Min11 + World_Min12 + GlobalRive + WorldSlope + WorldRoden + WorldSoil2 + Model1
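A formula with this many terms is easier to maintain when it is assembled from the predictor names rather than typed by hand. The following Python sketch shows one way to do that for a subset of the names in Formula 1. The helper functions `numbered` and `build_formula` and the response column name `presence` are hypothetical, and the resulting string follows the R-style `response ~ a + b` convention only as an illustration.

```python
def numbered(prefix, count, sep=""):
    """Return names such as 'tmean_1' ... 'tmean_12'."""
    return [f"{prefix}{sep}{i}" for i in range(1, count + 1)]


def build_formula(response, predictors):
    """Join predictor names into an R-style 'response ~ a + b + c' string."""
    if not predictors:
        raise ValueError("at least one predictor is required")
    return f"{response} ~ " + " + ".join(predictors)


# A subset of the predictor names that appear in Formula 1.
predictors = (
    numbered("tmean", 12, sep="_")
    + numbered("prec", 12, sep="_")
    + ["pdensit1", "ndvi", "hf", "Model1"]
)

# 'presence' is a hypothetical column name for the binary response.
formula = build_formula("presence", predictors)
```

Building the list programmatically also makes it simple to check the predictor names against the column names of the data cube before a long model run is started.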

Using these data initially on a consumer-grade laptop (16 GB memory), we ran into a run-time memory error indicating that the run is not executable on a common laptop machine and thus cannot be completed as a model prediction without removing data or simplifying the prediction model. This bottleneck has thus far prevented progress. So here we tried to overcome this computing bottleneck with supercomputing in a cloud-computing environment from the Oracle Cloud Infrastructure (an Oracle for Research computing credit grant provided to FH).

An Oracle Cloud virtual machine instance running Oracle Linux 8 was accessed via SSH through Windows Powershell. Installed on the machine was R 4.2.2. Details of the virtual machine are shown in Table 2. Those settings are not on the extreme side of cloud-computing but are sufficient to have the RandomForest run completed on the Big Data set that otherwise would not have been solved. It presents a showcase of the feasibility, magnitude, and potential of the workflow presented in this study, allowing many subsequent applications and presenting vast potential.

Table 2.

Supercomputing settings.

Oracle cloud metric | Description
Computer system | Linux
Memory (CPU Capacity) | 1024 GB
OCPU count | 64
Machine shape | VM.Standard.E4.Flex
Internet bandwidth | 40 Gbps
Cores | AMD EPYC 7113

Model assessment

For a robust inference, model predictions are to be assessed for validity11. Ideally, that is done with different lines of evidence. While we have exhausted all known publicly available data sources for this species, as available in GBIF.org and43, here we inquired with several alternative and more recent data sources beyond 2019, such as vetted bird watching listservs and citizen science web portals, e.g. iNaturalist (https://www.inaturalist.org/; new data collected).

Results

Data

We were able to compile the best publicly available distribution occurrence dataset for Great Gray Owls (GGOW) in Alaska; it covers a unique time period from 1880 to 2019, and is a testable quantified research component useable as a point data set (n = 410) in a CSV (ASCII) format, originating from various sources now existing as a GIS shapefile (see in Data Availability section, Appendix 3a within).

Further, we compiled and make available the entire underlying GIS predictor set of over 100 GIS layers for Alaska (see in Data Availability section, Appendix 2 within).

Both data sets are described with FGDC ISO compliant metadata in XML & HTML format (see also the respective Data Availability section, Appendix within) so that the data can be understood, making the metadata an inherent outcome of this multi-year study.

Model run

For the first time, we were able to complete an open access and open source workflow using Big Data for GGOW for a basic ensemble model algorithm (RandomForest) in the R environment run on a cloud computing workstation. We obtained a good model convergence (Fig. 3). The model ran for c. 8 h, and some of the figures required about 1 h more overall to complete. The memory usage of the model run was up to 80% (of the assigned 1,024 GB).

Figure 3.


RandomForest model fit (error) by number of trees, showing a good and fast model fit.

Figure 4 shows the variable importance ranks of the 100 predictors we used; these ranks form the basis for the subsequent predictions (Fig. 5) and their meaning is discussed further in the next section.

Figure 4.


Variable importance using two metrics (MSE, node purity) showing a variety of ecological predictors driving the GGOW occurrence with some predictor groups dominating, e.g. human impacts.
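The MSE-based importance metric can be illustrated with a simplified permutation sketch in Python: one predictor column is shuffled and the resulting increase in mean squared error is recorded. This is not the randomForest implementation, which evaluates permutations on out-of-bag samples per tree; here the whole table is used for brevity. The toy model, the data and the function names are hypothetical.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of model(row) against the targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, feature, seed=0):
    """Increase in MSE after shuffling one feature column across rows."""
    baseline = mse(model, rows, targets)
    values = [r[feature] for r in rows]
    random.Random(seed).shuffle(values)
    permuted = [dict(r, **{feature: v}) for r, v in zip(rows, values)]
    return mse(model, permuted, targets) - baseline


# Toy stand-in for a fitted model: it uses 'hf' and ignores 'noise'.
def toy_model(row):
    return 2.0 * row["hf"]


# Made-up data: 50 rows with a distinct 'hf' value each.
rows = [{"hf": i / 49, "noise": (i * 7) % 5} for i in range(50)]
targets = [toy_model(r) for r in rows]

importance = {
    name: permutation_importance(toy_model, rows, targets, name)
    for name in ("hf", "noise")
}
```

In this toy case shuffling `noise` leaves the error unchanged, while shuffling `hf` increases it, which is the pattern a high importance rank reflects.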

Figure 5.


Great Gray Owl raw predictions in the study area of Alaska using randomForest; the relative index of occurrence (RIO) is shown along a color gradient of red (predicted presence) and green (predicted absence).

Model predictions and accuracy

The map shown in Fig. 5 is the first prediction using machine learning ensembles and Big Data ever completed for Great Gray Owls (GGOWs) in Alaska and around the globe using a cloud-computing environment.

Our prediction result shows hotspots and coldspots for GGOWs in Alaska, the state with the largest protected area system in the U.S. However, our predicted ecological niche of GGOW does not match well with traditional range maps: in the predicted ecological niche the hotspots are primarily found along roads and urban areas, as well as human settlements (villages) and industrial areas, including some coastal zones and the Arctic tundra. In contrast, the predicted coldspots are seen in western Alaska and in other vast sections of Alaska’s wilderness, including many protected areas and some wilderness regions. According to the predicted ecological niche (as per11 and citations within) transferred from the geographic niche, this is a robust quantifiable finding to test further (details shown below for evidence and confidence).

For a wider inference, it becomes clear from Fig. 4 that a multivariate set of ecological predictors (at least 20) drives the occurrence of GGOWs in Alaska: not just a few single predictors, but a wider range of predictors acting together in synergy across a wide environmental spectrum. In contrast, a parsimonious approach does not capture GGOW’s distribution in Alaska and must be biased, adding variance. Seen from that angle, however, the predictor group that is directly related to human impacts and urbanization stands out (Figs. 4 and 5), whereas the more typical ecological niche predictors like climate and landcover seem to play a much lower role and are overruled by human/urban predictors. Figures 4, 6 and 7 make clear that GGOWs are found in habitats with a high human footprint, and/or occur next to them, but usually not far away from them or in the remote wilderness. Lakes and fires (54; for underlying ecology see55–57) could be a secondary, weak relationship for GGOW habitats. The predictors of Distance to coast and Proximity to Airports deserve more attention (many predictions are in coastal areas, and a few GGOW presence records come from the Federal Bird Strike airport database (https://wildlife.faa.gov/); as per43). The predictors related to human cities and towns, human footprint, distance to pipeline and human density are among the leading predictors for GGOWs, out of a diverse set of 100 predictors overall (their variable importance ranks are shown in Fig. 4). GGOWs are known to rely on small mammals for prey (e.g.58). But noteworthy in our model findings is the high rank of the predictor called ‘model 1’, which is the predicted range of the 60+ bark beetle species community59. The correlation of GGOWs with bark beetles is a new finding that has never been described before (see60 for a traditionally reported small mammal link) and should be pursued more in future research projects.

Figure 6.


(a–c) Partial dependence plots of the top three predictors using MSE (hf, pdens, hlake).

Figure 7.


(a,b) Partial dependence plots of top two predictors using node purity (EucDistFir, EucDistPipe; the other two partial dependence plots of this group are already shown in Fig. 6).
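For readers unfamiliar with partial dependence plots, the underlying calculation can be sketched in a few lines of Python: one predictor is fixed to each value of a grid for every row, and the model predictions are averaged. The toy model, its coefficients and the data are hypothetical stand-ins, not the fitted forest from this study; only the predictor names `hf` and `pdens` are borrowed from the figure captions.

```python
def partial_dependence(model, rows, feature, grid):
    """Average model prediction with `feature` fixed to each grid value."""
    curve = []
    for value in grid:
        preds = [model(dict(r, **{feature: value})) for r in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


# Toy stand-in for a fitted model (hypothetical coefficients).
def toy_model(row):
    return 0.5 * row["hf"] + 0.25 * row["pdens"]


# Made-up rows for illustration only.
rows = [
    {"hf": 0.0, "pdens": 1.0},
    {"hf": 1.0, "pdens": 3.0},
    {"hf": 2.0, "pdens": 5.0},
]

pd_curve = partial_dependence(toy_model, rows, "hf", grid=[0.0, 1.0, 2.0])
```

For this additive toy model the curve for `hf` rises linearly, because the contribution of `pdens` is averaged out to a constant offset. A fitted forest can yield non-linear curves, which is what such plots are used to reveal.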

What is the meaning of ‘background’ in binary presence/pseudo-absence models? Here we model binary predictions in the absence of ‘confirmed absence’ data points for this species (as shown in47,60). However, while meaningful absence data is missing for GGOWs in Alaska, e.g. a Breeding Bird Atlas, here we use a 1 km sample from all of Alaska and its diverse habitats making it a next-to-perfect comparison with the best-available presence records of GGOWs61, covering a unique time period 1880–2019.

We attribute the mismatches with traditional GGOW maps to a lack of data, some parsimony perspectives and methods, previously insufficient predictor sets, and plain human expert assessment and perception errors11,62. The ML/AI methods we present as a Super SDM can help to overcome those problems. The Super SDM also disproves the ‘human-desired’ distribution range of the ‘Phantom of the North’. At minimum, it shows a quantified and testable predicted ecological niche for GGOW to work from, along with a repeatable workflow.

How good and valid are the predictions achieved?

Using the Receiver Operating Characteristic11,64,65, our internal prediction accuracy shows a ROC value of over 90% for Alaska’s lattice points (as provided by the software as a standard performance metric;11 and citations within). Alternative assessment data are more powerful but few (see overview in43 for GGOW). However, as shown in Fig. 8, the existing ones at least fully confirm the model for the survey areas with high accuracy; the model predictions match the training data ‘very well’ (= almost a 100% match for locations tested) using recent bird watching records and iNaturalist records, extending the data set by c. 1% of the training data.
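Assuming the reported ROC value refers to the area under the ROC curve (AUC), the metric can be computed from predicted scores and binary labels with the rank-sum formulation sketched below in Python. The function name `roc_auc` and the example scores are hypothetical; this is an illustration of the metric, not the routine used by the software in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation.

    scores: predicted values (higher = more likely presence)
    labels: 1 for presence, 0 for background
    Tied scores receive the average of their ranks.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Made-up scores: presences score higher than background points.
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
reversed_order = roc_auc([0.1, 0.9], [1, 0])
tied = roc_auc([0.5, 0.5], [1, 0])
```

An AUC of 1.0 means every presence scores above every background point, 0.5 corresponds to no discrimination, and a value above 0.9 is in that sense close to a full separation of the two classes.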

Figure 8.


GGOW predictions from the RF model run in ‘the cloud’ supercomputing overlaid with the training data (black dots). In addition, alternative Great Gray Owl sightings are overlaid: (a) detailed field assessment from Andrews (2019), and (b) recent sightings of the last 4 years from citizen efforts like birding listservers (b1, b2), iNaturalist (b3–5) and Xeno-Canto (b6; 2 entries). These represent approximately an additional 1% of the training data available for this ‘elusive’ species.

GGOWs are widely described as a species of ‘the taiga’, e.g. in Google. Thus far, there are not many GGOW records for Alaska beyond the Brooks Range and the Arctic Tundra, but some exist (Fig. 5 and evaluation data; Fig. 8). However, already in adjacent Canada and in the Old World, GGOWs are reported at those latitudes and at higher northern latitudes. A sound recording was made in the Arctic area that we predict (for the Alaska-Canada border see https://xeno-canto.org/species/Strix-nebulosa). While prey abundance is generically high in those areas, thus far it is not known whether the model output there predicts the realized niche or indicates a sister taxon, e.g. the snowy owl. Arguably, with an increased shrubification of the Arctic the boreal ecosystem is already moving north, allowing for perch sites of GGOW with prey.

Overall, the prediction results from the workflow we present are, thus far, difficult to beat for evidence, or to show wrong with the empirical data at hand (see Fig. 8). They are far from overprediction, e.g. for wilderness and protected areas. Until better data are available, specifically GGOW presences and absences, or until nest, migration and telemetry data and expert information for GGOW are provided open access (e.g. from NGOs or governmental records), our results remain as good as they get and are to be used for management for some time to come. All data are publicly available for that reason and allow for extension, assessments, updates and improvements as needed in a quantified open access fashion.

Discussion

Here we present for the first time the best-available Open Access data for the Great Gray Owl (GGOW) as well as its 100+ geographic information system (GIS) habitat predictors for Alaska with ISO compliant metadata for a public audience. This presents the largest and most modern data set (“Big Data”) ever compiled for this species, its environment, and the state of Alaska (= the area in the U.S. with the largest wilderness and protected area system left) covering data from 1880 to 2019 and beyond (assessment data 2019 onwards).

Further, we were able to run the first Alaska-wide Super SDM model of GGOW predictions from such data. Super SDMs can have limitations dependent on the data used and should always be assessed with several lines of independent evidence. They are not the ultimate and final statement on species-habitat associations, but they come close34. At minimum, they are low-cost rapid assessments capturing data quantitatively in time and space. They are also a great leap forward toward being more ecological and more inclusive of all information and synergies available, setting a new stage for species-habitat assessments11.

Beyond the data provided, the other strength of this work consists of the conceptual use and workflow of an ensemble model applied in a powerful cloud computing (supercomputer) environment, allowing for overcoming a traditional computational bottleneck using 100 predictors for new findings that were not able to be achieved before for inference. Overcoming the technical limitations of memory that come with the traditional computing environment allowed here a showcase for new computational and biological insights and progress, e.g. that GGOWs associate consistently with a high human footprint.

We followed the approach by Leo Breiman48,49 to infer from the prediction, as well as that of Jerome Friedman (cited in11,30): ‘many weak learners create a strong learner’. The actual base code was made available (see Data Availability section, Appendix 4 within) for improvements, and the results were mapped in Open Source GIS for further use and application. Arguably, these ML models can be tested, improved and extended in various ways (for instance, the randomForest version in R can usually be challenged by Leo Breiman’s code in the Minitab Salford Predictive Modeler System; https://www.minitab.com/en-us/products/spm/). But here we show a proof of concept with all settings, allowing Super SDMs to be run and established in a quantified and testable fashion.

We further pursued the concept of data mining, which keeps raw data and potential outliers ‘as is’, because that is a more powerful approach to the vast and otherwise accurate dataset. It leaves the actual ML algorithm to resolve problems and find the best prediction, rather than relying on biased human perception, assumptions, human errors11,65,66, and human meddling with a wealth of data and model settings within a complex ecological setting that is widely not understood (23,63,68; see11,65 for alternatives). The same applies to the concept of overfitting (better referred to as a full fit, as per11); randomForest is designed on the principle of ‘bagging’, which tends to avoid overfitting in the default setting and includes a robust handling of outliers and autocorrelation11.
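The bagging principle, and the idea that many weak learners create a strong learner, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the randomForest code used in the study: the weak learner is a one-split decision stump, the single predictor is a hypothetical human-footprint score, and the toy data (including one outlier that is kept ‘as is’) are invented for the example.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Fit a one-split weak learner on (x, label) pairs.

    Returns the threshold with the fewest training errors, where the stump
    predicts presence (1) when x > threshold and absence (0) otherwise.
    """
    xs = sorted({x for x, _ in sample})
    candidates = [xs[0] - 1.0] + xs  # one value below the minimum, then each x

    def errors(t):
        return sum((1 if x > t else 0) != y for x, y in sample)

    return min(candidates, key=errors)


def bagged_stumps(data, n_learners=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_learners):
        boot = [rng.choice(data) for _ in data]
        stumps.append(fit_stump(boot))
    return stumps


def predict(stumps, x):
    """Majority vote over all stumps."""
    votes = Counter(1 if x > t else 0 for t in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (human-footprint score, presence label).
# The presence at 1.0 is an outlier and is deliberately left in.
data = [(0.5, 0), (1.0, 1), (1.5, 0), (2.0, 0), (2.5, 0), (3.0, 0),
        (7.0, 1), (7.5, 1), (8.0, 1), (8.5, 1), (9.0, 1), (9.5, 1)]
stumps = bagged_stumps(data)
print("vote at low footprint (2.0): ", predict(stumps, 2.0))
print("vote at high footprint (8.0):", predict(stumps, 8.0))
```

Because each stump sees a different bootstrap sample, the single outlier sways only some of them, and the majority vote smooths it out. A random forest adds random predictor subsets and full trees on top of this idea.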

Biologically, it is known that GGOW populations and their subsequent habitat needs are somewhat cyclic58,66–68; here we present the year-round average ecological niche across decades of observations with a testable and quantified prediction. From the raw data and predictions one can already easily show that GGOW is not a ‘phantom of the north’ (38, see also69) but a circumpolar species that also occurs in more southern areas70,71, e.g. in coastal areas and at latitudes of 40 degrees North72-77, and thus has been living for a long time in a highly urbanized, industrial, forestry and farming landscape among humans in the “Total Anthropocene” (78; for specific GGOW examples in its range see79-86). GGOWs do associate with a high human footprint. In Alaska, although well known and enthusiastically reported87-89, the GGOW is quite a rare sighting, but it is clearly affiliated with human landscapes43. A solid description and an effective GGOW conservation plan with an associated budget exist elsewhere (see90 for Oregon,91,92 for national forest practices) but are widely missing in (urban) Alaska (93,94; see95-100 for specific GGOW field protocols to be used; see101 for Alaska). Using a Super SDM, we can further infer102 and confirm that the GGOW in Alaska (the state with the biggest wilderness in the U.S., holding its largest national park system) is in essence an urbanized bird that associates with industrial infrastructure, pipelines, roads, urbanized centers and farming. In contrast, the vast tracts of Alaska, e.g. western Alaska, interior Alaska and protected areas, are widely free of reported GGOW sightings and of high numbers/clusters (that is true for the raw data as well as for the predictions of the ecological niche using over 100 predictors). Essentially, our finding flips how this species must be perceived and managed (e.g. opposite from81,103).

As a minimum estimate, we find that the GGOW is an urbanized species, detected thus far primarily in association with humans and man-made habitats (104; this habitat link can cycle somewhat over the years, and it is even stronger during migration and in wintering areas, as found for a long time already in Alberta and Manitoba, Canada72,95,105,106, and in the Old World107; contrast with93). A question remains for GGOWs in the high Arctic: does the species occur there much, or does a sister taxon like the Snowy Owl occupy that niche? Arguably, prey is abundant there for GGOW, and so are perching options.

How generalizable are the ecological niche predictions for inference, and for the realized niche? In the wide absence of any relevant research design specific to GGOW (see108-110 for road bias and how it can be resolved), of representative sampling, and of an Alaskan Bird Atlas and Nesting Survey for that matter (compare with Birds of Yukon111, or bird banding/ringing work elsewhere in the GGOW range, e.g.112), and given unsubstantiated narratives113, this question currently cannot be answered with ultimate accuracy (compare with114; see101 for owls in Southeast Alaska). Table 3 shows that more data and information exist that could actually be used, but unfortunately they are not presented to us, communicated to the public, or made available for public or scientific use. It is clear that much avian and raptor research was done but not shared, and thus opportunity was left unused; this is a generic pattern in wildlife-related research, specifically in Alaska, and for ML/AI applications (see for instance11,115,116). As SDMs can indeed generalize11,28, here we used all publicly available GGOW information humanly possible to date, from 1880 onwards, in order to achieve the goals.

Table 3.

Data sources for Great Gray Owls in Alaska.

| Data source name | Content (a) | Open access | Used in study | Notes |
|---|---|---|---|---|
| GBIF | Presence | Yes | Yes | Training Data |
| Alaska Museum | Presence | Partly | No | Partly in GBIF already, incomplete data set of specimen only |
| eBird | Presence | Yes | Yes | Training Data |
| Birdwatch List-server | Presence | Yes | Yes | Training and Assessment Data |
| iNaturalist | Presence | Yes | Yes | Assessment Data |
| Bird Banding | Presence | No | No | Not easily available, few locations, e.g. EURING-BTO, USFWS, CWS Bird Banding Atlas |
| Xeno Canto | Sound/Presence | Yes | No | Recordings exist for Alaska-Canada Arctic boundary area, as well as near a village |
| Feederwatch | Presence | Partly | No | Insufficient coverage for Alaska |
| Xmas Bird count | Presence | Partly | No | Limited value for spatial coverage |
| Movebank | Presence | No | No | Not shared, no coverage for Alaska |
| State & Federal Agencies | Presence | No | No | Not shared, not findable, some coverage for Alaska |
| Commercial Experts/contractors and NGOs | Presence/Abundance | No | No | Unknown amount of research, data and expertise |
| Raptor Biologists/Falconers | Presence/Abundance | No | No | Entire Professional Raptor and Wildlife Societies do not share or truly promote Open Access data sharing for many years (b) |

(a) ‘Presence’ refers to an implied georeferenced location; absence is not yet considered. The data often include other information, such as abundance or attributes, which is not used here. Telemetry, data logger, nest and survey data are essential for such records.

(b) Much of this data work and its funding comes from public environmental impact studies and contracts, e.g. for wind farms, mining, oil & gas projects, and airport strike risk assessments, working on, and with, public resources.

While our model prediction assessments are ‘high’, our model prediction arguably still presents an underestimate of reality and an incomplete truth; many pixels await ground-truthing. The limits of data, research design and pseudo-absences can already limit inference (e.g.117). Cycling aspects of the Arctic and its populations are not included yet (e.g.118,119), and more focused data will fill other gaps and provide model updates. However, it is undeniable, from the raw data and the predictions alike, that GGOWs occur in human-dominated areas of Alaska. Those sightings are indeed linked with man-made, urban and industrial habitats, beyond ‘myth’. This matches other wildlife research findings in Alaska, such as50.
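To make the pseudo-absence caveat concrete, the sketch below shows one common way such background points can be drawn: uniformly at random inside a study extent, while rejecting points too close to known presences. It is an illustration under stated assumptions and not the procedure used in this study; the function name, the planar coordinates, the extent and the buffer distance are all hypothetical.

```python
import math
import random


def sample_pseudo_absences(presences, bbox, n, min_dist, seed=0, max_tries=100_000):
    """Draw n random background points inside bbox = (xmin, ymin, xmax, ymax).

    A candidate is kept only if it lies at least min_dist (same units as the
    coordinates) from every presence point.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bbox
    points = []
    tries = 0
    while len(points) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough pseudo-absences")
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if all(math.hypot(x - px, y - py) >= min_dist for px, py in presences):
            points.append((x, y))
    return points


# Hypothetical presences in projected coordinates (km), clustered near a road.
presences = [(20.0, 30.0), (22.0, 31.0), (60.0, 75.0)]
bbox = (0.0, 0.0, 100.0, 100.0)
background = sample_pseudo_absences(presences, bbox, n=50, min_dist=5.0)
print(len(background), "pseudo-absence points drawn")
```

The choice of extent, number of points and buffer distance changes what the model treats as ‘absence’, which is one reason pseudo-absences can limit inference.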

This research sets the stage for how habitat models (SDMs) can be run and improved. Leaving out predictors in the pursuit of parsimony, which is still widely done in most species-habitat work in Alaska to date, must be seen as willful: it drops hypotheses untested and knowingly creates uncertainty and bias, leaving many possible questions unanswered (see11,117,118 for a vast range of applications). In the light of Super SDMs, such scholastic work must be perceived as ignoring best-available options; arguably it has either not done its homework or does not want to use existing data and information and the easily available potential at hand, while better approaches have existed for many decades (see57,120–124 for other applications done in Alaska, and see125-131 for other disciplines).

As commonly done in wildlife applications, e.g.11,132, here we show a ‘proof of concept’ with first inference. It is primarily technical progress, and it allows for bigger impacts through improved inference related to species and habitat management, in Alaska and globally. Here we were able to set a new available and mandatory baseline for inference: we established the Super SDM. Having such concepts available allows for predictions of high accuracy (see132 for 1 m prediction resolution), specifically when it comes to impact assessments, e.g. with an optimized survey design133, projected into the future and under climate change (e.g.134-136). For Alaska, which already comes from a troubling industrial past (e.g.137), much more industrial development is on the current path in the Anthropocene. State-wide mining and nuclear reactors are now being tried and planned while the permafrost landscape melts and the boreal forest gets cut down and burns55,138, and a new major sector, seabed mining139, is exponentially on the rise. As the decaying fate of natural resources and wilderness has shown140,141, regular ‘modern’ conservation governance has widely failed in Alaska and beyond (12; see for instance Alaska’s salmon crisis, including the disappearance of King Salmon within less than 50 years under such a regime, affecting habitats and the associated thousand-year-old indigenous cultures relying on it142,143). Here we provide some quantified progress on best-available human options for global sustainability.

Acknowledgements

FH appreciates the work with the research team, specifically the incredible work and the sophisticated and visionary discussions with Dan Steinberg, Salford Systems, and Minitab-Salford support. There are many students and project co-workers to acknowledge for their great work, specifically Sid Sriram, the great Hazel Berrios, Ela Huettmann, Sophia Linke and the impressive ‘team Chrome’. The Hoodoo UNAC cluster is a great and kind resource. The Andrews family is also kindly acknowledged for their encouragement. This work was supported in part by Oracle Cloud credits and related resources provided by Oracle for Research. This is EWHALE lab publication # 301.

Author contributions

The data were compiled by P.A. and F.H. using public sources, online repositories and public inquiry. Models were initiated by P.A. under F.H.'s supervision and run by F.H.; this updated work was supported and discussed by all members of the author team, who also contributed further discussions and text edits. Bits and pieces of the workflow originate from F.H.'s earlier work with M.C., M.S., A.K.S. and J.P. over previous years.

Data availability

Data are shared Open Access, as per Methods and Appendix at the following URL https://drive.google.com/drive/u/0/folders/1rz3ZW3xplvdEf8LDu-d7-1BDXF6XxNMY, and also available from the authors on request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Phillip Andrews and Jacques Philip are deceased.

Change history

6/21/2024

The original online version of this Article was revised: The Acknowledgements have been corrected.

References

  • 1.Huettmann F. Economic growth and wildlife conservation in the North Pacific Rim. In: Gates E, Trauger D, editors. Peak Oil, Economic Growth, and Wildlife Conservation. Island Press; 2014. pp. 133–156. [Google Scholar]
  • 2.Huettmann F. Climate change effects on terrestrial mammals: A review of global impacts of ecological niche decay in selected regions of high mammal importance. Encycl. Anthropocene. 2017;2(2018):123–130. [Google Scholar]
  • 3.Silvy NJ, editor. The Wildlife Techniques Manual: Volume 1: Research. Volume 2: Management. JHU Press; 2020. [Google Scholar]
  • 4.McArdle BH. The structural relationship: Regression in biology. Can. J. Zool. 1988;66(11):2329–2339. doi: 10.1139/z88-348. [DOI] [Google Scholar]
  • 5.Whittingham MJ, Stephens PA, Bradbury RB, Freckleton RP. Why do we still use stepwise modelling in ecology and behaviour? J. Anim. Ecol. 2006;75(5):1182–1189. doi: 10.1111/j.1365-2656.2006.01141.x. [DOI] [PubMed] [Google Scholar]
  • 6.Royle J, Nichols J. Estimating abundance from repeated presence-absence data or point counts. Ecology. 2003;84:777–790. doi: 10.1890/0012-9658(2003)084[0777:EAFRPA]2.0.CO;2. [DOI] [Google Scholar]
  • 7.Manly BFL, McDonald L, Thomas DL, McDonald TL, Erickson WP. Resource Selection by Animals: Statistical Design and Analysis for Field Studies. Springer; 2007. [Google Scholar]
  • 8.Guillera-Arroita G, Lahoz-Monfort JJ, MacKenzie DI, Wintle BA, McCarthy MA. Ignoring imperfect detection in biological surveys is dangerous: A response to ‘fitting and interpreting occupancy models'. PLoS ONE. 2014;9(7):e99571. doi: 10.1371/journal.pone.0099571. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Guthery FS, Brennan LA, Peterson MJ, Lusk JJ. Information theory in wildlife science: Critique and viewpoint. J. Wildl. Manag. 2005;69(2):457–465. doi: 10.2193/0022-541X(2005)069[0457:ITIWSC]2.0.CO;2. [DOI] [Google Scholar]
  • 10.Arnold TW. Uninformative parameters and model selection using Akaike's Information Criterion. J. Wildl. Manag. 2010;74:1175–1178. [Google Scholar]
  • 11.Humphries GRW, Magness DR, Huettmann F, editors. Machine Learning in Ecology and Sustainable Resource Management. Springer; 2018. [Google Scholar]
  • 12.Peterson MN, Nelson MP. Why the North American model of wildlife conservation is problematic for modern wildlife management. Hum. Dimens. Wildl. 2017;22(1):43–54. doi: 10.1080/10871209.2016.1234009. [DOI] [Google Scholar]
  • 13.Liu J, Dou Y, Batistella M, Challies E, Conno T, Friis C, Millington JDA, Parish E, Romulo CL, Bicudo Silva RF, Triezenberg H, Yang H, Zhao Z, Zimmerer KS, Huettmann F, Treglia ML, Basher Z, Chung MG, Herzberger A, Lenschow A, Mechiche-Alami A, Newig J, Roch J, Sun J. Spillover systems in a telecoupled Anthropocene: Typology, methods, and governance for global sustainability. Environ. Sustain. 2018;33:58–69. doi: 10.1016/j.cosust.2018.04.009. [DOI] [Google Scholar]
  • 14.Friedman J, Hastie T, Tibshirani R. Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors) Ann. Stat. 2000;28(2):337–407. doi: 10.1214/aos/1016218223. [DOI] [Google Scholar]
  • 15.Fernandez-Delgado M, Cernadas E, Barro S. Do we need hundreds of classifiers to solve real-world classification problems? J. Mach. Learn. Res. 2014;15:3133–3181. [Google Scholar]
  • 16.Grossman, R., Seni, G., Elder, J., Agarwal, N. & Liu, H. Ensemble methods in data mining: Improving accuracy through combining predictions. Data Mining and Knowledge Discovery (2010).
  • 17.Kandel K, Huettmann F, Suwal MK, Regmi GR, Nijman V, Nekaris KAI, Lama ST, Thapa A, Sharma HP, Subedi TL. Rapid multi-nation distribution assessment of a charismatic conservation species using open access ensemble model GIS predictions: Red Panda (Ailurus fulgens) in the Hindu-Kush Himalaya region. Biol. Cons. 2015;181:150–161. doi: 10.1016/j.biocon.2014.10.007. [DOI] [Google Scholar]
  • 18.Hao T, Elith J, Lahoz-Monfort JJ, Guillera-Arroita G. Testing whether ensemble modelling is advantageous for maximising predictive performance of species distribution models. Ecography. 2020;43(4):549–558. doi: 10.1111/ecog.04890. [DOI] [Google Scholar]
  • 19.Marzluff JM, Sallabanks R, editors. Avian Conservation: Research and Management. Island Press; 1998. [Google Scholar]
  • 20.Meine C, Soule M, Noss RF. “A mission-driven discipline”: The growth of conservation biology. Conserv. Biol. 2006;20:631–651. doi: 10.1111/j.1523-1739.2006.00449.x. [DOI] [PubMed] [Google Scholar]
  • 21.Mahoney SP, Geist V, editors. The North American Model of Wildlife Conservation. Johns Hopkins University Press; 2019. [Google Scholar]
  • 22.McGarigal K, Cushman SA, Stafford S. Multivariate Statistics for Wildlife and Ecology Research. Springer; 2013. [Google Scholar]
  • 23.Boulanger-Lapointe N, Ágústsdóttir K, Barrio IC, Defourneaux M, Finnsdóttir R, Jónsdóttir IS, et al. Herbivore species coexistence in changing rangeland ecosystems: First high resolution national open-source and open-access ensemble models for Iceland. Sci. Total Environ. 2022;845:157140. doi: 10.1016/j.scitotenv.2022.157140. [DOI] [PubMed] [Google Scholar]
  • 24.Douglas, D. C. 2006. The Douglas Argos-Filter Algorithm. Available at alaska.usgs.gov/science/biology/spatial/douglas.html
  • 25.McIntyre CL, Lewis SB. Statewide movements of non-territorial Golden Eagles in Alaska during the breeding season: Information for developing effective conservation plans. Alaska Park Sci. 2018;17:65–73. [Google Scholar]
  • 26.Elith J, Graham CH, Anderson RP, Dudik M, Ferrier S, Guisan A, Hijmans RJ, Huettmann F, Leathwick JR, Lehmann A, Li J, Lohmann LG, Loiselle BA, Manion G, Moritz C, Nakamura M, Nakazawa Y, Overton JM, Peterson AT, Phillips SJ, Richardson KS, Scachetti-Pereira R, Schapire RE, Soberon J, Williams S, Wisz MS, Zimmermann NE. Novel methods improve prediction of species’ distributions from occurrence data. Ecography. 2006;29:129–151. doi: 10.1111/j.2006.0906-7590.04596.x. [DOI] [Google Scholar]
  • 27.Elith J, Graham CH, Valavi R, Abegg M, Bruce C, Ford A, Guisan A, Hijmans RJ, Huettmann F, Lohmann L, Loiselle B, Moritz C, Overton J, Peterson AT, Phillips S, Richardson K, Williams SE, Wiser SK, Wohlgemuth T, Zimmermann NE. Presence-only and presence-absence data for comparing species distribution modeling methods. J. Biodivers. Inform. 2020;15:69–80. doi: 10.17161/bi.v15i2.13384. [DOI] [Google Scholar]
  • 28.MacKenzie D, Nichols J, Royle J, Pollock K, Bailey L, Hines J. Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurrence. 2. Elsevier; 2017. [Google Scholar]
  • 29.Guisan A, Thuiller W. Predicting species distribution: Offering more than simple habitat models. Ecol. Lett. 2005;8:993–1009. doi: 10.1111/j.1461-0248.2005.00792.x. [DOI] [PubMed] [Google Scholar]
  • 30.Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer; 2009. pp. 1–758. [Google Scholar]
  • 31.Whittington KE. Originalism: A critical introduction. Fordham L. Rev. 2013;82:375. [Google Scholar]
  • 32.Cross F. The Failed Promise of Originalism. Stanford University Press; 2013. [Google Scholar]
  • 33.Naess A. The Ecology of Wisdom: Writings by Arne Naess. Catapult; 2009. [Google Scholar]
  • 34.Steiner, M. & Huettmann, F. (in review). With Super SDMs (Machine Learning, Open Access Big Data, and The Cloud) towards a more holistic and inclusive inference: Insights from progressing the marginalized case of the world’s squirrel hotspots and coldspots. Scientific Reports. [DOI] [PMC free article] [PubMed]
  • 35.Guisan A, Zimmermann NE. Predictive habitat distribution models in ecology. Ecol. Model. 2000;135(2–3):147–186. doi: 10.1016/S0304-3800(00)00354-9. [DOI] [Google Scholar]
  • 36.Zimmermann NE, Edwards TC, Jr, Graham CH, Pearman PB, Svenning JC. New trends in species distribution modelling. Ecography. 2010;33(6):985–989. doi: 10.1111/j.1600-0587.2010.06953.x. [DOI] [Google Scholar]
  • 37.Steiner M, Huettmann F. Sustainable Squirrel Conservation. Springer; 2023. [Google Scholar]
  • 38.Nero RW. The Great Gray Owl: Phantom of the Northern Forest. Smithsonian Institution Press; 1980. [Google Scholar]
  • 39.Krakauer J. Into the Wild. Pan Macmillan; 2018. [Google Scholar]
  • 40.Alaska Center for Conservation Science (ACCS). 2016. Alaska GAP Analysis Project. University of Alaska Anchorage. akgap.uaa.alaska.edu. Accessed on July 20, 2019
  • 41.Audubon (2019). Great Gray Owl Strix nebulosa. https://www.audubon.org/field-guide/bird/great-gray-owl. Accessed online on April 14, 2019.
  • 42.Sriram, S. & Huettmann, F. (unpublished). A Global Model of Predicted Peregrine Falcon (Falco peregrinus) Distribution with Open Source GIS Code and 104 Open Access Layers for use by the global public. Journal of Earth System Science Data.
  • 43.Andrews, P. Great Grey Owl Habitat Association. University of Alaska Fairbanks (2019).
  • 44.Dickinson JL, Shirk J, Bonter D, Bonney R, Crain RL, Martin J, Phillips T, Purcell K. The current state of citizen science as a tool for ecological research and public engagement. Front. Ecol. Environ. 2012;10(6):291–297. doi: 10.1890/110236. [DOI] [Google Scholar]
  • 45.Sauermann H, Franzoni C. Crowd science user contribution patterns and their implications. Proc. Natl. Acad. Sci. (USA) 2015;112(3):679–684. doi: 10.1073/pnas.1408907112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Bull EL, Henjum MG, Rohweder RS. Nesting and foraging habitat of great gray owls. J. Raptor Res. 1988;22(4):107–115. [Google Scholar]
  • 47.Barbet-Massin M, Jiguet F, Albert CH, Thuiller W. Selecting pseudo-absences for species distribution models: How, where, and how many? Methods Ecol. Evol. 2012;3:327–338. doi: 10.1111/j.2041-210X.2011.00172.x. [DOI] [Google Scholar]
  • 48.Breiman L. Random forests. Machine learning. 2001;45:5–32. doi: 10.1023/A:1010933404324. [DOI] [Google Scholar]
  • 49.Breiman L. Statistical modeling: The two cultures (with comments and a rejoinder By the author) Stat. Sci. 2001;16:199–231. doi: 10.1214/ss/1009213726. [DOI] [Google Scholar]
  • 50.Huettmann F, Kövér L, Robold R, Spangler M, Steiner M. Model-based prediction of a vacant summer niche in a subarctic urbanscape: A multi-year open access data analysis of a ‘niche swap’ by short-billed Gulls. Ecol. Inform. 2023;78:102364. doi: 10.1016/j.ecoinf.2023.102364. [DOI] [Google Scholar]
  • 51.Cutler DR, Edwards TC, Beard KH, Cutler A, Hess KT, Gibson J, Lawler JJ. Random forests for classification in ecology. Ecology. 2007;88(11):2783–2792. doi: 10.1890/07-0539.1. [DOI] [PubMed] [Google Scholar]
  • 52.Mueller JP, Massaron L. Machine Learning for Dummies. Wiley; 2016. [Google Scholar]
  • 53.Mi C, Huettmann F, Guo Y, Han X, Wen L. Why to choose Random Forest to predict rare species distribution with few samples in large undersampled areas? Three Asian crane species models provide supporting evidence. PeerJ. 2017 doi: 10.7717/peerj.2849. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Hannah KC, Hoyt JS. Northern Hawk Owls and recent burns: Does burn age matter? The Condor. 2004;106:420–423. doi: 10.1093/condor/106.2.420. [DOI] [Google Scholar]
  • 55.Kasischke ES, Williams D, Barry D. Analysis of the patterns of large fires in the boreal forest region of Alaska. Int. J. Wildl. Fire. 2002;11:131–144. doi: 10.1071/WF02023. [DOI] [Google Scholar]
  • 56.Fisher JT, Wilkinson L. The response of mammals to forest fire and timber harvest in the North American boreal forest. Mammal Rev. 2005;35(1):51–81. doi: 10.1111/j.1365-2907.2005.00053.x. [DOI] [Google Scholar]
  • 57.Loehman, R. Landscape effects of fire frequency and severity on boreal Alaskan landscapes. USGS (2016). https://alaska.usgs.gov/science/program.php?pid=18. Accessed on November 20, 2017.
  • 58.Bull, E. L. & Henjum, M. G. Ecology of the great gray owl. General Technical Report. PNW-GTR-265. Portland, Oregon: USDA Forest Service. Pacific Northwest Research Station (1990).
  • 59.Zabihi, K., Huettmann, F. & Young, B. Predicting multi-species bark beetle (Coleoptera: Curculionidae: Scolytinae) occurrence in Alaska: First use of open access big data mining and open source GIS to provide robust inference and a role model for progress in forest conservation. Biodiversity Informatics 1–15 (2021). https://journals.ku.edu/jbi/issue/current
  • 60.Solheim R, Oien IJ, Sonerud GA. How does the Great Grey Owl manage when small rodents are in short supply? Var Fuglefauna. 2015;38(3):118–123. [Google Scholar]
  • 61.Lobo JM, Jimenez-Valverde A, Hortal J. The uncertain nature of absences and their importance in species distribution modelling. Ecography. 2010;33:103–114. doi: 10.1111/j.1600-0587.2009.06039.x. [DOI] [Google Scholar]
  • 62.Perera AH, Drew CA, Johnson CJ. Expert Knowledge and Its Application in Landscape Ecology. Springer; 2012. [Google Scholar]
  • 63.Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: A fundamental evaluation tool in clinical medicine. Clin. Chem. 1993;39:561–577. doi: 10.1093/clinchem/39.4.561. [DOI] [PubMed] [Google Scholar]
  • 64.Fielding AH, Bell JF. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ. Conserv. 1997;24(1):38–49. doi: 10.1017/S0376892997000088. [DOI] [Google Scholar]
  • 65.Drew CA, Wiersma YF, Huettmann F, editors. Predictive Species and Habitat Modeling in Landscape Ecology: Concepts and Applications. Springer; 2011. [Google Scholar]
  • 66.Krebs CJ, Boutin S, Boonstra R. Ecosystem Dynamics of the Boreal Forest. Oxford University Press; 2001. [Google Scholar]
  • 67.Lehikoinen A, Ranta E, Pietianinen H, Byholm P, Saurola P, Valkama J, Huitu O, Korpimaki E. The impact of climate and cyclic food abundance on the timing of breeding and brood size in four boreal owl species. Oecologia. 2011;165:349–355. doi: 10.1007/s00442-010-1730-1. [DOI] [PubMed] [Google Scholar]
  • 68.Hipkiss T, Stefansson O, Hornfeldt B. Effect of cyclic and declining food supply on great grey owls in boreal Sweden. NRC research press web. Can. J. Zool. 2008;86:1426–1431. doi: 10.1139/Z08-131. [DOI] [Google Scholar]
  • 69.Hilden O, Helo P. The great grey owl Strix nebulosa: A bird of the Northern Taiga. Ornis Fennica. 1981;58:159–166. [Google Scholar]
  • 70.Winter, J. 1986. Status, distribution and ecology of the great gray owl (Strix nebulosa) in California [thesis]. San Francisco State University.
  • 71.NatureServe. 2009. Strix nebulosa- Forster 1772. http://explorer.natureserve.org/index.htm. Accessed on July 20, 2019.
  • 72.Bull EL, Duncan JR. Great Gray Owl (Strix nebulosa), version 2.0. In: Poole AF, Gill FB, editors. The Birds of North America. Cornell Lab of Ornithology; 1993. [Google Scholar]
  • 73.Duncan JR. Owls of the World: Their Lives, Behavior, and Survival. 1. Firefly Books; 2003. [Google Scholar]
  • 74.Konig C, Weick F. Owls of the World. 1. A&C Black Publishers Ltd.; 2008. [Google Scholar]
  • 75.Brazil M. Birds of East Asia: China, Taiwan, Korea, Japan, and Russia. A&C Black; 2009. [Google Scholar]
  • 76.Birdlife International. 2016. Strix nebulosa. The IUCN red list of threatened species 2016. E.t22689118a93218931. 10.2305/iucn.uk.2016-3.rlts.t22689118a93218931.en. Accessed online on October 2017.
  • 77.Del Hoyo J. All the Birds of the World. Lynx Edition (2020).
  • 78.Steffen W, Broadgate W, Deutsch L, Gaffney O, Ludwig C. The trajectory of the Anthropocene: The great acceleration. Anthropocene Rev. 2015;2:81–98. doi: 10.1177/2053019614564785. [DOI] [Google Scholar]
  • 79.Mikkola, H. Der bartkauz Strix nebulosa. Die Neue Brehm- Bucherei 538, Ziemsen Verlag, Wittenberg, Lutherstadt (1981).
  • 80.Bull EL, Henjum MG. The neighborly great gray owl. Nat. Hist. 1987;9:32–41. [Google Scholar]
  • 81.Hayward, G. D. & Verner, J. Flammulated, boreal, and great gray owls in the United States: A technical conservation assessment. USDA Forest Service. General Technical Report RM-253 (1994).
  • 82.Huff, M., Henshaw, J. & Laws, E. Great Gray Owl survey status and evaluation of guidelines for the Northwest Forest Plan. USDA Forest Service/Pacific Northwest Research Station (1996).
  • 83.Duncan, J. R. Movement strategies, mortality, and behavior of radio-marked Great Gray Owls in southeastern Manitoba and Minnesota. USDA Forest Service. Biology and Conservation of Northern Forest Owls. Symposium Proceedings (1987).
  • 84.Sulkava S, Huhtala K. The great gray owl (Strix nebulosa) in the changing forest environment of northern Europe. J. Raptor Res. 1997;31(2):151–159. [Google Scholar]
  • 85.Kalinowski, R. Habitat relationships of the great gray owl prey in meadows of the Sierra Nevada Mountains. The faculty of Humboldt State University (thesis) (2012).
  • 86.Vazhov SV, Bakhtin RF, Vazhov VM. Ecology of some species of owls in agricultural landscapes of the Altai region. Ecol. Environ. Conserv. 2016;22(3):1549–1557. [Google Scholar]
  • 87.Taras, M. The Alaska owlmanac. Alaska Department of Fish and Game, Division of Wildlife Conservation (2004).
  • 88.eBird. Sensitive Species in eBird. https://help.ebird.org/customer/en/portal/articles/2885265-sensitive-species-in-ebird. Accessed on June 20, 2019.
  • 89.eBird. eBird basic dataset metadata (v1.12). https://ebird.org/data/download. Accessed on May 15, 2019.
  • 90.Bryan T, Forsman ED. Distribution, abundance, and habitat of great gray owls in south-central Oregon. Murrelet. 1987;68:45–49. doi: 10.2307/3535691. [DOI] [Google Scholar]
  • 91.Wu, J. X., Loffland, H. L., Siegel, R. B. & Stermer, C. A conservation strategy for Great Gray Owls (Strix nebulosa) in California. Interim version 1.0. The Institute for Bird Populations and California Partners in Flight. Point Reyes Station, California (2016).
  • 92.Duncan JR. Great gray owls (Strix nebulosa nebulosa) and forest management: A review and recommendations. J. Raptor Res. 1997;31(2):160–166. [Google Scholar]
  • 93.ADFG. Alaska wildlife action plan. Alaska Department of Fish and Game. Juneau (2015).
  • 94.ADFG. State of Alaska FY2018 governor’s operating budget. Department of Fish and Game Wildlife Conservation Component Budget Summary (2016).
  • 95.Loch, S. L. Manitoba great gray owl project progress report. April 1, 1984 to August 1, 1985. Manitoba Department of Natural Resources. Winnipeg, Manitoba (1985).
  • 96.Fuller MR, Mosher JA. Methods of detecting and counting raptors: A review. Stud. Avian Biol. 1981;6:235–246. [Google Scholar]
  • 97.Fuller MR, Mosher JA. Raptor survey techniques. In: Pendleton BAG, Millsap BA, Cline KW, Bird DM, editors. Raptor Management Techniques Manual. National Wildlife Federation; 1987. [Google Scholar]
  • 98.Takats, D. L., Francis, C. M., Holroyd, G. L., Duncan, J. R., Mazur, K. M., Cannings, R. J., Harris, W. & Holt, D. Guidelines for nocturnal owl monitoring in North America. Beaverhill Bird Observatory and Bird Studies Canada, Edmonton, Alberta (2001).
  • 99.Quintana D, Gerhardt R, Broyles M, Dillon J, Friesen C, Godwin S, Kamrath S. Survey Protocol for the Great Gray Owl Within the Range of the Northwest Forest Plan [ver. 3.0] USDA Forest Service and USDI Bureau of Land Management; 2004. [Google Scholar]
  • 100.Beck, T. W. & Winter, J. Survey protocol for the Great Gray Owl in the Sierra Nevada of California. USDA Forest Service, Pacific Southwest Region. Vallejo, CA (2000).
  • 101.Kissling ML, Lewis SB, Pendleton G. Factors influencing the detectability of forest owls in southeastern Alaska. The Condor. 2010;112(3):539–548. doi: 10.1525/cond.2010.090217. [DOI] [Google Scholar]
  • 102.Chapman AD, Grafton O. Guide to Best Practices for Generalising Sensitive Species-Occurrence Data, Version 1.0. Global Biodiversity Information Facility; 2008. [Google Scholar]
  • 103.Keane, J. J., Ernest, H. B. & Hull, J. M. Conservation and Management of the Great Gray Owl 2007–2009: Assessment of Multiple Stressors and Ecological Limiting Factors. Report F8813-07-0611, National Park Service & U.S. Department of Agriculture, Forest Service (2011).
  • 104.Bedrosian, B., Gura, K. & Mendelsohn, B. Occupancy, nest success, and habitat use of Great Gray Owls in western Wyoming. Teton Raptor Center, Wilson, WY (2015).
  • 105.Collister, D. M. Seasonal distribution of the Great Gray Owl (Strix nebulosa) in Southwestern Alberta. General Technical Report NC., (190), 119 (1981).
  • 106.Bouchart ML. Great Gray Owl Habitat Use in Southeastern Manitoba and the Effects of Forest Resource Management. University of Manitoba (Practicum); 1991. [Google Scholar]
  • 107.Virkkala R, Marmion M, Heikkinen RK, Thuiller W, Luoto M. Predicting range shifts of northern bird species: Influence of modelling technique and topography. Acta Oecologica. 2010;36:269–281. doi: 10.1016/j.actao.2010.01.006. [DOI] [Google Scholar]
  • 108.Hanowski JAM, Niemi GJ. A comparison of on- and off-road bird counts: Do you need to go off road to count birds accurately? J. Field Ornithol. 1995;66:469–483. [Google Scholar]
  • 109.Kadmon R, Farber O, Danin A. Effect of roadside bias on the accuracy of predictive maps produced by predictive models. Ecol. Appl. 2004;14(2):401–413. doi: 10.1890/02-5364. [DOI] [Google Scholar]
  • 110.Geldmann J, Heilmann-Clausen J, Holm TE, Levinsky I, Markussen B, Olsen K, Rahbek C, Tottrup AP. What determines spatial bias in citizen science? Exploring four recording schemes with different proficiency requirements. Divers. Distrib. 2016;22:1139–1149. doi: 10.1111/ddi.12477. [DOI] [Google Scholar]
  • 111.Sinclair PH, Nixon WA, Eckert CD, Hughes NL. Birds of the Yukon Territory. UBC Press; 2003. [Google Scholar]
  • 112.Fransson, T. & Pettersson, J. Swedish bird ringing atlas volume 1, divers-raptors. Stockholm, Sweden (2001).
  • 113.Osborne, T. Great Gray Owl. Alaska Department of Fish and Game, Alaska Wildlife Notebook Series (1994). http://www.adfg.alaska.gov/index.cfm?adfg=educators.notebookseries. Accessed on September 18, 2019.
  • 114.Aycrigg J, Beauvais G, Gotthardt T, Huettmann F, Pyare S, Andersen M, Keinath D, Lonneker J, Spathelf M, Walton K. Novel approaches to modeling and mapping terrestrial vertebrate occurrence in the northwest and Alaska: An evaluation. Northwest Sci. 2015;89:355–381. doi: 10.3955/046.089.0405. [DOI] [Google Scholar]
  • 115.Thessen AE. Adoption of machine learning techniques in ecology and earth science. One Ecosyst. 2016;1:e8621. doi: 10.3897/oneeco.1.e8621. [DOI] [Google Scholar]
  • 116.The Royal Society. Machine learning: The power and promise of computers that learn by example. royalsociety.org/machine-learning. (2017).
  • 117.Valavi R, Elith J, Lahoz-Monfort JJ, Guillera-Arroita G. Modelling species presence-only data with random forests. Ecography. 2021;44(12):1731–1742. doi: 10.1111/ecog.05615. [DOI] [Google Scholar]
  • 118.Hegel TM, Verbyla D, Huettmann F, Barboza PS. Spatial synchrony of recruitment in mountain-dwelling woodland caribou. Popul. Ecol. 2012;54(1):19–30. doi: 10.1007/s10144-011-0275-4. [DOI] [Google Scholar]
  • 119.Hegel TM, Mysterud A, Huettmann F, Stenseth NC. Interacting effect of wolves and climate on recruitment in a northern mountain caribou population. Oikos. 2010;119:1453–1461. doi: 10.1111/j.1600-0706.2010.18358.x. [DOI] [Google Scholar]
  • 120.Ohse B, Huettmann F, Ickert-Bond SM, Juday GP. Modeling the distribution of white spruce (Picea glauca) for Alaska with high accuracy: An open access role-model for predicting tree species in last remaining wilderness areas. Polar Biol. 2009;32:1717–1729. doi: 10.1007/s00300-009-0671-9. [DOI] [Google Scholar]
  • 121.Booms T, Huettmann F, Schempf P. Gyrfalcon nest distribution in Alaska based on a predictive GIS model. Polar Biol. 2009;33:1601–1612. [Google Scholar]
  • 122.Young B, Yarie J, Verbyla D, Huettmann F, Herrick K, Chapin FS. Modeling and mapping forest diversity within the boreal forest of interior Alaska. Lands. Ecol. 2017;32:397–413. doi: 10.1007/s10980-016-0450-2. [DOI] [Google Scholar]
  • 123.Young, B. D., Yarie, J., Verbyla, D., Huettmann, F. & Stuart Chapin III, F. Mapping aboveground biomass of trees using forest inventory data and public environmental variables within the Alaskan Boreal Forest. In Machine Learning for Ecology and Sustainable Natural Resource Management (eds G. Humphries, D.R. Magness and F. Huettmann) 141–160 (2018).
  • 124.Baltensperger AP, Huettmann F. Predictive spatial niche and biodiversity hotspot models for small mammal communities in Alaska: Applying machine-learning to conservation planning. Lands. Ecol. 2015;30(1):681–697. doi: 10.1007/s10980-014-0150-8. [DOI] [Google Scholar]
  • 125.Dhar V. Data mining in finance: Using counterfactuals to generate knowledge from organizational information systems. Inf. Syst. 1998;23:423–437. doi: 10.1016/S0306-4379(98)00021-0. [DOI] [Google Scholar]
  • 126.Onskog J, Freyhult E, Landfors M, Ryden P, Hvidsten TR. Classification of microarrays; synergistic effects between normalization, gene selection and machine learning. BMC Bioinform. 2011;12:390. doi: 10.1186/1471-2105-12-390. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Perlich C, Dalessandro B, Raeder T, Stitelman O, Provost F. Machine learning for targeted display advertising: Transfer learning in action. Mach. Learn. 2014;95:103–127. [Google Scholar]
  • 128.Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 2015;13:8–17. doi: 10.1016/j.csbj.2014.11.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129.Isasi I, Irusta U, Elola A, Aramendi E, Ayala U, Alonso E, Kramer-Johansen J, Eftestol T. A machine learning shock decision algorithm for use during piston-driven chest compressions. IEEE Trans. Biomed. Eng. 2019;66(6):1752–1760. doi: 10.1109/TBME.2018.2878910. [DOI] [PubMed] [Google Scholar]
  • 130.Tabak MA, Norouzzadeh MS, Wolfson DW, Sweeney SJ, Vercauteren KC, Snow NP, Halseth JM, Di Salvo PA, Lewis JS, White MD, Teton B, Beasley JC, Schlichting PE, Boughton RK, Wight B, Newkirk ES, Ivan JS, Odell EA, Brook RK, Lukacs PM, Moeller AK, Mandeville EG, Clune J, Miller RS. Machine learning to classify animal species in camera trap images: Applications in ecology. Methods Ecol. Evol. 2018;10:585–590. doi: 10.1111/2041-210X.13120. [DOI] [Google Scholar]
  • 131.Rametov NM, Steiner M, Bizhanova NA, Abdel ZZ, Yessimseit DT, Abdeliyev BZ, Mussagalieva RS. Mapping plague risk using super species distribution models and forecasts for rodents in the Zhambyl Region, Kazakhstan. GeoHealth. 2023;7(11):e2023GH000853. doi: 10.1029/2023GH000853. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132.Robold, R. & Huettmann, F. High-resolution prediction of American red squirrel in interior Alaska: A role model for conservation using open access data, machine learning, GIS and LIDAR. PeerJ. https://peerj.com/articles/11830/ (2021). [DOI] [PMC free article] [PubMed]
  • 133.Hanson JO, McCune JL, Chadès I, Proctor CA, Hudgins EJ, Bennett JR. Optimizing ecological surveys for conservation. J. Appl. Ecol. 2023;60:41–51. doi: 10.1111/1365-2664.14309. [DOI] [Google Scholar]
  • 134.Magness DR, Huettmann F, Morton JM. Using random forests to provide predicted species distribution maps as a metric for ecological inventory & monitoring programs. In: Smolinski TG, Milanova MG, Hassanien A-E, editors. Applications of Computational Intelligence in Biology: Current Trends and Open Problems. Studies in Computational Intelligence. Springer; 2008. pp. 209–229. [Google Scholar]
  • 135.Euskirchen ES, McGuire AD, Chapin FS, III, Yi S, Thompson CC. Changes in vegetation in northern Alaska under scenarios of climate change, 2003–2100: Implications for climate feedbacks. Ecol. Appl. 2009;19(4):1022–1043. doi: 10.1890/08-0806.1. [DOI] [PubMed] [Google Scholar]
  • 136.Murphy, K., Huettmann, F., Fresco, N. & Morton, J. Connecting Alaska landscapes into the future: results from an interagency climate modeling, land management and conservation project. US Fish and Wildlife Service. Unpublished Report, Anchorage Alaska. (2010).
  • 137.O'Neill D. The Firecracker Boys: H-bombs, Inupiat Eskimos, and the Roots of the Environmental Movement. Basic Books; 2007. [Google Scholar]
  • 138.Viereck LA. Wildfire in the taiga of Alaska. Quat. Res. 1973;3:465–495. doi: 10.1016/0033-5894(73)90009-4. [DOI] [Google Scholar]
  • 139.Gartman, A., Mizell, K. & Kreiner, D. C. Marine minerals in Alaska—A review of coastal and deep-ocean regions. Professional Paper, (1870), 2022
  • 140.Taber RD, Payne NF. Wildlife, Conservation, and Human Welfare: A United States and Canadian Perspective. Krieger Publishing Company; 2003. [Google Scholar]
  • 141.Serreze MC, Walsh JE, Chapin FS, Osterkamp T, Dyurgerov M, Romanovsky V. Observational evidence of recent change in the northern high-latitude environment. Clim. Change. 2000;46:159–207. doi: 10.1023/A:1005504031923. [DOI] [Google Scholar]
  • 142.O’Neill, D. The fall of the Yukon kings. Arctic voices: resistance at the tipping point. Edited by S. Banerjee. Seven Stories Press, New York, 142–165. 2012.
  • 143.Robinson, M. J. The common good: Salmon science, the conservation crisis, and the shaping of Alaskan political culture. University of Alaska Fairbanks. Unpublished PhD thesis, 2015.

Data Availability Statement

Data are shared open access, as described in the Methods and Appendix, at https://drive.google.com/drive/u/0/folders/1rz3ZW3xplvdEf8LDu-d7-1BDXF6XxNMY, and are also available from the authors on request.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group