Abstract
Estimating the numbers and whereabouts of internally displaced people (IDPs) is paramount to providing targeted humanitarian assistance. In conflict settings like the ongoing Russia-Ukraine war, on-the-ground data collection is nevertheless often inadequate to provide accurate and timely information. Satellite imagery may sidestep some of these challenges and enhance our understanding of IDP dynamics. Our study thus aimed to evaluate whether internal displacement patterns can be estimated from changes in car counts using multi-temporal satellite imagery. We collected over 1000 very-high-resolution images across Ukrainian cities between 2019 and 2022, to which we applied a state-of-the-art computer vision model to detect and count cars. These counts were then linked to population data to predict displacements through ratio or non-linear models. Our findings suggest a clear East-to-West movement of cars in the first months following the war’s onset. Although data sparsity hindered fine-grained evaluation, we distinguished a clear positive and non-linear trend between the number of people and cars in most cities, which further allowed us to predict sub-national population dynamics. While our approach is resource-saving and innovative, satellite imagery and computer vision models present some shortcomings that could mask detailed IDP dynamics. We conclude by discussing these limitations and outlining future research opportunities.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-024-80035-8.
Keywords: Car Detection, Satellite Imagery, Convolutional Neural Network, Crisis Response, Migration, Societal Computing
Subject terms: Computer science, Psychology and behaviour
Main
Millions of civilians are uprooted each year from their places of residence due to conflicts, human rights violations, and natural disasters1. The number of forcibly displaced people has more than doubled over the past decade, reaching a record high of 100 million in 20222. Internally displaced people (IDPs), i.e., people forced to leave their homes but who remain within their country’s border, accounted for more than 70% of the recorded movements, marking a 20% increase compared to the previous year3. According to the latest report of the Internal Displacement Monitoring Centre (iDMC), the escalating number of conflicts and violence in the past year pushed the figure to an unprecedented 28.3 million displacements worldwide3. Such alarming numbers raise international concerns, as IDPs are amongst the most vulnerable people and often in need of humanitarian assistance to ensure their safety, as well as access to medical care, services, and food1.
While there have been wide-ranging international efforts to measure the scale of cross-border population flows, estimation of within-country migration flows is particularly prone to inaccuracies, lack of timely updates, and lack of disaggregation, e.g., by age4,5. This also applies in the context of the ongoing Russia-Ukraine War, which triggered the largest humanitarian crisis in Europe since World War II. As of October 2024, over 40 million border crossings from Ukraine have been registered since the start of the war (24 February 2022), in addition to 6.7 million Ukrainians who sought refuge in neighbouring countries and another 3.5 million within the borders of their country6,7.
The volatile situation of the conflict has resulted in pendular migration not only between Ukraine and adjacent countries but also within the country itself. Although public authorities have a good understanding of the population flow to and from receiving countries, less is known about the Ukrainians’ displacement pattern at the sub-national level8. This is a crucial limitation, as Ukraine currently lacks intermediary institutions that could ensure periodic assessments throughout the country.
To date, most information on IDPs is collected by humanitarian organizations through either phone- or field-based surveys9,10. In a conflict or disaster, however, these traditional data sources often fail to provide accurate and timely information, especially during the acute phase of the displacement crisis5,9. In armed conflicts, on-the-ground enumerators can be exposed to life-threatening risks, or have their access hindered by either infrastructural damages or inaccessibility to more remote areas11. As a consequence, most countries rely on a patchwork of IDP estimates that arise from various independent assessments, each of which is collected for different purposes that ultimately yield conflicting IDP estimates9.
In light of such limitations, the use of non-traditional data for monitoring mobility patterns has gained momentum among both the scientific and the humanitarian community due to their untapped potential to sidestep some of the current data limitations9,12,13. A substantial body of research has explored the potential use of anonymized call detail records (CDR) from mobile phone operators to monitor mobility patterns14,15. For example, studies have shown how CDR can track the spread of communicable diseases, such as the work by Wesolowski et al.16,17 on malaria, and Oliver et al.18 on COVID-19. Additionally, CDR data has been useful to measure communication patterns within particular communities such as the Syrian refugees in Turkey19, as well as to analyze internal displacement trends as exposed by Shibuya et al.20 in the context of the Russia-Ukraine War.
While CDR data is undoubtedly useful, procuring access is a major hurdle and needs to be negotiated for each telecom operator individually. Moreover, there are difficulties in quickly transferring a methodology from one country to another as different telecom operators might apply different processing and aggregation methods, and might fall under different national regulations. To address these concerns, a growing body of literature has been exploring digital traces from social media platforms, which are more easily accessible and persistent across countries [e.g., Refs.8,10,21–23]. Others, in turn, have been relying on satellite imagery to estimate population dynamics through either refugee settlement detection12 or analysis of nightlights data24,25.
Building upon these studies, we argue that satellite imagery offers fertile, yet largely under-explored, ground for studying internal population displacements. News channels all around the world reported the fleeing of thousands of Ukrainians by means of personal or shared vehicles during the first weeks following the full-scale Russian invasion. Queues of vehicles stretching dozens of kilometers were recorded at the major checkpoints on the Western border, capturing the flight of Ukrainians from the Eastern to the Western regions [e.g., Refs.26–28].
This vehicle-facilitated displacement behavior raises the question of whether internal displacement patterns can be estimated from the spatial-temporal changes in car counts obtained from satellite imagery. To pursue this research question, we collected a total of 1009 very-high-resolution satellite images between 2019 and 2022, spanning 61 cities across Ukraine (Fig. 1; see section Material & Methods for more details about data collection and processing). Of these, only 534 images remained for analysis after our data post-processing pipeline, with most cities left with fewer than 10 images throughout the time series (see Supplementary Fig. S1). Furthermore, the monthly data availability was generally sporadic for all cities, with several temporal gaps along the time series (i.e., months without any data) (see Supplementary Fig. S2). However, data coverage was much better during the months following the start of the war than during the pre-conflict period, especially for cities heavily involved in the conflict (e.g., Kharkiv, Donetsk, Mariupol, Odessa, and Ivano-Frankivsk; see Supplementary Fig. S2). This likely reflects operational adjustments by the satellite operator in response to the demand for images from the conflict areas.
Despite the data shortages, visual inspection of these images revealed some clear pre-war vs. wartime patterns in the number of cars across multiple cities. For example, Mariupol, one of the cities most heavily affected by the war, suffered a massive drop in the number of cars during the first month following the start of the war when compared with the same month and region prior to the COVID-19 pandemic (Fig. 2a, b). The opposite trend was observed for the city of Uzhhorod on the Slovakian border, with a substantial increase in the number of vehicles during the months following the outbreak of the war when contrasted with the pre-war period (Fig. 2c, d).
These two examples illustrate the apparent connection between the number of cars and people, as the car dynamics seemingly follow the general East-West migration pattern that has been previously reported for Ukraine8,10. On this basis, the current study proposes a novel methodology to estimate IDPs, using the ongoing Russia-Ukraine War as a motivating case study. By using a computer vision model in combination with a robust statistical analysis, we modelled the population-car relationship for multiple cities and estimated the sub-national people displacement whenever applicable.
Results
In the following two sections, we first describe our more basic and measurement-centric analysis of (raw) car counts (see section “Spatio-temporal car dynamics”). Subsequently, we describe a more advanced and modelling-centric analysis that uses the changes in car counts to estimate changes in population counts (see section “Inferring IDPs from cars”).
Spatio-temporal car dynamics
Evaluating the car dynamics on a monthly resolution was largely infeasible, given the temporal data gaps within each city (see Supplementary Fig. S2). Nevertheless, trends in the car dynamics become more apparent at coarser temporal resolution and we therefore present the following results on a quarterly and yearly resolution to ease interpretation (Fig. 3).
At the quarter-of-year resolution, our results suggest an increasing and progressive trend of car displacement from East-to-West throughout the year (Fig. 3a). However, these results still suffer from considerable data shortages for some parts of Ukraine.
Analyzing the relative changes at the yearly level instead provided a more robust picture of the cars’ internal displacement (Fig. 3b). Except for Rivne, oblasts to the West were marked by a substantial increase in the number of cars. A particularly large increase was observed for Ivano-Frankivsk (412%) and Lviv (401%) (Fig. 3b). Oblasts in the central region of Ukraine depicted a more transitional state, as car density increased in some oblasts (e.g., Zhytomyr (472%), Cherkasy (307%), and Chernihiv (107%)), and decreased in others (e.g., Kherson (-98%), Poltava (-76%), and Kiev city (-75%)) (Fig. 3b). The Eastern region, in contrast, was marked by a clear outflow of cars, with Donetsk (-90%), Dnipropetrovsk (-87%) and Mykolaiv (-72%) recording the largest drops (Fig. 3b). Kharkiv and Luhansk were exceptions to this, given that the number of cars increased for these two oblasts when contrasted with the number of cars circulating prior to the war (2019). Luhansk, indeed, recorded the highest increase among all oblasts (+773%) (Fig. 3b).
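The percentages reported above are relative changes against the 2019 baseline. A minimal sketch of that arithmetic follows; the function name and the example counts are hypothetical and are not taken from the study's data.

```python
def relative_change_pct(baseline_count, current_count):
    """Percentage change of current_count relative to baseline_count."""
    if baseline_count <= 0:
        raise ValueError("baseline count must be positive")
    return 100.0 * (current_count - baseline_count) / baseline_count

# Hypothetical car counts for two oblasts (illustration only)
print(relative_change_pct(1000, 100))   # a 90% drop
print(relative_change_pct(200, 1000))   # a 400% increase
```

Note that the measure is asymmetric: a drop can never exceed 100%, whereas an increase is unbounded, which is worth keeping in mind when comparing oblasts with very small baseline counts.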
At a high level, these aggregated results indicate a clear East-to-West movement, with some key oblasts suggesting potential cross-border movements (e.g., Odessa (Moldova), Ivano-Frankivsk (Romania), Lviv (Poland), Zhytomyr (Belarus), and Luhansk (Russia)) (Fig. 3b).
Inferring IDPs from cars
Translating from shifts in car distribution to shifts in population distribution requires a model of the linkage between the two. To obtain such a model, we put fine-grained 2019 population counts in relationship to car counts from satellite imagery for the same year, and use the estimated relationship to predict population shifts along the first months following the start of the war.
There is a general positive link, i.e., areas with larger populations also have more cars visible in satellite imagery, which reinforces our initial hypothesis (Fig. 4). Oleksandriya was nevertheless an exception to this, as there is virtually no relationship between its population and the number of cars (Fig. 4). Curiously, there is variation in the exact curve linking population and car counts (Fig. 4 and Supplementary S3-S7). Furthermore, some highly populated areas were associated with a particularly low number of cars. Visual inspection suggests that such cases occurred mainly in dense residential areas where cars could be parked in closed spaces such as garages. It is likely that other factors such as differential rates of car ownership, street parking availability, and access to and quality of public transportation could also explain this variation.
To model this car-to-people relationship, we explored two approaches with distinct levels of complexity. In the first approach, we assumed that the 2019 car-to-people ratio still applies in the following years while remaining constant within areas; hereafter the ratio method. That is, if for a given area in a city the number of visible cars drops by, say, 40% compared to 2019, we assume that the population in that area has also dropped by 40%. This approach is intuitive, but it carries a risk of “overfitting”, and it always assigns a population count of zero to areas without visible cars. An additional limitation is that the fixed car-to-population relationship fails to account for changes in urbanization levels, which are likely to occur in the long run for any city or area within a city. We nevertheless deem this effect negligible in our case study, given the narrow time window (2019–2022), which also involved two years of heavy economic recession driven by the COVID-19 pandemic.
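The ratio method described above can be sketched in a few lines. This is an illustrative sketch only: the function name, the per-area lists, and the handling of areas with no baseline cars (returned as `None`) are assumptions, not the authors' implementation.

```python
def ratio_method(pop_2019, cars_2019, cars_now):
    """Scale each area's 2019 population by the change in its visible car count."""
    estimates = []
    for pop, base_cars, current_cars in zip(pop_2019, cars_2019, cars_now):
        if base_cars == 0:
            # Assumption: the ratio is undefined without baseline cars
            estimates.append(None)
        else:
            estimates.append(pop * current_cars / base_cars)
    return estimates

# Hypothetical areas: a 40% drop in cars, no visible cars, and a 50% increase
print(ratio_method([1000, 500, 200], [50, 20, 10], [30, 0, 15]))
# [600.0, 0.0, 300.0]
```

The second area shows the behaviour noted in the text: an area with no visible cars is always assigned a population of zero.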
In the second approach, we relied on a Generalized Additive Model (GAM), as it provides a more flexible function to estimate the cars-to-people relationship; hereafter the regression method. Unlike the ratio method, this approach tends to overestimate the population size, as it typically assigns the function’s average to all areas, including uninhabited areas. By using these two distinct methods, we can ultimately derive lower and upper bound estimates, and thereby provide a proxy for uncertainty in our population size estimates.
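To illustrate how the two methods can bracket an estimate, the sketch below contrasts a ratio estimate with a regression estimate. An ordinary least squares line is used purely as a stand-in for the GAM smoother, so this is not the model fitted in the study; all names and numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (stand-in for a GAM)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical 2019 baseline for four areas of one city
cars_2019 = [0, 10, 20, 30]
pop_2019 = [100, 300, 500, 700]
intercept, slope = fit_line(cars_2019, pop_2019)

def regression_estimate(cars_now):
    """Predict population from the current car count with the fitted line."""
    return intercept + slope * cars_now

def ratio_estimate(pop_base, cars_base, cars_now):
    """Scale the baseline population by the change in car count."""
    return pop_base * cars_now / cars_base

# An area with 500 people and 20 cars in 2019, and 5 visible cars now
ratio_est = ratio_estimate(500, 20, 5)
reg_est = regression_estimate(5)
lower, upper = sorted((ratio_est, reg_est))
print(lower, upper)  # 125.0 200.0
```

In this toy example the fitted line predicts a non-zero population (the intercept) even where no cars are visible, whereas the ratio estimate would be zero, which is the kind of difference that lets the two methods act as rough lower and upper bounds.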
Overall, we could estimate IDPs for 43 out of the 61 evaluated cities. The remaining cities did not have any comparable month between the baseline (2019) and the other two years (2020, 2022), and were thus not considered for further analysis. For an in-depth overview of the data-analytical assumptions above, we refer the reader to the Material & Methods section. For the majority of applicable cases, our findings indicated that the population was larger during the first COVID-19 year than during the conflict year (e.g., Bila-Tserkva, Donetsk and Odessa; see Supplementary Figs. S8 and S11), as one might expect.
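One plausible reading of the "comparable month" criterion is that a city needs at least one calendar month with imagery in both the baseline year and the comparison year. The sketch below encodes that reading; it is an assumption about the criterion, and the function name and dates are hypothetical.

```python
from datetime import date

def comparable_months(image_dates, baseline_year=2019, target_year=2022):
    """Calendar months (1-12) with at least one image in both years."""
    base = {d.month for d in image_dates if d.year == baseline_year}
    target = {d.month for d in image_dates if d.year == target_year}
    return sorted(base & target)

# Hypothetical acquisition dates for one city
dates = [date(2019, 3, 14), date(2019, 7, 2), date(2022, 3, 20), date(2022, 5, 1)]
print(comparable_months(dates))  # [3]
```

Under this reading, a city whose function call returns an empty list for both comparison years would be excluded from the displacement estimates.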
Although the population estimates could differ substantially between the two prediction methods (i.e., ratio and regression methods), their general trend remained consistent (Fig. 5 and Supplementary S8-S12). While the ratio method usually predicted much larger population drops (or increases), the regression method resulted in milder and less extreme predictions. According to the ratio method, for example, more than 88% of Mariupol’s population emigrated in March 2022. These fluctuations are dampened when considering the estimates from the regression method, which predicted a decrease of only 30%. Such discrepancies are to be expected, given that the GAM approach adds the average population of the baseline year into its calculation, unlike the ratio approach which considers only the fixed population-to-car ratio.
For a handful of cities, some months in the regression approach were marked by either suspicious population sizes (e.g., Berehove and Mamalyha in Supplementary Figs. S8 and S10, respectively), or differing population trends between the two prediction methods (e.g., Donetsk, Luhansk and Shehyni in Supplementary Figs. S8, S10 and S12, respectively). This likely results from poor model fit, violation of one or more model assumptions, or because the number of cars during the conflict year was simply far outside the model’s prediction range.
The results from the GAM approach generally revealed a good performance in terms of goodness-of-fit, as most of the evaluated cities (25/43) displayed a coefficient of determination above 60% (see Supplementary Table S1), and followed reasonably well the evaluated model assumptions (refer to Supplementary S13 for an example). This means that the city-specific baseline population sizes could be generally well estimated by the number of cars, and the model is as such suitable for predicting population during the war period. For models outside these conditions, we caution against any solid conclusion.
Irrespective of the prediction method, our results revealed similar trends in terms of people displacement across cities and oblasts. In line with the raw car dynamics reported previously, the current results also suggest an East-to-West movement of population (Fig. 5 and Supplementary Figs. S8 - S12). In the first months following the start of the war (March-April), cities in the West, such as Uzhhorod and Ivano-Frankivsk, also showed a substantial increase in the number of people compared to their pre-war population (Fig. 5 and Supplementary Fig. S9). The population of Ivano-Frankivsk increased substantially compared to all other western cities, with estimates ranging between +157% and +623% relative to the pre-war population size (see Supplementary Fig. S9). In contrast, cities in the East and more central regions, such as Kiev and Mariupol, mostly saw an outflow of people (Fig. 5 and Supplementary Figs. S8 - S12). Among those, the cities of Zaporizhzhia and Kherson were marked by the largest population drop, where the ratio method predicted a population decrease of more than 90% for both cities (see Supplementary Figs. S10 and S12).
Apart from the city-level comparisons, it might be of general interest to understand the displacement dynamics at the sub-city scale. Figure 6 shows an example of the predicted population for Kiev city for two different months in 2022, with Supplementary Figure S14 providing additional support by showing the results in relative terms. Regardless of the prediction method, our results clearly show an extensive emigration of its residents in the first month following the war (March), with the eastern side of the city marked by a larger population drop than the western side (see mid panels in Fig. 6). This picture changes completely three months later (June), when the eastern portion of the city seems to have recovered most of its initial population (see lower panels in Fig. 6).
Discussion
Estimating population displacement at the subnational level is inherently challenging due to its dynamic nature and the irregular collection of primary data. Our analysis demonstrates the potential that very-high-resolution satellite imagery holds for monitoring car-based internal displacement. The importance of such methods is likely to increase, as climate change and political and economic instabilities are bound to lead to additional displacement pressure worldwide in the years to come3,29,30.
We investigated two different approaches to predict IDPs and, despite the occasional discrepancies, some clear movement patterns could be traced at both subnational and sub-city level. These approaches could be improved further if situation-specific information on the mode of (escape) transport is available, e.g. from post-displacement surveys. However, groups such as the “caminantes”, who are leaving Venezuela on foot, will be invisible to our method. Similar challenges also apply to the use of mobile phone data, as the exact phone-to-person link is context-dependent and parts of the population might have several mobile devices per person, while others have none. Connectivity can additionally be affected by power shortages or destroyed cell towers, thus blurring the IDP dynamics.
According to our findings, most of the IDPs originated from the eastern region, as indicated by the decreased population estimates there. Oblasts to the west, in contrast, were frequently associated with increased population estimates, suggesting their role as host regions. At a high level, these findings corroborate the trends reported by the International Organization for Migration (IOM)6, and are further supported by those from Rowe et al.8 and Leasure et al.10. Comparing these estimates in absolute terms is nevertheless intractable, given the differences in spatial and temporal granularity, including the data source underlying the reference year. In the absence of ground-truth data against which our population estimates could be validated, such high-level data triangulation remains the only option for the time being.
While our study focuses on one country, Ukraine, and one satellite image provider, Maxar, we believe that the computer vision approach for detecting and counting cars will generalize to other geographic contexts that are similar in terms of car, building, and road types, as well as to other providers of very high-resolution satellite imagery (30–50 cm). The benefits of such an approach are manifold for both the humanitarian and government sectors, ranging from reduced life-threatening risks for on-the-ground enumerators to the speed at which population estimates can be generated in response to frequent imagery updates. Besides, a major advantage of the current method is its transparency, unlike the black-box, and often biased, algorithms employed by social media platforms such as Meta.
Nevertheless, our framework has some fundamental limitations that could hamper its broader diffusion. To start with, very-high-resolution imagery such as that used herein is not cheap to procure and requires staff with technical skills to acquire, process, and analyze it12. This might impose a barrier for low- and middle-income countries, which already suffer from long-lasting funding restrictions. Additionally, it can take time to establish partnerships between governments and imagery providers.
Current computer vision techniques also come with limitations that can jeopardize the accuracy and generalization of any similar framework31,32. A typical constraint of AI-based models is the lack of ground-truth data to perform their formal validation in the real world33,34. In the absence of a dedicated database, it is necessary to employ expert human annotators to manually label the images. This is not only costly and time-consuming, but often infeasible in the context of humanitarian crises which require timely responses.
Some other constraints are intrinsic to the satellite images themselves. In our brief experiment, we showed the effect of weather- and cloud-based obstructions on car detectability, including imagery-related features such as color, off-Nadir angle, and image resolution (see section “Understanding the effect of imagery features on car detection”). Given their unknown impact on population estimators, we strongly encourage future research to investigate further the impact of such aspects on object detectability, and most importantly the use of high-resolution synthetic aperture radar (SAR) imagery as a potential remedy for the limitations of RGB-based images (https://umbra.space/).
Data sparsity was also a key limitation in the present study, mostly as a result of our reliance on pre-collected archival Maxar imagery. Despite the better spatial coverage and cadence of the freely available Sentinel-2 images, their spatial resolution of 10 m per pixel prevents the detection of small objects like cars. Most cars typically occupy less than a pixel, making it hard to distinguish them from the background. For this reason, we resorted to Maxar images, which offer higher spatial resolution (i.e., 0.3–0.5 m per pixel), yet come at the cost of lower spatial and temporal coverage, as our analysis relied on pre-collected archival imagery rather than on-demand tasked imagery.
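A back-of-the-envelope calculation makes the resolution argument concrete. The car dimensions below (4.5 m by 1.8 m) are an assumed typical footprint and do not come from the study; under that assumption a car covers roughly 0.08 of a 10 m pixel, but on the order of 30 to 90 pixels at 0.5 m and 0.3 m.

```python
CAR_LENGTH_M = 4.5  # assumed typical car length, not from the study
CAR_WIDTH_M = 1.8   # assumed typical car width, not from the study

def car_footprint_in_pixels(gsd_m):
    """Approximate number of pixels covered by one car at a given ground sampling distance."""
    return (CAR_LENGTH_M * CAR_WIDTH_M) / (gsd_m ** 2)

for name, gsd in [("Sentinel-2 (10 m)", 10.0), ("Maxar (0.5 m)", 0.5), ("Maxar (0.3 m)", 0.3)]:
    print(f"{name}: {car_footprint_in_pixels(gsd):.2f} pixels per car")
```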
However, it is worth highlighting that the number of earth observation satellites has grown rapidly, jumping from 789 in Aug 2020 to 1238 in May 202335, with a further increase expected. Furthermore, the cost of tasked imagery has dropped to under USD 1000 per request36,37, down from over USD 10,000 a few years back, with further price drops expected. Improved data availability is likely to open opportunities for exploring more advanced and robust statistical models to predict population shifts from observed car displacements, including, but not limited to, spatial and state-space models38,39.
Beyond the aforementioned cases, there remains the fundamental research challenge of tying the number of displaced cars to the number of displaced people. Behavioral changes induced by the war could likely affect the pre-war link between the number of visible cars and the number of people living in a given area. For example, one might assume that, when fleeing, the cars could be more packed. The drop in cars could be thus an underestimation of the drop in population. Conversely, the cars that remain might well become more hidden/protected, overestimating as such the population displacement.
While our research focuses on computational methods for estimating internal displacement, it is important to acknowledge that such work does not happen in a political vacuum. In particular, there is a risk that any migration-related technology will be used to curb and restrict migration, rather than benefiting migrants40. And even if used with good intentions in a humanitarian context, more data could add noise and distract, while also creating privacy issues41. Despite these valid concerns, we believe that if developed and deployed responsibly through academic-humanitarian partnerships, satellite-based estimates can benefit displaced populations, reducing privacy risks related to the use of individual data, such as mobile phone traces.
Material & methods
Car detection and counting from satellite images
Automatic detection of small objects such as cars from satellite images is a difficult task, especially at a global scale, due to the diverse nature of environments across geographies, climate zones, and seasons. To tackle this challenge, many methods have been proposed using traditional approaches such as training classifiers (e.g., Support Vector Machines (SVMs)42 and Random Forests (RFs)16) over handcrafted features (e.g., Local Binary Patterns (LBP)43, Histogram of Oriented Gradients (HOG)44, and Scale-Invariant Feature Transform (SIFT)45)46–48. However, more recent approaches leverage large labeled image collections such as xView49, DOTA50, DIOR51, and FAIR1M52 by using various deep learning architectures based on Convolutional Neural Networks (CNNs)53–56. The CNN-based approaches have achieved better performance thanks to the capacity of deep neural network architectures to extract and learn object characteristics within an end-to-end framework51,57–64.
In this study, we employed the state-of-the-art ensemble CNN framework proposed by Minetto et al.65, which ranked third place in the xView challenge (http://xviewdataset.org/), the most advanced benchmark for object detection in satellite images, organized by the US Defense Innovation Unit Experimental (DIUx) and the National Geospatial-Intelligence Agency (NGA). The ensemble model is designed by combining two baseline Single Shot Multibox Detectors (SSD)55 with various data augmentation strategies adopting different scales, overlaps, and thresholds in order to ensure better scale invariance and detection accuracy for small vehicles. More technical details about the method can be found in Minetto et al.65.
Unlike the reference work65, we focused particularly on the small car class and filtered the final car detections using a higher confidence threshold of 0.45, which we tuned through a sensitivity analysis as follows. We took a sample of 3000 images with a total of 19,000 ground-truth car annotations and evaluated the car detection model’s performance by computing the Fβ score as in Eq. (1) with β = 0.5 while increasing the confidence threshold from 0 to 1 with a step size of 0.05.
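$$F_{\beta} = \left(1 + \beta^{2}\right) \cdot \frac{\text{Precision} \cdot \text{Recall}}{\beta^{2} \cdot \text{Precision} + \text{Recall}} \tag{1}$$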
In this analysis, we chose β = 0.5 to put more weight on the precision than on the recall of the model, and hence focused on minimizing false-positive car detections, which is critical for our use case. Supplementary Table S2 summarizes the performance achieved by the model in terms of Precision, Recall, and F0.5-score across varying confidence thresholds. The maximum F0.5-score of 0.4776 was achieved at a confidence threshold of 0.45, as highlighted in Supplementary Fig. S15. At this threshold value, the precision and recall scores of the model were 0.5578 and 0.3032, respectively, according to Supplementary Table S2.
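As a quick check, the standard Fβ definition reproduces the reported score from the reported precision and recall. In the sketch below, only the 0.45 row uses values from the text; the other two rows and both function names are hypothetical, included solely to show how a threshold would be picked from such a table.

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def pick_threshold(rows, beta=0.5):
    """Return the confidence threshold with the highest F-beta score."""
    return max(rows, key=lambda row: f_beta(row[1], row[2], beta))[0]

# Precision and recall at threshold 0.45 as reported in the text
print(round(f_beta(0.5578, 0.3032), 4))  # 0.4776

# (threshold, precision, recall); rows at 0.30 and 0.60 are hypothetical
rows = [(0.30, 0.40, 0.40), (0.45, 0.5578, 0.3032), (0.60, 0.70, 0.10)]
print(pick_threshold(rows))  # 0.45
```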
Study region and areas of interest
Our study region comprised all primary administrative units, i.e., oblasts, in Ukraine, except for the occupied territories of Crimea and Sevastopol. Within each oblast, we selected the two most populated urban areas, in addition to other strategic areas located at the border checkpoints which were used as escape routes; hence, where humanitarian efforts could be optimized.
The selection of urban areas was based on the open-source gridded population data retrieved from WorldPop, which is regularly updated and curated by the WorldPop Research Group at the University of Southampton66 (https://www.worldpop.org). For the purpose of this study, we downloaded the geospatial layer tailored specifically for Ukraine67. This data provides information for the latest available year (2020) at a resolution of 100 m, and was produced through the top-down constrained method (for more details, see Stevens et al.68 and WorldPop69). We then aggregated the population data at the secondary administrative units, i.e., Raions (districts), and ranked them within each oblast. Ultimately, we selected the cities belonging to the top-two most-populated Raions in each oblast and manually defined a boundary area around the identified cities.
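The aggregation and ranking step can be sketched as follows, assuming each 100 m grid cell has already been assigned to an oblast and a raion. The input layout, function name, and place names are hypothetical; the spatial join that assigns cells to administrative units is not shown.

```python
from collections import defaultdict

def top_raions(cells, k=2):
    """Sum gridded population per raion and return the k most populated raions per oblast.

    cells: iterable of (oblast, raion, population) tuples, one per grid cell.
    """
    totals = defaultdict(float)
    for oblast, raion, population in cells:
        totals[(oblast, raion)] += population

    by_oblast = defaultdict(list)
    for (oblast, raion), total in totals.items():
        by_oblast[oblast].append((raion, total))

    return {
        oblast: [raion for raion, _ in sorted(items, key=lambda x: x[1], reverse=True)[:k]]
        for oblast, items in by_oblast.items()
    }

# Hypothetical grid cells
cells = [
    ("Oblast A", "Raion 1", 120.0),
    ("Oblast A", "Raion 2", 80.0),
    ("Oblast A", "Raion 1", 30.0),
    ("Oblast A", "Raion 3", 200.0),
    ("Oblast B", "Raion 4", 50.0),
]
print(top_raions(cells))
# {'Oblast A': ['Raion 3', 'Raion 1'], 'Oblast B': ['Raion 4']}
```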
To define the strategic areas, we relied on information provided by the Humanitarian Data Exchange (HDX) database (https://data.humdata.org/). HDX is an open-source platform managed by the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), providing over 20,000 datasets spanning 250 locations. We specifically retrieved information regarding international border crossings of Ukraine, downloading the most up-to-date data available at the time this study was developed (June 2022). This dataset nevertheless does not specify the type of border crossing and can include categories such as railway and water crossings. In order to keep only locations that can be crossed by vehicles, we further intersected this data with information provided by the Ukrainian State Border Guard Service (https://dpsu.gov.ua/).
This process resulted in 61 areas of interest (AOI) with varying sizes, whereby each oblast included at least two AOIs (Fig. 1). Altogether, these 61 AOIs covered a total area of 13.496 km2 (see Supplementary Table S3). For simplicity, we will henceforth refer to each AOI as ‘city’.
Satellite imagery collection and processing
Reliable detection of small objects such as cars requires access to very-high-resolution satellite images. Although there are a number of free satellite imagery providers (e.g., Sentinel data), they all come at the cost of coarser spatial resolution (> 10 m). For this reason, we used pan-sharpened natural color images from the WorldView and GeoEye series of satellites operated by Maxar Corp, which currently provides images at a spatial resolution of up to 30 cm.
Through the SecureWatch Platform (https://securewatch.maxar.com/myDigitalGlobe), we searched the archive for images taken between January 2019 and September 2022. We included images from the years before the war so that we could contrast more reliably the car dynamics before and during the war. Moreover, we started from January 2019 because car dynamics might have likely changed during the COVID-19 outbreak and would, as such, not provide a reliable baseline for comparison. The list of available satellite images was further filtered to retain images with (i) RGB color, (ii) less than 20% cloud coverage, and (iii) a Ground Sampling Distance (GSD) of at most 0.5 m (i.e., images at 0.3 m, 0.4 m, and 0.5 m resolution). Note that GSD refers to the distance between two consecutive pixels in an image measured on the ground. The smaller the value of GSD, the higher the spatial resolution of the image and the more visible the details of small objects such as cars.
After this initial search, we proceeded to download satellite imagery for each AOI, for every day (where available) in the above-mentioned period. For each AOI, we downloaded an image for a given day only when one or more images covering at least 1% of the AOI’s area were available on that day. If only a single satellite image was available for that day, we downloaded the portion of the image that covered the AOI. If more than one satellite image was available on that day, we downloaded a single stitched image as follows. The available satellite images were sorted by the area of the AOI they covered, and the image covering the largest area was downloaded first; this procedure was then repeated with the remaining images for the still-uncovered portions of the AOI. These different image portions were then stitched together into one final image. We downloaded a total of 1009 daily image snapshots across all 61 AOIs. Most of these snapshots (77.7%) were composed of a single satellite image, with the remainder being stitched from two or more images as described above. The bulk of the images were taken between 8 AM and 9 AM UTC (see Supplementary Fig. S16).
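The stitching order described above can be read as a greedy coverage procedure. The following is a minimal sketch of that reading, not the authors' implementation: the AOI is represented as a set of hypothetical cell identifiers, each image as the set of cells it covers, and the 1% rule is assumed to apply to the best single image of the day. The function name `plan_stitch` and all inputs are illustrative.

```python
def plan_stitch(aoi_cells, images, min_fraction=0.01):
    """Greedy ordering of same-day images for one AOI.

    aoi_cells: set of cell ids discretising the AOI (illustrative representation).
    images: dict mapping image id -> set of cell ids covered by that image.
    Returns a list of (image_id, cells_taken_from_that_image); [] if the day is skipped.
    """
    # Only the part of each footprint that falls inside the AOI matters.
    footprints = {name: cells & aoi_cells for name, cells in images.items()}

    # Assumption: the day is kept if at least one image covers >= 1% of the AOI.
    if not any(len(c) >= min_fraction * len(aoi_cells) for c in footprints.values()):
        return []

    uncovered = set(aoi_cells)
    remaining = dict(footprints)
    plan = []
    while remaining and uncovered:
        # Pick the image adding the most still-uncovered area (ties: alphabetical).
        best = max(sorted(remaining), key=lambda n: len(remaining[n] & uncovered))
        gain = remaining.pop(best) & uncovered
        if not gain:
            break  # no remaining image adds new coverage
        plan.append((best, gain))
        uncovered -= gain
    return plan


# Toy example: an AOI of 100 cells and three overlapping images.
aoi = set(range(100))
same_day_images = {
    "a": set(range(0, 60)),
    "b": set(range(40, 90)),
    "c": set(range(55, 65)),
}
stitch_plan = plan_stitch(aoi, same_day_images)
stitch_order = [name for name, _ in stitch_plan]
covered_cells = sum(len(cells) for _, cells in stitch_plan)
```

In this toy case image "c" is never used, because the portion of the AOI it covers is already supplied by the two larger images.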
Data post-processing
Preliminary assessments revealed that certain images were associated with an unusually high number of car detections, while others yielded only a few or even none. Further investigation indicated that the unusually high counts were mostly due to false-positives. Conversely, images with zero or few detections corresponded to cases in which the imagery was either fully obstructed by dense cloud and/or haze layers, or covered only a very small fraction of the AOI. To avoid confounding noise that could distort the results, we carried out three additional data filtering steps, described as follows.
False-positive filtering
Like any ML model, the car detection and classification algorithm used herein is not immune to false-positives. Prior investigation revealed that the CNN model detected cars in places where their occurrence is unlikely or even impossible (e.g., water bodies, crop fields and forests; see Supplementary Fig. S17). To reduce these misdetections, we overlaid the detected cars on spatial attributes retrieved from the OpenStreetMap (OSM) database and filtered out any detection located within such unlikely areas.
OSM is a crowd-sourced database that provides physical features from all over the world70. The data are organized by tags, each of which is a key-value pair describing a feature’s characteristics. Based on the official map attributes listed on the OSM Wiki webpage (https://wiki.openstreetmap.org/wiki/Map_features), we selected five primary keys to compose our standard tag: landuse, place, natural, leisure, and aeroway. Each key was then paired with a list of values, ultimately defining areas such as rivers, forests, farmland, parks, and railways. For a detailed list of the selected tags, we refer to Supplementary Table S4.
To retrieve the geospatial layers related to our list of selected tags, the osmdata R-package was used71. The package essentially downloads OSM data through overpass API queries, where each query runs within a bounding box area. In our case, we ran the query for all 61 AOIs (i.e., bounding boxes), and stored the AOI-specific data in the format of shapefiles. Any cars detected on those layers were considered as false-positives, and hence, removed from our database.
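To illustrate the spatial filter described above, the sketch below removes detections that fall inside polygons representing unlikely areas. It is a simplified stand-in for the authors' workflow (which relied on the osmdata R-package and shapefiles): polygons are plain vertex lists, holes and multipolygons are ignored, and the function names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices of a simple ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def drop_unlikely_detections(detections, unlikely_polygons):
    """Keep only detections that lie outside every 'unlikely' polygon."""
    return [
        (x, y)
        for x, y in detections
        if not any(point_in_polygon(x, y, poly) for poly in unlikely_polygons)
    ]


# Toy example: a square "lake" and four detected cars (arbitrary coordinates).
lake = [(0, 0), (4, 0), (4, 4), (0, 4)]
detected_cars = [(1, 1), (5, 5), (2, 3), (6, 1)]
kept_cars = drop_unlikely_detections(detected_cars, [lake])
```

A production workflow would more likely use a spatial join with a spatial index, since a pairwise test scales poorly with many detections and polygons.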
Population filtering
Most of the downloaded imagery covered only a fraction of the AOIs. In some cases, the covered fraction was very small and typically did not overlap with the core extent of the urban area (see Supplementary Fig. S18). We deemed such images unrepresentative of the city’s core dynamics and consequently removed them from the analysis. Specifically, we classified as unrepresentative any image covering less than 50% of the population distribution. For this purpose, each image was assigned a distribution index, calculated as follows:
$$p_{i,s} = \frac{N_{i,s}}{N_{s}} \tag{2}$$
where $p_{i,s}$ represents the population fraction covered by the imagery $i$ relative to its AOI $s$, and $N_{i,s}$ and $N_{s}$ indicate the number of people within the imagery and AOI, respectively. The index ranges between 0 and 1, with larger values corresponding to stronger representativeness of the full population distribution. Note that to compute the indices, we used the same gridded population data as presented in section “Satellite imagery collection and processing”.
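As a small worked example of the distribution index and the 50% threshold, the sketch below computes the index for three hypothetical images of one AOI. The image names and population figures are invented for illustration only.

```python
def distribution_index(people_in_image, people_in_aoi):
    """Fraction of the AOI population that lies within the image footprint."""
    if people_in_aoi <= 0:
        raise ValueError("AOI population must be positive")
    return people_in_image / people_in_aoi


def keep_representative(people_by_image, people_in_aoi, threshold=0.5):
    """Return the ids of images covering at least `threshold` of the population."""
    return [
        image_id
        for image_id, n_people in people_by_image.items()
        if distribution_index(n_people, people_in_aoi) >= threshold
    ]


# Invented numbers: an AOI with 200,000 inhabitants and three images.
aoi_population = 200_000
people_by_image = {"img_a": 120_000, "img_b": 30_000, "img_c": 100_000}
representative = keep_representative(people_by_image, aoi_population)
```

Here "img_b" covers only 15% of the population and would be discarded, whereas "img_c" sits exactly at the threshold and is retained under the "less than 50%" rule.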
Cloud and haze filtering
Some images were associated with very low car detections (N < 10), or even none in the more extreme cases. In order to make reliable inferences on the car dynamics, it is imperative to distinguish false from true zeros or, alternatively, low occurrences. False zeros/low occurrences are of particular concern, as they may arise from images obstructed by factors such as clouds, haze and pollution.
Although the initial imagery request was limited to cases in which the cloud coverage did not exceed 20%, this does not guarantee that the AOI is cloud free. This is because Maxar calculates the cloud coverage on the entire image tile rather than on the AOI (i.e., a subset of the provided tile). Most of the zero/low occurrences detected in our data were due to images that were fully or partially covered by a dense cloud/haze layer. We therefore removed all images that were fully obstructed, while partially obstructed images were removed only if the cloud(s) blocked the bulk of the city. Note that for the latter, images were visually inspected, as there is no gold-standard method to filter cloud-obstructed images.
Data analysis
Evaluating the spatial and temporal dynamics
To investigate the potential internal migration dynamics before and during the war period, we first evaluated the temporal car dynamics for each city whenever applicable. Data scarcity prevented an evaluation on a daily basis, and thus monthly averages of car density were computed for all cities. Moreover, to identify regions that experienced an increase or decrease in the number of cars during the conflict year (2022), we calculated the change in average car density relative to the baseline year (i.e., 2019). To draw a country-level picture, relative changes were calculated at the level of primary administrative units (Oblasts). For Oblasts with two or more cities, this means that the average car densities were further averaged across the cities.
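The aggregation steps above can be sketched as follows. This is an illustrative reading rather than the authors' code: the yearly value of a city is assumed to be the mean of its monthly means, and the city names, Oblast labels, and densities are invented.

```python
from collections import defaultdict
from statistics import mean


def oblast_relative_change(records, city_to_oblast, baseline=2019, conflict=2022):
    """records: iterable of (city, year, month, car_density) per image snapshot.

    Returns {oblast: (conflict_mean - baseline_mean) / baseline_mean}.
    """
    # Step 1: monthly average density per city.
    monthly = defaultdict(list)
    for city, year, month, density in records:
        monthly[(city, year, month)].append(density)

    # Step 2 (assumption): yearly city value = mean of its monthly means.
    city_year = defaultdict(list)
    for (city, year, _month), values in monthly.items():
        city_year[(city, year)].append(mean(values))

    # Step 3: average the city values within each Oblast.
    oblast_year = defaultdict(list)
    for (city, year), values in city_year.items():
        oblast_year[(city_to_oblast[city], year)].append(mean(values))

    # Step 4: relative change of the conflict year against the baseline year.
    changes = {}
    for oblast in {o for o, _ in oblast_year}:
        if (oblast, baseline) in oblast_year and (oblast, conflict) in oblast_year:
            base = mean(oblast_year[(oblast, baseline)])
            if base > 0:
                changes[oblast] = (mean(oblast_year[(oblast, conflict)]) - base) / base
    return changes


# Invented snapshots: (city, year, month, cars per unit area).
snapshots = [
    ("CityA", 2019, 3, 100.0),
    ("CityA", 2019, 3, 120.0),
    ("CityA", 2022, 3, 55.0),
    ("CityB", 2019, 4, 200.0),
    ("CityB", 2022, 4, 300.0),
]
oblast_of = {"CityA": "East", "CityB": "West"}
relative_change = oblast_relative_change(snapshots, oblast_of)
```

With these invented inputs, the eastern Oblast shows a 50% decrease and the western one a 50% increase in average car density.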
Inferring IDPs from cars
We assumed that spatial and temporal changes in the number of cars could reflect potential migration of Ukrainian citizens across the country. Thus, we specifically propose to estimate war-induced IDPs from historical population-to-car trends observed during the baseline year (i.e., 2019). In the absence of ground truth data, we used WorldPop’s gridded population data for the year 2019 to reflect the baseline population more realistically.
Overall, two distinct methods were used to estimate the relationship between pre-war population and cars. Specifically, we used the (i) linear ratio and (ii) regression method72, which differ in terms of complexity and assumptions. Whereas the ratio method assumes that the population-to-car relationship remains invariant through time within the given geographical area, the regression method relaxes this assumption by including various levels of complexity, such as non-linear and/or spatially- and temporally-dependent relationships. By using two distinct methods, we can provide lower and upper bound estimates, and hence, a proxy for uncertainty.
Ideally, the forecasting should be conducted on a daily basis, since such fine-grained estimates would be more relevant during the acute phase of a humanitarian crisis. However, such high temporal resolution is virtually nonexistent in historical satellite imagery, and therefore IDPs were predicted on a monthly basis.
We initially constructed a 1 × 1 km spatial grid for all cities through the sf R-package73. The total number of cars and people were then computed at the grid cell level for each city-specific satellite image taken in the years 2019 (baseline year), 2020 (first COVID-19 year), and 2022 (first conflict year). We used the results from the first COVID-19 year as a contrast to those from the conflict year, to ensure that the latter were realistic given past trends. That is, we should expect the population during the COVID-19 pandemic to be larger than the population during the war period.
To avoid any hidden and potentially misleading seasonality effect that might be induced by the extensive temporal gap, we identified and selected only cities presenting one or more matching months between 2019 and at least one of the two other years (i.e., 2020, 2022). Monthly averages of the number of cars and people were then computed at the grid cell level for each city and year. Next, for the reference year (i.e., 2019), we pooled information from all months and calculated the average number of cars and people for each city and grid cell therein. These averaged values were subsequently used to estimate the pre-war relationship between population and cars for each of the prediction methods (i.e., linear ratio and regression model). Because the extent of the spatial grid might differ between the considered periods (e.g., Fig. 6), we ensured that all predictions were conducted on the matching grid cells only.
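The grid-level counting and the restriction to matching grid cells can be sketched as below. This is an illustrative simplification, not the sf-based workflow used in the study: coordinates are assumed to be in a projected system with metre units, a cell is identified by its integer column and row, and the footprint cell sets are hypothetical inputs.

```python
from collections import Counter


def count_per_cell(points, cell_size=1000.0):
    """Count points per square grid cell; points are (x, y) in metres."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def matching_cells(cells_baseline, cells_other):
    """Grid cells covered by imagery in both periods."""
    return set(cells_baseline) & set(cells_other)


# Toy detections (metres) falling into three different 1 x 1 km cells.
cars = [(100.0, 200.0), (900.0, 950.0), (1500.0, 200.0), (2500.0, 2500.0)]
cars_per_cell = count_per_cell(cars)

# Hypothetical imagery footprints, expressed as covered grid cells.
covered_2019 = {(0, 0), (1, 0), (2, 2)}
covered_2022 = {(0, 0), (2, 2), (3, 3)}
common_cells = matching_cells(covered_2019, covered_2022)
```

Note that the matching step is based on imagery coverage rather than on cells containing cars, so that a covered cell with zero detections remains a valid observation.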
In the ratio method, we specifically calculated the proportion of the number of people y relative to the number of cars x as follows:
$$r_{s,z} = \frac{y_{s,z}}{x_{s,z}} \tag{3}$$
where $r_{s,z}$ is the population-to-car ratio for city $s$ and grid cell $z$ for the reference year. Certain grid cells may register zero cars, in which case the ratio cannot be computed. For such cases, we calculated the global median of the ratio and assigned this value to the affected cells. The grid-level ratios from the reference year were then used as multiplying factors to predict the city-specific population from the average number of cars calculated for the matching months in 2020 (COVID-19) and 2022 (war).
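A minimal sketch of the ratio method follows, including the median fallback for cells without cars. It assumes that the "global" median is taken over all cells passed to the function; the cell labels and numbers are invented for illustration.

```python
from statistics import median


def population_to_car_ratios(people, cars):
    """Baseline people-per-car ratio per grid cell, with a median fallback.

    people, cars: dicts mapping cell id -> baseline-year averages.
    """
    ratios = {c: people[c] / cars[c] for c in people if cars.get(c, 0) > 0}
    if not ratios:
        raise ValueError("no cell has a non-zero car count")
    fallback = median(ratios.values())  # used where the ratio is undefined
    return {c: ratios.get(c, fallback) for c in people}


def predict_population(ratios, cars_target):
    """Multiply each cell's baseline ratio by its car count in the target period."""
    return {c: ratios[c] * cars_target[c] for c in ratios if c in cars_target}


# Invented baseline (2019) averages and target-period (2022) car counts.
people_2019 = {"z1": 1000.0, "z2": 600.0, "z3": 400.0}
cars_2019 = {"z1": 100.0, "z2": 50.0, "z3": 0.0}
cars_2022 = {"z1": 50.0, "z2": 25.0, "z3": 10.0}

ratios_2019 = population_to_car_ratios(people_2019, cars_2019)
predicted_2022 = predict_population(ratios_2019, cars_2022)
predicted_total = sum(predicted_2022.values())
```

In this toy case cell "z3" has no baseline cars, so it borrows the median ratio of the other two cells (11 people per car).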
In the regression method, in contrast, we estimated the baseline population-car relationship through a Generalized Additive Model (GAM), as prior examination indicated an asymptotic trend when plotting the two variables irrespective of the grid cell index. Unlike the ratio method, GAMs allow the uncertainty to be quantified in both the estimation and the prediction phase. These models constitute a powerful extension of GLMs (Generalized Linear Models), whereby the linearity assumption of the predictor(s) can be relaxed through smoothing functions74.
Similar to GLMs, the response variable Y (herein no. of people) follows a probability distribution from the exponential family (herein Poisson), with the mean µ = E(Y ) linked to an additive non-parametric predictor η (herein no. of cars) through a link function g(.), such that g(µ) = E(η). For a given city, and month-year, the model can be thus simplified as:
$$g(\mu_i) = \beta_0 + f(x_i) + \varepsilon_i \tag{4}$$
where $\beta_0$ is the intercept (i.e., global average of cars), $f(\cdot)$ the cubic spline function, $\varepsilon$ the error term, and $i$ the grid cell index.
All GAM models were fitted through the mgcv R-package75, and model assumptions checked visually through the mgcViz package76. The parameters from the fitted models were ultimately used to predict the grid-level population for the matching months in 2020 and 2022 based on their average number of cars, akin to the ratio method.
For both approaches, we calculated the change in population size relative to the baseline population. Computations were performed at both the grid-cell and global levels, where the latter was calculated by summing up the grid-level populations.
Understanding the effect of imagery features on car detection
It is paramount to understand the influence of imagery-related characteristics on the CNN model’s capacity to detect cars, as it influences the IDP estimates. Cloud obstruction, sun elevation and off-nadir angles, for example, can influence the geometry of the image and hence the accuracy of the object detection77,78. Thus, to examine the overall effect of these features on car detection, we fitted a Generalized Linear Model (GLM) through the stats R-package. The number of cars was standardized by the area of the given imagery (i.e., car density), log-transformed, and modelled via a Gaussian probability distribution. Among the imagery features, we evaluated the effect of image resolution (categorical), off-nadir angle (numeric), sun elevation (numeric), cloud coverage (numeric), and presence of snow (categorical). Model assumptions were visually assessed to evaluate the residuals’ normality, homoscedasticity and independence.
Our results revealed that all tested covariates significantly affected car detection, except cloud coverage (Table 1). Higher image resolution was generally associated with higher densities of detected cars (Table 1; Fig. 7a and Supplementary Fig. S19). The presence of snow was negatively related to car densities, meaning that a higher number of detected cars tended to be associated with non-snowy days (Table 1; Fig. 7b). Supplementary Figure S20 shows an example of the impact of snow on car detection: snow reduces the contrast between cars and their surroundings, making it more difficult for the model to discern cars in the image.
Table 1.
Parameters | Estimate | 2.5% | 97.5% | t-value | P-value |
---|---|---|---|---|---|
Intercept | 3.44 | 2.85 | 4.02 | 11.55 | < 0.05 |
Image resolution: 0.4 m | −1.05 | −1.37 | −0.73 | −6.45 | < 0.05 |
Image resolution: 0.5 m | −1.90 | −2.20 | −1.61 | −12.59 | < 0.05 |
Snow presence: yes | −1.67 | −2.16 | −1.18 | −6.72 | < 0.05 |
Off-nadir angle | −0.03 | −0.05 | −0.02 | −3.99 | < 0.05 |
Sun elevation | 0.02 | 0.01 | 0.03 | 4.49 | < 0.05 |
Cloud coverage | −1.74 | −4.55 | 1.07 | −1.22 | 0.225 |
Estimated parameters are on a log-scale and are shown with their 95% confidence intervals, t-values, and P-values evaluated at the 5% significance level.
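Because the estimates in Table 1 are on a log-scale, they can be easier to read once back-transformed into multiplicative effects on car density. The sketch below performs this conversion for the significant covariates. It assumes that the reference levels are the 0.3 m resolution and the absence of snow, and that the numeric covariates are expressed per unit of the respective angle; neither is stated explicitly in the table.

```python
import math

# Point estimates copied from Table 1 (log-scale).
coefficients = {
    "Image resolution: 0.4 m": -1.05,
    "Image resolution: 0.5 m": -1.90,
    "Snow presence: yes": -1.67,
    "Off-nadir angle (per unit)": -0.03,
    "Sun elevation (per unit)": 0.02,
}


def percent_change(beta):
    """Percentage change in car density implied by a log-scale coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


effects = {name: round(percent_change(beta), 1) for name, beta in coefficients.items()}

for name, effect in effects.items():
    print(f"{name}: {effect:+.1f}% car density")
```

Under these assumptions, for example, the 0.4 m coefficient of −1.05 corresponds to exp(−1.05) ≈ 0.35, i.e., roughly 65% lower detected car density than at the assumed reference resolution, all else being equal.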
The off-nadir and sun elevation angles had opposing effects on car detection. Both were directly related to the occlusion of important infrastructure such as roads and parking lots by buildings or their shadows, which in turn affected car detectability (Supplementary Figs. S21 and S22). Whereas car densities were larger at smaller off-nadir angles, with maximum detectability around 25° (Table 1; Fig. 7c), they increased with larger sun elevation angles (Table 1; Fig. 7d).
Surprisingly, no significant impact of cloud obstruction could be detected from the present data. Although lower car densities seemed to be associated with images that were more heavily obstructed by clouds (Fig. 7e), the overall effect was not statistically significant (Table 1). It is noteworthy that all model assumptions were reasonably met, i.e., residual normality, homoscedasticity and independence. We refer to Supplementary Figure S23 for a visual overview of the model’s diagnostics.
Acknowledgements
Part of this work was done while MCR and IW were employed at the Qatar Computing Research Institute. IW’s work is supported by funding from the Alexander von Humboldt Foundation and its founder, the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung). The Article Processing Charges (APC) of this publication are funded by the Open Access fund of the Saarland University supported by the Deutschen Forschungsgemeinschaft (DFG). All authors thank the Qatar Computing Research Institute for providing funding for procuring the satellite imagery from Maxar. Special thanks to Katherine Hoffmann Pham for all constructive comments on an earlier draft. Furthermore, we thank Maxar for granting the sample images displayed in this research article, as well as the two anonymous reviewers for their insightful comments.
Author contributions
IW and FO conceptualized the study, with supportive contribution of MCR on its research design. MF performed the satellite data collection and imagery pre-processing. FO conducted the Machine Learning analysis. MCR conducted all data post-processing. MCR analyzed the data, with supportive contribution from IW and FO. All authors wrote the first draft of the manuscript, and read and approved the submitted version.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Data availability
All data and codes underlying the findings of the present study are available on the first author’s GitHub repository (https://github.com/mcruf/IDP_UKR). We nevertheless note that the original Maxar satellite images cannot be openly shared due to confidentiality reasons. These can, however, be made available from the first author upon prior permission from Maxar.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
Marie-Christine Rufener, Email: macrufener@gmail.com.
Ingmar Weber, Email: iweber@cs.uni-saarland.de.
References
- 1. UNHCR. Protecting internally displaced persons: a handbook for national human rights institutions. UNHCR Blog Netw. https://www.undp.org/publications/protecting-internally-displaced-persons-handbook-national-human-rights-institutions (2022).
- 2. UNHCR. Global displacement hits another record, capping decade-long rising trend. UNHCR Blog Netw. https://www.unhcr.org/asia/news/press/2022/6/62a9d2b04/unhcr-global-displacement-hits-record-capping-decade-long-rising-trend.html (2022).
- 3. IDMC. IDMC’s 2023 global report on internal displacement. iDMC Blog Netw. https://www.internal-displacement.org/publications/2023-global-report-on-internal-displacement-grid/ (2023).
- 4. Checchi, F. et al. Public health information in crisis-affected populations: a review of methods and their use for advocacy and action. Lancet 390, 2297–2313. 10.1016/S01406736(17) (2018).
- 5. Ratnayake, R., Abdelmagid, N. & Dooley, C. What we do know (and could know) about estimating population sizes of internally displaced people. J. Mig Heal 6, 100120. 10.1016/j.jmh.2022.100120 (2022).
- 6. IOM. DTM Ukraine – internal displacement report – general population survey round 18 (October 2024). IOM Blog Network. https://dtm.iom.int/reports/ukraine-internal-displacement-report-general-population-survey-round-18-october-2024 (2024).
- 7. UNHCR. Operational data portal Ukraine refugee situation. UNHCR Blog Netw. https://data.unhcr.org/en/situations/ukraine (2024).
- 8. Rowe, F., Neville, R. & González-Leonardo, M. Sensing population displacement from Ukraine using facebook data: potential impacts and settlement areas. Preprint at 10.31219/osf.io/7n6wm (2022).
- 9. Baal, N. & Ronkainen, L. Obtaining representative data on idps: challenges and recommendations. UNHCR Blog Netw. https://www.unhcr.org/media/obtaining-representative-data-idps-challenges-and-recommendations (2017).
- 10. Leasure, D. R. et al. Nowcasting daily population displacement in Ukraine through social media advertising data. Pop Devel Rev. 49, 231–254. 10.1111/padr.12558 (2023).
- 11. Abdelmagid, N. & Checchi, F. Estimation of population denominators for the humanitarian health sector: Guidance for humanitarian coordination mechanisms. Health Cluster Blog Netw. https://healthcluster.who.int/publications/m/item/estimation-of-population-denominators-for-the-humanitarian-health-sector (2018).
- 12. Quinn, J. A. et al. Humanitarian applications of machine learning with remote-sensing data: review and case study in refugee settlement mapping. Phil Trans. R Soc. A 376, 20170363. 10.1098/rsta.2017.0363 (2018).
- 13. Weber, I. et al. Non-traditional data sources: providing insights into sustainable development. Comm. ACM 64, 88–95. 10.1145/3447739 (2021).
- 14. Wesolowski, A., Buckee, C. O., Engø-Monsen, K. & Metcalf, C. J. E. Connecting mobility to infectious diseases: the Promise and limits of mobile phone data. J. Infect. Dis. 214, 414–420. 10.1093/infdis/jiw273 (2016).
- 15. Williams, N. E., Thomas, T. A., Dunbar, M., Eagle, N. & Dobra, A. Measures of human mobility using mobile phone records enhanced with gis data. PLoS One 10, e0133630. 10.1371/journal.pone.0133630 (2015).
- 16. Breiman, L. Random forests. Mach. Learn. 45, 5–32. 10.1023/A:1010933404324 (2001).
- 17. Wesolowski, A. et al. Quantifying the impact of human mobility on malaria. Science 338, 267–270. 10.1126/science.1223467 (2012).
- 18. Oliver, N. et al. Mobile phone data for informing public health actions across the covid-19 pandemic life cycle. Sci. Adv. 6, eabc0764. 10.1126/sciadv.abc0764 (2020).
- 19. Bakker, M. A. et al. Measuring Fine-Grained Multidimensional Integration Using Mobile Phone Metadata: The Case of Syrian Refugees in Turkey. In Guide to mobile data analytics in refugee scenarios (eds. Salah, A. S., Pentland, A., Lepri, B. & Letouzé, E.) 123–140. 10.1007/978-3-030-12554-7_7 (2019).
- 20. Shibuya, Y., Jones, N. & Sekimoto, Y. Assessing internal displacement patterns in Ukraine during the beginning of the Russian invasion in 2022. Sci. Rep. 14, 11123. 10.1038/s41598-024-59814-w (2024).
- 21. Mazzoli, M. et al. Migrant mobility flows characterized with digital data. PLoS One 15, e0230264. 10.1371/journal.pone.0230264 (2020).
- 22. Palotti, J. et al. Monitoring of the Venezuelan exodus through facebook’s advertising platform. PLoS One 15, e0229175. 10.1371/journal.pone.0229175 (2020).
- 23. Zagheni, E., Garimella, K R, V., Weber, I. & State, B. Inferring international and internal migration patterns from twitter data. Proc. Int. Conf. WWW. 439–444. 10.1145/2567948.2576930 (2014).
- 24. Coscieme, L., Sutton, P. C., Anderson, S., Liu, Q. & Elvidge, C. D. Dark times: Nighttime satellite imagery as a detector of regional disparity and the geography of conflict. Gisci Rem. Sens. 54, 118–139. 10.1080/15481603.2016.1260676 (2017).
- 25. Witmer, F. D. & O’Loughlin, J. Detecting the effects of wars in the caucasus regions of Russia and georgia using radiometrically normalized dmsp-ols nighttime lights imagery. Gisci Rem. Sens. 48, 478–500. 10.2747/1548-1603.48.4.478 (2011).
- 26. Burnett, E. et al. Ukrainians are fleeing in droves. But they’re waiting more than 60 hours at the border. CNN Blog Netw. https://edition.cnn.com/2022/02/26/europe/ukraine-russia-invasion-refugee-border-crossing-wait-kyiv-lviv-intl/index.html (2022).
- 27. HRW. Fleeing war in Ukraine: people waiting to cross border need humanitarian assistance. HRW Blog Netw. https://www.hrw.org/news/2022/02/28/fleeing-war-ukraine (2022).
- 28. Vasovic, A. & Zinets, N. Cars choke roads as ukrainians flee Russian invaders. Reuters Blog Netw. https://www.reuters.com/world/europe/cars-choke-roads-ukrainians-flee-russian-invaders-2022-02-25/ (2022).
- 29. IFRC. Displacement in a changing climate. IFRC Blog Netw. https://www.ifrc.org/document/displacement-in-a-changing-climate (2021).
- 30. UNHCR. Mid-year trends 2023. UNHCR Blog Network. https://www.unhcr.org/mid-year-trends-report-2023 (2023).
- 31. Shankar, S. et al. No classification without representation: Assessing geodiversity issues in open data sets for the developing world. Preprint at https://arxiv.org/abs/1711.08536 (2017).
- 32. de Vries, T., Misra, I., Wang, C. & van der Maaten, L. Does object recognition work for everyone? Preprint at https://arxiv.org/abs/1906.02659 (2019).
- 33. Raji, D., Bender, E. M., Paullada, A., Denton, E. & Hanna, A. AI and the everything in the whole wide world benchmark. Preprint at https://arxiv.org/abs/2111.15366 (2021).
- 34. Torralba, A. & Efros, A. A. Unbiased look at dataset bias. Proc. IEEE/CVF Conf. Comp. Vis. Pat. Recog., 1521–1528. 10.1109/CVPR.2011.5995347 (2011).
- 35. UCS. UCS Blog Network. https://www.ucsusa.org/resources/satellite-database (2023).
- 36. Buczkowski, A. GeoAwesome Blog Network. https://geoawesome.com/demystifying-satellite-data-pricing-a-comprehensive-guide (2023).
- 37. Satellogic. Satellogic Blog Network. https://satellogic.com/2023/01/24/now-you-see-transparent-pricing-for-eo-market-growth/ (2024).
- 38. Gao, S. Spatio-temporal analytics for exploring human mobility patterns and urban dynamics in the mobile age. Spat. Cogn. Comp. 15, 86–114. 10.1080/13875868.2014.984300 (2015).
- 39. Pu, T., Huang, C., Yang, J. & Huang, M. Transcending time and space: Survey methods, uncertainty, and development in human migration prediction. Sustainability 15, 10584. 10.3390/su151310584 (2023).
- 40. Bircan, T. & Korkmaz, E. E. Big data for whose sake? Governing migration through artificial intelligence. Humanit. Soc. Sci. Commun. 8, 241. 10.1057/s41599-021-00910-x (2021).
- 41. Dijstelbloem, H. Migration tracking is a mess. Nature 543, 32–34. 10.1038/543032a (2017).
- 42. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
- 43. Ahonen, T., Hadid, A. & Pietikäinen, M. Face recognition with local binary patterns in Computer Vision - ECCV 2004. Lecture Notes in Computer Science, vol 3021 (ed. Pajdla, T. & Matas, J.) 469–481. 10.1007/978-3-540-24670-1_36 (2004).
- 44. Dalal, N. & Triggs, B. Histograms of oriented gradients for human detection. Comput. Soc. Conf. 1, 886–893. 10.1109/CVPR.2005.177 (2005).
- 45. Lowe, D. G. Object recognition from local scale-invariant features. IEEE Int. Conf. Comp. Vis. 2, 1150–1157. 10.1109/ICCV.1999.790410 (1999).
- 46. Bar, D. E. & Raboy, S. Moving car detection and spectral restoration in a single satellite worldview-2 imagery. IEEE J. Sel. Top. Appl. Rem. Sen 6, 2077–2087. 10.1109/JSTARS.2013.2253088 (2013).
- 47. Eikvil, L., Aurdal, L. & Koren, H. Classification-based vehicle detection in high-resolution satellite images. ISPRS J. Photo Rem. Sens. 64, 65–72. 10.1016/j.isprsjprs.2008.09.005 (2009).
- 48. Leitloff, J., Hinz, S. & Stilla, U. Vehicle detection in very high resolution satellite images of city areas. IEEE Trans. Geosci. Rem. Sens. 48, 2795–2806. 10.1109/TGRS.2010.2043109 (2010).
- 49. Lam, D. et al. xview: Objects in context in overhead imagery. Preprint at https://arxiv.org/abs/1802.07856 (2018).
- 50. Xia, G. S. et al. Dota: A large-scale dataset for object detection in aerial images. Preprint at https://arxiv.org/abs/1711.10398 (2019).
- 51. Li, K., Wan, G., Cheng, G., Meng, L. & Han, J. Object detection in optical remote sensing images: a survey and a new benchmark. ISPRS J. Photo Rem. Sens. 159, 296–307. 10.1016/j.isprsjprs.2019.11.023 (2020a).
- 52. Sun, X. et al. FAIR1M: a benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery. ISPRS J. Photo Rem. Sens. 184, 116–130. 10.1016/j.isprsjprs.2021.12.004 (2022).
- 53. Girshick, R. Fast R-CNN. IEEE Int. Conf. Comp. Vis., 1440–1448. 10.1109/ICCV.2015.169 (2015).
- 54. Lin, T. Y., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. IEEE Int. Conf. Comp. Vis. 42, 2980–2988. 10.1109/ICCV.2017.324 (2017).
- 55. Liu, W. et al. SSD: Single shot multibox detector. in Computer Vision – ECCV 2016. Lecture Notes in Computer Science, vol 9905 (ed. Leibe, B., Matas, J., Sebe, N. & Welling, M.) 21–37. 10.1007/978-3-319-46448-0_2 (2016).
- 56. Redmon, J., Divvala, S., Girshick, R. & Farhadi, A. You only look once: Unified, real-time object detection. Preprint at https://arxiv.org/abs/1506.02640 (2016).
- 57. Akyon, F. C., Altinuc, S. O. & Temizel, A. Slicing aided hyper inference and fine-tuning for small object detection. IEEE Image Proc. 966–970. 10.1109/ICIP46576.2022.9897990 (2022).
- 58. Cao, L., Wang, C. & Li, J. Vehicle detection from highway satellite images via transfer learning. Inf. Sci. 366, 177–187. 10.1016/j.ins.2016.01.004 (2016).
- 59. Cao, L. et al. Weakly supervised vehicle detection in satellite images via multi-instance discriminative learning. Patt Rec 64, 417–424. 10.1016/j.patcog.2016.10.033 (2017).
- 60. Chen, X., Xiang, S., Liu, C. L. & Pan, C. H. Vehicle detection in satellite images by hybrid deep convolutional neural networks. IEEE Geosci. Rem. Sens. Lett. 11, 1797–1801. 10.1109/LGRS.2014.2309695 (2014).
- 61. Ding, J. et al. Object detection in aerial images: a large-scale benchmark and challenges. IEEE Trans. Pat. Anal. Mach. Intel. 44, 7778–7796. 10.1109/TPAMI.2021.3117983 (2021).
- 62. Froidevaux, A. et al. Vehicle detection and counting from VHR satellite images: efforts and open issues. IEEE Int. Geosci. Rem. Sens. Symp. 256–259. 10.1109/IGARSS39084.2020.9323827 (2020).
- 63. Long, Y., Gong, Y., Xiao, Z. & Liu, Q. Accurate object localization in remote sensing images based on convolutional neural networks. IEEE Trans. Geosci. Rem. Sens. 55, 2486–2498. 10.1109/TGRS.2016.2645610 (2017).
- 64. Yang, X. et al. Scrdet: Towards more robust detection for small, cluttered and rotated objects. Proc. IEEE/CVF Inter. Conf. Comp. Vis. 8232–8241. 10.1109/ICCV.2019.00832 (2019).
- 65. Minetto, R., Segundo, M. P., Rotich, G. & Sarkar, S. Measuring human and economic activity from satellite imagery to support city-scale decision-making during covid-19 pandemic. IEEE Trans. Big Data 7, 56–68. 10.1109/TBDATA.2020.3032839 (2021).
- 66. Tatem, A. J. Worldpop, open data for spatial demography. Sci. Data 4, 170004. 10.1038/sdata.2017.4 (2017).
- 67. Bondarenko, M. et al. Gridded population estimates for Ukraine using un cod-ps estimates 2020, version 2.0. WorldPop. 10.5258/SOTON/WP00735 (2022).
- 68. Stevens, F. R., Gaughan, A. E., Linard, C. & Tatem, A. J. Disaggregating census data for population mapping using random forests with remotely-sensed and ancillary data. PLoS One 10, e0107042. 10.1371/journal.pone.0107042 (2015).
- 69. WorldPop. Top-down estimation modelling: constrained vs unconstrained. WorldPop Blog Netw. https://www.worldpop.org/methods/top_down_constrained_vs_unconstrained/
- 70. Mooney, P. et al. A review of openstreetmap data. in Mapping and the Citizen Sensor (ed. Giles, F.) 37–59 (2017).
- 71. Padgham, M., Lovelace, R., Salmon, M. & Rudis, B. osmdata. J. Op Sour Soft 2. 10.21105/joss.00305 (2017).
- 72. George, M. V., Smith, S. K., Swanson, D. A. & Tayman, J. Population projections. in The Methods and Materials of Demography (ed. Jacob, S. S. & David, A. S.) 561–601 (2004).
- 73. Pebesma, E. Simple features for R: standardized support for spatial vector data. R J. 10, 439–446. 10.32614/RJ-2018-009 (2018).
- 74. Hastie, T. & Tibshirani, R. Generalized Additive Models 352 (Chapman and Hall/CRC, 1990).
- 75. Wood, S. N. Generalized Additive Models: An Introduction with R 2nd edn 476 (Chapman and Hall/CRC, 2017).
- 76. Fasiolo, M., Nedellec, R., Goude, Y. & Wood, S. Scalable visualisation methods for modern generalized additive models. J. Comp. Grap Stat. 1, 78–86. 10.1080/10618600.2019.1629942 (2019).
- 77. Li, W., Zou, Z. & Shi, Z. Deep matting for cloud detection in remote sensing images. IEEE Trans. Geosci. Rem. Sens. 58, 8490–8502. 10.1109/TGRS.2020.2988265 (2020b).
- 78. Wang, J. et al. Learning to extract building footprints from off-nadir aerial images. IEEE Trans. Pat. Anal. Mach. Intel. 45, 1294–1301. 10.1109/TPAMI.2022.3162583 (2022).