Abstract
Objective
Community health systems operating in remote areas require accurate information about where people live to efficiently provide services across large regions. We sought to determine whether machine learning analysis of satellite imagery can be used to map remote communities to facilitate service delivery and planning.
Materials and Methods
We developed a method for mapping communities using a deep learning approach that excels at detecting objects within images. We trained an algorithm to detect individual buildings, then examined building clusters to identify groupings suggestive of communities. The approach was validated in southeastern Liberia by comparing algorithmically generated results with community location data collected manually by enumerators and community health workers.
Results
The deep learning approach achieved 86.47% positive predictive value and 79.49% sensitivity with respect to individual building detection. The approach identified 75.67% (n = 451) of communities registered through the community enumeration process, and identified an additional 167 potential communities not previously registered. Several instances of false positives and false negatives were identified.
Discussion
Analysis of satellite images is a promising solution for mapping remote communities rapidly and at relatively low cost. Further research is needed to determine whether the communities identified algorithmically, but not registered in the manual enumeration process, are currently inhabited.
Conclusions
To our knowledge, this study represents the first effort to apply image recognition algorithms to rural healthcare delivery. Results suggest that these methods have the potential to enhance community health worker scale-up efforts in underserved remote communities.
Keywords: global health, public health surveillance, community health workers, deep learning, neural networks
INTRODUCTION
In rural communities worldwide, poor access to quality healthcare services has impeded progress toward achieving the United Nations Sustainable Development Goals. Despite some important gains in improving access to reproductive, maternal, neonatal, and child health interventions, inequities in rural coverage persist and appear to be most serious among populations living in the most hard-to-reach regions.1,2 One of the most promising strategies for improving care access among remote rural populations is the expansion of community health worker (CHW) programs, wherein a trusted cadre of community members is trained to deliver basic health services to their communities.3 In the last decade, a growing body of literature has documented the effectiveness of CHWs in improving health outcomes,4–8 stimulating a renewed interest in scaling up these programs as a means of reducing rural health disparities.9
A persistent barrier to the expansion of CHW programs in remote areas has been the limited ability to efficiently and accurately enumerate population catchment areas. The capacity to identify the communities, households, and individuals that fall under a health system’s purview is a first-order task for any health service planning effort, and one that is especially critical for CHW programs, which typically provide home-based, rather than facility-based, care.10 In settings where census data are missing or extremely outdated, identifying and mapping communities manually is a time-consuming and expensive task for CHW programs, which further recognize that manual methods may be prone to underidentification. Particularly in settings where vital registration systems are weak, and accurate census data are unavailable or outdated, it is likely that a large proportion of the rural population is not well enumerated, and thus is underserved.11 Although the strengthening of national data collection systems to better identify these missing populations is a key priority,12,13 novel interim approaches are needed to more precisely monitor and deliver services to very rural and remote populations globally.
To overcome this barrier to CHW program expansion efforts, we sought to determine whether deep learning could enhance community mapping efforts across remote regions. Following recent work in the area of machine learning for development,14,15 we attempted to overcome established data gaps by leveraging publicly available satellite data.16 These data, which are increasingly available at a global scale, are collected with much higher frequency, and with more granular geographic scale, than are typically available when relying on census or household survey sources.17 Such satellite-based analyses may therefore be especially relevant for rural and remote community health systems, where accurate and rapidly updateable information is critically needed for appropriate service delivery and planning.
MATERIALS AND METHODS
Data
Community location predictions were generated using 2 data sources: the SpaceNet corpus18 and a rural satellite image dataset built specifically for this project. Predictions were then evaluated using an external dataset of community census data. First, SpaceNet is a publicly available set of high spatial resolution, commercial satellite imagery that contains high-quality building annotations. At the time of the analysis, SpaceNet provided 650 000 annotated satellite images covering 5 cities (Khartoum, Sudan; Las Vegas, Nevada, USA; Paris, France; Rio de Janeiro, Brazil; and Shanghai, China); these were all used in the initial training.
Second, because SpaceNet data are heavily biased toward urban, and primarily developed, settings, we enriched these data using a set of initially unlabeled satellite images collected from rural, and less developed, settings. Images were selected from 5 locations: Guatemala, Uganda, Nepal, and Senegal, as well as from areas of Liberia that fell outside the catchment area of the external validation dataset. For the entirety of each country, we collected 1 km × 1 km image tiles from 50 cm resolution DigitalGlobe (Westminster, CO) imagery accessed via the Bing Maps API. This additional rural image dataset was then randomly sampled to find candidates to annotate. An additional 26 180 rural images from the candidate set of rural images were ultimately labeled.
The external validation data was a geographic information systems dataset identifying all known communities within the health service catchment area of Last Mile Health (LMH), a nonprofit organization that focuses on implementing CHW programs to improve healthcare access and quality in rural settings.19 Starting in 2016, LMH worked together with the Liberia Ministry of Health to scale up a national professionalized cadre of CHWs across 14 of Liberia’s 15 counties. In preparation for this scale-up, LMH undertook a community census process in 2015 to identify the locations of all known communities in the Rivercess County area, as well as household counts for each community. Community geolocation data were obtained by sending a team of trained enumerators into the field with handheld GPS devices (Garmin eTrex 10, Olathe, Kansas, USA) to collect road data, and Android phones to collect community locations and associated information. For remote communities (defined as communities that are >5 km from the nearest health facility), these household counts were later updated by CHWs, who created hand-drawn maps of catchment areas (Figure 1), then registered every individual living in each household to a community to enable efficient service planning. Geolocation data were systematically updated in 2016 and 2017 by comparing the existing geographic information systems dataset with CHW-collected household registration data and resolving location conflicts with in-person verification. Further, as additional communities were identified, new communities were established, or existing communities were abandoned, data were updated on an ad hoc basis as information became available.
Figure 1.
A community health worker catchment area map.
Analysis
The basis of the community prediction approach involved algorithmically facilitated recognition of individual buildings from satellite imagery across large geographic areas. Following prior work that used similar methods to measure regional poverty in Africa,20,21 we began with a set of high-quality, annotated satellite image data from urban contexts (SpaceNet), and used these as a starting point to generate an algorithm capable of detecting buildings in urban areas. Over successive training iterations we further refined the algorithm to better generalize to the task of predicting buildings in rural regions by using additional images collected from rural regions. The overall approach to the study design is presented in Figure 2.
Figure 2.
Study design. DBSCAN: density-based spatial clustering of applications with noise.
TensorBox,20,21 a type of deep neural network, was used for all building recognition tasks. Belonging to a general class of machine learning methods termed deep learning, a neural network constructs a hierarchy of layers, each containing algorithmically learnable weights and biases ultimately representing a differentiable function. Models are trained by supplying a series of inputs, along with their desired outputs, to produce a function that maps any new input to a corresponding output based on the previously supplied examples. After training, the model takes in an image and outputs a set of coordinates describing the bounding box for each building contained in the image. Bounding boxes were accompanied by confidence values.
Using TensorBox, we first trained a model to identify buildings in the candidate set of unlabeled rural images. To accomplish this process, we started with the SpaceNet-trained algorithm, then randomly sampled from the candidate set of rural images. We ran the sampled rural images through the algorithm, which output a grid of bounding box predictions with associated confidence scores. Prediction accuracy was evaluated with the Jaccard Index (JI), which reflects the overlap between the predicted and actual data and is defined as the intersection area divided by the union area of the predicted and ground truth polygons. When the algorithm detected a building in the candidate rural image (ie, output any bounding boxes with a JI score over 0.6), the annotation was set aside and we visually confirmed that a building was present. Our stopping criterion was the change in F1 score from the previous model in a held-out test set. When the addition of further images did not improve the F1 score, we concluded that the model had achieved saturation. Of the candidate rural images, 26 180 images were labeled in this process. Using this approach, we attempted to balance the ability to obtain additional training images from diverse rural settings against the need to limit extensive manual verification.
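The Jaccard Index described above can be illustrated with a minimal sketch. It assumes axis-aligned bounding boxes given as `(x_min, y_min, x_max, y_max)` tuples; the function name and box format are illustrative choices, not taken from the project's source code.

```python
def jaccard_index(box_a, box_b):
    """Intersection area over union area of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two 2 x 2 boxes offset by 1 unit share a 1 x 1 overlap: JI = 1 / 7.
print(jaccard_index((0, 0, 2, 2), (1, 1, 3, 3)))
```

A value of 1 means the two boxes coincide and 0 means they do not overlap, so thresholds such as 0.6 or 0.2 set how tightly a prediction must align with an annotation to count as a match.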
After a sufficient number of rural images were collected and annotated, we split the rural image dataset into training (n = 20 926) and testing sets (n = 5254) using an 80:20 ratio. Data splits were performed on a per-country basis to maximize training diversity and facilitate greater generalizability. To do this, we first randomly shuffled the images for each country, then performed the necessary splits to ensure reasonable spatial coverage across locations. Further, we ensured that the training and testing sets contained an even ratio of images containing buildings. After splitting the rural image data, we retrained TensorBox from scratch to identify individual buildings, and then evaluated the quality of the individual building detections on the held-out 20% testing data.
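A per-country 80:20 split of the kind described above might look like the following sketch. The function name, the dictionary input, and the fixed seed are hypothetical; the sketch shuffles within each country only and does not reproduce the additional balancing of images with and without buildings.

```python
import random


def split_by_country(images_by_country, test_fraction=0.2, seed=0):
    """Shuffle each country's images and split them into train and test lists.

    images_by_country maps a country name to a list of image identifiers
    (a hypothetical input format). Returns (train, test) lists of
    (country, image) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for country in sorted(images_by_country):
        images = list(images_by_country[country])
        rng.shuffle(images)
        n_test = round(len(images) * test_fraction)
        test.extend((country, image) for image in images[:n_test])
        train.extend((country, image) for image in images[n_test:])
    return train, test


# Illustrative identifiers only; these are not the study's image files.
example = {
    "Liberia": [f"lbr_{i}" for i in range(10)],
    "Nepal": [f"npl_{i}" for i in range(10)],
}
train_set, test_set = split_by_country(example)
print(len(train_set), len(test_set))
```

Splitting inside each country, rather than over the pooled images, keeps every country represented in both sets at roughly the same ratio.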
In this second phase of the building prediction process, we used a lower JI criterion of 0.2 to detect positive matches, rather than the 0.6 JI criterion used in the candidate image sampling process. The lower score was used to reduce inappropriate misclassification of false negatives. We iterated through each predicted bounding box ordered by the predicted confidence and matched it to a bounding box in the ground truth annotation. Once a match was identified, the label was removed from the set, ensuring that a single ground truth label matched only a single prediction. At the end of the process, if no match was identified, the detection was considered to be a false positive. Remaining bounding boxes identified in the ground truth annotation were considered false negatives. We then calculated positive predictive value (precision), sensitivity (recall), and F1 score. In addition, we calculated system run time and cost to inform programmatic utility. We also conducted sensitivity analyses comparing the relative performance of TensorBox with that of 2 other neural networks.
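The matching procedure above can be sketched as follows. Several details are assumptions rather than documented choices: predictions are processed from highest to lowest confidence, each prediction is paired with the remaining ground truth box that overlaps it most, and a JI equal to the threshold counts as a match. All function names are illustrative.

```python
def iou(a, b):
    """Jaccard Index of two boxes given as (x_min, y_min, x_max, y_max)."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(predictions, ground_truth, threshold=0.2):
    """Count true positives, false positives, and false negatives.

    predictions is a list of (confidence, box); ground_truth is a list of boxes.
    """
    unmatched = list(ground_truth)
    tp = fp = 0
    # Assumption: highest-confidence predictions are matched first.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_idx, best_score = None, threshold
        for i, gt_box in enumerate(unmatched):
            score = iou(box, gt_box)
            if score >= best_score:
                best_idx, best_score = i, score
        if best_idx is None:
            fp += 1  # no remaining label overlaps enough
        else:
            unmatched.pop(best_idx)  # each label can match only one prediction
            tp += 1
    return tp, fp, len(unmatched)  # leftover labels are false negatives


def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative boxes only: one good detection, one duplicate, one stray detection.
preds = [(0.9, (0, 0, 10, 10)), (0.8, (0, 0, 10, 10)), (0.5, (100, 100, 110, 110))]
truth = [(1, 1, 11, 11), (50, 50, 60, 60)]
counts = match_detections(preds, truth)
print(counts, precision_recall_f1(*counts))
```

Removing a label once it is matched is what turns the duplicate detection in the example into a false positive rather than a second true positive.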
Finally, to evaluate the community detection results we used a clustering method, density-based spatial clustering of applications with noise (DBSCAN),22 to identify groups of densely connected buildings indicative of a community, the natural spatial granularity for community-based health systems. We parameterized DBSCAN a priori based on a heuristic definition of communities in Rivercess County in consultation with LMH field staff. Two parameters were required: (1) the minimum number of points needed to form a cluster and (2) the maximum distance allowable between points belonging to a cluster. The community definition was therefore operationalized in DBSCAN as a collection of 3 or more buildings detected within 500 m. We performed external validation of the community predictions using the independent community census data collected by LMH. We first calculated a 500-m buffer around each predicted cluster, and around the community location GPS coordinates in the community health system census data, then compared the overlap. In cases where the results diverged, we visually examined the satellite imagery using a manual verification process. A community detection false positive was defined as occurring when a community was predicted by the model yet inspection of the satellite imagery, and comparison with the ground truth dataset, showed that no community was present, or that the community fell outside the 500-m buffer. A false negative was defined as occurring when the model did not predict a community registered in the community health system census data and the imagery suggested that one was present (Figure 3). We distinguished between false negatives due to the model performance and false negatives due to imagery or cloud coverage. Five reviewers independently evaluated all incidences of false positives and false negatives. Interrater reliability was measured using Fleiss’ kappa statistic.
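To make the clustering step concrete, the following is a minimal pure-Python sketch of DBSCAN with the two parameters described above (a minimum of 3 points, a maximum distance of 500 m). It assumes each detected building is reduced to a single point in a projected coordinate system measured in metres, and that the minimum count includes the point itself; both are illustrative assumptions, and this is not the implementation used in the study.

```python
import math

NOISE = -1


def dbscan(points, eps, min_samples):
    """Return a cluster label per (x, y) point, or NOISE for unclustered points."""

    def neighbours(i):
        xi, yi = points[i]
        # Includes the point itself, so min_samples counts it as well.
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # core point: keep growing the cluster
        cluster += 1
    return labels


# Illustrative building centroids in metres (not study data).
buildings = [(0, 0), (100, 0), (200, 100), (5000, 5000), (5100, 5000), (20000, 20000)]
labels = dbscan(buildings, eps=500, min_samples=3)
print(labels)
```

In the example, the first three buildings form one candidate community, while the pair of buildings and the isolated building fall below the 3-building minimum and are labelled as noise.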
Figure 3.
Satellite imagery visual inspection examples. (A) The algorithmic method detected a community not previously registered in the community census data; visual inspection suggests that a community may be present at the location. (B) A community was predicted by the algorithmic method; visual inspection suggests that no community is present at the location. (C) Community census data registered a community; visual inspection suggests that a community is likely not present at the location. (D) The algorithmic method did not predict a community registered in the community health system census data; visual inspection suggests that a community is present at the location. (E) The satellite image is considered missing due to cloud coverage. (F) The satellite image is considered missing due to a problem with the satellite image.
This was an open source and open access project. JavaScript was used for data acquisition; Python, specifically PyTorch, was used for machine learning tasks; and R (version 3.2) was used to calculate interrater reliability. Source code is available at github.com/ArnholdInstitute.
The Icahn School of Medicine at Mount Sinai Institutional Review Board reviewed the study and determined that it did not constitute human subjects research because individual-level data were not collected.
RESULTS
Building detection
For individual building detection, the algorithm achieved 86.47% positive predictive value and 79.49% sensitivity in the unseen testing data. The F1 score was 0.83. In sensitivity analyses, 2 additional detection algorithms performed similarly to the selected approach (Supplementary Appendix).
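As a check on the reported values, the F1 score is the harmonic mean of positive predictive value and sensitivity, and the worked calculation below reproduces the reported 0.83 from the two percentages above.

```latex
F_1 = \frac{2 \cdot \mathrm{PPV} \cdot \mathrm{sensitivity}}{\mathrm{PPV} + \mathrm{sensitivity}}
    = \frac{2 \times 0.8647 \times 0.7949}{0.8647 + 0.7949}
    = \frac{1.3747}{1.6596}
    \approx 0.83
```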
Community detection
Across a land area of approximately 9725 km², the neural network approach identified 646 distinct building groupings that suggested a community was present. Comparing the algorithm-generated results with the health system community census data, the algorithmic approach detected 451 of the 596 communities that had previously been registered and validated in the community census (75.67%) (Table 1). Overall, 28 false positives and 83 false negatives were detected. Of the 83 false negatives, 10 were due to missing imagery and 56 were due to cloud coverage errors. We further identified 167 building groupings suggestive of a community that had not been previously registered within the health system community census. Agreement between the 5 satellite image raters was 0.76, indicating a substantial degree of agreement.
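The agreement statistic named in the methods, Fleiss' kappa, can be computed as in the sketch below. The input format (one row per rated item, one count per category) and the example ratings are illustrative assumptions and are not the study's rating data, which were analyzed in R.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of rating counts.

    ratings[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Share of all assignments that went to each category.
    p_j = [sum(row[j] for row in ratings) / (n_items * n_raters)
           for j in range(n_categories)]

    # Observed agreement for each item, then averaged over items.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in ratings]
    p_bar = sum(p_i) / n_items

    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Illustrative table: 4 locations, 5 raters, categories (present, not present).
example_ratings = [[5, 0], [0, 5], [5, 0], [0, 5]]
print(fleiss_kappa(example_ratings))
```

A value of 1 indicates complete agreement among raters, and values near 0 indicate agreement no better than chance.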
Table 1.
Community detection results comparing algorithmic results, health system census results and visual inspection of satellite imagery
| | Health system census indicated a community present | Health system census indicated no community present | Total |
|---|---|---|---|
| Algorithmic method detected a community | 451 | 195a | 646 |
| Algorithmic method detected no community | 145 | | |
| Total | 596 | | |
| | Visual inspection of satellite imagery indicated community potentially present | Visual inspection of satellite imagery indicated no community likely present | Total |
| Algorithmic method detected a community | 618b | 28 | 646 |
| Algorithmic method detected no community | 83c | | |
| Total | 701 | | |
| | Visual inspection of satellite imagery indicated community potentially present | Visual inspection of satellite imagery indicated no community likely present | Total |
| Health system census indicated community | 535 | 61 | 596 |

a Includes 28 false positives and 167 potential missing communities.
b Includes 451 correctly identified communities and 167 potentially missing communities.
c Includes image samples that could not be evaluated due to missing imagery (n = 10) and cloud coverage (n = 56).
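The counts in Table 1 and its notes are linked by simple arithmetic, which the short sketch below reproduces. The variable names are illustrative; every number is taken from the table and the results text.

```python
registered = 596               # communities in the health system census
detected_total = 646           # building groupings found by the algorithm
detected_and_registered = 451  # found by the algorithm and in the census
false_positives = 28
potential_new = 167            # detected, not registered, plausible on inspection
false_negatives = 83

detected_not_registered = detected_total - detected_and_registered
missed_registered = registered - detected_and_registered
plausible_detections = detected_and_registered + potential_new
detection_rate = round(100 * detected_and_registered / registered, 2)

print(detected_not_registered)                 # detected but absent from the census
print(missed_registered)                       # registered but not detected
print(plausible_detections)                    # detections supported by visual inspection
print(plausible_detections + false_negatives)  # column total for "potentially present"
print(detection_rate)                          # percentage of registered communities detected
```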
Computational efficiency and cost
The computational time was 1.15 s/image. To put this in perspective, it would take the TensorBox algorithm approximately 30 hours to infer buildings for the 92 786 image tiles that cover the entire country of Liberia. Using Amazon’s EC2 cloud compute platform at $0.41/hour (g3.4XL spot instance), the computational cost of conducting the building detection analysis for the health system service area was $12.30 ($1946.48 LRD). To repeat this process to detect communities across the entire country of Liberia, the total computational cost would be approximately $141 USD ($22 313 LRD).
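The run-time estimate above follows from multiplying the number of tiles by the per-image time, and instance cost is hours multiplied by the hourly rate. The sketch below shows that arithmetic; the function names are illustrative, and it does not account for data transfer, storage, or start-up overhead.

```python
def runtime_hours(n_tiles, seconds_per_tile):
    """Total inference time in hours for a given number of image tiles."""
    return n_tiles * seconds_per_tile / 3600


def compute_cost(hours, rate_per_hour):
    """Instance cost in dollars for a given number of instance-hours."""
    return hours * rate_per_hour


hours = runtime_hours(92_786, 1.15)  # tile count and per-image time from the text
print(round(hours, 1))               # close to the quoted "approximately 30 hours"
print(round(compute_cost(30, 0.41), 2))  # cost of 30 instance-hours at $0.41/hour
```

Because both quantities scale linearly, the same two functions can be reused for any other tile count or instance price.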
DISCUSSION
To the best of our knowledge, this work represents the first attempt to use satellite-based neural network methods to automate the identification of communities in very rural areas to facilitate improved health service delivery. When compared with existing health system community census data, this method correctly detected 75.67% of registered communities, and identified an additional 167 building groupings, indicative of communities, that had not previously been identified or registered. In-person validation is needed to confirm if these newly detected communities are inhabited, and to determine the health service needs of individuals potentially residing in these locations. All of the data used in these analyses came from publicly available sources, and the estimated computational cost of producing the estimates was $12.30. Results suggest that satellite image–based approaches to community mapping can serve as a fruitful complement to existing field-based data collection methods. These methods have the potential to improve the reach and effectiveness of CHW programs providing care to underserved rural and remote communities globally.
Within many low- and middle-income countries, accurate community maps and census data are not available at the administrative level needed for effective health services planning and delivery efforts. In locations without readily available topographic data or codified address systems, collecting this household and community information is a manually intensive process that is prone to error. Base community maps are often created by equipping motorcyclists, bicyclists, or walkers with GPS devices to ride or walk along established routes to create formal maps of road networks.23,24 While practical, this approach is resource-intensive, both in terms of cost and time for CHW programs with limited resources. Further, the manual approach is known to systematically miss certain types of communities, including very small communities and those inaccessible by passable roads. Automating the manual process to generate preliminary maps that can be refined efficiently over time with ground validation has the potential to facilitate faster, more complete, more cost-effective, scalable solutions to community mapping over large geographic areas.
Our study had several limitations. First, the results we present are preliminary and although they illustrate the potential of satellite-based neural networks in enhancing rural community detection, significant additional ground-based validation efforts are needed. For instance, by utilizing satellite data, we were unable to distinguish occupied buildings from those that may be vacant, abandoned, or used for commercial purposes. In Rivercess County, farming settlements are common, yet may only be utilized seasonally. Similarly, areas with extensive migration, or those in rapid transition, may not be well estimated by this method, which does not directly measure the presence of people.25 Recent work has successfully leveraged mobile phone,26 and other nonstatic data sources,27 in settings with rapid population movement. Future work is needed to evaluate these approaches in areas where complex settlement patterns are known or suspected, and to expand methods for integrating remotely sensed and ground-based community surveillance broadly.
A second important source of measurement error is the way that community location measurements were collected and operationalized. In the satellite image dataset, a community was defined as 3 or more buildings within 500 m, represented as a polygon. However, this definition is not necessarily consistent with the way in which the health system enumerators, CHWs, or community members themselves may define communities. No systematic community definition was specified in the health system household and community census data collection process, and community locations were represented as a single GPS coordinate.
A related issue is that, in this part of Liberia, the creation of new communities or the abandonment of existing ones can happen rapidly. For example, a community may move, or a small new community may be created relatively close to an existing one. In these situations, CHWs may incorporate the new community into their existing service delivery routines without officially reporting the change, or adjusting the GPS coordinates, in the community census dataset. Therefore, it is not possible to determine the exact source of error in all instances where the predicted community results and community census results diverged. In some cases, measurement error in the community census GPS coordinates could have resulted in an incorrectly classified false positive in the predicted results. On the other hand, it is possible that some of the algorithmically identified communities that appear to have been missed in the community enumeration process could actually be receiving healthcare services from CHWs in practice. Ultimately, a more valid comparison metric may be at the household level. However, as in-person validation is likely required to collect detailed household characteristics that cannot be estimated from satellite images with any reasonable confidence (such as household members’ age, sex, or medical needs), we restricted our approach to concentrate on community locations. In other words, the algorithmic approach is used to streamline the manually intensive community mapping activity; however, we stress that successful applications would likely require in-person validation.
A fourth measurement issue we encountered was instances of cloud coverage and missing imagery errors. In the case of cloud coverage, the desired image is completely or partially obscured by clouds; missing images commonly occur due to problems with the positioning of the satellite or other technical issues related to the data provider. Although cloud coverage and missing images represent limitations to the data, they may be reasonably overcome by collecting additional satellite imagery, which will likely improve with time. Thus, they do not necessarily represent a fundamental problem with the detection approach. On the other hand, our experience with imagery errors suggests that this approach may not work well in certain types of areas where cloud coverage is common. Further, this concern may generalize to other types of settlement cover such as snow, tree canopies or extensive air pollution.
A final limitation is that while the neural network method may work well in identifying communities in rural Liberia, the same approach may not generalize to other remote geographical contexts, or to nonrural settings. To a certain degree, this concern is offset by the fact that we used diverse training data to develop the approach; however, additional research is needed to determine the extent to which results may be generalizable. Further, it would be possible to retrain the algorithm for additional settings by collecting additional training data, or by using household information to further refine the predictions to better distinguish between inhabited and uninhabited communities. Given that all of the satellite image data used in these analyses are publicly available, many barriers to additional data collection are relatively low; however, additional human capital would be required.
As high-quality, longitudinal satellite data are increasingly available publicly and without cost,28 the method we present here could have utility across several domains, and may be useful for rapidly and inexpensively producing the granular information needed for other types of service planning and policy activities. Here, we focused on the problem of identifying communities in rural areas to aid in CHW service delivery and scale-up. However, the capacity to automatically identify hard-to-reach households and communities may prove fruitful in a number of other global health and development contexts. For example, polio eradication teams have already been using satellite image analyses to facilitate vaccination planning through settlement identification in northern Nigeria and also in the Democratic Republic of Congo.29–31 Incorporating machine learning into this process has the potential to reduce costs, and with appropriate training data, might provide improved accuracy over manual visual inspection. Similarly, the method could be used to help strengthen the registration of births and deaths in very rural regions that are poorly enumerated and where birth certificate coverage inequities are well documented and adult deaths are infrequently documented.11 Finally, though we focus our attention on rural remote locations, satellite-based machine learning mapping approaches may also have utility in highly urban yet poorly characterized settings, such as in the identification of informal settlements.
We present a method for identifying communities in rural and remote settings using a combination of satellite imagery and machine learning. The ability to quickly and inexpensively identify hard-to-reach and underserved regions is critical to expanding health service coverage and reducing health access disparities, and may have utility across a number of other applications. To our knowledge, this is the first study to apply advanced image-recognition, machine learning algorithms in the context of rural healthcare delivery. Linking such information to community health systems can be a powerful tool for service delivery, health policy planning, and advocacy, especially in low- and middle-income countries. This capacity to monitor health and population indicators with increasing granularity is consistent with calls for a more precision-oriented approach to global health,32,33 one that has the potential to facilitate greater targeting of health services to the places and people with the most pronounced need.
FUNDING
This work was supported by the UBS Optimus Foundation (no grant identification numbers), the United States Agency for International Development (Grant Number AID-OAA-F-16-00106), the Bill & Melinda Gates Foundation (Grant Number OPP1171814), and the National Science Foundation (Grant Number 1464297).
AUTHOR CONTRIBUTIONS
EB designed the methodology, conducted and interpreted the analyses, and drafted the manuscript. ML designed the methodology, conducted and interpreted the analyses, and critically reviewed and edited the manuscript. AK oversaw the acquisition of the data, conducted and interpreted the analyses, and critically reviewed and edited the manuscript. JD oversaw the acquisition of the data, conducted and interpreted the analyses, and critically reviewed and edited the manuscript. MD engaged in the design of the methods, conducted and interpreted the analyses, and critically reviewed and edited the manuscript. AB contributed to the design of the methods, interpreted the analyses, and critically reviewed and edited the manuscript. PD contributed to the design of the methods, interpreted the analyses, and critically reviewed and edited the manuscript. BS helped refine aspects of the methodology and critically reviewed and edited the manuscript. PJL helped refine aspects of the methodology and critically reviewed and edited the manuscript. PS was involved in all stages of the study design and implementation, contributed resources, and critically reviewed and edited the manuscript.
SUPPLEMENTARY MATERIAL
Supplementary material is available at Journal of the American Medical Informatics Association online.
ACKNOWLEDGMENTS
We thank the local health staff and field staff of Last Mile Health for their support.
CONFLICT OF INTEREST STATEMENT
AK and JD are employees of Last Mile Health. ML, EB, MD, AB, PD, BS, PJL, and PS have no interests to declare.
REFERENCES
- 1. Hogan DR, Stevens GA, Hosseinpoor AR, Boerma T.. Monitoring universal health coverage within the sustainable development goals: development and baseline data for an index of essential health services. Lancet Glob Heal 2018; 6 (2): e152–68. doi: 10.1016/S2214-109X(17)30472-2.N. [DOI] [PubMed] [Google Scholar]
- 2. World Health Organization, World Bank. Tracking Universal Health Coverage: First Global Monitoring Report. Geneva, Switzerland: World Health Organization Press; 2015. [Google Scholar]
- 3.Lehmann U, Sanders D. Community health workers: what do we know about them. The state of the evidence on programmes, activities, costs and impact on health outcomes of using community health workers. Geneva: World Health Organization. 2007:1–42. https://www.who.int/hrh/documents/community_health_workers.pdf. (accessed February 2, 2018). [Google Scholar]
- 4. Lewin M-B, Glenton D, Bosch-Capblanch VW, Odgaard-Jensen J, Aja ZS.. Lay health workers in primary and community health care for maternal and child health and the management of infectious diseases. Cochrane Database Syst Rev 2010: 3: CD004015.. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Kredo T, Adeniyi FB, Bateganya M, Pienaar ED. Task shifting from doctors to non-doctors for initiation and maintenance of antiretroviral therapy. Cochrane Database Syst Rev 2014; 7: CD007331. doi: 10.1002/14651858.CD007331.pub3.
- 6. Okwundu CI, Nagpal S, Musekiwa A, Sinclair D. Home- or community-based programmes for treating malaria. Cochrane Database Syst Rev 2013; 5: CD009527. doi: 10.1002/14651858.CD009527.pub2.
- 7. Lassi Z, Haider B, Bhutta Z. Community-based intervention packages for reducing maternal and neonatal morbidity and mortality and improving neonatal outcomes. Cochrane Database Syst Rev 2011; 73: CD007754.
- 8. Perry H, Zulliger R, Rogers M. Community health workers in low-, middle-, and high-income countries: an overview of their history, recent evolution, and current effectiveness. Annu Rev Public Health 2014; 35 (1): 399–421.
- 9. McCord G, Liu A, Singh P. Deployment of community health workers across rural sub-Saharan Africa: financial considerations and assumptions. Bull World Health Organ 2012; 91: 244–53B. doi: 10.2471/BLT.12.109660.
- 10. Pallas SW, Minhas D, Pérez-Escamilla R, Taylor L, Curry L, Bradley EH. Community health workers in low- and middle-income countries: what do we know about scaling up and sustainability? Am J Public Health 2013; 103 (7): e74. doi: 10.2105/AJPH.2012.301102.
- 11. Bhatia A, Ferreira LZ, Barros AJD, Victora CG. Who and where are the uncounted children? Inequalities in birth certificate coverage among children under five years in 94 countries using nationally representative household surveys. Int J Equity Health 2017; 16 (1): 148–51. doi: 10.1186/s12939-017-0635-6.
- 12. AbouZahr C, De Savigny D, Mikkelsen L, et al. Civil registration and vital statistics: progress in the data revolution for counting and accountability. Lancet 2015; 386 (10001): 1373–85. doi: 10.1016/S0140-6736(15)60173-8.
- 13. Ye Y, Wamukoya M, Ezeh A, Emina JBO, Sankoh O. Health and demographic surveillance systems: a step towards full civil registration and vital statistics system in sub-Sahara Africa? BMC Public Health 2012; 12: 741. doi: 10.1186/1471-2458-12-741.
- 14. Xie M, Jean N, Burke M, Lobell D, Ermon S. Transfer learning from deep features for remote sensing and poverty mapping. arXiv 2016 Feb 27 [E-pub ahead of print].
- 15. Jean N, Burke M, Xie M, Davis WM, Lobell DB, Ermon S. Combining satellite imagery and machine learning to predict poverty. Science 2016; 353 (6301): 790–4. doi: 10.1126/science.aaf7894.
- 16. Giovannini E, Li R, Anant T, et al. A World That Counts: Mobilising the Data Revolution for Sustainable Development. New York: UN Data Revolution; 2014. doi: 10.1177/1079063213492341.
- 17. Wardrop NA, Jochem WC, Bird TJ, et al. Spatially disaggregated population estimates in the absence of national population and housing census data. Proc Natl Acad Sci U S A 2018; 115 (14): 3529–37. doi: 10.1073/pnas.1715305115.
- 18. SpaceNet on Amazon Web Services (AWS). https://spacenetchallenge.github.io/datasets/datasetHomePage.html. Accessed January 8, 2019.
- 19. Luckow PW, Kenny A, White E, et al. Implementation research on community health workers’ provision of maternal and child health services in rural Liberia. Bull World Health Organ 2017; 95 (2): 113–20. doi: 10.2471/BLT.16.175513.
- 20. Stewart R. TensorBox: Object detection in TensorFlow. https://github.com/Russell91/TensorBox. Accessed January 8, 2019.
- 21. Stewart R, Andriluka M, Ng AY. End-to-end people detection in crowded scenes. In: IEEE Conference on Computer Vision and Pattern Recognition; 2016. doi: 10.1109/CVPR.2016.255.
- 22. Ester M, Kriegel H, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. Knowl Discov Databases 1996; 96 (34): 226–31.
- 23. Agbenyo F, Marshall Nunbogu A, Dongzagla A. Accessibility mapping of health facilities in rural Ghana. J Transp Health 2017; 6: 73–83. doi: 10.1016/j.jth.2017.04.010.
- 24. Gammino VM, Nuhu A, Chenoweth P, et al. Using geographic information systems to track polio vaccination team performance: pilot project report. J Infect Dis 2014; 210 suppl 1: S98–101. doi: 10.1093/infdis/jit285.
- 25. Doupe P, Bruzelius E, Faghmous J, Ruchman SG. Equitable development through deep learning: the case of sub-national population density estimation. In: Proceedings of the 7th Annual Symposium on Computing for Development; 2016: 6. doi: 10.1145/3001913.3001921.
- 26. Wesolowski A, Eagle N, Tatem AJ, Smith DL, Noor AM, Snow RW. Quantifying the impact of human mobility on malaria. Science 2012; 338 (6104): 267–70. doi: 10.1126/science.1223467.
- 27. Patel NN, Stevens FR, Huang Z, Gaughan AE, Elyazar I, Tatem AJ. Improving large area population mapping using geotweet densities. Trans GIS 2017; 21 (2): 317–31. doi: 10.1111/tgis.12214.
- 28. Wulder MA, Coops NC. Satellites: make earth observations open access. Nature 2014; 513 (7516): 30–1. doi: 10.1038/513030a.
- 29. Upfill-Brown AM, Voorman A, Chabot-Couture G, Shuaib F, Lyons HM. Analysis of vaccination campaign effectiveness and population immunity to support and sustain polio elimination in Nigeria. BMC Med 2016; 14: 60. doi: 10.1186/s12916-016-0600-z.
- 30. Kamadjeu R. Tracking the polio virus down the Congo River: a case study on the use of Google Earth™ in public health planning and mapping. Int J Health Geogr 2009; 8: 4. doi: 10.1186/1476-072X-8-4.
- 31. Barau I, Zubairu M, Mwanza MN, Seaman VY. Improving polio vaccination coverage in Nigeria through the use of geographic information system technology. J Infect Dis 2014; 210 suppl 1: S102–10. doi: 10.1093/infdis/jiu010.
- 32. Dowell SF, Blazes D, Desmond-Hellmann S. Four steps to precision public health. Nature 2016; 540: 189–91. doi: 10.1038/540189a.
- 33. The Lancet Global Health. Precision global health: beyond prevention and control. Lancet Glob Health 2017; 5 (1): e1. doi: 10.1016/S2214-109X(16)30339-4.