Abstract
Under-5 mortality rates have been decreasing across Africa for the past two decades. Contributing factors include policy changes, technology, and health investments. This study identifies sub-populations that have experienced a larger-than-expected change in mortality rates (either increasing or decreasing) during this period. We train under-5 mortality predictive models on Demographic and Health Survey (DHS) datasets from the early 2000s and apply those models to data collected in more recent versions of the survey. This provides an estimate of the risk that current families would have faced in the past. We then apply techniques from anomalous pattern detection to identify the sub-populations with the greatest divergence, in either direction, between their predicted and observed mortality rates. These detected groups are examples of the successes and possible misses of the health progress observed in Africa over the course of decades. Identifying these groups through data-driven discovery may lead to a better understanding of health policies in developing countries.
Introduction
Improving child mortality rates is an important Maternal, Neonatal and Child Health (MNCH) priority for global sustainable development. Despite a global decline in child mortality rates, many countries are not on track to achieve the 2030 global targets of ending preventable deaths among newborns and children under 5 years of age.1 We contribute to the body of work showing that progress towards MNCH-specific targets remains uneven within and across countries, as reflected in disparities in access to healthcare services and inequitable allocation of resources for MNCH priorities.2
Key barriers to understanding and addressing MNCH challenges include the complicated interactions among the various factors and interventions captured in data. These factors include socio-economic status, health system capacity, and quality of individual care. The picture is further complicated when considering progress in health outcomes over long periods of time (10+ years). The central questions addressed in this paper are, "Which types of women and families are driving the overall decrease in child mortality rates observed in the past decades? Are there groups of women and families that have been left behind?"
The present work leverages data-driven discovery to identify sub-populations of women (and their households) that experience larger-than-expected changes in under-5 mortality rates between two points in time, spaced approximately 10-15 years apart. The advantage of such an approach is that health investigators do not need to first posit which sub-population to test and then follow up with confirmation analysis and significance testing procedures.3-8 Rather, it allows the data to highlight the sub-population(s) that are most anomalous in their divergence between observed and expected rates of under-5 mortality.
Expected mortality rates are obtained by training a machine learning algorithm (Boosted Decision Trees9,10) on data from the earlier time-point, T0, and then applying that model to data from a more recent time, T1. This is analogous to estimating the mortality rates of recent families if their children had been born 10 to 15 years earlier. Due to shifts in the data that occur over the course of decades (see Table 1 and Table 2), the predictive model is less accurate when predicting mortality at time T1 than T0. "Concept Drift" is expected between time steps and is a key component of this work. Rather than viewing these data shifts as traps to be avoided, we use them to identify sub-populations that do not match the country-wide trends in under-5 mortality during the same time frame. Detecting the sub-populations that undergo the largest amount of shift between their predicted and observed mortality rates is accomplished by Bias Scan.11 See the Methods Section for more details.
Table 1: Description of possible shifts in data13 and how they are addressed in this work.
| Data Shift Type | Notation | Description | MNCH Domain Example | Addressed | How |
| Covariate Shift | P(X) | Distribution of features is different | Access to water has changed within the past two decades. | No | Future work may identify the emergence or disappearance of a sub-population over time. |
| Prior Probability Shift | P(Y) | Distribution of labels is different | Under-5 mortality rates have decreased in the past two decades. | Yes | Our null hypothesis is that all sub-populations of women (households) have undergone the same decrease in odds in the past decade(s). |
| Concept Drift | P(Y \| X) | Distribution of labels given features is different | Households with the same education attainment have lower under-5 mortality rates now than before. | Yes | Our alternative hypothesis is that some sub-population of women (households) has experienced a greater change in odds. Bias Scan identifies this sub-population. |
| Confounding Shift | P(Y \| X, Z) | Distribution of labels given a variable that influences both features and labels is different | Changing employment status may influence both wealth index and under-5 mortality rates. | No | Future work may identify anomalous sub-populations by both their change in features and labels. |
Table 2: Country-level statistics for years of surveys (T0, T1), sample size, under-5 mortality rates, and predictive model accuracy.
| Country | Year | Size | Under-5 Mortality (%) | AUC |
| Burkina Faso | 2003 | 7367 | 15.9 | 0.869 |
| Burkina Faso | 2010 | 10364 | 11.7 | 0.855 |
| Ethiopia | 2000 | 7245 | 16.1 | 0.838 |
| Ethiopia | 2016 | 7193 | 8.1 | 0.798 |
| Kenya | 2003 | 3972 | 11.0 | 0.931 |
| Kenya | 2014 | 14949 | 5.5 | 0.895 |
| Nigeria | 2003 | 3775 | 19.6 | 0.870 |
| Nigeria | 2018 | 21792 | 13.1 | 0.839 |
| Tanzania | 2004 | 5658 | 11.6 | 0.875 |
| Tanzania | 2015 | 7050 | 6.8 | 0.868 |
Ayele and Zewotir applied a Cox proportional hazard model to Ethiopian Demographic and Health Survey (DHS) data (2000, 2005, 2011) to identify how risk factors for under-5 mortality change over time.12 The datasets and temporal component are similar to those of the present work. However, our goal and methodology differ substantially. We wish to identify sub-populations (i.e. subsets of feature-values) whose definition remains fixed over time but whose under-5 mortality rates change substantially between times T0 and T1. For example, small Ethiopian households with a single adult and two births saw their under-5 mortality rates decrease from 47.2% in 2000 to 7.5% in 2016. This drastic change exceeded the average change in Ethiopia over the same period. More examples of these anomalous sub-populations are provided in the Results section.
Data and Data Shifts
Critical to this work is recognizing and exploiting data shifts over time. Varshney summarizes four relevant data shifts that can cause problems for machine learning models.13 These shifts are (re)listed in Table 1 along with MNCH domain examples and how these shifts are used in this work. Arguably the most common type of data shift is the covariate shift which changes the distribution of the features. This is in contrast to the prior probability shift which changes the distribution of the outcomes (or labels). The data shift of primary concern for this paper is the Concept Drift which is a change in the outcome for a given set of features.
We highlight that the unit of analysis for this study is the mother (and her household). We only consider survey respondents who have given birth within the 5 years leading up to the survey date. Birth and death records within the survey are used to create the binary label for under-5 mortality experienced by the mother (and her household).
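The labelling step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the `Birth` record, its field names, and the exact rule (any child born in the 5-year window who died before 60 months) are assumptions made for the example and do not correspond to actual DHS variable names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Birth:
    # Hypothetical record layout; real DHS birth histories use different names.
    mother_id: str
    months_before_survey: int            # months between the birth and the interview
    age_at_death_months: Optional[int]   # None if the child is alive


def under5_label(births: List[Birth], window_months: int = 60) -> Dict[str, int]:
    """Return {mother_id: 0 or 1} for mothers with a birth inside the window.

    The label is 1 if any child born inside the window died before 60 months.
    """
    labels: Dict[str, int] = {}
    for b in births:
        if b.months_before_survey > window_months:
            continue  # births outside the window do not make a mother eligible
        died = b.age_at_death_months is not None and b.age_at_death_months < 60
        labels[b.mother_id] = max(labels.get(b.mother_id, 0), int(died))
    return labels


births = [
    Birth("m1", 12, None),  # alive
    Birth("m1", 40, 3),     # died at 3 months, so m1 is labelled 1
    Birth("m2", 30, None),  # alive, so m2 is labelled 0
    Birth("m3", 90, 10),    # born outside the window, so m3 is excluded
]
labels = under5_label(births)
print(labels)
```

Aggregating to the mother keeps one row per respondent, which matches the unit of analysis stated above.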
Table 2 lists information about the five countries explored in this work, including the timesteps T0 and T1, the number of mothers (households) in each survey, the under-5 mortality rates, and the discriminating power of a machine learning model to distinguish between women/households who experience under-5 mortality and those who do not (area under the receiver operating characteristic curve, AUC). First, we highlight the decrease in under-5 mortality rates experienced by each country over time. This change in odds per country is an important part of the methodology (see Figure 1). Second, we note that the AUC decreases for T1 because the model trained on T0 data is being used to predict T1 data. This drop in accuracy is expected under data shifts such as temporally-induced Concept Drift. We apply Bias Scan11 to exploit this drift and identify which sub-populations in T1 differ the most between their expected and observed mortality rates.
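For readers less familiar with the AUC reported in Table 2, the sketch below computes it directly from its pairwise definition. It is an illustration on made-up scores, not a reproduction of the paper's evaluation; the function names and toy data are assumptions.

```python
from typing import List


def auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative record")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def shift_odds(p: float, q: float) -> float:
    """Multiply the odds of probability p by q and convert back to a probability."""
    o = q * p / (1.0 - p)
    return o / (1.0 + o)


y = [1, 0, 1, 0]
raw = [0.9, 0.1, 0.4, 0.6]
auc_raw = auc(y, raw)                                       # 3 of 4 pairs ranked correctly
auc_shifted = auc(y, [shift_odds(p, 0.46) for p in raw])    # same ranking, same AUC
print(auc_raw, auc_shifted)
```

Because multiplying every record's odds by the same constant preserves the ranking of the predictions, a country-wide odds shift changes calibration but not AUC.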
Figure 1:
Workflow Diagram for detecting Concept Drift between timesteps T0 and T1. Step 3 uses q’ which is the odds ratio of observed mortality between T1 and T0.
Table 3 provides a subset of the features and their values extracted from both timesteps of the DHS data for each country. These features were used both to train the predictive classifier and to create the search space over which Bias Scan identifies anomalous sub-populations. In some cases, features were removed to account for survey questions that were added or removed over the decades.
Table 3: Features and their values used for training and scanning. Due to variations in survey questions, not all countries have all the features.
| Feature | Feature Values |
| Marital status | Married/Living with Partner, Divorced/Separated, Never in Union/Widowed |
| Respondent employed | Not working, Ag-self/Ag-employee, Clerical/Sales, Household/Services/Skilled/Unskilled/Other, Professional, Missing/Unknown |
| Source of drinking water | Piped, Well, Borehole, Other, Missing/Unknown |
| Ethnicity | Varies by country |
| Living with partner | Living with her, Staying elsewhere |
| Region | Varies by Country |
| Respondent Education Level | Primary, Secondary, Higher, No education |
| Visited health facility in the past 12 months | Yes, No |
| Gender of Head of Household | Male, Female |
| Respondent worked in the past 12 months | Yes, No |
| Relationship Structure | Three+ related adults, Two adults - opposite sex, Two adults - same sex/Unrelated, One adult, No adults |
| Literacy | No/Blind/Missing/Unknown, Yes |
| Number of Children Under 5 who slept under Mosquito Net | All children, No net in household, No, Some children |
| Wealth Index | Poorest, Poorer, Middle, Richer, Richest |
| Household Size | Missing/Unknown, 1 to 3, 4 to 5, 6 to 8, 9 and above |
| Number Of Births | Missing/Unknown, One, Two, Three+ |
| Respondent Age | Missing, Below 20, 20 to 29, 30 to 39, 40 to 49, 50 and above |
| BMI | Missing/Unknown, Underweight, Normal, Overweight, Obese |
| Ever Terminated Pregnancy | Yes, No |
Methods: Training, Calibrating, and Shifting Models
To capture the relationship between the features X and the mortality label Y, we trained a predictive classifier using off-the-shelf software10 and methods.9 The advantages of these methods over more standard regression techniques are well captured in Ogallo et al.'s approach to a similar problem.14 In summary, boosted decision trees are better able to capture non-linearities in the relationship between the features and the outcome, as well as interactions between features. This more expressive form of P(Y|X) typically results in higher discriminating power. The number of trees and the depth of individual trees were chosen through cross-validation that optimized the AUC on the held-out folds.
The second part of Step 1, as shown in Figure 1, is calibrating the model so that the predicted probabilities accurately reflect the true proportion of observed outcomes at time T0. This was done using Platt scaling15 with held-out training data.
An example of the calibrated T0 model from Ethiopia is shown in Figure 2 as a calibration plot. When the model is used to predict T0 data, there is strong agreement between the predicted probabilities and the observed fraction of positive cases. However, when predicting T1 data, the model systematically produces predicted probabilities that are higher than the observed proportion of positive outcomes. This is due to the data shift in Ethiopia's mortality rates between T0 and T1. The rate decreased from 16.1% to 8.1% over the course of 16 years.
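To make the calibration step concrete, the sketch below fits a Platt-style sigmoid to held-out classifier scores using Newton's method on the log loss. It is a simplified illustration under stated assumptions: the toy scores and labels are invented, the function names are hypothetical, and the target smoothing from Platt's original formulation is omitted. In practice an off-the-shelf routine such as scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"` would normally be used instead.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_platt(scores: List[float], labels: List[int], iters: int = 50) -> Tuple[float, float]:
    """Fit p = sigmoid(a * score + b) by Newton's method on the log loss."""
    a, b = 1.0, 0.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            w = p * (1.0 - p)
            g_a += (p - y) * s      # gradient with respect to a
            g_b += (p - y)          # gradient with respect to b
            h_aa += w * s * s       # Hessian entries
            h_ab += w * s
            h_bb += w
        det = h_aa * h_bb - h_ab * h_ab
        if abs(det) < 1e-12:
            break                   # degenerate Hessian: stop rather than divide by ~0
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a, b = a - da, b - db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


# Hypothetical held-out scores (e.g. raw classifier margins) and outcomes.
scores = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fit_platt(scores, labels)
calibrated = [sigmoid(a * s + b) for s in scores]
print(a, b, sum(calibrated) / len(calibrated))
```

At the optimum the mean calibrated probability equals the observed positive rate on the held-out data, which is the property the calibration step relies on.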
Figure 2:
Calibration Plot showing the data distribution shift in Ethiopia between T0 and T1. The T1 predictions are pessimistic, assigning higher probability of mortality to women/households that did not experience Under-5 mortality. Bias Scan is used to identify the sub-population where this divergence shows the most evidence of concept drift.
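The points of a calibration plot such as Figure 2 can be computed by binning the predicted probabilities and comparing each bin's mean prediction with its observed fraction of positive outcomes. The sketch below shows one way to do this with equal-width bins; the bin count and the toy data are assumptions, not the settings used for the figure.

```python
from typing import List, Tuple


def calibration_curve(y_true: List[int], y_prob: List[float],
                      n_bins: int = 5) -> List[Tuple[float, float]]:
    """Return (mean predicted probability, observed positive fraction) per non-empty bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    curve = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            frac_pos = sum(y for y, _ in b) / len(b)
            curve.append((mean_pred, frac_pos))
    return curve


y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.15, 0.3, 0.35, 0.8, 0.9]
curve = calibration_curve(y_true, y_prob, n_bins=5)
for mean_pred, frac_pos in curve:
    print(f"predicted {mean_pred:.3f}  observed {frac_pos:.3f}")
```

A bin whose mean prediction exceeds its observed fraction corresponds to the pessimistic behaviour described in the caption above.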
This insight leads us to Step 3 in the workflow diagram (Figure 1). For Bias Scan to better identify the temporal Concept Drift in the data, we must first address the prior probability shift separately. On average, the odds of under-5 mortality in Ethiopia at T1 were 0.46 times the odds at T0. Therefore, a more accurate version of the predicted probabilities on T1 data is obtained by shifting each prediction by the same change in the odds. This shift is applied to the predicted probabilities of all T1 records before running Bias Scan.
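The odds shift can be checked with a few lines of arithmetic. The sketch below derives the country-level odds ratio for Ethiopia from the Table 2 rates and applies it to a predicted probability; the function names are illustrative assumptions rather than part of any published implementation.

```python
def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)


def shift_probability(p: float, q: float) -> float:
    """Multiply the odds of p by q and convert back to a probability."""
    o = q * odds(p)
    return o / (1.0 + o)


# Ethiopia: 16.1% under-5 mortality at T0 (2000) and 8.1% at T1 (2016), from Table 2.
q_prime = odds(0.081) / odds(0.161)
print(round(q_prime, 2))  # 0.46

# A record predicted at the T0 country-wide rate is mapped to the T1 country-wide rate.
shifted = shift_probability(0.161, q_prime)
print(round(shifted, 3))  # 0.081
```

Working in odds rather than probabilities keeps every shifted prediction inside (0, 1), which a simple rescaling of probabilities would not guarantee when q is greater than 1.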
Methods: Bias Scan
Bias Scan11 efficiently identifies subsets of the data where a predictive classifier is systematically over- (or under-) estimating the probability of an observed outcome. Bias Scan exploits mathematical properties of the scoring function16,17 that make it computationally feasible to "scan" over the exponentially many possible subsets of records in a data set. Scanning for anomalous subsets, rather than investigating subgroups of a priori interest, is the critical component of data-driven discovery.
Bias Scan may be viewed through the lens of hypothesis testing. The null hypothesis is that all sub-populations of T1 households are accurately predicted by the model trained on T0 data. The alternative hypothesis assumes some multiplicative bias, q, in the odds for some sub-population, S.
These hypotheses form a log-likelihood ratio based on the Bernoulli distribution, which serves as the "bias score" of a sub-population, S. This score balances the size of the sub-population against the deviation between the predicted probabilities $\hat{p}_i$ and the proportion of positive outcomes in the subgroup. Any sub-population that has a large number of records with systematically higher predicted probabilities than the observed outcomes $y_i$ in the same subgroup will have a high bias score.
$$\text{score}(S) = \max_{q} \; \log(q) \sum_{i \in S} y_i \; - \; \sum_{i \in S} \log\left(1 - \hat{p}_i + q\,\hat{p}_i\right) \qquad (1)$$
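The bias score described above can be evaluated numerically for any given subgroup. The sketch below assumes the Bernoulli log-likelihood ratio takes the form log(q) Σ y_i − Σ log(1 − p_i + q p_i), which follows from multiplying each record's predicted odds by q, and finds the maximizing q by bisection. The function names, search bounds, and toy data are assumptions; unlike Bias Scan, the sketch does not restrict q to one direction.

```python
import math
from typing import List, Tuple


def shifted_sum(probs: List[float], q: float) -> float:
    """Sum of predicted probabilities after multiplying each record's odds by q."""
    return sum(q * p / (1.0 - p + q * p) for p in probs)


def bias_score(y: List[int], probs: List[float]) -> Tuple[float, float]:
    """Return (score, q_hat) for one subgroup; probs must lie strictly in (0, 1).

    The maximizing q makes the shifted predictions sum to the observed
    positives. shifted_sum is increasing in q, so bisection on log(q) finds it.
    """
    total = sum(y)
    lo, hi = -20.0, 20.0  # search bounds on log(q); an arbitrary choice
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if shifted_sum(probs, math.exp(mid)) < total:
            lo = mid
        else:
            hi = mid
    log_q = (lo + hi) / 2.0
    q = math.exp(log_q)
    score = log_q * total - sum(math.log(1.0 - p + q * p) for p in probs)
    return score, q


# Well-calibrated subgroup: one positive among four records predicted at 0.25.
score_ok, q_ok = bias_score([1, 0, 0, 0], [0.25] * 4)

# Over-predicted subgroup: one positive among four records predicted at 0.5.
score_over, q_over = bias_score([0, 0, 0, 1], [0.5] * 4)
print(score_ok, q_ok, score_over, q_over)
```

The calibrated subgroup scores approximately zero with q close to 1, whereas the over-predicted subgroup receives a positive score with q below 1.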
Bias Scan uses an iterative ascent procedure to efficiently optimize this objective function over all possible sub-populations. When optimizing over a feature with k possible values, Bias Scan does not consider each of the 2^k - 1 possible subsets, as that would be computationally infeasible. Rather, previous work16 has shown that at most 2k subsets must be considered while still guaranteeing that the subset with the highest bias score will be identified. This reduction from exponentially many to linearly many subsets is what makes Bias Scan computationally efficient for large data sets. Each ascent is guaranteed to converge to a local optimum, and multiple random restarts are used to increase the chance of reaching the global maximum. Thirty (30) random restarts were used in this work. Additionally, Bias Scan has a tuning parameter that penalizes complex sub-populations that span too many features and are therefore difficult to interpret. We used a penalty value between 2.5 and 4.0 in our experiments. These values resulted in interpretable subsets spanning 1-3 features. See the Results section for more details.
We conclude this section by noting two simple extensions to Bias Scan used in this work. First, this is the first application of Bias Scan to explicitly search for temporally-induced concept drift. Previous uses of Bias Scan focused on identifying faults within the predictive classifier. Here, we assume the classifier is operating correctly and that it is the data that shifts between T0 and T1. It is from these shifts that we extract insights, rather than attempting to highlight weaknesses or bias in the predictive model. Second, this is the first application where prior probability shift is addressed separately from the scanning process (see Step 3 in Figure 1). Without this global shift of T1 predictions, the scanning results would typically identify "All" sub-populations as anomalous. However, that result would be due to the prior probability shift and not the concept drift. By shifting the predicted probabilities of T1, we account for the country-wide change in outcomes between T0 and T1. Therefore, any further deviations between the shifted probabilities and the observed outcomes at T1 can be more readily attributed to concept drift within a particular sub-population. Tables 4 and 5 in the Results section provide the average predictions for each identified sub-population before and after this shift (last two columns). The shifted predictions represent our null hypothesis that all sub-populations across the country have experienced the same decrease in the odds of mortality.
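To illustrate what is being optimized for a single feature, the sketch below scores every non-empty subset of that feature's values by brute force. This is explicitly not the efficient procedure used by Bias Scan: it enumerates all 2^k - 1 subsets, omits the complexity penalty, and assumes the log-likelihood-ratio form log(q) Σ y_i − Σ log(1 − p_i + q p_i) for the score. All names and data are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Tuple

Record = Tuple[str, int, float]  # (feature value, observed outcome, predicted probability)


def bias_score(y: List[int], probs: List[float]) -> float:
    """Max over q of log(q)*sum(y) - sum(log(1 - p + q*p)), via bisection on log(q)."""
    total, lo, hi = sum(y), -20.0, 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        q = math.exp(mid)
        if sum(q * p / (1.0 - p + q * p) for p in probs) < total:
            lo = mid
        else:
            hi = mid
    log_q = (lo + hi) / 2.0
    q = math.exp(log_q)
    return log_q * total - sum(math.log(1.0 - p + q * p) for p in probs)


def exhaustive_scan(records: List[Record]):
    """Score every non-empty subset of feature values; return (best score, subset, count)."""
    values = sorted({v for v, _, _ in records})
    best_score, best_values, n_subsets = float("-inf"), None, 0
    for r in range(1, len(values) + 1):
        for subset in combinations(values, r):
            rows = [(y, p) for v, y, p in records if v in subset]
            score = bias_score([y for y, _ in rows], [p for _, p in rows])
            n_subsets += 1
            if score > best_score:
                best_score, best_values = score, subset
    return best_score, best_values, n_subsets


# Region A is strongly over-predicted; regions B and C are well calibrated.
records = (
    [("A", 0, 0.5)] * 4
    + [("B", 1, 0.5), ("B", 1, 0.5), ("B", 0, 0.5), ("B", 0, 0.5)]
    + [("C", 1, 0.5), ("C", 1, 0.5), ("C", 0, 0.5), ("C", 0, 0.5)]
)
best_score, best_values, n_subsets = exhaustive_scan(records)
print(best_values, round(best_score, 3), n_subsets)
```

With k = 3 values the brute-force search visits 7 subsets; the count doubles with every additional value, which is why a linear-time alternative matters for features such as region or ethnicity.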
Table 4: Anomalous sub-populations with lower-than-expected under-5 mortality rates for each country.
| Country (Years; Under-5 Mortality) | Sub-Population | Size T0 | Size T1 | Mortality % T0 | Mortality % T1 | Mean Model Prediction % (T1 Raw) | Mean Model Prediction % (T1 Shifted) |
| Burkina Faso (2003, 2010; 15.9%, 11.7%) | Ethnicity = Mossi, Worked in past year = Yes, Visited Health Facility in past year = Yes | 1501 | 3390 | 15.1 | 8.5 | 13.7 | 10.8 |
| Kenya (2003, 2014; 11.0%, 5.5%) | Respondent Education Level = Higher, Respondent Age = 40 to 49 years | 6 | 58 | 16.7 | 0.0 | 32.0 | 25.1 |
| Ethiopia (2000, 2016; 16.1%, 8.1%) | Relationship structure = One Adult, Number of Births = 2, Household size = 1 to 3 | 53 | 67 | 47.2 | 7.5 | 53.5 | 36.7 |
| Nigeria (2003, 2018; 19.6%, 13.1%) | Region = South South or South West | 809 | 4737 | 14.8 | 7.4 | 14.0 | 10.3 |
| Tanzania (2004, 2015; 11.6%, 6.8%) | Number of children who slept under mosquito net = No net in household, Number of Births = 1 | 1660 | 914 | 8.9 | 0.2 | 3.0 | 1.7 |
Table 5: Anomalous sub-populations with higher-than-expected under-5 mortality rates for each country.
| Country (Years; Under-5 Mortality) | Sub-Population | Size T0 | Size T1 | Mortality % T0 | Mortality % T1 | Mean Model Prediction % (T1 Raw) | Mean Model Prediction % (T1 Shifted) |
| Tanzania (2004, 2015; 11.6%, 6.8%) | Respondent Age = Below 20 | 395 | 550 | 7.6 | 7.6 | 7.3 | 4.6 |
| Burkina Faso (2003, 2010; 15.9%, 11.7%) | Visited Health Facility in past year = No, Relationship structure = 3 or more adults | 2957 | 1476 | 16.0 | 16.0 | 14.1 | 11.4 |
| Kenya (2003, 2014; 11.0%, 5.5%) | Number of Births = 2, Gender of Head of Household = Male, Household size = 1-3 | 60 | 126 | 56.7 | 57.1 | 29.9 | 18.3 |
| Ethiopia (2000, 2016; 16.1%, 8.1%) | Null | 7245 | 7193 | 16.1 | 8.1 | 15.2 | 8.6 |
| Nigeria (2003, 2018; 19.6%, 13.1%) | Relationship structure = Two adults, Number of Births = 2, Household size = 1-3 | 68 | 328 | 73.5 | 76.8 | 56.5 | 46.1 |
Results
Bias Scan11 was applied in two "directions". In the negative direction, it identifies sub-populations that have lower observed rates of mortality in T1 than expected according to the model trained on T0 data. These sub-populations show gains that outpace the country average and can be regarded as successes. The positive direction detects sub-populations with observed mortality rates higher than expected and highlights possible misses of development efforts that warrant emphasis going forward. These sub-populations are listed in Tables 4 and 5, respectively.
We begin by looking at the successes in Table 4. We categorize these results as either "drivers" of the change or "exceptional cases" of the change. Burkina Faso and Nigeria have sub-populations that, because of their large size, can be regarded as driving the country-wide decrease in under-5 mortality. For example, in Nigeria, women in the South South or South West regions saw their under-5 mortality rates decrease from 14.8% to 7.4%. Although this sub-population was already doing better than the national average in 2003 (19.6%), the decreases made over the next 15 years were substantial. The last two columns of the table show that the model trained on T0 data predicted a mean mortality rate of approximately 14.0% for this sub-population at T1. If the sub-population had experienced the same change in the odds as the rest of the country between T0 and T1, the mean would be 10.3%. In reality, the observed rate was 7.4% at time T1. This divergence, together with the large size of the group, flags it as an anomalous sub-population.
The successes in Kenya and Ethiopia are examples of exceptional cases in the decrease in under-5 mortality. Among women in Kenya with higher education and between the ages of 40 and 49, the mortality rate decreased from 16.7% to 0.0%. Furthermore, in Ethiopia, small households with one adult and two births saw a steep drop in rates from 47.2% to 7.5%. However, we note that these groups cover small sub-populations (58 and 67 women at T1, respectively).
Finally, we note the possibly counter-intuitive result in Tanzania. How is it that households without a mosquito net had their rates decrease from 8.9% to 0.2%? A household without a net in 2015 likely does not live in a malaria area, whereas a household without a net in 2004 (before their widespread distribution) may simply not have had a net despite living in a malaria area.
We now look at the groups that show delayed improvements in mortality rates (compared to the average change) in the countries under consideration. In Tanzania, the age of the mother (under 20 years) highlights a sub-population that continues to have high under-5 mortality rates. For the women in this subgroup, the under-5 mortality rate has not decreased at the rate of the rest of the country. This result highlights the importance of incorporating the prior probability shift before scanning for bias. Without that shift, this group of women would have appeared normal because the mean of the model's raw predictions (7.3%) was similar to the observed mortality rate at time T1 (7.6%). However, a group that maintains the same under-5 mortality rate while the rate for the rest of the country decreases should be reported as anomalous.
Second, we look at Ethiopia's promising result. No (large) sub-populations were detected that systematically had higher rates of under-5 mortality. This failure to reject the null hypothesis suggests that the overall reduction in mortality in Ethiopia has been more inclusive than in the other countries. At lower complexity penalties, however, a low-scoring subset did emerge.
Burkina Faso has an interesting result involving the feature of attending a health facility. Those who had not attended a health facility had an under-5 mortality rate that stayed constant at 16.0%. In comparison, the group that did attend a facility in the past year saw its mortality rate decrease from 15.1% to 8.5%. Other features are involved, such as ethnicity and employment status, but a plausible hypothesis is that the quality of care at these facilities increased between 2003 and 2010 (at least for women of Mossi ethnicity).
We conclude the Results section by highlighting the repeated presence of a few features in the groups where changes in the mortality rate were smaller than the national averages. These are the number of births and the household size. We believe their presence is explained by "label leaking" in the predictive model. This occurs in the training stage of a predictive model when a combination of features indirectly provides information about the class label that otherwise would not be known. For example, how can a household that has 2 adults and 2 births in the past 5 years be of size 1-3? This is possible if one of those children has died (otherwise the household would be larger). Label leaking results in an artificially strong predictive model, but for the wrong reasons. Future work is needed to determine whether Bias Scan can be used as a more general detector of the "label leaking" effect. Nevertheless, Bias Scan correctly identifies these groups as anomalous because the sub-population's under-5 mortality rate has not changed over the course of a decade of improvements.
Discussion, Strengths and Limitations
There are tens of trillions of possible sub-populations that span the feature values listed in Table 3. Traditionally, this search space of possible hypotheses to investigate (e.g. "Has the under-5 mortality rate of large households in Nigeria's South South region decreased in the past decade?") is narrowed down by domain experts who then manually inspect a small number of sub-populations of interest. In this work we show that data-driven discovery can be used to identify anomalous sub-populations whose under-5 mortality rate is changing more drastically than that of the country as a whole. Critically, Bias Scan efficiently explores this large space and is able to identify sub-populations that maximize a log-likelihood ratio statistic based on the Bernoulli distribution. These are the sub-populations with observed mortality rates in T1 that differ from their expected mortality rates as predicted by a machine learning model trained on T0 data.
This is not meant to diminish the role of domain experts by any means. An excellent example of a role for domain experts that is not explored in this paper is narrowing the search space of Bias Scan. For simplicity, we allowed Bias Scan to search over the exact same feature-values as those used to train (and predict) the under-5 mortality rates. This may result in some awkward sub-populations (see Tables 4 and 5). However, if an investigator wanted to know more about the effect of regional healthcare initiatives, they could remove some demographic features from the search space, such as the relationship structure of the adults in the household. This does not mean the investigator is specifying the sub-population, but rather the potential space for Bias Scan to efficiently explore. We believe this type of interaction captures the best parts of data-driven and hypothesis-driven research.
The analysis in this paper falls short in two areas. The primary limitation is our approach's indifference to covariate and confounding shifts. For example, it is possible that another sub-population experienced a larger change in the odds than the one we detected, but that this sub-population also decreased in size between T0 and T1 due to a covariate shift. The small size at T1 decreases the sub-population's bias score, and the group therefore goes undetected. Future work in this space may incorporate penalties to the bias score when the sub-population has drastically different sizes between the two time steps. Alternatively, we could attempt to bypass the model training and predicting step entirely and form expectations of mortality directly from the mortality rates at T0. Second, we lack the domain expertise needed to explore the causal factors that may explain why these sub-populations experienced such relatively large changes in mortality odds. Perhaps there was increased regional healthcare capacity during those years, or government-backed initiatives to actively address under-5 mortality rates. The sub-populations identified in this work may lead to hypothesis generation in follow-up studies designed to identify the causal impact of an intervention put in place between T0 and T1 on a per-country basis. However, that is outside the scope of the current work.
Conclusion
The world's under-5 mortality rate is decreasing, but not everyone is experiencing the gains. To that end, this work demonstrated how temporal data distribution changes, such as concept drift and prior probability shift, can be used to identify sub-populations of women and their households that are benefiting the most (and the least) from these global trends. Data-driven discovery is at the core of this analysis. We do not investigate a priori sub-populations of interest for statistically significant changes in mortality rates. Such an approach puts too much onus on the investigator to correctly hypothesize the sub-population from domain knowledge. Instead, we train predictive classifiers to estimate what the mortality rate would have been if the global trends in reducing under-5 mortality were experienced uniformly across all sub-populations within a country. We then apply Bias Scan, a technique from anomalous pattern detection, to identify sub-populations of records whose mortality rates differ the most from this expectation. The resulting groups represent the successes and delayed results (or possible missteps) of recent changes in global health developments.
Some of the identified sub-populations, such as the southern regions of Nigeria, can be considered "drivers" of the overall decrease in mortality rates. Other sub-populations, like single mothers in Ethiopia with 2 children, highlight exceptional cases where mortality rates decreased from 47% to less than 8% in 16 years. Future work is needed to appropriately assign causal connections to the observed changes in under-5 mortality rates identified in this work.
More generally, Bias Scan is a powerful tool for health informatics. As predictive models become more commonplace in healthcare applications, it will be increasingly important for researchers to understand and acknowledge data distribution changes and how they affect the performance of a model across different sub-populations of patients.
Acknowledgements
This work is funded by the Bill & Melinda Gates Foundation, investment ID 52720. We would particularly like to thank Claire Mershon and Dr. Nosa Orobaton for their MNCH-specific insights on this work.
Dr. Daniel B. Neill also played an important role in sharing code updates for Bias Scan.
References
- 1. United Nations Development Program. The sustainable development goals. 3: Good health and well-being. 2019.
- 2. Aluísio JD Barros and Cesar G Victora. Measuring coverage in mnch: determining and interpreting inequalities in coverage of maternal, newborn, and child health interventions. PLoS medicine. 2013;10(5). doi: 10.1371/journal.pmed.1001390.
- 3. Kaguthi Grace, et al. Predictors of post neonatal mortality in western kenya: a cohort study. The Pan African medical journal. 2018;31. doi: 10.11604/pamj.2018.31.114.16725.
- 4. Mengesha Hayelom Gebrekirstos, et al. Survival of neonates and predictors of their mortality in tigray region, northern ethiopia: prospective cohort study. BMC pregnancy and childbirth. 2016;16(1):202. doi: 10.1186/s12884-016-0994-9.
- 5. Mekonnen Yared, et al. Neonatal mortality in ethiopia: trends and determinants. BMC public health. 2013;13(1). doi: 10.1186/1471-2458-13-483.
- 6. Ozodiegwu Ifeoma D, et al. Country-level analysis of the association between maternal obesity and neonatal mortality in 34 sub-saharan african countries. Annals of Global Health. 2019;85(1). doi: 10.5334/aogh.2510.
- 7. Yaya Sanni, et al. Decomposing the rural-urban gap in the factors of under-five mortality in sub-saharan africa? evidence from 35 countries. BMC public health. 2019;19(1):616. doi: 10.1186/s12889-019-6940-9.
- 8. Dessie Zufan Bitew, et al. Maternal characteristics and nutritional status among 6–59 months of children in ethiopia: further analysis of demographic and health survey. BMC pediatrics. 2019;19(1):83. doi: 10.1186/s12887-019-1459-x.
- 9. Friedman Jerome H. Stochastic gradient boosting. Comput. Stat. Data Anal. 2002;38(4):367–378, February.
- 10. Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., Duchesnay E. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830.
- 11. Zhang Zhe, Neill Daniel B. Identifying significant predictive bias in classifiers. 2016.
- 12. Ayele Dawit G, Zewotir Temesgen T. Comparison of under-five mortality for 2000, 2005 and 2011 surveys in ethiopia. BMC public health. 2016;16(1):930. doi: 10.1186/s12889-016-3601-0.
- 13. Varshney Kush R. Trustworthy machine learning and artificial intelligence. XRDS. 2019;25(3):26–29, April.
- 14. Ogallo William, Speakman Skyler, Akinwande Victor Abayomi, Varshney Kush, Walcott-Bryant Aisha, Wayua Charity, Weldemariam Komminist, Mershon Claure-helene, Orobaton Nosa. Identifying factors associated with neonatal mortality in sub-saharan africa using machine learning. AMIA Annual Symposium Proceedings. 2020. American Medical Informatics Association, (to appear).
- 15. Platt John. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classif. 2000;10 06.
- 16. Speakman Skyler, Somanchi Sriram, Edward McFowland III, Neill Daniel B. Penalized fast subset scanning. Journal of Computational and Graphical Statistics. 2016;25(2):382–404.
- 17. Neill Daniel B. Fast subset scan for spatial pattern detection. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2012;74(2):337–360.


