Skip to main content
Gates Open Research logoLink to Gates Open Research
. 2022 Jul 19;6:80. [Version 1] doi: 10.12688/gatesopenres.13666.1

Study protocol for UNICEF and WHO estimates of global, regional, and national low birthweight prevalence for 2000 to 2020

Julia Krasevec 1,a, Hannah Blencowe 2, Christopher Coffey 1, Yemisrach B Okwaraji 2, Diana Estevez 3, Gretchen A Stevens 4, Eric O Ohuma 2, Joel Conkle 1, Giovanna Gatica-Domínguez 5, Ellen Bradley 2, Ben Kimathi Muthamia 1, Nita Dalmiya 6, Joy E Lawn 2, Elaine Borghi 5,#, Chika Hayashi 1,#
PMCID: PMC10229761  PMID: 37265999

Abstract

Background

Reducing low birthweight (LBW, weight at birth less than 2,500g) prevalence by at least 30% between 2012 and 2025 is a target endorsed by the World Health Assembly that can contribute to achieving Sustainable Development Goal 2 (Zero Hunger) by 2030. The 2019 LBW estimates indicated a global prevalence of 14.6% (20.5 million newborns) in 2015. We aim to develop updated LBW estimates at global, regional, and national levels for up to 202 countries for the period of 2000 to 2020.

Methods

Two types of sources for LBW data will be sought: national administrative data and population-based surveys. Administrative data will be searched for countries with a facility birth rate ≥80% and included when birthweight data account for ≥80% of UN estimated live births for that country and year. Surveys with birthweight data published since release of the 2019 edition of the LBW estimates will be adjusted using the standard methodology applied for the previous estimates. Risk of bias assessments will be undertaken. Covariates will be selected based on a conceptual framework of plausible associations with LBW, covariate time-series data quality, collinearity between covariates and correlations with LBW. National LBW prevalence will be estimated using a Bayesian multilevel-mixed regression model, then aggregated to derive regional and global estimates through population-weighted averages.

Conclusion

Whilst availability of LBW data has increased, especially with more facility births, gaps remain in the quantity and quality of data, particularly in low-and middle-income countries. Challenges include high percentages of missing data, lack of adherence to reporting standards, inaccurate measurement, and data heaping. Updated LBW estimates are important to highlight the global burden of LBW, track progress towards nutrition targets, and inform investments in programmes. Reliable, nationally representative data are key, alongside investments to improve the measurement and recording of an accurate birthweight for every baby.

Keywords: Low birthweight, global estimates, nutrition, newborn, Bayesian modelling

Introduction

Reducing low birthweight (LBW) prevalence by at least 30% between 2012 and 2025 is a target endorsed by the World Health Assembly in 2012 as part of the Comprehensive Implementation Plan on Maternal, Infant and Young Child Nutrition. It also contributes to achieving the Sustainable Development Goal (SDG) 2, which aims to end all forms of malnutrition across all age groups. Birthweight is a widely used indicator of attained fetal size, with LBW defined as a weight at birth of less than 2,500g, regardless of gestational age and sex 1 . LBW includes both live-born preterm neonates (<37 completed weeks of gestation) and live-born growth-restricted neonates (small-for-gestational-age (SGA) <10th centile of birthweight for gestational age and sex) who may be term or preterm. LBW increases the risk of neonatal and child mortality, neuro-developmental disability, stunted linear growth in childhood, and longer-term consequences of fetal programming, such as increased risk of obesity and diabetes 25 .

LBW is associated with factors contributing to preterm birth and/or fetal growth restriction such as extremes of maternal age (especially younger than 16 years of age or older than 40 years of age), multiple births, obstetric complications, maternal chronic conditions (e.g., hypertensive disorders of pregnancy), malnutrition and infections (e.g., malaria or Group B Streptococcus) 68 . In settings with high levels of fertility treatment and intensive obstetric management, including high caesarean sections rates, iatrogenic preterm birth may be an important driver of LBW 9 . Other contributors to LBW include exposure to environmental factors, such as indoor air pollution, and tobacco and drug use 10, 11 .

Despite the importance of LBW as a public health indicator, ongoing data challenges remain. Potential sources of bias in birthweight data that are likely to impact LBW estimates are summarized in Table 1. A major limitation of monitoring LBW is the lack of birthweight data for many of the world’s children. Many babies, especially those born outside of health facilities, are not weighed at birth; and even when weighed, low coverage of birth registration and administrative data systems, incomplete records and poor child health card retention at the household level contribute to birthweight data gaps.

Table 1. Potential sources of bias in low birthweight data.

Potential sources of bias in birthweight data Likely effect * on
LBW prevalence
estimates
1. Coverage of weighing: bias in newborns weighed at birth
1.1 Many newborns in LMIC countries are not weighed at birth, especially if born at home. These are more likely
to be socio-economically disadvantaged and at higher risk of LBW.
Decreased
1.2 Extremely preterm or sick babies, those stillborn or dying soon after birth and those born around threshold
of viability are the most likely to not be weighed. These babies are at high risk of being LBW.
Decreased
2. Coverage of data system: bias in newborns included in data source
2.1 Low coverage of administrative data systems in many low- and middle-income countries (e.g., lower coverage
of birth registration for those who die shortly after birth, missing home births, and births in private facilities even
if weighed). Births in private facilities are more likely to be socioeconomically advantaged and at lower biological
risk of LBW; however, high prevalence of medical interventions (e.g., caesarean sections both indicated and
elective before 37 weeks) may increase risk of LBW.
Increased or
decreased
3. Loss of birthweight data: biases in missing birthweight data for newborns included in the data
source and weighed at birth
3.1 In surveys, biases in card retention (e.g., birthweight not available for babies who died and who are more
likely to have been LBW) or inability to recall birthweight accurately at the time of the survey.
Decreased
3.2 Missing administrative birthweight data on the sickest babies (frequently LBW) who are transferred
immediately to (and weighed in) a newborn ward.
Decreased
4. Measurement errors: individual measurement or recording error
4.1 Heaping of recording of birthweight on 2500g. As definition excludes babies with birthweight exactly 2500g,
those LBW newborns with birthweight near the threshold frequently heaped at 2500g.
Decreased
4.2 Errors in birthweight measurement (e.g., poorly calibrated scales, inappropriate devices), suboptimal weighing
practices (e.g., clothed, or delayed weighing until >1 day after birth).
Increased or
decreased
4.3 Extremely preterm or sick babies and those born around threshold of viability who die soon after birth are
more likely to be misclassified as stillbirth. These babies are at high risk of being LBW.
Decreased
5. Measurement unit error
5.1 Confusion in surveys where birthweights may be provided in both pounds and grams (e.g., LBW baby
weighing 4.0 lbs. recorded as 4.0 kg).
Decreased
6. Denominator calculation errors in LBW prevalence calculation
6.1 LBW prevalence calculated as: number with birthweight <2500 per all livebirths (whether weighed or not). Decreased

* Decreased - the potential bias is likely to lead to a decreased LBW prevalence; Increased - the potential bias is likely to lead to an increased LBW prevalence.

Source: Updated from Blencowe et al. 17 Copyright © 2019 UNICEF and World Health Organization. Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.

Globally, nearly one third of newborns do not have their birthweights included in available nationally representative data sources, with major variation across regions. For example, 68.1% of newborns in Western Africa are missing birthweight data compared with just 1.4% in Europe 12 . Furthermore, there are large disparities within countries: children born to poorer, less educated, rural, or marginalized mothers are at greater risk of missing birthweight information compared with their wealthier, more educated and urban counterparts 13, 14 . Since these children are more likely to have LBW, estimates that do not account for missing birthweights tend to underestimate the LBW prevalence 15, 16 . Moreover, heaping of birthweights on multiples of 100g and 500g can lead to underestimation of LBW prevalence 13, 14, 17, 18 . Given that a child recorded as having a birthweight of 2500g is not considered LBW, rounding up of birthweights to 2500g leads to underestimation of the LBW prevalence, and consequently, to an underestimation of the care needed for these newborns.

Despite the challenges associated with monitoring LBW, birthweight data are more likely to be collected and published in a range of populations worldwide than data on the related component indicators of preterm birth and size for gestational age. Thus, estimates based on the 2,500g cut-off allow for comparative health statistics across populations and have been the focus of several global goals since 1990 19, 20 . LBW reduction also has potential to contribute to other SDG targets, such as reducing neonatal and under-five mortality and preventing stunting.

In the most recent estimates (2015), a global average of 14.6% of livebirths were estimated as LBW 17 . These estimates represented the largest systematic compilation of LBW prevalence data to date and included 1,447 country-years from 148 countries. Innovative data processing steps were introduced for these estimates, including application of data coverage and quality criteria and a revised adjustment method for survey data 17 . To help fill data gaps, statistical regression models, including covariates of neonatal mortality rate, underweight prevalence among children aged less than 5 years, data type and region, and a country-specific random effect were used to estimate LBW prevalence.

This protocol describes the proposed methodology and process for developing updated global LBW estimates for the period 2000–2020, which will be undertaken by the United Nations Children’s Fund (UNICEF) and the World Health Organization (WHO) in collaboration with the London School of Hygiene & Tropical Medicine (LSHTM). This protocol is informed by the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) 21 . We build on data sources and adjustment methods applied for the previous estimates 17 , with data quality review enhancements, and propose a new modelling approach. We also note that this set of LBW estimates is being developed in coordination with, and will benefit from its association with, updated preterm birth estimates, for which a protocol has already been published 22 .

Protocol

Project organization

A Steering Group comprised of UNICEF, WHO and LSHTM will implement this protocol. The work will be supported by an Estimates Consultative Group, comprised of global experts in LBW and preterm birth measurement, including obstetricians, neonatologists, statisticians, preterm birth researchers, modellers, and programme experts working in the measurement field. The Estimates Consultative Group will provide technical guidance on the estimation methods, and review data inputs and preliminary estimates prior to finalization. An official country consultation will be conducted with UNICEF and WHO Member States to inform them of the methodology, review preliminary national estimates and identify any additional data.

Research ethics approval

This work is based on secondary analyses of household survey data and aggregate data from administrative sources only. The study was approved on 17th May 2021 by the London School of Hygiene and Tropical Medicine ethics review board (reference: 22858).

Data sources

Two types of input data sources will be considered: (i) national administrative data sources; and (ii) nationally representative household surveys. National administrative data are defined as data from national systems, including civil registration and vital statistics (CRVS) systems, national health management information systems (HMIS), and birth registries. Nationally representative household surveys include Demographic and Health Surveys (DHS), Multiple Indicator Cluster Surveys (MICS), and other nationally representative surveys for which anonymized individual-level data and required variables are available.

Figure 1 provides an overview of the methodological steps undertaken for data search and abstraction.

Figure 1. Flow diagram of the data search and review process.

Figure 1.

Source search strategy

Administrative sources. A systematic search of Ministry of Health and/or National Statistical Office publications and datasets available in the public domain will be conducted only for countries that have at least 80% of births occurring in a health facility in a given year according to UNICEF or WHO databases 23 . For countries that meet the search threshold, data sources and websites used for the 2019 edition of the LBW estimates will first be searched to identify more recent data from CRVS, HMIS, or medical birth registries. For countries meeting the search threshold, but without LBW data included in the 2019 edition of the LBW estimates, a web-based search will be undertaken of National Statistical Office, Ministry of Health, and other national perinatal databases to identify further data for any years from 2000 to 2020. The search terms used for English-language websites will include birth, birthweight, and low birthweight, with appropriate translations for non-English websites. Non-English websites will be searched by researchers who speak the relevant language. In addition to the above-mentioned methods to systematically search for LBW country data from administrative sources, UNICEF and WHO country offices will be requested to consult government counterparts for any LBW data not available in the public domain.

Household surveys. We will update the database from the one used for the 2019 edition of the LBW estimates by conducting an updated search of household survey data sources to identify any missing sources. Two broad approaches will be undertaken to search for and compile updated country data on birthweight from household surveys: (i) searching the websites of DHS and MICS for surveys from 1998 to present; and (ii) searching the UNICEF Nutrition Data Source Catalogue for additional surveys that contain birthweight information from 2000 onwards from low- and middle-income countries (LMICs).

Data screening, review, data extraction and adjustments

Administrative sources. Data will be extracted into an Excel-based data extraction form ( Extended Data Table 1) employing methods outlined in a guide developed for this purpose. The following variables will be extracted: country, data source, data source type, year, number of live births, number of LBW live births with sub-envelopes categorized by 500g intervals, number of live births weighing ≥2500g and number of live births missing birthweight data. LBW prevalence will be calculated as the (number of LBW live births) / (live births with a birth weight (i.e., sum of live births <2500g and ≥2500g or when ≥2500g is not available a back calculated value using reported percent LBW and number of live births weighing <2500g)) ×100. Where data on numbers of live births with a birthweight are not available, total live births will be used as the denominator, and if not available, total births (live births and still births) will be used.

The administrative data will be double extracted. The first extraction will be conducted by a single abstractor. A second abstractor will also extract all data points and any disagreement between the first and second abstractor will be resolved by a third person. For non-English data sources, review and extraction will be supported by staff that speak the relevant language. Where necessary, the relevant government agency will be contacted to help direct the reviewer to the appropriate data tables and to clarify any questions regarding the data.

Household surveys. For household surveys, anonymized individual-level data will be re-analysed using STATA version 17, to produce data quality indicators (see Extended Data Table 2), as well as LBW prevalence estimates adjusted for missing birthweights and data heaping, with output variables described in Extended Data Table 3. As in the previous LBW estimates 17 , adjustments to overcome some of the potential biases noted in Table 1 will be made, namely multiple imputation to account for missing birthweights, and fitting a finite mixture model of two normal distributions to adjust for data heaping 17 . Birthweights reported to be <250g or >5,500g will be considered implausible based on results from the INTERGROWTH-21 st study 24 , and will therefore be set to “missing”. For survey datasets containing the mother’s perception of size at birth, missing birthweights will be imputed using the following variables: (i) mother’s perception of size at birth; (ii) sex of child; (iii) multiple/singleton status; (iv) maternal parity; (v) maternal height; and (vi) maternal body mass index, when available. Where a mother’s perception of size at birth is not available, only the adjustment for data heaping will be performed. Following evidence from previous research 25, 26 , five imputations will be performed for each survey, and a mixture model of two normal distributions will then be fitted to each of the five datasets of recorded and imputed birthweights. The approach provides an estimate of the proportion of birthweights <2,500g that accounts for missing values and heaping, and produces 95% confidence intervals that account for uncertainty arising from both the estimation of the parameters of the two normal distributions and from the imputation step 27 .

Exclusion criteria

General exclusions for implausibility. All data sources with an estimated LBW prevalence of <2.1% or >40% in a given year will be considered implausible and excluded. This lower cut-off is consistent with the lowest population-based LBW prevalence among healthy women at low risk of pregnancy complications (e.g., preterm birth and fetal growth restriction) in any country from the INTERGROWTH 21 st project 24 . The basis for the upper cut-off, consistent with that used for the 2019 edition of the LBW estimates, is from the highest population-based LBW prevalence, which was 37% 28 .

Administrative sources, specific exclusions. National administrative birthweight data will not be included for country-years where the number of live births with a birthweight is <80% of UN estimated live births 29 , as these are unlikely to be representative of the national population.

Household surveys, specific exclusions. Outputs produced during data processing and initial analysis (outlined in Extended Data Table 2) will be used to assess each survey against the exclusion criteria. Unweighted samples, rather than weighted samples used in the 2019 Edition of the LBW estimates, will be used to align with methods applied for data quality review of other nutrition indicators based on recent global guidance 30 . Consistent with the previous LBW estimates 17 , surveys will be excluded if any of the criteria listed below apply.

  • Unavailability of the mother’s perception of size at birth variable, except in cases where >95% 1 of live births have a valid 2 birthweight.

  • <30% 3 of live births have a valid 2 birthweight in the dataset.

  • <200 4 valid 2 birthweights are available in the dataset.

  • There is severe heaping / implausible birthweight distribution, which we define as:

    • i.

      >55% of all birthweights falling on the three most frequent birthweights (e.g., if 3,000g, 3,500g and 2,500g were the three most frequent birthweights, these three birthweights could not make up more than 55% of all birthweights in the dataset)

    • ii.

      >5% of birthweights on the tail ends of ≤500g and ≥5,000g

    • iii.

      >10% of birthweights ≥4,500g

Data quality assessment

Administrative data which pass the 80% threshold will be assessed using quality indicators across four dimensions adapted from the WHO data quality review framework 31 . The four dimensions are (i) availability of time series data; (ii) availability of aggregate data to assess data quality; (iii) internal consistency and plausibility; and (iv) external comparability and plausibility. This data quality review will inform subsequent statistical analyses and sensitivity analyses, and will help quantify and adjust for potential biases and limitations of the LBW estimates.

Table 2 summarizes the overall approaches that will be taken to minimize the risk of bias outlined above.

Table 2. Risk of bias assessment and potential approaches.

Criteria Potential biases Proposed approach admin data Proposed approach survey data
1. Population
representativeness
of available
birthweight data
Biases in newborns included
in data source and biases in
birthweight availability for
included newborns.

(Table 1 – Potential biases 1,2,3)
Exclude if the total births with a
weight in the data source is <80%
of UN-estimated population of live
births.

Consider sensitivity analyses.
Include only surveys designed to be
nationally representative.
Only include surveys with valid
birthweights for ≥30% of births,
and for those, undertake multiple
imputation to impute birthweight
data for included newborns with
missing birthweight. Set a stricter
inclusion criterion of ≥95% requiring
a valid birthweight for surveys where
multiple imputation is not possible.
2. Birthweight
distribution
Biases due to missing
birthweight for very sick babies
and those born around the
threshold of viability

(Table 1 – Potential biases 1,3)
Categorize data where possible
into LBW subgroups % for very
low birthweight, extremely low
birthweight and <500g.

Review distributions and identify data
with evidence of under-capture of
those <1,000g. Consider adjusting
these data or sensitivity analysis
based on excluding these data.
Multiple imputation to impute
birthweight data for included
newborns with missing birthweight.

Very sick or small babies who die
immediately after birth may not be
captured in the birth history at all.
Thus, consider sensitivity analysis
based on excluding data points with
evidence of under-capture of
those <1,000g.
3. Measurement
errors due to
heaping
Heaping of recorded birthweight
on 2,500g.

(Table 1 – Potential biases 4)
Consider use of administrative
data birthweight heaping index for
countries with available information
to identify indicators of countries that
have higher and lower prevalence
of heaping. Use model terms for
categories of administrative data in
the Bayesian model to adjust data in
countries that are expected to have
high heaping.
Exclusion of surveys with extreme
heaping (>55% of all birthweights
falling on the three most frequent
birthweights and <5% of births
on the tail ends of ≤500g and
≥5,000g) Also, heaping adjustment
undertaken as part of the pre-
modelling data processing.
4. Measurement
errors due to
misclassification
of live births as
stillbirths
Most likely in babies around the
perceived thresholds of viability,
which vary by context

(Table 1 – Potential biases 4)
Methods detailed above on
birthweight distribution to attempt
to identify missing babies around the
threshold of vulnerability.
Misclassified newborns will be
missing from the survey dataset.
Methods detailed above on
birthweight distribution to attempt
to identify missing babies around
the threshold of vulnerability.
5. Measurement unit
error
Confusion in surveys where
birthweights may be provided in
both pounds and grams

(Table 1 – Potential biases 5)
Not applicable Exclusions based on >10% of all
birthweights ≥4,500g. 5
6. Incorrect
denominator used
For example, where a large
number of newborns in the
data source did not have a
recorded birthweight and the
denominator used includes
all newborns in data source,
rather than all newborns with a
birthweight in the data source.

(Table 1 – Potential biases 6)
Re-calculate LBW prevalence
estimates using the correct
denominator, if available; explore
other approaches to account for bias
if not.
Re-calculate all LBW prevalence
estimates.

Note: Any remaining error will be captured by model terms for non-sampling variability

Statistical analysis and modelling

After eligibility and exclusion criteria are applied to the extracted and re-analysed data, one dataset of survey and administrative estimates will be compiled. In compliance with GATHER guidance, the following details of all included data sources will be made publicly available: reference information or contact name/institution, population represented, data collection method, year(s) of data collection, and sample size, as relevant.

Step 1: Covariates selection for modelling

The development of the models for the LBW estimates will utilize country-level covariates available from the United Nations and other sources. Covariates for inclusion will be selected a priori using a three-step approach as follows: (i) identifying plausible predictors and outcomes for LBW based on a conceptual framework ( Figure 2); (ii) assessing data availability and data quality of potential covariates time series; and (iii) assessing correlation between covariates, correlation of covariates with LBW, and clustering analysis to select one covariate within each cluster based on correlation levels and data availability.

Figure 2. Conceptual framework for the identification of potential covariates for use in the low birthweight estimates.

Figure 2.

This framework has been informed by previous publications 3234 .

Plausible predictors of LBW were identified through construction of a conceptual framework based on biological plausibility and risk factors ( Figure 2) using existing frameworks in the literature 3235 . The conceptual framework illustrates the pathways to LBW and the relationship between socioeconomic and demographic factors, maternal nutrition and health status, and access to health care. It also shows the links between early childhood outcomes that are associated with LBW, which will be considered as potential predictors of the model, including child malnutrition (e.g., stunting and underweight among children under 5 years of age) and early child mortality (e.g., neonatal mortality rate, infant mortality rate).

Potential covariates across six domains: (1) socio-economic, demographic, fertility, and cultural factors; (2) nutritional, behavioural, and environmental factors; (3) maternal conditions (including infections); (4) fetal or placental conditions; (5) health care-related factors (markers to access to care); and (6) early childhood outcomes associated with LBW/preterm birth are presented in Table 3.

Table 3. Covariates for potential inclusion in the modelling analyses.

Domain Potential covariate
Socio-economic, demographic,
fertility, and cultural factors
Gross National Income
Gross Domestic Product
GINI coefficient
Adult Female Literacy Rate
Mean years female education
Adolescent Birth Rate
Total Fertility Rate
General Fertility Rate
Modern contraceptive rate prevalence
Proportion of live births to mothers aged 35 and older
Urban Population
Nutritional, behavioural, and
environmental factors
Adult Female Smoking Rate
Indoor air Pollution
Outdoor air pollution
Adult Female Body Mass Index (Mean)
Underweight women of reproductive age
Overweight women of reproductive age
Maternal Anaemia
Adult Female Substance Use
Intimate Partner Violence
Maternal conditions (including
infections)
Maternal Mortality Rate
Adult Female HIV Prevalence
Malaria Incidence ( P. falciparum Parasite Rate)
Insecticide-treated nets coverage
Adult female Syphilis prevalence
Gestational Hypertension
Gestational Diabetes
Maternal Depression
Fetal or placental conditions Twinning
Birth Defects
Growth restriction
Health care-related factors
(Markers of Access to care)
Antenatal Care Attendance (Four or more times)
Skilled Birth Attendance
Facility Birth Rate
Caesarean Section Rate
Early childhood outcomes
associated with LBW
Neonatal Mortality Rate
Stunting prevalence in children under 5 years
Underweight prevalence in children under 5 years

Data availability will be assessed through consultation with WHO and UNICEF colleagues and a targeted search of webpages of United Nations organizations (e.g., WHO Global Health Observatory, UNICEF, United Nations Population Division) and academic groups (e.g., International Health Metrics and Evaluation (IHME)).

For potential covariates with no existing time series estimates available, UNICEF and WHO databases will be searched for empirical data available for these variables. Where comparable, but incomplete, time series data are located for a given covariate, a new time series will be generated using a standard approach for in-filling and extrapolation consistent with previously used approaches 36 . Namely, for countries with some empirical data, linear interpolation and constant backwards and forwards extrapolation will be used. For countries with no empirical data, values will be imputed using a regression based on geographic region and country’s lag distributed GDP and World Bank country income classification. Finally, for all countries, smoothed time series will be generated using a 7-year average for model prediction.

Given that ideal covariates for LBW would be comprised of estimates from primary data sources (i.e., not modelled using covariates) for all years from 2000 to 2020, potential covariates will be assessed considering data source, number of empirical data points available by country and methods used to produce time-series including any modelling, in-filling, smoothing, extrapolations, or any other data manipulations.

Finally, exploratory analysis will be undertaken to observe correlations between potential covariates and for each covariate with LBW. To select a parsimonious set of covariates that avoids model overfitting, cluster analyses of all covariates will be undertaken with the aim of having distinct clusters from which only one covariate per cluster will be selected for inclusion in the modelling. The selection of covariates within a cluster will be based on covariates that have the highest correlation with LBW or covariates with data for most country-years in cases where the correlation coefficients are deemed not to be that different from the covariate with the highest correlation though incomplete data.

Step 2: Development of a model to estimate low birthweight prevalence

A Bayesian multilevel-mixed regression model will be developed to estimate LBW prevalence at national level. Analysis will be conducted using RStudio 2021.09.0+351 "Ghost Orchid" Release, and the RJAGS RJAGS, R2JAGS, RSTAN packages. The model will process all country-years with ‘available data’, including the regional 37 intercepts and country-specific intercepts and slopes, generating LBW estimates for all country-years. The model will include terms for data source characteristics ( e.g., survey versus administrative data, and/or measures of data quality). Temporal variability will also be considered at the country and regional level. Covariates selected from Step 1 will be included in the model. LBW will be modelled on the logit scale to ensure that LBW prevalence estimates and confidence intervals obtained from the fitted model are within a plausible range (i.e., between 2.1% and 40%). In case of an implausible and unexpected direction of one or more of the included covariates based on the estimate of regression coefficients, the covariate within a cluster that is next in rank based on the correlation coefficient with LBW will be selected, and so on.

Step 3: Generating estimates of LBW prevalence and trends

Annual estimates of national LBW prevalence from 2000 to 2020 will be predicted from the Bayesian multilevel-mixed regression model developed in step 2, for all countries, including for country-years with data and country-years without useable data, or no data at all.

Various sensitivity analyses will be performed comparing: the final LBW model (with and without covariates to evaluate the contribution of the covariates used), and a model that includes additional covariates used in the 2019 edition of the LBW estimates.

Step 4: Presentation of results

Country-level point estimates with the 10 th and 90 th percentiles for uncertainty intervals around the estimate will be presented. A specific review of data availability and estimates for the year 2020 will be applied to assess any effects of the COVID-19 pandemic and response; estimates for 2020 will be published depending on the outcome of the assessment. Only national estimates for those countries contributing at least one eligible data point in the estimation period will be published. Nevertheless, estimates derived for all countries and years will contribute to the regional and global estimates.

Access to data

In compliance with GATHER guidance 21 , the final LBW estimates with uncertainty intervals will be published online through the WHO Global Health Observatory and UNICEF Data website alongside the complete database of input data used to develop modelled estimates and relevant code. The following information will be made publicly available for all included data sources: reference information or contact name/institution, population represented, data collection method, year(s) of data collection and sample size, as relevant.

Dissemination

This work will result in publication of global, regional, and national LBW prevalence estimates for the period of 2000–2020 in an open-access peer-reviewed journal. We will also publish the final protocol, database and LBW prevalence estimates online through the WHO Global Health Observatory and UNICEF Data website, according to GATHER 21 , as described in the previous section.

Discussion and conclusion

The development of LBW prevalence estimates is critical for all countries and yet there are challenges anticipated in this work that we have noted as part of this study protocol.

Firstly, with regards to population representativeness, national data sources are often incomplete or unavailable, particularly in LMICs. National administrative data sources may miss marginalized or vulnerable groups (e.g., those in humanitarian settings, indigenous populations) who may face greater risks of LBW. We will not include administrative data from data systems with low population representativeness (covering <80% of national livebirths); while some of these countries will have nationally representative survey data included, others may have no LBW input data.

Biases may arise due to missing birthweight data on newborns around the threshold of viability – the smallest and most preterm newborns. Methods for analysing individual-level survey data will partially address missing birthweights. For administrative data, we will consider adjusting data points with evidence of missing birthweights and/or under-capture of those <1,000g. However, this assessment will be limited to countries that capture and report such data, noting that those most prone to biases are also most likely to lack such information.

Potential sources of bias, and approaches to address these, have been considered above ( Table 2), and sources with high levels of missing data will be excluded. It is not possible to adjust for all biases due to measurement errors. Whilst adjustments for heaping will be applied to survey data, there are currently no established methods for adjusting aggregate data from administrative sources. Evidence on the extent of heaping in administrative data are limited to a subset of countries, meaning that systematic adjustment is not possible nor is it possible to adjust the administrative estimates for heaping at an individual level since microdata were not obtained. Instead, other data quality indicators that are available for the administrative data will be used as a proxy to inform the structure of the Bayesian model to account for these factors. Challenges arising from the low quality of some data are compounded by absence of clear, internationally harmonized guidelines on how to assess the quality of birthweight data. In the future, methods to adjust for incomplete or low-quality administrative data may help overcome these biases.

The work described in this protocol will be used to generate estimates of LBW prevalence at global, regional, and national levels for the period of 2000 to 2020. This protocol builds closely on the methodology used for the 2019 edition of the LBW estimates 17 . In successive estimation rounds, increases in data capture for administrative systems will allow for an expanded number of national data points from more countries to be included in the estimation work. The availability of birthweight data with sufficient coverage from household surveys from LMICs is also expected to improve over time (UNICEF and WHO, 2019); however, some data gaps remain in recent years due to delays related to the COVID-19 pandemic. The increase in the number of countries with data, as well as the quantity of data points per country, are expected to improve the estimates overall, although some data quality issues, such as heaping of birthweights, may not improve at the same pace.

The current round of LBW prevalence estimates for the period of 2000 to 2020 will be critical for targeting programs that aim to reduce LBW prevalence over time. These estimates will also guide the refinement and implementation of nutrition and health policies, inform resource allocation within health systems, and help assess the impact of nutrition and newborn survival interventions and their respective redesign.

Study status

The research is currently underway with the administrative and survey data searches, extraction and re-analysis completed in June 2022, the model developed and tested from January to July 2022, country consultation set to begin in August 2022 and final estimates planned for release in the fall/winter of 2022.

Data availability statement

Underlying data

No underlying data are associated with this protocol.

Extended data

figshare. Extended data table1 admin data abstraction template.xlsx. https://doi.org/10.6084/m9.figshare.20113040.v1 38

This project contains the following files:

  • -

    Extended data table1 admin data abstraction template.xlsx (data extraction form for low birthweight and preterm birth estimates)

figshare: Extended data tables 2 and 3 LBW protocol.docx. https://doi.org/10.6084/m9.figshare.20113043.v2 39

  • -

    Extended data tables 2 and 3 LBW protocol.docx (variable description for household survey reanalysis outputs for data quality review)

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Reporting guidelines

No standard reporting guideline exists for protocols for global estimates. Final estimates will be reported in accordance with the GATHER statement 21 .

Acknowledgements

We would like to acknowledge the contributions made by Laith Hussain-Alkhateeb and Simon Cousens for statistical advice and Julia D’Aloisio for editorial services.

We gratefully acknowledge the contributions of members of the Estimates Consultative Group, including comments on the proposed methodology. Members include: Aris Papageorghiou - Nuffield Department of Obstetrics & Gynecology and Oxford Maternal & Perinatal Health Institute, UK; Mercedes de Onis - Royal Academy of Medicine for Spain; Bo Jacobsson - Sahlgrenska University Hospital/Ostra Sweden; Florence West - International Confederation of Midwives; Jeeva Shankar - All India Inst of Medical Sciences, India; Jennifer Zeitlin - Institut national de la santé et de la recherche médicale (INSERM), France; Jim Zhang - Shanghai Jiao Tong University School of Medicine, China; Joanne Katz - Johns Hopkins University, USA; John Wakefield - University of Washington; Max Petzold - University of Gothenburg, Sweden; Pisake Lumbiganon - Khon Kaen University, Thailand; Robert Pattinson - South African Medical Research Council, University of Pretoria, South Africa; Sharad Sharma - Ministry of Health, Family Health Division, Nepal; Shoo Lee - Toronto Sick Kids, Canada; Victor Gaigbe-Togbe - UN Population Division, USA; William Keenan - International Pediatric Association.

Funding Statement

This work was supported by the Bill and Melinda Gates Foundation grant to UNICEF [grant number 001395]; the Children’s Investment Fund Foundation (CIFF) grant to UNICEF [grant reference number: 1803-02535]; and CIFF funding to London School of Hygiene & Tropical Medicine [grant reference number: 1803-02535]. The funders had no role in developing the protocol, preparing the protocol manuscript, or deciding to submit for the protocol for publication. For the overall study, the funders will have no role in data collection and analysis, data interpretation, manuscript preparation, or the decision to submit the manuscript for publication.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

[version 1; peer review: 2 approved, 4 approved with reservations]

Footnotes

1 Data sources with ≥95% of livebirths with a valid birthweight, but no data on mother’s perception of size of birth, will be adjusted for heaping only, and their inclusion will be assessed in a sensitivity analysis.

2 Valid birthweights are defined as those falling between 250g and 5500g; birthweights falling outside this range are set to ‘missing’.

3 Note that coverage of livebirths weighed is among births with a valid birthweight for surveys and is much lower (≥30%) than required for administrative data sources (≥80%) because raw data are available for surveys, allowing multiple imputation of missing birthweights by use of other covariates from the survey.

4 The criteria requiring at least 200 birthweights and 30% of births with a birthweight in the dataset are intended to allow a sufficient sample for application of the adjustments for missing birthweights and data heaping.

5 The most common source of confusion is between lbs and kg, such as a baby just below the LBW threshold weighing 5.5 lbs but recorded as 5.5 kg). This criterion thereby excludes surveys with a large amount of unit confusion.

Author contributions

Conceptualisation: CH, EBo, JEL, JK, HB

Methodology: CH, EBr, JC, JEL, JK, HB, EOO, GAS

Writing original draft: CH, JK and HB

Writing – reviewing and editing: EBo, EBr, CC, YBO, DE, GAS, EOO, JC, GG-D, BKM, ND, CH and JEL

References

  • 1. World Health Organization: International Classification of Disease 11.Reference guide 2.28.4.1 Fetal death and live birth 2019 standards-and-reporting-requirements-for-mortality-in-perinatal-and-related-periodsc2-28-4. Reference Source [Google Scholar]
  • 2. Jornayvaz FR, Vollenweider P, Bochud M, et al. : Low birth weight leads to obesity, diabetes and increased leptin levels in adults: the CoLaus study. Cardiovasc Diabetol. 2016;15:73. 10.1186/s12933-016-0389-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Christian P, Lee SE, Donahue Angel M, et al. : Risk of childhood undernutrition related to small-for-gestational age and preterm birth in low- and middle-income countries. Int J Epidemiol. 2013;42(5):1340–55. 10.1093/ije/dyt109 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Gu H, Wang L, Liu L, et al. : A gradient relationship between low birth weight and IQ: A meta-analysis. Sci Rep. 2017;7(1):18035. 10.1038/s41598-017-18234-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. McCormick MC: The contribution of low birth weight to infant mortality and childhood morbidity. N Engl J Med. 1985;312(2):82–90. 10.1056/NEJM198501103120204 [DOI] [PubMed] [Google Scholar]
  • 6. Accrombessi M, Zeitlin J, Massougbodji A, et al. : What Do We Know about Risk Factors for Fetal Growth Restriction in Africa at the Time of Sustainable Development Goals? A Scoping Review. Paediatr Perinat Epidemiol. 2018;32(2):184–96. 10.1111/ppe.12433 [DOI] [PubMed] [Google Scholar]
  • 7. Althabe F, Moore JL, Gibbons L, et al. : Adverse maternal and perinatal outcomes in adolescent pregnancies: The Global Network's Maternal Newborn Health Registry study. Reprod Health. 2015;12 Suppl 2(Suppl 2):S8. 10.1186/1742-4755-12-S2-S8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Lean SC, Derricott H, Jones RL, et al. : Advanced maternal age and adverse pregnancy outcomes: A systematic review and meta-analysis. PLoS One. 2017;12(10):e0186287. 10.1371/journal.pone.0186287 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Lima MC, de Oliveira GS, Lyra Cde O, et al. : [The spatial inequality of low birth weight in Brazil]. Cien Saude Colet. 2013;18(8):2443–52. 10.1590/s1413-81232013000800029 [DOI] [PubMed] [Google Scholar]
  • 10. Amegah AK, Quansah R, Jaakkola JJ: Household air pollution from solid fuel use and risk of adverse pregnancy outcomes: a systematic review and meta-analysis of the empirical evidence. PLoS One. 2014;9(12):e113920. 10.1371/journal.pone.0113920 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Pereira PP, Da Mata FA, Figueiredo AC, et al. : Maternal Active Smoking During Pregnancy and Low Birth Weight in the Americas: A Systematic Review and Meta-analysis. Nicotine Tob Res. 2017;19(5):497–505. 10.1093/ntr/ntw228 [DOI] [PubMed] [Google Scholar]
  • 12. UNICEF-WHO: Database on percentage of births without a birthweight.2021; Accessed 28th November 2021. [Google Scholar]
  • 13. Biks GA, Blencowe H, Hardy VP, et al. : Birthweight data completeness and quality in population-based surveys: EN-INDEPTH study. Popul Health Metr. 2021;19(Suppl 1):17. 10.1186/s12963-020-00229-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Blanc AK, Wardlaw T: Monitoring low birth weight: an evaluation of international estimates and an updated estimation procedure. Bull World Health Organ. 2005;83(3):178–85. [PMC free article] [PubMed] [Google Scholar]
  • 15. Abeywickrama G, S Padmadas S, Hinde A: Social inequalities in low birthweight outcomes in Sri Lanka: evidence from the Demographic and Health Survey 2016. BMJ Open. 2020;10(5):e037223. 10.1136/bmjopen-2020-037223 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Zaveri A, Paul P, Saha J, et al. : Maternal determinants of low birth weight among Indian children: Evidence from the National Family Health Survey-4, 2015-16. PLoS One. 2020;15(12):e0244562. 10.1371/journal.pone.0244562 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Blencowe H, Krasevec J, de Onis M, et al. : National, regional, and worldwide estimates of low birthweight in 2015, with trends from 2000: a systematic analysis. Lancet Glob Health. 2019;7(7):e849–e60. 10.1016/S2214-109X(18)30565-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Channon AA, Padmadas SS, McDonald JW: Measuring birth weight in developing countries: does the method of reporting in retrospective surveys matter? Matern Child Health J. 2011;15(1):12–8. 10.1007/s10995-009-0553-3 [DOI] [PubMed] [Google Scholar]
  • 19. UNICEF: First call for children: World declaration and plan of action from the world summit for children Convention on the Rights of the Child.1990. Reference Source [Google Scholar]
  • 20. UNICEF: ‘A World Fit for Children,’ the Declaration and Plan of Action adopted at the United Nations General Assembly Special Session on Children in 2002.2002. [Google Scholar]
  • 21. Stevens GA, Alkema L, Black RE, et al. : Guidelines for Accurate and Transparent Health Estimates Reporting: the GATHER statement. Lancet. 2016;388(10062):e19–e23. 10.1016/S0140-6736(16)30388-9 [DOI] [PubMed] [Google Scholar]
  • 22. De Costa A, Moller AB, Blencowe H, et al. : Study protocol for WHO and UNICEF estimates of global, regional, and national preterm birth rates for 2010 to 2019. PLoS One. 2021;16(10):e0258751. 10.1371/journal.pone.0258751 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Global health observatory. Geneva: WHO. 2017. AND UNICEF Global Database on Delivery Care.2021. Reference Source [Google Scholar]
  • 24. Villar J, Cheikh Ismail L, Victora CG, et al. : International standards for newborn weight, length, and head circumference by gestational age and sex: the Newborn Cross-Sectional Study of the INTERGROWTH-21st Project. Lancet. 2014;384(9946):857–68. 10.1016/S0140-6736(14)60932-6 [DOI] [PubMed] [Google Scholar]
  • 25. StataCorp: Stata Multiple Imputation Reference Manual: Release 17. College Station, TX: StataCorp LLC;2021. Reference Source [Google Scholar]
  • 26. Royston P: Multiple imputation of missing values: Further update of ice with an emphasis on categorical variables. Stata Journal. 2009;9(3):466–77. 10.1177/1536867X0900900308 [DOI] [Google Scholar]
  • 27. Chang KT, Carter ED, Mullany LC, et al. : Validation of MINORMIX Approach for Estimation of Low Birthweight Prevalence Using a Rural Nepal dataset. J Nutr. 2022;152(3): 872–879. 10.1093/jn/nxab417 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28. UNICEF. Bangladesh Bureau of Statistics: National low birthweight survey of Bangladesh, 2003– 04.Dhaka: Planning Division, Ministry of Planning, Government of Bangladesh; 2005. Reference Source [Google Scholar]
  • 29. UN-Department of Economic and Social Affairs (DESA) Population Division.World Population Prospects. 2019. [Google Scholar]
  • 30. World Health Organization and the United Nations Children’s Fund (UNICEF): Recommendations for data collection, analysis and reporting on anthropometric indicators in children under 5 years old.Geneva. 2019. Reference Source [Google Scholar]
  • 31. World Health Organization: Data quality review: a toolkit for facility data quality assessment. Module 1. Framework and metrics. Geneva; 2017. Reference Source [Google Scholar]
  • 32. Mosley WH, Chen LC: An analytical framework for the study of child survival in developing countries. 1984. Bull World Health Organ. 2003;81(2):140–5. [PMC free article] [PubMed] [Google Scholar]
  • 33. Olusanya BO, Ofovwe GE: Predictors of preterm births and low birthweight in an inner-city hospital in sub-Saharan Africa. Matern Child Health J. 2010;14(6):978–86. 10.1007/s10995-009-0528-4 [DOI] [PubMed] [Google Scholar]
  • 34. Villar J, Papageorghiou AT, Knight HE, et al. : The preterm birth syndrome: a prototype phenotypic classification. Am J Obstet Gynecol. 2012;206(2):119–23. 10.1016/j.ajog.2011.10.866 [DOI] [PubMed] [Google Scholar]
  • 35. Victora CG, Huttly SR, Fuchs SC, et al. : The role of conceptual frameworks in epidemiological analysis: a hierarchical approach. Int J Epidemiol. 1997;26(1):224–7. 10.1093/ije/26.1.224 [DOI] [PubMed] [Google Scholar]
  • 36. Liu L, Villavicencio F, Yeung D, et al. : National, regional, and global causes of mortality in 5-19-year-olds from 2000 to 2019: a systematic analysis. Lancet Glob Health. 2022;10(3):e337–e347. 10.1016/S2214-109X(21)00566-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37. UN-DESA: Standard country or area codes for statistical use (M49).New York: UN Statistics Division; 2021. Reference Source [Google Scholar]
  • 38. krasevec j: Extended data table1 admin data abstraction template.xlsx. figshare. Journal contribution. [Dataset] 2022. 10.6084/m9.figshare.20113040.v1 [DOI] [Google Scholar]
  • 39. krasevec j: Extended data tables 2 and 3 LBW protocol.docx. figshare. Journal contribution. [Dataset] 2022. 10.6084/m9.figshare.20113043.v1 [DOI] [Google Scholar]
Gates Open Res. 2023 Jun 30. doi: 10.21956/gatesopenres.14951.r33248

Reviewer response for version 1

Aloka L Patel 1, Esther Lee 1

The authors outline a meticulous and comprehensive study protocol to update low birth rate prevalence for 202 countries between 2000 to 2020. The overarching goal of this protocol is to estimate the current global LBW prevalence in order to create programs and track progress towards targets to ultimately achieving SDG 2 goals. We want to recognize the exemplary work by the authors in putting together this protocol. We specifically found the handling the potential sources of bias in birthweight data to be comprehensive and thoughtful. We know that data driven programs and effort lead to more targeted and meaningful improvement, and this protocol will provide the foundation for future world impact on mitigating morbidity and mortality due to LBW.

We have some thoughts and clarifying questions that may help guide the authors to improve the organization and streamline the protocol to be more easily followed by those with less technical statistical knowledge.

Page 3: paragraph 1: after “live-born growth-restricted neonates add "i.e". before (small-for-gestational-age (SGA))

  • The term growth restriction is often used with IUGR (Intrauterine growth restriction), so it would provide clarity to add the "i.e" to show that SGA is an example of “live-born GR'

Page 6 “For survey datasets containing the mother’s perception of size at birth, missing birthweights will be imputed using the following variables: (i) mother’s perception of size at birth; (ii) sex of child; (iii) multiple/singleton status; (iv) maternal parity; (v) maternal height; and (vi) maternal body mass index, when available. Where a mother’s perception of size at birth is not available, only the adjustment for data heaping will be performed. Following evidence from previous research25,26,”

  • Understanding that gestational age (GA)/Ultrasound guided accurate GA may not be widely available in LMICs, we would recommend factoring in infant’s GA or estimated GA when available?

Page 7: For countries with no empirical data, what is the rationale for inclusion with inter/extrapolation of data versus omitting that country? We understand that the generated data could serve to be a valuable reference for future data collection. However, it would be valuable to understand the reasoning.

  • Also, we suggest considering sensitivity analysis with including/excluding countries without empirical data.

Page 8: criteria 2: consider sensitivity analysis based on excluding data points with evidence of under-capture of those <1,000g---Yes, we would support this

Page 9, Fig. 2: Consider writing out "U5"= under 5 child mortality. The font could be mistaken for US (United States).

Page 10: Table 3 “Early childhood outcomes associated with LBW”

  • The rationale behind including early childhood outcomes associated with LBW is unclear. Is it due to impact of the potential covariates on maternal health due to early life experiences and subsequent impact on fetus and neonate? Or are these covariates serving as markers of higher rates of LBW outcomes based on data from the geographical area?

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Yes

Are the datasets clearly presented in a useable and accessible format?

Not applicable

Reviewer Expertise:

Dr. Patel: neonatology, nutrition, disparities, clinical and economic outcomes, growth.  Dr. Lee: neonatology, global health.

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Gates Open Res. 2023 Jun 9. doi: 10.21956/gatesopenres.14951.r33402

Reviewer response for version 1

Zachary Vesoulis 1

In this study protocol, Krasevec et al. describe a multipronged approach for broad estimation of low birthweight (LBW) incidence across 200 countries. Accurate assessment of LBW incidence is an important foundation of reducing infant and child mortality and a key measure for world health initiatives.

In this protocol, the authors detail a strategy for the acquisition and handling of data from a variety of administrative and institutional sources along with a statistical approach to handle anticipated challenges.

This is an important project and a clearly written protocol and the authors should be commended for taking on this initiative. They have accurately identified several key barriers including missing data, heaping of data, and lack of reliability in the data. For some of these problems (especially heaping) they present sufficient statistical methods to overcome the challenge.

The problem of missing data looms large and will require significant effort on the part of the team. In this particular case, missing data are truly missing, making measures as simple as a denominator (much less accurate birth weights) difficult to ascertain. The authors have provided thresholds of data missingness and an imputation strategy, however I anticipate that the real data will not be missing in such a "neat" fashion. I would suggest that they consider partnership with organizations or individuals in regions where there is a high degree of missingness to provide direct estimation of birth weights in small representative samples to be used as validation for the more general statistical strategies.

I am also concerned about overreliance on web based sources of data and the primarily English-language focus. The authors note that collaborators who speak "the relevant language" will participate in the study, however it is unclear the breadth of language expertise available and additional bias may be introduced through language or record accessibility.

Finally, the usage of household survey data provides an excellent alternative source of "truth" for the purposes of the study and I appreciate the systematic approach to evaluating the survey output to ensure validity. However, it is unclear to this reviewer how standardized or available the surveys are and how readily the analytic framework can be applied to broad survey design. Unless the surveys are standardized forms, this heterogeneity will likely prove challenging and result in additional missing data, particularly when compared to vital statistics which are more readily comparable.

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Yes

Are the datasets clearly presented in a useable and accessible format?

Not applicable

Reviewer Expertise:

Prematurity, outcomes research, brain injury, neurodevelopment

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Gates Open Res. 2023 Jun 2. doi: 10.21956/gatesopenres.14951.r33246

Reviewer response for version 1

Karen J Mathewson 1, Christina A Brook 1

The authors outline a study protocol involving up to 202 countries that is designed to estimate the prevalence of low birth weight (LBW) between 2000 and 2020 at world-wide, regional, and national levels. The protocol focuses on two major sources of data, namely, national repositories of administrative data (provided the available birthweight data account for ≥ 80% of UN-estimated live births for that country/year), and birthweight data from population-based household surveys. Analyses include quality and bias assessments, comprehensive extraction of birth-weight related information from national repositories, and the development of regression models from household survey data. Sensitivity analyses also will be conducted to evaluate the contributions to the models from covariates in six domains (demographic/socio-economic factors, nutritional/environmental factors, maternal health conditions, fetal or placental conditions, factors related to access to care, and early childhood outcomes such as neonatal mortality and childhood stunting). Updated data on LBW rates are essential for evaluating the global burden of LBW and programs designed to reduce these rates, in accord with one of the Sustainable Development Goals set by the World Health Assembly in 2012, which is to reduce rates of LBW by at least 30% by 2025.

This paper presents an ambitious protocol to obtain updated estimates of LBW prevalence at national, regional, and global levels over the twenty-year period ending in 2020. The authors are to be highly commended for their rigour in compiling a statistical approach for analyzing such a complex dataset, including attention to the representativeness of the data and potential sources of bias in the data. Below are some thoughts and queries that might be helpful in clarifying the protocol:

  • Carrying the complex design from beginning to end of the paper was somewhat challenging for anyone inexperienced with such protocols. This could be improved with some re-organization of the structure of the text. An overall objective could be inserted after the Introduction. Perhaps the switching back and forth in the Protocol section between how the national administrative data and survey data will be handled could be avoided by first describing all of the plans for the national data (screening, extraction exclusions), and then all of the plans for the survey data (screening, analyses, exclusions).  

  • It would be helpful if the authors would provide brief, explicit rationales for some of the decisions they made. For example, although they describe the household surveys as “nationally representative”, it becomes clear that the point of using the surveys is that they are an essential secondary data source when national administrative data are unrepresentative of their populations due to missing data. The reason for using survey data could be made explicit. Second, how was the 30% threshold arrived at for inclusion of survey data? In another example, it is not clear why fitting imputed data to a mixture model of two distributions is necessary or important when adjusting for data heaping. Finally, what validation is available for selecting the variable “mother’s perception of newborn size”? There may be other examples.

  • The authors propose a mixed effects model (also known as a multilevel model) to model incomplete LBW data from 202 countries. There are many strengths to this design, including the use of Bayes estimation, given the likelihood of nonnormality in sample(s’) distributions and the complexity of the data, as well as the use of mixed effects to account for data dependency. It would be helpful to have a summary statement at the beginning of the statistical analysis section that outlines which analysis will be undertaken at each geographical level (national, regional, or global).

  • Another strong point in favour of the modeling design for the survey data is the inclusion of carefully selected covariates (through EFA and a conceptual framework) that account for known associations with LBW status. One concern, however, is the intent to use early childhood outcomes as covariates. Perhaps we have misunderstood the intended design, but we wonder about the validity of using outcomes such as undernutrition in childhood to predict birthweight status, when temporal precedence would indicate that these outcomes postdate LBW. The authors indicate that such early childhood outcomes are often correlated with LBW, but more justification is needed.

  • It was challenging to understand the goals of the statistical analyses of the survey data in terms of predictors and outcomes. Step 1 (covariate selection) is clear. Perhaps we misread the text, but it is not clear how a Bayesian multilevel model “will be developed to estimate LBW prevalence at a national level” at Step 2, which is then used to predict annual estimates of national LBW prevalence for all countries in Step 3.

  • The sensitivity analyses will be very helpful. In addition to evaluating the contribution of the various covariates to LBW estimates, will significant covariates ultimately be used to inform policy, or for some other purpose?

  • It would be useful to the reader to have a figure depicting a basic Bayesian multilevel-mixed regression model, with direction of effects, to clarify some of the specific relations that will be tested. The conceptual framework in Fig. 2 does not carry all of this relational information. It would also be helpful to have a simple paragraph describing the steps in the modeling process from gathering data to estimating model parameters of interest (i.e., gather data, assess at the national level for relations one, two and three, assess at the global level for relations one two and three etc.)

  • It was not entirely clear how missing data was being handled statistically at each level of analysis. Do any of the statistical approaches determine whether data is missing at random (MAR) versus NMAR, which in turn dictates how missingness is handled? It is difficult to imagine that the samples will be free of data that are not missing at random (NMAR), which may produce biased rather than “true” estimates (e.g., Muthén & Asparouhov, 2011), in addition to the problems identified by the authors in Table 2.

  • It would be reassuring to have a separate section on expected methodological limitations of the protocol.

Minor issues:

  • What does “temporal variability” at the country and regional levels refer to, at the end of p. 7? 

  • Please name the “standard approach for infilling and extrapolation consistent with previously-used approaches” alluded to in reference 36 (p. 7)? This sentence could be clearer.

Is the study design appropriate for the research question?

Partly

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

No

Are the datasets clearly presented in a useable and accessible format?

Not applicable

Reviewer Expertise:

kjm: extremely low birth weight; psychophysiology                cab: statistical analyses; statistical modelling

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Gates Open Res. 2023 May 30. doi: 10.21956/gatesopenres.14951.r33099

Reviewer response for version 1

Hong-tian Li 1, Hong-zhao Yu 1, Yang Liu 1

The protocol describes in detail the data sources, data processing, and modeling approach used to estimate the global low birth weight (LBW) prevalence from 2000 to 2020. The proposed methodology offers insightful solutions for dealing with potential biases in LBW estimates. Below are some suggestions for further optimization.

Firstly, the authors mentioned to use a novel modeling strategy in comparison to the 2019 study, but did not describe in detail the advantages of the new strategy. Readers may be interested to know the reason for that.

Secondly, it is reasonable to exclude implausible data points prior to data synthesis, but a 40% upper limit may not be appropriate across all settings. Specifically, for developed countries, this threshold may be too high, and may result in inclusion of unlikely data points for these countries (i.e., 39%). Is it possible to set exclusion criteria tailored to specific regions based on the 2019 study?

Thirdly, it may need to clarify whether only one data point per year per country will be included and incorporated into statistical modelling.

Fourthly, more detailed descriptions regarding covariate selection would be beneficial. For example, the authors mentioned that "or covariates with data for most country-years in cases where the correlation coefficients are deemed not to be that different from the covariate with the highest correlation though incomplete data." It would be beneficial to set explicit criteria in advance to determine what levels of difference qualify as no difference. Additionally, on page 7, the authors stated that "Finally, for all countries, smoothed time series will be generated using a 7-year average for model prediction." However, the reasons for using the 7-year average for model prediction is unclear.

Finally, in Figure 1 and Table 1, abbreviations such as MOH and NOS are used without explanation. The exclusion criteria for household surveys in Figure 1, "adjustment procedure for missing birthweights and data heaping yield a result", are not immediately self-explanatory.

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Reviewer Expertise:

Perinatal Epidemiology

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.

Gates Open Res. 2023 May 30. doi: 10.21956/gatesopenres.14951.r33247

Reviewer response for version 1

Mitsuhiro Haga 1

General comments:

Thank you very much for allowing me to review such an outstanding protocol paper. The authors designed a detailed protocol for estimating the prevalence of low birthweight infants worldwide. One of the major challenges for collecting populational data is under and/or inaccurate reporting, which is much more prevalent in the most vulnerable populations. The study protocol is clearly addressing the biases involved in the public health study. The most striking part of this study is that the authors will review and extract non-English data sources with the support of staff who can speak relevant languages. Not every study in the field of public health does such an inclusive methodology. I admire the passion and thoughtfulness of the authors. Their plan for data extraction and statistical analyses is plausible and reasonable. I have a few minor comments that the authors can be answered quickly.

Specific comments:

1. Statistical analysis:

The authors are going to create a Bayesian multilevel-mixed regression model predicting the prevalence of low birthweight infants. The R packages they will use in this study are specified as “RJAGS RJAGS, R2JAGS, RSTAN packages.” in the manuscript (P7, C2, L49–50). However, I cannot find such R packages. Would they be “rjags, R2jags, and rstan”? It would be much more desirable if the authors added citations of those packages as references.

2. Statistical analysis:

It seems the authors will perform statistical analysis with JAGS and Stan via R software. I am curious to know how the authors are going to use these two packages. I understand that JAGS and Stan are used for creating a Bayesian multilevel model. Are the authors going to use JAGS and Stan for different purposes? Otherwise, are they going to use the two as cross-validation?

3. Figure 1:

It would be much more reader-friendly if the authors specified the full spelling of MICS, DHS, and LMICs.

4. Figure 2:

This figure is very similar to the one that appeared on the paper written by the authors’ team in 2019 (reference #22). The conceptualization figure must be a self-citation of their previous work, but the figure does not cite reference #22. It is recommended to specify the citation source even though the work is their own project.

5. Figure 2:

I cannot understand the words “US child mortality” in the box at the third from the top at the right end. Are the authors going to study child mortality rate in the United States particularly?

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Reviewer Expertise:

Neonatology

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Gates Open Res. 2023 May 25. doi: 10.21956/gatesopenres.14951.r33242

Reviewer response for version 1

Jose R Duncan 1

I would like to congratulate the authors for creating this concise research protocol titled: “Study protocol for UNICEF and WHO estimates of global, regional, and national low birthweight prevalence for 2000 to 2020”. In this protocol, Krasevec and collaborators, designed a thorough protocol to report the low birthweight (LBW) prevalence for individual countries, regions, and globally.

 

  • Is the rationale for, and objectives of, the study clearly described?

The authors described the importance and rationale for designing their study and the objective is clear and concise.

 

  • Is the study design appropriate for the research question?

The study is thoroughly defined with clear search methodology, reporting methods, and use multiple ways to minimize and assess bias and confounding. Even though, the authors also acknowledge that it will be impossible to adjust for all biases.

 

  • Are sufficient details of the methods provided to allow replication by others?

Yes.

 

  • Are the datasets clearly presented in a useable and accessible format?

There are clear and complete data extraction methodology, but I could not see the datasets that will be use or have been already collected.

The authors described thoroughly the methodology of how they will do this study. They try to minimize bias and confounding utilizing multiple interventions, these include:

  • Well defined exclusion criteria.

  • They utilize the WHO data quality review framework, to assess the quality of the data.

  • They described a priori methodology to select covariates to be included in a model that includes multiple steps. They plan to corroborate covariates with UNICEF and WHO, they will also do a cluster analysis.

  • They described in detail the how they will obtain the LBW prevalence and report their results.

  • The authors plan to perform multiple sensitivity analyses to corroborate their LBW prevalence and use it to compare to previously used models to estimate previous LBW prevalence.

I will recommend the authors to consult an experienced statistician to verify that there are no other interventions that could improve their models.

I will also caution of not including the countries with outlier LBW prevalence and will recommend the authors to present the complete data and calculate prevalence with and without those with LBW prevalence < 2% or greater than 40%.

Lastly, I would like to see the authors include a statement acknowledging that a prospective design will have been ideal to assess the LBW, however, this methodology will take years to complete.

Is the study design appropriate for the research question?

Yes

Is the rationale for, and objectives of, the study clearly described?

Yes

Are sufficient details of the methods provided to allow replication by others?

Yes

Are the datasets clearly presented in a useable and accessible format?

Not applicable

Reviewer Expertise:

Obstetrics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Data Availability Statement

    Underlying data

    No underlying data are associated with this protocol.

    Extended data

    figshare. Extended data table1 admin data abstraction template.xlsx. https://doi.org/10.6084/m9.figshare.20113040.v1 38

    This project contains the following files:

    • -

      Extended data table1 admin data abstraction template.xlsx (data extraction form for low birthweight and preterm birth estimates)

    figshare: Extended data tables 2 and 3 LBW protocol.docx. https://doi.org/10.6084/m9.figshare.20113043.v2 39

    • -

      Extended data tables 2 and 3 LBW protocol.docx (variable description for household survey reanalysis outputs for data quality review)

    Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).


    Articles from Gates Open Research are provided here courtesy of Bill & Melinda Gates Foundation

    RESOURCES