2023 Feb 11;23:107. doi: 10.1186/s12884-023-05366-2

Table 3.

Key lessons learned from the process of harmonizing variables across different studies of BEP intervention in LMICs

Broad Lessons	Specific examples^a
Study level
Harmonization of variables should include both: a) aligning existing variables and b) deciding which measurements should be captured in all studies	While we started by harmonizing variables that were already part of each study (e.g., newborn weight), we also discussed variables that were not yet collected but would be important to capture in every study (e.g., food insecurity, certain labor and delivery outcomes).
Quality control should be part of harmonization, but it is particularly difficult to align (especially after studies have started)	We aimed to align quality control procedures for anthropometry and ultrasound variables but ultimately had to refer to best practices from reliable sources as a goal and allow each study to follow quality control practices within their specific field constraints.
Some methodological issues cannot be completely harmonized, but differences can be documented to aid interpretation	For newborn length, we agreed that having two blinded measurers was the ideal practice, but some studies had the capacity for only one measurer.
Some variables do not fit well with harmonization, and decisions on handling these variables in IPD meta-analysis can be made at the analysis level	The structure of education varies considerably across countries, including names for different levels/classes. All studies were collecting information about the amount of education each participant completed, and we will reconcile these data during analysis.
Ethical and cultural values affect what is allowed and approved in different settings	Harmonization of estimated fetal weight in the third trimester was desired to examine in utero effects of BEP on fetal growth; however, in one country, this measurement was not allowed by the IRB because growth restriction, if identified, could not be addressed in clinical care.
Variable level
There are many layers to harmonizing variables that should be discussed – equipment, training of staff using equipment, calibration of equipment, quality control, number of measurements, timing (e.g., in gestation) of measurement, and handling of data (e.g., cut-offs)	Anthropometry and gestational age were the two variable groups with the most layers to discuss. We decided to harmonize some layers (which measurements were taken, e.g., birth weight, length, and head circumference) but could not align others (e.g., expertise of staff taking ultrasound measurements).
Some variables are quite complicated, and take much more time to work out	Gestational age is measured in different ways depending on the timepoint in pregnancy and the available resources. The group spent considerable time reviewing different components of ultrasound measurements and equations to translate measurements to gestational age and ultimately decided to use INTERGROWTH-21st equations (for early and late gestation) but developed an altered protocol for deciding which measurements to capture.
Creating new names for variables (with the same underlying intent) can aid group consensus	We had a lot of discussion about how to define stillbirth and miscarriage. Although the WHO defines stillbirth as loss at ≥28 weeks, the group wanted to capture loss below 28 weeks but did not want to call it miscarriage. So, we decided to create categories by gestational age <28 weeks and call each “fetal loss <X weeks”.
Creating proxy variables can aid harmonization, particularly when it is not possible to capture a clinical variable with standard diagnostic tests (or not possible to capture it at all)	We wanted to capture cephalopelvic disproportion as a safety assessment; however, components of this outcome in clinical obstetric definitions were not possible to obtain in these settings. We created a proxy variable, named “maternal-fetal disproportion”, to capture available data with the same underlying meaning (that the baby was too large to fit through the birth canal).
When “perfect” capture of clinical variables is not possible, divide into two variables: a) ideal measurement by study team and b) clinical diagnosis from medical record (even if you cannot obtain details on how the diagnosis was made)	Preeclampsia currently has a clinical definition that includes many severe signs and symptoms that are not possible to capture in these studies. We decided to create a definition feasible for study teams to measure (high blood pressure and proteinuria) and a separate variable to capture clinical diagnosis from the medical record (which will vary by location).

^aSupplementary Table 1 has additional details about each variable or set of variables mentioned here
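
The "fetal loss <X weeks" naming convention in the table can be illustrated with a minimal sketch. Only the 28-week stillbirth threshold comes from the table (the WHO definition). The 20- and 24-week cut-offs, the function name, and the constant names are hypothetical placeholders, because the table does not list the categories the group actually adopted.

```python
from bisect import bisect_right

# From the table: the WHO defines stillbirth as loss at >= 28 weeks.
STILLBIRTH_THRESHOLD_WEEKS = 28

# HYPOTHETICAL cut-offs, for illustration only; the table does not give
# the values of X the group chose. Must be sorted and end at 28.
FETAL_LOSS_CUTOFFS_WEEKS = (20, 24, 28)


def label_pregnancy_loss(gestational_age_weeks: float) -> str:
    """Return a harmonized label for a loss at the given gestational age."""
    if gestational_age_weeks < 0:
        raise ValueError("gestational age cannot be negative")
    if gestational_age_weeks >= STILLBIRTH_THRESHOLD_WEEKS:
        return "stillbirth"
    # Find the smallest cut-off strictly greater than the gestational age.
    idx = bisect_right(FETAL_LOSS_CUTOFFS_WEEKS, gestational_age_weeks)
    return f"fetal loss <{FETAL_LOSS_CUTOFFS_WEEKS[idx]} weeks"


if __name__ == "__main__":
    for ga in (12, 20, 27.5, 30):
        print(ga, "->", label_pregnancy_loss(ga))
```

Keeping the cut-offs in a single tuple means each study can apply the same labels to its own gestational age data, and the categories can be changed in one place once the real values are known.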