Skip to main content
Health Policy OPEN logoLink to Health Policy OPEN
. 2022 Dec 6;4:100087. doi: 10.1016/j.hpopen.2022.100087

The impact of cash transfer participation on unhealthy consumption in Brazil

Fernanda Araujo Maciel a,, Diogo Duarte b
PMCID: PMC10297731  PMID: 37383885

Highlights

  • Brazilian cash transfer participants of the Bolsa Familia program spend more on food.

  • Recipients do not spend more on unhealthy products, such as packaged foods.

  • There is no increase in alcohol or tobacco products purchased.

  • Machine learning algorithms improve the propensity score estimation of the model.

Keywords: Conditional cash transfer program, Unhealthy consumption, Propensity score matching estimation

Abstract

We assess the impact of the Brazilian government’s conditional cash transfer program Bolsa Família on unhealthy consumption by households, proxied by expenses with ultra-processed food, alcohol, and tobacco products. Using machine learning techniques to improve the propensity score estimation, we analyze the intensive and extensive margin effects of participating in the program on the household purchase of unhealthy products. Our results reveal that program participants spend more on food in general, but not necessarily more on unhealthy options. While we find evidence that participants increase their probability of spending more on food away from home, they do not significantly alter their expenditures on packaged food, alcohol, or tobacco products.

1. Introduction

Conditional cash transfer (CCT hereafter) programs aim to alleviate poverty and improve the nutrition of low-income families, among other goals. The literature has shown that families participating in these programs use the funds to purchase more food, which ultimately improves their caloric consumption [1], [2]. However, the nutritional quality of the food purchased is still low. For instance, in Mexico, participants in the Oportunidades program improve their dietary intake of iron, zinc, and vitamin A, but this effect is due to the supplementary nourishment provided by the government and not due to an improvement of their diet [3]. In the United States, Supplemental Nutrition Assistance Program (SNAP hereafter) participants consume as many calories as nonparticipants. However, SNAP participants have systematically partaken in lower dietary quality than nonparticipants [4]. This suggests that SNAP participants fulfill their caloric needs by purchasing high-energy-dense foods that are poor in nutrients.

Our study further explores the implications of conditional cash transfer programs for the diet of their participants. Using the database on the Brazilian conditional cash transfer program, called Bolsa Família, we examine (1) whether participants in the program increase their food expenses overall and (2) if the increase in food expenses among these participants translates into a higher expenditure in high-caloric, low-nutritious (i.e. ultra-processed) food. Our results show that while program participants purchase more food overall, the share of ultra-processed food does not increase with the additional funds provided by the government. This result has important implications for policy design since the overconsumption of fat and sugar may lead to the development of diseases such as diabetes, hypertension, allergies, and cancer [5], [6], [7].

A second important finding is that while participants in Bolsa Família have a higher probability of purchasing food away from home and cookies, the probability of participants purchasing soda decreases. Interestingly, we find that the program participants do not significantly change their expenditures on packaged food, alcohol, or tobacco products (which include cigarettes, hand-rolled cigarettes, cigars, pipes, and cigarillos).

Our study also contributes to the methodological front. To evaluate the impact of the conditional cash transfer program on the unhealthy consumption of these families, we use propensity score matching (PSM hereafter). PSM estimation relies (mostly) on logistic regression [8], which requires the model to be parametric and linear (in terms of the log of the odds). As a result, the estimates produced by this method can be severely biased in the presence of mild nonlinear relationships. To overcome this problem, we follow recent studies in the literature that suggest that machine learning techniques can enhance the predictability of econometric models and reduce estimation biases [9], [10], [11], and compare the propensity score from a standard logistic regression model with the ones obtained from four popular machine learning models: random forests, gradient boosting, support vector machines (SVM hereafter), and neural networks. Our analysis suggests that the propensity score from the gradient boosting method has the highest sensitivity and the lowest misclassification. After computing the propensity score of the gradient boosting method, we proceed to the second stage of the PSM, which consists of matching the propensity scores of families that participate in the program and those who do not, to unveil the effects of participating in the Bolsa Família program. To the best of our knowledge, this is the first study to provide a comprehensive analysis of the performance of these machine learning algorithms on the estimation of propensity scores with this dataset.

The remainder of this paper is organized as follows. Section 2 contains a brief description of the Brazilian cash transfer program Bolsa Família and its relationship with ultra-processed foods. Section 3 contains our empirical analysis, with the description of the dataset, methodology, and results. Section 4 concludes.

2. Bolsa Família: The Brazilian CCT Program

The central concept of any CCT program is to condition the received funds provided by the government to households in poverty on the participants’ compliance with some health and education requirements [12]. Implemented in 2003, the Brazilian CCT program Bolsa Família is the largest social welfare program in the country, covering a quarter of all Brazilian households. The program was introduced with the objective of reducing hunger, malnutrition, poverty, and familial deprivation, while providing low-income families access to public services, such as health, education, and social assistance.

Only families living in extreme poverty are eligible to participate in this program, those with monthly income below R$89 per capita (approximately US$23 per capita, using the conversion rate from 2009, as all values hereafter). On top of the base value of R$89, families receive an additional value of R$41 for each child and pregnant woman or nursing mother (up to five). Families with monthly income below R$178 per capita (approximately US$45 per capita) are also eligible if they have children under 16 years old, or if they are pregnant women or nursing mothers. However, for these families, the benefits are restricted to R$41 per child, pregnant women, or nursing mothers.

Once eligible, families have to comply with the requirements for health and schooling. In addition, the household is enlisted in a national registry, namely Cadastro Único that contains self-reported information on household demographic characteristics, income, and prior participation in transfer programs. Although all households are free to register in Cadastro Único, each municipality has quotas allocated by the federal government according to a poverty assessment based on poverty maps [13]. Since officials of each municipality are responsible for the registration process, there is substantial heterogeneity across municipalities in targeting registration [14].

Bolsa Família has a substantial impact on the income of the families participating in the program. For example, a family of two parents with three children that has a total monthly income of R$400 (i.e., a monthly income below R$89 per capita) can receive a benefit up to R$212, which corresponds to an increase of more than 50% of their original total monthly income.

One of the main effects of this CCT program is that participants in Bolsa Família use the additional funds to increase their standards of living by raising their food consumption. About 87% of the participant families use the cash transfer primarily to purchase food [15]. Beneficiaries spend, on average, 56% of their household income on food. In a study with 5,000 households selected from the Bolsa Família registry, the authors show that families report significantly higher consumption of cereals, processed foods, meat, milk and other dairy products, beans, and sugar [16].

While the aforementioned studies have shown an increase in food consumption and a reduction in the food insecurity of these families, it is still unclear whether the increase in calories is associated with good nutritional practices or it is mainly caused by an increase in the consumption of ultra-processed foods. This question is central to policymakers since the increase in the consumption of ultra-processed food is accompanied by diseases, such as diabetes, hypertension, allergies, and cancer [5], [6], [7]. For this reason, we investigate in this study if the increase in food consumption experienced by Bolsa Família participants is accompanied by an increase in their consumption of ultra-processed foods.

2.1. Ultra-processed foods

Ultra-processed foods (UPF hereafter) are obtained by processing food ingredients such as hydrogenated oils, hydrolyzed proteins, and starch-modified sugars. The ingredients in ultra-processed products are additives to make the final product convenient, cheap, and flavorful [17]. As a result, the products contain a high amount of calories, fat, sugar, and/or salt. Typical examples of UPF are bread, chips, soft drinks, and processed meat [18].

UPF are predominant in high-income countries, and their popularity is fast increasing in middle-income countries as well [19]. Ultra-processed foods are particularly attractive to households with children and to those of lower income and lower education [6]. In the United States, households participating in the SNAP program have higher spending on UPF and lower spending on healthy foods, such as fruits and vegetables, compared to purchases made by nonparticipants in SNAP [20]. A similar pattern is observed in Mexico, where households participating in the CCT program Oportunidades are associated with a higher body mass index (BMI hereafter) and a higher prevalence of excess weight and obesity [21].

The same consumption trend of ultra-processed foods can be found in Brazil [18]. Using data from three household budget surveys across three decades (1987, 1995, and 2003), the authors show that the consumption of ultra-processed food products increased among both lower and upper income groups in Brazil. The most recent survey reveals that ultra-processed food products represent 28% of the total energy (418 kcal per capita) a Brazilian household purchases. The survey also documents an increase in sugar, saturated fat, and sodium over the past decades, while calories consumed from in natura or minimally processed foods decreased over the past years [22]. A study using the Brazilian Household Budget Survey data from 2009 finds that the availability of ultra-processed products to a household is positively associated with the average household BMI and the prevalence of obesity [23].

The increasing consumption of UPF in past decades motivates us to investigate if this trend is also observed among Bolsa Família participants. The next section presents the dataset, methodology, and results from our analysis.

3. Empirical analysis

3.1. Data

We use data from the Brazilian Household Budget Survey (Pesquisa de Orçamentos Familiares, POF hereafter), conducted by the Brazilian Institute of Geography and Statistics (Instituto Brasileiro de Geografia e Estatística, IBGE hereafter). The survey contains information about 55,970 households, totaling 190,159 individuals, collected from 2008 to 2009.

The main objective of the POF survey is to investigate patterns of consumption and expenditure of the Brazilian population. This data serves as input for the construction of consumption baskets used to estimate IBGE’s consumer price indexes, such as the IPCA, the main consumer price index in Brazil. POF contains information on individuals (age, level of education, and income), housing (size, the existence of sewage, and type of walls), expenditure for each household (habitation, clothing, health, and food), and source of income (total income, income from social programs including Bolsa Família).

For this article, we exclude heads of households that are under 18 years of age. We also exclude families with total monthly income below R$1 and those participants who claim to have a monthly wage per capita of above R$6,000 (approximately US$3,000). We perform a pre-trimming of the data and select households that receive up to R$465 (US$233), the monthly minimum wage in 2009. The final sample comprises 33,395 households. Out of the 33,395 households, 8,829 of them (approximately 26.4%) participate in Bolsa Família. The average monthly income of the 33,395 households (i.e., recipients and nonrecipients of Bolsa Família) is R$870.19, while the average monthly income of the 8,829 Bolsa Família recipients without government assistance is R$609.22. These families that participate in the CCT program receive, on average, R$86.32 per month from the program, which corresponds to an increase of 14% of their average income.

Table 1 presents some descriptive statistics on the demographics of the sample. The average age of the household reference person is 46 years old and 31% is female. The most prevalent race designation is multiracial (51%), followed by white (37%), and black (11%). Approximately 70% of the household reference persons have eight years or less of schooling. The average family size is 3.8 people, with 2.04 children among households that report a nonzero number of children. The average household has a monthly per capita income of R$232. The study sample households are predominantly located in the Northeast (37%), followed by the Southeast (35%), with 77% of them living in urban areas. The percentage of families under the poverty line is 24%, with 11% of them living in a situation of extreme poverty.

Table 1.

Descriptive statistics of the study sample.

Household Reference Person Household
Age 46 Family size 3.75
Female 31% Number of children 2.04
Race (%) Household income R$232.05
White 37 Extreme Poverty 11%
Black 11 Poverty 24%
Asian 0 Pregnant 4%
Multiracial 51 Breastfeeding 8%
Indigenous 1 Urban area 77%
Undeclared 0
Schooling (%) Region (%)
4 years 38 North 9
4–7 years 32 Northeast 37
8–10 years 13 Southeast 35
11–14 years 15 South 12
15 + years 2 Center 7

The variables of investigation are the amount spent (in Brazilian Reais) on unhealthy products. More specifically, we focus on household expenses for soda, cookies, packaged foods, food away from home, alcohol, and tobacco products in the month before the survey was conducted. The variable food away from home is divided into two groups: unhealthy and total. The former includes pastries and snacks, and the latter consists of all food purchased away from home, including unhealthy options. We select the variables household food expense and total food expense, which also includes food consumed away from home, to estimate the increase in food expenditure among the program participants.

Table 2 shows the percentage of households with nonzero expenditures in our variables of interest and the average household monthly expenditure in absolute value (Brazilian Reais). For example, 94.2% of the households claim to have purchased food in the previous month, spending on average R$323.90. We observe that 27.1% of households bought soda, with average spending of R$22.50, and 22% had some expenses with tobacco products, with an average of R$42.02.

Table 2.

Average expenditure by household.

Product Any expenditure (%) Expenditure (R$)
Total food 94.2 323.90
Total food at home 91.0 263.16
Soda 27.1 22.50
Cookies 36.2 19.25
Packaged food 9.0 37.21
Food away from home (unhealthy) 38.2 43.22
Food away from home (total) 59.4 104.58
Alcohol 13.4 67.18
Smoking 22.0 42.02

We perform a preliminary analysis by computing the correlation among these variables to check if they are highly correlated. A large correlation coefficient may indicate that the variables in question are part of a broader dimension, which should be identified by a factor analysis. Since the largest correlation coefficient is 0.25, the performance of a factor analysis is not necessary.

3.2. Methodology

The main objective of our analysis is to quantify if recipients of the cash transfer program increase their spending on ultra-processed foods with low nutritional value. In addition, we also investigate if the consumption of highly addictive goods such as alcohol and tobacco increases relative to similar households that do not receive monetary assistance. Different from the existing studies of Bolsa Família, we rely on the PSM to address these questions. This method provides an estimate of the effect of a treatment conditional on covariates that may help predict receiving the treatment [24]. As a result, the PSM estimates experience the effects of less selection bias and are recommended for observational studies and program evaluation [25].

The first step of the PSM is obtaining the propensity score, which is the probability of receiving the treatment, conditional on a set of covariates. By conditioning the probability of treatment on a set of covariates, we aim to make the treatment independent from the covariates. In our study, these covariates are variables that can predict being a recipient of the cash transfer. Thus, the household variables we take into consideration are the family size, number of children, household income, geographic region, whether the household is located in an urban area, whether the household is below the poverty or the extreme poverty line, whether someone in the household is pregnant or breastfeeding. In addition, we control for the household reference person variables of age, gender, race, and schooling level.

In the second step, we use the calculated propensity score to perform a nearest-neighbor matching operation, where each treated unit is matched with an untreated unit. In our study, we match a household that participates in Bolsa Família with another household that does not receive the funds from the program but shares roughly the same propensity score. This is possible because some of the eligible families do not participate in the program due to the limited number of quotas allocated to their municipality by the federal government.

To measure the effects of unhealthy expenditure, we evaluate their sensitivity by computing the extensive and intensive margins. In our study, an extensive margin refers to the difference in the percentage of a product purchased between participants and nonparticipants of Bolsa Família.

An intensive margin considers the monetary difference spent on a specific product between participants and nonparticipants. Since the expenditure data is right-skewed, we apply a log-transformation to it to obtain an empirical distribution closer to a normal distribution. Notice that by using the log-transformation, we are restricted to comparing the percentage difference in expenditure only among households who have purchased the product in question. Following the log-transformation, the skewness of all variables lies between the ideal range of −0.5 and 0.5 for the distribution to be considered approximately symmetric, except for the variables total food expense (-0.87) and household food expense (-0.69).

3.2.1. Computing the propensity score with machine learning methods

While logistic and probit regressions are the predominant methods for propensity score estimation [8], some recent studies have shown that machine learning techniques can produce estimates with lower sensitivity and accuracy. The reason is that these techniques capture nonlinear relationships among the investigated variables, resulting in better estimates of the propensity score [26].

As mentioned in Section 3.2, the first step of the PSM is obtaining the propensity score. In our study, we obtain this estimate using four different machine learning binary classification models and check which algorithm produces the propensity score estimation with the lowest sensitivity and misclassification. The methods we evaluate are random forests, gradient boosting, SVM, and neural networks.

While the study of Westreich et al. [27] argues that the predictive accuracy by some of these methods improves relative to the logistic regression, we are unaware of a study conducting a horse race among these methods in the context of the observational data of Bolsa Família. A few related studies compare multiple machine learning models using simulated data [28], [29].

After estimating a propensity score using each of the proposed methods, we compute their sensitivity, accuracy, and misclassification rate, defined as, respectively,

Sensitivity=TPP,Accuracy=TP+TNP+N, (1)
MisclassificationRate=1-Accuracy,

where the true positive TP is the number of households correctly classified as Bolsa Família participants, the true negative TN is the number of households correctly classified as non-Bolsa Família participants, the condition positive P is the total number of Bolsa Família participants, and the condition negative N is the total number of non-Bolsa Família participants.

It is important to highlight that the sensitivity measure in (1) captures the fraction of correctly classified participants in the universe of all participants of Bolsa Família. The reason why this measure is critical to us is that there are eligible households in our sample who would have been classified by the machine learning algorithms as recipients of the Bolsa Família funds but were unable to participate in the program due to the limited available quotas. For this reason, the sensitivity measure reflects more accurately the ability of the machine learning methods to correctly classify the participants of Bolsa Família than the misclassification rate or accuracy, two measures that are widely adopted in the machine learning literature [30], [31]. We use 70% of the data set for training, 10% for validation, and the remaining 20% for testing to address the issue of overfitting.

Table 3 shows the sensitivity, accuracy, and misclassification rate for each of the five models. As illustrated, gradient boosting has the highest sensitivity and accuracy (93.5% and 76.5%, respectively) among all models, followed by random forests (83.3% and 76.3%, respectively) and neural networks (77.6% and 75.4%, respectively). We highlight that despite the gradient boosting method and random forest having roughly the same accuracy, they differ more than 10% in their sensitivity, which indicates that the fact that gradient boosting is considerably more accurate when restricting the sample space to those who participate in Bolsa Família. Another surprising result is that the logistic regression–the most used method in the estimation of propensity scores–displays the lowest sensitivity and accuracy (76.5% and 74.2%) among all models tested.

Table 3.

Model Comparison.

Model Sensitivity(%) Accuracy(%) Misclassification(%)
Gradient Boosting 93.5 76.5 23.5
Random Forests 83.3 76.3 23.7
Neural Networks 77.6 75.4 24.6
SVM 76.6 74.2 25.8
Logistic Regression 76.5 74.2 25.8

Given the results presented in Table 3, we use the gradient boosting method in the first stage of the PSM to obtain the propensity score, which is essentially the probability of being a Bolsa Família participant conditional on a given set of variables we controlled for (i.e., the covariates). Once the probabilities are obtained, we proceed to the second stage of the PSM, where we match observations with similar propensity scores.

After estimating the propensity scores and before proceeding to the results, the common support should be verified. Common support is the overlap in the range of propensity scores across Bolsa Família participants and nonparticipants and it is visually assessed in the graph of propensity scores across these groups [32]. Fig. 1 presents the density plots of the propensity scores of the cash-transfer recipients (treated group, red line) and nonrecipients (control group, blue line). The left panel shows the densities for the controlled and treated groups using the full sample data (labeled as raw). As illustrated, most of the households that do not participate in the program (control group) are assigned a probability of 10% of receiving Bolsa Família, indicating that nonrecipients are correctly predicted not to partake in the program. Similarly, most of the households that participate in the program (treated group) receive a probability of participating of 75%. Before proceeding to match these groups, they should be balanced, i.e., the distribution of propensity scores should be similar across both groups. The panel on the right displays the density of the matched sample, and we can observe a great overlap of propensity scores across Bolsa Família participants and nonparticipants.

Fig. 1.

Fig. 1

Propensity scores density plots. The graphs show the propensity scores’ densities of the full sample (i.e., raw) on the left-hand side and the matched sample on the right-hand side of the control (blue) and treated (red) groups.

To check the balance in matched samples, we provide in Fig. 2 the box plots for the propensity scores, which graphically evaluate the quality of the match. The box plots for recipients are marked in red and nonrecipients in blue. The panel on the left-hand side shows the balance plot of the propensity scores for the full sample data (labeled as raw). The great disparity between these groups indicates that nonrecipients (control group) are predicted with a low probability of participating in the program, while recipients (treated group) are predicted with a higher probability of participating in the program. The panel on the right shows the balance plot for the propensity score for the matched group. Notice that the matched sample displays considerably more balanced box plots between recipients and nonrecipients of the cash-transfer program.

Fig. 2.

Fig. 2

Propensity scores box plots. The graphs show the box plots for the full sample (i.e., raw) on the left-hand side and the matched sample on the right-hand side of the control (blue) and treated (red) groups.

3.3. Results

We estimate two PSM models to capture the effects of the outcomes at the extensive and intensive margins. The extensive margin captures the change in consumption, measured as the percentage differences in household purchases between participants and nonparticipants of Bolsa Família. More specifically, the measure reflects how much more (or less) a participant household spends buying an unhealthy product in comparison to a nonparticipant household. To estimate the extensive margin, we use a dummy variable that assumes value 1 when the household has purchased the product in the past month.

The intensive margin captures the percentage increase (or decrease), measured in Brazilian Reais, a participant household spends buying an unhealthy product relative to a nonparticipant household, given that the household has purchased the unhealthy item. We estimate the effects on intensive margins using the log-transformation of the monetary expenditure to reduce the skewness of the distribution. Table 4 presents the average treatment effect (ATE hereafter), the standard error (SE hereafter) of the extensive margin model (first column), and the intensive margin model (second column). We observe that the total food consumption increases by 2 percentage points (pp hereafter) while the total food at home increases by 3.3 pp. The consumption of cookies, unhealthy food away from home, and total food away from home also experience an increase of 4.8 pp, 7.5 pp, and 5.7 pp, respectively. Notice that the consumption of unhealthy food away from home experiences the largest increase in pp among all products we investigate. Surprisingly, soda consumption is the only item that experiences a statistically significant decline in consumption of 3.3 pp, while the decline in the consumption of packaged food is not significant. Intensive margin results show that, among those who purchased food in the last month, Bolsa Família participants increase their total spending on food by 14.5% and total food purchased for consumption at home by 10.4%, in line with the findings in the literature [16], [33]. Interestingly, according to the intensive margin model, there is no significant difference in monetary expenses on the products analyzed between participants and nonparticipants. The last two rows of Table 4 reveal that the increase in consumption of alcohol and tobacco products is not significant in the models investigated, indicating that recipients of the cash transfer program are not necessarily engaging in risky consumption behaviors more than nonparticipants.

Table 4.

Gradient boosting: Marginal effects on food, alcohol, and smoking spending.

Model Extensive Margin Intensive Margin
ATE SE ATE SE
Total food 0.020*** 0.004 0.145*** 0.038
Total food at home 0.033*** 0.006 0.104*** 0.039
Soda -0.033*** 0.012 −0.058 0.042
Cookies 0.048** 0.021 0.041 0.029
Packaged food −0.010 0.012 −0.126 0.090
Food away from home (unhealthy) 0.075*** 0.022 −0.016 0.040
Food away from home (total) 0.057*** 0.022 −0.034 0.042
Alcohol 0.003 0.012 0.019 0.122
Tobacco 0.005 0.014 −0.094 0.054

Note: The table displays the ATE and SE for the extensive and intensive margin models. Each ATE cell represents a separate regression. *** denotes that the estimate is statistically significant at 1%, and ** denotes that the estimate is statistically significant at 5%.

Lastly, as a robustness check, we compare the marginal effects on food, alcohol, and smoking spending of the selected model, gradient boosting, with the standard logistic regression model. Table 5 presents the results. Overall, the average treatment effects of the extensive and intensive margins are consistent (in value and significance) with a few exceptions. Using the logistic regression model, soda consumption is not significant, but consumption and expenditure on tobacco are. Tobacco consumption increases by 2.2 pp while tobacco spending decreases by 15%.

Table 5.

Logistic regression: Marginal effects on food, alcohol, and smoking spending.

Model Extensive Margin Intensive Margin
ATE SE ATE SE
Total food 0.022*** 0.004 0.134*** 0.017
Total food at home 0.033*** 0.005 0.103*** 0.017
Soda 0.009 0.008 −0.026 0.023
Cookies 0.043*** 0.008 −0.011 0.021
Packaged food 0.008 0.006 −0.107 0.068
Food away from home (unhealthy) 0.068*** 0.008 -0.051** 0.027
Food away from home (total) 0.074*** 0.008 −0.039 0.028
Alcohol 0.001 0.006 0.010 0.048
Tobacco 0.022*** 0.007 -0.150*** 0.034

Note: The table displays the ATE and SE for the extensive and intensive margin models. Each ATE cell represents a separate regression. *** denotes that the estimate is statistically significant at 1%, and ** denotes that the estimate is statistically significant at 5%.

4. Conclusion

Our study shows that when compared to a similar household that does not participate in the Brazilian CCT program, a Bolsa Família participant household spends more funds purchasing food, but the differences in unhealthy food product expenses are not significant among families that already buy those. We also document that the probability of purchasing cookies and food away from home increases, according to the extensive margin model.

We find that there are no significant differences in the probability of purchasing alcohol and tobacco products. In addition, we find that there are no significant differences in the number of funds spent on these products as well, indicating that recipients of the CCT program do not engage in more risky consumption behaviors than nonrecipients. We show that the only unhealthy product that experiences a decline in the probability of being purchased is soda.

Overall, our study finds that the participants in Bolsa Família use the cash transfer to purchase more food, but they do not necessarily spend the funds on unhealthy products that could potentially worsen their diets and lead to long-term health complications. We argue that our results are particularly robust due to our proposed methodology of estimating the propensity score through a machine learning model horse race. We propose selecting the machine learning model with the highest sensitivity and the least misclassification. While machine learning models are generally criticized for the nontrivial interpretability of their output, estimating a propensity score for matching purposes does not require interpretation, and this criticism does not apply to our case. We find that gradient boosting outperforms all the competing models and that the most widely used method in the literature–logistic regression–has the lowest sensitivity and accuracy among all the tested models.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

Appendix A

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.hpopen.2022.100087.

Contributor Information

Fernanda Araujo Maciel, Email: fernanda.maciel@csus.edu.

Diogo Duarte, Email: diogo.duarte@fiu.edu.

Supplementary material

The following are the Supplementary data to this article:

mmc1.pdf (286.2KB, pdf)

References

  • 1.Hoddinott J., Wiesmann D. The impact of conditional cash transfer programs on food consumption. Conditional Cash Transfers in Latin America, p. 258, 2010.
  • 2.Bailey S., Hedlund K. The impact of cash transfers on food consumption in humanitarian settings: a review of evidence. Canadian Foodgrains Bank. 2013 [Google Scholar]
  • 3.Ramírez-Silva I., Rivera J.A., Leroy J.L., Neufeld L.M. The oportunidades program’s fortified food supplement, but not improvements in the home diet, increased the intake of key micronutrients in rural mexican children aged 12–59 months. J. Nutr. 2013;143(5):656–663. doi: 10.3945/jn.112.162792. [DOI] [PubMed] [Google Scholar]
  • 4.Andreyeva T., Tripp A.S., Schwartz M.B. Dietary quality of Americans by Supplemental Nutrition Assistance Program participation status: a systematic review. Am. J. Prev. Med. 2015;49(4):594–604. doi: 10.1016/j.amepre.2015.04.035. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Goncalves M.D., Hopkins B.D., Cantley L.C. Dietary fat and sugar in promoting cancer development and progression. Annu. Rev. Cancer Biol. 2019;3:255–273. [Google Scholar]
  • 6.Moran A.J., Khandpur N., Polacsek M., Rimm E.B. What factors influence ultra-processed food purchases and consumption in households with children? a comparison between participants and non-participants in the supplemental nutrition assistance program (snap) Appetite. 2019;134:1–8. doi: 10.1016/j.appet.2018.12.009. [DOI] [PubMed] [Google Scholar]
  • 7.Jamar G., Ribeiro D.A., Pisani L.P. High-fat or high-sugar diets as trigger inflammation in the microbiota-gut-brain axis. Crit. Rev. Food Sci. Nutr. 2021;61(5):836–854. doi: 10.1080/10408398.2020.1747046. [DOI] [PubMed] [Google Scholar]
  • 8.Thoemmes F.J., Kim E.S. A systematic review of propensity score methods in the social sciences. Multivar. Behav. Res. 2011;46(1):90–118. doi: 10.1080/00273171.2011.540475. [DOI] [PubMed] [Google Scholar]
  • 9.Athey S., Tibshirani J., Wager S., et al. Generalized random forests. Ann. Stat. 2019;47(2):1148–1178. [Google Scholar]
  • 10.Iskhakov F., Rust J., Schjerning B. Machine learning and structural econometrics: contrasts and synergies. Econ. J. 2020;23(3):S81–S124. [Google Scholar]
  • 11.Shobana G., Umamaheswari K. in 2021 6th International Conference on Inventive Computation Technologies (ICICT) IEEE; 2021. Forecasting by machine learning techniques and econometrics: A review; pp. 1010–1016. [Google Scholar]
  • 12.Shei A., Costa F., Reis M.G., Ko A.I. The impact of Brazil’s Bolsa Família conditional cash transfer program on children’s health care utilization and health outcomes. BMC Int. Health Human Rights. 2014;14(1):10. doi: 10.1186/1472-698X-14-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Brollo F., Kaufmann K., La Ferrara E. The political economy of program enforcement: Evidence from Brazil. J. Eur. Econ. Assoc. 2017 [Google Scholar]
  • 14.De Brauw A., Gilligan D.O., Hoddinott J., Roy S. The impact of Bolsa Familia on schooling. World Dev. 2015;70:303–316. [Google Scholar]
  • 15.Menezes F., Gadelha E., Santarelli M., Billo R., Costa D.M., Curado J.C., Burlandy L., Magalhães R., da Costa R.S., de Magalhães I.B., et al. IBASE; Rio de Janeiro: 2008. Repercussões do programa Bolsa Família na segurança alimentar e nutricional das famílias beneficiadas. [Google Scholar]
  • 16.De Bem Lignani J., Sichieri R., Burlandy L., Salles-Costa R. Changes in food consumption among the programa Bolsa Família participant families in Brazil. Public Health Nutrition. 2011;14(5):785–792. doi: 10.1017/S136898001000279X. [DOI] [PubMed] [Google Scholar]
  • 17.Monteiro C., Cannon G., Levy R.B., Claro R., Moubarac J.-C., Martins A.P., Louzada M.L., Baraldi L., Canella D. ”The food system. ultra-processing: the big issue for nutrition, disease, health, well-being,”. World Nutrition. 2012;3(12) [Google Scholar]
  • 18.Monteiro C.A., Levy R.B., Claro R.M., de Castro I.R.R., Cannon G. Increasing consumption of ultra-processed foods and likely impact on human health: evidence from Brazil. Public Health Nutrition. 2010;14(1):5–13. doi: 10.1017/S1368980010003241. [DOI] [PubMed] [Google Scholar]
  • 19.Popkin B.M., Adair L.S., Ng S.W. Global nutrition transition and the pandemic of obesity in developing countries. Nutr. Rev. 2012;70(1):3–21. doi: 10.1111/j.1753-4887.2011.00456.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Franckle R.L., Moran A., Hou T., Blue D., Greene J., Thorndike A.N., Polacsek M., Rimm E.B. Transactions at a northeastern supermarket chain: differences by supplemental nutrition assistance program use. Am. J. Prev. Med. 2017;53(4):e131–e138. doi: 10.1016/j.amepre.2017.06.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Fernald L.C., Gertler P.J., Hou X. Cash component of conditional cash transfer program is associated with higher body mass index and blood pressure in adults. J. Nutr. 2008;138(11):2250–2257. doi: 10.3945/jn.108.090506. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Martins A.P.B., Levy R.B., Claro R.M., Moubarac J.C., Monteiro C.A. Increased contribution of ultra-processed food products in the Brazilian diet (1987–2009) Revista de Saúde Pública. 2013;47(4):656–665. doi: 10.1590/S0034-8910.2013047004968. [DOI] [PubMed] [Google Scholar]
  • 23.Canella D.S., Levy R.B., Martins A.P.B., Claro R.M., Moubarac J.-C., Baraldi L.G., Cannon G., Monteiro C.A. Ultra-processed food products and obesity in Brazilian households (2008–2009) PloS One. 2014;9(3):e92752. doi: 10.1371/journal.pone.0092752. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Rosenbaum P.R., Rubin D.B. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55. [Google Scholar]
  • 25.Guo S., Barth R.P., Gibbons C. Propensity score matching strategies for evaluating substance abuse services for child welfare clients. Children Youth Services Rev. 2006;28(4):357–383. [Google Scholar]
  • 26.Athey S. University of Chicago Press; 2018. ”The impact of machine learning on economics,” in The Economics of Artificial Intelligence: An Agenda. [Google Scholar]
  • 27.Westreich D., Lessler J., Funk M.J. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression. J. Clin. Epidemiol. 2010;63(8):826. doi: 10.1016/j.jclinepi.2009.11.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Stürmer T., Schneeweiss S., Rothman K.J., Avorn J., Glynn R.J. Performance of propensity score calibration–a simulation study. Am. J. Epidemiol. 2007;165(10):1110–1118. doi: 10.1093/aje/kwm074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Setoguchi S., Schneeweiss S., Brookhart M.A., Glynn R.J., Cook E.F. Evaluating uses of data mining techniques in propensity score estimation: a simulation study. Pharmacoepidemiol. Drug Saf. 2008;17(6):546–555. doi: 10.1002/pds.1555. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Chollet F. Simon and Schuster; 2017. Deep learning with Python. [Google Scholar]
  • 31.Géron A. O’Reilly Media; 2019. Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems. [Google Scholar]
  • 32.Garrido M.M., Kelley A.S., Paris J., Roza K., Meier D.E., Morrison R.S., Aldridge M.D. Methods for constructing and assessing propensity scores. Health Serv. Res. 2014;49(5):1701–1720. doi: 10.1111/1475-6773.12182. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Martins A.P.B., Monteiro C.A. Impact of the bolsa família program on food availability of low-income brazilian families: a quasi experimental study. BMC Public Health. 2016;16(1):1–11. doi: 10.1186/s12889-016-3486-y. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

mmc1.pdf (286.2KB, pdf)

Articles from Health Policy OPEN are provided here courtesy of Elsevier

RESOURCES