The American Journal of Clinical Nutrition. 2022 Dec 19;117(1):182–190. doi: 10.1016/j.ajcnut.2022.10.016

Reliability and validity of assigning ultraprocessed food categories to 24-h dietary recall data

Nadia M Sneed 1,2, Somto Ukwuani 3, Evan C Sommer 1, Lauren R Samuels 4, Kimberly P Truesdale 5, Donna Matheson 6, Tracy E Noerper 7, Shari L Barkin 1, William J Heerman 1
PMCID: PMC10196599  PMID: 36789937

Abstract

Background

The Nova classification system categorizes foods into 4 processing levels, including ultraprocessed foods (UPFs). Consumption of UPFs is extensive in the United States, and high UPF consumption is associated with chronic disease risk. A reliable and valid method to Nova-categorize foods would advance understanding of UPF consumption and its relationship to health outcomes.

Objectives

Test the reliability and validity of training coders and assigning Nova categories to individual foods collected via 24-h dietary recalls.

Design

A secondary analysis of 24-h dietary recalls from 610 children who participated in a randomized controlled trial and were 3–5 y old at baseline was conducted. The Nutrition Data System for Research (NDSR) software was used to collect 2–3 dietary recalls at baseline and yearly for 3 y. Trained and certified coder pairs independently categorized foods into one of 4 Nova categories (minimally processed, processed culinary ingredients, processed, and ultraprocessed). Interrater reliability was assessed by percent concordance between coder pairs and by Cohen’s κ coefficient. Construct validity was evaluated by comparing the average daily macronutrient content of foods between Nova categories.

Results

In 5546 valid recall days, 3099 unique foods were categorized: minimally processed (18%), processed culinary ingredients (0.4%), processed (15%), and ultraprocessed (67%). Coder concordance = 88.3%, and κ coefficient = 0.75. Descriptive comparisons of macronutrient content across 66,531 diet recall food entries were consistent with expectations. On average, UPFs were 62% (SD 19) of daily calories, and a disproportionally high percentage of daily added sugar (94%; SD 16) and low percentage of daily protein (47%; SD 24). Minimally processed foods were 30% (SD 17) of daily calories, and a disproportionally low percentage of daily added sugar (1%; SD 8) and high percentage of daily protein (43%; SD 24).

Conclusions

This method of Nova classifying NDSR-based 24-h dietary recalls was reliable and valid for identifying individual intake of processed foods, including UPFs.

Keywords: diet quality, diet recall methods, Nova classification system, nutrition, 24-h dietary recall, ultraprocessed foods

Introduction

Ultraprocessed foods (UPFs) are manufactured food formulations that typically contain cosmetic additives and are made with minimal whole food ingredients [1]. Ultraprocessed foods are ubiquitous in the United States [2] and contribute to poor diet quality due to their high levels of sodium, refined carbohydrates, added sugars, and saturated fat [[2], [3], [4], [5], [6], [7], [8], [9]]. Data from NHANES suggest that UPFs account for nearly 60% of the mean daily energy intake for US adults [10]. Similarly, recent NHANES data indicate that among US children, the percentage of total energy consumed from UPFs increased from 61.4% in 1999 to 67.0% in 2018 [11]. Consumption of UPFs is particularly prevalent among low-income children and adolescents [12] due, in part, to increased exposure to unhealthy food marketing [13, 14] and socioeconomic constraints that reduce the intake of healthier minimally processed foods (e.g., fruits, vegetables, whole grains) [12, 14, 15]. Such high rates of UPF consumption may partially explain the diet-related health disparities (e.g., obesity, diabetes, cardiovascular disease) observed in low-income minority populations [12, 16, 17]. Therefore, methods that accurately measure UPF consumption would be important for assessing intervention effectiveness in these populations.

The degree to which foods are processed has emerged as an important indicator of diet quality [[2], [3], [4], [5], [6], [7], [8], [9]]. The Nova food classification system [1] assigns foods to 4 categories according to the extent of their processing (Table 1) and is globally recognized by the Food and Agriculture Organization of the United Nations and Pan American Health Organization as a dietary quality index for nutrition and policy health research [18, 19]. Yet, challenges to applying the Nova classification system include the nuanced definitions of food processing, the complexity of categorizing multi-ingredient foods, and the lack of sufficiently detailed food descriptions in dietary recall software related to food processing [[20], [21], [22]]. When assigning a Nova category to foods reported in dietary recalls, researchers have largely relied on an expert consensus approach without assessing the reliability or validity of that classification method [4, 10, [23], [24], [25]]. As additional investigations emerge to evaluate the impact of UPF consumption on diet quality and its related health outcomes, it will become increasingly important to have a reliable and valid method for accurately assigning a Nova category to each specific food.

TABLE 1.

Definitions and examples of ultraprocessed foods¹

| Nova classification | Definition | Examples |
| --- | --- | --- |
| Unprocessed/minimally processed | Edible parts of plants or animals. Can be minimally processed by removing inedible parts, or other methods that do not add salt, sugar, oils, or fats. | Fresh/dried fruits, leafy/root vegetables, grains, legumes, meat, poultry, seafood. |
| Processed culinary ingredients | Substances extracted, pressed, or centrifuged from Group 1 foods or from nature, designed for the preparation, seasoning, or cooking of foods. | Vegetable oils from seeds/fruits (olive oil), butter, cane sugar, honey. |
| Processed foods | Products manufactured by adding processed culinary ingredients to unprocessed/minimally processed foods to prolong the durability of foods and modify their palatability. | Canned vegetables, salted/sugared nuts, canned fish, fruits in syrup. |
| Ultraprocessed foods | Industrial formulations of several ingredients including sugar, oils/fats, and salt, and also food substances with rare culinary use, like hydrogenated oils, high-fructose corn syrup, and modified starches or flavors, colors, emulsifiers, non-sugar sweeteners, and other additives used to disguise undesirable qualities of the final product or imitate the sensorial qualities of culinary preparations. | Carbonated soft drinks, sweet or savory packaged snacks, mass-produced breads, cookies, or pastries, ready-to-heat products. |

¹ Adapted from Monteiro et al. [1].

The purpose of the current study is to test the reliability and validity of a method for training coders and assigning a Nova category to food items collected via 24-h diet recalls using the multipass method in the Nutrition Data System for Research (NDSR) software [26]. This approach aimed to classify foods into 1 of the 4 Nova categories and then create analytic variables (e.g., number of ultraprocessed calories/day) for use in future analyses to assess diet quality.

Methods

Sample size and data source

Diet recall data for the current analysis were obtained from the randomized controlled trial of the Growing Right Onto Wellness (GROW) intervention (clinicaltrials.gov: NCT01316653). The primary outcome for the GROW trial was child body mass index trajectory and has been previously reported [27]. The trial was conducted as a part of the Childhood Obesity Prevention and Treatment Research Consortium and was supported by an independent coordinating center at the University of North Carolina, Chapel Hill [28]. The Vanderbilt University Medical Center Institutional Review Board approved the GROW trial, and written informed consent was obtained by bilingual (Spanish- and English-speaking) data collectors in participants’ language of choice. The current investigation was a secondary analysis of the data collected as a part of the original GROW trial.

For the original GROW trial, 610 parent–child pairs were recruited from multiple local settings, including medical practices and community centers in Nashville, TN. Participants were enrolled between August 2012 and May 2014, with the final 36-mo follow-up between October 2015 and June 2017. Eligibility criteria included being a child aged 3–5 y and being high normal weight to overweight but not obese (BMI ≥50th and <95th percentile based on standardized growth curves from the US Centers for Disease Control and Prevention) [29]. Caregiver eligibility included being English- or Spanish-speaking, consistent telephone access, and a commitment to participate in the study. Households also had to qualify for at least 1 service for underserved populations (e.g., Medicaid, Special Supplemental Nutrition Program for Women, Infants, and Children). Parent–child pairs were excluded if a medical condition precluded regular physical activity or if they lived/worked outside of a 5-mile radius of participating community centers. Race/ethnicity information [30] was assessed by parent report, using fixed categories with an open-ended option.

As a part of the original GROW trial protocol, dietary assessments were collected at 4 timepoints: baseline and then annually (12-, 24-, and 36-mo follow-up) using NDSR software. At each timepoint, participants were asked to complete 3 24-h recalls with trained and certified research assistants (2 weekdays and 1 weekend day), and a minimum of 2 recalls were required for analysis at each timepoint. Recalls were collected over the telephone in English or Spanish. To avoid collecting days with similar foods, recalls were not conducted on consecutive days and the third recall was collected more than 1 wk after the first recall. The goal was to collect all recalls within 45 days but recalls collected outside that timeframe were still included for analysis. Quality assurance checks were conducted on at least 10% of the dietary recalls according to NDSR standard protocols.

The recall data from all participant diet recalls across all time points during the GROW trial contained 3497 unique foods. Of these, 397 were assigned a Nova category by a single expert coder to establish a baseline understanding of the dataset and provide guidance to other coders. These 397 foods covered a broad range of foods consumed by children in the GROW trial and included examples from each of the Nova categories. Subsequently, the remaining 3100 foods were coded by 6 pairs of coders using the process described below. This double-coded unique-foods dataset (n = 3100) was used for the reliability component of the current study. After all 3497 unique foods had been coded, the Nova categories were merged back into the full diet recall dataset, and this enhanced diet recall dataset was used for the validity component of the current study.

Nova classification system

The Nova classification system, developed by Monteiro and colleagues [31], consists of 4 separate processing categories: unprocessed and minimally processed, processed culinary ingredients, processed foods, and UPFs [1]. Nova classification is unique in that it categorizes foods based on the extent of their processing and includes a category to identify industrially manufactured “food stuffs” considered to be ultraprocessed [1]. A detailed overview of the 4 food categories is provided in Table 1. The Nova classification system has been expanded upon since its original introduction in 2010 [31]. For this study, Nova classification was based on 2017 [4] and 2019 [1] published criteria by the original author.

Training of coders for Nova category assignment

To assign 1 of the 4 Nova categories, each unique food item was reviewed by 2 independent coders using a set of decision rules adapted by the study team (see Supplementary Methods). Coders included 11 dietetic interns and 1 medical student.

All coders underwent training and certification prior to the categorization process. The training consisted of 1) an in-person session facilitated by the study team, including 2 h of content in which the Nova classification and the related scientific literature were reviewed, and 2) an online 1-h training module developed by the study team. Coders were also instructed to read previous literature related to the Nova classification system [4]. Prior to completing the certification, coders were given a sample of 50 items to categorize while receiving feedback in real time. To be certified, coders were given up to 2 attempts to classify 25 foods with at least 90% accuracy using the previously described resources. Items included in the quiz were derived from a list of the most frequently appearing foods within the GROW NDSR diet recall data and were selected to include a representative sample of Nova categories.

After successful completion of training requirements, pairs of coders were assigned a set of unique foods (ranging between 460 and 580 foods) from the GROW diet recall data and coded them into 1 of the 4 Nova categories or an “I don’t know” category. Additionally, coders were asked to rate the difficulty of categorizing each item on a scale of 1–4 (Very Easy, Easy, Hard, Very Hard). Coders were able to use any resources (e.g., primary literature, training module, online search engines) to assist in determining their scoring and met weekly with the research team to discuss and resolve questions. Coders completed categorizations independently from each other. An additional coder (author NMS) reviewed and adjudicated coding discrepancies. In the event that the third coder (NMS) was unable to make a final decision regarding coding discrepancies, the issue was brought to the study team (coauthors on this manuscript), which consisted of experts in nutrition, child health and development, and childhood obesity. The study team was responsible for making a final decision to resolve the coding discrepancies.
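
The double-coding step described above can be sketched as a small triage routine. This is only an illustration and not the study team's tooling: the function name `triage`, the label strings, and the sample foods are hypothetical, and it assumes that foods marked “I don’t know” by either coder are routed to review alongside discordant foods.

```python
# Hypothetical label strings for the 4 Nova categories plus the unsure option.
NOVA = ("minimally processed", "processed culinary ingredient",
        "processed", "ultraprocessed")
UNSURE = "I don't know"
VALID = set(NOVA) | {UNSURE}


def triage(codes_a, codes_b):
    """Split double-coded foods into concordant, discordant, and unsure.

    codes_a and codes_b map each food description to the label chosen
    independently by coder A and coder B (same foods assumed in both).
    """
    result = {"concordant": [], "discordant": [], "unsure": []}
    for food in sorted(codes_a):
        a, b = codes_a[food], codes_b[food]
        if a not in VALID or b not in VALID:
            raise ValueError(f"unknown label for {food!r}")
        if UNSURE in (a, b):
            result["unsure"].append(food)       # assumed to need review
        elif a == b:
            result["concordant"].append(food)   # accepted as coded
        else:
            result["discordant"].append(food)   # sent for adjudication
    return result


# Made-up example foods and labels.
coder_a = {"apple, raw": "minimally processed",
           "peaches, canned in syrup": "processed",
           "granola bar": "ultraprocessed"}
coder_b = {"apple, raw": "minimally processed",
           "peaches, canned in syrup": "ultraprocessed",
           "granola bar": UNSURE}
buckets = triage(coder_a, coder_b)
```

Keeping the three groups separate mirrors how the results are later reported (concordant, discordant, and “I don’t know” counts).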

Interrater reliability

Reliability of the categorization process was assessed based on the initial categorization attempts, prior to final study team adjudication. The number or percent of discordant categorizations and the type of disagreement are reported overall as well as by Nova category. To identify whether specific categories or types of foods may have been particularly challenging to categorize, discordant categorizations are also reported by 12 broad food groups. These groups were chosen because they were intuitive to understand, and almost all foods could be readily placed into one of them. Cohen’s unweighted κ coefficient is also reported by pairs and overall. Typically, the interpretation of Cohen’s κ is as follows: poor (κ < 0), slight (κ = 0.00–0.20), fair (κ = 0.21–0.40), moderate (κ = 0.41–0.60), substantial (κ = 0.61–0.80) or almost perfect (κ = 0.81–1.00) [32].
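
For readers unfamiliar with the statistic, the following is a minimal sketch of percent agreement and unweighted Cohen’s κ for one coder pair, written in Python rather than the R used by the authors. The function names, the single-letter labels, and the toy data are assumptions made for illustration; the exclusion of “I don’t know” items follows the approach described for the reliability statistics.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b, exclude="I don't know"):
    """Return (observed agreement, unweighted Cohen's kappa) for two coders.

    Items where either coder chose `exclude` are dropped first.
    Undefined (division by zero) if chance agreement equals 1.
    """
    pairs = [(a, b) for a, b in zip(labels_a, labels_b)
             if exclude not in (a, b)]
    n = len(pairs)
    p_observed = sum(a == b for a, b in pairs) / n
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    # Chance agreement: sum over categories of the product of the
    # two coders' marginal proportions.
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


def interpret(kappa):
    """Map kappa to the descriptive bands quoted in the text [32]."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial"),
                         (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Toy data: M = minimally processed, P = processed, U = ultraprocessed.
coder_a = ["U", "U", "U", "P", "M", "M", "U", "I don't know"]
coder_b = ["U", "U", "P", "P", "M", "U", "U", "M"]
agreement, kappa = cohen_kappa(coder_a, coder_b)
```

In this toy example 5 of the 7 retained items agree, and κ works out to 15/29 (about 0.52), which falls in the “moderate” band.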

Construct validity assessment

The foods included in the final, adjudicated dataset were used to assess the validity of the approach. The adjudicated Nova categories were merged into the “full diet recall dataset,” which consisted of 1 entry (row) for every instance of a food being consumed across all recall days, all participants, and all time points. This full diet recall dataset therefore contained repeated entries for every food as many times as it was reportedly consumed throughout the study, accompanied by information describing the quantity consumed for each instance, and a corresponding Nova code for each food. Before conducting any validity analyses, total daily kilocalories were calculated (one total per child per recall day), and all entries from days on which the total kilocalories consumed was <350 or >3750 were excluded. A small number of the remaining entries had negative caloric values and were excluded from analysis. Per the NDSR manual, foods may contain negative values when there is an adjustment being made to match the weight and/or nutrient content of the food.
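
The exclusion rules above can be sketched as follows. This is an illustrative Python version, not the authors' code: the record keys `child`, `day`, and `kcal` and the sample rows are hypothetical. Following the order described in the text, daily totals are computed first (including any negative adjustment rows), implausible days are dropped, and negative-calorie entries are then removed from the remaining days.

```python
from collections import defaultdict

KCAL_MIN, KCAL_MAX = 350, 3750  # daily thresholds stated in the text


def apply_exclusions(entries):
    """Drop implausible recall days, then entries with negative calories.

    Each entry is a dict with hypothetical keys 'child', 'day', 'kcal'.
    """
    daily_total = defaultdict(float)
    for e in entries:
        daily_total[(e["child"], e["day"])] += e["kcal"]
    kept = []
    for e in entries:
        total = daily_total[(e["child"], e["day"])]
        if total < KCAL_MIN or total > KCAL_MAX:
            continue  # the whole recall day is excluded
        if e["kcal"] < 0:
            continue  # negative adjustment row on an otherwise valid day
        kept.append(e)
    return kept


# Made-up rows: one valid day, one day that is too low, one too high.
entries = [
    {"child": 1, "day": 1, "kcal": 400.0},
    {"child": 1, "day": 1, "kcal": 300.0},
    {"child": 1, "day": 1, "kcal": -5.0},    # adjustment row
    {"child": 1, "day": 2, "kcal": 200.0},   # day total below 350
    {"child": 2, "day": 1, "kcal": 2000.0},
    {"child": 2, "day": 1, "kcal": 1800.0},  # day total above 3750
]
kept = apply_exclusions(entries)
```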

The full diet recall dataset was used to examine validity at 3 levels: the food recall-entry level, the day level, and the study-timepoint level. In each case, we hypothesized that, if this training and coding approach were valid, foods coded as UPFs would have relatively lower protein and much higher added sugar and correlate with a lower diet quality score [via the Healthy Eating Index (HEI)-2010] while foods coded as “minimally processed” would have higher protein and lower added sugar and correlate with a higher diet quality score.

The first method was analyzed at the food recall-entry level because it was focused on the ratio of each macronutrient per 100 kcal within each Nova category across all foods consumed throughout the study, without respect to daily or timepoint-based intake patterns. This method examined the macronutrient composition of foods consumed in each Nova category by calculating the ratios (unit of mass per 100 kcal) of total fat (g), saturated fatty acids (g), protein (g), carbohydrates (g), added sugars (g), fiber (g), and sodium (mg) within each Nova category. Macronutrient ratios were calculated for each food entry in the enhanced diet recall dataset (where each food entry had been placed into a Nova category). Weighted summary statistics (medians, interquartile ranges, and interdecile ranges) were then calculated within each Nova category, weighting each entry by total calories consumed. Weighting the entries in this manner ensured that foods that were consumed in relatively greater (caloric) quantities within each Nova category would contribute more to this assessment of validity.
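
A minimal sketch of the entry-level calculation is shown below. The text does not specify which weighted-quantile definition was used, so this example assumes a simple one (the smallest value at which the cumulative weight share reaches the requested quantile); the function names and the three sample entries are hypothetical. Entries with 0 kcal are assumed to have been removed beforehand, since the ratio is undefined for them.

```python
def per_100_kcal(amount, kcal):
    """Nutrient mass per 100 kcal for one food entry (kcal must be > 0)."""
    return 100.0 * amount / kcal


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1).

    This is one common definition, assumed here for illustration.
    """
    ordered = sorted(zip(values, weights))
    total = sum(w for _, w in ordered)
    cumulative = 0.0
    for value, weight in ordered:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return ordered[-1][0]


# Made-up entries within one Nova category: (added sugar in g, kcal consumed).
entries = [(78.0, 300.0), (5.0, 100.0), (0.0, 50.0)]
ratios = [per_100_kcal(sugar, kcal) for sugar, kcal in entries]
weights = [kcal for _, kcal in entries]  # weight each entry by its calories

weighted_median = weighted_quantile(ratios, weights, 0.5)
unweighted_median = sorted(ratios)[len(ratios) // 2]
```

Here the high-calorie entry dominates, so the calorie-weighted median (26 g per 100 kcal) is well above the unweighted median (5 g per 100 kcal). This illustrates why weighting lets foods consumed in larger caloric quantities contribute more.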

The second method was analyzed at the day level because it was focused on daily intake patterns. Specifically, the mean daily percent of a given macronutrient that came from a given Nova category was compared to the mean percent of daily caloric intake from that category. The following totals were calculated for each recall day for each child, overall and within each Nova category: total kilocalories, total fat (g), saturated fatty acids (g), protein (g), carbohydrates (g), added sugars (g), fiber (g), and sodium (mg). These daily total variables were then used to calculate the daily percentage of kilocalories and the percentage of each macronutrient that came from each Nova category in each recall day and child. Means and standard deviations were then calculated for these daily percentages across all recall days for all children. The mean macronutrient percentages were then compared to the mean percent of total kilocalories within each Nova category to determine whether a Nova category contributed a disproportionate amount of healthy (e.g., protein, fiber) or unhealthy (e.g., added sugar, sodium) macronutrients. For example, if foods coded as UPFs contributed 60% of daily kilocalories on average, validity would be supported if the UPF category contributed less than 60% of the average daily protein intake and more than 60% of the average daily intake of added sugars.
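
The day-level comparison can be sketched as follows. This is an illustrative Python version under stated assumptions: the record keys (`child`, `day`, `nova`, `kcal`, `sugar`, `protein`) and the sample values are hypothetical, days on which the nutrient total is zero are skipped because the share is undefined, and a category absent on a given day is counted as contributing 0%. The text does not say how the authors handled either case.

```python
from collections import defaultdict
from statistics import mean


def daily_shares(entries, nutrient):
    """Percent of each day's `nutrient` that came from each Nova category.

    Returns {(child, day): {category: percent}}.
    """
    by_day = defaultdict(lambda: defaultdict(float))
    for e in entries:
        by_day[(e["child"], e["day"])][e["nova"]] += e[nutrient]
    shares = {}
    for key, cats in by_day.items():
        total = sum(cats.values())
        if total > 0:  # share is undefined when the daily total is zero
            shares[key] = {c: 100.0 * v / total for c, v in cats.items()}
    return shares


def mean_share(shares, category):
    """Mean daily percent for one category across all recall days."""
    return mean(day.get(category, 0.0) for day in shares.values())


# Made-up data: two recall days for one child, already summed by category.
entries = [
    {"child": 1, "day": 1, "nova": "UPF", "kcal": 600, "sugar": 30, "protein": 10},
    {"child": 1, "day": 1, "nova": "MP", "kcal": 400, "sugar": 0, "protein": 30},
    {"child": 1, "day": 2, "nova": "UPF", "kcal": 500, "sugar": 20, "protein": 15},
    {"child": 1, "day": 2, "nova": "MP", "kcal": 500, "sugar": 5, "protein": 35},
]
upf_kcal = mean_share(daily_shares(entries, "kcal"), "UPF")
upf_sugar = mean_share(daily_shares(entries, "sugar"), "UPF")
upf_protein = mean_share(daily_shares(entries, "protein"), "UPF")
```

In this toy dataset UPFs supply 55% of calories on average but 90% of added sugar and 27.5% of protein, which is the pattern the validity hypothesis expects.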

The third method was analyzed at the study-timepoint level (mean of 2–3 recall days) because it was focused on whether the mean daily percent of calories from each Nova category was correlated with mean daily HEI-2010 total score and/or the 12 HEI subcomponent scores. These diet quality scores were only available at the timepoint level in the original study. Details about the HEI-2010 total and subcomponent scores have previously been described elsewhere [33]. This analysis was conducted at the timepoint level because the original GROW trial calculated 1 HEI value per participant per timepoint (baseline and 12, 24, and 36 mo) to characterize child diet quality at each timepoint; calculations for both the HEI and the current analysis combined all recall entries from a single timepoint for each child, without regard to day. Weighted Pearson correlations were used to account for the repeated-measurements structure (multiple timepoints per child) and to ensure that each child contributed equally to the analyses regardless of the number of timepoints available for that child.
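
One way to obtain a correlation in which each child contributes equally is to weight each timepoint row by the inverse of the number of rows available for that child. The text does not state the exact weighting scheme, so the sketch below is an assumption, and the function names and sample values are hypothetical.

```python
from collections import Counter
from math import sqrt


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return cov / sqrt(var_x * var_y)


def equal_child_weights(child_ids):
    """Weight each row by 1 / (rows for that child); assumed scheme."""
    counts = Counter(child_ids)
    return [1.0 / counts[c] for c in child_ids]


# Made-up rows: child "a" has 2 timepoints, child "b" has 1.
children = ["a", "a", "b"]
upf_pct = [70.0, 60.0, 40.0]    # percent of daily calories from UPFs
hei_total = [45.0, 50.0, 65.0]  # HEI-2010 total score
weights = equal_child_weights(children)
r = weighted_pearson(upf_pct, hei_total, weights)
```

With these weights the two rows for child "a" together carry the same total weight as the single row for child "b".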

All statistical analyses were performed using R version 4.1.2 [34]. A P value <0.05 was considered statistically significant.

Results

Eleven dietetic interns, 1 medical student, and 1 PhD-level nurse with training in nutrition research completed the training, with 12 scoring at least 90% on the certification, qualifying them to participate in Nova categorization. One dietetic intern required 4 attempts to pass the quiz and did not participate in the categorization process. The demographic characteristics of the children in the GROW trial at baseline included mean age of 4.3 (SD 0.9) y, 51.9% female, 91.4% Hispanic/Latino. The Special Supplemental Nutrition Program for Women, Infants, and Children and/or Supplemental Nutrition Assistance Program was used by 87.5% of families [27].

There were 3100 unique double-coded foods in the dataset. When comparing the Nova categories from each rater prior to adjudication by the study team, 2619 (84.5%) foods had concordant categorizations, 347 (11.2%) had discordant categorizations, and 134 (4.3%) were initially coded as “I don't know” by 1 or both coders (Figure 1). The most common type of categorization discordance was between processed foods and UPFs (n = 210 discordant foods) followed by discordance between unprocessed/minimally processed and processed (n = 64) and unprocessed/minimally processed and ultraprocessed (n = 65). The discordance and “I don’t know” rates are shown by food group in Figure 2. The food groups with the highest rates of discordance were fruits (26.9% discordant codes), condiments/spices (19.0% discordant), ready-to-eat foods (17.4% discordant), and grains (17.1% discordant). The food groups with the lowest rates of discordance were baby formula/food (0% discordant), sweets/snacks (3.4% discordant), and soups (4.1% discordant).

FIGURE 1.

Concordance by Nova category. Color represents concordance (green), discordance (red), and “I don’t know” (gray) frequencies by Nova categorization, compiled across 6 coder pairs and 3100 double-coded unique foods.

FIGURE 2.

Nova categorization and concordance by food group. The 3100 double-coded unique foods are shown, categorized into 12 food groups. The percent of foods with concordant rater Nova categories, the percent with 1 or both coders rating “I don’t know,” and the percent discordance are shown for each food group.

Interrater reliability

After excluding foods where either coder indicated “I don’t know,” coder concordance was 88.3%, and the overall κ coefficient was 0.75, with 95% CI (0.73, 0.77). The overall mean self-rated difficulty (possible range of 1-very easy to 4-very difficult) was 1.4 (SD 0.55). Table 2 shows the percent agreement, κ coefficient, and mean difficulty rating for each coder pair.
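
As a rough arithmetic check, the pooled concordance can be reproduced from the counts reported earlier in the Results (2619 concordant and 347 discordant foods, with the 134 “I don’t know” foods excluded). Here $p_o$ denotes observed agreement and $p_e$ chance-expected agreement. The value of $p_e$ is not reported in the text; it is back-calculated from the rounded values of $p_o$ and κ, so it should be read as approximate.

\[
p_o = \frac{2619}{2619 + 347} = \frac{2619}{2966} \approx 0.883,
\qquad
\kappa = \frac{p_o - p_e}{1 - p_e}
\;\Rightarrow\;
p_e = \frac{p_o - \kappa}{1 - \kappa} \approx \frac{0.883 - 0.75}{1 - 0.75} \approx 0.53
\]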

TABLE 2.

Interrater reliability summary¹

| Coder pair | Agreement | Cohen’s κ | Mean (SD) of self-rated difficulty of foods² | Percent of foods for which both coders rated the difficulty as very easy or easy |
| --- | --- | --- | --- | --- |
| Pair 1 (n = 461 foods) | 98.2% | 0.97 | 1.5 (0.59) | 88.5% |
| Pair 2 (n = 507 foods) | 87.2% | 0.65 | 1.2 (0.47) | 94.5% |
| Pair 3 (n = 512 foods) | 79.2% | 0.64 | 1.5 (0.48) | 93.2% |
| Pair 4 (n = 583 foods) | 86.5% | 0.76 | 1.3 (0.45) | 94.7% |
| Pair 5 (n = 550 foods) | 85.4% | 0.60 | 1.6 (0.63) | 79.3% |
| Pair 6 (n = 487 foods) | 95.1% | 0.77 | 1.3 (0.60) | 89.7% |
| Overall | 88.3% | 0.75 | 1.4 (0.55) | 90% |

¹ n represents all foods reviewed by coder pairs. However, the agreement and Cohen’s κ statistics were calculated using only those foods for which neither coder indicated “I don’t know.”

² Possible range of self-rated difficulty: 1 (very easy) to 4 (very difficult).

After adjudication, 1 food item was uncategorizable due to a lack of sufficient detail provided in the 24-h diet recall food description (“soup, unknown kind, unknown base”). The remaining 3099 double-coded food items were categorized as follows: unprocessed/minimally processed, n = 545 (18%); processed culinary ingredients, n = 13 (0.4%); processed, n = 469 (15%); and ultraprocessed, n = 2072 (67%).

Construct validity assessment

The initial dataset to be used for the validity assessment consisted of 76,146 food entries from 5567 24-h diet recall days from the 610 GROW children. After applying exclusion criteria as noted above, the final dataset consisted of 74,684 food entries from 5546 d (98.1% and 99.6% of the original numbers) from the 610 children. Of these 74,684 food entries, 8153 entries were coded as having 0 calories by the NDSR software (8018 entries were water, mineral water, soda water, or ice; 69 entries were salt; 66 entries were unsweetened or diet beverages). This resulted in a total of 66,531 food entries that contributed to the validity analyses.

At the food recall-entry level, descriptive comparisons of macronutrient content were consistent with expectations: UPFs generally had higher added-sugar-to-calorie ratios and lower protein-to-calorie ratios than minimally processed foods (Figure 3 and Supplemental Table 2).

FIGURE 3.

Macronutrients by mass [grams (g) or milligrams (mg)] per 100 kcal for foods consumed in each Nova category. Plots show weighted medians (points), interquartile ranges (thick lines), and interdecile ranges (thin lines); foods were weighted within each category by number of calories consumed.

At the day level, the relative macronutrient contributions of the Nova groups were also consistent with expectations. On average, over the 5546 daily recalls, UPFs made up 62% (SD 19) of the day’s calories, and a comparatively high percentage of the day’s added sugar (94%; SD 16) and low percentage of the day’s protein (47%; SD 24) came from UPFs. Minimally processed foods were 30% (SD 17) of the day’s calories, and a disproportionally low percentage of the day’s added sugar (1%; SD 8) and high percentage of the day’s protein (43%; SD 24) came from minimally processed foods (see Supplemental Tables 1 and 2 for full comparisons).

At the study-timepoint level, child daily UPF consumption (expressed as a percent of daily caloric intake) was negatively correlated with the HEI total score (r = −0.39, P < 0.001), and child minimally processed food consumption was positively correlated with the HEI total score (r = 0.40, P < 0.001). The full set of correlations between percent of daily calories from the 3 main Nova categories and the HEI-2010 total score and its subcomponents is shown with P values in Table 3.

TABLE 3.

Weighted correlations between mean daily intake at each study-timepoint of the 3 main Nova categories and the Healthy Eating Index-2010 total and subcomponent scores

| Healthy Eating Index | Minimally Processed Foods | Processed Foods | Ultraprocessed Foods |
| --- | --- | --- | --- |
| Adequacy subcomponents¹ | | | |
| Total score | 0.40 (P < 0.001) | 0.06 (P = 0.012) | −0.39 (P < 0.001) |
| Total vegetables | 0.18 (P < 0.001) | 0.12 (P < 0.001) | −0.22 (P < 0.001) |
| Greens and beans | 0.31 (P < 0.001) | 0.06 (P = 0.002) | −0.31 (P < 0.001) |
| Total fruits | 0.29 (P < 0.001) | 0.02 (P = 0.376) | −0.28 (P < 0.001) |
| Whole fruits | 0.35 (P < 0.001) | 0.06 (P = 0.01) | −0.35 (P < 0.001) |
| Whole grains | −0.00 (P = 0.912) | 0.05 (P = 0.046) | −0.02 (P = 0.318) |
| Dairy | 0.17 (P < 0.001) | 0.00 (P = 0.852) | −0.16 (P < 0.001) |
| Total protein | 0.24 (P < 0.001) | 0.04 (P = 0.096) | −0.24 (P < 0.001) |
| Seafood and plant proteins | 0.15 (P < 0.001) | 0.08 (P < 0.001) | −0.18 (P < 0.001) |
| Fatty acids | −0.13 (P < 0.001) | −0.07 (P = 0.002) | 0.16 (P < 0.001) |
| Moderation subcomponents² | | | |
| Sodium | 0.15 (P < 0.001) | −0.08 (P < 0.001) | −0.10 (P = 0.002) |
| Refined grains | 0.29 (P < 0.001) | 0.03 (P = 0.112) | −0.28 (P < 0.001) |
| Empty calories | 0.24 (P < 0.001) | 0.06 (P = 0.006) | −0.25 (P < 0.001) |

Correlations were calculated from a dataset with 2019 rows from a total of 610 children (maximum number of rows per child: 4). HEI subcomponents are interpreted such that a higher value represents a healthier diet.

¹ For the 9 adequacy subcomponents, a higher value indicates higher intake of these healthy foods.

² For the 3 moderation subcomponents, a higher value indicates a lower or more limited intake of these less healthy foods.

Discussion

This method of training health professionals with prior experience in nutrition and our approach of classifying food items via their respective Nova category was a reliable and valid approach to identify individual consumption patterns for UPFs from 24-h dietary recall data collected with NDSR software. Reliability was demonstrated by high interrater reliability with an overall agreement of 88.3% and κ statistic of 0.75. Validity was demonstrated by descriptive macronutrient comparisons between Nova categories aligning with expectations, and as-expected correlations between higher rates of UPF consumption and unhealthier diet quality (e.g., lower HEI total and subcomponent scores). Taken together, this provides evidence that this type of coding and training approach can be used to draw meaningful conclusions in dietary research, with limited concerns for misclassification bias. Moreover, our results (i.e., UPFs accounted for 62% of total daily calories in children 3–5 y) are also in line with recent evidence that found that UPFs made up 61.1% of the total energy intake of US children aged 2–5 y [11].

The implications of this work are highly relevant to scientists, healthcare professionals, and policy makers interested in understanding how exposure to nutrients, foods, and diet patterns (particularly from UPFs) influence health outcomes and disease risk. Many dietary measures such as the HEI-2010 and 2015 categorize foods into groups based on their specific food and/or nutrient classification (e.g., total fruits/vegetables, whole and refined grains, total protein, added sugar), in accordance with US dietary guideline recommendations [33, 35]. Applying the methods detailed in this study to Nova categorize NDSR dietary recall data offers investigators the advantage of an additional level of context about the quality of foods consumed as part of a dietary pattern. This is because Nova categorizes foods based on the extent to which they are modified by any physical, biological, or chemical change, allowing determinations to be made about the “quality” of the food that moves beyond an assessment of macronutrient content. This is especially important as evidence continues to link UPF exposure to poor health outcomes, including overweight/obesity, cardiovascular disease, diabetes, and cancer [36].

Despite high levels of interrater agreement, the overall initial coding agreement of 88.3% falls short of the precision needed for research to appropriately draw inferences about UPF consumption from a single coder. Therefore, it is recommended that future researchers employ a double-coding methodology followed by expert adjudication of discordant food items when conducting Nova categorization. There are tens of thousands of different foods that individuals consume in the United States and globally. While even trained nutrition experts (i.e., dietetic interns) had some difficulty agreeing on food items, this study found a high degree of consistency in assigning Nova categories for commonly consumed food items. Yet, there were still numerous circumstances that required special attention during coding.

It is important to note that while this approach to Nova categorization demonstrated validity through its alignment with the macronutrient and HEI criteria, UPF intake should not be interpreted as interchangeable with these variables. UPF intake should be expected to have some overlap with these and other related measures; however, it is likely capturing a unique construct. Although there is some evidence that UPF intake may be related to certain health outcomes [37, 38], further study is needed to determine if elevated levels of UPF consumption at different life stages (e.g., early childhood) or over an extended period of time may negatively impact healthy development or be associated with the emergence of diet-related health problems.

As previously mentioned, there were several challenges to applying the Nova classification system to 24-h dietary recall data. The most challenging food items included fruits (e.g., canned or packaged) and ready-to-eat foods (e.g., frozen dinners). For example, it was often difficult to ascertain from the NDSR dietary data if fruits should be considered processed (e.g., canned with added sugars) or ultraprocessed (e.g., canned with high-fructose corn syrup or nonnutritive sweeteners). Similarly, ready-to-eat foods were coded as ultraprocessed based on the Nova classification system guidelines [1, 4]. To minimize the potential for misclassification, more accurate descriptions related to the degree of processing for foods (e.g., brand or store-brand names, preparation of mixed dishes that describe use of homemade or ready-to-eat ingredients) should be added to 24-h diet recall software. This would have to be applied prospectively, in lieu of retrospectively assigning ambiguously described foods to a Nova category. Another challenge routinely encountered by coders was the variety of foods included in the UPFs category. For example, a packaged dessert and a granola bar were both categorized as ultraprocessed. Although this variation in foods categorized as ultraprocessed is appropriate based on the Nova classification guidelines [1], it does not align with an intuitive concept of what makes a “healthy” vs. “unhealthy” food.

Another challenge was the categorization of fruits or grains eaten at fast-food restaurants, for which the degree of processing of these healthier foods was initially unclear. Previous researchers using Nova classified foods from fast-food outlets as “ultraprocessed” unless they were described as “milk, egg-based preparations, or salads without dressing” [4]. However, in the GROW dataset it was unclear how to handle instances where fruits (e.g., apple slices) and grains (e.g., white rice) were reported from fast-food restaurants. In these instances, coders were asked to review the nutrition facts label on the respective fast-food website to determine the degree of processing. If items did not meet criteria for a UPF (e.g., lacked food additives), they were coded according to their respective Nova category (e.g., white rice coded as minimally processed) (see Supplementary Figure 1).
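The fast-food rule described above can be written as a small decision function. This is only a sketch: the `FastFoodItem` fields and the function name are hypothetical, the exception in [4] for milk, egg-based preparations, and undressed salads is not modeled, and defaulting to ultraprocessed when no label could be reviewed is an assumption rather than something the study states.

```python
from dataclasses import dataclass
from typing import Optional

ULTRAPROCESSED = "ultraprocessed"


@dataclass
class FastFoodItem:
    description: str
    is_fruit_or_grain: bool
    # True/False after reviewing the restaurant's nutrition information;
    # None if no label could be reviewed (hypothetical field).
    has_additives: Optional[bool]
    # Nova category the food would receive outside the fast-food context.
    intrinsic_category: str


def code_fast_food_item(item: FastFoodItem) -> str:
    """Assign a Nova category to a food reported from a fast-food restaurant."""
    # Fruits and grains whose label shows no additives keep their own category.
    if item.is_fruit_or_grain and item.has_additives is False:
        return item.intrinsic_category
    # Everything else follows the default fast-food rule (assumption for None).
    return ULTRAPROCESSED


rice = FastFoodItem("white rice", True, False, "minimally processed")
nuggets = FastFoodItem("chicken nuggets", False, True, "ultraprocessed")
print(code_fast_food_item(rice))
print(code_fast_food_item(nuggets))
```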

This study had several limitations. Only foods reportedly consumed by the children enrolled in the clinical trial were categorized (3099 individual foods). This represents approximately 10% of food items in the NDSR database, and it is difficult to determine whether this process would have yielded the same high reliability and validity for a broader range of food items. However, these represented all of the foods reported by parents of preschool-age children across 3 y. Because the dietary data analyzed in this study were collected using NDSR software, this exact method for assigning the Nova classification would not directly apply to other dietary collection methods. The limited or ambiguous food details provided by the NDSR software made it difficult to identify the degree of processing for some food items. In one case, this made it impossible to assign a Nova category to a food. We were also unable to determine the composition of ingredients used to make mixed dishes (e.g., homemade vs. ready-made). This may have resulted in the misclassification of calories, leading us to potentially under- or overestimate the caloric contributions of some Nova categories. However, the percentage of UPFs consumed by participants in our study is similar to that of a recent study of 2–5-y-olds (62% vs. 61.1%, respectively) [11]. Lastly, NDSR allows for foods to contain a negative value when there is an adjustment being made to match food weight and/or nutrient content. Therefore, we had a small number of food entries with negative caloric values that were excluded from analysis, which may have resulted in slightly higher estimates for these entries.
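To make the handling of negative entries concrete, the following sketch computes each Nova category's share of daily calories for one recall day after dropping entries with negative caloric values. The function name, the tuple layout, and the example numbers are hypothetical and only illustrate the exclusion described above.

```python
from collections import defaultdict


def calorie_share_by_category(entries):
    """Percent of a recall day's calories contributed by each Nova category.

    `entries` is an iterable of (nova_category, kcal) pairs for one recall day.
    Entries with negative kcal (NDSR adjustment rows) are excluded.
    """
    totals = defaultdict(float)
    for category, kcal in entries:
        if kcal < 0:
            continue  # drop NDSR adjustment entries with negative calories
        totals[category] += kcal
    day_total = sum(totals.values())
    if day_total == 0:
        return {}  # no usable calories reported for this day
    return {category: 100.0 * kcal / day_total for category, kcal in totals.items()}


# Hypothetical recall day (values are illustrative, not study data).
day = [
    ("ultraprocessed", 600.0),
    ("minimally processed", 300.0),
    ("processed", 100.0),
    ("ultraprocessed", -50.0),  # adjustment entry, excluded
]
shares = calorie_share_by_category(day)
print(shares)
```

Because the negative entry is removed rather than netted against its category, the remaining positive entries carry the full total for that day.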

In conclusion, this method of coding and training healthcare professionals to double-code individual food items into a Nova category was a reliable and valid approach for identifying individual consumption patterns of levels of food processing, including UPFs, from 24-h diet recall data collected with NDSR software. Future studies should focus on expanding this approach to the full list of foods in NDSR or attempt to apply it to other dietary recall programs to expand the availability of this information in an analyzable format. This would facilitate further examination of the connection between UPF consumption and a variety of outcomes.

Author contribution

The authors’ responsibilities were as follows – WJH, ECS, and SLB: contributed to the study conception and design; NMS, SU, ECS, LRS, and WJH: material preparation and data collection; NMS, SU, ECS, LRS, KPT, DM, TEN, SLB, and WJH: data analysis and interpretation; NMS and WJH: wrote the paper with contributions from ECS and LRS; NMS, ECS, LRS, KPT, DM, TEN, SLB, and WJH: critically revised the manuscript; NMS: primary responsibility for the final content; and all authors: read and approved the final manuscript.

Conflict of interest

The authors report no conflicts of interest.

Data availability

Data described in the manuscript, code book, and analytic code will be made available upon request pending study team approval.

Funding

This work was supported by funding from the National Heart, Lung, and Blood Institute (1R03HL154243 and 1U01HL103620). NMS was supported by a T32 training grant through the Agency for Healthcare Research and Quality (1T32HS026122) and the Vanderbilt University School of Nursing. Data were collected and stored using REDCap, supported by grant #UL1 TR000445 from the National Center for Advancing Translational Sciences at the National Institutes of Health. This content is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute, the Agency for Healthcare Research and Quality, or Vanderbilt University.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ajcnut.2022.10.016.

References

  • 1. Monteiro C.A., Cannon G., Levy R.B., Moubarac J.C., Louzada M.L., Rauber F., et al. Ultra-processed foods: what they are and how to identify them. Public Health Nutr. 2019;22(5):936–941. doi: 10.1017/S1368980018003762.
  • 2. Martínez Steele E., Popkin B.M., Swinburn B., Monteiro C.A. The share of ultra-processed foods and the overall nutritional quality of diets in the US: evidence from a nationally representative cross-sectional study. Popul Health Metr. 2017;15(1):6. doi: 10.1186/s12963-017-0119-3.
  • 3. da Costa Louzada M.L., Martins A.P.B., Canella D.S., Baraldi L.G., Levy R.B., Claro R.M., et al. Ultra-processed foods and the nutritional dietary profile in Brazil. Rev Saúde Publ. 2015;49:38. doi: 10.1590/S0034-8910.2015049006132.
  • 4. Moubarac J.C., Batal M., Louzada M.L., Martinez Steele E.M., Monteiro C.A. Consumption of ultra-processed foods predicts diet quality in Canada. Appetite. 2017;108:512–520. doi: 10.1016/j.appet.2016.11.006.
  • 5. Cediel G., Reyes M., Corvalán C., Levy R.B., Uauy R., Monteiro C.A. Ultra-processed foods drive to unhealthy diets: evidence from Chile. Public Health Nutr. 2021;24(7):1698–1707. doi: 10.1017/S1368980019004737.
  • 6. Parra D.C., da Costa-Louzada M.L., Moubarac J.C., Bertazzi-Levy R., Khandpur N., Cediel G., et al. Association between ultra-processed food consumption and the nutrient profile of the Colombian diet in 2005. Salud Publica Mex. 2019;61(2):147–154. doi: 10.21149/9038.
  • 7. Marrón-Ponce J.A., Flores M., Cediel G., Monteiro C.A., Batis C. Associations between consumption of ultra-processed foods and intake of nutrients related to chronic non-communicable diseases in Mexico. J Acad Nutr Diet. 2019;119(11):1852–1865. doi: 10.1016/j.jand.2019.04.020.
  • 8. Machado P.P., Steele E.M., Levy R.B., Sui Z., Rangan A., Woods J., et al. Ultra-processed foods and recommended intake levels of nutrients linked to non-communicable diseases in Australia: evidence from a nationally representative cross-sectional study. BMJ Open. 2019;9(8). doi: 10.1136/bmjopen-2019-029544.
  • 9. Rauber F., da Costa Louzada M., Steele E.M., Millett C., Monteiro C.A., Levy R.B. Ultra-processed food consumption and chronic non-communicable diseases-related dietary nutrient profile in the UK (2008–2014). Nutrients. 2018;10(5):587. doi: 10.3390/nu10050587.
  • 10. Martínez Steele E., Baraldi L.G., Louzada M.L., Moubarac J.C., Mozaffarian D., Monteiro C.A. Ultra-processed foods and added sugars in the US diet: evidence from a nationally representative cross-sectional study. BMJ Open. 2016;6(3). doi: 10.1136/bmjopen-2015-009892.
  • 11. Wang L., Martínez Steele E., Du M., Pomeranz J.L., O’Connor L.E., Herrick K.A., et al. Trends in consumption of ultraprocessed foods among US Youths aged 2–19 years, 1999–2018. JAMA. 2021;326(6):519–530. doi: 10.1001/jama.2021.10238.
  • 12. Baraldi L.G., Martinez Steele E.M., Canella D.S., Monteiro C.A. Consumption of ultra-processed foods and associated sociodemographic factors in the USA between 2007 and 2012: evidence from a nationally representative cross-sectional study. BMJ Open. 2018;8(3). doi: 10.1136/bmjopen-2017-020574.
  • 13. Backholer K., Gupta A., Zorbas C., Bennett R., Huse O., Chung A., et al. Differential exposure to, and potential impact of, unhealthy advertising to children by socio-economic and ethnic groups: a systematic review of the evidence. Obes Rev. 2021;22(3). doi: 10.1111/obr.13144.
  • 14. Moran A.J., Khandpur N., Polacsek M., Rimm E.B. What factors influence ultra-processed food purchases and consumption in households with children? a comparison between participants and non-participants in the Supplemental Nutrition Assistance Program (SNAP). Appetite. 2019;134:1–8. doi: 10.1016/j.appet.2018.12.009.
  • 15. Kern D.M., Auchincloss A.H., Stehr M.F., Roux A.V.D., Moore L.V., Kanter G.P., et al. Neighborhood prices of healthier and unhealthier foods and associations with diet quality: evidence from the Multi-Ethnic Study of Atherosclerosis. Int J Environ Res Public Health. 2017;14(11). doi: 10.3390/ijerph14111394.
  • 16. US Burden of Disease Collaborators. Mokdad A.H., Ballestros K., Echko M., Glenn S., Olsen H.E., et al. The state of US health, 1990–2016: burden of diseases, injuries, and risk factors among US states. JAMA. 2018;319(14):1444–1472. doi: 10.1001/jama.2018.0158.
  • 17. Satia J.A. Diet-related disparities: understanding the problem and accelerating solutions. J Am Diet Assoc. 2009;109(4):610–615. doi: 10.1016/j.jada.2008.12.019.
  • 18. Monteiro C.A., Cannon G., Moubarac J.C., Levy R.B., Louzada M.L.C., Jaime P.C. The UN decade of nutrition, the NOVA food classification and the trouble with ultra-processing. Public Health Nutr. 2018;21(1):5–17. doi: 10.1017/S1368980017000234.
  • 19. Bleiweiss-Sande R., Chui K., Evans E.W., Goldberg J., Amin S., Sacheck J. Robustness of food processing classification systems. Nutrients. 2019;11(6). doi: 10.3390/nu11061344.
  • 20. Gibney M.J. Ultra-processed foods: definitions and policy issues. Curr Dev Nutr. 2018;3(2):nzy077. doi: 10.1093/cdn/nzy077.
  • 21. Adams J., White M. Characterisation of UK diets according to degree of food processing and associations with socio-demographics and obesity: cross-sectional analysis of UK National Diet and Nutrition Survey (2008–12). Int J Behav Nutr Phys Act. 2015;12:160. doi: 10.1186/s12966-015-0317-y.
  • 22. Marino M., Puppo F., Del Bo’ C., Vinelli V., Riso P., Porrini M., et al. A systematic review of worldwide consumption of ultra-processed foods: findings and criticisms. Nutrients. 2021;13(8):2778. doi: 10.3390/nu13082778.
  • 23. Monteiro C.A., Levy R.B., Claro R.M., de Castro I.R., Cannon G. Increasing consumption of ultra-processed foods and likely impact on human health: evidence from Brazil. Public Health Nutr. 2011;14(1):5–13. doi: 10.1017/S1368980010003241.
  • 24. Poti J.M., Mendez M.A., Ng S.W., Popkin B.M. Is the degree of food processing and convenience linked with the nutritional quality of foods purchased by US households? Am J Clin Nutr. 2015;101(6):1251–1262. doi: 10.3945/ajcn.114.100925.
  • 25. Moubarac J.C., Martins A.P., Claro R.M., Levy R.B., Cannon G., Monteiro C.A. Consumption of ultra-processed foods and likely impact on human health. Evidence from Canada. Public Health Nutr. 2013;16(12):2240–2248. doi: 10.1017/S1368980012005009.
  • 26. Harnack L. Nutrition data system for research (NDSR). In: Gellman M.D., Turner J.R., editors. Encyclopedia of Behavioral Medicine. Springer; New York, NY: 2013. pp. 1348–1350.
  • 27. Barkin S.L., Heerman W.J., Sommer E.C., Martin N.C., Buchowski M.S., Schlundt D., et al. Effect of a behavioral intervention for underserved preschool-age children on change in body mass index: a randomized clinical trial. JAMA. 2018;320(5):450–460. doi: 10.1001/jama.2018.9128.
  • 28. Pratt C.A., Boyington J., Esposito L., Pemberton V.L., Bonds D., Kelley M., et al. Childhood Obesity Prevention and Treatment Research (COPTR): interventions addressing multiple influences in childhood and adolescent obesity. Contemp Clin Trials. 2013;36(2):406–413. doi: 10.1016/j.cct.2013.08.010.
  • 29. Kuczmarski R.J., Ogden C.L., Guo S.S., Grummer-Strawn L.M., Flegal K.M., Mei Z., et al. CDC growth charts for the United States: methods and development. Vital and health statistics Series 11, Data from the National Health Survey. 2000;246:1–190. 2002.
  • 30. Hales C.M., Fryar C.D., Carroll M.D., Freedman D.S., Aoki Y., Ogden C.L. Differences in Obesity prevalence by demographic characteristics and urbanization level among adults in the United States, 2013–2016. JAMA. 2018;319(23):2419–2429. doi: 10.1001/jama.2018.7270.
  • 31. Monteiro C.A., Levy R.B., Claro R.M., Castro I.R., Cannon G. A new classification of foods based on the extent and purpose of their processing. Cad Saude Publica. 2010;26(11):2039–2049. doi: 10.1590/s0102-311x2010001100005.
  • 32. McHugh M.L. Interrater reliability: the kappa statistic. Biochem Med (Zagreb). 2012;22(3):276–282.
  • 33. Guenther P.M., Casavale K.O., Reedy J., Kirkpatrick S.I., Hiza H.A., Kuczynski K.J., et al. Update of the healthy eating index: HEI-2010. J Acad Nutr Diet. 2013;113(4):569–580. doi: 10.1016/j.jand.2012.12.016.
  • 34. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2021.
  • 35. Krebs-Smith S.M., Pannucci T.E., Subar A.F., Kirkpatrick S.I., Lerman J.L., Tooze J.A., et al. Update of the healthy eating index: HEI-2015. J Acad Nutr Diet. 2018;118(9):1591–1602. doi: 10.1016/j.jand.2018.05.021.
  • 36. Elizabeth L., Machado P., Zinöcker M., Baker P., Lawrence M. Ultra-Processed foods and health outcomes: a narrative review. Nutrients. 2020;12(7). doi: 10.3390/nu12071955.
  • 37. Chang K., Khandpur N., Neri D., Touvier M., Huybrechts I., Millett C., et al. Association between childhood consumption of ultraprocessed food and adiposity trajectories in the avon longitudinal study of parents and children birth cohort. JAMA Pediatr. 2021;175(9). doi: 10.1001/jamapediatrics.2021.1573.
  • 38. Pagliai G., Dinu M., Madarena M.P., Bonaccio M., Iacoviello L., Sofi F. Consumption of ultra-processed foods and health status: a systematic review and meta-analysis. Br J Nutr. 2021;125(3):308–318. doi: 10.1017/S0007114520002688.

Articles from The American Journal of Clinical Nutrition are provided here courtesy of American Society for Nutrition
