American Journal of Public Health. 2017 Jun;107(6):916–921. doi: 10.2105/AJPH.2017.303815

Overview of Asian American Data Collection, Release, and Analysis: National Health and Nutrition Examination Survey 2011–2018

Ryne Paulose-Ram 1, Vicki Burt 1, Lisa Broitman 1, Namanjeet Ahluwalia 1
PMCID: PMC5425904  NIHMSID: NIHMS1004581  PMID: 28426300

Abstract

Data System. The National Health and Nutrition Examination Survey (NHANES), conducted by the National Center for Health Statistics, is a cross-sectional survey on the health and nutritional status of US adults and children.

Data Collection/Processing. A complex, multistage probability design is used to select a sample representative of the US civilian, noninstitutionalized population. NHANES includes in-home interviews, physical examinations, and biospecimen collection. About 5000 persons are examined annually. Since 2011, NHANES has been oversampling Asian Americans in addition to traditionally oversampled groups, including Hispanics and non-Hispanic Blacks.

Data Analysis/Dissemination. Data are publicly released online in 2-year cycles. Some data, because of disclosure risk, are available only through the Research Data Center. Data users should read documentation, examine sample sizes and response rates, and account for the complex survey design. With publicly released data, only analyses of Asians as a single group are possible; some Asian subgroup analyses may be conducted through the Research Data Center.

Public Health Implications. Oversampling Asians in NHANES 2011–2018 allows national estimates of health conditions, nutrition, and risk factors of public health importance to be computed for Asian Americans, a growing subpopulation of the United States.


Asian Americans are among the fastest growing populations in the United States. As determined by the 2015 American Community Survey, 6.4% of the total US population, that is 19.2 million people, reported being of Asian race (alone or in combination with another race).1 By the year 2060, the total Asian population is projected to increase to 48.6 million, which is more than a 150% increase; by comparison, there is a projected increase of about 30% in the total US population.2

The US Office of Management and Budget provides federal agencies with standards for “record keeping, collection, and presentation of data on race and ethnicity.”3 In these guidelines, Asian is defined as “a person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent.” This area includes China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.

The National Health and Nutrition Examination Survey (NHANES) has been collecting data to monitor the health and nutritional status of Americans since the early 1970s. Over its history, NHANES has oversampled certain segments of the population to obtain stable estimates on these subgroups. Starting in 2011, NHANES began oversampling Asian Americans. We provide an overview of NHANES 2011–2018—its sample design, data collection and release, and analytic issues—with a focus on changes consequent to the oversampling of Asian Americans.

DATA SYSTEM

NHANES is conducted by the Division of Health and Nutrition Examination Surveys of the National Center for Health Statistics (NCHS).

The NHANES program began in the early 1970s and was conducted as a series of periodic surveys. In 1999, the survey became continuous, collecting data annually on a nationally representative sample of the US civilian, noninstitutionalized population.

Purpose

The main purpose of NHANES is to assess the health and nutritional status of adults and children in the United States through its unique combination of personal interviews, standardized physical examinations, and laboratory measurements. The specific goals are to (1) provide prevalence data on selected diseases and risk factors for the US population; (2) monitor trends in selected diseases, behaviors, and environmental exposures; (3) explore emerging public health needs; and (4) maintain a national probability sample of baseline information on health and nutritional status.

Public Health Significance

NHANES data are used by research organizations, government agencies, universities, health care providers, and educators to help develop public health policy, direct and design health programs and services, and expand the health knowledge for the nation.

The inclusion of the oversample of Asian Americans in NHANES 2011–2014 and 2015–2018 allows the first national estimates for Asian Americans on numerous health conditions and risk factors of public health significance that are unavailable through other surveys. This includes national estimates on obesity, diagnosed and undiagnosed hypertension, undiagnosed diabetes, cholesterol levels (total, high- and low-density lipoproteins), infectious diseases (e.g., hepatitis virus and human papillomavirus), nutritional status, and numerous other conditions and risk factors.

DATA COLLECTION/PROCESSING

NHANES combines personal interviews with physical examinations and laboratory testing. Specific questions, examinations, and laboratory tests included in each survey cycle are summarized in the NHANES Survey Content Brochure.4

Trained interviewers conduct interviews in the participant’s home, using a computer-assisted personal interview system, and collect demographic, socioeconomic, dietary, and health-related information. Participants are scheduled to visit a mobile examination center (MEC) about 2 weeks later and are randomly assigned to a morning, afternoon, or evening examination; persons aged 12 years and older who are assigned to a morning examination are asked to fast for at least 9 hours.5

The MEC travels to locations throughout the country offering a controlled environment where measurements can be conducted under standardized conditions. Staff includes a physician, dentist, phlebotomist, medical and health technicians, and dietary and health interviewers—many of whom are bilingual (English–Spanish). At the MEC, participants receive additional interviews, physical examinations, and biological specimen collection. Examinations include measurements, such as blood pressure and anthropometry, and a dental examination. Private interviews on an automated computer-assisted self-interviewing system allow participants to enter responses to sensitive questions, such as illicit drug use or sexual behavior, in privacy.

Biospecimen collection includes blood and urine collection, mouth rinses, and self-administered genital swab collection. Specimens undergo a variety of laboratory tests, including tests for exposure to toxic substances and pollutants, measures of nutritional status, hormones, and infectious diseases. Blood and urine may also be stored, if participants consent to allow their specimens to be used for future research. The collection, processing, storage, and shipping of blood, urine, and other types of biospecimens, as well as some specimen analysis, all occur at the MEC.6 Eligibility for specific questionnaires, examinations, and laboratory tests is determined by the participant’s age and gender (and time of examination, i.e., AM or PM, for certain tests). Procedure and protocol manuals are available on the NHANES Web site.7,8

Additional Data-Collection Processes for Asian Oversample

Starting in 2011, staff fluent in English and an Asian language were recruited and hired, when possible. Staff participate in cultural competency training to help recognize and respect cultural differences. Many of the NHANES materials are translated into Mandarin Chinese (traditional and simplified), Korean, and Vietnamese following a process similar to the Spanish translations for NHANES (available on request). Translated materials include advance, endorsement, and reminder letters, brochures on confidentiality and sample participant overview, consent forms, safety questions, examination center and postexamination instructions, examination scripts, and hand cards used for the questionnaires administered in the household and the MEC. Anything the participant completes on his or her own, such as self-administered specimen collections and the automated computer-assisted self-interview, is also translated and recorded in Asian languages.

The questionnaire administered in the household, because of its complexity, is not translated into any Asian language. Instead, local interpreters are hired to assist in translating the questionnaire and interpreting responses. The interpreter is provided with a glossary that translates medical terminology and any other term that might be unfamiliar or difficult to explain. Interpreters read translated examination scripts verbatim. Local Asian organizations may help identify interpreters. Family members or friends of the participant may also serve as interpreters. A professional medical interpretation telephone service is available when necessary.5

Local organizations and community leaders assist, when possible, with endorsements, media coverage, and mentions in newsletters and meetings. They are critical to helping inform the public on NHANES objectives and the importance of participating.

Ethical Procedures

NHANES procedures and protocols are reviewed and approved annually by the NCHS Research Ethics Review Board to protect the rights and welfare of participants and ensure compliance with the Department of Health and Human Services’ Policy for Protection of Human Research Subjects (45 CFR part 46).9

Written, informed consent is obtained for all participants for the interview and examination. Documented signed consent is obtained from participants who have reached the age of maturity in their state (usually 18 years). A parent or guardian gives permission for minors. Children aged 7 to 17 years also provide documented assent. Interpreters assist participants who cannot speak or read English or Spanish.5

Population and Geographic Coverage

NHANES is a nationally representative, cross-sectional survey of the civilian, noninstitutionalized population residing in the 50 states and the District of Columbia. As a result, it specifically excludes persons in supervised care or custody in institutional settings, active-duty military personnel, and any other US citizens residing outside the 50 states and the District of Columbia.10

Sample design specifications.

Details of the NHANES 2011–2014 sample design are described elsewhere.10 Since 1999, the target annual examined sample size for NHANES has been 5000 persons. Sample persons are located in counties across the country. Annually, 15 out of about 3100 US counties are visited.

For the two 4-year NHANES sample design periods, 2011–2014 and 2015–2018, there were 87 domains for which specified reliability was desired. These included sex–age groups for non-Hispanic Black, non-Hispanic non-Black Asian, and Hispanic persons and income–sex–age groups for the remainder of the US population. The rates required for sampling persons in these domains were designed to achieve a designated number of MEC examinations in each domain. The NHANES sample weights are adjusted for these different sampling rates, different response rates, and different coverage rates among persons in the sample. This allows accurate national estimates to be made from the sample.10

Oversampling.

To increase the reliability and precision of estimates for certain domains, oversampling is conducted. During NHANES 2011–2014 and 2015–2018, the following groups were oversampled:

  • Hispanic persons

  • Non-Hispanic Black persons

  • Non-Hispanic, non-Black Asian persons

  • Non-Hispanic White and other persons at or below 130% of the federal poverty level

  • Non-Hispanic White and other persons aged 80 years and older

Specifically, the Hispanic category included all persons who were reported to be of Hispanic ethnicity regardless of race; the non-Hispanic Black category included all persons who were reported to be of non-Hispanic Black race (single race or in combination with any other race including Asian); the non-Hispanic non-Black Asian category included all persons who were reported to be non-Hispanic non-Black Asian (single race or in combination with another race except Black). All other persons not falling into these categories were assigned to the non-Hispanic White and other category. Therefore, any Asian person who was also Hispanic or non-Hispanic Black was considered to be in the respective latter categories.
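The precedence among the race and Hispanic origin categories described above can be sketched in a few lines of Python. This is an illustrative sketch only, not NCHS code: the function name, argument names, and category labels are hypothetical, and the additional oversampling of low-income and older persons within the "White and other" category is not modeled.

```python
def sampling_domain(hispanic, races):
    """Assign a person to one race/Hispanic origin oversampling category.

    hispanic: True if the person is reported to be of Hispanic ethnicity.
    races: set of reported races, e.g. {"Asian"} or {"Black", "Asian"}.
    Order matters: Hispanic ethnicity is checked first, then Black
    (alone or in combination), then Asian (alone or with a non-Black race).
    """
    if hispanic:
        return "Hispanic"
    if "Black" in races:
        return "Non-Hispanic Black"
    if "Asian" in races:
        return "Non-Hispanic non-Black Asian"
    return "Non-Hispanic White and other"


# An Asian person who is also Hispanic or Black falls in the latter category.
print(sampling_domain(True, {"Asian"}))
print(sampling_domain(False, {"Black", "Asian"}))
print(sampling_domain(False, {"Asian", "White"}))
```

Checking the conditions in this fixed order is what makes the categories mutually exclusive.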

For these oversampled groups, the proportion of persons in the NHANES sample is larger than their corresponding proportion in the US population. Statistical weighting schemes allow estimates from these groups to be combined to obtain national estimates that ultimately reflect the actual proportions of these groups in the population as a whole. Details of NHANES 2011–2014 oversampling and weighting procedures are described elsewhere.10
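A toy example may help show why weighting restores population proportions after oversampling. The numbers below are invented for illustration and are not NHANES data; the weights are simple inverse selection probabilities, whereas actual NHANES weights also include nonresponse and coverage adjustments.

```python
# Hypothetical population of 1000: 60 in an oversampled group, 940 in all others.
# The oversampled group is selected with probability 0.5 (weight 2.0);
# everyone else is selected with probability 0.05 (weight 20.0).
sample = (
    [{"group": "oversampled", "weight": 2.0, "has_condition": i < 6} for i in range(30)]
    + [{"group": "other", "weight": 20.0, "has_condition": i < 20} for i in range(47)]
)


def weighted_mean(rows, value):
    """Weighted mean of value(row) using each row's sampling weight."""
    total_weight = sum(r["weight"] for r in rows)
    return sum(r["weight"] * value(r) for r in rows) / total_weight


def is_oversampled(row):
    return 1.0 if row["group"] == "oversampled" else 0.0


def has_condition(row):
    return 1.0 if row["has_condition"] else 0.0


# Share of the oversampled group: inflated in the raw sample, restored by weighting.
unweighted_share = sum(is_oversampled(r) for r in sample) / len(sample)
weighted_share = weighted_mean(sample, is_oversampled)

# Overall prevalence: the unweighted figure is pulled toward the oversampled group.
unweighted_prev = sum(has_condition(r) for r in sample) / len(sample)
weighted_prev = weighted_mean(sample, has_condition)

print(round(unweighted_share, 3), round(weighted_share, 3))
print(round(unweighted_prev, 3), round(weighted_prev, 3))
```

In this sketch the oversampled group makes up about 39% of the sample but 6% of the weighted total, matching its share of the hypothetical population.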

Unit of Data Collection and Sample Size

The unit of data collection and analysis is the person. In NHANES, persons who complete the in-home interviews are “interview respondents” and included in the interviewed sample. Those who complete the in-home interview and are further examined at the MEC are “MEC respondents” and included in the examined sample.

Interviewed and examined sample sizes and response rates for all survey cycles, overall and by age and gender, are provided on the NHANES Web site.11 Examined response rates overall were 69.5% in 2011–2012 and 68.5% in 2013–2014. On the basis of race and Hispanic origin collected at sample selection, persons in the non-Hispanic non-Black Asian category had the lowest examined response rates for both survey cycles (56.0% and 55.2%) compared with persons categorized as non-Hispanic Black (75.8% and 77.0%), Hispanic (76.8% and 76.2%), and non-Hispanic White and other races (67.1% and 64.0%). Table 1 provides interviewed and examined sample sizes for 2011–2012 and 2013–2014, overall and by race and Hispanic origin categories (as defined in the publicly released Demographic Data File with single race categories for non-Hispanic Black, Asian, and White persons).

TABLE 1. Unweighted Sample Sizes, Overall and by Race and Hispanic Origin: National Health and Nutrition Examination Survey, United States, 2011–2014

Publicly Released Race and Hispanic    Unweighted Interviewed    Unweighted Examined
Origin Variable (RIDRETH3)             Sample Size, No.          Sample Size, No.

2011–2012
  Total (a)                            9 756                     9 338
  Hispanic (b)                         2 431                     2 327
  Non-Hispanic Black, single race      2 683                     2 582
  Non-Hispanic Asian, single race      1 282                     1 215
  Non-Hispanic White, single race      2 973                     2 841
2013–2014
  Total (a)                            10 175                    9 813
  Hispanic (b)                         2 690                     2 615
  Non-Hispanic Black, single race      2 267                     2 198
  Non-Hispanic Asian, single race      1 074                     1 019
  Non-Hispanic White, single race      3 674                     3 538

(a) Includes the “other” race group not shown separately.
(b) Includes Mexican American and Other Hispanic persons.

Survey Design and Frequency of Data Collection

The NHANES uses a complex, multistage probability sampling design to select a sample representative of the US civilian, noninstitutionalized household population. Sample selection follows these stages: (1) selection of primary sampling units, which are counties or small groups of contiguous counties, (2) selection of segments within primary sampling units that constitute a block or group of blocks containing a cluster of households, (3) selection of specific households within segments, and (4) selection of persons within a household.10
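The four selection stages can be illustrated with a simplified sketch. The nested frame, the function name, and the use of equal-probability draws at every stage are assumptions made for brevity; the actual design uses unequal selection probabilities that depend on the sampling domains.

```python
import random


def multistage_sample(frame, n_psu, n_segments, n_households, n_persons, seed=0):
    """Select persons through four nested stages of simple random sampling.

    frame: nested dict of PSU -> segment -> household -> list of persons.
    Returns a list of (psu, segment, household, person) tuples.
    """
    rng = random.Random(seed)
    selected = []
    # Stage 1: primary sampling units (counties or groups of counties).
    for psu in rng.sample(sorted(frame), min(n_psu, len(frame))):
        segments = frame[psu]
        # Stage 2: segments (blocks or groups of blocks) within each PSU.
        for seg in rng.sample(sorted(segments), min(n_segments, len(segments))):
            households = segments[seg]
            # Stage 3: households within each segment.
            for hh in rng.sample(sorted(households), min(n_households, len(households))):
                persons = households[hh]
                # Stage 4: persons within each household.
                for person in rng.sample(persons, min(n_persons, len(persons))):
                    selected.append((psu, seg, hh, person))
    return selected


# Toy frame: 5 PSUs x 3 segments x 4 households x 3 persons.
frame = {
    f"psu{p}": {
        f"seg{s}": {
            f"hh{h}": [f"person{p}-{s}-{h}-{i}" for i in range(3)]
            for h in range(4)
        }
        for s in range(3)
    }
    for p in range(5)
}

sample = multistage_sample(frame, n_psu=2, n_segments=2, n_households=2, n_persons=1)
print(len(sample))  # 2 PSUs x 2 segments x 2 households x 1 person = 8
```

Because persons are reached only through their PSU, segment, and household, sampled persons are geographically clustered, which is why variance estimation must account for the design.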

Since 1999, NHANES has been in the field collecting data annually.

Key Data Elements and Data Quality/Editing

Data collection of race and Hispanic origin.

The race categories included in NHANES follow the guidelines provided by the US Office of Management and Budget.

During the household questionnaire, participants are asked several questions on race and Hispanic origin. First, they are asked to report if they consider themselves to be Hispanic or Latino and to provide the group that represents their Hispanic/Latino origin or ancestry. Participants are then asked to report which race they consider themselves to be, with the option to select 1 or more races. Those who select “Asian” are asked to further report which group represents their Asian origin or ancestry, with the option to select more than 1 category. Hand cards with a list of options are available to assist in the response. The specific questions are available in the Sample Person Demographic Questionnaire.12

Editing and public release of race and Hispanic origin data.

To reflect the change in the 2011–2014 sample design, an additional race or Hispanic origin variable, RIDRETH3, was included on the 2011–2012 and 2013–2014 public use Demographic Data File.13 It was derived from responses to the household questions on race and Hispanic origin.

Among respondents who reported being of Hispanic ethnicity, those who self-identified as “Mexican American” were coded as such regardless of their other race/ethnicity identities; those who self-identified as another “Hispanic” ethnicity were coded as “other Hispanic.” Non-Hispanic participants were then categorized on the basis of their self-reported races: non-Hispanic White, non-Hispanic Black, non-Hispanic Asian, and other non-Hispanic races (which include non-Hispanic persons reporting multiple races).

This variable is consistent with the race or Hispanic origin variable, RIDRETH1, which is on previous data releases, because the Mexican American and other Hispanic categories may include persons of multiple races and because non-Hispanic White, non-Hispanic Black, and non-Hispanic Asian categories include only those reporting a single race.
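The coding rules described for RIDRETH3 can be sketched as follows. The function name, argument conventions, and output labels are hypothetical; the numeric codes used in the released data file are documented on the NHANES Web site and are deliberately not reproduced here.

```python
def code_race_hispanic_origin(hispanic_origin, races):
    """Derive a public-release style race/Hispanic origin category.

    hispanic_origin: None for non-Hispanic persons, otherwise the
        self-identified Hispanic group (e.g. "Mexican American").
    races: set of self-reported races.
    """
    if hispanic_origin is not None:
        # Hispanic persons are coded by origin regardless of race.
        if hispanic_origin == "Mexican American":
            return "Mexican American"
        return "Other Hispanic"
    if len(races) == 1:
        (race,) = races
        if race in {"White", "Black", "Asian"}:
            return "Non-Hispanic " + race
    # Any other single race, or more than one race.
    return "Other non-Hispanic race, including multiracial"


print(code_race_hispanic_origin("Mexican American", {"White", "Asian"}))
print(code_race_hispanic_origin(None, {"Asian"}))
print(code_race_hispanic_origin(None, {"Black", "Asian"}))
```

Note the contrast with the sample-selection categories: a non-Hispanic person reporting both Black and Asian race is sampled in the non-Hispanic Black category but falls in the multiracial "other" category under these public-release rules.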

The public release of a more detailed race and Hispanic origin variable required modification to numerous other variables released in earlier survey cycles to maintain participant confidentiality (e.g., respondent’s age and country of birth). Refer to the NHANES Web site for additional detail on specific variables.

DATA ANALYSIS/DISSEMINATION

The recommended approach for analysis of NHANES data is design-based analysis. Design-based analytic procedures explicitly take into account features of the survey design such as differential selection probabilities and geographic clustering. Survey sample weights should be used, and the complex survey design must be accounted for in the estimation of variance. The weights account for oversampling and survey nonresponse, and their proper use ensures that calculated estimates are truly representative of the US civilian noninstitutionalized population.
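As a rough illustration of what design-based estimation involves, the sketch below computes a weighted mean and a Taylor-linearized standard error that treats primary sampling units (PSUs) as sampled with replacement within strata. The field names (`weight`, `stratum`, `psu`, `y`) and the toy data are hypothetical; in practice, analysts would normally rely on established survey software and the design variables documented for each data file.

```python
from collections import defaultdict
from math import sqrt


def design_based_mean(rows):
    """Weighted mean of y with a linearized, stratified-cluster standard error."""
    total_weight = sum(r["weight"] for r in rows)
    estimate = sum(r["weight"] * r["y"] for r in rows) / total_weight

    # Sum the linearized values within each PSU, separately by stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        z = r["weight"] * (r["y"] - estimate) / total_weight
        psu_totals[r["stratum"]][r["psu"]] += z

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for totals in psu_totals.values():
        n_psu = len(totals)
        if n_psu < 2:
            raise ValueError("each stratum needs at least 2 PSUs")
        mean_total = sum(totals.values()) / n_psu
        variance += n_psu / (n_psu - 1) * sum(
            (t - mean_total) ** 2 for t in totals.values()
        )
    return estimate, sqrt(variance)


# Toy data: 2 strata with 2 PSUs each; y = 1 if the person has the condition.
rows = [
    {"stratum": 1, "psu": 1, "weight": 10.0, "y": 1},
    {"stratum": 1, "psu": 1, "weight": 10.0, "y": 0},
    {"stratum": 1, "psu": 2, "weight": 20.0, "y": 1},
    {"stratum": 1, "psu": 2, "weight": 20.0, "y": 1},
    {"stratum": 2, "psu": 1, "weight": 5.0, "y": 0},
    {"stratum": 2, "psu": 1, "weight": 5.0, "y": 0},
    {"stratum": 2, "psu": 2, "weight": 15.0, "y": 1},
    {"stratum": 2, "psu": 2, "weight": 15.0, "y": 0},
]

estimate, standard_error = design_based_mean(rows)
print(round(estimate, 3), round(standard_error, 3))
```

The variance here comes from differences between PSU totals rather than between individuals, which is how geographic clustering enters the standard error.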

Data users, before any analysis, should read the NHANES Analytic Guidelines14 and relevant documentation on the survey overall and specific data files to be used in their analysis. Data file documentation is available via the link next to the respective data file on the NHANES Web site.

An additional resource for all analysts is the NHANES tutorial, which is a Web-based product designed to assist users in understanding and analyzing NHANES data.15

Analysis of Asian Race Category

Because of NHANES operational and sample design constraints, the oversampling of Asians as a single group is all that is possible. Therefore, analysis is also limited to this group when using the publicly released data files.

Information on Asian subgroups was, however, collected during the household interview. These data are not publicly released because of small sample sizes and disclosure concerns but are available via the NCHS Research Data Center,16 thereby allowing some subgroup analyses with multiple years of data. However, it is important to note that the weighted distribution of the sample across the 4 major race or Hispanic origin categories is controlled to the US distribution of these groups, and not to the distribution of the individual subgroups within a race or Hispanic origin group, such as Asians.

Analysts should also note low response rates and limited sample sizes for the Asian group overall. Detailed 2- and 3-way analytic comparisons of demographic subgroups, therefore, may not meet all analytic criteria outlined in the NHANES Analytic Guidelines.14 Data users should review sample sizes before performing analyses by race or Hispanic origin groups. With small sample sizes, analysts may need to combine multiple survey cycles that oversample Asians. For example, because of the low prevalence of hepatitis C among Asian American adults, small sample sizes required combining NHANES 2011–2012 and 2013–2014 to obtain statistically reliable estimates of this infection.17 Further examination by birth status, however, was not possible even with 4 years of data.
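When cycles are combined, the sample weights need to be rescaled so that the pooled file still represents one population rather than several. One commonly used convention, which should be checked against the NHANES Analytic Guidelines before use, is to divide each 2-year weight by the number of pooled cycles. The sketch below assumes that convention; the function and field names are hypothetical and the data are invented.

```python
def pool_cycles(cycles):
    """Stack 2-year cycles and rescale weights by the number of cycles.

    cycles: dict mapping a cycle label to a list of rows, each with a
        2-year sampling weight under the key "weight".
    """
    n_cycles = len(cycles)
    pooled = []
    for label, rows in cycles.items():
        for row in rows:
            pooled.append(
                {**row, "cycle": label, "pooled_weight": row["weight"] / n_cycles}
            )
    return pooled


# Toy cycles whose weights sum to 300 and 320, respectively.
cycles = {
    "2011-2012": [{"weight": 100.0}, {"weight": 200.0}],
    "2013-2014": [{"weight": 150.0}, {"weight": 170.0}],
}
pooled = pool_cycles(cycles)
pooled_total = sum(r["pooled_weight"] for r in pooled)
print(len(pooled), pooled_total)
```

Under this convention the pooled weights sum to the average of the per-cycle totals, so estimates refer to the average population over the combined period.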

Population Count

To understand the public health impact of a condition, it is helpful to calculate population counts in addition to the percentage of the population with a health condition. For computing population counts from NHANES 1999–2010, totals from the Current Population Survey for each survey cycle are available by race or Hispanic origin, gender, and age.11 For NHANES 2011–2014, totals are from the 2011 American Community Survey, which has a larger sample size and resulted in more reliable control totals for the number of Asians within age and gender categories.
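A population count can be obtained either by multiplying an estimated prevalence by the matching population total or, equivalently in concept, by summing the sample weights of participants with the condition. The short sketch below shows both; the function names, the population total of 1 000 000, and the toy rows are hypothetical, and the prevalence of 24.9% is used only as a convenient input.

```python
def count_from_prevalence(prevalence, population_total):
    """Estimated number of persons with a condition in a population."""
    return prevalence * population_total


def count_from_weights(rows):
    """Sum of sampling weights over participants who have the condition."""
    return sum(r["weight"] for r in rows if r["has_condition"])


# Hypothetical population total for the group of interest.
estimated_count = count_from_prevalence(0.249, 1_000_000)
print(round(estimated_count))

# Toy sample: each weight is the number of persons the participant represents.
rows = [
    {"weight": 1200.0, "has_condition": True},
    {"weight": 800.0, "has_condition": False},
    {"weight": 1500.0, "has_condition": True},
]
print(count_from_weights(rows))
```

The two approaches agree when the weights sum to the population total used in the first calculation.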

Linkage Ability

NCHS has a record linkage program that links NHANES data files with the National Death Index, the US Renal Data System, and administrative data from the Centers for Medicare and Medicaid Services, the Social Security Administration, and the Department of Housing and Urban Development.18 These linked files allow researchers to examine factors that may influence disability, chronic disease, health care utilization, morbidity, and mortality. NHANES data may also be linked to restricted use variables, such as geocoded data, and occupation or industry data through the Research Data Center.

Data Release and Accessibility

Each single year of NHANES and any combination of consecutive years comprises a nationally representative sample of the population. However, because NHANES goes to only a small number of primary sampling units each year, estimates for single-year data are relatively unstable (i.e., have large variance estimates). In addition, releasing only 1 year of data increases the possibility of disclosure of a participant’s identity. As a result, data are publicly released in 2-year cycles. In general, any 2-year cycle can be pooled with adjacent 2-year cycles to create analytic data files on the basis of 4 or more data years to produce estimates with greater precision and smaller sampling error.

The data release and access policy of the Division of Health and Nutrition Examination Surveys is consistent with the NCHS policy of “making high quality data available, as widely as practicable, as soon as possible after data collection, and in as much detail as possible while maintaining survey participant confidentiality.”19

Because of the voluminous nature of NHANES and the large amount of post–data-collection processing, release of all data from 2 years of data collection does not occur at 1 point in time. The first major wave of data release from a 2-year survey cycle occurs about 8 months after the end of all data collection for that 2-year cycle. An additional large wave of data release occurs a few months after. Then additional releases occur until all releasable data are made available to the public.

There are numerous mechanisms for data release and access. For data users and researchers worldwide, survey data are available on the Internet via the NHANES Web site and on request via CD-ROMs. Data not released publicly because of confidentiality or disclosure risk (e.g., geocoded data, occupation or industry data, exact date of interview or examination, detailed race or Hispanic origin) may be accessible via the Research Data Center through a proposal process, subject to availability and approval.16 A comprehensive list of NHANES data available through the Research Data Center as of this publication is available at https://wwwn.cdc.gov/nchs/nhanes/Search/DataPage.aspx?Component=Non-Public. Information from NHANES is also made available through an extensive series of publications and articles in scientific and technical journals.

PUBLIC HEALTH IMPLICATIONS

Results from NHANES benefit Americans in important ways. Facts about the distribution of health conditions and risk factors in the population help researchers better understand diseases. Comparisons between information from this survey and previous surveys allow health planners to understand how health conditions and risk factors change in the United States over time. NHANES data may be used to identify health care needs of the population. Government agencies and private organizations may use NHANES findings to establish policies and plan research, education, and health promotion programs that may improve present health status and prevent future health problems.

Data from NHANES 2011–2014 and 2015–2018 now allow the first national estimates for Asian Americans on numerous health conditions and risk factors of public health importance. Some NHANES findings on the health status of Asian Americans follow:

  • About 1 in 4 (24.9%) non-Hispanic Asian adults had hypertension (blood pressure > 140/90 mm Hg or taking antihypertensives).20

  • About 1 in 10 non-Hispanic Asian men and women had high total cholesterol (serum total cholesterol > 240 mg/dL).21

  • Of non-Hispanic Asian adults, 38.6% had a high body mass index (≥ 25 kg/m²)22 and 11.7% were obese (body mass index ≥ 30 kg/m²).23

  • More than half of non-Hispanic Asian adults with diabetes are undiagnosed.24

CONCLUSIONS

Over many decades, NHANES has been a unique source of national data on the health and nutritional status of the US population. It continues to collect and disseminate data to meet current public health objectives, considering changing priorities and shifts in the demographic profile of the US population. Oversampling Asians in NHANES 2011–2018 allows national health estimates to be computed for Asian Americans, a growing subpopulation of the United States, while continuing to monitor the health of the nation and other racial/ethnic groups. The success of NHANES is made possible by each person who was asked to participate in the survey and agreed to do so as well as all the NHANES partners who continue to see the survey’s value and provide support through funding, staffing, and other means.

ACKNOWLEDGMENTS

The authors thank Jennifer Madans and Lisa Mirel for their critical review and comments.

HUMAN PARTICIPANT PROTECTION

Centers for Disease Control and Prevention research on human participants complies with Health and Human Services Policy for Protection of Human Research Subjects. All National Health and Nutrition Examination Survey procedures and protocols have been reviewed and approved by the National Center for Health Statistics Research Ethics Review Board.

Footnotes

See also Chin, p. 827.

REFERENCES

