Author manuscript; available in PMC: 2021 Feb 1.
Published in final edited form as: Cancer Prev Res (Phila). 2020 May 14;13(8):687–698. doi: 10.1158/1940-6207.CAPR-19-0490

Utilizing Cultural and Ethnic Variables in Screening Models to Identify Individuals at High Risk for Gastric Cancer: A Pilot Study

Haejin In 1,2,3, Ian Solsky 1, Philip E Castle 3, Clyde B Schechter 3,4, Michael Parides 1, Patricia Friedmann 1, Judith Wylie-Rosett 3, M Margaret Kemeny 5, Bruce D Rapkin 3
PMCID: PMC7415580  NIHMSID: NIHMS1589486  PMID: 32409594

Abstract

Identifying persons at high risk for gastric cancer (GC) is needed for targeted interventions for prevention and control in low-incidence regions. Combining ethnic/cultural factors with conventional GC-risk factors may enhance identification of high-risk persons. Data from a prior case-control study (40 GC cases, 100 controls) were used. A “conventional model” using risk factors included in the Harvard Cancer Risk Index’s GC module was compared to a “parsimonious model” created from the most predictive variables of the conventional model as well as ethnic/cultural and socioeconomic variables. Model probability cut-offs aimed to identify a cohort with at least 10 times the baseline risk using Bayes’ Theorem applied to baseline U.S. GC incidence. The parsimonious model included age, U.S. generation, race, cultural food at ages 15–18 years, excessive salt, education, alcohol, and family history. This 11-item model enriched the baseline risk by 10-fold, at the 0.5 probability level cut-off, with an estimated sensitivity of 72% (95% CI 64–80), specificity of 94% (95% CI 90–97) and ability to identify a sub-cohort with GC prevalence of 128.5 per 100,000. The conventional model was only able to reach a risk level of 9.8 times baseline with a corresponding sensitivity of 31% (95% CI 23–39) and specificity of 97% (95% CI 94–99). Cultural and ethnic data may add important information to models for identifying U.S. individuals at high risk for GC, who then could be targeted for interventions to prevent and control GC. The findings of this pilot study remain to be validated in an external dataset.

Keywords: Cultural variables, gastric cancer, risk prediction, gastric cancer screening

BACKGROUND

Cancer screening allows for the earlier detection and treatment of pre-cancers and cancers, which can reduce cancer-related morbidity and mortality [1]. In the United States (US), screening is commonly performed for cancers of the cervix, colon, lung, prostate, and breast, but it is not generally performed for gastric cancer (GC), which has a low incidence in the US [2]. In countries with higher incidence rates of GC, such as Korea and Japan, national GC screening programs have been implemented and have resulted in improved survival from GC [3,4]. Although a population-based national screening program for GC may not be feasible in the US due to its relatively low prevalence, targeted screening of high-risk individuals could potentially reduce GC mortality; GC is estimated to have caused approximately 11,140 deaths in the US in 2019 [5]. A survey tool has the potential to identify higher-risk individuals who would benefit from screening by endoscopy.

Prior efforts to identify patients at risk have largely focused on surveying patient symptoms. However, early GC is frequently asymptomatic, so this method generally identifies individuals already in advanced stages of cancer with high mortality [6,7]. There are currently few risk prediction models that detect GC risk in asymptomatic individuals, when cancer can still be averted or treated successfully. One such model is the gastric cancer module of the Harvard Cancer Risk Index (HCRI), which was created using risk factors identified by group consensus. It was constructed to predict individual patients’ risks for various types of cancer [8]. The HCRI was later developed into an online assessment, renamed Your Disease Risk, which provides risk assessment and suggestions for prevention for several cancer types, including gastric cancer (available from: https://siteman.wustl.edu/prevention/ydr/) [9].

Gastric cancer is the third most common cancer in the world, and its incidence varies greatly by country. High-incidence countries such as Mongolia, Korea and Japan have incidence rates 10–15 times greater than low-incidence countries such as Saudi Arabia, Indonesia, and Nigeria [10]. Given the extreme variation in GC risk around the world, in countries like the US, whose population is a heterogeneous mix of races, ethnicities and cultures, GC risk is also very likely to be associated with ethnic background and country of origin. The addition of these variables could greatly enhance our ability to identify persons at risk, and could be particularly important for screening efforts in the diverse US patient population. Furthermore, studies have shown that targeted gastric cancer screening efforts for high-risk racial and ethnic groups may be cost-effective [11].

The objective of our study was to determine whether adding cultural and ethnic variables not included in the HCRI model can improve the performance of models to identify individuals at high risk for GC in the U.S. This study utilizes data from a prior case-control study in which data were collected using an extensive survey composed of known and proposed GC risk factors to try to identify variables associated with GC [12]. The goal of this study is not to propose a specific screening instrument to be used in a generalizable way but to help guide the development of future risk prediction tools to identify high-risk cancer patients in a simple, non-invasive manner.

METHODS

Data

Data from a prior GC case-control study were utilized for the present study [12]. The GC cases (n=40) were from a large urban academic medical center or an inner city public hospital serving largely racial/ethnic minorities. The controls (n=100) were recruited from primary care (n=47) and community settings (n=53) to obtain a study sample with similar demographic characteristics. Primary care patients were recruited by phone or by direct recruitment at primary care clinics. Community controls were recruited from community centers, libraries and churches neighboring the two hospitals. The survey was conducted through phone interview or paper survey. As described in detail in a prior publication [12], controls from primary care settings, as compared to controls from community settings, were more frequently older than 70 years (17% vs. 6%), Black (40% vs. 26%), less well educated (education >HS: 47% vs. 64%), and born in countries with lower rates of gastric cancer (incidence rate of country of birth >15 per 100,000: 4% vs. 19%). Furthermore, controls from primary care settings more often completed phone interviews than paper surveys (72% vs. 7%) whereas community controls more often completed paper surveys than phone interviews (93% vs. 28%) [12].

The study database contained exclusively self-reported data, including demographics, race/ethnicity, socioeconomic status, food frequency, smoking, alcohol habits, family health history, history of H. pylori diagnosis and treatment, country of birth, acculturation index, and lifetime ethnic dietary habits. Some of the questions associated with these variables were subjective and could be challenging to complete; the aforementioned publication therefore also reported missing data (non-responses) for some of these variables.

Survey Items and Model Variables

The GC case-control survey assessed all variables contained in the gastric cancer module of the Harvard Cancer Risk Index (HCRI) as well as ethnic and cultural, socioeconomic (SES) and dietary variables. While the variables collected in the GC case-control study and HCRI were topically alike, some of the survey items were not identical. The survey items used in HCRI and in the GC case-control study for this analysis are presented in Table 1. In total, 14 variables were assessed. Some variables required multiple survey questions. The survey items for each variable were first compared to see if the two tools assessed comparable information. GC case-control data were categorized to best match the information captured in the HCRI questions. Regarding family history of cancer, GC case-control data contained information up to second-generation; however, the variable was regrouped to match the information requested by the HCRI, which only assessed GC in siblings or parents. For alcohol consumption, while the HCRI considered 4 drinks per day as higher risk, very few patients in the GC case-control study reported this level of alcohol consumption; hence we assessed alcohol consumption as two or more drinks per day versus fewer than two drinks per day. There was a slight difference in the smoking items: the GC case-control study only considered people who had smoked more than 100 cigarettes to be or to have been smokers, while the HCRI did not make this distinction.

Table 1:

Survey Items

Variables | Harvard Cancer Risk Index, GC module Qs | GCA Screen Survey
Gender | What is your sex? | What is your gender?
Age | Enter your age | What is your date of birth?
Family history | Has your brother, sister or parent ever had stomach cancer? | Has anyone in your family ever had stomach cancer? (check all)
BMI | What is your height? | What is your height?
 | What is your weight? | What is your weight?
Excess salt | How many meals a week do you eat at restaurants or fast food places? | How often did you eat at restaurants or order take out?
 | How many times a day do you eat canned foods, processed foods (like potato chips), preserved meats (like bacon), or frozen meals (like pizza or TV dinners)? | How often did you eat processed meats?
Alcohol | How many servings of alcohol do you have in a typical day? | How much wine, beer or liquor did you drink on average each week?
Smoking | Do you smoke cigarettes? | Throughout your entire life, had you smoked 100 or more cigarettes?
 | (If yes or quit) How old were you when you started smoking? | (If yes) Did you smoke routinely or have you stopped?
 | How old were you when you quit smoking? | What age did you start smoking?
 | How many cigarettes did you used to smoke per day on average? | On average, how many cigarettes per day did you smoke at the following ages?
Blood type | What is your blood type? | What is your blood type group?
H. pylori | Have you ever been told by a doctor you have an H pylori infection? | Have you ever been diagnosed with H. pylori?
 | (If yes) Were you treated for H. pylori infection? | (If yes) Have you ever been treated for H. pylori?
Generation | | Where was your mother born?
 | | Where was your father born?
 | | Where were you born?
Cultural foods | | How often did you eat foods from your culture?
Education | | What is the highest level of school you have completed?
Race | | What is your race?
 | | Are you of Hispanic, Latino or Spanish Origin?
Heritage Score | | I often participate in my cultural traditions.
 | | I often participate in mainstream American cultural traditions.
 | | I would be willing to marry a person from my culture.
 | | I would be willing to marry an American person.
 | | I enjoy social activities with people from the same culture as myself.
 | | I enjoy social activities with typical American people.
 | | I am comfortable interacting with people of the same culture as myself.
 | | I am comfortable interacting with typical American people.
 | | I enjoy entertainment from my culture (for example, movies, music).
 | | I enjoy American entertainment (for example, movies, music).
 | | I often behave in ways that are typical of my culture.
 | | I often behave in ways that are typically American.
 | | It is important for me to maintain the practices of my culture.
 | | It is important for me to maintain American cultural practices.
 | | I believe in the values of my culture.
 | | I believe in mainstream American values.
 | | I enjoy the jokes and humor of my culture.
 | | I enjoy American jokes and humor.
 | | I am interested in having friends from my culture.
 | | I am interested in having American friends.

BMI: Body Mass Index; H pylori: Helicobacter pylori; GC: Gastric Cancer; Q: Questions

Data Analysis and Model Building

The objective of this analysis is to compare the discriminatory ability of a risk model using conventionally known risk factors as assessed in the gastric cancer module of the HCRI [8] (“conventional risk factor model”) with a parsimonious model developed through the inclusion of ethnic and cultural variables associated with GC followed by variable selection (“parsimonious model”). The conventional risk variables are age, gender, family history of GC, body mass index, excessive salt intake, alcohol, smoking, blood type, and H pylori. Cultural-ethnic variables include race, immigration/generation, acculturation, and consumption of cultural foods at ages 15–18 years. This age range was selected because individuals are generally in high school at these ages, a period they should be able to recall, and because it was found to be most predictive in our prior analysis. Education was also included as a variable.

We first examined the effect size for age- and sex-adjusted individual variables. Then three logistic regression models were built and compared. First, the conventional risk factor model was created (“conventional model”). Then the ethnic and cultural variables and education were added to create an all-inclusive model (“enhanced model”). The enhanced model was then used as the basis to select a model that contained only the most predictive variables (“parsimonious model”). Parsimonious model selection used multiple methods to determine variables for inclusion: backwards, forward and stepwise selection (cut-off for inclusion and removal at the p=0.1 level), and variable ranking using changes in the Nagelkerke / Cragg & Uhler pseudo-R square statistic [13]. Pseudo-R2 was used to quantify the contribution of variables to the observed variation in being a gastric cancer case, and changes in the pseudo-R2 statistic (delta-Pseudo-R2) were used to capture the relative contribution of each variable to the model.
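As an illustration of the pseudo-R2 statistic named above, the following is a minimal sketch of the standard Nagelkerke (Cragg & Uhler) formula computed from two log-likelihoods. The study's values came from SAS; the function name and the inputs used here are hypothetical and serve only to show the arithmetic.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke / Cragg & Uhler pseudo-R^2.

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of observations
    """
    # Cox & Snell R^2: 1 - (L_null / L_model)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox & Snell R^2 can reach for this null model.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    # Rescale so that a perfect model scores 1.
    return cox_snell / max_cox_snell


# Hypothetical inputs: 100 observations split evenly between outcomes.
ll_null_example = 100 * math.log(0.5)
print(nagelkerke_r2(ll_null_example, ll_null_example, 100))  # model no better than null
print(nagelkerke_r2(ll_null_example, 0.0, 100))              # perfectly fitting model
```

The rescaling step is what distinguishes this statistic from the Cox & Snell version, whose maximum is below 1 for binary outcomes.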

Sensitivities and specificities were calculated for each level of probability for all three models. Selection of model cut-off levels was based on maximizing sensitivity and specificity and on achieving our operational goal of identifying a risk threshold that corresponds to a positive predictive value (PPV) at least 10 times the baseline US annual incidence. This conservative cutoff is based on differences in incidence between the US and Korea/Japan. GC incidence in the US is 7–10 times lower than in Korea and Japan, making use of preventive and early detection strategies for GC in the general population unacceptable in terms of population benefits and harms. The GLOBOCAN-reported age-standardized rates (ASRs) for Korea and Japan, countries that screen, are 114.0 and 75.5 per 100,000, respectively; the ASR for the US is 11.2 per 100,000, male and female combined, in patients ages 40–79 years old. Note that GLOBOCAN ASRs are used for comparability in age adjustments across countries. The predicted prevalence of GC for the cohort found to be high risk by the model was estimated using Bayes’ Theorem [14] applied to the baseline prevalence in the U.S. population. Analyses were performed using SAS 9.4 (SAS Institute Inc., Cary, NC).
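The Bayes’ Theorem step can be sketched as follows. This is an illustrative calculation, not the authors' SAS code: the function name is hypothetical, and the inputs are the rounded sensitivity and specificity reported for the parsimonious model at the 0.5 cut-off, so the result is close to, but not identical to, the published predicted prevalence, which was presumably computed from unrounded values.

```python
import math


def ppv_from_bayes(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' Theorem.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    Returns NaN when no one would test positive (0/0).
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        return math.nan
    return true_pos / (true_pos + false_pos)


# Baseline U.S. rate from the text: 11.2 per 100,000 (GLOBOCAN ASR, ages 40-79).
BASELINE = 11.2 / 100_000

# Rounded values for the parsimonious model at the 0.5 probability cut-off.
ppv = ppv_from_bayes(sensitivity=0.72, specificity=0.94, prevalence=BASELINE)
per_100k = ppv * 100_000          # predicted prevalence in the flagged sub-cohort
fold_increase = ppv / BASELINE    # enrichment over baseline risk

print(f"{per_100k:.1f} per 100,000, {fold_increase:.1f}x baseline")
```

Because the baseline prevalence is so low, the PPV is driven almost entirely by the false positive rate, which is why small gains in specificity produce large jumps in the enrichment factor.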

RESULTS

Demographics

Demographics of the 40 GC cases and 100 controls are displayed in Table 2. Cases were more likely to be male (50% vs. 24%, p=0.003), Hispanic (60% vs. 28%, p=0.003), have less than a high school education (47.5% vs. 19%, p=0.001), have a family history of GC (15% vs. 5%, p=0.047), be foreign born (85% vs. 54%, p=0.003), consume cultural food at ages 15–18 years daily or more often (68% vs. 36%, p=0.003), have excess salt in their diet (15% vs. 5%, p=0.047), and report drinking two or more alcoholic drinks per day (18% vs. 6%, p=0.037). Non-significant trends were also observed for cases to be less or moderately acculturated (73% vs. 57%, p=0.286), have a current or past history of smoking (45% vs. 32%, p=0.263), have a BMI of 30 or more (43% vs. 35%, p=0.407) and be less likely to have blood type A (13% vs. 17%, p=0.509) than controls. Helicobacter pylori prevalence was similar between cases and controls (8% vs. 6%, p=0.279).

Table 2:

Attributes of Cases (n=40) and Controls (n=100)

Variables | Categories | Controls (N=100) | Cases (N=40) | p-value
Age | <50 | 32 (32%) | 2 (5%) | 0.001
 | 50 to 59 | 29 (29%) | 11 (27.5%) |
 | 60 to 69 | 28 (28%) | 11 (27.5%) |
 | ≥ 70 | 11 (11%) | 16 (40%) |
Gender | Male | 24 (24%) | 20 (50%) | 0.003
 | Female | 76 (76%) | 20 (50%) |
Race | NH-White | 17 (17%) | 4 (10%) | 0.003
 | NH-Black | 33 (33%) | 10 (25%) |
 | Hispanic | 28 (28%) | 24 (60%) |
 | Asian/PI/Other | 22 (22%) | 2 (5%) |
Education | Less than HS | 19 (19%) | 19 (47.5%) | 0.001
 | High School | 25 (25%) | 10 (25%) |
 | Greater than HS | 56 (56%) | 11 (27.5%) |
Family History of GC | No | 95 (95%) | 34 (85%) | 0.047
 | Yes | 5 (5%) | 6 (15%) |
US generation | Foreign-born | 54 (54%) | 34 (85%) | 0.003
 | 1st Generation (both parents foreign born) | 10 (10%) | 2 (5%) |
 | 2nd Generation (one or more parents US born) | 36 (36%) | 4 (10%) |
Cultural food consumption frequency at ages 15 to 18 | Daily or more | 36 (36%) | 27 (67.5%) | 0.003
 | Weekly or more (less than daily) | 35 (35%) | 7 (17.5%) |
 | Less than once per week | 29 (29%) | 6 (15%) |
Excess Salt | No | 95 (95%) | 34 (85%) | 0.047
 | Yes | 5 (5%) | 6 (15%) |
Acculturation (Heritage Subscore) | Upper Tertile (less acculturated) | 29 (29%) | 12 (30%) | 0.286
 | Middle Tertile (moderately acculturated) | 28 (28%) | 17 (42.5%) |
 | Lower Tertile (more acculturation) | 30 (30%) | 7 (17.5%) |
 | Missing | 13 (13%) | 4 (10%) |
Alcohol two or more drinks per day (missing=5) | No | 90 (94%) | 32 (82%) | 0.037
 | Yes | 6 (6%) | 7 (18%) |
Smoking | Never | 68 (68%) | 22 (55%) | 0.263
 | Quit, more than 20 years ago | 5 (5%) | 5 (12.5%) |
 | Quit, less than 20 years ago | 13 (13%) | 8 (20%) |
 | Current smoker | 14 (14%) | 5 (12.5%) |
BMI 30 or more | No | 65 (65%) | 23 (57.5%) | 0.407
 | Yes | 35 (35%) | 17 (42.5%) |
Blood Type A | No | 83 (83%) | 35 (87.5%) | 0.509
 | Yes | 17 (17%) | 5 (12.5%) |
H. Pylori | Never | 94 (94%) | 37 (92.5%) | 0.279
 | Yes, treated | 5 (5%) | 1 (2.5%) |
 | Yes, not treated or unsure | 1 (1%) | 2 (5%) |

BMI: Body Mass Index; H pylori: Helicobacter pylori; GC: Gastric Cancer; Q: Questions

Model selection

Table 3 summarizes the selection criteria of variables for inclusion in the parsimonious model. The ranking of variable importance using delta-Pseudo-R2 was as follows: age (delta-Pseudo-R2 30%), U.S. generation (12%), race (9%), cultural food at ages 15–18 years (7%), excessive salt (5%), acculturation (5%), education (3%), alcohol (3%), family history of GC (2%), smoking (2%), H pylori (2%), gender (1%), BMI (0%), and blood type A (0%). The variables that were uniformly selected using backward, forward and stepwise model selections were age, U.S. generation, race, cultural food at ages 15–18 years, excessive salt, and alcohol. Acculturation and education were each selected only with backward and forward selection, and family history only with forward and stepwise selection. Acculturation requires 20 survey items for assessment; we therefore dropped this variable from consideration, because the end product of this survey needs to be short and easy to use to be practical. The final variables included in the parsimonious model were age, U.S. generation, race, cultural food at ages 15–18 years, excessive salt, education, alcohol and family history of GC.

Table 3.

Selection of Variables for the Parsimonious Model - Relative Contribution of Variables to the Explanatory Power and Variables Based on Selection Modeling.

Columns 2–3: variable importance based on model variance. Columns 4–6: variables selected based on p-value selection modeling. Column 7: final variables for the parsimonious model.

Variable | Pseudo-R square | %Delta pseudo-R square | Backwards (p=0.1) | Forward (p=0.1) | Stepwise (p=0.1) | Final Variables for Parsimonious Model
Base Model (all variables below) | 0.796 | - | | | |
Age | 0.559 | 30% | * | * | * | *
Generation in the model | 0.698 | 12% | * | * | * | *
Race | 0.725 | 9% | * | * | * | *
Cultural Food at Ages 15–18 | 0.745 | 7% | * | * | * | *
Excess Salt | 0.754 | 5% | * | * | * | *
Acculturation | 0.757 | 5% | * | * | |
Education | 0.771 | 3% | * | * | | *
Alcohol ≥2 drinks/day | 0.773 | 3% | * | * | * | *
Family history of GCA | 0.782 | 2% | | * | * | *
Smoking | 0.782 | 2% | | | |
H pylori | 0.783 | 2% | | | |
Gender | 0.791 | 1% | | | |
BMI | 0.794 | 0% | | | |
Blood-type A | 0.795 | 0% | | | |

Pseudo-R square of the base model minus the variable being tested
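The delta-Pseudo-R2 ranking can be recomputed from the pseudo-R2 values in Table 3. The sketch below assumes that the percentage change is expressed relative to the base model, an interpretation inferred from the table and not stated explicitly in the text. Because the published pseudo-R2 values are rounded to three decimals, a recomputed percentage can differ from the published one by about a point.

```python
# Pseudo-R^2 of the base model (all 14 variables), from Table 3.
BASE_R2 = 0.796

# Pseudo-R^2 of the base model with the named variable removed, from Table 3.
R2_WITHOUT = {
    "Age": 0.559,
    "Generation": 0.698,
    "Race": 0.725,
    "Cultural food at ages 15-18": 0.745,
    "Excess salt": 0.754,
    "Acculturation": 0.757,
    "Education": 0.771,
    "Alcohol >=2 drinks/day": 0.773,
    "Family history of GC": 0.782,
    "Smoking": 0.782,
    "H pylori": 0.783,
    "Gender": 0.791,
    "BMI": 0.794,
    "Blood type A": 0.795,
}


def pct_delta(base_r2: float, reduced_r2: float) -> float:
    """Percent drop in pseudo-R^2 when a variable is removed (assumed relative to base)."""
    return 100.0 * (base_r2 - reduced_r2) / base_r2


# Rank variables from largest to smallest contribution.
ranked = sorted(
    ((name, pct_delta(BASE_R2, r2)) for name, r2 in R2_WITHOUT.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, delta in ranked:
    print(f"{name:30s} {delta:5.1f}%")
```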

Model Comparisons

Table 4 shows the OR estimates of each variable after adjusting for age and sex, as well as for the three multivariable models. After adjusting for age and sex, having a family history of GC, consuming excess salt, having diagnosed but untreated H pylori, being foreign-born, consuming cultural food at least daily, having less than a high school education and being Hispanic were individually found to be predictive. In the conventional model, older age (OR 1.1, 95% CI 1.06–1.2) and consumption of excess salt (OR 7.8, 95% CI 1.5–41.0) were independently predictive. In the parsimonious model, age (OR 1.2, 95% CI 1.1–1.3), excess salt (OR 15.7, 95% CI 1.9–132.8), daily alcohol consumption of 2 drinks or more (OR 78.1, 95% CI 4.9–999.9) and being foreign born (OR 29.3, 95% CI 3.5–247.9) were found to be independently predictive.

Table 4:

Comparison of Model Results

Variable | Level | Age and Sex Adjusted OR (95% CI) | Conventional Model OR (95% CI) | Enhanced Model OR (95% CI) | Parsimonious Model OR (95% CI)
Age | Per 1 year increase | N/A | 1.1 (1.1–1.2) | 1.2 (1.1–1.3) | 1.2 (1.1–1.3)
Gender | Male (ref=Female) | N/A | 3.5 (1.3–9.7) | 2.0 (0.5–7.7) | N/A
Family History | Yes (ref=no history) | 6.4 (1.5–27.9) | 5.0 (1.0–25.4) | 3.5 (0.4–27.5) | 9.1 (0.7–116.9)
BMI >= 30 | Yes (ref=no) | 1.6 (0.7–3.8) | 1.7 (0.6–4.4) | 1.3 (0.4–4.5) | N/A
Excess Salt | Yes (ref=no) | 6.7 (1.6–27.3) | 7.8 (1.5–41.0) | 9.6 (1.1–85.0) | 15.7 (1.9–132.8)
Alcohol | >=2 drink/day (ref=no) | 1.8 (0.6–4.9) | 2.0 (0.4–9.1) | 7.5 (0.5–109.3) | 78.1 (4.9–999.9)
Smoking | Never | Ref | Ref | Ref | N/A
 | Quit, more than 20 years ago | 1.1 (0.2–5.2) | 1.6 (0.3–8.2) | 0.5 (0.04–4.5) | N/A
 | Quit, less than 20 years ago | 1.4 (0.5–4.3) | 2.0 (0.6–6.9) | 2.0 (0.3–13.3) | N/A
 | Current smoker | 1.0 (0.3–3.6) | 1.1 (0.2–4.9) | 1.7 (0.2–13.9) | N/A
Blood Type A | Yes (ref=no) | 0.6 (0.2–2.0) | 0.8 (0.2–3.5) | 1.5 (0.2–10.2) | N/A
H pylori | Never | Ref | Ref | Ref | N/A
 | Yes, treated | 0.3 (0.0–2.9) | 0.2 (0.02–24.4) | 0.8 (0.04–16.1) | N/A
 | Yes, not treated or unsure | 16.7 (1.2–231.0) | 24.4 (1.8–329.4) | 5.2 (0.3–91.4) | N/A
Generation | Foreign-born | 9.5 (2.5–35.4) | N/A | 17.5 (2.1–147.4) | 29.3 (3.5–247.9)
 | First generation | 3.1 (0.4–24.2) | N/A | 2.0 (0.07–52.7) | 1.3 (0.0–37.8)
 | Second generation | Ref | N/A | Ref | Ref
Cultural food consumption frequency at ages 15 to 18 | Daily or more | 4.6 (1.5–14.5) | N/A | 5.5 (0.7–44.7) | 3.7 (0.6–21.9)
 | Weekly or more (less than daily) | 1.2 (0.3–4.6) | N/A | 0.9 (0.1–7.5) | 0.2 (0.0–1.9)
 | Less than once per week | Ref | N/A | Ref | N/A
Education | < High School | 4.6 (1.7–12.7) | N/A | 4.1 (0.8–21.7) | 3.6 (0.8–16.8)
 | High School | 1.8 (0.6–5.3) | N/A | 2.2 (0.4–10.9) | 3.8 (0.7–21.2)
 | > High School | Ref | N/A | Ref | N/A
Race | Non-Hispanic White | Ref | N/A | Ref | Ref
 | Non-Hispanic Black | 1.0 (0.2–4.3) | N/A | 0.6 (0.1–4.5) | 0.6 (0.1–4.0)
 | Hispanic | 5.4 (1.4–21.3) | N/A | 1.8 (0.3–12.3) | 3.6 (0.5–26.1)
 | API/Other | 0.4 (0.1–3.1) | N/A | 0.1 (0.01–1.2) | 0.0 (0.0–0.6)
Heritage Score | Upper Tertile (less acculturated) | 1.7 (0.5–5.6) | N/A | 3.4 (0.5–24.7) | N/A
 | Middle Tertile (moderately acculturated) | 2.5 (0.8–7.8) | N/A | 2.4 (0.5–11.8) | N/A
 | Lower Tertile (more acculturation) | Ref | N/A | Ref | N/A
 | Missing | 1.4 (0.3–6.4) | N/A | 16.4 (0.8–345.4) | N/A

OR: Odds Ratio; API: Asian Pacific Islander; BMI: Body Mass Index; H pylori: Helicobacter pylori; GC: Gastric Cancer; Q: Questions

Model AUC for the parsimonious model (0.94 95% CI 0.91–0.98) was improved over the conventional model (0.87 95% CI 0.81–0.93, p=0.009). ROC curves are shown in Figure 1.

Figure 1.


ROC Curves for enhanced, parsimonious, and conventional models.

Table 5 shows the selection of model cut-offs for comparison of model performance. The conventional model was unable to reach the target level of risk (10 times baseline risk) at any cutoff level with specificity of less than 100%. The parsimonious model exceeded 10 times the risk at the 0.5 probability level cut-off. This resulted in a model sensitivity of 72% (95% CI 64–80%), specificity of 94% (95% CI 90–97%) and PPV of 0.128% (95% CI 0.086–0.359%), which would translate to the prediction model being able to identify a sub-cohort of the population with a prevalence rate of 128.5 per 100,000. This parsimonious model performed as well as the enhanced model; the enhanced model exceeded 10 times the risk at the 0.4 probability level cut-off with a sensitivity of 90% (95% CI 85–95%), specificity of 93% (95% CI 88–97%) and PPV of 0.137% (95% CI 0.090–0.345%). However, the parsimonious model required only 11 survey items compared to the enhanced model, which required 42 survey items. Characteristics of the three models are provided for comparison in Table 6.

Table 5.

Determination of model cut-off levels, and its corresponding sensitivity, specificity, and the predicted prevalence of GC identified by the model when applied to the U.S. population, ages 40–79, both sexes. (Presented for comparison - GLOBOCAN 2018 ASR for Gastric Cancer, ages 40–79, both sexes, per 100,000: USA 11.2, Japan 75.5, Korea 114.0)

Column key: Sens = Sensitivity (%); Spec = Specificity (%); Prev = Predicted prevalence of GC identified by the model when applied to baseline U.S. incidence rate of 11.2 per 100K (per 100,000); Times = Times Increased Risk.

Probability Level | Conventional Model: Sens | Spec | Prev | Times | Enhanced Model: Sens | Spec | Prev | Times | Parsimonious Model: Sens | Spec | Prev | Times
0.1 | 97% | 43% | 19.0 | 1.7 | 100% | 78% | 51.2 | 4.6 | 100% | 71% | 38.4 | 3.4
0.2 | 95% | 64% | 29.1 | 2.6 | 97% | 83% | 65.4 | 5.8 | 95% | 79% | 51.0 | 4.6
0.3 | 79% | 79% | 42.7 | 3.8 | 92% | 86% | 76.3 | 6.8 | 87% | 83% | 58.6 | 5.2
0.4 | 64% | 86% | 53.0 | 4.7 | 90% | 93% | 137.7 | 12.3 | 79% | 91% | 94.9 | 8.5
0.5 | 56% | 92% | 75.8 | 6.8 | 85% | 97% | 302.4 | 27.0 | 72% | 94% | 128.5 | 11.5
0.6 | 44% | 94% | 78.1 | 7.0 | 74% | 99% | 793.3 | 70.8 | 67% | 94% | 119.3 | 10.7
0.7 | 31% | 97% | 110.2 | 9.8 | 72% | 99% | 766.1 | 68.4 | 59% | 97% | 210.9 | 18.8
0.8 | 23% | 99% | n/a | n/a | 62% | 99% | 657.4 | 58.7 | 54% | 98% | 288.7 | 25.8
0.9 | 5% | 100% | n/a | n/a | 54% | 99% | 575.7 | 51.4 | 46% | 100% | 100000.0 | 8928.6

Bolded text denotes the levels when the predicted prevalence reaches target of 10 times baseline risk

* Baseline population prevalence is based on GLOBOCAN estimates of the U.S. for gastric cancer in patients ages 40–79, both sexes.

indicates the level selected for comparisons of model performance in table 6

n/a – unable to calculate.

Table 6:

Comparison of Risk Factor Model Performance at model probability levels shown in table 5.

Characteristic | Conventional Model | Enhanced Model | Parsimonious Model
Number of Variables | 9 | 14 | 8
Number of Survey Items | 15 | 42 | 12
Sensitivity (95% CI)* | 31% (23, 39) | 90% (85, 95) | 72% (64, 80)
Specificity (95% CI)* | 97% (94, 99) | 93% (88, 97) | 94% (90, 97)
PPV (95% CI)* | 0.110% (0.07, 3.78) | 0.137% (0.090, 0.345) | 0.128% (0.086, 0.359)
AUC (95% CI) | 0.87 (0.81, 0.93) | 0.97 (0.95, 1.00) | 0.95 (0.92, 0.98)

* Model probability cutoffs: Conventional Model – 70%, Enhanced Model – 40%, Parsimonious Model – 50%;

PPV per 100,000.

DISCUSSION

We found that the inclusion of cultural and ethnic factors to conventionally known GC risk factors greatly enhanced the ability to detect individuals at higher risk for GC. Immigration/generation was highly predictive, and the addition of this variable allowed for the development of a model that used fewer survey items with considerable improvement of sensitivity, specificity, model fit, and positive predictive value.

The findings from this study and its predecessor indicate that cultural and ethnic variables may play a critical role in identifying individuals at higher risk for developing GC in the US. In the parent study, data were collected using a survey comprising 227 items. After item reduction steps, a logistic regression model using the eight highest-ranked variables, identified with boot-strapping techniques, was chosen as the final model [12]. The resulting eight variables were age, gender, family history of GC, race, generation/immigration, consumption of cultural foods at ages 15–18 years, education, and acculturation; this is why the five cultural-ethnic variables among them were explored in the present study. Generation/immigration remained a strong predictor in both analyses, highlighting the importance of considering cultural and ethnic variables in identifying individuals at high risk for GC.

Large racial/ethnic disparities exist for gastric cancer (GC) incidence in the US [15,16]. While rare in whites, among Blacks, Asians and Hispanics, GC is among the ten most frequent cancers and has nearly double the cancer incidence rate of whites [17]. Studies have shown that racial minorities in the US have an increased incidence of GC compared to non-Hispanic Whites and that there is a higher incidence of GC among immigrants from specific countries including non-Asian countries [15,16,18]. In a study of first generation Hispanic males, incidence rate was 21.3 per 100,000 in men from Puerto Rico compared to 9.2 per 100,000 for non-Hispanic white males [19]. In a study of Asians living in the US, GC was among the five most frequent cancers for Koreans, Chinese, Japanese, Laotian and Vietnamese [20]. Additionally, Asians had disproportionately greater tumor burden when compared to their representation in the overall population; Asians represented 16% of GC diagnosed, but only accounted for 9% of the population [15].

Diet has long been considered a risk factor for GC development. It has been suggested that pickled and salted foods put individuals at higher risk, and there is some question as to whether a Western diet can modulate this risk [21–23]. If this is the case, it may explain why dietary acculturation influences GC risk: it is possible that as individuals develop “American habits”, including the adoption of a Western diet, their risk for GC changes. Because we aim to assess GC risk using the fewest possible questions, assessing acculturation using established instruments such as the Vancouver Acculturation Index [24], or assessing food consumption using food frequency questionnaires, would not be feasible given the large number of questions required. Instead we opted to assess acculturation through a dietary acculturation question: consumption of cultural foods at various ages. Ages 15–18 years were selected based on our prior study, which showed this age group to have the highest predictability. By asking the question in this manner, we can simultaneously capture both acculturation and consumption of certain foods that may be related to greater risk in particular ethnicities/regions. For example, compared to a Western diet, a traditional Korean diet is largely soy-based with a large amount of pickled foods and allium vegetables (garlic, onions). Instead of asking about each food item separately, asking about the frequency of Korean food can serve as a proxy for consumption of food items associated with greater or lesser risk. While undefined, this item is likely to also serve as a proxy for as yet undiscovered factors that modulate risk for GC.

While the identification of the most predictive risk factors for GC is clearly necessary, these efforts will need to go hand-in-hand with efforts to further educate providers and the public about the importance of GC screening [25]. This is evident from research showing that many providers have insufficient understanding of GC risk and of which individuals are at higher risk [26]. Furthermore, once a risk prediction tool such as a survey is developed, implementation efforts will need to be designed carefully to include the racial and ethnic minority groups that are at particular risk.

In the U.S., only 28% of gastric cancers are diagnosed at earlier localized stages of cancer, resulting in 5-year survival of 31% [27]. In Korea and Japan, where they have instituted population-based screening programs since the 1950s and 1980s respectively, 50–60% of gastric cancers are detected in earlier stages [28–34] and as a result these countries report superior overall 5-year GC survival of 40–60% [32,35]. While no randomized controlled trials exist, observational studies suggest a 30–60% mortality benefit from screening [3,4,36–43].

Our target PPV of 10 times the background incidence rate aims to identify a subset of the population that would have the same level of risk as countries that currently screen for gastric cancer. In our study, the parsimonious model showed potential to identify a cohort with a prevalence of 128.5 per 100,000. This level of risk is also comparable to that of colorectal cancer in the U.S., which has an established screening program. The GLOBOCAN ASR for colorectal cancer in persons ages 50–79 in the U.S. is 96.7 per 100,000 population [10].
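The arithmetic behind a PPV target of this kind can be sketched with Bayes' theorem. The short Python example below is an illustration only, not the study's analysis code. It takes the rounded sensitivity and specificity estimates reported for the two models and applies them to a baseline incidence; the baseline value used here (`ASSUMED_BASELINE`, 10 per 100,000) is a hypothetical placeholder rather than the figure used in the study, and the function names are likewise our own. Because the inputs are rounded and the baseline is assumed, the output need not match the reported 10-fold and 9.8-fold enrichment.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' theorem: probability of disease given a positive screen."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


def enrichment(sensitivity, specificity, prevalence):
    """Fold increase in risk among screen-positives relative to baseline."""
    ppv = positive_predictive_value(sensitivity, specificity, prevalence)
    return ppv / prevalence


# ASSUMPTION: hypothetical baseline incidence, for illustration only.
ASSUMED_BASELINE = 10 / 100_000

# Rounded point estimates reported for each model at its cut-off.
parsimonious = enrichment(0.72, 0.94, ASSUMED_BASELINE)
conventional = enrichment(0.31, 0.97, ASSUMED_BASELINE)

# Reported prevalence in the sub-cohort flagged by the parsimonious model.
REPORTED_PREVALENCE = 128.5 / 100_000
flagged_per_case = 1 / REPORTED_PREVALENCE

print(f"Parsimonious model enrichment: {parsimonious:.1f}x baseline")
print(f"Conventional model enrichment: {conventional:.1f}x baseline")
print(f"Persons flagged per expected GC case: {flagged_per_case:.0f}")
```

When the baseline is very low, the enrichment is approximately the positive likelihood ratio, sensitivity / (1 − specificity), so it depends only weakly on the assumed baseline. The last line shows why even a 10-fold enriched cohort still consists mostly of persons without GC.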

Assuming that we are able to identify a sufficiently high-risk group, upper endoscopic screening to detect and treat premalignant lesions and to diagnose frank GC at an earlier stage is likely to have a benefit similar to that of colorectal cancer screening in the U.S. or gastric cancer screening in high-risk countries. Compared to colonoscopy, upper endoscopy also has the benefit of not requiring a bowel prep and is simpler to administer. Benefits of screening for gastric cancer are especially pronounced due to extreme 5-year survival differences by stage, ranging from 90% for Tis and T1a lesions to 67% for local, 29% for regional and 4% for metastatic disease [44,45]. In addition, gastric cancer mortality is not offset by effective treatments, making early detection the most promising option [46].

Although the findings of this work are important for showing the value of cultural and ethnic variables to GC risk prediction models, research is needed to identify additional risk factors that allow for even greater discrimination between those at high vs. lower risk than a survey of self-reported conventional and cultural-ethnic risk factors can provide. One possible area for exploration is the characterization of patients’ oral and fecal microbiome, which could serve as a potential biomarker for GC and which may also shed further light on the relationship between diet and GC risk [47]. Another possibility is adding data on H. pylori antibodies, especially those to CagA and VacA, as well as pepsinogen and hemoglobin A1c levels [48].

This study is not without its limitations. First, the data were taken from a prior study with a relatively small number of patients from a single geographic area. Data were collected by phone interview or survey, and the different ways in which individuals completed these could have introduced some inaccuracies into the data. Furthermore, while efforts were made to obtain a diverse sample of immigrants, the sample lacked people born in the most endemic regions of the world such as Korea, Mongolia and Japan, which precluded our ability to adequately explore the prevalence of GC in a subject’s birth country as a variable of interest. The study also chose representative variables to serve as conventional and cultural-ethnic variables based upon what has been used previously; however, this pool of variables should not be considered definitive or all-encompassing, as there are likely additional variables that could have been included. Neither the enhanced risk model, which includes all of the conventional and cultural-ethnic variables, nor the parsimonious model, which contains the immigration/generation variable, should be considered the definitive model for risk prediction for GC patients. Development of a high-risk prediction model would need to be undertaken in a much larger patient population to make more concrete statements and to fully explore how variables that put individuals at risk for GC differ by race/ethnicity. Additionally, we make no claim as to what is an appropriate risk threshold at which patients should be declared at sufficiently high risk that they warrant referral for a screening endoscopy. While the parsimonious model raised PPV to 10 times the baseline risk, most persons at that risk threshold or above still have no gastric pre-cancer or GC. There need to be continued efforts to identify other predictors of gastric cancer and to improve our ability to identify persons at high risk for GC. Importantly, no existing sources of data contain all the variables we found to be important in this study, and these findings are reported without external validation.

Nonetheless, while this study does not propose a specific screening tool to be used at this time, it provides further support that cultural and ethnic variables may be useful for identifying high-risk GC patients for screening in the U.S. The findings of this study may prompt other investigators to begin collecting these variables with more consistency. As the importance of cultural and ethnic variables to cancer research becomes evident, it is equally important that a culturally sensitive approach be taken to care delivered at all stages of the cancer continuum to ensure that high-risk patients are not only identified but also receive care and support of the highest quality possible [49].

CONCLUSION

The addition of ethnic and cultural variables, particularly immigration/generation status, to conventional risk factor variables improved the ability of models to identify individuals at high risk for GC. Future efforts to develop GC risk prediction tools in racially/ethnically heterogeneous populations should look to incorporate these variables for improved risk prediction. These variables may one day contribute to the creation of a parsimonious survey-based tool that can serve as a highly scalable paradigm to identify high-risk individuals for interventions (e.g., endoscopy) to prevent and control GC.

Funding:

Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under the award number UG1CA189823 (Alliance for Clinical Trials in Oncology NCORP Grant). Partial support was also provided by the Montefiore Medical Center (MMC) minority-based NCORP community site (UG1CA189859). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

Conflict of Interest Statement: The authors have no disclosures to report. The authors report no proprietary or commercial interest in any product mentioned or concept discussed in this article.

Meeting Presentation: Results in this manuscript were initially presented at the Gastrointestinal Cancers Symposium on January 17, 2019 in San Francisco, California and at the Society of Surgical Oncology (SSO) Annual Cancer Symposium, March 29, 2019 in San Diego, California.

REFERENCES

  • 1. American Cancer Society. Cancer Prevention & Early Detection Facts & Figures 2017–2018. Published 2017.
  • 2. International Agency for Research on Cancer. GLOBOCAN 2012: Estimated Cancer Incidence, Mortality and Prevalence Worldwide in 2012. Lyon, France: World Health Organization.
  • 3. Oshima A, Hirata N, Ubukata T, Umeda K, Fujimoto I. Evaluation of a mass screening program for stomach cancer with a case-control study design. International journal of cancer Journal international du cancer. 1986;38(6):829–833.
  • 4. Hamashima C, Ogoshi K, Okamoto M, Shabana M, Kishimoto T, Fukao A. A community-based, case-control study evaluating mortality reduction from gastric cancer by endoscopic screening in Japan. PLoS One. 2013;8(11):e79088.
  • 5. American Cancer Society. Key Statistics About Stomach Cancer. 2019. Accessed February 20, 2019.
  • 6. Tan YK, Fielding JW. Early diagnosis of early gastric cancer. European journal of gastroenterology & hepatology. 2006;18(8):821–829.
  • 7. Tata MD, Gurunathan R, Palayan K. MARK’s Quadrant scoring system: a symptom-based targeted screening tool for gastric cancer. Annals of gastroenterology : quarterly publication of the Hellenic Society of Gastroenterology. 2014;27(1):34–41.
  • 8. Colditz GA, Atwood KA, Emmons K, et al. Harvard report on cancer prevention volume 4: Harvard Cancer Risk Index. Risk Index Working Group, Harvard Center for Cancer Prevention. Cancer Causes Control. 2000;11(6):477–488.
  • 9. Siteman Cancer Center. Your Disease Risk. Accessed April 23, 2019.
  • 10. Arbyn M, Redman CWE, Verdoodt F, et al. Incomplete excision of cervical precancer as a predictor of treatment failure: a systematic review and meta-analysis. Lancet Oncol. 2017;18(12):1665–1679.
  • 11. Saumoy M, Schneider Y, Shen N, Kahaleh M, Sharaiha RZ, Shah SC. Cost Effectiveness of Gastric Cancer Screening According to Race and Ethnicity. Gastroenterology. 2018;155(3):648–660.
  • 12. In H, Langdon-Embry M, Gordon L, et al. Can a gastric cancer risk survey identify high-risk patients for endoscopic screening? A pilot study. J Surg Res. 2018;227:246–256.
  • 13. Wolfinger ROCM. Generalized linear mixed models: a pseudo-likelihood approach. J Stat Comput Sim. 1993;4:233–243.
  • 14. Rosner B. Fundamentals of Biostatistics. 8th ed. Cengage Learning; 2016.
  • 15. Lui FH, Tuan B, Swenson SL, Wong RJ. Ethnic disparities in gastric cancer incidence and survival in the USA: an updated analysis of 1992–2009 SEER data. Digestive diseases and sciences. 2014;59(12):3027–3034.
  • 16. Schlansky B, Sonnenberg A. Epidemiology of noncardia gastric adenocarcinoma in the United States. Am J Gastroenterol. 2011;106(11):1978–1985.
  • 17. United States Cancer Statistics Working Group. United States Cancer Statistics: 1999–2011 Incidence and Mortality Web-based Report. U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Published 2014. Accessed January 01, 2015.
  • 18. Dong E, Duan L, Wu BU. Racial and Ethnic Minorities at Increased Risk for Gastric Cancer in a Regional US Population Study. Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association. 2017;15(4):511–517.
  • 19. Pinheiro PS. Cancer incidence in first generation U.S. Hispanics: Cubans, Mexicans, Puerto Ricans, and new Latinos. Cancer epidemiology, biomarkers & prevention. 2009;18(8):2162–2169.
  • 20. Gomez SL. Cancer incidence trends among Asian American populations in the United States, 1990–2008. JNCI : Journal of the National Cancer Institute. 2013;105(15):1096–1110.
  • 21. Chang ET, Gomez SL, Fish K, et al. Gastric cancer incidence among Hispanics in California: patterns by time, nativity, and neighborhood characteristics. Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology. 2012;21(5):709–719.
  • 22. Gomez SL, Shariff-Marco S, DeRouen M, et al. The impact of neighborhood social and built environment factors across the cancer continuum: Current research, methodological considerations, and future directions. Cancer. 2015;121(14):2314–2330.
  • 23. Kim Y, Park J, Nam BH, Ki M. Stomach cancer incidence rates among Americans, Asian Americans and Native Asians from 1988 to 2011. Epidemiology and health. 2015;37:e2015006.
  • 24. Ryder AG, Alden LE, Paulhus DL. Is acculturation unidimensional or bidimensional? A head-to-head comparison in the prediction of personality, self-identity, and adjustment. Journal of personality and social psychology. 2000;79(1):49.
  • 25. Shah SC, Nunez H, Chiu S, et al. Low baseline awareness of gastric cancer risk factors amongst at-risk multiracial/ethnic populations in New York City: results of a targeted, culturally sensitive pilot gastric cancer community outreach program. Ethnicity & health. 2017:1–17.
  • 26. Shah SC, Itzkowitz SH, Jandorf L. Knowledge Gaps among Physicians Caring for Multiethnic Populations at Increased Gastric Cancer Risk. Gut and liver. 2018;12(1):38–45.
  • 27. Birkmeyer NJO, Gu N, Baser O, Morris AM, Birkmeyer JD. Socioeconomic Status and Surgical Mortality in the Elderly. Medical Care. 2008;46(9):893–899.
  • 28. Fumihiko W. Cancer Statistics in Japan 2012. Center for Cancer Control and Information Services, National Cancer Center; 2014.
  • 29. Jeong O, Park YK. Clinicopathological features and surgical treatment of gastric cancer in South Korea: the results of 2009 nationwide survey on surgically treated gastric cancer patients. Journal of gastric cancer. 2011;11(2):69–77.
  • 30. Korean Gastric Cancer Association Nationwide Survey on Gastric Cancer in 2014. Journal of gastric cancer. 2016;16(3):131–140.
  • 31. Dhillon PK, Farrow DC, Vaughan TL, et al. Family history of cancer and risk of esophageal and gastric cancers in the United States. International Journal of Cancer. 2001;93(1):148–152.
  • 32. Freedman ND, Abnet CC, Leitzmann MF, et al. A Prospective Study of Tobacco, Alcohol, and the Risk of Esophageal and Gastric Cancer Subtypes. American Journal of Epidemiology. 2007;165(12):1424–1433.
  • 33. Cook MB, Matthews CE, Gunja MZ, Abid Z, Freedman ND, Abnet CC. Physical Activity and Sedentary Behavior in Relation to Esophageal and Gastric Cancers in the NIH-AARP Cohort. PLoS ONE. 2013;8(12):e84805.
  • 34. Justin Cheung M, Rachel Munday R, Cheung J, Goodman K, Munday R. Helicobacter pylori infection in Canada’s arctic: Searching for the solutions. Can J Gastroenterol. 2008;22(11):912.
  • 35. Fock KM. Review article: the epidemiology and prevention of gastric cancer. Alimentary pharmacology & therapeutics. 2014;40(3):250–260.
  • 36. Fukao A, Tsubono Y, Tsuji I, S HI, Sugahara N, Takano A. The evaluation of screening for gastric cancer in Miyagi Prefecture, Japan: a population-based case-control study. International journal of cancer Journal international du cancer. 1995;60(1):45–48.
  • 37. Hamashima C, Shabana M, Okada K, Okamoto M, Osaki Y. Mortality reduction from gastric cancer by endoscopic and radiographic screening. Cancer science. 2015;106(12):1744–1749.
  • 38. Mizoue T, Yoshimura T, Tokui N, et al. Prospective study of screening for stomach cancer in Japan. International journal of cancer Journal international du cancer. 2003;106(1):103–107.
  • 39. Lee KJ, Inoue M, Otani T, Iwasaki M, Sasazuki S, Tsugane S. Gastric cancer screening and subsequent risk of gastric cancer: a large-scale population-based cohort study, with a 13-year follow-up in Japan. International journal of cancer Journal international du cancer. 2006;118(9):2315–2321.
  • 40. Miyamoto A, Kuriyama S, Nishino Y, et al. Lower risk of death from gastric cancer among participants of gastric cancer screening in Japan: a population-based cohort study. Preventive medicine. 2007;44(1):12–19.
  • 41. Inaba S, Hirayama H, Nagata C, et al. Evaluation of a screening program on reduction of gastric cancer mortality in Japan: preliminary results from a cohort study. Preventive medicine. 1999;29(2):102–106.
  • 42. Hamashima C, Ogoshi K, Narisawa R, et al. Impact of endoscopic screening on mortality reduction from gastric cancer. World J Gastroenterol. 2015;21(8):2460–2466.
  • 43. Jun JK, Choi KS, Lee HY, et al. Effectiveness of the Korean National Cancer Screening Program in Reducing Gastric Cancer Mortality. Gastroenterology. 2017;152(6):1319–1328.e1317.
  • 44. Howlader NNA, Krapcho M, Garshell J, Miller D, Altekruse SF, Kosary CL, Yu M, Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA. SEER Cancer Statistics Review, 1975–2011. National Cancer Institute. Published 2014. Accessed January 01, 2015.
  • 45. Min YW, Min B-H, Lee JH, Kim JJ. Endoscopic treatment for early gastric cancer. World Journal of Gastroenterology : WJG. 2014;20(16):4566–4573.
  • 46. Lin LL, Huang HC, Juan HF. Discovery of biomarkers for gastric cancer: a proteomics approach. Journal of proteomics. 2012;75(11):3081–3097.
  • 47. Sun JH, Li XL, Yin J, Li YH, Hou BX, Zhang Z. A screening method for gastric cancer by oral microbiome detection. Oncology reports. 2018;39(5):2217–2224.
  • 48. Iida M, Ikeda F, Hata J, et al. Development and validation of a risk assessment tool for gastric cancer in a general Japanese population. Gastric cancer : official journal of the International Gastric Cancer Association and the Japanese Gastric Cancer Association. 2018;21(3):383–390.
  • 49. Kagawa-Singer M, Dadia AV, Yu MC, Surbone A. Cancer, culture, and health disparities: time to chart a new course? CA: a cancer journal for clinicians. 2010;60(1):12–39.
