Abstract
Background
Risk prediction models for hepatocellular carcinoma are available for individuals with chronic hepatitis B virus (HBV) and hepatitis C virus (HCV) infections who are at high risk but not for the general population with average or unknown risk. We developed five simple risk prediction models based on clinically available data from the general population.
Methods
A prospective cohort of 428 584 subjects from a private health screening firm in Taiwan was divided into two subgroups—one with known HCV test results (n = 130 533 subjects) and the other without (n = 298 051 subjects). A total of 1668 incident hepatocellular carcinomas occurred during an average follow-up of 8.5 years. Model inputs included age, sex, health history–related variables; HBV or HCV infection–related variables; serum levels of alanine transaminase (ALT), aspartate transaminase (AST), and alfa-fetoprotein (AFP), as well as other variables of routine blood panels for liver function. Cox proportional hazards regression method was used to identify risk predictors of hepatocellular carcinoma. Receiver operating characteristic curves were used to assess discriminatory accuracy of the models. Models were internally validated. All statistical tests were two-sided.
Results
Age, sex, health history, HBV and HCV status, and serum ALT, AST, AFP levels were statistically significant independent predictors of hepatocellular carcinoma risk (all P < .05). Use of serum transaminases only in a model showed a higher discrimination compared with HBV or HCV only (for transaminases, area under the curve [AUC] = 0.912, 95% confidence interval [CI] = 0.909 to 0.915; for HBV, AUC = 0.840, 95% CI = 0.833 to 0.848; and for HCV, AUC = 0.841, 95% CI = 0.834 to 0.847). Adding HBV and HCV data to the transaminase-only model improved the discrimination (AUC = 0.933, 95% CI = 0.929 to 0.949). Internal validation showed high discriminatory accuracy and calibration of these models.
Conclusion
Models with transaminase data were best able to predict hepatocellular carcinoma risk even among subjects with unknown or HBV- or HCV-negative infection status.
Chronic hepatitis B virus (HBV) and hepatitis C virus (HCV) infections are three to five times more common than HIV infection and AIDS in the United States, placing those infected with HBV or HCV at increased risk for hepatocellular carcinoma, cirrhosis, and death (1). However, a recent Institute of Medicine (IOM) report noted that, unlike people with HIV infection or AIDS, most of the five million Americans with HBV or HCV infections are unaware of their risks until they develop symptoms of hepatocellular carcinoma or cirrhosis (2). Many of the 150 000 deaths expected in the next 10 years could be prevented if physicians and the public were better educated about early recognition of these conditions.
Although HBV or HCV carriers are at increased risk of hepatocellular carcinoma, the cancer also occurs among noncarriers of these viruses (3,4). Clinicians therefore have difficulty assessing this “low-risk” population. Checking for carrier status through HBV or HCV testing requires extra effort and cost, although universal screening has been recommended (5). Even when HBV or HCV testing is performed and found to be positive, the majority of carriers do not take action to reduce their risk (2). This is partly because individual-specific information on the relationship between hepatitis carrier status and hepatocellular carcinoma risk is not readily available to physicians. As a result, the test information, whether positive or negative for HBV or HCV, is often wasted. Because not knowing one’s risk is a major barrier to taking action (2), much of the recent progress made in the treatment of HBV or HCV cannot be fully utilized (6).
Available prediction models for hepatocellular carcinoma are limited to individuals at elevated risk who are carrying HBV (7–9) or HCV (10,11). Many of these models also need additional information from clinical workups, such as presence of HBV DNA in the blood (7,8), and this has further limited the applicability of the prediction model because such measures are usually not readily available in clinical settings. As much of the public is unaware of their risk profile, the need for a new average-risk model is obvious. Finally, assessing risk in the general public without mass screening for HBV or HCV is another challenge, because conducting such screenings has not been proven effective scientifically (5). Hepatocellular carcinoma has a high mortality rate, and possible interventions, such as interferon therapy or lifestyle changes, are available to reduce the mortality or alter the course of the disease, provided individuals at high risk can be identified. A simple, easy-to-administer risk prediction model based on commonly available data at health checkups would be of great value.
Taking advantage of a medical screening program involving nearly half a million healthy individuals in Taiwan with follow-up data (12,13), we developed a prediction model for hepatocellular carcinoma based on data routinely collected in a typical office visit. The intent of the prediction model is to provide a simple, efficient, and widely available tool to identify and quantify cancer risk in the average-risk population.
Methods
Study Population and Data Collection
The study population was obtained from a standard medical screening program conducted by the MJ Health Management Institution (MJ). From 1994 to 2008, a total of 428 584 subjects, free of cancer at baseline, were recruited. In the MJ cohort, because tests for HCV infection were performed at an extra cost to members, only a subset of 130 533 participants had data on HCV status. Given the importance of HCV as a risk factor for hepatocellular carcinoma, we divided the cohort into two subcohorts in order to provide more accurate risk prediction estimates: one subcohort had the HCV test (n = 130 533 subjects), and the other had no HCV test (n = 298 051 subjects). Participants in the MJ cohort were aged 20 years or older.
All participants completed a self-administered questionnaire covering demographic characteristics and health history, including lifestyle and medical history (such as diabetes, hypertension, stroke, heart diseases). Subjects who self-reported having been diagnosed with diabetes or currently taking diabetes medication were defined as having diabetes. All subjects went through testing for anthropometric measurements (eg, height, weight, waist circumference, hip circumference, body fat percentage, etc.), blood pressure, pulse rate, respiration rate, and chest circumference. Overnight fasting blood was analyzed for a standard panel, including hemogram, blood sugar tests, liver function tests, renal function tests, blood lipid tests, thyroid function tests, blood grouping, the presence of HBV surface antigen in blood (HBsAg), and the presence of HCV antibody in blood (offered to a subgroup of members with additional cost). Individuals who tested positive for HBsAg are referred to as “HBV+” subjects and those tested positive for HCV antibody are referred to as “HCV+” subjects, and individuals who tested negative were referred to as “HBV−” and “HCV−” subjects, respectively. Smoking was classified by the number of pack-years (ie, daily cigarette quantity × duration in years) among ever-smokers. Alcohol consumption was classified into “regular drinkers” (those who consumed ≥2 drinks/day on ≥3 days/week) and “occasional drinkers” (those who consumed <2 drinks/day on <3 days/week). Regarding volume of leisure time physical activity (LTPA) (ie, the product of intensity measured as metabolic equivalent tasks [MET] × duration of exercise in hours), the MET-hour per week of each individual was classified as: inactive (<3.75 MET-hour), low-active (3.75–7.49 MET-hour), and active (≥7.5 MET-hour; this group met the current LTPA recommendation) (12,13). Details of the study method have been reported elsewhere (13).
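The exposure definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 20-cigarettes-per-pack conversion is an assumption not stated in the text, and the handling of values that fall between the published bands (for example, pack-years between 0 and 1, or heavy drinking on fewer than 3 days per week) is also an assumption.

```python
CIGARETTES_PER_PACK = 20  # assumption: conversion not stated in the text


def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Daily quantity (expressed in packs) multiplied by duration in years."""
    return cigarettes_per_day / CIGARETTES_PER_PACK * years_smoked


def smoking_band(py: float) -> str:
    """Bands used in the tables: 0, 1-9.9, and >=10 pack-years."""
    if py <= 0:
        return "0"
    # Assumption: values between 0 and 1 fall in the lowest non-zero band.
    return "1-9.9" if py < 10 else ">=10"


def drinking_category(drinks_per_day: float, days_per_week: float) -> str:
    """Regular: >=2 drinks/day on >=3 days/week; everything else is pooled."""
    if drinks_per_day >= 2 and days_per_week >= 3:
        return "regular"
    # Assumption: any pattern that is not "regular" is none or occasional.
    return "none or occasional"


def ltpa_category(met: float, hours_per_week: float) -> str:
    """Volume of leisure time physical activity in MET-hours per week."""
    met_hours = met * hours_per_week
    if met_hours < 3.75:
        return "inactive"
    if met_hours < 7.5:
        return "low-active"
    return "active"
```

For instance, under these assumptions a person exercising at 2.5 MET for 2 hours per week accumulates 5.0 MET-hours and is classified as low-active.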
All participants signed an informed consent form. The study was approved by the Institutional Review Boards of the National Health Research Institutes, Taiwan, and MD Anderson Cancer Center.
Ascertainment of Hepatocellular Carcinoma
Each participant received a unique identifier that was matched with the National Cancer Registry (14) and the National Death File (15). All incident cases of hepatocellular carcinoma were identified by International Classification of Diseases for Oncology (ICD-O) code 155. Registry data included identification information, sex, date of birth, date of diagnosis, anatomical site of the tumor, histological diagnosis, and treatments (14). At the end of 2008, with an average of 8.5 years (range = 1–13.9 years) of follow-up, a total of 1668 incident hepatocellular carcinomas were identified.
Statistical Analysis
To develop a risk prediction model, each dataset from the two subcohorts (with and without HCV tests) was randomly and equally split into a training set to guide the building of the risk model and a validation set to assess the models’ predictive performance. Only the results of the final models from the full dataset are presented in this study. Sex, diabetes (yes, no), HBV status (HBV+, HBV−), HCV status (HCV+, HCV−), and alcohol drinking (regular, occasional, or none) were categorical variables. Other variables were continuous variables. Cutoff points for continuous variables were based on median (two categories), tertile (three categories), or quartile (four categories) values in the population. The cutoff points for AST and ALT were chosen by setting the reference group at different starting points through multiple iterations (see below). A stepwise Cox proportional hazards regression analysis was performed to identify risk predictors that were statistically significantly associated with increased risk of hepatocellular carcinoma in multivariable models, with special efforts to screen for risk predictors among various tests commonly performed to evaluate liver function, such as measurement of bilirubin, albumin, globulin, albumin–globulin ratio, alkaline phosphatase, serum alanine transaminase (ALT), serum aspartate transaminase (AST), γ-glutamyl transferase, l-lactate dehydrogenase, alfa-fetoprotein (AFP), and liver ultrasound. Hazard ratios (HRs) and 95% confidence intervals (CIs) were estimated for each variable. The proportional hazard assumption was tested and deemed to have been met (16).
Four models were developed in the subcohort without an HCV test (Table 1), and a fifth model was added in the subcohort with an HCV test (Table 2). Modeling started with health history (model 1: age, sex, pack-year of smoking, alcohol drinking, physical activity, and diabetes) or transaminase only (model 2: age, sex, AST, and ALT), followed by combination of these variables into a joint transaminase and health history model (model 3: age, sex, pack-year of smoking, alcohol drinking, physical activity, diabetes, AST, and ALT). Finally, model 3 was extended by adding HBV test results and AFP (model 4: age, sex, pack-year of smoking, alcohol drinking, physical activity, diabetes, AST, ALT, AFP, and HBV) and further extended by adding HBV and HCV test results and AFP (model 5: age, sex, pack-year of smoking, alcohol drinking, physical activity, diabetes, AST, ALT, AFP, HBV, and HCV) variables. HBV and HCV variables were dichotomized as positive or negative as described above. In selecting the cutoff points for AST and ALT, we set the reference group at different starting points (eg, <10 IU/L, <15 IU/L, <20 IU/L, etc.) and examined the risk at each additional 5 IU/L at a time through multiple iterations. We found that the reference at 25 IU/L was best to differentiate risk groups (ie, there was a substantial increase in HRs when exceeding 25 IU/L in all scenarios). It is to be noted that the 25 IU/L cutoff point is much lower than the upper limit of the normal reference (ULN), usually set around 40 IU/L (17).
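As a minimal sketch of how the categorical inputs described above could be prepared, the following Python snippet maps continuous laboratory values to the bands used in the tables and lists the variables entering each model. The band edges and variable lists come from the text and tables; the names `band`, `MODELS`, and the variable keys are hypothetical, and values are assumed to be reported so that each band's lower edge is inclusive.

```python
from bisect import bisect_right

# Band edges (lower bound inclusive) and labels, as used in Tables 1 and 2.
AST_CUTS, AST_LABELS = [25, 40, 60], ["<25", "25-39", "40-59", ">=60"]  # IU/L
ALT_CUTS, ALT_LABELS = [25], ["<25", ">=25"]                            # IU/L
AFP_CUTS, AFP_LABELS = [2.5, 5.0, 10.0], ["<2.5", "2.5-4.9", "5.0-9.9", ">=10.0"]  # ng/mL


def band(value: float, cuts: list, labels: list) -> str:
    """Return the label of the band that contains `value`."""
    return labels[bisect_right(cuts, value)]


# Variables entering each model (key names are illustrative only).
HEALTH_HISTORY = ["age", "sex", "pack_years", "drinking", "physical_activity", "diabetes"]
MODELS = {
    1: HEALTH_HISTORY,
    2: ["age", "sex", "ast", "alt"],
    3: HEALTH_HISTORY + ["ast", "alt"],
    4: HEALTH_HISTORY + ["ast", "alt", "afp", "hbv"],
    5: HEALTH_HISTORY + ["ast", "alt", "afp", "hbv", "hcv"],
}

print(band(30, AST_CUTS, AST_LABELS))   # AST of 30 IU/L falls in the 25-39 band
```

Note that an AST of 30 IU/L is below the conventional upper limit of normal but already sits in the first elevated band under the 25 IU/L reference chosen here.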
Table 1.
Characteristics | Total subjects, % | Incidence, No. (%) | Model 1: Health history, HR (95% CI) | Model 2: Transaminase, HR (95% CI) | Model 3: Transaminase and health history, HR (95% CI) | Model 4: Transaminase, health history, AFP, and HBV, HR (95% CI) |
---|---|---|---|---|---|---|
Sex | ||||||
Female | 52.2 | 390 (0.25) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
Male | 47.8 | 862 (0.60) | 1.79 (1.54 to 2.07) | 1.93 (1.71 to 2.19) | 1.54 (1.33 to 1.79) | 1.38 (1.19 to 1.61) |
Age at baseline, y | ||||||
20–29 | 55.5 | 116 (0.07) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
40–59 | 31.8 | 479 (0.52) | 6.66 (5.43 to 8.17) | 5.34 (4.35 to 6.56) | 5.24 (4.26 to 6.43) | 5.28 (4.28 to 6.51) |
≥60 | 12.7 | 595 (1.63) | 19.06 (15.60 to 23.40) | 13.71 (11.20 to 16.8) | 13.46 (10.90 to 16.50) | 14.85 (12.00 to 18.40) |
Smoking, pack-year | ||||||
0 | 71.5 | 664 (0.31) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) |
1–9.9 | 13.9 | 196 (0.47) | 1.35 (1.13 to 1.61) | — | 1.36 (1.14 to 1.63) | 1.19 (0.99 to 1.42) |
≥10 | 14.6 | 392 (0.90) | 1.32 (1.13 to 1.55) | — | 1.39 (1.19 to 1.62) | 1.38 (1.18 to 1.62) |
Drinking† | ||||||
None or occasional | 84.2 | 872 (0.35) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) |
Regular | 15.8 | 380 (0.81) | 1.56 (1.36 to 1.79) | — | 1.2 (1.05 to 1.39) | 1.26 (1.09 to 1.45) |
Physical activity, MET-hour‡ | ||||||
<3.75 | 52.6 | 634 (0.40) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) |
3.75–7.49 | 22.5 | 232 (0.35) | 0.82 (0.70 to 0.95) | — | 0.90 (0.77 to 1.06) | 0.96 (0.82 to 1.12) |
≥7.5 | 24.9 | 376 (0.51) | 0.74 (0.65 to 0.84) | — | 0.87 (0.77 to 1.00) | 0.87 (0.76 to 0.99) |
Diabetes | ||||||
No | 96.9 | 1111 (0.38) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) |
Yes | 3.1 | 141 (1.53) | 1.66 (1.38 to 1.99) | — | 1.33 (1.11 to 1.60) | 1.34 (1.11 to 1.62) |
AST, IU/L | ||||||
<25 | 77.0 | 213 (0.09) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
25–39 | 17.8 | 365 (0.69) | — | 3.93 (3.21 to 4.83) | 4.00 (3.25 to 4.91) | 3.31 (2.69 to 4.08) |
40–59 | 3.3 | 260 (2.66) | — | 14.58 (11.50 to 18.40) | 14.43 (11.4 to 18.3) | 8.51 (6.68 to 10.8) |
≥60 | 1.9 | 413 (7.43) | — | 39.58 (31.60 to 49.60) | 38.34 (30.6 to 48.1) | 10.92 (8.55 to 13.9) |
ALT, IU/L | ||||||
<25 | 67.2 | 238 (0.12) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
≥25 | 32.8 | 1013 (1.04) | — | 1.93 (1.71 to 2.19) | 1.47 (1.21 to 1.79) | 1.29 (1.05 to 1.57) |
AFP, ng/mL | ||||||
<2.5 | 51.8 | 183 (0.12) | — | — | — | 1.00 (referent) |
2.5–4.9 | 40.6 | 416 (0.35) | — | — | — | 1.56 (1.30 to 1.87) |
5.0–9.9 | 6.7 | 296 (1.49) | — | — | — | 4.29 (3.52 to 5.22) |
≥10.0 | 0.9 | 345 (13.24) | — | — | — | 15.20 (12.30 to 18.90) |
HBV | ||||||
Negative | 84.3 | 615 (0.25) | — | — | — | 1.00 (referent) |
Positive | 15.7 | 637 (1.38) | — | — | — | 3.40 (3.00 to 3.84) |
*Risk factors identified for hepatocellular carcinoma and incidence in each risk group in 298 051 subjects without HCV test in the MJ Health Management Institution cohort. Hazard ratios and 95% confidence intervals were estimated using Cox regression model. Cutoff points for continuous variables were based on median (two categories), tertile (three categories), or quartile (four categories) values in the population. The cutoff points for AST and ALT were chosen by setting the reference group at different starting points (eg, <10, <15, <20 IU/L, etc.) and examining the risk at each additional 5 IU/L at a time through multiple iterations. HR = hazard ratio; CI = confidence interval; AST = serum aspartate transaminase; ALT = serum alanine transaminase; AFP = alfa-fetoprotein; HBV = hepatitis B virus; HCV = hepatitis C virus; — = not applicable.
†None or occasional drinking was defined as not drinking or consuming fewer than 2 drinks per day on fewer than 3 days per week. Regular drinking was defined as 2 or more drinks per day on 3 or more days per week.
‡Physical activity was categorized as inactive (<3.75 MET-hour), low-active (3.75–7.49 MET-hour), and active (≥7.5 MET-hour).
Table 2.
Characteristics | Total subjects, % | Incidence, No. (%) | Model 1: Health history, HR (95% CI) | Model 2: Transaminase, HR (95% CI) | Model 3: Transaminase and health history, HR (95% CI) | Model 4: Transaminase, health history, AFP, and HBV, HR (95% CI) | Model 5: Transaminase, health history, AFP, HBV, and HCV, HR (95% CI) |
---|---|---|---|---|---|---|---|
Sex | |||||||
Female | 48.3 | 128 (0.20) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
Male | 51.7 | 288 (0.43) | 1.60 (1.22 to 2.09) | 1.65 (1.32 to 2.07) | 1.26 (0.96 to 1.65) | 1.17 (0.89 to 1.55) | 1.39 (1.05 to 1.83) |
Age at baseline, y | |||||||
20–29 | 59.3 | 52 (0.07) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
40–59 | 31.9 | 179 (0.44) | 5.91 (4.25 to 8.22) | 4.84 (3.50 to 6.68) | 4.65 (3.35 to 6.46) | 4.35 (3.12 to 6.06) | 3.74 (2.68 to 5.23) |
≥60 | 8.9 | 159 (1.39) | 17.17 (12.20 to 24.2) | 12.19 (8.75 to 16.90) | 11.10 (7.90 to 15.60) | 10.92 (7.71 to 15.5) | 8.65 (6.06 to 12.4) |
Smoking, pack-year | |||||||
0 | 74.7 | 229 (0.24) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
1–9.9 | 13.1 | 53 (0.31) | 1.23 (0.88 to 1.72) | — | 1.31 (0.93 to 1.83) | 1.27 (0.92 to 1.74) | 1.13 (0.81 to 1.58) |
≥10 | 12.2 | 115 (0.73) | 1.41 (1.06 to 1.87) | — | 1.49 (1.12 to 1.97) | 1.50 (1.14 to 1.98) | 1.47 (1.10 to 1.97) |
Drinking† | |||||||
None or occasional | 84.1 | 280 (0.26) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
Regular | 15.9 | 136 (0.66) | 1.60 (1.24 to 2.06) | — | 1.37 (1.06 to 1.77) | 1.36 (1.05 to 1.77) | 1.36 (1.05 to 1.76) |
Physical activity, MET-hour‡ | |||||||
<3.75 | 51.4 | 188 (0.31) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
3.75–7.49 | 23.4 | 82 (0.30) | 0.90 (0.69 to 1.19) | — | 1.01 (0.76 to 1.32) | 1.04 (0.79 to 1.36) | 1.10 (0.84 to 1.45) |
≥7.5 | 25.2 | 122 (0.41) | 0.69 (0.54 to 0.89) | — | 0.84 (0.66 to 1.08) | 0.87 (0.68 to 1.12) | 0.89 (0.69 to 1.14) |
Diabetes | |||||||
No | 97.7 | 362 (0.29) | 1.00 (referent) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
Yes | 2.3 | 35 (1.18) | 1.70 (1.16 to 2.49) | — | 1.46 (1.00 to 2.14) | 1.63 (1.11 to 2.39) | 1.51 (1.03 to 2.21) |
AST, IU/L | |||||||
<25 | 75.6 | 53 (0.05) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
25–39 | 18.6 | 121 (0.50) | — | 4.72 (3.10 to 7.17) | 4.63 (3.04 to 7.06) | 3.97 (2.59 to 6.09) | 3.61 (2.36 to 5.52) |
40–59 | 3.6 | 79 (1.66) | — | 13.48 (8.46 to 21.50) | 13.93 (8.73 to 22.20) | 8.59 (5.33 to 13.90) | 6.31 (3.89 to 10.20) |
≥60 | 2.2 | 163 (5.79) | — | 43.00 (27.80 to 66.60) | 41.36 (26.60 to 64.3) | 13.98 (8.80 to 22.20) | 8.27 (5.10 to 13.40) |
ALT, IU/L | |||||||
<25 | 64.0 | 46 (0.06) | — | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) | 1.00 (referent) |
≥25 | 36.0 | 370 (0.79) | — | 2.38 (1.55 to 3.65) | 1.26 (0.96 to 1.65) | 1.94 (1.26 to 3.00) | 1.94 (1.26 to 3.00) |
AFP, ng/mL | |||||||
<2.5 | 50.4 | 49 (0.07) | — | — | — | 1.00 (referent) | 1.00 (referent) |
2.5–4.9 | 41.6 | 130 (0.24) | — | — | — | 1.72 (1.20 to 2.44) | 1.73 (1.21 to 2.46) |
5.0–9.9 | 7.0 | 99 (1.09) | — | — | — | 4.96 (3.39 to 7.23) | 4.86 (3.33 to 7.10) |
≥10.0 | 1.0 | 136 (10.15) | — | — | — | 19.10 (12.90 to 28.20) | 15.80 (10.70 to 23.50) |
HBV | |||||||
Negative | 84.3 | 615 (0.25) | — | — | — | 1.00 (referent) | 1.00 (referent) |
Positive | 15.7 | 637 (1.38) | — | — | — | 4.04 (3.24 to 5.04) | 5.84 (4.63 to 7.35) |
HCV | |||||||
Negative | 97.4 | 246 (0.19) | — | — | — | — | 1.00 (referent) |
Positive | 2.6 | 170 (5.06) | — | — | — | — | 3.98 (3.02 to 5.25) |
*Risk factors identified for hepatocellular carcinoma and incidence in each risk group in 130 533 subjects with HCV test in the MJ Health Management Institution cohort. Hazard ratios and 95% confidence intervals were estimated using Cox regression model. Cutoff points for continuous variables were based on median (two categories), tertile (three categories), or quartile (four categories) values in the population. The cutoff points for AST and ALT were chosen by setting the reference group at different starting points (eg, <10, <15, <20 IU/L, etc.) and examining the risk at each additional 5 IU/L at a time through multiple iterations. HR = hazard ratio; CI = confidence interval; AST = serum aspartate transaminase; ALT = serum alanine transaminase; AFP = alfa-fetoprotein; HBV = hepatitis B virus; HCV = hepatitis C virus; — = not applicable.
†None or occasional drinking was defined as not drinking or consuming fewer than 2 drinks per day on fewer than 3 days per week. Regular drinking was defined as 2 or more drinks per day on 3 or more days per week.
‡Physical activity was categorized as inactive (<3.75 MET-hour), low-active (3.75–7.49 MET-hour), and active (≥7.5 MET-hour).
Model goodness of fit was assessed in terms of discriminatory accuracy and calibration in an internal validation. Discriminatory accuracy for predicting the development of hepatocellular carcinoma within 10 years was assessed by constructing time-dependent receiver operating characteristic curves for censored survival data (18) and calculating the area under the curve (AUC). The adequacy of each fitted model was also evaluated by calculating the concordance index (C-index), which also measures the model discriminatory accuracy, using the training and validation sets. Similar to AUC, we calculated C-index based on a 10-year prediction. A total of four datasets were used in calculating the C-index: one training set and one validation set for those with an HCV test, and one training set and one validation set for those without an HCV test. We assessed internal calibration of the models by determining the extent of agreement between predicted and observed events in 10 years (ie, calibration) (19) and then created a cross-validated calibration plot. We used the whole study population to perform the 10-fold cross-validated calibration for different models, in which the study population was randomly divided into 10 equal subsets with nine subsets as training set and one subset as testing set. Cross-validated predicted probability was calculated in each decile.
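To make the notion of discriminatory accuracy concrete, the following is a minimal pure-Python sketch of a concordance index for right-censored data (Harrell's formulation). It is an illustration only: the function name `c_index` is hypothetical, the sketch does not restrict prediction to a 10-year horizon, and it is not the time-dependent ROC estimator (18) or the Stata/SAS routines used in this study.

```python
def c_index(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if censored
    scores : predicted risk (higher means higher predicted risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # earlier event had the higher score
                elif scores[i] == scores[j]:
                    concordant += 0.5   # ties in score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one censored.
print(c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.6, 0.7, 0.1]))  # prints 0.8
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking of subjects by time to event.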
The 5- and 10-year absolute risks were calculated from baseline probability and relative risk profile from the Cox proportional hazard regression model, using the standard equation for survival data with censored observations (20):

$$F(t) = 1 - S_0(t)^{\exp\left(\sum_{j=1}^{p} b_j (X_j - M_j)\right)}$$

where $F(t)$ denotes the probability of developing cancer in $t$ years; $S_0(t)$ is the baseline survival function; $b_j$ is the regression coefficient for the $j$th variable ($X_j$); $M_j$ denotes the mean level of $X_j$; and $p$ is the number of variables.
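As a worked illustration of this calculation, the sketch below assumes the usual Cox-model form, F(t) = 1 − S0(t)^exp(Σ b_j (X_j − M_j)), built from the quantities defined above. The function name `absolute_risk` and all numeric inputs are hypothetical; they are not estimates from this study.

```python
import math


def absolute_risk(s0_t, coefs, x, means):
    """Probability of developing the event within t years.

    s0_t  : baseline survival S0(t) at the horizon t
    coefs : regression coefficients b_j (log hazard ratios)
    x     : the individual's covariate values X_j
    means : cohort mean levels M_j of each covariate
    """
    linear_predictor = sum(b * (xi - m) for b, xi, m in zip(coefs, x, means))
    return 1.0 - s0_t ** math.exp(linear_predictor)


# Hypothetical inputs: one binary risk factor with hazard ratio 2,
# a 10-year baseline survival of 0.99, and a cohort mean of 0.
risk = absolute_risk(0.99, [math.log(2.0)], [1.0], [0.0])
print(round(risk, 4))  # 1 - 0.99**2 = 0.0199
```

An individual whose covariates all equal the cohort means has a linear predictor of zero, so the absolute risk reduces to 1 − S0(t).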
We derived risk scores for each statistically significant predictor based on regression coefficients in the Cox proportional hazards regression model following the reported procedures (21). In all models, reference level of a particular risk factor received a risk score of zero. For a particular risk factor, risk score was assigned as integer points to each risk level and calculated as a weighted distance from each level to the reference level of that particular risk factor.
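One common way to turn Cox coefficients into integer points is to divide each level's log hazard ratio (its distance from the reference level on the coefficient scale) by a fixed constant and round. The sketch below illustrates that idea under stated assumptions: the function name `risk_points` and the constant `POINT_UNIT` are hypothetical, the exact weighting used in the reported procedure (21) may differ, and the resulting points are not expected to reproduce Supplementary Table 1.

```python
import math

POINT_UNIT = 0.28  # hypothetical: log-hazard increment worth one point


def risk_points(hazard_ratio: float, point_unit: float = POINT_UNIT) -> int:
    """Integer points for a risk level, relative to its reference level.

    The reference level has HR = 1, so log(HR) = 0 and it scores zero.
    """
    return round(math.log(hazard_ratio) / point_unit)


# Hazard ratios for AST bands from model 2 in Table 1 (reference first).
ast_hazard_ratios = {"<25": 1.00, "25-39": 3.93, "40-59": 14.58, ">=60": 39.58}
ast_points = {level: risk_points(hr) for level, hr in ast_hazard_ratios.items()}
print(ast_points)
```

An individual's total score is then the sum of the points over all risk factors in the chosen model, and the total can be mapped to an absolute risk.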
All statistical tests were two-sided, and all P-values less than .05 were considered statistically significant. Statistical analyses and modeling were performed using Stata 10.0 (StataCorp, College Station, TX) and SAS 9.2 (SAS Institute Inc, Cary, NC).
Results
Risk of Hepatocellular Carcinoma in the Subcohort Without HCV Test
Among 298 051 subjects in the subcohort without HCV test, 1252 incident hepatocellular carcinomas occurred. We used data from this subcohort to develop four risk prediction models. In all models, male sex and older age (40–59 and ≥60 years) were statistically significantly associated with increased risk of hepatocellular carcinoma (Table 1). We present risk factors other than sex and age in the subsequent text.
Risk predictors that were statistically significantly associated with increased or decreased risks of hepatocellular carcinoma in multivariable models are shown in Table 1. In model 1 (health history only), statistically significantly increased risks were associated with smoking (1–9.9 vs 0 pack-years, HR = 1.35, 95% CI = 1.13 to 1.61; ≥10 vs 0 pack-years, HR = 1.32, 95% CI = 1.13 to 1.55), regular alcohol drinking (regular [consumed ≥2 drinks/day on ≥3 days/week] vs none or occasional [consumed <2 drinks/day on <3 days/week], HR = 1.56, 95% CI = 1.36 to 1.79), and diagnosis with diabetes mellitus type 2 (yes vs no, HR = 1.66, 95% CI = 1.38 to 1.99). Physical activity was associated with reduced risks of hepatocellular carcinoma (low-active [3.75–7.49 MET-hour] vs inactive [<3.75 MET-hour], HR = 0.82, 95% CI = 0.70 to 0.95; active [≥7.5 MET-hour] vs inactive [<3.75 MET-hour], HR = 0.74, 95% CI = 0.65 to 0.84). In model 2 (transaminase only), increasing levels of AST at or above 25 IU/L were associated with increasing cancer risk (25–39 vs <25 IU/L, HR = 3.93, 95% CI = 3.21 to 4.83; 40–59 vs <25 IU/L, HR = 14.58, 95% CI = 11.50 to 18.40; ≥60 vs <25 IU/L, HR = 39.58, 95% CI = 31.60 to 49.60), and ALT levels at or above 25 IU/L were also associated with statistically significantly increased risks (≥25 vs <25 IU/L, HR = 1.93, 95% CI = 1.71 to 2.19). In model 3 (health history and transaminase), the above variables remained statistically significantly associated with risk, except physical activity, where the association became non-statistically significant. In model 4 (health history, transaminase, HBV, and AFP), in addition to the variables in model 3, HBV+ status was associated with an increased risk (positive vs negative for HBsAg, HR = 3.40, 95% CI = 3.00 to 3.84).
Increasing AFP levels at or above 2.5 ng/mL were associated with increasing risk of hepatocellular carcinoma (2.5–4.9 vs <2.5 ng/mL, HR = 1.56, 95% CI = 1.30 to 1.87; 5.0–9.9 vs <2.5 ng/mL, HR = 4.29, 95% CI = 3.52 to 5.22; ≥10 ng/mL vs <2.5 ng/mL, HR = 15.20, 95% CI = 12.30 to 18.90).
Risk of Hepatocellular Carcinoma in the Subcohort With HCV Test
Among 130 533 subjects in the subcohort with HCV test, 416 incident hepatocellular carcinomas occurred. In addition to the four risk prediction models described above in the subcohort without HCV test, we developed a fifth model where the HCV status was included (Table 2). In model 1, statistically significantly increased risks were associated with smoking (≥10 vs 0 pack-years, HR = 1.41, 95% CI = 1.06 to 1.87), regular alcohol drinking (regular [consumed ≥2 drinks/day on ≥3 days/week] vs none or occasional [consumed <2 drinks/day on <3 days/week], HR = 1.60, 95% CI = 1.24 to 2.06), and diabetes mellitus type 2 (yes vs no, HR = 1.70, 95% CI = 1.16 to 2.49). Being physically active was associated with reduced risks (ie, active [≥7.5 MET-hour] vs inactive [<3.75 MET-hour], HR = 0.69, 95% CI = 0.54 to 0.89). In model 2, similar to the subcohort without HCV test, increasing levels of AST at or above 25 IU/L were associated with statistically significantly increasing cancer risk (25–39 vs <25 IU/L, HR = 4.72, 95% CI = 3.10 to 7.17; 40–59 vs <25 IU/L, HR = 13.48, 95% CI = 8.46 to 21.50; ≥60 vs <25 IU/L, HR = 43.00, 95% CI = 27.80 to 66.60). In model 3, all above variables were associated with a statistically significantly increased risk, except for ALT, and physical activity was not associated with a statistically significantly decreased risk. In model 4, HBV+ status was associated with a statistically significantly increased risk (HBV+ vs HBV−, HR = 4.04, 95% CI = 3.24 to 5.04). Again, increasing AFP level at or above 2.5 ng/mL was associated with statistically significantly increasing risk (2.5–4.9 vs <2.5 ng/mL, HR = 1.72, 95% CI = 1.20 to 2.44; 5.0–9.9 vs <2.5 ng/mL, HR = 4.96, 95% CI = 3.39 to 7.23; ≥10 ng/mL vs <2.5 ng/mL, HR = 19.10, 95% CI = 12.90 to 28.20).
In model 5 (health history, transaminase, AFP, HBV, and HCV), all variables shown in model 4 above were still associated with statistically significantly increased risk, and so was HCV positivity (HCV+ vs HCV−, HR = 3.98, 95% CI = 3.02 to 5.25).
The Goodness of Fit of the Risk Prediction Models
We assessed the goodness of fit of the models in terms of discriminatory accuracy and calibration in an internal validation. We evaluated discriminatory accuracy, which measures how well a prediction model distinguishes at the individual level between those who will develop disease and those who will not develop disease, by calculating the C-index and AUC in all five models. In the subcohort without an HCV test, we examined the C-index, based on 10-year prediction, in the full, training, and validation datasets (Table 3). Compared with model 1, a statistically significant increase in C-index was noted in model 2 (model 1 vs model 2: full dataset, C-index = 0.810 [95% CI = 0.799 to 0.822] vs 0.903 [95% CI = 0.887 to 0.912], P < .05; training dataset, C-index = 0.813 [95% CI = 0.802 to 0.829] vs 0.904 [95% CI = 0.896 to 0.921], P < .05; validation dataset, C-index = 0.809 [95% CI = 0.798 to 0.820] vs 0.902 [95% CI = 0.892 to 0.916], P < .05). There was no appreciable increase in C-index when adding health history to transaminase only in model 3, but addition of HBV status in model 4 increased the C-index values compared with models 1, 2, and 3. In the subcohort with HCV test, similar results were observed in all four models. Addition of HCV status in model 5 only slightly increased C-index values (Table 3).
Table 3.
10-year risk prediction | C-index (95% CI) | |
---|---|---|
Subcohort without HCV test | Subcohort with HCV test | |
Model 1—Health history | ||
Full dataset | 0.810 (0.799 to 0.822) | 0.798 (0.772 to 0.815) |
Training set | 0.813 (0.802 to 0.829) | 0.783 (0.764 to 0.801) |
Validation set | 0.809 (0.798 to 0.820) | 0.817 (0.797 to 0.836) |
Model 2—Transaminase only | ||
Full dataset | 0.903 (0.887 to 0.912) | 0.914 (0.901 to 0.937) |
Training set | 0.904 (0.896 to 0.921) | 0.915 (0.903 to 0.935) |
Validation set | 0.902 (0.892 to 0.916) | 0.917 (0.903 to 0.938) |
Model 3—Transaminase and health history | ||
Full dataset | 0.904 (0.892 to 0.922) | 0.915 (0.903 to 0.936) |
Training set | 0.905 (0.894 to 0.926) | 0.916 (0.905 to 0.939) |
Validation set | 0.903 (0.889 to 0.917) | 0.915 (0.904 to 0.936) |
Model 4—Transaminase, health history, AFP, and HBV | ||
Full dataset | 0.923 (0.910 to 0.939) | 0.936 (0.913 to 0.958) |
Training set | 0.925 (0.913 to 0.942) | 0.936 (0.911 to 0.960) |
Validation set | 0.922 (0.911 to 0.941) | 0.941 (0.916 to 0.964) |
Model 5—Transaminase, health history, AFP, HBV, and HCV | ||
Full dataset | — | 0.941 (0.918 to 0.967) |
Training set | — | 0.942 (0.920 to 0.970) |
Validation set | — | 0.945 (0.928 to 0.975) |
*The full dataset was split in half to form the training and validation sets. The C-index (a measure of model discriminatory accuracy) of each model was calculated to evaluate model goodness of fit in the full dataset, the training set, and the validation set. C-index = concordance index; CI = confidence interval; — = not applicable; AFP = alfa-fetoprotein; HBV = hepatitis B virus; HCV = hepatitis C virus.
AUC was used to illustrate the changes in model discriminatory accuracy by comparing different models (Figure 1, A). In the subcohort without an HCV test, the AUC of model 1 was 0.807 (95% CI = 0.804 to 0.811), based on health history data, which increased to 0.900 (95% CI = 0.894 to 0.906) in model 2, based on transaminase data, and 0.918 (95% CI = 0.910 to 0.928) in model 4 with HBV status data. In the subcohort with an HCV test, the AUC in model 1 was 0.793 (95% CI = 0.790 to 0.802), which increased to 0.912 (95% CI = 0.909 to 0.915) in model 2, 0.927 (95% CI = 0.918 to 0.945) in model 4, and 0.933 (95% CI = 0.929 to 0.949) in model 5 with HCV status data (Figure 1, B). It was of interest to note that, among models with individual (single) tests for HBV, HCV, AFP, and transaminases only, the “transaminase only” model (model 2) had the highest AUC compared with all other models based on single tests (transaminase only, AUC = 0.912 [95% CI = 0.909 to 0.915]; HBV only, AUC = 0.840 [95% CI = 0.833 to 0.848]; HCV only, AUC = 0.841 [95% CI = 0.834 to 0.847]; AFP only, AUC = 0.871 [95% CI = 0.862 to 0.886]) (Supplementary Figure 1, available online).
Model calibration was assessed with a 10-fold cross-validated calibration plot, and internal validity was assessed by determining the extent to which predicted events agreed with observed events over 10 years. The cross-validated calibration plot by risk deciles showed excellent agreement between observed and predicted events in the 10-year timeframe across models, in both the subcohort without an HCV test (Figure 2, A) and the subcohort with an HCV test (Figure 2, B). As shown in Figure 2, all observed probabilities of cancer development were within the 95% confidence intervals of the predicted probabilities, indicating excellent calibration of the models.
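The decile comparison behind such a calibration plot can be sketched as follows. This is an illustrative simplification: it treats the outcome as a plain 0/1 indicator, omits cross-validation and censoring, and uses synthetic data. The function name `calibration_by_decile` is hypothetical.

```python
def calibration_by_decile(predicted, observed, n_groups=10):
    """Return (mean predicted risk, observed event fraction) per risk group.

    Subjects are sorted by predicted risk and cut into n_groups groups of
    (nearly) equal size; a well-calibrated model gives similar values
    within each pair.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer subjects than groups
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_frac = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, obs_frac))
    return rows


# Synthetic data for illustration only: 100 subjects, risk rising by decile.
predicted = [i / 100 for i in range(100)]
observed = [1 if i % 10 < i // 10 else 0 for i in range(100)]
rows = calibration_by_decile(predicted, observed)
for mean_pred, obs_frac in rows:
    print(f"predicted {mean_pred:.3f}  observed {obs_frac:.3f}")
```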
Risk Score Assignments and Absolute Risk of Hepatocellular Carcinoma
Risk scores were assigned to each risk factor in each model based on the strength of the association conferred by that risk factor (Supplementary Table 1, available online). In all models, the reference level of a risk factor received a risk score of zero, and the higher the risk a factor level conferred, the higher the score assigned to it. For example, in model 2 (transaminase only), the reference level of AST (<25 IU/L) was assigned a risk score of 0, and increasing scores were assigned to increasing levels of AST (25–39 IU/L, score = 5; 40–59 IU/L, score = 9; ≥60 IU/L, score = 13) (Supplementary Table 1, available online).
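The AST example above can be written as a small lookup. This sketch covers only the AST component of model 2 as quoted in the text; the scores for the other risk factors are in Supplementary Table 1 and are not reproduced here. The function name `ast_risk_score` is hypothetical, and treating non-integer values between 39 and 40 IU/L as belonging to the 25–39 band is an assumption.

```python
def ast_risk_score(ast_iu_per_l):
    """Risk score for the AST component of model 2, using the bands in the text."""
    if ast_iu_per_l < 25:
        return 0   # reference level
    if ast_iu_per_l < 40:
        return 5   # 25-39 IU/L
    if ast_iu_per_l < 60:
        return 9   # 40-59 IU/L
    return 13      # 60 IU/L or higher


for value in (20, 30, 45, 60):
    print(value, ast_risk_score(value))
```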
The probability of developing hepatocellular carcinoma (ie, absolute risk) in 5 and 10 years, as a function of increasing risk score, in all models in the two subcohorts is shown in Supplementary Figure 2 (available online).
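For orientation, absolute risk under a Cox proportional hazards model is usually written in the textbook form below. This is an assumption about the general form only: the baseline survival function S_0(t), the coefficients β_i, and the covariate means are not given in this section, so the expression cannot be evaluated from the text alone.

```latex
\[
\hat{P}(t \mid \mathbf{x})
  = 1 - S_0(t)^{\exp\left(\sum_i \beta_i x_i - \sum_i \beta_i \bar{x}_i\right)}
\]
% A consequence of this form: for any two horizons, the ratio
\[
\frac{\ln\bigl(1 - \hat{P}(10 \mid \mathbf{x})\bigr)}
     {\ln\bigl(1 - \hat{P}(5 \mid \mathbf{x})\bigr)}
  = \frac{\ln S_0(10)}{\ln S_0(5)}
\]
% does not depend on the individual's covariates x.
```

The predicted probabilities for Examples 3 and 4 in Table 4 are in line with this property: ln(1 − 0.971)/ln(1 − 0.770) and ln(1 − 0.758)/ln(1 − 0.446) both come to about 2.4.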
Application of Risk Score and Prediction Power of Transaminases
We applied the models to predict the probability of developing hepatocellular carcinoma in 5 years and 10 years using eight hypothetical examples (Table 4). These examples are representative of individuals from the general population with a range of risk profiles. For example, a 60-year-old man with abnormal transaminase levels (AST = 60 IU/L and ALT = 30 IU/L), without considering any other risk factors, as in Example 1, would have a hepatocellular carcinoma risk of 7.3% (95% CI = 6.5% to 8.5%) and 15.5% (95% CI = 13.6% to 17.3%) in 5 and 10 years, respectively, according to model 2. The same individual in Example 2, when positive for HBV, would have a 21.4% (95% CI = 17.5% to 23.8%) and 38.2% (95% CI = 34.1% to 42.0%) risk of cancer development in 5 and 10 years, respectively, according to model 4. When this individual is also positive for HCV, as in Example 3, his risk would increase to 77.0% (95% CI = 69.1% to 82.8%) and 97.1% (95% CI = 94.7% to 98.4%) in 5 years and 10 years, respectively, according to model 5. This high risk could be substantially attenuated to 44.6% (95% CI = 37.7% to 50.8%) and 75.8% (95% CI = 69.4% to 80.9%), respectively, as in Example 4, if lifestyle risks were modified by smoking and drinking cessation, physical activity, and improved diabetes management. In Examples 5 and 6, we compared an individual who was HBV+ with normal transaminase levels to someone who was HBV– with abnormal transaminase levels. The HBV– individual with abnormal transaminase levels had a much higher hepatocellular carcinoma risk than the individual with HBV positivity alone (5-year absolute risk = 5.1% [95% CI = 4.1% to 6.1%] vs 0.1% [95% CI = 0.1% to 0.2%]; 10-year absolute risk = 11.8% [95% CI = 9.9% to 13.6%] vs 0.3% [95% CI = 0.2% to 0.3%]). Example 7 shows the substantial benefit of lifestyle modification for the high-risk individual in Example 6, which would reduce his cancer risk by more than 50%. Example 8, an individual with HCV+ status, would have a risk similar to that of the HBV+ individual in Example 5, as long as transaminase levels were normal.
Table 4.
Risk factors | Example 1† | Example 2 | Example 3 | Example 4 | Example 5 | Example 6 | Example 7 | Example 8 |
---|---|---|---|---|---|---|---|---|
Profile | Transaminase only | HBV+ | HBV+ and HCV+ | Lifestyle improved, HBV+, and HCV+ | Normal transaminase but HBV+ | Abnormal transaminase but HBV– | Lifestyle improved, abnormal transaminase but HBV– | Normal transaminase but HCV+ |
Age, y | 60 | 60 | 60 | 60 | 60 | 60 | 60 | 60 |
Sex | Male | Male | Male | Male | Male | Male | Male | Male |
AST, IU/L | 60 | 60 | 60 | 60 | 20 | 60 | 60 | 20 |
ALT, IU/L | 30 | 30 | 30 | 30 | 20 | 30 | 30 | 20 |
Smoking, pack-years | — | 10 | 10 | 0 | 10 | 10 | 0 | 10 |
Regular drinking | — | Yes | Yes | No | Yes | Yes | No | Yes |
Physically active | — | No | No | Yes | No | No | Yes | No |
Diabetes | — | Yes | Yes | No | Yes | Yes | No | Yes |
AFP, ng/mL | — | 7 | 10 | 10 | 2 | 10 | 10 | 2 |
HBV status | — | + | + | + | + | – | – | – |
HCV status | — | — | + | + | – | – | – | + |
Predicted probability in 5 y (95% CI), % | 7.3 (6.5 to 8.5) | 21.4 (17.5 to 23.8) | 77.0 (69.1 to 82.8) | 44.6 (37.7 to 50.8) | 0.1 (0.1 to 0.2) | 5.1 (4.1 to 6.1) | 2.1 (1.7 to 4.1) | 0.1 (0.1 to 0.1) |
Predicted probability in 10 y (95% CI), % | 15.5 (13.6 to 17.3) | 38.2 (34.1 to 42.0) | 97.1 (94.7 to 98.4) | 75.8 (69.4 to 80.9) | 0.3 (0.2 to 0.3) | 11.8 (9.9 to 13.6) | 4.9 (2.5 to 5.7) | 0.2 (0.2 to 0.3) |
*The models were applied to calculate the probability of developing hepatocellular carcinoma in eight hypothetical individuals with different risk profiles (ie, different AST and ALT levels; lifestyle factors such as smoking, drinking, and physical activity; and hepatitis B and C status) who represented the risk profiles in the general population. Lifestyle improved means quitting smoking, stopping drinking, or becoming physically active. Predicted probability was calculated from absolute risk. AST or ALT levels of 25 IU/L or higher were considered abnormal. AST = serum aspartate transaminase; ALT = serum alanine transaminase; AFP = alfa-fetoprotein; HBV = hepatitis B virus; HCV = hepatitis C virus; CI = confidence interval; — = not applicable.
†This individual only had AST and ALT measured. No data were available on health history variables or HBV or HCV status, so the values of these variables were left blank.
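The size of the lifestyle effect in Table 4 can be checked with simple arithmetic on the point estimates. The sketch below computes the relative reduction in predicted risk between paired examples; it uses only the predicted probabilities in the table and ignores their confidence intervals. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before_pct, after_pct):
    """Relative reduction (in %) when risk falls from before_pct to after_pct."""
    return 100.0 * (before_pct - after_pct) / before_pct


# Point estimates from Table 4: (5-year risk, 10-year risk), in percent.
example_3, example_4 = (77.0, 97.1), (44.6, 75.8)   # HBV+ and HCV+
example_6, example_7 = (5.1, 11.8), (2.1, 4.9)      # HBV- with abnormal transaminases

for label, before, after in [
    ("Example 3 to 4", example_3, example_4),
    ("Example 6 to 7", example_6, example_7),
]:
    five_year = relative_reduction(before[0], after[0])
    ten_year = relative_reduction(before[1], after[1])
    print(f"{label}: 5-y {five_year:.1f}%, 10-y {ten_year:.1f}%")
```

For Examples 6 and 7 the relative reduction is just under 59% at both horizons, consistent with the "more than 50%" stated in the text.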
We also analyzed the distribution of hepatocellular carcinoma incidence by transaminase levels and by HBV and HCV status (Table 5). Among 130 543 subjects in the subcohort with HCV testing, 109 029 (83.5%) were negative for both HCV and HBV. Of the 416 subjects who developed hepatocellular carcinoma during an average follow-up of 8.5 years, 66 (15.7%) were negative for both HCV and HBV. Subjects with AST and ALT levels below 25 IU/L constituted 61.1% (79 762 of 130 543) of the HCV-tested subcohort, and 7% (29 of 416) of subjects who experienced incident cancers had AST and ALT levels below 25 IU/L. Incident hepatocellular carcinomas were detected in 19 individuals with both HBV+ and HCV+ status (4.6% of the 416 cases). Among those positive for HBV or HCV, only 37.2% (48 562 of 130 543) and 33.4% (43 601 of 130 543) of subjects, respectively, were aware of their carrier status (Table 5).
Table 5.
Transaminase level | Total, No. (%) | HCV– and HBV–, No. (%) | HCV– and HBV+, No. (%) | HCV+ and HBV–, No. (%) | HCV+ and HBV+, No. (%) |
---|---|---|---|---|---|
All subjects in the HCV subcohort | 130 543 (100) | 109 029 (83.5) | 18 155 (13.9) | 3054 (2.3) | 305 (0.2) |
AST, IU/L | |||||
<25 | 98 744 (75.6) | 86 352 (66.1) | 11 512 (8.8) | 798 (0.6) | 82 (0.1) |
25–39 | 24 223 (18.6) | 18 178 (13.9) | 4986 (3.8) | 944 (0.7) | 115 (0.1) |
40–59 | 4762 (3.6) | 3169 (2.4) | 965 (0.7) | 568 (0.4) | 60 (0.0) |
≥60 | 2814 (2.2) | 1330 (1.0) | 692 (0.5) | 744 (0.6) | 48 (0.0) |
ALT, IU/L | |||||
<25 | 83 532 (64.0) | 74 382 (57.0) | 8421 (6.5) | 657 (0.5) | 72 (0.1) |
25–39 | 25 620 (19.6) | 19 564 (15.0) | 5305 (4.1) | 668 (0.5) | 83 (0.1) |
40–59 | 11 807 (9.0) | 8725 (6.7) | 2465 (1.9) | 564 (0.4) | 53 (0.0) |
≥60 | 9584 (7.3) | 6358 (4.9) | 1964 (1.5) | 1165 (0.9) | 97 (0.1) |
AST and ALT, <25 IU/L | 79 762 (61.1) | 71 276 (54.6) | 7833 (6.0) | 522 (0.4) | 131 (0.1) |
Subjects in the HCV subcohort with hepatocellular carcinoma | 416 (100) | 66 (15.7) | 181 (43.6) | 150 (36.1) | 19 (4.6) |
AST, IU/L | |||||
<25 | 53 (12.8) | 26 (6.3) | 25 (6.0) | 2 (0.5) | 0 (0.0) |
25–39 | 121 (29.1) | 14 (3.4) | 71 (17.1) | 29 (7.0) | 7 (1.7) |
40–59 | 79 (19.0) | 7 (1.7) | 42 (10.1) | 26 (6.3) | 4 (1.0) |
≥60 | 163 (39.2) | 19 (4.6) | 43 (10.4) | 93 (22.4) | 8 (1.9) |
ALT, IU/L | |||||
<25 | 45 (10.8) | 19 (4.6) | 20 (4.8) | 3 (0.7) | 3 (0.7) |
25–39 | 88 (21.2) | 16 (3.8) | 57 (13.7) | 13 (3.1) | 2 (0.5) |
40–59 | 82 (20.2) | 11 (2.6) | 39 (9.4) | 30 (7.2) | 4 (1.0) |
≥60 | 199 (47.8) | 20 (4.8) | 65 (15.7) | 104 (25.0) | 10 (2.4) |
AST and ALT, <25 IU/L | 29 (7.0) | 18 (4.3) | 10 (2.4) | 1 (0.2) | 0 (0.0) |
Awareness of hepatitis status | 11 618 (8.9) | 4569 (3.5) | 48 562 (37.2) | 43 601 (33.4) | 56 003 (42.9) |
*We assessed the prevalence of HBV and HCV infection at different levels of serum AST and ALT. A total of 416 subjects developed hepatocellular carcinoma in this subcohort during an average follow-up of 8.5 y. The cutoff points for AST and ALT were chosen by setting the reference group at different starting points (eg, <10, <15, <20 IU/L, etc.) and examining the risk at each additional 5 IU/L through multiple iterations. Carrier awareness information was based on questionnaire responses. AST = serum aspartate transaminase; ALT = serum alanine transaminase; HBV = hepatitis B virus; HCV = hepatitis C virus.
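To compare the four hepatitis-status groups in Table 5 on a common scale, the case counts can be divided by the group sizes. The sketch below does this with the table's counts. The result is a crude cumulative incidence over the follow-up period, not a person-year rate or an adjusted estimate, and the function name `crude_incidence_pct` is hypothetical.

```python
# (incident cases, subjects) per group, taken from Table 5.
groups = {
    "HCV- and HBV-": (66, 109_029),
    "HCV- and HBV+": (181, 18_155),
    "HCV+ and HBV-": (150, 3_054),
    "HCV+ and HBV+": (19, 305),
}


def crude_incidence_pct(cases, subjects):
    """Crude cumulative incidence in percent (no adjustment for follow-up time)."""
    return 100.0 * cases / subjects


crude = {
    name: round(crude_incidence_pct(cases, subjects), 2)
    for name, (cases, subjects) in groups.items()
}
for name, pct in crude.items():
    print(f"{name}: {pct}%")
```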
Discussion
To date, published prediction tools are only available for high-risk chronic HBV or HCV carriers. In this study, we developed prediction models for hepatocellular carcinoma based on data routinely collected in a typical office visit, with the goal of providing a simple, efficient, and widely available tool to identify and quantify cancer risk in the average-risk population. Our results showed that, among models based on a single test, the model with transaminases alone was best able to predict hepatocellular carcinoma. Because this model is able to predict hepatocellular carcinoma risk with high accuracy without knowledge of HBV or HCV status, it has great potential to be translated into clinical use for the general public.
HBV and HCV are well-known risk factors for hepatocellular carcinoma, but in this study, transaminase (AST or ALT) levels of 25 IU/L or higher were found to be independent risk factors for hepatocellular carcinoma with a linear dose–response trend. The stepwise prediction models, which begin with testing for AST or ALT, are simple for clinicians to implement in their daily practice. When routinely collected AST exceeded 25 IU/L, the risk of hepatocellular carcinoma increased exponentially with increasing concentrations of AST. Because of the increased risks associated with AST or ALT concentrations of 25 IU/L or higher, such a finding should trigger further testing for HBV, HCV, or AFP to yield a more complete picture. The model using transaminases alone had high predictive power, with an AUC value of 0.912, which was statistically significantly better than that of models testing for HBV, HCV, or AFP alone (see Supplementary Figure 1, available online). On the other hand, subjects with AST or ALT concentrations less than 25 IU/L, considered normal in this study, can be spared further testing. Although this “normal” group contributed 7% of all incident hepatocellular carcinomas in this cohort, it made up nearly two-thirds (61%) of the general adult population. The risk level in this group, approximately four cases per 100 000 person-years, is sufficiently low that the group may not be accorded priority for further testing, because two-thirds of its members, if pursued, would be found to be HBV or HCV negative. At a cutoff of 25 IU/L or greater, the false-negative rate for AST was 12.7%. This is small compared with the false-negative rate for HBV (51.7%) or HCV (59.1%), which matters because the model should not miss those at high risk. Although the false-positive rate for AST (24.2%) was higher than the corresponding rate for HBV (14.3%) or HCV (2.5%), this difference is not of much practical importance, because a positive AST result will be followed up with an HBV or HCV test in this model.
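The false-negative and false-positive rates quoted for AST can be reproduced from the counts in Table 5. The sketch below treats AST of 25 IU/L or higher as a positive screen; the function name `screen_error_rates` and the dictionary layout are illustrative choices, and only the AST rows of the table are used.

```python
# Counts from Table 5 (HCV-tested subcohort), by AST band in IU/L.
subjects_by_ast = {"<25": 98_744, "25-39": 24_223, "40-59": 4_762, ">=60": 2_814}
cases_by_ast = {"<25": 53, "25-39": 121, "40-59": 79, ">=60": 163}


def screen_error_rates(subjects, cases, negative_band="<25"):
    """Return (false-negative %, false-positive %) for a screen that is
    negative in negative_band and positive in every other band."""
    n_subjects = sum(subjects.values())
    n_cases = sum(cases.values())
    n_noncases = n_subjects - n_cases
    missed_cases = cases[negative_band]                       # cases screened negative
    flagged = n_subjects - subjects[negative_band]            # everyone screened positive
    flagged_noncases = flagged - (n_cases - missed_cases)     # positives without cancer
    return 100.0 * missed_cases / n_cases, 100.0 * flagged_noncases / n_noncases


fn_pct, fp_pct = screen_error_rates(subjects_by_ast, cases_by_ast)
print(f"AST false-negative rate: {fn_pct:.1f}%")
print(f"AST false-positive rate: {fp_pct:.1f}%")
```

Here the false-negative rate is the share of the 416 incident cancers with AST below 25 IU/L, and the false-positive rate is the share of subjects without cancer who had AST of 25 IU/L or higher.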
To our knowledge, this is the first model that assesses the hepatocellular carcinoma risk of apparently healthy individuals visiting their primary care physician’s office for a health checkup. In the literature, prediction tools are only available for high-risk individuals, such as known chronic HBV carriers (7–9) or chronic HCV carriers (10,11). Models for these high-risk individuals require more clinical data and more sophisticated data to estimate risks; none of them address the average- or unknown-risk general population. A model for subjects at average or unknown risk is more valuable than one for subjects at high risk, because detailed clinical data are readily available for the high-risk individuals but much less available for those at average or unknown risk. In this large cohort, two-thirds of those with positive HBV or HCV status, both high-risk groups that available models attempt to target, were not aware that they were carriers (Table 5); thus, existing prediction models were of little use for these groups.
The versatility of the prediction model we present is highlighted by the fact that it is useful for both the average-risk general public and high-risk individuals. None of the currently available models could assess HBV– and HCV– subjects, a large group contributing 30–40% of liver cancer cases reported in western populations (6). In Taiwan, more than 1000 new hepatocellular carcinomas could be estimated to occur annually in HBV– and HCV– individuals (15). This group, constituting 83.5% of our cohort, is commonly overlooked clinically (3), and yet an individual in this group can have a 10-year absolute risk as high as 11.8%, based on our prediction model (Example 6 in Table 4). Existing models, moreover, were not able to accurately assess and may, therefore, underestimate the risk of cancer in patients who are both HBV+ and HCV+, a group contributing nearly 5% of all incident cancers in our study. Subjects in this group could have a 10-year risk as high as 97.1%, as estimated by our model in Example 3. Thus, the simple tool we present here has much wider applicability than other models currently available. More importantly, this versatility is accomplished with high efficiency, by relying on common clinical data that are often readily available. That AST and ALT values alone yielded an AUC as high as 0.912 was remarkable. These tests are inexpensive and routinely collected in daily medical practice but have not been put into use for risk prediction in average-risk settings. Adoption of this model can make prediction of hepatocellular carcinoma a routine clinical activity among subjects in any category of risk. Models 4 and 5 showed the independent effects of AST or ALT on liver cancer, even in the presence of HBV or HCV, but the effect was appreciably attenuated from model 3. For example, the hazard ratio for AST concentration of 40–59 IU/L was 13.93 in model 3 but was reduced to 8.59 in model 4 when HBV status was known. It was further reduced to 6.31 in model 5 when both HBV and HCV status were considered (Table 2). In other words, HBV or HCV status affected the predictive ability of the transaminases, indicating that the two were somewhat associated, but transaminase level remained a major statistically significant predictor even in the presence of hepatitis carrier status.
Although this model was mainly intended to predict cancer incidence, we found that it was also able to predict mortality from liver-related diseases (hepatocellular carcinoma and liver cirrhosis) with similar precision (AUC = 0.93; data not shown). This finding is an added benefit, and the identical results obtained (data not shown) provided additional confidence in the validity of the model.
Our prediction model is valuable for risk identification and risk communication. Subjects at high risk of cancer should be properly informed in a timely manner of their relative and absolute risks. However, the model has value well beyond risk communication because it is also able to estimate the potential for risk mitigation associated with healthy lifestyle choices. Our model identified four statistically significant risk factors from health history, namely smoking, drinking, physical inactivity, and diabetes, all of which carry biologically relevant and statistically significant risks, as previously reported (13,22,23). Our data show that liver cancer risk can potentially be reduced by 33–50% through modification of one or more of these four risk factors. Counseling for lifestyle changes would add important value to the risk prediction process. These benefits can be seen not only in high-risk individuals with positive HBV or HCV status (Examples 3 and 4 in Table 4 show a reduction from 77% to 45% in the 5-year risk) but also in average-risk individuals negative for HBV or HCV (Examples 6 and 7 show a reduction from 5.1% to 2.1%). Furthermore, in addition to reducing risk for liver cancer, modifying and eliminating these lifestyle behaviors can have a major impact on all-cause mortality (12,13), a fact often neglected by clinicians. Reducing all-cause mortality is clearly as important as reducing liver cancer. Thus, a further valuable aspect of this simple prediction model is that it may help to educate and motivate an individual at high risk to pursue a wider range of risk reduction options. These options range from treating carriers with interferon-like therapy (6) to reducing cancer risk by eliminating associated risk factors to reducing all-cause mortality.
The high predictive power of AST or ALT levels of 25 IU/L or higher for hepatocellular carcinoma risk, although not a novel finding, has not been fully appreciated in the clinic. This finding arose from examining the cancer risk at every 5-unit interval across the entire AST and ALT distribution. The cutoff level of 25 IU/L is worth remembering, because it is substantially lower than the 40 IU/L commonly cited by laboratory standards as the upper limit of normal (ULN). Nearly one-fifth (17.8%) of the adult population in our cohort had an AST between 25 and 39 IU/L, a level commonly dismissed as high normal, and yet their liver cancer risk was increased by as much as 3.6-fold. Once AST exceeded 40 IU/L, the risk score given was as large as, or equivalent to, that of the HBV+ or HCV+ individuals in model 5. We also found that the predictive power of AST and ALT alone for liver cancer was higher than that of isolated HBV, HCV, or AFP (Supplementary Figure 1, available online). Finally, although abnormal ALT has been commonly recognized to play an important role, our data showed that the risk conferred by abnormal AST was higher than that conferred by abnormal ALT (6.3- to 8.3-fold increase vs 1.9-fold increase; Table 2, model 5).
This study has several limitations. First, although our models demonstrated an excellent level of goodness of fit and discriminatory ability, additional validation in external populations is advised. To some extent, the similar results in the two different subcohorts, each of which had a large sample size, provide internal validity to our findings, but external confirmation is needed. Second, our cohort was drawn from participants in a medical screening program who were of above-average socioeconomic status. This may limit the generalizability of our findings. Nevertheless, the hazard ratios developed in this study were internally standardized and free from selection bias. The sample sizes of the two subcohorts were sufficiently large, representing 3% of the total adult population in Taiwan. Most results derived from these two subcohorts, such as C-index or relative risks, were similar. Third, only AST or ALT data from the initial visit were used for the model, and temporal changes were not considered. Nevertheless, the predictive power of a single transaminase test was reinforced in this study, as in other studies involving this cohort (12,13).
In summary, models using transaminase data were best able to predict hepatocellular carcinoma risk, with AUC values of 0.90 or higher. Although HBV and HCV are well-known risk factors, AST or ALT concentrations of 25 IU/L or higher had independent and higher predictive power, even among subjects with unknown, HBV–, or HCV– status, and should trigger further testing. This simple tool for the general public more accurately assesses risk even among groups previously thought to be at low or average risk and may help to educate and motivate individuals to pursue options that reduce their risk of liver cancer and all-cause mortality.
Funding
This study was supported in part by Taiwan Department of Health Clinical Trial and Research Center of Excellence (grant number DOH101-TD-B-111-004 to CPW), The University of Texas MD Anderson Cancer Center Research Trust (to XW), and Center for Translational and Public Health Genomics, Duncan Family Institute for Cancer Prevention and Risk Assessment, The University of Texas MD Anderson Cancer Center (to XW).
Notes
The funders had no role, and the authors are responsible for the study design, data collection, data analysis, interpretation, writing of the manuscript, and the decision to submit the manuscript for publication.
References
- 1. Kuehn BM. Report: too little surveillance, treatment for US patients with hepatitis B and C. JAMA. 2010;303(8):713–714.
- 2. Hepatitis and Liver Cancer: A National Strategy for Prevention and Control of Hepatitis B and C. 2010. www.iom.edu/viralhepatitis
- 3. Blonski W, Kotlyar DS, Forde KA. Non-viral causes of hepatocellular carcinoma. World Journal of Gastroenterology. 2010;16(29):3603–3615.
- 4. Chen JG, Zhang SW. Liver cancer epidemic in China: past, present and future. Semin Cancer Biol. 2011;21(1):59–69.
- 5. U.S. Preventive Services Task Force. Screening for Hepatitis C in Adults. http://www.uspreventiveservicestaskforce.org/uspstf/uspshepc.htm Accessed July 26, 2012.
- 6. El-Serag HB. Current concepts hepatocellular carcinoma. N Engl J Med. 2011;365(12):1118–1127.
- 7. Yang HI, Sherman M, Su J, et al. Nomograms for risk of hepatocellular carcinoma in patients with chronic hepatitis B virus infection. J Clin Oncol. 2010;28(14):2437–2444.
- 8. Wong VWS, Chan SL, Mo F, et al. Clinical scoring system to predict hepatocellular carcinoma in chronic hepatitis B carriers. J Clin Oncol. 2010;28(10):1660–1665.
- 9. Yuen MF, Tanaka Y, Fong DYT, et al. Independent risk factors and predictive score for the development of hepatocellular carcinoma in chronic hepatitis B. J Hepatol. 2009;50(1):80–88.
- 10. Ikeda K, Arase Y, Saitoh S, et al. Prediction model of hepatocarcinogenesis for patients with hepatitis C virus-related cirrhosis. Validation with internal and external cohorts. J Hepatol. 2006;44(6):1089–1097.
- 11. Masuzaki R, Tateishi R, Yoshida H, et al. Prospective risk assessment for hepatocellular carcinoma development in patients with chronic hepatitis C by transient elastography. Hepatology. 2009;49(6):1954–1961.
- 12. Wen CP, Cheng TYD, Tsai MK, et al. All-cause mortality attributable to chronic kidney disease: a prospective cohort study based on 462 293 adults in Taiwan. Lancet. 2008;371(9631):2173–2182.
- 13. Wen CP, Wai JPM, Tsai MK, et al. Minimum amount of physical activity for reduced mortality and extended life expectancy: a prospective cohort study. Lancet. 2011;378(9798):1244–1253.
- 14. Chang MH, You SL, Chen CJ, et al. Decreased incidence of hepatocellular carcinoma in hepatitis B vaccinees: a 20-year follow-up study. J Natl Cancer Inst. 2009;101(19):1348–1355.
- 15. Department of Health. Statistics of causes of death. http://www.doh.gov.tw/EN2006/DM/DM2.aspx?now_fod_list_no=9256&class_no=390&level_no=2 Accessed July 26, 2012.
- 16. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81:515–526.
- 17. Prati D, Taioli E, Zanella A, et al. Updated definitions of healthy ranges for serum alanine aminotransferase levels. Ann Intern Med. 2002;137(1):1–10.
- 18. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56(2):337–344.
- 19. D’Agostino RD, Nam BH. Evaluation of performance of survival analysis models: discrimination and calibration measures. In: Balakrishnan N, Rao CR, eds. Handbook of Statistics. Vol 23. Amsterdam, the Netherlands: Elsevier; 2004:1–25.
- 20. Hosmer DW, Lemeshow S. Applied Survival Analysis: Regression Modeling of Time to Event Data. New York, NY: John Wiley & Sons, Inc; 1999.
- 21. Sullivan LM, Massaro JM, D’Agostino RB. Presentation of multivariate data for clinical use: the Framingham study risk score functions. Stat Med. 2004;23(10):1631–1660.
- 22. Kuper H, Tzonou A, Kaklamani E, et al. Tobacco smoking, alcohol consumption and their interaction in the causation of hepatocellular carcinoma. Int J Cancer. 2000;85(4):498–502.
- 23. El-Serag HB, Hampel H, Javadi F. The association between diabetes and hepatocellular carcinoma: a systematic review of epidemiologic evidence. Clin Gastroenterol Hepatol. 2006;4(3):369–380.