Author manuscript; available in PMC: 2018 Dec 28.
Published in final edited form as: Alcohol Clin Exp Res. 2017 Jul 28;41(9):1568–1573. doi: 10.1111/acer.13438

Interobserver Variability in Scoring Liver Biopsies with a Diagnosis of Alcoholic Hepatitis

Bela Horvath 1, Daniela Allende 1, Hao Xie 2, John Guirguis 3, Jennifer Jeung 1, James Lapinski 1, Deepa Patil 1, Arthur J McCullough 3, Srinivasan Dasarathy 3, Xiuli Liu 4
PMCID: PMC6309429  NIHMSID: NIHMS888826  PMID: 28654190

Abstract

Background:

Alcoholic hepatitis (AH) is one of the most severe forms of alcoholic liver disease. Recently, a histologic scoring system for predicting prognosis in this patient cohort was proposed as Alcoholic Hepatitis Histologic Score (AHHS). We aimed to assess interobserver variability in recognizing histologic features of AH, and the effect of this variability on the proposed AHHS categories.

Design:

Hematoxylin-eosin and trichrome stained slides from 32 patients diagnosed with AH, with liver biopsies obtained within 1 month of presentation (2000–2015), were reviewed by 5 pathologists (3 liver pathologists and 2 gastrointestinal (GI) pathologists) masked to the clinical findings and outcome. Histologic features of AH were assessed, the AHHS was calculated, and an AHHS category (mild, intermediate, severe) was assigned. Fleiss’ kappa analysis was performed to determine the interobserver agreement.

Results:

A slight-to-moderate level of interobserver agreement existed among the 5 reviewers on histopathological features of AH, with kappa values ranging from 0.20 (95% confidence interval (CI): 0.03–0.46; megamitochondria) to 0.52 (95% CI: 0.40–0.68; polymorphonuclear leukocyte (PMN) infiltration). There was only a fair level of agreement in assigning an AHHS category (kappa=0.33, 95% CI: 0.20–0.51). While overall fibrosis and neutrophilic inflammation were comparably evaluated by the 3 liver pathologists and the 2 GI pathologists, bilirubinostasis and megamitochondria were more consistently diagnosed by liver pathologists. Overall, 18 of 32 biopsies (56%) were uniformly assigned to an AHHS category by all 3 liver pathologists, with a kappa value of 0.40 (95% CI: 0.22–0.60).

Conclusion:

In general, features of AH can be recognized with a slight-to-moderate level of interobserver agreement, and there was fair interobserver agreement on assigning an AHHS category. The significant interobserver variability among pathologists revealed by the current study can limit the usefulness of the AHHS in everyday clinical practice.

Introduction

Alcoholic liver disease is likely to be the most common form of liver disease (Guirguis J et al., 2015). Alcoholic hepatitis (AH) is one of the most severe forms of alcohol-induced liver injury and is clinically characterized by jaundice and liver failure, with a high short-term mortality (Dominguez M et al., 2008). This high short-term mortality calls for more accurate patient characterization and modern targeted therapies in AH. Recent AASLD guidelines recommended that a liver biopsy be considered for patients with a clinical diagnosis of severe AH and for patients in whom reasonable uncertainty exists regarding the presence of underlying liver disease (O’Shea RS et al., 2010). Recently, a histologic scoring system, the Alcoholic Hepatitis Histologic Score (AHHS), was proposed as an objective histological measure to determine the prognosis of patients with AH. The AHHS is based on several histologic features: bridging fibrosis or cirrhosis, canalicular and/or ductular bilirubinostasis (with or without hepatocellular bilirubinostasis), megamitochondria, and severe neutrophil (PMN) infiltration. The final score is simplified into three categories: mild, 0–3 points; intermediate, 4–5 points; severe, 6–9 points. The AHHS is associated with the severity of AH and predicts the risk of death within 90 days in patients with AH (Altamirano J et al., 2014). Although expert liver pathologists performed the histologic assessment of liver specimens in that study, and 71 slide kits were cross-shared for histologic interpretation between the central pathologist and the other participating centers, the interobserver agreement was not reported. Furthermore, evaluation of liver biopsies by expert liver pathologists participating in a well-designed study is not part of routine clinical practice. Therefore, the aim of the present study is to examine the interobserver agreement on the histologic features of AH using a cohort of clinically and histologically proven AH cases retrieved from one institute and to examine the feasibility of the AHHS in a routine clinical setting.

Methods

After Institutional Review Board approval, patients diagnosed with AH or alcoholic cirrhosis were identified by screening the electronic medical records at the Cleveland Clinic between January 1, 2000 and February 28, 2015. Of the 801 patients identified on the screening evaluation, 739 (92.3%) had a liver biopsy. Patients were included only if the interval between the biopsy and the laboratory results was less than one month, which resulted in a cohort of 51 cases out of the 739 biopsied patients. Of these 51 patients, 32 had H&E and trichrome stained slides available for review.

Hematoxylin-eosin and trichrome stained slides of the included patients were de-identified and independently reviewed by 5 pathologists masked to the clinical findings and outcome. All pathologists had completed gastrointestinal (GI) and liver pathology fellowship training followed by 4–8 years of post-fellowship experience. Of the 5 pathologists, 3 review liver biopsies daily, averaging about 10 liver biopsies per day (classified as liver pathologists in this study), and the other 2 primarily review luminal GI pathology (classified as GI pathologists in this study). A variety of histologic features were assessed according to the methods and criteria described previously (Altamirano J et al., 2014; Kleiner DE et al., 2005) (Table 1). The AHHS was then generated using the fibrosis stage and the presence or absence of bilirubinostasis, PMN infiltration, and megamitochondria, and an AHHS category (mild, intermediate, severe) was assigned (Table 2) (Altamirano J et al., 2014).

Table 1.

Histological characteristics evaluated in the liver biopsies.

Histologic feature and definition | Score/code
Fibrosis stage
 None: no fibrosis | 0
 Portal fibrosis: fibrous expansion of portal tracts | 1
 Expansive periportal fibrosis: fibrous expansion of portal tracts with extension into the periportal region | 2
 Bridging fibrosis: fibrosis connecting two anatomic structures, either portal tract or central vein | 3
 Cirrhosis: regenerating nodules surrounded by extensive bridging septa | 4
Lobular fibrosis
 No fibrosis in the lobular region | 0
 Fibrosis in zone 3 | 1
 Fibrosis in zones 3 and 2 | 2
 Fibrosis involves nearly the entire lobule | 3
Pericellular fibrosis
 No pericellular fibrosis | 0
 Pericellular fibrosis | 1
Steatosis grade
 No steatosis | 0
 <5% of biopsy volume | 1
 5–33% of biopsy volume | 2
 34–66% of biopsy volume | 3
 ≥67% of biopsy volume | 4
Bilirubinostasis
 No cholestasis | 0
 Cholestasis in hepatocytes and/or canaliculi | 1
 Cholestasis in cholangioles | 2
 Cholestasis in hepatocytes/canaliculi and cholangioles | 3
Hepatocyte ballooning
 No ballooning in any hepatocytes | 0
 Few ballooned hepatocytes | 1
 Many cells/prominent ballooning | 2
Mallory hyalines
 No Mallory hyalines | 0
 Rare Mallory hyalines | 1
 Easily identifiable Mallory hyalines | 2
Megamitochondria
 No megamitochondria identified | 0
 Intracellular oval or round eosinophilic globules | 1
Polymorphonuclear leukocyte inflammation
 No neutrophilic inflammation | 0
 Rare neutrophils in the lobules | 1
 Clusters of neutrophils in the lobules | 2
Mononuclear inflammation
 No mononuclear inflammation | 0
 A few mononuclear inflammatory cells in the sinusoids, portal tracts, or lobules | 1
 Dense mononuclear inflammation in the portal tracts and/or lobules | 2

Table 2.

Alcoholic Hepatitis Histological Score (AHHS) for Prognostic Stratification of Alcoholic Hepatitis (Altamirano J et al. Gastroenterology 2014;146:1231–1239).

Feature | Points
Fibrosis stage
 No fibrosis or portal fibrosis | 0
 Expansive fibrosis | 0
 Bridging fibrosis or cirrhosis | +3
Bilirubinostasis
 No | 0
 Hepatocellular only | 0
 Canalicular or ductular | +1
 Canalicular or ductular plus hepatocellular | +2
Polymorphonuclear leukocyte infiltration
 No/Mild | +2
 Severe PMN infiltration | 0
Megamitochondria
 No megamitochondria | +2
 Megamitochondria | 0
AHHS categories (0–9 points)
 Mild | 0–3
 Intermediate | 4–5
 Severe | 6–9
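
The point assignment in Table 2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the table, not the authors' implementation; the function names and the string keys for each finding are hypothetical labels chosen for this example.

```python
# Point values transcribed from Table 2 (Altamirano J et al., 2014).
# The dictionary keys are hypothetical labels chosen for this sketch.
FIBROSIS_POINTS = {
    "none_or_portal": 0,
    "expansive": 0,
    "bridging_or_cirrhosis": 3,
}
BILIRUBINOSTASIS_POINTS = {
    "none": 0,
    "hepatocellular_only": 0,
    "canalicular_or_ductular": 1,
    "canalicular_or_ductular_plus_hepatocellular": 2,
}


def ahhs_score(fibrosis, bilirubinostasis, severe_pmn, megamitochondria):
    """Sum the Table 2 points for one biopsy (total range 0-9)."""
    score = FIBROSIS_POINTS[fibrosis] + BILIRUBINOSTASIS_POINTS[bilirubinostasis]
    # Absent or mild PMN infiltration adds 2 points; severe infiltration adds 0.
    score += 0 if severe_pmn else 2
    # Absence of megamitochondria adds 2 points; presence adds 0.
    score += 0 if megamitochondria else 2
    return score


def ahhs_category(score):
    """Map a 0-9 AHHS total to the mild/intermediate/severe category."""
    if not 0 <= score <= 9:
        raise ValueError("AHHS must be between 0 and 9")
    if score <= 3:
        return "mild"
    if score <= 5:
        return "intermediate"
    return "severe"


if __name__ == "__main__":
    total = ahhs_score("bridging_or_cirrhosis", "canalicular_or_ductular",
                       severe_pmn=False, megamitochondria=True)
    print(total, ahhs_category(total))
```

Because the absence of PMN infiltration and of megamitochondria each contributes 2 points, a disagreement on either feature alone can move a biopsy across a category boundary.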

Fleiss’ kappa was used to assess the interobserver agreement on each histologic feature and on the AHHS category. Kappa values of < 0, 0.01–0.20, 0.21–0.40, 0.41–0.60, 0.61–0.80, and 0.81–1.00 were considered to indicate poor, slight, fair, moderate, substantial, and almost perfect agreement, respectively (Landis JR and Koch GG, 1977). A P-value < 0.05 was considered statistically significant. Statistical analysis was performed using R 2.15.2 (R Development Core Team, 2012, Vienna, Austria).
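
For readers unfamiliar with the statistic, the following is a minimal Python sketch of the standard Fleiss' kappa formula for a fixed number of raters per case. The analysis in this study was performed in R; the function name and the toy ratings below are assumptions made purely for illustration and are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of one label per rater.

    Every case must be rated by the same number of raters (at least 2).
    """
    if not ratings:
        raise ValueError("at least one case is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")

    category_totals = Counter()
    per_case_agreement = []
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this case.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_case_agreement.append(agreeing_pairs / (n_raters * (n_raters - 1)))

    observed = sum(per_case_agreement) / len(ratings)
    total_ratings = len(ratings) * n_raters
    # Agreement expected by chance from the overall category proportions.
    expected = sum((t / total_ratings) ** 2 for t in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy data: 4 hypothetical biopsies, 3 hypothetical raters each.
    toy = [
        ["mild", "mild", "mild"],
        ["severe", "severe", "severe"],
        ["mild", "mild", "severe"],
        ["severe", "severe", "mild"],
    ]
    print(round(fleiss_kappa(toy), 3))  # 0.333
```

In the toy data, observed agreement is 2/3 and chance agreement is 1/2, so kappa is (2/3 − 1/2)/(1 − 1/2) = 1/3.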

Results

The consensus histologic data are shown in Table 3. A consensus for each feature is defined as an interpretation agreed upon by at least 3 of the 5 pathologists. In this cohort of patients, 75% of biopsies had bridging fibrosis or cirrhosis, 21% had marked ballooning, 19% had marked Mallory hyalines, 31% had marked neutrophilic inflammation, 23% had bilirubinostasis, and 18% had megamitochondria. Moderate or severe steatosis was seen in only 25% of cases. Approximately 13%, 31%, and 56% of biopsies were assigned to the mild, intermediate, and severe AHHS categories, respectively.
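
The consensus rule (an interpretation shared by at least 3 of the 5 pathologists) can be expressed as a simple majority count. The sketch below is illustrative only: the function name is hypothetical, and returning `None` when no interpretation reaches the threshold is an assumption, since the text does not state how such cases were handled.

```python
from collections import Counter


def consensus(ratings, minimum_votes=3):
    """Return the label chosen by at least `minimum_votes` raters, else None."""
    if not ratings:
        raise ValueError("at least one rating is required")
    label, votes = Counter(ratings).most_common(1)[0]
    # With 5 raters and a threshold of 3, at most one label can qualify.
    return label if votes >= minimum_votes else None


if __name__ == "__main__":
    # Hypothetical ratings of one feature by 5 pathologists.
    print(consensus(["marked", "marked", "none", "marked", "occasional"]))
    # A 2-2-1 split has no 3-of-5 consensus under this rule.
    print(consensus(["mild", "mild", "severe", "severe", "intermediate"]))
```

For two-category features (absent/present), five raters always produce a 3-of-5 majority; the no-consensus case can arise only for features scored in three or more categories.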

Table 3.

Consensus histologic characteristics of patients with AH in this cohort (N = 32)

Characteristic | N (%)
Stage of fibrosis
 No fibrosis or portal fibrosis | 6 (18)
 Expansive fibrosis | 3 (7)
 Bridging fibrosis or cirrhosis | 23 (75)
Steatosis
 <33% | 23 (72)
 33%–66% | 4 (12)
 >66% | 5 (16)
Ballooning
 None or occasional | 22 (69)
 Marked | 10 (21)
Mallory hyalines
 None or occasional | 26 (81)
 Marked | 6 (19)
Polymorphonuclear leukocyte infiltration
 No/Mild | 22 (69)
 Marked | 10 (31)
Bilirubinostasis
 No | 24 (75)
 Present | 8 (25)
Megamitochondria
 No | 25 (78)
 Present | 7 (22)
AHHS
 Mild (0–3 points) | 4 (13)
 Intermediate (4–5 points) | 10 (31)
 Severe (6–9 points) | 18 (56)

A slight-to-moderate level of interobserver agreement was reached among the 5 reviewers on the histopathological features of AH with kappa values ranging from 0.20 (95% CI: 0.03–0.46, megamitochondria) to 0.52 (95% CI: 0.40–0.68, PMN infiltration) (Table 4). Consequently, there was overall only a fair level of agreement in assigning the different AHHS categories to a case (kappa value=0.33, 95% CI: 0.20–0.51).

Table 4.

Interobserver agreement on histopathologic features of AH among 5 pathologists

Histologic feature (categories) | Kappa value (95% CI) | Strength of agreement | P value
Fibrosis stage (None/Portal fibrosis/Expansive periportal fibrosis/Bridging fibrosis/Cirrhosis) | 0.42 (0.31–0.51) | Moderate | <0.001
Lobular fibrosis (None/Zone 3/Zones 3+2/Panlobular fibrosis) | 0.31 (0.19–0.45) | Fair | <0.001
Pericellular fibrosis (Absent/Present) | 0.41 (0.26–0.63) | Moderate | <0.001
Steatosis (None/<5%/5–33%/33–66%/>66%) | 0.43 (0.31–0.57) | Moderate | <0.001
Bilirubinostasis (None/Hepatocellular/Canalicular or ductular/Both) | 0.52 (0.36–0.72) | Moderate | <0.001
Ballooning (None/Occasional/Marked) | 0.37 (0.26–0.50) | Fair | <0.001
Mallory bodies (None/Occasional/Marked) | 0.44 (0.30–0.58) | Moderate | <0.001
Megamitochondria (Absent/Present) | 0.20 (0.03–0.46) | Slight | <0.001
Polymorphonuclear leukocyte infiltration (None/Mild/Severe) | 0.52 (0.40–0.68) | Moderate | <0.001
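
The strength-of-agreement labels in Table 4 follow from the Landis and Koch bands quoted in the Methods. A minimal Python sketch of that mapping is shown below; the function name is hypothetical, and treating a kappa of exactly 0.00 as "poor" is an assumption based on how Table 5 labels the value 0.00.

```python
def agreement_strength(kappa):
    """Label a kappa value using the Landis and Koch (1977) bands."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    # Assumption: values at or below zero are labeled "poor".
    if kappa <= 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Kappa values for selected features, copied from Table 4.
    table4 = {
        "Fibrosis stage": 0.42,
        "Lobular fibrosis": 0.31,
        "Megamitochondria": 0.20,
        "PMN infiltration": 0.52,
    }
    for feature, kappa in table4.items():
        print(f"{feature}: {kappa:.2f} ({agreement_strength(kappa)})")
```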

As only 3 of the 5 pathologists regularly review liver biopsies daily, Fleiss’ kappa was also used to assess the interobserver agreement separately among the 3 liver pathologists and between the 2 GI pathologists on each of the histologic features and the aggregate AHHS. The results are shown in Table 5. Liver pathologists had better interobserver agreement on certain histologic features, namely lobular fibrosis (kappa value 0.27 vs. 0.13), pericellular fibrosis (kappa value 0.48 vs. 0.00), bilirubinostasis (kappa value 0.54 vs. 0.37), Mallory hyalines (kappa value 0.50 vs. 0.22), and megamitochondria (kappa value 0.40 vs. −0.26). The assessment of overall fibrosis and PMN infiltration was comparable between liver and GI pathologists (kappa value 0.41 vs. 0.38 for overall fibrosis and 0.48 vs. 0.40 for PMN infiltration). On the other hand, GI pathologists had slightly better agreement on steatosis (kappa value 0.42 vs. 0.34) and hepatocyte ballooning (kappa value 0.40 vs. 0.24). Importantly, 2 of the 4 features used for the AHHS, bilirubinostasis and megamitochondria, were more consistently diagnosed by liver pathologists, and the other two features (overall fibrosis and PMN inflammation) were recognized by liver and GI pathologists with comparable interobserver agreement.

Table 5.

Interobserver agreement on histopathologic features of AH among 3 liver pathologists and 2 gastrointestinal pathologists

Histologic feature (categories) | Kappa value for 3 liver pathologists (95% CI) | Strength of agreement | Kappa value for 2 GI pathologists (95% CI) | Strength of agreement | P value
Fibrosis stage (None/Portal fibrosis/Expansive periportal fibrosis/Bridging fibrosis/Cirrhosis) | 0.41 (0.25, 0.56) | Moderate | 0.38 (0.20, 0.59) | Fair | <0.001
Lobular fibrosis (None/Zone 3/Zones 3+2/Panlobular fibrosis) | 0.27 (0.11, 0.41) | Fair | 0.13 (−0.11, 0.35) | Slight | <0.001
Pericellular fibrosis (Absent/Present) | 0.48 (0.25, 0.68) | Moderate | 0.00 (−0.27, 0.35) | Poor | <0.001
Steatosis (None/<5%/5–33%/33–66%/>66%) | 0.34 (0.19, 0.49) | Fair | 0.42 (0.20, 0.61) | Moderate | <0.001
Bilirubinostasis (None/Hepatocellular/Canalicular or ductular/Both) | 0.54 (0.32, 0.77) | Moderate | 0.37 (0.12, 0.62) | Fair | <0.001
Ballooning (None/Occasional/Marked) | 0.24 (0.10, 0.39) | Fair | 0.40 (0.13, 0.64) | Fair | <0.001
Mallory bodies (None/Occasional/Marked) | 0.50 (0.34, 0.66) | Moderate | 0.22 (−0.01, 0.46) | Fair | <0.001
Megamitochondria (Absent/Present) | 0.40 (0.13, 0.63) | Fair | −0.26 (−0.50, 0.05) | Poor | <0.001
Polymorphonuclear leukocyte infiltration (None/Mild/Severe) | 0.48 (0.30, 0.65) | Moderate | 0.40 (0.17, 0.61) | Fair | <0.001

Eighteen of 32 (56%) biopsies were uniformly assigned to an AHHS category by the three liver pathologists, with the following distribution: 11 severe, 6 intermediate, and 1 mild. Of the remaining biopsies, 11 of 32 (34%) were assigned to the same AHHS category by two of the three liver pathologists (3 severe, 3 intermediate, and 5 mild). The remainder (3 of 32, 10%) were assigned to different categories by all three liver pathologists. Overall, AHHS categories were more consistently assigned by liver pathologists (kappa value=0.40, 95% CI: 0.20–0.51) than by GI pathologists (kappa value=0.06, 90% CI: −0.20–0.33).
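
The three-way breakdown above (all three liver pathologists agree, two of three agree, all three differ) can be tallied mechanically from per-biopsy category assignments. The Python sketch below illustrates one way to do this; the function names and the example assignments are hypothetical and are not the study data.

```python
from collections import Counter


def agreement_pattern(categories):
    """Classify one biopsy's category assignments by how many raters agree."""
    if len(categories) < 2:
        raise ValueError("at least two raters are required")
    largest_group = max(Counter(categories).values())
    if largest_group == len(categories):
        return "unanimous"
    if largest_group >= 2:
        return "partial"
    return "all differ"


def tally_patterns(cases):
    """Count how many biopsies fall into each agreement pattern."""
    return Counter(agreement_pattern(case) for case in cases)


if __name__ == "__main__":
    # Hypothetical AHHS categories from three raters for four biopsies.
    example = [
        ["severe", "severe", "severe"],
        ["mild", "mild", "intermediate"],
        ["mild", "intermediate", "severe"],
        ["intermediate", "intermediate", "intermediate"],
    ]
    print(dict(tally_patterns(example)))
```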

Discussion

AH, the most severe form of alcoholic liver disease, carries a short-term mortality as high as 20–30% (Dominguez M et al., 2008). Although AH has a constellation of histologic features that supports the diagnosis, a biopsy scoring system of prognostic significance was only recently published, in a multicentric study (Altamirano J et al., 2014). The AHHS gives 3 points for the presence of bridging fibrosis or cirrhosis, 1 point for canalicular or ductular bilirubinostasis or 2 points for canalicular or ductular bilirubinostasis plus hepatocellular bilirubinostasis, 2 points for a lack of or only mild neutrophilic inflammation, and 2 points for a lack of megamitochondria. The AHHS is further divided into mild (0–3 points), intermediate (4–5 points), and severe (6–9 points) categories, which predict a low (3%), moderate (19%), and high (51%) risk of death within 90 days, respectively (Altamirano J et al., 2014). However, whether pathologists in a routine clinical setting can reliably recognize these features has not been investigated.

Biopsy interpretation for the diagnosis of alcoholic liver disease requires the recognition of many individual features. High interobserver variability in recognizing these histologic features has been previously reported by others (Kleiner DE et al., 2005; Bedossa P et al., 1988). The presence of major and minor discrepancies in a “second opinion” on liver biopsy interpretation has also been previously reported (Bejarano PA et al., 2001; Paterson AL et al., 2016; Colling R et al., 2014). In one study of non-alcoholic fatty liver disease involving a panel of liver pathologists, kappa values of 0.15 to 0.84 were reported for a variety of histologic features (Kleiner DE et al., 2005). Not surprisingly, the lowest kappa value was reported for megamitochondria (0.15 in adult cases and −0.03 in pediatric cases) and the highest kappa values for fibrosis (kappa 0.84) and steatosis (kappa 0.79). This finding reaffirms the high interobserver variability in identifying megamitochondria, a feature not commonly or actively searched for diagnostic purposes in routine liver biopsy interpretation. Bedossa P et al. (1988) examined interobserver variation in the assessment of liver biopsies of alcoholic patients and reported a moderate concordance for steatosis (kappa value of 0.47) but a lower concordance for fibrosis alone (kappa value of 0.16) between two pathologists. In our study, multiple histologic features were assessed by 5 pathologists, 3 of whom have specific expertise as hepatopathologists. The steatosis grade, fibrosis stage, PMN inflammation, bilirubinostasis, and Mallory hyaline had moderate agreement, with kappa values of 0.43, 0.42, 0.52, 0.52, and 0.44, respectively. The interobserver agreement for steatosis was similar to that previously reported in liver biopsies from alcoholic patients (Bedossa P et al., 1988) but lower than that for nonalcoholic fatty liver disease (Kleiner DE et al., 2005). This may reflect a true difficulty in steatosis assessment in alcoholic liver disease due to more prominent ballooning, pericellular fibrosis, and, in some cases, foamy degeneration. The overall fibrosis staging assessment in our study had moderate agreement, which was better than that previously reported in liver biopsies from alcoholic patients (Bedossa P et al., 1988) but worse than that reported for nonalcoholic fatty liver disease (kappa of 0.84, Kleiner DE et al., 2005). This may be related to the more frequently fragmented biopsies in patients with more advanced lobular fibrosis, which may interfere with fibrosis interpretation.

As 3 of the 5 reviewers review liver biopsies as a major part of their reporting, a further analysis was performed to investigate the kappa value among these 3 liver pathologists. Indeed, the kappa value among liver pathologists was higher for all examined histologic features except steatosis and ballooning. For the features included in the AHHS, the kappa value was higher among liver pathologists than among GI pathologists (0.41 vs. 0.38 for fibrosis, 0.54 vs. 0.37 for bilirubinostasis, 0.40 vs. −0.26 for megamitochondria, and 0.48 vs. 0.40 for PMN inflammation, respectively). These data suggest greater interobserver agreement in AHHS category assignment among liver pathologists who practice in a high-volume liver setting. Even so, uniform agreement among the 3 liver pathologists could not be reached in a substantial proportion of the cases (44%), raising some concerns regarding the general application of the AHHS as a predictor of prognosis in AH.

Major strengths of this study are that it includes one of the largest cohorts of patients with AH from a single institute and that the diagnosis was confirmed by clinical and histological criteria. In addition, all liver biopsies were obtained within one month of the laboratory results. Furthermore, all the reviewing pathologists had completed fellowship training in gastrointestinal and liver pathology. In contrast to previous reports with higher interobserver agreement, our study included a large number of pathologists with differences in years of experience, training at different institutions, and current practices, which is more representative of pathologists nationwide.

A limitation of the present study is that the review results, including the AHHS, were not correlated with clinical findings, laboratory test results, or the patients’ outcome. However, the goal of the study was not to determine the prognostic value of the AHHS, which has been previously reported, but rather to determine the variability in the interpretation of the histological characteristics that constitute its components. All liver biopsies were taken within 1 month of laboratory results, but the exact time interval between the liver biopsy and the onset of disease or time of admission was not clear. The AH population in this study may differ from the populations in previous studies, as our patients had a lower frequency of marked ballooning, many Mallory hyalines, megamitochondria, and bilirubinostasis. This may be related to the large number of cases with advanced fibrosis in our cohort. Only 3 liver pathologists and 2 GI pathologists participated in this study, so the difference in feature recognition between these two small groups may not be representative. Finally, all the pathologists had specialized GI/liver fellowship training, and pathologists without such training were not included. However, this is also a strength of the study: given the variability among pathologists with specialized training, one would expect even greater variability among non-specialized pathologists, suggesting that such biopsies should preferably be reviewed by pathologists with advanced training and continued interest in liver pathology.

In summary, our study examined, for the first time in a clinical practice outside of a research setting, the interobserver variability on several previously reported histologic features of AH that predict outcome. An AHHS category could only be assigned with a fair level of agreement (kappa value of 0.33) in our study, and only 18 of 32 biopsies (56%) were uniformly assigned to an AHHS category by the three liver pathologists. Our data are consistent with previous literature and reaffirm the high interobserver variability in liver biopsy interpretation. Additionally, our data identify megamitochondria as one of the most difficult features to recognize in AH. Even though pathologists practicing in a high-volume liver setting may have better interobserver agreement, clinicians applying the AHHS as a predictor of prognosis in AH should do so with caution, given the limitations identified in our study. Whether training sessions will harmonize the interpretation of individual parameters among pathologists and improve interobserver agreement in scoring liver biopsies with a diagnosis of AH remains to be determined by future studies.

Figure 1.

Morphologic features seen in alcoholic hepatitis including hepatocyte ballooning, Mallory hyaline, PMN infiltration (A, H&E), megamitochondria (B, H&E), cholestasis (C, H&E), and lobular fibrosis (D, Masson’s trichrome stain).

Acknowledgments

Grant support: SD was supported in part by DK83414, UO1 AA021893, R21 AA 022742, P50 AA02433-01-8236.

DA and AJM were supported in part by UO1 AA021893.

Footnotes

Disclosure: None

References

  1. Altamirano J, Miquel R, Katoonizadeh A (2014) A histologic scoring system for prognosis of patients with alcoholic hepatitis. Gastroenterology 146:1231–1239.
  2. Bedossa P, Poynard T, Naveau S, Martin ED, Agostini H, Chaput JC (1988) Observer variation in assessment of liver biopsies of alcoholic patients. Alcohol Clin Exp Res 12:173–178.
  3. Bejarano PA, Koehler A, Sherman KE (2001) Second opinion pathology in liver biopsy interpretation. Am J Gastroenterol 96:3158–3164.
  4. Colling R, Verrill C, Fryer E, Wang LM, Fleming K (2014) Discrepancy rates in liver biopsy reporting. J Clin Pathol 67:825–827.
  5. Dominguez M, Rincon D, Abraldes JG (2008) A new scoring system for prognostic stratification of patients with alcoholic hepatitis. Am J Gastroenterol 103:2747–2756.
  6. Guirguis J, Chhatwal J, Dasarathy J (2015) Clinical impact of alcohol-related cirrhosis in the next decade: estimates based on current epidemiological trends in the United States. Alcohol Clin Exp Res 39:2085–2094.
  7. Kleiner DE, Brunt EM, Van Natta M (2005) Design and validation of a histological scoring system for nonalcoholic fatty liver disease. Hepatology 41:1313–1321.
  8. Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174.
  9. O’Shea RS, Dasarathy S, McCullough AJ (2010) Alcoholic liver disease. Am J Gastroenterol 105:14–32.
  10. Paterson AL, Allison ME, Brais R, Davies SE (2016) Any value in a specialist review of liver biopsies? Conclusions of a 4-year review. Histopathology 69:315–321.
