Abstract
Background and Aim:
Covert hepatic encephalopathy (CHE) is associated with poor outcomes but is often not diagnosed because testing is time-consuming. The psychometric hepatic encephalopathy score (PHES) is the gold standard against which EncephalApp Stroop has been validated. However, EncephalApp (5 runs each in the “Off” and “On” states) can take up to 10 minutes. Aim: Define the smallest number of EncephalApp runs needed to achieve accuracy comparable to the total EncephalApp, using CHE on PHES as the gold standard.
Methods:
A derivation and a validation cohort of outpatients with cirrhosis who underwent PHES (gold standard) and total EncephalApp were recruited. Data were analyzed for individual runs versus total EncephalApp time versus PHES-CHE. The derivation cohort (n=398) was split into training (n=299) and test (n=99) sets. From the training dataset, a regression model was created with age, gender, education, and various sums of the “Off” run times. After this, K-fold cross-validation was performed on the test dataset, for both total EncephalApp time and individual Off runs, and on the validation cohort.
Results:
In both cohorts, Off runs 1+2 had an AUROC for PHES-CHE prediction that was statistically similar to that of the total EncephalApp. The adjusted (age, gender, education) regression formula from the derivation cohort showed an accuracy of 84% to diagnose PHES-CHE in the validation cohort. Time for CHE diagnosis decreased from 203.7 (67.82) to 36.8 (11.25) seconds in the derivation cohort and from 178.2 (46.19) to 32.9 (9.94) seconds in the validation cohort.
Conclusions:
QuickStroop, which is completed within 1 minute, gives an equivalent ability to predict CHE on the gold standard compared to the entire EncephalApp time.
Keywords: minimal hepatic encephalopathy, psychometric hepatic encephalopathy score, EncephalApp, rapid diagnosis, cirrhosis
INTRODUCTION:
Covert hepatic encephalopathy (CHE) is a largely under-recognized form of cognitive dysfunction with a high prevalence in patients with cirrhosis1. CHE is associated with a decline in daily function and predicts future overt hepatic encephalopathy (OHE), morbidity, and mortality2, 3, 4. Although this need is not uniform across the world, quicker tests are needed to diagnose CHE in clinical and research settings1.
One of the tests that has been used across the world is EncephalApp Stroop, which builds upon the traditional Stroop effect and tests psychomotor speed and cognitive flexibility to detect CHE1, 5. The EncephalApp has 2 states, the easier Off State and the more difficult On State, and 5 runs in each state are required for the analysis (Figure 1)6. The diagnosis of CHE relies on the total time (5 Off runs + 5 On runs) on the EncephalApp7. While these complete runs take less time than gold standards such as the psychometric hepatic encephalopathy score (PHES), reducing the time needed to achieve results similar to the total EncephalApp could enhance acceptability for patients and practitioners and encourage greater translation into clinical practice.
We hypothesized that by using fewer runs of the EncephalApp we could develop a QuickStroop version to diagnose CHE that maintained accuracy relative to PHES and the total EncephalApp results.
METHODS:
Patient population:
All eligible patients between ages 21–65 years who were not color blind and were able to consent were approached for testing for CHE. We included patients who were tested for CHE using PHES and EncephalApp from 2 high-volume tertiary care centers (Virginia Commonwealth University and Richmond VA Medical Centers, Richmond VA)1. All subjects signed informed consent and the protocol was approved by the IRBs of both sites. Cirrhosis was diagnosed either by biopsy, radiologic evidence of cirrhosis, endoscopic evidence of varices in a patient with chronic liver disease, or frank decompensation (history of variceal bleeding, ascites, or prior OHE). We included patients with a history of prior episodes of OHE provided they were controlled on medications (lactulose and/or rifaximin) at the time of testing. All patients were able to give informed consent as judged by a mini-mental status score of ≥25, and frank dementia was excluded by family interview, chart review, and evaluation of concurrent medications. We excluded subjects who were not able to give informed consent, had red–green color blindness, had abused alcohol/illicit drugs over the last 3 months, or were on psychoactive medications apart from chronic antidepressants.
A derivation cohort of outpatients (N=398) with cirrhosis who underwent PHES and EncephalApp Stroop was used, and individual Off State runs and On State runs were compared against PHES CHE as the gold standard5. We split the derivation cohort of 398 subjects into a training set and a test dataset, with 75% of the subjects (N = 299) in the training set and 25% (N = 99) in the test dataset, with the two datasets having a similar distribution of PHES CHE variables. We then used binary logistic regression on the training dataset (n = 299) to derive a formula for CHE diagnosis on the QuickStroop using significant parameters that can impact this score. We then performed a K-fold cross-validation, with K = 10, for both Total Stroop time and for the sum of runs in the Off position only, based on the test dataset (n = 99). Sensitivity, specificity, AUROC, and accuracy were analyzed.
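The 75%/25% split with a similar CHE distribution in both sets can be illustrated with a stratified split. The following is a minimal sketch, not the authors' procedure (they report using R): the function name `stratified_split`, the seed, and the synthetic labels are assumptions made for illustration, with the class counts taken from the CHE proportion in Table 1.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices into train/test so each class keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Take the requested fraction of each class for the test set.
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels only (1 = CHE on PHES, 0 = no CHE); 201 of 398 as in Table 1.
labels = [1] * 201 + [0] * 197
train_idx, test_idx = stratified_split(labels)

che_share_train = sum(labels[i] for i in train_idx) / len(train_idx)
che_share_test = sum(labels[i] for i in test_idx) / len(test_idx)
```

With these synthetic labels the split sizes come out as 299 and 99, and the CHE share is nearly identical in both sets, which is the property the text describes.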
The smallest number of Off State runs and On State runs needed to reach a point where there was no statistically significant difference between the results for OffTime+OnTime and those for the subset of runs was evaluated in the derivation cohort.
We then enrolled an external validation cohort (N=94) of outpatients with cirrhosis to see if the same number of Off State runs identified in the training set of the derivation cohort could predict CHE, and to validate the binary logistic regression formula generated from that training set. Sensitivity, specificity, AUROC, and accuracy were analyzed.
EncephalApp Stroop testing:
All potential subjects who were not color-blind underwent the testing by a trained provider. After appropriate instructions and a mandatory trial run/test, patients were tested on the EncephalApp, and timings were recorded. A total of 5 runs were done in the Off state and then 5 more runs were required to be completed in the On state. Total state times were recorded for all groups. Standard EncephalApp metrics are the time taken for 5 successful Off state runs (Off Time), the time taken for 5 successful On state runs (On Time), and the total time taken (Off Time + On Time). A diagnosis of CHE was based on US norms from the multi-center North American experience5.
Psychometric Hepatic Encephalopathy Score (PHES):
The PHES consists of 5 subtests, namely the number connection tests A and B, digit symbol test, serial dotting test, and line drawing test8. Tests were administered by trained providers. CHE was diagnosed on PHES for a score ≤ −4 based on our norms5.
Statistics:
The R statistical package was used to derive the formulas and perform the K-fold cross validation. T test/Wilcoxon test was used for continuous variables. Chi-square test was used for categorical variables. De Long’s method was used to compare AUROCs.
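To make the K-fold procedure concrete, the sketch below partitions a dataset into K = 10 folds so that every subject is held out exactly once. It is written in Python for illustration only (the analysis itself was done in R), and the helper name `k_fold_indices` and the seed are assumptions rather than anything taken from the study code.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, held_out_idx) pairs; each sample is held out in exactly one fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


# 99 subjects, matching the size of the test dataset described in the Methods.
folds = list(k_fold_indices(99, k=10))
fold_sizes = [len(held_out) for _, held_out in folds]
```

In each iteration a model would be fitted on `train` and scored on `held_out`, and the performance metrics averaged across the 10 folds.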
RESULTS:
Comparison of derivation and validation cohorts:
We analyzed data from a total of 492 patients who had undergone PHES and EncephalApp testing (Figure 2). The validation cohort was younger and had a lower proportion of men. This was associated with a shorter time to complete EncephalApp in the validation cohort, but the CHE proportion was statistically similar between the groups since the diagnosis is adjusted for age and gender (Table 1).
Table 1:
Demographics | Derivation cohort (N=398) | Validation cohort (N=94) | P value |
---|---|---|---|
Age (Mean(SD)) | 59±8 | 56±7 | 0.0009 |
Male sex N (%) | 335(84.1%) | 54(55%) | <0.0001 |
Education (Median (IQR)) | 13 (12, 14) | 13 (12, 15) | 0.56 |
CHE on PHES N (%) | 201(50.5%) | 43(44%) | 0.23 |
MELD score (Mean(SD)) | 12.5 (5.7) | 12.4 (6.0) | 0.88 |
Rifaximin N (%) | 152(38%) | 28(30%) | 0.13 |
Prior Overt HE N (%) | 164(41%) | 32(32.2%) | 0.11 |
EncephalApp (Mean (SD)) seconds | |||
Total Off Time | 85.5(74.5,101) | 78(65,93) | 0.0002 |
Total On Time | 103(89,125) | 94(78,108) | 0.0003 |
Total OffTime+OnTime | 189(163,225) | 172 (147,201) | 0.0002 |
CHE: covert hepatic encephalopathy, PHES: psychometric hepatic encephalopathy score, HE: hepatic encephalopathy; all patients with prior HE were on lactulose.
Comparison to PHES CHE:
We examined the AUROC for diagnosis of CHE via the total Off + On runs and via Off runs alone, using PHES CHE as the gold standard. This was done in the training set (n=299) of the derivation cohort. As shown in Table 2, the first 2 Off state runs alone achieved accuracy for PHES CHE prediction comparable to that of the entire OffTime+OnTime (p >0.05 for this comparison). All comparisons using more Off runs beyond the first 2 showed the same trend. Given that the On State of the EncephalApp is a more complex task that requires more time and is done only after the Off State, we did not analyze the On State runs further.
Table 2:
Derivation cohort training set (n=299) | Total Time | Off 1 | Off1 through 2 | Off1 through 3 | Off1 through 4 | Off1 through 5 |
---|---|---|---|---|---|---|
AUROC | 0.8831 | 0.8536 | 0.8773 | 0.8787 | 0.8864 | 0.8891 |
p-value vs. PHES | ---- | 0.0247 | 0.5812 | 0.6547 | 0.6719 | 0.4137 |
Validation cohort (n=94) | Total Time | Off 1 | Off1 through 2 | Off1 through 3 | Off1 through 4 | Off1 through 5 |
AUROC | 0.8905 | 0.8929 | 0.8887 | 0.8893 | 0.8964 | 0.8905 |
p-value vs PHES | ---- | 0.9109 | 0.9266 | 0.9469 | 0.7243 | 1.0000 |
CHE: covert hepatic encephalopathy, PHES: psychometric hepatic encephalopathy score, HE: hepatic encephalopathy, AUROC: area under the receiver operating curve
Comparisons with CHE on PHES were also made for the validation cohort, which again showed a similar trend with Off1 through 2 showing statistical similarity with the total OffTime+OnTime for CHE diagnosis (Table 2).
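As a reminder of what the AUROC values in Table 2 measure, the sketch below computes an AUROC as the probability that a randomly chosen subject with CHE has a higher score (here, a longer time) than a randomly chosen subject without CHE, with ties counted as one half. The times and labels are made up for illustration and are not study data, and the function `auroc` is a hypothetical helper. DeLong's test for comparing two AUROCs is not implemented here.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive (label 1) and negative (label 0) scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both classes must be present to compute an AUROC.")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Illustrative (not real) summed times for Off runs 1+2, in seconds.
times = [28.0, 31.5, 33.0, 35.0, 36.5, 40.0, 44.0, 52.0]
che = [0, 0, 1, 0, 0, 1, 1, 1]
example_auroc = auroc(times, che)
```

An AUROC of 0.5 corresponds to chance discrimination and 1.0 to perfect separation of the two groups.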
Ultimately, the time needed to diagnose CHE was reduced from 203.7 (67.82) seconds for the total Off Time + On Time to 36.8 (11.25) seconds for the QuickStroop in the derivation training set, and from 178.2 (46.19) to 32.9 (9.94) seconds in the validation cohort.
Multivariable regression modelling of the training set in the derivation cohort:
Regression models were developed for each of the stages (OffTime+OnTime, Off 1 only, Off 1 through 2, and so on up to Off 1 through 5 runs) using PHES CHE as the gold standard and incorporating age, gender, education, and time required, using data from the training set of the derivation cohort. The regression model for Off runs 1 through 2, which was the lowest number of runs needed to achieve comparable accuracy, was represented by the formula: −8.2362 + Age*(−0.0349) + Male*(1.1467) + Education*(0.0049) + Stroop Off1 through 2*(0.2607), where a score >=0 is CHE (<0 is No CHE). Table 3 has the regression formulae for the other runs.
Table 3:
Regression Models Based on Various Stroop Times
Parameter | Total Time (Off+On) Estimate | Off1 through 2 Estimate | Off1 through 3 Estimate | Off1 through 4 Estimate | Off1 through 5 Estimate |
---|---|---|---|---|---|
Intercept | −7.9721 | −8.2362 | −8.6315 | −8.6628 | −8.7121 |
Age | −0.0372 | −0.0349 | −0.0318 | −0.0385 | −0.0428 |
Sex (Male) | 1.3358 | 1.1467 | 1.0895 | 1.0320 | 0.9780 |
Education | −0.0096 | 0.0049 | 0.0041 | 0.0069 | 0.0131 |
Stroop | 0.0470 | 0.2607 | 0.1796 | 0.1410 | 0.1161 |
Using these formulas, if the calculated value was greater than or equal to 0 then a subject was classified as having CHE; if the calculated value was < 0 the subject was classified as not having CHE. As mentioned above, the Off 1 through 2 estimate was statistically similar to the total time (Off+On).
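The classification rule can be sketched in code as follows. The coefficients are copied from Table 3; everything else is an assumption made for illustration: that age and education are entered in years, that Male is coded 1 for men and 0 for women, that the Stroop term is the summed time in seconds, and the function names themselves. Because the models are logistic regressions, a score of 0 corresponds to a predicted probability of 0.5.

```python
import math

# Coefficients from Table 3 (training set of the derivation cohort).
MODELS = {
    "total":  {"intercept": -7.9721, "age": -0.0372, "male": 1.3358, "education": -0.0096, "stroop": 0.0470},
    "off1_2": {"intercept": -8.2362, "age": -0.0349, "male": 1.1467, "education": 0.0049, "stroop": 0.2607},
    "off1_3": {"intercept": -8.6315, "age": -0.0318, "male": 1.0895, "education": 0.0041, "stroop": 0.1796},
    "off1_4": {"intercept": -8.6628, "age": -0.0385, "male": 1.0320, "education": 0.0069, "stroop": 0.1410},
    "off1_5": {"intercept": -8.7121, "age": -0.0428, "male": 0.9780, "education": 0.0131, "stroop": 0.1161},
}


def che_score(model, age_years, is_male, education_years, stroop_seconds):
    """Linear predictor of the chosen model (units and coding are assumptions, see lead-in)."""
    c = MODELS[model]
    return (c["intercept"]
            + c["age"] * age_years
            + c["male"] * (1 if is_male else 0)
            + c["education"] * education_years
            + c["stroop"] * stroop_seconds)


def classify(score):
    """Apply the threshold stated in the text: score >= 0 is CHE."""
    return "CHE" if score >= 0 else "No CHE"


def che_probability(score):
    """Logistic transform of the score; 0 maps to 0.5."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical patient: 59-year-old man, 13 years of education, 36.8 s for Off runs 1+2.
example_score = che_score("off1_2", 59, True, 13, 36.8)
example_label = classify(example_score)
```

For this hypothetical patient the score is about 0.51, so the rule classifies him as having CHE.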
This formula was used to compare accuracy in the test set of the derivation cohort and in the validation cohort (Table 4). Our results suggested that the overall accuracy of the formula derived from the training set of the derivation cohort to predict PHES CHE in the validation cohort was 84.04%, with a sensitivity of 75.00% and specificity of 87.14%, with Off runs 1 through 2. In the test set, the accuracy was 77.78%, with a sensitivity of 70.83% and specificity of 84.31%, using the Off runs 1 through 2 adjusted for age, sex, and education level.
Table 4:
Metric | Total Stroop Time (Off+On), Test Set (n = 99) | Total Stroop Time (Off+On), Validation cohort (n = 94) | Off 1 through 2, Test Set (n = 99) | Off 1 through 2, Validation cohort (n = 94) | Off 1 through 3, Test Set (n = 99) | Off 1 through 3, Validation cohort (n = 94) | Off 1 through 4, Test Set (n = 99) | Off 1 through 4, Validation cohort (n = 94) | Off 1 through 5, Test Set (n = 99) | Off 1 through 5, Validation cohort (n = 94) |
---|---|---|---|---|---|---|---|---|---|---|
Accuracy | 81.82% | 80.85% | 77.78% | 84.04% | 77.78% | 82.98% | 77.78% | 81.91% | 79.80% | 80.85% |
Sensitivity | 75.00% | 62.50% | 70.83% | 75.00% | 68.75% | 66.67% | 66.67% | 66.67% | 72.92% | 62.50% |
Specificity | 88.24% | 87.14% | 84.31% | 87.14% | 86.27% | 88.57% | 88.24% | 87.14% | 83.33% | 87.14% |
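The three metrics in Table 4 follow from a 2×2 comparison of the model classification against PHES. The sketch below shows the arithmetic. The counts are not reported in the paper: they are back-calculated here as one set of whole numbers consistent with the test-set percentages for Off 1 through 2 (n = 99), and should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # CHE on PHES correctly flagged
        "specificity": tn / (tn + fp),          # no CHE on PHES correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Assumed counts (inferred, not reported): 48 with CHE on PHES, 51 without.
metrics = diagnostic_metrics(tp=34, fn=14, tn=43, fp=8)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

With these assumed counts the percentages reproduce the 70.83% sensitivity, 84.31% specificity, and 77.78% accuracy shown for the test set with Off 1 through 2.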
DISCUSSION:
We found that two short runs in the easier Off State of the EncephalApp demonstrate accuracy equivalent to that of the entire EncephalApp for diagnosing CHE based on the PHES gold standard. This was internally and externally validated and was robust even with logistic regression adjustment. The data generated here could potentially streamline the diagnosis of CHE using EncephalApp in clinics: the time burden would go down substantially, making it possible to administer the test while checking in patients and to record data on smart devices for presentation to clinicians seeing the patients later in the clinic.
The PHES is considered the gold standard, but its translation into clinic has been fraught with issues related to norm development and the long time needed for administration and scoring1, 9. There have been attempts to shorten this time in clinical studies10, but there remains a gap in clinical application. While the time required for the complete EncephalApp run (5 Off and 5 On) varies between groups, it is lower than that for PHES but can still reach up to 10 minutes. This study shows that an accurate diagnosis of CHE that links with PHES can be made within 45 seconds, which is a considerable advantage from a time perspective that could be translated into practice. In terms of potential loss of pathophysiological information, by only using 2 Off state runs we would be unable to assess cognitive flexibility and response inhibition, which are associated with the On State, and would only focus on the psychomotor and processing speed tested in the Off State. Therefore, this cognitive measure specifically assesses psychomotor and processing speed, which are abilities intimately associated with subcortical region integrity. Subcortical changes have been demonstrated in prior specialized functional HE-related imaging studies11, 12. This also fits into a dynamic model of brain organization, where subcortical brain regions are responsible for cortical tone and arousal, and even mild disruption of the homeostasis of subcortical areas is associated with impaired processing speed and vigilance13. The strong diagnostic accuracy for HE diagnosis using the QuickStroop confirms that HE, particularly in the early stages of the disorder, impacts subcortical region integrity, and that neuropsychological measures interrogating vigilance and processing speed should be emphasized in diagnostic assessment13.
Other screening strategies such as the animal naming test (ANT) and quality of life questionnaires have also been studied14. We did not directly compare these to the EncephalApp full version or the QuickStroop, but there are some points to consider. The ANT is an easy-to-use paper-pencil test that could have a ceiling effect given the number of animals that can be named within the time limit, while the Sickness Impact Profile (SIP) depends on specific questions whose relevance often varies by population15, 16. However, it is likely that any of these three modalities (ANT, quality of life, or QuickStroop) could potentially be adopted to increase CHE screening, which currently remains very low. Therefore, we believe that the QuickStroop may be one more potential avenue to screen for CHE, the screening of which still needs to be improved17.
The advantage over the complete OffTime+OnTime is primarily time-based, but the QuickStroop could also reduce stress on patients who have difficulty completing the more difficult On state. Despite the time advantages, we still need to exclude red-green color-blind subjects and need an electronic device to administer the QuickStroop. The shorter test could also reduce the ability to detect changes over time in response to an intervention. Therefore, for research studies the full EncephalApp may still be required. The study is limited by cross-sectional analyses using data from two centers and needs to be extended further. We did not study the impact on outcomes since previous studies have shown this for both PHES and full EncephalApp results1. The derivation cohort had a higher proportion of men than the validation cohort, but since the CHE decisions by PHES and EncephalApp are adjusted for age, gender, and educational achievement, this should not confound the interpretation. This was reflected in the equivalent results with 2 Off runs of the EncephalApp in both cohorts compared to PHES.
To implement this in clinical practice after further validation, a few points are important to consider. From the provider perspective, this requires trained personnel (training requires <10 minutes and is on www.encephalapp.com), a tablet or smartphone, and an area where the provider and patients are not disturbed. From a patient perspective, we need to exclude those with red-green color blindness (by history or using apps such as “Colortest”) and actively confused patients. We anticipate that the QuickStroop results would then be noted in the patient chart for the provider to interpret using norms (www.encephalapp.com has been updated), counsel patients, and add and/or modify therapy accordingly.
In conclusion, the QuickStroop version of the EncephalApp Stroop is a valid alternative to the complete EncephalApp Stroop and PHES, with high sensitivity, specificity, and accuracy, and has high relevance for real-world rapid CHE diagnosis. The QuickStroop is likely to add to the armamentarium that helps clinicians in rapid clinical decision-making as it can be completed in less than 40 seconds, which in turn could potentially improve testing and treatment for CHE.
What You Need to Know.
Background
Covert hepatic encephalopathy (CHE) is an epidemic neuro-cognitive complication of cirrhosis, which worsens outcomes but is often undiagnosed due to logistic constraints. The gold standard paper-pencil test, the psychometric hepatic encephalopathy score (PHES), is difficult to implement, and while the total EncephalApp, which requires less time, has been validated against it, shorter versions to further enhance CHE diagnosis are needed.
Findings
In 492 outpatients with cirrhosis (398 in the derivation and 94 in the validation cohort) who underwent the PHES gold standard and the traditional total EncephalApp, we found that only 2 runs in the easier Off state of the EncephalApp were accurate to diagnose CHE. Regression formulae adjusted for age, gender, and educational status showed that the QuickStroop (first 2 runs of EncephalApp) provided, within one minute, accuracy equivalent to the longer total EncephalApp using PHES as the gold standard.
Implications for Patient care
Using the QuickStroop that requires less than a minute has the potential to improve the rate of screening for and potential treatment of covert hepatic encephalopathy. This enhanced testing could reduce the negative consequences of untreated covert hepatic encephalopathy and improve monitoring of these patients over time.
Grant support:
Partly supported by 2I0CX001076 VA Merit Review, R21TR003095 and RO1HS025412 to JSB.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Data Transparency Statement: Due to privacy concerns, individual data will not be made available.
Disclosures: None for any author
References:
- 1. Vilstrup H, Amodio P, Bajaj J, et al. Hepatic encephalopathy in chronic liver disease: 2014 Practice Guideline by the American Association for the Study of Liver Diseases and the European Association for the Study of the Liver. Hepatology 2014;60:715–35.
- 2. Romero-Gomez M, Boza F, Garcia-Valdecasas MS, et al. Subclinical hepatic encephalopathy predicts the development of overt hepatic encephalopathy. Am J Gastroenterol 2001;96:2718–23.
- 3. Roman E, Cordoba J, Torrens M, et al. Minimal hepatic encephalopathy is associated with falls. Am J Gastroenterol 2011;106:476–82.
- 4. Patidar KR, Thacker LR, Wade JB, et al. Covert hepatic encephalopathy is independently associated with poor survival and increased risk of hospitalization. Am J Gastroenterol 2014;109:1757–63.
- 5. Allampati S, Duarte-Rojo A, Thacker LR, et al. Diagnosis of Minimal Hepatic Encephalopathy Using Stroop EncephalApp: A Multicenter US-Based, Norm-Based Study. Am J Gastroenterol 2016;111:78–86.
- 6. Bajaj JS, Heuman DM, Sterling RK, et al. Validation of EncephalApp, Smartphone-Based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy. Clin Gastroenterol Hepatol 2014.
- 7. Zeng X, Li XX, Shi PM, et al. Utility of the EncephalApp Stroop Test for covert hepatic encephalopathy screening in Chinese cirrhotic patients. J Gastroenterol Hepatol 2019;34:1843–1850.
- 8. Weissenborn K, Ennen JC, Schomerus H, et al. Neuropsychological characterization of hepatic encephalopathy. J Hepatol 2001;34:768–73.
- 9. Zeng X, Zhang LY, Liu Q, et al. Combined Scores from the EncephalApp Stroop Test, Number Connection Test B, and Serial Dotting Test Accurately Identify Patients With Covert Hepatic Encephalopathy. Clin Gastroenterol Hepatol 2020;18:1618–1625 e7.
- 10. Riggio O, Ridola L, Pasquale C, et al. A simplified psychometric evaluation for the diagnosis of minimal hepatic encephalopathy. Clin Gastroenterol Hepatol 2011;9:613–6 e1.
- 11. Zhang XD, Zhang LJ, Wu SY, et al. Multimodality magnetic resonance imaging in hepatic encephalopathy: an update. World J Gastroenterol 2014;20:11262–72.
- 12. Ahluwalia V, Wade JB, White MB, et al. Rifaximin Improves Brain Edema and Working Memory in Minimal Hepatic Encephalopathy: A Prospective fMRI Study. Hepatology 2012;56:162A.
- 13. Luria AR. The working brain; an introduction to neuropsychology. New York: Basic Books, 1976.
- 14. Labenz C, Toenges G, Huber Y, et al. Development and Validation of a Prognostic Score to Predict Covert Hepatic Encephalopathy in Patients With Cirrhosis. Am J Gastroenterol 2019;114:764–770.
- 15. Campagna F, Montagnese S, Ridola L, et al. The animal naming test: An easy tool for the assessment of hepatic encephalopathy. Hepatology 2017;66:198–208.
- 16. Lauridsen MM, Jepsen P, Wernberg CW, et al. Validation of a Simple Quality-of-Life Score for Identification of Minimal and Prediction of Overt Hepatic Encephalopathy. Hepatol Commun 2020;4:1353–1361.
- 17. Bajaj JS, Etemadian A, Hafeezullah M, et al. Testing for minimal hepatic encephalopathy in the United States: An AASLD survey. Hepatology 2007;45:833–4.