Mayo Clinic Proceedings. 2009 Aug;84(8):694–701. doi: 10.4065/84.8.694

Validity of the FOUR Score Coma Scale in the Medical Intensive Care Unit

Vivek N Iyer, Jayawant N Mandrekar, Richard D Danielson, Alexander Y Zubkov, Jennifer L Elmer, Eelco F M Wijdicks
PMCID: PMC2719522  PMID: 19648386

Abstract

OBJECTIVE: To evaluate the validity of the FOUR (Full Outline of UnResponsiveness) score (ranging from 0 to 16), a new coma scale consisting of 4 components (eye response, motor response, brainstem reflexes, and respiration pattern), when used by the staff members of a medical intensive care unit (ICU).

PATIENTS AND METHODS: This interobserver agreement study prospectively evaluated the use of the FOUR score to describe the condition of 100 critically ill patients from May 1, 2007, to April 30, 2008. We compared the FOUR score to the Glasgow Coma Scale (GCS) score. For each patient, the FOUR score and the GCS score were determined by a randomly selected staff pair (nurse/fellow, nurse/consultant, fellow/fellow, or fellow/consultant). Pairwise weighted κ values were calculated for both scores for each observer pair.

RESULTS: The interrater agreement with the FOUR score was excellent (weighted κ: eye response, 0.96; motor response, 0.97; brainstem reflex, 0.98; respiration pattern, 1.00) and similar to that obtained with the GCS (weighted κ: eye response, 0.96; motor response, 0.97; verbal response, 0.98). In terms of the predictive power for poor neurologic outcome (Modified Rankin Scale score, 3-6), the area under the receiver operating characteristic curve was 0.75 for the FOUR score and 0.76 for the GCS score. The mortality rate for patients with the lowest FOUR score of 0 (89%) was higher than that for patients with the lowest GCS score of 3 (71%).

CONCLUSION: The interrater agreement of FOUR score results was excellent among medical intensivists. In contrast to the GCS, all components of the FOUR score can be rated even when patients have undergone intubation. The FOUR score is a good predictor of the prognosis of critically ill patients and has important advantages over the GCS in the ICU setting.




FOUR = Full Outline of UnResponsiveness; GCS = Glasgow Coma Scale; ICU = intensive care unit

Assessing impaired consciousness in the medical and surgical intensive care unit (ICU) is very difficult. The complexity of such an assessment relates in part to the difficulty of finding usable terminology, as illustrated in an earlier study in which 3 observers variously described a single patient as “somnolent,” “difficult to arouse,” and “deeply comatose.”1 In recognition of this problem, Teasdale and Jennett1 devised the Glasgow Coma Scale (GCS) in 1974 in an attempt to bring uniformity to the clinical examination and to clinical communication about the level of consciousness.

The GCS has become a fixture in the initial assessment of abnormal consciousness but is not designed to capture distinct details of the neurologic examination. The GCS has been routinely used in medical and surgical ICUs and is commonly used in the Acute Physiology and Chronic Health Evaluation (APACHE) scoring system. However, its reliability in predicting patient outcomes is unsatisfactory, particularly with regard to the verbal component.2 Other investigators have found additional shortcomings of the GCS and have suggested that adding measures of brainstem reflexes to the GCS could provide better prognostic information.3 Rowley and Fielding4 found that the reliability of the GCS increases with the experience of its users and that user inexperience is associated with a high rate of errors.

We have developed a new coma scale, the Full Outline of UnResponsiveness (FOUR) score. Although the FOUR score is based on the bare minimum of tests necessary for assessing a patient with altered consciousness, it includes much important information that is not assessed by the GCS, including measurement of brainstem reflexes; determination of eye opening, blinking, and tracking; a broad spectrum of motor responses; and the presence of abnormal breath rhythms and a respiratory drive. Because the FOUR score, unlike the GCS, does not include an assessment of verbal response, it is more useful for assessing critically ill patients who have undergone intubation.

The FOUR score was originally tested with staff members of a neuroscience ICU5 and has been subsequently validated by tests with experienced and inexperienced neuroscience ICU nurses.6 To determine whether the FOUR score is equally suited for use by intensivists, fellows, residents, and nurses without a neuroscience background, we prospectively tested the validity of the FOUR score coma scale when used by staff members of a medical ICU.

PATIENTS AND METHODS

A prospective observational study design was used to validate the FOUR score. A total of 18 nurses, 10 fellows, and 5 consultants from the ICU staff volunteered to serve as raters for the study. They were oriented to the study's aims and design during a 30-minute teaching session that included videotape clips demonstrating the determination of the FOUR score with actual patients. The raters had no formal neuroscience training and had not worked in a neuroscience ICU before participating in this study.

Patients with abnormal consciousness were recruited from all ICUs of Mayo Clinic's Saint Marys Hospital during a 1-year period from May 1, 2007, to April 30, 2008. All nonsedated or nonparalyzed patients admitted to any of the ICUs were eligible for participation and were grouped into 4 broad categories of consciousness: alert (fully aware and awake), drowsy (responds to loud voice only), stuporous (responds briefly but only after noxious stimuli), or comatose (eyes closed and no localization of pain stimuli). Informed consent was obtained from patients or proxy. The study was reviewed and approved by the Mayo Clinic Institutional Review Board.

A randomization sheet was used to select the rater pair (fellow/fellow, fellow/nurse, fellow/consultant, or nurse/consultant) that would assess the patient. Within the same hour, each evaluator in the pair recorded a FOUR score and a GCS score for the patient. Each evaluator was given a worksheet that outlined the components of the FOUR score and the GCS score. For patients who had undergone intubation, the lowest GCS verbal score was used both for scoring and for data analyses. This approach provided standardization to the otherwise subjective nature of the verbal score for patients who had undergone intubation.

Description of the FOUR Score

The FOUR score has 4 components: eye responses, motor responses, brainstem reflexes, and respiration pattern. Each component has a maximal value of 4 (Figure 1). Assessing all components of this score usually takes only a few minutes.5 The eye response component of the FOUR score allows differentiation between a vegetative state (eyes open but do not track) and a locked-in syndrome (eyes open, blink, and track vertically on command). The motor assessment component of the FOUR score combines the withdrawal reflex and decorticate rigidity responses because these conditions are often difficult to distinguish clinically. The motor component includes a complex command (the patient is asked to produce a thumbs-up hand signal, a fist, and the peace sign) that determines whether patients are alert.7 Similarly, the motor component of the FOUR score can detect signs of severe cerebral dysfunction, such as myoclonic status epilepticus. Such dysfunction is often a poor prognostic sign for patients with suspected anoxic brain injury.8 The brainstem components of the FOUR score assess the pons, the mesencephalon, and the medulla oblongata in various combinations. The FOUR score also includes an assessment of Cheyne-Stokes respiration and irregular breathing; such signs can indicate bihemispheric or lower brainstem dysfunction of respiratory control. For patients who have undergone intubation, the FOUR score records the presence or absence of a respiratory drive.

FIGURE 1.


Description of Full Outline of UnResponsiveness (FOUR) score. Eye response: E4 = eyelids open or opened, tracking, or blinking to command; E3 = eyelids open but not tracking; E2 = eyelids closed but open to loud voice; E1 = eyelids closed but open to pain; E0 = eyelids remain closed with pain. Motor response: M4 = thumbs-up, fist, or peace sign; M3 = localizing to pain; M2 = flexion response to pain; M1 = extension response to pain; M0 = no response to pain or generalized myoclonus status. Brainstem reflexes: B4 = pupil and corneal reflexes present; B3 = one pupil wide and fixed; B2 = pupil or corneal reflexes absent; B1 = pupil and corneal reflexes absent; B0 = absent pupil, corneal, and cough reflex. Respiration pattern: R4 = not intubated, regular breathing pattern; R3 = not intubated, Cheyne-Stokes breathing pattern; R2 = not intubated, irregular breathing; R1 = breathes above ventilator rate; R0 = breathes at ventilator rate or apnea.
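As an illustration of how the four components combine, the following minimal Python sketch records one rating and derives the total. The class name FourScore, its field names, and the label format are hypothetical conveniences, not part of the published scale; the only facts taken from the text are that each component ranges from 0 to 4 and that the total ranges from 0 to 16.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourScore:
    """One FOUR score rating (hypothetical container, not from the paper)."""

    eye: int
    motor: int
    brainstem: int
    respiration: int

    def __post_init__(self):
        # Each component is graded from 0 to 4.
        for name in ("eye", "motor", "brainstem", "respiration"):
            value = getattr(self, name)
            if not isinstance(value, int) or not 0 <= value <= 4:
                raise ValueError(f"{name} must be an integer from 0 to 4, got {value!r}")

    @property
    def total(self) -> int:
        # Components are summed without weighting, so the total is 0 to 16.
        return self.eye + self.motor + self.brainstem + self.respiration

    def label(self) -> str:
        # Compact notation following the E/M/B/R prefixes used in Figure 1.
        return f"E{self.eye} M{self.motor} B{self.brainstem} R{self.respiration}"
```

For example, FourScore(eye=3, motor=2, brainstem=4, respiration=1) describes an intubated patient whose eyes are open but not tracking, and its total is 10.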

Outcome Assessment

Data on in-hospital mortality and clinical diagnosis of brain death were recorded for all patients. Morbidity was assessed at 3 months with the Modified Rankin Scale.9 Briefly, a Rankin score of 0 indicates no symptoms; a score of 1, no evident disability despite symptoms; a score of 2, slight disability, with an inability to carry out all previous activities; a score of 3, moderate disability, with the need for some help but the ability to walk without assistance; a score of 4, moderately severe disability, with the inability to walk without assistance or to attend to bodily needs without assistance; a score of 5, severe disability, with the patient being bedridden and incontinent and requiring constant nursing care; and a score of 6, death.

Statistical Analyses

For both the FOUR score and the GCS score, pairwise weighted κ values (for each observer pair), overall weighted κ values, and intraclass correlation values were calculated. A κ statistic of 0.4 or lower is considered poor; a value between 0.4 and 0.6, fair to moderate; a value between 0.6 and 0.8, good interobserver agreement; and a value higher than 0.8, excellent agreement. Cronbach α was calculated for each score as an assessment of internal consistency, and Spearman correlation coefficients were calculated between the FOUR score and the GCS score as an assessment of construct validity.
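The weighted κ and the interpretation bands described above can be sketched in Python as follows. This is an illustrative sketch only: the text does not state which weighting scheme was used, so the linear weights here (with quadratic as an option) are an assumption, as are the function names and the handling of values that fall exactly on a band boundary.

```python
def weighted_kappa(rater_a, rater_b, n_categories, weights="linear"):
    """Cohen's weighted kappa for two raters scoring the same patients.

    Ratings are integers from 0 to n_categories - 1. The weighting
    scheme is an assumption; the paper does not specify it.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, nonzero number of ratings")
    n = len(rater_a)
    k = n_categories

    # Joint proportion table: observed[i][j] is the share of patients
    # rated i by the first rater and j by the second.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    obs = sum(disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - obs / exp


def interpret_kappa(kappa):
    """Map kappa to the bands given in the text (boundary handling assumed)."""
    if kappa <= 0.4:
        return "poor"
    if kappa <= 0.6:
        return "fair to moderate"
    if kappa <= 0.8:
        return "good"
    return "excellent"
```

Each FOUR score component has 5 categories (0 to 4), so n_categories would be 5 for a component-level comparison of one rater pair.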

The sensitivity and specificity of the total FOUR score and the total GCS score in predicting in-hospital mortality and morbidity were compared by a logistic regression model controlling for age, sex, and alertness. The area under the receiver operating characteristic curve was calculated for each model. The association between the outcomes of interest (in-hospital death, a Rankin score of 3-6) and the total scores (FOUR score, GCS score) was displayed graphically by scatter plots with superimposed local regression smoothers. Model-based smoothing with generalized additive models was used to obtain the estimates required for generating the scatter plots and the corresponding 95% confidence intervals. Generalized additive models were used because of the flexibility they offer in modeling additive nonlinear associations between the predictor variables and the outcome.
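For readers unfamiliar with the area under the receiver operating characteristic curve, the sketch below computes it by direct pairwise comparison, which is equivalent to the rank-based (Mann-Whitney) formulation. It is a simplification of the analysis described above: it uses a single raw score rather than a logistic regression model adjusted for age, sex, and alertness, and the function name and example values are hypothetical.

```python
def auc(risk_scores, outcomes):
    """Area under the ROC curve by pairwise comparison.

    Returns the probability that a randomly chosen patient with the
    outcome has a higher risk score than a randomly chosen patient
    without it; ties count as one half.
    """
    pos = [s for s, y in zip(risk_scores, outcomes) if y]
    neg = [s for s, y in zip(risk_scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("need at least one patient with and one without the outcome")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study. A lower coma score
# means a higher risk, so the score is negated before ranking.
toy_totals = [0, 4, 8, 16]
toy_died = [1, 0, 1, 0]
toy_auc = auc([-t for t in toy_totals], toy_died)  # 0.75
```

A value of 0.5 corresponds to chance discrimination and 1.0 to perfect separation of the two outcome groups.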

RESULTS

Baseline Data

Our study involved 55 men and 45 women with a mean ± SD age of 63.0±18.4 years (range, 18-94 years). At the time of evaluation, 46 of the patients were comatose, 6 were stuporous, 14 were drowsy, and 34 were alert. The 34 alert patients had a variety of medical illnesses (cirrhosis, exacerbation of chronic obstructive pulmonary disease, diabetic ketoacidosis, ischemic heart disease, septic shock, hypertensive crisis, and esophageal perforation). For the remaining 66 patients, diagnoses included cerebral hemorrhage (n=12), anoxic-ischemic brain injury (n=11), ischemic stroke (n=10), subarachnoid hemorrhage (n=7), craniotomy (n=7), metabolic encephalopathy (n=6), seizures (n=5), meningitis or encephalitis (n=5), and traumatic brain injury (n=3) (Table 1).

TABLE 1.

Diagnostic Categories of 100 Patients Involved in Validation Study of the FOUR Score


The frequency of Modified Rankin Scale scores was as follows: a score of 0, 19 patients; a score of 1, 6 patients; a score of 2, 9 patients; a score of 3, 4 patients; a score of 4, 18 patients; a score of 5, 11 patients; and a score of 6, 33 patients. All of the in-hospital deaths (other than brain death) resulted from withdrawal of care by family members faced with a catastrophic neurologic outcome.
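The frequencies above can be tallied against the outcome definition used in this study (a Rankin score of 3 to 6 counts as a poor neurologic outcome). The short Python sketch below does that arithmetic; the variable and function names are hypothetical, and the counts are copied from the preceding paragraph.

```python
# Modified Rankin Scale score -> number of patients, as reported above.
mrs_counts = {0: 19, 1: 6, 2: 9, 3: 4, 4: 18, 5: 11, 6: 33}


def is_poor_outcome(mrs_score):
    """Poor neurologic outcome as defined in this study: Rankin score 3-6."""
    if not 0 <= mrs_score <= 6:
        raise ValueError("Modified Rankin Scale scores range from 0 to 6")
    return mrs_score >= 3


total_patients = sum(mrs_counts.values())
poor_outcome = sum(n for score, n in mrs_counts.items() if is_poor_outcome(score))
deaths = mrs_counts[6]  # a score of 6 indicates death
```

The tallies come to 100 patients in total, 66 with a poor outcome, and 33 deaths.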

Interrater Reliability of the FOUR Score

The distribution of all 200 ratings (2 ratings for each patient, 1 from each member of the observer pair) of the FOUR score and the GCS score is shown in Figures 2 and 3, respectively. There was a high degree of internal consistency for both the FOUR score (Cronbach α, 0.87 for both the first and the second rater) and the GCS score (Cronbach α, 0.87 for both the first and the second rater). Spearman correlation coefficients for the FOUR score and the GCS score were high (ρ=.98 for the first rater; ρ=.92 for the second rater).
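Cronbach α, the internal consistency measure reported above, compares the variance of the individual components with the variance of the total score. The following Python sketch shows the standard formula; the function name and input layout are assumptions for illustration, and no study data are reproduced.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a multi-component scale.

    `items` is a list of k lists, one per component (for the FOUR score:
    eye, motor, brainstem, respiration), each holding that component's
    value for the same n ratings, in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two components")
    n = len(items[0])
    if n < 2 or any(len(column) != n for column in items):
        raise ValueError("every component needs the same number (>= 2) of ratings")

    # Total score for each rating is the sum across components.
    totals = [sum(values) for values in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when the total score does not vary")

    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

Values near 1 indicate that the components move together across patients; the result does not depend on whether sample or population variance is used, provided the same one is used throughout.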

FIGURE 2.


Distribution of total Full Outline of UnResponsiveness (FOUR) scores and scores for eye response, motor response, brainstem reflexes, and respiration pattern.

FIGURE 3.


Distribution of total Glasgow Coma Scale scores and scores for eye response, motor response, and verbal response.

The overall intraclass correlation score was 0.99 (0.99-0.99) for the FOUR score and 0.98 (0.98-0.99) for the GCS score. The rater agreement was good to excellent among all rater pairs (Table 2). Six patients were declared brain dead, and 1 patient had a locked-in syndrome. Two patients had myoclonic status epilepticus and received a score of 0 on the motor component of the FOUR score. For 156 (78%) of the 200 ratings, the brainstem component of the FOUR score received the maximal score. As was true in earlier studies, the distribution of the eye and motor components of the FOUR score was comparable to their distribution in the GCS score. A GCS score of 3 was recorded for 45 (23%) of the 200 ratings; for 18 (40%) of these 45 GCS ratings, the lowest possible FOUR score of 0 was assigned. The FOUR score provided additional discrimination for the remaining 27 ratings (60%), with total scores ranging from 1 to 8.

TABLE 2.

Interobserver Agreement of Rater Pairs With the FOUR Score and the GCS Score as Indicated by Weighted κ Values


The neurologic outcome of 66 patients was poor, as evidenced by a Rankin score of 3 to 6. In all, 33 patients died, including 6 patients who were declared brain dead. For every 1-point increase in the total FOUR score, the odds of in-hospital mortality were reduced by an estimated 15% (odds ratio, 0.75; 95% confidence interval, 0.68-0.84) (Table 3). Similarly, every 1-point increase in the total FOUR score was associated with an 18% reduction in the odds of a poor neurologic outcome, as defined by a Rankin score of 3 to 6. Both of these associations remained statistically significant after the analyses were adjusted for age, sex, and alertness.

TABLE 3.

Comparison of Predictions of Outcome (In-Hospital Death and Modified Rankin Scale Score of 3-6) by the FOUR Score and the GCS Score


In the unadjusted model for the GCS, each 1-point increase in the total GCS score was associated with an estimated 17% reduction in the odds of in-hospital mortality. Likewise, each 1-point increase in the GCS score was associated with an 18% reduction in the odds of an adverse neurologic outcome, as defined by a Rankin score of 3 to 6. These associations persisted after the analyses were adjusted for age, sex, and alertness.

We charted the receiver operating characteristic curves to compare the predictive power of the 2 scales for in-hospital death. The area under the curve for the FOUR score was 0.86; that for the GCS was 0.82. Similarly, calculations of the predictive power for poor neurologic outcome (Rankin score, 3-6) showed that the area under the curve was 0.75 for the FOUR score and 0.76 for the GCS score.

The association between the outcome and the total score can be further shown by the use of scatter plots with superimposed local regression smoothers (Figure 4). A model-based smoothing with generalized additive models was used in this approach. The probability of in-hospital death at the lowest FOUR score was higher than that at the lowest GCS score. This finding was evidenced by the fact that patients with the lowest GCS score of 3 exhibited a wide range of FOUR scores (0-8) and that 8 (89%) of the 9 patients with the lowest FOUR scores died. In comparison, of the 21 patients with the lowest GCS score, 15 (71%) died.

FIGURE 4.


Scatter plots with superimposed local regression smoothers and 95% confidence intervals showing association of the Full Outline of UnResponsiveness (FOUR) score and the Glasgow Coma Scale score with mortality and morbidity (defined as a Modified Rankin Scale [MRS] score of 3-6).

DISCUSSION

The results of this prospective study show that the FOUR score coma scale maintains a high degree of internal consistency and interrater reliability among medical intensivists, including nursing staff at all levels of experience, fellows, and consultants. The level of interobserver agreement found in the current study was slightly higher than that found by the first validation study, which tested neuroscience ICU staff members.5 These results are remarkable, particularly because the raters in this study were not trained to recognize neurologic signs. Our finding of no or only slight interrater variance provides evidence that the FOUR score can be used outside a neuroscience ICU.

An ideal coma scale would be valid (measures what it is supposed to measure), reliable (yields the same results with repeated testing), linear (gives all components equal weight), and easy to use (provides simple instructions without the need for tools or cards). In addition, these scales may predict outcome, although mortality rates in the ICU are confounded by withdrawal of life support.

Although the GCS has been widely used in hospital settings and is considered a standard assessment tool, it has a number of shortcomings. The usefulness of a verbal component in assessing level of consciousness can be questioned. First, the verbal component of the GCS tests primarily orientation, which quickly becomes abnormal in agitated and confused patients without impaired consciousness. Conversely, many patients with little or no verbal response are alert. Moreover, the verbal response component of the GCS cannot be assessed in critically ill patients who have undergone intubation; in fact, verbal response could not be reliably assessed in 45 of the 100 patients in our study. Second, and most importantly, the GCS does not assess brainstem reflexes, eye movements, or complex motor responses in patients with altered consciousness (validity). Furthermore, the GCS score is numerically skewed toward motor responses (linearity). These shortcomings have prompted several earlier attempts to improve on the GCS: the Reaction Level Scale (RLS85),10 the Comprehensive Level of Consciousness Scale (CLOCS),11 the Clinical Neurologic Assessment Tool (CNA),12 the Coma Recovery Scale (CRS),13 the Glasgow-Liège Scale (GLS),14 the Innsbruck Coma Scale (ICS),15 and the 60-Second Test (SST).16 These tests are lengthy, including as many as 21,11 28,17 or 3518 testable components. None of these scales has gained sufficient traction to become a substitute for the GCS.

The FOUR score aims to overcome these shortcomings with a scale that is both simple to use and comprehensive in its overall neurologic assessment of the comatose or stuporous patient. The 4 components of the FOUR score (eye response, motor response, brainstem reflexes, and respiration pattern) are equally weighted. This scale is easy to remember because it contains 4 components, each with a maximal score of 4. Brainstem reflexes are included for a full and accurate assessment of the depth of coma. The FOUR score is particularly useful for patients with acute metabolic derangements, sepsis, or shock or with other nonstructural brain injuries because it detects early changes in consciousness (eg, inability to follow specific commands, inability to track examiner's finger movements, and Cheyne-Stokes respiration). The FOUR score is also far more useful than the GCS for patients who have experienced a catastrophic neurologic event as a complication of medical illness or surgery. In addition, the GCS performs poorly in assessing patients with less severe degrees of coma, such as those seen in the medical ICU.4 The frequent use of mild sedation in the medical and surgical ICU could affect eye opening and motor response but not brainstem reflexes and respiration. In contrast, all 3 components of the GCS are affected by sedation.

Our study has several limitations. Because this is the fourth validation study of the FOUR score to be performed at Mayo Clinic,5,6,19 the familiarity of raters with the system and their enthusiasm in using it may have increased the level of agreement among raters. This study included fewer patients than our initial study and did not perform comparisons of nurse/nurse ratings. However, our earlier validation study documented that the ratings of ICU nurses without a neuroscience background agree with those of neuroscience ICU nurses, regardless of experience.6

CONCLUSION

The FOUR score can be used in a variety of ICU settings. It is easily taught, is simple to administer, and provides essential neurologic information that allows an accurate assessment of patients with altered consciousness. The FOUR score accurately predicts which patients will have a poor outcome and can detect the occurrence of brain death in a critically ill patient. In addition, the FOUR score can diagnose a locked-in syndrome mimicking coma and can test the vigilance of the patient by using simple hand signals. In contrast, the GCS cannot assess these conditions because it uses only eye opening and motor response to pain as measures of impaired consciousness in intubated patients. The FOUR score has the potential to become an important measure in prospective clinical studies.

REFERENCES

1. Teasdale G, Jennett B. Assessment of coma and impaired consciousness: a practical scale. Lancet 1974;2(7872):81-84
2. Kho ME, McDonald E, Stratford PW, Cook DJ. Interrater reliability of APACHE II scores for medical-surgical intensive care patients: a prospective blinded study. Am J Crit Care 2007;16(4):378-383
3. Born JD, Albert A, Hans P, Bonnal J. Relative prognostic value of best motor response and brain stem reflexes in patients with severe head injury. Neurosurgery 1985;16(5):595-601
4. Rowley G, Fielding K. Reliability and accuracy of the Glasgow Coma Scale with experienced and inexperienced users. Lancet 1991;337(8740):535-538
5. Wijdicks EF, Bamlet WR, Maramattom BV, Manno EM, McClelland RL. Validation of a new coma scale: the FOUR score. Ann Neurol. 2005;58(4):585-593
6. Wolf CA, Wijdicks EF, Bamlet WR, McClelland RL. Further validation of the FOUR score coma scale by intensive care nurses. Mayo Clin Proc. 2007;82(4):435-438
7. Wijdicks EF, Kokmen E, O'Brien PC. Measurement of impaired consciousness in the neurological intensive care unit: a new test. J Neurol Neurosurg Psychiatry 1998;64(1):117-119
8. Wijdicks EF, Parisi JE, Sharbrough FW. Prognostic value of myoclonus status in comatose survivors of cardiac arrest. Ann Neurol. 1994;35(2):239-243
9. van Swieten JC, Koudstaal PJ, Visser MC, Schouten HJ, van Gijn J. Interobserver agreement for the assessment of handicap in stroke patients. Stroke 1988;19(5):604-607
10. Starmark JE, Stalhammar D, Holmgren E. The Reaction Level Scale (RLS85): manual and guidelines. Acta Neurochir (Wien) 1988;91(1-2):12-20
11. Stanczak DE, White JG III, Gouvier WD, et al. Assessment of level of consciousness following severe neurological insult: a comparison of the psychometric qualities of the Glasgow Coma Scale and the Comprehensive Level of Consciousness Scale. J Neurosurg. 1984;60(5):955-960
12. Crosby L, Parsons LC. Clinical neurologic assessment tool: development and testing of an instrument to index neurologic status. Heart Lung 1989;18(2):121-129
13. Giacino JT, Kezmarsky MA, DeLuca J, Cicerone KD. Monitoring rate of recovery to predict outcome in minimally responsive patients. Arch Phys Med Rehabil. 1991;72(11):897-901
14. Born JD. The Glasgow-Liège Scale: prognostic value and evolution of motor response and brain stem reflexes after severe head injury. Acta Neurochir (Wien) 1988;91(1-2):1-11
15. Benzer A, Mitterschiffthaler G, Marosi M, et al. Prediction of non-survival after trauma: Innsbruck Coma Scale. Lancet 1991;338(8773):977-978
16. Mayer SA, Dennis LJ, Peery S, et al. Quantification of lethargy in the neuro-ICU: the 60-Second Test. Neurology 2003;61(4):543-545
17. Benzer A, Traweger C, Ofner D, Marosi M, Luef G, Schmutzhard E. Statistical modelling in analysis of outcome after trauma Glasgow-Coma-Scale and Innsbruck-Coma-Scale. Anasthesiol Intensivmed Notfallmed Schmerzther 1995;30(4):231-235
18. Segatore M, Way C. The Glasgow Coma Scale: time for change. Heart Lung 1992;21(6):548-557
19. Stead LG, Wijdicks EF, Bhagra A, et al. Validation of a new coma scale, the FOUR score, in the emergency department. Neurocrit Care 2009;10(1):50-54

