Abstract
Gaudiani et al. (2022) presented terminal anorexia nervosa (T-AN) as a potential new specifier to the anorexia nervosa (AN) diagnosis, with criteria including (a) AN diagnosis, (b) age > 30 years, (c) previously participated in high-quality care, and (d) the clear, consistent determination by a patient with decision-making capacity that additional treatment would be futile, knowing death will result. This study’s purpose was to empirically examine a subgroup of participants with AN who met the first three criteria of T-AN—and a smaller subset who also met a proxy index of the fourth criterion involving death (TD-AN)—and compare them to an adult “not terminal” anorexia nervosa (NT-AN) group and to a “not terminal” subset 30 years of age or older (NTO-AN). Patients at U.S. eating disorder treatment facilities (N = 782; T-AN: n = 51, TD-AN: n = 16, NT-AN: n = 731, NTO-AN: n = 133), all of whom met criteria for a current Diagnostic and Statistical Manual of Mental Disorders, 5th Edition diagnosis of AN, were compared regarding admission, discharge, and changes from admission to discharge on physiological indices (i.e., white blood cell counts, albumin levels, aspartate aminotransferase levels, and body mass index), as well as self-report measures (i.e., eating disorder, depression, anxiety, and obsessive-compulsive symptoms). In contrast to the tight syndromal symptom interconnections of, and inevitable spiral toward death expected for, a terminal diagnosis, results suggest substantial variability within the T-AN group and TD-AN subset, and an overall trend of improvement across physiological and self-report measures. This study thus provides some empirical evidence against the specification of the T-AN diagnosis.
Keywords: terminal anorexia nervosa, anorexia nervosa, eating disorder features
Anorexia nervosa (AN) is a severe psychiatric disorder characterized by significant restriction of food intake leading to objectively low body weight, fear of weight gain, and body image disturbance (American Psychiatric Association [APA], 2013). AN has the highest mortality rate of any psychiatric illness other than opiate use disorder; outcomes tend to be suboptimal (Birmingham et al., 2005; Harbottle et al., 2008), with more than half of individuals failing to achieve remission after treatment (Watson & Bulik, 2013). Even the best evidence-based treatments for AN may confer little advantage over comparator treatments in the long term (Murray et al., 2019). Moreover, a review from the early 21st century reported that outcomes for AN have not improved significantly in the prior 50 years (Steinhausen, 2002).
A high proportion of individuals with AN develop a chronic course, with some studies suggesting that approximately 20% fall into this category (Steinhausen, 2002). There is a growing body of literature on the concept of severe and enduring AN (SE-AN), which has been defined as a form of AN with a debilitating, persistent course that may not respond to treatment (Touyz & Hay, 2015; Touyz et al., 2013; Wonderlich et al., 2020). While there is no formal consensus on a definition of SE-AN, Hay and Touyz (2018) proposed the following criteria: (a) clinically significant functional impact; (b) at least a 3-year duration of illness; and (c) exposure to at least two evidence-based treatments. While SE-AN is clearly a debilitating illness and many individuals do not recover or may die due to complications of AN, some work suggests that most individuals with a protracted illness course do recover (Eddy et al., 2017).
Gaudiani et al. (2022) proposed a presentation more extreme than SE-AN, terminal AN (T-AN). The authors define those with T-AN as: (a) having a diagnosis of AN; (b) being age 30 or older; (c) having persistently engaged in high-quality, multidisciplinary care; and (d) persistent denial of further treatment, knowing death is the inevitable outcome, while having intact decision-making abilities (Gaudiani et al., 2022). These criteria overlap somewhat with those proposed for SE-AN, including history of engagement in treatment and chronicity (defined in SE-AN by duration of illness and in T-AN by age), but the concept of T-AN substantially diverges from SE-AN in the conceptualization of death as an inevitable option for those with a poor prognosis. Gaudiani et al. (2022) described three individuals who died of chronic AN and proposed that individuals who meet their definition of T-AN should have a right to medical aid in dying in states where it is legal.
To the best of our knowledge, six papers thus far have already been published in response, raising several concerns with this newly discussed construct of T-AN. First, the ability to define “terminal” in the context of AN was raised as a potential problem (Crow et al., 2023; Guarda et al., 2022; Riddle et al., 2022). Second, Riddle et al. (2022) and Guarda et al. (2022) suggest that some individuals with T-AN, as defined by Gaudiani et al. (2022), may have a treatable illness. Third, others discussed the potential dangers of a terminal label for AN, particularly given its implications regarding hopelessness, and for a terminal label becoming something that individuals with AN strive for given the ego-syntonic nature of the disorder (Downs et al., 2023; Elwyn, 2023; Guarda et al., 2022). With these concerns in mind, Yager et al. (2022) published a response, suggesting that T-AN can be adequately defined, AN can be both treatable and terminal, and that the terminal label itself is not dangerous.
Although many papers have been published in response to Gaudiani et al.’s (2022) proposition, to our knowledge, no one has examined objective indicators of terminality that would be observed if T-AN is a valid categorization. We recognize that the field’s lack of an empirical approach to T-AN may be due to its inherently non-empirical, philosophical qualities, and that our approach herein should be contextualized within that view. Nonetheless, in untreatable instances of a terminal illness, one is likely to observe cohering, objective indicators of terminality when, for instance, assessing bloodwork. For example, albumin levels among patients with end-stage cancer and human immunodeficiency virus (HIV) are well below healthy ranges (Feldman et al., 2003; Goldwasser & Feldman, 1997; Hall & Cash, 2012; Mehta et al., 2006). Both albumin and aspartate aminotransferase (AST) levels can reflect liver functioning; like low albumin levels, high AST levels may indicate near-death status (Goldwasser & Feldman, 1997; Hall & Cash, 2012). There is a pattern of biological objectivity and increasing network density of disease-related parameters within truly terminal illnesses, parameters that should replicate across the definition of “terminal.”
Furthermore, defining any psychiatric condition as “terminal” has wide-reaching and concerning implications. Importantly, the suggestions made by Gaudiani et al. (2022) could be generalizable to other Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5) diagnoses, such as schizophrenia and bipolar disorder, and thus empirically testing claims such as these is essential to direct clinical and ethical guidelines for treatment (Gaudiani et al., 2022). Socially, identifying any mental illness as “terminal” has significant implications for the patient, their family, their community, and society at large (Downs, 2023). Given the controversy over the construct of T-AN, empirical research is needed to assess the validity and utility of this label. To our knowledge, no study has empirically tested a definition of T-AN by examining whether individuals who meet these criteria form a relatively homogenous group that does not improve with treatment.
In the present study, we sought to further examine the terminality concept within an adult AN sample, operationalized in one set of analyses using three of the four criteria suggested by Gaudiani et al. (2022; i.e., diagnosis of AN, age of 30 or older, and prior persistent engagement in high-quality eating disorder care; T-AN), and in a second set using a subset of T-AN that demonstrated desire for death, a proxy index of the fourth criterion (TD-AN). We sought to compare this “terminal” group and subset (T-AN and TD-AN) to a “not terminal” AN group (NT-AN; i.e., adults [18+] who do not meet all of the first three criteria for terminality), and to a subset of the “not terminal” group who were 30 years of age or older (NTO-AN).
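The group definitions above can be sketched as a small classification routine. This is only an illustration of the stated rules: the `Patient` fields, including the yes/no flags for prior high-quality care and desire for death, are hypothetical names introduced here, and the text does not describe how the study actually coded these criteria.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    has_an_diagnosis: bool         # criterion (a): current AN diagnosis
    prior_high_quality_care: bool  # criterion (c): hypothetical yes/no flag
    desire_for_death: bool         # proxy for criterion (d): hypothetical flag


def assign_groups(p: Patient) -> set[str]:
    """Return the study labels that apply to one patient."""
    if not p.has_an_diagnosis or p.age < 18:
        return set()  # outside the adult AN sample
    meets_first_three = p.age >= 30 and p.prior_high_quality_care
    if meets_first_three:
        labels = {"T-AN"}
        if p.desire_for_death:
            labels.add("TD-AN")  # subset of T-AN
        return labels
    labels = {"NT-AN"}
    if p.age >= 30:
        labels.add("NTO-AN")  # subset of NT-AN aged 30 or older
    return labels
```

Under this reading, TD-AN is always nested within T-AN and NTO-AN within NT-AN, which mirrors the group and subset language used in the text.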
We reasoned that if the “terminal” groups are found to be relatively homogenous regarding biological and clinical variables at admission, changes in (i.e., declining) body mass index (BMI), and worsening scores on measures of eating disorder symptoms and general mental health during treatment compared to the “not terminal” groups, the “terminal” specifier may be warranted. Alternatively, if these “terminal” groups show considerable in-group variability and little or no evidence of deterioration, the “terminal” specifier may not be warranted.
Method
In total, 782 participants (T-AN: 4.9%, n = 51) were included in this retrospective cohort study from July 2020 to June 2023. The eating disorder clinics were located in Denver, Colorado, and Dallas, Texas in the United States. Due to the clinical nature of this study, the amount of discharge data varies (cf. Tables 1 and 2); the level of missing data is notable, and reasonably in line with our expectations due to the time-pressed nature of clinical activity in these settings in general and of discharges in particular (i.e., U.S. insurance payer constraints and patient preferences; see Footnote 1). These facilities predominantly provide inpatient (24/7 care with intensive medical assistance and stabilization), residential (24/7 care with less intensive medical assistance), partial hospitalization program (7 days/week treatment), intensive outpatient (3–5 days/week treatment), and virtual intensive outpatient (3–6 hr/day, 3–5 days/week treatment) levels of care. Each adult participant provided informed consent before completing an online self-report assessment within 5 days of admission and again within 7 days of discharge. The study was approved by the Salus Institutional Review Board. This study was not preregistered. Measures and coding are available by contacting the corresponding author. Data are not available since some information was gathered from patient medical records and participants did not consent to publicly share deidentified data.
Table 1.
Sample Characteristics
| Group demographics | Terminal (n = 51) | | | Not terminal (all; n = 731) | | | Not terminal (30 and over; n = 133) | | |
|---|---|---|---|---|---|---|---|---|---|
| | n | Range | M (SD) | n | Range | M (SD) | n | Range | M (SD) |
| Age | 51 | 30–61 | 42.65 (10.21) | 731 | 18–65 | 25.15 (9.08) | 133 | 30–65 | 41.45 (9.18) |
| Days admitted | 51 | 6–278 | 70.92 (51.33) | 731 | 1–366 | 78.22 (53.57) | 133 | 2–366 | 77.56 (52.60) |
| Diagnosis | n | % | | n | % | | n | % | |
| AN-R | 39 | 76.5 | | 496 | 67.8 | | 84 | 63.2 | |
| AN-BP | 11 | 21.6 | | 235 | 32.1 | | 49 | 36.8 | |
| Level of care | n | % | | n | % | | n | % | |
| Inpatient | 19 | 37.3 | | 232 | 31.7 | | 64 | 48.1 | |
| Residential | 18 | 35.3 | | 288 | 39.4 | | 40 | 30.1 | |
| Partial hospitalization | 10 | 19.6 | | 146 | 20.0 | | 18 | 13.5 | |
| Intensive outpatient | 2 | 3.9 | | 7 | 1.0 | | 0 | | |
| Intensive virtual outpatient | 2 | 3.9 | | 58 | 7.9 | | 11 | 8.3 | |
| Gender | n | % | | n | % | | n | % | |
| Female | 48 | 94.1 | | 616 | 84.3 | | 115 | 86.5 | |
| Male | 1 | 2.0 | | 41 | 5.6 | | 7 | 5.3 | |
| Nonbinary | 1 | 2.0 | | 27 | 3.7 | | 4 | 3.0 | |
| Transgender | 0 | | | 19 | 2.6 | | 0 | | |
| Denied response | 1 | 2.0 | | 28 | 3.8 | | 2 | 1.5 | |
| Race | n | % | | n | % | | n | % | |
| Caucasian | 47 | 92.2 | | 593 | 81.8 | | 114 | 85.7 | |
| Black/African American | 1 | 2.0 | | 13 | 1.8 | | 1 | 0.8 | |
| Hispanic or Latino | 1 | 2.0 | | 33 | 4.5 | | 1 | 0.8 | |
| Asian | 0 | | | 21 | 2.9 | | 3 | 2.3 | |
| American Indian/Native American | 0 | | | 3 | 0.4 | | 0 | | |
| Native Hawaiian/Pacific Islander | 0 | | | 2 | 0.3 | | 1 | 0.8 | |
| Multiracial | 1 | 2.0 | | 17 | 7 | | 3 | 2.3 | |
| Unknown | 1 | 2.0 | | 49 | 6.7 | | 1 | 0.8 | |
Note. AN-R = anorexia nervosa-restricting subtype; AN-BP = anorexia nervosa-binge-purge subtype.
Table 2.
Descriptive Statistics
| Variable | Terminal (n = 51) | | | Not terminal (all; n = 731) | | | Not terminal (30 and over; n = 133) | | |
|---|---|---|---|---|---|---|---|---|---|
| | n | Range | M (SD) | n | Range | M (SD) | n | Range | M (SD) |
| Albumin-A | 45 | 3.3–5.5 | 4.45 (0.58) | 108 | 3.3–5.5 | 4.56 (0.42) | 60 | 3.3–5.5 | 4.50 (0.43) |
| Albumin-D | 29 | 3.7–5.0 | 4.49 (0.32) | 88 | 3.4–5.1 | 4.41 (0.35) | 45 | 3.4–5.1 | 4.39 (0.40) |
| AST-A | 40 | 12–289 | 46.30 (61.65) | 102 | 12–321 | 32.48 (47.85) | 56 | 12–321 | 38.37 (58.79) |
| AST-D | 29 | 9–63 | 24.66 (10.40) | 87 | 9–63 | 23.45 (9.98) | 45 | 9–63 | 26.05 (9.92) |
| BMI-A | 51 | 11.3–33.2 | 18.41 (4.5) | 731 | 11.2–39.8 | 18.38 (3.28) | 133 | 11.4–30.8 | 17.90 (3.36) |
| BMI-D | 51 | 11.7–35.1 | 20.79 (4.22) | 731 | 13.0–39.9 | 21.23 (3.00) | 133 | 15.5–35.9 | 21.18 (3.05) |
| PHQ-9-A | 45 | 0–27 | 16.24 (6.98) | 691 | 0–27 | 16.18 (7.03) | 116 | 0–27 | 14.28 (7.53) |
| PHQ-9-D | 44 | 0–26 | 12.57 (8.22) | 626 | 0–27 | 10.22 (6.95) | 119 | 0–25 | 8.28 (7.08) |
| GAD-7-A | 42 | 0–21 | 13.81 (6.18) | 424 | 0–21 | 13.81 (5.51) | 76 | 0–21 | 12.62 (6.05) |
| GAD-7-D | 41 | 0–21 | 10.73 (6.64) | 436 | 0–21 | 10.11 (6.02) | 83 | 0–21 | 8.67 (6.39) |
| OCI-R-A | 44 | 0–60 | 23.50 (16.79) | 687 | 0–72 | 21.76 (15.92) | 116 | 0–65 | 17.94 (14.45) |
| OCI-R-D | 45 | 0–54 | 17.60 (15.69) | 623 | 0–65 | 17.14 (14.81) | 120 | 0–60 | 13.13 (13.05) |
| EDE-Q-G-A | 45 | 0–5.8 | 3.88 (1.84) | 694 | 0–6 | 3.86 (1.56) | 115 | 0.05–5.9 | 3.45 (1.71) |
| EDE-Q-G-D | 45 | 0–6 | 2.75 (1.72) | 628 | 0–5.6 | 2.33 (1.41) | 120 | 0–5.1 | 2.19 (1.38) |
| EDE-Q-E-A | 45 | 0–6 | 3.12 (1.67) | 695 | 0–6 | 3.17 (1.57) | 115 | 0–6 | 2.73 (1.77) |
| EDE-Q-E-D | 45 | 0–6 | 2.03 (1.64) | 628 | 0–5.2 | 1.63 (1.28) | 120 | 0–4.4 | 1.41 (1.24) |
| EDE-Q-R-A | 45 | 0–6 | 3.77 (2.33) | 696 | 0–6 | 3.65 (1.97) | 116 | 0–6 | 3.29 (2.14) |
| EDE-Q-R-D | 45 | 0–6 | 1.75 (1.92) | 630 | 0–6 | 1.14 (1.35) | 121 | 0–5.8 | 1.00 (1.33) |
| EDE-Q-S-A | 45 | 0–6 | 4.49 (1.94) | 694 | 0–6 | 4.50 (1.62) | 115 | 0–6 | 4.09 (1.80) |
| EDE-Q-S-D | 45 | 0–6 | 3.83 (1.98) | 628 | 0–6 | 3.57 (1.84) | 120 | 0–6 | 3.49 (1.88) |
| EDE-Q-W-A | 45 | 0–6 | 4.14 (1.88) | 694 | 0–6 | 4.12 (1.70) | 115 | 0–6 | 3.70 (1.83) |
| EDE-Q-W-D | 45 | 0–6 | 3.40 (2.01) | 628 | 0–6 | 2.97 (1.82) | 120 | 0–6 | 2.85 (1.82) |
Note. AST = aspartate aminotransferase; BMI = body mass index; PHQ-9 = Patient Health Questionnaire–9; GAD-7 = Generalized Anxiety Disorder–7; OCI-R = Obsessive-Compulsive Inventory–Revised; EDE-Q = Eating Disorder Examination Questionnaire; A = admission; D = discharge; G = global; E = eat concern; R = restraint; S = shape; W = weight.
Measures
The Semistructured Clinical Interview Based on DSM-5 Criteria
The tool assesses for the presence or absence of psychiatric disorders (APA, 2013). Masters-level clinical assessment specialists, who were either licensed or working toward licensure, administered the eating disorder modules via a phone interview prior to any other study participation. To ensure interviewer fidelity, masters-level clinical assessment managers completed random quality audits on a sample of completed assessment documentation. Furthermore, physicians continually evaluated and approved the diagnoses of the participants throughout treatment. Importantly, Criterion A requires a significantly low body weight. The DSM-5 acknowledges that family members and other sources (e.g., physicians) may aid in evaluating the history of weight loss and other features of AN, as it is rare for the individual to endorse Criterion A or have accurate insight (APA, 2013, p. 340). Because some participants in both the T-AN and NT-AN groups had BMIs above the World Health Organization cutoff (i.e., 18.5 kg/m2; APA, 2013), we note that all or the vast majority of our sample had threshold AN at some point in the fairly recent past, and either (a) still had it at admission or (b) did not quite meet full criteria at admission but were symptomatic enough to be admitted to an inpatient facility because of AN symptoms (i.e., they were still struggling with the AN syndrome, if not full-criteria AN; see Footnote 2).
Biological Samples
As noted in the introduction, objective measures of terminality are observed in other illnesses (e.g., end-stage cancer, HIV, etc.) near death. Due to the definition of “terminal,” one would anticipate similar biological changes among those in the T-AN group. Therefore, blood samples were collected to obtain white blood cell counts (WBC), albumin levels, and AST levels at admission and discharge. Only albumin and AST results are included in final analyses as WBC for the terminal sample at discharge was limited (n = 9; see Footnote 3). Healthy ranges, according to the facilities that analyzed the samples, are 3.4–10.8 for WBC, 3.8–5.0 for albumin levels, and 0–40 for AST levels (Laboratory Corporation of America, 2021a, 2021b; Quest Diagnostics, 2019). Albumin levels below healthy ranges may indicate infection, inflammation, kidney disease, liver disease, or poor nutrition, and levels above healthy ranges may indicate dehydration or severe diarrhea (The Cleveland Clinic, 2022a). WBC levels below healthy ranges may indicate a vulnerability toward developing infections (The Cleveland Clinic, 2022c), and levels above healthy ranges may indicate infection, inflammation, injury, and immune system disorders (The Cleveland Clinic, 2022b). AST levels below healthy ranges are generally less medically important, whereas levels above healthy ranges may indicate chronic hepatitis; heart, kidney, bone, or muscle damage; liver cancers; or cirrhosis (Ramirez & Fletcher, 2023). Notably, AST levels can elevate during immediate periods of refeeding (Ozawa et al., 1998); however, as these levels were measured at admission and discharge (times when the immediate effects of refeeding are not relevant), it is improbable that refeeding influenced our results.
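As a minimal sketch of how the healthy ranges quoted above could be applied, the snippet below labels a lab value as below, within, or above range. The ranges are copied from the text (which does not state units here); the dictionary and function names are illustrative assumptions, not part of the study's code.

```python
# Healthy ranges as reported in the text (units are not stated there).
HEALTHY_RANGES = {
    "WBC": (3.4, 10.8),
    "albumin": (3.8, 5.0),
    "AST": (0.0, 40.0),
}


def classify_lab(test: str, value: float) -> str:
    """Label a lab value as 'below', 'within', or 'above' the healthy range."""
    low, high = HEALTHY_RANGES[test]
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


# Admission minimum and maximum for the T-AN group, taken from Table 2.
for test, value in [("albumin", 3.3), ("albumin", 5.5), ("AST", 12), ("AST", 289)]:
    print(test, value, classify_lab(test, value))
```

Range endpoints are treated as within range in this sketch; a laboratory's own convention may differ.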
BMI
BMI was used as a physiological indicator of improvement. In this study, we compared BMI at admission to BMI at discharge as a standardized measure of weight change that accounts for height. Because a cardinal diagnostic criterion for AN involves significantly low body weight, BMI was the most appropriate weight-related measure (APA, 2013). Please see the section on the Semistructured Clinical Interview Based on DSM-5 Criteria for additional information on BMI within this sample.
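For readers less familiar with the index, BMI is weight in kilograms divided by height in meters squared. The sketch below shows that formula and the admission-to-discharge difference; the function names and the example patient are hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_change(admission_bmi: float, discharge_bmi: float) -> float:
    """Positive values mean BMI rose between admission and discharge."""
    return discharge_bmi - admission_bmi


# Hypothetical patient who is 1.65 m tall and gains 6 kg during treatment.
admission = bmi(46.0, 1.65)
discharge = bmi(52.0, 1.65)
print(round(bmi_change(admission, discharge), 2))
```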
Patient Health Questionnaire–9 (PHQ-9)
The PHQ-9 is a self-report tool for assessment of the severity of depressive symptoms directly relating to the DSM-IV criteria for major depressive disorder (Kroenke & Spitzer, 2002; Kroenke et al., 2001; Spitzer et al., 2014). The questionnaire assesses a respondent’s preceding 2 weeks, asking how often participants had been bothered by problems such as: “little interest or pleasure in doing things.” Items were scored on a Likert scale ranging from 0 = not at all to 3 = nearly every day. Total scores are summed, ranging from 0 to 27. The PHQ-9 had good reliability at admission (α = .90) and at discharge (α = .91). Importantly, Item 9 of the PHQ-9 assesses desire for death, asking: “Thoughts that you would be better off dead, or thoughts of hurting yourself in some way?” Due to the content of this item (i.e., “death” with no specificity toward suicide), it is defensible as an index of death ideation. Further to this point, it is notable that the stem phrase of this item (i.e., “thoughts that you would be better off dead, or of hurting yourself”) features death ideation as much as or more than it does suicidality, in that it both leads with death ideation per se and devotes more words to death ideation than to suicidality.
This single item was used as a proxy for Gaudiani et al.’s (2022) fourth criterion for T-AN. The PHQ-9 is a well-validated measure (Kroenke & Spitzer, 2002; Kroenke et al., 2001; Spitzer et al., 2014), and regarding the one-item index of death ideation, there is evidence that such measures are defensibly reliable and valid (e.g., Joiner et al., 2022; Robison et al., 2022).
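The scoring rule and the single-item proxy described above can be sketched as follows. The function name and return keys are hypothetical; the threshold of 1 corresponds to a response of “several days” or more on Item 9.

```python
def score_phq9(items: list[int]) -> dict:
    """Sum nine items scored 0-3 and flag any endorsement of Item 9."""
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("PHQ-9 needs nine responses scored 0-3")
    return {
        "total": sum(items),              # possible range 0-27
        "death_ideation": items[8] >= 1,  # Item 9: 'several days' or more
    }


# Hypothetical respondent.
print(score_phq9([1, 2, 0, 3, 1, 0, 2, 1, 1]))
```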
Generalized Anxiety Disorder–7 (GAD-7)
The GAD-7 is a self-report tool for assessment of the severity of anxiety symptoms relating to the DSM-IV for generalized anxiety disorder (Dhira et al., 2021; Spitzer et al., 2006). The questionnaire assesses a respondent’s preceding 2 weeks, asking how often participants had been bothered by problems such as: “not being able to stop or control worrying.” Items were scored on a Likert scale ranging from 0 = not at all to 3 = nearly every day. Total scores are summed, ranging from 0 to 21. The GAD-7 is a well-validated measure (Dhira et al., 2021; Spitzer et al., 2006) with good reliability at admission (α = .89) and at discharge (α = .92) in these data.
Obsessive-Compulsive Inventory–Revised (OCI-R)
The OCI-R is an 18-item self-report tool evaluating six different aspects of OCD symptoms: washing, checking, ordering, obsessing, neutralizing, and hoarding. The OCI-R response format ranges from 0 = not at all to 4 = extremely. For the purposes of this study, the scale was scored by summing all questions into one total score. The measure has adequate reliability and validity (Huppert et al., 2007). These data did not include item-level OCI-R scores, precluding internal consistency reliability analysis; however, the test–retest reliability was substantial (r = .76; p < .001).
Eating Disorder Examination–Questionnaire (EDE-Q)
The EDE-Q is a 28-item self-report tool evaluating four different aspects of disordered eating: eating concern, restraint, shape concern, and weight concern (Berg et al., 2012; Rose et al., 2013). The EDE-Q response format ranges from 0 = no days to 6 = every day. For the purposes of this study, both the Global scale (an average of the four subscales) and the four distinct subscales were scored and used independently. The EDE-Q is a reasonably well-validated measure (Berg et al., 2012; Rose et al., 2013) and had excellent reliability at admission (α = .98) and at discharge (α = .99).
Analytic Plan
Descriptive Statistics and Group Heterogeneity
We analyzed and compared the descriptive statistics of the T-AN group (e.g., standard deviations; see Table 2) at admission and discharge to those of the “not terminal” group (i.e., NT-AN) and the “not terminal” subset aged 30 and over (i.e., NTO-AN). To assess differences in within-group heterogeneity, we tested standard deviation differences using F tests (see Table 3; Snedecor & Cochran, 1982). Using a proxy for the fourth criterion presented by Gaudiani et al. (2022, p. 1), “a clear and consistent determination by a patient who possesses decision-making capacity that additional treatment would be futile, knowing their actions will be fatal,” a T-AN subset (TD-AN; n = 16) who endorsed some desire for death on the PHQ-9, answering “several days” or more, was also compared to both the NT-AN group and the NTO-AN subset (see the online supplemental materials). Due to the high number of statistical tests, the Benjamini–Hochberg procedure was employed to control the false discovery rate (i.e., to limit Type I errors across the many comparisons), adjusting the p-value threshold to p < .01 (Benjamini & Hochberg, 1995). This threshold was used for the remainder of the analyses for ease of interpretation and consistency across comparisons.
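For reference, the generic Benjamini–Hochberg step-up rule can be sketched as below. This is a textbook illustration rather than the authors' code: the function name and the level `q` are assumptions, and the text does not detail how the reported threshold of p < .01 was derived from the procedure.

```python
def benjamini_hochberg(p_values: list[float], q: float = 0.05) -> list[bool]:
    """Return, in input order, whether each test is rejected at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p value satisfies p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p values from eight tests.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]))
```

Because the rule is step-up, a p value that misses its own rank threshold is still rejected whenever a larger p value meets a later threshold.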
Table 3.
F tests
| Variable | Terminal and not terminal (all) | | | Terminal and not terminal (30 and over) | | |
|---|---|---|---|---|---|---|
| | p | F | Effect | p | F | Effect |
| Albumin-A | .007** | 1.91 | 1.91 | .021* | 1.82 | 1.82 |
| Albumin-D | .604 | 0.84 | 1.20 | .623 | 0.64 | 1.56 |
| AST-A | .046* | 1.66 | 1.66 | .736 | 1.10 | 1.10 |
| AST-D | .748 | 1.09 | 1.09 | .764 | 1.10 | 1.10 |
| BMI-A | <.001*** | 1.88 | 1.88 | <.001*** | 1.79 | 1.79 |
| BMI-D | <.001*** | 1.98 | 1.98 | .004** | 1.91 | 1.91 |
| PHQ-9-A | .999 | 0.99 | 1.01 | .576 | 0.86 | 1.16 |
| PHQ-9-D | .100 | 1.40 | 1.40 | .213 | 1.35 | 1.35 |
| GAD-7-A | .278 | 1.26 | 1.26 | .856 | 1.04 | 1.04 |
| GAD-7-D | .355 | 1.22 | 1.22 | .754 | 1.08 | 1.08 |
| OCI-R-A | .581 | 1.11 | 1.11 | .212 | 1.35 | 1.35 |
| OCI-R-D | .552 | 1.12 | 1.12 | .121 | 1.45 | 1.45 |
| EDE-Q-G-A | .104 | 1.42 | 1.42 | .496 | 1.18 | 1.18 |
| EDE-Q-G-D | .028* | 1.63 | 1.63 | .034* | 1.75 | 1.75 |
| EDE-Q-E-A | .525 | 1.31 | 1.31 | .674 | 0.89 | 1.23 |
| EDE-Q-E-D | .013* | 1.64 | 1.64 | .018* | 1.75 | 1.75 |
| EDE-Q-R-A | .096 | 1.40 | 1.40 | .470 | 1.19 | 1.19 |
| EDE-Q-R-D | <.001*** | 2.02 | 2.02 | .002** | 2.08 | 2.08 |
| EDE-Q-S-A | .073 | 1.43 | 1.43 | .523 | 1.16 | 1.16 |
| EDE-Q-S-D | .458 | 1.16 | 1.16 | .648 | 1.11 | 1.11 |
| EDE-Q-W-A | .314 | 1.22 | 1.22 | .801 | 1.06 | 1.06 |
| EDE-Q-W-D | .322 | 1.22 | 1.22 | .399 | 1.22 | 1.22 |
Note. AST = aspartate aminotransferase; BMI = body mass index; PHQ-9 = Patient Health Questionnaire–9; GAD-7 = Generalized Anxiety Disorder–7; OCI-R = Obsessive-Compulsive Inventory–Revised; EDE-Q = Eating Disorder Examination Questionnaire; A = admission; D = discharge; G = global; E = eat concern; R = restraint; S = shape; W = weight.
*p < .05. **p < .01. ***p < .001.
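The F and Effect columns in Table 3 appear consistent with a simple variance ratio built from the standard deviations in Table 2: the squared ratio of the T-AN standard deviation to the comparison group's standard deviation, with Effect expressing the same ratio with the larger variance in the numerator. That reading is an inference from the tables, not a description given by the authors, and the function names below are hypothetical. The p values additionally require the F distribution with n1 − 1 and n2 − 1 degrees of freedom, which this sketch does not compute.

```python
def variance_ratio(sd_terminal: float, sd_comparison: float) -> float:
    """F statistic for comparing two variances, from their standard deviations."""
    return (sd_terminal / sd_comparison) ** 2


def ratio_effect(f_value: float) -> float:
    """Express the ratio with the larger variance on top (always >= 1)."""
    return max(f_value, 1 / f_value)


# Standard deviations from Table 2 (T-AN vs. NT-AN).
albumin_admission = variance_ratio(0.58, 0.42)
albumin_discharge = variance_ratio(0.32, 0.35)
print(round(albumin_admission, 2), round(ratio_effect(albumin_admission), 2))
print(round(albumin_discharge, 2), round(ratio_effect(albumin_discharge), 2))
```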
Treatment Response
Next, to assess change over time, we calculated the percentage of the T-AN group that did not improve on physiological or self-report measures from admission to discharge, comparing them to both NT-AN and the NTO-AN subsets. Chi-square difference tests were conducted to assess if these differences in improvement were statistically significant. Analyses were repeated comparing those in the TD-AN subset to both NT-AN and the NTO-AN subsets.
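A minimal sketch of this step, under stated assumptions: “did not improve” is taken to mean that the discharge score failed to move in the healthy direction (no change counts as not improved), and the comparison uses a Pearson chi-square on a 2 × 2 table without continuity correction. Neither detail is specified in the text, and all names and counts below are hypothetical.

```python
import math


def percent_not_improved(admission, discharge, higher_is_better):
    """Percentage of paired scores that failed to move in the healthy direction."""
    not_improved = 0
    for adm, dis in zip(admission, discharge):
        improved = dis > adm if higher_is_better else dis < adm
        not_improved += not improved
    return 100 * not_improved / len(admission)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square and p value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = group, columns = (not improved, improved).
stat, p = chi_square_2x2(10, 40, 100, 600)
print(round(stat, 3), round(p, 3))
```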
We then conducted regression analyses to test whether terminal group status predicted discharge scores over and above admission scores. The groups were coded NT-AN = 0 and T-AN = 1, with all regression analyses predicting discharge outcomes from admission scores and group status (“terminal” vs. not; see Table 4). Eleven models were run involving the entire sample (i.e., comparing T-AN to NT-AN), one for each of 11 clinical predictors/outcomes: albumin, AST, BMI, PHQ-9, GAD-7, OCI-R, EDE-Q-global, EDE-Q-eating concern, EDE-Q-restraint, EDE-Q-shape concern, and EDE-Q-weight concern. Each model was structured as follows: group status (T-AN vs. NT-AN) and the clinical predictor at admission predicting the same clinical variable at discharge (see Table 4, Models a–k). Eleven more identical analyses were run comparing the T-AN group to the NTO-AN subset (see Table 4, Models l–v). Results were reanalyzed comparing those in the TD-AN subset to both the NT-AN group and the NTO-AN subset (see the online supplemental materials). All regression analyses (i.e., admission scores predicting discharge scores) were additionally run within each group and subset separately to compare the standardized regression coefficients across all groups and subsets (see the online supplemental materials).
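The model structure described above (discharge score regressed on admission score and group status) can be illustrated with the closed-form solution for standardized coefficients in a two-predictor regression. This is a sketch with made-up data and hypothetical function names, not the authors' analysis code. Standardized coefficients are unaffected by whether the group indicator is coded 0/1 or −0.5/0.5.

```python
import math


def _pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardized_betas(admission, group, discharge):
    """Standardized coefficients and R^2 for discharge ~ admission + group."""
    r_ya = _pearson(discharge, admission)
    r_yg = _pearson(discharge, group)
    r_ag = _pearson(admission, group)
    denom = 1 - r_ag ** 2
    beta_admission = (r_ya - r_yg * r_ag) / denom
    beta_group = (r_yg - r_ya * r_ag) / denom
    r_squared = beta_admission * r_ya + beta_group * r_yg
    return beta_admission, beta_group, r_squared


# Made-up data in which discharge is an exact linear function of both predictors.
admission = [10, 12, 14, 16, 18, 20, 11, 15]
group = [0, 1, 0, 1, 0, 1, 1, 0]
discharge = [0.5 * a + 2 * g for a, g in zip(admission, group)]
print(standardized_betas(admission, group, discharge))
```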
Table 4.
Multiple Regression Models
| Model | Predictor | β | SE | 95% CI | t | p | sr2 |
|---|---|---|---|---|---|---|---|
| a (n = 116; R2 = .121) | Outcome: Albumin-D | | | | | | |
| | Albumin-A | .34 | 0.09 | [0.16, 0.51] | 3.81 | <.001** | .113 |
| | Terminality | .11 | 0.09 | [−0.06, 0.29] | 1.25 | .215 | .012 |
| b (n = 112; R2 = .016) | Outcome: AST-D | | | | | | |
| | AST-A | .11 | 0.10 | [−0.08, 0.30] | 1.18 | .241 | .013 |
| | Terminality | .04 | 0.10 | [−0.15, 0.23] | 0.39 | .695 | .004 |
| c (n = 782; R2 = .545) | Outcome: BMI-D | | | | | | |
| | BMI-A | .74 | 0.02 | [0.69, 0.79] | 30.53 | <.001** | .544 |
| | Terminality | −.04 | 0.02 | [−0.08, 0.01] | −1.52 | .129 | .001 |
| d (n = 590; R2 = .308) | Outcome: PHQ-9-D | | | | | | |
| | PHQ-9-A | .55 | 0.03 | [0.48, 0.62] | 15.99 | <.001** | .301 |
| | Terminality | .08 | 0.03 | [0.01, 0.14] | 2.22 | .027* | .006 |
| e (n = 377; R2 = .270) | Outcome: GAD-7-D | | | | | | |
| | GAD-7-A | .52 | 0.04 | [0.43, 0.61] | 11.73 | <.001** | .268 |
| | Terminality | .04 | 0.04 | [−0.05, 0.13] | 0.87 | .384 | .001 |
| f (n = 584; R2 = .579) | Outcome: OCI-R-D | | | | | | |
| | OCI-R-A | .76 | 0.03 | [0.71, 0.81] | 28.26 | <.001** | .579 |
| | Terminality | .00 | 0.03 | [−0.05, 0.06] | 0.09 | .926 | .000 |
| g (n = 440; R2 = .298) | Outcome: EDE-Q-G-D | | | | | | |
| | EDE-Q-G-A | .54 | 0.04 | [0.46, 0.62] | 13.48 | <.001** | .292 |
| | Terminality | .08 | 0.04 | [0.00, 0.16] | 2.01 | .045* | .007 |
| h (n = 594; R2 = .208) | Outcome: EDE-Q-E-D | | | | | | |
| | EDE-Q-E-A | .45 | 0.04 | [0.38, 0.52] | 12.23 | <.001** | .201 |
| | Terminality | .09 | 0.04 | [0.02, 0.17] | 2.56 | .011* | .009 |
| i (n = 596; R2 = .114) | Outcome: EDE-Q-R-D | | | | | | |
| | EDE-Q-R-A | .31 | 0.04 | [0.24, 0.39] | 8.07 | <.001** | .097 |
| | Terminality | .13 | 0.04 | [0.05, 0.21] | 3.38 | <.001** | .017 |
| j (n = 594; R2 = .330) | Outcome: EDE-Q-S-D | | | | | | |
| | EDE-Q-S-A | .57 | 0.03 | [0.51, 0.64] | 17.04 | <.001** | .326 |
| | Terminality | .04 | 0.03 | [−0.02, 0.11] | 1.25 | .212 | .002 |
| k (n = 594; R2 = .330) | Outcome: EDE-Q-W-D | | | | | | |
| | EDE-Q-W-A | .57 | 0.03 | [0.51, 0.64] | 16.97 | <.001** | .327 |
| | Terminality | .06 | 0.03 | [−0.01, 0.13] | 1.80 | .072 | .004 |
| l (n = 74; R2 = .123) | Outcome: Albumin-D | | | | | | |
| | Albumin-A | .33 | 0.11 | [0.11, 0.55] | 2.94 | .004** | .107 |
| | Terminality | .13 | 0.11 | [−0.10, 0.35] | 1.13 | .264 | .016 |
| m (n = 71; R2 = .017) | Outcome: AST-D | | | | | | |
| | AST-A | .12 | 0.12 | [−0.13, 0.36] | 0.95 | .344 | .013 |
| | Terminality | −.08 | 0.12 | [−0.32, 0.17] | −0.63 | .528 | .006 |
| n (n = 184; R2 = .640) | Outcome: BMI-D | | | | | | |
| | BMI-A | .80 | 0.04 | [0.71, 0.89] | 17.91 | <.001** | .638 |
| | Terminality | −.10 | 0.04 | [−0.19, −0.01] | −2.26 | .025* | .010 |
| o (n = 143; R2 = .371) | Outcome: PHQ-9-D | | | | | | |
| | PHQ-9-A | .56 | 0.07 | [0.43, 0.69] | 8.29 | <.001** | .309 |
| | Terminality | .18 | 0.07 | [0.05, 0.31] | 2.65 | .009** | .032 |
| p (n = 102; R2 = .335) | Outcome: GAD-7-D | | | | | | |
| | GAD-7-A | .56 | 0.08 | [0.40, 0.72] | 6.81 | <.001** | .312 |
| | Terminality | .10 | 0.08 | [−0.06, 0.26] | 1.22 | .224 | .010 |
| q (n = 143; R2 = .574) | Outcome: OCI-R-D | | | | | | |
| | OCI-R-A | .74 | 0.06 | [0.63, 0.86] | 13.33 | <.001** | .541 |
| | Terminality | .06 | 0.06 | [−0.05, 0.17] | 1.15 | .250 | .004 |
| r (n = 118; R2 = .432) | Outcome: EDE-Q-G-D | | | | | | |
| | EDE-Q-G-A | .63 | 0.07 | [0.49, 0.77] | 8.99 | <.001** | .400 |
| | Terminality | .13 | 0.07 | [−0.01, 0.27] | 1.83 | .070 | .017 |
| s (n = 145; R2 = .328) | Outcome: EDE-Q-E-D | | | | | | |
| | EDE-Q-E-A | .54 | 0.07 | [0.40, 0.67] | 7.77 | <.001** | .286 |
| | Terminality | .17 | 0.07 | [0.03, 0.31] | 2.47 | .015* | .029 |
| t (n = 146; R2 = .179) | Outcome: EDE-Q-R-D | | | | | | |
| | EDE-Q-R-A | .34 | 0.08 | [0.19, 0.49] | 4.42 | <.001** | .112 |
| | Terminality | .24 | 0.08 | [0.09, 0.39] | 3.15 | .002** | .057 |
| u (n = 145; R2 = .483) | Outcome: EDE-Q-S-D | | | | | | |
| | EDE-Q-S-A | .69 | 0.06 | [0.57, 0.81] | 11.45 | <.001** | .478 |
| | Terminality | .03 | 0.06 | [−0.08, 0.15] | 0.58 | .556 | .001 |
| v (n = 145; R2 = .447) | Outcome: EDE-Q-W-D | | | | | | |
| | EDE-Q-W-A | .66 | 0.06 | [0.53, 0.78] | 10.52 | <.001** | .431 |
| | Terminality | .09 | 0.06 | [−0.04, 0.21] | 1.37 | .174 | .007 |
Note. Models a–k include not terminal (all); Models l–v include not terminal (30 and over). Terminality = −0.5 was not terminal, 0.5 was terminal. CI = confidence interval; AST = aspartate aminotransferase; BMI = body mass index; PHQ-9 = Patient Health Questionnaire–9; GAD-7 = Generalized Anxiety Disorder–7; OCI-R = Obsessive-Compulsive Inventory–Revised; EDE-Q = Eating Disorder Examination Questionnaire; A = admission; D = discharge; G = global; E = eat concern; R = restraint; S = shape; W = weight.
*p < .05. **p < .01.
Sensitivity
Finally, to assess statistical sensitivity, all analyses were rerun using 10 random subsamples with identical sample sizes to account for any patterns of missing data and to assess the consistency of our results. For example, we randomly selected 10 subsamples of 51 participants from the NT-AN group and 10 subsamples of 51 participants from the NTO-AN subset to compare to the T-AN group. Results were reanalyzed comparing those in the TD-AN subset to both the NT-AN group and the NTO-AN subset. We also compared the averaged variances of the 10 random subsamples to the variance of the T-AN group and the TD-AN subset, respectively.
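The subsampling idea can be sketched as follows. The data are synthetic stand-ins generated for illustration only, the function name is hypothetical, and sampling without replacement within each subsample is an assumption, since the text does not specify the sampling scheme.

```python
import random
import statistics


def averaged_subsample_variance(values, size, n_subsamples=10, seed=0):
    """Mean sample variance across random subsamples of a fixed size."""
    rng = random.Random(seed)
    variances = []
    for _ in range(n_subsamples):
        subsample = rng.sample(values, size)  # drawn without replacement
        variances.append(statistics.variance(subsample))
    return statistics.mean(variances)


# Synthetic stand-in scores (not study data): a large comparison group
# and a smaller group of 51, matching the group sizes named in the text.
rng = random.Random(1)
comparison_scores = [rng.gauss(18.4, 3.3) for _ in range(731)]
terminal_scores = [rng.gauss(18.4, 4.5) for _ in range(51)]

avg_var = averaged_subsample_variance(comparison_scores, size=len(terminal_scores))
print(statistics.variance(terminal_scores) / avg_var)
```

Matching the subsample size to the smaller group keeps the variance comparison from being driven by the large difference in group sizes.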
Transparency and Openness
Data are not publicly available (i.e., data involve medical health records, and participants did not consent to de-identified public data sharing), but code and output are accessible through the online supplemental materials. Hypotheses and analyses were not preregistered.
Results
Descriptive Statistics
See Table 1 for demographic statistics of the T-AN group, the NT-AN group, and the NTO-AN subset (see the online supplemental materials for the TD-AN subset demographic and descriptive statistics). The T-AN group was not significantly different from the NT-AN group regarding the frequencies of diagnostic subtype (i.e., AN-restricting type and AN-binge-eating/purging type), gender, race, or levels of care. The T-AN group was statistically significantly different from the NT-AN group regarding average age, t(827) = 13.59, p < .001; Cohen’s d = 1.83 (see Footnote 3). The T-AN group was not statistically significantly different from the NT-AN group regarding BMI at admission, t(780) = 0.05, p = .957; Cohen’s d = 0.01, nor at discharge, t(780) = 0.99, p = .324; Cohen’s d = 0.12; see Figure 1. Please see Table 1 for a full list of descriptive statistics regarding the sample separated by group status (i.e., T-AN, NT-AN, and NTO-AN; see the online supplemental materials for the TD-AN subset descriptive statistics; see also Footnote 4).
Figure 1. BMI Admission to Discharge by Group.
Note. BMI = body mass index; T-AN = terminal anorexia nervosa; NT-AN = not terminal anorexia nervosa; TD-AN = T-AN subset who also met a proxy index of the fourth criterion involving death; NTO-AN = “not terminal” subset 30 years of age or older. See the online article for the color version of this figure.
Group Heterogeneity
F tests were conducted on standard deviations (see Table 2) on both admission and discharge scores of all study variables (see Table 3; see also Footnote 5). The standard deviations for the T-AN group and the T-AN subset endorsing desire for death (TD-AN; see the online supplemental materials), compared to the NT-AN group and the NT-AN subset 30 years or older (NTO-AN), were generally similar and, if anything, larger in the T-AN group and the TD-AN subset.
The T-AN group (compared to the NT-AN group) had statistically significantly larger standard deviations on albumin levels at admission, BMI at admission and discharge, and EDE-Q-restraint at discharge (ps < .01). The T-AN group (compared to the NTO-AN subset) had statistically significantly larger standard deviations on BMI at admission, BMI at discharge, and EDE-Q-restraint at discharge (ps < .01). The TD-AN subset was then compared to the NT-AN group and NTO-AN subset. The TD-AN subset had statistically significantly lower standard deviations than the NT-AN group (p < .01) and NTO-AN subset (p < .01) on AST at admission; however, all participants fell within normal ranges. The TD-AN subset had statistically significantly higher standard deviations than the NTO-AN subset on BMI at admission (p < .01).
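As a rough illustration of the statistic behind these comparisons, the sketch below computes a variance ratio (the F statistic for equality of two variances) and its degrees of freedom using only the Python standard library. The BMI values are hypothetical and were invented for the example; the helper name `variance_ratio` is likewise an assumption. Obtaining a p value would additionally require the F distribution, which is not computed here.

```python
from statistics import variance

def variance_ratio(group_a, group_b):
    """Return (F, df_a, df_b), where F = var(group_a) / var(group_b)."""
    # statistics.variance is the sample variance (n - 1 in the denominator).
    f_stat = variance(group_a) / variance(group_b)
    return f_stat, len(group_a) - 1, len(group_b) - 1

# Hypothetical BMI values, invented purely for illustration.
terminal_bmi = [13.0, 15.5, 17.0, 19.5, 22.0]
comparison_bmi = [15.0, 16.0, 17.0, 18.0, 19.0]

f_stat, df_num, df_den = variance_ratio(terminal_bmi, comparison_bmi)
print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
```

A ratio above 1 indicates that the first group is the more variable one, which is the direction reported for the significant T-AN comparisons above.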
Treatment Response
We then identified, by group (e.g., the T-AN group compared to the NT-AN group), individuals who did not show improvement from admission to discharge on biological measures, BMI, and all self-report measures. Regarding biological parameters, the T-AN group had one participant (in the TD-AN subset) out of 29 (3.4%) discharge with albumin levels slightly below healthy ranges. The NT-AN group had eight participants (five in the NTO-AN subset) discharge with albumin levels slightly below healthy ranges (NT-AN: eight out of 88 [9.1%]; NTO-AN subset: five out of 45 [11.1%]). The T-AN group and the NT-AN group percentages were not statistically significantly different from one another (χ2 = 0.98, p = .323, φc = .09), and remained nonsignificant when comparing the T-AN group to the NTO-AN subset (χ2 = 1.39, p = .238, φc = .14). One participant in the TD-AN subset discharged with serum levels of albumin slightly below healthy ranges; chi-square difference tests between the TD-AN subset and the NT-AN group were not statistically significant (χ2 = 0.04, p = .842, φc = .02), and remained nonsignificant when comparing the TD-AN subset and the NTO-AN subset (χ2 = 0.50, p = .478, φc = .10).
The T-AN group had two out of 29 (6.9%) discharge with AST levels slightly above the healthy range. The NT-AN group had six participants discharge with slightly elevated AST levels (NT-AN: six out of 87 [6.9%]; NTO-AN subset: four out of 45 [8.9%]). The chi-square difference test between the T-AN and NT-AN groups was not statistically significant (χ2 = 0.00, p = 1.00, φc = 0), and remained nonsignificant when comparing the T-AN group and the NTO-AN subset (χ2 = 0.09, p = .759, φc = .04). Every participant in the TD-AN subset (i.e., endorsing a desire for death) discharged with healthy ranges of AST; chi-square difference tests between the TD-AN subset and the NT-AN group were not statistically significant (χ2 = 0.66, p = .416, φc = .08), and remained nonsignificant when comparing the TD-AN subset and the NTO-AN subset (χ2 = 0.86, p = .353, φc = .13).
Regarding BMI together with all self-report measures (e.g., PHQ-9, GAD-7, OCI-R, and EDE-Q), zero out of 51 (0%) in the T-AN group worsened on both; these results were identical for the NT-AN group (NT-AN: zero out of 321 [0%]; NTO-AN subset: zero out of 68 [0%]). In T-AN, one out of 51 (2.0%) worsened on all self-report measures independent of BMI. The NT-AN group had lower percentages of participants who worsened across all self-report measures (NT-AN: three out of 321 [0.93%]; NTO-AN subset: zero out of 68 [0%]). Chi-square difference tests between the T-AN and NT-AN groups, however, indicated no statistically significant difference between them (χ2 = 0.44, p = .509, φc = .04), and remained non-significant when comparing the T-AN group and the NTO-AN subset (χ2 = 1.35, p = .246, φc = .11). Every participant in the TD-AN subset (i.e., endorsing a desire for death) improved on BMI and across all self-report measures; chi-square difference tests comparing the percentages of participants who worsened across all self-report measures (independent of BMI) between the TD-AN subset and the NT-AN group were not statistically significant (χ2 = 0.15, p = .698, φc = .02).
Finally, we compared the groups and subsets exclusively on the EDE-Q subscales. In the T-AN group, four of 51 (7.8%) worsened only on EDE-Q subscales. In the NT-AN group, a lower percentage of participants worsened only on EDE-Q subscales (NT-AN: 19 of 403 [4.7%]; NTO-AN subset: five of 93 [5.4%]). Chi-square difference tests between T-AN and NT-AN, however, were not statistically significant (χ2 = 0.92, p = .337, φc = .04), and remained nonsignificant when comparing T-AN to NTO-AN (χ2 = 0.34, p = .557, φc = .04). Every participant in the TD-AN subset improved across all EDE-Q related measures; chi-square difference tests between the TD-AN subset and the NT-AN group were not statistically significant (χ2 = 0.79, p = .374, φc = .04), and remained non-significant when comparing the TD-AN subset and the NTO-AN subset (χ2 = 0.90, p = .342, φc = .09).
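To make the chi-square comparisons in this section concrete, the following sketch recomputes the albumin comparison reported earlier (one of 29 in the T-AN group vs. eight of 88 in the NT-AN group). It assumes an uncorrected Pearson chi-square on a 2 × 2 table with one degree of freedom; the source does not state whether a continuity correction was applied, so that choice, along with the function name `chi_square_2x2`, is an assumption. Under that assumption the result should come out close to the reported χ2 = 0.98, p = .323, φc = .09.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p, phi). With 1 degree of freedom, the upper-tail
    probability of chi2 equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    phi = math.sqrt(chi2 / n)  # effect size for a 2 x 2 table
    return chi2, p, phi

# Rows: T-AN, NT-AN. Columns: albumin below range at discharge, within range.
chi2, p, phi = chi_square_2x2(1, 28, 8, 80)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, phi = {phi:.2f}")

# Same comparison against the NTO-AN subset (five of 45 below range).
chi2_nto, p_nto, phi_nto = chi_square_2x2(1, 28, 5, 40)
```

The shortcut formula fails when any row or column total is zero, so a table in which no participant in either group worsened would need separate handling.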
Terminal Group Status and Admission Scores to Predict Discharge
Next, multiple regression analyses tested the effect that “terminal” group status had on admission to discharge scores (T-AN compared to both NT-AN and NTO-AN; see Table 3). Group status (i.e., T-AN compared to NT-AN) was statistically significantly predictive of EDE-Q-restraint at discharge (p < .01 level; β = .13, SE = 0.04, p < .001, sr2 = .017) such that the T-AN group reported higher levels of restraint concerns at discharge, controlling for scores at admission. When the T-AN group was compared to NTO-AN, group status was statistically significantly predictive of PHQ-9 at discharge (β = .18, SE = 0.07, p = .009, sr2 = .032) and EDE-Q-restraint at discharge (β = .24, SE = 0.08, p = .002, sr2 = .057) such that the T-AN group reported higher PHQ-9 and EDE-Q-restraint scores at discharge, controlling for their respective scores at admission. Similar results emerged when comparing the TD-AN subset to the NT-AN group and the NTO-AN subset. Group status was statistically significantly predictive of EDE-Q-restraint at discharge when comparing the TD-AN subset to the NT-AN group (p < .01 level; β = .13, SE = 0.04, p = .002, sr2 = .016) and remained significant when comparing the TD-AN subset to the NTO-AN subset (β = .28, SE = 0.09, p = .002, sr2 = .076). In both instances, the TD-AN subset reported higher EDE-Q-restraint scores at discharge, controlling for their respective scores at admission.
Regression analyses independently examined by each group and subset indicated fairly similar standardized regression coefficients and overlapping confidence intervals (see the online supplemental materials).
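The structure of these models (discharge score regressed on admission score and a terminality code of −0.5 or 0.5) can be sketched in plain Python as below. This is an illustrative sketch, not the authors' analysis code: the function name, the data, and the generating coefficients (0.5 and 0.8) are hypothetical, and the data are constructed without noise so that the fit recovers those coefficients exactly. The squared semipartial correlation (sr2) is taken here as the drop in R2 when terminality is removed from the model.

```python
import math

def two_predictor_ols(y, x1, x2):
    """OLS of y on x1 and x2 (with intercept) via centered sums of squares."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det   # unstandardized slope for x1
    b2 = (s11 * s2y - s12 * s1y) / det   # unstandardized slope for x2
    r2_full = (b1 * s1y + b2 * s2y) / syy
    r2_x1_only = s1y ** 2 / (s11 * syy)
    return {
        "b_admission": b1,
        "b_terminality": b2,
        "beta_admission": b1 * math.sqrt(s11 / syy),    # standardized
        "beta_terminality": b2 * math.sqrt(s22 / syy),  # standardized
        "r2": r2_full,
        "sr2_terminality": r2_full - r2_x1_only,
    }

# Hypothetical scores: three "terminal" (0.5) and three "not terminal" (-0.5).
admission = [4.0, 3.0, 5.0, 2.0, 4.5, 3.5]
terminality = [0.5, 0.5, 0.5, -0.5, -0.5, -0.5]
discharge = [1.0 + 0.5 * a + 0.8 * t for a, t in zip(admission, terminality)]

fit = two_predictor_ols(discharge, admission, terminality)
```

With the ±0.5 coding, the unstandardized terminality slope is the adjusted difference in discharge scores between the two groups.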
Sensitivity
Finally, all results were reanalyzed comparing the T-AN group to 10 random subsamples of the NT-AN group (n’s = 51) and to 10 simple random subsamples of the NTO-AN subset (n’s = 51). Additionally, the TD-AN subset was compared to 10 simple random subsamples of the NT-AN group (n’s = 16) and to 10 random subsamples of the NTO-AN subset (n’s = 16). In total, 1,800 analyses were conducted (i.e., 45 analyses per subsample, replicated 10 times across four comparison groups). We also compared the averaged variances of the 10 random subsamples to the variance of the T-AN group and the TD-AN subset, respectively. These analyses affirm the results and conclusions of the main analyses (i.e., the averaged variances were overwhelmingly nonsignificant and, when significant, the T-AN group and TD-AN subset had higher variability). Age was the only consistently significantly different parameter between the T-AN group/TD-AN subset and the NT-AN subsamples. This finding can be viewed as a validity check on our analyses, as age differences are built into the criteria for “terminality.” Otherwise (i.e., independent of the finding regarding expected age differences), of the 1,760 analyses conducted not involving age, 1,743 did not support the concept of T-AN (i.e., over 99% were non-significant or against the concept of “terminal”). Put differently, 17 analyses of the 1,760 not involving age accorded with the T-AN concept (0.97%). As the p value was set to .01, it is reasonable to expect one in 100 significant findings to be attributable to chance alone.
The only conceivable solace in these findings for supporters of the T-AN concept was that, of the 17 significant results, six involved higher mean levels of EDE-Q-restraint concerns at discharge among the T-AN group in the regression analyses. However, these six significant findings out of 40 analyses on EDE-Q-restraint concerns (i.e., 10 subsamples across four groups) are outweighed by the 34 that did not support the T-AN position.
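The counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only figures stated above; the inference that 40 analyses involved age (1,800 minus 1,760) is ours and is labeled as such in the code.

```python
analyses_per_subsample = 45
subsamples = 10
comparisons = 4

total = analyses_per_subsample * subsamples * comparisons  # 1,800 analyses
non_age = 1760                     # analyses not involving age, as reported
age_analyses = total - non_age     # inferred: 40, i.e., one per subsample per comparison
supportive = 17                    # analyses in accord with the T-AN concept
alpha = 0.01

unsupportive = non_age - supportive             # 1,743
share_supportive = supportive / non_age         # about 0.97%
expected_by_chance = alpha * non_age            # about 17.6 at alpha = .01
restraint_share = 6 / 40                        # EDE-Q-restraint regressions

print(f"{100 * share_supportive:.2f}% supportive; "
      f"{expected_by_chance:.1f} expected by chance")
```

The 17 supportive results are therefore about what a .01 threshold alone would be expected to yield across 1,760 analyses.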
Discussion
The aims of this study were to compare the heterogeneity and treatment responses of a clinical sample categorized as T-AN, using three of the four criteria that Gaudiani et al. (2022) proposed for T-AN (and a smaller subset who also met a proxy index for the fourth criterion), to a “not terminal” group and a smaller subset 30 years of age or older. The T-AN group, compared to the NT-AN group and NTO-AN subset, displayed similar means, with generally equal or higher variability on physiological and self-report measures. Additionally, the TD-AN subset, compared to the NT-AN group and NTO-AN subset, also displayed similar means with generally higher variability. All statistically significant differences indicated that the T-AN group had higher variability compared to the NT-AN group and NTO-AN subset (though the TD-AN subset had statistically significantly smaller variability in AST at admission), indicating generally equivalent patterns of heterogeneity for “terminal” and “not terminal” groups, at odds with the concept of terminality.
Regarding improvement, all groups, including T-AN and the TD-AN subset, mostly showed significant improvement from admission to discharge across all measures. The T-AN group showed a higher percentage of participants who worsened on all self-report measures (i.e., PHQ-9, GAD-7, OCI-R, and EDE-Q), and a higher percentage of participants who worsened specifically on EDE-Q subscales. However, these percentages were not statistically significantly different from the NT-AN group nor from the NTO-AN subset, nor were they clinically significant or concerning. Moreover, the TD-AN subset improved across all biological, BMI, and self-report measures. Finally, the regression models (i.e., comparing the T-AN group to the NT-AN group, the T-AN group to the NTO-AN subset, the TD-AN subset to the NT-AN group, and the TD-AN subset to the NTO-AN subset) were statistically significant across all comparisons for the EDE-Q-restraint subscale and indicated that group status was also significant for PHQ-9 scores when comparing the T-AN group to the NTO-AN subset. These findings suggest that the T-AN group endorses higher restraint concerns and symptoms of depression above and beyond their admission scores.
When analyses were repeated using 10 random subsamples matched for sample size, only age (comparing the T-AN group to the NT-AN group and the TD-AN subset to the NT-AN group) was consistently statistically significant. Otherwise, results were overwhelmingly either nonsignificant, or, if significant, in the direction opposite that expected by T-AN proponents (i.e., over 99% of the results). The only conceivable pattern in our findings in support of the T-AN concept (i.e., less than one percent of the results) was that six results out of 40 involved higher mean levels of EDE-Q-restraint concerns at discharge among the T-AN group in the regression analyses.
Taken together, the information presented here indicates that the groups labeled “terminal” do experience a severe illness, but that these groups are not homogenous, do not appear to be meaningfully different from those categorized as “not terminal,” and show overall improvement from admission to discharge, similar to the “not terminal” groups. Thus, the results presented in this study suggest that the use of a “terminal” label for any client with AN meeting the criteria outlined by Gaudiani et al. is premature and potentially unsupportable.
Limitations
While our study provides evidence against the existence of a “terminal” course of AN, we acknowledge limitations of our findings. Most obviously, we did not directly assess the fourth specifier in Gaudiani et al.’s (2022, p. 1) definition of T-AN: “clear and consistent determination by a patient who possesses decision-making capacity that additional treatment would be futile… and they accept that death will be the natural outcome” (italics added). Although we examined the TD-AN subset who met a proxy index for this fourth criterion (i.e., desire for death) and demonstrated that this subset is similar to the larger T-AN group, we likely did not fully assess the fourth criterion as it was intended. Thus, we can only speak fully to a definition of terminality operationalized primarily by the first three of four criteria proposed for T-AN. We did, however, approximate the fourth criterion, especially its emphasis on death ideation (see italicized phrase above). We emphasize that the item in question is not an optimal index of the death-related criterion proposed by Gaudiani et al., we urge readers to consider this important point in interpreting our findings, and we encourage future work using a more precise index of the relevant criterion as originally described by Gaudiani et al. We would add that this criterion strikes us as the most subjective of the four criteria and is the only one to rely on patient rather than clinician judgment—a potentially problematic basis on which to affix a terminal label.
Because there were no established effects between groups of interest from which to draw, power analyses were only conducted post hoc using observed effect sizes. Unsurprisingly, power varied greatly based on observed effects, ranging from .05 to .99, which may indicate a true lack of difference in a sample size below the thousands. We acknowledge that sample sizes were relatively small for our “terminal” groups, which limited statistical power. One view of our efforts herein is that we simply showed that standard deviations are typically larger for smaller cell sizes, unsurprising because the formula for the standard deviation includes group size, or that underpowered analyses typically do not return significant effects, and we encourage readers to weigh this conclusion against our own: Namely, that the totality of evidence we present here is at least suggestive of the potential invalidity of the T-AN concept. Specifically, regarding standard deviations, we add that while it is of course true that group size is included in the formula for the standard deviation, it is also true in our view that in truly terminal illnesses, variability should nonetheless be low specifically regarding key illness-related parameters. Here, it was not. Efforts to replicate our findings with larger samples are of course encouraged, though we acknowledge that such projects are inherently challenging from a pragmatic point of view, given the need for large numbers of people with a clinically severe form of a relatively rare illness.
Additionally, we note that Gaudiani et al. do not specifically state that the T-AN diagnosis is accompanied by a worsening of symptoms; nevertheless, there is abundant language highly suggestive to that effect (i.e., “hazarding psychological risk and deterioration”, p. 12). Thus, as it is likely that true terminality is related to physical deterioration, we included four relevant physiological parameters (i.e., BMI, albumin levels, AST levels, and WBCs) in this study. However, data on only three of the four biological parameters were available at discharge, limiting our ability to assess change in WBCs across treatment. We would also like to underscore our attrition rates, as referenced above (see Footnote 1). In addition, we reiterate the point that conclusive arbitration of AN’s terminality may prove empirically elusive, in that such questions invariably involve values, ethics, and philosophy. It is a defensible if debatable point of view that there are some questions (many of them philosophical in nature) that are forever beyond the reach of empirical scrutiny, and AN’s terminality or lack of it may be one such question (a point that would apply to the terminality of any psychiatric condition). Still, Gaudiani and colleagues’ proposal of T-AN has an empirical character, in that they are defining the construct using quantifiable metrics (e.g., presence vs. absence of the AN diagnosis, age, etc.). If Gaudiani and colleagues are arguing that T-AN is a valid clinical construct in nature, potentially one that could appear in future editions of the DSM (e.g., as a specifier), and we believe they are, then our view is that empirical scrutiny is essential, and that is what we strive to provide here. Finally, our study covered a 3-year period from 2020 to 2023, a time of the COVID-19 pandemic and thus not necessarily representative of other timeframes.
Furthermore, admission-to-discharge timeframes were relatively brief and may not have fully captured the timeline of treatment progression; nevertheless, we would expect to see a consistent and homogenous decline in the T-AN group if it were truly terminal, yet these data show very few such outcomes.
Conclusion
The current study is the first to our knowledge to empirically examine the concept of terminality in AN based on the criteria presented for T-AN as proposed by Gaudiani et al. (2022). Findings suggest that those meeting three of the four criteria of T-AN, as well as those additionally meeting a proxy for the fourth criterion, experience a severe illness marked by both psychological and physiological symptoms; however, this group was, for the most part, not significantly different from others with AN and showed similar improvement across treatment, inconsistent with the “terminal” designation. Thus, future studies should continue to examine the concept of terminality, explore the implications of terminal diagnoses, and research further evidence of the existence of terminality within mental health diagnoses. On this latter point, the stakes are high; the empirical support should therefore be commensurately clear; data presented here are relatively unsupportive.
General Scientific Summary.
“Terminal” is a proposed new specifier for an anorexia nervosa (AN) diagnosis with the following criteria: (a) diagnosis of AN, (b) over 30 years old, (c) previously repeatedly sought high-quality treatment, and (d) desire to fatally discontinue treatment. This study empirically scrutinizes the “terminal” label for AN by comparing those with AN who meet the “terminal” criteria to those who do not. Findings here suggest that the groups meeting three-criteria T-AN (as well as those meeting a proxy of the fourth criterion) show substantial variability in admission, discharge, and treatment response on physiological and self-report measures, overall indicating improvement that is inconsistent with a terminal pattern for this group.
Acknowledgments
Morgan Robison was supported by the National Institute of Mental Health (Grant 5T32MH093311-12). Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the National Institute of Mental Health. Dan V. Blalock was supported by Career Development Award 19-035 (IK2HX003085-01A2) from the U.S. Department of Veterans Affairs Health Services Research and Development Service. Renee D. Rienecke receives consulting fees from the Training Institute for Child and Adolescent Eating Disorders, LLC, and receives royalties from Routledge. Daniel Le Grange receives royalties from Guilford Press and Routledge, is a codirector of the Training Institute for Child and Adolescent Eating Disorders, LLC, and is a member of the Clinical Advisory Board at Equip Health. The remaining authors declare that they have no conflicts of interest. These results have not been previously disseminated, and the data are not available (i.e., data involve medical health records, and participants did not consent to de-identified public data sharing), but materials and code are accessible by contacting the corresponding author. Hypotheses and analyses were not preregistered.
Footnotes
Supplemental materials: https://doi.org/10.1037/abn0000912.supp
1. It is important to acknowledge that participants who complete admission and discharge assessments may not be representative of those with only admission data. Past analyses on a related sample indicated attrition had no influence on outcomes (Joiner et al., 2022), and analyses on this sample further affirmed that view.
2. To ensure there were no substantive differences between the T-AN group above versus below 18.5 kg/m2, we reran all analyses further classifying the T-AN group into two categories: BMI above 18.5 kg/m2 and BMI below 18.5 kg/m2. There were no meaningful differences between the two T-AN BMI categories except for the EDE-Q-weight subscale. When further examined, almost everyone in the T-AN group (i.e., except for one participant in the above BMI category who showed no change) improved from admission to discharge.
3. A latent class analysis was conducted to further assess if the binary associations related to the terminal criteria (i.e., AN type, age of 30 and over, readmittance to high-quality care, and endorsement of desire for death) identified a group presenting as “terminal.” Two-, three-, and four-class solutions were analyzed with different indices supporting different class solutions. The two-class solution did not support a “terminal” distinction because the model fit was very poor, the distinction between the classes was minimal (i.e., low entropy = 0.52), and the classes were based entirely on the readmission variable.
4. Eight participants in the T-AN group (15.7%) had an involuntary admission at some point during their treatment (two of whom were in the TD-AN subset); only one participant of the eight did not improve on the EDE-Q measures (while improving on all other measures) during their stay. Furthermore, everyone in the TD-AN subset showed improvement. We randomly selected 60 participants in the NT-AN group (30 of whom were NTO-AN), three of whom had involuntary visits (5.0%). The three participants were all over 30 years old. Chi-square difference tests between T-AN and NT-AN were not statistically significant (χ2 = 3.53, p = .060) and remained non-significant when comparing T-AN to NTO-AN (χ2 = 0.77, p = .382).
5. Only AST and albumin results (not WBC counts) are included in the final analyses presented. WBC was collected at admission, with an n = 9 at discharge for the terminal sample, precluding full inclusion. Nevertheless, for those nine patients, WBC looked unremarkable.
References
- American Psychiatric Association (APA). (2013). Diagnostic and statistical manual of mental disorders: DSM-5 (Vol. 5).
- Benjamini Y, & Hochberg Y (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289–300. 10.1111/J.2517-6161.1995.TB02031.X
- Berg KC, Peterson CB, Frazier P, & Crow SJ (2012). Psychometric evaluation of the eating disorder examination and eating disorder examination-questionnaire: A systematic review of the literature. International Journal of Eating Disorders, 45(3), 428–438. 10.1002/eat.20931
- Birmingham CL, Su J, Hlynsky JA, Goldner EM, & Gao M (2005). The mortality rate from anorexia nervosa. International Journal of Eating Disorders, 38(2), 143–146. 10.1002/eat.20164
- Crow SJ (2023). Terminal anorexia nervosa cannot currently be identified. International Journal of Eating Disorders, 56(7), 1329–1334. 10.1002/eat.23957
- Dhira TA, Rahman MA, Sarker AR, & Mehareen J (2021). Validity and reliability of the Generalized Anxiety Disorder-7 (GAD-7) among university students of Bangladesh. PLoS ONE, 16(12), Article e0261590. 10.1371/journal.pone.0261590 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Downs J. (2023). Care pathways for longstanding eating disorders must offer paths to recovery, not managed decline. BJPsych Bulletin. Advance online publication. 10.1192/BJB.2023.38 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Downs J, Ayton A, Collins L, Baker S, Missen H, & Ibrahim A (2023). Untreatable or unable to treat? Creating more effective and accessible treatment for long-standing and severe eating disorders. The Lancet Psychiatry, 10(2), 146–154. 10.1016/S2215-0366(22)00400-X [DOI] [PubMed] [Google Scholar]
- Eddy KT, Tabri N, Thomas JJ, Murray HB, Keshaviah A, Hastings E, Edkins K, Krishna M, Herzog DB, Keel PK, & Franko DL (2017). Recovery from anorexia nervosa and bulimia nervosa at 22-year follow-up. The Journal of Clinical Psychiatry, 78(2), 184–189. 10.4088/JCP.15m10393 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Elwyn R. (2023). A lived experience response to the proposed diagnosis of terminal anorexia nervosa: Learning from iatrogenic harm, ambivalence and enduring hope. Journal of Eating Disorders, 11(1), 1–18. 10.1186/s40337-022-00729-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Feldman JG, Gange SJ, Bacchetti P, Cohen M, Young M, Squires KE, Williams C, Goldwasser P, & Anastos K (2003). Serum albumin is a powerful predictor of survival among HIV-1-infected women. JAIDS Journal of Acquired Immune Deficiency Syndromes, 33(1), 66–73. 10.1097/00126334-200305010-00010 [DOI] [PubMed] [Google Scholar]
- Gaudiani JL, Bogetz A, & Yager J (2022). Terminal anorexia nervosa: Three cases and proposed clinical characteristics. Journal of Eating Disorders, 10(1), 1–14. 10.1186/s40337-022-00548-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldwasser P, & Feldman J (1997). Association of serum albumin and mortality risk. Journal of Clinical Epidemiology, 50(6), 693–703. 10.1016/S0895-4356(97)00015-2 [DOI] [PubMed] [Google Scholar]
- Guarda AS, Hanson A, Mehler P, & Westmoreland P (2022). Terminal anorexia nervosa is a dangerous term: It cannot, and should not, be defined. Journal of Eating Disorders, 10(1), 1–3. 10.1186/s40337-022-00599-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hall P, & Cash J (2012). What is the real function of the liver “function” tests? The Ulster Medical Journal, 81(1), 30–36. https://pubmed.ncbi.nlm.nih.gov/23536736 [PMC free article] [PubMed] [Google Scholar]
- Harbottle EJ, Birmingham CL, & Sayani F (2008). Anorexia nervosa: A survival analysis. Eating and Weight Disorders: EWD, 13(2), e32–e34. https://pubmed.ncbi.nlm.nih.gov/18612251/ [PubMed] [Google Scholar]
- Hay P, & Touyz S (2018). Classification challenges in the field of eating disorders: Can severe and enduring anorexia nervosa be better defined? Journal of Eating Disorders, 6(1), Article 41. 10.1186/s40337-018-0229-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Huppert JD, Walther MR, Hajcak G, Yadin E, Foa EB, Simpson HB, & Liebowitz MR (2007). The OCI-R: Validation of the subscales in a clinical sample. Journal of Anxiety Disorders, 21(3), 394–406. 10.1016/j.janxdis.2006.05.006 [DOI] [PubMed] [Google Scholar]
- Joiner TE, Robison M, McClanahan S, Riddle M, Manwaring J, Rienecke RD, Le Grange D, Duffy A, Mehler PS, & Blalock DV (2022). Eating disorder behaviors as predictors of suicidal ideation among people with an eating disorder. International Journal of Eating Disorders, 55(10), 1352–1360. 10.1002/eat.23770 [DOI] [PubMed] [Google Scholar]
- Kroenke K, & Spitzer RL (2002). The PHQ-9: A new depression diagnostic and severity measure. Psychiatric Annals, 32(9), 509–515. 10.3928/0048-5713-20020901-06 [DOI] [Google Scholar]
- Kroenke K, Spitzer RL, & Williams JBW (2001). The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16(9), 606–613. 10.1046/j.1525-1497.2001.016009606.x [DOI] [PMC free article] [PubMed] [Google Scholar]
- Laboratory Corporation of America. (2021a). Complete blood count (CBC) with differential. https://www.labcorp.com/tests/005009/complete-blood-count-cbc-with-differential [Google Scholar]
- Laboratory Corporation of America. (2021b). Metabolic panel comprehensive. https://www.labcorp.com/tests/322000/metabolic-panel-14-comprehensive [Google Scholar]
- Mehta SH, Astemborski J, Sterling TR, Thomas DL, & Vlahov D (2006). Serum albumin as a prognostic indicator for HIV disease progression. AIDS Research and Human Retroviruses, 22(1), 14–21. 10.1089/aid.2006.22.14 [DOI] [PubMed] [Google Scholar]
- Murray SB, Quintana DS, Loeb KL, Griffiths S, & Le Grange D (2019). Treatment outcomes for anorexia nervosa: A systematic review and meta-analysis of randomized controlled trials. Psychological Medicine, 49(4), 535–544. 10.1017/S0033291718002088 [DOI] [PubMed] [Google Scholar]
- Ozawa Y, Shimizu T, & Shishiba Y (1998). Elevation of serum aminotransferase as a sign of multiorgan-disorders in severely emaciated anorexia nervosa. Internal Medicine, 37(1), 32–39. 10.2169/internalmedicine.37.32 [DOI] [PubMed] [Google Scholar]
- Quest Diagnostics. (2019). CBC (includes differential and platelets). https://testdirectory.questdiagnostics.com/test/test-detail/6399/?cc=DLO
- Ramirez E, & Fletcher J (2023). AST (SGOT) blood test: High and low levels, and what do results mean. https://www.medicalnewstoday.com/articles/320982 [Google Scholar]
- Riddle M, O’Melia AM, & Bauschka M (2022). First, do no harm: The proposed definition of “terminal anorexia” is fraught with danger for vulnerable individuals. Journal of Eating Disorders, 10(1), 1–3. 10.1186/S40337-022-00605-X/METRICS [DOI] [PMC free article] [PubMed] [Google Scholar]
- Robison M, Rogers ML, Robertson L, Duffy ME, Manwaring J, Riddle M, Rienecke RD, Le Grange D, Duffy A, Plotkin M, Blalock DV, Mehler PS, & Joiner TE (2022). Avoidant restrictive food intake disorder and suicidal ideation. Psychiatry Research, 317, Article 114925. 10.1016/j.psychres.2022.114925
- Rose JS, Vaewsorn A, Rosselli-Navarra F, Wilson GT, & Weissman RS (2013). Test-retest reliability of the Eating Disorder Examination-Questionnaire (EDE-Q) in a college sample. Journal of Eating Disorders, 1(1), 1–10. 10.1186/2050-2974-1-42
- Snedecor GW, & Cochran WG (1982). Statistical methods (7th ed.). Iowa State University Press.
- Spitzer RL, Kroenke K, Williams JBW, & Löwe B (2006). A brief measure for assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine, 166(10), 1092–1097. 10.1001/archinte.166.10.1092
- Spitzer RL, Williams JBW, & Kroenke KJ (2014). Test review: Patient health questionnaire-9 (PHQ-9). Rehabilitation Counseling Bulletin, 57(4), 246–248. 10.1177/0034355213515305
- Steinhausen HC (2002). The outcome of anorexia nervosa in the 20th century. American Journal of Psychiatry, 159(8), 1284–1293. 10.1176/appi.ajp.159.8.1284
- The Cleveland Clinic. (2022a). Albumin blood test: What it is, purpose, procedure & results. https://my.clevelandclinic.org/health/diagnostics/22390-albumin-blood-test
- The Cleveland Clinic. (2022b). Leukocytosis (high white blood cell count): Causes & symptoms. https://my.clevelandclinic.org/health/diagnostics/17704-high-white-blood-cell-count
- The Cleveland Clinic. (2022c). Low white blood cell count (leukopenia): Causes, symptoms & treatment. https://my.clevelandclinic.org/health/symptoms/17706-low-white-blood-cell-count
- Touyz S, & Hay P (2015). Severe and enduring anorexia nervosa (SE-AN): In search of a new paradigm. Journal of Eating Disorders, 3(1), Article 26. 10.1186/s40337-015-0065-z
- Touyz S, Le Grange D, Lacey H, Hay P, Smith R, Maguire S, Bamford B, Pike KM, & Crosby RD (2013). Treating severe and enduring anorexia nervosa: A randomized controlled trial. Psychological Medicine, 43(12), 2501–2511. 10.1017/S0033291713000949
- Watson HJ, & Bulik CM (2013). Update on the treatment of anorexia nervosa: Review of clinical trials, practice guidelines and emerging interventions. Psychological Medicine, 43(12), 2477–2500. 10.1017/S0033291712002620
- Wonderlich SA, Bulik CM, Schmidt U, Steiger H, & Hoek HW (2020). Severe and enduring anorexia nervosa: Update and observations about the current clinical reality. International Journal of Eating Disorders, 53(8), 1303–1312. 10.1002/eat.23283
- Yager J, Gaudiani JL, & Treem J (2022). Eating disorders and palliative care specialists require definitional consensus and clinical guidance regarding terminal anorexia nervosa: Addressing concerns and moving forward. Journal of Eating Disorders, 10, 135. 10.1186/s40337-022-00659-x