The Journal of Clinical Endocrinology and Metabolism. 2021 Nov 9;107(4):1040–1052. doi: 10.1210/clinem/dgab808

A Longitudinal Study of Medial Temporal Lobe Volumes in Graves Disease

Mats Holmberg 1,2, Helge Malmgren 2,3, Rolf A Heckemann 3,4, Birgitta Johansson 5, Niklas Klasson 3,5, Erik Olsson 2, Simon Skau 3,5, Göran Starck 4,6, Helena Filipsson Nyström 2,7,8
PMCID: PMC8947220  PMID: 34752624

Abstract

Context

Neuropsychiatric symptoms are common features of Graves disease (GD) in hyperthyroidism and after treatment. The mechanism behind these symptoms is unknown, but reduced hippocampal volumes have been observed in association with increased thyroid hormone levels.

Objective

This work aimed at investigating GD influence on regional medial temporal lobe (MTL) volumes.

Methods

Sixty-two women with newly diagnosed GD underwent assessment including magnetic resonance (MR) imaging in hyperthyroidism and 48 of them were followed up after a mean of 16.4 ± 4.2 SD months of treatment. Matched thyroid-healthy controls were also assessed twice at a 15-month interval. MR images were automatically segmented using multiatlas propagation with enhanced registration. Regional medial temporal lobe (MTL) volumes for amygdalae and hippocampi were compared with clinical data and data from symptom questionnaires and neuropsychological tests.

Results

Patients had smaller MTL regions than controls at inclusion. At follow-up, all 4 MTL regions had increased volumes and only the volume of the left amygdala remained reduced compared to controls. There were significant correlations between the level of thyrotropin receptor antibodies (TRAb) and MTL volumes at inclusion and also between the longitudinal difference in the levels of free 3,5,3′-triiodothyronine and TRAb and the difference in MTL volumes. There were no significant correlations between symptoms or test scores and any of the 4 MTL volumes.

Conclusion

Dynamic alterations in the amygdalae and hippocampi in GD reflect a previously unknown level of brain involvement both in the hyperthyroid state of the condition and after treatment. The clinical significance, as well as the mechanisms behind these novel findings, warrant further study of the neurological consequences of GD.

Keywords: Graves disease, hippocampus, amygdala, longitudinal, cognition, depression


Thyroid hormones are essential for brain function. Overt hypothyroidism leads to cognitive impairment and reduced gray matter volume of the hippocampi (1). In hyperthyroidism, mental symptoms include unrest, stress intolerance, fatigue, compromised well-being (2, 3), anxiety and depression (4), as well as cognitive impairment (2). Most symptoms resolve on achieving euthyroidism, but in a large proportion of patients, full mental health is not regained (5-9). As in hypothyroidism, hyperthyroidism is linked to smaller hippocampal volumes as compared to controls (10).

Reduced hippocampal volumes are described in several diseases with mental symptoms (11-15). For Cushing syndrome, the response to treatment includes both recoveries of hippocampal volumes and improvement of mental symptoms (11, 16). In some of these illnesses, smaller amygdalar volumes are also observed (14, 17-19).

The robustness of the finding of reduced hippocampal volumes in hyperthyroidism, as well as the mechanism behind the volume reduction and the relationship to mental symptoms and neuropsychological performance, remain to be determined. In Graves disease (GD), thyroid hormones, and thyrotropin (TSH) receptor antibodies (TRAbs) are plausible coplayers, with target receptors present in high levels in the medial temporal lobe (MTL) (20-25). Stressful life events as such have also been associated with reductions in MTL volumes (26-29).

Although longitudinal brain volume alterations previously have been described in experimental thyrotoxicosis (30), the only previous human study on structural brain changes in GD was cross-sectional (10). It is therefore unknown whether the return to euthyroidism in GD is accompanied by a structural recovery in the brain and whether the brain’s morphological changes are correlated with changes in mental symptoms. We previously described CogThy, an ongoing scientific effort seeking to answer this and other questions about GD (31). With a view to MTL volumetry, the longitudinal approach of the CogThy study circumvents the problem that the natural intersubject variation of MTL volumes is large and makes it difficult to detect even substantial volume changes through group comparisons.

In this part of the CogThy study, we hypothesized that 1) MTL structures in patients with untreated GD are smaller than those in healthy controls; 2) as patients improve their thyroid state, MTL volumes increase, and; 3) the degree of symptomatic recovery is linked to the increase in MTL volumes.

Materials and Methods

Study Design

The full CogThy study protocol has been reported elsewhere (31). In summary, the CogThy study is a prospective, case-controlled trial at the Centre of Endocrinology and Metabolism, Sahlgrenska University Hospital, Göteborg, Sweden, with an open treatment period that included 65 premenopausal female patients with GD and 65 matched controls, from September 2011 to October 2019. The patients were included within 2 weeks from the start of treatment with antithyroid drugs (ATDs). Treatment of hyperthyroidism followed the local treatment regimen, by which all patients either receive an ATD in a block-and-replacement fashion (addition of 50 mcg levothyroxine after 2 weeks and increased dose to 100 mcg after 4 weeks) or undergo surgery with previous ATD treatment.

Patients, as well as controls, underwent a comprehensive multimodal assessment battery at inclusion and after a mean of 16.4 ± 4.2 SD months including demographics, smoking status, thyroid hormonal and antibody assessment, questionnaires for mental fatigue, anxiety, depression, and quality of life (QoL), and neuropsychological testing. On the same occasions, the participants underwent magnetic resonance (MR) imaging.

Participants

Patients were recruited consecutively from the Thyroid Unit at Sahlgrenska University Hospital, Göteborg (n = 64), and from the Department of Medicine at Kungälv’s Hospital, Kungälv (n = 1) in Sweden between 2011 and 2019. They were eligible if they were premenopausal and hyperthyroid with free thyroxine (fT4) levels greater than or equal to 50 pmol/L (reference range, 12-22 pmol/L) and/or total 3,5,3′-triiodothyronine (T3) levels greater than or equal to 6.0 nmol/L (reference range, 1.3-3.1 nmol/L). In addition, they had to have elevated TRAb levels and/or technetium scintigraphy with a diffuse uptake. Exclusion criteria were self-reported pregnancy; serious somatic diseases such as other endocrine diseases, heart failure, respiratory failure, active malignancy, psychosis; or inability to follow the study protocol for other reasons. Further exclusion criteria were systemic glucocorticoid treatment (past, present, or anticipated use within 15 months); MR-incompatible implants or other MR contraindications; and amiodarone-induced GD.

Control participants matched for age and sex were randomly selected from the Swedish population registry in the same Gothenburg area. They were invited by mail. Those who responded positively were asked to participate if they matched the patient for smoking status and educational level and had normal thyroid hormone levels at the screening. In addition to the exclusion criteria stated for patients, controls were excluded if they had previous or ongoing thyroid disease.

We approached 116 patients for participation and 65 (56%) enrolled. One patient was excluded immediately because of menopause, and 2 more because of MR difficulties.

The present study is based on 62 patients and a subsample of 56 controls and 48 patients at 15 months for whom MR segmentation results are currently available. Only 22 follow-up MR scans in controls are included because a preliminary analysis found no indication of a longitudinal change in accordance with the small changes expected in this rather young, healthy group observed over 15 months.

Ethics

Ethical approval was granted by the Regional Ethical Review Board in Göteborg, Sweden (Dnr 190-10). The study was conducted under the Declaration of Helsinki. The study was registered in the public project database for research and development in Västra Götaland County, Sweden (https://www.researchweb.org/is/vgr/project/44321).

Biochemistry

Blood samples were analyzed at the Department of Clinical Chemistry at Sahlgrenska University Hospital, Göteborg, Sweden for serum T4, fT4, T3, free T3 (fT3), and TSH by electrochemiluminescence immunoassay (Roche Elecsys ECL, Roche Diagnostics International AG). Total TRAbs were analyzed using radioreceptor analysis with Brahms Kryptor (Thermo Fisher Scientific). Reference ranges for fT3, fT4, TSH, and TRAbs are reported in Fig. 1A-1D.

Figure 1.


Laboratory measurements and reference values of A, serum free triiodothyronine (S-fT3); B, serum free thyroxine (S-fT4); C, serum thyrotropin (S-TSH); and D, serum TSH receptor antibody (S-TRAb) in premenopausal women with newly diagnosed Graves disease at inclusion (Pat 0; N = 62) and at follow-up (Pat 15; N = 48), and matched controls (Con; N = 56). For the box-whisker plots, the horizontal line within the box is the median, X is the mean, the horizontal ends of the boxes are the lower and upper quartiles, and the whiskers are nonoutlier minimum and maximum. Numbers of participants (N) and levels of significance for group comparisons (Mann-Whitney U test) are specified in the table below. IQR, interquartile range; Pat, patient. P values less than .05 are shown in bold.

Symptom Scoring and Neuropsychological Tests

Symptoms of anxiety and depression were assessed by self-evaluation based on the Comprehensive Psychopathological Rating Scale (CPRS) (32). Self-evaluation of mental fatigue was performed using the Mental Fatigue Scale (MFS) (33). The Swedish version of the Thyroid-Related Patient-Reported Outcome (ThyPRO) was used to assess QoL (34, 35). In this publication, only the ThyPRO dimensions Cognitive problems and Emotional Susceptibility were used.

The neuropsychological examination comprised assessments of processing speed, attention, working memory, and verbal fluency administered in a standardized sequence as follows: 1) Trail Making Test (TMT) A and B plus 2 extended versions with a higher load on divided attention (TMT C and D) for speed, visual scanning, flexibility, and divided attention (36, 37); 2) Digit Symbol coding for processing speed and Digit span for auditory working memory from the third edition of the Wechsler Adult Intelligence Scale (38); 3) F-A-S for verbal fluency from the Delis-Kaplan Executive Function System (39); and 4) reading speed (40). All TMT tests measure the time (in seconds) it takes to connect a group of symbols with a line, for which a lower score indicates higher speed. Digit Symbol coding is measured as the number of correct digit-symbol pairs during a specified time. Digit span measures the number of digits that the test taker can remember both forward and backward. A higher score indicates a greater memory span. The letter verbal fluency test (F-A-S) measures the total number of words beginning with F, A, and S that the test taker can produce (1 minute/letter). Reading speed is measured in words per second.

Brain Morphology

MR images were acquired with a 3-Tesla MR scanner (Philips Gyroscan Achieva 3T, Philips Healthcare) at the Department of Radiology, Sahlgrenska University Hospital, Göteborg, Sweden. For automatic volumetry, structural images of 0.7 × 0.7 × 1 mm voxel size were acquired axially using a 3-dimensional T1-weighted fast field echo sequence. Coronal T2W sections with 0.35 × 0.35 × 2 mm voxel size were acquired for manual volumetry.

An experienced radiologist inspected the MR images for visually apparent structural abnormalities of the brain.

The 3-dimensional T1-weighted images used for automatic volumetry were processed using the following steps:

  1. Subsampling to 1-mm3 isotropic voxels;

  2. Bias correction using N4 (41);

  3. Positional normalization (https://soundray.org/posnorm);

  4. Pincram (42) for brain extraction (using the IXI database as atlases, as discussed in Heckemann et al) (42) and intracranial volume (ICV) masking (using an atlas constructed from segmentations performed with the method described by Klasson et al) (43);

  5. Tissue class segmentation (FSL FAST);

  6. Whole-brain anatomical segmentation (MAPER [multiatlas propagation with enhanced registration] using the Hammersmith atlas database) (44-46);

  7. Masking of the hippocampus and amygdala labels with a gray-matter mask (from step 5).
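The final step of the list above (restricting an anatomical label to gray-matter voxels and converting the voxel count to a volume) can be illustrated with a minimal sketch. This is not the MAPER or FSL code used in the study: the label value, the flattened toy images, and the function name are hypothetical, and real images would be 3-dimensional arrays read from the segmentation output.

```python
# Hypothetical label value; the actual atlas label numbers are not given in the text.
LEFT_HIPPOCAMPUS = 1


def masked_label_volume_mm3(labels, gm_mask, label, voxel_volume_mm3=1.0):
    """Volume of one anatomical label, counting only voxels inside the gray-matter mask.

    labels  : flattened label image (one integer per voxel)
    gm_mask : flattened binary gray-matter mask (truthy = gray matter)
    """
    if len(labels) != len(gm_mask):
        raise ValueError("label image and mask must have the same number of voxels")
    n_voxels = sum(1 for lab, gm in zip(labels, gm_mask) if lab == label and gm)
    return n_voxels * voxel_volume_mm3


# Toy data (illustrative only): 8 voxels resampled to 1 mm^3 each.
labels = [0, 1, 1, 1, 2, 2, 0, 1]
gm_mask = [0, 1, 1, 0, 1, 0, 0, 1]
volume = masked_label_volume_mm3(labels, gm_mask, LEFT_HIPPOCAMPUS)
print(volume)
```

Because the images are subsampled to 1-mm3 isotropic voxels in step 1, the voxel count and the volume in mm3 coincide, which is why the default voxel volume is 1.0.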

The imaging specialist (R.H.) was blinded to participant identity and group (patient or control). To reduce variability in the MTL volumes due to head size, the volumes were normalized by ICV using the function:

$$f(v_i) = v_i - k\,(\mathrm{icv}_i - \overline{\mathrm{icv}})$$

where $f(v_i)$ is the normalized MTL volume from MR examination $i$, $v_i$ is the unnormalized MTL volume from examination $i$, $k$ is the regression coefficient from a simple linear regression with the MTL volume as the dependent variable and ICV as the independent variable, $\mathrm{icv}_i$ is the ICV from examination $i$, and $\overline{\mathrm{icv}}$ is the mean ICV from all examinations. Normalization was applied to volumes from both baseline and follow-up examinations. However, volumes from the follow-up examinations were not included when calculating the regression coefficient ($k$) to avoid potential inaccuracies in the estimation due to possible volume change during treatment.
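The normalization described above (subtracting from each volume the regression coefficient times the deviation of that examination's ICV from the mean ICV) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names and all numbers are hypothetical. Following the text, the slope k is estimated from baseline examinations only, whereas the mean ICV is taken over all examinations.

```python
from statistics import mean


def regression_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def normalize_by_icv(volumes, icvs, k, icv_mean):
    """Apply f(v_i) = v_i - k * (icv_i - mean ICV) to each examination."""
    return [v - k * (icv - icv_mean) for v, icv in zip(volumes, icvs)]


# Hypothetical data: 4 participants, MTL volume in mm^3, ICV in cm^3.
baseline_icv = [1300.0, 1400.0, 1500.0, 1600.0]
baseline_vol = [1650.0, 1700.0, 1750.0, 1800.0]
followup_icv = [1300.0, 1400.0, 1500.0, 1600.0]
followup_vol = [1700.0, 1760.0, 1800.0, 1850.0]

# k is estimated from baseline examinations only.
k = regression_slope(baseline_icv, baseline_vol)
# The mean ICV is computed over all examinations (baseline and follow-up).
icv_mean = mean(baseline_icv + followup_icv)

baseline_norm = normalize_by_icv(baseline_vol, baseline_icv, k, icv_mean)
followup_norm = normalize_by_icv(followup_vol, followup_icv, k, icv_mean)
print(k, baseline_norm, followup_norm)
```

In this toy example the baseline volumes depend linearly on ICV, so normalization removes all head-size variation at baseline, and what remains at follow-up is the change over time.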

Effect of Equipment Upgrade on Morphometry

The MR hardware and/or software were upgraded several times during the study period. Two major upgrades were identified, the first one in April 2014. An American College of Radiology MR imaging phantom (47) was scanned just before that upgrade and again in August 2017. The resulting phantom images were compared to quantify differences in geometric distortion and intensity mapping. No such differences could be found.

Statistics

Owing to a lack of previous studies on this topic when this study was conceived, the power calculation for the volumetric analysis was based on changes in hippocampus volume in patients with Cushing disease scanned with a 1.5-T scanner.

According to the power calculation, 40 patients and 40 controls should be enough to detect a longitudinal interindividual hippocampal volume mean difference of 10% with 80% power and a statistical significance level of .05. To account for the risk of dropouts, we included 65 patients. For the statistical analysis, fT3, fT4, and TRAb values exceeding the detection range in either direction were set to the detection limit. When fewer than 62 patients were included in a subanalysis, the number is noted. Spearman rank correlations were used for association analyses. Patients and controls were compared using the Wilcoxon matched-pairs test and the Mann-Whitney U test, except for MR volumetry and neuropsychological test data, which were compared using t tests. Intraindividual longitudinal MTL comparisons were performed using paired t tests, and comparisons of MTL volumes and neuropsychological test results between patients and controls were performed using unpaired t tests. Significance tests were 2-tailed and the statistical significance level was set at .05. No correction for multiple testing was performed, for 3 reasons. First, the MTL volumes are heavily interdependent. Second, the measurements were hypothesis-driven and not made in a random search for significant results. Third, it was deemed important to avoid type II errors, which would mean failing to recognize the importance of variables for further research in the project.
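To give a sense of what the stated sample size implies, the sketch below computes the standardized effect size detectable with 40 participants per group at a 2-tailed significance level of .05 and 80% power. The text does not state which formula or software was used for the power calculation, so the normal approximation for a 2-group comparison of means is an assumption, as are the function and variable names.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect_size(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized mean difference (difference / SD) detectable
    in a 2-group comparison, using the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 2-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 / n_per_group)


d = detectable_effect_size(40)
# If the target difference is 10% of the mean volume, this is the largest
# between-subject SD (in % of the mean) compatible with that target.
implied_sd_percent = 10 / d
print(round(d, 3), round(implied_sd_percent, 1))
```

Under this approximation, detecting a 10% difference with 40 participants per group requires the SD of the compared quantity to be no larger than roughly 16% of the mean.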

All statistical calculations were made using Statistica 13.2 (Tibco Software Inc).

Results

In Hyperthyroidism

At the time of diagnosis, the GD patients presented with fT4 greater than or equal to 50 pmol/L in 93.5% (58/62) and total T3 greater than or equal to 6.0 nmol/L in 56.4% (31/55). Elevated TRAb was observed in 98.4% (61/62) of the patients. The patient with normal TRAb had a positive uptake on technetium scintigraphy. Table 1 summarizes the history of the patients. Baseline characteristics are reported in Table 2.

Table 1.

Disease characteristics of the 62 patients with newly diagnosed Graves disease who underwent a magnetic resonance investigation at inclusion

| | No. | Median (IQR) or n (%) |
|---|---|---|
| Time from blood test before diagnosis to inclusion, d | 61 | 13 (11.0-19.0) |
| Time from start of antithyroid drugs to blood test at inclusion, d | 62 | 8 (6-13) |
| Duration of symptoms before diagnosis, mo | 60 | 4 (3.0-6.2) |
| Patients treated with β-blockers | 62 | 48 (77.4) |
| Patients treated with antithyroid drugs | 62 | 62 (100) |
| Thiamazole | 62 | 60 (96.8) |
| Propylthiouracil | 62 | 2 (3.2) |
| Eye evaluation with clinical activity score > 3 | 62 | 0 |

Abbreviation: IQR, interquartile range.

Table 2.

Demographic and clinical characteristics of Graves disease patients and controls matched for age, education and smoking status at inclusion

| | Patients (N = 62), mean ± SD or n (%) | Controls (N = 56), mean ± SD or n (%) | P |
|---|---|---|---|
| Age, y | 32.3 ± 9.1 | 33.7 ± 8.8 | .392 a |
| BMI | 22.6 ± 4.1 | 24.7 ± 4.4 | .009 a |
| Previous smoker | 12 (19.4) | 12 (21.4) | .820 b |
| Current smoker | 12 (19.3) | 7 (12.5) | .389 b |
| College education | 40 (64.6) | 45 (80.4) | .441 b |

P value less than .05 is shown in bold.

Abbreviation: BMI, body mass index.

a Unpaired t test.

b χ² test.

Thyroid Treatment

All patients were initially treated with thiamazole. In 2 patients, thiamazole was replaced with propylthiouracil before the inclusion visit. All treatment was given according to clinical routine with ATDs in block-and-replacement fashion or surgery following ATD pretreatment. At follow-up, 16 of 48 (33.3%) had undergone thyroidectomy, 1 patient was treated with radioactive iodine, 23 of 48 (47.9%) were on ATDs, and 8 of 48 (16.6%) had no treatment and were in remission.

Biochemistry

The levels of fT3, fT4, TSH, and TRAbs at the time of inclusion and follow-up are presented in Fig. 1A-1D. At follow-up, thyroid hormones had returned to normal levels for most patients, but median fT4 and TRAbs remained higher and fT3 and TSH lower in patients compared to controls. At 15 months, TSH levels were below normal in 7 of 48 (14.6%), above normal in 3 of 48 (6.2%), and within the normal range in 38 of 48 (79.2%). TRAb levels improved with treatment but had not yet returned to normal in 39.6% of patients (see Fig. 1D). At follow-up, there was no statistically significant difference in any of the volumes between the patients with a normal TSH and those with a TSH outside the reference range and, similarly, between the patients with negative and positive TRAbs (data not shown).

Mental Symptoms and Neuropsychological Tests

At inclusion, 62 patients and 56 controls completed the CPRS, MFS, and ThyPRO questionnaires. At follow-up, the same questionnaires were completed by 48 patients. At inclusion, patients reported worse mental symptom scores than controls in all the questionnaires. At follow-up, the patients’ symptoms had improved but, except for ThyPRO Emotional susceptibility, they still had poorer mental symptom scores than controls. Results are presented in Fig. 2.

Figure 2.


Comprehensive Psychopathological Rating Scale (CPRS) scores for A, depression; and B, anxiety; C, Mental Fatigue Scale (MFS) score; D, the Thyroid-related Patient-reported Outcome (ThyPRO) dimensions cognitive problems; and E, emotional susceptibility in premenopausal women with newly diagnosed Graves disease at inclusion (patient [Pat] 0; N = 62) and at follow-up (Pat 15; N = 48), and matched controls (Con; N = 56). For the box-whisker plots, the horizontal line within the box is the median, X is the mean, the horizontal ends of the boxes are the lower and upper quartiles, and the whiskers are non-outlier minimum and maximum. Numbers of participants (N) and levels of significance for group comparisons with unpaired t test are specified in the table below. The dotted line in the MFS plot represents the cutoff for pathological scores (≥ 10.5). P values less than .05 are shown in bold.

Neuropsychological tests were performed by 62 patients at inclusion, 46 patients at 15 months, and 54 controls. Results are presented in Table 3. At inclusion, the patients were significantly slower in TMT B and scored fewer correct digit-symbol pairs in Digit Symbol Coding compared to the controls. At follow-up, patients were faster in TMT A than controls. With treatment, patients improved in all tests except Digit span and reading speed.

Table 3.

Results of tests in 62 premenopausal women with newly diagnosed Graves disease (at inclusion) and 54 matched controls

Comparison of patients and controls at inclusion

| Test | Patients at inclusion, No. | Mean | SD | Controls at inclusion, No. | Mean | SD | P |
|---|---|---|---|---|---|---|---|
| Trail Making Test A | 62 | 29.3 | 11.8 | 54 | 27.2 | 8.3 | .284 |
| Trail Making Test B | 62 | 71.5 | 26.1 | 54 | 62.4 | 21.8 | .047 |
| Trail Making Test C | 61 | 66.8 | 24.8 | 54 | 58.8 | 24.3 | .085 |
| Trail Making Test D | 61 | 116.0 | 40.9 | 54 | 105.1 | 37.5 | .140 |
| Digit Symbol Coding | 62 | 77.1 | 13.8 | 54 | 82.1 | 12.7 | .044 |
| Digit Span | 62 | 15.3 | 3.4 | 54 | 15.0 | 3.4 | .620 |
| F-A-S a | 62 | 41.0 | 11.6 | 54 | 43.8 | 12.9 | .215 |
| Reading speed | 61 | 3.1 | 0.9 | 54 | 3.1 | 0.8 | .920 |

Comparison of patients at 15 mo and controls at inclusion

| Test | Patients at 15 mo, No. | Mean | SD | Controls at inclusion, No. | Mean | SD | P |
|---|---|---|---|---|---|---|---|
| Trail Making Test A | 46 | 23.9 | 8.6 | 54 | 27.2 | 8.3 | .050 |
| Trail Making Test B | 46 | 59.5 | 24.6 | 54 | 62.4 | 21.8 | .533 |
| Trail Making Test C | 46 | 57.9 | 21.7 | 54 | 58.8 | 24.3 | .853 |
| Trail Making Test D | 46 | 98.9 | 38.5 | 54 | 105.1 | 37.5 | .414 |
| Digit Symbol Coding | 46 | 80.9 | 16.3 | 54 | 82.1 | 12.7 | .691 |
| Digit Span | 46 | 15.3 | 3.5 | 54 | 15.0 | 3.4 | .644 |
| F-A-S a | 46 | 46.6 | 11.0 | 54 | 43.8 | 12.9 | .252 |
| Reading speed | 45 | 3.1 | 0.9 | 54 | 3.1 | 0.8 | .862 |

Comparison of patients at inclusion and at 15 mo

| Test | Patients at inclusion, No. | Mean | SD | Patients at 15 mo, No. | Mean | SD | P |
|---|---|---|---|---|---|---|---|
| Trail Making Test A | 45 | 28.4 | 12.2 | 45 | 23.9 | 8.7 | 9.0e-04 |
| Trail Making Test B | 45 | 67.8 | 25.3 | 45 | 60.2 | 24.5 | .012 |
| Trail Making Test C | 44 | 62.8 | 21.5 | 44 | 56.7 | 19.6 | .013 |
| Trail Making Test D | 44 | 109.8 | 36.9 | 44 | 97.6 | 37.7 | .013 |
| Digit Symbol Coding | 45 | 77.8 | 13.3 | 45 | 80.5 | 16.2 | .027 |
| Digit Span | 45 | 15.8 | 3.3 | 45 | 15.2 | 3.5 | .227 |
| F-A-S a | 45 | 43.3 | 10.8 | 45 | 46.7 | 11.1 | .006 |
| Reading speed | 44 | 3.1 | 0.9 | 44 | 3.1 | 0.9 | .600 |

Data are presented as mean and SD. Comparisons between controls and patients are made with t test and between patients at inclusion and at 15 months with paired t test. P values less than .05 are shown in bold.

a F-A-S is a test that measures the total number of words that start with F, A, and S that the test taker can produce in 1 minute per letter.

Brain Morphology

At inclusion, the volumes of amygdalae and hippocampi were smaller in patients compared to controls: The mean differences in volume percentage were –10.4% for the left amygdala (P = 2.8e-6), –13.3% for the right amygdala (P = 5.0e-8), –4.4% for the left hippocampus (P = .013), and –4.8% for the right hippocampus (P = .009) (Table 4). With ATD and/or surgical treatment, the patients’ amygdalae and hippocampi increased significantly in size: the mean increase was +6.7% for the left amygdala (P = 3.0e-4), +11.1% for the right amygdala (P = 1.7e-6), +5.6% for the left hippocampus (P = 1.3e-4), and +5.8% for the right hippocampus (P = 1.2e-5) (Table 5A). At follow-up, only the left amygdala remained significantly smaller in patients than in controls (–5.0%, P = .029) (Table 5B).

Table 4.

Volumes of amygdalae and hippocampi in premenopausal women with newly diagnosed Graves disease at inclusion (N = 62) and matched controls (N = 56) at inclusion

Mean ± SD ICV-normalized volume, mm3

| Region | Patients at inclusion (N = 62) | Controls at inclusion (N = 56) | Difference in mean, mm3 | P a |
|---|---|---|---|---|
| Left amygdala | 1135.2 ± 148.4 | 1266.5 ± 140.3 | –131.3 | 2.8e-6 |
| Right amygdala | 994.1 ± 161.5 | 1146.1 ± 114.9 | –152.0 | 5.0e-8 |
| Left hippocampus | 1681.7 ± 160.7 | 1759.9 ± 177.2 | –78.3 | .013 |
| Right hippocampus | 1799.0 ± 193.7 | 1889.6 ± 176.8 | –90.6 | .009 |

Segmentation was performed with the MAPER (multiatlas propagation with enhanced registration) automatic method. Intracranial volume (ICV)-normalized volumes were used. P values less than .05 are shown in bold.

a Unpaired t test.
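The percentage differences quoted in the text can be reproduced from the group means in Table 4, taking the control mean as the reference. The sketch below does this and, in addition, computes an unpaired t statistic from the summary statistics. The pooled-variance (Student) form of the t test is an assumption, since the text does not say whether equal variances were assumed, and the function names are hypothetical.

```python
from math import sqrt

# Group means, SDs, and sample sizes as reported in Table 4 (mm^3).
# Each entry: (patient mean, patient SD, control mean, control SD)
TABLE4 = {
    "Left amygdala": (1135.2, 148.4, 1266.5, 140.3),
    "Right amygdala": (994.1, 161.5, 1146.1, 114.9),
    "Left hippocampus": (1681.7, 160.7, 1759.9, 177.2),
    "Right hippocampus": (1799.0, 193.7, 1889.6, 176.8),
}
N_PATIENTS, N_CONTROLS = 62, 56


def percent_difference(patient_mean, control_mean):
    """Difference in means, expressed as a percentage of the control mean."""
    return 100.0 * (patient_mean - control_mean) / control_mean


def pooled_t_statistic(m1, s1, n1, m2, s2, n2):
    """Unpaired t statistic from summary data, assuming equal variances."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


percentages = {}
t_values = {}
for region, (pm, ps, cm, cs) in TABLE4.items():
    percentages[region] = percent_difference(pm, cm)
    t_values[region] = pooled_t_statistic(pm, ps, N_PATIENTS, cm, cs, N_CONTROLS)
    print(f"{region}: {percentages[region]:.1f}%  t = {t_values[region]:.2f}")
```

The same calculation does not reproduce the longitudinal percentages from the group means in Table 5A, which suggests that those were computed differently (for example, as the mean of individual percentage changes); the text does not specify this.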

Table 5.

Volumes of amygdalae and hippocampi in premenopausal women with newly diagnosed Graves disease and matched controls

A, Comparison of patients at inclusion and patients at 15 mo in euthyroidism

Mean ± SD non–ICV-normalized volume, mm3

| Region | Patients at inclusion (N = 47 c) | Patients at 15 mo (N = 47 c) | Difference in mean, mm3 | P a |
|---|---|---|---|---|
| Left amygdala | 1123.1 ± 154.9 | 1201.7 ± 141.8 | –78.6 | 3.0e-4 |
| Right amygdala | 997.1 ± 160.7 | 1094.7 ± 137.9 | –97.6 | 1.7e-6 |
| Left hippocampus | 1682.1 ± 191.1 | 1761.6 ± 189.8 | –79.5 | 1.3e-4 |
| Right hippocampus | 1797.5 ± 220.6 | 1885.5 ± 199.4 | –88.0 | 1.2e-5 |

B, Comparison of patients at 15 mo and controls at inclusion

Mean ± SD ICV-normalized volume, mm3

| Region | Patients at 15 mo (N = 48) | Controls at inclusion (N = 56) | Difference in mean, mm3 | P b |
|---|---|---|---|---|
| Left amygdala | 1207.1 ± 132.2 | 1266.5 ± 140.3 | –59.4 | .029 |
| Right amygdala | 1099.9 ± 126.5 | 1146.1 ± 114.9 | –46.2 | .054 |
| Left hippocampus | 1766.2 ± 179.5 | 1759.9 ± 177.2 | 6.3 | .858 |
| Right hippocampus | 1892.6 ± 173.5 | 1889.6 ± 176.8 | 3.0 | .932 |

Segmentation was performed with the MAPER (multiatlas propagation with enhanced registration) automatic method. Owing to dropouts, 48 patients were included. Volumes not normalized by intracranial volume (ICV) were used for longitudinal comparisons in patients, and ICV-normalized volumes were used for comparisons between patients and controls. P values less than .05 are shown in bold.

a Paired t test.

b Unpaired t test.

c Only 47 of the 48 patients remaining at 15 months had a magnetic resonance scan at baseline.

In the 22 controls who were investigated with an interval of 15.5 ± 1.5 SD months, the differences in non–ICV-normalized volumes were 0.9% for left amygdala (95% CI, –4.3 to 3.9), –1.6% for right amygdala (95% CI, –6.7 to 3.5), –0.2% for left hippocampus (95% CI, –2.6 to 2.2), and –0.6% for right hippocampus (95% CI, –4.2 to 3.0) (Fig. 3).

Figure 3.


Scatterplots of the left and right volumes of amygdalae and hippocampi, not normalized by intracranial volume (ICV), in 22 controls at inclusion and at follow-up. Segmentation was performed with the MAPER (multiatlas propagation with enhanced registration) automatic method. Differences were calculated as the inclusion value minus the value at follow-up. At the top, the mean difference (in %) is presented, followed by a 95% CI from a one-sample t test of the mean difference where the reference value was set to 0.

Impact of Equipment Upgrade on Morphometry

Investigations of the effect of scanner upgrades showed that the change in patient MTL volumes between baseline and follow-up was not biased by these upgrades. The same holds for the difference between amygdalar volumes of patients and controls at baseline (data not shown).

Thyroid Hormones, Antibodies, and Medial Temporal Lobe Volumes

At inclusion and follow-up, no correlations were observed between hormone levels (fT3, fT4, TSH) and the normalized volumes of amygdalae and hippocampi in patients (data not shown). At inclusion, there were negative correlations between the level of TRAb and the normalized volume of the left (ρ = –0.35, P = .006) and right amygdala (ρ = –0.35, P = .005) and the right hippocampus (ρ = –0.29, P = .023). At follow-up, there were no statistically significant differences in any of the volumes between the patients with a normal TSH and those with a TSH outside the reference range, as well as no difference between the patients who had been through surgery and those who had not (data not shown).

There was a negative correlation between the longitudinal difference (between inclusion and follow-up) in the level of fT3 and the difference in volumes of the left MTL (Fig. 4) but not for the right MTL. There was also a negative correlation between the difference in TRAbs and the difference in the volume of the left and right amygdala and the left hippocampus (Fig. 5). There were no correlations between the difference in fT4 and the difference in MTL volumes (data not shown).

Figure 4.


Correlations between the longitudinal difference in serum levels of free 3,5,3′-triiodothyronine (fT3) and the longitudinal difference in volume of left amygdala and hippocampus in 47 patients with Graves disease. Differences were calculated as the inclusion value minus the value at follow-up, after treatment. Correlations are presented as Spearman rank correlations. P values less than .05 are shown in bold.

Figure 5.


Correlations between the longitudinal difference in serum levels of thyrotropin receptor antibodies (TRAb) and the longitudinal difference in volume of all 4 medial temporal lobe (MTL) regions in 47 patients with Graves disease. Differences were calculated as the inclusion value minus the value at follow-up, after treatment. Correlations are presented as Spearman rank correlations. P values less than .05 are shown in bold.
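The type of analysis shown in Figs. 4 and 5 (a Spearman rank correlation between two sets of longitudinal differences, each calculated as the inclusion value minus the follow-up value) can be sketched as follows. The study used Statistica; this pure-Python version is only an illustration, and all numbers and names are hypothetical rather than study data.

```python
from math import sqrt
from statistics import mean


def longitudinal_differences(inclusion, follow_up):
    """Inclusion value minus follow-up value, per participant."""
    return [a - b for a, b in zip(inclusion, follow_up)]


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical values for 5 patients (TRAb in IU/L, volume in mm^3).
trab_inclusion = [10.0, 6.0, 20.0, 4.0, 8.0]
trab_follow_up = [2.0, 1.0, 3.0, 1.5, 5.0]
vol_inclusion = [1100.0, 1150.0, 1000.0, 1200.0, 1180.0]
vol_follow_up = [1180.0, 1200.0, 1150.0, 1210.0, 1190.0]

d_trab = longitudinal_differences(trab_inclusion, trab_follow_up)
d_vol = longitudinal_differences(vol_inclusion, vol_follow_up)
rho = spearman_rho(d_trab, d_vol)
print(round(rho, 3))
```

With this sign convention, a fall in TRAb gives a positive difference and a rise in volume gives a negative difference, so a negative ρ means that larger decreases in TRAb accompany larger increases in volume.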

Education, Smoking and Medial Temporal Lobe Volumes

At inclusion, no statistically significant differences in MTL volumes were found between patients with a college education and those with lower education. Nor were there any statistically significant differences in MTL volumes between current or previous smokers and nonsmokers.

Mental Symptoms, Neuropsychological Tests, and Medial Temporal Lobe Volumes

There were no significant correlations between any of the mental symptom scores or the neuropsychological test results and the volumes of amygdalae and hippocampi in patients at inclusion or follow-up.

There were no significant correlations between the difference in the volume of amygdalae and hippocampi and the difference in the score of any of the symptom questionnaires. The duration of symptoms did not correlate with volumes.

There were statistically significant negative correlations between the difference in TMT D (between inclusion and follow-up) and the difference between inclusion and follow-up in the left amygdala (r = –0.32, P = .039), right amygdala (r = –0.31, P = .046), and left hippocampus (r = –0.37, P = .015).

Discussion

This study demonstrates involvement of the brain in GD at a level not previously reported. Using a combined cross-sectional and longitudinal approach, we show that volume loss in the medial temporal lobe is present at diagnosis and that the response to treatment with ATDs and/or surgery includes partial recovery of these regions. The mechanism and clinical significance of this involvement is unclear, but further scientific inquiry is warranted to elucidate the evident neurological dimension of this disorder.

Our preliminary observation that hippocampi in untreated GD are reduced in volume is congruent with our first hypothesis and corroborates previous work (10). We also saw clear-cut reductions in amygdala volumes, which have not been shown before.

The plasticity of these brain regions is elegantly demonstrated as patients’ brain volumes recover with the treatment of hyperthyroidism, a finding that supports our second hypothesis. These results constitute the best evidence to date that GD is indeed associated with structural brain changes. They also support the hypothesis that the persistent mental symptoms that affect many patients may be a consequence of GD, even though the exact mechanism still needs to be elucidated. The absence of a relationship between the level of anxiety and MTL volumes makes stress per se an unlikely mediator of the volume reductions.

Even though all volumes increased with treatment at a group level, a difference remained between the patients and controls for the left amygdala. The same trend was evident for the right amygdala. The treatment response in terms of MTL volume change was highly variable. Several patients had approximately constant MTL volumes; in others, volumes decreased over the 15 months of treatment. It is impossible to tell from our data for an individual patient whether she started with subnormal volumes. Hence, we cannot tell whether a constant volume is a sign of treatment failure or a sign of unusual resistance of the brain to the effects of GD. However, these reservations do not contradict our interpretations of the results concerning the second hypothesis.

The only significant relationship between thyroid hormone levels and MTL volumes was the correlation between the difference in fT3 and the difference in left MTL volumes. A previous publication has found a correlation between the levels of fT4 and the normalized volume of the left hippocampus (10), a finding we could not confirm, most likely because of our exclusion of GD patients with moderately elevated levels of fT4 (< 50 pmol/L). In our study, the correlations between TRAbs and MTL volumes at inclusion together with the correlations between the differences in TRAbs and differences in MTL volumes highlight thyroid autoimmunity as another possible mechanistic factor.
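The change-versus-change correlations discussed above can be illustrated with a minimal sketch. It assumes one measurement per patient at inclusion and one at follow-up, computes each patient's longitudinal difference, and then correlates the two sets of differences. The numbers, the variable and function names, the units, and the choice of a Spearman rank correlation are all illustrative assumptions; they are not study data and do not necessarily reflect the statistical method used in this study.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def differences(inclusion, follow_up):
    """Per-patient longitudinal difference (follow-up minus inclusion)."""
    return [f - b for b, f in zip(inclusion, follow_up)]


# Hypothetical values for five patients (NOT study data).
ft3_inclusion = [30.0, 22.0, 18.0, 41.0, 25.0]   # assumed unit: pmol/L
ft3_follow_up = [5.0, 4.8, 5.5, 6.0, 5.2]
vol_inclusion = [2.00, 2.10, 2.20, 1.90, 2.05]   # assumed unit: mL
vol_follow_up = [2.12, 2.16, 2.23, 2.08, 2.14]

d_ft3 = differences(ft3_inclusion, ft3_follow_up)
d_vol = differences(vol_inclusion, vol_follow_up)
rho = spearman(d_ft3, d_vol)
print(f"Spearman rho between change in fT3 and change in volume: {rho:.2f}")
```

In this toy data set, the patients with the largest fall in fT3 also have the largest volume increase, so the coefficient is strongly negative. A rank-based coefficient is used here only because it makes no assumption about the distribution of the differences.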

One of our aims was to find plausible causal mechanisms behind long-standing, disabling mental symptoms in GD patients. The third hypothesis is therefore at the core of our aims. The complete lack of correlations between the differences in mental scores and the differences in brain volumes calls into question the hypothesis that the MTL volume changes are related to remaining mental symptoms.

Regarding the neuropsychological tests, our study design may have underestimated the severity of cognitive symptoms, as the tests were conducted at a median of 8 days after the initiation of ATD. It would, however, have been unethical to delay ATD treatment until the research visit was complete. Although a few statistically significant, but weak, correlations between test improvement and MTL volume increase were found, the lack of consistent correlations between test result changes and volume changes makes the significance of this finding uncertain. Thus, our third hypothesis did not receive any support.

However, one way of interpreting the positive results from our study is that GD does cause changes both in MTL volumes and mental symptoms and that these changes roughly reflect the course of the disease. The negative results mean that there is no evidence that MTL volumes can be used as predictive biomarkers of mental recovery under GD treatment.

The present study has both strengths and limitations. The inclusion of only women is a strength as well as a weakness. When evaluating differences in MTL brain volumes, the restriction to one sex excludes a possible confounder (48). Also, the prevalence of psychiatric conditions differs between men and women, and the focus on women makes the results of the study more relevant for the group that is most affected both by GD and by depression and anxiety.

A further strength is that we chose to avoid treatment with radioactive iodine, as this may be associated with a worse QoL outcome (9).

Another strength is that patients with thyroid-associated ophthalmopathy with ongoing steroid medication were excluded since moderately severe and severe thyroid-associated ophthalmopathy is known to impair QoL considerably (49). Steroid treatment was not initiated in any of the patients during the study’s 15-month follow-up period. This approach eliminates an important confounder.

A limitation is the inclusion of only patients with severe hyperthyroidism, as this limits the generalizability of the results to patients with more moderate increases in thyroid hormone levels. The inclusion of only women may also be regarded as a limitation because of the limited applicability of the results to men with GD.

A final limitation is the long inclusion time, which may have introduced MR-related biases due to software upgrades or scanner drift. We have investigated this to the best of our knowledge but cannot completely rule out a volumetry bias of unknown origin.

Conclusion

The finding of dynamic alterations in the amygdalae and hippocampi in GD reflects a previously unknown level of brain involvement both in the development of the condition and its response to treatment. The biochemical cause of brain volume reductions is likely more complex than high thyroid hormone levels, and thyroid autoimmunity should be further investigated. Although no correlations between symptom severity and brain volumes were found, at a group level MTL volumes were reduced in hyperthyroid GD patients and became larger after treatment, at the same time as symptom severity on all scales diminished. This parallelism indicates that the existence of mental symptoms as a disease consequence deserves further attention.

Acknowledgments

The authors would especially like to thank research nurse Jenny Tiberg for her dedicated work on the CogThy study. We would also like to thank Jonathan Ståhl Filipsson, Linnea Ståhl Filipsson, Ellinor Ljungberg, Tom Kullin, and Miriam McCann for their kind assistance with administrative work.

Glossary

Abbreviations

ATD: antithyroid drug
CPRS: Comprehensive Psychopathological Rating Scale
fT3: free 3,5,3′-triiodothyronine
fT4: free thyroxine
GD: Graves disease
ICV: intracranial volume
MAPER: multiatlas propagation with enhanced registration
MFS: Mental Fatigue Scale
MR: magnetic resonance
MTL: medial temporal lobe
QoL: quality of life
ThyPRO: Thyroid-Related Patient-Reported Outcome
TMT: Trail Making Test
TRAbs: thyrotropin receptor antibodies
TSH: thyrotropin

Financial Support: This work was supported by grants from the Swedish state under the agreement between the Swedish government and Västra Götalandsregionen, the ALF-agreement (Nos. ALFGBG-717311, ALFGBG-790271, and ALFGBG-442391), Göteborgs Läkaresällskap, Svenska Läkaresällskapet, Svenska Sällskapet för Medicinsk Forskning, Svenska Endokrinologföreningen, Fredrik och Ingrid Thurings Stiftelse, Irisstipendiet, Jeanssons Stiftelser, Tore Nilsons Stiftelse för Medicinsk Forskning, Stiftelserna Wilhelm och Martina Lundgrens, Anna-Lisa och Bror Björnssons Stiftelse, Adlerbertska Stiftelserna, and Åke Wiberg Stiftelse. Västra Götalandsregionen, Sahlgrenska Universitetssjukhuset. Knut och Alice Wallenbergs Stiftelse is acknowledged for generous support.

Author Contributions: M.H. collected the data; contributed data or analysis tools; performed the analysis; and wrote the paper. H.M. conceived and designed the analysis; collected the data; contributed data or analysis tools; performed the analysis; and wrote the paper. R.A.H. collected the data; contributed data or analysis tools; performed the analysis; and wrote the paper. B.J. conceived and designed the analysis; collected the data; contributed data or analysis tools; and wrote the paper. N.K. collected the data; contributed data or analysis tools; performed the analysis; and wrote the paper. E.O. collected the data; contributed data or analysis tools; and wrote the paper. S.S. collected the data; contributed data or analysis tools; and wrote the paper. G.S. collected the data; contributed data or analysis tools; and wrote the paper. H.F.N. conceived and designed the analysis; collected the data; contributed data or analysis tools; performed the analysis; and wrote the paper.

Additional Information

Disclosures: H.F.N. has received lecture fees from Siemens Inc, AstraZeneca, and Bristol Myers Squibb. The remaining authors have nothing to disclose.

Data Availability

Some or all data sets generated during and/or analyzed during the present study are not publicly available but are available from the corresponding author on reasonable request.

References

1. Cooke GE, Mullally S, Correia N, O’Mara SM, Gibney J. Hippocampal volume is decreased in adults with hypothyroidism. Thyroid. 2014;24(3):433-440.
2. Elberling TV, Rasmussen AK, Feldt-Rasmussen U, Hørding M, Perrild H, Waldemar G. Impaired health-related quality of life in Graves’ disease. A prospective study. Eur J Endocrinol. 2004;151(5):549-555.
3. Watt T, Groenvold M, Rasmussen AK, et al. Quality of life in patients with benign thyroid disorders. A review. Eur J Endocrinol. 2006;154(4):501-510.
4. Bové KB, Watt T, Vogel A, et al. Anxiety and depression are more prevalent in patients with Graves’ disease than in patients with nodular goitre. Eur Thyroid J. 2014;3(3):173-178.
5. Abraham-Nordling M, Törring O, Hamberger B, et al. Graves’ disease: a long-term quality-of-life follow up of patients randomized to treatment with antithyroid drugs, radioiodine, or surgery. Thyroid. 2005;15(11):1279-1286.
6. Berg G, Michanek A, Holmberg E, Nyström E. Clinical outcome of radioiodine treatment of hyperthyroidism: a follow-up study. J Intern Med. 1996;239(2):165-171.
7. Perrild H, Hansen JM, Arnung K, Olsen PZ, Danielsen U. Intellectual impairment after hyperthyroidism. Acta Endocrinol (Copenh). 1986;112(2):185-191.
8. Törring O, Tallstedt L, Wallin G, et al. Graves’ hyperthyroidism: treatment with antithyroid drugs, surgery, or radioiodine—a prospective, randomized study. Thyroid Study Group. J Clin Endocrinol Metab. 1996;81(8):2986-2993.
9. Törring O, Watt T, Sjölin G, et al. Impaired quality of life after radioiodine therapy compared to antithyroid drugs or surgical treatment for Graves’ hyperthyroidism: a long-term follow-up with the thyroid-related patient-reported outcome questionnaire and 36-item Short Form Health Status survey. Thyroid. 2019;29(3):322-331.
10. Zhang W, Song L, Yin X, et al. Grey matter abnormalities in untreated hyperthyroidism: a voxel-based morphometry study using the DARTEL approach. Eur J Radiol. 2014;83(1):e43-e48.
11. Bauduin SEEC, van der Wee NJA, van der Werff SJA. Structural brain abnormalities in Cushing’s syndrome. Curr Opin Endocrinol Diabetes Obes. 2018;25(4):285-289.
12. Bremner JD, Narayan M, Anderson ER, Staib LH, Miller HL, Charney DS. Hippocampal volume reduction in major depression. Am J Psychiatry. 2000;157(1):115-118.
13. Santos A, Granell E, Gomez-Anson B, et al. Depression and anxiety scores are associated with amygdala volume in Cushing’s syndrome: preliminary study. Biomed Res Int. 2017;2017:1-7.
14. van Mierlo TJ, Chung C, Foncke EM, Berendse HW, van den Heuvel OA. Depressive symptoms in Parkinson’s disease are related to decreased hippocampus and amygdala volume. Mov Disord. 2015;30(2):245-252.
15. Mueller SG, Schuff N, Yaffe K, Madison C, Miller B, Weiner MW. Hippocampal atrophy patterns in mild cognitive impairment and Alzheimer’s disease. Hum Brain Mapp. 2010;31(9):1339-1347.
16. Bourdeau I, Bard C, Noël B, et al. Loss of brain volume in endogenous Cushing’s syndrome and its reversibility after correction of hypercortisolism. J Clin Endocrinol Metab. 2002;87(5):1949-1954.
17. Vriend C, Boedhoe PS, Rutten S, Berendse HW, van der Werf YD, van den Heuvel OA. A smaller amygdala is associated with anxiety in Parkinson’s disease: a combined FreeSurfer-VBM study. J Neurol Neurosurg Psychiatry. 2016;87(5):493-500.
18. Klein-Koerkamp Y, Heckemann RA, Ramdeen KT, et al; Alzheimer’s Disease Neuroimaging Initiative. Amygdalar atrophy in early Alzheimer’s disease. Curr Alzheimer Res. 2014;11(3):239-252.
19. Zavorotnyy M, Zöllner R, Schulte-Güstenberg LR, et al. Low left amygdala volume is associated with a longer duration of unipolar depression. J Neural Transm (Vienna). 2018;125(2):229-238.
20. Bernal J. Thyroid hormones in brain development and function. In: Feingold KR, Anawalt B, Boyce A, et al, eds. Endotext [Internet]. MDText.com Inc; 2000.
21. de Escobar GM, Obregón MJ, del Rey FE. Iodine deficiency and brain development in the first half of pregnancy. Public Health Nutr. 2007;10(12A):1554-1570.
22. Ferreiro B, Bernal J, Goodyer CG, Branchard CL. Estimation of nuclear thyroid hormone receptor saturation in human fetal brain and lung during early gestation. J Clin Endocrinol Metab. 1988;67(4):853-856.
23. Peeters R, Fekete C, Goncalves C, et al. Regional physiological adaptation of the central nervous system deiodinases to iodine deficiency. Am J Physiol Endocrinol Metab. 2001;281(1):E54-E61.
24. Schroeder AC, Privalsky ML. Thyroid hormones, t3 and t4, in the brain. Front Endocrinol (Lausanne). 2014;5:1-6.
25. Moodley K, Botha J, Raidoo DM, Naidoo S. Immuno-localisation of anti-thyroid antibodies in adult human cerebral cortex. J Neurol Sci. 2011;302(1-2):114-117.
26. Gerritsen L, Kalpouzos G, Westman E, et al. The influence of negative life events on hippocampal and amygdala volumes in old age: a life-course perspective. Psychol Med. 2015;45(6):1219-1228.
27. Karl A, Schaefer M, Malta LS, Dörfel D, Rohleder N, Werner A. A meta-analysis of structural brain abnormalities in PTSD. Neurosci Biobehav Rev. 2006;30(7):1004-1031.
28. Logue MW, van Rooij SJH, Dennis EL, et al. Smaller hippocampal volume in posttraumatic stress disorder: a multisite ENIGMA-PGC study: subcortical volumetry results from posttraumatic stress disorder consortia. Biol Psychiatry. 2018;83(3):244-253.
29. Morey RA, Gold AL, LaBar KS, et al; Mid-Atlantic MIRECC Workgroup. Amygdala volume changes in posttraumatic stress disorder in a large case-controlled veterans group. Arch Gen Psychiatry. 2012;69(11):1169-1178.
30. Göbel A, Heldmann M, Göttlich M, Dirk AL, Brabant G, Münte TF. Effect of experimental thyrotoxicosis on brain gray matter: a voxel-based morphometry study. Eur Thyroid J. 2015;4(Suppl 1):113-118.
31. Holmberg MO, Malmgren H, Berglund P, et al. Structural brain changes in hyperthyroid Graves’ disease: protocol for an ongoing longitudinal, case-controlled study in Göteborg, Sweden—the CogThy project. BMJ Open. 2019;9(11):e031168.
32. Svanborg P, Asberg M. A new self-rating scale for depression and anxiety states based on the Comprehensive Psychopathological Rating Scale. Acta Psychiatr Scand. 1994;89(1):21-28.
33. Johansson B, Starmark A, Berglund P, Rödholm M, Rönnbäck L. A self-assessment questionnaire for mental fatigue and related symptoms after neurological disorders and injuries. Brain Inj. 2010;24(1):2-12.
34. Watt T, Barbesino G, Bjorner JB, et al. Cross-cultural validity of the thyroid-specific quality-of-life patient-reported outcome measure, ThyPRO. Qual Life Res. 2015;24(3):769-780.
35. Watt T, Hegedüs L, Groenvold M, et al. Validity and reliability of the novel thyroid-specific quality of life questionnaire, ThyPRO. Eur J Endocrinol. 2010;162(1):161-167.
36. Lezak MD, Howieson DB, Loring DW, eds. Neuropsychological assessment. 4th ed. American Psychological Association; 2004.
37. Johansson B, Berglund P, Rönnbäck L. Mental fatigue and impaired information processing after mild and moderate traumatic brain injury. Brain Inj. 2009;23(13-14):1027-1040.
38. Wechsler D, ed. Wechsler Adult Intelligence Scale—third edition, WAIS-III, Swedish version. Pearson Assessment; 2003.
39. Delis DC, Kaplan E, Kramer JH, eds. Delis-Kaplan Executive Function System—D-KEFS. The Psychological Corporation; 2001.
40. Madison S. Läsdiagnos. Läs och skrivcentrum; 2003.
41. Tustison NJ, Avants BB, Cook PA, et al. N4ITK: improved N3 bias correction. IEEE Trans Med Imaging. 2010;29(6):1310-1320.
42. Heckemann RA, Ledig C, Gray KR, et al. Brain extraction using label propagation and group agreement: pincram. PLoS One. 2015;10(7):e0129211.
43. Klasson N, Olsson E, Rudemo M, Eckerström C, Malmgren H, Wallin A. Valid and efficient manual estimates of intracranial volume from magnetic resonance images. BMC Med Imaging. 2015;15:1-10.
44. Heckemann RA, Keihaninejad S, Aljabar P, Rueckert D, Hajnal JV, Hammers A; Alzheimer’s Disease Neuroimaging Initiative. Improving intersubject image registration using tissue-class information benefits robustness and accuracy of multi-atlas based anatomical segmentation. Neuroimage. 2010;51(1):221-227.
45. Hammers A, Allom R, Koepp MJ, et al. Three-dimensional maximum probability atlas of the human brain, with particular reference to the temporal lobe. Hum Brain Mapp. 2003;19(4):224-247.
46. Wild HM, Heckemann RA, Studholme C, Hammers A. Gyri of the human parietal lobe: volumes, spatial extents, automatic labelling, and probabilistic atlases. PLoS One. 2017;12(8):e0180866.
47. Clarke GD. Overview of the ACR MRI accreditation phantom. MRI Phantoms & QA Testing. 1999:1-10.
48. Cosgrove KP, Mazure CM, Staley JK. Evolving knowledge of sex differences in brain structure, function, and chemistry. Biol Psychiatry. 2007;62(8):847-855.
49. Wiersinga WM. Quality of life in Graves’ ophthalmopathy. Best Pract Res Clin Endocrinol Metab. 2012;26(3):359-370.

Articles from The Journal of Clinical Endocrinology and Metabolism are provided here courtesy of The Endocrine Society
