Abstract
The purpose of this study was to use objective quantitative magnetic resonance imaging (MRI) methods to develop a computer-aided detection (CAD) tool to differentiate white matter (WM) hyperintensities into either leukoencephalopathy (LE) induced by chemotherapy or normal maturational processes in children treated for acute lymphoblastic leukemia without irradiation. A combined MRI set consisting of T1-weighted, T2-weighted, proton-density-weighted and fluid-attenuated inversion recovery images and WM, gray matter and cerebrospinal fluid proportional volume maps from a spatially normalized atlas were analyzed with a neural network segmentation based on a Kohonen self-organizing map (SOM). Segmented maps were manually classified to identify the most hyperintense WM region and the normal-appearing genu region. Signal intensity differences normalized to the genu within each examination were generated for four time points in 228 children. A second Kohonen SOM was trained on the first examination data and divided the WM into normal-appearing or LE groups. Reviewing labels from the CAD tool revealed a consistency measure of 89.8% (167 of 186) within patients. The overall agreement between the CAD tool and the consensus reading of two trained observers was 84.1% (535 of 636), with 84.2% (170 of 202) agreement in the training set and 84.1% (365 of 434) agreement in the testing set. These results suggest that subtle therapy-induced LE can be objectively and reproducibly detected in children treated for cancer using this CAD approach based on relative differences in quantitative signal intensity measures normalized within each examination.
Keywords: Computer-aided detection, Segmentation, White matter, Neurotoxicity, Childhood acute lymphoblastic leukemia
1. Introduction
When cancer treatment specifically targets the immature brain, treatment efficacy must be balanced against potential neurotoxicity and its associated impact on a survivor’s quality of life. Acute lymphoblastic leukemia (ALL) is the most common childhood cancer: approximately 2400 new cases are diagnosed each year in the United States. The 5-year event-free survival estimate for pediatric patients with ALL is approximately 80%. Treatment of pediatric ALL must include central nervous system (CNS)-directed therapy to prevent relapse in the CNS. Historically, CNS prophylaxis consisted of cranial radiation therapy (CRT) and intrathecal treatment. More recently, efforts to avoid the adverse neurological and neuropsychological effects of CRT have prompted the elimination of CRT and the intensification of CNS chemotherapy, including intermediate-dose or high-dose methotrexate (HDMTX). However, several reports have shown a significant association between treatment with HDMTX and development of leukoencephalopathy (LE) [1-3]. The term “leukoencephalopathy” is used to designate white matter (WM) damage in the CNS that is identified by hyperintense signals on T2-weighted (T2w) magnetic resonance imaging (MRI) after therapy.
Many recent protocols used MRI to qualitatively evaluate neurotoxicity from HDMTX treatment. In those studies, the degree of LE was rated on a subjective grading scale. Some studies [2,4,5] used a variant of the original scale proposed by Wilson et al. [1] in 1991. In these studies, normal examinations were rated as zero. Grade 1 corresponded to mild changes that did not extend to the gray matter (GM)-WM junction (0-25% of WM involved). Moderate changes extending to the GM-WM junction were rated as Grade 2 (25-50% of WM involved). Grade 3 was reserved for severe changes involving the GM-WM junction and continuing throughout the centrum semiovale (more than 50% of WM involved). Other studies only classified patients as normal, probably abnormal or definitely abnormal, with no assessment of the extent of WM involvement [6,7]. None of these studies evaluated intraobserver and interobserver variance, and subjective grading precluded the comparison of therapy-induced LE between clinical trials.
In addition to the difficulties of subjectively grading the extent of LE, there are no universal guidelines to differentiate subtle LE from normal developmental changes. LE can vary in duration, extent and intensity. While focal lesions are easily identified, subtle changes often lack distinct boundaries with the normal-appearing white matter (NAWM), especially the normal terminal zones of myelination. Unmyelinated WM regions are also slightly hyperintense on T2w imaging [8], an important consideration in a patient population with a peak age at diagnosis of between 2 and 5 years [9]. Therefore, reliably identifying subtle therapy-induced LE in children treated for ALL is a challenging task due to these MR properties.
The present study examined the feasibility of developing a computer-aided detection (CAD) tool to assist in identifying therapy-induced LE in young children. A robust segmentation algorithm was used to select the most hyperintense WM region and the normal-appearing WM in the genu of the corpus callosum. Signal intensity differences from these regions on multispectral MRI sets were combined with age on examination to provide a possible quantitative measure for decision making. A Kohonen self-organizing map (SOM) was trained on examinations near baseline and was then applied to all subsequent examinations on patients. Comparisons with two trained observers were used to determine if this new CAD tool provided a reproducible quantitative assessment of the presence of subtle therapy-induced LE in young children.
2. Methodology
2.1. Patient population
The CAD of subtle therapy-induced LE was designed and tested on patients treated for ALL without irradiation. The 228 patients included in this study were a subset of participants receiving treatment for ALL on an institutional protocol [10]. Patients underwent a chemotherapy regimen that included five courses of HDMTX with targeted systemic exposure to eliminate individual differences caused by variability in clearance [11]. Patients underwent a clinical imaging protocol after informed consent had been obtained from the patient, parent or guardian, as appropriate. Four MRI examinations were scheduled according to the protocol (Table 1), resulting in 636 examinations on 228 patients, who had at least one evaluable examination at the time of the study. The children ranged in age from 1.0 to 18.9 years (median, 5.6 years) at the time of enrollment in the protocol.
Table 1.
MR examinations collected per protocol
| MR examination | Weeks on protocol | Examinations collected |
|---|---|---|
| After one dose of HDMTX | 6 | 202 |
| After five doses of HDMTX | 14 | 194 |
| 1 year after HDMTX | 63 | 150 |
| End of therapy | 135 | 90 |
The first examination is collected as close to baseline as clinically feasible. The fifth course of HDMTX is the last HDMTX given to the patient.
2.2. MRI
LE is best visualized with a T2w sequence, preferably one that is cerebrospinal fluid (CSF)-attenuated. Thus, fluid-attenuated inversion recovery (FLAIR) seems to be the most sensitive sequence in evaluating the WM. The imaging protocol was designed to simultaneously yield raw images for segmentation procedures and images necessary for the clinical evaluation of patients. All MRI examinations were performed without a contrast agent on a 1.5-T Vision (Siemens Medical Systems, Iselin, NJ, USA) whole-body imager with a standard circular polarized volume head coil. To minimize variability, a localizer sequence was used to determine the position of the patient in the coil. A set of T1-weighted (T1w) sagittal images was collected as part of the diagnostic examination. The midline sagittal image was used to define an oblique transverse imaging plane that was used on all subsequent sequences. The tilt of this oblique plane was defined by the most inferior extent of the genu and by the splenium of the corpus callosum on the midline sagittal image.
All transverse images were acquired as a single set of nineteen 4-mm-thick images with a 1-mm gap. T1w images were acquired with a turbo inversion recovery imaging sequence [repetition time between spin excitations (TR)=8000 ms; echo time (TE)=20 ms; time between inversion and excitation pulse (TI)=300 ms; acquisition=1; echo train length=7]. Proton-density-weighted (PDw) and T2w images were acquired simultaneously with a dual spin-echo sequence (TR/TE1/TE2=3500/17/102 ms; acquisition=1). FLAIR images were acquired with a turbo inversion recovery sequence (TR/TE/TI=9000/119/2470 ms; acquisition=1; echo train length=7).
2.3. Preprocessing
Preprocessing of MR data consisted of registration and inhomogeneity correction. Registration methods used a modified version of the Statistical Parametric Mapping 02 (SPM2) software’s (Wellcome Department of Cognitive Neurology, Institute of Neurology, Queen Square, London, UK) [12] implementation of image coregistration to register the images within each examination and across all examinations of the same patient. The technique used a normalized mutual information (NMI)-based affine registration [13] to optimize six parameters (three for translation and three for rotation). After registration, intensity inhomogeneity in MR images was corrected using a new implementation of the entropy minimization algorithm. Two important modifications were made to the entropy minimization algorithm. First, a nonparametric Parzen window estimator with a Gaussian kernel was used to directly calculate the probability density function and the Shannon entropy from image data. Second, entropy minimization was performed with a conjugate gradient method. This algorithm can efficiently correct different types of MR images with decreased computational load and can preserve anatomical information in clinical images [14].
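The NMI criterion that drives the registration can be illustrated with a short sketch. The Python example below estimates NMI from a joint intensity histogram using the overlap-invariant form (H(A)+H(B))/H(A,B) [13]. The function name, the bin count and the use of NumPy are assumptions made for illustration; this is not the modified SPM2 routine used in the study.

```python
import numpy as np

def normalized_mutual_information(a, b, bins=32):
    """Estimate NMI = (H(A) + H(B)) / H(A, B) from a joint histogram.

    `bins` is an illustrative choice, not a value taken from the study.
    """
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = joint / joint.sum()      # joint probability estimate
    p_a = p_ab.sum(axis=1)          # marginal of the first image
    p_b = p_ab.sum(axis=0)          # marginal of the second image

    def entropy(p):
        p = p[p > 0]                # empty bins contribute nothing
        return -np.sum(p * np.log(p))

    return (entropy(p_a) + entropy(p_b)) / entropy(p_ab.ravel())
```

With this definition, two identical images give a value of 2, and the value falls toward 1 as the images become statistically independent. A registration routine would search the rigid-body parameters for the pose that maximizes this quantity.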
2.4. Segmentation of questionable hyperintense regions
A previous segmentation algorithm for quantifying LE served as the starting point for the identification algorithm in the current study. This technique used a neural network algorithm along with spatially normalized proportional volume maps and a gradient magnitude threshold to segment LE identified by a radiologist [15]. A combined imaging set consisting of registered and inhomogeneity-corrected T1w, T2w, PDw and FLAIR MR images and WM, GM and CSF proportional volume maps was analyzed with a Kohonen SOM [16,17]. Masks to limit segmentation to the most probable regions of WM were automatically generated by thresholding for pixels with a >30% proportional volume of WM, a <30% proportional volume of GM and a <10% proportional volume of CSF. A 3-D gradient vector magnitude calculated on the FLAIR was used to formulate a threshold to exclude pixels most likely to be image artifacts.
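The masking step can be sketched as follows, assuming that the proportional volume maps are stored as NumPy arrays of fractions between 0 and 1 on the same grid as the images. The function names and the `spacing` argument are hypothetical. The gradient threshold itself is not specified in the text, so only the gradient magnitude is computed here.

```python
import numpy as np

def white_matter_mask(wm, gm, csf):
    """Voxels with >30% WM, <30% GM and <10% CSF proportional volume."""
    return (wm > 0.30) & (gm < 0.30) & (csf < 0.10)

def gradient_magnitude_3d(volume, spacing=(1.0, 1.0, 1.0)):
    """Magnitude of the 3-D gradient vector of a volume (e.g., FLAIR).

    `spacing` is the voxel size along each axis; the default is a
    placeholder, not the acquisition geometry of the study.
    """
    g0, g1, g2 = np.gradient(np.asarray(volume, dtype=float), *spacing)
    return np.sqrt(g0 ** 2 + g1 ** 2 + g2 ** 2)
```

Pixels passing `white_matter_mask` would then be screened further by comparing `gradient_magnitude_3d` of the FLAIR volume against a threshold chosen as described in [15].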
The segmentation for possible regions of LE was completed with a slightly modified version of the previous algorithm. This process is characterized in the flowchart in Fig. 1. The goal was to identify the single most hyperintense region within the WM. In addition to this hyperintense region, a second region that encompassed the largest proportion of the genu of the corpus callosum was used to control for age effects and differences in signal intensity values between patients and examinations. The first modification was to reduce the number of outputs for the Kohonen SOM from 16 in the previous algorithm to 9 in the modified algorithm. This allowed more pixels to be grouped together into either the genu region or the hyperintense region.
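For readers unfamiliar with the Kohonen SOM, a minimal sketch of a nine-output map is given below. It is a generic textbook formulation and not the implementation used in this study: the 3x3 grid topology, the linearly decaying learning rate and neighborhood width, the number of epochs and all function names are assumptions.

```python
import numpy as np

def train_som(data, grid=(3, 3), epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small Kohonen SOM; returns one weight vector per output node."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    coords = np.array([(r, c) for r in range(grid[0]) for c in range(grid[1])],
                      dtype=float)
    n_nodes = len(coords)
    # Initialize the node weights from randomly chosen input vectors.
    weights = data[rng.choice(len(data), n_nodes, replace=False)].copy()
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # decaying neighborhood
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))     # neighborhood function
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def assign_to_nodes(data, weights):
    """Index of the nearest node (best-matching unit) for each input vector."""
    data = np.asarray(data, dtype=float)
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)
```

In the setting described above, each row of `data` would hold the seven inputs for one masked pixel (four image intensities and three proportional volumes), and each row of the returned array is the prototypical weight vector of one output region.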
Fig. 1.

This flowchart shows the steps by which a new set of MR images is processed to compute the difference vector that the CAD tool uses to label the examination as NAWM or LE.
The second modification was to the registration algorithm used for proportional volume maps. An in-house software routine based on a modified version of the SPM2 software’s [12] implementation of image coregistration was used to align the Cincinnati Children’s Hospital Medical Center atlas [18] proportional volume maps to the images of each patient. This registration technique uses the same NMI-based affine registration as the rigid-body registration of MR images with additional optimization parameters. Twelve parameters (three each for translation, rotation, scaling and shearing along the three coordinate axes) are optimized in this procedure.
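The twelve parameters can be collected into a single 4x4 homogeneous transformation matrix. The sketch below shows one way to do this. The order of composition (translation, rotation, scaling, shearing), the sign convention of the rotations and the function name are assumptions, and they may differ from the in-house routine.

```python
import numpy as np

def affine_matrix(translation=(0, 0, 0), rotation=(0, 0, 0),
                  scale=(1, 1, 1), shear=(0, 0, 0)):
    """Build a 4x4 affine from 12 parameters (rotation angles in radians)."""
    T = np.eye(4)
    T[:3, 3] = translation
    rx, ry, rz = rotation
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0, 0], [0, cx, sx, 0], [0, -sx, cx, 0], [0, 0, 0, 1.0]])
    Ry = np.array([[cy, 0, sy, 0], [0, 1, 0, 0], [-sy, 0, cy, 0], [0, 0, 0, 1.0]])
    Rz = np.array([[cz, sz, 0, 0], [-sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])
    Z = np.diag([scale[0], scale[1], scale[2], 1.0])
    S = np.eye(4)
    S[0, 1], S[0, 2], S[1, 2] = shear
    return T @ Rx @ Ry @ Rz @ Z @ S
```

Leaving `scale` and `shear` at their defaults reduces the matrix to the six-parameter rigid-body case used for registering the MR images to each other.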
The modified segmentation algorithm was used to generate a quantitative summary measurement for input into the CAD tool. Segmentation was manually classified to identify two regions: the most hyperintense region within the WM and the region representing the normal-appearing genu (Fig. 2). The images in Fig. 2 show how segmentation was able to group the pixels by classifying the genu and the hyperintense regions. A prototypical weight vector calculated from the Kohonen SOM was used to characterize the average T1w, T2w, PDw and FLAIR signal intensities for each of the two identified regions. To control for variations in signal intensity values due to patient age and imaging conditions, the genu values of the prototypical weight vector were subtracted from the values associated with the most hyperintense WM region. These four signal intensity difference values, combined with patient age on examination, were the five quantitative summary measurements used in the CAD tool.
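The construction of the five summary measurements can be written compactly. In this sketch the function name and the ordering of the inputs (T1w, T2w, PDw, FLAIR) are assumptions, and the numbers in the usage note are invented for illustration only.

```python
import numpy as np

def cad_feature_vector(hyper_weights, genu_weights, age_years):
    """Four genu-referenced intensity differences followed by age.

    Both weight arguments hold the T1w, T2w, PDw and FLAIR components of
    the prototypical weight vector for the respective region.
    """
    diff = np.asarray(hyper_weights, dtype=float) - np.asarray(genu_weights, dtype=float)
    return np.append(diff, float(age_years))
```

For example, `cad_feature_vector([400, 620, 700, 560], [450, 500, 650, 430], 3.5)` yields the vector `[-50, 120, 50, 130, 3.5]`.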
Fig. 2.

Example of the genu and most hyperintense regions segmented with the modified algorithm. From the left: a PDw image at the level of the basal ganglia, with the corresponding output from the segmentation algorithm showing the genu region (green), followed by a FLAIR image at the level of the centrum semiovale, with the corresponding output from the segmentation algorithm showing the hyperintense WM region (orange). All nine regions are shown for each of the segmented outputs.
2.5. CAD tool
The task for the CAD tool was to differentiate the signal intensity difference values computed from the segmentation algorithm into age-appropriate NAWM and LE. A flowchart of this process is shown in Fig. 3. A second Kohonen SOM was modified from the image segmentation algorithm to operate on the signal intensity differences, and this SOM divided the signal intensity differences from the training set into nine output groups. The nine output groups were manually examined; one NAWM group and one LE output group were identified. The other seven outputs were redistributed into either the NAWM or the LE output, depending on the minimum Euclidean distance, yielding a binary label by the CAD package. Unlike the SOM used in the segmentation algorithm, the SOM used in the CAD tool was trained once for a single data set and was applied to new data. The first examination served as the training set for the SOM of the CAD tool. The trained SOM was then applied to the signal intensity differences from the other three examinations to determine the label of the CAD tool.
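The reduction from nine outputs to a binary label can be sketched as follows. The sketch assumes that the Euclidean distance is measured between the weight vectors of the SOM outputs and that ties are assigned to NAWM; neither detail is stated in the text, and the function names are hypothetical.

```python
import numpy as np

def node_to_binary(weights, nawm_node, le_node):
    """Map every SOM output to 'NAWM' or 'LE' by its nearer reference output."""
    weights = np.asarray(weights, dtype=float)
    d_nawm = np.linalg.norm(weights - weights[nawm_node], axis=1)
    d_le = np.linalg.norm(weights - weights[le_node], axis=1)
    # Ties fall to NAWM (an assumption made for this sketch).
    return np.where(d_le < d_nawm, "LE", "NAWM")

def cad_label(vector, weights, node_labels):
    """Label a new difference vector through its best-matching output."""
    weights = np.asarray(weights, dtype=float)
    bmu = np.argmin(np.linalg.norm(weights - np.asarray(vector, dtype=float), axis=1))
    return str(node_labels[bmu])
```

Because the map is trained once, `node_to_binary` is evaluated a single time on the trained weights, and `cad_label` is then applied unchanged to the difference vector from every later examination.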
Fig. 3.

Flowchart describing the training and validation of the CAD tool. The inputs and outputs of the process are shown as ovals; the rectangles represent processing steps, and the diamonds represent decision-making steps. This flowchart demonstrates the division of all 636 difference vectors into a training set and a testing set, the use of the second Kohonen SOM, the formation of the CAD tool and, finally, the use of consensus labels from the two trained observers for validating the CAD labels and for producing agreement measurements for each data set. The dashed line between the consensus labels and the selection of the NAWM and LE outputs represents the use of this information to demonstrate the percentage of NAWM and LE examinations in each output of the SOM to guide the selection of the outputs in characterizing NAWM and LE using the CAD tool.
2.6. Validation of the CAD tool
The overall performance of the SOM trained for the CAD tool was assessed using two metrics. First, the consistency of the CAD labels was examined within each of the 228 patients. This consistency measure served to tie the independently calculated labels for each examination back to the HDMTX regimen that each patient received and was based on two assumptions: a patient could not have LE resolved and become normal with additional HDMTX, and a patient could not develop LE and become abnormal without additional HDMTX. The first assumption corresponds to an inconsistent labeling for a patient with LE on the first examination and with NAWM on the second examination. The second assumption corresponds to a patient who was labeled as NAWM on or after the second examination and was labeled as LE on a subsequent examination.
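The two rules can be expressed as a short check on the labels of one patient. In this sketch the labels are held in a dictionary keyed by examination number (1 to 4), missing examinations are simply absent, and the function name is hypothetical. Only the two patterns described above are treated as inconsistent.

```python
def is_consistent(labels):
    """Return False if a patient's labels violate either assumption.

    labels: dict mapping examination number (1-4) to 'NAWM' or 'LE'.
    """
    # Rule 1: LE on the first examination cannot become NAWM on the
    # second, because HDMTX was given between these two examinations.
    if labels.get(1) == "LE" and labels.get(2) == "NAWM":
        return False
    # Rule 2: NAWM on or after the second examination cannot become LE
    # later, because no further HDMTX was given.
    seen_nawm = False
    for exam in sorted(k for k in labels if k >= 2):
        if labels[exam] == "NAWM":
            seen_nawm = True
        elif labels[exam] == "LE" and seen_nawm:
            return False
    return True
```

Under these rules, a patient labeled NAWM, LE, LE and NAWM on the four examinations is consistent, since LE may develop during HDMTX and resolve after it has ended.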
The second metric used to evaluate the performance of the CAD tool was a comparison of each individual label to a manual reading of each examination by the two trained observers to determine if LE was present in any slice of the examination. To reduce variability, both observers viewed the data at the same time. The differentiation criterion utilized was the presence of a higher signal intensity in LE than in cortical GM. To evaluate the reproducibility of this manual reading, 50 examinations were randomly repeated, and agreement was measured using a κ index. In addition, the consistency of the trained observers was compared to the consistency of single grading by the on-duty radiologist at the time of the examination using a paired-samples t test. The grading scale for the radiologist was reduced to either an NAWM or an LE label to be comparable to the results from the trained observers and the CAD tool.
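The text does not state which form of the κ index was computed. As an illustration, the sketch below implements Cohen's κ for two sets of categorical readings of the same examinations, which is a common choice for this kind of repeated-reading comparison; the function name is hypothetical.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    first, second = list(first), list(second)
    if not first or len(first) != len(second):
        raise ValueError("readings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    categories = set(first) | set(second)
    expected = sum((first.count(c) / n) * (second.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both readings use a single identical label
    return (observed - expected) / (1.0 - expected)
```

For instance, the readings `["LE", "LE", "NAWM", "NAWM"]` and `["LE", "NAWM", "NAWM", "NAWM"]` agree on three of four examinations, with a chance agreement of 0.5, giving κ=0.5.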
3. Results
3.1. Segmentation of questionable hyperintense regions
The modified segmentation algorithm was able to produce quantitative data for use with the CAD tool. The signal intensity differences from the FLAIR and T2w images are plotted with the label from the CAD tool (Fig. 4). The graph shows that larger magnitude differences in signal intensities generally correspond to LE. The continuous nature of these differences is also shown, as there is no clear separation between the NAWM and LE groups. FLAIR imaging from three patients also demonstrates the various patterns of LE, which can vary continuously from NAWM to focal LE (Fig. 5). The patient with focal LE (far right) was imaged at the end of the HDMTX administration, and the other two patients were imaged after one course of HDMTX. While extensive and severe LE lesions with signal intensities near that of CSF can be seen easily by an observer with little experience, subtle diffuse changes have MR properties and locations nearly identical to those of unmyelinated or partially myelinated WM.
Fig. 4.

Distribution of the signal intensity differences computed by the segmentation algorithm. The graph shows the FLAIR signal intensity difference plotted on the vertical axis and the T2w signal intensity differences plotted on the horizontal axis. Filled circles (●) represent examinations labeled as NAWM by the CAD tool, while open circles (○) represent examinations labeled as LE.
Fig. 5.

Example FLAIR imaging for three patients. All images shown are from patients who were 3.5 years of age at the time of the MR examination. From left to right are images from a patient without LE, a patient with intermediate hyperintensities representing possible LE and a patient with focal LE.
3.2. Validation of the CAD tool
The consistency and reproducibility of the trained observers were examined before using these results. Of the 186 patients with more than one examination, 168 (90.3%) were consistently labeled by the trained observers. In contrast, single grading by the on-duty radiologist at the time of the examination revealed consistent labels in only 73.5% of patients (136 of 185; P<.001). This substantiates the need for a standardized quantitative approach to allow for reproducible readings. The reproducibility measured in the repeated reading of 50 examinations revealed a κ=0.88 for the trained observers. This high level of reproducibility allows for the use of more consistent consensus labels to evaluate the performance of the CAD tool. Consistency for the CAD tool was 89.8% (167 of 186) for all patients with more than one examination. The overall agreement between the CAD tool and the consensus reading of the two trained observers was 84.1% (535 of 636), with 84.2% (170 of 202) agreement in the first examinations and with 84.1% (365 of 434) agreement in all other examinations.
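The reported percentages follow directly from the stated counts, as the short check below illustrates. All counts are taken from the text and Table 1; the variable and function names are arbitrary.

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)

# Examinations collected at the four time points (Table 1).
exams_per_time_point = [202, 194, 150, 90]
total_exams = sum(exams_per_time_point)
testing_exams = total_exams - exams_per_time_point[0]

observer_consistency = percent(168, 186)
radiologist_consistency = percent(136, 185)
cad_consistency = percent(167, 186)
overall_agreement = percent(535, total_exams)
training_agreement = percent(170, exams_per_time_point[0])
testing_agreement = percent(365, testing_exams)
```

The training and testing counts also sum to the overall count of agreements (170+365=535), and the examination totals sum to 636 (202+434).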
4. Discussion
Reliably detecting subtle therapy-induced LE in children treated for cancer is a challenging task due to the lack of delineation from the NAWM, especially in very young patients. This project sought to develop a new quantitative methodology to distinguish unmyelinated or partially myelinated WM from subtle therapy-induced LE, which usually presents as diffuse slightly hyperintense regions in the periventricular and centrum semiovale WM on T2w and FLAIR images of young children treated for cancer. The CAD tool developed in this study agreed with the consensus reading of the two trained observers in 84% of all examinations and gave consistent readings across 90% of longitudinal examinations.
As with any study, results must be interpreted while taking into consideration the limitations of the study. While difference measures compensate for age differences between patients by using an internal control (normal-appearing genu), it has been demonstrated previously that the T1 relaxation time of the genu reaches adult values at approximately 3 years of age [19]. This means that our referencing of suspected abnormal regions to the genu values adjusts for measurement variability, but has very little adjustment for age in older subjects. The inclusion of age as an input into the CAD tool helps minimize these effects. In addition, the comparison of the quantitative detection of LE to the trained observers’ reading assumes that the observers are always correct. The high rate of inconsistency within patient examinations using serial assessments by multiple neuroradiologists demonstrates the challenges of differentiating LE and unmyelinated or partially myelinated WM in young children. The proposed CAD system provided an objective detection of hyperintensities that are greater than normally expected on T2w and FLAIR images.
In conclusion, this study has described a novel approach for an MR-based CAD system to assist in the detection of subtle therapy-induced LE even in young children with unmyelinated or partially myelinated WM. An objective detection of therapy-induced LE will facilitate a better understanding of long-term effects on survivors. Combining individualized pharmacokinetic studies, longitudinal neurocognitive testing and quantitative MR assessments of LE extent and intensity could substantially advance the understanding of the impact of therapy on the quality of life of ALL survivors. Our ultimate goal is to use this method to quantify early therapy-induced neurotoxicity, which is potentially reversible through adjustment of therapy or through individualized neurobehavioral and pharmacological interventions.
Acknowledgments
The authors thank Greg Bernstein for his efforts in the processing and analysis of MR examinations, and Vickey Simmons for the scheduling of examinations.
This work was supported, in part, by Cancer Center Support grants P30CA21765 and R01CA90246 from the National Cancer Institute, by the American Cancer Society F.M. Kirby Clinical Research Professorship and by the American Lebanese Syrian Associated Charities.
References
- [1] Wilson DA, Nitschke R, Bowman ME, Chaffin MJ, Sexauer CL, Prince JR. Transient white matter changes on MR images in children undergoing chemotherapy for acute lymphocytic leukemia: correlation with neuropsychologic deficiencies. Radiology. 1991;180:205–9. doi: 10.1148/radiology.180.1.2052695.
- [2] Asato R, Akiyama Y, Ito M, Kubota M, Okumura R, Miki Y, et al. Nuclear magnetic resonance abnormalities of the cerebral white matter in children with acute lymphoblastic leukemia and malignant lymphoma during and after central nervous system prophylactic treatment with intrathecal methotrexate. Cancer. 1992;70:1997–2004. doi: 10.1002/1097-0142(19921001)70:7<1997::aid-cncr2820700732>3.0.co;2-g.
- [3] Mahoney DH, Jr, Shuster JJ, Nitschke R, Lauer SJ, Steuber CP, Winick NJ, et al. Acute neurotoxicity in children with B-precursor acute lymphoid leukemia: an association with intermediate-dose intravenous methotrexate and intrathecal triple therapy--a pediatric oncology group study. J Clin Oncol. 1998;16:1712–22. doi: 10.1200/JCO.1998.16.5.1712.
- [4] Paakko E, Harila-Saari A, Vanionpaa L, Himanen S, Pyhtinen J, Lanning M. White matter changes on MRI during treatment in children with acute lymphoblastic leukemia: correlation with neuropsychological findings. Med Pediatr Oncol. 2000;35:456–61. doi: 10.1002/1096-911x(20001101)35:5<456::aid-mpo3>3.0.co;2-1.
- [5] Chu WCW, Chik KW, Chan YL, Yeung KW, Roebuck DJ, Howard RG, et al. White matter and cerebral metabolite changes in children undergoing treatment for acute lymphoblastic leukemia: longitudinal study with MR imaging and 1H MR spectroscopy. Radiology. 2003;229:659–69. doi: 10.1148/radiol.2293021550.
- [6] Kingma A, Mooyaart EL, Kamps WA, Nieuwenhuizen P, Wilmink JT. Magnetic resonance imaging of the brain and neuropsychological evaluation in children treated for acute lymphoblastic leukemia at a young age. Am J Pediatr Hematol Oncol. 1993;15:231–8. doi: 10.1097/00043426-199305000-00012.
- [7] Kingma A, van Dommelen RI, Mooyaart EL, Wilmink JT, Deelman BG, Kamps WA. Slight cognitive impairment and magnetic resonance imaging abnormalities but normal school levels in children treated for acute lymphoblastic leukemia with chemotherapy only. J Pediatr. 2001;139:413–20. doi: 10.1067/mpd.2001.117066.
- [8] Barkovich AJ. Concepts of myelin and myelination in neuroradiology. AJNR Am J Neuroradiol. 2000;21:1099–109.
- [9] Sandler DP, Ross JA. Epidemiology of acute leukemia in children and adults. Semin Oncol. 1997;24:3–16.
- [10] Pui C-H, Relling MV, Sandlund JT, Downing JR, Campana D, Evans WE. Rationale and design of total therapy study XV for newly diagnosed childhood acute lymphoblastic leukemia. Ann Hematol. 2004;83:124–6. doi: 10.1007/s00277-004-0850-2.
- [11] Evans WE, Relling MV, Rodman JH, Crom WR, Boyett JM, Pui C-H. Conventional compared with individualized chemotherapy for childhood acute lymphoblastic leukemia. N Engl J Med. 1998;338:499–505. doi: 10.1056/NEJM199802193380803.
- [12] Ashburner J, Friston KJ. Nonlinear spatial normalization using basis functions. Hum Brain Mapp. 1999;7:254–66. doi: 10.1002/(SICI)1097-0193(1999)7:4<254::AID-HBM4>3.0.CO;2-G.
- [13] Studholme C, Hill DLG, Hawkes DJ. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recogn. 1999;32:71–86.
- [14] Ji Q, Glass JO, Reddick WE. A fast entropy minimization algorithm for bias field correction in MR images. Book of abstracts; Thirteenth Scientific Meeting of the International Society for Magnetic Resonance in Medicine; Miami (FL): International Society for Magnetic Resonance in Medicine. 2005. p. 2759.
- [15] Glass JO, Reddick WE, Reeves C, Pui C-H. Improving the segmentation of therapy-induced leukoencephalopathy in children with acute lymphoblastic leukemia using a priori information and a gradient magnitude threshold. Magn Reson Med. 2004;52:1336–41. doi: 10.1002/mrm.20259.
- [16] Pal NR, Bezdek JC, Tsao ECK. Generalized clustering networks and Kohonen’s self-organizing scheme. IEEE Trans Neural Netw. 1993;4:549–57. doi: 10.1109/72.238310.
- [17] Kohonen T. Self-organization and associative memory. Springer-Verlag; New York: 1989.
- [18] Wilke M, Schmithorst VJ, Holland SK. Normative pediatric brain data for spatial normalization and segmentation differs from standard adult data. Magn Reson Med. 2003;50:749–57. doi: 10.1002/mrm.10606.
- [19] Steen RG, Hunte M, Traipe E, Hurh P, Wu S, Bilaniuk L, et al. Brain T1 in young children with sickle cell disease: evidence of early abnormalities in brain development. Magn Reson Imaging. 2004;22:299–306. doi: 10.1016/j.mri.2004.01.022.
