Abstract
Cerebral microbleeds, observed as small, spherical hypointense regions on gradient echo (GRE) or susceptibility weighted (SWI) magnetic resonance imaging (MRI) sequences, reflect small hemorrhagic infarcts, and are associated with conditions such as vascular dementia, small vessel disease, cerebral amyloid angiopathy, and Alzheimer’s disease. The current gold standard for detecting and rating cerebral microbleeds in a research context is visual inspection by trained raters, a process that is both time consuming and subject to poor reliability. We present here a novel method to automate microbleed detection on GRE and SWI images. We demonstrate in a community-based cohort of older adults that the method is highly sensitive (greater than 92% of all microbleeds accurately detected) across both modalities, with reasonable precision (fewer than 20 and 10 false positives per scan on GRE and SWI, respectively). We also demonstrate that the algorithm can be used to identify microbleeds over longitudinal scans with a higher level of sensitivity than visual ratings (50% of longitudinal microbleeds correctly labeled by the algorithm, while manual ratings was 30% or lower). Further, the algorithm identifies the anatomical localization of microbleeds based on brain atlases, and greatly reduces time spent completing visual ratings (43% reduction in visual rating time). Our automatic microbleed detection instrument is ideal for implementation in large-scale studies that include cross-sectional and longitudinal scanning, as well as being capable of performing well across multiple commonly used MRI modalities.
Subject terms: Software, Cerebrovascular disorders, Diagnostic markers
Introduction
Microhemorrhages in the brain, known as cerebral microbleeds, are small, persistent deposits of products from blood breakdown, primarily hemosiderin, which have been contained in perivascular regions by macrophages1–9. Radiologically, microbleeds are identified on T2*-weighted magnetic resonance imaging (MRI) scans as roughly spherical signal voids, or hypointensities, due to the strong paramagnetic properties of the hemosiderin left after a bleed has occurred2–4,6–8,10,11. Cerebral microbleeds are associated with a number of outcomes, such as small vessel disease3,10,12, stroke8,9,13, traumatic brain injury14, radiation-induced bleeding15–17, cognitive decline2,11,18 and vascular dementia10,11. Lobar distributions of cerebral microbleeds are considered markers of cerebral amyloid angiopathy1, and are a prominent feature of Alzheimer’s disease (AD)4,19,20. In addition to signaling vascular forms of amyloid pathology, particularly in AD, microbleeds have emerged as a pernicious side effect of anti-amyloid treatments, so-called amyloid related imaging abnormalities related to hemosiderin deposits (ARIA-H)4, a necessary and important consideration in the enrollment of participants into AD therapeutic trials4,21,22. Although microbleeds can be present asymptomatically, early detection can be crucial in estimating risk for later cerebrovascular disease and cognitive decline2,3,11,18.
Microbleeds are detected radiologically with T2*-weighted MRI images, including either gradient echo (GRE) or susceptibility weighted images (SWI)1–5,8–10. These radiological findings have been validated with post-mortem analysis6,7, with true positives captured on imaging 48–89% of the time, depending on acquisition parameters5. Given the high level of sensitivity of MRI to paramagnetic material and the small size of the deposits, it is possible that MRI is more sensitive than gross pathological examination4,10,11,23. By omitting a refocusing pulse used in spin-echo sequences (such as T1) to correct of susceptibility distortion, GRE MRI is sensitive to paramagnetic artifacts, which can be exploited to visualize cerebral microbleeds5,24. SWI MRI is an alternative, more sensitive imaging modality for microbleed detection, with a larger “blooming” effect of paramagnetic material, making microbleeds more easily visible but also potentially more irregularly shaped23,24. Figure 1 shows an illustration of a microbleed using both SWI and GRE, highlighting the differences between the two modalities.
Figure 1.
Examples of microbleed location. The participant’s MRI shown here reflects a heavy burden of cerebrovascular disease. (A) Left: axial slice of a SWI image, with a microbleed location highlighted in white. Right: an enlarged image of the highlighted location. From top to bottom, the views shown are axial, sagittal, and coronal. (B) Left: axial slice of the same participant’s GRE image, with the same microbleed location highlighted in white. Right: an enlarged image of the highlighted location. From top to bottom, the views shown are axial, sagittal, and coronal.
Visual inspection of the T2* MRI scans for small, ovoid, hypointense regions indicative of microbleeds is the most frequently used method of rating microbleeds. Several methods have been developed to improve interrater reliability and reduce the subjectivity inherent in visual reads1,9,13. With increased research and clinical interest in microbleeds, particularly with respect to ARIA-H, there is a need for standardized automated or semi-automated pipelines to detect cerebral microbleeds. A few methods have been proposed14–17,25–31. These studies, however, are frequently done in small15,16,27 clinical populations (e.g., patients with radiation-induced microbleeds15–17 or traumatic brain injury14), with much higher rates of microbleeds than in community-based adults and have not, to our knowledge, demonstrated generalizability to community-based samples, across MRI sequences, or with respect to the reliability of longitudinal detection. We present here the algorithm for Microbleed Automated detection using Geometric Identification Criteria (MAGIC), designed to detect and label cerebral microbleeds in predominantly healthy older adults. We selected participants from a community-based study rather than a clinical sample to test and illustrate our method. We demonstrate the high sensitivity and low false positive rate for the across both GRE and SWI images. This level of validation was not shown in other studies, to the best of our knowledge. We also demonstrate that our algorithm outperforms visual ratings in longitudinal stability of identifying microbleeds on SWI images and demonstrate that final visual inspection of the segmentation output by MAGIC significantly reduces the time needed to rate scans.
Methods
Participants
Participants were selected from the Washington Heights-Inwood Columbia Aging Project (WHICAP), a community-based study of cognitive aging and dementia among Medicare-eligible residents of northern Manhattan New York. WHICAP participants were recruited in 3 waves, beginning in 1992, 1999, and 2009. MRI was first introduced into WHICAP in 200432 using a 1.5T MRI system and repeated on a subset of participants. Beginning in 2011, participants from the cohort recruited in 2009 received high-resolution MRI scanning using a 3T MRI system, and scans were once again repeated after 4.9 ± 1.3 (mean ± standard deviation) years on a subset of these participants. Randomly selected subsets of participants with available 3T MRI scans, including both SWI and GRE sequences, were included in this study (n = 78): one group (n = 44) was randomly selected from individuals rated visually as having at least one microbleed; the other group (n = 34) was randomly selected from individuals rated visually as not having any microbleeds. Fourteen of the microbleed positive participants had a follow-up MRI scan including SWI available at the time this study was performed, so these participants formed our longitudinal sample. GRE images were not collected at follow-up. The WHICAP study is approved by the Columbia University Medical Center Institutional Review Board, and all participants gave written informed consent.
MRI acquisition
Magnetic resonance images were obtained using a 3T Philips Intera scanner at Columbia University between 2011 and 2018. T1-weighted (repetition time = 6.6 ms, echo time = 3.0 ms, field of view = 256 × 200mm2, 1-mm slice thickness), T2*-weighted SWI (repetition time = 17 ms, echo time = 24 ms, field of view = 244 × 197mm2, 2 mm slice thickness, in plane resolution 0.43 × 0.43 mm), and T2*-weighted GRE (repetition time = 15 ms, echo time = 22 ms, field of view = 220 × 181mm2, 1 mm slice thickness, in plane resolution 0.43 × 0.43 mm) Magnetic resonance images were acquired for each participant at baseline, and T1-weighted and T2*-weighted SWI images using the same parameters were acquired for the subset of participants who completed follow-up scans.
Visual microbleed ratings
Consistent with previous studies done in the WHICAP cohort3, microbleeds were rated by visual inspection using criteria suggested by Greenberg and colleagues1. These criteria include the following guidelines: a dark (black) lesion on T2*-weighted MRI, accompanied by a “blooming” effect, which is round or ovoid and at least half-way surrounded by parenchyma (to distinguish microbleeds from vessels). The microbleed is devoid of signal hyperintensity on accompanying T1-weighted sequences and is distinguishable from other mimics (e.g., calcium deposits, bone, or vessel flow). Microbleeds were visually classified by location, including lobar (frontal, temporal, parietal, and occipital lobes) and deep (basal ganglia, thalamus, and infratentorial regions) locations. The number of microbleeds and location were noted for each participant. Three raters (IBM, EA, AGC), each trained in visually identifying microbleeds (IBM with over eight years of experience, AGC and EA with two years of experience), rated the entirety of the SWI and GRE scans, and microbleed locations, which were identified as true locations by either two or three raters, were used as the ground truth locations for testing the sensitivity of the algorithm, with unanimously identified locations representing definite microbleeds and locations identified by only two raters representing potential microbleeds. For longitudinal validation, two of the three raters rated the repeat SWI scans for microbleeds. In this study, participants were designated as microbleed positive if two or three raters agreed there was at least one microbleed in the brain, and microbleed negative if all raters agreed that no microbleeds were present. (Participants who had a microbleed identified by only one rater were excluded from this analysis. They would be considered microbleed negative by visual rating standards, but we wanted to maximize the difference between true and false positives, so we excluded these as too ambiguous.) Both percentage agreement (defined as the number of locations labeled by both raters divided by the total number of locations labeled) and Fleiss’ kappa33,34 were used to assess interrater and intra-rater reliability across modalities. These assessments were performed to ensure the visual ratings provided a reliable ground truth to judge the algorithm-segmented microbleeds against. While the kappa score is frequently used to assess interrater agreement, it applies a very conservative estimate of rater agreement by correcting for probability of agreement in pure guessing. The true agreement level typically lies somewhere between the uncorrected agreement and the kappa score, so we use both methods to assess agreement between visual ratings.
MRI preprocessing
The entire algorithm pipeline is described by the flowchart in Fig. 2, and the steps are illustrated in Fig. 3. Before the microbleed detection could begin, a few preprocessing steps were required. The SWI and GRE scans for each participant were brain extracted using FSL Brain Extraction Toolbox35 (fsl.fmrib.ox.ac.uk/fsl/). The T1-weighted image and a lobar mask from FSL’s MNI atlas were co-registered to the SWI and GRE images separately; identification of microbleeds was done in the native space of each SWI and GRE scan. The co-registered T1-weighted volume was used to compute the CSF mask using the Statistical Parametric Mapping toolbox36 (SPM 12; www.fil.ion.ucl.ac.uk/spm/software/spm12/). Finally, the SWI and GRE scans were resampled to a higher resolution (Fig. 3A) so that all artifacts that were potentially microbleeds had a diameter of at least six voxels to ensure an accurate identification when using a circular Hough transform (see below). In the current implementation, this step was accomplished by scaling the images by a factor of three.
Figure 2.

Algorithm flowchart. Summary of the algorithm as described in the text.
Figure 3.
Algorithm Illustration. This image illustrates some of the algorithm steps on an SWI image. (A) Initial SWI image, resampled. (B) 2D gradient computation (gradient magnitude pictured). (C) Edges of the slice as output after Canny edge detection. (D) Initial potential ROIs labeled by the circular Hough transform. (E) ROIs remaining after edge exclusion. (F) ROIs remaining after circular grouping and size exclusion. (G) Final ROI marked after CSF exclusion and multi-slice merging. (H) Co-registered lobar map used to quantify distribution of detected microbleeds.
Detection of potential microbleed regions of interest
Initially, potential microbleed locations were identified as circular regions of interest (ROIs) on each slice. For each slice of the GRE or SWI image, the 2D image gradient was computed with a 3 × 3 Sobel filter37 (Fig. 3B). Then, edge pixels were detected using the Canny edge detection algorithm38 to remove all neighboring voxels that are not local maxima. Hysteresis thresholding (lower bound = 0.1, upper bound = 0.15) was used to remove spurious edges as a result of noise. The final edges left after this method are illustrated in Fig. 3C.
Once the edge pixels were identified, a circular Hough transform39 was used to detect circular ROIs on each slice. We chose the circular Hough transform to identify these ROIs over other methods (notably over the radial symmetry transform, which has been suggested as a method of detecting microbleeds previously15,26,40) because it allows for a more lenient definition of circularity, and it is therefore more sensitive to ovoid shapes41. Potential circular ROIs on each slice were identified after restricting the radius to vary within a physiologically useful range1,9 (, 0.72–1.72 mm) and thresholding at a lenient threshold of 80% of the maximum overlap in the Hough transform (Fig. 3D).
The large number of potential locations was then thinned using physiologically relevant criteria, analogous to the criteria used in visual inspection. First, all ROIs lying on the edges of the image were discarded. Overlapping ROIs that remain were merged together, and any that were too large to be true microbleeds, using a lenient cutoff of distance between centers greater than eight pixels (1.15 mm), or singular ROIs (i.e., circles unmerged with others, indicating an edge arising from noise) were excluded (Fig. 3F). Finally, all ROIs that overlapped with the CSF mask (segmented from co-registered T1) were excluded as vessels, similar to the visual rating criteria. The remaining locations marked on each slice were then merged across slices. The final ROI representing a potential microbleed was defined as the 3D center of the potential microbleed and a surrounding neighborhood of a standardized size (51 × 51 × 25 voxels, or two times the maximum expected size of a microbleed). This definition was used because a neighborhood of this size both ensures the entire microbleed artifact will be captured for analysis in the next stage and also standardizes the selected ROIs, making geometric features more comparable. At this stage of the algorithm, we tested the sensitivity of detection of microbleeds compared with ground truth visual ratings to ensure that the automatic labelling was accurately capturing the visually labelled microbleeds.
3D geometric filtering
To assist in the removal of false positive locations, we used the geometric information contained in each ROI identified in the previous step. We selected a priori four characteristics of the ROI as having the potential to differentiate between true and false positive locations: the 3D image entropy of the ROI, the 2D image entropy of the maximum intensity projection of the ROI, and the volume and compactness of the central blob in each ROI as identified via Frangi filtering.
In an image, each pixel has a probability of being a given intensity, measured as the fraction of all pixels in the image at that intensity. Image entropy is defined42 based on this intensity probability distribution such that
| 1 |
In a typical 8-bit greyscale image, entropy will lie in the range from zero (all pixels are the same intensity) to eight (all 28 shades of grey have an equal chance of occurring). In a 3D image with a large signal void in the center surrounded by parenchyma, characteristic of a true microbleed, we expected a moderate amount of entropy, while in an area characterized by many sharp gradient changes, characteristic of a false positive, we expected a higher amount of entropy indicating a noisy, false positive region. In a 2D maximum intensity projection of a true microbleed, we expected a lower entropy than in the case of a false positive, as the sharp gradient change around a microbleed tends to leave a small hyperintense ring around the location. However, as this feature is much smaller than the signal void, we did not expect the 2D entropy to be as sharply distinctive as the 3D entropy. True and false positive ROI entropies are illustrated in Fig. 4.
Figure 4.
Geometric measure cutoff justifications. This figure illustrates the geometric properties used to remove false positives from the identified locations in SWI and GRE images. Modalities are separated into panels. (A) Distribution of 3D entropy in the ROI neighborhood on SWI. (B) Distribution of the 2D entropy of the maximum intensity projection of the ROI neighborhood on SWI. (C) Number of false positives removed at different cutoffs of the Frangi vesselness measure based on the central blob volume on SWI. (D) Number of false positives removed at different cutoffs of the Frangi vesselness measure based on the central blob compactness on SWI. (E) Distribution of 3D entropy in the ROI neighborhood on GRE. (F) Distribution of the 2D entropy of the maximum intensity projection of the ROI neighborhood on GRE. (G) Number of false positives removed at different cutoffs of the Frangi vesselness measure based on the central blob volume on GRE. (H) Number of false positives removed at different cutoffs of the Frangi vesselness measure based on the central blob compactness on GRE.
The Frangi filter43 utilizes the second order derivatives of an image to extract spatial information about the geometry of an ROI. In a 3D image , the Hessian matrix at each location is defined as the matrix
| 2 |
The eigenvalues of the Hessian are defined to be ordered such that . When the structure of interest is hypointense compared with the surroundings, which is true for both microbleeds and vessels on T2*-weighted images, . The vesselness of each voxel can therefore be described as
| 3 |
where , , and are ratios containing structural information from the eigenvalues, defined as
| 4 |
| 5 |
| 6 |
and are constants used to tune the sensitivity of the filter to the structural ratios. In our implementation, we used the suggested43 values of and as half of the Hessian norm. The vesselness of an ROI can assist in separating true positive from false positive locations, since a tubular artifact, such as a vessel, will have a high vesselness within the region ( and ), an ovoid artifact will have a lower degree of vesselness ( and ), and an ROI containing high-gradient noise will have a vesselness approaching zero ().
In a standard Frangi filter, when filtering for large tubular structures such as vessels, it is necessary to compute vesselness across a range of scales determined by different Gaussian filters. Since the artifacts we measured are relatively small and did not vary greatly in size, the benefits of maximizing over a range of scales was not worth the computational cost, so we used only the vesselness as measured in the resampled space without any Gaussian blur.
Once we computed the vesselness of all the voxels within each ROI, we used blob analysis to extract the central blob of each ROI defined as all non-zero voxels grouped via 26-connected neighborhood to the non-zero voxel closest to the center of the ROI. First, the vesselness within an ROI was used to create a binary mask, by thresholding as a fraction of the maximum vesselness within the ROI. We tested in a range of 0.1 (not zero to exclude noise) to 0.6 (higher levels are useful to distinguish vessels, not ovoid locations such as microbleeds). The volume, defined as the number of voxels with the blob, and compactness, defined as square of the number of perimeter voxels of the blob divided by the volume of the blob, were noted for this central blob. The number of false positives eliminated at this step using the volume and compactness (minimum and maximum cutoffs) are illustrated in Fig. 4. After this step, we also computed the precision of the algorithm (percentage of true positives to total number of labeled locations).
Final counting and location step
After the vast majority of false positives were removed by the previous step, the remaining ROIs were saved in the native space of the modality of interest (SWI or GRE) in an easily viewable and editable format for correction by a trained rater. The difference in rating times between visual ratings and rating the automatically segmented images was evaluated in a separate group of 20 SWI scans. The co-registered lobar mask was used to count automatically the number of microbleeds identified and output the distribution of locations throughout the brain for further analysis. For the longitudinal scans, an additional visual rating was done using the algorithm’s output locations to confirm that the locations detected at multiple timepoints were indeed microbleeds (i.e. visual ratings were done blinded to the algorithm, and then redone using the algorithm’s output across both timepoints). We emphasize this step so as to be clear that visual ratings remain our gold standard method of rating microbleeds, even when they are informed by an algorithm.
Results
Demographic information
The demographic characteristics of the study cohort are presented in Table 1. Participants were designated as either microbleed positive (one or more microbleeds identified by two or more raters) or microbleed negative (no microbleeds identified by any rater). Microbleed positive participants were slightly older than microbleed negative participants (t(76) = 2.21, p = 0.03), but did not differ in terms of sex/gender (χ2(1, N = 78) = 1.37, p = 0.24) and race/ethnicity (χ2(3, N = 78) = 7.29, p = 0.06). The microbleed positive participants who had a follow-up MRI scan about five years later had similar distribution of sex/gender (χ2(1, N = 58) = 0.58, p = 0.45) and race/ethnicity (χ2(3, N = 58) = 2.09, p = 0.55) as the baseline sample of microbleed positive participants.
Table 1.
Demographic characteristics of individuals with and without microbleeds.
| Microbleed status | Positive | Negative | Total | Statistic |
|---|---|---|---|---|
| Baseline | ||||
| N | 44 | 34 | 78 | – |
| Age, years: mean (SD) | 76.3 (6.0) | 73.3 (7.0) | 74.5 (6.6) | t = 2.21, p = 0.03 |
| Sex/gender, women: N (%) | 20 (45) | 21 (60) | 41 (52) | χ2 = 1.37, p = 0.24 |
| Race/ethnicity: N (%) | χ2 = 7.29, p = 0.06 | |||
| White | 23 (52) | 11 (32) | 34 (44) | |
| Black | 15 (34) | 11 (32) | 26 (33) | |
| Hispanic | 4 (9) | 11 (32) | 15 (19) | |
| Other | 2 (5) | 1 (3) | 3 (4) | |
| Follow up | ||||
| N | 14 | 0 | 14 | |
| Age, years: mean (SD) | 79.3 (6.1) | – | 79.3 (6.1) | |
| Time to follow-up, years: mean (SD) | 4.86 (1.3) | – | 4.86 (1.3) | |
| Sex/gender, women: N (%) | 8 (57) | – | 8 (57) | |
| Race/ethnicity: N (%) | – | |||
| White | 5 (36) | – | 5 (36) | |
| Black | 5 (36) | – | 5 (36) | |
| Hispanic | 3 (21) | – | 3 (21) | |
| Other | 1 (7) | – | 1 (7) | |
Microbleed positive participants are those who have at least one microbleed present (identified by two or more raters), and microbleed negative participants are those who have no microbleeds present (agreed by all three raters).
Interrater reliability
A potential microbleed was labeled as a “definite” microbleed if all three raters agreed that the artifact was a microbleed, and as a “probable” microbleed if two raters agreed that it was a microbleed (full rating agreement can be found in Supplementary Table S1). There was an acceptable level of agreement between raters, with agreement ranging from 0.67 to 0.97 depending on rater and imaging modality. Interrater reliability did not differ systematically between SWI (0.67–0.95) and GRE (0.67–0.97). Merged ratings, reflecting the combination of SWI and GRE ratings via OR operation (i.e., if a rater labeled the location as a true positive on either SWI or GRE they counted it as a true microbleed) were also computed, and showed similar agreement range (0.70–0.95). Interrater reliability, measured across both microbleed positive and negative participants, was similar across modalities (SWI: κ = 0.714, 95% CI: [0.710, 0.717]; GRE: κ = 0.708, 95% CI: [0.705, 0.712]; merged: κ = 0.733, 95% CI: [0.729, 0.737]) and comparable to prior studies that used visual ratings1. Intra-rater reliability between modalities was similar (Rater 1: κ = 1.00, 95% CI: [0.994, 1.006]; Rater 2: κ = 0.751, 95% CI: [0.745, 0.758]; Rater 3: κ = 0.726, 95% CI: [0.720, 0.733]).
Visual ratings identified 54 locations across the 44 microbleed positive participants using SWI scans (39 definite locations, 15 probable locations). In the same 44 participants, visual ratings identified 61 locations on GRE (43 definite locations, 18 probable locations). Combining these ratings, there were a total of 64 unique locations identified (45 definite locations, 19 probable locations). These visual results were used as the “ground truth” measure of sensitivity for the algorithm.
Algorithm results—sensitivity
Of the 54 locations found on SWI, the algorithm identified 50 (38 of the definite true positives, 12 of probable true positives, 93% overall sensitivity). Of the 61 locations found on GRE, the algorithm identified 56 (41 of the definite true positives, 15 of the probable true positives, 92% overall sensitivity). Combining the ratings, the algorithm identified 61 true locations (44 of the definite true positives, 17 of the probable true positives, 95% overall sensitivity). Treated as an independent rater, the algorithm achieved a high level of agreement with other raters in marking true microbleed locations (0.75–0.89), higher than the average agreement amongst visual ratings. The full results of the algorithm’s sensitivity are shown in Table 2.
Table 2.
Algorithm sensitivity results.
| Definite | Probable | Combined | |
|---|---|---|---|
| SWI | |||
| Rater identified | 39 | 15 | 54 |
| Algorithm identified | 38 | 12 | 50 |
| Sensitivity | 0.97 | 0.8 | 0.93 |
| GRE | |||
| Rater identified | 43 | 18 | 61 |
| Algorithm identified | 41 | 15 | 56 |
| Sensitivity | 0.95 | 0.83 | 0.92 |
| Merged | |||
| Rater identified | 45 | 19 | 64 |
| Algorithm identified | 44 | 17 | 61 |
| Sensitivity | 0.98 | 0.89 | 0.95 |
An artifact was labeled as a “definite” microbleed if all three raters agreed that the artifact was a microbleed, and as a “probable” microbleed if two raters agreed that it was a microbleed. This table shows the sensitivity results of the algorithm across these different labels. Note that in a study using only visual ratings, the final column (combining the definite and probable ratings) would be the number typically reported.
Algorithm results—precision
After removing false positives using the cutoff criteria derived from the geometric measures, the algorithm identified an average of 9.7 false positives per scan (precision: 11%) on SWI images and an average of 17.1 false positives per scan (precision: 7%) on GRE images. The performance on microbleed negative participants was modestly better, with an average of 7.32 and 15.4 false positives per scan on SWI and GRE, respectively. The full results of the algorithm’s precision are shown in Table 3.
Table 3.
Algorithm precision results.
| True positives | False positives | Precision | Average FP/scan | |
|---|---|---|---|---|
| Microbleed positive | ||||
| SWI | 50 | 426 | 0.11 | 9.7 |
| GRE | 56 | 752 | 0.07 | 17.1 |
| Microbleed negative | ||||
| SWI | – | 249 | – | 7.32 |
| GRE | – | 544 | – | 15.4 |
| Follow up | ||||
| SWI | 10 | 160 | 0.06 | 3.64 |
As noted in the methods, microbleed positive participants are those who have at least one microbleed present (identified by two or more raters), and microbleed negative participants are those who have no microbleeds present (agreed by all three raters). False positive (FP) are presented in the final column averaged over the number of images (FP/scan).
The measures of 3D entropy in true positive locations did not differ between SWI (average entropy: 5.85 ± 0.41; entropy range 5.06–6.88) and GRE (average entropy: 5.79 ± 0.38; entropy range: 5.07–6.88). As we hypothesized, the 3D entropy of false positives was (Supplementary Tables S2A and S3A) higher than the distribution of true positive locations in both SWI (average entropy: 6.74 ± 0.58, p < 0.001) and GRE (average entropy: 6.63 ± 0.57, p < 0.001) images. In parallel with these results, the 2D entropy of the maximum intensity projections was lower (Supplementary Tables S2A and S3A) in true positive locations (SWI average entropy: 5.01 ± 0.33; GRE average entropy: 5.01 ± 0.31) than in false positive locations (SWI average entropy: 5.87 ± 0.66, p < 0.001; GRE average entropy: 5.67 ± 0.62, p < 0.001). In SWI images, the 2D entropy provided a more sensitive discriminant between true and false positives (2D eliminates 35.9 false positives per scan, 3D eliminates 24.5 false positives per scan), while in GRE the 2D and 3D entropy provided roughly the same level of discrimination (19.8 and 19.9 false positives per scan eliminated by 2D and 3D, respectively). Nearly all of the eliminated false positives had an entropy higher than the range of true positives, with the few that fell below the range lying in locations near a larger signal void (e.g., infarct) that would be visually rated as too large to be a microbleed. The relative distributions of 3D and 2D entropy are shown illustrated in Fig. 4 (Fig. 4A,B show 3D and 2D entropy, respectively, in SWI, with Fig. 4E,F illustrating the same in GRE).
As we expected, lower values of the Frangi filter cutoff allowed for generally better discrimination between true and false positives likely due to the relatively low vesselness of the structures measured. At the cutoffs selected to maximize difference between false positives larger than true microbleeds (SWI: 0.15; GRE: 0.25; see Tables S2B and S3B for cutoffs used for ROIs smaller than true microbleeds as well), true positive volume was lower on average than the volume of false positives on both SWI (true positive (voxels): 1245 ± 475; false positive: 3003 ± 1794; p < 0.001) and GRE (true positive: 54 ± 414; false positive: 2060 ± 1287; p < 0.001). Volume cutoffs were useful in eliminating several false positives, on both SWI (minimum: 11.8 false positives per scan eliminated; maximum: 26.5 false positives per scan eliminated) and GRE (minimum: 1.4 false positives per scan; maximum: 24.3 false positives per scan eliminated). Supplementary Tables S2B and S3B provide the full results for different volume cutoffs in SWI and GRE, respectively. In parallel with these results, compactness was lower in true positives (SWI: 540 ± 414; GRE: 456 ± 309) than in false positives (SWI: 2060 ± 1287, p < 0.001; GRE: 884 ± 697, p < 0.001; see Supplementary Tables S2C and S3C for full results). Because the extreme volume difference between true positive and false positive results drove this relationship, compactness did not provide greater discrimination ability beyond volume, contrary to our initial hypothesis. These results are illustrated for Fig. 4 (Fig. 4C,D illustrate ROI size and compactness, respectively, in SWI, with 4G and 4H illustrating the same measures in GRE).
Figure 5 presents a visual summary of how volume and compactness change across different values of Frangi filter cutoffs. The top row (Fig. 5A,C, illustrating SWI and GRE, respectively) demonstrates that while true positive volume was lower than most false positive volumes (the shaded grey region), these differences were heightened in lower cutoffs for the Frangi filter, indicating that these values provide a greater level of discrimination between true and false positive locations. The bottom row (Fig. 5B,D, again SWI and GRE, respectively) show the same pattern in compactness across Frangi filter cutoff. The cutoff values were chosen to maximize the difference between true and false positives (the shaded grey area), providing the greatest level of precision.
Figure 5.
Illustration of Frangi-filter threshold effect on blob size and compactness. The dashed lines represent the bounds of the volumes of true positives, while the solid lines represent the bounds of the volumes of false positives. All false positives that fall within the shaded grey areas are removed. The purpose of trying a range of thresholds for vesselness was to determine the points of maximum difference (i.e., where the shaded area is largest, and therefore removes the most false positives). (A) Range of volume of central blobs in Frangi-filtered ROIs on SWI images across different thresholds. (B) Range of central blob compactness in Frangi-filtered ROIs on SWI images across different thresholds. (C) Range of volume of central blobs in Frangi-filtered ROIs on GRE images across different thresholds. (D) Range of compactness of central blobs in Frangi-filtered ROIs on GRE images across different thresholds.
It is interesting to note that for the maximum cutoffs, which are responsible for more false positive eliminations than the minimum cutoffs, a cutoff of 0.25 could be used on both SWI and GRE to simplify implementation. Although we did not use this cutoff, as the cutoff would slightly lower the precision because it allows for more false positives to remain, it could provide a simpler form of analysis.
Final counting and location step
A random sample of 20 SWI scans from WHICAP (different from the ones used to develop the algorithm) was used to test the speed of visual ratings versus the editing of locations identified by the algorithm. The time to rate a group of 10 scans visually (no algorithm masks) was 6.12 ± 1.58 min (mean ± standard deviation). The time to rate a group of 10 scans with the algorithm mask was 3.48 ± 1.81 min, significantly (t(17.6) = 3.50, p = 0.003) reducing the time to rate scans by 43%.
Nearly all (92%) of the microbleeds in this sample occurred in lobar locations (frontal: 42%, temporal: 18%, parietal: 26%, and occipital 6%). Microbleeds lying within deeper brain structures accounted for the remainder (basal ganglia: 4%, cerebellum: 4%).
Longitudinal results
In the 14 participants who had longitudinal scans available, there were 20 potential true microbleed locations identified at baseline. In the longitudinal scans, visual ratings identified a subset of these locations remaining (rater #2 identified 6 locations; rater #3 identified 5 locations), while the algorithm identified 10 of the original locations on the follow-up scans, greatly outperforming the visual ratings in terms of longitudinal reliability. Applying the cutoffs defined at baseline to the longitudinal dataset did not remove any of the true positives and resulted in a similar level of precision to the baseline SWI (3.6 false positives per scan; 5.9% precision), indicating that the cutoffs derived using only the baseline data were applicable across multiple timepoints.
Discussion
We have presented here MAGIC, a novel automated method to detect cerebral microbleeds on images acquired at 3T field strength. This new method has high sensitivity and reasonable precision on both T2*-weighted GRE and SWI images, both of which are commonly used in research and clinical applications. We validated the method in a community-based cohort, using both microbleed positive and negative participants. The automation we present, even with its need for minor manual corrections of false positives (a visual inspection step required to ensure the accuracy of the final results), reduces the time needed to visually rate the scan by drawing the rater’s attention to areas most likely to be microbleeds, while retaining a high sensitivity to the microbleeds themselves. We also demonstrated, in a small subset of participants, that the automated algorithm exhibits higher sensitivity in longitudinal identification of potential microbleed locations than visual ratings. Longitudinal reliability in identifying persistent artifacts is critical to future assessments of microbleeds, as the longitudinal stability of microbleed artifacts remains unknown, and accurate identification on multiple scans would help to elucidate the degree to which they change in appearance over time23,44.
As we expected, SWI provides a greater level of precision in identifying microbleeds compared to GRE. This observation is due to the underlying nature of the sequences, as GRE is able to show paramagnetic artifacts, while SWI provides a more sensitive measure by incorporating phase information into the image itself1,2,5,10. Although these differences exist, the strength of the algorithm we present is that it works with comparable sensitivity and precision on both SWI and GRE scans, allowing for use across many different clinical and research applications.
Aside from the demonstrated ability of MAGIC to work across multiple modalities as well as longitudinal identification of microbleeds, features that have not been shown in any other algorithm to our knowledge, we believe that the interpretability of all our steps to provide an attractive additional feature. In comparison to other proposed solutions that sacrifice interpretability (e.g., machine-learning based approaches, such as convolutional neural networks16,25,31, which do not offer an interpretable set of features), the simple geometric measures we propose correspond well with the criteria used for visual rating1, and the cutoff values can be easily modified to accommodate different acquisition parameters used by different groups. Additionally, we do not need to compute a high dimensional set of geometric features and select by weight, as proposed in certain random forest implementations14,29,30. By maintaining interpretable geometric methods throughout the algorithm, we believe it is easier to adapt the pipeline to study or scanner specific differences on the basis of a few experimental scans, rather than requiring retraining for each site or training on a very large initial sample set.
As with any algorithm, there are always limitations and room for optimization. When compared to other automated or semi-automated pipelines already in existence, the method we propose here achieved a higher sensitivity15,17,26,31 and precision14,27 compared to the majority of existing methods. There are some distinct advantages to other methods, however, which must be acknowledged. Deep convolutional networks, while requiring more initial processing and manual labeling, can leverage large datasets to achieve a significantly higher precision than purely geometric analyses25,30,31, as they are able to better discriminate between edge artifacts and true positives (geometric criteria are susceptible to confusion in sharp gradients that higher dimensional approaches, such as convolutional networks or statistical shape maps, are able to resolve). Another approach that has promise29, particularly when it is not feasible to acquire higher resolution scans, combines information from SWI images and proton-density weighted images to verify the accuracy of the microbleed locations and remove fluid mimics (e.g. vessels). To our knowledge, no other automated method has addressed longitudinal reliability, so the data we provide here are the first of their kind. Additionally, we did not find any explicit comparison of method reliability across SWI and GRE scans.
There are certain approaches in currently existing methods, such as combining multiple modalities29, or extracting information from the area surrounding an ROI as well as the ROI itself25,30,31, which could be leveraged to increase the precision of our approach. One way of improving the algorithm without additional image collection is to explicitly include phase information from the SWI images, as SWI is more sensitive to paramagnetic deposits because it incorporates the distortions within the phase image into the scan23. We would look to explicitly process the phase image as well as the final SWI image to look for these specific local distortions as a way of reducing the number of false positives due to gradient changes unrelated to susceptibility effects (e.g., cerebellar folds). Additionally, the inclusion of phase image information would allow for a more specific determination of iron deposition, especially when compared to calcium deposits, as they shift the phase in opposite directions5. We are also considering refinements in our definition of entropy, such as mutual information entropy (i.e., weighting local clusters of voxels by how similar their entropy is, rather than computing entropy solely on an entire image) as a way to further tune the sensitivity of this measure to the presence of true positives. However, given the promising results we already show with this version of the algorithm, we believe these changes will provide incremental benefits. As high-quality imaging becomes more common, especially in the evolving area of microbleed analysis, we believe that algorithmic identification of microbleeds will become a greater necessity, and MAGIC presents a promising step towards achieving a broadly applicable, automated approach to microbleed identification.
Considering the observed association between microbleeds and diseases such as cerebrovascular disease2,3,10–12, cerebral amyloid angiopathy and AD1,4,19,20, as well as the critical role microbleeds may play in treatment trials4,21,22, we believe that a standardized way to identify microbleeds, cross-sectionally and longitudinally, that is generalizable across cohorts will become imperative in assessing microbleed burden. As the elderly population continues to grow, creating consistent, broadly applicable ways of quantifying microhemorrhage location and burden will become increasingly important in both ensuring a high standard of clinical care and providing reliable data to uncover biological causes of microbleeds and how they relate to these diseases.
Supplementary Information
Acknowledgements
Data collection and sharing for this project was supported by the Washington Heights-Inwood Columbia Aging Project (WHICAP, P01AG07232, R01AG037212, RF1AG054023, R56AG034189, R01AG034189, R01AG054520) funded by the National Institute on Aging (NIA). This manuscript has been reviewed by WHICAP investigators for scientific content and consistency of data interpretation with previous WHICAP Study publications. We acknowledge the WHICAP study participants and the WHICAP research and support staff for their contributions to this study.
Author contributions
A.G.C. and A.M.B. conceived and implemented the algorithm. A.G.C., E.A., and I.B.M. visually rated microbleeds. A.G.C., P.J.L., and A.M.B. analyzed the final outputs and drafted the manuscript. A.M.B. and R.M. secured funding and provided overall study oversight. All authors reviewed the manuscript and approved the submission.
Data availability
Qualified investigators may request data in writing (see http://www.cumc.columbia.edu/adrc/investigators for more details).
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-021-83607-0.
References
- 1.Greenberg SM, et al. Cerebral microbleeds: A field guide to their detection and interpretation. Lancet Neurol. 2009;8:165–174. doi: 10.1016/S1474-4422(09)70013-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Martinez-Ramirez S, Greenberg SM, Viswanathan A. Cerebral microbleeds: Overview and implications in cognitive impairment. Alzheimers Res. Ther. 2014;6:33. doi: 10.1186/alzrt263. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Wiegman AF, et al. Cerebral microbleeds in a multiethnic elderly community: Demographic and clinical correlates. J. Neurol. Sci. 2014;345:125–130. doi: 10.1016/j.jns.2014.07.024. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Sperling RA, et al. Amyloid related imaging abnormalities (ARIA) in amyloid modifying therapeutic trials: Recommendations from the Alzheimer’s Association Research Roundtable Workgroup. Alzheimers Dement. 2011;7:367–385. doi: 10.1016/j.jalz.2011.05.2351. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Haller S, et al. Cerebral microbleeds: Imaging and clinical significance. Radiology. 2018;287:11–28. doi: 10.1148/radiol.2018170803. [DOI] [PubMed] [Google Scholar]
- 6.Tatsumi S, Shinohara M, Yamamoto T. Direct comparison of histology of microbleeds with postmortem MR images: A case report. Cerebrovasc. Dis. 2008;26:142–146. doi: 10.1159/000139661. [DOI] [PubMed] [Google Scholar]
- 7.Fazekas F, et al. Histopathologic analysis of foci of signal loss on gradient-echo T2*-weighted MR images in patients with spontaneous intracerebral hemorrhage: Evidence of microangiopathy-related microbleeds. AJNR Am. J. Neuroradiol. 1999;20:637–642. [PMC free article] [PubMed] [Google Scholar]
- 8.Cordonnier, C., Al-Shahi Salman, R. & Wardlaw, J. Spontaneous brain microbleeds: Systematic review, subgroup analyses and standards for study design and reporting. Brain130, 1988–2003 (2007). [DOI] [PubMed]
- 9.Cordonnier C, et al. Improving interrater agreement about brain microbleeds: Development of the Brain Observer MicroBleed Scale (BOMBS) Stroke. 2009;40:94–99. doi: 10.1161/STROKEAHA.108.526996. [DOI] [PubMed] [Google Scholar]
- 10.Charidimou A, Jäger HR, Werring DJ. Cerebral microbleed detection and mapping: Principles, methodological aspects and rationale in vascular dementia. Exp. Gerontol. 2012;47:843–852. doi: 10.1016/j.exger.2012.06.008. [DOI] [PubMed] [Google Scholar]
- 11.Ayaz M, Boikov AS, Haacke EM, Kirsch WM, Kido DK. Imaging cerebral microbleeds using susceptibility weighted imaging: One step toward detecting vascular dementia. J. Magn. Reson. Imaging. 2010;31:142–148. doi: 10.1002/jmri.22001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Akoudad S, et al. Cerebral microbleeds are related to loss of white matter structural integrity. Neurology. 2013;81:1930–1937. doi: 10.1212/01.wnl.0000436609.20587.65. [DOI] [PubMed] [Google Scholar]
- 13.Gregoire SM, et al. The Microbleed Anatomical Rating Scale (MARS): Reliability of a tool to map brain microbleeds. Neurology. 2009;73:1759–1766. doi: 10.1212/WNL.0b013e3181c34a7d. [DOI] [PubMed] [Google Scholar]
- 14.van den Heuvel, T. L. A. et al. Automated detection of cerebral microbleeds in patients with traumatic brain injury. NeuroImage Clin.12, 241–251 (2016). [DOI] [PMC free article] [PubMed]
- 15.Bian, W., Hess, C. P., Chang, S. M., Nelson, S. J. & Lupo, J. M. Computer-aided detection of radiation-induced cerebral microbleeds on susceptibility-weighted MR images. NeuroImage Clin.2, 282–290 (2013). [DOI] [PMC free article] [PubMed]
- 16.Chen Y, Villanueva-Meyer JE, Morrison MA, Lupo JM. Toward Automatic detection of radiation-induced cerebral microbleeds using a 3D deep residual network. J. Digit. Imaging. 2019;32:766–772. doi: 10.1007/s10278-018-0146-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Morrison MA, et al. A user-guided tool for semi-automated cerebral microbleed detection and volume segmentation: Evaluating vascular injury and data labelling for machine learning. Neuroimage Clin. 2018;20:498–505. doi: 10.1016/j.nicl.2018.08.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Poels MMF, et al. Cerebral microbleeds are associated with worse cognitive function: the Rotterdam Scan Study. Neurology. 2012;78:326–333. doi: 10.1212/WNL.0b013e3182452928. [DOI] [PubMed] [Google Scholar]
- 19.Nakata-Kudo Y, et al. Microbleeds in Alzheimer disease are more related to cerebral amyloid angiopathy than cerebrovascular disease. Dement. Geriatr. Cogn. Disord. 2006;22:8–14. doi: 10.1159/000092958. [DOI] [PubMed] [Google Scholar]
- 20.Lee S-H, Kim S-M, Kim N, Yoon B-W, Roh J-K. Cortico-subcortical distribution of microbleeds is different between hypertension and cerebral amyloid angiopathy. J. Neurol. Sci. 2007;258:111–114. doi: 10.1016/j.jns.2007.03.008. [DOI] [PubMed] [Google Scholar]
- 21.Ivanoiu, A. et al. Long-term safety and tolerability of bapineuzumab in patients with Alzheimer’s disease in two phase 3 extension studies. Alzheimers Res. Ther.8 (2016). [DOI] [PMC free article] [PubMed]
- 22.Sevigny J, et al. The antibody aducanumab reduces Aβ plaques in Alzheimer’s disease. Nature. 2016;537:50–56. doi: 10.1038/nature19323. [DOI] [PubMed] [Google Scholar]
- 23.Mittal S, Wu Z, Neelavalli J, Haacke EM. Susceptibility-weighted imaging: Technical aspects and clinical applications, Part 2. Am. J. Neuroradiol. 2009;30:232–252. doi: 10.3174/ajnr.A1461. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Poels MMF, Ikram MA, Vernooij MW. Improved MR imaging detection of cerebral microbleeds more accurately identifies persons with vasculopathy. Am. J. Neuroradiol. 2012;33:1553–1556. doi: 10.3174/ajnr.A3108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Dou Q, et al. Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks. IEEE Trans. Med. Imaging. 2016;35:1182–1195. doi: 10.1109/TMI.2016.2528129. [DOI] [PubMed] [Google Scholar]
- 26.Kuijf, H. J. et al. Efficient detection of cerebral microbleeds on 7.0T MR images using the radial symmetry transform. NeuroImage59, 2266–2273 (2012). [DOI] [PubMed]
- 27.Barnes SRS, et al. Semiautomated detection of cerebral microbleeds in magnetic resonance images. Magn. Reson. Imaging. 2011;29:844–852. doi: 10.1016/j.mri.2011.02.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Seghier ML, et al. Microbleed detection using automated segmentation (MIDAS): A new method applicable to standard clinical MR images. PLoS ONE. 2011;6:e17547. doi: 10.1371/journal.pone.0017547. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Ghafaryasl, B. et al. A computer aided detection system for cerebral microbleeds in brain MRI. in 2012 9th IEEE International Symposium on Biomedical Imaging (ISBI) 138–141 (2012). 10.1109/ISBI.2012.6235503.
- 30.Fazlollahi, A. et al. Efficient machine learning framework for computer-aided detection of cerebral microbleeds using the Radon transform. in 2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI) 113–116 (2014). 10.1109/ISBI.2014.6867822.
- 31.Chen, H. et al. Automatic detection of cerebral microbleeds via deep learning based 3D feature representation. in 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI) 764–767 (2015). 10.1109/ISBI.2015.7163984.
- 32.Brickman AM, et al. Regional white matter hyperintensity volume, not hippocampal atrophy, predicts incident Alzheimer disease in the community. Arch. Neurol. 2012;69:1621–1627. doi: 10.1001/archneurol.2012.1527. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Fleiss JL. Measuring nominal scale agreement among many raters. Psychol. Bull. 1971;76:378–382. doi: 10.1037/h0031619. [DOI] [Google Scholar]
- 34.McHugh ML. Interrater reliability: The kappa statistic. Biochem Med (Zagreb) 2012;22:276–282. doi: 10.11613/BM.2012.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Jenkinson M, Beckmann CF, Behrens TEJ, Woolrich MW, Smith SM. FSL. NeuroImage. 2012;62:782–790. doi: 10.1016/j.neuroimage.2011.09.015. [DOI] [PubMed] [Google Scholar]
- 36.Penny, W., Friston, K., Ashburner, J., Kiebel, S. & Nichols, T. Statistical Parametric Mapping: The Analysis of Functional Brain Images. (Academic Press, 2006).
- 37.Duda, R. O. & Hart, P. E. Pattern Classification and Scene Analysis. (Wiley and Sons, 1973).
- 38.Canny J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986;8:679–698. doi: 10.1109/TPAMI.1986.4767851. [DOI] [PubMed] [Google Scholar]
- 39.Duda RO, Hart PE. Use of the Hough transformation to detect lines and curves in pictures. Commun. ACM. 1972;15:11–15. doi: 10.1145/361237.361242. [DOI] [Google Scholar]
- 40.Kuijf, H. J. et al. Semi-automated detection of cerebral microbleeds on 3.0 T MR images. PLOS ONE8, e66610 (2013). [DOI] [PMC free article] [PubMed]
- 41.Loy G, Zelinsky A. Fast radial symmetry for detecting points of interest. IEEE Trans. Pattern Anal. Mach. Intell. 2003;25:959–973. doi: 10.1109/TPAMI.2003.1217601. [DOI] [Google Scholar]
- 42.González, R. C. & Woods, R. E. Digital Image Processing, 3rd Edn. (Pearson Prentice Hall, 2009).
- 43.Frangi, A. F., Niessen, W. J., Vincken, K. L. & Viergever, M. A. Multiscale vessel enhancement filtering. in Medical Image Computing and Computer-Assisted Intervention—MICCAI’98 (eds. Wells, W. M., Colchester, A. & Delp, S.) 130–137 (Springer, 1998). 10.1007/BFb0056195.
- 44.Liu W, et al. Imaging cerebral microhemorrhages in military service members with chronic traumatic brain injury. Radiology. 2016;278(2):536–545. doi: 10.1148/radiol.2015150160. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
Qualified investigators may request data in writing (see http://www.cumc.columbia.edu/adrc/investigators for more details).




