Abstract
We propose a generalized framework for developing computer-aided detection (CADe) systems whose characteristics depend only on those of the training dataset. The purpose of this study is to show the feasibility of the framework. Two different CADe systems were experimentally developed by a prototype of the framework, but with different training datasets. The CADe systems include four components: preprocessing, candidate area extraction, candidate detection, and candidate classification. Four pretrained algorithms with dedicated optimization/setting methods corresponding to the respective components were prepared in advance. The pretrained algorithms were sequentially trained in the order of processing of the components. In this study, two different datasets, brain MRA with cerebral aneurysms and chest CT with lung nodules, were collected to develop two different types of CADe systems in the framework. The performances of the developed CADe systems were evaluated by threefold cross-validation. The CADe systems for detecting cerebral aneurysms in brain MRAs and for detecting lung nodules in chest CTs were successfully developed using the respective datasets. The framework was shown to be feasible by the successful development of the two different types of CADe systems. The feasibility of this framework shows promise for a new paradigm in the development of CADe systems: development of CADe systems without any lesion-specific algorithm design.
Keywords: Computer-aided detection (CADe) system, Generalized CADe framework, Automatic optimization, Machine learning method, CADe training dataset
Background
Computer-aided detection (CADe) is a pattern recognition technology that identifies features of concern on a medical image and brings them to the attention of the radiologist [1]. Currently, the radiologist first reviews the images, then activates the CADe system and re-evaluates the CADe-marked areas of concern before issuing the final report. CADe has been widely researched, and various types of CADe system have been developed [2, 3]. Effective CADe systems speed up the medical diagnostic process, reduce diagnostic errors, and improve quantitative evaluations [4].
CADe systems are difficult to develop because of their multiple requirements of specialized expertise in image processing, machine learning, medical physics, and radiology [4]. Moreover, most of the algorithms used in the components of CADe systems must be manually designed by developers with these areas of expertise. Recently, some components of CADe systems have been automatically optimized using machine learning methods [5, 6] and feature selection methods [7, 8]. Machine learning methods are computational methodologies by which computer algorithms can automatically learn complex relationships or patterns and devise accurate decision rules or models from training datasets [9–12]; machine learning is data-driven. For example, machine learning methods such as artificial neural networks [5] and support vector machines [6] are used to optimize the classifiers used for candidate classification in CADe systems. Feature selection methods are data-driven optimization methods and are used to automatically select optimal features [13]. For example, various methods such as stepwise selection [7] and feature selection based on the genetic algorithm [8] are used to select the features for candidate classification, calculated in the feature extraction component, from a feature bank including a large number of features prepared in advance by CADe system developers. However, these data-driven methods are not yet extensively used in the development of CADe systems. Since most of the specific algorithms in CADe system development must still be designed manually, developers require specialized expertise in multiple research fields, making it very difficult for outsiders to enter the field.
Generally, a CADe system can be divided into five components: preprocessing, segmentation, candidate detection, feature extraction, and candidate classification (Fig. 1). It works by sequentially executing the algorithms for each component. Although the components may be similar among CADe systems, the algorithms of the components are usually completely different, depending on factors such as the detection target, the images to be processed, and the design policy of the algorithms. Commonly, a training dataset consisting of medical images and the areas of the target lesions is collected in advance. The algorithms of all the components are manually designed based on the training dataset. Some parameters used in the algorithms are optimized by trial and error with the training dataset, and the remaining parameters are optimized automatically by machine learning methods with the training dataset.
Fig. 1.
Flowchart of conventional CADe system
In this paper, we propose a generalized CADe framework to develop various CADe systems whose characteristics depend only on those of the training dataset. Pretrained algorithms, which are designed generally but without parameter optimization, are prepared for each component of the CADe system. The pretrained algorithms are designed to be applicable to various lesion detection problems. Each pretrained algorithm is automatically optimized by a dedicated method based on machine learning and a training dataset. The framework was designed with the aim of automatically developing CADe systems structured with the components of trained algorithms. If the framework is feasible, it is expected to lead to a paradigm shift in the development of CADe systems.
In this framework, developers are not required to manually design algorithms and are only required to collect the training dataset. Such a framework would enable outsiders, including clinicians, without programming skills to take part in the development of CADe systems. It would provide opportunities for many clinicians to develop their own CADe systems to meet their specific needs. While there are many algorithms available online, it is difficult for laypersons to integrate them to meet their needs. Therefore, a framework with all the necessary algorithms already prepared would be extremely valuable.
The purpose of this study is to show the feasibility of the generalized CADe framework. To achieve this, two different CADe systems were experimentally developed with the same components and the same pretrained algorithms, but with different training datasets.
Methods
Framework Design
Overview
In this study, CADe systems are experimentally developed through a prototype generalized CADe framework (Fig. 2). This prototype is a restricted version that detects only small and roughly spherical lesions in three-dimensional medical image data (volume data). A representative CADe training dataset is first collected to develop a CADe system. The dataset includes the target lesion area, which is the three-dimensional area of a target lesion defined by voxel-by-voxel labeling. The CADe system includes four components: preprocessing, candidate area extraction, candidate detection, and candidate classification. In this framework, four pretrained algorithms with dedicated optimization/setting methods corresponding to the respective components are prepared in advance. Since the optimization of the pretrained algorithm in a component requires the results of the previous component, the pretrained algorithms are sequentially trained in the order of processing of the components. In this section, we describe the details of each pretrained algorithm with the dedicated optimization/setting method.
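The sequential training of the four components can be sketched as follows. This is an illustrative Python sketch only (the authors' implementation was in C/C++), and the class and function names are hypothetical, not part of the framework itself.

```python
# Hypothetical sketch: each component pairs a pretrained algorithm with a
# dedicated optimization method, and components are trained in processing
# order, each consuming the intermediate results of the previous stages.

class Component:
    """A pretrained algorithm plus its dedicated optimization/setting method."""
    def __init__(self, name):
        self.name = name
        self.trained = False

    def optimize(self, training_data):
        # Placeholder for the dedicated optimization/setting method.
        self.trained = True
        return training_data  # pass intermediate results to the next stage

def train_framework(training_dataset):
    """Train the four components sequentially, in the order of processing."""
    components = [Component(n) for n in
                  ("preprocessing", "candidate_area_extraction",
                   "candidate_detection", "candidate_classification")]
    intermediate = training_dataset
    for c in components:
        intermediate = c.optimize(intermediate)
    return components

cade_system = train_framework({"volumes": [], "lesion_labels": []})
print([c.name for c in cade_system])
```

The key design point this sketch captures is the dependency: because each component's optimization needs the outputs of the previous trained components, the loop must run in processing order.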
Fig. 2.
Prototype generalized CADe framework proposed in this study
Algorithms of the Components
Preprocessing
The first component creates isotropic volume data from input volume data to facilitate the subsequent processing. Trilinear interpolation is used to obtain the isotropic volume data, where the voxel size is r i × r i × r i.
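The isotropic resampling can be sketched as below. This is an illustrative NumPy implementation of trilinear interpolation, not the authors' code (which was in C/C++); the function name, the (z, y, x) axis convention, and the edge-clamping behavior are assumptions.

```python
import numpy as np

def resample_isotropic(vol, spacing, r_i):
    """Trilinearly resample `vol` (axes z, y, x) with per-axis voxel
    spacing `spacing` (mm) to isotropic r_i x r_i x r_i voxels."""
    spacing = np.asarray(spacing, dtype=float)
    in_shape = np.array(vol.shape)
    out_shape = tuple(np.maximum(1, np.floor(in_shape * spacing / r_i)).astype(int))
    # continuous source coordinates of each output grid point
    grids = [np.arange(n) * r_i / spacing[a] for a, n in enumerate(out_shape)]
    zc, yc, xc = np.meshgrid(*grids, indexing="ij")
    coords = np.stack([zc, yc, xc])
    upper = (in_shape - 1)[:, None, None, None]
    lo = np.minimum(np.floor(coords).astype(int), upper)  # lower corner (clamped)
    hi = np.minimum(lo + 1, upper)                        # upper corner (clamped)
    frac = coords - lo
    out = np.zeros(out_shape, dtype=float)
    # accumulate the eight corner contributions of trilinear interpolation
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                iz = hi[0] if dz else lo[0]
                iy = hi[1] if dy else lo[1]
                ix = hi[2] if dx else lo[2]
                w = ((frac[0] if dz else 1 - frac[0]) *
                     (frac[1] if dy else 1 - frac[1]) *
                     (frac[2] if dx else 1 - frac[2]))
                out += w * vol[iz, iy, ix]
    return out
```

For example, a 3 × 3 × 3 volume with 1 × 1 × 2 mm voxels resampled to r_i = 1 mm yields a 3 × 3 × 6 isotropic volume.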
Setting Method
r i is automatically set on the basis of the size of the smallest lesion in the CADe training dataset. The bounding box is the minimum enclosing box for the smallest lesion. When the bounding box is B min, r i is set as follows:
| 1 |
where |B min| is the volume of B min.
Candidate area extraction
The second component extracts the candidate area from the isotropic volume data. The candidate area is a limited three-dimensional region in which candidates are detected in the next component. The candidate area is extracted by a cascade [14] of voxel classifier ensembles that are automatically optimized by a boosting method and a training dataset, as described below. Each classifier ensemble consists of weak classifiers whose classification performances are limited. The output value of the ensemble, which is used as a likelihood, is a weighted sum of the weak classifiers’ outputs. All the weak classifiers within the classifier ensembles are decision stumps [15]. A decision stump is a simple classification model consisting of a one-level decision tree and makes a prediction by thresholding a single feature. In this component, the features used in all the weak classifiers are automatically selected from the voxel feature bank shown in Table 1 [16–18]. The feature selection results through the optimization process are affected by the characteristics of the training dataset. The voxel classifier ensembles within the cascade identify input voxels either as being inside of the target lesion (positive class) or as being outside of the target lesion (negative class). The voxels classified as belonging to the positive class by all classifier ensembles in the cascade are joined together to form the candidate area (Fig. 3). An example of a cascade processing result is shown in Fig. 4.
Table 1.
Contents of the voxel feature bank
| Scaled voxel values (n = 5): |
| Original image and smoothed images scaled by uniform weighted cubic kernel (size =1, 2, 4, 8, 16 voxels) |
| Morphological features (n = 10): |
| 2 types (top-hat and bottom-hat filtering) × 5 spherical kernel sizes (radius =1, 2, 4, 8, 16 voxels) |
| Difference of Gaussian features (n = 5): |
| 5 pairs of Gaussian smoothers σ: (1, 2), (2, 4), (4, 8), (8, 16), (16, 32) voxels |
| Voxel value statistics (n = 20): |
| 4 types (average, deviation, skewness, kurtosis) × 5 cubic ROI sizes (side length =3, 5, 9, 17, 33 voxels) |
| 3D Haar features [16] (n = 95): |
| 19 block mask combinations × 5 cubic ROI sizes (side length =3, 5, 9, 17, 33 voxels) |
| Haralick texture features [17] (n = 55): |
| 11 types (angular second moment, inverse difference moment, contrast, variance, entropy, correlation, sum average, sum variance, sum entropy, difference variance, difference entropy) × 5 sets of 3D spatial offsets (δ x, δ y, δ z): {(±d,0,0), (0, ±d,0), and (0,0, ±d)|d = 1, 2, 4, 8, 16 voxels} |
| Hessian matrix-derived features (n = 45): |
| 9 types (3 eigenvalues, mean and Gaussian curvatures, shape index, curvedness, and principal curvatures k1 and k2) × 5 standard deviations of Gaussian kernel (σ = 1, 2, 4, 8, 16 voxels) |
| Local optimum scale-derived features [18] (n = 8): |
| Local optimum scale σ opt (selected from 1, 2, 3, 4, and 5 voxels) + 7 types of feature: principal curvatures (k1, k2, k1/k2), magnitude of first derivative, S blob, S bif, and S line calculated from Gaussian smoothed image by σ opt-sized kernel |
| Standardized coordinate features (n = 3): |
| Standardized x, y, and z coordinates are used as the voxel features (see Appendix). |
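At inference time, the cascade built from decision stumps over these voxel features behaves roughly as follows. This is an illustrative Python sketch under stated assumptions: feature extraction and training are omitted, per-voxel features are assumed precomputed, and all function names are hypothetical.

```python
import numpy as np

def stump(feature_index, threshold, polarity):
    """A decision stump: a one-level decision tree that thresholds a
    single feature and outputs +1 or -1."""
    def classify(features):
        return polarity * np.where(features[:, feature_index] > threshold, 1, -1)
    return classify

def ensemble_score(stumps, alphas, features):
    """Weighted sum of the weak classifiers' outputs (used as a likelihood)."""
    return sum(a * s(features) for a, s in zip(alphas, stumps))

def cascade_filter(layers, features):
    """Keep only the voxels classified as positive by every layer; the
    surviving voxels form the candidate area."""
    keep = np.ones(len(features), dtype=bool)
    for stumps, alphas in layers:
        score = ensemble_score(stumps, alphas, features[keep])
        survivors = score > 0
        idx = np.flatnonzero(keep)
        keep[idx[~survivors]] = False  # reject voxels that fail this layer
    return keep
```

Because each layer only processes the voxels that survived the previous layers, early cheap layers discard most negative voxels before the larger later ensembles run.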
Fig. 3.
Voxel classification cascade
Fig. 4.

Example of the cascade processing result for a brain MRA. The arrow in the upper left image gives the position of a cerebral aneurysm. The bright areas in the binary images are aggregations of voxels classified as belonging to the positive class by the first, second, and final cascade layers
Optimization Method
The voxel classifier ensembles and the number of classifier ensembles in the cascade are automatically trained and optimized as follows. Cost-sensitive [19] AdaBoost [14, 20] is used to train all classifier ensembles within the cascade. Cost-sensitive learning is a classifier learning method with arbitrary weights added to training samples. For example, it is used to correct the imbalance between the training sample sizes of the positive and negative classes. In the AdaBoost training, the features used in the weak classifiers (the decision stumps) are selected from the voxel feature bank shown in Table 1 [16–18]. The threshold value for the feature used in each decision stump is optimized through the AdaBoost training. Since the classifiers within the cascade are trained in numerical order, the voxel dataset used to train the t th ensemble comprises the voxels classified as members of the positive class by the first (t − 1) ensembles in the CADe training dataset. Each training voxel is given a teacher label and a cost weight. The teacher label y x of training voxel x is given by referring to the target area as follows:
| 2 |
When the t th ensemble is undergoing training, the cost weight w e(t) , x of voxel x is given as follows:
| 3 |
where the two counts in Eq. 3 are the numbers of “+1”-labeled voxels and “−1”-labeled voxels in the CADe training dataset for the t th ensemble, respectively. α t is a parameter that controls the priority level of the positive voxels in the training. If α t = 1.0, an ensemble is trained with equal weight on voxel sensitivity (the ratio of correctly classified positive voxels) and voxel specificity (the ratio of correctly classified negative voxels). If α t is larger than 1.0, the training weight on the voxel sensitivity increases; if α t is smaller than 1.0, the training weight on the voxel specificity increases. From among the many voxel classifier ensembles trained with various values of α t, the classifier ensemble with the highest voxel classification specificity while achieving over 99.95% voxel sensitivity is selected as a member of the cascade. This sensitivity is calculated from the number of lesion voxels classified correctly and the total number of voxels within the volumetric areas of target lesions defined by experienced radiologists.
The number of ensembles in the cascade is optimized by performing a forward sequential search involving the repeated addition of an ensemble and evaluation of the lesion sensitivity. The addition of an ensemble to the cascade leads not only to the correct rejection of non-lesion voxels but also to the misclassification of lesion voxels. Thus, the sequential search is performed under the constraint that, for each lesion in the training dataset, at least one voxel is correctly classified by the cascade. A later classifier ensemble of the cascade has a larger number of weak classifiers, as in [8]. Specifically, the numbers of weak classifiers included in the first nine ensembles are preset to 1, 10, 10, 25, 25, 50, 50, 100, and 100.
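The cost-sensitive AdaBoost training of one ensemble can be sketched as follows. This is a minimal illustrative Python implementation, not the authors' code: the `cost` array stands in for the initial cost weights of Eq. 3 (e.g. up-weighting the scarce positive voxels), and the exhaustive threshold search is a simplification.

```python
import numpy as np

def train_cost_sensitive_adaboost(X, y, cost, n_stumps):
    """AdaBoost over decision stumps; `cost` sets the initial sample
    weights, implementing cost-sensitive learning."""
    w = cost / cost.sum()
    stumps = []
    for _ in range(n_stumps):
        best = None
        # greedy weak-learner search: best (feature, threshold, polarity)
        for f in range(X.shape[1]):
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, f] > thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, pred)
        err, f, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)       # numerical guard
        alpha = 0.5 * np.log((1 - err) / err)       # weak-classifier weight
        w = w * np.exp(-alpha * y * pred)           # re-weight samples
        w = w / w.sum()
        stumps.append((alpha, f, thr, pol))
    return stumps

def predict(stumps, X):
    """Sign of the weighted sum of the weak classifiers' outputs."""
    score = np.zeros(len(X))
    for alpha, f, thr, pol in stumps:
        score += alpha * pol * np.where(X[:, f] > thr, 1, -1)
    return np.where(score > 0, 1, -1)
```

In the framework, the feature loop corresponds to selecting stump features from the voxel feature bank, so feature selection happens as a side effect of the boosting rounds.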
Candidate point detection
The third component detects candidate points of the target lesion from a candidate area. A candidate point is a local maximum of the lesion voxel likelihood. The lesion voxel likelihood, which is calculated at every voxel in the candidate area, is the output value of a voxel classifier ensemble. This ensemble is constructed from decision stumps based on voxel features but is different from the ensembles within the cascade and is trained to enhance voxels belonging to the target lesions. Figure 5 shows images giving an example of the calculated lesion voxel likelihood. The local maxima of the lesion voxel likelihood are selected as the candidate points.
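Selecting candidate points as local maxima of the likelihood can be sketched as below. This is an illustrative NumPy sketch under assumptions: a 26-neighborhood maximum test, plateau ties kept, and hypothetical function names.

```python
import numpy as np

def local_maxima_3d(likelihood, mask):
    """Candidate points: voxels inside `mask` (the candidate area) whose
    lesion voxel likelihood is >= that of all 26 neighbours."""
    padded = np.pad(likelihood, 1, mode="constant", constant_values=-np.inf)
    is_max = np.ones(likelihood.shape, dtype=bool)
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dz == dy == dx == 0:
                    continue
                # neighbour value at offset (dz, dy, dx) for every voxel
                shifted = padded[1 + dz:1 + dz + likelihood.shape[0],
                                 1 + dy:1 + dy + likelihood.shape[1],
                                 1 + dx:1 + dx + likelihood.shape[2]]
                is_max &= likelihood >= shifted
    return np.argwhere(is_max & mask)
```

Restricting the test to the candidate area via `mask` mirrors the framework's design: the likelihood is only calculated, and maxima only sought, inside the area that survived the cascade.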
Fig. 5.

Example of calculation results for the lesion voxel likelihood showing an axial slice of a brain MRA with a cerebral aneurysm (a), cropped axial (left)/coronal (center)/sagittal (right) images centered on the cerebral aneurysm (b), and an example of the calculation results for the lesion voxel likelihood corresponding to the cropped images (c). The arrows in (b) give the position of the same cerebral aneurysm
Optimization Method
Cost-sensitive AdaBoost is used to train the voxel classifier ensemble. The classifier ensemble consists of 200 decision stumps. Through the AdaBoost training, the voxel features used in the decision stumps are selected from a voxel feature bank different from the bank used in the optimization for candidate area extraction. This feature bank includes the features shown in Table 1 and the outputs of the voxel classifier ensembles used in the candidate area extraction. The voxel dataset used to train this voxel classifier ensemble is extracted from the rectangular sampling areas of all lesions within the CADe training dataset. The rectangular sampling area B ω' for lesion ω is twice the size of the bounding box for the target area of ω (B ω). Each training voxel is given a teacher label (y x) and a cost weight (w d , x). The labeling rule for training voxels is the same as that in the training algorithm for candidate area extraction. The cost weight w d , x at training voxel x is given as follows:
| 4 |
Here, the two counts in Eq. 4 are the number of “+1”-labeled voxels and the number of “−1”-labeled voxels in the training voxel dataset.
Candidate point classification
All candidate points are classified by a candidate classifier ensemble and are given lesion likelihoods. The lesion likelihood is the output of a candidate classifier ensemble. The ensemble is constructed from decision stumps and is different from the ensembles used in the candidate area extraction and candidate detection.
The weak classifiers constructing the candidate classifier ensemble are decision stumps based on pooling features, which are calculated by applying feature pooling operators [21] to voxel features. In this algorithm, three pooling operators are used, max-pooling f max(∙), min-pooling f min(∙), and average pooling f avg(∙), given as follows:
| 5 |
| 6 |
| 7 |
Here, v x is the value of voxel feature v calculated at voxel x within a local region Q located at or around the detected candidate point.
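The three pooling operators of Eqs. 5–7 can be sketched directly. This is a minimal illustrative Python version in which a local region Q is represented by index slices into the voxel feature map; the representation is an assumption.

```python
import numpy as np

def pooling_features(voxel_feature, region_slices):
    """Max-, min-, and average-pooling (Eqs. 5-7) of a voxel feature map
    over a local region Q around a candidate point. `region_slices` is a
    (z, y, x) tuple of slices selecting Q."""
    q = voxel_feature[region_slices]
    return q.max(), q.min(), q.mean()
```

In the framework, the candidate classifier's feature bank is built by evaluating these three operators over 64 such regions of different sizes and positions within the 8 × 8 × 8 region of interest around each candidate point.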
Optimization Method
Cost-sensitive AdaBoost is used to train the candidate classifier ensemble, which consists of 200 decision stumps. The dataset used to train the classifier ensemble includes candidate points detected from the CADe training dataset by executing the trained algorithms corresponding to preprocessing, candidate area extraction, and candidate detection. The candidate points inside/outside the target area are labeled as “+1”/“−1”. Using the numbers of “+1”- and “−1”-labeled candidate points, the teacher label y p and the cost weight w c , p of candidate point p are given as
| 8 |
The feature bank used to train the candidate classifier ensemble includes the pooling features calculated on the basis of the voxel features shown in Table 1, the outputs of the ensembles used for candidate area extraction, and the output of the ensemble used for candidate detection. To calculate the pooling features in the feature bank, 64 different local regions are used, which have different sizes and positions relative to the candidate point. All the local regions are located within the region of interest, which comprises 8 × 8 × 8 voxels centered on the candidate point.
Development and Evaluation of CADe System
Two different CADe training datasets were collected to develop two different types of CADe systems in the generalized CADe framework. These two CADe training datasets differ in terms of the image modality, scanning range, and target lesion. Two radiologists first independently interpreted the medical images, and the final decision was then determined by consensus between them. Both radiologists had over 10 years of CT- and MRA-reading experience. Both datasets include the three-dimensional areas of all lesions, which were manually defined by the radiologists by voxel-by-voxel painting.
Brain Magnetic Resonance Angiography (MRA) Volume Dataset with Cerebral Aneurysms
Three hundred sets of 3D time-of-flight unenhanced MRA images were used (151 males and 149 females). The average patient age was 59.8 years with a standard deviation (SD) of 11.4 years. These images were scanned at our institution with a GE Signa HDxt 3.0T or a GE Discovery MR750 3.0T magnetic resonance imaging scanner (GE Healthcare, Waukesha, WI, USA). The acquisition parameters were as follows: echo time, 2.7 or 2.9 ms; repetition time, 25 ms; flip angle, 15°; field of view, 240 mm; slice thickness, 0.6 mm; slice interval, 1.2 mm; matrix size, 512 × 512 pixels. The original spatial resolution of all obtained images was 0.468 × 0.468 × 0.60 mm3/voxel. Each set included at least one aneurysm of 2 mm or more in diameter. The diameter was manually measured by radiologists at the initial interpretation. The average diameter of the aneurysms is 3.1 mm with an SD of 1.4 mm. The average number of lesions included in a set was 1.11 with an SD of 0.33. Examples of images with aneurysms are shown in Fig. 6 (a).
Fig. 6.

Examples of images included in the brain MRA volume dataset (a) and examples of images included in the chest CT volume dataset (b). Arrows in the images give the positions of lesions
Chest Computed Tomography (CT) Volume Dataset with Pulmonary Nodules
One hundred and twenty-nine sets of unenhanced chest CT images were used (74 males and 55 females). The average patient age was 59.3 years with an SD of 10.6 years. These images were scanned at our institution with a GE LightSpeed CT scanner (GE Healthcare). The acquisition parameters were as follows: number of detector rows, 16; tube voltage, 120 kVp; tube current, 50–290 mA (automatic exposure control); noise index, 20.41; field of view, 400 mm; rotation time, 0.5 s; moving table speed, 70 mm/s; body filter, standard; matrix size, 512 × 512 pixels. The original spatial resolution of all images was 0.781 × 0.781 × 1.25 mm3/voxel. Each set included at least one solid nodule or ground glass opacity (GGO) nodule of 5 mm or more in diameter. The diameter was manually measured by radiologists at the initial interpretation. The average diameter of the nodules is 7.8 mm with an SD of 3.8 mm. The average number of lesions included in a set was 1.45 with an SD of 0.83. Examples of images with nodules are shown in Fig. 6 (b).
The performances of the CADe systems developed through the generalized CADe framework were evaluated by threefold cross-validation. In threefold cross-validation, the dataset was divided into three subsets: two subsets for optimizing the pretrained algorithms and the remaining subset for evaluating the obtained CADe system. The validation was iterated three times by rotating the subsets. Consequently, three CADe systems were obtained from each CADe training dataset in this study. In the performance evaluation of the developed CADe systems, candidate points located inside/outside any painted lesion area were judged from the lesion likelihood as true positives (TPs) or false positives (FPs), respectively. Free-response receiver operating characteristic (FROC) curves and ANODE scores [22] were calculated, and the time required for lesion detection in each input volume was measured. By changing the threshold value for the lesion likelihood, plots of the TP fraction (sensitivity) against FPs per case (the average number of FPs for a case) were obtained for the FROC curves. The ANODE score is the average sensitivity at predefined FP rates (1/8, 1/4, 1/2, 1, 2, 4, and 8 FPs per case) along a FROC curve. The length of time from inputting a CADe training dataset to outputting a CADe system was also measured.
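The ANODE score computation can be sketched as follows. This is an illustrative Python sketch under a simplifying assumption: each lesion is hit by at most one true-positive candidate, so counting TP candidates equals counting detected lesions. The function names are hypothetical.

```python
import numpy as np

def anode_score(tp_flags, likelihoods, n_lesions, n_cases,
                fp_rates=(1/8, 1/4, 1/2, 1, 2, 4, 8)):
    """Average sensitivity at predefined FPs-per-case rates along the
    FROC curve. `tp_flags[i]` says whether candidate i hits a lesion;
    candidates are ranked by descending `likelihoods`."""
    order = np.argsort(likelihoods)[::-1]
    tp = np.asarray(tp_flags)[order]
    cum_tp = np.cumsum(tp)    # lesions found so far (one TP per lesion assumed)
    cum_fp = np.cumsum(~tp)   # false positives so far
    sens = []
    for rate in fp_rates:
        allowed = rate * n_cases
        # last operating point whose FP count does not exceed the budget
        idx = np.searchsorted(cum_fp, allowed, side="right") - 1
        sens.append(cum_tp[idx] / n_lesions if idx >= 0 else 0.0)
    return float(np.mean(sens))
```

Sweeping the likelihood threshold corresponds to walking down the ranked candidate list; each FP budget picks one operating point on the FROC curve.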
C/C++ language was used to implement the algorithms and the automatic optimization/setting methods. The experiments were performed on a workstation with Intel Xeon X5680 (3.33 GHz × 2) CPUs, 72 GB RAM, and a Microsoft Windows 7 Professional SP1 operating system. This study was approved by the Ethical Review Board of our institution, and written informed consent to use the images for the study was obtained from all the subjects.
Results
CADe systems for detecting cerebral aneurysms in the brain MRA were successfully developed using the brain MRA volume dataset with cerebral aneurysms. The average time to develop the CADe systems was 28.7 h with an SD of 3.0 h. The cerebral aneurysm detection processes for all 300 sets were completed without problems. The ANODE score had an average of 0.239 with an SD of 0.062 (Fig. 7). The configurations of the CADe systems developed with the same type of dataset resembled each other, as inferred from the small SD for each item shown in Table 2. The average detection time per case was 106.2 s with an SD of 19.6 s.
Fig. 7.
FROC curves of the CADe systems for detecting cerebral aneurysms
Table 2.
Specifications and detection performance of the CADe systems for detecting cerebral aneurysms and lung nodules developed through the generalized CADe framework
| Parameter(s) | Performance/specification (mean ± standard deviation) | ||
|---|---|---|---|
| Cerebral aneurysm CADe | Lung nodule CADe | ||
| Preprocessing | Voxel size (mm) | 0.676 ± 0.017 | 1.700 ± 0.047 |
| Candidate area extraction | Number of cascade layers | 7.67 ± 1.15 | 8.00 ± 1.00 |
| Ratio of voxel removal (%) | 99.92 ± 0.04 | 99.15 ± 0.31 | |
| Lesion-based sensitivity | 0.986 ± 0.018 | 0.970 ± 0.028 | |
| Candidate detection | Number of candidates/cases | 743.6 ± 333.3 | 7642.2 ± 1898.5 |
| Lesion detection sensitivity | 0.871 ± 0.032 | 0.882 ± 0.049 | |
| Candidate classification | ANODE score | 0.239 ± 0.062 | 0.218 ± 0.058 |
The CADe systems for detecting lung nodules in the chest CT were also successfully developed using the chest CT volume dataset with pulmonary nodules. The average time to develop the CADe systems was 22.5 h with an SD of 0.4 h. The lung nodule detection process for all 129 sets was completed without problems. The ANODE score had an average of 0.218 with an SD of 0.058 (Fig. 8). The configurations of the developed CADe systems also resembled each other, as inferred from the small SD for each item shown in Table 2. The average processing time per case was 200.2 s with an SD of 22.5 s.
Fig. 8.
FROC curves of the CADe systems for detecting lung nodules
Discussion
We experimentally confirmed that two different types of CADe system were developed successfully through the generalized CADe framework using two different training datasets. The CADe systems, whose characteristics depended only on the CADe training datasets, were developed automatically without the need for manual design. The two successfully developed CADe systems show that the generalized CADe framework is feasible.
To the best of our knowledge, this is the first report on the development of a whole CADe system without any manual design of the algorithms. Classically, the algorithms in the CADe systems were designed manually. In the generalized CADe framework, all the algorithms were automatically trained by optimization/setting methods using the collected CADe training dataset. None of the pretrained algorithms were specialized for a particular modality or type of lesion. Moreover, the machine learning methods used in the optimization methods included a function to automatically select optimal features from a large-scale feature bank since the optimal feature set was dependent on the characteristics of the detection/classification task. Theoretically, with an appropriate training dataset, a CADe system for detecting any type of lesion can be developed in this framework.
The results of this study show the promise of a new paradigm for CADe system development. It would enable clinicians without expertise in medical image analysis to develop CADe systems for use at their clinical sites. Anyone with access to an appropriate dataset of medical images could develop the various CADe systems required for their particular clinical situation. In addition, artificial intelligence researchers would be able to concentrate on the development of algorithms in the framework without considering the specific features of the disease to be detected. A generalized CADe framework would enable the development of CADe systems to be divided into the development of clinical applications and the development of the algorithms.
CADe system development in the generalized CADe framework requires less labor than previous CADe system developments with the manual design of algorithms. The manual design of algorithms is a trial-and-error procedure using a CADe training dataset, which is labor-intensive. In the generalized CADe framework, the manual design of algorithms is not required because all the pretrained algorithms are automatically trained by the optimization methods. Using the generalized CADe framework, the collection of the CADe training dataset is the only human task required to develop CADe systems, which was also necessary in previous CADe system developments. Adopting machine learning methods will greatly reduce the workload of CADe system developers.
A limitation of this study is the low performances of the developed CADe systems, which were inferior to those of state-of-the-art CADe systems [18, 22, 23]. Low performance may be a price we need to pay for generalization of development. For the cerebral aneurysm CADe system, our system achieved a median sensitivity of 56.8% with ten FPs/case, while Yang et al. achieved a sensitivity of 80% with three FPs/case and 95% with nine FPs/case in a performance evaluation with about 287 MRA scans [23]. For the lung nodule CADe, our system achieved a mean ANODE score of 0.218, while the ANODE09 study [22] in 2009 reported a median ANODE score of 0.272. However, it should be noted that the performances of CADe systems trained with different datasets cannot easily be compared, since CADe systems evaluated with a small dataset often show superior performance. Arimura et al. developed a cerebral aneurysm CADe system with only 31 cases [24], which showed better performance (a sensitivity of 100% with 2.4 FPs/case) than that of Yang et al. [23].
However, the aim of this paper is not to maximize the performances of the developed CADe systems but to show the feasibility of the framework for developing various CADe systems without manual design of the algorithms, since this is a pilot study of the generalized CADe framework. There is ample room for improving this prototype framework. One weakness of this prototype is its inability to integrate local anatomical information into the CADe system. In most CADe systems with segmentation algorithms, the algorithms are uniquely designed for the specific detection of the target lesion, which is the mainstream design for integrating local anatomical information. While a versatile segmentation algorithm that can be applied to any anatomical structure has not yet been proposed [4], introducing general anatomical features, such as the distance to an anatomical landmark [25–27], would improve the performance of CADe systems developed with this framework without application-specific design.
Another weakness of the prototype is that it was applied under restricted conditions; the target was limited to roughly spherical lesions with a small volume and the features used were limited in number. Extending the range of the pretrained algorithms will enable the detection of larger lesions, and expanding the range of applications of the pretrained algorithms will enable the detection of lesions from two-dimensional image data. Moreover, the features used were derived from a feature bank with a limited number of voxel features. Feature generation methods, such as deep learning methods, will be effective means of obtaining the optimal features for the target lesions without any manual design [28].
Conclusion
We proposed a generalized CADe framework for developing CADe systems without any manual design for various types of target detection. We experimentally confirmed that two different types of CADe system were developed successfully through the generalized CADe framework using two different training datasets. All the algorithms used in the developed CADe systems were trained automatically by optimization/setting methods instead of being designed manually. The characteristics of the CADe systems depended only on the CADe training datasets. The feasibility of this framework shows promise of a new paradigm for the development of CADe systems. In this framework, developers are not required to manually design algorithms and are only required to collect the training dataset. Such a framework would enable clinicians without programming skills to take part in the development of CADe systems. This would provide opportunities for many clinicians to develop their own CADe systems to meet their specific needs. A generalized CADe framework with all the necessary algorithms already prepared would be extremely valuable.
Acknowledgements
The Department of Computational Radiology and Preventive Medicine, The University of Tokyo Hospital is sponsored by HIMEDIC Inc., Siemens Japan K.K., and GE Healthcare Japan.
Appendix: Calculation of the standard coordinate features
The standard coordinate features were the standardized x, y, and z coordinates. To calculate the features, n_a atlas volumes were randomly chosen from the CADe training dataset and were registered to an input volume in advance. In this study, five atlas volumes and a volume registration method based on Markov random fields [29, 30] were used. The standard coordinate c_p of voxel p in the input volume is given as follows:

\mathbf{c}_p = \frac{1}{n_a} \sum_{i=1}^{n_a} \mathbf{c}_p^{(i)} \qquad (9)

Here, \mathbf{c}_p^{(i)}, which is the coordinate in the i-th registered atlas volume, corresponds to p in the input volume.
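The averaging in Eq. (9) is straightforward to implement once the registration step has produced, for each atlas, a coordinate map giving the atlas-space coordinate corresponding to each voxel of the input volume. The sketch below is a minimal illustration of that final averaging step only (not the authors' implementation); the function name and the array layout are our own assumptions.

```python
import numpy as np

def standard_coordinates(atlas_coord_maps):
    """Standard coordinate features per Eq. (9).

    atlas_coord_maps: array of shape (n_a, X, Y, Z, 3); entry [i, x, y, z]
    holds the coordinate in the i-th registered atlas volume that
    corresponds to voxel (x, y, z) of the input volume.

    Returns an array of shape (X, Y, Z, 3): for each voxel p, the mean of
    its corresponding atlas coordinates c_p^(i) over the n_a atlases.
    """
    maps = np.asarray(atlas_coord_maps, dtype=float)
    return maps.mean(axis=0)  # average over the atlas axis (i = 1..n_a)
```

With the five atlases used in the study, `atlas_coord_maps` would have `n_a = 5` along its first axis; the registration itself (the MRF-based method of [29, 30]) is assumed to have been run beforehand.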
Compliance with Ethical Standards
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Standards
The study described in this manuscript was approved by the Research Ethics Board of the University of Tokyo Hospital. Informed consent was obtained from all individual participants included in the study.
References
- 1. Castellino RA. Computer aided detection (CAD): an overview. Cancer Imaging. 2005;5:17–19. doi:10.1102/1470-7330.2005.0018.
- 2. Giger ML, Chan H, Boone J. Anniversary paper: history and status of CAD and quantitative image analysis: the role of medical physics and AAPM. Medical Physics. 2008;35:5799–5820. doi:10.1118/1.3013555.
- 3. Doi K. Computer-aided diagnosis in medical imaging: historical review, current status and future potential. Computerized Medical Imaging and Graphics. 2007;31:198–211. doi:10.1016/j.compmedimag.2007.02.002.
- 4. van Ginneken B, Schaefer-Prokop CM, Prokop M. Computer-aided diagnosis: how to move from the laboratory to the clinic. Radiology. 2011;261:719–732. doi:10.1148/radiol.11091710.
- 5. Chan HP, Lo SCB, Sahiner B, Lam KL, Helvie MA. Computer-aided detection of mammographic microcalcifications: pattern recognition with an artificial neural network. Medical Physics. 1995;22:1555–1567. doi:10.1118/1.597428.
- 6. Gokturk SB, Tomasi C, Acar B, Beaulieu CF, Paik DS, Jeffrey RB Jr, Yee J, Napel S. A statistical 3-D pattern processing method for computer-aided detection of polyps in CT colonography. IEEE Trans Med Imaging. 2001;20:1251–1260. doi:10.1109/42.974920.
- 7. Nemoto M, Shimizu A, Hagihara Y, Kobatake H, Nawano S. Improvement of tumor detection performance in mammograms by feature selection from a large number of features and proposal of fast feature selection method. Syst Comput Jpn. 2006;37:56–68. doi:10.1002/scj.20498.
- 8. Miller MT, Jerebko AK, Malley JD, Summers RM. Feature selection for computer-aided polyp detection using genetic algorithms. Proc SPIE (Medical Imaging). 2003;5031:102–110. doi:10.1117/12.485796.
- 9. Wang S, Summers RM. Machine learning and radiology. Med Image Anal. 2012;16:933–951. doi:10.1016/j.media.2012.02.005.
- 10. Bishop CM. Pattern recognition and machine learning. New York: Springer; 2006.
- 11. Duda RO, Hart PE, Stork DG. Pattern classification. 2nd ed. New York: John Wiley & Sons; 2000.
- 12. Mitchell TM. Machine learning. International ed. New York: McGraw-Hill Education; 1997.
- 13. Sahiner B, Chan H, Petrick N, Wagner RF, Hadjiiski L. Feature selection and classifier performance in computer-aided diagnosis: the effect of finite sample size. Med Phys. 2000;27:1509–1522. doi:10.1118/1.599017.
- 14. Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. Proc IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001). 2001;1:I511–I518.
- 15. Iba W, Langley P. Induction of one-level decision trees. Proc International Conference on Machine Learning (ICML 1992): 233–240, 1992.
- 16. Tu Z, Bai X. Auto-context and its application to high-level vision tasks and 3D brain image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2010;32:1744–1757. doi:10.1109/TPAMI.2009.186.
- 17. Haralick RM. Statistical and structural approaches to texture. Proc IEEE. 1979;67:786–804. doi:10.1109/PROC.1979.11328.
- 18. Nomura Y, Masutani Y, Miki S, Nemoto M, Hanaoka S, Yoshikawa T, Hayashi N, Ohtomo K. Performance improvement in computerized detection of cerebral aneurysms by retraining classifier using feedback data collected in routine reading environment. Journal of Biomedical Graphics and Computing. 2014;4:12–21.
- 19. Sun Y, Kamel MS, Wong AK, Wang Y. Cost-sensitive boosting for classification of imbalanced data. Pattern Recognition. 2007;40:3358–3378. doi:10.1016/j.patcog.2007.04.009.
- 20. Tu Z, Zhou XS, Barbu A, Bogoni L, Comaniciu D. Probabilistic 3D polyp detection in CT images: the role of sample alignment. Proc IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
- 21. Boureau Y, Ponce J, LeCun Y. A theoretical analysis of feature pooling in visual recognition. Proc International Conference on Machine Learning (ICML): 111–118, 2010.
- 22. van Ginneken B, Armato SG, de Hoop B, van Amelsvoort-van de Vorst S, Duindam T, Niemeijer M, Murphy K, Schilham A, Retico A, Fantacci ME. Comparing and combining algorithms for computer-aided detection of pulmonary nodules in computed tomography scans: the ANODE09 study. Medical Image Analysis. 2010;14:707–722. doi:10.1016/j.media.2010.05.005.
- 23. Yang X, Blezek DJ, Cheng LT, Ryan WJ, Kallmes DF, Erickson BJ. Computer aided detection of intracranial aneurysms in MR angiography. J Digital Imaging. 2011;24:86–95. doi:10.1007/s10278-009-9254-0.
- 24. Arimura H, Li Q, Korogi Y, Hirai T, Abe H, Yamashita Y, Katsuragawa S, Ikeda R, Doi K. Automated computerized scheme for detection of unruptured intracranial aneurysms in three-dimensional magnetic resonance angiography. Acad Radiol. 2004;11:1093–1104. doi:10.1016/j.acra.2004.07.011.
- 25. Rohr K. Landmark-based image analysis: using geometric and intensity models. Utrecht: Springer Netherlands; 2001.
- 26. Nemoto M, Masutani Y, Hanaoka S, Nomura Y, Yoshikawa T, Hayashi N, Yoshioka N, Ohtomo K. A unified framework for concurrent detection of anatomical landmarks for medical image understanding. Proc SPIE (Medical Imaging). 2011;7962:79323E.
- 27. Hanaoka S, Shimizu A, Nemoto M, Nomura Y, Miki S, Yoshikawa T, Hayashi N, Ohtomo K, Masutani Y. Automatic detection of over 100 anatomical landmarks in medical CT images: a framework with independent detectors and combinatorial optimization. Med Image Anal. 2016;35:192–214. doi:10.1016/j.media.2016.04.001.
- 28. Nemoto M, Hayashi N, Hanaoka S, Nomura Y, Miki S, Yoshikawa T, Ohtomo K. A primitive study of voxel feature generation by multiple stacked denoising autoencoders for detecting cerebral aneurysms on MRA. Proc SPIE (Medical Imaging). 2016;9785:97852S. doi:10.1117/12.2216832.
- 29. Glocker B, Sotiras A, Komodakis N, Paragios N. Deformable medical image registration: setting the state of the art with discrete methods. Annu Rev Biomed Eng. 2011;13:219–244. doi:10.1146/annurev-bioeng-071910-124649.
- 30. Komodakis N, Tziritas G, Paragios N. Fast, approximately optimal solutions for single and dynamic MRFs. Proc IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR): 1–8, 2007.