Improving Retrieval Performance in Medical Image Databases Using Simulated Annealing

Jing Ginger Han; Chi-Ren Shyu

. 2010 Nov 13;2010:276–280.

PMCID: PMC3041439 PMID: 21346984

Abstract

In the area of content-based image retrieval, one of the prerequisites of successful retrieval is the extraction of an ample number of distinguishing features that sufficiently describe the important characteristics represented in the image content. Parameters underlying image segmentation and feature extraction need to be set appropriately in order to have this success in retrieval. We present here a parameter tuning method using simulated annealing to dynamically adjust values of important parameters used in customized image processing algorithms for the purpose of improving the performance of retrieval for high resolution CT lung images in computer-aided diagnosis. The most notable improvement using F_β measure among five modules is from 0.56 to 0.81, which is a 44.64% increase (p=0.022). This method provides a way to improve retrieval performance in a large variety of applications in medical imaging informatics.

Introduction

Medical images play a critical role in many areas of clinical practice, such as radiology, neurology, pathology, endoscopy, cardiology, and dermatology. Clinical professionals refer to medical images to examine diseases and make diagnoses according to the visual patterns observed in the images, in conjunction with other medical data and observations. As a result, a huge amount of medical imagery from various modalities is generated daily and needs to be reviewed, compared for diagnoses, and then archived in systems such as a picture archiving and communication system (PACS) for future reference and for medical training purposes. According to Brenner and Hall¹, an estimated 62 million scans were collected in 2006 in the United States alone, compared to about 3 million scans in 1980. Thus, it is hardly feasible for radiologists to go through all the scans and dig out similar previous cases. Therefore, turning to modern technologies, such as computer vision, database management, and information retrieval, promises to ease this burden and may even improve diagnoses by backing up findings with previously diagnosed similar cases.

Content-based image retrieval² (CBIR) is a technique for retrieving images based on image content using extracted features, such as color, texture, and shape. When textual descriptions and annotations are limited or not available³, and text-based retrieval still needs more understanding and development⁴, analyzing the content of images becomes more powerful. Moreover, patterns that are not easy for humans to pick up can be detected automatically by computer vision techniques, and this information can contribute to better accuracy in capturing critical patterns for diagnoses. Unlike images in general-purpose CBIR systems, medical images usually do not have rich color information, and users primarily pay attention to pathology-bearing regions, which can be difficult to extract because they are often hard to distinguish from the surrounding context⁵. In our experience, building a successful CBIR system for medical imagery requires that the following assumptions hold: (1) medically meaningful objects of interest can be identified; (2) the selected features are indeed sufficient to characterize the various appearances in the image set; and (3) feature values are properly extracted and accurately represent the true patterns residing in the images. These sound fairly basic but are difficult to guarantee in practice. Much of the difficulty lies in adjusting and adapting the many parameters used in the object segmentation and feature extraction algorithms. This problem is common to most applications of CBIR and is therefore well worth studying. The approach presented in this paper tunes parameters automatically according to the provided medical images instead of relying on the developer’s empirical settings. We first identify the important parameters used in the algorithms as well as the range of possible values for each parameter. Then simulated annealing (SA) is applied to search for parameter configurations that result in better retrieval performance. Several experiments are performed on 303 high resolution CT (HRCT) images of lungs. In the following sections, we introduce our method and demonstrate how simulated annealing can be used to dynamically adjust parameters in medical image analysis algorithms.

Method

Domain experts, such as physicians and experienced users, always look for some distinct visual patterns appearing in the images that represent certain disease characteristics. These visual patterns can be generally put into groups and we refer to them as perceptual categories⁴ (PC). In this study, there are five perceptual categories represented in the image collection: emphysema (EMP), cysts (CYS), ground-glass opacities (GGO), honeycombing (HON) and bronchial structures (BRO). Because it is not uncommon to observe different visual patterns, belonging to different PCs, in the left and right lungs (Figure 1), the two lungs are analyzed separately.

Figure 1. — A HRCT lung image with a PC of ground-glass opacity (GGO) in the right lung (the left side) and an emphysema (EMP) perceptual category in the left lung (the right side).

1. Modularized PC Recognizers

After segmenting the right and left lungs from the background of the image, we can analyze each lung independently for various patterns using customized algorithms called modularized PC recognizers, or modules for short. Each module applies its own image processing algorithms and filtering criteria, such as gray scale thresholding, connected components, topological characteristics, and spatial relationship information, to filter out artifacts. The objects of interest that remain at the end of a module represent that PC’s patterns and are used to calculate low-level features.

Each step in each module uses thresholds, decision criteria, or other logic to decide whether segmented objects are useful or merely artifacts. These steps work together to produce the final image segmentation result, the quality of which can drastically affect feature accuracy. Assigning parameter values usually relies heavily on domain experts’ knowledge as well as the experience researchers accumulate while developing the system. Selecting a configuration of the various parameters can be considered a combinatorial problem⁷. Because a change in the value of one parameter will likely affect the results of other steps in the algorithm, it is almost impossible to configure the entire array of parameters manually. Instead, we designed an automatic and statistically supported method that attempts to reliably optimize the parameter configuration to facilitate high-accuracy segmentation and feature extraction. Figure 2 compares each module’s resulting images under the ideal best parameter configurations, default configurations based on empirical experience, and computationally optimized configurations found by simulated annealing.

Figure 2. — Modularized PC Recognizers and extracted results from various parameter settings.

As an example, the cystic structure (CYS) module contains seven parameters that need to be computationally optimized. Ideally, an image with cystic structures that is analyzed by the CYS module should be segmented into as many cysts as possible, while an image with only emphysema that passes through the CYS module should yield no segmented objects. If the parameters in the CYS module are not set properly, for example when empirically configured default values are used, some low attenuation regions are likely to be extracted incorrectly, and the features calculated from them may cause the image to be considered as CYS. The hope is that tuning the parameters greatly reduces this deficiency.

2. Automatic Parameter Tuning

Simulated annealing⁶ is a global search method for optimization problems. It is useful for combinatorial problems with finite but relatively large feasible sets of parameter values⁸. Random trials of possible values are made, and the performance of each trial is evaluated using a customized cost function. A probability of accepting a “worse move” gives the search a chance of “climbing out of local minima” while exploring the feasible set. In our study, we extend the method of simulated annealing in two ways. First, to keep the computing time reasonable, the parameters in each module are optimized one at a time, in an order based on (1) their positions in the module and (2) their perceived influence on the final results. The configuration is updated after each parameter’s SA tuning procedure finishes, and empirically derived values are used for parameters that have not yet been optimized. Second, after one parameter is optimized, the temperature is “reheated” to the initial temperature T₀. This gives the next parameter’s tuning step an equal chance to move around and escape the local minima obtained from previous parameters’ SA results.

2.1. Overall architecture of the system

The system operates as follows (Figure 3):

Initial Settings: Specify working module, order of parameters to be tuned, initial temperature (T₀) and stopping criteria which include maximum iteration number, frozen temperature (T_f), control value (c) of exponential cooling schedule¹¹ and target minimum cost value¹².
$T_{0} = \max \{\Delta \mathrm{Cost}\}, \quad T_{k} = c \cdot T_{k-1}, \quad T_{f} = \min \{\Delta \mathrm{Cost}\}$
Select Parameter: In the module’s configuration parameter list, pick one parameter according to predefined order.
Simulated Annealing: Use simulated annealing to find the optimal value of the picked parameter.
Update Parameter: Set the value of this parameter to the optimal value found in the SA step and proceed to tuning the next parameter.
Final Configuration: All the tuned parameters together form the final configuration for the working module. Once these are all determined, proceed to the next module.
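
The outer loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `tune_module`, `Tuner`, and `fake_tuner` are hypothetical, and the per-parameter SA procedure is passed in as a callable so that only the one-at-a-time update and the reheating to T₀ are shown.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
# Hypothetical signature: (current configuration, parameter name, initial temperature) -> tuned value
Tuner = Callable[[Config, str, float], float]


def tune_module(defaults: Config, order: List[str], tune_one: Tuner, t0: float = 1.0) -> Config:
    """Tune the parameters of one module one at a time, in a predefined order."""
    config = dict(defaults)  # parameters not yet tuned keep their empirical defaults
    for name in order:
        # Each parameter's SA run starts again at t0 ("reheating") and sees
        # the values already tuned for earlier parameters.
        config[name] = tune_one(config, name, t0)
    return config


# Stand-in for the SA step, used only to make the control flow visible.
calls: List[Tuple[str, float, Config]] = []


def fake_tuner(config: Config, name: str, temperature: float) -> float:
    calls.append((name, temperature, dict(config)))
    return config[name] + 1.0


result = tune_module({"a": 0.0, "b": 10.0, "c": 5.0}, ["b", "a"], fake_tuner)
```

In this toy run, parameter `c` is never tuned and keeps its default, while the call for `a` already sees the updated value of `b`.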

2.2. Simulated Annealing

The procedure for performing simulated annealing, shown in the big dashed box on the right side of Figure 3, for one parameter in a module includes the following steps:

Random Walk: Adjust the value of the parameter by one small step. Step size and valid walking range are predetermined for each parameter using empirical knowledge. Use this new value keeping other parameters fixed to form a new configuration.

$$
\begin{array}{l}
\mathrm{value}_{i}^{j} = \mathrm{value}_{i}^{b} + \mathrm{step}_{i}^{j}, \quad \mathrm{step}_{i}^{j} = 2 \cdot \mathrm{step}_{i}^{0} \cdot r - \mathrm{step}_{i}^{0} \\
\mathrm{value}_{i}^{j} \in [\min \mathrm{value}_{i}, \max \mathrm{value}_{i}] \\
\mathrm{step}_{i}^{j} \in [-\mathrm{step}_{i}^{0}, \mathrm{step}_{i}^{0}]
\end{array}
$$

where

$$
\begin{array}{l}
i = 1, 2, \dots, m; \quad j = 1, 2, \dots, \text{max iteration} \\
\mathrm{value}_{i}^{j} : j^{\text{th}} \text{ trial for parameter } i \text{ in a module} \\
\mathrm{value}_{i}^{0} : \text{initial value for parameter } i \\
\mathrm{value}_{i}^{b} : \text{current best value of parameter } i \\
\mathrm{step}_{i}^{j} : \text{step length in the } j^{\text{th}} \text{ trial for parameter } i \\
\mathrm{step}_{i}^{0} : \text{maximum step length for parameter } i \\
r : \text{uniform random number between 0 and 1}
\end{array}
$$
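
As a small illustration of the random walk, the sketch below draws one trial value from the current best value. The function name `propose_value` is hypothetical, and clamping an out-of-range trial to the nearest bound is an assumption; the text only states that trial values must stay within the valid walking range.

```python
import random


def propose_value(best: float, max_step: float, lo: float, hi: float,
                  rng: random.Random) -> float:
    """One random-walk trial: value_j = value_b + step_j, kept inside [lo, hi]."""
    r = rng.random()                      # uniform random number in [0, 1)
    step = 2.0 * max_step * r - max_step  # step drawn from [-max_step, max_step)
    # Assumption: trials that leave the valid range are clamped to the nearest bound.
    return min(hi, max(lo, best + step))


rng = random.Random(0)
trials = [propose_value(best=0.5, max_step=0.1, lo=0.0, hi=0.55, rng=rng)
          for _ in range(1000)]
```

Every trial stays within one maximum step of the current best value and never exceeds the upper bound of 0.55.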

Feature Extraction: Use this new configuration to segment out objects of interest and extract low-level features for all images.

Retrieval: At each annealing step, randomly select a set of images labeled with the current module’s PC, use them as queries against the remaining images, and retrieve the top ranked results. This random selection of query images helps to limit the effect of over-fitting. In our study, we query 10 times and keep the top 30 results for each query.

Cost Evaluation: Cost functions are derived using two methods, average precision at seen relevant documents⁹ and the F_β measure¹⁰. Images with the same label as the query image are considered relevant results. The cost averaged over all random query evaluations is used as the annealing step’s cost value.

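$$
\begin{array}{l}
\mathrm{Cost} = 1 - \mathrm{Ave.prec.} \quad \text{or} \quad \mathrm{Cost} = 1 - F_{\beta} \\
\mathrm{Ave.prec.} = \frac{1}{k} \sum_{r = 1}^{k} p(r) \\
k : \text{number of relevant result documents} \\
p(r) : \text{precision at the } r^{\text{th}} \text{ relevant document} \\
F_{\beta} = \frac{(1 + \beta^{2}) \cdot \mathrm{precision} \cdot \mathrm{recall}}{\beta^{2} \cdot \mathrm{precision} + \mathrm{recall}} \\
\mathrm{precision} = \text{number of relevant results} / \text{total number of results} \\
\mathrm{recall} = \text{number of relevant results} / \text{total number of relevant documents}
\end{array}
$$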
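
The two effectiveness measures can be computed from a ranked result list as in the sketch below. It is only an illustration: the function names and the representation of a result list as a sequence of relevance flags are assumptions, and the example query (four results, six relevant images in the collection) is made up.

```python
from typing import Sequence


def average_precision(relevant_flags: Sequence[bool]) -> float:
    """Mean of the precision values measured at each relevant result in rank order."""
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(relevant_flags, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)  # p(r) at the r-th relevant document
    return sum(precisions) / len(precisions) if precisions else 0.0


def f_beta(relevant_flags: Sequence[bool], total_relevant: int, beta: float = 0.33) -> float:
    """F_beta of a result list; beta < 1 weights precision more than recall."""
    hits = sum(relevant_flags)
    if hits == 0:
        return 0.0
    precision = hits / len(relevant_flags)
    recall = hits / total_relevant
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Made-up query: results 1, 2 and 4 are relevant; 6 relevant images exist in total.
flags = [True, True, False, True]
ap_cost = 1.0 - average_precision(flags)
fb_cost = 1.0 - f_beta(flags, total_relevant=6)
```

For this list, precision is 0.75 and recall is 0.5. With β = 0.33 the F_β value lands close to the precision, whereas β = 1 would give the plain harmonic mean of 0.6.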

SA Decision: Make a decision on whether to accept the new move. If the cost of the new configuration is less than that of the current best configuration, set this configuration as the new current best. If not, calculate the probability of accepting a “worse step” and accept it if this probability is greater than a uniform random number; otherwise retain the current best configuration.
Cool Down: Reduce the probability of accepting a worse step by cooling down the system temperature. The SA step stops when one of the following three stopping criteria is met: the number of steps reaches the maximum, the temperature drops to the frozen state, or the cost reaches the predetermined low value.
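
The decision and cooling steps for a single parameter can be sketched as below. Several points are assumptions rather than details given in the text: the acceptance probability uses the standard Metropolis form exp(-ΔCost / T), the temperature is multiplied by c after every trial, out-of-range trials are clamped, and `toy_cost` stands in for the full segmentation, feature extraction, and retrieval pipeline. The name `anneal_parameter` is hypothetical.

```python
import math
import random
from typing import Callable, Tuple


def anneal_parameter(cost_of: Callable[[float], float], start: float, max_step: float,
                     lo: float, hi: float, t0: float = 1.0, t_frozen: float = 0.001,
                     c: float = 0.98, max_iter: int = 300, target_cost: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Simulated annealing over one parameter; returns (value, cost) of the kept state."""
    rng = random.Random(seed)
    current, current_cost = start, cost_of(start)
    temperature = t0
    for _ in range(max_iter):                       # stop 1: maximum number of steps
        if temperature <= t_frozen:                 # stop 2: frozen temperature
            break
        if current_cost <= target_cost:             # stop 3: target low cost reached
            break
        step = 2.0 * max_step * rng.random() - max_step
        trial = min(hi, max(lo, current + step))    # assumption: clamp to valid range
        trial_cost = cost_of(trial)
        delta = trial_cost - current_cost
        # Assumption: Metropolis criterion for accepting a worse move.
        if delta < 0 or math.exp(-delta / temperature) > rng.random():
            current, current_cost = trial, trial_cost
        temperature *= c                            # cool down
    return current, current_cost


def toy_cost(x: float) -> float:
    """Hypothetical smooth cost with its minimum at x = 0.7."""
    return (x - 0.7) ** 2


value, cost = anneal_parameter(toy_cost, start=0.1, max_step=0.05, lo=0.0, hi=1.0)
```

Because worse moves replace the kept state when accepted, an implementation that must report the best configuration ever visited would need to track it separately.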

To perform annealing in a controlled manner, a cooling schedule and other stopping criteria are set. Raittinen & Kaski suggest that the initial and frozen temperatures can be the maximum and minimum cost differences, respectively¹². Since the cost function in our study ranges over [0,1], the initial temperature is set to 1 and the frozen temperature to 0.001. The control value of the exponential cooling schedule is 0.98, and the target low cost is 0.05, which is equivalent to a target retrieval precision of 95% for the method of average precision at seen relevant documents, or a target F_β score of 0.95 for the F_β measure. The maximum number of iterations for each SA step is set at 300 to keep the computing time reasonable. Figure 4 shows an example of optimizing the CYS module’s third parameter. The F measure with β = 0.33 increases gradually as the temperature drops and becomes stable around 85% after 300 trials. We use β = 0.33 to weight precision three times as heavily as recall, because users are expected to care more about the top ranked results containing as many relevant results as possible than about browsing all possible results.
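
The interaction between these settings can be checked with a short calculation. The sketch assumes that the temperature is multiplied by the control value once per trial, so that after k trials it equals c^k · T₀, and that "frozen" means the temperature is at or below T_f; neither detail is spelled out in the text.

```python
import math

t0, t_frozen, c, max_iter = 1.0, 0.001, 0.98, 300

# Smallest k with c**k * t0 <= t_frozen under the assumed schedule.
steps_to_freeze = math.ceil(math.log(t_frozen / t0) / math.log(c))

# Temperature reached when the iteration cap stops the run.
temperature_at_cap = t0 * c ** max_iter
```

Under these assumptions the frozen temperature would only be reached after 342 trials, so the cap of 300 iterations ends a run first, at a temperature of roughly 0.0023.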

Figure 4. — Precision of each step’s saved best configuration for parameter #3 in CYS module over 300 trials.

Results

1. Data Set

We selected 303 HRCT lung images, in which a total of 394 left or right lungs were labeled individually, by consensus of two radiologists, with one of the five perceptual categories according to their visual patterns. All images are gray scale with dimensions of 512 x 512 pixels.

2. Experiment Results

In our study, each module has its own parameters that control performance quality, and the scenarios for these modules are different due to varying characteristics of each perceptual category’s visual patterns. Therefore, we apply a “divide-and-conquer” approach to optimize parameter configurations for each module. Details regarding the distribution of perceptual categories, parameters and low-level features extracted from images are listed in Table 1.

Table 1.

Distribution of images, parameters and features.

Module	No. of lungs	No. of parameters	No. of features
BRO	62	5	7
CYS	52	7	8
EMP	121	6	6
GGO	106	6	6
HON	53	5	5
Total	394	29	69*

* There are 47 global features, including statistics of overall gray scale (12) and textural measurements (35).

Using simulated annealing to find the computationally optimal configuration improves retrieval effectiveness for all modules under both cost evaluation methods (Figure 6). Using average precision at seen relevant documents, the average retrieval precision improves from 79.49% to 93.73%, a relative increase of 17.91%. Using the F_β measure, it improves from 53.52% to 68.85%, a relative increase of 28.64%. One-tailed Student’s t-tests are used to determine the statistical significance of the improvement. The null hypothesis is “mean effectiveness (precision or F score) is not different after SA optimization” and the alternative is “SA optimization improves the mean effectiveness”. Retrieval performance before applying our approach is unstable and shows a wide variance. After SA, the performance has much better mean values with small variances (Table 2). The p-values using the two methods are 0.55±0.001 (precision only) and 0.058±0.001 (F_β).

Figure 6. — Effectiveness increase before and after optimization using average precision at seen relevant documents (first figure) and F_β measure (second figure).

Table 2.

Comparison of means and variances of precision/F_β measure before and after SA for all five modules.

Measure	Stage	BRO	CYS	EMP	GGO	HON
Precision	Before	0.81±0.069	0.82±0.099	0.81±0.037	0.75±0.069	0.79±0.049
Precision	After	0.92±0.002	0.98±0.001	0.93±0.005	0.95±0.005	0.92±0.002
F_β	Before	0.53±0.077	0.56±0.114	0.50±0.029	0.63±0.025	0.45±0.076
F_β	After	0.70±0.054	0.81±0.010	0.62±0.023	0.73±0.006	0.58±0.005
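
The relative increases quoted in the text follow directly from the table. As a worked check, the sketch below recomputes them per module from the rounded F_β means above; the variable and function names are illustrative only, and because the inputs are rounded to two decimals the per-module figures are approximate.

```python
f_beta_before = {"BRO": 0.53, "CYS": 0.56, "EMP": 0.50, "GGO": 0.63, "HON": 0.45}
f_beta_after = {"BRO": 0.70, "CYS": 0.81, "EMP": 0.62, "GGO": 0.73, "HON": 0.58}


def relative_increase(before: float, after: float) -> float:
    """Relative increase in percent."""
    return 100.0 * (after - before) / before


increase = {module: relative_increase(f_beta_before[module], f_beta_after[module])
            for module in f_beta_before}
best_module = max(increase, key=lambda module: increase[module])
```

The largest gain is in the CYS module, where 0.56 to 0.81 corresponds to the 44.64% increase reported in the abstract.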

Figure 7 depicts a retrieval result using a Google-like query-by-image-example method. The query image (top) is the left lung of an HRCT lung image that has bronchial structure patterns, which appear as dark lumens surrounded by thick walls. Using the features extracted from the query image, the most similar images are retrieved and ranked by overall similarity, ordered from top-left to bottom-right. These images come from different patients and vary somewhat in appearance, but all share a similar visual pattern, BRO.

Discussion

When a group of computer vision algorithms is used to identify objects of interest in an image, the parameters in these algorithms need to be treated carefully. Empirically derived values are normally determined from a limited data sample and are known to become insufficient as the image database grows. The method of simulated annealing is adopted in our study with variations and helps to improve the retrieval precision for all modules. This approach can be applied to other image and information retrieval problems that are combinatorial in nature and lack a direct, analytic cost measurement. An accurate retrieval system for medical databases that archive medical images, clinical reports, laboratory data, and other related information can help health care professionals and medical students search for similar medical cases and assist them in diagnoses. For example, pulling out the top ranked CT scans with the most similar disease patterns gives radiologists a means for differential diagnosis. Moreover, combining images with other clinical information and observations can provide necessary information for more accurate and efficient diagnoses.

Acknowledgments

The authors would like to thank Dr. Yash Sethi and Dr. Rajdeep Singh for diagnosis labeling of images, and Jason M. Green and Nan Alan Zhao for in-depth discussions on computational approaches.

References

1. Brenner DJ, Hall EJ. Computed tomography - an increasing source of radiation exposure. N Engl J Med. 2007;357(22):2277–2284. doi: 10.1056/NEJMra072149.
2. Muller H, Michoux N, Bandon D, Geissbuhler A. A review of content-based image retrieval systems in medical applications – clinical benefits and future directions. Intl. J. Med. Informatics. 2004;73(1):1–23. doi: 10.1016/j.ijmedinf.2003.11.024.
3. Lew MS, Sebe N, Djeraba C, Jain R. Content-based multimedia information retrieval: state of the art and challenges. ACM TOMCCAP. 2006;2(1):1–19.
4. Hersh W, Muller H, Kalpathy-Cramer J. The ImageCLEFmed medical image retrieval task test collection. J. Digital Imaging. 2009;22(6):648–655. doi: 10.1007/s10278-008-9154-8.
5. Shyu CR, Pavlopoulou C, Kak AC, Brodley CE, Broderick LS. Using human perceptual categories for content-based retrieval from a medical image database. Computer Vision and Image Understanding. 2002;88:119–151.
6. Kirkpatrick S, Gelatt CD Jr, Vecchi MP. Optimization by simulated annealing. Science. 1983;220(4598):671–680. doi: 10.1126/science.220.4598.671.
7. Lawler EL. Combinatorial Optimization. Holt, Rinehart & Winston; New York: 1976.
8. Chong EKP, Zak SH. An introduction to optimization. 3rd ed. John Wiley & Sons, Inc; Hoboken, New Jersey: 2008.
9. Baeza-Yates R, Ribeiro-Neto B. Modern information retrieval. New York: ACM Press; Harlow, England: Addison-Wesley; 1999.
10. Van Rijsbergen CJ. Information Retrieval. 2nd ed. Butterworth-Heinemann; London: 1979.
11. van Laarhoven PJM, Aarts EHL. Simulated annealing: theory and applications. D Reidel Publishing Company; 1987.
12. Raittinen H, Kaski K. Image deconvolution with simulated annealing method. Physica Scripta. 1990;33:126–130.
