Abstract
The capability to search biomedical articles based on the visual properties of article images may significantly augment information retrieval in the future. In this paper, we present a new method to classify the window setting types of brain CT images. Windowing is a technique frequently used in the evaluation of CT scans; it enhances contrast for the particular tissue or abnormality type being evaluated. In particular, it provides radiologists with an enhanced view of certain types of cranial abnormalities, such as skull lesions and bone dysplasia, which are usually examined using the “bone window” setting and illustrated in biomedical articles using “bone window images”. Because images vary widely among articles, it is important that the proposed method be robust to this variation. Our algorithm attained 90% accuracy in classifying images as bone window or non-bone window on a 210-image data set.
1. Introduction
The exponential growth in the number of images produced in clinical practice has led to an increasing prevalence of biomedical images in scientific publications to illustrate both interesting and typical cases. Retrieving images in biomedical publications by exploiting image visual content to augment text-based searching has been a topic of increasing research interest, as demonstrated by the ImageCLEF medical retrieval track1. We have participated in several ImageCLEF evaluations and have developed effective approaches for classifying medical images by their imaging modality (e.g., X-ray, CT, MR) and by anatomical category. For this we have used methods that combine text describing the figure in the article with visual information in the image2,3. In this paper, we focus on classification of brain CT images with the end goal of obtaining better overall performance for biomedical article retrieval. This work is also a step toward our goal of building a visual ontology for medical image annotation and retrieval.
Head CT plays a critical role in the evaluation of intracranial abnormalities (such as trauma, stroke, and hemorrhage)4. Compared to MR imaging, CT has wider availability, lower cost, and higher sensitivity for the detection of skull fracture, calcification, and acute hemorrhage. CT images are generated using X-ray beams. The degree to which X-rays are attenuated by tissue at each location in the body is mapped to Hounsfield units (HU). The denser the tissue, the more the X-rays are attenuated and the higher the HU value. Water is defined as 0 HU, air as −1000 HU, and bone ranges from several hundred to several thousand HU.
To display CT scans, windowing is used to transform HU values into gray-scale values in [0, 255]. This allows different tissue features to be seen and enables viewers to focus on a particular tissue of interest by maximizing subtle differences among tissues. Windowing is controlled by two parameters: window level (WL) and window width (WW). As illustrated in Figure 1, only tissues with HU values within the specified window ([WL − WW/2, WL + WW/2]) are mapped onto the full gray-scale range; tissues with HU values above the window (> WL + WW/2) or below it (< WL − WW/2) are rendered all white or all black, respectively. For head CT, the bone window and the brain window are the two most important settings. The bone window is useful for visualizing details of bone structures and identifying subtle skull lesions. However, the details of soft tissues such as the brain, whose density is lower than that of bone, are lost in the bone window setting. The brain window is the most frequently used setting, and the majority of evaluations of brain abnormalities are done with it. The brain window can show differences among types of soft tissue, such as brain, blood, vasculature, air-filled structures, and fluid-containing spaces. Although large bone fractures can be seen in the brain window setting, fine details of bony or calcified structures are obscured and cannot be differentiated as well as in the bone window setting. Figure 2 gives an example of CT images shown in (a) brain window and (b) bone window, respectively. Other, less commonly used windows include the blood window and the subdural window, which are optimized for depicting blood and subdural fluid, respectively. Like the brain window, the blood and subdural windows also provide detailed information on soft tissue. In this paper, we focus on automatically classifying the window setting of brain CT images obtained from biomedical articles into two categories, bone window vs. non-bone window, to distinguish pathologies associated with bone (such as skull abnormalities) from pathologies exhibited by soft tissue.
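The linear mapping described above can be sketched in a few lines. The window values used below (WL = 40, WW = 80 for brain; WL = 480, WW = 2500 for bone) are typical illustrative settings, not values taken from the paper:

```python
import numpy as np

def apply_window(hu, wl, ww):
    """Map Hounsfield units to 8-bit gray levels for a given window.

    HU values below WL - WW/2 become 0 (black), values above WL + WW/2
    become 255 (white), and values inside the window are scaled linearly.
    """
    lo, hi = wl - ww / 2.0, wl + ww / 2.0
    gray = (np.asarray(hu, dtype=float) - lo) / (hi - lo) * 255.0
    return np.clip(gray, 0, 255).astype(np.uint8)

hu = np.array([-1000, 0, 40, 80, 1000])    # air, water, brain, brain, bone
print(apply_window(hu, wl=40, ww=80))      # brain window: soft tissue spans the range
print(apply_window(hu, wl=480, ww=2500))   # bone window: soft tissue compressed
```

In the brain window the soft tissue values (40 and 80 HU) map to well-separated gray levels, while in the bone window they collapse into nearly identical values, which is exactly the homogeneity the classifier later exploits.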
Figure 1.
CT windowing
Figure 2.
Head CT scan windowing (from http://radiographics.rsna.org/content/20/2/449.figures-only)
Figure 3 shows sample images from biomedical journals. Compared to CT scan data obtained from hospitals, the major challenge in processing images obtained from biomedical publications (mostly online radiology journals) is their large variation with respect to image size, intensity and illumination, viewing direction (axial, sagittal, and coronal), anatomical position (such as the nasal cavity level or encephalic level), type of pathology, brain coverage (the whole brain or only part of it), and graphical annotations, such as arrows, which are sometimes present. These variations are illustrated in Figure 3. The method that we have developed attempts to deal with this challenge by exploiting the distinctive image characteristics of the soft tissues in the different windows.
Figure 3.
Examples of brain CT images from biomedical journals
The rest of the paper is organized as follows. Section 2 presents the detailed description of the proposed algorithm. Section 3 describes and discusses the experimental results. Section 4 concludes the paper with remarks on future work.
2. Method
As stated in Section 1, our goal is to classify a head CT figure in a biomedical article as a bone window or non-bone window image. However, multi-panel figures containing several head CTs with different window settings are common in the literature. Several examples are shown in Figure 4. In this situation, the figure must be split first. We are currently developing algorithms to split multi-panel figures of all types (not just head CT figures). We also coarsely classify the extracted panels into two general types: regular images vs. illustrations5. Regular images denote what are conventionally called “medical images”, such as CT, while illustrations include sketches, flowcharts, graphs, and other line diagrams. A major challenge in panel splitting is the high variability and complexity of the multi-panel figures encountered in the medical literature. For example, the color of the figure background, the layout and size of individual panels, and the image resolution vary significantly across figures. In some cases, there is no clear panel boundary, or the panel boundaries are only a few pixels wide. In addition, labels, text descriptions, and visual markers such as arrows near panel boundaries can interfere not only with the panel segmentation procedure but also with the general problem of classifying brain CT images. In this paper, we assume the figure is a single panel, i.e., it contains one head CT.
Figure 4.
Multi-panel figures
We developed our algorithm based on the observation that soft tissue appears very differently in bone and non-bone window images. A bone window results from an HU-to-gray-scale mapping optimized for dense bone tissue, but in this mapping the details of the less dense soft tissue are lost. As shown in Figure 2, the soft tissue area in the bone window has relatively homogeneous intensity compared to the brain window, in which the intensity and texture details of soft tissue are much more apparent. Our method has three main steps: 1) segmentation of the soft tissue region of interest; 2) region feature extraction; and 3) support vector machine classification of the region as bone or non-bone window.
• Segmentation of soft tissues
The soft tissue segmentation step is crucial: the accuracy of the final classification depends on the quality of the segmentation. This is also the step in which we deal with the large variation in the input data (described in Section 1). This step contains three sub-steps: pre-processing, image clustering, and post-processing. We describe these sub-steps below.
- Pre-processing
Since a large variety of intensity ranges exists within the image dataset, we normalize and enhance the contrast of the images to bring them to a common dynamic range and contrast setting, as much as possible, before making comparisons among the images. Figure 5 shows two example original images as well as corresponding preprocessed images.
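A minimal sketch of this sub-step, assuming min-max normalization to a common [0, 255] range followed by global histogram equalization (the paper does not name the exact operators used, so these particular choices are assumptions):

```python
import numpy as np

def preprocess(img):
    """Bring an image to a common dynamic range and enhance its contrast.

    Min-max normalization followed by global histogram equalization; the
    specific operators are assumed, not taken from the paper.
    """
    img = img.astype(float)
    rng = img.max() - img.min()
    if rng == 0:                       # flat image: nothing to normalize
        return img.astype(np.uint8)
    img = ((img - img.min()) / rng * 255.0).astype(np.uint8)
    # Histogram equalization: map each gray level through the normalized
    # cumulative histogram so intensities spread over the full range.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min()) * 255.0
    return cdf[img].astype(np.uint8)
```

After this step, images from different journals occupy comparable intensity ranges, which makes the fixed four-cluster segmentation in the next sub-step more reliable.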
- Image clustering
The aim of this sub-step is to segment the image into three types of regions: air- or fluid-filled spaces, high-density regions, and soft tissue regions. The air- or fluid-filled spaces can exist outside the brain or within it (nasal cavity or ventricles) and have low intensity (dark black). The high-density regions (such as bones or calcified structures) have high intensity (bright white). The soft tissue regions (the regions of interest) have intensities in between. To cluster the image pixels, we tested two methods: k-means clustering and multilevel thresholding. We chose to evaluate the k-means clustering algorithm6 because of its simplicity and effectiveness for clustering data with Gaussian distributions. We implemented the multilevel thresholding method as described in a standard image-processing text7. For both methods, the number of clusters must be pre-specified. As shown in Figure 2, the intensity (as well as shape, size, and location) of soft tissue varies considerably among non-bone window images. In general, however, the soft tissue in a non-bone window has two broad shades of gray (dark gray vs. light gray) corresponding to two clusters, while the intensity of soft tissue in a bone window can be represented by one cluster. Therefore, the number of clusters is set to four, to account for air- or fluid-filled spaces, high-density regions, and the two types of soft tissue regions. After clustering, the mean intensity of each cluster is calculated and ranked. The cluster with the lowest mean intensity is classified as the air- or fluid-filled region, and the cluster with the highest mean intensity as the high-density region. The remaining two clusters (with intermediate mean intensities) are combined and considered candidate soft tissue regions. In our approach, precise segmentation of soft tissue is not required; however, the extracted soft tissue regions should lie within the true boundary of the soft tissue and cover as much of the true soft tissue as possible.
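The clustering-and-ranking logic can be sketched with a plain one-dimensional k-means on pixel intensities (a simplified stand-in for the authors' implementation; the deterministic, evenly spaced initialization is an assumption):

```python
import numpy as np

def cluster_soft_tissue(img, k=4, iters=20):
    """Cluster pixel intensities into k groups; return a mask of the two
    intermediate clusters (candidate soft tissue) and the cluster means."""
    pix = img.ravel().astype(float)
    # Deterministic initialization: centers spread over the intensity range.
    centers = np.linspace(pix.min(), pix.max(), k)
    for _ in range(iters):
        # Assign each pixel to its nearest center, then update the centers.
        labels = np.abs(pix[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pix[labels == j].mean()
    # Rank clusters by mean intensity: lowest = air/fluid, highest =
    # high density (bone); the two middle clusters are kept as soft tissue.
    order = np.argsort(centers)
    soft = np.isin(labels, order[1:3]).reshape(img.shape)
    return soft, centers
```

Because only the rank order of the cluster means matters, the method does not depend on the absolute intensity values, which helps with the intensity variation across journal images.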
- Post-processing
The final binary mask of the soft tissue region of interest is generated by (a) relabeling small low-intensity regions as soft tissue and (b) relabeling small high-intensity regions as soft tissue.
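A sketch of this relabeling, using 4-connected components over the binary soft tissue mask; both small low- and high-intensity regions reduce to small non-soft components once the mask is binary, and the component-size threshold `max_size` is a hypothetical parameter (the paper does not state the value it uses):

```python
import numpy as np
from collections import deque

def fill_small_holes(mask, max_size):
    """Relabel small non-soft-tissue components as soft tissue.

    Breadth-first search over 4-connected background components; any
    component no larger than max_size pixels is absorbed into the mask.
    """
    out = mask.copy()
    h, w = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    for y in range(h):
        for x in range(w):
            if mask[y, x] or seen[y, x]:
                continue
            comp, q = [(y, x)], deque([(y, x)])
            seen[y, x] = True
            while q:
                cy, cx = q.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        comp.append((ny, nx))
                        q.append((ny, nx))
            if len(comp) <= max_size:          # small region: absorb it
                for cy, cx in comp:
                    out[cy, cx] = True
    return out
```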
Figure 5.
Images before (left column) and after (right column) pre-processing
Figure 6 shows the segmentation results obtained using k-means clustering: Figure 6(a) shows bone window examples and Figure 6(b) shows non-bone window examples. In each subfigure, the first column is the original image. The second column is the four-level clustering result (the image is pseudo-colored with labels from 0 to 3, where 0 indicates the air- or fluid-filled space, 1 and 2 indicate the intermediate soft tissues, and 3 indicates the high-density area). The third column is the segmentation result after cluster combination and post-processing (the gray area is the soft tissue ROI).
Figure 6.
Segmentation results of soft tissues in different window settings. For (a) and (b), the first column is the original image; the second column is the 4-class clustering results; the third column is the final soft tissue/non-soft tissue ROI segmentation.
• Region feature extraction
After we extract the soft tissue regions, we calculate the gray-level histogram (a vector of length 256) of the preprocessed image, using only the pixels in the largest connected component among the segmented soft tissue regions. We apply a median filter to reduce the effect of noise before computing features. (Although more features could be used, we have found the classification accuracy to be satisfactory using the histogram feature alone.) The effectiveness of this single feature may be explained partly by the effectiveness of the segmentation method in extracting the soft tissue region, and partly by the observation that the intensity distribution of this region is a good discriminator between bone windows and other window types.
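The histogram feature can be sketched as follows; the median filter is assumed to have been applied to the image beforehand, and the mask is assumed to already be the largest connected soft tissue component. Normalizing the histogram to sum to one is also an assumption, since the paper specifies only a length-256 vector:

```python
import numpy as np

def histogram_feature(img, mask):
    """256-bin gray-level histogram over the masked soft tissue pixels."""
    vals = img[mask]                    # pixels inside the soft tissue ROI
    hist = np.bincount(vals, minlength=256).astype(float)
    if hist.sum() > 0:
        hist /= hist.sum()              # normalize (assumed, see lead-in)
    return hist
```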
• Classification
For every image, we identify the soft tissue ROI and then extract its histogram. To classify the image into one of the two categories (bone window vs. non-bone window), we build a binary classifier using a Support Vector Machine (SVM)8, a supervised classifier that has gained popularity because of its generally good performance. Given a subset of the data (the training data), the SVM finds an optimal hyperplane that separates the data into two classes in a high-dimensional space. Based on the decision boundary obtained in training, the SVM can then predict the category of a new, previously unseen test image. We describe the dataset used in our experiment, the parameters of the SVM model, and the classification performance obtained in the following results section.
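A sketch of this classifier using scikit-learn's `SVC`, which wraps the same LIBSVM C-SVC implementation cited in the paper. The histograms below are synthetic stand-ins: narrow intensity distributions mimic the homogeneous soft tissue of bone windows, wide ones mimic non-bone windows; the `C` and `gamma` values are placeholders for the cross-validated parameters:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def toy_hist(width):
    """Synthetic normalized 256-bin histogram concentrated around bin 128."""
    h = np.zeros(256)
    idx = rng.integers(128 - width, 128 + width, size=500)
    np.add.at(h, idx, 1)
    return h / h.sum()

# 20 "bone window" (narrow) and 20 "non-bone window" (wide) examples.
X = np.array([toy_hist(10) for _ in range(20)] +
              [toy_hist(80) for _ in range(20)])
y = np.array([1] * 20 + [0] * 20)              # 1 = bone window

clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # placeholders; tuned by CV in the paper
clf.fit(X, y)
print(clf.predict([toy_hist(10), toy_hist(80)]))
```

Training on the real histogram features proceeds the same way, with the kernel parameters chosen by cross-validation as described in Section 3.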
3. Experimental Results
• Dataset
The images that we used for testing were obtained from several well-known radiology journals, including American Journal of Neuroradiology, Radiology, Radiographics, and American Journal of Roentgenology. In addition to the images themselves, we obtained and recorded the text of the figure captions associated with the images. As a reference standard, we considered an image to be bone window if (and only if) its caption indicated that it was. Using this method, we collected 50 bone window brain CT images and 160 non-bone window brain CT images. Although the dataset is unbalanced, we consider it a good experimental dataset because the visual characteristics of non-bone window images are much more variable than those of bone window images, so a larger non-bone window subset is needed to represent those variations.
• Classification accuracy
In our experiment, we used half of the image dataset (25 bone window images and 80 non-bone window images) to train the SVM, and the other half as the test data. We used a non-linear SVM with a Gaussian RBF kernel and selected the kernel parameters using 5-fold cross-validation (the C-SVC implementation in LIBSVM9 was used). Table 1 lists the training and testing performance of the classifier, measured using sensitivity and specificity. We also performed a leave-one-out (LOO) cross-validation on the entire dataset; these results are also reported in Table 1, using accuracy as the performance measure. As shown in Table 1, the classification results for the k-means-based segmentation method (LOO accuracy of 90.9%; test sensitivity and specificity both 0.875) are better than those of the multilevel-thresholding-based segmentation method.
Table 1.
Classification results
| Methods | Training | Testing | LOO |
|---|---|---|---|
| K-means clustering | Sens.= 1, Spec.= 1 | Sens. = 0.875, Spec. = 0.875 | Accuracy = 90.9% |
| Multilevel threshold. | Sens. = 1, Spec. = 1 | Sens. = 0.79, Spec. = 0.85 | Accuracy = 87.5% |
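The sensitivity and specificity entries in Table 1 follow the usual definitions, with bone window as the positive class; a minimal sketch:

```python
import numpy as np

def sens_spec(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP); label 1 is
    the positive (bone window) class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predictions for illustration: 3 of 4 correct in each class.
sens, spec = sens_spec([1, 1, 1, 1, 0, 0, 0, 0],
                       [1, 1, 1, 0, 0, 0, 0, 1])
print(sens, spec)   # 0.75 0.75
```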
4. Conclusion
CT head scans are routinely acquired for the evaluation of intracranial abnormalities. Besides brain window images, which are most commonly used for examining brain defects, bone window images are frequently produced to identify skull fractures and other bone anomalies. These are two examples of images with different window settings that are often used in biomedical publications to illustrate different types of pathology, and which are referenced for medical research and educational purposes. This paper describes a new algorithm to classify CT head images into two classes of window settings. One merit of the proposed algorithm is its robustness to variability such as whether the entire brain or only part of it is shown and whether the imaging view is sagittal, axial, or coronal. Future work includes expanding the dataset to include multi-panel figures such as those shown in Figure 4.
Acknowledgments
This research was supported by the Intramural Research Program of the National Institutes of Health (NIH), National Library of Medicine (NLM), and Lister Hill National Center for Biomedical Communications (LHNCBC).
References
- 1. Müller H, Clough P, Deselaers T, Caputo B. Experimental Evaluation in Visual Information Retrieval. The Information Retrieval Series 32. Springer; 2010.
- 2. Rahman MM, Antani SK, Thoma GR. A learning-based similarity fusion and filtering approach for biomedical image retrieval using SVM classification and relevance feedback. IEEE Transactions on Information Technology in Biomedicine. 2011;15(4):640–646. doi: 10.1109/TITB.2011.2151258.
- 3. Rahman MM, Antani SK, Thoma GR. Local concept-based medical image retrieval with correlation-enhanced similarity matching based on global analysis. IEEE Computer Society Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA10), in conjunction with IEEE International Conference on Computer Vision and Pattern Recognition (CVPR); 2010. pp. 87–94.
- 4. Perron AD. How to Read a Head CT. Chapter 69 in: Adams JG, editor. Emergency Medicine. Saunders, an imprint of Elsevier Inc; 2008.
- 5. Cheng B, Antani S, Stanley RJ, Demner-Fushman D, Thoma GR. Automatic segmentation of subfigure image panels for multimodal biomedical document retrieval. Proceedings of SPIE Electronic Imaging Science and Technology, Document Retrieval and Recognition XVIII; San Francisco, CA; January 2011. p. 78740Z.
- 6. MacQueen JB. Some methods for classification and analysis of multivariate observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1; 1967. pp. 281–297.
- 7. Gonzalez R, Woods R. Digital Image Processing. 3rd edition. Prentice Hall; 2008.
- 8. Vapnik V, Golowich S, Smola A. Support vector method for function approximation, regression estimation, and signal processing. In: Advances in Neural Information Processing Systems, Vol. 9. Cambridge, MA: MIT Press; 1997. pp. 281–287.
- 9. Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011;2(3):27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.