AMIA Annual Symposium Proceedings. 2008;2008:747–751.

Integrating an automatic classification method into the medical image retrieval process

Epaphrodite Uwimana 1, Miguel E Ruiz 2
PMCID: PMC2655992  PMID: 18999165

Abstract

Combining low-level features that represent the content of medical images with the high-level textual features saved alongside those images would allow the expansion of text queries submitted to Content-Based Image Retrieval (CBIR) systems. Expanding these text queries would allow CBIR systems to respond more effectively to specific queries when retrieving medical images. We hypothesized that adding an automatic classification method to the current retrieval process would help improve the performance of the University at Buffalo Medical Text and Images Retrieval System (UBMedTIRS). This paper presents the results of our approach and its implications for expanding query statements in medical image information retrieval (IR) systems.

Introduction

In hospitals today, medical images are normally processed and saved digitally in Picture Archiving and Communication Systems (PACS) along with some text descriptions [1, 2]. Additional information saved with an image can include the physician's name, patient identification, age, etc. This information is used to retrieve medical images for reading and interpretation through the retrieval functions of current PACS systems, but text queries may ask for information that is not part of these text descriptors or labels, which negatively affects the results of queries submitted to retrieve images. If low-level image features such as color, shape, and texture, which represent the actual content of the image, could be combined with these text descriptors through a classification process, query statements could be expanded to improve retrieval in CBIR systems.

It is in this context that we developed a classification method for use with a current information retrieval system. Our classification process was performed using the Image Retrieval for Medical Application (IRMA) code [3]. This is a multi-faceted image classification code developed by a group of scientists at the Aachen University of Technology to promote high-level, standardized descriptions of medical images. The IRMA code has four main facets: 1) Image Modality, also known as the technical (T) facet; 2) Body Orientation, also known as the directional (D) facet; 3) Body Region Examined, also known as the anatomical (A) facet; and 4) Biological System, known as the biological (B) facet. The image modality facet represents the technique employed to acquire the image, such as x-ray, ultrasound, magnetic resonance measurement, nuclear medicine, optical imaging, biophysical procedures, secondary digitalization, figures, and "other". The body orientation facet indicates the position of the patient's body at the time the image was acquired and includes classes such as coronal, sagittal, axial, and "other" orientations. The anatomical facet includes classes such as whole body, cranium, spine, upper extremity, chest, breast, abdomen, pelvis, lower extremity, and tissues. The biological facet includes classes such as cerebrospinal, cardiovascular, respiratory, gastrointestinal, uropoietic (urinary), reproductive, musculoskeletal, endocrinic, immunic, dermal, exocrine, and dental [3].
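As a concrete illustration, the published IRMA code concatenates these four facets into a hyphen-separated string of the form TTTT-DDD-AAA-BBB [3]. The minimal sketch below splits such a code into its facets; the example code value is hypothetical.

```python
# A minimal sketch of splitting an IRMA code string into its four facets.
# The "TTTT-DDD-AAA-BBB" layout follows the published IRMA code format [3];
# the example code value below is illustrative only.
from typing import NamedTuple

class IRMACode(NamedTuple):
    technical: str    # T: image modality (e.g., x-ray, ultrasound)
    directional: str  # D: body orientation (e.g., coronal, sagittal)
    anatomical: str   # A: body region examined (e.g., chest, cranium)
    biological: str   # B: biological system (e.g., musculoskeletal)

def parse_irma(code: str) -> IRMACode:
    """Split a hyphen-separated IRMA code into its T, D, A, B facets."""
    t, d, a, b = code.split("-")
    return IRMACode(t, d, a, b)

print(parse_irma("1121-127-720-500"))  # hypothetical code value
```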

Because of the volume of images produced daily in healthcare, along with the limitations of current technologies, CBIR systems in medicine have received more attention recently, with a shift toward combining image visual features with text retrieval techniques [4–8]. Some groups participating in the Cross-Language Evaluation Forum (CLEF) are focusing on medical image annotation that combines text descriptors with the content of the images themselves. Preliminary results from CLEF 2005 show better performance for approaches that combined both visual and text techniques than for those that used a single technique [6, 9, 10]. Adding an automatic classifier to an existing CBIR system could facilitate this combination and make the system more helpful for radiologists in clinical settings, researchers in medical image analysis, and medical students as well as instructors in academic environments.

Our ultimate goal for this project was to improve the performance of an image retrieval system similar to the UBMedTIRS system described by the State University of New York at Buffalo in its ImageCLEF 2005 participation [9]. This would allow such a system to respond more effectively to queries that require specific attributes such as orientation, biological system, and image modality in combination with semantic terms such as diseases and symptoms.

Some of the queries submitted during that campaign asked for specific information about body parts ("show me x-ray images of bone cysts"), image modality ("give me a hand x-ray"), body orientation ("show me sagittal views of head MRI images"), or specific pathologies ("show me pathology images of an entire kidney"). Most of these descriptors are covered by the IRMA classification codes, but if the corresponding words or phrases are not part of the text descriptions that accompany an image, it is impossible to retrieve that image using keywords alone. With this research we want to make such additional descriptions part of the metadata used for the medical image retrieval task, as sketched below.
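The following sketch illustrates what such a query expansion could look like. It is not the authors' implementation: the keyword-to-facet table is hypothetical and is not part of the IRMA specification; it only shows how facet terms detected in a free-text query could be matched against classifier-assigned metadata.

```python
# Illustrative sketch (not the paper's implementation): map query phrases
# to IRMA-style facet labels so a text query can match facet metadata.
FACET_KEYWORDS = {  # hypothetical keyword table
    "x-ray":    ("modality", "x-ray"),
    "mri":      ("modality", "magnetic resonance"),
    "sagittal": ("orientation", "sagittal"),
    "coronal":  ("orientation", "coronal"),
    "hand":     ("body_region", "hand"),
    "kidney":   ("biological_system", "uropoietic"),
}

def expand_query(query: str) -> dict:
    """Return facet constraints detected in a free-text query."""
    facets = {}
    for token in query.lower().split():
        if token in FACET_KEYWORDS:
            facet, label = FACET_KEYWORDS[token]
            facets[facet] = label
    return facets

print(expand_query("Show me sagittal views of head MRI images"))
# -> {'orientation': 'sagittal', 'modality': 'magnetic resonance'}
```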

Methods

Our method combined two main steps: image feature extraction and the classification process. We used the GNU Image Finding Tool (GIFT) to extract visual features (color and texture) from the images. One of the advantages of using GIFT for feature extraction was its ability to extract a high number of low-level features, and its ability to index a collection based on image feature content, not just text descriptions. The tool can extract up to 84,000 features per image, but as Squire et al. have indicated, most images contain between 1,000 and 2,000 features [11].
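As a rough illustration of this "image file in, feature vector out" step, the sketch below computes a simple histogram-plus-gradient feature vector. It does not reproduce GIFT's actual color and texture features; it is an assumed stand-in showing the shape of the extraction step.

```python
# A simplified stand-in for the kind of color/texture feature extraction GIFT
# performs; this sketch only illustrates "image file in, feature vector out"
# and does not reproduce GIFT's real feature set.
import numpy as np
from PIL import Image

def extract_features(path: str, bins: int = 32) -> np.ndarray:
    """Return a global gray-level histogram plus a coarse texture measure."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    hist, _ = np.histogram(img, bins=bins, range=(0, 255), density=True)
    # Mean absolute gradient as a crude texture descriptor.
    texture = (np.mean(np.abs(np.diff(img, axis=0)))
               + np.mean(np.abs(np.diff(img, axis=1))))
    return np.append(hist, texture)

features = extract_features("radiograph_0001.png")  # hypothetical file name
```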

These features were fed into a multi-class Support Vector Machine (SVM) program for classification along the four main facets of the IRMA code. We used a multi-class SVM rather than the more common binary SVM because each facet contains more than two classes and an image can belong to only one class at a time within a given facet; a combination of binary SVMs would have worked as well, but the multi-class formulation was the more natural fit. For example, on the anatomical facet we classified medical images into 27 classes. Figure 1 diagrams the classification process using this method:

Figure 1. Classification process.

As Figure 1 indicates, the input to the feature extraction step was a file of labeled medical images from the dataset, and the output was a file of image features. The extracted features were formatted with a Perl script and fed into the SVM classifier; the resulting class labels were then ready to serve as additional metadata for the retrieval task. A minimal sketch of these two steps appears below.
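The sketch that follows is illustrative rather than our exact implementation: the sparse "label index:value" line format written here is the one expected by SVM-light-style multi-class tools (an assumption about the tool used), scikit-learn's LinearSVC stands in for the multi-class SVM program, and the data is randomly generated.

```python
# Hedged sketch of the formatting and classification steps. The Perl
# formatting script is replaced by a Python writer for an SVM-light-style
# sparse file, and scikit-learn's LinearSVC (one-vs-rest multi-class)
# stands in for the multi-class SVM program actually used.
import numpy as np
from sklearn.svm import LinearSVC

def write_sparse_file(X: np.ndarray, y: np.ndarray, path: str) -> None:
    """Write one 'label index:value ...' line per image (indices from 1)."""
    with open(path, "w") as f:
        for xi, yi in zip(X, y):
            pairs = " ".join(f"{j + 1}:{v:.6f}" for j, v in enumerate(xi) if v)
            f.write(f"{yi} {pairs}\n")

# Randomly generated stand-in data: 600 images, 33 features,
# 27 classes as on the anatomical facet.
rng = np.random.default_rng(0)
X, y = rng.random((600, 33)), rng.integers(0, 27, 600)
write_sparse_file(X, y, "anatomy_train.svm")  # hypothetical file name

clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
predicted_facet_labels = clf.predict(X[:5])
```

Figure 2 illustrates how this classification process fits into the existing image retrieval process.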

Figure 2. The classification process integrated within UBMedTIRS.

Since the current UBMedTIRS system already performs well when using text to retrieve medical images, we plan to leave that process intact and concentrate on expanding the image metadata with visual features mapped to the IRMA facet codes (i.e., image modality, body region, biological system, and body orientation).

The steps on the left side of Figure 2, shown with red borders, are the additions the classification process makes to the current image retrieval process; a sketch of the resulting metadata expansion follows.
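Below is a minimal, assumed sketch of that metadata-expansion step. The record layout, field names, and prediction values are hypothetical; it only shows how facet labels produced by the classifier could be appended to an image's searchable text so the existing text-retrieval stage can match them.

```python
# Illustrative sketch (assumed record layout): append predicted IRMA facet
# labels to an image's text metadata so the existing text-retrieval step
# can match facet terms that were never in the original captions.
def expand_metadata(record: dict, predictions: dict) -> dict:
    """Add predicted facet labels to the image's searchable text fields."""
    extra_terms = " ".join(predictions.values())
    record["text"] = f'{record.get("text", "")} {extra_terms}'.strip()
    record.update({f"irma_{k}": v for k, v in predictions.items()})
    return record

record = {"id": "img_0042", "text": "patient radiograph"}  # hypothetical
preds = {"modality": "x-ray", "body_region": "chest",
         "orientation": "coronal", "biological_system": "respiratory"}
print(expand_metadata(record, preds))
```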

Automatic classification results

We used standard retrieval evaluation measures: recall, precision, error-rate, and the F1 measure. Recall here is the proportion of the relevant images in each IRMA class that our enhanced system classified correctly. Precision is the proportion of the images assigned to a class that actually belong to it. Error-rate is the percentage of images that were incorrectly classified, i.e., the ratio of misclassified images to the total number of images, reported per class and micro-averaged per facet. The F1 measure combines recall and precision; we assigned the same weight of 1 to both. With a dataset of 9,000 images, we used three-fold cross-validation: each classification process was run three times with a different split (2/3 of the data for training and 1/3 for testing) to avoid bias in the classification, and the results presented here are the averages over the three training/testing pairs. The uneven distribution of images across classes limited us to this approach. A sketch of these measures follows.
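The following sketch, using scikit-learn as an assumed stand-in for our tooling, computes these measures per class over three stratified 2/3-train, 1/3-test splits; per-class error-rate is taken as 1 - recall, matching the definition above.

```python
# A sketch of the evaluation measures defined above, averaged over three
# stratified train/test splits (2/3 training, 1/3 testing). Per class:
# recall, precision, F1 (equal weight on recall and precision), and
# error-rate as 1 - recall (misclassified over total in the class).
import numpy as np
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC

def evaluate(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return a (4, n_classes) array: recall, precision, F1, error-rate."""
    labels = np.unique(y)
    folds = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
    scores = []
    for train_idx, test_idx in folds.split(X, y):
        clf = LinearSVC(max_iter=10000).fit(X[train_idx], y[train_idx])
        p, r, f1, _ = precision_recall_fscore_support(
            y[test_idx], clf.predict(X[test_idx]),
            labels=labels, zero_division=0)
        scores.append(np.stack([r, p, f1, 1.0 - r]))
    return np.mean(scores, axis=0)
```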

Within the image modality facet, images in the x-ray and angiography classes were easy to classify, while the method performed poorly on fluoroscopic images. Our best classification result was obtained on this facet, with an overall error-rate of just 1%. Table 1 shows the average classification outcomes for image modality:

Table 1.

Image modality classification outcomes

Class Recall Precision F1 Error-rate
X-ray 100% 99% 99% 1%
Fluoroscopy 6% 56% 11% 94%
Angiography 93% 100% 96% 7%

With the body orientation IRMA facet, we encountered problems with ill-defined classes such as "other orientation," which combined images from more than one class. Our classification method nevertheless worked well with well-defined classes such as the coronal and sagittal orientations. Classes with smaller numbers of images were associated with poorer performance. Table 2 shows the classification outcomes for this facet.

Table 2.

Body orientation classification outcomes

Class Recall Precision F1 Error-rate
Axial 51% 35% 32% 49%
Coronal 86% 80% 82% 14%
Other orientation 9% 34% 14% 91%
Sagittal 54% 64% 58% 46%

In the biological system facet, we again encountered the problem of an uneven distribution of images. For example, since our dataset had few images in the urinary and respiratory classes, the poor results on these classes may be misleading. Table 3 shows the classification outcomes for this IRMA facet:

Table 3.

Biological system classification outcomes

Class Recall Precision F1 Error-rate
Cardiovascular 41% 44% 34% 59%
Musculoskeletal 94% 91% 92% 6%
Reproductive 72% 92% 80% 28%
Respiratory 0% 0% 0% 100%
Uropoietic (urinary) 13% 67% 22% 87%
Gastrointestinal 36% 61% 42% 64%
Unassigned 88% 96% 62% 12%

On the body region facet, some classes covered overlapping or very similar anatomical regions and could have been combined into one class; these included abdomen vs. upper abdomen, chest vs. chest bones, and cranium vs. facial cranium, to name a few. However, we decided to keep them separate to preserve the integrity of the dataset. Our results confirmed the problem: on these indistinct body region classes our classification method did not perform well, because an image could plausibly be assigned to either class. Breast images, by contrast, were easy to classify, with a recall of 96% on left-breast images. The lowest recall, just 3%, was on "hand-forearm" images. Table 4 shows the classification outcomes for this facet:

Table 4.

Body region classification outcomes

Class Recall Precision F1 Error-rate
Abdomen 33% 16% 20% 67%
Abdomen, upper 51% 77% 62% 49%
Breast (left) 96% 87% 91% 4%
Breast (right) 66% 91% 76% 34%
Chest 91% 93% 92% 9%
Chest bones 22% 36% 27% 78%
Cranium 70% 74% 72% 30%
Facial cranium 7% 47% 11% 93%
Neuro cranium 72% 65% 68% 28%
Ankle joint 67% 62% 64% 33%
Foot 60% 55% 58% 40%
Hip 27% 41% 32% 73%
Knee 52% 48% 50% 48%
Lower leg 21% 49% 29% 79%
Upper leg 17% 23% 20% 83%
Pelvis 82% 86% 84% 18%
Cervical spine 79% 73% 76% 21%
Lumbar spine 70% 72% 70% 30%
Thoracic spine 57% 63% 60% 43%
Hilum 17% 50% 25% 83%
Elbow 25% 37% 30% 75%
Forearm 19% 46% 27% 81%
Hand 76% 63% 69% 24%
Hand-forearm 3% 13% 5% 97%
Radiocarpal joint 22% 39% 27% 78%
Shoulder 40% 46% 42% 61%
Upper arm 17% 35% 21% 83%

Discussion

As indicated in Figures 1 and 2, the goal is to combine these results with the information in the current UBMedTIRS retrieval flow in order to expand the metadata available to the Information Retrieval (IR) system. During the classification process we were able to create a link between the low-level features and the high-level features represented in the textual Image Retrieval for Medical Application (IRMA) codes included in the dataset provided for this project [12]. For example, we achieved good performance classifying a class of chest images that had not been labeled with an IRMA code in the biological facet. We are in the process of automatically adding the classifier's output to each image in our dataset, after which we will rerun the retrieval process and compare the performance before and after adding this automatic classification step.

The main limitation of this research is the uneven distribution of images among classes. Some classes have fewer than 10 images, which still had to be divided into training and testing sets. The ill-defined classes also affected the outcome, because images in very similar classes were easily misclassified. Given these limitations of the current dataset, we are looking into balancing each facet by adding medical images from other sources; we expect to do this as part of our future research.


References

1. Greenes RA, Brinkley JF. Imaging Systems. In: Medical Informatics: Computer Applications in Health Care and Biomedicine. New York: Springer-Verlag; 2001. pp. 485–538.
2. Bidgood WD Jr, Horii SC, Prior FW, Van Syckle DE. Understanding and Using DICOM, the Data Interchange Standard for Biomedical Imaging. Journal of the American Medical Informatics Association. 1997;4(3):199–212. doi: 10.1136/jamia.1997.0040199.
3. Lehmann TM, Schubert H, Keysers D, Kohnen M, Wein B. The IRMA code for unique classification of medical images. Paper presented at: SPIE, The International Society for Optical Engineering; San Diego, CA; 2003.
4. Greenspan H, Pinhas AT. Medical Image Categorization and Retrieval for PACS Using the GMM-KL Framework. IEEE Transactions on Information Technology in Biomedicine. 2006. doi: 10.1109/titb.2006.874191.
5. Lehmann TM, Wein B, Dahmen J, Bredno J, Vogelsang F, Kohnen M. Content-Based Image Retrieval in Medical Applications: A Novel Multi-Step Approach. Paper presented at: SPIE; 2000.
6. Lacoste C, Lim J-H, Chevallet J-P, Le DTH. Medical Image Retrieval Based on Knowledge-Assisted Text and Image Indexing. IEEE Transactions on Circuits and Systems for Video Technology. 2007;17(5).
7. Deselaers T, Weyand T, Keysers D, Macherey W, Ney H. FIRE in ImageCLEF 2005: Combining Content-Based Image Retrieval with Textual Information Retrieval. In: Accessing Multilingual Information Repositories. 2006:652–661.
8. Müller H, Michoux N, Bandon D, Geissbuhler A. A Review of Content-Based Image Retrieval Systems in Medical Applications: Clinical Benefits and Future Directions. International Journal of Medical Informatics. 2004;73(1):1–23. doi: 10.1016/j.ijmedinf.2003.11.024.
9. Clough P. ImageCLEF 2005: Evaluation of image retrieval systems for historic photographic and medical images. http://ir.shef.ac.uk/imageclef/2005/. Accessed August 12, 2007.
10. Ruiz ME. Combining Image Features, Case Descriptions and UMLS Concepts to Improve Retrieval of Medical Images. Paper presented at: AMIA 2006 Biomedical and Health Informatics; Washington, DC; 2006.
11. Müller H, Pun T, Squire D. Learning from User Behavior in Image Retrieval: Application of Market Basket Analysis. International Journal of Computer Vision. 2004;56(1):65–77.
12. Lehmann TM, Schubert H, Keysers D, Kohnen M, Wein BB. The IRMA code for unique classification of medical images. Paper presented at: Medical Imaging 2003: PACS and Integrated Medical Information Systems: Design and Evaluation; San Diego, CA; 2003.

