Abstract
The increase in the volume of medical images generated and stored has made accurate image retrieval difficult. An alternative is to generate three-dimensional (3D) models from such medical images and use them in the search. Some of the main cardiac illnesses, such as Congestive Heart Failure (CHF), have deformation of the heart's shape as one of their main symptoms, which can be identified faster in a 3D object than in slices. This article presents techniques developed to retrieve 3D cardiac models using global and local descriptors within a content-based image retrieval system. These techniques were applied to pre-classified 3D models with and without CHF and were evaluated using the Precision vs. Recall metric. We observed that local descriptors achieved better results than the global descriptor, reaching 85% precision. The results confirm the potential of 3D model retrieval in the medical context to aid diagnosis.
1. Introduction
Technology has entered all sectors of society in recent years, changing the way people work and interact. Medicine, in particular, is one of the areas that has benefited most from technological advances and often leads the way in adopting them.
Computer-Aided Diagnosis (CAD) schemes assist diagnosis by using patient data and medical image data. These systems highlight suspicious areas in a medical image in order to provide the health professional with detailed data about an anomaly [1]. Recently, Content-Based Image Retrieval (CBIR) has emerged as an important technology to aid diagnosis, since it allows retrieving from a database the images that are most similar to an image provided as input.
In parallel with these technological advances, the area of graphics processing, including image processing, computer graphics, virtual reality and related fields, has also evolved over the years. One reason is that the hardware used in these areas has become more efficient and cheaper.
In this scenario, techniques have been developed for the reconstruction of three-dimensional (3D) models. One of the main reasons for the increasing use of 3D models in health is that they provide additional information compared with two-dimensional (2D) images. Besides information about depth and volume, they can condense the color, contrast and resolution characteristics of the 2D slices [3].
Because of these characteristics, many complex medical exams use 3D models in diagnosis. Magnetic Resonance Imaging (MRI) and Computerized Tomography (CT) are examples of medical imaging modalities that allow experts to identify abnormalities such as aneurysms, coronary artery diseases and tumors in several internal organs of the body without the need for invasive methods [4]. However, each exam from these modalities can generate hundreds of 2D slices and a high volume of data to be examined by physicians in order to make their diagnoses.
Cardiology can benefit greatly from CBIR systems that use these images. Through MRI and CT exams, computational aid can be provided for the analysis of different diseases such as cardiac ischemia, heart attacks and cardiac insufficiency. A possible solution to the problem of the large volume of data to be analyzed is to generate a 3D model from the 2D slices and use this model to aid the diagnosis. In the Information Retrieval area, the same approach can be used: the system extracts the features from the 3D model instead of from hundreds of 2D slices. This solution can be faster and more accurate than the approach using 2D images [2, 3, 5]. However, this approach also has limitations, since the volume or the surface needs to be available. Obtaining this kind of object is not a simple process, since the slices need to be segmented and reconstructed.
In the health area, 3D models can be particularly useful for detecting shape alterations in the reconstructed structure. Shape deformation is an important symptom of many diseases, such as Congestive Heart Failure (CHF), which is characterized by the inability of the heart to pump blood at a rate adequate for the metabolic requirements of the body [6].
CHF can be discussed from two perspectives: Left-sided Heart Failure and Right-sided Heart Failure. CHF can affect only one side of the heart but, because the circulation is a closed circuit, insufficiency on one side commonly makes the other side work harder, resulting in excessive strain and hence producing global CHF. This excessive strain on the ventricles can deform them in the mid and long term. Increased weight or thickness of the ventricle indicates hypertrophy, while an increase in chamber size indicates dilation [6].
In this context, a CAD system that uses CBIR concepts to help specialists find similar cases based on the 3D model of the structure analyzed can be quite useful both for diagnosis and for medical education. Despite the large number of content-based retrieval systems for 2D medical images, research on 3D models is still incipient in the literature. It is therefore important to develop more specific techniques that take into account the particularities of medical models.
This article helps to fill this gap by presenting the definition, implementation and validation of one global and three local shape feature descriptors, applied to reconstructed models of the left ventricle of the heart. These models were previously classified by an expert as cases with anomaly (CHF) and without anomaly. The aim of this research is to verify how different descriptor techniques behave in this specific scenario and which strategy is more precise.
This article is organized as follows: Section 2 presents related work; Section 3 describes the methodology used in the tests and the descriptors used; Section 4 discusses the results and Section 5 presents the final conclusion.
2. Background
Research on 3D CBIR is relatively recent, with the main articles of the area published over the last ten years. The components of a CBIR system, in both the 2D and 3D domains, are descriptors, a similarity function and indexing structures. Descriptors are algorithms that extract some feature from the images. A set of features forms a feature vector. A similarity function is an algorithm (usually a distance function) used to measure how similar two images are. There are different metrics for calculating distances, such as the Euclidean and Manhattan distances [2]. Indexing structures are data structures developed to make retrieval faster.
In 2D CBIR the descriptors can be classified as global or local, and they are usually divided into three categories: shape, color and texture. In the 3D domain, shape descriptors predominate, and there are numerous subcategories that vary from author to author. Yubin et al. [3] propose categorizing shape descriptors as based on geometry, statistics or projection. The geometry category considers basic features of the 3D model, such as vertices and volume. Descriptors that use quantitative analysis tools such as histograms to define the features are in the statistics category, and descriptors based on projection are those that extract 2D images from the 3D models and analyze them using 2D descriptors [3]. We found only one study that performs the retrieval of 3D models based on their colors, and it had the limitation of retrieving 3D models with completely different shapes as being alike [7].
Regarding the difference between global and local descriptors, the former analyze the 3D model as a whole, and the feature vector includes information from all regions of the 3D structure. Local descriptors analyze parts of the models individually. According to Qin et al. [5], global descriptors are easier to implement and have shown robust results when the 3D models analyzed are simple. However, for more complex and detailed models, a local descriptor approach can be preferable. Although it can require more processing, retrieval is more accurate and the retrieval system becomes more flexible, since it can select regions of interest to be compared [3].
The similarity functions applied to the feature vectors of the 3D CBIR systems analyzed are the same as those used by 2D CBIR systems, with a prevailing use of the Euclidean and Manhattan distances. Since the feature vectors are still a similar data structure, the type of similarity function used remains the same.
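As a minimal illustration of the two distance functions named above, the sketch below compares two small feature vectors. The function names and the example values are illustrative assumptions, not taken from any of the cited systems.

```python
import math
from typing import Sequence


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Square root of the summed squared differences between two feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of the absolute differences between two feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


# Hypothetical feature vectors; the per-position differences are 3, 4 and 0.
query = [1.0, 2.0, 3.0]
candidate = [4.0, 6.0, 3.0]
print(euclidean_distance(query, candidate))
print(manhattan_distance(query, candidate))
```

Both functions treat every position of the vector equally, which is the property the local descriptors in Section 3 try to work around.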
The 3D models used by the 3D CBIR systems cited in the literature are generally representations of domestic objects such as chairs, vases, bottles and cars. Benchmarks for 3D CBIR, such as the Princeton and McGill Benchmarks, also use generic models to evaluate descriptors [5]. With this type of benchmark, global descriptors have presented good results, since the models have very distinct shapes.
In the medical context, few works have focused on 3D CBIR. Glatard et al. [8], for example, obtained a 3D volume of the myocardium and used this object to identify the cardiac cycle phase (systole or diastole) of a given query image and to find slices similar to the query image. They used a projection-based descriptor that relies on 2D images extracted from a 3D object.
Wu et al. [9] combined different volume descriptors to analyze brain models from PET (Positron Emission Tomography) exams. According to the categorization of Yubin et al. [3], these can be considered geometry-based descriptors, since they analyze primitive information of the 3D model, such as the volume and number of voxels. Finally, Aman, Yao and Summers [10] used the SIFT descriptor and Bag of Words to retrieve Colonography CT scans, and applied the Normalized Discount Gain metric to evaluate their results [10].
The development of CBIR systems based on 3D models to aid diagnosis requires knowledge of the anatomical structure of the organ in the images and of the anomaly under study. Global shape descriptors often have limitations, which makes local descriptors necessary, as described in the present work.
3. Methodology
To achieve the proposed objectives, one global and three local descriptors were defined and implemented. The similarity among models was calculated with the Euclidean distance, shown in Equation 1, where x and y are two feature vectors of size n and i indicates the i-th position of each feature vector. Each of the implemented descriptors is detailed in the following sections.
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (1)$$
3.1. Descriptors
3.1.1. Distance Histogram Descriptor (DHD)
The Distance Histogram Descriptor (DHD) [12] considers the surface and the geometry of the 3D model analyzed. The algorithm computes the distance between the centroid of the model and random points on its surface. Next, the distances are divided into ranges that form the bins of a Distance Histogram.
Figure 1 shows the DHD adaptation for the context of this work. From a reconstructed model provided as query, the distance between the centroid of this model and the random points on the surface is calculated. This distance is normalized, and the frequency of occurrence of each distance value in the model is computed to compose a histogram. This histogram is transformed into a feature vector and compared with the other feature vectors stored in the database. To compute the similarity between two vectors, the Euclidean distance is applied and the models are ranked according to the value obtained. The lower the value, the more similar the model stored in the database is to the query provided [13].
Figure 1. Steps performed by the Distance Histogram Descriptor.
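The DHD steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the experiments: the surface is represented by a list of points, random surface points are approximated by a random subset of those points, distances are normalized by the largest distance in the model, and the names `sample_surface_points`, `distance_histogram` and `rank_models` as well as the bin count are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def sample_surface_points(vertices: Sequence[Point], n_samples: int, seed: int = 0) -> List[Point]:
    """Assumption: a random subset of mesh vertices stands in for random surface points."""
    if n_samples >= len(vertices):
        return list(vertices)
    return random.Random(seed).sample(list(vertices), n_samples)


def distance_histogram(points: Sequence[Point], bins: int = 10) -> List[float]:
    """Normalized histogram of centroid-to-surface distances (the feature vector)."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one surface point is required")
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    dists = [math.dist(centroid, p) for p in points]
    max_d = max(dists)
    if max_d == 0.0:
        raise ValueError("all points coincide with the centroid")
    hist = [0.0] * bins
    for d in dists:
        # Dividing by the largest distance makes models of different sizes comparable.
        idx = min(int(d / max_d * bins), bins - 1)
        hist[idx] += 1.0 / n
    return hist


def rank_models(query: Sequence[float], database: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (model id, Euclidean distance) pairs, most similar model first."""
    scored = [(name, math.dist(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy models: a regular octahedron and one stretched along the z axis.
regular = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
           (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
stretched = regular[:4] + [(0.0, 0.0, 3.0), (0.0, 0.0, -3.0)]
database = {"regular": distance_histogram(regular), "stretched": distance_histogram(stretched)}
print(rank_models(distance_histogram(regular), database))
```

Because the histogram discards where on the surface each distance was measured, two models with the same deformation in different places produce the same feature vector, which is the limitation the local descriptors address.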
3.1.2. Local Distance Histogram Descriptor (LDHD)
As mentioned earlier, CHF may cause ventricle deformations, especially in the lower regions of the structure. This descriptor takes this information into account and analyzes specific parts of the model based on their octants.
First, the 3D model is divided into eight octants and then the Distance Histogram of each part is calculated. For each Distance Histogram created, its respective area is computed, as given by Equation 2, where f_i is the i-th frequency value, d_i is its respective distance value, and n is the number of histogram bins. In other words, each of the eight positions of the resulting vector stores the histogram area of the corresponding octant (Figure 2).
| (2) |
Figure 2. Steps of the Local Distance Histogram Descriptor.
Figure 2 shows an example of each step of the LDHD. From a model reconstructed and used as query, the Distance Histogram is calculated for each of the octants, considering the centroid of the model as the point of origin for all of them. The area of each histogram is calculated and stored in a vector. This vector is compared, position by position, with the other vectors in the database using a similarity function. If the result for a position is less than a threshold value, the result of that comparison is zero; otherwise it is 1. The sum of the resulting positions indicates the degree of similarity between the two models.
The problem with applying common similarity functions directly to the feature vectors in this case is the loss of information about the octant each feature refers to. Most of the similarity functions cited in the literature do not consider the position of each feature in the vector: all features have the same weight, and if one of them has a bad result the whole feature vector can be compromised. Thus, the main advantage of this descriptor is that its end result reflects the local differences of the model analyzed. This type of approach is new in the 3D scenario, because it allows a global solution to be applied at specific points in the model.
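The sketch below illustrates the LDHD steps under explicit assumptions, since Equation 2 and the threshold value are not reproduced in this text. It assumes the histogram area is the sum of f_i times d_i, with d_i taken as the center of the i-th normalized distance bin, that octants are defined by the sign of each coordinate relative to the centroid, and that the per-octant comparison is an absolute difference. The function names and the threshold of 0.01 are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def octant_index(p: Sequence[float], c: Sequence[float]) -> int:
    """Map a point to one of the eight octants (0..7) around the centroid c."""
    return (4 if p[0] >= c[0] else 0) + (2 if p[1] >= c[1] else 0) + (1 if p[2] >= c[2] else 0)


def ldhd(points: Sequence[Point], bins: int = 10) -> List[float]:
    """Eight-position feature vector: one histogram area per octant."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one surface point is required")
    c = tuple(sum(p[k] for p in points) / n for k in range(3))
    dists = [math.dist(c, p) for p in points]
    max_d = max(dists)
    if max_d == 0.0:
        raise ValueError("all points coincide with the centroid")
    counts = [[0] * bins for _ in range(8)]
    for p, d in zip(points, dists):
        b = min(int(d / max_d * bins), bins - 1)
        counts[octant_index(p, c)][b] += 1
    # ASSUMPTION: area = sum_i f_i * d_i, with d_i the center of bin i.
    return [sum((k / n) * ((b + 0.5) / bins) for b, k in enumerate(octant))
            for octant in counts]


def ldhd_dissimilarity(a: Sequence[float], b: Sequence[float], threshold: float = 0.01) -> int:
    """Count the octants whose areas differ by at least the (hypothetical) threshold."""
    return sum(0 if abs(x - y) < threshold else 1 for x, y in zip(a, b))


# Toy models: the corners of a cube, and the same cube with one corner pushed inward.
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
dented = [p for p in cube if p != (1.0, 1.0, 1.0)] + [(0.5, 0.5, 0.5)]
print(ldhd_dissimilarity(ldhd(cube), ldhd(cube)))
print(ldhd_dissimilarity(ldhd(cube), ldhd(dented)))
```

In this sketch a lower count means more similar models, and each differing octant contributes exactly one unit, so a single badly matching octant cannot dominate the result.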
3.1.3. 3D Hough Transform Descriptor (3DHTD) - Frequency
The Hough Transform is an effective technique for detecting curves and objects in the 2D and 3D domains from a set of points. An important characteristic of this technique is how it discretizes the spatial information of the image in order to identify which points belong to the same set of interest [14].
The R-Table is a reference table, widely used in the 2D Hough transform, in which the indexed curve is represented by the angle of the gradient [14]. In the 3D domain, the R-Table can be used to organize information about the normal vector, which indicates the orientation of a given surface, in addition to the distance of this surface from the origin. Compared with descriptors that take only the distance into consideration, this is a major gain for spatial problems that require detecting shape alterations at specific locations.
In the 3D Hough Transform Descriptor (3DHTD), the spatial information is discretized using a Cubic Matrix, a 3D extension of the R-Table, in which the values of the rows and columns are measured according to a predetermined degree of resolution based on spherical coordinates (θ, φ). This degree of resolution indicates the range of values that is grouped into one cell of the Cubic Matrix. Thus, lower resolution values indicate a larger area to be analyzed. The normalized Euclidean distance between the centroid and the surface is computed to measure the depth of the Cubic Matrix. Finally, the content of each cell is defined by the frequency with which this triple (θ, φ and distance) occurs in the model.
The set of triples of a model, along with their respective frequency of occurrence (freq), is stored in the database as a position in the feature vector of the model, as shown in Equation 3, where a_i is the i-th Cubic Matrix set {θ, φ, distance, freq}.
| (3) |
The similarity between two models is obtained by comparing the frequency of each cell of the Cubic Matrix. Figure 3 shows how this descriptor works: from a 3D model given as query, the information is extracted using 3DHTD – Frequency, and the Cubic Matrix generated (mRquery) is compared, using the Euclidean distance, with another Cubic Matrix stored in the database (mRDatabase).
Figure 3. Execution of the 3D Hough Transform Descriptor using the frequencies of the cubic matrices.
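A possible reading of the Cubic Matrix construction is sketched below. Several details are assumptions made for illustration only: the angles (θ, φ) are taken from the direction between the centroid and each surface point rather than from face normals, the resolution is expressed in degrees, the distance axis uses a fixed number of bins, and the matrix is stored sparsely as a dictionary. The names `cubic_matrix` and `matrix_distance` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]
Cell = Tuple[int, int, int]  # (theta bin, phi bin, distance bin)


def cubic_matrix(points: Sequence[Point], resolution_deg: float = 30.0,
                 dist_bins: int = 10) -> Dict[Cell, float]:
    """Sparse Cubic Matrix: relative frequency of each (theta, phi, distance) cell."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one surface point is required")
    c = tuple(sum(p[k] for p in points) / n for k in range(3))
    rel = [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]
    radii = [math.sqrt(x * x + y * y + z * z) for x, y, z in rel]
    max_r = max(radii)
    if max_r == 0.0:
        raise ValueError("all points coincide with the centroid")
    n_theta = math.ceil(360.0 / resolution_deg)
    n_phi = math.ceil(180.0 / resolution_deg)
    counts: Counter = Counter()
    for (x, y, z), r in zip(rel, radii):
        if r == 0.0:
            continue  # direction is undefined for a point at the centroid
        theta = math.degrees(math.atan2(y, x)) % 360.0            # azimuth
        phi = math.degrees(math.acos(max(-1.0, min(1.0, z / r))))  # polar angle
        t_bin = min(int(theta // resolution_deg), n_theta - 1)
        p_bin = min(int(phi // resolution_deg), n_phi - 1)
        d_bin = min(int(r / max_r * dist_bins), dist_bins - 1)
        counts[(t_bin, p_bin, d_bin)] += 1
    total = sum(counts.values())
    return {cell: k / total for cell, k in counts.items()}


def matrix_distance(a: Dict[Cell, float], b: Dict[Cell, float]) -> float:
    """Euclidean distance between two matrices; cells missing from one count as 0."""
    cells = set(a) | set(b)
    return math.sqrt(sum((a.get(cell, 0.0) - b.get(cell, 0.0)) ** 2 for cell in cells))


# Toy models: an octahedron and the same shape with one vertex pushed inward.
octahedron = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
              (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
dented = [(0.5, 0.0, 0.0)] + octahedron[1:]
print(matrix_distance(cubic_matrix(octahedron), cubic_matrix(dented)))
```

Unlike the plain distance histogram, a change in distance here is recorded in the cell for its own direction, so the comparison is sensitive to where the deformation occurs.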
3.1.4. 3D Hough Transform Descriptor (3DHTD) – Standard deviation
This descriptor is a second version of the 3DHTD presented in the previous section. It analyzes the variations in the distance from the surface to the origin point of the model within the spatial intervals defined by a given degree of resolution.
To reduce the dimensionality of the Cubic Matrix in order to find the standard deviation values, the mean distance for each pair of angles (φ, θ) is computed. Next, the standard deviation of a set of frequencies is calculated using Equation 4, where f_i is the i-th frequency of the pair of angles (φ, θ), μ is the average of the values found (in this case, the distance), x is the cell value and n is the total number of occurrences within the range of each pair of angles (φ, θ).
| (4) |
Afterwards, the Euclidean distance is applied over the entire matrix, comparing the standard deviation values, as shown in Figure 4. As in 3DHTD – Frequency, the feature vector is created from a 3D model given as query, and a matrix based on the standard deviation (matrixSDQ) is generated. This matrix is compared with the other matrices stored in the database (matrixSDR) using the Euclidean distance.
Figure 4. Implementation of the 3D Hough Transform Descriptor using standard deviation.
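The standard deviation variant can be sketched as below. Because Equation 4 is not reproduced in this text, the sketch makes its own assumptions: normalized centroid-to-surface distances are grouped by angle pair, the population standard deviation of each group is stored in a sparse two-dimensional matrix, and angles come from the direction between the centroid and each surface point. The names `sd_matrix` and `sd_matrix_distance` are hypothetical.

```python
import math
import statistics
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
AngleCell = Tuple[int, int]  # (theta bin, phi bin)


def sd_matrix(points: Sequence[Point], resolution_deg: float = 30.0) -> Dict[AngleCell, float]:
    """Population standard deviation of normalized distances per (theta, phi) cell."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one surface point is required")
    c = tuple(sum(p[k] for p in points) / n for k in range(3))
    rel = [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]
    radii = [math.sqrt(x * x + y * y + z * z) for x, y, z in rel]
    max_r = max(radii)
    if max_r == 0.0:
        raise ValueError("all points coincide with the centroid")
    n_theta = math.ceil(360.0 / resolution_deg)
    n_phi = math.ceil(180.0 / resolution_deg)
    groups: Dict[AngleCell, List[float]] = defaultdict(list)
    for (x, y, z), r in zip(rel, radii):
        if r == 0.0:
            continue  # direction is undefined for a point at the centroid
        theta = math.degrees(math.atan2(y, x)) % 360.0
        phi = math.degrees(math.acos(max(-1.0, min(1.0, z / r))))
        t_bin = min(int(theta // resolution_deg), n_theta - 1)
        p_bin = min(int(phi // resolution_deg), n_phi - 1)
        groups[(t_bin, p_bin)].append(r / max_r)
    # ASSUMPTION: population standard deviation (division by n) within each cell.
    return {cell: statistics.pstdev(values) for cell, values in groups.items()}


def sd_matrix_distance(a: Dict[AngleCell, float], b: Dict[AngleCell, float]) -> float:
    """Euclidean distance between two matrices; cells missing from one count as 0."""
    cells = set(a) | set(b)
    return math.sqrt(sum((a.get(cell, 0.0) - b.get(cell, 0.0)) ** 2 for cell in cells))


# Toy model: two directions, each holding one near and one far surface point.
points = [(2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(sd_matrix(points))
```

Collapsing each angular cell to a single spread value keeps the matrix small, but a small deformation inside a cell that already has a wide range of distances changes the value only slightly.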
3.2. Experiments
The descriptors presented in the previous sections were applied to models of the left ventricle reconstructed from MRI images. Each MRI exam was composed of 45 heart slices obtained during diastole. These slices have a spatial resolution of 256×256 pixels and a contrast resolution of 16 bits per pixel. Besides the Precision versus Recall curve, the response time of each of these descriptors was also analyzed.
To test the descriptors on medical images, 30 sets of MRI exams from the Heart Institute (InCor) were used: 53% exhibited CHF and 47% showed no abnormality. 55% of the patients were women and 45% were men. The frames were segmented focusing on the region of interest in the left ventricle using the Seg3D software [15]. The ImageVis3D tool was used for the reconstruction [16]. Figures 5 and 6 show an example of the slices and the reconstructed ventricle, respectively.
Figure 5. Example of a slice and the ventricle region segmented (in red).
Figure 6. Examples of reconstructed ventricles.
We used the Precision versus Recall metric to evaluate the results of the CBIR system, where Precision is the proportion of retrieved images that are relevant and Recall is the proportion of all relevant images in the database that were retrieved [18]. We also used a Distance Matrix, in which the cell color indicates how similar a model is to the model given as query. The matrix was divided into two clusters: one for patients with CHF (patients 1 to 16) and another for patients with no anomaly (patients 17 to 30). The color of the cells indicates the order in which the models were retrieved: darker cells indicate lower distances and, therefore, models that were retrieved before those related to lighter cells.
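To make the metric concrete, the sketch below computes the points of a Precision versus Recall curve from a ranked result list. It assumes each retrieved model is marked relevant when it belongs to the same class as the query (with or without CHF); the function name and the example ranking are illustrative, not data from the experiments.

```python
from typing import List, Sequence, Tuple


def precision_recall_curve(ranked_relevance: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, one per relevant model in ranking order."""
    total_relevant = sum(1 for rel in ranked_relevance if rel)
    if total_relevant == 0:
        raise ValueError("the ranking contains no relevant models")
    curve = []
    hits = 0
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            # Recall: share of all relevant models found so far.
            # Precision: share of the k models retrieved so far that are relevant.
            curve.append((hits / total_relevant, hits / k))
    return curve


# Hypothetical ranking of five models for one query (True = same class as the query).
for recall, precision in precision_recall_curve([True, True, False, True, False]):
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Averaging such curves over all queries of a class gives one curve per descriptor, which is one common way to summarize retrieval quality.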
4. Results and Discussion
Figures 7 and 8 show the Precision versus Recall curve for each descriptor. In both cases, with and without CHF, the 3DHTD using frequency was the best performing descriptor, indicating that spatial information is important for retrieving objects that have specificities at different places in the models. This descriptor reached an average precision of 85% at lower recall values; as the number of models retrieved increased, the precision decreased, since fewer models remain available in the database and the shape differences between them become more difficult to distinguish.
Figure 7. Precision versus Recall curve comparing all the descriptors implemented in the models with CHF.
Figure 8. Precision versus Recall curve comparing all the descriptors implemented in the models with no anomalies.
The 3DHTD descriptor using standard deviation had less satisfactory results. The main limitation of this latter method is that small deformations may not be taken into consideration due to the standard deviation characteristic which considers the average of the range of values to compute the deviations of the data set.
The LDHD, which divides the models into octants and creates an auxiliary vector to compare local distances, performed well, with about 75% precision, showing that it is feasible to apply the global Distance Histogram descriptor locally. The Distance Histogram itself showed a somewhat less satisfactory performance, about 60% precision, mainly due to the specificities of the models, in which local information is extremely important for composing the diagnosis.
The errors observed among the first five models retrieved for each query can be found in the Distance Matrices (Figure 9). The 3DHTD - Frequency and LDHD descriptors performed better, as can be seen by comparing the generated clusters. The DHD and 3DHTD – Standard Deviation descriptors had many outliers, while the 3DHTD - Frequency and LDHD descriptors generated more uniform clusters, separating the 3D models with and without CHF into two groups.
Figure 9. Results of the Distance Matrices for the descriptors implemented: (a) DHD; (b) LDHD; (c) 3DHTD - standard deviation; (d) 3DHTD - Frequency.
These results show that the descriptors that take into consideration the spatial location of deformations achieved better performance than the descriptor with a global approach. The local descriptor 3DHTD – Frequency, for example, was about 20% to 30% more precise than the Distance Histogram, which uses a global approach. 3DHTD uses spherical coordinates and the distance from the face to the centroid to extract features from the 3D object. Consequently, it considers both the spatial location and the degree of deformation as elements of comparison. Additionally, 3DHTD – Frequency compares models according to their deformation pattern. In other words, if two models have a deformation at the same location but their frequencies differ, they are considered different, because the frequency reflects the intensity of the deformation. The performance was consistent with the theory on which the descriptors were based. Given that the main problem was to identify local deformations, the descriptors that provided this type of information and cross-referenced it with other data, such as the degree of deformation (indicated by the distance from the surface to the center of the model), were more likely to achieve better results.
In this work the Euclidean distance was used; however, a large set of other functions could be implemented. Evaluating the results with other distance functions and developing more descriptors within this context can therefore enrich the discussion about 3D CBIR and contribute to the current state of the art. The studies that we found in the literature usually use the same strategy applied in the 2D domain, commonly distance functions such as Euclidean and Manhattan that measure the distance between two feature vectors. In this paper, by contrast, we have a 3D feature vector, the Cubic Matrix, a complex data structure that stores the information produced by the descriptors we created. This can be considered a new way to store and compare the huge volume of information provided by medical images.
5. Conclusion
The objective of this paper was to define, implement and validate four descriptors and compare their performance in the retrieval of three-dimensional medical models with deformations at specific locations, in order to aid the diagnosis of CHF. In many medical imaging exams, several slices of a structure have to be analyzed by the expert. Generating 3D models reconstructed from these slices can be a way to decrease the amount of information to be analyzed by experts. In addition, the development of descriptors that can adequately characterize these models is an important contribution to the computer-aided diagnosis area. As mentioned in the Discussion section, local descriptors achieved better results: they were about 20% to 30% more precise than the global descriptor, mainly because we aimed to retrieve 3D models with specific and small changes.
This study has some limitations. The first is the low number of cases available to test the developed techniques, which made it difficult to perform a deeper analysis of the results considering, for example, gender and age. Due to this limitation, it was also difficult to identify more specific variations within each group. The second is the use of the Euclidean distance as the only similarity function; other metrics and forms of evaluation can be explored.
Thus, this work provides evidence that it is feasible to apply the concepts of 3D CBIR to more specific contexts. In this paper CHF was studied; however, the descriptors developed here can be applied to other problems that involve local deformations of 3D convex models. In addition, applying these descriptors to non-convex models can help evaluate them in a new context. New topics for study in 3D CBIR can also include similarity functions for 3D feature vectors and complex structures. As future work, we are expanding our database with more 3D medical models. We also plan to apply these descriptors, as well as other new techniques, to retrieve cases of different cardiac diseases with a high level of precision.
As mentioned, there are currently few studies related to CBIR in the 3D domain and, therefore, this research contributes a comparison of the performance of global and local descriptors in the retrieval of 3D medical models to aid the diagnosis of CHF.
Acknowledgments
This research was supported by the State of São Paulo Research Foundation (FAPESP - Process #2010/15691-0 and #2011/15949-0), the Heart Institute (InCor), the Brazilian National Council of Scientific and Technological Development (CNPq - Process #559931/2010-7 and #401745/2013-9) and the National Institute of Science and Technology Medicine Assisted by Scientific Computing (INCT-MACC). We thank Prof. Dr. Carlos Eduardo Rochitte for his time and expertise.
References
- 1. Doi K. Computer-aided diagnosis in medical imaging: Historical review, current status and future potential. Comp Med Imag and Graph. 2007;31:198–211. doi: 10.1016/j.compmedimag.2007.02.002.
- 2. Vranić DV, Saupe D. 3D model retrieval; Proceedings of Springer Conference on Computer Graphics (SCCG); 2000. 2004. pp. 3–6.
- 3. Yubin Y, Hui L, Yao Z. Content-Based 3-D Model Retrieval; A Survey, in Proceedings of 7th IEEE Transactions on Systems, Man, and Cybernetics; IEEE Computer Society; 2007. pp. 1081–1098.
- 4. Pereira VM, et al. Diagnostic approach to cerebral aneurysms. European journal of radiology. 2013;82(10):1623–1632. doi: 10.1016/j.ejrad.2012.10.014.
- 5. Qin Z, Jia J, Qin J. CBMI. IEEE Computer Society; London, England: 2008. Content based 3D model retrieval: A survey, in Proceedings of 6th International Workshop on Content-Based Multimedia Indexing; pp. 249–256.
- 6. Kumar V, et al. Robbins and Cotran Pathologic Basis of Disease. Elsevier; 2010. 1464 p.
- 7. Wei W, et al. CSSE; Color-Based 3D Model Classification Using Hopfield Neural Network, in Proceedings of 1st International Conference on Computer Science and Software Engineering; IEEE Computer Society: Wuhan, China; 2008. pp. 883–886.
- 8. Glatard T, Montagnat J, Magnin IE. Texture based medical image indexing and retrieval: application to cardiac imaging; Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval; New York, NY, USA: ACM; 2004. pp. 135–142.
- 9. Wu H, et al. Volume of interest (VOI) feature representation and retrieval of multi-dimensional dynamic positron emission tomography images, in Intelligent Multimedia, Video and Speech Processing; Proceedings of 2004 International Symposium on 2004; 2004. pp. 639–642.
- 10. Aman JM, Yao J, Summers RM. Content-based image retrieval on CT colonography using rotation and scale invariant features and bag-of-words model, in Biomedical Imaging: From Nano to Macro; 2010 IEEE International Symposium on; 2010. pp. 1357–1360.
- 11. Raghavan VBP, Jung GS. A critical investigation of recall and precision as measures of retrieval system performance. ACM Transactions on Information Systems (TOIS) 1989;7(3):205–229.
- 12. Khe L, Feng Z, Ning H. An Effective Approach to Content-Based 3D Model Retrieval and Classification; Proceedings of the 1st International Conference on Computational Intelligence and Security (CIS); China: IEEE Computer Society; 2007. pp. 361–365.
- 13. Bergamasco LCC, Nunes FLS. Applying Distance Histogram to retrieve 3D cardiac medical models; Proceedings of American Medical Informatics Association; 2013. pp. 112–121.
- 14. Ballard DH. Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition. 1981;13:111–122.
- 15. CBIC Seg3D: Volumetric Image Segmentation and Visualization. Scientific Computing and Imaging Institute (SCI) 2012. Available at: http://www.seg3d.org.
- 16. CBIC ImageVis3D: A Real-time Volume Rendering Tool for Large Data. Scientific Computing and Imaging Institute (SCI) 2012. Available at: http://www.imagevis.org.
- 17. Oracle, API Java 3D. 2013. Available at: http://www.oracle.com/technetwork/java/java/index-jsp-138252.html.
- 18. Datta R, et al. Image retrieval: Ideas, influences, and trends of the new age. ACM Comput Surv. 2008;40:5:1–5:60.



