Abstract
Objectives:
The motivation behind this work was to design an automatic algorithm capable of segmenting the exterior of the dental and facial bones, including the mandible, teeth, maxilla and zygomatic bone, with an open surface (a surface with a boundary) from CBCT images for the anatomy-based reconstruction of radiographs. Such an algorithm would provide speed, consistency and improved image quality for clinical workflows, for example, in implant planning.
Methods:
We used CBCT images from two studies: first to develop (n = 19) and then to test (n = 30) a segmentation pipeline. The pipeline operates by parameterizing the topology and shape of the target, searching for potential points on the facial bone–soft tissue edge, reconstructing a triangular mesh by growing patches from the edge points with good contrast and regularizing the result with a surface polynomial. This process is iterated until convergence.
Results:
The output of the algorithm was benchmarked against a hand-drawn reference and reached a 0.50 ± 1.0-mm average and 1.1-mm root mean squares error in Euclidean distance from the reference to our automatically segmented surface. These results were achieved with images affected by inhomogeneity, noise and metal artefacts that are typical for dental CBCT.
Conclusions:
Previously, this level of accuracy and precision in dental CBCT has been reported in segmenting only the mandible, a much easier target. The segmentation results were consistent throughout the data set and the pipeline was found fast enough (<1-min average computation time) to be considered for clinical use.
Keywords: computer-assisted image analysis, CBCT, dental implantation
Introduction
Precise planning of dental implants and of dental and maxillofacial surgeries requires volumetric (three-dimensional) images of the area to be operated on. CBCT has been developed as a relatively low-cost and low-dose alternative to conventional CT to meet these needs.1 During the past decade, CBCT has become an established radiologic technique in dental imaging.
The full advantage of volumetric images is obtained through visualizations of the target and the neighbouring structures, such as a tooth to be replaced with an implant and the mandibular canal. These include (two-dimensional) cross-sectional slices of the dental arch, panoramic radiographs of chosen geometry and depth and volumetric renderings.2 When these two-dimensional images or views are reconstructed from volumetric images, their correspondence to the anatomical ground truth can be retained and the reconstructed image orientation, location and geometry fitted to the individual anatomy when needed. Creating these views typically requires segmentation of the anatomic structures from the volumetric images. Image segmentation is a very useful and important tool for other purposes as well, such as the evaluation of tumorous bone infiltration, intrabony pathologies, fracture diagnostics and orthognathic monomaxillary and bimaxillary surgery planning.3–5
When image segmentation is performed manually, the results are hard to reproduce. Manual drawing on potentially hundreds of slices of high-resolution image volumes takes a considerable amount of time: assuming that manual drawing of the facial contour takes 10 s per slice, drawing every slice of a single volume consisting of 300 slices would take 50 min. Thus, automatic segmentation methods would be much preferred. Unfortunately, CBCT or CT volumes often present strong metal and other artefacts.6 The lower dose of CBCT yields a lower signal-to-noise ratio, worse contrast and higher intensity inhomogeneity than conventional CT. The highly varying anatomy of the mandible and maxilla, especially around the teeth, poses another major challenge.
Lamecker et al7 were one of the first to report results on segmenting the mandible from CBCT volume. Their strategy was to use a statistical shape model, which was deformed around the mandible.8 Rueda et al9 developed a method for segmenting the cortical bone and other targets from the cross-sectional slices of the mandible. They exploited both the shape and texture of the hand-segmented structures to train an active appearance model of the structures of interest.10 Kainmueller et al11 extended the work by Lamecker et al7 by improving the segmentation method and extending the application to tracking the mandibular canal. Recently, Wang et al12 have presented a volumetric segmentation method for both the mandible and maxilla. Their method registers a number of prior models (atlases) to the target and generates patient-specific prior model constructed of patches that are selected from the registered images.
Past works on segmenting dental CBCT volumes concentrate mostly on the mandible and its internal structures. Shape and appearance models used by published algorithms do not include the maxilla or teeth, which are the structures most prone to exhibiting metal artefacts, usually have the highest variation of shape and consequently represent the areas where segmentation algorithms are most likely to fail. In order to develop an algorithm capable of segmenting the exterior of these structures, several established segmentation paradigms were considered. Registering the volume according to a mean intensity model was deemed infeasible owing to the high variation in anatomy and pose and the varying state of the mandible (mouth) being closed or open to some extent.13 Atlas-based methods have been applied successfully in various applications such as brain image analysis and now also dental CBCT.12,14,15 However, generation of the shape and intensity models required by atlases would have been challenging owing to the highly varying anatomy, pose and scanned area of our population. The usually very high computational cost and long execution times associated with atlas-based segmentation tools were also unacceptable for our clinical tool. Active appearance models, statistical shape models and other deformable models have been found successful in segmenting the mandible.2,7,11 However, these models require a rather accurate initialization. In the typical case where the exact location, orientation or delineation of the target relative to the volume is not precisely known, initializing these models would sometimes require laborious manual interaction.
The aim of our study was to develop an automatic and data-driven segmentation algorithm that requires only very general and non-specific knowledge of the target. The algorithm was specified to segment the exterior of all visible bones in a CBCT volume with a single open surface in three dimensions. The algorithm was required to be accurate, precise and fast enough for clinical use and to tolerate the challenging characteristics of dental CBCT and possible gaps and holes around the facial skeleton. The intended use for the algorithm was to aid in creating visualizations and thus, good, consistent and continuous overall fit to the target was preferred over highly detailed result over small and local shapes.
Methods and materials
Data
The data set consisted of 49 isotropic CBCT volumes collected by scanning human subjects during two separate studies, S1 (n = 19) and S2 (n = 30), on separate occasions. The subjects were scanned using Soredex Scanora (KaVo Kerr Group, Tuusula, Finland) prototypes during their development. S1 was scanned using an early version of the system, and the system was upgraded prior to S2. The volumes were reconstructed in cylinder-shaped fields of view (FOVs) in three different sizes (diameter × height: 60 × 60, 100 × 75 and 145 × 75 mm). The 145 × 75-mm FOVs were available only in S2, after the upgrade. The volumes were isotropic, ranging from 300 × 300 × 300 to 580 × 580 × 300 voxels in size and from 0.13 to 0.35 mm per voxel. All subjects were patients scanned by an authorized healthcare provider according to the rules of ethics. No normal controls were included because of the use of ionizing X-rays in CBCT. Neither subject diagnoses nor any other personal information was disclosed with the images. Most volumes had at least one of the following types of artefacts: metal, movement, inhomogeneity or noise, some severe (Figure 1). Roughly 36.7% of the subjects had their mouth open to some extent during scanning (Table 1). Most of the subjects were presumably candidates for implant surgery and thus were missing some, even most, of their teeth. Positioning information, such as the anatomic region of interest (mandible, maxilla or sinuses) or the placement of the FOV (left, centre or right), was not included with the images.
Figure 1.
Artefacts typical for dental CBCT: CBCT volumes are acquired with a smaller X-ray exposure and thus have a worse signal-to-noise ratio when compared with conventional CT. The algorithm was designed to segment volumes with inhomogeneity (a), metal artefacts (b), movement and noise (c) typical for dental CBCT.
Table 1.
General characteristics of the data set
| Set | Mouth open | Maxilla visible | FOV centre | FOV left | FOV right | Noise | Metal |
|---|---|---|---|---|---|---|---|
| S1 | 15.8% | 89.5% | 89.5% | 0.0% | 10.5% | 42.1% | 63.2% |
| S2 | 50.0% | 76.7% | 76.7% | 20.0% | 3.3% | 0.0% | 70.0% |
| All | 36.7% | 81.6% | 81.6% | 12.2% | 6.1% | 16.3% | 67.4% |
FOV, field of view; S1, Study 1; S2, Study 2.
The table summarizes the relative number of images in S1, S2 and all where the subjects had the mouth open, the maxilla was at least partly visible, how the FOV was placed (centre, left or right) and whether the image exhibited metal or noise-type artefacts. This information was compiled by inspecting the data set visually.
Overview of the segmentation algorithm
The basic principles of the developed segmentation algorithm are rather straightforward: finding a sufficiently large number of points with good coverage on the exterior bone–soft tissue edge of the facial skeleton and reconstructing a surface mesh on them. The algorithm was implemented in a pipeline of sequentially performed steps (Figure 2).
Figure 2.
Overview of the automatic segmentation CBCT pipeline: the input volume is read and filtered to suppress noise, extract image gradients and estimate parameters of the volume content. The four steps: parameterizing the surface topology, searching for potential target edge points, reconstructing a surface mesh from the edge points and fitting a surface polynomial to the reconstructed mesh are iterated for best results. Once converged, the resulting triangular surface mesh is provided as output.
The pipeline begins with filtering to suppress noise and to estimate a number of parameters to be used later in the pipeline. The main novelty and the key elements of the algorithm are in four steps following filtering. First, the coarse shape and location of the target is parameterized. This parameterization also defines a topology according to which a set of equally spaced line profile segments presumably intersecting the target edge is aligned. Second, potential edge points are searched along these segments and the best candidates are selected. Third, a mesh of triangles connecting the neighbouring edge point candidates on the segments is reconstructed and the best triangles are selected for the final mesh configuration. Fourth, a surface polynomial is fitted to the mesh to fill holes and gaps and to regularize and smooth the result to a chosen degree. These four steps are iterated and once a good fit to the target is reached, the iteration loop is terminated and the resulting surface mesh is the output. The following sections discuss these steps in detail.
Filtering
The input volume is first filtered with a three-dimensional, σ = 1.5-mm Gaussian kernel. A kernel of this size is large enough to smooth the noise typical of our data set but small enough not to blur the target edge too much or lose other relevant detail. The kernel size may have to be determined separately for different modalities, image resolutions, quality levels and targets. The volume is then filtered with a two-dimensional Sobel-type kernel to estimate the magnitude and orientation of the edge gradient for every voxel in the volume. The kernel is applied to all slices of the image volume in axial orientation.
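The filtering step can be sketched as follows. This is an illustrative outline assuming the volume is held in a NumPy array; the function and parameter names are invented here and are not taken from the published implementation (which was written in MATLAB):

```python
# Sketch of the filtering step: 3D Gaussian smoothing followed by slice-wise
# 2D Sobel gradients. Names are illustrative, not from the paper.
import numpy as np
from scipy import ndimage

def filter_volume(volume, voxel_size_mm, sigma_mm=1.5):
    """Smooth the volume and estimate per-voxel edge gradients on axial slices."""
    # 3D Gaussian smoothing; kernel width given in mm, converted to voxels.
    sigma_vox = sigma_mm / voxel_size_mm
    smoothed = ndimage.gaussian_filter(volume.astype(np.float32), sigma=sigma_vox)

    # 2D Sobel filtering applied slice by slice in the axial (k) orientation.
    gx = np.empty_like(smoothed)
    gy = np.empty_like(smoothed)
    for k in range(smoothed.shape[2]):
        gx[:, :, k] = ndimage.sobel(smoothed[:, :, k], axis=0)
        gy[:, :, k] = ndimage.sobel(smoothed[:, :, k], axis=1)

    magnitude = np.hypot(gx, gy)       # edge gradient magnitude per voxel
    orientation = np.arctan2(gy, gx)   # edge gradient orientation per voxel
    return smoothed, magnitude, orientation
```

The smoothed volume, gradient magnitudes and orientations would then remain constant for the rest of the pipeline, as described below.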
The approximate intensity ranges of the bone and the surrounding soft tissues need to be estimated for later use in the segmentation pipeline. In CT, a natural way of achieving this would be to use known ranges for these tissues in Hounsfield units. Unfortunately, owing to different volume reconstruction techniques used in CBCT, this is not necessarily possible and mapping the CBCT intensities to the Hounsfield units scale may not be straightforward. For these reasons, we estimate the ranges with a direct clustering-based classifier (Figure 3).16 Four predefined classes were set to approximate (1) the background, (2) soft tissue, (3) bone and (4) hard objects such as the tooth enamel and metal. The means and standard deviations of the intensities of the voxels labelled in the soft tissue and bone are retained.
Figure 3.
A CBCT image is filtered and the image voxels clustered according to thresholds defined by a direct-clustering method to approximate the intensity ranges of the tissue classes before segmentation. Noise and small artefacts are smoothed with Gaussian filtering (a). The filtered images are clustered into classes of (1) background (dark grey areas), (2) soft tissue (grey areas), (3) bone (light grey areas) and (4) hard objects such as enamel and metal (white areas) (b). A small portion of the outside rim of the field of view (FOV) is excluded from the parameter estimation owing to regular inhomogeneity in the area (b). The classification error due to this inhomogeneity is visible in (b), where the bone class appears to spread along the borders of the FOV at the 4 and 8 o'clock orientations.
Parameterization
The coarse shape, location and orientation of the target are captured by an arc length-type parameterization. Arc length parameterization works well on the characteristic shape of the dental arch, although other approaches such as spherical parameterization could be used.17 The parameterization is obtained by mapping the points x = {i,j,k} of the isotropic Cartesian index coordinate system of the image voxel grid to a parameterized system x′ = {i′,j′,k′}, where the coordinates are the signed arc length i′ from the apex and the signed distance j′ from the nearest point on the arc. Coordinate k is the axial slice index and remains unchanged by the mapping, i.e. k′ = k. Only one arc is defined for the whole image volume. This mapping is illustrated in Figure 4.
Figure 4.
Image grid coordinate system and the parameterized system: the image coordinate system {i,j,k} is an isotropic, Cartesian grid where coordinates {i,j} lie on the axial plane. Parameterization bends the original coordinate system (grey areas) along a polynomial (white areas) fitted on the outer surface of the dental arc. The resulting coordinates of the parameterized space are signed arc length from the apex i′ and signed distance to the nearest point on the arch j′. The mapping is two dimensional and thus k′ = k.
The arc is defined by a third-degree polynomial. Its coefficients are solved by fitting the polynomial to the edge points of the whole surface projected on the {i,j} (axial) plane. At the start, when no edge points exist yet, the polynomial is simply a horizontal line from left to right, cutting the {i,j} plane in half. As the pipeline runs further, the arc converges towards the dental arc and facial bones. This parameterization defines a plane according to which the proper segmentation surface will be defined.
The parameterization plane defined by the arc will not have enough degrees of freedom to fit to the target exactly and it is not supposed to do so. The parameterization is meant to only capture the global curvature of the target and to define a rough topology and space where potential target edge points are searched.
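The mapping described above can be sketched as follows. This is a simplified illustration assuming the arc is given as a cubic polynomial j = P(i) on the axial plane and is sampled densely to evaluate arc lengths and nearest points; all names and the sampling strategy are assumptions:

```python
# Sketch of the arc-length parameterization: map axial points {i, j} to
# {i', j'} = (signed arc length from the apex, signed distance to the arc).
import numpy as np

def parameterize(points_ij, poly_coeffs, apex_i,
                 n_samples=2000, i_range=(-50.0, 50.0)):
    """Map axial points to the parameterized space defined by the arc."""
    p = np.poly1d(poly_coeffs)
    i_s = np.linspace(i_range[0], i_range[1], n_samples)
    arc = np.stack([i_s, p(i_s)], axis=1)                   # dense arc samples
    seg = np.hypot(np.diff(arc[:, 0]), np.diff(arc[:, 1]))  # segment lengths
    cum = np.concatenate([[0.0], np.cumsum(seg)])           # cumulative arc length
    apex_len = np.interp(apex_i, i_s, cum)                  # arc length at apex

    out = np.empty((len(points_ij), 2), dtype=np.float64)
    for n, (pi, pj) in enumerate(points_ij):
        d2 = (arc[:, 0] - pi) ** 2 + (arc[:, 1] - pj) ** 2
        k = int(np.argmin(d2))                              # nearest arc sample
        i_prime = cum[k] - apex_len                         # signed arc length
        j_prime = np.sign(pj - arc[k, 1]) * np.sqrt(d2[k])  # signed distance
        out[n] = (i_prime, j_prime)
    return out
```

With the degenerate starting arc (a horizontal line), the mapping reduces to the identity up to a sign convention, which matches the first-iteration behaviour described above.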
The edge points are searched on a set of line profiles along orientation j′ in the parameterized space. A number of line profiles are placed on a grid in the {i′,k′} plane at a defined spacing. Uniform spacing of 1.25 mm was used as the best trade-off between accuracy and computation time. The grid also defines the topology in which the actual surface is reconstructed during the later stages of the pipeline. The use of line profiles has the benefit of limiting the search space and the number of voxels to evaluate, achieving faster computation.
Edge point search
Potential edge points are searched by estimating an energy as follows:
| (1) |
for every voxel x′ crossing a line segment at n = {i′,k′}, where g(x′), v(x′) and o(x′) are the gradient magnitude, image intensity and gradient orientation, respectively. E(x′) has similarities to energy functions used in other segmentation methods but was defined specifically for this problem.2,7,9,11 g(x′), v(x′) and o(x′) are estimated at the filtering step of the pipeline and remain constant. vdiff is the difference between the average intensities of voxels labelled to the soft tissue and bone and r is the radius of the Sobel kernel used in estimating edge gradients. vn and on are the estimated intensity and target orientation of the line profile segment at n. vn and on are updated at every iteration based on the intensities and orientations of the edge point at n.
a, b and c are weighting constants that can be determined with machine learning or, as in this work, simply by exhaustive search using the training set. E(x′) was designed as a minimum energy function with the three components normalized to the scale of [0,1], with the value of 0 indicating the best possible properties.
Once the energy values have been computed for every voxel crossing a line profile, up to q voxels at local minima are selected per profile. In this study, we used q = 3, as in the vast majority of cases the correct edge was found among the three best candidates. The search for local minima along the line segments is illustrated in Figures 5 and 6.
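The exact form of energy Function (1) is not reproduced here; the sketch below only illustrates the subsequent candidate selection, picking up to q local minima of an energy profile sampled along one line segment (the function name is hypothetical):

```python
# Sketch of edge point candidate selection: given a 1D energy profile sampled
# along a line segment, return the indices of up to q local minima, best first.
import numpy as np

def select_candidates(energy, q=3):
    """Indices of up to q local minima of a 1D energy profile, lowest first."""
    e = np.asarray(energy, dtype=np.float64)
    interior = np.arange(1, len(e) - 1)
    # A local minimum is no larger than either of its neighbours.
    is_min = (e[interior] <= e[interior - 1]) & (e[interior] <= e[interior + 1])
    minima = interior[is_min]
    order = np.argsort(e[minima])   # lowest energy first
    return minima[order][:q]
```

For example, a profile with three dips yields the three dip indices ordered by their energies, mirroring the best/second/third candidates of Figure 5.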
Figure 5.
Edge point search along the line profiles: potential edge points are searched along the line profiles and a number of q (here, q = 3) best candidates are selected for further consideration. The best candidates are shown in red, second best in green and third best in blue. Image (a) shows the line profiles and point candidates after the first iteration when the parameterization and thus the last component of the energy Function (1) is not yet used. Image (b) shows a slice with line profile configuration bent along the parameterization during later iterations where segment lengths are trimmed for shorter energy profiles and thus less voxels for evaluation. For colour image see online.
Figure 6.
Energy and intensity: corresponding energy (grey line) profiles as computed with Equation (1), with the corresponding intensity (black line) (a). The profiles show that the energy minima correspond to the highest gradients in the intensity that match the edges of the outer and inner surfaces of the mandibular bone. The location and orientation of the intensity profile in (a) is shown on an axial image slice in (b).
Surface mesh reconstruction by patch growing
Besides the actual segmentation, the aim of reconstructing the target surface is to determine which of the edge point candidates lie on the same edge, and thus are part of the same structure or object, and further which surface represents the desired facial bone–soft tissue edge. The surface is constructed as a triangular mesh whose topology, i.e. the node-to-vertex configuration, is fixed at the parameterization step and remains constant. The goodness of fit of the mesh to the edge is estimated locally by computing energies for the edge point candidate–triangle combinations using Function (1). The energies are estimated for every image voxel crossing the triangle and the median is taken, giving one energy value per triangle. The triangles with the lowest energy are selected for the mesh if a maximum energy threshold of 0.9 is met. If the threshold is exceeded, the triangle is discarded, leaving a hole in the mesh.
To find a global minimum, all triangle combinations of neighbouring point candidates should be tested. This, however, would lead to an unnecessarily high computational cost, since the energy values for q³ candidate triangles would need to be estimated for every triplet of line segments. Instead of going through all combinations, a number of seed points are selected and fixed. All (q² × 4) triangle combinations around the seed and the adjacent edge point candidates are evaluated and the minimum energy triangle is taken. This single triangle acts as the start of a surface patch (Figure 7) that is further grown by fitting the triangles on the patch boundary to the edge point candidates on the neighbouring line segment and adding the minimum energy triangle to the patch if it falls below the maximum energy threshold. Since all but the single tested edge point candidate (node) are already fixed to the patch, only a maximum of q combinations per triangle need to be tested. Growing patches in this manner does not guarantee a global minimum for the whole mesh but gives a very good chance of finding the local, consistent edges and requires only linear time to compute.
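The greedy growth can be illustrated with a deliberately simplified, hypothetical variant: instead of triangle energies, each grid node below carries q precomputed candidate energies, and the patch grows breadth-first from a seed by accepting, per neighbouring node, the cheapest candidate that stays below the energy threshold. This omits the triangle geometry of the actual method and only shows the accept-or-leave-a-hole growth pattern:

```python
# Simplified sketch of greedy patch growing on a grid of line segments.
# cand_energy[r, c] holds the q candidate energies of grid node (r, c).
import numpy as np
from collections import deque

def grow_patch(cand_energy, seed, max_energy=0.9):
    """Grow a patch from `seed`; returns {node: accepted candidate index}."""
    rows, cols, _ = cand_energy.shape
    patch = {seed: int(np.argmin(cand_energy[seed]))}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in patch:
                best = int(np.argmin(cand_energy[nr, nc]))
                # Accept the cheapest candidate, or leave a hole in the patch.
                if cand_energy[nr, nc, best] < max_energy:
                    patch[(nr, nc)] = best
                    frontier.append((nr, nc))
    return patch
```

Each node is visited at most once, which reflects the linear-time property noted above.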
Figure 7.
Growing surface patches to a mesh: a number of seed points potentially on the bone–soft tissue edge are selected and a number of patches (separate patches in different shades of grey) are formed by joining neighbouring seed points by triangles (a). The patches shown are grown by adding potential edge points on the analyzed line segment (b). Image (c) shows the resulting surface mesh of the patch-growing phase.
The criteria for the selection of seed points are critical for providing an accurate starting point for the patch-growing phase and thus successful surface mesh reconstruction. We chose to select the points that have the lowest energy and lie foremost on their segments. These are most likely correct for segmenting the outer surface of the facial skeleton, since the only edges in dental CBCT outside the skeleton are the soft tissue–air boundary or artefacts. In another application, some other criteria could be used or the operator could be asked to select a number of points depending on the size and contrast of the target to act as seeds.
The patch growing is run for every surface patch independently. This results in a number of patches, some of which might overlap; in other words, two edge point candidates from the same line segment may belong to different surfaces. The right patches to represent the target surface then need to be selected. In this application, the two largest non-overlapping patches were taken, on the simple presumption that the two largest patches represent the maxilla and the mandible. Additional rules, such as a minimum patch size and a maximum average triangle energy threshold, were used to discard patches likely representing edges other than facial bones (sinus cavities etc.).
Fitting of surface polynomial
The surface reconstruction will result in a mesh with holes and disjoint patches. To bridge these holes and gaps and to obtain smooth and consistent surfaces, the mesh needs to be interpolated. This is performed by fitting a thin plate smoothing spline polynomial f(n) to the edge point candidates that remain on the mesh patches after the growing phase.18 The smoothing is performed in the parameterized space. The coefficients of f are estimated by minimizing the sum as follows:
p Σ_n E(n) + (1 − p) R(f), (2)
where
E(n) = [j′ − f(n)]² (3)
is the error measure between the mesh node coordinate j′ and the value of the function f at coordinate n = {i′,k′}, and R(f) is a roughness measure penalizing bending the surface too sharply. A smoothing parameter p ∈ [0,1] acts as the weighting term between the two components. R(f) can be chosen according to the application. We used the integral of the second derivatives of the node coordinates as implemented in the MATLAB Curve Fitting Toolbox function tpaps (MathWorks, Natick, MA).
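A sketch of this regularization step is shown below. The published pipeline uses MATLAB's tpaps; here SciPy's thin-plate-spline RBF interpolator stands in for it. Note that SciPy's `smoothing` term is parameterized differently from tpaps' p, so the value used is purely illustrative:

```python
# Sketch of the surface regularization: fit a smoothing thin plate spline
# j' = f(i', k') to the surviving mesh nodes and evaluate it on a grid.
# SciPy's RBFInterpolator is a stand-in for MATLAB's tpaps.
import numpy as np
from scipy.interpolate import RBFInterpolator

def fit_surface(nodes_ik, depth_j, eval_ik, smoothing=1.0):
    """nodes_ik: (N, 2) node coordinates {i', k'}; depth_j: (N,) values j'.
    Returns the smoothed j' values at the (M, 2) points eval_ik."""
    spline = RBFInterpolator(nodes_ik, depth_j,
                             kernel='thin_plate_spline', smoothing=smoothing)
    return spline(eval_ik)
```

Evaluating the fitted spline over the full {i′,k′} grid fills the holes and gaps left by discarded triangles, as described above.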
The polynomial is fitted to the edge points in the parameterized space, since this removes the global curvature of the target and thus enables an effective representation of the surface. Figure 8 shows a typical result of a surface fitting to a mesh just after reconstruction.
Figure 8.
Surface regularization with a thin plate polynomial: a thin plate spline surface is fitted to the raw surface mesh (light grey) which typically contains holes, sharp peaks etc. The spline surface (dark grey) will fill the gaps and smooth the result to a chosen degree.
Convergence
Once the surface polynomial has been fitted, its properties are evaluated. First, it needs to be determined whether to continue or terminate the iteration loop (Figure 2). During the early development of the algorithm, the decision was based on the estimated movement of the surface mesh node coordinates j′ between the current and previous iterations: if 95% of the nodes deviated less than a chosen threshold from the previous iteration, the loop was terminated. Since most of the progress is typically gained during the early iterations, we decided to run a fixed number of three iterations instead. Iterations beyond that gave very little improvement at the cost of extra computation time. If the loop is continued, the parameterization polynomial is refitted to the current edge point coordinates and the terms vn and on in Equation (1) are re-estimated.
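The original convergence test amounts to a one-line check; a minimal sketch, with the threshold value chosen arbitrarily here:

```python
# Sketch of the early convergence criterion: terminate once 95% of the surface
# nodes moved less than a chosen threshold since the previous iteration.
import numpy as np

def has_converged(j_prev, j_curr, threshold_mm=0.5, fraction=0.95):
    """True if at least `fraction` of nodes moved less than `threshold_mm`."""
    moved = np.abs(np.asarray(j_curr) - np.asarray(j_prev))
    return bool(np.mean(moved < threshold_mm) >= fraction)
```

In the released pipeline this check was replaced by a fixed count of three iterations, as noted above.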
Assessment
The results were validated against a reference surface drawn manually by an expert (a medical physicist with 10 years' experience). The facial bones were drawn by placing markers on the axial slices of 2-mm spacing, covering all bone–soft tissue contours visible to the eye. The segmentation error was computed as the closest distance from the markers to the segmented surface. The distance was measured from a marker to the plane spanned by a mesh triangle along the normal direction of the plane. All marker–mesh triangle combinations were tested and the shortest distance was taken for every marker. In addition, the marker–triangle correspondence was tested by scaling the mesh (inflation or deflation around the centre of the mesh) so that it would cross the tested marker. Only those marker-to-surface measurements where the marker crosses a triangle of the scaled mesh were taken into account. This procedure also gives an estimate of how much the automatically segmented surface area covers the hand-drawn target. The average distance, standard deviation and root mean squares (RMS) of the marker–surface distances per image were estimated.
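The core of the error measure, the distance from a marker to the plane spanned by a mesh triangle along the plane normal, can be sketched as follows (the marker–triangle correspondence test by mesh scaling is omitted; the function name is illustrative):

```python
# Sketch of the per-marker error measure: unsigned distance from a hand-drawn
# marker to the plane of a mesh triangle, measured along the plane normal.
import numpy as np

def marker_to_plane_distance(marker, tri):
    """tri: (3, 3) array of triangle vertex coordinates; returns the distance."""
    a, b, c = np.asarray(tri, dtype=np.float64)
    normal = np.cross(b - a, c - a)          # plane normal from two edges
    normal /= np.linalg.norm(normal)
    return abs(np.dot(np.asarray(marker, dtype=np.float64) - a, normal))
```

Taking the minimum of this distance over all valid marker–triangle pairs, then averaging per image, yields the mean, standard deviation and RMS figures reported in the Results.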
Results
Accuracy
Visual examples of the best, average and worst performance of the algorithm are shown in Figure 9.
Figure 9.
The best, average and worst segmentation results: slices (a) and (b) show the best result based on the shortest average distance from a hand-drawn surface. The automatically detected edge points (black circles) and manually drawn markers (grey circles) are shown on axial image slices. Slices (c) and (d) show a typical result. Some of the worst results occur when the algorithm is attracted to false edges owing to bad contrast (e) and metal artefacts (f), when the applied smoothing is too strong to capture sharp topological features such as the maxilla just below the zygomatic bone, or a combination of these. The ruler overlay units are in centimetres.
Figure 10 shows the distributions of distances from the hand-drawn markers to the segmented surface per image.
Figure 10.
Distributions of surface-to-hand-drawn marker distances (errors): the distances [in millimetres (mm)] between hand-drawn markers and segmented surfaces are shown with box and whisker plots, where the midline in the box is the median, the box limits are 25 and 75% and the whisker limits are 5 and 95%. The distributions are ordered from poorest to best (left to right) as measured in the width of the 5 and 95% limits. The horizontal axis shows the image indices. Images that belong to the training set (Study 1) are marked with asterisks (*).
The results per volume are shown in Tables 2–4.
Table 2.
Individual segmentation results for the Study 1 data set
| Index | Mean (mm) | Std (mm) | RMS (mm) | Coverage (%) |
|---|---|---|---|---|
| 1 | 0.32 | 0.30 | 0.44 | 92 |
| 2 | 0.81 | 2.81 | 2.91 | 76 |
| 3 | 0.46 | 0.75 | 0.88 | 88 |
| 4 | 0.43 | 0.70 | 0.82 | 85 |
| 5 | 0.44 | 0.46 | 0.63 | 86 |
| 6 | 1.31 | 2.31 | 2.65 | 81 |
| 7 | 0.83 | 2.15 | 2.30 | 85 |
| 8 | 0.59 | 1.50 | 1.61 | 94 |
| 9 | 0.47 | 0.47 | 0.66 | 90 |
| 10 | 0.48 | 0.50 | 0.69 | 97 |
| 11 | 0.36 | 0.36 | 0.51 | 90 |
| 12 | 0.46 | 0.71 | 0.84 | 94 |
| 13 | 0.40 | 0.54 | 0.67 | 91 |
| 14 | 0.43 | 0.34 | 0.55 | 89 |
| 15 | 0.60 | 0.86 | 1.04 | 90 |
| 16 | 0.51 | 0.56 | 0.76 | 98 |
| 17 | 0.72 | 2.59 | 2.67 | 78 |
| 18 | 0.51 | 2.35 | 2.40 | 92 |
| 19 | 0.44 | 0.63 | 0.76 | 89 |
RMS, root mean squares; Std, standard deviation.
Table 4.
Averaged segmentation result images in Study 1 (S1), Study 2 (S2) and both (all) combined
| Set | Mean (mm) | Std (mm) | RMS (mm) | Coverage (%) |
|---|---|---|---|---|
| S1 | 0.52 | 1.14 | 1.26 | 89 |
| S2 | 0.49 | 0.93 | 1.05 | 94 |
| all | 0.50 | 1.01 | 1.13 | 92 |
RMS, root mean squares; Std, standard deviation.
Table 3.
Individual segmentation results for the Study 2 data set
| Index | Mean (mm) | Std (mm) | RMS (mm) | Coverage (%) |
|---|---|---|---|---|
| 20 | 0.43 | 0.61 | 0.74 | 96 |
| 21 | 0.51 | 0.87 | 1.01 | 94 |
| 22 | 0.45 | 0.48 | 0.66 | 100 |
| 23 | 0.87 | 2.29 | 2.44 | 95 |
| 24 | 0.46 | 0.63 | 0.78 | 95 |
| 25 | 0.56 | 0.93 | 1.08 | 97 |
| 26 | 0.49 | 0.80 | 0.94 | 98 |
| 27 | 0.62 | 1.13 | 1.29 | 93 |
| 28 | 0.31 | 0.28 | 0.41 | 97 |
| 29 | 0.56 | 1.69 | 1.78 | 97 |
| 30 | 0.46 | 0.54 | 0.71 | 97 |
| 31 | 0.34 | 0.29 | 0.45 | 94 |
| 32 | 0.49 | 0.67 | 0.83 | 96 |
| 33 | 0.42 | 0.57 | 0.71 | 90 |
| 34 | 0.34 | 0.26 | 0.43 | 96 |
| 35 | 0.25 | 0.26 | 0.36 | 95 |
| 36 | 0.36 | 0.39 | 0.53 | 90 |
| 37 | 0.77 | 1.22 | 1.44 | 80 |
| 38 | 0.62 | 1.10 | 1.26 | 87 |
| 39 | 0.33 | 0.43 | 0.55 | 91 |
| 40 | 0.47 | 0.58 | 0.75 | 96 |
| 41 | 0.35 | 0.55 | 0.65 | 94 |
| 42 | 0.41 | 0.52 | 0.66 | 96 |
| 43 | 0.40 | 0.36 | 0.54 | 97 |
| 44 | 0.36 | 0.35 | 0.50 | 95 |
| 45 | 0.53 | 1.44 | 1.53 | 98 |
| 46 | 0.48 | 0.38 | 0.61 | 97 |
| 47 | 1.11 | 1.89 | 2.19 | 98 |
| 48 | 0.39 | 0.51 | 0.64 | 98 |
| 49 | 0.33 | 0.40 | 0.52 | 96 |
RMS, root mean squares; Std, standard deviation.
The study S1 was used as a training set to find the best values for the constants a, b and c in Function (1) and S2 was used for validation. The values a = 0.48, b = 0.26 and c = 0.26 were found to give the best average accuracy; the largest weight in the energy function was thus given to the edge gradient magnitude term weighted by a. Relatively low values (<0.1) of p in Function (2) gave the most consistent surfaces. Thus, a rather large weight was given to the smoothing component of Function (2), smoothing out sharp curvatures.
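The exhaustive search for the weights can be sketched as below. The sketch assumes the weights are constrained to sum to one (consistent with the reported values, which sum to 1.00, though the paper does not state this constraint), and the scoring function stands in for the mean segmentation error over the training volumes:

```python
# Illustrative grid search for the weights a, b, c of Function (1), assuming
# a + b + c = 1. `score_fn` is a placeholder for the mean training-set error.
import numpy as np

def search_weights(score_fn, step=0.02):
    """Exhaustively try weight triples with a + b + c = 1; return the best."""
    best, best_score = None, np.inf
    for a in np.arange(step, 1.0, step):
        for b in np.arange(step, 1.0 - a, step):
            c = 1.0 - a - b
            if c <= 0:
                continue
            s = score_fn(a, b, c)
            if s < best_score:
                best, best_score = (a, b, c), s
    return best, best_score
```

With a step of 0.02, this evaluates only a few thousand triples, so even a costly error measure over 19 training volumes remains tractable.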
The average computing time on an iMac workstation (Apple, Cupertino, CA) running OS X 10.6 with a 2.4-GHz Intel i5 processor (Intel, Santa Clara, CA) was just below a minute per volume using a single thread. The segmentation algorithm and the validation method were mostly implemented in MATLAB 2011b, with a few parts, such as the computation of line–voxel and triangle–voxel intersections, written in C. The single most computationally intensive step was solving the thin plate surface polynomial, which took roughly half of the computation time. About one-third of the time was spent filtering the volumes and the rest mostly in parameterization, energy computations and mesh reconstructions.
Application
The primary motivation behind developing the algorithm was to generate anatomy-based radiographs with no or very little user interaction. For example, in implant planning, a typical problem is to determine the exact location of the mandibular canal relative to the mandibular bone or the bottom of the sinus cavity relative to the maxillary bone. Here, we present an example application where a panoramic slice and a number of cross-sectional views (Figure 11) were reconstructed based on the segmentation of the facial bones on a presumed location (right molar) of the implant.
Figure 11.
An example of using the segmentation result to reconstruct panoramic and cross-sectional slices: a line contour (grey line) marking the centre of the layer used in panoramic reconstruction is drawn on a slice of interest (a). The contour is computed by moving the segmentation contour (black dotted line) inwards by a fixed distance, here 4 mm (a). Seven cross-sections (grey lines) are placed on the right molar (a). A scene including the reconstruction of the panoramic and cross-sectional slices in (a) is rendered in three dimensions, showing the curvature of the panoramic layer (b). The panoramic reconstruction laid on a plane shows the molar roots and the mandibular canal (c). The cross-sectional slices marked in (a) show the extent of the roots relative to the mandibular canal (d). This example was created using Image 26 in our data set. L, left; R, right.
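The inward contour shift described in the caption can be sketched as follows. This is a simplified construction (neighbour-based normals on a closed two-dimensional contour, with the sign chosen towards the centroid), not the authors' exact method.

```python
import numpy as np

def offset_contour_inwards(points, distance):
    """Move each point of a closed 2D contour towards the interior by
    `distance` millimetres along the local normal."""
    pts = np.asarray(points, float)
    centroid = pts.mean(axis=0)
    # Tangents from the neighbouring points on the closed contour.
    tang = np.roll(pts, -1, axis=0) - np.roll(pts, 1, axis=0)
    tang /= np.linalg.norm(tang, axis=1, keepdims=True)
    # Perpendicular to the tangent, flipped to point towards the centroid.
    normals = np.stack([-tang[:, 1], tang[:, 0]], axis=1)
    sign = np.sign(np.sum(normals * (centroid - pts), axis=1))
    return pts + distance * normals * sign[:, None]

# Usage: a circular contour of radius 30 mm shrinks to radius ~26 mm
# after a 4-mm inward offset, mimicking the construction in Figure 11a.
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.stack([30 * np.cos(theta), 30 * np.sin(theta)], axis=1)
inner = offset_contour_inwards(circle, 4.0)
```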
Discussion
An algorithm capable of segmenting the exterior of the facial bone surfaces, including the mandible, teeth, maxilla and zygomatic bones, was developed and validated. The algorithm reached an accuracy of 0.5 mm (averaged over all images) from the segmented surface to the manually drawn markers. The segmented surfaces covered an average of 92% of the area (markers) of the reference surfaces. Extensive coverage is a significant indicator of the performance of the algorithm, since all facial bones visible to the human eye in the images, including the most distorted and thus most difficult to segment areas, were included in the reference.
The precision of the developed algorithm was evaluated by computing the RMS distance between the segmented surface and the manually drawn markers. The algorithm reached an average RMS distance of 1.1 mm, with the worst image giving 2.9 mm. This indicates that none of the segmented surfaces had major deviations from the reference. These results are close to the most accurate reported in dental CBCT segmentation, by Kainmueller et al,11 who achieved an average RMS distance of 0.8 mm compared with our 1.1 mm. However, their results were reported for the mandible only, whereas ours include the exterior of the maxilla. Wang et al12 reported excellent Dice ratios (0.91 for the mandible and 0.87 for the maxilla) and an accuracy (average surface distance) similar to that of our method (0.61 vs 0.50 mm), although their average surface distance was reported for the mandible only. The computational cost of their algorithm was very high (5 h), which limits its clinical application.
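The two validation metrics can be sketched as follows, assuming a simplified nearest-vertex distance in place of the true point-to-surface distance used in the paper.

```python
import numpy as np

def surface_distances(reference_points, surface_vertices):
    """Mean and RMS Euclidean distance from each manually drawn reference
    marker to the nearest vertex of the segmented surface."""
    ref = np.asarray(reference_points, float)
    verts = np.asarray(surface_vertices, float)
    # Pairwise distances: one row per reference marker.
    d = np.linalg.norm(ref[:, None, :] - verts[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return nearest.mean(), np.sqrt(np.mean(nearest ** 2))

# Toy example: markers at known distances 0, 3 and 4 mm from a
# two-vertex "surface"; the RMS penalizes the outliers more heavily
# than the mean does.
verts = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
refs = np.array([[0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [10.0, 0.0, 4.0]])
mean_d, rms_d = surface_distances(refs, verts)
```

Because the RMS weights large deviations quadratically, a low RMS alongside a low mean (1.1 mm vs 0.5 mm here) is what rules out major local failures.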
The developed method was found to give consistent results for the entire data set. The visual impression of the algorithm's robustness in noisy images and images with weak edges was very good. The developed algorithm, like almost any other segmentation method, is still somewhat prone to false edges. In dental CBCT, there may be several false edges with an intensity profile and orientation similar to those of the correct edges, owing to inhomogeneity, metal, reconstruction errors or other artefacts. The rather simple logic of our algorithm in choosing the correct edges can be misled in areas with similar, competing edges. This was the cause of the largest deviations shown in Figure 10. Overcoming this with a fully automatic algorithm may not be simple, in dental or other applications, but a feature in which the operator manually selects one or two correct surface patches from a number of proposed ones in difficult images could be added trivially to the current algorithm.
The parameters of our algorithm (the weights of Function (1), the smoothing parameter in Function (2) and the spacing of the parameterization grid) were set to reach the minimum average distance from the reference. The best combination of these was determined by an exhaustive search over a large range using study S1. The selected combination was independently tested with S2. The fact that the algorithm performed better in S2 than in S1 is explained by the improvement in image quality owing to equipment upgrades between S1 and S2. In fact, S1 represents early prototype data, which gave us the opportunity to test our method in a much more challenging environment than today's requirements for image quality would allow. On the other hand, many images in S2 had larger FOVs that also included the zygomatic bones, which are sharper in shape than the structures present in the smaller FOVs of S1. Since the choice of the roughness measure and the value of the smoothing parameter in Function (2) were based solely on S1, we saw somewhat poorer performance in some of the large FOVs of S2. We would likely have achieved the best results by estimating the parameters with a cross-validation scheme using images from both studies, but since we had already used all images in S1 at the early design stages of the algorithm, this was not possible.
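The exhaustive search over the weight triples can be sketched as a grid search constrained to a + b + c = 1. The evaluation function below is a toy placeholder for the actual mean-surface-distance computation on S1, and the grid step is an assumption.

```python
import itertools
import numpy as np

def grid_search(evaluate, step=0.02):
    """Try every weight triple (a, b, c) on a grid with a + b + c = 1 and
    keep the one giving the smallest error from `evaluate`."""
    best, best_err = None, np.inf
    grid = np.arange(0.0, 1.0 + 1e-9, step)
    for a, b in itertools.product(grid, repeat=2):
        c = 1.0 - a - b
        if c < -1e-9:          # outside the simplex; skip
            continue
        err = evaluate(a, b, max(c, 0.0))
        if err < best_err:
            best, best_err = (a, b, max(c, 0.0)), err
    return best, best_err

# Toy evaluation with a known optimum at the reported (0.48, 0.26, 0.26).
toy = lambda a, b, c: (a - 0.48) ** 2 + (b - 0.26) ** 2 + (c - 0.26) ** 2
weights, err = grid_search(toy)
```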
The thin plate spline surface polynomial was chosen to regularize and extend our segmentation surface over areas that have holes or are severely affected by artefacts. The use of the smoothing term with the spline results in a trade-off between the average and RMS distances. In general, using less smoothing gives a better accuracy but worse precision. Aggressively smoothed results would be beneficial for applications such as panoramic and cross-sectional slice reconstruction, where sharp curves could complicate the calculation of the orientations of the cross-sections and the location of the sharp layer. On the other hand, applications such as segmentation for volume rendering could benefit from a less smooth and thus more detailed fit to the target.
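The accuracy-precision trade-off of the smoothing term can be reproduced with an off-the-shelf thin plate spline. The sketch below uses SciPy's RBFInterpolator rather than the authors' implementation; the data and smoothing values are arbitrary.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# A noisy height field z(x, y), fitted once exactly and once with a
# smoothing penalty.
rng = np.random.default_rng(0)
xy = rng.uniform(-1, 1, (200, 2))
z = xy[:, 0] ** 2 + 0.05 * rng.standard_normal(200)

exact = RBFInterpolator(xy, z, kernel='thin_plate_spline', smoothing=0.0)
smooth = RBFInterpolator(xy, z, kernel='thin_plate_spline', smoothing=10.0)

# The exact fit reproduces the noisy samples (better accuracy, sharper
# curvature); the smoothed fit deviates at the samples but gives a less
# wrinkled surface (better precision for slice reconstruction).
resid_exact = np.abs(exact(xy) - z).max()
resid_smooth = np.abs(smooth(xy) - z).max()
```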
The primary motivation of the work was to develop a segmentation method to be used in the automatic reconstruction of cross-sectional and panoramic views of the facial and dental bones. This article presents an example where this is performed by using the segmentation of the exterior of the dental and facial bone structures. However, the inner structures of the mouth and sinuses are probably of equal clinical importance. Although not presented in this article, the algorithm was also tried on the interior surfaces of the mouth. These tests suggest that by readjusting the parameters, it is possible to segment the interior with similar accuracy as we now report for the exterior. Ideally, the pipeline should be configured to segment both the inner and outer surfaces simultaneously, exploiting the obvious similarities in shape, orientation and location of these structures. This is an obvious topic for further research. Segmenting the maxillary sinuses is a rather different kind of challenge for which methods exist.19,20 Although not tested, we believe that our algorithm has potential for segmenting the sinuses also.
A key feature of the algorithm is the use of parameterization to define the topology and basic shape properties of the surface representing the target. A two-dimensional, third-degree polynomial fitted to the facial skeleton was chosen for its resemblance to the geometry of traditional panoramic imaging. This also led to a very intuitive relation between the parameterized and real-world volume spaces and a very simple mathematical description. In other applications (with an unknown pose, for example), this would be a limitation, since it results in a poor ability to capture edges parallel to the axial planes. In our case, moving to a full three-dimensional surface parameterization would have resulted in significant and complicating changes with only marginal potential improvements in segmentation accuracy. We have successfully tested the algorithm in segmenting another target scanned with another modality by using a genus-0 closed surface for parameterization.2
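A minimal sketch of this kind of parameterization, assuming the surface takes the form y = f(x, z) with f a third-degree bivariate polynomial fitted by least squares (the exact formulation used by the authors is not restated here):

```python
import numpy as np

def fit_poly3(x, z, y):
    """Least-squares fit of y = f(x, z) with all monomials x^i * z^j,
    i + j <= 3 (10 coefficients)."""
    terms = [(i, j) for i in range(4) for j in range(4) if i + j <= 3]
    A = np.stack([x ** i * z ** j for i, j in terms], axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return terms, coef

def eval_poly3(terms, coef, x, z):
    return sum(c * x ** i * z ** j for (i, j), c in zip(terms, coef))

# Usage: recover a known cubic sheet from scattered samples, the way the
# parameterization surface would be fitted to bone-edge points.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 300)
zc = rng.uniform(-1, 1, 300)
y = 1.0 + 0.5 * x - 0.2 * x ** 3 + 0.3 * zc
terms, coef = fit_poly3(x, zc, y)
err = np.abs(eval_poly3(terms, coef, x, zc) - y).max()
```

Because the surface is a single-valued function of (x, z), it cannot represent edges parallel to the axial planes, which is exactly the limitation noted above.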
A short execution time is a key property of any algorithm intended for clinical use. We believe that a runtime of 1 min, and less when implemented efficiently on modern hardware, would give a dentist or oral surgeon a smooth workflow, for example, when inspecting the processed images during a single patient visit. Although only algorithms that can be implemented efficiently were chosen for our pipeline, the presented implementation was intended only to demonstrate the feasibility of our approach and was not optimized for performance. The current main bottlenecks of the pipeline, the thin plate spline computation and the Gaussian + Sobel filtering, could be improved significantly with parallelization; the nature of those tasks readily permits this. The same applies to the computation of the voxel energies. We believe that exploiting the power of modern graphical processing unit computing would reduce the execution time to a fraction of that of the current version.
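The filtering bottleneck mentioned above can be sketched as a Gaussian-then-Sobel gradient-magnitude computation (an assumed form; the authors' exact filter settings are not restated here). Each axis derivative is independent of the others, which is what makes the step straightforward to parallelize.

```python
import numpy as np
from scipy import ndimage

def gradient_magnitude(volume, sigma=1.0):
    """Gaussian denoising followed by Sobel derivatives along each axis,
    combined into an edge-gradient magnitude volume."""
    smoothed = ndimage.gaussian_filter(volume.astype(float), sigma)
    grads = [ndimage.sobel(smoothed, axis=ax) for ax in range(volume.ndim)]
    return np.sqrt(sum(g ** 2 for g in grads))

# Usage: a step edge along the first axis produces the strongest
# response at the step and near-zero response far from it.
vol = np.zeros((16, 16, 16))
vol[8:, :, :] = 100.0
mag = gradient_magnitude(vol, sigma=1.0)
```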
Conclusions
In conclusion, we developed an algorithm for segmenting the exterior of the facial skeleton from CBCT images, yielding accuracy similar to that previously reported for the mandible, a much easier target. We achieved consistent results throughout the data set with algorithms fast enough to be implemented for clinical use.
Acknowledgments
We would like to thank Dr Jörg Mudrak for valuable comments on the manuscript, as well as Dr Mark van Gils, Dr Jyrki Lötjönen and Ms Tiina Takalokastari for their mentoring, comments and proofreading of the manuscript.
Funding
We would like to thank Tekes, the Finnish Funding Agency for Innovation for providing funding for this work. Kari Antila has also been supported by the Finnish Cultural Foundation and the Instrumentarium Science Foundation.
Contributor Information
Kari Antila, Email: kari.antila@vtt.fi.
Mikko Lilja, Email: mikko.lilja@aalto.fi.
Martti Kalke, Email: martti.kalke@palodexgroup.com.
References
- 1.De Vos W, Casselman J, Swennen GR. Cone-beam computerized tomography (CBCT) imaging of the oral and maxillofacial region: a systematic review of the literature. Int J Oral Maxillofac Surg 2009; 38: 609–25. doi: https://doi.org/10.1016/j.ijom.2009.02.028 [DOI] [PubMed] [Google Scholar]
- 2.Antila K, Lilja M, Kalke M, Lotjonen J. Automatic extraction of mandibular bone geometry for anatomy-based synthetization of radiographs. Conf Proc IEEE Eng Med Biol Soc 2008; 2008: 490–3. doi: https://doi.org/10.1109/IEMBS.2008.4649197 [DOI] [PubMed] [Google Scholar]
- 3.Dreiseidler T, Alarabi N, Ritter L, Rothamel D, Scheer M, Zöller JE, et al. A comparison of multislice computerized tomography, cone-beam computerized tomography, and single photon emission computerized tomography for the assessment of bone invasion by oral malignancies. Oral Surg Oral Med Oral Pathol Oral Radiol Endod 2011; 112: 367–74. doi: https://doi.org/10.1016/j.tripleo.2011.04.001 [DOI] [PubMed] [Google Scholar]
- 4.Monteiro BM, Nobrega Filho DS, Lopes Pde M, de Sales MA. Impact of image filters and observations parameters in CBCT for identification of mandibular osteolytic lesions. Int J Dent 2012; 2012: 239306. doi: https://doi.org/10.1155/2012/239306 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Closmann JJ, Schmidt BL. The use of cone beam computed tomography as an aid in evaluating and treatment planning for mandibular cancer. J Oral Maxillofacial Surg 2007; 65: 766–71. doi: https://doi.org/10.1016/j.joms.2005.12.053 [DOI] [PubMed] [Google Scholar]
- 6.Schulze R, Heil U, Gross D, Bruellmann DD, Dranischnikow E, Schwanecke U, et al. Artefacts in CBCT: a review. Dentomaxillofac Radiol 2011; 40: 265–73. doi: https://doi.org/10.1259/dmfr/30642039 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Lamecker H, Zachow S, Wittmers A, Weber B, Hege H, Elsholtz B, et al. Automatic segmentation of mandibles in low-dose CT-data. Int J Comput Assist Radiol Surg 2006; 1: 393–5. [Google Scholar]
- 8.Cootes TF, Taylor CJ, Cooper DH, Graham J. Active shape models—their training and application. Comput Vis Image Und 1995; 61: 38–59. doi: https://doi.org/10.1006/cviu.1995.1004 [Google Scholar]
- 9.Rueda S, Gil JA, Pichery R, Alcañiz M. Automatic segmentation of jaw tissues in CT using active appearance models and semi-automatic landmarking. Med Image Comput Comput Assist Interv 2006; 9(Pt 1): 167–74. [DOI] [PubMed] [Google Scholar]
- 10.Cootes TF, Edwards GJ, Taylor CJ. Active appearance models. IEEE Trans Pattern Anal Mach Intell 2001; 23: 681–5. doi: https://doi.org/10.1109/34.927467 [Google Scholar]
- 11.Kainmueller D, Lamecker H, Seim H, Zinser M, Zachow S. Automatic extraction of mandibular nerve and bone from cone-beam CT data. Med Image Comput Comput Assist Interv 2009; 12(Pt 2): 76–83. [DOI] [PubMed] [Google Scholar]
- 12.Wang L, Chen KC, Gao Y, Shi F, Liao S, Li G, et al. Automated bone segmentation from dental CBCT images using patch-based sparse representation and convex optimization. Med Phys 2014; 41: 043503. doi: https://doi.org/10.1118/1.4868455 [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Collins DL, Neelin P, Peters TM, Evans AC. Automatic 3D intersubject registration of MR volumetric data in standardized Talairach space. J Comput Assist Tomogr 1994; 18: 192–205. doi: https://doi.org/10.1097/00004728-199403000-00005 [PubMed] [Google Scholar]
- 14.Lötjönen JM, Wolz R, Koikkalainen JR, Thurfjell L, Waldemar G, Soininen H, et al. ; Alzheimer's Disease Neuroimaging Initiative. Fast and robust multi-atlas segmentation of brain magnetic resonance images. Neuroimage 2010; 49: 2352–65. doi: https://doi.org/10.1016/j.neuroimage.2009.10.026 [DOI] [PubMed] [Google Scholar]
- 15.Mazziotta JC, Toga AW, Evans A, Fox P, Lancaster J. A probabilistic atlas of the human brain: theory and rationale for its development. The International Consortium for Brain Mapping (ICBM). Neuroimage 1995; 2: 89–101. doi: https://doi.org/10.1006/nimg.1995.1012 [DOI] [PubMed] [Google Scholar]
- 16.Pianykh OS. Analytically tractable case of fuzzy c-means clustering. Pattern Recogn 2006; 39: 35–46. doi: https://doi.org/10.1016/j.patcog.2005.06.005 [Google Scholar]
- 17.Brechbühler C, Gerig G, Kübler O. Parametrization of closed surfaces for 3-D shape description. Comput Vis Image Und 1995; 61: 154–70. doi: https://doi.org/10.1006/cviu.1995.1013 [Google Scholar]
- 18.Bookstein FL. Principal warps: thin-plate splines and the decomposition of deformations. IEEE Trans Pattern Anal Mach Intell 1989; 11: 567–85. doi: https://doi.org/10.1109/34.24792 [Google Scholar]
- 19.Descoteaux M, Audette M, Chinzei K, Siddiqi K. Bone enhancement filtering: application to sinus bone segmentation and simulation of pituitary surgery. Med Image Comput Comput Assist Interv 2005; 8(Pt 1): 9–16. [DOI] [PubMed] [Google Scholar]
- 20.Last C, Winkelbach S, Wahl FM, Eichhorn KWG, Bootz F. A model-based approach to the segmentation of nasal cavity and paranasal sinus boundaries. In: Pattern Recognition. Lecture Notes in Computer Science, vol. 6376; 2010. pp. 333–42. [Google Scholar]