Skip to main content
Computational and Mathematical Methods in Medicine logoLink to Computational and Mathematical Methods in Medicine
. 2021 May 16;2021:3772129. doi: 10.1155/2021/3772129

Automatic Segmentation of Left Ventricle in Echocardiography Based on YOLOv3 Model to Achieve Constraint and Positioning

Zhemin Zhuang 1,2, Pengcheng Jin 1, Alex Noel Joseph Raj 2, Ye Yuan 1, Shuxin Zhuang 1,
PMCID: PMC8143884  PMID: 34055033

Abstract

Cardiovascular disease (CVD) is the most common type of disease and has a high fatality rate in humans. Early diagnosis is critical for the prognosis of CVD. Before using myocardial tissue strain, strain rate, and other indicators to evaluate and analyze cardiac function, accurate segmentation of the left ventricle (LV) endocardium is vital for ensuring the accuracy of subsequent diagnosis. For accurate segmentation of the LV endocardium, this paper proposes the extraction of the LV region features based on the YOLOv3 model to locate the positions of the apex and bottom of the LV, as well as that of the LV region; thereafter, the subimages of the LV can be obtained, and based on the Markov random field (MRF) model, preliminary identification and binarization of the myocardium of the LV subimages can be realized. Finally, under the constraints of the three aforementioned positions of the LV, precise segmentation and extraction of the LV endocardium can be achieved using nonlinear least-squares curve fitting and edge approximation. The experiments show that the proposed segmentation evaluation indices of the method, including computation speed (fps), Dice, mean absolute distance (MAD), and Hausdorff distance (HD), can reach 2.1–2.25 fps, 93.57 ± 1.97%, 2.57 ± 0.89 mm, and 6.68 ± 1.78 mm, respectively. This indicates that the suggested method has better segmentation accuracy and robustness than existing techniques.

1. Introduction

Cardiovascular diseases (CVDs) are one of the most common diseases affecting humans. “Global Burden of Cardiovascular Diseases and Risk Factors, 1990–2019,” published in [1], shows that the incidence and mortality of CVD worldwide have been increasing since 1990 and that the mortality of CVD ranks first and is far higher than that of other diseases. Therefore, early detection and diagnosis of cardiac disease through various means is crucial for reducing the prevalence and mortality of CVD and improving the quality of life of patients [2].

Compared with X-ray coronary angiography, myocardial contrast echocardiography, computed tomography, and magnetic resonance imaging, the use of ultrasound for the screening and diagnosis of heart function and disease has great advantages. Using an ultrasound instrument, the heart and blood vessels, the movement of the ventricular wall, and the opening and closing of the valve can be observed dynamically in real time through flexible operation from multiple directions and angles. In addition, ultrasound has many advantages, such as safety and noninvasiveness, high diagnostic accuracy, and rapid inspection, and has become one of the most used and important examination methods for heart disease.

At present, the diagnosis of heart diseases based on ultrasound technology usually focuses on the analysis of the left ventricle (LV). The LV is responsible for blood supply to the body. Based on the changes in the LV, indicators such as LV end-diastolic volume, LV end-systolic volume, LV ejection fraction (EF), and LV stroke volume can be obtained. To obtain the indicators above, accurate positioning and segmentation of the LV on echocardiography are very important.

Clinically, the segmentation methods for LV ultrasound images can be classified into manual and automatic methods. The manual segmentation method requires the user to outline the region of interest manually. Marking the position or contour of the LV manually is tedious and time-consuming, and there are subjective differences among different observers. The automatic segmentation method is superior to the manual segmentation method [3, 4]. Usually, the automatic segmentation method of LV ultrasound images includes two steps.

First, it is necessary to determine the position of the LV in the ultrasound images. Methods such as scale-invariant feature transformation [5] and histogram of oriented gradient [6] can be used to determine the position of the LV. However, the shape and appearance of the LV corresponding to different individuals are usually different, so these methods cannot accurately identify the position of the LV, and the segmentation accuracy of LV is also affected. Recently, the application of deep learning models for target detection and localization has attracted increasing attention [7, 8]. Compared with the faster R-CNN model [9] and the single-shot multibox detector model [10], the YOLOv3 model [11] has a higher detection speed and accuracy. Therefore, a method based on the YOLOv3 model is proposed herein for accurate positioning and segmentation of the LV.

Second, after the LV in the ultrasound image is accurately located, the LV can be segmented. Methods such as structured random forest based on machine learning [12] have been proposed for LV segmentation; however, such methods require manual selection of space features. Dong et al. [13] developed a deep fusion network and deformable model to achieve LV segmentation in 3-D echocardiography. Smistad et al. [14] successfully segmented the LV in two-dimensional ultrasound images based on the U-Net method. Oktay et al. [15] further extended the U-Net model to improve the accuracy of LV segmentation. However, these methods usually require significant morphological features or prior knowledge and have the disadvantages of poor real-time performance and high computing power requirements. Traditional image processing methods, such as a motion-based method (Kalman filter) [16], deformable models (BEAS, level-set) [17, 18], graph-based approach (graphcut) [19], active appearance model [20], and atlas-based method [21], have been proven to have high segmentation speed and robustness in heart image segmentation. Therefore, the YOLOv3 model and the traditional statistical shape model are combined in this study to achieve fast and accurate LV segmentation in ultrasound images.

Herein, an automatic segmentation method based on the YOLOv3 model to satisfy the relevant constraints and achieve appropriate positioning is proposed for accurate segmentation of the LV endocardium. The results of experiments conducted using the proposed method show that the segmentation evaluation indices, including the computation speed (fps), Dice, mean absolute distance (MAD), and Hausdorff distance (HD), can reach 2.1–2.25 fps, 93.57 ± 1.97%, 2.57 ± 0.89 mm, and 6.68 ± 1.78 mm, respectively.

2. Method

To obtain clinical indicators such as EF, strain, and strain rate of the LV on echocardiography, accurate segmentation of the LV is crucial. In this study, the YOLOv3 model is first used to determine the three positions of the apex and bottom of the LV, as well as the location of the LV region. Then, based on the Markov random field (MRF) model with the iterated conditional mode (ICM), preliminary identification and binarization of the myocardium of the LV subimages are performed, and under the three constraint points of the LV, the left and right parts of the myocardium in the LV subimages are located. Finally, when approaching the edge of the myocardium, the B-spline method is used to smooth the edge of the endocardium, and then, accurate segmentation and extraction of the LV endocardium are achieved. Speckle noise and artifacts in ultrasound images can lead to the loss of borders and edges during image segmentation; therefore, when approaching the LV endocardium, a morphological mask is applied to eliminate the interference from speckle noise and edge artifacts inside the LV cavity. Figure 1 presents the block diagram of the proposed technique.

Figure 1.

Figure 1

Block diagram of the proposed method.

3. Segmentation of LV Endocardium Based on YOLOv3 for Positioning and Restraint

3.1. LV Localization and Collection of Restraint Points Based on the YOLOv3 Model

There are large differences in the shape of the LV in different echocardiogram frames. In addition, due to the interference of the mitral valve, as well as the influence of noise, artifacts, and frame-to-frame drift, traditional methods cannot locate the LV position well or extract the endocardium accurately. Therefore, this study proposes to use the target detection model YOLOv3 to realize the positioning of the LV region and the three ventricular constraint points in echocardiography. From Figure 2, the YOLOv3 model consists of the following: a general feature extraction network based on the Darknet-53 network, a multibranch deep feature extraction network, and a multiscale target area bounding box detection network.

Figure 2.

Figure 2

Detection of LV based on YOLOv3.

In Figure 2, for the general feature extraction network, the convolutional network (Conv), batch normalization (BN) layer, and linear activation function (Leaky ReLU) constitute Darknetconv2d BN Leaky (DBL), which extracts the general features of cardiac ultrasound images. The DBL is also the basic block of deep feature extraction networks. Concurrently, to solve problems such as the disappearance of gradients due to the deep network structure, DarkNet53 uses the jump structure to form Res_unit, Resblock_body, and Res_Module in multiple DBLs.

For the deep-level feature extraction network, YOLOv3 forms a multibranch network and a Concat layer through the route structure. Simultaneously, YOLOv3 uses a bilinear upsampling layer to expand the feature map to form three branch networks for locating target areas of three different scales; through these three branch networks, the feature matrix of the LV ultrasound image can be obtained. In practice, it is difficult to obtain enough labeled LV images, and to avoid overfitting, transfer learning is applied in this study to train the entire feature extraction network: first, load the weight parameters obtained based on the VOC dataset [22] and then fine-tune the weight parameters of the feature extraction network using the labeled heart dataset.

After the feature matrices of the LV ultrasound images of the heart are obtained, they are input into the detection network to obtain the positioning matrices. The YOLOv3 model divides the original input images into three types of S × S grids (i.e., 13 × 13, 26 × 26, and 52 × 52) for positioning the target area; hence, three types of positioning matrices with different dimensions are obtained. As shown in Figure 2, the y1 matrix corresponding to a 13 × 13 grid is used to detect a large target area and is used to locate the LV area in this study; the y2 and y3 matrices correspond to the 26 × 26 and 52 × 52 grids, respectively, which are used to locate three ventricular restraint points in this study.

Each grid corresponds to a (B + O) × anchors-dimensional positioning vector, where B is the bounding box of the target area, composed of (bx, by, bw, bh, bc), corresponding to the center abscissa, ordinate, width, height, and confidence from the center of the target area, respectively, and O is the number of types of the target area. In this study, there are four types of targets: the LV region and three ventricular constraint points. anchors are the number of anchor frames in the positioning matrix; the number of anchor frames with three scales in this study is three.

The anchor box is used to describe the length and width of the target area in this study, and the relationship between the anchor box and bounding box is shown in Equation (1).

bx=σtx+Cx,by=σty+Cy,bw=Pwetw,bh=Pheth, (1)

where Cx, Cy, Pw, and Ph are the abscissa, ordinate, and the width and height of the upper left corner of the grid where the center point of the anchor frame is located, respectively; σ(∙) is the sigmoid activation function; tx and ty are the abscissa and ordinate offsets of the center of the anchor frame; and tw and th are the changes in the length and width of the anchor frame.

In this study, the target area in the training set is divided into nine anchor boxes using the K-means [23] clustering algorithm, and each anchor box is represented as (w, h). For these anchor boxes, three small anchor boxes ((0 × 0), (11 × 13), and (11 × 15)) (i.e., the y1 matrix in Figure 2) are used to locate the LV area: three medium anchor boxes ((13 × 15), (14 × 20), and (15 × 17)), and three large anchor boxes ((16 × 22), (110 × 218), and (146 × 323)) (i.e., the y2 and y3 matrices in Figure 2) are used for the positioning of three constraint points.

3.2. Extraction of Endocardium Based on Constraint Points

The three positions of the apex and bottom of the LV, as well as the positioning of the LV area, can be found by the YOLOv3 model mentioned above. Then, based on the MRF model, the binarization and preliminary identification of the LV myocardial region in the subimages can be performed. Under the constraints of the three position points of the apex and bottom of the LV, curve fitting was performed on the left and right myocardial parts in the LV subimages, and the edge of the endocardium was approximated to realize accurate segmentation of the LV endocardium, and the B-spline method was also employed to smooth the edge of the LV endocardium.

3.2.1. Binarization of LV Myocardium Based on MRF Model

Before the LV myocardial images are binarized, to reduce the influence of speckle and noise in echocardiograms, the echocardiograms are denoised on the premise of preserving the characteristics of the LV myocardium. First, the LV subimages are smoothed via 2-D adaptive Wiener noise-removal filtering [24], the local neighborhood size is set to (5 × 5), and then, the pixel-wise Wiener filter can be constructed using Equation (2).

bn1,n2=μ+σ2ν2σ2an1,n2μ, (2)

where μ and σ2 are the local mean and variance around each pixel, respectively, and ν2 is the variance of the noise. The Wiener filter adjusts itself to the local image variance, i.e., when the variance is large, a minor smoothing operation is performed by the Wiener filter whereas when variance is small, the Wiener filter performs a major smoothing.

The MRF model utilizes the correlation between the upper and lower adjacent pixels in the image; thus, the spatial connectivity and edge smoothness of the binarized region can be improved. Therefore, an MRF model based on the ICM algorithm was used in this study to binarize and initially identify the myocardial region.

Assume that X and Y are random fields on a two-dimensional plane, where X = {xi, i = 1, 2, 3, ⋯, M × N} represents the input image and Y = {yi, i = 1, 2, 3, ⋯, M × N} represents the labeling field, where M and N represent the rows and columns of the image, respectively. In this study, the K-means clustering method was used to obtain the initial marker field, and the category was set to 2.

Considering the input images as an MRF model, the image segmentation problem can be transformed into an optimization problem using the ICM algorithm. According to the Bayesian principle, the posterior probability distribution of MRF is as follows:

PX=xY=y=PY=yX=xPX=xPY=y, (3)

where P(Y = y) is a constant for a given LV subimage, P(X = x) is the prior probability of the label domain, and P(Y = y | X = x) is the likelihood function.

When binarizing the LV images, the optimal labels can be obtained by maximizing the posterior probability of Equation (3).

PX=xY=yMAP=argmaxPY=yX=xPX=x. (4)

The prior probability P(X = x) in the MRF neighborhood system can be expressed using the Gibbs distribution function [25]. Then, based on the Gibbs distribution, the prior probability P(X = x) of the marker field can be characterized as follows:

PX=x=1ZexpExT, (5)

where Z = ∑xΩexp[−E(x)/T] is a normalized constant, E(x) = ∑cSVc(x) is the energy function, Vc(x) is the potential function, and T is the temperature parameter, which is usually set to 1 [26].

Similarly, the posterior probability P(X = x | Y = y) can also be expressed by an energy function, as shown in Equation (6).

PX=xY=y=1ZexpExyT. (6)

Substituting Equations (5) and (6) into Equation (4), and taking the logarithms on both sides of the equation simultaneously, the product form is transformed into a summation form, and the result is as follows:

Exy=argmaxEyx+Ex, (7)

where E(x | y) represents the minimized energy function, E(y | x) is the likelihood function energy of pixel x, and E(x) is the prior probability energy corresponding to pixel x. Therefore, the final energy relationship can be expressed as Equation (8).

ExFy=Eyx+Ex, (8)

where xF is the final segmentation mark.

The ICM algorithm is used to optimize Equation (8), i.e., to minimize the energy function E(xF | y). Finally, the binarization results of LV myocardium images can be obtained and are shown in Figure 3. As shown in Figure 3, the LV myocardium can be clearly observed after the original LV images are binarized using the MRF model.

Figure 3.

Figure 3

Binarization results of LV endocardium images based on MRF: (a–d) the original frames extracted at the equal interval from the same echocardiogram and (e–h) the corresponding binarization results.

3.2.2. Segmentation and Extraction of LV Myocardium Based on Position Constraints

After binarizing the original LV images based on the MRF model, the positioning curve of the LV myocardium will be fitted based on the position constraints. Firstly, divide the LV into the left and right regions and then use the nonlinear least squares (NLS) method to perform curve fitting on the two regions. Because only the LV endocardium is approximated in this study, three constraint points are used to limit and constrain the fitted curve.

In this study, a polynomial model based on the NLS method is employed to fit the left and right segments, respectively, as shown in Equation (9).

Fx=a1xm+a2xm1++amx+am+1, (9)

where a1, a2, ⋯, am+1 represents the fitting coefficient of the polynomial, and m is the polynomial degree; in this study, the polynomial degree m is set to 3.

For a given set of coordinate points {(xi, yi): i = 1, 2, ⋯, n}, the polynomial fitting error equation can be written as Equation (10).

V=BXL, (10)

where

B=x1mx2mxnmx1m1x2m1xnm1x1x2xn,X=a1a2am+1,L=Fx1Fx2Fxn. (11)

Based on the NLS method, the estimated value of X can be obtained as follows:

X=BTB1BTL. (12)

Substituting the result obtained from Equation (12) into Equation (9), the LV myocardial positioning fitting curve can be obtained, as shown in Figure 4, where the red boxes in Figure 4(a) are the constraint points obtained by the YOLOv3 model. The obtained three constraint points are used for the constraint of the myocardial fitting curve. Figure 4(b) is the positioning fitting curve without restraint, and Figure 4(c) is the constrained positioning fitting curve.

Figure 4.

Figure 4

Fitting curve results of the LV subgraph in different sequences without and with constraints.

As shown in Figure 4, under the three constraint points obtained based on the YOLOv3 model, the positioning curve of the myocardium can be accurately determined in the LV.

To mitigate the influence of speckle noise around the LV myocardium, the binary LV images obtained based on the MRF model are processed using the morphological masking method. After the initial positioning of the LV myocardium is achieved, the endocardium is approached based on the three constraint points, the edge of the endocardium is smoothed by the B-spline method [27], and the segmentation and extraction of the LV endocardium can be realized as shown in Figure 5.

Figure 5.

Figure 5

Extraction and segmentation of LV endocardium. (a) Result of binarization of the LV image using MRF model. (b) Morphological mask generated according to the fitted positioning curve. (c) Binary myocardial image obtained after mask processing. (d) Approximation result of the endocardium based on the three constraint points. (e) Result of smoothing the myocardium based on the B-spline method.

4. Results

The cardiac ultrasound imaging data used in this study were provided by the Ultrasound Imaging Department of the First Affiliated Hospital of Medical College of Shantou University.

4.1. Evaluation Criteria

For target detection tasks, the average precision (AP) indicator [28] is commonly used to evaluate whether a model can detect a target class accurately. The AP is computed as the intersection of union (IOU) between the detection bounding box and the label bounding box. When the IOU of the detection bounding box and the label bounding box is greater than the set IOU threshold, it is considered that the model detects the target correctly. Subsequently, the AP value of the target class is calculated. In practice, the IOU threshold is usually set to 0.5, and the corresponding AP indicator is called AP50. For a model used to detect multiple target classes, the mean average precision (mAP) can comprehensively evaluate the performance of the model, i.e., compute the average value of the AP values of all target classes.

A precision-recall (PR) curve [29] is shown with precision and recall as the vertical and horizontal axis, respectively. Also the size of the area under the PR curve can comprehensively reflect the performance of a model for detecting the target.

AP can be expressed as

AP=01PRdR, (13)

where P and R represents the precision and recall rates, respectively. The precision and recall in the PR curve are calculated using Equations (14) and (15), respectively.

precision=TPTP+FP, (14)
recall=TPTP+FN, (15)

where TP, FP, and FN represent the true positive, the false positive and the false negative, respectively.

The Dice coefficient [30], MAD [31], and HD [32] parameters are used to evaluate the segmentation results of the LV endocardium:

DiceS,G=2AreaSGAreaS+AreaG,MADA,B=121mi=1mdai,B+1nj=1ndbi,A,HDA,B=maxmaxidai,B,maxjdbj,A, (16)

where S represents the myocardial area data obtained by different binarization methods, G is the gold standard data of the myocardial area, A = {a1, a2, ⋯, am} is the endomyocardial edge data obtained by the method proposed in this paper, and B = {b1, b2, ⋯, bn} is the gold standard endomyocardial edge data.

4.2. LV and Restraint Point Positioning Model Based on YOLOv3

Table 1 illustrates the performance of the YOLOv3-based LV and bounding box positioning model on the test dataset using AP50. From Table 1, all the AP50 values of the four target regions formed by the LV and the three bounding boxes are above 92%, and the mAP value reaches 95.57%, which indicates that the model designed in this study can detect the LV and the three bounding box areas well and meet the requirements of LV myocardium segmentation.

Table 1.

Evaluation results of the YOLOv3-based LV and three bounding box positioning model using AP50.

LV Left_down Right_down Top
AP 100.00% 92.33% 95.44% 94.50%

The PR curve, which can intuitively evaluate whether the model can detect a target class well, is drawn based on the precision-recall value pairs calculated from different confidence values when the model detects a target class. The value of the area enclosed by the PR curve is the AP value.

The PR curve of the model on the test dataset is shown in Figure 6. It can also be seen from Figure 6 that the area under the four PR curves is sufficiently large, which indicates that the performance of the model is satisfactory.

Figure 6.

Figure 6

PR curve of the LV identification and constraint box based on the proposed method in this paper. (VS: ventriculus sinister; Lower left: the constraint point in the lower left corner; Lower right: the constraint point in the lower right corner; and Top: the constraint point on the top of the myocardial wall).

4.3. LV Binarization

To analyze the effect of the MRF model on the binarization of the ultrasound LV images, the proposed method, traditional Otsu method, and K-means clustering algorithms were used to binarize the same LV image for comparison; the binarization results obtained by different methods were also compared with the gold standard, and the results are shown in Figure 7.

Figure 7.

Figure 7

Binarization results of ultrasonic LV images using the Otsu method, K-means clustering method, and MRF model. The area enclosed by the blue line in (a) and (e) is the gold standard for the LV myocardium. (b, f) The binarization results obtained by using the Otsu method. (c, g) The binarization results obtained by using the K-means clustering algorithm. (d, h) The binarization results obtained by using the method proposed in this paper.

From Figure 7, it can be verified that the myocardial area obtained using the proposed model is closest to the gold standard.

For quantitative analysis, the Dice index is used for evaluation. The LV myocardial regions obtained by the Otsu method, K-means clustering algorithm, and the method based on MRF proposed in this paper are compared with the gold standard myocardial region obtained by manual segmentation by senior clinicians, and the corresponding Dice indices are obtained, and the results are shown in Table 2.

Table 2.

Comparison of binarization results obtained by different methods and gold standards.

Otsu K-means MRF
Dice 0.58 ± 0.07 0.59 ± 0.06 0.88 ± 0.03

It can be seen from Table 2 that the Dice value corresponding to the proposed binarization method based on the MRF model is 0.88 ± 0.03, which is far greater than the Dice values corresponding to the Otsu method and K-means clustering algorithm, namely, the performance of the binarization method based on the MRF model proposed in this paper is much better than the other two methods. Therefore, the binarization method proposed in this study can fully meet the requirements for extraction of the LV myocardial region.

4.4. LV Endocardium Segmentation

In order to evaluate the performance of the method proposed in this paper, the same LV ultrasound images were segmented using different methods (listed in Table 3) along with the proposed method, and the segmentation results by different methods were compared with the gold standard obtained by manual segmentation by cardiologists, and five evaluation indicators including training set size, computation speed, Dice coefficient, MAD, and HD were used to evaluate the segmentation results. The results are shown in Table 3.

Table 3.

Comparison of endocardial segmentation results by different methods.

Methods Training set size (frame) Computation speed (fps) Dice (%) MAD (mm) HD (mm)
Hansson et al. [33] 0 0.3 2.58 ± 0.85
Qin et al. [34] 450 0.01 90.8 ± 1.7 2.0 ± 0.42 6.86 ± 1.71
Carneiro and Nascimento [35] 496 0.2 1.94 ± 0.51
The method without constraints 252 1.5–1.8 80.103 ± 2.13 7.06 ± 0.85 10.34 ± 3.51
Proposed method 252 2.1–2.25 93.578 ± 1.97 2.57 ± 0.89 6.68 ± 1.78

It can be seen from Table 3 that the proposed segmentation technique is superior to other methods in terms of various evaluation indicators. In particular, for the computation speed index, the method proposed in this paper has a great advantage, and owing to the use of transfer learning, the method uses less training data to obtain a better segmentation effect.

5. Discussion

In this paper, an automatic LV segmentation method based on the YOLOv3 model is proposed to determine the constraints and positioning. Through the YOLOv3 model, the three positions of the apex and bottom of the LV and LV area are positioned, and based on the MRF model, the LV myocardium subimages are binarized; under the limitation of the three constraint points of the LV, combined with NLS curve fitting and B-spline smoothing, the accurate segmentation and extraction of the LV can be realized. Experiments show that the suggested method can accurately and automatically identify and segment the LV in cardiac ultrasound images.

In the experimental section, a comparison is presented with other segmentation models. Hansson et al. [33] proposed an unsupervised segmentation method based on a Bayesian probability map. Although MADs corresponding to the aforementioned method and the method proposed herein are similar (which means that the two methods are similar in terms of segmentation accuracy), the computation speed of the latter is much higher than that of the former (see the computation speed indicator). The level set segmentation method proposed by Qin et al. [34] is unsupervised, does not require a training dataset, and can yield accurate segmentation results. However, owing to the need for sparse matrix transformation to identify the right ventricle, this method requires many training sets and a large processing time; in addition, it is necessary to readjust the parameters according to the movement of the heart, which will lead to unstable results. Compared with that of the aforementioned method, the MAD of the method proposed herein this paper is slightly lower, but the Dice value is better. In fact, the method proposed by Qin et al. is similar to our method in terms of segmentation accuracy. However, the method proposed herein is far superior in terms of the computation speed indicator. The method proposed by Carneiro and Nascimento [35] uses a deep neural network method to segment the systolic and end-diastolic contours and achieves high segmentation accuracy; however, a large number of datasets is required, and thus, a set of 496 images had to be established. Compared with this method, the method proposed herein only requires a small amount of data (252 frames) to obtain a suitable positioning effect; in terms of calculation speed, the method proposed herein this paper is significantly better than that proposed by Carneiro and Nascimento (see the corresponding computation speed index in Table 3). Finally, according to the computation speed, Dice, MAD, and HD, the automatic LV segmentation method based on constraints and positioning are better in the proposed technique than unconstrained positioning segmentation methods in terms of segmentation accuracy and computation speed.

In summary, if the segmentation accuracy indices (i.e., Dice, MAD, and HD) are considered, the method proposed is not the best, but it can be said that the method proposed in this paper is one of the best methods in terms of segmentation accuracy; however, if the computation speed, data volume, and segmentation accuracy are considered comprehensively, it can be said that the method proposed in this paper is the best. Compared with other methods, the proposed segmentation technique has significant advantages in terms of computation speed and the amount of data required. The method proposed in this study uses fewer data to obtain a good segmentation effect. It is well known that it is very difficult to obtain medical data in practice, thus obtaining a good segmentation effect based on a small amount of data is conducive to the clinical application of the algorithm. The computation speed is another important factor that affects the application of algorithms in clinical practice, and the algorithm proposed has significant advantages in terms of computation speed over the other methods.

6. Conclusions

Here, an automatic LV segmentation method based on the YOLOv3 model for constraint and positioning determination is proposed. Through the YOLOv3 model, the three positions of the apex and bottom of the LV and LV area are positioned, and based on the MRF model, the LV myocardium subimages are binarized; under the limitation of the three constraint points of the LV, combined with NLS curve fitting and B-spline smoothing, the accurate segmentation and extraction of the LV can be realized. Experiments show that the method can accurately and automatically identify and segment the LV in cardiac ultrasound images, and related indicators such as fps, Dice, MAD, and HD can reach 2.1–2.25 fps, 93.57 ± 1.97%, 2.57 ± 0.89 mm, and 6.68 ± 1.78 mm, respectively. Compared with other methods, the proposed method has a better segmentation accuracy and robustness. In particular, our method has a high computational speed, which is very important for real-time evaluation of cardiac function based on echocardiography. In addition, our method uses less training data to achieve better segmentation results. In short, our method can accurately segment LV ultrasound images, which is important for the accurate acquisition of clinical indicators for cardiac function evaluation, such as the EF, strain, and strain rate of the LV on echocardiography and will play a vital role in assisting doctors in clinical diagnosis.

Acknowledgments

We would like to thank Dr. Shunmin Qiu from the First Affiliated Hospital of Shantou University, China. She provided us with some datasets used in this study and offered professional advice. This work was supported by the Basic and Applied Basic Research Foundation of Guangdong Province (grant number 2020B1515120061), the National Natural Science Foundation of China (grant number 82071992), the Guangdong Province University Priority Field (Artificial Intelligence) Project (grant number 2019KZDZX1013), National Key R&D Program of China (grant number 2020YFC0122103), The Scientific Research Grant of Shantou University, China, (Grant No. NTF17016), and the Key Project of Guangdong Province Science & Technology Plan (grant number 2015B020233018).

Data Availability

The cardiac ultrasound imaging data used in this study were provided by the Ultrasound Imaging Department of the First Affiliated Hospital of Medical College of Shantou University, China, which is not open to the public because it would breach the privacy of the research.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

  • 1.Mensah G. A., Roth G. A., Fuster V. The global burden of cardiovascular diseases and risk factors. Journal of the American College of Cardiology. 2019;74(20):2529–2532. doi: 10.1016/j.jacc.2019.10.009. [DOI] [PubMed] [Google Scholar]
  • 2.Benyounes N., Van Der Vynckt C., Tibi S., et al. Echocardiography in confirmed and highly suspected symptomatic COVID-19 patients and its impact on treatment change. Cardiology research and practice. 2020;2020:9. doi: 10.1155/2020/4348598.4348598 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Tirronen V., Neri F., Krkkinen T., Majava K., Rossi T. An enhanced memetic differential evolution in filter design for defect detection in paper production. Evolutionary Computation. 2008;16(4):529–555. doi: 10.1162/evco.2008.16.4.529. [DOI] [PubMed] [Google Scholar]
  • 4.Caponio A., Neri F., Tirronen V. Super-fit control adaptation in memetic differential evolution frameworks. Soft Computing. 2009;13(8):811–831. [Google Scholar]
  • 5.Ng P. C., Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Research. 2003;31(13):3812–3814. doi: 10.1093/nar/gkg509. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Dalal N., Triggs B. Histograms of oriented gradients for human detection. 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05); 2005; San Diego, CA, USA. [Google Scholar]
  • 7.Teng L., Fu Z. L., Ma Q., et al. Interactive echocardiography translation using few-shot GAN transfer learning. Computational and mathematical methods in medicine. 2020;2020:9. doi: 10.1155/2020/1487035.1487035 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Lu Y., Li B., Liu N., et al. CT-TEE image registration for surgical navigation of congenital heart disease based on a cycle adversarial network. Computational and Mathematical Methods in Medicine. 2020;2020:8. doi: 10.1155/2020/4942121.4942121 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Ren S., He K., Girshick R., Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems. 2015;39:91–99. doi: 10.1109/TPAMI.2016.2577031. [DOI] [PubMed] [Google Scholar]
  • 10.Liu W., Anguelov D., Erhan D., et al. SSD: single shot multibox detector. European conference on computer vision; 2016; Cham. pp. 21–37. [Google Scholar]
  • 11.Redmon J., Farhadi A. Yolov3: an incremental improvement. 2018. https://arxiv.org/abs/1804.02767.
  • 12.Leclerc S., Grenier T., Espinosa F., Bernard O. A fully automatic and multi-structural segmentation of the left ventricle and the myocardium on highly heterogeneous 2D echocardiographic data. 2017 IEEE International Ultrasonics Symposium (IUS); 2017; Washington, DC, USA. pp. 1–4. [Google Scholar]
  • 13.Dong S., Luo G., Wang K., Cao S., Li Q., Zhang H. A combined fully convolutional networks and deformable model for automatic left ventricle segmentation based on 3D echocardiography. BioMed research international. 2018;2018:16. doi: 10.1155/2018/5682365.5682365 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Smistad E., Ostvik A., Haugen B. O., Lovstakken L. 2D left ventricle segmentation using deep learning. 2017 IEEE international ultrasonics symposium (IUS); 2017; Washington, DC, USA. pp. 1–4. [Google Scholar]
  • 15.Oktay O., Ferrante E., Kamnitsas K., et al. Anatomically constrained neural networks (ACNNs): application to cardiac image enhancement and segmentation. IEEE transactions on medical imaging. 2018;37(2):384–395. doi: 10.1109/TMI.2017.2743464. [DOI] [PubMed] [Google Scholar]
  • 16.Smistad E., Lindseth F. Real-time tracking of the left ventricle in 3D ultrasound using Kalman filter and mean value coordinates. MICCAI Challenge Echocardiogr. Three-Dimensional Ultrasound Segmentation (CETUS) 2014. pp. 65–72.
  • 17.Barbosa D., Friboulet D., D’hooge J., Bernard O. Fast tracking of the left ventricle using global anatomical affine optical flow and local recursive block matching. MICCAI Challenge Echocardiogr. Three-Dimensional Ultrasound Segmentation (CETUS) 2014. pp. 17–24.
  • 18.Wang C., Smedby Ö. Model-based left ventricle segmentation in 3D ultrasound using phase image. MICCAI Challenge Echocardiogr. Three-Dimensional Ultrasound Segmentation (CETUS) 2014. pp. 81–88.
  • 19.Bernier M., Jodoin P., Lalande A. Automatized evaluation of the left ventricular ejection fraction from echocardiographic images using graph cut. MICCAI Challenge Echocardiogr. Three-Dimensional Ultrasound Segmentation (CETUS) 2014. pp. 25–32.
  • 20.Van Stralen M., Haak A., Leung K. E., van Burken G., Bosch J. G. Segmentation of multi-center 3D left ventricular echocardiograms by active appearance models. MICCAI Challenge Echocardiographic Three-Dimensional Ultrasound Segmentation (CETUS) 2014. pp. 73–80.
  • 21.Oktay O., Shi W., Keraudren K., Caballero J., Rueckert D. Learning shape representations for multi-atlas endocardium segmentation in 3D echo images. MICCAI Challenge Echocardiogr. Three-Dimensional Ultrasound Segmentation (CETUS) 2014. pp. 57–64.
  • 22.Ahmad T., Ma Y., Yahya M., Ahmad B., Nazir S. Object detection through modified YOLO neural network. Scientific Programming. 2020;2020:10. doi: 10.1155/2020/8403262.8403262 [DOI] [Google Scholar]
  • 23.Likas A., Vlassis N., Verbeek J. The global K-means clustering algorithm. Pattern Recognition. 2003;36(2):451–461. doi: 10.1016/S0031-3203(02)00060-2. [DOI] [Google Scholar]
  • 24.Aja-Fernndez S., Alberola-Lpez C., Westin C.-F. Noise and signal estimation in magnitude MRI and Rician distributed images: a LMMSE approach. IEEE Transactions on Image Processing. 2008;17(8):1383–1398. doi: 10.1109/TIP.2008.925382. [DOI] [PubMed] [Google Scholar]
  • 25.Su X., Khoshgoftaar T. M. A survey of collaborative filtering techniques. Advances in artificial intelligence. 2009;2009:19. doi: 10.1155/2009/421425. [DOI] [Google Scholar]
  • 26.Li S. Z. Markov Random Field Modeling in Image Analysis. London, U. K: Springer; 2009. [Google Scholar]
  • 27.Eilers P. H. C., Marx B. D. Splines knots and penalties. Wiley Interdisciplinary Reviews: Computational Statistics. 2010;2(6):637–653. doi: 10.1002/wics.125. [DOI] [Google Scholar]
  • 28.Robertson S. A new interpretation of average precision. Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval; 2008; New York, NY, USA. pp. 689–690. [Google Scholar]
  • 29.Davis J., Goadrich M. The relationship between precision-recall and ROC curves. Proceedings of the 23rd international conference on Machine learning; 2006; New York, NY, USA. pp. 233–240. [Google Scholar]
  • 30.Pereira S., Pinto A., Alves V., Silva C. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Transactions on Medical Imaging. 2016;35(5):1240–1251. doi: 10.1109/TMI.2016.2538465. [DOI] [PubMed] [Google Scholar]
  • 31.Huang X. Contour tracking in echocardiographic sequences via sparse representation and dictionary learning. Medical Image Analysis. 2014;18(2):253–271. doi: 10.1016/j.media.2013.10.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Taha A. A., Hanbury A. An efficient algorithm for calculating the exact Hausdorff distance. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2015;37(11):2153–2163. doi: 10.1109/TPAMI.2015.2408351. [DOI] [PubMed] [Google Scholar]
  • 33.Hansson M., Brandt S. S., Lindström J., et al. Segmentation of B-mode cardiac ultrasound data by Bayesian probability maps. Medical Image Analysis. 2014;18(7):1184–1199. doi: 10.1016/j.media.2014.06.004. [DOI] [PubMed] [Google Scholar]
  • 34.Qin X., Cong Z., Fei B. Automatic segmentation of right ventricular ultrasound images using sparse matrix transform and a level set. Physics in Medicine & Biology. 2013;58(21):7609–7624. doi: 10.1088/0031-9155/58/21/7609. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Carneiro G., Nascimento J. Combining multiple dynamic models and deep learning architectures for tracking the left ventricle endocardium in ultrasound data. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2013;35(11):2592–2607. doi: 10.1109/TPAMI.2013.96. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The cardiac ultrasound imaging data used in this study were provided by the Ultrasound Imaging Department of the First Affiliated Hospital of Medical College of Shantou University, China, which is not open to the public because it would breach the privacy of the research.


Articles from Computational and Mathematical Methods in Medicine are provided here courtesy of Wiley

RESOURCES