Abstract
Purpose
To determine the feasibility of using a deep learning approach to detect cartilage lesions (including cartilage softening, fibrillation, fissuring, focal defects, diffuse thinning due to cartilage degeneration, and acute cartilage injury) within the knee joint on MR images.
Materials and Methods
A fully automated deep learning–based cartilage lesion detection system was developed by using segmentation and classification convolutional neural networks (CNNs). Fat-suppressed T2-weighted fast spin-echo MRI data sets of the knee of 175 patients with knee pain were retrospectively analyzed by using the deep learning method. The reference standard for training the classification CNN was the interpretation provided by a fellowship-trained musculoskeletal radiologist of the presence or absence of a cartilage lesion within 17 395 small image patches placed on the articular surfaces of the femur and tibia. Receiver operating characteristic (ROC) curve analysis and the κ statistic were used to assess diagnostic performance and intraobserver agreement for detecting cartilage lesions for two individual evaluations performed by the cartilage lesion detection system.
Results
The sensitivity and specificity of the cartilage lesion detection system at the optimal threshold according to the Youden index were 84.1% and 85.2%, respectively, for evaluation 1 and 80.5% and 87.9%, respectively, for evaluation 2. Areas under the ROC curve were 0.917 and 0.914 for evaluations 1 and 2, respectively, indicating high overall diagnostic accuracy for detecting cartilage lesions. There was good intraobserver agreement between the two individual evaluations, with a κ of 0.76.
Conclusion
This study demonstrated the feasibility of using a fully automated deep learning–based cartilage lesion detection system to evaluate the articular cartilage of the knee joint with high diagnostic performance and good intraobserver agreement for detecting cartilage degeneration and acute cartilage injury.
© RSNA, 2018
Introduction
Identifying cartilage lesions, including cartilage softening, fibrillation, fissuring, focal defects, diffuse thinning due to cartilage degeneration, and acute cartilage injury, in patients undergoing MRI of the knee joint has many important clinical implications. Lifestyle modifications including weight loss (1), aerobic activity (2), and range-of-motion and strengthening exercises (3) have been shown to alleviate symptoms in patients with degenerative and posttraumatic cartilage lesions and may potentially slow the rate of disease progression. There has also been much recent effort in developing disease-modifying drugs for the treatment of osteoarthritis (4,5) and acute cartilage injury (6,7), with many promising new pharmacologic agents currently being investigated in clinical trials. However, lifestyle modifications and pharmaceutical interventions are likely to be most effective for treating cartilage lesions when initiated during the earliest stages of the disease process (8).
MRI with morphologic cartilage imaging sequences has been shown to have high specificity but only moderate sensitivity for detecting cartilage lesions within the knee joint (9,10). Diagnostic performance is highly dependent on the level of reader expertise, and only moderate interobserver agreement between readers has been reported in most studies (9). T2 mapping sequences have been shown to improve the sensitivity for detecting cartilage lesions within the knee joint, with a resultant decrease in specificity (11). Developing standardized computer-based methods for detecting cartilage lesions at MRI would be beneficial to maximize diagnostic performance while reducing subjectivity, variability, and errors due to distraction and fatigue associated with human interpretation.
There has been much recent interest in using deep learning methods in medical imaging (12). While a survey on deep learning in medical image analysis has shown a wide variety of applications in all imaging subspecialties (13), applications in musculoskeletal imaging remain limited (14–17). A fully automated deep learning–based cartilage lesion detection system has been developed at our institution by using a deep convolutional neural network (CNN) to segment cartilage and bone, followed by a second classification CNN to detect structural abnormalities within the segmented cartilage tissue. This study was performed to determine the feasibility of using the deep learning approach to detect cartilage lesions within the knee joint on MR images. It was our hypothesis that the deep learning method would provide diagnostic performance similar to that of clinical radiologists, and intraobserver agreement higher than radiologists' interobserver agreement, for detecting cartilage softening, fibrillation, fissuring, focal defects, diffuse thinning due to cartilage degeneration, and acute cartilage injury.
Materials and Methods
Cartilage Lesion Detection System
The proposed deep learning–based cartilage lesion detection system consisted of two two-dimensional (2D) deep CNNs. The first CNN performed rapid segmentation of cartilage and bone. The second classification CNN evaluated structural abnormalities within the segmented cartilage tissue. The detailed network structure for the CNNs is summarized in Table 1. The two networks were connected in a cascaded fashion to create a fully automated processing pipeline (Fig 1). The CNN processing pipeline framework was implemented in a hybrid computing environment involving Python (version 2.7; Python Software Foundation, Wilmington, Del) and MATLAB (version 2013a; MathWorks, Natick, Mass). The CNNs were coded by using the Keras packages (version 2.0.6) with Tensorflow libraries (version 1.0.1) as the deep learning computing back end (19).
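The implementation itself is not published with the article; as a rough illustration of the cascaded two-stage design, the following plain-Python sketch composes the pipeline stages. The functions `segment`, `extract_patches`, and `classify` are hypothetical stubs standing in for the trained networks, not the authors' code.

```python
import numpy as np

def segment(image):
    """Stand-in for the segmentation CNN: returns a per-pixel label map
    (0 = background, 1 = femur, 2 = femoral cartilage, 3 = tibia,
    4 = tibial cartilage). A real system would run the trained network."""
    labels = np.zeros(image.shape, dtype=np.uint8)
    labels[image > 0.5] = 2  # toy rule standing in for learned inference
    return labels

def extract_patches(image, labels, size=64):
    """Stand-in for the patch-extraction step: crops fixed-size patches
    around segmented cartilage (here, one centered crop; boundary
    handling and surface-curve placement are omitted)."""
    ys, xs = np.nonzero((labels == 2) | (labels == 4))
    if ys.size == 0:
        return []
    cy, cx = int(ys.mean()), int(xs.mean())
    half = size // 2
    return [image[cy - half:cy + half, cx - half:cx + half]]

def classify(patch):
    """Stand-in for the classification CNN: returns a probability that
    the patch contains a cartilage lesion."""
    return float(patch.mean())  # toy score standing in for a CNN output

def detect_lesions(image, threshold=0.5):
    """Cascaded pipeline: segmentation -> patch extraction -> classification."""
    labels = segment(image)
    patches = extract_patches(image, labels)
    return [classify(p) >= threshold for p in patches]
```

In the actual system, each stage is a trained Keras model rather than a stub, but the control flow (segment, then partition, then classify each patch) is the same.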
Table 1:
Detailed Network Structure for the Segmentation and Classification CNNs of the Deep Learning–based Cartilage Lesion Detection System

Note.—The convolutional neural networks (CNNs) were built by using the listed layers from the top to the bottom. Definitions of the different network structures and layers of a CNN have been described in a previously published review article on deep learning applications in medical image analysis (18). BN = batch normalization, Conv2D = two-dimensional convolution, FC = fully connected layer, ReLU = rectified-linear activation.
Figure 1:
The convolutional neural network (CNN) architecture for the deep learning–based cartilage lesion detection system. The proposed method consisted of segmentation and classification CNNs that were connected in a cascaded fashion to create a fully automated image processing pipeline. BN = batch normalization, ReLu = rectified-linear activation, 2D = two-dimensional.
The segmentation CNN in the processing pipeline was constructed by using a 2D convolutional encoder-decoder network that has been previously shown to provide rapid and accurate segmentation of cartilage and bone on knee MR images (14). The encoder network consisted of a Visual Geometry Group 16 (VGG16) convolutional structure (20), which had a combination of 2D convolution layers, rectified-linear activation (21), batch normalization layers (22), and max-pooling layers to achieve image feature extraction and data compression (Table 1). The VGG16 network was originally proposed for recognition of natural images and has been evaluated in the ImageNet Large Scale Visual Recognition Challenge image data set (www.image-net.org/challenges/LSVRC/) (23), where it achieved state-of-the-art performance (20). More recently, the VGG16 network has proven to be an effective and efficient network for feature extraction in many medical imaging applications, including tissue segmentation (14,24–26) and image contrast conversion (27,28). The convolutional encoder-decoder network using VGG16 shares many features with the commonly used U-Net network (29) but has unique advantages, including lower computational cost, batch normalization layers that improve training efficiency, and greater versatility for use as both a segmentation and a classification network, which allows effective transfer learning in the current setup (30,31). The decoder network used an identical but reversed VGG16 convolutional structure with multiple levels of image upsampling to combine extracted image features for generation of segmentation output with the same size as the input image. A multiclass softmax layer was inserted as the final layer of the decoder network to yield a pixelwise label output for each image pixel, labeling the femur, tibia, femoral cartilage, tibial cartilage, and background. Shortcut connections between the encoder and decoder networks were added to enhance segmentation performance (32,33).
Four shortcut connections were generated between the network layers by following the full preactivation method described in the deep residual network configuration (33).
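The final pixelwise labeling step described above has a simple form; as a small NumPy sketch (ours, not the authors'), a per-pixel softmax over the decoder's class logits followed by an argmax yields the five-class label map:

```python
import numpy as np

def pixelwise_labels(logits):
    """Apply a per-pixel softmax over class logits of shape (H, W, C) and
    return the most probable class for every pixel: 0 = background,
    1 = femur, 2 = femoral cartilage, 3 = tibia, 4 = tibial cartilage."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs.argmax(axis=-1)
```

Because softmax is monotonic, the argmax of the probabilities equals the argmax of the raw logits; the probabilities themselves matter only during training, where they feed the loss.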
An image partition step was performed on the output of the segmentation CNN to locate and extract image patches containing the segmented cartilage tissue in multiple regions of interest placed on the articular surfaces of the knee joint. During the extraction process, a continuous curve representing the superficial cartilage surface was calculated. A set of image patches with an empirical 64 × 64 matrix size, corresponding to a square field of view 1.75 cm on a side, was placed along the articular surfaces of the femur and tibia. Each image patch was of identical size, with its center located on the surface line. The distance between the centers of adjacent image patches along the surface line was equal to the patch size (Fig 1). No image downsampling was performed when extracting the image patches.
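The patch-placement rule (centers on the surface curve, spaced by one patch size of arc length) can be sketched as follows. This is a simplified illustration assuming the surface curve is already available as an ordered list of pixel coordinates; the function name and interface are ours.

```python
import numpy as np

def place_patches(surface_points, patch_size=64):
    """Place patch centers along an ordered surface curve so that adjacent
    centers are separated by one patch size of arc length, mirroring the
    non-overlapping 64 x 64 patch placement described in the text."""
    pts = np.asarray(surface_points, dtype=float)
    # cumulative arc length along the curve
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])
    centers = []
    target = 0.0
    for dist, p in zip(arc, pts):
        if dist >= target:                 # reached the next spacing interval
            centers.append(tuple(int(round(v)) for v in p))
            target += patch_size
    return centers
```

Each returned center then defines one 64 × 64 crop from the fat-suppressed T2-weighted image, with no downsampling.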
The classification CNN in the processing pipeline consisted of a second 2D VGG16 convolutional structure. The VGG16 in the classification CNN was followed by two fully connected layers to provide an output probability score for the presence or absence of a cartilage lesion within each extracted image patch of the segmented cartilage tissue.
MRI Data Sets
This retrospective study was performed in compliance with the Health Insurance Portability and Accountability Act regulations, with approval from our institutional review board, and with a waiver of informed consent. MRI data sets were obtained from 175 patients with knee pain (99 men and 76 women, with an average age of 46.5 years and an age range of 16–74 years) who underwent a clinical MRI examination of the knee at our institution between December 15, 2010, and October 15, 2016, using the same 3.0-T MRI unit (Signa Excite HDx; GE Healthcare, Waukesha, Wis) and eight-channel phased-array extremity coil (Invivo, Orlando, Fla). The MRI data sets consisted of sagittal frequency-selective fat-suppressed T2-weighted fast spin-echo, sagittal proton density–weighted fast spin-echo, and sagittal multiecho spin-echo T2 map (Cartigram; GE Healthcare) sequences. The imaging parameters of all sequences are summarized in Table 2.
Table 2:
Imaging Parameters for the MRI Sequences Used to Detect Cartilage Lesions within the Knee Joint

Note.—DICOM = Digital Imaging and Communications in Medicine.
Training and Evaluation of the Cartilage Lesion Detection System
The reference standard for training the segmentation network was manual cartilage and bone segmentation performed on the sagittal fat-suppressed T2-weighted fast spin-echo image data sets of all 175 subjects. Manual segmentation was performed by a research scientist with 8 years of segmentation experience with supervision from a fellowship-trained musculoskeletal radiologist with 15 years of clinical experience using the segmentation feature in MATLAB (version 2013a; MathWorks). A multiclass mask was created for each image section with the following values: 0, background; 1, femur; 2, femoral cartilage; 3, tibia; and 4, tibial cartilage. A threefold cross-validation was performed on the fat-suppressed T2-weighted fast spin-echo images for training and evaluating the segmentation network, as described in Appendix E1 (online). The Dice coefficient was used to determine the overlap between the fully automated segmented cartilage and bone and the manually segmented reference standard (34). The Dice coefficient ranged between 0 and 1, with a higher value indicating better segmentation accuracy.
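The Dice coefficient used here has a simple closed form, 2|A ∩ B| / (|A| + |B|) for a predicted mask A and reference mask B; a minimal NumPy sketch (the empty-mask convention is our assumption):

```python
import numpy as np

def dice_coefficient(pred, ref):
    """Dice overlap between a predicted binary mask and a reference mask:
    2|A n B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, ref).sum() / denom
```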
The reference standard for training the classification network was the interpretation of the presence or absence of a cartilage lesion within each image patch provided by a fellowship-trained musculoskeletal radiologist (R.K., with 15 years of clinical experience). A customized software program was developed in MATLAB (version 2013a, MathWorks), which allowed the radiologist to review the sagittal fat-suppressed T2-weighted fast spin-echo, proton density–weighted fast spin-echo, and T2 map images side by side (Fig 2). Image patches of identical size as those used in the extraction process for the cartilage lesion detection system were placed at identical locations along the articular surfaces of the femur and tibia on each fat-suppressed T2-weighted fast spin-echo image section. The radiologist used all three sequences together to determine the presence or absence of a cartilage lesion in each image patch. A cartilage lesion was defined as cartilage fibrillation, fissuring, focal defects, or diffuse thinning on the fast spin-echo images and cartilage softening, characterized as an area of increased or decreased T2 relaxation time, on the fast spin-echo or T2 map images. The T2 map images were used to detect subtle changes in the T2 relaxation time of articular cartilage, which could potentially improve the detection of superficial cartilage lesions (11). Interobserver agreement between fellowship-trained musculoskeletal radiologists for detecting cartilage lesions within the knee joint using the fast spin-echo and T2 map sequences has been reported in previous studies (11,35).
Figure 2:
Images show application of the customized software program used by the musculoskeletal radiologist and musculoskeletal radiology fellows to determine the presence or absence of a cartilage lesion in each image patch on each image section. From left to right, the images include a T2 map (in milliseconds), a proton density–weighted fast spin-echo (PD-FSE) MR image, and a fat-suppressed T2-weighted fast spin-echo (T2-FSE) MR image. The annotation panel on the right was used for selecting the image patches and recording the image interpretations. The small image on the bottom right shows the automatically generated image patches on the articular surfaces of the femur (yellow boxes) and tibia (blue boxes). The yellow box on the fat-suppressed T2-weighted fast spin-echo image shows the current selected image patch and its location on the articular surface.
Extracted image patches from the sagittal fat-suppressed T2-weighted fast spin-echo images were used to train and evaluate the classification CNN to detect cartilage lesions by using the reference standard interpretation provided by the musculoskeletal radiologist with 15 years of clinical experience. A holdout test data set consisting of 660 randomly chosen image patches classified as cartilage lesions and 660 randomly chosen image patches classified as normal cartilage was used to evaluate the diagnostic performance of the cartilage lesion detection system. The remaining image patches were used to train the cartilage lesion detection system by using a stratified fivefold cross-validation method described in Appendix E1 (online). To investigate intraobserver agreement of the cartilage lesion detection system, the training of the system and its evaluation on the holdout test data set were performed twice, and the result from each evaluation was treated as an independent assessment by the machine and reported separately in the statistical analysis.
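One way to realize the data split described above (a class-balanced holdout set plus stratified folds) is sketched below. The exact sampling procedure is detailed in Appendix E1, which we do not have; the round-robin fold assignment here is our assumption, and the function name is hypothetical.

```python
import random

def stratified_split(labels, n_holdout_per_class=660, n_folds=5, seed=0):
    """Build a class-balanced holdout test set (equal numbers of lesion and
    normal patches), then assign the remaining patch indices to stratified
    cross-validation folds via round-robin within each class."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    holdout = []
    for lab, idxs in by_class.items():
        rng.shuffle(idxs)
        holdout.extend(idxs[:n_holdout_per_class])
    holdout_set = set(holdout)
    folds = [[] for _ in range(n_folds)]
    for lab, idxs in by_class.items():
        remaining = [i for i in idxs if i not in holdout_set]
        for k, idx in enumerate(remaining):
            folds[k % n_folds].append(idx)  # round-robin keeps folds stratified
    return holdout, folds
```

With the study's counts (2642 lesion and 14 753 normal patches), this would reserve 1320 patches for testing and distribute the rest across five folds with the class ratio preserved in each fold.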
Evaluation by Clinical Radiologists
To compare the diagnostic performance of the cartilage lesion detection system with that of clinical radiologists, a 2nd-year radiology resident (A.K. [resident 1]), a 4th-year radiology resident (W.L. [resident 2]), two musculoskeletal radiology fellows (K.L. [fellow 1] and S.K. [fellow 2]), and a fellowship-trained musculoskeletal radiologist with 17 years of clinical experience (D.B.) independently reviewed the sagittal fat-suppressed T2-weighted fast spin-echo, proton density–weighted fast spin-echo, and T2 map images of all 175 patients side by side by using the same customized software program (Fig 2). The radiology residents, musculoskeletal radiology fellows, and musculoskeletal radiologist used all three sequences together to determine the presence or absence of a cartilage lesion in the same image patches placed at identical locations on the articular surfaces of the femur and tibia on the sagittal fat-suppressed T2-weighted fast spin-echo images. The clinical radiologists received no formal training or calibration session prior to evaluating the articular cartilage and used the same definition of a cartilage lesion when evaluating each image patch as that used by the musculoskeletal radiologist whose interpretation served as the reference standard.
Statistical Analysis
Statistical analysis was performed by using MATLAB (version 2013a; MathWorks) and MedCalc (version 14.8; MedCalc Software, Ostend, Belgium), with statistical significance defined at P < .05. Contingency tables and sensitivity and specificity for the radiology residents, musculoskeletal radiology fellows, musculoskeletal radiologist, and cartilage lesion detection system for determining the presence or absence of a cartilage lesion on each image patch of the holdout test data set were calculated by using the interpretation of the musculoskeletal radiologist with 15 years of clinical experience as the reference standard. Receiver operating characteristic (ROC) analysis was used to further evaluate the diagnostic performance of the cartilage lesion detection system. For the ROC analysis, the area under the ROC curve (AUC) was calculated for each evaluation of the cartilage lesion detection system, with the AUCs obtained during the two evaluations compared by using a nonparametric approach (36). The Youden index was used to determine the optimal sensitivity and specificity. The κ statistic was used to assess interobserver agreement for detecting cartilage lesions between the radiology residents, musculoskeletal radiology fellows, and musculoskeletal radiologist and intraobserver agreement between the two individual evaluations performed by the cartilage detection system. The degree of interobserver and intraobserver agreement was assessed by using the Landis and Koch method (37).
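The key summary statistics (AUC by trapezoidal integration of the ROC curve, the Youden-optimal operating point, and Cohen's κ for agreement) can be sketched compactly. The study used MedCalc and MATLAB for these computations; the NumPy versions below are illustrative reimplementations, not the authors' code.

```python
import numpy as np

def roc_auc_youden(scores, truth):
    """Sweep score thresholds to build an ROC curve, integrate the area
    under it with the trapezoidal rule, and return the threshold that
    maximizes the Youden index J = sensitivity + specificity - 1."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    pos, neg = truth.sum(), (~truth).sum()
    thresholds = np.unique(scores)[::-1]          # strictest to loosest
    tprs, fprs = [0.0], [0.0]
    for t in thresholds:
        pred = scores >= t
        tprs.append((pred & truth).sum() / pos)   # sensitivity
        fprs.append((pred & ~truth).sum() / neg)  # 1 - specificity
    auc = sum((fprs[i] - fprs[i - 1]) * (tprs[i] + tprs[i - 1]) / 2.0
              for i in range(1, len(tprs)))
    j = np.array(tprs) - np.array(fprs)           # Youden index at each point
    best = int(j.argmax())
    return auc, (None if best == 0 else float(thresholds[best - 1]))

def cohens_kappa(a, b):
    """Cohen's kappa for two binary ratings: observed agreement corrected
    for the agreement expected by chance."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    po = (a == b).mean()                                        # observed
    pe = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())  # chance
    return (po - pe) / (1 - pe)
```

The κ of 0.76 reported for the two system evaluations, for example, corresponds to observed patch-level agreement well above what the marginal lesion frequencies alone would predict.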
Results
The segmentation CNN provided good segmentation of the femur, tibia, femoral cartilage, and tibial cartilage. The mean Dice coefficients were 0.96 ± 0.02 (standard deviation) for the femur, 0.95 ± 0.03 for the tibia, 0.81 ± 0.04 for the femoral cartilage, and 0.82 ± 0.04 for the tibial cartilage. The average training time for the segmentation network was 6.2 hours for each cross-validation fold. Segmentation of cartilage and bone on all image sections for a patient took approximately 20 seconds with the trained network.
A total of 17 395 cartilage image patches were extracted from the knee joint on the sagittal fat-suppressed T2-weighted fast spin-echo image data sets for the 175 patients. A total of 2642 image patches were classified as cartilage lesions, while the remaining 14 753 image patches were classified as normal cartilage by the musculoskeletal radiologist with 15 years of clinical experience. Classification of all image patches on all image sections for a patient with an average of 100 total patches within the knee joint took approximately 2 seconds with the trained network.
Tables 3 and 4 show the contingency tables and sensitivity and specificity values, respectively, for the radiology residents, musculoskeletal radiology fellows, musculoskeletal radiologist, and cartilage lesion detection system for determining the presence or absence of a cartilage lesion on the 1320 image patches in the holdout test data set, which contained equal numbers of patches classified as cartilage lesions and normal cartilage. For evaluations 1 and 2 performed by the cartilage lesion detection system, the sensitivity at the optimal threshold of the Youden index was 84.1% and 80.5%, respectively, while the specificity was 85.2% and 87.9%, respectively. In comparison, the sensitivity of the clinical radiologists ranged between 60.8% and 80.2%, while the specificity ranged between 92.2% and 96.5%. Given the average of 100 total cartilage patches within the knee joint, the specificity values would translate into five false-positive cartilage lesions for each knee for resident 1 and fellows 1 and 2, eight false-positive cartilage lesions for resident 2, three false-positive cartilage lesions for the musculoskeletal radiologist, and 15 and 12 false-positive cartilage lesions for evaluations 1 and 2, respectively, performed by the cartilage lesion detection system. The cartilage lesion detection system was able to detect all types of cartilage lesions, including softening, fibrillation, fissuring, focal defects, and diffuse thinning (Fig 3).
Table 3:
Contingency Tables for the Radiology Residents, Musculoskeletal Radiology Fellows, Musculoskeletal Radiologist, and Cartilage Lesion Detection System for Determining the Presence or Absence of a Cartilage Lesion on the Image Patches in the Holdout Test Data Set

Table 4:
Sensitivity and Specificity for the Radiology Residents, Musculoskeletal Radiology Fellows, Musculoskeletal Radiologist, and Cartilage Lesion Detection System for Determining the Presence or Absence of a Cartilage Lesion on the Image Patches in the Holdout Test Data Set

Note.—Data in parentheses are 95% confidence intervals.
Figure 3a–3d:

Sagittal fat-suppressed T2-weighted fast spin-echo MR images of segmented cartilage show image patches with (a) cartilage softening on the lateral tibial plateau, (b) cartilage fissuring on the medial femoral condyle, (c) focal cartilage defect on the medial femoral condyle, and (d) diffuse cartilage thinning on the lateral femoral condyle that were correctly identified by the cartilage lesion detection system (arrow).
Figure 4 shows the ROC curves describing the diagnostic performance of the cartilage lesion detection system for detecting cartilage lesions within the knee joint. The AUC for the cartilage detection system was 0.917 (95% confidence interval [CI]: 0.901, 0.932; P < .001) for evaluation 1 and 0.914 (95% CI: 0.898, 0.920; P < .001) for evaluation 2. There was no statistically significant difference (P = .68) between the AUCs for the two individual evaluations. For comparison, the points representing the sensitivity and specificity of the radiology residents, musculoskeletal radiology fellows, and musculoskeletal radiologist for detecting cartilage lesions were plotted on Figure 4 and were in close proximity to the ROC curves of the cartilage lesion detection system.
Figure 4:
Receiver operating characteristic (ROC) curves show the diagnostic performance of the cartilage lesion detection system for detecting cartilage lesions within the knee joint. Solid lines = ROC curves for the two individual evaluations performed by the cartilage lesion detection system. Dashed line = diagonal line, with an area under the ROC curve (AUC) of 0.5. The AUCs of the cartilage lesion detection system were 0.917 and 0.914 for evaluations 1 and 2, respectively, both indicating high overall diagnostic accuracy. Sensitivity and specificity for the radiology residents, musculoskeletal radiology fellows, musculoskeletal radiologist, and evaluations 1 and 2 of the cartilage lesion detection system at the optimal threshold of the Youden index are also plotted. Note that the sensitivity and specificity of the clinical radiologists are in close proximity to the ROC curves of the cartilage lesion detection system.
There was good intraobserver agreement between the two individual evaluations performed by the cartilage lesion detection system for determining the presence or absence of a cartilage lesion on the 1320 image patches in the holdout test data set, with a κ of 0.76 (95% CI: 0.73, 0.80). Table 5 shows the κ values for interobserver agreement between the radiology residents, musculoskeletal radiology fellows, and musculoskeletal radiologist for determining the presence or absence of a cartilage lesion on the same image patches. There was moderate to good interobserver agreement between the clinical radiologists, with κ values ranging between 0.57 and 0.73.
Table 5:
κ Values for Interobserver Agreement between the Radiology Residents, Musculoskeletal Radiology Fellows, and Musculoskeletal Radiologist for Determining the Presence or Absence of a Cartilage Lesion on the Image Patches in the Holdout Test Data Set

Note.—Data are κ values, with 95% confidence intervals in parentheses. NA = not applicable.
Discussion
Our study described a fully automated deep learning–based cartilage lesion detection system utilizing a convolutional encoder-decoder network for segmenting cartilage and bone followed by a second CNN classification network to detect structural abnormalities within the segmented cartilage tissue. The proposed deep learning approach achieved high diagnostic accuracy for detecting cartilage lesions within the knee joint, with AUCs above 0.91. Furthermore, the sensitivity and specificity of the cartilage lesion detection system were comparable to the diagnostic performance of clinical radiologists, including radiology residents, musculoskeletal radiology fellows, and a musculoskeletal radiologist. The high AUCs of the cartilage lesion detection system were similar to the AUCs for deep learning techniques used in previous studies to detect pulmonary nodules at chest CT (38), classify pulmonary tuberculosis at chest radiography (39) and breast density at mammography (40), detect coronary artery stenosis at contrast material–enhanced chest CT (41) and prostate cancer at pelvic MRI (42), and grade the severity of osteoarthritis at hip radiography (16), which further emphasizes the promising preliminary results of computer-based methods for evaluating medical images.
Compared with clinical radiologists, the cartilage lesion detection system at the optimal Youden index provided higher sensitivity but lower specificity for detecting cartilage lesions within the knee joint. The high sensitivity of the cartilage lesion detection system is particularly favorable, because the main limitation of MRI for evaluating articular cartilage (even by experienced musculoskeletal radiologists) is its relatively low sensitivity for identifying superficial cartilage lesions (9). The lower specificity of the cartilage lesion detection system is likely the result of the CNNs using only the sagittal fat-suppressed T2-weighted fast spin-echo images for cartilage lesion detection, while the clinical radiologists used three sagittal image data sets with different tissue contrasts. It was impossible for the cartilage lesion detection system to use multiple image data sets for evaluating articular cartilage in our retrospective study because of the different fields of view, spatial resolutions, section thicknesses, and intersection gaps of the three sequences acquired during the MRI examination. Nevertheless, the lower specificity of the cartilage lesion detection system is concerning, as the high number of false-positive cartilage lesions would require the clinical radiologist to verify the actual presence of disease in each image patch, which would increase the overall time required for image interpretation.
The cartilage lesion detection system has many potential advantages. The intraobserver agreement of the CNNs for detecting cartilage lesions was greater than the interobserver agreement of the clinical radiologists. However, the intraobserver agreement was far from perfect, which is likely due to multiple factors, including the inherent model uncertainty of the network structure (43,44), the stochastic nature of network training, and the variation between the training and testing image data sets (24,45). In addition, although the one-time training process of the cartilage lesion detection system was relatively long, the detection time was highly efficient (on the order of seconds) for evaluating the entire knee joint, which would make it quite useful as a rapid screening method for cartilage lesions. Furthermore, the threshold of the classification CNN is adjustable and could be optimized to provide more sensitive detection of cartilage lesions in certain patient populations, such as individuals with knee pain and no evidence of internal derangement at MRI and individuals with persistent knee pain after surgical intervention. The cartilage lesion detection system is also not influenced by errors related to inexperience, distraction, and fatigue associated with human interpretation of medical images.
Our study demonstrated only the feasibility of using a deep learning approach for evaluating the articular cartilage of the knee joint. Additional technical development and validation work is needed to improve the current cartilage lesion detection system. Improvements in diagnostic performance could be achieved if the CNNs were able to evaluate multiple sequences with different tissue contrasts. However, this would require performing multiple 2D sequences with identical fields of view, spatial resolutions, section thicknesses, and intersection gaps during the MRI examination or using newly developed three-dimensional combined morphologic and quantitative cartilage imaging sequences (46). Furthermore, the use of multiple classification CNNs, larger training data sets, and better reference standards consisting of the interpretations provided by multiple experienced fellowship-trained musculoskeletal radiologists could further improve diagnostic performance. Large prospective validation studies are also needed to compare the interpretations of the cartilage lesion detection system with cartilage lesion grades assigned at arthroscopy or histologic examination in small identically located regions of interest on the articular surfaces of the knee joint. Finally, future clinical studies are needed to determine whether combined use of human and machine interpretation could improve diagnostic performance for detecting cartilage lesions when compared with each method of image interpretation alone.
Our study had several limitations. First, only the articular cartilage on the femur and tibia was evaluated in our feasibility study, as the segmentation and classification CNNs were not yet optimized for assessing patellar cartilage and it was believed that evaluating the curved articular surface of the patella on the 3-mm-thick fat-suppressed T2-weighted fast spin-echo images would be challenging. Furthermore, the reference standard for the presence and absence of cartilage lesions in each image patch was the interpretation provided by a musculoskeletal radiologist rather than arthroscopy, which has higher sensitivity for detecting cartilage lesions (9). However, the use of arthroscopy as a reference standard in our retrospective study would not have been possible, as surgical reports provide only the highest grade of cartilage lesion on each articular surface, and the exact location of the cartilage lesion is not well described. Training the cartilage lesion detection system requires large numbers of image patches containing normal and abnormal cartilage placed at predetermined locations on the articular surfaces of the knee joint. A final limitation of our feasibility study was that it did not evaluate multiple different types of CNNs that could be used as segmentation and classification networks in the cartilage lesion detection system.
In summary, our study demonstrated the feasibility of using a fully automated deep learning–based cartilage lesion detection system for evaluating the articular cartilage of the knee joint. Using a single sagittal fat-suppressed T2-weighted fast spin-echo sequence, the cartilage lesion detection system achieved diagnostic performance similar to that of clinical radiologists with varying levels of experience for detecting cartilage degeneration and acute cartilage injury within the knee joint, with intraobserver agreement higher than the radiologists' interobserver agreement. While our initial results are promising, further technical development and validation of the cartilage lesion detection system are needed before it can be fully implemented in clinical practice.
Summary
This cartilage lesion detection system, which uses a single sagittal fat-suppressed T2-weighted fast spin-echo MRI sequence, achieved diagnostic performance similar to that of clinical radiologists with varying levels of experience for detecting cartilage degeneration and acute cartilage injury within the knee joint, with intraobserver agreement higher than the radiologists' interobserver agreement.
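The two statistics behind these claims can be illustrated with a minimal sketch (not the authors' code; all data below are toy values introduced for illustration): the Youden index selects the operating threshold that maximizes sensitivity + specificity − 1 on the ROC curve, and Cohen's κ quantifies agreement between two binary evaluations of the same image patches.

```python
# Minimal sketch, assuming binary patch labels and CNN lesion probabilities
# (toy data only; this is not the study's implementation).
import numpy as np

def youden_threshold(labels, scores):
    """Return the score threshold maximizing sensitivity + specificity - 1."""
    best_t, best_j = None, -1.0
    for t in np.unique(scores):
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        tn = np.sum(~pred & (labels == 0))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        j = tp / (tp + fn) + tn / (tn + fp) - 1  # sensitivity + specificity - 1
        if j > best_j:
            best_j, best_t = j, t
    return best_t

def cohen_kappa(a, b):
    """Cohen's kappa for two binary ratings of the same items."""
    a, b = np.asarray(a), np.asarray(b)
    po = np.mean(a == b)                               # observed agreement
    pe = np.mean(a) * np.mean(b) + (1 - np.mean(a)) * (1 - np.mean(b))
    return (po - pe) / (1 - pe)                        # chance-corrected agreement

# Toy example: reference labels and lesion probabilities for 8 patches
labels = np.array([0, 0, 1, 1, 0, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.8, 0.7, 0.2, 0.9, 0.3, 0.6])
t = youden_threshold(labels, scores)                   # 0.6 for this toy data
eval1 = scores >= t
eval2 = np.array([0, 0, 1, 1, 0, 1, 1, 1], dtype=bool)  # hypothetical second evaluation
kappa = cohen_kappa(eval1, eval2)                      # 0.75 for this toy data
```

A κ in this range falls in the "good" agreement band of the Landis and Koch scale cited in the article (37).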
Implication for Patient Care
■ Deep learning–based approaches have the potential to maximize diagnostic performance for detecting cartilage degeneration and acute cartilage injury within the knee joint while reducing subjectivity, variability, and errors due to distraction and fatigue associated with human interpretation.
APPENDIX
Funding support for the research project was provided by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (R01-AR068373-01).
Disclosures of Conflicts of Interest: F.L. disclosed no relevant relationships. Z.Z. disclosed no relevant relationships. A.S. disclosed no relevant relationships. D.B. disclosed no relevant relationships. W.L. disclosed no relevant relationships. A.K. disclosed no relevant relationships. K.L. disclosed no relevant relationships. S.K. disclosed no relevant relationships. R.K. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: is a consultant for Boston Imaging Core Lab; institution has grants or grants pending with GE Healthcare. Other relationships: disclosed no relevant relationships.
Abbreviations:
- AUC = area under the ROC curve
- CI = confidence interval
- CNN = convolutional neural network
- ROC = receiver operating characteristic
- 2D = two-dimensional
- VGG16 = Visual Geometry Group 16
References
- 1. Christensen R, Bartels EM, Astrup A, Bliddal H. Effect of weight reduction in obese patients diagnosed with knee osteoarthritis: a systematic review and meta-analysis. Ann Rheum Dis 2007;66(4):433–439.
- 2. Roddy E, Zhang W, Doherty M. Aerobic walking or strengthening exercise for osteoarthritis of the knee? A systematic review. Ann Rheum Dis 2005;64(4):544–548.
- 3. Fransen M, McConnell S. Land-based exercise for osteoarthritis of the knee: a metaanalysis of randomized controlled trials. J Rheumatol 2009;36(6):1109–1117.
- 4. Hellio Le Graverand-Gastineau MP. OA clinical trials: current targets and trials for OA. Choosing molecular targets: what have we learned and where we are headed? Osteoarthritis Cartilage 2009;17(11):1393–1401.
- 5. Hunter DJ, Hellio Le Graverand-Gastineau MP. How close are we to having structure-modifying drugs available? Rheum Dis Clin North Am 2008;34(3):789–802.
- 6. Grodzinsky AJ, Wang Y, Kakar S, Vrahas MS, Evans CH. Intra-articular dexamethasone to inhibit the development of post-traumatic osteoarthritis. J Orthop Res 2017;35(3):406–411.
- 7. Lattermann C, Jacobs CA, Proffitt Bunnell M, et al. A multicenter study of early anti-inflammatory treatment in patients with acute anterior cruciate ligament tear. Am J Sports Med 2017;45(2):325–333.
- 8. Felson DT, Hodgson R. Identifying and treating preclinical and early osteoarthritis. Rheum Dis Clin North Am 2014;40(4):699–710.
- 9. Quatman CE, Hettrich CM, Schmitt LC, Spindler KP. The clinical utility and diagnostic performance of magnetic resonance imaging for identification of early and advanced knee osteoarthritis: a systematic review. Am J Sports Med 2011;39(7):1557–1568.
- 10. Menashe L, Hirko K, Losina E, et al. The diagnostic performance of MRI in osteoarthritis: a systematic review and meta-analysis. Osteoarthritis Cartilage 2012;20(1):13–21.
- 11. Kijowski R, Blankenbaker DG, Munoz Del Rio A, Baer GS, Graf BK. Evaluation of the articular cartilage of the knee joint: value of adding a T2 mapping sequence to a routine MR imaging protocol. Radiology 2013;267(2):503–513.
- 12. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–444.
- 13. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60–88.
- 14. Liu F, Zhou Z, Jang H, Samsonov A, Zhao G, Kijowski R. Deep convolutional neural network and 3D deformable approach for tissue segmentation in musculoskeletal magnetic resonance imaging. Magn Reson Med 2018;79(4):2379–2391.
- 15. Cai Y, Landis M, Laidley DT, Kornecki A, Lum A, Li S. Multi-modal vertebrae recognition using Transformed Deep Convolution Network. Comput Med Imaging Graph 2016;51:11–19.
- 16. Xue Y, Zhang R, Deng Y, Chen K, Jiang T. A preliminary examination of the diagnostic value of deep learning in hip osteoarthritis. PLoS One 2017;12(6):e0178992.
- 17. Prasoon A, Petersen K, Igel C, Lauze F, Dam E, Nielsen M. Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network. Med Image Comput Comput Assist Interv 2013;16(Pt 2):246–253.
- 18. Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annu Rev Biomed Eng 2017;19(1):221–248.
- 19. Abadi M, Agarwal A, Barham P, et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. ArXiv e-prints 2016. http://arxiv.org/abs/1603.04467. Accessed June 30, 2017.
- 20. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. ArXiv e-prints 2014. http://arxiv.org/abs/1409.1556. Accessed March 27, 2017.
- 21. Nair V, Hinton GE. Rectified linear units improve restricted Boltzmann machines. Proc 27th Int Conf Mach Learn 2010;807–814. http://www.icml2010.org/papers/432.pdf. Accessed May 26, 2016.
- 22. Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. ArXiv e-prints 2015. http://arxiv.org/abs/1502.03167. Accessed June 5, 2017.
- 23. Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge. ArXiv e-prints 2014. http://arxiv.org/abs/1409.0575. Accessed April 13, 2018.
- 24. Zhao G, Liu F, Oler JA, Meyerand ME, Kalin NH, Birn RM. Bayesian convolutional neural network based MRI brain extraction on nonhuman primates. Neuroimage 2018;175:32–44.
- 25. Zha W, Kruger SJ, Johnson KM, et al. Pulmonary ventilation imaging in asthma and cystic fibrosis using oxygen-enhanced 3D radial ultrashort echo time MRI. J Magn Reson Imaging 2018;47(5):1287–1297.
- 26. Zhou Z, Zhao G, Kijowski R, Liu F. Deep convolutional neural network for segmentation of knee joint anatomy. Magn Reson Med 2018 May 17 [Epub ahead of print].
- 27. Liu F, Jang H, Kijowski R, Bradshaw T, McMillan AB. Deep learning MR imaging-based attenuation correction for PET/MR imaging. Radiology 2018;286(2):676–684.
- 28. Jang H, Liu F, Zhao G, Bradshaw T, McMillan AB. Technical note: deep learning based MRAC using rapid ultra-short echo time imaging. Med Phys 2018 May 15 [Epub ahead of print].
- 29. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF, eds. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Cham, Switzerland: Springer, 2015;234–241.
- 30. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. ArXiv e-prints 2016. http://arxiv.org/abs/1602.03409. Accessed December 12, 2017.
- 31. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging 2016;35(5):1299–1312.
- 32. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. ArXiv e-prints 2015. http://arxiv.org/abs/1512.03385. Accessed May 14, 2017.
- 33. He K, Zhang X, Ren S, Sun J. Identity mappings in deep residual networks. ArXiv e-prints 2016. http://arxiv.org/abs/1603.05027. Accessed May 14, 2017.
- 34. Zou KH, Warfield SK, Bharatha A, et al. Statistical validation of image segmentation quality based on a spatial overlap index. Acad Radiol 2004;11(2):178–189.
- 35. Kijowski R, Blankenbaker DG, Davis KW, Shinki K, Kaplan LD, De Smet AA. Comparison of 1.5- and 3.0-T MR imaging for evaluating the articular cartilage of the knee joint. Radiology 2009;250(3):839–848.
- 36. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837–845.
- 37. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33(1):159–174.
- 38. Nibali A, He Z, Wollersheim D. Pulmonary nodule classification with deep residual networks. Int J CARS 2017;12(10):1799–1808.
- 39. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017;284(2):574–582.
- 40. Mohamed AA, Berg WA, Peng H, Luo Y, Jankowitz RC, Wu S. A deep learning method for classifying mammographic breast density categories. Med Phys 2018;45(1):314–321.
- 41. Zreik M, Lessmann N, van Hamersvelt RW, et al. Deep learning analysis of the myocardium in coronary CT angiography for identification of patients with functionally significant coronary artery stenosis. Med Image Anal 2018;44:72–85.
- 42. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep 2017;7(1):15415.
- 43. Kendall A, Badrinarayanan V, Cipolla R. Bayesian SegNet: model uncertainty in deep convolutional encoder-decoder architectures for scene understanding. ArXiv e-prints 2015. http://arxiv.org/abs/1511.02680. Accessed April 13, 2018.
- 44. Kendall A, Gal Y. What uncertainties do we need in Bayesian deep learning for computer vision? ArXiv e-prints 2017. http://arxiv.org/abs/1703.04977. Accessed April 13, 2018.
- 45. Leibig C, Allken V, Ayhan MS, Berens P, Wahl S. Leveraging uncertainty information from deep neural networks for disease detection. Sci Rep 2017;7(1):17816.
- 46. Staroswiecki E, Granlund KL, Alley MT, Gold GE, Hargreaves BA. Simultaneous estimation of T(2) and apparent diffusion coefficient in human articular cartilage in vivo with a modified three-dimensional double echo steady state (DESS) sequence at 3 T. Magn Reson Med 2012;67(4):1086–1096.