Author manuscript; available in PMC: 2019 Jun 1.
Published in final edited form as: IEEE Trans Biomed Eng. 2017 Aug 29;65(6):1291–1300. doi: 10.1109/TBME.2017.2743562

Acoustical Emission Analysis by Unsupervised Graph Mining: A Novel Biomarker of Knee Health Status

Sinan Hersek 1, Maziyar Baran Pouyan 2, Caitlin N Teague 3, Michael N Sawka 4, Mindy L Millard-Stafford 5, Geza F Kogler 6, Paul Wolkoff 7, Omer T Inan 8
PMCID: PMC6038802  NIHMSID: NIHMS978290  PMID: 28858782

Abstract

Objective

To study knee acoustical emission patterns in subjects with acute knee injury immediately following injury and several months after surgery and rehabilitation.

Methods

We employed an unsupervised graph mining algorithm to visualize heterogeneity of the high-dimensional acoustical emission data, and to then derive a quantitative metric capturing this heterogeneity – the graph community factor (GCF). A total of 42 subjects participated in the studies. Measurements were taken once each from 33 healthy subjects with no known previous knee injury, and twice each from 9 subjects with unilateral knee injury: first, within seven days of the injury, and second, 4–6 months after surgery when the subjects were determined ready to start functional activities. Acoustical signals were processed to extract time and frequency domain features from multiple time windows of the recordings from both knees, and k-Nearest Neighbor graphs were then constructed based on these features.

Results

The GCF calculated from these graphs was found to be 18.5 ± 3.5 for healthy subjects, 24.8 ± 4.4 (p=0.01) for recently injured subjects, and 16.5 ± 4.7 (p=0.01) at 4–6 months of recovery from surgery.

Conclusion

Changes in the objective GCF scores were consistent with a medical professional’s subjective evaluations and with subjective functional scores of knee recovery.

Significance

Unsupervised graph mining to extract GCF from knee acoustical emissions provides a novel objective and quantitative biomarker of knee injury and recovery that can be incorporated with a wearable joint health system for use outside of clinical settings, and in austere or under-resourced conditions, to aid treatment and therapy.

Index Terms: Wearable sensing, knee health, rehabilitation engineering, unsupervised learning, graph mining

I. Introduction

Acoustic emissions from joints during movement are a phenomenon clinicians qualitatively observe and refer to as crepitus [3]. Damaged cartilage and ligaments disrupt joint stability and geometry and cause edema; these factors alter tissue interfaces during specific movements, thus modifying acoustical emissions. However, attempts to quantify and interpret acoustic emissions to inform clinical decisions regarding joint health have had limited success.

Blodgett first proposed the concept of auscultating the knee with a standard stethoscope in 1902 [4]; he described his observations as “insufficient” and expressed his hope that the “preliminary report may stimulate independent investigation” in follow-on studies. Over the past century, researchers have employed large precision microphones and surface vibration sensors for in-clinic measurements of joint sounds, aiming to diagnose degenerative diseases from a snapshot measurement compared against a population norm [5–8]. While some promising results were obtained, the major challenge was the large inter-subject variability in joint acoustical emission signatures, and thus the relatively low specificity of such diagnostic approaches compared to costly medical imaging [1]. Moreover, a lack of mechanistic understanding of how specific structural changes in the knee impact specific joint sound patterns has hindered progress in the field. Indeed, rather than in diagnostics based on specific sounds, the untapped potential of joint sounds may lie in the longitudinal assessment of the same person’s knee health over time. We anticipate that knee acoustical emissions may exhibit more complex or erratic patterns following injury than in a healthy joint, and that this acoustic heterogeneity will abate after recovery from corrective surgery.

In previous work we presented a system for facilitating wearable joint acoustical emission measurements outside of clinical settings on a longitudinal basis [10]. In that work, we examined acoustical emissions from subjects with healthy knees only, and focused on developing a robust interface between the sensor and the knee to ensure that high-quality measurements could be obtained. We found that consistent and robust recordings could be obtained with inexpensive and miniature microphones placed on the knee. This paper is the first to examine joint acoustical emissions in the context of knee injury and recovery, and the first to apply machine learning to such emissions to characterize joint health status. We hypothesized that joint acoustical emission heterogeneity would be more pronounced after knee injury (compared to when healthy) and would abate with corrective surgery and recovery. We created a novel unsupervised-learning-based algorithm for visualizing the acoustical emissions from knee joints and for providing a quantitative output representing the heterogeneity of the measurements. We demonstrate, for the first time, that the heterogeneity of the constructed graph (quantified using the graph community factor, GCF) is higher for injured subjects than for healthy subjects. Furthermore, we demonstrate that this heterogeneity decreases in injured knees following corrective surgery and four to six months of recovery.

II. Methods

A. Human Subject Protocol and Subject Demographics

This study was an extension of our previously published work [10], in which we acquired knee joint acoustic emissions from thirteen male subjects with no recent knee injuries. Forty-two subjects participated in the current study, which was approved by the Georgia Institute of Technology Institutional Review Board (IRB) and the Army Human Research Protection Office (AHRPO). Thirty-three of these subjects had no recent injury (within the last two years) to either knee, and a single measurement was taken from each of these subjects. Nine of the subjects had an acute, unilateral knee injury, and one measurement was taken from each of these nine subjects within seven days of the injury. The injuries included a torn anterior cruciate ligament (six subjects), a torn lateral meniscus (one subject) and a sprained medial collateral ligament (two subjects). Seven of these subjects required corrective surgery, and a second measurement was taken from them four to six months following this surgery, at which point the subjects could resume functional activities.

Subject demographics and physical characteristics are presented in Table 1. Subjects were similar in age, height and weight between the injured (n=9) and non-injured (n=33) groups. A lower extremity functional scale questionnaire, validated and utilized in clinical decision making (Binkley et al. [12]), was completed at each laboratory visit. This self-reported score has a maximum value of 80, attained when no symptom limitations in daily function or activity are reported in the lower extremity. The questionnaire comprises 20 questions on the level of difficulty the subject has in performing various daily and sports activities (standing, sitting, running, squatting, etc.). As expected, a significantly worse lower extremity functional score was reported by the injured subjects compared to the healthy subjects (p<0.01 using a two-sample Kolmogorov-Smirnov test).

Table 1.

Demographic Data for Study Participants

Healthy Injured
Number of Subjects 33 9
# Females (% of group) 7 (21%) 1 (11%)
# Males (% of group) 26 (79%) 8 (89%)
Age (mean ± σ, in years) 19.8 ± 0.9 20.8 ± 1.6
Height (mean ± σ, in cm) 184.0 ± 8.0 186.1 ± 7.9
Weight (mean ± σ, in kg) 94.6 ± 18.0 106.6 ± 22.3
Lower Extremity Functional Score 78.8 ± 2.5 36.3 ± 10.5*
*

Indicates statistically significant difference (p<0.01).

For each subject, an electret microphone (COS-11D, Sanken Microphone Co., Japan) was attached to the medial aspect of the patella using Kinesio Tex tape (shown in Figure 1(a)). The microphone was attached such that it was close to, but not in contact with, the skin. The subject then performed five repetitions of unloaded knee extension and flexion while seated, without foot contact with the ground. The sound signals from the microphone were sampled at 44.1 kHz and recorded using an audio recorder (Zoom H6 Recorder, Zoom Corp., Japan). The measurement procedure was repeated on both knees for all subjects (healthy and injured).

Figure 1.

An overview of the methods by which signals are acquired from the knee joint and subsequently analyzed. (a) The sensor setup used to acquire knee joint sounds. An air microphone (electret) affixed at the medial side of the patella using Kinesio Tex tape is used to acquire the signals. (b) The knee joint sound signal (x(t), in blue) and the spectrogram of x(t), measured from the knee of a human subject during four leg extension and flexion cycles. (c) The signal analysis workflow for knee joint sounds. The signals from the left knee of the subject are filtered and standardized (to zero mean and unity variance) and windowed (frame length of 200 ms with 50% overlap). M=64 features (f_1,f_2…f_M) are extracted from each of the NiL frames and stored in an NiL×M matrix where each row represents a frame and each column represents a feature. The aforementioned steps are repeated for the right knee and the data matrices formed using both knees are concatenated. A k-Nearest Neighbor graph (kNN graph) is constructed from the matrix formed using data from both knees. The graph community factor (GCF) is calculated from the kNN graph.

B. Signal Processing and Feature Extraction

The sound signal recorded from a knee with an acute injury during knee extension and flexion cycles is shown in Figure 1(b) as x(t). The signal x(t) contains high-energy, short-duration (on the order of 10 ms) acoustic signatures that have a “spike-like” appearance in the plot of the signal. As seen from the spectrogram of x(t) in Figure 1(b), these joint sound signatures occupy broad frequency bands with components as high as 20 kHz, but are mostly limited to frequencies below 15 kHz. The signal x(t) also has frequency content below 1 kHz, most of which can be attributed to the microphone and tape rubbing on the subject’s skin (interface noise).

The signal processing and feature extraction workflow for the knee joint sound signals is illustrated in Figure 1(c). The sound signals acquired during five extension and flexion cycles from the left knee of the subject are digitally filtered using a finite impulse response band-pass filter with a pass band of 1 kHz–15 kHz. The lower cutoff of 1 kHz removes most of the interface noise in the signal, and the upper cutoff of 15 kHz limits the bandwidth of the signal to be analyzed while keeping most of the information from the high-energy, short-duration joint sound signatures. The filtered signal is then standardized to have zero mean and unity variance to compensate for variations in the distance between the microphone and the skin. The standardized signal (shown in blue in Figure 1(c)) is then windowed with a frame duration of 200 ms and 50% overlap between frames, resulting in NiL signal frames from the left knee (L) of the ith subject. The frame duration was chosen heuristically to allow multiple joint sound signatures to be present within a given frame. A total of M=64 features were extracted from each frame and placed in an NiL × M dimensional matrix, where each column represents a feature and each row represents a frame. The features extracted from each frame are summarized in Table 2 (refer to Giannakopoulos et al. [13] for more detail on features f1–f35). The aforementioned steps are repeated for the right knee of the ith subject, resulting in another NiR × M dimensional matrix. The matrices formed from both knees are concatenated to form a final (NiL + NiR) × M dimensional data matrix Xi for the ith subject.
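The standardization and windowing steps above can be sketched in a few lines of NumPy. This is a minimal illustration only, assuming the 1–15 kHz band-pass filtering has already been applied; the function and parameter names are ours, not the authors’.

```python
import numpy as np

def frame_signal(x, fs=44100, frame_ms=200, overlap=0.5):
    """Standardize x to zero mean / unit variance, then split it into
    overlapping frames (200 ms, 50% overlap), as in the paper's workflow.
    Band-pass filtering (1-15 kHz FIR) is assumed to be done beforehand."""
    x = (x - x.mean()) / x.std()       # compensate for microphone-skin distance variation
    flen = int(fs * frame_ms / 1000)   # samples per frame (8820 at 44.1 kHz)
    hop = int(flen * (1 - overlap))    # hop between frame starts (50% overlap)
    n = 1 + (len(x) - flen) // hop     # number of full frames that fit
    return np.stack([x[i * hop : i * hop + flen] for i in range(n)])

# Two seconds of toy data in place of a real recording.
frames = frame_signal(np.random.randn(44100 * 2))
```

Each row of `frames` would then be passed to the feature extractor to build one row of the Ni × M matrix.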

Table 2.

Description of Audio Features Extracted from Each 200 ms Frame

Feature Number and Name General Description Significance in terms of knee joint sounds
f1) Energy Total signal energy. High in frames with knee joint sound signatures.
f2) Zero Crossing Rate The rate of sign change. High in frames with knee joint sound signatures.
f3) Energy Entropy Measures sudden signal energy changes. Low in frames with joint sound signatures and lots of interface noise.
f4) Spectral Centroid The first moment of the signal spectrum. High in frames dominated by joint sound signatures.
f5) Spectral Spread The second central moment of the signal spectrum. High in frames with joint sound signatures.
f6) Spectral Flux A measure of the difference between the signal spectra of successive frames. Lower in frames rich in joint sound signatures as the spectra of these signatures are more consistent than that of random background and interface noise.
f7) Spectral Entropy A measure of the “complexity” of the spectrum of the signal. High in frames with lots of joint sound signatures as these frames have irregularly shaped spectra. Background and interface noise have flatter spectra.
f8) Spectral Roll-off The frequency below which 90% of the signal energy is concentrated. High in frames with joint sound signatures.
f9–f21) Mel-Frequency Cepstrum Coefficients A popular speech processing feature related to the signal cepstra. Coefficients f9–f13 in particular are discriminative in identifying frames that have joint sound signatures vs. background/interface noise.
f22–f35) Fundamental frequency, harmonic ratio and Chroma vector Features widely used in music information retrieval. These features are capable of indicating frames rich in joint sound signatures.
f36–f64) Band powers Power of the signal in 29 distinct frequency bands, between 30 logarithmically spaced frequencies in the range of 1 kHz–15 kHz. Higher frequency band powers obtain high values in frames rich in joint sound signatures, while low frequency band powers are high in frames dominated by interface noise.

Indicates time domain features. All others are frequency domain features.
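As one concrete example from Table 2, the band-power features f36–f64 can be sketched as below. This is our own illustrative implementation of “power in 29 bands between 30 logarithmically spaced frequencies from 1 kHz to 15 kHz”, not the authors’ code; the function name and the use of a plain FFT periodogram are assumptions.

```python
import numpy as np

def band_powers(frame, fs=44100, n_bands=29, f_lo=1000.0, f_hi=15000.0):
    """Features f36-f64: signal power in 29 bands whose edges are the 30
    logarithmically spaced frequencies between 1 kHz and 15 kHz."""
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands + 1)
    spec = np.abs(np.fft.rfft(frame)) ** 2          # FFT power spectrum of the frame
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs) # frequency of each FFT bin
    return np.array([spec[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in zip(edges[:-1], edges[1:])])

# One 200 ms toy frame at 44.1 kHz (8820 samples).
bp = band_powers(np.random.default_rng(0).normal(size=8820))
```

The 29 resulting values would occupy columns f36–f64 of a frame’s feature row.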

C. Graph Community Factor Calculation

1) Methodological details

We use the data matrix Xi from each subject to construct a kNN graph, such that the data can be visualized in a lower-dimensional (i.e., 2D) space while the underlying geometric relationships between the data points are preserved. We represent the rows of Xi as vertices in the graph and connect each vertex to its k nearest neighboring vertices using the Euclidean distance metric. Weights are then assigned to each graph edge using dice similarity, so that we incorporate the properties of each point’s neighborhood rather than relying on Euclidean distance alone in attributing points to particular clusters or communities [14]. Letting vp and vq represent two connected vertices within the graph, the weight for the edge between these vertices is defined as

wpq = 2|Ap ∩ Bq| / (Dp + Dq) (1)

where Ap and Bq are the sets of k nearest neighbors of vp and vq respectively, Dp and Dq denote the degrees of vp and vq, and |·| indicates the number of elements in a set. Therefore, for a kNN graph, the dice similarity is the ratio of twice the number of neighbors that vp and vq have in common to the sum of their degrees.
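Equation (1) can be sketched in pure Python as follows. This is a minimal illustration with names of our choosing; for simplicity it treats the kNN graph as directed and uses the size of each vertex’s neighbor set as its degree, which is an assumption rather than the paper’s exact construction.

```python
def dice_weights(neighbors):
    """Dice-similarity edge weights for a kNN graph (Eq. 1).
    `neighbors` maps each vertex to the set of its k nearest neighbors.
    ASSUMPTION: degree is taken as the neighbor-set size (directed view)."""
    w = {}
    for p, Ap in neighbors.items():
        for q in Ap:                    # one directed kNN edge p -> q
            Bq = neighbors[q]
            # twice the shared neighbors over the sum of the two degrees
            w[(p, q)] = 2 * len(Ap & Bq) / (len(Ap) + len(Bq))
    return w

# Toy kNN lists for four vertices with k = 2.
nbrs = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {0, 1}}
w = dice_weights(nbrs)
# Edge (0, 1): one shared neighbor {2}, so w = 2*1 / (2+2) = 0.5
```

Two vertices with no common neighbors receive weight zero, which is what prevents the community merging discussed in Section II.C.2.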

After the weighted graph is constructed, the potential communities within the graph are detected using the Infomap community detection algorithm [15]. This algorithm considers the amount of time a random walk spends in different portions of the graph to reveal the communities within it [15, 16]. Its computational complexity is O(|E|), where E is the set of edges in the graph. The version of Infomap used for this work is readily available online [17]. Once the communities are detected, the GCF is calculated as the number of communities discovered. The feature extraction and graph community detection were performed on a desktop system with a 3.2 GHz CPU and 8 GB of RAM.
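The counting step itself is simple once communities are detected. The paper uses Infomap [15, 17]; since a self-contained sketch is easier with the standard library ecosystem, the illustration below substitutes networkx’s greedy modularity maximization, a different community detection algorithm, purely to show that GCF is the number of detected communities.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def gcf(graph):
    """GCF = number of communities detected in the (weighted kNN) graph.
    NOTE: the paper uses Infomap; greedy modularity maximization is
    substituted here only to illustrate the counting step."""
    return len(greedy_modularity_communities(graph))

# Two 5-cliques joined by a single edge: a clearly bimodal toy graph,
# standing in for the kNN graph of a subject's feature frames.
G = nx.barbell_graph(5, 0)
```

On this toy graph the two cliques are recovered as two communities, so `gcf(G)` plays the role of the per-subject GCF value.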

To assess the effect of dimensionality reduction on the GCF metric, we performed principal component analysis (PCA) on the data matrix Xi [18]. Following PCA we produced new matrices keeping only the first two, five and ten principal components out of 64, labelling them Xi,2, Xi,5 and Xi,10. We then calculated the GCF for each of these matrices and labelled them GCF2, GCF5 and GCF10 respectively.
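The PCA reduction used to form Xi,2, Xi,5 and Xi,10 can be sketched with an SVD, as below. This is a generic PCA implementation with our own function name, not the authors’ code ([18] is their PCA reference).

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the (frames x features) matrix X_i onto its leading principal
    components, returning the reduced matrix and the fraction of total
    variance explained by each kept component."""
    Xc = X - X.mean(axis=0)                          # center each feature column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = s**2 / np.sum(s**2)                  # per-component variance ratio
    return Xc @ Vt[:n_components].T, var_ratio[:n_components]

# Toy stand-in for a subject's 64-feature frame matrix.
rng = np.random.default_rng(0)
X2, var2 = pca_reduce(rng.normal(size=(200, 64)), 2)
```

Calling `pca_reduce(Xi, 2)`, `(Xi, 5)` and `(Xi, 10)` would yield the matrices from which GCF2, GCF5 and GCF10 are computed.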

2) Considerations and Justifications of Graph Modeling Approach

It is expected that the data from both knees of a healthy subject would be more homogeneously distributed within the feature space than the data from a subject with a unilateral knee injury. However, modeling the distribution of Xi with models such as a Gaussian or Student’s t-distribution [19] requires strong assumptions about the shape of the data in the high-dimensional space (e.g., ellipsoid, convex). Furthermore, such models require parameters of the underlying distribution to be estimated, which is difficult in high dimensions due to the curse of dimensionality [20, 21]: in a high-dimensional space, where all data points appear sparse, it becomes increasingly difficult to characterize the properties of the data. Kernel-density-estimation-based clustering techniques suffer from the same problem due to the existence of parameters to be estimated, such as the kernel bandwidth.

Rather than modelling the distribution of the data in the high-dimensional feature space, we use a kNN graph, which has been used in previous studies to model and cluster high dimensional data in bioinformatics [22, 23]. We quantify the heterogeneity of the data distribution using the number of communities within the graph, which is expected to be higher for data that is more heterogeneously distributed.

Communities are detected within the data graph instead of finding clusters within the data using regular clustering algorithms such as k-means, Gaussian mixture models, or spectral clustering, as such algorithms take the number of clusters to be found as an input. Additionally, using kernel density estimate based clustering algorithms [24] to estimate the number of clusters in the data is not a feasible solution. Such techniques are time consuming for high-dimensional data, and the curse of dimensionality makes it difficult to robustly detect dense areas within the data distribution.

Another critical aspect of the proposed method is the use of dice similarity as graph edge weights. A kNN graph with k=3 that contains three predefined communities, shown in black, blue, and red respectively, is presented in Figure 2. If a similarity metric based on the Euclidean distance between vertices (such as the reciprocal of the Euclidean distance) is used as the edge weight, the blue and red communities are merged during community detection, leading to two communities being detected rather than three. This occurs because vp and vq are vertices that are close to each other (in terms of Euclidean distance) but belong to different communities, so the edge weight wpq would be non-zero. The merging of distinct communities can be exacerbated by the existence of concavely shaped communities, such as the red community. Using a metric derived from the Euclidean distance also makes the number of communities detected very sensitive to the parameter k. Using dice similarity for the edge weights addresses these problems: the weight wpq of the edge between vertices vp and vq (shown in green) is zero under dice similarity, as these vertices have no common neighbors. Dice similarity also makes the algorithm more robust to changes in the parameter k, making it more easily generalizable.

Figure 2.

A representative k-Nearest Neighbor graph where k=3. The graph contains three underlying communities shown in red, blue, and black. The communities in red and blue cannot be distinctly detected using edge weights derived from the Euclidean distance between the vertices vp and vq, as the weight of the edge shown in green (wpq) would be non-zero. Using dice similarity to calculate edge weights solves this problem, as it makes wpq zero.

III. Results

We demonstrated the capability of the GCF metric to discriminate healthy from injured subjects by analyzing knee sounds gathered from forty-two college-athlete human subjects, with and without recently acquired acute, unilateral knee injuries. The data from all forty-two subjects (33 healthy and 9 with acute unilateral knee injuries) are visualized in Figure 3 using t-Stochastic Neighbor Embedding (t-SNE) [25]. Each data point on this plot corresponds to a 200 ms window of an acoustic signal acquired from a subject’s knee joint. Each window is represented using 64 audio features, and the dimensionality of the data is reduced to two using t-SNE. The two t-SNE dimensions do not necessarily have physical meanings. This visualization shows that the frames from injured (in cyan) and control (in pink) subjects cannot be separated using the extracted features alone. To overcome this challenge, an unsupervised graph mining approach was employed to devise a metric that could distinguish injured from healthy subjects.

Figure 3.

Visualization of the audio signal frames from nine subjects with acute unilateral knee injury (in cyan) and 33 healthy subjects (in pink), using t-Stochastic Neighbor Embedding (t-SNE). t-SNE computes a distance matrix corresponding to distances between every pair of frames. It converts this distance matrix to joint probabilities and minimizes the Kullback-Leibler divergence between the joint probabilities of the 2D embedding space and the high-dimensional feature space. Note that the new calculated dimensions (e.g. t-SNE dimension 1 and t-SNE dimension 2) do not correspond to any specific acoustical features. These dimensions integrate the relations between the individual data points in the high dimensional space to represent them in a lower dimensional space.
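A t-SNE embedding of this kind can be reproduced in outline with scikit-learn. The sketch below uses synthetic stand-ins for the 64-feature frame matrix (the study pooled frames from all subjects); the parameter values shown are our own choices, not the paper’s.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for the pooled (frames x 64 features) matrix.
rng = np.random.default_rng(0)
frames = rng.normal(size=(100, 64))

# Embed into 2D; the two output dimensions have no direct physical meaning.
emb = TSNE(n_components=2, perplexity=20.0, random_state=0).fit_transform(frames)
```

Plotting `emb` colored by group (injured vs. healthy) would yield a figure analogous to Figure 3.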

Figure 4(a–c) provides visualized data (knee graphs) constructed for a representative healthy subject (4(a)) and a representative injured subject (4(b), measured within seven days of the injury; 4(c), the same subject measured again six months following reconstructive surgery). As described in the Methods, each node in the graph represents the high-dimensional vector of time and frequency domain features extracted from one windowed segment of the acoustical emission waveform. The different communities detected within these graphs are shown in different colors at the bottom. The number of communities detected within the healthy subject’s data graph (4(a)) was 16 (GCF=16); for the injured subject’s graph (4(b)), constructed from the recording taken within seven days of the injury, the GCF was 30. The GCF then decreased to 15 for the same injured subject after corrective surgery and six months of recovery (4(c)). The graphs on the upper and lower left (healthy subject, 4(a)) show a set of densely clustered, homogeneous nodes, with many nodes falling close to one another in the high-dimensional space. The graphs in the upper and lower middle (4(b), injured subject within seven days of the injury), by contrast, show a more heterogeneous set of nodes, geometrically spread out rather than clustered densely. Finally, the graphs on the upper and lower right (4(c), injured subject six months after surgery and recovery) demonstrate that the same subject’s nodes become much more homogeneous following recovery, closely resembling those observed in the healthy subject’s graph.

Figure 4.

The relationship between graph heterogeneity (quantified using the graph community factor, GCF) and acute unilateral knee injury. (a) The graph constructed using features extracted from the audio signals acquired from both knees of a healthy subject (top). The Infomap community detection algorithm discovered 16 communities (GCF=16) in the graph, all shown in distinct colors on the graph in the bottom. (b) The graph constructed using the data acquired from a subject with an acute unilateral knee injury, where 30 communities are detected. The heterogeneity of the features for the injured subject is visually and quantitatively greater than for the healthy subject. (c) The graph constructed using the data acquired from the injured subject shown in Figure 4(b), after corrective surgery, where the number of communities detected has decreased to 15, and the heterogeneity has decreased visually.

Figure 5(a) compares the GCF metric of nine subjects with acute, unilateral knee injuries to that of 33 healthy subjects. Given the features extracted from the acoustical emissions of both knees, the graph community detection algorithm took on average 4.00 seconds to run per subject on our system. While the GCF is derived from each subject’s data independently, the metric can be compared among subjects in an absolute manner, with a lower GCF indicating more homogeneous acoustical emission signatures and a higher GCF more heterogeneous emissions. In the bar-plots shown in Figure 5(a), the height of the bar represents the mean value of the GCF metric within the population (injured or healthy) while the error bars represent one standard deviation. The GCF metric was higher for subjects with an acute, unilateral knee injury than the healthy subjects, and the difference between the groups was statistically significant (p=0.01). A two-sample Kolmogorov-Smirnov test was performed to evaluate the statistical significance.
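The group comparison can be reproduced in outline with SciPy’s two-sample Kolmogorov-Smirnov test, the test used throughout this paper. The arrays below are synthetic stand-ins drawn from the reported group sizes, means and standard deviations, not the study data.

```python
import numpy as np
from scipy.stats import ks_2samp

# Synthetic stand-ins matching the reported group sizes, means and SDs.
rng = np.random.default_rng(1)
gcf_healthy = rng.normal(18.5, 3.5, size=33)   # n = 33 healthy subjects
gcf_injured = rng.normal(24.8, 4.4, size=9)    # n = 9 recently injured subjects

# Two-sample KS test between the two GCF distributions.
stat, p = ks_2samp(gcf_healthy, gcf_injured)
```

With the real per-subject GCF values in place of the synthetic arrays, `p` corresponds to the p-values reported in the Results.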

Figure 5.

(a) The GCF calculated for healthy subjects (n = 33, shown in pink) and subjects with an acute unilateral knee injury within seven days of the injury (n = 9, shown in cyan). The bars represent the mean of the GCF within the population and the error bars represent one standard deviation. The asterisk (*) represents a statistically significant difference (p=0.01), where the p-value is calculated using a two-sample Kolmogorov-Smirnov test. (b) The GCF metric for seven subjects with unilateral knee injury immediately after injury (within seven days) and 4–6 months after corrective surgery. The black data points connected with lines represent each subject’s data. The red data points and error bars represent the mean and one standard deviation of the GCF for all seven subjects, before and after surgery. The asterisk (*) represents statistical significance (p=0.01) based on a two-sample Kolmogorov-Smirnov test.

Figure 5(b) presents the individual change in the GCF metric during injury recovery for the seven subjects with unilateral knee injuries who were treated with corrective surgery. The GCF metric decreased for six of these seven injured subjects. In the one exception, the GCF increased; this subject had a very low GCF value for the first recording, which is likely attributable to a noisy measurement. Overall, the GCF values were significantly lower following surgery and recovery (two-sample Kolmogorov-Smirnov test, p = 0.01).

The proposed algorithm has one hyper-parameter, k, the number of neighbors each node in the kNN graph is connected to. Figure 6 shows the effect of varying k (see the section Graph Community Factor Calculation) on the ability of the GCF metric to discriminate healthy from injured subjects. The discriminating ability of the GCF metric is best for k=5 among the values considered (5, 10, 15, and 20). For all values of k except 20, the differences between the injured and healthy groups were significant (p<0.05). This shows that the discriminative ability of the GCF metric is robust to variations in the parameter k.

Figure 6.

The effect of the parameter k, the number of neighbors each vertex is connected to in the constructed graph, on the ability of the graph community factor (GCF) metric to discriminate between healthy subjects (pink) and subjects with an acute unilateral knee injury within seven days of the injury (cyan). Values of k of 5, 10, 15, and 20 are considered. All differences were found to be significant (p < 0.05) except for the case of k = 20, as denoted by the asterisks (*).

The proposed algorithm uses a feature set comprising mostly frequency domain features along with some time domain features (Table 2). We compared this choice of features to feature sets previously used in analyzing knee joint acoustical emissions (Table 3). We extracted features from each 200 ms audio signal frame with five different feature sets (FS1–FS5) along with the features in Table 2 (FS6). We then computed the GCF for each feature set by constructing a kNN graph (k=5) from the data matrices from both knees and using Infomap to count the number of communities within the graph. For each feature set, we performed a two-sample Kolmogorov-Smirnov test to evaluate whether the difference in GCF between injured and healthy subjects was statistically significant. Table 3 presents the p-values computed for the five feature sets along with the average feature extraction time per subject (both knees). FS1, which is also used in [8] to analyze knee joint acoustical emissions, performs orthogonal matching pursuit (OMP) decomposition of each 200 ms frame and uses the matching pursuit atoms to construct a time-frequency distribution (TFD) for that frame; the amplitudes of the TFD matrix entries are used as features. Upon implementation, we observed that this method takes several hours to extract features from each subject (OMP decomposition of a 200 ms signal takes around 25 seconds, and the measurements from each knee include around 250 such frames). This method of feature extraction was therefore not pursued further, as its computation time is infeasible for a wearable system. FS2 uses a spectrogram (short-time Fourier transform), a commonly used TFD that has also been previously applied to knee joint acoustical emissions [2]. For this feature set, we computed the spectrogram of each 200 ms audio frame (4 ms windows, 50% overlap and 128 FFT points) and used the logarithmic amplitudes of the spectrogram matrix entries as features. FS3 uses a wavelet-based TFD, as suggested in [2], which has higher resolution in time and frequency than spectrograms. In this method, a continuous wavelet TFD of each 200 ms frame was computed using a Daubechies-8 wavelet, the wavelet chosen in the previous work. The number of scales used for the wavelet TFD was 25, capturing frequencies from 1.3 kHz to 33 kHz. The amplitudes of the entries within the TFD matrix were used as features. FS4, which uses the feature set suggested in [9], extracts features using autoregressive (AR) modelling. The AR model parameters of each 200 ms audio frame are estimated for a model of order 40 (as suggested by the previous work). The poles of the model were computed, keeping only one of each conjugate pair; the poles were ranked in descending order of amplitude, and the real and imaginary parts of each pole were used as features. Finally, FS5 uses a set of features suggested in [11]: the mean, standard deviation, form factor, entropy, skewness and kurtosis of each 200 ms audio frame, computed after normalizing the frame to the range [0, 1]. FS5 was the only feature set made up of time domain features, while feature sets FS1–FS4 use frequency domain features.

Table 3.

Comparison with other features sets

Feature Set p-Value Feature Extraction Time Per Subject (s)
(FS1) TFD (OMP) [8] - Several Hours
(FS2) TFD (spectrogram) [2] 0.08 8.36
(FS3) TFD (daubechies 8 CWT) [2] 0.26 34.49
(FS4) AR model coefficients [9] 0.16 46.67
(FS5) Signal Histogram [11] 0.33 8.70
(FS6) This Work 0.01* 119.3
* Indicates statistically significant difference (p<0.01).

TFD = time-frequency distribution. OMP = orthogonal matching pursuit. CWT = continuous wavelet transform. AR = autoregressive.

While the feature set used in this work (FS6) produced a GCF that was statistically significantly different between injured and healthy subjects, none of the other feature sets suggested in previous studies (FS2–FS5) produced statistically significant results. FS1 was excluded from the comparison because of its prohibitive computation time. Although our feature set produced the best results in terms of statistical significance, its computation time was higher than that of FS2–FS5, while still remaining feasible for a wearable system (about two minutes).

We also assessed the ability of the GCF metric calculated following dimensionality reduction via PCA (GCF2, GCF5 and GCF10, as explained in Section II.C.1) to distinguish between healthy and injured subjects. We found that keeping two, five, and ten principal components explained on average 83.6±8.1%, 94.3±4.0%, and 98.0±1.5% of the variance in the subjects' data, respectively. Furthermore, the metrics GCF2, GCF5 and GCF10 were not statistically significantly different between healthy and injured subjects (p=0.65, p=0.52 and p=0.18, respectively).
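The dimensionality reduction step can be sketched as follows on a synthetic stand-in for one subject's data matrix; the cumulative explained variance ratio plays the role of the 83.6%, 94.3%, and 98.0% figures reported above.

```python
import numpy as np
from sklearn.decomposition import PCA

# synthetic stand-in for one subject's (frames x features) data matrix X_i,
# mixed to give the features some correlation structure
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64)) @ rng.normal(size=(64, 64))

# cumulative fraction of variance explained as components are added
cum = np.cumsum(PCA().fit(X).explained_variance_ratio_)
var_2, var_5, var_10 = cum[1], cum[4], cum[9]  # variance kept by 2, 5, 10 PCs
```

Keeping the first m columns of the PCA-transformed data would then yield the reduced matrix from which GCFm is computed.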

IV. Discussion

The changes in GCF were consistent with improved knee health as assessed by the athletic trainer's subjective evaluation, as well as with the improvement in the lower extremity functional scores. The lower extremity functional score (see Human Subject Protocol and Subject Demographics) for the injured subjects increased significantly (p<0.01) from 36.1±12.1 in the first measurement (within seven days of the injury) to 63.7±11.0 in the second measurement (four to six months after surgery). Note that the functional score is derived from subject responses to survey questions regarding their ability to perform activities, and accordingly is qualitative and can be subjective. The GCF metric provided by the joint sound recordings can augment this functional score and the medical professional's evaluation by providing quantitative and objective data regarding joint health status. Moreover, if such data are obtained longitudinally throughout rehabilitation, therapies can be titrated based on the changing joint health status of the patient. This paper shows the initial efficacy and potential of joint sounds, and the associated GCF, as a metric of joint health; further studies with larger sample sizes can allow comparisons between GCF and functional scores, as well as investigation of the possible combination of both for a more holistic assessment of joint health.

Knee joint acoustical emissions are complex signals produced by the underlying structures of the knee joint [8]. As the knee is flexed and extended, the femur and tibia move, and the cartilage surfaces of these structures glide over each other as well as over the patella. This gliding produces vibrations that contribute to knee joint acoustical emissions [2]. Furthermore, these motions cause pressure changes in the synovial fluid within the joint and create vibrations in the surrounding muscles, which also contribute to the acoustical emissions [26]. These acoustical emissions are therefore generated by multiple complex processes within the knee joint and contain information about the underlying structures that generate them. In this work, we study the acoustical emissions generated by both knees by creating a kNN graph from these signals, and we conclude that the generated graph has more heterogeneity (defined as a higher GCF) for injured subjects. One possible reason for the increased GCF in injured subjects is a greater difference between the acoustical emissions produced by the two knees (injured versus healthy), possibly due to changes in the structure of the injured knee. Another possible reason is greater variability within the acoustical emissions produced by either knee, due to reduced structural stability in the joint during particular movements.

This paper is the first to apply unsupervised graph mining algorithms to bioacoustical signals. These powerful algorithms, designed for visualizing and quantifying similarities in high dimensional datasets, have previously been used in genomics and single-cell data analysis [22], but have never been used to derive new knowledge from bioacoustical signals such as knee sounds. Compared to other physiological signals that have been studied extensively, namely signals of electrophysiological origin such as the electrocardiogram, knee acoustical emission waveforms do not have readily identifiable characteristic points or features, and they exhibit high inter-subject variability. Each person's knees vary in size, shape, composition, and structure, and thus the sounds emitted during motion are, as expected, quite variable in nature. Accordingly, conventional feature extraction and classification algorithms based on identifying peaks, time intervals, frequency domain characteristics, or even combined time-frequency analyses (e.g., the wavelet transform) cannot readily discriminate between healthy and injured joints in recordings taken from a population of subjects. Importantly, there is no existing knowledge of which knee acoustical emission features are associated with an injured versus a healthy joint; supervised learning approaches are thus confined to black box models, which require very large datasets. Graph mining algorithms, on the other hand, are designed specifically to visualize and quantify the distribution of data, even for smaller datasets, in high dimensional spaces. In particular, the use of k-Nearest Neighbor (kNN) graphs for high dimensional data has been demonstrated to be more robust than other methods [27].

The entire unsupervised learning algorithm used in this paper requires only a single parameter to be tuned: the value of k, which represents the number of neighbors connected to each vertex in the graph. Thus, the sensitivity of the results to the value of this parameter can readily be computed, as shown in Figure 6. The results were remarkably insensitive to the chosen value of k, with statistically significant differences between the two populations manifesting across a wide range of values. We thus anticipate that the approach defined in this paper will generalize well to other datasets of joint acoustical emissions for healthy and injured knees, or for other joints.

In this study, the value of k was chosen heuristically to be 5; however, a good value for this parameter could be chosen using a grid search. For this, a value of k would be chosen from a set (or grid, e.g., [5, 10, 15, 20]). The GCF metric would be calculated for each subject, and the p-value of the GCF in distinguishing between injured and healthy subjects, denoted p(k), would be computed using the two-sample Kolmogorov-Smirnov test. This would be repeated for all values of k in the set, and the value that minimizes p(k) would be chosen. When the GCF is then calculated for a new subject from that subject's knee recordings, the chosen value of k would be used. This procedure was not simulated in the present study due to the limited number of subjects available.
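The grid search described above can be sketched as follows. The GCF values here are synthetic stand-ins (drawn to match the paper's group means) for the per-subject scores the study would produce at each candidate k.

```python
import numpy as np
from scipy.stats import ks_2samp

def grid_search_k(gcf_injured, gcf_healthy, k_grid):
    """Choose k by minimizing the two-sample KS p-value p(k) between the
    injured and healthy GCF distributions. gcf_injured / gcf_healthy map
    each candidate k to the per-subject GCF values computed with that k."""
    p = {k: ks_2samp(gcf_injured[k], gcf_healthy[k]).pvalue for k in k_grid}
    return min(p, key=p.get), p

# synthetic per-subject GCF scores for each candidate k (illustrative only)
rng = np.random.default_rng(1)
k_grid = [5, 10, 15, 20]
injured = {k: rng.normal(24.8, 4.4, 9) for k in k_grid}
healthy = {k: rng.normal(18.5, 3.5, 33) for k in k_grid}
best_k, pvals = grid_search_k(injured, healthy, k_grid)
```

In practice the GCF would be recomputed from the kNN graphs for each k rather than simulated, but the selection criterion is the same.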

While our selection of features produces better results than previously used feature sets, this comes at the cost of increased computation time. The higher feature extraction time of our feature set can be attributed to multiple signal transforms being performed while computing the features. In our implementation, three different FFTs are computed while extracting features from a single audio frame (to compute features f4–f8, f9–f21 and f36–f64). This inefficient implementation can be improved by computing a single FFT and deriving all Fourier-transform-based features from it. Feature extraction time could also be reduced by using more computationally efficient audio feature extraction toolboxes [28].
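As an illustration of the single-FFT refactoring, the sketch below computes one FFT per frame and derives several spectral features from it. The specific features shown (centroid, spread, entropy) are generic examples, not the paper's exact f4–f64 set.

```python
import numpy as np

def spectral_features(frame, fs=32000):
    """Compute one FFT per frame and derive multiple spectral features
    from the same magnitude spectrum, instead of re-computing the FFT
    for each feature group. Feature choices here are illustrative."""
    mag = np.abs(np.fft.rfft(frame))              # single FFT per frame
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    p = mag / (mag.sum() + 1e-12)                 # normalized spectrum
    centroid = (freqs * p).sum()                  # spectral centroid (Hz)
    spread = np.sqrt(((freqs - centroid) ** 2 * p).sum())
    entropy = -(p * np.log2(p + 1e-12)).sum()     # spectral entropy
    return centroid, spread, entropy

# a pure 1 kHz tone: its centroid should sit near 1000 Hz
frame = np.sin(2 * np.pi * 1000 * np.arange(6400) / 32000)
c, s, e = spectral_features(frame)
```

Any further Fourier-based features would reuse the same `mag` array at negligible extra cost.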

In this work we represent the data from both knees as a kNN graph and perform Infomap community detection to calculate the number of communities within the data. An alternative to graph community detection is clustering, which has led to favorable results in various previous studies [29, 30]. However, in such studies, the number of clusters to be detected is either predetermined [29] or varied as a hyper-parameter to optimize a final metric [30]. This approach can work when there is prior knowledge of the structure of the dataset (when the number of clusters within the data, or a range for it, is known). Such approaches are not feasible in our study, however, as our aim is precisely to determine the number of clusters (or communities) within the data and to use it as a biomarker. We use the number of detected communities as a measure of data heterogeneity (a higher number of communities indicating a more heterogeneous kNN graph).
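The kNN-graph-and-community-count pipeline can be sketched as below. Greedy modularity maximization stands in here for Infomap, which the paper actually uses; the two-blob data is a synthetic stand-in for the windows of a subject's recordings.

```python
import numpy as np
import networkx as nx
from sklearn.neighbors import kneighbors_graph
from networkx.algorithms.community import greedy_modularity_communities

def community_count(X, k=5):
    """Build a kNN graph over the feature vectors in X (windows x features)
    and count its communities. Greedy modularity substitutes for Infomap."""
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity")
    G = nx.Graph()
    G.add_nodes_from(range(X.shape[0]))
    G.add_edges_from(zip(*A.nonzero()))   # symmetrizes the directed kNN edges
    return len(greedy_modularity_communities(G))

# two well-separated groups of windows should yield multiple communities
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.1, (40, 8)), rng.normal(5, 0.1, (40, 8))])
n_comm = community_count(X)
```

Heterogeneous data (here, two distinct blobs) splits into more communities, which is exactly the behavior the GCF exploits.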

Additionally, we estimated the number of clusters within the data as an alternative to graph community detection. One possible way of determining the number of clusters within the data distribution (within the data matrix Xi) is to apply a chosen clustering algorithm with a range of values for the number of clusters and then select the number of clusters that produces the maximum silhouette score, a metric that evaluates clustering quality [31]. When we applied this method, using k-means as the clustering algorithm, we found that the difference between the number of clusters for injured and healthy subjects was statistically insignificant (p=0.97 using a two-sample Kolmogorov-Smirnov test). The difference remained insignificant when spectral clustering was used as well (p=0.99). We also used clustering algorithms that determine the number of clusters themselves. For example, X-means clustering is similar to k-means clustering but does not require the number of clusters as an input; instead, X-means estimates the number of clusters by optimizing the Bayesian Information Criterion (BIC), a model selection metric that typically takes lower values for better models [32]. Using the number of clusters found by X-means as our biomarker, we found that the difference between healthy and injured subjects was statistically insignificant under the two-sample Kolmogorov-Smirnov test (p=0.59). Affinity propagation is another algorithm that does not take the number of clusters as an input; it determines and outputs this number [33]. Using affinity propagation to estimate the number of clusters within the data matrix Xi and using this number as our biomarker also produced statistically insignificant results between injured and healthy subjects (p=0.93).
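The silhouette-based cluster-count estimate can be sketched as follows; k-means, the candidate range, and the three-blob toy data are illustrative choices, not the study's data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def n_clusters_by_silhouette(X, k_range=range(2, 8)):
    """Estimate the number of clusters in X by running k-means over a
    range of cluster counts and keeping the count with the highest
    silhouette score (a clustering-quality metric in [-1, 1])."""
    scores = {k: silhouette_score(
                  X, KMeans(n_clusters=k, n_init=10,
                            random_state=0).fit_predict(X))
              for k in k_range}
    return max(scores, key=scores.get)

# three well-separated synthetic blobs
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(c, 0.2, (50, 4)) for c in (0, 4, 8)])
k_hat = n_clusters_by_silhouette(X)
```

The estimated count would play the role the GCF plays in the main pipeline, which, as reported above, did not separate the two groups.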

We also found that dimensionality reduction, performed by applying PCA to each subject's data matrix and keeping only the first few principal components, deteriorated the ability of the GCF metric to distinguish between healthy and injured subjects. Even though keeping as few as two principal components explained on average more than 80% of the variance of each subject's data, dimensionality reduction via PCA did not produce favorable results. This sort of reduction works well when the data lie on a nearly linear manifold; in practice, however, many datasets contain non-linear structures that are not captured by the PCA transformation, which is a possible reason why PCA degrades the performance of our biomarker [34].

Our method also creates the possibility of performing feature importance assessment and feature selection. For example, to assess the importance of individual features, the following procedure could be employed. A given feature would be excluded from the feature set, the GCF for each subject would be recalculated, and a two-sample Kolmogorov-Smirnov test would then evaluate the statistical significance of the difference in GCF between injured and healthy subjects. If excluding the given feature diminishes the statistical significance (increases the p-value above 0.05), it can be concluded that the excluded feature is important and should therefore be kept in the feature set. Features evaluated as unimportant (those whose exclusion does not diminish statistical significance) can be removed from the feature set, and the reduced set can be used when the algorithm is run on a new subject. Many other common feature selection methods, such as sequential forward or backward selection, can also be applied using the p-value as a criterion to assess the effectiveness of a feature subset [35]. A disadvantage of such feature selection methods is that they would require training and testing procedures (as feature selection must be performed on a training set and then tested on new subjects) and cross-validation to evaluate the generalizability of the algorithm. These would require a larger dataset and would render our method supervised due to the necessity of a training procedure; this will therefore be the topic of future work.
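The leave-one-out importance procedure could be sketched as below. The scoring function and toy data are hypothetical stand-ins for the GCF computation and the study's per-subject feature matrices; only feature 0 carries a group difference here.

```python
import numpy as np
from scipy.stats import ks_2samp

def leave_one_out_importance(score_fn, X_injured, X_healthy,
                             n_features, alpha=0.05):
    """Flag a feature as important if excluding it raises the
    injured-vs-healthy KS p-value of the per-subject score above alpha.
    score_fn is any callable mapping a (frames x features) matrix to a
    scalar; here it stands in for the GCF computation."""
    important = []
    for j in range(n_features):
        keep = [i for i in range(n_features) if i != j]
        s_inj = [score_fn(X[:, keep]) for X in X_injured]
        s_hea = [score_fn(X[:, keep]) for X in X_healthy]
        if ks_2samp(s_inj, s_hea).pvalue > alpha:
            important.append(j)  # removing j destroyed the group difference
    return important

# toy data: features 1 and 2 carry no injured/healthy difference
rng = np.random.default_rng(4)
healthy = [rng.normal(size=(100, 3)) for _ in range(33)]
injured = [h.copy() for h in healthy[:9]]
for X in injured:
    X[:, 0] += 5.0                  # only feature 0 differs between groups
score = lambda X: X.mean()          # hypothetical stand-in for the GCF
imp = leave_one_out_importance(score, injured, healthy, 3)
```

Features whose removal leaves the separation intact (features 1 and 2 above) are candidates for exclusion.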

The memory required to implement the proposed method is a potential limitation for use in a wearable system. The space complexity of storing the knee graph as an adjacency matrix is O(|V|2), where V is the set of vertices in the graph. Hence, if the signal has many windows, the size of the graph grows quadratically, making the method impossible to implement on a wearable system with limited memory. A simple way to overcome this limitation is to store the graph as an adjacency list instead, since the space complexity of an adjacency list is O(|E|+|V|), where |E| and |V| are the numbers of edges and vertices in the knee graph, respectively. This complexity is far more affordable in a wearable implementation of the system.
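A back-of-the-envelope comparison of the two storage schemes, assuming 8-byte entries and a fixed k edges per vertex (so |E| = k|V|), illustrates the gap:

```python
def adjacency_bytes(n_vertices, k, entry_bytes=8):
    """Rough memory for storing a kNN graph of n_vertices windows with
    k neighbors each: dense adjacency matrix vs adjacency list.
    The 8-byte entry size is an illustrative assumption."""
    dense = n_vertices ** 2 * entry_bytes       # O(|V|^2)
    adj_list = n_vertices * k * entry_bytes     # O(|E| + |V|), |E| = k * |V|
    return dense, adj_list

# e.g., 500 windows with k = 5 neighbors each
dense, sparse = adjacency_bytes(n_vertices=500, k=5)
ratio = dense / sparse   # the list uses |V| / k times less memory
```

Because the kNN graph has only k edges per vertex, the adjacency list scales linearly with the number of windows, which is what makes the wearable implementation plausible.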

V. Conclusions

The scientific findings regarding joint sound complexity elucidated in this work through unsupervised knee graph mining can potentially provide significant future benefits for patients rehabilitating knee injuries, and advance the basic understanding of knee sound patterns in patients with injuries. The microphones and electronics used can be readily implemented in wearable systems, such as a knee brace worn by a user at home or in austere conditions. Such a wearable knee brace providing a biomarker of knee joint health can improve the quality of care by giving feedback to patients and physicians to potentially modify treatment / therapy. Additionally, the feature extraction and unsupervised learning algorithms can be performed nearly in real time, do not require computational resources beyond those available on a smartphone or tablet, and can potentially provide an output to the user indicative of joint health status, namely the GCF metric. Thus, a user could wear a brace with embedded microphones and electronics that sends the acoustical waveforms captured by these sensors wirelessly to a smartphone or tablet, where the GCF metric is computed. Future studies with more granular data recordings during rehabilitation can allow GCF scores to be mapped to rehabilitation progress, such that users can then potentially modify the rehabilitation protocol based on the GCF score. This could lead to more effective rehabilitation for patients at home or in austere conditions, with therapies tuned in real time according to their changing knee health status. We believe that a wearable joint health system with analytic capabilities can help speed the rehabilitation of patients with joint injuries (by providing real-time biomarker feedback), lower treatment costs (by supplementing more expensive medical evaluations), and provide a powerful objective assessment tool for populations in austere / under-resourced conditions.

Acknowledgments

This material is based upon work supported in part by the Defense Advanced Research Projects Agency, Arlington, VA under Contract No. W911NF-14-C-0058, and in part by the National Institutes of Health, National Institute of Biomedical Imaging and Bioengineering, Grant No. 1R01EB023808, as part of the NSF/NIH Smart and Connected Health Program.

Contributor Information

Sinan Hersek, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30308.

Maziyar Baran Pouyan, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30308.

Caitlin N. Teague, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30308.

Michael N. Sawka, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30308

Mindy L. Millard-Stafford, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30308

Geza F. Kogler, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30308

Paul Wolkoff, Georgia Tech Athletic Association, Georgia Institute of Technology, Atlanta, GA 30308.

Omer T. Inan, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30308.

References

1. Krishnan S, Rangayyan RM, Bell GD, Frank CB. Adaptive time-frequency analysis of knee joint vibroarthrographic signals for noninvasive screening of articular cartilage pathology. IEEE Trans Biomed Eng. 2000;47(6):773–783. doi: 10.1109/10.844228.
2. Wu Y, Krishnan S, Rangayyan RM. Computer-aided diagnosis of knee-joint disorders via vibroarthrographic signal analysis: a review. Crit Rev Biomed Eng. 2010;38(2):201–224. doi: 10.1615/critrevbiomedeng.v38.i2.60.
3. Calmbach WL, Hutchens M. Evaluation of Patients Presenting with Knee Pain: Part I. History, Physical Examination, Radiographs, and Laboratory Tests. American Family Physician. 2003;68(5):907–912.
4. Blodgett WE. Auscultation of the Knee Joint. The Boston Medical and Surgical Journal. 1902;146(3):63–66.
5. Shark L-K, Chen H, Goodacre J. Discovering Differences in Acoustic Emission Between Healthy and Osteoarthritic Knees Using a Four-Phase Model of Sit-Stand-Sit Movements. The Open Medical Informatics Journal. 2010;4:116–125. doi: 10.2174/1874431101004010116.
6. Lee JH, Jiang CC, Yuan TT. Vibration arthrometry in patients with knee joint disorders. IEEE Trans Biomed Eng. 2000;47(8):1131–1133. doi: 10.1109/10.855942.
7. Lee TF, Lin WC, Wu LF, Wang HY. Analysis of Vibroarthrographic Signals for Knee Osteoarthritis Diagnosis. :223–228. doi: 10.1186/s13104-016-2156-6.
8. Krishnan S, Rangayyan RM, Bell GD, Frank CB. Adaptive time-frequency analysis of knee joint vibroarthrographic signals for noninvasive screening of articular cartilage pathology. IEEE Trans Biomed Eng. 2000;47(6):773–783. doi: 10.1109/10.844228.
9. Rangayyan RM, Krishnan S, Bell GD, Frank CB, Ladly KO. Parametric representation and screening of knee joint vibroarthrographic signals. IEEE Trans Biomed Eng. 1997;44(11):1068–1074. doi: 10.1109/10.641334.
10. Teague CN, Hersek S, Töreyin H, Millard-Stafford ML, Jones ML, Kogler GF, Sawka MN, Inan OT. Novel Methods for Sensing Acoustical Emissions From the Knee for Wearable Joint Health Assessment. IEEE Trans Biomed Eng. 2016;63(8):1581–1590. doi: 10.1109/TBME.2016.2543226.
11. Rangayyan RM, Wu YF. Screening of knee-joint vibroarthrographic signals using statistical parameters and radial basis functions. Med Biol Eng Comput. 2008;46(3):223–232. doi: 10.1007/s11517-007-0278-7.
12. Binkley JM, Stratford PW, Lott SA, Riddle DL; North American Orthopaedic Rehabilitation Research Network. The Lower Extremity Functional Scale (LEFS): Scale Development, Measurement Properties, and Clinical Application. Physical Therapy. 1999;79(4):371–383.
13. Giannakopoulos T, Pikrakis A. Introduction to Audio Analysis: A MATLAB Approach. Academic Press; 2014. pp. 70–96.
14. Adamic LA, Adar E. Friends and neighbors on the web. Social Networks. 2003;25(3):211–230.
15. Rosvall M, Bergstrom CT. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences. 2008;105(4):1118–1123. doi: 10.1073/pnas.0706851105.
16. Pouyan MB, Nourani M. Clustering Single-Cell Expression Data Using Random Forest Graphs. IEEE Journal of Biomedical and Health Informatics. 2016. doi: 10.1109/JBHI.2016.2565561.
17. Edler D, Rosvall M. Source code for multilevel community detection with Infomap. 2017 Mar 23; http://www.mapequation.org/code.html.
18. Abdi H, Williams LJ. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics. 2010;2(4):433–459.
19. Bishop CM. Pattern Recognition and Machine Learning. Springer; 2006.
20. Verleysen M, François D. The curse of dimensionality in data mining and time series prediction. :758–770.
21. Indyk P, Motwani R. Approximate nearest neighbors: towards removing the curse of dimensionality. :604–613.
22. Levine JH, Simonds EF, Bendall SC, Davis KL, El-ad DA, Tadmor MD, Litvin O, Fienberg HG, Jager A, Zunder ER. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell. 2015;162(1):184–197. doi: 10.1016/j.cell.2015.05.047.
23. Xu C, Su Z. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics. 2015:btv088. doi: 10.1093/bioinformatics/btv088.
24. Fraley C, Raftery AE. Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association. 2002;97(458):611–631.
25. van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research. 2008;9:2579–2605.
26. Andersen RE, Arendt-Nielsen L, Madeleine P. A Review of Engineering Aspects of Vibroarthography of the Knee Joint. Critical Reviews in Physical and Rehabilitation Medicine. 2016;28(1–2).
27. Houle ME, Kriegel H-P, Kröger P, Schubert E, Zimek A. Can shared-neighbor distances defeat the curse of dimensionality? :482–500.
28. Moffat D, Ronan D, Reiss JD. An evaluation of audio feature extraction toolboxes.
29. Shi C, Cheng Y, Wang J, Wang Y, Mori K, Tamura S. Low-rank and sparse decomposition based shape model and probabilistic atlas for automatic pathological organ segmentation. Medical Image Analysis. 2017;38:30–49. doi: 10.1016/j.media.2017.02.008.
30. Knops ZF, Maintz JA, Viergever MA, Pluim JP. Normalized mutual information based registration using k-means clustering and shading correction. Medical Image Analysis. 2006;10(3):432–439. doi: 10.1016/j.media.2005.03.009.
31. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics. 1987;20:53–65.
32. Pelleg D, Moore AW. X-means: Extending K-means with Efficient Estimation of the Number of Clusters. :727–734.
33. Frey BJ, Dueck D. Clustering by passing messages between data points. Science. 2007;315(5814):972–976. doi: 10.1126/science.1136800.
34. Tenenbaum JB, De Silva V, Langford JC. A global geometric framework for nonlinear dimensionality reduction. Science. 2000;290(5500):2319–2323. doi: 10.1126/science.290.5500.2319.
35. Pudil P, Novovičová J, Kittler J. Floating search methods in feature selection. Pattern Recognition Letters. 1994;15(11):1119–1125.
