Skip to main content
. 2023 Jun 2;6:104. doi: 10.1038/s41746-023-00838-3

Fig. 6. Overview of the DeepBreath binary classification model.

Fig. 6

This binary classification architecture is trained for each of the four diagnostic classes. Top to bottom: a Data collection. Every patient has 8 lung audio recordings acquired at the indicated anatomical sites. b Pre-processing. A band-pass filter is applied to clips before transformation to log-mel spectograms which are batch-normalized and augmented and then fed into an (c) Audio classifier. Here, a CNN outputs both a segment-level prediction and attention values which are aggregated into a single clip-wise output for each site. These are then (d) Aggregated by concatenation to obtain a feature vector of size 8 for every patient, which is evaluated by a logistic regression. Finally (e) Patient-level classification is performed by thresholding to get a binary output. The segment-wise outputs of the audio classifier are extracted for further analysis. Note that the way the 5-second frames are created during training is not shown here (zero padding or random start).