Today’s smart healthcare systems generate text, radiological images, audio notes, video, and other types of multimedia healthcare data [1]. The COVID-19 pandemic has driven a further rise in the volume of healthcare data. The study of multimodal healthcare data on such a large scale has revealed both obstacles and opportunities. Artificial intelligence (AI) and, more specifically, deep learning (DL) algorithms have been widely used by researchers for handling massive amounts of epidemic data, predicting live epidemic crises, and initiating new research directions in the analysis of healthcare multimedia data [2]. As a result, deep learning for multimedia healthcare data analysis is becoming a hot topic in multimedia and computer vision research. The call for papers attracted 54 submissions and, after a rigorous review, 20 papers were accepted for this special issue. A brief summary of the papers in this special issue follows.
The paper titled “A Novel Study for Automatic Two-class Covid-19 Diagnosis (between Covid-19 and Healthy, Pneumonia) on X-ray Images using Texture Analysis and 2-D/3-D Convolutional Neural Networks” aims to diagnose COVID-19 early using X-ray images. Automatic two-class classification was carried out in four settings: COVID-19/Healthy, COVID-19 Pneumonia/Bacterial Pneumonia, COVID-19 Pneumonia/Viral Pneumonia, and COVID-19 Pneumonia/Other Pneumonia. Besides using the original X-ray images alone, the study obtained classification results on images produced with Local Binary Pattern (LBP) and Local Entropy (LE). The classification procedures were repeated for inputs that combined the original, LBP, and LE images in various combinations. MobileNetV2, ResNet101, and GoogLeNet architectures were used as 2-D CNNs. A 24-layer 3-D CNN architecture was also designed and used. The results indicate that diversifying X-ray images with texture analysis methods and including them in the CNN input provides significant improvements in the diagnosis of COVID-19. The 3-D CNN architecture also appears to be an important alternative for achieving high classification performance.
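To make the texture step concrete, the following is a minimal sketch of a basic 3x3 Local Binary Pattern in plain Python. It assumes a grayscale image stored as a list of rows; the function name `lbp_image`, the neighbour ordering, and the choice to skip border pixels are illustrative assumptions and not details taken from the paper.

```python
def lbp_image(img):
    """Basic 3x3 local binary pattern for a 2-D list of gray levels.

    Border pixels are skipped, so the output has shape (H-2) x (W-2).
    This is an illustrative sketch, not the configuration used in the paper.
    """
    # Eight neighbours, clockwise from the top-left corner (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(img), len(img[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out
```

The resulting LBP map can then be stacked with the original image (cropped to the same size) as an extra input channel, which is one plausible reading of combining the original, LBP, and LE images.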
The paper titled “A light-weight convolutional Neural Network Architecture for classification of COVID-19 chest X-Ray images” presents a lightweight convolutional neural network (CNN) model to classify COVID and Non-COVID patients by analyzing the hidden features in the X-ray images. The CNN model has only four 2D convolutional layers and one fully-connected layer. The model obtained 98.78%, 93.22%, and 92.7% accuracy in the training, validation, and testing phases, respectively. In addition, the model achieved an Area Under the Curve (AUC) score of 0.964. The model’s performance is also compared with four state-of-the-art pre-trained models (VGG16, InceptionV3, DenseNet121, and EfficientNetB6). The evaluation results demonstrate that the proposed CNN model is a candidate for an automatic diagnostic tool for the classification of COVID-19 patients using chest X-ray images. This research proposes a technique to classify COVID-19 patients and does not claim any medical diagnosis accuracy.
The widespread use of multimedia technology has made it possible to explore information in a variety of formats, including text, audio, video, and images. Computational approaches are being developed for a variety of objectives, including monitoring, security, business, and even health through automatic disease detection in medical images. Melanoma is one such disease: a type of skin cancer that kills many people all over the world. A number of strategies for detecting melanoma in dermoscopic images have been developed. For these approaches to be more effective, it is critical to isolate the lesion location. A melanoma segmentation method based on U-net and LinkNet deep learning networks, as well as transfer learning and fine-tuning techniques, was applied in the paper entitled “Automatic Segmentation of Melanoma Skin Cancer Using Transfer Learning and Fine-tuning”.
Coronavirus is one of the most serious threats and challenges facing existing healthcare systems. In “Self-assessment and deep learning-based coronavirus detection and medical diagnosis systems for healthcare,” the authors propose a deep learning-based medical image classification model for COVID-19 patients. The proposed model improves medical image classification and optimization for better disease diagnosis. The authors also propose a mobile application that detects COVID-19 patients using a self-assessment test combined with medical expertise, and that supports diagnosis and prevention of the virus through an online system. The manuscript entitled “Applying Deep Learning Based Multi-Modal for Detection of Coronavirus” proposes a deep learning model to find similarities between SARS-CoV-2 and other prevalent virus genomes. According to the authors, the method helps medical professionals quickly classify infected genomes and is useful in finding the most effective drug from the available batch of drugs for the treatment of COVID-19.
The authors of “Deep learning based Meta-Classifier Approach for COVID-19 Classification using CT scan and Chest X-ray Images” propose a meta-classifier approach to screen for COVID-19 using two input modalities. The paper presents a detailed investigation and analysis of the proposed method, which was tested on CT and chest X-ray (CXR) datasets and compared with several existing CNN-based pretrained methods. The proposed method outperformed the existing methods and can be utilized by healthcare providers for point-of-care diagnostics.
The manuscript entitled “Deep learning on digital mammography for expert-level diagnosis accuracy in breast cancer detection,” proposes a breast cancer detection system using deep learning. The aim was to achieve the expert-level diagnosis accuracy by the system using the mammogram. The system demonstrated a state-of-the-art DL classification model trained on mammograms using only image-grade pathology labels. Experimental results showed that the model could locate lesions on mammograms, although this information was not provided during the training phase. Finally, the influence of input image resolution and different DL model structures on the diagnostic accuracy is given and analyzed.
The authors of “Prediction Model using SMOTE, Genetic Algorithm and Decision Tree (PMSGD) for Classification of Diabetes Mellitus” suggest a prediction model for diabetes disease classification that incorporates synthetic minority oversampling technique (SMOTE), genetic algorithm, and decision tree. SMOTE is utilized to handle the problem of class imbalance in the proposed system, followed by a genetic algorithm to choose the essential features and finally, a decision tree to train a diabetic classification model. To assess the performance of the classification system, the authors employed classification accuracy and error.
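The oversampling step can be illustrated with a short sketch of the core SMOTE idea: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the neighbour count `k`, and the fixed seed are assumptions for illustration; the paper's actual settings are not stated here, and the genetic algorithm and decision tree stages are not shown.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation.

    minority: list of feature vectors (lists of floats), at least two samples.
    k and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_sketch(minority_class, n_new=3):
        print(point)
```

Because every synthetic point lies between two real minority samples, the new points stay inside the region already occupied by the minority class.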
The manuscript entitled “Fusion of AI Techniques to Tackle COVID-19 Pandemic: Models, Incidence Rates, and Future Trends” provides a detailed end-to-end review of AI strategies that can be utilized to combat the pandemic in all fields. In addition, the authors comprehensively address COVID-19 concerns and, based on the literature analysis, propose viable countermeasures using AI approaches. Finally, they look at some of the open research questions and problems related to incorporating AI approaches into the COVID-19 response.
The paper titled “Multimodal biometric authentication approach Based on a deep fusion of Electrocardiogram (ECG) and finger vein” presents a multimodal authentication system based on deep fusion of ECG and finger vein features that is more accurate than prior multimodal authentication systems. Biometric preprocessing, deep feature extraction, and authentication are the three key components of the proposed system. The suggested method has applications in a variety of disciplines, including criminal investigation, logical and physical access control, and surveillance. The algorithm achieves equal error rates of 0.12 percent and 1.40 percent using feature fusion and score fusion, respectively, which demonstrates its strong authentication performance. The method was evaluated on two finger vein databases (TW finger vein and VeinPolyU finger vein) as well as two ECG datasets (MWM-HIT and ECG-ID). Extensive experiments were run on these databases, and the authentication performance was compared to state-of-the-art approaches.
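For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by scanning candidate thresholds over similarity scores; the function name and the convention that higher scores mean a more likely genuine match are assumptions, and the paper's own evaluation protocol may differ.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores (higher = more likely genuine).

    Scans every observed score as a threshold and returns the mean of
    FAR and FRR at the threshold where the two rates are closest.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # A claim is accepted when its score is at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With well-separated genuine and impostor scores the function returns 0.0; overlapping score distributions push the value up.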
The paper titled “Deep Learning and Evolutionary Intelligence with Fusion Based Feature Extraction for Detection of COVID-19 from Chest X-Ray Images” uses chest X-ray images to provide a new metaheuristic-based fusion model for COVID-19 diagnosis. Pre-processing, feature extraction, and classification are all part of the proposed model. First, images are preprocessed using the Wiener filtering (WF) approach. The gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRM), and local binary patterns (LBP) are then incorporated into the fusion-based feature extraction procedure. The best feature subset is then chosen using the salp swarm algorithm (SSA). Finally, an artificial neural network (ANN) is used to classify infected and healthy patients. The performance of the proposed model was evaluated using the Chest X-ray image dataset, and the findings were studied from several angles. The results confirmed that the proposed model outperformed existing techniques.
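As an illustration of one of the texture descriptors named above, the following sketch builds a normalised gray level co-occurrence matrix for a single pixel offset and derives the standard contrast feature from it. The function names, the default horizontal offset, and the non-symmetric counting are assumptions for illustration, not the paper's configuration.

```python
def glcm(img, levels, dr=0, dc=1):
    """Normalised co-occurrence matrix for the pixel offset (dr, dc).

    img: 2-D list of integer gray levels in range(levels).
    Entry [i][j] is the fraction of pixel pairs where a pixel with
    level i has a neighbour with level j at the given offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(img), len(img[0])
    total = 0
    for r in range(height):
        for c in range(width):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[img[r][c]][img[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def glcm_contrast(p):
    """Contrast feature: sum of (i - j)^2 weighted by co-occurrence probability."""
    size = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(size) for j in range(size))
```

Features such as contrast, computed for several offsets, would then be concatenated with the GLRM and LBP features before feature selection.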
The paper entitled “An explainable stacked ensemble of deep learning models for improved melanoma skin cancer detection” explains how to use transfer learning to build a CNN-based deep learning stacked ensemble framework for detecting melanoma skin cancer. The study does this by combining predictions from several CNN sub-models and feeding them to a meta-learner to predict malignant melanoma moles. A publicly available image dataset is used to train the model. The study employed three different CNN models, namely Xception, DenseNet121, and EfficientNetB0, with fine-tuned weights. The best stacked ensemble model classifies melanoma images with 95.76% accuracy and 0.978 AUC, with good sensitivity and specificity. Despite their high accuracy, deep CNN models are not widely used in clinical contexts due to their lack of proper explainability. Hence, using Shapley additive explanations (SHAP), a heatmap explainability approach was developed to show the most suggestive spots on melanoma images. Experimental findings reveal that the model has good explainability qualities, efficiently identifying numerous benign and malignant melanoma symptoms.
The article titled “AI-Inspired Spatial Feature Extraction and Selection Method for Emotion Classification Using Electroencephalogram Signals” proposes a fast and effective method for classifying human emotions from electroencephalogram (EEG) signals. The nonlinear EEG signal is first decomposed using the proposed multivariate empirical mode decomposition (iMEMD). An efficient method of spatial feature selection and extraction with a short processing time is also presented. The spatiotemporal analysis of EEG signals was performed with the Complex Continuous Wavelet Transform (CCWT) to collect all the information in the time and frequency domains. The multi-model feature extraction method uses three deep neural networks (DNNs) to extract features and analyzes them together to create a combined feature vector. Differential entropy and mutual information methods are also proposed to select good-quality channels and features, respectively, which reduces the computational overhead. The proposed methodology achieved good classification performance in a short processing time compared to the available state-of-the-art models of emotion classification.
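One common way to use differential entropy for EEG channel ranking is sketched below. It assumes each channel's samples are approximately Gaussian, in which case the differential entropy is 0.5 * ln(2 * pi * e * variance), and it assumes that higher entropy marks a more informative channel. Both assumptions, along with the function names and the example channel labels, are illustrative; the paper's exact selection criterion is not described in this summary.

```python
import math
import statistics


def differential_entropy(signal):
    """Differential entropy in nats under a Gaussian assumption.

    Raises ValueError for a constant signal, since its variance is zero.
    """
    variance = statistics.pvariance(signal)
    return 0.5 * math.log(2 * math.pi * math.e * variance)


def top_channels(channels, n):
    """Return the names of the n channels with the highest differential entropy.

    channels: dict mapping a channel name to its list of samples.
    Ranking by highest entropy is an assumed criterion for illustration.
    """
    ranked = sorted(
        channels,
        key=lambda name: differential_entropy(channels[name]),
        reverse=True,
    )
    return ranked[:n]
```

Keeping only the top-ranked channels shrinks the input to the later feature extraction stages, which is where the computational saving comes from.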
In the paper titled “Classification of COVID-19 Individuals using Adaptive Neuro-Fuzzy Inference System,” the authors use machine learning-based techniques for the detection of COVID-19. Using an Adaptive Neuro-Fuzzy Inference System (ANFIS), the proposed predictive methodology aims to identify the characteristics that aid in the early detection of Coronavirus Disease (COVID-19). The study also compares the accuracy of various machine learning classifiers and chooses the best one. To anticipate the risk factor of this globally spread disease, the authors applied ANFIS, an intelligent neuro-fuzzy technique for modelling and control of ill-defined and uncertain systems. The linear Support Vector Machine (SVM) is used to classify the COVID-19 dataset since it has the best accuracy of all the classifiers, at 100%. In addition, the ANFIS-Sugeno model was applied to this classified dataset, resulting in a COVID-19 risk prediction of 80%.
The study “DL-CNN-based approach with image processing techniques for diagnosis of retinal diseases” presents a diagnostic tool based on a deep-learning framework for four-class classification of ocular diseases by automatically detecting diabetic macular edema, drusen, choroidal neovascularization, and normal images in optical coherence tomography (OCT) scans of the human eye. The proposed framework analyzes OCT images of the retina using three different CNN models (five, seven, and nine layers) to identify the various retinal layers, extract useful information, observe any new deviations, and predict multiple eye deformities. Experimental results confirm that the model performed very well, with 0.965 classification accuracy, 0.960 sensitivity, and 0.986 specificity compared with the manual ophthalmological diagnosis.
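The three reported figures follow from confusion-matrix counts. A minimal sketch is given below for one class treated as positive against all others (one-vs-rest), which is a common way to report sensitivity and specificity in multi-class problems; whether the paper averages per-class values this way is an assumption, and the function name and label strings are hypothetical.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity, and specificity with one class treated as positive.

    Assumes at least one positive and one negative example in y_true,
    otherwise the sensitivity or specificity denominator would be zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }
```

Calling the function once per class and averaging the results gives macro-averaged sensitivity and specificity.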
Clinician decisions are becoming increasingly evidence-based, making big data analytics in healthcare particularly promising. Big data analytics has revolutionized the healthcare industry and promises a world of opportunity because of the sheer magnitude and availability of data. It promises early diagnosis, prediction, and preventive capabilities, as well as help in improving quality of life. Researchers and clinicians are striving to remove the barriers that keep big data from having a positive impact on health. A variety of technologies and procedures are employed to analyze, process, accumulate, assimilate, and manage massive amounts of structured and unstructured healthcare data. The paper entitled “Leveraging Big Data Analytics Towards Enhancement of Healthcare: Trends, Challenges and Opportunities” discusses the barriers to applying big data analytics in healthcare and the associated challenges.
Myocardial Infarction (MI) is a sudden cessation of blood flow to the heart, resulting in a blood shortage and ischemia, which damages the heart muscle and causes cells to die and lose function. Despite the low global incidence of MI, it is nonetheless a common cause of mortality. As a result, spotting MI symptoms early can help to prevent mortality. The paper “Myocardial Infarction Detection Based on Deep Neural Network on Imbalanced Data” describes a deep CNN-based approach for automatically detecting MI. The suggested CNN is an end-to-end model that automatically extracts and categorizes features. The interesting part of this paper is that it deals with imbalanced data by employing the focal loss function with deep learning. In the paper “A Novel Method for ECG Signal Classification via One-Dimensional Convolutional Neural Network,” the authors present a method with a novel segmentation strategy that uses one-dimensional convolutional neural networks (1D-CNN) to classify ECG signals. The major contributions are data segmentation with three consecutive QRS complexes (R-R-R segmentation) and a network design for this specific problem. The experimental results show that the proposed method achieves better performance than the state-of-the-art methods.
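The focal loss mentioned above down-weights examples the model already classifies confidently, so that rare, hard examples dominate the gradient. In its usual binary form it is FL(p_t) = -alpha_t * (1 - p_t)^gamma * ln(p_t), where p_t is the predicted probability of the true class. The sketch below implements that formula for a single example; the defaults gamma = 2 and alpha = 0.25 are the values commonly quoted for focal loss in general and are assumptions here, not settings reported for the MI paper.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p: predicted probability of the positive class, strictly between 0 and 1.
    y: true label, 1 for positive and 0 for negative.
    gamma, alpha: assumed defaults for illustration only.
    """
    p_t = p if y == 1 else 1.0 - p               # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction contributes far less than an uncertain one.
    print(focal_loss(0.95, 1), focal_loss(0.55, 1))
```

Setting gamma to 0 removes the modulating factor and reduces the expression to an alpha-weighted cross-entropy.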
Medical image fusion is the process of combining visual information from any number of medical imaging inputs into a single fused image without loss of detail or added distortion. It improves the clinical usability of medical imaging for the treatment and evaluation of medical disorders by keeping certain features in the image. Integrating the complementary pathological features of the sources into a single image is a significant difficulty in medical image processing. The presence of fusion artifacts, the hardness of the base, the comparison of medical image input, and computing cost are all obstacles that must be overcome with the fused image. Hybrid multimodal medical image fusion (HMMIF) approaches were created for pathologic research such as neurocysticercosis, degenerative disorders, and neoplastic diseases. The paper entitled “Intelligent Multimodal Medical Image Fusion with Deep Guided Filtering” proposes a hybrid multimodal medical image fusion technique to improve image quality, which is highly desirable in the medical field. The study created two domain algorithms based on HMMIF approaches for MRI-SPECT, MRI-PET, and MRI-CT applications. The suggested method begins by decomposing the input images into low- and high-frequency components using the non-subsampled contourlet transform (NSCT). Experiments show that the proposed methods are preferable in terms of both qualitative and quantitative evaluation. The images merged by the proposed algorithms provide important information for visualizing and understanding diseases, drawing on the strengths of both source modalities.
In the article titled “Floor of log: a novel intelligent algorithm for 3D lung segmentation in computer tomography images”, the authors propose a high-performance approach for 3D lung segmentation called Floor of Log (FoL). FoL is presented as a new clustering algorithm that can be trained to achieve better results when segmenting lungs in CT images of the chest region. The most important impact of the proposed model lies in optimizing the process of segmenting the pulmonary region of the CT scan using just the logarithm function and the ‘floor’ operator, which leaves the lung almost segmented. The reported lung segmentation results are Se 83.62%, Acc 99.62%, MCC 83.08%, HD 3.51%, DICE 83.63%, and Jaccard 99.73%. In addition, FoL reinforces the current Hounsfield-based techniques used by specialists and can be used as an important tool in CAD systems to help in the diagnosis of diseases.
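For reference, the two overlap metrics named in the results can be computed from a predicted mask and a ground-truth mask as sketched below. Masks are represented as sets of voxel coordinates, which is a representation chosen for illustration, and the function name is hypothetical. For a single pair of masks the two scores are tied by the identity Jaccard = Dice / (2 - Dice).

```python
def dice_and_jaccard(pred, truth):
    """Dice coefficient and Jaccard index for two segmentation masks.

    pred, truth: iterables of voxel coordinates, e.g. (z, y, x) tuples,
    listing the voxels labelled as lung. At least one mask must be non-empty.
    """
    pred, truth = set(pred), set(truth)
    intersection = len(pred & truth)
    dice = 2 * intersection / (len(pred) + len(truth))
    jaccard = intersection / len(pred | truth)
    return dice, jaccard
```

Both scores range from 0 for disjoint masks to 1 for identical masks.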
In closing, the guest editors would like to thank all the authors who contributed to this SI, and the reviewers for respecting deadlines and providing constructive reviews. We are also grateful to the Editor-in-Chief, Changsheng Xu, for his support, and to the Multimedia Systems publication staff, who collaborated with us at every step. We hope this SI will inspire further research and development ideas for Deep Learning for Multimedia Healthcare.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
M. Shamim Hossain, Email: mshossain@ksu.edu.sa.
Josu Bilbao, Email: JBilbao@ikerlan.es.
Diana P. Tobón, Email: dtobon@udemedellin.edu.co.
Ghulam Muhammad, Email: ghulam@ksu.edu.sa.
Abdulmotaleb El Saddik, Email: elsaddik@uottawa.ca.
References
- 1. Hossain MS, Muhammad G, Alamri A. Smart healthcare monitoring: a voice pathology detection paradigm for smart cities. Multimed. Syst. 2019;25:565–575. doi: 10.1007/s00530-017-0561-x.
- 2. Hossain MS, Muhammad G, Guizani N. Explainable AI and mass surveillance system-based healthcare framework to combat COVID-19 like pandemics. IEEE Network. 2020;34(4):126–132. doi: 10.1109/MNET.011.2000458.