Machine Learning Techniques for Antimicrobial Resistance Prediction of Pseudomonas Aeruginosa from Whole Genome Sequence Data

Sohail M Noman; Muhammad Zeeshan; Jehangir Arshad; Melkamu Deressa Amentie; Muhammad Shafiq; Yumeng Yuan; Mi Zeng; Xin Li; Qingdong Xie; Xiaoyang Jiao

doi:10.1155/2023/5236168

2023 Mar 1;2023:5236168.

PMCID: PMC9995192 PMID: 36909968

Abstract

Aim

Due to the growing availability of genomic datasets, machine learning models have shown impressive diagnostic potential in identifying emerging and reemerging pathogens. This study aims to use machine learning techniques to develop and compare a model for predicting bacterial resistance to a panel of 12 classes of antibiotics using whole genome sequence (WGS) data of Pseudomonas aeruginosa.

Method

Two machine learning classifiers, random forest (RF) and BioWeka, were used to assess classification accuracy, and logistic regression (LR) was used for statistical analysis.

Results

Our results show that 44.66% of isolates were resistant to the twelve antimicrobial agents and 55.33% were sensitive. The mean classification accuracy was ≥98% for BioWeka and ≥96% for RF across these families of antimicrobials. With 10-fold cross-validation, the BioWeka and RF accuracies were, respectively, 99.31% and 94.00% for ampicillin, 99.02% and 95.21% for amoxicillin, 98.27% and 96.63% for meropenem, 99.73% and 98.34% for cefepime, 96.44% and 99.23% for fosfomycin, 98.63% and 94.31% for ceftazidime, 98.71% and 96.00% for chloramphenicol, 95.76% and 97.63% for erythromycin, 99.27% and 98.25% for tetracycline, 98.00% and 97.30% for gentamycin, 99.57% and 98.03% for butirosin, and 96.17% and 98.97% for ciprofloxacin. In addition, for eight of the twelve drugs, no false-positive or false-negative bacterial strains were found.

Conclusion

The ability to accurately detect antibiotic resistance could help clinicians make educated decisions about empiric therapy based on the local antibiotic resistance pattern. Moreover, there may be major consequences for infection prevention and human health if such prescribing practices become widespread.

1. Introduction

Antimicrobial resistance (AMR) is one of the leading public health concerns of the 21st century and hinders the ability to effectively treat and prevent a wide variety of bacterial, viral, and fungal infections [1]. AMR occurs when microorganisms (bacteria, viruses, fungi, and parasites) evolve and lose their sensitivity to existing treatments, making infections more challenging to treat and raising the risk of disease transmission, severe illness, and death [2]. The rapid global spread of multi- and pan-resistant bacteria, also known as “superbugs,” is particularly concerning because these bacteria cause infections that cannot be treated with current antimicrobial medicines such as antibiotics [3]. At least 1.27 million people died from AMR-related cases in 2019, according to the CDC (https://www.cdc.gov/drugresistance/biggest-threats.html). Over 2.8 million people in the United States contract an antimicrobial-resistant infection each year, and over 35,000 people die as a direct result [4]. The most common multidrug-resistant bacteria globally are Escherichia coli, Enterococcus faecium, Streptococcus, Klebsiella, and Pseudomonas aeruginosa, and they are responsible for an estimated 250,000 annual infections and deaths [5]. For instance, the WHO priority pathogen list calls for new antibacterials to treat infections caused by Pseudomonas aeruginosa and carbapenem-resistant Enterobacteriaceae (CRE) [6]. There are currently 32 antibiotics in clinical development that target WHO priority pathogens, but only six of them can be considered truly innovative [7].

Various researchers have discussed the prediction of antimicrobial resistance [8]. The lack of treatment options for resistant infections often requires the use of broad-spectrum antibiotics, which may be less effective or less safe. Resistance also affects empirical treatment, in which a clinician chooses an antibiotic for an infection without obtaining microbiological results. This can lead to an underestimation of the risk associated with specific infections and to the use of inappropriate antibiotics. A meta-analysis found that patients infected with resistant Enterobacteriaceae are five times more likely to experience a delay in receiving effective therapy than patients infected by a susceptible strain [9, 10]. This may reduce the long-term effectiveness of antibiotics, delay access to effective treatments, increase treatment failure with complications, and increase fatality rates. Infections caused by resistant Gram-positive and Gram-negative bacteria increase hospital stays, surgery needs, and mortality [11].

Another study, by Yamani et al., calculated the health burden of antibiotic-resistant bacteria (ARB) in European Union/European Economic Area (EU/EEA) countries in disability-adjusted life-years [12]. Their models were populated with estimated incidence from the European Antimicrobial Resistance Surveillance Network (EARS-Net) and the European Centre for Disease Prevention and Control (ECDC) point prevalence surveys of healthcare-associated infections and antimicrobial use in European acute care hospitals [13, 14]. Systematic reviews of published literature provided the attributable case fatality and length of stay for antibiotic-resistant infections [15, 16]. In 2014, 671,689 infections occurred in EU/EEA countries [13]. This figure increased globally between 2015 and 2022 [5, 10, 12]. Different ARB contribute variably to the global burden, so prevention and control strategies should be tailored to each country's needs. All countries must implement effective AMR strategies to combat antibiotic overuse and misuse [17]. All systemic antibiotics globally require a doctor's prescription. Most prescriptions are written in primary care rather than in secondary or tertiary care [6].

In 2018, 74% of all antibiotics prescribed by the National Health Service (NHS) in England were for general practitioner (GP) patients [18]. GPs are the most frequent antibiotic prescribers, so the literature focuses on primary care. Nurse practitioners and community pharmacists also play a key role. In the last 10 years, nurses' roles have expanded to include prescribing in many countries, and nurse prescribing is on the policy agenda in many more [19]. Nurse prescribing was introduced to better utilize the skills and knowledge of health professionals, improve medication access, and reduce the workload of doctors. In China, the number of nurses qualified to prescribe has steadily risen over the last 5 years, and 31,000 nurses now have the same prescribing ability as doctors [20]. Pharmacists in China can register as independent prescribers, often specializing in diabetes prescriptions. More pharmacists work in secondary care than in primary care. Lastly, dentists are also considered antibiotic prescribers, although they write fewer prescriptions than general practitioners. Further, most antibiotic prescriptions are for respiratory, urinary, skin, or tooth infections [21]. In addition, most antibiotics are given for acute respiratory tract infections (RTIs) [13]. Some RTIs, such as community-acquired bacterial pneumonia, are treatable with antibiotics, but most acute RTIs are viral and self-limiting.

P. aeruginosa has high baseline antibiotic resistance and can acquire new resistance mechanisms through chromosomal mutations or horizontal gene transfer (HGT), increasing the risk of ineffective antibiotic treatment [22]. Mutations can cause therapeutic failure during treatment, while resistance increases mortality, hospital stays, and costs. When microorganisms become resistant to antimicrobials, standard treatments are often ineffective. Disc diffusion and minimum inhibitory concentration (MIC) determination are the most common antimicrobial susceptibility tests [23]. Identification of resistance-specific markers by PCR or microarray hybridization is useful for epidemiological purposes and the validation of phenotypic results. As DNA sequencing throughput increases and costs fall, whole-genome sequencing (WGS) becomes a viable option for routine resistance profile surveillance and for identifying emerging resistances [24]. Pathogenic P. aeruginosa alters its genome sequence and protein expression to resist antibiotics. Resistance disrupts biochemical pathways and protein channels [25]. Antibiotic resistance and susceptibility must be linked to specific resistance genes; all resistance genes in an isolate are combined to predict susceptibility [26]. ResFinder, CARD, and Resfams predict phenotypes from genotypes [27]. Increasingly, computational tools such as machine learning algorithms are used to build models correlating genomic variations with phenotypes [28]. Every supervised learning example contains both an input and a desired output. The algorithm succeeds only if it learns a model that faithfully maps any input to the desired output.

Considering the above, the fundamental objective of this study was to develop an accurate phenotype prediction model against antimicrobials. For this purpose, the machine learning approaches bio-Weka [29], random forest (RF), and logistic regression (LR) [30–32] were used on the data mining platform Weka (v3.9.2), an open-source Java-based software [33–35], to obtain classification accuracy estimates and accurately predict the phenotypes against a panel of twelve antimicrobial agents, including ampicillin, amoxicillin, meropenem, cefepime, fosfomycin, ceftazidime, chloramphenicol, erythromycin, tetracycline, gentamycin, butirosin, and ciprofloxacin, from whole genome sequence data of P. aeruginosa. Significantly, this study can further enhance antimicrobial resistance prediction for various bacterial agents in clinical trials.

2. Methods

2.1. Data Collection

The WGS reads of Pseudomonas aeruginosa and the binary resistance phenotypes for the antimicrobial agents used in this study were obtained via accession numbers provided in various studies covering China and 65 other countries (developed and developing). They were downloaded from GenBank at NCBI (https://www.ncbi.nlm.nih.gov/genbank/), the open-access NIH genetic sequence database. All the descriptive information about the raw data is present in the Supplementary file. The metadata consists of various attributes, including genome name, NCBI taxon id, genome status, associated strains, GenBank accession numbers, country name, number of contigs, genome lengths, isolation sources, resistance genes, twelve antibiotics, and many more.

2.2. Model Framework and Parameters

In this study, antimicrobial resistance of P. aeruginosa was predicted using a data mining assessment framework by machine learning algorithms, as shown in Figure 1. There were a total of six stages involved in reaching these conclusions, including the following: objective; data collection and preparation; machine learning techniques on a data mining platform; model building; evaluation and assessment; and implications. Initially, we collected the data and did some preliminary preprocessing to pick the right attributes. Afterward, this data was used for analysis and assessment. Secondly, Weka (v3.9.2), “a java-based machine learning and data mining platform,” was used to measure and evaluate classifications with the most recent bio-Weka and RF plugins. In addition, the results of machine learning classifiers were used in logistic regression (LR) to evaluate the resistance phenotype assessment to twelve different antibiotic drugs, namely, ampicillin, amoxicillin, meropenem, cefepime, fosfomycin, ceftazidime, chloramphenicol, erythromycin, tetracycline, gentamycin, butirosin, and ciprofloxacin.

Figure 1. The data mining assessment framework used in this study.

Furthermore, the data were divided into a training set and a testing set in a 60:40 ratio. Overfitting was limited by using 10-fold cross-validation, and the training data were used as efficiently as possible to determine the optimal hyperparameter settings. The training model's evaluation results were based on an average of the hyperparameter values that fared best in the 10-fold cross-validation procedure. Sensitivity, specificity, accuracy, and precision were used to assess the model performance of bio-Weka and RF by equations (1)–(4). The true positives (TP) were the resistant strains correctly predicted as resistant, the true negatives (TN) were the sensitive strains correctly predicted as sensitive, the false positives (FP) were the strains predicted as resistant when they were actually sensitive, and the false negatives (FN) were the strains predicted as sensitive when they were actually resistant [36].

\begin{matrix} \text{Sensitivity} = \frac{TP}{TP + FN}, \end{matrix}

(1)

\begin{matrix} \text{Specificity} = \frac{TN}{TN + FP}, \end{matrix}

(2)

\begin{matrix} \text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP}, \end{matrix}

(3)

\begin{matrix} \text{Precision} = \frac{TP}{TP + FP}. \end{matrix}

(4)
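
As an illustration of equations (1)–(4), the following minimal Python sketch computes the four metrics from confusion-matrix counts. It is not the Weka implementation used in the study; the class name `ConfusionCounts`, the label strings "R"/"S", and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts, with 'resistant' as the positive class."""
    tp: int  # resistant strains predicted resistant
    tn: int  # sensitive strains predicted sensitive
    fp: int  # sensitive strains predicted resistant
    fn: int  # resistant strains predicted sensitive


def count_outcomes(y_true, y_pred, positive="R"):
    """Tally TP/TN/FP/FN from paired true and predicted labels."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, tn, fp, fn)


def sensitivity(c):  # equation (1)
    return c.tp / (c.tp + c.fn)


def specificity(c):  # equation (2)
    return c.tn / (c.tn + c.fp)


def accuracy(c):  # equation (3)
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def precision(c):  # equation (4)
    return c.tp / (c.tp + c.fp)


# Hypothetical example labels (not data from the study).
truth = ["R", "R", "R", "S", "S", "S", "S", "R"]
preds = ["R", "R", "S", "S", "S", "R", "S", "R"]
counts = count_outcomes(truth, preds)
print(counts, sensitivity(counts), specificity(counts),
      accuracy(counts), precision(counts))
```

Note that each ratio is undefined when its denominator is zero (for example, precision when no strain is predicted resistant); a production implementation would need to handle that case explicitly.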

2.3. BioWeka and Random Forest Prediction of Phenotypes Resistance

Weka stores and uses datasets in its own file format, the attribute-relation file format (ARFF). Because biological data come in a wide variety of file types, bio-Weka implements a format-conversion input layer that can transform common file types into ARFF. Weka filters are classes that can be applied to a dataset to alter it, and bio-Weka provides filters for working with biological sequences. This enabled us to compare and match sequences with BLAST and other sequence alignment tools. In addition, alignment-based classification was performed using automatic alignment score evaluation schemes.
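
To make the ARFF format concrete, the sketch below serializes a small binary feature table with a resistant/sensitive class attribute into ARFF text. It is only an illustration of the file layout (`@relation`, `@attribute`, `@data`); the function name `to_arff`, the attribute names, and the example rows are hypothetical and are not taken from the study's pipeline or from bio-Weka's converters.

```python
def to_arff(relation, feature_names, rows, class_values=("R", "S")):
    """Serialize binary features plus a nominal class label as ARFF text.

    rows is a list of (feature_values, label) pairs. Attribute names are
    assumed to be simple identifiers that need no quoting.
    """
    lines = [f"@relation {relation}", ""]
    for name in feature_names:
        # Each feature is declared as a nominal attribute with values 0 and 1.
        lines.append(f"@attribute {name} {{0,1}}")
    lines.append("@attribute phenotype {" + ",".join(class_values) + "}")
    lines += ["", "@data"]
    for features, label in rows:
        if len(features) != len(feature_names):
            raise ValueError("row length does not match the attribute list")
        lines.append(",".join(str(v) for v in features) + f",{label}")
    return "\n".join(lines) + "\n"


# Hypothetical toy dataset: two presence/absence features, two isolates.
print(to_arff("amr_demo", ["kmer_1", "kmer_2"],
              [([1, 0], "R"), ([0, 1], "S")]))
```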

The Java-based machine learning tools bio-Weka and RF were used to perform the predictive modeling. DSK (k-mer counting software) [37, 38] was used to generate k-mer profiles (abundance profiles of all unique words of length k in each genome) from the assembled contigs, with k = 31, a common length for analyzing bacterial genomes [39]. To create the dataset, the 31-mer profiles of all strains were combined using the combine kmers tool in SEER [40]. The combined 31-mer counts were converted into presence/absence matrices to be used for model training and prediction. 10-fold cross-validation was used to select the best conjunctive and/or disjunctive model with a maximum of ten rules for binary classification analysis (using S/NS phenotypes based on the two different breakpoints for each drug) [41, 42]. This involved testing the suggested broad range of values for the trade-off hyperparameter to determine the optimal rule scoring function (https://aldro61.github.io/kover/doclearning.html). In addition, classification (BW-mC) and regression (BW-R) models were constructed from log2(MIC) data in bio-Weka and RF to compare the performance of binary classifiers with MIC prediction [29, 43].
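
The conversion from contigs to a k-mer presence/absence matrix can be sketched as follows. This is a simplified pure-Python stand-in, not DSK or SEER: it does not canonicalize reverse complements, it holds everything in memory, and the genome names, sequences, and the small k used in the example are hypothetical (the study used k = 31).

```python
def kmers(sequence, k):
    """Return the set of all distinct substrings of length k."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence(genomes, k):
    """Build a presence/absence matrix over the union of all k-mers.

    genomes maps a genome name to its list of contig sequences.
    Returns (vocabulary, matrix), where matrix[name][j] is 1 if
    vocabulary[j] occurs in any contig of that genome, else 0.
    """
    profiles = {}
    for name, contigs in genomes.items():
        found = set()
        for contig in contigs:
            found |= kmers(contig, k)
        profiles[name] = found
    vocabulary = sorted(set().union(*profiles.values()))
    matrix = {
        name: [int(kmer in found) for kmer in vocabulary]
        for name, found in profiles.items()
    }
    return vocabulary, matrix


# Toy example with k = 4 for readability.
toy = {"isolate_a": ["ACGTAC"], "isolate_b": ["CGTACG"]}
vocab, mat = presence_absence(toy, k=4)
print(vocab)
print(mat)
```

Each row of the resulting matrix is one isolate and each column one k-mer, which is the shape a classifier needs for training on binary features.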

Furthermore, the RF method uses a majority voting strategy (MVS) to classify samples based on the results of an ensemble of decision trees (DTs) [44]. In other words, the RF prediction is the class indicated by the majority of the DTs. Having a diverse ensemble of trees is essential for improving RF performance relative to a single DT. One way to achieve this is to use bootstrapping with replacement to generate the training set for each DT. In addition, the features considered for splitting each node are not chosen from the full feature set but from a random subset of features [45]. Note that RF is closer to a black box model that is hard to interpret. In RF, as in individual DTs, the CART algorithm is used.
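
The three ingredients described above (bootstrap sampling with replacement, a random feature subset, and majority voting) can be illustrated with a deliberately simplified sketch. Each "tree" here is a depth-one stump on binary features rather than a full CART tree, so this is not the Weka RF implementation; all function names, parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, feature_ids):
    """Pick the candidate feature whose 0/1 split makes the fewest errors."""
    best = None
    for f in feature_ids:
        rule = {}
        for value in (0, 1):
            subset = [label for row, label in zip(X, y) if row[f] == value]
            rule[value] = majority(subset) if subset else majority(y)
        errors = sum(rule[row[f]] != label for row, label in zip(X, y))
        if best is None or errors < best[0]:
            best = (errors, f, rule)
    return best[1], best[2]


def fit_forest(X, y, n_trees=25, n_sub_features=2, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n_rows) for _ in range(n_rows)]  # with replacement
        features = rng.sample(range(n_features), n_sub_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict(forest, row):
    """Majority vote over the predictions of all stumps."""
    return majority([rule[row[f]] for f, rule in forest])


# Hypothetical toy data: three presence/absence features per isolate.
X = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
y = ["R", "R", "S", "S", "R", "S"]
forest = fit_forest(X, y)
print(predict(forest, [1, 1, 1]))
```

A real random forest grows deeper trees and redraws the feature subset at every node, but the voting step is the same as in `predict`.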

Multiple metrics were used to evaluate the model's efficacy, including sensitivity, specificity, accuracy, precision, and the overall balanced accuracy (bACC, the average of the sensitivity and specificity) [46]. Since the bACC weights false positive and false negative rates equally, regardless of the imbalance in the dataset, it was chosen as the overall measure of model performance. Two measures of MIC prediction accuracy were evaluated: firstly, the proportion of isolates for which the predicted MIC was identical to the phenotypic MIC (rounded to the nearest doubling dilution in the case of regression), and secondly, the proportion of isolates for which the predicted MIC was within one doubling dilution of the phenotypic MIC (1-tier accuracy). The criteria for exact match rates and 1-tier accuracies were relaxed to include predictions within 0.5 doubling dilutions or 1.5 doubling dilutions of the phenotypic MIC, respectively, to account for MIC variation [47]. Each analysis had 10 replicates, and the mean and 95% confidence intervals were calculated for all metrics. Mean bACC was compared between replicate sets using two-tailed unpaired t-tests with logistic regression (LR) correction for unequal variance (α = 0.05) to assess differential model performance across datasets or methods. In addition, P values were calculated using the results of these unpaired t-tests.
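
The bACC and the two MIC accuracy measures can be sketched as below, assuming MIC values in mg/L that are compared on a log2 (doubling dilution) scale. The tolerances 0.5 and 1.5 follow the relaxed criteria described above; the function names and example values are hypothetical.

```python
import math


def balanced_accuracy(tp, tn, fp, fn):
    """Average of sensitivity and specificity."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def mic_accuracy(predicted, observed, tolerance):
    """Fraction of isolates whose predicted MIC lies within `tolerance`
    doubling dilutions (log2 units) of the phenotypic MIC."""
    hits = sum(
        abs(math.log2(p) - math.log2(o)) <= tolerance
        for p, o in zip(predicted, observed)
    )
    return hits / len(observed)


# Hypothetical MIC values in mg/L (not data from the study).
observed = [1, 2, 4, 8]
predicted = [1, 4, 4, 64]

exact_match = mic_accuracy(predicted, observed, tolerance=0.5)
one_tier = mic_accuracy(predicted, observed, tolerance=1.5)
print(exact_match, one_tier)
print(balanced_accuracy(tp=40, tn=45, fp=5, fn=10))
```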

2.4. Regression Statistics

Kappa statistics are reliable because they can be tested repeatedly [48, 49], ensuring that researchers have access to accurate, comprehensive data regarding research samples. The kappa statistic evaluates the predicted classification accuracy against a random classification [50]. We used a kappa statistic that relies on binary values, where 0 is considered a null value and 1 represents the predicted outcome of the evaluation, as in equations (5)–(7) [51]. It also serves as an indicator of the reliability of the evaluation. In addition, the LR variables help resolve the two-way binary classifications. When applied to binary outcomes, LR makes predictions in the form of continuous values that allow for the preservation of sensitivity [36]. If the value is greater than the threshold (value > threshold), the assigned value is 1; otherwise, it is 0, as determined by equations (8)–(11) [52].

\begin{matrix} K = \frac{P(A) - P(E)}{1 - P(E)}, \end{matrix}

(5)

\begin{matrix} P(A) = \frac{TP + TN}{N}, \end{matrix}

(6)

\begin{matrix} P(E) = \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{N^{2}}, \end{matrix}

(7)

\begin{matrix} P = \alpha + \beta_{1} X_{1} + \beta_{2} X_{2} + \dots + \beta_{m} X_{m}, \end{matrix}

(8)

\begin{matrix} \sigma(x) = \frac{1}{1 + e^{-x}} \in [0, 1], \end{matrix}

(9)

\begin{matrix} \Pr(Y = +1 \mid X) \sim \beta \cdot X, \end{matrix}

(10)

\begin{matrix} \Pr(Y = -1 \mid X) = 1 - \Pr(Y = +1 \mid X). \end{matrix}

(11)
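
A minimal sketch of the kappa statistic and the thresholded logistic prediction follows. The expected agreement P(E) is computed with the standard Cohen's kappa formulation for a 2 × 2 table, which is an assumption about what the study intended; the coefficient values, feature vectors, and function names are hypothetical.

```python
import math


def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + tn + fp + fn
    p_a = (tp + tn) / n  # observed agreement
    # Chance agreement: both say positive, plus both say negative.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (p_a - p_e) / (1 - p_e)


def sigmoid(x):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_label(features, alpha, betas, threshold=0.5):
    """Return 1 if the logistic probability exceeds the threshold, else 0."""
    score = alpha + sum(b * x for b, x in zip(betas, features))
    return 1 if sigmoid(score) > threshold else 0


# Hypothetical counts and coefficients (not values from the study).
print(cohen_kappa(tp=40, tn=50, fp=5, fn=5))
print(predict_label([1, 0], alpha=-1.0, betas=[3.0, 2.0]))
print(predict_label([0, 0], alpha=-1.0, betas=[3.0, 2.0]))
```

For the counts shown, the observed agreement is 0.9 and the chance agreement is 0.505, so kappa is 0.395/0.495, or roughly 0.80.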

3. Results

A total of 1200 isolates of P. aeruginosa were included in this study, of which 44.66% were resistant to the 12 antimicrobial agents and 55.33% were sensitive, as shown in Figure 2. Among the resistant isolates, 44 were resistant to ampicillin, 37 to amoxicillin, 58 to meropenem, 60 to cefepime, 45 to fosfomycin, 30 to ceftazidime, 52 to chloramphenicol, 58 to erythromycin, 39 to tetracycline, 30 to gentamycin, 20 to butirosin, and 63 to ciprofloxacin. Among the sensitive isolates, 56 were sensitive to ampicillin, 63 to amoxicillin, 42 to meropenem, 40 to cefepime, 55 to fosfomycin, 70 to ceftazidime, 48 to chloramphenicol, 42 to erythromycin, 61 to tetracycline, 70 to gentamycin, 80 to butirosin, and 37 to ciprofloxacin. The most common resistance genes for these twelve antimicrobial drugs included blaOXA-396, blaPAO, aph(3′)-IIb, catB5, qacE, blaOXA-488, aac(6′)-Ib-cr, aph(3′)-Iia, aph(6)-Ic, aac(6′)-Ib3, fosA, sul1, catB7, blaPAO, aac(3)-Ia, aac(6′)-Il, aph(3′)-Iib, sul1, catB7, blaPAO, blaOXA-396, blaOXA494, qacE, crpP, catB7, blaPAO, and blaOXA-488. Furthermore, a total of 19,371,434 k-mers of length 31 were obtained from the analysis. These were compared against the ResFinder k-mer gene database, and 1,302,507 k-mers of the fosA, catB7, crpP, aac(6′)-Ib-cr, fosA, tet(G), aadA6, aph(3′)-Iib, sul1, aph(3′)-XV, aac(6′)-Ib3, bla_OXA-488, bla_GES-13, bla_GES-7, bla_GES-5, bla_GES-6, bla_PAO, qacE, crpT, aph(3′)-Iib, aadA13, bla_OXA-50, and qacE genes were detected in the genomes of 360 strains.

Figure 2. Number of resistant and sensitive isolate counts.
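
The overall percentages can be checked against the per-drug counts reported above with a short calculation. The sketch below only re-adds the numbers from the text; the observation that each drug's resistant and sensitive counts sum to 100, so that the 1200 total appears to correspond to 12 drugs × 100 results, is an interpretation of those numbers and is not stated in the study.

```python
# Per-drug counts as reported in the Results text.
resistant = {
    "ampicillin": 44, "amoxicillin": 37, "meropenem": 58, "cefepime": 60,
    "fosfomycin": 45, "ceftazidime": 30, "chloramphenicol": 52,
    "erythromycin": 58, "tetracycline": 39, "gentamycin": 30,
    "butirosin": 20, "ciprofloxacin": 63,
}
sensitive = {
    "ampicillin": 56, "amoxicillin": 63, "meropenem": 42, "cefepime": 40,
    "fosfomycin": 55, "ceftazidime": 70, "chloramphenicol": 48,
    "erythromycin": 42, "tetracycline": 61, "gentamycin": 70,
    "butirosin": 80, "ciprofloxacin": 37,
}

total_resistant = sum(resistant.values())
total_sensitive = sum(sensitive.values())
total = total_resistant + total_sensitive

pct_resistant = 100 * total_resistant / total
pct_sensitive = 100 * total_sensitive / total

print(total_resistant, total_sensitive, total)
print(f"{pct_resistant:.2f}% resistant, {pct_sensitive:.2f}% sensitive")
```

The sums are 536 resistant and 664 sensitive out of 1200, i.e. 44.67% and 55.33% when rounded to two decimals; the reported 44.66% matches the first value truncated rather than rounded.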

The accuracy percentage obtained from the results of BioWeka was more than 98% (as a mean percentage) including the training set and testing set, as shown in Figure 3 for all twelve antimicrobial drugs, namely, ampicillin, amoxicillin, meropenem, cefepime, fosfomycin, ceftazidime, chloramphenicol, erythromycin, tetracycline, gentamycin, butirosin, and ciprofloxacin with the confidence factor of 0.25% by 10-fold-cross validation. After the loop tests, the final mean accuracy for ampicillin was (99.31%), amoxicillin was (99.02%), meropenem was (98.27%), cefepime was (99.73%), fosfomycin was (96.44%), ceftazidime was (98.63%), chloramphenicol was (98.71%), erythromycin was (95.76%), tetracycline was (99.27%), gentamycin was (98.00%), butirosin was (99.57%), and ciprofloxacin was (96.17%).

Figure 3. BioWeka classification accuracy percentage of the training set and testing set of twelve antimicrobial drugs.

In addition, Figure 4 shows the classification accuracy percentage of the RF algorithm for the twelve antimicrobial drugs. The mean classification accuracy was more than 96% including the training set and testing set, as shown in Figure 5. After the loop testing, the final accuracy by RF for ampicillin was (94.00%), amoxicillin was (95.21%), meropenem was (96.63%), cefepime was (98.34%), fosfomycin was (99.23%), ceftazidime was (94.31%), chloramphenicol was (96.00%), erythromycin was (97.63%), tetracycline was (98.25%), gentamycin was (97.30%), butirosin was (98.03%), and ciprofloxacin was (98.97%). Furthermore, the standard deviation and average percentages of sensitivity, accuracy, precision, and specificity measured on the testing dataset are shown in Table 1. Our results on the testing dataset show that for the antimicrobial drugs ampicillin, amoxicillin, meropenem, cefepime, ceftazidime, tetracycline, butirosin, and ciprofloxacin, there were no false-positive or false-negative bacterial strains.

Figure 4. Random forest classification accuracy percentage of the training set and testing set of twelve antimicrobial drugs.

Figure 5. Mean accuracy percentage of random forest and BioWeka in comparison of twelve antimicrobial drugs.
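
The mean accuracies quoted for the two classifiers can be reproduced from the per-drug values reported in the text. The sketch below is a plain arithmetic check; the dictionary layout and variable names are hypothetical.

```python
from statistics import mean

# Per-drug accuracy (%) as reported: (BioWeka, RF).
accuracy = {
    "ampicillin": (99.31, 94.00), "amoxicillin": (99.02, 95.21),
    "meropenem": (98.27, 96.63), "cefepime": (99.73, 98.34),
    "fosfomycin": (96.44, 99.23), "ceftazidime": (98.63, 94.31),
    "chloramphenicol": (98.71, 96.00), "erythromycin": (95.76, 97.63),
    "tetracycline": (99.27, 98.25), "gentamycin": (98.00, 97.30),
    "butirosin": (99.57, 98.03), "ciprofloxacin": (96.17, 98.97),
}

bioweka_mean = mean(bw for bw, _ in accuracy.values())
rf_mean = mean(rf for _, rf in accuracy.values())
bioweka_wins = [drug for drug, (bw, rf) in accuracy.items() if bw > rf]

print(f"BioWeka mean accuracy: {bioweka_mean:.2f}%")
print(f"RF mean accuracy: {rf_mean:.2f}%")
print(f"BioWeka higher for {len(bioweka_wins)} of {len(accuracy)} drugs")
```

The means come out at about 98.24% for BioWeka and 96.99% for RF, consistent with the stated ≥98% and ≥96%. BioWeka is higher for nine drugs, and RF is higher for fosfomycin, erythromycin, and ciprofloxacin.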

Table 1.

Classification ratio of antimicrobial drugs against BioWeka and RF with phenotypes correlations.

Algorithm	Drug	Accuracy	Sensitivity	Specificity	Precision	F1 score	Kappa stats	Phenotype correlation
BioWeka classifications	Ampicillin	99.3 ± 0.0	86.0 ± 1.3	74.0 ± 2.3	1.0 ± 0.0	76.0 ± 3.2	91.0 ± 1.0	p < 2.1e − 1
	Amoxicillin	99.0 ± 0.0	62.0 ± 1.2	88.3 ± 1.2	1.0 ± 0.0	77.0 ± 1.0	91.2 ± 1.0	p < 2.1e − 1
	Meropenem	98.2 ± 0.0	88.0 ± 2.7	91.0 ± 1.0	1.0 ± 0.0	86.0 ± 2.5	89.3 ± 1.0	p < 2.1e − 1
	Cefepime	99.7 ± 0.0	89.0 ± 1.0	89.0 ± 1.0	1.0 ± 0.0	77.0 ± 1.0	94.8 ± 1.0	p < 2.1e − 1
	Fosfomycin	96.4 ± 0.0	77.0 ± 3.5	78.0 ± 2.1	1.0 ± 0.0	89.0 ± 1.0	97.6 ± 1.0	p < 2.1e − 1
	Ceftazidime	98.6 ± 0.0	85.0 ± 14.2	86.0 ± 3.7	1.0 ± 0.0	88.6 ± 2.0	91.3 ± 1.0	p < 2.1e − 1
	Chloramphenicol	98.7 ± 0.0	89.0 ± 2.1	78.0 ± 3.7	1.0 ± 0.0	91.9 ± 3.8	92.4 ± 1.0	p < 2.1e − 1
	Erythromycin	95.7 ± 0.0	91.0 ± 12.3	86.0 ± 3.2	1.0 ± 0.0	87.0 ± 1.0	89.9 ± 1.0	p < 2.1e − 1
	Tetracycline	99.2 ± 0.0	79.0 ± 1.7	89.0 ± 2.7	1.0 ± 0.0	79.0 ± 2.4	88.0 ± 1.0	p < 2.1e − 1
	Gentamycin	98.0 ± 0.0	92.0 ± 2.5	77.0 ± 2.1	1.0 ± 0.0	81.0 ± 1.0	88.0 ± 1.0	p < 2.1e − 1
	Butirosin	99.5 ± 0.0	88.0 ± 3.8	79.0 ± 12.1	1.0 ± 0.0	81.3 ± 2.7	87.6 ± 1.0	p < 2.1e − 1
	Ciprofloxacin	96.1 ± 0.0	87.0 ± 2.4	91.0 ± 1.0	1.0 ± 0.0	85.0 ± 1.0	83.8 ± 1.0	p < 2.1e − 1

Random forest classification	Ampicillin	94.0 ± 0.0	81.5 ± 2.1	88.4 ± 1.0	1.0 ± 0.0	84.9 ± 1.0	81.1 ± 1.0	p < 2.1e − 1
	Amoxicillin	95.2 ± 0.0	88.4 ± 2.5	81.2 ± 2.1	1.0 ± 0.0	88.6 ± 1.0	84.3 ± 1.0	p < 2.1e − 1
	Meropenem	96.6 ± 0.0	84.3 ± 3.6	73.9 ± 2.6	1.0 ± 0.0	87.1 ± 1.0	88.9 ± 1.0	p < 2.1e − 1
	Cefepime	98.3 ± 0.0	90.7 ± 2.2	77.0 ± 4.7	1.0 ± 0.0	82.5 ± 1.0	91.7 ± 1.0	p < 2.1e − 1
	Fosfomycin	99.2 ± 0.0	88.6 ± 2.3	76.8 ± 5.4	1.0 ± 0.0	77.7 ± 1.4	91.0 ± 1.0	p < 2.1e − 1
	Ceftazidime	94.3 ± 0.0	83.6 ± 2.1	83.7 ± 3.6	1.0 ± 0.0	79.0 ± 1.0	87.6 ± 1.0	p < 2.1e − 1
	Chloramphenicol	96.0 ± 0.0	89.7 ± 2.8	85.3 ± 2.9	1.0 ± 0.0	80.3 ± 2.7	84.9 ± 1.0	p < 2.1e − 1
	Erythromycin	97.6 ± 0.0	81.4 ± 4.6	82.6 ± 2.1	1.0 ± 0.0	78.7 ± 2.5	88.2 ± 1.0	p < 2.1e − 1
	Tetracycline	98.2 ± 0.0	83.9 ± 3.7	87.8 ± 3.1	1.0 ± 0.0	82.6 ± 1.0	91.9 ± 1.0	p < 2.1e − 1
	Gentamycin	97.3 ± 0.0	92.4 ± 2.6	79.6 ± 2.5	1.0 ± 0.0	89.4 ± 1.0	91.0 ± 1.0	p < 2.1e − 1
	Butirosin	98.0 ± 0.0	90.3 ± 3.1	81.9 ± 1.7	1.0 ± 0.0	86.3 ± 3.1	97.6 ± 1.0	p < 2.1e − 1
	Ciprofloxacin	98.9 ± 0.0	82.5 ± 3.5	88.6 ± 1.0	1.0 ± 0.0	81.2 ± 1.0	94.3 ± 1.0	p < 2.1e − 1


4. Discussion

A number of studies have highlighted the increasing global prevalence of antimicrobial resistance [12–16, 21, 24, 27, 53–57]. This is related to the challenges of treating bacterial infections, the consequences of which can be severe. P. aeruginosa is one of the most common bacterial species, and its families are responsible for some of the most dangerous infections ever seen in humans. There is a correlation between the resistance of these bacteria to multiple antibiotic classes and the severity of the infection, which complicates treatment. Antibiotic resistance among these microorganisms has been rising steadily over the years, and it is now common to find clinical samples resistant to multiple drugs. The development of antibiotic resistance causes doctors to delay administering the most effective treatment methods and prescribe a larger dosage of antibiotics than is necessary. This is particularly important in the intensive care unit, where patients' health conditions necessitate longer courses of antibiotics. The extensive use of expensive medical interventions, increased mortality rates, and lengthened hospital stays are all consequences of antimicrobial resistance [58]. Another topic of great interest is the need to prevent the spread of bacteria resistant to antibiotics and to identify them in advance so that patients can be isolated as soon as possible. Since this is the case, novel approaches must be proposed for detecting antimicrobial resistance and taking appropriate action without delay. In addition, gaining insight into the factors that contribute to the spread of nosocomial infections is possible by identifying relevant features.

In this paper, we propose a data mining strategy based on two machine learning techniques, namely bio-Weka and RF, combined with a statistical approach, for detecting the antimicrobial resistance of P. aeruginosa to different families of drugs. BioWeka and RF show that machine learning-based feature selection achieves high accuracy, as shown in Table 2. Consideration of antimicrobial drug resistance and susceptibility within data mining models and methods has been demonstrated to be useful in accelerating the workflow of clinical centers. Benefits for the individual, the healthcare system, and society may result from the early identification of patients at high risk of being resistant to one or more families of antibiotics. In addition, benefits include potential use in selecting the best antimicrobial treatment immediately.

Table 2.

Comparison of the accuracy of our machine learning models with recent studies.

Methods	Accuracy (%)	References
BioWeka	≥98	This paper
Random forest	≥96	This paper
Support vector machine (SVM)	≥95	[59]
Set covering machine (SCM)	≥96	[59]
Logistic regression (LR)	≥93	[44]
Decision tree (DT)	≥95	[44]
Random forest (RF)	≥97	[44]
Multi-layer perceptron (MLP)	≥91	[44]


Furthermore, the best performance achieved when testing this model strategy for resistance identification of antimicrobial drugs was a ROC area of 0.91 with a mean accuracy of more than 97% with all twelve drugs, indicating that our model can distinguish between the different classes of antibiotic susceptibility based solely on the type of the examined sample, the Gram stain classification of the pathogen, and prior antibiotic susceptibility testing results. We can foresee the sensitivity results from the various researchers using the model presented in this study. The ability to accurately detect antibiotic resistance could help clinicians make educated decisions about empiric therapy based on the local antibiotic resistance pattern. There may be major consequences for infection prevention if such prescribing practices become widespread.

The main limitation of the model proposed in this study lies in the 60:40 split with 10-fold cross-validation. If the ratio changes, the accuracy and sensitivity of the model might be affected. In addition, once the patient's clinical characteristics are added to the antimicrobial susceptibility dataset, the prediction performance of our model is expected to increase significantly in terms of resistance prediction accuracy for different drugs. However, any such inclusion incurs the cost of retrieving the relevant data, which may involve a number of healthcare units, thereby increasing communication costs and complicating the need to align protocols that may operate across departments. After acquiring such information, it is important to evaluate how well the additional knowledge, in terms of the improved accuracy metrics of the model, can be incorporated into the practice of hospital physicians, who may need to reevaluate their decision-making processes in the context of supporting or contradicting recommendations from a decision support system. To sum up, we think of this study as a node on a spectrum of cost-effectiveness studies that data mining approaches and machine learning techniques will spark in the healthcare industry.

Acknowledgments

The authors would like to thank our colleagues who contributed to the study. This research work has been supported by the Natural Science Foundation of China (NSFC) for young international scientists (Grant no. 42150410383) and the 2020 Li Ka Shing Foundation Cross-Disciplinary Research Grant (Project no. 2020LKSFG03E).

Data Availability

All data used in this study can be found in the Supplementary file associated with this article, or it can also be made available upon request to the first author or corresponding author.

Consent

Not applicable.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors' Contributions

Sohail M. Noman was responsible for conceptualization, methodology, empirical estimations, and writing of the original draft. Supervision was performed by Xiaoyang Jiao. Sohail M. Noman, Muhammad Shafiq, Yumeng Yuan, Mi Zeng, Qingdong Xie, and Xin Li performed data collection. Sohail M. Noman, Muhammad Zeeshan, Jehangir Arshad, and Melkamu Deressa Amentie performed review and editing. All authors have read and approved the final manuscript.

Supplementary Materials

All descriptive information about the raw data is present in the Supplementary file.



Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

All descriptive information about the raw data is present in the Supplementary file.

Click here for additional data file.^{(170.4KB, docx)}

Data Availability Statement

All data used in this study can be found in the Supplementary file associated with this article, or it can also be made available upon request to the first author or corresponding author.

[B1] 1.Druge S., Ruiz S., Vardon-Bounes F., et al. Risk factors and the resistance mechanisms involved in Pseudomonas aeruginosa mutation in critically ill patients. Journal of Intensive Care Medicine . 2019;7:36–39. doi: 10.1186/S40560-019-0390-4/TABLES/4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B2] 2.Mohanty S., Baliyarsingh B., Kumar Nayak S. Antimicrobial Resistance-A One Health Perspective . London, UK: IntechOpen; 2021. Antimicrobial Resistance in Pseudomonas aeruginosa: A Concise Review; pp. 1–11. [DOI] [Google Scholar]

[B3] 3.Pang Z., Raudonis R., Glick B. R., Lin T. J., Cheng Z. Antibiotic resistance in Pseudomonas aeruginosa: mechanisms and alternative therapeutic strategies. Biotechnology Advances . 2019;37(1):177–192. doi: 10.1016/J.BIOTECHADV.2018.11.013. [DOI] [PubMed] [Google Scholar]

[B4] 4.Langendonk R. F., Neill D. R., Fothergill J. L. The building blocks of antimicrobial resistance in Pseudomonas aeruginosa: implications for current resistance-breaking therapies. Frontiers in Cellular and Infection Microbiology . 2021;11:p. 307. doi: 10.3389/FCIMB.2021.665759/BIBTEX. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B5] 5.Prestinaci F., Pezzotti P., Pantosti A. Antimicrobial resistance: a global multifaceted phenomenon. Pathogens and Global Health . 2015;109(7):309–318. doi: 10.1179/2047773215Y.0000000030. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B6] 6.Drenkard E. Antimicrobial resistance of Pseudomonas aeruginosa biofilms. Microbes and Infection . 2003;5(13):1213–1219. doi: 10.1016/J.MICINF.2003.08.009. [DOI] [PubMed] [Google Scholar]

[B7] 7.Noman S. M., Shafiq M., Bibi S., et al. Exploring antibiotic resistance genes, mobile gene elements, and virulence gene factors in an urban freshwater samples using metagenomic analysis. Environmental Science & Pollution Research . 2022 2022;30(2):2977–2990. doi: 10.1007/S11356-022-22197-4. [DOI] [PubMed] [Google Scholar]

[B8] 8.Goodyear M. C., Garnier N. E., Levesque R. C., Khursigara C. M. Liverpool epidemic strain isolates of Pseudomonas aeruginosa display high levels of antimicrobial resistance during both planktonic and biofilm growth. Microbiology Spectrum . 2022;10(3):e102514. doi: 10.1128/SPECTRUM.01024-22.1024222 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B9] 9.Mekonnen H., Seid A., Molla Fenta G., Gebrecherkos T. Antimicrobial resistance profiles and associated factors of Acinetobacter and Pseudomonas aeruginosa nosocomial infection among patients admitted at Dessie comprehensive specialized Hospital, North-East Ethiopia. A cross-sectional study. PLoS One . 2021;16(11) doi: 10.1371/JOURNAL.PONE.0257272.e0257272 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B10] 10.Moore J. E., Millar B. C., Ollman-Selinger M., Cambridge L. The role of suboptimal concentrations of nebulized tobramycin in driving antimicrobial resistance in Pseudomonas aeruginosa isolates in cystic fibrosis. Respiratory Care . 2021;66(9):1446–1457. doi: 10.4187/RESPCARE.08671. [DOI] [PubMed] [Google Scholar]

[B11] 11.Shafiq M., Rahman S. U., Bilal H., et al. Incidence and molecular characterization of ESBL-producing and colistin-resistant Escherichia coli isolates recovered from healthy food-producing animals in Pakistan. Journal of Applied Microbiology . 2022;133(3):1169–1182. doi: 10.1111/JAM.15469. [DOI] [PubMed] [Google Scholar]

[B12] 12.Yamani L., Alamri A., Alsultan A., Alfifi S., Ansari M. A., Alnimr A. Inverse correlation between biofilm production efficiency and antimicrobial resistance in clinical isolates of Pseudomonas aeruginosa. Microbial Pathogenesis . 2021;157 doi: 10.1016/J.MICPATH.2021.104989.104989 [DOI] [PubMed] [Google Scholar]

[B13] 13.Thacharodi A., Lamont I. L. Aminoglycoside resistance in Pseudomonas aeruginosa: the contribution of the MexXY-OprM efflux pump varies between isolates. Journal of Medical Microbiology . 2022;71(6):p. 1563. doi: 10.1099/JMM.0.001551. [DOI] [PubMed] [Google Scholar]

[B14] 14.Araújo Ma dos S., Rodrigues J. S., Lobo T. de L. G. F., Maranhão F. C. de A. Healthcare-associated infections by Pseudomonas aeruginosa and antimicrobial resistance in a public hospital from alagoas (Brazil) Jornal Brasileiro de Patologia e Medicina Laboratorial . 2022;58:1–11. doi: 10.1900/JBPML.2022.58.447. [DOI] [Google Scholar]

[B15] 15.Lynch J. P., Zhanel G. G., Zhanel G. G. Pseudomonas aeruginosa pneumonia: evolution of antimicrobial resistance and implications for therapy. Seminars in Respiratory and Critical Care Medicine . 2022;43(02):191–218. doi: 10.1055/S-0041-1740109. [DOI] [PubMed] [Google Scholar]

[B16] 16.Madden D. E., Olagoke O., Baird T., et al. Express yourself: quantitative real-time PCR assays for rapid chromosomal antimicrobial resistance detection in Pseudomonas aeruginosa. Antimicrobial Agents and Chemotherapy . 2022;66(5):9. doi: 10.1128/AAC.00204-22.e0020422 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B17] 17.Yuan Y., Chen Y., Yao F., et al. Microbiomes and resistomes in biopsy tissue and intestinal lavage fluid of colorectal cancer. Frontiers in Cell and Developmental Biology . 2021;9 doi: 10.3389/FCELL.2021.736994.736994 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B18] 18.Tamma P. D., Aitken S. L., Bonomo R. A., Mathers A. J., van Duin D., Clancy C. J. Infectious diseases society of America guidance on the treatment of extended-spectrum β-lactamase producing enterobacterales (ESBL-E), carbapenem-resistant enterobacterales (CRE), and Pseudomonas aeruginosa with difficult-to-treat resistance (DTR-P. aeruginosa) Clinical Infectious Diseases . 2021;72(7):e169–e183. doi: 10.1093/CID/CIAA1478. [DOI] [PubMed] [Google Scholar]

[B19] 19.Gajdács M., Baráth Z., Kárpáti K., et al. No correlation between biofilm formation, virulence factors, and antibiotic resistance in Pseudomonas aeruginosa: results from a laboratory-based in vitro study. Antibiotics . 2021;10(9):p. 1134. doi: 10.3390/ANTIBIOTICS10091134. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B20] 20.Wu Z. Y., Wu X. S., Yao W. Y., Wang X. F., Quan Z. W., Gong W. Pathogens’ distribution and changes of antimicrobial resistance in the bile of acute biliary tract infection patients. Zhonghua wai ke za zhi [Chinese journal of surgery] . 2021;59(1):24–31. doi: 10.3760/CMA.J.CN112139-20200717-00559. [DOI] [PubMed] [Google Scholar]

[B21] 21.Boschetti G., Sgarabotto D., Meloni M., et al. Antimicrobial resistance patterns in diabetic foot infections, an epidemiological study in northeastern Italy. Antibiotics . 2021;10:p. 1241. doi: 10.3390/ANTIBIOTICS10101241. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B22] 22.Zahedi bialvaei A., Rahbar M., Hamidi-Farahani R., et al. Expression of RND efflux pumps mediated antibiotic resistance in Pseudomonas aeruginosa clinical strains. Microbial Pathogenesis . 2021;153 doi: 10.1016/J.MICPATH.2021.104789.104789 [DOI] [PubMed] [Google Scholar]

[B23] 23.Karlowsky J. A., Walkty A. J., Baxter M. R., et al. Vitro Activity of Cefiderocol against Extensively Drug-Resistant Pseudomonas aeruginosa: CANWARD, 2007 to 2019. Microbiology Spectrum . 2022;10(4) doi: 10.1128/SPECTRUM.01724-22. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B24] 24.Mogasale V. V., Saldanha P., Pai V., Rekha P. D., Mogasale V. A descriptive analysis of antimicrobial resistance patterns of WHO priority pathogens isolated in children from a tertiary care hospital in India. Scientific Reports . 2021;11:p. 5116. doi: 10.1038/s41598-021-84293-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B25] 25.Liu C., Xu M., Li X., Dong H., Ming L. Trends in antimicrobial resistance in bloodstream infections at a large tertiary-care hospital in China: a 10-year retrospective study (2010–2019) Journal of Global Antimicrobial Resistance . 2022;29:413–419. doi: 10.1016/J.JGAR.2021.09.018. [DOI] [PubMed] [Google Scholar]

[B26] 26.Vaz S., Lall M. Potential public health impact of the development of antimicrobial resistance in clinical isolates of Pseudomonas aeruginosa on repeated exposure to biocides In vitro. Med J Dr DY Patil Vidyapeeth . 2021;14(1):p. 45. doi: 10.4103/MJDRDYPU.MJDRDYPU_353_20. [DOI] [Google Scholar]

[B27] 27.Soonthornsit J., Pimwaraluck K., Kongmuang N., Pratya P., Phumthanakorn N. Molecular epidemiology of antimicrobial-resistant Pseudomonas aeruginosa in a veterinary teaching hospital environment. Veterinary Research Communications . 2022;47:73–86. doi: 10.1007/S11259-022-09929-0. [DOI] [PubMed] [Google Scholar]

[B28] 28.Qin J., Zou C., Tao J., et al. Carbapenem resistant Pseudomonas aeruginosa infections in elderly patients: antimicrobial resistance profiles, risk factors and impact on clinical outcomes. Infection and Drug Resistance . 2022;15:2301–2314. doi: 10.2147/IDR.S358778. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B29] 29.Gewehr J. E., Szugat M., Zimmer R. BioWeka - extending the Weka framework for bioinformatics. Bioinformatics . 2007;23(5):651–653. doi: 10.1093/bioinformatics/btl671. [DOI] [PubMed] [Google Scholar]

[B30] 30.Sohail M. N., Jiadong R., Uba M. M., et al. Forecast Regression analysis for Diabetes Growth: an inclusive data mining approach. International Journal of Advanced Research in Computer Engineering . 2018;7:715–721. [Google Scholar]

[B31] 31.Noman S. M., Arshad J., Zeeshan M., et al. An empirical study on diabetes depression over distress evaluation using diagnosis statistical manual and chi-square method. International Journal of Environmental Research and Public Health . 2021;18(7):p. 3755. doi: 10.3390/ijerph18073755. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B32] 32.Uba Muhammad M., Jiadong R., Sohail M. N., Irshad M., Bilal M., Osi A. A. A logistic regression modeling on the prevalence of diabetes mellitus in the North Western Part of Nigeria. Benin Journal of Statistics-Uniben . 2018;1:1–10. [Google Scholar]

[B33] 33.Sohail N., Ren J., Abir I., Uba Muhammad M., Bilal M., Iqbal W. WHY only data mining? A pilot study on inadequacy and domination of data mining technology why only data mining? A pilot study on inadequacy and domination of data mining technology (google scholar Library2019 view project WHY only data mining? A pilot st. International Journal of Recent Scientific Research . 2018;9:75. doi: 10.24327/ijrsr.2018.0910.2787.29066 [DOI] [Google Scholar]

[B34] 34.Sohail M. N., Ren J., Uba Muhammad M. A euclidean group assessment on semi-supervised clustering for healthcare clinical implications based on real-life data. International Journal of Environmental Research and Public Health . 2019;16(9):1581–1593. doi: 10.3390/ijerph16091581. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B35] 35.Uba M. M., Jiadong R., Sohail M. N., Irshad M., Yu K. Data mining process for predicting diabetes mellitus based model about other chronic diseases: a case study of the northwestern part of Nigeria. Healthcare Technology Letters . 2019;6(4):98–102. doi: 10.1049/htl.2018.5111. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B36] 36.Muhammad M. U., Jiadong R., Muhammad N. S., Nawaz B. Stratified diabetes mellitus prevalence for the Northwestern Nigerian States, a data mining approach. International Journal of Environmental Research and Public Health . 2019;16(21):p. 4089. doi: 10.3390/ijerph16214089. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B37] 37.Dufresne Y., Lemane T., Marijon P., et al. The K-mer File Format: a standardized and compact disk representation of sets of k-mers. Bioinformatics . 2022;38(18):4423–4425. doi: 10.1093/BIOINFORMATICS/BTAC528. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B38] 38.Li Y., Patel H., Lin Y. Variant Calling: Methods and Protocols . New York, NY, USA: Springer; 2022. Kmer2SNP: Reference-free Heterozygous SNP Calling Using K-Mer Frequency Distributions; pp. 257–265. [DOI] [PubMed] [Google Scholar]

[B39] 39.Hicks A. L., Wheeler N., Sánchez-Busó L., Rakeman J. L., Harris S. R., Grad Y. H. Evaluation of parameters affecting performance and reliability of machine learning-based antibiotic susceptibility testing from whole genome sequencing data. PLoS Computational Biology . 2019;15(9):21. doi: 10.1371/journal.pcbi.1007349.e1007349 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B40] 40.Su X., Jing G., Zhang Y., Wu S. Method development for cross-study microbiome data mining: challenges and opportunities. Computational and Structural Biotechnology Journal . 2020;18:2075–2080. doi: 10.1016/j.csbj.2020.07.020. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B41] 41.Feretzakis G., Loupelis E., Sakagianni A., et al. Using machine learning techniques to aid empirical antibiotic therapy decisions in the intensive care unit of a general hospital in Greece. Antibiotics . 2020;9(2):p. 50. doi: 10.3390/antibiotics9020050. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B42] 42.Shakeri H., Volkova V., Wen X., et al. Establishing statistical equivalence of data from different sampling approaches for assessment of bacterial phenotypic antimicrobial resistance. Applied and Environmental Microbiology . 2018;84(9):17. doi: 10.1128/AEM.02724-17.e027244 [DOI] [PMC free article] [PubMed] [Google Scholar]

[B43] 43.Arango-Argoty G., Garner E., Pruden A., Heath L. S., Vikesland P., Zhang L. DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data. Microbiome . 2018;6:1–15. doi: 10.1186/s40168-018-0401-z. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B44] 44.Martínez-Agüero S., Mora-Jiménez I., Lérida-García J., Álvarez-Rodríguez J., Soguero-Ruiz C. Machine learning techniques to identify antimicrobial resistance in the intensive care unit. Entropy . 2019;21(6):603–624. doi: 10.3390/e21060603. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B45] 45.Khan W., Kong L., Noman S. M., Brekhna B., Brekhna B. A novel feature selection method via mining Markov blanket. Applied Intelligence . 2022;1:1–24. doi: 10.1007/S10489-022-03863-Z. [DOI] [Google Scholar]

[B46] 46.Suzuki S., Horinouchi T., Furusawa C. Prediction of antibiotic resistance by gene expression profiles. Nature Communications . 2014;5(1):p. 5792. doi: 10.1038/ncomms6792. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B47] 47.Feretzakis G., Loupelis E., Sakagianni A., et al. Using machine learning algorithms to predict antimicrobial resistance and assist empirical treatment. Studies in Health Technology and Informatics . 2020;272:75–78. doi: 10.3233/SHTI200497. [DOI] [PubMed] [Google Scholar]

[B48] 48. Muhammad M. U., Jiadong R., Muhammad N. S., Hussain M., Muhammad I. Principal component analysis of categorized polytomous variable-based classification of diabetes and other chronic diseases. International Journal of Environmental Research and Public Health. 2019;16(19):p. 3593. doi: 10.3390/ijerph16193593.

[B49] 49. Sohail M. N., Jiadong R., Muhammad M. U., Chauhdary S. T., Arshad J., Verghese A. J. An accurate clinical implication assessment for diabetes mellitus prevalence based on a study from Nigeria. Processes. 2019;7. doi: 10.3390/pr7050289.

[B50] 50. Sohail M. N., Jiadong R., Uba M. M., et al. A hybrid Forecast Cost Benefit Classification of diabetes mellitus prevalence based on epidemiological study on Real-life patient’s data. Scientific Reports. 2019;9:10110. doi: 10.1038/s41598-019-46631-9.

[B51] 51. Muhammad M. U., Asiribo O. E., Muhammad S. Application of logistic regression modeling using fractional polynomials of grouped continuous covariates. Proceedings of the Nigeria Statistical Society. 2017;1:144–147. http://nss.com.ng/2017_edited_proceedings

[B52] 52. Sohail M. N., Ren J., Muhammad M. U., et al. Group covariates assessment on real-life diabetes patients by fractional polynomials: a study based on logistic regression modeling. Journal of Biotech Research. 2019;10:116–125.

[B53] 53. Bilal H., Khan M. N., Rehman T., Hameed M. F., Yang X. Antibiotic resistance in Pakistan: a systematic review of past decade. BMC Infectious Diseases. 2021;21:244–319. doi: 10.1186/S12879-021-05906-1.

[B54] 54. Shafiq M., Huang J., Shah J. M., et al. Characterization and virulence factors distribution of blaCTX-M and mcr-1 carrying Escherichia coli isolates from bovine mastitis. Journal of Applied Microbiology. 2021;131(2):634–646. doi: 10.1111/JAM.14994.

[B55] 55. Bilal H., Hameed F., Khan M. A., Khan S., Yang X., Rehman T. U. Detection of mcr-1 gene in extended-spectrum β-lactamase-producing Klebsiella pneumoniae from human urine samples in Pakistan. Jundishapur Journal of Microbiology. 2020;13(4):13–21. doi: 10.5812/JJM.96646.

[B56] 56. Bilal H., Rehman T. U., Khan M. A., et al. Molecular epidemiology of mcr-1, blaKPC-2, and blaNDM-1 harboring clinically isolated Escherichia coli from Pakistan. Infection and Drug Resistance. 2021;14:1467–1479. doi: 10.2147/IDR.S302687.

[B57] 57. Bilal H., Zhang G., Rehman T., et al. First report of blaNDM-1 bearing IncX3 plasmid in clinically isolated ST11 Klebsiella pneumoniae from Pakistan. Microorganisms. 2021;9(5):p. 951. doi: 10.3390/MICROORGANISMS9050951.

[B58] 58. Shafiq M., Huang J., Ur Rahman S., et al. High incidence of multidrug-resistant Escherichia coli coharboring mcr-1 and blaCTX-M-15 recovered from pigs. Infection and Drug Resistance. 2019;12:2135–2149. doi: 10.2147/IDR.S209473.

[B59] 59. Liu Z., Deng D., Lu H., et al. Evaluation of machine learning models for predicting antimicrobial resistance of Actinobacillus pleuropneumoniae from whole genome sequences. Frontiers in Microbiology. 2020;11:48–57. doi: 10.3389/fmicb.2020.00048.
