iScience. 2022 Aug 17;25(9):104967. doi: 10.1016/j.isci.2022.104967

Prediction of anti-inflammatory peptides by a sequence-based stacking ensemble model named AIPStack

Hua Deng 1, Chaofeng Lou 1, Zengrui Wu 1, Weihua Li 1, Guixia Liu 1, Yun Tang 1,2
PMCID: PMC9449674  PMID: 36093066

Summary

Accurate and efficient identification of anti-inflammatory peptides (AIPs) is crucial for the treatment of inflammation. Here, we propose a two-layer stacking ensemble model, AIPStack, to effectively predict AIPs. First, we constructed a new dataset for model building and validation. Then, peptide sequences were represented by hybrid features fused from two amino acid composition descriptors. Next, the stacking ensemble model was constructed with random forest and extremely randomized trees as the base-classifiers and logistic regression as the meta-classifier, which receives the outputs of the base-classifiers. AIPStack achieved an AUC of 0.819, accuracy of 0.755, and MCC of 0.510 on independent set 3, which were higher than those of other AIP predictors. Furthermore, the essential sequence features were highlighted by the Shapley Additive exPlanation (SHAP) method. It is anticipated that AIPStack could be used for AIP prediction in a high-throughput manner and facilitate hypothesis-driven experimental design.

Subject areas: Drugs, Peptides, Artificial intelligence, Artificial intelligence applications

Graphical abstract


Highlights

  • AIPStack model was developed for the prediction of anti-inflammatory peptides

  • The hybrid features were used to describe the peptide sequences

  • The proposed model AIPStack outperformed existing ones

  • SHAP was used to highlight the essential features required for AIP prediction



Introduction

The inflammatory response is essentially the body's natural defense against injury, infection, or other stimuli, and helps to maintain tissue homeostasis under noxious conditions (Medzhitov, 2010). Inflammation is usually classified as acute or chronic. Acute inflammation is an innate immune response, whereas chronic inflammation persists for a long time and results in various devastating chronic diseases, such as neurodegenerative diseases, cardiovascular diseases, cancer, and autoimmune disorders. According to statistics, three in five people worldwide die of chronic inflammatory diseases (Tsai et al., 2019; Deepak et al., 2019; Barcelos et al., 2019). There is no doubt that chronic inflammation is a great threat to human health. Nonsteroidal anti-inflammatory drugs (NSAIDs), glucocorticoids, and some biologicals are the primary treatments for chronic inflammation and autoimmune disorders (Tabas and Glass, 2013; Vandewalle et al., 2018; Bindu et al., 2020; Chan and Carter, 2010). However, multiple adverse effects (Harirforoosh et al., 2013; Schäcke et al., 2002) and drug resistance (Dendoncker and Libert, 2017) pose challenges to the development of small-molecule anti-inflammatory drugs. Thus, there is an urgent need for the discovery and rational design of novel, effective anti-inflammatory drugs.

In recent years, peptide therapeutics, such as antibacterial peptides and anticancer peptides, have attracted much attention due to their advantages of safety, efficacy, high selectivity, and ease of synthesis (Muttenthaler et al., 2021). Anti-inflammatory peptides (AIPs) are therapeutic peptides that exhibit anti-inflammatory properties. Generally, AIPs are short linear peptides composed of 10–50 amino acids. Most AIPs discovered so far are endogenous peptides or derived from natural sources, such as the endogenous neuropeptide vasoactive intestinal peptide (Jiang et al., 2016), melittin (Lee et al., 2014) from bee venom, and hydrostatin-SN1 (Zhang et al., 2020a) isolated from the sea snake. Some synthetic peptides, for example the BCL-3-mimetic (Collins et al., 2015), have also been explored to inhibit inflammatory responses. The mechanisms of AIPs include modulating immune cell differentiation, inducing anti-inflammatory responses, and preventing excessive pro-inflammatory responses (Sun et al., 2018; Heinbockel et al., 2021). Recently, several AIPs have been approved by the U.S. Food and Drug Administration (FDA) to prevent and control inflammation (Usmani et al., 2017). These studies show that AIPs have great therapeutic potential and are likely to become a new alternative therapy for inflammation.

The discovery of AIPs through wet-lab experiments is labor-intensive, expensive, and time-consuming, making it difficult to apply in a high-throughput manner. Moreover, with the rapid development and wide application of next-generation sequencing techniques, there is an increasing demand for fast, cheap, and efficient computational methods to annotate the enormous number of protein sequences. Machine learning (ML) algorithms are such computational methods: they can efficiently predict peptide properties from sequence profiles and are expected to expedite the process of AIP discovery.

So far, several ML methods have been developed for the identification of potential AIPs. Tables 1 and 2 summarize the existing ML methods from a wide range of aspects, including the datasets, applied ML algorithms, feature encoding schemes, evaluation strategies, and the availability of web servers or standalone software. As shown in Table 1, Gupta2017 (Gupta et al., 2017) is the first dataset used for AIP prediction; its positive and negative epitopes were collected from the Immune Epitope Database (IEDB) (Vita et al., 2019). In 2018, Manavalan et al. (2018) recollected data from the IEDB with more peptide sequences than Gupta2017. The other six datasets were mostly derived from Manavalan2018. Relationships between the datasets of the eight existing AIP prediction methods are presented in Figure S1. Early methods such as AntiInflam (Gupta et al., 2017) and AIPpred (Manavalan et al., 2018) utilized only a single algorithm. Recently, two excellent methods, PreTP-EL (Guo et al., 2021) and PreTP-Stack (Yan et al., 2022), were constructed by integrating several ML algorithms for therapeutic peptide prediction. Random forest (RF) (Breiman, 2001) is the most popular algorithm, employed by seven of the eight methods. Meanwhile, three of the existing methods used support vector machines (SVMs) (Grigoroiu et al., 2020).

Table 1.

List of currently available methods for AIP prediction

| Method | Year | Feature encoding | Model | Evaluation strategy | Web server | Standalone |
|---|---|---|---|---|---|---|
| AntiInflam | 2017 | TPC, motif features | SVM | 10-fold CV, IT | http://metagenomics.iiserb.ac.in/antiinflam/ | no |
| AIPpred | 2018 | DPC | RF | 5-fold CV, IT | http://www.thegleelab.org/AIPpred/ | no |
| PreAIP | 2019 | AAindex, KSAAP, structural features, pKSAAP | RF | 10-fold CV, IT | http://kurata14.bio.kyutech.ac.jp/PreAIP/ | no |
| PEPred-Suite | 2019 | 89 class features | RF | 10-fold CV, IT | http://server.malab.cn/PEPred-Suite | yes |
| AIEpred | 2020 | AAC, PSSM, PP | RF | LOOCV, IT | no | no |
| iAIPs | 2021 | AAC, DDE, GDC | RF | 5-fold CV, IT | no | no |
| PreTP-EL | 2021 | Kmer, PPCT, Tng, DT, DR, PCG, PSDT, PSBT, DP | SVM, RF | 10-fold CV, IT | http://bliulab.net/PreTP-EL | no |
| PreTP-Stack | 2022 | Kmer, PCG, Tng, DT, DR, AAC, BIT20, PPCT, PSDT, PSBT | SVM, RF, LDA, XGBoost, AMV | 10-fold CV, IT | http://bliulab.net/PreTP-Stack | no |

Feature abbreviations: TPC (tripeptide composition), DPC (dipeptide composition), AAindex (amino acid index), KSAAP (k-spaced amino acid pairs), pKSAAP (k-spaced amino acid pairs from position-specific scoring matrix), AAC (amino acid composition), PSSM (position-specific scoring matrix), PP (physicochemical property), DDE (dipeptide deviation from the expected mean), GDC (g-gap dipeptide composition), PPCT (position-specific scoring matrix and position-specific frequency matrix cross transformation), Tng (top-n-gram), DT (distance-based top-n-gram), DR (distance-based residue), PCG (parallel correlation pseudo amino acid composition general), PSDT (PSSM distance transformation), PSBT (position-specific frequency matrix with distance bigram transformation), DP (distance amino acid pair, or distance-pair), and BIT20 (twenty-bit feature).

Model abbreviations: SVM (support vector machine), RF (random forest), LDA (linear discriminant analysis), XGBoost (extreme gradient boosting), AMV (auto-weighted multi-view learning).

Evaluation strategy abbreviations: k-fold CV (k-fold cross-validation), IT (independent test), LOOCV (leave-one-out cross-validation).

Table 2.

A summary of the datasets used in currently available methods

| Method | Dataset | Benchmark set (AIPs / non-AIPs) | Independent set (AIPs / non-AIPs) | CD-HIT threshold | Dataset availability |
|---|---|---|---|---|---|
| AntiInflam | Gupta2017 | 690 / 1,009 | 173 / 253 | n/a | yes |
| AIPpred | Manavalan2018 | 1,258 / 1,887 | 420 / 629 | 0.8 | yes |
| PreAIP^a | Khatun2019 | 1,258 / 1,887 | 420 / 629 | 0.8 | yes |
| PEPred-Suite | Wei2019 | 1,258 / 1,887 | 420 / 629 and 420 / 2,000 | 0.8 | yes |
| AIEpred | Zhang2020 | 690 / 1,009 | 173 / 253 and 420 / 629 | 0.8 | yes |
| iAIPs^a | Zhao2021 | 1,258 / 1,887 | 420 / 629 | 0.8 | yes |
| PreTP-EL^a | Guo2021 | 1,258 / 1,887 | 420 / 629 | 0.8 | yes |
| PreTP-Stack^a | Yan2022 | 1,258 / 1,887 | 420 / 629 | 0.8 | yes |

^a Method whose dataset is the same as Manavalan2018.

How to effectively describe AIPs with informative feature representations is a major challenge for prediction models. The existing methods adopt various feature encoding schemes, which can be classified into four groups: sequence composition features (e.g. dipeptide composition (DPC) (Saravanan and Gautham, 2015)), physicochemical properties (e.g. the amino acid index (AAindex) (Kawashima et al., 2007)), structural features (e.g. motif features), and evolutionary features (e.g. the position-specific scoring matrix (PSSM) (Cai et al., 2012)). All the existing methods except AIPpred applied multiple feature encoding schemes to incorporate more information into the description of the sequences. Four evaluation strategies were used to estimate the performance of the existing methods: leave-one-out cross-validation (CV), 5-fold CV, 10-fold CV, and independent testing. Among the eight existing methods, six have been implemented as web servers or standalone software, but one of them, namely PEPred-Suite (Wei et al., 2019), cannot be accessed now.

These ML-based tools have indeed made great progress in the identification of AIPs and provided a rational basis for the selection of AIP candidates. However, two issues remain to be addressed. First, as presented in Table 2, most existing methods used the same small set of sequence samples, which limited model performance and generalization capability. Second, three-quarters of the existing methods applied only a single algorithm, although many studies have shown that ensemble learning models usually outperform single-algorithm-based models (Guo et al., 2021; Jiang et al., 2021; Basith et al., 2022; Liang et al., 2021; Mishra et al., 2019). Thus, an ensemble learning strategy might improve the performance of AIP identification. Recently, two ensemble learning methods, namely PreTP-EL and PreTP-Stack, were reported, but their performance in AIP prediction might still be limited. Accordingly, it is essential to develop a new prediction model with higher accuracy, which could not only help improve our understanding of the association between peptide sequence and anti-inflammatory activity but also provide a reference for the rational design of AIPs based on the important features given by the model explanation.

Keeping these issues in mind, in this study we first constructed a new dataset for model building and validation. Then, we explored different feature encoding schemes and ML algorithms to further improve prediction performance, and proposed a stacking ensemble model called AIPStack to predict AIPs. In brief, AIPStack is composed of a two-layer framework. The first layer consists of two popular ML algorithms (extremely randomized trees (ET) (Geurts et al., 2006) and RF) and uses feature vectors fused from two feature representations, namely dipeptide deviation from expected mean (DDE) (Saravanan and Gautham, 2015) and composition of k-spaced amino acid pairs (CKSAAP) (Chen et al., 2013). The second layer adopts the prediction probabilities from the first layer as the inputs of the meta-classifier (logistic regression, LR) (LaValley, 2008). The systematic workflow of AIPStack is depicted in Figure 1. We also evaluated the generalization capability of AIPStack and compared it with several state-of-the-art models on independent sets. Moreover, for model interpretability, we leveraged the Shapley Additive exPlanation (SHAP) (Lundberg and Lee, 2017) method to highlight the most important and contributing sequence features. The proposed approach is potentially useful for AIP research.
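The two-layer framework described here can be sketched with scikit-learn's StackingClassifier: RF and ET serve as base-classifiers whose class probabilities feed an LR meta-classifier. This is a minimal illustration under assumed hyperparameters (tree counts, fold number), not the tuned model from this study.

```python
# Sketch of the AIPStack two-layer design using scikit-learn.
# RF and ET are the base-classifiers; their out-of-fold predicted
# probabilities train a logistic-regression meta-classifier.
# Hyperparameters below are placeholders, not the tuned values.
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression

def build_aipstack(random_state=42):
    base_classifiers = [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=random_state)),
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=random_state)),
    ]
    return StackingClassifier(
        estimators=base_classifiers,
        final_estimator=LogisticRegression(),
        stack_method="predict_proba",  # meta-classifier sees class probabilities
        cv=10,                         # out-of-fold stacking avoids leakage
    )
```

Because `stack_method="predict_proba"`, the meta-classifier receives exactly the kind of probability outputs described for the second layer.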

Figure 1.

The overall framework of our AIPStack

(A) Dataset preparation. The dataset used here was collected from the IEDB. After reducing sequence redundancy, the undersampling approach was used to handle the imbalanced dataset.

(B) Feature encoding. The hybrid features fused by the DDE descriptor and CKSAAP descriptor were used to represent the peptide sequences.

(C) Model construction. A two-layer stacking ensemble model, called AIPStack, was developed.

(D) Model evaluation and prediction. We evaluated the AIPStack by the 10-fold cross-validation, internal and external validation. It was also compared with the existing methods.

Results

Dataset preparation

Dataset preparation is the first step in an ML endeavor and is important for building a predictive model with strong generalization capability. In this study, we first obtained a total of 2,642 AIPs (positive samples) and 3,704 non-AIPs (negative samples) from the IEDB. To avoid evaluation bias caused by sequence homology, we excluded sequences that shared >80% similarity with others, resulting in a non-redundant dataset containing 1,866 AIPs and 2,845 non-AIPs. Since the dataset was imbalanced, we adopted the random undersampling technique to obtain a balanced dataset. The sampling process was repeated five times to generate five balanced datasets, each containing 1,866 AIPs and an equal number of non-AIPs. Each balanced dataset was then randomly divided, at a ratio of 8:1:1, into a training set, a test set, and an independent set (hereinafter referred to as independent set 1), used for model building, internal validation, and external validation, respectively.
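The undersampling and 8:1:1 partition can be sketched as follows; the sequence lists are placeholders for the IEDB-derived samples, and the seed is arbitrary.

```python
# Sketch of the balanced-dataset construction: randomly undersample the
# majority class (non-AIPs) to match the number of AIPs, then split
# 8:1:1 into training, test, and independent sets.
import random

def make_balanced_split(aips, non_aips, seed=0):
    rng = random.Random(seed)
    negatives = rng.sample(non_aips, k=len(aips))  # undersample majority class
    samples = [(s, 1) for s in aips] + [(s, 0) for s in negatives]
    rng.shuffle(samples)
    n = len(samples)
    n_train, n_test = int(0.8 * n), int(0.1 * n)
    train = samples[:n_train]
    test = samples[n_train:n_train + n_test]
    independent = samples[n_train + n_test:]   # "independent set 1"
    return train, test, independent
```

Repeating this with five different seeds would mirror the five balanced datasets used in the study.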

Considering that there might be some overlaps between our independent set 1 of the final model and the datasets used for training in other methods, we constructed independent set 2 (difference of AntiInflam’s benchmark dataset with independent set 1) and independent set 3 (difference of AIPpred’s benchmark dataset with independent set 1) for unbiased comparison with other methods, which led to 135 AIPs and 138 non-AIPs in independent set 2, and 71 AIPs and 72 non-AIPs in independent set 3.

The length distribution of AIPs and non-AIPs can be seen in Figure 2A. The positive and negative samples had similar distributions of sequence length, and most sequences ranged from 15 to 20 amino acids. Additionally, the t-distributed stochastic neighbor embedding (t-SNE) plots in Figures 2B and S2 illustrate that the created test sets and independent sets covered most of the sequence space occupied by the training set. These results support the rationality of the dataset partitioning.

Figure 2.

Sequence length and spatial distribution

(A) Sequence length distribution of AIPs and non-AIPs in the whole dataset.

(B) A t-SNE plot for sequence spatial distribution of one of the balanced datasets. See also Figure S2.

Composition analysis of AIPs and non-AIPs

We performed composition analysis for AIPs and non-AIPs. The amino acid composition (AAC) (Bhasin and Raghava, 2004) and DPC descriptors were each calculated on the whole dataset. Figure 3A shows the average composition of natural amino acids in AIP and non-AIP sequences. The amino acids with the four highest absolute difference scores were Leu, Asp, Arg, and Pro, with absolute differences of 0.016, 0.009, 0.007, and 0.006, respectively (see Table S1). In contrast, there was little difference in the composition of Trp and Met between AIPs and non-AIPs. Furthermore, a two-sided Mann-Whitney U test was used to evaluate the statistical significance of the difference in each amino acid's composition between AIPs and non-AIPs. As shown in Figure 3A (see also Table S1), the composition of four amino acids (Leu, Asp, Arg, and Pro) differed significantly between AIPs and non-AIPs. Among them, Leu and Arg (positively charged) were more abundant in AIPs than in non-AIPs, whereas Pro and Asp (negatively charged) were more abundant in non-AIPs. Similarly, Figure 3B and Table S2 show the 20 top-ranked dipeptides with the highest absolute composition differences between AIPs and non-AIPs. The dipeptides Leu-Leu, Ser-Leu, Leu-Ser, Leu-Glu, Leu-Lys, Leu-Ile, Ser-Val, Tyr-Leu, Glu-Arg, Arg-Ile, and Val-Leu were significantly dominant in AIPs, while Gln-Gln, Asp-Asp, Gln-Pro, and Val-Asp were significantly dominant in non-AIPs. That is, the most abundant dipeptides in AIPs were primarily composed of apolar-apolar, polar uncharged-apolar, apolar-polar uncharged, apolar-negatively charged, apolar-positively charged, negatively charged-positively charged, and positively charged-apolar amino acid pairs, whereas those in non-AIPs were primarily composed of polar uncharged-polar uncharged, negatively charged-negatively charged, polar uncharged-apolar, and apolar-negatively charged pairs.
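The composition comparison can be reproduced in outline with SciPy: compute the AAC vector per sequence, then apply a two-sided Mann-Whitney U test to each residue's composition across the two groups. This is a sketch on toy data, not the original analysis script.

```python
# Sketch of the per-residue composition comparison (as in Figure 3A):
# AAC per sequence, then a two-sided Mann-Whitney U test per residue.
from scipy.stats import mannwhitneyu

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(sequence):
    """Fraction of each of the 20 natural amino acids in a sequence."""
    n = len(sequence)
    return [sequence.count(aa) / n for aa in AMINO_ACIDS]

def residue_pvalues(aip_seqs, non_aip_seqs):
    pos = [aac(s) for s in aip_seqs]
    neg = [aac(s) for s in non_aip_seqs]
    pvals = {}
    for i, aa in enumerate(AMINO_ACIDS):
        vp = [v[i] for v in pos]
        vn = [v[i] for v in neg]
        if not any(vp) and not any(vn):
            pvals[aa] = 1.0  # residue absent everywhere; no testable difference
            continue
        _, p = mannwhitneyu(vp, vn, alternative="two-sided")
        pvals[aa] = p
    return pvals
```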

Figure 3.

The different residue compositions of AIPs and non-AIPs

(A) Average amino acid compositions of 20 natural amino acids for AIPs and non-AIPs. See also Table S1.

(B) Dipeptide compositions of 20 top-ranked dipeptides that have the highest absolute differences between AIPs and non-AIPs. See also Table S2.

(C) Residue positional preference of AIPs and non-AIPs. The upper portion and the lower portion of the sequence logo graph represent the conserved residues of AIPs and non-AIPs, respectively. The first ten positions represent the N-terminus of peptides, and the last ten positions represent the C-terminus of peptides. p-values were calculated by the two-sided Mann-Whitney U test. The asterisks represent the statistical p-values (∗p-value < 0.05; ∗∗ p-value < 0.01; ∗∗∗ p-value < 0.001; ∗∗∗∗ p-value < 0.0001).

These results suggested that the significant differences in the composition of amino acids and dipeptides between AIP and non-AIP sequences might be important factors for governing the activity of inducing the release of anti-inflammatory cytokines or not. Consequently, in this study, composition descriptors were mainly considered when performing feature extraction.

Conserved residues in the terminal regions of AIPs and non-AIPs

In the above composition analysis, it was observed that certain amino acids were abundant in AIPs. However, it was unclear whether the dominant amino acids were evenly distributed or preferred at a certain region in the sequence. To study the positional preference of amino acids, Two Sample Logo (TSL) (Crooks et al., 2004) analysis was conducted for 10 residues from the N-terminus and C-terminus, separately, in the sequences of AIPs and non-AIPs.

As shown in Figure 3C, the first ten positions represent the N-terminus of the peptides, and the last ten positions represent the C-terminus. Leu was most preferred at positions 5, 7, 8, 10, 13, 16, and 19 of the AIP sequences. Other significantly preferred amino acids in the terminal regions of AIPs were as follows: Arg at positions 2, 4, 5, 7, 14, and 19; Lys at positions 5, 11, and 17; Glu at position 2; and Thr at positions 9 and 15. Similarly, Asp, Pro, and Gly were dominant in the terminal regions of non-AIPs. This analysis showed that AIPs and non-AIPs have different preferences for amino acids in their terminal regions.

Model construction of AIPStack

The whole process of model construction included two subsequent steps: first to build baseline models, then to build optimal models.

Baseline models

In total, eight ML algorithms and thirteen descriptors of peptide sequences were used in model building. To assess the capability of these algorithms and descriptors to distinguish AIPs, for each balanced dataset we built 104 (13 × 8) baseline models without hyperparameter tuning on the training set and evaluated them by 10-fold CV and on the test set. The average results of the five balanced datasets are displayed in Figures 4A and 4B and Tables S3 and S4. As shown in Figure 4A and Table S3, decision tree-based models (i.e. ET, RF, light gradient boosting machine (LightGBM) (Friedman, 2001), and eXtreme gradient boosting (XGBoost) (Chen and Guestrin, 2016)) achieved better AUC (area under the receiver operating characteristic curve) values (Fawcett, 2006) than models based on the other algorithms (i.e. k-nearest neighbor (KNN) (Weinberger and Saul, 2009), LR, naive Bayes (NB) (Rish, 2001), and SVM) on the training set in most cases. The ET-based baseline model developed with the DDE descriptor attained the best performance, with AUC = 0.789 on the training set. It was followed by an RF-based baseline model, which also adopted the DDE descriptor and achieved the second-best performance with AUC = 0.784. Interestingly, LightGBM and XGBoost in conjunction with the DDE descriptor also achieved higher performance than in combination with other descriptors. In addition, baseline models combining the CKSAAP descriptor with the ET or RF algorithm also obtained higher AUC values on the training set. We observed a similar tendency on the test set (see Figure 4B and Table S4): ET- or RF-based baseline models in conjunction with the DDE or CKSAAP descriptor also performed better than the other baseline models.
The results above indicated that the DDE descriptor and CKSAAP descriptor were relatively informative for the identification of AIPs, and ET or RF-based models might be more suitable for AIP prediction. Consequently, DDE and CKSAAP were chosen for further study; ET and RF were used as base-classifiers in the stacking ensemble learning.
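As a sketch of one of the two selected descriptors, CKSAAP counts ordered amino acid pairs separated by exactly k residues for each gap k from 0 to kmax, giving 400 features per gap. The choice of kmax = 2 below is an assumption for illustration, not necessarily the setting used in this study.

```python
# Sketch of the CKSAAP descriptor: for each gap k (0..kmax), the
# normalized count of every ordered amino acid pair separated by exactly
# k residues, yielding 400 * (kmax + 1) features per sequence.
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def cksaap(sequence, kmax=2):
    features = []
    for k in range(kmax + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = max(len(sequence) - k - 1, 1)  # number of k-spaced pairs
        for i in range(len(sequence) - k - 1):
            pair = sequence[i] + sequence[i + k + 1]
            if pair in counts:  # skip non-standard residues
                counts[pair] += 1
        features.extend(counts[p] / total for p in PAIRS)
    return features
```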

Figure 4.

Performance evaluation of different ML algorithms and descriptors

(A) A heatmap showing the average AUCs of baseline models on the training set of five balanced datasets. See also Table S3.

(B) A heatmap showing the average AUCs of baseline models on the test set of five balanced datasets. See also Table S4.

(C–F) Performance comparison of individual descriptors and the hybrid features: results of ET-based models on (C) the training set and (E) the test set, and results of RF-based models on (D) the training set and (F) the test set. "DDE + CKSAAP" denotes the hybrid features. The white dots in the violin plots represent the median values. p-values were calculated by the one-sided Wilcoxon signed-rank test and are annotated; one asterisk represents p-value < 0.05, indicating that the hybrid features performed better than the individual descriptor. See also Tables S5 and S6.

Optimal models

Feature fusion was employed to check whether models built with hybrid features would achieve better performance. The two selected base-classifiers (ET and RF) and three feature sets (CKSAAP, DDE, and the hybrid features) were combined to develop six prediction models. A one-sided Wilcoxon signed-rank test was carried out to statistically compare their performance. On the training set, for both ET-based and RF-based models, the models with hybrid features significantly outperformed those with a single descriptor at a p-value threshold of 0.05 in terms of AUC (see Figures 4C and 4D and Table S5), suggesting that feature fusion yields a more accurate classification model than individual features. On the test set, for RF-based models, the hybrid features gave the best AUC of 0.804 with a lower SD of 0.022, and the models with hybrid features significantly outperformed those with the CKSAAP descriptor in terms of AUC (p-value < 0.05) (see Figures 4E and 4F and Table S6). For ET-based models, the hybrid features led to a significant improvement in AUC over the CKSAAP descriptor but were somewhat worse than the DDE descriptor. We inferred from these results that the DDE descriptor might make the main contribution of the hybrid features to model performance. Taking the results on the training set and the test set together, the hybrid features were employed in our proposed method.
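The DDE half of the hybrid features can be sketched as follows: each dipeptide's observed frequency is centred by a theoretical mean derived from codon counts and scaled by a theoretical variance (Saravanan and Gautham, 2015). The hybrid vector is then simply the concatenation of the DDE and CKSAAP vectors.

```python
# Sketch of the DDE descriptor (dipeptide deviation from expected mean):
# DDE(r,s) = (Dc - Tm) / sqrt(Tv), where Dc is the observed dipeptide
# frequency, Tm = (C_r/61)(C_s/61) the codon-based theoretical mean, and
# Tv = Tm(1 - Tm)/(N - 1) the theoretical variance.
import math
from itertools import product

# Number of codons encoding each amino acid (standard genetic code).
CODONS = {"A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
          "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
          "T": 4, "V": 4, "W": 1, "Y": 2}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def dde(sequence):
    n_pairs = len(sequence) - 1
    features = []
    for dp in DIPEPTIDES:
        dc = sum(sequence[i:i + 2] == dp for i in range(n_pairs)) / n_pairs
        tm = (CODONS[dp[0]] / 61) * (CODONS[dp[1]] / 61)  # theoretical mean
        tv = tm * (1 - tm) / n_pairs                       # theoretical variance
        features.append((dc - tm) / math.sqrt(tv))
    return features

# Hybrid features = dde(seq) + cksaap(seq)  (simple concatenation)
```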

Evaluation of the AIPStack model

From the models constructed on the five balanced datasets, we chose the one with the highest AUC on the training set as the final predictive model, called AIPStack. To investigate the effectiveness of the AIPStack model, we carried out comparative experiments on the training set and test set. All the base-classifiers and the meta-classifier were developed using the hybrid features and finely tuned. The corresponding results are provided in Figures 5A and 5B and Table S7. Using 0.05 as the p-value cutoff, AIPStack significantly outperformed the base-classifier RF in terms of AUC on the training set. Compared with the base-classifier ET, although there was no significant difference in AUC, AIPStack achieved a slightly higher average AUC (0.808 versus 0.797) and a lower SD (0.025 versus 0.027) in 10-fold CV. As for the meta-classifier, AIPStack performed significantly better than LR in all evaluation metrics on the training set (p-value of AUC < 0.001). From the results on the test set (see Figure 5B), AIPStack and its three constituent classifiers did not suffer from overfitting, and the AIPStack model was superior to its three constituent classifiers in all evaluation metrics.

Figure 5.

Performance comparison of AIPStack with its constituent classifiers and the existing methods

(A) Average performance of the AIPStack and its constituent classifiers on the training set using the 10-fold CV. ET and RF are base-classifiers, and LR is the meta-classifier. The lines in the boxes represent the median value, and the diamonds show outliers. p-values were calculated by the one-sided Wilcoxon signed-rank test and were also annotated. The asterisks represent the statistical p-value (∗ p-value < 0.05; ∗∗∗ p-value < 0.001). See also Table S7.

(B) Performance of the AIPStack and its constituent classifiers on the test set.

(C) Performance of the AntiInflam and AIPStack on the independent set 2. AntiInflam provided two models which used different feature encodings and showed different accuracy. “LA” and “MA” stood for less accurate and more accurate model, respectively.

(D) Performance of the PreTP-Stack and AIPStack on the independent set 3.

(E) Performance of the PreAIP, AIPpred, PreTP-EL, and AIPStack on the independent set 3.

(F) ROC curves and AUCs of the PreAIP, AIPpred, PreTP-EL, and AIPStack. In panels (C) and (D), AntiInflam and PreTP-Stack did not provide predicted probabilities on their web servers, hence the AUCs cannot be computed and were not shown.

Generalization capability of AIPStack

To examine the generalization capability of our model, independent set 1 was used to assess and validate the robustness of the AIPStack model. Our method achieved good performance, with AUC = 0.797, accuracy (ACC) = 0.701, sensitivity (SE) = 0.658, specificity (SP) = 0.743, precision = 0.719, and Matthews correlation coefficient (MCC) = 0.403. AIPStack thus still performed well on unseen data, indicating that our model has good transferability and is capable of distinguishing AIPs in practical prediction.
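The six metrics reported above can be computed from predicted probabilities as sketched below; the 0.5 decision threshold is an assumption for illustration.

```python
# Sketch of the evaluation metrics: AUC, ACC, SE, SP, precision, and MCC,
# derived from the confusion matrix and predicted probabilities.
from sklearn.metrics import roc_auc_score, confusion_matrix, matthews_corrcoef

def evaluate(y_true, y_prob, threshold=0.5):
    y_pred = [int(p >= threshold) for p in y_prob]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "AUC": roc_auc_score(y_true, y_prob),
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SE": tp / (tp + fn),        # sensitivity (recall on AIPs)
        "SP": tn / (tn + fp),        # specificity (recall on non-AIPs)
        "Precision": tp / (tp + fp),
        "MCC": matthews_corrcoef(y_true, y_pred),
    }
```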

Comparison with existing methods

AntiInflam was the first model for the identification of AIPs (Gupta et al., 2017). It was based on the SVM algorithm and was trained on a relatively small dataset (690 AIPs and 1,009 non-AIPs). Independent set 2 was employed for the comparison of AIPStack with AntiInflam. The predictions of AntiInflam were obtained by uploading the dataset to its web server. As presented in Figure 5C, AIPStack clearly outperformed AntiInflam. AntiInflam performed poorly in terms of SE (0.156 or 0.081) and MCC (0.081 or 0.181), whether the less accurate or the more accurate model was used, and tended to predict all samples as negative. These results indicate that AntiInflam might have poor generalization capability.

Manavalan et al. constructed a benchmark set and an independent set containing more AIPs (1,678) and non-AIPs (2,516), called Manavalan2018 here, and then proposed AIPpred. Later, PreAIP (Khatun et al., 2019) was developed and evaluated on the basis of Manavalan2018; it is the best prediction model among the existing methods that specifically predict AIPs. Most recently, Liu's group proposed two ensemble learning models for identifying all types of therapeutic peptides, PreTP-EL and PreTP-Stack, which also used Manavalan2018 for model construction and validation. Independent set 3 was employed for the comparison of AIPStack with AIPpred, PreAIP, PreTP-EL, and PreTP-Stack; likewise, predictions were obtained from their web servers. AIPStack outperformed PreTP-Stack in all five evaluation metrics (see Figure 5D). Meanwhile, as illustrated in Figures 5E and 5F, AIPStack achieved the highest AUC of 0.819, as well as the best ACC of 0.755, precision of 0.757, and MCC of 0.510. Although AIPStack achieved a lower SE than PreAIP and a lower SP than PreTP-EL, it showed a more balanced performance in terms of SE and SP.

AIPStack was not compared with the other three methods, i.e. PEPred-Suite, AIEpred (Zhang et al., 2020b), and iAIPs (Zhao et al., 2021), for the following reasons: i) the PEPred-Suite server is no longer functional; ii) AIEpred and iAIPs provide neither web servers nor standalone software, and their models are unavailable. However, it can be inferred that AIPStack would perform better than these three models in external validation, because, according to the literature, they underperformed AIPpred on the same independent set (Zhang et al., 2020b; Zhao et al., 2021), while AIPStack achieved better performance than AIPpred on all six metrics.

Model interpretation

In this section, we employed the SHAP method to analyze the contributions of sequence features to the predictions. In the current work, owing to its good interpretability and relatively high performance, the ET model based on the hybrid features was analyzed instead of the stacked model, which is complicated to interpret. We visualized the importance of each feature in each sample of the training set and ranked the features according to their SHAP values.

Figure 6A provides the mean absolute SHAP values for the top 20 features; the five most influential features for the prediction were “LS”, “LS.gap0”, “LE”, “SL”, and “LL”. Figure 6B illustrates the distribution of SHAP values for the 20 most influential features, from which we can infer the positive or negative influence of each feature. Positive SHAP values indicate prediction as an AIP with high probability; conversely, negative values indicate prediction as a non-AIP with high probability. Taking the feature “LS” in Figure 6B as an example, its feature values were relatively low for most negative samples. Hence, for an unknown sample, if its feature value of “LS” is high, the model will tend to predict it as an AIP; otherwise, the model will tend to predict it as a non-AIP.

Figure 6.

Figure 6

SHAP analysis results

(A) A standard bar plot showing the mean absolute value of SHAP values for the top 20 features.

(B) Distribution of SHAP values for the top 20 features. Feature values are indicated by different colors (red: high; blue: low). A positive SHAP value indicates a predicted AIP, while a negative SHAP value indicates a predicted non-AIP.

Discussion

The accurate identification of potential AIPs via computational methods remains one of the most challenging problems. In this study, we presented a new method, called AIPStack, which allowed us to predict whether a given peptide could induce any anti-inflammatory cytokine or not, based on the sequence features.

First, we constructed a non-redundant dataset whose size was increased by approximately 12.6% compared with the dataset used in state-of-the-art methods (e.g. AIPpred, PreAIP, and PreTP-EL). Analysis of the composition information and positional preference suggested that residues Leu and Arg were significantly abundant in the terminal regions of AIPs but not in non-AIPs, consistent with the results of previous studies (Manavalan et al., 2018; Khatun et al., 2019). There is also experimental evidence to support our findings. For example, it was reported that Leu had a major influence on the anti-inflammatory activity of peptides (Wang et al., 2010; Nan et al., 2007). As another example, a lupin protein hydrolysate (LPH) peptide (GPETAFLR) derived from plants exerts anti-inflammatory activity by promoting the expression of the anti-inflammatory cytokine IL-10 and reducing the expression of the pro-inflammatory cytokines TNF and IL-1β (Montserrat-de la Paz et al., 2019). In the sequence of this LPH peptide, residues Leu and Arg are at the C-terminus. Thus, introducing a Leu or Arg mutation into the terminal regions of peptides may improve anti-inflammatory efficacy. For the other significantly different dipeptides, further wet-lab experiments are needed to establish their effects on anti-inflammatory activity. These observations of residue composition might therefore shed light on the redesign and de novo design of AIPs.

We explored various algorithms and encoding schemes for AIP identification, whereas six of the eight existing methods used only RF to construct their models. Likewise, we found that the RF algorithm in conjunction with DDE and CKSAAP achieved good performance. The ET algorithm, however, has not been employed in any existing method; when combined with the DDE descriptor, it achieved the top performance among all baseline models in this work. Next, we demonstrated the effectiveness of feature fusion by evaluating the performance of ET-based and RF-based models. Concatenating feature vectors yields higher-dimensional vectors, and one may argue that the hybrid descriptor could contain redundant or noisy features that degrade the predictive performance of the trained model. However, because only two types of descriptors were fused, feature selection would be likely to cause overfitting; therefore, we did not perform feature selection in this work.

The final AIPStack model achieved an average AUC of 0.808 on the training set, an improvement of 1.4%–26.9% over the three constituent models. It also outperformed the constituent models on all six evaluation metrics on the test set. Overall, these observations demonstrate the effectiveness of the stacking ensemble strategy. Ideally, base-classifiers with different underlying operating principles should be used, to enrich the meta-classifier with more information about the solution space. In our study, tree-based models generally performed better, so we chose the two best tree-based models as the base-classifiers. This might explain why AIPStack showed only a slight, statistically nonsignificant improvement over the base-classifier ET.

Moreover, we constructed three independent sets to assess the generalization capability of our method and objectively compared it with eight state-of-the-art methods. AIPStack performed well on all three independent sets, demonstrating the stability and reliability of our method, and it outperformed the existing methods. First, AIPStack achieved much better performance than AntiInflam, which was built on a much smaller dataset and therefore performed worse on our independent set 2. Second, on independent set 3, AIPStack outperformed AIPpred and PreTP-Stack on all evaluation metrics and achieved a more balanced performance than PreAIP and PreTP-EL. These results indicate the capacity and utility of AIPStack.

To further investigate the relationships between sequence features and the anti-inflammatory properties of peptides, we applied the SHAP algorithm for model interpretation. It revealed several features essential for AIP optimization, such as “LS”, “LS.gap0”, “LE”, “SL”, and “LL”. “LS.gap0” belongs to the CKSAAP descriptor, and the others come from the DDE descriptor. According to the definitions of these two descriptors, “LS.gap0” represents the composition of the dipeptide Leu-Ser in a sequence, while the remaining four features reflect the deviation of the corresponding dipeptide frequencies from their expected mean values. Consistently, in the composition analysis we also found that the dipeptides Leu-Ser, Leu-Glu, Ser-Leu, and Leu-Leu differed significantly between AIPs and non-AIPs. Peptides with higher values of these five features are more likely to be AIPs. Although there is no direct evidence for the importance of the features identified in this work, some studies suggest that the dipeptides mentioned here are important for AIPs. For example, Lin et al. found that the tripeptide LSW showed anti-inflammatory activity on vascular smooth muscle cells (Qinlu et al., 2017), from which the important role of the dipeptide Leu-Ser in anti-inflammatory activity can be inferred.

Taken together, AIPStack is a promising method for distinguishing AIPs; it should be helpful for large-scale AIP screening and facilitating hypothesis-driven experimental design.

Limitations of the study

One limitation of the current study is that we did not provide a user-friendly web interface; instead, we shared our final model on GitHub, where researchers can conveniently download and use it.

Another limitation is that all existing predictors, including AIPStack, are ML-based approaches, which typically need sufficient data for training. However, the number of currently available AIPs cannot yet meet this need, which has limited the performance of existing predictors. In addition, ML-based predictors need to extract sequence information with third-party software in advance. Although we explored several sequence representation schemes in the current study, the existing schemes might not be informative enough. To address this issue, developing new and informative sequence representation schemes, or even AIP-specific feature representation schemes, will be helpful; alternatively, more elaborate deep learning algorithms that extract features automatically through network layers could avoid the problem.

The third limitation is that we did not assess AIPStack’s capability of distinguishing AIPs from other types of peptides, since the dataset did not include peptides with other functions as negative samples. As suggested by Manavalan et al., a two-step framework can be employed in future work to overcome this limitation (Manavalan et al., 2021). In the first step, a predictor is developed on a dataset containing experimentally validated AIPs as positive samples and other therapeutic peptides and random peptides as negative samples. In the second step, a predictor is constructed on a dataset containing the same AIPs as positive samples and experimentally validated non-AIPs as negative samples. Such a framework would enhance model robustness and practical applicability.

STAR★Methods

Key resources table

Resource availability

Lead contact

Further information and requests for resources should be directed to and will be fulfilled by the lead contact, Dr. Yun Tang (ytang234@ecust.edu.cn).

Materials availability

This study did not generate new unique reagents.

Method details

Dataset preparation

In this work, we constructed a new dataset for AIP prediction. First, we collected linear peptides from the IEDB, which captures large amounts of data from the literature, making it a reliable and popular database. The following criteria were used to separate positive samples from negative ones: if a linear peptide could induce any of the anti-inflammatory cytokines interleukin (IL)-4, IL-10, IL-13, IL-22, IL-27, IL-35, IL-37, transforming growth factor β (TGF-β), or interferon (IFN)-α/β, it was considered an AIP; if it could not induce any of these cytokines, it was regarded as a non-AIP (Marie et al., 1996; Paul, 2015; Yoshida and Hunter, 2015; Collison et al., 2007; Banchereau et al., 2012). Subsequently, to reduce homology bias and prevent overfitting, redundant sequences were excluded by applying CD-HIT (Fu et al., 2012) with a threshold of 0.8 to the whole dataset.

We employed the random undersampling technique to obtain balanced datasets by randomly selecting samples from the majority class (non-AIPs); the undersampling was repeated five times. For each balanced dataset, 80% of the data was randomly selected as the training set for model training and hyperparameter optimization, 10% as the test set for internal validation, and the remaining 10% as independent set 1 for external validation.
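As a sketch, the undersampling and 80/10/10 split described above can be reproduced in plain Python (the function name, toy data, and seed are ours for illustration; in the paper the undersampling was repeated five times):

```python
import random

def undersample_and_split(positives, negatives, seed=0):
    """Randomly undersample the majority class (non-AIPs) to match the
    minority class, then split the balanced data 80/10/10 into the
    training set, test set, and independent set 1."""
    rng = random.Random(seed)
    # Undersample: draw as many negatives as there are positives.
    sampled_neg = rng.sample(negatives, len(positives))
    data = [(seq, 1) for seq in positives] + [(seq, 0) for seq in sampled_neg]
    rng.shuffle(data)
    n_train = int(0.8 * len(data))
    n_test = int(0.1 * len(data))
    train = data[:n_train]
    test = data[n_train:n_train + n_test]
    independent = data[n_train + n_test:]
    return train, test, independent

# Toy example: 20 positive and 50 negative placeholder sequences.
pos = [f"AIP_{i}" for i in range(20)]
neg = [f"NEG_{i}" for i in range(50)]
train, test, independent = undersample_and_split(pos, neg)
```

Repeating the call with five different seeds yields the five balanced datasets used in the paper.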

Furthermore, two additional independent sets, named independent set 2 and independent set 3, were constructed. Independent set 2 was the difference set between independent set 1 and AntiInflam’s benchmark dataset; similarly, independent set 3 was the difference set between independent set 1 and AIPpred’s benchmark dataset. Notably, the independent set 1 used here was derived from the balanced dataset on which the model achieved the highest AUC among the five balanced datasets.

Position conservation analysis

Ten residues were extracted from the N-terminus and the C-terminus of each peptide sequence, respectively. The two terminal regions were joined to create a new sequence of 20 residues. The following example shows the process of creating a new sequence from a peptide of length 22 residues.

  • Original peptide sequence (N → C): DIELLKKILAGGFIQKYDSVMQ

  • Ten residues at the N-terminus (N → C): DIELLKKILA

  • Ten residues at the C-terminus (C → N): QMVSDYKQIF

  • Created sequence (N-terminus (N → C) + C-terminus (C → N)): DIELLKKILAQMVSDYKQIF.

The created sequences of 20 residues were used as inputs to the TSL software for generating logo representations. The first ten positions represent the N-terminus of the peptides, and the last ten positions represent the C-terminus. In the logo graph, the height of each letter is proportional to the frequency of the corresponding residue at that position. To test the statistical significance of residue differences between AIPs and non-AIPs, the height of the logo was scaled according to a significance threshold of p-value < 0.05 (Welch t-test). The logo representation allowed us to visualize the differences between the terminal residues of AIPs and non-AIPs.
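The joining procedure can be reproduced in a few lines of Python (the function name is ours; the example peptide is the one from the text):

```python
def terminal_sequence(seq, n=10):
    """Join the first n residues (N -> C) with the last n residues read
    in reverse (C -> N) to form a 2n-residue sequence for logo analysis."""
    return seq[:n] + seq[-n:][::-1]

peptide = "DIELLKKILAGGFIQKYDSVMQ"  # the 22-residue example from the text
joined = terminal_sequence(peptide)  # "DIELLKKILAQMVSDYKQIF"
```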

Selection of feature encoding schemes

In this study, to select the optimal encoding schemes, we first employed 13 encoding schemes and combined them with several ML algorithms (see the subsection Selection of ML algorithms below) to construct baseline models. These baseline models were trained on the training set and evaluated by 10-fold CV and on the test set. The encoding schemes used by the top-performing models were selected. The 13 encoding schemes can be grouped into three major types: simple composition descriptors (AAC, ATC (Pande et al., 2019), BTC (Pande et al., 2019), DDE, CKSAAP, and TPC (Bhasin and Raghava, 2004)), physicochemical descriptors (PCP (Pande et al., 2019), AAINDEX, and C/T/D (Cai et al., 2003)), and Shannon entropy (SER and SPC (Pande et al., 2019)). Details about these descriptors are shown in Table S8. The descriptors were calculated with the iFeature toolkit and the Pfeature toolkit. A brief introduction to the descriptors used in the final model is provided below.

DDE. The DDE descriptor was first proposed by Saravanan et al. in 2015 for linear B-cell epitope prediction (Saravanan and Gautham, 2015). A 400-dimensional feature vector is generated by calculating three parameters: the dipeptide composition measure (Dc), the theoretical mean (Tm), and the theoretical variance (Tv). Dc is defined as:

Dc = Nij / (L - 1)

where Nij is the occurrence time of amino acid pair ij in a given peptide or protein sequence, and L is the sequence length.

Tm is defined as:

Tm = (Ci / CN) × (Cj / CN)

where Ci and Cj stand for the numbers of codons encoding amino acid i and amino acid j of the dipeptide, while CN stands for the number of all possible codons excluding the termination codons, i.e., CN = 61.

Tv is defined as

Tv = Tm × (1 - Tm) / (L - 1)

So, DDE can be calculated as below:

DDE = (Dc - Tm) / √Tv
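Putting the four formulas together, a minimal Python sketch of the DDE descriptor might look as follows (the per-amino-acid codon counts follow the standard genetic code, with 61 sense codons in total; the function name is ours):

```python
from itertools import product
from math import sqrt

# Number of codons encoding each amino acid (61 sense codons in total).
CODONS = {"A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2,
          "I": 3, "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2,
          "R": 6, "S": 6, "T": 4, "V": 4, "W": 1, "Y": 2}
AA = sorted(CODONS)           # the 20 amino acids, alphabetical
CN = sum(CODONS.values())     # 61

def dde(seq):
    """Return the 400-dimensional DDE vector: (Dc - Tm) / sqrt(Tv)."""
    L = len(seq)
    counts = {}
    for a in range(L - 1):                       # count adjacent dipeptides
        pair = seq[a:a + 2]
        counts[pair] = counts.get(pair, 0) + 1
    vec = []
    for i, j in product(AA, repeat=2):
        dc = counts.get(i + j, 0) / (L - 1)      # dipeptide composition
        tm = (CODONS[i] / CN) * (CODONS[j] / CN)  # theoretical mean
        tv = tm * (1 - tm) / (L - 1)             # theoretical variance
        vec.append((dc - tm) / sqrt(tv))
    return vec
```

For a sequence consisting only of Leu, the single observed dipeptide "LL" receives the largest (positive) DDE value, while all absent dipeptides receive negative values.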

CKSAAP. The CKSAAP encoding strategy is widely employed in bioinformatics research and has also been successfully applied to AIP prediction (Khatun et al., 2019). CKSAAP calculates the occurrence frequencies of k-spaced amino acid pairs (k = 0, 1, 2, 3, 4, 5) in a protein or peptide sequence, reflecting the short-range interactions of amino acids within a sequence or sequence fragment. Taking k = 0 as an example, there are 400 0-spaced amino acid pairs, such as AA, AC, and AD. The composition of 0-spaced amino acid pairs is defined as:

(NAA/Ntotal, NAC/Ntotal, NAD/Ntotal, …, NYY/Ntotal)400

where NYY denotes the occurrence number of the amino acid pair YY, and Ntotal is the number of k-spaced pairs in a sequence of length L, namely L - k - 1 (so that the frequencies for each k sum to 1, consistent with the DPC descriptor at k = 0).

A sequence represented by the CKSAAP encoding is a 400 × (k + 1)-dimensional feature vector. When k = 0, the CKSAAP descriptor is the same as the DPC descriptor. Here, we set k = 5. As a result, this forms a 2400-dimensional feature vector for each peptide sequence.
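A minimal Python sketch of the CKSAAP descriptor, assuming Ntotal = L - k - 1 k-spaced pairs per sequence (consistent with the DPC equivalence at k = 0; the function name is ours):

```python
from itertools import product

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = [a + b for a, b in product(ALPHABET, repeat=2)]  # 400 ordered pairs

def cksaap(seq, kmax=5):
    """Return the 400 * (kmax + 1)-dimensional CKSAAP vector: occurrence
    frequencies of k-spaced amino acid pairs for k = 0..kmax."""
    vec = []
    for k in range(kmax + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_total = len(seq) - k - 1           # number of k-spaced pairs
        for i in range(n_total):
            counts[seq[i] + seq[i + k + 1]] += 1
        vec.extend(counts[p] / n_total for p in PAIRS)
    return vec
```

With kmax = 5 this yields the 2,400-dimensional vector described above; each 400-entry block (one per k) sums to 1.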

Feature fusion

Feature fusion was applied here to check whether it improved performance. Briefly, the DDE descriptor (400-dimensional) and the CKSAAP descriptor (2,400-dimensional) were concatenated, so that each peptide sequence was converted into a 2,800-dimensional feature vector. The hybrid feature vectors were then scaled to the range [0, 1] with the MinMaxScaler normalization technique from the scikit-learn v1.0.2 package.
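A dependency-free sketch of the fusion and scaling steps (the column-wise loop mirrors the behavior of scikit-learn's MinMaxScaler fit on the same data; the function name is ours):

```python
def fuse_and_scale(dde_vecs, cksaap_vecs):
    """Concatenate per-peptide DDE (400-d) and CKSAAP (2,400-d) vectors
    into hybrid 2,800-d vectors, then min-max scale each column to [0, 1]."""
    hybrid = [d + c for d, c in zip(dde_vecs, cksaap_vecs)]
    scaled = [row[:] for row in hybrid]
    for j in range(len(hybrid[0])):
        col = [row[j] for row in hybrid]
        lo, hi = min(col), max(col)
        span = hi - lo
        for row in scaled:
            # Constant columns are mapped to 0, as in MinMaxScaler.
            row[j] = (row[j] - lo) / span if span else 0.0
    return scaled
```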

Selection of ML algorithms

As mentioned in the subsection Selection of feature encoding schemes, we constructed several baseline models by combining 8 ML algorithms with 13 sequence descriptors to select the optimal ML algorithms simultaneously. ML algorithms used by the models with top performance were selected as the base-classifiers for the AIPStack. The employed ML algorithms included ET, KNN, LightGBM, LR, NB, RF, SVM, and XGBoost. The XGBoost Python package (v1.4.2) and LightGBM Python package (v3.2.1) were used to build the XGBoost model and LightGBM model, respectively. Other classifiers were implemented by scikit-learn v0.24.2 in Python v3.7.10. During the process of algorithm selection, default parameters were used for all models. But before constructing the final model, two selected ML algorithms were fine-tuned using the grid search technique. The two selected algorithms are introduced below, and the description of the other algorithms is presented in Table S9.

RF

RF is a powerful ensemble algorithm that has been successfully applied to various classification and regression tasks owing to its simplicity, efficiency, and accuracy. An RF model combines multiple classification and regression trees (CART) into a “forest” and improves on individual CART classifiers by aggregating many weak trees. In classification tasks, the final result is obtained by majority voting over the independent trees; in regression tasks, it is the average of the tree outputs. In the RF model construction here, four hyperparameters were optimized: the number of trees (n_estimators), the number of features to consider when looking for the best split (max_features), the class weights (class_weight), and the split criterion (criterion). These hyperparameters were tuned by grid search (scikit-learn v0.24.2) within the following ranges: n_estimators from 500 to 2,000 with a step size of 50; “auto” or “log2” for max_features; “balanced” or None for class_weight; and “gini” or “entropy” for criterion.
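A minimal scikit-learn sketch of this grid search, with the grid, data, and CV folds shrunk for illustration (the toy data are ours; note that recent scikit-learn versions have removed the "auto" option for max_features, so "sqrt" stands in for it here):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy stand-in for the 2,800-d peptide feature matrix.
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

param_grid = {
    "n_estimators": [25, 50],            # paper: 500-2,000, step 50
    "max_features": ["sqrt", "log2"],    # paper: "auto" or "log2"
    "class_weight": ["balanced", None],
    "criterion": ["gini", "entropy"],
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, scoring="roc_auc", cv=5)  # paper: 10-fold
search.fit(X, y)
best_params, best_auc = search.best_params_, search.best_score_
```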

ET

ET was proposed in 2006. It is also an ensemble method and can be applied to classification and regression tasks. ET is very similar to RF, but with two main differences: ET fits each decision tree on the whole training dataset, whereas RF grows each tree on a bootstrap sample; and ET selects split points at random, whereas RF picks the optimal split point according to Gini impurity, information gain, or mean squared error. The hyperparameter optimization procedure for ET was the same as that for RF.

Framework of AIPStack

Many previous studies have shown that an ensemble model can achieve better predictive performance than the single models it comprises and can reduce the generalization error of the prediction (Charoenkwan et al., 2021; Mishra et al., 2019; Basith et al., 2022; Liang et al., 2021; Jiang et al., 2021; Guo et al., 2021). Existing ensemble learning strategies include boosting, bagging, and stacking (Verma and Mehta, 2017). The stacking strategy integrates information from a range of base models to generate a new model, reflecting the idea of seeking the wisdom of crowds. Stacking ensemble models have been successfully applied to many biological tasks, such as the classification of therapeutic peptides (Charoenkwan et al., 2021), the prediction of DNA-binding proteins (Mishra et al., 2019), and the recognition of non-coding RNA (Mishra et al., 2019).

In this study, a two-layer stacking model named AIPStack was proposed. First, the base-classifiers in the first layer were trained on the input dataset; their output probabilities were then passed to the second-layer classifier, namely the meta-classifier. To avoid overfitting, our stacking model was implemented with the stacking cross-validation algorithm provided in the mlxtend (v0.19.0) package. As shown in Figure 1, 10-fold CV was used to split the training set into ten subsets of equal size. In ten successive rounds, each subset served as the validation set in turn, with the remaining nine subsets used for training. In each round, each base-classifier was fitted and produced predictions for the validation set and for the test set, yielding ten sets of validation predictions and ten sets of test predictions. The validation predictions were merged into a feature matrix forming the new training set, and the averaged test predictions were merged into a feature matrix forming the new test set. These new feature matrices, together with the labels, served as the final training and test sets for the meta-classifier, for which the LR algorithm was employed.
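The two-layer scheme can be approximated with scikit-learn's StackingClassifier; the paper used the stacking cross-validation implementation in mlxtend, but the sklearn class below likewise fits the base-classifiers with internal cross-validation and feeds their out-of-fold probabilities to the meta-classifier (the toy data and reduced settings are ours):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the hybrid feature matrix and AIP/non-AIP labels.
X, y = make_classification(n_samples=200, n_features=50, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
                ("et", ExtraTreesClassifier(n_estimators=50, random_state=0))],
    final_estimator=LogisticRegression(),  # LR meta-classifier
    stack_method="predict_proba",          # pass class probabilities upward
    cv=5)                                  # paper: 10-fold CV
stack.fit(X, y)
probabilities = stack.predict_proba(X[:5])
```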

Model evaluation

To evaluate the performance of the classifiers and compare our model with other existing models, several widely used evaluation metrics for binary classification were employed, i.e. ACC, SE (also called recall), SP, Precision, MCC, and AUC. They are defined as follows.

ACC = (TP + TN) / (TP + FN + FP + TN)
SE = TP / (TP + FN)
SP = TN / (TN + FP)
Precision = TP / (TP + FP)
MCC = (TP × TN - FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

where TP, TN, FP, and FN represent the numbers of true positives, true negatives, false positives, and false negatives, respectively.

The six metrics evaluate the performance of the classifier from different perspectives. Precision and SE focus on the classifier’s ability to predict positive samples, while SP focuses on its ability to predict negative samples. MCC measures the correlation of the actual classes with the predicted labels; it ranges from −1 to +1, where +1 represents a perfect prediction, 0 means no better than random prediction, and −1 indicates a completely wrong prediction. The AUC is a threshold-independent metric that summarizes the overall performance of the classifier. For imbalanced datasets, MCC and AUC usually evaluate model performance better than the other metrics (Boughorbel et al., 2017).
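The five metrics above (all except AUC) can be computed directly from the confusion counts (the function name is ours):

```python
from math import sqrt

def binary_metrics(tp, tn, fp, fn):
    """Compute ACC, SE, SP, Precision, and MCC from confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)            # sensitivity / recall
    sp = tn / (tn + fp)            # specificity
    precision = tp / (tp + fp)
    mcc = ((tp * tn - fp * fn) /
           sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"ACC": acc, "SE": se, "SP": sp,
            "Precision": precision, "MCC": mcc}
```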

It is worth mentioning that hyperparameter tuning for the two base-classifiers was guided by the AUC values calculated through 10-fold CV on the training set; the parameter combination achieving the highest AUC was considered optimal.

Model interpretation

In the present study, we used the well-known SHAP framework to help understand the relationship between each feature and the prediction of positive or negative samples. The SHAP method applies cooperative game theory to calculate the marginal contribution of each feature in a sample, reflecting how, and to what extent, a feature affects the final prediction. Here, we calculated the SHAP value for each feature in each sample of the training set via the Python package shap v0.40.0. The features were then ranked according to their importance scores, namely the SHAP values; features with larger absolute SHAP values make greater overall contributions (either negative or positive).

Quantification and statistical analysis

All computations were performed in the Python programming language. We used the two-sided Mann-Whitney U test to evaluate the statistical significance of differences in residue composition between AIPs and non-AIPs. The one-sided Wilcoxon signed-rank test was carried out to statistically compare the performance of the hybrid features with that of the individual features, and the performance of AIPStack with that of its constituent models. Details pertaining to significance are also noted in the respective figure legends or table footnotes. The graphical abstract and Figure 1 were generated with Microsoft PowerPoint; the other plots in this study were generated with Python packages.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (Grant 2019YFA0904800), the National Natural Science Foundation of China (Grants 81872800, 82173746, and 82104066), and Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism (Shanghai Municipal Education Commission, Grant 2021 Sci & Tech 03-28).

Author contributions

Conceptualization, H.D. and Y.T.; Formal Analysis, H.D.; Investigation, H.D.; Visualization, H.D. and C.L.; Writing - Original Draft, H.D.; Writing - Review & Editing, Y.T., G.L., W.L., and Z.W.; Supervision, Y.T.; Funding Acquisition, Y.T.

Declaration of interests

The authors declare no competing interests.

Published: September 16, 2022

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2022.104967.

Supplemental information

Document S1. Figures S1 and S2 and Tables S1, S2, and S5–S9
mmc1.pdf (587.8KB, pdf)
Table S3. The performance of baseline models based on different ML algorithms and descriptors on the training set using the 10-fold CV, related to Figure 4

It shows the average performance with a SD of models for five balanced datasets.

mmc2.xlsx (18.5KB, xlsx)
Table S4. The performance of baseline models based on different ML algorithms and descriptors on the test set, related to Figure 4

It shows the average performance with a SD of models for five balanced datasets.

mmc3.xlsx (18KB, xlsx)

Data and code availability

The datasets of AIPStack are available on GitHub: https://github.com/Nicole-DH/AIPStack, where we also share the main code of AIPStack. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

References

  1. Banchereau J., Pascual V., O'garra A. From IL-2 to IL-37: the expanding spectrum of anti-inflammatory cytokines. Nat. Immunol. 2012;13:925–931. doi: 10.1038/ni.2406. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Barcelos I.P.d., Troxell R.M., Graves J.S. Mitochondrial dysfunction and multiple sclerosis. Biology. 2019;8:37–53. doi: 10.3390/biology8020037. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Basith S., Lee G., Manavalan B. STALLION: a stacking-based ensemble learning framework for prokaryotic lysine acetylation site prediction. Brief. Bioinform. 2022;23:bbab376. doi: 10.1093/bib/bbab376. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bhasin M., Raghava G.P.S. Classification of nuclear receptors based on amino acid composition and dipeptide composition. J. Biol. Chem. 2004;279:23262–23266. doi: 10.1074/jbc.M401932200. [DOI] [PubMed] [Google Scholar]
  5. Bindu S., Mazumder S., Bandyopadhyay U. Non-steroidal anti-inflammatory drugs (NSAIDs) and organ damage: a current perspective. Biochem. Pharmacol. 2020;180:114147–114167. doi: 10.1016/j.bcp.2020.114147. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Boughorbel S., Jarray F., El-Anbari M. Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. PLoS One. 2017;12:e0177678. doi: 10.1371/journal.pone.0177678. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Breiman L. Random forests. Mach. Learn. 2001;45:5–32. [Google Scholar]
  8. Cai C.Z., Han L.Y., Ji Z.L., Chen X., Chen Y.Z. SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Res. 2003;31:3692–3697. doi: 10.1093/nar/gkg600. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Cai Y., Huang T., Hu L., Shi X., Xie L., Li Y. Prediction of lysine ubiquitination with mRMR feature selection and analysis. Amino Acids. 2012;42:1387–1395. doi: 10.1007/s00726-011-0835-0. [DOI] [PubMed] [Google Scholar]
  10. Chan A.C., Carter P.J. Therapeutic antibodies for autoimmunity and inflammation. Nat. Rev. Immunol. 2010;10:301–316. doi: 10.1038/nri2761. [DOI] [PubMed] [Google Scholar]
  11. Charoenkwan P., Chiangjong W., Nantasenamat C., Hasan M.M., Manavalan B., Shoombuatong W. StackIL6: a stacking ensemble model for improving the prediction of IL-6 inducing peptides. Brief. Bioinform. 2021;22:bbab172. doi: 10.1093/bib/bbab172. [DOI] [PubMed] [Google Scholar]
  12. Chen T., Guestrin C. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery; 2016. XGBoost: a scalable tree boosting system. [Google Scholar]
  13. Chen X., Qiu J.-D., Shi S.-P., Suo S.-B., Huang S.-Y., Liang R.-P. Incorporating key position and amino acid residue features to identify general and species-specific Ubiquitin conjugation sites. Bioinformatics. 2013;29:1614–1622. doi: 10.1093/bioinformatics/btt196. [DOI] [PubMed] [Google Scholar]
  14. Chen Z., Zhao P., Li F., Leier A., Marquez-Lago T.T., Wang Y., Webb G.I., Smith A.I., Daly R.J., Chou K.-C., Song J. iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics. 2018;34:2499–2502. doi: 10.1093/bioinformatics/bty140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Collins P.E., Grassia G., Colleran A., Kiely P.A., Ialenti A., Maffia P., Carmody R.J. Mapping the interaction of B cell leukemia 3 (BCL-3) and nuclear factor κB (NF-κB) p50 identifies a BCL-3-mimetic anti-inflammatory peptide. J. Biol. Chem. 2015;290:15687–15696. doi: 10.1074/jbc.M115.643700. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Collison L.W., Workman C.J., Kuo T.T., Boyd K., Wang Y., Vignali K.M., Cross R., Sehy D., Blumberg R.S., Vignali D.A.A. The inhibitory cytokine IL-35 contributes to regulatory T-cell function. Nature. 2007;450:566–569. doi: 10.1038/nature06306. [DOI] [PubMed] [Google Scholar]
  17. Crooks G.E., Hon G., Chandonia J.-M., Brenner S.E. WebLogo: a sequence logo generator. Genome Res. 2004;14:1188–1190. doi: 10.1101/gr.849004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Deepak P., Axelrad J.E., Ananthakrishnan A.N. The role of the radiologist in determining disease severity in inflammatory bowel diseases. Gastrointest. Endosc. Clin. N. Am. 2019;29:447–470. doi: 10.1016/j.giec.2019.02.006. [DOI] [PubMed] [Google Scholar]
  19. Dendoncker K., Libert C. Glucocorticoid resistance as a major drive in sepsis pathology. Cytokine Growth Factor Rev. 2017;35:85–96. doi: 10.1016/j.cytogfr.2017.04.002. [DOI] [PubMed] [Google Scholar]
  20. Fawcett T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006;27:861–874. [Google Scholar]
  21. Friedman J.H. Greedy function approximation: a gradient boosting machine. Ann. Statist. 2001;29:1189–1232. [Google Scholar]
  22. Fu L., Niu B., Zhu Z., Wu S., Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–3152. doi: 10.1093/bioinformatics/bts565. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Geurts P., Ernst D., Wehenkel L. Extremely randomized trees. Mach. Learn. 2006;63:3–42. [Google Scholar]
  24. Grigoroiu A., Yoon J., Bohndiek S.E. Deep learning applied to hyperspectral endoscopy for online spectral classification. Sci. Rep. 2020;10:3947. doi: 10.1038/s41598-020-60574-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Guo Y., Yan K., Lv H., Liu B. PreTP-EL: prediction of therapeutic peptides based on ensemble learning. Brief. Bioinform. 2021;22:bbab358. doi: 10.1093/bib/bbab358. [DOI] [PubMed] [Google Scholar]
  26. Gupta S., Sharma A.K., Shastri V., Madhu M.K., Sharma V.K. Prediction of anti-inflammatory proteins/peptides: an in silico approach. J. Transl. Med. 2017;15:7–11. doi: 10.1186/s12967-016-1103-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Harirforoosh S., Asghar W., Jamali F. Adverse effects of nonsteroidal antiinflammatory drugs: an update of gastrointestinal, cardiovascular and renal complications. J. Pharm. Pharm. Sci. 2013;16:821–847. doi: 10.18433/j3vw2f. [DOI] [PubMed] [Google Scholar]
  28. Heinbockel L., Weindl G., Correa W., Brandenburg J., Reiling N., Wiesmüller K.H., Schürholz T., Gutsmann T., Martinez de Tejada G., Mauss K., Brandenburg K. Anti-infective and anti-inflammatory mode of action of peptide 19-2.5. Int. J. Mol. Sci. 2021;22:1465. doi: 10.3390/ijms22031465. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Jiang M., Zhao B., Luo S., Wang Q., Chu Y., Chen T., Mao X., Liu Y., Wang Y., Jiang X., et al. NeuroPpred-Fuse: an interpretable stacking model for prediction of neuropeptides by fusing sequence information and feature selection methods. Brief. Bioinform. 2021;22:bbab310. doi: 10.1093/bib/bbab310. [DOI] [PubMed] [Google Scholar]
  30. Jiang W., Wang H., Li Y.S., Luo W. Role of vasoactive intestinal peptide in osteoarthritis. J. Biomed. Sci. 2016;23:63. doi: 10.1186/s12929-016-0280-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Kawashima S., Pokarowski P., Pokarowska M., Kolinski A., Katayama T., Kanehisa M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Res. 2008;36:D202–D205. doi: 10.1093/nar/gkm998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., Liu T. LightGBM: a highly efficient gradient boosting decision tree. Curran Associates, Inc; 2017. [Google Scholar]
  33. Khatun M.S., Hasan M.M., Kurata H. PreAIP: computational prediction of anti-inflammatory peptides by integrating multiple complementary features. Front. Genet. 2019;10:129–139. doi: 10.3389/fgene.2019.00129. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. LaValley M.P. Logistic regression. Circulation. 2008;117:2395–2399. doi: 10.1161/CIRCULATIONAHA.106.682658. [DOI] [PubMed] [Google Scholar]
  35. Lee W.R., Kim K.H., An H.J., Kim J.Y., Chang Y.C., Chung H., Park Y.Y., Lee M.L., Park K.K. The protective effects of melittin on Propionibacterium acnes-induced inflammatory responses in vitro and in vivo. J. Invest. Dermatol. 2014;134:1922–1930. doi: 10.1038/jid.2014.75. [DOI] [PubMed] [Google Scholar]
  36. Liang X., Li F., Chen J., Li J., Wu H., Li S., Song J., Liu Q. Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification. Brief. Bioinform. 2021;22:bbaa312. doi: 10.1093/bib/bbaa312. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Lundberg S.M., Lee S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017;30:4765–4774. [Google Scholar]
  38. Manavalan B., Basith S., Lee G. Comparative analysis of machine learning-based approaches for identifying therapeutic peptides targeting SARS-CoV-2. Brief. Bioinform. 2022;23:bbab412. doi: 10.1093/bib/bbab412. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Manavalan B., Shin T.H., Kim M.O., Lee G. AIPpred: sequence-based prediction of anti-inflammatory peptides using random forest. Front. Pharmacol. 2018;9:276–287. doi: 10.3389/fphar.2018.00276. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Marie C., Pitton C., Fitting C., Cavaillon J.M. Regulation by anti-inflammatory cytokines (IL-4, IL-10, IL-13, TGFβ) of interleukin-8 production by LPS- and/or TNFα-activated human polymorphonuclear cells. Mediators Inflamm. 1996;5:334–340. doi: 10.1155/S0962935196000488. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Medzhitov R. Inflammation 2010: new adventures of an old flame. Cell. 2010;140:771–776. doi: 10.1016/j.cell.2010.03.006. [DOI] [PubMed] [Google Scholar]
  42. Mishra A., Pokhrel P., Hoque M.T. StackDPPred: a stacking based prediction of DNA-binding protein from sequence. Bioinformatics. 2019;35:433–441. doi: 10.1093/bioinformatics/bty653. [DOI] [PubMed] [Google Scholar]
  43. Montserrat-de la Paz S., Lemus-Conejo A., Toscano R., Pedroche J., Millan F., Millan-Linares M.C. GPETAFLR, an octapeptide isolated from Lupinus angustifolius L. protein hydrolysate, promotes the skewing to the M2 phenotype in human primary monocytes. Food Funct. 2019;10:3303–3311. doi: 10.1039/c9fo00115h. [DOI] [PubMed] [Google Scholar]
  44. Muttenthaler M., King G.F., Adams D.J., Alewood P.F. Trends in peptide drug discovery. Nat. Rev. Drug Discov. 2021;20:309–325. doi: 10.1038/s41573-020-00135-8. [DOI] [PubMed] [Google Scholar]
  45. Nan Y.H., Park K.H., Jeon Y.J., Park Y., Park I.S., Hahm K.S., Shin S.Y. Antimicrobial and anti-inflammatory activities of a Leu/Lys-rich antimicrobial peptide with Phe-peptoid residues. Protein Pept. Lett. 2007;14:1003–1007. doi: 10.2174/092986607782541042. [DOI] [PubMed] [Google Scholar]
  46. Pande A., Patiyal S., Lathwal A., Arora C., Kaur D., Dhall A., Mishra G., Kaur H., Sharma N., Jain S., et al. Computing wide range of protein/peptide features from their sequence and structure. Preprint at bioRxiv. 2019 doi: 10.1101/599126. [DOI] [Google Scholar]
  47. Paul W.E. History of interleukin-4. Cytokine. 2015;75:3–7. doi: 10.1016/j.cyto.2015.01.038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830. [Google Scholar]
  49. Lin Q., Liao W., Bai J., Wu W., Wu J. Soy protein-derived ACE-inhibitory peptide LSW (Leu-Ser-Trp) shows anti-inflammatory activity on vascular smooth muscle cells. J. Funct.Foods. 2017;34:248–253. [Google Scholar]
  50. Raschka S. MLxtend: providing machine learning and data science utilities and extensions to Python’s scientific computing stack. J. Open Source Softw. 2018;3:638. [Google Scholar]
  51. Rish I. An empirical study of the naive Bayes classifier. In: IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence; Seattle; August 2001. pp. 41–46. [Google Scholar]
  52. Saravanan V., Gautham N. Harnessing computational biology for exact linear B-cell epitope prediction: a novel amino acid composition-based feature descriptor. OMICS. 2015;19:648–658. doi: 10.1089/omi.2015.0095. [DOI] [PubMed] [Google Scholar]
  53. Schäcke H., Döcke W.D., Asadullah K. Mechanisms involved in the side effects of glucocorticoids. Pharmacol. Ther. 2002;96:23–43. doi: 10.1016/s0163-7258(02)00297-8. [DOI] [PubMed] [Google Scholar]
  54. Sun G.Y., Yang H.H., Guan X.X., Zhong W.J., Liu Y.P., Du M.Y., Luo X.Q., Zhou Y., Guan C.X. Vasoactive intestinal peptide overexpression mediated by lentivirus attenuates lipopolysaccharide-induced acute lung injury in mice by inhibiting inflammation. Mol. Immunol. 2018;97:8–15. doi: 10.1016/j.molimm.2018.03.002. [DOI] [PubMed] [Google Scholar]
  55. Tabas I., Glass C.K. Anti-inflammatory therapy in chronic disease: challenges and opportunities. Science. 2013;339:166–172. doi: 10.1126/science.1230720. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Tsai D.-H., Riediker M., Berchet A., Paccaud F., Waeber G., Vollenweider P., Bochud M. Effects of short- and long-term exposures to particulate matter on inflammatory marker levels in the general population. Environ. Sci. Pollut. Res. Int. 2019;26:19697–19704. doi: 10.1007/s11356-019-05194-y. [DOI] [PubMed] [Google Scholar]
  57. Usmani S.S., Bedi G., Samuel J.S., Singh S., Kalra S., Kumar P., Ahuja A.A., Sharma M., Gautam A., Raghava G.P.S. THPdb: database of FDA-approved peptide and protein therapeutics. PLoS One. 2017;12:e0181748. doi: 10.1371/journal.pone.0181748. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Vandewalle J., Luypaert A., De Bosscher K., Libert C. Therapeutic mechanisms of glucocorticoids. Trends Endocrinol. Metab. 2018;29:42–54. doi: 10.1016/j.tem.2017.10.010. [DOI] [PubMed] [Google Scholar]
  59. Verma A., Mehta S. A comparative study of ensemble learning methods for classification in bioinformatics. In: 2017 7th International Conference on Cloud Computing, Data Science & Engineering - Confluence. IEEE; 2017. pp. 155–158. [Google Scholar]
  60. Vita R., Mahajan S., Overton J.A., Dhanda S.K., Martini S., Cantrell J.R., Wheeler D.K., Sette A., Peters B. The immune epitope database (IEDB): 2018 update. Nucleic Acids Res. 2019;47:D339–D343. doi: 10.1093/nar/gky1006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Wang P., Nan Y.H., Yang S.T., Kang S.W., Kim Y., Park I.S., Hahm K.S., Shin S.Y. Cell selectivity and anti-inflammatory activity of a Leu/Lys-rich alpha-helical model antimicrobial peptide and its diastereomeric peptides. Peptides. 2010;31:1251–1261. doi: 10.1016/j.peptides.2010.03.032. [DOI] [PubMed] [Google Scholar]
  62. Wei L., Zhou C., Su R., Zou Q. PEPred-Suite: improved and robust prediction of therapeutic peptides using adaptive feature representation learning. Bioinformatics. 2019;35:4272–4280. doi: 10.1093/bioinformatics/btz246. [DOI] [PubMed] [Google Scholar]
  63. Weinberger K.Q., Saul L.K. Distance metric learning for large margin nearest neighbor classification. J. Mach. Learn. Res. 2009;10:207–244. [Google Scholar]
  64. Yan K., Lv H., Wen J., Guo Y., Xu Y., Liu B. PreTP-Stack: prediction of therapeutic peptide based on the stacked ensemble learning. IEEE/ACM Trans. Comput. Biol. Bioinform. 2022;14:1–10. doi: 10.1109/tcbb.2022.3183018. [DOI] [PubMed] [Google Scholar]
  65. Yoshida H., Hunter C.A. The immunobiology of interleukin-27. Annu. Rev. Immunol. 2015;33:417–443. doi: 10.1146/annurev-immunol-032414-112134. [DOI] [PubMed] [Google Scholar]
  66. Zhang C., Guo S., Wang J., Li A., Sun K., Qiu L., Li J., Wang S., Ma X., Lu Y. Anti-inflammatory activity and mechanism of hydrostatin-SN1 from Hydrophis cyanocinctus in interleukin-10 knockout mice. Front. Pharmacol. 2020;11:930. doi: 10.3389/fphar.2020.00930. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Zhang J., Zhang Z., Pu L., Tang J., Guo F. AIEpred: an ensemble predictive model of classifier chain to identify anti-inflammatory peptides. IEEE/ACM Trans. Comput. Biol. Bioinform. 2021;18:1831–1840. doi: 10.1109/TCBB.2020.2968419. [DOI] [PubMed] [Google Scholar]
  68. Zhao D., Teng Z., Li Y., Chen D. iAIPs: identifying anti-inflammatory peptides using random forest. Front. Genet. 2021;12:773202–773210. doi: 10.3389/fgene.2021.773202. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

Document S1. Figures S1 and S2 and Tables S1, S2, and S5–S9
mmc1.pdf (587.8KB, pdf)
Table S3. The performance of baseline models based on different ML algorithms and descriptors on the training set using the 10-fold CV, related to Figure 4

It shows the average performance, with standard deviation, of the models built on the five balanced datasets.

mmc2.xlsx (18.5KB, xlsx)
Table S4. The performance of baseline models based on different ML algorithms and descriptors on the test set, related to Figure 4

It shows the average performance, with standard deviation, of the models built on the five balanced datasets.

mmc3.xlsx (18KB, xlsx)

Data Availability Statement

The datasets of AIPStack are available on GitHub: https://github.com/Nicole-DH/AIPStack. The main code of AIPStack is also shared at that link. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from iScience are provided here courtesy of Elsevier