Scientific Reports. 2025 Nov 24;15:41553. doi: 10.1038/s41598-025-25499-y

Targeting SOX9: designing a novel vaccine against triple-negative breast cancer

Ghazaleh Hatamian 1, Amirali Ebrahimpour 2, Mojgan Nejabat 3, Farzin Hadizadeh 4
PMCID: PMC12644569  PMID: 41286148

Abstract

Triple-negative breast cancer (TNBC) is a highly aggressive breast cancer subtype that lacks targeted therapies. This study aimed to design a novel multi-epitope peptide vaccine targeting SOX9, a transcription factor associated with TNBC progression. Using an immunoinformatic approach, B-cell, helper T lymphocyte (HTL), and cytotoxic T lymphocyte (CTL) epitopes with high antigenicity, non-toxicity, and non-allergenicity were identified. These epitopes were linked with appropriate spacers and fused to the 50S ribosomal protein L7/L12 adjuvant to construct the vaccine. Physicochemical analysis predicted the construct to be stable, soluble, and suitable for expression. Structural modeling and refinement confirmed its quality, while molecular docking and dynamics simulations demonstrated favorable interactions with TLR2 and TLR4 receptors. Immune simulations predicted strong cellular and humoral immune responses. These findings suggest that the designed vaccine holds substantial promise as a candidate for TNBC immunotherapy and merits further experimental validation.

Keywords: Triple-Negative breast cancer, SOX9, Multi-Epitope vaccine, Immunoinformatic

Subject terms: Vaccines, Cancer

Introduction

Triple-negative breast cancer (TNBC) is a heterogeneous cancer characterized by the absence of conventional molecular targets, including estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2)1. Patients with a diagnosis of TNBC, which accounts for 12% to 17% of breast cancers2, demonstrate a lack of response to hormonal and targeted therapies as a result of genetic mutations occurring in metastatic cancer cells3. Surgery, radiation therapy, and systemic therapy, which are intended to treat the disease, minimize the probability of its recurrence, or prevent metastasis, are considered the usual treatment for breast cancer4.

Despite advances in breast cancer treatment, the triple-negative subtype continues to pose substantial challenges. Invasiveness and a tendency to metastasize are hallmarks of TNBC. Diagnosed patients face a mortality rate of 40%, which is higher than that of other breast cancer subtypes. Other distinctions include the risk of relapse and an unfavorable prognosis, resulting in diminished survival rates within five years after diagnosis. The initial systemic treatment regimens for patients with TNBC comprise combinations of chemotherapy drugs, including anthracyclines, cisplatin, cyclophosphamide, fluorouracil, and taxanes. However, the effectiveness of these regimens is curtailed by resistance to chemotherapy, substantial adverse effects, high relapse rates, and overall poor prognosis. Furthermore, the heterogeneous nature of TNBC and the absence of specific treatment targets pose challenges in developing targeted therapies at the molecular level. Consequently, current strategies for addressing TNBC remain limited5,6.

Immunotherapy

Immunotherapy exhibits greater cancer treatment potential than chemotherapy and radiation therapy7,8. The high genetic instability and mutation levels observed in TNBC render multi-epitope cancer vaccine therapy a particularly viable approach in immunotherapy9,10. A TNBC vaccine can be developed using either tumor-associated antigens (TAAs) or tumor-specific antigens (TSAs)11.

SOX9

SOX9 (also known as SRY-box 9) is a transcription factor belonging to the SOX family. SOX proteins (e.g., SOX9, SOX10, SOX2, SOX4, and SOX8) play a central role in developmental regulation and are typified by a conserved DNA-binding domain known as the High Mobility Group (HMG) box12. SOX9 has been recognized as a critical gene that undergoes significant upregulation during the early stages of tumor formation13,14. Inhibiting these transcription factors is challenging because of their intracellular location and binding sites15,16. Although SOX9 is recognized as an essential regulator of TNBC progression, there is currently no approved therapeutic option aimed at treating patients with SOX9 overexpression17.

A recent study designed a multi-epitope peptide vaccine targeting SOX9 for TNBC. Nine regions of SOX9 that are highly responsive to T and B cells were identified and combined with a 50S ribosomal protein L7/L12 (rplL) adjuvant. Docking and dynamics simulations with TLR4 indicated that the vaccine was stable. Furthermore, immune simulations indicated immune responses, pointing to the therapeutic potential of the vaccine18. Beyond its established oncogenic role and overexpression in TNBC, SOX9 has also been implicated in normal developmental and tissue homeostatic processes in cartilage, gonads, and hair follicles19,20. Hence, there are legitimate concerns about both antigen specificity and the possible risk of autoimmune responses, making these considerations critical in the design and evaluation of SOX9-targeted immunotherapies.

Materials and methodology

Obtaining protein sequence

The sequence of the SOX9 transcription factor was retrieved in FASTA format from NCBI (accession no.: CAA86598.1)21. Various computational tools were applied to the SOX9 sequence to predict the epitopes critical for the immune response. The rplL from Mycobacterium was used as the adjuvant in the study; its sequence was retrieved in FASTA format from NCBI (ID: WP_003403353)22.

Prediction of B-cell and T-cell (MHC-I and MHC-II) epitopes

Peptides capable of stimulating B, CD4+, and CD8+ cells were incorporated into the design of the peptide-based multi-epitope vaccine. T-cell epitope prediction was conducted using the Immune Epitope Database (IEDB). The ANN-based NetMHCpan 4.1 EL prediction method (the recommended epitope predictor-2023.09)23 was used to predict CD8+ T cell epitopes, with the SOX9 protein sequence as input and the HLA allele reference set selected to include all alleles and their corresponding peptide lengths.

To predict epitopes recognized by CD4+ T helper cells, the NetMHCIIpan 4.1 EL method recommended by IEDB was used. The input parameters were the same as for the MHC-I prediction, with the additional specification of human/HLA-DR as the species/locus and selection of the complete HLA reference set to ensure the resulting epitopes encompass HLA diversity across global populations. B-cell epitope prediction was performed using BepiPred 2.0, a linear epitope prediction tool integrated into IEDB24, which utilizes a Random Forest (RF) algorithm exclusively trained on epitopes and non-epitope amino acids derived from crystal structures25.

Potentially non-allergic and antigenic epitopes prediction

The antigenicity of both B- and T-cell epitopes was predicted using VaxiJen v2.026, with an accuracy rate between 70% and 90%. This tool operates independently of sequence alignment by applying the autocross-covariance (ACC) approach, which differentiates between non-antigenic and antigenic peptides by exploiting physicochemical parameters. Additionally, the AllerTOP v.2.0 web server27 was used to assess the allergenicity profile of the predicted B- and T-cell epitopes, classifying them as potentially non-allergenic or allergenic. The server also employs an ACC transformation algorithm by integrating physicochemical properties into its analysis.

Designing vaccine using computer simulations

Three key elements are required for developing a vaccine: adjuvants, linkers, and epitopes. The adjuvant, which enhances immune responses and aids in recognizing the anticipated epitopes28, was placed at the N-terminal end of the construct. Linkers, which connect the components29, were used to join the epitopes: AAY, GPGPG, and KK linkers joined the MHC-I, MHC-II, and B-cell epitopes (BCEs), respectively, and the adjuvant and His-tag were attached to the epitope chain. The construct was then screened for antigenicity, allergenicity, and toxicity using VaxiJen v2.0, AllerTOP v2.0, and ToxinPred30 to support the predicted efficacy and safety of the vaccine.

Prediction of physicochemical properties

The physical and chemical features of the newly developed vaccine were assessed from its amino acid sequence using ProtParam31, AllerTOP v2.0, VaxiJen v2.0, and SOLUPROT32. The resulting profile included the molecular weight, pivotal for antigen recognition; the theoretical pI, an indicator of the vaccine’s acidic or basic nature; the instability index, reflecting its stability under experimental conditions; the aliphatic index, indicating its heat resistance; the GRAVY index, indicating its polar or nonpolar character; solubility, ensuring uniform dispersion of essential components for precise dosing and distribution; allergenicity, predicting potential allergic reactions; and antigenicity, evaluating the ability of the vaccine to stimulate an immune response.
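Two of the indices listed above have simple closed forms, which the following minimal sketch illustrates. It assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)); the function names are illustrative and this is not the ProtParam implementation itself.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    # The EAAAK linker is used here only as a short worked input.
    print(round(gravy("EAAAK"), 3), round(aliphatic_index("EAAAK"), 2))
```

A negative GRAVY value indicates a hydrophilic sequence, and a higher aliphatic index is usually read as greater thermostability.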

Vaccine construct’s secondary and tertiary structure prediction

The prediction of secondary and tertiary structures is essential in the design of vaccines to develop insight into functional properties. The secondary structure of the designed vaccine was predicted using PSIPRED33, while its tertiary structure was predicted using ROBETTA, a protein structure prediction service that undergoes continuous evaluation through CAMEO (continuously assessing the accuracy and reliability of prediction)34, using advanced deep learning techniques.

Enhancing, verifying, and evaluating the quality of the 3D structure

After generating the vaccine’s 3D model using ROBETTA, the model was submitted to GalaxyWEB for tertiary structure refinement. The method employed by GalaxyWEB involves generating reliable core structures from multiple templates and refining unreliable loop or terminus regions through an optimization-based refinement process35. ERRAT and VERIFY3D from SAVES v6.136, software tools commonly used to validate 3D protein structures by assessing overall quality based on non-bonded atomic interactions37, were used to confirm the structure.

For further evaluation, a Ramachandran plot was obtained using PDBsum38. Such plots are a key tool in vaccine design because they aid in identifying essential antigenic epitopes for developing effective vaccines. They help analyze the main-chain conformational angles of proteins, predict conformations such as alpha helices and beta sheets, and determine stable peptide structures for pharmaceutical development39,40.

Selected T-cell epitopes population coverage

Ensuring that the designed vaccine provides broad population coverage is critical to vaccine development, particularly considering the diversity of major histocompatibility complex (MHC) alleles and their varying peptide binding specificities. Various investigations highlight the importance of maximizing population coverage by selecting peptides capable of effectively binding to a broad spectrum of MHC molecules41–43. The IEDB population coverage analysis estimates the distribution and frequency of specific epitope sequences across different populations, considering all T-cell epitopes and their corresponding binding alleles. Herein, “Select area and population” was set to “World” to provide comprehensive coverage.

Homologous epitope screening

To assess the specificity of the immune-triggering epitopes derived from the SOX9 protein, a homology search was conducted using BLASTp. Candidate epitopes were first identified with the computer-based epitope prediction tools. These epitopes were then compared against the human protein database (taxid: 9606) using BLASTp to determine their specificity and the possibility of cross-reacting with other human proteins. The E-value and percent identity were used to evaluate the matches.

Docking investigations involving the designed vaccine and TLR2 and TLR4 immune receptors

Molecular docking has a crucial role in vaccine design by facilitating the prediction of interactions between antigens and immune system receptors and identifying immunogenic epitopes. Studies have utilized molecular docking to assess the stability of peptide-HLA complexes, such as in the design of personalized cancer vaccines44. In this study, two systems were designed to assess the vaccine’s efficacy in eliciting an effective immune response, one with the vaccine docked to the TLR2 receptor and the other with TLR4. The crystal structures of TLRs (TLR2 and TLR4) were retrieved from the RCSB Protein Data Bank (PDB) with PDB IDs 6NIG45 and 2Z6246, respectively. To identify the optimal configuration for the vaccine–receptor complex based on energy and clustering, docking was conducted between the designed vaccine and TLRs using ClusPro 2.047,48. The best candidates from the molecular docking results, identified as number 0, were selected for preparation for the MD simulations.

MD simulation parameters

The simulation box dimensions for the systems were determined by maintaining a minimum spacing of 1.2 nm between the box walls and protein atoms. The AMBER99SB-ILDN all-atom force field49,50, along with the TIP3P water model51, was chosen for the MD calculations of the two designed systems. Sodium ions (Na+) were included to achieve neutral conditions. To minimize edge effects, periodic boundary conditions (PBC) were applied in all three dimensions (XYZ)52. The NVT and NPT ensembles were employed with a time step of 2 fs and durations of 200 ps for thermal and pressure equilibration, respectively. The MD simulations were conducted for 150 ns after equilibrating the TLR2- and TLR4-vaccine systems, using the open-source program GROMACS version 2016.453. Analyses of Root Mean Square Deviation (RMSD), Gibbs Free-Energy Landscape (FEL), and Radius of Gyration (RG) were performed using GROMACS modules. Interaction details between the vaccine and TLRs were assessed using the Protein-Ligand Interaction Profiler (PLIP) web server54. Binding free energy between the vaccine construct and the respective receptors was calculated using the Python program gmx_MMPBSA by applying the Molecular Mechanics Poisson–Boltzmann Surface Area (MMPBSA) approach55. Additionally, molecular images were generated using the PyMOL 3.1.4 program56.

Reverse translation and codon adaptation

A series of bioinformatics procedures was employed to facilitate efficient expression of the SOX9-based vaccine in Escherichia coli (E. coli) K-12. First, the amino acid sequences corresponding to the SOX9 epitopes were converted into nucleotide sequences using the EMBOSS Backtranseq online server57. Codon usage was then adjusted using VectorBuilder’s codon optimization tool58 to match E. coli’s codon preferences while preserving the amino acid sequence. The codon adaptation index (CAI) and GC content of the optimized sequence, calculated by E-CAI59, indicate codon bias and the potential of the sequence for gene expression.

In-silico cloning

The designed SOX9-based vaccine gene was virtually cloned using SnapGene60 into the pET-28a(+) plasmid, a widely employed vector for recombinant protein production in E. coli that is known for its robustness and efficiency and benefits from improved designs with a restored T7 promoter and evolved translation initiation regions61. During the cloning procedure, the recombinant vaccine gene was inserted into the plasmid between the XhoI restriction site (CTCGAG) at the 5’ end (N-terminal) and the EcoRI restriction site (GAATTC) at the 3’ end (C-terminal), within a segment of the plasmid that does not disrupt any gene or operon, to support efficient expression of the vaccine protein and its straightforward purification within the host organism.
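Before flanking an insert with restriction sites, it is common to check that the insert itself does not contain those sites, since an internal site would be cut during digestion. The sketch below illustrates that check for the XhoI and EcoRI recognition sequences given above; the function names and the example insert are hypothetical, and this is not a description of what SnapGene does internally.

```python
XHOI = "CTCGAG"   # XhoI recognition sequence (palindromic)
ECORI = "GAATTC"  # EcoRI recognition sequence (palindromic)


def internal_sites(insert, sites=(XHOI, ECORI)):
    """Return the 0-based start positions of each site inside the insert."""
    insert = insert.upper()
    return {
        site: [i for i in range(len(insert) - len(site) + 1)
               if insert.startswith(site, i)]
        for site in sites
    }


def add_flanking_sites(insert, five_prime=XHOI, three_prime=ECORI):
    """Flank the insert with the two sites, refusing inserts that contain them."""
    hits = internal_sites(insert, (five_prime, three_prime))
    if any(hits.values()):
        raise ValueError(f"insert contains internal restriction sites: {hits}")
    return five_prime + insert.upper() + three_prime


if __name__ == "__main__":
    print(add_flanking_sites("ATGAAA"))  # hypothetical six-base insert
```

Because both recognition sequences are palindromic, scanning one strand is sufficient for this check.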

Immune simulation

The immune reactions triggered by the designed SOX9-based vaccine were predicted using the C-ImmSim server62. This server uses machine learning (ML) models to predict the interactions of diverse immune responses and simulates three crucial anatomical sites, namely the bone marrow, thymus, and lymph node, that play pivotal roles in immune system maintenance. The simulation parameters were set as (a) a vaccine formulation excluding LPS, (b) three vaccine doses (aimed at fostering a potent and enduring immune response) administered at intervals of 1, 84, and 168 days, and (c) a simulation volume of 10 and 1100 steps. Each simulation step corresponds to eight hours of real time, so the simulation models immune responses for approximately 367 days (1100 × 8 h / 24 h).
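The conversion between simulation steps and real time follows directly from the eight-hour step length stated above, and can be checked with a small helper. The function names are illustrative; with 1100 steps the product 1100 × 8 h / 24 h comes to roughly 367 days.

```python
HOURS_PER_STEP = 8  # one simulation step = eight hours of real time


def steps_to_days(steps):
    """Real-time duration, in days, covered by a number of simulation steps."""
    return steps * HOURS_PER_STEP / 24


def days_to_steps(days):
    """Number of simulation steps spanning a given number of days."""
    return round(days * 24 / HOURS_PER_STEP)


if __name__ == "__main__":
    print(round(steps_to_days(1100), 1))  # total simulated time in days
    print(days_to_steps(84), days_to_steps(168))
```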

Results

Sequence retrieval and analysis of physicochemical properties of SOX9

The SOX9 amino acid sequence was retrieved from UniProt in FASTA format. To analyze its physicochemical properties, ProtParam was used to calculate the theoretical pI, molecular weight, instability index, GRAVY score, and aliphatic index. Additionally, VaxiJen v2.0, SoluProt, and AllerTOP v.2.0 were employed to predict antigenicity, solubility, and allergenic profile. The results are compiled in Table 1.

Table 1.

SOX9 physicochemical properties.

Molecular weight (Da) 56137.06
Number of amino acids 509
Estimated half-life 30 h (mammalian reticulocytes, in vitro); > 20 h (yeast, in vivo); > 10 h (Escherichia coli, in vivo)
Theoretical pI 6.31
Instability index 78.58
Aliphatic index 48.15
GRAVY score −1.007
Antigenicity 0.5456
Allergenicity Non-allergen
Toxicity Non-toxin
Solubility score 0.939

Prediction of B-cell and T-cell (MHC-I and MHC-II) epitopes

CD8+ T cells, or cytotoxic T lymphocytes (CTLs), engage with MHC class I molecules on the surface of antigen-presenting cells (APCs) and their target counterparts, where they recognize antigenic peptide fragments derived from the proteasomal degradation of cytoplasmic proteins. These fragments are bound to the binding grooves on the MHC I molecules, facilitating the interaction between the CD8+ T cells and the presenting cells63. On the other hand, CD4+ helper T lymphocytes (HTLs) possess T cell receptors (TCRs) specifically designed to recognize peptide antigens presented within the framework of MHC II molecules64.

Using the IEDB NetMHCpan EL 4.1 method, 23,814 epitopes were generated (scores ranging from 0.9964 to 0). The epitopes were selected based on the average of their VaxiJen v2.0 and IEDB scores, provided they were also classified as probable non-toxins by the ToxinPred server and as non-allergens by AllerTOP v.2.0. HTL-specific epitopes were predicted using the IEDB-recommended 2.22 method. A total of 75,025 epitopes were initially generated (scores ranging from 0.9767 to 0), and the selection procedure applied to MHC-I was used for these epitopes as well. Tables 2 and 3 summarize the immune profiles of the 10 MHC-I and 10 MHC-II epitopes, respectively. The majority of predicted epitopes were classified as non-allergenic (AllerTOP v2.0) and non-toxic (ToxinPred). These in silico assessments supported the overall specificity and safety profile of the epitope set, aligning with prior reports on in silico safety profiling to reduce off-target immune responses in vaccine candidates65.

Table 2.

Final SOX9 MHC-I epitopes and associated immunogenic properties.

MHC-I epitope Allele Average score (IEDB & antigenicity) Allergenicity Toxicity
TPASAGHVW HLA-B*35:01 0.573562482 Non-allergen Non-toxic
EAVSQVLKGY HLA-A*26:01 0.549887921 Non-allergen Non-toxic
LADQYPHLH HLA-A*01:01 0.594197369 Non-allergen Non-toxic
AAGQGTGLY HLA-A*30:02 0.525830677 Non-allergen Non-toxic
KSVKNGQAEA HLA-A*30:01 0.590505975 Non-allergen Non-toxic
YSPSYPPITR HLA-A*68:01 0.589552928 Non-allergen Non-toxic
DVQPGKADLK HLA-A*68:01 0.504767939 Non-allergen Non-toxic
GSGSDTENTR HLA-A*68:01 0.568226268 Non-allergen Non-toxic
HVKRPMNAF HLA-B*15:01 0.495597506 Non-allergen Non-toxic
YTDHQNSSSY HLA-A*01:01 0.525443729 Non-allergen Non-toxic

Table 3.

Final MHC-II epitopes for the vaccine construct.

MHC-II epitope Allele Average score (IEDB & antigenicity) Allergenicity Toxicity
LDPFMKMTDEQEK HLA-DRB1*04:05 0.98345000 Non-allergen Non-toxic
KSVKNGQAEAEEATEQ HLA-DQA1*01:02, HLA-DQB1*06:02 0.87750251 Non-allergen Non-toxic
PMPVRVNGSSKNKP HLA-DRB1*11:01 0.84654678 Non-allergen Non-toxic
YSTFTYMNPAQRP HLA-DRB1*15:01 0.64200239 Non-allergen Non-toxic
EKRPFVEEAERLRVQH HLA-DRB1*15:01 0.63505361 Non-allergen Non-toxic
ATHGQVTYTGSYGISST HLA-DRB1*15:01 0.59797496 Non-allergen Non-toxic
DYKYQPRRRKSVKNG HLA-DRB1*13:02 0.58899271 Non-allergen Non-toxic
THGQVTYTGSYGIS HLA-DRB1*07:01 0.55022029 Non-allergen Non-toxic
SYGISSTAATPASAGHV HLA-DQA1*01:02, HLA-DQB1*06:02 0.52127801 Non-allergen Non-toxic
ISSTAATPASAGH HLA-DQA1*01:02, HLA-DQB1*06:02 0.52204372 Non-allergen Non-toxic

B cells play an indispensable role in the adaptive immune system by producing antibodies tailored to specific antigens66. This study employed BepiPred 2.0 to predict BCEs, yielding five epitopes. Table 4 lists the immune profiles for all BCEs. Following antigenicity, stability, and allergenicity tests, two epitopes (marked with an asterisk in Table 4) were chosen for vaccine development.

Table 4.

Predicted B-cell epitopes.

B-cell epitope Antigenicity Antigenicity score Allergenicity Toxicity
ADQYPHLH* Antigen 1.1201 PROBABLE NON-ALLERGEN Non-toxin
RLLNESEKRPFVEEAER* Antigen 0.7068 PROBABLE NON-ALLERGEN Non-toxin
MKMTDEQEKGLSGAPSPTMSEDSAGSPCPSGSGSDTENTRPQENTFPKGEPDLKKESEEDKFPVCIREAVSQVLKGYDWTLVPMPVRVNGSSKNKPHV Antigen 0.7293 PROBABLE NON-ALLERGEN Toxin
RVQHKKDHPDYKYQPRRRKSVKNGQAEAEEATEQTHISPNAIFKALQADSPHSSSGMSEVHSPGEHSGQSQGPPTPPTTPKTDVQPGKADLKREGRPLPEGGRQPPIDFRDVDIGELSSDVISNIETFDVNEFDQYLPPNGHPGVPATHGQVTYTGSYGISS Antigen 0.5612 PROBABLE NON-ALLERGEN Toxin
SAGHVWMSKQQAPPPPPQQPPQAPPAPQAPPQPQAAPPQQPAAPPQQPQAHTLTTLSSEPGQSQRTHIKTEQLSPSHYSEQQQHSPQQIAYSPFNLPHYSPSYPPITRSQYDYTDHQNSSSYYSHAAGQGTGLYSTFTYMNPAQRPMYTPIADTSGVPSIPQTHSPQHWEQPVYTQL Non-antigen 0.3981 PROBABLE ALLERGEN Non-toxin

Prediction of population coverage and physicochemical properties

The physicochemical properties of the developed SOX9-based vaccine were analyzed by ProtParam, which predicted a molecular weight of 51792.66 Da (indicating potential as an antigen, as the optimal molecular weight is > 8 kDa). The theoretical isoelectric point (pI) was 8.26, suggesting that the protein is basic in nature. The instability index was 33.47, below the preferred threshold of 40. The vaccine’s half-life was estimated at 7.2 h in mammalian reticulocytes and over 10 h in E. coli cells, implying that E. coli cell culture methods can efficiently produce and purify the vaccine in large quantities. A GRAVY score of −0.594 indicates the vaccine’s hydrophilic nature, while an aliphatic index of 63.58 indicates its resilience at physiological temperature. As seen in Table 5, the vaccine has a solubility score of 0.756, surpassing the threshold of 0.5 and indicating that it is predicted to be soluble.

Table 5.

Designed SOX9-based vaccine physicochemical properties, allergenicity, antigenicity, and toxicity.

Molecular weight (Da) 51792.66
Number of amino acids 494
Estimated half-life 7.2 h (mammalian reticulocytes, in vitro); > 20 h (yeast, in vivo); > 10 h (Escherichia coli, in vivo)
Theoretical pI 8.26
Instability index 33.47
Aliphatic index 59.05
GRAVY score −0.594
Solubility score 0.756
Antigenicity 0.5793
Allergenicity Non-allergen
Toxicity Non-toxic

The allergenicity and antigenicity of the vaccine were assessed using AllerTOP v2.0 and VaxiJen v2.0, which predicted the construct to be non-allergenic and antigenic. Toxicity evaluation by ToxinPred indicated that the vaccine is non-toxic.

The IEDB tools were utilized to predict the designed vaccine’s population coverage, evaluating MHC-I and MHC-II alleles globally. This study found that 81.86% of the global population possesses alleles capable of being targeted by vaccine epitopes (Fig. 1). These outcomes imply that the designed vaccine may trigger a strong immune response by effectively targeting prioritized B- and T-cell epitopes (with the enhanced response by adding adjuvants). Regarding the essential physicochemical properties (e.g., toxicity, antigenicity, and allergenicity), which are crucial in vaccine development, the designed vaccine appears to be efficacious and safe, while the wide population coverage underscores its prospect for broad effectiveness against the intended target.

Fig. 1.

Analysis of the global (combined) population coverage of the designed vaccine.

Organization of the final vaccine sequence

The final multi-epitope vaccine construct includes the most immunogenic, non-toxic epitopes chosen from the SOX9 protein, encompassing CTL, HTL, and linear B-cell epitopes. The vaccine’s immunogenicity was enhanced by positioning an rplL adjuvant at the construct’s N-terminus. The EAAAK linker was used to fuse the adjuvant to the epitope chain to ensure structural stability and functional independence. AAY linkers were then used to join the CTL epitopes, thereby facilitating efficient proteasomal cleavage and presentation via MHC-I. These CTL domains were linked to the HTL epitopes using the GPGPG linker, which improves the accessibility and processing of MHC-II epitopes. The HTL epitopes were likewise connected to one another using additional GPGPG linkers. Linear BCEs (LBCEs) were attached at the construct’s C-terminal end using KK linkers, ensuring appropriate spatial separation and flexibility to expose antigenic regions to B-cell receptors. In addition, a 6×His tag was appended at the C-terminus to assist downstream purification. This design (Fig. 2) is intended to provide good immunogenic exposure and molecular compatibility, enabling the construct to undergo further structural and functional evaluations.
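The ordering described above can be expressed as a short assembly routine. This is a minimal sketch under stated assumptions: the function name is hypothetical, the adjuvant string is a short placeholder rather than the real rplL sequence, only a few epitopes from Tables 2 to 4 are used, and the His tag is appended directly after the last B-cell epitope because the text does not specify a linker there.

```python
def assemble_construct(adjuvant, ctl, htl, bcell, his_tag="HHHHHH"):
    """Join adjuvant, CTL, HTL, and B-cell epitopes with the described linkers."""
    parts = [
        adjuvant,
        "EAAAK",             # rigid linker between adjuvant and epitope chain
        "AAY".join(ctl),     # CTL epitopes joined by AAY
        "GPGPG",             # CTL block to HTL block
        "GPGPG".join(htl),   # HTL epitopes joined by GPGPG
    ]
    for epitope in bcell:    # each B-cell epitope preceded by a KK linker
        parts.extend(["KK", epitope])
    parts.append(his_tag)    # assumption: no extra linker before the tag
    return "".join(parts)


if __name__ == "__main__":
    construct = assemble_construct(
        adjuvant="MAKLS",  # placeholder only, NOT the rplL sequence
        ctl=["TPASAGHVW", "EAVSQVLKGY"],
        htl=["LDPFMKMTDEQEK", "PMPVRVNGSSKNKP"],
        bcell=["ADQYPHLH"],
    )
    print(construct, len(construct))
```

Keeping the linkers as explicit arguments to `join` makes it easy to compare alternative orderings of the same epitope set.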

Fig. 2.

Final sequence of the designed vaccine along with linkers and adjuvants.

Prediction of the designed vaccine’s secondary and (enhanced and verified) tertiary structures

The secondary structure of the developed vaccine was predicted by PSIPRED: alpha helices accounted for 47.77%, beta bridges for 0.00%, extended strands for 11.74%, and random coils for 40.49%, with no ambiguous states. The tertiary structure was predicted using the ROBETTA server (Fig. 3), refined using the GalaxyWEB refiner, and assessed using ERRAT and Ramachandran plots; most residues fell within acceptable regions, indicating favorable quality of the vaccine 3D model. Overall, the engineered vaccine is predicted to maintain a stable secondary and tertiary structure, which is potentially crucial to generating immune responses.

Fig. 3.

The secondary structures of the developed vaccine encompass α-helices, coils, and β-strands. In this representation, the α-helices are indicated in pink, the β-strands in yellow, and the coil regions in grey.

Among the five 3D models generated by ROBETTA for the vaccine, Model 1 was selected because of its superior SAVES results. In the Ramachandran plot generated with PDBsum for the initial model, 87.3% of the residues lay in the most favored regions (i.e., A, B, L), slightly below the threshold of over 90% expected for a high-quality model. After refinement of the structure by GalaxyWEB, the Ramachandran plot improved (Fig. 4), with 90.6% of the residues in the most favored regions. The refined model was evaluated using SAVES, which gave an ERRAT Overall Quality Factor of 90.41. Additionally, the VERIFY3D analysis showed that 80.57% of the residues attained an averaged 3D-1D score of ≥ 0.1, meeting the requirement that at least 80% of amino acids score ≥ 0.1 in the 3D-1D profile.

Fig. 4.

The vaccine model’s Ramachandran plot (A) before refinement; (B) after refinement.

Homologue epitope screening results

The BLASTp analysis of the predicted SOX9 epitopes against the human protein database revealed several significant matches. Of the 23 epitopes screened, the majority were specific to SOX9, with E-values indicating significant matches and identity percentages of 100%. However, a subset of epitopes showed cross-reactivity with other SOX protein family members, such as SOX8 and SRY-box 8 isoform CRA_a. Despite the cross-reactive nature of these epitopes, SOX9 consistently appeared among the top hits in the BLAST results, supporting the epitopes’ relevance to the target protein. These findings, shown in Tables 6, 7 and 8, underscore the importance of detailed validation to balance immunogenicity and specificity.

Table 6.

Results of homologue MHCI epitope screening.

MHC-I epitopes Hit description Max score Total score Query cover E-value Identity (%) Accession number
TPASAGHVW SOX-9 32 32 100% 0.012 100.00% NP_000337.1
EAVSQVLKGY SOX-9 33.7 33.7 100% 0.004 100.00% NP_000337.1
LADQYPHLH SRY-box 8, isoform CRA_a 33.7 33.7 100% 0.002 100.00% EAW85691.1
AAGQGTGLY SOX-9 29.5 29.5 100% 0.096 100.00% NP_000337.1
KSVKNGQAEA SOX-9 32.5 32.5 100% 0.01 100.00% NP_000337.1
YSPSYPPITR SOX-9 36.7 36.7 100% 0.43 100.00% NP_000337.1
DVQPGKADLK SOX-9 33.7 33.7 100% 0.004 100.00% NP_000337.1
GSGSDTENTR SOX-9 32.9 32.9 100% 0.007 100.00% NP_000337.1
HVKRPMNAF SRY-box 30 34.1 34.1 100% 0.002 100.00% AAH33492.2
YTDHQNSSSY SOX-9 36.3 36.3 100% 5e-04 100.00% NP_000337.1

Table 7.

Results of homologue MHCII epitope screening.

MHCII epitopes Hit description Max score Total score Query cover E-value Identity (%) Accession number
LDPFMKMTDEQEK SOX-9 48.6 48.6 100% 4.00E-08 100.00% NP_000337.1
KSVKNGQAEAEEATEQ SOX-9 51.5 51.5 100% 6.00E-09 100.00% NP_000337.1
PMPVRVNGSSKNKP SOX-9 47.7 47.7 100% 9.00E-08 100.00% NP_000337.1
YSTFTYMNPAQRP SOX-9 48.1 48.1 100% 5.00E-08 100.00% NP_000337.1
EKRPFVEEAERLRVQH SOX-8 55.8 55.8 100% 1.00E-10 100.00% CAA46615.1
ATHGQVTYTGSYGISST SOX-9 54.9 54.9 100% 4.00E-10 100.00% NP_000337.1
DYKYQPRRRKSVKNG SOX-9 52.4 52.4 100% 2.00E-09 100.00% NP_000337.1
THGQVTYTGSYGIS SOX-9 46.9 46.9 100% 2.00E-07 100.00% NP_000337.1
SYGISSTAATPASAGHV SOX-9 52.4 52.4 100% 3.00E-09 100.00% NP_000337.1
ISSTAATPASAGH SOX-9 40.1 40.1 100% 4.00E-05 100.00% NP_000337.1

Table 8.

Results of homologue B-cell epitope screening.

B cell epitopes Hit description Max score Total score Query cover E-value Identity (%) Accession number
ADQYPHLH SRY-box8, isoform CRA_a 30.8 30.8 100% 0.027 100.00% EAW85691.1
RLLNESEKRPFVEEAER SOX-9 57.9 57.9 100% 4e-11 100.00% NP_000337.1

Exploration of molecular dynamics and docking simulations

As a common measure of the stability and structural changes of molecular systems during MD simulation, the RMSD was calculated for the α-carbon atoms of the vaccine-TLR2 and vaccine-TLR4 systems (Fig. 5A). The convergence and plateau observed after 130 ns for both systems indicate a dynamic equilibrium of the vaccine conformers interacting with both receptors, and suggest that 150 ns is a suitable duration for studying the vaccine-receptor interaction. The average RMSD values over the final 30 ns for vaccine-TLR2 and vaccine-TLR4 were 0.57 ± 0.02 and 0.59 ± 0.01, respectively. The vaccine-TLR4 complex exhibited higher dynamics than the vaccine-TLR2 complex, suggesting that, as the structures move away from their initial state, the TLR4 system undergoes more structural adjustment to reach a favorable configuration. The RG analysis was performed to assess the structural changes and degree of compactness of the vaccine (Fig. 5B). The RG values for the vaccine in interaction with the TLR2 and TLR4 receptors reveal differences in dynamics and structural integrity. The smaller fluctuations in RG for vaccine-TLR2 suggest that the vaccine maintains a more compact and consistent structure when interacting with this receptor. This stability may enhance the vaccine’s effectiveness in eliciting a robust immune response through TLR2. Conversely, the greater fluctuations in RG for the vaccine interacting with TLR4 indicate that the vaccine’s structure is more dynamic when engaging with this receptor.
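Reporting an RMSD as a mean ± standard deviation over the final part of a trajectory amounts to windowing a time series. The sketch below shows that calculation on a small synthetic series; the numbers are made up for illustration and are not taken from the simulations, and the function name is hypothetical.

```python
from statistics import mean, stdev


def window_stats(times_ns, values, start_ns):
    """Mean and sample standard deviation of values recorded at or after start_ns."""
    window = [v for t, v in zip(times_ns, values) if t >= start_ns]
    return mean(window), stdev(window)


if __name__ == "__main__":
    # Synthetic RMSD trace (time in ns, RMSD in nm), for illustration only.
    times = [0, 30, 60, 90, 120, 130, 140, 150]
    rmsd = [0.10, 0.35, 0.48, 0.55, 0.56, 0.57, 0.58, 0.56]
    avg, sd = window_stats(times, rmsd, start_ns=120)
    print(f"{avg:.3f} +/- {sd:.3f}")
```

With a real trajectory, the two lists would be read from the RMSD output file produced by the analysis module, and `start_ns` would be set to the start of the final 30 ns.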

Fig. 5.

(A) Structural analysis values for α-carbon atoms of the vaccine complexed with TLR2 and TLR4. Molecular graphics images of the complexes are presented in the final frame. (B) The progression of RG changes in the vaccine structures throughout the entire trajectory. A molecular alignment image of the two vaccine structures, with an RMSD of 0.57 nm, is shown in the final frame.
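To make the two descriptors above concrete, the following is a minimal sketch of how RMSD, RG, and a plateau average can be computed from raw coordinates. It is not the authors' workflow: the function names are hypothetical, the structures are assumed to be already superimposed, and equal atomic masses are assumed unless masses are supplied.

```python
import math
from statistics import mean, stdev


def rmsd(coords, reference):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumes the two structures have already been superimposed.
    """
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same length")
    squared = sum(
        (a - b) ** 2
        for point, ref_point in zip(coords, reference)
        for a, b in zip(point, ref_point)
    )
    return math.sqrt(squared / len(coords))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration; equal masses are assumed by default."""
    masses = masses or [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * p[i] for m, p in zip(masses, coords)) / total for i in range(3)
    ]
    squared = sum(
        m * sum((p[i] - center[i]) ** 2 for i in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(squared / total)


def plateau_stats(times_ns, values, window_ns):
    """Mean and sample standard deviation over the final `window_ns` of a series."""
    start = times_ns[-1] - window_ns
    tail = [v for t, v in zip(times_ns, values) if t >= start]
    return mean(tail), stdev(tail)
```

Applied to a 150 ns RMSD series with `window_ns=30`, `plateau_stats` would yield the kind of "mean ± standard deviation" summary reported for the final 30 ns.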

The FEL analysis was conducted using the two principal components (RMSD and RG) to identify global minimum energy states and thereby examine the preferred protein configurations during the vaccine-receptor binding process (Fig. 6). For vaccine-TLR2, the landscape reveals a more pronounced and stable energy minimum, suggesting that the vaccine adopts a favorable conformation that may enhance its binding affinity and functional efficacy. In contrast, the FEL for vaccine-TLR4 displays a broader range of energy states, indicating that the vaccine may experience greater conformational flexibility and variability in this interaction. Overall, these findings suggest that the TLR2-docked vaccine is characterized by stability and predictability, while the TLR4-docked vaccine exhibits greater conformational diversity.

Fig. 6.

FEL analysis of the vaccine structures complexed with (A) TLR2 and (B) TLR4.
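A free energy landscape of this kind is commonly obtained by Boltzmann inversion of a two-dimensional histogram, G = −k_B·T·ln(P/P_max). The sketch below illustrates that relation for paired RMSD and RG samples. The bin width, the temperature of 300 K, and the function name are assumptions made for illustration and are not taken from the study.

```python
import math
from collections import Counter

# Molar gas constant in kJ mol^-1 K^-1 (Boltzmann constant per mole).
R_KJ_PER_MOL_K = 0.0083144626


def free_energy_landscape(rmsd_values, rg_values, bin_width=0.05, temperature=300.0):
    """Map each occupied (RMSD bin, RG bin) to a relative free energy in kJ/mol.

    The most populated bin is the reference state with G = 0; sparsely
    populated bins receive higher free energies.
    """
    if len(rmsd_values) != len(rg_values):
        raise ValueError("both series must have the same number of frames")
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(rmsd_values, rg_values)
    )
    max_count = max(counts.values())
    rt = R_KJ_PER_MOL_K * temperature
    return {
        bin_index: -rt * math.log(count / max_count)
        for bin_index, count in counts.items()
    }
```

A landscape with one deep, narrow basin corresponds to a few heavily populated bins, whereas a broad landscape spreads the frames over many bins with similar free energies.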

To gain deeper atomic-level insight into the vaccine-receptor interactions, probable complex structures were obtained through clustering analysis of the last 20 ns of the trajectory, and the average structure of the largest cluster was extracted for each complex. The two resulting structures were analyzed using the PLIP web server to determine the number and types of non-covalent interactions in each protein-protein complex (Figs. 7 and 8). For the vaccine-TLR2 complex, the analysis reveals a diverse array of interaction types, including hydrogen bonds, salt bridges, hydrophobic interactions, and π-cation interactions. Notably, the presence of multiple salt bridges points to a strong affinity between the vaccine and TLR2 and contributes to the stability of the complex. In contrast, the vaccine-TLR4 complex shows a distinct profile dominated by hydrogen bonds and hydrophobic interactions, with no salt bridges. The repeated involvement of ARG87 and PHE63 indicates the importance of particular residues in maintaining the interaction; however, the lower variety of interaction types may limit the stability and binding strength of the vaccine-TLR4 complex. This difference in interaction profiles suggests that the vaccine-TLR2 complex may elicit a more effective immune response owing to its greater interaction diversity, whereas the vaccine-TLR4 interaction may require further optimization. Taken together, the interaction results indicate that the vaccine binds TLR2 more strongly than TLR4. In addition, the range and frequency of non-covalent interactions in the docked complexes suggest that both vaccine-receptor complexes reach dynamic equilibrium and remain stable.

Fig. 7.

Analysis of non-bonded vaccine-TLR2 interactions as the most populated member obtained from the cluster analysis for the final 20 ns of the MD simulation.

Fig. 8.

Analysis of non-bonded vaccine-TLR4 interactions as the most populated member obtained from the cluster analysis for the final 20 ns of the MD simulation.
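Interaction profilers typically classify contacts such as the salt bridges discussed above with geometric criteria. As a simplified sketch of the idea, the code below pairs positively and negatively charged group centers that lie within a distance cutoff. The 5.5 Å cutoff, the residue labels, and the function name are illustrative assumptions and do not reproduce the PLIP implementation or the residues found in this study.

```python
import math

# Assumed distance cutoff between charge centers, in angstroms (illustrative).
SALT_BRIDGE_CUTOFF = 5.5


def find_salt_bridges(positive_centers, negative_centers, cutoff=SALT_BRIDGE_CUTOFF):
    """Return (positive label, negative label, distance) for pairs within the cutoff.

    Each argument maps a residue label to the (x, y, z) center of its charged
    group, e.g. the guanidinium group of Arg or the carboxylate of Asp.
    """
    pairs = []
    for pos_label, pos_xyz in positive_centers.items():
        for neg_label, neg_xyz in negative_centers.items():
            distance = math.dist(pos_xyz, neg_xyz)
            if distance <= cutoff:
                pairs.append((pos_label, neg_label, round(distance, 2)))
    # Shortest contacts first.
    return sorted(pairs, key=lambda pair: pair[2])


# Hypothetical coordinates, for illustration only.
vaccine_positive = {"vaccine:LYS_a": (0.0, 0.0, 0.0)}
receptor_negative = {"receptor:ASP_b": (3.0, 4.0, 0.0), "receptor:GLU_c": (10.0, 0.0, 0.0)}
bridges = find_salt_bridges(vaccine_positive, receptor_negative)
```

Counting such pairs per complex gives a rough measure of the interaction diversity that the text contrasts between the two receptors.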

To validate the results of the protein-protein interaction analysis, the average binding free energy between the receptor and the vaccine was calculated using the MM-PBSA approach for the last 20 ns (Table 9). The average binding free energy was −59.9 ± 4.3 kcal/mol for the vaccine-TLR2 complex and −41.1 ± 5.1 kcal/mol for the vaccine-TLR4 complex. Consistent with the protein-protein interaction analysis, the vaccine exhibited a higher binding affinity for TLR2. The negative average binding free energies of both complexes imply a high affinity of the vaccine for the receptors, which may in turn help trigger a significant immune response.

Table 9.

Binding energies of Vac-TLR2 and Vac-TLR4 complexes for the top three candidates identified in the clustering analysis during the last 20 ns of the MD simulation (all units are in kcal mol−1).

Complex MM-PBSA component
ΔGvdw ΔGeel ΔGpb ΔGgas ΔGsolv ΔGbind
Vac-TLR2 Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
Vac-TLR4 Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
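The components listed in Table 9 combine in the usual MM-PBSA way: the gas-phase term is the sum of the van der Waals and electrostatic terms, the solvation term is the sum of the polar (PB) and non-polar contributions, and the binding free energy is the sum of the gas-phase and solvation terms. The sketch below shows this bookkeeping together with a per-frame average. The non-polar term is not listed separately in the table and is included here as an assumption of the standard scheme. All numbers in the example are invented for illustration and are not results from the study.

```python
from statistics import mean, stdev


def combine_mmpbsa(vdw, eel, pb, nonpolar):
    """Combine MM-PBSA components (kcal/mol) into gas, solvation, and binding terms."""
    gas = vdw + eel          # ΔGgas = ΔGvdw + ΔGeel
    solv = pb + nonpolar     # ΔGsolv = ΔGpb + ΔGnonpolar (assumed standard scheme)
    return {"gas": gas, "solv": solv, "bind": gas + solv}


def summarize_frames(frames):
    """Mean and sample standard deviation of ΔGbind over per-frame components."""
    binds = [combine_mmpbsa(**frame)["bind"] for frame in frames]
    return mean(binds), stdev(binds)


# Invented per-frame values (kcal/mol), for illustration only.
example_frames = [
    {"vdw": -100.0, "eel": -50.0, "pb": 110.0, "nonpolar": -10.0},
    {"vdw": -104.0, "eel": -52.0, "pb": 112.0, "nonpolar": -10.0},
    {"vdw": -96.0, "eel": -48.0, "pb": 108.0, "nonpolar": -10.0},
]
average_bind, spread = summarize_frames(example_frames)
```

A more negative ΔGbind corresponds to stronger predicted binding, which is the basis for ranking the TLR2 complex above the TLR4 complex in the text.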

Analysis of codon adaptation and in-silico cloning

The codon usage was optimized for expression of the vaccine in an Escherichia coli expression system, with the modified codons preserving the amino acid sequence while improving expression efficiency. Codon adaptation was assessed using the CAI and the GC content, for which the optimal ranges are 0.8 to 1 and 30% to 70%, respectively. The CAI and GC content were 1.00 and 62.42% at the outset and 0.91 and 59.92% after optimization, respectively; both sets of values lie within the optimal ranges. EcoRI and XhoI restriction sites were introduced at the vaccine’s C- and N-termini (Fig. 9a). The 1494 bp sequence was cloned into the pET28a(+) vector in silico using the SnapGene tool (Fig. 9b). The resulting construct was 6823 base pairs in size.

Fig. 9.

(A) EcoRI and HindIII restriction sites at the designed vaccine’s C- and N-termini; (B) In-silico cloned vaccine in the pET28a(+) vector.
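The two metrics used to judge codon adaptation can be illustrated with a short sketch: GC content is the percentage of G and C bases, and the CAI is the geometric mean of the relative adaptiveness values of the codons in a sequence. The helper names are hypothetical, and the weight table must be supplied by the user from a reference set of highly expressed host genes; none of this reproduces the optimization server used in the study. A simple check for internal EcoRI (GAATTC) and XhoI (CTCGAG) recognition sites is included, since an insert should not contain the sites used for cloning.

```python
import math

# Recognition sequences of the two enzymes named in the text.
RESTRICTION_SITES = {"EcoRI": "GAATTC", "XhoI": "CTCGAG"}


def gc_content(sequence):
    """Percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


def codon_adaptation_index(sequence, weights):
    """Geometric mean of codon weights (relative adaptiveness, 0 < w <= 1).

    Codons absent from `weights` (for example Met, Trp, or stop codons,
    which are conventionally excluded) are skipped.
    """
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[codon]) for codon in codons if codon in weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


def sites_present(sequence, sites=RESTRICTION_SITES):
    """Names of the restriction enzymes whose recognition site occurs in the sequence."""
    seq = sequence.upper()
    return [name for name, site in sites.items() if site in seq]
```

With these definitions, a CAI close to 1 means that the sequence uses mostly the codons preferred by the host, and an empty result from `sites_present` indicates that the insert itself would not be cut by the cloning enzymes.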

Immune simulations

The C-ImmSim simulation results provide insight into the immune responses triggered by the designed vaccine (Fig. 10). As shown in Fig. 10A, the B-cell population demonstrates significant activity post-vaccination, with a marked increase in active and duplicating B-cells following each booster dose. The active B-cell population peaks after the first and second doses and then stabilizes, indicating effective immune activation and memory formation. Internalized and presenting B-cells show a corresponding increase, reflecting antigen processing and presentation, while anergic B-cells remain relatively low, suggesting minimal tolerance induction. Fig. 10B presents the T-cell population dynamics. Active T-cells exhibit a substantial initial increase followed by a gradual decline, while duplicating T-cells rise steadily, peaking around the third dose. Resting T-cells diminish correspondingly, indicating successful activation and clonal expansion. The anergic T-cell population remains negligible, suggesting that the vaccine does not induce T-cell anergy.

Fig. 10.

Immune simulation results. (A) B-cell population per state (cells per mm3). (B) T-cell population dynamics. (C) The antigen and antibody response (cells per mm3). (D) Cytokine levels.

The antigen and antibody response, shown in Fig. 10C, reveals robust antibody production following vaccination. IgM and IgG levels peak after each dose, particularly IgG1 and IgG2, which are critical for long-term immunity. The antigen levels decline rapidly after each booster, in step with the rise in antibody titers, indicating efficient antigen clearance by the immune system. Fig. 10D shows significant spikes in cytokine levels corresponding to each vaccine dose. Key cytokines (e.g., IFN-γ, IL-2, IL-18, IL-10, and TGF-β) exhibit marked increases, reflecting their roles in mediating and regulating the immune response.

Discussion

The present study introduces a rationally designed multi-epitope peptide vaccine targeting the SOX9 transcription factor, a molecule implicated in the progression, invasiveness, and chemoresistance of triple-negative breast cancer (TNBC). Rather than relying on conventional markers, which are absent in TNBC, this strategy focuses on SOX9 because of its known regulatory role in maintaining cancer stem cells and its overexpression in metastatic lesions13,17. An immunoinformatics pipeline was used to construct a peptide predicted to be safe, stable, and immunogenic across a broad range of populations. Epitope selection and immunogenic profiling ensured the incorporation of high-affinity peptides with no predicted allergenicity or toxicity. Notably, the CTL epitope TPASAGHVW and the HTL epitope LDPFMKMTDEQEK demonstrated strong binding scores across various HLA alleles, emphasizing their relevance to immune response activation. The B-cell epitope ADQYPHLH also showed high antigenicity, supporting its potential for stimulating a humoral immune response. Surface-accessible and immunodominant regions of SOX9 were identified, supporting efficient B-cell recognition. These findings are complemented by the population coverage analysis, which predicted a global HLA allele coverage of 81.86%, indicating broad, although not universal, applicability. The physicochemical properties of the construct support its suitability for expression: an MW of ~51.8 kDa, a pI of 8.26, and a favorable instability index of < 40. These features, alongside a solubility score of 0.756, predict ease of recombinant expression in E. coli, as reinforced by codon adaptation analysis with a CAI of 0.91 and an optimized GC content of 59.92%. The secondary structure included a balanced distribution of helices, strands, and coils, which is favorable for epitope presentation.

The refined tertiary structure showed 90.6% of the residues in the most favored regions of the Ramachandran plot, an ERRAT score of 90.41, and a VERIFY3D pass rate of 80.57%, supporting the structural quality and reliability of the model. Molecular docking and molecular dynamics simulations indicated stable interactions with TLR2 and TLR4. The vaccine-TLR2 complex demonstrated a more compact radius of gyration and lower fluctuations, along with a more favorable binding energy (−59.9 kcal/mol vs. −41.1 kcal/mol for TLR4). The greater variety and frequency of salt bridges and π-cation interactions in the TLR2 complex suggest a more robust and potentially immunostimulatory engagement.

Crucially, immune simulation results showed repeated peaks in IgG1 and IgG2 following vaccine boosters, reflecting successful generation of a memory response. The corresponding cytokine surge (especially IFN-γ and IL-2) supports the vaccine’s predicted capability to orchestrate a coordinated Th1-biased immune reaction, which is vital for eliminating tumor cells. In comparison to prior SOX9 vaccine models, this construct demonstrates improved epitope coverage, structural refinement, and immune stimulation profiles. Moreover, the inclusion of adjuvants and linkers, together with epitope screening against human homologs, was intended to minimize the risk of autoimmune cross-reactivity.

Balancing specificity and safety in SOX9‑targeted vaccine design

SOX9 overexpression is a well-established feature of TNBC progression17, but its essential physiological roles in normal tissues make it a potentially hazardous target if antigen selection lacks sufficient specificity. SOX9 is known to be critical for chondrogenesis, gonadal differentiation, and hair follicle cycling, and existing evidence raises the possibility that immune responses against SOX9-derived epitopes could adversely affect these tissues19,20. In the present study, antigen design incorporated BLASTp-based filtering to exclude epitopes with significant homology to non-tumor SOX family proteins or unrelated human proteins, as suggested previously67,68. In silico safety profiling further evaluated allergenicity (AllerTOP v2.0) and toxicity (ToxinPred), revealing that most predicted epitopes were non-allergenic and non-toxic. Although these computational assessments support the overall specificity and safety profile of the refined epitope set, prior experience with epitope-based cancer vaccines highlights the necessity of comprehensive preclinical validation65. Such validation should include histopathological analysis of tissues with high SOX9 expression (e.g., cartilage, gonads, hair follicles), as well as systemic cytokine profiling to detect unintended inflammatory or autoimmune responses. Until such in vivo studies are completed, the possibility of off-target effects cannot be entirely excluded.

In summary, the data indicate that the SOX9-based multi-epitope vaccine holds substantial promise as a therapeutic candidate for TNBC immunotherapy. Its in-silico validation, wide population coverage, strong receptor binding, and simulated immunogenicity warrant its progression to experimental validation. Future studies should include in vitro expression, in vivo immunogenicity trials, and toxicological assessments to pave the way for clinical translation.

Acknowledgements

The authors sincerely thank the Bioinformatics Camp at the University of Tehran for their insightful support and guidance throughout this research. Special appreciation is also extended to Miss Farangis Rastin of the Department of Biology, School of Science, Ferdowsi University of Mashhad, for her valuable assistance during the manuscript preparation. Additionally, this work has benefited from the use of artificial intelligence tools to improve language clarity and grammatical accuracy.

Author contributions

Ghazaleh Hatamian: conducted in-silico modeling, authored the initial manuscript, and prepared the manuscript draft. Amirali Ebrahimpour: in-silico studies, data analysis, and preparation of the manuscript draft. Mojgan Nejabat: study conception and design, reviewed the results, and approved the final version of the manuscript. Farzin Hadizadeh: study conception and design, and approved the final version of the manuscript. All authors reviewed the final manuscript.

Data availability

All data were included in the manuscript.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Mojgan Nejabat, Email: nejabatm2@mums.ac.ir.

Farzin Hadizadeh, Email: hadizadehf@mums.ac.ir.

References

  • 1. Li, Y. et al. Recent advances in therapeutic strategies for triple-negative breast cancer. J. Hematol. Oncol. 15 (1), 121 (2022).
  • 2. Foulkes, W. D., Smith, I. E. & Reis-Filho, J. S. Triple-negative breast cancer. N. Engl. J. Med. 363 (20), 1938–1948 (2010).
  • 3. Dariushnejad, H., Ghorbanzadeh, V., Akbari, S. & Hashemzadeh, P. Design of a novel recombinant multi-epitope vaccine against triple-negative breast cancer. Iran. Biomed. J. 26 (2), 160 (2022).
  • 4. Moo, T-A., Sanford, R., Dang, C. & Morrow, M. Overview of breast cancer therapy. PET Clin. 13 (3), 339 (2018).
  • 5. Bianchini, G., De Angelis, C., Licata, L. & Gianni, L. Treatment landscape of triple-negative breast cancer—expanded options, evolving needs. Nat. Rev. Clin. Oncol. 19 (2), 91–113 (2022).
  • 6. Yin, L., Duan, J-J., Bian, X-W. & Yu, S. Triple-negative breast cancer molecular subtyping and treatment progress. Breast Cancer Res. 22, 1–13 (2020).
  • 7. Luo, C. et al. Progress and prospect of immunotherapy for triple-negative breast cancer. Front. Oncol. 12, 919072 (2022).
  • 8. Cao, Y., Chen, C., Tao, Y., Lin, W. & Wang, P. Immunotherapy for triple-negative breast cancer. Pharmaceutics 13 (12), 2003 (2021).
  • 9. Benedetti, R., Dell’Aversana, C., Giorgio, C., Astorri, R. & Altucci, L. Breast cancer vaccines: new insights. Front. Endocrinol. 8, 270 (2017).
  • 10. DeLuca, D. & Blasczyk, R. The immunoinformatics of cancer immunotherapy. Tissue Antigens 70 (4), 265–271 (2007).
  • 11. Paston, S. J., Brentville, V. A., Symonds, P. & Durrant, L. G. Cancer vaccines, adjuvants, and delivery systems. Front. Immunol. 12, 627932 (2021).
  • 12. Sarkar, A. & Hochedlinger, K. The Sox family of transcription factors: versatile regulators of stem and progenitor cell fate. Cell Stem Cell 12 (1), 15–30 (2013).
  • 13. Domenici, G. et al. A Sox2–Sox9 signalling axis maintains human breast luminal progenitor and breast cancer stem cells. Oncogene 38 (17), 3151–3169 (2019).
  • 14. Kopp, J. L. et al. Identification of Sox9-dependent acinar-to-ductal reprogramming as the principal mechanism for initiation of pancreatic ductal adenocarcinoma. Cancer Cell 22 (6), 737–750 (2012).
  • 15. Marqués, M. et al. Are transcription factors plausible oncotargets for triple negative breast cancers? Cancers 14 (5), 1101 (2022).
  • 16. Grimm, D. et al. (eds) The Role of SOX Family Members in Solid Tumours and Metastasis. Seminars in Cancer Biology (Elsevier, 2020).
  • 17. Ma, Y. et al. SOX9 is essential for triple-negative breast cancer cell survival and metastasis. Mol. Cancer Res. 18 (12), 1825–1838 (2020).
  • 18. Rajendran Krishnamoorthy, H. & Karuppasamy, R. Designing a novel SOX9 based multi-epitope vaccine to combat metastatic triple-negative breast cancer using immunoinformatics approach. Mol. Diversity 27 (4), 1829–1842 (2023).
  • 19. Lourenço, D. et al. Loss-of-function mutation in GATA4 causes anomalies of human testicular development. Proc. Natl. Acad. Sci. U.S.A. 108 (4), 1597–1602 (2011).
  • 20. Jo, A. et al. The versatile functions of Sox9 in development, stem cells, and human diseases. Genes Dis. 1 (2), 149–161 (2014).
  • 21. Sayers, E. W. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 52 (D1), D33–D43 (2024).
  • 22. UniProt. The universal protein knowledgebase in 2023. Nucleic Acids Res. 51 (D1), D523–D31 (2023).
  • 23. Reynisson, B., Alvarez, B., Paul, S., Peters, B. & Nielsen, M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 48 (W1), W449–W54 (2020).
  • 24. Vita, R. et al. The immune epitope database (IEDB): 2018 update. Nucleic Acids Res. 47 (D1), D339–D43 (2019).
  • 25. Smith, C. C. et al. Alternative tumour-specific antigens. Nat. Rev. Cancer 19 (8), 465–478 (2019).
  • 26. Doytchinova, I. A. & Flower, D. R. VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines. BMC Bioinform. 8, 1–7 (2007).
  • 27. Dimitrov, I., Bangov, I., Flower, D. R. & Doytchinova, I. AllerTOP v. 2—a server for in silico prediction of allergens. J. Mol. Model. 20, 1–6 (2014).
  • 28. Ren, H., Jia, W., Xie, Y., Yu, M. & Chen, Y. Adjuvant physiochemistry and advanced nanotechnology for vaccine development. Chem. Soc. Rev. 52 (15), 5172–5254 (2023).
  • 29. Papaleo, E. et al. The role of protein loops and linkers in conformational dynamics and allostery. Chem. Rev. 116 (11), 6391–6423 (2016).
  • 30. Gupta, S. et al. In silico approach for predicting toxicity of peptides and proteins. PLoS One 8 (9), e73957 (2013).
  • 31. Gasteiger, E. et al. Protein identification and analysis tools on the ExPASy server. Proteom. Protocols Handb. 571–607 (2005).
  • 32. Hon, J. et al. SoluProt: prediction of soluble protein expression in Escherichia coli. Bioinformatics 37 (1), 23–28 (2021).
  • 33. McGuffin, L. J., Bryson, K. & Jones, D. T. The PSIPRED protein structure prediction server. Bioinformatics 16 (4), 404–405 (2000).
  • 34. Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373 (6557), 871–876 (2021).
  • 35. Ko, J., Park, H., Heo, L. & Seok, C. GalaxyWEB server for protein structure prediction and refinement. Nucleic Acids Res. 40 (W1), W294–W7 (2012).
  • 36. Colovos, C. & Yeates, T. O. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Sci. 2 (9), 1511–1519 (1993).
  • 37. Papageorgiou, A. C., Poudel, N. & Mattsson, J. Protein structure analysis and validation with X-ray crystallography. Protein Downstream Processing: Design, Development, and Application of High and Low-Resolution Methods, 377–404 (2021).
  • 38. Laskowski, R. A. et al. PDBsum: a Web-based database of summaries and analyses of all PDB structures. Trends Biochem. Sci. 22 (12), 488–490 (1997).
  • 39. Shafaghi, M. et al. Immunoinformatics-aided design of a new multi-epitope vaccine adjuvanted with domain 4 of pneumolysin against Streptococcus pneumoniae strains. BMC Bioinform. 24 (1), 67 (2023).
  • 40. Rahman, M. & Parvege, M. M. In silico structural analysis of Hantaan virus glycoprotein G2 and conserved epitope prediction for vaccine development. J. Appl. Virol. 3 (3), 62–77 (2014).
  • 41. Schulte, S. C., Dilthey, A. T. & Klau, G. W. HOGVAX: exploiting epitope overlaps to maximize population coverage in vaccine design with application to SARS-CoV-2. Cell Syst. 14 (12), 1122–1130.e3 (2023).
  • 42. Ghasemnejad, A., Bazmara, S., Shadmani, M. & Bagheri, K. P. Designing a new multi-epitope pertussis vaccine with highly population coverage based on a novel sequence and structural filtration algorithm. IEEE/ACM Trans. Comput. Biol. Bioinf. 18 (5), 1885–1892 (2019).
  • 43. Oyarzun, P. & Kobe, B. Computer-aided design of T‐cell epitope‐based vaccines: addressing population coverage. Int. J. Immunogenet. 42 (5), 313–321 (2015).
  • 44. Amaya-Ramirez, D., Martinez-Enriquez, L. C. & Parra-López, C. Usefulness of docking and molecular dynamics in selecting tumor neoantigens to design personalized cancer vaccines: a proof of concept. Vaccines 11 (7), 1174 (2023).
  • 45. Su, L. et al. Structural basis of TLR2/TLR1 activation by the synthetic agonist Diprovocim. J. Med. Chem. 62 (6), 2938–2949 (2019).
  • 46. Kim, H. M. et al. Crystal structure of the TLR4-MD-2 complex with bound endotoxin antagonist Eritoran. Cell 130 (5), 906–917 (2007).
  • 47. Desta, I. T., Porter, K. A., Xia, B., Kozakov, D. & Vajda, S. Performance and its limits in rigid body protein-protein docking. Structure 28 (9), 1071–1081.e3 (2020).
  • 48. Kozakov, D. et al. The ClusPro web server for protein–protein docking. Nat. Protoc. 12 (2), 255–278 (2017).
  • 49. Zhu, F. et al. Designing a multi-epitope vaccine against Pseudomonas aeruginosa via integrating reverse vaccinology with immunoinformatics approaches. Sci. Rep. 15 (1), 10425 (2025).
  • 50. Lindorff-Larsen, K. et al. Improved side‐chain torsion potentials for the Amber ff99SB protein force field. Proteins Struct. Funct. Bioinform. 78 (8), 1950–1958 (2010).
  • 51. Ali, A. et al. Multi-epitope-based vaccine models prioritization against astrovirus MLB1 using immunoinformatics and reverse vaccinology approaches. J. Genetic Eng. Biotechnol. 23 (1), 100451 (2025).
  • 52. Bhardwaj, V. K. & Purohit, R. A new insight into protein-protein interactions and the effect of conformational alterations in PCNA. Int. J. Biol. Macromol. 148, 999–1009 (2020).
  • 53. Abraham, M. J. et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1, 19–25 (2015).
  • 54. Adasme, M. F. et al. PLIP 2021: expanding the scope of the protein–ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 49 (W1), W530–W4 (2021).
  • 55. Valdés-Tresanco, M. S., Valdés-Tresanco, M. E., Valiente, P. A. & Moreno, E. gmx_MMPBSA: a new tool to perform end-state free energy calculations with GROMACS. J. Chem. Theory Comput. 17 (10), 6281–6291 (2021).
  • 56. Schrodinger, L. L. C. The PyMOL Molecular Graphics System, Version 1.8 (2015).
  • 57. Madeira, F. et al. The EMBL-EBI job dispatcher sequence analysis tools framework in 2024. Nucleic Acids Res. 52 (W1), W521–W5 (2024).
  • 58. Duraisamy, N. et al. Machine learning tools used for mapping some immunogenic epitopes within the major structural proteins of the bovine coronavirus (BCoV) and for the in silico design of the multiepitope-based vaccines. Front. Veterinary Sci. 11, 1468890 (2024).
  • 59. Puigbò, P., Bravo, I. G. & Garcia-Vallvé, S. E-CAI: a novel server to estimate an expected value of codon adaptation index (eCAI). BMC Bioinform. 9, 1–7 (2008).
  • 60. Dey, J. et al. Molecular characterization and designing of a novel multiepitope vaccine construct against Pseudomonas aeruginosa. Int. J. Pept. Res. Ther. 28 (2), 49 (2022).
  • 61. Shilling, P. J. et al. Improved designs for pET expression plasmids increase protein production yield in Escherichia coli. Commun. Biology 3 (1), 214 (2020).
  • 62. Rapin, N., Lund, O., Bernaschi, M. & Castiglione, F. Computational immunology meets bioinformatics: the use of prediction tools for molecular binding in the simulation of the immune system. PLoS One 5 (4), e9862 (2010).
  • 63. Raskov, H., Orhan, A., Christensen, J. P. & Gögenur, I. Cytotoxic CD8+ T cells in cancer and cancer immunotherapy. Br. J. Cancer 124 (2), 359–367 (2021).
  • 64. Tay, R. E., Richardson, E. K. & Toh, H. C. Revisiting the role of CD4+ T cells in cancer immunotherapy—new insights into old paradigms. Cancer Gene Ther. 28 (1), 5–17 (2021).
  • 65. Li, W., Joshi, M. D., Singhania, S., Ramsey, K. H. & Murthy, A. K. Peptide vaccine: progress and challenges. Vaccines (Basel) 2 (3), 515–536 (2014).
  • 66. Li, M. et al. B cells in breast cancer pathology. Cancers 15 (5), 1517 (2023).
  • 67. Morazán-Fernández, D., Mora, J. & Molina-Mora, J. A. In silico pipeline to identify tumor-specific antigens for cancer immunotherapy using exome sequencing data. Phenomics 3 (2), 130–137 (2023).
  • 68. Prawiningrum, A. F., Paramita, R. I. & Panigoro, S. S. Immunoinformatics approach for epitope-based vaccine design: key steps for breast cancer vaccine. Diagnostics (Basel) 12 (12) (2022).



Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
